arXiv
A. J. Lew, Y. Cao, M. J. Buehler
Thu, May 28, 2026, 10:38 AM PDT
score 14.8
New benchmark tests whether AI can do real scientific discovery
Original: ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure
Source: arxiv.org