arXiv · A. J. Lew, Y. Cao, M. J. Buehler · Thu, May 28, 2026, 10:38 AM PDT

New benchmark tests whether AI can do real scientific discovery

Original: ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

Source: arxiv.org
