x.com | Kexin Huang | Mon, May 18, 2026, 8:33 AM PDT
score 16.5
53 likes, 10 RT
Benchmark grades AI agents on biology research reasoning, not just answers
Original: 📢Introducing BiomniBench — the first benchmark focused on evaluating the process, not just the final answer, of AI agents on long-horizon biology research tasks.
Source: x.com ↗