x.com · Thomas Wolf · Wed, May 20, 2026, 10:47 AM PDT
12 likes · 2 RT · 7 replies

Terminal-Bench expands to measure AI performance on scientific tasks

Original: I'm very excited about this extension to the celebrated Terminal-Bench to science.

Source: x.com
