x.com | Qian Liu | Thu, Jul 2, 2026, 10:30 PM PDT
score 16.1
42 likes, 4 RT, 1 reply
EdgeBench benchmark tests AI agents on 12-72 hour tasks
Original: Nice long-horizon benchmark!
Source: x.com ↗