x.com · Guohao Li 🐫 · Fri, Jul 3, 2026, 1:17 AM PDT
score 16.4
73 likes, 12 RT

EdgeBench: Benchmark for agents working 12-72 hours on tasks

Original: such a beautiful graph of agents learning through environment interactions

Source: x.com
