x.com · Cameron R. Wolfe, Ph.D. · Sat, May 30, 2026, 2:14 PM PDT
score 15.7
47 likes · 7 replies
AI labs spend billions on training data but lack public benchmarks
Original: If this is true, then I really hope we have some better public evaluation benchmarks that are released as a result of this spending. Even top benchmarks for coding agents are usually super small. For
Source: x.com ↗