x.com | Cameron R. Wolfe, Ph.D. | Sat, May 30, 2026, 2:14 PM PDT
score 15.7
47 likes, 7 replies

AI labs spend billions on training data but lack public benchmarks

Original: If this is true, then I really hope we have some better public evaluation benchmarks that are released as a result of this spending. Even top benchmarks for coding agents are usually super small. For

Source: x.com
