x.com · Hugo Larochelle · Sat, May 23, 2026, 7:59 AM PDT
score 15.2
8 RT

Training loss predicts how language models scale on real tasks

Original: RT @sivareddyg: Nature is complex. Why would cross-entropy loss predict scaling behavior of language models on downstream task? Introducing…

Source: x.com
