x.com · Tatsunori Hashimoto · Thu, May 21, 2026, 8:51 AM PDT
score 17.1
1,019 likes · 153 RT · 27 replies

Large language models may not need data filtering

Original: Some new results I found surprising that I’m tweeting for Chris (who isn’t on here). With enough compute, the best data filter for LMs (on DCLM) might be no filter. Why? Large models can tolerate a sur

Source: x.com
