x.com · Tatsunori Hashimoto · Thu, May 21, 2026, 8:51 AM PDT
score 17.1
1,019 likes · 153 RT · 27 replies
Large language models may not need data filtering
Original: Some new results I found surprising that I’m tweeting for Chris (who isn’t on here). With enough compute, the best data filter for LMs (on DCLM) might be no filter. Why? Large models can tolerate a sur
Source: x.com ↗