arXiv | Johnny Tian-Zheng Wei, Jerry Li, Ameya Godbole, Robin Jia | Sat, May 23, 2026, 7:06 PM PDT

Fixing inflated test scores when training data leaks into evaluations

Original: Spiking the training data to correct for test set contamination

Source: arxiv.org
