arXiv
Klaudia-Doris Thellmann, Bernhard Stadler, Michael Färber, Jens Lehmann
Sun, May 24, 2026, 12:06 AM PDT

Translation errors silently break multilingual AI benchmarks

Original: Quantifying the Impact of Translation Errors on Multilingual LLM Evaluation

Source: arxiv.org
