arXiv
Klaudia-Doris Thellmann, Bernhard Stadler, Michael Färber, Jens Lehmann
Sun, May 24, 2026, 12:06 AM PDT
Translation errors silently break multilingual AI benchmarks
Original: Quantifying the Impact of Translation Errors on Multilingual LLM Evaluation
Source: arxiv.org