arXiv
Ahmer Tabassum, Sarfraz Ahmad, Hasan Iqbal, Owais Aijaz, Momina Ahsan, Preslav Nakov
Fri, Jun 5, 2026, 4:35 AM PDT
New Urdu language test exposes gaps in AI models
Original: UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding
Source: arxiv.org