arXiv
Ahmer Tabassum, Sarfraz Ahmad, Hasan Iqbal, Owais Aijaz, Momina Ahsan, Preslav Nakov
Fri, Jun 5, 2026, 4:35 AM PDT

New Urdu language test exposes gaps in AI models

Original: UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding

Source: arxiv.org
