arXiv
Jiamin Chen, Yidi Wu, Qiexiang Wang, Qianben Chen, Yuchen Li, Yansen Zhang, Xiaokun Zhang, Wangchunshu Zhou, Chen Ma
Thu, May 28, 2026, 8:46 AM PDT
Using AI to revive saturated benchmark scores through smarter evaluation
Original: SEAL: Can Saturated Benchmarks Be Revived by LLM-as-a-Meta-Judge?
Source: arxiv.org ↗