arXiv
Xing Yue, Linjuan Wu, Daoxin Zhang, Yongliang Shen, Weiming Lu
Fri, Jun 5, 2026, 1:34 AM PDT
score 15.2
Reusable evaluation skills improve AI judge accuracy without per-query rubrics
Original: Beyond Rubrics: Exploration-Guided Evaluation Skills for Reward Modeling
Source: arxiv.org