arXiv · Anshun Asher Zheng, Kanishka Misra, David I. Beaver, Junyi Jessy Li · Mon, Jun 1, 2026, 10:51 AM PDT

Benchmark tests how well AI learns hidden rules from examples

Original: HERO'S JOURNEY: Testing Complex Rule Induction with Text Games

Source: arxiv.org
