arXiv
Chunru Lin, Hongxin Zhang, Fenghao Yu, Zhehuan Chen, Thomas L. Griffiths, Yejin Choi, David Held, Chuang Gan
Thu, May 28, 2026, 10:57 AM PDT

score 14.8

New benchmark exposes reasoning gaps in robotic AI systems

Original: RoboWits: Unexpected Challenges for Robotic Creative Problem Solving
