arXiv
Yuxin Chen, Yi Zhang, Zhengzhou Cai, Yaorui Shi, Zhiyuan Yao, Chenhang Cui, Jingnan Zheng, Yaqi Huo, Xi Su, Qi Gu, Xunliang Cai, Xiang Wang, An Zhang, Tat-Seng Chua
Tue, May 26, 2026, 8:07 AM PDT

Benchmark reveals AI agents struggle with learning user preferences

Original: VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Source: arxiv.org
