arXiv
Yuxin Chen, Yi Zhang, Zhengzhou Cai, Yaorui Shi, Zhiyuan Yao, Chenhang Cui, Jingnan Zheng, Yaqi Huo, Xi Su, Qi Gu, Xunliang Cai, Xiang Wang, An Zhang, Tat-Seng Chua
Tue, May 26, 2026, 8:07 AM PDT
Benchmark reveals AI agents struggle with learning user preferences
Original: VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions
Source: arxiv.org