arXiv
Yiheng Shu, Bernal Jiménez Gutiérrez, Saisri Padmaja Jonnalagedda, Yuguang Yao, Huan Sun, Yu Su
Mon, Jun 1, 2026, 9:32 AM PDT
score 16.5
New benchmark tests if AI agents learn and reuse knowledge over time
Original: AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents
Source: arxiv.org