arXiv
Rishabh Sabharwal, Hongru Wang, Amos Storkey, Jeff Z. Pan
Mon, Jun 8, 2026, 10:08 AM PDT

Research AI agents struggle to improve from repeated feedback

Original: Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

Source: arxiv.org
