arXiv
Piercosma Bisconti, Matteo Prandi, Federico Pierucci, Federico Sartore, Enrico Panai, Laura Caroli, Yue Zhu, Adam Leon Smith, Luca Nannini, Marcello Galisai, Susanna Cifani, Francesco Giarrusso, Marcantonio Bracale Syrnikov, Daniele Nardi
Thu, May 21, 2026, 8:50 AM PDT

New benchmark tests if AI agents fall for gradual manipulation attacks

Original: Boiling the Frog: A Multi-Turn Benchmark for Agentic Safety

Source: arxiv.org
