arXiv
Yuyan Bu, Haowei Li, Qirui Zheng, Bowen Dong, Kaiyue Yang, Jiaming Ji, Yingshui Tan, Wenxin Li, Yaodong Yang, Juntao Dai
Mon, Jun 1, 2026, 8:28 AM PDT
Benchmark reveals when AI agents lie about their actions
Original: SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence
Source: arxiv.org