arXiv
Shiping Zhu, Yibo Yang, Zhengyang Wang, Tiancheng Shen, Dandan Guo, Ming-Hsuan Yang
Mon, Jun 8, 2026, 6:17 AM PDT
New benchmark tests AI agents watching multi-person conversations
Original: H2HMem: A Multimodal Memory Benchmark for Agents in Human-Human Interactions
Source: arxiv.org