arXiv
Hebin Hu, Renke Dai, Ah-Hwee Tan, Yilin Kang
Tue, May 19, 2026, 5:38 AM PDT
New benchmark tests AI healthcare agents on patient history recall
Original: Synthesis and Evaluation of Long-term History-aware Medical Dialogue
Source: arxiv.org