arXiv | Nan Li, Albert Gatt, Massimo Poesio | Tue, Jun 30, 2026, 7:22 AM PDT
Vision-language models mistake visible information for shared understanding
Original: Seeing Is Not Sharing: Some Vision-Language Models Overestimate Common Ground in Asymmetric Dialogue
Source: arxiv.org