arXiv · Nan Li, Albert Gatt, Massimo Poesio · Tue, Jun 30, 2026, 7:22 AM PDT

Vision-language models mistake visible information for shared understanding

Original: Seeing Is Not Sharing: Some Vision-Language Models Overestimate Common Ground in Asymmetric Dialogue

Source: arxiv.org
