arXiv
Selim Kuzucu, Alessio Tonioni, Vasile Lup, Bernt Schiele, Federico Tombari, Muhammad Ferjad Naeem
Thu, May 28, 2026, 8:57 AM PDT

Faster vision-language AI through smarter image token compression

Original: PARCEL: Pool-Anchored Resampling with Conditioned Elastic Queries for Efficient Vision-Language Understanding

Source: arxiv.org
