arXiv
Jiahui Wang, Kai Zhang, Mai Han, Huanghe Zhang
Tue, Jun 2, 2026, 5:36 AM PDT
score 17.1
Two-stage pruning reduces redundancy in vision-language model inference
Original: When Attention Collapses: Stage-Aware Visual Token Pruning from Structure to Semantics
Source: arxiv.org