arXiv
Juncheng Wu, Hardy Chen, Haoqin Tu, Xianfeng Tang, Freda Shi, Hui Liu, Hanqing Lu, Cihang Xie, Yuyin Zhou
Tue, May 19, 2026, 10:58 AM PDT
Separating seeing from thinking improves vision-language model training
Original: From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models
Source: arxiv.org