arXiv
Yang Zhou, Ranajoy Sadhukhan, Zhaofeng Sun, Zhuoming Chen, Souvik Kundu, Saket Dingliwal, Sai Muralidhar Jayanthi, Aram Galstyan, Haizhong Zheng, Beidi Chen
Sat, Jun 6, 2026, 9:24 PM PDT
Sparse rollout training stabilizes reasoning-heavy language model learning
Original: Sparrow: Sparse Rollout for Stable and Efficient Long-context RL of Large Language Models
Source: arxiv.org