arXiv · Zhanming Shen, Jintao Tong, Shaotian Yan, Chen Shen, Hao Chen, Wentao Ye, Xiaomeng Hu, Rui Miao, Haobo Wang, Junbo Zhao, Gang Chen, Jieping Ye · Thu, Jul 2, 2026, 7:33 AM PDT
score 17.0

Fix for self-distillation method that harms long-reasoning AI models

Original: Purified OPSD: On-Policy Self-Distillation Without Losing How to Think

Source: arxiv.org
