arXiv
Runxi Cheng, Yuchen Guan, Yongxian Wei, Qianpu Sun, Qixiu Li, Sinan Du, Feng Xiong, Chun Yuan, Yan Lu, Yeyun Gong
Wed, May 20, 2026, 2:35 AM PDT
Reusing model memories to scale language models more cheaply
Original: Memory Grafting: Scaling Language Model Pre-training via Offline Conditional Memory
Source: arxiv.org