arXiv
Wenhui Tan, Minghao Li, Xiaoqian Ma, Siqi Fan, Xiusheng Huang, Liujie Zhang, Ruihua Song, Weihang Chen
Tue, May 26, 2026, 9:31 AM PDT
New method speeds up language model inference by half
Original: Pair-In, Pair-Out: Latent Multi-Token Prediction for Efficient LLMs
Source: arxiv.org