arXiv
Wenhui Tan, Minghao Li, Xiaoqian Ma, Siqi Fan, Xiusheng Huang, Liujie Zhang, Ruihua Song, Weihang Chen
Tue, May 26, 2026, 9:31 AM PDT

New method speeds up language model inference by half

Original: Pair-In, Pair-Out: Latent Multi-Token Prediction for Efficient LLMs

Source: arxiv.org
