arXiv
Zhiben Chen, Youpeng Zhao, Yang Sui, Jun Wang, Yuzhang Shang
Tue, May 19, 2026, 10:59 AM PDT
Faster, lossless inference for mixture-of-experts diffusion LLMs through I/O-aware expert offloading
Original: TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload
Source: arxiv.org