arXiv
Jian Lin, Jiazhi Mi, Zicong Hong, Haodong Wang, Qianli Liu, Haodyue Zhang, Peng Li, Song Guo
Mon, May 18, 2026, 1:54 AM PDT
KVDrive speeds up long-context AI inference across GPU, RAM, and SSD
Original: KVDrive: A Holistic Multi-Tier KV Cache Management System for Long-Context LLM Inference
Source: arxiv.org ↗