arXiv
Jian Lin, Jiazhi Mi, Zicong Hong, Haodong Wang, Qianli Liu, Haodyue Zhang, Peng Li, Song Guo
Mon, May 18, 2026, 1:54 AM PDT

KVDrive speeds up long-context AI inference across GPU, RAM, and SSD

Original: KVDrive: A Holistic Multi-Tier KV Cache Management System for Long-Context LLM Inference

Source: arxiv.org
