arXiv
Yutao Sun, Yanqi Zhang, Li Dong, Jianyong Wang, Furu Wei
Thu, Jun 4, 2026, 10:54 AM PDT
score 17.2
Faster long-context AI by reusing attention computations across layers
Original: You Only Index Once: Cross-Layer Sparse Attention with Shared Routing
Source: arxiv.org