arXiv
Yuhua Zhou, Shaoqi Yu, Shichao Weng, Changhai Zhou, Mingze Yin, Fei Yang, Aimin Pan
Mon, Jun 8, 2026, 7:06 AM PDT
score 17.1
Technique skips unnecessary AI model layers to meet speed budgets
Original: BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference
Source: arxiv.org ↗