arXiv
Yuhua Zhou, Shaoqi Yu, Shichao Weng, Changhai Zhou, Mingze Yin, Fei Yang, Aimin Pan
Mon, Jun 8, 2026, 7:06 AM PDT

Technique skips unnecessary AI model layers to meet speed budgets

Original: BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference

Source: arxiv.org
