x.com · Graham Neubig · Mon, Jun 29, 2026, 4:31 PM PDT
score 17.3
258 likes · 23 RT · 9 replies
Sidekick architecture cuts LLM costs by using small models for simple tasks
Original: We've found this sort of "sidekick" architecture to be very effective at cutting LLM spend because it allows you to do context control and not spend expensive tokens on simple tasks.
Source: gist.github.com ↗
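The post does not spell out how the sidekick is wired up, so the following is only a minimal sketch of the idea under stated assumptions: simple tasks go to a cheap model that sees just the task (context control), while everything else goes to the expensive model with the full history. The `Sidekick` class, the `is_simple` heuristic, and the plain-callable `Model` type are all hypothetical names for illustration and are not taken from the linked gist.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a "model" is any callable from prompt text to reply text.
# In practice this would wrap a real LLM client; none is assumed here.
Model = Callable[[str], str]


@dataclass
class Sidekick:
    main_model: Model                  # expensive model, sees the full history
    small_model: Model                 # cheap model, sees only the task it is given
    is_simple: Callable[[str], bool]   # assumed routing heuristic (illustrative)
    history: List[str] = field(default_factory=list)
    calls: Dict[str, int] = field(default_factory=lambda: {"main": 0, "small": 0})

    def run(self, task: str) -> str:
        if self.is_simple(task):
            # Context control: the small model receives the task alone,
            # so no expensive tokens are spent re-sending the history.
            reply = self.small_model(task)
            self.calls["small"] += 1
        else:
            # Harder tasks get the accumulated history plus the new task.
            prompt = "\n".join(self.history + [task])
            reply = self.main_model(prompt)
            self.calls["main"] += 1
        # Only a short record of the exchange is kept in the main context.
        self.history.append(f"{task} -> {reply[:80]}")
        return reply
```

A caller would construct `Sidekick` with two model callables and a routing rule, then call `run(task)` for each task; the `calls` counter gives a rough picture of how much work was kept off the expensive model. How to decide what counts as "simple" is the real design question, and the length-based or keyword-based rules one might plug in here are placeholders.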