x.com
Andrew Ng
Thu, Jun 4, 2026, 9:44 AM PDT
score 17.3
175 likes, 28 RT, 28 replies
New course teaches efficient serving of large AI language models
Original: New course on serving LLMs efficiently -- how do you serve models to many concurrent users at low latency and reasonable cost? This short course is built with @RedHat and taught by @cedricclyburn.
Source: x.com ↗