x.com | Muyu He | Tue, Jun 2, 2026, 8:49 PM PDT
score 17.2
805 likes | 69 RT | 6 replies
Mathematics-first approach to optimizing gradient computations in AI training
Original: I am a big fan of Jianlin Su's blog because it always starts from first principles in mathematics, rather than "ML tricks", to approach a typical ML problem (e.g., training-free MoE load balancing).
Source: x.com ↗