x.com · Muyu He · Tue, Jun 2, 2026, 8:49 PM PDT
score 17.2
805 likes · 69 RT · 6 replies

Mathematics-first approach to optimizing gradient computations in AI training

Original: I am a big fan of Jianlin Su's blog because it always starts from first principles in mathematics, rather than "ML tricks," to approach a typical ML problem (e.g., training-free MoE load balancing).

Source: x.com
