x.com · h100envy · Wed, Jun 3, 2026, 10:32 AM PDT
score 16.4
79 likes · 7 RT · 10 replies
Tri Dao on the GPU kernel powering every large language model
Original: Tri Dao wrote FlashAttention, the GPU kernel running inside every large language model on earth.
Source: x.com