x.com | h100envy | Wed, Jun 3, 2026, 10:32 AM PDT
score 16.4
79 likes | 7 RT | 10 replies

Tri Dao on the GPU kernel powering every large language model

Original: Tri Dao wrote FlashAttention, the GPU kernel running inside every large language model on earth.

Source: x.com
