x.comhardmaruWed, May 27, 2026, 7:52 AM PDT
score 17.9
1,740likes229RT46reply
Training deep networks one block at a time cuts memory use
Original: For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall.
Source: x.com ↗
Writing ELI5 summary…