arXiv
Mikkel Godsk Jørgensen, Lars Kai Hansen
Fri, May 29, 2026, 4:53 AM PDT

Sparse Autoencoders can steer language models better than thought

Original: Steering LLMs? Actually, Sparse Autoencoders can outperform simple baselines

Source: arxiv.org
