arXiv
Vijeta Deshpande, Tootiya Giyahchi, Veena Padmanabhan, Leman Akoglu, Anna Rumshisky
Wed, May 27, 2026, 8:59 AM PDT
Activation steering struggles to generate diverse training data for safety classifiers
Original: Activation Steering for Synthetic Data Generation: The Role of Diversity in Downstream Safety Detection
Source: arxiv.org