arXiv
Manjiang Yu, Hongji Li, Junwei Chen, Xue Li, Priyanka Singh, Yang Cao, Lijie Hu
Wed, May 27, 2026, 9:39 AM PDT
score 16.5
Adaptive safety corrections for language models without retraining
Original: Multi-Adapter Representation Interventions via Energy Calibration
Source: arxiv.org