arXiv · Hankun Lin, Ruqi Zhang · Mon, Jun 8, 2026, 8:33 AM PDT
score 17.1
Lightweight steering method fixes language model drift mid-generation
Original: Gradient-Guided Reward Optimization for Inference-time Alignment
Source: arxiv.org