arXiv
Zhenyu Sun, Zheng Xu, Ermin Wei
Thu, May 28, 2026, 10:56 AM PDT
score 14.8
AI learns to adapt reward preferences on the fly without retraining
Original: In-Context Reward Adaptation for Robust Preference Modeling
Source: arxiv.org