arXiv · Zhenyu Sun, Zheng Xu, Ermin Wei · Thu, May 28, 2026, 10:56 AM PDT
score 14.8

AI learns to adapt reward preferences on the fly without retraining

Original: In-Context Reward Adaptation for Robust Preference Modeling

Source: arxiv.org
