arXiv · Donghwan Lee · Fri, May 15, 2026, 8:54 AM PDT

New math explains why Q-learning overestimates value

Original: Sign-Separated Finite-Time Error Analysis of Q-Learning

Source: arxiv.org
