arXiv
Anton Bolychev, Georgiy Malaniya, Sinan Ibrahim, Pavel Osinenko
Mon, Jun 8, 2026, 10:59 AM PDT
score 17.2

RL training method transfers agency from a working baseline to an improved policy

Original: An Agency-Transferring Model-Free Policy Enhancement Technique

Source: arxiv.org
