arXiv | Zhaoyu Zhu, Rui Gao, Shuang Li | Mon, May 25, 2026, 10:42 AM PDT

Proving convergence for transport-based reinforcement learning

Original: Global Convergence of Wasserstein Policy Gradient for Entropy-Regularized Reinforcement Learning

Source: arxiv.org
