x.com · Rulin Shao · Mon, May 25, 2026, 12:49 PM PDT
score 15.8
195 likes · 28 RT · 3 replies
Training system learns to improve its own reward metrics
Original: DR Tulu is now accepted for an oral presentation at #ICML2026 🙏
Source: arxiv.org