x.com | Daily Dose of Data Science | Sat, Jul 4, 2026, 2:30 AM PDT
score 20.3
111 likes, 18 RT, 3 replies
Natural language rewards replace hand-coded scoring in RL agent training
Original: Karpathy's prediction about RL is coming true now!
Source: x.com ↗