x.com | Qiyao Ma | Sat, Jul 4, 2026, 4:59 PM PDT
score 15.3
29 likes, 2 replies
AI safety alignment bottleneck: training a multi-target reward model
Original: Been working on non-verifiable domains — open-domain QA last year, safety alignment this summer at MSR.