x.com · Qiyao Ma · Sat, Jul 4, 2026, 4:59 PM PDT

AI safety alignment bottleneck: training a multi-target reward model

Original: Been working on non-verifiable domains — open-domain QA last year, safety alignment this summer at MSR.

Source: x.com
