Using natural language axes for richer AI feedback instead of binary preferences

Original: Cool work from @marceltornev on learning from rich feedback instead of binarized preferences. Annotators name their own axes in natural language, and the reward model conditions on the axis text.

Source: x.com
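As a rough illustration of what "the reward model conditions on the axis text" could mean in practice, here is a minimal sketch. It is not taken from the referenced work: the `AxisFeedback` record, the `[AXIS]`/`[PROMPT]`/`[RESPONSE]` tags, the 0-to-1 score scale, and the helper names are all hypothetical, chosen only to show how a free-text axis can become part of the reward model's input instead of being collapsed into a single "A is better than B" label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisFeedback:
    """One piece of annotator feedback (hypothetical schema)."""
    prompt: str
    response: str
    axis: str     # free-text axis named by the annotator
    score: float  # rating on that axis; the 0..1 scale is an assumption


def build_reward_input(prompt: str, response: str, axis: str) -> str:
    """Prepend the axis text so a reward model can condition on it.

    The tag format is illustrative, not the format used by the original work.
    """
    return f"[AXIS] {axis}\n[PROMPT] {prompt}\n[RESPONSE] {response}"


def to_training_pairs(records: list[AxisFeedback]) -> list[tuple[str, float]]:
    """Turn feedback records into (model_input, target_score) pairs."""
    return [
        (build_reward_input(r.prompt, r.response, r.axis), r.score)
        for r in records
    ]


# The same response receives different targets under different axes.
records = [
    AxisFeedback("Explain DNS.", "DNS maps names to IP addresses.", "conciseness", 0.9),
    AxisFeedback("Explain DNS.", "DNS maps names to IP addresses.", "depth of detail", 0.3),
]
pairs = to_training_pairs(records)
```

In this sketch one response yields two training examples with different targets, which is the kind of information a single binary preference label would discard.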

