arXiv · David Demitri Africa, Arathi Mani · Tue, Jun 2, 2026, 8:54 AM PDT

Consistency training can accidentally amplify AI misalignment

Original: Consistency Training Can Entrench Misalignment

Source: arxiv.org
