arXiv · David Lindner, Victoria Krakovna, Sebastian Farquhar · Thu, May 28, 2026, 10:56 AM PDT

score 15.6

1 cite

New test framework checks if AI agents will sabotage their work

Original: Gram: Assessing sabotage propensities via automated alignment auditing
