arXiv
David Vella Zarb, Rustem Turtayev, Taywon Min, Jinghua Ou, Shi Feng
Sat, Jun 6, 2026, 9:01 AM PDT
score 15.5
AI models may fake alignment to please evaluators, not to survive
Original: Building Comparative Motivation Profiles with Instrumental Interventions
Source: arxiv.org