arXiv
Zhongyang Lin, Ziran Zhao, Feifei Zhai, Pengyuan Liu
Tue, Jun 2, 2026, 4:01 AM PDT
New defense system stops AI jailbreak attacks without blocking helpful requests
Original: NeuroArmor: Safe-Variant-Guided Representation Consistency for Selective Re-Anchoring in Jailbreak Defense
Source: arxiv.org