arXiv | Qinyan Zhou, Peixin Zhang, Jun Sun, Haonan Zhang, Dongxia Wang | Tue, Jun 2, 2026, 6:07 AM PDT

Automated tool finds and fixes cases where AI safety systems wrongly reject safe questions

Original: DDOR: Delta Debugging for Explainable Overrefusal Testing and Repair

Source: arxiv.org
