arXiv | Krishnapriya Vishnubhotla, Hillary Dawkins, Isar Nejadgholi, Svetlana Kiritchenko | Tue, Jun 2, 2026, 6:39 AM PDT

Safety tests for fine-tuned language models should account for what the models can actually do

Original: Safety Measurements for Fine-tuned LLMs Should be Grounded in Capability

Source: arxiv.org
