arXiv
Krishnapriya Vishnubhotla, Hillary Dawkins, Isar Nejadgholi, Svetlana Kiritchenko
Tue, Jun 2, 2026, 6:39 AM PDT
Fine-tuning language models risks safety unless linked to real tasks
Original: Safety Measurements for Fine-tuned LLMs Should be Grounded in Capability
Source: arxiv.org