arXiv | Thibaud Ardoin, Jonas Schäfer, Gerhard Wunder | Thu, Jun 4, 2026, 8:54 AM PDT
score 18.1

Language models can secretly mark their own outputs for identification

Original: LLM Self-Recognition: Steering and Retrieving Activation Signatures

Source: arxiv.org
