arXiv · Malia Barker, Bishal Lakha, Edoardo Serra, Francesco Gullo · Tue, Jun 2, 2026, 6:09 AM PDT
LLMs fail simple math when numbers change slightly
Original: Testing LLM Arithmetic Reasoning Generalization with Automatic Numeric-Remapping Attacks
Source: arxiv.org