Can LLMs subtract numbers?
2511.02795v1
cs.LG, cs.CL
2025-11-06
Авторы:
Mayank Jobanputra, Nils Philipp Walter, Maitrey Mehta, Blerta Veseli, Evan Parker Kelly Chapple, Yifan Wang, Sneha Chetani, Ellie Pavlick, Antonio Vergari, Vera Demberg
Abstract
We present a systematic study of subtraction in large language models (LLMs).
While prior benchmarks emphasize addition and multiplication, subtraction has
received comparatively little attention despite being structurally distinct as
a non-commutative operation. We evaluate eight pretrained LLMs spanning four
families on addition and subtraction problems. Our experiments reveal that
subtraction accuracy lags behind addition by a wide margin. We find that the
errors for ($a-b$) are concentrated in cases where ($a<b$). In such cases, LLMs
frequently produce the correct magnitude but omit the negative sign. Probing
analyses show that LLMs internally encode whether results should be negative,
yet this information is often not reflected in generated outputs. We further
test well-known techniques such as few-shot learning and instruction-tuning to
see if they can improve the LLMs' performance. Our results suggest that while
few-shot prompting yields modest gains, the instruction-tuned models achieve
near-perfect accuracies in generating the negative sign. Together, these
findings provide a clearer characterization of the limitations and
recoverability of LLMs' arithmetic capabilities in subtraction.
Ссылки и действия
Дополнительные ресурсы: