Understanding Robustness of Model Editing in Code LLMs: An Empirical Study
2511.03182v1
cs.SE, cs.LG
2025-11-07
Авторы:
Vinaik Chhetri, A. B Siddique, Umar Farooq
Abstract
Large language models (LLMs) are increasingly used in software development.
However, while LLMs remain static after pretraining, programming languages and
APIs continue to evolve, leading to the generation of deprecated or
incompatible code that undermines reliability. Retraining LLMs from scratch to
reflect such changes is computationally expensive, making model editing a
promising lightweight alternative that updates only a small subset of
parameters. Despite its potential, it remains unclear whether model editing
yields genuine syntactic and semantic adaptations or merely superficial fixes.
In this work, we present a systematic study of five state-of-the-art model
editing methods: Constrained Fine-Tuning (FT), GRACE, MEMIT, PMET, and ROME. We
apply these methods to three leading open-source code LLMs, CodeLlama,
CodeQwen1.5, and DeepSeek-Coder, under controlled API deprecation scenarios.
Our evaluation covers both instant and sequential editing settings, using three
disjoint evaluation sets designed to assess reliability, generalization, and
specificity. We measure model correctness at three levels: successful
compilation, partial test case pass, and full test pass. Our findings show that
instant edits consistently degrade model performance, with syntactic validity
dropping by up to 86 percentage points and functional correctness declining by
45 points even in the best-performing setting. Sequential edits further amplify
this degradation, and in some cases, model performance collapses entirely.
Across all models, most passing generations relied on workarounds rather than
correctly adopting the intended changes, while faulty adoptions that result in
test failures or compilation errors were significantly more frequent. Correct
adoptions, where the model correctly integrates the intended change, occurred
in only about 6% of cases.
Ссылки и действия
Дополнительные ресурсы: