Cost-Efficient Long Code Translation using LLMs while Leveraging Identifier Replacements
2510.09045v1
cs.SE, cs.AI, cs.IR, cs.LG
2025-10-14
Authors:
Manojit Chakraborty, Madhusudan Ghosh, Rishabh Gupta
Abstract
In software development, LLMs have been used to automate tasks such as code
translation, where source code in one programming language is translated to
another while preserving its functionality. However, LLMs often struggle with
long source code that does not fit into the context window, which leads to
inaccurate translations. To address this, we propose a novel
zero-shot code translation method that incorporates identifier replacement. By
substituting long user-defined identifiers with generalized placeholders during
translation, our method reduces token count and memory usage and lets the LLM
focus on the logical structure of the code, which improves the efficiency and
cost-effectiveness of long code translation. Our empirical results demonstrate
that our approach preserves syntactic and hierarchical information and produces
translations with fewer tokens.
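
The identifier replacement idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `id_N` placeholder format, the `min_length` threshold, and the regex-based tokenization are all assumptions made for illustration. The regex is deliberately naive and would also rewrite matching words inside string literals and comments.

```python
import re

# Naive identifier pattern (assumption: a real system would use a language-aware lexer).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def replace_identifiers(source, min_length=12, reserved=frozenset()):
    """Swap long identifiers for short placeholders.

    Returns the shortened source and a mapping {original: placeholder}.
    `min_length` and the `id_N` placeholder format are hypothetical choices.
    """
    existing = set(IDENTIFIER.findall(source))  # names already used in the source
    mapping = {}
    counter = 0

    def substitute(match):
        nonlocal counter
        name = match.group(0)
        if len(name) < min_length or name in reserved:
            return name  # short names and reserved words stay as they are
        if name not in mapping:
            placeholder = f"id_{counter}"
            while placeholder in existing:  # avoid clashing with a real identifier
                counter += 1
                placeholder = f"id_{counter}"
            mapping[name] = placeholder
            counter += 1
        return mapping[name]

    return IDENTIFIER.sub(substitute, source), mapping


def restore_identifiers(translated, mapping):
    """Put the original identifiers back into the translated code."""
    reverse = {placeholder: name for name, placeholder in mapping.items()}
    return IDENTIFIER.sub(lambda m: reverse.get(m.group(0), m.group(0)), translated)


if __name__ == "__main__":
    src = "int accumulatedTransactionTotal = computeMonthlyInterestRate(accumulatedTransactionTotal);"
    masked, mapping = replace_identifiers(src)
    print(masked)                                # shortened text that would be sent to the LLM
    print(restore_identifiers(masked, mapping))  # original names restored afterwards
```

In such a pipeline, the shortened text would be sent to the LLM for translation and `restore_identifiers` would be applied to the model's output, assuming the model leaves the placeholders unchanged.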