RESCUE: Retrieval Augmented Secure Code Generation
2510.18204v1
cs.CR, cs.LG, cs.SE
2025-10-23
Authors:
Jiahao Shi, Tianyi Zhang
Abstract
Despite recent advances, Large Language Models (LLMs) still generate
vulnerable code. Retrieval-Augmented Generation (RAG) has the potential to
enhance LLMs for secure code generation by incorporating external security
knowledge. However, the conventional RAG design struggles with the noise of raw
security-related documents, and existing retrieval methods overlook the
significant security semantics implicitly embedded in task descriptions. To
address these issues, we propose RESCUE, a new RAG framework for secure code
generation with two key innovations. First, we propose a hybrid knowledge base
construction method that combines LLM-assisted cluster-then-summarize
distillation with program slicing, producing both high-level security
guidelines and concise, security-focused code examples. Second, we design a
hierarchical multi-faceted retrieval method that traverses the constructed
knowledge base from top to bottom and integrates multiple security-critical
facts at each
hierarchical level, ensuring comprehensive and accurate retrieval. We evaluated
RESCUE on four benchmarks and compared it with five state-of-the-art secure
code generation methods on six LLMs. The results demonstrate that RESCUE
improves the SecurePass@1 metric by an average of 4.8 points, establishing a
new state-of-the-art performance for security. Furthermore, we performed
in-depth analysis and ablation studies to rigorously validate the effectiveness
of individual components in RESCUE.