SpareCodeSearch: Searching for Code Context When You Have No Spare GPU
2510.12948v1
cs.SE, cs.AI
2025-10-17
Авторы:
Minh Nguyen
Abstract
Retrieval-Augmented Generation (RAG) frameworks aim to enhance Code Language
Models (CLMs) by including another module for retrieving relevant context to
construct the input prompt. However, these retrieval modules commonly use
semantic search, requiring substantial computational resources for training and
hosting these embedded models, making them infeasible to integrate into
lightweight applications such as in-IDE AI-based code completion. In this
solution paper, we prove that using keyword-search is sufficient to retrieve
relevant and useful code context inside large codebases, without the need for
extensive GPU resources. The usefulness of code contexts found by our solution
is demonstrated through their completion results on the Code Context
Competition's benchmark, reaching 0.748 and 0.725 chRF scores on Kotlin and
Python tracks, respectively.
Ссылки и действия
Дополнительные ресурсы: