Beyond the limitation of a single query: Train your LLM for query expansion with Reinforcement Learning
2510.10009v1
cs.CL, cs.AI, cs.IR
2025-10-15
Авторы:
Shu Zhao, Tan Yu, Anbang Xu
Abstract
Reasoning-augmented search agents, such as Search-R1, are trained to reason,
search, and generate the final answer iteratively. Nevertheless, due to their
limited capabilities in reasoning and search, their performance on multi-hop QA
benchmarks remains far from satisfactory. To handle complex or compound
queries, we train an LLM-based search agent with the native capability of query
expansion through reinforcement learning. In each turn, our search agent
proposes several query variants, which are searched simultaneously to cover
more relevant information. Meanwhile, given limited post-training data and
computing resources, it is very challenging for a search agent to master
multiple tasks, including query generation, retrieved information
understanding, and answer generation. Therefore, we propose incorporating a
pre-trained squeezer model that helps the search agent understand the retrieved
documents, allowing the search agent to focus on query generation for high
retrieval recall. With the assistance of the squeezer model, we discover that
even a small-scale 3B LLM can demonstrate a strong capability of query
expansion and achieve state-of-the-art accuracy on the multi-hop QA benchmarks.
To be specific, our experiments across seven question-answering benchmarks
demonstrate that our method, named ExpandSearch, achieves an average
improvement of 4.4% compared to state-of-the-art baselines, with strong gains
on multi-hop reasoning tasks requiring diverse evidence aggregation.
Ссылки и действия
Дополнительные ресурсы: