HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
2510.07794v1
cs.CL, cs.AI, cs.LG
2025-10-11
Авторы:
Peilin Wu, Mian Zhang, Kun Wan, Wentian Zhao, Kaiyu He, Xinya Du, Zhiyu Chen
Abstract
Agentic RAG is a powerful technique for incorporating external information
that LLMs lack, enabling better problem solving and question answering.
However, suboptimal search behaviors exist widely, such as over-search
(retrieving information already known) and under-search (failing to search when
necessary), which leads to unnecessary overhead and unreliable outputs. Current
training methods, which typically rely on outcome-based rewards in a RL
framework, lack the fine-grained control needed to address these
inefficiencies. To overcome this, we introduce Hierarchical Process Rewards for
Efficient agentic RAG (HiPRAG), a training methodology that incorporates a
fine-grained, knowledge-grounded process reward into the RL training. Our
approach evaluates the necessity of each search decision on-the-fly by
decomposing the agent's reasoning trajectory into discrete, parsable steps. We
then apply a hierarchical reward function that provides an additional bonus
based on the proportion of optimal search and non-search steps, on top of
commonly used outcome and format rewards. Experiments on the Qwen2.5 and
Llama-3.2 models across seven diverse QA benchmarks show that our method
achieves average accuracies of 65.4% (3B) and 67.2% (7B). This is accomplished
while improving search efficiency, reducing the over-search rate to just 2.3%
and concurrently lowering the under-search rate. These results demonstrate the
efficacy of optimizing the reasoning process itself, not just the final
outcome. Further experiments and analysis demonstrate that HiPRAG shows good
generalizability across a wide range of RL algorithms, model families, sizes,
and types. This work demonstrates the importance and potential of fine-grained
control through RL, for improving the efficiency and optimality of reasoning
for search agents.
Ссылки и действия
Дополнительные ресурсы: