LLM-Specific Utility: A New Perspective for Retrieval-Augmented Generation
2510.11358v1
cs.CL, cs.AI, cs.IR
2025-10-15
Авторы:
Hengran Zhang, Keping Bi, Jiafeng Guo, Jiaming Zhang, Shuaiqiang Wang, Dawei Yin, Xueqi Cheng
Abstract
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by
incorporating external knowledge. While traditional retrieval focuses on
relevance, RAG's effectiveness depends on the utility of retrieved passages,
i.e., the usefulness in facilitating the generation of an accurate and
comprehensive answer. Existing studies often treat utility as a generic
attribute, ignoring the fact that different LLMs may benefit differently from
the same passage due to variations in internal knowledge and comprehension
ability. In this work, we introduce and systematically investigate the notion
of LLM-specific utility. Through large-scale experiments across multiple
datasets and LLMs, we demonstrate that human-annotated passages are not optimal
for LLMs and that ground-truth utilitarian passages are not transferable across
different LLMs. These findings highlight the necessity of adopting the
LLM-specific utility in RAG research. Our findings indicate that some
human-annotated passages are not ground-truth utilitarian passages for specific
LLMs, partially due to the varying readability of queries and passages for
LLMs, a tendency for which perplexity is a key metric. Based on these findings,
we propose a benchmarking procedure for LLM-specific utility judgments. We
evaluate existing utility judgment methods on six datasets and find that while
verbalized methods using pseudo-answers perform robustly, LLMs struggle to
assess utility effectively-failing to reject all passages for known queries and
to select truly useful ones for unknown queries.
Ссылки и действия
Дополнительные ресурсы: