Query Generation Pipeline with Enhanced Answerability Assessment for Financial Information Retrieval

2511.05000v1 cs.IR, cs.AI 2025-11-11

Authors:

Hyunkyu Kim, Yeeun Yoo, Youngjun Kwak

Abstract

As financial applications of large language models (LLMs) gain attention, accurate Information Retrieval (IR) remains crucial for reliable AI services. However, existing benchmarks fail to capture the complex and domain-specific information needs of real-world banking scenarios. Building domain-specific IR benchmarks is costly and constrained by legal restrictions on using real customer data. To address these challenges, we propose a systematic methodology for constructing domain-specific IR benchmarks through LLM-based query generation. As a concrete implementation of this methodology, our pipeline combines single and multi-document query generation with an enhanced and reasoning-augmented answerability assessment method, achieving stronger alignment with human judgments than prior approaches. Using this methodology, we construct KoBankIR, comprising 815 queries derived from 204 official banking documents. Our experiments show that existing retrieval models struggle with the complex multi-document queries in KoBankIR, demonstrating the value of our systematic approach for domain-specific benchmark construction and underscoring the need for improved retrieval techniques in financial domains.
