Seeing Through the MiRAGE: Evaluating Multimodal Retrieval Augmented Generation
2510.24870v1
cs.CL, cs.CV, cs.IR
2025-10-31
Авторы:
Alexander Martin, William Walden, Reno Kriz, Dengjia Zhang, Kate Sanders, Eugene Yang, Chihsheng Jin, Benjamin Van Durme
Abstract
We introduce MiRAGE, an evaluation framework for retrieval-augmented
generation (RAG) from multimodal sources. As audiovisual media becomes a
prevalent source of information online, it is essential for RAG systems to
integrate information from these sources into generation. However, existing
evaluations for RAG are text-centric, limiting their applicability to
multimodal, reasoning intensive settings because they don't verify information
against sources. MiRAGE is a claim-centric approach to multimodal RAG
evaluation, consisting of InfoF1, evaluating factuality and information
coverage, and CiteF1, measuring citation support and completeness. We show that
MiRAGE, when applied by humans, strongly aligns with extrinsic quality
judgments. We additionally introduce automatic variants of MiRAGE and three
prominent TextRAG metrics -- ACLE, ARGUE, and RAGAS -- demonstrating the
limitations of text-centric work and laying the groundwork for automatic
evaluation. We release open-source implementations and outline how to assess
multimodal RAG.
Ссылки и действия
Дополнительные ресурсы: