Beyond Ranked Lists: The SARAL Framework for Cross-Lingual Document Set Retrieval
2511.03228v1
cs.CL, cs.IR
2025-11-07
Authors:
Shantanu Agarwal, Joel Barry, Elizabeth Boschee, Scott Miller
Abstract
Machine Translation for English Retrieval of Information in Any Language
(MATERIAL) is an IARPA initiative aimed at advancing the state of the art in
cross-lingual information retrieval (CLIR). This report provides a detailed
description of the Information Sciences Institute's (ISI's) Summarization and
domain-Adaptive Retrieval Across Languages (SARAL) effort for MATERIAL.
Specifically, we outline our team's novel approach to CLIR, with emphasis
on developing a method suited to retrieving a query-relevant document
set, and not just a ranked document list. In MATERIAL's Phase-3
evaluations, SARAL exceeded the performance of other teams in five out of six
evaluation conditions spanning three different languages (Farsi, Kazakh, and
Georgian).