When Sufficient is not Enough: Utilizing the Rashomon Effect for Complete Evidence Extraction

2511.07055v1 cs.CL, cs.IR, cs.LG 2025-11-15

Authors:

Katharina Beckh, Stefan Rüping

Abstract

Feature attribution methods typically provide minimal sufficient evidence justifying a model decision. However, in many applications this is inadequate. For compliance and cataloging, the full set of contributing features, that is, the complete evidence, must be identified. We perform a case study on a medical dataset which contains human-annotated complete evidence. We show that individual models typically recover only subsets of the complete evidence and that aggregating evidence from several models improves evidence recall from $\sim$0.60 (single best model) to $\sim$0.86 (ensemble). We analyze the recall-precision trade-off, the role of training with evidence, and dynamic ensembles with certainty thresholds, and we discuss implications.
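The aggregation idea in the abstract can be illustrated with a minimal sketch. It assumes that each model's evidence is a set of token indices and that aggregation is a plain set union; the abstract does not specify the aggregation rule, so the union, the function names `aggregate_evidence` and `recall_precision`, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple


def aggregate_evidence(per_model: Iterable[Set[int]]) -> Set[int]:
    """Union the evidence token indices selected by each model (assumed rule)."""
    combined: Set[int] = set()
    for evidence in per_model:
        combined |= evidence
    return combined


def recall_precision(predicted: Set[int], gold: Set[int]) -> Tuple[float, float]:
    """Return (recall, precision) of predicted evidence against gold evidence."""
    hits = len(predicted & gold)
    recall = hits / len(gold) if gold else 0.0
    precision = hits / len(predicted) if predicted else 0.0
    return recall, precision


# Toy, made-up data for illustration only (not results from the paper):
# token indices that a human annotator marked as complete evidence.
gold = {2, 3, 7, 8, 12}
# Evidence extracted by three hypothetical models; each finds only a subset.
model_evidence = [{2, 3}, {7, 8, 9}, {3, 12}]

single = [recall_precision(e, gold) for e in model_evidence]
ensemble = recall_precision(aggregate_evidence(model_evidence), gold)

for i, (r, p) in enumerate(single):
    print(f"model {i}: recall={r:.2f} precision={p:.2f}")
print(f"ensemble: recall={ensemble[0]:.2f} precision={ensemble[1]:.2f}")
```

In this toy setup, each model alone covers only part of the annotated evidence, while the union covers more of it at the cost of also including a token that is not in the annotation. That is the recall-precision trade-off the abstract refers to.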

