When Sufficient is not Enough: Utilizing the Rashomon Effect for Complete Evidence Extraction

2511.07055v1 cs.CL, cs.IR, cs.LG 2025-11-15
Authors:

Katharina Beckh, Stefan Rüping

Abstract

Feature attribution methods typically provide minimal sufficient evidence justifying a model decision. However, in many applications this is inadequate. For compliance and cataloging, the full set of contributing features, the complete evidence, must be identified. We perform a case study on a medical dataset that contains human-annotated complete evidence. We show that individual models typically recover only subsets of the complete evidence and that aggregating evidence from several models improves evidence recall from $\sim$0.60 (single best model) to $\sim$0.86 (ensemble). We analyze the recall-precision trade-off, the role of training with evidence, and dynamic ensembles with certainty thresholds, and we discuss implications.
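The aggregation idea in the abstract can be illustrated with a minimal sketch. It assumes that each model's evidence is a set of token indices and that aggregation is a plain set union; the abstract does not specify the aggregation rule, so the union, the function names, and the toy data below are all hypothetical and are not the authors' implementation.

```python
from typing import List, Set


def evidence_recall(predicted: Set[int], gold: Set[int]) -> float:
    """Fraction of gold (human-annotated) evidence that was recovered."""
    return len(predicted & gold) / len(gold) if gold else 1.0


def evidence_precision(predicted: Set[int], gold: Set[int]) -> float:
    """Fraction of predicted evidence that is actually gold evidence."""
    return len(predicted & gold) / len(predicted) if predicted else 1.0


def aggregate_evidence(per_model: List[Set[int]]) -> Set[int]:
    """Assumed aggregation rule: union of the evidence sets of all models."""
    return set().union(*per_model)


# Toy data (made up for illustration): token indices marked as evidence.
gold = {1, 2, 3, 4, 5}
model_a = {1, 2, 3}   # recovers a subset of the complete evidence
model_b = {3, 4, 9}   # overlaps with model_a, adds one spurious token
model_c = {5}

ensemble = aggregate_evidence([model_a, model_b, model_c])

single_recall = evidence_recall(model_a, gold)
ensemble_recall = evidence_recall(ensemble, gold)
ensemble_precision = evidence_precision(ensemble, gold)
```

In this toy case the union recovers every gold token but also keeps the spurious token from `model_b`, so recall rises while precision falls below 1. This is the kind of recall-precision trade-off the abstract refers to.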

Related articles

Are Smaller Open-Weight LLMs Closing the Gap to Proprietary Models for Biomedica...

Context: Open-weight versions of large language models (LLMs) keep making significant breakthroughs in AI. ...

2025-09-25

Mental Multi-class Classification on Social Media: Benchmarking Transformer Arch...

Context: Social media is becoming an important platform for disclosing personal struggles with psychological disorders...

2025-09-24

mmBERT: A Modern Multilingual Encoder with Annealed Language Learning

Context: Modern language representation models such as BERT are currently widely used for solving...

2025-09-10