Does the Model Say What the Data Says? A Simple Heuristic for Model Data Alignment

2511.21931v1 cs.LG, cs.AI 2025-12-01

Authors:

Henry Salgado, Meagan Kendall, Martine Ceberio

Abstract

In this work, we propose a simple and computationally efficient framework to evaluate whether machine learning models align with the structure of the data they learn from; that is, whether the model says what the data says. Unlike existing interpretability methods that focus exclusively on explaining model behavior, our approach establishes a baseline derived directly from the data itself. Drawing inspiration from Rubin's Potential Outcomes Framework, we quantify how strongly each feature separates the two outcome groups in a binary classification task, moving beyond traditional descriptive statistics to estimate each feature's effect on the outcome. By comparing these data-derived feature rankings against model-based explanations, we provide practitioners with an interpretable and model-agnostic method to assess model-data alignment.
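The general idea of ranking features by how well they separate the two outcome groups, and then checking that ranking against a model's explanation, can be illustrated with a minimal sketch. The abstract does not state which separation statistic or agreement measure the authors use, so everything below is an assumption: the absolute standardized mean difference stands in for the data-derived score, Spearman rank correlation stands in for the comparison, and the function names `separation_scores`, `ranks`, and `rank_agreement` are hypothetical.

```python
import numpy as np


def separation_scores(X, y):
    """Per-feature absolute standardized mean difference between y==1 and y==0.

    Assumption: this statistic is a stand-in; the paper's own measure
    is not given in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)
    g1, g0 = X[y], X[~y]
    diff = g1.mean(axis=0) - g0.mean(axis=0)
    pooled_sd = np.sqrt((g1.var(axis=0, ddof=1) + g0.var(axis=0, ddof=1)) / 2.0)
    # Arbitrary choice for this sketch: if a feature has zero variance in
    # both groups, divide by 1 so the score is the raw mean difference.
    pooled_sd = np.where(pooled_sd > 0, pooled_sd, 1.0)
    return np.abs(diff) / pooled_sd


def ranks(scores):
    """Rank 1 = largest score; ties are broken by feature index."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    r = np.empty(len(order), dtype=int)
    r[order] = np.arange(1, len(order) + 1)
    return r


def rank_agreement(data_scores, model_scores):
    """Spearman-style agreement: Pearson correlation of the two rank vectors."""
    return float(np.corrcoef(ranks(data_scores), ranks(model_scores))[0, 1])
```

Here `model_scores` would be any per-feature importance vector produced by an explanation method of the reader's choice. An agreement near 1 suggests the model ranks features the way the data does, while a value near -1 suggests the model relies on features the data does not single out.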

