SpurBreast: A Curated Dataset for Investigating Spurious Correlations in Real-world Breast MRI Classification
2510.02109v1
eess.IV, cs.AI, cs.CV
2025-10-04
Authors:
Jong Bum Won, Wesley De Neve, Joris Vankerschaver, Utku Ozbulak
Abstract
Deep neural networks (DNNs) have demonstrated remarkable success in medical
imaging, yet their real-world deployment remains challenging due to spurious
correlations, where models can learn non-clinical features instead of
meaningful medical patterns. Existing medical imaging datasets are not designed
to systematically study this issue, largely due to restrictive licensing and
limited supplementary patient data. To address this gap, we introduce
SpurBreast, a curated breast MRI dataset that intentionally incorporates
spurious correlations to evaluate their impact on model performance. Analyzing
over 100 features involving patient, device, and imaging protocol, we identify
two dominant spurious signals: magnetic field strength (a global feature
influencing the entire image) and image orientation (a local feature affecting
spatial alignment). Through controlled dataset splits, we demonstrate that DNNs
can exploit these non-clinical signals, achieving high validation accuracy
while failing to generalize to unbiased test data. Alongside these two datasets
containing spurious correlations, we also provide benchmark datasets without
spurious correlations, allowing researchers to systematically investigate
clinically relevant and irrelevant features, uncertainty estimation,
adversarial robustness, and generalization strategies. Models and datasets are
available at https://github.com/utkuozbulak/spurbreast.
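
The controlled splits described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a binary label and a binary non-clinical attribute (for example, field strength coded as 0 or 1), uses synthetic records, and makes the attribute perfectly aligned with the label in the training and validation splits. The function name `make_spurious_splits` and all parameters are hypothetical.

```python
import random
from collections import Counter


def make_spurious_splits(records, n_per_cell_test, seed=0):
    """Return (train, val, test) index lists.

    records: list of (label, attr) pairs, both in {0, 1} (an assumption).
    train/val contain only samples where attr == label (spurious shortcut);
    test draws the same number of samples from every (label, attr) cell.
    """
    rng = random.Random(seed)
    cells = {(y, a): [] for y in (0, 1) for a in (0, 1)}
    for i, (y, a) in enumerate(records):
        cells[(y, a)].append(i)
    for idx in cells.values():
        rng.shuffle(idx)

    # Unbiased test set: equal count from each (label, attribute) cell.
    test = []
    for key, idx in cells.items():
        if len(idx) < n_per_cell_test:
            raise ValueError(f"not enough samples in cell {key}")
        test += idx[:n_per_cell_test]
        cells[key] = idx[n_per_cell_test:]

    # Biased pool: keep only the cells where the attribute matches the label,
    # so the attribute alone predicts the label during training/validation.
    biased = cells[(0, 0)] + cells[(1, 1)]
    rng.shuffle(biased)
    cut = len(biased) * 4 // 5  # 80/20 train/validation split
    return biased[:cut], biased[cut:], test


# Synthetic example: 100 samples in each (label, attribute) cell.
records = [(y, a) for y in (0, 1) for a in (0, 1) for _ in range(100)]
train, val, test = make_spurious_splits(records, n_per_cell_test=20)

print("train cells:", Counter(records[i] for i in train))
print("val cells:  ", Counter(records[i] for i in val))
print("test cells: ", Counter(records[i] for i in test))
```

A model trained on such a split can reach high validation accuracy by reading the attribute alone, because validation shares the training bias; on the balanced test split the attribute carries no information about the label, so that shortcut yields chance-level accuracy.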