A Median Perspective on Unlabeled Data for Out-of-Distribution Detection
2510.06505v1
cs.LG, cs.AI, math.OC, stat.ML
2025-10-10
Авторы:
Momin Abbas, Ali Falahati, Hossein Goli, Mohammad Mohammadi Amiri
Abstract
Out-of-distribution (OOD) detection plays a crucial role in ensuring the
robustness and reliability of machine learning systems deployed in real-world
applications. Recent approaches have explored the use of unlabeled data,
showing potential for enhancing OOD detection capabilities. However,
effectively utilizing unlabeled in-the-wild data remains challenging due to the
mixed nature of both in-distribution (InD) and OOD samples. The lack of a
distinct set of OOD samples complicates the task of training an optimal OOD
classifier. In this work, we introduce Medix, a novel framework designed to
identify potential outliers from unlabeled data using the median operation. We
use the median because it provides a stable estimate of the central tendency,
as an OOD detection mechanism, due to its robustness against noise and
outliers. Using these identified outliers, along with labeled InD data, we
train a robust OOD classifier. From a theoretical perspective, we derive error
bounds that demonstrate Medix achieves a low error rate. Empirical results
further substantiate our claims, as Medix outperforms existing methods across
the board in open-world settings, confirming the validity of our theoretical
insights.