Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization
2510.21273v2
stat.ML, cs.LG
2025-10-28
Authors:
Naomi Desobry, Elnura Zhalieva, Souhaib Ben Taieb
Abstract
Probabilistic models must be well calibrated to support reliable
decision-making. While calibration in single-output regression is well studied,
defining and achieving multivariate calibration in multi-output regression
remains considerably more challenging. The existing literature on multivariate
calibration primarily focuses on diagnostic tools based on pre-rank functions,
which are projections that reduce multivariate prediction-observation pairs to
univariate summaries to detect specific types of miscalibration. In this work,
we go beyond diagnostics and introduce a general regularization framework to
enforce multivariate calibration during training for arbitrary pre-rank
functions. This framework encompasses existing approaches such as highest
density region calibration and copula calibration. Our method enforces
calibration by penalizing deviations of the projected probability integral
transforms (PITs) from the uniform distribution, and can be added as a
regularization term to the loss function of any probabilistic predictor.
Specifically, we propose a regularization loss that jointly enforces both
marginal and multivariate pre-rank calibration. We also introduce a new
PCA-based pre-rank that captures calibration along directions of maximal
variance in the predictive distribution while enabling dimensionality
reduction. Across 18 real-world multi-output regression datasets, we show that
unregularized models are consistently miscalibrated, and that our methods
significantly improve calibration across all pre-rank functions without
sacrificing predictive accuracy.
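
The idea of penalizing deviations of projected PITs from the uniform distribution can be illustrated with a small NumPy sketch. This is not the paper's loss: the abstract does not specify the distance used, so the squared gap to uniform quantiles below is an assumption, as are the names `sum_prerank`, `projected_pits`, and `uniformity_penalty` and the toy Gaussian data. The sketch is also not differentiable, so a training-time regularizer would presumably need a smooth surrogate for the indicator.

```python
import numpy as np

def sum_prerank(y):
    # Hypothetical pre-rank: collapse each output vector to the sum of its coordinates.
    return y.sum(axis=-1)

def projected_pits(prerank, y_obs, y_samples):
    # y_obs: (n, d) observations; y_samples: (n, m, d) draws from each predictive distribution.
    r_obs = prerank(y_obs)        # shape (n,)
    r_samp = prerank(y_samples)   # shape (n, m)
    # Empirical CDF of the sampled pre-ranks, evaluated at the observed pre-rank.
    return (r_samp <= r_obs[:, None]).mean(axis=1)

def uniformity_penalty(pits):
    # Assumed distance: mean squared gap between sorted PITs and evenly spaced uniform quantiles.
    pits = np.sort(np.asarray(pits, dtype=float))
    n = pits.size
    targets = (np.arange(1, n + 1) - 0.5) / n
    return float(np.mean((pits - targets) ** 2))

rng = np.random.default_rng(0)
n, m, d = 500, 200, 2
y_obs = rng.normal(size=(n, d))          # observations drawn from N(0, I)
good = rng.normal(size=(n, m, d))        # predictive samples matching the truth
bad = 3.0 * rng.normal(size=(n, m, d))   # over-dispersed predictive samples

pen_good = uniformity_penalty(projected_pits(sum_prerank, y_obs, good))
pen_bad = uniformity_penalty(projected_pits(sum_prerank, y_obs, bad))
print(pen_good, pen_bad)
```

In this toy setting the over-dispersed predictor pushes the projected PITs toward 0.5, so its penalty is larger than that of the matching predictor. In a training loop, a term of this kind would be added to the predictive loss with a weighting coefficient.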