Scalable Utility-Aware Multiclass Calibration
2510.25458v1
cs.LG, cs.AI, stat.ML
2025-10-31
Авторы:
Mahmoud Hegazy, Michael I. Jordan, Aymeric Dieuleveut
Abstract
Ensuring that classifiers are well-calibrated, i.e., their predictions align
with observed frequencies, is a minimal and fundamental requirement for
classifiers to be viewed as trustworthy. Existing methods for assessing
multiclass calibration often focus on specific aspects associated with
prediction (e.g., top-class confidence, class-wise calibration) or utilize
computationally challenging variational formulations. In this work, we study
scalable \emph{evaluation} of multiclass calibration. To this end, we propose
utility calibration, a general framework that measures the calibration error
relative to a specific utility function that encapsulates the goals or decision
criteria relevant to the end user. We demonstrate how this framework can unify
and re-interpret several existing calibration metrics, particularly allowing
for more robust versions of the top-class and class-wise calibration metrics,
and, going beyond such binarized approaches, toward assessing calibration for
richer classes of downstream utilities.
Ссылки и действия
Дополнительные ресурсы: