Monitoring the calibration of probability forecasts with an application to concept drift detection involving image classification
2510.25573v1
stat.ML, cs.LG
2025-10-31
Авторы:
Christopher T. Franck, Anne R. Driscoll, Zoe Szajnfarber, William H. Woodall
Abstract
Machine learning approaches for image classification have led to impressive
advances in that field. For example, convolutional neural networks are able to
achieve remarkable image classification accuracy across a wide range of
applications in industry, defense, and other areas. While these machine
learning models boast impressive accuracy, a related concern is how to assess
and maintain calibration in the predictions these models make. A classification
model is said to be well calibrated if its predicted probabilities correspond
with the rates events actually occur. While there are many available methods to
assess machine learning calibration and recalibrate faulty predictions, less
effort has been spent on developing approaches that continually monitor
predictive models for potential loss of calibration as time passes. We propose
a cumulative sum-based approach with dynamic limits that enable detection of
miscalibration in both traditional process monitoring and concept drift
applications. This enables early detection of operational context changes that
impact image classification performance in the field. The proposed chart can be
used broadly in any situation where the user needs to monitor probability
predictions over time for potential lapses in calibration. Importantly, our
method operates on probability predictions and event outcomes and does not
require under-the-hood access to the machine learning model.
Ссылки и действия
Дополнительные ресурсы: