On Joint Regularization and Calibration in Deep Ensembles
2511.04160v2
cs.LG, stat.ML
2025-11-10
Авторы:
Laurits Fredsgaard, Mikkel N. Schmidt
Abstract
Deep ensembles are a powerful tool in machine learning, improving both model
performance and uncertainty calibration. While ensembles are typically formed
by training and tuning models individually, evidence suggests that jointly
tuning the ensemble can lead to better performance. This paper investigates the
impact of jointly tuning weight decay, temperature scaling, and early stopping
on both predictive performance and uncertainty quantification. Additionally, we
propose a partially overlapping holdout strategy as a practical compromise
between enabling joint evaluation and maximizing the use of data for training.
Our results demonstrate that jointly tuning the ensemble generally matches or
improves performance, with significant variation in effect size across
different tasks and metrics. We highlight the trade-offs between individual and
joint optimization in deep ensemble training, with the overlapping holdout
strategy offering an attractive practical solution. We believe our findings
provide valuable insights and guidance for practitioners looking to optimize
deep ensemble models. Code is available at:
https://github.com/lauritsf/ensemble-optimality-gap
Ссылки и действия
Дополнительные ресурсы: