Contrastive Predictive Coding Done Right for Mutual Information Estimation
2510.25983v1
cs.LG, cs.IT, math.IT, stat.ML
2025-11-01
Авторы:
J. Jon Ryu, Pavan Yeddanapudi, Xiangxiang Xu, Gregory W. Wornell
Abstract
The InfoNCE objective, originally introduced for contrastive representation
learning, has become a popular choice for mutual information (MI) estimation,
despite its indirect connection to MI. In this paper, we demonstrate why
InfoNCE should not be regarded as a valid MI estimator, and we introduce a
simple modification, which we refer to as InfoNCE-anchor, for accurate MI
estimation. Our modification introduces an auxiliary anchor class, enabling
consistent density ratio estimation and yielding a plug-in MI estimator with
significantly reduced bias. Beyond this, we generalize our framework using
proper scoring rules, which recover InfoNCE-anchor as a special case when the
log score is employed. This formulation unifies a broad spectrum of contrastive
objectives, including NCE, InfoNCE, and $f$-divergence variants, under a single
principled framework. Empirically, we find that InfoNCE-anchor with the log
score achieves the most accurate MI estimates; however, in self-supervised
representation learning experiments, we find that the anchor does not improve
the downstream task performance. These findings corroborate that contrastive
representation learning benefits not from accurate MI estimation per se, but
from the learning of structured density ratios.