Mode Collapse of Mean-Field Variational Inference
2510.17063v1
stat.ML, cs.LG
2025-10-22
Авторы:
Shunan Sheng, Bohan Wu, Alberto González-Sanz
Abstract
Mean-field variational inference (MFVI) is a widely used method for
approximating high-dimensional probability distributions by product measures.
It has been empirically observed that MFVI optimizers often suffer from mode
collapse. Specifically, when the target measure $\pi$ is a mixture $\pi = w P_0
+ (1 - w) P_1$, the MFVI optimizer tends to place most of its mass near a
single component of the mixture. This work provides the first theoretical
explanation of mode collapse in MFVI. We introduce the notion to capture the
separatedness of the two mixture components -- called
$\varepsilon$-separateness -- and derive explicit bounds on the fraction of
mass that any MFVI optimizer assigns to each component when $P_0$ and $P_1$ are
$\varepsilon$-separated for sufficiently small $\varepsilon$. Our results
suggest that the occurrence of mode collapse crucially depends on the relative
position of the components. To address this issue, we propose the rotational
variational inference (RoVI), which augments MFVI with a rotation matrix. The
numerical studies support our theoretical findings and demonstrate the benefits
of RoVI.
Ссылки и действия
Дополнительные ресурсы: