Overspecified Mixture Discriminant Analysis: Exponential Convergence, Statistical Guarantees, and Remote Sensing Applications
2510.27056v1
stat.ML, cs.LG
2025-11-04
Authors:
Arman Bolatov, Alan Legg, Igor Melnykov, Amantay Nurlanuly, Maxat Tezekbayev, Zhenisbek Assylbekov
Abstract
This study explores the classification error of Mixture Discriminant Analysis
(MDA) in scenarios where the number of mixture components exceeds the number
present in the true data distribution, a condition known as overspecification. We use
a two-component Gaussian mixture model within each class to fit data generated
from a single Gaussian, analyzing both the algorithmic convergence of the
Expectation-Maximization (EM) algorithm and the statistical classification
error. We demonstrate that, with suitable initialization, the classification
error of the EM iterates converges exponentially fast to the Bayes risk at the
population level. Further, we extend our results to finite samples, showing that the
classification error converges to the Bayes risk at a rate $n^{-1/2}$ under mild
conditions on the initial parameter estimates and sample size. This work
provides a rigorous theoretical framework for understanding the performance of
overspecified MDA, which is often used empirically in complex data settings,
such as image and text classification. To validate our theory, we conduct
experiments on remote sensing datasets.
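
The overspecified setting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D data, unit variances, equal mixture weights, the symmetric initialization, and all function names (`fit_two_component_em`, `mixture_log_density`, `run_demo`) are assumptions chosen for brevity. Each class is drawn from a single Gaussian, a two-component mixture is fitted per class with EM, and the resulting test error is compared with the Bayes risk $\Phi(-\delta/2)$ for two unit-variance Gaussians whose means are $\delta$ apart.

```python
from math import erf, sqrt

import numpy as np


def fit_two_component_em(x, n_iter=50):
    """EM for an equal-weight, unit-variance, two-component 1-D Gaussian mixture.

    Only the two component means are estimated (a simplifying assumption).
    """
    # Hypothetical initialization: sample mean shifted by +/- 0.5.
    mu = x.mean() + np.array([-0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point.
        log_p = -0.5 * (x[:, None] - mu[None, :]) ** 2
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means.
        mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
    return mu


def mixture_log_density(x, mu):
    """Log density of 0.5 * N(mu[0], 1) + 0.5 * N(mu[1], 1) evaluated at x."""
    log_p = -0.5 * (x[:, None] - mu[None, :]) ** 2 - 0.5 * np.log(2 * np.pi)
    return np.logaddexp(log_p[:, 0], log_p[:, 1]) + np.log(0.5)


def run_demo(n=5000, delta=2.0, seed=0):
    """Return (test error of the overspecified classifier, Bayes risk)."""
    rng = np.random.default_rng(seed)
    # Training data: each class is a SINGLE Gaussian, so the fit is overspecified.
    mu0 = fit_two_component_em(rng.normal(-delta / 2, 1.0, n))
    mu1 = fit_two_component_em(rng.normal(delta / 2, 1.0, n))
    # Fresh test data with equal class priors.
    test_x = np.concatenate(
        [rng.normal(-delta / 2, 1.0, n), rng.normal(delta / 2, 1.0, n)]
    )
    test_y = np.concatenate([np.zeros(n), np.ones(n)])
    # Classify by the larger fitted class-conditional mixture density.
    pred = mixture_log_density(test_x, mu1) > mixture_log_density(test_x, mu0)
    err = float(np.mean(pred.astype(float) != test_y))
    # Bayes risk Phi(-delta / 2) for equal priors and unit variances.
    bayes = 0.5 * (1.0 + erf((-delta / 2) / sqrt(2.0)))
    return err, bayes


if __name__ == "__main__":
    err, bayes = run_demo()
    print(f"test error: {err:.4f}, Bayes risk: {bayes:.4f}")
```

In this toy setup the fitted component means tend to stay close to the true class mean, so the decision boundary should remain near the midpoint between the classes and the test error should sit close to the Bayes risk.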