MCE: Towards a General Framework for Handling Missing Modalities under Imbalanced Missing Rates
2510.10534v1
cs.CV, cs.LG, cs.MM
2025-10-15
Authors:
Binyu Zhao, Wei Zhang, Zhaonian Zou
Abstract
Multi-modal learning has made significant advances across diverse pattern
recognition applications. However, handling missing modalities, especially
under imbalanced missing rates, remains a major challenge. This imbalance
triggers a vicious cycle: modalities with higher missing rates receive fewer
updates, leading to inconsistent learning progress and representational
degradation that further diminishes their contribution. Existing methods
typically focus on global dataset-level balancing, often overlooking critical
sample-level variations in modality utility and the underlying issue of
degraded feature quality. We propose Modality Capability Enhancement (MCE) to
tackle these limitations. MCE includes two synergistic components: i) Learning
Capability Enhancement (LCE), which introduces multi-level factors to
dynamically balance modality-specific learning progress, and ii) Representation
Capability Enhancement (RCE), which improves feature semantics and robustness
through subset prediction and cross-modal completion tasks. Comprehensive
evaluations on four multi-modal benchmarks show that MCE consistently
outperforms state-of-the-art methods under various missing configurations. The
journal preprint version is now available at
https://doi.org/10.1016/j.patcog.2025.112591. Our code is available at
https://github.com/byzhaoAI/MCE.