Stochastic Difference-of-Convex Optimization with Momentum

arXiv:2510.17503v1 | cs.LG, math.OC, stat.ML | 2025-10-22

Authors:

El Mahdi Chayti, Martin Jaggi

Abstract

Stochastic difference-of-convex (DC) optimization is prevalent in numerous machine learning applications, yet its convergence properties under small batch sizes remain poorly understood. Existing methods typically require large batches or strong noise assumptions, which limit their practical use. In this work, we show that momentum enables convergence under standard smoothness and bounded variance assumptions (of the concave part) for any batch size. We prove that without momentum, convergence may fail regardless of stepsize, highlighting its necessity. Our momentum-based algorithm achieves provable convergence and demonstrates strong empirical performance.
