Stochastic Difference-of-Convex Optimization with Momentum
2510.17503v1
cs.LG, math.OC, stat.ML
2025-10-22
Авторы:
El Mahdi Chayti, Martin Jaggi
Abstract
Stochastic difference-of-convex (DC) optimization is prevalent in numerous
machine learning applications, yet its convergence properties under small batch
sizes remain poorly understood. Existing methods typically require large
batches or strong noise assumptions, which limit their practical use. In this
work, we show that momentum enables convergence under standard smoothness and
bounded variance assumptions (of the concave part) for any batch size. We prove
that without momentum, convergence may fail regardless of stepsize,
highlighting its necessity. Our momentum-based algorithm achieves provable
convergence and demonstrates strong empirical performance.