Towards Continuous-Time Approximations for Stochastic Gradient Descent without Replacement

2512.04703v1 cs.LG, math.PR 2025-12-05

Authors:

Stefan Perko

Abstract

Gradient optimization algorithms using epochs, that is, those based on stochastic gradient descent without replacement (SGDo), are predominantly used to train machine learning models in practice. However, the mathematical theory of SGDo and related algorithms remains underexplored compared to that of their "with replacement" and "one-pass" counterparts. In this article, we propose a stochastic, continuous-time approximation to SGDo with additive noise based on a Young differential equation driven by a stochastic process we call an "epoched Brownian motion". We show its usefulness by proving the almost sure convergence of the continuous-time approximation for strongly convex objectives and learning rate schedules of the form $u_t = \frac{1}{(1+t)^{\beta}}, \beta \in (0,1)$. Moreover, we compute an upper bound on the asymptotic rate of almost sure convergence, which is as good as or better than previous results for SGDo.
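To make the discrete algorithm concrete, the following is a minimal sketch of SGD without replacement with the schedule $u_t = (1+t)^{-\beta}$. It is an illustration only, not the paper's construction: the toy objective $f(x) = \frac{1}{n}\sum_i \frac{1}{2}\|x - c_i\|^2$, the base step `h`, the time discretization `t = step * h`, and all function and variable names are assumptions chosen for this example. The toy objective is strongly convex, and each per-sample gradient equals the full gradient plus a term independent of $x$, so the noise is additive.

```python
import numpy as np

def lr_schedule(t, beta):
    """Schedule u_t = 1 / (1 + t)^beta with beta in (0, 1), as in the abstract."""
    return 1.0 / (1.0 + t) ** beta

def sgd_without_replacement(centers, x0, n_epochs, beta=0.5, h=0.1, seed=0):
    """SGDo on f(x) = mean_i 0.5 * ||x - c_i||^2 (hypothetical toy objective).

    Each epoch visits every sample exactly once in a fresh random order.
    The base step h and the mapping t = step * h are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = len(centers)
    step = 0
    for _ in range(n_epochs):
        # Sampling without replacement: one random permutation per epoch.
        for i in rng.permutation(n):
            t = step * h
            grad_i = x - centers[i]  # gradient of 0.5 * ||x - c_i||^2
            x -= h * lr_schedule(t, beta) * grad_i
            step += 1
    return x

# The minimizer of the toy objective is the mean of the centers.
centers = np.random.default_rng(1).normal(size=(20, 2))
x_final = sgd_without_replacement(centers, x0=[5.0, -5.0], n_epochs=200)
error = np.linalg.norm(x_final - centers.mean(axis=0))
print(error)
```

Replacing `rng.permutation(n)` with independent draws from `rng.integers(n)` would give the "with replacement" variant that the abstract contrasts with SGDo.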
