Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region
2509.25351v1
cs.LG, stat.ML
2025-10-03
Authors:
Shuang Liang, Guido Montúfar
Abstract
We examine gradient descent in matrix factorization and show that under large
step sizes the parameter space develops a fractal structure. We derive the
exact critical step size for convergence in scalar-vector factorization and
show that near criticality the selected minimizer depends sensitively on the
initialization. Moreover, we show that adding regularization amplifies this
sensitivity, generating a fractal boundary between initializations that
converge and those that diverge. The analysis extends to general matrix
factorization with orthogonal initialization. Our findings reveal that
near-critical step sizes induce a chaotic regime of gradient descent where the
long-term dynamics are unpredictable and there are no simple implicit biases,
such as towards balancedness, minimum norm, or flatness.
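The setting described in the abstract can be explored numerically with a minimal sketch. It is not the authors' code. It assumes the scalar-vector objective is L(u, v) = 0.5 * ||u * v - y||^2 with scalar u and vector v, and that regularization means an added L2 penalty 0.5 * reg * (u^2 + ||v||^2). The names `run_gd` and `convergence_map`, the tolerances, and the blow-up threshold are all hypothetical choices made for illustration.

```python
import math


def run_gd(u, v, y, step, reg=0.0, iters=5000, tol=1e-12, blowup=1e6):
    """Gradient descent on the assumed loss 0.5*||u*v - y||^2 + 0.5*reg*(u^2 + ||v||^2).

    Returns "converged" if the squared gradient norm falls below tol (a
    stationary point, not necessarily a global minimizer), "diverged" if an
    iterate becomes non-finite or exceeds blowup, and "undecided" otherwise.
    """
    v = list(v)
    for _ in range(iters):
        params = [u, *v]
        if not all(math.isfinite(p) for p in params):
            return "diverged"
        if max(abs(p) for p in params) > blowup:
            return "diverged"
        r = [u * vi - yi for vi, yi in zip(v, y)]  # residual u*v - y
        grad_u = sum(ri * vi for ri, vi in zip(r, v)) + reg * u
        grad_v = [u * ri + reg * vi for ri, vi in zip(r, v)]
        if grad_u * grad_u + sum(g * g for g in grad_v) < tol:
            return "converged"
        # Simultaneous update: both gradients were computed at the old iterate.
        u = u - step * grad_u
        v = [vi - step * g for vi, g in zip(v, grad_v)]
    return "undecided"


def convergence_map(u_values, s_values, y, step, reg=0.0):
    """Label each initialization (u0, v0 = s0 * y) on a grid of (u0, s0)."""
    return [
        [run_gd(u0, [s0 * yi for yi in y], y, step, reg) for s0 in s_values]
        for u0 in u_values
    ]
```

Calling `convergence_map` on a fine grid, with `step` close to the largest value that still converges, gives a table of labels that can be plotted to inspect the boundary between converging and diverging initializations. Whether this toy setup shows the fractal boundary reported in the abstract depends on the step size, the regularization strength, and the grid resolution; the sketch only shows how such a map could be computed.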