AuON: A Linear-time Alternative to Semi-Orthogonal Momentum Updates
2509.24320v2
cs.LG, stat.ML
2025-10-01
Authors:
Dipan Maity
Abstract
Orthogonal gradient updates have emerged as a promising direction in
optimization for machine learning. However, traditional approaches such as
SVD/QR decomposition incur prohibitive computational costs of O(n^3) and
underperform compared to well-tuned SGD with momentum, since momentum is
applied only after strict orthogonalization. Recent advances, such as Muon,
improve efficiency by applying momentum before orthogonalization and producing
semi-orthogonal matrices via Newton-Schulz iterations, reducing complexity to
O(n^2). Nevertheless, quadratic costs remain a bottleneck.
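For orientation, the Newton-Schulz semi-orthogonalization mentioned above can be sketched with the textbook cubic iteration X <- 1.5 X - 0.5 X X^T X. This is a minimal illustration, not Muon's implementation: the function name, the step count, and the plain cubic coefficients are assumptions made for this example, and practical optimizers may use differently tuned polynomials and fewer steps.

```python
import numpy as np

def newton_schulz_orthogonalize(m, steps=10, eps=1e-7):
    """Approximate the semi-orthogonal polar factor of m (hypothetical helper)."""
    # Dividing by the Frobenius norm puts every singular value in [0, 1],
    # which lies inside the convergence region of the cubic iteration.
    x = m / (np.linalg.norm(m) + eps)
    # Work on the wide orientation so that x @ x.T is the smaller Gram matrix.
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T
    for _ in range(steps):
        # Each step pushes every nonzero singular value toward 1
        # while leaving the singular vectors unchanged.
        x = 1.5 * x - 0.5 * (x @ x.T) @ x
    return x.T if transposed else x

if __name__ == "__main__":
    momentum = np.array([[3.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])
    q = newton_schulz_orthogonalize(momentum)
    print(np.round(q @ q.T, 6))  # close to the 2x2 identity
```

Every step relies on matrix-matrix products, which is the per-iteration cost that the rest of the abstract aims to avoid.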
In this work, we study the semi-orthogonal properties of momentum-based
updates and develop a method to bound momentum updates under a spectral-norm
trust region, preserving directional information without requiring explicit
semi-orthogonalization.
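As a rough illustration of a spectral-norm bound obtained without orthogonalization, the sketch below rescales an update by its Frobenius norm. It relies only on the standard inequality that the spectral norm never exceeds the Frobenius norm. It is not the AuON rule itself; `bound_spectral_norm` and `radius` are hypothetical names introduced for this example.

```python
import numpy as np

def bound_spectral_norm(update, radius=1.0, eps=1e-12):
    """Rescale update so its spectral norm is at most radius (hypothetical helper)."""
    # The Frobenius norm is computed in one pass over the entries,
    # so the cost is linear in the number of parameters.
    fro = np.linalg.norm(update)
    # Since ||U||_2 <= ||U||_F, setting the Frobenius norm to (at most) radius
    # also bounds the spectral norm by radius. A positive scalar factor
    # keeps the direction of the update unchanged.
    return update * (radius / (fro + eps))

if __name__ == "__main__":
    momentum = np.diag([3.0, 4.0])
    bounded = bound_spectral_norm(momentum)
    print(np.linalg.norm(bounded, 2))  # largest singular value, about 0.8
```

Unlike semi-orthogonalization, this scaling leaves the ratios between singular values untouched, so an ill-conditioned update stays ill-conditioned.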
We propose AuON (Alternative Unit-norm momentum updates by Normalized
nonlinear scaling), a linear-time optimizer that achieves strong performance
without constructing semi-orthogonal matrices, while preserving structural
alignment and reconditioning ill-posed updates. Our approach combines
hyperbolic-cosine RMS scaling transformations with normalization, demonstrating
both effectiveness and computational efficiency compared to Newton-Schulz
methods. We further introduce a hybrid variant (Hybrid-AuON) that applies a
single Newton-Schulz iteration. Experiments across vision and language
benchmarks show that AuON and its hybrid variant achieve performance comparable
to strong baselines such as AdamW and Muon.
Code is available at: https://github.com/ryyzn9/AuON