Dynamically Weighted Momentum with Adaptive Step Sizes for Efficient Deep Network Training
2510.25042v1
cs.LG, cs.NE
2025-10-31
Авторы:
Zhifeng Wang, Longlong Li, Chunyan Zeng
Abstract
Within the current sphere of deep learning research, despite the extensive
application of optimization algorithms such as Stochastic Gradient Descent
(SGD) and Adaptive Moment Estimation (Adam), there remains a pronounced
inadequacy in their capability to address fluctuations in learning efficiency,
meet the demands of complex models, and tackle non-convex optimization issues.
These challenges primarily arise from the algorithms' limitations in handling
complex data structures and models, for instance, difficulties in selecting an
appropriate learning rate, avoiding local optima, and navigating through
high-dimensional spaces. To address these issues, this paper introduces a novel
optimization algorithm named DWMGrad. This algorithm, building on the
foundations of traditional methods, incorporates a dynamic guidance mechanism
reliant on historical data to dynamically update momentum and learning rates.
This allows the optimizer to flexibly adjust its reliance on historical
information, adapting to various training scenarios. This strategy not only
enables the optimizer to better adapt to changing environments and task
complexities but also, as validated through extensive experimentation,
demonstrates DWMGrad's ability to achieve faster convergence rates and higher
accuracies under a multitude of scenarios.
Ссылки и действия
Дополнительные ресурсы: