HVAdam: A Full-Dimension Adaptive Optimizer

2511.20277v1 cs.LG, cs.AI 2025-11-26

Authors:

Yiheng Zhang, Shaowu Wu, Yuanzhuo Xu, Jiajun Wu, Shang Xu, Steve Drew, Xiaoguang Niu

Abstract

Adaptive optimizers such as Adam have achieved great success in training large-scale models like large language models and diffusion models. However, they often generalize worse than non-adaptive methods such as SGD on classical architectures like CNNs. We identify a key cause of this performance gap: adaptivity in pre-conditioners, which limits the optimizer's ability to adapt to diverse optimization landscapes. To address this, we propose Anon (Adaptivity Non-restricted Optimizer with Novel convergence technique), a novel optimizer with continuously tunable adaptivity, allowing it to interpolate between SGD-like and Adam-like behaviors and even extrapolate beyond both. To ensure convergence across the entire adaptivity spectrum, we introduce incremental delay update (IDU), a novel mechanism that is more flexible than AMSGrad's hard max-tracking strategy and enhances robustness to gradient noise. We theoretically establish convergence guarantees in both convex and non-convex settings. Empirically, Anon consistently outperforms state-of-the-art optimizers on representative image classification, diffusion, and language modeling tasks. These results demonstrate that adaptivity can serve as a valuable tunable design principle, and that Anon provides the first unified and reliable framework capable of bridging the gap between classical and modern optimizers while surpassing the advantages of both.
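The abstract does not give Anon's update rule or the details of IDU, so the sketch below is not the authors' method. It only illustrates the general idea of a tunable degree of adaptivity, using a well-known substitute: raising the second-moment estimate to an adjustable exponent. The names `tunable_adaptive_step`, `init_state`, and `gamma` are hypothetical. With `gamma = 0` the step reduces to SGD with momentum, with `gamma = 0.5` it is Adam-like, and the optional `amsgrad` flag shows a simplified form of the hard max-tracking that the abstract contrasts IDU with.

```python
def init_state(n):
    """Create zeroed optimizer state for n parameters (hypothetical helper)."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n, "v_max": [0.0] * n}


def tunable_adaptive_step(param, grad, state, lr=0.01, beta1=0.9, beta2=0.999,
                          gamma=0.5, eps=1e-8, amsgrad=False):
    """One step of an illustrative optimizer with a tunable preconditioner exponent.

    This is an assumption-laden sketch, NOT the Anon update rule:
      gamma = 0.0 -> denominator is 1, i.e. SGD with bias-corrected momentum
      gamma = 0.5 -> denominator is sqrt(v_hat), i.e. an Adam-like step
    """
    state["t"] += 1
    t = state["t"]
    new_param = []
    for i, (p, g) in enumerate(zip(param, grad)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction, as in Adam.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        if amsgrad:
            # Simplified AMSGrad-style hard max-tracking: the second-moment
            # estimate used in the denominator never decreases.
            state["v_max"][i] = max(state["v_max"][i], v_hat)
            v_hat = state["v_max"][i]
        # The exponent gamma controls how strongly the step is preconditioned.
        denom = (v_hat + eps) ** gamma
        new_param.append(p - lr * m_hat / denom)
    return new_param
```

Calling `tunable_adaptive_step([1.0], [2.0], init_state(1), lr=0.1, gamma=0.0)` takes a plain gradient step, while `gamma=0.5` normalizes the step by the gradient magnitude. Intermediate or larger exponents give the kind of interpolation and extrapolation the abstract describes, under the assumptions stated above.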
