Enhancing Fractional Gradient Descent with Learned Optimizers

2510.18783v1 cs.LG, math.OC, stat.ML 2025-10-23

Авторы:

Jan Sobotka, Petr Šimánek, Pavel Kordík

Abstract

Fractional Gradient Descent (FGD) offers a novel and promising way to accelerate optimization by incorporating fractional calculus into machine learning. Although FGD has shown encouraging initial results across various optimization tasks, it faces significant challenges with convergence behavior and hyperparameter selection. Moreover, the impact of its hyperparameters is not fully understood, and scheduling them is particularly difficult in non-convex settings such as neural network training. To address these issues, we propose a novel approach called Learning to Optimize Caputo Fractional Gradient Descent (L2O-CFGD), which meta-learns how to dynamically tune the hyperparameters of Caputo FGD (CFGD). Our method's meta-learned schedule outperforms CFGD with static hyperparameters found through an extensive search and, in some tasks, achieves performance comparable to a fully black-box meta-learned optimizer. L2O-CFGD can thus serve as a powerful tool for researchers to identify high-performing hyperparameters and gain insights on how to leverage the history-dependence of the fractional differential in optimization.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

Enhancing Fractional Gradient Descent with Learned Optimizers

Авторы:

Abstract

Ссылки и действия

Связанные статьи

Diagonalizing the Softmax: Hadamard Initialization for Tractable Cross-Entropy D...

When do spectral gradient updates help in deep learning?

Lower Complexity Bounds for Nonconvex-Strongly-Convex Bilevel Optimization with ...

Adaptivity and Universality: Problem-dependent Universal Regret for Online Conve...

A Best-of-Both-Worlds Proof for Tsallis-INF without Fenchel Conjugates

Навигация