Beyond Scaffold: A Unified Spatio-Temporal Gradient Tracking Method

2512.01732v1 cs.LG, math.OC 2025-12-04

Авторы:

Yan Huang, Jinming Xu, Jiming Chen, Karl Henrik Johansson

Abstract

In distributed and federated learning algorithms, communication overhead is often reduced by performing multiple local updates between communication rounds. However, due to data heterogeneity across nodes and the local gradient noise within each node, this strategy can lead to the drift of local models away from the global optimum. To address this issue, we revisit the well-known federated learning method Scaffold (Karimireddy et al., 2020) under a gradient tracking perspective, and propose a unified spatio-temporal gradient tracking algorithm, termed ST-GT, for distributed stochastic optimization over time-varying graphs. ST-GT tracks the global gradient across neighboring nodes to mitigate data heterogeneity, while maintaining a running average of local gradients to substantially suppress noise, with slightly more storage overhead. Without assuming bounded data heterogeneity, we prove that ST-GT attains a linear convergence rate for strongly convex problems and a sublinear rate for nonconvex cases. Notably, ST-GT achieves the first linear speed-up in communication complexity with respect to the number of local updates per round $τ$ for the strongly-convex setting. Compared to traditional gradient tracking methods, ST-GT reduces the topology-dependent noise term from $σ^2$ to $σ^2/τ$, where $σ^2$ denotes the noise level, thereby improving communication efficiency.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

Beyond Scaffold: A Unified Spatio-Temporal Gradient Tracking Method

Авторы:

Abstract

Ссылки и действия

Связанные статьи

Convergence for Discrete Parameter Updates

The Geometry of Intelligence: Deterministic Functional Topology as a Foundation ...

Risk-Sensitive Q-Learning in Continuous Time with Application to Dynamic Portfol...

ARM-Explainer -- Explaining and improving graph neural network predictions for t...

Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Opt...

Навигация