FedMuon: Federated Learning with Bias-corrected LMO-based Optimization
2509.26337v1
cs.LG, math.OC
2025-10-02
Authors:
Yuki Takezawa, Anastasia Koloskova, Xiaowen Jiang, Sebastian U. Stich
Abstract
Recently, a new optimization method based on the linear minimization oracle
(LMO), called Muon, has been attracting increasing attention since it can train
neural networks faster than existing adaptive optimization methods, such as
Adam. In this paper, we study how Muon can be utilized in federated learning.
We first show that straightforwardly using Muon as the local optimizer of
FedAvg does not converge to a stationary point, because the LMO is a biased
operator. We then propose FedMuon, which mitigates this issue. We also
analyze how solving the LMO approximately affects the convergence rate and find
that, surprisingly, FedMuon converges for any number of Newton-Schulz
iterations, and that it converges faster as the LMO is solved more accurately.
Through experiments, we demonstrate that FedMuon can outperform
state-of-the-art federated learning methods.
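
The role of the Newton-Schulz iterations mentioned above can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes the LMO is taken over the spectral-norm unit ball, where the exact solution for a gradient matrix G with SVD G = U S V^T is -U V^T, and it uses the classical cubic Newton-Schulz update X <- 1.5 X - 0.5 X X^T X rather than any tuned polynomial a particular Muon implementation may use. The names newton_schulz_lmo and orthogonality_error are hypothetical helpers introduced only for this illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz_lmo(grad, num_steps):
    """Approximate the spectral-norm LMO, -U V^T, for a gradient matrix.

    Assumption: classical cubic Newton-Schulz iteration, started from the
    gradient scaled by its Frobenius norm so every singular value is <= 1.
    """
    fro = math.sqrt(sum(v * v for row in grad for v in row))
    if fro == 0.0:
        return [[0.0 for _ in row] for row in grad]
    x = [[v / fro for v in row] for row in grad]
    for _ in range(num_steps):
        # X <- 1.5 X - 0.5 (X X^T) X pushes each nonzero singular value toward 1.
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(r1, r2)] for r1, r2 in zip(x, xxt_x)]
    # The LMO minimizes <G, D>, so the direction is the negated polar factor.
    return [[-v for v in row] for row in x]


def orthogonality_error(d):
    """Largest entry of |D D^T - I|; zero when D is exactly orthogonal."""
    ddt = matmul(d, transpose(d))
    n = len(ddt)
    return max(abs(ddt[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    g = [[3.0, 1.0], [0.0, 2.0]]  # arbitrary full-rank square example
    for steps in (0, 1, 5, 20):
        print(steps, orthogonality_error(newton_schulz_lmo(g, steps)))
```

In this sketch, more steps bring the output closer to an orthogonal matrix, which corresponds to solving the LMO more accurately. With zero steps the output is simply the negated, Frobenius-normalized gradient, which still lies inside the spectral-norm unit ball.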