On Provable Benefits of Muon in Federated Learning
2510.03866v1
cs.LG, stat.ML
2025-10-08
Авторы:
Xinwen Zhang, Hongchang Gao
Abstract
The recently introduced optimizer, Muon, has gained increasing attention due
to its superior performance across a wide range of applications. However, its
effectiveness in federated learning remains unexplored. To address this gap,
this paper investigates the performance of Muon in the federated learning
setting. Specifically, we propose a new algorithm, FedMuon, and establish its
convergence rate for nonconvex problems. Our theoretical analysis reveals
multiple favorable properties of FedMuon. In particular, due to its
orthonormalized update direction, the learning rate of FedMuon is independent
of problem-specific parameters, and, importantly, it can naturally accommodate
heavy-tailed noise. The extensive experiments on a variety of neural network
architectures validate the effectiveness of the proposed algorithm.
Ссылки и действия
Дополнительные ресурсы: