Optimization Benchmark for Diffusion Models on Dynamical Systems
2510.19376v1
cs.LG, math.OC
2025-10-24
Authors:
Fabian Schaipp
Abstract
The training of diffusion models is often absent from the evaluation of new
optimization techniques. In this work, we benchmark recent optimization
algorithms for training a diffusion model for denoising flow trajectories. We
observe that Muon and SOAP are highly efficient alternatives to AdamW (18%
lower final loss). We also revisit, in the context of diffusion model
training, several recent phenomena observed when training models for text or
image applications. These include the impact of the learning-rate schedule on
the training dynamics and the performance gap between Adam and SGD.