Quantitative convergence of trained single layer neural networks to Gaussian processes
2509.24544v1
stat.ML, cs.LG, math.PR
2025-10-01
Авторы:
Eloy Mosig, Andrea Agazzi, Dario Trevisan
Abstract
In this paper, we study the quantitative convergence of shallow neural
networks trained via gradient descent to their associated Gaussian processes in
the infinite-width limit.
While previous work has established qualitative convergence under broad
settings, precise, finite-width estimates remain limited, particularly during
training.
We provide explicit upper bounds on the quadratic Wasserstein distance
between the network output and its Gaussian approximation at any training time
$t \ge 0$, demonstrating polynomial decay with network width.
Our results quantify how architectural parameters, such as width and input
dimension, influence convergence, and how training dynamics affect the
approximation error.