Convergence, design and training of continuous-time dropout as a random batch method
2510.13134v1
cs.LG, math.OC, 68T07, 65C35, 37N35, 65K10, 35Q49
2025-10-17
Авторы:
Antonio Álvarez-López, Martín Hernández
Abstract
We study dropout regularization in continuous-time models through the lens of
random-batch methods -- a family of stochastic sampling schemes originally
devised to reduce the computational cost of interacting particle systems. We
construct an unbiased, well-posed estimator that mimics dropout by sampling
neuron batches over time intervals of length $h$. Trajectory-wise convergence
is established with linear rate in $h$ for the expected uniform error. At the
distribution level, we establish stability for the associated continuity
equation, with total-variation error of order $h^{1/2}$ under mild moment
assumptions. During training with fixed batch sampling across epochs, a
Pontryagin-based adjoint analysis bounds deviations in the optimal cost and
control, as well as in gradient-descent iterates. On the design side, we
compare convergence rates for canonical batch sampling schemes, recover
standard Bernoulli dropout as a special case, and derive a cost--accuracy
trade-off yielding a closed-form optimal $h$. We then specialize to a
single-layer neural ODE and validate the theory on classification and flow
matching, observing the predicted rates, regularization effects, and favorable
runtime and memory profiles.