Precise asymptotic analysis of Sobolev training for random feature models
2511.03050v1
stat.ML, cond-mat.dis-nn, cs.LG, math.PR, math.ST, stat.TH
2025-11-07
Авторы:
Katharine E Fisher, Matthew TC Li, Youssef Marzouk, Timo Schorlepp
Abstract
Gradient information is widely useful and available in applications, and is
therefore natural to include in the training of neural networks. Yet little is
known theoretically about the impact of Sobolev training -- regression with
both function and gradient data -- on the generalization error of highly
overparameterized predictive models in high dimensions. In this paper, we
obtain a precise characterization of this training modality for random feature
(RF) models in the limit where the number of trainable parameters, input
dimensions, and training data tend proportionally to infinity. Our model for
Sobolev training reflects practical implementations by sketching gradient data
onto finite dimensional subspaces. By combining the replica method from
statistical physics with linearizations in operator-valued free probability
theory, we derive a closed-form description for the generalization errors of
the trained RF models. For target functions described by single-index models,
we demonstrate that supplementing function data with additional gradient data
does not universally improve predictive performance. Rather, the degree of
overparameterization should inform the choice of training method. More broadly,
our results identify settings where models perform optimally by interpolating
noisy function and gradient data.