Calibrated Principal Component Regression
2510.19020v1
stat.ML, cs.LG
2025-10-25
Authors:
Yixuan Florence Wu, Yilun Zhu, Lei Cao, and Naichen Shi
Abstract
We propose a new method for statistical inference in generalized linear
models. In the overparameterized regime, Principal Component Regression (PCR)
reduces variance by projecting high-dimensional data to a low-dimensional
principal subspace before fitting. However, PCR incurs truncation bias whenever
the true regression vector has mass outside the retained principal components
(PC). To mitigate the bias, we propose Calibrated Principal Component
Regression (CPCR), which first learns a low-variance prior in the PC subspace
and then calibrates the model in the original feature space via a centered
Tikhonov step. CPCR leverages cross-fitting and controls the truncation bias by
softening PCR's hard cutoff. Theoretically, we calculate the out-of-sample risk
in the random matrix regime, which shows that CPCR outperforms standard PCR
when the regression signal has non-negligible components in low-variance
directions. Empirically, CPCR consistently improves prediction across multiple
overparameterized problems. The results highlight CPCR's stability and
flexibility in modern overparameterized settings.
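The two-stage idea described in the abstract can be illustrated with a minimal NumPy sketch for the squared-loss (linear regression) case. This is not the authors' implementation. The function names (`pcr_fit`, `centered_ridge`, `cpcr_sketch`), the two-fold cross-fitting scheme with averaging, the penalty scaling `lam`, and the omission of feature centering are all assumptions made for illustration. The "centered Tikhonov step" is read here as ridge regression that shrinks toward the PCR estimate rather than toward zero.

```python
import numpy as np

def pcr_fit(X, y, k):
    """PCR: least squares on the top-k principal directions of X.
    Assumption: features are used as-is (no centering or scaling)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                      # (p, k) basis of the retained PC subspace
    Z = X @ V_k                         # (n, k) projected design
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return V_k @ gamma                  # coefficients mapped back to feature space

def centered_ridge(X, y, beta_prior, lam):
    """Solve min_b ||y - X b||^2 + lam * ||b - beta_prior||^2.
    Uses the dual (n x n) form, which is cheaper when p > n."""
    n = X.shape[0]
    resid = y - X @ beta_prior
    delta = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), resid)
    return beta_prior + delta

def cpcr_sketch(X, y, k, lam, seed=0):
    """Hypothetical two-fold cross-fitting: learn the PCR prior on one half,
    calibrate on the other half, swap roles, and average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    a, b = idx[: len(idx) // 2], idx[len(idx) // 2:]
    beta_ab = centered_ridge(X[b], y[b], pcr_fit(X[a], y[a], k), lam)
    beta_ba = centered_ridge(X[a], y[a], pcr_fit(X[b], y[b], k), lam)
    return 0.5 * (beta_ab + beta_ba)
```

In this sketch, a very large `lam` returns essentially the PCR estimate, while a small `lam` lets the calibration step move freely in the full feature space. That is one way to picture how the hard cutoff of PCR is softened.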