Learning Linear Regression with Low-Rank Tasks in-Context
2510.04548v1
cond-mat.dis-nn, cs.LG, stat.ML
2025-10-08
Authors:
Kaito Takanami, Takashi Takahashi, Yoshiyuki Kabashima
Abstract
In-context learning (ICL) is a key building block of modern large language
models, yet its theoretical mechanisms remain poorly understood. It is
particularly unclear how ICL operates in real-world applications where tasks
share a common structure. In this work, we address this problem by analyzing a
linear attention model trained on low-rank regression tasks. Within this
setting, we precisely characterize the distribution of predictions and the
generalization error in the high-dimensional limit. Moreover, we find that
statistical fluctuations in finite pre-training data induce an implicit
regularization. Finally, we identify a sharp phase transition of the
generalization error governed by task structure. These results provide a
framework for understanding how transformers learn to learn the task structure.
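
As a rough illustration of the setting, the sketch below generates regression tasks whose weight vectors lie in a shared low-rank subspace and fits a simplified linear-attention predictor to them. The abstract does not give the model's parameterization, so everything here is an assumption made for illustration: the reduced form y_hat = x_q^T Gamma h with h = (1/n) sum_i y_i x_i, the ridge fit of Gamma, the Gaussian inputs, and all function names and parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_low_rank_tasks(num_tasks, d, r, rng, basis=None):
    """Draw task vectors w = U z that all lie in one r-dimensional subspace."""
    if basis is None:
        # Assumption: the shared subspace is spanned by a random orthonormal basis.
        basis, _ = np.linalg.qr(rng.standard_normal((d, r)))
    z = rng.standard_normal((num_tasks, r))
    return z @ basis.T, basis


def context_summary(w, n_ctx, noise_std, rng):
    """Sample one context for task w; return h = (1/n) sum_i y_i x_i and a query pair."""
    d = w.shape[0]
    X = rng.standard_normal((n_ctx, d))
    y = X @ w + noise_std * rng.standard_normal(n_ctx)
    h = X.T @ y / n_ctx
    x_q = rng.standard_normal(d)
    return h, x_q, x_q @ w


def fit_gamma(tasks, n_ctx, noise_std, ridge, rng):
    """Ridge-fit Gamma in the assumed predictor y_hat = x_q^T Gamma h."""
    d = tasks.shape[1]
    feats, targets = [], []
    for w in tasks:
        h, x_q, y_q = context_summary(w, n_ctx, noise_std, rng)
        # The prediction is linear in Gamma, with features vec(x_q h^T).
        feats.append(np.outer(x_q, h).ravel())
        targets.append(y_q)
    F = np.asarray(feats)
    t = np.asarray(targets)
    gamma = np.linalg.solve(F.T @ F + ridge * np.eye(d * d), F.T @ t)
    return gamma.reshape(d, d)


def icl_mse(gamma, tasks, n_ctx, noise_std, rng):
    """Mean squared query error over fresh contexts, one per task."""
    errs = []
    for w in tasks:
        h, x_q, y_q = context_summary(w, n_ctx, noise_std, rng)
        errs.append((x_q @ gamma @ h - y_q) ** 2)
    return float(np.mean(errs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, r, n_ctx = 8, 2, 32  # illustrative sizes, not from the paper
    train, U = sample_low_rank_tasks(2000, d, r, rng)
    test, _ = sample_low_rank_tasks(500, d, r, rng, basis=U)
    gamma = fit_gamma(train, n_ctx, 0.1, 1e-3, rng)
    print("ICL test MSE:", icl_mse(gamma, test, n_ctx, 0.1, rng))
    # Expected squared error of always predicting zero is the mean of ||w||^2.
    print("zero-predictor MSE:", float(np.mean(np.sum(test**2, axis=1))))
```

Varying the number of pre-training tasks, the context length, or the rank r in this toy setup is one way to explore qualitatively how task structure affects the error. It does not reproduce the paper's high-dimensional analysis.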