The Mechanistic Emergence of Symbol Grounding in Language Models
2510.13796v2
cs.CL, cs.CV
2025-10-17
Авторы:
Shuyu Wu, Ziqiao Ma, Xiaoxi Luo, Yidong Huang, Josue Torres-Fonseca, Freda Shi, Joyce Chai
Abstract
Symbol grounding (Harnad, 1990) describes how symbols such as words acquire
their meanings by connecting to real-world sensorimotor experiences. Recent
work has shown preliminary evidence that grounding may emerge in
(vision-)language models trained at scale without using explicit grounding
objectives. Yet, the specific loci of this emergence and the mechanisms that
drive it remain largely unexplored. To address this problem, we introduce a
controlled evaluation framework that systematically traces how symbol grounding
arises within the internal computations through mechanistic and causal
analysis. Our findings show that grounding concentrates in middle-layer
computations and is implemented through the aggregate mechanism, where
attention heads aggregate the environmental ground to support the prediction of
linguistic forms. This phenomenon replicates in multimodal dialogue and across
architectures (Transformers and state-space models), but not in unidirectional
LSTMs. Our results provide behavioral and mechanistic evidence that symbol
grounding can emerge in language models, with practical implications for
predicting and potentially controlling the reliability of generation.
Ссылки и действия
Дополнительные ресурсы: