AILA--First Experiments with Localist Language Models
2511.03559v1
cs.CL, cs.AI
2025-11-07
Авторы:
Joachim Diederich
Abstract
This paper presents the first empirical demonstration of controllable
locality in transformer language models, a novel architectural framework that
enables continuous control over the degree of representation localization
through a tunable locality dial parameter. Unlike traditional language models
that rely exclusively on distributed representations, our approach allows
dynamic interpolation between highly interpretable localist encodings and
efficient distributed representations without requiring model retraining. We
conducted experiments on the WikiText corpus using a two-layer transformer
architecture, systematically varying the locality parameter {\lambda} across
the full spectrum from 1.0 (fully localist) to 0.0 (fully distributed). Our
results demonstrate that localist configurations achieve dramatically lower
attention entropy, with {\lambda} = 1.0 yielding 5.36 bits compared to 7.18
bits at {\lambda} = 0.0, while maintaining substantially higher pointer
fidelity scores reflecting stronger alignment with rule-specified targets.
Prediction experiments reveal that intermediate locality values optimize the
tradeoff between interpretability and performance, with {\lambda} = 0.6
achieving test perplexity of 4.65 and accuracy of 84.7%. These findings
establish that localist language models provide a practical framework for
applications in regulated domains requiring both transparency and capability,
offering precise mathematical control over the interpretability-performance
spectrum through explicit penalty thresholds and information-theoretic design
principles.
Ссылки и действия
Дополнительные ресурсы: