Task-Level Insights from Eigenvalues across Sequence Models
2510.09379v1
cs.LG, cs.AI, cs.SY, eess.SY
2025-10-14
Авторы:
Rahel Rickenbach, Jelena Trisovic, Alexandre Didier, Jerome Sieber, Melanie N. Zeilinger
Abstract
Although softmax attention drives state-of-the-art performance for sequence
models, its quadratic complexity limits scalability, motivating linear
alternatives such as state space models (SSMs). While these alternatives
improve efficiency, their fundamental differences in information processing
remain poorly understood. In this work, we leverage the recently proposed
dynamical systems framework to represent softmax, norm and linear attention as
dynamical systems, enabling a structured comparison with SSMs by analyzing
their respective eigenvalue spectra. Since eigenvalues capture essential
aspects of dynamical system behavior, we conduct an extensive empirical
analysis across diverse sequence models and benchmarks. We first show that
eigenvalues influence essential aspects of memory and long-range dependency
modeling, revealing spectral signatures that align with task requirements.
Building on these insights, we then investigate how architectural modifications
in sequence models impact both eigenvalue spectra and task performance. This
correspondence further strengthens the position of eigenvalue analysis as a
principled metric for interpreting, understanding, and ultimately improving the
capabilities of sequence models.