Redundancy as a Structural Information Principle for Learning and Generalization
2510.10938v1
cs.LG, cs.AI, cs.IT, math.IT, stat.ML
2025-10-16
Authors:
Yuda Bi, Ying Zhu, Vince D Calhoun
Abstract
We present a theoretical framework that extends classical information theory
to finite and structured systems by redefining redundancy as a fundamental
property of information organization rather than as an inefficiency. In this
framework, redundancy is expressed as a general family of informational
divergences that unifies multiple classical measures, such as mutual
information, chi-squared dependence, and spectral redundancy, under a single
geometric principle. This reveals that these traditional quantities are not
isolated heuristics but projections of a shared redundancy geometry. The theory
further predicts that redundancy is bounded both above and below, giving rise
to an optimal equilibrium that balances over-compression (loss of structure)
and over-coupling (collapse). While classical communication theory favors
minimal redundancy for transmission efficiency, finite and structured systems,
such as those underlying real-world learning, achieve maximal stability and
generalization near this equilibrium. Experiments with masked autoencoders
illustrate and verify this principle: the model exhibits a stable
redundancy level at which generalization peaks. Together, these results establish
redundancy as a measurable and tunable quantity that bridges the asymptotic
world of communication and the finite world of learning.
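
The abstract does not spell out the divergence family, so the following is only a minimal sketch under an assumption: that redundancy can be read as an f-divergence between a joint distribution and the product of its marginals. Under that standard construction, the generator f(t) = t log t recovers mutual information and f(t) = (t - 1)^2 recovers chi-squared dependence, which shows how two of the measures named above can share one form. The names `f_redundancy`, `f_kl`, and `f_chi2` are hypothetical and not taken from the paper.

```python
import numpy as np

def f_redundancy(joint, f):
    """f-divergence between a joint distribution and the product of its marginals.

    Assumption: this f-divergence form stands in for the paper's divergence
    family, which the abstract does not define.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize to a probability table
    px = joint.sum(axis=1, keepdims=True)    # marginal over rows
    py = joint.sum(axis=0, keepdims=True)    # marginal over columns
    prod = px * py                           # independent reference distribution
    mask = prod > 0                          # skip cells with zero reference mass
    ratio = joint[mask] / prod[mask]
    return float(np.sum(prod[mask] * f(ratio)))

def f_kl(t):
    """Generator t*log(t), with the convention 0*log(0) = 0 (mutual information)."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = t[pos] * np.log(t[pos])
    return out

def f_chi2(t):
    """Generator (t - 1)^2 (chi-squared dependence)."""
    return (t - 1.0) ** 2

# Two toy joint distributions: fully independent and fully coupled.
independent = np.outer([0.5, 0.5], [0.3, 0.7])
coupled = np.array([[0.5, 0.0], [0.0, 0.5]])

for name, joint in [("independent", independent), ("coupled", coupled)]:
    mi = f_redundancy(joint, f_kl)
    chi2 = f_redundancy(joint, f_chi2)
    print(f"{name}: mutual information = {mi:.4f} nats, chi-squared = {chi2:.4f}")
```

Both measures vanish for the independent table and are strictly positive for the coupled one. They differ only in the generator passed to the same function, which is the sense in which such quantities can be viewed as different readings of one underlying dependence structure.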