Redundancy as a Structural Information Principle for Learning and Generalization

2510.10938v1 cs.LG, cs.AI, cs.IT, math.IT, stat.ML 2025-10-16
Authors:

Yuda Bi, Ying Zhu, Vince D Calhoun

Abstract

We present a theoretical framework that extends classical information theory to finite and structured systems by redefining redundancy as a fundamental property of information organization rather than inefficiency. In this framework, redundancy is expressed as a general family of informational divergences that unifies multiple classical measures, such as mutual information, chi-squared dependence, and spectral redundancy, under a single geometric principle. This reveals that these traditional quantities are not isolated heuristics but projections of a shared redundancy geometry. The theory further predicts that redundancy is bounded both above and below, giving rise to an optimal equilibrium that balances over-compression (loss of structure) and over-coupling (collapse). While classical communication theory favors minimal redundancy for transmission efficiency, finite and structured systems, such as those underlying real-world learning, achieve maximal stability and generalization near this equilibrium. Experiments with masked autoencoders are used to illustrate and verify this principle: the model exhibits a stable redundancy level where generalization peaks. Together, these results establish redundancy as a measurable and tunable quantity that bridges the asymptotic world of communication and the finite world of learning.
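The abstract does not define its redundancy family, so the following is only an illustrative sketch. It assumes the family can be read as an f-divergence between a joint distribution and the product of its marginals, which may differ from the authors' actual construction. Under that assumption, the generator f(t) = t log t yields mutual information and f(t) = (t - 1)^2 yields the chi-squared dependence, so both come out of one functional. The function name `f_dependence` and the toy probability tables are hypothetical.

```python
import math


def f_dependence(joint, f):
    """Return D_f(P_XY || P_X * P_Y) for a joint probability table.

    `joint` is a list of rows whose entries sum to 1.
    `f` is a convex generator with f(1) == 0.
    Assumption: this f-divergence form is a stand-in for the paper's
    redundancy family, not the authors' definition.
    """
    px = [sum(row) for row in joint]          # marginal of X (rows)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (columns)
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            q = px[i] * py[j]                 # independent reference mass
            if q > 0:
                total += q * f(pxy / q)
    return total


def kl_generator(t):
    # f(t) = t log t gives mutual information (in nats); 0 log 0 := 0
    return t * math.log(t) if t > 0 else 0.0


def chi2_generator(t):
    # f(t) = (t - 1)^2 gives the Pearson chi-squared dependence
    return (t - 1.0) ** 2


# Toy 2x2 tables (hypothetical data for illustration only)
independent = [[0.25, 0.25], [0.25, 0.25]]
coupled = [[0.4, 0.1], [0.1, 0.4]]

mi_independent = f_dependence(independent, kl_generator)
mi_coupled = f_dependence(coupled, kl_generator)
chi2_coupled = f_dependence(coupled, chi2_generator)
```

In this toy setting both measures are zero for the independent table and grow as the variables become more coupled, which is the sense in which a single divergence family can cover several classical dependence measures.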
