Measuring the Intrinsic Dimension of Earth Representations
2511.02101v2
cs.LG, cs.IT, math.IT
2025-11-06
Authors:
Arjun Rao, Marc Rußwurm, Konstantin Klemmer, Esther Rolf
Abstract
Within the context of representation learning for Earth observation,
geographic Implicit Neural Representations (INRs) embed low-dimensional
location inputs (longitude, latitude) into high-dimensional embeddings, through
models trained on geo-referenced satellite, image or text data. Despite the
common aim of geographic INRs to distill Earth's data into compact,
learning-friendly representations, we lack an understanding of how much
information is contained in these Earth representations, and where that
information is concentrated. The intrinsic dimension of a dataset measures the
number of degrees of freedom required to capture its local variability,
regardless of the ambient high-dimensional space in which it is embedded. This
work provides the first study of the intrinsic dimensionality of geographic
INRs. Analyzing INRs with ambient dimension between 256 and 512, we find that
their intrinsic dimensions fall roughly between 2 and 10 and are sensitive to
changing spatial resolution and input modalities during INR pre-training.
Furthermore, we show that the intrinsic dimension of a geographic INR
correlates with downstream task performance and can capture spatial artifacts,
facilitating model evaluation and diagnostics. More broadly, our work offers an
architecture-agnostic, label-free metric of information content that can enable
unsupervised evaluation, model selection, and pre-training design across INRs.
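To make the notion of intrinsic dimension concrete, here is a minimal sketch of one common estimator, TwoNN (Facco et al., 2017), which uses only the ratio of each point's second to first nearest-neighbor distance. The abstract does not say which estimator the authors use, so this choice is an assumption made for illustration. The function name `twonn_intrinsic_dimension` and the synthetic sine/cosine "location encoder" are likewise hypothetical and stand in for a real geographic INR.

```python
import numpy as np


def twonn_intrinsic_dimension(embeddings):
    """Maximum-likelihood TwoNN estimate of intrinsic dimension.

    Illustrative sketch only; not necessarily the estimator used in the paper.
    """
    x = np.asarray(embeddings, dtype=float)
    log_ratios = []
    for i in range(x.shape[0]):
        dist = np.linalg.norm(x - x[i], axis=1)
        dist[i] = np.inf  # exclude the point itself
        # After partitioning at index 1, positions 0 and 1 hold the
        # smallest and second-smallest distances.
        r1, r2 = np.partition(dist, 1)[:2]
        if r1 > 0:  # skip exact duplicates, whose ratio is undefined
            log_ratios.append(np.log(r2 / r1))
    # The ratio r2 / r1 follows a Pareto law with shape d, so the
    # maximum-likelihood estimate is N / sum(log(r2 / r1)).
    return len(log_ratios) / np.sum(log_ratios)


# Hypothetical stand-in for a location encoder: 2-D inputs mapped to a
# 256-D embedding through random sine/cosine features.
rng = np.random.default_rng(0)
locations = rng.uniform(size=(1500, 2))
weights = rng.normal(size=(2, 128))
projected = locations @ weights
embedded = np.concatenate([np.sin(projected), np.cos(projected)], axis=1)

estimate = twonn_intrinsic_dimension(embedded)
print(f"ambient dimension: {embedded.shape[1]}, estimated intrinsic dimension: {estimate:.2f}")
```

Because the synthetic embedding is a smooth function of two inputs, the estimate should land near 2 even though the ambient dimension is 256. Embeddings from a trained INR could be passed to the same function in place of `embedded`.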