MCbiF: Measuring Topological Autocorrelation in Multiscale Clusterings via 2-Parameter Persistent Homology
2510.14710v1
math.AT, cs.LG, physics.data-an, Primary 55N31, Secondary 62H30
2025-10-18
Authors:
Juni Schindler, Mauricio Barahona
Abstract
Datasets often possess an intrinsic multiscale structure with meaningful
descriptions at different levels of coarseness. Such datasets are naturally
described as multi-resolution clusterings, i.e., not necessarily hierarchical
sequences of partitions across scales. To analyse and compare such sequences,
we use tools from topological data analysis and define the Multiscale
Clustering Bifiltration (MCbiF), a 2-parameter filtration of abstract
simplicial complexes that encodes cluster intersection patterns across scales.
The MCbiF can be interpreted as a higher-order extension of Sankey diagrams and
reduces to a dendrogram for hierarchical sequences. We show that the
multiparameter persistent homology (MPH) of the MCbiF yields a finitely
presented and block decomposable module, and its stable Hilbert functions
characterise the topological autocorrelation of the sequence of partitions. In
particular, at dimension zero, the MPH captures violations of the refinement
order of partitions, whereas at dimension one, the MPH captures higher-order
inconsistencies between clusters across scales. We demonstrate through
experiments the use of MCbiF Hilbert functions as topological feature maps for
downstream machine learning tasks. MCbiF feature maps outperform
information-based baseline features on both regression and classification tasks
on synthetic sets of non-hierarchical sequences of partitions. We also show an
application of MCbiF to real-world data to measure non-hierarchies in the
social grouping patterns of wild mice across time.
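
As an informal illustration of what a violation of the refinement order looks like, the following minimal Python sketch checks a sequence of partitions for hierarchy and counts connected components over a window of scales. It is not the authors' implementation and computes no persistent homology. The function names (`refines`, `refinement_violations`, `components_over_window`) and the toy partitions are hypothetical. The component count assumes that points sharing a cluster at any scale in the window are joined, which is our reading of "cluster intersection patterns across scales" rather than something the abstract states.

```python
from itertools import combinations


def refines(fine, coarse):
    """True if every cluster of `fine` lies inside a single cluster of `coarse`."""
    coarse_sets = [set(c) for c in coarse]
    return all(any(set(f) <= c for c in coarse_sets) for f in fine)


def refinement_violations(partitions):
    """Index pairs (s, t), s < t, where partitions[s] does not refine partitions[t]."""
    return [
        (s, t)
        for s, t in combinations(range(len(partitions)), 2)
        if not refines(partitions[s], partitions[t])
    ]


def components_over_window(partitions, s, t):
    """Count connected components after merging points that share a cluster
    at any scale index in s..t (assumed reading of the construction)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for partition in partitions[s:t + 1]:
        for cluster in partition:
            first = cluster[0]
            find(first)  # register singleton clusters too
            for x in cluster[1:]:
                parent[find(x)] = find(first)
    return len({find(x) for x in parent})


# Toy sequences on the points {1, 2, 3}, ordered from fine to coarse.
hierarchical = [[[1], [2], [3]], [[1, 2], [3]], [[1, 2, 3]]]
non_hierarchical = [[[1, 2], [3]], [[1], [2, 3]], [[1, 2, 3]]]
```

In the second toy sequence, the cluster {1, 2} at the first scale is split at the second scale, so `refinement_violations(non_hierarchical)` reports the pair `(0, 1)`, while the hierarchical sequence reports none.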