From Moments to Models: Graphon Mixture-Aware Mixup and Contrastive Learning
2510.03690v1
cs.LG, stat.ML
2025-10-08
Авторы:
Ali Azizpour, Reza Ramezanpour, Ashutosh Sabharwal, Santiago Segarra
Abstract
Real-world graph datasets often consist of mixtures of populations, where
graphs are generated from multiple distinct underlying distributions. However,
modern representation learning approaches, such as graph contrastive learning
(GCL) and augmentation methods like Mixup, typically overlook this mixture
structure. In this work, we propose a unified framework that explicitly models
data as a mixture of underlying probabilistic graph generative models
represented by graphons. To characterize these graphons, we leverage graph
moments (motif densities) to cluster graphs arising from the same model. This
enables us to disentangle the mixture components and identify their distinct
generative mechanisms. This model-aware partitioning benefits two key graph
learning tasks: 1) It enables a graphon-mixture-aware mixup (GMAM), a data
augmentation technique that interpolates in a semantically valid space guided
by the estimated graphons, instead of assuming a single graphon per class. 2)
For GCL, it enables model-adaptive and principled augmentations. Additionally,
by introducing a new model-aware objective, our proposed approach (termed MGCL)
improves negative sampling by restricting negatives to graphs from other
models. We establish a key theoretical guarantee: a novel, tighter bound
showing that graphs sampled from graphons with small cut distance will have
similar motif densities with high probability. Extensive experiments on
benchmark datasets demonstrate strong empirical performance. In unsupervised
learning, MGCL achieves state-of-the-art results, obtaining the top average
rank across eight datasets. In supervised learning, GMAM consistently
outperforms existing strategies, achieving new state-of-the-art accuracy in 6
out of 7 datasets.
Ссылки и действия
Дополнительные ресурсы: