scMRDR: A scalable and flexible framework for unpaired single-cell multi-omics data integration
2510.24987v1
q-bio.QM, cs.LG, q-bio.GN
2025-11-01
Authors:
Jianle Sun, Chaoqi Liang, Ran Wei, Peng Zheng, Lei Bai, Wanli Ouyang, Hongliang Yan, Peng Ye
Abstract
Advances in single-cell sequencing have enabled high-resolution profiling of
diverse molecular modalities, but integrating unpaired single-cell
multi-omics data remains challenging. Existing approaches either rely on paired
measurements or prior correspondences, or require computing a global pairwise
coupling matrix, which limits their scalability and flexibility. In this paper, we
introduce a scalable and flexible generative framework called single-cell
Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired
multi-omics integration. Specifically, we disentangle each cell's latent
representation into modality-shared and modality-specific components using a
$\beta$-VAE architecture augmented with three elements: isometric
regularization to preserve intra-omics biological heterogeneity, an adversarial
objective to encourage cross-modal alignment, and a masked reconstruction loss
to handle features that are missing in some modalities. Our method
achieves excellent performance on benchmark datasets in terms of batch
correction, modality alignment, and biological signal preservation. Crucially,
it scales effectively to large datasets and supports integration of more
than two omics modalities, offering a powerful and flexible solution for large-scale
multi-omics data integration and downstream biological discovery.
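
The abstract names the loss components but gives no formulas, so the following is only a minimal sketch of how two of them might look, not the authors' implementation. It assumes a latent vector split by position into shared and specific parts, a squared-error reconstruction term restricted to features observed in the cell's modality, and the standard $\beta$-weighted KL term of a $\beta$-VAE with a diagonal Gaussian posterior. All function names and the choice of squared error are hypothetical; the isometric and adversarial terms are omitted because their form cannot be inferred from the abstract.

```python
import math
from typing import List, Sequence, Tuple


def split_latent(z: Sequence[float], n_shared: int) -> Tuple[List[float], List[float]]:
    """Split a latent vector into (modality-shared, modality-specific) parts.

    Assumption: the first n_shared coordinates are the shared component.
    """
    return list(z[:n_shared]), list(z[n_shared:])


def masked_reconstruction_loss(
    x: Sequence[float], x_hat: Sequence[float], mask: Sequence[int]
) -> float:
    """Mean squared error over the features flagged as observed in mask.

    Features with mask == 0 (not measured in this modality) contribute nothing.
    """
    total, count = 0.0, 0
    for xi, xhi, mi in zip(x, x_hat, mask):
        if mi:
            total += (xi - xhi) ** 2
            count += 1
    return total / count if count else 0.0


def gaussian_kl(mu: Sequence[float], logvar: Sequence[float]) -> float:
    """KL divergence from N(mu, diag(exp(logvar))) to the standard normal."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def beta_vae_loss(recon: float, kl: float, beta: float) -> float:
    """Beta-VAE objective: reconstruction term plus beta-weighted KL term."""
    return recon + beta * kl


# Toy cell with four features, of which only the first two are observed.
x = [1.0, 2.0, 3.0, 4.0]
x_hat = [1.0, 0.0, 3.0, 0.0]
mask = [1, 1, 0, 0]

recon = masked_reconstruction_loss(x, x_hat, mask)
kl = gaussian_kl(mu=[1.0], logvar=[0.0])
loss = beta_vae_loss(recon, kl, beta=4.0)
shared, specific = split_latent([0.1, 0.2, 0.3, 0.4, 0.5], n_shared=2)
```

Masking the reconstruction term this way means a cell is never penalized for features its assay did not measure, which is one plausible reading of how missing features across modalities could be handled.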