Transforming Multi-Omics Integration with GANs: Applications in Alzheimer's and Cancer
2510.19870v1
q-bio.QM, cs.LG, stat.ML
2025-10-25
Authors:
Md Selim Reza, Sabrin Afroz, Mostafizer Rahman, Md Ashad Alam
Abstract
Multi-omics data integration is crucial for understanding complex diseases,
yet limited sample sizes, noise, and heterogeneity often reduce predictive
power. To address these challenges, we introduce Omics-GAN, a Generative
Adversarial Network (GAN)-based framework designed to generate high-quality
synthetic multi-omics profiles while preserving biological relationships. We
evaluated Omics-GAN on three omics types (mRNA, miRNA, and DNA methylation)
using the ROSMAP cohort for Alzheimer's disease (AD) and TCGA datasets for
colon and liver cancer. A support vector machine (SVM) classifier with repeated
5-fold cross-validation demonstrated that synthetic datasets consistently
improved prediction accuracy compared to original omics profiles. The SVM's AUC
for mRNA improved from 0.72 to 0.74 in AD and from 0.68 to 0.72 in liver
cancer. Synthetic miRNA enhanced classification in colon cancer from 0.59 to
0.69, while synthetic methylation data improved performance in liver cancer
from 0.64 to 0.71. Boxplot analyses confirmed that synthetic data preserved
statistical distributions while reducing noise and outliers. Feature selection
identified significant genes overlapping with original datasets and revealed
additional candidates validated by GO and KEGG enrichment analyses. Finally,
molecular docking highlighted potential drug repurposing candidates, including
Nilotinib for AD, Atovaquone for liver cancer, and Tecovirimat for colon
cancer. Omics-GAN enhances disease prediction, preserves biological fidelity,
and accelerates biomarker and drug discovery, offering a scalable strategy for
precision medicine applications.
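
The evaluation protocol named in the abstract (an SVM scored by AUC under repeated 5-fold cross-validation) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' pipeline. The abstract does not state the kernel, the number of repeats, whether folds were stratified, or how features were scaled, so the RBF kernel, 10 repeats, stratification, and standardization below are all assumptions. The data is a random placeholder from `make_classification`, not ROSMAP or TCGA, and `repeated_cv_auc` is a hypothetical helper name.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def repeated_cv_auc(X, y, n_repeats=10, seed=0):
    """Return mean, std, and per-fold ROC AUC for an SVM under repeated 5-fold CV."""
    # Scaling lives inside the pipeline so it is re-fit on each training fold only,
    # which avoids leaking test-fold statistics into training.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))  # kernel is an assumption
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=n_repeats, random_state=seed)
    # "roc_auc" scoring uses the SVM decision function, so no probability fit is needed.
    scores = cross_val_score(model, X, y, scoring="roc_auc", cv=cv)
    return scores.mean(), scores.std(), scores


# Placeholder matrix (samples x features) standing in for one omics layer.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)
mean_auc, std_auc, scores = repeated_cv_auc(X, y)
print(f"AUC: {mean_auc:.3f} +/- {std_auc:.3f} over {len(scores)} folds")
```

To compare original and synthetic profiles in the way the abstract describes, the same function would be called once per feature matrix with an identical `seed`, so that both are scored on the same fold splits.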