Multimodal Foundation Models for Early Disease Detection

2510.01899v1 cs.LG, cs.AI, cs.HC 2025-10-04

Authors:

Md Talha Mohsin, Ismail Abdulrashid

Abstract

Healthcare generates diverse streams of data, including electronic health records (EHR), medical imaging, genetics, and continuous monitoring from wearable devices. Traditional diagnostic models frequently analyze these sources in isolation, which limits their capacity to identify the cross-modal correlations essential for early disease diagnosis. Our research presents a multimodal foundation model that consolidates diverse patient data through an attention-based transformer framework. First, dedicated encoders project each modality into a shared latent space. The resulting representations are then combined using multi-head attention and residual normalization. The architecture is designed for pretraining on many tasks, which makes it easy to adapt to new diseases and datasets with little extra work. We provide an experimental strategy that uses benchmark datasets in oncology, cardiology, and neurology to test early detection tasks. Beyond technical performance, the framework includes data governance and model management tools to improve transparency, reliability, and clinical interpretability. The proposed method works toward a single foundation model for precision diagnostics, which could improve the accuracy of predictions and help doctors make decisions.
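
The fusion pipeline described in the abstract (per-modality encoders, a shared latent space, multi-head attention, residual normalization) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The linear encoders, random weights, feature dimensions, mean pooling step, and all names (`fuse`, `encoders`, `input_dims`, and so on) are assumptions chosen only to make the data flow concrete.

```python
import math
import random

def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    # Random weights stand in for trained parameters (assumption).
    scale = 1.0 / math.sqrt(cols)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    # Normalize one token to zero mean and unit variance (no learned gain/bias).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def multi_head_attention(tokens, Wq, Wk, Wv, Wo, num_heads):
    # Scaled dot-product self-attention over the modality tokens.
    d_head = len(tokens[0]) // num_heads
    Q = [matvec(Wq, t) for t in tokens]
    K = [matvec(Wk, t) for t in tokens]
    V = [matvec(Wv, t) for t in tokens]
    out = []
    for i in range(len(tokens)):
        concat = []
        for h in range(num_heads):
            lo, hi = h * d_head, (h + 1) * d_head
            scores = [
                sum(q * k for q, k in zip(Q[i][lo:hi], K[j][lo:hi])) / math.sqrt(d_head)
                for j in range(len(tokens))
            ]
            weights = softmax(scores)
            concat.extend(
                sum(w * V[j][lo + c] for j, w in enumerate(weights))
                for c in range(d_head)
            )
        out.append(matvec(Wo, concat))
    return out

def fuse(patient, encoders, attn, num_heads):
    # 1) Encode each modality into the shared latent space (one token each).
    tokens = [matvec(encoders[name], feats) for name, feats in patient.items()]
    # 2) Let modality tokens attend to one another.
    attended = multi_head_attention(tokens, *attn, num_heads)
    # 3) Residual connection followed by layer normalization.
    normed = [layer_norm([t + a for t, a in zip(tok, att)])
              for tok, att in zip(tokens, attended)]
    # 4) Mean-pool the tokens into one patient vector (pooling is an assumption).
    return [sum(tok[c] for tok in normed) / len(normed)
            for c in range(len(normed[0]))]

rng = random.Random(0)
D_MODEL, NUM_HEADS = 8, 2
# Hypothetical per-modality feature sizes.
input_dims = {"ehr": 5, "imaging": 6, "genetics": 4, "wearable": 3}
encoders = {name: rand_matrix(D_MODEL, dim, rng) for name, dim in input_dims.items()}
attn = tuple(rand_matrix(D_MODEL, D_MODEL, rng) for _ in range(4))  # Wq, Wk, Wv, Wo
patient = {name: [rng.gauss(0, 1) for _ in range(dim)] for name, dim in input_dims.items()}

fused = fuse(patient, encoders, attn, NUM_HEADS)
print(len(fused))
```

In this sketch each modality contributes exactly one token, so attention operates over four tokens per patient. A practical model would more likely use learned, modality-specific encoders that emit token sequences, and would feed the fused vector to task-specific prediction heads.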
