Few-Shot Multimodal Medical Imaging: A Theoretical Framework
2511.01140v1
stat.ML, cs.AI, cs.CV, cs.LG, eess.IV
2025-11-07
Authors:
Md Talha Mohsin, Ismail Abdulrashid
Abstract
Medical imaging relies heavily on large, labeled datasets, but these are not
always readily accessible in clinical settings. In addition, practitioners
often face structural obstacles such as limited data availability, fragmented
data systems, and imbalanced datasets. These barriers often lead to increased
diagnostic uncertainty, underrepresentation of certain conditions, reduced
model robustness, and biased diagnostic decisions. In response to these
challenges, approaches such as
transfer learning, meta-learning, and multimodal fusion have made great
strides. However, they still lack a solid theoretical justification for why
they succeed or fail when data is scarce. To address this gap,
we propose a unified theoretical framework that characterizes learning and
inference under low-resource medical imaging conditions. We first formalize the
learning objective under few-shot conditions and compute sample complexity
constraints to estimate the smallest quantity of data needed to achieve
clinically reliable accuracy. Then, drawing on ideas from PAC-learning and
PAC-Bayesian theory, we explain how multimodal integration encourages
generalization and quantifies uncertainty under sparse supervision. We further
propose a formal metric for explanation stability, offering interpretability
guarantees under low-data conditions. Taken together, the proposed framework
establishes a principled foundation for constructing dependable, data-efficient
diagnostic systems by jointly characterizing sample efficiency, uncertainty
quantification, and interpretability in a unified theoretical setting.
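
As a rough illustration of the kind of sample complexity estimate mentioned above, the sketch below evaluates the classical PAC bound for a finite hypothesis class in the realizable setting, m >= (1/epsilon) * (ln|H| + ln(1/delta)). This is a standard textbook bound and is not claimed to be the bound derived in the paper. The function name, parameters, and example values are assumptions chosen purely for illustration.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Textbook PAC sample bound for a finite hypothesis class (realizable case).

    Returns the smallest integer m with m >= (ln|H| + ln(1/delta)) / epsilon,
    which suffices for a consistent learner to reach error at most epsilon
    with probability at least 1 - delta. Illustrative only; not the paper's bound.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    m = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(m)


if __name__ == "__main__":
    # Hypothetical values: 1000 candidate models, 10% error, 95% confidence.
    print(pac_sample_bound(hypothesis_count=1000, epsilon=0.1, delta=0.05))
```

Halving the target error `epsilon` roughly doubles the required number of labeled examples under this bound, which gives a sense of why tight accuracy targets are hard to meet in few-shot clinical settings.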