The Impact of Image Resolution on Biomedical Multimodal Large Language Models
2510.18304v1
cs.CV, cs.CL
2025-10-23
Authors:
Liangyu Chen, James Burgess, Jeffrey J Nirschl, Orr Zohar, Serena Yeung-Levy
Abstract
Imaging technologies are fundamental to biomedical research and modern
medicine, requiring analysis of high-resolution images across various
modalities. While multimodal large language models (MLLMs) show promise for
biomedical image analysis, most are designed for low-resolution images from
general-purpose datasets, risking critical information loss. We investigate how
image resolution affects MLLM performance in biomedical applications and
demonstrate that: (1) native-resolution training and inference significantly
improve performance across multiple tasks, (2) misalignment between training
and inference resolutions severely degrades performance, and (3)
mixed-resolution training effectively mitigates misalignment and balances
computational constraints with performance requirements. Based on these
findings, we recommend prioritizing native-resolution inference and
mixed-resolution datasets to optimize biomedical MLLMs for transformative
impact in scientific research and clinical applications.