Prompt-Conditioned FiLM and Multi-Scale Fusion on MedSigLIP for Low-Dose CT Quality Assessment

2511.12256v1 cs.CV, cs.AI, eess.IV 2025-11-18

Authors:

Tolga Demiroglu, Mehmet Ozan Unal, Metin Ertas, Isa Yildirim

Abstract

We propose a prompt-conditioned framework built on MedSigLIP that injects textual priors via Feature-wise Linear Modulation (FiLM) and multi-scale pooling. Text prompts condition patch-token features on clinical intent, enabling data-efficient learning and rapid adaptation. The architecture combines global, local, and texture-aware pooling through separate regression heads fused by a lightweight MLP, and is trained with a pairwise ranking loss. On LDCTIQA2023 (a public low-dose CT (LDCT) quality assessment challenge), using 1,000 training images, the method achieves PLCC = 0.9575, SROCC = 0.9561, and KROCC = 0.8301, surpassing the top-ranked published challenge submissions and demonstrating the effectiveness of the prompt-guided approach.
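The two mechanisms named in the abstract, FiLM conditioning and a pairwise ranking loss, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: gamma and beta are produced from a prompt embedding by a single hypothetical dense layer (`linear`), the ranking loss is written in a common hinge form with a hypothetical `margin` parameter, and all weights and inputs are toy values. Plain Python lists are used so the example runs without any dependencies.

```python
def linear(vec, weight, bias):
    """Dense layer; `weight` holds one row per output channel."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def film(tokens, gamma, beta):
    """FiLM: scale and shift every channel of every patch token."""
    return [[g * x + b for x, g, b in zip(tok, gamma, beta)]
            for tok in tokens]


def pairwise_ranking_loss(pred, target, margin=0.0):
    """Mean hinge loss over all pairs whose target scores differ.

    A pair is penalised when the predicted ordering disagrees with
    the target ordering (or agrees by less than `margin`).
    """
    total, count = 0.0, 0
    for i in range(len(pred)):
        for j in range(i + 1, len(pred)):
            if target[i] == target[j]:
                continue  # ties carry no ordering information
            sign = 1.0 if target[i] > target[j] else -1.0
            total += max(0.0, margin - sign * (pred[i] - pred[j]))
            count += 1
    return total / count if count else 0.0


# Hypothetical toy inputs: a 2-d prompt embedding and
# two patch tokens with three channels each.
prompt_embedding = [1.0, -1.0]
w_gamma = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]]
b_gamma = [1.0, 1.0, 1.0]
w_beta = [[0.1, 0.0], [0.0, 0.1], [0.0, 0.0]]
b_beta = [0.0, 0.0, 0.0]

gamma = linear(prompt_embedding, w_gamma, b_gamma)
beta = linear(prompt_embedding, w_beta, b_beta)

patch_tokens = [[1.0, 2.0, 3.0], [0.0, -1.0, 4.0]]
modulated = film(patch_tokens, gamma, beta)
```

In this sketch the same gamma and beta are applied to every patch token, so the prompt changes how channels are weighted without changing the token layout. The loss depends only on the relative order of predictions, which fits quality scores evaluated by rank correlation such as SROCC and KROCC.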



