Visual Bias and Interpretability in Deep Learning for Dermatological Image Analysis

2508.04573v1 cs.CV 2025-08-09

Authors:

Enam Ahmed Taufik, Abdullah Khondoker, Antara Firoz Parsa, Seraj Al Mahmud Mostafa

Summary

Problem: Deep neural networks have become a key tool for analyzing dermatological images in computer-aided diagnosis, but the task remains difficult because of high inter-class similarity, intra-class variability, and complex lesion textures. Solution: This work proposes a deep learning framework for multi-class skin disease classification and evaluates three image pre-processing methods: standard RGB, CMY color space transformation, and CLAHE. DenseNet201, EfficientNetB5, and transformer-based models (ViT, Swin Transformer, DinoV2 Large) are evaluated using accuracy and F1-score. Key findings: DinoV2 with RGB pre-processing achieves the highest accuracy and F1-score. Grad-CAM visualizations applied to RGB inputs show precise lesion localization, improving the transparency and practical usefulness of CAD systems for dermatology.

Abstract

Accurate skin disease classification is a critical yet challenging task due to high inter-class similarity, intra-class variability, and complex lesion textures. While deep learning-based computer-aided diagnosis (CAD) systems have shown promise in automating dermatological assessments, their performance is highly dependent on image pre-processing and model architecture. This study proposes a deep learning framework for multi-class skin disease classification, systematically evaluating three image pre-processing techniques: standard RGB, CMY color space transformation, and Contrast Limited Adaptive Histogram Equalization (CLAHE). We benchmark the performance of pre-trained convolutional neural networks (DenseNet201, EfficientNetB5) and transformer-based models (ViT, Swin Transformer, DinoV2 Large) using accuracy and F1-score as evaluation metrics. Results show that DinoV2 with RGB pre-processing achieves the highest accuracy (up to 93%) and F1-scores across all variants. Grad-CAM visualizations applied to RGB inputs further reveal precise lesion localization, enhancing interpretability. These findings underscore the importance of effective pre-processing and model choice in building robust and explainable CAD systems for dermatology.
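
As a rough illustration of two ingredients named in the abstract, the sketch below shows a CMY color space transformation and the two evaluation metrics in plain Python. It is not the authors' code. It assumes the common definition of CMY as the per-channel complement of 8-bit RGB, and it assumes macro-averaged F1 (the abstract does not say which averaging is used). All function names are hypothetical. CLAHE is omitted because it is normally taken from an image library rather than written by hand.

```python
def rgb_to_cmy(pixel):
    """Convert one 8-bit RGB pixel to CMY (assumed: per-channel complement)."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


def image_to_cmy(image):
    """Apply rgb_to_cmy to an image stored as rows of (R, G, B) tuples."""
    return [[rgb_to_cmy(p) for p in row] for row in image]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    tiny_image = [[(255, 0, 0), (10, 20, 30)]]
    print(image_to_cmy(tiny_image))

    # Toy labels only; these are not results from the paper.
    y_true = ["a", "b", "a", "c"]
    y_pred = ["a", "b", "c", "c"]
    print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Macro averaging weights every class equally, which is one reason F1 is often reported alongside accuracy when class frequencies differ. Whether the paper uses macro or weighted averaging would need to be checked against its full text.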
