Artificial Intelligence-Based Classification of Spitz Tumors

2508.05391v1 eess.IV, cs.CV 2025-08-09
Authors:

Ruben T. Lucassen, Marjanna Romers, Chiel F. Ebbelaar, Aia N. Najem, Donal P. Hayes, Antien L. Mooyaart, Sara Roshani, Liliane C. D. Wynaendts, Nikolas Stathonikos, Gerben E. Breimer, Anne M. L. Jansen, Mitko Veta, Willeke A. M. Blokx

Summary

Spitz tumors are notoriously difficult to diagnose, which motivated this study of what AI models can do across several tasks. The study used a dataset of 393 Spitz tumors and 379 conventional melanomas and compared the predictive performance of the AI models with that of four experienced pathologists. The best AI model based on UNI features reached an AUROC of 0.95 and an accuracy of 0.86 in distinguishing Spitz tumors from conventional melanomas, and the genetic aberration was predicted with an accuracy of 0.55, compared with 0.25 for random guessing. A simulation experiment also indicated that AI-based recommendations for ancillary diagnostic testing could reduce material costs, turnaround times, and the number of examinations. Overall, the AI models achieved strong predictive performance in differentiating Spitz tumors from conventional melanomas, which demonstrates their potential to support diagnosis.

Abstract

Spitz tumors are diagnostically challenging due to overlap in atypical histological features with conventional melanomas. We investigated to what extent AI models, using histological and/or clinical features, can: (1) distinguish Spitz tumors from conventional melanomas; (2) predict the underlying genetic aberration of Spitz tumors; and (3) predict the diagnostic category of Spitz tumors. The AI models were developed and validated using a dataset of 393 Spitz tumors and 379 conventional melanomas. Predictive performance was measured using the AUROC and the accuracy. The performance of the AI models was compared with that of four experienced pathologists in a reader study. Moreover, a simulation experiment was conducted to investigate the impact of implementing AI-based recommendations for ancillary diagnostic testing on the workflow of the pathology department. The best AI model based on UNI features reached an AUROC of 0.95 and an accuracy of 0.86 in differentiating Spitz tumors from conventional melanomas. The genetic aberration was predicted with an accuracy of 0.55 compared to 0.25 for randomly guessing. The diagnostic category was predicted with an accuracy of 0.51, where random chance-level accuracy equaled 0.33. On all three tasks, the AI models performed better than the four pathologists, although differences were not statistically significant for most individual comparisons. Based on the simulation experiment, implementing AI-based recommendations for ancillary diagnostic testing could reduce material costs, turnaround times, and examinations. In conclusion, the AI models achieved a strong predictive performance in distinguishing between Spitz tumors and conventional melanomas. On the more challenging tasks of predicting the genetic aberration and the diagnostic category of Spitz tumors, the AI models performed better than random chance.
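
The two metrics named in the abstract, AUROC and accuracy, along with the chance-level baselines it quotes, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the label convention (1 = conventional melanoma, 0 = Spitz tumor) are assumptions made for illustration only. The chance levels of 0.25 and 0.33 are consistent with uniform guessing over four and three classes respectively, which is also an assumption about how those baselines were derived.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def chance_accuracy(n_classes):
    """Expected accuracy of uniform random guessing over n_classes labels."""
    return 1.0 / n_classes


# Hypothetical toy data, not taken from the paper.
# Assumed convention: 1 = conventional melanoma, 0 = Spitz tumor.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

toy_auroc = auroc(labels, scores)
toy_accuracy = accuracy(labels, preds)
```

AUROC is computed from the continuous scores and does not depend on a decision threshold, whereas accuracy depends on where the threshold is set. This is why the two numbers for the same model can differ, as they do in the abstract (0.95 versus 0.86).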
