Assessing the Real-World Utility of Explainable AI for Arousal Diagnostics: An Application-Grounded User Study
2510.21389v1
cs.LG, cs.AI, cs.HC
2025-10-28
Авторы:
Stefan Kraft, Andreas Theissler, Vera Wienhausen-Wilke, Gjergji Kasneci, Hendrik Lensch
Abstract
Artificial intelligence (AI) systems increasingly match or surpass human
experts in biomedical signal interpretation. However, their effective
integration into clinical practice requires more than high predictive accuracy.
Clinicians must discern \textit{when} and \textit{why} to trust algorithmic
recommendations. This work presents an application-grounded user study with
eight professional sleep medicine practitioners, who score nocturnal arousal
events in polysomnographic data under three conditions: (i) manual scoring,
(ii) black-box (BB) AI assistance, and (iii) transparent white-box (WB) AI
assistance. Assistance is provided either from the \textit{start} of scoring or
as a post-hoc quality-control (\textit{QC}) review. We systematically evaluate
how the type and timing of assistance influence event-level and clinically most
relevant count-based performance, time requirements, and user experience. When
evaluated against the clinical standard used to train the AI, both AI and
human-AI teams significantly outperform unaided experts, with collaboration
also reducing inter-rater variability. Notably, transparent AI assistance
applied as a targeted QC step yields median event-level performance
improvements of approximately 30\% over black-box assistance, and QC timing
further enhances count-based outcomes. While WB and QC approaches increase the
time required for scoring, start-time assistance is faster and preferred by
most participants. Participants overwhelmingly favor transparency, with seven
out of eight expressing willingness to adopt the system with minor or no
modifications. In summary, strategically timed transparent AI assistance
effectively balances accuracy and clinical efficiency, providing a promising
pathway toward trustworthy AI integration and user acceptance in clinical
workflows.
Ссылки и действия
Дополнительные ресурсы: