DHAuDS: A Dynamic and Heterogeneous Audio Benchmark for Test-Time Adaptation

2511.18421v1 cs.SD, cs.LG 2025-11-26

Authors:

Weichuang Shao, Iman Yi Liao, Tomas Henrique Bode Maul, Tissa Chandesa

Abstract

Audio classifiers frequently face domain shift, where models trained on one dataset lose accuracy on data recorded in acoustically different conditions. Previous Test-Time Adaptation (TTA) research in speech and sound analysis often evaluates models under fixed or mismatched noise settings that fail to mimic real-world variability. To overcome these limitations, this paper presents DHAuDS (Dynamic and Heterogeneous Audio Domain Shift), a benchmark designed to assess TTA approaches under more realistic and diverse acoustic shifts. DHAuDS comprises four standardized benchmarks: UrbanSound8K-C, SpeechCommandsV2-C, VocalSound-C, and ReefSet-C, each constructed with dynamic corruption severity levels and heterogeneous noise types to simulate authentic audio degradation scenarios. The framework defines 14 evaluation criteria for each benchmark (8 for UrbanSound8K-C), resulting in 50 distinct criteria (124 experiments) that collectively enable fair, reproducible, and cross-domain comparison of TTA algorithms. By including dynamic and mixed-domain noise settings, DHAuDS offers a consistent and publicly reproducible testbed to support ongoing studies in robust and adaptive audio modeling.
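To make "dynamic corruption severity levels and heterogeneous noise types" concrete, the following is a minimal sketch of one common way such a corrupted test stream could be built: each clip gets a randomly chosen noise type and a randomly chosen signal-to-noise ratio (SNR). The abstract does not state how DHAuDS builds its corruptions, so the SNR-based mixing, the function names `mix_at_snr` and `corrupt_stream`, and the parameters `noise_bank` and `snr_levels_db` are all assumptions made for illustration, not the authors' procedure.

```python
import math
import random


def rms(x):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean clip, scaled so the mix has the requested SNR in dB.

    Hypothetical helper: assumes `noise` is at least as long as `clean`.
    """
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return list(clean)  # silent noise: nothing to add
    # SNR_dB = 20 * log10(rms(clean) / rms(scaled_noise))
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


def corrupt_stream(clips, noise_bank, snr_levels_db, seed=0):
    """Corrupt each clip with a random noise type and a random severity.

    `noise_bank` maps a noise-type name to a list of samples (assumed layout).
    Returns a list of (corrupted_clip, noise_type, snr_db) tuples.
    """
    rng = random.Random(seed)  # seeded so the stream is reproducible
    out = []
    for clip in clips:
        noise_type = rng.choice(sorted(noise_bank))  # heterogeneous noise
        snr_db = rng.choice(snr_levels_db)           # dynamic severity
        noise = noise_bank[noise_type][:len(clip)]
        out.append((mix_at_snr(clip, noise, snr_db), noise_type, snr_db))
    return out
```

Seeding the random generator keeps the sequence of noise types and severities identical across runs, which is one simple way to keep comparisons between TTA methods reproducible.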

