Lightweight and Generalizable Acoustic Scene Representations via Contrastive Fine-Tuning and Distillation

2510.03728v1 cs.SD, cs.LG, eess.AS, eess.SP 2025-10-08

Authors:

Kuang Yuan, Yang Gao, Xilin Li, Xinhao Mei, Syavosh Zadissa, Tarun Pruthi, Saeed Bagheri Sereshki

Abstract

Acoustic scene classification (ASC) models on edge devices typically operate under fixed class assumptions, lacking the transferability needed for real-world applications that require adaptation to new or refined acoustic categories. We propose ContrastASC, which learns generalizable acoustic scene representations by structuring the embedding space to preserve semantic relationships between scenes, enabling adaptation to unseen categories without retraining. Our approach combines supervised contrastive fine-tuning of pre-trained models with contrastive representation distillation to transfer this structured knowledge to compact student models. Our evaluation shows that ContrastASC demonstrates improved few-shot adaptation to unseen categories while maintaining strong closed-set performance.
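The abstract does not give the loss used for the supervised contrastive fine-tuning step. As a rough illustration only, the sketch below implements the commonly used supervised contrastive (SupCon-style) objective in NumPy, where embeddings sharing a scene label are pulled together and all others are pushed apart. The function name, the temperature value, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss over one batch (illustrative, not the paper's code).

    Assumes the batch holds at least two samples and at least one pair
    of samples with the same label.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)

    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    n = len(y)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other samples in the batch that share the anchor's label.
    positives = (y[:, None] == y[None, :]) & not_self

    # Log-softmax over all other samples; the anchor itself is excluded
    # from the denominator. Subtracting the row max keeps exp() stable.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

    # Average log-probability of the positives, for anchors that have any.
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())


# Toy embeddings: two tight clusters in 2-D (hypothetical data).
toy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
aligned = supervised_contrastive_loss(toy, [0, 0, 1, 1])     # labels match clusters
misaligned = supervised_contrastive_loss(toy, [0, 1, 0, 1])  # labels cut across clusters
```

In this toy setup the loss is much smaller when the labels agree with the cluster structure than when they cut across it, which is the property that encourages a label-structured embedding space.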


