Bayesian Low-Rank Factorization for Robust Model Adaptation

2510.18723v1 cs.CL, cs.LG, cs.SD, eess.AS 2025-10-23
Authors:

Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel

Abstract

Large speech foundation models achieve strong performance across many domains, but they often require adaptation to handle local needs such as code-switching, where speakers mix languages within the same utterance. Direct fine-tuning of these models risks overfitting to the target domain and overwriting the broad capabilities of the base model. To address this challenge, we explore Bayesian factorized adapters for speech foundation models, which place priors near zero to achieve sparser adaptation matrices and thereby retain general performance while adapting to specific domains. We apply our approach to the Whisper model and evaluate on different multilingual code-switching scenarios. Our results show only minimal adaptation loss while significantly reducing catastrophic forgetting of the base model. Compared to LoRA, our method achieves a backward gain of 54% with only a 4% drop on the new domain. These findings highlight the effectiveness of Bayesian adaptation for fine-tuning speech foundation models without sacrificing generalization.
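The core idea in the abstract, a low-rank adapter whose factors are pulled toward zero by a prior, can be illustrated with a small sketch. The abstract does not specify the variational family, the prior width, or how the penalty is weighted, so everything below is an assumption made for illustration: Gaussian posteriors with a mean and standard deviation per factor entry, a zero-mean Gaussian prior, a LoRA-style update `W = W0 + B @ A`, and the hypothetical helper names `kl_to_zero_mean_prior`, `sample_factor`, and `adapted_weight`. It is not the authors' implementation.

```python
import math
import random


def kl_to_zero_mean_prior(mu, sigma, prior_sigma):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ) for a single scalar weight.

    Assumed form of the regularizer: it grows as the posterior mean moves
    away from zero, which is what discourages large adapter entries.
    """
    return (
        math.log(prior_sigma / sigma)
        + (sigma ** 2 + mu ** 2) / (2.0 * prior_sigma ** 2)
        - 0.5
    )


def total_kl(mu, sigma, prior_sigma):
    """Sum the per-entry KL terms over one factor matrix (list of rows)."""
    return sum(
        kl_to_zero_mean_prior(m, s, prior_sigma)
        for mu_row, sigma_row in zip(mu, sigma)
        for m, s in zip(mu_row, sigma_row)
    )


def sample_factor(mu, sigma, rng):
    """Draw one factor matrix via the reparameterization mu + sigma * eps."""
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu_row, sigma_row)]
        for mu_row, sigma_row in zip(mu, sigma)
    ]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def adapted_weight(w0, b, a):
    """LoRA-style adapted weight: frozen base W0 plus low-rank update B @ A."""
    delta = matmul(b, a)
    return [
        [w + d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w0, delta)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    w0 = [[1.0, 2.0], [3.0, 4.0]]            # frozen base weight (toy values)
    b_mu, b_sigma = [[0.0], [0.0]], [[0.01], [0.01]]      # rank-1 factor B
    a_mu, a_sigma = [[0.2, -0.1]], [[0.01, 0.01]]         # rank-1 factor A
    b = sample_factor(b_mu, b_sigma, rng)
    a = sample_factor(a_mu, a_sigma, rng)
    print(adapted_weight(w0, b, a))
    print(total_kl(a_mu, a_sigma, prior_sigma=0.1))
```

In a training loop, a penalty of this kind would be added to the task loss, so that factor entries stay near zero unless the adaptation data justifies moving them. How the paper balances the two terms is not stated in the abstract.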
