Adapting Language Balance in Code-Switching Speech
2510.18724v1
cs.CL, cs.LG, cs.SD, eess.AS
2025-10-23
Авторы:
Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel
Abstract
Despite achieving impressive results on standard benchmarks, large
foundational models still struggle against code-switching test cases. When data
scarcity cannot be used as the usual justification for poor performance, the
reason may lie in the infrequent occurrence of code-switched moments, where the
embedding of the second language appears subtly. Instead of expecting the
models to learn this infrequency on their own, it might be beneficial to
provide the training process with labels. Evaluating model performance on
code-switching data requires careful localization of code-switching points
where recognition errors are most consequential, so that the analysis
emphasizes mistakes occurring at those moments. Building on this observation,
we leverage the difference between the embedded and the main language to
highlight those code-switching points and thereby emphasize learning at those
locations. This simple yet effective differentiable surrogate mitigates context
bias during generation -- the central challenge in code-switching -- thereby
improving the model's robustness. Our experiments with Arabic and
Chinese-English showed that the models are able to predict the switching places
more correctly, reflected by the reduced substitution error.