A Neural Model for Contextual Biasing Score Learning and Filtering
2510.23849v1
eess.AS, cs.AI, cs.CL, cs.SD
2025-10-30
Authors:
Wanting Huang, Weiran Wang
Abstract
Contextual biasing improves automatic speech recognition (ASR) by integrating
external knowledge, such as user-specific phrases or entities, during decoding.
In this work, we use an attention-based biasing decoder to produce scores for
candidate phrases based on acoustic information extracted by an ASR encoder,
which can be used to filter out unlikely phrases and to compute a bonus for
shallow-fusion biasing. We introduce a per-token discriminative objective that
encourages higher scores for ground-truth phrases while suppressing
distractors. Experiments on the LibriSpeech biasing benchmark show that our
method effectively filters out the majority of the candidate phrases, and
significantly improves recognition accuracy under different biasing conditions
when the scores are used in shallow-fusion biasing. Our approach is modular and
can be used with any ASR system, and the filtering mechanism can potentially
boost the performance of other biasing methods.
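
The two uses of the phrase scores mentioned in the abstract (filtering candidates and computing a shallow-fusion bonus) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the threshold, the cap on kept phrases, the bonus scale, and the example scores are hypothetical, and the abstract does not specify how the paper actually thresholds scores or maps them to a bonus.

```python
from typing import Dict, List


def filter_phrases(scores: Dict[str, float], threshold: float, max_keep: int) -> List[str]:
    """Keep phrases scoring at or above the threshold, best first, at most max_keep.

    The threshold and cap are hypothetical knobs, not values from the paper.
    """
    kept = [phrase for phrase, score in scores.items() if score >= threshold]
    kept.sort(key=lambda phrase: scores[phrase], reverse=True)
    return kept[:max_keep]


def biased_score(
    asr_log_prob: float,
    matched_phrases: List[str],
    scores: Dict[str, float],
    bonus_scale: float,
) -> float:
    """Add a score-proportional bonus for each kept phrase found in a hypothesis.

    A linear bonus is an assumption; the paper may compute the bonus differently.
    """
    bonus = sum(bonus_scale * scores[phrase] for phrase in matched_phrases)
    return asr_log_prob + bonus


# Hypothetical scores, standing in for the output of a biasing decoder.
phrase_scores = {
    "contextual biasing": 0.92,
    "shallow fusion": 0.75,
    "new york": 0.08,
    "kubernetes": 0.03,
}

kept = filter_phrases(phrase_scores, threshold=0.5, max_keep=10)
rescored = biased_score(-4.0, ["shallow fusion"], phrase_scores, bonus_scale=2.0)
```

With these made-up numbers, only the two high-scoring phrases survive filtering, and a hypothesis containing "shallow fusion" has its log-probability raised from -4.0 to -2.5.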