Backdoor Attacks Against Speech Language Models
2510.01157v1
cs.CL, cs.CR, cs.SD
2025-10-04
Authors:
Alexandrine Fortier, Thomas Thebaud, Jesús Villalba, Najim Dehak, Patrick Cardinal
Abstract
Large Language Models (LLMs) and their multimodal extensions are becoming
increasingly popular. One common approach to enable multimodality is to cascade
domain-specific encoders with an LLM, making the resulting model inherit
vulnerabilities from all of its components. In this work, we present the first
systematic study of audio backdoor attacks against speech language models. We
demonstrate the attack's effectiveness across four speech encoders and three datasets,
covering four tasks: automatic speech recognition (ASR), speech emotion
recognition, and gender and age prediction. The attack consistently achieves
high success rates, ranging from 90.76% to 99.41%. To better understand how
backdoors propagate, we conduct a component-wise analysis to identify the most
vulnerable stages of the pipeline. Finally, we propose a fine-tuning-based
defense that mitigates the threat of poisoned pretrained encoders.