The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
2509.26207v1
cs.SD, cs.LG
2025-10-02
Authors:
Andrea Diecidue, Carlo Alberto Barbano, Piero Fraternali, Mathieu Fontaine, Enzo Tartaglione
Abstract
Transformer-based models have become the state of the art across multiple
domains, from natural language processing to machine listening, thanks to
attention mechanisms. However, the attention layers require a large number of
parameters and high-end hardware for both training and inference. We propose a
novel pruning technique targeted explicitly at the attention mechanism, where
we decouple the pruning of the four layers in the attention block, namely the
query, key, value, and output projection matrices. We also investigate
pruning strategies to prune along the head and channel dimensions, and compare
the performance of the Audio Spectrogram Transformer (AST) model under
different pruning scenarios. Our results show that even by pruning 50% of the
attention parameters we incur a performance degradation of less than 1%.
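
To make the head and channel dimensions concrete, the following is a minimal sketch of decoupled, mask-based structured pruning, not the authors' implementation. The abstract does not state the importance criterion, so the L2-norm (magnitude) score used here is an assumption. So are the function names, the choice to zero weights rather than physically remove them, and the layout in which every matrix has its head channels along the columns (the output projection would be passed transposed).

```python
import math


def column_norms(weight):
    """L2 norm of each column of a matrix stored as a list of rows."""
    n_cols = len(weight[0])
    return [math.sqrt(sum(row[j] ** 2 for row in weight)) for j in range(n_cols)]


def channel_mask(weight, num_heads, ratio):
    """Within each head, mask the `ratio` fraction of channels with the smallest norm."""
    norms = column_norms(weight)
    head_dim = len(norms) // num_heads
    n_prune = int(head_dim * ratio)
    mask = [1] * len(norms)
    for h in range(num_heads):
        cols = range(h * head_dim, (h + 1) * head_dim)
        for j in sorted(cols, key=lambda c: norms[c])[:n_prune]:
            mask[j] = 0
    return mask


def head_mask(weight, num_heads, ratio):
    """Mask whole heads: the `ratio` fraction with the smallest squared norm."""
    norms = column_norms(weight)
    head_dim = len(norms) // num_heads
    scores = [
        sum(n ** 2 for n in norms[h * head_dim:(h + 1) * head_dim])
        for h in range(num_heads)
    ]
    n_prune = int(num_heads * ratio)
    pruned = set(sorted(range(num_heads), key=lambda h: scores[h])[:n_prune])
    return [0 if j // head_dim in pruned else 1 for j in range(len(norms))]


def apply_mask(weight, mask):
    """Zero the masked columns; the matrix shape is left unchanged."""
    return [[w * m for w, m in zip(row, mask)] for row in weight]


def prune_attention(weights, ratios, mask_fn, num_heads):
    """Decoupled pruning: each projection gets its own mask and its own ratio.

    `weights` and `ratios` are dicts keyed by hypothetical names such as
    "q", "k", "v", "o"; the masks are computed independently per matrix.
    """
    return {
        name: apply_mask(w, mask_fn(w, num_heads, ratios[name]))
        for name, w in weights.items()
    }
```

For example, `prune_attention(weights, {"q": 0.5, "k": 0.5, "v": 0.0, "o": 0.0}, channel_mask, num_heads)` would mask half of the query and key channels in every head while leaving the value and output projections intact. Because the sketch only zeroes weights, the matrices keep compatible shapes even when the masks differ from one projection to another.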