The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures

2509.26207v1 cs.SD, cs.LG 2025-10-02

Authors:

Andrea Diecidue, Carlo Alberto Barbano, Piero Fraternali, Mathieu Fontaine, Enzo Tartaglione

Abstract

Transformer-based models have become the state of the art across multiple domains, from natural language processing to machine listening, thanks to attention mechanisms. However, the attention layers require a large number of parameters and high-end hardware for both training and inference. We propose a novel pruning technique targeted explicitly at the attention mechanism, in which we decouple the pruning of the four layers in the attention block, namely the query, key, value, and output projection matrices. We also investigate strategies for pruning along the head and channel dimensions, and compare the performance of the Audio Spectrogram Transformer (AST) model under different pruning scenarios. Our results show that even when pruning 50% of the attention parameters, we incur a performance degradation of less than 1%.
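To make the idea of decoupled structured pruning concrete, here is a minimal NumPy sketch. It is not the authors' method: the abstract does not state the pruning criterion, so the sketch assumes an L2-norm magnitude score as a stand-in, and it masks (zeroes) groups rather than physically removing them. The function name `structured_mask`, the toy dimensions, and the per-projection sparsity levels are all hypothetical, chosen only for illustration.

```python
import numpy as np


def structured_mask(weight, num_heads, sparsity, granularity="head", axis=0):
    """Build a 0/1 mask that zeroes the lowest-L2-norm groups of one projection.

    ASSUMPTION: magnitude (L2 norm) is used as the importance score; the
    abstract does not say which criterion the paper uses.
    granularity="head" groups consecutive channels per head,
    granularity="channel" scores every channel on its own.
    axis=0 groups rows (output channels), axis=1 groups columns (input channels).
    """
    w = weight if axis == 0 else weight.T
    n_channels = w.shape[0]
    group_size = n_channels // num_heads if granularity == "head" else 1
    # One row per group: consecutive channels are flattened together.
    groups = w.reshape(n_channels // group_size, -1)
    norms = np.linalg.norm(groups, axis=1)
    n_prune = int(round(sparsity * len(norms)))
    keep = np.ones(len(norms))
    keep[np.argsort(norms)[:n_prune]] = 0.0
    mask = np.repeat(keep, group_size)[:, None] * np.ones_like(w)
    return mask if axis == 0 else mask.T


rng = np.random.default_rng(0)
d_model, num_heads = 8, 2  # toy sizes, not AST's real dimensions
names = ("query", "key", "value", "output")
weights = {n: rng.standard_normal((d_model, d_model)) for n in names}

# Decoupled pruning: each projection gets its own sparsity level and mask.
sparsity = {"query": 0.5, "key": 0.5, "value": 0.25, "output": 0.25}
masks = {
    n: structured_mask(
        weights[n], num_heads, sparsity[n], "channel",
        # Heads index the *input* side of the output projection.
        axis=1 if n == "output" else 0,
    )
    for n in names
}
pruned = {n: weights[n] * masks[n] for n in names}

# Head-wise variant: a whole head of the query projection is zeroed at once.
head_mask = structured_mask(weights["query"], num_heads, 0.5, "head")
```

Because the groups are masked instead of removed, the query and key projections can be pruned independently without breaking the shape agreement needed for their dot product; physically shrinking the matrices would require extra bookkeeping that this sketch leaves out.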

Related articles

Contract-Driven QoE Auditing for Speech and Singing Services: From MOS Regressio...

Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation

Differentiable Attenuation Filters for Feedback Delay Networks

DHAuDS: A Dynamic and Heterogeneous Audio Benchmark for Test-Time Adaptation

Count The Notes: Histogram-Based Supervision for Automatic Music Transcription
