C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression
2510.18636v1
cs.CV, cs.AI, cs.LG, cs.RO
2025-10-23
Authors:
Baptiste Bauvin, Loïc Baret, Ola Ahmad
Abstract
Neural network compression has gained increasing attention in recent years,
particularly in computer vision applications, where the need for model
reduction is crucial for overcoming deployment constraints. Pruning is a widely
used technique that promotes sparsity in model structures (e.g., weights,
neurons, and layers), reducing model size and inference cost. Structured pruning
is especially important because it removes entire structures, which
further reduces inference time and memory overhead. However, it can
be computationally expensive, requiring iterative retraining and optimization.
To overcome this problem, recent methods consider a one-shot setting, which
applies pruning directly after training. Unfortunately, they often lead to a
considerable drop in performance. In this paper, we address this issue by
proposing a novel one-shot pruning framework that relies on explainable deep
learning. First, we introduce a causal-aware pruning approach that leverages
cause-effect relations between model predictions and structures in a
progressive pruning process. It allows us to efficiently reduce the size of the
network while ensuring that the removed structures do not degrade the
performance of the model. Then, through experiments conducted on convolutional
neural network and vision transformer baselines, pre-trained on classification
tasks, we
demonstrate that our method consistently achieves substantial reductions in
model size, with minimal impact on performance, and without the need for
fine-tuning. Overall, our approach outperforms its counterparts, offering the
best trade-off between model size and performance. Our code is available on
GitHub.
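
To make the idea of intervention-driven, progressive structured pruning concrete, the following is a minimal sketch and not the C-SWAP algorithm itself. It assumes a toy two-layer ReLU network in pure Python, treats each hidden neuron as the prunable structure, and uses agreement with the unpruned model's predictions as a stand-in for the cause-effect score. All names (`progressive_prune`, `min_agreement`, `remove_pruned`) and the stopping rule are hypothetical choices made for illustration.

```python
import random

def forward(x, w1, w2, mask):
    """Two-layer ReLU network; mask[j] == 0 switches hidden neuron j off."""
    hidden = [m * max(0.0, sum(xi * w for xi, w in zip(x, neuron)))
              for neuron, m in zip(w1, mask)]
    return [sum(h * w for h, w in zip(hidden, row)) for row in w2]

def predict(x, w1, w2, mask):
    logits = forward(x, w1, w2, mask)
    return logits.index(max(logits))

def agreement(data, w1, w2, mask, reference):
    """Fraction of samples whose prediction matches the unpruned model."""
    hits = sum(predict(x, w1, w2, mask) == r for x, r in zip(data, reference))
    return hits / len(data)

def progressive_prune(data, w1, w2, min_agreement=0.95):
    """Greedily remove the neuron whose removal changes predictions least.

    Assumption: stop as soon as the best candidate would push agreement
    with the original predictions below `min_agreement`. No retraining.
    """
    mask = [1] * len(w1)
    reference = [predict(x, w1, w2, mask) for x in data]
    while sum(mask) > 1:
        best_j, best_score = None, -1.0
        for j in range(len(mask)):
            if mask[j] == 0:
                continue
            # Intervention: switch neuron j off and observe the effect.
            trial = mask[:j] + [0] + mask[j + 1:]
            score = agreement(data, w1, w2, trial, reference)
            if score > best_score:
                best_j, best_score = j, score
        if best_score < min_agreement:
            break
        mask[best_j] = 0
    return mask

def remove_pruned(w1, w2, mask):
    """Physically drop masked neurons (rows of w1, columns of w2)."""
    kept_w1 = [neuron for neuron, m in zip(w1, mask) if m]
    kept_w2 = [[w for w, m in zip(row, mask) if m] for row in w2]
    return kept_w1, kept_w2

# Toy setup: random weights and random calibration inputs (illustrative only).
rng = random.Random(0)
n_in, n_hidden, n_out = 4, 8, 3
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
for row in w2:
    row[5:] = [0.0, 0.0, 0.0]  # neurons 5-7 have no effect on the output
data = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(200)]

mask = progressive_prune(data, w1, w2)
small_w1, small_w2 = remove_pruned(w1, w2, mask)
print(f"kept {sum(mask)} of {n_hidden} hidden neurons")
```

Because the pruning is structured, `remove_pruned` yields genuinely smaller weight matrices rather than sparse ones, which is where the inference-time and memory savings come from. The greedy search re-evaluates every remaining neuron at each step, so its cost grows quadratically with the number of structures; a practical method would need a cheaper scoring scheme.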