Distribution-Aware Tensor Decomposition for Compression of Convolutional Neural Networks
2511.04494v1
cs.LG, cs.CV
2025-11-08
Авторы:
Alper Kalle, Theo Rudkiewicz, Mohamed-Oumar Ouerfelli, Mohamed Tamaazousti
Abstract
Neural networks are widely used for image-related tasks but typically demand
considerable computing power. Once a network has been trained, however, its
memory- and compute-footprint can be reduced by compression. In this work, we
focus on compression through tensorization and low-rank representations.
Whereas classical approaches search for a low-rank approximation by minimizing
an isotropic norm such as the Frobenius norm in weight-space, we use
data-informed norms that measure the error in function space. Concretely, we
minimize the change in the layer's output distribution, which can be expressed
as $\lVert (W - \widetilde{W}) \Sigma^{1/2}\rVert_F$ where $\Sigma^{1/2}$ is
the square root of the covariance matrix of the layer's input and $W$,
$\widetilde{W}$ are the original and compressed weights. We propose new
alternating least square algorithms for the two most common tensor
decompositions (Tucker-2 and CPD) that directly optimize the new norm. Unlike
conventional compression pipelines, which almost always require
post-compression fine-tuning, our data-informed approach often achieves
competitive accuracy without any fine-tuning. We further show that the same
covariance-based norm can be transferred from one dataset to another with only
a minor accuracy drop, enabling compression even when the original training
dataset is unavailable. Experiments on several CNN architectures (ResNet-18/50,
and GoogLeNet) and datasets (ImageNet, FGVC-Aircraft, Cifar10, and Cifar100)
confirm the advantages of the proposed method.
Ссылки и действия
Дополнительные ресурсы: