Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling
2510.20673v1
cs.CV, cs.LG
2025-10-25
Авторы:
Jinhee Kim, Jae Jun An, Kang Eun Jeon, Jong Hwan Ko
Abstract
Multi-bit quantization networks enable flexible deployment of deep neural
networks by supporting multiple precision levels within a single model.
However, existing approaches suffer from significant training overhead as
full-dataset updates are repeated for each supported bit-width, resulting in a
cost that scales linearly with the number of precisions. Additionally, extra
fine-tuning stages are often required to support additional or intermediate
precision options, further compounding the overall training burden. To address
this issue, we propose two techniques that greatly reduce the training overhead
without compromising model utility: (i) Weight bias correction enables shared
batch normalization and eliminates the need for fine-tuning by neutralizing
quantization-induced bias across bit-widths and aligning activation
distributions; and (ii) Bit-wise coreset sampling strategy allows each child
model to train on a compact, informative subset selected via gradient-based
importance scores by exploiting the implicit knowledge transfer phenomenon.
Experiments on CIFAR-10/100, TinyImageNet, and ImageNet-1K with both ResNet and
ViT architectures demonstrate that our method achieves competitive or superior
accuracy while reducing training time up to 7.88x. Our code is released at
https://github.com/a2jinhee/EMQNet_jk.
Ссылки и действия
Дополнительные ресурсы: