Binary Quadratic Quantization: Beyond First-Order Quantization for Real-Valued Matrix Compression
2510.18650v1
cs.CV, cs.AI, cs.LG, cs.NE
2025-10-23
Authors:
Kyo Kuroki, Yasuyuki Okoshi, Thiem Van Chu, Kazushi Kawamura, Masato Motomura
Abstract
This paper proposes a novel matrix quantization method, Binary Quadratic
Quantization (BQQ). In contrast to conventional first-order quantization
approaches, such as uniform quantization and binary coding quantization, that
approximate real-valued matrices via linear combinations of binary bases, BQQ
leverages the expressive power of binary quadratic expressions while
maintaining an extremely compact data format. We validate our approach with two
experiments: a matrix compression benchmark and post-training quantization
(PTQ) on pretrained Vision Transformer-based models. Experimental results
demonstrate that BQQ consistently achieves a better trade-off between memory
efficiency and reconstruction error than conventional methods for compressing
diverse matrix data. It also delivers strong PTQ performance, even though we
neither target state-of-the-art PTQ accuracy under tight memory constraints nor
rely on PTQ-specific binary matrix optimization. For example, our proposed
method outperforms the state-of-the-art PTQ method by up to 2.2% and 59.1% on
the ImageNet dataset under the calibration-based and data-free scenarios,
respectively, with quantization equivalent to 2 bits. These findings highlight
the surprising effectiveness of binary quadratic expressions for efficient
matrix approximation and neural network compression.
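
To make the contrast in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The first part shows a standard greedy form of binary coding quantization, where a matrix is approximated by a linear combination of {-1, +1} bases. The second part illustrates one possible binary quadratic expression, a scaled product of two {0, 1} matrices. The abstract does not state the exact BQQ formulation, so that form, along with the names `greedy_bcq`, `reconstruct_bcq`, `binary_product`, and `binary_product_bits`, is an assumption made purely for illustration.

```python
import numpy as np


def greedy_bcq(W, num_bases):
    """First-order baseline: W ~ sum_i alpha_i * B_i with B_i in {-1, +1}.

    Greedy residual fitting: each step takes the sign of the current
    residual as the binary basis and the mean absolute residual as its
    scale, which is the least-squares optimal scale for that basis.
    """
    residual = np.asarray(W, dtype=np.float64).copy()
    alphas, bases = [], []
    for _ in range(num_bases):
        B = np.where(residual >= 0, 1.0, -1.0)
        alpha = np.abs(residual).mean()
        alphas.append(alpha)
        bases.append(B)
        residual -= alpha * B
    return np.array(alphas), np.stack(bases)


def reconstruct_bcq(alphas, bases):
    """Rebuild the approximation sum_i alpha_i * B_i."""
    return np.tensordot(alphas, bases, axes=1)


def binary_product(Y, Z, scale=1.0, offset=0.0):
    """Hypothetical quadratic form: scale * (Y @ Z) + offset.

    Y (m x k) and Z (k x n) hold only 0/1 entries, so each entry of
    Y @ Z is an integer between 0 and k. This is an assumed illustration
    of a "binary quadratic expression", not the paper's exact definition.
    """
    return scale * (np.asarray(Y) @ np.asarray(Z)) + offset


def binary_product_bits(m, n, k):
    """Bits needed to store the two binary factors Y (m x k) and Z (k x n)."""
    return k * (m + n)
```

Under these assumptions, a 256 x 256 matrix stored as one binary plane costs 65,536 bits, and `binary_product_bits(256, 256, 128)` gives the same count, yet each entry of the product can take any integer value from 0 to 128 rather than one of two values. This is one way to read the abstract's claim about expressive power at a compact storage cost.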