Norm-Q: Effective Compression Method for Hidden Markov Models in Neuro-Symbolic Applications
2509.25439v1
cs.LG, cs.NE
2025-10-03
Authors:
Hanyuan Gao, Xiaoxuan Yang
Abstract
Hidden Markov models (HMMs) are commonly used in generation tasks and have
demonstrated strong capabilities in neuro-symbolic applications owing to the Markov
property. These applications leverage the strengths of neural networks and
symbolic reasoning to create robust and interpretable AI systems. However, they
may inherit and amplify the shortcomings of both approaches. Both components
require dense computation and data transfer, and their communication further
hinders performance. This paper proposes Norm-Q, a normalized linear
quantization approach for compressing probabilistic symbolic models, such as
HMMs. We reduce the bit width of the data with minimal impact on model
quality, thereby alleviating memory and bandwidth stress and enabling deployment on potential
custom hardware. Our method introduces a normalized quantization-aware
expectation maximization process for probabilistic model training. The
experimental results show that Norm-Q achieves a higher compression rate with
reasonable score loss compared to traditional quantization methods. In the case
of the constrained generation task of large language models, we successfully
quantize an HMM with 4096 hidden states to 8 bits without loss and to as few as 3
bits with acceptable loss. Notably, the Norm-Q method can achieve a compression
rate of 99% for the weights of the HMM. The code is open source at
https://github.com/superstarghy/Norm-Q.
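
The idea of normalized linear quantization named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It assumes that each row of an HMM probability matrix is quantized uniformly to a fixed number of bits and then renormalized so that it remains a valid distribution. The function name `norm_quantize_rows`, the per-row normalization, and the handling of all-zero rows are assumptions made for illustration. The quantization-aware expectation maximization step is not reproduced here.

```python
import numpy as np


def norm_quantize_rows(P, bits):
    """Hypothetical sketch: uniform quantization of a row-stochastic matrix
    followed by row renormalization (not the paper's exact procedure)."""
    levels = (1 << bits) - 1  # largest integer code, e.g. 7 for 3 bits
    # Linear quantization: map probabilities in [0, 1] to integer codes.
    codes = np.rint(P * levels).astype(np.int64)
    # Assumed guard: if every entry of a row rounded to zero, keep the
    # row's largest entry so the row can still be normalized.
    rows = np.flatnonzero(codes.sum(axis=1) == 0)
    codes[rows, P[rows].argmax(axis=1)] = 1
    # Dequantize with normalization so each row sums to 1 again.
    Q = codes / codes.sum(axis=1, keepdims=True)
    return codes, Q


# Example: a random 8-state transition matrix with 16 outgoing entries per row.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(16), size=8)
codes, Q = norm_quantize_rows(P, bits=3)
print(codes.max(), Q.sum(axis=1))
```

In this sketch only the low-bit integer `codes` would need to be stored, and the normalized probabilities `Q` are recovered when the model is loaded. Whether Norm-Q stores its weights this way is an assumption, not something the abstract states.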