Towards Sampling Data Structures for Tensor Products in Turnstile Streams
2510.03678v1
cs.LG, stat.ML
2025-10-08
Authors:
Zhao Song, Shenghao Xie, Samson Zhou
Abstract
This paper addresses the computational challenges of large-scale
attention-based models in artificial intelligence by applying importance
sampling methods in the streaming setting. Inspired by the classical definition
of the $\ell_2$ sampler and the recent progress of the attention scheme in
Large Language Models (LLMs), we propose the definition of the attention
sampler. Our approach significantly reduces the computational burden of
traditional attention mechanisms. We analyze the effectiveness of the attention
sampler from a theoretical perspective, including space and update time.
Additionally, our framework exhibits scalability and broad applicability across
various model architectures and domains.
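For readers unfamiliar with the classical $\ell_2$ sampler mentioned in the abstract: given a vector $x$ defined by a stream of turnstile updates (signed increments to individual coordinates), its target is to return index $i$ with probability $x_i^2 / \|x\|_2^2$, which streaming algorithms achieve only approximately and in small space. The sketch below illustrates that target distribution only. It stores the whole vector, so it is not a small-space streaming algorithm and it is not the attention sampler proposed in the paper; the class name `ExactL2Sampler` and its methods are hypothetical names chosen for this illustration.

```python
import random
from collections import defaultdict


class ExactL2Sampler:
    """Illustrative only: keeps the full vector x in memory.

    A real streaming l2 sampler would use sublinear space and return
    an index whose distribution only approximates x_i^2 / ||x||_2^2.
    """

    def __init__(self, seed=None):
        self.x = defaultdict(float)      # sparse representation of x
        self.rng = random.Random(seed)   # private RNG for reproducibility

    def update(self, i, delta):
        # Turnstile update: x_i <- x_i + delta, where delta may be negative.
        self.x[i] += delta

    def probabilities(self):
        # Exact target distribution: p_i = x_i^2 / ||x||_2^2.
        total = sum(v * v for v in self.x.values())
        if total == 0:
            return {}
        return {i: v * v / total for i, v in self.x.items() if v != 0}

    def sample(self):
        # Draw one index from the exact distribution, or None if x = 0.
        probs = self.probabilities()
        if not probs:
            return None
        indices = list(probs)
        weights = [probs[i] for i in indices]
        return self.rng.choices(indices, weights=weights, k=1)[0]


# Usage: coordinate 2 is inserted and then deleted, so it cannot be sampled.
sampler = ExactL2Sampler(seed=0)
sampler.update(0, 3.0)
sampler.update(1, 4.0)
sampler.update(2, 5.0)
sampler.update(2, -5.0)
print(sampler.probabilities())
print(sampler.sample())
```

The deletion of coordinate 2 is what distinguishes the turnstile model from insertion-only streams: the sampler must reflect the net value of each coordinate, not the number of times it was touched.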