NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models
2510.13068v1
cs.LG, cs.AI, cs.HC
2025-10-17
Авторы:
Konstantinos Barmpas, Na Lee, Alexandros Koliousis, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou
Abstract
Electroencephalography (EEG) captures neural activity across multiple
temporal and spectral scales, yielding signals that are rich but complex for
representation learning. Recently, EEG foundation models trained to predict
masked signal-tokens have shown promise for learning generalizable
representations. However, their performance is hindered by their signal
tokenization modules. Existing neural tokenizers fail to preserve
high-frequency dynamics, limiting their ability to reconstruct EEG signals with
high fidelity. We introduce NeuroRVQ, a scalable Large Brainwave Model (LBM)
centered on a codebook-based tokenizer. Our tokenizer integrates: (i)
multi-scale feature extraction modules that capture the full frequency neural
spectrum; (ii) hierarchical residual vector quantization (RVQ) codebooks for
high-resolution encoding; and, (iii) an EEG signal phase- and amplitude-aware
loss function for efficient training. This design enables efficient EEG
compression while supporting accurate reconstruction across all frequency
bands, leading to robust generative masked modeling. Our empirical results
demonstrate that NeuroRVQ achieves lower reconstruction error and outperforms
existing LBMs on a variety of downstream tasks. More broadly, NeuroRVQ
tokenizer establishes a strong prior for codebook-based general-purpose
brainwave models, enabling advances in neural decoding, generative modeling and
multimodal biosignal integration.
Ссылки и действия
Дополнительные ресурсы: