Towards Interpretable Visual Decoding with Attention to Brain Representations
2509.23566v1
cs.CV, I.2.0; I.4.9
2025-10-02
Authors:
Pinyuan Feng, Hossein Adeli, Wenxuan Guo, Fan Cheng, Ethan Hwang, Nikolaus Kriegeskorte
Abstract
Recent work has demonstrated that complex visual stimuli can be decoded from
human brain activity using deep generative models, helping brain science
researchers interpret how the brain represents real-world scenes. However, most
current approaches map brain signals into intermediate image or text feature
spaces before guiding the generative process, which masks how different brain
areas contribute to the final reconstruction.
In this work, we propose NeuroAdapter, a visual decoding framework that
directly conditions a latent diffusion model on brain representations,
bypassing the need for intermediate feature spaces. Our method demonstrates
competitive visual reconstruction quality on public fMRI datasets compared to
prior work, while providing greater transparency into how brain signals shape
the generation process. To this end, we contribute an Image-Brain
BI-directional interpretability framework (IBBI) which investigates
cross-attention mechanisms across diffusion denoising steps to reveal how
different cortical areas influence the unfolding generative trajectory. Our
results highlight the potential of end-to-end brain-to-image decoding and
establish a path toward interpreting diffusion models through the lens of
visual neuroscience.
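
The cross-attention readout described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cross_attention`, the token counts, the dimensions, and the random projection matrices are all hypothetical, chosen only to show how image latent tokens (queries) can attend to brain-derived tokens (keys and values), and how the resulting attention weights can be averaged to estimate each brain token's influence at a single denoising step.

```python
import numpy as np

def cross_attention(latent_tokens, brain_tokens, w_q, w_k, w_v):
    """Image latent tokens (queries) attend to brain tokens (keys/values)."""
    q = latent_tokens @ w_q                          # (n_latent, d)
    k = brain_tokens @ w_k                           # (n_brain, d)
    v = brain_tokens @ w_v                           # (n_brain, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])          # scaled dot-product scores
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over brain tokens
    return weights @ v, weights

# Hypothetical sizes: 16 latent patches, 4 brain tokens (e.g. one per cortical area).
rng = np.random.default_rng(0)
n_latent, n_brain, d_latent, d_brain, d = 16, 4, 8, 6, 8
latent = rng.normal(size=(n_latent, d_latent))
brain = rng.normal(size=(n_brain, d_brain))
w_q = rng.normal(size=(d_latent, d))
w_k = rng.normal(size=(d_brain, d))
w_v = rng.normal(size=(d_brain, d))

out, attn = cross_attention(latent, brain, w_q, w_k, w_v)

# Average attention each brain token receives across all latent patches.
region_influence = attn.mean(axis=0)
```

Each row of `attn` shows which brain tokens a given image patch draws on, and each column shows which patches a given brain token drives. Repeating this readout at every denoising step would give a per-step trajectory of influence, which is the kind of quantity the abstract describes examining.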