📊 Статистика дайджестов
Всего дайджестов: 34022 Добавлено сегодня: 0
Последнее обновление: сегодня
Авторы:
Daniel Harari, Michael Sidorov, Liel David, Chen Shterental, Abrham Kahsay Gebreselasie, Muhammad Haris Khan
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Large multi-modal models (LMMs) show increasing performance in realistic visual tasks for images and, more recently, for videos. For example, given a video sequence, such models are able to describe in detail objects, the surroundings and dynamic actions. In this study, we explored the extent to which these models ground their semantic understanding in the actual visual input. Specifically, given sequences of hands interacting with objects, we asked models when and where the interaction begins o...
Авторы:
Roman Beliy, Amit Zalcher, Jonathan Kogman, Navve Wasserman, Michal Irani
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Reconstructing images seen by people from their fMRI brain recordings
provides a non-invasive window into the human brain. Despite recent progress
enabled by diffusion models, current methods often lack faithfulness to the
actual seen images. We present "Brain-IT", a brain-inspired approach that
addresses this challenge through a Brain Interaction Transformer (BIT),
allowing effective interactions between clusters of functionally-similar
brain-voxels. These functional-clusters are shared by all ...
Авторы:
Connor Lane, Daniel Z. Kaplan, Tanishq Mathew Abraham, Paul S. Scotti
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
A key question for adapting modern deep learning architectures to functional
MRI (fMRI) is how to represent the data for model input. To bridge the modality
gap between fMRI and natural images, we transform the 4D volumetric fMRI data
into videos of 2D fMRI activity flat maps. We train Vision Transformers on 2.3K
hours of fMRI flat map videos from the Human Connectome Project using the
spatiotemporal masked autoencoder (MAE) framework. We observe that masked fMRI
modeling performance improves wi...