📊 Digest statistics
Total digests: 35039
Added today: 432
Last updated: today
Authors:
Chuang Chen, Wenyi Ge
Russian summary not found
Available fields: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Environmental perception systems play a critical role in high-precision
mapping and autonomous navigation, with LiDAR serving as a core sensor that
provides accurate 3D point cloud data. How to efficiently process unstructured
point clouds while extracting structured semantic information remains a
significant challenge, and in recent years, numerous pseudo-image-based
representation methods have emerged to achieve a balance between efficiency and
performance. However, they often overlook the str...
📄 MCE: Towards a General Framework for Handling Missing Modalities under Imbalanced Missing Rates
2025-10-15
Authors:
Binyu Zhao, Wei Zhang, Zhaonian Zou
Russian summary not found
Annotation:
Multi-modal learning has made significant advances across diverse pattern
recognition applications. However, handling missing modalities, especially
under imbalanced missing rates, remains a major challenge. This imbalance
triggers a vicious cycle: modalities with higher missing rates receive fewer
updates, leading to inconsistent learning progress and representational
degradation that further diminishes their contribution. Existing methods
typically focus on global dataset-level balancing, ofte...
Authors:
Farouq Benchallal, Adel Hafiane, Nicolas Ragot, Raphael Canals
Russian summary not found
Annotation:
Weed species classification represents an important step for the development
of automated targeting systems that allow the adoption of precision agriculture
practices and reduce the costs and yield losses caused by weeds. The
identification of weeds is a challenging problem due to their shared
similarities with crop plants and the variability among weed types, along
with the variations caused by changes in field conditions. Moreover, to
fully benefit f...
Authors:
Zhaolin Hu, Kun Li, Hehe Fan, Yi Yang
Russian summary not found
Annotation:
Linear attention mechanisms have emerged as efficient alternatives to full
self-attention in Graph Transformers, offering linear time complexity. However,
existing linear attention models often suffer from a significant drop in
expressiveness due to low-rank projection structures and overly uniform
attention distributions. We theoretically prove that these properties reduce
the class separability of node representations, limiting the model's
classification ability. To address this, we propose a ...
Authors:
Yuan Xu, Zimu Zhang, Xiaoxuan Ma, Wentao Zhu, Yu Qiao, Yizhou Wang
Russian summary not found
Annotation:
Virtual and augmented reality systems increasingly demand intelligent
adaptation to user behaviors for enhanced interaction experiences. Achieving
this requires accurately understanding human intentions and predicting future
situated behaviors - such as gaze direction and object interactions - which is
vital for creating responsive VR/AR environments and applications like
personalized assistants. However, accurate behavioral prediction demands
modeling the underlying cognitive processes that dri...
Authors:
Maral Doctorarastoo, Katherine A. Flanigan, Mario Bergés, Christopher McComb
Russian summary not found
Annotation:
The capacity to predict human spatial preferences within built environments
is instrumental for developing Cyber-Physical-Social Infrastructure Systems
(CPSIS). A significant challenge in this domain is the generalizability of
preference models, particularly their efficacy in predicting preferences within
environmental configurations not encountered during training. While deep
learning models have shown promise in learning complex spatial and contextual
dependencies, it remains unclear which neu...
📄 Chart-RVR: Reinforcement Learning with Verifiable Rewards for Explainable Chart Reasoning
2025-10-15
Authors:
Sanchit Sinha, Oana Frunza, Kashif Rasul, Yuriy Nevmyvaka, Aidong Zhang
Russian summary not found
Annotation:
The capabilities of Large Vision-Language Models (LVLMs) have reached
state-of-the-art on many visual reasoning tasks, including chart reasoning, yet
they still falter on out-of-distribution (OOD) data, and degrade further when
asked to produce their chain-of-thought (CoT) rationales, limiting
explainability. We present Chart-RVR, a general framework that fine-tunes LVLMs
to be more robust and explainable for chart reasoning by coupling Group
Relative Policy Optimization (GRPO) with automaticall...
Authors:
Lin Zhu, Yifeng Yang, Xinbing Wang, Qinying Gu, Nanyang Ye
Russian summary not found
Annotation:
Recent approaches for vision-language models (VLMs) have shown remarkable
success in achieving fast downstream adaptation. When applied to real-world
downstream tasks, VLMs inevitably encounter both the in-distribution (ID) data
and out-of-distribution (OOD) data. The OOD datasets often include both
covariate shifts (e.g., known classes with changes in image styles) and
semantic shifts (e.g., test-time unseen classes). This highlights the
importance of improving VLMs' generalization ability to c...
Authors:
Hongyu Zhu, Lin Chen, Mounim A. El-Yacoubi, Mingsheng Shang
Russian summary not found
Annotation:
Multimodal Sentiment Analysis (MSA) aims to identify and interpret human
emotions by integrating information from heterogeneous data sources such as
text, video, and audio. While deep learning models have advanced in network
architecture design, they remain heavily limited by scarce multimodal annotated
data. Although Mixup-based augmentation improves generalization in unimodal
tasks, its direct application to MSA introduces critical challenges: random
mixing often amplifies label ambiguity and ...
Authors:
Boyang Zheng, Nanye Ma, Shengbang Tong, Saining Xie
Russian summary not found
Annotation:
Latent generative modeling, where a pretrained autoencoder maps pixels into a
latent space for the diffusion process, has become the standard strategy for
Diffusion Transformers (DiT); however, the autoencoder component has barely
evolved. Most DiTs continue to rely on the original VAE encoder, which
introduces several limitations: outdated backbones that compromise
architectural simplicity, low-dimensional latent spaces that restrict
information capacity, and weak representations that result fr...
Showing 371-380 of 863 entries