📊 Статистика дайджестов
Всего дайджестов: 35039 Добавлено сегодня: 432
Последнее обновление: сегодня
Авторы:
Aditi Bhalla, Christian Hellert, Enkelejda Kasneci
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Driver distraction remains a leading cause of road traffic accidents, contributing to thousands of fatalities annually across the globe. While deep learning-based driver activity recognition methods have shown promise in detecting such distractions, their effectiveness in real-world deployments is hindered by two critical challenges: variations in camera viewpoints (cross-view) and domain shifts such as change in sensor modality or environment. Existing methods typically address either cross-vie...
Авторы:
Ren Zhang, Huilai Li, Chao qi, Guoliang Xu, Tianyu Zhou, Wei wei, Jianqin Yin
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Micro expression recognition (MER) is crucial for inferring genuine emotion. Applying a multimodal large language model (MLLM) to this task enables spatio-temporal analysis of facial motion and provides interpretable descriptions. However, there are still two core challenges: (1) The entanglement of static appearance and dynamic motion cues prevents the model from focusing on subtle motion; (2) Textual labels in existing MER datasets do not fully correspond to underlying facial muscle movements,...
Авторы:
Christofer Meinecke, Estelle Guéville, David Joseph Wrisley
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
We aim to theorize the medieval manuscript page and its contents more holistically, using state-of-the-art techniques to segment and describe the entire manuscript folio, for the purpose of creating richer training data for computer vision techniques, namely instance segmentation, and multimodal models for medieval-specific visual content.
Авторы:
Siddharth Betala, Kushan Raj, Vipul Betala, Rohan Saswade
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
In this paper, we describe our system under the team name BLEU Monday for the English-to-Indic Multimodal Translation Task at WAT 2025. We participate in the text-only translation tasks for English-Hindi, English-Bengali, English-Malayalam, and English-Odia language pairs. We present a two-stage approach that addresses quality issues in the training data through automated error detection and correction, followed by parameter-efficient model fine-tuning.
Our methodology introduces a vision-augm...
📄 History-Aware Reasoning for GUI Agents
2025-11-15Авторы:
Ziwei Wang, Leyang Yang, Xiaoxuan Tang, Sheng Zhou, Dajun Chen, Wei Jiang, Yong Li
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Advances in Multimodal Large Language Models have significantly enhanced Graphical User Interface (GUI) automation. Equipping GUI agents with reliable episodic reasoning capabilities is essential for bridging the gap between users' concise task descriptions and the complexities of real-world execution. Current methods integrate Reinforcement Learning (RL) with System-2 Chain-of-Thought, yielding notable gains in reasoning enhancement. For long-horizon GUI tasks, historical interactions connect e...
📄 GentleHumanoid: Learning Upper-body Compliance for Contact-rich Human and Object Interaction
2025-11-08Авторы:
Qingzhou Lu, Yao Feng, Baiyu Shi, Michael Piseno, Zhenan Bao, C. Karen Liu
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Humanoid robots are expected to operate in human-centered environments where
safe and natural physical interaction is essential. However, most recent
reinforcement learning (RL) policies emphasize rigid tracking and suppress
external forces. Existing impedance-augmented approaches are typically
restricted to base or end-effector control and focus on resisting extreme
forces rather than enabling compliance. We introduce GentleHumanoid, a
framework that integrates impedance control into a whole-bo...
Авторы:
Yoshihiro Maruyama
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Human activity recognition is challenging because sensor signals shift with
context, motion, and environment; effective models must therefore remain stable
as the world around them changes. We introduce a categorical symmetry-aware
learning framework that captures how signals vary over time, scale, and sensor
hierarchy. We build these factors into the structure of feature
representations, yielding models that automatically preserve the relationships
between sensors and remain stable under realis...
Авторы:
Hongbo Lan, Zhenlin An, Haoyu Li, Vaibhav Singh, Longfei Shangguan
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
This paper introduces \sysname, a system that accelerates vision-guided
physical property reasoning to enable augmented visual cognition. \sysname
minimizes the run-time latency of this reasoning pipeline through a combination
of both algorithmic and systematic optimizations, including rapid geometric 3D
reconstruction, efficient semantic feature fusion, and parallel view encoding.
Through these simple yet effective optimizations, \sysname reduces the
end-to-end latency of this reasoning pipelin...
📄 VitalLens 2.0: High-Fidelity rPPG for Heart Rate Variability Estimation from Face Video
2025-11-04Авторы:
Philipp V. Rouast
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
This report introduces VitalLens 2.0, a new deep learning model for
estimating physiological signals from face video. This new model demonstrates a
significant leap in accuracy for remote photoplethysmography (rPPG), enabling
the robust estimation of not only heart rate (HR) and respiratory rate (RR) but
also Heart Rate Variability (HRV) metrics. This advance is achieved through a
combination of a new model architecture and a substantial increase in the size
and diversity of our training data, n...
📄 Using Salient Object Detection to Identify Manipulative Cookie Banners that Circumvent GDPR
2025-11-04Авторы:
Riley Grossman, Michael Smith, Cristian Borcea, Yi Chen
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
The main goal of this paper is to study how often cookie banners that comply
with the General Data Protection Regulation (GDPR) contain aesthetic
manipulation, a design tactic to draw users' attention to the button that
permits personal data sharing. As a byproduct of this goal, we also evaluate
how frequently the banners comply with GDPR and the recommendations of national
data protection authorities regarding banner designs. We visited 2,579 websites
and identified the type of cookie banner im...
Показано 11 -
20
из 54 записей