📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language

2025-11-20

Авторы:

Minyoung Hwang, Alexandra Forsey-Smerek, Nathaniel Dennler, Andreea Bobu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Robots can adapt to user preferences by learning reward functions from demonstrations, but with limited data, reward models often overfit to spurious correlations and fail to generalize. This happens because demonstrations show robots how to do a task but not what matters for that task, causing the model to focus on irrelevant state details. Natural language can more directly specify what the robot should focus on, and, in principle, disambiguate between many reward functions consistent with the...

ID: 2511.14565v1 cs.RO, cs.AI

arXiv PDF

📄 Is Your VLM for Autonomous Driving Safety-Ready? A Comprehensive Benchmark for Evaluating External and In-Cabin Risks

2025-11-20

Авторы:

Xianhui Meng, Yuchen Zhang, Zhijian Huang, Zheng Lu, Ziling Ji, Yaoyao Yin, Hongyuan Zhang, Guangfeng Jiang, Yandan Lin, Long Chen, Hangjun Ye, Li Zhang, Jun Liu, Xiaoshuai Hao

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Vision-Language Models (VLMs) show great promise for autonomous driving, but their suitability for safety-critical scenarios is largely unexplored, raising safety concerns. This issue arises from the lack of comprehensive benchmarks that assess both external environmental risks and in-cabin driving behavior safety simultaneously. To bridge this critical gap, we introduce DSBench, the first comprehensive Driving Safety Benchmark designed to assess a VLM's awareness of various safety risks in a un...

ID: 2511.14592v1 cs.RO, cs.AI

arXiv PDF

📄 NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards

2025-11-20

Авторы:

Chia-Yu Hung, Navonil Majumder, Haoyuan Deng, Liu Renhang, Yankang Ang, Amir Zadeh, Chuan Li, Dorien Herremans, Ziwei Wang, Soujanya Poria

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Vision--language--action (VLA) models have recently shown promising performance on a variety of embodied tasks, yet they still fall short in reliability and generalization, especially when deployed across different embodiments or real-world environments. In this work, we introduce NORA-1.5, a VLA model built from the pre-trained NORA backbone by adding to it a flow-matching-based action expert. This architectural enhancement alone yields substantial performance gains, enabling NORA-1.5 to outper...

ID: 2511.14659v1 cs.RO, cs.AI

arXiv PDF

📄 Autonomous Underwater Cognitive System for Adaptive Navigation: A SLAM-Integrated Cognitive Architecture

2025-11-19

Авторы:

K. A. I. N Jayarathne, R. M. N. M. Rathnayaka, D. P. S. S. Peiris

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deep-sea exploration poses significant challenges, including disorientation, communication loss, and navigational failures in dynamic underwater environments. This paper presents an Autonomous Underwater Cognitive System (AUCS) that integrates Simultaneous Localization and Mapping (SLAM) with a Soar-based cognitive architecture to enable adaptive navigation in complex oceanic conditions. The system fuses multi-sensor data from SONAR, LiDAR, IMU, and DVL with cognitive reasoning modules for perce...

ID: 2511.11845v1 cs.RO, cs.AI, cs.AR

arXiv PDF

📄 Decoupled Action Head: Confining Task Knowledge to Conditioning Layers

2025-11-19

Авторы:

Jian Zhou, Sihao Lin, Shuai Fu, Qi WU

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Behavior Cloning (BC) is a data-driven supervised learning approach that has gained increasing attention with the success of scaling laws in language and vision domains. Among its implementations in robotic manipulation, Diffusion Policy (DP), with its two variants DP-CNN (DP-C) and DP-Transformer (DP-T), is one of the most effective and widely adopted models, demonstrating the advantages of predicting continuous action sequences. However, both DP and other BC methods remain constrained by the s...

ID: 2511.12101v1 cs.RO, cs.AI, cs.LG

arXiv PDF

📄 Locally Optimal Solutions to Constraint Displacement Problems via Path-Obstacle Overlaps

2025-11-19

Авторы:

Antony Thomas, Fulvio Mastrogiovanni, Marco Baglietto

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We present a unified approach for constraint displacement problems in which a robot finds a feasible path by displacing constraints or obstacles. To this end, we propose a two stage process that returns locally optimal obstacle displacements to enable a feasible path for the robot. The first stage proceeds by computing a trajectory through the obstacles while minimizing an appropriate objective function. In the second stage, these obstacles are displaced to make the computed robot trajectory fea...

ID: 2511.12203v1 cs.RO, cs.AI

arXiv PDF

📄 Towards High-Consistency Embodied World Model with Multi-View Trajectory Videos

2025-11-19

Авторы:

Taiyi Su, Jian Zhu, Yaxuan Li, Chong Ma, Zitai Huang, Yichen Zhu, Hanli Wang, Yi Xu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Embodied world models aim to predict and interact with the physical world through visual observations and actions. However, existing models struggle to accurately translate low-level actions (e.g., joint positions) into precise robotic movements in predicted frames, leading to inconsistencies with real-world physical interactions. To address these limitations, we propose MTV-World, an embodied world model that introduces Multi-view Trajectory-Video control for precise visuomotor prediction. Spec...

ID: 2511.12882v1 cs.RO, cs.AI

arXiv PDF

📄 EL3DD: Extended Latent 3D Diffusion for Language Conditioned Multitask Manipulation

2025-11-19

Авторы:

Jonas Bode, Raphael Memmesheimer, Sven Behnke

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Acting in human environments is a crucial capability for general-purpose robots, necessitating a robust understanding of natural language and its application to physical tasks. This paper seeks to harness the capabilities of diffusion models within a visuomotor policy framework that merges visual and textual inputs to generate precise robotic trajectories. By employing reference demonstrations during training, the model learns to execute manipulation tasks specified through textual commands with...

ID: 2511.13312v1 cs.RO, cs.AI, cs.LG

arXiv PDF

📄 Towards Affect-Adaptive Human-Robot Interaction: A Protocol for Multimodal Dataset Collection on Social Anxiety

2025-11-19

Авторы:

Vesna Poprcova, Iulia Lefter, Matthias Wieser, Martijn Warnier, Frances Brazier

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Social anxiety is a prevalent condition that affects interpersonal interactions and social functioning. Recent advances in artificial intelligence and social robotics offer new opportunities to examine social anxiety in the human-robot interaction context. Accurate detection of affective states and behaviours associated with social anxiety requires multimodal datasets, where each signal modality provides complementary insights into its manifestations. However, such datasets remain scarce, limiti...

ID: 2511.13530v1 cs.RO, cs.AI

arXiv PDF

📄 From Power to Precision: Learning Fine-grained Dexterity for Multi-fingered Robotic Hands

2025-11-19

Авторы:

Jianglong Ye, Lai Wei, Guangqi Jiang, Changwei Jing, Xueyan Zou, Xiaolong Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Human grasps can be roughly categorized into two types: power grasps and precision grasps. Precision grasping enables tool use and is believed to have influenced human evolution. Today's multi-fingered robotic hands are effective in power grasps, but for tasks requiring precision, parallel grippers are still more widely adopted. This contrast highlights a key limitation in current robotic hand design: the difficulty of achieving both stable power grasps and precise, fine-grained manipulation wit...

ID: 2511.13710v1 cs.RO, cs.AI, cs.LG

arXiv PDF

Показано 71 - 80 из 544 записей