📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Efficient Vision-Language-Action Models for Embodied Manipulation: A Systematic Survey

2025-10-22

Авторы:

Weifan Guan, Qinghao Hu, Aosheng Li, Jian Cheng

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Vision-Language-Action (VLA) models extend vision-language models to embodied control by mapping natural-language instructions and visual observations to robot actions. Despite their capabilities, VLA systems face significant challenges due to their massive computational and memory demands, which conflict with the constraints of edge platforms such as on-board mobile manipulators that require real-time performance. Addressing this tension has become a central focus of recent research. In light o...

ID: 2510.17111v1 cs.RO, cs.AI, cs.LG

arXiv PDF

📄 SimpleVSF: VLM-Scoring Fusion for Trajectory Prediction of End-to-End Autonomous Driving

2025-10-22

Авторы:

Peiru Zheng, Yun Zhao, Zhan Gong, Hong Zhu, Shaohua Wu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

End-to-end autonomous driving has emerged as a promising paradigm for achieving robust and intelligent driving policies. However, existing end-to-end methods still face significant challenges, such as suboptimal decision-making in complex scenarios. In this paper,we propose SimpleVSF (Simple VLM-Scoring Fusion), a novel framework that enhances end-to-end planning by leveraging the cognitive capabilities of Vision-Language Models (VLMs) and advanced trajectory fusion techniques. We utilize the co...

ID: 2510.17191v1 cs.RO, cs.AI

arXiv PDF

📄 Bridging Embodiment Gaps: Deploying Vision-Language-Action Models on Soft Robots

2025-10-22

Авторы:

Haochen Su, Cristian Meo, Francesco Stella, Andrea Peirone, Kai Junge, Josie Hughes

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Robotic systems are increasingly expected to operate in human-centered, unstructured environments where safety, adaptability, and generalization are essential. Vision-Language-Action (VLA) models have been proposed as a language guided generalized control framework for real robots. However, their deployment has been limited to conventional serial link manipulators. Coupled by their rigidity and unpredictability of learning based control, the ability to safely interact with the environment is mis...

ID: 2510.17369v1 cs.RO, cs.AI, cs.LG

arXiv PDF

📄 From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors

2025-10-22

Авторы:

Zhengshen Zhang, Hao Li, Yalun Dai, Zhengbang Zhu, Lei Zhou, Chenchen Liu, Dong Wang, Francis E. H. Tay, Sijin Chen, Ziwei Liu, Yuxiao Liu, Xinghang Li, Pan Zhou

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Existing vision-language-action (VLA) models act in 3D real-world but are typically built on 2D encoders, leaving a spatial reasoning gap that limits generalization and adaptability. Recent 3D integration techniques for VLAs either require specialized sensors and transfer poorly across modalities, or inject weak cues that lack geometry and degrade vision-language alignment. In this work, we introduce FALCON (From Spatial to Action), a novel paradigm that injects rich 3D spatial tokens into the a...

ID: 2510.17439v1 cs.RO, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 Intent-Driven LLM Ensemble Planning for Flexible Multi-Robot Disassembly: Demonstration on EV Batteries

2025-10-22

Авторы:

Cansu Erdogan, Cesar Alan Contreras, Alireza Rastegarpanah, Manolis Chiou, Rustam Stolkin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This paper addresses the problem of planning complex manipulation tasks, in which multiple robots with different end-effectors and capabilities, informed by computer vision, must plan and execute concatenated sequences of actions on a variety of objects that can appear in arbitrary positions and configurations in unstructured scenes. We propose an intent-driven planning pipeline which can robustly construct such action sequences with varying degrees of supervisory input from a human using simple...

ID: 2510.17576v1 cs.RO, cs.AI, cs.HC, cs.MA

arXiv PDF

📄 RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation

2025-10-22

Авторы:

Yuquan Xue, Guanxing Lu, Zhenyu Wu, Chuanrui Zhang, Bofang Jia, Zhengyi Gu, Yansong Tang, Ziwei Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Vision-Language-Action models (VLAs) have demonstrated remarkable performance on complex robotic manipulation tasks through imitation learning. However, existing imitation learning datasets contain only successful trajectories and lack failure or recovery data, especially for out-of-distribution (OOD) states where the robot deviates from the main policy due to minor perturbations or errors, leading VLA models to struggle with states deviating from the training distribution. To this end, we propo...

ID: 2510.17640v1 cs.RO, cs.AI, cs.LG

arXiv PDF

📄 SoftMimic: Learning Compliant Whole-body Control from Examples

2025-10-22

Авторы:

Gabriel B. Margolis, Michelle Wang, Nolan Fey, Pulkit Agrawal

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce SoftMimic, a framework for learning compliant whole-body control policies for humanoid robots from example motions. Imitating human motions with reinforcement learning allows humanoids to quickly learn new skills, but existing methods incentivize stiff control that aggressively corrects deviations from a reference motion, leading to brittle and unsafe behavior when the robot encounters unexpected contacts. In contrast, SoftMimic enables robots to respond compliantly to external forc...

ID: 2510.17792v1 cs.RO, cs.AI, cs.LG

arXiv PDF

📄 Architecture Is All You Need: Diversity-Enabled Sweet Spots for Robust Humanoid Locomotion

2025-10-21

Авторы:

Blake Werner, Lizhi Yang, Aaron D. Ames

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Robust humanoid locomotion in unstructured environments requires architectures that balance fast low-level stabilization with slower perceptual decision-making. We show that a simple layered control architecture (LCA), a proprioceptive stabilizer running at high rate, coupled with a compact low-rate perceptual policy, enables substantially more robust performance than monolithic end-to-end designs, even when using minimal perception encoders. Through a two-stage training curriculum (blind stabil...

ID: 2510.14947v2 cs.RO, cs.AI, cs.LG, cs.SY, eess.SY

arXiv PDF

📄 CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions

2025-10-21

Авторы:

Lizhi Yang, Blake Werner, Massimiliano de Sa, Aaron D. Ames

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Reinforcement learning (RL), while powerful and expressive, can often prioritize performance at the expense of safety. Yet safety violations can lead to catastrophic outcomes in real-world deployments. Control Barrier Functions (CBFs) offer a principled method to enforce dynamic safety -- traditionally deployed online via safety filters. While the result is safe behavior, the fact that the RL policy does not have knowledge of the CBF can lead to conservative behaviors. This paper proposes CBF-RL...

ID: 2510.14959v2 cs.RO, cs.AI, cs.LG, cs.SY, eess.SY

arXiv PDF

📄 ASBI: Leveraging Informative Real-World Data for Active Black-Box Simulator Tuning

2025-10-21

Авторы:

Gahee Kim, Takamitsu Matsubara

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Black-box simulators are widely used in robotics, but optimizing their parameters remains challenging due to inaccessible likelihoods. Simulation-Based Inference (SBI) tackles this issue using simulation-driven approaches, estimating the posterior from offline real observations and forward simulations. However, in black-box scenarios, preparing observations that contain sufficient information for parameter estimation is difficult due to the unknown relationship between parameters and observation...

ID: 2510.15331v1 cs.RO, cs.AI

arXiv PDF

Показано 181 - 190 из 544 записей