DeLTa: Demonstration and Language-Guided Novel Transparent Object Manipulation
2510.05662v1
cs.RO, cs.CV
2025-10-09
Авторы:
Taeyeop Lee, Gyuree Kang, Bowen Wen, Youngho Kim, Seunghyeok Back, In So Kweon, David Hyunchul Shim, Kuk-Jin Yoon
Abstract
Despite the prevalence of transparent object interactions in human everyday
life, transparent robotic manipulation research remains limited to
short-horizon tasks and basic grasping capabilities.Although some methods have
partially addressed these issues, most of them have limitations in
generalizability to novel objects and are insufficient for precise long-horizon
robot manipulation. To address this limitation, we propose DeLTa (Demonstration
and Language-Guided Novel Transparent Object Manipulation), a novel framework
that integrates depth estimation, 6D pose estimation, and vision-language
planning for precise long-horizon manipulation of transparent objects guided by
natural task instructions. A key advantage of our method is its
single-demonstration approach, which generalizes 6D trajectories to novel
transparent objects without requiring category-level priors or additional
training. Additionally, we present a task planner that refines the
VLM-generated plan to account for the constraints of a single-arm, eye-in-hand
robot for long-horizon object manipulation tasks. Through comprehensive
evaluation, we demonstrate that our method significantly outperforms existing
transparent object manipulation approaches, particularly in long-horizon
scenarios requiring precise manipulation capabilities. Project page:
https://sites.google.com/view/DeLTa25/
Ссылки и действия
Дополнительные ресурсы: