NovaFlow: Zero-Shot Manipulation via Actionable Flow from Generated Videos
2510.08568v1
cs.RO, cs.AI, cs.CV
2025-10-11
Авторы:
Hongyu Li, Lingfeng Sun, Yafei Hu, Duy Ta, Jennifer Barry, George Konidaris, Jiahui Fu
Abstract
Enabling robots to execute novel manipulation tasks zero-shot is a central
goal in robotics. Most existing methods assume in-distribution tasks or rely on
fine-tuning with embodiment-matched data, limiting transfer across platforms.
We present NovaFlow, an autonomous manipulation framework that converts a task
description into an actionable plan for a target robot without any
demonstrations. Given a task description, NovaFlow synthesizes a video using a
video generation model and distills it into 3D actionable object flow using
off-the-shelf perception modules. From the object flow, it computes relative
poses for rigid objects and realizes them as robot actions via grasp proposals
and trajectory optimization. For deformable objects, this flow serves as a
tracking objective for model-based planning with a particle-based dynamics
model. By decoupling task understanding from low-level control, NovaFlow
naturally transfers across embodiments. We validate on rigid, articulated, and
deformable object manipulation tasks using a table-top Franka arm and a Spot
quadrupedal mobile robot, and achieve effective zero-shot execution without
demonstrations or embodiment-specific training. Project website:
https://novaflow.lhy.xyz/.
Ссылки и действия
Дополнительные ресурсы: