FreeAction: Training-Free Techniques for Enhanced Fidelity of Trajectory-to-Video Generation

2509.24241v1 cs.CV, cs.RO 2025-10-01

Авторы:

Seungwook Kim, Seunghyeon Lee, Minsu Cho

Abstract

Generating realistic robot videos from explicit action trajectories is a critical step toward building effective world models and robotics foundation models. We introduce two training-free, inference-time techniques that fully exploit explicit action parameters in diffusion-based robot video generation. Instead of treating action vectors as passive conditioning signals, our methods actively incorporate them to guide both the classifier-free guidance process and the initialization of Gaussian latents. First, action-scaled classifier-free guidance dynamically modulates guidance strength in proportion to action magnitude, enhancing controllability over motion intensity. Second, action-scaled noise truncation adjusts the distribution of initially sampled noise to better align with the desired motion dynamics. Experiments on real robot manipulation datasets demonstrate that these techniques significantly improve action coherence and visual quality across diverse robot environments.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

FreeAction: Training-Free Techniques for Enhanced Fidelity of Trajectory-to-Video Generation

Авторы:

Abstract

Ссылки и действия

Связанные статьи

FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via neur...

Object Reconstruction under Occlusion with Generative Priors and Contact-induced...

Image Generation as a Visual Planner for Robotic Manipulation

TrajDiff: End-to-end Autonomous Driving without Perception Annotation

SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minima...

Навигация