FreeAction: Training-Free Techniques for Enhanced Fidelity of Trajectory-to-Video Generation
2509.24241v1
cs.CV, cs.RO
2025-10-01
Authors:
Seungwook Kim, Seunghyeon Lee, Minsu Cho
Abstract
Generating realistic robot videos from explicit action trajectories is a
critical step toward building effective world models and robotics foundation
models. We introduce two training-free, inference-time techniques that fully
exploit explicit action parameters in diffusion-based robot video generation.
Instead of treating action vectors as passive conditioning signals, our methods
actively incorporate them to guide both the classifier-free guidance process
and the initialization of Gaussian latents. First, action-scaled
classifier-free guidance dynamically modulates guidance strength in proportion
to action magnitude, enhancing controllability over motion intensity. Second,
action-scaled noise truncation adjusts the distribution of initially sampled
noise to better align with the desired motion dynamics. Experiments on real
robot manipulation datasets demonstrate that these techniques significantly
improve action coherence and visual quality across diverse robot environments.
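
The abstract names the two techniques but gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a linear mapping from the L2 norm of the action vector to the guidance weight, and a symmetric truncation bound on the initial Gaussian noise that widens with action magnitude. The function names, the linear rules, and all default constants (`base_scale`, `gain`, `max_scale`, `min_bound`, `max_bound`) are hypothetical. Only `cfg_combine` follows the standard classifier-free guidance formula.

```python
import math
import random


def action_magnitude(action):
    """L2 norm of an action vector."""
    return math.sqrt(sum(a * a for a in action))


def action_scaled_guidance(action, base_scale=1.0, gain=1.0, max_scale=10.0):
    """Hypothetical rule: guidance weight grows linearly with action magnitude."""
    return min(base_scale + gain * action_magnitude(action), max_scale)


def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def truncation_bound(action, min_bound=1.0, gain=1.0, max_bound=3.0):
    """Hypothetical rule: smaller actions get a tighter bound on initial noise."""
    return min(min_bound + gain * action_magnitude(action), max_bound)


def sample_truncated_noise(n, bound, rng):
    """Rejection-sample n standard-normal values z with |z| <= bound (bound > 0)."""
    samples = []
    while len(samples) < n:
        z = rng.gauss(0.0, 1.0)
        if abs(z) <= bound:
            samples.append(z)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    action = [0.3, 0.4]  # toy action vector with magnitude 0.5
    scale = action_scaled_guidance(action)
    bound = truncation_bound(action)
    latents = sample_truncated_noise(8, bound, rng)
    print("guidance scale:", scale)
    print("truncation bound:", bound)
    print("largest |latent|:", max(abs(z) for z in latents))
```

In a real pipeline the weight would be computed per frame or per clip from the conditioning trajectory and passed to the denoiser at each sampling step, and the truncated noise would replace the usual unconstrained Gaussian initialization. How the paper actually maps action magnitude to either quantity is not stated in the abstract.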