Learning Generalizable Visuomotor Policy through Dynamics-Alignment
2510.27114v1
cs.RO, cs.LG
2025-11-04
Authors:
Dohyeok Lee, Jung Min Lee, Munkyung Kim, Seokhun Ju, Jin Woo Koo, Kyungjae Lee, Dohyeong Kim, TaeHyun Cho, Jungwoo Lee
Abstract
Behavior cloning methods for robot learning suffer from poor generalization
due to limited data support beyond expert demonstrations. Recent approaches
leveraging video prediction models have shown promising results by learning
rich spatiotemporal representations from large-scale datasets. However, these
models learn action-agnostic dynamics that cannot distinguish between different
control inputs, limiting their utility for precise manipulation tasks and
requiring large pretraining datasets. We propose a Dynamics-Aligned Flow
Matching Policy (DAP) that integrates dynamics prediction into policy learning.
Our method introduces a novel architecture where policy and dynamics models
provide mutual corrective feedback during action generation, enabling
self-correction and improved generalization. Experiments on real-world robotic
manipulation tasks show that DAP generalizes better than baseline methods and
is particularly robust in out-of-distribution (OOD) scenarios, including
visual distractions and lighting variations.
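
For readers unfamiliar with flow matching, the sketch below shows how a generic flow-matching policy turns noise into an action by integrating a velocity field. It is not the DAP architecture: the abstract does not specify how the dynamics model feeds back into action generation, so that part is omitted. The closed-form `toy_velocity`, the function names, and the 3-D `expert_action` are all hypothetical stand-ins for a learned, observation-conditioned network.

```python
import random


def toy_velocity(action, t, target):
    """Hypothetical stand-in for a learned velocity field v(a, t | observation).

    For the straight-line path a_t = (1 - t) * a_0 + t * a_1, this closed form
    points from the current sample toward the target action.
    """
    return [(g - a) / (1.0 - t) for a, g in zip(action, target)]


def sample_action(target, num_steps=10, seed=0):
    """Euler-integrate da/dt = v(a, t) from t = 0 (noise) toward t = 1 (action)."""
    rng = random.Random(seed)
    action = [rng.gauss(0.0, 1.0) for _ in target]  # a_0 drawn from N(0, I)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the division above is safe
        v = toy_velocity(action, t, target)
        action = [a + dt * vi for a, vi in zip(action, v)]
    return action


# Illustrative 3-D action; the values are assumptions, not taken from the paper.
expert_action = [0.3, -0.1, 0.5]
sampled = sample_action(expert_action)
```

In a real policy the velocity field would be a neural network trained by regression on demonstration data, and the integration would start from fresh noise at every control step.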