ROPES: Robotic Pose Estimation via Score-Based Causal Representation Learning
2510.20884v1
cs.RO, cs.LG
2025-10-28
Authors:
Pranamya Kulkarni, Puranjay Datta, Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Ali Tajer
Abstract
Causal representation learning (CRL) has emerged as a powerful unsupervised
framework that (i) disentangles the latent generative factors underlying
high-dimensional data, and (ii) learns the cause-and-effect interactions among
the disentangled variables. Despite extensive recent advances in
identifiability and some practical progress, a substantial gap remains between
theory and real-world practice. This paper takes a step toward closing that gap
by bringing CRL to robotics, a domain that has motivated CRL. Specifically,
this paper addresses the well-defined problem of robot pose estimation, i.e.,
the recovery of position and orientation from raw images, by introducing Robotic Pose
Estimation via Score-Based CRL (ROPES). Being an unsupervised framework, ROPES
embodies the essence of interventional CRL by identifying those generative
factors that are actuated: images are generated by intrinsic and extrinsic
latent factors (e.g., joint angles, arm/limb geometry, lighting, background,
and camera configuration) and the objective is to disentangle and recover the
controllable latent variables, i.e., those that can be directly manipulated
(intervened upon) through actuation. Interventional CRL theory shows that
variables that undergo variations via interventions can be identified. In
robotics, such interventions arise naturally by commanding actuators of various
joints and recording images under varied controls. Empirical evaluations in
semi-synthetic manipulator experiments demonstrate that ROPES successfully
disentangles latent generative factors with high fidelity with respect to the
ground truth. Crucially, this is achieved by leveraging only distributional
changes, without using any labeled data. The paper also includes a comparison
with a baseline based on a recently proposed semi-supervised framework. The
paper concludes by positioning robot pose estimation as a near-practical
testbed for CRL.