Failure Prediction at Runtime for Generative Robot Policies
2510.09459v1
cs.RO, cs.AI, cs.LG
2025-10-14
Авторы:
Ralf Römer, Adrian Kobras, Luca Worbis, Angela P. Schoellig
Abstract
Imitation learning (IL) with generative models, such as diffusion and flow
matching, has enabled robots to perform complex, long-horizon tasks. However,
distribution shifts from unseen environments or compounding action errors can
still cause unpredictable and unsafe behavior, leading to task failure. Early
failure prediction during runtime is therefore essential for deploying robots
in human-centered and safety-critical environments. We propose FIPER, a general
framework for Failure Prediction at Runtime for generative IL policies that
does not require failure data. FIPER identifies two key indicators of impending
failure: (i) out-of-distribution (OOD) observations detected via random network
distillation in the policy's embedding space, and (ii) high uncertainty in
generated actions measured by a novel action-chunk entropy score. Both failure
prediction scores are calibrated using a small set of successful rollouts via
conformal prediction. A failure alarm is triggered when both indicators,
aggregated over short time windows, exceed their thresholds. We evaluate FIPER
across five simulation and real-world environments involving diverse failure
modes. Our results demonstrate that FIPER better distinguishes actual failures
from benign OOD situations and predicts failures more accurately and earlier
than existing methods. We thus consider this work an important step towards
more interpretable and safer generative robot policies. Code, data and videos
are available at https://tum-lsy.github.io/fiper_website.
Ссылки и действия
Дополнительные ресурсы: