Safety Assessment in Reinforcement Learning via Model Predictive Control
2510.20955v1
cs.LG, cs.RO
2025-10-28
Авторы:
Jeff Pflueger, Michael Everett
Abstract
Model-free reinforcement learning approaches are promising for control but
typically lack formal safety guarantees. Existing methods to shield or
otherwise provide these guarantees often rely on detailed knowledge of the
safety specifications. Instead, this work's insight is that many
difficult-to-specify safety issues are best characterized by invariance.
Accordingly, we propose to leverage reversibility as a method for preventing
these safety issues throughout the training process. Our method uses
model-predictive path integral control to check the safety of an action
proposed by a learned policy throughout training. A key advantage of this
approach is that it only requires the ability to query the black-box dynamics,
not explicit knowledge of the dynamics or safety constraints. Experimental
results demonstrate that the proposed algorithm successfully aborts before all
unsafe actions, while still achieving comparable training progress to a
baseline PPO approach that is allowed to violate safety.
Ссылки и действия
Дополнительные ресурсы: