Predictive Preference Learning from Human Interventions
2510.01545v1
cs.LG, cs.AI, cs.RO
2025-10-04
Авторы:
Haoyuan Cai, Zhenghao Peng, Bolei Zhou
Abstract
Learning from human involvement aims to incorporate the human subject to
monitor and correct agent behavior errors. Although most interactive imitation
learning methods focus on correcting the agent's action at the current state,
they do not adjust its actions in future states, which may be potentially more
hazardous. To address this, we introduce Predictive Preference Learning from
Human Interventions (PPL), which leverages the implicit preference signals
contained in human interventions to inform predictions of future rollouts. The
key idea of PPL is to bootstrap each human intervention into L future time
steps, called the preference horizon, with the assumption that the agent
follows the same action and the human makes the same intervention in the
preference horizon. By applying preference optimization on these future states,
expert corrections are propagated into the safety-critical regions where the
agent is expected to explore, significantly improving learning efficiency and
reducing human demonstrations needed. We evaluate our approach with experiments
on both autonomous driving and robotic manipulation benchmarks and demonstrate
its efficiency and generality. Our theoretical analysis further shows that
selecting an appropriate preference horizon L balances coverage of risky states
with label correctness, thereby bounding the algorithmic optimality gap. Demo
and code are available at: https://metadriverse.github.io/ppl
Ссылки и действия
Дополнительные ресурсы: