Integrating Offline Pre-Training with Online Fine-Tuning: A Reinforcement Learning Approach for Robot Social Navigation
2510.00466v1
cs.RO, cs.AI
2025-10-05
Authors:
Run Su, Hao Fu, Shuai Zhou, Yingao Fu
Abstract
Offline reinforcement learning (RL) has emerged as a promising framework for
addressing robot social navigation challenges. However, inherent uncertainties
in pedestrian behavior and limited environmental interaction during training
often lead to suboptimal exploration and distributional shifts between offline
training and online deployment. To overcome these limitations, this paper
proposes a novel offline-to-online fine-tuning RL algorithm for robot social
navigation by integrating Return-to-Go (RTG) prediction into a causal
Transformer architecture. Our algorithm features a spatiotemporal fusion model
designed to precisely estimate RTG values in real time by jointly encoding
temporal pedestrian motion patterns and spatial crowd dynamics. This RTG
prediction framework mitigates distribution shift by aligning offline policy
training with online environmental interactions. Furthermore, a hybrid
offline-online experience sampling mechanism is built to stabilize policy
updates during fine-tuning, ensuring balanced integration of pre-trained
knowledge and real-time adaptation. Extensive experiments in simulated social
navigation environments demonstrate that our method achieves a higher success
rate and lower collision rate compared to state-of-the-art baselines. These
results underscore the efficacy of our algorithm in enhancing navigation policy
robustness and adaptability. This work paves the way for more reliable and
adaptive robotic navigation systems in real-world applications.
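
The two mechanisms named in the abstract, Return-to-Go targets and hybrid offline-online sampling, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract says RTG values are predicted by a spatiotemporal fusion model, whereas the code below only computes the standard RTG definition (the sum of future rewards) from a logged reward sequence. The function names, the `gamma` discount, and the `online_fraction` mixing ratio are all hypothetical, since the abstract does not specify them.

```python
import random


def returns_to_go(rewards, gamma=1.0):
    """Return-to-go at step t: the (optionally discounted) sum of rewards from t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Accumulate backwards so each step reuses the sum already computed for t + 1.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def sample_hybrid_batch(offline_buffer, online_buffer, batch_size,
                        online_fraction=0.5, rng=random):
    """Draw a batch that mixes offline and online experience (assumed fixed ratio)."""
    # Fall back to offline data only while the online buffer is still empty.
    n_online = int(batch_size * online_fraction) if online_buffer else 0
    n_offline = batch_size - n_online
    batch = []
    if n_online:
        batch += rng.choices(online_buffer, k=n_online)    # sampled with replacement
    if n_offline:
        batch += rng.choices(offline_buffer, k=n_offline)  # sampled with replacement
    return batch
```

For example, `returns_to_go([1.0, 0.0, 2.0])` gives `[3.0, 2.0, 2.0]`. A fixed mixing ratio is only one possible design. Whether the paper uses a fixed or an adaptive ratio is not stated in the abstract.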