Hybrid DQN-TD3 Reinforcement Learning for Autonomous Navigation in Dynamic Environments
2510.26646v1
cs.RO, cs.AI, cs.LG
2025-11-01
Авторы:
Xiaoyi He, Danggui Chen, Zhenshuo Zhang, Zimeng Bai
Abstract
This paper presents a hierarchical path-planning and control framework that
combines a high-level Deep Q-Network (DQN) for discrete sub-goal selection with
a low-level Twin Delayed Deep Deterministic Policy Gradient (TD3) controller
for continuous actuation. The high-level module selects behaviors and
sub-goals; the low-level module executes smooth velocity commands. We design a
practical reward shaping scheme (direction, distance, obstacle avoidance,
action smoothness, collision penalty, time penalty, and progress), together
with a LiDAR-based safety gate that prevents unsafe motions. The system is
implemented in ROS + Gazebo (TurtleBot3) and evaluated with PathBench metrics,
including success rate, collision rate, path efficiency, and re-planning
efficiency, in dynamic and partially observable environments. Experiments show
improved success rate and sample efficiency over single-algorithm baselines
(DQN or TD3 alone) and rule-based planners, with better generalization to
unseen obstacle configurations and reduced abrupt control changes. Code and
evaluation scripts are available at the project repository.
Ссылки и действия
Дополнительные ресурсы: