When Planners Meet Reality: How Learned, Reactive Traffic Agents Shift nuPlan Benchmarks
2510.14677v1
cs.RO, cs.AI, cs.LG, cs.MA
2025-10-18
Авторы:
Steffen Hagedorn, Luka Donkov, Aron Distelzweig, Alexandru P. Condurache
Abstract
Planner evaluation in closed-loop simulation often uses rule-based traffic
agents, whose simplistic and passive behavior can hide planner deficiencies and
bias rankings. Widely used IDM agents simply follow a lead vehicle and cannot
react to vehicles in adjacent lanes, hindering tests of complex interaction
capabilities. We address this issue by integrating the state-of-the-art learned
traffic agent model SMART into nuPlan. Thus, we are the first to evaluate
planners under more realistic conditions and quantify how conclusions shift
when narrowing the sim-to-real gap. Our analysis covers 14 recent planners and
established baselines and shows that IDM-based simulation overestimates
planning performance: nearly all scores deteriorate. In contrast, many planners
interact better than previously assumed and even improve in multi-lane,
interaction-heavy scenarios like lane changes or turns. Methods trained in
closed-loop demonstrate the best and most stable driving performance. However,
when reaching their limits in augmented edge-case scenarios, all learned
planners degrade abruptly, whereas rule-based planners maintain reasonable
basic behavior. Based on our results, we suggest SMART-reactive simulation as a
new standard closed-loop benchmark in nuPlan and release the SMART agents as a
drop-in alternative to IDM at https://github.com/shgd95/InteractiveClosedLoop.