DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models
2510.13108v1
cs.CV, cs.AI, cs.RO
2025-10-17
Авторы:
Jingyu Song, Zhenxin Li, Shiyi Lan, Xinglong Sun, Nadine Chang, Maying Shen, Joshua Chen, Katherine A. Skinner, Jose M. Alvarez
Abstract
Benchmarking autonomous driving planners to align with human judgment remains
a critical challenge, as state-of-the-art metrics like the Extended Predictive
Driver Model Score (EPDMS) lack context awareness in nuanced scenarios. To
address this, we introduce DriveCritic, a novel framework featuring two key
contributions: the DriveCritic dataset, a curated collection of challenging
scenarios where context is critical for correct judgment and annotated with
pairwise human preferences, and the DriveCritic model, a Vision-Language Model
(VLM) based evaluator. Fine-tuned using a two-stage supervised and
reinforcement learning pipeline, the DriveCritic model learns to adjudicate
between trajectory pairs by integrating visual and symbolic context.
Experiments show DriveCritic significantly outperforms existing metrics and
baselines in matching human preferences and demonstrates strong context
awareness. Overall, our work provides a more reliable, human-aligned foundation
to evaluating autonomous driving systems.
Ссылки и действия
Дополнительные ресурсы: