DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model
2508.05402v1
cs.RO, cs.CV
2025-08-09
Authors:
Rui Yu, Xianghang Zhang, Runkai Zhao, Huaicheng Yan, Meng Wang
Summary
End-to-end autonomous driving is limited in robustness and generality when models learn only from ego-vehicle status. To address this, the authors propose DistillDrive, an end-to-end model based on knowledge distillation. It improves multi-mode motion planning using planning-oriented instances constructed through generative modeling. Its distinguishing feature is the teacher model, a planning model built on structured scene representations, whose diversified planning instances serve as learning targets for the end-to-end model, with the aim of improving robustness and lowering the collision rate. Results on nuScenes and NAVSIM show a 3-point improvement in closed-loop performance and a 50% reduction in collision rate compared to the baseline model. The code and model are publicly available.
Abstract
End-to-end autonomous driving has recently seen rapid development,
exerting a profound influence on both industry and academia. However,
existing work focuses excessively on ego-vehicle status as its sole
learning objective and lacks planning-oriented understanding, which limits
the robustness of the overall decision-making process. In this work, we
introduce DistillDrive, an end-to-end knowledge distillation-based autonomous
driving model that leverages diversified instance imitation to enhance
multi-mode motion feature learning. Specifically, we employ a planning model
based on structured scene representations as the teacher model, leveraging its
diversified planning instances as multi-objective learning targets for the
end-to-end model. Moreover, we incorporate reinforcement learning to enhance
the optimization of state-to-decision mappings, while utilizing generative
modeling to construct planning-oriented instances, fostering intricate
interactions within the latent space. We validate our model on the nuScenes and
NAVSIM datasets, achieving a 50% reduction in collision rate and a 3-point
improvement in closed-loop performance compared to the baseline model. Code and
model are publicly available at https://github.com/YuruiAI/DistillDrive
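
The abstract does not give the loss formulation, so the following is only a minimal sketch of how teacher planning instances could act as multi-mode learning targets. It assumes the student and teacher each emit the same number of candidate trajectories, matched by index, and it combines a per-mode distillation term with a winner-takes-all imitation term against the expert trajectory. The names `distillation_loss`, `trajectory_error`, and the weight `alpha` are hypothetical and do not come from the paper or its repository; the reinforcement learning and generative modeling components are not covered.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]  # (x, y) waypoints over the planning horizon


def trajectory_error(a: Trajectory, b: Trajectory) -> float:
    """Mean squared waypoint error between two equal-length trajectories."""
    assert len(a) == len(b) and len(a) > 0
    total = sum((ax - bx) ** 2 + (ay - by) ** 2
                for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def distillation_loss(student_modes: List[Trajectory],
                      teacher_modes: List[Trajectory],
                      expert: Trajectory,
                      alpha: float = 0.5) -> float:
    """Hypothetical loss: alpha * distillation + (1 - alpha) * imitation.

    Assumption: student mode k is supervised by teacher mode k.
    """
    assert len(student_modes) == len(teacher_modes) > 0
    # Distillation term: every student mode imitates its matched teacher mode.
    distill = sum(trajectory_error(s, t)
                  for s, t in zip(student_modes, teacher_modes))
    distill /= len(student_modes)
    # Imitation term: only the student mode closest to the expert is penalized
    # (winner-takes-all), so the remaining modes stay free to be diverse.
    imitation = min(trajectory_error(s, expert) for s in student_modes)
    return alpha * distill + (1 - alpha) * imitation


# Toy example: two modes, two waypoints each.
teacher = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 1.0)]]
student = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)]]
expert = [(0.0, 0.0), (1.0, 0.0)]
loss = distillation_loss(student, teacher, expert)
```

In this toy case the student has collapsed both modes onto the expert trajectory, so the imitation term is zero and the whole loss comes from the second mode failing to match the teacher's alternative plan. That is the behavior a multi-mode distillation target is meant to discourage.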