Intention Enhanced Diffusion Model for Multimodal Pedestrian Trajectory Prediction

2508.04229v1 cs.CV 2025-08-09

Authors:

Yu Liu, Zhijie Liu, Xiao Ren, You-Fu Li, He Kong

Summary (translated from Russian)

Predicting pedestrian trajectories is a key task for motion planning in autonomous vehicles. Nevertheless, accurately forecasting multimodal and uncertain pedestrian trajectories remains a challenge. Recent diffusion-based models have shown promise in modeling stochastic trajectories. However, many of them do not account for pedestrians' underlying motion intentions, which can reduce the accuracy and interpretability of the models. This work proposes a diffusion model that incorporates pedestrians' motion intentions, decomposed into lateral and longitudinal components. To this end, an intention recognition module is introduced that allows the model to capture these intentions effectively. In addition, a guidance mechanism is used to make the generated trajectories interpretable. Experiments on two widely used trajectory benchmarks show that the proposed model performs competitively with the current state of the art.

Abstract

Predicting pedestrian motion trajectories is critical for path planning and motion control of autonomous vehicles. However, accurately forecasting crowd trajectories remains a challenging task due to the inherently multimodal and uncertain nature of human motion. Recent diffusion-based models have shown promising results in capturing the stochasticity of pedestrian behavior for trajectory prediction. However, few diffusion-based approaches explicitly incorporate the underlying motion intentions of pedestrians, which can limit the interpretability and precision of prediction models. In this work, we propose a diffusion-based multimodal trajectory prediction model that incorporates pedestrians' motion intentions into the prediction framework. The motion intentions are decomposed into lateral and longitudinal components, and a pedestrian intention recognition module is introduced to enable the model to effectively capture these intentions. Furthermore, we adopt an efficient guidance mechanism that facilitates the generation of interpretable trajectories. The proposed framework is evaluated on two widely used human trajectory prediction benchmarks, ETH and UCY, on which it is compared against state-of-the-art methods. The experimental results demonstrate that our method achieves competitive performance.
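As a rough illustration of what splitting a motion intention into lateral and longitudinal components could look like, the sketch below projects a pedestrian's future displacement onto the last observed heading (longitudinal) and onto its perpendicular (lateral). The function names, the heading estimate, the 0.5 m tolerance, and the label set are all hypothetical choices made for this example; the abstract does not specify how the paper defines or labels intentions.

```python
import math


def decompose_displacement(observed, future):
    """Split the future displacement into (longitudinal, lateral) parts.

    observed, future: lists of (x, y) positions in metres.
    Assumption for this sketch: heading is taken from the last observed step.
    """
    (x0, y0), (x1, y1) = observed[-2], observed[-1]
    hx, hy = x1 - x0, y1 - y0
    norm = math.hypot(hx, hy)
    if norm == 0.0:
        raise ValueError("heading is undefined for a stationary pedestrian")
    hx, hy = hx / norm, hy / norm  # unit heading vector

    # Displacement from the last observed position to the final future position.
    dx, dy = future[-1][0] - x1, future[-1][1] - y1

    longitudinal = dx * hx + dy * hy  # dot product with the heading
    lateral = hx * dy - hy * dx       # 2D cross product: positive = left of heading
    return longitudinal, lateral


def label_lateral(lateral, tol=0.5):
    """Map a lateral offset to a coarse, hypothetical intention label."""
    if lateral > tol:
        return "left"
    if lateral < -tol:
        return "right"
    return "straight"


if __name__ == "__main__":
    lon, lat = decompose_displacement([(0, 0), (1, 0)], [(2, 0), (3, 1)])
    print(lon, lat, label_lateral(lat))
```

In a learned model, labels of this kind would more plausibly be predicted by a recognition module from the observed history and then used to condition or guide the diffusion sampler, rather than computed from ground-truth futures as done here.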

