SIMSplat: Predictive Driving Scene Editing with Language-aligned 4D Gaussian Splatting

2510.02469v1 cs.RO, cs.AI, cs.CL, cs.CV 2025-10-07
Authors:

Sung-Yeon Park, Adam Lee, Juanwu Lu, Can Cui, Luyang Jiang, Rohit Gupta, Kyungtae Han, Ahmadreza Moradipari, Ziran Wang

Abstract

Driving scene manipulation with sensor data is emerging as a promising alternative to traditional virtual driving simulators. However, existing frameworks struggle to generate realistic scenarios efficiently due to limited editing capabilities. To address these challenges, we present SIMSplat, a predictive driving scene editor with language-aligned Gaussian splatting. As a language-controlled editor, SIMSplat enables intuitive manipulation using natural language prompts. By aligning language with Gaussian-reconstructed scenes, it further supports direct querying of road objects, allowing precise and flexible editing. Our method provides detailed object-level editing, including adding new objects and modifying the trajectories of both vehicles and pedestrians, while also incorporating predictive path refinement through multi-agent motion prediction to generate realistic interactions among all agents in the scene. Experiments on the Waymo dataset demonstrate SIMSplat's extensive editing capabilities and adaptability across a wide range of scenarios. Project page: https://sungyeonparkk.github.io/simsplat/

Related articles

DreamNav: A Trajectory-Based Imaginative Framework for Zero-Shot Vision-and-Lang...

## Context Vision-and-Language Navigation in Continuous Environments (VLN-CE) is one of the key capabilities for...

2025-09-17

OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment...

## Context In recent years, the development of multimodal large language models (MLLMs) has opened up new possibilities for...

2025-09-15

OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment...

#### Context The combination of multimodal large language models (MLLMs) with broad capabilities for detection and interpretation...

2025-09-13

CorrectNav: Self-Correction Flywheel Empowers Vision-Language-Action Navigation ...

#### Context Vision-language-action (VLA) navigation modeling is widely applied in the field of developing artificial...

2025-08-16