Clone Deterministic 3D Worlds with Geometrically-Regularized World Models

2510.26782v1 cs.LG, cs.AI, cs.CV 2025-11-01

Авторы:

Zaishuo Xia, Yukuan Lu, Xinyi Li, Yifan Xu, Yubei Chen

Abstract

A world model is an internal model that simulates how the world evolves. Given past observations and actions, it predicts the future of both the embodied agent and its environment. Accurate world models are essential for enabling agents to think, plan, and reason effectively in complex, dynamic settings. Despite rapid progress, current world models remain brittle and degrade over long horizons. We argue that a central cause is representation quality: exteroceptive inputs (e.g., images) are high-dimensional, and lossy or entangled latents make dynamics learning unnecessarily hard. We therefore ask whether improving representation learning alone can substantially improve world-model performance. In this work, we take a step toward building a truly accurate world model by addressing a fundamental yet open problem: constructing a model that can fully clone and overfit to a deterministic 3D world. We propose Geometrically-Regularized World Models (GRWM), which enforces that consecutive points along a natural sensory trajectory remain close in latent representation space. This approach yields significantly improved latent representations that align closely with the true topology of the environment. GRWM is plug-and-play, requires only minimal architectural modification, scales with trajectory length, and is compatible with diverse latent generative backbones. Across deterministic 3D settings and long-horizon prediction tasks, GRWM significantly increases rollout fidelity and stability. Analyses show that its benefits stem from learning a latent manifold with superior geometric structure. These findings support a clear takeaway: improving representation learning is a direct and useful path to robust world models, delivering reliable long-horizon predictions without enlarging the dynamics module.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

Clone Deterministic 3D Worlds with Geometrically-Regularized World Models

Авторы:

Abstract

Ссылки и действия

Связанные статьи

TV2TV: A Unified Framework for Interleaved Language and Video Generation

The Universal Weight Subspace Hypothesis

STeP-Diff: Spatio-Temporal Physics-Informed Diffusion Models for Mobile Fine-Gra...

Open-Set Domain Adaptation Under Background Distribution Shift: Challenges and A...

First On-Orbit Demonstration of a Geospatial Foundation Model

Навигация