ESCORT: Efficient Stein-variational and Sliced Consistency-Optimized Temporal Belief Representation for POMDPs
2510.21107v1
cs.LG, cs.AI, cs.RO
2025-10-28
Authors:
Yunuo Zhang, Baiting Luo, Ayan Mukhopadhyay, Gabor Karsai, Abhishek Dubey
Abstract
In Partially Observable Markov Decision Processes (POMDPs), maintaining and
updating belief distributions over possible underlying states provides a
principled way to summarize action-observation history for effective
decision-making under uncertainty. As environments grow more realistic, belief
distributions develop complexity that standard mathematical models cannot
accurately capture, creating a fundamental challenge in maintaining
representational accuracy. Despite advances in deep learning and probabilistic
modeling, existing POMDP belief approximation methods fail to accurately
represent complex uncertainty structures such as high-dimensional, multi-modal
belief distributions, resulting in estimation errors that lead to suboptimal
agent behaviors. To address this challenge, we present ESCORT (Efficient
Stein-variational and sliced Consistency-Optimized Representation for Temporal
beliefs), a particle-based framework for capturing complex, multi-modal
distributions in high-dimensional belief spaces. ESCORT extends Stein Variational Gradient Descent (SVGD) with two
key innovations: correlation-aware projections that model dependencies between
state dimensions, and temporal consistency constraints that stabilize updates
while preserving correlation structures. This approach retains SVGD's
attractive-repulsive particle dynamics while enabling accurate modeling of
intricate correlation patterns. Unlike particle filters prone to degeneracy or
parametric methods with fixed representational capacity, ESCORT dynamically
adapts to belief landscape complexity without resampling or restrictive
distributional assumptions. We demonstrate ESCORT's effectiveness through
extensive evaluations on both POMDP domains and synthetic multi-modal
distributions of varying dimensionality, where it consistently outperforms
state-of-the-art methods in terms of belief approximation accuracy and
downstream decision quality.
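
For readers unfamiliar with the attractive-repulsive particle dynamics mentioned above, the following is a minimal sketch of one plain SVGD update in NumPy. It is not ESCORT: it omits the correlation-aware projections and the temporal consistency constraints, and it is not taken from the paper. The target (an equal-weight two-mode Gaussian mixture), the RBF kernel with a median-heuristic bandwidth, and the names `grad_log_mixture` and `svgd_step` are all assumptions made for illustration.

```python
import numpy as np


def grad_log_mixture(x, means, sigma=1.0):
    """Score (gradient of log density) of an equal-weight isotropic Gaussian mixture.

    x: (n, d) particles, means: (m, d) mode centers. Illustrative target only.
    """
    diff = means[None, :, :] - x[:, None, :]            # (n, m, d), mu_k - x_i
    logw = -0.5 * np.sum(diff ** 2, axis=2) / sigma ** 2
    logw -= logw.max(axis=1, keepdims=True)             # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)                   # responsibilities r_ik
    return np.einsum("nm,nmd->nd", w, diff) / sigma ** 2


def svgd_step(x, score, step=0.1):
    """One vanilla SVGD update with an RBF kernel k(x, y) = exp(-||x - y||^2 / h)."""
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]                # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=2)                      # pairwise squared distances
    h = np.median(sq) / np.log(n + 1.0) + 1e-8          # median-heuristic bandwidth
    k = np.exp(-sq / h)                                 # symmetric kernel matrix
    attract = k @ score                                 # pulls particles toward high density
    repel = (2.0 / h) * np.sum(k[:, :, None] * diff, axis=1)  # pushes particles apart
    return x + step * (attract + repel) / n


rng = np.random.default_rng(0)
means = np.array([[-3.0, 0.0], [3.0, 0.0]])             # two assumed belief modes
particles = rng.normal(size=(100, 2))                   # initial particle set
for _ in range(500):
    particles = svgd_step(particles, grad_log_mixture(particles, means))
```

The attractive term moves each particle along a kernel-weighted average of the score, while the repulsive term keeps particles from collapsing onto a single mode, which is why no resampling step appears in the loop.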