HyPerNav: Hybrid Perception for Object-Oriented Navigation in Unknown Environment
2510.22917v2
cs.RO, cs.AI
2025-10-29
Authors:
Zecheng Yin, Hao Zhao, Zhen Li
Abstract
Object-oriented navigation (ObjNav) enables a robot to navigate directly and
autonomously to a target object in an unknown environment. Effective
perception is critical for autonomous robots navigating such environments.
While egocentric observations from RGB-D sensors provide abundant local
information, real-time top-down maps offer valuable global context for ObjNav.
Nevertheless, the majority of existing studies focus on a single source, seldom
integrating these two complementary perceptual modalities, despite the fact
that humans naturally attend to both. Building on the rapid advancement of
Vision-Language Models (VLMs), we propose Hybrid Perception Navigation
(HyPerNav), which leverages VLMs' strong reasoning and vision-language
understanding capabilities to jointly perceive local and global information
and thereby improve the effectiveness and intelligence of navigation in
unknown environments. In both large-scale simulation evaluation and real-world
validation, our method achieves state-of-the-art performance against popular
baselines. Benefiting from the hybrid perception approach, our method captures
richer cues and finds objects more effectively by simultaneously leveraging
information from egocentric observations and the top-down map. Our ablation
study further shows that each of the two perception sources contributes to
navigation performance.