EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval
2510.18546v1
cs.RO, cs.AI
2025-10-23
Авторы:
Zebin Yang, Sunjian Zheng, Tong Xie, Tianshi Xu, Bo Yu, Fan Wang, Jie Tang, Shaoshan Liu, Meng Li
Abstract
Object-goal navigation (ObjNav) tasks an agent with navigating to the
location of a specific object in an unseen environment. Embodied agents
equipped with large language models (LLMs) and online constructed navigation
maps can perform ObjNav in a zero-shot manner. However, existing agents heavily
rely on giant LLMs on the cloud, e.g., GPT-4, while directly switching to small
LLMs, e.g., LLaMA3.2-11b, suffer from significant success rate drops due to
limited model capacity for understanding complex navigation maps, which
prevents deploying ObjNav on local devices. At the same time, the long prompt
introduced by the navigation map description will cause high planning latency
on local devices. In this paper, we propose EfficientNav to enable on-device
efficient LLM-based zero-shot ObjNav. To help the smaller LLMs better
understand the environment, we propose semantics-aware memory retrieval to
prune redundant information in navigation maps. To reduce planning latency, we
propose discrete memory caching and attention-based memory clustering to
efficiently save and re-use the KV cache. Extensive experimental results
demonstrate that EfficientNav achieves 11.1% improvement in success rate on
HM3D benchmark over GPT-4-based baselines, and demonstrates 6.7x real-time
latency reduction and 4.7x end-to-end latency reduction over GPT-4 planner. Our
code will be released soon.
Ссылки и действия
Дополнительные ресурсы: