GraphGeo: Multi-Agent Debate Framework for Visual Geo-localization with Heterogeneous Graph Neural Networks
2511.00908v1
cs.CV, cs.GR
2025-11-07
Авторы:
Heng Zheng, Yuling Shi, Xiaodong Gu, Haochen You, Zijian Zhang, Lubin Gan, Hao Zhang, Wenjun Huang, Jin Huang
Abstract
Visual geo-localization requires extensive geographic knowledge and
sophisticated reasoning to determine image locations without GPS metadata.
Traditional retrieval methods are constrained by database coverage and quality.
Recent Large Vision-Language Models (LVLMs) enable direct location reasoning
from image content, yet individual models struggle with diverse geographic
regions and complex scenes. Existing multi-agent systems improve performance
through model collaboration but treat all agent interactions uniformly. They
lack mechanisms to handle conflicting predictions effectively. We propose
\textbf{GraphGeo}, a multi-agent debate framework using heterogeneous graph
neural networks for visual geo-localization. Our approach models diverse debate
relationships through typed edges, distinguishing supportive collaboration,
competitive argumentation, and knowledge transfer. We introduce a dual-level
debate mechanism combining node-level refinement and edge-level argumentation
modeling. A cross-level topology refinement strategy enables co-evolution
between graph structure and agent representations. Experiments on multiple
benchmarks demonstrate GraphGeo significantly outperforms state-of-the-art
methods. Our framework transforms cognitive conflicts between agents into
enhanced geo-localization accuracy through structured debate.
Ссылки и действия
Дополнительные ресурсы: