X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning
2510.19150v1
cs.CV, cs.AI, cs.LG
2025-10-24
Authors:
Yunzhe Wang, Soham Hans, Volkan Ustun
Abstract
Human team tactics emerge from each player's individual perspective and their
ability to anticipate, interpret, and adapt to teammates' intentions. While
advances in video understanding have improved the modeling of team interactions
in sports, most existing work relies on third-person broadcast views and
overlooks the synchronous, egocentric nature of multi-agent learning. We
introduce X-Ego-CS, a benchmark dataset consisting of 124 hours of gameplay
footage from 45 professional-level matches of the popular esports game
Counter-Strike 2, designed to facilitate research on multi-agent
decision-making in complex 3D environments. X-Ego-CS provides cross-egocentric
video streams that synchronously capture all players' first-person perspectives
along with state-action trajectories. Building on this resource, we propose
Cross-Ego Contrastive Learning (CECL), which aligns teammates' egocentric
visual streams to foster team-level tactical situational awareness from an
individual's perspective. We evaluate CECL on a teammate-opponent location
prediction task, demonstrating its effectiveness in enhancing an agent's
ability to infer both teammate and opponent positions from a single
first-person view using state-of-the-art video encoders. Together, X-Ego-CS and
CECL establish a foundation for cross-egocentric multi-agent benchmarking in
esports. More broadly, our work positions gameplay understanding as a testbed
for multi-agent modeling and tactical learning, with implications for
spatiotemporal reasoning and human-AI teaming in both virtual and real-world
domains. Code and dataset are available at https://github.com/HATS-ICT/x-ego.
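
The abstract does not spell out the CECL objective, so the following is only a minimal sketch of what aligning teammates' egocentric streams could look like, assuming a standard symmetric InfoNCE loss over clip embeddings. The function name `cross_ego_info_nce`, the `temperature` value, and the pairing rule (row i of each batch comes from two teammates over the same time window) are illustrative assumptions, not the authors' implementation.

```python
import math

def _normalize(v):
    # Scale a vector to unit length (assumes it is not all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _diagonal_cross_entropy(logits):
    # Mean of -log softmax(row_i)[i]: row i should select column i as its positive.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(z - m) for z in row))
        total += log_sum - row[i]
    return total / len(logits)

def cross_ego_info_nce(anchor, teammate, temperature=0.1):
    """Symmetric InfoNCE over two batches of clip embeddings (hypothetical helper).

    anchor[i] and teammate[i] are assumed to be embeddings of two teammates'
    first-person clips from the same time window (a positive pair); every
    other pairing in the batch is treated as a negative.
    """
    a = [_normalize(v) for v in anchor]
    t = [_normalize(v) for v in teammate]
    # Cosine similarity between every anchor clip and every teammate clip.
    logits = [[sum(x * y for x, y in zip(u, w)) / temperature for w in t] for u in a]
    transposed = [list(col) for col in zip(*logits)]
    # Average the anchor-to-teammate and teammate-to-anchor directions.
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(transposed))

# Toy embeddings standing in for video-encoder outputs.
views_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
views_b = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

aligned = cross_ego_info_nce(views_a, views_b)
shuffled = cross_ego_info_nce(views_a, views_b[1:] + views_b[:1])
print(aligned < shuffled)
```

Under these assumptions, correctly paired teammate views yield a lower loss than the same views with the pairing shuffled, which is the signal a contrastive objective of this kind would use to pull synchronized perspectives together in embedding space.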