Combo-Gait: Unified Transformer Framework for Multi-Modal Gait Recognition and Attribute Analysis
2510.10417v1
cs.CV, cs.AI, cs.LG
2025-10-15
Authors:
Zhao-Yang Wang, Zhimin Shao, Jieneng Chen, Rama Chellappa
Abstract
Gait recognition is an important biometric for human identification at a
distance, particularly in low-resolution or unconstrained environments.
Existing work typically focuses on either 2D representations (e.g., silhouettes
and skeletons) or 3D representations (e.g., meshes and SMPLs), but relying on a
single modality often fails to capture the full geometric and dynamic
complexity of human walking patterns. In this paper, we propose a multi-modal
and multitask framework that combines 2D temporal silhouettes with 3D SMPL
features for robust gait analysis. Beyond identification, we introduce a
multitask learning strategy that jointly performs gait recognition and human
attribute estimation, including age, body mass index (BMI), and gender. A
unified transformer is employed to effectively fuse multi-modal gait features
and better learn attribute-related representations, while preserving
discriminative identity cues. Extensive experiments on the large-scale BRIAR
datasets, collected under challenging conditions such as long-range distances
(up to 1 km) and extreme pitch angles (up to 50°), demonstrate that our
approach outperforms state-of-the-art methods in gait recognition and provides
accurate human attribute estimation. These results highlight the promise of
multi-modal and multitask learning for advancing gait-based human understanding
in real-world scenarios.
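
As a rough illustration of the multitask setup described above, the following sketch combines an identity loss with age, BMI, and gender losses into one training objective. The abstract does not state which loss functions or task weights are used, so the choices here (softmax cross-entropy for identity, squared error for age and BMI, sigmoid cross-entropy for gender, and the default weights) are assumptions, and the name multitask_loss is hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample; `target` is the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def binary_cross_entropy(logit, label):
    """Numerically stable sigmoid cross-entropy; `label` is 0 or 1."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def squared_error(pred, target):
    """Squared error for a scalar regression target."""
    return (pred - target) ** 2


def multitask_loss(id_logits, id_label, age_pred, age, bmi_pred, bmi,
                   gender_logit, gender, weights=(1.0, 0.1, 0.1, 0.1)):
    """Weighted sum of identity, age, BMI, and gender losses.

    The weights are illustrative assumptions, not values from the paper.
    """
    w_id, w_age, w_bmi, w_gender = weights
    return (w_id * cross_entropy(id_logits, id_label)
            + w_age * squared_error(age_pred, age)
            + w_bmi * squared_error(bmi_pred, bmi)
            + w_gender * binary_cross_entropy(gender_logit, gender))
```

In a setup like this, the identity weight would typically be kept largest so that the attribute terms act as auxiliary signals rather than overriding the discriminative identity cues; the actual balance would need to be tuned.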