Paving the Way Towards Kinematic Assessment Using Monocular Video: A Preclinical Benchmark of State-of-the-Art Deep-Learning-Based 3D Human Pose Estimators Against Inertial Sensors in Daily Living Activities
2510.02264v1
cs.CV, cs.AI, cs.LG
2025-10-04
Авторы:
Mario Medrano-Paredes, Carmen Fernández-González, Francisco-Javier Díaz-Pernas, Hichem Saoudi, Javier González-Alonso, Mario Martínez-Zarzuela
Abstract
Advances in machine learning and wearable sensors offer new opportunities for
capturing and analyzing human movement outside specialized laboratories.
Accurate assessment of human movement under real-world conditions is essential
for telemedicine, sports science, and rehabilitation. This preclinical
benchmark compares monocular video-based 3D human pose estimation models with
inertial measurement units (IMUs), leveraging the VIDIMU dataset containing a
total of 13 clinically relevant daily activities which were captured using both
commodity video cameras and five IMUs. During this initial study only healthy
subjects were recorded, so results cannot be generalized to pathological
cohorts. Joint angles derived from state-of-the-art deep learning frameworks
(MotionAGFormer, MotionBERT, MMPose 2D-to-3D pose lifting, and NVIDIA
BodyTrack) were evaluated against joint angles computed from IMU data using
OpenSim inverse kinematics following the Human3.6M dataset format with 17
keypoints. Among them, MotionAGFormer demonstrated superior performance,
achieving the lowest overall RMSE ($9.27\deg \pm 4.80\deg$) and MAE ($7.86\deg
\pm 4.18\deg$), as well as the highest Pearson correlation ($0.86 \pm 0.15$)
and the highest coefficient of determination $R^{2}$ ($0.67 \pm 0.28$). The
results reveal that both technologies are viable for out-of-the-lab kinematic
assessment. However, they also highlight key trade-offs between video- and
sensor-based approaches including costs, accessibility, and precision. This
study clarifies where off-the-shelf video models already provide clinically
promising kinematics in healthy adults and where they lag behind IMU-based
estimates while establishing valuable guidelines for researchers and clinicians
seeking to develop robust, cost-effective, and user-friendly solutions for
telehealth and remote patient monitoring.
Ссылки и действия
Дополнительные ресурсы: