Paving the Way Towards Kinematic Assessment Using Monocular Video: A Benchmark of State-of-the-Art Deep-Learning-Based 3D Human Pose Estimators Against Inertial Sensors in Daily Living Activities

Medrano-Paredes, Mario; Fernández-González, Carmen; Díaz-Pernas, Francisco-Javier; Saoudi, Hichem; González-Alonso, Javier; Martínez-Zarzuela, Mario

doi:10.5281/zenodo.15088423

Published March 26, 2025 | Version 1.0.0

Dataset Open

1. Universidad de Valladolid
2. University of Valladolid

Advances in machine learning and wearable sensors offer new opportunities for capturing and analyzing human movement outside specialized laboratories. Accurately tracking and evaluating human movement under real-world conditions is essential for telemedicine, sports science, and rehabilitation. This work introduces a comprehensive benchmark comparing deep-learning-based monocular video human pose estimation models with inertial measurement unit (IMU)-driven methods. It leverages the VIDIMU dataset, which contains 13 clinically relevant activities captured using both commodity video cameras and 5 IMUs. Joint angles derived from state-of-the-art deep learning frameworks (MotionAGFormer, MotionBERT, MMPose 2D-to-3D pose lifting, and NVIDIA BodyTrack, included in Maxine-AR-SDK) were evaluated against joint angles computed from IMU data using OpenSim inverse kinematics. A graphical comparison of the angles estimated by each model shows the overall performance for each activity.

The results, which also include the evaluation of multiple metrics (RMSE, NRMSE, MAE, correlation, and coefficient of determination) in table and plot format, highlight key trade-offs between video- and sensor-based approaches, including cost, accessibility, and precision across different daily life activities. This work establishes valuable guidelines for researchers and clinicians seeking to develop robust, cost-effective, and user-friendly solutions for telehealth and remote patient monitoring, ultimately bridging the gap between AI-driven motion capture and accessible healthcare applications.
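As an illustration of the metrics listed above, the following minimal sketch computes them for one joint-angle series estimated from video against the corresponding IMU-derived reference. It is not the authors' evaluation code. The function name `joint_angle_metrics` is hypothetical, the two series are assumed to be already time-aligned and resampled to the same length, and normalizing the RMSE by the range of the reference signal is an assumption, since the record does not state which normalization was used.

```python
import math


def joint_angle_metrics(reference, estimate):
    """Compare an estimated joint-angle series against a reference series.

    Both inputs are sequences of angles (e.g. degrees) of equal length.
    Assumption: NRMSE is the RMSE divided by the range of the reference.
    """
    ref = [float(v) for v in reference]
    est = [float(v) for v in estimate]
    if len(ref) != len(est) or len(ref) < 2:
        raise ValueError("series must have equal length of at least 2")

    n = len(ref)
    errors = [e - r for r, e in zip(ref, est)]

    # Error magnitude metrics.
    ss_res = sum(d * d for d in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(d) for d in errors) / n

    # Range-normalized RMSE (undefined for a constant reference).
    ref_range = max(ref) - min(ref)
    nrmse = rmse / ref_range if ref_range > 0 else math.nan

    # Pearson correlation between reference and estimate.
    ref_mean = sum(ref) / n
    est_mean = sum(est) / n
    ss_tot = sum((r - ref_mean) ** 2 for r in ref)
    ss_est = sum((e - est_mean) ** 2 for e in est)
    cov = sum((r - ref_mean) * (e - est_mean) for r, e in zip(ref, est))
    if ss_tot > 0 and ss_est > 0:
        pearson = cov / math.sqrt(ss_tot * ss_est)
    else:
        pearson = math.nan

    # Coefficient of determination of the estimate w.r.t. the reference.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else math.nan

    return {"rmse": rmse, "nrmse": nrmse, "mae": mae, "r": pearson, "r2": r2}
```

Note that a constant offset between the two series leaves the correlation at 1 while still raising RMSE and MAE and lowering the coefficient of determination, which is one reason to report these metrics together.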

Files (913.9 MB)

Name	MD5	Size
analysis.zip	d04bdd6edfee0865d0772756ad2ba727	81.8 MB
jointangles.zip	3f80d368163d4d0a35028ac028a9db24	459.2 MB
pose3d.zip	ac23a9092f46be8d179c85bc6ede0713	368.4 MB
results.zip	1de2d2ab1173d1304f6baebdd41a6e0d	4.5 MB
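
The MD5 checksums listed above can be used to check that each archive downloaded intact. The sketch below is one way to do this with the Python standard library. The helper names `md5_of_file` and `verify_downloads` are hypothetical, and the archives are assumed to sit in a single local directory under their original file names.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "analysis.zip": "d04bdd6edfee0865d0772756ad2ba727",
    "jointangles.zip": "3f80d368163d4d0a35028ac028a9db24",
    "pose3d.zip": "ac23a9092f46be8d179c85bc6ede0713",
    "results.zip": "1de2d2ab1173d1304f6baebdd41a6e0d",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == checksum if path.is_file() else None
    return results
```

Calling `verify_downloads("downloads")`, where `downloads` is a placeholder for the local folder, returns one entry per archive. A `False` value suggests a corrupted or incomplete download that should be fetched again.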

Additional details

Is derived from: Dataset: 10.5281/zenodo.7681316 (DOI); Journal article: arXiv:2303.16150 (arXiv); Journal article: 10.1038/s41597-023-02554-9 (DOI)

Submitted: 2025-03-26

	All versions	This version
Views	109	109
Downloads	88	88
Data volume	20.4 GB	20.4 GB