
Published July 6, 2021 | Version v2

MPOSE2021: a Dataset for Short-time Pose-based Human Action Recognition

Affiliations:

  • 1. Politecnico di Torino
  • 2. Newcastle University

Description

MPOSE2021

MPOSE2021 is a dataset for short-time, pose-based Human Action Recognition (HAR), specifically designed for the short-time HAR task presented in [12].

MPOSE2021 is an evolution of the MPOSE dataset [1-3]. It consists of human pose data detected by OpenPose [4] and PoseNet [11] on popular HAR datasets, i.e. Weizmann [5], i3DPost [6], IXMAS [7], KTH [8], UTKinect-Action3D (RGB only) [9] and UTD-MHAD (RGB only) [10], alongside original video datasets, i.e. ISLD and ISLD-Additional-Sequences [1]. Since these datasets have heterogeneous action labels, the labels of each dataset are remapped to a common, homogeneous list of actions.

To properly use MPOSE2021 and all the functionalities developed by the authors, we recommend using the official MPOSE2021_Dataset repository; a minimal usage sketch follows.
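
As a sketch of typical usage, assuming the repository is installed as the `mpose` Python package and exposes a `MPOSE` loader class with a `get_data()` method (names taken from the repository documentation; verify the exact API against the current release):

    # Hypothetical sketch based on the MPOSE2021_Dataset README; the package
    # name, class name and arguments are assumptions to verify.
    from mpose import MPOSE

    dataset = MPOSE(pose_extractor="openpose",  # or "posenet"
                    split=1)                    # train/test split: 1, 2 or 3
    X_train, y_train, X_test, y_test = dataset.get_data()
    print(X_train.shape)  # e.g. (n_samples, 30, 13, 3) for OpenPose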


Dataset Description

The repository contains three datasets (namely 1, 2 and 3), which consist of the same data divided into different train/test splits. Each dataset contains X and y NumPy arrays for both training and testing. X has the following shape:

(number_of_samples, time_window, number_of_keypoints, x_y_p)

where

  • time_window = 30
  • number_of_keypoints = 17 (PoseNet) or 13 (OpenPose)
  • x_y_p contains the 2D keypoint coordinates (x, y) in the original video reference frame and the keypoint confidence p (p <= 1)
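
As a minimal sketch of how these arrays are indexed, assuming the X and y arrays have been extracted from the record into NumPy files (the file names below are illustrative, not the actual archive contents):

    import numpy as np

    # Illustrative file names; see the record files for the actual archives.
    X_train = np.load("X_train.npy")  # (n_samples, 30, n_keypoints, 3)
    y_train = np.load("y_train.npy")  # (n_samples,) integer action labels

    sample, frame, keypoint = 0, 0, 0
    x, y, p = X_train[sample, frame, keypoint]  # 2D coordinates + confidence
    assert X_train.shape[1] == 30   # time_window
    assert X_train.shape[-1] == 3   # (x, y, p)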

References

[1] F. Angelini, Z. Fu, Y. Long, L. Shao and S. M. Naqvi, "2D Pose-based Real-time Human Action Recognition with Occlusion-handling," in IEEE Transactions on Multimedia. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8853267&isnumber=4456689

[2] F. Angelini, J. Yan and S. M. Naqvi, "Privacy-preserving Online Human Behaviour Anomaly Detection Based on Body Movements and Objects Positions," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 8444-8448. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8683026&isnumber=8682151

[3] F. Angelini and S. M. Naqvi, "Joint RGB-Pose Based Human Action Recognition for Anomaly Detection Applications," 2019 22nd International Conference on Information Fusion (FUSION), Ottawa, ON, Canada, 2019, pp. 1-7. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9011277&isnumber=9011156

[4] Z. Cao et al., "OpenPose: realtime multi-person 2D pose estimation using Part Affinity Fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 1, pp. 172-186, 2019.

[5] L. Gorelick et al., "Actions as space-time shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 12, pp. 2247-2253, 2007.

[6] J. Starck and A. Hilton, "Surface capture for performance-based animation," IEEE Computer Graphics and Applications, vol. 27, no. 3, pp. 21-31, 2007.

[7] D. Weinland, M. Özuysal and P. Fua, "Making action recognition robust to occlusions and viewpoint changes," European Conference on Computer Vision (ECCV), Springer, Berlin, Heidelberg, 2010.

[8] C. Schuldt, I. Laptev and B. Caputo, "Recognizing human actions: a local SVM approach," Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 3, IEEE, 2004.

[9] L. Xia, C. C. Chen and J. K. Aggarwal, "View invariant human action recognition using histograms of 3D joints," 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 20-27, 2012.

[10] C. Chen, R. Jafari, and N. Kehtarnavaz. "UTD-MHAD: A Multimodal Dataset for Human Action Recognition Utilizing a Depth Camera and a Wearable Inertial Sensor". Proceedings of IEEE International Conference on Image Processing, Canada, 2015.

[11] G. Papandreou, T. Zhu, L.C. Chen, S. Gidaris, J. Tompson, K. Murphy. "PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model". Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 269-286

[12] V. Mazzia, S. Angarano, F. Salvetti, F. Angelini, M. Chiaberge. "Action Transformer: A Self-Attention Model for Short-Time Human Action Recognition". arXiv preprint (https://arxiv.org/abs/2107.00606), 2021.

