MPOSE2021: a Dataset for Short-time Pose-based Human Action Recognition
Creators
- 1. Politecnico di Torino
- 2. Newcastle University
Description
MPOSE2021
MPOSE2021 is a Dataset for short-time pose-based Human Action Recognition (HAR). MPOSE2021 is specifically designed to perform short-time Human Action Recognition, as presented in [12].
MPOSE2021 is developed as an evolution of the MPOSE Dataset [1-3]. It is made by human pose data detected by OpenPose [4] and Posenet [11] on popular datasets for HAR, i.e. Weizmann [5], i3DPost [6], IXMAS [7], KTH [8], UTKinetic-Action3D (RGB only) [9] and UTD-MHAD (RGB only) [10], alongside original video datasets, i.e. ISLD and ISLD-Additional-Sequences [1]. Since these datasets have heterogenous action labels, each dataset labels is remapped to a common and homogeneous list of actions.
To properly use MPOSE2021 and all the functionalities developed by the authors, we recommend using the official repository MPOSE2021_Dataset.
Dataset Description
The repository contains 3 datasets (namely 1, 2 and 3) which consist of the same data divided in different train/test splits. Each dataset contains X and y numpy arrays for both training and testing. X has the following shape:
(number_of_samples, time_window, number_of_keypoints, x_y_p)
where
- time_window = 30
- number_of_keypoints = 17 (PoseNet) or 13 (OpenPose)
- x_y_p contains 2D keypoint coordinates (x,y) in the original video reference frame and the keypoint confidence (p <= 1)
References
[1] F. Angelini, Z. Fu, Y. Long, L. Shao and S. M. Naqvi, "2D Pose-based Real-time Human Action Recognition with Occlusion-handling," in IEEE Transactions on Multimedia. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8853267&isnumber=4456689
[2] F. Angelini, J. Yan and S. M. Naqvi, "Privacy-preserving Online Human Behaviour Anomaly Detection Based on Body Movements and Objects Positions," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 8444-8448. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8683026&isnumber=8682151
[3] F. Angelini and S. M. Naqvi, "Joint RGB-Pose Based Human Action Recognition for Anomaly Detection Applications," 2019 22th International Conference on Information Fusion (FUSION), Ottawa, ON, Canada, 2019, pp. 1-7. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9011277&isnumber=9011156
[4] Cao, Zhe, et al. "OpenPose: realtime multi-person 2D pose estimation using Part Affinity Fields." IEEE transactions on pattern analysis and machine intelligence 43.1 (2019): 172-186.
[5] Gorelick, Lena, et al. "Actions as space-time shapes." IEEE transactions on pattern analysis and machine intelligence 29.12 (2007): 2247-2253.
[6] Starck, Jonathan, and Adrian Hilton. "Surface capture for performance-based animation." IEEE computer graphics and applications 27.3 (2007): 21-31.
[7] Weinland, Daniel, Mustafa Özuysal, and Pascal Fua. "Making action recognition robust to occlusions and viewpoint changes." European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2010.
[8] Schuldt, Christian, Ivan Laptev, and Barbara Caputo. "Recognizing human actions: a local SVM approach." Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004. Vol. 3. IEEE, 2004.
[9] L. Xia, C.C. Chen and JK Aggarwal. "View invariant human action recognition using histograms of 3D joints", 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 20-27, 2012.
[10] C. Chen, R. Jafari, and N. Kehtarnavaz. "UTD-MHAD: A Multimodal Dataset for Human Action Recognition Utilizing a Depth Camera and a Wearable Inertial Sensor". Proceedings of IEEE International Conference on Image Processing, Canada, 2015.
[11] G. Papandreou, T. Zhu, L.C. Chen, S. Gidaris, J. Tompson, K. Murphy. "PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model". Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 269-286
[12] V. Mazzia, S. Angarano, F. Salvetti, F. Angelini, M. Chiaberge. "Action Transformer: A Self-Attention Model for Short-Time Human Action Recognition". arXiv preprint (https://arxiv.org/abs/2107.00606), 2021.
Files
1.txt
Files
(267.1 MB)
Name | Size | Download all |
---|---|---|
md5:890f1bd450bea4bd12091e56e414296f
|
560.2 kB | Preview Download |
md5:486ec5e9ed5fe38cd347598bd9a0da34
|
88.5 MB | Preview Download |
md5:f0b8042d4740fa79ae074edfeb2e71ac
|
560.2 kB | Preview Download |
md5:10ca989eceaabe07eb711020f64b64ab
|
88.5 MB | Preview Download |
md5:4fe999596baf8fd569a5f6cbb1a533fc
|
560.2 kB | Preview Download |
md5:b9ea73ed2bbfd88902742bc444620afe
|
88.5 MB | Preview Download |
Additional details
References
- F. Angelini, Z. Fu, Y. Long, L. Shao and S. M. Naqvi, "2D Pose-based Real-time Human Action Recognition with Occlusion-handling," in IEEE Transactions on Multimedia. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8853267&isnumber=4456689
- F. Angelini, J. Yan and S. M. Naqvi, "Privacy-preserving Online Human Behaviour Anomaly Detection Based on Body Movements and Objects Positions," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 8444-8448. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8683026&isnumber=8682151
- F. Angelini and S. M. Naqvi, "Joint RGB-Pose Based Human Action Recognition for Anomaly Detection Applications," 2019 22th International Conference on Information Fusion (FUSION), Ottawa, ON, Canada, 2019, pp. 1-7. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9011277&isnumber=9011156
- Cao, Zhe, et al. "OpenPose: realtime multi-person 2D pose estimation using Part Affinity Fields." IEEE transactions on pattern analysis and machine intelligence 43.1 (2019): 172-186
- Gorelick, Lena, et al. "Actions as space-time shapes." IEEE transactions on pattern analysis and machine intelligence 29.12 (2007): 2247-2253
- Starck, Jonathan, and Adrian Hilton. "Surface capture for performance-based animation." IEEE computer graphics and applications 27.3 (2007): 21-31
- Weinland, Daniel, Mustafa Özuysal, and Pascal Fua. "Making action recognition robust to occlusions and viewpoint changes." European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2010
- Schuldt, Christian, Ivan Laptev, and Barbara Caputo. "Recognizing human actions: a local SVM approach." Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004. Vol. 3. IEEE, 2004
- L. Xia, C.C. Chen and JK Aggarwal. "View invariant human action recognition using histograms of 3D joints", 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 20-27, 2012
- C. Chen, R. Jafari, and N. Kehtarnavaz. "UTD-MHAD: A Multimodal Dataset for Human Action Recognition Utilizing a Depth Camera and a Wearable Inertial Sensor". Proceedings of IEEE International Conference on Image Processing, Canada, 2015
- G. Papandreou, T. Zhu, L.C. Chen, S. Gidaris, J. Tompson, K. Murphy. "PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model". Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 269-286
- V. Mazzia, S. Angarano, F. Salvetti, F. Angelini, M. Chiaberge. "Action Transformer: A Self-Attention Model for Short-Time Human Action Recognition". arXiv preprint (https://arxiv.org/abs/2107.00606), 2021