Conference paper Open Access
Konstantinos Gkountakos;
Despoina Touska;
Konstantinos Ioannidis;
Theodora Tsikrika;
Stefanos Vrochidis;
Ioannis Kompatsiaris
This work presents a spatio-temporal activity detection and recognition framework for untrimmed surveillance videos, built as a three-step pipeline: object detection, tracking, and activity recognition. The framework uses the YOLO v4 architecture for object detection and Euclidean distance for tracking, while the activity recognizer is a 3D convolutional deep learning architecture that employs spatio-temporal boundaries and treats recognition as a multi-label classification problem. In evaluation experiments on the VIRAT dataset, the framework accurately detects temporal boundaries and recognizes activities in untrimmed videos, with multi-label activity recognition outperforming multi-class recognition.
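As an illustration of the tracking step, the following is a minimal sketch of frame-to-frame association by Euclidean distance between bounding-box centroids. It is not the authors' implementation: the function names `centroid` and `associate`, the `(x1, y1, x2, y2)` box format, the greedy matching strategy, and the `max_dist` threshold are all assumptions made for this example.

```python
import math


def centroid(box):
    """Return the center point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def associate(tracks, detections, max_dist=50.0):
    """Greedily match existing tracks to new detections by centroid distance.

    tracks:     dict mapping track id -> last known box (hypothetical format)
    detections: list of boxes detected in the current frame
    max_dist:   assumed gating threshold in pixels; pairs farther apart
                than this are never matched
    Returns a dict mapping track id -> box for the current frame.
    Tracks without a match are dropped, which is a simplification.
    """
    # Collect every track/detection pair that lies within the gate.
    pairs = []
    for track_id, track_box in tracks.items():
        for det_idx, det_box in enumerate(detections):
            dist = math.dist(centroid(track_box), centroid(det_box))
            if dist <= max_dist:
                pairs.append((dist, track_id, det_idx))

    # Accept the closest pairs first; each track and detection is used once.
    pairs.sort()
    updated = {}
    used_detections = set()
    for _, track_id, det_idx in pairs:
        if track_id in updated or det_idx in used_detections:
            continue
        updated[track_id] = detections[det_idx]
        used_detections.add(det_idx)

    # Any detection left unmatched starts a new track with a fresh id.
    next_id = max(tracks, default=-1) + 1
    for det_idx, det_box in enumerate(detections):
        if det_idx not in used_detections:
            updated[next_id] = det_box
            next_id += 1
    return updated
```

Calling `associate` once per frame, with the previous frame's result as `tracks` and the detector's boxes as `detections`, yields per-object box sequences that a downstream activity recognizer could consume. A production tracker would typically also keep unmatched tracks alive for a few frames to tolerate missed detections.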
Name | Size
---|---
3.Spatio-temporal activity detection and recognition in untrimmed surveillance videos_Zenodo.pdf (md5:966ffebedd56853d75ef4e34a553e1b8) | 1.0 MB