Spatio-temporal activity detection and recognition in untrimmed surveillance videos
Creators
- 1. Information Technologies Institute CERTH
Description
This work presents a spatio-temporal activity detection and recognition framework for untrimmed surveillance videos, built as a three-step pipeline: object detection, tracking, and activity recognition. The framework uses the YOLO v4 architecture for object detection and Euclidean distance for tracking. The activity recognizer uses a 3D convolutional deep learning architecture that employs spatio-temporal boundaries and treats recognition as a multi-label classification problem. Evaluation experiments on the VIRAT dataset show accurate detection of temporal boundaries and recognition of activities in untrimmed videos, with the multi-label formulation performing better than the multi-class one.
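The tracking and classification steps named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the class `CentroidTracker`, the greedy nearest-centroid matching, the `max_dist` gate, and the 0.5 sigmoid threshold are all assumptions chosen for illustration, and the detector (YOLO v4 in the paper) is replaced by hand-written boxes.

```python
import math


def centroid(box):
    """Centre (x, y) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


class CentroidTracker:
    """Hypothetical Euclidean-distance tracker (illustrative only)."""

    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist  # assumed gating distance, in pixels
        self.tracks = {}          # track_id -> last matched box
        self.next_id = 0

    def update(self, detections):
        """Associate the current frame's boxes with existing tracks."""
        # Collect every track/detection pair closer than max_dist.
        pairs = []
        for tid, tbox in self.tracks.items():
            for j, dbox in enumerate(detections):
                d = math.dist(centroid(tbox), centroid(dbox))
                if d <= self.max_dist:
                    pairs.append((d, tid, j))
        pairs.sort()  # closest pairs are matched first (greedy)

        updated, used_tracks, used_dets = {}, set(), set()
        for _, tid, j in pairs:
            if tid in used_tracks or j in used_dets:
                continue
            updated[tid] = detections[j]
            used_tracks.add(tid)
            used_dets.add(j)

        # Unmatched detections start new tracks; unmatched tracks are dropped.
        for j, dbox in enumerate(detections):
            if j not in used_dets:
                updated[self.next_id] = dbox
                self.next_id += 1

        self.tracks = updated
        return dict(updated)


def multi_label(logits, threshold=0.5):
    """Indices of all activities whose sigmoid score reaches the threshold."""
    return [i for i, z in enumerate(logits)
            if 1.0 / (1.0 + math.exp(-z)) >= threshold]


def multi_class(logits):
    """Index of the single highest-scoring activity."""
    return max(range(len(logits)), key=lambda i: logits[i])
```

Calling `update` once per frame keeps a track's identifier stable while its box centre moves less than `max_dist` between frames. The two decision functions show the difference between the formulations compared in the description: `multi_label` can report several concurrent activities for one tracked object, whereas `multi_class` always returns exactly one.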
Files
Name | Size |
---|---|
3.Spatio-temporal activity detection and recognition in untrimmed surveillance videos_Zenodo.pdf (md5:966ffebedd56853d75ef4e34a553e1b8) | 1.0 MB |
Additional details
Funding
- European Commission
- PREVISION – Prediction and Visual Intelligence for Security Information 833115
- European Commission
- CREST – Fighting Crime and TerroRism with an IoT-enabled Autonomous Platform based on an Ecosystem of Advanced IntelligEnce, Operations, and InveStigation Technologies 833464
- European Commission
- CONNEXIONs – InterCONnected NEXt-Generation Immersive IoT Platform of Crime and Terrorism DetectiON, PredictiON, InvestigatiON, and PreventiON Services 786731