
Conference paper Open Access

Spatio-temporal activity detection and recognition in untrimmed surveillance videos

Konstantinos Gkountakos; Despoina Touska; Konstantinos Ioannidis; Theodora Tsikrika; Stefanos Vrochidis; Ioannis Kompatsiaris

This work presents a spatio-temporal activity detection and recognition framework for untrimmed surveillance videos, built as a three-step pipeline: object detection, tracking, and activity recognition. The framework uses the YOLO v4 architecture for object detection and Euclidean distance for tracking; the activity recognizer is a 3D convolutional deep learning architecture that employs spatio-temporal boundaries and treats recognition as a multi-label classification problem. In evaluation experiments on the VIRAT dataset, the framework accurately detects the temporal boundaries of activities and recognizes them in untrimmed videos, with the multi-label formulation performing better than the multi-class one.
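To make the tracking step concrete, the following is a minimal sketch of Euclidean-distance tracking between consecutive frames. It is an illustration only, not the authors' implementation: the function names (`centroid`, `update_tracks`), the `(x1, y1, x2, y2)` box format, the greedy matching strategy, and the `max_dist` threshold are all assumptions introduced here, and the abstract does not specify any of them.

```python
import math


def centroid(box):
    """Center (x, y) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def update_tracks(tracks, boxes, next_id, max_dist=50.0):
    """Match detections in the current frame to existing tracks.

    tracks:   dict mapping track id -> last known centroid
    boxes:    detections in the current frame, as (x1, y1, x2, y2)
    next_id:  first unused track id
    max_dist: hypothetical gating threshold in pixels (an assumption)

    Returns (updated_tracks, ids_per_box, next_id). Tracks with no
    matching detection are dropped, which is a simplification.
    """
    # Every (distance, track id, box index) candidate, closest first.
    pairs = sorted(
        (math.dist(c, centroid(b)), tid, i)
        for tid, c in tracks.items()
        for i, b in enumerate(boxes)
    )

    assigned = {}       # box index -> track id
    used = set()        # track ids already matched
    for dist, tid, i in pairs:
        if dist > max_dist:
            break       # all remaining pairs are even farther apart
        if tid in used or i in assigned:
            continue
        assigned[i] = tid
        used.add(tid)

    updated, ids = {}, []
    for i, box in enumerate(boxes):
        tid = assigned.get(i)
        if tid is None:  # no track close enough: start a new one
            tid = next_id
            next_id += 1
        updated[tid] = centroid(box)
        ids.append(tid)
    return updated, ids, next_id
```

In a full pipeline of the kind described, `boxes` would come from the object detector for each frame, and the per-track box sequences would then be passed to the activity recognizer. Greedy nearest-neighbour matching is used here for brevity; an optimal assignment method could be substituted.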

                   All versions   This version
Views              199            199
Downloads          155            155
Data volume        156.3 MB       156.3 MB
Unique views       179            179
Unique downloads   144            144

