Spatio-Temporal Vehicle Detection Dataset (STVD)
Description
This dataset is designed for spatiotemporal object detection. It is constructed from several aerial video clips of traffic on different road segments in Nicosia, Cyprus, captured using UAVs, whereas other datasets typically cover single areas in low-resolution satellite images.
By compiling multiple sequences of images extracted from these videos, the dataset accumulates a corpus of 6,600 frames. It covers 3 classes: ‘car’, ‘truck’ and ‘bus’, with 81,165, 1,541 and 1,625 annotated instances respectively when only the even-frame annotations are used; these counts approximately double when the entire dataset is considered. The classes are not balanced: cars account for roughly 96% of the even-frame annotations, far outnumbering trucks and buses, as in a regular transportation network. This imbalance is an additional challenge that mirrors real-world applications.
The images have Full-HD resolution, with object sizes ranging from approximately 20x20 to 150x150 pixels. The dataset was prepared in the YOLO format and split into 80% for training and the remaining 20% for validation.
The importance of such a dataset lies in its ability to capture both spatial and temporal nuances. We note which frames belong to the same continuous sequence; as such, the dataset can potentially be used to develop approaches that operate on multiple sequential frames for object detection, by sampling a number of frames from the same sequence.
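Sampling several frames from the same sequence can be sketched as a sliding window over each sequence. This is a minimal illustration only: the `sliding_windows` helper, the sequence identifiers and the file names are hypothetical, and it assumes the frames have already been grouped by sequence and sorted in temporal order.

```python
from typing import Dict, Iterator, List, Tuple


def sliding_windows(
    sequences: Dict[str, List[str]],
    window: int = 3,
    stride: int = 1,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (sequence_id, frames) clips of `window` consecutive frames.

    Clips never cross a sequence boundary, so each clip is temporally coherent.
    """
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")
    for seq_id, frames in sequences.items():
        # Sequences shorter than the window produce no clips.
        for start in range(0, len(frames) - window + 1, stride):
            yield seq_id, frames[start:start + window]


# Hypothetical sequence ids and file names, for illustration only.
demo = {
    "seq_a": ["a_000.png", "a_001.png", "a_002.png", "a_003.png"],
    "seq_b": ["b_000.png", "b_001.png"],
}
clips = list(sliding_windows(demo, window=3))
```

Keeping the grouping explicit makes it straightforward to keep whole sequences on one side of a train/validation split if temporal leakage is a concern.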
Dataset Feature | Description |
---|---|
Total Images | ~6,600 |
Image Size | 1920x1080 |
Classes | Car, Bus, Truck |
Data Collection | Collected from UAVs at different locations in Nicosia, Cyprus |
Data Format | PNG |
Labelling Format | YOLO |
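As a rough sketch of how the YOLO labels might be read, the snippet below converts label lines to pixel-space boxes for a 1920x1080 frame. It assumes the common YOLO text layout (`class x_center y_center width height`, all normalized to [0, 1]). The `parse_yolo_label` helper and the class-index order in `CLASS_NAMES` are assumptions and should be checked against the class list shipped with the dataset.

```python
from typing import List, Tuple

IMG_W, IMG_H = 1920, 1080  # Full-HD frame size stated in the table above

# ASSUMPTION: class index order; verify against the dataset's own class list.
CLASS_NAMES = ["car", "bus", "truck"]

Box = Tuple[int, float, float, float, float]  # (class_id, x_min, y_min, x_max, y_max)


def parse_yolo_label(text: str, img_w: int = IMG_W, img_h: int = IMG_H) -> List[Box]:
    """Convert normalized YOLO label lines into pixel-space corner boxes."""
    boxes: List[Box] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        class_id = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        boxes.append((
            class_id,
            (xc - w / 2) * img_w,
            (yc - h / 2) * img_h,
            (xc + w / 2) * img_w,
            (yc + h / 2) * img_h,
        ))
    return boxes


# Made-up label line, for illustration only.
sample_boxes = parse_yolo_label("0 0.5 0.5 0.05 0.1\n")
```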
Files
stvd-dataset_not_split.zip
Total size: 4.3 GB
Checksum | Size |
---|---|
md5:f7f257f077e69980cc638e5b190d5e60 | 2.2 GB |
md5:cd983ce6898165a4c29d73112ed8354f | 2.2 GB |
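The MD5 checksums listed above can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib`. The helper names are hypothetical, and because the listing does not preserve which checksum belongs to which file, the check only tests membership in the set of listed digests.

```python
import hashlib
import io
from typing import BinaryIO

# Digests copied from the file listing above (without the "md5:" prefix).
LISTED_MD5 = {
    "f7f257f077e69980cc638e5b190d5e60",
    "cd983ce6898165a4c29d73112ed8354f",
}


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so multi-GB archives need not fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def is_listed_download(path: str) -> bool:
    """Return True if the file at `path` hashes to one of the listed digests."""
    with open(path, "rb") as fh:
        return md5_of_stream(fh) in LISTED_MD5


# Self-check on an in-memory stream; no dataset file is needed for this line.
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
```

Calling `is_listed_download` on a downloaded archive (the path is whatever name the file was saved under) should return `True` for an intact copy.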