Published August 20, 2024
| Version v1.0.0
PULP-Dronet v3 dataset
Description
The PULP-Dronet v3 dataset
The Himax dataset was collected at the University of Bologna and the Technology Innovation Institute using a Himax ultra-low-power, grayscale, QVGA camera mounted on a Bitcraze Crazyflie nano-drone. The dataset was used for training and testing the PULP-Dronet v3 CNN, a neural network for autonomous visual-based navigation on nano-drones. This release includes the training and testing sets described in the paper.
Resources
Code available: pulp-platform/pulp-dronet
Video: https://youtu.be/ehNlDyhsVSc
Dataset Description
We collected a dataset of 77k images for nano-drones' autonomous navigation, totaling 600 MB of data.
We used the Bitcraze Crazyflie 2.1, collecting images from the AI-deck's Himax HM01B0 monochrome camera.
The images in the PULP-Dronet v3 dataset have the following characteristics:
- Resolution: each image has a QVGA resolution of 324x244 pixels.
- Color: all images are grayscale, i.e., a single channel.
- Format: the images are stored in .jpeg format.
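As a quick sanity check when loading the dataset, the image characteristics above can be verified programmatically. This is a minimal sketch using NumPy; `validate_frame` and the dummy array are illustrative helpers, not part of the released tooling.

```python
import numpy as np

QVGA_W, QVGA_H = 324, 244  # Himax HM01B0 resolution (width x height)

def validate_frame(frame: np.ndarray) -> bool:
    """Check that an array matches the dataset's image format:
    single-channel (grayscale) QVGA, 8-bit."""
    return frame.shape == (QVGA_H, QVGA_W) and frame.dtype == np.uint8

# hypothetical stand-in for an image decoded from one of the .jpeg files
dummy = np.zeros((QVGA_H, QVGA_W), dtype=np.uint8)
print(validate_frame(dummy))  # True
```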
A human pilot manually flew the drone, collecting i) images from the grayscale QVGA Himax camera sensor of the AI-deck, ii) the gamepad's yaw-rate provided by the human pilot, normalized in the [-1;+1] range, iii) the drone's estimated state, and iv) the distance between obstacles and the drone, measured by the front-looking ToF sensor.
After the data collection, we labeled all the images with a binary collision label: 1 whenever an obstacle was in the line of sight and closer than 2 m, 0 otherwise. We recorded 301 sequences in 20 different environments. Each sequence of data is labeled with high-level characteristics, listed in characteristics.json (described in the Dataset Labels section below).
For training our CNNs, we augmented the training images by applying random cropping, flipping, brightness augmentation, vignetting, and blur. The resulting dataset has 157k images, split as follows: 110k, 7k, 15k images for training, validation, and testing, respectively.
To address the labels' bias towards the center of the [-1;+1] yaw-rate range in our testing dataset, we balanced the dataset by selectively removing a portion of images that had a yaw-rate of 0. Specifically, we removed (only from the test set) some images having yaw_rate==0 and collision==1.
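The test-set balancing step described above can be sketched with pandas. The column names follow labels_partitioned.csv as documented below; the toy rows and the drop fraction are hypothetical, since the exact fraction removed is not stated here.

```python
import pandas as pd

# toy stand-in for a few labels_partitioned.csv rows of the test set
df = pd.DataFrame({
    "filename": ["a.jpeg", "b.jpeg", "c.jpeg", "d.jpeg"],
    "label_yaw_rate": [0.0, 0.0, 0.3, 0.0],
    "label_collision": [1, 0, 1, 1],
    "partition": ["test"] * 4,
})

# select the over-represented rows: yaw_rate == 0 and collision == 1
mask = (df["label_yaw_rate"] == 0) & (df["label_collision"] == 1)

drop_frac = 0.5  # hypothetical fraction; tune to the desired label balance
to_drop = df[mask].sample(frac=drop_frac, random_state=0).index
balanced = df.drop(to_drop)  # fewer zero-yaw collision rows, others untouched
```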
Dataset | Train Images | Validation Images | Test Images | Total |
---|---|---|---|---|
PULP-Dronet v3 | 53,830 | 7,798 | 15,790 | 77,418 |
PULP-Dronet v3 testing | 53,830 | 7,798 | 3,071 | 64,699 |
PULP-Dronet v3 training | 110,138 | 15,812 | 31,744 | 157,694 |
We use `PULP-Dronet v3 training` for training and `PULP-Dronet v3 testing` for validation/testing. This is the final split:
Dataset | Train Images | Validation Images | Test Images | Total |
---|---|---|---|---|
Final | 110,138 | 7,798 | 3,071 | 121,007 |
Notes:
- `PULP-Dronet v3` and `PULP-Dronet v3 testing` datasets: Images are in full QVGA resolution (324x244px), uncropped.
- `PULP-Dronet v3 training` dataset: Images are cropped to 200x200px, matching the PULP-Dronet input resolution. Cropping was done randomly on the full-resolution images to create variations.
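The random cropping used to produce the training-set images can be sketched as follows. This is an illustrative reimplementation with NumPy, not the repository's actual augmentation code.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img: np.ndarray, size: int = 200) -> np.ndarray:
    """Take a random size x size crop from a full-resolution
    244x324 frame, matching the PULP-Dronet input resolution."""
    h, w = img.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

frame = np.zeros((244, 324), dtype=np.uint8)  # full-resolution QVGA frame
crop = random_crop(frame)
print(crop.shape)  # (200, 200)
```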
Dataset Structure
.
└── Dataset_PULP_Dronet_v3_*/
    ├── ETH finetuning/
    │   ├── acquisition1/
    │   │   ├── characteristics.json          # metadata
    │   │   ├── images/                       # images folder
    │   │   ├── labels_partitioned.csv        # labels for PULP-Dronet
    │   │   └── state_labels_DroneState.csv   # raw data from the Crazyflie
    │   ...
    │   └── acquisition39/
    ├── Lorenzo Bellone/
    │   ├── acquisition1/
    │   ...
    │   └── acquisition19/
    ├── Lorenzo Lamberti/
    │   ├── dataset-session1/
    │   │   ├── acquisition1/
    │   │   ...
    │   │   └── acquisition29/
    │   ├── dataset-session2/
    │   │   ├── acquisition1/
    │   │   ...
    │   │   └── acquisition55/
    │   ├── dataset-session3/
    │   │   ├── acquisition1/
    │   │   ...
    │   │   └── acquisition65/
    │   └── dataset-session4/
    │       ├── acquisition1/
    │       ...
    │       └── acquisition51/
    ├── Michal Barcis/
    │   ├── acquisition1/
    │   ...
    │   └── acquisition18/
    └── TII finetuning/
        ├── dataset-session1/
        │   ├── acquisition1/
        │   ...
        │   └── acquisition18/
        └── dataset-session2/
            ├── acquisition1/
            ...
            └── acquisition39/
This structure applies to all three sets mentioned above: `PULP_Dronet_v3`, `PULP_Dronet_v3_training`, and `PULP_Dronet_v3_testing`.
Dataset Labels
1. labels_partitioned.csv
The file contains metadata for the PULP-Dronet v3 image dataset.
The file includes the following columns:
- filename: The name of the image file (e.g., 25153.jpeg).
- label_yaw_rate: The yaw-rate label, representing the rotational velocity. Values are in the [-1, +1] range, where a yaw rate > 0 means a counter-clockwise turn (turn left) and a yaw rate < 0 means a clockwise turn (turn right).
- label_collision: The collision label, in range [0,1]. 0 denotes no collision and 1 indicates a collision.
- partition: The dataset partition, i.e., train, test, or valid.
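Loading a labels file and grouping it by partition can be done with pandas. The toy rows below stand in for an actual labels_partitioned.csv (in practice you would use `pd.read_csv` on each acquisition's file).

```python
import pandas as pd

# toy rows mimicking labels_partitioned.csv (columns as documented above)
labels = pd.DataFrame({
    "filename": ["25153.jpeg", "25154.jpeg", "25155.jpeg"],
    "label_yaw_rate": [-0.4, 0.0, 0.8],  # [-1, +1]; > 0 means turn left
    "label_collision": [1, 0, 0],        # 1 = obstacle closer than 2 m
    "partition": ["train", "valid", "test"],
})

# split the rows into one DataFrame per partition
splits = {name: group for name, group in labels.groupby("partition")}
print(sorted(splits))  # ['test', 'train', 'valid']
```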
2. characteristics.json
The file contains metadata describing each acquisition sequence. It can be useful for filtering the dataset on specific characteristics, or for partitioning the image types evenly:
- scenario (i.e., indoor or outdoor);
- path type (i.e., presence or absence of turns);
- obstacle types (e.g., pedestrians, chairs);
- flight height (i.e., 0.5, 1.0, 1.5 m);
- behaviour in presence of obstacles (i.e., overpassing, stand still, n/a);
- light conditions (dark, normal, bright, mixed);
- a location name identifier;
- acquisition date.
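Filtering acquisitions on these characteristics can be sketched as below. The JSON keys shown are hypothetical illustrations of the fields listed above; the actual key names in the released characteristics.json files may differ.

```python
import json

# hypothetical characteristics.json content; real key spellings may differ
meta = json.loads("""{
    "scenario": "indoor",
    "path_type": "turns",
    "obstacles": ["pedestrians", "chairs"],
    "flight_height": 1.0,
    "behaviour": "overpassing",
    "light": "normal",
    "location": "lab-A",
    "date": "2024-07-01"
}""")

def matches(meta: dict, **criteria) -> bool:
    """Keep an acquisition only if every requested field matches."""
    return all(meta.get(key) == value for key, value in criteria.items())

print(matches(meta, scenario="indoor", light="normal"))  # True
```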
3. labeled_images.csv
The same as labels_partitioned.csv, but without the partition column. You can use this file to redo the partitioning into train, valid, and test sets.
4. state_labels_DroneState.csv
This is the raw data logged from the Crazyflie at ~100 samples/s.
The file includes the following columns:
- timeTicks: The timestamp.
- range.front: The distance measurement from the front VL53L1x ToF sensor [mm].
- mRange.rangeStatusFront: The status code of the front range sensor (see the VL53L1x datasheet for details).
- controller.yawRate: The yaw rate command given by the human pilot (in radians per second).
- ctrltarget.yaw: The target yaw rate set by the control system (in radians per second).
- stateEstimateZ.rateYaw: The estimated yaw rate from the drone's state estimator (in radians per second).
Data Processing Workflow
You can find the scripts at pulp-platform/pulp-dronet
dataset_processing.py:
- Input: state_labels_DroneState.csv
- Output: labeled_images.csv
- Function: matches each image's timestamp (~10 Hz) to the closest drone-state timestamp (~100 Hz), discarding the extra drone states.
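The timestamp-matching step can be sketched with NumPy: for each image, pick the state sample with the nearest timestamp and discard the rest. The toy timestamps below are illustrative, not taken from the actual logs.

```python
import numpy as np

# toy timestamps in ms: drone-state log at ~100 Hz, camera at ~10 Hz
state_t = np.arange(0, 1000, 10)   # 100 state samples
image_t = np.arange(5, 1000, 100)  # 10 image frames

# for each image timestamp, the index of the closest state timestamp
idx = np.abs(state_t[None, :] - image_t[:, None]).argmin(axis=1)
matched = state_t[idx]  # one state row per image; the rest are discarded
print(list(matched[:3]))  # nearest state timestamps: 0, 100, 200
```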
dataset_partitioning.py:
- Input: labeled_images.csv
- Output: labels_partitioned.csv
- Function: Partitions the labeled images into training, validation, and test sets.
License
We release this dataset as open source under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license; see LICENSE.CC.md.
Files (2.3 GB)
- Dataset_PULP_Dronet_v3.zip
- md5:3aefdd526ab291aaa2ac01c0b4c0be47, 434.4 MB
- md5:e2fd204c258e246f55b4a20832a9317e, 1.9 GB
Additional details
Dates
- Updated: 2024-07
Software
- Repository URL: https://github.com/pulp-platform/pulp-dronet
- Programming language: Python
- Development Status: Active