Dataset Open Access

Sticky Pi -- Machine Learning Data, Configuration and Models

Quentin Geissmann

Dataset for the Machine Learning section of the Sticky Pi project (https://doc.sticky-pi.com/)

Contains the dataset for the three algorithms described in the publication: Universal Insect Detector, Siamese Insect Matcher and Insect Tuboid Classifier.

Universal Insect Detector:

`universal_insect_detector/` contains training/validation data, configuration files to train the model, and the model as trained and used for publication.

  • `data/` – a set of SVG images, each containing the embedded raw JPEG image and a set of non-intersecting polygons around the labelled insects
  • `output/`
    • `model_final.pth` – the model as trained for the publication
  • `config/`
    • `config.yaml` – the configuration file defining the hyperparameters to train the model
    • `mask_rcnn_R_101_C4_3x.yaml` – the base configuration file from which `config.yaml` is derived
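
As a rough illustration of how one annotated SVG from `data/` might be read, the sketch below pulls out the embedded JPEG bytes and collects the annotation shapes. It assumes, without confirmation from the dataset itself, that the raw image is stored as an `<image>` element whose `href` is a base64 data URI and that the polygons are `<path>` or `<polygon>` elements. The function name `read_annotated_svg` is hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def read_annotated_svg(svg_text):
    """Return (jpeg_bytes, shapes) for one annotated SVG document.

    Assumption: the image is a base64 data URI on an <image> element,
    and annotations are <path> or <polygon> elements.
    """
    root = ET.fromstring(svg_text)
    image = root.find(f".//{SVG_NS}image")
    if image is None:
        raise ValueError("no <image> element found")
    # The href may be namespaced (xlink:href) or plain, depending on the SVG writer.
    href = image.get(XLINK_HREF) or image.get("href")
    if href is None or "base64," not in href:
        raise ValueError("image is not embedded as a base64 data URI")
    jpeg_bytes = base64.b64decode(href.split("base64,", 1)[1])
    shape_tags = (SVG_NS + "path", SVG_NS + "polygon")
    shapes = [el for el in root.iter() if el.tag in shape_tags]
    return jpeg_bytes, shapes
```

The returned shapes are left as raw XML elements, since the exact path encoding used by the annotation tool is not described here.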

 

Siamese Insect Matcher:

`siamese_insect_matcher/` contains training/validation data, configuration files to train the model, and the model as trained and used for publication.

  • `data/` – a set of SVG images, each containing two embedded raw JPEG images, vertically stacked, corresponding to two frames in a series. Each predicted insect is labelled as a polygon. Insects labelled as the same instance across the two frames are grouped together (i.e. in one SVG group). The filename of each image is `<device>.<datetime_frame_1>.<datetime_frame_2>.svg`
  • `output/`
    • `model_final.pth` – the model as trained for the publication
  • `config/`
    • `config.yaml` – the configuration file defining the hyperparameters to train the model
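
The filename convention `<device>.<datetime_frame_1>.<datetime_frame_2>.svg` can be split into its fields with a few lines of Python. This is a minimal sketch that assumes the two datetime fields contain no dots. The exact datetime format is not stated here, so the fields are returned as plain strings, and the names `FramePair` and `parse_pair_filename` are hypothetical.

```python
from pathlib import PurePath
from typing import NamedTuple


class FramePair(NamedTuple):
    device: str
    datetime_frame_1: str
    datetime_frame_2: str


def parse_pair_filename(path):
    """Split '<device>.<datetime_frame_1>.<datetime_frame_2>.svg' into fields.

    Assumption: neither datetime field contains a dot, so splitting from
    the right keeps any dots inside the device name intact.
    """
    name = PurePath(path).name
    parts = name.rsplit(".", 3)
    if len(parts) != 4 or parts[3].lower() != "svg":
        raise ValueError(f"unexpected filename: {name!r}")
    return FramePair(*parts[:3])
```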

Insect Tuboid Classifier:

`insect_tuboid_classifier/` contains images of insect tuboids, a database file describing their taxonomy, a configuration file to train the model, and the model as trained and used for publication.

  • `data/`
    • `database.db`: a sqlite file with a single table `ANNOTATIONS`. The table maps a unique identifier of each tuboid (tuboid_id) to a set of manually annotated taxonomic variables.
    • A directory tree of the form: `<series_id>/<tuboid_id>/`. Each terminal directory contains:
        • `tuboid.jpg` – a jpeg image made of 224 x 224 tiles representing all the shots in a tuboid, left to right, top to bottom – might be padded with empty images
        • `metadata.txt` – a csv text file with columns:
            • parrent_image_id – <device>.<UTC_datetime>
            • X – the X coordinates of the object centroid
            • Y – the Y coordinates of the object centroid
            • scale – the scaling factor applied between the original image and the 224 x 224 tile (>1 => image was enlarged)
        • `context.jpg` – a representation of the first whole image of a series, with a box around the first tuboid shot (this is for debugging/labelling purposes)
  • `output/`
    • `model_final.pth` – the model as trained for the publication
  • `config/`
    • `config.yaml` – The configuration file defining the hyperparameters to train the model as well as the taxonomic labels
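
To make the layout of `data/` more concrete, here is a small sketch of two steps a reader might need: loading the `ANNOTATIONS` table keyed by `tuboid_id`, and computing the crop boxes of the 224 x 224 tiles in a `tuboid.jpg`, ordered left to right, top to bottom. It assumes the identifier column is literally named `tuboid_id`. The other columns are not listed here, so they are read generically, and the function names are hypothetical.

```python
import sqlite3

TILE = 224  # tile edge length in pixels, as stated for tuboid.jpg


def load_annotations(db_path):
    """Return {tuboid_id: row_as_dict} from the ANNOTATIONS table."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row  # allows access to columns by name
    try:
        rows = con.execute("SELECT * FROM ANNOTATIONS").fetchall()
    finally:
        con.close()
    return {row["tuboid_id"]: dict(row) for row in rows}


def tile_boxes(width, height, tile=TILE):
    """Return (left, upper, right, lower) boxes, left to right, top to bottom."""
    if width % tile or height % tile:
        raise ValueError("image size is not a multiple of the tile size")
    return [
        (x, y, x + tile, y + tile)
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]
```

Each box can be passed to an image library's crop routine. Trailing tiles may be empty padding, so the number of real shots is better taken from `metadata.txt` than from the tile count.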
Second version. Added data to the UID and SIM. Minor changes in the configurations.

Files (11.0 GB)

  • `insect-tuboid-classifier.zip` (6.9 GB), md5:f125654fefb6a94c5c9b1014c812344b
  • `siamese-insect-matcher.zip` (2.4 GB), md5:847e05350bf8894e6ea3877af41c8f74
  • `universal-insect-detector.zip` (1.8 GB), md5:af04c62fc6b1e8e9453a202f7af6900a
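
Given the size of these archives, it may be worth checking each download against the md5 checksums listed above. The sketch below hashes a file in chunks so that a multi-gigabyte archive does not need to fit in memory. The helper names `md5_of_file` and `verify_download` are hypothetical, and the archives are assumed to sit in the directory passed in.

```python
import hashlib
import os

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "insect-tuboid-classifier.zip": "f125654fefb6a94c5c9b1014c812344b",
    "siamese-insect-matcher.zip": "847e05350bf8894e6ea3877af41c8f74",
    "universal-insect-detector.zip": "af04c62fc6b1e8e9453a202f7af6900a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(directory, name):
    """Return True if the archive `name` in `directory` matches its checksum."""
    return md5_of_file(os.path.join(directory, name)) == EXPECTED_MD5[name]
```

For example, `verify_download(".", "universal-insect-detector.zip")` returns `False` for a truncated or corrupted download.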