There is a newer version of this record available.

Dataset Open Access

Sticky Pi -- Machine Learning Data, Configuration and Models

Quentin Geissmann


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/2de26f9c-875f-4ebe-86f1-bcd66475c977/insect-tuboid-classifier.zip"
      }, 
      "checksum": "md5:f125654fefb6a94c5c9b1014c812344b", 
      "bucket": "2de26f9c-875f-4ebe-86f1-bcd66475c977", 
      "key": "insect-tuboid-classifier.zip", 
      "type": "zip", 
      "size": 6860304663
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/2de26f9c-875f-4ebe-86f1-bcd66475c977/siamese-insect-matcher.zip"
      }, 
      "checksum": "md5:edf7b5fa94bd074e5e52284e96510c0e", 
      "bucket": "2de26f9c-875f-4ebe-86f1-bcd66475c977", 
      "key": "siamese-insect-matcher.zip", 
      "type": "zip", 
      "size": 1093649148
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/2de26f9c-875f-4ebe-86f1-bcd66475c977/universal-insect-detector.zip"
      }, 
      "checksum": "md5:2621daac4341ea3f7777c2b84e1c8568", 
      "bucket": "2de26f9c-875f-4ebe-86f1-bcd66475c977", 
      "key": "universal-insect-detector.zip", 
      "type": "zip", 
      "size": 1058469613
    }
  ], 
  "owners": [
    32777
  ], 
  "doi": "10.5281/zenodo.4680119", 
  "stats": {
    "version_unique_downloads": 162.0, 
    "unique_views": 118.0, 
    "views": 128.0, 
    "version_views": 208.0, 
    "unique_downloads": 38.0, 
    "version_unique_views": 185.0, 
    "volume": 435849455333.0, 
    "version_downloads": 505.0, 
    "downloads": 276.0, 
    "version_volume": 1133649261944.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.4680119", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.4680118", 
    "bucket": "https://zenodo.org/api/files/2de26f9c-875f-4ebe-86f1-bcd66475c977", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.4680118.svg", 
    "html": "https://zenodo.org/record/4680119", 
    "latest_html": "https://zenodo.org/record/6382496", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.4680119.svg", 
    "latest": "https://zenodo.org/api/records/6382496"
  }, 
  "conceptdoi": "10.5281/zenodo.4680118", 
  "created": "2021-08-09T20:14:42.883746+00:00", 
  "updated": "2022-03-24T15:55:33.803717+00:00", 
  "conceptrecid": "4680118", 
  "revision": 3, 
  "id": 4680119, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.4680119", 
    "description": "<p><strong>Dataset for the Machine Learning section of the Sticky Pi project (https://doc.sticky-pi.com/)</strong></p>\n\n<p>Contains the dataset for the three algorithms described in the publication: Universal Insect Detector, Siamese Insect Matcher and Insect Tuboid Classifier.</p>\n\n<p><strong>Universal Insect Detector:</strong></p>\n\n<p>`universal_insect_detector/` contains training/validation data, configuration files to train the model, and the model as trained and used for publication.</p>\n\n<ul>\n\t<li>`data/` &ndash; A set of svg images that contain the embedded jpg raw image, and a set of non-intersecting polygon around the labelled insects</li>\n\t<li>`output/`\n\t<ul>\n\t\t<li>`model_final.pth` &ndash; the model as trained for the publication</li>\n\t</ul>\n\t</li>\n\t<li>`config/`\n\t<ul>\n\t\t<li>`config.yaml` &ndash; The configuration file defining the hyperparameters to train the model as well as the taxonomic labels</li>\n\t\t<li>`config.yaml `&ndash; The configuration file defining the hyperparameters to train the model</li>\n\t\t<li>`mask_rcnn_R_101_C4_3x.yaml` &ndash; the base configuration file from which config is derived</li>\n\t</ul>\n\t</li>\n</ul>\n\n<p>&nbsp;</p>\n\n<p><strong>Siamese Insect Matcher</strong></p>\n\n<p>`siamese_insect_matcher/` contains training/validation data, configuration files to train the model, and the model as trained and used for publication.</p>\n\n<ul>\n\t<li>`data/` &ndash; a set of svg images that contain two embedded jpg raw images vertically stacked corresponding to two frames in a series. Each predicted insect is labelled as a polygon. Insects that are labelled as the same instance, between the two frames, are grouped (i.e. SVG group). 
The filename of each image is `&lt;device&gt;.&lt;datetime_frame_1&gt;.&lt;datetime_frame_2&gt;.svg`</li>\n\t<li>`output/`\n\t<ul>\n\t\t<li>`model_final.pth` &ndash; the model as trained for the publication</li>\n\t</ul>\n\t</li>\n\t<li>`config/`\n\t<ul>\n\t\t<li>`config.yaml` &ndash; The configuration file defining the hyperparameters to train the model as well as the taxonomic labels</li>\n\t\t<li>`config.yaml` &ndash; The configuration file defining the hyperparameters to train the model</li>\n\t</ul>\n\t</li>\n</ul>\n\n<p>&nbsp;</p>\n\n<p><strong>Insect Tuboid Classifier:</strong></p>\n\n<p>`insect_tuboid_classifier/` contains images of insect tuboid, a database file describing their taxonomy, a configuration file to train the model, and the model as trained and used for publication.</p>\n\n<ul>\n\t<li>`data/`\n\t<ul>\n\t\t<li>`database.db`: a sqlite file with a single table `ANNOTATIONS`. The table maps a unique identifier of each tuboid (tuboid_id) to a set of manually annotated taxonomic variables.</li>\n\t\t<li>A directory tree of the form: `&lt;series_id&gt;/&lt;tuboid_id&gt;/`. 
Each terminal directory contains:\n\t\t<ul>\n\t\t\t<li>\n\t\t\t<ul>\n\t\t\t\t<li>`tuboid.jpg` &ndash; a jpeg image made of 224 x 224 tiles representing all the shots in a tuboid, left to right, top to bottom &ndash; might be padded with empty images</li>\n\t\t\t\t<li>`metadata.txt` &ndash; a csv text file with columns:\n\t\t\t\t<ul>\n\t\t\t\t\t<li>\n\t\t\t\t\t<ul>\n\t\t\t\t\t\t<li>parrent_image_id &ndash; &lt;device&gt;.&lt;UTC_datetime&gt;</li>\n\t\t\t\t\t\t<li>X &ndash; the X coordinates of the object centroid</li>\n\t\t\t\t\t\t<li>Y &ndash; the Y coordinates of the object centroid</li>\n\t\t\t\t\t</ul>\n\t\t\t\t\t</li>\n\t\t\t\t</ul>\n\t\t\t\t</li>\n\t\t\t\t<li>scale &ndash; The scaling factor applied between the original and image and the 224 x 224 tile (&gt;1 =&gt; image was enlarged)</li>\n\t\t\t\t<li>`context.jpg` &ndash; a representation of the first whole image of a series, with a box around the first tuboid shot (this is for debugging/labelling purposes)</li>\n\t\t\t</ul>\n\t\t\t</li>\n\t\t</ul>\n\t\t</li>\n\t</ul>\n\t</li>\n\t<li>`output/`\n\t<ul>\n\t\t<li>`model_final.pth` &ndash; the model as trained for the publication</li>\n\t</ul>\n\t</li>\n\t<li>config/\n\t<ul>\n\t\t<li>`config.yaml` &ndash; The configuration file defining the hyperparameters to train the model as well as the taxonomic labels</li>\n\t</ul>\n\t</li>\n</ul>", 
    "language": "eng", 
    "title": "Sticky Pi -- Machine Learning Data, Configuration and Models", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "relations": {
      "version": [
        {
          "count": 2, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "4680118"
          }, 
          "is_last": false, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "6382496"
          }
        }
      ]
    }, 
    "keywords": [
      "insect traps", 
      "behavioral ecology"
    ], 
    "publication_date": "2021-04-12", 
    "creators": [
      {
        "orcid": "0000-0001-6546-4306", 
        "affiliation": "University of British Columbia", 
        "name": "Quentin Geissmann"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.4680118", 
        "relation": "isVersionOf"
      }
    ]
  }
}
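
Each entry in the record's "files" list carries a "key" (file name), a "size" in bytes and a "checksum" of the form `md5:<hex digest>`. The following is a minimal sketch of how a downloaded archive could be checked against those fields. It assumes the archives were saved under their "key" names in a local directory. The function names (`parse_checksum`, `file_digest`, `verify_entry`) are hypothetical helpers written for this illustration and are not part of Zenodo or of the Sticky Pi code.

```python
import hashlib
from pathlib import Path


def parse_checksum(field: str) -> tuple[str, str]:
    """Split a checksum field such as 'md5:f125...' into (algorithm, hex digest)."""
    algorithm, _, digest = field.partition(":")
    if not algorithm or not digest:
        raise ValueError(f"unexpected checksum field: {field!r}")
    return algorithm.lower(), digest.lower()


def file_digest(path: Path, algorithm: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte archives need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_entry(entry: dict, download_dir: Path) -> bool:
    """Check one item of the record's "files" list against a local copy.

    Assumption: the file was saved in `download_dir` under its "key" name.
    """
    path = download_dir / entry["key"]
    algorithm, expected = parse_checksum(entry["checksum"])
    # Compare sizes first: it is cheap and catches truncated downloads.
    if path.stat().st_size != entry["size"]:
        return False
    return file_digest(path, algorithm) == expected
```

With the record loaded into a dictionary (for instance via `json.load`), calling `verify_entry(entry, Path("downloads"))` for each `entry` in `record["files"]` would report which archives match. The size comparison runs before hashing because the largest archive here is about 6.9 GB, so a truncated download is rejected without reading the whole file.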
                   All versions   This version
Views              208            128
Downloads          505            276
Data volume        1.1 TB         435.8 GB
Unique views       185            118
Unique downloads   162            38
