Published November 2, 2023 | Version v2
Dataset Open

Pollen Video Library for Benchmarking Detection, Classification, Tracking and Novelty Detection Tasks

  • 1. Institute of Technical Informatics, Graz University of Technology, Austria
  • 2. Computer Engineering and Networks Lab, ETH Zurich
  • 3. Institute of Technical Informatics, Graz University of Technology, Austria, Complexity Science Hub Vienna, Austria

Description

Dataset description

This dataset contains microscopic images and videos of pollen gathered between Feb. and Aug. 2020 in Graz, Austria.

Pollen images of 16 types: ...images_16_types.zip

  • Acer Pseudoplatanus
  • Aesculus Carnea
  • Alnus
  • Anthoxanthum
  • Betula Pendula
  • Brassica
  • Carpinus
  • Corylus
  • Dactylis Glomerata
  • Fraxinus
  • Pinus Nigra
  • Platanus
  • Populus Nigra
  • Prunus Avium
  • Sequoiadendron Giganteum
  • Taxus Baccata

Pollen video library: ...pollen_video_library.zip

  • Each type of pollen is in a separate folder; there may be multiple videos per type.
  • Each pollen folder also includes images cropped from the videos by a YOLO object detection model trained on a subset of pollen images, as described in [1].
  • Cropped file name structure: [Video file name]_[TrackingID]_[Image index of a grain]_[Frame index in video]
    • Example: if a grain has 5 images, the file names would be:
      Anthoxanthum-grass-20200530-122652_0000000_001_00001.jpg
      Anthoxanthum-grass-20200530-122652_0000000_002_00002.jpg
      ...
      Anthoxanthum-grass-20200530-122652_0000000_005_00005.jpg
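
The naming scheme above can be split back into its fields, which is useful for regrouping the crops of one grain into a track. The following is a minimal sketch and not part of the dataset's own tooling: the names CropName, parse_crop_name and group_by_track are hypothetical, and it assumes that every cropped file ends in .jpg and that the last three underscore-separated fields are numeric, as in the example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class CropName(NamedTuple):
    """Fields of a cropped file name (hypothetical helper type)."""
    video: str        # video file name without extension
    track_id: int     # tracking ID of the grain within the video
    image_index: int  # index of this image among the grain's images
    frame_index: int  # index of the frame in the video


# Assumption: the last three underscore-separated fields are numeric
# and the extension is .jpg, as in the example file names.
_PATTERN = re.compile(
    r"^(?P<video>.+)_(?P<track>\d+)_(?P<image>\d+)_(?P<frame>\d+)\.jpg$"
)


def parse_crop_name(file_name: str) -> CropName:
    """Split a cropped file name into its four fields."""
    match = _PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    return CropName(
        video=match["video"],
        track_id=int(match["track"]),
        image_index=int(match["image"]),
        frame_index=int(match["frame"]),
    )


def group_by_track(
    file_names: Iterable[str],
) -> Dict[Tuple[str, int], List[CropName]]:
    """Group crops by (video, tracking ID), ordered by image index."""
    tracks: Dict[Tuple[str, int], List[CropName]] = defaultdict(list)
    for name in file_names:
        crop = parse_crop_name(name)
        tracks[(crop.video, crop.track_id)].append(crop)
    for crops in tracks.values():
        crops.sort(key=lambda c: c.image_index)
    return dict(tracks)


if __name__ == "__main__":
    names = [
        "Anthoxanthum-grass-20200530-122652_0000000_002_00002.jpg",
        "Anthoxanthum-grass-20200530-122652_0000000_001_00001.jpg",
    ]
    for key, crops in group_by_track(names).items():
        print(key, [c.frame_index for c in crops])
```

Matching the three numeric fields from the end of the name keeps the parser working even if a video file name itself contains underscores.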

Field data were gathered over 3 days in Graz in spring 2020: ...pollen_field_data.zip

 

Version 2:

For experiments on mitigating the distribution shift in pollen identification on field data, 5 types were selected from the field data and manually labeled by an expert. These data are zipped in the "manual_labeled_field_data_5_types.zip" archive.

The "images_5_types_9010_train.zip" and "images_5_types_9010_val.zip" archives contain the 5 types selected from the library data (images_16_types.zip) that correspond to the field data.

The "images_3_types_for_ablation_study.zip" contains data on 3 levels of pollen grain hydration. These data are used for the ablation study of model generalization in pollen identification. 

Sample code to load the data and visualize the images is in ...plot_pollen_sample.py. Download and extract the file ...images_16_types.zip in the same folder as ...plot_pollen_sample.py to run the example.

Dependencies:

  • opencv
  • numpy
  • matplotlib

Credit

[1] N. Cao, M. Meyer, L. Thiele, and O. Saukh. 2020. Automated Pollen Detection with an Affordable Technology. In Proceedings of the International Conference on Embedded Wireless Systems and Networks (EWSN). 108–119.

@inproceedings{namcao2020pollen,
  title = {Automated Pollen Detection with an Affordable Technology},
  author = {Nam Cao and Matthias Meyer and Lothar Thiele and Olga Saukh},
  booktitle = {Proceedings of the International Conference on Embedded Wireless Systems and Networks (EWSN)},
  pages = {108--119},
  month = {2},
  year = {2020},
}

Notes

Appears in the Proceedings of the 3rd Workshop on Data Acquisition To Analysis (DATA '20)
