Published November 28, 2024 | Version v2
Dataset Open

DeepCanola: Phenotyping Brassica Pods Using Semi-Synthetic Data and Active Learning

Description

Dataset accompanying the publication "DeepCanola: Phenotyping Brassica Pods Using Semi-Synthetic Data and Active Learning" by Van Vliet, Atkins et al.

We provide model weights, training and validation datasets, and phenotype outputs. Each file or folder is described below:

  • deepcanola.pth - Model weights for the final Model 4, named DeepCanola
  • generated_datasets - Datasets generated at each stage of the active learning process, datasets 1-4
    • Each folder corresponds to one iteration of the active learning process and contains the generated images with their associated annotations, stored in the COCO format.
  • real_world_datasets - Real-world datasets used for either creation of the pod pools or validation. Datasets include:
    • br9 - Ordered and disordered dataset of images, with generated pod length data for the ordered images stored in the `br9_gt_lengths.csv` file
    • br11 - Ordered and disordered dataset of images only
    • br17 - Ordered dataset of images with ground-truth length annotations collected in ImageJ and stored in the `BR017 POD SCAN DATA.csv` file
    • misc - Dataset of miscellaneous images, including Brassica napus from Rothamsted and Brassica relatives
  • data_generation_pools - Pools used to generate semi-synthetic data at each step of the active learning process. Pools include:
    • background_pool - Created background images to be selected at random by the semi-synthetic data generation script
    • pod_pools - Pools of pods used in the semi-synthetic data generation process. Pod pools include:
      • br9 - 673 pods with associated masks
      • br9 and br17 - 673 + 332 pods with associated masks
  • deepcanola_outputs - Phenotype data outputs generated by DeepCanola. Each output consists of two .csv files: length measurements for each pod (with the _objects.csv suffix) and average length measurements per image (with the _averages.csv suffix). Outputs include:
    • br9_ordered
    • br9_disordered
    • br17
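Since the generated datasets store their annotations in the COCO format, a short sketch of how such a file might be read may help. The example below assumes the standard COCO layout (top-level `images`, `annotations`, and `categories` keys); the file name `img_0001.png`, the category name `pod`, and the helper names are hypothetical and not taken from the dataset itself.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO-style annotation file from disk (path is supplied by the caller)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def annotations_by_image(coco):
    """Group COCO annotations by the file name of the image they belong to."""
    id_to_name = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[id_to_name[ann["image_id"]]].append(ann)
    return dict(grouped)


# Tiny in-memory stand-in for an annotation file; all values are made up.
example = {
    "images": [{"id": 1, "file_name": "img_0001.png", "width": 100, "height": 100}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [5, 5, 20, 40], "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 10, 18, 35], "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "pod"}],
}

grouped = annotations_by_image(example)
pods_per_image = {name: len(anns) for name, anns in grouped.items()}
print(pods_per_image)
```

For a real annotation file, `load_coco` would be called with the path to the JSON file inside one of the `generated_datasets` folders, and the result passed to `annotations_by_image`.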

Files

data_generation_pools.zip

Files (8.5 GB)

Checksum, Size
md5:a6cf2be3fecec53920732460ca9ef088, 26.5 MB
md5:66efcf5646d57fc87dbefb86b4df6ddb, 176.2 MB
md5:8adbb1ee8de52d9ba3b74750a2c55e7f, 365.8 kB
md5:90f10a4eb8eae90c55520a4ef897fbac, 4.9 GB
md5:daaf71879c67286c9289fe259375e16d, 3.4 GB
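The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and because the listing does not show which checksum belongs to which file, the caller has to supply the matching pair.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or as bare hex."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks matters here because the two largest archives are several gigabytes each. A call would look like `matches_listing(local_path, "md5:<hex>")`, with the local path and checksum taken from the download and the listing respectively.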

Additional details

Software

Repository URL
https://github.com/kieranatkins/deepcanola
Programming language
Python