Published September 25, 2024 | Version v1
Dataset Open

Data for: Unlocking the Power of AI for Fruit Phenotyping: A Genetic Validation Study in Arabidopsis

  • 1. Aberystwyth University
  • 2. Corporación Colombiana de Investigación Agropecuaria

Description

This dataset contains annotated images of mature inflorescences, collected from experiments conducted on the Multiparent Advanced Generation Inter-Cross (MAGIC) population. The images were utilised to develop and validate a deep learning-based pipeline for Arabidopsis fruit trait extraction.

- The images and annotations have been separated into a training and a test set. 

- The pretrained Cascade Mask-RCNN model (in PyTorch) for instance segmentation of Arabidopsis siliques is also provided in the file arabidopsis.pth.

Instructions for installing and testing the pipeline, along with the extracted phenotype data and the MAGIC genomic data for QTL analysis, are available at https://github.com/kieranatkins/silique-detector/

To verify the pipeline using QTL analysis, the full dataset collection (over 7,000 images in total) was used. The raw images and metadata are available under the following DOIs:

AT023: 10.5281/zenodo.13853394

AT024: 10.5281/zenodo.13856248

AT025: 10.5281/zenodo.13856317

Notes

- The annotations are stored in two JSON files:
at025_v2_fixed_train.json
at025_v2_fixed_test.json

These files contain image annotations for instance segmentation of Arabidopsis siliques in COCO format. They were used for training and testing deep learning models. The training set includes 44 images with 2,288 annotated siliques, while the test set contains 11 images with 652 silique segmentations.
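As a quick sanity check, the image and annotation counts quoted above can be recomputed from the JSON files. The following is a minimal sketch that assumes only the standard top-level COCO keys `images` and `annotations`; the helper names are illustrative and are not part of the silique-detector repository.

```python
import json
from collections import Counter


def summarise_coco(coco: dict) -> dict:
    """Count images and annotations in a COCO-style dictionary."""
    # Each COCO annotation refers to its image through "image_id".
    per_image = Counter(ann["image_id"] for ann in coco.get("annotations", []))
    return {
        "n_images": len(coco.get("images", [])),
        "n_annotations": sum(per_image.values()),
        "per_image": dict(per_image),
    }


def load_and_summarise(path: str) -> dict:
    """Read a COCO JSON file from disk and summarise it."""
    with open(path, encoding="utf-8") as fh:
        return summarise_coco(json.load(fh))


if __name__ == "__main__":
    # Hand-made toy input, not taken from the dataset.
    toy = {
        "images": [{"id": 1}, {"id": 2}],
        "annotations": [
            {"id": 10, "image_id": 1},
            {"id": 11, "image_id": 1},
            {"id": 12, "image_id": 2},
        ],
    }
    print(summarise_coco(toy))
```

Calling `load_and_summarise("at025_v2_fixed_train.json")` on the downloaded file should then report the training-set counts given above, provided the file follows the standard COCO layout.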

- A list of known genes relevant to silique development is provided in the file:

filtered_genes.csv

Genes were retrieved from the TAIR9 database if they had associated gene ontology terms related to fruit development, organ development, organ morphogenesis, or seed development.
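The selection rule can be illustrated with a simple keyword filter over gene ontology descriptions. This is only a sketch of the idea, not the procedure used to build filtered_genes.csv: the `go_terms` field, the substring matching, and the toy rows are all assumptions made for illustration.

```python
# The four gene ontology themes named in the description.
KEYWORDS = (
    "fruit development",
    "organ development",
    "organ morphogenesis",
    "seed development",
)


def has_relevant_go_term(go_terms: str) -> bool:
    """Return True if any keyword occurs in the GO description text."""
    text = go_terms.lower()
    return any(keyword in text for keyword in KEYWORDS)


def filter_genes(rows: list[dict]) -> list[dict]:
    """Keep rows whose hypothetical 'go_terms' field matches a keyword."""
    return [row for row in rows if has_relevant_go_term(row.get("go_terms", ""))]


if __name__ == "__main__":
    # Toy rows with placeholder identifiers, not real TAIR9 records.
    rows = [
        {"gene": "GENE_A", "go_terms": "fruit development; response to auxin"},
        {"gene": "GENE_B", "go_terms": "photosynthesis"},
        {"gene": "GENE_C", "go_terms": "Seed Development"},
    ]
    print([row["gene"] for row in filter_genes(rows)])
```

Substring matching also accepts more specific terms such as "floral organ development", which is one reasonable reading of "related to"; a stricter filter would compare against exact GO identifiers instead.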

Files

images_train.zip

Files (1.0 GB)

Checksum Size
md5:0e4894c3a067f6bc11405c9a69efbd47 490.3 MB
md5:be46dd1f1a5ac0b15e7ecdedf81e700b 2.8 MB
md5:68d2dc682c34511bf8fa6f9ab7b5b446 9.7 MB
md5:4ab07681a83d77a676433c16fdbb20a2 249.1 kB
md5:d73c929437c38d2fbd991c2b1e9558b3 128.2 MB
md5:6ee5c82ca2cc588f3b3502f06d48f59b 404.3 MB
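The MD5 checksums listed above can be used to confirm that a download completed without corruption. A minimal sketch using Python's standard `hashlib` follows; the listing here does not show which file name belongs to which checksum, so the pairing has to be taken from the record page, and the function names are illustrative.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read 1 MiB at a time so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.removeprefix("md5:").lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing(downloaded_path, "md5:...")` returns `True` when the file at `downloaded_path` matches the checksum copied from the listing for that file.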

Additional details

Software

Repository URL
https://github.com/kieranatkins/silique-detector/
Programming language
Python, R
Development Status
Active