Sainfoin Fruit Processing - Object Detection Dataset
Creators
Description
This dataset consists of 500 images of sainfoin (Onobrychis viciifolia) seed pods, seeds, and split seeds. The images were taken as part of an experiment to determine the minimum sample size of seed pods needed to accurately estimate the heritability of the pod threshing trait within sainfoin breeding lines.
The experiment was a complete factorial design with the following factors:
- Sainfoin named varieties: AAC Mountainview, Delaney, Eski, Rocky Mountain Remont, and Shoshone
- Sample Size: 1, 2, 3, 4, and 5 grams of dried seed pods
- Threshing type: belt thresher (processed 3X) or Haldrup impact thresher (35 s at speed 9)
This gives 5 varieties × 5 sample sizes × 2 threshing types = 50 factorial combinations.
Each combination comprised 10 replicates, and each replicate was a unique, random sample of seed of the same mass (for example, 10 random 2 g samples of Eski seed processed by the belt thresher, or 10 random 5 g samples of Delaney seed processed by the Haldrup thresher). This gives a total of 500 experimental units in the sample set.
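The design arithmetic above can be checked by enumerating the factor levels. The following is only an illustrative sketch: the short thresher labels and the replicate numbering are assumptions made for this example and are not identifiers used in the dataset.

```python
from itertools import product

# Factor levels as listed in the description.
varieties = ["AAC Mountainview", "Delaney", "Eski", "Rocky Mountain Remont", "Shoshone"]
sample_sizes_g = [1, 2, 3, 4, 5]
threshers = ["belt", "haldrup"]  # hypothetical short labels for the two threshing types
replicates = 10

# Every variety x sample size x thresher combination.
combinations = list(product(varieties, sample_sizes_g, threshers))

# One experimental unit per replicate of each combination.
units = [(v, g, t, r) for (v, g, t) in combinations for r in range(1, replicates + 1)]

print(len(combinations))  # 50
print(len(units))         # 500
```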
Once the seeds were sampled, weighed, and processed through the threshing equipment, they were weighed again and imaged.
The threshed seeds were scattered onto an imaging platform with a blue background, lit by 2 LED panels, and photographed with a Sony ILCE-7RM2 at the following settings:
- ISO: 100
- Exposure: 1/40s
- Focal Length: 55mm
- Format: TIFF
- Size: 7968x5320
The raw images were converted from TIFF to JPEG format and annotated in image labeling software. The seed objects were annotated with bounding boxes, each assigned to one of the following classes:
- pod: an enclosed seed pod
- seed: a seed which was successfully threshed from the legume pod carpel
- split: a seed threshed from the pod, but which split in two halves during the threshing process
All image annotations were exported into the convenient COCO format.
No further image processing was performed.
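As a rough illustration of working with a COCO export like the one described above, the sketch below counts annotated objects per class. It relies only on the standard COCO keys (`categories`, `annotations`, `category_id`). The tiny inline JSON is made-up stand-in data, and the annotation file name in the trailing comment is hypothetical, since the description does not name the file.

```python
import json
from collections import Counter


def count_objects_per_class(coco):
    """Return a Counter mapping class name -> number of annotated boxes."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    return Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])


# Made-up stand-in data in COCO layout (bbox is [x, y, width, height]).
example_json = """
{
  "images": [{"id": 1, "file_name": "example.jpg", "width": 7968, "height": 5320}],
  "categories": [{"id": 1, "name": "pod"}, {"id": 2, "name": "seed"}, {"id": 3, "name": "split"}],
  "annotations": [
    {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 50, 40]},
    {"id": 2, "image_id": 1, "category_id": 2, "bbox": [100, 80, 30, 30]},
    {"id": 3, "image_id": 1, "category_id": 1, "bbox": [300, 90, 55, 42]}
  ]
}
"""

example_counts = count_objects_per_class(json.loads(example_json))
print(example_counts)

# With the real export (file name is an assumption):
# with open("annotations.json") as f:
#     print(count_objects_per_class(json.load(f)))
```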
The image set was split 80/20 into training and validation sets using `scikit-learn` in Python 3.11, stratifying the split equally over the experimental factor levels.
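To show what stratifying over the factor levels means in practice, here is a dependency-free sketch. It is not the authors' script: the description states that `scikit-learn` was used (its `train_test_split` accepts a `stratify` argument), whereas this example swaps in a small hand-written split so the idea is visible. The unit tuples, the thresher labels, and the seed are assumptions for illustration.

```python
import random
from collections import defaultdict
from itertools import product


def stratified_split(items, key, val_fraction=0.2, seed=0):
    """Split items so each stratum contributes the same fraction to validation."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)

    train, val = [], []
    for stratum in sorted(groups):
        members = groups[stratum][:]
        rng.shuffle(members)
        n_val = round(len(members) * val_fraction)
        val.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, val


# Hypothetical stand-ins for the 500 images: (variety, grams, thresher, replicate).
varieties = ["AAC Mountainview", "Delaney", "Eski", "Rocky Mountain Remont", "Shoshone"]
units = [
    (v, g, t, r)
    for v, g, t in product(varieties, [1, 2, 3, 4, 5], ["belt", "haldrup"])
    for r in range(1, 11)
]

# Stratify on the factor combination, ignoring the replicate number.
train, val = stratified_split(units, key=lambda u: u[:3])
print(len(train), len(val))
```

With 10 replicates per combination and a 20% validation fraction, each of the 50 combinations contributes 8 units to training and 2 to validation in this sketch.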
The zip file 'train_val_images.zip' contains a 'train' folder with 400 training images, a 'val' folder with 100 validation images, an image taken with a color correction card named 'color_test.jpg', and a JSON file with all the annotations.
Another file called 'seed_weights.csv' contains the image_name to global-key mapping in tabular format as well as the before and after threshing seed weights for each experimental sample.
Labeling Metrics:
- Pod (48.58%): 36,599 objects
- Seed (33.83%): 25,488 objects
- Split (17.59%): 13,255 objects
- TOTAL (100%): 75,342 objects
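The class shares in the labeling metrics follow directly from the object counts. A short check, using only the counts listed above:

```python
# Object counts per class, as listed in the labeling metrics.
counts = {"pod": 36_599, "seed": 25_488, "split": 13_255}

total = sum(counts.values())
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name}: {n} objects ({shares[name]:.2f}%)")
print(f"total: {total} objects")
```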
Files
Total size: 897.5 MB
| Name | MD5 | Size |
|---|---|---|
| seed_weights.csv | 1bdf522c1532c2b9252daae3dc9f1291 | 44.6 kB |
| train_val_images.zip | 0f9dc94110dff02d5c3734a1c44b2135 | 897.4 MB |
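The MD5 checksums listed for the files can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names and the local path in the usage comment are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Usage (local path is an assumption; paste the checksum listed for that file):
# print(matches_checksum("downloads/train_val_images.zip", "<md5 from the file list>"))
```

Reading in chunks keeps memory use flat, which matters for the roughly 900 MB image archive.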
Additional details
Dates
- Updated: 2023-10-16 (Updated to v1.0.0)