Published March 2, 2024 | Version v1
Dataset Open

2SeasonWeedDet8: a two-season, 8-class dataset for cross-season weed detection generalization evaluation

Creators

  • 1. Michigan State University

Description

The 2SeasonWeedDet8 dataset comprises two eight-class sub-datasets acquired in two consecutive seasons, 2021 and 2022. It was specifically curated for assessing the cross-season generalization of weed detection models.
  • Images from both years were captured of naturally germinated weeds using smartphones or digital color cameras in cotton fields across Mississippi.
  • Images were manually labeled by qualified personnel, who drew bounding boxes around individual weed plants using the VGG Image Annotator (version 2.10).
  • For quality control, initial annotations were examined by the PI or personnel trained in weed identification before inclusion in the final dataset.
  • Weed classes of the dataset include: Waterhemp, Carpetweed, Morningglory, Goosegrass, Spotted Spurge, Palmer Amaranth, Purslane, and Ragweed.
 

Each weed image (in .jpg format) has corresponding annotation files with the same file name, in both JSON and XML formats, placed in the same folder.

  • For the JSON file, the annotated bounding box is defined in COCO format, i.e., [x_min, y_min, width, height].
  • For the XML file, the annotated bounding box is represented in Pascal VOC format, i.e., [x_min, y_min, x_max, y_max]. 
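The two bounding-box conventions above differ only in how the far corner is encoded, so converting between them is simple arithmetic. The following is a minimal sketch, not part of the dataset's own tooling: the function names are hypothetical, and the XML reader assumes the standard Pascal VOC tag layout (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`), which should be checked against the actual annotation files.

```python
import xml.etree.ElementTree as ET


def coco_to_voc(box):
    """Convert [x_min, y_min, width, height] to [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = box
    return [x_min, y_min, x_min + width, y_min + height]


def voc_to_coco(box):
    """Convert [x_min, y_min, x_max, y_max] to [x_min, y_min, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def read_voc_boxes(xml_text):
    """Return (label, [x_min, y_min, x_max, y_max]) pairs from a VOC-style XML string.

    Assumption: the file follows the standard Pascal VOC tag layout.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = [
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        ]
        boxes.append((label, coords))
    return boxes
```

Applying `voc_to_coco` to each box returned by `read_voc_boxes` gives one way to cross-check the XML annotation of an image against its JSON counterpart.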
 
The two sub-datasets (corresponding to the compressed files "Year2021" and "Year2022") are as follows:
  • Weed Data of Year 2021: derived from the CottonWeedDet12 dataset, the sub-dataset contains 4734 images with 7664 bounding boxes. It is broken down into two compressed files "Year2021_Part1" (with 2290 images) and "Year2021_Part2" (with 2444 images) for the convenience of data uploading and downloading. After downloading and unzipping the two files, you may merge them together for the complete data of Year 2021.  
  • Weed Data of Year 2022: this sub-dataset consists of 1930 images with 3184 bounding boxes.
The combined two-season dataset has 6664 images with 10848 bounding boxes. More detailed documentation of the dataset curation and model benchmarking for weed detection is provided in the accompanying journal paper: Deng, B., Lu, Y., & Xu, J. (2024). Weed Database Development: An Updated Survey of Public Weed Datasets and Cross-Season Weed Detection Adaptation. Ecological Informatics, 102546. https://doi.org/10.1016/j.ecoinf.2024.102546
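Merging the two unzipped Year 2021 parts can be done by hand or with a short script. The sketch below is one possible approach under stated assumptions: the helper name `merge_parts` is hypothetical, the folder layout inside the archives has not been verified, and file names are assumed to be unique across the two parts (the script stops rather than overwrite if they are not).

```python
import shutil
from pathlib import Path


def merge_parts(part_dirs, merged_dir):
    """Copy every file found under each part folder into one flat folder.

    Assumption: file names do not collide across parts. A collision raises
    FileExistsError instead of silently overwriting a file.
    """
    merged = Path(merged_dir)
    merged.mkdir(parents=True, exist_ok=True)
    copied = 0
    for part in part_dirs:
        for src in sorted(Path(part).rglob("*")):
            if not src.is_file():
                continue
            dst = merged / src.name
            if dst.exists():
                raise FileExistsError(f"{dst} already exists")
            shutil.copy2(src, dst)
            copied += 1
    return copied


# Example (paths are placeholders for wherever the archives were unzipped):
# merge_parts(["Year2021_Part1", "Year2021_Part2"], "Year2021")
```

After merging, counting the `.jpg` files in the output folder and comparing against the 4734 images stated above is a quick sanity check that nothing was lost.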
 
If you use the dataset in a publication, please consider citing the dataset or the associated journal article above.
 

Files

README.txt

Files (35.8 GB)

  • md5:49699cd1aa4ce15e6351f5d6e98bf63b (2.2 kB)
  • md5:f0c025a55046274921cb1ca19882d986 (10.7 GB)
  • md5:22cee76b6e01483d03991e80c5f1877b (14.0 GB)
  • md5:5f3cdd017f34953e7773cb55a5c89b65 (11.1 GB)
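Because the archives are large, it can be worth confirming a download against the md5 values listed above before unzipping. The following is a minimal sketch using Python's standard `hashlib`; the function names are hypothetical, and the file path in the example is a placeholder, since the archive file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:49699cd1...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example (placeholder path; substitute the downloaded archive):
# matches_listed_md5("downloaded_archive.zip", "md5:f0c025a55046274921cb1ca19882d986")
```

Reading in chunks keeps memory use constant, which matters for archives larger than 10 GB.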

Additional details

Related works

Is derived from
Dataset: 10.5281/zenodo.7535814 (DOI)

References

  • Deng, B., Lu, Y., & Xu, J. (2024). Weed Database Development: An Updated Survey of Public Weed Datasets and Cross-Season Weed Detection Adaptation. Ecological Informatics, 102546. https://doi.org/10.1016/j.ecoinf.2024.102546.