Dataset Open Access

Global Wheat Head Dataset 2021

DAVID Etienne

This is the full Global Wheat Head Dataset 2021. Labels are included as CSV files.

Tutorials available here:


🕵️ Introduction

Wheat is the basis of the diet of a large part of humanity. Therefore, this cereal is widely studied by scientists to ensure food security. A tedious, yet important part of this research is the measurement of different characteristics of the plants, also known as Plant Phenotyping. Monitoring plant architectural characteristics allows breeders to grow better varieties and farmers to make better decisions, but this critical step is still done manually. The emergence of UAVs, cameras and smartphones makes in-field RGB images more available and could be a solution to manual measurement. For instance, the counting of wheat heads can be done with Deep Learning. However, this task can be visually challenging. Dense wheat plants often overlap, and the wind can blur the photographs, making it difficult to identify single heads. Additionally, appearances vary due to maturity, colour, genotype, and head orientation. Finally, because wheat is grown worldwide, different varieties, planting densities, patterns, and field conditions must be considered. To end manual counting, a robust algorithm must be created to address all these issues.

💾 Dataset

The dataset is composed of more than 6000 images of 1024x1024 pixels containing 300k+ unique wheat heads, with the corresponding bounding boxes. The images come from 11 countries and cover 44 unique measurement sessions. A measurement session is a set of images acquired at the same location, within a coherent time window (usually a few hours), with a specific sensor. In comparison to the 2020 competition on Kaggle, this adds 4 new countries, 22 new measurement sessions, 1200 new images and 120k new wheat heads. These new situations will help to reinforce the quality of the test dataset. The 2020 dataset was labelled by researchers and students from 9 institutions across 7 countries. The additional data have been labelled by Human in the Loop, an ethical AI labelling company. We hope these changes will help in finding the most robust algorithms possible!

The task is to localize the wheat heads contained in each image. The goal is to obtain a model which is robust to variation in shape, illumination, sensor and location. A set of box coordinates is provided for each image.

The training dataset consists of the images acquired in Europe and Canada (approximately 4000 images), and the test dataset consists of the images from North America (except Canada), Asia, Oceania and Africa (approximately 2000 images). This represents 7 new measurement sessions available for training but 17 new measurement sessions for the test!

📁 Files

The following files are available in the resources section:

  • images: the folder contains all images

  • competition_train.csv, competition_val.csv, competition_test.csv: contain the splits used for the 2021 Global Wheat Challenge

    • Val contains the "public test", which is the test set of Global Wheat Head 2020

    • Test contains the "private test".

  • Metadata.csv: contains additional metadata for each domain
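
As a minimal sketch of how a split file could be read, the snippet below uses only the Python standard library. It assumes the CSV has a header row with the columns image_name, BoxesString and domain, and that each image is stored as images/<image_name>.png. The function name load_split, the images_dir argument and the sample rows (img_a, domain_x, ...) are hypothetical and only serve the illustration.

```python
import csv
import io
from pathlib import Path


def load_split(csv_file, images_dir="images"):
    """Read rows from an open split CSV and attach the expected image path.

    Assumes a header row with at least an `image_name` column, and that
    image names are stored without the .png suffix.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        image_path = Path(images_dir) / (row["image_name"] + ".png")
        row["image_path"] = image_path.as_posix()
        rows.append(row)
    return rows


# Hypothetical in-memory stand-in for a split file such as competition_train.csv.
sample = io.StringIO(
    "image_name,BoxesString,domain\n"
    "img_a,10 20 30 40;50 60 70 80,domain_x\n"
    "img_b,no_box,domain_y\n"
)
rows = load_split(sample)
for row in rows:
    print(row["image_path"], row["domain"])
```

With the real files, the same function could be called on a file opened with open("competition_train.csv", newline=""), provided the column names match the assumption above.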

💻 Labels

  • All boxes are contained in a csv with three columns: image_name, BoxesString and domain
  • image_name is the name of the image, without the suffix. All images have a .png extension
  • BoxesString is a string containing all boxes of the image, each box being given as [x_min, y_min, x_max, y_max]. To concatenate a list of boxes into a BoxesString, join the coordinates of each box with one space (" ") and join the boxes with one semicolon (";"). If there is no box, BoxesString is equal to "no_box".
  • domain gives the domain of each image
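
The BoxesString format described above can be decoded and encoded with a few lines of Python. This is a sketch under stated assumptions, not an official parser: the function names parse_boxes_string and format_boxes_string are hypothetical, and coordinates are assumed to be pixel values that can be rounded to integers.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)

NO_BOX = "no_box"  # marker used when an image has no wheat head


def parse_boxes_string(boxes_string: str) -> List[Box]:
    """Turn 'x_min y_min x_max y_max;...' into a list of integer boxes."""
    text = boxes_string.strip()
    if not text or text == NO_BOX:
        return []
    boxes = []
    for chunk in text.split(";"):  # boxes are separated by semicolons
        coords = chunk.split()     # coordinates are separated by spaces
        if not coords:
            continue               # tolerate a trailing semicolon
        if len(coords) != 4:
            raise ValueError(f"expected 4 coordinates, got {chunk!r}")
        # Assumption: coordinates are pixel values, rounded to integers.
        x_min, y_min, x_max, y_max = (int(round(float(c))) for c in coords)
        boxes.append((x_min, y_min, x_max, y_max))
    return boxes


def format_boxes_string(boxes: List[Box]) -> str:
    """Inverse of parse_boxes_string; returns 'no_box' for an empty list."""
    if not boxes:
        return NO_BOX
    return ";".join(" ".join(str(v) for v in box) for box in boxes)


boxes = parse_boxes_string("10 20 30 40;50 60 70 80")
print(len(boxes), "wheat heads:", boxes)
print(format_boxes_string(boxes))
```

Since each box corresponds to one wheat head, len(parse_boxes_string(...)) gives the head count for an image.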


If you use the dataset for your research, please do not forget to cite:

  title={Global Wheat Head Detection (GWHD) dataset: a large and diverse dataset of high-resolution RGB-labelled images to develop and benchmark wheat head detection methods},
  author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul A and others},
  journal={Plant Phenomics},
  publisher={Science Partner Journal}

      title={Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods},
      author={Etienne David and Mario Serouart and Daniel Smith and Simon Madec and Kaaviya Velumani and Shouyang Liu and Xu Wang and Francisco Pinto Espinosa and Shahameh Shafiee and Izzat S. A. Tahir and Hisashi Tsujimoto and Shuhei Nasuda and Bangyou Zheng and Norbert Kichgessner and Helge Aasen and Andreas Hund and Pouria Sadhegi-Tehran and Koichi Nagasawa and Goro Ishikawa and Sébastien Dandrifosse and Alexis Carlier and Benoit Mercatoris and Ken Kuroki and Haozhou Wang and Masanori Ishii and Minhajul A. Badhon and Curtis Pozniak and David Shaner LeBauer and Morten Lilimo and Jesse Poland and Scott Chapman and Benoit de Solan and Frédéric Baret and Ian Stavness and Wei Guo},

Files (10.2 GB)
  • Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods, David et al,
