Dataset Open Access

Global Wheat Head Dataset 2021

DAVID Etienne

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods, David et al, https://arxiv.org/abs/2105.07660</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">deep learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">wheat counting</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">plant phenotyping</subfield>
  </datafield>
  <controlfield tag="005">20210713014828.0</controlfield>
  <controlfield tag="001">5092309</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">10215190132</subfield>
    <subfield code="z">md5:22b4b542c9ae7e056d7fcdeae9ecaed5</subfield>
    <subfield code="u">https://zenodo.org/record/5092309/files/gwhd_2021.zip</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-07-12</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:5092309</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">INRAe</subfield>
    <subfield code="a">DAVID Etienne</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Global Wheat Head Dataset 2021</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This is the full Global Wheat Head Dataset 2021. Labels are included in csv.&lt;/p&gt;

&lt;p&gt;Tutorials available here: https://www.aicrowd.com/challenges/global-wheat-challenge-2021&lt;/p&gt;


&lt;p&gt;🕵️ Introduction&lt;/p&gt;

&lt;p&gt;Wheat is a staple food for a large part of humanity, so it is widely studied by scientists to ensure food security. A tedious yet important part of this research is measuring different characteristics of the plants, also known as plant phenotyping. Monitoring plant architectural characteristics allows breeders to grow better varieties and farmers to make better decisions, but this critical step is still done manually. The emergence of UAVs, cameras and smartphones makes in-field RGB images more available and could offer an alternative to manual measurement. For instance, wheat heads can be counted with deep learning. However, this task can be visually challenging. Dense wheat plants often overlap, and the wind can blur the photographs, making it difficult to identify single heads. Additionally, appearances vary due to maturity, colour, genotype, and head orientation. Finally, because wheat is grown worldwide, different varieties, planting densities, patterns, and field conditions must be considered. To end manual counting, a robust algorithm must be created to address all these issues.&lt;/p&gt;

&lt;p&gt;💾 Dataset&lt;/p&gt;

&lt;p&gt;The dataset is composed of more than 6000 images of 1024x1024 pixels containing 300k+ unique wheat heads, with the corresponding bounding boxes. The images come from 11 countries and cover 44 unique measurement sessions. A measurement session is a set of images acquired at the same location, within a coherent time window (usually a few hours), with a specific sensor. Compared with the 2020 competition on Kaggle, it adds 4 new countries, 22 new measurement sessions, 1200 new images and 120k new wheat heads. These new situations will help to strengthen the test dataset. The 2020 dataset was labelled by researchers and students from 9 institutions across 7 countries. The additional data have been labelled by Human in the Loop, an ethical AI labelling company. We hope these changes will help in finding the most robust algorithms possible!&lt;/p&gt;

&lt;p&gt;The task is to localize the wheat heads contained in each image. The goal is to obtain a model that is robust to variation in shape, illumination, sensor and location. A set of box coordinates is provided for each image.&lt;/p&gt;

&lt;p&gt;The training dataset consists of the images acquired in Europe and Canada (approximately 4000 images), and the test dataset consists of the images from North America (except Canada), Asia, Oceania and Africa (approximately 2000 images). This represents 7 new measurement sessions available for training but 17 new measurement sessions for the test!&lt;/p&gt;

&lt;p&gt;📁 Files&lt;/p&gt;

&lt;p&gt;The following files are available in the &lt;code&gt;resources&lt;/code&gt; section:&lt;/p&gt;

	&lt;p&gt;&lt;code&gt;images: the folder contains all images&lt;/code&gt;&lt;/p&gt;
	&lt;p&gt;&lt;code&gt;competition_train.csv, competition_val.csv, competition_test.csv: contain the splits used for the 2021 Global Wheat Challenge&lt;/code&gt;&lt;/p&gt;

		&lt;p&gt;&lt;code&gt;Val contains the &amp;quot;public test&amp;quot;, which is the test set of Global Wheat Head 2020&lt;/code&gt;&lt;/p&gt;
		&lt;p&gt;&lt;code&gt;Test contains the &amp;quot;private test&amp;quot;.&lt;/code&gt;&lt;/p&gt;
	&lt;p&gt;&lt;code&gt;Metadata.csv: contains additional metadata for each domain&lt;/code&gt;&lt;/p&gt;


	&lt;li&gt;All boxes are contained in a CSV file with three columns: &lt;code&gt;image_name&lt;/code&gt;, BoxesString and domain&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;image_name&lt;/code&gt; is the name of the image, without the suffix. All images have a .png extension&lt;/li&gt;
	&lt;li&gt;BoxesString is a string containing all predicted boxes with the format [x_min, y_min, x_max, y_max]. To build a BoxesString from a list of boxes, join the coordinates of each box with a single space (&amp;quot; &amp;quot;) and join the boxes with a single semicolon (&amp;quot;;&amp;quot;). If there is no box, BoxesString is equal to &amp;quot;no_box&amp;quot;.&lt;/li&gt;
	&lt;li&gt;domain gives the domain for each image&lt;/li&gt;
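
&lt;p&gt;The BoxesString convention described above can be sketched in Python as follows. This is a minimal illustration, not code shipped with the dataset: the function names decode_boxes and encode_boxes are hypothetical, and the sketch assumes that coordinates are integer pixel values.&lt;/p&gt;

```python
def decode_boxes(boxes_string):
    """Parse a BoxesString into a list of (x_min, y_min, x_max, y_max) tuples.

    Assumption: coordinates are integer pixel values. The sentinel
    "no_box" maps to an empty list.
    """
    if boxes_string.strip() == "no_box":
        return []
    boxes = []
    for chunk in boxes_string.split(";"):
        if not chunk.strip():
            continue  # tolerate a stray trailing semicolon
        # Coordinates inside one box are separated by whitespace.
        x_min, y_min, x_max, y_max = (int(float(v)) for v in chunk.split())
        boxes.append((x_min, y_min, x_max, y_max))
    return boxes


def encode_boxes(boxes):
    """Serialise a list of boxes back into a BoxesString."""
    if not boxes:
        return "no_box"
    # One space between coordinates, one semicolon between boxes.
    return ";".join(" ".join(str(v) for v in box) for box in boxes)
```

&lt;p&gt;For example, decode_boxes applied to the string &amp;quot;10 20 30 40;50 60 70 80&amp;quot; yields two boxes, and encode_boxes applied to an empty list yields &amp;quot;no_box&amp;quot;.&lt;/p&gt;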


&lt;p&gt;If you use the dataset for your research, please cite:&lt;/p&gt;

  title={Global Wheat Head Detection (GWHD) dataset: a large and diverse dataset of high-resolution RGB-labelled images to develop and benchmark wheat head detection methods},
  author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul A and others},
  journal={Plant Phenomics},
  publisher={Science Partner Journal}

&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; title={Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods},&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; author={Etienne David and Mario Serouart and Daniel Smith and Simon Madec and Kaaviya Velumani and Shouyang Liu and Xu Wang and Francisco Pinto Espinosa and Shahameh Shafiee and Izzat S. A. Tahir and Hisashi Tsujimoto and Shuhei Nasuda and Bangyou Zheng and Norbert Kirchgessner and Helge Aasen and Andreas Hund and Pouria Sadeghi-Tehran and Koichi Nagasawa and Goro Ishikawa and S&amp;eacute;bastien Dandrifosse and Alexis Carlier and Benoit Mercatoris and Ken Kuroki and Haozhou Wang and Masanori Ishii and Minhajul A. Badhon and Curtis Pozniak and David Shaner LeBauer and Morten Lilimo and Jesse Poland and Scott Chapman and Benoit de Solan and Fr&amp;eacute;d&amp;eacute;ric Baret and Ian Stavness and Wei Guo},&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; year={2021},&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; eprint={2105.07660},&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; archivePrefix={arXiv},&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; primaryClass={cs.CV}&lt;br&gt;
</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.5092308</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.5092309</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>