Published March 15, 2022 | Version 0.1
Dataset Open

deepNIR: Dataset for generating synthetic NIR images and improved fruit detection system using deep learning techniques

  • 1. CSIRO
  • 2. CARES, Department of Electrical, Computer and Software Engineering, University of Auckland

Description

In this paper, we present datasets that can be utilised for synthetic near-infrared (NIR) image generation and for bounding-box-level fruit detection systems. It is an undeniable fact that high-caliber machine learning frameworks such as TensorFlow and PyTorch, large-scale datasets such as ImageNet and COCO, and accelerated GPU hardware have pushed the limits of machine learning for more than a decade.

Among these breakthroughs, a high-quality dataset is one of the key building blocks that can lead to success in model generalisation and deployment of data-driven deep neural networks. In particular, synthetic data generation methods such as generative adversarial networks often require relatively larger-scale data than other supervised approaches. In addition, imposing constraints, such as geometrical facial constraints in fake face generation or consistent, radiometrically calibrated reflectances in satellite imagery, commonly yields better results. We share NIR+RGB datasets that are re-processed from two public datasets (nirscene and SEN12MS), together with our own novel sweetpepper dataset, so that follow-up studies can adopt them quickly.

We oversampled the original nirscene dataset at ratios of 10, 100, 200, and 400, for a total of 127k image pairs. For the SEN12MS satellite multispectral dataset, we selected the largest subset, Summer (45k), and All seasons (180k). Our sweetpepper dataset consists of 1,615 pairs of NIR+RGB images. We demonstrate quantitatively and qualitatively that these NIR+RGB datasets are sufficient for synthetic NIR generation. We achieved Fréchet Inception Distance (FID) scores of 11.36, 26.53, and 40.15 for the nirscene1, SEN12MS, and sweetpepper datasets, respectively.
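For readers unfamiliar with the metric, the standard definition of FID is given here as a reference. This is the commonly used formulation, not a statement of the exact implementation used for the scores above. The symbols are assumptions introduced for illustration: \mu_r, \Sigma_r are the mean and covariance of Inception features of real NIR images, and \mu_g, \Sigma_g are those of generated NIR images. Lower values indicate that the two feature distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```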

We also release bounding box annotations for 11 fruits, which can be exported in various formats using a cloud service. Four newly added fruits [blueberry, cherry, kiwi, and wheat], together with the seven from our previous deepFruits project [apple, avocado, capsicum, mango, orange, rockmelon, strawberry], make up the 11 novel bounding box datasets. The total number of bounding box instances is 162k, and all bounding box datasets are ready for use from the cloud service. To evaluate these datasets, we used the YOLOv5 single-stage detector and obtained mean average precision, mAP[0.5:0.95], ranging from 0.49 to 0.812. We hope these datasets are useful and serve as a baseline for follow-up studies.
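The mAP[0.5:0.95] notation is usually read in the COCO sense: average precision (AP) is computed at ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged. The sketch below illustrates that reading only; it assumes boxes in (x_min, y_min, x_max, y_max) form and that per-threshold AP values are already available. The function names are hypothetical and are not taken from YOLOv5 or from this dataset's tooling.

```python
def iou_xyxy(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds():
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95 (assumed COCO convention)."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def map_50_95(ap_per_threshold):
    """Average AP values supplied in the same order as iou_thresholds()."""
    if len(ap_per_threshold) != len(iou_thresholds()):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(ap_per_threshold)
```

A predicted box counts as a true positive at a given threshold only if its IoU with a ground-truth box meets that threshold, so the averaged score rewards tight localisation more than mAP at 0.5 alone.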

Files

yolov5.zip (719.3 MB)
md5:29f96c80cdd38df8ac9bd5f136a10f60
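Because the archive is large, it can be worth checking a download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using Python's standard hashlib module; the helper names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as listed for yolov5.zip in the Files section of this record.
EXPECTED_MD5 = "29f96c80cdd38df8ac9bd5f136a10f60"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected hex digest (case-insensitive)."""
    return md5_of_file(path) == expected.strip().lower()
```

For example, matches_checksum("yolov5.zip") should return True for an intact copy saved under that (assumed) name in the current directory. Reading in chunks keeps memory use small regardless of the archive size.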