There is a newer version of the record available.

Published April 29, 2021 | Version 1.0
Dataset Open

Pl@ntNet-300K image dataset

  • 1. IMAG, Univ Montpellier, Inria, CNRS
  • 2. Inria, LIRMM, Univ Montpellier, CNRS
  • 3. CIRAD, AMAP
  • 4. LIRMM, AMIS, UPVM, Univ Montpellier, CNRS
  • 5. IMAG, Univ Montpellier, CNRS

Description

Pl@ntNet-300K is an image dataset aimed at evaluating set-valued classification. It was built from the database of the Pl@ntNet citizen observatory and consists of 306,146 images covering 1,081 species. We highlight two particular features of the dataset, inherent to the way the images are acquired and to the intrinsic diversity of plant morphology:
    i) The dataset exhibits a strong class imbalance, meaning that a few species represent most of the images.
    ii) Many species are visually similar, making identification difficult even for the expert eye.
These two characteristics make the dataset a good candidate for evaluating set-valued classification methods and algorithms. We therefore recommend two set-valued evaluation metrics for the dataset (top-K and average-K), and we provide the results of a baseline approach based on a ResNet-50 trained with a cross-entropy loss. The full description of the dataset can be found in the scientific publication referenced below.
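
The two recommended metrics can be illustrated with a small sketch. It is not the reference implementation from the publication or the repository: the function names and the toy scores are hypothetical, and average-K is shown in one common form, where a single score threshold is chosen so that the predicted sets contain K classes on average. For brevity the threshold is computed on the same samples it is evaluated on, and ties at the threshold are not handled specially.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the largest scores for this sample.
        top = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top
    return hits / len(labels)


def average_k_accuracy(scores, labels, k):
    """Accuracy of a thresholded set predictor whose sets hold k classes on average.

    Assumption: one global threshold is used, set to the (n * k)-th largest
    score over all samples and classes, so sets may differ in size per sample.
    """
    flat = sorted((s for row in scores for s in row), reverse=True)
    threshold = flat[len(scores) * k - 1]
    # A sample counts as correct when its true class passes the threshold.
    hits = sum(row[label] >= threshold for row, label in zip(scores, labels))
    return hits / len(labels)


# Hypothetical class probabilities for three images and three species.
scores = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.45, 0.35, 0.20],  # ambiguous image, true class ranked last
    [0.10, 0.50, 0.40],  # two plausible species
]
labels = [0, 2, 2]

print(top_k_accuracy(scores, labels, 2))
print(average_k_accuracy(scores, labels, 2))
```

On this toy input, top-2 always returns two classes and misses the ambiguous second image, whereas the average-2 predictor returns one class for the first image and three for the second, so it covers all three true labels with the same average set size.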

The scientific publication (NeurIPS 2022) describing the dataset and providing baseline results can be found here: https://openreview.net/forum?id=eLYinD0TtIt

Utilities to load the data and train models with PyTorch can be found here: https://github.com/plantnet/PlantNet-300K/

Files

plantnet_300K.zip (31.7 GB)
md5:2de74316542a327e7398e9b194b2443c
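
Given the size of the archive, it may be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using only the Python standard library; the helper names and the local file path are hypothetical, and the file is read in chunks so that the whole archive never has to fit in memory.

```python
import hashlib

# Checksum published on this record for plantnet_300K.zip.
EXPECTED_MD5 = "2de74316542a327e7398e9b194b2443c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True when the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_archive("plantnet_300K.zip")` would return `True` for an intact download, assuming the archive was saved under that name in the current directory.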

Additional details

Funding

COS4CLOUD – Co-designed Citizen Observatories Services for the EOS-Cloud 863463
European Commission