There is a newer version of this record available.

Dataset Open Access

Pl@ntNet-300K image dataset

Camille Garcin; Alexis Joly; Pierre Bonnet; Maximilien Servajean; Joseph Salmon

Pl@ntNet-300K is an image dataset designed for evaluating set-valued classification. It was built from the database of the Pl@ntNet citizen observatory and consists of 306,146 images covering 1,081 species. We highlight two particular features of the dataset, inherent to the way the images are acquired and to the intrinsic diversity of plant morphology:
    i) The dataset exhibits a strong class imbalance, meaning that a few species account for most of the images.
    ii) Many species are visually similar, making identification difficult even for the expert eye.
These two characteristics make the dataset a good candidate for evaluating set-valued classification methods and algorithms. We therefore recommend two set-valued evaluation metrics for the dataset (top-K and average-K) and provide the results of a baseline approach based on a ResNet-50 trained with a cross-entropy loss. The full description of the dataset can be found in (to be provided soon).
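As an illustration of the two recommended metrics, the following is a minimal sketch in plain Python, not the authors' implementation. The function names `top_k_accuracy` and `average_k_accuracy` are hypothetical. Top-K returns exactly K classes per image, whereas average-K returns sets of varying size that contain K classes on average. The sketch assumes average-K sets are obtained with a single global score threshold, and for simplicity it selects that threshold on the same samples it evaluates; a held-out split would normally be used for the threshold.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top
    return hits / len(labels)


def average_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label falls in a thresholded set.

    Assumption: one global threshold is chosen so that the predicted
    sets contain k classes on average over all samples.
    """
    n = len(labels)
    flat = sorted((s for row in scores for s in row), reverse=True)
    # Keeping the n*k largest scores overall yields an average set size of k
    # (exactly k when there are no ties at the threshold).
    threshold = flat[n * k - 1]
    hits = sum(row[label] >= threshold for row, label in zip(scores, labels))
    return hits / n


# Toy scores: one confident sample and one ambiguous sample, 4 classes.
scores = [
    [0.94, 0.03, 0.02, 0.01],  # true class 0, clearly identified
    [0.35, 0.33, 0.30, 0.02],  # true class 2, three similar candidates
]
labels = [0, 2]

top2 = top_k_accuracy(scores, labels, k=2)
avg2 = average_k_accuracy(scores, labels, k=2)
print(top2, avg2)
```

On this toy input, top-2 misses the ambiguous sample because its true class ranks third, while average-2 returns one class for the confident sample and three for the ambiguous one, so both true labels are covered with the same average set size.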

The scientific publication (NeurIPS 2022) describing the dataset and providing baseline results can be found here: 

Utilities to load the data and train models with PyTorch can be found here:



Files (31.7 GB)
                    All versions    This version
Views               20,589          5,440
Downloads           4,463           2,229
Data volume         141.3 TB        70.6 TB
Unique views        18,451          4,643
Unique downloads    2,520           1,407


Cite as