Published November 21, 2023 | Version 1.0.0
Dataset Open

Pizzaïolo Dataset

  • 1. Université du Littoral Côte d'Opale

Description

This dataset contains 4800 synthetic pizza images, each generated from and annotated with the Pizzaïolo Ontology.

The synthetic pizza images were generated with the Pizzaiolo Python library.
 
If you use Pizzaiolo or the Pizzaïolo Dataset, please cite:
Grégory Bourguin, Arnaud Lewandowski. Pizzaïolo Dataset : Des Images Synthétiques Ontologiquement Explicables. https://hal.science/hal-04401953.

Directories

csv/ :

  • pizzaiolo_dataset.csv : details for all the samples in the dataset
  • pizzaiolo_train.csv : partition for training
  • pizzaiolo_valid.csv : partition for validating
  • pizzaiolo_test.csv : partition for testing
images/ :
  • 4800 pizza images (224*224 pixels) : 1 file / sample -> img_XXXXX.png
    NB: the icons used to generate the images come from https://www.flaticon.com/.
ontology/ :
  • pizzaiolo.xml : the Pizzaïolo Ontology (OWL) used to generate the samples.
    NB: this ontology was derived from [1], and built/manipulated with [2].
labels/ :
  • Concepts Encoding : concepts.json
  • Bounding Boxes : 1 file / sample -> img_XXXXX_bboxes.json
  • Contours : 1 file / sample -> img_XXXXX_contours.json
  • Semantic Segmentation : 1 file / sample -> img_XXXXX_segmentation.txt
sample_annotations/ :
  • Images showing example samples with the provided annotations
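
The layout above can be turned into a small path helper. This is a minimal sketch, not part of the dataset or of the Pizzaiolo library: the function name sample_paths and the example stem img_00001 are hypothetical, and the stem is passed in as-is so that nothing is assumed about how XXXXX is padded.

```python
from pathlib import Path


def sample_paths(root, stem):
    """Return the per-sample files for one stem, following the directory listing.

    `root` is the folder where pizzaiolo_dataset.zip was extracted (assumption:
    the archive unpacks to the directories listed above).
    """
    root = Path(root)
    labels = root / "labels"
    return {
        "image": root / "images" / f"{stem}.png",
        "bboxes": labels / f"{stem}_bboxes.json",
        "contours": labels / f"{stem}_contours.json",
        "segmentation": labels / f"{stem}_segmentation.txt",
    }


# Hypothetical usage: the stem below is only an illustration.
paths = sample_paths("pizzaiolo_dataset", "img_00001")
```

Checking `paths["image"].exists()` before loading is a cheap way to confirm that the extracted archive matches this layout.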

Concepts

The concepts are the elements that constitute a pizza according to the Pizzaïolo Ontology: the pizza base, the pizza toppings and, optionally, the country of origin.
 
The concepts.json file contains a JSON object (loaded in Python as a dict) that encodes the concepts:
  • key(s) : a concept id (uint)
  • value(s) : the concept name (string), i.e. the short name of the concept class in the Pizzaïolo Ontology
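
Because JSON object keys are always strings, the ids come back from json.load as text and need converting before they can be compared with integer labels. A minimal sketch, in which the function names and the two concept names in the usage line are hypothetical placeholders rather than actual entries of concepts.json:

```python
import json


def parse_concepts(text):
    """Parse the JSON text of concepts.json into {concept_id (int): name (str)}."""
    raw = json.loads(text)
    # JSON keys are strings, so "3" must become 3 to match integer labels.
    return {int(concept_id): name for concept_id, name in raw.items()}


def load_concepts(path):
    """Read concepts.json from disk and return the id -> name mapping."""
    with open(path, encoding="utf-8") as handle:
        return parse_concepts(handle.read())


def invert_concepts(concepts):
    """Build the reverse mapping, name -> id, for looking up label values."""
    return {name: concept_id for concept_id, name in concepts.items()}


# Hypothetical content, for illustration only.
concepts = parse_concepts('{"0": "ConceptA", "1": "ConceptB"}')
```

The reverse mapping is useful because the bounding box and contour files are keyed by concept name, while the segmentation arrays store concept ids.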

Bounding Boxes

The bounding boxes give the location of each concept instance in a pizza sample.
Each img_XXXXX_bboxes.json file contains a Python dict:
  • key(s) : the name (string) of each concept class present in the sample
  • value(s) : a list of all the bounding boxes for the corresponding concept key
 
NB: each bounding box is encoded as a Python (sub)list : [ x_left, y_top, width, height ]
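
Many drawing and evaluation tools expect corner coordinates rather than a width and height. The conversion from the [ x_left, y_top, width, height ] encoding can be sketched as follows; the function names are hypothetical, and treating x_left + width as the exclusive right edge is an assumption, since the record does not say whether the extent is inclusive.

```python
def bbox_to_corners(bbox):
    """Convert [x_left, y_top, width, height] to (x_min, y_min, x_max, y_max)."""
    x_left, y_top, width, height = bbox
    # Assumption: the right and bottom edges are exclusive (x_max = x_left + width).
    return (x_left, y_top, x_left + width, y_top + height)


def corners_by_concept(bboxes):
    """Apply the conversion to a whole img_XXXXX_bboxes.json dict."""
    return {
        concept: [bbox_to_corners(box) for box in boxes]
        for concept, boxes in bboxes.items()
    }


# Hypothetical concept name and values, for illustration only.
example = {"ConceptA": [[10, 20, 30, 40]]}
corners = corners_by_concept(example)
```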

Contours

The contours give the location and shape of each concept instance in a pizza sample.
Each img_XXXXX_contours.json file contains a Python dict:
  • key(s) : the name (string) of each concept class present in the sample
  • value(s) : a list of all the contours for the corresponding concept key

NB: contours are encoded as OpenCV Contours.
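
An OpenCV contour is a sequence of points in which each point is wrapped in an extra level of nesting (array shape N x 1 x 2). Assuming the JSON files store that structure as nested lists, which the record does not state explicitly, the points can be flattened and an enclosing box derived without any dependency. The function names are hypothetical, and whether the derived box matches the provided bounding box files exactly is not confirmed.

```python
def contour_points(contour):
    """Flatten one contour into a list of (x, y) integer tuples.

    Accepts both the OpenCV-style nesting [[x, y]] and plain [x, y] points.
    """
    points = []
    for entry in contour:
        if len(entry) == 1:  # OpenCV wraps each point: [[x, y]]
            entry = entry[0]
        x, y = entry
        points.append((int(x), int(y)))
    return points


def contour_bbox(contour):
    """Return [x_left, y_top, width, height] enclosing the contour.

    The extent is inclusive of both end pixels, hence the + 1.
    """
    points = contour_points(contour)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return [min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1]


# Hypothetical rectangle contour, for illustration only.
square = [[[2, 3]], [[5, 3]], [[5, 7]], [[2, 7]]]
box = contour_bbox(square)
```

To draw or measure the contours with OpenCV itself, the nested lists would first need converting back to integer arrays of shape N x 1 x 2.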

Semantic Segmentation

Each img_XXXXX_segmentation.txt file contains a NumPy array (dtype=uint) saved as text:
  • the shape of the array is the (2D) size of the samples (224*224)
  • each "pixel" belongs to a concept encoded according to concepts.json.
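
A quick sanity check on a segmentation file is to count how many pixels carry each concept id. The sketch below assumes the text layout is one image row per line with whitespace-separated values, as numpy.savetxt produces by default; the record does not specify the layout, so this should be verified against an actual file. Values are parsed through float first because savetxt writes scientific notation unless an integer format is given. The function names are hypothetical.

```python
from collections import Counter


def parse_segmentation(text):
    """Parse the text of a segmentation file into a list of rows of int ids."""
    rows = []
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            rows.append([int(float(token)) for token in line.split()])
    return rows


def pixel_counts(rows):
    """Count the pixels assigned to each concept id."""
    return Counter(value for row in rows for value in row)


# Tiny hypothetical 2 x 3 mask, for illustration only (real files are 224 x 224).
mask = parse_segmentation("0 0 1\n0 2.000000e+00 1\n")
counts = pixel_counts(mask)
```

With NumPy installed, numpy.loadtxt on the file path is the more direct route under the same layout assumption; the ids in the result can then be mapped to names through concepts.json.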

References


[2] Lamy, J.-B. (2017). Owlready : Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies. Artificial Intelligence in Medicine 80, 11–28.

Files

pizzaiolo_dataset.zip

Files (253.7 MB)

md5:9f9e2469c5f79604feaeaaf8c721945e