Published November 21, 2023
| Version 1.0.0
Dataset
Open
Pizzaïolo Dataset
Description
This dataset contains 4800 samples of synthetic pizza images generated from and annotated with the Pizzaïolo Ontology.
The synthetic pizza images were generated with the Pizzaiolo Python library.
If you use Pizzaiolo or the Pizzaïolo Dataset, please cite as :
Grégory Bourguin, Arnaud Lewandowski. Pizzaïolo Dataset : Des Images Synthétiques Ontologiquement Explicables. https://hal.science/hal-04401953.
Directories
csv/ :
- pizzaiolo_dataset.csv: details for all the samples in the dataset
- pizzaiolo_train.csv: partition for training
- pizzaiolo_valid.csv: partition for validation
- pizzaiolo_test.csv: partition for testing
- 4800 pizza images (224*224) : 1 file / sample -> img_XXXXX.png
NB: the icons used to generate the images come from https://www.flaticon.com/.
pizzaiolo.xml: the Pizzaïolo Ontology (OWL) used to generate the samples.
NB: this ontology was derived from [1], and built/manipulated with [2].
labels/ :
- Concepts Encoding : concepts.json
- Bounding Boxes : 1 file / sample -> img_XXXXX_bboxes.json
- Contours : 1 file / sample -> img_XXXXX_contours.json
- Semantic Segmentation : 1 file / sample -> img_XXXXX_segmentation.txt
sample_annotations/ :
- Images showing examples of samples with the provided annotations
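As a minimal sketch of the naming scheme listed above, the per-sample label files can be located from a sample's file stem. The helper name `label_paths` and the stem used in the usage note are hypothetical; only the `labels/` directory and the `_bboxes.json`, `_contours.json` and `_segmentation.txt` suffixes come from the listing.

```python
from pathlib import Path


def label_paths(labels_dir, sample_stem):
    """Return the three per-sample label paths for a stem such as 'img_XXXXX'.

    Hypothetical helper: it only joins names and does not check that
    the files exist.
    """
    labels_dir = Path(labels_dir)
    return {
        "bboxes": labels_dir / f"{sample_stem}_bboxes.json",
        "contours": labels_dir / f"{sample_stem}_contours.json",
        "segmentation": labels_dir / f"{sample_stem}_segmentation.txt",
    }
```

For instance, `label_paths("labels", Path("img_XXXXX.png").stem)` gives the paths of the three label files that accompany that image.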
Concepts
The concepts are the elements constituting pizzas according to the Pizzaïolo Ontology (i.e. the pizza base, the pizza toppings, and the - optional - country of origin).
The concepts.json file contains a dict (stored as a JSON object) for concept encoding :
- key(s) : a concept id (uint)
- value(s) : the concept name (string) (i.e. the short name of the concept class in the Pizzaïolo Ontology)
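The encoding above might be loaded as follows. This is a sketch only: the function names are hypothetical, and it relies on the fact that JSON object keys are always strings, so the ids are converted back to integers after loading.

```python
import json
from pathlib import Path


def load_concepts(path):
    """Return {concept_id (int): concept_name (str)} from a concepts.json file."""
    with Path(path).open(encoding="utf-8") as f:
        raw = json.load(f)
    # JSON object keys are always strings, so ids are converted back to int.
    return {int(concept_id): name for concept_id, name in raw.items()}


def invert_concepts(concepts):
    """Return {concept_name: concept_id}, the reverse lookup."""
    return {name: concept_id for concept_id, name in concepts.items()}
```

The inverted mapping is convenient because the bounding box and contour files are keyed by concept name, whereas the segmentation arrays hold concept ids.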
Bounding Boxes
The bounding boxes represent the localization of the concept instances constituting a pizza sample.
Each img_XXXXX_bboxes.json file contains a Python dict :
- key(s) : the name (string) of each concept class present in the sample
- value(s) : a list of all the bounding boxes for the corresponding concept key
NB: each bounding box is encoded as a Python (sub)list : [ x_left, y_top, width, height ]
Contours
The contours represent the localization and shape of the concept instances constituting a pizza sample.
Each img_XXXXX_contours.json file contains a Python dict :
- key(s) : the name (string) of each concept class present in the sample
- value(s) : a list of all the contours for the corresponding concept key
NB: contours are encoded as OpenCV Contours.
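The bounding box and contour files described above can be read with the standard json module. The sketch below uses hypothetical helper names and assumes that each contour is serialized with the usual OpenCV nesting, [[[x, y]], [[x, y]], ...]; a plain [[x, y], ...] nesting is accepted as well. The exact nesting in the dataset files should be checked before relying on it.

```python
import json


def load_json(path):
    """Load one img_XXXXX_bboxes.json or img_XXXXX_contours.json file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def bbox_to_corners(bbox):
    """Convert [x_left, y_top, width, height] to (x_min, y_min, x_max, y_max)."""
    x_left, y_top, width, height = bbox
    return (x_left, y_top, x_left + width, y_top + height)


def contour_points(contour):
    """Flatten one contour into a list of (x, y) tuples.

    Assumption: points are either wrapped OpenCV-style ([[x, y]])
    or given directly as [x, y].
    """
    points = []
    for entry in contour:
        # OpenCV wraps each point in an extra one-element list.
        if len(entry) == 1:
            entry = entry[0]
        x, y = entry
        points.append((x, y))
    return points


def contour_extent(contour):
    """Return (x_min, y_min, x_max, y_max) over the points of one contour."""
    xs, ys = zip(*contour_points(contour))
    return (min(xs), min(ys), max(xs), max(ys))
```

To draw the contours with OpenCV itself, each one would typically be converted to an int32 NumPy array of shape (N, 1, 2) before being passed to cv2.drawContours.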
Semantic Segmentation
Each img_XXXXX_segmentation.txt file contains a Python Numpy array (dtype=uint) :
- the shape of the array is the (2D) size of the samples (224*224)
- each "pixel" belongs to a concept encoded according to concepts.json.
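A segmentation file might be turned into per-concept masks as sketched below. The function names are hypothetical, and the code assumes the text file is a whitespace-delimited matrix readable by np.loadtxt (the default layout written by np.savetxt); if the files use another delimiter, the `delimiter` argument of np.loadtxt would need to be set accordingly.

```python
import numpy as np


def load_segmentation(path, expected_shape=(224, 224)):
    """Load a segmentation map as a 2D integer array of concept ids."""
    # Assumption: plain-text matrix, whitespace-delimited.
    seg = np.loadtxt(path)
    if seg.shape != expected_shape:
        raise ValueError(f"unexpected shape {seg.shape}, expected {expected_shape}")
    # Values are read as floats first, then cast to integer concept ids.
    return seg.astype(np.int64)


def concept_masks(seg, concepts):
    """Return {concept_name: boolean mask} for every concept id present in seg.

    `concepts` maps integer ids to names, as decoded from concepts.json.
    """
    masks = {}
    for concept_id in np.unique(seg):
        name = concepts.get(int(concept_id), f"unknown_{int(concept_id)}")
        masks[name] = seg == concept_id
    return masks
```

Each mask has the same shape as the sample, so `mask.sum()` gives the number of pixels covered by the corresponding concept.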
References
[2] Lamy, J.-B. (2017). Owlready : Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies. Artificial Intelligence in Medicine 80, 11–28.
Files
| Name | Size |
|---|---|
| pizzaiolo_dataset.zip (md5:9f9e2469c5f79604feaeaaf8c721945e) | 253.7 MB |