Published January 19, 2022 | Version v1
Dataset Open

The Semantic PASCAL-Part Dataset

  • 1. Free University of Bozen-Bolzano
  • 2. Fondazione Bruno Kessler

Contributors

  • 1. Fondazione Bruno Kessler

Description

The Semantic PASCAL-Part dataset

The Semantic PASCAL-Part dataset is the RDF version of the well-known PASCAL-Part dataset used for object detection in Computer Vision. Each image is annotated with bounding boxes, each containing a single object. Pairs of bounding boxes are annotated with the part-whole relationship. For example, the bounding box of a car is linked by a part-whole annotation to the bounding boxes of its wheels.

This original release joins Computer Vision with the Semantic Web, as the objects in the dataset are aligned with concepts from:

  • the provided supporting ontology;
  • the WordNet database through its synsets;
  • the Yago ontology.

The provided Python 3 code (see the GitHub repo) can browse the dataset and convert it into the RDF knowledge graph format. This format is intended to foster research in both the Semantic Web and Machine Learning fields.
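To give an idea of what a part-whole annotation looks like once expressed as RDF, here is a minimal sketch that writes typed bounding boxes and part-of pairs as N-Triples lines. It is not the conversion code from the GitHub repo: the namespace `http://example.org/pascal-part/`, the `partOf` property name, and the box identifiers are all assumptions made for illustration.

```python
# Hypothetical namespace and property name, chosen only for illustration;
# the actual IRIs used by the dataset may differ.
BASE = "http://example.org/pascal-part/"
PART_OF = BASE + "partOf"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_ntriples(boxes, part_of_pairs):
    """Serialize typed bounding boxes and part-of pairs as N-Triples lines.

    boxes: dict mapping a box identifier to its class label.
    part_of_pairs: iterable of (part_id, whole_id) tuples.
    """
    lines = []
    # One rdf:type triple per bounding box.
    for box_id, label in boxes.items():
        lines.append(f"<{BASE}{box_id}> <{RDF_TYPE}> <{BASE}{label}> .")
    # One partOf triple per (part, whole) pair.
    for part_id, whole_id in part_of_pairs:
        lines.append(f"<{BASE}{part_id}> <{PART_OF}> <{BASE}{whole_id}> .")
    return lines


# Hypothetical example: a car with two wheels in one image.
boxes = {"img1_box0": "Car", "img1_box1": "Wheel", "img1_box2": "Wheel"}
pairs = [("img1_box1", "img1_box0"), ("img1_box2", "img1_box0")]
triples = to_ntriples(boxes, pairs)
print("\n".join(triples))
```

Each output line is a subject, predicate, and object IRI followed by a period, so the result can be loaded by any tool that reads N-Triples.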

Structure of the semantic PASCAL-Part Dataset

This is the folder structure of the dataset:

  • semanticPascalPart: contains the refined images and annotations (e.g., small specific parts are merged into bigger parts) of the PASCAL-Part dataset in PASCAL VOC style.
    • Annotations_set: the test set annotations in .xml format. For further information, see the PASCAL VOC format documentation.
    • Annotations_trainval: the train and validation set annotations in .xml format. For further information, see the PASCAL VOC format documentation.
    • JPEGImages_test: the test set images in .jpg format.
    • JPEGImages_trainval: the train and validation set images in .jpg format.
    • test.txt: the 2416 image filenames in the test set.
    • trainval.txt: the 7687 image filenames in the train and validation set.
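
As a rough illustration of how the .xml annotations could be read, the sketch below parses the standard PASCAL VOC fields (`filename`, `object`, `name`, `bndbox`) with the Python standard library. The sample XML is made up for this example, and the way the dataset encodes the part-whole links inside the files is not shown here, so treat this as a starting point rather than a complete reader.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in standard PASCAL VOC layout; real files contain
# more fields, and their part-whole encoding is not reproduced here.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>200</xmax><ymax>150</ymax></bndbox>
  </object>
  <object>
    <name>wheel</name>
    <bndbox><xmin>30</xmin><ymin>110</ymin><xmax>70</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return the image filename and a list of labelled bounding boxes."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        # Coordinates are read as (xmin, ymin, xmax, ymax) integers.
        box = tuple(
            int(float(bb.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({"name": obj.findtext("name"), "box": box})
    return root.findtext("filename"), objects


filename, objects = parse_voc(SAMPLE_XML)
for obj in objects:
    print(filename, obj["name"], obj["box"])
```

To read a file from Annotations_trainval or Annotations_set instead of a string, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.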

The PASCAL-Part Ontology

The PASCAL-Part OWL ontology formalizes, through logical axioms, the part-of relationship between whole objects (22 classes) and their parts (39 classes). The ontology contains 85 logical axioms in Description Logic, for example of the following form:

Every potted_plant has exactly 1 plant AND
                   has exactly 1 pot

We provide two versions of the ontology, with and without cardinality constraints, so that users can experiment with either. The WordNet alignment is encoded in the ontology as annotations. We further provide the WordNet_Yago_alignment.csv file with both WordNet and Yago alignments.
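
To make the effect of a cardinality constraint concrete, the following sketch checks a set of detected parts against the potted_plant axiom shown above. The dictionary encoding and the `violations` helper are hypothetical and not part of the provided code. Note also that this is a simple closed-world count over the detected parts, which is not the same as reasoning over the OWL ontology under its open-world semantics.

```python
from collections import Counter

# Hypothetical encoding of the axiom above:
# whole class -> {part class: exact number of parts required}.
EXACT_PARTS = {"potted_plant": {"plant": 1, "pot": 1}}


def violations(whole_class, part_classes):
    """Return {part class: observed count} for every violated constraint.

    part_classes: labels of the parts detected for one whole object.
    An empty dict means all 'exactly n' constraints are satisfied.
    """
    required = EXACT_PARTS.get(whole_class, {})
    found = Counter(part_classes)
    return {
        part: found[part]
        for part, n in required.items()
        if found[part] != n
    }


ok = violations("potted_plant", ["plant", "pot"])
bad = violations("potted_plant", ["plant", "plant"])
print(ok)
print(bad)
```

With the version of the ontology that omits cardinality constraints, a check like this would only test which part classes are allowed, not how many of each are present.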

The ontology can be browsed with many Semantic Web tools such as:

  • Protégé: a graphical tool for ontology modelling;
  • OWLAPI: Java API for manipulating OWL ontologies;
  • rdflib: Python API for working with the RDF format;
  • RDF stores: databases for storing and semantically retrieving RDF triples.

Citing semantic PASCAL-Part

If you use semantic PASCAL-Part in your research, please use the following BibTeX entry:

@article{DBLP:journals/ia/DonadelloS16,
  author    = {Ivan Donadello and
               Luciano Serafini},
  title     = {Integration of numeric and symbolic information for semantic image
               interpretation},
  journal   = {Intelligenza Artificiale},
  volume    = {10},
  number    = {1},
  pages     = {33--47},
  year      = {2016}
}

Files

semanticPascalPart.zip

Files (1.2 GB)

md5:a75fbe2ae2dd0aeac4756cc95113dfd9
1.2 GB