Published July 5, 2018 | Version 1.0
Poster Open

Semantic modelling of video annotations – the TIB AV-Portal's metadata structure

  • 1. TIB


The TIB AV-Portal is an online platform for sharing scientific videos, operated by the German National Library of Science and Technology (TIB). Besides the allocation of Digital Object Identifiers (DOI) and Media Fragment Identifiers (MFID) for video citation, long-term preservation of all material, and open licenses like Creative Commons, the core feature of the TIB AV-Portal is its set of automated metadata extraction methods, which fundamentally improve search functionalities (e.g. fine-grained search and faceting). These comprise automated chaptering, extraction of superimposed text, speech-to-text recognition, and the detection of predefined visual concepts. In addition, extracted metadata are mapped against authority files like the German “Gemeinsame Normdatei” and knowledge bases like DBpedia and the Library of Congress Subject Headings via a process of automated named entity linking (NEL) to enable semantic and cross-lingual search.

The results of this process are expressed as temporal and/or spatial video annotations, linking extracted metadata to specific key frames and video segments. In order to structure the data, express relations between individual entities, and link to external information resources, several common vocabularies, ontologies and knowledge bases are used. These include, amongst others, the Open Annotation Data Model, the NLP Interchange Format (NIF), BIBFRAME, and the Friend of a Friend Vocabulary (FOAF). Furthermore, all data is stored adhering to the Resource Description Framework (RDF) data model and published as linked open data. This provides third parties with an interoperable and easy-to-reuse RDF graph representation of the AV-Portal’s metadata.
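As a rough illustration of the annotations described above, the following Python sketch links a linked entity to a temporal and spatial video segment, using the Open Annotation namespace and W3C Media Fragments URI syntax. The video, annotation and entity URIs and the helper names (`media_fragment`, `annotate`, `to_ntriples`) are hypothetical and chosen for illustration only; the AV-Portal's actual graph model is richer and may differ in its details.

```python
from typing import List, Optional, Tuple

# Namespaces of the Open Annotation Data Model and RDF.
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


def media_fragment(video_uri: str, start: float, end: float,
                   region: Optional[Tuple[int, int, int, int]] = None) -> str:
    """Build a Media Fragments URI for a time span (seconds) and an
    optional pixel region given as (x, y, width, height)."""
    fragment = f"t={start:g},{end:g}"
    if region is not None:
        x, y, w, h = region
        fragment += f"&xywh={x},{y},{w},{h}"
    return f"{video_uri}#{fragment}"


def annotate(annotation_uri: str, target_uri: str, entity_uri: str) -> List[Triple]:
    """Return the triples of a minimal annotation: the target is the video
    segment, the body is the linked entity."""
    return [
        (annotation_uri, RDF_TYPE, OA + "Annotation"),
        (annotation_uri, OA + "hasTarget", target_uri),
        (annotation_uri, OA + "hasBody", entity_uri),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical identifiers, not taken from the AV-Portal.
VIDEO = "https://example.org/video/12345"
target = media_fragment(VIDEO, 10, 20.5, region=(160, 120, 320, 240))
triples = annotate(
    "https://example.org/annotation/1",
    target,
    "http://dbpedia.org/resource/Semantic_Web",
)
print(to_ntriples(triples))
```

Encoding the time span and region in the target URI itself keeps the annotation compact; a fuller model would likely describe the segment as a separate resource with its own selector.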

On our poster we illustrate the general structure of the TIB AV-Portal’s comprehensive metadata, both authoritative and automatically extracted. The main focus is on the underlying video annotation graph model and on the semantic interoperability and reusability of the data. In particular, we visualize how the use of vocabularies, ontologies and knowledge bases allows for rich semantic descriptions of video materials as well as for easy metadata publication, interlinking, and reuse by third parties (e.g. for information retrieval and enrichment). In doing so, we present the AV-Portal’s metadata structure as an illustrative example of the complexity of modelling temporal and spatial video metadata and as a set of best practices in the field of audio-visual resources.


