From Dataset to Knowledge Graph: The "Chronology of Events 1940-1944" at the Academy of Athens
Authors/Creators
- 1. Modern Greek History Research Centre Academy of Athens
- 2. School of Electrical and Computer Engineering National Technical University of Athens
Description
The poster deals with the curation of a collection created by the Modern Greek History
Research Centre of the Academy of Athens in Greece. The collection is based on an archival
series of the British Foreign Office from the period between the outbreak of the Greek-Italian
War (October 1940) and the end of the Nazi occupation in Greece (October 1944). Based on
this material, a detailed chronology was compiled, recording not only specific events but also
exchanges of information, proposals, plans and reports by the Greeks, the British, the Allies and resistance
organizations. The Chronology was published in two volumes [1]. The entries were also captured
in a database that supports a range of searches and data extraction beyond the indexes contained
in the publication. Yet this digitized collection remained a closed corpus of information,
with no links to relevant collections of other institutions.
Three years ago, the Chronology was incorporated into the action “The 1940s in Greece” of the
project APOLLONIS. Within this action the project partners, drawing on the
metadata of various collections pertaining to that period, created common indexes and
knowledge bases to increase the collections’ accessibility and interoperability and to familiarize
researchers with the possibilities of dataset curation.
One of the outcomes of this action was a linked-data knowledge graph representation of
the collections that supports expressive semantic queries. The first step in constructing the
knowledge graph was to produce linked data representations of the collections using a
mapping tool [2]. The mappings used both standard and custom vocabularies to cover
collection-specific modeling needs. The second step was the automatic enrichment of the items in the
knowledge graph with annotations, which were produced for three semantic dimensions (place,
time, person/organization) by applying named entity recognition and disambiguation (NERD)
tools to the relevant fields. For the Academy of
Athens Chronology, these tools produced GeoNames, TimeLine [3] and Wikidata annotations. The
resulting knowledge graph was further extended with relevant vocabularies (e.g. DBpedia,
Greek Historical Periods [4]) by including cross-vocabulary equivalence alignments for identical
terms and containment alignments for temporal terms.
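The two construction steps described above can be pictured with a small sketch. It is plain Python for illustration only and is not the D2RML mapping or the NERD tooling used in the project; the record fields, the `ex:` predicates, the toy gazetteer and every identifier are hypothetical placeholders.

```python
# Toy sketch of the two steps: (1) map a database record to triples,
# (2) enrich the item with place and time annotations.
# All field names, predicates and identifiers are hypothetical.

record = {
    "id": "entry-001",
    "date": "1940-10-28",
    "text": "Report from Athens to the Foreign Office",
}

# Hypothetical gazetteer standing in for a NERD tool's place lookup.
GAZETTEER = {"Athens": "geo:PLACE_ATHENS"}


def map_record(rec):
    """Step 1: produce (subject, predicate, object) triples from one record."""
    subject = f"ex:{rec['id']}"
    return [
        (subject, "ex:date", rec["date"]),
        (subject, "ex:description", rec["text"]),
    ]


def annotate(rec):
    """Step 2: derive place and time annotations from the record fields."""
    subject = f"ex:{rec['id']}"
    triples = []
    for name, uri in GAZETTEER.items():
        if name in rec["text"]:
            triples.append((subject, "ex:place", uri))
    year = rec["date"][:4]  # coarse time annotation: the year only
    triples.append((subject, "ex:time", f"time:{year}"))
    return triples


graph = map_record(record) + annotate(record)
for triple in graph:
    print(triple)
```

A real pipeline would replace the substring lookup with proper entity recognition and disambiguation, which is where the annotation quality mentioned in the evaluation comes into play.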
Through these annotations, the knowledge graph allowed unified multi-vocabulary searches
over all collections, expressed in any of the supported vocabularies. To exploit the knowledge
graph, a search application was built that supported combined cross-collection queries over the
three dimensions. Through limited reasoning support, the application could also answer
semantic queries by exploiting the vocabulary hierarchies and other term categorizations.
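As a rough illustration of how equivalence and containment alignments can let one query match items annotated in another vocabulary or at a finer granularity, consider the following sketch. It is an assumption-laden toy in plain Python, not the project's search application; the vocabulary prefixes, terms, items and function names are all hypothetical.

```python
# Hypothetical alignments. EQUIVALENT maps a term to its counterpart in
# another vocabulary; CONTAINED_IN maps a temporal term to a broader one.
EQUIVALENT = {"vocabA:Athens": "vocabB:Athens"}
CONTAINED_IN = {
    "time:1942-03": "time:1942",
    "time:1942": "period:Occupation",
}

# Hypothetical items with their annotations.
ITEMS = {
    "item-1": {"vocabA:Athens", "time:1942-03"},
    "item-2": {"vocabB:Athens", "time:1940"},
}


def canonical(term):
    """Map a term to one representative of its equivalence class."""
    return EQUIVALENT.get(term, term)


def expand(term):
    """Return the canonical term plus every broader term that contains it."""
    result = set()
    current = canonical(term)
    while current is not None:
        result.add(current)
        current = CONTAINED_IN.get(current)
    return result


def search(*query_terms):
    """Return ids of items whose expanded annotations cover all query terms."""
    wanted = {canonical(q) for q in query_terms}
    hits = []
    for item_id, annotations in ITEMS.items():
        covered = set()
        for annotation in annotations:
            covered |= expand(annotation)
        if wanted <= covered:
            hits.append(item_id)
    return sorted(hits)


print(search("vocabA:Athens"))
print(search("period:Occupation", "vocabB:Athens"))
```

In this toy, a place query phrased in either vocabulary reaches both items, while the period query reaches only the item whose month-level annotation falls inside the broader period.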
[1] M. Spiliotopoulou & P. Papastratis (eds), Chronology of Events 1940-1944. From the Documents of the British Foreign Office, two vols, Athens 2002 and 2004 (in Greek).
[2] A. Chortaras, G. Stamou, D2RML: Integrating Heterogeneous Data and Web Services into Custom RDF Graphs, LDOW@WWW 2018.
[3] TimeLine vocabulary, http://sw.islab.ntua.gr/timeline/
[4] Greek Historical Periods vocabulary, Greek National Documentation Center, https://www.semantics.gr/authorities/vocabularies/historical-periods/vocabulary-entries
Practical evaluation showed that, although the quality of the results depended on the quality of
the automatically produced annotations, and the complexity of the queries affected
performance, knowledge graph technologies can enable researchers to locate relevant material
through expressive queries exploiting the levels of abstraction provided by the underlying
knowledge. For the Academy of Athens, the outcome was particularly fruitful: firstly, its
collection became part of a rich knowledge graph that facilitated research on the specific period;
secondly, interlinking exposed documentation inconsistencies that must be resolved
to improve the quality of semantic queries.
Files
DYAS-FROMDATASET.pdf