Interdisciplinary Knowledge Graphs: Semantic Modelling using Linked Open Data, Wikibase and the Wikiverse
Introduction
Within the digital transformation from an analogue world to the Knowledge Era in the Third Era of Computing (Thiery, 2019), computational archaeology and Research Software Engineering methods play a major role (Thiery, Veller, et al., 2023). These methods are also crucial for data FAIRification (Wilkinson et al., 2016), that is, making data Findable, Accessible, Interoperable, and Reusable, which the German National Research Data Infrastructure (NFDI; Hartl, Wössner, and Sure-Vetter, 2021) addresses in order to apply Open Science principles to research data. Within cultural heritage data networks, so-called knowledge graphs must be interdisciplinary and interoperable. This raises challenges such as heterogeneity and data islands. Digital transformation builds bridges between these islands via Semantic Web technologies and Citizen Science hubs such as Wikidata and OpenStreetMap (OSM).
The NFDI4Objects consortium (Bibby et al., 2023) represents a broad community dealing with the material remains of human history spanning around 3 million years. It involves numerous disciplines, including the humanities, cultural studies, and natural sciences, with an archaeological and historical focus. The objects include potsherds of common ware, serially produced objects such as coins, organic remains such as wood, bones or pollen, inscribed clay tablets, papyri, and stones. The objects and their relations constantly change, giving each object an individual biography shaped by documenting, collecting, analysing, protecting, storing, and sharing. Therefore, the central tasks of NFDI4Objects include (a) creating representations of physical objects as research data, (b) relating them to their contexts, (c) transforming them adequately into the digital space, and (d) curating them according to domain-specific requirements (Thiery, Mees, et al., 2023). Consequently, the NFDI4Objects Research Data Life Cycle goes along with the Object Biography.
Material & Data
This paper uses research data from several domains and sources: from archaeology, such as Irish Ogham stones (Macalister, 1945; MacManus, 1997; Schmidt and Thiery, 2022) and the silver coinage of Croton (Rutter, 1997; 2001; Stazio, 1984; Garraffo, 1987), provenance research (Hopp, 2018; Hopp and von dem Bussche, 2023; 2022a; 2022b), and geosciences (Schenk et al., 2024; Thiery and Schenk, 2023a; 2023c).
Ogham stones are monoliths bearing inscriptions in the early medieval Gaelic "primitive Irish" Ogham script, erected mainly on the island of Ireland and in the western part of Great Britain between the 4th and 9th centuries AD. Examples are CIIC 178 at Coumeenole North / Dunmore Head (Fig. 2; the westernmost point of Ireland; OSM node 5145413640; coordinates: 52.1103°N, -10.4732°E) and CIIC 81 in front of the Visitor Centre at University College Cork (UCC; OSM node 11071361392; coordinates: 51.8938°N, -8.4921°E).
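As a small illustration of how such coordinates can be prepared for Linked Open Data, the following sketch formats the two positions quoted above as WKT points. The helper name `to_wkt_point` and the dictionary layout are assumptions made for this example and are not taken from the Ogham workflow itself. WKT lists longitude before latitude, and a negative easting such as -10.4732°E corresponds to a western longitude.

```python
# Minimal sketch: format the Ogham stone coordinates quoted above as WKT points.
# The helper name and the data layout are illustrative assumptions.

def to_wkt_point(lat: float, lon: float) -> str:
    """Return a WKT point; WKT uses x y order, i.e. longitude then latitude."""
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("coordinate out of range")
    return f"POINT({lon} {lat})"


# Values as given in the text (latitude north, longitude east).
stones = {
    "CIIC 178": {"osm_node": 5145413640, "lat": 52.1103, "lon": -10.4732},
    "CIIC 81": {"osm_node": 11071361392, "lat": 51.8938, "lon": -8.4921},
}

wkt_by_stone = {
    name: to_wkt_point(stone["lat"], stone["lon"]) for name, stone in stones.items()
}

for name, wkt in wkt_by_stone.items():
    print(name, wkt)
```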
Hoard analyses of the silver coinage of Croton (6th to 3rd century BC), an Achaean colony in southern Italy, include spatial uncertainties. The sites identified are mainly derived from the literature and have varying degrees of precision about their geographical location.
About 40,000 yr b2k[1], the largest eruption of the Campanian Ignimbrite (CI) took place in the Phlegraean Fields (Barberi et al., 1978; De Vivo et al., 2001; Schenk et al., 2024). Evidence of the ash fall from this Late Pleistocene volcanic event is found in Central Europe, often in isolated watersheds and valleys. These sites are recorded in publications, e.g. by precise coordinates or references to cities, regions, caves, and archaeological sites (Thiery and Schenk, 2023a).
In a project by the Numismatic Collection (Berlin State Museums), all acquisitions from 1933 to 1945 were examined, and their previous owners and sellers, i.e. people or corporate bodies, were identified (Weisser, 2022a; 2022b). Identifiers were created in the coin cabinet's own authority file data portal (NDP) and stored together with authority data from other portals such as GND or VIAF.
Methodology
To create interdisciplinary and interoperable knowledge graphs, Semantic Web technologies such as the Resource Description Framework (RDF) (Klyne, Carroll, and McBride, 2014) and Linked Open Data (Berners-Lee, 2006; Schmidt, Thiery, and Trognitz, 2022) are used. Doubts in data modelling, such as vagueness, uncertainty, and ambiguity, are handled explicitly; in the context of the NFDI4Objects project they are often grouped under the umbrella terms “fuzziness and wobbliness” (Thiery et al., 2021). Modelling these doubts makes research data comprehensible, especially when using the “Fuzzy Spatial Locations Ontology” (Thiery, 2023b), which is based on “PROV-O”, “SKOS” and “GeoSPARQL” (Thiery, 2023a: 3–5). In addition, the free and open-source software Wikibase and its instances, such as Wikidata, open this method up to Open Science.
Results & Conclusions
The methodology used leads to modelling strategies to FAIRify data. The resulting LOD from the Campanian Ignimbrite project (Thiery and Schenk, 2023b) or the Croton project (Thiery and Baars, 2023), published as RDF, can be converted into human-readable HTML files using the “SPARQL Unicorn Ontology Documentation Research Tool” (Homburg and Thiery, 2024), which is based on the SPARQL Unicorn (Thiery et al., 2020; Thiery and Homburg, 2024). The SPARQL Unicorn also enables QGIS users to integrate these data into GIS analysis.
A more challenging example is the hoard «San Giorgio Ionico 1949, San Giorgio Ionico (near Taranto), on the property of E. De Finis» (Squirrel-ID no. 3001), which includes, among others, coinage of Croton and poses the question: “Where was the property of E. De Finis located?”. With the help of Lo Porto (1990), Siciliano (2002) and OSM node 68530185, the coordinate POINT(17.3787 40.4579) can be determined, as shown in crotonsite_3001.
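How such a derived location might be written down as RDF can be sketched as follows. The example serializes four N-Triples statements using terms from GeoSPARQL, SKOS and PROV-O. The `example.org` IRIs and the helper functions `iri` and `literal` are placeholders assumed for this illustration; they are not the identifiers or the data model published by the Croton project, and the terms of the Fuzzy Spatial Locations Ontology are deliberately not reproduced here.

```python
# Sketch: describe the hoard site as N-Triples using GeoSPARQL, SKOS and PROV-O terms.
# The example.org IRIs are placeholders, not the project's real identifiers.

GEO = "http://www.opengis.net/ont/geosparql#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
PROV = "http://www.w3.org/ns/prov#"

SITE = "https://example.org/site/crotonsite_3001"           # hypothetical IRI
GEOM = "https://example.org/site/crotonsite_3001/geometry"  # hypothetical IRI
OSM_NODE = "https://www.openstreetmap.org/node/68530185"    # node cited in the text


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets as required by N-Triples."""
    return f"<{value}>"


def literal(value: str, datatype: str = "") -> str:
    """Build a plain or datatyped N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{datatype}>' if datatype else f'"{escaped}"'


triples = [
    # Human-readable label of the find spot.
    (iri(SITE), iri(SKOS + "prefLabel"), literal("San Giorgio Ionico 1949")),
    # Link the site to its geometry and give the geometry as a WKT literal.
    (iri(SITE), iri(GEO + "hasGeometry"), iri(GEOM)),
    (iri(GEOM), iri(GEO + "asWKT"),
     literal("POINT(17.3787 40.4579)", GEO + "wktLiteral")),
    # Record that the geometry was derived from the OSM node.
    (iri(GEOM), iri(PROV + "wasDerivedFrom"), iri(OSM_NODE)),
]

ntriples = "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
print(ntriples)
```

Keeping the geometry as a separate resource makes it possible to attach the provenance statement to the location rather than to the hoard itself, so that alternative or less certain locations could be added later without contradicting one another.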
The Wikidata workflow creates entries in Wikidata, e.g. Q106680733 for CIIC 81 as the “Squirrel Stone”, which is part of the Ogham stones collection (queryable via https://w.wiki/9jQv).
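To illustrate how such an entry can be retrieved programmatically, the sketch below builds a request URL for the public Wikidata SPARQL endpoint. The query shown simply lists statements about one item; it is an assumption made for this example and not the query behind the short link above. The function name `build_query_url` is likewise hypothetical. No network request is sent.

```python
# Sketch: build a request URL for the Wikidata SPARQL endpoint (no request is sent).
# The query text is illustrative and not the query behind the short link in the text.
import re
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query_url(qid: str, limit: int = 10) -> str:
    """Return a URL that asks for up to `limit` statements about the item `qid`."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    query = (
        "SELECT ?property ?value WHERE { "
        f"wd:{qid} ?property ?value . "
        "} "
        f"LIMIT {limit}"
    )
    # urlencode percent-encodes the query so it is safe inside a URL.
    return WIKIDATA_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_query_url("Q106680733")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client; when doing so, it is good practice to send a descriptive User-Agent header.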
Semantically modelled entries in a Wikibase instance (here: https://n4o-prov.wikibase.cloud) open up knowledge about coin dealers (e.g., Zakaris Bezdikian; Item:Q20) or corporate bodies (e.g., Ignaz Storek steel foundry and machine factory in Brünn; Item:Q23).
Discussion
This paper shows exemplary processes for creating interdisciplinary knowledge graphs using semantic modelling techniques such as Linked Open Data, Wikibase, and Wikidata. These digital approaches and applied methods in computational archaeology create a shared understanding and open up the possibility of common standards within the NFDI4Objects community.
References
Barberi, F., F. Innocenti, L. Lirer, R. Munno, T. Pescatore, and R. Santacroce (1978), ‘The campanian ignimbrite: a major prehistoric eruption in the Neapolitan area (Italy)’, Bulletin Volcanologique, 41:1, 10–31, https://doi.org/10.1007/BF02597680.
Berners-Lee, T. (2006), ‘Linked Data - Design Issues’, https://www.w3.org/DesignIssues/LinkedData.html [accessed 17 April 2024].
Bibby, D., K.-C. Bruhn, A. Busch, F. Dührkohp, et al. (2023), ‘NFDI4Objects - Proposal’, NFDI4Objects Zenodo Community, https://doi.org/10.5281/zenodo.10409227.
De Vivo, B., G. Rolandi, P. B. Gans, A. Calvert, W. A. Bohrson, F. J. Spera, et al. (2001), ‘New constraints on the pyroclastic eruptive history of the Campanian volcanic Plain (Italy)’, Mineralogy and Petrology, 73:1–3, 47–65, https://doi.org/10.1007/s007100170010.
Garraffo, S. (1987), ‘Crotoniensia: Dall’incuso al doppio rilievo’, in T. Caruso (ed.), Studi per Laura Breglia (Rome: Istituto poligrafico e Zecca dello Stato), pp. 105–17.
Hartl, N., E. Wössner, and Y. Sure-Vetter (2021), ‘Nationale Forschungsdateninfrastruktur (NFDI)’, Informatik Spektrum, 44:5, 370–73, https://doi.org/10.1007/s00287-021-01392-6.
Homburg, T., and F. Thiery (2024), ‘SPARQL Unicorn Ontology Documentation’, Squirrel Papers, 6:2, #2, https://doi.org/10.5281/zenodo.10780476.
Hopp, M. (2018), ‘Provenienzrecherche und digitale Forschungsinfrastrukturen in Deutschland: Tendenzen, Desiderate, Bedürfnisse’, eds E. Blimlinger and H. Schödl, ... ... (k)ein Ende in Sicht. 20 Jahre Kunstrückgabegesetz in Österreich., Schriftenreihe der Kommission für Provenienzforschung:8, 37–62, https://doi.org/10.7767/9783205201274.37.
Hopp, M., and R. von dem Bussche (2022a), ‘Der „Bestand B323“ als Knowledgegraph für die Provenienzforschung. Methodische Überlegungen zur Verarbeitung von Archivdaten als Linked Open Data’, Archivar 75, 2022:1.
——— (2022b), ‘Data related the archival collection Bundesarchiv Koblenz B323’, GitHub, art-provenance:b323, https://github.com/art-provenance/b323 [accessed 17 April 2024].
——— (2023), ‘Provenienzforschung und ihre Quellenbestände. Aktuelle Nutzungsszenarien zwischen Open Access und Inaccessibility’, Proceedings of the Digital Humanities Im Deutschsprachigen Raum, 2023, https://doi.org/10.5281/zenodo.7715360.
Klyne, G., J. J. Carroll, and B. McBride (2014), ‘RDF 1.1 Concepts and Abstract Syntax; W3C Recommendation 25 February 2014’, https://www.w3.org/TR/rdf11-concepts/ [accessed 17 April 2024].
Lo Porto, F. G. (1990), ‘Testimonianze archeologiche della espansione tarantina in età arcaica’, Taras, 10, 67–97.
Macalister, R. A. S. (1945), Corpus Inscriptionum Insularum Celticarum (Dublin: Stationery Office).
MacManus, D. (1997), A Guide to Ogam (Maynooth: An Sagart).
Rutter, N. K. (1997), The Greek Coinages of Southern Italy and Sicily (London: Spink).
——— (ed.) (2001), Historia Numorum: Italy (London: British Museum Press).
Schenk, F., U. Hambach, S. Britzius, D. Veres, and F. Sirocko (2024), ‘A Cryptotephra Layer in Sediments of an Infilled Maar Lake from the Eifel (Germany): First Evidence of Campanian Ignimbrite Ash Airfall in Central Europe’, Quaternary, 7:2, 17, https://doi.org/10.3390/quat7020017.
Schmidt, S. C., and F. Thiery (2022), ‘SPARQLing Ogham Stones: New Options for Analyzing Analog Editions by Digitization in Wikidata’, CEUR Workshop Proceedings, 3110:Graph Technologies in the Humanities 2020, 211–44, https://doi.org/10.5281/zenodo.6380914.
Schmidt, S. C., F. Thiery, and M. Trognitz (2022), ‘Practices of Linked Open Data in Archaeology and Their Realisation in Wikidata’, Digital, 2:3, 333–64, https://doi.org/10.3390/digital2030019.
Siciliano, A. (2002), ‘La circolazione monetale’, in Convegno di Studi sulla Magna Grecia (ed.), Taranto e Il Mediterraneo (Taranto), pp. 483–517.
Stazio, A. (1984), ‘Problemi della monetazione di Crotone’, in Istituto per la Storia e l’Archeologia della Magna Grecia (ed.), Crotone (Taranto: Istituto per la storia e l’archeologia della Magna Grecia), pp. 369–98.
Thiery, F. (2019), ‘Archaeology 4.0: Archaeology in the Third Era of Computing’, Squirrel Papers, 1:1, #2, https://doi.org/10.5281/zenodo.2629595.
——— (2023a), ‘Dealing with Doubts: Modelling Approaches in site georeferencing’, Squirrel Papers, 5(1):1, #7, https://doi.org/10.5281/zenodo.10403509.
——— (2023b), ‘Fuzzy Spatial Locations Ontology’, Squirrel Papers, 5(2):2, #3, https://doi.org/10.5281/zenodo.10362777.
Thiery, F., and S. Baars (2023), ‘Croton Site Instances Collection’, Squirrel Papers, Research Squirrel Engineers, via @croton-geo, https://research-squirrel-engineers.github.io/croton-geo/Site_collection/index.html [accessed 17 April 2024].
Thiery, F., and T. Homburg (2024), ‘SPARQLing Unicorn QGIS Plugin’, Squirrel Papers, 6:2, #1, https://doi.org/10.5281/zenodo.10779466.
Thiery, F., A. Mees, K. Tolle, and D. Wigg-Wolf (2021), ‘TRAIL 2.2: Evaluation of fuzziness and wobbliness in numismatics and ceramology’, NFDI4Objects TRAILS, 2021, No. 2.2, https://doi.org/10.5281/zenodo.5654897.
Thiery, F., A. W. Mees, B. Weisser, F. F. Schäfer, S. Baars, S. Nolte, et al. (2023), ‘Object-Related Research Data Workflows Within NFDI4Objects and Beyond’, Proceedings of the Conference on Research Data Infrastructure, 1, https://doi.org/10.52825/cordi.v1i.326.
Thiery, F., and F. Schenk (2023a), ‘Campanian Ignimbrite Geo Locations’, Squirrel Papers, 5(2):2, #2, https://doi.org/10.5281/zenodo.10361309.
——— (2023b), ‘CI Site Instances Collection’, Squirrel Papers, Research Squirrel Engineers, via @campanian-ignimbrite-geo, https://research-squirrel-engineers.github.io/campanian-ignimbrite-geo/Site_collection/index.html [accessed 17 April 2024].
——— (2023c), ‘How to locate the Campanian Ignimbrite site Urluia based on literature? How to provide and publish this data in a FAIR way?’, Squirrel Papers, 5(1):1, #5, https://doi.org/10.5281/zenodo.10262720.
Thiery, F., S. C. Schmidt, T. Homburg, and M. Trognitz (2020), ‘The SPARQL Unicorn: An introduction’, Squirrel Papers, 2:1, #1, https://doi.org/10.5281/zenodo.3742185.
Thiery, F., J. Veller, L. Raddatz, L. Rokohl, F. Boochs, and A. W. Mees (2023), ‘A Semi-Automatic Semantic-Model-Based Comparison Workflow for Archaeological Features on Roman Ceramics’, ISPRS International Journal of Geo-Information, 12:4, 167, https://doi.org/10.3390/ijgi12040167.
Weisser, B. (2022a), ‘Digitale Provenienzforschung am Münzkabinett der Staatlichen Museen zu Berlin. Erwerbungen zwischen 1933 und 1945’, Sonderheft Der Geldgeschichtlichen Nachrichten, 57, 260–67.
——— (ed.) (2022b), ‘Münzsammlungen in Deutschland zwischen 1933 und 1945. Erwerbungen und Normdaten.’, Sonderheft Der Geldgeschichtlichen Nachrichten, 57, 257–356.
Wilkinson, M. D., M. Dumontier, Ij. J. Aalbersberg, G. Appleton, et al. (2016), ‘The FAIR Guiding Principles for scientific data management and stewardship’, Scientific Data, 3, 160018, https://doi.org/10.1038/sdata.2016.18.
[1] yr b2k: The geosciences use their own systems to describe years. For example, “Before Present” (BP) is used, which means "before 1950 AD". The term "b2k" means "before the year 2000 AD", i.e. the year 2000 is used as a reference point. 40,000 yr b2k therefore means approx. 40,000 years before the year 2000 AD, i.e. approx. 38,000 BC.
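The arithmetic behind this note can be sketched in a few lines. The function names are assumptions made for this illustration. Because the BC/AD calendar has no year zero, the exact conversion of 40,000 yr b2k yields 38001 BC, which agrees with the approximate value given above.

```python
# Sketch: convert ages given in "yr b2k" to "BP" and to the BC/AD calendar.
# Function names are illustrative; the reference years follow the note above.

def b2k_to_bp(age_b2k: int) -> int:
    """Years before 2000 AD expressed as years before 1950 AD."""
    return age_b2k - 50


def b2k_to_calendar(age_b2k: int) -> str:
    """Return the calendar year as a string such as '38001 BC' or '1950 AD'."""
    year = 2000 - age_b2k  # astronomical numbering, which includes a year 0
    if year > 0:
        return f"{year} AD"
    # The BC/AD calendar has no year 0: astronomical year 0 is 1 BC.
    return f"{1 - year} BC"


print(b2k_to_bp(40000))        # 39950
print(b2k_to_calendar(40000))  # 38001 BC
```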
Files
20241106_CHNT29_2024_Vienna_InterdisciplinaryKnowledgeGraphs.pdf