Published August 18, 2025 | Version v1
Conference paper Open

Wikibase - the best software to communicate with the upcoming knowledge graphs?

  • 1. Martin Luther University Halle-Wittenberg
  • 2. Forschungszentrum Gotha, Universität Erfurt

Contributors

  • 1. Nationale Forschungsdateninfrastruktur (NFDI) e.V.
  • 2. University of Amsterdam

Description

The integration of short-term projects (and most research projects are funded for only two or three years) into the infrastructures of long-term institutions has so far posed significant initial challenges for project leaders. In the humanities, the use of databases is not standard practice, and expertise in computer science is scarce in this field. The data, on the other hand, are often extremely complex and specific: remote sources require very specialized modeling, and linking with authority data becomes difficult where information is missing and where fragmentation shapes the state of knowledge. FactGrid, founded in 2018 and funded by NFDI since 2023, has proven surprisingly successful with its offer to support historically oriented projects with an expandable database structure. Around 50 projects are currently active on the platform, often in close exchange with one another. The graph database software Wikibase, developed by Wikimedia and Google for the Wikidata project, makes it easy to model information in line with source materials. Data entry is easy, and teams can access the same data in their various languages. Thanks to the NFDI, use of the platform is free of charge, and the maintenance of the collective instance is comparatively inexpensive. When individual projects reach the end of their funding periods, their data remain active in the growing dataset, which subsequent projects continue to use. The software's API and SPARQL endpoints have been developed according to the FAIR principles: they are designed to make data retrievable in configurable ways and to integrate them into larger knowledge graphs. Using the example of the "Ontology of Historical, Official, and Occupational Titles", an independent project within the FactGrid data landscape that organizes and hierarchizes over 45,000 authority records for offices and professions, we aim to discuss the opportunities that Wikibase as a software opens up for individual projects.
These include the ability to use their own taxonomies on the platform, to create their own vocabularies, and to conduct mapping and matching against international standards, both within the platform and externally. We are particularly interested in the existing challenges: behind the OhdAB ontology lies a variant apparatus that is immensely important for data matching but is currently managed separately, because it would significantly overstretch the technical capabilities of the Wikibase platform; the Blazegraph search service has its limitations. There is an increasing demand for deeper AI-supported services in both data entry and data mining. The need is particularly pressing in data integration, where entity recognition becomes increasingly complex as data volume grows, and in data mining, where SPARQL, an extremely powerful query language, is not intuitive to use. Fundamental questions arise in this context: does the future lie in large collaborative instances such as Wikidata and FactGrid, or rather in small, independent Wikibase instances that can be accessed in the future through federated search and integration into larger knowledge graphs? Research is not the same under these conditions.
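
To illustrate why SPARQL is powerful but not intuitive to use, the following is a minimal Python sketch of a label search against a SPARQL endpoint. It is not taken from the paper. The endpoint URL `https://example.org/sparql` and the function names are hypothetical placeholders. The request shape follows the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter), and the parser expects the standard SPARQL JSON results format. The unindexed `CONTAINS` filter used here is one reason such searches may become slow on large datasets.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint used for illustration only; replace with a real one.
ENDPOINT = "https://example.org/sparql"


def build_label_query(term: str, lang: str = "en", limit: int = 10) -> str:
    """Build a SPARQL query for items whose label contains `term`."""
    if not lang.replace("-", "").isalnum():
        raise ValueError("unexpected language tag: " + repr(lang))
    # Escape backslashes and double quotes so the term stays one string literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?label WHERE {\n"
        "  ?item rdfs:label ?label .\n"
        f'  FILTER(LANG(?label) = "{lang}")\n'
        f'  FILTER(CONTAINS(LCASE(?label), LCASE("{escaped}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


def parse_bindings(payload: str) -> list[dict[str, str]]:
    """Flatten SPARQL JSON results into one plain dict per result row."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


def run_query(endpoint: str, query: str) -> list[dict[str, str]]:
    """Send the query over the network and return the parsed rows."""
    request = Request(
        build_request_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

A call such as `run_query(ENDPOINT, build_label_query("Bürgermeister", lang="de"))` would return a list of dictionaries with the keys `item` and `label`, provided the endpoint exists and exposes labels via `rdfs:label`.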

Files

CoRDI_2025_paper_364.pdf

Size: 108.6 kB
md5:3d2bf161ccb167b985d24efb48bfef8e