Published April 15, 2018 | Version v1
Preprint Open

Wikidata for Digital Preservation

  • 1. Yale University
  • 2. Yale University Library
  • 3. Open Preservation Foundation

Description

The Wikidata knowledge base provides a public infrastructure for machine-readable metadata about computing resources. To optimize the Wikidata knowledge base for digital preservation professionals, we need to create additional structured descriptions of file formats and software titles. To facilitate this process, we introduce the Wikidata for Digital Preservation portal. This free software portal allows people to browse the Wikidata knowledge base and to contribute data to it. The interface guides users in contributing data in alignment with current data models for the domain of computing, which are collaboratively created by members of the Wikidata community.

Structured data about file formats, the many versions of software titles, and computing environments are already available in Wikidata. The content of Wikidata is licensed under the Creative Commons Zero (CC0) license, meaning that anyone can reuse the data for any purpose. The content in Wikidata is available in more than 350 human languages. The data in Wikidata is FAIR data, and it is five-star linked open data. Our portal provides a streamlined interface designed for the needs of the digital preservation community. When using the Wikidata for Digital Preservation portal, users will be enriching a multilingual repository of data open for reuse across institutional boundaries. The vision we share of using Wikidata as a registry of technical metadata for the domain of computing is a vision of semantic integration of data from multiple sources.
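
As an illustration of how the structured data described above can be reused, the following minimal sketch builds a SPARQL query for file formats and prepares a request to the public Wikidata Query Service. It is not taken from the portal's source code. The identifiers used (Q235557 for "file format", P31 for "instance of", P1195 for "file extension", P2748 for "PRONOM file format identifier") and the helper names `build_query`, `build_request`, and `fetch_formats` are assumptions made for this example and should be checked against Wikidata before use.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=10):
    """Return a SPARQL query listing file formats with optional extension and PRONOM ID.

    The item and property identifiers are assumptions for this sketch.
    Only direct instances of "file format" are matched, not subclasses.
    """
    return f"""
SELECT ?format ?formatLabel ?extension ?puid WHERE {{
  ?format wdt:P31 wd:Q235557 .
  OPTIONAL {{ ?format wdt:P1195 ?extension . }}
  OPTIONAL {{ ?format wdt:P2748 ?puid . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
""".strip()


def build_request(limit=10):
    """Prepare (but do not send) an HTTP GET request asking for JSON results."""
    params = urllib.parse.urlencode({"query": build_query(limit), "format": "json"})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        # A descriptive User-Agent is sent as a courtesy; the value is hypothetical.
        headers={"User-Agent": "digipres-example/0.1 (illustrative sketch)"},
    )


def fetch_formats(limit=10):
    """Send the request and return one dict per result row (requires network access)."""
    with urllib.request.urlopen(build_request(limit), timeout=30) as response:
        data = json.load(response)
    rows = []
    for binding in data["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...}.
        rows.append({name: cell["value"] for name, cell in binding.items()})
    return rows
```

Calling `fetch_formats(5)` would return up to five rows, each with the keys `format` and `formatLabel` and, where the data exists, `extension` and `puid`. Because the content is CC0, the results can be reused without further permission.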

We publish the source code for the Wikidata for Digital Preservation portal under the GNU General Public License, version 3 (GPLv3), to ensure that this tool will remain free software. We aligned our license decision with those of other tools in the Wikidata ecosystem and of the enabling software infrastructure of Wikidata itself, to ensure the sustainability of this infrastructure over time. Active developer communities maintain and provision this public infrastructure, and cultural heritage organizations are able to freely reuse the descriptive and technical metadata for software, file formats, and all other resources in the domain of digital preservation in hundreds of human languages.

Notes

This is a preprint of a paper submitted to iPres2018.

Files

wikidata-digital-preservation.pdf (1.2 MB)
md5:755f2b6824febb648e665686e6395113