Published June 8, 2021 | Version v1
Presentation Open

Ideas Challenge 2020/2021: WikiData Integration with Repository Contents

Description

Presentation of the work of an Ideas Challenge team from the Open Repositories Conference 2020. The challenge addressed is the integration of WikiData with repositories as a way of improving multilingual access to repository contents. Multilingual indexing by search engines and aggregators and the overall importance of linguistic diversity in scholarly publishing and access are discussed. The results presented include a detailed overview of metadata standards relevant for representing multilingual WikiData concepts in repositories: HTML5, Dublin Core, DataCite, JATS XML, and Schema.org. Two Python scripts for enriching repository metadata with WikiData concepts are presented, along with their use on EPrints JSON-LD metadata and on a test dataset of publications in information visualization. These scripts use the DBPedia Spotlight API to annotate scholarly metadata with DBPedia concepts, which in turn are used to extract translated labels from WikiData. A resource list of relevant projects is included, as well as some additional examples and notes.
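
The annotate-then-translate flow described above could be sketched roughly as follows. This is not the team's code, only a minimal illustration under stated assumptions: the public DBPedia Spotlight endpoint and the WikiData `wbgetentities` API are used, and each English DBPedia resource URI is assumed to end in the matching English Wikipedia page title, which is then used to look up the WikiData item. All function names, the default languages, and the confidence threshold are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoints: the public Spotlight service and the Wikidata web API.
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def fetch_json(url, params):
    """Send a GET request with query parameters and decode the JSON reply."""
    query = urllib.parse.urlencode(params)
    request = urllib.request.Request(
        f"{url}?{query}",
        headers={
            "Accept": "application/json",
            "User-Agent": "metadata-enrichment-sketch/0.1",  # hypothetical name
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def dbpedia_uris(spotlight_response):
    """Collect distinct DBpedia URIs from a Spotlight annotation response."""
    uris = []
    for resource in spotlight_response.get("Resources", []):
        uri = resource.get("@URI")
        if uri and uri not in uris:
            uris.append(uri)
    return uris


def wikipedia_title(dbpedia_uri):
    """Assumption: the last URI segment is the English Wikipedia page title."""
    local_name = dbpedia_uri.rsplit("/", 1)[-1]
    return urllib.parse.unquote(local_name).replace("_", " ")


def translated_labels(wikidata_response):
    """Map language code to label for the entities in a wbgetentities reply."""
    labels = {}
    for entity in wikidata_response.get("entities", {}).values():
        # Titles without a Wikidata item come back without a "labels" key.
        for lang, label in entity.get("labels", {}).items():
            labels[lang] = label["value"]
    return labels


def enrich(text, languages=("fr", "es"), confidence=0.5):
    """Annotate text with DBpedia concepts, then fetch their translated labels."""
    annotations = fetch_json(
        SPOTLIGHT_URL, {"text": text, "confidence": confidence}
    )
    result = {}
    for uri in dbpedia_uris(annotations):
        entities = fetch_json(
            WIKIDATA_API,
            {
                "action": "wbgetentities",
                "sites": "enwiki",
                "titles": wikipedia_title(uri),
                "props": "labels",
                "languages": "|".join(languages),
                "format": "json",
            },
        )
        result[uri] = translated_labels(entities)
    return result
```

Calling `enrich()` on a title or abstract string would return a dictionary keyed by DBPedia URI, each value holding the labels found for the requested languages. The network calls are kept separate from the parsing helpers so that the parsing can be checked against saved responses without contacting either service.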

Notes

This was presented by Tomasz Neugebauer and Iryna Kuchma on behalf of the following members of the Ideas Challenge team: Tomasz Neugebauer, Iryna Kuchma, Daisy Selematsela, Tembe Biziwe, Niklas Zimmer, Thomas Zodwa, Jo Havemann, Heather Staines, Francisco Berrizbeitia, Phil Stacey, Kathleen Shearer, Victor Venema, Slava Tykhonov.

Files

Size: 9.0 MB
md5:9e0e21a955dfc61a8030567b51b915dc