There is a newer version of this record available.

Preprint Open Access

Representing COVID-19 information in collaborative knowledge graphs: the case of Wikidata

Houcemeddine Turki; Mohamed Ali Hadj Taieb; Thomas Shafee; Tiago Lubiana; Dariusz Jemielniak; Mohamed Ben Aouicha; Jose Emilio Labra Gayo; Eric A. Youngstrom; Mus'ab Banat; Diptanshu Das; Daniel Mietchen

Information related to the COVID-19 pandemic ranges from biological to bibliographic, from geographical to genetic and beyond. The structure of the raw data is highly complex, so converting it to meaningful insight requires data curation, integration, extraction and visualization, the global crowdsourcing of which provides both additional challenges and opportunities. Wikidata is an interdisciplinary, multilingual, open collaborative knowledge base of more than 90 million entities connected by well over a billion relationships. A web-scale platform for broader computer-supported cooperative work and linked open data, it can be queried in multiple ways in near real time by specialists, automated tools and the public, including via SPARQL, a semantic query language used to retrieve and process information from databases saved in Resource Description Framework (RDF) format. Here, we introduce four aspects of Wikidata that enable it to serve as a knowledge base for general information on the COVID-19 pandemic: its flexible data model, its multilingual features, its alignment to multiple external databases, and its multidisciplinary organization. The rich knowledge graph created for COVID-19 in Wikidata can be visualized, explored and analyzed, for purposes like decision support as well as educational and scholarly research.
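As a minimal sketch of the SPARQL access and multilingual labels mentioned above, the following Python builds a query for the symptoms recorded on the COVID-19 item and asks the label service for a chosen language. The identifiers `Q84263196` (COVID-19) and `P780` (symptoms and signs), the public endpoint URL, and all function names are assumptions made for illustration; none of them is taken from the abstract.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_symptom_query(language: str = "en") -> str:
    """Return a SPARQL query listing symptoms of COVID-19 with labels.

    Q84263196 (COVID-19) and P780 (symptoms and signs) are assumed
    identifiers; check them on Wikidata before relying on the results.
    """
    return f"""
SELECT ?symptom ?symptomLabel WHERE {{
  wd:Q84263196 wdt:P780 ?symptom .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}
""".strip()


def build_request_url(query: str) -> str:
    """Encode the query as a GET request asking for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return WDQS_ENDPOINT + "?" + params


def fetch_symptom_labels(language: str = "en") -> list[str]:
    """Run the query over the network and return the label strings."""
    request = urllib.request.Request(
        build_request_url(build_symptom_query(language)),
        # A descriptive User-Agent is good etiquette for public endpoints.
        headers={"User-Agent": "covid-wikidata-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [row["symptomLabel"]["value"] for row in data["results"]["bindings"]]
```

Calling `fetch_symptom_labels("fr")` instead of `fetch_symptom_labels("en")` would request the same statements with French labels, which is one way the multilingual features could be exercised. The call needs network access, and the results depend on the current state of Wikidata.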
