Coverage and completeness: comparing sources of open metadata
Description
Lightning talk presentation at FORCE2021
Abstract:
The recently announced closure of Microsoft Academic raises challenges for the many tools, projects and services that have relied on it as a source of largely open metadata. The closure highlights the need for transparency, provenance and community governance in the collection and provision of open metadata. Multiple organizations are working to produce such datasets, in part to fill the gap that the closure of Microsoft Academic will create.
Ideally, these initiatives will not lead to siloed collections of metadata but will contribute to a rich open metadata landscape. More than simply a set of datasets competing with each other, we need data that can be combined to enable discovery, linking and integration of data on research processes and outputs. To enable evaluation and systemic change in the research system, we will need to tell new stories about how research is produced, shared and used. In this project, we assessed the value added by integrating Microsoft Academic data with Crossref metadata, and compared this to the recently released OpenAlex dataset.
We show how this analysis can be extended to other datasets, such as OpenAIRE and CORE, to assess the completeness and coverage of current metadata collections, to identify where they can strengthen each other, and to show which gaps and biases in the open metadata landscape remain to be addressed.
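As a rough illustration of what "coverage" and "value added" can mean in practice, the sketch below computes the share of records that carry a given field in one source, then recomputes it after filling gaps from a second source matched on DOI. It is not the method used in the project: the function names (`field_coverage`, `merge_sources`), the field names, and the toy records are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional

# A record maps field names to values; None marks a missing value.
Record = Dict[str, Optional[str]]


def field_coverage(records: Dict[str, Record], field: str) -> float:
    """Return the share of records with a non-empty value for `field`."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records.values() if rec.get(field))
    return filled / len(records)


def merge_sources(primary: Dict[str, Record],
                  secondary: Dict[str, Record]) -> Dict[str, Record]:
    """Fill empty fields in `primary` from `secondary`, matching on DOI.

    Records that appear only in `secondary` are ignored, so coverage
    before and after the merge is measured over the same set of DOIs.
    """
    merged: Dict[str, Record] = {}
    for doi, rec in primary.items():
        extra = secondary.get(doi, {})
        fields = set(rec) | set(extra)
        merged[doi] = {f: rec.get(f) or extra.get(f) for f in fields}
    return merged


# Invented toy data, keyed by DOI (hypothetical identifiers).
source_a: Dict[str, Record] = {
    "10.1000/a": {"title": "Paper A", "abstract": None, "affiliation": "Univ X"},
    "10.1000/b": {"title": "Paper B", "abstract": None, "affiliation": None},
    "10.1000/c": {"title": "Paper C", "abstract": "Text C", "affiliation": None},
    "10.1000/d": {"title": "Paper D", "abstract": None, "affiliation": None},
}
source_b: Dict[str, Record] = {
    "10.1000/a": {"abstract": "Text A"},
    "10.1000/b": {"affiliation": "Univ Y"},
    "10.1000/z": {"title": "Only in source B"},
}

merged = merge_sources(source_a, source_b)
for field in ("abstract", "affiliation"):
    before = field_coverage(source_a, field)
    after = field_coverage(merged, field)
    print(f"{field}: {before:.0%} -> {after:.0%}")

# DOIs present only in the second source hint at gaps in the first.
only_in_b = set(source_b) - set(source_a)
print("only in source B:", sorted(only_in_b))
```

The same comparison could be repeated per publication year, publisher or country to surface the kinds of gaps and biases the abstract mentions, assuming those fields are available in the sources being compared.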
Files
FORCE2021 - Comparing sources of open metadata.pdf
(562.6 kB, md5:f94ee57ab20429fbf742546c474a1ad8)