Published March 5, 2025 | Version 2.0.0
Dataset Open

DOI ORCID Index: a dataset for enriching bibliographic metadata with ORCID identifiers (2025)

  • 1. OpenCitations

Description

The DOI ORCID Index is a dataset derived from the annual ORCID public data file, specifically from its XML summaries. It is designed to facilitate the enrichment of bibliographic metadata within the OpenCitations Meta index by integrating ORCID identifiers with DOI-based metadata.

Purpose

The sources used to populate OpenCitations Meta (e.g., Crossref, PubMed, JaLC, OpenAIRE, DataCite) do not always include ORCID identifiers for every author of a bibliographic record. The ORCID public data file can supply the missing identifiers. By combining it with those sources, the DOI ORCID Index makes bibliographic metadata more complete, which in turn supports author disambiguation and citation tracking.

Dataset structure

The dataset consists of CSV files with the following columns:

  • id: A DOI (Digital Object Identifier). If no bibliographic resource is linked to a specific ORCID, this field may contain "None".
  • value: The author’s name in the format: Lastname, Firstname [ORCID].

Example:

id                    value
10.1162/qss_a_00292   Massari, Arcangelo [0000-0002-8420-0696]
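
As an illustration of the structure described above, the following is a minimal sketch of how one row might be split into a DOI, an author name, and an ORCID. It assumes the files are comma-delimited with an `id,value` header row and that values containing a comma are quoted. The record does not state any of this, so check it against the actual files. The names `parse_row`, `read_index`, and `VALUE_PATTERN` are hypothetical helpers, not part of any OpenCitations software.

```python
import csv
import io
import re

# Assumed layout of the "value" column: "Lastname, Firstname [ORCID]".
# An ORCID is four groups of four characters; the last one may be "X".
VALUE_PATTERN = re.compile(
    r"^(?P<name>.*?)\s*\[(?P<orcid>\d{4}-\d{4}-\d{4}-\d{3}[\dX])\]$"
)


def parse_row(row):
    """Split one CSV row into DOI (or None), author name, and ORCID."""
    match = VALUE_PATTERN.match(row["value"].strip())
    if match is None:
        raise ValueError(f"unexpected value format: {row['value']!r}")
    doi = row["id"].strip()
    return {
        # The "id" field may hold the literal string "None" when no
        # bibliographic resource is linked to the ORCID.
        "doi": None if doi in ("", "None") else doi,
        "name": match.group("name"),
        "orcid": match.group("orcid"),
    }


def read_index(handle):
    """Parse every row of an open CSV file (header row assumed)."""
    return [parse_row(row) for row in csv.DictReader(handle)]


# In-memory sample built from the example row; the quoting is an assumption.
sample = io.StringIO(
    "id,value\n"
    '10.1162/qss_a_00292,"Massari, Arcangelo [0000-0002-8420-0696]"\n'
)
records = read_index(sample)
print(records[0])
```

Mapping the literal "None" to a null value keeps rows without a linked DOI from being mistaken for real identifiers when the index is joined with other metadata.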

Latest Version

The most recent version of this dataset is based on the ORCID Public Data File 2025.

Files

Files (1.2 GB)

md5:9c850d53ef7ca57a3b3a1dd1acc815a2
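
A downloaded copy can be checked against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library. The file name in the usage comment is a placeholder, since the record does not give one, and `md5_of_file` and `is_intact` are hypothetical helper names.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "9c850d53ef7ca57a3b3a1dd1acc815a2"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a file of about 1.2 GB is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


# Usage (placeholder path, replace with the downloaded file):
# print(is_intact("path/to/downloaded_file"))
```

A mismatch most likely indicates an incomplete or corrupted download, in which case the file should be fetched again.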

Additional details

Related works

Is compiled by
Software: 10.5281/zenodo.6620267 (DOI)
Is source of
Dataset: 10.5281/zenodo.15625650 (DOI)
Is variant form of
Dataset: 10.23640/07243.30375589 (DOI)
