DOI ORCID Index: a dataset for enriching bibliographic metadata with ORCID identifiers (2025)
Description
The DOI ORCID Index is a dataset derived from the annual ORCID public data file, specifically from its XML summaries. It is designed to facilitate the enrichment of bibliographic metadata within the OpenCitations Meta index by integrating ORCID identifiers with DOI-based metadata.
Purpose
Sources used to populate OpenCitations Meta (e.g., Crossref, PubMed, JaLC, OpenAIRE, DataCite) do not always include ORCID identifiers for every author of a bibliographic record. The ORCID public data file supplies this missing information. By combining it with the DOI-based metadata from those sources, the DOI ORCID Index makes bibliographic metadata more complete, which in turn supports author disambiguation and citation tracking.
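The enrichment idea can be illustrated with a minimal sketch. It assumes an in-memory lookup of the form `{doi: {author name: ORCID}}` and a record represented as a plain dictionary; the function name `enrich_authors`, the record layout, and the exact-name matching rule are all hypothetical simplifications, not the logic used by the OpenCitations software.

```python
def enrich_authors(record, index):
    """Return a copy of `record` with ORCIDs filled in where the index has a match.

    `index` maps a lowercase DOI to a dict of {casefolded author name: ORCID}.
    Matching on the exact name string is an assumption made for illustration.
    """
    candidates = index.get(record["doi"].lower(), {})
    enriched = []
    for author in record["authors"]:
        author = dict(author)  # copy so the input record is left untouched
        if not author.get("orcid"):
            key = author["name"].casefold()
            if key in candidates:
                author["orcid"] = candidates[key]
        enriched.append(author)
    return {**record, "authors": enriched}


# Hypothetical lookup built from the index; the single entry mirrors the
# example row given in this description.
index = {
    "10.1162/qss_a_00292": {"massari, arcangelo": "0000-0002-8420-0696"},
}

# Hypothetical record as it might arrive from a source without ORCIDs.
record = {
    "doi": "10.1162/qss_a_00292",
    "authors": [{"name": "Massari, Arcangelo"}, {"name": "Peroni, Silvio"}],
}

enriched = enrich_authors(record, index)
print(enriched["authors"])
```

Authors without a match in the lookup are left unchanged, so the sketch never overwrites or invents an identifier.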
Dataset structure
The dataset consists of CSV files with the following columns:
- id: A DOI (Digital Object Identifier). If no bibliographic resource is linked to a specific ORCID, this field may contain "None".
- value: The author’s name in the format: Lastname, Firstname [ORCID].
Example:
| id | value |
|---|---|
| 10.1162/qss_a_00292 | Massari, Arcangelo [0000-0002-8420-0696] |
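A row in this layout can be split into DOI, name, and ORCID with a small amount of parsing. The sketch below assumes that the CSV files start with a header row `id,value` and that the ORCID always appears in square brackets at the end of `value`; the function name `parse_row` and the inline sample are illustrative and not part of the dataset's tooling.

```python
import csv
import io
import re

# Matches "Lastname, Firstname [0000-0000-0000-000X]"; the name part may be empty.
VALUE_RE = re.compile(
    r"^(?P<name>.*?)\s*\[(?P<orcid>\d{4}-\d{4}-\d{4}-\d{3}[\dX])\]$"
)


def parse_row(row):
    """Return (doi, name, orcid) for one CSV row, or None if `value` is malformed."""
    match = VALUE_RE.match(row["value"].strip())
    if match is None:
        return None
    # The id column may hold the literal string "None" when no DOI is linked.
    # DOIs are case-insensitive, so they are lowercased for consistent lookups.
    doi = None if row["id"] == "None" else row["id"].lower()
    return doi, match.group("name"), match.group("orcid")


# Inline sample standing in for one of the CSV files (header row is assumed).
sample = io.StringIO(
    "id,value\n"
    '10.1162/qss_a_00292,"Massari, Arcangelo [0000-0002-8420-0696]"\n'
)
records = [parse_row(row) for row in csv.DictReader(sample)]
print(records)
```

Because the name itself contains a comma, the `value` field has to be quoted in the CSV; `csv.DictReader` handles that quoting, which a plain `str.split(",")` would not.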
Latest version
The most recent version of this dataset is based on the ORCID Public Data File 2025.
Files
| Checksum | Size |
|---|---|
| md5:9c850d53ef7ca57a3b3a1dd1acc815a2 | 1.2 GB |
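The MD5 checksum listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib`; the helper names and the file path in the usage note are hypothetical, since the archive's file name is not shown here.

```python
import hashlib

# Checksum published with the file on this record.
EXPECTED_MD5 = "9c850d53ef7ca57a3b3a1dd1acc815a2"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for a file of about 1.2 GB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_download("path/to/downloaded_file")` returns `True` only when the local copy matches the published checksum.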
Additional details
Related works
- Is compiled by
  - Software: 10.5281/zenodo.6620267 (DOI)
- Is source of
  - Dataset: 10.5281/zenodo.15625650 (DOI)
- Is variant form of
  - Dataset: 10.23640/07243.30375589 (DOI)
References
- Massari, Arcangelo; Mariani, Fabio; Heibi, Ivan; Peroni, Silvio; Shotton, David (2024). OpenCitations Meta. Quantitative Science Studies, 5 (1): 50–75. https://doi.org/10.1162/qss_a_00292
- Romanov, Andrej; Montenegro, Angel; Westwood, Giles (2024). ORCID Public Data File 2024. ORCID. Dataset. https://doi.org/10.23640/07243.27151305.v1