Published July 14, 2025 | Version 1.0
Dataset
Open
COMET arXiv preprint matching results
Authors/Creators
Description
Overview
This dataset contains 738,474 matched records linking arXiv preprints to their published counterparts. It is an output of the COMET (Collaborative Metadata) initiative, produced with the matching strategy developed during COMET's pilot phase.
Data Structure
Each record contains the following fields:
- input_doi: The DOI of the arXiv preprint (format: 10.48550/arxiv.XXXX.XXXXX)
- matched_doi: The DOI of the published work in Crossref that corresponds to the preprint
- confidence: A confidence score from 0 to 1 indicating the reliability of the match
- matched_doi_type: The type of the matched publication (journal-article, proceedings-article, book-chapter, or report)
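The field list above can be sketched as a small typed record with basic validation. This is only an illustration: the names `MatchRecord`, `parse_record`, `ARXIV_DOI_PREFIX`, and `KNOWN_TYPES` are hypothetical, and the sketch assumes every value arrives as a string keyed by the field names listed above.

```python
from dataclasses import dataclass

# Assumptions taken from the field descriptions above, not from a published schema.
ARXIV_DOI_PREFIX = "10.48550/arxiv."
KNOWN_TYPES = {"journal-article", "proceedings-article", "book-chapter", "report"}


@dataclass(frozen=True)
class MatchRecord:
    input_doi: str         # DOI of the arXiv preprint
    matched_doi: str       # DOI of the published work in Crossref
    confidence: float      # reliability of the match, from 0 to 1
    matched_doi_type: str  # type of the matched publication


def parse_record(row: dict) -> MatchRecord:
    """Convert one raw row (string values) into a validated MatchRecord."""
    record = MatchRecord(
        input_doi=row["input_doi"].strip(),
        matched_doi=row["matched_doi"].strip(),
        confidence=float(row["confidence"]),
        matched_doi_type=row["matched_doi_type"].strip(),
    )
    # DOIs are case-insensitive, so compare the prefix in lowercase.
    if not record.input_doi.lower().startswith(ARXIV_DOI_PREFIX):
        raise ValueError(f"not an arXiv DOI: {record.input_doi!r}")
    if not 0.0 <= record.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {record.confidence}")
    if record.matched_doi_type not in KNOWN_TYPES:
        raise ValueError(f"unexpected type: {record.matched_doi_type!r}")
    return record
```

Rejecting rows that fall outside the documented ranges is a design choice for this sketch; a more lenient reader might log and skip them instead.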
File Formats
The dataset is available in two formats:
- CSV: 20250615_arxiv_preprint_matching_results.csv
- JSON: 20250615_arxiv_preprint_matching_results.json
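As a rough illustration of working with the CSV file, the sketch below counts matches per publication type above a confidence threshold. It assumes the CSV has a header row whose column names equal the field names listed under Data Structure; the function name `count_types` and the default threshold of 0.9 are arbitrary choices for this example.

```python
import csv
from collections import Counter
from typing import Iterable


def count_types(lines: Iterable[str], min_confidence: float = 0.9) -> Counter:
    """Count matched_doi_type values among rows at or above min_confidence.

    `lines` is any iterable of CSV text lines, such as an open file.
    Assumes a header row with `confidence` and `matched_doi_type` columns.
    """
    counts: Counter = Counter()
    for row in csv.DictReader(lines):
        if float(row["confidence"]) >= min_confidence:
            counts[row["matched_doi_type"]] += 1
    return counts
```

A possible call, assuming the file has been downloaded to the working directory:

```python
with open("20250615_arxiv_preprint_matching_results.csv", newline="", encoding="utf-8") as f:
    print(count_types(f, min_confidence=0.9))
```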
Files
(178.6 MB)
| Checksum | Size |
|---|---|
| md5:dc3163f195ef92d7920a78d82704ac33 | 53.4 MB |
| md5:f7e0243ea08a49238e9e181fb8cb93fe | 125.2 MB |
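The MD5 checksums in the file listing can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` are hypothetical. The listing as shown here does not say which checksum belongs to which file, so the caller supplies the pairing shown on the repository page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str) -> bool:
    """Compare a file's MD5 with an expected value such as 'md5:dc31...'.

    An optional 'md5:' prefix on `expected` is ignored.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

MD5 is suitable here only as an integrity check against accidental corruption, not as protection against deliberate tampering.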
Additional details
Related works
- References
- Other: 10.71707/yj21-5d60 (DOI)
Dates
- Created: 2025-06-15