Crossref funder names to ROR IDs

Portenoy, Jason

doi:10.13003/zmkagc4i

Published April 22, 2026 | Version v1

Dataset Open


Portenoy, Jason¹

1. Crossref

This dataset contains funder names from the metadata of scholarly works, matched with ROR IDs for the funding organizations. It is a sample of funder name strings from the Crossref metadata, manually labeled with the correct ROR ID(s). Each funder name can be matched to zero, one, or multiple ROR IDs.
The funder names were extracted from a July 2025 snapshot of the Crossref works data. There are 25,698,253 funder entries with names across 12,433,534 different works. (A single work can have more than one funder entry.) There are 3,004,870 unique names among these. There is skew in this data—some names occur much more often than others. This dataset comprises a weighted sample of funder names, with each weight representing the count of entries with that name that do not already have a funder ID asserted in the Crossref metadata.
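As a rough illustration of how such weights could be derived, the sketch below counts funder entries per name and, separately, the entries that carry no funder ID. It assumes each work is a dict with a "funder" list whose entries have a "name" and an optional "DOI" field holding the funder ID, and that names are compared as exact strings; the sample records and the function name compute_weights are made up for illustration and are not taken from the dataset or from any Crossref tooling.

```python
from collections import Counter


def compute_weights(works):
    """Return (total, without_id) counters keyed by funder name.

    total[name]      -> number of funder entries with that name
    without_id[name] -> number of those entries that have no funder ID
    """
    total = Counter()
    without_id = Counter()
    for work in works:
        # Assumption: funder entries live under a "funder" key.
        for entry in work.get("funder", []):
            name = entry.get("name")
            if not name:
                continue  # only entries with a name are counted
            total[name] += 1
            # Assumption: an asserted funder ID appears in a "DOI" field.
            if not entry.get("DOI"):
                without_id[name] += 1
    return total, without_id


# Made-up records for illustration only.
sample_works = [
    {"funder": [
        {"name": "Example Science Foundation", "DOI": "10.13039/EXAMPLE"},
        {"name": "Example Health Council"},
    ]},
    {"funder": [{"name": "Example Science Foundation"}]},
    {},  # a work with no funder entries
]

total, without_id = compute_weights(sample_works)
```

In this toy input, "Example Science Foundation" occurs twice but only one of those entries lacks an ID, so its weight under this scheme would be 1.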
A human evaluator manually matched all funder names to ROR IDs using ROR's online search, with ROR data up to April 2026. Active and inactive ROR records were considered; withdrawn ROR records were not.
Some funder names also have "alternate" matches, to handle cases where funder strings might be ambiguous even for a human evaluator, or a matching strategy might identify a parent organization rather than the direct target—which may be acceptable depending on the use case. For funder names with alternate matches, the dataset includes mappings between those alternate IDs (or "no_match") and the primary matched ID. This enables "relaxed" evaluation of matching methods that does not penalize these ambiguous cases. See the documentation for the crossref-matcher library for more information.
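One possible reading of "relaxed" evaluation is sketched below: before comparing a prediction with the labeled IDs, any predicted alternate ID (or "no_match") is mapped to its primary ID. This is a minimal sketch under stated assumptions, not the crossref-matcher implementation: it assumes each alternate pair is ordered (alternate, primary), scores an item as correct only when the predicted set equals the labeled set, and uses placeholder IDs such as ROR_CHILD that are not real ROR IDs.

```python
NO_MATCH = "no_match"


def as_id_set(ids):
    """Represent an empty list of IDs as the single token "no_match"."""
    return set(ids) or {NO_MATCH}


def is_correct(predicted, gold, alternates=(), relaxed=False):
    """Exact set comparison, optionally mapping alternates to primaries."""
    pred = as_id_set(predicted)
    if relaxed:
        # Assumption: each pair is ordered (alternate, primary).
        to_primary = {alt: primary for alt, primary in alternates}
        pred = {to_primary.get(p, p) for p in pred}
    return pred == as_id_set(gold)


def weighted_accuracy(items, predictions, relaxed=False):
    """Share of total weight carried by correctly matched items."""
    total = sum(item["weight"] for item in items)
    hit = sum(
        item["weight"]
        for item, pred in zip(items, predictions)
        if is_correct(pred, item["output"], item.get("alternates", ()), relaxed)
    )
    return hit / total if total else 0.0


# Placeholder items and predictions for illustration only.
items = [
    {"input": "Example Institute, Dept. of Physics",
     "output": ["ROR_CHILD"],
     "alternates": [["ROR_PARENT", "ROR_CHILD"]],
     "weight": 3},
    {"input": "Unidentifiable Fund", "output": [], "alternates": [], "weight": 1},
]
predictions = [["ROR_PARENT"], []]

strict = weighted_accuracy(items, predictions)
relaxed = weighted_accuracy(items, predictions, relaxed=True)
```

Here the strict score is 0.25, because predicting the parent organization counts as a miss, while the relaxed score is 1.0, because the parent is listed as an acceptable alternate.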
The dataset contains:
3,505 unique funder name strings
Total weight: 2,138,538
1,895 (54%) of the names have at least one ROR ID match
151 (4.3%) of the names have at least one “alternate” match
The dataset is provided in two formats: a single JSON-lines (.jsonl) file, and two CSV files.
The JSON-lines file funders-crossref-weighted-with-alternates-2025-07-05.jsonl contains one JSON object per line, with the fields:
seq_no (int): zero-based sequence number (index) of the item
input (string): the funder name from Crossref’s metadata
output (list of strings): matched ROR ID(s) for this item, or an empty list if no match exists
alternates (list of [string, string] pairs): alternate matches for this item; each pair links an alternate ID (or "no_match") to a primary matched ID (see above)
weight (number): number of occurrences of this funder name in the Crossref data submitted without an ID
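A minimal sketch of reading this format with the Python standard library follows. It relies only on the field names listed above; the helper names parse_jsonl and summarize and the sample lines are illustrative, and the placeholder ID ROR_A is not a real ROR ID.

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of JSON-lines text, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(items):
    """Basic counts over the parsed items."""
    return {
        "names": len(items),
        "total_weight": sum(item["weight"] for item in items),
        "with_match": sum(1 for item in items if item["output"]),
        "with_alternates": sum(1 for item in items if item.get("alternates")),
    }


# Made-up lines in the documented shape, for illustration only.
sample_lines = [
    '{"seq_no": 0, "input": "Example Science Foundation", '
    '"output": ["ROR_A"], "alternates": [], "weight": 12}',
    "",
    '{"seq_no": 1, "input": "Unidentifiable Fund", '
    '"output": [], "alternates": [], "weight": 3}',
]

items = parse_jsonl(sample_lines)
summary = summarize(items)
```

To read the real file, pass an open file handle instead of the sample list, for example parse_jsonl(open("funders-crossref-weighted-with-alternates-2025-07-05.jsonl", encoding="utf-8")); the resulting summary can then be compared with the counts quoted in the description.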
The two CSV files provide the same data in a tabular format:
funder_matches.csv contains the primary funder name matches, with one row per unique funder name string:
name (string): the funder name from Crossref's metadata
num_occurrences (int): the count of funder entries with this name in Crossref data (including those submitted with an ID)
weight (number): number of occurrences of this funder name in the Crossref data submitted without an ID
matched_id (string): the matched ROR ID(s) for this funder name, or "no_match" if no match exists. If there are multiple matched IDs, they are semicolon-separated.
funder_alternate_matches.csv contains the alternate match mappings for funder names that have them:
name (string): the funder name
relaxed_match_id (string): an alternate ROR ID, or "no_match", that could be considered a valid match for this funder name
map_to_id (string): the primary matched ROR ID for this funder name (from funder_matches.csv), or "no_match"
The 151 funder names with alternate matches each have one or more rows in funder_alternate_matches.csv.
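The two CSV files could be read along these lines. The sketch assumes both files are comma-delimited with a header row using exactly the column names listed above; the helper names and the sample rows (including placeholder IDs such as ROR_A) are illustrative only.

```python
import csv
import io


def parse_matched_id(value):
    """Split a semicolon-separated matched_id; "no_match" becomes []."""
    value = value.strip()
    if not value or value == "no_match":
        return []
    return [part.strip() for part in value.split(";") if part.strip()]


def load_matches(fh):
    """Map each funder name to its counts and list of matched IDs."""
    return {
        row["name"]: {
            "num_occurrences": int(row["num_occurrences"]),
            "weight": float(row["weight"]),
            "ids": parse_matched_id(row["matched_id"]),
        }
        for row in csv.DictReader(fh)
    }


def load_alternates(fh):
    """Map each funder name to its (relaxed_match_id, map_to_id) pairs."""
    alternates = {}
    for row in csv.DictReader(fh):
        pair = (row["relaxed_match_id"], row["map_to_id"])
        alternates.setdefault(row["name"], []).append(pair)
    return alternates


# Made-up rows in the documented layout, for illustration only.
MATCHES_CSV = (
    "name,num_occurrences,weight,matched_id\n"
    "Example Science Foundation,5,3,ROR_A;ROR_B\n"
    "Unidentifiable Fund,2,2,no_match\n"
)
ALTERNATES_CSV = (
    "name,relaxed_match_id,map_to_id\n"
    "Example Science Foundation,ROR_PARENT,ROR_A\n"
)

matches = load_matches(io.StringIO(MATCHES_CSV))
alternates = load_alternates(io.StringIO(ALTERNATES_CSV))
```

For the real files, open them with open(path, newline="", encoding="utf-8") and pass the handles in place of the io.StringIO objects.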

Files

Files (765.8 kB)

Name	MD5	Size
funder_alternate_matches.csv	c8ea7ed3be37568df1ccb25e67d07c21	15.3 kB
funder_matches.csv	2a20e90af900b41001d69a71026b85db	255.1 kB
funders-crossref-weighted-with-alternates-2025-07-05.jsonl	dd5c986e78bb3cdf49a47673e4db81d8	495.4 kB

