Comparing the Use of Research Resource Identifiers and Natural Language Processing for Citation of Databases, Software and Other Digital Artifacts
Authors/Creators
- 1. UC San Diego
- 2. SciCrunch Inc.
- 3. Hypothes.is
Description
The Research Resource Identifier (RRID) was introduced in biomedicine in 2014 to identify more precisely the reagents and tools used in published biomedical research and to track the use of those tools across the biomedical literature. The current RRID specification covers key biological and digital resources. Authors are instructed to include an RRID after the first mention of any resource used. RRIDs are designed to be easy to find with a full-text search engine.
The published data sets were used in our comparative study, which compares the output of our RRID curation workflow with the outputs of automated text-mining systems that have been used to identify mentions of resources in the text of publications. All files are in tab-separated values (TSV) format.
Scibot.tsv: Records of the RRID curation workflow using SciBot.
Each record shows that the resource identified by RRID was found in the paper identified by PMID, together with up to two optional curator tags (Tag1, Tag2).
PMID: PubMed ID
RRID: Research Resource Identifier
Tag1: Curator tags (optional)
Tag2: Additional curator tags (optional)
rdwsorted.tsv: Records of the output from RDW, a text-mining tool.
RDW identifies mentions of research resources in papers. Each record shows that the resource identified by RRID was found in the paper identified by PMID.
PMID: PubMed ID
RRID: Research Resource Identifier
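Because Scibot.tsv and rdwsorted.tsv both describe (PMID, RRID) pairs, one way to compare the curated and text-mined outputs is to treat each file as a set of pairs and look at the overlap. The following is a minimal sketch, not the method used in the study. It assumes each file starts with a header row containing columns named PMID and RRID, which should be checked against the actual files; the function names `load_pairs` and `compare` are hypothetical.

```python
import csv


def load_pairs(lines):
    """Return the set of (PMID, RRID) pairs from tab-separated lines.

    `lines` may be an open text file or any iterable of strings.
    Assumption: the first line is a header naming PMID and RRID columns.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return {(row["PMID"].strip(), row["RRID"].strip()) for row in reader}


def compare(curated, mined):
    """Count pairs found by both sources and by only one of them."""
    return {
        "shared": len(curated & mined),
        "curated_only": len(curated - mined),
        "mined_only": len(mined - curated),
    }
```

To use it, open each file with `open(path, newline="", encoding="utf-8")`, pass the file objects to `load_pairs`, and pass the two resulting sets to `compare`. Extra columns such as Tag1 and Tag2 are ignored.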
rridbyrdw05282019.tsv: Records of the output of RRID-by-RDW, a component of RDW.
RRID-by-RDW identifies mentions of research resources in papers by matching the patterns defined in the RRID specification. Each record shows that the resource identified by RRID was found in the paper identified by PMID.
PMID: PubMed ID
RRID: Research Resource Identifier
Context: Snippet where the RRID was found
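The pattern-matching idea behind RRID-by-RDW can be illustrated with a regular expression that finds RRID-like strings and keeps a short snippet of surrounding text, similar in spirit to the Context column. This is a simplified sketch and not the RDW implementation. The pattern, the `find_rrids` name, and the 30-character window are assumptions, and the pattern covers only the common `RRID:PREFIX_identifier` shape rather than the full specification.

```python
import re

# Assumed, simplified pattern: "RRID:" followed by a letter prefix,
# an underscore, and an identifier of letters, digits, "_", ":" or "-".
RRID_PATTERN = re.compile(r"RRID:\s?([A-Za-z]+_[A-Za-z0-9_:\-]+)")


def find_rrids(text, window=30):
    """Return (rrid, context) tuples for each RRID-like match in text.

    `context` is the match plus up to `window` characters on each side.
    """
    hits = []
    for match in RRID_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        hits.append(("RRID:" + match.group(1), text[start:end]))
    return hits
```

A pattern like this only finds resources that authors cited with an explicit RRID; mentions by name alone need other text-mining methods.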
resource_metadata20190418.tsv: Metadata of RRIDs
This file contains metadata of resources and their RRIDs. See file header for column definitions.
RRIDCUR-definitions.tsv: Definitions of curator tags used in Scibot.tsv.
tag: Tag name
definition: Definition of the tag
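The curator tags in Scibot.tsv can be summarized by counting how often each one appears and looking up its meaning in RRIDCUR-definitions.tsv. The sketch below assumes both files have header rows with the column names listed above (Tag1 and Tag2 in Scibot.tsv; tag and definition in RRIDCUR-definitions.tsv). The function names are hypothetical.

```python
import csv
from collections import Counter


def load_definitions(lines):
    """Map each tag name to its definition (columns: tag, definition)."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["tag"]: row["definition"] for row in reader}


def count_tags(lines):
    """Count non-empty values in the optional Tag1 and Tag2 columns."""
    counts = Counter()
    for row in csv.DictReader(lines, delimiter="\t"):
        for column in ("Tag1", "Tag2"):
            # Missing or empty cells are skipped because tags are optional.
            tag = (row.get(column) or "").strip()
            if tag:
                counts[tag] += 1
    return counts


def describe(counts, definitions):
    """List (tag, count, definition) tuples, most frequent tag first."""
    return [
        (tag, n, definitions.get(tag, "(no definition found)"))
        for tag, n in counts.most_common()
    ]
```

Both loaders accept an open text file or any iterable of lines, so they can be tried on a few sample rows before running on the full files.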
Files
(43.2 MB)
Additional details
References
- 10.1109/MCSE.2019.2952838
- https://escholarship.org/content/qt9kh7h8zr/qt9kh7h8zr.pdf
- Comparing the Use of Research Resource Identifiers and Natural Language Processing for Citation of Databases, Software, and Other Digital Artifacts.