Published March 5, 2021 | Version v1
Dataset
Open
CORD-19 Software Mentions
Description
To automate the identification and analysis of software use in biomedical research, we developed a SciBERT-based machine learning model that extracts mentions of software from scientific articles. The model takes the full text of a scientific article as input and outputs a list of the software mentioned in it. We applied this model to the CORD-19 full-text articles and stored the output in this dataset, which includes metadata for over 77,000 COVID-19 and coronavirus-related papers and a list of the software tools mentioned in each.
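
This record does not list the column names of CORD19_software_mentions.csv, so a cautious first step is to look at the header and a few rows before relying on any particular schema. The following is a minimal sketch under that assumption; the helper name `preview_csv` is hypothetical, and UTF-8 encoding is assumed rather than documented.

```python
import csv
from itertools import islice


def preview_csv(path, n_rows=3, encoding="utf-8"):
    """Return (header, rows): the header line and up to n_rows data rows.

    No column names are assumed; the encoding default is an assumption.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        rows = list(islice(reader, n_rows))
    return header, rows
```

Calling `preview_csv("CORD19_software_mentions.csv")` on a downloaded copy returns the actual column names, which can then guide any further loading or filtering.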
Files
| Name | Size |
|---|---|
| CORD19_software_mentions.csv (md5:41f7c5dca5abc6fc97e4c54f116c227b) | 31.9 MB |
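
The MD5 checksum listed for the file can be used to confirm that a download is complete and unmodified. The sketch below shows one way to do this with Python's standard `hashlib` module; the function names `md5_of_file` and `matches_published_md5` are illustrative, and the local file path is whatever location the file was saved to.

```python
import hashlib

# Checksum published on this record for CORD19_software_mentions.csv.
EXPECTED_MD5 = "41f7c5dca5abc6fc97e4c54f116c227b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_published_md5("CORD19_software_mentions.csv")` should return `True` for an intact copy of the file. Reading in chunks keeps memory use small, although at 31.9 MB the file would also fit in memory in one read.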