Dataset Open Access
Update: As of March 27, 2020, we have analyzed 31,527 distinct sources (articles and preprints) from the most recent CORD-19 data (https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/version/4). We're releasing citation tallies for these sources (covid-source-tallies 32720.csv). We're also releasing citation statements and classifications from these documents for open articles, which covers 1,682,216 of the 1,779,024 citation statements extracted.

As of March 20, 2020, we had analyzed 20,268 of the 21,792 DOIs available in the CORD-19 data set. We found citations to 16,775 of these documents, and the classifications for these citations are included in covid-source-tallies.csv. covid-citations.csv includes all citations we have from the ~20k documents we have processed. This file is truncated so that it includes only openly available documents. The tallies, however, are not limited in this way: they are the full set relating to all source documents scite has processed.
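To put the counts above in proportion, the following is a minimal sketch that turns them into coverage percentages. The numbers are copied from this description; the helper name `coverage` and the variable names are illustrative only and are not part of the released files.

```python
def coverage(part: int, total: int) -> float:
    """Return part/total as a percentage (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Counts taken from the description above.
open_statements = coverage(1_682_216, 1_779_024)  # released vs. extracted citation statements
dois_analyzed = coverage(20_268, 21_792)          # DOIs analyzed as of March 20, 2020
docs_cited = coverage(16_775, 20_268)             # analyzed documents with at least one citation found

for label, pct in [
    ("Citation statements released (open articles)", open_statements),
    ("CORD-19 DOIs analyzed (March 20)", dois_analyzed),
    ("Analyzed documents with citations found", docs_cited),
]:
    print(f"{label}: {pct:.1f}%")
```

Roughly 95% of the extracted citation statements are in the open release, so the restriction to openly available documents removes only a small share of the data.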