Dataset Open Access

CONCIERGE-CM-UC3M/COVID19-gender-gap

Iñaki Ucar; Margarita Torre; Antonio Elías Fernández

This dataset contains records of preprint submissions during the COVID-19 global lockdowns in 2020, as well as for the same period in the 3 previous years. It amounts to a total of 502,762 research articles deposited in 5 major preprint repositories (arXiv, medRxiv, bioRxiv, PsyArXiv and SocArXiv) during the months of January to May from 2017 to 2020. Author information is supplemented with gender identification. The dataset comprises 4 CSV files:

  • Articles: links each article ID (the URL) to the source repository and date of publication.
  • Authors: links each author with an article ID, and includes additional information such as position, rank and gender.
  • Categories: links each article ID to one or more categories/subcategories.
  • Text: provides the title and abstract for each article ID.
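
Since all four files are keyed by article ID, they can be combined with a simple lookup join. The following is a minimal sketch, not the authors' own processing code. The column names `id`, `repository` and `gender` are assumptions made for illustration (the actual CSV headers are not listed here), and the sample rows are invented stand-ins for articles.csv and authors.csv.

```python
import csv
import io
from collections import Counter


def gender_counts_by_repository(articles_file, authors_file,
                                id_col="id", repo_col="repository",
                                gender_col="gender"):
    """Count author rows per (repository, gender) pair.

    The column names are hypothetical defaults; pass the real header
    names once they are known.
    """
    # Map each article ID to its source repository.
    repo_of = {row[id_col]: row[repo_col]
               for row in csv.DictReader(articles_file)}

    counts = Counter()
    for row in csv.DictReader(authors_file):
        repo = repo_of.get(row[id_col])
        if repo is not None:  # skip authors whose article ID is unknown
            counts[(repo, row[gender_col])] += 1
    return counts


# Invented in-memory samples standing in for articles.csv and authors.csv.
articles_sample = io.StringIO(
    "id,repository,date\n"
    "https://example.org/a1,arXiv,2020-03-01\n"
    "https://example.org/a2,bioRxiv,2020-04-15\n"
)
authors_sample = io.StringIO(
    "id,position,gender\n"
    "https://example.org/a1,1,female\n"
    "https://example.org/a1,2,male\n"
    "https://example.org/a2,1,female\n"
)

counts = gender_counts_by_repository(articles_sample, authors_sample)
for (repo, gender), n in sorted(counts.items()):
    print(repo, gender, n)
```

With the real files, the two `io.StringIO` objects would be replaced by handles from `open("articles.csv", newline="", encoding="utf-8")` and the equivalent call for authors.csv, assuming the files are UTF-8 encoded.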
This work has been supported by the Madrid Government (Comunidad de Madrid) under the Multiannual Agreement with UC3M in the line of "Fostering Young Doctors Research" (CONCIERGE-CM-UC3M), and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation).
Files (525.2 MB)

  • articles.csv (15.9 MB), md5:1c3b3ce7e4d6dffbe29815bd5d58d40e
  • authors.csv (128.2 MB), md5:09d012ce62d35a27cb283f44137ad504
  • categories.csv (40.4 MB), md5:9a78e0d61d35344f43b4a9d3805cee1e
  • text.csv (340.5 MB), md5:5fd645b16a1cc661062d4abf67a4585c
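
The MD5 checksums listed for each file can be used to confirm that a download is complete and uncorrupted. A minimal sketch using Python's standard `hashlib` module follows. The helper names `md5_of_file` and `verify_download` are illustrative, and the local file paths are assumed to be wherever the files were saved.

```python
import hashlib

# Checksums copied from the file listing for this record.
EXPECTED_MD5 = {
    "articles.csv": "1c3b3ce7e4d6dffbe29815bd5d58d40e",
    "authors.csv": "09d012ce62d35a27cb283f44137ad504",
    "categories.csv": "9a78e0d61d35344f43b4a9d3805cee1e",
    "text.csv": "5fd645b16a1cc661062d4abf67a4585c",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, name):
    """Check a local file against the checksum listed under `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]


# Example usage (assumes the file sits in the current directory):
# print(verify_download("articles.csv", "articles.csv"))
```

Reading in chunks keeps memory use flat, which matters for the larger files such as text.csv.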