PreprintMatch: a tool for preprint publication detection applied to analyze global inequities in scientific publishing
Creators
- 1. Department of Computer Science and Engineering, UC San Diego, La Jolla, CA, United States
- 2. Department of Neuroscience, UC San Diego, La Jolla, CA, United States
Description
Dataset underlying the paper "PreprintMatch: a tool for preprint publication detection applied to analyze global inequities in scientific publishing." The file preprint-paper-matches.csv lists all matches found by our algorithm between bioRxiv/medRxiv and PubMed, and preprint_affiliations.csv lists all affiliations extracted from bioRxiv/medRxiv. All preprint data came from the Rxivist data dump (https://zenodo.org/record/4738007), and the scripts to download PubMed data are available in our GitHub repository, https://github.com/PeterEckmann1/preprint-match.
The full database dump, with all data used in the study, is available on Google Drive at https://drive.google.com/file/d/1ZoafhYUP-DO4Hd_4A_v7mbQLjN3JPzJv/view?usp=sharing. The PostgreSQL database can be restored using the pg_restore command.
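As a rough illustration of the restore step, the sketch below builds a pg_restore invocation from Python. It rests on several assumptions that are not stated in this record: the downloaded dump is in an archive format that pg_restore accepts, the target database already exists (for example, created beforehand with createdb), and the file name preprintmatch.dump and database name preprintmatch are hypothetical placeholders. It defaults to a dry run that only prints the command.

```python
import shlex
import subprocess


def build_restore_command(dump_path, dbname):
    """Return the pg_restore argument list for restoring dump_path into dbname.

    --no-owner skips restoring the original object ownership, which is
    usually what you want when the original database roles do not exist locally.
    """
    return ["pg_restore", "--no-owner", "--dbname", dbname, dump_path]


def restore(dump_path, dbname, dry_run=True):
    """Print the command (dry run) or actually run pg_restore."""
    cmd = build_restore_command(dump_path, dbname)
    if dry_run:
        # Show the command instead of executing it.
        print(shlex.join(cmd))
    else:
        # Raises CalledProcessError if pg_restore exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical file and database names; replace with your own.
    restore("preprintmatch.dump", "preprintmatch", dry_run=True)
```

Passing dry_run=False runs the command for real, which requires the PostgreSQL client tools to be installed and a reachable server.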
Files
(35.9 MB)
Checksum | Size |
---|---|
md5:7bd5eb8c3b1125ee0ff368de09145e1d | 5.5 MB |
md5:11c2183841c17207d8b98a2201ce9376 | 30.4 MB |
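
The md5 values listed for the files can be used to check that a download is intact. The following is a minimal sketch using Python's standard hashlib module. The helper names md5_of_file and matches_checksum are illustrative, and because this listing does not say which checksum belongs to which file, the expected value is passed in by the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:abc...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum("preprint_affiliations.csv", "md5:...") returns True only if the local file's digest equals the value copied from the file listing.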