There is a newer version of the record available.

Published February 13, 2019 | Version 2019-02-13
Dataset Open

Complete Rxivist dataset of scraped bioRxiv data

  • 1. University of Minnesota

Description

rxivist.org allows readers to sort and filter the tens of thousands of preprints posted to bioRxiv. Rxivist uses a custom web crawler to index all papers on biorxiv.org; this is a snapshot of the Rxivist production database. The version number indicates the date on which the snapshot was taken. See the included "README.md" file for instructions on how to use the "rxivist.backup" file to import data into a PostgreSQL database server.
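As a rough illustration of the import step, the sketch below builds a pg_restore command for the "rxivist.backup" file. It is not taken from the README, which remains the authoritative source. It assumes the backup is in a format pg_restore can read, that the PostgreSQL client tools are installed, and that an empty target database already exists. The database name "rxivist", the host, and the user are hypothetical placeholders, as are the function names.

```python
import shutil
import subprocess


def build_restore_command(backup_path, dbname, host="localhost", user="postgres"):
    """Return the pg_restore argument list for loading a dump into an existing database.

    The host, user, and database name are assumptions for illustration only.
    """
    return [
        "pg_restore",
        f"--host={host}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--no-owner",  # skip restoring the original role ownership
        backup_path,
    ]


def restore(backup_path, dbname, **connection):
    """Run pg_restore, raising an error if the tool is missing or the restore fails."""
    if shutil.which("pg_restore") is None:
        raise RuntimeError("pg_restore was not found on PATH")
    subprocess.run(build_restore_command(backup_path, dbname, **connection), check=True)
```

Calling restore("rxivist.backup", "rxivist") would then attempt the import against a local server; adjust the connection values to match your own setup.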

Please note this is a different repository than the one used for the Rxivist manuscript, which is in a separate Zenodo repository. You're welcome (and encouraged!) to use this data in your research, but please cite our paper, available on bioRxiv.

Going forward, this information will also be provided pre-loaded in Docker images, available at blekhmanlab/rxivist_data.

Version notes:

  • 2019-02-13
    • The redundant "paper" schema has been removed.
    • bioRxiv has begun making the full text of preprints available online. Beginning with this version, a new table ("fulltext") is available that contains the text of preprints that have already been processed. The format in which this information is stored may change in the future; any changes will be noted here.
    • This is the first version that has a corresponding Docker image.

Files (96.8 MB)

  • README.md: 7.4 kB, md5:4a3c01657a9b1f62025387751129fa38
  • rxivist.backup: 96.8 MB, md5:ffc2109e54fb347d7cd78e0a4f119d59
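The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard hashlib module. The helper names are hypothetical, and pairing the second checksum with "rxivist.backup" assumes that the 96.8 MB entry is the backup file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, matches_checksum("rxivist.backup", "md5:ffc2109e54fb347d7cd78e0a4f119d59") should return True for an uncorrupted copy, under the assumption stated above.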

Additional details

Related works

Is referenced by
10.1101/515643 (DOI)
Is supplement to
10.5281/zenodo.2465688 (DOI)