Published August 31, 2021 | Version v1
Dataset Open

FDup deduplication software data benchmark: 10Mi OpenAIRE Publications Dump

  • 1. Institute of Information Science and Technologies, CNR

Description

This dataset is a random subset of publications extracted from the OpenAIRE Research Graph (http://doi.org/10.5281/zenodo.4707307). It contains ~10Mi publication records in JSON format.

The file is a zip archive containing gz files, each with one JSON record per line. Each record complies with the schema available at http://doi.org/10.5281/zenodo.4723403.
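
The layout described above (a zip archive of gz files, one JSON record per line) can be read without unpacking everything to disk. The following is a minimal sketch, not an official loader: the function name `iter_records` is hypothetical, and it assumes that the gz members have names ending in `.gz` and that the records are UTF-8 encoded, neither of which is stated in this record.

```python
import gzip
import json
import zipfile
from typing import Any, Dict, Iterator


def iter_records(zip_path: str) -> Iterator[Dict[str, Any]]:
    """Yield parsed JSON records, one at a time, from every .gz member of the zip.

    Assumptions (not confirmed by the dataset description): member names
    end in ".gz" and the text is UTF-8 encoded.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".gz"):
                continue  # skip directories and any non-gz entries
            # Stream the member: decompress gzip on the fly, line by line.
            with archive.open(name) as member:
                with gzip.open(member, mode="rt", encoding="utf-8") as lines:
                    for line in lines:
                        line = line.strip()
                        if line:  # ignore blank lines
                            yield json.loads(line)
```

Because the function is a generator, a call such as `next(iter_records("publications_dump_10Mi.zip"))` would parse only the first record, which is useful for inspecting the fields before processing the full 10.5 GB archive.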

Learn more about the OpenAIRE Research Graph at https://graph.openaire.eu.

Files

publications_dump_10Mi.zip

Files (10.5 GB)

Name: publications_dump_10Mi.zip
Size: 10.5 GB
md5: 48a6bf9959b859828d3a0377affc4b9f
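
The md5 checksum listed above can be used to check that a download completed intact. The sketch below is one way to do this with the Python standard library; the function name `md5_of_file` and the chunk size are illustrative choices, and the local path is assumed to match the file name in the listing.

```python
import hashlib

# Checksum copied from the file listing on this page.
EXPECTED_MD5 = "48a6bf9959b859828d3a0377affc4b9f"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so the 10.5 GB archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

After downloading, `md5_of_file("publications_dump_10Mi.zip") == EXPECTED_MD5` should evaluate to `True`; a mismatch would suggest a truncated or corrupted download.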
