There is a newer version of this record available.

Dataset Open Access

Data from: The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles

Piwowar, Heather; Priem, Jason; Larivière, Vincent; Alperin, Juan Pablo; Matthias, Lisa; Norlander, Bree; Farley, Ashley; West, Jevin; Haustein, Stefanie

This is the raw data behind the publication: 

The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles.

Despite growing interest in Open Access (OA) to scholarly literature, there is an unmet need for large-scale, up-to-date, and reproducible studies assessing the prevalence and characteristics of OA. We address this need using oaDOI, an open online service that determines OA status for 67 million articles. We use three samples, each of 100,000 articles, to investigate OA in three populations: 1) all journal articles assigned a Crossref DOI, 2) recent journal articles indexed in Web of Science, and 3) articles viewed by users of Unpaywall, an open-source browser extension that lets users find OA articles using oaDOI. We estimate that at least 28% of the scholarly literature is OA (19M in total) and that this proportion is growing, driven particularly by growth in Gold and Hybrid. The most recent year analyzed (2015) also has the highest percentage of OA (45%). Because of this growth, and the fact that readers disproportionately access newer articles, we find that Unpaywall users encounter OA quite frequently: 47% of articles they view are OA. Notably, the most common mechanism for OA is not Gold, Green, or Hybrid OA, but rather an under-discussed category we dub Bronze: articles made free-to-read on the publisher website, without an explicit Open license. We also examine the citation impact of OA articles, corroborating the so-called open-access citation advantage: accounting for age and discipline, OA articles receive 18% more citations than average, an effect driven primarily by Green and Hybrid OA. We encourage further research using the free oaDOI service, as a way to inform OA policy and practice.

Files (10.6 MB)
Name                    Checksum                              Size
accuracy_analysis.xlsx  md5:18fba55dece4396c2a1431659719ecb9  156.9 kB
crossref_100k.csv.gz    md5:cc0a958c6f83f21a6f21a01668536935  2.9 MB
README.txt              md5:a57c3482404ef4fe3c4b6f433a80399a  1.1 kB
unpaywall_100k.csv.gz   md5:49d0032a276b04abe2343d5b38352522  3.4 MB
wos_100k.csv.gz         md5:d062eeb28a6dc7d1301eebc3e55fbcb2  3.6 MB
wos_analysis.xlsx       md5:2ffc848b47f0d46435e64e27cf64bc9e  524.9 kB
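
The MD5 values in the file listing above can be used to check that each download arrived intact. The following is a minimal sketch, assuming the files were saved under their listed names in a single local directory. The helper names `md5_of` and `verify_downloads` and the `directory` argument are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "accuracy_analysis.xlsx": "18fba55dece4396c2a1431659719ecb9",
    "crossref_100k.csv.gz": "cc0a958c6f83f21a6f21a01668536935",
    "README.txt": "a57c3482404ef4fe3c4b6f433a80399a",
    "unpaywall_100k.csv.gz": "49d0032a276b04abe2343d5b38352522",
    "wos_100k.csv.gz": "d062eeb28a6dc7d1301eebc3e55fbcb2",
    "wos_analysis.xlsx": "2ffc848b47f0d46435e64e27cf64bc9e",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to 'ok', 'mismatch', or 'missing'.

    `directory` is a hypothetical local download folder.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        print(f"{name}: {status}")
```

A "mismatch" usually points to an interrupted or corrupted download. Because a newer version of this record exists, it may also mean the file came from a different version than the one listed here.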
