Open access books through open data sources: Assessing prevalence, providers, and preservation
- 1. Hanken School of Economics
Science policy and practice for open access (OA) books is a rapidly evolving area in the scholarly domain, yet much remains unknown. Utilizing open bibliometric data sources, this study aims to answer three questions: 1) How prevalent are OA books (data sources: Directory of Open Access Books, OpenAIRE, OpenAlex, SciELO Books, The Lens, WorldCat)? 2) What web domains are responsible for offering full-text access to these OA books? 3) To what degree can OA books be verified to be archived in trusted preservation services (data sources: Cariniana Network, CLOCKSS, Global LOCKSS Network, Portico)? A total of 396 995 unique records were identified from the OA book bibliometric sources, of which 19% were found to be included in at least one of the preservation services. The results give reason for concern about the long tail of OA books distributed across thousands of different web domains, as these domains include volatile cloud storage and, in some cases, no longer contained the files at all. Data quality issues, varying definitions of OA across services, and inconsistent implementation of unique identifiers were identified as key challenges. The study includes recommendations for publishers, libraries, data providers, and preservation services for improving the monitoring and practice of OA book preservation.
The author is grateful to Alicia Wise and Ronald Snijder for assisting in the identification of available datasets and for providing valuable feedback throughout the study.
This research was commissioned by CLOCKSS, DOAB, and OAPEN.
Data availability statement
The research data is made available as open data through Zenodo and can be downloaded from https://doi.org/10.5281/zenodo.7305477