Published July 12, 2023 | Version v1
Presentation Open

From Chaos to Control - How Dutch university libraries collectively build, manage and use a data warehouse for open access management

  • 1. UKB

Description

Abstract

For many consortia and libraries, open access is a driver for change in both license management and research support. Reaching 100% compliant open access with limited budgets is challenging for both institutions and researchers. An important success factor is the availability of open access management data. However, the lack of data format standards and the limited set of persistent identifiers make it impossible to easily combine open access related metadata sets, let alone use them for trusted business intelligence and decision support.

To get a grip on this chaotic playing field, UKB, the network of Dutch university libraries, started building a data warehouse in 2020. The first data services for both the consortium and the libraries were launched within a year. By the end of 2022 the datahub contained a rich, structured and controlled metadata set related to more than 300,000 peer-reviewed articles, written by Dutch (co-)authors over the last five years inside and outside (consortium) publishing agreements.

Besides data from publishers, university research information systems and commercial databases, the data warehouse also harvests open databases like Crossref, Unpaywall and OpenAPC. The datahub is used to extract, transform, present and load open access related metadata. Business rules are applied when multiple data sources present conflicting information, for example regarding the open access status of an article.
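
As an illustration of such a business rule, the following is a minimal sketch that assumes a simple source-priority rule. The priority order, the source keys and the status labels are hypothetical and are not taken from the UKB datahub; the actual rules may be more elaborate.

```python
from typing import Dict, Optional, Tuple

# Hypothetical priority order (an assumption, not the datahub's actual rule):
# sources listed earlier win when sources disagree.
SOURCE_PRIORITY = ["publisher_report", "unpaywall", "crossref"]


def resolve_oa_status(
    claims: Dict[str, Optional[str]]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (status, source) from the highest-priority source that reports a status."""
    for source in SOURCE_PRIORITY:
        status = claims.get(source)
        if status is not None:
            return status, source
    return None, None


def has_conflict(claims: Dict[str, Optional[str]]) -> bool:
    """True when at least two sources report different statuses."""
    reported = {status for status in claims.values() if status is not None}
    return len(reported) > 1


# Made-up claims for a single article, keyed by source.
claims = {"crossref": None, "unpaywall": "closed", "publisher_report": "gold"}
resolved = resolve_oa_status(claims)
conflict = has_conflict(claims)
```

Returning the winning source together with the status keeps the decision traceable, so that conflicting records can still be reviewed afterwards.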

The results enable the consortium and university libraries to audit the quality of publisher reports and open access publishing services, including missed and non-compliant open access, the status of capped deals and long-term publishing trends. The datahub is also used to analyze open access costs outside of deals (‘APCs in the wild’) and to give libraries additional insight into articles that are missing from university research information systems and closed access articles that can be converted to green open access.
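
One of these use cases, finding articles that are missing from a university research information system, can be sketched as a comparison of DOI sets. This is only an illustrative sketch: the function names and the example DOIs are made up, and it assumes that both systems record a DOI for each article.

```python
from typing import Iterable, List


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common URL or 'doi:' prefixes (DOIs are case-insensitive)."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def missing_from_ris(hub_dois: Iterable[str], ris_dois: Iterable[str]) -> List[str]:
    """Return the DOIs known to the datahub but absent from the research information system."""
    known = {normalize_doi(d) for d in ris_dois}
    return sorted({normalize_doi(d) for d in hub_dois} - known)


# Made-up example DOIs, written in different notations on purpose.
hub = ["10.1000/ABC123", "https://doi.org/10.1000/xyz789"]
ris = ["doi:10.1000/abc123"]
missing = missing_from_ris(hub, ris)
```

Normalizing before comparing matters because the same DOI is often recorded in different notations across sources.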

The presentation aims to explain and show in a non-technical way how the data warehouse works, including the main challenges that were encountered when building it. The added value for negotiation teams, open access experts and contract managers is addressed with several practical use cases and a demo of the datahub. The goal is to share lessons learned and to inspire other libraries and consortia that want to get a grip on the open access related metadata that is essential for developing their open access strategy, strengthening their position towards publishers and helping authors publish in a funder-compliant way.

Files

Session1_ArjanSchalken_LIBER2023_final.pdf