Journal article Open Access
Bloom, Theodora; Dallmeier-Tiessen, Sünje; Murphy, Fiona; Austin, Claire C.; Whyte, Angus; Tedds, Jonathan; Nurnberger, Amy; Raymond, Lisa; Stockhause, Martina; Vardigan, Mary
Tim Clark, Eleni Castro, Elizabeth Newbold, Samuel Moore, Brian Hole
Data publishing is a cornerstone of open science, reliable research, and modern scholarly communication. It enables researchers to share their materials via dedicated workflows, services and infrastructures, and is ultimately intended to ensure that data, and in particular datasets underlying published results, are well documented, curated, persistent, interoperable, reusable, citable, attributable, quality assured and discoverable. Data publishing workflows therefore potentially have an enormous impact on researchers, research practices and publishing paradigms, as well as on funding strategies and on career and research evaluations.
It is crucial for all stakeholders to understand the options for data publishing workflows and to be aware of emerging standards and best practices. To that end, the RDA-WDS Data Publishing Workflows group set out to survey the current data publishing workflow landscape across disciplines while paying attention to discipline-specific characteristics. To identify common components and standard practices, we looked at a diverse set of workflows, including basic self-publishing services, institutional data repositories, curated data repositories, and joint data journal and repository arrangements. This permitted us to identify, analyze and categorize the main building blocks of data publishing workflows. We wanted to understand how workflows differ depending on the desired outputs and what role community needs play in them. Interestingly, we found that core concepts are congruent across disciplines and data publishing workflows.
The present paper describes our findings and presents them as components of a data publishing reference model. Based on the assessment of the current data publishing landscape, we highlight important gaps and challenges to consider, especially when dealing with more complex workflows and their integration into wider community frameworks. We conclude the paper with recommendations to advance data publishing in line with the identified standards. It is our hope that as more research communities seek to publish data associated with their research, they will build on one or more of the components identified here when creating their own workflows, and thus accelerate uptake.