There is a newer version of the record available.

Published October 15, 2021 | Version v1
Poster Open

Research data quality assurance at repositories indexed in re3data

  • 1. Berlin School of Library and Information Science, Humboldt-Universität zu Berlin
  • 1. Helmholtz-Gemeinschaft / Helmholtz Association, Helmholtz Open Science Office
  • 2. DataCite - International Data Citation Initiative e.V.
  • 3. GFZ German Research Centre for Geosciences / Helmholtz-Zentrum Potsdam – Deutsches GeoForschungsZentrum (GFZ)
  • 4. Karlsruher Institut für Technologie / Karlsruhe Institute of Technology (KIT)
  • 5. Humboldt-Universität zu Berlin, Berlin School of Library and Information Science
  • 6. Purdue University

Description

Quality assurance is a central challenge when sharing research data, as it ensures that data are valid, reliable, and usable. The landscape of repositories and their essential contribution to research data sharing is well studied. In contrast, we know much less about repositories’ role in research data quality assurance, and their contributions remain largely invisible.

To address this issue, we conducted a survey among staff responsible for data curation at repositories listed in re3data, an international registry of research data repositories. Of the 1897 repositories that were contacted, 332 completed the questionnaire.

Among other aspects, the survey covered the formal assessment and review of data, responsibility, data rejection and quality indicators.

The survey distinguished between formal assessment of data and data review. Formal assessment refers to technical, administrative and access-related aspects of data, whereas data review refers to the process by which experts, either from the hosting institution or from other institutions, evaluate the scientific quality of datasets.

We found that 62.3 % (207) of responding repositories apply formal criteria, and 51.5 % (171) conduct data review either for all (31.6 %, 105) or some (19.9 %, 66) datasets.
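The percentages above can be recomputed from the reported counts. The following is a minimal sketch, assuming that all shares are taken relative to the 332 responding repositories; the variable and function names are illustrative only and do not come from the survey materials.

```python
# Counts as reported in the description above.
CONTACTED = 1897   # repositories that were contacted
RESPONDED = 332    # repositories that completed the questionnaire

# Assumption: every share is relative to the responding repositories.
counts = {
    "formal assessment": 207,
    "data review (all or some datasets)": 171,
    "data review, all datasets": 105,
    "data review, some datasets": 66,
}


def share(count, total=RESPONDED):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Response rate is relative to the contacted repositories instead.
response_rate = share(RESPONDED, CONTACTED)
shares = {label: share(n) for label, n in counts.items()}

print(f"response rate: {response_rate} %")
for label, pct in shares.items():
    print(f"{label}: {pct} %")
```

Under this assumption, the two data review subgroups (105 and 66) add up to the 171 repositories that conduct data review, and the survey's response rate comes out at roughly 17.5 % of contacted repositories.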

At most repositories conducting formal assessment or data review, repository data curators and research data managers at the hosting institution are responsible for these tasks. Technical repository administrators are more often involved in formal assessment, whereas subject experts are more frequently responsible for data review.

If data of insufficient quality are submitted, 65.1 % (216) of repositories would revise data and metadata and ask data depositors for revisions until the data fulfill the required criteria. Some 33.1 % of repositories would consider rejecting the data deposit. A total of 86 repositories report that they have rejected data in the last two years, at an average rate of 11.1 % of submitted datasets.

The quality indicators repositories selected most frequently as very relevant or relevant are ‘overall data and documentation quality’, ‘appropriate metadata / documentation’, and ‘suitability to the scope of the repository’. In contrast, ‘timeliness’ and ‘novelty’ are least relevant to responding repositories.

Survey data were supplemented by re3data metadata to assess the influence of repository characteristics on quality assurance.

We found no significant relationship between certification status and the formal assessment of data at a repository. The association between certification status and data review is significant, but with a small effect size. Repository type has no significant relationship with either formal assessment or data review.

Results of the survey will be shared with the repository community and inform the development of a framework for research data quality assurance. On that basis, a controlled vocabulary for research data quality assurance measures will be established, which will be implemented in a future version of the re3data Metadata Schema.

The survey is part of re3data COREF (Community Driven Open Reference for Research Data Repositories), a project funded by the German Research Foundation (DFG) that aims to transform re3data into a central service for the open science community.

Files

re3data_rda_2021.pdf

Files (1.8 MB)

md5:d0ea685ea37597d6f32af1afc3444945