Published February 18, 2020 | Version v1
Poster Open

Criteria for appraisal and assessment of research data upon submission to a data repository

  • 1. Friedrich Schiller University Jena

Description

Submitting a dataset to a data repository transfers a data object from the private domain to the shared or public domain (see the domain model by Treloar & Klump 2019). The data provider’s intention is to preserve (i.e., archive) the data and, in most cases, also to make it accessible (i.e., publish it) to a broad audience. To meet the data provider’s expectations, the receiving repository has to make a number of decisions on how to treat the submitted data. Most repositories therefore publish the terms and conditions under which they operate. However, these often do not cover all relevant aspects. Moreover, compliance with these terms and conditions needs to be verified. In our experience, this verification relies on the individual expertise and experience of the data curation personnel, and in many cases no formal, transparent process is in place.

This work aims to provide curators of data repositories with a practical guide containing a catalogue of criteria to be verified at submission time. The catalogue also specifies information that needs to be collected from data providers at submission time because it may not be available later. Information such as the retention period of the data in the repository or the responsibility for data disposal is typically not part of standard metadata.

Starting from an initial draft catalogue, designed by the authors for an institutional repository, a one-day workshop was held with data management support staff, data managers and data curators to discuss, complement and restructure the criteria. The resulting criteria were phrased as questions and grouped into seven categories: Appraisal, Compliance, Primary Data, Metadata, Preservation, Access and Curation. Each question was assigned to the data provider, the data manager / curator, or both.

The catalogue is intended to serve as a basis for the appraisal and assessment of data submitted to a repository. Applying the criteria leads to a decision on whether a dataset can be ingested as is, ingested only after curation, or has to be rejected. A repository that implements the catalogue and announces that the criteria will be applied to submitted data increases its transparency as well as its effectiveness and efficiency. The repository can customize the catalogue according to its needs, e.g., by deleting, adding or weighting criteria. As a next step, the catalogue could be developed further, e.g., by adding the implications of certain answers for the data provider and the repository, translating it into a decision tree, or complementing it with a scoring system.
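As a rough illustration of how such a catalogue could be represented and applied, the following Python sketch models criteria as questions with a category and a respondent, and derives one of the three outcomes named above. It is not the authors' implementation: the example questions, the `mandatory` flag, and the rule that maps answers to a decision are assumptions made here for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    INGEST = "ingest as is"
    CURATE = "ingest after curation"
    REJECT = "reject"


@dataclass(frozen=True)
class Criterion:
    question: str            # criterion phrased as a question
    category: str            # one of the seven categories of the catalogue
    respondent: str          # "provider", "curator" or "both"
    mandatory: bool = False  # hypothetical flag: an unmet mandatory criterion means rejection


# Hypothetical example questions; the actual catalogue is on the poster.
CATALOGUE = [
    Criterion("Does the provider hold the rights to deposit the data?",
              "Compliance", "provider", mandatory=True),
    Criterion("Is a retention period specified?", "Preservation", "both"),
    Criterion("Is the descriptive metadata complete?", "Metadata", "curator"),
]


def assess(criteria, answers):
    """Map answers (question -> True/False) to a decision.

    Assumed rule: unanswered questions count as unmet; any unmet mandatory
    criterion leads to rejection; any other unmet criterion requires curation.
    """
    unmet = [c for c in criteria if not answers.get(c.question, False)]
    if any(c.mandatory for c in unmet):
        return Decision.REJECT
    if unmet:
        return Decision.CURATE
    return Decision.INGEST


if __name__ == "__main__":
    answers = {c.question: True for c in CATALOGUE}
    answers["Is a retention period specified?"] = False
    print(assess(CATALOGUE, answers).value)
```

A repository could adapt this sketch by editing the list of criteria or by replacing the decision rule, for example with weights and a score threshold.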

This work was conducted as part of the eeFDM project, which was funded by the German Federal Ministry of Education and Research (BMBF).

Notes

This poster was presented at the 15th International Digital Curation Conference (IDCC), Dublin, Ireland, 17-19 February 2020, under poster number 250.

Files

PosterA1_IDCC2020_GerlachR_etal_250.pdf

Files (192.8 kB)

md5:15ae87d9ceff249c6bb24670efc639ab (192.8 kB)