Published July 8, 2022 | Version v1
Presentation Open

Long-term digital preservation of research data as a community-specific project

  • 1. ZB MED – Information Centre for Life Sciences

Description

The increasing amount of published research data, whether in community-specific or general repositories, highlights the challenges of data preservation. Because research data is not limited to one or two popular publishing formats, format diversity and the resulting format obsolescence pose a significant risk to the reusability of research data over the long term. A second challenge is the intellectual reusability of data for future generations, which depends on preserving sufficient metadata and context information.

A pilot project of ZB MED – Information Centre for Life Sciences and the Leibniz Centre for Agricultural Landscape Research (ZALF) investigates the digital preservation of research data published in the BonaRes data repository. The project tests a workflow spanning the two institutions for preserving data from the repository in the ZB MED archive. BonaRes is a repository for soil measurement data that is maintained by ZALF and uses established data handling guidelines. It follows open science best practices, such as data curation and assigning DOIs to make published data citable.

ZB MED runs a digital archive that aims to preserve objects not only at the technical bitstream level but also at higher levels. Further preservation measures aim to preserve access to file content (content level) by migrating files to current formats as needed, and the intellectual reusability of content (semantic level) by preserving metadata. The archival system itself is part of a cooperation with two other national subject libraries, ZBW – Leibniz Information Centre for Economics and TIB – Leibniz Information Centre for Science and Technology, in which TIB hosts and administers the system.

In the pilot project, the main focus is the transfer of data into the archive. A second part of the workflow covers data transfer in the reverse direction, from the archive back to the repository, in case content is no longer available through the regular route. This presentation will introduce the project and the concepts developed as part of the workflow. Among these are the definition of the designated community and the evaluation and selection of repository data and metadata for digital preservation, carried out together with the project partners at ZALF. Preservation methods were determined, and the issue of format diversity in particular is addressed in multiple ways. File formats and format types suitable for preservation were defined, and contact with data-submitting researchers was established through a workshop conducted by ZALF and ZB MED. The aim of the workshop was to enable researchers to recognize formats suitable for preservation.

The close cooperation between the information centre ZB MED (infrastructure partner) and the research centre ZALF (research partner), together with the contact with submitting researchers, provides the basis for a user-oriented service.

Files

12-2 - Session12_KatharinaMarkus_LIBER2022.pdf (1.0 MB, md5:0b83dd48ede30999d0af952e5b1a7f23)