Published April 19, 2021 | Version 1
Presentation Open

First Line Research Data Management for the Life Sciences: a case study

  • 1. Maastricht University - DataHub

Description

Modern life sciences depend on the collection, management and analysis of comprehensive datasets in what has become data-intensive research. Life science research is also characterised by relatively small groups of researchers. This combination of small groups and data-intensive research has led to a growing bottleneck in research data management (RDM). In parallel, initiatives such as FAIR and Open Science have issued an urgent call to publish research data openly.

Here, we reflect on the lessons learned by DataHub Maastricht, an RDM support group of the university medical centre in Maastricht, the Netherlands, in providing first line RDM for the life sciences. DataHub Maastricht operates with a small core team, complemented by disciplinary data stewards who preferably hold joint positions between DataHub and a research group. This organisational model helps create shared knowledge between DataHub and the data stewards. Data stewards also provide direct input to the Scrum development process at DataHub. Finally, resources (time and manpower) to provide support are limited, and data stewards are key in knowing how to focus support on the most reusable datasets.

The features of an RDM service can be used as an incentive for researchers to start RDM. This turned out to be very effective when we offered the ability to reduce storage costs through transparent tiering of datasets to tape. On the other hand, we have learned that it is easy to fall into the fallacy of believing that the next feature-to-be will provide the definitive incentive for a researcher to start RDM.

The life sciences have very diverse subdomains, and this diversity can be a real challenge to cater for in RDM services. We have used the Integrated Rule-Oriented Data System (iRODS), which is well suited to the data-intensive life sciences, to build our generic Maastricht Data Repository. However, to support this diversity we turn to domain-specific RDM platforms such as OMERO, XNAT and MOLGENIS. Teaming up with research groups to host these platforms turned out to be successful: the IT knowledge required to create a long-term stable platform is present at DataHub, while the platform-specific knowledge is present at the research group.

Looking to the future, we foresee the need to further embed the role of data stewards into the lifeblood of the research organisation, along with policies on how to finance long-term storage of research data. The latter needs to be combined with further formalisation of the appraisal and reappraisal of archived research data.

Files

idcc2021-datahub-maastricht.pdf (1.5 MB)
md5:1a3d2999e1d3c3e973b135a1284cf723