Published January 5, 2024 | Version v1
Presentation Open

PSDI Webinar: FAIR Data for the Biomolecular Simulation Community - Slides

  • 1. Science and Technology Facilities Council

Contributors

Distributor:

  • 1. University of Southampton

Description

In the Physical Sciences Data Infrastructure (PSDI), the first-round pathfinders are exploratory pieces of work, each looking at an application area where PSDI could develop tools to enhance the research infrastructure. These are the slides from the Pathfinder 4 webinar, "FAIR Data for the Biomolecular Simulation Community", presented by James Gebbie-Rayet and Jas Kalayan on 18 October 2023. https://www.psdi.ac.uk/event/webinar-psdi-pf4/

The recording of this webinar is available on YouTube: https://youtu.be/FA_rVv-hZig

Abstract: The biomolecular simulation community produces vast amounts of data to study complex biological systems at atomic scales. Yet how we store, share, and reuse our data is not well defined. One problem is that our simulation pipelines are difficult to replicate without detailed descriptions of each step. Another is that the data files we produce are often too large to store on the national compute facilities where most of this data is produced.

In this webinar we present possible solutions to these two problems: first, a software tool to record data provenance towards FAIR [1] compliant formats, and second, an online data repository to store and share this data. For reproducible data, we use the AiiDA Python infrastructure [2,3] to develop our aiida-gromacs plugin [4], which tracks all inputs and outputs from simulation protocols performed with the GROMACS MD engine [5]. We show how aiida-gromacs can be used with minimal changes to how researchers already work, to allow easy adoption of provenance tools within our community. To address the findable and accessible components of FAIR compliance, we also briefly present our prototype data infrastructure. Our aim is to allow researchers to upload data produced via aiida-gromacs directly to our database for easy access, querying and visualisation of simulation data and metadata online.

We encourage the biomolecular simulation community to try aiida-gromacs; installation guides and tutorials are available online. Please have a go, contribute, and provide us with feedback for improving aiida-gromacs!

[1] Wilkinson, Mark D. et al. (2016). Scientific Data, 3(1), 160018.
[2] https://www.aiida.net/sections/about.html
[3] Huber, Sebastiaan P. et al. (2020). Scientific Data, 7(1), 300.
[4] https://aiida-gromacs.readthedocs.io/en/latest/index.html
[5] https://www.gromacs.org/
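To give a rough idea of what tracking the inputs and outputs of a simulation step can involve, the following minimal Python sketch runs a command and writes a JSON record of the files it consumed and produced. It is not the aiida-gromacs or AiiDA API: the helper names `sha256_of` and `run_with_provenance`, and the layout of the JSON record, are assumptions made purely for illustration.

```python
import hashlib
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_with_provenance(command, inputs, outputs, record_path):
    """Run `command` and write a JSON record of its inputs and outputs.

    Hypothetical helper for illustration only; it is not part of AiiDA
    or aiida-gromacs.
    """
    record = {
        "command": [str(part) for part in command],
        "started": datetime.now(timezone.utc).isoformat(),
        # Hash the inputs before the command runs, in case it modifies them.
        "inputs": {str(p): sha256_of(p) for p in inputs},
    }
    completed = subprocess.run(command, capture_output=True, text=True)
    record["return_code"] = completed.returncode
    # Only hash the declared outputs that the command actually produced.
    record["outputs"] = {
        str(p): sha256_of(p) for p in outputs if Path(p).exists()
    }
    Path(record_path).write_text(json.dumps(record, indent=2))
    return record
```

In a GROMACS workflow the command could be a call such as `gmx grompp`, with the parameter, structure and topology files listed as inputs and the run input file as the output. A provenance framework such as AiiDA stores far richer information than this flat record, including the links between successive steps.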

Files

20231018_Webinar4-PF4_combined-slidedeck.pdf (5.0 MB)
md5:d7b60bf5c413770f849f55ed1d4ebad4
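
The MD5 checksum listed for the slide deck can be used to check that a downloaded copy is intact. The sketch below shows one way to do this in Python with the standard library; the helper names `md5_of` and `matches_md5` are hypothetical, and the file is assumed to have been saved locally under its original name.

```python
import hashlib

# Checksum as listed on this record for the slide deck PDF.
EXPECTED_MD5 = "d7b60bf5c413770f849f55ed1d4ebad4"


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected`."""
    return md5_of(path) == expected.strip().lower()
```

For example, `matches_md5("20231018_Webinar4-PF4_combined-slidedeck.pdf")` should return `True` for an uncorrupted download. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against deliberate tampering.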

Additional details

Related works

Is supplemented by
Video/Audio: https://youtu.be/FA_rVv-hZig (URL)

Funding

PSDI Phase 1b EP/X032663/1
UK Research and Innovation
Physical Sciences Data Infrastructure (PSDI) Phase 1 Pilot EP/W032252/1
UK Research and Innovation
Physical Sciences Data Infrastructure Phase 1b EP/X032701/1
UK Research and Innovation

Dates

Created
2023-10-18