EOSC-Life Methodology framework to enhance reproducibility within EOSC-Life
Creators
- Bietrix, Florence (1)
- Carazo, José Maria (2)
- Capella-Gutierrez, Salvador (3)
- Coppens, Frederik (4)
- Chiusano, Maria Luisa (5)
- David, Romain (6)
- Fernandez, Jose Maria (2)
- Fratelli, Maddalena (7)
- Heriche, Jean-Karim (8)
- Goble, Carole (9)
- Gribbon, Philip (10)
- Holub, Petr (11)
- P. Joosten, Robbie (12)
- Leo, Simone (13)
- Owen, Stuart (9)
- Parkinson, Helen (14)
- Pieruschka, Roland (15)
- Pireddu, Luca (13)
- Porcu, Luca (7)
- Raess, Michael (16)
- Rodriguez-Navas, Laura (3)
- Scherer, Andreas (17)
- Soiland-Reyes, Stian (9)
- Tang, Jing (17)
- 1. EATRIS
- 2. CSIC
- 3. BSC
- 4. VIB
- 5. EMBRC
- 6. ERINHA
- 7. IRFMN
- 8. EMBL
- 9. UNIMAN
- 10. Fraunhofer
- 11. BBMRI-ERIC
- 12. NKI
- 13. CRS4
- 14. EMBL-EBI
- 15. FZJ
- 16. INFRAFRONTIER
- 17. HU
Description
The original scope of task 8.3 was to develop metrics to assess how the availability of life-science open data and workflows in the cloud affects reproducibility.
A large part of the activities within EOSC-Life is directly related to reproducibility and provides, in several ways, tools that will have an impact on it. We therefore decided that it would be more informative to describe these activities and to explain why and how they will affect reproducibility in the life sciences, rather than to provide abstract metrics for such an impact. For these reasons, we changed the title of the deliverable from “framework to assess ...” to “framework to enhance reproducibility”.
First, we considered how open science can contribute to improving the reproducibility of research. Publicly sharing data, protocols, tools and computational workflows makes it possible to compare or combine the data and outcomes from different studies within a discipline, as well as to integrate data across scientific domains. It allows conclusions to be validated, corrected where necessary, and reinforced by meta-analyses. Replication data and test/training data can also be used in many applications to contribute to reproducible research. Moreover, new hypotheses, different from the original aims of the study, can be explored. Data sets can be re-used to develop and test new methods, to conduct scientific and technical benchmarking activities and to support training activities. Therefore, in addition to generating more value from research investments, data sharing has the potential to increase confidence in research outcomes and to broaden the dissemination of knowledge. These benefits of open sharing have long been recognized in some fields such as bioinformatics, which has a long history of publicly sharing data: for example, public repositories for nucleotide sequences go back 30 years, and the Protein Data Bank (PDB), a repository of information about the 3D structures of proteins, nucleic acids and complex assemblies, celebrates its 50th birthday this year.
In this view, any improvement in the sharing of data, tools and workflows among scientists and across disciplines, which is the aim of EOSC-Life and the wider EOSC, will contribute to reproducible science. In addition to this general scope, several specific actions that frame transparency in the reporting of experimental protocols, data and analytical workflows help ensure the reproducibility of each individual object (experimental results, data or workflows) made available in the cloud.
Here we describe the initiatives in EOSC-Life to implement existing tools for reproducibility and to develop new tools to enhance it. Because the final goal of EOSC-Life is to make data resources available to the wider community of life scientists, this document, although necessarily technical in places, is aimed at a general readership that includes experimental scientists as well as data scientists.
Files
EOSC-Life_D8.1_Methodology framework to enhance reproducibility within EOSC-Life_April-2021.pdf
859.9 kB
md5:985c7ba80ad85220e91d3886641c97ba
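The MD5 checksum listed for the file can be used to confirm that a downloaded copy is byte-identical to the deposited one. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_record` are illustrative and not part of the record, and the local file path is whatever name the PDF was saved under.

```python
import hashlib
from pathlib import Path

# Checksum as listed on the record page for the deposited PDF.
EXPECTED_MD5 = "985c7ba80ad85220e91d3886641c97ba"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record` with the path of the downloaded PDF returns `True` only if the local copy has the listed checksum. MD5 is adequate here for detecting accidental corruption during transfer; it is not a defence against deliberate tampering.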