Published June 17, 2024 | Version v2.0.0
Poster Open

An Open Software Development-based Ecosystem of R Packages for Metabolomics Data Analysis

  • 1. Institute for Biomedicine, Eurac Research, Italy
  • 2. Computational Biology and Bioinformatics, de Duve Institute, UCLouvain, Belgium
  • 3. Department of Electronic Engineering & IISPV, Universitat Rovira i Virgili, Spain
  • 4. CIBER de Diabetes y Enfermedades Metabólicas Asociadas (CIBERDEM), Instituto de Salud Carlos III, Spain
  • 5. Department of Environmental Chemistry, Switzerland
  • 6. Institute of Molecular Systems Biology, ETH Zurich, Switzerland
  • 7. Metabolomics Unit, Research and Innovation Centre, Fondazione Edmund Mach, Italy
  • 8. Department of Effect Directed Analysis, Helmholtz Center for Environmental Research, Germany
  • 9. Research Unit Analytical BioGeoChemistry, Helmholtz Munich, Germany
  • 10. Department of Nutrition, Exercise and Sports, University of Copenhagen, Denmark
  • 11. Department of Plant and Environmental Sciences, Weizmann Institute of Science, Israel
  • 12. Functional Genomics Center Zurich (FGCZ)-University of Zurich/ETH Zurich, Switzerland
  • 13. Swiss Institute of Bioinformatics (SIB), Switzerland
  • 14. Genome Biology Unit, EMBL, Germany
  • 15. Ingalls Lab at the School of Oceanography, University of Washington, USA
  • 16. Laboratory of Integrative Metabolomics (LIMET), Department of Translational Physiology, Infectiology and Public Health (DI04), Faculty of Veterinary Medicine, Ghent University, Belgium
  • 17. Department of Life Sciences, Chalmers University of Technology, Sweden
  • 18. RECETOX, Faculty of Science, Masaryk University, Czech Republic
  • 19. Computational Plant Biochemistry, MetaCom, Leibniz Institute of Plant Biochemistry, Germany
  • 20. Metabolomics and Proteomics Core, Helmholtz Munich, Germany
  • 21. Anesthesiology and Intensive Care Medicine, University Hospital Greifswald, Germany

Description

A frequent problem with scientific research software is the lack of support, maintenance and further development. In particular, development by a single researcher can easily result in orphaned software packages, especially if combined with poor documentation or lack of adherence to open software development standards.

The RforMassSpectrometry initiative aims to develop an efficient and stable infrastructure for mass spectrometry (MS) data analysis. As part of this initiative, a growing ecosystem of R software packages is being developed, covering different aspects of metabolomics and proteomics data analysis. To avoid the aforementioned problems, community contributions are fostered, and open development, documentation and long-term support are emphasized.

At the heart of the package ecosystem is the Spectra package, which provides the core infrastructure to handle and analyze MS data. Its design allows easy extension to support additional file or data formats, including data representations with a minimal memory footprint or remote data access. The xcms package for LC-MS data preprocessing was updated to reuse this infrastructure, which now also enables the analysis of very large or remote data sets. This integration also simplifies complete analysis workflows, which can include the MsFeatures package for compounding (grouping features that originate from the same compound) and the MetaboAnnotation package for annotation of untargeted metabolomics experiments. Public annotation resources can be easily accessed through packages such as MsBackendMassbank, MsBackendMgf, MsBackendMsp or CompoundDb, the latter also allowing users to create and manage lab-specific compound databases. Finally, the MsCoreUtils and MetaboCoreUtils packages provide efficient implementations of commonly used algorithms, designed to be reused in other R packages. Ultimately, and in contrast to a monolithic software design, the package ecosystem makes it possible to build customized, modular, and reproducible analysis workflows.
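As a rough illustration of how the Spectra package is used, the following minimal sketch builds a Spectra object from made-up peak data, removes low-intensity peaks and compares two spectra. It assumes that the Spectra package is installed from Bioconductor; the m/z, intensity and retention time values as well as the intensity threshold are invented for this example and are not taken from the poster.

```r
library(Spectra)  # Bioconductor package; also attaches S4Vectors (DataFrame)

# Two hypothetical MS2 spectra; all values are illustrative only.
spd <- DataFrame(msLevel = c(2L, 2L), rtime = c(10.2, 11.5))
spd$mz <- list(c(100.1, 150.2, 200.3), c(100.1, 150.2, 250.4))
spd$intensity <- list(c(10, 100, 50), c(12, 90, 40))

# Create the Spectra object; the data is kept in memory here.
sps <- Spectra(spd)

# Keep only peaks with an intensity of at least 20 (arbitrary threshold).
sps <- filterIntensity(sps, intensity = c(20, Inf))

# Similarity between the two spectra, using the default similarity function.
sim <- compareSpectra(sps[1], sps[2])
sim
```

The same calls are intended to work unchanged when the data comes from a different backend, for example when the object is created from files with a file-based backend such as MsBackendMzR or MsBackendMgf instead of from an in-memory table.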

Future development will focus on improved data structures and analysis methods for chromatographic data, and on better interoperability with other open-source software, including direct integration with Python MS libraries.

Files

RforMassSpectrometry_metabolomics.pdf (1.4 MB)
md5:48d7a86b33fddccfda85917bfea1b55d

Additional details

Software

Programming language: R, Python, RMarkdown
Development status: Active
