An Open Software Development-based Ecosystem of R Packages for Metabolomics Data Analysis
Creators
- 1. Computational Biology and Bioinformatics, de Duve Institute, UCLouvain, Belgium
- 2. Department of Electronic Engineering & IISPV, Universitat Rovira i Virgili, Spain
- 3. Department of Environmental Chemistry, Switzerland. Institute of Molecular Systems Biology, ETH Zurich, Switzerland
- 4. Department of Effect Directed Analysis, Helmholtz Center for Environmental Research, Germany
- 5. Research Unit Analytical BioGeoChemistry, Helmholtz Munich, Germany
- 6. Department of Nutrition, Exercise and Sports, University of Copenhagen, Denmark
- 7. Department of Plant and Environmental Sciences, Weizmann Institute of Science, Israel
- 8. Genome Biology Unit, EMBL, Germany
- 9. RECETOX, Masaryk University, Czech Republic
- 10. Computational Plant Biochemistry, MetaCom, Leibniz Institute of Plant Biochemistry, Germany
- 11. Metabolomics and Proteomics Core, Helmholtz Munich, Germany
- 12. Anesthesiology and Intensive Care Medicine, University Hospital Greifswald, Germany
- 13. Institute for Biomedicine, Eurac Research, Italy
Description
Lack of support, maintenance and further development is common with scientific research software. In particular, development by a single researcher can easily result in orphaned software packages, especially if combined with poor documentation or lack of adherence to open software development standards.
The RforMassSpectrometry initiative aims to develop an efficient, thoroughly documented and stable infrastructure for mass spectrometry (MS) data analysis. As part of this initiative, a growing ecosystem of R software packages covering different aspects of metabolomics and proteomics data analysis is being developed. To avoid the aforementioned problems, the initiative emphasizes open and shared development, documentation, support and stability.
At the heart of the package ecosystem is the Spectra package, which provides the core infrastructure to handle MS data. Core functionality that can easily be re-used by other R software packages is provided by the MsCoreUtils and MetaboCoreUtils packages. Version 4 of the xcms package for LC-MS data pre-processing is now based mainly on this new infrastructure, thereby gaining support for additional data types, better data handling and support for ion mobility data. Integration of the xcms package into the package ecosystem simplifies complete analysis workflows, which can include the MsFeatures package for feature grouping and the MetaboAnnotation package for annotation of untargeted metabolomics data. Publicly available annotation resources can be integrated seamlessly through packages such as MsBackendMassbank, MsBackendMsp or CompoundDb, the latter also allowing users to create and manage lab-specific annotation resources. MsQuality enables rapid, efficient, and standardized quality assessment of MS data.
Finally, integration of Python-based functionality, such as that provided by the matchms package, is possible through the SpectriPy package, and the SpectraQL package adds support for the MassQL common query language to R/Spectra.
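To make the annotation step mentioned above more concrete, the following is a minimal, dependency-free sketch of one common idea behind annotating untargeted metabolomics features: comparing measured m/z values against the expected m/z of reference compounds within a ppm tolerance. It is written in plain Python for illustration only and does not use the API of MetaboAnnotation, CompoundDb or any other package named here. The function names, the two-compound reference table, the measured values and the 10 ppm default are all assumptions made for this example, and only the [M+H]+ adduct is considered.

```python
from typing import Dict, List, Tuple

PROTON_MASS = 1.007276  # mass of a proton in Da, used for the [M+H]+ adduct


def ppm_error(measured: float, expected: float) -> float:
    """Signed difference between two m/z values in parts per million."""
    return (measured - expected) / expected * 1e6


def annotate(
    features: List[float],
    compounds: Dict[str, float],
    ppm: float = 10.0,
) -> List[Tuple[float, str, float]]:
    """Match measured m/z values to the [M+H]+ m/z of reference compounds.

    Returns one (measured m/z, compound name, ppm error) tuple per match.
    """
    hits = []
    for mz in features:
        for name, mono_mass in compounds.items():
            expected = mono_mass + PROTON_MASS
            error = ppm_error(mz, expected)
            if abs(error) <= ppm:
                hits.append((mz, name, round(error, 2)))
    return hits


# Hypothetical reference table: compound name -> monoisotopic mass (Da).
reference = {"caffeine": 194.080376, "glucose": 180.063388}
# Hypothetical measured feature m/z values (illustrative, not real data).
measured = [195.0877, 181.0707, 250.1234]

matches = annotate(measured, reference)
for mz, name, error in matches:
    print(f"m/z {mz}: {name} ({error} ppm)")
```

A real workflow would typically also consider other adducts, retention time and MS/MS spectral similarity, which is where a shared infrastructure for spectra and annotation resources becomes useful.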
Files
R-ecosystem-for-metabolomics.pdf
Additional details
References
- Naake, Thomas et al. (2023). MsQuality – an interoperable open-source package for the calculation of standardized quality metrics of mass spectrometry data. bioRxiv. https://doi.org/10.1101/2023.05.12.540477
- Rainer, Johannes et al. (2022). A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R. Metabolites. https://doi.org/10.3390/metabo12020173
- Kockmann, Thomas et al. (2021). The rawrr R Package: Direct Access to Orbitrap Data and Beyond. Journal of Proteome Research. https://doi.org/10.1021/acs.jproteome.0c00866
- Huber, Florian et al. (2020). matchms – processing and similarity evaluation of mass spectrometry data. JOSS. https://doi.org/10.21105/joss.02411
- Jarmusch, Alan K et al. (2022). A Universal Language for Finding Mass Spectrometry Data Patterns. bioRxiv. https://doi.org/10.1101/2022.08.06.503000