Published October 11, 2018 | Version v1

Integrating omics datasets with the OmicsPLS package

  • 1. Dept. of Biomedical Data Sciences, LUMC, Albinusdreef 2, Leiden, 2300 RC, The Netherlands
  • 2. Department of Biostatistics and Research Support, UMC Utrecht, div. Julius Centre, Huispost Str. 6.131, Utrecht, 3508 GA, The Netherlands
  • 3. Delft Institute of Applied Mathematics, EEMCS, TU Delft, Van Mourik Broekmanweg 6, Delft, 2628 XE, The Netherlands
  • 4. MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, Scotland
  • 5. Genos Glycobiology Laboratory, Zagreb, 10000, Croatia
  • 6. Dept. of Statistics, University of Leeds, Leeds, LS2 9JT, United Kingdom

Description

Background: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between datasets. However, these datasets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, no such software has been available for O2PLS.

Results: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard cross-validation method and a faster alternative are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data.

Conclusions: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").
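
As an illustration of the workflow summarized above, the following is a minimal sketch in R, not the authors' analysis. The simulated data, dimensions, and candidate component numbers are hypothetical choices made for this example; `o2m`, `crossval_o2m_adjR2`, `summary`, and `loadings` are functions of the OmicsPLS package, and we assume here that `crossval_o2m_adjR2` corresponds to the faster alternative cross-validation mentioned in the Results.

```r
# install.packages("OmicsPLS")  # one-time install from CRAN, as stated above
library(OmicsPLS)

set.seed(1)
n_samples <- 100

# Hypothetical toy data: one joint latent variable shared by X and Y,
# plus one latent variable that is specific to X.
joint    <- rnorm(n_samples)
specific <- rnorm(n_samples)
X <- joint %*% t(c(rep(1, 5), rep(0, 45))) +
  specific %*% t(c(rep(0, 45), rep(1, 5))) +
  matrix(rnorm(n_samples * 50), nrow = n_samples)
Y <- joint %*% t(c(rep(1, 5), rep(0, 40))) +
  matrix(rnorm(n_samples * 45), nrow = n_samples)

# Center the columns before fitting
X <- scale(X, scale = FALSE)
Y <- scale(Y, scale = FALSE)

# Cross-validate over candidate numbers of joint (a) and
# data-specific (ax, ay) components; the grid is an assumption
cv_fast <- crossval_o2m_adjR2(X, Y, a = 1:2, ax = 0:2, ay = 0:2, nr_folds = 5)
print(cv_fast)

# Fit O2PLS with 1 joint, 1 X-specific and 0 Y-specific components
# (chosen here to match how the toy data were simulated)
fit <- o2m(X, Y, n = 1, nx = 1, ny = 0)
summary(fit)

# Inspect the joint loadings of the first few X variables
head(loadings(fit, "Xjoint"))
```

In a real analysis the number of components would be taken from the cross-validation output rather than fixed in advance, and X and Y would be the two omics data matrices with matching samples in the rows.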

Files

12859_2018_2371_MOESM1_ESM.pdf

Files (2.8 MB)

Additional details

Funding

European Commission
MIMOMICS - Methods for Integrated analysis of Multiple Omics datasets (grant no. 305280)