Published September 27, 2018 | Version v1
Journal article Open

Predicting clinical outcomes in neuroblastoma with genomic data integration

  • 1. Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey
  • 2. Electrical and Computer Engineering Graduate Program, Institute of Applied Sciences, Antalya Bilim University, Antalya, Turkey
  • 3. Graduate School of Informatics, Department of Health Informatics, Middle East Technical University, Ankara, Turkey

Description

Background: Neuroblastoma is a heterogeneous disease with diverse clinical outcomes. Current risk group models require improvement as patients within the same risk group can still show variable prognosis. Recently collected genome-wide datasets provide opportunities to infer neuroblastoma subtypes in a more unified way. Within this context, data integration is critical as different molecular characteristics can contain complementary signals. To this end, we utilized the genomic datasets available for the SEQC cohort patients to develop supervised and unsupervised models that can predict disease prognosis.

Results: Our supervised model trained on the SEQC cohort can accurately predict overall survival and event-free survival profiles of patients in two independent cohorts. We also performed extensive experiments to assess prediction accuracy for high-risk patients and for patients without MYCN amplification. These results suggest that clinical endpoints can be predicted accurately across multiple cohorts. To explore the data in an unsupervised manner, we used an integrative clustering strategy named multi-view kernel k-means (MVKKM), which can effectively integrate multiple high-dimensional datasets with varying weights. We observed that integrating different gene expression datasets yields better patient stratification than using these datasets individually. In addition, the subgroups we identified provide a better Cox regression model fit than the existing risk group definitions.
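
As an illustration of the integrative clustering idea mentioned above, the following is a minimal sketch of kernel k-means run on a weighted sum of per-view kernels. It is not the authors' implementation: MVKKM learns the view weights during clustering, whereas this sketch assumes fixed, user-supplied weights. The function names (`rbf_kernel`, `combine_views`, `init_labels`, `kernel_kmeans`), the RBF kernel, the `gamma` values, the farthest-point initialization, and the synthetic two-view data are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_views(kernels, weights):
    """Weighted sum of per-view kernels; weights are normalised to sum to 1.
    Assumption: weights are fixed here, while MVKKM would learn them."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels))

def init_labels(K, n_clusters):
    """Deterministic farthest-point seeding in kernel space (illustrative choice)."""
    diag = np.diag(K)
    centers = [0]
    d2 = diag + diag[0] - 2.0 * K[:, 0]
    for _ in range(1, n_clusters):
        nxt = int(np.argmax(d2))
        centers.append(nxt)
        d2 = np.minimum(d2, diag + diag[nxt] - 2.0 * K[:, nxt])
    dist = diag[:, None] + diag[centers][None, :] - 2.0 * K[:, centers]
    return dist.argmin(axis=1)

def kernel_kmeans(K, n_clusters, n_iter=50):
    """Kernel k-means on a precomputed kernel matrix K."""
    n = K.shape[0]
    labels = init_labels(K, n_clusters)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            size = mask.sum()
            if size == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - m_c||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / size
                          + K[np.ix_(mask, mask)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Synthetic stand-in for two data views measured on the same 40 samples.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 20)
view1 = group[:, None] * 5.0 + 0.3 * rng.standard_normal((40, 3))
view2 = group[:, None] * 5.0 + 0.3 * rng.standard_normal((40, 4))

K = combine_views([rbf_kernel(view1, 0.5), rbf_kernel(view2, 0.5)],
                  weights=[0.7, 0.3])
labels = kernel_kmeans(K, n_clusters=2)
```

On real data, each view would be a samples-by-features matrix from one platform, and the resulting `labels` could then be compared against existing risk groups, for example through the fit of a survival model.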

Conclusion: Altogether, our results indicate that integration of multiple genomic characterizations enables the discovery of subtypes that improve over existing definitions of risk groups. Effective prediction of survival times will have a direct impact on choosing the right therapies for patients.

Reviewers: This article was reviewed by Susmita Datta, Wenzhong Xiao and Ziv Shkedy.

Files

13062_2018_223_MOESM1_ESM.pdf

Files (1.9 MB)

Name Size
md5:8b442d0ef5604d2e70f6b207b7c1b354 685.2 kB
md5:9997576a3b4672d176559ada25b11d57 113.2 kB
md5:bbd60f5c236408264b47d9b3dcf3c367 41.0 kB
md5:bbd60f5c236408264b47d9b3dcf3c367 41.0 kB
md5:daf42a9ad60bfb9a884960425440295e 1.0 MB
md5:daa2922cecdcebd7b80b62c06c11a06b 19.4 kB