Published August 7, 2019 | Version 1.0.0
Dataset Open

Graph topological features extracted from expression profiles of neuroblastoma patients

  • 1. Luxembourg Institute of Health
  • 2. Nanyang Technological University

Description

Introduction

This dataset contains the data described in the paper "A deep neural network approach to predicting clinical outcomes of neuroblastoma patients" by Tranchevent, Azuaje and Rajapakse. More precisely, it contains the topological features extracted from graphs built from publicly available expression data (see details below). It does not contain the original expression data, which are available elsewhere. We thank the scientists who generated and shared these data (see the relevant links and publications below).

 

Content

File names start with the name of the publicly available dataset they are built on ("Fischer", "Maris" or "Versteeg"). This name is followed by a tag indicating whether the file contains raw data ("raw", which means, in this case, the raw topological features) or TensorFlow-formatted data ("TF"). The tag is followed by an identifier that designates a unique configuration. The configuration file "Global_configuration.tsv" gives details about these configurations, such as which topological features are present and which clinical outcome is considered.
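As an illustration of the naming scheme, the sketch below splits a file name into its dataset, tag and configuration identifier. It assumes that the parts are separated by underscores, which is inferred from the single example name given in this description; the function name and the handling of the trailing part are hypothetical.

```python
from typing import NamedTuple

KNOWN_DATASETS = {"Fischer", "Maris", "Versteeg"}
KNOWN_TAGS = {"raw", "TF"}


class ParsedName(NamedTuple):
    dataset: str    # source expression dataset
    tag: str        # "raw" or "TF"
    config_id: str  # identifier to look up in Global_configuration.tsv
    rest: str       # whatever follows the identifier (layout not specified here)


def parse_file_name(name: str) -> ParsedName:
    """Split a file name into dataset, tag, configuration id and remainder.

    Assumption: the parts are separated by underscores.
    """
    parts = name.split("_", 3)
    if len(parts) < 3:
        raise ValueError(f"unexpected file name: {name!r}")
    dataset, tag, config_id = parts[:3]
    rest = parts[3] if len(parts) > 3 else ""
    if dataset not in KNOWN_DATASETS:
        raise ValueError(f"unknown dataset in file name: {dataset!r}")
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown tag in file name: {tag!r}")
    return ParsedName(dataset, tag, config_id, rest)


print(parse_file_name("Fischer_raw_2d0000_data_tsv"))
```

The configuration identifier returned here is the value to look up in "Global_configuration.tsv".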

The code associated with the same manuscript, which uses these data, is available at https://gitlab.com/biomodlih/SingalunDeep. The procedure that transforms the raw data into the TensorFlow-ready data is described in the paper.
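The actual transformation is the one described in the paper and implemented in the repository above. Only to illustrate the target layout (features, then a sample weight, then a binary label), here is a minimal sketch. The inverse class frequency weighting is an assumption, chosen because it is a common way to compensate for a dominant class; it is not necessarily the scheme used by the authors. The function name and the toy values are hypothetical.

```python
from collections import Counter


def to_tf_rows(raw_rows, labels):
    """Turn raw rows [sample_id, f1, f2, ...] into [f1, f2, ..., weight, label].

    `labels` maps each sample_id to 0 or 1. Weights are inverse class
    frequencies (an assumption, not necessarily the paper's scheme).
    """
    counts = Counter(labels[row[0]] for row in raw_rows)
    n_samples = len(raw_rows)
    n_classes = len(counts)
    tf_rows = []
    for sample_id, *features in raw_rows:
        label = labels[sample_id]
        # Rare classes get a larger weight, dominant classes a smaller one.
        weight = n_samples / (n_classes * counts[label])
        tf_rows.append([*features, weight, label])
    return tf_rows


# Toy data: three samples in class 0, one sample in class 1.
raw = [["s1", 0.1, 0.2], ["s2", 0.3, 0.4], ["s3", 0.5, 0.6], ["s4", 0.7, 0.8]]
outcome = {"s1": 0, "s2": 0, "s3": 0, "s4": 1}
for row in to_tf_rows(raw, outcome):
    print(row)
```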

 

File format

All files are TSV files that correspond to matrices with samples as rows and features as columns (or clinical data as columns for clinical data files). The data files contain various sets of topological features that were extracted from the sample graphs (Patient Similarity Networks, or PSNs). The clinical files contain relevant clinical outcomes.
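A raw matrix of this kind can be read with the Python standard library, as in the sketch below. It assumes that the first column holds the sample identifier and that the file starts with a header row; the header is an assumption and can be switched off. The column names in the example ("degree", "closeness") are invented for illustration.

```python
import csv
import io


def read_matrix(handle, has_header=True):
    """Read a TSV matrix into (column_names, {sample_id: [float, ...]}).

    Assumptions: first column is the sample identifier; a header row is
    present unless has_header is False.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)[1:] if has_header else None
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip empty lines
        matrix[row[0]] = [float(value) for value in row[1:]]
    return header, matrix


# In-memory stand-in for a raw data file (values are made up).
example = io.StringIO(
    "sample\tdegree\tcloseness\n"
    "S1\t3\t0.50\n"
    "S2\t5\t0.75\n"
)
columns, features = read_matrix(example)
print(columns)
print(features)
```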

The raw data files only contain the topological data. For instance, the file "Fischer_raw_2d0000_data_tsv" contains 24 values for each sample, corresponding to the 12 centralities computed for both the microarray (Fischer-M) and RNA-seq (Fischer-R) datasets. The TensorFlow-ready files do not contain the sample identifiers in the first column. However, they contain two extra columns at the end. The first extra column holds the sample weights, which the classifiers use because one class is very often dominant. The second extra column holds the binary class labels, based on the clinical outcome of interest.
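The layout of the TensorFlow-ready files can be unpacked as sketched below: every column but the last two is a feature, the second-to-last column is the sample weight and the last column is the label. The sketch assumes that these files have no header row and that labels are stored as 0/1; both points are assumptions to check against the actual files.

```python
import csv
import io


def split_tf_rows(handle):
    """Split TF-ready rows into feature vectors, sample weights and labels.

    Assumptions: no header row; labels are written as 0/1.
    """
    features, weights, labels = [], [], []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip empty lines
        *values, weight, label = row
        features.append([float(v) for v in values])
        weights.append(float(weight))
        labels.append(int(float(label)))
    return features, weights, labels


# In-memory stand-in: two features, then weight, then label (made-up values).
example = io.StringIO("0.10\t0.20\t0.67\t0\n0.30\t0.40\t2.00\t1\n")
X, w, y = split_tf_rows(example)
print(X, w, y)
```

The three lists can then be passed to a training routine that accepts per-sample weights.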

 

Dataset details

The Fischer dataset is used to train, evaluate and validate the models, so it is split into train / eval / valid files, which contain 249, 125 and 124 rows (samples), respectively, out of the original 498 samples. In contrast, the other two datasets (Maris and Versteeg) are smaller and are used only for validation (and therefore have no training or evaluation files).
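The three subset sizes add up to the 498 original samples (249 + 125 + 124), that is, roughly 50% / 25% / 25%. The sketch below reproduces a partition with these sizes. How samples were actually assigned to each subset is not stated here, so the seeded random shuffle is purely an assumption, and the function name is hypothetical.

```python
import random


def split_indices(n_samples, sizes, seed=0):
    """Partition range(n_samples) into consecutive chunks of the given sizes.

    The random assignment is an assumption; the original split may have
    been produced differently (for example, stratified by outcome).
    """
    if sum(sizes) != n_samples:
        raise ValueError("subset sizes must add up to the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    parts, start = [], 0
    for size in sizes:
        parts.append(indices[start:start + size])
        start += size
    return parts


train, evaluation, valid = split_indices(498, (249, 125, 124))
print(len(train), len(evaluation), len(valid))
```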

The Fischer dataset also has more data files because various configurations were tested (see manuscript). In contrast, validation on the Maris and Versteeg datasets is done for a single configuration only, so there are fewer files.

For Fischer, a few configurations are listed in the global configuration file but have no corresponding raw data file. This is because they are derived from concatenations of the original raw data (see global configuration file and manuscript for details).
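To make the idea of a derived configuration concrete, the sketch below concatenates two raw feature matrices. It assumes that concatenation is column-wise and that rows are matched on the sample identifier; the exact recipe for each derived configuration is given in the global configuration file and the manuscript. The function name and the toy values are hypothetical.

```python
def concatenate_features(*matrices):
    """Join {sample_id: [features]} matrices column-wise on shared sample ids.

    Assumption: rows are matched by sample identifier, and only samples
    present in every matrix are kept.
    """
    if not matrices:
        raise ValueError("at least one matrix is required")
    shared = set(matrices[0])
    for matrix in matrices[1:]:
        shared &= set(matrix)
    return {
        sample_id: [value for matrix in matrices for value in matrix[sample_id]]
        for sample_id in sorted(shared)
    }


# Two toy raw matrices with two features each (made-up values).
config_a = {"S1": [0.1, 0.2], "S2": [0.3, 0.4]}
config_b = {"S1": [1.0, 2.0], "S2": [3.0, 4.0]}
print(concatenate_features(config_a, config_b))
```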

 

References

This dataset is associated with Tranchevent L., Azuaje F., Rajapakse J.C., A deep neural network approach to predicting clinical outcomes of neuroblastoma patients.

If you use these data in your research, please do not forget to also cite the researchers who have generated the original expression datasets.

Fischer dataset:

  • Zhang W. et al., Comparison of RNA-seq and microarray-based models for clinical endpoint prediction. Genome Biology 16(1) (2015). doi:10.1186/s13059-015-0694-1
  • Wang C. et al., The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat. Biotechnol. 32(9), 926–932 (2014). doi:10.1038/nbt.3001

Versteeg dataset:

  • Molenaar J.J. et al., Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483(7391), 589–593 (2012). doi:10.1038/nature10910

Maris dataset:

  • Wang Q. et al., Integrative genomics identifies distinct molecular classes of neuroblastoma and shows that multiple genes are targeted by regional alterations in DNA copy number. Cancer Res. 66(12), 6050–6062 (2006). doi:10.1158/0008-5472.CAN-05-4618

Notes

Project supported by the Fonds National de la Recherche (FNR), Luxembourg (SINGALUN project). This research was also partially supported by Tier-2 grant MOE2016-T2-1-029 by the Ministry of Education, Singapore.

Files

Files (9.3 MB)

Checksum Size
md5:69afea4b4776de136a281fe67f7b62c3 193.3 kB
md5:819a6d2dcfa27dc567e5b52f54d94836 77.7 kB
md5:5196be73b98533d77812a569c107e9e2 98.0 kB
md5:8b44821d92b4f9205ca0568299fc3daf 41.8 kB
md5:3f84193c6bd44397fb40d0020cef764c 98.3 kB
md5:3c2cdf3ef9ccbea3b3b95b6b7aaac0cd 38.8 kB
md5:3b8a3a9000052912e8ed27d7ae540184 193.7 kB
md5:18ce40e90e5a1f3fb051b8936427e5ae 79.7 kB
md5:d569abc3b103580f8f0c8b57be6c2511 97.9 kB
md5:6fcf5ef6280b2c2b3e140732db0a7236 32.9 kB
md5:e2bade62011ec85960496c5a3dbf3bd0 98.7 kB
md5:237ead031ea0dd2c88090536049c7087 49.8 kB
md5:089488bcae31ca1a6b639a1e7aa00ae8 7.0 kB
md5:306ba9cc665f94fb0bc823b3cf5edb55 59.8 kB
md5:e42e0f74c4ad75f2277d74d99e8da386 122.5 kB
md5:c1a4684789a8cf6f5e325d97358d2183 61.2 kB
md5:0fb10be21d78f81dd19de4a3377df520 186.2 kB
md5:0129812e97446352bbf7a29b974410f3 374.5 kB
md5:ae31bec374993a413dd3f7ef0bcbe9f1 186.5 kB
md5:ad93e19cc3e8c54856587cea213cf98d 30.4 kB
md5:8402bafce5ab2f2bc50711a07967f1fd 64.0 kB
md5:7d03a3d48a3a7abe984677d6477f70bb 32.0 kB
md5:719fd7a5f1b351eb8fc266b3153e5e00 97.3 kB
md5:0f2ba680e0e99c128b527bdef2524199 197.3 kB
md5:401f1c87c0601004a4f3f1c905b503e7 98.3 kB
md5:c514ae0932deb1d9360eda54e6121edb 30.4 kB
md5:665725bd01bb33100d8e1a1defc3504c 63.9 kB
md5:527156da82cd1d05fc733f60a8632259 32.0 kB
md5:4d6b3a716be04cc77010cd8b2083a171 89.9 kB
md5:4b6ebdb1e4a45594a39e09340d55200c 182.6 kB
md5:931b0ece791a29943c66f26995411e92 90.9 kB
md5:08a4500b73296245ea579854c624f585 59.7 kB
md5:f17576fce69fad735a05f133f8a7c42d 121.9 kB
md5:6c10167989e20265bf21342c88554a39 61.0 kB
md5:ba079e88862e23c42ba942d7bb4ce71a 193.4 kB
md5:09c80775b076b64fb8acd6c43e1a571b 388.3 kB
md5:219f8f2db8edc5d60c730c828e819cb4 193.5 kB
md5:5ad187948f90e65156ee9ccde336c122 30.4 kB
md5:2c1373c8e83360b079b33236043f1772 63.6 kB
md5:ad975ea37e3765fa564087300c3a49d1 31.8 kB
md5:609a17672380c65e697b5a8954071d26 74.8 kB
md5:05670344464eed4ad46e30c97a663540 152.1 kB
md5:4cf5c292c842084fa7bdbe8d22622624 75.8 kB
md5:c94f7522afb9496b2fef480a4c42c69a 30.3 kB
md5:20ba77c2e8d3e6ae183f6c0eae8b5c92 63.3 kB
md5:3c813ac095436429306da9db57f94747 31.7 kB
md5:55c661743fc42b215bd529894d03e189 119.6 kB
md5:3948377b57880d8de198969502543dbb 241.2 kB
md5:2536811e91d57e496bd97cceb76121ec 120.2 kB
md5:017aadaafc9162ed2a2b46349e2b6190 246.0 kB
md5:e87dccb4dc3c29b72e9c57374b1c3090 493.4 kB
md5:0b25db6b1545b8ab94cd9805f6dd2313 245.9 kB
md5:3e83569602d71164bd29db3e484a338a 127.5 kB
md5:0176f761424f5aae4f6d5cc1381c1315 257.5 kB
md5:fd6cfd8e11e6fa034f340a80207b00b5 128.3 kB
md5:dcdc5fc3d43fa06e72e0076867d5c278 119.5 kB
md5:ee1f84bc021b9e0fffe2725c4981462a 241.4 kB
md5:88600fe55b447ee215d813041f7502ed 120.4 kB
md5:18b705101fee379bccf26c59e8c97e02 252.0 kB
md5:d2a49a2c3a252406ef1de1b8548dbb6a 504.8 kB
md5:a15610a37e21c539b95942afd2d99aee 251.7 kB
md5:56959c816afd8413e3a7f2e8f7c2b0e6 104.3 kB
md5:9a4fb1bc66b4c7b0845a4765e5655499 210.7 kB
md5:13dbc60ac8aee64b47da425327deb01f 105.1 kB
md5:1b9527a5df3729226b493433cf8c6b04 148.6 kB
md5:74322e5829d40deea58e70884c719778 298.9 kB
md5:f9a7349ed35fd441fe2c44813dbc3e6a 149.0 kB
md5:6816cd9fad8c582668daabd1bb10aa65 1.2 kB
md5:729d0d0f52a34ac73a4c355be10a098d 18.1 kB
md5:218420ac0f7a4529d5092a89c0aed698 18.1 kB
md5:3cd9e68c668ff0a9f9574e05e22e6edf 1.2 kB
md5:f853ca38dcd5041f3aaa8b0fbe4c3fce 23.6 kB
md5:c8f866e73cda4dadb6e6975c27fe211c 23.5 kB
md5:8aa459e13b97a1e6837cf506354c32d3 4.2 kB
md5:8d74412430a9d500e8c0041ccd7ac51c 17.3 kB
md5:4aa448d92ab44f678691459b3736e415 17.4 kB
md5:62ce0edc8657c8330bc9075780b0295d 1.2 kB
md5:18902aa6ea80b56a8ae81de72a842500 22.5 kB
md5:c6b401a3f4b4781a672996bfca340f74 21.4 kB
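
The MD5 checksums listed above can be used to check that a downloaded file is intact. The sketch below computes the MD5 digest of a local file with the Python standard library and compares it to a value written in the "md5:..." form used in the list. The function name is hypothetical, and the temporary file only stands in for a real download.

```python
import hashlib
import os
import tempfile


def md5_matches(path, expected, chunk_size=1 << 20):
    """Return True if the MD5 digest of the file at `path` equals `expected`.

    `expected` may be given with or without the leading "md5:" prefix.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# Stand-in for a downloaded file; replace with a real path and checksum.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
print(md5_matches(tmp.name, "md5:5d41402abc4b2a76b9719d911017c592"))
os.remove(tmp.name)
```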