Graph topological features extracted from expression profiles of neuroblastoma patients
- 1. Luxembourg Institute of Health
- 2. Nanyang Technological University
Description
Introduction
This dataset contains the data described in the paper "A deep neural network approach to predicting clinical outcomes of neuroblastoma patients" by Tranchevent, Azuaje and Rajapakse. More precisely, it contains the topological features extracted from graphs built from publicly available expression data (see details below). It does not contain the original expression data, which are available elsewhere. We thank the scientists who generated and shared these data (see the relevant links and publications below).
Content
File names start with the name of the publicly available dataset they are built on ("Fischer", "Maris" or "Versteeg"). This name is followed by a tag indicating whether the file contains raw data ("raw", which here means the raw topological features) or TensorFlow-formatted data ("TF"). The tag is followed by an identifier that designates a unique configuration. The configuration file "Global_configuration.tsv" describes these configurations, including which topological features are present and which clinical outcome is considered.
The code associated with the same manuscript, which uses these data, is available at https://gitlab.com/biomodlih/SingalunDeep. The procedure by which the raw data are transformed into the TensorFlow-ready data is described in the paper.
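As an illustration of the naming convention described in this section, the following sketch splits a file name into its dataset, tag and configuration identifier. It assumes that the parts are separated by underscores, which is inferred from the single example name given on this page; the helper `parse_file_name` and the `FileInfo` type are hypothetical and are not part of the published code.

```python
from typing import NamedTuple

# Dataset names and tags as listed in the description above.
KNOWN_DATASETS = {"Fischer", "Maris", "Versteeg"}
KNOWN_TAGS = {"raw", "TF"}


class FileInfo(NamedTuple):
    dataset: str    # "Fischer", "Maris" or "Versteeg"
    tag: str        # "raw" or "TF"
    config_id: str  # identifier of the configuration
    rest: str       # whatever follows the identifier (may be empty)


def parse_file_name(name: str) -> FileInfo:
    """Split a file name into its leading parts (assumes '_' separators)."""
    parts = name.split("_", 3)
    if len(parts) < 3:
        raise ValueError(f"unexpected file name: {name!r}")
    dataset, tag, config_id = parts[:3]
    rest = parts[3] if len(parts) > 3 else ""
    if dataset not in KNOWN_DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    return FileInfo(dataset, tag, config_id, rest)
```

The returned identifier can then be looked up in "Global_configuration.tsv" to find the features and clinical outcome used for that file.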
File format
All files are TSV files that correspond to matrices with samples as rows and features as columns (or clinical data as columns for clinical data files). The data files contain various sets of topological features that were extracted from the sample graphs (or Patient Similarity Networks - PSN). The clinical files contain relevant clinical outcomes.
The raw data files only contain the topological data. For instance, the file "Fischer_raw_2d0000_data_tsv" contains 24 values for each sample, corresponding to the 12 centralities computed for both the microarray (Fischer-M) and RNA-seq (Fischer-R) datasets. The TensorFlow-ready files do not contain the sample identifiers in the first column. However, they contain two extra columns at the end. The first extra column holds the sample weights (used by the classifiers, since one class is very often dominant). The second extra column holds the binary class labels, based on the clinical outcome of interest.
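The layout of the TensorFlow-ready files described above can be read with a few lines of standard-library Python. This is a minimal sketch, not the loading code used in the paper: the functions `read_tsv` and `split_tf_rows` are hypothetical, and whether the files start with a header row is not stated here, so that is exposed as the `has_header` assumption.

```python
import csv
from typing import List, Tuple


def read_tsv(path: str, has_header: bool = False) -> List[List[str]]:
    """Read a tab-separated file into a list of rows, skipping empty lines."""
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    # Assumption: the presence of a header row must be checked per file.
    return rows[1:] if has_header else rows


def split_tf_rows(
    rows: List[List[str]],
) -> Tuple[List[List[float]], List[float], List[int]]:
    """Split TF-formatted rows into features, sample weights and class labels.

    The last column is the binary class label, the one before it is the
    sample weight, and everything else is a topological feature.
    """
    features, weights, labels = [], [], []
    for row in rows:
        if len(row) < 3:
            raise ValueError("expected at least one feature, a weight and a label")
        features.append([float(value) for value in row[:-2]])
        weights.append(float(row[-2]))
        labels.append(int(float(row[-1])))
    return features, weights, labels
```

For a raw file, the same reader applies, but the first column holds the sample identifier and there are no weight or label columns.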
Dataset details
The Fischer dataset is used to train, evaluate and validate the models, so it is split into train / eval / valid files, which contain 249, 125 and 124 rows (samples), respectively, of the original 498 samples. In contrast, the other two datasets (Maris and Versteeg) are smaller and are only used for validation (and therefore have no training or evaluation file).
The Fischer dataset also has more data files because various configurations were tested (see manuscript). In contrast, the validation using the Maris and Versteeg datasets is only done for a single configuration, and there are therefore fewer files.
For Fischer, a few configurations are listed in the global configuration file but there is no corresponding raw data. This is because these items are derived from concatenations of the original raw data (see global configuration file and manuscript for details).
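The Fischer split sizes given in this section (249, 125 and 124 rows out of 498 samples) can serve as a quick sanity check after download. The sketch below compares observed row counts with those numbers; the function `check_split` and the split keys are hypothetical names chosen for illustration.

```python
from typing import Dict, List

# Row counts stated in the dataset description for the Fischer splits.
EXPECTED_ROWS = {"train": 249, "eval": 125, "valid": 124}
TOTAL_SAMPLES = 498


def check_split(observed: Dict[str, int]) -> List[str]:
    """Return a list of human-readable problems (empty if all counts match)."""
    problems = []
    for split, expected in EXPECTED_ROWS.items():
        found = observed.get(split)
        if found != expected:
            problems.append(f"{split}: expected {expected} rows, found {found}")
    total = sum(observed.values())
    if total != TOTAL_SAMPLES:
        problems.append(f"total: expected {TOTAL_SAMPLES} rows, found {total}")
    return problems
```

The observed counts would come from counting the data rows of the three files of one configuration, taking care to exclude a header row if the files have one.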
References
This dataset is associated with Tranchevent L., Azuaje F., Rajapakse J.C., A deep neural network approach to predicting clinical outcomes of neuroblastoma patients.
If you use these data in your research, please do not forget to also cite the researchers who have generated the original expression datasets.
Fischer dataset:
- Zhang W. et al., Comparison of RNA-seq and microarray-based models for clinical endpoint prediction. Genome Biology 16(1) (2015). doi:10.1186/s13059-015-0694-1
- Wang C. et al., The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat. Biotechnol. 32(9), 926–932. doi:10.1038/nbt.3001
Versteeg dataset:
- Molenaar J.J. et al., Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483(7391), 589–593. doi:10.1038/nature10910
Maris dataset:
- Wang Q. et al., Integrative genomics identifies distinct molecular classes of neuroblastoma and shows that multiple genes are targeted by regional alterations in DNA copy number. Cancer Res. 66(12), 6050–6062. doi:10.1158/0008-5472.CAN-05-4618
Files
(9.3 MB)
Checksum | Size |
---|---|
md5:69afea4b4776de136a281fe67f7b62c3 | 193.3 kB |
md5:819a6d2dcfa27dc567e5b52f54d94836 | 77.7 kB |
md5:5196be73b98533d77812a569c107e9e2 | 98.0 kB |
md5:8b44821d92b4f9205ca0568299fc3daf | 41.8 kB |
md5:3f84193c6bd44397fb40d0020cef764c | 98.3 kB |
md5:3c2cdf3ef9ccbea3b3b95b6b7aaac0cd | 38.8 kB |
md5:3b8a3a9000052912e8ed27d7ae540184 | 193.7 kB |
md5:18ce40e90e5a1f3fb051b8936427e5ae | 79.7 kB |
md5:d569abc3b103580f8f0c8b57be6c2511 | 97.9 kB |
md5:6fcf5ef6280b2c2b3e140732db0a7236 | 32.9 kB |
md5:e2bade62011ec85960496c5a3dbf3bd0 | 98.7 kB |
md5:237ead031ea0dd2c88090536049c7087 | 49.8 kB |
md5:089488bcae31ca1a6b639a1e7aa00ae8 | 7.0 kB |
md5:306ba9cc665f94fb0bc823b3cf5edb55 | 59.8 kB |
md5:e42e0f74c4ad75f2277d74d99e8da386 | 122.5 kB |
md5:c1a4684789a8cf6f5e325d97358d2183 | 61.2 kB |
md5:0fb10be21d78f81dd19de4a3377df520 | 186.2 kB |
md5:0129812e97446352bbf7a29b974410f3 | 374.5 kB |
md5:ae31bec374993a413dd3f7ef0bcbe9f1 | 186.5 kB |
md5:ad93e19cc3e8c54856587cea213cf98d | 30.4 kB |
md5:8402bafce5ab2f2bc50711a07967f1fd | 64.0 kB |
md5:7d03a3d48a3a7abe984677d6477f70bb | 32.0 kB |
md5:719fd7a5f1b351eb8fc266b3153e5e00 | 97.3 kB |
md5:0f2ba680e0e99c128b527bdef2524199 | 197.3 kB |
md5:401f1c87c0601004a4f3f1c905b503e7 | 98.3 kB |
md5:c514ae0932deb1d9360eda54e6121edb | 30.4 kB |
md5:665725bd01bb33100d8e1a1defc3504c | 63.9 kB |
md5:527156da82cd1d05fc733f60a8632259 | 32.0 kB |
md5:4d6b3a716be04cc77010cd8b2083a171 | 89.9 kB |
md5:4b6ebdb1e4a45594a39e09340d55200c | 182.6 kB |
md5:931b0ece791a29943c66f26995411e92 | 90.9 kB |
md5:08a4500b73296245ea579854c624f585 | 59.7 kB |
md5:f17576fce69fad735a05f133f8a7c42d | 121.9 kB |
md5:6c10167989e20265bf21342c88554a39 | 61.0 kB |
md5:ba079e88862e23c42ba942d7bb4ce71a | 193.4 kB |
md5:09c80775b076b64fb8acd6c43e1a571b | 388.3 kB |
md5:219f8f2db8edc5d60c730c828e819cb4 | 193.5 kB |
md5:5ad187948f90e65156ee9ccde336c122 | 30.4 kB |
md5:2c1373c8e83360b079b33236043f1772 | 63.6 kB |
md5:ad975ea37e3765fa564087300c3a49d1 | 31.8 kB |
md5:609a17672380c65e697b5a8954071d26 | 74.8 kB |
md5:05670344464eed4ad46e30c97a663540 | 152.1 kB |
md5:4cf5c292c842084fa7bdbe8d22622624 | 75.8 kB |
md5:c94f7522afb9496b2fef480a4c42c69a | 30.3 kB |
md5:20ba77c2e8d3e6ae183f6c0eae8b5c92 | 63.3 kB |
md5:3c813ac095436429306da9db57f94747 | 31.7 kB |
md5:55c661743fc42b215bd529894d03e189 | 119.6 kB |
md5:3948377b57880d8de198969502543dbb | 241.2 kB |
md5:2536811e91d57e496bd97cceb76121ec | 120.2 kB |
md5:017aadaafc9162ed2a2b46349e2b6190 | 246.0 kB |
md5:e87dccb4dc3c29b72e9c57374b1c3090 | 493.4 kB |
md5:0b25db6b1545b8ab94cd9805f6dd2313 | 245.9 kB |
md5:3e83569602d71164bd29db3e484a338a | 127.5 kB |
md5:0176f761424f5aae4f6d5cc1381c1315 | 257.5 kB |
md5:fd6cfd8e11e6fa034f340a80207b00b5 | 128.3 kB |
md5:dcdc5fc3d43fa06e72e0076867d5c278 | 119.5 kB |
md5:ee1f84bc021b9e0fffe2725c4981462a | 241.4 kB |
md5:88600fe55b447ee215d813041f7502ed | 120.4 kB |
md5:18b705101fee379bccf26c59e8c97e02 | 252.0 kB |
md5:d2a49a2c3a252406ef1de1b8548dbb6a | 504.8 kB |
md5:a15610a37e21c539b95942afd2d99aee | 251.7 kB |
md5:56959c816afd8413e3a7f2e8f7c2b0e6 | 104.3 kB |
md5:9a4fb1bc66b4c7b0845a4765e5655499 | 210.7 kB |
md5:13dbc60ac8aee64b47da425327deb01f | 105.1 kB |
md5:1b9527a5df3729226b493433cf8c6b04 | 148.6 kB |
md5:74322e5829d40deea58e70884c719778 | 298.9 kB |
md5:f9a7349ed35fd441fe2c44813dbc3e6a | 149.0 kB |
md5:6816cd9fad8c582668daabd1bb10aa65 | 1.2 kB |
md5:729d0d0f52a34ac73a4c355be10a098d | 18.1 kB |
md5:218420ac0f7a4529d5092a89c0aed698 | 18.1 kB |
md5:3cd9e68c668ff0a9f9574e05e22e6edf | 1.2 kB |
md5:f853ca38dcd5041f3aaa8b0fbe4c3fce | 23.6 kB |
md5:c8f866e73cda4dadb6e6975c27fe211c | 23.5 kB |
md5:8aa459e13b97a1e6837cf506354c32d3 | 4.2 kB |
md5:8d74412430a9d500e8c0041ccd7ac51c | 17.3 kB |
md5:4aa448d92ab44f678691459b3736e415 | 17.4 kB |
md5:62ce0edc8657c8330bc9075780b0295d | 1.2 kB |
md5:18902aa6ea80b56a8ae81de72a842500 | 22.5 kB |
md5:c6b401a3f4b4781a672996bfca340f74 | 21.4 kB |
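The md5 checksums listed above can be used to verify a downloaded file. The following is a minimal sketch using Python's standard `hashlib` module; the function name `md5_of_file` and the chunk size are illustrative choices, not part of the dataset.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so that large files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A file is intact if the returned string, prefixed with "md5:", equals the checksum listed for it.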