Training dataset for the TRENDY method
Description
This dataset is used to train the TRENDY method for gene regulatory network (GRN) inference. It also contains the SINC test dataset.
For a brief description of the code for the TRENDY method, see https://github.com/YueWangMathbio/TRENDY.
For the manuscript describing the TRENDY method, see https://github.com/YueWangMathbio/TRENDY/blob/main/GRN_transformer.pdf.
To use the data:
1. Download all files from https://github.com/YueWangMathbio/TRENDY.
2. Download all files from this database (both https://zenodo.org/records/14927741 and https://zenodo.org/records/13929908).
3. In the folder containing the files from GitHub, create a folder named "total_data_10" and unzip every file named "dataset....zip" into it.
4. Unzip "rev_wendy_all_10.zip" in the folder containing the files from GitHub.
5. Unzip "SINC_data.zip" and put the extracted files into the folder "SINC".
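The unpacking steps can also be scripted. The following is a minimal sketch, not part of the TRENDY repository: the function name `extract_archives`, the glob pattern `dataset*.zip` (a guess at what "dataset....zip" matches), and the two directory arguments are assumptions. It also assumes the "SINC" folder sits inside the GitHub folder and that each archive can be extracted directly into its target.

```python
import zipfile
from pathlib import Path


def extract_archives(repo_dir, download_dir):
    """Unpack the downloaded archives into the layout described in steps 3-5.

    repo_dir: folder holding the files from GitHub (assumed path).
    download_dir: folder holding the downloaded .zip files (assumed path).
    """
    repo_dir, download_dir = Path(repo_dir), Path(download_dir)
    # Pair each target folder with a glob pattern for the archives that go there.
    # "dataset*.zip" is an assumed pattern for the "dataset....zip" files.
    targets = [
        (repo_dir / "total_data_10", "dataset*.zip"),
        (repo_dir, "rev_wendy_all_10.zip"),
        (repo_dir / "SINC", "SINC_data.zip"),
    ]
    extracted = []
    for target, pattern in targets:
        target.mkdir(parents=True, exist_ok=True)
        for archive in sorted(download_dir.glob(pattern)):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(target)
            extracted.append(archive.name)
    return extracted
```

A call such as `extract_archives("TRENDY", "downloads")` (both paths hypothetical) returns the names of the archives it unpacked, which makes a missing download easy to spot.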
The "total_data_10" folder will contain 102 groups of data, where each group has eight files with different name endings:
xxx_A: 1000 ground truth gene regulatory networks, each of size 10*10
xxx_cov: 11000 covariance matrices for 1000 samples at 11 time points, each of size 10*10
xxx_data: 1000 gene expression samples, each of size 100*10*11 (100 cells, 10 genes, 11 time points)
xxx_genie: 10000 gene regulatory networks inferred by the GENIE3 method for 1000 samples at 10 time points, each of size 10*10
xxx_nlode: 1000 gene regulatory networks inferred by the NonlinearODEs method for 1000 samples, each of size 10*10
xxx_revcov: 10000 constructed pseudo covariance matrices for 1000 samples at 10 time points, each of size 10*10
xxx_sinc: 1000 gene regulatory networks inferred by the SINCERITIES method for 1000 samples, each of size 10*10
xxx_wendy: 10000 gene regulatory networks inferred by the WENDY method for 1000 samples at 10 time points, each of size 10*10
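The sizes listed above fit together arithmetically: one expression sample of size 100*10*11 yields one 10*10 covariance matrix per time point, so 1000 samples at 11 time points give 11000 matrices. The sketch below illustrates this on synthetic data. It uses the standard sample covariance from NumPy, which is an assumption and not necessarily how the xxx_cov files were produced; the function name `per_time_covariances` is hypothetical.

```python
import numpy as np


def per_time_covariances(sample):
    """Return one gene-by-gene covariance matrix per time point.

    sample has shape (cells, genes, time points), e.g. (100, 10, 11).
    """
    n_cells, n_genes, n_times = sample.shape
    covs = np.empty((n_times, n_genes, n_genes))
    for t in range(n_times):
        # rowvar=False: rows are cells (observations), columns are genes (variables).
        covs[t] = np.cov(sample[:, :, t], rowvar=False)
    return covs


rng = np.random.default_rng(0)
synthetic = rng.normal(size=(100, 10, 11))  # synthetic stand-in, not TRENDY data
covs = per_time_covariances(synthetic)
print(covs.shape)  # (11, 10, 10)
```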
The "rev_wendy_all_10" folder will contain two further kinds of files, distinguished by their name endings:
xxx_ktstar: 10000 inferred covariance matrices by the first half of TRENDY for 1000 samples at 10 time points, each of size 10*10
xxx_revwendy: 10000 inferred gene regulatory networks by the first half of TRENDY for 1000 samples at 10 time points, each of size 10*10
The 100 numbered groups are for training. The group labeled "val" is for validation, and the group labeled "test" is for testing.
To train or test a new GRN inference method, you only need the xxx_A files (ground truth networks) and the xxx_data files (gene expression samples).
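When only the ground truth and expression files are needed, it helps to pair them by group first. The sketch below is one way to do that, assuming the name endings "_A" and "_data" appear immediately before the file extension. The function name `pair_truth_and_expression` is hypothetical, and the file format is not stated here, so loading is left to the reader.

```python
from pathlib import Path


def pair_truth_and_expression(folder):
    """Map each group prefix to its (xxx_A, xxx_data) file paths."""
    truth, expression = {}, {}
    for path in Path(folder).iterdir():
        stem = path.stem  # file name without its extension
        if stem.endswith("_A"):
            truth[stem[: -len("_A")]] = path
        elif stem.endswith("_data"):
            expression[stem[: -len("_data")]] = path
    # Keep only the groups for which both files are present.
    shared = sorted(truth.keys() & expression.keys())
    return {prefix: (truth[prefix], expression[prefix]) for prefix in shared}
```

Applied to the "total_data_10" folder, the result should have one entry per group, which gives a quick check that all 102 groups were unpacked.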
Files
Name | Size | MD5 |
---|---|---|
SINC_data.zip | 300.0 MB | 76cce82fa8f32983abfd5064885821ab |
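The MD5 checksum can be used to confirm that a download is complete. A minimal sketch using Python's standard `hashlib` module follows. The function name `md5_of_file` is hypothetical, and the checksum is assumed to belong to SINC_data.zip, the 300.0 MB file listed above.

```python
import hashlib

# Checksum listed in the file table (assumed to belong to SINC_data.zip).
EXPECTED_MD5 = "76cce82fa8f32983abfd5064885821ab"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

After downloading, `md5_of_file("SINC_data.zip") == EXPECTED_MD5` should evaluate to `True`. A mismatch suggests a truncated or corrupted download.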
Additional details
Dates
- Available: 2024-10-14
Software
- Repository URL: https://github.com/YueWangMathbio/TRENDY
- Programming language: Python