Published April 24, 2023 | Version v3
Dataset
Open
Multimodal learning of noncoding variant effects using genome sequence and chromatin structure
Description
ncVarPred-1D3D:
The data used to test the inconsistency between genome sequence and epigenetic profile, and later to show how that inconsistency relates to 3D chromatin structure, can be found in sanity_check_data.tar.gz.
Trained models for noncoding mutation effect prediction (mapping genome sequence to epigenetic profile) can be found in CNN_MLP.tar.gz, CNN_GCN.tar.gz, CNN_RNN_MLP.tar.gz, and CNN_RNN_GCN.tar.gz.
The trained model for pathogenic variant prediction can be found in fewshot_pathogenic_model.tar.gz.
The training data can be found in training_data.tar.gz.
Some noncoding variant effect prediction results (e.g., for eQTLs and pathogenic variants) can be reproduced using the data shared in ncVar_data.tar.gz.
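Since every file in this record is a gzip-compressed tar archive, a minimal Python sketch for inspecting and unpacking one is given here. The helper names `list_archive` and `extract_archive` are hypothetical, and the internal layout of the archives is not documented on this page, so listing the members first is a reasonable way to see what an archive contains before extracting several gigabytes.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path, limit=20):
    """Return the names of up to `limit` members of a .tar.gz archive."""
    names = []
    with tarfile.open(archive_path, mode="r:gz") as tar:
        for member in tar:  # iterate lazily instead of reading the full index
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the destination path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        tar.extractall(path=dest)
    return dest
```

For example, `list_archive("training_data.tar.gz")` would show the first few entries of the training data archive, assuming it has been downloaded to the working directory.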
Files
(48.0 GB)
MD5 checksum | Size |
---|---|
7edca56def1b439e8fb7aed8b5751fc9 | 9.1 GB |
d3864834131a3f7dafda056aa49ee76a | 4.7 GB |
317f136e3553d028917385012b7e4e58 | 5.4 GB |
9fa6962273dfbd7710b7a26f94707dd6 | 4.4 GB |
b91600e1732eab442d9ad17fbc4aa697 | 7.8 GB |
b3209535243e7506610b158809453215 | 1.3 GB |
f46d6413ffe851e58e54fc673c80d48d | 5.0 GB |
93529482fa983ca7a9281f44357c8c3c | 10.2 GB |
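The MD5 checksums listed above can be used to confirm that a multi-gigabyte download completed without corruption. The following is a minimal sketch using Python's standard `hashlib`; the function names `md5_of_file` and `verify_md5` are hypothetical, and the file is read in chunks so that a large archive does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Take `expected_hex` from the entry that corresponds to the downloaded archive; a `False` result suggests the file should be downloaded again.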
Additional details
References
- Tan, Wuwei, and Yang Shen. "Multimodal learning of noncoding variant effects using genome sequence and chromatin structure." bioRxiv (2022): 2022-12.