Published April 24, 2023 | Version v3
Dataset Open

Multimodal learning of noncoding variant effects using genome sequence and chromatin structure

Creators

  • 1. Texas A&M University

Description

ncVarPred-1D3D:

The data used to test for inconsistency between genome sequence and epigenetic profile, and later to show how that inconsistency relates to 3D chromatin structure, can be found in sanity_check_data.tar.gz.

Trained models for noncoding mutation effect prediction (mapping genome sequence to epigenetic profile) can be found in CNN_MLP.tar.gz, CNN_GCN.tar.gz, CNN_RNN_MLP.tar.gz, and CNN_RNN_GCN.tar.gz.

The trained model for pathogenic variant prediction can be found in fewshot_pathogenic_model.tar.gz.

The training data can be found in training_data.tar.gz.

Some noncoding variant effect prediction results, e.g., for eQTLs and pathogenic variants, can be reproduced using the data shared in ncVar_data.tar.gz.
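Each item above is distributed as a .tar.gz archive. The following is a minimal sketch, using only Python's standard tarfile module, of how one might inspect an archive before unpacking it and then extract it into a chosen directory. The function names list_members and extract_archive and the destination directory are hypothetical, and nothing here assumes a particular layout inside the archives, which this record does not describe.

```python
import tarfile
from pathlib import Path


def list_members(archive_path, limit=20):
    """Return up to `limit` member names without extracting anything."""
    names = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir, refusing paths that escape it."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getmembers()
        for member in members:
            # Resolve the would-be output path and check it stays under dest.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(dest, members=members)
    return dest
```

For example, list_members("sanity_check_data.tar.gz") would show the first entries of that archive, and extract_archive("sanity_check_data.tar.gz", "data") would unpack it into a local data directory. The archives are several gigabytes each, so listing before extracting is a cheap way to see what to expect.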

Files (48.0 GB)

MD5 checksum                          Size
md5:7edca56def1b439e8fb7aed8b5751fc9   9.1 GB
md5:d3864834131a3f7dafda056aa49ee76a   4.7 GB
md5:317f136e3553d028917385012b7e4e58   5.4 GB
md5:9fa6962273dfbd7710b7a26f94707dd6   4.4 GB
md5:b91600e1732eab442d9ad17fbc4aa697   7.8 GB
md5:b3209535243e7506610b158809453215   1.3 GB
md5:f46d6413ffe851e58e54fc673c80d48d   5.0 GB
md5:93529482fa983ca7a9281f44357c8c3c  10.2 GB
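Because the archives are large, it can be worth checking a download against the MD5 checksums listed above before extracting it. The sketch below uses only Python's standard hashlib module and reads the file in chunks so that a multi-gigabyte archive is never loaded into memory at once. The function names md5_of_file and matches_listed_checksum are hypothetical. The listing here does not pair checksums with file names, so which checksum belongs to which archive is left to the reader to confirm on the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as matches_listed_checksum("training_data.tar.gz", "md5:<hex from the listing>") returns True only when the local file's digest equals the listed value. A False result usually points to an incomplete or corrupted download, or to a checksum taken from a different file.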

Additional details

References

  • Tan, Wuwei, and Yang Shen. "Multimodal learning of noncoding variant effects using genome sequence and chromatin structure." bioRxiv (2022): 2022-12.