Published July 16, 2018 | Version v2
Dataset Open

Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk (sequence model release)

Creators

  • 1. Simons Foundation and Princeton University

Description

(This is the updated version, which has been converted to a standard PyTorch model format.)

This is the deep learning sequence model used in:

Jian Zhou, Chandra L. Theesfeld, Kevin Yao, Kathleen M. Chen, Aaron K. Wong, and Olga G. Troyanskaya, Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk, Nature Genetics, 2018.

Note that the full software is available from https://github.com/FunctionLab/ExPecto; this release is provided for convenience and is distributed under the same non-commercial license. The model weights can be loaded with the PyTorch load_state_dict function (for an example, see https://github.com/FunctionLab/ExPecto/blob/master/chromatin.py). We also provide a web server for browsing mutations with strong predicted effects at https://hb.flatironinstitute.org/expecto/. The server is currently limited to mutations that lie within 1 kb of a TSS or that are 1000 Genomes variants.
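As a rough illustration of the load_state_dict workflow mentioned above, the sketch below saves and reloads the weights of a small stand-in network. TinySequenceModel, its layer sizes, and the A,C,G,T channel order are hypothetical and are not the released architecture; the actual model definition is in the chromatin.py script linked above. The in-memory buffer stands in for the downloaded weights file.

```python
import io

import torch
import torch.nn as nn


class TinySequenceModel(nn.Module):
    """Hypothetical stand-in network; NOT the released architecture."""

    def __init__(self, n_targets: int = 4):
        super().__init__()
        # Input is assumed to be one-hot DNA of shape (batch, 4, length).
        self.conv = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=8)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(8, n_targets)

    def forward(self, x):
        h = torch.relu(self.conv(x))
        h = self.pool(h).squeeze(-1)  # (batch, 8)
        return torch.sigmoid(self.fc(h))


# Simulate a released weights file by saving a state_dict to a buffer.
source = TinySequenceModel()
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# Rebuild the same architecture first, then load the weights into it.
model = TinySequenceModel()
state_dict = torch.load(buffer, map_location="cpu")
model.load_state_dict(state_dict)  # strict by default: keys must match
model.eval()  # inference mode

# Score an all-"A" sequence (channel 0 is assumed to encode A).
x = torch.zeros(1, 4, 100)
x[0, 0, :] = 1.0
with torch.no_grad():
    out = model(x)
    ref = source.eval()(x)
```

With a real weights file, the buffer would be replaced by the path to the downloaded file, and the module class would have to match the architecture the weights were saved from. Otherwise load_state_dict raises an error about missing or unexpected keys.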

Trivia: we code-named our models with whale names. This model has an unofficial codename DeepSEA "Beluga".

Files

Files (598.1 MB)

md5:62360f2db4ac96554d28d058b50ab654 (598.1 MB)
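The MD5 checksum listed above can be used to check that a download completed intact. The following is a minimal sketch using only the Python standard library; the helper names md5_of_file and verify_download are illustrative, and the local file path is whatever name the file was saved under.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "62360f2db4ac96554d28d058b50ab654"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use flat for a file of this size (598.1 MB). Calling verify_download with the path to the downloaded file returns False if the file is truncated or corrupted.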