The training data set for the deep ResNet models used by RaptorX-3DModeling
Creators
Description
This dataset contains the multiple sequence alignments (MSAs) and experimental structure files of the proteins that RaptorX-3DModeling uses to train its ResNet models for protein contact, distance, and orientation prediction. The proteins are taken from a CATH S35 list created in December 2019 or January 2020. The MSAs were generated by HHblits with an E-value threshold of 0.001 (E=0.001) on the uniclust30 database created in 2017. The package CathS35V2020MSA.tar.gz contains two folders: MSA_2017_E001, which holds the MSAs, and LISTS, which holds the training and validation lists. The MSA files can be used to construct both sequential and pairwise input features. The package CathS35V2020PDB.tar.gz contains one folder, PDB, which holds the experimental structure files, from which inter-atom distances and inter-residue orientations can be derived.
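As a quick sanity check after downloading, the folder layout described above can be inspected without unpacking the archive. The following is a minimal sketch using only the Python standard library; the helper name `top_level_entries` is hypothetical, and whether the folders sit directly at the archive root or under a wrapper directory is an assumption that the listing itself will confirm or refute.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names found inside a .tar.gz archive."""
    names = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        # Iterating streams through the member headers; nothing is extracted.
        for member in tar:
            parts = PurePosixPath(member.name).parts
            if parts:
                names.add(parts[0])
    return sorted(names)
```

Calling `top_level_entries("CathS35V2020MSA.tar.gz")` should, if the description holds and there is no wrapper directory, list `LISTS` and `MSA_2017_E001`. Scanning a multi-gigabyte gzip archive this way still reads the whole file, so it can take several minutes.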
Files
(6.1 GB)
MD5 checksum | Size |
---|---|
md5:da37c3c4f2f0d3c49c4456d597e75dec | 5.3 GB |
md5:28a705edcbd95fd015522d93478537ab | 816.1 MB |
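The MD5 checksums listed above can be used to confirm that a download completed without corruption. Below is a minimal sketch using Python's standard `hashlib`; the function names are hypothetical, and because this listing does not state which checksum belongs to which archive, the caller is assumed to pair them by matching the file size.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat for multi-gigabyte archives.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(path_to_archive, "md5:da37c3c4f2f0d3c49c4456d597e75dec")` returns `True` only if the file at `path_to_archive` hashes to that value; a `False` result usually means a truncated download or that the checksum belongs to the other archive.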