Project files provided as supporting information to the manuscript "A deep learning approach to the structural analysis of proteins"
Description
README file to the project files provided as supporting information to the manuscript “A deep learning approach to the structural analysis of proteins”
Dec. 30, 2018
Authors: Marco Giulini and Raffaello Potestio
==================================
The dataset contains the following files:
- datasets.zip: archive containing five .csv files, namely:
  - decoys_cm.csv: all the data for the 10728 protein decoys in the training set
  - evaluation_cm.csv: all the data for the 146 proteins in the evaluation set
  - random_CG.csv: 1200 Coulomb matrices, 100 CG models for each protein with 120 amino acids
  - 1e5g_centered_sphere.csv: 100 CG models in which the central atoms of 1e5g are not removed
  - 1e5g_random_sphere.csv: 100 CG models in total, 10 for each of 10 different (random) locations of the sphere that encloses the atoms that have to be retained
- decoys_labels.lab: labels associated with the 10728 decoys in the training set
- evaluation_labels.lab: labels associated with the 146 PDB files in the evaluation set
- random_CG_labels.lab: labels associated with the 6 proteins with 120 amino acids
- network_development_training: a Python script that performs cross-validation and full training of the model
- saved_networks.zip: zipped folder containing 10 networks; the architecture is stored in .json files, while the weight parameters are in .hs files
- pdb_files.zip: zipped folder containing the PDB files employed in the project, namely:
  - pdb_files_len100: PDB files with 100 amino acids
  - pdb_files_len101-110: PDB files with between 101 and 110 amino acids
  - decoys: decoys of length 100 extracted from the above folder, named PDBNAME_decoy_STARTRES_ENDRES.pdb
    Example: 6gsp.pdb gives rise to 6gsp_decoy_0_100.pdb, 6gsp_decoy_1_101.pdb, 6gsp_decoy_2_102.pdb, 6gsp_decoy_3_103.pdb, 6gsp_decoy_4_104.pdb
  - pdb_files_len120: 6 PDB files with 120 amino acids
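The decoy naming rule PDBNAME_decoy_STARTRES_ENDRES.pdb can be illustrated with a minimal Python sketch. It assumes that decoys are contiguous 100-residue windows shifted by one residue at a time, with a zero-based start index and an exclusive end index, which is what the 6gsp example suggests. The function name `decoy_names` and the chain length of 104 residues are illustrative assumptions inferred from the five example file names, not taken from the project script.

```python
def decoy_names(pdb_name, n_residues, window=100):
    """List the decoy file names for every contiguous window of `window` residues.

    Assumption: windows slide by one residue, STARTRES is zero-based and
    ENDRES = STARTRES + window, as in the 6gsp example of this README.
    """
    return [
        f"{pdb_name}_decoy_{start}_{start + window}.pdb"
        for start in range(n_residues - window + 1)
    ]


# A chain of 104 residues (assumed here for 6gsp) yields five decoys.
for name in decoy_names("6gsp", 104):
    print(name)
```

Under these assumptions, a chain of exactly 100 residues yields a single decoy, and each extra residue adds one more.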
Files (760.7 MB)
Checksum | Size |
---|---|
md5:63ccc5da9d9c63f2cea8b80e9bee9d32 | 225.9 MB |
md5:a8239ecde778d4a53440b88a9f77cb1b | 1.7 MB |
md5:0a76b791400b6357f2873c98cc1f85f5 | 38.3 kB |
md5:6816b6726b86c22e975f95d4a9d46ed2 | 3.2 MB |
md5:6b7c234e432645d978859aba8dcc11b0 | 41.9 kB |
md5:57879a9ba89383c4376802c034bc3bd6 | 7.8 kB |
md5:109c5179d7299a7eb96cad6299b39db5 | 476.0 MB |
md5:bad2ca671082d61dfc39b306dd3adec9 | 78.1 kB |
md5:a5e89ad018d0469a3acfc7ea8c93c136 | 53.7 MB |
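A downloaded file can be checked against the md5 values listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are illustrative and not part of the project files, and the listing above does not say which checksum belongs to which file, so the pairing has to be taken from the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for the larger archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file with a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For instance, `matches_checksum("some_download.zip", "md5:...")`, with a placeholder path and the checksum string copied from the listing, returns `True` only if the file is intact.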
Additional details
References
- M. Giulini and R. Potestio, A deep learning approach to the structural analysis of proteins, Interface Focus 9 (2019) http://doi.org/10.1098/rsfs.2019.0003