Published February 22, 2018 | Version 1.0 | Dataset | Open
A large data-set of CASP protein refinement simulations for machine-learning
- The Francis Crick Institute
Description
The uploaded trajectory data come from our laboratory's refinement method as applied in CASP11 and CASP12, for those targets whose reference crystal structure is available in the PDB. In total, the data comprise 904 trajectories from 42 different protein systems, with 3419 ns of cumulative simulation time and 1,709,704 snapshots saved at an interval of Δt = 2 ps.
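As a quick consistency check on the figures above, the snapshot count and the 2 ps spacing can be multiplied out. This is only a sketch; it assumes every snapshot accounts for exactly one 2 ps interval, which the description does not state explicitly.

```python
# Figures quoted in the description.
snapshots = 1_709_704
dt_ps = 2            # spacing between snapshots, in picoseconds
trajectories = 904

# Assumption: each snapshot represents one dt_ps interval.
total_ns = snapshots * dt_ps / 1000  # 1 ns = 1000 ps

print(round(total_ns))                     # 3419
print(round(total_ns / trajectories, 2))   # 3.78 (mean ns per trajectory)
```

The product agrees with the stated 3419 ns of cumulative simulation time.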
File Overview
- trajectory_data_pdbs.tar.gz: contains the PDB files of the individual trajectories, as well as the starting model and the reference crystal structure for each target
- casp_normalized_all_data_final.csv.gz: contains the features calculated for each snapshot from the trajectory PDBs
- cv_folds.csv: contains the 7-fold cross-validation assignment used to assess the performance of the model
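The fold assignment and the feature table could be combined roughly as follows. This is a minimal sketch using only the Python standard library. The column layout of both CSV files is not documented here, so the column names `id` and `fold` are placeholders, as is the assumption that the fold file maps one identifier (for example a target or a trajectory) to a fold index.

```python
import csv
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed CSV file in text mode."""
    opener = gzip.open if str(path).endswith(".gz") else open
    return opener(path, "rt", newline="")


def load_fold_assignment(handle, key_col="id", fold_col="fold"):
    """Map each identifier to its fold index (column names are hypothetical)."""
    return {row[key_col]: int(row[fold_col]) for row in csv.DictReader(handle)}


def split_rows(handle, fold_of, test_fold, key_col="id"):
    """Split feature rows into (train, test) according to the fold of their key."""
    train, test = [], []
    for row in csv.DictReader(handle):
        (test if fold_of[row[key_col]] == test_fold else train).append(row)
    return train, test


# Tiny in-memory stand-ins for cv_folds.csv and the feature table.
folds_csv = io.StringIO("id,fold\nA,0\nB,1\n")
features_csv = io.StringIO("id,x\nA,0.1\nB,0.2\nA,0.3\n")

fold_of = load_fold_assignment(folds_csv)
train, test = split_rows(features_csv, fold_of, test_fold=0)
print(len(train), len(test))  # 1 2
```

For the real files, the handles would come from `open_text("cv_folds.csv")` and `open_text("casp_normalized_all_data_final.csv.gz")`, with the column names adjusted to the actual headers. With about 1.7 million snapshot rows, collecting rows into lists may use a lot of memory, so processing them one at a time may be preferable.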
Files (34.0 GB)
| Name | Size | MD5 |
|---|---|---|
| casp_normalized_all_data_final.csv.gz | 654.1 MB | fc06ce114a9d639effdeefde9483a023 |
| cv_folds.csv | 24.0 kB | 743aeee87eb1ac442c041df7315dbde8 |
| trajectory_data_pdbs.tar.gz | 33.4 GB | 7702f0b21f54d98fa9f478336d7a37b6 |
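Because the largest archive is over 30 GB, it may be worth checking a download against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_stream`, `md5_of_file` and `is_listed` are illustrative and not part of the dataset.

```python
import hashlib

# MD5 checksums as listed on this record.
LISTED_MD5 = {
    "fc06ce114a9d639effdeefde9483a023",
    "743aeee87eb1ac442c041df7315dbde8",
    "7702f0b21f54d98fa9f478336d7a37b6",
}


def md5_of_stream(handle, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Return the hex MD5 digest of the file at `path`."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def is_listed(path):
    """True if the file's MD5 matches one of the checksums listed above."""
    return md5_of_file(path) in LISTED_MD5
```

For example, `is_listed("cv_folds.csv")` should return `True` for an intact download. Reading in chunks keeps memory use constant even for the largest file.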
Additional details
Funding
- Wellcome Trust: FC001003