An end-to-end framework for the prediction of protein structure and fitness from single sequence
Authors/Creators
- 1. Tsinghua University
Description
Info
Features, model parameters, and prediction results for the bioRxiv manuscript titled "An end-to-end framework for the prediction of protein structure and fitness from single sequence".
For source code and detailed instructions on usage, please refer to our GitHub.
To try the tools quickly, you can use our web server for SPIRED-Fitness and SPIRED-Stab.
Model
Note: Only model parameters are available. For detailed instructions on usage, please refer to our GitHub.
SPIRED.pth is the model checkpoint of SPIRED. The model checkpoints of SPIRED-Fitness and SPIRED-Stab are in model.zip.
model.zip
- SPIRED-Fitness.pth
- SPIRED-Stab.pth
SPIRED Predicted Structure
CAMEO.tar.gz
Inference results for CAMEO targets (August 2022 to August 2023), consisting of 680 protein chains with lengths ranging from 50 to 1,126 residues.
Includes predicted and label PDB files, plus tables of TM-score, lDDT, and RMSD values.
CASP15.tar.gz
Inference results for 45 target domains from the CASP15 competition.
Includes predicted and label PDB files, plus tables of TM-score, lDDT, and RMSD values.
SCOPe.tar.gz
Inference results for 1,231 folds (34,021 domains) from the SCOPe database (v2.08, S95, September 2021).
Includes predicted PDB files, plus tables of TM-score, lDDT, and RMSD values.
CATH.tar.gz
Inference results for 1,223 topologies (24,183 domains) from the CATH database (v4.2, S35, July 2017).
Includes predicted PDB files, plus tables of TM-score, lDDT, and RMSD values.
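The structure archives above are gzip-compressed tarballs, so their contents can be inspected without unpacking them fully. The following is a minimal sketch using only the Python standard library; the function name `list_pdb_members` is hypothetical, and the internal directory layout of each archive is not documented here, so the code only assumes that structure files carry a `.pdb` extension.

```python
import tarfile


def list_pdb_members(archive_path):
    """Return the sorted names of regular files ending in .pdb in a .tar.gz archive.

    Assumption: structure files use a .pdb extension (matched case-insensitively).
    The folder structure inside the archive is not assumed.
    """
    with tarfile.open(archive_path, mode="r:gz") as tar:
        return sorted(
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.lower().endswith(".pdb")
        )
```

For example, `len(list_pdb_members("CASP15.tar.gz"))` would give the number of PDB files in the CASP15 archive once it has been downloaded.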
Fitness Prediction
fitness_data.zip
- single_and_double_mutation_fitness_train_data.csv
- single_mutation_fitness_pred.csv
Stability Prediction
stability_data.zip
- ddG_S669_test_data.csv
- ddG_S461_test_data.csv
- ddG_train_data.csv
- dTm_S557_test_data.csv
- dTm_train_data.csv
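The CSV files listed above can be previewed directly from the downloaded zip archives. The sketch below is a hedged example, not part of the authors' code: `peek_csv_in_zip` is a hypothetical helper, and it assumes the CSV files are UTF-8 encoded with a header in the first row. The column names are not documented here, so the code returns whatever header it finds.

```python
import csv
import io
import zipfile


def peek_csv_in_zip(zip_path, member_suffix, n_rows=5):
    """Return (header, rows) for the first zip member whose name ends with member_suffix.

    Assumptions: the member is a UTF-8 CSV file and its first row is a header.
    Matching by suffix avoids guessing the folder structure inside the archive.
    """
    with zipfile.ZipFile(zip_path) as archive:
        matches = [name for name in archive.namelist() if name.endswith(member_suffix)]
        if not matches:
            raise FileNotFoundError(f"no member ending in {member_suffix!r} in {zip_path}")
        with archive.open(matches[0]) as raw:
            reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
            header = next(reader, [])
            # Pair the reader with a bounded range so at most n_rows rows are read.
            rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `peek_csv_in_zip("stability_data.zip", "ddG_S669_test_data.csv")` would return the header and the first five rows of the S669 test set.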
SPIRED-Fitness Stage 2 Training/Validation Data Example
Here, we provide SPIRED-Fitness Stage 2 training/validation data and the Stage 1 SPIRED-Fitness parameters (SPIRED.pth and Fitness.pth).
Note that due to size constraints, only the structure data of the training/validation samples are provided. If you need the fitness data of the training/validation samples, please contact the authors. For the complete training code, please visit our GitHub repository.
SPIRED-Fitness_Stage2_training_data.zip
SPIRED-Stab Stage 1 features
wt_data_for_value_function.pt
FoldX Predicted Structure
cDNA_dataset_FoldX_structure.zip
Mutant structures (PDB files) generated by FoldX for each entry in the cDNA proteolysis dataset.
Files (13.4 GB)
| MD5 checksum | Size |
|---|---|
| 54146aa896aea7251b6d2503a49f4548 | 348.1 MB |
| 9518e3a962af4d2f73cf6338b4bd7f1b | 18.5 MB |
| d633b707065a51427c007babe750248d | 1.2 GB |
| 6950eaa7467c60ca6457153bec3d2ad2 | 4.3 GB |
| fdb39c4f060f530296cdf6174e2b66c3 | 18.4 MB |
| f59d901a5a58a24767c73a048c83c61c | 946.3 MB |
| 266b8174e6e2486f69d1a2c41725000c | 2.0 GB |
| b05adf46783851311e2abe5265c4a418 | 3.7 GB |
| d16fc871194c82609ab1cd4f906ce6a3 | 501.9 MB |
| fa2b7a9ae09aeda5c369dba19f2a29f3 | 644.9 kB |
| a73d9de864c2b094814205a042872e57 | 431.0 MB |
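Because several of these files are multiple gigabytes, it is worth checking a download against its listed MD5 checksum before use. A minimal sketch with the Python standard library follows; the helper names `md5_of_file` and `matches_md5` are hypothetical, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Return True if the file's MD5 digest equals the expected checksum.

    The expected value may be given with or without a leading "md5:" prefix.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

A call such as `matches_md5(downloaded_path, "md5:fa2b7a9ae09aeda5c369dba19f2a29f3")` returns `True` only if the downloaded file is byte-for-byte identical to the one with that checksum.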
Additional details
References
- Chen, Y., Xu, Y., Liu, D. et al. An end-to-end framework for the prediction of protein structure and fitness from single sequence. Nat Commun 15, 7400 (2024). https://doi.org/10.1038/s41467-024-51776-x