Published August 27, 2024 | Version v3

An end-to-end framework for the prediction of protein structure and fitness from single sequence

Description

Info

Features, model parameters, and prediction results for the bioRxiv manuscript titled "An end-to-end framework for the prediction of protein structure and fitness from single sequence".

For source code and detailed instructions on usage, please refer to our GitHub.

To try the tools quickly, you can use our web servers for SPIRED-Fitness and SPIRED-Stab.

 

Model

Note: Only model parameters are available. For detailed instructions on usage, please refer to our GitHub.

SPIRED.pth is the model checkpoint of SPIRED. The model checkpoints of SPIRED-Fitness and SPIRED-Stab are in model.zip.

model.zip 

  • SPIRED-Fitness.pth
  • SPIRED-Stab.pth

 

SPIRED Predicted Structure

CAMEO.tar.gz

Inference results for CAMEO targets (August 2022 to August 2023), consisting of 680 protein chains with lengths ranging from 50 to 1,126 residues.

Includes predicted and label PDB files, plus tables of TM-score, lDDT, and RMSD values.

CASP15.tar.gz

Inference results for 45 target domains from the CASP15 competition.

Includes predicted and label PDB files, plus tables of TM-score, lDDT, and RMSD values.

SCOPe.tar.gz

Inference results for 1,231 folds (34,021 domains) from the SCOPe database (v2.08, S95, September 2021).

Includes predicted PDB files, plus tables of TM-score, lDDT, and RMSD values.

CATH.tar.gz

Inference results for 1,223 topologies (24,183 domains) from the CATH database (v4.2, S35, July 2017).

Includes predicted PDB files, plus tables of TM-score, lDDT, and RMSD values.
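As a rough starting point for working with the structure archives (CAMEO.tar.gz, CASP15.tar.gz, SCOPe.tar.gz, CATH.tar.gz), the sketch below unpacks a `.tar.gz` file and lists the PDB files it contains. The internal directory layout and the `.pdb` file extension are assumptions, not documented here, and the function names `extract_archive` and `list_pdb_files` are hypothetical helpers introduced for illustration.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir, refusing paths that escape it."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Check every member first so nothing is written if one path is unsafe.
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return dest


def list_pdb_files(root):
    """Return all *.pdb files under root (assumed extension), sorted by path."""
    return sorted(Path(root).rglob("*.pdb"))
```

For example, `list_pdb_files(extract_archive("CASP15.tar.gz", "CASP15"))` would return the extracted PDB paths, provided the archive stores them with a `.pdb` suffix.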

 

Fitness Prediction

fitness_data.zip

  • single_and_double_mutation_fitness_train_data.csv
  • single_mutation_fitness_pred.csv

 

Stability Prediction

stability_data.zip

  • ddG_S669_test_data.csv
  • ddG_S461_test_data.csv
  • ddG_train_data.csv
  • dTm_S557_test_data.csv
  • dTm_train_data.csv
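The column layout of the CSV files in fitness_data.zip and stability_data.zip is not described on this page. A minimal sketch for inspecting it without unpacking the archive is shown below; it assumes the files are UTF-8 encoded and that the first row of each CSV is a header, and `csv_headers_in_zip` is a hypothetical helper name.

```python
import csv
import io
import zipfile


def csv_headers_in_zip(zip_path):
    """Map each CSV member of a zip archive to its first row (assumed header)."""
    headers = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Read only the first row; an empty file yields an empty list.
                headers[name] = next(csv.reader(text), [])
    return headers
```

Calling `csv_headers_in_zip("stability_data.zip")` returns a dictionary keyed by member name, which can be printed to see the available columns before loading the full tables.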

 

SPIRED-Fitness Stage 2 Training/Validation Data Example

Here we provide the SPIRED-Fitness Stage 2 training/validation data and the parameters of the Stage 1 SPIRED-Fitness model (SPIRED.pth and Fitness.pth).

Note that, due to size constraints, only the structure data of the training/validation samples are provided. If you need the fitness data of the training/validation samples, please contact the authors. For the complete training code, please visit our GitHub repository.

SPIRED-Fitness_Stage2_training_data.zip

 

SPIRED-Stab Stage 1 features

wt_data_for_value_function.pt

 

FoldX Predicted Structure

cDNA_dataset_FoldX_structure.zip

Mutant structures (PDB files) generated by FoldX for each entry in the cDNA proteolysis dataset.

Files

Files (13.4 GB)

MD5 checksum                          Size
md5:54146aa896aea7251b6d2503a49f4548  348.1 MB
md5:9518e3a962af4d2f73cf6338b4bd7f1b  18.5 MB
md5:d633b707065a51427c007babe750248d  1.2 GB
md5:6950eaa7467c60ca6457153bec3d2ad2  4.3 GB
md5:fdb39c4f060f530296cdf6174e2b66c3  18.4 MB
md5:f59d901a5a58a24767c73a048c83c61c  946.3 MB
md5:266b8174e6e2486f69d1a2c41725000c  2.0 GB
md5:b05adf46783851311e2abe5265c4a418  3.7 GB
md5:d16fc871194c82609ab1cd4f906ce6a3  501.9 MB
md5:fa2b7a9ae09aeda5c369dba19f2a29f3  644.9 kB
md5:a73d9de864c2b094814205a042872e57  431.0 MB
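
Because several of these files are multiple gigabytes, it may be worth checking a download against its listed MD5 checksum before use. The sketch below uses only the Python standard library; `md5_of_file` and `matches_listed_md5` are hypothetical helper names. The file names are not shown next to the checksums here, so which checksum belongs to which file has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listed_md5(local_path, "md5:...")`, with the checksum copied from the entry for that file, returns `True` only when the download is intact.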