Geometric deep learning improves generalizability of MHC-bound peptide predictions
Description
Full dataset and trained models from the manuscript "Geometric deep learning improves generalizability of MHC-bound peptide predictions".
"outputs_and-BA_data.zip" contains the networks' outputs for each cross-validation experiment and a "full_dataset.csv" containing the initial BA data.
Note: this file was updated on 2024/11/26 because some of the previous CSV files were generated incorrectly. In the earlier version, the reported MLP and CNN outputs were both wrong; the updated CSV files report the correct values.
"trained_models.zip" contains all the trained models parameters
"propedia_ssl.zip" contains all the 3D models from propedia used to train the 3D-SSL
"pdb.zip" contains 3D models generated in PANDORA and used to train CNN, GNN and EGNN. It amounts to 145665 .pdb files, one for each human binding affinity entry from the initial dataset from O'Donnell et al. The list of entries used to actually train networks after filtering can be found in outputs_and-BA_data.zip", in the "full_dataset.csv" file.
Files
outputs_and_BA_data.zip
Additional details
Software
- Repository URL: https://github.com/DeepRank/3D-Vac
- Programming language: Python
References
- Marzella, D. et al. Improving generalizability and data efficiency for MHC-I binding peptide predictions through structure-based geometric deep learning. Preprint at https://doi.org/10.21203/rs.3.rs-3924124/v1 (2024).
- O'Donnell, T. J., Rubinsteyn, A. & Laserson, U. MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing. Cell Syst. 11, 42–48.e7 (2020).