There is a newer version of the record available.

Published November 26, 2024 | Version v3
Dataset Open

Geometric deep learning improves generalizability of MHC-bound peptide predictions

  • 1. Radboud University Medical Center
  • 2. Netherlands eScience Center
  • 1. University of Amsterdam
  • 2. Radboud University Medical Center

Description

Full dataset and trained models from the manuscript "Geometric deep learning improves generalizability of MHC-bound peptide predictions".

"outputs_and_BA_data.zip" contains the networks' outputs for each cross-validation experiment, plus "full_dataset.csv", which holds the initial binding affinity (BA) data.
Note: this file was updated on 2024/11/26 because some of the previous CSVs were generated incorrectly. In the earlier version, the reported MLP and CNN outputs were both wrong; the updated CSVs contain the correct values.

"trained_models.zip" contains the parameters of all trained models.

"propedia_ssl.zip" contains all the 3D models from Propedia used to train the 3D-SSL.

"pdb.zip" contains the 3D models generated with PANDORA and used to train the CNN, GNN and EGNN. It holds 145665 .pdb files, one for each human binding affinity entry in the initial dataset from O'Donnell et al. The entries actually used to train the networks after filtering are listed in "full_dataset.csv", inside "outputs_and_BA_data.zip".
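The archives described above can be inspected without unpacking them. The following is a minimal Python sketch, not part of the published code: the helper names are hypothetical, the internal folder layout of each archive is not documented here (so members are located by suffix), and the CSV is assumed to be comma-separated and UTF-8 encoded.

```python
import csv
import io
import zipfile


def find_members(archive_path, suffix):
    """Return the names of all archive members whose path ends with `suffix`."""
    with zipfile.ZipFile(archive_path) as archive:
        return [name for name in archive.namelist() if name.endswith(suffix)]


def count_members(archive_path, suffix=".pdb"):
    """Count archive members with the given suffix, e.g. the .pdb files."""
    return len(find_members(archive_path, suffix))


def read_csv_header(archive_path, member):
    """Read only the first row of a CSV stored inside the archive.

    Assumption: the file is comma-separated and UTF-8 encoded.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return next(csv.reader(text))
```

For example, `find_members("outputs_and_BA_data.zip", "full_dataset.csv")` should locate the BA table wherever it sits in the archive, and `count_members("pdb.zip")` can be compared against the 145665 files stated above.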

 

Files

outputs_and_BA_data.zip

Files (4.8 GB)

Checksum Size
md5:78c3a0c00b5e55bc9df257581d0a955d 8.2 MB
md5:33f00da419c27ce4f930990d3c25b64c 4.3 GB
md5:a8ae74dea860cddce707e20b4ba762df 110.3 MB
md5:248b5bd7a6b92b48c1e59cb1a44e8527 447.3 MB
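The MD5 checksums listed above can be used to confirm that a download completed intact. The sketch below is one way to do this in Python with the standard library; the function names are hypothetical. Because the listing does not show which checksum belongs to which file, the check only tests whether a file's digest matches any of the four listed values.

```python
import hashlib

# Checksums copied from the file listing of this record.
LISTED_MD5 = {
    "78c3a0c00b5e55bc9df257581d0a955d",
    "33f00da419c27ce4f930990d3c25b64c",
    "a8ae74dea860cddce707e20b4ba762df",
    "248b5bd7a6b92b48c1e59cb1a44e8527",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path, listed=LISTED_MD5):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return md5_of_file(path) in listed
```

A call such as `is_listed("pdb.zip")` returning `False` suggests a truncated or corrupted download, or a file from a different version of the record.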

Additional details

Software

Repository URL
https://github.com/DeepRank/3D-Vac
Programming language
Python

References

  • Marzella, D. et al. Improving generalizability and data efficiency for MHC-I binding peptide predictions through structure-based geometric deep learning. Preprint at https://doi.org/10.21203/rs.3.rs-3924124/v1 (2024).
  • O'Donnell, T. J., Rubinsteyn, A. & Laserson, U. MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing. Cell Syst. 11, 42–48.e7 (2020).