BioExcel Webinar #83: Restoring Protein Glycosylation with GlycoShape (2025-3-11)
Description
The introduction of machine learning (ML) for protein structure prediction [1,2] revolutionized structural biology, boosting our ability to source and resolve protein structures and broadening the potential for therapeutic discovery [3]. One limitation affecting all ML-derived structures is the lack of post-translational modifications [4], which are key to the correct folding, structural stability, and function of the underlying protein. Glycosylation is the most common post-translational modification of proteins, with an estimated 3 to 4% of the human genome dedicated exclusively to encoding glycosylation pathways [5]. Yet glycans remain largely ‘unseen’ due to their heterogeneity, complexity, and highly dynamic nature [6]. In this talk I will introduce GlycoShape [7] (https://glycoshape.org), a unique resource based on high-performance computing that allows users to rapidly and easily restore glycoproteins from ML predictions (AlphaFold/RoseTTAFold), as well as from the Protein Data Bank (www.rcsb.org), to their native, functional state by adding the missing glycan 3D information in seconds. Because of the robustness of its 3D database and of the algorithm, GlycoShape can also predict N-glycosylation site occupancy with 92% accuracy against all experimentally profiled glycoproteins in the AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/). This remarkable level of agreement with glycoproteomics data provides further evidence that the type of glycosylation and occupancy depend on site accessibility and on the complementarity of the glycan to the protein surface [8,9], revealing real potential for training upcoming ML algorithms, with enormous impact on scientific and therapeutic advances. To this end, I will provide some examples, ranging from pathogen infection to protein folding, to underscore the importance of rebuilding glycosylation to understand biomolecular structure and function in the life sciences.
- Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature (2021) doi:10.1038/s41586-021-03819-2.
- Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
- Arnold, C. AlphaFold touted as next big thing for drug discovery – but is it? Nature 622, 15–17 (2023).
- Bagdonas, H., Fogarty, C. A., Fadda, E. & Agirre, J. The case for post-predictional modifications in the AlphaFold Protein Structure Database. Nat. Struct. Mol. Biol. 1–2 (2021).
- Schjoldager, K. T., Narimatsu, Y., Joshi, H. J. & Clausen, H. Global view of human protein glycosylation pathways and functions. Nat. Rev. Mol. Cell Biol. 21, 729–749 (2020).
- Dance, A. Refining the toolkit for sugar analysis. Nature 599, 168–169 (2021).
- Thaysen-Andersen, M. & Packer, N. H. Site-specific glycoproteomics confirms that protein structure dictates formation of N-glycan type, core fucosylation and branching. Glycobiology 22, 1440–1452 (2012).
- Ives, C. M., Singh, O. et al. Nat. Methods 21, 2117–2127 (2024).
- Casalino, L. et al. Beyond Shielding: The Roles of Glycans in the SARS-CoV-2 Spike Protein. ACS Cent. Sci. 6, 1722–1734 (2020).
Files
talk_bioExcel_ef_final.pdf (135.3 MB, md5:92a5c9789170e00d07785790bdea29bf)