Published January 3, 2023 | Version v1
Journal article Open

De novo design of site-specific protein interactions with learned surface fingerprints

  • 1. Ecole polytechnique fédérale de Lausanne (EPFL)

Description

Physical interactions between proteins are essential for most biological processes governing life. However, the molecular determinants of such interactions have been challenging to understand, even as genomic, proteomic, and structural data grow. This knowledge gap has been a major obstacle for the comprehensive understanding of cellular protein-protein interaction (PPI) networks and for the de novo design of protein binders that are crucial for synthetic biology and translational applications. We exploit a geometric deep learning framework operating on protein surfaces that generates fingerprints to describe geometric and chemical features critical to drive PPIs. We hypothesized that these fingerprints capture the key aspects of molecular recognition that represent a new paradigm in the computational design of novel protein interactions. As a proof-of-principle, we computationally designed several de novo protein binders to engage four protein targets: SARS-CoV-2 spike, PD-1, PD-L1, and CTLA-4. Several designs were experimentally optimized while others were purely generated in silico, reaching nanomolar affinity with structural and mutational characterization showing highly accurate predictions. Overall, our surface-centric approach captures the physical and chemical determinants of molecular recognition, enabling a novel approach for the de novo design of protein interactions and, more broadly, of artificial proteins with function.

Notes

Scaffold database used for grafting the seed generated by MaSIF-seed (see publication). The scaffold database is composed of globular structures ranging from 30 to 100 amino acids, originating from the following sources (see References):

  • 301 structures from the AlphaFold2 proteome prediction database
  • 2049 structures from the Protein Data Bank (PDB)
  • 1997 structures from 2 publications introducing "miniprotein" designs

Files

Files (64.2 MB)

Checksum: md5:0843fce418a86948c182b421f6ad9bcc
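Because the record lists an MD5 checksum for the 64.2 MB file, a downloaded copy can be checked against it before use. The following is a minimal sketch, not part of the original record: the helper names `md5_of_file` and `matches_record` are hypothetical, and the file path is left to the caller because the file name is not shown in this listing.

```python
import hashlib
from pathlib import Path

# MD5 digest listed on the record page for the 64.2 MB file.
EXPECTED_MD5 = "0843fce418a86948c182b421f6ad9bcc"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record` with the local path of the downloaded file returns `True` only if the download is byte-identical to the file described by the listed checksum. Reading in chunks keeps memory use small regardless of file size.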

Additional details

References

  • Varadi, M. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).
  • Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
  • Rocklin, G. J. et al. Global analysis of protein folding using massively parallel design, synthesis, and testing. Science 357, 168–175 (2017).
  • Bhardwaj, G. et al. Accurate de novo design of hyperstable constrained peptides. Nature 538, 329–335 (2016).