Published July 15, 2021 | Version v2
Journal article Open

Dynamics-evolution correspondence in protein structures

  • 1. University of Tokyo, and RIKEN CBS, Japan
  • 2. University of Tokyo, Japan

Description

Abstract: The genotype-phenotype mapping of proteins is a fundamental question in structural biology. In this Letter, by analyzing a large dataset of proteins from hundreds of protein families, we quantitatively demonstrate correlations between the noise-induced dynamics of proteins and the mutation-induced variations of their native structures, indicating a dynamics-evolution correspondence in proteins. The origin of this correspondence is elucidated by investigating the linear responses of native proteins. The essential point, confirmed by the data, is that both noise-induced and mutation-induced deformations of the proteins are restricted to a common low-dimensional subspace. These results suggest an evolutionary mechanism by which proteins gain both dynamical flexibility and evolutionary structural variability.
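To make the idea of a "common low-dimensional subspace" concrete, here is a minimal sketch in Python with NumPy. It is not the authors' calculation: it uses the root-mean-square inner product (RMSIP), a standard overlap measure chosen here as an assumption, and purely synthetic deformation vectors. The names `leading_modes`, `rmsip`, `dynamics_like`, `evolution_like`, and `unrelated` are hypothetical.

```python
import numpy as np

def leading_modes(deformations, k):
    """Return the k leading principal directions (as rows) of a set of deformation vectors."""
    centered = deformations - deformations.mean(axis=0)
    # Rows of vt are orthonormal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def rmsip(modes_a, modes_b):
    """Root-mean-square inner product of two sets of orthonormal modes (rows)."""
    inner = modes_a @ modes_b.T
    return float(np.sqrt((inner ** 2).sum() / modes_a.shape[0]))

# Synthetic stand-ins only: two ensembles confined (up to small noise) to the
# same k-dimensional subspace, plus one unrelated ensemble for comparison.
rng = np.random.default_rng(0)
dim, k, n_samples = 30, 3, 200
basis, _ = np.linalg.qr(rng.normal(size=(dim, k)))  # orthonormal columns

def synthetic(n):
    """Samples lying in span(basis) with a small isotropic perturbation."""
    return rng.normal(size=(n, k)) @ basis.T + 0.01 * rng.normal(size=(n, dim))

dynamics_like = synthetic(n_samples)
evolution_like = synthetic(n_samples)
unrelated = rng.normal(size=(n_samples, dim))

shared_overlap = rmsip(leading_modes(dynamics_like, k), leading_modes(evolution_like, k))
random_overlap = rmsip(leading_modes(dynamics_like, k), leading_modes(unrelated, k))
print(f"shared subspace: {shared_overlap:.3f}, unrelated: {random_overlap:.3f}")
```

An RMSIP close to 1 means the two sets of leading modes span nearly the same subspace, while unrelated ensembles give a much smaller value. The subspace overlap reported in the dataset may be defined differently.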

Related Publication:

File description: 

  • "Code-1p9sA.zip": As an illustration, this archive covers one group of proteins (query: 1p9sA). The results calculated for this group are shown in the figures of the Main Text and the Supplementary Materials of the paper. The Jupyter notebook "Notebook.ipynb" contains the entire workflow of the calculation. So that the notebook runs properly, the original data for this group of structural homologs (one PDB file per protein) are also included in the zip file. The code can be applied to other groups of proteins in the same way.
  • "Dali_files.zip": Contains all the DALI files used in this work. (DALI compares protein structures based on distance-matrix alignment.)
  • "Ens_structhomologs.zip": Contains all the structural homologs used in this work. Each PDB file (e.g. "1p9sA_ens.pdb") represents the ensemble of a group of proteins with the same query. The PDB files include only the coordinates of the alpha-carbon (Cα) atoms.
  • "Database.xlsx": Lists, for each group of proteins in our database, the query (PDB code of the representative structure), the related data (including chain length N, radius of gyration Rg, correlation lengths ξD and ξE, modularity Q, the average Rayleigh quotient <λD>E, eigenvalue λ1D, and the subspace overlap), and a detailed description of the group of structural homologs.
  • "Notebook.html" and "Notebook.pdf": HTML and PDF versions of the Jupyter notebook, for readers who only want to see how the calculations are performed without running the code.
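As a rough illustration of working with Cα-only PDB data such as the files described above, the following Python sketch reads Cα coordinates from PDB-format text and computes a radius of gyration. It is not taken from the notebook. How the members of an ensemble are delimited inside the "*_ens.pdb" files is not stated here, so the sketch assumes standard MODEL/ENDMDL records and otherwise treats the text as a single structure. The names `read_ca_models`, `radius_of_gyration`, and `format_ca_line` are hypothetical, and Rg is computed with equal weights per atom, which may differ from the convention used in "Database.xlsx".

```python
import numpy as np

def read_ca_models(pdb_text):
    """Parse CA coordinates from PDB text; returns a list of (N, 3) arrays, one per model."""
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        # Fixed PDB columns: atom name 13-16, x 31-38, y 39-46, z 47-54 (1-based).
        if record == "ATOM" and line[12:16].strip() == "CA":
            current.append([float(line[30:38]), float(line[38:46]), float(line[46:54])])
        elif record == "ENDMDL" and current:
            models.append(np.array(current))
            current = []
    if current:  # text without MODEL/ENDMDL records: treat as one structure
        models.append(np.array(current))
    return models

def radius_of_gyration(coords):
    """Equal-weight radius of gyration of an (N, 3) coordinate array."""
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

def format_ca_line(serial, resseq, x, y, z):
    """Build a fixed-column ATOM record for a CA atom (demo helper, glycine on chain A)."""
    return f"ATOM  {serial:5d}  CA  GLY A{resseq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"

# Tiny made-up ensemble: two models of two CA atoms each (not real protein data).
demo_text = "\n".join([
    "MODEL        1",
    format_ca_line(1, 1, -1.0, 0.0, 0.0),
    format_ca_line(2, 2, 1.0, 0.0, 0.0),
    "ENDMDL",
    "MODEL        2",
    format_ca_line(1, 1, 0.0, -2.0, 0.0),
    format_ca_line(2, 2, 0.0, 2.0, 0.0),
    "ENDMDL",
])
models = read_ca_models(demo_text)
rg_values = [radius_of_gyration(m) for m in models]
```

For a real file, the text could be loaded with `open(path).read()` and passed to `read_ca_models`; the resulting arrays can then be checked against the N and Rg columns of the database.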

Files

Files (1.8 GB)

Checksum (Size)

  • md5:35ebdf36cc77a5977f4368b0605a9b24 (46.0 MB)
  • md5:0f2c14e6e3ab096ccc4458d99d4c817d (322.4 MB)
  • md5:48708883e7504036e807ed6c872415ff (51.6 kB)
  • md5:b16447baf6614beff73b78f1129d9ea4 (1.5 GB)
  • md5:8783007c8ae63321e4bcc9328132644f (1.5 MB)
  • md5:7e28a1eb8fee75b8d2550ba76aedb72c (730.1 kB)
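
The MD5 checksums listed above can be used to confirm that a download is intact. Below is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` are hypothetical, and the demonstration runs on a throwaway temporary file rather than on any file from this record, since the listing does not state which checksum belongs to which file.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_listing(path, listed):
    """Compare a file's MD5 with a listed value written as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected

# Demonstration on a temporary file containing the bytes b"hello".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_ok = matches_listing(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
os.remove(demo_path)
```

For a downloaded file, call `matches_listing(local_path, "md5:...")` with the checksum shown for that file on the record page.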