Dynamics-evolution correspondence in protein structures
Authors/Creators
- 1. University of Tokyo, and RIKEN CBS, Japan
- 2. University of Tokyo, Japan
Description
Abstract: The genotype-phenotype mapping of proteins is a fundamental question in structural biology. In this Letter, by analyzing a large dataset of proteins from hundreds of protein families, we quantitatively demonstrate a correlation between noise-induced protein dynamics and mutation-induced variations of native structures, indicating a dynamics-evolution correspondence in proteins. We elucidate the origin of this correspondence by investigating the linear response of native proteins. The essential point, confirmed by the data, is that both noise-induced and mutation-induced deformations of proteins are restricted to a common low-dimensional subspace. These results suggest an evolutionary mechanism by which proteins gain both dynamical flexibility and evolutionary structural variability.
Related Publication:
- Qian-Yuan Tang, Kunihiko Kaneko. Dynamics-Evolution Correspondence of Protein Structures. Physical Review Letters, 127, 098103 (2021). https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.127.098103
File description:
- "Code-1p9sA.zip": An illustrative example for one group of proteins (query: 1p9sA). The results of this calculation appear in the figures of the main text and the Supplementary Materials of the paper. The Jupyter notebook "Notebook.ipynb" contains the entire workflow of the calculation. So that the notebook runs as provided, the original data (one PDB file per protein) for this group of structural homologs are included in the zip file. The same code can be applied to other groups of proteins.
- "Dali_files.zip": Contains all the DALI files used in this work. (DALI compares protein structures based on distance-matrix alignment.)
- "Ens_structhomologs.zip": Contains all the structural homologs used in this work. Each PDB file (e.g. "1p9sA_ens.pdb") represents the ensemble of a group of proteins that share the same query. The PDB files include only the coordinates of the alpha-carbon (Cα) atoms.
- "Database.xlsx": Lists, for every group of structural homologs in our database, the query (PDB code of the representative structure), the related data (chain length N, radius of gyration Rg, correlation lengths ξD and ξE, modularity Q, average Rayleigh quotient <λD>E, eigenvalue λ1D, and subspace overlap), and a detailed description of the group.
- "Notebook.html" and "Notebook.pdf": To see how the calculations are performed without running the code, read the HTML or PDF version of the Jupyter notebook.
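As a rough illustration of how a Cα-only ensemble file might be read, the following Python sketch parses fixed-column PDB `ATOM` records and computes an unweighted radius of gyration for each ensemble member. It is not taken from the notebook. It assumes that ensemble members are separated by `MODEL`/`ENDMDL` records, which is not confirmed here, and the function names and the synthetic two-model sample are hypothetical. The Rg definition used in "Database.xlsx" may differ.

```python
import math


def read_ca_models(pdb_text):
    """Return a list of models, each a list of (x, y, z) C-alpha coordinates.

    Assumes standard fixed-column PDB ATOM records and (an assumption)
    that ensemble members are delimited by ENDMDL records.
    """
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ATOM" and line[12:16].strip() == "CA":
            # Columns 31-38, 39-46, 47-54 hold x, y, z in the PDB format.
            current.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
        elif record == "ENDMDL":
            models.append(current)
            current = []
    if current:  # no MODEL/ENDMDL records: treat the file as one model
        models.append(current)
    return models


def radius_of_gyration(coords):
    """Unweighted radius of gyration (every C-alpha atom counted equally)."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def _atom_line(i, x, y, z):
    """Build a synthetic C-alpha ATOM record with correct column positions."""
    return f"ATOM  {i:5d}  CA  ALA A{i:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"


# Synthetic two-model sample (not real data from the archive).
sample = "\n".join([
    "MODEL        1",
    _atom_line(1, 0.0, 0.0, 0.0),
    _atom_line(2, 2.0, 0.0, 0.0),
    "ENDMDL",
    "MODEL        2",
    _atom_line(1, 0.0, 0.0, 0.0),
    _atom_line(2, 0.0, 4.0, 0.0),
    "ENDMDL",
])

models = read_ca_models(sample)
rg_values = [radius_of_gyration(m) for m in models]
```

To try this on a real file, one would pass the contents of an ensemble PDB file (for example, the text read from "1p9sA_ens.pdb") to `read_ca_models`, after checking how the ensemble members are actually delimited in that file.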
Files
Total size: 1.8 GB

| md5 checksum | Size |
|---|---|
| 35ebdf36cc77a5977f4368b0605a9b24 | 46.0 MB |
| 0f2c14e6e3ab096ccc4458d99d4c817d | 322.4 MB |
| 48708883e7504036e807ed6c872415ff | 51.6 kB |
| b16447baf6614beff73b78f1129d9ea4 | 1.5 GB |
| 8783007c8ae63321e4bcc9328132644f | 1.5 MB |
| 7e28a1eb8fee75b8d2550ba76aedb72c | 730.1 kB |
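The md5 checksums listed above can be used to check that a download is complete. The following is a minimal Python sketch using the standard-library `hashlib` module. The function names `md5_of_file` and `verify_md5` are hypothetical, and the file is read in chunks because some of the archives are large. The short self-check at the end uses a temporary file, not a file from this record.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Self-check on a small temporary file (not a file from this record).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name
try:
    demo_digest = md5_of_file(tmp_path)
    demo_ok = verify_md5(tmp_path, hashlib.md5(b"hello").hexdigest())
finally:
    os.remove(tmp_path)

# Usage with a downloaded archive (path and checksum are placeholders):
# verify_md5("path/to/downloaded_file", "<md5 value listed for that file>")
```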