Dataset Open Access

Project files provided as supporting information to the manuscript "An information theory-based approach for optimal model reduction of biomolecules"

Marco Giulini; Roberto Menichetti; Raffaello Potestio

The dataset contains the following files:

  • adenylate.zip
  • antitrypsin.zip
  • tamapin.zip
  • analysis_notebooks.zip

Each of the first three archives refers to one of three proteins. For each number of CG sites N, each of these compressed folders contains the following files:

  • random mappings (random_mappings_${N}.txt) 
  • random mapping entropies (random_smaps_${N}.txt) [fig1]
  • optimal mappings (lowest_mappings_${N}.txt) [fig3, fig4, figS2]
  • optimal mapping entropies (lowest_smaps_${N}.txt) [fig1]
  • PDB files with conservation probabilities in the B-factor column (${N}_probs.pdb) [fig4, figS2]
  • SASA values (${protein_name}_SASA_residues.xvg)
  • transition mapping entropies (${protein_name}_transition_smaps.txt) [fig2]
  • additional transition mapping entropies (${protein_name}_transition_smaps*) [figS3]
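
As an illustration of how the conservation probabilities could be read back from a ${N}_probs.pdb file, the following is a minimal sketch. It assumes the files follow the standard fixed-column PDB layout, in which the B-factor occupies columns 61-66 of each ATOM/HETATM record. The function names are hypothetical and are not taken from the provided notebooks.

```python
def read_bfactors(pdb_lines):
    """Return (residue name, residue number, B-factor) per ATOM/HETATM record.

    Assumes the standard fixed-column PDB layout: residue name in columns
    18-20, residue number in columns 23-26, B-factor in columns 61-66.
    """
    records = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            res_name = line[17:20].strip()
            res_seq = int(line[22:26])
            b_factor = float(line[60:66])
            records.append((res_name, res_seq, b_factor))
    return records


def read_bfactors_from_file(path):
    """Convenience wrapper (hypothetical) that reads the records from a file."""
    with open(path) as handle:
        return read_bfactors(handle)
```

A call such as read_bfactors_from_file("30_probs.pdb") would then return one tuple per atom record; the value 30 is only a placeholder for N.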

The file analysis_notebooks.zip contains the Python 3 notebooks used to perform all the analyses presented in the paper:

  • paper_analysis_adenylate.ipynb
  • paper_analysis_antitrypsin.ipynb
  • paper_analysis_tamapin.ipynb

Packages required to run these Python 3 notebooks:

- numpy
- pandas
- matplotlib
- seaborn
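
Before opening the notebooks, it may be convenient to confirm that the packages listed above are installed. The snippet below is a small sketch that only uses the standard library; the helper name missing_packages is hypothetical.

```python
import importlib.util

# Package names as listed in the dataset description.
REQUIRED = ("numpy", "pandas", "matplotlib", "seaborn")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", ", ".join(missing) if missing else "none")
```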
 

Files (10.5 MB)

Name                      Size       MD5
adenylate.zip             2.7 MB     fa3b5c1c6d10133df269f9efe3d28624
analysis_notebooks.zip    2.6 MB     057556636d7096dd4c038c13a0262229
antitrypsin.zip           5.0 MB     0eb8b67a7e4c3eb1f361576211f44415
tamapin.zip               315.3 kB   a2735196bb58a62641aad91a42762162
                    All versions   This version
Views               40             40
Downloads           17             17
Data volume         49.2 MB        49.2 MB
Unique views        39             39
Unique downloads    13             13
