Published April 29, 2020 | Version v1
Dataset Open

Project files provided as supporting information to the manuscript "An information theory-based approach for optimal model reduction of biomolecules"

  • 1. University of Trento

Description

The dataset contains the following files:

  • adenylate.zip
  • antitrypsin.zip
  • tamapin.zip
  • analysis_notebooks.zip

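Each of the first three archives refers to one of the three proteins studied. For each number of CG sites N, the corresponding compressed folder contains the following files (the figures of the manuscript that use them are given in brackets):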

  • random mappings (random_mappings_${N}.txt)
  • random mapping entropies (random_smaps_${N}.txt) [fig1]
  • optimal mappings (lowest_mappings_${N}.txt) [fig3, fig4, figS2]
  • optimal mapping entropies (lowest_smaps_${N}.txt) [fig1]
  • pdb files with conservation probabilities in the beta factor column (${N}_probs.pdb) [fig4, figS2]
  • SASA values (${protein_name}_SASA_residues.xvg)
  • transition mapping entropies (${protein_name}_transition_smaps.txt) [fig2]
  • additional transition mapping entropies (${protein_name}_transition_smaps*) [figS3]
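
As a minimal sketch of how these naming patterns could be used, the Python snippet below builds the file names expected for one protein and one value of N, and reads the beta factor column of a pdb file. The function names `expected_files` and `read_bfactors` are hypothetical helpers, not part of the dataset. The sketch assumes that `${protein_name}` matches the archive name (for example `adenylate`) and that the pdb files follow the standard fixed-width layout, in which the beta factor occupies columns 61-66; neither assumption is stated in the description, and the layout of the folders inside each archive is not assumed at all.

```python
from pathlib import Path


def expected_files(protein_name, n_sites):
    """File names listed in the description for one protein and one N.

    Assumption: ${protein_name} is the archive name, e.g. "adenylate".
    The wildcard pattern for the additional transition entropies is
    left out because its exact names are not given.
    """
    return [
        f"random_mappings_{n_sites}.txt",
        f"random_smaps_{n_sites}.txt",
        f"lowest_mappings_{n_sites}.txt",
        f"lowest_smaps_{n_sites}.txt",
        f"{n_sites}_probs.pdb",
        f"{protein_name}_SASA_residues.xvg",
        f"{protein_name}_transition_smaps.txt",
    ]


def read_bfactors(pdb_path):
    """Return the beta factor of every ATOM/HETATM record in a pdb file.

    The standard PDB format stores this value in columns 61-66
    (1-based), which is the slice [60:66] of each record line.
    """
    values = []
    for line in Path(pdb_path).read_text().splitlines():
        if line.startswith(("ATOM", "HETATM")):
            values.append(float(line[60:66]))
    return values
```

Calling `expected_files("adenylate", 30)` with a hypothetical N of 30 gives the names to look for after unpacking adenylate.zip, and `read_bfactors` can then be pointed at the matching `${N}_probs.pdb` file to recover the conservation probabilities.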

The file analysis_notebooks.zip contains the Python 3 notebooks used to perform all the analyses presented in the paper:

  • paper_analysis_adenylate.ipynb
  • paper_analysis_antitrypsin.ipynb
  • paper_analysis_tamapin.ipynb

Packages required to run these Python 3 notebooks:

- numpy
- pandas
- matplotlib
- seaborn
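
Before opening the notebooks, it may help to confirm that the four packages are installed. The sketch below is one possible check using only the standard library; `missing_packages` is a hypothetical helper name, and no particular package versions are assumed because the description does not specify any.

```python
from importlib.util import find_spec

# Packages named in the dataset description.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```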
 

Files

Files (10.5 MB)

Checksum Size
md5:fa3b5c1c6d10133df269f9efe3d28624 2.7 MB
md5:057556636d7096dd4c038c13a0262229 2.6 MB
md5:0eb8b67a7e4c3eb1f361576211f44415 5.0 MB
md5:a2735196bb58a62641aad91a42762162 315.3 kB
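
The MD5 checksums listed above can be used to verify a download. The following is a minimal sketch, assuming the archives have been saved locally; `md5_of` and `is_listed` are hypothetical helper names. Because the listing does not show which checksum belongs to which archive, the check only tests whether a file's digest is one of the four listed values.

```python
import hashlib

# Checksums copied from the file listing; the pairing with
# individual archive names is not given there.
LISTED_MD5 = {
    "fa3b5c1c6d10133df269f9efe3d28624",
    "057556636d7096dd4c038c13a0262229",
    "0eb8b67a7e4c3eb1f361576211f44415",
    "a2735196bb58a62641aad91a42762162",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """True if the file's MD5 digest is one of the listed checksums."""
    return md5_of(path) in LISTED_MD5
```

For example, `is_listed("adenylate.zip")` should return True for an intact copy of that archive, under the assumption that the listed checksums cover all four files of the record.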

Additional details

Funding

VARIAMOLS – VAriable ResolutIon Algorithms for macroMOLecular Simulation 758588
European Commission