Published April 3, 2024 | Version v1
Dataset Open

Water will find its way: transport through narrow tunnels in hydrolases (water-protein interaction analysis)

  • 1. Adam Mickiewicz University in Poznań
  • 2. International Institute of Molecular and Cell Biology

Description

Repository containing data from the analyses of water-protein interactions by MM/GBSA.

hal_mmgbsa_data.tar.gz - contains the minimal set of files needed to recreate the MM/GBSA calculations, along with the raw calculation results for Hal.
epx_mmgbsa_data.tar.gz - contains the minimal set of files needed to recreate the MM/GBSA calculations, along with the raw calculation results for Epx.
lip_mmgbsa_data.tar.gz - contains the minimal set of files needed to recreate the MM/GBSA calculations, along with the raw calculation results for Lip.

Each archive has the following structure:
    {supercluster_id}:  - supercluster directory
        {event_nr}_{rounded_radius}:  - transport event directory
            renum.pdb  - PDB file containing the protein, the water of interest, and a sphere of surrounding waters with an 8 Angstrom radius
            water.txt  - file giving the ID of the water of interest in the renum.pdb file
            binding_[implicit|explicit].dat  - files containing MM/GBSA results for the fully implicit and partially explicit systems
            decomposition_[implicit|explicit].dat  - files containing MM/GBSA decomposition results for the fully implicit and partially explicit systems
    info_{supercluster_id}.csv  - file describing the transport events in the supercluster (columns: event_name, radius, frame, simulation, water_id)
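
The layout above can be walked programmatically, for example to check that every event directory is complete before rerunning anything. The following is a minimal sketch, not part of the deposited scripts. It assumes that `binding_[implicit|explicit].dat` stands for two separate files per event and that the rounded radius is everything after the last underscore in the directory name; the function names are hypothetical.

```python
from pathlib import Path

# ASSUMPTION: the "[implicit|explicit]" notation denotes two files per event.
EXPECTED_FILES = (
    "renum.pdb",
    "water.txt",
    "binding_implicit.dat",
    "binding_explicit.dat",
    "decomposition_implicit.dat",
    "decomposition_explicit.dat",
)


def parse_event_dir(name):
    """Split '{event_nr}_{rounded_radius}' at the last underscore."""
    event_nr, _, rounded_radius = name.rpartition("_")
    return event_nr, rounded_radius


def list_events(archive_root):
    """Yield (supercluster_id, event_nr, rounded_radius, path) per event directory."""
    root = Path(archive_root)
    # info_{supercluster_id}.csv files sit next to the supercluster directories
    # and are skipped here because they are not directories.
    for supercluster in sorted(p for p in root.iterdir() if p.is_dir()):
        for event in sorted(p for p in supercluster.iterdir() if p.is_dir()):
            event_nr, rounded_radius = parse_event_dir(event.name)
            yield supercluster.name, event_nr, rounded_radius, event


def missing_files(event_dir):
    """Return the expected file names that are absent from one event directory."""
    event_dir = Path(event_dir)
    return [name for name in EXPECTED_FILES if not (event_dir / name).is_file()]
```

Calling `list_events` on the directory of an unpacked archive and passing each returned path to `missing_files` gives a quick completeness report.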

results.tar.gz - contains the CSV files generated by the collect_data_minimal.py script for each protein, for both the fully implicit and the partially explicit solvent. The columns in those CSV files are: event_name, supercluster, radius, frame, simulation, water_id, delta_total, delta_van_der_waals, delta_electrostatic, delta_G_solv
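
As an illustration of how these CSV files might be consumed, the sketch below reads one of them and averages delta_total per supercluster. It assumes comma-separated values in the column order listed above; whether the files carry a header row is not stated, so any row whose first field is `event_name` is skipped. The function names are hypothetical and not part of the deposited scripts.

```python
import csv
from collections import defaultdict
from statistics import mean

# Column order as listed in the description (assumed to match the file layout).
COLUMNS = [
    "event_name", "supercluster", "radius", "frame", "simulation", "water_id",
    "delta_total", "delta_van_der_waals", "delta_electrostatic", "delta_G_solv",
]


def read_results(handle):
    """Read result rows from an open text file into a list of dicts."""
    rows = []
    for fields in csv.reader(handle):
        fields = [field.strip() for field in fields]
        # Skip blank lines and header rows, if any are present.
        if not fields or fields[0] == COLUMNS[0]:
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def mean_delta_total_by_supercluster(rows):
    """Average the delta_total column for each supercluster."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["supercluster"]].append(float(row["delta_total"]))
    return {supercluster: mean(values) for supercluster, values in grouped.items()}
```

A file would be opened with `open(path, newline="")` and passed to `read_results`; the same rows can then be grouped by any other column.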

scripts.tar.gz - contains the following scripts:
    read_transport_events.py
    prep_inputs.py
    run_MMGBSA_single.sh
    collect_data_minimal.py
    plot_bin.py

To recreate the MM/GBSA calculations with the provided inputs, run the following steps:
$ python read_transport_events.py [Hal.dat|Epx.dat|Lip.dat] > selection.txt
Prepares the transport event information in a compatible format. Binary transport event databases are available in the repository.

$ python prep_inputs.py selection.txt - {absolute output path} {supercluster_id} 1
Prepares the parm and coordinate files for each supercluster ID included in the computation. The last argument, 1, tells the script to resume preparation from the renum.pdb files.

$ bash run_MMGBSA_single.sh {residue_count}
Runs MM/GBSA for both the partially explicit and the fully implicit systems. It needs to be run for each event in each supercluster.
residue_count should be set to 293 for Hal, 319 for Epx, and 534 for Lip.
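
Because the script has to be run once per event, a small driver can save some typing. The sketch below is not part of the deposited scripts. It assumes that run_MMGBSA_single.sh is meant to be launched from inside each event directory, which the description does not state; the function names and the `script` parameter are hypothetical.

```python
import subprocess
from pathlib import Path

# Residue counts quoted in the description for each protein.
RESIDUE_COUNT = {"hal": 293, "epx": 319, "lip": 534}


def mmgbsa_command(protein, script="run_MMGBSA_single.sh"):
    """Return the argument list that runs the MM/GBSA script for one protein."""
    return ["bash", str(script), str(RESIDUE_COUNT[protein.lower()])]


def plan_runs(protein_dir, protein, script="run_MMGBSA_single.sh"):
    """Pair every event directory under protein_dir with the command to run."""
    command = mmgbsa_command(protein, script)
    root = Path(protein_dir)
    return [
        (event, command)
        for supercluster in sorted(p for p in root.iterdir() if p.is_dir())
        for event in sorted(p for p in supercluster.iterdir() if p.is_dir())
    ]


def run_all(protein_dir, protein, script="run_MMGBSA_single.sh"):
    """Run the script in each event directory, stopping at the first failure."""
    # ASSUMPTION: the script expects the event directory as working directory.
    script = Path(script).resolve()
    for event, command in plan_runs(protein_dir, protein, script):
        subprocess.run(command, cwd=event, check=True)
```

Inspecting the output of `plan_runs` first is a cheap way to confirm which directories would be processed before `run_all` starts any calculation.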

$ python collect_data_minimal.py [hal|epx|lip]/{supercluster_id} [hal|epx|lip]/info_{supercluster_id}.csv [0|1] >> output.csv
Gathers the data from the MM/GBSA calculations for a single supercluster into a CSV file.
The last argument should be 0 for the fully implicit system or 1 for the partially explicit system.
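
Since the collection step handles one supercluster at a time, it can be looped over all superclusters of a protein. The sketch below is not part of the deposited scripts. It discovers superclusters from the info_{supercluster_id}.csv files and appends the script's output to one CSV file, mirroring the `>>` redirection in the command above; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def collect_commands(protein_dir, explicit, script="collect_data_minimal.py"):
    """Build one collect_data_minimal.py command per supercluster."""
    flag = "1" if explicit else "0"  # 0: fully implicit, 1: partially explicit
    root = Path(protein_dir)
    commands = []
    for info in sorted(root.glob("info_*.csv")):
        supercluster_id = info.stem[len("info_"):]
        commands.append(
            ["python", str(script), str(root / supercluster_id), str(info), flag]
        )
    return commands


def collect_all(protein_dir, explicit, output_csv):
    """Run every command and append its standard output to output_csv."""
    with open(output_csv, "a") as out:
        for command in collect_commands(protein_dir, explicit):
            subprocess.run(command, stdout=out, check=True)
```

Running `collect_all` twice per protein, once with `explicit=False` and once with `explicit=True`, would yield the two CSV files per protein that the plotting step takes as input.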

$ python plot_bin.py {hal_implicit_csv} {hal_explicit_csv} {epx_implicit_csv} {epx_explicit_csv} {lip_implicit_csv} {lip_explicit_csv}
Visualizes the results. Produces "combined.png".

Notes

This work was supported by the National Science Centre, Poland (grant no. 2017/26/E/NZ1/00548). The computations were performed at the Poznan Supercomputing and Networking Center. C.S-B. and A.S.T. are recipients of a scholarship associated with the POWER project (grant nos. POWR.03.02.00-00-I022/16 and POWR.03.02.00-00-I006/17, respectively).

Files

README.txt

Files (4.8 GB)

Checksum  Size
md5:7ddd6b82670e6865204a99b90440a932  2.3 GB
md5:0424a1bfed451ca435501f8f017b9c5e  265.3 MB
md5:0414a409df33f2b7c87c4b3993a82013  2.2 GB
md5:c03ce154e24275b463bd843181d2035c  3.1 kB
md5:c9fcfb673c1689389e8794caee75963a  1.6 MB
md5:8d1b5f90163d1ce33deb38c49eab0c05  5.7 kB

Additional details

Related works

Is derived from
Dataset: 10.5281/zenodo.7966081 (DOI)
Dataset: 10.5281/zenodo.7966058 (DOI)
Dataset: 10.5281/zenodo.7966091 (DOI)
Is published in
Preprint: 10.1101/2023.05.24.542065 (DOI)