Pose Selector Workflow - Docking Poses, Absolute Binding Free Energy Estimates and Structure Input Files for Machine Learning
Description
The Pose Selector (PS) workflow calculates absolute binding free energies (ABFEs) for binding poses of protein-ligand complexes. Binding poses, whether docking poses or experimentally observed ligand binding poses, are provided as a combination of a protein PDB file and a ligand MOL2 file. The workflow first subjects them to extensive quality checks and repair steps and then converts them into input files for molecular dynamics (MD) simulations with GROMACS. Next, it post-processes and analyses the last frame of each of the resulting eight 100 ps trajectories per binding pose with the Generalised Born model of implicit solvation as implemented in gmx_MMPBSA to obtain the ABFE estimates. The workflow was designed for soluble proteins without post-translational modifications, co-factors or non-standard amino acids, and it has limited support for coordinated ions.
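As a rough illustration of how eight per-trajectory estimates might be condensed into a single value per binding pose, the sketch below computes their mean and standard error. This is an assumption for illustration only: the description does not state how the workflow combines the eight results, the function name `summarize_replicas` is hypothetical, the numbers are made-up placeholders rather than values from the dataset, and kcal/mol is assumed as the unit.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_replicas(energies):
    """Return (mean, standard error) of per-replica energy estimates.

    Assumes one Generalised Born estimate per trajectory, all in the
    same unit (kcal/mol is assumed here).
    """
    if len(energies) < 2:
        raise ValueError("need at least two replica estimates")
    average = mean(energies)
    # Standard error of the mean from the sample standard deviation.
    standard_error = stdev(energies) / sqrt(len(energies))
    return average, standard_error


# Made-up placeholder values for the eight replicas of one binding pose.
replica_energies = [-32.1, -30.4, -31.8, -29.9, -33.0, -31.2, -30.7, -32.5]

avg, sem = summarize_replicas(replica_energies)
print(f"ABFE estimate: {avg:.2f} +/- {sem:.2f} kcal/mol")
```

Reporting a spread alongside the mean makes it easier to judge whether two poses of the same complex are distinguishable given the short simulation length.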
For the dataset published here, the PS workflow was run on docking poses generated for the PDBbind 2020 dataset (http://www.pdbbind.org.cn/index.php). These docking poses are shared in dockingPosesPDBBind2020.tar.gz. This entry and its partner entry 10.5281/zenodo.11397486 also share the initial coordinates used in the MD simulations of >800,000 docking poses of 4022 protein-ligand complexes (structureFiles_dockingPoses1.tar.gz in this entry and structureFiles_dockingPoses2.tar.gz in 10.5281/zenodo.11397486) and of the experimental ligand binding poses of 4549 complexes (structureFiles_experimentalStructures.tar.gz), as well as the corresponding ABFE estimates (absoluteBindingFreeEnergyEstimates.tar.gz). The MD simulations were run on the LUMI and MeluXina supercomputers, while the implicit-solvent calculations were carried out on Galileo (Cineca).
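Because the archives are large, it can be useful to peek at their contents before extracting them. The sketch below lists the first few member names of a .tar.gz archive with Python's standard tarfile module. The helper name `list_members` is hypothetical, and the internal layout of the archives is not assumed here; the README is the authoritative description of it.

```python
import tarfile
from pathlib import Path


def list_members(archive_path, limit=10):
    """Return the names of up to `limit` members of a .tar.gz archive.

    The archive is read sequentially and nothing is extracted to disk.
    """
    names = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


if __name__ == "__main__":
    # Archive name taken from the description; it must already have been
    # downloaded into the current working directory.
    archive = Path("absoluteBindingFreeEnergyEstimates.tar.gz")
    if archive.exists():
        for name in list_members(archive):
            print(name)
    else:
        print(f"{archive} not found; download it first")
```

Stopping after a fixed number of members avoids decompressing an entire archive just to see how it is organised.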
The README file describes the structure of the shared data in more detail. It explains how to reproduce the MD trajectories and the subsequent implicit-solvent calculations that yield the free-energy estimates, and how to use the data provided in this entry to train a machine-learning model that predicts the ABFE of binding poses of protein-ligand complexes. The workflow scripts can be downloaded from GitHub (https://github.com/LigateProject/Pose-Selector-workflow). The MD simulations were run with GROMACS 2023.2 (https://manual.gromacs.org/2023.2/index.html), and the implicit-solvent calculations were carried out with gmx_MMPBSA 1.6.1 (https://valdes-tresanco-ms.github.io/gmx_MMPBSA/v1.6.1/).
Files
README.txt
Total size: 49.1 GB
Additional details
Software
- Repository URL: https://github.com/LigateProject/Pose-Selector-workflow