MD simulation files for: Enhanced Sampling of Biomolecular Slow Conformational Transitions Using Adaptive Sampling and Machine Learning
Description
#### Enhanced Sampling of Biomolecular Slow Conformational Transitions Using Adaptive Sampling and Machine Learning
**Authors:** Mingyuan Zhang, Hao Wu, Yong Wang
This repository contains the official implementation for the paper "Enhanced Sampling of Biomolecular Slow Conformational Transitions Using Adaptive Sampling and Machine Learning" by Mingyuan Zhang, Hao Wu, and Yong Wang. Included are all trajectories from our MD simulations in the form of PLUMED COLVAR files, as well as all analysis scripts and files needed to replicate the results and figures presented in both the main text and Supporting Information (SI) of the paper.
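Since the trajectories are distributed as PLUMED COLVAR files, a small reader can be handy for a first look at the data. The following is a minimal sketch, not the authors' analysis code. It relies only on the standard PLUMED convention that a `#! FIELDS` header line names the columns. The function name `read_colvar` and the column names in the sample (`time`, `phi`, `psi`) are assumptions for illustration; the actual columns depend on the PLUMED input used for each simulation.

```python
def read_colvar(lines):
    """Parse PLUMED COLVAR text into a dict mapping column name -> list of floats.

    `lines` is any iterable of strings, for example an open file handle.
    """
    fields = None
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#!"):
            parts = line.split()
            # "#! FIELDS time phi psi" names the data columns;
            # other "#!" lines (e.g. "#! SET ...") are metadata and skipped.
            if len(parts) > 1 and parts[1] == "FIELDS":
                fields = parts[2:]
            continue
        if line.startswith("#"):
            continue  # ordinary comment line
        rows.append([float(x) for x in line.split()])

    if fields is None:
        raise ValueError("no '#! FIELDS' header found")
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("row width does not match the FIELDS header")

    return {name: [row[i] for row in rows] for i, name in enumerate(fields)}


# Hypothetical sample; real column names come from the PLUMED input.
SAMPLE = """#! FIELDS time phi psi
#! SET min_phi -pi
#! SET max_phi pi
0.0 -1.5 2.0
1.0 -1.4 2.1
"""

colvar = read_colvar(SAMPLE.splitlines())
```

For a file on disk, the same function can be called as `read_colvar(open(path))`, where `path` points to one of the COLVAR files in the archive.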
The paper features two examples: Ala2 and Ala10. For each, the associated simulation files are organized exactly as the automated simulation pipeline produced them. The directory structure is the same for both examples, so we describe only Ala2, found in the `Ala2` folder:
### Key Components
- **Automated Pipeline Implementation:** The pipeline is implemented in `Ala2/7-adaptive-40ps/ala2.ipynb`. It is ready to use once all required packages are installed and `gmx`, `gmx_mpi`, and `plumed` are callable from within the notebook. After configuring the environment and setting parameters such as `gpu_id`, `ntomp`, and `n_sim` to match your hardware, running the notebook cells in order reproduces the entire pipeline.
- **Analysis Scripts:** The scripts to replicate the results or figures from the main text or SI are organized in three files: `Ala2/7-adaptive-40ps/AdaptiveSamplingAnalysis.ipynb`, `Ala2/7-adaptive-40ps/compare_with_msm.ipynb`, and `Ala2/7-adaptive-40ps/opes/COLVAR/analysis.ipynb`.
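Before running the pipeline notebook, it may help to confirm that the three executables are visible from the Python environment. The sketch below is not part of the repository; the helper name `missing_tools` is an assumption, and the check only verifies that each command is on `PATH`, not that it was built with the options the pipeline needs.

```python
import shutil

# Executables the README says must be callable from the notebook.
REQUIRED_TOOLS = ("gmx", "gmx_mpi", "plumed")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# An empty list means every executable was found.
not_found = missing_tools()
```

Running this in the first notebook cell gives an early, readable failure instead of an error partway through a simulation round.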
### Directory Structure
Under the `Ala2` main directory, there are seven subdirectories:
- **`Ala2/1-topol/`**: Contains files generated during system construction, including the final Gromacs topology file `topol.top`, which is necessary for running the automated simulation script.
- **`Ala2/2-em/`, `Ala2/3-nvt/`, `Ala2/4-npt/`**: These directories store files generated during energy minimization and NVT/NPT equilibration. The `Ala2/4-npt/npt.gro` file is required to run the automated simulation script.
- **`Ala2/mdp/`**: Contains all mdp files used, including `Ala2/mdp/md_detail.mdp`, which is necessary for running the automated simulation script.
- **`Ala2/7-adaptive-40ps/`**: Contains all simulation and analysis scripts, along with files required to replicate the study related to the automated pipeline.
    1. **`Ala2/7-adaptive-40ps/CV/`**: Stores all COLVAR files from adaptive sampling simulations.
    2. **`Ala2/7-adaptive-40ps/opes/`**: Contains all files related to OPES simulations, including raw data for the final FES plots found in `Ala2/7-adaptive-40ps/opes/COLVAR/`. The script for replicating OPES and FES estimation figures is located in `Ala2/7-adaptive-40ps/opes/COLVAR/analysis.ipynb`.
    3. **`Ala2/7-adaptive-40ps/figures/`**: Includes all original figures from the main text and SI, saved at 600 dpi.
    4. **`Ala2/7-adaptive-40ps/traj_and_dat/`**: Stores all PLUMED `*.dat` files for the `DRIVER` utility in adaptive sampling simulations, a topology file `input.pdb` for PLUMED `MOLINFO`, and a topology file `seed_ref.pdb` used with MDAnalysis to generate the adaptive sampling seed `*.gro` files. Note that all `*.xtc` files from adaptive sampling were deleted to reduce the package size.
    5. **Seed Index Files:** Seed indices for each round are stored as `Ala2/7-adaptive-40ps/round{i}_seed.txt` and are necessary for figure replication.
    6. **Automated Pipeline Notebook:** Implemented in `Ala2/7-adaptive-40ps/ala2.ipynb`. Ensure that all imported packages are installed and that GROMACS (both `gmx` and `gmx_mpi`) and PLUMED can be called from within the Jupyter notebook.
    7. **Adaptive Sampling Analysis:** Scripts for analyzing adaptive sampling trajectories are found in `Ala2/7-adaptive-40ps/AdaptiveSamplingAnalysis.ipynb`. This notebook contains scripts to replicate all figures related to adaptive sampling.
    8. **MSM Comparison:** Analysis scripts for MSM comparison are located in `Ala2/7-adaptive-40ps/compare_with_msm.ipynb`. This notebook contains scripts to replicate the figures used for the MSM/OPES comparison.
- **`Ala2/8-adaptive-400ps/`**: Contains all simulation files (except xtc) for an additional adaptive sampling dataset computed for MSM comparison.
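The directory listing above names three files that the automated simulation script requires: `1-topol/topol.top`, `4-npt/npt.gro`, and `mdp/md_detail.mdp`. A short check like the following can confirm they survived extraction. It is a sketch rather than repository code; the helper name `missing_inputs` is an assumption, and the paths are taken relative to the example folder (for instance `Ala2`).

```python
from pathlib import Path

# Inputs the README marks as required by the automated simulation script,
# given relative to an example folder such as "Ala2".
REQUIRED_INPUTS = (
    "1-topol/topol.top",
    "4-npt/npt.gro",
    "mdp/md_detail.mdp",
)


def missing_inputs(example_dir, required=REQUIRED_INPUTS):
    """Return the required relative paths that are not files under `example_dir`."""
    root = Path(example_dir)
    return [rel for rel in required if not (root / rel).is_file()]
```

Calling `missing_inputs("Ala2")` from the directory where the archive was unpacked returns an empty list when all three inputs are present.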
### Contact Information
We are continuing to test and improve the pipeline, so a tutorial is not yet available. Please feel free to reach out with any questions related to the implementation via email at mingyuanzhang@zju.edu.cn or by raising an issue on our GitHub page: https://github.com/yongwangCPH/papers/tree/main/2024/ALICE.
Files

| Name | Size | MD5 |
|---|---|---|
| ALICE_ala2_ala10.zip | 24.0 MB | 7d5bd04eab01d46cea1cc7006057750a |
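
The MD5 checksum listed for the archive can be compared against the downloaded file to detect a corrupted or incomplete download. This is a minimal sketch using Python's standard `hashlib`; the helper name `md5_of` and the local file location are assumptions.

```python
import hashlib

# Checksum published alongside ALICE_ala2_ala10.zip.
EXPECTED_MD5 = "7d5bd04eab01d46cea1cc7006057750a"


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With the archive in the current directory, `md5_of("ALICE_ala2_ala10.zip") == EXPECTED_MD5` should evaluate to `True` for an intact download.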