rMD17-aq dataset
Description
The rMD17-aq dataset:
Citation:
Jas Kalayan, Ismaeel Ramzan, Christopher D. Williams, Neil A. Burton and Richard A. Bryce, "A neural network potential based on pairwise resolved atomic forces and energies", publication TBC
Description:
QM/MM simulations in aqueous solution were performed for each of the 10 molecules from the original MD17 dataset by Chmiela et al. (and the revised dataset by Christensen et al.), with each solute surrounded by 400 SPC/E water molecules. Each simulation was run for 100 ps at a temperature of 500 K and a pressure of 1 atm. The solute conformations sampled from the QM/MM simulations, which were performed with CP2K, were then used to recalculate the forces and energy of each conformation in Gaussian with a denser integration grid to effectively remove numerical noise.
We also include an 11th entry, a higher-energy conformer of salicylic acid (directory name: salicylic_high_energy_conformer), in addition to the lower-energy conformer sampled in the MD17 dataset.
For each molecule (excluding all surrounding water molecules), this dataset contains the nuclear charges, coordinates (Angstrom), forces (kcal/mol/Angstrom), energies (kcal/mol) and partial atomic charges (atomic units), stored as space-separated text files written with the numpy savetxt function.
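Many machine learning potential codes expect eV rather than kcal/mol. The sketch below is not part of the dataset; it shows one way to convert energies (kcal/mol) and forces (kcal/mol/Angstrom) to eV and eV/Angstrom. The function name `kcal_to_ev` is hypothetical, and the factor assumes the thermochemical calorie (1 eV = 23.060548 kcal/mol). The same factor applies to energies and forces because the length unit is unchanged.

```python
import numpy as np

# Assumed conversion factor: 1 eV = 23.060548 kcal/mol (thermochemical calorie).
KCAL_PER_MOL_PER_EV = 23.060548


def kcal_to_ev(values):
    """Convert kcal/mol (or kcal/mol/Angstrom) to eV (or eV/Angstrom)."""
    return np.asarray(values, dtype=float) / KCAL_PER_MOL_PER_EV


# Illustrative values only, not taken from the dataset.
energies_ev = kcal_to_ev([0.0, 23.060548])
forces_ev_per_ang = kcal_to_ev([[1.0, -2.0, 0.5]])
```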
The data:
The files in each molecule directory are:
'nuclear_charges.txt' : The nuclear charges for each atom in a molecule.
'coords.txt' : The Cartesian coordinates for each atom in a conformation (Angstrom units)
'energies.txt' : The total energy of each conformation (kcal/mol units)
'forces.txt' : The Cartesian forces for each atom in a conformation (kcal/mol/Angstrom units)
'charges.txt' : The partial ElectroStatic Potential (ESP) atomic charges (atomic units)
'molecules.prmtop' : The Amber formatted topology file containing the MM parameters for water molecules (solute MM parameters are not used)
'minimised.rst.pdb' : The initial coordinates of a minimised system used to perform QM/MM simulations in CP2K
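The text files listed above could be read back with numpy as in the minimal sketch below. The exact row layout of each file is not stated here, so the sketch makes two assumptions: that 'nuclear_charges.txt' holds one value per atom for a single molecule, and that the per-atom values in the other files are stored in conformation, atom, component order. Under those assumptions a reshape recovers the arrays whether each row holds one atom or one whole conformation. The function name `load_molecule` is hypothetical.

```python
from pathlib import Path

import numpy as np


def load_molecule(directory):
    """Load one molecule directory into a dict of numpy arrays.

    Assumes nuclear_charges.txt has one entry per atom and that the
    per-atom files are ordered conformation -> atom -> component.
    """
    directory = Path(directory)
    nuclear_charges = np.loadtxt(directory / "nuclear_charges.txt").astype(int).ravel()
    n_atoms = nuclear_charges.size

    energies = np.loadtxt(directory / "energies.txt").ravel()
    coords = np.loadtxt(directory / "coords.txt").reshape(-1, n_atoms, 3)
    forces = np.loadtxt(directory / "forces.txt").reshape(-1, n_atoms, 3)
    charges = np.loadtxt(directory / "charges.txt").reshape(-1, n_atoms)

    # Sanity check: every array should describe the same number of conformations.
    n_conf = energies.size
    for name, array in (("coords", coords), ("forces", forces), ("charges", charges)):
        if array.shape[0] != n_conf:
            raise ValueError(f"{name} has {array.shape[0]} conformations, expected {n_conf}")

    return {
        "nuclear_charges": nuclear_charges,  # shape (n_atoms,)
        "energies": energies,                # shape (n_conf,), kcal/mol
        "coords": coords,                    # shape (n_conf, n_atoms, 3), Angstrom
        "forces": forces,                    # shape (n_conf, n_atoms, 3), kcal/mol/Angstrom
        "charges": charges,                  # shape (n_conf, n_atoms), atomic units
    }
```

For example, `load_molecule("salicylic_high_energy_conformer")` would load the extra salicylic acid conformer, provided the directory has been extracted into the working directory. If the shape check fails, the layout assumption above does not hold for that file and should be revisited.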
The input data:
The input files to perform simulations and single point energy calculations are provided in the '_cp2k_gaussian_example_inputs' directory. These files are:
'cp2k-qmmm-example.inp' : input file for the QM/MM simulations performed with CP2K. The numbers of QM atoms of each kind are represented by the placeholders CCC, OOO, HHH and NNN, which stand for the number of carbon, oxygen, hydrogen and nitrogen atoms, respectively, in a solute molecule. The system dimensions placeholder XXYYZZ can be replaced with the BOX_DIMENSIONS values in the molecules.prmtop file.
'def2-svp.1.cp2k' : the basis set used in QM/MM simulations
'gaussain_input.com' : an example Gaussian input file for a single point energy calculation on aspirin.
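The placeholder substitution described for 'cp2k-qmmm-example.inp' could be scripted roughly as follows. This is a sketch under stated assumptions, not the authors' workflow: the function name `fill_cp2k_template` is hypothetical, the toy template string stands in for the real input file, and the way the box lengths are formatted (three space-separated numbers) is a guess that should be checked against the actual input. Plain string replacement also assumes the placeholder strings occur nowhere else in the file.

```python
from collections import Counter

# Atomic numbers of the four elements named by the placeholders.
PLACEHOLDER_BY_Z = {6: "CCC", 8: "OOO", 1: "HHH", 7: "NNN"}


def fill_cp2k_template(template, nuclear_charges, box_dimensions):
    """Replace the atom-count and box placeholders in a CP2K template string."""
    counts = Counter(int(z) for z in nuclear_charges)
    filled = template
    for atomic_number, placeholder in PLACEHOLDER_BY_Z.items():
        # Elements absent from the solute get a count of 0.
        filled = filled.replace(placeholder, str(counts.get(atomic_number, 0)))
    # Assumed formatting: box lengths as space-separated numbers.
    box = " ".join(f"{length:.3f}" for length in box_dimensions)
    return filled.replace("XXYYZZ", box)


# Toy template (hypothetical, not the real input file) and an
# aspirin-like composition: 9 carbon, 8 hydrogen and 4 oxygen atoms.
toy_template = "carbon CCC\noxygen OOO\nhydrogen HHH\nnitrogen NNN\nbox XXYYZZ\n"
solute_charges = [6] * 9 + [1] * 8 + [8] * 4
filled_input = fill_cp2k_template(toy_template, solute_charges, (25.0, 25.0, 25.0))
```

A solute with no nitrogen ends up with a count of 0 here. Whether the real input accepts that, or needs the corresponding entry removed, is not specified in this description.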
Files (1.0 GB)

| MD5 checksum | Size |
|---|---|
| 6b7271acd2eb36f1a2f576a0ec216f35 | 6.7 kB |
| 46336825ee8ae7ef27fd1dde7160a2e0 | 125.5 MB |
| 8e2431fac66294c1bd0601e816393d25 | 143.6 MB |
| 44b9f345231d5a7de8008b2605dba83b | 71.1 MB |
| 2cd36522390caffd99e9c74680204160 | 53.8 MB |
| 191308fa831fb7fedd4afe3dd9e6da28 | 53.6 MB |
| dcb3cb14aa7303c08ffdf3b47353feb1 | 106.7 MB |
| dac0816970406ac9ba3908842df92a33 | 119.4 MB |
| e1115546816aebcea159115275620295 | 2.7 kB |
| 0a135fca46d9f720b58adf2ce444a595 | 95.5 MB |
| 85979bc8f9ef96223dd81316f7ef0c47 | 95.5 MB |
| 4037f174b669632b58799082e9e1db9e | 89.1 MB |
| c7a1f8def1ffc381f0096866bdd3dff0 | 71.5 MB |
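The MD5 checksums listed above can be used to confirm that a download completed without corruption. A minimal sketch using Python's standard hashlib module follows. The helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and which checksum belongs to which file has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed checksum, with or without an 'md5:' prefix."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For instance, `matches_listed_md5("downloaded.zip", "md5:6b7271acd2eb36f1a2f576a0ec216f35")` returns True only if the local file (the name here is a placeholder) has that digest.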
Additional details
Related works
- Is variant form of
- Publication: 10.1126/sciadv.1603015 (DOI)
- Publication: 10.1088/2632-2153/abba6f (DOI)
Funding
- Leverhulme Trust
- Accelerating shape-based drug design by machine learning and quantum mechanics RPG-2020-05