Published June 30, 2024 | Version 1.0
Dataset Open

Aquamarine: Quantum-Mechanical Exploration of Conformers and Solvent Effects in Large Drug-like Molecules

Description

Open challenges in computational drug design include understanding and accurately describing solvent effects and collective dispersion interactions in realistic drug-like molecules. Both interactions profoundly influence the conformational stability of drug molecules and, consequently, the determination of other important quantum-mechanical (QM) observables. In this context, we introduce the Aquamarine (AQM) dataset, an extensive QM dataset that contains the structural and electronic information of 59,786 low- and high-energy conformers of 1,653 molecules with up to 54 non-hydrogen atoms (including C, N, O, F, P, S, and Cl). To gain insight into solvent effects, we carried out QM calculations of structures and properties in the gas phase and in aqueous solution modeled with an implicit solvent. AQM contains over 40 global (molecular) and local (atom-in-a-molecule) physicochemical properties (including ground-state and response properties) per molecular structure, computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules and at PBE0+MBD supplemented with the modified Poisson-Boltzmann (MPB) model of water for solvated molecules. By treating both molecule-solvent and dispersion interactions, the AQM dataset can help clarify the impact of both interactions on structure-property and property-property relationships of realistic drug-like molecules. We therefore propose the AQM dataset as a benchmark for state-of-the-art machine learning methods for property prediction and for the de novo generation of large and flexible (solvated) molecules of pharmaceutical and biological relevance.

Files

README.txt

Files (5.9 GB)

Checksum | Size
md5:e4cee28996ee002bcc37157691f7600a | 1.6 GB
md5:c84d69fc6a3bacb2b80d8b9a7dbd0ec7 | 2.6 GB
md5:71c86586e3e26a1471721555325026ce | 1.7 GB
md5:a722d765bda6e9274925bbf0dc899e85 | 328.9 kB
md5:2103b824abc6d6000d4ea3a1b0dd7b81 | 524.4 kB
md5:40500597b2501dafc3f29e91a760ebd4 | 1.0 kB
md5:fc615ca71f9c6432bb88a2652802c270 | 2.8 kB
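Because the largest files run to several gigabytes, it may be worth checking each download against the md5 checksums listed above. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_md5` are illustrative and not part of the dataset, and the file names are not shown in this listing, so the local path you pass in is an assumption on your side.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so multi-gigabyte files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 digest with a checksum such as 'md5:e4ce...'.

    The optional 'md5:' prefix (as shown in the file listing) is stripped,
    and the comparison ignores letter case.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_md5(local_path, "md5:e4cee28996ee002bcc37157691f7600a")` would return `True` only if `local_path` (a hypothetical location of the first 1.6 GB download) matches that checksum byte for byte.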

Additional details

Funding

Janssen (Belgium)
European Union
University of Luxembourg

Dates

Available
2024-06-30