Published April 14, 2023 | Version 1
Dataset Open

Hybrid quantum-classical machine learning for generative chemistry and drug design: Generated molecules

  • 1. Russian Quantum Center
  • 2. Gero PTE

Description

Deep generative chemistry models emerge as powerful tools to expedite drug discovery. However, the immense size and complexity of the structural space of all possible drug-like molecules pose significant obstacles, which could be overcome with hybrid architectures combining quantum computers with deep classical networks. As the first step toward this goal, we built a compact discrete variational autoencoder (DVAE) with a Restricted Boltzmann Machine (RBM) of reduced size in its latent layer. The size of the proposed model was small enough to fit on a state-of-the-art D-Wave quantum annealer and allowed training on a subset of the ChEMBL dataset of biologically active compounds. Finally, we generated 2331 novel chemical structures with medicinal chemistry and synthetic accessibility properties in the ranges typical for molecules from ChEMBL. The presented results demonstrate the feasibility of using already existing or soon-to-be-available quantum computing devices as testbeds for future drug discovery applications.

Notes

Molecule datasets for the paper "Hybrid quantum-classical machine learning for generative chemistry and drug design", A.I. Gircha, A.S. Boev, K. Avchaciov, P.O. Fedichev, and A.K. Fedorov.

  1. Training dataset: train_set.txt
  2. Molecules generated using the model with Gibbs sampling after 300 epochs of training: gibbs300.txt
  3. Molecules generated using the model with Gibbs sampling after 75 epochs of training: gibbs75.txt, gibbs75-1.txt
  4. Molecules generated using the model with D-Wave sampling after 75 epochs of training: dw75-1.txt, dw75-2.txt, dw75-3.txt, dw75-4.txt, dw75-5.txt, dw75-6.txt
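The description refers to generated structures as novel, which suggests comparing a generated file against the training set. The following is a minimal sketch of such a comparison, not the authors' procedure. It assumes each .txt file stores one molecule per line as a text string (for example SMILES), which this record does not state. The function names `read_molecules` and `novel_molecules` are hypothetical. Plain string comparison is used, so two different spellings of the same molecule would be counted as distinct; a cheminformatics toolkit would be needed to canonicalize the strings first.

```python
from pathlib import Path


def read_molecules(path):
    """Return the non-empty, whitespace-stripped lines of a text file.

    Assumption: one molecule string per line (the file format is not
    documented in the record).
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def novel_molecules(generated, training):
    """Return unique generated entries that do not occur in the training list.

    Order of first appearance is preserved. Comparison is by exact string
    match, not by chemical identity.
    """
    seen = set(training)
    result = []
    for molecule in generated:
        if molecule not in seen:
            seen.add(molecule)  # also drops repeats within the generated list
            result.append(molecule)
    return result
```

With the files downloaded into the working directory, `novel_molecules(read_molecules("dw75-1.txt"), read_molecules("train_set.txt"))` would give the entries of one D-Wave sample file that are absent from the training file, under the assumptions above.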

Files

Files (12.0 MB)

MD5 checksum                            Size
md5:68aa818c853915e929bbf554c90b3326    39.2 kB
md5:778c11d63339c47abd3879685be0f6ff    33.8 kB
md5:bd934a0d0a7350e7888bd6dd6d218356    36.4 kB
md5:d0f14d906207e6fca1d6aac65cefa353    38.1 kB
md5:7bdc19fe9558c0532711aa077a0ebc78    41.6 kB
md5:4132a5f059aebb3b9371d923e9dc493c    33.9 kB
md5:628a1b906bb178cafac15a0952e6730f    2.8 MB
md5:c2642b75ce2a60cce0d588c2a927dd45    274.9 kB
md5:f2330ba9656febbbf3a50defa2020ebc    274.3 kB
md5:0189efdb6ed0b0d7f0232722f5333436    596 Bytes
md5:904bf14b51cb06608a540e3e3f404891    8.4 MB
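The MD5 checksums in the listing can be used to confirm that a downloaded file arrived intact. Below is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing here does not show which checksum belongs to which file name, the pairing of file and checksum is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 digest with a listed value.

    `listed` may carry the "md5:" prefix used in the file listing.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum(local_path, "md5:68aa818c853915e929bbf554c90b3326")` returns True only if the file at `local_path` (a placeholder for whichever file that checksum belongs to) has exactly that digest.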

Additional details

References

  • arXiv:2108.11644