There is a newer version of the record available.

Published September 14, 2024 | Version v1
Dataset Open

Coarse-Grained and Multi-Dimensional Data-Driven Molecular Generation: A General Framework for Selective Inhibitor Design and Optimization in Structure-Based Drug Discovery

Authors/Creators

  • 1. Sichuan University

Contributors

Contact person:

  • 1. Sichuan University

Description

Many molecular generation approaches fail to consider the intricate interactions within the binding pocket, which leads to molecules with suboptimal properties and stability, and they also struggle to design selective inhibitors. To address these challenges, we have developed a structure-based three-dimensional molecular generation framework named Coarse-grained and Multi-dimensional Data-driven molecular generation (CMD-GEN). The framework bridges three-dimensional ligand-protein complex data and two-dimensional drug-like molecule data by using coarse-grained pharmacophore points sampled from diffusion models, thereby enriching the training data for generative models. Through a hierarchical architecture, it decomposes the generation of three-dimensional molecules within the pocket into three steps: sampling of coarse-grained pharmacophore points, generation of chemical structures, and alignment of conformations. This decomposition avoids the instability inherent in generating molecular conformations directly with deep generative models.

This project provides the source dataset used to train and evaluate the overall model.

Files

CMD-GEN.zip

Files (5.9 GB)

Name          Size     Checksum
CMD-GEN.zip   5.9 GB   md5:d83aef5543ca80dfba36fb5ddfb4b297
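
Because the archive is large (5.9 GB), it may be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using only the Python standard library. The local path `CMD-GEN.zip` and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration and are not part of the dataset or its accompanying code.

```python
import hashlib
from pathlib import Path

# MD5 checksum as listed for CMD-GEN.zip on this record.
EXPECTED_MD5 = "d83aef5543ca80dfba36fb5ddfb4b297"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so the whole archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded archive; adjust as needed.
    archive = Path("CMD-GEN.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.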

Additional details

Dates

Other
2024