Published 2025 | Version v2
Dataset
Open
CovDocker
Description
The preprocessed CovDocker dataset for the paper "CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions". The associated code is available at https://github.com/PoloWitty/CovDocker.
The dataset files are stored as LMDB files for convenience.
File structure:
processed
├── bonded
│ ├── 1A0L
│ │ ├── 1A0L_10Apocket.pdb
│ │ ├── 1A0L_5Apocket.pdb
│ │ ├── 1A0L_8Apocket.pdb
│ │ ├── 1A0L_chain_within_10A.pdb
│ │ ├── 1A0L_ligand.pdb # ligand part from original complex pdb file
│ │ ├── 1A0L_ligand.sdf # ligand structure aligned with the corresponding SMILES
│ │ └── 1A0L_protein.pdb # protein part from original complex pdb file
.....
│ └── 9XIA
├── dataset # LMDB files used by the deep learning models
│ ├── docking
│ ├── reaction
│ └── reactive_site
├── dataset.csv # used for task2 and task3 (n=2754)
├── dataset.filtered.csv # used for task1 (n=2717)
├── dataset.filtered.unseen.csv # used for task1 unseen test set (67 unseen test samples)
├── dataset.unseen.csv # used for task2 and task3 unseen test set (68 unseen test samples)
└── pdb2mechanism.csv # (n=2754)
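The per-complex layout shown above can be navigated with a small path helper. This is a minimal sketch, not part of the CovDocker code base: the function names `complex_files` and `missing_files` are hypothetical, and the file-name pattern is inferred from the single `1A0L` entry in the tree, on the assumption that every other PDB ID follows it.

```python
from pathlib import Path

# Suffixes copied from the 1A0L example in the file tree.
# Assumption: every entry under processed/bonded uses the same naming pattern.
_SUFFIXES = {
    "pocket_5A": "5Apocket.pdb",
    "pocket_8A": "8Apocket.pdb",
    "pocket_10A": "10Apocket.pdb",
    "chain_within_10A": "chain_within_10A.pdb",
    "ligand_pdb": "ligand.pdb",
    "ligand_sdf": "ligand.sdf",
    "protein_pdb": "protein.pdb",
}


def complex_files(root, pdb_id):
    """Return the expected file paths for one complex (hypothetical helper)."""
    entry_dir = Path(root) / "bonded" / pdb_id
    return {key: entry_dir / f"{pdb_id}_{suffix}" for key, suffix in _SUFFIXES.items()}


def missing_files(root, pdb_id):
    """List the expected files that are not present on disk."""
    return [path for path in complex_files(root, pdb_id).values() if not path.is_file()]
```

For example, `complex_files("processed", "1A0L")["ligand_sdf"]` yields the path to the aligned ligand structure, and `missing_files` can be used to check an extracted archive before training.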
Paper Abstract:
Molecular docking is a widely used computational tool to predict the binding mode of a ligand to a target protein. Covalent interactions, which involve the formation of a covalent bond between the ligand and the target, have gained significant importance due to their strong and durable binding. However, most traditional docking methods and existing deep learning approaches hardly account for the formation of covalent bonds and the resultant structural changes.
In this paper, we introduce a comprehensive benchmark for covalent docking. We decompose the covalent docking process into three main tasks: reactive site prediction, covalent reaction prediction, and covalent docking (cov-docking). By adapting state-of-the-art models such as Uni-Mol and Chemformer, we establish baseline performances, demonstrating the benchmark's efficacy in accurately predicting interaction sites and modeling the molecular changes involved in covalent binding.
These initial findings provide a foundation for further research, facilitating the development of advanced computational methods to expedite the discovery of covalent drugs. Positioned as a valuable resource for the scientific community, this benchmark serves as a catalyst for innovation in covalent drug design methodologies. By addressing the unique challenges of covalent docking through tailored tasks and datasets, our work paves the way for more accurate and efficient computational techniques, ultimately contributing to the acceleration of covalent inhibitor discovery and development. Our code is available at https://github.com/PoloWitty/CovDocker.
Files
| Name | Size | Checksum |
|---|---|---|
| covDocker_data.zip | 708.6 MB | md5:e351875949a98054af167aad10bb202e |
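The md5 checksum listed for the archive can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The function names and the local file path are illustrative assumptions; only the checksum value comes from this page.

```python
import hashlib

# Checksum as listed on this page for covDocker_data.zip.
EXPECTED_MD5 = "e351875949a98054af167aad10bb202e"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `archive_is_intact("covDocker_data.zip")` on the downloaded file should return `True` if the archive matches the published checksum. Chunked reading keeps memory use low for the 708.6 MB file.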
Additional details
Software
- Repository URL: https://github.com/PoloWitty/CovDocker