NextTopDocker
Authors/Creators
- 1. Université Paris Cité, CNRS UMR 8251, INSERM ERL 1133, F-75013 Paris, France
- 2. Department of Bioengineering, Imperial College London, London SW7 2AZ, UK
Description
NextTopDocker is a large-scale, up-to-date (as of May 2025), fully open-access data set of 19,239 PDB-derived protein-ligand complexes. The complexes are split into 14,038 training and 5,201 test entries via a strict cold-ligand strategy and are accompanied by nine ligand-similarity-aware training subsets. Together, these provide a challenging, diverse, and reproducible foundation for evaluating pose generation and docking performance.
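As a rough illustration of what a cold-ligand split means, the sketch below groups complexes by ligand and assigns each ligand, with all of its complexes, to exactly one side of the split. The exact criterion used to build NextTopDocker is not given in this description, so the function name `cold_ligand_split`, the ligand identifiers, and the `test_fraction` default are all hypothetical.

```python
import random
from collections import defaultdict


def cold_ligand_split(complexes, test_fraction=0.25, seed=0):
    """Split (complex_id, ligand_id) pairs so that no ligand is on both sides.

    Hypothetical helper: ligand identity is assumed to be an exact
    identifier match, which may differ from the data set's own criterion.
    """
    by_ligand = defaultdict(list)
    for complex_id, ligand_id in complexes:
        by_ligand[ligand_id].append(complex_id)

    # Shuffle ligands (not complexes) so whole ligand groups move together.
    ligands = sorted(by_ligand)
    random.Random(seed).shuffle(ligands)

    n_test = max(1, round(len(ligands) * test_fraction))
    test_ligands = set(ligands[:n_test])

    train = [c for lig in ligands[n_test:] for c in by_ligand[lig]]
    test = [c for lig in ligands[:n_test] for c in by_ligand[lig]]
    return train, test, test_ligands


# Toy input: made-up complex and ligand identifiers, not taken from the data set.
toy = [("c1", "LIG_A"), ("c2", "LIG_A"), ("c3", "LIG_B"),
       ("c4", "LIG_C"), ("c5", "LIG_D"), ("c6", "LIG_D")]
train_ids, test_ids, held_out = cold_ligand_split(toy)
```

Splitting at the ligand level rather than the complex level is what keeps every test ligand unseen during training.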
On this benchmark data set, our simple logistic regression models, LogReg (x%), trained on Smina and GNINA 1.3 scores from chemically dissimilar ligands and applied to Smina-generated poses, achieved docking power comparable to or exceeding that of four state-of-the-art end-to-end ML docking tools (DeepDock, Interformer, SurfDock, and Uni-Mol Docking v.2).
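To make the idea of rescoring poses with logistic regression concrete, here is a minimal pure-Python sketch that fits a two-feature model by batch gradient descent. It is not the authors' implementation: the feature choice (one Smina score and one GNINA score per pose), the label definition (1 for a near-native pose, 0 otherwise), the standardization step, and every numeric value are assumptions made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def fit_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy poses: (hypothetical Smina score, hypothetical GNINA score); values are invented.
raw = [(-9.1, 0.85), (-8.4, 0.70), (-7.9, 0.78),
       (-6.0, 0.30), (-5.2, 0.15), (-6.8, 0.22)]
labels = [1, 1, 1, 0, 0, 0]  # assumed meaning: 1 = near-native pose

X = standardize(raw)
w, b = fit_logreg(X, labels)
probs = [predict_proba(w, b, x) for x in X]
best_pose = max(range(len(probs)), key=probs.__getitem__)
```

In a docking-power setting, the predicted probability would be used to rank the candidate poses of each complex, with the top-ranked pose taken as the prediction.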
Files
DeepDock.zip
Total size: 33.6 GB
| MD5 checksum | Size |
|---|---|
| f2f2a84cda7e5a72e5a9225c0edf6781 | 290.8 MB |
| a14ce1fc348a23e7f8f4319c81829602 | 250.1 MB |
| 9f8a188f2958b9fcb200cee70dac3dd1 | 12.9 GB |
| 2c9a3bac624b2aa81a1bb9a3cf0ca4b5 | 874.8 MB |
| a6463de2f2163bea42274f097fa2b993 | 18.9 GB |
| def5e26fc9ffecad7a50d9459e7760dd | 453.8 MB |
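Because several of the archives are multiple gigabytes, it may be worth checking each download against the MD5 checksums listed above. The following is a small sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` are hypothetical, and the file-to-checksum pairing has to be taken from the record page, since it is not recoverable from this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Streaming avoids loading a multi-gigabyte archive into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the published checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call such as `verify_download("DeepDock.zip", expected)` returns `True` only when the local file matches the checksum passed as `expected`.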