There is a newer version of the record available.

Published May 22, 2024 | Version 1.0.0
Publication Open

Deep Learning for Protein-Ligand Docking: Are We There Yet?

  • 1. University of Missouri

Description

This record contains the preprocessed datasets and benchmark method predictions that accompany the manuscript "Deep Learning for Protein-Ligand Docking: Are We There Yet?" [1]. The preprocessed Astex Diverse and PoseBusters Benchmark datasets, as well as the publicly available CASP15 targets referenced in the manuscript, are available for download. Also available are baseline predictions from a variety of deep learning and conventional docking methods (e.g., DiffDock-L, Vina) for each of these benchmarking datasets. Note that the predicted protein structures have been pre-aligned to the corresponding ground-truth (holo) protein structures: these are the "holo_aligned" structures for the Astex Diverse and PoseBusters Benchmark datasets and the "predicted_structures" structures for the CASP15 dataset.

 

Paper Abstract:

The effects of ligand binding on protein structures and their in vivo functions carry numerous implications for modern biomedical research and biotechnology development efforts such as drug discovery. Although several deep learning (DL) methods and benchmarks designed for protein-ligand docking have recently been introduced, to date no prior works have systematically studied the behavior of docking methods within the practical context of (1) predicted (apo) protein structures, (2) multiple ligands concurrently binding to a given target protein, and (3) having no prior knowledge of binding pockets. To enable a deeper understanding of docking methods' real-world utility, we introduce PoseBench, the first comprehensive benchmark for practical protein-ligand docking. PoseBench enables researchers to rigorously and systematically evaluate DL docking methods for apo-to-holo protein-ligand docking and protein-ligand structure generation using both single and multi-ligand benchmark datasets, the latter of which we introduce for the first time to the DL community. Empirically, using PoseBench, we find that all recent DL docking methods but one fail to generalize to multi-ligand protein targets and also that template-based docking algorithms perform equally well or better for multi-ligand docking as recent single-ligand DL docking methods, suggesting areas of improvement for future work. Code, data, tutorials, and benchmark results are available at https://github.com/BioinfoMachineLearning/PoseBench.

 

References:

[1] Morehead A, Giri N, Liu J, Cheng J. Deep Learning for Protein-Ligand Docking: Are We There Yet? arXiv; 2024. Available from: http://arxiv.org/abs/2308.05777

Files

Files (19.3 GB)

MD5 checksum                            Size
md5:eeba48db091d63b80e7613a34b92f115    44.2 MB
md5:f0bc078d73ff5af74ae2e82f450757ba    23.3 MB
md5:f5bace269a8e288f44e8b7f2865fc73f    992.5 MB
md5:06b32ab4382bc61b8242c441027ba60a    257.8 MB
md5:f9d5f4e9245aa1005c03ff5298162830    20.0 MB
md5:6e22e67b4c784271a42598d2e4313ec2    8.8 GB
md5:4b334a974ca2448abfe2440e1cc26889    524.1 kB
md5:7afc7dcd23dcbc75126da4f1bf1761a0    8.8 GB
md5:4fea75dd5f150c1fce27f8ceabab4a0a    155.0 MB
md5:83e0df123e3f4678d790ee5f1e462262    110.7 MB
md5:5777e9be311cd387d062e509305b9304    90.4 MB
md5:97e8100adbf2d0df6dfddd4433a100a1    962.1 kB
md5:b031abd8c576b9c3d430cb018241bfcf    21.8 MB
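
Because each file in this record is listed with an MD5 checksum, a download can be checked for corruption before it is unpacked. The following is a minimal sketch using only the Python standard library. The function names md5_of_file and verify_download are illustrative helpers introduced here, not part of PoseBench, and the local file path passed to them is whatever name the downloaded archive was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks.

    Chunked reading keeps memory use flat even for the multi-gigabyte
    archives in this record.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a local file against a checksum as listed in the record.

    Accepts the checksum with or without the leading "md5:" prefix
    and ignores letter case.
    """
    expected = expected_md5.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected
```

For example, verify_download(local_path, "md5:eeba48db091d63b80e7613a34b92f115") returns True only if the file at local_path (a hypothetical location) matches the first entry in the listing above.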

Additional details

Software

Repository URL: https://github.com/BioinfoMachineLearning/PoseBench
Programming language: Python
Development status: Active