Dataset Open Access
A key concept in drug design is how natural variants, especially those occurring in the binding sites of drug targets, affect inter-individual drug response and efficacy by altering binding affinity. So far, these effects have been studied only on small, limited datasets, whereas a proper evaluation requires a large dataset of binding affinity changes caused by binding-site single-nucleotide polymorphisms (SNPs). To the best of our knowledge, no such dataset exists. We therefore constructed a reference dataset of ligand binding affinities to proteins, covering all reported variants of their binding sites, using a molecular docking approach. A large database of protein-ligand complexes that spans a wide range of binding pocket mutations and a broad landscape of small molecules is valuable for several types of studies. For example, developing machine learning algorithms to predict protein-ligand affinity, or the effect of an SNP on it, requires an extensive amount of data. In this work, we present PSnpBind, a large database of protein-ligand complexes with mutated binding sites, constructed using a multithreaded virtual screening workflow. It provides a web interface to explore and visualize the protein-ligand complexes and a REST API for programmatic access to the database contents. PSnpBind is freely available at https://psnpbind.org. The source code of the tools used to construct PSnpBind is available on GitHub.
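As a rough illustration of programmatic access, the sketch below shows one way a client could build a query URL and fetch a JSON response using only the Python standard library. The abstract gives only the base URL, so the route `/api/example/complexes` and the parameter names `protein` and `ligand` are hypothetical placeholders, not the actual PSnpBind API. The sketch also assumes the service returns JSON. Consult the API documentation on the PSnpBind website for the real routes.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Base URL taken from the abstract.
BASE_URL = "https://psnpbind.org"

# HYPOTHETICAL route: the abstract does not document any endpoint.
EXAMPLE_PATH = "/api/example/complexes"


def build_query_url(base_url, path, **params):
    """Join base URL and path, then append URL-encoded query parameters.

    Parameters are sorted by name so the resulting URL is deterministic.
    """
    url = urljoin(base_url, path)
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url


def fetch_json(url, timeout=30):
    """Send a GET request and decode the body as JSON (assumed format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (not executed here; parameter names and values are hypothetical):
#   url = build_query_url(BASE_URL, EXAMPLE_PATH, protein="P00000", ligand="X")
#   records = fetch_json(url)
```

Keeping URL construction separate from the network call makes the query logic easy to check offline and to adapt once the real routes are known.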