Published July 3, 2023 | Version merged_14-04-2022_30_05-09-2022
Dataset Open

Large curated dataset for drug target interaction

Description

A large dataset curated from the public PubChem, ChEMBL and BindingDB sources. It contains pairs of small molecules and protein targets, together with information about their binding interactions. The data is stored in an efficient tabular format that decouples entity IDs from their string representations to avoid redundancy. The curation also provides meaningful splits into train, validation and test sets for use with learning-based affinity prediction models.
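To illustrate what decoupling entity IDs from their string representations can look like, here is a minimal sketch in plain Python. The table layout, the field names (`ligand_id`, `target_id`, `affinity`) and the toy values are assumptions made for illustration only; they are not the dataset's actual schema.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """One interaction sample that references entities by ID only (hypothetical fields)."""
    ligand_id: int
    target_id: int
    affinity: float  # hypothetical activity value, not taken from the dataset


# Entity tables: each string representation is stored once, keyed by its ID.
# The SMILES strings and sequence fragments below are toy values.
ligands = {0: "CCO", 1: "c1ccccc1"}
targets = {0: "MKTAYIAKQR", 1: "MSLNNVGQ"}

# Pairs table: a ligand or target that appears in many samples repeats only
# its small integer ID, never the full string.
pairs = [Pair(0, 0, 5.2), Pair(0, 1, 6.8), Pair(1, 0, 7.1)]


def resolve(pair: Pair) -> tuple[str, str, float]:
    """Look up the string representations for a single pair."""
    return ligands[pair.ligand_id], targets[pair.target_id], pair.affinity


resolved = [resolve(p) for p in pairs]
```

With this layout, a long protein sequence shared by thousands of samples is stored once, and a model's data loader resolves the IDs only when it needs the strings.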

Files

Name: merged_14-04-2022_30_05-09-2022.zip
Size: 11.3 GB
Checksum: md5:ed33c1fcfaf4a4c8bad6456f5c02af46
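
Because the archive is large, it may be worth checking the download against the MD5 checksum listed for it before unpacking. The following is a minimal sketch using only the Python standard library; the helper names `file_md5` and `verify_download` are hypothetical, and the local path you pass in is assumed to point at wherever you saved the zip file.

```python
import hashlib

# MD5 checksum as listed for the archive on this page.
EXPECTED_MD5 = "ed33c1fcfaf4a4c8bad6456f5c02af46"


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks avoids loading an 11.3 GB file into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest matches the expected checksum."""
    return file_md5(path) == expected.lower()
```

For example, `verify_download("merged_14-04-2022_30_05-09-2022.zip")` would return `True` only if the local copy matches the listed checksum. MD5 is suitable here for detecting a corrupted or truncated download, not as a security guarantee.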