Published August 27, 2021 | Version 1.0
Dataset
Open
TocoDecoy: a new approach to design unbiased datasets for training and benchmarking machine-learning scoring functions
Description
This file contains the TocoDecoy datasets generated from the targets and active ligands of LIT-PCBA.
1_property_filtered.zip:
- TD set: for the active ligands and their topologically dissimilar decoys, the ligand file name, 2D t-SNE vectors, SMILES, molecular weight (MW), Wildman-Crippen partition coefficient (log P), number of rotatable bonds (RB), number of hydrogen-bond acceptors (HBA), number of hydrogen-bond donors (HBD), number of halogens (HAL), topological similarity of the decoys to the seed active ligands, activity label (active or inactive), and split label (training set or test set).
- CD set: the decoy conformations with low docking scores, generated by docking the active ligands into the protein pockets using Glide (Schrödinger).
Files
Name | Size | MD5 |
---|---|---|
1_property_filtered.zip | 4.9 GB | 24e5ee6f96a941730bfe41ae771e0e05 |
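Because the archive is large, it may be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using only the Python standard library; the local path `1_property_filtered.zip` and the helper names `md5_of_file` and `verify_download` are assumptions made for illustration and are not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files table of this record.
EXPECTED_MD5 = "24e5ee6f96a941730bfe41ae771e0e05"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded archive; adjust as needed.
    archive = Path("1_property_filtered.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in 1 MiB chunks keeps memory use flat regardless of the 4.9 GB file size. A mismatch usually indicates an incomplete or corrupted download rather than a problem with the dataset itself.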