Datasets for AGIMA-Score modeling - updated
Description
This data repository includes the following datasets.
(1) 'rs.zip' -- It is a secondary dataset that originates from the Refined Set in the PDBbind database (version V2020). When training models like AGIMA-Score, the complexes that also appear in the validation/test sets need to be removed from it.
(2) 'cs.zip' -- It is a secondary dataset that originates from the Core Set (CASF-2016) in PDBbind database (version V2020). It was used as the validation set (for parameter tuning) when building the AGIMA-Score models.
(3) 'test1.zip' -- It is a secondary dataset that originates from CSAR-HiQ1. It was used as the Test1 set for evaluating the AGIMA-Score models.
(4) 'test2.zip' -- It is a secondary dataset that originates from CSAR-HiQ2. It was used as the Test2 set for evaluating the AGIMA-Score models.
(5) 'indexes.zip' -- It includes the labels (binding affinity data) for the complexes in sets (1) to (4) above.
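The removal step mentioned under (1) can be sketched as a simple ID filter. This is a minimal illustration, not the authors' procedure: it assumes the complex IDs of each set have already been collected as strings, that all sets use the same kind of ID, and that comparison should be case-insensitive. The function name `training_ids` and the placeholder IDs are hypothetical.

```python
def training_ids(refined_ids, *held_out_sets):
    """Return refined-set IDs that appear in none of the held-out sets."""
    held_out = set()
    for ids in held_out_sets:
        # Assumption: IDs are compared case-insensitively.
        held_out.update(i.lower() for i in ids)
    # Preserve the original order of the refined set.
    return [i for i in refined_ids if i.lower() not in held_out]


# Made-up placeholder IDs, not real entries from the datasets.
refined = ["aaaa", "bbbb", "cccc", "dddd"]
validation = ["bbbb"]
test1 = ["DDDD"]
test2 = []
print(training_ids(refined, validation, test1, test2))  # ['aaaa', 'cccc']
```

Passing the validation, Test1 and Test2 ID lists together keeps the filter in one place, so the training set cannot accidentally retain a complex from any evaluation set.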
Each file 'xxxx_atm_prop.txt' describes one protein-ligand complex in the above sets, where 'xxxx' is the original complex ID in PDBbind and the data fields carry the following information. Note that each row in such a file represents one atom in the binding complex.
--------------------------------------------------------------------------------
id - atom id, numbered separately for protein atoms and ligand atoms, each starting from 1 (integer)
atmnum - atomic number (integer)
x,y,z - the X, Y, Z coordinates for the atom (float)
atmB,atmC,atmN,atmO,atmP,atmS,atmSe - whether the atom is of some specific type, such as B, C, N, O, P, S and Se (binary)
atmHalogen - whether the atom is a halogen atom (binary)
atmMetal,atmMetallic - whether the atom is metal (binary)
hybridization - hybridization type of the atom (integer)
heavyneighbors - number of heavy-atom neighbors (integer)
heteroneighbors - number of hetero-atom neighbors (integer)
hydrophobic,aromatic,acceptor,donor,ring - pharmacophoric properties of the atom (binary)
partialCH - partial charge of the atom (float)
posionizable,negionizable - whether the atom is positively ionizable or negatively ionizable (binary)
exlvolume - excluded volume of the atom (float)
vdwrad - VDW radius of the atom (float)
moltype - molecule the atom belongs to (0 for protein and 1 for ligand)
"neighbors(nbr:idx--anum--(sbond,dbond,tbond,arombond,ringbond))" - information on the covalently bonded neighboring atoms of the atom
--------------------------------------------------------------------------------
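Because the 'id' field starts from 1 for both protein and ligand atoms, an atom is only identified uniquely by the pair (moltype, id). The sketch below illustrates this on records that are already parsed; the on-disk layout of the files (delimiter, header row) is not described here, so reading the file is left out. The `Atom` tuple, which holds only three of the fields, and both helper functions are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical record holding three of the documented fields.
Atom = namedtuple("Atom", ["id", "moltype", "atmnum"])


def index_atoms(atoms):
    """Map (moltype, id) -> atom; raise if a key is repeated."""
    index = {}
    for atom in atoms:
        key = (atom.moltype, atom.id)
        if key in index:
            raise ValueError(f"duplicate atom key {key}")
        index[key] = atom
    return index


def split_by_molecule(atoms):
    """Separate protein atoms (moltype 0) from ligand atoms (moltype 1)."""
    protein = [a for a in atoms if a.moltype == 0]
    ligand = [a for a in atoms if a.moltype == 1]
    return protein, ligand


# Made-up rows: two protein atoms and two ligand atoms, both numbered from 1.
atoms = [Atom(1, 0, 7), Atom(2, 0, 6), Atom(1, 1, 6), Atom(2, 1, 8)]
index = index_atoms(atoms)
protein, ligand = split_by_molecule(atoms)
print(len(index), len(protein), len(ligand))  # 4 2 2
print(index[(1, 1)].atmnum)  # 6
```

Keying on 'id' alone would silently overwrite protein atoms with ligand atoms that share the same number, which is why the composite key is used.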
Files (838.1 MB)
Checksum | Size |
---|---|
md5:dded25c84b084dde54032487527b87ec | 23.0 kB |
md5:6269fea896ad50e43742a004e2729692 | 22.9 MB |
md5:2dd1ac05cd11595ea03e0d28b344b4a8 | 22.9 MB |
md5:0d51494704c609014c2928a0479bbf5c | 759.7 MB |
md5:44b161aae2c992f551e6c86457bd2f82 | 32.6 MB |
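The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the caller supplies the local path together with the checksum string shown for that file on the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    # Drop an optional 'md5:' prefix and ignore letter case.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters for the largest archive (759.7 MB).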