Published March 7, 2024 | Version v1
Dataset Open

Graph Attention Site Prediction (GrASP) COACH420 Dataset

  • University of Maryland

Description

This record contains the modified version of the COACH420 dataset that was used to evaluate GrASP.

  1. unprocessed_pdb.zip: PDB structures from the original COACH420 dataset.
  2. ready_to_parse_mol2.zip: Protein and ligand structures after our additional processing was applied.
  3. raw.zip: NumPy arrays of the features used to construct PyTorch Geometric graphs.
  4. processed.zip: Processed protein graphs used as graph neural network inputs.
  5. mol2.zip: Protein structures with hydrogens removed and atoms renumbered accordingly. Atom indices match the node feature order in the NumPy and PyTorch files.
  6. coach420(mlig)_uniprot.pkl: Pickle file containing the UniProt ID for each receptor, used to define train/test splits.
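
As a rough illustration of how a receptor-to-UniProt mapping can inform a train/test split, the sketch below groups receptors by UniProt ID and drops test receptors whose protein also occurs in a training set. It assumes, without confirmation from this record, that the pickle holds a plain dict mapping receptor identifiers to UniProt IDs. The function names and the toy identifiers are hypothetical and are not taken from the GrASP code base. Only unpickle files from sources you trust.

```python
import pickle
from collections import defaultdict


def load_uniprot_map(path):
    """Load the pickle file.

    ASSUMPTION: the object is a dict {receptor_id: uniprot_id}.
    Inspect type(obj) after loading, since the real layout may differ.
    """
    with open(path, "rb") as fh:
        return pickle.load(fh)


def group_by_uniprot(receptor_to_uniprot):
    """Collect receptors that share the same UniProt ID."""
    groups = defaultdict(list)
    for receptor, uniprot in receptor_to_uniprot.items():
        groups[uniprot].append(receptor)
    return dict(groups)


def filter_test_set(test_map, train_uniprot_ids):
    """Keep test receptors whose UniProt ID is absent from the training IDs."""
    return sorted(r for r, u in test_map.items() if u not in train_uniprot_ids)


# Toy data with made-up identifiers, standing in for the real mapping.
toy_map = {"1abcA": "P00001", "2defB": "P00002", "3ghiC": "P00001"}
print(group_by_uniprot(toy_map))
print(filter_test_set(toy_map, {"P00001"}))
```

With the real file, `load_uniprot_map("coach420(mlig)_uniprot.pkl")` would replace the toy dict, provided the assumed layout holds.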

Files

Files (762.9 MB)

Checksum                                 Size
md5:659f10b2d7f066ba52e6163ff04ce6b8     6.5 kB
md5:abb9890f05916c84da4df782a4ff1680     14.1 MB
md5:db55201d4de0d46eebf32cc8a56f29ca     80.9 MB
md5:e6d747c80d29d04132bf5988fabc4ceb     620.8 MB
md5:e54ebce724bce6ea2b4e98581000985a     27.7 MB
md5:61742d77e1e0fe26da6971115d01daad     19.3 MB
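
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only Python's standard library. The helper names `md5_of_file` and `matches_checksum` are hypothetical. The pairing of each checksum with its file name should be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value.

    Accepts the value with or without the leading "md5:" prefix.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("raw.zip", "md5:<checksum shown for raw.zip>")` returns `True` only if the local archive matches the published digest.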

Additional details

Related works

Is supplement to
Publication: https://pubs.acs.org/doi/10.1021/acs.jcim.3c01698 (URL)

Software

Repository URL
https://github.com/tiwarylab/GrASP
Programming language
Python