Published March 20, 2024 | Version v1
Dataset
Open
Data from: Genome-scale annotation of protein binding sites via language model and geometric deep learning
Authors/Creators
Description
The dataset contains the training and test sets of protein binding sites for DNA, RNA, peptide, protein, ATP, HEM, Zn2+, Ca2+, Mg2+ and Mn2+. Each protein is described by three lines giving, in order, the protein name (PDB accession code and chain), the sequence, and the per-residue labels (0 for non-binding, 1 for binding). The ESMFold-predicted structures are also provided.
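The three-line record layout described above can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: the name line might carry a leading `>` (FASTA style), which is stripped if present; the label line is assumed to be an unbroken string of `0` and `1` characters with the same length as the sequence; and the example record (`1abcA`) is made up for illustration, not taken from the dataset. The names `BindingSiteRecord` and `parse_records` are hypothetical helpers, not part of GPSite.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class BindingSiteRecord:
    name: str          # PDB accession code and chain, as written in the file
    sequence: str      # amino-acid sequence
    labels: List[int]  # 1 = binding residue, 0 = non-binding residue


def parse_records(lines: Iterable[str]) -> Iterator[BindingSiteRecord]:
    """Yield one record per group of three non-blank lines."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % 3 != 0:
        raise ValueError("expected a multiple of 3 non-blank lines")
    for i in range(0, len(cleaned), 3):
        name, sequence, label_line = cleaned[i:i + 3]
        name = name.lstrip(">")  # assumption: an optional FASTA-style prefix
        if len(sequence) != len(label_line):
            raise ValueError(f"{name}: sequence and label lengths differ")
        if set(label_line) - {"0", "1"}:
            raise ValueError(f"{name}: labels must be 0 or 1")
        yield BindingSiteRecord(name, sequence, [int(c) for c in label_line])


# Made-up record used only to exercise the parser.
example = """>1abcA
MKTAY
01100
"""
records = list(parse_records(example.splitlines()))
binding_positions = [i for i, v in enumerate(records[0].labels) if v == 1]
```

For a real file, passing an open file handle to `parse_records` works the same way, since a text file iterates line by line. The length and alphabet checks are there to fail early if the actual files deviate from the assumed layout.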
Files
| Name | Size | Checksum |
|---|---|---|
| GPSite_dataset.zip | 482.2 MB | md5:a9f311ae1ad580d4eed26639a6d43941 |
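After downloading, the archive can be checked against the MD5 checksum listed for it. The following is a minimal sketch using only the Python standard library; it assumes the file was saved locally as `GPSite_dataset.zip`, and the helper names `md5_of_file` and `verify_archive` are hypothetical.

```python
import hashlib

# Checksum listed for the archive on this record.
EXPECTED_MD5 = "a9f311ae1ad580d4eed26639a6d43941"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_archive("GPSite_dataset.zip")` should return `True` for an intact download; a `False` result suggests a truncated or corrupted file.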
Additional details
Software
- Repository URL: https://github.com/biomed-AI/GPSite
- Programming language: Python