Published September 24, 2025 | Version v3
Dataset
Open
GEMS: Resolving Data Bias Improves Generalization in Binding Affinity Prediction
Authors/Creators
Description
For fast reproduction of our results, we provide PyTorch datasets of precomputed interaction graphs for the entire PDBbind database. To make it quick to set up leakage-free evaluations with PDBbind, we also provide pairwise similarity matrices for the entire PDBbind dataset.
Version 2 - Updated to improve the accuracy of the Tanimoto scores in the pairwise similarity matrices, which also caused minor changes in the composition of PDBbind CleanSplit.
Version 3 - Added a pairwise similarity matrix for sequence identity (from TM-align).
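The idea of using a pairwise similarity matrix to build a leakage-free evaluation setup can be illustrated with a minimal sketch. It assumes the similarities are available as a square NumPy array indexed by complex; the on-disk format of the provided matrices, the function name `filter_training_set`, the toy values, and the threshold of 0.8 are all assumptions made for illustration, not the criteria used to build PDBbind CleanSplit.

```python
import numpy as np


def filter_training_set(similarity, train_idx, test_idx, threshold):
    """Keep training complexes whose highest similarity to any test
    complex is strictly below `threshold` (hypothetical helper)."""
    train_idx = np.asarray(train_idx)
    test_idx = np.asarray(test_idx)
    # Sub-matrix with one row per training complex and one column per test complex.
    sub = similarity[np.ix_(train_idx, test_idx)]
    keep = sub.max(axis=1) < threshold
    return train_idx[keep]


# Toy symmetric similarity matrix for four complexes (made-up values).
similarity = np.array([
    [1.0, 0.2, 0.9, 0.1],
    [0.2, 1.0, 0.3, 0.4],
    [0.9, 0.3, 1.0, 0.2],
    [0.1, 0.4, 0.2, 1.0],
])

# Complex 2 is the test set; complex 0 is too similar to it and is dropped.
kept = filter_training_set(similarity, train_idx=[0, 1, 3], test_idx=[2], threshold=0.8)
print(kept.tolist())
```

In practice one would likely combine several similarity measures (for example ligand Tanimoto scores and protein similarity) before deciding which complexes to exclude; the single-matrix filter above only shows the basic mechanism.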
Files
pairwise_similarity_complexes.json
Total size: 13.8 GB
| File (MD5 checksum) | Size |
|---|---|
| md5:0ccae0515633787958b3a467944c31dc | 10.6 GB |
| md5:752f7cf36edc58900e184d934d9d5d75 | 155.5 kB |
| md5:b1b4d14146e560ff38d824ca22404c64 | 3.2 GB |
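Because the files are large, it can be worth checking a download against its listed MD5 checksum before use. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the demonstration uses a small temporary file rather than one of the actual downloads.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Demonstration on a small temporary file (not one of the real downloads).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name

reference = "md5:" + hashlib.md5(b"example payload").hexdigest()
ok_match = matches_checksum(demo_path, reference)
ok_mismatch = matches_checksum(demo_path, "md5:" + "0" * 32)
os.remove(demo_path)

print(ok_match, ok_mismatch)
```

For a real download, pass the local path of the file together with the checksum string listed for it, for example `matches_checksum(local_path, "md5:0ccae0515633787958b3a467944c31dc")` for the 10.6 GB file.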
Additional details
Dates
- Updated
- 2025-09-24: Version 3 - Added a pairwise similarity matrix for sequence identity (from TM-align)
Software
- Repository URL
- https://github.com/camlab-ethz/GEMS