Published September 24, 2025 | Version v3
Dataset Open

GEMS: Resolving Data Bias Improves Generalization in Binding Affinity Prediction

Authors/Creators

Contributors

Data manager:

Researcher:

  • 1. ETH Zürich

Description

For fast reproduction of our results, this Zenodo record provides PyTorch datasets of precomputed interaction graphs for the entire PDBbind database. To make it quick to set up leakage-free evaluation with PDBbind, it also provides pairwise similarity matrices for the entire PDBbind dataset.

Version 2 - Updated to improve the accuracy of the Tanimoto scores in the pairwise similarity matrices, which also caused minor changes in the composition of PDBbind CleanSplit.

Version 3 - Added a pairwise similarity matrix for sequence identity (from TM-align).

Files

pairwise_similarity_complexes.json

Files (13.8 GB)

Checksum Size
md5:0ccae0515633787958b3a467944c31dc 10.6 GB
md5:752f7cf36edc58900e184d934d9d5d75 155.5 kB
md5:b1b4d14146e560ff38d824ca22404c64 3.2 GB

Additional details

Dates

Updated
2025-09-24
Version 3 - Added a pairwise similarity matrix for sequence identity (from TM-align)

Software