PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery
Description
Prediction of protein-protein binding (PPB) affinity plays an important role in large-molecule drug discovery. Deep learning (DL) has been adopted to predict the change in PPB affinity upon mutation, but studies predicting the PPB affinity itself remain scarce. The major reason is the paucity of open-source datasets on PPB affinity. The current study therefore introduces and releases a PPB affinity dataset (PPB-Affinity), intended to support the development of DL models that predict PPB affinity. The PPB-Affinity dataset contains key information such as crystal structures of protein-protein complexes (with or without protein mutation patterns), PPB affinity, receptor protein chain, ligand protein chain, etc. To the best of our knowledge, this is the largest publicly available PPB affinity dataset, and it may help the industry improve the screening efficiency of discovering new large-molecule drugs.
Code for preparing the PPB-Affinity database is available at https://github.com/Huatsing-Lau/PPB-Affinity-DataPrepWorkflow.
Code for the benchmark algorithm is available at https://github.com/ChenPy00/PPB-Affinity.
Files are organized as follows:
- PDB/
  - Affinity Benchmark v5.5/
    - file1.pdb
    - file2.pdb
    - ...
    - filek.pdb
  - ATLAS/
  - PDBbind v2020/
  - SAbDab/
  - SKEMPIv2.0/
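
To get a quick overview of the structure files after downloading, a minimal sketch along the following lines may help. It assumes that `PDB/` is the top-level folder and that each source dataset (Affinity Benchmark v5.5, ATLAS, and so on) is a subfolder holding `.pdb` files directly. The function name `index_pdb_files` and the extraction path are hypothetical and not part of the dataset or its repositories.

```python
from pathlib import Path


def index_pdb_files(root):
    """Map each source subfolder under `root` to its sorted .pdb file names.

    Assumes one subfolder per source dataset, with .pdb files placed
    directly inside it (an assumption about the layout, not a guarantee).
    """
    root = Path(root)
    index = {}
    for source_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Keep only regular files with a .pdb extension (case-insensitive).
        index[source_dir.name] = sorted(
            f.name
            for f in source_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".pdb"
        )
    return index


if __name__ == "__main__":
    # Hypothetical location: adjust to wherever the dataset was extracted.
    pdb_root = Path("PDB")
    if pdb_root.is_dir():
        for source, files in index_pdb_files(pdb_root).items():
            print(f"{source}: {len(files)} structure files")
```

Counting files per source folder this way is a simple check that the download is complete before pairing structures with their affinity records.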