Preprocessed kinodata-3D for PyTorch Geometric & kinase affinity prediction models
Description
Project Description
Drug discovery pipelines increasingly rely on machine learning (ML) models to explore and evaluate large chemical spaces. Although including 3D complex information is considered beneficial, structural ML for affinity prediction suffers from data scarcity.
We provide here the preprocessed kinodata-3D dataset and the trained models used in our case study on structure-based ML for kinase activity prediction. Both are intended for use with the kinodata-3D binding affinity prediction repository, for example to reproduce our trained models or to train your own PyTorch Geometric models on kinodata-3D.
Data
1. Preprocessed dataset
We publish this preprocessed version of kinodata-3D to support reproducibility. The archive kinodata3d_preprocessed.zip should be extracted into the root directory of the kinodata-3D binding affinity prediction repository, which also contains further usage instructions.
Note that this archive also contains the exact data splits used in the training and evaluation of our models as CSV files.
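The extraction step can be sketched in Python as follows. This is a minimal sketch, not part of the repository: the helper names `extract_archive` and `list_csv_members` are hypothetical, and the paths in the usage comment are placeholders for wherever you downloaded the archive and cloned the repository. The layout of the archive (including where the split CSV files live) is not assumed; the second helper simply filters member names by extension.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, repo_root):
    """Extract a zip archive into repo_root and return its member names.

    Hypothetical helper: repo_root should be the root directory of your
    local clone of the affinity prediction repository.
    """
    repo_root = Path(repo_root)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(repo_root)
        return archive.namelist()


def list_csv_members(member_names):
    """Return the members ending in .csv, e.g. to locate the data splits."""
    return [name for name in member_names if name.lower().endswith(".csv")]


# Usage (placeholder paths, adjust to your setup):
# members = extract_archive("kinodata3d_preprocessed.zip", "path/to/repository")
# for name in list_csv_members(members):
#     print(name)
```

The same helper would apply to any other zip archive that is meant to be unpacked into the repository root.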
2. Pretrained models
If you intend to use our pretrained models in addition to the dataset, the archive kinodata3d_models.zip should also be extracted into the root directory of the kinodata-3D binding affinity prediction repository. Instructions on how to use the pretrained models can be found here.
Files
Total size: 3.8 GB
MD5 checksum | Size |
---|---|
md5:f68145087d8eae97f6f7c8dd602376a5 | 2.7 GB |
md5:a85d82bfe14658af08ecd2a3f801f6fe | 1.0 GB |
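Before extracting a downloaded archive, it can be worth comparing its MD5 checksum against the value listed for that file above. The following is a minimal sketch using only Python's standard library; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the path and checksum in the usage comment are placeholders. The file is read in chunks so that multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value.

    The expected value may carry the "md5:" prefix used in the file listing.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (placeholder path and checksum, copy the checksum listed for your file):
# ok = matches_checksum("path/to/downloaded_archive.zip", "md5:<checksum>")
# print("checksum OK" if ok else "checksum mismatch, consider re-downloading")
```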
Additional details
Dates
- Updated: 2024-03-28