Published March 28, 2024 | Version 2.0.0
Software Open

Preprocessed kinodata-3D for PyTorch Geometric & kinase affinity prediction models

Description

Project Description

Modern drug discovery pipelines rely on machine learning models to explore and evaluate large chemical spaces. Although including 3D information about protein-ligand complexes is considered beneficial, structure-based ML for affinity prediction suffers from data scarcity.

We provide the preprocessed kinodata-3D dataset together with the trained models used in our case study on structure-based ML for kinase activity prediction. Both are intended for use with the kinodata-3D binding affinity prediction repository, e.g., to reproduce our trained models or to train your own PyTorch Geometric models on kinodata-3D.

Data

1. Preprocessed dataset

We publish this preprocessed version of kinodata-3D to ensure reproducibility. Extract the archive kinodata3d_preprocessed.zip into the root directory of the kinodata-3D binding affinity prediction repository, which also contains further usage instructions.
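As a minimal sketch of the extraction step, the following uses only Python's standard zipfile module. The function name extract_into_repo and the example paths are illustrative and not part of the repository. The internal layout of the archive is not described in this record, so the function simply reports which entries it unpacked.

```python
import zipfile
from pathlib import Path


def extract_into_repo(archive_path, repo_root):
    """Extract a zip archive into repo_root and return the archive's entry names.

    Both arguments are local paths chosen by the caller; nothing here
    assumes a particular directory layout inside the archive.
    """
    archive_path = Path(archive_path)
    repo_root = Path(repo_root)
    if not repo_root.is_dir():
        raise NotADirectoryError(f"{repo_root} is not an existing directory")
    with zipfile.ZipFile(archive_path) as archive:
        # extractall() recreates the archive's directory tree below repo_root.
        archive.extractall(repo_root)
        return archive.namelist()
```

For example, extract_into_repo("kinodata3d_preprocessed.zip", "path/to/repository") would unpack the dataset, where path/to/repository is a placeholder for your local checkout of the repository.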

Note that this archive also contains the exact data splits used in the training and evaluation of our models as CSV files.
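Because this record does not state the names or locations of the split files, one way to find them after extraction is to search the extracted tree for CSV files and inspect their headers. The sketch below does this with the standard library; summarize_csv_files is a hypothetical helper, and it makes no assumption about column names.

```python
import csv
from pathlib import Path


def summarize_csv_files(root):
    """Map each CSV file below root to (column_names, number_of_data_rows)."""
    root = Path(root)
    summary = {}
    for csv_path in sorted(root.rglob("*.csv")):
        with csv_path.open(newline="") as handle:
            reader = csv.reader(handle)
            # The first row is treated as the header; an empty file yields [].
            header = next(reader, [])
            n_rows = sum(1 for _ in reader)
        summary[csv_path.relative_to(root).as_posix()] = (header, n_rows)
    return summary
```

Printing the returned dictionary gives a quick overview of which files look like split definitions and how many entries each contains.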

2. Pretrained models

If you intend to use our pretrained models in addition to the dataset, also extract the archive kinodata3d_models.zip into the root directory of the kinodata-3D binding affinity prediction repository. Instructions on how to use the pretrained models can be found here.

Files


Files (3.8 GB)

Size Checksum
2.7 GB md5:f68145087d8eae97f6f7c8dd602376a5
1.0 GB md5:a85d82bfe14658af08ecd2a3f801f6fe
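The MD5 checksums listed above can be used to confirm that a download completed intact. The following sketch hashes a file in chunks, so that multi-gigabyte archives are never loaded into memory at once. The helper names are illustrative. Since the listing here does not show which checksum belongs to which archive, the check only tests membership in the set of listed values.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record. Which archive each
# one belongs to is not stated there, so they are kept as an unordered set.
LISTED_MD5 = {
    "f68145087d8eae97f6f7c8dd602376a5",
    "a85d82bfe14658af08ecd2a3f801f6fe",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """True if the file's MD5 digest equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

A downloaded archive for which matches_listed_checksum returns False was most likely truncated or corrupted in transit and should be downloaded again.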

Additional details

Dates

Updated
2024-03-28