Cryo-EM and X-ray crystallography ligands represented as 3D voxel grids for training deep learning models
Description
Ligand datasets used to train and evaluate the models studied in "Ligand Identification using Deep Learning" by Karolczak, J. et al.
The blobs_full.tar.gz and cryoem_blobs.zip archives contain compressed 3D NumPy arrays (*.npz) of all ligand blobs extracted from X-ray and cryo-EM PDB deposits, prior to quality filtering. Each npz file name encodes the PDB ID, chain, residue number, and ligand name of the extracted blob. The cmb_data.csv file contains the tabular data used to train the CheckMyBlob model. The X-ray data were later divided into training and testing subsets, listed in xray_train.csv and xray_holdout.csv, respectively. The ligand_mapping.csv file maps ligand IDs to ligand group names. Finally, the cryoem_qscores.csv file contains the Q-scores that were used to filter the cryo-EM ligands.
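As a minimal sketch of how an extracted blob might be inspected, the snippet below opens one `.npz` file with NumPy and reports what it holds. The helper names `load_blob` and `describe_blob` are hypothetical, and the names of the arrays stored inside each archive are not stated in this description, so the code discovers them at run time rather than assuming a key.

```python
import numpy as np


def load_blob(path):
    """Return a dict mapping stored array names to arrays for one .npz file.

    The key names inside the archive are not documented here, so they are
    read from the archive itself instead of being hard-coded.
    """
    with np.load(path) as archive:
        # Indexing the archive reads each array fully into memory,
        # so the result remains valid after the file is closed.
        return {name: archive[name] for name in archive.files}


def describe_blob(path):
    """Return {array_name: (shape, dtype_name)} for a quick sanity check."""
    return {
        name: (array.shape, str(array.dtype))
        for name, array in load_blob(path).items()
    }
```

Calling `describe_blob` on a file extracted from either archive should show whether it contains a single 3D grid and which dtype it uses, which is worth checking before building a data loader around it.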
Files (19.4 GB)
| MD5 checksum | Size |
|---|---|
| md5:cf0851e7722848e487ab7a92afcaab37 | 12.0 GB |
| md5:13c1c0f23115a15ad9ae973678d34f02 | 508.8 MB |
| md5:44194c24a72631f590a24c3699a395c9 | 6.9 GB |
| md5:cd82e1b1b1668ec17816a75db0de3cc3 | 13.9 MB |
| md5:a04d4273f51d7311213e0c667cec467b | 470.7 kB |
| md5:7f4a9b05273cc0d388a2fac4160f1feb | 6.7 MB |
| md5:ea40b376028fcac9fac38a23b45b4ecb | 15.6 MB |
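
Because the largest archives are several gigabytes, it may be worth checking a download against the MD5 values listed above before extracting it. The following is a small sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_listed_md5` are illustrative and not part of the dataset.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's digest with a listed value such as 'md5:<hex digest>'."""
    # Accept the value with or without the leading 'md5:' label.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A mismatch usually indicates an interrupted or corrupted download, in which case the file should be fetched again.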