Published August 27, 2023 | Version 0.0.01
Dataset Open

Protein Structure Datasets for Protein Workshop

Creators

  • 1. University of Cambridge

Description

Raw and processed datasets used in the ProteinWorkshop representation learning benchmark.

Includes datasets from:

* The Antibody Developability dataset from Chen et al. (https://doi.org/10.1101/2020.06.18.159798)

* CATH from Ingraham et al. (https://www.mit.edu/~vgarg/GenerativeModelsForProteinDesign.pdf)

* CCPDB datasets from Agrawal et al. (https://doi.org/10.1093/database/bay142)

* The Deep Sea Proteins dataset from Sieg et al. (https://doi.org/10.1002/prot.26337)

* Reaction Class prediction from Hermosilla et al. (https://doi.org/10.48550/arXiv.2007.06252)

* FoldClassification from Hou et al. (https://doi.org/10.1093/bioinformatics/btx780)

* MaSIF Site dataset from Gainza et al. (https://doi.org/10.1038/s41592-019-0666-6)

* Metal3D dataset from Duerr et al. (https://doi.org/10.1038/s41467-023-37870-6)

* Post-translational Modification dataset from Yan et al. (https://doi.org/10.1016/j.crmeth.2023.100430)

Files (16.0 GB)

MD5 checksum                            Size
md5:2d7ad11284bbd1c561f3c828a38d29cc    188.1 MB
md5:d1c77941ad390660ddcfee948a4f7b3f    4.3 GB
md5:b64e73ede0550212ab220dce446242b4    462.6 MB
md5:cd17fd7230f710f70cea5162ec73a784    257.4 MB
md5:8a201370939453ed86847c923c7cd48d    1.2 GB
md5:810fc8b24c6fb6b887f6bd4fc7389838    1.6 GB
md5:a59a559aceb265d8b8b9e15211a864f1    989.1 MB
md5:d65187f457449f977a84eb226d98fc79    287.2 MB
md5:72bc625f68f874dc4702229fae991372    128.8 MB
md5:314839e6073a8d2f289bd89bbd42c9e1    6.5 GB
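
The MD5 checksums in the file listing can be used to confirm that a download arrived intact. The following is a minimal sketch in Python, not part of the record itself. The helper names `md5_of_file` and `matches_listing` are hypothetical, as is the archive file name in the usage comment, because the file names are not preserved in this listing. The file is read in chunks so that the multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Hypothetical usage; substitute the real name of the downloaded file:
# ok = matches_listing("downloaded_archive.tar.gz",
#                      "md5:2d7ad11284bbd1c561f3c828a38d29cc")
```

A mismatch usually indicates an interrupted or corrupted download, in which case the file should be fetched again before use.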