Protein Structure Datasets for Protein Workshop
Description
Raw and processed datasets used in the ProteinWorkshop representation learning benchmark.
The archive includes datasets from:
* The Antibody Developability dataset from Chen et al. (https://doi.org/10.1101/2020.06.18.159798)
* CATH from Ingraham et al. (https://www.mit.edu/~vgarg/GenerativeModelsForProteinDesign.pdf)
* CCPDB datasets from Agrawal et al. (https://doi.org/10.1093/database/bay142)
* The Deep Sea Proteins dataset from Sieg et al. (https://doi.org/10.1002/prot.26337)
* Reaction Class prediction from Hermosilla et al. (https://doi.org/10.48550/arXiv.2007.06252)
* FoldClassification from Hou et al. (https://doi.org/10.1093/bioinformatics/btx780)
* MaSIF Site dataset from Gainza et al. (https://doi.org/10.1038/s41592-019-0666-6)
* Metal3D dataset from Duerr et al. (https://doi.org/10.1038/s41467-023-37870-6)
* Post-translational Modification dataset from Yan et al. (https://doi.org/10.1016/j.crmeth.2023.100430)
Files
(16.0 GB)
File (MD5 checksum) | Size |
---|---|
md5:2d7ad11284bbd1c561f3c828a38d29cc | 188.1 MB |
md5:d1c77941ad390660ddcfee948a4f7b3f | 4.3 GB |
md5:b64e73ede0550212ab220dce446242b4 | 462.6 MB |
md5:cd17fd7230f710f70cea5162ec73a784 | 257.4 MB |
md5:8a201370939453ed86847c923c7cd48d | 1.2 GB |
md5:810fc8b24c6fb6b887f6bd4fc7389838 | 1.6 GB |
md5:a59a559aceb265d8b8b9e15211a864f1 | 989.1 MB |
md5:d65187f457449f977a84eb226d98fc79 | 287.2 MB |
md5:72bc625f68f874dc4702229fae991372 | 128.8 MB |
md5:314839e6073a8d2f289bd89bbd42c9e1 | 6.5 GB |
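
Because several of these archives are multiple gigabytes, it can be worth checking a download against the listed MD5 checksums before unpacking it. The following is a minimal sketch using only the Python standard library. The checksums are copied from the listing above; the helper names `md5sum` and `is_listed` are illustrative assumptions, and because the file names are not shown in the listing, the check is keyed by checksum rather than by name.

```python
import hashlib

# MD5 checksums copied from the file listing above.
# File names are not shown there, so we only test membership by checksum.
EXPECTED_MD5 = {
    "2d7ad11284bbd1c561f3c828a38d29cc",
    "d1c77941ad390660ddcfee948a4f7b3f",
    "b64e73ede0550212ab220dce446242b4",
    "cd17fd7230f710f70cea5162ec73a784",
    "8a201370939453ed86847c923c7cd48d",
    "810fc8b24c6fb6b887f6bd4fc7389838",
    "a59a559aceb265d8b8b9e15211a864f1",
    "d65187f457449f977a84eb226d98fc79",
    "72bc625f68f874dc4702229fae991372",
    "314839e6073a8d2f289bd89bbd42c9e1",
}


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 digest matches one of the listed checksums."""
    return md5sum(path) in EXPECTED_MD5
```

For a downloaded archive saved at a path of your choosing, `is_listed(path)` returning `False` suggests a truncated or corrupted download, and re-fetching the file is the usual remedy.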