Published June 2022 | Version v1
Dataset
Open
Data to "Contrastive learning on protein embeddings enlightens midnight zone"
Contributors
Researcher (5):
Description
This is a backup of the data shared as part of the publication "Contrastive learning on protein embeddings enlightens midnight zone" and the associated GitHub repository.
- ProtTucker_ProtT5.pt holds the model weights of ProtTucker.
- prottucker_training_embeddings.tar.gz holds the input used to train ProtTucker (i.e., ProtT5 embeddings).
- cath-domain-list.txt holds the CATH annotations (labels) used for training ProtTucker.
- pdb_seqres_161121.h5 holds pre-computed ProtTucker embeddings for the PDB; useful as a lookup database.
- scope_2.08_S100.h5 is the same as pdb_seqres_161121.h5, but for SCOPe.
- sprot_161121.h5 is the same as pdb_seqres_161121.h5, but for SwissProt.
- cath_v430_dom_seqs_S100_161121.h5 is the same as pdb_seqres_161121.h5, but for CATH v4.3.0.
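The description calls the pre-computed `.h5` files useful as lookup databases. A minimal sketch of such a lookup is given below. It rests on assumptions not stated in this record: that each `.h5` file stores one 1-D dataset per protein at the top level, keyed by its identifier, and that Euclidean distance is an appropriate way to compare embeddings. The function names `load_embeddings` and `nearest_neighbours` are hypothetical and are not part of the ProtTucker repository.

```python
import math


def load_embeddings(h5_path):
    """Read every top-level dataset of an HDF5 file into {identifier: vector}.

    Assumption: one 1-D dataset per protein, keyed by its identifier.
    """
    import h5py  # third-party; imported lazily so the lookup below runs without it

    embeddings = {}
    with h5py.File(h5_path, "r") as handle:
        for identifier, dataset in handle.items():
            # Skip groups; only plain datasets are treated as embeddings.
            if isinstance(dataset, h5py.Dataset):
                embeddings[identifier] = dataset[()].tolist()
    return embeddings


def nearest_neighbours(query, embeddings, k=1):
    """Return the k (identifier, distance) pairs closest to `query`.

    Assumption: Euclidean distance is used to compare embeddings.
    """
    scored = [
        (identifier, math.dist(query, vector))
        for identifier, vector in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

With a downloaded file, `nearest_neighbours(query_vector, load_embeddings("pdb_seqres_161121.h5"), k=5)` would return the five closest entries, provided the file really follows the assumed layout. For the larger files, a vectorised NumPy implementation would be considerably faster than this pure-Python loop.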
Files
(9.4 GB)
| Checksum | Size |
|---|---|
| md5:6df99c76b576d7fda8c47f0d43187ce1 | 37.1 MB |
| md5:2025cdd39c697de8b3cefffecfe35e09 | 299.9 MB |
| md5:0dcb4e99bdae08229f64a2d187f65bdf | 1.6 GB |
| md5:d558d1df229cf5ee0f864908ebc0500a | 4.7 MB |
| md5:170a4ca81e258371c9afabbdfaa438e4 | 5.9 GB |
| md5:3cdbcd8362c757b7ea7af9ee993a0863 | 225.5 MB |
| md5:0934353fd63182b66e862279c7985eb6 | 1.4 GB |
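Because several of these files are multiple gigabytes in size, it may be worth checking a download against its listed MD5 value. The sketch below uses only the Python standard library. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative, and since the listing above does not show which checksum belongs to which file, that pairing has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use flat even for multi-GB files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:6df9...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_checksum(local_path, "md5:...")` returns `True` only if the local copy is byte-identical to the uploaded file.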
Additional details
Dates
- Accepted: 2022-06