Published March 18, 2022 | Version 1.0.0
Dataset | Open

Convolutions are competitive with transformers for protein sequence pretraining

Description

- Pretrained protein sequence models. See https://github.com/microsoft/protein-sequence-models for instructions on loading them.

- IDR datasets used for evaluation. 

- March 2020 version of UniRef50 with splits used for training. 

Files

Preview: cdc28_binding.csv

Files (11.8 GB)

Checksum | Size
md5:a5fdc095531e3742d1484d508b9c7c3c | 151.6 MB
md5:54701ac812658debb909f835927724e0 | 2.5 MB
md5:2601f6b9971fe5ea1733067eb87a3398 | 2.6 GB
md5:68e67c7ecb8c300e1c652c4ae091d218 | 303.1 MB
md5:fd0ef9a4415167e5fc0afa50b3204c57 | 631.7 kB
md5:62f16ce55c55cad4a75d8cfdbff945ac | 637.1 kB
md5:e2278f1d93836e8309cf4192b8455828 | 8.7 GB
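
Each file in the listing is published with an md5: checksum, so a download can be checked for corruption before use. The following is a minimal sketch of such a check using only the Python standard library. The function names md5_of_file and matches_checksum are illustrative and not part of the record or the linked repository. Because the listing does not say which file name belongs to which checksum, the script takes the local path and the expected checksum as command-line arguments rather than assuming a pairing.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


if __name__ == "__main__":
    # Usage (hypothetical script name): python verify_md5.py <path> <md5:...>
    if len(sys.argv) != 3:
        sys.exit("usage: verify_md5.py <path> <md5:checksum>")
    file_path, expected_checksum = sys.argv[1], sys.argv[2]
    if matches_checksum(file_path, expected_checksum):
        print("OK: checksum matches")
    else:
        sys.exit("MISMATCH: file may be corrupted or incomplete")
```

The checksum string can be copied from the listing as is, including the md5: prefix. A mismatch usually indicates an interrupted or corrupted download, in which case the file should be fetched again.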