There is a newer version of the record available.

Published September 20, 2024 | Version v1
Dataset Open

Exploring zero-shot structure-based protein fitness prediction

Description

This repository contains the data used in "Exploring zero-shot structure-based protein fitness
prediction".

Directions for using this data are available in our GitHub repository.

  1. experimental_struct_artifacts contains the experimentally determined structures for the ProteinGym assays used in our analysis, along with the reference file needed to generate ESM inverse folding predictions for these structures in ProteinGym.
  2. results contains the predictions obtained by running SSEmb on the 216 ProteinGym assays considered in this study.
  3. test.tar.gz contains all the structures from ProteinGym as well as MSAs generated using MMseqs2. To use this archive:
    • Set up SSEmb as directed in its repository.
    • Download this file and extract it into the data folder.
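The extraction step for test.tar.gz can be sketched in Python as follows. This is a minimal sketch, not the authors' procedure: the helper name extract_archive is hypothetical, and the location of the data folder inside an SSEmb checkout is an assumption that should be checked against the SSEmb repository.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, data_dir):
    """Extract a .tar.gz archive into data_dir and list what landed there.

    extract_archive is a hypothetical helper; data_dir is assumed to be
    the SSEmb data folder, whose exact location is not given here.
    """
    data_dir = Path(data_dir)
    # Create the target folder if it does not exist yet.
    data_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(data_dir)
    # Return the top-level entries so the caller can inspect the result.
    return sorted(p.name for p in data_dir.iterdir())
```

A call such as extract_archive("test.tar.gz", "SSEmb/data") would then unpack the archive, where SSEmb/data is an assumed path.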

Files

Files (184.1 MB)

Checksum (MD5)                        Size
md5:57403b36f948a9ea4abd4fe6f42eb771  13.0 MB
md5:d24f26bbdc8321a0188fd889b9fe9c57  45.6 MB
md5:862d79a927deb21b030ac0259018e376  125.4 MB
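A downloaded file can be checked against the MD5 checksums listed above before use. The following is a minimal sketch using only the Python standard library; the function names md5_of_file and matches_checksum are hypothetical, and which checksum belongs to which file must be read off the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value.

    The expected value may carry the "md5:" prefix used in the listing.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum("test.tar.gz", "md5:...") returns True only if the local file is byte-for-byte identical to the published one, with the elided value taken from the listing.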

Additional details