Published September 20, 2024 | Version v2

Exploring zero-shot structure-based protein fitness prediction

Description

This repository contains the data used in "Exploring zero-shot structure-based protein fitness prediction".

Directions for using this data can be found in our GitHub repository.

  1. experimental_struct_artifacts contains the experimentally determined structures for the ProteinGym assays used in our analysis, along with the reference file needed to generate ESM inverse folding predictions for these structures in ProteinGym.
  2. results contains the prediction results obtained by running SSEmb on the 216 ProteinGym assays considered in this study.
  3. test.tar.gz contains all the structures from ProteinGym, as well as MSAs generated using MMseqs2. To use it:
    • Set up SSEmb as directed in its repository.
    • Download this file and extract it into the data folder.
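
The extraction step above can be sketched in Python as follows. This is a minimal sketch and not part of the SSEmb tooling: the helper name extract_archive is hypothetical, and the destination "data" in the usage comment is assumed to be the data folder mentioned above, relative to wherever SSEmb was set up. Adjust both paths to your own layout.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir; return its top-level entry names.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            target = (dest / member.name).resolve()
            # Refuse entries that would be written outside dest_dir.
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    # Collect the first path component of every entry.
    top_level = set()
    for member in members:
        parts = Path(member.name).parts
        if parts:
            top_level.add(parts[0])
    return sorted(top_level)


# Assumed usage, run from the SSEmb checkout after downloading the file:
# extract_archive("test.tar.gz", "data")
```

The returned list of top-level names gives a quick way to confirm what the archive placed in the data folder before running SSEmb.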

Files (184.1 MB)

Checksum                               Size
md5:1ebb71e59e435d2ff2d2d9013667a016   13.0 MB
md5:1ead8dc1d95ae92b77504fe19a531d0d   45.6 MB
md5:862d79a927deb21b030ac0259018e376   125.4 MB
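
The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names md5_of_file and matches_listing are hypothetical, and which checksum belongs to which file is not stated in this listing, so the pairing in the usage comment is a placeholder to be filled in from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.removeprefix("md5:").lower()
    return md5_of_file(path) == expected


# Assumed usage; replace the path and checksum with a matching pair
# taken from the record page:
# matches_listing("test.tar.gz", "md5:<checksum listed for that file>")
```

Reading in chunks keeps memory use flat even for the largest file in the record.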

Additional details