Published October 17, 2023 | Version v1
Dataset Open

AlphaFold structures reported in "AlphaFold2 Can Predict Single-Mutation Effects"

Description

This record contains AlphaFold predictions for X proteins found in the Protein Data Bank (PDB) that were used to evaluate AlphaFold's predictions of mutation effects. It includes one set of structures predicted by AlphaFold2.0 using default settings, with one structure for each of the 5 models. It also includes structures predicted by the ColabFold version of AlphaFold (6 recycles, 5 models, no template, amber minimization, 4 repeats).

The record also contains additional predicted structures, for proteins found in the PDB, that were not analyzed in the paper.

There are AlphaFold predictions for three proteins (BFP / RFP, GFP, and PafA), covering either all (BFP/RFP, PafA) or a subset (GFP) of the sequences in three datasets of phenotype measurements from high-throughput experiments.

Results are separated into tar files based on whether the DeepMind (AF2.0) or the ColabFold implementation was used.

Folders under "ColabFold/PDB" are labelled according to a sequence ID, since multiple PDB structures can exist for a single sequence. These sequence IDs can be mapped back to PDB IDs using the information in "seq_id_pdb_id.json".
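The lookup described above could be done along the following lines. This is only a sketch: the internal layout of "seq_id_pdb_id.json" is not documented here, so the code assumes a JSON object whose keys are sequence IDs and whose values are either a single PDB ID or a list of PDB IDs. The function names load_seq_to_pdb and invert_mapping are illustrative and not part of the dataset.

```python
import json
from collections import defaultdict


def load_seq_to_pdb(json_path):
    """Load the sequence-ID -> PDB-ID mapping.

    ASSUMPTION: the file is a JSON object of the form
    {"<sequence ID>": ["<PDB ID>", ...]} or {"<sequence ID>": "<PDB ID>"}.
    Adjust this function if the real layout differs.
    """
    with open(json_path) as handle:
        raw = json.load(handle)
    # Normalise so every value is a list, even if the file stores a bare string.
    return {
        seq_id: [ids] if isinstance(ids, str) else list(ids)
        for seq_id, ids in raw.items()
    }


def invert_mapping(seq_to_pdb):
    """Build the reverse lookup: PDB ID -> list of sequence IDs."""
    pdb_to_seq = defaultdict(list)
    for seq_id, pdb_ids in seq_to_pdb.items():
        for pdb_id in pdb_ids:
            pdb_to_seq[pdb_id].append(seq_id)
    return dict(pdb_to_seq)


# Example use (paths are placeholders):
#   seq_to_pdb = load_seq_to_pdb("seq_id_pdb_id.json")
#   pdb_to_seq = invert_mapping(seq_to_pdb)
```

The reverse lookup is useful when starting from a known PDB ID and looking for the matching folder under "ColabFold/PDB".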

All PDB files have been compressed using Foldcomp (https://github.com/steineggerlab/foldcomp). Foldcomp is required to decompress the ".fcz" files in order to recover the ".pdb" files.
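A batch decompression step might look like the sketch below. It assumes the Foldcomp Python package (pip install foldcomp) and that foldcomp.decompress accepts the raw bytes of an ".fcz" file and returns a (name, PDB text) pair, as shown in the Foldcomp README; please check the current Foldcomp documentation before relying on this. The helper decompress_fcz_tree and the directory names are hypothetical.

```python
from pathlib import Path


def decompress_fcz_tree(src_dir, dst_dir, decompress=None):
    """Write a .pdb file for every .fcz file found under src_dir.

    The directory layout below src_dir is mirrored under dst_dir.
    """
    if decompress is None:
        # ASSUMPTION: foldcomp.decompress(bytes) -> (name, pdb_text),
        # per the Foldcomp README. Imported lazily so the helper can be
        # exercised with a stand-in decompressor.
        import foldcomp
        decompress = foldcomp.decompress

    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    written = []
    for fcz_path in sorted(src_dir.rglob("*.fcz")):
        _name, pdb_text = decompress(fcz_path.read_bytes())
        # Keep the relative path, swapping the .fcz suffix for .pdb.
        out_path = dst_dir / fcz_path.relative_to(src_dir).with_suffix(".pdb")
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(pdb_text)
        written.append(out_path)
    return written


# Example use after extracting one of the tar files (paths are placeholders):
#   decompress_fcz_tree("ColabFold/PDB", "ColabFold_pdb_decompressed")
```

Foldcomp also ships a command-line tool, which may be more convenient for decompressing whole directories in one call.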

Files

seq_id_pdb_id.json

Files (6.7 GB)

Checksum, Size
md5:024af8f1ac5e6b9466ed198ab62a5de9, 1.9 GB
md5:53114f4b440f6d871f90e33378a4d403, 4.8 GB
md5:096c93d83a9ecf9e2e87618916b46b6c, 56.2 kB
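
The md5 checksums listed above can be used to confirm that a download completed without corruption. A minimal sketch using only the Python standard library follows; the file is read in chunks so the multi-gigabyte archives do not need to fit in memory. The function names and the example file name are hypothetical, since the archive names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 with an expected value.

    `expected` may be given with or without the "md5:" prefix used in
    the file listing.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example use (the file name is a placeholder):
#   ok = verify_md5("downloaded_archive.tar",
#                   "md5:024af8f1ac5e6b9466ed198ab62a5de9")
```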

Additional details

Related works

Is supplement to
Preprint: 10.1101/2022.04.14.488301 (DOI)