Published August 20, 2022 | Version 1.0.0
Dataset Open

Input features of E. coli proteome for predicting and modeling protein-protein interactions with AF2Complex

  • 1. Georgia Tech

Description

Input features for use with AF2Complex to predict protein-protein interactions among ~4400 E. coli proteins. For each E. coli protein, the feature data pipeline of AF2Complex generated one pickled feature file. To reduce storage size, each feature file is limited to at most 10,000 MSA sequences and at most 10 structural templates from the Protein Data Bank. The sequence libraries and Protein Data Bank releases used for feature generation are dated no later than 11-30-2021.

  • ecoli_af2c_fea.txt -- A list of all E. coli proteins with pre-generated input features.
  • af2c_fea_ecoli_220331_msa10ktem10.tar -- Input features, each named after the UniProt ID of its protein. After extracting the tarball, the gzipped feature pickle files can be used directly with AF2Complex without gunzipping them.
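
AF2Complex reads the gzipped feature files itself, but it can be useful to inspect one by hand. The following is a minimal sketch of how that might look in Python, not part of AF2Complex. It assumes each feature file unpickles to a dictionary (as AlphaFold-style feature pickles usually do); the helper names `load_features` and `summarize` are hypothetical, and the path in the usage note is a placeholder because the directory layout inside the tarball is not described here.

```python
import gzip
import pickle
from pathlib import Path


def load_features(path):
    """Load a feature pickle; a .gz suffix is decompressed on the fly."""
    path = Path(path)
    # Hypothetical helper: choose the opener from the file suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        return pickle.load(handle)


def summarize(features):
    """Map each key to its array shape, or to its type name if it has none.

    Assumes `features` is dict-like, which is an assumption about the
    feature files rather than something stated on this page.
    """
    return {
        key: getattr(value, "shape", type(value).__name__)
        for key, value in features.items()
    }
```

Usage would be along the lines of `summarize(load_features("path/to/feature_file.pkl.gz"))`, with the path replaced by a real file from the extracted tarball. If the pickles hold NumPy arrays, NumPy must be installed for unpickling to succeed, and pickle files should only be loaded from sources you trust.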

Files

Files (5.0 GB)

  • af2c_fea_ecoli_220331_msa10ktem10.tar -- 5.0 GB -- md5:070f6f90276ee27ad75fd67cfae14148
  • ecoli_af2c_fea.txt -- 283.8 kB -- md5:65959b7b3fe209d918b8a46ce0e6db89
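
The MD5 checksums listed above can be used to confirm that a download completed intact, which matters most for the 5.0 GB tarball. Below is a small sketch using Python's standard `hashlib`. The pairing of each checksum with a file name is inferred from the listed sizes, and the helper names `md5sum` and `verify` are hypothetical.

```python
import hashlib

# Checksums as listed on this page. The file-to-checksum pairing is an
# inference from the listed sizes (5.0 GB tarball, 283.8 kB text file).
EXPECTED_MD5 = {
    "af2c_fea_ecoli_220331_msa10ktem10.tar": "070f6f90276ee27ad75fd67cfae14148",
    "ecoli_af2c_fea.txt": "65959b7b3fe209d918b8a46ce0e6db89",
}


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5sum(path) == expected.lower()
```

For example, `verify("ecoli_af2c_fea.txt", EXPECTED_MD5["ecoli_af2c_fea.txt"])` should return `True` for an intact download, assuming the file sits in the current directory.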