Dataset for "Computational structural genomics unravels common folds and predicted functions in the secretome of fungal phytopathogen Magnaporthe oryzae"

Seong, Kyungyong; Krasileva, Ksenia

doi:10.5281/zenodo.5525101

Published October 1, 2021 | Version v2

Journal article Open

1. University of California, Berkeley

Datasets for:

"Computational structural genomics unravels common folds and predicted functions in the secretome of fungal phytopathogen Magnaporthe oryzae" https://apsjournals.apsnet.org/doi/abs/10.1094/MPMI-03-21-0071-R

This version of the dataset includes structures predicted by AlphaFold (https://github.com/deepmind/alphafold). To download the datasets produced with TrRosetta and I-TASSER, please access the previous version of the dataset. AlphaFold was run with "--preset=full_dbs" and all the databases it requires. The template structures were downloaded around July 20th, and all templates were allowed for modeling. The only change to the databases was that ~1650 fungal genome annotations from the Joint Genome Institute were appended to the uniref90 database.

In total, five PDB structures were generated per protein sequence. Four were generated with the CASP14 models (model_1, model_3, model_4 and model_5), and the fifth was generated with model_2_ptm to obtain the pTM score.

The datasets included are:
1) Best_models.tar.gz: This compressed archive contains the best of the five models (ranked_0.pdb) for every secreted protein.
2) Best_models_pkl.tar.gz: This compressed archive contains the result_model_<>.pkl files for the best models. The pkl files store extra information about the predicted structures. For more details, please visit AlphaFold's GitHub page.
3) Network.tar.gz: This compressed archive includes sequence-based homology and structure-based analogy search results, filtered multiple sequence alignments for each secreted protein, and structural similarity search results against the databases.
4) Magnaporthe_Oryza_Structure_prediction_and_clustering_metadata.zenodo.xlsx: This file, similar to Table S5, contains metadata about secreted proteins and their assignments into clusters based on sequence and structural similarity. Only AlphaFold models were used for structure-based clustering, and the criteria for clustering were the same as those for TrRosetta models.
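AlphaFold writes its per-residue confidence (pLDDT) into the B-factor column of the PDB files it produces, so a quick quality check of a ranked_0.pdb model needs only the standard library. The following is a minimal sketch, not the authors' pipeline: the function names ca_plddt and mean_plddt are hypothetical, and the directory layout inside Best_models.tar.gz is not described here, so the path passed in is an assumption.

```python
from statistics import fmean


def ca_plddt(pdb_lines):
    """Return per-residue pLDDT values taken from the B-factor column of CA atoms."""
    scores = []
    for line in pdb_lines:
        # PDB is fixed-width: atom name in columns 13-16, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def mean_plddt(pdb_path):
    """Average pLDDT over all residues of one model (path is supplied by the caller)."""
    with open(pdb_path) as handle:
        return fmean(ca_plddt(handle))
```

For example, mean_plddt("ranked_0.pdb") returns a single average for one extracted model; the file location shown is illustrative only.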

If you have questions about the outputs or need additional data from us, please don't hesitate to email us (s.kyungyong@berkeley.edu and kseniak@berkeley.edu).

Files

Files (77.4 GB)

Name	MD5	Size
Best_models.tar.gz	8393ca4c7543c9854c0b18ea754eee14	137.9 MB
Best_models_pkl.tar.gz	19827e18544a4e757e4bb8866f77d120	75.2 GB
Magnaporthe_Oryza_Structure_prediction_and_clustering_metadata.zenodo.xlsx	b3de2052f3c8423e3f3dcc926929f072	747.4 kB
Network.tar.gz	2c8cb879f7c123ab3c15ff7bc7a9cb32	2.1 GB
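Because the largest archive is 75.2 GB, it may be worth checking each download against the MD5 values listed above before unpacking. A minimal sketch using Python's hashlib follows; the names EXPECTED_MD5, md5sum and verify are hypothetical, and the files are assumed to sit in the current working directory.

```python
import hashlib

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "Best_models.tar.gz": "8393ca4c7543c9854c0b18ea754eee14",
    "Best_models_pkl.tar.gz": "19827e18544a4e757e4bb8866f77d120",
    "Magnaporthe_Oryza_Structure_prediction_and_clustering_metadata.zenodo.xlsx":
        "b3de2052f3c8423e3f3dcc926929f072",
    "Network.tar.gz": "2c8cb879f7c123ab3c15ff7bc7a9cb32",
}


def md5sum(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large archives never load fully into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True when the file's MD5 matches the expected hex string."""
    return md5sum(path) == expected.lower()
```

A call such as verify("Network.tar.gz", EXPECTED_MD5["Network.tar.gz"]) returns False for a truncated or corrupted download.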

	All versions	This version
Views	1,074	547
Downloads	414	178
Data volume	3.1 TB	2.8 TB
