Alignment-based protein mutational landscape prediction: doing more with less
- 1. Sorbonne University
- 2. Technical University Munich
- 3. University of Paris
Description
The wealth of genomic data has boosted the development of computational methods that predict the phenotypic outcomes of missense variants. The most accurate of these methods exploit multiple sequence alignments, which can be costly to generate. Recent efforts to democratize protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the mutational landscape of the whole human proteome and simplified access to our predictive pipeline.
Files (28.3 GB)
MD5 checksum | Size |
---|---|
83b5707c2aa5e9f42bffb56ce756a868 | 7.9 MB |
058e656c18cb60d230bfca9ba6d55e50 | 23.7 kB |
2dfeb2d397f27082225854535eea6f0e | 28.0 GB |
93e7eb19ab302fdf2ffb1fc33d864fbd | 295.2 MB |
9edb9e7c568eafe0f2325c0d13cd1a42 | 10.9 kB |
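Because the largest file is 28.0 GB, it may be worth checking a download against its listed MD5 checksum. The following is a minimal sketch in Python, not part of the record itself; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical helpers introduced here for illustration, and the file is read in chunks so that it never has to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a listed checksum.

    `listed` may be a bare hex digest or carry an "md5:" prefix
    (hypothetical helper, an assumption of this sketch).
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listed_checksum(local_path, listed_value)` returns `True` when the download is intact, where `local_path` is wherever the file was saved and `listed_value` is the corresponding checksum from the listing above.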
Additional details
Related works
- Is cited by: 10.1101/2022.12.13.520259 (DOI)