Deep indel mutagenesis reveals the impact of insertions and deletions on protein stability and function
- 1. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; University Pompeu Fabra (UPF), Barcelona, Spain
- 2. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- 3. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; University Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
Description
Datasets for "Deep indel mutagenesis reveals the impact of insertions and deletions on protein stability and function". Code is available at: https://github.com/lehner-lab/deep_indel_mutagenesis.
Amino acid insertions and deletions (indels) are an abundant class of genetic variants. However, compared to substitutions, the effects of indels are not well understood and are poorly predicted. Here we address this shortcoming by performing deep indel mutagenesis (DIM) of structurally diverse proteins. Indel tolerance is strikingly different to substitution tolerance and varies extensively both between different proteins and within different regions of the same protein. Although state-of-the-art variant effect predictors perform poorly on indels, we show that both experimentally measured and computationally predicted substitution scores can be repurposed as good indel variant effect predictors by incorporating information on protein secondary structures. Quantifying the effects of indels on protein-protein interactions reveals that insertions can be an important class of gain-of-function variants. Our results provide an overview of the impact of indels on proteins and a method to predict their effects genome-wide.
additional_files.zip
additional_dfs.rds
DiMSum.zip
aPCA_domains_fitness_replicates.RData
aPCA_domains_variant_data_merge.RData
grb2_bind_fitness_replicates.RData
grb2_fold_fitness_replicates.RData
pdz3_bind_fitness_replicates.RData
pdz3_fold_fitness_replicates.RData
pre_processed_data.zip
color_scale.rds
scaled_variants_aPCA.rds
scaled_variants_bPCA.rds
tsuboyama_nat_doms_all.rds
indel_prediction_models.zip
ddmut_prediction_mean.rds
encoded_9doms.rds
ddG_insertions_models.R
ddG_deletions_models.R