There is a newer version of the record available.

Published October 23, 2023 | Version v2
Journal article Open

Deep indel mutagenesis reveals the impact of insertions and deletions on protein stability and function

  • 1. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; University Pompeu Fabra (UPF), Barcelona, Spain
  • 2. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  • 3. Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; University Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i estudis Avançats (ICREA), Barcelona, Spain; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK

Description

Datasets for "Deep indel mutagenesis reveals the impact of insertions and deletions on protein stability and function". Code is available at: https://github.com/lehner-lab/deep_indel_mutagenesis

Amino acid insertions and deletions (indels) are an abundant class of genetic variants. However, compared to substitutions, the effects of indels are not well understood and are poorly predicted. Here we address this shortcoming by performing deep indel mutagenesis (DIM) of structurally diverse proteins. Indel tolerance is strikingly different to substitution tolerance and varies extensively both between different proteins and within different regions of the same protein. Although state-of-the-art variant effect predictors perform poorly on indels, we show that both experimentally measured and computationally predicted substitution scores can be repurposed as good indel variant effect predictors by incorporating information on protein secondary structures. Quantifying the effects of indels on protein-protein interactions reveals that insertions can be an important class of gain-of-function variants. Our results provide an overview of the impact of indels on proteins and a method to predict their effects genome-wide.

additional_files.zip
  • additional_dfs.rds

DiMSum.zip
  • aPCA_domains_fitness_replicates.RData
  • aPCA_domains_variant_data_merge.RData
  • grb2_bind_fitness_replicates.RData
  • grb2_fold_fitness_replicates.RData
  • pdz3_bind_fitness_replicates.RData
  • pdz3_fold_fitness_replicates.RData

pre_processed_data.zip
  • color_scale.rds
  • scaled_variants_aPCA.rds
  • scaled_variants_bPCA.rds
  • tsuboyama_nat_doms_all.rds

indel_prediction_models.zip
  • ddmut_prediction_mean.rds
  • encoded_9doms.rds
  • ddG_insertions_models.R
  • ddG_deletions_models.R

Files

Files (23.7 MB)

Checksum                                Size
md5:e507ef994b13dfdb8498b2d2fd2db12e    18.0 MB
md5:7673a14d78a023d7a85319a77131372d    2.1 MB
md5:1b5dafcd6e8fa1b8bfd5c3eb09af9101    63.7 kB
md5:399978321cc749077f6c7c31509f25cc    3.5 MB
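
The MD5 checksums listed for the record's files can be used to confirm that a download is intact. The following is a minimal sketch in Python, using only the standard library. The helper names (`md5_of_file`, `matches_checksum`) are illustrative, not part of the dataset or its code repository. Because this listing does not state which checksum belongs to which archive, the usage part simply checks a file against all four listed values.

```python
import hashlib
import sys

# Checksums copied from the file listing of this record.
LISTED_CHECKSUMS = [
    "md5:e507ef994b13dfdb8498b2d2fd2db12e",
    "md5:7673a14d78a023d7a85319a77131372d",
    "md5:1b5dafcd6e8fa1b8bfd5c3eb09af9101",
    "md5:399978321cc749077f6c7c31509f25cc",
]


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python verify_md5.py <downloaded archive>
    path = sys.argv[1]
    ok = any(matches_checksum(path, c) for c in LISTED_CHECKSUMS)
    print(f"{path}: {'matches a listed checksum' if ok else 'NO MATCH'}")
```

Reading the file in chunks keeps memory use flat, which matters little for archives of this size but costs nothing.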