Published April 25, 2024 | Version v1
Dataset Open

The effect of presence and absence of DNA repair genes on the rate and pattern of mutation in bacteria

Description

This repository contains the code, data, and outputs used in the manuscript "The effect of presence and absence of DNA repair genes on the rate and pattern of mutation in bacteria".

  • Script.py: A Python script that estimates the number of polymorphisms and the mutation rates for the bacterial clusters of orthologous genes in ATGC_data.zip. It writes per-strain outputs to the Polymorphism_Results/ and Rate_Results/ directories.
  • ATGC_data.zip: A data set of bacterial clusters of orthologous genes downloaded from the ATGC database (Kristensen et al., 2017). Script.py reads the FASTA files, which should be extracted into ATGC_data/ inside the script's working directory.
  • mutation_rate_data.csv: The outputs of Script.py merged into a single file, together with the presence or absence of particular repair genes and the calculated overall mutation rate and Watterson's corrected mutation rate.
  • Phylogenetic_Comparisons.xlsx: The averages of overall mutation rate, transition/transversion ratio, and GC-AT bias for 50 phylogenetic comparisons under the presence and absence of repair enzymes.
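
For readers unfamiliar with the quantities named above, the following is a minimal sketch of how polymorphic (segregating) sites and a Watterson-type estimate could be computed from one aligned FASTA cluster. It is not taken from Script.py: the function names are hypothetical, and it assumes that "Watterson's corrected mutation rate" refers to the standard Watterson estimator, theta_W = S / (a_n * L) with a_n = 1 + 1/2 + ... + 1/(n-1), that sequences in a cluster are already aligned to equal length, and that columns containing gaps or ambiguous bases are skipped.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def segregating_sites(sequences):
    """Count alignment columns that hold more than one nucleotide.

    Assumption: columns with gaps or ambiguous bases are ignored.
    """
    valid = set("ACGT")
    count = 0
    for column in zip(*sequences):
        bases = set(column)
        if bases <= valid and len(bases) > 1:
            count += 1
    return count


def watterson_theta(num_segregating, num_sequences, length):
    """Per-site Watterson estimator: S / (a_n * L)."""
    if num_sequences < 2 or length <= 0:
        raise ValueError("need at least two sequences and a positive length")
    # a_n is the (n-1)th harmonic number.
    a_n = sum(1.0 / i for i in range(1, num_sequences))
    return num_segregating / (a_n * length)


# Hypothetical three-strain alignment, not data from ATGC_data.zip.
example = ">s1\nACGTACGT\n>s2\nACGTACGA\n>s3\nACGAACGT\n"
seqs = list(read_fasta(example).values())
s = segregating_sites(seqs)
theta = watterson_theta(s, len(seqs), len(seqs[0]))
print(s, theta)
```

Dividing by a_n corrects the raw count of segregating sites for sample size, so clusters with different numbers of strains can be compared on the same scale.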

Files

Files (1.4 GB)

Checksum                                Size
md5:fd279f768523137396a546682d390082    1.4 GB
md5:f64e570940e637b55d98e9ae35e03f2d    88.9 kB
md5:408ace5d82d10903ac1df660b3294059    17.2 kB
md5:91926d48add13df8874acf9c416f9596    8.2 kB
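
The MD5 checksums listed above can be used to confirm that a download is intact. A minimal sketch using Python's standard hashlib module follows; the helper names are hypothetical, and the file path in the usage comment is an assumption about where a file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed value such as 'md5:abc...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (the path is an assumption about the local download location):
# matches_checksum("ATGC_data.zip", "md5:<checksum listed for that file>")
```

Reading in chunks matters here because the largest file in the record is 1.4 GB.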