Published April 25, 2024 | Version v1
Dataset
Open
The effect of presence and absence of DNA repair genes on the rate and pattern of mutation in bacteria
Description
This repository contains the code, data, and outputs used in the manuscript "The effect of presence and absence of DNA repair genes on the rate and pattern of mutation in bacteria".
- Script.py: A Python script that estimates the number of polymorphisms and the mutation rates of the bacterial clusters of orthologous genes in ATGC_data.zip. It writes per-strain outputs to the Polymorphism_Results/ and Rate_Results/ directories.
- ATGC_data.zip: A data set of bacterial clusters of orthologous genes downloaded from the ATGC database (Kristensen et al., 2017). Script.py reads the FASTA files, which should be extracted into an ATGC_data/ directory inside the script's working directory.
- mutation_rate_data.csv: The outputs of Script.py merged into a single file, together with information on the presence or absence of particular repair genes and the calculated overall mutation rate and Watterson's corrected mutation rate.
- Phylogenetic_Comparisons.xlsx: The averages of overall mutation rate, transition/transversion ratio, and GC-AT bias for 50 phylogenetic comparisons between strains with and without repair enzymes.
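
For readers unfamiliar with the quantities named above, the following is a minimal sketch of how polymorphic (segregating) sites and a per-site Watterson estimate could be computed from aligned sequences. It is not taken from Script.py, and it may differ from the calculation used in the manuscript. The function names and the toy alignment are hypothetical, and the sketch assumes the sequences are already aligned and of equal length.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def segregating_sites(seqs):
    """Count alignment columns that contain more than one of A, C, G, T.

    Gaps and ambiguity codes are ignored (an assumption of this sketch).
    """
    count = 0
    for column in zip(*seqs):
        bases = {base for base in column if base in "ACGT"}
        if len(bases) > 1:
            count += 1
    return count


def watterson_theta_per_site(seqs):
    """Watterson's estimator: S / a_n / L, with a_n = sum of 1/i for i in 1..n-1."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n / length


# Toy alignment (hypothetical data, not from ATGC_data.zip).
toy_fasta = """>strain_a
ACGTACGT
>strain_b
ACGTACGA
>strain_c
ACCTACGT
"""
toy_seqs = list(read_fasta(toy_fasta).values())
toy_s = segregating_sites(toy_seqs)
toy_theta = watterson_theta_per_site(toy_seqs)
```

In the toy alignment two columns vary among three sequences, so the estimate is 2 divided by (1 + 1/2) and then by the alignment length of 8.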