Technical Report: GenoTyphi Implementation in Mykrobe
- 1. London School of Hygiene & Tropical Medicine
- 2. University of Melbourne
- 3. Monash University
- 4. EMBL-EBI
Description
The GenoTyphi genotyping scheme divides the Salmonella enterica serovar Typhi ("Typhi") population into four major lineages and >75 different clades and subclades. The scheme was introduced in the paper "An extended genotyping framework for Salmonella enterica serovar Typhi, the cause of human typhoid", Wong et al., 2016, Nature Communications. Subsequent updates, including new genotypes and mutations conferring resistance to fluoroquinolones and azithromycin, are summarised in "Five years of GenoTyphi: updates to the global Salmonella Typhi genotyping framework", Dyson & Holt, 2021, Journal of Infectious Diseases. The original code for assigning GenoTyphi genotypes to Typhi genomes, implemented in Python, took as input BAM or VCF files that the user had already generated by mapping Illumina reads to the Typhi CT18 reference genome.
Here we describe a new k-mer-based approach to assigning GenoTyphi genotypes to Typhi genomes, implemented in the Mykrobe open-source software platform and taking raw sequencing reads (FASTQ files) as input, so no prior read mapping is required. This implementation also includes five newly defined genotypes, bringing the total to 87. We tested the new implementation on 12,848 Typhi genomes and found near-perfect concordance with the original mapping-based implementation. Code is available at https://github.com/katholt/genotyphi (v2.0, doi: 10.5281/zenodo.7430538).
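To illustrate the general idea of calling a marker allele directly from reads using k-mers, the following minimal Python sketch counts how many read k-mers match short probes carrying either the reference or the alternative base. It is a toy illustration only, not Mykrobe's algorithm or code; the function names, flanking sequences, value of k, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    return min(kmer, revcomp(kmer))


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_read_kmers(reads, k):
    """Count canonical k-mers across all reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        for km in kmers(read, k):
            counts[canonical(km)] += 1
    return counts


def call_marker(counts, left, ref, alt, right, k, min_count=1):
    """Return 'ref', 'alt' or 'no_call' for one single-base marker.

    The probe is trimmed to k-1 flanking bases on each side so that
    every probe k-mer overlaps the marker position.
    """
    def support(allele):
        probe = left[-(k - 1):] + allele + right[:k - 1]
        return sum(counts[canonical(km)] for km in kmers(probe, k))

    ref_support, alt_support = support(ref), support(alt)
    if max(ref_support, alt_support) < min_count or ref_support == alt_support:
        return "no_call"
    return "alt" if alt_support > ref_support else "ref"


# Toy usage: a single read carrying the alternative base 'T' (invented sequences).
K = 5
reads = ["GTTGCATGGATCC"]
counts = count_read_kmers(reads, K)
call = call_marker(counts, left="ACGTTGCA", ref="C", alt="T", right="GGATCCTA", k=K)
print(call)
```

Using canonical k-mers means a read gives the same support regardless of which strand it was sequenced from. A real genotyper would additionally need to handle sequencing errors, coverage depth, and many markers at once.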
Files
Mykrobe_GenoTyphi_TechReport.pdf (231.6 kB, md5:e01a51c657e636b9cb74f1496ed18976)