Roving methyltransferases generate a mosaic epigenetic landscape and influence evolution in Bacteroides fragilis group
Creators
- 1. Bacterial Pathogenesis and Antimicrobial Resistance Unit, LCIM, NIAID, NIH, Bethesda, MD
- 2. National Institutes of Health Clinical Center, National Institutes of Health, Bethesda, MD
Description
This repository contains the code and data for reproducing the results and figures in the associated manuscript:
Roving methyltransferases generate a mosaic epigenetic landscape and influence evolution in Bacteroides fragilis group
BFG-Analysis-main/ includes scripts and data to process Nanopore and Illumina reads, assemble BFG genomes, polish those genomes, and correct out-of-frame ORFs for MLST alignment. This also includes GenBank reference genomes referred to in the manuscript.
tree_files/ includes phylogenetic tree files and aligned sequence files used in Figures 1, 5, and 6.
acessory_regions/ includes a .fasta file of the accessory regions of each study genome for which they could be calculated (using PPanGGOLiN/panRGP).
genomes/ contains different versions of BFG genomes with and without different types of polishing and frame-correction:
- genomes/pacbio_uncorrected/ contains genomes sequenced with PacBio and assembled with PacBio software.
  - Analyzed for MLST trees in Figures: 1, 5, 6
  - Analyzed in Figures: 5, 6, S7, S9 - S16
- genomes/nanopore_racon_medaka/ contains genomes sequenced with Nanopore, assembled with Flye, then polished with racon and medaka.
  - Analyzed in Figures: 5, 6, S7, S9 - S16
- genomes/nanopore_racon_medaka_pilon/ contains genomes sequenced with Nanopore, assembled with Flye, polished with racon and medaka, then polished with Illumina reads with pilon.
  - Analyzed in Figures: 5, 6, S7, S9 - S16
- genomes/proovframe_BFG_genomes/ contains genomes from the pacbio_uncorrected/, nanopore_racon_medaka/, and nanopore_racon_medaka_pilon/ directories that were frame-corrected with Proovframe.
  - Analyzed for Figures: 2, 3, 4, S2, S3, S4, S5, S6, S8
- genomes/nanopore_MEGAN_corrected/ contains genomes from the nanopore_racon_medaka/ and nanopore_racon_medaka_pilon/ directories that were frame-corrected with MEGAN.
  - Analyzed for MLST trees in Figures: 1, 5, 6
nanodisco_difference_files/ contains Nanodisco intermediate files reporting the difference in nanopore signal between native and PCR-generated gDNA at each genomic position. They refer to the genomes in directories genomes/nanopore_racon_medaka_pilon/, genomes/nanopore_racon_medaka/, genomes/pacbio_uncorrected/. Each isolate has a genome in only one of these directories.
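Because each isolate's genome sits in only one of the three directories, a small lookup can help pair a difference file with its reference. The following is a minimal sketch, not part of the deposited code: it assumes the archive has been extracted under a `root` folder and that each genome filename contains the isolate identifier. Both the function name `find_reference_genome` and that naming convention are assumptions for illustration.

```python
from pathlib import Path

# The three directories named in the description above.
GENOME_DIRS = (
    "genomes/nanopore_racon_medaka_pilon",
    "genomes/nanopore_racon_medaka",
    "genomes/pacbio_uncorrected",
)


def find_reference_genome(root, isolate_id):
    """Return the single genome file whose name contains isolate_id.

    Assumption (not confirmed by the repository): genome filenames
    include the isolate identifier as a substring.
    """
    root = Path(root)
    hits = []
    for rel in GENOME_DIRS:
        directory = root / rel
        if not directory.is_dir():
            continue  # tolerate partially extracted archives
        for path in sorted(directory.iterdir()):
            if path.is_file() and isolate_id in path.name:
                hits.append(path)
    if len(hits) != 1:
        # The description says exactly one directory holds each isolate.
        raise LookupError(
            f"expected exactly one genome for {isolate_id!r}, found {len(hits)}"
        )
    return hits[0]
```

If the lookup raises `LookupError`, the identifier is either missing from all three directories or matches more than one file, so the matching rule would need to be tightened for the actual filenames.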
acessory_regions/ has a .fasta file of accessory sequences (per methods in manuscript) of relevant genomes.
Files
| Name | Size | MD5 |
|---|---|---|
| Tisza_Smith_et_al_BFG_extra.zip | 2.6 GB | ae501727b52bc859de6f25eb1be5fd1b |
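After downloading, the archive can be checked against the md5 value listed for Tisza_Smith_et_al_BFG_extra.zip. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `archive_is_intact` are illustrative, and the script assumes the archive is in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed for the archive in this record.
EXPECTED_MD5 = "ae501727b52bc859de6f25eb1be5fd1b"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for the 2.6 GB archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("Tisza_Smith_et_al_BFG_extra.zip")  # assumed location
    if archive.exists():
        print("checksum OK" if archive_is_intact(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before extracting.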