Evolution of Salmonella Chromosomes and Its Influence on Gene Expression and Chromosomal Conformation
Authors/Creators
- 1. Shenzhen University
- 2. Shenzhen University Health Science Center
Description
1. BactCG
BactCG is designed to analyze the core genome of a group of bacterial strains. Conventionally, pairwise alignment is performed between the genes (or proteins) of every pair of strains, and mutual best alignment pairs are identified to generate the core gene set. The number of strain-to-strain comparisons is then on the order of n² for n strains. BactCG instead takes one representative strain as the reference and performs mutual alignment between the genes (or proteins) of each other strain and those of the reference strain to reveal the core gene set. The number of comparisons is thereby reduced to the order of n.
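The reference-based idea can be illustrated with a minimal Python sketch. It is not the BactCG code: it assumes the best alignment hit of every gene has already been computed by some aligner, and the names `mutual_best_pairs`, `core_genes` and the toy gene identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical input type: query gene -> its best-scoring hit in the other strain.
BestHits = Dict[str, str]


def mutual_best_pairs(ref_to_strain: BestHits, strain_to_ref: BestHits) -> Dict[str, str]:
    """Return {reference gene: strain gene} for pairs that are each other's best hit."""
    return {
        ref_gene: strain_gene
        for ref_gene, strain_gene in ref_to_strain.items()
        if strain_to_ref.get(strain_gene) == ref_gene
    }


def core_genes(ref_genes: Set[str], comparisons: List[Tuple[BestHits, BestHits]]) -> Set[str]:
    """Keep the reference genes that have a mutual best hit in every other strain.

    `comparisons` holds one (reference -> strain, strain -> reference) pair of
    best-hit tables per non-reference strain, so n strains need n - 1 comparisons.
    """
    core = set(ref_genes)
    for ref_to_strain, strain_to_ref in comparisons:
        core &= set(mutual_best_pairs(ref_to_strain, strain_to_ref))
    return core


# Toy example: one reference strain (genes r1..r3) and two other strains.
strain_a = ({"r1": "a1", "r2": "a2", "r3": "a9"}, {"a1": "r1", "a2": "r2", "a9": "r2"})
strain_b = ({"r1": "b1", "r3": "b3"}, {"b1": "r1", "b3": "r3"})
print(core_genes({"r1", "r2", "r3"}, [strain_a, strain_b]))
```

In this toy input only `r1` has a mutual best hit in both strains, so it is the only gene reported as core.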
For specific usage of BactCG, please refer to dbESG
2. BactPG
BactPG is designed to analyze the pan-genome of a group of bacterial strains. Conventionally, pairwise alignment is performed between the genes (or proteins) of every pair of strains, and mutual best alignment pairs are identified to generate the pan-gene set. BactPG instead analyzes each combination of the strains. In each combination, it takes one representative strain as the reference and performs mutual alignment between the genes (or proteins) of the other strains and those of the reference. The gene sets of all combinations are then merged to form the pan-genome.
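The merging step can be sketched as follows. This is an illustration under stated assumptions, not the BactPG implementation: mutual best pairs collected from all combinations are merged into gene families with a union-find structure (a technique chosen here for the example), and genes without any partner remain single-member families. The function name `pan_gene_families` and the gene identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def pan_gene_families(
    all_genes: Iterable[str],
    mutual_pairs: Iterable[Tuple[str, str]],
) -> List[Set[str]]:
    """Group genes into families by merging mutual best alignment pairs.

    Every gene named in `mutual_pairs` must also appear in `all_genes`.
    """
    parent: Dict[str, str] = {gene: gene for gene in all_genes}

    def find(gene: str) -> str:
        # Follow parent links to the family root, shortening the path on the way.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for gene_a, gene_b in mutual_pairs:
        parent[find(gene_a)] = find(gene_b)

    families: Dict[str, Set[str]] = {}
    for gene in parent:
        families.setdefault(find(gene), set()).add(gene)
    # Sort for a reproducible order of families.
    return sorted(families.values(), key=lambda family: sorted(family))


genes = ["A_1", "A_2", "B_1", "B_2", "C_1"]
pairs = [("A_1", "B_1"), ("B_1", "C_1")]
for family in pan_gene_families(genes, pairs):
    print(sorted(family))
```

Here `A_1`, `B_1` and `C_1` end up in one family, while `A_2` and `B_2` each form their own family and would count as strain-specific members of the pan-genome.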
For specific usage of BactPG, please refer to dbESG
3. BactAG
The ancient orthologous genomes of bacteria were inferred semi-manually with a two-step Backbone-Patching approach. In the Backbone step, the two most anciently diverged clades were identified from a phylogenomic tree whose strains cover the major branches of the genus, species or subspecies under study, and one representative strain was selected at random from each clade. Orthologous fragments were identified with Mauve version 2.4.0 and an iterative Maximum Homologous Block (MHB) algorithm, and combined to generate the backbone of the ancient orthologous genome. Two patching sub-steps followed. First, the genomes of other representative strains were aligned between the two clades, and the orthologous fragments were retrieved and compared to the backbone. Sub-fragments not covered by the backbone were patched in manually, and the backbone ancient orthologous genome was updated iteratively. Second, the genomes of closely related outgroup strains, or the genome of the nearest ancestor, were aligned against the representative strain of each clade, and the orthologous fragments were extracted to further patch the ancient orthologous genome.
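The patching logic (adding only the sub-fragments that the backbone does not yet cover) can be sketched at the coordinate level. This is a simplified illustration, not the BactAG procedure: it assumes all fragments have already been mapped to one shared coordinate system as half-open `(start, end)` intervals, whereas the actual workflow relies on Mauve alignments, the MHB algorithm and manual curation. The function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) in an assumed shared coordinate system


def uncovered_parts(fragment: Interval, backbone: List[Interval]) -> List[Interval]:
    """Return the sub-intervals of `fragment` not covered by any backbone interval."""
    start, end = fragment
    gaps: List[Interval] = []
    cursor = start
    for b_start, b_end in sorted(backbone):
        if b_end <= cursor:
            continue  # backbone block lies entirely before the uncovered region
        if b_start >= end:
            break  # backbone block lies entirely after the fragment
        if b_start > cursor:
            gaps.append((cursor, b_start))
        cursor = max(cursor, b_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


def patch_backbone(backbone: List[Interval], fragments: List[Interval]) -> List[Interval]:
    """Add the uncovered parts of each fragment to the backbone, one fragment at a time."""
    patched = list(backbone)
    for fragment in fragments:
        patched.extend(uncovered_parts(fragment, patched))
    return sorted(patched)


backbone = [(0, 150), (200, 300)]
print(patch_backbone(backbone, [(100, 250), (280, 400)]))
```

Each fragment is compared against the backbone as updated by the previous fragments, which mirrors the iterative update described above.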
For specific usage of BactAG, please refer to dbESG
4. Bact1DGR
Bact1DGR was developed to represent individual bacterial genomes as blocks annotated with their evolutionary origins. It draws on both the phylogenetic position of the target strain and the ancient genomes of the nodes along its evolutionary trajectory. This representation facilitates understanding of the sequence evolution of bacterial genomes and intuitive comparison of multiple bacterial genomes. The procedure involves four steps: (1) locating the terminal phylogenetic branch on which the target strain falls, tracing all the nodes along the phylogenetic route to that branch, and delineating the evolutionary trajectory of the target strain; (2) aligning the genome of the target strain against that of the oldest ancestor, identifying the orthologous fragments and labeling the homologous genome blocks of the target strain; (3) aligning the genome of the target strain against that of the second oldest ancestor, identifying the orthologous fragments and labeling the homologous genome blocks of the target strain that have not yet been labeled; (4) repeating step 3 for each successively younger ancestor until the genome of the most recent ancestor has been compared and labeled, after which the blocks that remain unlabeled are the strain-specific sequence blocks.
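Steps 2 to 4 amount to an oldest-first labeling pass, which can be sketched as below. This is an illustrative sketch rather than the Bact1DGR code: it assumes the alignments have already been reduced to half-open `(start, end)` intervals on the target genome for each ancestor, and it uses a per-position list that is only practical for small examples. The function name `label_blocks` and the node names `N1` and `N2` are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open (start, end) on the target genome


def label_blocks(
    genome_length: int,
    ancestor_hits: List[Tuple[str, List[Interval]]],
    specific_label: str = "strain-specific",
) -> List[Tuple[int, int, str]]:
    """Label each position with the oldest ancestor it is orthologous to.

    `ancestor_hits` must be ordered from the oldest to the most recent ancestor.
    Positions matched by no ancestor are reported with `specific_label`.
    """
    labels: List[Optional[str]] = [None] * genome_length
    for ancestor, intervals in ancestor_hits:
        for start, end in intervals:
            for pos in range(start, end):
                if labels[pos] is None:  # never overwrite an older label
                    labels[pos] = ancestor

    # Collapse runs of identical labels into (start, end, label) blocks.
    blocks: List[Tuple[int, int, str]] = []
    pos = 0
    for label, run in groupby(labels):
        length = len(list(run))
        blocks.append((pos, pos + length, label or specific_label))
        pos += length
    return blocks


hits = [("N1", [(0, 4)]), ("N2", [(2, 7)])]  # N1 is the older node
for block in label_blocks(10, hits):
    print(block)
```

In the toy example, positions 2 and 3 match both nodes but keep the label of the older node `N1`, and positions 7 to 9 are left as strain-specific.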
For specific usage of Bact1DGR, please refer to dbESG
5. BactPGA
BactPGA was developed to facilitate automatic annotation of ancient or extant individual genomes according to pan-genome annotation results. Once sequenced and assembled, the target genome can be annotated for its encoded genes with RASTtk or PGAG. BactPGA then mainly classifies the genes into pan-genome families. BactPGA can also be used to annotate the results of 1DGR or other comparative genomic analyses.
For specific usage of BactPGA, please refer to dbESG
6. Supplementary Datasets
Dataset S1. Salmonella genomes newly sequenced and used in this study.
Dataset S2. The core gene set of 26 representative Salmonella strains used for manual AOC construction.
Dataset S3. Genes and their evolutionary routes inferred from the evolutionary trajectory of Salmonella ancient orthologous chromosomes.
Dataset S4. Normalized expression levels of genes for the pan gene sets of Salmonella or common gene sets among the eight representative strains.
Dataset S5. Evolutionary origins of differentially expressed genes.
Dataset S6. Consensus sequence motifs identified from the flanking (±500 bp) regions of the major interaction intervals or CID boundaries.
Dataset S7. Presence of a large inversion spanning the ter major interval between E. coli lineages and Salmonella.
Files
(91.4 MB)
| Checksum | Size |
|---|---|
| md5:9246b971cc0954a71527c5bcbd2344e8 | 8.9 MB |
| md5:7fe9e9d64958091396065f23c79efb0c | 27.2 MB |
| md5:fc81a7b1373b6f29dfeb6e3dcbc2f62f | 9.3 MB |
| md5:beeb48783b7946fa39d77686772316c1 | 4.7 MB |
| md5:f4ee5259a5cf1d3e7418b51c01b93759 | 34.1 MB |
| md5:dfba643f15b168d2449c1f9dea893cb3 | 7.1 MB |