Published June 25, 2024 | Version v0.1.1
Dataset Open

Escherichia coli lineage deconvolution indexes for Themisto, mSWEEP/mGEMS, and demix_check

  • 1. University of Helsinki

Description

This dataset contains the indexes used in a study conducted in Punjab, Pakistan, which investigated E. coli colonisation diversity in healthy carriage using CLED enrichment plates. The download includes the following files:

  • Themisto v3 pseudoalignment index.
  • PopPUNK database.
  • Demix_check index.
  • Raw assembly data.

About

Version history

v0.1.1 (current version)

  • Added reference to the study.

v0.1.0

  • Added brief description with a few missing parts.

Distribution

These files are made available under a CC-BY 4.0 license. If you use these assemblies in your study, please cite the source as appropriate (the associated study is given under Citation).

Citation

Khawaja, T., Mäklin, T., Kallonen, T. et al. Deep sequencing of _Escherichia coli_ exposes colonisation diversity and impact of antibiotics in Punjab, Pakistan. Nature Communications 15, 5196 (2024). https://doi.org/10.1038/s41467-024-49591-5

Methods briefly

Data

The included assembly data originate from the following studies:

  • Horesh G, et al. A comprehensive and high-quality collection of Escherichia coli genomes and their genes. Microbial Genomics 2021. doi: 10.1099/mgen.0.000499
  • Gladstone R, et al. Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections in Norway in 2002–17: a nationwide, longitudinal, microbial population genomic study. The Lancet Microbe 2021. doi: 10.1016/S2666-5247(21)00031-8
  • Shao Y, et al. Stunted microbiota and opportunistic pathogen colonization in caesarean-section birth. Nature 2019. doi: 10.1038/s41586-019-1560-1
  • Snaith AE, et al. The highly diverse plasmid population found in Escherichia coli colonizing travellers to Laos and its role in antimicrobial resistance gene carriage. Microbial Genomics 2023. doi: 10.1099/mgen.0.001000
  • Habib A, et al. Dissemination of carbapenemase-producing Enterobacterales in the community of Rawalpindi, Pakistan. PLOS ONE 2022. doi: 10.1371/journal.pone.0270707
  • Runcharoen C, et al. Whole genome sequencing of ESBL-producing Escherichia coli isolated from patients, farm waste and canals in Thailand. Genome Medicine 2017. doi: 10.1186/s13073-017-0471-8
  • Musicha P, et al. Trends in antimicrobial resistance in bloodstream infection isolates at a large urban hospital in Malawi (1998–2016): a surveillance study. The Lancet Infectious Diseases 2017. doi: 10.1016/S1473-3099(17)30394-8

Themisto index construction

Assemblies were indexed with Themisto v3.0.0-rc using k-mer size 31 and the `--file-colors` option.
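
As a rough illustration of the indexing step, the sketch below writes a list of assembly paths and assembles a Themisto build command without running it. The helper names, the `*.fasta` pattern, and all paths are hypothetical. Apart from `--file-colors` and the k-mer size, the flags (`-i`, `--index-prefix`, `--temp-dir`) are taken from memory of the Themisto v3 documentation and should be checked against `themisto build --help`. It also assumes that with `--file-colors` each input file receives its own color in list order, which is why the list is sorted.

```python
from pathlib import Path


def write_file_list(assembly_dir, list_path, pattern="*.fasta"):
    """Write one assembly path per line, sorted so the file order is reproducible."""
    paths = sorted(str(p) for p in Path(assembly_dir).glob(pattern))
    Path(list_path).write_text("".join(p + "\n" for p in paths))
    return paths


def themisto_build_command(list_path, index_prefix, temp_dir, k=31):
    """Assemble (but do not run) a Themisto build command as an argument list.

    Flag names other than --file-colors are assumptions; verify them locally.
    """
    return [
        "themisto", "build",
        "-k", str(k),                        # k-mer size 31, as stated above
        "-i", str(list_path),                # text file with one assembly path per line
        "--index-prefix", str(index_prefix),
        "--temp-dir", str(temp_dir),
        "--file-colors",                     # option named in the description above
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the flags have been confirmed for the installed Themisto version.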

PopPUNK clustering

We followed the approach described in Mäklin et al. 2022 using PopPUNK v2.5.0.

Demix check indexing

The index was generated using the `setup_reference.sh` script from tmaklin/coreutils_demix_check.

Contact

Tommi Mäklin <tommi'at'maklin.fi>.

Files

README.md

Files (32.9 GB)

Checksum and size of each file:

  • md5:76269906cbb04ad71db6420c4719209a (20.9 GB)
  • md5:615cd68f89a6e50218c49a8a7ad3b674 (49 Bytes)
  • md5:0e158863c703848f76bd508c511c23b7 (1.1 GB)
  • md5:12ae71bd8bc31a753967a48e94633167 (59 Bytes)
  • md5:10263a7e5117d57ba831b88ff7c37b9d (3.4 GB)
  • md5:b7377e8a6e623b9410f9d9d80aa29c77 (52 Bytes)
  • md5:10b16d27d3ebe7760ae3a113d95955ee (3.4 kB)
  • md5:343fd0f8653f4a0b88e74142fa0331b6 (7.5 GB)
  • md5:fe597d63f23521ad30cc7ab374141888 (59 Bytes)
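
The md5 checksums listed above can be used to check that a large download completed without corruption. The following is a minimal sketch using only the Python standard library. The function names are illustrative, and the file name in the usage comment is hypothetical, since this listing does not give the file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected):
    """Compare a file's MD5 with a listed checksum, with or without the 'md5:' prefix."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (hypothetical file name):
# verify_download("downloaded_file.tar", "md5:76269906cbb04ad71db6420c4719209a")
```

Reading in chunks keeps memory use constant, which matters for the multi-gigabyte files in this record.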

Additional details

Related works

Is documented by
Workflow: 10.5281/zenodo.10077810 (DOI)
Is part of
Journal article: 10.1038/s41467-024-49591-5 (DOI)
