Shotgun metagenomic sequencing dataset of a synthetic mock community containing 20 genomes spiked-in at even and staggered concentrations.
Description
Shotgun metagenomic (SM) sequencing is a popular method in microbial ecology for gaining insight into the structure and functional potential of a microbial community in a given biological system without the need to cultivate microorganisms. The dataset described in this article comprises technical triplicates of shotgun metagenomic sequence libraries generated from two purified and titrated mixes of 20 distinct reference bacterial genomes for which key characteristics such as genome size, sequence and spiked-in concentration are known. In one of the genomic DNA mixes, each genome is spiked in at a similar concentration (representing an even microbial community); in the other, genomes are spiked in at different concentrations, with some genomes highly abundant and others in low quantity, mimicking an uneven microbial community DNA extract. To be interpretable, SM sequencing data need to be properly analyzed by complex bioinformatic pipelines. Environments investigated with this method can range from simple to very complex. Typically, microbial communities contain some microbes that are ubiquitous and others that are much rarer. Analysis of rare microbes in a complex microbial community is challenging because their sequencing signal is swamped by that of the more abundant microbial genomes. In this context, it is critical to have access to sequencing data from simple mock communities made of mixes of well-characterized genomes in order to develop and validate bioinformatic methods that aim to accurately analyze microbial communities.
Files
| Name | Size | Checksum |
|---|---|---|
| Data_note_2022-08-29_final.pdf | 74.8 kB | md5:31964258439a6e597a853e260b9b55fd |
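Since the record lists an MD5 checksum for the data note, a downloaded copy can be checked against it. The following is a minimal sketch rather than part of the dataset: the function names `md5_of_file` and `matches_expected` are hypothetical helpers introduced for illustration, and the sketch assumes the PDF has been saved locally under its listed name.

```python
import hashlib

# Checksum as listed in the file table of this record.
EXPECTED_MD5 = "31964258439a6e597a853e260b9b55fd"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Hypothetical helper: True if the file's MD5 equals the expected one."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("Data_note_2022-08-29_final.pdf")` on an intact download should return `True`; a `False` result suggests the file was truncated or altered in transit.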