In silico mock communities for evaluation of taxonomic profilers across prokaryotes and viruses
Creators
Description
This record contains in silico mock communities generated with CAMISIM for benchmarking the performance of taxonomic profilers across prokaryotic (50 communities), eukaryotic (30 communities), and viral (10 communities) communities of the human microbiome. Metagenomes were generated using CAMISIM (Fritz et al., 2019), simulating 2.1 Gb of Illumina 2 × 150 bp paired-end reads with the default HiSeq 2500 error profile and a mean insert size of 200 bp. To assess profiling performance across a range of sequencing depths, the 50 prokaryotic in silico metagenomes were also rarefied with seqtk (-s100) to sequencing depths of 20, 5, 2, 1, 0.5, 0.25 and 0.1 million read pairs. Counts are provided for the rarefied metagenomes.
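The rarefaction step relies on a fixed random seed (`-s100` in seqtk) so that the same reads are drawn from the forward and reverse files and pairs stay in sync. The sketch below illustrates that idea in Python on toy data. It is not seqtk's algorithm and will not reproduce its selection; the function name `subsample_pairs` and the toy read names are assumptions made for illustration.

```python
import random


def subsample_pairs(pairs, n_pairs, seed=100):
    """Draw n_pairs read pairs without replacement, keeping the original order.

    Sampling pair indices once (rather than sampling each mate separately)
    guarantees that forward and reverse reads stay matched.
    """
    if n_pairs >= len(pairs):
        # Nothing to rarefy: the requested depth exceeds what is available.
        return list(pairs)
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    keep = sorted(rng.sample(range(len(pairs)), n_pairs))
    return [pairs[i] for i in keep]


# Toy input: 1000 hypothetical read pairs (forward name, reverse name).
pairs = [(f"read{i}/1", f"read{i}/2") for i in range(1000)]

# Rarefy the same input independently to several toy depths.
depths = [500, 250, 100]
subsets = {depth: subsample_pairs(pairs, depth) for depth in depths}
```

Each depth is drawn from the full input rather than from the previous subset, which mirrors rarefying one metagenome to several independent depths.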
Prokaryotic communities
For prokaryotic benchmarking, 10 body site-representative prokaryotic metagenomes were simulated for each of the following five body sites: adult gut, infant gut, oral, skin, and vagina. Genome accession IDs for prokaryotic species found at each human body site were identified from the published literature (Bäckhed et al., 2015; Proctor et al., 2019; Saheb Kashaf et al., 2021).
Adult Gut: pro_gut_adult.zip
Infant Gut: pro_gut_infant.zip
Oral: pro_oral.zip
Skin: pro_skin_1.zip, pro_skin_2.zip, pro_skin_3.zip
Vaginal: pro_vaginal.zip
Downsized counts:
Eukaryotic communities
30 eukaryotic in silico metagenomes were generated, comprising up to 200 randomly sampled genomes from a set of 113 eukaryotic species (see Supplementary Table 2 of the paper). These species are the eukaryotic species present in both the CHAMP and MetaPhlAn 4 (Blanco-Míguez et al., 2023) databases.
Eukaryotic data are deposited separately: doi: 10.5281/zenodo.12090449
Viral communities
10 viral communities were simulated with 95% of the reads from bacteria and 5% of the reads originating from phages. Each community consisted of 200 randomly selected bacterial genomes from GTDB with species-level annotation and 200 viral genomes from the Gut Phage Database (GPD, Camarillo-Guerrero et al., 2021).
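One way to picture the 95%/5% design is as two groups of genomes whose read fractions are normalised separately, so that bacteria sum to 0.95 and phages to 0.05. The following is a minimal sketch under stated assumptions: the within-group weights are drawn from a log-normal distribution with illustrative parameters, which the description above does not specify, and the genome identifiers and the function `community_abundances` are hypothetical.

```python
import random


def community_abundances(bacteria, phages, phage_fraction=0.05, seed=0):
    """Assign each genome a read fraction; phages sum to phage_fraction."""
    rng = random.Random(seed)

    def normalised(genome_ids, total):
        # Assumption: log-normal weights with illustrative parameters (1.0, 2.0).
        weights = [rng.lognormvariate(1.0, 2.0) for _ in genome_ids]
        scale = total / sum(weights)
        return {gid: w * scale for gid, w in zip(genome_ids, weights)}

    abundances = normalised(bacteria, 1.0 - phage_fraction)
    abundances.update(normalised(phages, phage_fraction))
    return abundances


# Hypothetical identifiers standing in for 200 GTDB and 200 GPD genomes.
bacteria = [f"bac_{i}" for i in range(200)]
phages = [f"phage_{i}" for i in range(200)]
abund = community_abundances(bacteria, phages)
```

Normalising the two groups separately keeps the overall bacterial-to-phage read ratio fixed regardless of how uneven the individual weights are.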
Counts: phage_communities_counts.zip
FastQ, forward reads: camisimu_[1-10].fq.1.gz
FastQ, reverse reads: camisimu_[1-10].fq.2.gz
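For scripted downloads or batch profiling, the two filename patterns above can be expanded into explicit forward/reverse pairs. This sketch assumes that `[1-10]` denotes the integers 1 through 10 without zero padding; the helper name `viral_fastq_pairs` is hypothetical.

```python
def viral_fastq_pairs(n_samples=10, prefix="camisimu_"):
    """Return (forward, reverse) FASTQ filenames for samples 1..n_samples."""
    return [
        (f"{prefix}{i}.fq.1.gz", f"{prefix}{i}.fq.2.gz")
        for i in range(1, n_samples + 1)
    ]


fastq_pairs = viral_fastq_pairs()
for forward, reverse in fastq_pairs:
    print(forward, reverse)
```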
References
Bäckhed, F., Roswall, J., Peng, Y., Feng, Q., Jia, H., Kovatcheva-Datchary, P., et al. (2015). Dynamics and Stabilization of the Human Gut Microbiome during the First Year of Life. Cell Host Microbe 17, 690–703. doi: 10.1016/J.CHOM.2015.04.004
Blanco-Míguez, A., Beghini, F., Cumbo, F., McIver, L. J., Thompson, K. N., Zolfo, M., et al. (2023). Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nature Biotechnology 41, 1633–1644. doi: 10.1038/s41587-023-01688-w
Camarillo-Guerrero, L. F., Almeida, A., Rangel-Pineros, G., Finn, R. D., and Lawley, T. D. (2021). Massive expansion of human gut bacteriophage diversity. Cell 184, 1098. doi: 10.1016/J.CELL.2021.01.029
Fritz, A., Hofmann, P., Majda, S., Dahms, E., Dröge, J., Fiedler, J., et al. (2019). CAMISIM: Simulating metagenomes and microbial communities. Microbiome 7, 1–12. doi: 10.1186/s40168-019-0633-6
Proctor, L. (2019). Priorities for the next 10 years of human microbiome research. Nature 569, 623–625. doi: 10.1038/d41586-019-01654-0
Saheb Kashaf, S., Proctor, D. M., Deming, C., Saary, P., Hölzer, M., Mullikin, J., et al. (2021). Integrating cultivation and metagenomics for a multi-kingdom view of skin microbiome diversity and functions. Nature Microbiology 7, 169–179. doi: 10.1038/s41564-021-01011-w
Files
(186.5 GB)
md5 | Size |
---|---|
md5:a943603532eeb999dc3c91e81d2d9eaa | 5.1 GB |
md5:d05eb9e41a8071507e3deb44dc36c43a | 5.3 GB |
md5:71bd723fa77d137d3598b73d93c953f4 | 5.1 GB |
md5:10c2fcce415e43762c11d008c5a2c4b3 | 5.3 GB |
md5:0abd9096f4d990851ff6081880b42c3e | 5.1 GB |
md5:292333256cad021985301049bc21806c | 5.3 GB |
md5:362e6989ae1b3d7f0b6d9a1a3f71d52f | 5.1 GB |
md5:8d83f3d9a613747bc13f41f425e354f3 | 5.3 GB |
md5:81fa93303af550a08f4e4a1d90d6ddc9 | 5.1 GB |
md5:cec612d2fa37ec2a13b928193642fa32 | 5.3 GB |
md5:b7b4443eebe2d072f7bd187c20031bad | 5.1 GB |
md5:b5487ef914216f029b2e4b7f5fcbd3f3 | 5.3 GB |
md5:23b5236a59809ff088cb49278069b203 | 5.1 GB |
md5:11aff75d70c92c11621b0159c6ebb835 | 5.3 GB |
md5:a7a83992b06ec396c7975a32021d5fab | 5.0 GB |
md5:4b849d2cc3903392c4c2ff989eefbe37 | 5.3 GB |
md5:e038153891d9f2e9d2ac50fb1cce9b45 | 5.1 GB |
md5:9af6d99037f22f3186814cf4c1df8c51 | 5.3 GB |
md5:770a88a90e0f7486e4a772908e81d1ec | 5.1 GB |
md5:324b6ff6e16c65b3ee565c5871d988ac | 5.3 GB |
md5:09b588d015c518051ef12e3ecd86826a | 3.6 MB |
md5:d1517396914401460afda7f182961aef | 16.4 GB |
md5:7a2489393a878d4cb404c22805cab12e | 16.6 GB |
md5:4b9cfadd22f054216cc6128c0407bde1 | 16.5 GB |
md5:d2e6ec72050caaba34adf2dac3d56018 | 8.3 GB |
md5:97bba38d9d7f9d717fdf3ab534300cb8 | 6.7 GB |
md5:e7f00f272e0a3e2a2e80d6e119bfba7e | 1.7 GB |
md5:2f8994cc6e528480b3d03cf99bcd5849 | 16.4 GB |
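Because the individual files are several gigabytes each, it is worth checking a download against the md5 value listed for it. A minimal sketch using Python's standard `hashlib` follows; the helper names `md5_of_file` and `matches_checksum` are assumptions for illustration, and the path and checksum passed in are whatever the user has downloaded and read off the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use constant, which matters for the multi-gigabyte archives in this record.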