DECARD: CC11 Dataset
Description
Data files for use with DECARD, as described in:
Golob JL, Margolis E, Hoffman NG, Fredricks DN. Evaluating the accuracy of amplicon-based microbiome computational pipelines on simulated human gut microbial communities. BMC Bioinformatics. 2017 May 30;18(1):283. PMCID: PMC5450146
Contents:
- 16S_SSU.tar.bz2
Filtered full-length repository reads culled from the NCBI 16S microbial BioProject and SILVA. These are the templates to use when generating a community.
- CC11.tar.bz2
Reads for the CC11 family of 100 synthetic communities used for our BMC Bioinformatics publication.
454 is for simulated 454 pyrosequencing reads, amplified with the HMP 454 primers.
illumina is for MiSeq-like paired-end reads, amplified with the EMP primers.
For each, there are error-free reads, reads with simulated error, and a map file (specifying the community for each sequence ID and the "true" source sequence for each read).
- CC11_targets.csv
A target file, to be used with DECARD to recreate the CC11 communities, using the culled reads in 16S_SSU.tar.bz2 and primer sequences of your choosing. This can be used to test different primer sets.
- CC11_targets.tre
The "true" phylogenetic tree for the full-length 16S sequences used in all of the CC11 communities, in Newick format and suitable for packages like phyloseq to calculate "true" DPCoA or (weighted) UniFrac pairwise distances between the communities.
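Both archives above are bzip2-compressed tarballs, and CC11.tar.bz2 is large, so it can help to inspect an archive before unpacking it. The following is a minimal sketch using only the Python standard library; the function names `list_archive` and `extract_archive` are illustrative and are not part of DECARD.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return (member name, size in bytes) for each regular file in a .tar.bz2 archive."""
    with tarfile.open(archive_path, mode="r:bz2") as tar:
        return [(member.name, member.size) for member in tar.getmembers() if member.isfile()]


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.bz2 archive into dest_dir, creating the directory if needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, mode="r:bz2") as tar:
        tar.extractall(dest, **kwargs)
    return dest
```

For example, `list_archive("CC11.tar.bz2")` would show the archive's members without writing anything to disk, and `extract_archive("16S_SSU.tar.bz2", "16S_SSU")` would unpack the template sequences into a directory of your choosing (the destination name here is arbitrary).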
Files (1.7 GB)
| MD5 checksum | Size |
|---|---|
| md5:7c530f5f72b70f59e4c4789febbce767 | 5.1 MB |
| md5:89d7596de13abc06f048ea233bbf01f0 | 1.7 GB |
| md5:0892821bce48a79de9640d1d5661512e | 1.3 MB |
| md5:bcf92d06a00a13ef8c1627af983906c6 | 122.3 kB |
| md5:95d5dac1d3995c767bce54e6e4636aeb | 1.1 kB |
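
The md5 values listed above can be used to check that a download completed intact. The sketch below uses only Python's standard `hashlib`; the helper names are illustrative. Which checksum belongs to which file is not stated in this listing, so the pairing has to be supplied by the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed checksum, with or without the 'md5:' prefix."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks matters mainly for the 1.7 GB file, which would otherwise be loaded into memory in one piece.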
Additional details
Related works
- Is supplemented by
- PMC5450146 (pmcid)
- 10.1186/s12859-017-1690-0 (DOI)