RDP taxonomic training data formatted for DADA2 (RDP release 19 - update 2023-08-23)
Description
These DADA2-formatted training FASTA files were derived from the Ribosomal Database Project's Training Set 19 and the 2023-08-23 release of the RDP database. Source data: https://sourceforge.net/projects/rdp-classifier/files/RDP_Classifier_TrainingData/
These FASTA files were generated with the following commands, using version 1.35.4 of the dada2 R package:
## RDP data: https://sourceforge.net/projects/rdp-classifier/files/RDP_Classifier_TrainingData/
path <- "~/tax/rdp/v19"

## Training set down to genus level (species rank excluded)
fn.out.rdp <- "~/Desktop/rdp_19_toGenus_trainset.fa.gz"
dada2:::makeTaxonomyFasta_RDP(file.path(path, "trainset19_072023_speciesrank.fa"),
                              file.path(path, "trainset19_db_taxid.txt"),
                              fn.out.rdp, include.species=FALSE,
                              compress=TRUE)

## Training set down to species level (species rank included)
fn.out.spc.rdp <- "~/Desktop/rdp_19_toSpecies_trainset.fa.gz"
dada2:::makeTaxonomyFasta_RDP(file.path(path, "trainset19_072023_speciesrank.fa"),
                              file.path(path, "trainset19_db_taxid.txt"),
                              fn.out.spc.rdp, include.species=TRUE,
                              compress=TRUE)
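The files produced by the commands above are meant to be passed to dada2's taxonomic classifier. The following is a minimal sketch of that step, not part of the original record: the wrapper name `assign_rdp_taxonomy` and its `with_species` argument are hypothetical, and the rank labels are an assumption based on the default `taxLevels` of `dada2::assignTaxonomy`.

```r
# Hypothetical helper: classify sequences against one of the training files.
# `ref_fasta` is the path to a downloaded or regenerated *_trainset.fa.gz file.
assign_rdp_taxonomy <- function(seqs, ref_fasta, with_species = FALSE) {
  stopifnot(is.character(seqs), file.exists(ref_fasta))
  if (!requireNamespace("dada2", quietly = TRUE)) {
    stop("The dada2 package is required for taxonomic assignment.")
  }
  # Assumed rank labels: six ranks for the toGenus file, seven for toSpecies.
  tax_levels <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus")
  if (with_species) {
    tax_levels <- c(tax_levels, "Species")
  }
  dada2::assignTaxonomy(seqs, ref_fasta,
                        taxLevels = tax_levels,
                        minBoot = 50,          # dada2's default bootstrap threshold
                        multithread = FALSE)
}
```

A call might look like `assign_rdp_taxonomy(my_seqs, "~/Desktop/rdp_19_toGenus_trainset.fa.gz")`, where `my_seqs` stands for a character vector of amplicon sequences; set `with_species = TRUE` when using the toSpecies file.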
Files
Total size: 12.8 MB
Checksum | Size |
---|---|
md5:390b8a359c45648adf538e72a1ee7e28 | 6.3 MB |
md5:951c6d90f1bcc893411f0624b34663f5 | 6.5 MB |
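The MD5 checksums in the file listing can be used to confirm that a download is intact. A minimal sketch using base R's `tools::md5sum` follows; the helper name `verify_download` is hypothetical, and because the listing does not say which checksum belongs to which file, the check accepts a match against either value.

```r
# Checksums copied from the file listing; the mapping from checksum to file
# name is not stated there, so both are treated as acceptable.
expected_md5 <- c("390b8a359c45648adf538e72a1ee7e28",
                  "951c6d90f1bcc893411f0624b34663f5")

# Hypothetical helper: TRUE if the file's MD5 matches one of the expected values.
verify_download <- function(path, expected = expected_md5) {
  path <- path.expand(path)
  stopifnot(file.exists(path))
  observed <- unname(tools::md5sum(path))
  observed %in% expected
}
```

For example, `verify_download("~/Downloads/rdp_19_toGenus_trainset.fa.gz")` (an assumed download location) should return `TRUE` for an uncorrupted copy of either file.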
Additional details
Funding
- National Institutes of Health
- Quantitative Metagenomics and the Vaginal Microbiome of Preterm Birth R35GM133745
Software
- Repository URL: https://github.com/benjjneb/dada2
- Programming language: R
- Development status: Active
References
- Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and environmental microbiology. 2007 Aug 15;73(16):5261-7.
- Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nature methods. 2016 Jul;13(7):581-3.