These training fasta files are derived from the Silva Project's version 138.1 release and formatted for use with DADA2. These files are intended for use in classifying prokaryotic 16S sequencing data and are not appropriate for classifying eukaryotic ASVs.
See https://benjjneb.github.io/dada2/training.html for information about DADA2 reference databases and https://www.arb-silva.de/documentation/release-138.1/ for database and citation information for Silva 138.1. The Silva 138.1 database is licensed under Creative Commons Attribution 4.0 (CC-BY 4.0); see file "SILVA_LICENSE.txt". These fasta database files were generated and checked for consistency using the R markdown documents in the silva-138.1 folder in https://zenodo.org/record/4587946.
If you use these files, please cite one or both of the Silva references below (or at the above link) and the DADA2 paper (reference below). I also recommend citing or linking to the Zenodo record for this specific version in your Methods or published source code to record the specific taxonomic database files used in your analysis.
NOTE: These database files have a known problem affecting 3 of 895 families and 59 of 3936 genera. See https://github.com/mikemc/dada2-reference-databases/blob/main/silva-138.1/v1/bad-taxa.csv for a list of affected taxa and https://github.com/benjjneb/dada2/issues/1293 for more information.
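One way to see which reference entries involve an affected taxon is to scan the FASTA headers, which in the DADA2 training format hold a semicolon-separated lineage (for example ">Kingdom;Phylum;Class;Order;Family;Genus;"). The Python below is a minimal sketch under stated assumptions, not part of the original workflow: the function names are hypothetical, and the sets of affected family and genus names are supplied by the caller, since the column layout of bad-taxa.csv is not described here.

```python
import gzip
from typing import Iterable, Iterator, List, Set


def read_headers(path: str) -> Iterator[str]:
    """Yield the header lines (starting with '>') of a plain or gzipped FASTA file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                yield line


def parse_header(header: str) -> List[str]:
    """Split a header such as '>A;B;C;D;E;F;' into its rank names."""
    return [name for name in header.strip().lstrip(">").split(";") if name]


def flag_affected(
    headers: Iterable[str],
    bad_families: Set[str],  # assumption: names taken by hand from bad-taxa.csv
    bad_genera: Set[str],    # assumption: names taken by hand from bad-taxa.csv
) -> Iterator[str]:
    """Yield headers whose family (5th rank) or genus (6th rank) is listed as affected."""
    for header in headers:
        taxa = parse_header(header)
        family = taxa[4] if len(taxa) > 4 else None
        genus = taxa[5] if len(taxa) > 5 else None
        if family in bad_families or genus in bad_genera:
            yield header.strip()


# Placeholder lineages and names, used only to show the call shape.
example_headers = [
    ">Bacteria;PhylumA;ClassA;OrderA;FamilyX;GenusOk;",
    ">Bacteria;PhylumA;ClassA;OrderA;FamilyOk;GenusY;",
    ">Bacteria;PhylumA;ClassA;OrderA;FamilyOk;",
]
flagged = list(flag_affected(example_headers, {"FamilyX"}, {"GenusY"}))
```

For a real file, the headers would come from read_headers(path) with the path of the downloaded training set. Matching on names alone only locates candidate entries; the linked GitHub issue describes what is actually wrong with them.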
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO (2013) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucl. Acids Res. 41 (D1): D590-D596.
Yilmaz P, Parfrey LW, Yarza P, Gerken J, Pruesse E, Quast C, Schweer T, Peplies J, Ludwig W, Glöckner FO (2014) The SILVA and "All-species Living Tree Project (LTP)" taxonomic frameworks. Nucl. Acids Res. 42 (D1): D643-D648.
Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. 2016. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13:581–583. doi:10.1038/nmeth.3869