Published February 12, 2023 | Version 207.0
Dataset
Open
RDP Classifier training files for 16S rRNA sequences from GTDB
Description
16S rRNA gene sequences from the Genome Taxonomy Database (GTDB release 207) were used to retrain the RDP Classifier (version 2.13). Two sets of training files are provided:
- genus.zip: Genus level
- species.zip: Species level
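As a rough illustration of how one of these training sets might be used, the sketch below builds an RDP Classifier `classify` command line in Python. It assumes that the unzipped archive contains an `rRNAClassifier.properties` file and that a copy of `classifier.jar` is available locally. Neither is confirmed by this record, and all paths, the memory setting, and the helper name `build_classify_command` are hypothetical.

```python
import subprocess


def build_classify_command(jar, properties, fasta, output, confidence=0.8):
    """Return the argument list for an RDP Classifier 'classify' run.

    All paths are placeholders chosen for illustration; adjust them to
    wherever classifier.jar and the unzipped training files actually live.
    """
    return [
        "java", "-Xmx4g", "-jar", str(jar),  # heap size is an arbitrary guess
        "classify",
        "-t", str(properties),   # properties file pointing at the training set
        "-c", str(confidence),   # bootstrap confidence cutoff
        "-o", str(output),       # file that receives the assignments
        str(fasta),              # query sequences
    ]


if __name__ == "__main__":
    cmd = build_classify_command(
        jar="classifier.jar",                        # assumed location
        properties="genus/rRNAClassifier.properties",  # assumed archive layout
        fasta="query.fasta",
        output="assignments.txt",
    )
    print(" ".join(cmd))
    # Uncomment to actually run the classifier (requires Java and the jar):
    # subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.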
The code in prepare_files.R was used to prepare the GTDB sequence and taxonomy files for retraining the RDP Classifier. Notes:
- Steps to retrain the RDP Classifier are adapted from https://john-quensen.com/tutorials/training-the-rdp-classifier/
- Python scripts (lineage2taxTrain.py and addFullLineage.py) are available at https://github.com/rdpstaff/classifier/issues/18
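The preparation step itself is done in prepare_files.R, which is not reproduced here. Purely as an illustration of the kind of reshaping involved, the Python sketch below splits a GTDB-style lineage string (rank prefixes such as `d__`, `p__`, `g__`) into one column per rank and writes a tab-delimited row. The exact column layout expected by lineage2taxTrain.py is not stated in this record, so the header used here is an assumption, and the sequence ID and helper names are hypothetical.

```python
# Map GTDB rank prefixes to rank names.
GTDB_RANKS = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}
RANK_ORDER = list(GTDB_RANKS.values())


def split_gtdb_lineage(lineage):
    """Split 'd__X;p__Y;...' into a dict keyed by rank name."""
    ranks = {}
    for field in lineage.split(";"):
        prefix, _, name = field.strip().partition("__")
        if prefix in GTDB_RANKS:
            ranks[GTDB_RANKS[prefix]] = name
    return ranks


def taxonomy_row(seq_id, lineage, lowest_rank="genus"):
    """Return a tab-delimited row: sequence ID, then ranks down to lowest_rank.

    Missing ranks are left empty here; a real pipeline would need a
    policy for them, because the classifier expects complete lineages.
    """
    ranks = split_gtdb_lineage(lineage)
    wanted = RANK_ORDER[: RANK_ORDER.index(lowest_rank) + 1]
    return "\t".join([seq_id] + [ranks.get(r, "") for r in wanted])


if __name__ == "__main__":
    example = ("d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;"
               "o__Enterobacterales;f__Enterobacteriaceae;"
               "g__Escherichia;s__Escherichia coli")
    print(taxonomy_row("seq1", example))  # "seq1" is a made-up ID
```

Passing `lowest_rank="species"` keeps the species column as well, which mirrors the genus-level and species-level split of the two archives.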
An analysis of human microbiome data using these files is described in this preprint.