COI rCRUX filtered metabarcoding reference database and naive-bayes classifier
Description
COI metabarcoding reference database and naive-bayes classifier in QIIME2 .qza format, with Insecta and Amphibia sequences removed. The original database, built with rCRUX using the Leray CO1 primers, was downloaded from Zenodo (see Related works).
rCRUX details
The rCRUX database was generated by combining and dereplicating the following databases:
Leray CO1-ncbi-mitochondrial (https://doi.org/10.5281/zenodo.8407603)
Leray CO1-embl (https://doi.org/10.5281/zenodo.8407606)
Leray CO1-searchterm (https://doi.org/10.5281/zenodo.8407620)
Primer Name: Leray CO1
Gene: CO1
Length of Target: ~313 bp
Forward Sequence (5'-3'): GGWACWGGWTGAACWGTWTAYCCYCC
Reverse Sequence (5'-3'): TANACYTCNGGRTGNCCRAARAAYCA
Reference: Leray, M., Yang, J. Y., Meyer, C. P., Mills, S. C., Agudelo, N., Ranwez, V., ... & Machida, R. J. (2013). A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: application for characterizing coral reef fish gut contents. Frontiers in zoology, 10(1), 34.
Steps used to filter the database and train the classifier:
4. Use the list of taxa that are neither Amphibia nor Insecta to filter the original COI database FASTA:
seqkit grep -f filtered_taxa_toextract.txt CO1_combined_derep_and_clean.fasta > CO1_combined_derep_and_clean-noInsectaAmphibia.fasta
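By default, seqkit grep -f matches the sequence IDs in the FASTA headers against the pattern file, so filtered_taxa_toextract.txt presumably holds one sequence ID per line. How that list was produced is not shown here. The following is only a minimal sketch of one way to build it, assuming the unfiltered taxonomy is a headerless TSV (sequence ID, tab, semicolon-delimited lineage) in which the class names Insecta and Amphibia appear as whole ranks. The function name filter_taxonomy and the input file name CO1_combined_derep_and_clean_taxonomy.txt are hypothetical.

```shell
# Hypothetical helper: print taxonomy rows whose lineage contains
# neither Insecta nor Amphibia as a whole rank.
filter_taxonomy() {
  # $1 = headerless TSV: sequence ID <TAB> semicolon-delimited lineage (assumed layout)
  awk -F'\t' '{
    n = split($2, ranks, ";")
    for (i = 1; i <= n; i++) {
      gsub(/^ +| +$/, "", ranks[i])   # trim spaces around each rank
      if (ranks[i] == "Insecta" || ranks[i] == "Amphibia") next
    }
    print
  }' "$1"
}

# Assumed name of the unfiltered taxonomy file; adjust to the real file.
tax_in="CO1_combined_derep_and_clean_taxonomy.txt"
if [ -f "$tax_in" ]; then
  filter_taxonomy "$tax_in" > CO1_combined_derep_and_clean_taxonomy-noInsectaAmphibia.txt
  # The first column of the filtered taxonomy is the ID list for seqkit grep -f.
  cut -f1 CO1_combined_derep_and_clean_taxonomy-noInsectaAmphibia.txt > filtered_taxa_toextract.txt
fi
```

Matching whole ranks rather than substrings avoids dropping lineages that merely contain one of the class names inside a longer name.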
qiime tools import --type 'FeatureData[Sequence]' \
--input-path CO1_combined_derep_and_clean-noInsectaAmphibia.fasta \
--output-path COI_rCRUX_filt_20231110.qza
qiime tools import --type 'FeatureData[Taxonomy]' \
--input-path CO1_combined_derep_and_clean_taxonomy-noInsectaAmphibia.txt \
--output-path COI_rCRUX_taxonomy_filt_20231110.qza \
--input-format 'HeaderlessTSVTaxonomyFormat'
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads COI_rCRUX_filt_20231110.qza \
--i-reference-taxonomy COI_rCRUX_taxonomy_filt_20231110.qza \
--p-classify--chunk-size 5000 \
--o-classifier COI_rCRUX_filt_20231110-classifier.qza
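Once trained, the classifier can be applied to query sequences with qiime feature-classifier classify-sklearn. The sketch below is illustrative rather than part of the original workflow: the wrapper name classify_reads, the DRY_RUN switch, and the query and output file names are assumptions. Note that QIIME2 classifiers generally need to be used with the same scikit-learn version they were trained with.

```shell
# Hypothetical wrapper around classify-sklearn.
# Set DRY_RUN=1 to print the command instead of running it.
classify_reads() {
  # $1 = trained classifier (.qza)
  # $2 = query sequences as a FeatureData[Sequence] artifact (assumed name)
  # $3 = output taxonomy artifact (assumed name)
  set -- qiime feature-classifier classify-sklearn \
    --i-classifier "$1" \
    --i-reads "$2" \
    --o-classification "$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Example (rep-seqs.qza and taxonomy.qza are placeholder names):
# classify_reads COI_rCRUX_filt_20231110-classifier.qza rep-seqs.qza taxonomy.qza
```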
Files
CO1_combined_derep_and_clean_taxonomy-noAmphibiaInsecta.txt
Total size: 877.3 MB
| MD5 | Size |
|---|---|
| 77815c4040ed90e75338fcc4c43eeff8 | 288.4 MB |
| 85f1a3d632a64bdfc97ac957faf8c3b4 | 94.4 MB |
| 48195f688d050f7407ce1c486d9d7a3c | 451.1 MB |
| 4b407995951f2ca77b196a708ac1e088 | 31.8 MB |
| 9d3e9ea8bbceeb6347b1a450d6f859ed | 11.6 MB |
| 99ea7c0b2efffce3c6d6b92b847b98fd | 894 Bytes |
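The MD5 checksums listed above can be used to confirm that a download completed intact. A minimal sketch follows, assuming GNU md5sum is available (on macOS, md5 -q is the usual substitute). The helper name verify_md5 is hypothetical, and the file path in the example is a placeholder.

```shell
# Hypothetical helper: succeed only if the file's MD5 matches the expected digest.
verify_md5() {
  # $1 = path to the downloaded file
  # $2 = expected MD5 hex digest copied from the file listing
  actual=$(md5sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Example (path and digest are placeholders):
# verify_md5 path/to/downloaded_file <md5-from-listing> && echo "checksum OK"
```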
Additional details
Related works
- Is variant form of: Dataset, 10.5281/zenodo.8407631 (DOI)
References
- Bokulich, N. A., Kaehler, B. D., Rideout, J. R., Dillon, M., Bolyen, E., Knight, R., Huttley, G. A., & Caporaso, J. G. (2018). Optimizing taxonomic classification of marker gene sequences. Microbiome, 6(1), 90. https://doi.org/10.1186/s40168-018-0470-z
- Curd, E. E., Gal, L., Gallego, R., Silliman, K., Nielsen, S., & Gold, Z. (2023). rCRUX: A rapid and versatile tool for generating metabarcoding reference libraries in R. Environmental DNA (Hoboken, N.J.). https://doi.org/10.1002/edn3.489
- Leray, M., Yang, J. Y., Meyer, C. P., Mills, S. C., Agudelo, N., Ranwez, V., ... & Machida, R. J. (2013). A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: application for characterizing coral reef fish gut contents. Frontiers in zoology, 10(1), 34.