ReClustOR: a Re-Clustering method using an Open-Reference method that improves OTU definition for metabarcoding approaches
Creators
- 1. UMR 1347 Agroécologie, INRA/UBFC
Description
ReClustOR is a novel clustering method that overcomes some of the problems associated with classical ‘heuristic’ clustering methods and consequently increases the stability and quality of the reconstructed OTUs. Moreover, the OTU database defined with ReClustOR can be used as a reference and gradually enriched with new studies and samples. In this way, huge datasets such as the Earth Microbiome Project can easily be used as references for smaller projects, thereby increasing the quality of comparisons between studies and datasets.
More precisely, this new strategy combines two previously described clustering methods. First, a de novo method is used to define OTU centroids and create a reference database. Second, a closed- or open-reference method (depending on the user’s choice) is applied to all reads that are not OTU centroids. Because the chosen OTU centroids differ from one another by more than the defined threshold and sufficiently reflect the biodiversity of the microbial communities being examined, they can be considered a good reference database. Unlike in the de novo clustering approach, each read is compared to all centroids using a distance-based greedy clustering technique (Edgar, 2010; He et al., 2015) and then assigned to the nearest one, thereby correcting erroneous assignments of reads to OTUs (Figure 1C). With this method, any dataset can be compared directly against the reference database of selected OTUs, without repeating the entire analysis on all the data.
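The two-step strategy described above can be illustrated with a minimal Python sketch. It is not the ReClustOR implementation: the function names (`identity`, `pick_centroids`, `reassign`) are hypothetical, the similarity measure is a toy position-by-position match fraction standing in for a real alignment-based identity, and the 0.8 threshold in the usage note is chosen only to keep the toy sequences short.

```python
from typing import List, Optional, Tuple


def identity(a: str, b: str) -> float:
    """Toy similarity (assumption): fraction of matching positions.

    A real pipeline would use an alignment-based identity instead.
    """
    n = max(len(a), len(b))
    if n == 0:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / n


def pick_centroids(reads: List[str], threshold: float) -> List[str]:
    """De novo step: a read becomes a centroid only if it is below
    the identity threshold with every centroid chosen so far."""
    centroids: List[str] = []
    for read in reads:
        if all(identity(read, c) < threshold for c in centroids):
            centroids.append(read)
    return centroids


def reassign(
    reads: List[str],
    centroids: List[str],
    threshold: float,
    open_reference: bool = True,
) -> Tuple[List[Optional[int]], List[str]]:
    """Re-clustering step: compare each read with all centroids and
    assign it to the nearest one.

    Reads with no centroid at or above the threshold either found a
    new OTU (open-reference) or stay unassigned as None (closed-reference).
    """
    centroids = list(centroids)  # do not mutate the caller's reference
    assignment: List[Optional[int]] = []
    for read in reads:
        best = max(
            range(len(centroids)),
            key=lambda i: identity(read, centroids[i]),
            default=None,
        )
        if best is not None and identity(read, centroids[best]) >= threshold:
            assignment.append(best)
        elif open_reference:
            centroids.append(read)
            assignment.append(len(centroids) - 1)
        else:
            assignment.append(None)
    return assignment, centroids
```

As a usage example under these assumptions, take the reads `AAAAAAAAAA`, `AAAAAAAACC` and `AAAAAAACCC` with a threshold of 0.8. A single greedy pass would attach the second read to the first centroid, because the third read has not yet been seen; calling `reassign` on the same reads moves it to the third read's OTU, which is its nearest centroid. Reads from a later sample can be passed to `reassign` with the same centroid list, so the reference is reused, and in open-reference mode enriched, without re-clustering the earlier data.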
Files
ReClustOR.zip
Total size: 2.7 MB
Checksum | Size |
---|---|
md5:dc7d4f4e2c3ba26e650275243486f091 | 2.0 MB |
md5:abaa6ddf8028eca28ca123b6896c84af | 720.3 kB |