Published March 18, 2019 | Version 1.0.1
Software Open

ReClustOR: a Re-Clustering method using an Open-Reference method that improves OTU definition for metabarcoding approaches

Description

ReClustOR is a novel clustering method that overcomes some of the problems associated with classical ‘heuristic’ clustering methods and consequently increases the stability and quality of the reconstructed OTUs. Moreover, the OTU database defined with ReClustOR can be used as a reference and gradually enriched with new studies and samples. In this way, huge datasets like the Earth Microbiome Project can easily be used as references for smaller projects, thereby increasing the quality of comparisons between studies and datasets.

More precisely, this new strategy combines two previously described clustering methods. First, a de novo method is used to define OTU centroids and create a reference database. Second, a closed- or open-reference method (depending on the user’s choice) is applied to all reads that are not OTU centroids. Because the chosen OTU centroids differ from one another by more than the defined threshold and sufficiently reflect the biodiversity of the microbial communities being examined, they can be considered a good reference database. Contrary to the de novo clustering approach, each read is compared to all centroids using a distance-based greedy clustering technique (Edgar, 2010; He et al., 2015) and then assigned to the nearest one, thereby fixing the erroneous assignments of reads to OTUs (Figure 1C). With this method, all datasets can be compared directly against the reference database of selected OTUs, without needing to repeat the entire analysis on the full dataset.
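
The two steps described above can be illustrated with a minimal Python sketch. It is not the ReClustOR implementation: the function names (`distance`, `pick_centroids`, `assign_reads`), the mismatch-fraction distance on equal-length sequences, and the toy threshold of 0.35 are all assumptions chosen to keep the example short. Real pipelines would use alignment-based identity and a much stricter threshold.

```python
from typing import Dict, List, Tuple


def distance(a: str, b: str) -> float:
    """Toy distance (assumption): fraction of mismatched positions
    between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pick_centroids(reads: List[str], threshold: float) -> List[str]:
    """Step 1 (de novo): a read becomes a centroid when it is farther
    than `threshold` from every centroid chosen so far."""
    centroids: List[str] = []
    for read in reads:
        if all(distance(read, c) > threshold for c in centroids):
            centroids.append(read)
    return centroids


def assign_reads(
    reads: List[str],
    centroids: List[str],
    threshold: float,
    open_reference: bool = True,
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Step 2: compare each read with all centroids and assign it to the
    nearest one. Reads beyond the threshold either seed a new OTU
    (open-reference) or are left unassigned (closed-reference)."""
    centroids = list(centroids)  # copy so the caller's list is not modified
    otus: Dict[str, List[str]] = {c: [] for c in centroids}
    unassigned: List[str] = []
    for read in reads:
        nearest = min(centroids, key=lambda c: distance(read, c), default=None)
        if nearest is not None and distance(read, nearest) <= threshold:
            otus[nearest].append(read)
        elif open_reference:
            centroids.append(read)  # enrich the reference with a new centroid
            otus[read] = [read]
        else:
            unassigned.append(read)
    return otus, unassigned


# Hypothetical toy reads: r is within the threshold of a, but closer to b.
a, b, r = "AAAAAAAAAA", "AAAAAACCCC", "AAAAAAACCC"
reads = [a, b, r]
centroids = pick_centroids(reads, threshold=0.35)
otus, unassigned = assign_reads(reads, centroids, threshold=0.35)

# A later dataset can be compared against the same centroids.
new_reads = ["CCCCCCCCCC"]
closed_otus, closed_left = assign_reads(new_reads, centroids, 0.35, open_reference=False)
open_otus, open_left = assign_reads(new_reads, centroids, 0.35, open_reference=True)
```

In this toy case a first-hit greedy pass would place `r` with `a`, because `a` is the first centroid within the threshold. Comparing `r` against all centroids places it with `b`, its nearest centroid. For simplicity the sketch also passes the centroid reads through step 2, where each one is assigned to itself.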

Files

ReClustOR.zip

Files (2.7 MB)

md5:dc7d4f4e2c3ba26e650275243486f091 (2.0 MB)
md5:abaa6ddf8028eca28ca123b6896c84af (720.3 kB)