18S V4 rDNA sequences organized at the OTU level for the SOMLIT-Astan time-series (2009-2016)
Creators
- 1. Sorbonne Université, CNRS, Station Biologique de Roscoff, AD2M, UMR7144, Place Georges Tessier, 29680 Roscoff, France; Research Federation for the study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 3 rue Michel-Ange, 75016 Paris, France.
- 2. Sorbonne Université, CNRS, Station Biologique de Roscoff, AD2M, UMR7144, Place Georges Tessier, 29680 Roscoff, France.
- 3. Cirad, UMR BGPI, F-34398, Montpellier, France
- 4. Sorbonne Université, CNRS, Station Biologique de Roscoff, FR2424, 29680 Roscoff, France.
Description
The present file includes metadata for each 18S V4 rDNA OTU from the SOMLIT-Astan time series (2009-2016), with the following fields:
- amplicon = identifier of the representative (most abundant) sequence
- total = total number of reads
- spread = number of samples in which the OTU was found
- cloud = number of unique sequences constituting the OTU
- sequence = nucleic acid sequence of the representative sequence
- length = length of the representative sequence
- quality = minimum expected error observed for the representative sequence, divided by sequence length
- taxonomy = taxonomic path assigned to the representative sequence
- identity = percentage of identity of the representative sequence to the closest reference sequence from PR2
- references = best hit reference sequence(s)
- RA090107_02:RA161222_3 = 375 samples from January 2009 to December 2016. In each sample name, the first two digits are the year, followed by the month and the day (sampling twice a month for 8 years). The value after "_" indicates the size of the filter used for filtration: 02 for 0.2 µm and 3 for 3 µm.
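The sample naming convention described above can be decoded programmatically. The following is a minimal sketch, assuming every sample column follows the `RA` + `YYMMDD` + `_` + filter code pattern shown in the two example names; the function name `parse_sample_name` and the lookup table `FILTER_UM` are hypothetical and not part of the dataset.

```python
import re
from datetime import date

# Assumed pattern: "RA", six digits (YYMMDD), "_", then filter code "02" or "3".
SAMPLE_RE = re.compile(r"^RA(\d{2})(\d{2})(\d{2})_(02|3)$")

# Filter codes as described in the field list: 02 -> 0.2 µm, 3 -> 3 µm.
FILTER_UM = {"02": 0.2, "3": 3.0}


def parse_sample_name(name):
    """Return (sampling date, filter size in µm) for a sample column name."""
    match = SAMPLE_RE.match(name)
    if match is None:
        raise ValueError(f"not a sample column name: {name!r}")
    yy, mm, dd, code = match.groups()
    # Two-digit years are assumed to fall in the 2000s (series spans 2009-2016).
    return date(2000 + int(yy), int(mm), int(dd)), FILTER_UM[code]


if __name__ == "__main__":
    for column in ("RA090107_02", "RA161222_3"):
        day, size = parse_sample_name(column)
        print(column, day.isoformat(), size)
```

Raising `ValueError` on non-matching names makes it straightforward to separate the sample columns from the metadata columns (amplicon, total, spread, and so on) when reading the table header.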
The 18S V4 rDNA Operational Taxonomic Units (OTUs) were generated from the raw sequencing reads and assembled into an OTU table using the following pipeline (https://doi.org/10.5281/zenodo.5791089). The V4 region was extracted from the 18S rDNA reference sequences of PR2 v4.12 (Guillou et al., 2013) with Cutadapt. The representative sequence of each OTU was compared to these V4 reference sequences by pairwise global alignment (VSEARCH's usearch_global command). Each OTU inherits the taxonomy of its best hit, or of the last common ancestor of the best hits in case of ties. OTUs whose best hit had less than 80% similarity were considered unassigned (Mahé et al., 2017; Stoeck et al., 2010).
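The assignment rule described above (best hit, last common ancestor on ties, 80% cutoff) can be illustrated with a short sketch. This is not the code of the referenced pipeline: the function names, the `|` rank separator, the `"unassigned"` label, and the representation of hits as `(identity, taxonomic path)` pairs are all assumptions made for illustration.

```python
def last_common_ancestor(paths, sep="|"):
    """Return the longest rank prefix shared by all taxonomic paths."""
    split_paths = [path.split(sep) for path in paths]
    shared = []
    # zip stops at the shortest path; stop earlier at the first disagreement.
    for ranks in zip(*split_paths):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return sep.join(shared)


def assign_taxonomy(hits, min_identity=80.0, sep="|"):
    """Assign a taxonomy from (identity %, taxonomic path) reference hits.

    Hypothetical helper: the hit format and the returned label are assumptions.
    """
    if not hits:
        return "unassigned"
    best = max(identity for identity, _ in hits)
    # Best hits below the similarity cutoff leave the OTU unassigned.
    if best < min_identity:
        return "unassigned"
    # Collect every reference tied at the best identity and merge their paths.
    tied = [path for identity, path in hits if identity == best]
    return last_common_ancestor(tied, sep)


if __name__ == "__main__":
    example_hits = [
        (97.5, "Eukaryota|Alveolata|Dinoflagellata"),
        (97.5, "Eukaryota|Alveolata|Ciliophora"),
        (90.0, "Eukaryota|Stramenopiles|Ochrophyta"),
    ]
    print(assign_taxonomy(example_hits))
```

In this example the two tied best hits agree only down to the second rank, so the OTU receives the shared prefix rather than either full path.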
The final dataset (filtered OTU table) contains 375 samples (sampled twice per month from 2009 to 2016) with a total of ~30 million sequence reads and 21,418 OTUs.
Files
Name | Size |
---|---|
md5:dc2d2d8390202b65102f03b8e2a6d970 | 4.3 MB |
Additional details
References
- Mahé, F., de Vargas, C., Bass, D., Czech, L., Stamatakis, A., Lara, E., … Dunthorn, M. (2017). Parasites dominate hyperdiverse soil protist communities in Neotropical rainforests. Nature Ecology & Evolution, 1, 0091. https://doi.org/10.1038/s41559-017-0091
- Stoeck, T., Bass, D., Nebel, M., Christen, R., Jones, M. D. M., Breiner, H. W., & Richards, T. A. (2010). Multiple marker parallel tag environmental DNA sequencing reveals a highly complex eukaryotic community in marine anoxic water. Molecular Ecology, 19(Suppl. 1), 21–31. https://doi.org/10.1111/j.1365-294X.2009.04480.x