Published June 26, 2019 | Version v1
Journal article Open

The UNITE Database for Molecular Identification and for Communicating Fungal Species

  • 1. University of Tartu, Tartu, Estonia
  • 2. University of Tartu, Natural History Museum, Tartu, Estonia
  • 3. University of Gothenburg, Göteborg, Sweden; Gothenburg Global Biodiversity Centre, Gothenburg, Sweden
  • 4. Department of Research and Collections, University of Oslo, Natural History Museum, Postboks 1172, Blindern, 0318 Oslo, Norway
  • 5. The James Hutton Institute, Craigiebuckler, Aberdeen AB15 8QH, Scotland, UK

Description

UNITE (https://unite.ut.ee; Nilsson et al. 2018) is an international community of scientists and citizen scientists established in 2001. UNITE aims to develop: 1) datasets and tools for robust and reproducible molecular identification; and 2) a system based on persistent identifiers for communicating fungal species. Datasets of the nuclear ribosomal internal transcribed spacer (ITS) region form the basis for UNITE. The current version includes nearly 1 million public fungal ITS sequences. The datasets are curated and annotated by community members, who have made more than 275 000 improvements during the past 15 years. Where Latin names for species are entirely absent, UNITE offers a unique system in which species hypotheses (SHs) are assigned Digital Object Identifiers (DOIs). The current version 8 of UNITE offers more than 800 000 DOI-based SHs. One such SH DOI page is shown in Fig. 1. These DOIs are also incorporated into the taxonomic backbone, making communication of taxa seamless in both directions. SH DOIs are also used by GBIF (Global Biodiversity Information Facility) to publish taxon occurrence data from high-throughput sequencing in its data portal.

UNITE serves as a data provider for a range of metabarcoding software pipelines and regularly exchanges data with all major fungal sequence databases and other community resources.

Recent improvements include ITS-based species hypotheses for all eukaryotes and aggregation of full-length, high-quality ITS sequences generated by the PacBio Sequel system (https://www.pacb.com/products-and-services/sequel-system) from diverse material samples.

Files

BISS_article_37402.pdf
