Naming the unnamed: Over 65,000 Candidatus names for unnamed Archaea and Bacteria in the Genome Taxonomy Database
Authors/Creators
- 1. University of East Anglia
- 2. Quadram Institute Bioscience
- 3. University of Innsbruck
Description
Thousands of new bacterial and archaeal species and higher-level taxa are discovered each year through the analysis of genomes and metagenomes. The Genome Taxonomy Database (GTDB) provides hierarchical sequence-based descriptions and classifications for new and as-yet-unnamed taxa. However, bacterial nomenclature, as currently configured, cannot keep up with the need for new well-formed names. Instead, microbiologists have been forced to use hard-to-remember alphanumeric placeholder labels. Here, we exploit an approach to the generation of well-formed arbitrary Latinate names at a scale sufficient to name tens of thousands of unnamed taxa within GTDB. These newly created names represent an important resource for the microbiology community, facilitating communication between bioinformaticians, microbiologists and taxonomists, while populating the emerging landscape of microbial taxonomic and functional discovery with accessible and memorable linguistic labels.
Presented here are input and output files associated with the scripts used in this project. Note that the file simple.txt was too large to upload but can be downloaded from https://hosted-datasets.gbif.org/datasets/backbone/backbone-current-simple.txt.gz
Scripts are available from
This version of the files and associated names for bacteria supersedes those published at https://zenodo.org/record/5652886, which were associated with this preprint: https://www.preprints.org/manuscript/202111.0557/v1
Files (1.6 GB)
Preview: ar53_r207_corrected_ar_genus_names_table.txt
| Checksum | Size |
|---|---|
| md5:54f039701da66979f0c4bbc4a2aa2e3d | 186.2 kB |
| md5:2522604eebfb4916e1fff2eaf876ceeb | 2.7 MB |
| md5:cd99ab8c55fffac51ab05c9510b07f67 | 7.5 MB |
| md5:4ad832c518cab008f564c6390321664e | 7.6 MB |
| md5:53d543e9ca5c7702a7aa33600e904320 | 23.7 kB |
| md5:0ddb826e7a50aeb910465e65f5b52bc9 | 59.3 kB |
| md5:1aae30db6a49f2c6200ac4c3130aeff6 | 861.7 kB |
| md5:5efbd975c5b414dfd495d3180c2de77e | 865.1 kB |
| md5:e92650544e3636f091f87f364be36823 | 901.4 kB |
| md5:23d5dd1a8c3549d054d847d52a2f73ca | 883.6 kB |
| md5:dc67cdcc4636e5fdcba023b678a2d9f2 | 915.0 kB |
| md5:7337868a413092cd36fe8a7bc6733a77 | 16.8 kB |
| md5:c2a587c9b52a368cfbda844c49d823c8 | 39 Bytes |
| md5:2f18a7998f4737c467872fc208e4c128 | 13.5 kB |
| md5:a318c0d3e6b92fc978acc87e032b0868 | 2.1 MB |
| md5:d54fd953953dc4018b09dc5d67c8ae41 | 3.3 MB |
| md5:f46495420129e04010288321110b15eb | 422.5 MB |
| md5:e5adb0f4ce37b8f52e7a873a0032ced5 | 422.9 MB |
| md5:1fa7fe585e71b0296dddcbf04cc7151b | 225.2 kB |
| md5:f10fbda3fdea3852b1d26d8adc772ff1 | 994.3 kB |
| md5:29853293a2be8f0fe16ad1c206e7de3e | 45.2 MB |
| md5:cbb2d614511691b994125e64470c73cf | 45.4 MB |
| md5:c9fa60ad17a662b53c1d720b376f814a | 45.6 MB |
| md5:960a96365083bd895baebeb186ff1998 | 45.3 MB |
| md5:712c0ab4f3b32c887082791cbc5b26fa | 45.8 MB |
| md5:5954bba32ebfe519638c5056f9541421 | 380.4 kB |
| md5:eb99cb339087b4b965d6329c9c1d094b | 329 Bytes |
| md5:0f77fbfdef4b154d83372a1841f8aec8 | 299.2 kB |
| md5:231aa43ede0f66ff53827d3bb88a9e74 | 27.5 MB |
| md5:b088ae134ef21258afaae6adf3aed9dd | 46.3 MB |
| md5:7905332d1844fecd61579c72a29a4ac3 | 259.3 MB |
| md5:f34072d152df8b53acc681cb829851b3 | 12.8 kB |
| md5:63e8cf1b862ae04d5b5d2125fa2b2b8d | 99.8 kB |
| md5:f4e1bd167af427882fe7380f4f6929df | 50.0 MB |
| md5:443192562714719285e50a20f2efa135 | 7.9 kB |
| md5:2006f92025c335c0fe67bf7ccf6cc514 | 2.5 kB |
| md5:37fec70840678e28fa6641107442df7f | 10.4 kB |
| md5:d9d2c5d54a083262a03348e005f8eccd | 46.5 MB |
| md5:faa04bd376eb09d9940487a514f66cdf | 46.2 MB |
| md5:6079fd25143ffb37346c226cb648ce0c | 780 Bytes |
| md5:3e20c43f8cfb70bc43d6e1f9bffb8511 | 143.7 kB |
| md5:25e523ba42b7663a9f9949a28f6270d8 | 132.3 kB |
| md5:315d623ebbec105f079e205d25c548bb | 2.7 MB |
| md5:ad75d98bb0783ff7102d6b129734c031 | 3.5 MB |
| md5:9d5c67d65e32788e7afc1bf5fac7b4ed | 298 Bytes |
| md5:454e27dfa275918feea1e222ffd74244 | 2.7 MB |
| md5:a0d05ac8ebe16e250f01b8545a45632a | 12.5 kB |
| md5:a4e3145901544e80ef8214fdb4fd1df1 | 183.0 kB |
| md5:2295c6c2398bf3ce4b1f87e2d2b50189 | 2.3 MB |
| md5:6c9811826b2f42fbca7368336259e03f | 150.7 kB |
| md5:2c581200ad7b3c5e1011265a8299d5e4 | 432.0 kB |
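Each file in the listing carries an MD5 checksum, so a downloaded copy can be checked for corruption before use. The following is a minimal sketch in Python, not part of the project's own scripts; the helper names `md5_of_file` and `matches_listing` are hypothetical and were chosen for illustration only.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that the larger files in the record do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a listed value such as 'md5:54f0...'.

    The 'md5:' prefix used in the listing is optional here.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

To use it, pass the path of a downloaded file together with the checksum string shown beside that file in the listing, for example `matches_listing(local_path, "md5:...")`. A result of `False` suggests an incomplete or corrupted download, or that the checksum belongs to a different file.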