Published June 14, 2024 | Version v2
Dataset Open

NBC++ Database

Creators

Description

 
The NBC++ Metagenome Database is a collection of metagenomic data sampled from the RefSeq database using Woltka. The database includes three distinct profiles:
  1. Basic Profile: Comprises approximately one genome per genus, for a total of 4,634 genomes as of July 24, 2023.
  2. Standard Profile: Encompasses all NCBI-defined reference and representative genomes, for a total of 18,237 genomes as of July 26, 2023.
  3. Extended Profile: Features one genome per species with a Latinate name and higher ranks, for a total of 319,554 genomes as of July 26, 2023.
The assembly summary information for the genomes in the database is in:
  • database_assembly_summaries.zip

Files

basic_9mer_canonical.zip

Files (32.3 GB)

Checksum (MD5)                        Size
md5:b33e9da226fd6eee1ee40ccbd01b2c1e  1.2 GB
md5:0f606b31a9989f6fea21f1e5ea34b041  25.4 MB
md5:98d171888c14b6994e717a06ad6ab287  22.7 GB
md5:57dde0c6f8e2f0422d3670a763f2f810  8.3 GB
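
Because the archives are large, it may be worth checking a download against its listed MD5 checksum before unpacking it. The following is a minimal sketch in Python using only the standard library. The helper names `md5_of_file` and `matches_listed_md5` are illustrative and not part of the dataset or of NBC++. The listing above does not say which checksum belongs to which file, so the caller is assumed to take the correct value from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("downloads/basic_9mer_canonical.zip", "md5:<hex from the record page>")` returns `True` only when the local copy is intact. The local path here is a placeholder.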