Published June 13, 2024 | Version v1
Data paper Open

NBC++: Results of Simulated and Real Queries

Description

Basic database folder:

A. Simulated (the classification results at all taxonomic levels from simulated reads against 20% of the taxa in the Basic database), with one directory per counting method:

1. With Canonical counting method (all)

2. Non-canonical counting method (non-can)

B. Real Human Gut Sample: SRS105153 (human) -- all taxonomic samples plus basic_human for the raw NBC++ results file.

   

Standard database folder:

A. Simulated (the classification results at all taxonomic levels from simulated reads against 20% of the taxa in the Standard database)

1. With Canonical counting method (all)

a. One directory per k-mer size (directory K)

b. confusion_matrix/average: The confusion matrix for commonly found human taxa. 

2. Bacteria/Archaea only out of database, Canonical counting method (ba_only)

a. One directory per k-mer size (directory K)

3. Non-canonical counting method (non-can)   

a. One directory per k-mer size (directory K)

B. Real Human Gut Sample: SRS105153 (human) -- all taxonomic samples plus human_sample for the raw NBC++ results file.

 

Extended database folder:

A. Simulated (the classification results at all taxonomic levels from simulated reads against 20% of the taxa in the Extended database)

1. With Canonical counting method (all)

2. Non-canonical counting method (non-can)

B. Real Human Gut Sample: SRS105153 (human)   -- all taxonomic samples plus ext_human for the raw NBC++ results file. 
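The folders above separate results by canonical and non-canonical k-mer counting. The sketch below illustrates the usual meaning of that distinction: in canonical counting, a k-mer and its reverse complement are counted as one key. It is an illustration only, not the NBC++ implementation; the function names and the choice of the lexicographically smaller form as the representative are assumptions.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an uppercase DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(sequence, k, canonical=True):
    """Count k-mers in a sequence (hypothetical helper, not part of NBC++).

    With canonical=True, each k-mer is merged with its reverse complement
    by keeping the lexicographically smaller of the two (an assumed
    convention). With canonical=False, k-mers are counted exactly as read.
    """
    counts = Counter()
    sequence = sequence.upper()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) - set("ACGT"):
            continue
        if canonical:
            kmer = min(kmer, reverse_complement(kmer))
        counts[kmer] += 1
    return counts
```

For the read ACGTT with k = 2, non-canonical counting yields four distinct keys (AC, CG, GT, TT), while canonical counting merges GT into AC and maps TT to AA.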

Files

Simulated_and_real_data_results.zip (3.8 GB)
md5:c730cc7f6519ae732b5dda8f9d69d0e1
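The MD5 digest listed for the archive can be used to confirm that a 3.8 GB download completed without corruption. The following is a minimal sketch using Python's standard hashlib module; the function names and the local file path are assumptions, and the file is read in chunks so it does not need to fit in memory.

```python
import hashlib

# Digest listed on this record for Simulated_and_real_data_results.zip.
EXPECTED_MD5 = "c730cc7f6519ae732b5dda8f9d69d0e1"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

Calling archive_is_intact("Simulated_and_real_data_results.zip") on the downloaded file (path assumed to be the current directory) should return True if the download matches the published digest.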

Additional details

Funding

U.S. National Science Foundation
Keeping up with the genomes - Continual Learning of Metagenomic Data (award 1936791)

Dates

Issued
2024-06-13