Published December 26, 2025 | Version v1
Dataset Open

kaiju_mycobacterium_pre-compiled

Description

Kaiju database – Mycobacterium pre-compiled subset (2024 release)


This dataset provides a pre-compiled Kaiju database containing protein sequences exclusively from the genus Mycobacterium, extracted from the NCBI NR/RefSeq repositories (August 2024).

The database was built to optimize the taxonomic classification of sequencing reads from Mycobacterium tuberculosis and related species, significantly reducing computational requirements compared to the full Kaiju NR database (~100 GB).

Unlike the standard Kaiju NR database or raw FASTA-based subsets, this release distributes the final Kaiju index files already built, allowing immediate use in analysis pipelines without requiring database construction.

This subset includes representative genomes from Mycobacterium tuberculosis, M. bovis, M. africanum, M. smegmatis, and other clinically or environmentally relevant species within the genus.

Contents:

  • kaiju_db_mycobacterium_2024.fmi — Kaiju formatted database index

  • nodes.dmp, names.dmp — NCBI taxonomy mapping files

Total size: ~1 GB
Kaiju version: compatible with ≥ 1.9.0
Reference source: NCBI NR/RefSeq (retrieved August 2024)

Use case:
Designed for pipelines performing taxonomic classification and contamination screening of Mycobacterium sequencing data, enabling faster execution while maintaining taxonomic resolution at the species level.
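
As a rough illustration of this use case, the sketch below assembles the command lines for classifying reads with `kaiju` and summarising the result at species level with `kaiju2table`. It assumes both programs are installed and on the PATH, and that the three files listed under Contents sit in one directory. The function names, the `kaiju_db` directory and the sample file names are hypothetical placeholders, not part of this dataset.

```python
from pathlib import Path

# File names as listed under "Contents" in this record.
FMI_NAME = "kaiju_db_mycobacterium_2024.fmi"


def build_kaiju_command(db_dir, reads, output, mate=None, threads=4):
    """Return the argument list for a kaiju classification run.

    db_dir is assumed to hold the .fmi index and nodes.dmp.
    """
    db_dir = Path(db_dir)
    cmd = [
        "kaiju",
        "-t", str(db_dir / "nodes.dmp"),  # taxonomy tree
        "-f", str(db_dir / FMI_NAME),     # pre-compiled index
        "-i", str(reads),                 # first (or only) read file
        "-o", str(output),                # per-read classification output
        "-z", str(threads),               # number of threads
    ]
    if mate is not None:
        cmd += ["-j", str(mate)]          # second read file for paired-end data
    return cmd


def build_summary_command(db_dir, kaiju_output, table, rank="species"):
    """Return the argument list for a kaiju2table summary at the given rank."""
    db_dir = Path(db_dir)
    return [
        "kaiju2table",
        "-t", str(db_dir / "nodes.dmp"),
        "-n", str(db_dir / "names.dmp"),
        "-r", rank,
        "-o", str(table),
        str(kaiju_output),
    ]


# Hypothetical paths, shown only to illustrate the call.
classify = build_kaiju_command(
    "kaiju_db", "sample_R1.fastq", "sample.kaiju.out", mate="sample_R2.fastq"
)
summarise = build_summary_command("kaiju_db", "sample.kaiju.out", "sample.species.tsv")
print(" ".join(classify))
print(" ".join(summarise))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute the step. Building the arguments as a list rather than a shell string avoids quoting problems with unusual file names.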

Recommended citation:

Kaiju database – Mycobacterium subset (2024 release). Zenodo. https://doi.org/10.5281/zenodo.17554952

Menzel, P., Ng, K. L., & Krogh, A. (2016). Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nature Communications, 7, 11257. https://doi.org/10.1038/ncomms11257

Files

Files (300.9 MB)

Name Size
md5:6f087a7deed64f176c3fd02eed3ecfb9
300.9 MB
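
The MD5 checksum listed above can be used to confirm that the download is intact before the database is used. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the path to the downloaded file has to be supplied by the user, since the archive name is not shown in this listing.

```python
import hashlib

# Checksum published in the file listing of this record.
EXPECTED_MD5 = "6f087a7deed64f176c3fd02eed3ecfb9"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download` with the path of the downloaded file returns `False` if the file was truncated or corrupted in transit, in which case it should be downloaded again.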

Additional details

Identifiers

Other
kaiju-mycobacterium-precompiled-2024

Related works

Is supplement to
Publication: 10.1038/ncomms11257 (DOI)

Dates

Available
2025-12-26