Published November 14, 2022 | Version v1
Dataset Open

Lite Kraken/Bracken databases built using UHGG genomes

  • 1. Department of Medicine, University of Cambridge

Description

Kraken/Bracken databases for UHGG genomes.

HUMAN_3006.tar.gz: Kraken/Bracken database for 3006 high-quality species clusters of the UHGG (Beresford-Jones et al., 2022). The database was built from the single highest-quality genome for each species cluster (n=3006). It uses the original GTDB v1.3 taxonomy.

UHGG_5987_KRAKEN.tar.gz: Kraken/Bracken database for 3006 high-quality species clusters of the UHGG (Beresford-Jones et al., 2022). Each species cluster is represented by a variable number of high-quality genomes (n=5987 in total), selected to maximise the taxonomic diversity represented. It uses a custom taxonomy modified from GTDB v2.1, in which each species cluster has its own species-level taxonomic annotation.

Methods:

The databases were built using Kraken v2.1.2 and Bracken v2.6.2. The commands used to build them are included below.

kraken2-build --build --db Kraken --threads 12

bracken-build -d Kraken -k 35 -l 150 -t 12
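As a rough illustration of how a database built with the commands above might be used downstream, the sketch below classifies one paired-end sample with Kraken 2 and then estimates species-level abundances with Bracken. It is not part of the deposited methods. The database directory name `Kraken` is taken from the build commands and may differ from the directory produced by extracting the archives. The 150 bp paired-end input, the sample and file names, and the helper functions `run` and `classify_sample` are all assumptions made for illustration. The read length passed to `bracken -r` is set to 150 to match the `-l 150` used with `bracken-build`.

```shell
#!/usr/bin/env bash
set -eu

# Assumed values: adjust to your own paths and resources.
DB_DIR="${DB_DIR:-Kraken}"   # directory holding the extracted database (assumed name)
THREADS="${THREADS:-12}"
READ_LEN=150                 # matches the -l 150 given to bracken-build

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: classify_sample SAMPLE R1 R2
# Runs Kraken 2 on a paired-end sample, then Bracken at species level (-l S).
classify_sample() {
  local sample="$1" r1="$2" r2="$3"
  run kraken2 --db "$DB_DIR" --threads "$THREADS" --paired \
      --report "${sample}.kreport" --output "${sample}.kraken" "$r1" "$r2"
  run bracken -d "$DB_DIR" -i "${sample}.kreport" -o "${sample}.bracken" \
      -r "$READ_LEN" -l S
}
```

Calling `DRY_RUN=1 classify_sample sample1 sample1_R1.fastq.gz sample1_R2.fastq.gz` prints the two commands without running them, which is a convenient way to check paths before starting a long job.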

Files (25.1 GB)

md5:53ef2f8d4044976076f1ed2e0e557161 (12.4 GB)
md5:913a91b88aac173dda12e22211740197 (12.7 GB)
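Because each archive is over 12 GB, it may be worth checking a download against its MD5 digest before extracting it. The sketch below assumes GNU `md5sum` and `awk` are available. The function `verify_md5` is a hypothetical helper, and the listing above does not state which digest belongs to which archive, so the expected value must be taken from the record page entry for the file being checked.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: verify_md5 FILE EXPECTED
# Succeeds only if the MD5 digest of FILE equals EXPECTED (hex, no "md5:" prefix).
verify_md5() {
  file="$1"
  expected="$2"
  actual="$(md5sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (expected $expected, got $actual)" >&2
    return 1
  fi
}
```

For example, `verify_md5 HUMAN_3006.tar.gz <digest>` followed by `tar -xzf HUMAN_3006.tar.gz` would extract the archive only if the check passes, since `set -e` stops the script on a mismatch.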