Bakta database
Creators
- 1. Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen, 35392, Germany; Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen, 35392, Germany; German Centre for Infection Research (DZIF), partner site Giessen-Marburg-Langen, Giessen, Germany
Description
This data repository contains the mandatory DB for Bakta.
It is available in two versions: the default (db.tar.gz) and a lightweight alternative (db-light.tar.gz).
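Either archive has to be unpacked locally before use. The following is a minimal sketch of doing that with Python's standard library, assuming the archive has already been downloaded. The function name `extract_db` and the destination directory are hypothetical, and nothing is assumed about the archive's internal layout.

```python
import tarfile
from pathlib import Path


def extract_db(archive, dest):
    """Unpack a gzip-compressed tar archive into dest; return the extracted files."""
    dest = Path(dest).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            # Refuse entries that would land outside the destination directory.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return sorted(p for p in dest.rglob("*") if p.is_file())


# Hypothetical usage, with the archive in the current directory:
# files = extract_db("db-light.tar.gz", "bakta-db")
```

The path check runs over all members before anything is written, so a malformed archive is rejected without leaving a partial extraction behind.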
Bakta is a tool for the rapid & standardized local annotation of bacterial genomes & plasmids. It provides dbxref-rich and sORF-including annotations in machine-readable JSON & bioinformatics standard file formats for automatic downstream analysis: https://github.com/oschwengers/bakta
This database provides protein sequence hash digests and lengths of UniProt's UniRef100 clusters, UniParc and NCBI RefSeq sequences for ultra-fast identification & lookups. It has been pre-annotated with several specialized databases and enriched with dbxrefs. Furthermore, seed sequences of UniProt's UniRef90 clusters are stored for fallback homology searches via Diamond sequence alignments. All conducted pre-annotations are logged and provided in the db.log.gz file.
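The digest-plus-length lookup can be illustrated with a toy in-memory index. This is a sketch of the general idea only, not Bakta's implementation. MD5 as the hash function, the normalization step, the function names and the toy sequences are all assumptions made for illustration.

```python
import hashlib


def normalize(seq):
    # Strip whitespace and a trailing stop symbol, and upper-case the residues.
    return seq.strip().rstrip("*").upper()


def seq_digest(seq):
    # MD5 is an assumption here; any fixed hash function illustrates the idea.
    return hashlib.md5(normalize(seq).encode("ascii")).hexdigest()


def build_index(records):
    # Map digest -> (sequence length, identifier) for every known sequence.
    return {seq_digest(s): (len(normalize(s)), ident) for ident, s in records}


def lookup(index, query):
    # A hit requires both the digest and the stored length to match.
    hit = index.get(seq_digest(query))
    if hit is not None and hit[0] == len(normalize(query)):
        return hit[1]
    return None  # no exact match; a homology search would be the fallback


# Toy records with made-up identifiers and sequences.
index = build_index([("toy-cluster-1", "MKTAYIAKQR"), ("toy-cluster-2", "MSLNEQ")])
print(lookup(index, "mktayiakqr*"))  # toy-cluster-1
print(lookup(index, "MKTAYIAKQA"))   # None
```

Storing the length next to the digest gives a cheap second check on every hit, and an exact-match lookup of this kind costs a single dictionary access, which is why alignment is only needed for sequences that miss.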
External DB versions:
- NCBI AMRFinderPlus: 2023-11-15.1
- COG: 2020
- DoriC: 12
- ISFinder: 2019-09-25
- Mob-suite: 3
- Pfam: 36
- RefSeq: r220
- Rfam: 14.10
- UniProtKB/Swiss-Prot: 2023_05
- VFDB: 2024-01-15
Files
db-versions.json
Additional details
Related works
- Is cited by
- Journal article: 10.1099/mgen.0.000685 (DOI)
- Is required by
- Software: https://github.com/oschwengers/bakta (URL)