Published March 24, 2025 | Version v2

BacDive AI models: Predicting bacterial phenotypes from genome annotations

  • 1. Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures
  • 2. Leibniz Institut - Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH

Description

This is a small command-line tool, bundled with selected BacDive-AI models, that predicts bacterial phenotypes from InterPro (Pfam) annotations. See our preprint (DOI: 10.1101/2024.08.12.607695) for more information.
 
This package comes with an example InterPro annotation of the strain Actinomyces dentalis DSM 19115. The annotated file, in TSV format, is named 1120941.3.faa.tsv. The genome was obtained from BV-BRC (formerly PATRIC).
 
For reference, the command used to generate the Pfam annotations with InterProScan was:
interproscan.sh -i 1120941.3.faa -f tsv -d ./ -appl Pfam
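To inspect what such a run produced before passing the file on, the Pfam accessions can be tallied from the TSV. The following is a minimal sketch and not part of the bundled tool. It assumes the standard InterProScan TSV column order, in which the fourth column is the analysis name and the fifth the signature accession. The function name `pfam_accessions` is hypothetical.

```python
import csv
from collections import Counter


def pfam_accessions(tsv_path):
    """Count Pfam signature accessions in an InterProScan TSV file.

    Assumption: standard InterProScan TSV layout, where column 4 holds
    the analysis name (e.g. "Pfam") and column 5 the signature accession.
    """
    counts = Counter()
    with open(tsv_path, newline="") as handle:
        # QUOTE_NONE keeps quote characters in descriptions from confusing the parser.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank or truncated lines.
            if len(row) < 5:
                continue
            if row[3] == "Pfam":
                counts[row[4]] += 1
    return counts
```

Calling `pfam_accessions("1120941.3.faa.tsv")` on the bundled example file would return the number of hits per Pfam accession for that genome.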
To predict a single trait, run the following example command in the project folder:
python predict.py gram-positive 1120941.3.faa.tsv
In the call above, the first argument can be one of the following values:
Value           Trait
acidophile      Acidophilic
gram-positive   Gram-positive
spore-forming   Spore-forming
aerobic         Aerobic
anaerobic       Anaerobic
thermophile     Thermophilic
psychrophile    Psychrophilic
motile2+        Flagellated motility
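
When the tool is driven from another script, it can help to validate the trait value before launching the process. The sketch below is one possible wrapper, not part of the package. It assumes `predict.py` is in the current working directory and takes the two positional arguments shown above. The names `TRAITS`, `build_command` and `predict` are hypothetical.

```python
import subprocess
import sys

# Trait values and their display names, copied from the list above.
TRAITS = {
    "acidophile": "Acidophilic",
    "gram-positive": "Gram-positive",
    "spore-forming": "Spore-forming",
    "aerobic": "Aerobic",
    "anaerobic": "Anaerobic",
    "thermophile": "Thermophilic",
    "psychrophile": "Psychrophilic",
    "motile2+": "Flagellated motility",
}


def build_command(trait, tsv_path, script="predict.py"):
    """Return the argument list for one predict.py call."""
    # "all" is the tool's option for predicting every trait at once.
    if trait != "all" and trait not in TRAITS:
        raise ValueError(f"unknown trait value: {trait!r}")
    # sys.executable reuses the interpreter running this script,
    # in place of the plain `python` used on the command line.
    return [sys.executable, script, trait, tsv_path]


def predict(trait, tsv_path):
    """Run predict.py (assumed to be in the working directory); return its stdout."""
    result = subprocess.run(
        build_command(trait, tsv_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For example, `predict("gram-positive", "1120941.3.faa.tsv")` would run the same call as the command line above and return whatever the tool prints.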
 
You can also choose `all` to predict every trait at once. For each trait, you get the prediction (True or False) together with a confidence score. The output should look like this:
 
$ python predict.py all 1120941.3.faa.tsv

Acidophilic: False (98.5%)
Gram-positive: True (98.93%)
Spore-forming: False (96.47%)
Aerobic: False (93.91%)
Anaerobic: True (63.54%)
Thermophilic: False (99.77%)
Psychrophilic: False (99.96%)
Flagellated motility: False (99.85%)
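
If the predictions feed into further analysis, output in this format can be turned into a dictionary. The following is a minimal sketch that assumes every result line follows the `Trait: True|False (NN.NN%)` pattern shown above. The function name `parse_predictions` is hypothetical and not part of the tool.

```python
import re

# Matches lines such as "Gram-positive: True (98.93%)".
LINE_RE = re.compile(
    r"^(?P<trait>.+?):\s+(?P<value>True|False)\s+\((?P<conf>[\d.]+)%\)$"
)


def parse_predictions(text):
    """Map each trait name to a (prediction, confidence in percent) tuple."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        # Lines that do not fit the pattern (prompt, blank lines) are ignored.
        if match:
            results[match["trait"]] = (
                match["value"] == "True",
                float(match["conf"]),
            )
    return results
```

A downstream script could then, for instance, flag calls with lower confidence, such as the Anaerobic prediction at 63.54% in the example output.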

Files

v2.zip (168.2 MB)
md5:38ac40c311ef1494e0a36f68c302dbb5

Additional details

Related works

Is published in
Preprint: 10.1101/2024.08.12.607695 (DOI)

Funding

Leibniz Association
DiASPora – Digital Approaches for the Synthesis of Poorly Accessible Biodiversity Information K280/2019

Dates

Updated
2025-03-24
Revision Date

Software

Repository URL
https://github.com/JKoblitz/bacdive-AI
Programming language
Python
Development Status
Active