Published June 18, 2026 | Version 1.1

CheckAMG database

Authors/Creators

  • 1. University of Wisconsin-Madison

Description

Databases required to run CheckAMG. The annotation module ("checkamg annotate") and the de-novo module ("checkamg denovo") use separate databases, distributed as two independent archives, so the larger de-novo database is only downloaded when needed.

  • CheckAMG_annotate_db_v1.1_20260316.tar.gz: profile HMM databases and their associated score cutoff files required by "checkamg annotate". Contains:
    • *.hmm, *.h3m, *.h3i, *.h3f, and *.h3p files for each of KEGG, FOAM, Pfam-A, PHROGs, dbCAN (dbCAN_HMMdb_v14), METABOLIC_custom, and CAMPER
    • KEGG_cutoffs.tsv: score cutoffs for KEGG KOfam HMMs
    • FOAM_cutoffs.tsv: score cutoffs for FOAM HMMs
    • CAMPER_cutoffs.tsv: score cutoffs for CAMPER HMMs
    • METABOLIC_cutoffs.tsv: score cutoffs for METABOLIC_custom HMMs
    • README.txt: dependent database versions and citations
  • CheckAMG_denovo_db_v1_20260605.tar.gz: trained Protein Set Transformer (PST) model and precomputed reference data required by "checkamg denovo". Contains:
    • checkAMG-PST_TL-P__large_4.20260605.ckpt: trained CheckAMG-PST model checkpoint
    • checkAMG-PST_TL-P__large_4.20260605.PST-EMBED.h5: reference protein PST embeddings (with labels)
    • checkAMG-PST_TL-P__large_4.20260605.PST-EMBED.index.faiss: FAISS index over the reference embeddings for nearest-neighbor search
    • checkAMG-PST_TL-P__large_4.20260605.PST-EMBED.labels.h5: reference protein labels
    • README.txt: model/database version and citation
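After extracting the annotate archive, a quick completeness check can catch a truncated download before "checkamg annotate" is run. The following is a minimal sketch, not part of CheckAMG: the function name find_missing_annotate_files is hypothetical, and the directory layout inside the archive is an assumption (files are searched recursively, and index files are accepted whether they are named "X.h3m" or "X.hmm.h3m"). Only the four cutoff file names and the five file extensions come from the listing above.

```python
from pathlib import Path

# Index file extensions listed alongside each *.hmm file in the archive contents.
PRESSED_SUFFIXES = (".h3m", ".h3i", ".h3f", ".h3p")

# Cutoff tables named in the archive contents.
CUTOFF_FILES = (
    "KEGG_cutoffs.tsv",
    "FOAM_cutoffs.tsv",
    "CAMPER_cutoffs.tsv",
    "METABOLIC_cutoffs.tsv",
)


def find_missing_annotate_files(db_dir):
    """Return a sorted list of expected files that are absent under db_dir.

    Hypothetical helper: the real archive layout is not documented here,
    so the search is recursive and tolerant of two index naming schemes.
    """
    db_dir = Path(db_dir)
    missing = []

    # Each profile database (*.hmm) should have four index files next to it.
    for hmm in sorted(db_dir.rglob("*.hmm")):
        for suffix in PRESSED_SUFFIXES:
            # Assumption: accept "X.h3m" as well as "X.hmm.h3m".
            candidates = (
                hmm.with_suffix(suffix),
                hmm.with_name(hmm.name + suffix),
            )
            if not any(c.is_file() for c in candidates):
                missing.append(
                    hmm.with_suffix(suffix).relative_to(db_dir).as_posix()
                )

    # The four score cutoff tables may sit anywhere under db_dir.
    for name in CUTOFF_FILES:
        if not any(db_dir.rglob(name)):
            missing.append(name)

    return sorted(missing)


# Example call (the path is a placeholder for the extraction directory):
# print(find_missing_annotate_files("CheckAMG_annotate_db"))
```

An empty list means every profile database found has its index files and all four cutoff tables are present. The check does not confirm that all seven databases are there, because their exact file names are not given in the listing.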

Files (92.8 GB)

Checksum                                Size
md5:1090661e026819a98210a69bc8d34490    13.6 GB
md5:94dc128542300c72a8917c633effe0f7    79.2 GB
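Because both archives are large, it is worth comparing a downloaded file against its listed md5 value before extracting it. The sketch below uses only Python's standard hashlib module and reads the file in chunks so that a multi-gigabyte archive is never loaded into memory at once. The function names and the example path are illustrative assumptions, not part of CheckAMG.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call (placeholder path; use the checksum shown for that file):
# ok = matches_listed_md5("path/to/archive.tar.gz", "md5:<hex from listing>")
```

A result of False usually indicates an incomplete or corrupted download, in which case the archive should be fetched again rather than extracted.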

Additional details

Dates

Updated: 2026-06-20

Software

Repository URL: https://github.com/AnantharamanLab/CheckAMG
Programming language: Python
Development Status: Active