Published July 10, 2025 | Version v1
Dataset Open

Predicted Biosynthetic Gene Clusters and Peptides from Fermented Food Microbial Genomes

Description

This repository contains results and raw files from running the MicrocosmFoods/bac-mining workflow on ~11,500 bacterial genomes assembled from diverse fermented foods. Specifically, biosynthetic gene clusters (BGCs) were predicted using antiSMASH, and two peptide types, small ORFs (smORFs) and cleavage peptides, were predicted on this set of genomes. This repository contains the following files:

  • all_molecule_counts.tsv - The main summary file, giving for each genome the count of each molecule type, such as specific BGC types, smORFs, and cleavage peptides. The corresponding metadata for these genomes can be found in the Fermented Foods Microbial Genomes Database Zenodo repository.
  • all_smorfinder_results.tsv - All combined results output from SmORFinder for predicting smORFs
  • all_deeppeptide_results.tsv - All combined results from DeepPeptide for predicting cleavage peptides
  • all-MAG-combined-batch-peptides.fasta - Protein sequences for all predicted peptides, including smORFs, cleavage peptides (excluding predicted propeptide sequences), and core RiPP sequences predicted by antiSMASH, where found
  • 2025-06-08-mag-antismash-predictions.tar.gz - All antiSMASH predictions for each of the ~11,500 bacterial genomes. The decompressed archive is split by genome; each genome's subdirectory contains:
    • genome_name.json - The json summary file of all identified BGCs
    • genome_name.log - The logfile from the antiSMASH run
    • genome_name*.gbk - Each biosynthetic gene cluster identified in GBK format
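
The per-genome layout described above can be indexed with a short script. The following is a minimal sketch, assuming the archive has already been extracted to a local directory and that every genome subdirectory follows the layout listed above (one JSON summary, one log file, and GBK files). The function names `index_antismash_results` and `count_gbk_per_genome` are hypothetical helpers introduced for illustration and are not part of the bac-mining workflow.

```python
from pathlib import Path


def index_antismash_results(root):
    """Map each genome subdirectory to its JSON summary, log file, and GBK files.

    Assumes `root` is the directory produced by extracting the archive and
    that each immediate subdirectory corresponds to one genome.
    """
    index = {}
    for genome_dir in sorted(Path(root).iterdir()):
        if not genome_dir.is_dir():
            continue  # ignore stray files at the top level
        json_files = sorted(genome_dir.glob("*.json"))
        log_files = sorted(genome_dir.glob("*.log"))
        index[genome_dir.name] = {
            "json": json_files[0] if json_files else None,
            "log": log_files[0] if log_files else None,
            "gbk": sorted(genome_dir.glob("*.gbk")),
        }
    return index


def count_gbk_per_genome(index):
    """Return the number of GBK files found for each genome."""
    return {genome: len(files["gbk"]) for genome, files in index.items()}
```

Going by the file description above, the GBK count per genome should correspond to the number of clusters identified for that genome, although this is worth checking against all_molecule_counts.tsv before relying on it.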

Files

Files (15.1 GB)

  • md5:62b5ea88790eea808ae1fa3f7c3beab6 - 14.9 GB
  • md5:9b28f8a74fdb138634d8a93aacf089de - 69.8 MB
  • md5:d63fcbdaa1eaf4ab1442b9a51dd4ddf9 - 71.6 MB
  • md5:67a34000a352990b83d7eed7f88d035a - 890.2 kB
  • md5:2b39ea93025b0584cf30ad901b9fc4cf - 54.8 MB
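
The md5 checksums listed above can be used to confirm that a download completed intact, which is especially useful for the 14.9 GB file. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `matches_listed_md5` are hypothetical helpers introduced for illustration, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the md5 hex digest of a file, reading it in chunks so that
    large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's digest with a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(local_path, "md5:62b5ea88790eea808ae1fa3f7c3beab6")` should return `True` when `local_path` points to an uncorrupted copy of the 14.9 GB file.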