Published November 15, 2025 | Version v4
Dataset Open

Datasets and models for Metagenomics-AI

Authors/Creators

  • 1. BGI Research
  • 2. BGI Research, Shenzhen

Description

Datasets and models for the Metagenome-AI framework and the related manuscript.

Four datasets in total: global AMP prediction, Gram-positive bacteria, Gram-negative bacteria, and toxicity.

Three base models: ESM2-3B, ESM3, and ProteinTrans.

The file 11379_corepeptides_scored_cAMP20250915-forRevision.tsv lists the proteins collected from extreme environments.
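
The record does not document the column layout of this TSV file, so a first look at it is best done without assuming any column names. The following is a minimal sketch using only the Python standard library. The helper name `preview_tsv` is hypothetical, and the assumption that the first line is a header row is a guess rather than something stated in the record.

```python
import csv
from itertools import islice

# File name taken from the record description. Its columns are not documented
# in the record, so nothing below relies on specific column names.
TSV_PATH = "11379_corepeptides_scored_cAMP20250915-forRevision.tsv"


def preview_tsv(path, n_rows=5):
    """Return (header, first n_rows data rows) of a tab-separated file.

    Assumption: the first line is a header row. If it is not, the returned
    "header" is simply the first data row.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # read at most n_rows further rows
    return header, rows


# Usage, once the file has been downloaded into the working directory:
#   header, rows = preview_tsv(TSV_PATH)
#   print(header)
#   for row in rows:
#       print(row)
```

Reading only the first few rows keeps the preview quick and shows the real column names before any downstream parsing is written.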

Files

Files (17.6 MB)

MD5 checksum                          Size
md5:c4dad050786d514b5a67cf012eaca81a  5.1 MB
md5:c2c7dd0bb4151f1522b4d5c7c82e1317  419.8 kB
md5:41dc9318adb40aefbb6f46b26e71f5c9  3.3 MB
md5:d3bf2f1f4e878715f9bb2d74d5ce2ba3  416.3 kB
md5:a3676399e2c0fe25517a24fd68789def  359.7 kB
md5:2d4a5d21a4901f6c29b9760bbb5f3e72  2.9 MB
md5:7facd7ac6fc17efcd16479c1295730d2  361.4 kB
md5:2ea53973db392d93ab78fc08e1b62530  360.5 kB
md5:d2788d65b86e98f22b4ca25e7969021b  2.9 MB
md5:6408514ca8e872b4064aa5d4f2d8d437  364.8 kB
md5:dc99ee45e47f03fd902a1db4d8e1a4df  40.6 kB
md5:acd6fb1431125bb5eb36fdf0e4f5027b  40.7 kB
md5:99eaac62f9b009070ad276eb7994c4cf  40.7 kB
md5:23737becdf5c89043de6e600bf4e64f3  40.6 kB
md5:5817db41276b358ab0de24eb8430a2ce  64.8 kB
md5:84140ecb62039c96cff93aa7584768c8  64.8 kB
md5:ac8cf87c27535c209e716c8f98344ddb  64.8 kB
md5:2b7f281a122bd58f31a7e194b7428a32  64.8 kB
md5:f89d184b82216a16634433fe258d24ea  26.9 kB
md5:de624efe8f39080d50039996fe6613e4  26.9 kB
md5:cac0a3a5d2e21b9da628d0326fda12e1  26.9 kB
md5:cc090bda014bca5ffc6d94eb8c75b080  26.9 kB
md5:d5ea6b0133368fe4b53a7ba33ba010b7  64.8 kB
md5:4a587c236d4d8c14d2e568ed660595c5  515.6 kB
md5:aa3f5126d1409fcdd9c303b5bc629258  64.7 kB
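
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and because the listing gives no file names, matching each checksum to its file is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed checksum.

    `listed` may be written as 'md5:<hex>' (the form used in the listing)
    or as the bare hex digest.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage, with a placeholder path for whichever file was downloaded:
#   ok = matches_listed_md5("path/to/downloaded_file",
#                           "md5:c4dad050786d514b5a67cf012eaca81a")
```

MD5 is used here only because it is the digest the record publishes. It detects accidental corruption during download but is not suitable as a security guarantee.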

Additional details

Dates

Available
2025-10-29

Software

Repository URL
https://github.com/BGIResearch/Metagenome-AI
Programming language
Python

References

  • The Extreme Environment Microbiome Catalog (EEMC): A Global Resource for Microbial Diversity and Antimicrobial Discovery, Puzi Jiang et al.