Published April 12, 2026 | Version v1
Dataset Open

Metagenomic assemblies from real environmental samples for binning benchmarking

  • 1. Max Planck Institute for Multidisciplinary Sciences

Description

This dataset consists of metagenomic assemblies from three environments: black soil, human gut (Honduras), and neonatal gut. It includes contigs assembled through coassembly and sample-wise assembly:

  • *_pooledcontigs.fa.tar.gz: coassemblies generated by pooling reads from all samples within each dataset.
  • *_samplewise_contigs.fa.tar.gz: assemblies generated individually for each sample and then concatenated into a single file. Each FASTA header is formatted as <sampleid>C<contigid>.
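
The <sampleid>C<contigid> header convention can be used to recover the sample of origin for each contig in a sample-wise file. The following is a minimal sketch, not part of the dataset or of MAGmax. It assumes that the contig ID itself contains no uppercase "C", so the header is split at the last "C"; the function names are hypothetical.

```python
from collections import Counter


def split_header(header: str) -> tuple[str, str]:
    """Split a '<sampleid>C<contigid>' header into (sample_id, contig_id).

    Assumption: the contig ID contains no uppercase 'C', so the
    separator is taken to be the last 'C' in the sequence name.
    """
    tokens = header.lstrip(">").split()
    if not tokens:
        raise ValueError("empty FASTA header")
    name = tokens[0]  # ignore any description after whitespace
    sample_id, sep, contig_id = name.rpartition("C")
    if not sep or not sample_id or not contig_id:
        raise ValueError(f"header does not match <sampleid>C<contigid>: {name!r}")
    return sample_id, contig_id


def contigs_per_sample(fasta_lines) -> Counter:
    """Count contigs per sample from an iterable of FASTA lines."""
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            sample_id, _ = split_header(line)
            counts[sample_id] += 1
    return counts
```

If the sample IDs in a given file have a known fixed pattern, a stricter regular expression would be a safer choice than splitting at the last "C".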

All files are in FASTA format and are distributed as compressed .tar.gz archives.
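
Because the archives are large, it can be convenient to read contigs directly from a .tar.gz file without unpacking it to disk first. The sketch below uses only the Python standard library. It assumes that every regular file inside an archive is a plain-text FASTA file; the function name is hypothetical.

```python
import io
import tarfile


def iter_fasta_records(archive_path):
    """Yield (header, sequence) pairs from each regular file in a .tar.gz archive.

    Assumption: every regular member of the archive is an uncompressed
    FASTA file.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and links
            raw = archive.extractfile(member)
            header, chunks = None, []
            for line in io.TextIOWrapper(raw, encoding="ascii"):
                line = line.strip()
                if line.startswith(">"):
                    # A new record starts: emit the previous one, if any.
                    if header is not None:
                        yield header, "".join(chunks)
                    header, chunks = line[1:], []
                elif line and header is not None:
                    chunks.append(line)
            if header is not None:
                yield header, "".join(chunks)
```

Records are yielded one at a time, so memory use stays close to the size of the longest contig rather than the size of the whole assembly.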

Files

Files (6.3 GB)

Checksum                                Size
md5:dc84d5c2dd3434366d10122d49dbcef9    523.8 MB
md5:97ed4bdb429f1bd07b7701d17cbe148b    2.3 GB
md5:95531d830936c33ca7717b16d3a912cf    608.6 MB
md5:4bc5e103f13bab4e3e488e00d1c5dcb4    2.1 GB
md5:ad042f0611bd9c6249891d24c8e277f7    247.9 MB
md5:29e83d38438d1c5bd69d33961781a157    446.4 MB
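
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's hashlib. The function names are hypothetical, and the local path passed in is whatever name the downloaded archive was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.removeprefix("md5:").strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use constant, which matters for the archives in the gigabyte range.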

Additional details

Related works

Is published in
Publication: 10.1093/bioinformatics/btaf538 (DOI)
Publication: 10.1093/bib/bbaf617 (DOI)

Funding

European Commission
Metagenome binning - Accurate reconstruction of microbial genomes from the environment (grant 101111457)

Software

Repository URL
https://github.com/soedinglab/MAGmax
Programming language
Rust
Development Status
Active