Published August 24, 2021 | Version v1
Journal article Open

Additional data for preprint: Recovery of high quality metagenome-assembled genomes from full-scale activated sludge microbial communities in a tropical climate using longitudinal metagenome sampling

  • 1. Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, 117456
  • 2. Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551

Description

Microbial communities underpinning the operation of wastewater treatment plants play a key role in mediating the interactions between human and natural ecosystems. They remain important but understudied, and their high eco-biological complexity makes them particularly challenging targets for metagenome-assembled genome (MAG) analysis. We consider strategies for recovering MAG sequences from time-series metagenome surveys of full-scale activated sludge microbial communities and generate MAG catalogues using multiple individual sample assemblies, two variations on multi-sample co-assembly, and a recently published MAG recovery workflow using deep learning (VAMB). We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. Here, we provide additional data not included in the NCBI submission (BioProject Accession PRJNA731554):

1. all_25_assemblies.tar.gz: contains 24 individual short-read assemblies (FASTA), one from each of the 24 samples, and one co-assembly of the combined 24 samples.

2. coassembly_multi_bam.tar.gz: contains all 1,712 MAG sequences (FASTA) obtained from the co-assembly and MetaBAT2 binning workflow, using coverage profiles generated across all 24 samples.

3. coassembly_single_bam.tar.gz: contains 1,997 MAG sequences (FASTA) obtained from the co-assembly and MetaBAT2 binning workflow, using the entire read set treated as a single meta-sample.

4. individual_assemblies.tar.gz: contains 3,429 MAG sequences (FASTA) obtained from the individual assemblies and MetaBAT2 binning workflow.

5. vamb.tar.gz: contains 1,941 MAG sequences (FASTA) obtained from the individual assemblies and VAMB binning workflow.
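The MAG counts stated above can be checked against a downloaded archive with a short script. The following is a minimal sketch, not part of the original record: it assumes each MAG is stored as one FASTA file per genome and that file names end in `.fa`, `.fasta`, or `.fna` (the record does not state the internal layout or naming of the archives). The function names are hypothetical.

```python
import tarfile

# MAG counts per archive, as stated in the file list above.
EXPECTED_MAGS = {
    "coassembly_multi_bam.tar.gz": 1712,
    "coassembly_single_bam.tar.gz": 1997,
    "individual_assemblies.tar.gz": 3429,
    "vamb.tar.gz": 1941,
}

# Assumed FASTA suffixes; the naming used inside the archives is not documented here.
FASTA_SUFFIXES = (".fa", ".fasta", ".fna")


def count_fasta_members(archive_path):
    """Count regular files in a .tar.gz archive whose names end in a FASTA suffix."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sum(
            1
            for member in tar
            if member.isfile() and member.name.lower().endswith(FASTA_SUFFIXES)
        )


def total_expected_mags():
    """Sum the stated per-archive MAG counts."""
    return sum(EXPECTED_MAGS.values())
```

Summing the four stated counts gives 9,079, consistent with the "just under 9,100 draft genomes" quoted in the description. Comparing `count_fasta_members(path)` with the matching entry in `EXPECTED_MAGS` is only meaningful if the one-file-per-MAG assumption holds for that archive.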

Files

Files (26.6 GB)

Size        Checksum
19.8 GB     md5:8f519c9236fa12eedcfd00626a6f83ae
1.7 GB      md5:52f40ed2bf1753e57694b88dfe78940c
1.9 GB      md5:0260c88a6466ba0de4ea4e3de803a286
2.5 GB      md5:1352c2f7840a2ff183099d84fcee2b50
885.7 MB    md5:224eb35b560a29ee9094cc34632a5818
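Because the archives are large, it may be worth verifying each download against the MD5 checksums listed above. The sketch below is one possible way to do this with the Python standard library. Since the listing here does not pair file names with checksums, it only tests whether a file's digest appears among the listed values; the function names are hypothetical.

```python
import hashlib

# MD5 digests as listed for the files of this record.
LISTED_MD5 = {
    "8f519c9236fa12eedcfd00626a6f83ae",
    "52f40ed2bf1753e57694b88dfe78940c",
    "0260c88a6466ba0de4ea4e3de803a286",
    "1352c2f7840a2ff183099d84fcee2b50",
    "224eb35b560a29ee9094cc34632a5818",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 digest is one of the digests listed for this record."""
    return md5_of_file(path) in LISTED_MD5
```

A call such as `matches_listed_checksum("vamb.tar.gz")` returning `False` would suggest an incomplete or corrupted download. Reading in fixed-size chunks avoids loading a multi-gigabyte archive into memory at once.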

Additional details

Related works

Is derived from
Preprint: 10.1101/2021.08.21.456929 (DOI)