Additional data for preprint: Recovery of high quality metagenome-assembled genomes from full-scale activated sludge microbial communities in a tropical climate using longitudinal metagenome sampling
Authors/Creators
- 1. Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, 117456
- 2. Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, 637551
Description
Microbial communities underpinning the operation of wastewater treatment plants play a key role in mediating the interactions between human and natural ecosystems. They remain important but understudied, and their high eco-biological complexity makes them particularly challenging targets for metagenome-assembled genome (MAG) analysis. We consider strategies for recovering MAG sequences from time-series metagenome surveys of full-scale activated sludge microbial communities and generate MAG catalogues using multiple individual sample assemblies, two variations on multi-sample co-assembly, and a recently published MAG recovery workflow that uses deep learning (VAMB). We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. Here, we provide additional data not included in the NCBI submission (BioProject Accession PRJNA731554):
1. all_25_assemblies.tar.gz: contains 25 short-read assembly sequences (FASTA): 24 individual assemblies, one from each of the 24 samples, and one co-assembly of the combined 24 samples.
2. coassembly_multi_bam.tar.gz: contains all 1,712 MAG sequences (FASTA) obtained from the co-assembly and MetaBAT2 binning workflow, using coverage profiles generated across all 24 samples.
3. coassembly_single_bam.tar.gz: contains 1,997 MAG sequences (FASTA) obtained from the co-assembly and MetaBAT2 binning workflow, using the entire read set treated as a single meta-sample.
4. individual_assemblies.tar.gz: contains 3,429 MAG sequences (FASTA) obtained from the individual assemblies and MetaBAT2 binning workflow.
5. vamb.tar.gz: contains 1,941 MAG sequences (FASTA) obtained from the individual assemblies and VAMB binning workflow.
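As a quick consistency check after download, the per-archive MAG counts listed above can be tallied and compared with the number of FASTA files actually present in each archive. The following is a minimal sketch, not part of the authors' workflow. The helper name `count_fasta_members` and the list of FASTA file suffixes are assumptions made for illustration, since the internal layout and file naming of the archives are not described here.

```python
import tarfile

# MAG counts per archive, copied from the file list above.
STATED_MAG_COUNTS = {
    "coassembly_multi_bam.tar.gz": 1712,
    "coassembly_single_bam.tar.gz": 1997,
    "individual_assemblies.tar.gz": 3429,
    "vamb.tar.gz": 1941,
}

# Assumption: MAG files use one of these common FASTA suffixes.
# Adjust this tuple if the archives use a different naming scheme.
FASTA_SUFFIXES = (".fa", ".fasta", ".fna")


def count_fasta_members(archive_path):
    """Count regular files with a FASTA-like suffix inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sum(
            1
            for member in archive
            if member.isfile() and member.name.lower().endswith(FASTA_SUFFIXES)
        )


def total_stated_mags():
    """Sum the MAG counts stated in the description."""
    return sum(STATED_MAG_COUNTS.values())


if __name__ == "__main__":
    # The four MAG archives sum to 9,079, i.e. "just under 9,100" draft genomes.
    print("Total stated MAGs:", total_stated_mags())
```

Once an archive has been downloaded, calling `count_fasta_members("vamb.tar.gz")` and comparing the result with `STATED_MAG_COUNTS["vamb.tar.gz"]` would flag an incomplete download or a suffix mismatch, provided the suffix assumption holds for these archives.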
Files
(26.6 GB)
Additional details
Related works
- Is derived from
- Preprint: 10.1101/2021.08.21.456929 (DOI)