Poster Open Access
Background Short-read shotgun metagenomics can offer comprehensive microbial detection and characterisation of complex clinical samples. The de novo assembly of these data into draft genomes is a key step in metagenomic analysis, yielding longer sequences that provide contextual information and a more complete picture of the microbial community. However, the assembly process remains a major bottleneck in obtaining trustworthy, reproducible results.
Methods LMAS is an automated workflow developed as a flexible platform to evaluate the performance of traditional and metagenomic-dedicated prokaryotic de novo assembly software against known standard communities. Its implementation in Nextflow ensures the transparency and reproducibility of the results, and the use of Docker containers provides further flexibility. The results are presented in an interactive HTML report in which global and reference-specific performance metrics can be explored. Currently, 10 assemblers are implemented in LMAS, with the possibility of expansion as novel algorithms are developed.
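To illustrate the difference between a global and a reference-specific metric, the following is a minimal Python sketch, not LMAS code. The abstract does not name the metrics LMAS reports, so N50 (a widely used contiguity statistic) and breadth of reference coverage are assumed here as examples, and the function names `n50` and `genome_fraction` are hypothetical.

```python
def n50(contig_lengths):
    """Global metric (assumed example): length of the contig at which the
    cumulative length, summed from the longest contig down, first reaches
    half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def genome_fraction(aligned_intervals, reference_length):
    """Reference-specific metric (assumed example): fraction of one
    reference replicon covered by at least one aligned contig.

    aligned_intervals holds half-open (start, end) coordinates on the
    reference; overlapping intervals are counted only once."""
    covered = 0
    current_end = 0
    for start, end in sorted(aligned_intervals):
        start = max(start, current_end)  # skip bases already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / reference_length


if __name__ == "__main__":
    # Hypothetical contig lengths and alignment coordinates.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
    print(genome_fraction([(0, 40), (30, 60), (80, 100)], 100))
```

A global metric such as `n50` summarises the whole assembly regardless of origin, whereas `genome_fraction` has to be computed once per reference genome or plasmid, which is what allows performance to be compared across species of differing abundance.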
Results The eight bacterial genomes and four plasmids of the ZymoBIOMICS Microbial Community Standards were used as references. Raw sequence data from the mock communities, with even and logarithmic distributions of species, and matching simulated samples were used as input. The resulting LMAS report is available at https://lmas-demo.herokuapp.com.
Discussion Overall, k-mer De Bruijn graph assemblers outperformed the alternative approaches, but at a greater computational cost. Metagenomic-dedicated algorithms produced fewer misassembly errors than standard genomic assemblers. The performance of each assembler varied depending on the species of interest and its abundance in the sample, with less abundant species presenting a significant challenge for all assemblers. No assembler stood out as an undisputed all-purpose choice for short-read metagenomic prokaryote genome assembly, highlighting that efforts are still needed to improve metagenomic assembly performance. LMAS could underpin this development process. The LMAS workflow is available at https://github.com/cimendes/LMAS.