There is a newer version of the record available.

Published January 12, 2021 | Version 1.2.0
Software Open

artic-network/fieldbioinformatics: 1.2.0

  • 1. University of Birmingham
  • 2. @nanoporetech
  • 3. Fred Hutchinson Cancer Research Center
  • 4. The Scripps Research Institute

Description

Brief overview of updates:

  • Adds a --no-longshot flag to minion, which prevents longshot from running in the medaka workflow and instead uses medaka annotate to add info to the VCF.
  • Adds a prefilter before filtering the medaka VCFs to discard variants with fewer than 2 supporting reads and variant quality < 20, instead of writing them to the FAIL VCF, where they would contribute to the consensus mask.
  • Because of the prefilter, the longshot and medaka filters in vcf_filter have been combined into a single filter, as they perform the same function.
  • Adds --strict flag to minion to enable both primer scheme checking (for ARTIC format conventions) and filtering of the merged VCF to remove variants that are present only once in amplicon overlap regions, or are present in primer binding sites.
  • Adds a --scheme-version flag to minion so the user can specify the primer scheme version. The scheme version can be omitted, and the original behaviour (schemename/V1) is still used in preference to the new flag. The default remains 1, but the latest available scheme can be requested using 0. If the scheme is not found using the original logic (combining scheme name, scheme directory and scheme version), minion will attempt to download it from the ARTIC repos (only nipah, ebola and scov2 are available at the moment). The minion command can now be simplified by dropping --scheme-directory and letting schemes download automatically, e.g.: artic minion --medaka --read-file test-data/*.fast[aq] scov2/V3 MT007544
  • Rearranges the variant calling code block in minion, removing the need to split the BAMs by readgroup, as medaka performs BAM filtering in place. This results in fewer BAM copies.
  • Adds some code comments throughout minion to document pipeline steps.
  • Moves some file opening closer to where they are being used.
  • Cleans repo to remove an old fast5 which was lurking in the commit history and causing slow repo downloads. This improves the fix from #38. It should be noted that anyone wanting to make a PR will need a fresh pull/fork of the repo from today.
  • Adds multiqc (artic fork) to the conda environment (via pip), which produces amplicon coverage plots and variant quality information. This closes #51, as amplicon coverage is now computed from the align_trim report. Plots are now optionally produced by the user after the pipeline has run and can be collated across runs.
  • Deprecates the plot_amplicon_depth script and removes its associated dependencies, which have been a headache for pipeline users on HPC schedulers etc. Multiqc now replaces this functionality, and the plotting is more accurate and optional.
  • Requires --medaka-model to be provided if --medaka is set during artic minion runs.
  • Updates docs.
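
The requirement that --medaka-model accompanies --medaka can be illustrated with a short Python sketch. This is not the code from the artic repository: the parser name, the function names and the placeholder model string are assumptions made for illustration, and only the two flag names come from the notes above.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors only the two flags named in the notes.
    parser = argparse.ArgumentParser(prog="minion-sketch")
    parser.add_argument("--medaka", action="store_true")
    parser.add_argument("--medaka-model", default=None)
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Reject --medaka without a model, as the release notes require.
    if args.medaka and not args.medaka_model:
        parser.error("--medaka-model is required when --medaka is set")
    return args


if __name__ == "__main__":
    # "some-model" is a placeholder, not a real medaka model name.
    print(parse_args(["--medaka", "--medaka-model", "some-model"]))
```

Checking after parsing, rather than marking the option as always required, keeps --medaka-model optional for runs that do not use medaka.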

Notes:

  • The new --no-longshot flag results in more Ns in the consensus: without longshot we do not achieve the same level of filtering, so a few more variants end up in the FAIL VCF and are applied to the mask.
  • A bug in medaka means that when medaka annotate is used instead of longshot (via the new flag), INDELs will be silently dropped as medaka incorrectly reports no read support.
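
To see why extra entries in the FAIL VCF translate into extra Ns, here is a minimal Python sketch of position masking. It is an illustration under stated assumptions, not the pipeline's masking code: the function name is hypothetical, positions are taken to be 1-based as in VCF, and each failed variant is treated as masking a single base.

```python
def mask_consensus(sequence, fail_positions):
    """Return sequence with each 1-based position in fail_positions set to N."""
    bases = list(sequence)
    for pos in fail_positions:
        # Ignore positions that fall outside the sequence.
        if 1 <= pos <= len(bases):
            bases[pos - 1] = "N"
    return "".join(bases)


if __name__ == "__main__":
    # Two failed variants at positions 2 and 5 produce two Ns.
    print(mask_consensus("ACGTACGT", [2, 5]))  # ANGTNCGT
```

Every additional failed variant adds at least one masked base, which is the effect described in the first note.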

Files

artic-network/fieldbioinformatics-1.2.0.zip

Files (3.5 MB)

md5:ad827bd13f0ff4c4ab9f045aa5e38043

Additional details