broadinstitute/viral-ngs: v1.22.0

Daniel Park; Chris Tomkins-Tinch; Simon Ye; Irwin Jungreis; Hayden Metsky; Ilya Shlyakhter; Hanna; Mike Lin; Vang Le; pvanheus

doi:10.5281/zenodo.1587079

Published November 27, 2018 | Version v1.22.0

1. Broad Institute
2. Broad Institute of MIT and Harvard
3. MIT
4. @broadinstitute
5. DNAnexus
6. Aalborg University

New:

Adding commands for working with kmer sets using the KMC tool. (#854)
- new top-level python file: kmer_utils.py providing the following functions (see the documentation for more information):
  - build_kmer_db: Build a database of kmers occurring in given sequences
  - dump_kmer_counts: Dump kmers and their counts from kmer database to a text file
  - filter_reads: Filter reads based on their kmer content
  - kmers_binary_op: Perform a simple binary operation on kmer sets
  - kmers_set_counts: Copy the kmer database, setting all kmer counts in the output to the given value
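
As a rough illustration of the kmer-set operations listed above, here is a minimal pure-Python sketch. It is not the kmer_utils.py implementation and does not call KMC. The helper names (count_kmers, kmers_intersect, set_counts, read_passes) and the min_hits parameter are hypothetical, and reverse complements are ignored for simplicity.

```python
from collections import Counter


def count_kmers(seqs, k):
    """Count every kmer (length-k substring) occurring in the given sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmers_intersect(db_a, db_b):
    """Binary operation: keep kmers present in both sets, with the smaller count."""
    return Counter({kmer: min(n, db_b[kmer]) for kmer, n in db_a.items() if kmer in db_b})


def set_counts(db, value):
    """Copy a kmer set, setting every count in the copy to the given value."""
    return Counter({kmer: value for kmer in db})


def read_passes(read, db, k, min_hits=1):
    """Filter criterion: does the read contain at least min_hits kmers from db?"""
    read = read.upper()
    hits = sum(1 for i in range(len(read) - k + 1) if read[i:i + k] in db)
    return hits >= min_hits


db_a = count_kmers(["ACGTAC"], 3)   # kmers: ACG, CGT, GTA, TAC
db_b = count_kmers(["ACGACG"], 3)   # kmers: ACG (twice), CGA, GAC
shared = kmers_intersect(db_a, db_b)
kept = [r for r in ["TTACGTT", "GGGGGG"] if read_passes(r, db_a, 3)]
```

In this toy example only ACG is shared between the two sets, and only the first read contains kmers from db_a. A real kmer database such as KMC's stores the counts on disk rather than in an in-memory dictionary.
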
add metagenomics.py::filter_bam_to_taxa (#883)
- This function filters an input BAM file to include only reads that have been mapped to specified taxonomic IDs or scientific names. It requires a classification TSV file, as produced by tools such as Kraken, as well as the NCBI taxonomy database. The column numbers of the tax ID and read ID can be specified, which allows use beyond Kraken-format read classification files; however, the read-to-taxon relationship is assumed to be bijective.
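
The read-selection step described above might look roughly like the following sketch. It is an assumption-laden illustration, not the metagenomics.py code: the function name read_ids_for_taxa and its parameters are hypothetical, column indices are 0-based, the defaults assume a Kraken-style layout (classified flag, read ID, tax ID), and the NCBI taxonomy lookup (scientific names, descendant taxa) is omitted.

```python
def read_ids_for_taxa(tsv_lines, wanted_tax_ids, read_id_col=1, tax_id_col=2):
    """Return the set of read IDs whose classification matches a wanted tax ID.

    tsv_lines is any iterable of tab-separated text lines, e.g. an open file.
    """
    wanted = {str(t) for t in wanted_tax_ids}
    keep = set()
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip short or malformed rows instead of raising IndexError.
        if len(fields) <= max(read_id_col, tax_id_col):
            continue
        if fields[tax_id_col] in wanted:
            keep.add(fields[read_id_col])
    return keep


# Three made-up classification rows; the read names are illustrative only.
example = [
    "C\tread1\t11320\t150\t-\n",
    "U\tread2\t0\t150\t-\n",
    "C\tread3\t9606\t150\t-\n",
]
selected = read_ids_for_taxa(example, [11320])
```

The resulting set of read IDs would then be used to decide which records of the input BAM file to write to the output.
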
add WDL for filter_bam_to_taxa
assembly.py::assemble_spades now has an option, --minContigLen, so that SPAdes-based de novo assembly yields only contigs longer than a specified length (#889)
assembly.py - added --alwaysSucceed option to SPAdes (#888)
allow RunInfo.xml override in illumina_demux WDL task (#891)
Added read_utils.py::read_names to extract read names from a sequence file
Added run-pipe_local.sh wrapper script for invoking the Snakemake-based pipeline on a single compute instance (#897)

Changed:

the Unmatched.bam file is now preserved in the illumina_demux WDL task (#887)
increase memory headroom requested for UGER jobs by 10% (#892)
(Broad only) change dotkit providing python-yaml (#890)
use python3 in easy-deploy script if available (#894)
Snakemake rules now specify their memory requirement via the mem_mb param, which is recognized by certain execution engines such as kubernetes (#897)

Fixed:

do not require chromosome names when checking whether a bam file is sorted (#898)
add --no-same-owner to tar -x in WDL tasks (#880)
safely build snpEff database (#881)
allow ints in Snakemake remote protocols ("s3://"...) (#895)
fix ncbi tbl parser for refseq accessions (#899)

Added/Upgraded:

coveralls 1.1 -> 1.3.0 (#876)
pytest 3.6.3 -> 3.7.1 (#876)
pytest-mock 1.5.0 -> 1.10.0 (#876)
pytest-xdist 1.15.0 -> 1.22.5 (#876)
coverage 4.4.1 -> 4.5.1 (#876)
spades 3.11.1 -> 3.12.0 (#878)
Added kmc 3.1.1rc1
update Docker viral-baseimage 0.1.12 -> 0.1.13 (#884)

Files

Files (8.0 MB)

Name	Size
broadinstitute/viral-ngs-v1.22.0.zip (md5:bcfbac7ee00f5d7d343b3577872dd6c5)	8.0 MB

Additional details

Is supplement to: https://github.com/broadinstitute/viral-ngs/tree/v1.22.0 (URL)
