Published September 15, 2020 | Version v1.6.0
Software Open

ENCODE-DCC/chip-seq-pipeline2: v1.6.0

  • 1. Stanford University
  • 2. The University of Chicago

Description

Conda users should update the pipeline's environment. Reinstalling it is recommended instead, however, since we added GNU utils to the installer.

# To update env
$ bash scripts/update_conda_env.sh

# To re-install env
$ bash scripts/uninstall_conda_env.sh
$ bash scripts/install_conda_env.sh

New factor-based resource parameters

  • New parameters are factor-based: each factor is multiplied by the size of a task's input files to determine the resources (memory/disk) required to run that task (on a cloud instance or as an HPC job).
  • e.g. for each replicate, the summed size of all R1/R2 FASTQs determines the resources for task align, and the BAM size determines the resources for task filter.
  • e.g. suppose you have 20 GB of PE FASTQs in total (R1 + R2) and the default chip.align_mem_factor is 0.15. Base memory is fixed at 4-6 GB for most tasks (5 GB for task align), so the instance's memory for task align will be 20 * 0.15 + 5 = 8 GB.
  • Memory/disk requirements were also optimized for each task; all tasks should use less memory/disk than in previous versions.
  • All tasks now use SSD on Google Cloud. SSD costs 4x as much as HDD, but the cost is still negligible (100 GB of SSD costs $0.5 per hour).
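
The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the formula as described in these notes (input size times factor, plus a fixed base), not the pipeline's actual implementation; the function name and the 10 GB + 10 GB split between R1 and R2 are assumptions.

```python
def required_memory_gb(input_size_gb: float, mem_factor: float, base_mem_gb: float) -> float:
    """Estimate task memory as: total input size * factor + fixed base memory."""
    return input_size_gb * mem_factor + base_mem_gb


# Hypothetical replicate: R1 and R2 FASTQs of 10 GB each (20 GB in total).
fastq_sizes_gb = [10.0, 10.0]

# 0.15 and 5 GB are the factor and base memory quoted for task align above.
mem_gb = required_memory_gb(sum(fastq_sizes_gb), mem_factor=0.15, base_mem_gb=5.0)
print(f"Estimated memory for task align: {mem_gb:.1f} GB")
```

The same shape of calculation applies to disk factors, with the relevant input (for example the BAM size for task filter) in place of the FASTQ total.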

Change of default for resource parameters

  • chip.align_cpu: 2 -> 6
  • chip.filter_cpu: 2 -> 4
  • chip.call_peak_cpu: 1 -> 2 (the peak caller MACS2 is single-threaded, so no more than 2 CPUs are required)

Added resource parameters

  • chip.spr_disk_factor
  • chip.preseq_disk_factor
  • chip.call_peak_cpu

Change of resource parameters

  • chip.align_mem_mb -> chip.align_bowtie2_mem_factor and chip.align_bwa_mem_factor
    • The factor used depends on the chosen aligner chip.aligner (bowtie2 or bwa). A custom aligner uses chip.align_bwa_mem_factor.
  • chip.align_disks -> chip.align_bowtie2_disk_factor and chip.align_bwa_disk_factor
    • The factor used depends on the chosen aligner chip.aligner (bowtie2 or bwa). A custom aligner uses chip.align_bwa_disk_factor.
  • chip.filter_mem_mb -> chip.filter_mem_factor
  • chip.filter_disks -> chip.filter_disk_factor
  • chip.bam2ta_mem_mb -> chip.bam2ta_mem_factor
  • chip.bam2ta_disks -> chip.bam2ta_disk_factor
  • chip.xcor_mem_mb -> chip.xcor_mem_factor
  • chip.xcor_disks -> chip.xcor_disk_factor
  • chip.spr_mem_mb -> chip.spr_mem_factor
  • chip.spr_disks -> chip.spr_disk_factor
  • chip.jsd_mem_mb -> chip.jsd_mem_factor
  • chip.jsd_disks -> chip.jsd_disk_factor
  • chip.call_peak_mem_mb -> chip.call_peak_spp_mem_factor and chip.call_peak_macs2_mem_factor
    • The factor used depends on the chosen peak caller chip.peak_caller (which defaults to spp for TF ChIP and macs2 for histone ChIP).
  • chip.call_peak_disks -> chip.call_peak_spp_disk_factor and chip.call_peak_macs2_disk_factor
    • The factor used depends on the chosen peak caller chip.peak_caller (which defaults to spp for TF ChIP and macs2 for histone ChIP).
  • chip.macs2_signal_track_mem_mb -> chip.macs2_signal_track_mem_factor
  • chip.macs2_signal_track_disks -> chip.macs2_signal_track_disk_factor
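
When migrating an existing input JSON, it can help to scan it for the old parameter names listed above. The following is a minimal sketch of such a check, not a tool shipped with the pipeline; the helper name and the sample input values are hypothetical. Because the old parameters were absolute amounts and the new ones are factors, old values cannot simply be copied over, so the helper only reports what needs attention.

```python
# Old parameter name -> replacement name(s), taken from the list above.
RENAMED = {
    "chip.align_mem_mb": ["chip.align_bowtie2_mem_factor", "chip.align_bwa_mem_factor"],
    "chip.align_disks": ["chip.align_bowtie2_disk_factor", "chip.align_bwa_disk_factor"],
    "chip.filter_mem_mb": ["chip.filter_mem_factor"],
    "chip.filter_disks": ["chip.filter_disk_factor"],
    "chip.bam2ta_mem_mb": ["chip.bam2ta_mem_factor"],
    "chip.bam2ta_disks": ["chip.bam2ta_disk_factor"],
    "chip.xcor_mem_mb": ["chip.xcor_mem_factor"],
    "chip.xcor_disks": ["chip.xcor_disk_factor"],
    "chip.spr_mem_mb": ["chip.spr_mem_factor"],
    "chip.spr_disks": ["chip.spr_disk_factor"],
    "chip.jsd_mem_mb": ["chip.jsd_mem_factor"],
    "chip.jsd_disks": ["chip.jsd_disk_factor"],
    "chip.call_peak_mem_mb": ["chip.call_peak_spp_mem_factor", "chip.call_peak_macs2_mem_factor"],
    "chip.call_peak_disks": ["chip.call_peak_spp_disk_factor", "chip.call_peak_macs2_disk_factor"],
    "chip.macs2_signal_track_mem_mb": ["chip.macs2_signal_track_mem_factor"],
    "chip.macs2_signal_track_disks": ["chip.macs2_signal_track_disk_factor"],
}


def find_deprecated(input_json: dict) -> dict:
    """Return the old parameter names found in the input, with their replacements."""
    return {key: RENAMED[key] for key in input_json if key in RENAMED}


# Hypothetical pre-v1.6.0 input; the values are made up for illustration.
old_input = {
    "chip.aligner": "bowtie2",
    "chip.align_mem_mb": 20000,
    "chip.filter_disks": "local-disk 400 HDD",
}

for old_name, new_names in find_deprecated(old_input).items():
    print(f"{old_name} is replaced by: {', '.join(new_names)}")
```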

Resources for task align

  • A custom aligner Python script must be updated to accept --mem-gb.
    • Task align will use BWA's resources (chip.align_bwa_mem_factor and chip.align_bwa_disk_factor).
    • Add --mem-gb to your Python script chip.custom_align_py.
    • See input documentation for details.
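
As a rough sketch of what accepting --mem-gb could look like in a custom aligner script, the snippet below adds the option with argparse. Only the --mem-gb flag itself comes from these notes. The positional fastqs argument, the float type, and the sample command line are assumptions made for illustration; the input documentation defines the interface the pipeline actually expects.

```python
import argparse


def parse_args(argv=None):
    """Parse arguments for a hypothetical custom aligner script."""
    parser = argparse.ArgumentParser(description="Custom aligner (illustrative sketch)")
    # Hypothetical positional argument; the real script's arguments may differ.
    parser.add_argument("fastqs", nargs="+", help="Input FASTQ file(s)")
    # The option required as of v1.6.0; the type is assumed to be a number of GB.
    parser.add_argument("--mem-gb", type=float, help="Memory available to the task, in GB")
    return parser.parse_args(argv)


# Example invocation with a made-up file name.
args = parse_args(["rep1_R1.fastq.gz", "--mem-gb", "8"])
print(f"Aligning {args.fastqs} with up to {args.mem_gb} GB of memory")
```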

Resources for task call_peak

  • Different factor-based parameters are used depending on the peak caller chip.peak_caller (which defaults to spp for TF ChIP and macs2 for histone ChIP).
  • If chip.peak_caller is not defined, TF ChIP-seq ("chip.pipeline_type": "tf") defaults to the spp peak caller and hence uses chip.call_peak_spp_mem_factor and chip.call_peak_spp_disk_factor.
  • If chip.peak_caller is not defined, histone ChIP-seq ("chip.pipeline_type": "histone") defaults to the macs2 peak caller and hence uses chip.call_peak_macs2_mem_factor and chip.call_peak_macs2_disk_factor.
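
The selection rule above can be summarized in a short Python sketch. It mirrors the description in these notes rather than the pipeline's own code, and the function and constant names are hypothetical.

```python
# Default peak caller per pipeline type, as described above.
DEFAULT_PEAK_CALLER = {"tf": "spp", "histone": "macs2"}


def call_peak_factor_params(pipeline_type, peak_caller=None):
    """Return the (mem factor, disk factor) parameter names that apply to task call_peak."""
    # An explicit chip.peak_caller wins; otherwise fall back to the pipeline type's default.
    caller = peak_caller or DEFAULT_PEAK_CALLER[pipeline_type]
    return (
        f"chip.call_peak_{caller}_mem_factor",
        f"chip.call_peak_{caller}_disk_factor",
    )


print(call_peak_factor_params("tf"))
print(call_peak_factor_params("histone"))
print(call_peak_factor_params("tf", peak_caller="macs2"))
```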

Misc.

  • Better multi-threading for samtools view/index/sort.
  • Added GNU utils to Conda environment.

Files

ENCODE-DCC/chip-seq-pipeline2-v1.6.0.zip

Files (1.6 MB)

md5:f9bbb6cf321f995a5ccad4dfee1f5d34