Published September 15, 2020 | Version v1.6.0
Software
Open
ENCODE-DCC/chip-seq-pipeline2: v1.6.0
- 1. Stanford University
- 2. The University of Chicago
Description
Conda users should update the pipeline's Conda environment. Reinstalling the environment is recommended, however, because GNU utils were added to the installer.
# To update env
$ bash scripts/update_conda_env.sh
# To re-install env
$ bash scripts/uninstall_conda_env.sh
$ bash scripts/install_conda_env.sh
New factor-based resource parameters
- New parameters are factor-based: each factor is multiplied by the total size of a task's input files to determine the resources (memory/disk) required to run the task (on a cloud instance or as an HPC job).
  - e.g. for each replicate, the total size of all R1/R2 FASTQs is used to determine resources for task align, and the BAM size is used for task filter.
  - e.g. suppose you have 20 GB (R1 + R2) of PE FASTQs in total and the default chip.align_mem_factor is 0.15. Base memory is fixed at 4-6 GB for most tasks (5 GB for task align), so the instance's memory for task align will be 20 * 0.15 + 5 = 8 GB.
- Memory/disk requirements were also optimized for each task; all tasks should use less memory/disk than in previous versions.
- SSD is now used for all tasks on Google Cloud. This costs about 4x as much as HDD, but the cost is still negligible (cost for SSD 100 GB is $0.5 per hour).
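The sizing rule above can be sketched in Python. This is only an illustration of the arithmetic, not the pipeline's own code: the function name estimate_task_mem_gb and its argument names are hypothetical, and the numbers are the ones from the example (20 GB of FASTQs, a factor of 0.15, 5 GB of base memory for task align).

```python
def estimate_task_mem_gb(input_size_gb, mem_factor, base_mem_gb):
    """Estimate the memory (in GB) requested for a task.

    Hypothetical helper mirroring the rule in these notes:
    memory = total input size * factor + fixed base memory.
    """
    if input_size_gb < 0 or mem_factor < 0 or base_mem_gb < 0:
        raise ValueError("sizes and factors must be non-negative")
    return input_size_gb * mem_factor + base_mem_gb


if __name__ == "__main__":
    # 20 GB of PE FASTQs (R1 + R2), factor 0.15, 5 GB base memory for align.
    mem_gb = estimate_task_mem_gb(20, 0.15, 5)
    print(f"Estimated memory for task align: {mem_gb:g} GB")
```

Raising a factor in the input JSON scales the request with input size, while the base memory stays fixed per task.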
Change of default for resource parameters
- chip.align_cpu: 2 -> 6
- chip.filter_cpu: 2 -> 4
- chip.call_peak_cpu: 1 -> 2 (the peak caller MACS2 is single-threaded, so no more than 2 is required)
Added resource parameters
- chip.spr_disk_factor
- chip.preseq_disk_factor
- chip.call_peak_cpu
Change of resource parameters.
- chip.align_mem_mb -> chip.align_bowtie2_mem_factor and chip.align_bwa_mem_factor
  - Which one is used depends on the chosen aligner chip.aligner (bowtie2 or bwa). For a custom aligner, chip.align_bwa_mem_factor is used.
- chip.align_disks -> chip.align_bowtie2_disk_factor and chip.align_bwa_disk_factor
  - Which one is used depends on the chosen aligner chip.aligner (bowtie2 or bwa). For a custom aligner, chip.align_bwa_disk_factor is used.
- chip.filter_mem_mb -> chip.filter_mem_factor
- chip.filter_disks -> chip.filter_disk_factor
- chip.bam2ta_mem_mb -> chip.bam2ta_mem_factor
- chip.bam2ta_disks -> chip.bam2ta_disk_factor
- chip.xcor_mem_mb -> chip.xcor_mem_factor
- chip.xcor_disks -> chip.xcor_disk_factor
- chip.spr_mem_mb -> chip.spr_mem_factor
- chip.spr_disks -> chip.spr_disk_factor
- chip.jsd_mem_mb -> chip.jsd_mem_factor
- chip.jsd_disks -> chip.jsd_disk_factor
- chip.call_peak_mem_mb -> chip.call_peak_spp_mem_factor and chip.call_peak_macs2_mem_factor
  - Which one is used depends on the chosen peak caller chip.peak_caller (defaulting to spp for TF ChIP and macs2 for histone ChIP).
- chip.call_peak_disks -> chip.call_peak_spp_disk_factor and chip.call_peak_macs2_disk_factor
  - Which one is used depends on the chosen peak caller chip.peak_caller (defaulting to spp for TF ChIP and macs2 for histone ChIP).
- chip.macs2_signal_track_mem_mb -> chip.macs2_signal_track_mem_factor
- chip.macs2_signal_track_disks -> chip.macs2_signal_track_disk_factor
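When migrating an existing input JSON, it may help to flag the replaced parameters listed above. The following is a rough sketch and not part of the pipeline: the helper name find_renamed_params is hypothetical, the example input values are placeholders, and only part of the rename table is spelled out (the remaining entries follow the same pattern). Because the new parameters are multipliers on input file size rather than absolute amounts, the sketch only reports names and does not try to convert values.

```python
import json

# Old -> new parameter names, copied from the list in these notes.
# Only some entries are shown; bam2ta, xcor, spr, jsd and
# macs2_signal_track follow the same *_mem_mb / *_disks pattern.
RENAMED_PARAMS = {
    "chip.align_mem_mb": ["chip.align_bowtie2_mem_factor",
                          "chip.align_bwa_mem_factor"],
    "chip.align_disks": ["chip.align_bowtie2_disk_factor",
                         "chip.align_bwa_disk_factor"],
    "chip.filter_mem_mb": ["chip.filter_mem_factor"],
    "chip.filter_disks": ["chip.filter_disk_factor"],
    "chip.call_peak_mem_mb": ["chip.call_peak_spp_mem_factor",
                              "chip.call_peak_macs2_mem_factor"],
    "chip.call_peak_disks": ["chip.call_peak_spp_disk_factor",
                             "chip.call_peak_macs2_disk_factor"],
}


def find_renamed_params(input_json):
    """Return {old_name: [new_names]} for each replaced key that is present."""
    return {key: RENAMED_PARAMS[key]
            for key in input_json if key in RENAMED_PARAMS}


if __name__ == "__main__":
    # Placeholder input JSON; the values are illustrative only.
    example = json.loads(
        '{"chip.pipeline_type": "tf", "chip.align_mem_mb": 1, "chip.filter_disks": "x"}'
    )
    for old, new in find_renamed_params(example).items():
        print(f"{old} is replaced by {' / '.join(new)}")
```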
Resources for task align
- The custom aligner Python script must be updated to accept --mem-gb.
  - Task align will use BWA's resources (chip.align_bwa_mem_factor and chip.align_bwa_disk_factor).
  - --mem-gb should be added to your Python script chip.custom_align_py.
  - See the input documentation for details.
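As a rough illustration of the change requested above, a custom aligner script could accept the new flag through argparse. Only the flag name --mem-gb comes from these notes; its type, default, and help text, the positional fastqs argument, and the function name parse_arguments are assumptions made for this sketch, so check the input documentation for the exact interface the pipeline expects.

```python
import argparse


def parse_arguments(argv=None):
    """Parse command-line arguments for a custom aligner script (sketch)."""
    parser = argparse.ArgumentParser(
        description="Custom aligner wrapper (illustrative sketch)."
    )
    # Hypothetical positional input; a real script defines its own inputs.
    parser.add_argument("fastqs", nargs="+", help="Input FASTQ file(s).")
    # The flag required by these notes. Type and default are assumptions.
    parser.add_argument(
        "--mem-gb",
        type=float,
        default=None,
        help="Memory available to the task, in GB.",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Placeholder file names, used only to demonstrate parsing.
    args = parse_arguments(["rep1_R1.fastq.gz", "rep1_R2.fastq.gz", "--mem-gb", "8"])
    print(args.fastqs, args.mem_gb)
```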
Resources for task call_peak
- Different factor-based parameters are used depending on the peak caller chip.peak_caller (defaulting to spp for TF ChIP and macs2 for histone ChIP).
  - If chip.peak_caller is not defined, TF ChIP-seq ("chip.pipeline_type": "tf") defaults to the spp peak caller, hence chip.call_peak_spp_mem_factor and chip.call_peak_spp_disk_factor are used.
  - If chip.peak_caller is not defined, histone ChIP-seq ("chip.pipeline_type": "histone") defaults to the macs2 peak caller, hence chip.call_peak_macs2_mem_factor and chip.call_peak_macs2_disk_factor are used.
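The selection rule above can be written out as a small Python sketch. It is not the pipeline's implementation: the function name resolve_call_peak_factors is hypothetical, and it covers only the two peak callers named in these notes (spp and macs2).

```python
# Default peak caller per pipeline type, as stated in these notes.
DEFAULT_PEAK_CALLER = {"tf": "spp", "histone": "macs2"}


def resolve_call_peak_factors(inputs):
    """Return the (mem_factor, disk_factor) parameter names for call_peak.

    `inputs` is a dict shaped like the pipeline's input JSON.
    """
    peak_caller = inputs.get("chip.peak_caller")
    if peak_caller is None:
        # Fall back to the default for the pipeline type.
        pipeline_type = inputs.get("chip.pipeline_type")
        if pipeline_type not in DEFAULT_PEAK_CALLER:
            raise ValueError(f"no default peak caller for {pipeline_type!r}")
        peak_caller = DEFAULT_PEAK_CALLER[pipeline_type]
    if peak_caller not in ("spp", "macs2"):
        raise ValueError(f"peak caller not covered by this sketch: {peak_caller!r}")
    return (
        f"chip.call_peak_{peak_caller}_mem_factor",
        f"chip.call_peak_{peak_caller}_disk_factor",
    )


if __name__ == "__main__":
    print(resolve_call_peak_factors({"chip.pipeline_type": "tf"}))
    print(resolve_call_peak_factors({"chip.pipeline_type": "histone"}))
```

An explicit chip.peak_caller takes precedence over the pipeline type, so a TF run with macs2 selected would use the macs2 factors.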
Misc.
- Better multi-threading for samtools view/index/sort.
- Added GNU utils to the Conda environment.
Files
Name | Size |
---|---|
ENCODE-DCC/chip-seq-pipeline2-v1.6.0.zip (md5:f9bbb6cf321f995a5ccad4dfee1f5d34) | 1.6 MB |
Additional details
Related works
- Is supplement to: https://github.com/ENCODE-DCC/chip-seq-pipeline2/tree/v1.6.0 (URL)