Software Open Access

Pipengine: An ultra light YAML-based pipeline execution engine

Strozzi, Francesco; Bonnal, Raoul Jean Pierre

PipEngine generates runnable shell scripts, preconfigured for the PBS/Torque job scheduler, for each sample in a pipeline. It can run a complete pipeline or just a single step, as needed. PipEngine is best suited to NGS pipelines, but it can be used for any pipeline that runs on a job scheduling system and is "sample"-centric: on one side you have a list of samples with their corresponding input data, and on the other a pipeline that you want to apply to them.
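The sample-centric idea described above can be illustrated with a minimal Python sketch. It is not PipEngine's implementation or its YAML schema: the sample names, file paths, tool names (`trimmer`, `aligner`), the `render_script` helper and the resource request are all hypothetical, chosen only to show how one pipeline description can be expanded into one PBS-ready script per sample, either for the whole pipeline or for a single step.

```python
from string import Template

# Hypothetical sample list: sample name -> input file path.
SAMPLES = {
    "sampleA": "/data/sampleA.fastq",
    "sampleB": "/data/sampleB.fastq",
}

# Hypothetical pipeline: ordered step name -> command template.
# ${sample} and ${input} are placeholders filled in per sample.
PIPELINE = {
    "trim": "trimmer --in ${input} --out ${sample}.trimmed.fastq",
    "align": "aligner --reads ${sample}.trimmed.fastq --out ${sample}.bam",
}


def render_script(sample, input_path, steps=None):
    """Return the text of a PBS job script for one sample.

    If `steps` is None, every pipeline step is included in order;
    otherwise only the named steps are included.
    """
    selected = list(PIPELINE) if steps is None else steps
    lines = [
        "#!/bin/bash",
        f"#PBS -N {sample}",        # job name
        "#PBS -l nodes=1:ppn=1",    # assumed resource request
    ]
    for step in selected:
        command = Template(PIPELINE[step]).substitute(
            sample=sample, input=input_path
        )
        lines.append(command)
    return "\n".join(lines) + "\n"


# One script per sample for the complete pipeline.
scripts = {name: render_script(name, path) for name, path in SAMPLES.items()}

# A script that runs a single step for a single sample.
single_step = render_script("sampleA", SAMPLES["sampleA"], steps=["align"])
```

Keeping `SAMPLES` and `PIPELINE` as separate structures mirrors the split between input data and pipeline template: the same pipeline can be reapplied to a different sample list without editing any command.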

PipEngine was developed to combine the flexibility and portability of shell scripts with the concept of pipeline templates that can be easily applied to different input data to reproduce scientific results. Compared with Makefiles or custom ad hoc shell scripts, it offers three improvements: pipelines written in YAML are more readable, especially for people with no coding experience; automated script generation makes it possible to add extra functionality such as error control and logging directly to the job scripts; and an enforced separation between the description of the input data and the pipeline template improves the clarity and reusability of analysis protocols.
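As a rough illustration of how script generation can add error control and logging to each job, the following Python sketch wraps a step's command in shell lines that record start and end times and stop the job on failure. The `wrap_step` helper, the log file name and the log message format are assumptions made for this example, not PipEngine's actual output.

```python
import shlex


def wrap_step(step_name, command, log_file="pipeline.log"):
    """Return shell lines that run `command` with logging and an exit check.

    The step's stdout and stderr are appended to `log_file`; a non-zero
    exit status is logged and aborts the job script.
    """
    log = shlex.quote(log_file)
    return "\n".join([
        f'echo "$(date) START {step_name}" >> {log}',
        f"{command} >> {log} 2>&1",
        f'if [ $? -ne 0 ]; then echo "$(date) FAILED {step_name}" >> {log}; exit 1; fi',
        f'echo "$(date) DONE {step_name}" >> {log}',
    ])


# Hypothetical step and command, for illustration only.
block = wrap_step("align", "aligner --reads sampleA.fastq")
print(block)
```

Because the wrapper is applied by the generator, the pipeline description itself stays a plain list of commands, and every generated script gets the same checks.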

Files (386.1 kB)
Name: bioruby-pipengine.tgz
Size: 386.1 kB
md5: e6e3579556653e6317080184d5a91750
