github.com/PlasmoGenEpi/plasmodiumdrugres-wdl/plasmodiumdrugres
PlasmodiumDrugRes WDL interface (parity with Nextflow)
This document defines the user-facing inputs and outputs for the WDL implementation of the plasmodiumdrugres pipeline, and how these map to the current Nextflow pipeline in ~/Documents/git_projects/plasmodiumdrugres/.
Source of truth (Nextflow):
- Parameter schema: ~/Documents/git_projects/plasmodiumdrugres/nextflow_schema.json
- Defaults: ~/Documents/git_projects/plasmodiumdrugres/nextflow.config
- Workflow wiring / branching: ~/Documents/git_projects/plasmodiumdrugres/workflows/plasmodiumdrugres.nf
- Input validation and PMO/population-field normalization: ~/Documents/git_projects/plasmodiumdrugres/subworkflows/local/utils_nfcore_plasmodiumdrugres_pipeline/main.nf
Inputs
Required: choose exactly one input mode
Provide exactly one of:
- pmo (File): PMO JSON file.
- allele_table (File): TSV/CSV containing microhaplotypes. When using this mode, panel_info_bed is also required.
Required files
- loci_of_interest_bed (File): BED of loci of interest (single-locus estimates are computed at these loci).
- loci_groups (File): TSV/CSV defining multi-locus groups (multi-locus estimates are computed for these groups).
Required iff using allele_table
- panel_info_bed (File): BED defining panel target coordinates.
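The input-mode rules above can be sketched as a small check. This is an illustration only, not the workflow's own validation code; the function name `check_input_mode` and the error messages are hypothetical.

```python
from typing import Optional


def check_input_mode(pmo: Optional[str],
                     allele_table: Optional[str],
                     panel_info_bed: Optional[str]) -> str:
    """Return the selected input mode ("pmo" or "allele_table").

    Hypothetical helper: raises ValueError when the documented rules
    are violated.
    """
    # Exactly one of pmo / allele_table must be provided.
    if (pmo is None) == (allele_table is None):
        raise ValueError("provide exactly one of pmo or allele_table")
    if allele_table is not None:
        # panel_info_bed is required only in allele_table mode.
        if panel_info_bed is None:
            raise ValueError("panel_info_bed is required with allele_table")
        return "allele_table"
    return "pmo"
```

For example, `check_input_mode(None, "alleles.tsv", None)` raises because the panel BED is missing, while `check_input_mode("sample.json", None, None)` returns `"pmo"`.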
Optional grouping / population splitting
You can run either:
- Single population (default): no splitting is performed; results are labeled using population_label (default pop1).
- Per-population: split input tables by population and compute outputs for each population.
Inputs controlling this:
- population_assignment (File?): TSV/CSV mapping specimen_name → population.
- pmo_population_fields (String?, default null): comma-separated list of PMO specimen metadata fields; used only when pmo is provided and population_assignment is not provided.
- pmo_population_separator (String, default _): join string used when building the population label from pmo_population_fields.
- population_label (String, default pop1): used only when no population assignment is available.
Branching rule (parity target):
has_population_assignment = (population_assignment is provided) OR (pmo is provided AND pmo_population_fields is provided)
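As a minimal sketch, the branching rule and the label construction might look like the following. Treating "provided" as "not None" is an assumption, as are the function name `population_label_from_fields` and the metadata field names (`country`, `year`) used in the usage example; the actual normalization lives in the Nextflow subworkflow listed above.

```python
from typing import Mapping, Optional


def has_population_assignment(population_assignment: Optional[str],
                              pmo: Optional[str],
                              pmo_population_fields: Optional[str]) -> bool:
    """Branching rule from the text; "provided" is assumed to mean not None."""
    return (population_assignment is not None
            or (pmo is not None and pmo_population_fields is not None))


def population_label_from_fields(metadata: Mapping[str, object],
                                 pmo_population_fields: str,
                                 pmo_population_separator: str = "_") -> str:
    """Join the selected specimen metadata values into one population label."""
    # Split the comma-separated field list and ignore surrounding whitespace.
    fields = [name.strip() for name in pmo_population_fields.split(",")]
    return pmo_population_separator.join(str(metadata[name]) for name in fields)
```

With hypothetical metadata `{"country": "KE", "year": 2020}` and `pmo_population_fields="country,year"`, the label is `KE_2020`.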
Optional references (PMO mode only)
These are used when generating a panel BED from PMO and adding reference sequences to it:
- targeted_reference (File?, default null): FASTA containing only the targets.
- genome_reference (File?, default null): FASTA containing the full genome.
Behavior (parity target):
- If both are provided, prefer targeted_reference (Nextflow warns and prefers the targeted reference).
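The preference rule can be illustrated with a short sketch. The function name `choose_reference` and the warning text are hypothetical; only the "warn and prefer the targeted reference" behavior comes from the text above.

```python
import warnings
from typing import Optional


def choose_reference(targeted_reference: Optional[str],
                     genome_reference: Optional[str]) -> Optional[str]:
    """Pick the reference FASTA to use, preferring the targeted one."""
    if targeted_reference is not None and genome_reference is not None:
        # Mirrors the documented behavior: warn, then prefer targeted_reference.
        warnings.warn("both references provided; using targeted_reference")
        return targeted_reference
    # At most one is set here; return it, or None when neither is provided.
    if targeted_reference is not None:
        return targeted_reference
    return genome_reference
```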
Method selection (defaults from Nextflow)
- mlaf_method (String, default naive): one of naive, MLBM, FEM.
- naive_mlaf_method (String, default wsaf_prop): passed to the naive multi-locus method.
- slaf_method (String, default naive): one of naive, IDM, mhaps_freq.
- naive_slaf_method (String, default read_count_prop): passed to the naive single-locus method.
- mhaps_freq uses DCIFER in the current Nextflow pipeline.
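A caller could check the method names before submitting a run, as in the sketch below. The allowed values and defaults are taken from the list above; the function name `check_methods` is hypothetical, and case-sensitive matching is an assumption.

```python
from typing import Tuple

# Allowed values as listed in this document.
MLAF_METHODS = ("naive", "MLBM", "FEM")
SLAF_METHODS = ("naive", "IDM", "mhaps_freq")


def check_methods(mlaf_method: str = "naive",
                  slaf_method: str = "naive") -> Tuple[str, str]:
    """Return the validated pair, or raise ValueError on an unknown name."""
    # Matching is assumed to be case-sensitive ("mlbm" is rejected).
    if mlaf_method not in MLAF_METHODS:
        raise ValueError(f"mlaf_method must be one of {MLAF_METHODS}")
    if slaf_method not in SLAF_METHODS:
        raise ValueError(f"slaf_method must be one of {SLAF_METHODS}")
    return mlaf_method, slaf_method
```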
Optional tuning parameters (passed through to scripts)
- translate_loci_extra_args (String, default "")
- mlbm_wrapper_aa_specimen_occurence_cut_off (Int?, default null)
- naive_multilocus_wsaf_cut_off (Float?, default null)
- dcifer_slaf_wrapper_coi_lrank (Int?, default null)
- dcifer_slaf_wrapper_qstart (Float?, default null)
- dcifer_slaf_wrapper_tol (Float?, default null)
Output directory convention
To mimic Nextflow's outdir organization (even though Terra does not require explicit staging), the WDL workflow will write deliverables under:
- outdir (String, default output)
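To show how the inputs described so far fit together, the sketch below builds a Cromwell-style inputs JSON for a PMO-mode run. The input names come from this document; the workflow name used as the key prefix (`plasmodiumdrugres`), the helper `build_inputs`, all file paths, and the metadata field names are assumptions for illustration.

```python
import json

# Assumed workflow name; Cromwell-style inputs are keyed as "<workflow>.<input>".
WORKFLOW = "plasmodiumdrugres"


def build_inputs(**values):
    """Prefix each input with the workflow name and drop unset (None) values."""
    return {f"{WORKFLOW}.{name}": value
            for name, value in values.items() if value is not None}


inputs = build_inputs(
    pmo="data/example_pmo.json",                 # hypothetical path
    loci_of_interest_bed="data/loci.bed",        # hypothetical path
    loci_groups="data/loci_groups.tsv",          # hypothetical path
    pmo_population_fields="country,year",        # hypothetical field names
    population_assignment=None,                  # unset, so omitted from the JSON
    mlaf_method="naive",
    slaf_method="naive",
    outdir="output",
)

print(json.dumps(inputs, indent=2))
```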
Outputs
On Terra, the workflow exposes exactly seven outputs as String URIs from the staging step (same idea as mad4hatter-wdl move_outputs). When Cromwell sees a gs://fc-…/… path for the merged ml_summary, files are copied with gcloud to gs://fc-…/<outdir>/<timestamp>/ using their original basenames. On local Cromwell (paths like /Users/… or tests/input/…), the same task uses cp into <execution_dir>/<outdir>/<timestamp>/ and outputs absolute local paths instead.
Optional input workspace_bucket (String?): set this to the workspace bucket ID (e.g. fc-15e572f9-33a3-4a1e-8534-099df773bfbf, without the gs:// prefix) if your backend localizes files before WDL evaluates paths, which causes automatic gs://fc-… detection to fail. Setting it forces GCS staging.
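The staging decision described above can be sketched roughly as follows. This is not the task's actual code: the function name `staging_target`, the way the bucket is parsed from the path, and the timestamp format are assumptions, and the real task runs gcloud or cp rather than returning a tuple.

```python
import posixpath
from typing import Optional, Tuple


def staging_target(ml_summary_path: str,
                   outdir: str,
                   timestamp: str,
                   execution_dir: str,
                   workspace_bucket: Optional[str] = None) -> Tuple[str, str]:
    """Return (copy tool, destination directory) for the staged outputs."""
    bucket = workspace_bucket  # an explicit bucket forces GCS staging
    if bucket is None and ml_summary_path.startswith("gs://fc-"):
        # Assumed detection: take the bucket name from the gs:// path itself.
        bucket = ml_summary_path[len("gs://"):].split("/", 1)[0]
    if bucket is not None:
        return "gcloud", f"gs://{bucket}/{outdir}/{timestamp}/"
    # Local Cromwell: copy under the execution directory instead.
    return "cp", posixpath.join(execution_dir, outdir, timestamp) + "/"
```

For a local path such as `/Users/me/run/ml_summary.tsv` with no `workspace_bucket`, this returns the `cp` branch; passing `workspace_bucket` switches the same call to the GCS branch.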
outdir must be alphanumeric plus _ or - only (validated at workflow start).
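The outdir rule corresponds to a simple pattern check, sketched here. The workflow performs its own validation at start; the function name `is_valid_outdir` is hypothetical, and rejecting the empty string is an assumption.

```python
import re

# ASCII letters, digits, underscore, and hyphen only; no slashes or dots.
_OUTDIR_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_outdir(outdir: str) -> bool:
    """True when outdir uses only the allowed characters (assumed non-empty)."""
    return _OUTDIR_PATTERN.fullmatch(outdir) is not None
```

Under this sketch `output` and `run_2024-01` pass, while `results/v1` and `../output` are rejected.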
Workflow output names (on Terra, each value is a gs://… path to the file; on local Cromwell, an absolute local path):
- ml_summary → ml_summary.tsv
- sl_summary → sl_summary.tsv
- sl_from_ml_summary → sl_from_ml_summary.tsv
- amino_acid_calls → amino_acid_calls.tsv.gz
- collapsed_amino_acid_calls → collapsed_amino_acid_calls.tsv.gz
- loci_covered_by_target_samples_info → loci_covered_by_target_samples_info.tsv
- loci_of_interest_for_target_for_microhap → loci_of_interest_for_target_for_microhap.tsv.gz
Per-population merge artifacts and intermediate translated_loci/ paths are still computed inside the run but are not listed as workflow outputs; use the staged URIs above for downloads and downstream tooling.
Files
github.com-PlasmoGenEpi-plasmodiumdrugres-wdl-plasmodiumdrugres_v0.1.0.zip (21.3 kB, md5:eefdc77e8a4a30a27135508632dc873a)
Additional details
Related works
- Is identical to:
- https://dockstore.org/aliases/workflow-versions/10.5281-zenodo.20709638
- https://dockstore.org/workflows/github.com/PlasmoGenEpi/plasmodiumdrugres-wdl/plasmodiumdrugres:v0.1.0
- https://dockstore.org/api/ga4gh/trs/v2/tools/%23workflow%2Fgithub.com%2FPlasmoGenEpi%2Fplasmodiumdrugres-wdl%2Fplasmodiumdrugres/versions/v0.1.0/PLAIN-WDL/descriptor/plasmodiumdrugres.wdl