ropensci/drake: map_plan() and other niceties

Will Landau; Will Landau; Alex Axthelm; Kirill Müller; Kendon Bell; Rainer M Krug; Chris Muir; Brianna McHorse; Xianying Tan; Xavier Laviron; Jasper; BruceZhao

doi:10.5281/zenodo.1472227

Published October 26, 2018 | Version v6.1.0

Software Open

1. Eli Lilly and Company @EliLillyCo
2. Indiana Commission for Higher Education
3. University of California at Berkeley Agricultural and Resource Economics
4. University of Zürich
5. MIT
6. Generali China AMC @GCAMC
7. LECA Grenoble

Version 6.1.0

New features

Add a new map_plan() function to easily create a workflow plan data frame to execute a function call over a grid of arguments.
Add a new plan_to_code() function to turn drake plans into generic R scripts. New users can use this function to better understand the relationship between plans and code, and unsatisfied customers can use it to disentangle their projects from drake altogether. Similarly, plan_to_notebook() generates an R notebook from a drake plan.
Add a new drake_debug() function to run a target's command in debug mode. Analogous to drake_build().
Add a mode argument to trigger() to control how the condition trigger factors into the decision to build or skip a target. See ?trigger for details.
Add a new sleep argument to make() and drake_config() to help the master process consume fewer resources during parallel processing.
Enable the caching argument for the "clustermq" and "clustermq_staged" parallel backends. Now, make(parallelism = "clustermq", caching = "master") will do all the caching with the master process, and make(parallelism = "clustermq", caching = "worker") will do all the caching with the workers. The same is true for parallelism = "clustermq_staged".
Add a new append argument to gather_plan(), gather_by(), reduce_plan(), and reduce_by(). The append argument controls whether the output includes the original plan in addition to the newly generated rows.
Add new functions load_main_example(), clean_main_example(), and clean_mtcars_example().
Add a filter argument to gather_by() and reduce_by() in order to restrict what we gather even when append is TRUE.
Add a hasty mode: make(parallelism = "hasty") skips all of drake's expensive caching and checking. All targets run every single time and you are responsible for saving results to custom output files, but almost all the by-target overhead is gone.
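To make the idea behind map_plan() concrete, here is a minimal base-R sketch of turning a grid of arguments into a data frame of targets and commands. It does not call drake: sketch_map_plan() and simulate_mean are hypothetical names used only for illustration, and the real map_plan() may name targets and format commands differently.

```r
# Hypothetical helper: NOT drake's map_plan(), only an illustration of the idea.
# `args` is a data frame with one row per function call and one column per argument.
sketch_map_plan <- function(args, fun_name) {
  arg_names <- names(args)
  commands <- vapply(seq_len(nrow(args)), function(i) {
    # Format each argument value in row i as text.
    values <- vapply(args[i, , drop = FALSE], format, character(1))
    pairs <- paste0(arg_names, " = ", values)
    paste0(fun_name, "(", paste(pairs, collapse = ", "), ")")
  }, character(1))
  # Target names here are simple row indices (an assumption of this sketch).
  targets <- paste0(fun_name, "_", seq_len(nrow(args)))
  data.frame(target = targets, command = commands, stringsAsFactors = FALSE)
}

# A 2 x 2 grid of arguments for a made-up function called simulate_mean.
grid <- expand.grid(n = c(10, 100), sd = c(1, 2))
plan <- sketch_map_plan(grid, "simulate_mean")
print(plan)
```

Each row of the result pairs one target name with one function call as text, which is the general shape of a workflow plan data frame.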

Bug fixes

Ensure that commands in the plan are re-analyzed for dependencies when new imports are added (https://github.com/ropensci/drake/issues/548). This bug affected version 6.0.0 only.
Call path.expand() on the file argument to render_drake_graph() and render_sankey_drake_graph(). That way, tildes in file paths no longer interfere with the rendering of static image files. Compensates for https://github.com/wch/webshot.
Skip tests and examples if the required "Suggests" packages are not installed.
Stop checking for non-standard columns. Previously, warnings about non-standard columns were incorrectly triggered by evaluate_plan(trace = TRUE) followed by expand_plan(), gather_plan(), reduce_plan(), gather_by(), or reduce_by(). The more relaxed behavior also gives users more options about how to construct and maintain their workflow plan data frames.
Use checksums in "future" parallelism to make sure files travel over network file systems before proceeding to downstream targets.
Refactor and clean up checksum code.
Allow more tests and checks to succeed without the optional visNetwork package.
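The tilde fix for render_drake_graph() and render_sankey_drake_graph() rests on base R's path.expand(). The snippet below only shows what that call does to a path; the file name is a made-up example, and this is not drake's internal code.

```r
# Base R only; the file name is a hypothetical example.
file <- "~/drake_graph.png"

# path.expand() replaces a leading tilde with the user's home directory
# where the platform supports it.
expanded <- path.expand(file)
print(expanded)
```

Passing the expanded path to downstream tools avoids relying on them to interpret the tilde themselves.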

Enhancements

Stop earlier in make_targets() if all the targets are already up to date.
Improve the documentation of the seed argument in make() and drake_config().
Set the default caching argument of make() and drake_config() to "master" rather than "worker". The default option should be the lower-overhead option for small workflows. Users have the option to make a different set of tradeoffs for larger workflows.
Allow the condition trigger to evaluate to non-logical values as long as those values can be coerced to logicals.
Require that the condition trigger evaluate to a vector of length 1.
Keep non-standard columns in drake_plan_source().
make(verbose = 4) now prints to the console when a target is stored.
gather_by() and reduce_by() now gather/reduce everything if no columns are specified.
Change the default parallelization of the imports. Previously, make(jobs = 4) was equivalent to make(jobs = c(imports = 4, targets = 4)). Now, make(jobs = 4) is equivalent to make(jobs = c(imports = 1, targets = 4)). See issue 553 for details.
Add a console message for building the priority queue when verbose is at least 2.
Condense load_mtcars_example().
Deprecate the hook argument of make() and drake_config().
In gather_by() and reduce_by(), do not exclude targets with all NA gathering variables.
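The change to the default parallelization of imports can be summarized with a small base-R sketch. expand_jobs() is a hypothetical helper, not a drake function; it only mirrors the equivalence stated in the note above for a single unnamed jobs value.

```r
# Hypothetical helper, not part of drake: mirrors how the release note
# describes the interpretation of `jobs` as of version 6.1.0.
expand_jobs <- function(jobs) {
  if (length(jobs) == 1L && is.null(names(jobs))) {
    # A single number now applies to targets only; imports get 1 job.
    return(c(imports = 1, targets = as.numeric(jobs)))
  }
  # A named vector is taken as given.
  jobs
}

print(expand_jobs(4))
print(expand_jobs(c(imports = 2, targets = 8)))
```

Users who want the earlier behavior can still request it explicitly with a named vector such as c(imports = 4, targets = 4).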

Files

ropensci/drake-v6.1.0.zip (1.5 MB), md5:595c0a818ac262171b282607075f5f10

Additional details

Is supplement to: https://github.com/ropensci/drake/tree/v6.1.0 (URL)

