There is a newer version of the record available.

Published January 26, 2018 | Version v5.0.0
Software Open

ropensci/drake: First release under rOpenSci

  • 1. Eli Lilly and Company @EliLillyCo
  • 2. Indiana Commission for Higher Education
  • 3. University of California at Berkeley Agricultural and Resource Economics

Description

TL;DR: this is the first release in which drake is part of rOpenSci. Relative to 4.4.0, this release has major changes to cache internals, user-level function names, and documentation.

  • Transfer drake to rOpenSci: https://github.com/ropensci/drake
  • Several functions now require an explicit config argument, which you can get from drake_config() or make(). Examples:
    • outdated()
    • missed()
    • rate_limiting_times()
    • predict_runtime()
    • vis_drake_graph()
    • dataframes_graph()
  • Always process all the imports before building any targets. This is part of the solution to #168: if imports and targets are processed together, the full power of parallelism is taken away from the targets. Also, the way parallelism happens is now consistent for all parallel backends.
  • Major speed improvement: dispense with internal inventories and rely on cache$exists() instead.
  • Let the user define a trigger for each target to customize when make() decides to build targets.
  • Document triggers and other debugging/testing tools in the new debug vignette.
  • Restructure the internals of the storr cache in a way that is not back-compatible with projects from versions 4.4.0 and earlier. The main change is to make more intelligent use of storr namespaces, improving efficiency (both time and storage) and opening up possibilities for new features. If you attempt to run drake >= 5.0.0 on a project from drake <= 4.4.0, drake will stop you before any damage to the cache is done, and you will be instructed how to migrate your project to the new drake.
  • Use formatR::tidy_source() instead of parse() in tidy_command() (originally tidy() in R/dependencies.R). Previously, drake was having problems with an edge case: as a command, the literal string "A" was interpreted as the symbol A after tidying. With tidy_source(), literal quoted strings stay literal quoted strings in commands. This may put some targets out of date in old projects, yet another loss of back compatibility in version 5.0.0.
  • Speed up clean() by refactoring the cache inventory and using light parallelism.
  • Implement rescue_cache(), exposed to the user and used in clean(). This function removes dangling orphaned files in the cache so that a broken cache can be cleaned and used in the usual ways once more.
  • Change the default cpu and elapsed arguments of make() to NULL. This solves an elusive bug in how drake imposes timeouts.
  • Allow users to set target-level timeouts (overall, cpu, and elapsed) with columns in the workflow plan data frame.
  • Document timeouts and retries in the new debug vignette.
  • Add a new graph argument to functions make(), outdated(), and missed().
  • Export a new prune_graph() function for igraph objects.
  • Delete long-deprecated functions prune() and status().
  • Deprecate and rename functions:
    • analyses() => plan_analyses()
    • as_file() => as_drake_filename()
    • backend() => future::plan()
    • build_graph() => build_drake_graph()
    • check() => check_plan()
    • config() => drake_config()
    • evaluate() => evaluate_plan()
    • example_drake() => drake_example()
    • examples_drake() => drake_examples()
    • expand() => expand_plan()
    • gather() => gather_plan()
    • plan(), workflow(), workplan() => drake_plan()
    • plot_graph() => vis_drake_graph()
    • read_config() => read_drake_config()
    • read_graph() => read_drake_graph()
    • read_plan() => read_drake_plan()
    • render_graph() => render_drake_graph()
    • session() => drake_session()
    • summaries() => plan_summaries()
  • Disallow output and code as names in the workflow plan data frame. Use target and command instead. The old names have been formally deprecated for several months.
  • Deprecate the ..analysis.. and ..dataset.. wildcards in favor of analysis__ and dataset__, respectively. The new wildcards are stylistically better and pass linting checks.
  • Add new functions drake_quotes(), drake_unquote(), and drake_strings() to remove the silly dependence on the eply package.
  • Add a skip_safety_checks flag to make() and drake_config(). Skipping the checks increases speed.
  • In sanitize_plan(), remove rows with blank targets "".
  • Add a purge argument to clean() to optionally remove all target-level information.
  • Add a namespace argument to cached() so users can inspect individual storr namespaces.
  • Change verbose to numeric: 0 = print nothing, 1 = print progress on imports only, 2 = print everything.
  • Add a new next_stage() function to report the targets to be made in the next parallelizable stage.
  • Add a new session_info argument to make(). Apparently, sessionInfo() is a bottleneck for small make()s, so there is now an option to suppress it. This is mostly for the sake of speeding up unit tests.
  • Add a new log_progress argument to make() to suppress progress logging. This increases storage efficiency and speeds some projects up a tiny bit.
  • Add an optional namespace argument to loadd() and readd(). You can now load and read from non-default storr namespaces.
  • Add drake_cache_log(), drake_cache_log_file(), and make(..., cache_log_file = TRUE) as options to track changes to targets/imports in the drake cache.
  • Detect knitr code chunk dependencies in response to commands with rmarkdown::render(), not just knit().
  • Add a new general best practices vignette to clear up misconceptions about how to use drake properly.
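
As a rough illustration of the explicit config argument and the renamed functions listed above, the following R sketch is written against the 5.0.0 interface as these notes describe it. The plan contents (the targets small and n_rows) are made up for the example, and later drake releases changed some of these signatures, so treat this as a sketch under those assumptions rather than reference usage.

```r
library(drake)

# A tiny workflow plan. The targets `small` and `n_rows` are hypothetical
# and exist only for this illustration.
plan <- drake_plan(
  small = head(mtcars),
  n_rows = nrow(small)
)

# drake_plan() replaces plan(), workflow(), and workplan().
# The resulting data frame uses the column names target and command.
print(plan)

# Build the configuration explicitly with drake_config() (formerly config()).
config <- drake_config(plan)

# outdated() takes the config as its argument in this release.
outdated(config)

# Build the targets. Imports are processed first, then the targets.
make(plan)

# Read a finished target back from the cache.
readd(n_rows)
```

Running make() writes a cache to the working directory, so it is safest to try this in a scratch folder.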

Files

ropensci/drake-v5.0.0.zip (4.5 MB)
md5:c31cbdc9f47e7e04c492ffc73a1f6d31