
Published September 14, 2019 | Version v7.6.2
Software Open

ropensci/drake: Continuing with efficient data formats

  • 1. Eli Lilly and Company @EliLillyCo
  • 2. PNNL/UMD
  • 3. Indiana Commission for Higher Education
  • 4. Queensland Fire and Emergency Services
  • 5. EcoHealth Alliance
  • 6. Uppsala University
  • 7. Manaaki Whenua
  • 8. University of Jena, LMU Munich
  • 9. University of Zürich
  • 10. @GerkeLab at Moffitt Cancer Center
  • 11. Capital One
  • 12. Generali China AMC @GCAMC
  • 13. LECA Grenoble
  • 14. NIES
  • 15. @ropensci @lockedata
  • 16. https://2degrees-investing.org/
  • 17. UC Berkeley
  • 18. Human Predictions LLC

Description

Version 7.6.2

Bug fixes

  • Remove README.md from CRAN altogether. Also remove all links from the news and vignette. The links triggered too many CRAN notes, which made the automated checks too brittle.
  • Serialize formats that need serialization (like "keras") before sending the data from HPC workers to the master process (#989).
  • Check for custom-formatted files when checking checksums.
  • Force fst-formatted targets to plain data frames. Same goes for the new "fst_dt" format.
  • Change the meaning and behavior of max_expand in drake_plan(). max_expand is now the maximum number of targets produced by map(), split(), and cross(). For cross(), this reduces the number of targets (less cumbersome) and makes the subsample of targets more representative of the complete grid. It also ensures consistent target naming when .id is FALSE (#1002). Note: max_expand is not for production workflows anyway, so this change does not break anything important. Unfortunately, we lose the speed boost that max_expand originally gave drake_plan(), but drake_plan() is still fast, so that is not so bad.
  • Drop specialized formats of NULL targets (#998).
  • Prevent false grouping variables from partially tagging along in cross() (#1009). The same fix should apply to map() and split() too.
  • Respect graph topology when recovering old grouping variables for map() (#1010).
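As a rough illustration of the max_expand change described above, the sketch below builds the same cross() plan with and without max_expand. The command analyze() and the grouping variables n and rep are hypothetical names chosen for this example; drake_plan() only constructs the plan data frame, so analyze() is never called. The exact rows kept by the subsample are not shown because they depend on drake's internal selection.

```r
library(drake)

# Full grid: cross() pairs every n with every rep (3 x 2 = 6 targets).
# analyze() is a hypothetical command; it is not evaluated here.
full <- drake_plan(
  result = target(
    analyze(n, rep),
    transform = cross(n = c(10, 100, 1000), rep = c(1, 2))
  )
)

# Same plan, but cross() may produce at most 3 targets.
small <- drake_plan(
  result = target(
    analyze(n, rep),
    transform = cross(n = c(10, 100, 1000), rep = c(1, 2))
  ),
  max_expand = 3
)

nrow(full)   # number of targets in the complete grid
nrow(small)  # number of targets after capping with max_expand
```

As the notes say, max_expand is meant for quick checks of a plan during development, not for production workflows.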
New features
  • Add a new "fst_dt" format for fst-powered saving of data.table objects.
  • Support a custom "caching" column of the plan to select master vs worker caching for each target individually (#988).
  • Make transform a formal argument of target() so that users do not have to type "transform =" all the time in drake_plan() (#993).
  • Migrate the documentation website from ropensci.github.io/drake to docs.ropensci.org/drake.
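The plan-level features listed above might be combined as in the following sketch. The commands simulate() and load_rows() are hypothetical placeholders, and the plan is only constructed, not run. Actually building the big_table target with make() would additionally assume that the fst and data.table packages are installed.

```r
library(drake)

plan <- drake_plan(
  # transform is now a formal argument of target(), so map() can be
  # passed positionally without typing "transform =".
  sim = target(simulate(n), map(n = c(10, 100))),
  # "fst_dt" requests fst-powered storage for a data.table object.
  # The custom "caching" column selects master vs worker caching
  # for this target only. load_rows() is a hypothetical command.
  big_table = target(
    data.table::as.data.table(load_rows()),
    format = "fst_dt",
    caching = "master"
  )
)

# Inspect the resulting plan, including the format and caching columns.
print(plan)
```

Targets that do not set format or caching get NA in those columns, which leaves the default behavior in place for them.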
Enhancements
  • Document the HPC limitations of target(format = "keras") (#989).
  • Remove the now-superfluous vignette.
  • Wrap up console and text file logging functionality into a reference class (#964).
  • Deprecate the verbose argument in various caching functions. The location of the cache is now only printed in make(). This change made the logging reference class above (#964) easier to implement.
  • Carry forward nested grouping variables in combine() (#1008).
  • Improve the encapsulation of hash tables in the decorated storr (#968).

Files

ropensci/drake-v7.6.2.zip

Files (1.2 MB)

md5:1463eb4fed88f56e596d9e3911a4cd7f
