Software Open Access

ropensci/drake: Continuing with efficient data formats

Will Landau; Ben Bond-Lamberty; Alex Axthelm; Miles McBain; Kirill Müller; TJ Mahr; Noam Ross; brendanf; Kendon Bell; Patrick Schratz; Tim Mastny; Rainer M Krug; Garrick Aden-Buie; Chris Muir; Brianna McHorse; Xianying Tan; Xavier Laviron; Tiernan Martin; Shinya Uryu; Maëlle Salmon; Mauro Lepore; Jeroen Ooms; Jasper; Hugo Gruson; BruceZhao; Bill Denney

Version 7.6.2

Bug fixes

  • Remove README.md from CRAN altogether. Also remove all links from the news and vignette. The links triggered too many CRAN notes, which made the automated checks too brittle.
  • Serialize formats that need serialization (like "keras") before sending the data from HPC workers to the master process (#989).
  • Check for custom-formatted files when checking checksums.
  • Force fst-formatted targets to plain data frames. Same goes for the new "fst_dt" format.
  • Change the meaning and behavior of max_expand in drake_plan(). max_expand is now the maximum number of targets produced by map(), split(), and cross(). For cross(), this reduces the number of targets (less cumbersome) and makes the subsample of targets more representative of the complete grid. It also ensures consistent target naming when .id is FALSE (#1002). Note: max_expand is not for production workflows anyway, so this change does not break anything important. Unfortunately, drake_plan() loses the speed boost that max_expand originally provided, but drake_plan() is still fast, so the cost is small.
  • Drop specialized formats of NULL targets (#998).
  • Prevent false grouping variables from partially tagging along in cross() (#1009). The same fix should apply to map() and split() too.
  • Respect graph topology when recovering old grouping variables for map() (#1010).
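
As a rough illustration of the new max_expand behavior, the sketch below builds the same cross() plan with and without max_expand. The function sim() and the grid values are hypothetical placeholders. Commands are only stored in the plan, not evaluated, so sim() does not need to exist to construct it.

```r
library(drake)

# Full grid: cross() over 3 x 4 values yields 12 targets.
full <- drake_plan(
  fit = target(
    sim(mean = m, sd = s),  # sim() is a hypothetical user function
    transform = cross(m = c(1, 2, 3), s = c(1, 2, 3, 4))
  )
)

# Same plan, capped for quick interactive checks. Per the notes above,
# max_expand is the maximum number of targets cross() may produce.
small <- drake_plan(
  fit = target(
    sim(mean = m, sd = s),
    transform = cross(m = c(1, 2, 3), s = c(1, 2, 3, 4))
  ),
  max_expand = 4
)

nrow(full)   # number of targets in the complete grid
nrow(small)  # at most 4 targets
```

Because max_expand is meant for testing rather than production, the smaller plan is best treated as a preview of the full grid.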
New features
  • Add a new "fst_dt" format for fst-powered saving of data.table objects.
  • Support a custom "caching" column of the plan to select master vs worker caching for each target individually (#988).
  • Make transform a formal argument of target() so that users do not have to type "transform =" all the time in drake_plan() (#993).
  • Migrate the documentation website from ropensci.github.io/drake to docs.ropensci.org/drake.
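
The three plan-related features above might be combined as in the following sketch. The functions read_chunk() and summarize_table() are hypothetical, and the "caching" values "master" and "worker" are assumed from the wording of the note above. Only the plan is constructed here; running make() on it would additionally require the fst and data.table packages and real user functions.

```r
library(drake)

plan <- drake_plan(
  # transform is now a formal argument of target(), so the
  # "transform =" label can be omitted.
  raw = target(
    read_chunk(id),  # read_chunk() is a hypothetical user function
    map(id = c(1, 2))
  ),
  # Save a data.table object with the fst-powered "fst_dt" format.
  big_table = target(
    data.table::as.data.table(mtcars),
    format = "fst_dt"
  ),
  # Custom "caching" column: choose master vs worker caching per target.
  summary_table = target(
    summarize_table(big_table),  # hypothetical user function
    caching = "worker"
  )
)

# The plan gains "format" and "caching" columns alongside
# "target" and "command".
names(plan)
```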
Enhancements
  • Document the HPC limitations of target(format = "keras") (#989).
  • Remove the now-superfluous vignette.
  • Wrap up console and text file logging functionality into a reference class (#964).
  • Deprecate the verbose argument in various caching functions. The location of the cache is now only printed in make(). This made the logging reference class above (#964) easier to implement.
  • Carry forward nested grouping variables in combine() (#1008).
  • Improve the encapsulation of hash tables in the decorated storr (#968).

Files (1.2 MB)
Name: ropensci/drake-v7.6.2.zip
Size: 1.2 MB
md5:1463eb4fed88f56e596d9e3911a4cd7f