Presentation Open Access
If your data analyses involve coding, then you know how liberating it is to use and create functions. They hide complexity, improve testability, and enable reusability. In this talk I explain how you can really set your code free: by turning it into a command-line tool. The command line can be a very flexible and efficient environment for working with data. It excels at combining tools written in all sorts of languages (including Python and R), running them in parallel, and applying them to massive amounts of (streaming) data. Although the command line itself has quite a learning curve, turning your existing code into a tool is, as I demonstrate, a matter of a few steps. I discuss how your new tool can be combined with existing tools to obtain, scrub, explore, and model data at the command line. Finally, I share some best practices regarding interface design and distribution.
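As a rough illustration of what "a few steps" might look like in Python, here is a minimal sketch and not the talk's own example. The function `scale`, the `--factor` option, and the file name `scale.py` are all hypothetical. The sketch wraps an existing function in a `main` that parses options with `argparse`, reads numbers line by line from standard input, and writes results to standard output so the tool can sit in a pipeline.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: expose an existing function as a command-line filter."""
import argparse
import sys


def scale(value, factor):
    """Stand-in for the 'existing code' you want to reuse."""
    return value * factor


def main(argv=None, stdin=None, stdout=None):
    # Fall back to the real streams unless a caller (e.g. a test) supplies others.
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout

    parser = argparse.ArgumentParser(
        description="Multiply each number read from standard input by a factor."
    )
    parser.add_argument("-f", "--factor", type=float, default=2.0,
                        help="multiplier applied to every input line (default: 2.0)")
    args = parser.parse_args(argv)

    # Process one line at a time so the tool also works on streaming input.
    for line in stdin:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        print(scale(float(text), args.factor), file=stdout)
    return 0


if __name__ == "__main__":
    main()
```

Assuming the file is saved as `scale.py` and made executable, it could then be combined with existing tools, for example `seq 5 | ./scale.py --factor 10 | sort -n`. Reading from standard input and writing plain lines to standard output is what lets other tools feed it and consume its results.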
Set your code free; turn it into a command-line tool.pdf