Presentation Open Access

Automatically archiving reproducible studies with Docker

Nüst, Daniel

Hinz, Matthias







Reproducibility of computations is crucial in an era where data is born digital and analysed algorithmically. Most studies, however, publish only the results, often with figures as important interpreted outputs. But where do these figures come from? Scholarly articles must not only describe the work but also be accompanied by data and software. R offers excellent tools for creating reproducible works, e.g. Sweave and RMarkdown. Several approaches to capturing the workspace environment in R have been developed, working around CRAN’s deliberate choice not to provide explicit versioning of packages and their dependencies. They preserve a collection of packages locally (packrat, pkgsnap, switchr/GRANBase) or remotely (MRAN timemachine/checkpoint), or install specific versions from CRAN or source (requireGitHub, devtools). Installers for old versions of R are archived on CRAN. A user can manually re-create a specific environment, but this is a cumbersome task.
We introduce a new way to preserve a runtime environment that includes both the packages and R itself, by adding an abstraction layer in the form of a container, which can execute a script or run an interactive session. The package containeRit automatically creates such containers based on Docker. Docker is a solution for packaging an application and its dependencies, but it has also proven useful in the context of reproducible research (Boettiger 2015). The package creates a container manifest, the Dockerfile, which is usually written by hand, from sessionInfo(), R scripts, or RMarkdown documents. The Dockerfiles use the Rocker community images as base images. Docker can build an executable image from a Dockerfile. The image can then be run wherever a Docker runtime is present.
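To illustrate the idea of deriving a container manifest from a recorded session, the following minimal Python sketch renders Dockerfile text from an R version and a set of pinned package versions. It is not containeRit's implementation: the function name `render_dockerfile`, the example versions, the script name `analysis.R`, and the choice of the `rocker/r-ver` image with `remotes::install_version()` are all assumptions made for illustration.

```python
from typing import Dict, List


def render_dockerfile(r_version: str, packages: Dict[str, str], script: str) -> str:
    """Return Dockerfile text that pins the R version and package versions.

    Hypothetical helper for illustration only; containeRit itself works
    from sessionInfo(), R scripts, or RMarkdown documents inside R.
    """
    repo = "https://cloud.r-project.org"  # assumed CRAN mirror
    lines: List[str] = [
        # Assumption: a version-tagged Rocker image is used as the base image.
        f"FROM rocker/r-ver:{r_version}",
        # Assumption: the 'remotes' package is used to install pinned versions.
        f"RUN R -e \"install.packages('remotes', repos='{repo}')\"",
    ]
    # One instruction per package, in alphabetical order for stable output.
    for name, version in sorted(packages.items()):
        lines.append(
            f"RUN R -e \"remotes::install_version('{name}', "
            f"version='{version}', repos='{repo}')\""
        )
    # Copy the analysis script into the image and run it by default.
    lines.append(f"COPY {script} /work/{script}")
    lines.append("WORKDIR /work")
    lines.append(f'CMD ["Rscript", "{script}"]')
    return "\n".join(lines) + "\n"


# Illustrative session data; in practice this would come from sessionInfo().
session = {"r_version": "3.3.2", "packages": {"sp": "1.2-4", "knitr": "1.15.1"}}
dockerfile_text = render_dockerfile(
    session["r_version"], session["packages"], "analysis.R"
)
print(dockerfile_text)
```

The sketch only covers R packages. System libraries that packages link against would need additional instructions in the manifest.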

containeRit uses harbor for building images and running containers, and sysreqs for installing system dependencies of R packages. Before the planned CRAN release we want to share our work, discuss open challenges such as handling linked libraries (see discussion on geospatial libraries in Rocker), and welcome community feedback.
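The build and run steps can be pictured with the plain Docker command line. The sketch below only assembles the command strings and does not execute them. It uses the `docker` CLI rather than harbor, and the image tag `my-study:1.0`, the script name, and the helper names are hypothetical.

```python
import shlex
from typing import List


def build_command(tag: str, context_dir: str = ".") -> List[str]:
    """Command that builds an image from the Dockerfile in context_dir."""
    return ["docker", "build", "--tag", tag, context_dir]


def run_command(tag: str, script: str = "") -> List[str]:
    """Command that runs a script in a container, or an interactive R session."""
    if script:
        # Non-interactive: execute the script and remove the container afterwards.
        return ["docker", "run", "--rm", tag, "Rscript", script]
    # Interactive: attach a terminal and start R.
    return ["docker", "run", "--rm", "-it", tag, "R"]


build = build_command("my-study:1.0")
run_script = run_command("my-study:1.0", "analysis.R")
run_interactive = run_command("my-study:1.0")

for cmd in (build, run_script, run_interactive):
    print(shlex.join(cmd))
```

The commands are kept as argument lists so that, if they were executed (for example with `subprocess.run`), no shell quoting would be needed.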

This work is supported by the project Opening Reproducible Research (Offene Reproduzierbare Forschung) funded by the German Research Foundation (DFG) under project numbers PE 1632/10-1, KR 3930/3-1 and TR 864/6-1.
  • Boettiger, Carl. 2015. "An Introduction to Docker for Reproducible Research, with Examples from the R Environment." ACM SIGOPS Operating Systems Review 49 (January): 71–79. doi:10.1145/2723872.2723882.

  • Nüst, Daniel, Markus Konkol, Edzer Pebesma, Christian Kray, Marc Schutzeichel, Holger Przibytzin, and Jörg Lorenz. 2017. "Opening the Publication Process with Executable Research Compendia." D-Lib Magazine 23 (January). doi:10.1045/january2017-nuest.
