Housekeeping rules

  • Please mute your microphone when you are not speaking
  • Please turn off your camera during presentations – the session is being recorded
  • Post your questions in the chat or raise your hand using the “raise your hand” function in Zoom (“Reactions” button)
  • Tell us how we did in the survey (more about that later)
  • The presentation and all materials will be shared afterwards (see the Git repository)

About me

Dr. Franz Eder

Assoc. Prof. for International Relations

University of Innsbruck


Research focus: Foreign and Security Policy; (Counter-)Terrorism; USA, Europe, Austria; social science research methods (esp. QTA, DNA); academic writing and presentation; open and reproducible science


Program

  1. Recap: What is computational reproducibility?
  2. Reproducibility iceberg (Rodrigues 2023)
  3. Package dependencies with renv (Ushey and Wickham 2025)
  4. Pipelining with {targets} (Landau 2025)
  5. Q & A

Learning outcomes

After completing this webinar, participants will be able to

  • create a reproducible environment for their R projects using the {renv} package;
  • adopt a function-oriented style of programming;
  • use the {targets} package as a pipeline tool for their R projects.

What is computational reproducibility?

Definition

“the ability of a second researcher to receive a set of files, including data, code, and documentation, and to recreate or recover the outputs of a research project, including figures, tables, and other key quantitative and qualitative results” (Kitzes 2017, 19).

Challenges and solutions

  • “multiple inconsistent versions of code, data, or both” (Peikert, Lissa, and Brandmaier 2021, 838)

    • solution: Version control (git)
  • missing documentation and copy-and-paste errors in final reports

    • solution: dynamic document generation/literate programming (e.g. Quarto in RStudio)
  • software dependencies

    • solution: package dependency management (renv)
  • undocumented or ambiguous order of computation

    • solution: file and folder management; comments; make-files or targets-package
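
A minimal sketch of documenting software dependencies with base R alone, before any dedicated tool is involved. The file name session_info.txt and the use of a temporary folder are assumptions made for illustration.

```r
# Record the R version and the loaded packages next to the analysis outputs.
# "session_info.txt" is a hypothetical file name; tempdir() keeps the sketch
# from writing into the project folder.
info_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), info_file)

# The R version string is also available on its own
r_version <- R.version.string
print(r_version)
```

This only documents the environment; it does not recreate it. That gap is what renv addresses.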

Reproducibility iceberg (Rodrigues 2023)

Package dependencies with renv

What is renv?

Install and load

install.packages("renv")

# load packages
library(renv)

Initialize

# initialize renv
renv::init()
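
A quick check, sketched in base R, of whether initialization left the usual files behind. It assumes the working directory is the project root.

```r
# Files that renv::init() typically creates in the project root
expected <- c("renv.lock", ".Rprofile", file.path("renv", "activate.R"))

# TRUE means the file is present in the current working directory
init_check <- data.frame(file = expected, exists = file.exists(expected))
print(init_check)
```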

Find packages

# where are the package libraries?
.libPaths()


# in which library is the tidyverse package?
find.package("tidyverse")
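
The two calls above can be combined into a small check. This is a sketch; in_first_library() is a hypothetical helper, not part of renv.

```r
# Does a package resolve to the first library on the search path?
# Inside an activated renv project, the first entry of .libPaths() is
# normally the project library.
in_first_library <- function(pkg) {
  pkg_dir   <- normalizePath(find.package(pkg), winslash = "/")
  first_lib <- normalizePath(.libPaths()[1], winslash = "/")
  # trailing slash avoids matching a sibling folder with the same prefix
  startsWith(pkg_dir, paste0(first_lib, "/"))
}

in_first_library("stats")
```

Base packages such as stats usually live in the system library, so the result for them can be FALSE even in a correctly activated project.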

Install, status, snapshot

install.packages("tidymodels")
renv::status()
renv::snapshot()
find.package("tidymodels")

Update and restore

# update packages (ATTENTION: run/check code afterwards!)
renv::update()

# install the package versions recorded in renv.lock (e.g. on a collaborator's machine)
renv::restore()
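
For collaborators, the restore step could be wrapped as follows. This is a sketch under assumptions: restore_if_locked() is a hypothetical helper, and prompt = FALSE is used to skip the interactive confirmation.

```r
# Hypothetical helper: restore only when a lockfile is present.
restore_if_locked <- function(project = ".") {
  lockfile <- file.path(project, "renv.lock")
  if (!file.exists(lockfile) || !requireNamespace("renv", quietly = TRUE)) {
    message("No renv.lock found (or renv is not installed); nothing restored.")
    return(invisible(FALSE))
  }
  # install the package versions recorded in the lockfile
  renv::restore(project = project, prompt = FALSE)
  invisible(TRUE)
}
```

Calling restore_if_locked() in a freshly cloned project should then reproduce the recorded package library.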


.gitignore

library/
local/
cellar/
lock/
python/
sandbox/
staging/

Re-open RProject

.Rprofile

source("renv/activate.R")

Pipelining with {targets}

What is {targets}?

When (not) to use {targets}

  • {targets} is a pipeline tool for statistics and data science in R (comparable to GNU Make)
  • use it to coordinate the pieces of computationally demanding analysis projects
  • use it when steps of an analysis have to be repeated multiple times
  • use it to make these analyses reproducible
  • it is not necessary for simple analyses
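
The core idea of a Make-like tool, skipping steps whose inputs have not changed, can be illustrated with a toy cache in base R. This is only a sketch of the idea, not how {targets} is implemented; run_if_changed() and sum_lines() are hypothetical names.

```r
# Toy cache: rerun `step` only when the input file's MD5 hash changes.
run_if_changed <- function(input_file, cache, step) {
  new_hash <- unname(tools::md5sum(input_file))
  if (identical(cache$hash, new_hash)) {
    return(cache$value)            # input unchanged: skip the step
  }
  cache$hash  <- new_hash
  cache$runs  <- if (is.null(cache$runs)) 1 else cache$runs + 1
  cache$value <- step(input_file)  # input changed: recompute
  cache$value
}

input <- tempfile(fileext = ".txt")
writeLines(c("1", "2"), input)
cache <- new.env()
sum_lines <- function(path) sum(as.numeric(readLines(path)))

first  <- run_if_changed(input, cache, sum_lines)  # computed
second <- run_if_changed(input, cache, sum_lines)  # skipped, cached value
writeLines(c("1", "2", "3"), input)
third  <- run_if_changed(input, cache, sum_lines)  # recomputed after the change
```

After these three calls the step has run twice, not three times. A pipeline tool applies the same logic to code and upstream results as well as files.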

Function-oriented style of programming

  • {targets} expects users to adopt a function-oriented style of programming (see Functions)

  • functions should be organized according to three key steps of an analysis

    • data generation
    • data analysis
    • reporting results
  • save the functions in the R/ subfolder of the RStudio project directory
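
Outside a pipeline, for example when testing functions interactively, the files in R/ can be loaded with a few lines of base R. source_functions() is a hypothetical helper shown as a sketch.

```r
# Hypothetical helper: source every .R file in a folder (default: "R")
source_functions <- function(dir = "R", envir = globalenv()) {
  files <- sort(list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE))
  for (f in files) {
    source(f, local = envir)  # evaluate the file in the chosen environment
  }
  invisible(files)
}
```

Numeric prefixes such as 01_, 02_, 03_ keep the files in a predictable order when sorted.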

Original script

###
# 00. configuration----
###

# install packages if needed
if (!requireNamespace("tidyverse", quietly = TRUE)) {
    install.packages("tidyverse")   # collection of packages for data science
}

# load packages
library(tidyverse)

###
# 10. loading data----
###

# read Cronos3, wave 2 dataset from data-folder
df <- read_csv("data/CRON3W2e01.1.csv")

# view head of the dataframe
head(df)

# select variables:
## w2gq1: How worried about climate change 
## w2gq6: Humans meant to rule over nature 

df <- df |> select(w2gq1, w2gq6, agegroup35) |> 
    rename("climate" = w2gq1, "rule" = w2gq6) |>  # rename variables
    filter(climate != 9) |>   # remove "No answer"
    filter(rule != 9) |>   # remove "No answer"
    filter(agegroup35 != 9) |>
    mutate(rule = case_when( # recode answers in "Humans meant to rule over nature"
        rule == 1 ~ 5, # Agree strongly becomes 5
        rule == 2 ~ 4,
        rule == 3 ~ 3,
        rule == 4 ~ 2,
        rule == 5 ~ 1) # Disagree strongly becomes 1
        ) |> 
    mutate(agegroup35 = as_factor(agegroup35))  # mutate agegroup35 to factor

df$agegroup35 <- fct_recode(df$agegroup35, "under 35" = "1",
                           "35 and above" = "2") # recode agegroup35

###
# 20. analyzing data----
###

fit_model <- lm(climate ~ rule, df)
summary(fit_model)

###
# 30. plotting results----
###

# plotting the model
ggplot(df, aes(x = rule, y = climate)) +
    geom_jitter(color = "#66B32F", alpha = 0.3) +
    geom_abline(intercept = fit_model$coefficients[1], slope = fit_model$coefficients[2],
                color = "#E72E6B", linetype = 2) +
    labs(x = "Humans meant to rule over nature") +
    labs(y = "How worried about climate change") +
    theme_minimal()

# violin plot of agegroup vs "How worried about climate change"
ggplot(df, aes(agegroup35, climate)) +
    geom_violin() +
    geom_jitter(color = "#2A4B9B", alpha = 0.3) +
    labs(x = "") +
    labs(y = "How worried about climate change") +
    theme_minimal()

Step 1: getting the data

From…

###
# 10. loading data----
###

# read Cronos3, wave 2 dataset from data-folder
df <- read_csv("data/CRON3W2e01.1.csv")

# select variables:
## w2gq1: How worried about climate change 
## w2gq6: Humans meant to rule over nature 

df <- df |> select(w2gq1, w2gq6, agegroup35) |> 
    rename("climate" = w2gq1, "rule" = w2gq6) |>  # rename variables
    filter(climate != 9) |>   # remove "No answer"
    filter(rule != 9) |>   # remove "No answer"
    filter(agegroup35 != 9) |>
    mutate(rule = case_when( # recode answers in "Humans meant to rule over nature"
        rule == 1 ~ 5, # Agree strongly becomes 5
        rule == 2 ~ 4,
        rule == 3 ~ 3,
        rule == 4 ~ 2,
        rule == 5 ~ 1) # Disagree strongly becomes 1
        ) |> 
    mutate(agegroup35 = as_factor(agegroup35)) |>   # mutate agegroup35 to factor
    mutate(agegroup35 = fct_recode(agegroup35, "under 35" = "1", # recode agegroup35
                                   "35 and above" = "2"))

… to 01_data.R

get_data <- function(file) {
    read_csv(file) |> 
        select(w2gq1, w2gq6, agegroup35) |> 
        rename("climate" = w2gq1, "rule" = w2gq6) |>  # rename variables
        filter(climate != 9) |>   # remove "No answer"
        filter(rule != 9) |>   # remove "No answer"
        filter(agegroup35 != 9) |>
        mutate(rule = case_when( # recode answers in "Humans meant to rule over nature"
            rule == 1 ~ 5, # Agree strongly becomes 5
            rule == 2 ~ 4,
            rule == 3 ~ 3,
            rule == 4 ~ 2,
            rule == 5 ~ 1) # Disagree strongly becomes 1
        ) |> 
        mutate(agegroup35 = as_factor(agegroup35)) |> 
        mutate(agegroup35 = fct_recode(agegroup35, "under 35" = "1", # recode agegroup35
                                       "35 and above" = "2"))
}
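
The reverse coding inside get_data() could also live in its own small function. reverse_code() is a hypothetical helper, sketched here in base R.

```r
# Reverse a rating scale: on a 1-5 scale, 1 becomes 5, 2 becomes 4, and so on.
reverse_code <- function(x, min = 1, max = 5) {
  (min + max) - x
}

reverse_code(1:5)
# 5 4 3 2 1
```

One difference to keep in mind: case_when() returns NA for values outside 1 to 5, whereas this arithmetic version transforms them silently, so codes such as 9 have to be filtered out beforehand, as get_data() does.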

Step 2: analyzing the data

From…

fit_model <- lm(climate ~ rule, df)

… to 02_analyze.R

fit_model <- function(data) {
  lm(climate ~ rule, data)
}
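
Because the model is now a function of its input, it can be tried on a small toy data frame before running it on the survey data. The toy values below are made up for illustration only.

```r
fit_model <- function(data) {
  lm(climate ~ rule, data = data)
}

# Toy data with a perfectly linear relationship: climate = 6 - rule
toy <- data.frame(rule = c(1, 2, 3, 4, 5), climate = c(5, 4, 3, 2, 1))

toy_fit <- fit_model(toy)
coef(toy_fit)  # intercept 6, slope -1 (up to floating-point error)
```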

Step 3: reporting results

From…

# plotting the model
ggplot(df, aes(x = rule, y = climate)) +
    geom_jitter(color = "#66B32F", alpha = 0.3) +
    geom_abline(intercept = fit_model$coefficients[1], slope = fit_model$coefficients[2],
                color = "#E72E6B", linetype = 2) +
    labs(x = "Humans meant to rule over nature") +
    labs(y = "How worried about climate change") +
    theme_minimal()

# violin plot of agegroup vs "How worried about climate change"
ggplot(df, aes(agegroup35, climate)) +
    geom_violin() +
    geom_jitter(color = "#2A4B9B", alpha = 0.3) +
    labs(x = "") +
    labs(y = "How worried about climate change") +
    theme_minimal()

… to 03_plot.R

plot_model <- function(data, model) {
    ggplot(data, aes(x = rule, y = climate)) +
        geom_jitter(color = "#66B32F", alpha = 0.3) +
        geom_abline(intercept = model$coefficients[1], slope = model$coefficients[2],
                    color = "#E72E6B", linetype = 2) +
        labs(x = "Humans meant to rule over nature") +
        labs(y = "How worried about climate change") +
        theme_minimal()
}

plot_violin <- function(data) {
    ggplot(data, aes(agegroup35, climate)) +
        geom_violin() +
        geom_jitter(color = "#2A4B9B", alpha = 0.3) +
        labs(x = "") +
        labs(y = "How worried about climate change") +
        theme_minimal()
}

How to use {targets}

Step 1: Install packages

install.packages("targets")
install.packages("tarchetypes")

Step 2: use_targets()

library(targets)
use_targets()

Step 3: modify _targets.R

# Load packages required to define the pipeline:
library(targets)
library(tarchetypes)

# Set target options:
tar_option_set(
  packages = c("tidyverse") # Packages that your targets need for their tasks.
)

# Run the R scripts in the R/ folder with your custom functions:
tar_source()

# Replace the target list below with your own:
list(
  tar_target(file, "data/CRON3W2e01.1.csv", format = "file"),
  tar_target(data, get_data(file)),
  tar_target(model, fit_model(data)),
  tar_target(plot, plot_model(data, model)),
  tar_target(violin_plot, plot_violin(data))
)

Step 4: Inspect the pipeline

tar_visnetwork()
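
A text-based alternative to the network graph, sketched with tar_manifest(). inspect_pipeline() is a hypothetical wrapper that does nothing when no pipeline script is found.

```r
# Hypothetical wrapper: print the targets and their commands as a table.
inspect_pipeline <- function(script = "_targets.R") {
  if (!file.exists(script) || !requireNamespace("targets", quietly = TRUE)) {
    message("No pipeline script found (or targets is not installed).")
    return(invisible(FALSE))
  }
  print(targets::tar_manifest(script = script))  # one row per target
  invisible(TRUE)
}
```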

Step 5: Run the pipeline

tar_make()

Step 6: Read results

tar_read(model)
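
Besides tar_read(), which returns a result, tar_load() places it in the session under its target name. A guarded sketch, assuming the pipeline has already been run and the default _targets store is used:

```r
store_exists <- dir.exists("_targets")  # default data store created by tar_make()

if (store_exists && requireNamespace("targets", quietly = TRUE)) {
  targets::tar_load(model)   # creates an object named `model` in the session
  print(coef(model))         # use it like any other fitted lm object
}
```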

Change functions?

# change function in 03_plot.R

plot_model <- function(data, model) {
    ggplot(data, aes(x = rule, y = climate)) +
        #geom_jitter(color = "#66B32F", alpha = 0.3) +
        geom_jitter(color = "#ffed00", alpha = 0.3) +
        geom_abline(intercept = model$coefficients[1],
                    slope = model$coefficients[2],
                    #color = "#E72E6B", linetype = 2) +
                    color = "#2A4B9B", linetype = 2) +
        labs(x = "Humans meant to rule over nature") +
        labs(y = "How worried about climate change") +
        theme_minimal()
}
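
After such a change, the pipeline can report what needs to be rebuilt. A guarded sketch, assuming it is run from the project root:

```r
pipeline_exists <- file.exists("_targets.R")

if (pipeline_exists && requireNamespace("targets", quietly = TRUE)) {
  # targets whose code or upstream dependencies changed; here this should
  # be `plot`, because only plot_model() was edited
  print(targets::tar_outdated())

  # rebuild outdated targets; up-to-date ones are skipped
  targets::tar_make()
}
```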

{targets} and Quarto

Step 1: create Quarto document
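
A minimal report.qmd could be generated from R as sketched below. The helper write_report_template() and the report title are assumptions for illustration. The tar_read() calls inside the chunk are what should allow tar_quarto() to detect which targets the report depends on.

```r
# Hypothetical helper: write a minimal Quarto report that reads two targets.
write_report_template <- function(path = "report.qmd") {
  fence <- strrep("`", 3)  # three backticks, built programmatically
  lines <- c(
    "---",
    "title: \"Climate worry and views on nature\"",  # hypothetical title
    "format: html",
    "---",
    "",
    paste0(fence, "{r}"),
    "library(targets)",
    "library(tidyverse)",
    "tar_read(plot)",
    "tar_read(violin_plot)",
    fence
  )
  writeLines(lines, path)
  invisible(path)
}
```

Calling write_report_template() in the project root creates the file next to _targets.R; the document can then be edited like any other Quarto file.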

Step 2: modify _targets.R

# Load packages required to define the pipeline:
library(targets)
library(tarchetypes)

# Set target options:
tar_option_set(
  packages = c("tidyverse", "tibble") # Packages that your targets need for their tasks.
)

# Run the R scripts in the R/ folder with your custom functions:
tar_source()

# Replace the target list below with your own:
list(
  tar_target(file, "data/CRON3W2e01.1.csv", format = "file"),
  tar_target(data, get_data(file)),
  tar_target(model, fit_model(data)),
  tar_target(plot, plot_model(data, model)),
  tar_target(violin_plot, plot_violin(data)),
  tar_quarto(report, "report.qmd") # requires the quarto R package and the Quarto CLI
)

Step 3: tar_make()

Bibliography

Kitzes, Justin. 2017. “The Basic Reproducible Workflow Template.” In The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences, edited by Justin Kitzes, Daniel Turek, and Fatma Deniz, 19–30. Oakland, CA: University of California Press.
Landau, Will. 2025. The targets R Package User Manual. ropensci.org. https://books.ropensci.org/targets/.
Peikert, Aaron, Caspar J. van Lissa, and Andreas M. Brandmaier. 2021. “Reproducible Research in R: A Tutorial on How to Do the Same Thing More Than Once.” Psych 3 (4): 836–67. https://doi.org/10.3390/psych3040053.
Rodrigues, Bruno. 2023. “Building Reproducible Analytical Pipelines with R.” https://raps-with-r.dev/.
Ushey, Kevin, and Hadley Wickham. 2025. renv: Project Environments. R package version 1.1.4. https://rstudio.github.io/renv/.

Upcoming Events

Feedback

Q & A