This supplementary is available under https://github.com/marruett/Supplementary_Ruett_Precision_Agriculture and outlines the Decision Analysis approaches used in Assessing expected utility and profitability to support decision-making for disease control strategies in ornamental heather production Ruett et al. (under review). Here we provide our input table, code and results to allow for transparent reproduction of the model. To create and organize this supplementary file we used the packages devtools (Wickham et al. 2019b), knitr (Xie 2019a), kableExtra (Zhu 2019) and bookdown (Xie 2019b). All other applied R packages are cited and listed in the References.

Input table for assessment of expected utility and profitability to support decision-making in ornamental heather production

The input table Input_CBA.csv contains: Variable names, unit, distributions (posnorm = positive normal distribution, const = constant, tnorm_0_1= truncated normal distribution), lower bound, mean, upper bound and definition of all the input variables for the model. The abbreviation CBA refers to Cost Benefit Assessment.

Implementing the model in R

Here we provide our mathematical model that enables simulation of decision outcomes. The calculations in this document are based on random draws from the distributions in the input table Input_CBA.csv. Explanations for code chunks are not part of the actual simulation. The following code descriptions will guide readers through our process of code development.

Load the input table Input_CBA.csv and make variables.


input_table <- "Input_CBA.csv"

make_variables <- function(est,n = 1)
{ x <- random(rho = est, n = n)
for(i in colnames(x)) assign(i, as.numeric(x[1,i]),envir = .GlobalEnv)}
make_variables(estimate_read_csv(input_table))

We generated a function called Simulation function wrapping another function called Cost_Benefit to evaluate the decision options in decisionSupport (Luedeling et al. 2021). The Cost_Benefit function represents the code to compute the costs and benefits for a monitoring strategy over a determined time period. We were able to calculate the cashflow and the Net Present Value (NPV) for the three monitoring strategies since we wrapped the Simulation function around the Cost_Benefit function. The following calculations show how the functions are coded (see the entire function below).


Simulation <- function(){
  Cost_Benefit <- function(n_years, # number of years to run the simulation
                            highrisk_year, # Years with a high risk of fungal infections
                            CV, # coefficient indicating variation into a time series
                            area, # area of the heather production system
                            sample_costs, # costs per lab sample
                            value_of_discarded_plant, # value of a sampled and then 
                            # discarded heather plant
                            plant_value_of_A1_quality, # plant value of marketable heather
                            # plant that would not have achieved high quality without 
                            # the respective monitoring strategy.
                            Initial_investment, # Mandatory investment in the first year
                            # to start then respective monitoring approach
                            additional_investment, # Occurring investments is the following
                            # years 
                            labor_costs, # Labor costs to conduct monitoring
                            post_processing_costs, # Data processing of acquired data
                            sample_number, # Number of lab samples
                            additional_saved_plants, # number of heather plants that would
                            # not have achieved high marketable quality without the
                            # respective monitoring strategy.
                            adjustment_sample_size, # Adjustment to increase sample size 
                            # in high risk years 
                            increased_resource_use_costs, # Costs for increased resource
                            # use to protect heather plants in high risk years
                            resource_savings # Monetary savings for reduced resource
                            # use in normal risk years due to the respective monitoring
                            # strategy
  ){

First we defined the labor costs per area.


  yearly_labor_costs_per_ha <- vv(labor_costs, CV, n = n_years)

Then we defined the number of samples per hectare. We used the value varier function vv() from the decisionSupport package (Luedeling et al. 2021) to account for the yearly variation of e.g. taking samples under different risk conditions.


  samples_per_ha <- round(vv(sample_number, CV, n = n_years) *
      (1+highrisk_year * (vv(adjustment_sample_size, CV, n = n_years))), digits = 0)

We defined the sample costs per hectare and the value of discarded plants per hectare for each year.


    sample_costs_per_ha <- samples_per_ha * sample_costs
    
    discarded_plant_value_per_ha <- samples_per_ha *
      vv(value_of_discarded_plant, CV, n = n_years)
    

We defined the total labor and sample costs for the whole production system.


    labor_and_sample_cost <- (yearly_labor_costs_per_ha +
                                    sample_costs_per_ha +
                                    discarded_plant_value_per_ha) * area

Here we defined the initial investment of the first year and additional investments (maintenance costs) of the following years. Then we summed up the labor and sample costs, the data post processing costs, the initial investment and the additional investments, and defined the result as the total costs.


    total_cost <- labor_and_sample_cost +
      vv(post_processing_costs, CV, n = n_years) + 
      c(Initial_investment, rep(0,n_years-1)) +
      c(0,vv(additional_investment, CV, n = n_years-1))

Benefits of monitoring strategies consist of resource savings that are achieved in normal risk years and the value of saved plants in high risk years.

In normal risk years the number of frequent fungicide applications can be reduced to some extent, because more monitoring increases knowledge about plant health status in the field. Therefore, resource savings can be achieved in these years.

Although monitoring allows for resource savings in normal risk years, no resource savings are achieved in high risk years because a higher number of pesticide applications is required. However, thanks to more precise knowledge about spatially distributed plant vitality, producers are able to protect quality of more plants, which are safely cultivated until they are sold.

We defined the resource savings in normal risk years and the value of saved high quality plants in high risk years. We summed up these monetary values and defined the result as the total benefits.


    resource_savings <- vv(resource_savings, CV, n = n_years) * area * (1-highrisk_year)
    
    value_of_high_quality_plants <- vv(additional_saved_plants, CV, n = n_years) * 
                                       plant_value_of_A1_quality * area * highrisk_year
    
    total_benefits <- resource_savings + value_of_high_quality_plants

In the return command we defined the output of the of the Cost_Benefit function as cashflow.


 return(cashflow = total_benefits - total_cost)}

High risk years can increase sample size and resource use. Occurrence of high risk years was defined using the chance_event function from decisionSupport (Luedeling et al. 2021). The chance of a high-risk year occurring (chance_high_risk) was defined in the input table as a probability of 40% to 60%, reflecting the uncertainty for risky weather events.


  highrisk_year <- chance_event(chance_high_risk, n = n_years)

Here we simulated the Cost_Benefit function for standard monitoring (Baseline) and defined the result as 'cashflow_G'.


   cashflow_G <- Cost_Benefit(n_years = n_years,
                              highrisk_year = highrisk_year,
                              CV = var_CV,
                              area = production_area,
                              sample_costs = lab_costs_per_sample,
                              value_of_discarded_plant = plant_value_of_discarded_plant,
                              plant_value_of_A1_quality = plant_value_of_A1_quality,
                              Initial_investment = Initial_investment_G , 
                              additional_investment = additional_investment_G ,
                              labor_costs = labor_costs_G,
                              post_processing_costs = post_processing_costs_G,
                              sample_number = sample_number_G,
                              additional_saved_plants = Number_of_saved_high_quality_plants_G,
                              adjustment_sample_size = adjustment_sample_size_G,
                              resource_savings = resource_savings_G)

Here we calculated the Net Present Value for 'Baseline' monitoring.


  NPV_G <- discount(cashflow_G, discount_rate, calculate_NPV = TRUE)

Here we simulated the Cost_Benefit function for more intense monitoring (Improved) and defined the result as 'cashflow_I'.


  cashflow_I <- Cost_Benefit(n_years = n_years,
                              highrisk_year = highrisk_year,
                              CV = var_CV,
                              area = production_area,
                              sample_costs = lab_costs_per_sample,
                              value_of_discarded_plant = plant_value_of_discarded_plant,
                              plant_value_of_A1_quality = plant_value_of_A1_quality,
                              Initial_investment = Initial_investment_I , 
                              additional_investment = additional_investment_I ,
                              labor_costs = labor_costs_I,
                              post_processing_costs = post_processing_costs_I,
                              sample_number = sample_number_I,
                              additional_saved_plants = Number_of_saved_high_quality_plants_I,
                              adjustment_sample_size = adjustment_sample_size_I,
                              resource_savings = resource_savings_I)

Here we calculated the Net Present Value for 'Improved' monitoring.


NPV_I <- discount(cashflow_I, discount_rate, calculate_NPV = TRUE)

Here we calculated the Net Present Value for the decision to apply 'Improved' instead of 'Baseline' monitoring.


  comp_NPV_IG <- NPV_I - NPV_G

Here we simulated the Cost_Benefit function for monitoring with sensor technology (Sensor) and defined the result as 'cashflow_S'.


  cashflow_S  <- Cost_Benefit(n_years = n_years,
                              highrisk_year = highrisk_year,
                              CV = var_CV,
                              area = production_area,
                              sample_costs = lab_costs_per_sample,
                              value_of_discarded_plant = plant_value_of_discarded_plant,
                              plant_value_of_A1_quality = plant_value_of_A1_quality,
                              Initial_investment = Initial_investment_S , 
                              additional_investment = additional_investment_S ,
                              labor_costs = labor_costs_S,
                              post_processing_costs = post_processing_costs_S,
                              sample_number = sample_number_S,
                              additional_saved_plants = Number_of_saved_high_quality_plants_S,
                              adjustment_sample_size = adjustment_sample_size_S,
                              resource_savings = resource_savings_S)

Here we calculated the Net Present Value for 'Sensor' monitoring.


NPV_S <- discount(cashflow_S, discount_rate, calculate_NPV = TRUE)

Here we calculated the Net Present Value for the decision to apply 'Sensor' instead of 'Baseline' monitoring.


  comp_NPV_SG <- NPV_S - NPV_G

At the end of the Simulation function we listed all calculated outcomes.


  return(list(cashflow_G = cashflow_G,
              NPV_G = NPV_G,
              cashflow_I = cashflow_I,
              NPV_I = NPV_I, 
              cashflow_S = cashflow_S,
              NPV_S = NPV_S,
              comp_NPV_IG = comp_NPV_IG,
              comp_NPV_SG = comp_NPV_SG))
}

Here we defined legend file, results folder and figures folder.


legend_file <- "Legend_CBA.csv"
MC_Results_folder <- "MC_Results_CBA"
EVPI_Results_folder <- "EVPI_Results_CBA"

The decisionSupport function facilitated the start of the Monte Carlo simulation.


decisionSupport(inputFilePath = input_table,
                outputPath = MC_Results_folder, 
                welfareFunction = Simulation, 
                write_table = TRUE,
                numberOfModelRuns = 10000, 
                functionSyntax = "plainNames")

We produced all outcome variables with the decisionSupport function. We stored the Monte Carlo Simulation results into the MC_file and generate the EVPI files.


MC_file <- read.csv(paste(MC_Results_folder,"/mcSimulationResults.csv",sep = ""))

MC_file_without_cashflow <- select(MC_file, -c(1, starts_with("cashflow")))
multi_EVPI(mc = MC_file_without_cashflow,first_out_var = "NPV_G",
           write_table = TRUE,outfolder = EVPI_Results_folder)

welfare_summary <- read.csv(paste(MC_Results_folder,"/welfareDecisionSummary.csv",sep = ""))

R function: Simulation

Here our code is shown at one piece for an overview of all calculations.


library(decisionSupport)

input_table <- "Input_CBA.csv"
legend_file <- "Legend_CBA.csv"
MC_Results_folder <- "MC_Results_CBA"
EVPI_Results_folder <- "EVPI_Results_CBA"

make_variables <- function(est,n = 1)
{ x <- random(rho = est, n = n)
for(i in colnames(x)) assign(i, as.numeric(x[1,i]),envir = .GlobalEnv)}
make_variables(estimate_read_csv(input_table))
dir.create(MC_Results_folder)

Simulation <- function(){

  Cost_Benefit <- function(n_years,
                            highrisk_year, 
                            CV, 
                            area, 
                            sample_costs, 
                            value_of_discarded_plant, 
                            plant_value_of_A1_quality, 
                            Initial_investment,
                            additional_investment,
                            labor_costs, 
                            post_processing_costs, 
                            sample_number, 
                            additional_saved_plants,  
                            adjustment_sample_size, 
                            increased_resource_use_costs,
                            resource_savings){ 

    yearly_labor_costs_per_ha <- vv(labor_costs, CV, n = n_years)
    
    samples_per_ha <- round(vv(sample_number, CV, n = n_years) *
      (1+highrisk_year * (vv(adjustment_sample_size, CV, n = n_years))), digits = 0)
    
    sample_costs_per_ha <- samples_per_ha * sample_costs
    
    discarded_plant_value_per_ha <- samples_per_ha *
      vv(value_of_discarded_plant, CV, n = n_years)
    
    labor_and_sample_cost <- (yearly_labor_costs_per_ha +
                                    sample_costs_per_ha +
                                    discarded_plant_value_per_ha) * area

    total_cost <- labor_and_sample_cost +
      vv(post_processing_costs, CV, n = n_years) + 
      c(Initial_investment, rep(0,n_years-1)) +
      c(0,vv(additional_investment, CV, n = n_years-1))
    
    resource_savings <- vv(resource_savings, CV, n = n_years) * area * (1-highrisk_year)
    
    value_of_high_quality_plants <- vv(additional_saved_plants, CV, n = n_years) * 
                                       plant_value_of_A1_quality * area * highrisk_year
    
    total_benefits <- resource_savings + value_of_high_quality_plants
    
    return(cashflow = total_benefits - total_cost)}
  
  highrisk_year <- chance_event(chance_high_risk, n = n_years)
  
  cashflow_G <- Cost_Benefit(n_years = n_years,
                              highrisk_year = highrisk_year,
                              CV = var_CV,
                              area = production_area,
                              sample_costs = lab_costs_per_sample,
                              value_of_discarded_plant = plant_value_of_discarded_plant,
                              plant_value_of_A1_quality = plant_value_of_A1_quality,
                              Initial_investment = Initial_investment_G , 
                              additional_investment = additional_investment_G ,
                              labor_costs = labor_costs_G,
                              post_processing_costs = post_processing_costs_G,
                              sample_number = sample_number_G,
                              additional_saved_plants = Number_of_saved_high_quality_plants_G,
                              adjustment_sample_size = adjustment_sample_size_G,
                              resource_savings = resource_savings_G)
  
  NPV_G <- discount(cashflow_G, discount_rate, calculate_NPV = TRUE)
  
  cashflow_I <- Cost_Benefit(n_years = n_years,
                              highrisk_year = highrisk_year,
                              CV = var_CV,
                              area = production_area,
                              sample_costs = lab_costs_per_sample,
                              value_of_discarded_plant = plant_value_of_discarded_plant,
                              plant_value_of_A1_quality = plant_value_of_A1_quality,
                              Initial_investment = Initial_investment_I , 
                              additional_investment = additional_investment_I ,
                              labor_costs = labor_costs_I,
                              post_processing_costs = post_processing_costs_I,
                              sample_number = sample_number_I,
                              additional_saved_plants = Number_of_saved_high_quality_plants_I,
                              adjustment_sample_size = adjustment_sample_size_I,
                              resource_savings = resource_savings_I)

  NPV_I <- discount(cashflow_I, discount_rate, calculate_NPV = TRUE)

  comp_NPV_IG <- NPV_I - NPV_G

  cashflow_S  <- Cost_Benefit(n_years = n_years,
                              highrisk_year = highrisk_year,
                              CV = var_CV,
                              area = production_area,
                              sample_costs = lab_costs_per_sample,
                              value_of_discarded_plant = plant_value_of_discarded_plant,
                              plant_value_of_A1_quality = plant_value_of_A1_quality,
                              Initial_investment = Initial_investment_S , 
                              additional_investment = additional_investment_S ,
                              labor_costs = labor_costs_S,
                              post_processing_costs = post_processing_costs_S,
                              sample_number = sample_number_S,
                              additional_saved_plants = Number_of_saved_high_quality_plants_S,
                              adjustment_sample_size = adjustment_sample_size_S,
                              resource_savings = resource_savings_S)

  NPV_S <- discount(cashflow_S, discount_rate, calculate_NPV = TRUE)
  
  comp_NPV_SG <- NPV_S - NPV_G
  
  return(list(cashflow_G = cashflow_G,
              NPV_G = NPV_G,
              cashflow_I = cashflow_I,
              NPV_I = NPV_I, 
              cashflow_S = cashflow_S,
              NPV_S = NPV_S,
              comp_NPV_IG = comp_NPV_IG,
              comp_NPV_SG = comp_NPV_SG))
}

decisionSupport(inputFilePath = input_table,
                outputPath = MC_results_folder,
                welfareFunction = Simulation, 
                write_table = TRUE,
                numberOfModelRuns = 10000, 
                functionSyntax = "plainNames")

MC_file <- read.csv(paste(MC_Results_folder,"/mcSimulationResults.csv",sep = ""))

MC_file_without_cashflow <- select(MC_file, -c(1, starts_with("cashflow")))
multi_EVPI(mc = MC_file_without_cashflow,first_out_var = "NPV_G",
           write_table = TRUE,outfolder = EVPI_Results_folder)

welfare_summary <- read.csv(paste(MC_Results_folder,"/welfareDecisionSummary.csv",sep = ""))

Calculation of the expected utility

Here we calculated the expected utility for different monitoring strategies. First, we loaded the simulation results into the R environment. We used the package dyplyr (Wickham et al. 2019a) to select the simulated Net Present Values from each monitoring strategy.


data_G <- read.csv("mcSimulationResults.csv")
data_G <- dplyr::select(data_G, starts_with("NPV_G")) %>%
  stack(drop=FALSE)
data_G$values <- as.numeric(data_G$values)

data_I <- read.csv("mcSimulationResults.csv")
data_I <- dplyr::select(data_I, starts_with("NPV_I")) %>%
  stack(drop=FALSE)
data_I$values <- as.numeric(data_I$values)

data_S <- read.csv("mcSimulationResults.csv")
data_S <- dplyr::select(data_S, starts_with("NPV_S")) %>%
  stack(drop=FALSE)
data_S$values <- as.numeric(data_S$values)

Then we started the risk premium calculation for the monitoring strategies.


risk_premium_G <-(1/2)*0.0001*var(data_G$values)

risk_premium_I <-(1/2)*0.0001*var(data_I$values)

risk_premium_S <-(1/2)*0.0001*var(data_S$values)

We calculated the semivariance for the monitoring strategies.


for(z in 1 : 10000){
  data_G$NewColumn[z] <- ifelse(min((data_G$values[z]- mean(data_G$values)),0) < 0,
                                (data_G$values[z]- mean(data_G$values))^2, 0)
}
Semivaricance_G <- mean(data_G$NewColumn)

for(z in 1 : 10000){
  data_I$NewColumn[z] <- ifelse(min((data_I$values[z]- mean(data_I$values)),0) < 0,
                                (data_I$values[z]- mean(data_I$values))^2, 0)
}
Semivaricance_I <- mean(data_I$NewColumn)

for(z in 1 : 10000){
  data_S$NewColumn[z] <- ifelse(min((data_S$values[z]- mean(data_S$values)),0) < 0,
                                (data_S$values[z]- mean(data_S$values))^2, 0)
}
Semivaricance_S <- mean(data_S$NewColumn)

Here we computed the certainty equivalents and the risk aversion coefficients and stored the results in a data frame called risk_aversion_data.


for(r in c(-1e-01, -1e-02, -1e-03, -1e-04, -1e-05, -1e-06, 0,
           1e-06,  1e-05, 1e-04, 1e-03, 1e-02, 1e-01)){
  Certainty_equivalent_G_ <- mean(data_G$values) - Semivaricance_G * r
  assign(paste("CE_G_", r, sep = ""), Certainty_equivalent_G_)
  
  Certainty_equivalent_I_ <- mean(data_I$values) - Semivaricance_I * r
  assign(paste("CE_I_", r, sep = ""), Certainty_equivalent_I_)
  
  Certainty_equivalent_S_ <- mean(data_S$values) - Semivaricance_S * r
  assign(paste("CE_S_", r, sep = ""), Certainty_equivalent_S_)
}

Certainty_equivalent <- c(`CE_G_-0.1`, `CE_G_-0.01`, `CE_G_-0.001`, `CE_G_-1e-04`,
                          `CE_G_-1e-05`, `CE_G_-1e-06`,`CE_G_0`, `CE_G_1e-06`,
                          `CE_G_1e-05`, `CE_G_1e-04`, `CE_G_0.001`, `CE_G_0.01`,
                          `CE_G_0.1`, `CE_I_-0.1`, `CE_I_-0.01`, `CE_I_-0.001`, 
                          `CE_I_-1e-04`, `CE_I_-1e-05`, `CE_I_-1e-06`, `CE_I_0`,
                          `CE_I_1e-06`, `CE_I_1e-05`, `CE_I_1e-04`, `CE_I_0.001`,
                          `CE_I_0.01`, `CE_I_0.1`,`CE_S_-0.1`, `CE_S_-0.01`, `CE_S_-0.001`,
                          `CE_S_-1e-04`, `CE_S_-1e-05`, `CE_S_-1e-06`,`CE_S_0`, `CE_S_1e-06`,
                          `CE_S_1e-05`, `CE_S_1e-04`, `CE_S_0.001`, `CE_S_0.01`, `CE_S_0.1`)

monitoring_type <- c(rep("Baseline", 13),rep("Improved", 13),rep("Sensor", 13))

risk_aversion_coefficient <- c(-1e-01, -1e-02, -1e-03, -1e-04, -1e-05, -1e-06, 0, 1e-06,
                               1e-05, 1e-04, 1e-03, 1e-02, 1e-01)

options(scipen = 999)
risk_aversion_data <- data.frame(Certainty_equivalent, risk_aversion_coefficient, monitoring_type)

Results

Assessment of profitability and expected utility

The following plots are generated from the mcSimulationResults.csv from the decisionSupport function.

We plotted probability distributions and merged them to illustrate the frequency of Monte Carlo simulation outcomes. Additionally, we integrated boxplots to compare Net Present Values of Baseline, Improved, and Sensor.

For data visualization we applied the ggplot2 (Wickham 2016), ggstance (Henry et al. 2019) and dplyr (Wickham et al. 2019a) packages. We used the function ggarrange from the ggpubr package (Kassambara 2019) to merge figures.

Figure 1

Probability density distributions (scaled density among 10,000 runs of a Monte Carlo simulation) of the Net Present Value (Euros) for three heather monitoring strategies. Baseline: Current visual monitoring regime. Improved: More intensive visual monitoring. Sensor: Sensor-based monitoring.

Figure 2

Stochastic efficiency with respect to a function (SERF) results for monitoring strategies Baseline, Improved, and Sensor. In the upper figure the y-axis is shown on a normal scale, while the lower figure uses a log-scale to zoom into the positive parts of the SERF functions. In the latter case negative values are dropped from the illustration, because negative certainty equivalent values imply that decision makers would not adopt an activity at all.

We plotted probability distributions and merged them to illustrate the frequency of Monte Carlo simulation outcomes. Additionally, we integrated boxplots to compare Net Present Values of DoMoreVisual and UseSensor.

Figure 3

Probability density distributions (scaled density among 10,000 runs of a Monte Carlo simulation) of Net Present Values (Euros) for two heather monitoring decisions. DoMoreVisual: More intensive visual monitoring compared to the current monitoring regime. UseSensor: Sensor-based monitoring compared to the current monitoring regime.

EVPI and VIP analysis of management decisions

Comparison of Information Value and Variable Importance for DoMoreVisual and UseSensor.

Figure 4

Variable Importance in the Projection (VIP scores) and Information Value (EVPI values) for the DoMoreVisual and UseSensor decisions. For the Variable Importance, variables that have a positive impact on the projected NPVs are represented by blue bars, while those with a negative impact are represented by orange bars.

Acknowledgements

We particularly acknowledge the heather farmers Gerd Canders, Tom Canders (Europlant Canders GmbH, Straelen, Germany), Matthias Küppers and Verena Zachau Küppers (Jungpflanzen Küppers GbR, Wachtendonk, Germany) for their participation and contributions throughout the research process. We thank Peter Tiede-Arlt, Rainer Peters and Torsten Wolf (Landwirtschaftskammer Nordrhein-Westfalen) for their commitment and advice in this study.

Funding

This research was funded by Stiftung Zukunft NRW within the research project inruga (Innovationen für NRW zur Steigerung der Ressourceneffizienz und Umweltverträglichkeit im Gartenbau, „Entscheidungshilfen im Zierpflanzenbau“). The funders were not involved in the preparation of this article.

References

Henry, L., Wickham, H., & Chang, W. (2019). Ggstance: Horizontal ’ggplot2’ Components. https://CRAN.R-project.org/package=ggstance

Kassambara, A. (2019). Ggpubr: ’Ggplot2’ Based Publication Ready Plots. https://CRAN.R-project.org/package=ggpubr

Luedeling, E., Goehring, L., Schiffers, K., Whitney, C., & Fernandez, E. (2021). decisionSupport: Quantitative Support of Decision Making under Uncertainty. http://www.worldagroforestry.org/

Wickham, H. (2016). Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org

Wickham, H., François, R., Henry, L., & Müller, K. (2019a). Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr

Wickham, H., Hester, J., & Chang, W. (2019b). Devtools: Tools to Make Developing R Packages Easier. https://CRAN.R-project.org/package=devtools

Xie, Y. (2019a). Knitr: A General-Purpose Package for Dynamic Report Generation in R.

Xie, Y. (2019b). Bookdown: Authoring Books and Technical Documents with R Markdown.

Zhu, H. (2019). kableExtra: Construct Complex Table with ’kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra