This notebook visualizes the results from the data extraction phase of the study investigating the state of practice of addressing threats to validity in crossover-design experiments at analysis time. All figures are stored in the figures/ sub-directory.

Data

First, we load the raw data from the Extraction sheet of the data-extraction.xlsx file, skipping the first two rows.

library(tidyverse)  # dplyr, tidyr, and ggplot2
library(readxl)     # read_excel()

data <- read_excel("../data/state-of-practice/data-extraction.xlsx", sheet="Extraction", skip=2)

head(data)
## # A tibble: 6 x 22
##   `Paper-ID` Extractor `Factor-Constructs`                  `Factor-Measurement`
##        <dbl> <chr>     <chr>                                <chr>               
## 1          6 rater2    business process model               [textual; visual]   
## 2          9 rater1    Adherence to REST API design rules;~ [rule; violation]; ~
## 3         14 rater2    Testing approach                     [TDD; TLD]          
## 4         48 rater2    Testing approach                     [TDD; ITL]          
## 5         71 rater2    Tool version                         [tab;panels]        
## 6         77 rater1    Test migration process               [manual; tool]      
## # i 18 more variables: `Response-Constructs` <chr>,
## #   `Response-Measurement` <chr>, `Subject-Number` <chr>, `Subject-Type` <chr>,
## #   Method <chr>, `Test Type` <chr>, Period <chr>, Sequence <chr>, Skill <chr>,
## #   Carryover <chr>, Washout <lgl>, `Data-Availability` <chr>,
## #   `Data-URL` <chr>, `Analysis-Availability` <chr>, `Analysis-URL` <chr>,
## #   Comment <chr>, `?` <lgl>, Response <chr>

Then, we rename the columns used in the analysis to R-appropriate names (i.e., lowercase names with dots instead of dashes, e.g., subject.number instead of “Subject-Number”).

data <- data %>% 
  rename(all_of(c(
    id = "Paper-ID",
    
    subject.number = "Subject-Number",
    subject.type = "Subject-Type",

    method = "Method",
    test.type = "Test Type",
    
    availability.data = "Data-Availability",
    availability.analysis = "Analysis-Availability"
    )))
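
The renaming pattern can be tried on a small scale with the following sketch. The toy tibble and the lookup vectors are invented for illustration; only the column names mirror the extraction sheet. It also shows that any_of() can be used instead of all_of() when some of the listed columns might be absent.

```r
library(dplyr)

# Hypothetical toy data; only the column names mirror the extraction sheet
toy <- tibble(`Paper-ID` = c(6, 9), `Subject-Number` = c("20", "31"))

# Named vector: new name = "old name"
lookup <- c(id = "Paper-ID", subject.number = "Subject-Number")

# all_of() is strict: every old name in the lookup must exist in the data
renamed <- toy %>% rename(all_of(lookup))
names(renamed)

# any_of() is lenient: lookup entries without a matching column are skipped
lookup.extra <- c(lookup, method = "Method")
renamed.lenient <- toy %>% rename(any_of(lookup.extra))
names(renamed.lenient)
```

The strict variant is preferable here, since a misspelled column name then fails loudly instead of being skipped.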

Additionally, we cast the columns to their appropriate data types. The warning "NAs introduced by coercion" is expected: the column cast to integer (subject.number) contains entries that cannot be parsed as numbers, which become NA.

cat.subjects <- c("Students", "Student Groups", "Practitioners", "Pracititioner Groups", "Both", "Mixed Groups", "Researchers", "Unknown", "Other")
cat.threats <- c("Period", "Sequence", "Skill", "Carryover")
cat.method <- c("NHST", "GLM", "GLMM", "GEE", "Other", "Unknown")
cat.type <- c("Unpaired T", "Paired T", "Mann-Whitney U", "Wilcoxon signed-rank", "ANOVA", "ANOVA-type Statistics", "Kruskal-Wallis", "Other")
cat.addressal <- c("Modeled", "Stratified", "Isolated", "Acknowledged", "Neglected", "Ignored")
cat.availability <- c("Proprietary", "Private", "Unavailable", "Broken", "Upon Request", "Reachable", "Open Source", "Archived")

data <- data %>% 
  mutate(
    subject.type = factor(subject.type, levels=cat.subjects, ordered=FALSE),
    subject.number = as.integer(subject.number),
    
    method = factor(method, levels=cat.method, ordered=FALSE),
    test.type = factor(test.type, levels=cat.type, ordered=FALSE),
    
    Period = factor(Period, levels=cat.addressal, ordered=TRUE),
    Sequence = factor(Sequence, levels=cat.addressal, ordered=TRUE),
    Skill = factor(Skill, levels=cat.addressal, ordered=TRUE),
    Carryover = factor(Carryover, levels=cat.addressal, ordered=TRUE),
    
    availability.data = factor(availability.data, levels=cat.availability, ordered=TRUE),
    availability.analysis = factor(availability.analysis, levels=cat.availability, ordered=TRUE),
  )
## Warning: There was 1 warning in `mutate()`.
## i In argument: `subject.number = as.integer(subject.number)`.
## Caused by warning:
## ! NAs introduced by coercion
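
To illustrate the two kinds of coercion on a small scale, the following sketch applies them to a toy tibble. The toy values are invented for illustration and do not stem from the extraction sheet; the level vector repeats cat.addressal so that the snippet runs on its own.

```r
library(dplyr)

# Same levels as cat.addressal above, repeated to keep the sketch self-contained
cat.addressal <- c("Modeled", "Stratified", "Isolated", "Acknowledged", "Neglected", "Ignored")

# Hypothetical toy data
toy <- tibble(
  subject.number = c("12", "unclear", "40"),
  Period = c("Modeled", "Ignored", "modelled")  # last value is not a valid level
)

toy.cast <- toy %>%
  mutate(
    # Strings that are not numbers become NA; the coercion warning is
    # suppressed here only to keep the sketch quiet
    subject.number = suppressWarnings(as.integer(subject.number)),
    # Values that are not listed in `levels` silently become NA
    Period = factor(Period, levels = cat.addressal, ordered = TRUE)
  )

toy.cast$subject.number
is.na(toy.cast$Period)

# Ordered factors can be compared along the order of their levels
toy.cast$Period[1] < toy.cast$Period[2]
```

Because factor() drops unlisted values without a warning, misspelled categories in the sheet are easiest to spot by counting the NA values after the cast.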

Visualization

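This section generates the figures from the loaded data and stores them. Each subsection addresses one of the attribute groups. The following helper function, which draws a labeled bar chart for one categorical column, is reused throughout.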

#' Visualize the distribution of categorical values in a column of the data dataframe 
#'
#' @param column The name of the column as a string.
#' @param levels A character vector of the column's values in the desired display order.
#' @param label_tilt Boolean (true if the labels should be tilted by 10 degrees to avoid overlapping)
#' @param ignore Column value to filter out
#' @param per_study Boolean (true if values are counted per study, not per experiment)
#' @returns A ggplot bar chart with values written into/onto each bar 
#' @examples
#' bar_chart_ordered("subject.type", cat.subjects)
bar_chart_ordered <- function(column, levels, label_tilt=FALSE, ignore="", per_study=FALSE) {
  d <- data
  if(per_study) {
    d <- data %>% 
      select(id, !!sym(column)) %>%
      distinct()
  }
  
  n.per.category <- d %>% 
    group_by(!!sym(column)) %>% 
    tally() %>% 
    mutate(
      label.vjust = ifelse(n<3, -0.8, 1.5),
      label.color = ifelse(n<3, "black", "white")
      ) %>% 
    filter(!!sym(column) != ignore)
    
  plt <- n.per.category %>%
    ggplot(aes(x=factor(!!sym(column), levels=levels), y=n)) +
      geom_bar(stat="identity") +
      geom_text(aes(label=n), vjust=n.per.category$label.vjust, color=n.per.category$label.color) +
      labs(x = column, y = "Count") +
      theme(axis.text.x = element_text(angle=ifelse(label_tilt, 10, 0)))
  
  return(plt)
}

Subjects

The first group of attributes describes the subjects involved in the investigated experiments. The subject type describes the type of participants included in the experiment.

bar_chart_ordered("subject.type", cat.subjects, TRUE)

ggsave(file='../figures/subject-type.pdf', device = "pdf",
       width = 14, height = 8, units = "cm")

The subject count is the number of participants involved in an experiment. The following box plot visualizes its distribution.

ggplot(data, aes(x=subject.number)) +
  geom_boxplot(notch=TRUE, notchwidth=0.3, na.rm=TRUE) +
  labs(x="Number of Participants") +
  theme(axis.title.y=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank(), 
        panel.grid.major.y = element_blank(), 
        panel.grid.minor.y = element_blank())

ggsave(file='../figures/subject-count.pdf', device = "pdf",
       width = 14, height = 3, units = "cm")

Additionally, we determine the mean and median number of subjects involved in the experiments.

median(data$subject.number, na.rm=TRUE)
## [1] 21
mean(data$subject.number, na.rm=TRUE)
## [1] 31.16923

Finally, we identify all studies for which the subject number is NA.

data %>% 
  filter(is.na(subject.number)) %>% 
  select(id, subject.number, subject.type)
## # A tibble: 2 x 3
##      id subject.number subject.type 
##   <dbl>          <int> <chr>        
## 1     6             NA Students     
## 2    53             NA Practitioners

Analysis

The attribute group Analysis contains all attributes that characterize how the sample of primary studies analyzed the data obtained from their experiments. The method represents the type of statistical test performed by the authors.

bar_chart_ordered("method", cat.method)

ggsave(file='../figures/method.pdf', device = "pdf",
       width = 14, height = 5, units = "cm")

The test.type applies only to experiments that were analyzed using a null-hypothesis significance test (method = NHST). It represents the specific test used, which in turn implies the assumptions made by the authors.

bar_chart_ordered("test.type", cat.type, TRUE, ignore="NA")

ggsave(file='../figures/test-type.pdf', device = "pdf",
       width = 14, height = 7, units = "cm")

The addressal attribute represents how the analyses addressed the threats to validity inherent to the crossover design.

d.pivot <- data %>% 
  select(all_of(cat.threats)) %>% 
  pivot_longer(cols = everything(), names_to = "threat", values_to = "addressal") %>% 
  count(threat, addressal) %>% 
  mutate(label.color = ifelse(n < 30, "white", "black"))

d.pivot %>% 
  ggplot(aes(
    factor(threat, levels=cat.threats),
    factor(addressal, levels=cat.addressal))
  ) +
    geom_tile(aes(fill=n)) +
    geom_text(aes(label=n), color=d.pivot$label.color) +
    labs(x = "Threats to Validity", y = "Types of Addressal", fill = "Count") +
    scale_x_discrete(labels=c("Maturation/\nexhaustion", "Optimal\nSequence", "Subject\nVariability", "Carryover"))

ggsave(file='../figures/addressal.pdf', device = "pdf",
       width = 14, height = 9, units = "cm")
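
The reshaping step behind the heatmap can be sketched on toy data as follows. The three rows are invented for illustration; only the column names match the threat columns of the notebook.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per experiment, one column per threat
toy <- tibble(
  Period    = c("Modeled", "Ignored", "Ignored"),
  Sequence  = c("Modeled", "Modeled", "Ignored"),
  Skill     = c("Modeled", "Modeled", "Modeled"),
  Carryover = c("Ignored", "Ignored", "Ignored")
)

# Wide to long: one row per (experiment, threat) pair,
# then one row per observed (threat, addressal) combination
toy.long <- toy %>%
  pivot_longer(cols = everything(), names_to = "threat", values_to = "addressal") %>%
  count(threat, addressal)

toy.long
```

Combinations that never occur (e.g., Skill and Ignored in the toy data) produce no row in the long table, so the corresponding tile in the heatmap stays empty rather than showing a zero.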

Finally, we identify all papers that contain a perfect addressal [1].

data %>% 
  filter(
    Period == 'Modeled', 
    Sequence == 'Modeled', 
    Skill == 'Modeled', 
    Carryover == 'Modeled', 
  )
## # A tibble: 0 x 30
## # i 30 variables: id <dbl>, Extractor <chr>, Factor-Constructs <chr>,
## #   Factor-Measurement <chr>, Response-Constructs <chr>,
## #   Response-Measurement <chr>, subject.number <int>, subject.type <chr>,
## #   method <chr>, test.type <chr>, Period <chr>, Sequence <chr>, Skill <chr>,
## #   Carryover <chr>, Washout <lgl>, availability.data <chr>, Data-URL <chr>,
## #   availability.analysis <chr>, Analysis-URL <chr>, Comment <chr>, ? <lgl>,
## #   Response <chr>, ...
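
The same condition can be written more compactly with if_all(), which avoids repeating the comparison for each threat. The following is a minimal sketch on toy data; the tibble and the vector name threats are invented for illustration.

```r
library(dplyr)

# Same names as cat.threats above, repeated to keep the sketch self-contained
threats <- c("Period", "Sequence", "Skill", "Carryover")

# Hypothetical toy data: only experiment 1 models all four threats
toy <- tibble(
  id        = c(1, 2, 3),
  Period    = c("Modeled", "Modeled", "Ignored"),
  Sequence  = c("Modeled", "Modeled", "Modeled"),
  Skill     = c("Modeled", "Stratified", "Modeled"),
  Carryover = c("Modeled", "Modeled", "Modeled")
)

# Keep the rows in which every threat column equals "Modeled"
perfect <- toy %>%
  filter(if_all(all_of(threats), ~ .x == "Modeled"))

perfect$id
```

This variant requires a dplyr version that provides if_all(); the explicit four-condition filter above works without it.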

Material

The material attributes represent how available the material (data and analysis scripts) is. Note that these attributes are analyzed per publication, not per experiment, since the material in our sample was always provided per publication rather than per experiment.

bar_chart_ordered("availability.data", cat.availability, per_study = TRUE)

ggsave(file='../figures/availability-data.pdf', device = "pdf",
       width = 14, height = 6, units = "cm")

bar_chart_ordered("availability.analysis", cat.availability, per_study = TRUE)

ggsave(file='../figures/availability-analysis.pdf', device = "pdf",
       width = 14, height = 6, units = "cm")
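
The difference between counting per experiment and counting per publication can be sketched on toy data as follows. The tibble is invented for illustration and assumes a publication (id 14) that reports two experiments sharing one data set.

```r
library(dplyr)

# Hypothetical toy data: publication 14 contributes two experiments
toy <- tibble(
  id = c(14, 14, 48),
  availability.data = c("Archived", "Archived", "Unavailable")
)

# Per experiment: every row counts, so publication 14 is counted twice
per.experiment <- toy %>%
  count(availability.data)

# Per publication: duplicate (id, value) pairs are collapsed before counting
per.study <- toy %>%
  distinct(id, availability.data) %>%
  count(availability.data)

per.experiment
per.study
```

Collapsing on the pair of id and value rather than on id alone means that a publication with differing values across its experiments would still be counted once per distinct value.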

  [1] Perfect addressal means that all four threats to validity are modeled.