1 Introduction

During spring 2021 the Rise 2 Flow study was replicated in yet another experiment E2. The setup and execution was very similar to the first instance (E1) of the experiment, which took place during 2020. In the data/ directory one can find data for both instances.

The purpose of this document is to show how the analysis of the experiments was done. With this replication package anyone can download and rerun the analysis, and check our assumptions.

The experiment used a number of instruments during the entry, exit, and weekly sessions (additionally personal data was collected during the entry session).1

For entry and exit sessions six instruments were used:

For the weekly sessions the WHO-5 well-being index was used (WHO-5).8

To summarize, the experiments were executed according to,

  1. An entry survey, at time \(t_0\) (MAAS, etc.) with personal data
  2. A weekly survey, at time \(t_{\text{w}_{1}}, \ldots, t_{\text{w}_{n}}\)
  3. Daily surveys, at time \(t_{\text{d}_{1}}, \ldots, t_{\text{d}_{n}}\)
  4. An exit survey, at time \(t_{1}\) (MAAS, etc.)

The main question the study tries to answer is: Is there a difference in the responses (i.e., higher/lower values on responses), across all or among some instruments, over time?

1.1 Data cleaning

In order to answer the above question we can first of all compare outcomes at \(t_0\) and \(t_1\), and condition on personal data, e.g., age and gender. Next, the weekly and daily questions could be used to analyze the trend over time, and perhaps see where a noticeable effect, if it exists, shows up.

1.1.1 Entry and exit instruments

d <- read.csv(here("data/cleanedData.csv"), row.names = 1)
dim(d)
## [1] 187  85

In total, we have 187 rows and 85 columns; so not bad at all actually (\(N=187\)). Each row corresponds to one subject answering either the entry (\(t_0\)) or the exit (\(t_1\)) survey. In some cases we have subjects that have only answered one of them!

table(table(d$ID) >1)
## 
## FALSE  TRUE 
##   105    41

So, \(105\) subjects filled out the survey once, and \(41\) filled out the entry and exit surveys, i.e., in total \(146\) subjects participated in both experiments.

There should not be more than \(>2\) rows per subject and it should be encoded as an integer for reasons of anonymity.

max(table(d$ID))
## [1] 2
typeof(d$ID)
## [1] "integer"

For each column (variable) we check which values it has and then code it correctly,

# MAAS instrument
for(j in 2:16){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Almost never", "Very infrequently", "Somewhat infrequently", "Somewhat frequently", "Very frequently", "Almost always"), ordered = TRUE))
}

# SPANE instrument
for(j in 17:28){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Very rarely or never", "Rarely", "Sometimes", "Often", "Very often or always"), ordered = TRUE))
}

# PWB instrument
for(j in 29:36){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Strongly disagree", "Disagree", "Slightly disagree", "Mixed or neither agree nor disagree", "Slightly agree", "Agree", "Strongly agree"), ordered = TRUE))
}

# PST instrument
for(i in 37:58)
  d[,eval(i)] <- ifelse(d[eval(i)] == 'Yes', 1, 0)

# SEQ instrument
for(j in 59:68){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Not true","Hardly true","Rather true","Exactly true"), ordered = TRUE))
}

# PP instrument
for(j in 69:75){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("None of the time","A little of the time","Some of the time","Most of the time","All of the time"), ordered = TRUE))
}
for(j in 76:78){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("1","2","3","4","5","6","7","8","9","10"), ordered = TRUE))
}
for(j in 79:79){
  set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("You were a lot worse than other workers","You were a little worse than other workers","You were somewhat worse than other workers","You were about average","You were somewhat better than other workers","You were a little better than other workers","You were a lot better than other workers"), ordered = TRUE))
}

# make sure Age is standardized
d$Age <- scale(d$Age)

To summarize, in the first column we have subject ID, then the MAAS instrument (\(15\) questions), SPANE (\(12\)), PWB (\(8\)), PST (\(22\)), SE (\(10\)), PP (\(11\)), age (standardized), gender (\(1/2\)), occupation (student or non-students, \(1/2\)), living condition (four unordered categories), temporal variable (\(0/1\) for \(t_0 / t_1\)), and experiment (\(1/2\) for first or second experiment).

1.1.2 Weekly and daily data

First, load the weekly data set.

# drop first index column with `row.names = 1`
weekly <- read.csv(here("data/weekly.csv"), row.names = 1)

str(weekly)
## 'data.frame':    456 obs. of  8 variables:
##  $ ID        : int  1 1 1 1 2 2 3 3 3 4 ...
##  $ WQ1_1_6   : int  4 2 4 4 5 4 2 2 3 4 ...
##  $ WQ2_1_6   : int  4 2 4 4 4 4 2 2 2 2 ...
##  $ WQ3_1_6   : int  4 4 4 4 4 5 4 5 5 4 ...
##  $ WQ4_1_6   : int  3 3 4 4 4 5 5 5 4 2 ...
##  $ WQ5_1_6   : int  2 4 5 5 2 4 5 6 6 4 ...
##  $ experiment: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Seq       : int  1 2 3 4 1 2 1 2 3 1 ...

We have 456 rows and 8 columns. The first column contains a unique ID for each subject. Columns \(2\)\(6\) are the \(5\) questions asked (Likert scale \(1,\ldots,6\)), Column \(7\) indicates if they were part of the 1st (0) or 2nd (1) experiment. Finally, the last column is a sequence indicator, i.e., the first subject answered the questions four times, \(1,\ldots,4\), the second subject twice, and so on.

Let’s turn our attention to the daily data set,

daily <- read.csv(here("data/daily.csv"), row.names = 1)

str(daily)
## 'data.frame':    1646 obs. of  4 variables:
##  $ ID        : int  1 1 1 1 1 2 2 2 2 2 ...
##  $ resp      : int  8 7 3 5 6 6 7 8 6 7 ...
##  $ experiment: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Seq       : int  1 2 3 4 5 1 2 3 4 5 ...

We have 1646 rows and 4 columns. The first column is the subject’s ID. The response to the daily question comes next (Likert scale \(1,\ldots,10\)). After that we have an indicator which experiment the subject belongs to and a sequence indicator, as was the case in the weekly data set.

2 Model designs

In Appendix 3 a comparison of different likelihoods can be found.

2.1 weekly data

If we take a random sample of \(32\) subjects we will see how complicated it will be to analyze this. For the first question we have in the weekly survey instrument, for each of the \(32\) subjects, we’ve plotted answers (dots) they provided on a \(1,\ldots,6\) Likert scale; then we’ve drawn a linear model for what we’ve got.

weekly |>
  filter(ID %in% sample(x = max(weekly$ID), size = 32)) %>%
  ggplot(aes(x = Seq, y = WQ1_1_6)) +
  geom_point() +
  scale_x_continuous(breaks = seq(from = 2, to = 12, by = 2)) +
  scale_y_continuous(breaks = c(2,4,6)) +
  stat_smooth(method = "lm", se = F) +
  ylab("Question 1 (Likert 1 to 6)") +
  xlab("Week number") +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20)) +
  facet_wrap( ~ ID, ncol = 4)
## `geom_smooth()` using formula 'y ~ x'

2.1.1 WHO-5: Q1

2.1.1.1 Prior predictive checks

We want to model the weekly instrument and how it varies over time \(t\) (in our case the sequence variable Seq). In this case, \(t\) will be modeled using a Gaussian Process and then each individual in the dataset will be modeled using a varying intercept (in addition we’ll model \(\sigma\), called disc below, for each experimental session).

An information theoretical comparison using LOO showed that a model with varying intercepts for each individual had considerably better relative out of sample performance with \(\Delta\text{elpd} = -77.1\) and \(\Delta\text{SE}=11.2\), i.e., assuming a \(99\%\) \(z\)-score of \(2.576\) we have \(\text{CI}_{95\%}[-106;-48]\).

m_weekly_1 <- brm(
  bf(WQ1_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 2 finished in 0.5 seconds.
## Chain 3 finished in 0.4 seconds.
## Chain 4 finished in 0.4 seconds.
## Chain 1 finished in 0.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.5 seconds.
## Total execution time: 0.9 seconds.
pp_check(m_weekly_1,
         type = "bars",
         nsamples = 200)

The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.

2.1.1.2 Sampling and diagnostics

m_weekly_1 <- brm(
  bf(WQ1_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  # sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0,
  control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 1 finished in 45.0 seconds.
## Chain 3 finished in 45.5 seconds.
## Chain 2 finished in 62.8 seconds.
## Chain 4 finished in 63.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 54.2 seconds.
## Total execution time: 63.5 seconds.

Let’s have a look at some diagnostics.

# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_1)$fit)
## 
## Divergences:
## 0 of 4000 iterations ended with a divergence.
## 
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
## 
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.

2.1.1.3 Posterior predictive checks

tibble(Experiment     = c("1","2"),
       mu    = c(0, fixef(m_weekly_1)["experiment", 1]),
       sigma = c(1, 1 / (exp(fixef(m_weekly_1)["disc_experiment", 1])))) %>% 
  expand(nesting(Experiment, mu, sigma),
         y = seq(from = -5, to = 5, by = 0.1)) %>% 
  mutate(d = dnorm(y, mu, sigma)) %>% 
  
  ggplot(aes(x = y, y = d, fill = Experiment)) +
  geom_area(alpha = 2/3) +
  geom_vline(xintercept = fixef(m_weekly_1)[1:5, 1], linetype = 3, color = sl[9]) +
  scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
  scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_1)[1:5, 1] %>% as.double(), 
                                         labels = parse(text = str_c("theta[", 1:5, "]")))) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  labs(title = "Underlying latent scale for Q1 in \nExperiment 1 and 2", x = NULL) +
  theme_light() +
  theme(axis.ticks.x.top = element_blank(),
        text = element_text(size = 20))

post <- posterior_samples(m_weekly_1)

post <-
  post %>%
  select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
  mutate(iter = 1:n())

means <-
    post %>% 
    summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>% 
    pivot_longer(everything(),
                 values_to = "mean")

post %>% 
    gather(name, threshold, -iter) %>% 
    group_by(iter) %>% 
    mutate(theta_bar = mean(threshold)) %>% 
    
    ggplot(aes(x = threshold, y = theta_bar, color = name)) +
    geom_vline(data = means,
               aes(xintercept = mean, color = name),
               linetype = 2) +
    geom_point(alpha = 1/10) +
    scale_color_scico_d(palette = "lajolla", begin = .25) +
    ylab("mean threshold") +
    theme_light() +
    theme(legend.position = "none",
          text = element_text(size = 20))

p <- pp_check(m_weekly_1, type = "ecdf_overlay", nsamples = 50)

p + theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

p <- pp_check(m_weekly_1, type = "bars_grouped", nsamples = 100, 
              group = "experiment") +
  scale_x_continuous("y", breaks = 1:6) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  ggtitle("Data with posterior predictions",
          subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
  theme(legend.background = element_blank(),
        legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

post %>% 
    select(-iter) %>% 
    mutate_all(.funs = ~pnorm(. ,0, 1)) %>% 
    transmute(`p[Y==1]` = `b_Intercept[1]`,
              `p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
              `p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
              `p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
              `p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
              
              `p[Y==6]` = 1 - `b_Intercept[5]`) %>% 
    set_names(1:6) %>% 
    pivot_longer(everything(), names_to = "Y") %>% 
    
    ggplot(aes(x = value, y = Y)) +
    stat_pointinterval(point_interval = mode_hdi, .width = .95, 
                 fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
    scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
                       breaks = 0:5 / 5,
                       expand = c(0, 0), limits = c(0, 1)) +
  theme_light() +
  theme(text = element_text(size = 20))

p <- conditional_effects(m_weekly_1)

plot(p, plot = FALSE)[[2]] +
  xlab("Week number") + 
  ylab("Question 1") +
  scale_x_continuous(breaks= seq(1:12)) +
  scale_y_continuous(breaks= c(3.5, 4, 4.5, 5)) +
  theme_light() +
  theme(text = element_text(size = 20))

2.1.2 WHO-5: Q2

2.1.2.1 Prior predictive checks

m_weekly_2 <- brm(
  bf(WQ2_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 1 finished in 0.4 seconds.
## Chain 2 finished in 0.4 seconds.
## Chain 3 finished in 0.4 seconds.
## Chain 4 finished in 0.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.4 seconds.
## Total execution time: 0.5 seconds.
pp_check(m_weekly_2,
         type = "bars",
         nsamples = 200)

The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.

2.1.2.2 Sampling and diagnostics

m_weekly_2 <- brm(
  bf(WQ2_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  # sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0,
  control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 3 finished in 42.7 seconds.
## Chain 1 finished in 44.0 seconds.
## Chain 2 finished in 44.6 seconds.
## Chain 4 finished in 49.1 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 45.1 seconds.
## Total execution time: 49.3 seconds.

Let’s have a look at some diagnostics.

# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_2)$fit)
## 
## Divergences:
## 0 of 4000 iterations ended with a divergence.
## 
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
## 
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.

2.1.2.3 Posterior predictive checks

tibble(X     = LETTERS[1:2],
       mu    = c(0, fixef(m_weekly_2)["experiment", 1]),
       sigma = c(1, 1 / (exp(fixef(m_weekly_2)["disc_experiment", 1])))) %>% 
  expand(nesting(X, mu, sigma),
         y = seq(from = -5, to = 5, by = 0.1)) %>% 
  mutate(d = dnorm(y, mu, sigma)) %>% 
  
  ggplot(aes(x = y, y = d, fill = X)) +
  geom_area(alpha = 2/3) +
  geom_vline(xintercept = fixef(m_weekly_2)[1:5, 1], linetype = 3, color = sl[9]) +
  scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
  scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_2)[1:5, 1] %>% as.double(), 
                                         labels = parse(text = str_c("theta[", 1:5, "]")))) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  labs(title = "Underlying latent scale for our outome Q2, \n given experimental session X", x = NULL) +
  theme_light() +
  theme(axis.ticks.x.top = element_blank(),
        text = element_text(size = 20))

post <- posterior_samples(m_weekly_2)

post <-
  post %>%
  select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
  mutate(iter = 1:n())

means <-
    post %>% 
    summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>% 
    pivot_longer(everything(),
                 values_to = "mean")

post %>% 
    gather(name, threshold, -iter) %>% 
    group_by(iter) %>% 
    mutate(theta_bar = mean(threshold)) %>% 
    
    ggplot(aes(x = threshold, y = theta_bar, color = name)) +
    geom_vline(data = means,
               aes(xintercept = mean, color = name),
               linetype = 2) +
    geom_point(alpha = 1/10) +
    scale_color_scico_d(palette = "lajolla", begin = .25) +
    ylab("mean threshold") +
    theme_light() +
    theme(legend.position = "none",
          text = element_text(size = 20))

p <- pp_check(m_weekly_2, type = "ecdf_overlay", nsamples = 50)

p + theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

p <- pp_check(m_weekly_2, type = "bars_grouped", nsamples = 100, 
              group = "experiment") +
  scale_x_continuous("y", breaks = 1:6) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  ggtitle("Data with posterior predictions",
          subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
  theme(legend.background = element_blank(),
        legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

post %>% 
    select(-iter) %>% 
    mutate_all(.funs = ~pnorm(. ,0, 1)) %>% 
    transmute(`p[Y==1]` = `b_Intercept[1]`,
              `p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
              `p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
              `p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
              `p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
              
              `p[Y==6]` = 1 - `b_Intercept[5]`) %>% 
    set_names(1:6) %>% 
    pivot_longer(everything(), names_to = "Y") %>% 
    
    ggplot(aes(x = value, y = Y)) +
    stat_pointinterval(point_interval = mode_hdi, .width = .95, 
                 fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
    scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
                       breaks = 0:5 / 5,
                       expand = c(0, 0), limits = c(0, 1)) +
  theme_light() +
  theme(text = element_text(size = 20))

p <- conditional_effects(m_weekly_2)

plot(p, plot = FALSE)[[2]] +
  xlab("Week number") + 
  ylab("Question 2") +
  scale_x_continuous(breaks= seq(1:12)) +
  scale_y_continuous(breaks= c(3.5, 4, 4.5, 5)) +
  theme_light() +
  theme(text = element_text(size = 20))

2.1.3 WHO-5: Q3

2.1.3.1 Prior predictive checks

m_weekly_3 <- brm(
  bf(WQ3_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 1 finished in 0.5 seconds.
## Chain 2 finished in 0.5 seconds.
## Chain 3 finished in 0.5 seconds.
## Chain 4 finished in 0.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.5 seconds.
## Total execution time: 0.6 seconds.
pp_check(m_weekly_3,
         type = "bars",
         nsamples = 200)

The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.

2.1.3.2 Sampling and diagnostics

m_weekly_3 <- brm(
  bf(WQ3_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  # sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0,
  control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 2 finished in 43.7 seconds.
## Chain 4 finished in 44.2 seconds.
## Chain 3 finished in 47.6 seconds.
## Chain 1 finished in 47.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 45.8 seconds.
## Total execution time: 47.8 seconds.

Let’s have a look at some diagnostics.

# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_3)$fit)
## 
## Divergences:
## 0 of 4000 iterations ended with a divergence.
## 
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
## 
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.

2.1.3.3 Posterior predictive checks

tibble(X     = LETTERS[1:2],
       mu    = c(0, fixef(m_weekly_3)["experiment", 1]),
       sigma = c(1, 1 / (exp(fixef(m_weekly_3)["disc_experiment", 1])))) %>% 
  expand(nesting(X, mu, sigma),
         y = seq(from = -5, to = 5, by = 0.1)) %>% 
  mutate(d = dnorm(y, mu, sigma)) %>% 
  
  ggplot(aes(x = y, y = d, fill = X)) +
  geom_area(alpha = 2/3) +
  geom_vline(xintercept = fixef(m_weekly_3)[1:5, 1], linetype = 3, color = sl[9]) +
  scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
  scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_3)[1:5, 1] %>% as.double(), 
                                         labels = parse(text = str_c("theta[", 1:5, "]")))) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  labs(title = "Underlying latent scale for our outome Q3, \n given experimental session X", x = NULL) +
  theme_light() +
  theme(axis.ticks.x.top = element_blank(),
        text = element_text(size = 20))

post <- posterior_samples(m_weekly_3)

post <-
  post %>%
  select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
  mutate(iter = 1:n())

means <-
    post %>% 
    summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>% 
    pivot_longer(everything(),
                 values_to = "mean")

post %>% 
    gather(name, threshold, -iter) %>% 
    group_by(iter) %>% 
    mutate(theta_bar = mean(threshold)) %>% 
    
    ggplot(aes(x = threshold, y = theta_bar, color = name)) +
    geom_vline(data = means,
               aes(xintercept = mean, color = name),
               linetype = 2) +
    geom_point(alpha = 1/10) +
    scale_color_scico_d(palette = "lajolla", begin = .25) +
    ylab("mean threshold") +
    theme_light() +
    theme(legend.position = "none",
          text = element_text(size = 20))

p <- pp_check(m_weekly_3, type = "ecdf_overlay", nsamples = 50)

p + theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

p <- pp_check(m_weekly_3, type = "bars_grouped", nsamples = 100, 
              group = "experiment") +
  scale_x_continuous("y", breaks = 1:6) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  ggtitle("Data with posterior predictions",
          subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
  theme(legend.background = element_blank(),
        legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

post %>% 
    select(-iter) %>% 
    mutate_all(.funs = ~pnorm(. ,0, 1)) %>% 
    transmute(`p[Y==1]` = `b_Intercept[1]`,
              `p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
              `p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
              `p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
              `p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
              
              `p[Y==6]` = 1 - `b_Intercept[5]`) %>% 
    set_names(1:6) %>% 
    pivot_longer(everything(), names_to = "Y") %>% 
    
    ggplot(aes(x = value, y = Y)) +
    stat_pointinterval(point_interval = mode_hdi, .width = .95, 
                 fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
    scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
                       breaks = 0:5 / 5,
                       expand = c(0, 0), limits = c(0, 1)) +
  theme_light() +
  theme(text = element_text(size = 20))

p <- conditional_effects(m_weekly_3)

plot(p, plot = FALSE)[[2]] +
  xlab("Week number") + 
  ylab("Question 3") +
  scale_x_continuous(breaks= seq(1:12)) +
  scale_y_continuous(breaks= c(3, 3.5, 4, 4.5, 5)) +
  theme_light() +
  theme(text = element_text(size = 20))

2.1.4 WHO-5: Q4

2.1.4.1 Prior predictive checks

m_weekly_4 <- brm(
  bf(WQ4_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 1 finished in 0.5 seconds.
## Chain 3 finished in 0.5 seconds.
## Chain 4 finished in 0.4 seconds.
## Chain 2 finished in 0.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.5 seconds.
## Total execution time: 0.6 seconds.
pp_check(m_weekly_4,
         type = "bars",
         nsamples = 200)

The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.

2.1.4.2 Sampling and diagnostics

m_weekly_4 <- brm(
  bf(WQ4_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  # sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0,
  control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 4 finished in 36.9 seconds.
## Chain 3 finished in 38.4 seconds.
## Chain 2 finished in 38.8 seconds.
## Chain 1 finished in 41.1 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 38.8 seconds.
## Total execution time: 41.2 seconds.

Let’s have a look at some diagnostics.

# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_4)$fit)
## 
## Divergences:
## 0 of 4000 iterations ended with a divergence.
## 
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
## 
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.

2.1.4.3 Posterior predictive checks

tibble(X     = LETTERS[1:2],
       mu    = c(0, fixef(m_weekly_4)["experiment", 1]),
       sigma = c(1, 1 / (exp(fixef(m_weekly_4)["disc_experiment", 1])))) %>% 
  expand(nesting(X, mu, sigma),
         y = seq(from = -5, to = 5, by = 0.1)) %>% 
  mutate(d = dnorm(y, mu, sigma)) %>% 
  
  ggplot(aes(x = y, y = d, fill = X)) +
  geom_area(alpha = 2/3) +
  geom_vline(xintercept = fixef(m_weekly_4)[1:5, 1], linetype = 3, color = sl[9]) +
  scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
  scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_4)[1:5, 1] %>% as.double(), 
                                         labels = parse(text = str_c("theta[", 1:5, "]")))) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  labs(title = "Underlying latent scale for our outome Q4, \n given experimental session X", x = NULL) +
  theme_light() +
  theme(axis.ticks.x.top = element_blank(),
        text = element_text(size = 20))

post <- posterior_samples(m_weekly_4)

post <-
  post %>%
  select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
  mutate(iter = 1:n())

means <-
    post %>% 
    summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>% 
    pivot_longer(everything(),
                 values_to = "mean")

post %>% 
    gather(name, threshold, -iter) %>% 
    group_by(iter) %>% 
    mutate(theta_bar = mean(threshold)) %>% 
    
    ggplot(aes(x = threshold, y = theta_bar, color = name)) +
    geom_vline(data = means,
               aes(xintercept = mean, color = name),
               linetype = 2) +
    geom_point(alpha = 1/10) +
    scale_color_scico_d(palette = "lajolla", begin = .25) +
    ylab("mean threshold") +
    theme_light() +
    theme(legend.position = "none",
          text = element_text(size = 20))

p <- pp_check(m_weekly_4, type = "ecdf_overlay", nsamples = 50)

p + theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

p <- pp_check(m_weekly_4, type = "bars_grouped", nsamples = 100, 
              group = "experiment") +
  scale_x_continuous("y", breaks = 1:6) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  ggtitle("Data with posterior predictions",
          subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
  theme(legend.background = element_blank(),
        legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

post %>% 
    select(-iter) %>% 
    mutate_all(.funs = ~pnorm(. ,0, 1)) %>% 
    transmute(`p[Y==1]` = `b_Intercept[1]`,
              `p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
              `p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
              `p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
              `p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
              
              `p[Y==6]` = 1 - `b_Intercept[5]`) %>% 
    set_names(1:6) %>% 
    pivot_longer(everything(), names_to = "Y") %>% 
    
    ggplot(aes(x = value, y = Y)) +
    stat_pointinterval(point_interval = mode_hdi, .width = .95, 
                 fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
    scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
                       breaks = 0:5 / 5,
                       expand = c(0, 0), limits = c(0, 1)) +
  theme_light() +
  theme(text = element_text(size = 20))

p <- conditional_effects(m_weekly_4)

plot(p, plot = FALSE)[[2]] +
  xlab("Week number") + 
  ylab("Question 4") +
  scale_x_continuous(breaks= seq(1:12)) +
  scale_y_continuous(breaks= c(3, 3.5, 4, 4.5, 5)) +
  theme_light() +
  theme(text = element_text(size = 20))

2.1.5 WHO-5: Q5

2.1.5.1 Prior predictive checks

m_weekly_5 <- brm(
  bf(WQ5_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 1 finished in 0.4 seconds.
## Chain 2 finished in 0.4 seconds.
## Chain 3 finished in 0.3 seconds.
## Chain 4 finished in 0.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.4 seconds.
## Total execution time: 0.5 seconds.
pp_check(m_weekly_5,
         type = "bars",
         nsamples = 200)

The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.

2.1.5.2 Sampling and diagnostics

m_weekly_5 <- brm(
  bf(WQ5_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = weekly,
  family = cumulative(probit),
  # sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0,
  control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 2 finished in 48.9 seconds.
## Chain 1 finished in 55.1 seconds.
## Chain 3 finished in 55.1 seconds.
## Chain 4 finished in 78.1 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 59.3 seconds.
## Total execution time: 78.3 seconds.

Let’s have a look at some diagnostics.

# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_5)$fit)
## 
## Divergences:
## 0 of 4000 iterations ended with a divergence.
## 
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
## 
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.

2.1.5.3 Posterior predictive checks

tibble(X     = LETTERS[1:2],
       mu    = c(0, fixef(m_weekly_5)["experiment", 1]),
       sigma = c(1, 1 / (exp(fixef(m_weekly_5)["disc_experiment", 1])))) %>% 
  expand(nesting(X, mu, sigma),
         y = seq(from = -5, to = 5, by = 0.1)) %>% 
  mutate(d = dnorm(y, mu, sigma)) %>% 
  
  ggplot(aes(x = y, y = d, fill = X)) +
  geom_area(alpha = 2/3) +
  geom_vline(xintercept = fixef(m_weekly_5)[1:5, 1], linetype = 3, color = sl[9]) +
  scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
  scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_5)[1:5, 1] %>% as.double(), 
                                         labels = parse(text = str_c("theta[", 1:5, "]")))) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  labs(title = "Underlying latent scale for our outome Q5, \n given experimental session X", x = NULL) +
  theme_light() +
  theme(axis.ticks.x.top = element_blank(),
        text = element_text(size = 20))

post <- posterior_samples(m_weekly_5)

post <-
  post %>%
  select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
  mutate(iter = 1:n())

means <-
    post %>% 
    summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>% 
    pivot_longer(everything(),
                 values_to = "mean")

post %>% 
    gather(name, threshold, -iter) %>% 
    group_by(iter) %>% 
    mutate(theta_bar = mean(threshold)) %>% 
    
    ggplot(aes(x = threshold, y = theta_bar, color = name)) +
    geom_vline(data = means,
               aes(xintercept = mean, color = name),
               linetype = 2) +
    geom_point(alpha = 1/10) +
    scale_color_scico_d(palette = "lajolla", begin = .25) +
    ylab("mean threshold") +
    theme_light() +
    theme(legend.position = "none",
          text = element_text(size = 20))

p <- pp_check(m_weekly_5, type = "ecdf_overlay", nsamples = 50)

p + theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

p <- pp_check(m_weekly_5, type = "bars_grouped", nsamples = 100, 
              group = "experiment") +
  scale_x_continuous("y", breaks = 1:6) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  ggtitle("Data with posterior predictions",
          subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
  theme(legend.background = element_blank(),
        legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

post %>% 
    select(-iter) %>% 
    mutate_all(.funs = ~pnorm(. ,0, 1)) %>% 
    transmute(`p[Y==1]` = `b_Intercept[1]`,
              `p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
              `p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
              `p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
              `p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
              `p[Y==6]` = 1 - `b_Intercept[5]`) %>% 
    set_names(1:6) %>% 
    pivot_longer(everything(), names_to = "Y") %>% 
    
    ggplot(aes(x = value, y = Y)) +
    stat_pointinterval(point_interval = mode_hdi, .width = .95, 
                 fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
    scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
                       breaks = 0:5 / 5,
                       expand = c(0, 0), limits = c(0, 1)) +
  theme_light() +
  theme(text = element_text(size = 20))

p <- conditional_effects(m_weekly_5)

plot(p, plot = FALSE)[[2]] +
  xlab("Week number") + 
  ylab("Question 5") +
  scale_x_continuous(breaks= seq(1:12)) +
  scale_y_continuous(breaks= c(3, 3.5, 4, 4.5, 5)) +
  theme_light() +
  theme(text = element_text(size = 20))

2.1.6 Model estimates

Compare the above with the two below plots where we have separated the effect according to the two experimental sessions.

## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 22.6 seconds.
## Chain 2 finished in 23.2 seconds.
## Chain 3 finished in 24.3 seconds.
## Chain 4 finished in 25.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 23.9 seconds.
## Total execution time: 26.1 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 19.3 seconds.
## Chain 3 finished in 20.6 seconds.
## Chain 1 finished in 21.2 seconds.
## Chain 2 finished in 25.9 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 21.7 seconds.
## Total execution time: 26.0 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 27.6 seconds.
## Chain 4 finished in 28.6 seconds.
## Chain 3 finished in 29.4 seconds.
## Chain 2 finished in 39.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 31.4 seconds.
## Total execution time: 40.0 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 24.5 seconds.
## Chain 2 finished in 25.7 seconds.
## Chain 1 finished in 26.0 seconds.
## Chain 3 finished in 29.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 26.5 seconds.
## Total execution time: 30.0 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 24.3 seconds.
## Chain 4 finished in 26.9 seconds.
## Chain 1 finished in 28.7 seconds.
## Chain 3 finished in 37.0 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 29.2 seconds.
## Total execution time: 37.1 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 17.3 seconds.
## Chain 1 finished in 17.5 seconds.
## Chain 4 finished in 17.5 seconds.
## Chain 3 finished in 18.7 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 17.7 seconds.
## Total execution time: 18.8 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 22.1 seconds.
## Chain 3 finished in 25.2 seconds.
## Chain 1 finished in 26.4 seconds.
## Chain 4 finished in 28.9 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 25.6 seconds.
## Total execution time: 29.1 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 15.4 seconds.
## Chain 1 finished in 15.8 seconds.
## Chain 3 finished in 16.5 seconds.
## Chain 4 finished in 17.2 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 16.2 seconds.
## Total execution time: 17.3 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 28.5 seconds.
## Chain 1 finished in 30.5 seconds.
## Chain 4 finished in 32.3 seconds.
## Chain 3 finished in 32.9 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 31.0 seconds.
## Total execution time: 33.0 seconds.
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 17.9 seconds.
## Chain 1 finished in 22.8 seconds.
## Chain 2 finished in 24.3 seconds.
## Chain 3 finished in 24.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 22.4 seconds.
## Total execution time: 24.6 seconds.

2.2 daily data

Recall, the daily data set consists of the following variables,

str(daily)
## 'data.frame':    1646 obs. of  4 variables:
##  $ ID        : int  1 1 1 1 1 2 2 2 2 2 ...
##  $ resp      : int  8 7 3 5 6 6 7 8 6 7 ...
##  $ experiment: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Seq       : int  1 2 3 4 5 1 2 3 4 5 ...

where ID is the subject, resp is the response (Likert \(1,\ldots,10\)), experiment is the session the subject belongs to, and, finally, Seq is the order each subject answered the question.

The response, resp, when plotted, looks like this,

hist(daily$resp, main = "Histogram of responses to daily question")

and the maximum and minimum number of responses (days) a subject provided was,

max(daily$Seq)
## [1] 84
min(daily$Seq)
## [1] 1

If we take a random sample of \(32\) subjects we will see how complicated it will be to analyze this. For the question we have in the daily survey instrument, for each of the \(32\) subjects, we’ve plotted answers (dots) they provided on a \(1,\ldots,10\) Likert scale; then we’ve drawn a linear model for what we’ve got.

daily |>
  filter(ID %in% sample(x = max(daily$ID), size = 32)) %>%
  ggplot(aes(x = Seq, y = resp)) +
  geom_point() +
  scale_x_continuous(breaks = c(1,20,40,60,80)) +
  scale_y_continuous(breaks = c(1,5,10)) +
  stat_smooth(method = "lm", se = F) +
  ylab("Daily question") +
  xlab("Days") +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20)) +
  facet_wrap( ~ ID, ncol = 4)
## `geom_smooth()` using formula 'y ~ x'

2.2.1 Prior predictive checks

We want to model the daily instrument and how it varies over time \(t\) (in our case the sequence variable Seq). In this case, \(t\) will be modeled using a Gaussian Process and then each individual in the dataset will be modeled using a varying intercept (in addition we’ll model \(\sigma\), called disc below, for each experimental session).

An information theoretical comparison using LOO showed that a model with varying intercepts for each individual had considerably better relative out of sample performance with \(\Delta\text{elpd} = -77.1\) and \(\Delta\text{SE}=11.2\), i.e., assuming a \(99\%\) \(z\)-score of \(2.576\) we have \(\text{CI}_{95\%}[-106;-48]\).

m_daily <- brm(
  bf(resp ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = daily,
  family = cumulative(probit),
  sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0, control = list(adapt_delta=0.95)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 1 finished in 0.8 seconds.
## Chain 2 finished in 0.8 seconds.
## Chain 3 finished in 0.9 seconds.
## Chain 4 finished in 0.9 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.9 seconds.
## Total execution time: 1.0 seconds.
pp_check(m_daily,
         type = "bars",
         nsamples = 200)

The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.

2.2.2 Sampling and diagnostics

m_daily <- brm(
  bf(resp ~ 1 + gp(Seq) + experiment + (1 | ID)) +
    lf(disc ~ 0 + experiment, cmc = FALSE),
  data = daily,
  family = cumulative(probit),
  # sample_prior = "only",
  prior = c(
    prior(normal(0, 2.5), class = Intercept),
    prior(normal(0, 2.5), class = b),
    prior(normal(0, 1), class = b, dpar = disc),
    prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
    prior(weibull(2, 1), class = sdgp),
    prior(weibull(2, 1), class = sd)
  ),
  silent = TRUE, refresh = 0, init = 0,
  control = list(adapt_delta = 0.99, max_treedepth=12)
)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 3 finished in 491.8 seconds.
## Chain 1 finished in 497.1 seconds.
## Chain 4 finished in 589.4 seconds.
## Chain 2 finished in 729.8 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 577.1 seconds.
## Total execution time: 730.0 seconds.

Let’s have a look at some diagnostics.

# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_daily)$fit)
## 
## Divergences:
## 0 of 4000 iterations ended with a divergence.
## 
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
## 
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.

2.2.3 Posterior predictive checks

tibble(X     = LETTERS[1:2],
       mu    = c(0, fixef(m_daily)["experiment", 1]),
       sigma = c(1, 1 / (exp(fixef(m_daily)["disc_experiment", 1])))) %>% 
  expand(nesting(X, mu, sigma),
         y = seq(from = -5, to = 5, by = 0.1)) %>% 
  mutate(d = dnorm(y, mu, sigma)) %>% 
  
  ggplot(aes(x = y, y = d, fill = X)) +
  geom_area(alpha = 2/3) +
  geom_vline(xintercept = fixef(m_daily)[1:9, 1], linetype = 3, color = sl[9]) +
  scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
  scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_daily)[1:9, 1] %>% as.double(), 
                                         labels = parse(text = str_c("theta[", 1:9, "]")))) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  labs(title = "Underlying latent scale for our outome, \n given experimental session X", x = NULL) +
  theme_light() +
  theme(axis.ticks.x.top = element_blank(),
        text = element_text(size = 20))

post <- posterior_samples(m_daily)

post <-
  post %>%
  select(`b_Intercept[1]`:`b_Intercept[9]`) %>%
  mutate(iter = 1:n())

means <-
    post %>% 
    summarise_at(vars(`b_Intercept[1]`:`b_Intercept[9]`), mean) %>% 
    pivot_longer(everything(),
                 values_to = "mean")

post %>% 
    gather(name, threshold, -iter) %>% 
    group_by(iter) %>% 
    mutate(theta_bar = mean(threshold)) %>% 
    
    ggplot(aes(x = threshold, y = theta_bar, color = name)) +
    geom_vline(data = means,
               aes(xintercept = mean, color = name),
               linetype = 2) +
    geom_point(alpha = 1/10) +
    scale_color_scico_d(palette = "lajolla", begin = .25) +
    ylab("mean threshold") +
    theme_light() +
    theme(legend.position = "none",
          text = element_text(size = 20))

p <- pp_check(m_daily, type = "ecdf_overlay", nsamples = 50)

p + theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

p <- pp_check(m_daily, type = "bars_grouped", nsamples = 100, 
              group = "experiment") +
  scale_x_continuous("y", breaks = 1:10) +
  scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
  ggtitle("Data with posterior predictions",
          subtitle = expression(list(italic(N[A])==1024, italic(N[B])==622))) +
  theme(legend.background = element_blank(),
        legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
  theme_light() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 20))

2.2.4 Model estimates

post %>% 
    select(-iter) %>% 
    mutate_all(.funs = ~pnorm(. ,0, 1)) %>% 
    transmute(`p[Y==1]` = `b_Intercept[1]`,
              `p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
              `p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
              `p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
              `p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
              `p[Y==6]` = `b_Intercept[6]` - `b_Intercept[5]`,
              `p[Y==7]` = `b_Intercept[7]` - `b_Intercept[6]`,
              `p[Y==8]` = `b_Intercept[8]` - `b_Intercept[7]`,
              `p[Y==9]` = `b_Intercept[9]` - `b_Intercept[8]`,
              `p[Y==10]` = 1 - `b_Intercept[9]`) %>% 
    set_names(1:10) %>% 
    pivot_longer(everything(), names_to = "Y") %>% 
    
    ggplot(aes(x = value, y = Y)) +
    stat_pointinterval(point_interval = mode_hdi, .width = .95, 
                 fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
    scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
                       breaks = 0:5 / 5,
                       expand = c(0, 0), limits = c(0, 1)) +
  scale_y_discrete(limits = factor(seq(1:10))) +
  theme_light() +
  theme(text = element_text(size = 20))

#pdf("daily.pdf", width=7.5/2.54, height=9.49/2.54)

plot(conditional_effects(m_daily), plot = FALSE)[[2]] +
  xlim(1,84) +
  ylab("Response") +
  scale_x_continuous(name="Day", 
                     limits = c(1,84), 
                     breaks = c(1,seq(10,80, by = 10),84)) +
  scale_y_continuous(name="Response", 
                     limits = c(6,8),
                     breaks = seq(6,8)) +
  theme(text = element_text(size=16))

#dev.off()

2.3 Multivariate models with temporal variable

We want a model that uses personal data (e.g., age and gender) as predictors. The idea is to later see if any of the predictors have predictive capacity for the different survey instruments.

Design models with predictors from personal data, together with our indicator t for time, experiment for experimental session, and a varying intercept ID that varies depending on subject.

2.3.1 Prior and posterior predictive checks

Let’s now sample all models with the same predictors we used above (visual prior and posterior predictive checks were done for each model).

d$Living_1_4_f <- as.factor(d$Living_1_4)
################################################################################
#
# MAAS
#
################################################################################
bf1 <- bf(mvbind(MAASQ1_1_6,
                 MAASQ2_1_6,
                 MAASQ3_1_6,
                 MAASQ4_1_6,
                 MAASQ5_1_6,
                 MAASQ6_1_6,
                 MAASQ7_1_6,
                 MAASQ8_1_6,
                 MAASQ9_1_6,
                 MAASQ10_1_6,
                 MAASQ11_1_6,
                 MAASQ12_1_6,
                 MAASQ13_1_6,
                 MAASQ14_1_6,
                 MAASQ15_1_6) ~ 
            1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4_f + t + (1 |e| Experiment_1_2) + (1 |c| ID))

# Prior predictive checks have been conducted and the probability mass 
# is distributed evenly on the outcome space
p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
  mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
  mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
  mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
  mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))

m_maas <- brm(bf1,
    family = cumulative(probit),
    data = d,
    prior = p,
    #sample_prior = "only", # use if you only want to sample from prior
    silent = TRUE,
    iter = 5e3,
    inits = "0",
    control = list(adapt_delta=0.99),
    refresh = 0)
## Running MCMC with 4 parallel chains...
## 
## Chain 4 finished in 1418.7 seconds.
## Chain 3 finished in 1419.0 seconds.
## Chain 2 finished in 1423.6 seconds.
## Chain 1 finished in 1444.3 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 1426.4 seconds.
## Total execution time: 1444.7 seconds.
# use the below to do prior and posterior predictive checks
# change resp to the response variable you want to check 
pp_check(m_maas, resp = "MAASQ116", type = "bars", nsamples = 250) 

################################################################################
#
# SPANE
#
################################################################################

bf1 <- bf(mvbind(SPANEQ1_1_5,
                 SPANEQ2_1_5,
                 SPANEQ3_1_5,
                 SPANEQ4_1_5,
                 SPANEQ5_1_5,
                 SPANEQ6_1_5,
                 SPANEQ7_1_5,
                 SPANEQ8_1_5,
                 SPANEQ9_1_5,
                 SPANEQ10_1_5,
                 SPANEQ11_1_5,
                 SPANEQ12_1_5) ~ 
            1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))

p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
  mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
  mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
  mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
  mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))

m_spane <- brm(bf1,
    family = cumulative(probit),
    data = d,
    prior = p,
    init = "0",
    iter = 8e3,
    control = list(adapt_delta=0.99),
    # sample_prior = "only", # use if you only want to sample from prior
    silent = TRUE,
    refresh = 0)
## Running MCMC with 4 parallel chains...
## 
## Chain 3 finished in 1796.1 seconds.
## Chain 2 finished in 1802.3 seconds.
## Chain 4 finished in 1811.7 seconds.
## Chain 1 finished in 1870.6 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 1820.2 seconds.
## Total execution time: 1870.9 seconds.
# use the below to do prior and posterior predictive checks
# change resp to the response variable you want to check 
# pp_check(m_spane, resp = "SPANEQ115", type = "bars", nsamples = 250) 
################################################################################
#
# PWB
#
################################################################################

bf1 <- bf(mvbind(PWBQ1_1_7,
                 PWBQ2_1_7,
                 PWBQ3_1_7,
                 PWBQ4_1_7,
                 PWBQ5_1_7,
                 PWBQ6_1_7,
                 PWBQ7_1_7,
                 PWBQ8_1_7) ~ 
            1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))

p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
  mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
  mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
  mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
  mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))

m_pwb <- brm(bf1,
    family = cumulative(probit),
    data = d,
    prior = p,
    init = "0",
    control = list(adapt_delta=0.95),
    iter = 8e3,
    # sample_prior = "only", # use if you only want to sample from prior
    silent = TRUE,
    refresh = 0)
## Running MCMC with 4 parallel chains...
## 
## Chain 1 finished in 554.1 seconds.
## Chain 3 finished in 555.9 seconds.
## Chain 4 finished in 559.4 seconds.
## Chain 2 finished in 560.3 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 557.4 seconds.
## Total execution time: 560.7 seconds.
# pp_check(m_pwb, resp = "PWBQ117", type = "bars", nsamples = 250) 
################################################################################
#
# PST
#
################################################################################
# Here we move to a bernoulli since we have 0/1 outcome
bf1 <- bf(mvbind(PSTQ1_0_1,
                 PSTQ2_0_1,
                 PSTQ3_0_1,
                 PSTQ4_0_1,
                 PSTQ5_0_1,
                 PSTQ6_0_1,
                 PSTQ7_0_1,
                 PSTQ8_0_1,
                 PSTQ9_0_1,
                 PSTQ10_0_1,
                 PSTQ11_0_1,
                 PSTQ12_0_1,
                 PSTQ13_0_1,
                 PSTQ14_0_1,
                 PSTQ15_0_1,
                 PSTQ16_0_1,
                 PSTQ17_0_1,
                 PSTQ18_0_1,
                 PSTQ19_0_1,
                 PSTQ20_0_1,
                 PSTQ21_0_1,
                 PSTQ22_0_1) ~ 
            1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))

p <- get_prior(bf1, data=d, family=bernoulli) %>%
  mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
  mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
  mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
  mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))

m_pst <- brm(bf1,
    family = bernoulli,
    data = d,
    prior = p,
    # sample_prior = "only",
    silent = TRUE,
    refresh = 0
    )
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 2 finished in 99.7 seconds.
## Chain 4 finished in 100.6 seconds.
## Chain 3 finished in 100.7 seconds.
## Chain 1 finished in 104.0 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 101.3 seconds.
## Total execution time: 104.4 seconds.
# pp_check(m_pst, resp = "PSTQ101", type = "bars", nsamples = 250) 
################################################################################
#
# SE
#
################################################################################

bf1 <- bf(mvbind(SEQ1_1_4,
                 SEQ2_1_4,
                 SEQ3_1_4,
                 SEQ4_1_4,
                 SEQ5_1_4,
                 SEQ6_1_4,
                 SEQ7_1_4,
                 SEQ8_1_4,
                 SEQ9_1_4,
                 SEQ10_1_4) ~
            1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))

p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
  mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
  mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
  mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
  mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))

m_se <- brm(bf1,
    family = cumulative(probit),
    data = d,
    prior = p,
    init = "0",
    control = list(adapt_delta = 0.99),
    iter = 5e3,
    # sample_prior = "only",
    silent = TRUE,
    refresh = 0)
## Running MCMC with 4 parallel chains...
## 
## Chain 2 finished in 1005.8 seconds.
## Chain 1 finished in 1009.0 seconds.
## Chain 3 finished in 1015.2 seconds.
## Chain 4 finished in 1033.4 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 1015.9 seconds.
## Total execution time: 1033.8 seconds.
# pp_check(m_se, resp = "SEQ314", type = "bars", nsamples = 250) 
################################################################################
#
# PPHQ
#
################################################################################

bf1 <- bf(mvbind(PPHQ1_1_5,
                 PPHQ2_1_5,
                 PPHQ3_1_5,
                 PPHQ4_1_5,
                 PPHQ5_1_5,
                 PPHQ6_1_5,
                 PPHQ7_1_5,
                 PPRQ1_1_10,
                 PPRQ2_1_10,
                 PPRQ3_1_10,
                 PPOQ1_1_7
                 ) ~ 
            1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))

p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
  mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
  mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
  mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
  mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))

m_pph <- brm(bf1,
    family = cumulative(probit),
    data = d,
    prior = p,
    init = "0",
    iter = 1e4,
    control =list(adapt_delta=0.96),
    # sample_prior = "only",
    silent = TRUE,
    refresh = 0)
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 3 finished in 992.1 seconds.
## Chain 4 finished in 993.9 seconds.
## Chain 1 finished in 997.5 seconds.
## Chain 2 finished in 1335.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 1079.7 seconds.
## Total execution time: 1335.9 seconds.
# pp_check(m_pph, resp = "PPHQ115", type = "bars", nsamples = 250) 

So we have now sampled a number of models (\(6\)), i.e., for each instrument we have one multivariate model (\(>1\) outcome), where we also estimate the correlation between the outcome given the subject’s ID.

Before we put any trust in these models we need to check a number of diagnostics.

2.3.2 Diagnostics

Hamiltonian Monte Carlo diagnostics indicate all is well (we’ve also checked traceplots, effective sample sizes, and \(\widehat{R}\) for all models).

The diagnostics code is not executed in this Rmd since it takes up a lot of space. However, if one wants to, in the index.Rmd (used to generate the index.html) one can see all diagnostics that were checked and if one wants to rerun it then simply change the code chunk to eval=TRUE and include=TRUE.

2.3.3 Parameter estimates

First, let’s summarize what we have so far:

  1. Weekly trends modeled with Gaussian Process.
  2. Daily trend modeled with Gaussian Process.
  3. Six multivariate models with the temporal variable to analyze entry/exit.

We can now investigate which, if any, parameters are significant on the 95%-level for the third item in the list, i.e., for each model, and each question, plot all parameters. Any parameters that are significant on the 95%-level could warrant further analysis, and in particular a connection to the qualitative analysis.

For each model we first analyze the temporal variable \(t\). Then we analyze the rest of the \(\beta\) parameters. Finally, where appropriate we look at the conditional effects. All densities we plot are limited to 95% probability mass, while the 50% probability mass is shaded around a vertical line, which is the median.

In short, any density that does not cross zero (the vertical line) is an effect of interest.

Each of the following models used a sample size of \(n=\) 187 for sampling.

2.3.3.1 Mindful attention awareness scale (MAAS)

The model estimated 8071 parameters.

For Questions \(1\)\(8\), \(11\)\(12\), and \(14\) the parameter \(t\) is negative. In short, the respondents answered with higher responses at \(t_0\) than at \(t_1\) for these questions.

Some examples of how the questions were phrased:9

  • Q1 “I could be experiencing some emotion and not be conscious of it until some time later.”
  • Q2 “I break or spill things because of carelessness, not paying attention, or thinking of something else.”
  • Q3 “I find it difficult to stay focused on what’s happening in the present.”
  • Q8 “I rush through activities without being really attentive to them.”
  • Q14 “I find myself doing things without paying attention.”

If we look at each unique parameter, in each question, can we see which, if any, predictors drove these results?

Much variance makes the estimates uncertain and, ultimately, no parameter is significant.

And the same applies to gender. How about occupation status?

Occupation status (student/non-student) has an effect for Q2. Let’s look at living conditions before we turn our attention to the next instrument.

Questions \(1\)\(3\) are significant, as are Q\(8\), and Q\(12\).

2.3.3.2 Scale of positive and negative experience (SPANE)

The model estimated 6227 parameters.

All \(t\) parameters are significant (positive), i.e, subjects responded with higher values at \(t_1\).

2.3.3.3 Personal well-being index (PWB)

The model estimated 4023 parameters.

All \(t\) parameters are significant, i.e., higher values at \(t_1\), except for Q\(3\).

2.3.3.4 Positive thinking (PST)

The model estimated 12365 parameters.

Concerning the temporal variable,

Questions \(4\), \(9\), \(12\), \(15\), \(17\)\(18\) are significant.

Concerning the remaining predictors,

2.3.3.5 Self efficacy (SE)

The model estimated 5071 parameters.

Concerning the temporal variable,

Questions \(6\), \(7\), and \(9\) are significant.

Concerning the remaining predictors,

2.3.3.6 Perceived productivity (PP)

The model estimated 5681 parameters.

Concerning the temporal variable,

Question \(1\) shows a significant difference when moving from \(t_0\) to \(t_1\) (lower responses at \(t_1\)).

Concerning the remaining predictors,

# (APPENDIX) Appendices {-}

3 Assumptions concerning the data-generation process

In order to design an appropriate model, one of the things we need to make assumptions about is the underlying data-generative model, i.e., a prior for the data we have, which is usually called the likelihood.

In short, there are four candidates: Cumulative, Continuation ratio, Stopping ratio, and Adjacent category. Let’s use LOO to create identical null models with default priors, and then use the same data for all models to compare them from an information theoretical perspective.

bf1 <- bf(mvbind(MAASQ1_1_6,
                 MAASQ2_1_6,
                 MAASQ3_1_6,
                 MAASQ4_1_6,
                 MAASQ5_1_6,
                 MAASQ6_1_6,
                 MAASQ7_1_6,
                 MAASQ8_1_6,
                 MAASQ9_1_6,
                 MAASQ10_1_6,
                 MAASQ11_1_6,
                 MAASQ12_1_6,
                 MAASQ13_1_6,
                 MAASQ14_1_6,
                 MAASQ15_1_6) ~ 
            1)

m0 <- brm(bf1,
    family = cumulative,
    data = d,
    silent = TRUE,
    refresh = 0
    )
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 3 finished in 119.0 seconds.
## Chain 1 finished in 120.2 seconds.
## Chain 2 finished in 121.9 seconds.
## Chain 4 finished in 121.9 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 120.7 seconds.
## Total execution time: 122.3 seconds.
mcr <- brm(bf1,
    family = cratio,
    data = d,
    silent = TRUE,
    refresh = 0
    )
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 2 finished in 37.1 seconds.
## Chain 3 finished in 37.1 seconds.
## Chain 4 finished in 37.3 seconds.
## Chain 1 finished in 37.5 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 37.2 seconds.
## Total execution time: 37.9 seconds.
msr <- brm(bf1,
    family = sratio,
    data = d,
    silent = TRUE,
    refresh = 0
    )
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 4 finished in 36.4 seconds.
## Chain 1 finished in 36.5 seconds.
## Chain 2 finished in 36.5 seconds.
## Chain 3 finished in 42.6 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 38.0 seconds.
## Total execution time: 42.9 seconds.
mac <- brm(bf1,
    family = acat,
    data = d,
    silent = TRUE,
    refresh = 0
    )
## Running MCMC with 4 chains, at most 8 in parallel...
## 
## Chain 4 finished in 75.6 seconds.
## Chain 2 finished in 76.2 seconds.
## Chain 1 finished in 76.5 seconds.
## Chain 3 finished in 77.1 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 76.4 seconds.
## Total execution time: 77.4 seconds.
m0 <- add_criterion(m0, criterion = "loo")
msr <- add_criterion(msr, criterion = "loo")
mcr <- add_criterion(mcr, criterion = "loo")
mac <- add_criterion(mac, criterion = "loo")

# setting moment_match = TRUE will still lead to the same conclusion
loo_compare(m0, mcr, msr, mac)
##     elpd_diff se_diff
## m0   0.0       0.0   
## mac -0.3       0.4   
## msr -0.8       1.0   
## mcr -0.9       1.1

Evidently, since \(\Delta\)SE is fairly large, in comparison to the relative difference in expected log pointwise predictive density (\(\Delta\)elpd), we can safely assume that there’s no significant difference between the likelihoods and, thus, opt for the standard approach, i.e., the Cumulative distribution, in this case represented by \(\mathcal{M}_0\).

4 Computational environment

print(sessionInfo(), locale=FALSE)
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin20.4.0 (64-bit)
## Running under: macOS Big Sur 11.4
## 
## Matrix products: default
## BLAS:   /usr/local/Cellar/openblas/0.3.15_1/lib/libopenblasp-r0.3.15.dylib
## LAPACK: /usr/local/Cellar/r/4.1.0/lib/R/lib/libRlapack.dylib
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] scico_1.2.0       here_1.0.1        data.table_1.14.0 tidybayes_2.3.1  
##  [5] openxlsx_4.2.4    patchwork_1.1.1   ggthemes_4.2.4    latex2exp_0.5.0  
##  [9] bayesplot_1.8.1   forcats_0.5.1     stringr_1.4.0     dplyr_1.0.7      
## [13] purrr_0.3.4       readr_1.4.0       tidyr_1.1.3       tibble_3.1.2     
## [17] ggplot2_3.3.5     tidyverse_1.3.1   plyr_1.8.6        brms_2.15.0      
## [21] Rcpp_1.0.6       
## 
## loaded via a namespace (and not attached):
##   [1] readxl_1.3.1         backports_1.2.1      igraph_1.2.6        
##   [4] splines_4.1.0        svUnit_1.0.6         crosstalk_1.1.1     
##   [7] rstantools_2.1.1     inline_0.3.19        digest_0.6.27       
##  [10] htmltools_0.5.1.1    rsconnect_0.8.18     fansi_0.5.0         
##  [13] checkmate_2.0.0      magrittr_2.0.1       modelr_0.1.8        
##  [16] RcppParallel_5.1.4   matrixStats_0.59.0   xts_0.12.1          
##  [19] prettyunits_1.1.1    colorspace_2.0-2     rvest_1.0.0         
##  [22] ggdist_2.4.1         haven_2.4.1          xfun_0.24           
##  [25] callr_3.7.0          crayon_1.4.1         jsonlite_1.7.2      
##  [28] lme4_1.1-27.1        zoo_1.8-9            glue_1.4.2          
##  [31] gtable_0.3.0         V8_3.4.2             distributional_0.2.2
##  [34] pkgbuild_1.2.0       rstan_2.21.2         abind_1.4-5         
##  [37] scales_1.1.1         mvtnorm_1.1-2        DBI_1.1.1           
##  [40] miniUI_0.1.1.1       xtable_1.8-4         stats4_4.1.0        
##  [43] StanHeaders_2.21.0-7 DT_0.18              htmlwidgets_1.5.3   
##  [46] httr_1.4.2           threejs_0.3.3        arrayhelpers_1.1-0  
##  [49] posterior_0.1.5      ellipsis_0.3.2       farver_2.1.0        
##  [52] pkgconfig_2.0.3      loo_2.4.1            sass_0.4.0          
##  [55] dbplyr_2.1.1         utf8_1.2.1           labeling_0.4.2      
##  [58] tidyselect_1.1.1     rlang_0.4.11         reshape2_1.4.4      
##  [61] later_1.2.0          munsell_0.5.0        cellranger_1.1.0    
##  [64] tools_4.1.0          cli_3.0.0            generics_0.1.0      
##  [67] broom_0.7.8          ggridges_0.5.3       evaluate_0.14       
##  [70] fastmap_1.1.0        yaml_2.2.1           processx_3.5.2      
##  [73] knitr_1.33           fs_1.5.0             zip_2.2.0           
##  [76] nlme_3.1-152         mime_0.11            projpred_2.0.2      
##  [79] xml2_1.3.2           compiler_4.1.0       shinythemes_1.2.0   
##  [82] rstudioapi_0.13      curl_4.3.2           gamm4_0.2-6         
##  [85] reprex_2.0.0         bslib_0.2.5.1        stringi_1.6.2       
##  [88] highr_0.9            ps_1.6.0             Brobdingnag_1.2-6   
##  [91] lattice_0.20-44      Matrix_1.3-4         nloptr_1.2.2.2      
##  [94] markdown_1.1         tensorA_0.36.2       shinyjs_2.0.0       
##  [97] vctrs_0.3.8          pillar_1.6.1         lifecycle_1.0.0     
## [100] jquerylib_0.1.4      bridgesampling_1.1-2 httpuv_1.6.1        
## [103] R6_2.5.0             bookdown_0.22        promises_1.2.0.1    
## [106] gridExtra_2.3        nleqslv_3.3.2        codetools_0.2-18    
## [109] boot_1.3-28          colourpicker_1.1.0   MASS_7.3-54         
## [112] gtools_3.9.2         assertthat_0.2.1     rprojroot_2.0.2     
## [115] withr_2.4.2          shinystan_2.5.0      mgcv_1.8-36         
## [118] parallel_4.1.0       hms_1.1.0            grid_4.1.0          
## [121] coda_0.19-4          minqa_1.2.4          cmdstanr_0.4.0.9000 
## [124] rmarkdown_2.9        shiny_1.6.0          lubridate_1.7.10    
## [127] base64enc_0.1-3      dygraphs_1.1.1.6

  1. For more details we refer you to the manuscript.↩︎

  2. https://ppc.sas.upenn.edu/resources/questionnaires-researchers/mindful-attention-awareness-scale↩︎

  3. https://www.midss.org/content/scale-positive-and-negative-experience-spane-0↩︎

  4. http://www.acqol.com.au/instruments↩︎

  5. https://www.psytoolkit.org/survey-library/pts.html↩︎

  6. https://sparqtools.org/mobility-measure/new-general-self-efficacy-scale/↩︎

  7. https://www.hcp.med.harvard.edu/hpq/info.php↩︎

  8. https://www.psykiatri-regionh.dk/who-5/who-5-questionnaires/Pages/default.aspx↩︎

  9. More details can be found in the manuscript.↩︎