During spring 2021 the Rise 2 Flow study was replicated in yet another experiment E2
. The setup and execution was very similar to the first instance (E1
) of the experiment, which took place during 2020. In the data/
directory one can find data for both instances.
The purpose of this document is to show how the analysis of the experiments was done. With this replication package anyone can download and rerun the analysis, and check our assumptions.
The experiment used a number of instruments during the entry, exit, and weekly sessions (additionally personal data was collected during the entry session).1
For entry and exit sessions six instruments were used:
For the weekly sessions the WHO-5 well-being index was used (WHO-5).8
To summarize, the experiments were executed according to,
The main question the study tries to answer is: Is there a difference in the responses (i.e., higher/lower values on responses), across all or among some instruments, over time?
In order to answer the above question we can first of all compare outcomes at \(t_0\) and \(t_1\), and condition on personal data, e.g., age and gender. Next, the weekly and daily questions could be used to analyze the trend over time, and perhaps see where a noticeable effect, if it exists, shows up.
d <- read.csv(here("data/cleanedData.csv"), row.names = 1)
dim(d)
## [1] 187 85
In total, we have 187 rows and 85 columns; so not bad at all actually (\(N=187\)). Each row corresponds to one subject answering either the entry (\(t_0\)) or the exit (\(t_1\)) survey. In some cases we have subjects that have only answered one of them!
table(table(d$ID) >1)
##
## FALSE TRUE
## 105 41
So, \(105\) subjects filled out the survey once, and \(41\) filled out the entry and exit surveys, i.e., in total \(146\) subjects participated in both experiments.
There should not be more than \(>2\) rows per subject and it should be encoded as an integer for reasons of anonymity.
max(table(d$ID))
## [1] 2
typeof(d$ID)
## [1] "integer"
For each column (variable) we check which values it has and then code it correctly,
# MAAS instrument
for(j in 2:16){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Almost never", "Very infrequently", "Somewhat infrequently", "Somewhat frequently", "Very frequently", "Almost always"), ordered = TRUE))
}
# SPANE instrument
for(j in 17:28){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Very rarely or never", "Rarely", "Sometimes", "Often", "Very often or always"), ordered = TRUE))
}
# PWB instrument
for(j in 29:36){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Strongly disagree", "Disagree", "Slightly disagree", "Mixed or neither agree nor disagree", "Slightly agree", "Agree", "Strongly agree"), ordered = TRUE))
}
# PST instrument
for(i in 37:58)
d[,eval(i)] <- ifelse(d[eval(i)] == 'Yes', 1, 0)
# SEQ instrument
for(j in 59:68){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("Not true","Hardly true","Rather true","Exactly true"), ordered = TRUE))
}
# PP instrument
for(j in 69:75){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("None of the time","A little of the time","Some of the time","Most of the time","All of the time"), ordered = TRUE))
}
for(j in 76:78){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("1","2","3","4","5","6","7","8","9","10"), ordered = TRUE))
}
for(j in 79:79){
set(d, i=NULL, j=j, value=factor(d[[j]], levels = c("You were a lot worse than other workers","You were a little worse than other workers","You were somewhat worse than other workers","You were about average","You were somewhat better than other workers","You were a little better than other workers","You were a lot better than other workers"), ordered = TRUE))
}
# make sure Age is standardized
d$Age <- scale(d$Age)
To summarize, in the first column we have subject ID, then the MAAS instrument (\(15\) questions), SPANE (\(12\)), PWB (\(8\)), PST (\(22\)), SE (\(10\)), PP (\(11\)), age (standardized), gender (\(1/2\)), occupation (student or non-students, \(1/2\)), living condition (four unordered categories), temporal variable (\(0/1\) for \(t_0 / t_1\)), and experiment (\(1/2\) for first or second experiment).
First, load the weekly data set.
# drop first index column with `row.names = 1`
weekly <- read.csv(here("data/weekly.csv"), row.names = 1)
str(weekly)
## 'data.frame': 456 obs. of 8 variables:
## $ ID : int 1 1 1 1 2 2 3 3 3 4 ...
## $ WQ1_1_6 : int 4 2 4 4 5 4 2 2 3 4 ...
## $ WQ2_1_6 : int 4 2 4 4 4 4 2 2 2 2 ...
## $ WQ3_1_6 : int 4 4 4 4 4 5 4 5 5 4 ...
## $ WQ4_1_6 : int 3 3 4 4 4 5 5 5 4 2 ...
## $ WQ5_1_6 : int 2 4 5 5 2 4 5 6 6 4 ...
## $ experiment: int 0 0 0 0 0 0 0 0 0 0 ...
## $ Seq : int 1 2 3 4 1 2 1 2 3 1 ...
We have 456 rows and 8 columns. The first column contains a unique ID for each subject. Columns \(2\)–\(6\) are the \(5\) questions asked (Likert scale \(1,\ldots,6\)), Column \(7\) indicates if they were part of the 1st (0) or 2nd (1) experiment. Finally, the last column is a sequence indicator, i.e., the first subject answered the questions four times, \(1,\ldots,4\), the second subject twice, and so on.
Let’s turn our attention to the daily data set,
daily <- read.csv(here("data/daily.csv"), row.names = 1)
str(daily)
## 'data.frame': 1646 obs. of 4 variables:
## $ ID : int 1 1 1 1 1 2 2 2 2 2 ...
## $ resp : int 8 7 3 5 6 6 7 8 6 7 ...
## $ experiment: int 0 0 0 0 0 0 0 0 0 0 ...
## $ Seq : int 1 2 3 4 5 1 2 3 4 5 ...
We have 1646 rows and 4 columns. The first column is the subject’s ID. The response to the daily question comes next (Likert scale \(1,\ldots,10\)). After that we have an indicator which experiment the subject belongs to and a sequence indicator, as was the case in the weekly data set.
In Appendix 3 a comparison of different likelihoods can be found.
weekly
dataIf we take a random sample of \(32\) subjects we will see how complicated it will be to analyze this. For the first question we have in the weekly survey instrument, for each of the \(32\) subjects, we’ve plotted answers (dots) they provided on a \(1,\ldots,6\) Likert scale; then we’ve drawn a linear model for what we’ve got.
weekly |>
filter(ID %in% sample(x = max(weekly$ID), size = 32)) %>%
ggplot(aes(x = Seq, y = WQ1_1_6)) +
geom_point() +
scale_x_continuous(breaks = seq(from = 2, to = 12, by = 2)) +
scale_y_continuous(breaks = c(2,4,6)) +
stat_smooth(method = "lm", se = F) +
ylab("Question 1 (Likert 1 to 6)") +
xlab("Week number") +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20)) +
facet_wrap( ~ ID, ncol = 4)
## `geom_smooth()` using formula 'y ~ x'
We want to model the weekly instrument and how it varies over time \(t\) (in our case the sequence variable Seq
). In this case, \(t\) will be modeled using a Gaussian Process and then each individual in the dataset will be modeled using a varying intercept (in addition we’ll model \(\sigma\), called disc
below, for each experimental session).
An information theoretical comparison using LOO showed that a model with varying intercepts for each individual had considerably better relative out of sample performance with \(\Delta\text{elpd} = -77.1\) and \(\Delta\text{SE}=11.2\), i.e., assuming a \(99\%\) \(z\)-score of \(2.576\) we have \(\text{CI}_{95\%}[-106;-48]\).
m_weekly_1 <- brm(
bf(WQ1_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 2 finished in 0.5 seconds.
## Chain 3 finished in 0.4 seconds.
## Chain 4 finished in 0.4 seconds.
## Chain 1 finished in 0.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.5 seconds.
## Total execution time: 0.9 seconds.
pp_check(m_weekly_1,
type = "bars",
nsamples = 200)
The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.
m_weekly_1 <- brm(
bf(WQ1_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
# sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0,
control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 1 finished in 45.0 seconds.
## Chain 3 finished in 45.5 seconds.
## Chain 2 finished in 62.8 seconds.
## Chain 4 finished in 63.4 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 54.2 seconds.
## Total execution time: 63.5 seconds.
Let’s have a look at some diagnostics.
# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_1)$fit)
##
## Divergences:
## 0 of 4000 iterations ended with a divergence.
##
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
##
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.
tibble(Experiment = c("1","2"),
mu = c(0, fixef(m_weekly_1)["experiment", 1]),
sigma = c(1, 1 / (exp(fixef(m_weekly_1)["disc_experiment", 1])))) %>%
expand(nesting(Experiment, mu, sigma),
y = seq(from = -5, to = 5, by = 0.1)) %>%
mutate(d = dnorm(y, mu, sigma)) %>%
ggplot(aes(x = y, y = d, fill = Experiment)) +
geom_area(alpha = 2/3) +
geom_vline(xintercept = fixef(m_weekly_1)[1:5, 1], linetype = 3, color = sl[9]) +
scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_1)[1:5, 1] %>% as.double(),
labels = parse(text = str_c("theta[", 1:5, "]")))) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
labs(title = "Underlying latent scale for Q1 in \nExperiment 1 and 2", x = NULL) +
theme_light() +
theme(axis.ticks.x.top = element_blank(),
text = element_text(size = 20))
post <- posterior_samples(m_weekly_1)
post <-
post %>%
select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
mutate(iter = 1:n())
means <-
post %>%
summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>%
pivot_longer(everything(),
values_to = "mean")
post %>%
gather(name, threshold, -iter) %>%
group_by(iter) %>%
mutate(theta_bar = mean(threshold)) %>%
ggplot(aes(x = threshold, y = theta_bar, color = name)) +
geom_vline(data = means,
aes(xintercept = mean, color = name),
linetype = 2) +
geom_point(alpha = 1/10) +
scale_color_scico_d(palette = "lajolla", begin = .25) +
ylab("mean threshold") +
theme_light() +
theme(legend.position = "none",
text = element_text(size = 20))
p <- pp_check(m_weekly_1, type = "ecdf_overlay", nsamples = 50)
p + theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
p <- pp_check(m_weekly_1, type = "bars_grouped", nsamples = 100,
group = "experiment") +
scale_x_continuous("y", breaks = 1:6) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
ggtitle("Data with posterior predictions",
subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
theme(legend.background = element_blank(),
legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
post %>%
select(-iter) %>%
mutate_all(.funs = ~pnorm(. ,0, 1)) %>%
transmute(`p[Y==1]` = `b_Intercept[1]`,
`p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
`p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
`p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
`p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
`p[Y==6]` = 1 - `b_Intercept[5]`) %>%
set_names(1:6) %>%
pivot_longer(everything(), names_to = "Y") %>%
ggplot(aes(x = value, y = Y)) +
stat_pointinterval(point_interval = mode_hdi, .width = .95,
fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
breaks = 0:5 / 5,
expand = c(0, 0), limits = c(0, 1)) +
theme_light() +
theme(text = element_text(size = 20))
p <- conditional_effects(m_weekly_1)
plot(p, plot = FALSE)[[2]] +
xlab("Week number") +
ylab("Question 1") +
scale_x_continuous(breaks= seq(1:12)) +
scale_y_continuous(breaks= c(3.5, 4, 4.5, 5)) +
theme_light() +
theme(text = element_text(size = 20))
m_weekly_2 <- brm(
bf(WQ2_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 1 finished in 0.4 seconds.
## Chain 2 finished in 0.4 seconds.
## Chain 3 finished in 0.4 seconds.
## Chain 4 finished in 0.4 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.4 seconds.
## Total execution time: 0.5 seconds.
pp_check(m_weekly_2,
type = "bars",
nsamples = 200)
The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.
m_weekly_2 <- brm(
bf(WQ2_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
# sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0,
control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 3 finished in 42.7 seconds.
## Chain 1 finished in 44.0 seconds.
## Chain 2 finished in 44.6 seconds.
## Chain 4 finished in 49.1 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 45.1 seconds.
## Total execution time: 49.3 seconds.
Let’s have a look at some diagnostics.
# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_2)$fit)
##
## Divergences:
## 0 of 4000 iterations ended with a divergence.
##
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
##
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.
tibble(X = LETTERS[1:2],
mu = c(0, fixef(m_weekly_2)["experiment", 1]),
sigma = c(1, 1 / (exp(fixef(m_weekly_2)["disc_experiment", 1])))) %>%
expand(nesting(X, mu, sigma),
y = seq(from = -5, to = 5, by = 0.1)) %>%
mutate(d = dnorm(y, mu, sigma)) %>%
ggplot(aes(x = y, y = d, fill = X)) +
geom_area(alpha = 2/3) +
geom_vline(xintercept = fixef(m_weekly_2)[1:5, 1], linetype = 3, color = sl[9]) +
scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_2)[1:5, 1] %>% as.double(),
labels = parse(text = str_c("theta[", 1:5, "]")))) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
labs(title = "Underlying latent scale for our outome Q2, \n given experimental session X", x = NULL) +
theme_light() +
theme(axis.ticks.x.top = element_blank(),
text = element_text(size = 20))
post <- posterior_samples(m_weekly_2)
post <-
post %>%
select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
mutate(iter = 1:n())
means <-
post %>%
summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>%
pivot_longer(everything(),
values_to = "mean")
post %>%
gather(name, threshold, -iter) %>%
group_by(iter) %>%
mutate(theta_bar = mean(threshold)) %>%
ggplot(aes(x = threshold, y = theta_bar, color = name)) +
geom_vline(data = means,
aes(xintercept = mean, color = name),
linetype = 2) +
geom_point(alpha = 1/10) +
scale_color_scico_d(palette = "lajolla", begin = .25) +
ylab("mean threshold") +
theme_light() +
theme(legend.position = "none",
text = element_text(size = 20))
p <- pp_check(m_weekly_2, type = "ecdf_overlay", nsamples = 50)
p + theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
p <- pp_check(m_weekly_2, type = "bars_grouped", nsamples = 100,
group = "experiment") +
scale_x_continuous("y", breaks = 1:6) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
ggtitle("Data with posterior predictions",
subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
theme(legend.background = element_blank(),
legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
post %>%
select(-iter) %>%
mutate_all(.funs = ~pnorm(. ,0, 1)) %>%
transmute(`p[Y==1]` = `b_Intercept[1]`,
`p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
`p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
`p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
`p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
`p[Y==6]` = 1 - `b_Intercept[5]`) %>%
set_names(1:6) %>%
pivot_longer(everything(), names_to = "Y") %>%
ggplot(aes(x = value, y = Y)) +
stat_pointinterval(point_interval = mode_hdi, .width = .95,
fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
breaks = 0:5 / 5,
expand = c(0, 0), limits = c(0, 1)) +
theme_light() +
theme(text = element_text(size = 20))
p <- conditional_effects(m_weekly_2)
plot(p, plot = FALSE)[[2]] +
xlab("Week number") +
ylab("Question 2") +
scale_x_continuous(breaks= seq(1:12)) +
scale_y_continuous(breaks= c(3.5, 4, 4.5, 5)) +
theme_light() +
theme(text = element_text(size = 20))
m_weekly_3 <- brm(
bf(WQ3_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 1 finished in 0.5 seconds.
## Chain 2 finished in 0.5 seconds.
## Chain 3 finished in 0.5 seconds.
## Chain 4 finished in 0.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.5 seconds.
## Total execution time: 0.6 seconds.
pp_check(m_weekly_3,
type = "bars",
nsamples = 200)
The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.
m_weekly_3 <- brm(
bf(WQ3_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
# sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0,
control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 2 finished in 43.7 seconds.
## Chain 4 finished in 44.2 seconds.
## Chain 3 finished in 47.6 seconds.
## Chain 1 finished in 47.8 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 45.8 seconds.
## Total execution time: 47.8 seconds.
Let’s have a look at some diagnostics.
# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_3)$fit)
##
## Divergences:
## 0 of 4000 iterations ended with a divergence.
##
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
##
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.
tibble(X = LETTERS[1:2],
mu = c(0, fixef(m_weekly_3)["experiment", 1]),
sigma = c(1, 1 / (exp(fixef(m_weekly_3)["disc_experiment", 1])))) %>%
expand(nesting(X, mu, sigma),
y = seq(from = -5, to = 5, by = 0.1)) %>%
mutate(d = dnorm(y, mu, sigma)) %>%
ggplot(aes(x = y, y = d, fill = X)) +
geom_area(alpha = 2/3) +
geom_vline(xintercept = fixef(m_weekly_3)[1:5, 1], linetype = 3, color = sl[9]) +
scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_3)[1:5, 1] %>% as.double(),
labels = parse(text = str_c("theta[", 1:5, "]")))) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
labs(title = "Underlying latent scale for our outome Q3, \n given experimental session X", x = NULL) +
theme_light() +
theme(axis.ticks.x.top = element_blank(),
text = element_text(size = 20))
post <- posterior_samples(m_weekly_3)
post <-
post %>%
select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
mutate(iter = 1:n())
means <-
post %>%
summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>%
pivot_longer(everything(),
values_to = "mean")
post %>%
gather(name, threshold, -iter) %>%
group_by(iter) %>%
mutate(theta_bar = mean(threshold)) %>%
ggplot(aes(x = threshold, y = theta_bar, color = name)) +
geom_vline(data = means,
aes(xintercept = mean, color = name),
linetype = 2) +
geom_point(alpha = 1/10) +
scale_color_scico_d(palette = "lajolla", begin = .25) +
ylab("mean threshold") +
theme_light() +
theme(legend.position = "none",
text = element_text(size = 20))
p <- pp_check(m_weekly_3, type = "ecdf_overlay", nsamples = 50)
p + theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
p <- pp_check(m_weekly_3, type = "bars_grouped", nsamples = 100,
group = "experiment") +
scale_x_continuous("y", breaks = 1:6) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
ggtitle("Data with posterior predictions",
subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
theme(legend.background = element_blank(),
legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
post %>%
select(-iter) %>%
mutate_all(.funs = ~pnorm(. ,0, 1)) %>%
transmute(`p[Y==1]` = `b_Intercept[1]`,
`p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
`p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
`p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
`p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
`p[Y==6]` = 1 - `b_Intercept[5]`) %>%
set_names(1:6) %>%
pivot_longer(everything(), names_to = "Y") %>%
ggplot(aes(x = value, y = Y)) +
stat_pointinterval(point_interval = mode_hdi, .width = .95,
fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
breaks = 0:5 / 5,
expand = c(0, 0), limits = c(0, 1)) +
theme_light() +
theme(text = element_text(size = 20))
p <- conditional_effects(m_weekly_3)
plot(p, plot = FALSE)[[2]] +
xlab("Week number") +
ylab("Question 3") +
scale_x_continuous(breaks= seq(1:12)) +
scale_y_continuous(breaks= c(3, 3.5, 4, 4.5, 5)) +
theme_light() +
theme(text = element_text(size = 20))
m_weekly_4 <- brm(
bf(WQ4_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 1 finished in 0.5 seconds.
## Chain 3 finished in 0.5 seconds.
## Chain 4 finished in 0.4 seconds.
## Chain 2 finished in 0.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.5 seconds.
## Total execution time: 0.6 seconds.
pp_check(m_weekly_4,
type = "bars",
nsamples = 200)
The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.
m_weekly_4 <- brm(
bf(WQ4_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
# sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0,
control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 4 finished in 36.9 seconds.
## Chain 3 finished in 38.4 seconds.
## Chain 2 finished in 38.8 seconds.
## Chain 1 finished in 41.1 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 38.8 seconds.
## Total execution time: 41.2 seconds.
Let’s have a look at some diagnostics.
# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_4)$fit)
##
## Divergences:
## 0 of 4000 iterations ended with a divergence.
##
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
##
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.
tibble(X = LETTERS[1:2],
mu = c(0, fixef(m_weekly_4)["experiment", 1]),
sigma = c(1, 1 / (exp(fixef(m_weekly_4)["disc_experiment", 1])))) %>%
expand(nesting(X, mu, sigma),
y = seq(from = -5, to = 5, by = 0.1)) %>%
mutate(d = dnorm(y, mu, sigma)) %>%
ggplot(aes(x = y, y = d, fill = X)) +
geom_area(alpha = 2/3) +
geom_vline(xintercept = fixef(m_weekly_4)[1:5, 1], linetype = 3, color = sl[9]) +
scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_4)[1:5, 1] %>% as.double(),
labels = parse(text = str_c("theta[", 1:5, "]")))) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
labs(title = "Underlying latent scale for our outome Q4, \n given experimental session X", x = NULL) +
theme_light() +
theme(axis.ticks.x.top = element_blank(),
text = element_text(size = 20))
post <- posterior_samples(m_weekly_4)
post <-
post %>%
select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
mutate(iter = 1:n())
means <-
post %>%
summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>%
pivot_longer(everything(),
values_to = "mean")
post %>%
gather(name, threshold, -iter) %>%
group_by(iter) %>%
mutate(theta_bar = mean(threshold)) %>%
ggplot(aes(x = threshold, y = theta_bar, color = name)) +
geom_vline(data = means,
aes(xintercept = mean, color = name),
linetype = 2) +
geom_point(alpha = 1/10) +
scale_color_scico_d(palette = "lajolla", begin = .25) +
ylab("mean threshold") +
theme_light() +
theme(legend.position = "none",
text = element_text(size = 20))
p <- pp_check(m_weekly_4, type = "ecdf_overlay", nsamples = 50)
p + theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
p <- pp_check(m_weekly_4, type = "bars_grouped", nsamples = 100,
group = "experiment") +
scale_x_continuous("y", breaks = 1:6) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
ggtitle("Data with posterior predictions",
subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
theme(legend.background = element_blank(),
legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
post %>%
select(-iter) %>%
mutate_all(.funs = ~pnorm(. ,0, 1)) %>%
transmute(`p[Y==1]` = `b_Intercept[1]`,
`p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
`p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
`p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
`p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
`p[Y==6]` = 1 - `b_Intercept[5]`) %>%
set_names(1:6) %>%
pivot_longer(everything(), names_to = "Y") %>%
ggplot(aes(x = value, y = Y)) +
stat_pointinterval(point_interval = mode_hdi, .width = .95,
fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
breaks = 0:5 / 5,
expand = c(0, 0), limits = c(0, 1)) +
theme_light() +
theme(text = element_text(size = 20))
p <- conditional_effects(m_weekly_4)
plot(p, plot = FALSE)[[2]] +
xlab("Week number") +
ylab("Question 4") +
scale_x_continuous(breaks= seq(1:12)) +
scale_y_continuous(breaks= c(3, 3.5, 4, 4.5, 5)) +
theme_light() +
theme(text = element_text(size = 20))
m_weekly_5 <- brm(
bf(WQ5_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 1 finished in 0.4 seconds.
## Chain 2 finished in 0.4 seconds.
## Chain 3 finished in 0.3 seconds.
## Chain 4 finished in 0.4 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.4 seconds.
## Total execution time: 0.5 seconds.
pp_check(m_weekly_5,
type = "bars",
nsamples = 200)
The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.
m_weekly_5 <- brm(
bf(WQ5_1_6 ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = weekly,
family = cumulative(probit),
# sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0,
control = list(adapt_delta = 0.99)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 2 finished in 48.9 seconds.
## Chain 1 finished in 55.1 seconds.
## Chain 3 finished in 55.1 seconds.
## Chain 4 finished in 78.1 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 59.3 seconds.
## Total execution time: 78.3 seconds.
Let’s have a look at some diagnostics.
# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_weekly_5)$fit)
##
## Divergences:
## 0 of 4000 iterations ended with a divergence.
##
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
##
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.
tibble(X = LETTERS[1:2],
mu = c(0, fixef(m_weekly_5)["experiment", 1]),
sigma = c(1, 1 / (exp(fixef(m_weekly_5)["disc_experiment", 1])))) %>%
expand(nesting(X, mu, sigma),
y = seq(from = -5, to = 5, by = 0.1)) %>%
mutate(d = dnorm(y, mu, sigma)) %>%
ggplot(aes(x = y, y = d, fill = X)) +
geom_area(alpha = 2/3) +
geom_vline(xintercept = fixef(m_weekly_5)[1:5, 1], linetype = 3, color = sl[9]) +
scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_weekly_5)[1:5, 1] %>% as.double(),
labels = parse(text = str_c("theta[", 1:5, "]")))) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
labs(title = "Underlying latent scale for our outome Q5, \n given experimental session X", x = NULL) +
theme_light() +
theme(axis.ticks.x.top = element_blank(),
text = element_text(size = 20))
post <- posterior_samples(m_weekly_5)
post <-
post %>%
select(`b_Intercept[1]`:`b_Intercept[5]`) %>%
mutate(iter = 1:n())
means <-
post %>%
summarise_at(vars(`b_Intercept[1]`:`b_Intercept[5]`), mean) %>%
pivot_longer(everything(),
values_to = "mean")
post %>%
gather(name, threshold, -iter) %>%
group_by(iter) %>%
mutate(theta_bar = mean(threshold)) %>%
ggplot(aes(x = threshold, y = theta_bar, color = name)) +
geom_vline(data = means,
aes(xintercept = mean, color = name),
linetype = 2) +
geom_point(alpha = 1/10) +
scale_color_scico_d(palette = "lajolla", begin = .25) +
ylab("mean threshold") +
theme_light() +
theme(legend.position = "none",
text = element_text(size = 20))
p <- pp_check(m_weekly_5, type = "ecdf_overlay", nsamples = 50)
p + theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
p <- pp_check(m_weekly_5, type = "bars_grouped", nsamples = 100,
group = "experiment") +
scale_x_continuous("y", breaks = 1:6) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
ggtitle("Data with posterior predictions",
subtitle = expression(list(italic(N[A])==298, italic(N[B])==158))) +
theme(legend.background = element_blank(),
legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
post %>%
select(-iter) %>%
mutate_all(.funs = ~pnorm(. ,0, 1)) %>%
transmute(`p[Y==1]` = `b_Intercept[1]`,
`p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
`p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
`p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
`p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
`p[Y==6]` = 1 - `b_Intercept[5]`) %>%
set_names(1:6) %>%
pivot_longer(everything(), names_to = "Y") %>%
ggplot(aes(x = value, y = Y)) +
stat_pointinterval(point_interval = mode_hdi, .width = .95,
fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
breaks = 0:5 / 5,
expand = c(0, 0), limits = c(0, 1)) +
theme_light() +
theme(text = element_text(size = 20))
p <- conditional_effects(m_weekly_5)
plot(p, plot = FALSE)[[2]] +
xlab("Week number") +
ylab("Question 5") +
scale_x_continuous(breaks= seq(1:12)) +
scale_y_continuous(breaks= c(3, 3.5, 4, 4.5, 5)) +
theme_light() +
theme(text = element_text(size = 20))
Compare the above with the two below plots where we have separated the effect according to the two experimental sessions.
## Running MCMC with 4 parallel chains...
##
## Chain 1 finished in 22.6 seconds.
## Chain 2 finished in 23.2 seconds.
## Chain 3 finished in 24.3 seconds.
## Chain 4 finished in 25.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 23.9 seconds.
## Total execution time: 26.1 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 4 finished in 19.3 seconds.
## Chain 3 finished in 20.6 seconds.
## Chain 1 finished in 21.2 seconds.
## Chain 2 finished in 25.9 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 21.7 seconds.
## Total execution time: 26.0 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 1 finished in 27.6 seconds.
## Chain 4 finished in 28.6 seconds.
## Chain 3 finished in 29.4 seconds.
## Chain 2 finished in 39.8 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 31.4 seconds.
## Total execution time: 40.0 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 4 finished in 24.5 seconds.
## Chain 2 finished in 25.7 seconds.
## Chain 1 finished in 26.0 seconds.
## Chain 3 finished in 29.8 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 26.5 seconds.
## Total execution time: 30.0 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 2 finished in 24.3 seconds.
## Chain 4 finished in 26.9 seconds.
## Chain 1 finished in 28.7 seconds.
## Chain 3 finished in 37.0 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 29.2 seconds.
## Total execution time: 37.1 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 2 finished in 17.3 seconds.
## Chain 1 finished in 17.5 seconds.
## Chain 4 finished in 17.5 seconds.
## Chain 3 finished in 18.7 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 17.7 seconds.
## Total execution time: 18.8 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 2 finished in 22.1 seconds.
## Chain 3 finished in 25.2 seconds.
## Chain 1 finished in 26.4 seconds.
## Chain 4 finished in 28.9 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 25.6 seconds.
## Total execution time: 29.1 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 2 finished in 15.4 seconds.
## Chain 1 finished in 15.8 seconds.
## Chain 3 finished in 16.5 seconds.
## Chain 4 finished in 17.2 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 16.2 seconds.
## Total execution time: 17.3 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 2 finished in 28.5 seconds.
## Chain 1 finished in 30.5 seconds.
## Chain 4 finished in 32.3 seconds.
## Chain 3 finished in 32.9 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 31.0 seconds.
## Total execution time: 33.0 seconds.
## Running MCMC with 4 parallel chains...
##
## Chain 4 finished in 17.9 seconds.
## Chain 1 finished in 22.8 seconds.
## Chain 2 finished in 24.3 seconds.
## Chain 3 finished in 24.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 22.4 seconds.
## Total execution time: 24.6 seconds.
daily
dataRecall, the daily data set consists of the following variables,
str(daily)
## 'data.frame': 1646 obs. of 4 variables:
## $ ID : int 1 1 1 1 1 2 2 2 2 2 ...
## $ resp : int 8 7 3 5 6 6 7 8 6 7 ...
## $ experiment: int 0 0 0 0 0 0 0 0 0 0 ...
## $ Seq : int 1 2 3 4 5 1 2 3 4 5 ...
where ID
is the subject, resp
is the response (Likert \(1,\ldots,10\)), experiment
is the session the subject belongs to, and, finally, Seq
is the order each subject answered the question.
The response, resp
, when plotted, looks like this,
hist(daily$resp, main = "Histogram of responses to daily question")
and the maximum and minimum number of responses (days) a subject provided was,
max(daily$Seq)
## [1] 84
min(daily$Seq)
## [1] 1
If we take a random sample of \(32\) subjects we will see how complicated it will be to analyze this. For the question we have in the daily survey instrument, for each of the \(32\) subjects, we’ve plotted answers (dots) they provided on a \(1,\ldots,10\) Likert scale; then we’ve drawn a linear model for what we’ve got.
daily |>
filter(ID %in% sample(x = max(daily$ID), size = 32)) %>%
ggplot(aes(x = Seq, y = resp)) +
geom_point() +
scale_x_continuous(breaks = c(1,20,40,60,80)) +
scale_y_continuous(breaks = c(1,5,10)) +
stat_smooth(method = "lm", se = F) +
ylab("Daily question") +
xlab("Days") +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20)) +
facet_wrap( ~ ID, ncol = 4)
## `geom_smooth()` using formula 'y ~ x'
We want to model the daily instrument and how it varies over time \(t\) (in our case the sequence variable Seq
). In this case, \(t\) will be modeled using a Gaussian Process and then each individual in the dataset will be modeled using a varying intercept (in addition we’ll model \(\sigma\), called disc
below, for each experimental session).
An information theoretical comparison using LOO showed that a model with varying intercepts for each individual had considerably better relative out of sample performance with \(\Delta\text{elpd} = -77.1\) and \(\Delta\text{SE}=11.2\), i.e., assuming a \(99\%\) \(z\)-score of \(2.576\) we have \(\text{CI}_{95\%}[-106;-48]\).
m_daily <- brm(
bf(resp ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = daily,
family = cumulative(probit),
sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0, control = list(adapt_delta=0.95)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 1 finished in 0.8 seconds.
## Chain 2 finished in 0.8 seconds.
## Chain 3 finished in 0.9 seconds.
## Chain 4 finished in 0.9 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.9 seconds.
## Total execution time: 1.0 seconds.
pp_check(m_daily,
type = "bars",
nsamples = 200)
The combinations of our priors are fairly uniform on the outcome space. Let’s sample with data now.
m_daily <- brm(
bf(resp ~ 1 + gp(Seq) + experiment + (1 | ID)) +
lf(disc ~ 0 + experiment, cmc = FALSE),
data = daily,
family = cumulative(probit),
# sample_prior = "only",
prior = c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(normal(0, 1), class = b, dpar = disc),
prior(inv_gamma(4.3, 1), class = lscale, coef = gpSeq),
prior(weibull(2, 1), class = sdgp),
prior(weibull(2, 1), class = sd)
),
silent = TRUE, refresh = 0, init = 0,
control = list(adapt_delta = 0.99, max_treedepth=12)
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 3 finished in 491.8 seconds.
## Chain 1 finished in 497.1 seconds.
## Chain 4 finished in 589.4 seconds.
## Chain 2 finished in 729.8 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 577.1 seconds.
## Total execution time: 730.0 seconds.
Let’s have a look at some diagnostics.
# Check divergences, tree depth, energy
rstan::check_hmc_diagnostics(eval(m_daily)$fit)
##
## Divergences:
## 0 of 4000 iterations ended with a divergence.
##
## Tree depth:
## 0 of 4000 iterations saturated the maximum tree depth of 10.
##
## Energy:
## E-BFMI indicated no pathological behavior.
## Rhat:
## Rhat values ok.
## ESS:
## Sufficient ESS.
tibble(X = LETTERS[1:2],
mu = c(0, fixef(m_daily)["experiment", 1]),
sigma = c(1, 1 / (exp(fixef(m_daily)["disc_experiment", 1])))) %>%
expand(nesting(X, mu, sigma),
y = seq(from = -5, to = 5, by = 0.1)) %>%
mutate(d = dnorm(y, mu, sigma)) %>%
ggplot(aes(x = y, y = d, fill = X)) +
geom_area(alpha = 2/3) +
geom_vline(xintercept = fixef(m_daily)[1:9, 1], linetype = 3, color = sl[9]) +
scale_fill_scico_d(palette = "lajolla", begin = .33, end = .67) +
scale_x_continuous(sec.axis = dup_axis(breaks = fixef(m_daily)[1:9, 1] %>% as.double(),
labels = parse(text = str_c("theta[", 1:9, "]")))) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
labs(title = "Underlying latent scale for our outome, \n given experimental session X", x = NULL) +
theme_light() +
theme(axis.ticks.x.top = element_blank(),
text = element_text(size = 20))
post <- posterior_samples(m_daily)
post <-
post %>%
select(`b_Intercept[1]`:`b_Intercept[9]`) %>%
mutate(iter = 1:n())
means <-
post %>%
summarise_at(vars(`b_Intercept[1]`:`b_Intercept[9]`), mean) %>%
pivot_longer(everything(),
values_to = "mean")
post %>%
gather(name, threshold, -iter) %>%
group_by(iter) %>%
mutate(theta_bar = mean(threshold)) %>%
ggplot(aes(x = threshold, y = theta_bar, color = name)) +
geom_vline(data = means,
aes(xintercept = mean, color = name),
linetype = 2) +
geom_point(alpha = 1/10) +
scale_color_scico_d(palette = "lajolla", begin = .25) +
ylab("mean threshold") +
theme_light() +
theme(legend.position = "none",
text = element_text(size = 20))
p <- pp_check(m_daily, type = "ecdf_overlay", nsamples = 50)
p + theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
p <- pp_check(m_daily, type = "bars_grouped", nsamples = 100,
group = "experiment") +
scale_x_continuous("y", breaks = 1:10) +
scale_y_continuous(NULL, breaks = NULL, expand = expansion(mult = c(0, 0.05))) +
ggtitle("Data with posterior predictions",
subtitle = expression(list(italic(N[A])==1024, italic(N[B])==622))) +
theme(legend.background = element_blank(),
legend.position = c(.9, .75))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
my_labels <- as_labeller(c("0" = "A", "1" = "B"))
p + facet_wrap("group", labeller = my_labels) +
theme_light() +
theme(panel.grid = element_blank(),
text = element_text(size = 20))
post %>%
select(-iter) %>%
mutate_all(.funs = ~pnorm(. ,0, 1)) %>%
transmute(`p[Y==1]` = `b_Intercept[1]`,
`p[Y==2]` = `b_Intercept[2]` - `b_Intercept[1]`,
`p[Y==3]` = `b_Intercept[3]` - `b_Intercept[2]`,
`p[Y==4]` = `b_Intercept[4]` - `b_Intercept[3]`,
`p[Y==5]` = `b_Intercept[5]` - `b_Intercept[4]`,
`p[Y==6]` = `b_Intercept[6]` - `b_Intercept[5]`,
`p[Y==7]` = `b_Intercept[7]` - `b_Intercept[6]`,
`p[Y==8]` = `b_Intercept[8]` - `b_Intercept[7]`,
`p[Y==9]` = `b_Intercept[9]` - `b_Intercept[8]`,
`p[Y==10]` = 1 - `b_Intercept[9]`) %>%
set_names(1:10) %>%
pivot_longer(everything(), names_to = "Y") %>%
ggplot(aes(x = value, y = Y)) +
stat_pointinterval(point_interval = mode_hdi, .width = .95,
fill = sl[4], color = sl[8], size = 1/2, height = 2.5) +
scale_x_continuous(expression(italic(p)*"["*Y==italic(i)*"]"),
breaks = 0:5 / 5,
expand = c(0, 0), limits = c(0, 1)) +
scale_y_discrete(limits = factor(seq(1:10))) +
theme_light() +
theme(text = element_text(size = 20))
#pdf("daily.pdf", width=7.5/2.54, height=9.49/2.54)
plot(conditional_effects(m_daily), plot = FALSE)[[2]] +
xlim(1,84) +
ylab("Response") +
scale_x_continuous(name="Day",
limits = c(1,84),
breaks = c(1,seq(10,80, by = 10),84)) +
scale_y_continuous(name="Response",
limits = c(6,8),
breaks = seq(6,8)) +
theme(text = element_text(size=16))
#dev.off()
We want a model that uses personal data (e.g., age and gender) as predictors. The idea is to later see if any of the predictors have predictive capacity for the different survey instruments.
Design models with predictors from personal
data, together with our indicator t
for time, experiment
for experimental session, and a varying intercept ID
that varies depending on subject.
Let’s now sample all models with the same predictors we used above (visual prior and posterior predictive checks were done for each model).
d$Living_1_4_f <- as.factor(d$Living_1_4)
################################################################################
#
# MAAS
#
################################################################################
bf1 <- bf(mvbind(MAASQ1_1_6,
MAASQ2_1_6,
MAASQ3_1_6,
MAASQ4_1_6,
MAASQ5_1_6,
MAASQ6_1_6,
MAASQ7_1_6,
MAASQ8_1_6,
MAASQ9_1_6,
MAASQ10_1_6,
MAASQ11_1_6,
MAASQ12_1_6,
MAASQ13_1_6,
MAASQ14_1_6,
MAASQ15_1_6) ~
1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4_f + t + (1 |e| Experiment_1_2) + (1 |c| ID))
# Prior predictive checks have been conducted and the probability mass
# is distributed evenly on the outcome space
p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))
m_maas <- brm(bf1,
family = cumulative(probit),
data = d,
prior = p,
#sample_prior = "only", # use if you only want to sample from prior
silent = TRUE,
iter = 5e3,
inits = "0",
control = list(adapt_delta=0.99),
refresh = 0)
## Running MCMC with 4 parallel chains...
##
## Chain 4 finished in 1418.7 seconds.
## Chain 3 finished in 1419.0 seconds.
## Chain 2 finished in 1423.6 seconds.
## Chain 1 finished in 1444.3 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 1426.4 seconds.
## Total execution time: 1444.7 seconds.
# use the below to do prior and posterior predictive checks
# change resp to the response variable you want to check
pp_check(m_maas, resp = "MAASQ116", type = "bars", nsamples = 250)
################################################################################
#
# SPANE
#
################################################################################
bf1 <- bf(mvbind(SPANEQ1_1_5,
SPANEQ2_1_5,
SPANEQ3_1_5,
SPANEQ4_1_5,
SPANEQ5_1_5,
SPANEQ6_1_5,
SPANEQ7_1_5,
SPANEQ8_1_5,
SPANEQ9_1_5,
SPANEQ10_1_5,
SPANEQ11_1_5,
SPANEQ12_1_5) ~
1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))
p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))
m_spane <- brm(bf1,
family = cumulative(probit),
data = d,
prior = p,
init = "0",
iter = 8e3,
control = list(adapt_delta=0.99),
# sample_prior = "only", # use if you only want to sample from prior
silent = TRUE,
refresh = 0)
## Running MCMC with 4 parallel chains...
##
## Chain 3 finished in 1796.1 seconds.
## Chain 2 finished in 1802.3 seconds.
## Chain 4 finished in 1811.7 seconds.
## Chain 1 finished in 1870.6 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 1820.2 seconds.
## Total execution time: 1870.9 seconds.
# use the below to do prior and posterior predictive checks
# change resp to the response variable you want to check
# pp_check(m_spane, resp = "SPANEQ115", type = "bars", nsamples = 250)
################################################################################
#
# PWB
#
################################################################################
bf1 <- bf(mvbind(PWBQ1_1_7,
PWBQ2_1_7,
PWBQ3_1_7,
PWBQ4_1_7,
PWBQ5_1_7,
PWBQ6_1_7,
PWBQ7_1_7,
PWBQ8_1_7) ~
1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))
p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))
m_pwb <- brm(bf1,
family = cumulative(probit),
data = d,
prior = p,
init = "0",
control = list(adapt_delta=0.95),
iter = 8e3,
# sample_prior = "only", # use if you only want to sample from prior
silent = TRUE,
refresh = 0)
## Running MCMC with 4 parallel chains...
##
## Chain 1 finished in 554.1 seconds.
## Chain 3 finished in 555.9 seconds.
## Chain 4 finished in 559.4 seconds.
## Chain 2 finished in 560.3 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 557.4 seconds.
## Total execution time: 560.7 seconds.
# pp_check(m_pwb, resp = "PWBQ117", type = "bars", nsamples = 250)
################################################################################
#
# PST
#
################################################################################
# Here we move to a bernoulli since we have 0/1 outcome
bf1 <- bf(mvbind(PSTQ1_0_1,
PSTQ2_0_1,
PSTQ3_0_1,
PSTQ4_0_1,
PSTQ5_0_1,
PSTQ6_0_1,
PSTQ7_0_1,
PSTQ8_0_1,
PSTQ9_0_1,
PSTQ10_0_1,
PSTQ11_0_1,
PSTQ12_0_1,
PSTQ13_0_1,
PSTQ14_0_1,
PSTQ15_0_1,
PSTQ16_0_1,
PSTQ17_0_1,
PSTQ18_0_1,
PSTQ19_0_1,
PSTQ20_0_1,
PSTQ21_0_1,
PSTQ22_0_1) ~
1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))
p <- get_prior(bf1, data=d, family=bernoulli) %>%
mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))
m_pst <- brm(bf1,
family = bernoulli,
data = d,
prior = p,
# sample_prior = "only",
silent = TRUE,
refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 2 finished in 99.7 seconds.
## Chain 4 finished in 100.6 seconds.
## Chain 3 finished in 100.7 seconds.
## Chain 1 finished in 104.0 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 101.3 seconds.
## Total execution time: 104.4 seconds.
# pp_check(m_pst, resp = "PSTQ101", type = "bars", nsamples = 250)
################################################################################
#
# SE
#
################################################################################
bf1 <- bf(mvbind(SEQ1_1_4,
SEQ2_1_4,
SEQ3_1_4,
SEQ4_1_4,
SEQ5_1_4,
SEQ6_1_4,
SEQ7_1_4,
SEQ8_1_4,
SEQ9_1_4,
SEQ10_1_4) ~
1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))
p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))
m_se <- brm(bf1,
family = cumulative(probit),
data = d,
prior = p,
init = "0",
control = list(adapt_delta = 0.99),
iter = 5e3,
# sample_prior = "only",
silent = TRUE,
refresh = 0)
## Running MCMC with 4 parallel chains...
##
## Chain 2 finished in 1005.8 seconds.
## Chain 1 finished in 1009.0 seconds.
## Chain 3 finished in 1015.2 seconds.
## Chain 4 finished in 1033.4 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 1015.9 seconds.
## Total execution time: 1033.8 seconds.
# pp_check(m_se, resp = "SEQ314", type = "bars", nsamples = 250)
################################################################################
#
# PPHQ
#
################################################################################
bf1 <- bf(mvbind(PPHQ1_1_5,
PPHQ2_1_5,
PPHQ3_1_5,
PPHQ4_1_5,
PPHQ5_1_5,
PPHQ6_1_5,
PPHQ7_1_5,
PPRQ1_1_10,
PPRQ2_1_10,
PPRQ3_1_10,
PPOQ1_1_7
) ~
1 + Age + Gender_0_1 + Occupation_0_1 + Living_1_4 + t + (1 |e| Experiment_1_2) + (1 |c| ID))
p <- get_prior(bf1, data=d, family=cumulative(probit)) %>%
mutate(prior = ifelse(class == "b", "normal(0,3)", prior)) %>%
mutate(prior = ifelse(class == "cor", "lkj(2)", prior)) %>%
mutate(prior = ifelse(class == "Intercept", "normal(0,5)", prior)) %>%
mutate(prior = ifelse(class == "sd", "weibull(2,1)", prior))
m_pph <- brm(bf1,
family = cumulative(probit),
data = d,
prior = p,
init = "0",
iter = 1e4,
control =list(adapt_delta=0.96),
# sample_prior = "only",
silent = TRUE,
refresh = 0)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 3 finished in 992.1 seconds.
## Chain 4 finished in 993.9 seconds.
## Chain 1 finished in 997.5 seconds.
## Chain 2 finished in 1335.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 1079.7 seconds.
## Total execution time: 1335.9 seconds.
# pp_check(m_pph, resp = "PPHQ115", type = "bars", nsamples = 250)
So we have now sampled a number of models (\(6\)), i.e., for each instrument we have one multivariate model (\(>1\) outcome), where we also estimate the correlation between the outcome given the subject’s ID.
Before we put any trust in these models we need to check a number of diagnostics.
Hamiltonian Monte Carlo diagnostics indicate all is well (we’ve also checked traceplots, effective sample sizes, and \(\widehat{R}\) for all models).
The diagnostics code is not executed in this Rmd
since it takes up a lot of space. However, if one wants to, in the index.Rmd
(used to generate the index.html
) one can see all diagnostics that were checked and if one wants to rerun it then simply change the code chunk to eval=TRUE
and include=TRUE
.
First, let’s summarize what we have so far:
We can now investigate which, if any, parameters are significant on the 95%-level for the third item in the list, i.e., for each model, and each question, plot all parameters. Any parameters that are significant on the 95%-level could warrant further analysis, and in particular a connection to the qualitative analysis.
For each model we first analyze the temporal variable \(t\). Then we analyze the rest of the \(\beta\) parameters. Finally, where appropriate we look at the conditional effects. All densities we plot are limited to 95% probability mass, while the 50% probability mass is shaded around a vertical line, which is the median.
In short, any density that does not cross zero (the vertical line) is an effect of interest.
Each of the following models used a sample size of \(n=\) 187 for sampling.
The model estimated 8071 parameters.
For Questions \(1\)–\(8\), \(11\)–\(12\), and \(14\) the parameter \(t\) is negative. In short, the respondents answered with higher responses at \(t_0\) than at \(t_1\) for these questions.
Some examples of how the questions were phrased:9
If we look at each unique parameter, in each question, can we see which, if any, predictors drove these results?
Much variance makes the estimates uncertain and, ultimately, no parameter is significant.
And the same applies to gender. How about occupation status?
Occupation status (student/non-student) has an effect for Q2. Let’s look at living conditions before we turn our attention to the next instrument.
Questions \(1\)–\(3\) are significant, as are Q\(8\), and Q\(12\).
The model estimated 6227 parameters.
All \(t\) parameters are significant (positive), i.e, subjects responded with higher values at \(t_1\).
The model estimated 4023 parameters.
All \(t\) parameters are significant, i.e., higher values at \(t_1\), except for Q\(3\).
The model estimated 12365 parameters.
Concerning the temporal variable,
Questions \(4\), \(9\), \(12\), \(15\), \(17\)–\(18\) are significant.
Concerning the remaining predictors,
The model estimated 5071 parameters.
Concerning the temporal variable,
Questions \(6\), \(7\), and \(9\) are significant.
Concerning the remaining predictors,
The model estimated 5681 parameters.
Concerning the temporal variable,
Question \(1\) shows a significant difference when moving from \(t_0\) to \(t_1\) (lower responses at \(t_1\)).
Concerning the remaining predictors,
# (APPENDIX) Appendices {-}
In order to design an appropriate model, one of the things we need to make assumptions about is the underlying data-generative model, i.e., a prior for the data we have, which is usually called the likelihood.
In short, there are four candidates: Cumulative, Continuation ratio, Stopping ratio, and Adjacent category. Let’s use LOO to create identical null models with default priors, and then use the same data for all models to compare them from an information theoretical perspective.
bf1 <- bf(mvbind(MAASQ1_1_6,
MAASQ2_1_6,
MAASQ3_1_6,
MAASQ4_1_6,
MAASQ5_1_6,
MAASQ6_1_6,
MAASQ7_1_6,
MAASQ8_1_6,
MAASQ9_1_6,
MAASQ10_1_6,
MAASQ11_1_6,
MAASQ12_1_6,
MAASQ13_1_6,
MAASQ14_1_6,
MAASQ15_1_6) ~
1)
m0 <- brm(bf1,
family = cumulative,
data = d,
silent = TRUE,
refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 3 finished in 119.0 seconds.
## Chain 1 finished in 120.2 seconds.
## Chain 2 finished in 121.9 seconds.
## Chain 4 finished in 121.9 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 120.7 seconds.
## Total execution time: 122.3 seconds.
mcr <- brm(bf1,
family = cratio,
data = d,
silent = TRUE,
refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 2 finished in 37.1 seconds.
## Chain 3 finished in 37.1 seconds.
## Chain 4 finished in 37.3 seconds.
## Chain 1 finished in 37.5 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 37.2 seconds.
## Total execution time: 37.9 seconds.
msr <- brm(bf1,
family = sratio,
data = d,
silent = TRUE,
refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 4 finished in 36.4 seconds.
## Chain 1 finished in 36.5 seconds.
## Chain 2 finished in 36.5 seconds.
## Chain 3 finished in 42.6 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 38.0 seconds.
## Total execution time: 42.9 seconds.
mac <- brm(bf1,
family = acat,
data = d,
silent = TRUE,
refresh = 0
)
## Running MCMC with 4 chains, at most 8 in parallel...
##
## Chain 4 finished in 75.6 seconds.
## Chain 2 finished in 76.2 seconds.
## Chain 1 finished in 76.5 seconds.
## Chain 3 finished in 77.1 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 76.4 seconds.
## Total execution time: 77.4 seconds.
m0 <- add_criterion(m0, criterion = "loo")
msr <- add_criterion(msr, criterion = "loo")
mcr <- add_criterion(mcr, criterion = "loo")
mac <- add_criterion(mac, criterion = "loo")
# setting moment_match = TRUE will still lead to the same conclusion
loo_compare(m0, mcr, msr, mac)
## elpd_diff se_diff
## m0 0.0 0.0
## mac -0.3 0.4
## msr -0.8 1.0
## mcr -0.9 1.1
Evidently, since \(\Delta\)SE is fairly large, in comparison to the relative difference in expected log pointwise predictive density (\(\Delta\)elpd), we can safely assume that there’s no significant difference between the likelihoods and, thus, opt for the standard approach, i.e., the Cumulative distribution, in this case represented by \(\mathcal{M}_0\).
print(sessionInfo(), locale=FALSE)
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin20.4.0 (64-bit)
## Running under: macOS Big Sur 11.4
##
## Matrix products: default
## BLAS: /usr/local/Cellar/openblas/0.3.15_1/lib/libopenblasp-r0.3.15.dylib
## LAPACK: /usr/local/Cellar/r/4.1.0/lib/R/lib/libRlapack.dylib
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] scico_1.2.0 here_1.0.1 data.table_1.14.0 tidybayes_2.3.1
## [5] openxlsx_4.2.4 patchwork_1.1.1 ggthemes_4.2.4 latex2exp_0.5.0
## [9] bayesplot_1.8.1 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
## [13] purrr_0.3.4 readr_1.4.0 tidyr_1.1.3 tibble_3.1.2
## [17] ggplot2_3.3.5 tidyverse_1.3.1 plyr_1.8.6 brms_2.15.0
## [21] Rcpp_1.0.6
##
## loaded via a namespace (and not attached):
## [1] readxl_1.3.1 backports_1.2.1 igraph_1.2.6
## [4] splines_4.1.0 svUnit_1.0.6 crosstalk_1.1.1
## [7] rstantools_2.1.1 inline_0.3.19 digest_0.6.27
## [10] htmltools_0.5.1.1 rsconnect_0.8.18 fansi_0.5.0
## [13] checkmate_2.0.0 magrittr_2.0.1 modelr_0.1.8
## [16] RcppParallel_5.1.4 matrixStats_0.59.0 xts_0.12.1
## [19] prettyunits_1.1.1 colorspace_2.0-2 rvest_1.0.0
## [22] ggdist_2.4.1 haven_2.4.1 xfun_0.24
## [25] callr_3.7.0 crayon_1.4.1 jsonlite_1.7.2
## [28] lme4_1.1-27.1 zoo_1.8-9 glue_1.4.2
## [31] gtable_0.3.0 V8_3.4.2 distributional_0.2.2
## [34] pkgbuild_1.2.0 rstan_2.21.2 abind_1.4-5
## [37] scales_1.1.1 mvtnorm_1.1-2 DBI_1.1.1
## [40] miniUI_0.1.1.1 xtable_1.8-4 stats4_4.1.0
## [43] StanHeaders_2.21.0-7 DT_0.18 htmlwidgets_1.5.3
## [46] httr_1.4.2 threejs_0.3.3 arrayhelpers_1.1-0
## [49] posterior_0.1.5 ellipsis_0.3.2 farver_2.1.0
## [52] pkgconfig_2.0.3 loo_2.4.1 sass_0.4.0
## [55] dbplyr_2.1.1 utf8_1.2.1 labeling_0.4.2
## [58] tidyselect_1.1.1 rlang_0.4.11 reshape2_1.4.4
## [61] later_1.2.0 munsell_0.5.0 cellranger_1.1.0
## [64] tools_4.1.0 cli_3.0.0 generics_0.1.0
## [67] broom_0.7.8 ggridges_0.5.3 evaluate_0.14
## [70] fastmap_1.1.0 yaml_2.2.1 processx_3.5.2
## [73] knitr_1.33 fs_1.5.0 zip_2.2.0
## [76] nlme_3.1-152 mime_0.11 projpred_2.0.2
## [79] xml2_1.3.2 compiler_4.1.0 shinythemes_1.2.0
## [82] rstudioapi_0.13 curl_4.3.2 gamm4_0.2-6
## [85] reprex_2.0.0 bslib_0.2.5.1 stringi_1.6.2
## [88] highr_0.9 ps_1.6.0 Brobdingnag_1.2-6
## [91] lattice_0.20-44 Matrix_1.3-4 nloptr_1.2.2.2
## [94] markdown_1.1 tensorA_0.36.2 shinyjs_2.0.0
## [97] vctrs_0.3.8 pillar_1.6.1 lifecycle_1.0.0
## [100] jquerylib_0.1.4 bridgesampling_1.1-2 httpuv_1.6.1
## [103] R6_2.5.0 bookdown_0.22 promises_1.2.0.1
## [106] gridExtra_2.3 nleqslv_3.3.2 codetools_0.2-18
## [109] boot_1.3-28 colourpicker_1.1.0 MASS_7.3-54
## [112] gtools_3.9.2 assertthat_0.2.1 rprojroot_2.0.2
## [115] withr_2.4.2 shinystan_2.5.0 mgcv_1.8-36
## [118] parallel_4.1.0 hms_1.1.0 grid_4.1.0
## [121] coda_0.19-4 minqa_1.2.4 cmdstanr_0.4.0.9000
## [124] rmarkdown_2.9 shiny_1.6.0 lubridate_1.7.10
## [127] base64enc_0.1-3 dygraphs_1.1.1.6
For more details we refer you to the manuscript.↩︎
https://ppc.sas.upenn.edu/resources/questionnaires-researchers/mindful-attention-awareness-scale↩︎
https://www.midss.org/content/scale-positive-and-negative-experience-spane-0↩︎
https://sparqtools.org/mobility-measure/new-general-self-efficacy-scale/↩︎
https://www.psykiatri-regionh.dk/who-5/who-5-questionnaires/Pages/default.aspx↩︎
More details can be found in the manuscript.↩︎