Data and analysis supplement

Whalley, B, Solbrig, L, Kavanagh, DJ, May, J, Parkin, T, Jones, R, Andrade, J

Data and analysis supplement for: Functional Imagery Training versus Motivational Interviewing for Weight Loss: A randomised controlled trial of brief individual interventions for overweight and obesity.

This document provides supplementary information on the analysis in the main paper.

Tables and figures shown here are compiled from an RMarkdown source document (also provided), which gives a complete account of the analyses reported. For clarity, some R output is suppressed in the compiled document.

Documents provided to support this supplement include:

R Dependencies

We suggest using the R checkpoint package to install the required versions of the R packages used here.

Alternatively, just run options("repos" = "https://mran.microsoft.com/snapshot/2018-05-01") prior to installation to ensure consistent package versions.
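
For example (a sketch rather than the original setup chunk; the snapshot date is taken from the MRAN URL above, and the package list is inferred from the code in this document):

# Option 1: let checkpoint scan the project and install packages
# from the dated MRAN snapshot
library(checkpoint)
checkpoint("2018-05-01")

# Option 2: point install.packages() at the same snapshot and install by hand
options(repos = "https://mran.microsoft.com/snapshot/2018-05-01")
install.packages(c(
  "tidyverse", "reshape2", "data.table", "broom", "pander",
  "lme4", "lmerTest", "emmeans", "rstanarm", "tidybayes",
  "mice", "ggthemes", "waffle"
))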

Data setup

Dataframes in a suitable format for our analyses were created as follows:

bmi <- function(height, weight) weight / height^2
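# note: height is recorded in metres and weight in kg, e.g. bmi(height = 1.75, weight = 80) is ~26.1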

df <- readr::read_csv("blind_data.csv") %>%
  filter(!is.na(group)) %>%
  mutate(person = row_number()) %>%
  mutate(female = gender == "f") %>%
  select(-gender) %>%
  mutate(
    bmi1 = bmi(height, kg1),
    bmi2 = bmi(height, kg2),
    bmi3 = bmi(height, kg3)
  ) %>%
  # These two lines were added after the analyses had been completed, allowing us to properly
  # label tables and figures. All analyses were initially conducted blind to condition.
  mutate(group = factor(ifelse(group == 1, "MI", "FIT"))) %>%
  mutate(group = relevel(group, ref = "MI"))

patient.chars <- df %>%
  select(person, age, female, height)

df_long <-
  df %>%
  select(
    contains("kg"), contains("cm"), contains("bmi"), contains("gqol"),
    person, group
  ) %>%
  gather(variable, value, -person, -group) %>%

  # regex to extract variable names and followup point from column names
  mutate(
    var = stringr::str_extract(variable, "\\w+\\D"),
    time = as.numeric(stringr::str_extract(variable, "\\d"))
  ) %>%
  reshape2::dcast(person + time + group ~ var) %>%

  # make baseline scores
  arrange(person, time) %>%
  group_by(person) %>%
  mutate(cm_base = first(cm), kg_base = first(kg), bmi_base = first(bmi)) %>%
  ungroup() %>%

  # center baselines for ease of interpretation
  mutate_at(vars(contains("_base")), funs(scale(., center = T, scale = F))) %>%
  mutate(time = factor(time, labels = c("Baseline", "Month 6", "Month 12"))) %>%

  # merge age etc, which are constant per-participant, back to this long DF
  left_join(., patient.chars)

# Version without baseline outcomes for primary analyses
df_long_nobase <- df_long %>% 
  filter(time != "Baseline")

Save intermediate data for later use:

write_csv(df, "intermediate-data/df.csv")
write_csv(df_long, "intermediate-data/df_long.csv")

For reference, print group means, SDs, and N at each followup:

df_long %>%
  group_by(time, group) %>%
  summarise(mean(kg, na.rm = T), sd(kg, na.rm = T), sum(!is.na(kg))) %>%
  pander()
time group mean(kg, na.rm = T) sd(kg, na.rm = T) sum(!is.na(kg))
Baseline MI 89.13 14.76 58
Baseline FIT 90.48 15.90 63
Month 6 MI 88.39 15.72 55
Month 6 FIT 86.37 15.07 59
Month 12 MI 88.46 15.34 53
Month 12 FIT 84.04 15.96 59

And the standard deviation of the change from baseline, and N, at 6 and 12 months in each group:

df %>%
  group_by(group) %>%
  summarise(sd(kg1 - kg2, na.rm = T), 
            sd(kg1 - kg3, na.rm = T), 
            N_m6 = sum(!is.na(kg1)&!is.na(kg2)), 
            N_m12 = sum(!is.na(kg1)&!is.na(kg3))) %>%
  # add total N
  bind_rows(., summarise_at(., vars(starts_with("N")), sum)) %>% 
  pander(missing="")
group sd(kg1 - kg2, na.rm = T) sd(kg1 - kg3, na.rm = T) N_m6 N_m12
MI 2.057 2.319 55 53
FIT 4.180 6.275 59 59
114 112

Missing data

Very little data in the study was lost to followup: only 9 participants failed to provide complete data, and only 7 failed to provide enough data to be included in our primary outcome models:

xs <- function(x) ifelse(x == 1, "x", ".")
df %>%
  select(matches("^kg|^cm|^gqol")) %>%
  mice::md.pattern() %>%
  tidy() %>% rename(Count=.rownames) %>% 
  select(-V9) %>%
  mutate_at(funs(xs), .vars = vars(-Count)) %>%
  select(Count, noquote(order(colnames(.)))) %>%
  filter(Count != "") %>%
  pander(caption = "Patterns of data missing or lost-to-followup.")
Patterns of data missing or lost-to-followup.
Count cm1 cm2 cm3 gqol1 gqol2 kg1 kg2 kg3
112 x x x x x x x x
2 x x . x x x x .
7 x . . x . x . .

Comparison of groups at baseline

We reported a table of baseline characteristics of participants in each group. In the text we further reported that none of these baseline differences were statistically significant, based on the analyses shown here.

These functions run linear or logistic regression models, returning a test for the group indicator:

vars_to_compare <- c("cm1", "kg1", "bmi1", "gqol1", "age")
binary_vars_to_compare <- c("female")

# Helper functions for baseline comparisons, all return dataframes
comparegroups_cont <- function(f) lm(f, data = df) %>% broom::tidy()

comparegroups_binary <- function(f) {
  glm(f, data = df, family = binomial(link = "logit")) %>%
    broom::tidy() %>%
    # Convert logit to OR. Remember glm returns z not t
    mutate_at(vars(estimate, std.error), funs(exp))
}

# not actually used in the end
# comparegroups_categ <- function(f){
#     # as an example, remember glm returns z stat not t
#     chisq.test(xtabs(f, data=df)) %>%
#       broom::tidy() %>%
#       rename(df = parameter)
# }

We then run models for all continuous variables:

# compare all continuous variables
conts <- data_frame(model = paste(vars_to_compare, "~ group")) %>%
  group_by(model) %>%
  do(., comparegroups_cont(.$model)) 

And all binary variables:

binas <- data_frame(model = paste(binary_vars_to_compare, "~ group")) %>%
  group_by(model) %>%
  do(., comparegroups_binary(.$model)) 

And combine in a single table:

round.p <- function(p){
  # truncate very small p values at .001, then format; vectorised over p
  p <- ifelse(p < .001, .001, p)
  sprintf("%.3f", p)
}
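# e.g. round.p(0.0004) gives "0.001" and round.p(0.436) gives "0.436"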

both <- bind_rows(conts, binas) %>% 
  filter(term == "groupFIT") %>%
  select(-term) %>% 
  setNames(c("Model", "Difference/OR", "Std. Error", "Statistic", "p"))
both %>%
  mutate(p = round.p(p), p = str_replace(p, "=", "")) %>%
  pander::pander(caption = "Baseline between-group comparisons (mean difference and t-statistic for continuous outcomes; odds ratio and z for binary variables; chisq for categorical variables).")
Baseline between-group comparisons (mean difference and t-statistic for continuous outcomes; odds ratio and z for binary variables; chisq for categorical variables).
Model Difference/OR Std. Error Statistic p
age ~ group 2.000 2.558 0.782 0.436
bmi1 ~ group 0.408 0.988 0.413 0.680
cm1 ~ group 0.567 2.397 0.236 0.813
gqol1 ~ group 0.420 2.296 0.183 0.855
kg1 ~ group 1.355 2.796 0.485 0.629
female ~ group 0.944 1.511 -0.140 0.889

Outcome plots

We report a figure displaying raw outcome data:

primary_outcomes_raw <- df_long %>%
  select(-contains("base"), -age, -female, -height, -bmi, -gqol) %>%
  reshape2::melt(id.vars = c("group", "person", "time")) %>%
  filter(!is.na(value)) %>% 
  ggplot(aes(time, value, group = group, shape = group)) +
  stat_summary(geom = "pointrange", fun.data=mean_cl_boot, position=position_dodge(width=.1)) +
  stat_summary(geom = "line", fun.data = mean_cl_boot) +
  facet_wrap(~ toupper(variable), scales = "free") +
  stat_summary(
    aes(label = sprintf("%.0f", ..y..)),
    fun.y = mean, geom = "text", size = 2.5, vjust = 1.5, hjust = 1.5
  ) +
  xlab("Time") + ylab("kg/cm") +
  scale_shape_discrete("") + 
  ggthemes::theme_tufte()

# ggsave("plots/primary_outcomes_raw.pdf", width = 5, height = 3)
primary_outcomes_raw

Unadjusted weight and waist circumference by group: mean and 95% confidence interval.

Primary outcome models

Our outcome models were defined as:

f.kg <- kg ~ kg_base + bmi_base * time * group + (1 | person)
f.cm <- cm ~ cm_base + bmi_base * time * group + (1 | person)
f.gqol <- gqol2 ~ gqol1 + bmi1 * group

Models were fitted either by ML/REML (using lm or lmer) or, for Bayesian fits, by MCMC (using rstanarm).

First with ML:

m.kg <- lmer(f.kg, data = df_long_nobase)
m.cm <- lmer(f.cm, data = df_long_nobase)

# note only have 6 month data for gqol so not lmer
m.gqol <- lm(f.gqol, data = df)

And then some code to extract contrasts we need:

gqol.contrasts <-
  emmeans(m.gqol, "group", contr = "trt.vs.ctrlk")$contrasts %>%
  tidy() %>%
  mutate(time = "Month 6")

# helper to do the same for lmer models
# we see warnings when this runs because of the baseline interaction,
# but we have centered baseline. Contrasts are very similar to models
# without baseline interaction.
getlsmeansforgroup_ <- function(model) {
  emmeans(model, pairwise ~ group | time, adjust = "none")$contrasts %>%
  tidy()
}

# merge results
bgconts <- bind_rows(
  getlsmeansforgroup_(m.kg) %>% mutate(outcome = "Kg"),
  getlsmeansforgroup_(m.cm) %>% mutate(outcome = "Cm"),
  gqol.contrasts %>% mutate(outcome = "GQOL")
) %>%
  select(-level1, -level2) %>%
  rename(t = statistic, difference = estimate) %>%

  # flip estimates because of contrast coding direction
  mutate(difference = -1 * difference, t = -1 * t) %>%
  select(outcome, everything()) %>%

  # fix p values for table output
  mutate(p = round.p(p.value), p = str_replace(p, "=", "")) %>%
  select(-p.value) %>%

  # fix column names
  setNames(c("Outcome", "Followup", "Difference", "Std. Error", "df", "t", "p"))


# These results are later merged with MCMC estimates and raw means so no 
# output here - see table below.

Models estimated with MCMC, using rstanarm package:

m.gqol.stan <- stan_lm(f.gqol, data = df, prior = R2(location = .5))
m.kg.stan <- stan_lmer(kg ~ kg_base + bmi_base * time * group + (1 | person), 
                       data = df_long_nobase)
m.cm.stan <- stan_lmer(cm ~ cm_base + bmi_base * time * group + (1 | person), 
                       data = df_long_nobase)
saveRDS(m.gqol.stan, "saved-chains/m.gqol.stan.RDS")
saveRDS(m.kg.stan, "saved-chains/m.kg.stan.RDS")
saveRDS(m.cm.stan, "saved-chains/m.cm.stan.RDS")

We used the default priors provided by the rstanarm package, which are weakly informative and pessimistic (i.e. the prior for the effect size has zero mean). Details of the priors are shown below. Priors are constructed from the data based on the rules described at https://cran.r-project.org/web/packages/rstanarm/vignettes/priors.html#automatic-prior-scale-adjustments. For the GQOL model we set the prior location for \(R^2\) to .5, because the baseline scores included in the model are likely to be highly correlated with scores at followup (as is the case for weight, for example). Setting the \(R^2\) location lower would tend to shrink the model coefficients, but across a reasonable range of values (we tested \(R^2 = \{.2, .5, .8\}\)) the inferences reported did not change.

m.gqol.stan$prior.info %>% pander()
m.kg.stan$prior.info %>% pander()
m.cm.stan$prior.info %>% pander()
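
The \(R^2\) sensitivity check described above can be sketched as follows (a reconstruction rather than the original code; it refits the GQOL model under each prior location and compares the posterior median of the FIT vs. MI coefficient, named groupFIT given the factor coding used earlier):

# Refit the GQOL model with alternative prior locations for R^2
# (slow: each call runs a full MCMC fit)
sapply(c(.2, .5, .8), function(r2) {
  fit <- stan_lm(f.gqol, data = df, prior = R2(location = r2))
  coef(fit)["groupFIT"]  # posterior median of the FIT vs. MI coefficient
})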

Model convergence

All models appeared to converge satisfactorily, based on visual inspection of MCMC traces and parameter R-hat statistics.

rstan::stan_trace(m.cm.stan)

rstan::stan_trace(m.kg.stan)

rstan::stan_trace(m.gqol.stan)

Rhat and Monte Carlo standard errors (MCSE) also seemed reasonable:

# Extract Rhat and other stats from each model and summarise
a <- summary(m.cm.stan) %>% as_tibble() %>% mutate(Outcome = "Cm")
b <- summary(m.kg.stan) %>% as_tibble() %>% mutate(Outcome = "Kg")
c <- summary(m.gqol.stan) %>% as_tibble() %>% mutate(Outcome = "GQOL")

model.stats <- bind_rows(a, b, c) %>%
  mutate(mcse.scaled = mcse / sd) %>%
  select(Rhat, mcse, mcse.scaled, Outcome) %>%
  reshape2::melt(id.var = "Outcome") %>%
  group_by(Outcome, variable)

model.stats %>%
  do(., mean_cl_normal(.$value)) %>%
  rename(Statistic = variable, Mean = y, `95% lower` = ymin, `95% upper` = ymax) %>%
  pander(caption = "Model convergence statistics averaged across model parameters with 95% CI. Note mcse_scaled = mcse/sd.")
Model convergence statistics averaged across model parameters with 95% CI. Note mcse_scaled = mcse/sd.
Outcome Statistic Mean 95% lower 95% upper
Cm Rhat 1.001 1.001 1.001
Cm mcse 0.035 0.024 0.046
Cm mcse.scaled 0.017 0.016 0.018
GQOL Rhat 1.001 1.000 1.002
GQOL mcse 0.049 -0.006 0.104
GQOL mcse.scaled 0.023 0.019 0.027
Kg Rhat 1.000 1.000 1.000
Kg mcse 0.030 0.022 0.038
Kg mcse.scaled 0.017 0.016 0.017

Average treatment effects

Average treatment effects are calculated from our ML models using the emmeans package. For MCMC models, average treatment effects are computed based on posterior linear predictions (i.e. predictions for the mean using rstanarm::posterior_linpred or the add_fitted_samples wrapper in tidybayes). Where we refer to predictions for individuals or for ‘new patients’ then these are based on full posterior predictions (i.e. rstanarm::posterior_predict).
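
To make this distinction concrete, a minimal sketch (not part of the original analysis) using the fitted weight model directly rather than the tidybayes wrappers:

# Draws of the linear predictor: uncertainty about the *mean* outcome,
# used for the average treatment effects
mu_draws <- rstanarm::posterior_linpred(m.kg.stan)

# Full posterior predictions: add residual variation, giving uncertainty
# about outcomes for individual ('new') patients
ind_draws <- rstanarm::posterior_predict(m.kg.stan)

dim(mu_draws) # draws x rows of the model data; ind_draws has the same shape but wider spread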

Begin by making dataframes defining the values to predict at; we predict at the means of the covariates:

# This is for the GQOL model:
newd <-
  expand.grid(
    cm_base = 0,
    gqol1 = mean(df$gqol1, na.rm = T),
    bmi1 = mean(df$bmi1, na.rm = T),
    group = unique(df_long$group)
  ) %>%
  as_tibble()

# And for cm and kg (all centered)
newd_long <-
  expand.grid(
    person = Inf,
    cm_base = 0, kg_base = 0, bmi_base = 0,
    time = factor(unique(df_long$time)[2:3]),
    group = unique(df_long$group)
  ) %>%
  as_tibble()

Make the predictions/intervals for means:

preds.k <- add_fitted_samples(m.kg.stan, newdata = newd_long) %>%
  mutate(outcome = "Kg")

preds.c <- add_fitted_samples(m.cm.stan, newdata = newd_long) %>%
  mutate(outcome = "Cm")

preds.g <- add_fitted_samples(m.gqol.stan, newdata = newd) %>%
  mutate(outcome = "GQOL") %>%
  # fill this in; would otherwise be NA and mess up our table
  mutate(time = "Month 6")

Then make predictions/intervals for ‘new’ individuals:

preds.k.post <- add_predicted_samples(m.kg.stan, newdata = newd_long) %>%
  mutate(outcome = "Kg")

preds.c.post <- add_predicted_samples(m.cm.stan, newdata = newd_long) %>%
  mutate(outcome = "Cm")

And bind everything together:

stan.preds <- bind_rows(preds.g, preds.c, preds.k) %>%
  ungroup() %>% 
  select(outcome, estimate, .iteration, .row, group, time) %>% 
  data.table::melt(id.vars=c("outcome", ".iteration", ".row", "group", "time")) %>% 
  data.table::dcast(outcome+time+.iteration~group) 

stan.preds.post <- bind_rows(preds.c.post, preds.k.post) %>%
  ungroup() %>% 
  select(outcome, pred, .iteration, .row, group, time) %>% 
  data.table::melt(id.vars=c("outcome", ".iteration", ".row", "group", "time")) %>% 
  data.table::dcast(outcome+time+.iteration~group) 

We use these draws from the model posterior distribution to summarise the average treatment effects:

effect.summary <- stan.preds %>%
  group_by(outcome, time) %>%
  mean_qi(., FIT - MI) %>% 
  set_names(c("Outcome", "Followup", "Difference", "lower", "upper", "prob"))

# a table for the raw data in the same format to merge into a single table.
rawlong <-
  df_long %>%
  select(group, time, cm, kg, gqol) %>%
  rename(Kg = kg, Cm = cm, GQOL = gqol) %>%
  reshape2::melt(id.var = c("group", "time")) %>%
  rename(Outcome = variable) %>%
  group_by(group, time, Outcome) %>%
  summarise(mu = mean(value, na.rm = T), sd = sd(value, na.rm = T)) %>%
  melt(id.var = c("group", "time", "Outcome")) %>%
  dcast(Outcome + time ~ group + variable) %>%
  # Remove missing row Month 12 GQOL
  filter(!is.na(MI_mu)) %>%
  reshape2::melt(id.vars=c("Outcome", "time")) %>%
  mutate(variable = stringr::str_replace(variable, "_mu", " (mean)")) %>%
  mutate(variable = stringr::str_replace(variable, "_sd", " (sd)")) %>%
  dcast(Outcome + time ~ variable) %>%
  rename(Followup = time)

We then combine predictions from the Bayesian model with ML estimates from above to create the combined table presented in the paper:

# the `bgconts` dataframe containing the ML contrasts was created in the
# code further up, from the lmer model fits
between.groups.table <-
  left_join(bgconts, effect.summary,
    by = c("Outcome", "Followup"),
    suffix = c(" (ML)", " (MCMC)")
  ) %>%
  select(-`prob`) %>%
  left_join(rawlong, .) %>%
  select(-`Difference (ML)`, -`Std. Error`) %>%
  rename(`Treatment effect (MCMC)` = `Difference (MCMC)`)

between.groups.table %>%
  pander(missing = "", caption = "Between group contrasts (with Satterthwaite corrected degrees of freedom for Kg and Cm) and posterior mean differences (and 95% credible intervals) for the effect of FIT vs. MI at month 6 and 12.")

Between group contrasts (with Satterthwaite corrected degrees of freedom for Kg and Cm) and posterior mean differences (and 95% credible intervals) for the effect of FIT vs. MI at month 6 and 12.

Outcome Followup FIT (mean) FIT (sd) MI (mean) MI (sd) df t p Treatment effect (MCMC) lower upper
Cm Baseline 106.07 13.75 105.50 12.51
Cm Month 6 99.05 12.61 102.78 13.37 205.2 -4.727 0.001 -4.455 -6.354 -2.614
Cm Month 12 96.97 12.59 103.04 12.45 205.9 -7.011 0.001 -6.685 -8.586 -4.839
Kg Baseline 90.48 15.90 89.13 14.76
Kg Month 6 86.37 15.07 88.39 15.72 161.0 -4.877 0.001 -3.674 -5.175 -2.183
Kg Month 12 84.04 15.96 88.46 15.34 163.0 -7.706 0.001 -5.931 -7.454 -4.386
GQOL Baseline 62.13 10.59 61.71 14.51
GQOL Month 6 75.81 11.66 72.53 10.42 109.0 2.107 0.001 2.837 0.061 5.655

Predictions for individuals: Recasting results from the patient perspective

By simulating data from our models it was possible to estimate the probability that the benefit of FIT over MI, in kg of weight lost or cm of waist circumference reduced, exceeded a particular value. The code below estimates and plots these probabilities for a range of outcomes between 0 and 15 kg (or cm).

# given a dataframe of posterior predictions, calculate the probability that the
# benefit of FIT over MI exceeds `val` for the named outcome
p.for.rope <- function(which.outcome, df, val) {
  df %>%
    filter(outcome == which.outcome & time == "Month 12") %>%
    group_by(outcome) %>%
    summarise(p = mean((MI - FIT) > val), val = val)
}

area_of_interest <- seq(-2, 15, .25)

ropes.post <- expand.grid(val = area_of_interest, outcome = c("Kg", "Cm")) %>%
  rowwise() %>%
  do(., p.for.rope(.$outcome, stan.preds.post, .$val))

p <- ropes.post %>%
  # trim extreme values because otherwise loess smoothing
  # goes crazy but note that the smoothed line does now match
  # the data (uncomment the geom_point(). layer below to check this)
  filter(p < .98 & p > .02) %>%

  # it's confusing to show weight-gain within this plot, so filter val > 0,
  filter(val > 0) %>%

  # copy to a capitalised column name for the plot legend
  mutate(Outcome = outcome) %>%

  # plot
  ggplot(aes(x = val, y = p, group = Outcome, linetype = Outcome)) +
  geom_smooth(se = F, color = "black", method="loess") +
  # geom_point() +
  scale_y_continuous(breaks = seq(0, 1, .1), minor_breaks = seq(.05, 1, .05)) +
  scale_x_continuous(breaks = seq(0, 15, 2)) +
  xlab("Benefit to new patient choosing FIT vs. MI.") +
  ylab("Probability") +
  scale_color_discrete("")

ggsave("plots/probability_for_ropes.pdf", p, width = 5, height = 3)
p

Probability that the effect of FIT equalled or exceeded \(X\) kg (cm).

This table was not included in the paper for brevity, but these figures were used within the text:

ropes.post %>%
  ungroup() %>% 
  group_by(outcome, p.round = round(p, 1)) %>%
  filter(p.round %in% seq(.1, .9, .1)) %>%
  summarise(effect = mean(val)) %>%
  rename(
    Outcome = outcome,
    Probability = p.round,
    Effect = effect
  ) %>%
  melt(id.var = c("Outcome", "Probability")) %>%
  dcast(Probability ~ Outcome + variable) %>%
  rename(`Cm reduction` = Cm_Effect, `Kg lost` = Kg_Effect) %>%
  pander(caption = "Supplementary table giving calculations for verbal descriptives of the effect of FIT vs. MI given in the text. Figures indicate probabilities for a benefit at least as good as the value in the kg or cm columns.")
Supplementary table giving calculations for verbal descriptives of the effect of FIT vs. MI given in the text. Figures indicate probabilities for a benefit at least as good as the value in the kg or cm columns.
Probability Cm reduction Kg lost
0.1 14.250 11.125
0.2 12.250 9.125
0.4 8.250 6.875
0.5 6.625 6.000
0.6 5.000 5.125
0.8 1.250 2.750
0.9 -1.000 1.000

Presenting predicted outcomes in natural frequencies

In the paper we report patient outcomes in natural frequencies and include waffle plots.

# define thresholds for the pictogram for kg based on % weight lost from baseline:
kg.base <- df$kg1 %>% mean(., na.rm = T)
wtcuts_pc <- c(
  -Inf,
  -kg.base * .05, -kg.base * .01, kg.base * .01, kg.base * .05, kg.base * .1, kg.base * .15, Inf
)

wtcutlabls_pc <- c("Gain > 5%", "Gain 1-5%", "< 1% change", "Lose 1-5%", "Lose 5-10%", "Lose 10-15%", "Lose > 15%")

# sometimes percentages don't sum to 100 because of rounding, so fix using this
make100 <- function(d) {
  shouldbe100 <- sum(d)
  if (shouldbe100 != 100) {
    # add or remove rounding error from the modal category
    ismode <- d == max(d)
    stopifnot(sum(ismode) < 2) # only add to one category... will now fail with ties?
    d <- d + (100 - shouldbe100) * ismode
  }
  return(d)
}
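# e.g. make100(c(48, 32, 19)) returns c(49, 32, 19): the 1% rounding error is added to the modal category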

bygroup.wd <- stan.preds.post %>%
  filter(outcome == "Kg") %>%
  rename(i = outcome) %>%
  select(i, MI, FIT, time) %>%
  reshape2::melt(id.vars = c("i", "time")) %>%
  mutate(v = kg.base - value, value = cut(kg.base - value,
    breaks = wtcuts_pc,
    labels = wtcutlabls_pc
  )) %>%
  group_by(time, variable) %>%
  mutate(N = n()) %>%
  group_by(variable, time, value) %>%
  summarise(cat = round(n() / N[1] * 100))

# make two dataframes, one for each group
mi.wd <- bygroup.wd %>%
  filter(variable == "MI" & time == "Month 12")

fit.wd <- bygroup.wd %>%
  filter(variable == "FIT" & time == "Month 12")

# make data lists
mi.waff <- mi.wd$cat
fit.waff <- fit.wd$cat

# make labels and colors
names(mi.waff) <- mi.wd$value
names(fit.waff) <- fit.wd$value
pal <- c("#e34a33", "#fdbb84", "grey", "#b1c9b4", "#41ab5d", "#238443", "#093810")

And finally draw the plots:

plot.a <- waffle::waffle(make100(mi.waff), 
                         colors = pal, flip = T, rows = 5, legend_pos = "left")
ggsave("plots/waffle_mi_percent_kg.pdf", width = 3, height = 4)
plot.a

12 month prognosis for new patients choosing MI
plot.b <- waffle::waffle(make100(fit.waff), 
                         colors = pal, flip = T, rows = 5, legend_pos = "left")
ggsave("plots/waffle_fit_percent_kg.pdf", width = 3, height = 4)
plot.b

12 month prognosis for new patients choosing FIT

This table is not shown in the paper, but some of its numbers were reported in the text:

bygroup.wd %>%
  reshape2::dcast(value ~ time + variable, value.var = "cat") %>%
  rename(Outcome = value) %>%
  pander(caption = "% patients achieving different outcomes at treatment end and 12 months in MI vs FIT groups.")
% patients achieving different outcomes at treatment end and 12 months in MI vs FIT groups.
Outcome Month 6_MI Month 6_FIT Month 12_MI Month 12_FIT
Gain > 5% 9 1 8 0
Gain 1-5% 22 6 22 2
< 1% change 16 9 16 4
Lose 1-5% 31 30 32 19
Lose 5-10% 19 37 19 40
Lose 10-15% 3 15 3 28
Lose > 15% 0 2 0 7

An additional table shows the probability that FIT is preferable to MI (i.e. gives the better predicted outcome), from the perspective of new patients.

stan.preds.post %>%
  group_by(time, outcome) %>%
  summarise(`Probability FIT preferable to MI` = mean(MI > FIT)) %>%
  arrange(-as.numeric(time), outcome) %>%
  rename(Outcome = outcome, Followup = time) %>%
  pander("Probabilities that MI is better than fit, from the perspective of new patients, at each followup.")
Probabilities that MI is better than fit, from the perspective of new patients, at each followup.
Followup Outcome Probability FIT preferable to MI
Month 12 Cm 0.852
Month 12 Kg 0.944
Month 6 Cm 0.742
Month 6 Kg 0.838

Inputs for cost-effectiveness calculations

These calculations were used to populate the Public Health England costing e-tool.xlsx for weight loss interventions (see paper for details).

fit.effs <- between.groups.table %>%
  filter(Outcome == "Kg" & Followup != "Baseline") %>%
  select(`Treatment effect (MCMC)`, Followup)

fit.effs %>%
  mutate(bmi_change = bmi(weight = `Treatment effect (MCMC)`, 
                          height = mean(patient.chars$height, na.rm = T))) %>%
  pander()
Treatment effect (MCMC) Followup bmi_change
-3.674 Month 6 -1.331
-5.931 Month 12 -2.148
xtabs(~ female + group, df) %>% pander(caption = "Female==True by Group")
Female==True by Group
  MI FIT
FALSE 15 17
TRUE 43 46