The analysis pipeline makes heavy use of custom functions which can be found the the R-script utils.R. This analysis documents the analysis steps and generates figures and tables for the paper. While this document shows the code for running the Bayesian analyses, due to their time-consuming nature they were implemented as standalone scripts.

The document is organized as follows: The section Data data gives an overview of the data used for the investigations of viral load at the first positive test, viral load differences between B.1.1.7 and non-B.1.1.7 cases, viral load time course, and viral load - culture probability. The section Analysis describes the analytic approaches, implements the analyses, verifies that model parameters were estimated reliably, and that the employed statistical models capture important characteristics of the observed data. The description of the viral load time course analysis includes descriptions of priors for the Bayesian analysis as well as prior predictive checks to examine that appropriate priors were chosen. The section Results contains the code to calculate posterior predictions, from which all reported results are derived, and generates the results presented in the main paper as well as supplementary analyses. The section Figures contains the code to generate the figures for the results section of the main article.

1 Data

The analysis uses following data files:

  • viral-load-with-negatives.tsv.bz2 has first positive test results and initial negative test results
  • Culture_probability_data_wild_type.xlsx and Culture_probability_data_wild_type_woelfel.csv and contain culture-positivity data
  • min-3-timeseries.json has time course data of subjects with at least 3 test results
  • Culture_probability_data_B.1.1.7.xlsx has data from the culturing experiment conducted for the B.1.1.7 and B.1.177 types.

1.1 RT-PCR test data

In this data analysis, subjects are classified into one of the following groups:

  • Hospitalised: This includes subjects with a positive test, where the sample was taken in a stationary hospital ward. See table 1.1 groups of test centres or locations
  • Pre-symptomatic, asymptomatic, and mildly-symptomatic (PAMS): This includes subjects where the first positive test was taken in a test centre established for SARS-CoV-2 testing and who later did not test positive in a hospital ward.
  • Other: All subjects that do not fall into one of the first two groups.
## get_full_data() loads and pre-processes the data. 
dt = get_full_data() %>%
  .[log10Load < 0, log10Load := NA] 

Table 1.1 gives an overview of the number of tests, subjects and positive tests in the different test centre types (testing locations).

centre_table = 
  dt %>% 
  .[, .(`N centres` = length(unique(TestCentre)),
        `N tests`= .N,
        `N individuals` = length(unique(personHash)),
        `N positive` = sum(!is.na(log10Load)),
        `N hosp.` = sum(Hospitalized == T & !is.na(log10Load)),
        `% hosp.` = round(sum(Hospitalized == T & !is.na(log10Load))/sum(!is.na(log10Load))*100),
        `Mean VL` = round(mean(log10Load,na.rm = T),1)), 
   by = .(TestCentreCategory)] %>% 
  setnames(c("TestCentreCategory"), c("Centr abbr.")) %>% 
  .[, Status := ifelse(`Centr abbr.` %in% c("C19"),"Likely PAMS",
                       ifelse(`Centr abbr.` %in% c(HOSPITAL_TEST_CENTRE,"C19-H"),
                             "Likely hospitalized","Other"))] 

abbrevs = 
  matrix(c('Airport', 'AIR',
           'COVID-19 testing centre', 'C19',
           'COVID-19 testing centre\nlater hospitalized', 'C19-H',
           'Infectious diseases ward', 'IDW',
           'Company physician', 'CP',
           'Emergency department', 'ED',
           'Forensic medicine', 'FM',
           'Hospital', 'H',
           'Intensive care unit', 'ICU',
           'Other laboratory \n(not Labor Berlin)', 'L',
           'Labour ward', 'LW',
           'Outpatient department', 'OD',
           'Public health department ', 'PHD',
           'Unclassified', '?',
           'Sports Medicine', "SM",
           'Ward', 'WD',
           'Prison','PRI',
           'Age residence','RES'),
         nrow = 2) %>%
  t() %>%
  data.table() %>%
  setnames(c("V1","V2"), c("Centre type", "Centr abbr.")) 

centre_table = 
  merge(centre_table, abbrevs) %>% 
  .[,.(Status,`Centr abbr.`, `Centre type`,`N centres`,`N positive`, `N tests`, `N individuals`,`Mean VL`, `N hosp.`, `% hosp.`)] %>% 
  .[, `Positivity rate` := round(`N positive`/`N tests` *100,1)] %>% 
  .[order(Status,-`N tests`)]
 
caption_centre_table = paste("Test centre categories and test counts for",format(length(unique(dt$personHash)),big.mark = ","),"subjects.*")
note_centre_table = "First-positive RT-PCR tests are broken down according to test centre type. Each centre type category includes data from many different test centres. A category of Other is assigned to all those who were not at a centre testing people who were likely PAMS or likely hospitalized. For example, emergency departments fall into the Other category because subjects presenting there are not likely to be asymptomatic, but are not counted as formally hospitalized at that point."
centre_table %>% 
  kable(format = table_format,
        caption = caption_centre_table) %>%
  kable_styling(full_width = F) %>% 
  add_footnote(note_centre_table, notation="symbol")
Table 1.1: Test centre categories and test counts for 415,935 subjects.*
Status Centr abbr. Centre type N centres N positive N tests N individuals Mean VL N hosp. % hosp. Positivity rate
Likely PAMS C19 COVID-19 testing centre 23 6159 163489 71128 6.9 49 1 3.8
Likely hospitalized WD Ward 552 4140 226691 107452 6.0 4140 100 1.8
Likely hospitalized H Hospital 159 1996 71745 39462 6.3 1996 100 2.8
Likely hospitalized ICU Intensive care unit 15 630 8518 5196 6.3 630 100 7.4
Likely hospitalized IDW Infectious diseases ward 1 69 186 154 6.1 69 100 37.1
Other ED Emergency department 142 8224 187624 131091 6.5 2457 30 4.4
Other OD Outpatient department 217 744 63296 46143 5.7 51 7 1.2
Other AIR Airport 2 487 41907 36799 6.0 3 1 1.2
Other ? Unclassified 917 634 36052 22256 5.9 40 6 1.8
Other RES Age residence 5 1121 24500 9225 6.0 61 5 4.6
Other PRI Prison 16 206 13976 5565 5.7 0 0 1.5
Other LW Labour ward 31 99 12423 10001 5.5 18 18 0.8
Other CP Company physician 7 202 12406 6747 5.8 4 2 1.6
Other L Other laboratory (not Labor Berlin) 36 455 9964 6691 6.0 1 0 4.6
Other SM Sports Medicine 1 116 5921 1301 5.2 0 0 2.0
Other PHD Public health department 5 70 813 657 5.5 0 0 8.6
Other FM Forensic medicine 1 29 112 80 7.2 0 0 25.9
* First-positive RT-PCR tests are broken down according to test centre type. Each centre type category includes data from many different test centres. A category of Other is assigned to all those who were not at a centre testing people who were likely PAMS or likely hospitalized. For example, emergency departments fall into the Other category because subjects presenting there are not likely to be asymptomatic, but are not counted as formally hospitalized at that point.
centre_table[, `N centres` := NULL]
ft_centre_table = 
  do.call(rbind,
  lapply(unique(centre_table$Status), function(status) {
    status_val = paste0(
            status,"\n (",
            sum(centre_table[Status == status,`N positive`]),", ",
            round(sum(centre_table[Status == status,`N positive`])/
                    sum(centre_table$`N positive`)*100,1),"%)")
    centre_table[Status == status][,Status := status_val]
  }
  )) 

ft_centre_table[c(3:5,7:17), Status := ""]

ft_centre_table = 
  flextable(ft_centre_table) %>%
  autofit() %>%
  set_caption(caption = caption_centre_table) %>%
  footnote(i = 1,
           j = c(5),
           value = as_paragraph(
             c(note_centre_table)),
           ref_symbols = c("*"),
           part = "header",
           inline = TRUE) %>% 
  hline(i = c(1,5))

table_doc <- read_docx() %>% 
  body_add_flextable(ft_centre_table) %>%
  body_end_section_portrait() # a landscape section is ending here

The following table and figures provide an overview over the data used in the analysis of first positive RT-PCR tests.

my_breaks = c(seq(0,25,5),seq(35,65,10),120)
my_labels = c("0-5","5-10","10-15","15-20","20-25","25-35","35-45","45-55","55-65",">65")
## get_TC_data() loads and pre-processes the data. 
TC_data = get_TC_data()
time_course_table =  
  TC_data %>%
  .[, `Age category` := cut(Age, breaks = my_breaks, ordered_result = T,
                    labels = my_labels)] %>%
  .[day == 0, 
    list(`N time` = .N),
    by = .(`Age category`)] %>% 
  .[order(`Age category`)]

dt[, positive := ifelse(is.na(log10Load),F,T)]

t1 = function(dt) {
  dt[, list(`N pos.` = sum(positive,na.rm = T),
            `pos. rate` = round((sum(positive,na.rm = T) / .N)*100,1),
            `log~10~load` = paste0(round(mean(log10Load,na.rm = T),1),
                                  " (",round(sd(log10Load,na.rm = T),2),")")),
     by = .(`Age category`)] %>%
    .[order(`Age category`)]
}
table_1 = 
  cbind(t1(dt),
        time_course_table[, .(`N time`)],
        t1(dt[Group == "PAMS"])[,-1,with = F],
        t1(dt[Hospital_centre == 1])[,-1,with = F])
table_1_caption = "Viral load by age groups for the full sample, pre-, asymptomatic, and mild- (PAMS) cases, and hospitalized cases"
table_1_note = "N pos. = number of positive tests, pos. rate = positivity rate, N time = number of cases with time course data."
tbl = 
  kable(table_1, digits = 1, 
        format = table_format,
        caption = table_1_caption) %>%
  add_header_above(c(" " = 1, "Full sample" = 4, "PAMS" = 3, "Hospitalized" = 3)) %>%
  kable_styling(full_width = F) %>%
  add_footnote(table_1_note, notation="symbol")
tbl
Table 1.2: Viral load by age groups for the full sample, pre-, asymptomatic, and mild- (PAMS) cases, and hospitalized cases
Full sample
PAMS
Hospitalized
Age category N pos. pos. rate log10load N time N pos. pos. rate log10load N pos. pos. rate log10load
0-5 331 1.8 5.9 (1.84) 14 36 5.1 6.6 (1.87) 32 0.9 5.6 (2.22)
5-10 186 1.8 6 (1.73) 12 39 6.2 6.1 (1.83) 18 1.4 5.8 (1.97)
10-15 226 2.2 6 (1.76) 8 51 6.9 6.4 (1.92) 22 1.4 6 (2.02)
15-20 652 3.0 6.3 (1.87) 39 197 5.2 6.7 (1.77) 123 2.5 6.1 (1.95)
20-25 1642 3.2 6.5 (1.89) 110 700 4.0 6.9 (1.86) 245 2.7 5.9 (1.92)
25-35 4457 3.0 6.6 (1.9) 323 1985 3.9 7 (1.83) 617 2.2 6.1 (1.89)
35-45 3387 2.7 6.4 (1.84) 322 1272 3.5 6.9 (1.8) 574 2.0 6 (1.89)
45-55 3338 3.1 6.4 (1.81) 405 1012 3.4 6.9 (1.83) 736 2.3 5.9 (1.77)
55-65 3326 2.7 6.3 (1.78) 625 675 3.0 6.8 (1.82) 1039 2.1 5.9 (1.8)
>65 7836 3.0 6.4 (1.79) 2484 143 5.8 6.8 (1.87) 3429 2.3 6.2 (1.86)
0-5 331 1.8 5.9 (1.84) 2 36 5.1 6.6 (1.87) 32 0.9 5.6 (2.22)
* N pos. = number of positive tests, pos. rate = positivity rate, N time = number of cases with time course data.
names(table_1)[6:8] = paste("P",names(table_1)[6:8])
names(table_1)[9:11] = paste("nP",names(table_1)[9:11])

ft = 
  flextable(table_1) %>%
  autofit() %>%
  set_caption(caption = table_1_caption) %>%
  footnote(i = 1,
           j = c(2,3,4,5),
           value = as_paragraph(
             c("number of positive tests",
               "proportion positive tests",
               "mean (standard deviation)",
               "number of cases with time course data")),
           ref_symbols = c("a", "b", "c","d"),
           part = "header",
           inline = TRUE)

save_as_docx(ft,path = "figures/Table1.docx")

table_doc <- table_doc %>%
  body_add_flextable(ft) %>%
  body_end_section_landscape() # a landscape section is ending here

rm(time_course_table)
## get_log10Load_data() loads and pre-processes the data.
bdata = 
  get_log10Load_data() 

lvls = do.call(c,lapply(levels(bdata$Month), function(x) substr(x,1,3)))
bdata %>% 
  .[, month := factor(as.numeric(Month), labels = lvls)] %>% 
  .[, AgeGroup_r := factor(AgeGroup, levels = rev(levels(bdata$AgeGroup)))]



p_N_age = 
  bdata %>% 
  ggplot(aes(x = Date, fill = AgeGroup_r)) + 
  geom_histogram(bins = 25) +
  scale_fill_ordinal(name = "Age group") +
  theme(legend.position = c(.3,.8)) +
  ylab("N") +
  xlab("Date")  +
  gg_text_size() + 
  gg_legend_size(2) + 
  scale_y_continuous(expand = expansion(0.005,0)) +
  theme(axis.text.x = element_text(size = 4)) + 
  geom_vline(xintercept = as.Date("2021-01-01"), col = "grey") 

p_N_age
Number of ositive tests over time.

Figure 1.1: Number of ositive tests over time.

The cobas and LC480 PCR systems appear to result in slightly different distributions of viral loads.

ggplot(bdata, aes(x = log10Load, fill = Group)) + 
  geom_histogram(bins = 30, alpha = .75) +
  xlab(expression(log[10]~viral~load)) + 
  facet_wrap(~PCR, nrow = 2, scale = "free_y", ncol = 2) +
  red_blue(1:3) + 
  theme(legend.position = "top")
Distribution of viral load by PCR system and clinical status.

Figure 1.2: Distribution of viral load by PCR system and clinical status.

agehist_by_PAMS = 
  bdata %>% 
  .[, `Sub-sample` := Group] %>% 
  ggplot(aes(x = Age, fill = `Sub-sample`)) + 
  geom_histogram(alpha = .5, breaks = seq(0,101,1))  + 
  theme(legend.position = c(.8,.9)) + 
  ylab("Number of subjects") + 
  gg_add_grid() + 
  red_blue(1:3) + 
  gg_expand() + 
  gg_legend_size()

agehist_by_PAMS
Distribution of subject ages by clinical status.

Figure 1.3: Distribution of subject ages by clinical status.

bdata %>% 
  .[, Age5 := cut(Age,breaks = c(seq(0,95,by = 5),101), labels = paste0(seq(0,95,5),"-",seq(5,100,5)))] %>% 
  .[, Age5 := ordered(Age5)] %>% 
  .[, Group := factor(Group, levels = c("PAMS","Hospitalized","Other"))]

N_Age5_Group=
  bdata[, .(N = .N), by = .(Group,Age5)]

Group.labs = 
  c(PAMS = paste0("PAMS (",sum(bdata$Group == "PAMS"),")"),
    Hospitalized = paste0("Hospitalized (",sum(bdata$Group == "Hospitalized"),")"),
    Other = paste0("Other (",sum(bdata$Group == "Other"),")"))

beesPAMS =
  bdata %>% 
  ggplot(aes(x = Age5, y = log10Load, color = Group)) + 
  geom_quasirandom(method = "tukeyDense", size = .1) + 
  ylab(expression(log[10]~viral~load)) +
  red_blue(c(2,3,1)) +  
  geom_text(data = N_Age5_Group, aes(y = 11, label = N), size = 3, show.legend = F) +
  scale_y_continuous(expand = expansion(0,1), breaks = seq(2,10,2)) + 
  coord_cartesian(ylim = c(2.5,10.6)) + 
  facet_grid(Group~.,labeller = labeller(Group = Group.labs)) +
  gg_add_grid("y") +
  theme(strip.text.x = element_blank(), legend.position = "none") + 
  xlab("Age")

beesPAMS
Distribution of viral loads by subject ages and clinical status.

Figure 1.4: Distribution of viral loads by subject ages and clinical status.

tmp = 
  copy(bdata) %>% 
  .[, log10Load.cat := cut(log10Load, breaks = c(1.5,4,5:9,12))] %>% 
  .[, Age.cat := cut(Age, 
                     breaks = c(seq(0,90,10),101),
                     labels = paste0(seq(0,90,10),"-",seq(10,100,10)))] %>% 
  .[, rAge := round(Age/5)*5] %>% 
  .[, high.load := log10Load > 9] 


prop.gen = 
  tmp[, .(N.gen = .N), by = .(Group,Age.cat)] %>% 
  .[, prop.gen := N.gen/sum(N.gen)] %>% 
  setkeyv(c("Group","Age.cat"))

prop.high = 
  tmp[high.load == T, .(N.high = .N), by = .(Group,Age.cat)] %>% 
  .[, prop.high := N.high/sum(N.high)] %>% 
  setkeyv(c("Group","Age.cat")) %>% 
  .[prop.gen, rel.high := prop.high/prop.gen] %>% 
  setnames(.,"N.high","N with\nhigh load") 

p_rel_risk_high_load =
  prop.high  %>% 
  ggplot(aes(x = Age.cat, y = rel.high, color = Group, size = `N with\nhigh load`)) + 
  geom_hline(yintercept = 1, lty = 3) + 
  geom_point(alpha = .75) + 
  red_blue(1:3) + 
  ylab("Relative risk of high load") +
  xlab("Age category") + 
  ggtitle("B")

p_high_VL = 
  tmp[log10Load > quantile(bdata$log10Load,.95)] %>% 
  ggplot() +
  geom_mosaic(aes(product(Group , Age.cat), fill = Group), divider = ddecker(),) + 
  red_blue(1:3) +
  theme(axis.text.x = element_text(angle=90)) + 
  ylab("Clinical status") +
  xlab("Age category") + 
  ggtitle("A")


px = p_high_VL / p_rel_risk_high_load
px

ggsave(px, file = "figures/S3_p_high_VL.png", height = 25, width = 20, units = "cm", dpi = 600)
ggsave(px, file = "figures/S3_p_high_VL.pdf", height = 25, width = 20, units = "cm")

N_high_load = sum(tmp[,high.load == T])
p_high_load = round(mean(tmp[,high.load == T])*100,1)
N_high_load_PAMS20_60 = nrow(tmp[high.load == T & ((Age > 20 & Age < 60 & Group == "PAMS"))])
N_high_load_Hosp60 = nrow(tmp[high.load == T & ((Age > 60 & Group == "Hospitalized"))])

2228 (8.8)% subjects have a log~10˙ viral load higher than 9. To a large part, these are PAMS subjects aged 20 to 60 (N = 722, 32.4%) or Hospitalized patients older than 60 years (499, 22.4%).

1.1.1 Symptom onset data

For a subset of 317 subjects with positive test results data about symptom onset is available, of which 300 remain after exclusion of cases who reported symptoms onset more than 3 weeks before the first positive test.

onset_data = 
  bdata[,.(Onset,Group,log10Load, PCR, Age,TestCentreCategory)] %>% 
  .[!is.na(Onset) & Onset <= 21 & Onset >= 0] #%>% 
  #.[Group == "Hospitalized" & TestCentreCategory == "C19", Group := "PAMS"]

if(file.exists("pdata/onset.Rdata")) {
  load("pdata/onset.Rdata")
} else {
  bfit_onset = brm(log10Load ~ Onset*Group + s(Age) + PCR + (1 | TestCentreCategory),
                 onset_data,
                 family = gaussian(),
                 backend = "cmdstanr",
                 adapt_delta = .99)
  save(bfit_onset,file = "pdata/onset.Rdata")
}

new_data = 
  expand.grid(Onset = 0:21,
              Group = levels(onset_data$Group))  %>% 
  data.table() %>% 
  .[, Age := mean(onset_data$Age)] %>% 
  .[, TestCentreCategory := names(sort(-table(onset_data$TestCentreCategory)))[1]] %>% 
  .[, PCR := names(sort(-table(onset_data$PCR)))[1]]
pep = 
  new_data %>% 
  cbind(t(posterior_epred(bfit_onset, newdata = new_data))) %>% 
  melt(id.vars = c(names(new_data))) %>% 
  .[, .draw := as.numeric(gsub("V","",variable))] %>% 
  .[, variable := NULL] %>% 
  get_stats(by = c("Onset","Group"))

N_table = 
  onset_data[,.(N = .N),by = Group] %>%
  .[, label := paste0(Group," (N = ",N,")")] %>% 
  .[c(3,2,1)]

setkeyv(pep,"Group")
setkeyv(N_table,"Group")
setkeyv(onset_data,"Group")
pep[N_table, label := label] %>% 
  .[, Group := label]
onset_data[N_table, label := label] %>% 
  .[, Group := label]

VL_by_onset = 
  pep %>% 
  ggplot(aes(x = Onset, y = mean, color = Group)) + 
  geom_line() + 
  conf_ribbon(pep, fill = "Group") + 
  geom_point(data = onset_data,aes(y = log10Load), alpha = .5, shape = 16) + 
  red_blue(c(3,1,2)) + 
  facet_wrap(~Group) + 
  ylab(expression(log[10]~viral~load)) + 
  xlab("Days from symptom onset") +
  theme(legend.position = "none") +
  geom_line(data = pep[,.(mean = mean(mean)), by = .(Onset)], aes(color = NULL), lty = 3)

ggsave(VL_by_onset,file = "figures/VL_by_onset.png", height = 10, width = 20, units = "cm")
ggsave(VL_by_onset,file = "figures/VL_by_onset.pdf", height = 10, width = 20, units = "cm")
VL_by_onset
Association between viral load at first positive test and temporal distance to symptom onset by clinical status. Points depict raw data. The dotted line depicts the average regression line across the three groups. The figure shows that subjects who reported high viral loads were more likely tested close to symptom onset, and subjects who reported low viral loads were more likely tested more than a week before reported symptoms onset. After adjustment for time of testing relative to symptom onset, first positive tests for hospitalized patients are likely highest, though the current data set does not include sufficient PAMS cases for a reliable quantification. The analysis adjusted for PCR system, age, and test centre type.

Figure 1.5: Association between viral load at first positive test and temporal distance to symptom onset by clinical status. Points depict raw data. The dotted line depicts the average regression line across the three groups. The figure shows that subjects who reported high viral loads were more likely tested close to symptom onset, and subjects who reported low viral loads were more likely tested more than a week before reported symptoms onset. After adjustment for time of testing relative to symptom onset, first positive tests for hospitalized patients are likely highest, though the current data set does not include sufficient PAMS cases for a reliable quantification. The analysis adjusted for PCR system, age, and test centre type.

rm(pep,bfit_onset,VL_by_onset)

1.2 B.1.1.7 first positive test viral load data

Here is a short overview of first positive test viral load data from subjects with the B.1.1.7 variant of the virus, together with different comparison groups. The reasoning behind the ever-more restrictive comparison groups is to compare B.1.1.7 subjects to non-B.1.1.7 subjects that were tested in increasing temporal- and spatial proximity.

B117data = 
  copy(bdata) %>% 
  .[, B117 := ifelse(B117 == 1, "B.1.1.7","non-B.1.1.7")]
B117.density.all = 
  B117data %>% 
  ggplot(aes(x = log10Load, fill = B117)) + 
  geom_density(alpha = .5, color = NA) + 
  gg_expand() + 
  scale_fill_manual(name = "All cases",
                    values = c("#7E6148FF","#00A087FF")) +
  ylab(expression(log[10]~viral~load))


B117data = B117data[B117CentreDay5 == 1]
B117.density.5 = 
  B117data %>% 
  ggplot(aes(x = log10Load, fill = B117)) + 
  geom_density(alpha = .5, color = NA) + 
  gg_expand() + 
  scale_fill_manual(name = "Within +/- 5 days\nof B1.1.7 cases",
                    values = c("#7E6148FF","#00A087FF")) +
  ylab(expression(log[10]~viral~load))

B117data = B117data[B117CentreDay1 == 1]
B117.density.1 = 
  B117data %>% 
  ggplot(aes(x = log10Load, fill = B117)) + 
  geom_density(alpha = .5, color = NA) + 
  gg_expand() + 
  scale_fill_manual(name = "Within +/- 1 day\nof B1.1.7 cases",
                    values = c("#7E6148FF","#00A087FF")) +
  ylab(expression(log[10]~viral~load))

B117data = 
  B117data %>%
  .[B117CentreDay1 == 1] 
CentresWithMin2Cases = 
  B117data[,.(N = .N), by = .(TestCentre)][N > 1,TestCentre]
B117data = B117data[TestCentre %in% CentresWithMin2Cases]
B117.density.1p = 
  B117data %>% 
  ggplot(aes(x = log10Load, fill = B117)) + 
  geom_density(alpha = .5, color = NA) + 
  gg_expand() + 
  scale_fill_manual(name = "Within +/- 1 day\nonly centres\nwith both types",
                    values = c("#7E6148FF","#00A087FF")) +
  ylab(expression(log[10]~viral~load))

B117.density.all / 
  B117.density.5 / 
  B117.density.1 / 
  B117.density.1p
Distribution of viral load for B.1.1.7 and non-B.1.1.7  cases for different data sub-sets.

Figure 1.6: Distribution of viral load for B.1.1.7 and non-B.1.1.7 cases for different data sub-sets.

rm(B117.density.all,B117.density.5,B117.density.1,B117.density.1p)
B117_table1 = 
  B117data %>% 
  .[, .(N = .N,
     `Mean age` = sprint_mci(Age,0),
     `Mean viral load` = sprint_mci(log10Load,1),
     `% hospitalized` = sprint_mcib(Hospitalized,0),
     `% PAMS` = sprint_mcib(PAMS1,1)),
   by = .(B117)] 

B117_table1 = 
  kable(B117_table1,
        format = table_format,
        caption = "Sample description for focussed B.1.1.7 analysis.")  %>% 
  kable_styling(full_width = F) %>%
  add_footnote("Numbers are means and 95% confidence intervals.",
               notation = "none")

B117_table1
Table 1.3: Sample description for focussed B.1.1.7 analysis.
B117 N Mean age Mean viral load % hospitalized % PAMS
B.1.1.7 1451 48 (47, 50) 7.4 (7.3, 7.5) 36 (32, 40) 25.9 (22.9, 29.0)
non-B.1.1.7 977 51 (49, 52) 6.4 (6.3, 6.5) 31 (27, 35) 29.1 (25.1, 33.1)
Numbers are means and 95% confidence intervals.

A crude comparison of log10 viral load of B.1.1.7 and non-B.1.1.7 subjects suggests higher viral load in the former group in adults (see Figure 1.7). There is little data from children.

B117data %>% 
  .[, B.1.1.7 := B117] %>% 
  .[, AgeCat := cut(Age,breaks = c(0,20,65,101))]

B117data[, .(log10Load = mean(log10Load),
          sd = sd(log10Load), N = .N),
      by = .(Group,B.1.1.7,AgeCat)] %>% 
  .[, lower := log10Load + (sd/sqrt(N))*1.96] %>% 
  .[, upper := log10Load - (sd/sqrt(N))*1.96] %>% 
  setkeyv(c("AgeCat","Group","B.1.1.7")) %>% 
  ggplot(aes(x = AgeCat, y = log10Load, shape = Group, color = B.1.1.7)) + 
  geom_point(position = position_dodge(.5)) + 
  geom_linerange(aes(ymin = lower, ymax = upper),
                 position = position_dodge(.5)) + 
  green_brown() +
  xlab(expression(log[10]~viral~load))
Viral load by age group, clinical status, and B.1.1.7. Lines indicate 95% confidence intervals. Confidence intervals cannot be shown when a group has only one member. This comparison uses the most restrictive data set, that includes B.1.1.7 cases only when there was a test result for a non-B.1.1.7 centre from the same location +/- one day.

Figure 1.7: Viral load by age group, clinical status, and B.1.1.7. Lines indicate 95% confidence intervals. Confidence intervals cannot be shown when a group has only one member. This comparison uses the most restrictive data set, that includes B.1.1.7 cases only when there was a test result for a non-B.1.1.7 centre from the same location +/- one day.

1.3 B.1.1.7 cell culture isolation data

Table ?? gives an overview of the cell culturing data shown in figure 1.8.

# get_CP_data() loads and pre-processes the data. 
CP = 
  get_CP_data() %>% 
  .[Study == "Own data" & culture_positive == 1]

CPstats = 
  CP %>% 
  .[,.(log10Load = mean(log10Load), 
       sd = sd(log10Load),
       N = .N),
    by = .(Clade)] %>% 
  .[, lower := log10Load - 1.96 * (sd/sqrt(N))] %>% 
  .[, upper := log10Load + 1.96 * (sd/sqrt(N))]

CPdata = 
  get_CP_data() %>% 
  .[Study == "Own data"] %>%
  .[, `Culture positive` := ordered(if_else(culture_positive == 1,"Yes","No"), levels = c("Yes","No"))]
levels(CPdata$`Culture positive`) = c("Yes","No")

CPdataTBL = 
  CPdata[, .(N = .N,mean = mean(log10Load), sd = sd(log10Load)),
         by = .(Clade,`Culture positive`)]

CPdata_table = 
  kable(CPdataTBL,
        format = table_format,
        digits = c(0,0,1,2),
        caption = "Sample description for  B.1.1.7 culturing data")  %>% 
  kable_styling(full_width = F) 

if (file.exists("pdata/culture_fit.Rdata")) {
  load("pdata/culture_fit.Rdata")
} else {
  culture_fit = 
    brm(bf(log10Load ~ Clade,
           sigma ~ Clade),
        family = student(),
        CP,
        backend = "cmdstanr")
  save(culture_fit,file = "pdata/culture_fit.Rdata")
}

mean_difference = 
  culture_fit$fit %>% 
  as_draws() %>% 
  subset_draws("b_CladeB.1.1.7") %>% 
  as.numeric() %>% 
  sprint_stat(2)

sigma_par = 
  culture_fit$fit %>% 
    as_draws() %>% 
    subset_draws(c("b_sigma_Intercept","b_sigma_CladeB.1.1.7")) %>% 
  as_draws_matrix() 

sigma_difference = 
  sigma_par %*% matrix(c(1,0,1,1),ncol = 2) %>% 
  exp() %>% 
  apply(1,diff) %>% 
  sprint_stat(2)

CPdataTBL

Application of the Inter-ocular trauma test to the culturing data does not indicate that the distribution of pre-culturing viral loads are different (or identical) for positive B.1.1.7 and B.1.177 cultures. (A Bayesian regression model confirms inconclusive results for the difference of means 0.21 (-0.06, 0.47) and for the difference of standard deviations -0.09 (-0.27, 0.13)).

rm(mean_difference,sigma_par,sigma_difference)
p_culturing = 
  CPdata %>% 
  ggplot(aes(x = Clade, y = log10Load, color = `Culture positive`)) + 
  geom_quasirandom(data = CPdata[`Culture positive` == "Yes"]) +
  geom_quasirandom(data = CPdata[`Culture positive` == "No"], alpha = .25) +
  theme(legend.position ="top") + 
  scale_colour_manual(values = rev(diverge_hcl(2, palette = "Blue-Red 2"))) +
  geom_point(data = CPstats, shape = 3, size = 3, color = diverge_hcl(2, palette = "Blue-Red 2")[1]) + 
  geom_linerange(data = CPstats,aes(ymin = lower,ymax = upper),color = diverge_hcl(2, palette = "Blue-Red 2")[1]) + 
  ylab(expression(log[10]~viral~load~before~culturing))

ggsave(p_culturing, filename = "figures/S4_Culturing.png", width = 12, height = 12, units = "cm", dpi = 600)
ggsave(p_culturing, filename = "figures/S4_Culturing.pdf", width = 12, height = 12, units = "cm")

p_culturing
Pre-culturing viral loads split by clade and culturing outcome. The cross indicates mean (horizontal lines) and 95% confidence intervals (vertical lines) of pre-culturing viral loads for culture-positive samples.

Figure 1.8: Pre-culturing viral loads split by clade and culturing outcome. The cross indicates mean (horizontal lines) and 95% confidence intervals (vertical lines) of pre-culturing viral loads for culture-positive samples.

1.4 RT-PCR time course data

For a subset of participants, multiple RT-PCR test results were available. These data can be used to estimate the time course of viral load. For this analysis we used data from participants with 3 or more RT-PCR results.

To enable a robust estimation of viral load time courses, we only included cases that fulfilled following criteria:

  • at least 3 test results, at least two of which must be positive
  • test results must cover a period of at least 5 days
  • if the first and the last test are negative, the results need to cover a period of at least 10 days.
  • at most an increase of five log10 load per day from the first to the second test
make_time_course_standata(
  max_diff_load_12perday = 5,
  ub_log_slope_up_mu = Inf) %>% 
  list2env(.GlobalEnv)

PCR_data_info = 
  unique(day_data[,.(ID,N_tests,Age,first_test_negative,last_test_negative, PAMS1, Hospitalized)]) 

PCR_data_info %>% 
  .[, number_tests := cut(N_tests,
                          breaks = c(2.9,3,4,6,9,20),
                          labels = c("3","4","5-6","7-9",">9"),
                          ordered_result = T)] %>%
  ggplot(aes(x = Age, fill = number_tests)) +
  geom_histogram(breaks = seq(0,100,5)) + 
  scale_fill_ordinal(name = "Number \nof tests") 
Distribution of age and number of tests of subjects included in the time course analysis.

Figure 1.9: Distribution of age and number of tests of subjects included in the time course analysis.

The time series data set has few patients who are younger than 20 years.

TC_table1 = 
  PCR_data_info[, 
              .(N = .N,
                `Mean age` = sprint_mci(Age,0),
                `% leading neg. test` = sprint_mcib(first_test_negative,0),
                `% trailing neg. test` = sprint_mcib(last_test_negative,0),
                `% hospitalized` = sprint_mcib(Hospitalized,0),
                `% PAMS` = sprint_mcib(PAMS1,1)),
              by = .(number_tests)] %>% 
  .[order(number_tests)] %>% 
  setnames("number_tests", "Number of tests")

TC_table1 = 
  kable(TC_table1,
        format = table_format,
        caption = "Sample for time course analysis.") %>% 
  kable_styling(full_width = F) %>%
  add_footnote("Numbers are means and 95% confidence intervals",
               notation = "none")

TC_table1
Table 1.4: Sample for time course analysis.
Number of tests N Mean age % leading neg. test % trailing neg. test % hospitalized % PAMS
3 1992 61 (60, 62) 20 (17, 22) 32 (29, 35) 76 (68, 84) 9.4 (8.0, 10.9)
4 1080 67 (65, 68) 28 (24, 32) 46 (40, 51) 83 (70, 96) 5.1 (3.7, 6.5)
5-6 901 70 (69, 71) 39 (34, 44) 47 (41, 53) 83 (69, 98) 2.0 (1.1, 2.9)
7-9 310 72 (71, 74) 46 (36, 57) 59 (46, 72) 88 (58, 100) 0.3 (0.0, 1.0)
>9 61 75 (71, 78) 43 (21, 64) 67 (31, 100) 95 (0, 100) 0.0 (0.0, 0.0)
Numbers are means and 95% confidence intervals

Few participants with time course data never had a positive test outside a hospital ward, i.e., were classified as PAMS. These participants are almost exclusively among subjects with three or four data points.

grp_var = "ID"

day_data = 
  prep_time_course_data(TC_data)

TC_raw_data = 
  day_data %>% 
  .[, NtestsCats := factor(as.numeric(NtestsCat),labels = c(3:9,"10+"))] %>% 
  ggplot(aes(x = day, y = log10Load, group = ID)) + 
  geom_line(alpha = .2) + 
  geom_jitter(col = adjustcolor("red", alpha = .1), size = 1) +
  facet_wrap(~NtestsCats) +
  ylab(expression(log[10]~viral~load)) + 
  xlab("Day since first test") +
  red_blue(c(1,3))

TC_raw_data
Observed viral load time courses. Red dots are PCR results. Data points are jittered to reduce visual overlap.

Figure 1.10: Observed viral load time courses. Red dots are PCR results. Data points are jittered to reduce visual overlap.

ggsave(TC_raw_data + facet_wrap(~NtestsCats,ncol = 2),
       filename = "figures/S8_TC_rawdata.png", width = 20, height = 30, units = "cm")
ggsave(TC_raw_data + facet_wrap(~NtestsCats,ncol = 2),
       filename = "figures/S8_TC_rawdata.pdf", width = 20, height = 30, units = "cm")
my_hist = 
  ggplot(day_data, aes(x = day, y = log10Load)) + 
  geom_point(alpha = 0) +
  geom_jitter() +
  ylab(expression(log[10]~viral~load)) + 
  xlab("Days since first test")
  
ggMarginal(my_hist,type = "histogram")
rm(my_hist)

1.5 Wild type cell culture isolation data

Here we briefly describe data from three published articles Kampen et al. (2020) that assessed infectivity of samples with different viral loads by performing cell culture isolation trials.

culture_data = 
  get_CP_data() %>% 
  .[!is.na(culture_positive) & Study !=  "Own data" & log10Load > 2]

ggplot(culture_data, aes(x = log10Load, fill = Study, color = culture_positive)) + 
  geom_histogram(bins = 20) + 
  xlab(expression(log[10]~viral~load)) + 
  facet_wrap(~Study, nrow = 1, scale = "free_y") + 
  theme(legend.position = "top") + 
  theme(legend.position = "none")
Distribution of viral loads for data considered for the estimation of the association between viral load and culture probability.

Figure 1.11: Distribution of viral loads for data considered for the estimation of the association between viral load and culture probability.

rm(merged_data)

We estimated a Bayesian hierarchical logistic regressions in rstanarm (Goodrich et al. 2020) for the three datasets to investigate if the association between viral load and positive culture outcome is sufficiently similar to pool the data from these studies:

if (!file.exists("pdata/culture_model.Rdata")) {
  culture_model = stan_glmer(culture_positive ~ log10Load + (log10Load | Study),
                    data = data.frame(culture_data),
                    family = binomial)
  save(culture_model, culture_model, file = "pdata/culture_model.Rdata")
} else {
  load("pdata/culture_model.Rdata")
}


newdata = 
  expand.grid(log10Load = seq(0,10, length.out = 100),
              Study = unique(culture_data$Study)) %>%
  data.table()

pp = posterior_epred(culture_model,newdata = newdata)

pp = 
  cbind(newdata, t(pp)) %>%
  melt(id.vars = names(newdata)) %>%
  .[, as.list(post_stats_list(value)),
    by = c("log10Load","Study")] %>% 
  .[, Study := as.character(Study)] %>% 
  .[, Study := gsub("Wölfel","Woelfel",Study)]
  
ggplot(pp,
       aes(x = log10Load,
           y = m,
           color = Study,
           fill = Study)) +
    geom_line() + 
    geom_ribbon(aes(ymin = q5,ymax = q95),
                    alpha = .25, color = NA) + 
  xlab(expression(log[10]~viral~load)) + 
  ylab("Culture probability")
Estimated association between viral load and culture probability by data set. Coloured bands show 90% credible intervals.

Figure 1.12: Estimated association between viral load and culture probability by data set. Coloured bands show 90% credible intervals.

rm(pp,culture_model)

The associations in the data from (Wölfel et al. 2020) and (Perera et al. 2020) are similar enough to warrant pooling the culture data. The data from [Kampen et al. (2020) will not be included, as it shows a different association between viral load and culture probability (see discussion in the manuscript for why this is plausible / to be expected).

2 Analysis

2.1 Viral load and infectiousness at the first positive test by age

We use a Bayesian thin-plate spline regression (Wood 2003) as implemented in the brms package (Bürkner 2017, 2018) to estimate the association between age and viral load. Following specifications account for structure in the data:

  • We model differences in viral load between sexes, samples measured on different PCR systems, samples from the B.1.1.7 variant, and samples from different test centre categories (see Table 1.1)
  • We allow the association between age and viral load to vary, dependent on whether subjects fall into the PAMS, Hospitalized, or Other group
  • We allow for variation of residual variance, dependent on age group and test centre category
  • We implement a robust regression approach by using a student-t distributed error term.

As described earlier, cases are categorized in the groups (see also Table 1.1):

  • PAMS
  • Hospitalized
  • Other

The brms model is

 bf(log10Load ~ Group + B117 + PCR + Male + s(Age, by = Group) + (1 | TestCentreCategory/Hospitalized) ,
    family = student(),
    sigma ~ PCR + (1 | fAgeGroup) + (1 | TestCentreCategory))

wherein

  • ~ Group models the population-level effect of being part of the “PAMS,” “Hospitalized,” or “Other” group
  • B117 + PCR + Male model effects of B.1.1.7 type, PCR system and gender
  • + (1 | TestCentreCategory/Hospitalized) models differences between test centre categories and hospitalized and non-hospitalized subjects within test centre categories with random effects.
  • + s(Age, by = Group) implements a thin-plate spline model for the effect of Age, with separate spline coefficients for groups defined by clinical status.
  • family = student() uses student-t distributed error variance which implements a robust regression that reduces the influence of outliers,
  • sigma ~ PCR + (1 | fAgeGroup) + (1 | TestCentreCategory) models age-group specific error variances.

The following code describes the model estimation, including specification of weakly informative priors. The analysis uses default values of 1000 warm-up samples, 1000 post warm-up samples, and 4 independent chains.

# get data
bdata = 
  get_log10Load_data() %>%
  .[, PCR_Gender_Group := factor(paste0(PCR,"_",Gender,"_",Group))] %>%
  .[, PCR_Group := factor(paste0(PCR,"_",Group))] %>%
  .[, fAgeGroup := factor(AgeGroup, ordered = F)] %>% 
  .bdata[, Male := ifelse(Gender == "M",1,ifelse(Gender == "F",0,.5))]


model_formula = 
  bf(log10Load ~ B117 + PCR + Male + Group + s(Age, by = Group) + (1 | TestCentreCategory/Hospitalized) ,
     family = student(),
     sigma ~ PCR + (1 | fAgeGroup) + (1 | TestCentreCategory)) 

# Specify weakly informative priors
prior = 
  c(prior(normal(6, 3), class = Intercept),
    prior(normal(0, 2), class = b),
    prior(normal(0, 2), class = sd),
    prior(normal(0, 1), class = b, dpar = "sigma"),
    prior(normal(0, 1), class = Intercept, dpar = "sigma"),
    prior(normal(0, 1), class = sd, dpar = "sigma"),
    prior(normal(0, 1), class = sds)
  )

# make output-skeleton to get parameter dimensions 
# in a convenient format to be used to set initial 
# parameter values
bfit = brm(model_formula,
           data = bdata,
           chains = 1,
           iter = 10,
           prior = prior,
           backend = "cmdstanr")

############ fit modified model ###########
my_inits = lapply(1:4,function(x) make_age_fit_inits(bfit))
bfit = brm(model_formula,
           data = bdata,
           chains = 4,
           iter = 2000,
           prior = prior,
           inits = my_inits,
           backend = "cmdstanr")

sampler_params = 
  nuts_params(bfit) %>% 
  data.table() %>% 
  dcast(Chain + Iteration ~ Parameter, value.var = "Value")

fn = here("FPT/splines_TP+.Rdata")
save(bfit,sampler_params,file = fn)
draws = as_draws(bfit$fit)
save(bfit,draws,sampler_params,file = fn)

2.1.1 Check successful estimation

To verify a successful model estimation, we check R-hat (Vehtari et al. 2020) and the number of divergent iterations (Betancourt 2016).

load(here("FPT/splines_TP+.Rdata"))
bfitdata = 
  data.table(bfit$data) %>% 
  .[, Age_rounded := round(Age)] %>% 
  .[, ID := 1:nrow(bfit$data)] %>% 
  setkeyv("ID")
data.table(rhat = rhat(bfit)) %>% 
  ggplot(aes(x = rhat)) + 
  geom_vline(xintercept = 1.1, col = "red") + 
  geom_vline(xintercept = 1.01, col = "red", lty = 3) + 
  geom_histogram(bins = 50)
R-hat values for viral load and age models

Figure 2.1: R-hat values for viral load and age models

divergen_iterations = sampler_params$divergent__

The maximal R-hat value is 1.007 and there are 1 divergent iterations out of a total of 4000 iterations, which indicates that the model converged.

2.1.2 Posterior predictive check

To check if the current analysis describes the observed data well, we use a visual posterior predictive check. We examine whether the observed means and standard deviations of group-wise viral loads are within the 90% credible interval of the posterior predictive distribution. Figure 2.2 shows observed and estimated viral load means and standard deviations by age group.

if (file.exists("pdata/yhat_age.Rdata")) {
  load("pdata/yhat_age.Rdata")
} else {
  yhat = posterior_predict(bfit)
  dt.yhat = 
    data.table(ID = 1:nrow(bfitdata), t(yhat)) %>% 
    melt(id.var = "ID") %>% 
    .[, .draw := as.numeric(gsub("V","",variable))] %>% 
    .[, variable := NULL] %>% 
    setkeyv("ID") %>% 
    .[bfitdata, `:=`(Age_rounded = Age_rounded,
                     Group = Group)]
  yhat_by_age = 
    dt.yhat[, .(log10Load = collapse::fmean(value)),
            by = .(Age_rounded,.draw)] %>% 
    get_stats(var = "log10Load", by = "Age_rounded") %>% 
    setnames("mean","log10Load") %>% 
    .[, `Test result` := factor("modeled",levels = c("observed","modeled"))]
  save(dt.yhat,yhat,yhat_by_age,file = "pdata/yhat_age.Rdata")
}

obs_by_age = 
  bfitdata[, list(log10Load = mean(log10Load),
               lower = mean(log10Load) - 1.96 * sd(log10Load)/sqrt(.N),
               upper = mean(log10Load) + 1.96 * sd(log10Load)/sqrt(.N),
               N = .N),
               by = c("Age_rounded")] %>% 
  .[, `Test result` := factor("observed",levels = c("observed","modeled"))] 

ppc_VL_by_age = 
  yhat_by_age %>% 
  ggplot(aes(x = Age_rounded, y = log10Load)) + 
  geom_line() + 
  conf_ribbon(data = yhat_by_age, fill ="black") +
  geom_linerange(data = obs_by_age,
                 aes(ymin = lower, ymax = upper),
                 alpha = .5,
                 show.legend = F) + 
  geom_point(data = obs_by_age, aes(size = N,shape = Ns), shape = 21, fill = "white") + 
  scale_x_continuous(expand = expansion(.001,0.999)) + 
  coord_cartesian(ylim = c(4,9)) +
  xlab("Age (rounded to the year)") + 
  ylab(expression(log[10]~viral~load)) +
  theme(legend.position = c(.2,.8)) + 
  scale_size(range = c(0, 4),guide = guide_legend(nrow = 1),breaks = c(0,25,50,100,250,500))

ppc_VL_by_age
Observed (points and vertical lines) and estimated (horizontal line and confidence band) age-wise mean viral loads with confidence and credible intervals, respectively.

Figure 2.2: Observed (points and vertical lines) and estimated (horizontal line and confidence band) age-wise mean viral loads with confidence and credible intervals, respectively.

yhat_by_age = 
  dt.yhat[Age_rounded < 26] %>% 
  .[, .(log10Load = collapse::fmean(value)),
          by = .(Age_rounded,Group,.draw)] %>% 
  get_stats(var = "log10Load", by = c("Age_rounded","Group")) %>% 
  setnames("mean","log10Load") %>% 
  .[, `Test result` := factor("modeled",levels = c("observed","modeled"))]


obs_by_age = 
  bfitdata[Age_rounded < 26] %>% 
  .[, list(log10Load = mean(log10Load),
           lower = mean(log10Load) - 1.96 * sd(log10Load)/sqrt(.N),
           upper = mean(log10Load) + 1.96 * sd(log10Load)/sqrt(.N),
           N = .N),
    by = .(Age_rounded,Group)] %>% 
  .[, `Test result` := factor("observed",levels = c("observed","modeled"))]

obs_by_age_all = 
  bfitdata[Age_rounded < 26] %>% 
  .[, list(log10Load = mean(log10Load)),
    by = .(Age_rounded)]

ppc_VL_by_age_PAMS = 
  yhat_by_age %>% 
  ggplot(aes(x = Age_rounded, y = log10Load)) + 
  geom_line(show.legend = F, aes(color= Group)) + 
  conf_ribbon(data = yhat_by_age, fill = "Group") +
  geom_linerange(data = obs_by_age,
                 aes(ymin = lower, ymax = upper, color = Group)) + 
  geom_point(data = obs_by_age, aes(size = N,color= Group, fill = Group), shape = 21, fill = "white") +
  scale_x_continuous(expand = expansion(.001,0.999)) + 
  coord_cartesian(xlim = c(0,25)) +
  geom_text(data = obs_by_age,aes(label = N, y = upper+.2), 
            color = "black", size = 3) + 
  xlab("Age (rounded to the year)") + 
  ylab(expression(log[10]~viral~load)) +
    facet_wrap(. ~ Group, ncol = 1) + 
  red_blue(1:3) + 
  scale_size(range = c(0, 5)) + 
  geom_point(data = obs_by_age_all, size = 1) +
  theme(legend.position = "none")
ppc_VL_by_age_PAMS
Observed and estimated viral loads for subjects younger than 26 years stratified by clinical status.

Figure 2.3: Observed and estimated viral loads for subjects younger than 26 years stratified by clinical status.

ggsave("figures/S1_ppc_age.png",ppc_VL_by_age_PAMS, width = 15, height = 20, units = "cm")
ggsave("figures/S1_ppc_age.pdf",ppc_VL_by_age_PAMS, width = 15, height = 20, units = "cm")
rm(dt.yhat)
tmp = gc()

The next plots compare directly-estimated and observed mean and standard deviations for age groups (see Figure 2.4) clinical status (see Figure 2.5) and test centre category (see Figure 2.6).

bfitdata[,AgeGroup :=  cut(Age,c(seq(0,90,10),101))]
ppc_5i = 
  ppc_2d(bfitdata,yhat, "AgeGroup") + 
  ggtitle("Posterior prediction by age group")
ppc_5i
Observed (red dots) and estimated means and variances of viral load by age group.

Figure 2.4: Observed (red dots) and estimated means and variances of viral load by age group.

ppc_2d(bfitdata,yhat, "Group") + ggtitle("Posterior prediction by sub-sample")
Observed (red dots) and estimated means and variances of viral load by sub-sample

Figure 2.5: Observed (red dots) and estimated means and variances of viral load by sub-sample

ppc_2d(bfitdata,yhat, "TestCentreCategory") + ggtitle("Posterior prediction by TestCentreCategory group")
Observed (red dots) and estimated means and variances of viral load by test centre group

Figure 2.6: Observed (red dots) and estimated means and variances of viral load by test centre group

rm(fits,yhat)
tmp = gc(verbose = F)

2.1.3 Null hypothesis significance testing

As a first approach, we can use null hypothesis significance tests to compare viral loads of different age groups. The following code implements Welch’s t-test to compare group means.

agl = levels(bdata$AgeGroup)
contr_idx = list(
  `0-5 vs 20-65`   = list(young = bdata$AgeGroup %in% agl[1],
                           adult = bdata$AgeGroup %in% agl[5:9]),
  `5-10 vs 20-65` = list(young = bdata$AgeGroup %in% agl[2],
                          adult = bdata$AgeGroup %in% agl[5:9]),
  `10-15 vs 20-65`  = list(young = bdata$AgeGroup %in% agl[3],
                            adult = bdata$AgeGroup %in% agl[5:9]),
  `15-20 vs 20-65`   = list(young = bdata$AgeGroup %in% agl[4],
                             adult = bdata$AgeGroup %in% agl[5:9])
)

contr_names = names(contr_idx)

NHST_stats = c()
NHST_Groups = c("PAMS","Hospitalized","Other","All")
for (contr in 1:4) {
  for (s in NHST_Groups) {
    if (s == "All") {
      s_idx = bdata$Group %in% unique(bdata$Group)
    } else {
      s_idx = bdata$Group == s
    }
    
    young_idx = contr_idx[[contr]]$young
    adult_idx = contr_idx[[contr]]$adult
    cdata = rbind(
      data.table(log10Load = bdata$log10Load[s_idx & young_idx],Group = "Y"),
      data.table(log10Load = bdata$log10Load[s_idx & adult_idx],Group = "A")
    )
    
    if (min(table(cdata$Group)) > 5) {
      welch_test = 
        t.test(cdata[Group == "Y",log10Load],
               cdata[Group == "A",log10Load],
               var.equal = F)
      
      MWU_test = wilcox.test(log10Load~Group,cdata)
    } else {
      welch_test = data.frame(conf.int = NA, p.value = NA, statistic = NA, parameter = NA)
      MWU_test = data.frame(p.value = NA, statistic = NA)
    }
    
    delta = round(mean(cdata[Group == "Y", log10Load]) - 
                    mean(cdata[Group == "A", log10Load]),digits = 2)
    tmp = 
      data.table(Sample = factor(s,levels = c("PAMS","Hospitalized","Other","All")),
                 Comparison = names(contr_idx)[contr],
                 Difference = paste0(delta," (",paste0(round(welch_test$conf.int,2),collapse = ", "),")"),
                 `p~Welch~` = welch_test$p.value,
                 `t-stat` = welch_test$statistic,
                 `t df` = welch_test$parameter,
                 `p~MW~` = MWU_test$p.value)
    NHST_stats =  rbind(NHST_stats,tmp)
  }
}
rm(welch_test)

NHST_stats = 
  NHST_stats[order(Sample)]

footnote_text = 
  "The difference is given with 95% confidende intervals, p = p-value,
   t = t-test, df = degrees of freedom, MW = Mann-Whitney U test"
tbl_NHST = 
  kable(NHST_stats,
      digits = c(0,0,2,2,2,2,2),
      format = table_format) %>%
  kable_styling(full_width = F) %>%
  add_footnote(footnote_text, notation = "none")
tbl_NHST
Sample Comparison Difference pWelch t-stat t df pMW
PAMS 0-5 vs 20-65 -0.37 (-1, 0.26) 0.24 -1.19 35.43 0.21
PAMS 5-10 vs 20-65 -0.86 (-1.46, -0.26) 0.01 -2.92 38.52 0.00
PAMS 10-15 vs 20-65 -0.56 (-1.1, -0.02) 0.04 -2.08 50.82 0.03
PAMS 15-20 vs 20-65 -0.25 (-0.5, 0.01) 0.06 -1.93 210.87 0.05
Hospitalized 0-5 vs 20-65 -0.37 (-1.1, 0.37) 0.32 -1.01 36.41 0.11
Hospitalized 5-10 vs 20-65 -0.48 (-1.38, 0.43) 0.28 -1.11 19.15 0.17
Hospitalized 10-15 vs 20-65 -0.12 (-0.98, 0.74) 0.78 -0.28 22.19 0.62
Hospitalized 15-20 vs 20-65 -0.03 (-0.35, 0.3) 0.87 -0.16 144.86 0.72
Other 0-5 vs 20-65 -0.32 (-0.55, -0.1) 0.00 -2.89 279.50 0.00
Other 5-10 vs 20-65 -0.12 (-0.41, 0.18) 0.44 -0.78 132.01 0.52
Other 10-15 vs 20-65 -0.3 (-0.57, -0.03) 0.03 -2.17 159.78 0.05
Other 15-20 vs 20-65 -0.08 (-0.29, 0.13) 0.46 -0.74 345.86 0.35
All 0-5 vs 20-65 -0.49 (-0.69, -0.29) 0.00 -4.82 343.73 0.00
All 5-10 vs 20-65 -0.39 (-0.65, -0.14) 0.00 -3.09 189.86 0.00
All 10-15 vs 20-65 -0.41 (-0.64, -0.18) 0.00 -3.47 231.98 0.00
All 15-20 vs 20-65 -0.15 (-0.29, 0) 0.05 -1.98 702.87 0.05
The difference is given with 95% confidende intervals, p = p-value,
t = t-test, df = degrees of freedom, MW = Mann-Whitney U test

2.2 Association of viral load probability of a positive culture

To estimate the association between viral load and culture probability, we estimate a logistic regression in brms. Based on the preliminary analysis described above, this analysis uses only the data from from (Wölfel et al. 2020) and (Perera et al. 2020).

culture_data = culture_data[Study != "van Kampen (2021)"]

CP_results_file = "CPfit.Rdata"
 if (file.exists(CP_results_file)) {
  load(CP_results_file)
} else {
  CP.fit = brm(culture_positive ~ log10Load,
               family = bernoulli(),
               backend = "cmdstanr",
               data = culture_data)
  save(CP.fit,file = CP_results_file)
}

2.3 B.1.1.7 viral load at the first positive test

The B.1.1.7 cases in this sample are from a restricted time period (starting January 2021) and from small number of test locations. Because the appearance of B.1.1.7 cases can have led to changing testing strategies, this analysis restricts the comparison of B.1.1.7 and non-B.1.1.7 cases to test results that were obtained either one day prior, the same day, or one day after a positive B.1.1.7 cases from the same location. For instance, if test centre A had only one positive test on January 15th, and centre B had only one positive test on January 20th, the analysis would only use positive test results from January 14th-16th from test centre A and test results from January 19th-21st from test centre B. Further, this analysis adjusts for age, PCR system, gender, and models effects of test centres as random effects.

To examine the sensitivity of results to different analysis strategies, we estimate the B.1.1.7 with different data (c.f. 1.4)

  • the full data set
  • B.1.1.7 cases and non-B.1.1.7 cases within +/- 5 days of the B.1.1.7 cases
  • B.1.1.7 cases and non-B.1.1.7 cases within +/- 1 day of the B.1.1.7 cases
  • non-B.1.1.7 cases within +/- 1 day of the B.1.1.7 cases, and B.1.1.7 cases only from test centres/locations that also reported non-B.1.1.7 cases within +/- 1 day of the B.1.1.7 cases

and with different regression model

  • unadjusted regression
  • uanadjusted regression with random effects for test centres/locations
  • adjustment for gender, age, PCR system, and clinical status with random effects for test centres/locations

The following code implements these analyses and shows R-hat values for the final analysis, i.e., the fully adjsuted analysis with the most restrictive data set, which also permits the most direct comparison of B.1.1.7 and non-B.1.1.7 cases.

bdata.B117 = 
  bdata %>%
  .[, B117 := factor(B117, labels = c("non-B117","B117"))] %>% 
  .[, Sex := ifelse(Gender == "U",.5,ifelse(Gender == "M",1,0))]

B117model.unadjusted =
  bf(log10Load ~ B117,
     sigma ~ B117 + (1 | TestCentre))

B117model.re =
  bf(log10Load ~ B117 + (1 | TestCentre),
     sigma ~ B117 + PAMS + (1 | TestCentre))

B117model.re.adjusted =
  bf(log10Load ~ B117 + Group + Sex + PCR + s(Age) + (1 | TestCentre),
     sigma ~ B117 + PAMS + (1 | TestCentre))


######### all, unadjusted #########
B117data = 
  bdata.B117 
Bfit = fit_B117_model(B117model.unadjusted,B117data,"B117model_unadjusted_all",adapt_delta = .8)
B117_model_stats = get_B117_stats(Bfit,"unadjusted",Inf)
Bfit = fit_B117_model(B117model.re,B117data,"B117model_re_all",adapt_delta = .8)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, unadjusted",Inf))
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, adjusted",Inf))
########### +/- 5 days ########### 
B117data = 
  bdata.B117 %>%
  .[B117CentreDay5 == 1] 
### with adjustment ####
Bfit = fit_B117_model(B117model.unadjusted,B117data,"B117model_unadjusted_5",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"unadjusted",5))
Bfit = fit_B117_model(B117model.re,B117data,"B117model_re_5",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, unadjusted",5))
Bfit = fit_B117_model(B117model.re.adjusted,B117data,"B117model_re_adjusted_5",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, adjusted",5))
########### +/- 1 day ########### 
B117data = 
  bdata.B117 %>%
  .[B117CentreDay1 == 1] 
####  no adjustment ####
Bfit = fit_B117_model(B117model.unadjusted,B117data,"B117model_unadjusted_1",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"unadjusted",1))
Bfit = fit_B117_model(B117model.re,B117data,"B117model_re_1",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, unadjusted",1))
Bfit = fit_B117_model(B117model.re.adjusted,B117data,"B117model_re_adjusted_1",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, adjusted",1))
### with adjustment, non-B117 & B117 in centre ####
CentresWithMin2Cases = 
  B117data[,.(N = .N), by = .(TestCentre)][N > 1,TestCentre]
B117data = 
  B117data[TestCentre %in% CentresWithMin2Cases]
Bfit = fit_B117_model(B117model.re.adjusted,B117data,"B117model_re_adjusted_1wcc",adapt_delta = .99)
B117_model_stats = rbind(B117_model_stats,get_B117_stats(Bfit,"RE, adjusted, (paired)",1,"Yes"))
data.table(rhat = rhat(Bfit)) %>% 
  ggplot(aes(x = rhat)) + 
  geom_histogram(bins = 100) + 
  geom_vline(xintercept = 1.1, color = "red")+ 
  geom_vline(xintercept = 1.01, color = "red", lty = 3)

B117fit = Bfit
rm(Bfit)

2.4 Viral load and infectiousness time course

2.4.1 A model for estimating viral load time course without knowledge of day of infection or peak viral load

In this analysis, we use the time course data described above to estimate how viral load and infectiousness, indexed by culture probability, evolve over time. Because the day of infection is not known, we estimate it simultaneously with the viral load time course. The model has following basic properties:

  • log10 viral load is assumed to increase linearly up to the maximum viral load and thereafter decreases linearly.
  • day zero is the day of peak viral load, which is modeled with a parameter intercept
  • the linear increase of log10 viral load from the start of shedding to peak viral load is modeled with a slope parameter slope_up
  • the linear decrease has a slope parameter slope_down
  • we use a logistic weighting function (weight_down = inv_logit(day*b)) to calculate log10 viral load at any point in time as the average of the up and down slopes: yhat = load_up * (1-weight_down) + load_down * weight_down. This implements a smooth transition from increasing to declining viral load.

To obtain subject-level parameter estimates, we use random effects models for intercept, slope_up, and slope_down. In short, subject-level parameters \(p_i\) are estimated as

\[log(p_i) = \alpha + \beta X_i + \gamma s(age) + \rho_gZ_i\]

where \(\alpha\) is the population mean (see below for priors) the vector \(\beta\) has weights for the fixed effects of gender, PAMS, B.1.1.7, and hospitalization status in vector \(X_i\). \(\gamma s(age)\) represents the non-linear effect of age, modeled with restricted cubic splines. \(Z_i\) is a vector with random effects variables and \(\rho_g\) are the associated weights for group \(g\) and \(\rho_g\sim N(0,\sigma)\), with \(\sigma\) being the random effects standard deviation. To obtained the negative slopes for the viral load decay, \(log(p_i)\) is multiplied by -1. We estimate, for all model parameters, subject-level random effects. In addition, we estimate for the intercept the effect of the centre in which the first test was taken and for intercept and slopes the effect of the primary centre, that is that centre with the longest distance between consecutive tests from the same centre (as a proxy for duration of stay). These two latter effects are also estimated as random effects.

To align observed viral loads with with estimated viral load time courses, we estimate for each subject a “shift” parameter, which captures the number of days between peak viral load and the first test. We use a uniform prior from -10 to 20 days for the subject-level shift. Additionally, we model differences in shifts between participants that had their first test in different centres as random effects.

For small subset of subjects we also had the day of the reported symptom onset. To incorporate this data, the model calculated the time from peak viral load to symptom onset and employed a skew-normal prior distribution on this distance. To further account for noise in this data (some subjects reported symptom onset more than three weeks after their first positive test), we model the reported onset as a mixture between a normally-distributed component with a large variance error and the skew normal.

model = "TC_APGHB117_simple_shiftw25o"
tcmp = list(
  # mean, sd, random effects sigma of peak viral load
  log_int_mu = 2.1, log_int_sigma = .15,  log_int_reff_sigma = 1,
  # mean, sd, random effects sigma of the log up-slope
  log_up_mu = .6, log_up_sigma = .25, log_up_reff_a = 4, log_up_reff_b = 15,
  # mean and sd of the log down-slope
  log_down_mu = -1.75, log_down_sigma = .5, log_down_reff_a = 4, log_down_reff_b = 15,
  b_mu = 10, b_sigma = 1
)

descr_dist = function(mu,sigma,digits = 1) {
  m = round(exp(mu+sigma^2/2),digits)
  qs =  round(qlnorm(c(.1,.9),mu,sigma),digits)
  return(
    paste0(
      m, ", a 10% quantile of ", qs[1], ", and a 90% quantile of ",qs[2]
    )
  )
}

We use following priors for population-level means (\(alpha\) above):

  • \(log(intercept)\) (peak viral load): \(N(2.1,0.15)\). This prior distribution has a mean 8.3, a 10% quantile of 6.7, and a 90% quantile of 9.9.
  • \(log(slope_{up})\) (attack rate): \(N(0.6,0.25)\). This prior distribution has a mean 1.9, a 10% quantile of 1.3, and a 90% quantile of 2.5,
  • \(-log(slope_{down})\) (decay rate): \(N(-1.75,0.5)\). This prior distribution has a mean 0.2, a 10% quantile of 0.09, and a 90% quantile of 0.33.

Priors for random effects variances (\(\sigma\) above) are \(gamma(4,20)\) for log10 load model parameters, \(half-normal(0,1)\) for shifts. Random effects for viral load parameters are added on the log scale. Priors for fixed effects, which are added on the log scale (\(\beta\) above) are \(N(0,.125)\).1

The prior for onset is \(skewN(\xi = -2.5, \omega = 10, \alpha = 10)\), the prior for the error component is \(N(0,50)\), and the prior probability for a reported onset coming from the error component is \(uniform(0,1)\).

The statistical model was implemented in Stan (Carpenter et al. 2017). The complete Stan model can be found at the end of this document.

2.4.2 Data for the Stan model

We use utility-scripts to generate the data for the Stan model.

make_time_course_standata(
  max_diff_load_12perday = 5,
  ub_log_slope_up_mu = Inf) %>% 
  list2env(.GlobalEnv)
## <environment: R_GlobalEnv>
caption_text = 
  "Correlation matrix of regressors for model paramerers
   slope_up, intercept, and slope_down"
cor(cbind(Age = datalist$Age,
          datalist$X_PGH)) %>% 
 kable(digits = 2, 
        format = table_format,
        caption = caption_text) %>%
  kable_styling(full_width = F)
Table 2.1: Correlation matrix of regressors for model paramerers slope_up, intercept, and slope_down
Age PAMS1 Gender Hospitalized B117 PAMS1:B117
Age 1.00 -0.36 -0.04 0.36 -0.05 -0.10
PAMS1 -0.36 1.00 -0.08 -0.51 0.01 0.28
Gender -0.04 -0.08 1.00 0.11 0.05 -0.01
Hospitalized 0.36 -0.51 0.11 1.00 0.04 -0.14
B117 -0.05 0.01 0.05 0.04 1.00 0.26
PAMS1:B117 -0.10 0.28 -0.01 -0.14 0.26 1.00

2.4.3 Model priors and prior predictions

A first step of the modelling workflow is to examine the prior predictive distributions of the model. Here, we check if the model predictions are roughly in line with the prior information about the time course of a SARS-CoV-2 infection, namely a duration between 3 and 7 days from infection to peak viral load, a peak viral load of around 8, and decay rate of around 0.15. To obtain prior predictive distributions, we sample from the Stan model without conditioning on the data.

pp_file = paste0("CP/prior_predict_",model,".Rdata")
if (file.exists(pp_file)) {
  load(pp_file)
} else {
  make_time_course_standata(
    selection = 3,
    samples = 500,
    max_diff_load_12perday = 5) %>% 
    list2env(.GlobalEnv)
  sm = cmdstan_model(here(paste0("CP/stan/",model,".stan")))
  datalist$condition_on_data = 0
  my_inits = lapply(1:4, function(x) make_TC_inits(datalist))
  priorpred = 
    sm$sample(data = datalist,
              iter_warmup = 250,
              iter_sampling = 500,
              chains = 4,
              init = my_inits,
              seed = 123)
  draws = priorpred$draws()
  save(draws, datalist, day_data, file = pp_file)
  datalist$condition_on_data = 1
}

Figure 2.7 illustrates the model and describes key model parameters.

par(mar=c(2,1,0,1), mgp=c(1,.25,0), tck=-.01)
layout(t(matrix(c(1:4,rep(5,8),rep(6,4),rep(7,8)), nrow = 4)))

## prior distributions for group-level means
curve(dlnorm(x,tcmp$log_up_mu,tcmp$log_up_sigma),1,3,
      xlab = "slope_up", ylab = "", yaxt = "n", bty = "n", col = "red")
curve(dlnorm(x,tcmp$log_int_mu,tcmp$log_int_sigma),4,12,
      xlab = "intercept", ylab = "", yaxt = "n", bty = "n")
curve(dlnorm(x,tcmp$log_down_mu,tcmp$log_down_sigma),0,.9,
      xlab = "-slope_down", ylab = "", yaxt = "n", bty = "n", col = "blue")
p_intup = density(abs(rlnorm(1000000,tcmp$log_int_mu,tcmp$log_int_sigma)) / 
                    rlnorm(1000000,tcmp$log_up_mu,tcmp$log_up_sigma))
plot(p_intup, xlab = "time2peak", ylab = "", yaxt = "n",
     bty = "n", main = "", xlim = c(1,10))

model_pars = 
  draws %>% 
  subset_draws("slope_up_mu") %>% 
  as_draws_dt() %>% 
  merge(draws %>%  subset_draws("slope_down_mu") %>% as_draws_dt(), by = ".draw") %>% 
  merge(draws %>%  subset_draws("intercept_mu") %>% as_draws_dt(), by = ".draw") %>% 
  merge(draws %>%  subset_draws("beta_sweight_mu") %>% as_draws_dt(), by = ".draw")

## prior predictions from group-level means
par(mar=c(0,3,2,1))
plot(0,
     type = "n", ylim = c(0,12), xlim = c(-10,40),
     ylab = expression(log[10]~viral~load), xaxt = "n")
tmp = 
  lapply(1:1000, function(d) {
  abline(a = model_pars$intercept_mu[d],
         b = model_pars$slope_up_mu[d],
         col = adjustcolor("blue",alpha = .01))
  abline(a = model_pars$intercept_mu[d],
         b = -model_pars$slope_down_mu[d],
         col = adjustcolor("red",alpha = .02))
})
abline(v = 0, lty = 3, col = "grey")

## weights
par(mar=c(0,3,0,1))
days = seq(-10,40,by = .1)
w = inv.logit(matrix(days,ncol = 1)  %*% 
                matrix(model_pars$beta_sweight_mu[1:250],nrow = 1))
matplot(days,w,type = "l", 
        col = adjustcolor("black",alpha = .025),
        lty = 1, ylab = "weight down", xaxt = "n")

## full trajectory
par(mar=c(3,3,0,1))
plot(0, type = "n", ylim = c(0,12), xlim = c(-10,40),
     ylab = expression(log[10]~viral~load),
     xlab = "Days form peak viral load")
tmp = 
  lapply(1:250, function(d) {
  w_down = inv.logit(days * model_pars$beta_sweight_mu[d])
  est_up = model_pars$intercept_mu[d] + model_pars$slope_up_mu[d]*days
  est_do = model_pars$intercept_mu[d] + -model_pars$slope_down_mu[d]*days
  yhat = est_do * w_down + est_up * (1-w_down)
  lines(days,yhat, col = adjustcolor("black",alpha = .1))
})
Components of the two-slopes model used to estimate viral load time courses. Top: Priors for population-level means. The (implied) prior for time to peak viral load is the ratio of the priors for slope up and intercept.  Second row: Components of exemplary viral load trajectories. Each line is generated from the prior distribution of group-level means in the top row. Third row: Weighting function for the combination of up- and down-slopes. Bottom: Exemplary viral load trajectories. In the full model, subject-level parameters also depend on random effects, subject age, gender, clinical status, and PCR system effects.

Figure 2.7: Components of the two-slopes model used to estimate viral load time courses. Top: Priors for population-level means. The (implied) prior for time to peak viral load is the ratio of the priors for slope up and intercept. Second row: Components of exemplary viral load trajectories. Each line is generated from the prior distribution of group-level means in the top row. Third row: Weighting function for the combination of up- and down-slopes. Bottom: Exemplary viral load trajectories. In the full model, subject-level parameters also depend on random effects, subject age, gender, clinical status, and PCR system effects.

Figure 2.8 shows that the subject-level parameters, which in addition to population-level means shown in Figure 2.7 also depend on additional fixed and random effects, have wider prior distributions than the priors for population-level means. As a result, the model can also capture extreme subject-level parameters, if the data strongly indicate such extreme values.2

prior_sample_densities = vector(mode = "list",length = 4)
names(prior_sample_densities) = 
  c("slope_down","slope_up","intercept","time2peak")
for (p in names(prior_sample_densities))
  prior_sample_densities[[p]] = draws %>% 
  draws_by_id(p) %>% 
  setnames(p,"par") %>% 
  .[par > min(0,quantile(par,.005)) & par < max(0,quantile(par,.995))] %>% 
  .[,par] %>% 
  density()

par(mfrow = c(2,2),mar=c(2,1,0,.25), mgp=c(1,.25,0), tck=-.01)
curve(dlnorm(x,tcmp$log_up_mu,tcmp$log_up_sigma),0,6, 
      xlab = "slope_up", ylab = "", yaxt = "n", bty = "n", col = "red", lty = 2)
lines(prior_sample_densities[["slope_up"]], col = "red")
curve(dlnorm(x,tcmp$log_int_mu,tcmp$log_int_sigma),2,14, 
      xlab = "intercept", ylab = "", yaxt = "n", bty = "n", lty = 2)
lines(prior_sample_densities[["intercept"]])
curve(dlnorm(x,tcmp$log_down_mu,tcmp$log_down_sigma),0,1, 
      xlab = "-slope_down", ylab = "", yaxt = "n", bty = "n", col = "blue", lty = 2)
lines(-prior_sample_densities[["slope_down"]]$x,
      prior_sample_densities[["slope_down"]]$y, col = "blue")
plot(p_intup, xlab = "time2peak", ylab = "", yaxt = "n", 
     bty = "n", main = "", xlim = c(0,15), lty = 2)
lines(prior_sample_densities[["time2peak"]])
Prior distributions for group-level means and prior predictive distributions of subject-level parameters. Dotted lines show the prior distributions of group-level means, which are also shown in the previous plot. Solid lines show prior distributions of subject-level parameters, which are obtained by adding random effects and effects of covariates (age, gender, PAMS), to the group-level means.

Figure 2.8: Prior distributions for group-level means and prior predictive distributions of subject-level parameters. Dotted lines show the prior distributions of group-level means, which are also shown in the previous plot. Solid lines show prior distributions of subject-level parameters, which are obtained by adding random effects and effects of covariates (age, gender, PAMS), to the group-level means.

2.4.4 Model estimation

Data for the Stan model are generated in a custom function (see utils.R). To estimate model parameters, we used 4 independent chains, each with 1000 warm-up and 1000 post warm-up samples. Prior analysis showed that 4000 post-warm-up samples produced sufficient effective samples to reliably estimate 90% credible intervals for parameters of interest. Here, we only show the code for the analysis of the full date set (which takes around 6 hours with 4 parallel chains on an Amazon EC2 c5d.2xlarge instance).

make_time_course_standata(
  selection = 3, # minimum number of data points per participants
  max_diff_load_12perday = 5, # maximum allowed per day viral load increases 
  imputation_limit = 3 # maximum value of imputed viral load for negative tests
) %>% 
  list2env(.GlobalEnv)

sm = cmdstan_model(here(paste0("CP/stan/",model,".stan")))
csf = sm$sample(data = datalist,
                iter_warmup = 1000,
                iter_sampling = 1000,
                adapt_delta = .8,
                max_treedepth = 10,
                init = lapply(1:4, function(x) make_TC_inits()),
                seed = 123)

draws = csf$draws()

ss = 
  draws %>% 
  summarise_draws() %>% 
  data.table() 

sampler_diags = 
  csf$sampler_diagnostics() %>% 
  as_draws_df() %>% 
  data.table()

2.4.5 Check successful estimation

Next, we check R-hat and number of divergent iterations to verify successful model estimation. R-hat values should be below 1.1, or even better below 1.01.

grep_string = "a0|slope_down_ld_centre|int_centre1|shift_centre1|_sigma|_mu|beta"

p_rhat = ggplot(ss[grepl(grep_string,variable)], aes(x = rhat)) + 
  geom_vline(xintercept = 1.01, color = "red", lty = 3) + 
  geom_vline(xintercept = 1.1, color = "red") + 
  geom_histogram(bins = 30) + 
  xlab("R-hat")
p_ess_bulk = ggplot(ss[grepl(grep_string,variable)], aes(x = ess_bulk)) + 
  geom_vline(xintercept = 100, color = "red", lty = 3) + 
  geom_histogram(bins = 30) + 
  xlab("ESS posterior bulk") +
  scale_x_log10()
p_ess_tail = ggplot(ss[grepl(grep_string,variable)], aes(x = ess_tail)) + 
  geom_vline(xintercept = 100, color = "red", lty = 3) + 
  geom_histogram(bins = 30) + 
  xlab("ESS posterior tail") +
  scale_x_log10()

p_rhat_i = ggplot(ss[!grepl(grep_string,variable)], aes(x = rhat)) + 
  geom_vline(xintercept = 1.01, color = "red", lty = 3) + 
  geom_vline(xintercept = 1.1, color = "red") + 
  geom_histogram(bins = 30) + 
  xlab("R-hat")
p_ess_bulk_i = ggplot(ss[!grepl(grep_string,variable)], aes(x = ess_bulk)) + 
  geom_vline(xintercept = 100, color = "red", lty = 3) + 
  geom_histogram(bins = 30) + 
  xlab("ESS posterior bulk") +
  scale_x_log10()
p_ess_tail_i = ggplot(ss[!grepl(grep_string,variable)], aes(x = ess_tail)) + 
  geom_vline(xintercept = 100, color = "red", lty = 3) + 
  geom_histogram(bins = 30) + 
  xlab("ESS posterior tail") +
  scale_x_log10()

divergent_proportion = 
  mean(sampler_diags$divergent__)

(p_rhat + p_ess_bulk + p_ess_tail) /
  (p_rhat_i + p_ess_bulk_i + p_ess_tail_i)
R-hat values and number of effective samples sizes (ESS) for model parameters. Top row: population-level parameters. Bottom row: Subject-level parameters.

Figure 2.9: R-hat values and number of effective samples sizes (ESS) for model parameters. Top row: population-level parameters. Bottom row: Subject-level parameters.

rm(p_rhat,p_ess_bulk,p_ess_tail,p_rhat_i,p_ess_bulk_i,p_ess_tail_i)

is.difficult = 
  ss %>% 
  .[!grepl("b_shift|time2peak",variable) & rhat > 1.1,] %>%
  .[, tmp_id := as.numeric(gsub("[^0-9.]","",variable))] %>% 
  .[, ID := unique(day_data$ID)[tmp_id]]
day_data[log10Load <= 2, idx := 1:sum(day_data$log10Load <= 2)]
for (id in is.difficult[grepl("imp_neg",variable),tmp_id]) 
  is.difficult[tmp_id == id, ID := day_data[idx == id,ID]]
is.difficult = 
  is.difficult %>% 
  .[, parameter := tstrsplit(variable,"\\[")[[1]]] %>% 
  .[, parameter := paste0(parameter," (", round(rhat,1),")")] %>% 
  .[, mean_rhat := mean(rhat), by = .(ID)] %>% 
  .[, parameters := paste(parameter,collapse = ", "), by = .(ID)] %>% 
  .[, parameter_lb := paste(parameter,collapse = "\n"), by = .(ID)] %>% 
  .[, n_pars := .N, by = .(ID)] %>% 
  .[, .(ID,parameters,n_pars,parameter_lb,mean_rhat)] %>% 
  unique() %>% 
  setkeyv("ID") %>% 
  .[order(-mean_rhat)]
day_data[, is.difficult := ifelse(ID %in% is.difficult$ID,T,F)]
nuts_params(csf) %>% 
  mcmc_nuts_energy()
Energy and Bayesian fraction of missing information. Each subplot shows the results for one mcmc-chain. The overlap of the of the (centered) marginal energy distribution and the first-differenced distribution indicates good exploration of the tail of the posterior distribution.

Figure 2.10: Energy and Bayesian fraction of missing information. Each subplot shows the results for one mcmc-chain. The overlap of the of the (centered) marginal energy distribution and the first-differenced distribution indicates good exploration of the tail of the posterior distribution.

The R-hat values indicate that all population-level parameters converged. However, a small number of subject-level parameters have R-hat values above 1.1. Inspection of the 119 relevant subjects shows that this is due to viral load times courses that could be at the onset or later in the infection, resulting in bimodal distributions of these subjects’ shift-parameters (see Figure 2.12).

The estimation resulted in 0.175% divergent iterations. Optimally, there should be no divergent iterations, though this is still a matter of current research and a small number of divergent iteration is not generally seen as problematic.

2.4.6 Posterior predictive check

To check that the model adequately describes the data, we plot observed data together with model predictions as a posterior predictive check (Gabry et al. 2019). Figure 2.11 shows subject-level expected linear upwards and downwards slope of viral load given data, model, and estimated parameters. Note that the observed data points do not need to be and are not typically enveloped by the 90% credible interval of the expected viral load time course.

if (file.exists("pdata/VLCP_by_day_draws.Rdata")) {
  load("pdata/VLCP_by_day_draws_b.Rdata")
} else {
  VLCP_by_day_draws = 
    make_VLCP_by_draw_ID(draws,
                         days = tc_days,
                         thin = thin)
  setkeyv(VLCP_by_day_draws,c("ID",".draw"))
  
  
  # calculate posterior expectation of viral load
  # by ID and day
  VLCP_by_dayID = 
    VLCP_by_day_draws %>%
    .[, list(CP = collapse::fmean(CP),
             log10Load = collapse::fmean(log10Load)),
      by = c("ID","day_shifted")]
  
  # select sub-set of cases for plotting
  ids = sample(unique(VLCP_by_dayID$ID),81)
  
  VL_by_dayID = 
    VLCP_by_day_draws[ID %in% ids] %>%
    summarise_draws_dt_by(by = c("day_shifted","ID"),
                          target.var = "log10Load",
                          varname = "log10Load")
  
  save(VLCP_by_day_draws, VLCP_by_dayID, VL_by_dayID, ids, file = "pdata/VLCP_by_day_draws.Rdata")
}

setkeyv(day_data,"ID")
tmp = gc(verbose = F)


shift_draws = 
  draws_by_id(draws, c("shift"), thin = thin)
setkeyv(shift_draws,c("ID",".draw"))

# get imputed loads by component
day_data[log10Load <= 2, idx := 1:sum(day_data$log10Load <= 2)]
imputed_loads = 
  subset_draws(draws,"imp_neg") %>%
  thin_draws(thin = thin) %>% 
  as_draws_dt() %>%
  melt(id.var = ".draw",
       variable.name = "par",
       value.name = "log10Load") %>%
  .[, idx := as.numeric(gsub("[^0-9]","",par))] %>%
  .[, par := NULL] %>%
  merge(day_data[!is.na(idx), .(ID,day,idx)],
        by = "idx", all.x = T, all.y = F) %>%
  .[,idx := NULL] %>% 
  .[, imputed := "Yes"] %>% 
  setkeyv("ID")

setkeyv(day_data,"ID")

# assign imputed loads to correct ID and day
imputed_day_data_by_draw_ID = 
  shift_draws %>%
  .[, .(ID,.draw)] %>%
  unique() %>%
  merge(
    day_data[log10Load > 2,.(ID,day,log10Load)] %>% 
      .[, imputed := "No"], 
    by = "ID",
    allow.cartesian = T,
    all = T) %>%
  rbind(imputed_loads) %>%
  setkeyv(c("ID",".draw","day"))

setkeyv(imputed_day_data_by_draw_ID,c("ID",".draw","day"))

# combined imputed and observed loads
shifted_data_by_draw_day_ID = 
  shift_draws %>% 
  merge(day_data[,.(day,ID)], by = "ID",
        allow.cartesian = T) %>%
  .[, day_shifted := day + shift] %>%
  .[, c("shift") := NULL] %>%
  setkeyv(c("ID",".draw","day")) %>% 
  .[imputed_day_data_by_draw_ID, log10Load := log10Load] %>% 
  .[imputed_day_data_by_draw_ID, imputed := imputed] 

rm(imputed_loads,shift_draws,imputed_day_data_by_draw_ID)
tmp = gc(verbose = F)

day_data[log10Load <= 2, log10Load := 0]

draw_samples = sample(max(shifted_data_by_draw_day_ID$.draw),250)

ppc_timecourse = 
  ggplot(VL_by_dayID[ID %in% ids], aes(x = day_shifted, y = log10Load)) + 
  geom_ribbon(aes(ymin = q5, ymax = q95), fill = "blue", alpha = .3) + 
  geom_line(col = "blue") + 
  coord_cartesian(xlim = c(-10,40), ylim = c(0,10)) + 
  facet_wrap(~ID, ncol = 9) + 
  geom_text(aes(x = 27, y = 8, label = ID)) + 
  geom_jitter(data = shifted_data_by_draw_day_ID[ID %in% ids & .draw %in% draw_samples],
              height = .05, alpha = .01, color = "red",
              aes(x = day_shifted, y = log10Load)) + 
  geom_point(data = day_data[ID %in% ids], aes(x = day),
             col = "black", pch = "x", size = 4) + 
  theme(
    strip.background = element_blank(),
    strip.text.x = element_blank(),
    legend.position = "top"
  ) + 
  xlab("Day") + 
  ylab(expression(log[10]~viral~load)) 
ppc_timecourse
Posterior predictive check for the estimated 2-slopes time course. Blue lines are expected viral load time courses, i.e., the average over all time courses calulated from the posterior distribution of a subject's parameters. The shaded blue region indicates the 90% credible interval of the expection. x are observed measurements. Red points are observed measurements after shifting them for alignment in time and imputation of viral loads for negative tests.

Figure 2.11: Posterior predictive check for the estimated 2-slopes time course. Blue lines are expected viral load time courses, i.e., the average over all time courses calulated from the posterior distribution of a subject’s parameters. The shaded blue region indicates the 90% credible interval of the expection. x are observed measurements. Red points are observed measurements after shifting them for alignment in time and imputation of viral loads for negative tests.

ggsave(ppc_timecourse,
       file = "figures/S16_ppc_timecourse.png",
       width = 30, height = 40,units = "cm", dpi = 300,
       device = png_device)
ggsave(ppc_timecourse,
       file = "figures/S16_ppc_timecourse.pdf",
       width = 30, height = 40,units = "cm")
difficult.ID = is.difficult$ID[1:49]

VL_by_dayID = 
  VLCP_by_day_draws[ID %in% difficult.ID] %>%
  summarise_draws_dt_by(by = c("day_shifted","ID"),
                        target.var = "log10Load",
                        varname = "log10Load")

ppc_timecourse = 
  ggplot(VL_by_dayID[ID %in% difficult.ID], aes(x = day_shifted, y = log10Load)) + 
  geom_ribbon(aes(ymin = q5, ymax = q95), fill = "blue", alpha = .3) + 
  geom_line(col = "blue") + 
  coord_cartesian(xlim = c(-10,45), ylim = c(0,10)) + 
  facet_wrap(~ID, ncol = 7) + 
  geom_jitter(data = shifted_data_by_draw_day_ID[ID %in% difficult.ID & .draw %in% draw_samples],
              height = .05, alpha = .01, color = "red",
              aes(x = day_shifted, y = log10Load)) + 
  geom_point(data = day_data[ID %in% difficult.ID], aes(x = day),
             pch = "x", size = 4)  + 
  theme(
    strip.background = element_blank(),
    strip.text.x = element_blank(),
    legend.position = "top"
  ) + 
  xlab("Day") + 
  ylab(expression(log[10]~viral~load)) + 
  scale_alpha(range = c(.01,.5)) + 
  geom_text(data = is.difficult[ID %in% difficult.ID], size = 3,
            aes(x = 40, y = 8, label = parameter_lb),hjust = 1) + 
  geom_text(data = is.difficult[ID %in% difficult.ID], size = 3,
            aes(x = 1, y = 10, label = ID),hjust = 1) + 
  geom_line(data = day_data[ID %in% difficult.ID & is.difficult == T], aes(x = day),
            col = "black")
ggsave(ppc_timecourse,
       file = "figures/S17_ppc_difficult.png",
       width = 30, height = 30,units = "cm", dpi = 300,
       device = png_device)
ggsave(ppc_timecourse,
       file = "figures/S17_ppc_difficult.png",
       width = 30, height = 30,units = "cm", dpi = 300,
       device = png_device)
ppc_timecourse
Posterior predictive check for subjects with at least one subject-level parameter with an R-hat value larger than 1.1. Parameters for which R-hat values are larger than 1.1 are shown in the top right corner. imp_neg is the imputed viral load for negative tests, intercept_raw is the subject-level random effect for peak load, intercept is the subject's peak load (the sum of population average, population effects of age, PAMS, etc, and subject-level random effect.)

Figure 2.12: Posterior predictive check for subjects with at least one subject-level parameter with an R-hat value larger than 1.1. Parameters for which R-hat values are larger than 1.1 are shown in the top right corner. imp_neg is the imputed viral load for negative tests, intercept_raw is the subject-level random effect for peak load, intercept is the subject’s peak load (the sum of population average, population effects of age, PAMS, etc, and subject-level random effect.)

Figure 2.13 shows the data together with posterior predictions, i.e., posterior expectations plus error variance and their 90% credible intervals. Here we can see that the observed data are typically within the credible interval of posterior predictions.

sigma = 
  draws_by_id(draws,"sigma", thin = thin)

VLhat_by_day_draws = 
  VLCP_by_day_draws[ID %in% ids] %>%
  .[, CP := NULL] %>%
  merge(sigma[ID %in% ids],
        by = c("ID",".draw"),
        allow.cartesian = T) %>%
  .[, log10Load := log10Load + rnorm(1,sd = sigma),
    by = c(".draw","ID","day_shifted")] %>%
  .[, sigma := NULL]
tmp = gc(verbose = F)

VLhat_by_dayID = 
  VLhat_by_day_draws[ID %in% ids] %>%
  summarise_draws_dt_by(by = c("day_shifted","ID"),
                     target.var = "log10Load",
                     varname = "log10Load")

ppc_timecourse_sigma = 
  ggplot(VLhat_by_dayID[ID %in% ids], aes(x = day_shifted, y = log10Load)) + 
  geom_ribbon(aes(ymin = q5, ymax = q95), fill = "blue", alpha = .3) + 
  geom_line(col = "blue") + 
  coord_cartesian(xlim = c(-10,30), ylim = c(0,10)) + 
  facet_wrap(~ID, ncol = 9) + 
  geom_text(aes(x = 27, y = 8, label = ID)) + 
  geom_jitter(data = shifted_data_by_draw_day_ID[ID %in% ids & .draw %in% draw_samples],
              height = .05, alpha = .01, color = "red",
              aes(x = day_shifted, y = log10Load)) + 
  geom_point(data = day_data[ID %in% ids], aes(x = day),
             pch = "x", size = 4)  + 
  theme(
    strip.background = element_blank(),
    strip.text.x = element_blank(),
    legend.position = "top"
  ) + 
  xlab("Day") + 
  ylab(expression(log[10]~viral~load)) 
rm(VLhat_by_dayID, sigma, VLhat_by_day_draws, VL_by_dayID)
VLCP_by_day_draws = VLCP_by_day_draws[day_shifted >= -5 & day_shifted <= 25]
ppc_timecourse_sigma
Posterior predictive plot with 90% prediction intervals.

Figure 2.13: Posterior predictive plot with 90% prediction intervals.

2.4.7 Priors and posteriors for key model parameters

To verify that the priors for key parameters, i.e., grand mean for intercept, slope_up, slope_down, and time_to_peak load do not determine the results, figure 2.14 shows the posterior distribution of these parameters together with the prior distributions. The relatively wide prior for slope_up reflects uncertainty about the start of viral shedding (i.e. uncertainty of the estimated shift parameters).

prior_posterior_intercept = 
  plot_prior_posterior("intercept_mu",draws,
                       tcmp$log_int_mu,tcmp$log_int_sigma,
                       "log-normal", xlim = c(4,12))
    
  
  prior_posterior_slope_up = 
    plot_prior_posterior("slope_up_mu",draws,
                       tcmp$log_up_mu,tcmp$log_up_sigma,
                       "log-normal", xlim = c(0,4))
  
  
  ttp_prior = 
    data.table(d = p_intup$y, time2peak = p_intup$x) %>% 
    .[time2peak < 12]
  prior_posterior_time2peak = 
    draws %>% 
    draws_by_id("time2peak") %>% 
    .[,.(time2peak = mean(time2peak)), by = ".draw"] %>%
    ggplot(aes(x = time2peak)) + 
    geom_density(fill = adjustcolor("blue", alpha = .5), color = NA) + 
    geom_line(data = ttp_prior, aes(y = d)) + 
    ggtitle("Prior: implicit") + 
    gg_expand() + 
    theme(plot.title  = element_text(size = 10))
  
  prior_posterior_slope_down = 
    plot_prior_posterior("slope_down_mu",draws,
                       tcmp$log_down_mu,tcmp$log_down_sigma,
                       "log-normal", xlim = c(0,.5))
    


prior_posterior_params = 
  prior_posterior_slope_up +
  prior_posterior_intercept + 
  prior_posterior_time2peak +
  prior_posterior_slope_down

prior_posterior_params
Priors and posteriors for key model parameters. The prior density is shown as a black outline. The posterior density is filled blue.

Figure 2.14: Priors and posteriors for key model parameters. The prior density is shown as a black outline. The posterior density is filled blue.

rm(prior_posterior_slope_up,
   prior_posterior_intercept,
   prior_posterior_time2peak,
   prior_posterior_slope_down,
   prior_posterior_params)

2.4.8 Estimated shifts

Figure 2.15 indicates that relatively few first-positive tests were estimated to be done at time of peak viral load.

draws %>% 
  draws_by_id("shift") %>% 
  ggplot(aes(x = shift, group = .draw)) + 
  stat_bin(aes(y=..count..),
           geom="step", alpha = .01, 
           position = "identity", bins = 50) + 
  xlab("Days shifted") +
  coord_cartesian(xlim = c(-10,25)) + 
  ylab("Number of subjects") + 
  gg_expand()
Posterior distribution of shift estimates. Each line represents the outline of the histogram over subjects from one posterior sample. The prior for shift was a uniform distribution from -10 to 25.

Figure 2.15: Posterior distribution of shift estimates. Each line represents the outline of the histogram over subjects from one posterior sample. The prior for shift was a uniform distribution from -10 to 25.

Participants with a negative first test were typically assumed to have had the first test prior to peak load, whereas participants whose first test was positive received shifts such that the lower the maximal observed load for the participant was, the later in the infection time course the first test was assumed to be taken.

day_data[, `First tests` := 
           ifelse(first_test_negative == T,
                  "Negative-increasing",
                  ifelse(Test1_positive_increasing == T,
                         "Positive-increasing",
                         "Positive-decreasing"))] 

draws %>% 
  draws_by_id("shift") %>% 
  .[, .(shift = mean(shift)), by = .(ID)] %>% 
  setkeyv("ID") %>% 
  .[day_data, max_load := max_load] %>% 
  .[day_data, `First tests` := `First tests`] %>% 
  ggplot(aes(x = shift, y = max_load, color = `First tests`)) +
  ylab("Maximum observed load") +
  geom_point(alpha = .5)
Association between estimated shift (mean) and maximum observed load for participants. Colour indicates if a participant had a leading negative test, and if viral load increased or decreased over the first two tests.

Figure 2.16: Association between estimated shift (mean) and maximum observed load for participants. Colour indicates if a participant had a leading negative test, and if viral load increased or decreased over the first two tests.

Finally, we show the shifted and imputed data points for all participants:

tmp_shifted_data = 
  shifted_data_by_draw_day_ID %>% 
  .[,.(day_shifted = mean(day_shifted), 
       log10Load = mean(log10Load)), by = .(ID,day,imputed)] 

mi = median(ss[grepl("intercept\\[",variable),mean])
msd = median(ss[grepl("slope_down\\[",variable),mean])
msu = median(ss[grepl("slope_up\\[",variable),mean])

lmdata = 
  data.table(day_shifted = c(-10,0,50),
           log10Load = c(mi-10*msu,mi,mi+50*msd), 
           ID = tmp_shifted_data$ID[1],
           day = 0,
           imputed = "No")

p_shifted_data = 
  tmp_shifted_data %>% 
  ggplot(aes(x = day_shifted, y = log10Load, group = ID, label = day, fill = imputed)) + 
  geom_hline(yintercept = 0, lty = 3) +
  geom_point(alpha = .15, shape = 21, stroke = 0, size = 1) + 
  ylab(expression(log[10] ~ "load")) +
  xlab("Days since peak load (estimated)") +
  gg_add_grid() + 
  red_blue(c(3,2)) +
  geom_point(data = tmp_shifted_data[day == 0], shape = 16, size = 0.0005) +
  geom_line(data = lmdata) +
  theme(legend.position = "none",axis.line.x = element_blank()) +
  coord_cartesian(ylim = c(-5,10.5), xlim = c(-7.5,40)) 

pdat = 
  data.table(x = c(0,5,7,9)) %>% 
  .[, yl := (x-4)*2] %>% 
  .[, yp := if_else(x == 0,0, yl)]
  
inlay = 
  pdat %>% 
  ggplot(aes(x = x)) + 
  geom_hline(yintercept = 0, lty = 3) +
  geom_line(aes(y = yl)) + 
  geom_point(aes(y = yp), col = "red") + 
  xlab("Days since first test") + 
  ylab(expression(log[10] ~ "load")) + 
  expand_limits(y = 0) + 
  theme(axis.line.x = element_blank(),
        axis.text = element_text(size = 6),
        axis.title = element_text(size = 6)) 

p_shifted_data = 
  p_shifted_data + 
  inset_element(inlay,.6,0,1,.3)

ggsave("figures/S7_TC_shifted.png",plot = p_shifted_data, width = 6, height = 6,dpi = 300)
ggsave("figures/S7_TC_shifted.pdf",plot = p_shifted_data, width = 6, height = 6)
p_shifted_data
The plot shows the placement in time of RT-PCR viral load values from subjects with at least three RT-PCR results. Points with central black dots indicate the first test of a subject. Because RT-PCR tests have a limit of detection of around 2 log10 copies (when dilution is accounted for) and false negatives are more likely when the true viral load is low, we imputed viral load values of negative tests (shown in red, with observed positive-test viral loads in blue). The permissible range of imputed values is -Inf to +3. Note that choosing a lower upper limit for trailing negative tests would lead to slightly steeper decrease in viral load. Negative imputed values are allowed to capture situations in which a leading negative test is followed several days later by low, increasing viral loads. In this scenario the onset of shedding happened between the leading negative test and the first positive test. The inset in the bottom-right corner shows that, in this situation, fitting a line through the first positive tests means that the estimated log10 viral load at the time of the leading negative test has to be negative. The negative values for imputed log10 viral loads should not be interpreted as suggesting the presence of a fractional virus particle. Instead, by allowing imputation of negative log10 viral loads, we calculate a more accurate estimation of increasing viral loads at the beginning of the infection, based on these negative tests that may miss small viral concentrations that are below the limit of detection.

Figure 2.17: The plot shows the placement in time of RT-PCR viral load values from subjects with at least three RT-PCR results. Points with central black dots indicate the first test of a subject. Because RT-PCR tests have a limit of detection of around 2 log10 copies (when dilution is accounted for) and false negatives are more likely when the true viral load is low, we imputed viral load values of negative tests (shown in red, with observed positive-test viral loads in blue). The permissible range of imputed values is -Inf to +3. Note that choosing a lower upper limit for trailing negative tests would lead to slightly steeper decrease in viral load. Negative imputed values are allowed to capture situations in which a leading negative test is followed several days later by low, increasing viral loads. In this scenario the onset of shedding happened between the leading negative test and the first positive test. The inset in the bottom-right corner shows that, in this situation, fitting a line through the first positive tests means that the estimated log10 viral load at the time of the leading negative test has to be negative. The negative values for imputed log10 viral loads should not be interpreted as suggesting the presence of a fractional virus particle. Instead, by allowing imputation of negative log10 viral loads, we calculate a more accurate estimation of increasing viral loads at the beginning of the infection, based on these negative tests that may miss small viral concentrations that are below the limit of detection.

rm(shifted_data_by_draw_day_ID,p_shifted_data)
tmp = gc(verbose = F)

2.4.9 Key model parameters

The following figure shows estimated population-level means for slope_up, intercept, and slope_down and their correlations:

subset_draws(draws,c("slope_up_mu","slope_down_mu", "intercept_mu")) %>% 
  mcmc_pairs()
Posterior distribution and covariations of population means for key model parameters.

Figure 2.18: Posterior distribution and covariations of population means for key model parameters.

3 Results

3.1 Viral load across age

3.1.1 Null hypothesis significance testing

Here we show the results of the Hypothesis significance tests, split by sub-sample (PAMS vs. hospitalized).

tbl_NHST
Sample Comparison Difference pWelch t-stat t df pMW
PAMS 0-5 vs 20-65 -0.37 (-1, 0.26) 0.24 -1.19 35.43 0.21
PAMS 5-10 vs 20-65 -0.86 (-1.46, -0.26) 0.01 -2.92 38.52 0.00
PAMS 10-15 vs 20-65 -0.56 (-1.1, -0.02) 0.04 -2.08 50.82 0.03
PAMS 15-20 vs 20-65 -0.25 (-0.5, 0.01) 0.06 -1.93 210.87 0.05
Hospitalized 0-5 vs 20-65 -0.37 (-1.1, 0.37) 0.32 -1.01 36.41 0.11
Hospitalized 5-10 vs 20-65 -0.48 (-1.38, 0.43) 0.28 -1.11 19.15 0.17
Hospitalized 10-15 vs 20-65 -0.12 (-0.98, 0.74) 0.78 -0.28 22.19 0.62
Hospitalized 15-20 vs 20-65 -0.03 (-0.35, 0.3) 0.87 -0.16 144.86 0.72
Other 0-5 vs 20-65 -0.32 (-0.55, -0.1) 0.00 -2.89 279.50 0.00
Other 5-10 vs 20-65 -0.12 (-0.41, 0.18) 0.44 -0.78 132.01 0.52
Other 10-15 vs 20-65 -0.3 (-0.57, -0.03) 0.03 -2.17 159.78 0.05
Other 15-20 vs 20-65 -0.08 (-0.29, 0.13) 0.46 -0.74 345.86 0.35
All 0-5 vs 20-65 -0.49 (-0.69, -0.29) 0.00 -4.82 343.73 0.00
All 5-10 vs 20-65 -0.39 (-0.65, -0.14) 0.00 -3.09 189.86 0.00
All 10-15 vs 20-65 -0.41 (-0.64, -0.18) 0.00 -3.47 231.98 0.00
All 15-20 vs 20-65 -0.15 (-0.29, 0) 0.05 -1.98 702.87 0.05
The difference is given with 95% confidende intervals, p = p-value,
t = t-test, df = degrees of freedom, MW = Mann-Whitney U test

3.1.2 Viral load across age: Bayesian analysis

To compute estimates of age- and subject-groupwise viral loads and viral load differences, we generate a new articifial data set (newdata in the code below), with which we can obtain posterior predictions for hypothetical subjects stratified by, age and test centre, and clinical status. Note that we set the value for PCR system to the more frequently used cobas system (81%) and set the value for Male to 0.5. By marginalizing over co-variates (in particular test centre) and comparing groups, we later calculate statistics of interest.

bfitdata %>% 
  .[,TestCentre.Group := paste0(TestCentreCategory,Group)] 

newdata =
  expand.grid(Group = unique(bfitdata$Group),
              Age = seq(0,100,1),
              TestCentreCategory = unique(bfitdata$TestCentreCategory),
              Male = 0.5,
              PCR = "cobas") %>%
  data.table() %>%
  .[, B117 := 0] %>% 
  .[, TestCentre.Group := paste0(TestCentreCategory,Group)] %>% 
  .[TestCentre.Group %in% unique(bfitdata$TestCentre.Group)] %>% 
  .[, Hospitalized := ifelse(Group == "Hospitalized", 1L,0L)]

newdata = 
  newdata %>% 
  merge(bfitdata[, .(Age, fAgeGroup)][Age > 100, Age := 100][, Age := ceiling(Age)] %>% unique(),by = "Age")

newdata_load = cbind(rep(1,50),seq(2,10,length.out = 50))

Next we generate weights for the calculation of result-statistics that accurately reflect the sample. Test centres (TestCentreCategory) with more test samples should have a greater weight when calculating viral load or culture probability averages by age, and when calculating averages for age groups, age years with many participants should receive a greater weight than age years with fewer patients.

# Weights for correct weighting of test-centre categories
c_weights_Other =
  table(bfitdata[Group == "Other",TestCentre.Group]) %>%
  prop.table() %>%
  data.table() %>%
  setnames(c("V1","N"),c("TestCentre.Group","weight")) 

c_weights_PAMS =
  table(bfitdata[Group == "PAMS",TestCentre.Group]) %>%
  prop.table() %>%
  data.table() %>%
  setnames(c("V1","N"),c("TestCentre.Group","weight")) 

c_weights_Hosp =
  table(bfitdata[Group == "Hospitalized",TestCentre.Group]) %>%
  prop.table() %>%
  data.table() %>%
  setnames(c("V1","N"),c("TestCentre.Group","weight")) 

c_weights_all =
  table(bfitdata[,TestCentre.Group]) %>%
  prop.table() %>%
  data.table() %>%
  setnames(c("V1","N"),c("TestCentre.Group","weight")) 


group_weights =
  expand.grid(Age = 1:100,
              Group = c("PAMS","Hospitalized","Other","All")) %>% 
  data.table() %>% 
  .[, w_All := ifelse(Group == "All",1,0)] %>% 
  .[, w_Other := ifelse(Group == "Other",1,0)] %>% 
  .[, w_PAMS := ifelse(Group == "PAMS",1,0)] %>% 
  .[, w_Hosp := ifelse(Group == "Hospitalized",1,0)]

all_weights_Age_TestCentre = 
    bfitdata %>% 
      .[, .(Age,TestCentre.Group)] %>% 
      .[Age > 100, Age := 100] %>% 
      .[, Age := ceiling(Age)] %>% 
      xtabs(~TestCentre.Group + Age, .) %>% 
      data.table() %>% 
      .[, weight := N/sum(N), by = .(Age)] %>% 
      .[is.na(weight), weight := 0] %>% 
      .[, Group := "All"] %>% 
  .[, Age := as.numeric(Age)] %>% 
  setkeyv(c("Group","Age","TestCentre.Group"))


all_weights_Age_TestCentre %>% 
  copy() %>% 
  .[N == 0, weight := NA] %>% 
  ggplot(aes(y = TestCentre.Group, x = weight, group = Age, color = Age)) + 
  geom_point() + 
  theme(legend.position= "top")
Weights of test centres by age when calculating age-wise viral load for the complete sample.

Figure 3.1: Weights of test centres by age when calculating age-wise viral load for the complete sample.

# Weights for correct weighting of ages in age years in age groups
age_group_N_comp = make_age_group_N(bfitdata,breaks = c(seq(0,20,5),65,101))
age_group_N_comp.PAMS = make_age_group_N(bfitdata,breaks = c(seq(0,20,5),65,101), my_group = "PAMS")
age_group_N_comp.Hosp = make_age_group_N(bfitdata,breaks = c(seq(0,20,5),65,101), my_group = "Hospitalized")

3.1.2.1 Calculating posterior predictions

Here we calculate linear posterior predictions of viral load \(\hat{Y}_{load|Age}\) for ages 0:100, separately for the three subject groups PAMS, Hospitalized, and Other. We calculate all posterior predictions on the level of posterior draws, but we omit indexing for posterior draws for better readability. The expected viral load given a specific age and clinical status is calculated as

\[ \hat{Y}_{load|age,group} = \alpha_{load} + \beta_{load} X + \sigma_{g} B + \rho Z \]

where \(\alpha\) is the intercept, \(\beta\) are regression weights for co-variates (including clinical status), \(B\) is the design matrix for spline-smoothed age, \(\sigma_g\) are group-specific spline coefficients, and \(Z\) and \(\rho\) are design matrix and coefficients for test-centre random effects.

We calculate culture probability for the complete sample based on the association of viral load and percent positive culture estimated with the data from Wölfel at al and Perera et el. The posterior predictions of culture probability are calculated as

\[ \hat{Y}_{CP|load} = logit(\alpha_{CP} + \beta_{CP} load) \]

where \(load\) are measured viral loads of samples used in culture isolation trials. Figure 3.2 shows the estimated association between viral load and culture probability.

## posterior predictions of positive culture
CPpars =
  CP.fit$fit %>% 
  as_draws() %>% 
  subset_draws(c("b_Intercept","b_log10Load")) %>% 
  as_draws_dt()


CP_by_load =
  data.table(log10Load = newdata_load[,2]) %>%
  cbind(inv.logit(newdata_load %*% t(as.matrix(CPpars[,c("b_Intercept","b_log10Load")])))) %>%
  melt(id.vars = "log10Load", variable.name = ".draw") %>%
  .[, .draw := as.numeric(gsub("V","",.draw, perl = T))] %>%
  .[, value := value] %>%
  get_stats(var = "value", by = c("log10Load"))

p_CP_by_load =
  ggplot(CP_by_load,
         aes(x = log10Load,
             y = mean)) +
  geom_hline(yintercept = 100, lty = 3, col = "grey") + 
  conf_ribbon(CP_by_load, fill = "black") +
  geom_line() +
  coord_cartesian(ylim = c(0,1), xlim = c(2.5,10)) + 
  xlab(expression(log[10]~viral~load)) +
  ylab("Culture probability") +
  gg_expand() 
  
p_CP_by_load
Estimated association between viral load and culture probability based on data from Wölfel et al (2020) and Perera et al (2020).

Figure 3.2: Estimated association between viral load and culture probability based on data from Wölfel et al (2020) and Perera et al (2020).

p_CP_by_load = 
  p_CP_by_load + 
  gg_expand() + 
  gg_text_size()

One can then calculate the expected culture probability given age and group as

\[ \hat{Y}_{CP|\hat{Y}_{load|age,group}} = logit(\alpha_{CP} + \beta_{CP} \hat{Y}_{load|age,group}) \]

The parameters \(\alpha_{CP}\) and \(\beta_{CP}\) are obtained independently from the model and parameters used to calculate \(\hat{Y}_{load|age,group}\). To properly combine the uncertainty of parameter estimates from both models, \(\hat{Y}_{CP|\hat{Y}_{load|age,group}}\) is calculated with individual posterior distribution draws from the two analyses. That is, 4000 posterior predictions from the age model are paired randomly with 4000 posterior draws from the culture probability model to calculate 4000 draws that make up the posterior distribution of the estimated age and groupwise culture probabilities \(\hat{Y}_{CP|\hat{Y}_{load|age,group}}\).

\(\hat{Y}_{CP|\hat{Y}_{load|age,group}}\) is an estimate of culture probability at the expected (mean) viral load for a specific group of subjects. Because the variation of viral load within age groups is larger than the variation between age groups, an estimate of culture probability should use subject-level predictions of viral load as its basis. We use the posterior predictions calculated as

\[ \hat{Y}^i_{load|age,group} = \hat{Y}_{load|age,group} + \epsilon_{age,centre} \] where \(\epsilon_{age,centre}\) are age group and test-centre specific error terms. We use the posterior predictions \(\hat{Y}^i_{load|age,group}\) to calculate estimates for culture probability by age and group.

\[ \hat{Y}^i_{CP|\hat{Y}_{load|age,group}} = logit(\alpha_{CP} + \beta_{CP} \hat{Y}^i_{load|age,group}) \] the expected (mean) culture probabilities by age and group calculated from posterior predictions will be identical to those calculated from the posterior expectations \(\hat{Y}_{CP|age,group}\), but only culture probability estimates calculated from posterior predictions reflect the variability of culture probability in the population.

The regression analysis accurately models mean and standard deviations of age-wise viral load across sub-groups (clinical status), but not the bimodal distribution of log10 viral loads. We tried to implement modeling of log10viral load distributions using different strategies, but models that captured the bimodal structure well did not adequately capture variations of mean and standard deviations of age-wise viral load across sub-groups. Hence, we used an additional post-processing step to obtain bimodal posterior predictions. The post-processing step involves estimating parameters for a mixture of two normal distributions (means, standard deviations, weight) which, for each age-year and subject-group combination, has the same mean as the model-estimated mean viral load and matches the bimodal distribution of viral load values observed in that sub-group. The post-processing is implemented in the functions calc_post_lin_pred() and add_mix() and use the model mix_s_all.stan to estimate parameters of the mixture distribution with the ADVI variational inference algorithm (Kucukelbir et al. 2015).

if (file.exists("pdata/pp_1st_pos_load+.Rdata")) {
  load("pdata/pp_1st_pos_load+.Rdata")
} else {
  post_lin_pred = 
    rbind(
      calc_post_lin_pred(bfit, CPpars, newdata, c_weights_Other),
      calc_post_lin_pred(bfit, CPpars, newdata, c_weights_PAMS),
      calc_post_lin_pred(bfit, CPpars, newdata, c_weights_Hosp),
      calc_post_lin_pred(bfit, CPpars, newdata, all_weights_Age_TestCentre[Group == "All"], sum.Group = NULL) %>% .[,Group := "All"]
    ) %>% 
    .[, Age := as.integer(Age)] %>% 
    .[, .draw := as.integer(.draw)]
  post_pred = 
    rbind(
      calc_post_lin_pred(bfit, CPpars, newdata, c_weights_Other, epred = F),
      calc_post_lin_pred(bfit, CPpars, newdata, c_weights_PAMS, epred = F),
      calc_post_lin_pred(bfit, CPpars, newdata, c_weights_Hosp, epred = F)
    ) %>% 
    .[, Age := as.integer(Age)] %>% 
    .[, .draw := as.integer(.draw)] 
  save(post_lin_pred,post_pred, file = "pdata/pp_1st_pos_load+.Rdata")
}
Ages = 25
grp = "Other"

simple_posterior_predict = 
  posterior_predict(bfit, newdata = newdata[Age %in% Ages & Group == grp]) %>% 
  as.numeric() 

hdata = 
  rbind(
  data.table(log10Load = bfitdata[round(Age) %in% Ages & Group == grp,log10Load], Estimand = "Observed"),
  data.table(log10Load = post_lin_pred[Age %in% Ages & Group == grp,log10Load], Estimand = "Expectation"),
  data.table(log10Load = simple_posterior_predict, Estimand = "Posterior prediction"),
  data.table(log10Load = post_pred[Age %in% Ages & Group == grp,log10Load], Estimand = "Posterior prediction with post processing")
) %>% 
  .[, Value := ifelse(Estimand == "Expectation","Expectation","Prediction")]

means = hdata[, .(m = mean(log10Load), sd = sd(log10Load)), by = .(Estimand)][, Value := "Prediction"]

hdata %>% 
    ggplot(aes(x = log10Load, fill = Estimand)) + 
    geom_density(data = hdata[Estimand != "Observed"], alpha = .25, color = NA) + 
    geom_density(data = hdata[Estimand == "Observed"],fill = NA, color = "black") + 
    facet_wrap(~Value, scales = "free_y", ncol = 1) + 
    theme(legend.position = "top") +
  geom_vline(xintercept = 9, lty = 3, col = "grey") + 
  xlab(expression(log[10]~viral~load)) +
  gg_expand()
Calculating posterior predictions. Top: Posterior distribution of the expected (mean) viral load for a 25 year old subject from the Other group. Bottom: Distribution of observed viral loads (black outline) and posterior predictions for the same group. The distribution of posterior predictions (green) obtained from the basic analysis model implemented in brms is unimodal and underestimates the proportion of subjects with log10 viral loads larger than 9. Adding the post-processing step recovers the bimodal distribution of viral loads and more accurately estimates the proportion of subjects with very high viral loads.

Figure 3.3: Calculating posterior predictions. Top: Posterior distribution of the expected (mean) viral load for a 25 year old subject from the Other group. Bottom: Distribution of observed viral loads (black outline) and posterior predictions for the same group. The distribution of posterior predictions (green) obtained from the basic analysis model implemented in brms is unimodal and underestimates the proportion of subjects with log10 viral loads larger than 9. Adding the post-processing step recovers the bimodal distribution of viral loads and more accurately estimates the proportion of subjects with very high viral loads.

In addition to calculating the expected culture probability given the estimated viral load by age and clinical status, we can also calculate the expected culture probability given the observed viral loads as \(\hat{Y}_{CP|Y_{load|age,group}} = logit(\alpha_{CP} + \beta_{CP} Y_{load|age,group})\). The following code calculates expected culture probability given the observed viral loads of 20-65 year olds.

Obs_load_by_age = 
  rbind(
    bfitdata %>% 
      .[, .(log10Load = mean(log10Load)),
        by = .(fAgeGroup, Group)] %>% 
      .[, fAgeGroup := as.character(fAgeGroup)],
    bfitdata %>% 
      .[Age > 20 & Age < 65,
        .(log10Load = mean(log10Load)),
        by = .(Group)] %>% 
      .[,fAgeGroup := "20-65"]
  ) 

CP4obs = c()
for (k in 1:nrow(Obs_load_by_age)) {
  CP4obs = 
    rbind(CP4obs,
        data.table(CP = inv.logit(CPpars$b_log10Load*Obs_load_by_age[k,log10Load] + CPpars$b_Intercept),
                   AgeGroup = Obs_load_by_age[k,fAgeGroup],
                   Group = Obs_load_by_age[k,Group])
  )
}

CP4obs_stats = 
  get_stats(CP4obs, var = "CP",by = c("AgeGroup","Group")) %>% 
  .[, tbl := paste0(round(mean,2), 
                    " (",round(lower90,2),
                    ", ", round(upper90,2),
                    ")")]


CP4obs_tbl = 
  CP4obs_stats %>% dcast(AgeGroup ~ Group, value.var = "tbl") %>% 
  kable(caption = "Culture probability for observed viral loads by age group and clinical status.",
        format = table_format) %>% 
  kable_styling(full_width = F) %>%
  add_footnote("Estimates are based on the average observed viral load in age groups, whereas figures and comparisons reported later are based on estimated viral loads.",
               notation = "none")

CP4obs_tbl
Table 3.1: Culture probability for observed viral loads by age group and clinical status.
AgeGroup Other PAMS Hospitalized
0-5 0.13 (0.06, 0.23) 0.29 (0.18, 0.41) 0.11 (0.04, 0.2)
10-15 0.14 (0.06, 0.23) 0.23 (0.13, 0.35) 0.15 (0.07, 0.25)
15-20 0.17 (0.09, 0.28) 0.32 (0.21, 0.44) 0.17 (0.08, 0.27)
20-25 0.2 (0.11, 0.31) 0.39 (0.27, 0.51) 0.16 (0.07, 0.26)
20-65 0.2 (0.1, 0.3) 0.4 (0.28, 0.53) 0.17 (0.08, 0.27)
25-35 0.2 (0.11, 0.31) 0.43 (0.31, 0.56) 0.17 (0.08, 0.27)
35-45 0.19 (0.09, 0.29) 0.4 (0.28, 0.53) 0.17 (0.08, 0.27)
45-55 0.19 (0.1, 0.3) 0.39 (0.27, 0.51) 0.17 (0.08, 0.27)
5-10 0.17 (0.09, 0.28) 0.17 (0.08, 0.27) 0.1 (0.03, 0.18)
55-65 0.19 (0.1, 0.3) 0.37 (0.25, 0.5) 0.18 (0.09, 0.28)
>65 0.21 (0.12, 0.32) 0.37 (0.25, 0.5) 0.24 (0.14, 0.36)
Estimates are based on the average observed viral load in age groups, whereas figures and comparisons reported later are based on estimated viral loads.

Here we calculate the posterior distribution of the average culture probability stratified by clinical status.

global_weights = 
  merge(
    expand.grid(
      Age = 0:100,
      Group = unique(bfitdata$Group)) %>% 
      data.table(),
    bfitdata[,.(Age,Group)] %>% 
      .[Age > 100, Age := 100] %>% 
      .[, Age := ceiling(Age)] %>% 
      .[, .(N = .N), by = .(Group, Age)],
    by = c("Group", "Age"), all = T) %>% 
  .[is.na(N), N := 0] %>% 
  .[, w_age := N/sum(N), by = .(Group)] %>% 
  .[, Group := as.character(Group)] %>% 
  .[, c("N") := NULL]

global_weights = 
  rbind(
    global_weights,
    bfitdata[,.(Age,Group)] %>% 
      .[Age > 100, Age := 100] %>% 
      .[, Age := ceiling(Age)] %>% 
      .[, .(w_age = .N), by = .(Age)] %>% 
      .[, w_age := w_age / nrow(bfitdata)] %>% 
      .[, Group := "All"]
  )

setkeyv(post_lin_pred,c("Age","Group"))
setkeyv(global_weights,c("Age","Group"))
global_CP = 
  post_lin_pred %>% 
  .[global_weights, w_CP := CP * w_age] %>%
  .[, .(CP = sum(w_CP)), by = .(.draw,Group)]

post_lin_pred[, w_CP := NULL]

3.1.2.2 Viral load by age

We start the preparation of results by calculating mean and credible intervals for age- and group-wise viral load. From this we can plot age-wise viral load, stratified by clinical status.

Group_levels = c("PAMS","Hospitalized","Other","All")

###  culture probability from posterior predictions
stats_by_age = rbind(
  stats_VLCP_by_Age(post_lin_pred,"All"),
  stats_VLCP_by_Age(post_lin_pred,"Other"),
  stats_VLCP_by_Age(post_lin_pred,"PAMS"),
  stats_VLCP_by_Age(post_lin_pred,"Hospitalized")
) %>%
  .[, sample := factor(sample, levels = Group_levels)]

### for culture probability from posterior expectations 
stats_by_age.e = rbind(
  stats_VLCP_by_Age(post_lin_pred,"All",CP.var = "e.CP"),
  stats_VLCP_by_Age(post_lin_pred,"Other",CP.var = "e.CP"),
  stats_VLCP_by_Age(post_lin_pred,"PAMS",CP.var = "e.CP"),
  stats_VLCP_by_Age(post_lin_pred,"Hospitalized",CP.var = "e.CP")
) %>%
  .[, sample := factor(sample, levels = Group_levels)]

p_load_by_age =
  stats_by_age[outcome == "log10Load" & sample != "All"] %>% 
  ggplot(aes(x = Age, y = mean, color = sample, fill = sample)) +
  red_blue_black() +
  conf_ribbon(stats_by_age[outcome == "log10Load" & sample != "All"], fill = "sample") + 
  geom_line() +
  coord_cartesian(ylim = c(2,10)) +
  scale_x_continuous(expand = expansion(0,0)) +
  ylab(expression(Mean~log[10]~viral~load)) +
  guides(color = FALSE) 

p_load_by_age
Estimated viral load by age. Confidence bands indicate 90% credible intervals.

Figure 3.4: Estimated viral load by age. Confidence bands indicate 90% credible intervals.

p_load_by_age = 
  p_load_by_age + 
  theme(legend.position = my_legend_position)  + 
  gg_legend_size(1) +
  gg_expand() + 
  gg_text_size() +
  guides(color = FALSE) 

Because we are primarily interested in differences between sub groups of the population, we calculate credible intervals based on the expected viral load (and culture probability). This could however give a false impression about the accuracy of viral load predictions on the subject-level, which should take the error variance into account.

3.1.2.3 Culture probability by age and differences between age

Age-wise culture probability and derived differences were calculated above. Here we only plot the results of these calculations.

p_CP_by_age =
  ggplot(stats_by_age[outcome == "CP" & sample != "All"],
         aes(x = Age,
             y = mean,
             color = sample, fill = sample)) +
  red_blue_black() +
  conf_ribbon(stats_by_age[outcome == "CP" & sample != "All"], fill = "sample",conf_levels = seq(50,90,5)) + 
  geom_line() +
  coord_cartesian(ylim = c(0,1)) +
  scale_x_continuous(expand = expansion(0,0)) +
  ylab("Culture probability") +
  guides(color = FALSE) 

p_CP_by_age.e =
  ggplot(stats_by_age.e[outcome == "e.CP" & sample != "All"],
         aes(x = Age,
             y = mean,
             color = sample, fill = sample)) +
  red_blue_black() +
  conf_ribbon(stats_by_age.e[outcome == "CP" & sample != "All"], fill = "sample",conf_levels = seq(50,90,5)) + 
  geom_line() +
  coord_cartesian(ylim = c(0,1)) +
  scale_x_continuous(expand = expansion(0,0)) +
  ylab("Culture probability") +
  guides(color = FALSE) 

p_CP_by_age = 
  p_CP_by_age + 
  theme(legend.position = my_legend_position)  + 
  gg_legend_size(1) +
  gg_expand() + 
  gg_text_size() +
  guides(color = FALSE) 

p_CP_by_age.e = 
  p_CP_by_age.e + 
  theme(legend.position = my_legend_position)  + 
  gg_legend_size(1) +
  gg_expand() + 
  gg_text_size() +
  guides(color = FALSE) 

p_CP_by_age.e + p_CP_by_age + facet_grid(sample~.) + theme(legend.position = "none")
Estimated culture probability calculated from posterior expectations (left) and from posterior predictions (right).

Figure 3.5: Estimated culture probability calculated from posterior expectations (left) and from posterior predictions (right).

Credible intervals have difficulties in portraying uncertainty in multi-modal distributions. Hence, we use highest-density regions to display the uncertainty of posterior predictions obtained after post-processing.

if (file.exists("pdata/HDR_FPT.Rdata")) {
  load("pdata/HDR_FPT.Rdata")
} else {
  VLP9CP_by_Age.hdi = 
  rbind(
    post_pred[, as.list(fast.hdi(CP)),
              by = .(Age, Group)] %>% 
      .[,outcome := "Culture positvity"],
    post_pred[, as.list(fast.hdi(log10Load, 
                                 posterior.dist = "norm")),
              by = .(Age, Group)] %>% 
      .[,outcome := "log10 viral load"],
    post_pred[, as.list(fast.hdi(p9)),
              by = .(Age, Group)] %>% 
      .[,outcome := "P(log10 viral load > 9)"]
  )  %>% 
  .[, outcome := factor(outcome,
                        levels = c("log10 viral load",
                                   "Culture positvity",
                                   "P(log10 viral load > 9)"))] %>% 
    .[, Group := factor(Group, levels = c("PAMS","Hospitalized","Other"))]
  save(VLP9CP_by_Age.hdi,file = "pdata/HDR_FPT.Rdata")
}

p_VL_by_Age.hdi = 
  VLP9CP_by_Age.hdi[outcome == "log10 viral load"] %>% 
  ggplot(aes(x = Age, y = mean, color = Group)) + 
  geom_line() + 
  conf_linerange(color = "Group", size = .770) + 
  facet_wrap(~ Group) + 
  red_blue_black() + 
  theme(legend.position = "none") +
  ylab(expression(log[10]~viral~load)) + 
  geom_hline(yintercept = 9, col = "white", lty = 3) + 
  coord_cartesian(ylim = c(2,10)) + 
  gg_expand(x1 = .025, x2 = .025)

p_high_load = 
  VLP9CP_by_Age.hdi[outcome == "P(log10 viral load > 9)"] %>% 
    ggplot(aes(x = Age, y = mean, color = Group)) + 
    geom_line() + 
    conf_linerange(color = "Group", size = .770) + 
    facet_wrap( ~ Group) + 
    red_blue_black() + 
    theme(legend.position = "none") + 
    ylab(expression(Proportion~log[10]~viral~load>9)) + 
    coord_cartesian(ylim = c(0,.20)) + 
  gg_expand(x1 = .025, x2 = .025)

p_CP_by_Age.hdi = 
  VLP9CP_by_Age.hdi[outcome == "Culture positvity"] %>% 
  ggplot(aes(x = Age, y = mean, color = Group)) + 
  geom_line() + 
  conf_linerange(color = "Group", size = .770) + 
  facet_wrap(~Group) +
  red_blue_black() + 
  theme(legend.position = "none") +
  ylab("Culture probability") + 
  gg_expand(x1 = .025, x2 = .025)

hdi.plot = 
  (p_VL_by_Age.hdi + theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.line.x = element_blank())) / 
  (p_high_load +     theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.line.x = element_blank(), strip.text.x = element_blank())) / 
  (p_CP_by_Age.hdi + theme(strip.text.x = element_blank()))

hdi.plot
Highest-density regions for posterior predictions of log10 viral load and culture probability by age. The posterior predictions here take the error variance into account and used post-processing to also account for bimodal viral load distributions. The credible intervals indicate certainty of posterior predictions for individual cases (e.g., one 20 year old PAMS subject). In comparison, the credible intervals of posterior expectations used in the paper indicate the certainty of perditions for groups of subjects (e.g., the group of 20 year old PAMS subjects). The highest-density region for culture probability in the PAMS group is bimodal, because viral load is on average higher in this group, which means that a relatively larger share of subject have viral loads in a range that leads with high certainty to positive culturing results.

Figure 3.6: Highest-density regions for posterior predictions of log10 viral load and culture probability by age. The posterior predictions here take the error variance into account and used post-processing to also account for bimodal viral load distributions. The credible intervals indicate certainty of posterior predictions for individual cases (e.g., one 20 year old PAMS subject). In comparison, the credible intervals of posterior expectations used in the paper indicate the certainty of perditions for groups of subjects (e.g., the group of 20 year old PAMS subjects). The highest-density region for culture probability in the PAMS group is bimodal, because viral load is on average higher in this group, which means that a relatively larger share of subject have viral loads in a range that leads with high certainty to positive culturing results.

ggsave(hdi.plot,file = "figures/S2_HDIpostpredict_1st_pos.png",width = 20, height = 21, units = "cm",dpi = 600)
ggsave(hdi.plot,file = "figures/S2_HDIpostpredict_1st_pos.pdf",width = 20, height = 21, units = "cm")
if (file.exists("pdata/HDR_FPT_no_post_processing.Rdata")) {
  load("pdata/HDR_FPT_no_post_processing.Rdata")
} else {
  VLP9CP_by_Age.hdi = 
  rbind(
    post_lin_pred[, as.list(fast.hdi(CP)),
              by = .(Age, Group)] %>% 
      .[,outcome := "Culture positvity"],
    post_lin_pred[, as.list(fast.hdi(log10Load.p, 
                                 posterior.dist = "norm")),
              by = .(Age, Group)] %>% 
      .[,outcome := "log10 viral load"],
    post_lin_pred[, as.list(fast.hdi(p9)),
              by = .(Age, Group)] %>% 
      .[,outcome := "P(log10 viral load > 9)"]
  )  %>% 
  .[, outcome := factor(outcome,
                        levels = c("log10 viral load",
                                   "Culture positvity",
                                   "P(log10 viral load > 9)"))] %>% 
    .[, Group := factor(Group, levels = c("PAMS","Hospitalized","Other","All"))]
  save(VLP9CP_by_Age.hdi,file = "pdata/HDR_FPT_no_post_processing.Rdata")
}

p_VL_by_Age.hdi = 
  VLP9CP_by_Age.hdi[outcome == "log10 viral load" & Group != "All"] %>% 
  ggplot(aes(x = Age, y = mean, color = Group)) + 
  geom_line() + 
  conf_ribbon(VLP9CP_by_Age.hdi[outcome == "log10 viral load" & Group != "All"], fill = "Group") + 
  facet_wrap(~ Group) + 
  red_blue_black()+ 
  theme(legend.position = "none") +
  ylab(expression(log[10]~viral~load)) + 
  geom_hline(yintercept = 9, col = "white", lty = 3) + 
  coord_cartesian(ylim = c(2,10)) + 
  gg_expand(x1 = .025, x2 = .025)

p_high_load = 
  VLP9CP_by_Age.hdi[outcome == "P(log10 viral load > 9)"& Group != "All"] %>% 
    ggplot(aes(x = Age, y = mean, color = Group)) + 
    geom_line() + 
    conf_ribbon(VLP9CP_by_Age.hdi[outcome == "P(log10 viral load > 9)" & Group != "All"], fill = "Group") +
    facet_wrap( ~ Group) + 
    red_blue_black() + 
    theme(legend.position = "none") + 
    ylab(expression(Proportion~log[10]~viral~load>9)) + 
    coord_cartesian(ylim = c(0,.20)) + 
  gg_expand(x1 = .025, x2 = .025)


p_CP_by_Age.hdi.no.postproc = 
  VLP9CP_by_Age.hdi[outcome == "Culture positvity" & Group != "All"] %>% 
  ggplot(aes(x = Age, y = mean, color = Group)) + 
  geom_line() + 
  conf_ribbon.hdi( VLP9CP_by_Age.hdi[outcome == "Culture positvity" & Group != "All"],fill = "Group") + 
  facet_wrap(~Group) +
  red_blue_black() + 
  theme(legend.position = "none") +
  ylab("Culture probability") + 
  gg_expand(x1 = .025, x2 = .025)


hdi.plot = 
  (p_VL_by_Age.hdi + theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.line.x = element_blank())) / 
  (p_high_load +     theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.line.x = element_blank(), strip.text.x = element_blank())) / 
  (p_CP_by_Age.hdi.no.postproc + theme(strip.text.x = element_blank()))

hdi.plot
Highest-density regions for posterior predictions of log10 viral load and culture probability by age obtained without post-processing.

Figure 3.7: Highest-density regions for posterior predictions of log10 viral load and culture probability by age obtained without post-processing.

ggsave(hdi.plot,file = "figures/S2_HDIpostpredict_1st_pos_no_post_processing.png",width = 20, height = 21, units = "cm",dpi = 600)
ggsave(hdi.plot,file = "figures/S2_HDIpostpredict_1st_pos_no_post_processing.pdf",width = 20, height = 21, units = "cm")

3.1.2.4 Group comparisons

Using the above-calculated posterior predictions (without post-processing) by clinical status and age, we next compare PAMS and hospitalized subjects.

tmp = 
 post_lin_pred %>%
  .[Group %in% c("PAMS","Hospitalized")] %>% 
  setkeyv(c(".draw","Age","Group")) %>% 
  .[, .(diffPAMS = -diff(log10Load)), by = c("Age",".draw")] %>% 
  .[, as.list(my_stats_list_long(diffPAMS,quantiles = seq(.5,.95,by = .05))), by = "Age"] 

p_VL_PAMSvsHosp = 
  tmp %>% 
  ggplot(aes(x = Age, y = mean)) +
  geom_hline(yintercept = 0, lty = 2, col = "grey") + 
  conf_ribbon(data = tmp,fill = "black") + 
  geom_line() + 
  gg_expand() +
  coord_cartesian(xlim = c(0,101)) +
  ylab(expression(log[10]~viral~load~difference~PAMS~-Hospitalized))

tmp = 
 post_lin_pred %>%
  .[Group != "Other"] %>% 
  setkeyv(c(".draw","Age","Group")) %>% 
  .[, .(diffPAMS = -diff(CP)), by = c("Age",".draw")] %>% 
  .[, as.list(my_stats_list_long(diffPAMS,quantiles = seq(.5,.95,by = .05))), by = "Age"] 

p_CP_PAMSvsHosp = 
  tmp %>% 
  ggplot(aes(x = Age, y = mean)) +
  geom_hline(yintercept = 0, lty = 2, col = "grey") + 
  conf_ribbon(data = tmp,fill = "black") + 
  geom_line() + 
  gg_expand() +
  coord_cartesian(xlim = c(0,101)) +
  ylab("Culture probability difference difference PAMS-Hospitalized")

p_VL_PAMSvsHosp | p_CP_PAMSvsHosp
Viral load differences between PAMS and non-PAMS subjects by age

Figure 3.8: Viral load differences between PAMS and non-PAMS subjects by age

rm(tmp)

3.1.2.5 Age differences

First, we calculate the average estimated viral load for the three age groups 0-20, 20-65, 65 and older without stratification by age group.

age_group3_PAMS =
  copy(bfitdata) %>%
  .[Age > 100, Age := 100] %>% 
  .[, Age := ceiling(Age)] %>%
  .[, age_group := cut(Age, breaks = c(0,20,65,101), right = F)] %>%
  .[, list(N_Age = sum(.N)), by = c("age_group","Age")] %>% 
  .[, w := N_Age/sum(N_Age), by = "age_group"] %>% 
  .[, N_Age := NULL]

post_lin_pred_all = post_lin_pred[Group == "All"]
setkeyv(post_lin_pred_all,"Age")
setkeyv(age_group3_PAMS,"Age")
tmp = 
  post_lin_pred_all %>% 
  .[age_group3_PAMS,`:=`(wVL = w*log10Load, age_group = age_group)] %>% 
  .[, .(log10Load = sum(wVL)), by = .(age_group,.draw)] %>% 
  .[, .(load_stats = sprint_stat(log10Load,2)), by = .(age_group)]

load_3_age_groups = tmp$load_stats
names(load_3_age_groups) = tmp$age_group

rm(tmp,post_lin_pred_all)
tmp = gc()

Next we look at age differences in viral load by comparing younger subjects group against adults aged 20-65.

stats_agegroup_diff =
  rbind(
    stats_VLCP_by_Age(post_lin_pred, "All", get.stats = F) %>%
      calc_diff(age_group_N_comp) %>%
      get_stats(var = "value", by = c("comparison", "outcome")) %>%
      .[, sample := "All"],
    stats_VLCP_by_Age(post_lin_pred, "Other", get.stats = F) %>%
      calc_diff(age_group_N_comp) %>%
      get_stats(var = "value", by = c("comparison", "outcome")) %>%
      .[, sample := "Other"],
    stats_VLCP_by_Age(post_lin_pred,"PAMS", get.stats = F) %>%
      calc_diff(age_group_N_comp.PAMS) %>%
      get_stats(var = "value", by = c("comparison", "outcome")) %>%
      .[, sample := "PAMS"],
    stats_VLCP_by_Age(post_lin_pred,"Hospitalized", get.stats = F) %>%
      calc_diff(age_group_N_comp.Hosp) %>%
      get_stats(var = "value", by = c("comparison", "outcome")) %>%
      .[, sample := "Hospitalized"]
  ) %>%
  .[, sample := factor(sample, levels = Group_levels)]

stats_agegroup_diff[, 
  tbl := paste0(round(mean,digits = ifelse(outcome == "log10Load",2,3)), 
                " (",
                paste0(round(
                        c(lower90,upper90),
                        digits = ifelse(outcome == "log10Load",2,3)),
                       collapse = ", "),
                ")"),
  by = 1:nrow(stats_agegroup_diff)]

bfitdata[,age_group := cut(Age, breaks = c(0,5,10,15,20,65,80,101), right = F)]

N_table = 
  rbind(data.table(table(bfitdata[,age_group])) %>%
        .[, sample := "All"],
        data.table(table(bfitdata[Group == "Other",age_group])) %>%
        .[, sample := "Other"],
      data.table(table(bfitdata[Group == "PAMS",age_group])) %>%
        .[, sample := "PAMS"],
      data.table(table(bfitdata[Group == "Hospitalized",age_group])) %>%
        .[, sample := "Hospitalized"]) %>%
  setnames("V1","age_group") %>%
  .[,age_group := gsub("\\[|\\)","",age_group)] %>%
  .[,age_group := gsub(",","-",age_group)]

for (s in unique(N_table$sample)) {
  for (ag in N_table$age_group[1:4]) {
    stats_agegroup_diff[grepl(ag,comparison) & sample == s,
                        N := N_table[age_group == ag & sample == s,N]]
  }
}


age_group_diff_tbl = 
  dcast(outcome + sample ~ comparison, 
        data = stats_agegroup_diff,
        value.var = "tbl") 

age_group_diff_tbl_print = 
  age_group_diff_tbl %>%
  setnames(names(age_group_diff_tbl),
           gsub("\\[|)","",names(age_group_diff_tbl)) %>% gsub(",","-",.)) %>% 
  kable(format = table_format,
        caption = "Viral load and culture probability by age and age difference") %>%
  kable_styling(full_width = F)
age_group_diff_tbl_print
Table 3.2: Viral load and culture probability by age and age difference
outcome sample 0-5 5-10 10-15 15-20 20-65 65-101 0-5 vs 20-65 5-10 vs 20-65 10-15 vs 20-65 15-20 vs 20-65
log10Load PAMS 6.26 (5.94, 6.58) 6.38 (6.13, 6.63) 6.54 (6.36, 6.71) 6.71 (6.6, 6.82) 6.89 (6.85, 6.94) 6.72 (6.55, 6.9) -0.63 (-0.96, -0.32) -0.51 (-0.77, -0.26) -0.35 (-0.54, -0.17) -0.18 (-0.29, -0.07)
log10Load Hospitalized 5.84 (5.57, 6.09) 5.86 (5.66, 6.05) 5.88 (5.73, 6.04) 5.91 (5.79, 6.03) 6.02 (5.98, 6.07) 6.41 (6.37, 6.45) -0.18 (-0.45, 0.07) -0.16 (-0.36, 0.03) -0.14 (-0.29, 0.02) -0.11 (-0.22, 0.01)
log10Load Other 5.83 (5.68, 5.97) 5.92 (5.8, 6.03) 6.02 (5.92, 6.12) 6.15 (6.07, 6.23) 6.18 (6.15, 6.22) 6.26 (6.2, 6.31) -0.36 (-0.51, -0.2) -0.27 (-0.39, -0.15) -0.16 (-0.27, -0.06) -0.03 (-0.11, 0.05)
log10Load All 5.9 (5.78, 6.02) 6 (5.9, 6.1) 6.1 (6.02, 6.18) 6.22 (6.16, 6.28) 6.4 (6.38, 6.42) 6.36 (6.32, 6.39) -0.5 (-0.62, -0.37) -0.4 (-0.5, -0.3) -0.3 (-0.39, -0.22) -0.18 (-0.23, -0.12)
CP PAMS 0.329 (0.003, 0.95) 0.349 (0.003, 0.958) 0.377 (0.004, 0.966) 0.408 (0.005, 0.974) 0.441 (0.008, 0.981) 0.41 (0.006, 0.974) -0.112 (-0.279, -0.003) -0.092 (-0.228, -0.003) -0.064 (-0.162, -0.002) -0.033 (-0.087, -0.001)
CP Hospitalized 0.253 (0.002, 0.865) 0.257 (0.002, 0.868) 0.26 (0.002, 0.872) 0.264 (0.002, 0.874) 0.281 (0.002, 0.891) 0.347 (0.004, 0.936) -0.028 (-0.104, 0.009) -0.025 (-0.084, 0.003) -0.022 (-0.071, 0.001) -0.018 (-0.055, 0)
CP Other 0.257 (0.002, 0.874) 0.271 (0.002, 0.889) 0.288 (0.003, 0.905) 0.309 (0.003, 0.921) 0.314 (0.003, 0.923) 0.327 (0.004, 0.929) -0.058 (-0.148, -0.001) -0.044 (-0.11, -0.001) -0.026 (-0.072, -0.001) -0.005 (-0.027, 0.008)
CP All 0.286 (0.003, 0.906) 0.299 (0.003, 0.915) 0.308 (0.004, 0.921) 0.321 (0.004, 0.921) 0.353 (0.006, 0.939) 0.362 (0.006, 0.949) -0.067 (-0.167, -0.002) -0.054 (-0.132, -0.002) -0.045 (-0.111, -0.002) -0.033 (-0.076, -0.001)
clvls = c("0-5 vs\n20-65","5-10 vs\n20-65","10-15 vs\n20-65","15-20 vs\n20-65")
stats_agegroup_diff[, comparison := gsub("vs ","vs\n", comparison, perl = T)] %>%
  .[, comparison := factor(comparison, levels = clvls)]
stats_agegroup_diff = stats_agegroup_diff[!is.na(comparison)]


p_load_diff =
  ggplot(stats_agegroup_diff[outcome == "log10Load" & sample != "All" & (grepl("-",comparison))],
         aes(x = comparison, y = mean, color = sample, label = N)) +
  geom_hline(yintercept = 0, color = "grey") +
  geom_point(position = position_dodge(width = dodge_with)) +
  red_blue_black () +
  conf_linerange(stats_agegroup_diff[outcome == "log10Load" & sample != "All" & (grepl("-",comparison))], color = "sample") + 
  ylab(expression(log[10]~viral~load~difference))  +
  guides(color = guide_legend(nrow = 1, byrow = TRUE)) +
  xlab("Age group comparison") +
  theme(legend.position = "top")

p_CP_diff =
  ggplot(stats_agegroup_diff[outcome == "CP" & sample != "All" & (grepl("-",comparison))],
         aes(x = comparison, y = mean, color = sample, label = N)) +
  geom_hline(yintercept = 0, color = "grey") +
  geom_point(position = position_dodge(width = dodge_with)) +
  red_blue_black() +
  conf_linerange(stats_agegroup_diff[outcome == "CP" & sample != "All" & (grepl("-",comparison))],color = "sample") +
  ylab("Culture probability difference")  +
  guides(color = guide_legend(nrow = 1, byrow = TRUE)) +
  xlab("Age group comparison") +
  theme(legend.position = "top")

p_load_diff | p_CP_diff
Estimated viral load age differences. Confidence bands indicate 90% credible intervals

Figure 3.9: Estimated viral load age differences. Confidence bands indicate 90% credible intervals

N_lable =  
  N_table[age_group == "20-65"] %>% 
  .[,label := paste0(sample, " (",N,")")]
  
p_load_diff = 
  p_load_diff +
  geom_text(aes(y = upper90), size = 1.5,
            position = position_dodge(width = dodge_with)) + 
  gg_legend_size(1) +
  gg_text_size() +
  scale_color_manual(name = "Sample (N for 20-65)",
                     labels = N_lable$label,
                     values = c("black","red","blue")) + 
  theme(legend.position = c(.7,.25))

tmp = gc(verbose = F)

A more fine-grained comparison compares each age year against 50 year old adults.

if (file.exists("pdata/AgeCompFPT.Rdata")) {
  load("pdata/AgeCompFPT.Rdata")
} else {
  age_comps = c()
  for (age in 1:100) {
    for (g in c("PAMS","Hospitalized","Other")) {
      tmp = data.table(
        load_diff = post_lin_pred[Age == age & Group == g, log10Load] - post_lin_pred[Age == 50 & Group == g, log10Load],
        CP_diff = post_lin_pred[Age == age & Group == g, CP] - post_lin_pred[Age == 50 & Group == g, CP]
      )
      age_comps = 
        rbind(
          age_comps,
          tmp %>% 
            get_stats(var = "load_diff") %>% 
            .[, Group :=  g] %>% 
            .[, Age := age] %>% 
            .[, outcome := "load_diff"],
          tmp %>% 
            get_stats(var = "CP_diff") %>% 
            .[, Group :=  g] %>% 
            .[, Age := age] %>% 
            .[, outcome := "CP_diff"]
        )
    }
  }
  age_comps[, Group := factor(Group,levels(stats_by_age$sample))]
  save(age_comps,file = "pdata/AgeCompFPT.Rdata") 
}


p_age_comp_VL = 
  age_comps %>% 
  .[outcome == "load_diff"] %>% 
  ggplot(aes(x = Age, y = mean, color = Group, group = Group)) + 
  geom_hline(yintercept = 0, color ="grey", lty = 2) +
  conf_ribbon(age_comps,fill = "Group") +
  geom_line(show.legend = F) + 
  red_blue_black() + 
  gg_expand() + 
  ylab("Viral load difference to 50 year olds") + 
  gg_add_grid() +
  coord_cartesian(ylim = c(-1.5,1.5)) +
  theme(legend.position = c(.2,.9)) + 
  guides(color = guide_legend(nrow = 1),fill = guide_legend(nrow = 1))

p_age_comp_CP = 
  age_comps %>% 
  .[outcome == "CP_diff"] %>% 
  ggplot(aes(x = Age, y = mean, color = Group, group = Group)) + 
  geom_hline(yintercept = 0, color ="grey", lty = 2) +
  conf_ribbon(age_comps,fill = "Group") +
  geom_line(show.legend = F) + 
  red_blue_black() + 
  gg_expand() + 
  ylab("Culture probability difference to 50 year olds") + 
  gg_add_grid() + 
  coord_cartesian(ylim = c(-.5,.5)) +
  theme(legend.position = c(.25,.97), strip.text.y = element_blank()) + 
  guides(color = guide_legend(nrow = 1, title.position = "left", title = "Group:"),
         fill = guide_legend(nrow = 1, title.position = "left", title = "Group:"))


p_age_comp_VL
Estimated viral load age differences. Confidence bands indicate 90% credible intervals

Figure 3.10: Estimated viral load age differences. Confidence bands indicate 90% credible intervals

rm(post_lin_pred_all)

To collect all age comparisons in one place and thus facilitate comparisons, we compose a table with estimated age differences for viral load and culture probability and unadjusted age differences calculated directly from the raw data. This is table 2 of the paper.

clvls = c("0-5 vs 20-65","5-10 vs 20-65","10-15 vs 20-65","15-20 vs 20-65")
tmp_tbl = 
  stats_agegroup_diff[, .(comparison, sample, tbl,outcome)] %>% 
  dcast(comparison + sample ~ outcome,
        value.var = "tbl") %>% 
  .[, comparison := gsub("\n"," ", comparison)] %>%
  setnames(c("sample","comparison","CP","log10Load"),
           c("Sample","Comparison", "model-diff CP", "model-diff load")) %>%
  .[, Sample := factor(Sample, levels = levels(NHST_stats$Sample))] 

diff_table = 
  merge(tmp_tbl,
        NHST_stats[, .(Sample,Comparison,Difference,`p~MW~`)],
        by = c("Sample","Comparison")) %>%
  .[order(Sample,Comparison)] %>%
  .[, `p~MW~` := ifelse(`p~MW~` < .001,"<.001",as.character(round(`p~MW~`,3)))] %>%
  .[, Comparison := factor(Comparison, levels = clvls)] %>%
  setkeyv(c("Sample","Comparison"))

table_2_caption = "Differences of culture probabilities between age groups."

tbl_age_diff = 
  kable(diff_table[,setdiff(names(diff_table),c("t-stat","t df")),with = F],
      digits = c(0,0,0,0,0,2),
      caption = table_2_caption, 
      format = table_format) %>%
  kable_styling(full_width = F) %>%
  add_footnote("model-diff = model-based difference, CP = culture probability, load = log10viral load, difference = raw difference with 95% confidende intervals, p = p-value, MW = Mann-Whitney U test", notation = "none")

tbl_age_diff
Table 3.3: Differences of culture probabilities between age groups.
Sample Comparison model-diff load model-diff CP Difference pMW
PAMS 0-5 vs 20-65 -0.63 (-0.96, -0.32) -0.112 (-0.279, -0.003) -0.37 (-1, 0.26) 0.212
PAMS 5-10 vs 20-65 -0.51 (-0.77, -0.26) -0.092 (-0.228, -0.003) -0.86 (-1.46, -0.26) 0.004
PAMS 10-15 vs 20-65 -0.35 (-0.54, -0.17) -0.064 (-0.162, -0.002) -0.56 (-1.1, -0.02) 0.034
PAMS 15-20 vs 20-65 -0.18 (-0.29, -0.07) -0.033 (-0.087, -0.001) -0.25 (-0.5, 0.01) 0.055
Hospitalized 0-5 vs 20-65 -0.18 (-0.45, 0.07) -0.028 (-0.104, 0.009) -0.37 (-1.1, 0.37) 0.114
Hospitalized 5-10 vs 20-65 -0.16 (-0.36, 0.03) -0.025 (-0.084, 0.003) -0.48 (-1.38, 0.43) 0.171
Hospitalized 10-15 vs 20-65 -0.14 (-0.29, 0.02) -0.022 (-0.071, 0.001) -0.12 (-0.98, 0.74) 0.622
Hospitalized 15-20 vs 20-65 -0.11 (-0.22, 0.01) -0.018 (-0.055, 0) -0.03 (-0.35, 0.3) 0.718
Other 0-5 vs 20-65 -0.36 (-0.51, -0.2) -0.058 (-0.148, -0.001) -0.32 (-0.55, -0.1) 0.004
Other 5-10 vs 20-65 -0.27 (-0.39, -0.15) -0.044 (-0.11, -0.001) -0.12 (-0.41, 0.18) 0.521
Other 10-15 vs 20-65 -0.16 (-0.27, -0.06) -0.026 (-0.072, -0.001) -0.3 (-0.57, -0.03) 0.054
Other 15-20 vs 20-65 -0.03 (-0.11, 0.05) -0.005 (-0.027, 0.008) -0.08 (-0.29, 0.13) 0.352
All 0-5 vs 20-65 -0.5 (-0.62, -0.37) -0.067 (-0.167, -0.002) -0.49 (-0.69, -0.29) <.001
All 5-10 vs 20-65 -0.4 (-0.5, -0.3) -0.054 (-0.132, -0.002) -0.39 (-0.65, -0.14) 0.003
All 10-15 vs 20-65 -0.3 (-0.39, -0.22) -0.045 (-0.111, -0.002) -0.41 (-0.64, -0.18) <.001
All 15-20 vs 20-65 -0.18 (-0.23, -0.12) -0.033 (-0.076, -0.001) -0.15 (-0.29, 0) 0.047
model-diff = model-based difference, CP = culture probability, load = log10viral load, difference = raw difference with 95% confidende intervals, p = p-value, MW = Mann-Whitney U test
ftdiff = 
  flextable(diff_table) %>%
  autofit() %>%
  set_caption(caption = table_2_caption) %>%
  footnote(i = 1,
           j = c(1,3,4,5,6),
           value = as_paragraph(
             c("PAMS: Pre and asymptomatic subjects",
               "Mean difference and 90% credible interval based on regression analysis. CP = culture positicity",
               "Load = log(10) viral load",
               "Difference and confidence interval based on raw data",
               "p-value from Mann–Whitney U test")),
           ref_symbols = c("a", "b", "c", "d","e"),
           part = "header",
           inline = TRUE)


table_doc <- table_doc %>%
  body_add_flextable(ftdiff) %>%
  body_end_section_landscape()

Next we calculate and plot the difference and ratio of culture probability for PAMS and hospitalized subjects, stratified by age in full years.

ObsLoad_20_65 = 
  bdata[Age > 20 & Age < 65, .(mean_load = mean(log10Load)), by = .(Group)]

EstCP_20_65_by_PAMS = 
  data.table(
    PAMS = inv.logit(CPpars$b_Intercept + CPpars$b_log10Load * ObsLoad_20_65[Group == "PAMS", mean_load]),
    Other = inv.logit(CPpars$b_Intercept + CPpars$b_log10Load * ObsLoad_20_65[Group == "Other", mean_load]),
    Hospitalized = inv.logit(CPpars$b_Intercept + CPpars$b_log10Load * ObsLoad_20_65[Group == "Hospitalized", mean_load])) %>% 
  .[, delta := PAMS - Hospitalized] %>% 
  .[, ratio := PAMS  / Hospitalized]

pdata = 
  rbind(
    data.table(
      x = c(2.5, rep(ObsLoad_20_65[Group == "Hospitalized", mean_load],2)),
      y = c(rep(mean(EstCP_20_65_by_PAMS$Hospitalized),2),0)) %>% 
      .[, Group := "Hospitalized"],
    data.table(
      x = c(2.5, rep(ObsLoad_20_65[Group == "Other", mean_load],2)),
      y = c(rep(mean(EstCP_20_65_by_PAMS$Other),2),0)) %>% 
      .[, Group := "Other"],
    data.table(
      x = c(2.5, rep(ObsLoad_20_65[Group == "PAMS", mean_load],2)),
      y = c(rep(mean(EstCP_20_65_by_PAMS$PAMS),2),0)) %>%
      .[, Group := "PAMS"]
  ) %>% 
  .[, Group := ordered(Group, levels = c("PAMS","Hospitalized","Other"))]

p_CP_by_load = 
  p_CP_by_load + 
  geom_line(data = pdata, aes(x = x, y = y, color = Group)) + 
  red_blue_black() + 
  theme(legend.position = c(.2,.9))
  
tmp = 
 post_lin_pred %>% 
  .[Group %in% c("PAMS","Hospitalized")] %>% 
  setkeyv(c(".draw","Age","Group")) %>% 
  .[, .(diffPAMS = -diff(CP),
        rrPAMS = 1/exp(diff(log(CP)))),
    by = c("Age",".draw")]
  
  
PAMSvsnonPMASCP.dt = 
  tmp %>% 
  .[, as.list(my_stats_list_long(diffPAMS,quantiles = seq(.5,.95,by = .05))), by = "Age"] 

p_PAMSvsnonPMASCP = 
  PAMSvsnonPMASCP.dt %>% 
  ggplot(aes(x = Age, y = mean)) +
  geom_hline(yintercept = 0, lty = 2, col = "grey") + 
  conf_ribbon(data = PAMSvsnonPMASCP.dt,fill = "black") + 
  geom_line() + 
  gg_expand() +
  ylab("Culture probability difference PAMS - Hospitalized")


PAMSvsnonPMASCPRR.dt = 
  tmp %>% 
  .[, as.list(my_stats_list_long(rrPAMS,quantiles = c(.01,seq(.25,.5,by = .05)))), by = "Age"] %>% 
  .[, median := (lower1 + upper1)/2]

p_PAMSvsnonPMASCPRR = 
  PAMSvsnonPMASCPRR.dt %>% 
  ggplot(aes(x = Age, y = median)) +
  geom_hline(yintercept = 1, lty = 2, col = "grey") + 
  conf_ribbon(data = PAMSvsnonPMASCPRR.dt,fill = "black",conf_levels = seq(25,50,by = 5)) + 
  geom_line() + 
  gg_expand() +
  ylab("Culture probability ratio PAMS / Hospitalized")
p_PAMSvsnonPMASCP | p_PAMSvsnonPMASCPRR
Viral load differences between PAMS and hospitalized subjects by age. Because ratios have heavy tails, we show median and 25%-50% credible intervals for the ratio of culture probabilities in the right-hand side panel.

Figure 3.11: Viral load differences between PAMS and hospitalized subjects by age. Because ratios have heavy tails, we show median and 25%-50% credible intervals for the ratio of culture probabilities in the right-hand side panel.

rm(post_lin_pred)

At the average viral load of 20-65 year old adults, PAMS, and hospitalized cases have an expected culture probability of 0.4 (0.3, 0.5) and 0.2 (0.1, 0.3), respectively, resulting in a differences of 0.2 (0.2, 0.3) or a ratio of 2.6 (1.8, 3.9).

3.2 B.1.1.7 Viral load and culture probability

To visualize the effect of B.1.1.7, we show posterior distributions of estimated viral loads and culture probabilities for B.1.1.7 and non-B.1.1.7 cases as well as for the difference between these two groups.

B117_sigmas = 
  B117fit$fit %>% 
  as_draws() %>% 
  subset_draws("b_sigma",regex = T) %>% 
  as_draws_dt() %>% 
  .[, r := rnorm(10000)] %>% 
  .[, r_B117 := exp(b_sigma_Intercept + b_sigma_B117B117) * r] %>% 
  .[, r_nonB117 := exp(b_sigma_Intercept) * r] %>% 
  .[, `:=` (b_sigma_Intercept = NULL , 
            b_sigma_B117B117 = NULL , 
            b_sigma_PAMS = NULL,
            r = NULL)] %>% 
  .[.draw < 4001] %>% 
  setkeyv(".draw")

B117.draws = 
  as_draws(B117fit$fit) %>% 
  subset_draws(c("b_Intercept","b_B117B117")) %>% 
  as_draws_dt() %>% 
  merge(CP.fit$fit %>% 
          as_draws() %>% 
          subset_draws(c("b_Intercept","b_log10Load")) %>% 
          as_draws_dt() %>% 
          setnames(c("b_Intercept","b_log10Load"),c("a_CP","b_CP"))) %>% 
  setkeyv(".draw") %>% 
  .[B117_sigmas, r_B117 := r_B117] %>% 
  .[B117_sigmas, r_nonB117 := r_nonB117] %>% 
  .[, `:=` (
    load.nonB117 = b_Intercept,
    load.B117 = b_Intercept + b_B117B117)] %>% 
  .[, `:=` (
    CP.nonB117 = inv.logit((load.nonB117 + r_nonB117)*b_CP + a_CP),
    CP.B117 = inv.logit((load.B117 + r_B117)*b_CP + a_CP))] %>%  
  .[, `:=` (
    r_nonB117 =  r_nonB117,
    r_B117 = r_B117)] %>% 
  .[, .(.draw, load.nonB117, load.B117, CP.nonB117,CP.B117)] %>% 
  .[, `:=`(load.deltaB117 = load.B117 - load.nonB117,
           CP.deltaB117 = CP.B117 - CP.nonB117,
           CP.RRB117 = CP.B117/CP.nonB117)] %>% 
  melt(id.vars = ".draw") %>% 
  .[, B117 := ifelse(grepl("RR",variable),"RRB117",
                     ifelse(grepl("delta",variable),"deltaB117",
                            ifelse(grepl("nonB117",variable),"nonB117","B117")))] %>% 
  .[, Outcome := ifelse(grepl("CP",variable),"culture positivity","log10Load")]


no_y_axis = theme(axis.title.y = element_blank(), axis.line.y = element_blank(), axis.text.y = element_blank())

p1 = 
  B117.draws[Outcome == "log10Load" & !(grepl("delta|RR",B117))] %>% 
  ggplot(aes(x = value, fill = B117)) +
  stat_histinterval(.width = c(.50, .80, .90), breaks = 25, alpha = .75, color = NA) + 
  scale_fill_manual(values = c("#00A087FF","#7E6148FF"),
                    name = "", labels = c("B.1.1.7","non-B.1.1.7")) +
  scale_color_manual(values = c("#00A087FF","#7E6148FF"),
                    name = "", labels = c("B.1.1.7","non-B.1.1.7")) +
  theme(axis.line.y = element_blank(), axis.text.y = element_blank()) +
  xlab(expression(log[10]~viral~load)) + 
  theme(legend.position = c(.2,.975)) + 
  ylab("Posterior density")

hdi.data = rbind(
  B117.draws[variable == "load.B117",value] %>% 
  fast.hdi(posterior.dist = "norm") %>% 
  .[,.(lower90,upper90)] %>% 
  .[, hdi.group := 1] %>% 
  melt(id.vars = "hdi.group") %>% 
  .[, B117 := "B117"] %>% 
  .[, y := 0],
  B117.draws[variable == "load.nonB117",value] %>% 
  fast.hdi(posterior.dist = "norm") %>% 
  .[,.(lower90,upper90)] %>% 
  .[, hdi.group := 1] %>% 
  melt(id.vars = "hdi.group") %>% 
  .[, B117 := "nonB117"] %>% 
  .[, y := 0])

p1 = 
  p1 +
  geom_line(data = hdi.data,
            aes(y = y, x = value, group = interaction(hdi.group,B117), color = B117),
            size = 1) + 
  gg_add_grid("x")

p2 = 
  B117.draws[Outcome == "log10Load" & (grepl("delta",B117))] %>% 
  ggplot(aes(x = value)) +
  stat_histinterval(.width = c(.50, .80, .90), breaks = 25, alpha = .5) + 
  theme(axis.line.y = element_blank(), axis.text.y = element_blank()) +
  xlab(expression(log[10]~load~B.1.1.7-`non-B.1.1.7`)) +
  coord_cartesian(xlim= c(0.5,1.5)) +
  ylab("") + 
  gg_add_grid("x")

pB117load = (p1 + ggtitle("A")| p2 + ggtitle("B")) + plot_layout(widths = c(1,1))

p1 = 
  B117.draws[Outcome == "culture positivity" & !(grepl("delta|RR",B117))] %>% 
  ggplot(aes(x = value, fill = B117)) +
  stat_histinterval(breaks = 25, alpha = .5, .width = c(), color = NA) + 
  stat_histinterval(.width = c(.50, .80, .90), breaks = 25, alpha = .5, color = NA) + 
  scale_fill_manual(values = c("#00A087FF","#7E6148FF"),
                    name = "", labels = c("B.1.1.7","non-B.1.1.7")) +
  scale_color_manual(values = c("#00A087FF","#7E6148FF"),
                    name = "", labels = c("B.1.1.7","non-B.1.1.7")) + 
  theme(axis.line.y = element_blank(), axis.text.y = element_blank()) +
  xlab("Culture probability") + 
  theme(legend.position = "none") + 
  ylab("Posterior density") 

hdi.data = rbind(
  B117.draws[variable == "CP.B117",value] %>% 
  fast.hdi() %>% 
  .[,.(lower90,upper90)] %>% 
  .[, hdi.group := 1:2] %>% 
  melt(id.vars = "hdi.group") %>% 
  .[, B117 := "B117"] %>% 
  .[, y := -.025],
  B117.draws[variable == "CP.nonB117",value] %>% 
  fast.hdi() %>% 
  .[,.(lower90,upper90)] %>% 
  .[, hdi.group := 1:2] %>% 
  melt(id.vars = "hdi.group") %>% 
  .[, B117 := "nonB117"] %>% 
  .[, y := 0])
p1 = 
  p1 + 
  geom_line(data = hdi.data,
            aes(y = y, x = value, group = interaction(hdi.group,B117), color = B117),
            size = 1) + 
  gg_add_grid("x")
p2 = 
  B117.draws[Outcome == "culture positivity" & (grepl("delta",B117))] %>% 
  ggplot(aes(x = value)) +
  stat_histinterval(breaks = 15, alpha = .5,.width = c(.50, .80, .90)) + 
  no_y_axis + 
  xlab(expression(Culture~prob.~B.1.1.7-`non-B.1.1.7`)) + 
  gg_add_grid("x")

pB117cp = (p1 + ggtitle("C") | p2 + ggtitle("D")) + plot_layout(widths = c(1,1))

pB117load / pB117cp
Posterior distributions of viral load and culture probability for B.1.1.7 and non-B.1.1.7 cases. A posterior distribution of log10 viral load. B difference of average viral load between B.1.1.7 and non-B.1.1.7 cases. C posterior distribution of the probability of a positive culture load. D Difference of average culture probability between B.1.1.7 and non-B.1.1.7 cases.

Figure 3.12: Posterior distributions of viral load and culture probability for B.1.1.7 and non-B.1.1.7 cases. A posterior distribution of log10 viral load. B difference of average viral load between B.1.1.7 and non-B.1.1.7 cases. C posterior distribution of the probability of a positive culture load. D Difference of average culture probability between B.1.1.7 and non-B.1.1.7 cases.

At 7.2 (7.1, 7.4) B.1.1.7 cases have on average a 1.0 (0.9, 1.1) higher viral load than non-B.1.1.7 cases. Assuming that the association between culture probability is the same for the B.1.1.7 and non-B.1.1.7 variants of the virus, this implies an average probability of a positive culture of 0.51 (0.04, 0.97) for B.1.1.7 cases, compared to 0.32 (0.00, 0.94) for non-B.1.1.7 cases, that is a 4.2 (1.0, 12.1) times higher culture probability for B.1.1.7.

The estimates of a mean 4.2 (1.0, 12.1) times higher culture probability for B.1.1.7 can appear surprising given the mean culture probabilities of 0.51 (0.04, 0.97) and 0.32 (0.00, 0.94). The following figure shows that this is explained by the fact that the estimated culture probability for B.1.1.7 is strictly higher, and that ratio distributions are generally heavily skewed. Due to the long-tailed nature of ratio distributions, we report median and 25th and 75th percentile to describe the ratio of culture probabilities: 2.6 (1.4, 5.0).

scatter_plot = 
  B117.draws[B117 %in% c("B117","nonB117") & Outcome == "culture positivity",.(B117,value,.draw)] %>% 
  dcast(.draw~B117) %>% 
  ggplot(aes(x = B117, y = nonB117)) + 
  geom_point(alpha = .25, col = "blue", shape = 16) + 
  coord_cartesian(xlim = c(0,1), ylim = c(0,1)) +
  geom_abline(intercept = 0, slope = 1, col = "grey",lty = 3)  + 
  xlab("Estimated culture probability B.1.1.7") + 
  ylab("Estimated culture probability B.1.177") +
  gg_expand()

post_dens_ratio = 
  B117.draws[B117 == "RRB117" & Outcome == "culture positivity"] %>% 
  ggplot(aes(x = value)) + 
  stat_histinterval(fill = "blue", alpha = .5, breaks = 100, col = "red",.width = c(.50, .80, .90)) + 
  coord_cartesian(xlim = c(1,30)) + 
  xlab("Culture probability B.1.1.7 \n/ Culture probability B.1.177") +
  ylab("Posterior mass") +
  gg_expand()

scatter_plot | post_dens_ratio
Posterior distribution of the ratio of B.1.1.7 and B.1.177 culture probabilities. Left: Each point is one posterior draw, for which we calculated culture probability. The same errror term is used in each draw for B.1.1.7 and non-B.1.1.7. When the error term is very low (< -2) we get a low viral load and culture probability for B.1.1.7 and non-B.1.1.7. When the error term is very high  (> 2) we get a high viral load and culture probability for both. Only when the error term is intermediate does the on average higher viral load of B.1.1.7 translate to a higher culture probability. Right: The ratio of culture probabilties has a heavy tail. (x-axis clipped at 30.)

Figure 3.13: Posterior distribution of the ratio of B.1.1.7 and B.1.177 culture probabilities. Left: Each point is one posterior draw, for which we calculated culture probability. The same errror term is used in each draw for B.1.1.7 and non-B.1.1.7. When the error term is very low (< -2) we get a low viral load and culture probability for B.1.1.7 and non-B.1.1.7. When the error term is very high (> 2) we get a high viral load and culture probability for both. Only when the error term is intermediate does the on average higher viral load of B.1.1.7 translate to a higher culture probability. Right: The ratio of culture probabilties has a heavy tail. (x-axis clipped at 30.)

To document the robustness of the result to different analysis approaches, we show the results of a number of alternative analyses. It should be kept in mind that the primary analysis with the most restricted data set has the relative cleanest design, i.e., the best matching of B.1.1.7 and non-B.1.1.7 cases.

T_B117_caption = "Log10 viral load differences between B.1.1.7 and non-B.1.1.7 cases. Each row shows the estimated effect of B.1.1.7 in an alternative analysis."
T_B117_footnote = "Window: Number of days within which non-B.1.1.7 cases must occur in a test centre with B.1.1.7 cases to be included in the analysis. 5 = non-B.1.1.7 cases detected within +/-5 days of B.1.1.7 cases are included. Adjusted: Adjustment for age, PCR type, group (PAMS, Hospitalized, Other), and sex. Random effects: Modelling test centres as random effects. Paired: Yes if only test centres that report B.1.1.7 and non-B.1.1.7 centres are included. Effects are given with 90% credible intervals."

B117_ktbl = 
  B117_model_stats %>% 
  kable(format = table_format,
        caption = T_B117_caption) %>%
  kable_styling(full_width = F) %>%
  add_footnote(T_B117_footnote,
               notation = "none")

B117_ktbl
Table 3.4: Log10 viral load differences between B.1.1.7 and non-B.1.1.7 cases. Each row shows the estimated effect of B.1.1.7 in an alternative analysis.
Window Model N B.1.1.7 N non-B.1.1.7 Load B.1.1.7 Load non-B.1.1.7 Effect B.1.1.7
Inf unadjusted 1533 23848 7.3 (7.2, 7.4) 6.3 (6.3, 6.3) 0.99 (0.92, 1.05)
Inf RE, unadjusted 1533 23848 6.8 (6.7, 6.9) 5.9 (5.8, 5.9) 0.93 (0.86, 0.99)
Inf RE, adjusted 1533 23848 6.8 (6.7, 6.9) 5.9 (5.8, 5.9) 0.93 (0.86, 0.99)
5 unadjusted 1533 1582 7.3 (7.2, 7.4) 6.3 (6.2, 6.3) 1.04 (0.94, 1.14)
5 RE, unadjusted 1533 1582 7.1 (7.0, 7.2) 6.0 (5.9, 6.2) 1.02 (0.92, 1.11)
5 RE, adjusted 1533 1582 7.2 (7.0, 7.3) 6.1 (6.0, 6.3) 1.04 (0.94, 1.14)
1 unadjusted 1533 977 7.3 (7.3, 7.4) 6.4 (6.2, 6.5) 0.97 (0.85, 1.09)
1 RE, unadjusted 1533 977 7.1 (7.0, 7.2) 6.1 (6.0, 6.2) 1.00 (0.88, 1.11)
1 RE, adjusted 1533 977 7.2 (7.0, 7.3) 6.2 (6.0, 6.3) 1.03 (0.92, 1.14)
1 RE, adjusted, (paired) 1453 977 7.2 (7.1, 7.4) 6.2 (6.0, 6.4) 1.04 (0.92, 1.15)
Window: Number of days within which non-B.1.1.7 cases must occur in a test centre with B.1.1.7 cases to be included in the analysis. 5 = non-B.1.1.7 cases detected within +/-5 days of B.1.1.7 cases are included. Adjusted: Adjustment for age, PCR type, group (PAMS, Hospitalized, Other), and sex. Random effects: Modelling test centres as random effects. Paired: Yes if only test centres that report B.1.1.7 and non-B.1.1.7 centres are included. Effects are given with 90% credible intervals.
B117_ft = 
  flextable(B117_model_stats) %>%
  autofit() %>%
  set_caption(caption = T_B117_caption) %>%
  footnote(i = 1,
           j = 1,
           value = as_paragraph(T_B117_footnote),
           ref_symbols = "",
           part = "header",
           inline = TRUE)

table_doc <- table_doc %>%
  body_add_flextable(B117_ft) %>%
  body_end_section_landscape()

3.3 Viral load and culture probability over time

3.3.1 Basic model parameters: Peak day and slopes

The following plot shows population-level means and standard deviations of parameter estimates.

rename_params = function(dt) {
  pars = c(expression(Slope~of~log[10]~load~increase),
           expression(Days~to~peak~load),
         expression(Peak~log[10]~viral~load),
         expression(Slope~of~log[10]~load~decrease))
  dt[, parameter := factor(parameter,
                           levels = c("slope_up",
                                      "time2peak",
                                      "intercept",
                                      "slope_down"),
                           labels = pars)]
  return(dt)
}

if (file.exists("pdata/r_model_pars_by_draw.Rdata")) {
  load("pdata/r_model_pars_by_draw.Rdata")
} else {
  r_model_pars_by_draw = 
  rbind(
    smrs_by_draw(draws, "intercept"),
    smrs_by_draw(draws, "slope_up"),
    smrs_by_draw(draws, "slope_down"),
    smrs_by_draw(draws, "time2peak")) %>% 
  rename_params()
  setnames(r_model_pars_by_draw,"parameter","variable")
  save(r_model_pars_by_draw,file = "pdata/r_model_pars_by_draw.Rdata")
}

param_means = 
  r_model_pars_by_draw %>% 
  plot_post_hists(value.var = "mean", labeller = label_parsed) +
  xlab("mean over cases") + 
  ggtitle("Group-level parameters")
param_sds = r_model_pars_by_draw %>% 
  plot_post_hists(value.var = "sd", labeller = label_parsed) +
  xlab("sd over cases")

paramplot = (param_means) / (param_sds)
paramplot
Posterior distributions of means and standard deviations of population-level model parameters. Numbers in the plot are mean and the 90% credible interval. Top row: Posterior distribution for mean over cases. Bottom row: Posterior distribution for standard deviation over cases

Figure 3.14: Posterior distributions of means and standard deviations of population-level model parameters. Numbers in the plot are mean and the 90% credible interval. Top row: Posterior distribution for mean over cases. Bottom row: Posterior distribution for standard deviation over cases

ggsave(paramplot,
       file = "figures/S5_TC_model_params_posterior_distribution.png",
       height = 15,
       width = 30,
       units = "cm",
       dpi = 300,
       type = "cairo",
       device = png_device)
ggsave(paramplot,
       file = "figures/S5_TC_model_params_posterior_distribution.pdf",
       height = 15,
       width = 30,
       units = "cm")

setnames(r_model_pars_by_draw,"variable", "parameter")

tmp = gc(verbose = F)

This model estimated the mean peak viral load to be 8.1 (8.0, 8.3) at 4.3 (4.0, 4.6) days after infections, and gradients of 2.0 (1.8, 2.1) and -0.168 (-0.171, -0.165). The corresponding standard deviations were 0.70 (0.66, 0.75), 0.92 (0.62, 1.24), 0.39 (0.23, 0.55) and 0.016 (0.012, 0.020), respectively.

The following plot shows the distribution of the mean (expected) model parameter over subjects. The larger standard deviation for slope of viral load increase is due to using PAMS as a predictor for shift and growth gradient, which are two dependent parameters in the model. Related, the PAMS cases are also those with a lower peak viral load and fewer days to peak viral load.

if (file.exists("pdata/r_model_pars_by_ID.Rdata")) {
  load("pdata/r_model_pars_by_ID.Rdata")
} else {
  r_model_pars_by_ID = 
    rbind(
      smrs_by_ID(draws, "intercept"),
      smrs_by_ID(draws, "slope_up"),
      smrs_by_ID(draws, "slope_down"),
      smrs_by_ID(draws, "time2peak")) %>% 
    rename_params()
  setnames(r_model_pars_by_ID,"parameter","variable")
  save(r_model_pars_by_ID,file = "pdata/r_model_pars_by_ID.Rdata")
}

p_m = 
  r_model_pars_by_ID %>% 
  merge(day_data[day == 0, .(ID,PAMS1)], by = "ID") %>% 
  .[, PAMS := factor(PAMS1, labels = c("No","Yes"))] %>%
  plot_post_hists(value.var = "mean", fill = "PAMS", labeller = label_parsed) + 
  ggtitle("Subject-level expectations") + 
  ylab("count over subjects") + 
  xlab("posterior expectation") + 
  red_blue(3:2) +
  theme(legend.position = c(0.2,0.5))

p_sd = 
  r_model_pars_by_ID %>% 
  merge(day_data[day == 0, .(ID,PAMS1)], by = "ID") %>% 
  .[, PAMS := factor(PAMS1, labels = c("No","Yes"))] %>%
  plot_post_hists(value.var = "sd", fill = "PAMS", labeller = label_parsed) + 
  ggtitle("Subject-level standard deviations") + 
  ylab("count over subjects") + 
  xlab("posterior expectation") + 
  red_blue(3:2) +
  theme(legend.position = "none")

p_m / p_sd  
Distribution of expected subject-level model parameters. Expectations of subject-level parameters are simply averages of posterior samples for a parameter for a subject.

Figure 3.15: Distribution of expected subject-level model parameters. Expectations of subject-level parameters are simply averages of posterior samples for a parameter for a subject.

shift_by_ID = 
  draws_by_id(draws = draws,params = c("shift")) %>%
  .[, .(mean = collapse::fmean(shift),
        q5 = quantile(shift,.05, names = F),
        q95 = quantile(shift,.95, names = F),
        sd = collapse::fsd(shift)), by = c("ID")] %>%
  merge(day_data[day == 0, .(ID,day_first_positive)], by = "ID") %>% 
  .[, mean := mean + day_first_positive] %>% 
  .[, q5 := q5+day_first_positive] %>% 
  .[, q95 := q95 + day_first_positive]  %>% 
  .[, variable := "day_first_positive_shifted"] %>% 
  .[, day_first_positive := NULL]

r_model_pars_by_ID = rbind(
  r_model_pars_by_ID,
  shift_by_ID
)

r_model_pars_by_ID %>% 
  merge(day_data[day == 0, .(ID,personHash,PAMS1, Age, N_tests)], by = "ID") %>% 
fwrite(file = "figures/TC_model_parameters_by_participant.csv",
       sep = ",")
rm(r_model_pars_by_ID)

One limitation of the available data is that testing did not follow a pre-planned schedule. Instead, the number of tests a subject had most likely depended on factors like the severity and duration of the illness. Hence we also investigate how model parameters are associated with the number of tests performed for a subject:

day_data[, N_tests_group := cut(N_tests,
                          breaks = c(2,3,4,5,7,20),
                          ordered_result = T,
                          labels = c("3","4","5","6-7",">7"))]

if (file.exists("pdata/r_model_pars_by_grp.Rdata")) {
  load("pdata/r_model_pars_by_grp.Rdata")
} else {
  r_model_pars_by_grp = 
  rbind(
    smrs_by_grp(draws, "time2peak", unique(day_data[,.(ID,N_tests_group)])),
    smrs_by_grp(draws, "intercept", unique(day_data[,.(ID,N_tests_group)])),
    smrs_by_grp(draws, "slope_down", unique(day_data[,.(ID,N_tests_group)])),
    smrs_by_grp(draws, "slope_up", unique(day_data[,.(ID,N_tests_group)]))) %>% 
  rename_params()
  save(r_model_pars_by_grp,file = "pdata/r_model_pars_by_grp.Rdata")
}

r_model_pars_by_grp[, N_tests_group := factor(N_tests_group,levels = levels(day_data$N_tests_group))]

tbl = 
  r_model_pars_by_grp %>%
  dcast(N_tests_group ~ parameter,value.var = "tbl") %>%
  setnames("N_tests_group","# tests")

pars1_caption = "Estimated model parameters for cases with different numbers of tests per subject."
ktbl = 
  clean_colnames(tbl) %>% 
  kable(format = table_format,
        caption = pars1_caption) %>%
  kable_styling(full_width = F) %>%
  add_footnote("Numbers are means and 90% credible intervals. Parameters were estimated in one joint model.",
               notation = "none")
ktbl
Table 3.5: Estimated model parameters for cases with different numbers of tests per subject.
# tests Slope of log10 load increase Days to peak load Peak log10 viral load Slope of log10 load decrease
3 1.97 (1.84, 2.11) 4.26 (3.98, 4.55) 8.04 (7.86, 8.22) -0.169 (-0.172, -0.165)
4 1.96 (1.84, 2.09) 4.33 (4.06, 4.62) 8.16 (7.97, 8.33) -0.168 (-0.171, -0.165)
5 1.97 (1.84, 2.1) 4.35 (4.07, 4.64) 8.22 (8.04, 8.4) -0.167 (-0.17, -0.164)
6-7 1.97 (1.84, 2.1) 4.39 (4.12, 4.68) 8.31 (8.12, 8.48) -0.167 (-0.171, -0.164)
>7 1.95 (1.82, 2.08) 4.48 (4.19, 4.79) 8.41 (8.21, 8.59) -0.168 (-0.172, -0.164)
Numbers are means and 90% credible intervals. Parameters were estimated in one joint model.
p_pars_by_group = 
  ggplot(r_model_pars_by_grp, aes(x = N_tests_group, y = mean)) + 
  geom_point() + 
  facet_wrap(~parameter, scale = "free", ncol = 4, labeller = label_parsed) + 
  conf_linerange() + 
  xlab("Number of tests per case") + 
  ylab("Parameter estimate")

ft_pars1 = 
  clean_colnames(tbl) %>% 
  flextable() %>%
  autofit() %>%
  set_caption(caption = pars1_caption)
table_doc <- table_doc %>%
  body_add_flextable(ft_pars1) %>%
  body_end_section_landscape() 
rm(r_model_pars_by_grp)
p_pars_by_group
Means and credible intervals of model parameters split by cases with different numbers of tests.  Note that model parameters were estimated in one joint analyses.

Figure 3.16: Means and credible intervals of model parameters split by cases with different numbers of tests. Note that model parameters were estimated in one joint analyses.

The clear association between number of tests, with the peak viral load (intercept) and the down slope is expected, if one assumes that cases with a more severe illness have a higher peak viral load which is also declines more slowly. This analysis does not show clear differences in the attack rate (slope up) between subjects with different numbers of tests.

As a further test, we also estimated model parameters with different subset of cases, i.e., with at least 3, at least 4, … at least 9 tests. Figure 3.17 shows the estimated parameters for these data subsets.

my_vars = c("intercept","slope_up","slope_down","time2peak")

if (file.exists("pdata/model_pars_by_sub_sample.Rdata")) {
  load("pdata/model_pars_by_sub_sample.Rdata")
} else {
  model_pars_by_sub_sample = rbind(
    pars_min3 = 
      stats_over_draws(draws = draws,
                       vars = my_vars) %>%
      .[, min_N_tests := 3],
    do.call(rbind,
            lapply(4:9, function(x) {
              stats_over_draws(file = here(paste0("CP/fits/w25o/",model,"_sel",x,".Rdata")),
                               vars = my_vars) %>%
                .[, min_N_tests := x]
            })
    )
  ) %>% 
    rename_params()
  save(model_pars_by_sub_sample,file = "pdata/model_pars_by_sub_sample.Rdata")
}



tbl = 
  model_pars_by_sub_sample %>%
  .[, tbl := paste0(round(mean,2)," (",round(lower90,2),", ",round(upper90,2),")")] %>%
  dcast(min_N_tests ~ parameter,value.var = "tbl") %>%
  setnames("min_N_tests","min # tests") 


pars2_caption = "Estimated model parameters for sub-samples with different minimum number of tests per subject."
pars2_note = "Numbers are means and 90% credible intervals. Parameters were estimated separately for subjects with different number of tests."

ktbl = 
  kable(tbl,
        format = table_format,
        caption = pars2_caption) %>%
  kable_styling(full_width = F) %>%
  add_footnote(pars2_note,
               notation = "none")
ktbl
Table 3.6: Estimated model parameters for sub-samples with different minimum number of tests per subject.
min # tests Slope ~ of ~ log[10] ~ load ~ increase Days ~ to ~ peak ~ load Peak ~ log[10] ~ viral ~ load Slope ~ of ~ log[10] ~ load ~ decrease
3 1.97 (1.84, 2.1) 4.31 (4.04, 4.6) 8.14 (7.96, 8.32) -0.17 (-0.17, -0.16)
4 1.94 (1.81, 2.07) 4.58 (4.27, 4.89) 8.27 (8.04, 8.51) -0.18 (-0.18, -0.17)
5 1.91 (1.76, 2.06) 4.52 (4.15, 4.89) 8.21 (7.91, 8.48) -0.18 (-0.18, -0.17)
6 1.79 (1.64, 1.96) 4.94 (4.51, 5.38) 8.42 (8.12, 8.69) -0.18 (-0.19, -0.18)
7 1.82 (1.64, 2.02) 5.07 (4.58, 5.62) 8.54 (8.16, 8.89) -0.19 (-0.2, -0.19)
8 1.98 (1.71, 2.3) 4.8 (4.15, 5.49) 8.6 (8.19, 9.03) -0.2 (-0.21, -0.19)
9 1.79 (1.5, 2.14) 5.27 (4.39, 6.22) 8.49 (7.84, 9.11) -0.19 (-0.21, -0.18)
Numbers are means and 90% credible intervals. Parameters were estimated separately for subjects with different number of tests.
p_pars_by_sub_sample = 
  ggplot(model_pars_by_sub_sample, aes(x = min_N_tests, y = mean)) + 
  geom_point() + 
  facet_wrap(~parameter, scale = "free", ncol = 4, labeller = label_parsed) + 
  conf_linerange() + 
  xlab("Minimum number of tests in sub-sample.") + 
  ylab("Parameter estimate")
p_pars_by_sub_sample
Means and credible intervals of model parameters split by sub-samples with different number of minimum tests per case. Note that model parameters were estimated in independent analyses.

Figure 3.17: Means and credible intervals of model parameters split by sub-samples with different number of minimum tests per case. Note that model parameters were estimated in independent analyses.

rm(model_pars_by_sub_sample)
((p_pars_by_group + ggtitle("A"))/ 
         (p_pars_by_sub_sample + ggtitle("B"))) %>% 
  ggsave(file = "figures/S11_TC_model_params_by_subsample_group.png",
       width = 30,
       height = 20,
       units = "cm",
       type = "cairo",
       device = png_device)
ggsave(file = "figures/S11_TC_model_params_by_subsample_group.pdf",
       width = 30,
       height = 20,
       units = "cm")

ft_pars2= 
  flextable(tbl) %>%
  autofit() %>%
  set_caption(caption = pars2_caption)

table_doc <- table_doc %>%
  body_add_flextable(ft_pars2) %>%
  body_end_section_landscape() 

3.3.2 Associations with adjustment variables

p = 
  draws %>% 
  subset_draws("intercept_PCR") %>%
  plot_post_hists()
p +  
  xlab("Regression weight for PCR system = cobas.") + 
  geom_vline(xintercept = 0, lty = 3, col = "red")
Posterior distribution of the association between PCR system and viral load.

Figure 3.18: Posterior distribution of the association between PCR system and viral load.

Figure 3.19 shows associations between test centre category and test log10 viral load. For instance, any tests taken at C19 centres resulted on average in 0.28 higher log10 viral load, compared to the grand mean and after adjustment for the other variables in the model. These effects were obtained as random effects.

centres = gsub("centreCategory",
               "",
               colnames(datalist$centre))
p = 
  draws %>% 
  subset_draws("int_centr") %>%
  plot_post_hists(labels = centres, nrow = 3)
p +  
  xlab("Regression weight of test centre category for viral load at any time")  + 
  geom_vline(xintercept = 0, lty = 3, col = "red")
Posterior distribution of random effects of test centre category on viral load. These random effects capture variations of average viral load measured in different test centre categories.

Figure 3.19: Posterior distribution of random effects of test centre category on viral load. These random effects capture variations of average viral load measured in different test centre categories.

Figure 3.20 shows associations between the test centre category of the first test and peak viral load. For instance, participants with a first positive test obtained in a residence for the elderly (RES) had on average an estimated peak viral load of 7.9, compared to a value of 7.5 for subjects with a first positive test from an outpatient department (OD). These effects were obtained as random effects.

tmp = 
  draws %>% 
  subset_draws("int_centre1_raw") %>% 
  as_draws_dt() %>% 
  melt(id.var = ".draw", value.name = "int_centre1_raw") %>% 
  .[, centr1 := factor(as.numeric(factor(variable)),
                          labels = names(table(day_data[day == 0,centreCategory])))] %>% 
  merge(draws %>%
          subset_draws("int_centre1_sigma") %>%
          as_draws_dt(), by = ".draw") %>% 
  merge(draws %>%
          subset_draws("log_intercept_mu") %>%
          as_draws_dt(), by = ".draw") %>% 
  .[, intercept := exp(log_intercept_mu + int_centre1_raw * int_centre1_sigma )] 

tmp_stats = 
  tmp %>% 
  .[, .(m = mean(intercept), lower = quantile(intercept,.5), upper = quantile(intercept,.95)) , by = centr1] %>% 
  .[, label := paste0(round(m,1)," (",round(lower,1),", ",round(upper,1),")")]

p_reff_c1_int = 
  tmp %>% 
  ggplot(aes(x = intercept)) + 
  geom_vline(xintercept = ss[variable == "intercept_mu",mean], color = "grey", lty = 2) +
  geom_histogram(alpha = .5, fill = "blue", bins = 30) + 
  facet_wrap(~centr1) + 
  theme(axis.line.y = element_blank(), axis.text.y = element_blank()) +
  ylab("Effect of the first test centre on peak viral load")

p_reff_c1_int +
    geom_text(data = tmp_stats,
              aes(x = min(ggplot_build(p_reff_c1_int)$data[[2]][,"xmin"]),
                  y = .9*max(ggplot_build(p_reff_c1_int)$data[[2]][,"ymax"]),
                  label = label),
              hjust = 0, size = 3)
Posterior distribution of random effects of test centre type for first test on peak viral load. Note that these associations are adjusted for age.

Figure 3.20: Posterior distribution of random effects of test centre type for first test on peak viral load. Note that these associations are adjusted for age.

rm(tmp)

Figure 3.21 shows associations between the test centre category with longest stay and decay gradient. The test centre with the longest stay is defined as the test centre catgeory with the longest period from the first to the last of consecutive tests from that test centre category.

tmp = 
  draws %>% 
  subset_draws("slope_down_ld_centre_raw") %>% 
  as_draws_dt() %>% 
  melt(id.var = ".draw", value.name = "slope_down_ld_centre_raw") %>% 
  .[, ld_centre := factor(as.numeric(factor(variable)),
                          labels = names(table(day_data[day == 0,ld_centre])))] %>% 
  merge(draws %>%
          subset_draws("slope_down_ld_centre_sigma") %>%
          as_draws_dt(), by = ".draw") %>% 
  merge(draws %>%
          subset_draws("log_slope_down_mu") %>%
          as_draws_dt(), by = ".draw") %>% 
  .[, slope_down := -exp(log_slope_down_mu + slope_down_ld_centre_raw * slope_down_ld_centre_sigma )] 

tmp_stats = 
  tmp %>% 
  .[, .(m = mean(slope_down), lower = quantile(slope_down,.5), upper = quantile(slope_down,.95)) , by = ld_centre] %>% 
  .[, label := paste0(round(m,3)," (",round(lower,3),", ",round(upper,3),")")]
p_reff_ld_centre_int = 
  tmp %>% 
  ggplot(aes(x = slope_down)) + 
  geom_vline(xintercept = -ss[variable == "slope_down_mu",mean], color = "grey", lty = 2) +
  geom_histogram(alpha = .5, fill = "blue", bins = 30) + 
  facet_wrap(~ld_centre) + 
  theme(axis.line.y = element_blank(), axis.text.y = element_blank()) 

p_reff_ld_centre_int +
    geom_text(data = tmp_stats,
              aes(x = min(tmp$slope_down),
                  y = max(ggplot_build(p_reff_c1_int)$data[[2]][,"ymax"]),
                  label = label),
              hjust = 0, size = 3)
Posterior distribution of random effects of test centre type with longest stay on decay gradient. Stay is defined as the number of days between the first and the last of consecutive tests from a test centre category. Cases without a unique centre with the longest stay are labeled X. Note that these associations are adjusted for age.

Figure 3.21: Posterior distribution of random effects of test centre type with longest stay on decay gradient. Stay is defined as the number of days between the first and the last of consecutive tests from a test centre category. Cases without a unique centre with the longest stay are labeled X. Note that these associations are adjusted for age.

rm(tmp,ss)
if (file.exists("pdata/model_pars_by_PAMS_HOSP.Rdata")) {
  load("pdata/model_pars_by_PAMS_HOSP.Rdata")
} else {
  model_pars_by_PAMS = 
  rbind(
    smrs_by_grp(draws, "slope_up",day_data[day == 0, .(ID,PAMS1)], calc.delta = T),
    smrs_by_grp(draws, "intercept",day_data[day == 0, .(ID,PAMS1)], calc.delta = T),
    smrs_by_grp(draws, "time2peak",day_data[day == 0, .(ID,PAMS1)], calc.delta = T),
    smrs_by_grp(draws, "slope_down",day_data[day == 0, .(ID,PAMS1)], calc.delta = T)
  ) %>% 
    setnames("PAMS1","group") %>% 
    .[, grouping.variable := "PAMS"] %>% 
    .[, group := gsub("FALSE-TRUE","diff",group)] %>% 
    .[, group := gsub("TRUE","PAMS",group)] %>%
    .[, group := gsub("FALSE","non-PAMS",group)] 

model_pars_by_hospitalized = 
  rbind(
    smrs_by_grp(draws, "slope_up",day_data[day == 0, .(ID,Hospitalized)], calc.delta = T),
    smrs_by_grp(draws, "intercept",day_data[day == 0, .(ID,Hospitalized)], calc.delta = T),
    smrs_by_grp(draws, "time2peak",day_data[day == 0, .(ID,Hospitalized)], calc.delta = T),
    smrs_by_grp(draws, "slope_down",day_data[day == 0, .(ID,Hospitalized)], calc.delta = T)
  )  %>% 
    .[, grouping.variable := "Hospitalized"] %>% 
    setnames("Hospitalized","group") %>% 
  .[, group := gsub("0-1","diff",group)] %>% 
  .[, group := gsub("0","non-hosp.",group)] %>% 
  .[, group := gsub("1","Hospitalised",group)]

# 5 subjects with missing gender
grp.dt = day_data[day == 0, .(ID,Gender)][, Gender := ifelse(is.na(Gender),sample(0:1,1),Gender), by = .(ID)]
model_pars_by_Gender = 
  rbind(
    smrs_by_grp(draws, "slope_up",grp.dt, calc.delta = T),
    smrs_by_grp(draws, "intercept",grp.dt, calc.delta = T),
    smrs_by_grp(draws, "time2peak",grp.dt, calc.delta = T),
    smrs_by_grp(draws, "slope_down",grp.dt, calc.delta = T)
  )  %>% 
    .[, grouping.variable := "Gender"] %>% 
    setnames("Gender","group") %>% 
  .[, group := gsub("0-1","diff",group)] %>% 
  .[, group := gsub("0","Female",group)] %>% 
  .[, group := gsub("1","Male",group)]

 save(model_pars_by_PAMS,
      model_pars_by_hospitalized,
      model_pars_by_Gender,
      file = "pdata/model_pars_by_PAMS_HOSP.Rdata")
}


tmp_lvls = rev(c("PAMS","non-PAMS","Hospitalised","non-hosp.","Female","Male","diff"))
model_pars_by = 
  rbind(
    model_pars_by_PAMS,
    model_pars_by_hospitalized,
    model_pars_by_Gender) %>% 
  rename_params() %>% 
  .[, value := ifelse(group == "diff","difference","estimate")] %>% 
  .[, value := factor(value, levels = c("estimate","difference"))] %>% 
  .[, group := factor(group, levels = tmp_lvls)] %>% 
  setnames("grouping.variable","Grouped_by")

plot_model_pars_by = 
  model_pars_by %>% 
  ggplot(aes(x = group, y = mean, ymin = lower90, ymax = upper90, color = Grouped_by)) + 
  geom_point(position = position_dodge(dodge_with)) + 
  conf_linerange(data = model_pars_by,color = "Grouped_by") + 
  facet_wrap(parameter ~ value, scale = "free", labeller = label_parsed, ncol = 2) + 
  ylab("parameter value") + 
  theme(legend.position = "top") + 
  geom_hline(data = model_pars_by[group == "diff"], aes(yintercept = 0), col = "grey", lty = 3) +
  coord_flip() + xlab("")

plot_model_pars_by
Model parameters by clinical status and gender. Note that due to limitations of the available data, both presence and absence of group differences in these result remain associated with uncertainty.

Figure 3.22: Model parameters by clinical status and gender. Note that due to limitations of the available data, both presence and absence of group differences in these result remain associated with uncertainty.

model_pars_by_tbl = 
  model_pars_by %>% 
  dcast(Grouped_by + group ~ parameter, value.var = "tbl") 

N_data = 
  rbind(day_data[day == 0, .(N = .N), by = .(Hospitalized)] %>% 
          .[, Grouped_by := "Hospitalized"] %>% 
          setnames("Hospitalized", "group") %>% 
          .[, group := ifelse(group == 0,"non-hosp.","Hospitalised" )],
        day_data[day == 0, .(N = .N), by = .(PAMS1)] %>% 
          .[, Grouped_by := "PAMS"] %>% 
          setnames("PAMS1", "group")%>% 
          .[, group := ifelse(group == 0,"non-PAMS","PAMS" )],
        day_data[day == 0, .(N = .N), by = .(Gender)] %>% 
          .[, Grouped_by := "Gender"] %>% 
          setnames("Gender", "group") %>% 
          .[!is.na(group)]  %>% 
          .[, group := ifelse(group == 0,"Female","Male" )]
  ) 

model_pars_by_tbl = 
  merge(model_pars_by_tbl,
        N_data,
        by = c("Grouped_by","group"),
        all.x = T) %>% 
  .[, group := factor(group, levels = rev(tmp_lvls))] %>% 
  setkeyv(c("Grouped_by", "group"))

gv = c("Grouped_by", "group", "N")
model_pars_by_tbl = 
  model_pars_by_tbl[, c(gv, 
                        setdiff(names(model_pars_by_tbl),gv)),
                    with = F]

model_pars_by_ktbl = 
  kable(model_pars_by_tbl,
        format = table_format,
        caption = "Model parameters by PAMS and hospitalisation.") %>%
  kable_styling(full_width = F)

model_pars_by_ktbl
Table 3.7: Model parameters by PAMS and hospitalisation.
Grouped_by group N Slope ~ of ~ log[10] ~ load ~ increase Days ~ to ~ peak ~ load Peak ~ log[10] ~ viral ~ load Slope ~ of ~ log[10] ~ load ~ decrease
Gender Female 2051 1.97 (1.83, 2.13) 4.3 (3.99, 4.62) 8.12 (7.93, 8.3) -0.171 (-0.176, -0.167)
Gender Male 2287 1.97 (1.82, 2.13) 4.32 (3.98, 4.67) 8.16 (7.97, 8.34) -0.165 (-0.17, -0.161)
Gender diff NA 0 (-0.17, 0.17) -0.02 (-0.39, 0.34) -0.03 (-0.12, 0.06) -0.006 (-0.011, 0)
Hospitalized Hospitalised 3494 1.92 (1.8, 2.05) 4.46 (4.17, 4.76) 8.27 (8.08, 8.45) -0.169 (-0.172, -0.165)
Hospitalized non-hosp. 850 2.14 (1.9, 2.42) 3.7 (3.27, 4.16) 7.6 (7.39, 7.79) -0.166 (-0.173, -0.159)
Hospitalized diff NA 0.22 (-0.02, 0.47) -0.75 (-1.2, -0.3) -0.68 (-0.83, -0.52) 0.003 (-0.005, 0.01)
PAMS PAMS 262 2.24 (1.88, 2.65) 3.41 (2.84, 4.04) 7.28 (6.95, 7.59) -0.173 (-0.186, -0.159)
PAMS non-PAMS 4082 1.95 (1.83, 2.08) 4.37 (4.1, 4.66) 8.2 (8.01, 8.37) -0.168 (-0.171, -0.164)
PAMS diff NA -0.29 (-0.68, 0.06) 0.96 (0.33, 1.53) 0.92 (0.62, 1.21) 0.005 (-0.009, 0.018)
model_pars_by_tbl_ft = 
  flextable(model_pars_by_tbl) %>%
  autofit() %>%
  set_caption(caption = "Model parameters by PAMS and hospitalisation.")
  
table_doc <- table_doc %>%
  body_add_flextable(model_pars_by_tbl_ft) %>%
  body_end_section_landscape()

3.3.3 Viral load over time

Figure 3.23 shows estimated viral loads over time. The red line and shaded area are expected viral load and the 90% credible interval. The blue lines are estimated time courses for individual subjects.

if (file.exists("pdata/VL_by_day.Rdata")) {
  load("pdata/VL_by_day.Rdata")
} else {
  VL_by_key_days_draws = 
    VLCP_by_day_draws %>%
    .[day_shifted %in% c(0,5,8)] %>%
    .[, list(mean = mean(log10Load), sd = sd(log10Load)), by = c(".draw","day_shifted")] %>%
    melt(id.vars = c(".draw","day_shifted"), variable.name = "statistic") 
  VL_by_day = 
    VLCP_by_day_draws %>%
    .[, value := collapse::fmean(log10Load), by = .(day_shifted,.draw)] %>%
    get_stats(by = "day_shifted") %>%
    setnames("mean","log10Load") %>%
    .[, ID := 0]
  save(VL_by_day,VL_by_key_days_draws,file = "pdata/VL_by_day.Rdata")
}

plot_VL_by_day = 
  plot_by_day(VL_by_day,
              VLCP_by_dayID %>% as.data.frame(), y.var = "log10Load", ylim = c(0,10), xlim = c(-5,25)) +
    ylab(expression(log[10]~viral~load))  + 
    gg_text_size()

tmp = gc(verbose = F)
plot_VL_by_day
Estimated viral load over time. Blue lines are posterior expectations (means) for subjects. The red line and shaded area are sample average and its 90% credible interval.

Figure 3.23: Estimated viral load over time. Blue lines are posterior expectations (means) for subjects. The red line and shaded area are sample average and its 90% credible interval.

3.3.4 Number of days from peak viral load to symptom onset

For 171 participants with time course data we also had a self-reported date of symptoms onset. This data is shown in figure 3.24. The presence of a few unexpected very early self-reported symptom onsets (more than 10 days before the first positive test) motivated the employment of a mixture model to estimate the temporal distance between peak viral load and symptom onset, where a skew-normal prior with the parameters \(\xi = -2.5\), \(\omega = 10\), and \(\tau=10\) modeled the prior expectation of the temporal distance, a normal distribution with \(\mu=0\) and \(\sigma=50\) modeled potential inaccurate recall, and a weight parameter \(\theta \in[0,1]\) with a uniform prior modeled the weight of the two processes.

day_data[day == 0 & !(is.na(onset_day))] %>% 
  ggplot(aes(x = onset_day)) +
  geom_histogram() + 
  xlab("Days since first test") + 
  ylab("Number subjects")
Histogram of time between first test and self-reported symptom onset.

Figure 3.24: Histogram of time between first test and self-reported symptom onset.

theta = draws %>% subset_draws("theta") %>% as.numeric()

peak2onset =
  draws %>% 
  thin_draws(1) %>% 
  subset_draws("shifted_onset") %>% 
  as_draws_dt() %>% 
  melt(id.var = ".draw")

peak2onset_m = 
  peak2onset %>% 
  .[,.(m = median(value)), by = .(.draw)]

The estimated theta of 0.95 (0.91, 0.98) indicates that around 5% of the responses are ascribed to inaccurate recall. We estimate that symptom onset occurs around 4.3 (3.5, 5.1) days after peak viral load. 3.25 visualizes this result.

p1 = 
  peak2onset %>% 
  ggplot(aes(x = value,group = .draw)) + 
  geom_histogram(position = "identity", alpha = .01, bins = 65) + 
  xlab("Days since peak viral load") + 
  theme(axis.line.y = element_blank(), 
        axis.title.y = element_blank(), 
        axis.text.y = element_blank())

tmp = as.matrix(peak2onset_m$m,ncol = 1) %>%
  as_draws() 
dimnames(tmp)$variable = "Mean"
p2 = 
  tmp %>% 
  as.matrix(peak2onset_m$m,ncol = 1) %>% 
  as_draws() %>% 
  plot_post_hists()  + 
  xlab("Days since peak load") + 
  gg_add_grid() + 
  geom_vline(xintercept = mean(peak2onset_m$m)) + 
  ylab("") 

px = 
  p1 + 
  inset_element(p2,-0.025,0.375,.55,1)
px
Estimated number of days from peak load to symptom onset. The main figure shows overlapping histograms (generated from the posterior distribution) of estimated symptom onsets, relative to peak viral load. A small number of participants reported a symptom onset prior to the first positive test in our data. The inset shows the posterior distribution of the mean estimated symptom onset over subjects with symptom onset data.

Figure 3.25: Estimated number of days from peak load to symptom onset. The main figure shows overlapping histograms (generated from the posterior distribution) of estimated symptom onsets, relative to peak viral load. A small number of participants reported a symptom onset prior to the first positive test in our data. The inset shows the posterior distribution of the mean estimated symptom onset over subjects with symptom onset data.

ggsave(px,filename = "figures/S15_peak2onset.png", width = 15, height = 12, units = "cm", dpi = 600)
ggsave(px,filename = "figures/S15_peak2onset.pdf", width = 15, height = 12, units = "cm")

rm(peak2onset,peak2onset_m,tmp,p1,p2,px)

3.3.5 Number / percent of cases tested befor peak viral load

day_data %>% 
  .[log10Load> 2, day_first_positive := min(day), by = "ID"] %>%
  .[, day_first_positive := max(day_first_positive), by = "ID"] %>% 
  .[, Group := ifelse(Hospitalized == 1, "Hospitalized",ifelse(PAMS1 == 1,"PAMS","Other"))]

if (file.exists("pdata/DetectionTiming.Rdata")) {
  load("pdata/DetectionTiming.Rdata")
} else {
  shift_ttp = 
    draws_by_id(draws,c("shift","slope_up","intercept","time2peak"))
  setkeyv(day_data,"ID")
  setkeyv(positive_before_peak_load_by_ID_draw,"ID")
  positive_before_peak_load_by_ID_draw = 
  merge(shift_ttp,
        day_data[day == 0,
                 .(ID,first_test_negative,day_first_positive,N_tests,Age,PAMS1,Group)],
        by = "ID",
        allow.cartesian = T)  %>%
  .[, shift := shift] %>%
  .[, day_first_positive_shifted := day_first_positive + shift] %>%
  .[, days_1stTest_to_peak := time2peak - day_first_positive_shifted] %>%
  .[, days_infection_to_1stPosTest := time2peak + day_first_positive_shifted] %>%
  .[, p_pos_test_before_peak := ((day_first_positive + shift) < 0) ] %>%
  .[, pos_test_before_peak := day_first_positive_shifted < 0] %>%
  .[, pos_test_before_peak_b := ifelse(p_pos_test_before_peak > runif(1),T,F), by = .(ID,.draw)]
  
day_first_post_test_by_draw = 
  positive_before_peak_load_by_ID_draw %>%
  .[, .(m_day_first_positive_shifted = mean(day_first_positive_shifted),
        sd_day_first_positive_shifted = sd(day_first_positive_shifted)),
    by = .(`.draw`, pos_test_before_peak)]

infection_to_first_post_test_by_draw = 
  positive_before_peak_load_by_ID_draw %>%
  .[, .(m_inf_to_first_positive_shifted = mean(days_infection_to_1stPosTest),
        sd_inf_to_first_positive_shifted = sd(days_infection_to_1stPosTest)),
    by = .(`.draw`, pos_test_before_peak)]

day_first_post_test_under70_PAMS = 
  positive_before_peak_load_by_ID_draw %>%
  .[Age < 70] %>% 
  .[, .(m_day_first_positive_shifted = mean(day_first_positive_shifted),
        sd_day_first_positive_shifted = sd(day_first_positive_shifted)),
    by = .(`.draw`, PAMS1)]

positive_before_peak_load_by_draw =
  positive_before_peak_load_by_ID_draw %>%
  .[, .(p_pos_test_before_peak = mean(p_pos_test_before_peak),
        n_pos_test_before_peak = sum(pos_test_before_peak_b),
        m_day_first_positive_shifted = mean(day_first_positive_shifted),
        sd_day_first_positive_shifted = sd(day_first_positive_shifted)),
   by = .(.draw)]
  
  save(positive_before_peak_load_by_ID_draw,
       day_first_post_test_by_draw,
       infection_to_first_post_test_by_draw,
       day_first_post_test_under70_PAMS,
       positive_before_peak_load_by_draw,
       file = "pdata/DetectionTiming.Rdata")
  rm(shift_ttp)
}

Figure 3.26 shows that our analysis estimates that it takes on average around 11.6 days from the start of viral shedding to the first positive test, and that this average is is associated with considerable variation between subjects.

positive_before_peak_load_by_ID_draw %>% 
  ggplot(aes(x = days_infection_to_1stPosTest, group = .draw)) + 
  stat_bin(aes(y=..count..),
           geom="step", alpha = .01, 
           position = "identity", bins = 50) + 
  xlab("Number of days from infection to first positive test") +
  ylab("Number of subjects") + 
  gg_expand()
Histograms for number of days from infection to first positive test. The figure shows for each draw from the posterior sample a histogram of days from infection to first positive test across subjects.

Figure 3.26: Histograms for number of days from infection to first positive test. The figure shows for each draw from the posterior sample a histogram of days from infection to first positive test across subjects.

Here we calculate the expected proportion of people who had a first positive test result before peak viral load.

positive_before_peak_load_by_draw %>% 
   setnames(c("p_pos_test_before_peak","n_pos_test_before_peak","m_day_first_positive_shifted","sd_day_first_positive_shifted"),
            c("% pos. tests before peak load","N pos. tests before peak load","mean day of 1st pos. test","sd day of 1st pos. test")) %>% 
  .[, `N pos. tests before peak load` := as.numeric(`N pos. tests before peak load`)] %>% 
  melt(id.var = ".draw") %>% 
  plot_post_hists() +
    ylab("Estimate")
Estimated timing of first positive tests relative to peak viral load. The figure shows posterior distributions.

Figure 3.27: Estimated timing of first positive tests relative to peak viral load. The figure shows posterior distributions.

The estimated average probability of having a positive test before peak viral load is 22.1 (21.0, 23.3)% (962 (914, 1010) out of the 4344 subjects). Over all subjects, the estimated day for the first positive test was 7.30 (7.11, 7.50) after peak viral load, with a standard deviation of 7.68 (7.58, 7.77). For those who had the first positive test before peak load, we estimated this to be -1.4 (-1.5, -1.3) before peak viral load, whereas it was estimated to be 9.8 (9.6, 10.0) for the others (sd: 6.9 (6.8, 7.0)).

The mean estimated day of detection for PAMS subjects under 70 years is 5.1 (4.5, 5.7) days versus the non-PAMS mean of 8.2 (7.9, 8.5) days.

tmp = 
  positive_before_peak_load_by_ID_draw %>% 
  .[, .(day_first_positive_shifted = mean(day_first_positive_shifted)),
                                       by = .(.draw,PAMS1)] 


positive_before_peak_load_by_draw =
  positive_before_peak_load_by_ID_draw %>%
  .[, .(p_pos_test_before_peak = mean(p_pos_test_before_peak),
        n_pos_test_before_peak = sum(pos_test_before_peak_b),
        m_day_first_positive_shifted = mean(day_first_positive_shifted),
        sd_day_first_positive_shifted = sd(day_first_positive_shifted)),
   by = .(.draw)]

rm(positive_before_peak_load_by_ID_draw)

Next, we test investigate the hypothesis that PAMS cases are detected earlier than non-PAMS cases. Figure 3.28 shows that among participants with time course data, PAMS cases were on average detected 5.1 (4.5, 5.7) days after peak load, 2.4 (1.7, 3.0) days before non-PAMS cases, which were on average detected 7.4 (7.2, 7.6) days after peak load.

tmp %>% 
  .[, PAMS := factor(if_else(PAMS1 == T,"PAMS","non-PAMS"),levels = c("PAMS","non-PAMS"))] %>% 
  ggplot(aes(x = day_first_positive_shifted, fill = PAMS)) + 
  geom_histogram(alpha = .5, position = "identity", bins = 75) + 
  red_blue() + 
  xlab("Mean number of days from peak load to first positive test") + 
  ylab("N posterior draws") + 
  theme(legend.position = c(.2,.8))
Posterior distribution of temporal distance from peak load to first positive test

Figure 3.28: Posterior distribution of temporal distance from peak load to first positive test

rm(tmp)

Note that the reliability of these results is linked to the reliability with which we could estimate when in the time course of the infection the tests were done, which remains somewhat uncertain.

3.3.6 Days from non-infectiousness or detectable viral load to peak viral load

Here we calculate the number of days from barely non-detectable viral load or culture probability of only 2.5% to peak viral load and thus peak culture probability.

plor_daysdetectiontonpeak_data = 
  draws %>% 
  subset_draws(c("intercept_mu","slope_up_mu","alpha_CP","beta_CP")) %>% 
  as_draws_dt() %>% 
  .[, maxCP := alpha_CP + `beta_CP[1]`*intercept_mu] %>% 
  .[, dayMax := -(alpha_CP - `beta_CP[1]`*intercept_mu + maxCP) / (`beta_CP[1]` * slope_up_mu)] %>% 
  .[, day25 := -(alpha_CP - `beta_CP[1]`*intercept_mu + logit(.01)) / (`beta_CP[1]` * slope_up_mu)] %>% 
  .[, days25toMax := day25-dayMax] %>% 
  .[,.(.draw,days25toMax)] %>% melt(id.var = ".draw")

plor_daysdetectiontonpeak_data %>% 
  plot_post_hists()
Expected numbers of days from 2.5% to maximum culture probability (left). 90% credible interval in parentheses.

Figure 3.29: Expected numbers of days from 2.5% to maximum culture probability (left). 90% credible interval in parentheses.

Note that (as above) the reliability of these results is linked to the reliability with which we could estimate when in the time course of the infection the tests was done, which is currently uncertain.

3.3.7 Culture probability over time

Figure 3.30 shows estimated culture probability over time.

if (file.exists("pdata/CP_by_day.Rdata")) {
  load("pdata/CP_by_day.Rdata")
} else {
  CP_by_key_days_draws = 
    VLCP_by_day_draws %>%
    .[day_shifted %in% c(-2, 0,5,10)] %>%
    .[, .(mean = mean(CP), sd = sd(CP)), by = c(".draw","day_shifted")] %>%
    melt(id.vars = c(".draw","day_shifted"), variable.name = "statistic")
  
  CP_by_day = 
    VLCP_by_day_draws %>%
    .[, value := collapse::fmean(CP), by = c("day_shifted",".draw")] %>%
    get_stats(by = "day_shifted") %>%
    setnames("mean","CP") %>%
    .[, ID := 0]
  save(CP_by_key_days_draws,CP_by_day,file = "pdata/CP_by_day.Rdata")
}


plot_CP_by_day = 
  plot_by_day(CP_by_day,
              VLCP_by_dayID %>% as.data.frame(), y.var = "CP", ylim = c(-.005,1), xlim = c(-5,25)) +
  ylab("Culture probability")  + 
  gg_text_size()
tmp = gc(verbose = F)
plot_CP_by_day
Estimated culture probability over time. Blue lines are posterior expectations (means) for subjects. The red line and shaded area are sample average and its 90% credible interval.

Figure 3.30: Estimated culture probability over time. Blue lines are posterior expectations (means) for subjects. The red line and shaded area are sample average and its 90% credible interval.

We estimate a peak viral load of 8.1 (8.0, 8.3) at around 4.3 (4.0, 4.6) days after the estimated start of viral shedding. Given the non-linear association between viral load and culture infectivity, the estimated culture infectivity shows greater changes over time, with a peak of 0.74 (0.61, 0.85) (between participation standard deviation: 0.16 (0.12, 0.21)), which declines to 0.52 (0.40, 0.64) 5 days later, and 0.29 (0.19, 0.40) 10 days after peak viral load.

3.3.8 Conditional effects of age on model parameters

We used a spline model to estimate the association between age and peak viral load as well as viral loads slopes.

param_levels = 
  c("Growth gradient","Peak viral load","Decay gradient")

if (file.exists("pdata/TC_ConditionalEffectAge.Rdata")) {
  load("pdata/TC_ConditionalEffectAge.Rdata")
} else {
  px = 
    lapply(c("intercept","slope_up","slope_down"), 
           function(var, predictor = "Age") {
             cond_eff_contPGH(var,predictor)
           }
    ) %>% 
    do.call(rbind,.) %>% 
    .[, parameter := gsub("intercept","Peak viral load", parameter)] %>% 
    .[, parameter := gsub("slope_up","Growth gradient", parameter)] %>% 
    .[, parameter := gsub("slope_down","Decay gradient", parameter)] %>% 
    .[, parameter := factor(parameter, levels = param_levels)] %>% 
    get_stats(by = c("Age","parameter"))
  
  mus = 
    do.call(rbind,
            lapply(c("intercept","slope_up","slope_down"),
                   function(p) 
                     data.table(value = draws %>% 
                                  subset_draws(p) %>% 
                                  as_draws_matrix() %>% 
                                  as.vector(),
                                parameter = p))) %>% 
    .[, parameter := gsub("intercept","Peak viral load", parameter)] %>% 
    .[, parameter := gsub("slope_up","Growth gradient", parameter)] %>% 
    .[, parameter := gsub("slope_down","Decay gradient", parameter)] %>% 
    .[, parameter := factor(parameter, levels = param_levels)] 
  
  save(px,mus,file = "pdata/TC_ConditionalEffectAge.Rdata")
}


sub_plots = vector(length = 3,mode = "list")
names(sub_plots) = c("Growth gradient", "Peak viral load","Decay gradient")
k = 0
for (p in names(sub_plots)) {
  k = k+1
  lims = range(
    quantile(mus[parameter == p,value],c(.001,.999)),
    range(px[parameter == p, c("upper95","lower95"),with = F])
  )
  
  p1 =
    px[parameter == p] %>% 
    ggplot(aes(x = Age, y = mean)) + 
    conf_ribbon(px,fill = "black") + 
    geom_line() +
    coord_cartesian(xlim = c(-1,101)) + 
    ylab(p) + 
    gg_expand() + 
    gg_text_size() + 
    coord_cartesian(ylim = lims) + 
    theme(plot.margin = margin(-10,-5,-5,0, unit = "pt")) + 
    gg_add_grid() + 
    ggtitle(LETTERS[k])
  
  if (p != "Decay gradient")
    p1 = p1 + xlab("")

  
  p2 = 
    mus[parameter == p] %>% 
    ggplot(aes(x = value)) + 
    geom_histogram(fill = "blue",alpha = .5, bins = 30) +
    gg_expand() + 
    coord_flip(xlim = lims * c(.999,1.001)) + 
    ylab("") + 
    gg_text_size() +
    gg_add_grid("y") +
    theme_marginal()
  
  sub_plots[[p]] = (p1 | p2) + plot_layout(widths = c(3,1))
}

sub_plots[[3]][[2]] = 
  sub_plots[[3]][[2]] + 
  theme(axis.title.x = element_text()) + ylab("N")

Age_hist = 
  day_data[day == 0, .(Age)] %>% 
  ggplot(aes(x = Age)) + 
  geom_histogram(breaks = seq(0:100), alpha = .25) + 
  gg_expand() + theme_marginal()

layout = c(
  patchwork::area(t = 1,l = 1,b = 20, r = 20),
  patchwork::area(t = 1,l = 21,b = 20, r = 27),
  patchwork::area(t = 21,l = 1,b = 40, r = 20),
  patchwork::area(t = 21,l = 21,b = 40, r = 27),
  patchwork::area(t = 41,l = 1,b = 60, r = 20),
  patchwork::area(t = 41,l = 21,b = 60, r = 27),
  patchwork::area(t = 55,l = 1,b = 60, r = 20)
)

sub_plots[[1]][[1]] = sub_plots[[1]][[1]] + ylab(expression(Slope~of~log[10]~load~increase))
sub_plots[[2]][[1]] = sub_plots[[2]][[1]] + ylab(expression(Peak~log[10]~viral~load))
sub_plots[[3]][[1]] = sub_plots[[3]][[1]] + ylab(expression(Slope~of~log[10]~load~decline))

p_par_by_age = 
  sub_plots[[1]][[1]]  + sub_plots[[1]][[2]] + 
  sub_plots[[2]][[1]] + sub_plots[[2]][[2]] +   
  sub_plots[[3]][[1]] + sub_plots[[3]][[2]] + 
  Age_hist +
  plot_layout(design = layout) + 
  theme(plot.margin = margin(-10,-5,-5,0, unit = "pt"))

ggsave(p_par_by_age,
       file = "figures/S10_TC_params_by_age.png",
       width = 20, height = 30, units = "cm", dpi = 300)
ggsave(p_par_by_age,
       file = "figures/S10_TC_params_by_age.pdf",
       width = 20, height = 30, units = "cm")
p_par_by_age
Associations between age and subject-level model parameters. The left column shows conditional effects, that is the associations with age after marginalizing out variability due to clinical status and gender. Ribbons show 90% credible intervals. The histograms in the right column show posterior distributions for the average parameter estimate for the sample. The gray inset at the bottom of panel C shows the age distribution of the sample.

Figure 3.31: Associations between age and subject-level model parameters. The left column shows conditional effects, that is the associations with age after marginalizing out variability due to clinical status and gender. Ribbons show 90% credible intervals. The histograms in the right column show posterior distributions for the average parameter estimate for the sample. The gray inset at the bottom of panel C shows the age distribution of the sample.

rm(p_par_by_age,sub_plots,mus,p2)
tmp = gc()

3.3.9 Posterior predictions of peak viral load and culture probability by age

Using a similar approach as above for associations of viral load and culture probability by age, we now calculate and show highest-density regions for viral load and culture probability by day since peak viral load. One difference to the calculation by age is that we use posterior expectations and not posterior predictions for this analysis. The reason for this choice is that the time course model already increases considerable intra-individual variance in the viral loads over time, and that the error variance for the time course model is much lower than for the age-model.

if (file.exists("pdata/pp_peak_load_by_age_group.Rdata")) {
  load("pdata/pp_peak_load_by_age_group.Rdata")
} else {
  sigma = 
    draws %>% 
    subset_draws("sigma_mu") %>%
    as_draws_dt() %>%
    .[, sd := exp(sigma_mu)] %>%
    setkeyv(".draw")
  
  r = 
    data.table(.draw = 1:4000,
               r = rnorm(4000)) %>% 
    setkeyv(".draw") %>% 
    .[, r := r-mean(r)] %>% 
    .[, r := r/sd(r)] %>% 
    .[sigma, epsilon := sd*r]
  
  peak_loads = c()
  for (g in c("Other","Hospitalized","PAMS")) {
    setPGH = c(PAMS1 = 0, Gender = .5, Hospitalized = 0, B117 = 0)
    if (g == "Hospitalized")
      setPGH["Hospitalized"] = 1
    if (g == "PAMS")
      setPGH["PAMS1"] = 1
    
    peak_load =  
      cond_eff_contPGH("intercept", adjust = T, round_Age = 0, setPGH = setPGH) %>% 
      .[, .draw := as.integer(.draw)] %>% 
      setkeyv(".draw") %>% 
      .[, parameter := NULL] %>% 
      .[, Group := g]
    
    peak_load %>% 
      .[r, log10Load := value + epsilon] %>% 
      .[r, epsilon := epsilon] %>% 
      .[sigma, p9 := 1-pnorm(9,value,sd)] %>% 
      .[, value := NULL]
    
    peak_loads = 
      rbind(peak_loads,
            peak_load)
    
  }
  CPpars = draws %>% 
    subset_draws(c("alpha_CP","beta_CP[1]")) %>% 
    as_draws_dt() %>% 
    setnames("beta_CP[1]","beta_CP") %>% 
    setkeyv(".draw") 

  peak_loads %>% 
    setkeyv(".draw") %>% 
    .[CPpars, CP := inv.logit(alpha_CP + beta_CP*log10Load)]
  
  peak_load_hdi =
    peak_loads[,
               as.list(fast.hdi(log10Load,posterior.dist = "norm")),
               by = .(Age,Group)] %>% 
    .[, Group := factor(Group, levels = c("PAMS","Hospitalized","Other"))]
  
  peak_pload9_hdi =
    peak_loads[,
               as.list(fast.hdi(p9,posterior.dist = "beta")),
               by = .(Age,Group)] %>% 
    .[, Group := factor(Group, levels = c("PAMS","Hospitalized","Other"))]
  
  peak_CP_hdi =
    peak_loads[,
               as.list(fast.hdi(CP)),
               by = .(Age,Group)] %>% 
    .[, Group := factor(Group, levels = c("PAMS","Hospitalized","Other"))]
  
  save(peak_load_hdi,peak_pload9_hdi,peak_CP_hdi,file = "pdata/pp_peak_load_by_age_group.Rdata")
}
 
p1 =
  peak_load_hdi %>%
  ggplot(aes(x = Age, y = mean, color = Group)) +
  geom_line() +
  conf_linerange(peak_load_hdi, color = "Group", size = .770)   +
  facet_wrap(~Group) +
  red_blue(c(2,3,1)) +
  theme(legend.position = "none") +
  ylab(expression(Peak~log[10]~viral~load)) + 
  coord_cartesian(ylim = c(4.5,11)) + 
  geom_hline(yintercept = 9, col = "white", lty = 3) + 
  theme(axis.line.x = element_blank(), axis.text.x = element_blank(), axis.title.x = element_blank()) + 
  gg_expand(x1 = .025, x2 = .025)

p2 =
  peak_pload9_hdi %>%
  ggplot(aes(x = Age, y = mean, color = Group)) +
  geom_line() +
  conf_linerange(peak_pload9_hdi, color = "Group", size = .770)   +
  facet_wrap(~Group) +
  red_blue(c(2,3,1)) +
  theme(legend.position = "none") +
  ylab(expression(Prop.~peak~log[10]~viral~load~larger~9)) + 
  coord_cartesian(ylim = c(0,.5)) + 
  theme(axis.line.x = element_blank(), axis.text.x = element_blank(), axis.title.x = element_blank(), strip.text.x = element_blank()) + 
  gg_expand(x1 = .025, x2 = .025)

p3 = 
  peak_CP_hdi %>% 
  ggplot(aes(x = Age, y = mean, color = Group)) + 
  geom_line() +
  conf_linerange(peak_CP_hdi, color = "Group", size = .770)   + 
  facet_wrap(~Group) + 
  red_blue(c(2,3,1)) + 
  theme(legend.position = "none") + 
  ylab("Culture probability") + 
  theme(strip.text.x = element_blank()) + 
  gg_expand(x1 = .025, x2 = .025)

px = p1 / p2 / p3
ggsave(px,file = "figures/S6_HDI_posterior_predict_peak.png", width = 20, height = 21, units = "cm",dpi = 600)
ggsave(px,file = "figures/S6_HDI_posterior_predict_peak.pdf", width = 20, height = 21, units = "cm")

px
Highest posterior density region (HDR) for peak viral load and culture infectiousness over time by clinical status and age. Top row: 90% HDR for estimated peak viral load. The solid line in each plot indicates the mean. Middle row: 90% HDR for proportion of subjects with a peak log10 viral load higher than 9. Bottom row: 90% HDR for estimated peak culture infectivity. The HDR for the PAMS and Other groups shows a bimodal distribution of peak viral load for some age groups. For PAMS subjects younger than ~40 years, the results suggest that the majority of participants have a lower culture probability, whereas a minority of subjects have a high culture probability. For subjects aged around 45 years old, less than 10% have the estimated mean culture infectiousness.

Figure 3.32: Highest posterior density region (HDR) for peak viral load and culture infectiousness over time by clinical status and age. Top row: 90% HDR for estimated peak viral load. The solid line in each plot indicates the mean. Middle row: 90% HDR for proportion of subjects with a peak log10 viral load higher than 9. Bottom row: 90% HDR for estimated peak culture infectivity. The HDR for the PAMS and Other groups shows a bimodal distribution of peak viral load for some age groups. For PAMS subjects younger than ~40 years, the results suggest that the majority of participants have a lower culture probability, whereas a minority of subjects have a high culture probability. For subjects aged around 45 years old, less than 10% have the estimated mean culture infectiousness.

3.3.10 Viral load and culture probability over time by group

The following plots show viral load and culture probability over time stratified by various grouping variables. All results are generated from posterior predictions from the same model.

grp.dt = unique(day_data[, .(ID,ld_centre)])
grp.var = grep("ID",names(grp.dt), value = T, invert = T)

if (file.exists("pdata/VLbyDayPrimaryCentre.Rdata")) {
  load("pdata/VLbyDayPrimaryCentre.Rdata")
} else {
  by_day_grp = 
    do.call(rbind,
            lapply(unique(grp.dt[[grp.var]]), 
                   function(x) {
                     idx = grp.dt[get(grp.var) == x,ID]
                     VLCP_by_day_draws[ID %in% idx,
                                       list(log10Load = collapse::fmean(log10Load)),
                                       by = .(.draw,day_shifted)] %>% 
                       melt(id.vars = c(".draw", "day_shifted")) %>% 
                       get_stats(by = c("variable","day_shifted")) %>% 
                       .[,(grp.var) := x]
                   })
    )
  
  mx = 
    by_day_grp %>% 
    .[, .(mean = mean(mean)), by = .(day_shifted,variable)]
  
  new_levels = by_day_grp[day_shifted == 0,][order(-mean),ld_centre]
  by_day_grp %>% 
    .[,ld_centre := factor(as.character(ld_centre),
                           levels = new_levels,ordered = T)]
  save(by_day_grp,mx,new_levels,file = "pdata/VLbyDayPrimaryCentre.Rdata")
}


N_table = 
  grp.dt[,.(N = .N),by = grp.var] %>%
  .[ N > 0] %>%
  .[, ld_centre := factor(ld_centre, levels = new_levels, ordered = T)] %>% 
  .[, label := paste0(ld_centre," (",N,")")] %>% 
  .[order(ld_centre)]

p_VL_by_ld_centre = 
  by_day_grp %>% 
  ggplot(aes(x = day_shifted,y = mean)) + 
  geom_line(aes(color = ld_centre)) + 
  conf_ribbon(by_day_grp, fill = "ld_centre") +
  geom_line(data = mx, lty = 2, col = "white") + 
  coord_cartesian(xlim = c(-5,25),ylim = c(0,9)) + 
  guides(fill=guide_legend(ncol=2),color=guide_legend(ncol=2)) + 
  ylab(expression(poportion~log[10]~viral~load)) + 
  xlab("Days from peak viral load") + 
  scale_color_ordinal(name = "Primary centre category", labels = N_table$label) + 
  scale_fill_ordinal(name = "Primary centre category", labels = N_table$label)

ggsave(p_VL_by_ld_centre, filename = "figures/S11_VL_by_ld_centre.png",width = 15, height = 10, units = "cm", dpi = 600)
ggsave(p_VL_by_ld_centre, filename = "figures/S11_VL_by_ld_centre.pdf",width = 15, height = 10, units = "cm")
p_VL_by_ld_centre
Viral load time course stratified by primary test centre category. The primary test centre category is the one with the longest duration between consecutive tests from the same test centre category. If there were categories with equal duration, a subject was assigned to the most severe one, i.e., in the order hospitalized, other, PAMS. If assignment to a unique category was not possible, a subject was assigned to a rest-category X. Hospitalized patients (e.g., WD, IDW, ICU) had the highest peak load,  followed by patients from the Other category (ED, OD, CP). Patients with COVID-19 test centres (C19) as primary test centres had the lowest peak viral load.

Figure 3.33: Viral load time course stratified by primary test centre category. The primary test centre category is the one with the longest duration between consecutive tests from the same test centre category. If there were categories with equal duration, a subject was assigned to the most severe one, i.e., in the order hospitalized, other, PAMS. If assignment to a unique category was not possible, a subject was assigned to a rest-category X. Hospitalized patients (e.g., WD, IDW, ICU) had the highest peak load, followed by patients from the Other category (ED, OD, CP). Patients with COVID-19 test centres (C19) as primary test centres had the lowest peak viral load.

day_data[, Age_group := cut(Age,
                          breaks = c(seq(0,20,5),45,55,65,101),
                          ordered_result = T,
                          labels = c("0-5","5-10","10-15","15-20","20-45","45-55","55-65","65+"))]

grp.dt = unique(day_data[,.(ID,Age_group)])
my_comparisons = 
  combn(levels(day_data$Age_group),2)
my_comparisons = my_comparisons[,grepl("45-55",my_comparisons[2,]) | 
                                  grepl("45-55",my_comparisons[1,])]
my_comparisons[,(ncol(my_comparisons)-1):ncol(my_comparisons)] = 
  my_comparisons[c(2,1),(ncol(my_comparisons)-1):ncol(my_comparisons)]
if (file.exists("pdata/TC_by_age_grp2.Rdata")) {
  load("pdata/TC_by_age_grp2.Rdata")
} else {
  TC_by_age_grp2 = 
  plot_by_day_grp_delta(VLCP_by_day_draws,
                      grp.dt = unique(day_data[,.(ID,Age_group)]),
                      comparisons = my_comparisons)
  TC_by_age_grp2_peak_VLCB_tbl = 
  merge(
  TC_by_age_grp2[[1]]$data %>%
  .[day_shifted == 0] %>%
  .[,tbl := paste0(round(mean,2),
                " (", round(lower90,2),
                ", ", round(upper90,2),")")] %>%
  .[,.(Age_group,tbl)],
  TC_by_age_grp2[[2]]$data %>%
  .[day_shifted == 0] %>%
  .[,tbl := paste0(round(mean,2),
                " (", round(lower90,2),
                ", ", round(upper90,2),")")] %>%
  .[,.(Age_group,tbl)],
by = "Age_group") %>% 
  setnames(c("tbl.x","tbl.y"),c("Viral load","Culture probability"))

CP_RR_tbl = 
  plot_delta_grps(VLCP_by_day_draws,
                  grp.dt = grp.dt,
                  y.var = "CP",
                  stat = "RR",
                  comparisons = my_comparisons,
                  plot =  F) %>%
  .[,tbl := paste0(round(mean,2),
                   " (", round(lower90,2),
                   ",", round(upper90,2),")")] %>%
  .[,.(comparison,tbl)] %>% 
  .[, Age_group := tstrsplit(comparison,"/")[[1]]] %>% 
  .[, .(Age_group,tbl)] %>% 
  setnames("tbl","CP_RR") 

CP_RD_tbl = 
  plot_delta_grps(VLCP_by_day_draws,
                  grp.dt = grp.dt,
                  y.var = "CP",
                  comparisons = my_comparisons,
                  plot =  F) %>%
  .[,tbl := paste0(round(mean,2),
                   " (", round(lower90,2),
                   ",", round(upper90,2),")")] %>%
  .[,.(comparison,tbl)] %>% 
  .[, Age_group := tstrsplit(comparison,"–")[[1]]] %>% 
  .[, .(Age_group,tbl)] %>% 
  setnames("tbl","CP diff") 

VL_diff_tbl = 
  plot_delta_grps(VLCP_by_day_draws,
                  grp.dt = grp.dt,
                  comparisons = my_comparisons,
                  plot =  F) %>%
  .[,tbl := paste0(round(mean,2),
                   " (", round(lower90,2),
                   ",", round(upper90,2),")")] %>%
  .[,.(comparison,tbl)] %>% 
  .[, Age_group := tstrsplit(comparison,"–")[[1]]] %>% 
  .[, .(Age_group,tbl)] %>% 
  setnames("tbl","VL diff")
  save(TC_by_age_grp2,CP_RR_tbl,CP_RD_tbl,VL_diff_tbl,TC_by_age_grp2_peak_VLCB_tbl,file = "pdata/TC_by_age_grp2.Rdata")
}

 
a_lvls = c("0-5","5-10","10-15","15-20","20-45","45-55","55-65","65+")
TC_by_age_grp2_peak_VLCB_tbl =
  TC_by_age_grp2_peak_VLCB_tbl %>% 
  merge(VL_diff_tbl[, Age_group := gsub("–45-55","",Age_group)], by = "Age_group", all.x = T) %>% 
  merge(CP_RD_tbl[, Age_group := gsub("–45-55","",Age_group)], by = "Age_group", all.x = T) %>% 
  merge(CP_RR_tbl[, Age_group := gsub("–45-55","",Age_group)], by = "Age_group", all.x = T) %>% 
  .[, Age_group := factor(Age_group, levels = a_lvls)] %>% 
  .[order(Age_group)]

TC_by_age_grp2_peak_VLCB_ftbl = 
  flextable(TC_by_age_grp2_peak_VLCB_tbl) %>%
  autofit() %>%
  set_caption(caption = "Peak viral load and culture probability by age group. All differences and risk ratios are in comparison to the age group 45-45.")


save_as_docx(TC_by_age_grp2_peak_VLCB_ftbl,path = "figures/Table_S5.docx")

table_doc <- table_doc %>%
  body_add_flextable(TC_by_age_grp2_peak_VLCB_ftbl) %>%
  body_end_section_landscape() 

TC_by_age_grp2
Viral load and culture positvity time courses by age group and differences in culture probability.

Figure 3.34: Viral load and culture positvity time courses by age group and differences in culture probability.

rm(VL_diff_tbl,CP_RD_tbl,CP_RR_tbl)
tmp = gc(verbose = F)
grp.dt.Group = 
  unique(day_data[,.(ID,PAMS1,Hospitalized)]) %>% 
  .[, Group := ifelse(PAMS1 == 1,
                      "PAMS",
                      ifelse(Hospitalized == 1,
                             "Hospitalized","Other"))] %>% 
  .[, Group := factor(Group,levels = c("Hospitalized","PAMS","Other"))] %>% 
  .[, PAMS1 := NULL] %>% 
  .[, Hospitalized := NULL]

if (file.exists("pdata/TC_by_PAMSHosp.Rdata")) {
  load("pdata/TC_by_PAMSHosp.Rdata")
} else {
  TC_by_PAMSHosp = 
  plot_by_day_grp_delta(VLCP_by_day_draws,
                      grp.dt = grp.dt.Group)
  TC_by_PAMSHosp[[1]] = TC_by_PAMSHosp[[1]] + red_blue(c(3,2,1))
  TC_by_PAMSHosp[[2]] = TC_by_PAMSHosp[[2]] + red_blue(c(3,2,1))
  save(TC_by_PAMSHosp,file = "pdata/TC_by_PAMSHosp.Rdata")
}

TC_by_PAMSHosp
Viral load and culture positvity time courses by clinical status at the first positive test and hospitalization

Figure 3.35: Viral load and culture positvity time courses by clinical status at the first positive test and hospitalization

p_PAMS_Hosp = (TC_by_PAMSHosp[[1]] + theme(legend.title = element_blank(), legend.position = c(.7, .9))) | (TC_by_PAMSHosp[[2]] + theme(legend.position = "none"))
ggsave(p_PAMS_Hosp,filename = "figures/S9_TC_PAMS_Hosp.png", width = 20, height = 10, units = "cm")
ggsave(p_PAMS_Hosp,filename = "figures/S9_TC_PAMS_Hosp.pdf", width = 20, height = 10, units = "cm")
rm(TC_by_PAMSHosp,p_PAMS_Hosp)
day_data %>% 
  .[, Age_group2 := cut(Age,
                          breaks = c(0,20,45,55,101),
                          ordered_result = T,
                          labels = c("0-20","20-45","45-55","55+"))] %>% 
  .[, Group := ifelse(PAMS1 == 1,
                      "PAMS",
                      ifelse(Hospitalized == 1,
                             "Hospitalized","Other"))] %>% 
  .[, Group := factor(Group,levels = c("Hospitalized","PAMS","Other"))] %>% 
  .[, Group_Age := paste0(Group,"_",Age_group2)]

grp.dt = unique(day_data[,.(ID,Group_Age)])
grp.var = "Group_Age"

if (file.exists("pdata/Group_Age.Rdata")) {
  load("pdata/Group_Age.Rdata")
} else {
  by_day_grp = 
    do.call(rbind,
            lapply(unique(grp.dt[[grp.var]]), 
                   function(x) {
                     idx = grp.dt[get(grp.var) == x,ID]
                     VLCP_by_day_draws[ID %in% idx,
                                  list(log10Load = collapse::fmean(log10Load)),
                                  by = .(.draw,day_shifted)] %>% 
                       melt(id.vars = c(".draw", "day_shifted")) %>% 
                       get_stats(by = c("variable","day_shifted")) %>% 
                       .[,(grp.var) := x]
                   })
    )
    by_day_grp %>% 
      .[, c("Group", "Age") := tstrsplit(Group_Age,"_")] %>% 
      .[, Age := ordered(Age,levels = c("0-20","20-45","45-55","55+"))]
    save(by_day_grp, file = "pdata/Group_Age.Rdata")
}


p_Group_Age = 
  by_day_grp %>% 
  ggplot(aes(x = day_shifted, y = mean, color = Group)) + 
  geom_line() + 
  conf_ribbon(by_day_grp, fill = "Group") + 
  facet_wrap(~Age) + 
  coord_cartesian(xlim = c(-5,25), ylim = c(0,9)) +
  theme(legend.position = c(.25,.6)) + 
  ylab(expression(log[10]~viral~load)) +
  xlab("Days from peak viral load") + 
  red_blue(c(3,1,2))

p_Group_Age1 = 
  by_day_grp %>% 
  ggplot(aes(x = day_shifted, y = mean, color = Group)) + 
  geom_line() + 
  conf_ribbon(by_day_grp, fill = "Group") + 
  facet_wrap(~Age, nrow = 1) + 
  coord_cartesian(xlim = c(-5,25), ylim = c(0,9)) +
  theme(legend.position = c(.125,.3)) + 
  ylab(expression(log[10]~viral~load)) +
  xlab("Days from peak viral load") + 
  red_blue(c(3,1,2)) + 
  theme(axis.title.x = element_blank())

p_Age_Group2 = 
  by_day_grp %>% 
  ggplot(aes(x = day_shifted, y = mean, color = Age)) + 
  geom_line() + 
  conf_ribbon(by_day_grp, fill = "Age") + 
  facet_wrap(~Group,ncol = 4) + 
  coord_cartesian(xlim = c(-5,25), ylim = c(0,9)) +
  theme(legend.position = c(.125,.4)) + 
  ylab(expression(log[10]~viral~load)) +
  xlab("Days from peak viral load") 

p_Age_Group = p_Group_Age1 / p_Age_Group2

p_Group_Age
Viral load time courses by clinical status and age group.

Figure 3.36: Viral load time courses by clinical status and age group.

ggsave(p_Group_Age,filename = "figures/S13_Group_Age.png", width = 20, height = 20, units = "cm")
ggsave(p_Age_Group,filename = "figures/S13b_Group_Age.png", width = 20, height = 15, units = "cm")
ggsave(p_Group_Age,filename = "figures/S13_Group_Age.pdf", width = 20, height = 20, units = "cm")
ggsave(p_Age_Group,filename = "figures/S13b_Group_Age.pdf", width = 20, height = 15, units = "cm")
day_data %>% 
  .[, PAMSf := factor(PAMS1,labels = c("non-PAMS","PAMS"))] %>% 
  .[, B117f := factor(B117,labels = c("non-B.1.1.7","B.1.1.7"))] %>% 
  .[, B117_PAMS := paste0(PAMSf,",",B117f)]

if (file.exists("pdata/TC_by_B117PAMS.Rdata")) {
  load("pdata/TC_by_B117PAMS.Rdata")
} else {
  TC_by_B117PAMS = 
  plot_by_day_grp_delta(VLCP_by_day_draws,
                      grp.dt = unique(day_data[,.(ID,B117_PAMS)]))
  save(TC_by_B117PAMS,file = "pdata/TC_by_B117PAMS.Rdata")
}

TC_by_B117PAMS
Viral load and culture probability time courses by by B.1.1.7 and clinical status at the first positive test

Figure 3.37: Viral load and culture probability time courses by by B.1.1.7 and clinical status at the first positive test

tmp = gc(verbose = F)
if (file.exists("pdata/TC_by_B117.Rdata")) {
  load("pdata/TC_by_B117.Rdata")
} else {
  TC_by_B117 = 
  plot_by_day_grp_delta(VLCP_by_day_draws,
                      grp.dt = unique(day_data[,.(ID,B117f)]))
  save(TC_by_B117,file = "pdata/TC_by_B117.Rdata")
}

TC_by_B117
Viral load and culture positvity time courses by by B.1.1.7 status

Figure 3.38: Viral load and culture positvity time courses by by B.1.1.7 status

tmp = gc(verbose = F)
day_data[, Male := if_else(Gender == 0,1,0)][is.na(Male), Male := 0][, Male := factor(Male)]

if (file.exists("pdata/TC_by_Gender.Rdata")) {
  load("pdata/TC_by_Gender.Rdata")
} else {
  TC_by_Gender = 
    plot_by_day_grp_delta(VLCP_by_day_draws,
                          grp.dt = unique(day_data[,.(ID,Male)]))
  save(TC_by_Gender,file = "pdata/TC_by_Gender.Rdata")
}


TC_by_Gender
Viral load and culture positivity time courses by gender and differences in culture probability

Figure 3.39: Viral load and culture positivity time courses by gender and differences in culture probability

tmp = gc(verbose = F)

3.3.11 Viral load and culture probability at -2, 0, 5, and 10 days from peak viral load

VLCP_by_keydays_draws = 
  VLCP_by_day_draws %>% 
  .[day_shifted %in% c(-2,0,5,10)] %>% 
  .[unique(day_data[, .(ID,Age_group)]),Age_group := Age_group]

plot_VL_key_days_by_Age = 
  plot_key_days_by_age(VLCP_by_keydays_draws,
                       target.var = "log10Load",
                       var = "log10Load") + 
  ylab(expression(log[10]~viral~load)) + 
  coord_cartesian(ylim = c(0,10)) + 
  scale_y_continuous(expand = expansion(.01,0))
plot_VL_key_days_by_Age 
Estimated viral load at key time points by age group. The shaded lines cover the 90% credible interval.

Figure 3.40: Estimated viral load at key time points by age group. The shaded lines cover the 90% credible interval.

tmp = gc(verbose = F)

3.3.12 Age differences in peak viral load and culture probability

Figure 3.41 shows comparisons of all age groups with the age group 45-55 for the day of peak viral load.

if (file.exists("pdata/peak_load_by_age_draws.Rdata")) {
  load("pdata/peak_load_by_age_draws.Rdata")
} else {
  CPpars = draws %>% 
    subset_draws(c("alpha_CP","beta_CP")) %>% 
    as_draws_dt() %>% 
    setkeyv(".draw") 
  peak_load_by_age_draws_full = 
    do.call(rbind,
            lapply(c(T,F), 
                   function(adj) 
                     cond_eff_contPGH("intercept", round_Age = 2,adjust = adj) %>%
                     .[, parameter := NULL] %>% 
                     .[, .draw := as.integer(.draw)] %>% 
                     setkeyv(".draw") %>% 
                     setnames("value","VL") %>% 
                     .[CPpars, CP := inv.logit(alpha_CP + VL * `beta_CP[1]`)] %>% 
                     .[ , Adjusted := adj]
            )
    )
  
  peak_load_by_age_draws_full %>% 
    .[, Age_group := cut(Age,
                         breaks = c(seq(0,20,5),45,55,65,101),
                         ordered_result = T,
                         labels = c("0-5","5-10","10-15","15-20","20-45","45-55","55-65","65+"))] %>% 
    .[, Age := round(Age)]
  
  peak_load_by_age_draws = 
    peak_load_by_age_draws_full %>% 
    .[,.(VL = mean(VL), CP = mean(CP)), by = .(.draw,Age,Adjusted)]
  
  peak_load_by_age_group_draws = 
    peak_load_by_age_draws_full %>% 
    .[,.(VL = mean(VL), CP = mean(CP)), by = .(.draw,Age_group,Adjusted)]
  
  rm(peak_load_by_age_draws_full,draws,CPpars)
  
  save(peak_load_by_age_draws,peak_load_by_age_group_draws,file = "pdata/peak_load_by_age_draws.Rdata")
}



comp = peak_load_by_age_draws[Age > 44 & Age < 55 , .(m = mean(VL)), by = .(adjusted,.draw)]
setkeyv(comp,c(".draw","adjusted"))
setkeyv(peak_load_by_age_draws,c(".draw","adjusted"))
pdata = 
  peak_load_by_age_draws %>% 
  .[comp, delta := VL - m] %>% 
  get_stats(var = "delta",by = c("Age","adjusted")) %>% 
  .[,Adjustment := ifelse(adjusted == T,"Yes","No")]
delta_grand_mean = 
  peak_load_by_age_draws[,.(m = mean(VL)), by = .(adjusted)] %>% 
  merge(comp[,.(m = mean(m)), by = .(adjusted)], by = "adjusted") %>% 
  .[, delta := m.x - m.y] %>% 
  .[,Adjustment := ifelse(adjusted == T,"Yes","No")]
p_deltaVL_peak = 
  pdata %>% 
  ggplot(aes(x = Age, y = mean, color = Adjustment, fill = Adjustment, group = Adjustment)) +
  scale_fill_discrete_diverging("Blue-Red 2") +
  scale_color_discrete_diverging("Blue-Red 2") +
  geom_line() +
  conf_ribbon(pdata,fill = "Adjustment") + 
  geom_hline(yintercept = 0) + 
  coord_cartesian(ylim = c(-1,1), xlim = c(0,101)) +  
  ylab("Estimated peak viral load difference") + 
  geom_hline(data = delta_grand_mean,
             aes(yintercept = delta, color = Adjustment), lty = 3) + 
  theme(legend.position = c(.15,.9)) + 
  gg_expand() + 
  gg_add_grid()

comp = peak_load_by_age_draws[Age > 44 & Age < 55 , .(m = mean(CP)), by = .(adjusted,.draw)]
setkeyv(comp,c(".draw","adjusted"))
pdata = 
  peak_load_by_age_draws %>% 
  .[comp, delta := CP - m] %>% 
  get_stats(var = "delta",by = c("Age","adjusted")) %>% 
  .[,Adjustment := ifelse(adjusted == T,"Yes","No")]
delta_grand_mean = 
  peak_load_by_age_draws[,.(m = mean(CP)), by = .(adjusted)] %>% 
  merge(comp[,.(m = mean(m)), by = .(adjusted)], by = "adjusted") %>% 
  .[, delta := m.x - m.y] %>% 
  .[,Adjustment := ifelse(adjusted == T,"Yes","No")]
p_deltaCP_peak = 
  pdata %>% 
  ggplot(aes(x = Age, y = mean, color = Adjustment, fill = Adjustment, group = Adjustment)) +
  scale_fill_discrete_diverging("Blue-Red 2") +
  scale_color_discrete_diverging("Blue-Red 2") +
  geom_line() +
  conf_ribbon(pdata,fill = "Adjustment") + 
  geom_hline(yintercept = 0) + 
  coord_cartesian(ylim = c(-.4,.4), xlim = c(0,101)) +  
  ylab("Culture probability difference") + 
  geom_hline(data = delta_grand_mean,
             aes(yintercept = delta, color = Adjustment), lty = 3) + 
  theme(legend.position = c(.15,.9)) + 
  gg_expand() + 
  gg_add_grid()

p_deltaVL_peak | p_deltaCP_peak
Age differences in maximal viral load. Adjusted refers to adjustment for clinical status.

Figure 3.41: Age differences in maximal viral load. Adjusted refers to adjustment for clinical status.

comps = setdiff(levels(peak_load_by_age_group_draws$Age_group),"45-55")
setkeyv(peak_load_by_age_group_draws, c(".draw","adjusted","Age_group"))

peak_VLCP_diffs_tbl = 
  do.call(rbind,
        lapply(comps, 
               function(comp)
                 peak_load_by_age_group_draws %>% 
                 .[Age_group %in% c("45-55",comp)] %>% 
                 .[, .(deltaCP = diff(CP)*ifelse(comp %in% comps[1:5],-1,1), 
                       deltaVL = diff(VL)*ifelse(comp %in% comps[1:5],-1,1)), 
                   by = .(adjusted,.draw)] %>% 
                 .[, .(VL = sprint_stat(deltaVL),CP = sprint_stat(deltaCP,2)), by = .(adjusted)] %>% 
                 .[, `Age group` := comp])
) %>% 
  .[, .(`Age group`,adjusted,VL,CP)] %>% 
  .[, `Age group` := factor(`Age group`,levels = levels(peak_load_by_age_group_draws$Age_group))] %>% 
  setkeyv(c("adjusted","Age group"))


tbl_peak_VLCP_diffs_tbl = 
  kable(peak_VLCP_diffs_tbl,
        format = table_format,
        caption = "Differences in peak viral load between age groups") %>% 
  kable_styling(full_width = F)
tbl_peak_VLCP_diffs_tbl
Table 3.8: Differences in peak viral load between age groups
Age group adjusted VL CP
0-5 FALSE -0.7 (-1.0, -0.3) -0.22 (-0.36, -0.10)
5-10 FALSE -0.6 (-0.9, -0.3) -0.20 (-0.33, -0.09)
10-15 FALSE -0.5 (-0.8, -0.2) -0.17 (-0.29, -0.07)
15-20 FALSE -0.4 (-0.7, -0.2) -0.15 (-0.25, -0.06)
20-45 FALSE -0.3 (-0.4, -0.1) -0.09 (-0.15, -0.03)
55-65 FALSE 0.2 (0.1, 0.3) 0.05 (0.03, 0.08)
65+ FALSE 0.4 (0.3, 0.6) 0.11 (0.07, 0.17)
0-5 TRUE -0.5 (-0.8, -0.3) -0.18 (-0.28, -0.08)
5-10 TRUE -0.5 (-0.7, -0.2) -0.15 (-0.25, -0.07)
10-15 TRUE -0.4 (-0.6, -0.2) -0.13 (-0.21, -0.05)
15-20 TRUE -0.3 (-0.5, -0.2) -0.11 (-0.18, -0.04)
20-45 TRUE -0.2 (-0.3, -0.1) -0.06 (-0.10, -0.03)
55-65 TRUE 0.1 (0.1, 0.2) 0.04 (0.02, 0.07)
65+ TRUE 0.4 (0.3, 0.5) 0.10 (0.06, 0.15)

Here are the age-specific results for results for culture probability.

day_data[, Age_group := cut(Age,
                         breaks = c(seq(0,20,5),45,55,65,101),
                         ordered_result = T,
                         labels = c("0-5","5-10","10-15","15-20","20-45","45-55","55-65","65+"))]

plot_CP_key_days_by_Age = 
  plot_key_days_by_age(VLCP_by_keydays_draws,
                       target.var = "CP",
                       var = "CP") + 
  ylab("Culture probability") + 
  coord_cartesian(ylim = c(0,1)) +
  scale_y_continuous(expand = expansion(.01,0))
plot_CP_key_days_by_Age
Estimated culture probability at key time points by age group. The shaded lines cover the 90% credible interval.

Figure 3.42: Estimated culture probability at key time points by age group. The shaded lines cover the 90% credible interval.

The variation in culture infectiousness within age groups appear to be larger than the variation between age groups.

table_doc %>%
  print( target = "figures/Tables.docx" )

4 Figures

4.1 Figure 1: First positive test - with raw data

The mean first-positive viral load for PAMS and Hospitalised subjects of age 50 are 7.2 (4.0, 9.8) and 6.2 (3.8, 9.5), respectively.

ppc_VL_by_age = 
  ppc_VL_by_age + 
  guides(size = guide_legend(title.position = "left",nrow = 1),
         color = guide_legend(title.position = "left",nrow = 1)) +
  scale_size(range = c(0, 3),guide = guide_legend(nrow = 1),breaks = c(0,25,50,100,250,500)) +
  coord_cartesian(ylim = c(4,8)) +
  gg_add_grid()


F1 = 
  (beesPAMS + ggtitle("A")) /
  (ppc_VL_by_age + ggtitle("B")) / 
  (agehist_by_PAMS  + ggtitle("C")) +
  plot_layout(heights = c(1.5,1,.5))

ggsave(F1,filename = "figures/F1.png", width = 30, height = 25, units = "cm", dpi = 300) 
ggsave(F1,filename = "figures/F1.pdf", width = 30, height = 25, units = "cm") 
#save(F1,file = "figures/F1.Rdata")
F1
Distribution of age and first-positive viral load in PAMS, Hospitalised, and Other subjects. A) Distribution of observed first-positive viral loads for 25,381 subjects according to clinical status (6110 PAMS, 9519 Hospitalised, 9752 Other) and age group. B) Age-viral load association with observed viral loads and confidence intervals as circles (with size indicating subject count) with vertical lines, and model-predicted viral loads and credible intervals as a black roughly-horizontal line with grey shading. C) Overlapping age histograms according to subject clinical status. Because inclusion in the study required a positive RT-PCR test result, and testing is in many cases symptom-dependent, the study may have a proportion of PAMS cases that differs from the proportion in the general population.

Figure 4.1: Distribution of age and first-positive viral load in PAMS, Hospitalised, and Other subjects. A) Distribution of observed first-positive viral loads for 25,381 subjects according to clinical status (6110 PAMS, 9519 Hospitalised, 9752 Other) and age group. B) Age-viral load association with observed viral loads and confidence intervals as circles (with size indicating subject count) with vertical lines, and model-predicted viral loads and credible intervals as a black roughly-horizontal line with grey shading. C) Overlapping age histograms according to subject clinical status. Because inclusion in the study required a positive RT-PCR test result, and testing is in many cases symptom-dependent, the study may have a proportion of PAMS cases that differs from the proportion in the general population.

rm(ppc_VL_by_age,F1,agehist_by_PAMS)
tmp = gc(verbose = F)

4.2 Figure 2: First positive test - age differences

### F2A: load by age + marginal histogram 
## load by age ##
checks = c(mean(bdata$log10Load),0,10)
levels(bfitdata$Group) = Group_levels
my_VL_hist_age =
  ggplot(bdata, aes(x = log10Load, fill = Group)) + 
  red_blue_black() +
  geom_histogram(bins = 20, alpha = .5) + 
  gg_text_size() +
  gg_expand() + 
  theme_marginal() + 
  theme(legend.position = c(.5,.15))  +
  gg_add_grid(axis = "y") +  
  theme(axis.title = element_blank()) +
  coord_flip(xlim= c(2,11)) +
    theme(axis.title.y.left = element_blank(), legend.position = "none") 

p_load_by_age_hist = 
  wrap_plots(
    p_load_by_age + 
      theme(plot.margin = margin(0,-5,0,0, unit = "pt")) + 
      gg_add_grid() +
      coord_cartesian(ylim = c(2,11)) + 
      gg_expand(), 
    my_VL_hist_age, 
    widths = c(1,.3)
  )
p_load_by_age_hist[[1]] = p_load_by_age_hist[[1]] + ggtitle("A")

### F2B: Viral load differences between age groups
p_age_comp_VL = 
  p_age_comp_VL + 
  gg_add_grid("y") + 
  guides(color = guide_legend(ncol = 1, keywidth = .25, keyheight = .25)) + 
  theme(legend.position = my_legend_position) + 
  labs(fill = "Sample") +
  gg_legend_size(ncol = 1) + 
  gg_text_size() +
  coord_cartesian(ylim = c(-3,3)) + 
  ggtitle("B")

### F2C: Culture infectivity by viral load
culture_data[, Outcome := ifelse(culture_positive == 1,
                                 paste0("pos. (",sum(culture_data$culture_positive),")"),
                                 paste0("neg. (",sum(culture_data$culture_positive == 0),")"))]

p_CP_hist = 
  ggplot(culture_data, aes(x = log10Load, fill= Outcome)) + 
  geom_histogram(bins = 20) + 
  coord_cartesian(xlim = c(1,10)) +
  theme(legend.position = c(.0,.8)) +
  gg_legend_size(1) + 
  gg_expand() + 
  guides(fill = guide_legend(ncol = 1,
                             keywidth = .25,
                             keyheight = .25)) + 
  gg_text_size() +
  theme_marginal() + 
  ylab("") +  xlab("")

p_CP_by_load_hist = 
  wrap_plots(
    p_CP_hist,
    p_CP_by_load + 
      gg_add_grid() + 
      theme(plot.margin = margin(-5,0,0,0, unit = "pt")), 
    heights = c(.2,1)
  )
p_CP_by_load_hist[[1]] = p_CP_by_load_hist[[1]] + ggtitle("C")

### F2D: CP by age ###
CPpars =
  CP.fit$fit %>% 
  as_draws() %>% 
  subset_draws(c("b_Intercept","b_log10Load")) %>% 
  as_draws_dt()
CP_alpha = mean(CPpars$b_Intercept)
CP_beta = mean(CPpars$b_log10Load)

h_data_age = rbind(
  data.table(CP = inv.logit(CP_alpha + bfitdata[Group == "PAMS",log10Load]*CP_beta), sample = "PAMS"),
  data.table(CP = inv.logit(CP_alpha + bfitdata[Group == "Hospitalized",log10Load]*CP_beta), sample = "Hospitalized"),
  data.table(CP = inv.logit(CP_alpha + bfitdata[Group == "Other",log10Load]*CP_beta), sample = "Other")
) %>% 
  .[, sample := factor(sample, levels = levels(p_CP_by_age$data$sample))]

lbl_data =
  data.table(sample = unique(h_data_age$sample), 
             CP = .5,
             x = 1500)

checks = c(c(0,.5,1))
my_CP_hist_age =
  ggplot(h_data_age, aes(x = CP, fill = sample)) + 
  geom_text(data = lbl_data,
            aes(x = CP, y = x, label = sample), angle = 270,size = 3) +
  red_blue_black() +
  geom_histogram(bins = 20, alpha = .5) + 
  ylab("")  + 
  gg_text_size() +
  theme_marginal() + 
  gg_expand() + gg_add_grid("y") +
  coord_flip(xlim = c(-.01,1.01)) +
  theme(axis.title.y.left = element_blank(),
        legend.position = c(.6,.5)) +
  gg_legend_size(1) + 
  gg_expand() + 
  facet_grid(sample~.) +
  theme(legend.position = "none", strip.text.y = element_blank())

p_CP_hdi_by_age_hist = 
  wrap_plots(
    p_CP_by_age +
      theme(plot.margin = margin(0,-5,0,0, unit = "pt"),
            legend.position = "none",strip.text.y = element_blank()) + 
      facet_grid(sample~.) +
      gg_expand() +
      gg_add_grid() +
      gg_text_size(),
    my_CP_hist_age +  theme(legend.position = "none"), 
    widths = c(1,.3),
    ncol = 2
  ) 

### F2E: CP by age HDI ###
p_CP_by_Age.hdi = 
  p_CP_by_Age.hdi.no.postproc + 
  no_y_axis +
  scale_x_continuous(breaks = seq(20,80,20)) +
  facet_grid(Group~.) +
  theme(strip.text.y = element_blank()) +
  gg_expand() + 
  gg_add_grid() +
  ggtitle("E") + 
  gg_text_size()

### F2F: CP differences between age groups
p_age_comp_CP = 
  p_age_comp_CP + 
  gg_add_grid("y") + 
  guides(color = guide_legend(ncol = 1, keywidth = .25, keyheight = .25)) + 
  theme(legend.position = my_legend_position) + 
  labs(fill = "Sample") +
  ggtitle("F")  +
  gg_legend_size(ncol = 1) + 
  coord_cartesian(ylim = c(-.5,.5)) +
  gg_text_size()


p_CP_hdi_by_age_hist[[1]] = p_CP_hdi_by_age_hist[[1]] + ggtitle("D")

F2 = 
((p_load_by_age_hist | p_age_comp_VL | p_CP_by_load_hist) + plot_layout(widths = c(1.5,1,1))) /
    ((p_CP_hdi_by_age_hist | p_CP_by_Age.hdi | p_age_comp_CP) + plot_layout(widths = c(1.5,1.1,1))) 

ggsave(F2,filename = "figures/F2.png",
       width = 30, height = 24, units = "cm", dpi = 300,
       device = png_device)
ggsave(F2,filename = "figures/F2.pdf",
       width = 30, height = 24, units = "cm")
#save(F2,file = "figures/F2.Rdata")
F2
Estimated viral load and culture probability at time of first positive RT-PCR test. Shaded regions show 90% credible intervals in all panels. To indicate change within each 90% region, shading decreases in intensity from a narrow 50% credibility interval level to the full 90%. A) Estimated mean viral load in first-positive RT-PCR test according to age and status. The stacked histogram (right) shows the observed viral load distribution. Small age-year to age-year variations in the proportion of subjects groups cause the fluctuations in the estimated viral loads for the total sample. Because the shaded region shows the 90% credible interval for the mean, it does not include the higher values shown in the histogram on the right. B) Differences in estimated first-positive viral load according to age and status. Each coloured line is specific to a particular subset of subjects (PAMS, Hospitalised, Other). The line shows how viral load differs by age for subjects of the corresponding status from that of 50-year old (rounded age) subjects of the same status. The comparison against those of age 50 avoids comparing any subset of the subjects against a value (such as the overall mean) that is computed in part based on that subset, thereby partially comparing data to itself. The mean first-positive viral load for PAMS and Hospitalised subjects of age 50 are 7.2 and 6.2, respectively, allowing relative y-axis differences to be translated to approximate viral loads. C) Estimation of the association between viral load and cell culture isolation success rate based on data from our own laboratory (19) and the study of Perera et al. (20). Viral load differences in the range ~6 to ~9 have a large impact on culture probability, while the impact is negligible for differences outside that range. The vertical lines indicate the observed mean first-positive viral load for different subject groups and the horizontal lines the corresponding expected probability of a positive culture. D) Estimated culture probability at time of first-positive RT-PCR according to age and status, obtained by combining the results in panels A and C. Culture probability is calculated from posterior predictions, that is the posterior means shown in panel A plus error variance. The histogram on the right shows that mean culture probabilities calculated from observed log10 viral load values are not well-matched by credible intervals, which do not include the most-probable estimated culture probabilities. E) Same y-axis as D. Culture probability with highest posterior density regions, which do include the most-probable estimated culture probabilities and match the histograms in D well. F) Differences of estimated expected culture probability at time of first-positive RT-PCR for age groups, with plot elements as described for B.

Figure 4.2: Estimated viral load and culture probability at time of first positive RT-PCR test. Shaded regions show 90% credible intervals in all panels. To indicate change within each 90% region, shading decreases in intensity from a narrow 50% credibility interval level to the full 90%. A) Estimated mean viral load in first-positive RT-PCR test according to age and status. The stacked histogram (right) shows the observed viral load distribution. Small age-year to age-year variations in the proportion of subjects groups cause the fluctuations in the estimated viral loads for the total sample. Because the shaded region shows the 90% credible interval for the mean, it does not include the higher values shown in the histogram on the right. B) Differences in estimated first-positive viral load according to age and status. Each coloured line is specific to a particular subset of subjects (PAMS, Hospitalised, Other). The line shows how viral load differs by age for subjects of the corresponding status from that of 50-year old (rounded age) subjects of the same status. The comparison against those of age 50 avoids comparing any subset of the subjects against a value (such as the overall mean) that is computed in part based on that subset, thereby partially comparing data to itself. The mean first-positive viral load for PAMS and Hospitalised subjects of age 50 are 7.2 and 6.2, respectively, allowing relative y-axis differences to be translated to approximate viral loads. C) Estimation of the association between viral load and cell culture isolation success rate based on data from our own laboratory (19) and the study of Perera et al. (20). Viral load differences in the range ~6 to ~9 have a large impact on culture probability, while the impact is negligible for differences outside that range. The vertical lines indicate the observed mean first-positive viral load for different subject groups and the horizontal lines the corresponding expected probability of a positive culture. D) Estimated culture probability at time of first-positive RT-PCR according to age and status, obtained by combining the results in panels A and C. Culture probability is calculated from posterior predictions, that is the posterior means shown in panel A plus error variance. The histogram on the right shows that mean culture probabilities calculated from observed log10 viral load values are not well-matched by credible intervals, which do not include the most-probable estimated culture probabilities. E) Same y-axis as D. Culture probability with highest posterior density regions, which do include the most-probable estimated culture probabilities and match the histograms in D well. F) Differences of estimated expected culture probability at time of first-positive RT-PCR for age groups, with plot elements as described for B.

tmp = gc(verbose = F)

4.3 Figure 3: B.1.1.7

pB117load[[1]] = pB117load[[1]] + gg_legend_size(ncol = 1) + 
  guides(fill = guide_legend(title=element_blank()),
         color = guide_legend(title=element_blank())) + 
  gg_text_size()
pB117cp[[1]] = pB117cp[[1]] + gg_legend_size(ncol = 1) + 
  guides(fill = guide_legend(title=element_blank()))
F3 = pB117load / pB117cp
F3
Posterior distributions of estimated viral loads and culture probabilities for B.1.1.7 and non-B.1.1.7 subjects, and their differences. The viral loads and estimated culture probability of 1387 B.1.1.7 and 977 non-B.1.1.7 subjects. To select a comparable subset of non-B.1.1.7 viral loads for the comparison, non-B.1.1.7 subjects were included only from test centres that had detected a B.1.1.7 variant as well as at least one non-B.1.1.7 subject, and only if the non-B.1.1.7 infection was detected on the same day as a B.1.1.7 infection was detected, plus or minus one day. Similar differences exist when viral loads from larger, less restrictive, subsets of non-B.1.1.7 subjects are used in the comparison (Materials and Methods, Table S2).  A) Posterior distribution of log10 viral load. B) Difference of average viral load between B.1.1.7 and non-B.1.1.7 cases. C) Posterior distribution of the estimated culture probability. See also Fig. S2. D) Difference of mean culture probability between B.1.1.7 and non-B.1.1.7 cases. Horizontal lines in A, B, and D indicate 90% credible intervals, and the highest posterior density intervals in C.

Figure 4.3: Posterior distributions of estimated viral loads and culture probabilities for B.1.1.7 and non-B.1.1.7 subjects, and their differences. The viral loads and estimated culture probability of 1387 B.1.1.7 and 977 non-B.1.1.7 subjects. To select a comparable subset of non-B.1.1.7 viral loads for the comparison, non-B.1.1.7 subjects were included only from test centres that had detected a B.1.1.7 variant as well as at least one non-B.1.1.7 subject, and only if the non-B.1.1.7 infection was detected on the same day as a B.1.1.7 infection was detected, plus or minus one day. Similar differences exist when viral loads from larger, less restrictive, subsets of non-B.1.1.7 subjects are used in the comparison (Materials and Methods, Table S2). A) Posterior distribution of log10 viral load. B) Difference of average viral load between B.1.1.7 and non-B.1.1.7 cases. C) Posterior distribution of the estimated culture probability. See also Fig. S2. D) Difference of mean culture probability between B.1.1.7 and non-B.1.1.7 cases. Horizontal lines in A, B, and D indicate 90% credible intervals, and the highest posterior density intervals in C.

ggsave(filename = "figures/F3.png",F3, width = 18, height = 12, units = "cm")
ggsave(filename = "figures/F3.pdf",F3, width = 18, height = 12, units = "cm")
#save(F3,file = "figures/F3.Rdata")

4.4 Figure 4: Time course

### F4A: sample for time course test data ###
day_data %>%
  .[, number_tests := cut(N_tests,
                          breaks = c(2.9,3,4,6,9,20),
                          labels = c("3","4","5-6","7-9",">9"),
                          ordered_result = T)]
p_N_time = ggplot(day_data[day == 0], aes(x = Age, fill = number_tests)) +
  geom_histogram(breaks = seq(0,100,5)) + 
  scale_fill_ordinal(name = "Number of tests") +
  theme(legend.position = my_legend_position) +
  gg_text_size() + 
  gg_legend_size(1) + 
  gg_expand() + 
  ylab("Number of subjects")
p_N_time = p_N_time + ggtitle("A")

### F4B: VL time course + histogram
# histogram for time course data
brks = c(-.1,0,.25,.5,.75,.99,1)
lbls = c("0",paste0(c(1,26,51,76),"-",c(25,50,75,99)),"100" )
day_data[, `% tests \nin hosp.` := cut(phosptests,breaks = brks, labels = lbls)]
my_VL_hist_time =
  ggplot(day_data, aes(x = log10Load)) + 
  geom_histogram(bins = 20, fill = "blue", alpha = .5) + 
  coord_cartesian(xlim = c(0,10)) +
  theme(legend.position = c(.7,.9))  +
  ylab("")  + 
  xlab("") +
  coord_flip() +
  gg_text_size() +
  gg_expand() +
  gg_legend_size(1) +
  theme_marginal() + 
  guides(fill = guide_legend(ncol = 1, keywidth = .25, keyheight = .25)) + 
  theme(axis.title.y.left = element_blank())
p_load_by_time_hist = 
  wrap_plots(
    plot_VL_by_day + 
      theme(plot.margin = margin(0,-5,0,0, unit = "pt")) + 
      gg_expand() + 
      gg_add_grid(), 
    my_VL_hist_time , 
    widths = c(1,.3)
  )
p_load_by_time_hist[[1]] = p_load_by_time_hist[[1]] + ggtitle("B") 


### F4C: CP time course + histogram
ddp = day_data[log10Load > 0, c("log10Load","% tests \nin hosp."), with = F]
h_data_time = 
  apply(CPpars[.draw < 500], 1, 
        function(x) 
          inv.logit(x[1] + ddp$log10Load*x[2])) %>%
  data.table() %>%
  .[, `% tests \nin hosp.` := ddp$`% tests \nin hosp.`] %>%  
  .[, TID := 1:nrow(ddp)] %>%
  melt(id.var = c("% tests \nin hosp.","TID"), value.name = "CP") %>%
  .[, .draw := as.numeric(gsub("V","",variable, perl = T))] %>%
  .[, CP := CP*100] %>%
  .[, c("variable",".draw","TID") := NULL]
my_CP_hist_time =
  ggplot(h_data_time, aes(x = CP)) + 
  geom_histogram(bins = 20, fill = "blue", alpha = .5) + 
  coord_cartesian(xlim = c(0,100)) +
  theme(legend.position = c(.6,.9))  +
  gg_text_size() +
  gg_expand() +
  gg_legend_size(1) +
  theme_marginal() + 
  guides(fill = guide_legend(ncol = 1, keywidth = .25, keyheight = .25)) + 
  ylab("")  + 
  coord_flip() +
  theme(axis.title.y.left = element_blank())
p_CP_by_time_hist = 
  wrap_plots(
    plot_CP_by_day + theme(plot.margin = margin(0,-5,0,0, unit = "pt")) + gg_add_grid(), 
    my_CP_hist_time, 
    widths = c(1,.3)
  )
p_CP_by_time_hist[[1]] = p_CP_by_time_hist[[1]]  + ggtitle("C")


F4 = 
  (p_N_time | p_load_by_time_hist | p_CP_by_time_hist) + 
  plot_layout(widths = c(1,1.5,1.5))

ggsave(F4,filename = "figures/F4.png",
       width = 30, height = 10, units = "cm", dpi = 300,
       device = png_device)
ggsave(F4,filename = "figures/F4.pdf",
       width = 30, height = 10, units = "cm")
F4
Viral load and estimated infectious virus shedding time series. Of 25,381 positive subjects, 4344 had three or more RT-PCR test results available and these were used in a viral load time series analysis. Subjects with only one result cannot be placed in time due to inherent ambiguity from the model having both an increasing and a decreasing phase, and those with only two test results were excluded from the time series analysis due to insufficient data for temporal placement (their number of data points is less than number of model parameters being estimated). A) The number of subjects with three or more RT-PCR test results available, at least two of which were positive, according to age. B) Estimated time course of viral load for 18,136 RT-PCR results from the 4344 subjects with at least three RT-PCR results. Blue lines are expected complete time courses for individual cases. The sample mean is shown in red, with its 90% credible interval as a shaded area. The small histogram on the right shows the distribution of all observed viral loads. The histogram values at zero correspond to the initial and trailing negative tests in subject timelines. Raw viral load time series, per subject and split by number of RT-PCR tests, are shown in Fig. S8. C) Estimated time course of positive cell culture probability, calculated by applying the results shown in Fig. 2C to the estimated viral load time courses in B. Blue lines are expected time courses for individual subjects. The sample average is shown in red, with its 90% credible interval as a shaded area. The small histogram to the right shows the distribution of culture probability in the sample, and was obtained by applying the curve in Fig. 2C to the data in the histogram in B.

Figure 4.4: Viral load and estimated infectious virus shedding time series. Of 25,381 positive subjects, 4344 had three or more RT-PCR test results available and these were used in a viral load time series analysis. Subjects with only one result cannot be placed in time due to inherent ambiguity from the model having both an increasing and a decreasing phase, and those with only two test results were excluded from the time series analysis due to insufficient data for temporal placement (their number of data points is less than number of model parameters being estimated). A) The number of subjects with three or more RT-PCR test results available, at least two of which were positive, according to age. B) Estimated time course of viral load for 18,136 RT-PCR results from the 4344 subjects with at least three RT-PCR results. Blue lines are expected complete time courses for individual cases. The sample mean is shown in red, with its 90% credible interval as a shaded area. The small histogram on the right shows the distribution of all observed viral loads. The histogram values at zero correspond to the initial and trailing negative tests in subject timelines. Raw viral load time series, per subject and split by number of RT-PCR tests, are shown in Fig. S8. C) Estimated time course of positive cell culture probability, calculated by applying the results shown in Fig. 2C to the estimated viral load time courses in B. Blue lines are expected time courses for individual subjects. The sample average is shown in red, with its 90% credible interval as a shaded area. The small histogram to the right shows the distribution of culture probability in the sample, and was obtained by applying the curve in Fig. 2C to the data in the histogram in B.

#save(F4,file = "figures/F4.Rdata")
rm(F4,p_load_by_time_hist,p_CP_by_time_hist)
tmp = gc(verbose = F)

4.5 Figure 5: Peak viral load and culture probability by age

VLCP_by_day_draws[, Age_group := NULL]
grp.dt = unique(day_data[,.(ID,Age_group)])

VL_TC = 
  TC_by_age_grp2[[1]] + 
  gg_add_grid() + 
  ggtitle("A") + 
  coord_cartesian(xlim = c(-5,25), ylim = c(0,10)) 

CP_TC = 
  TC_by_age_grp2[[2]] +
  coord_cartesian(xlim = c(-5,25), ylim = c(0,1)) + 
  gg_expand() + gg_text_size() + 
  theme(legend.position = "none") + 
  gg_add_grid() + ggtitle("B")

p_deltaVL_peak = 
  p_deltaVL_peak + ggtitle("C")
p_deltaCP_peak = 
  p_deltaCP_peak + ggtitle("D")

F5 = 
  (VL_TC | CP_TC) / (p_deltaVL_peak | p_deltaCP_peak)

ggsave("figures/F5.png",F5, width = 20, 
       height = 20, units = "cm",dpi = 300)
ggsave("figures/F5.pdf",F5, width = 20, 
       height = 20, units = "cm")
#save(F5,file = "figures/F5.Rdata")
F5 
Estimated expected viral load and culture probability for age groups by time. A) Change in estimated viral load over time according to age group for 4344 subjects with at least three RT-PCR tests, at least two of which were positive. The age colouring, range, and number of subjects in each category is given in the figure legend. Shading indicates the 90% credible interval of the mean. B) Change in estimated culture probability over time according to age. Age groups, colouring, and shading are as in A. C) Estimated age group differences in mean peak viral load, corresponding to the values at day zero in A. D) Estimated age group differences in mean peak culture probability, corresponding to the values at day zero in B. In C and D, adjusted differences account for variations by age in clinical status and gender. Dotted lines indicate grand means for the 4344 subjects.

Figure 4.5: Estimated expected viral load and culture probability for age groups by time. A) Change in estimated viral load over time according to age group for 4344 subjects with at least three RT-PCR tests, at least two of which were positive. The age colouring, range, and number of subjects in each category is given in the figure legend. Shading indicates the 90% credible interval of the mean. B) Change in estimated culture probability over time according to age. Age groups, colouring, and shading are as in A. C) Estimated age group differences in mean peak viral load, corresponding to the values at day zero in A. D) Estimated age group differences in mean peak culture probability, corresponding to the values at day zero in B. In C and D, adjusted differences account for variations by age in clinical status and gender. Dotted lines indicate grand means for the 4344 subjects.

rm(F5 ,VL_TC, CP_TC,VL_delta, CP_delta)

5 Stan model for time course estimation

data {
  // Data for time course analysis
  int N_DAY;                       // number of data points
  vector[N_DAY] Y_DAY;             // outocome log10 viral load
  vector[N_DAY] X_DAY;             // day of measurement in time series
  int G;                           // number of subjects
  int gstart_DAY[G];               // first day in time series
  int gend_DAY[G];                 // last day in time series
  int N_onset;                     // number of subjects with symptom-onset data
  int idx_onset[N_onset];          // index of subjects with symptom-onset data
  vector[N_onset] onset;           // onset in days from first test
  int N_NegTests;                  // Total number negative test results
  int idx_NegTests[N_NegTests];    // position of negative test results in Y_DAY
  vector[N_DAY] PCR;               // Type of PCR system used for test
  int N_centres;                   // Number of test centre types
  matrix[N_DAY,N_centres] centre;  // Matrix with centre types for each test
  int N_centre1;                   // Number of test centre types for 1st test
  int centre1[G];                  // centre of each participants 1st test
  int N_ld_centre;                 // Number of test centre types with longest "stay"
  int ld_centre[G];                // each subjects centre types with longest "stay"
  real imputation_limit;           // upper limit for imputed viral loads for neg tests
  int num_basis_Age;               // numberf of bases for Age spline model
  matrix[num_basis_Age, G] B_Age;  // regression matrix for Age spline model
  vector[G] Age;                   // Cases ages
  int K_PGH;                       // Number of covariates
  matrix[G,K_PGH] X_PGH;           // covariates 
  vector[G] max_load;              // Maximum measured load per case
  int condition_on_data;           // 1 for parameter estimation, 0 for prior predictive plots
  // Data for culture positivity analysis
  int N_CP;                        // number of data points
  int Y_CP[N_CP];                  // Outcome (0,1)
  matrix[N_CP,1] X_CP;             // Predictor (log10 viral load)
}

parameters {
  // grand means for key model parameters
  real log_slope_up_mu;
  real log_slope_down_mu;
  real log_intercept_mu;
  
  // covariates for key model parameters
  row_vector[num_basis_Age] a_raw_intercept_Age;
  real<lower=0> tau_intercept_Age;
  real<lower=0> a0_intercept_Age; 
  vector[K_PGH] betaPGH_intercept;
  row_vector[num_basis_Age] a_raw_slope_up_Age;
  real<lower=0> tau_slope_up_Age;
  real a0_slope_up_Age; 
  vector[K_PGH] betaPGH_slope_up;
  row_vector[num_basis_Age] a_raw_slope_down_Age;
  real<lower=0> tau_slope_down_Age;
  real a0_slope_down_Age; 
  vector[K_PGH] betaPGH_slope_down;

  // individual random effects for key model parameters
  real<lower=0> intercept_sigma;
  vector[G] intercept_raw; 
  real<lower=0> slope_up_sigma;
  vector[G] slope_up_raw;
  real<lower=0> slope_down_sigma;
  vector[G] slope_down_raw;
  real<lower=0> sigma_sigma;
  vector[G] sigma_raw;
  
  // first centre random effects for key model parameters
  real<lower=0> int_centre1_sigma;
  vector[N_centre1] int_centre1_raw;
  real<lower=0> slope_down_ld_centre_sigma;
  vector[N_ld_centre] slope_down_ld_centre_raw;
  real<lower=0> slope_up_ld_centre_sigma;
  vector[N_ld_centre] slope_up_ld_centre_raw;
  real<lower=0> int_ld_centre_sigma;
  vector[N_ld_centre] int_ld_centre_raw;
  
  // smoothness parameter for logistic weight function
  real<lower=0> beta_sweight_mu;
  
  // impute negative tests
  vector<lower=-25,upper=imputation_limit>[N_NegTests] imp_neg;
  
  // shifts for individuals
  vector<lower=-10,upper=20>[G] b_shift;
  // random effects of cirst centre on shifts
  real<lower=0> shift_centre1_sigma;
  vector[N_centre1] shift_centre1_raw;
  
  // error variance
  real sigma_mu; // mean
  
  // centre random effects on viral load
  vector[N_centres] centre_raw;
  real<lower=0> centre_sigma;
  // PCR system effect on viral load
  real intercept_PCR;
  
  // culture positivity parameters
  real alpha_CP;
  vector[1] beta_CP;
  
  real<lower=0,upper=1> theta;
}

transformed parameters {
  // spline coefficients
  row_vector[num_basis_Age] a_slope_up_Age = a_raw_slope_up_Age*tau_slope_up_Age;
  row_vector[num_basis_Age] a_intercept_Age = a_raw_intercept_Age*tau_intercept_Age;
  row_vector[num_basis_Age] a_slope_down_Age = a_raw_slope_down_Age*tau_slope_down_Age;
  
  // centre random effects on measured loads
  vector[N_centres] int_centr = centre_raw * centre_sigma;
  // error variance
  vector[G] sigma = exp(sigma_mu + sigma_sigma*sigma_raw);
  // initialize vectors for individual level parameters
  vector<lower=0>[G] intercept;
  vector<lower=0>[G] slope_up;
  vector<upper=0>[G] slope_down;
  vector<lower=0>[G] time2peak;
  vector[G] shift;
  vector[N_onset] shifted_onset;
  { // calculate model parameters
     vector[N_centre1] shift_centre1 = shift_centre1_sigma * shift_centre1_raw;
     vector[N_centre1] int_centre1 = int_centre1_sigma * int_centre1_raw;
     vector[N_ld_centre] slope_up_ld_centre = slope_up_ld_centre_sigma * slope_up_ld_centre_raw;
     vector[N_ld_centre] int_ld_centre = int_ld_centre_sigma * int_ld_centre_raw;
     vector[N_ld_centre] slope_down_ld_centre = slope_down_ld_centre_sigma * slope_down_ld_centre_raw;
     vector[G] log_intercept = 
                 log_intercept_mu + // intercept (fixed effect [FE], group level mean)
                 intercept_sigma * intercept_raw + // individual level random effects
                 a0_intercept_Age * Age + to_vector(a_intercept_Age * B_Age) + // age with splines (FE)
                 X_PGH * betaPGH_intercept; // covariates (FE)
     vector[G] log_slope_up = 
                 log_slope_up_mu +
                 slope_up_sigma * slope_up_raw + 
                 a0_slope_up_Age * Age + to_vector(a_slope_up_Age * B_Age) + 
                 X_PGH * betaPGH_slope_up; 
     vector[G] log_slope_down = 
                 log_slope_down_mu + 
                 slope_down_sigma * slope_down_raw + 
                 a0_slope_down_Age * Age + to_vector(a_slope_down_Age * B_Age) + 
                 X_PGH * betaPGH_slope_down;
     // test-centre category based random effects
     for (g in 1:G) {
       shift[g] = b_shift[g] + shift_centre1[centre1[g]];
       log_intercept[g] += int_centre1[centre1[g]] + int_ld_centre[g[]];
       log_slope_down[g] += slope_down_ld_centre[ld_centre[g]];
       log_slope_up[g] += slope_up_ld_centre[ld_centre[g]];
     }
     // apply link function
     intercept = exp(log_intercept);
     slope_up = exp(log_slope_up);
     slope_down = -exp(log_slope_down);
     shifted_onset = onset + shift[idx_onset]; // calculate temporal location of onset relative to peak load
  }
  time2peak = intercept ./ slope_up;
}

model {
  // DAY
  log_intercept_mu ~ normal(2.1,.1); 
  log_slope_up_mu ~ normal(.6,.25); 
  log_slope_down_mu ~ normal(-1.75,.5); 
  beta_sweight_mu ~ normal(10,1); 
  
  shift_centre1_sigma ~ std_normal();
  shift_centre1_raw ~ std_normal();
  
  intercept_sigma ~ gamma(4,20);
  intercept_raw ~ std_normal();
  
  slope_up_sigma ~ gamma(4,20);
  slope_up_raw ~ std_normal();
  
  slope_down_sigma ~ gamma(4,20);
  slope_down_raw ~ std_normal();
  
  sigma_mu ~ std_normal();
  sigma_sigma ~ std_normal();
  sigma_raw ~ std_normal();
  
  a0_intercept_Age ~ normal(0,.125);
  a_raw_intercept_Age ~ std_normal();
  tau_intercept_Age ~ normal(0, .5); 
  betaPGH_intercept ~ normal(0,.125);
  // one sd change in age changes slope by up to around 20%
  // pnorm(.2,0,.125) = .945, exp(.2) = 1.2
  a0_slope_up_Age ~ normal(0,.125);
  a_raw_slope_up_Age ~ std_normal(); 
  tau_slope_up_Age ~ normal(0, .75);  
  betaPGH_slope_up ~ normal(0,.125);
  a0_slope_down_Age ~ normal(0,.225);
  a_raw_slope_down_Age ~ std_normal(); 
  tau_slope_down_Age ~ normal(0, .75); 
  betaPGH_slope_down ~ normal(0,.225);
  
  // random centre effects
  centre_raw ~ std_normal();
  centre_sigma ~ std_normal();
  
  int_centre1_sigma ~ normal(0,.125);
  int_centre1_raw ~ std_normal();
  slope_down_ld_centre_sigma ~ normal(0,.125);
  slope_down_ld_centre_raw ~ std_normal();
  
  intercept_PCR ~ normal(0,1);
  
  if (condition_on_data == 1) {
    vector[N_DAY] Y_DAY_imputed = Y_DAY;
    for (j in 1:N_NegTests) {
      Y_DAY_imputed[idx_NegTests[j]] = imp_neg[j];
    }
    
    for (g in 1:G) {
      int N_g = gend_DAY[g] - gstart_DAY[g] + 1;
      vector[N_g] DAY_shifted1 = X_DAY[gstart_DAY[g]:gend_DAY[g]] + shift[g];
      vector[N_g] weight_down1 = inv_logit(DAY_shifted1 * beta_sweight_mu);
      vector[N_g] est_up1 = intercept[g] + slope_up[g]*DAY_shifted1;
      vector[N_g] est_down1 = intercept[g] + slope_down[g]*DAY_shifted1;
      vector[N_g] yhat1 = (1-weight_down1) .* est_up1 + weight_down1 .* est_down1;

      yhat1 += PCR[gstart_DAY[g]:gend_DAY[g]] * intercept_PCR;
      yhat1 += centre[gstart_DAY[g]:gend_DAY[g],] * int_centr;
      target += normal_lpdf(Y_DAY_imputed[gstart_DAY[g]:gend_DAY[g]] | yhat1, sigma[g]);
    }
    
    //shifted_onset ~ skew_normal(-3,10,12);
    for (k in 1:N_onset) {
     target += log_mix(theta,
     skew_normal_lpdf(shifted_onset[k] | -2.5,10,10),
     normal_lpdf(shifted_onset[k] | 0,50));
    }
    
    // culture positivity analysis
    alpha_CP ~ normal(0,20);
    beta_CP ~ normal(0,3);
    Y_CP ~ bernoulli_logit_glm(X_CP,alpha_CP, beta_CP);
  }
}

generated quantities {
  real slope_up_mu = exp(log_slope_up_mu);
  real slope_down_mu = exp(log_slope_down_mu);
  real intercept_mu = exp(log_intercept_mu);
  real time2peak_mu = intercept_mu/slope_up_mu;
}

6 Session Info

library(pander)
library(sessioninfo)
my_sessioninfo <- session_info()
pander(my_sessioninfo)
  • platform:

    • version: R version 4.0.2 (2020-06-22)
    • os: macOS High Sierra 10.13.6
    • system: x86_64, darwin17.0
    • ui: X11
    • language: (EN)
    • collate: en_US.UTF-8
    • ctype: en_US.UTF-8
    • tz: Europe/Oslo
    • date: 2021-05-25
  • packages:

    Table continues below
      package ondiskversion loadedversion
    abind abind 1.4.5 1.4-5
    assertthat assertthat 0.2.1 0.2.1
    backports backports 1.2.1 1.2.1
    base64enc base64enc 0.1.3 0.1-3
    bayesplot bayesplot 1.8.0 1.8.0
    beeswarm beeswarm 0.2.3 0.2.3
    bookdown bookdown 0.22 0.22
    boot boot 1.3.27 1.3-27
    bridgesampling bridgesampling 1.0.0 1.0-0
    brms brms 2.14.11 2.14.11
    Brobdingnag Brobdingnag 1.2.6 1.2-6
    bslib bslib 0.2.4 0.2.4
    callr callr 3.5.1 3.5.1
    cellranger cellranger 1.1.0 1.1.0
    checkmate checkmate 2.0.0 2.0.0
    cli cli 2.5.0 2.5.0
    cmdstanr cmdstanr 0.2.2 0.2.2
    coda coda 0.19.4 0.19-4
    codetools codetools 0.2.18 0.2-18
    collapse collapse 1.5.1 1.5.1
    colorspace colorspace 2.0.0 2.0-0
    colourpicker colourpicker 1.1.0 1.1.0
    crayon crayon 1.4.1 1.4.1
    crosstalk crosstalk 1.1.1 1.1.1
    curl curl 4.3 4.3
    data.table data.table 1.13.6 1.13.6
    DBI DBI 1.1.1 1.1.1
    digest digest 0.6.27 0.6.27
    distributional distributional 0.2.2 0.2.2
    dplyr dplyr 1.0.6 1.0.6
    DT DT 0.17 0.17
    dygraphs dygraphs 1.1.1.6 1.1.1.6
    ellipsis ellipsis 0.3.2 0.3.2
    emmeans emmeans 1.5.4 1.5.4
    english english 1.2.5 1.2-5
    EnvStats EnvStats 2.4.0 2.4.0
    estimability estimability 1.3 1.3
    evaluate evaluate 0.14 0.14
    extraDistr extraDistr 1.9.1 1.9.1
    fansi fansi 0.4.2 0.4.2
    farver farver 2.0.3 2.0.3
    fastmap fastmap 1.1.0 1.1.0
    flextable flextable 0.6.3 0.6.3
    forcats forcats 0.5.1 0.5.1
    gamm4 gamm4 0.2.6 0.2-6
    gdtools gdtools 0.2.3 0.2.3
    generics generics 0.1.0 0.1.0
    ggbeeswarm ggbeeswarm 0.6.0 0.6.0
    ggdist ggdist 2.4.0 2.4.0
    ggmosaic ggmosaic 0.3.3 0.3.3
    ggplot2 ggplot2 3.3.3 3.3.3
    ggrepel ggrepel 0.9.1 0.9.1
    ggridges ggridges 0.5.3 0.5.3
    ggthemes ggthemes 4.2.4 4.2.4
    glue glue 1.4.2 1.4.2
    gridExtra gridExtra 2.3 2.3
    gtable gtable 0.3.0 0.3.0
    gtools gtools 3.8.2 3.8.2
    hdrcde hdrcde 3.4 3.4
    here here 1.0.1 1.0.1
    highr highr 0.9 0.9
    htmltools htmltools 0.5.1.1 0.5.1.1
    htmlwidgets htmlwidgets 1.5.3 1.5.3
    httpuv httpuv 1.5.5 1.5.5
    httr httr 1.4.2 1.4.2
    igraph igraph 1.2.6 1.2.6
    inline inline 0.3.17 0.3.17
    isoband isoband 0.2.3 0.2.3
    jquerylib jquerylib 0.1.3 0.1.3
    jsonlite jsonlite 1.7.2 1.7.2
    kableExtra kableExtra 1.3.4 1.3.4
    knitr knitr 1.33 1.33
    labeling labeling 0.4.2 0.4.2
    later later 1.1.0.1 1.1.0.1
    lattice lattice 0.20.41 0.20-41
    lazyeval lazyeval 0.2.2 0.2.2
    lifecycle lifecycle 1.0.0 1.0.0
    lme4 lme4 1.1.26 1.1-26
    loo loo 2.4.1 2.4.1
    magrittr magrittr 2.0.1 2.0.1
    markdown markdown 1.1 1.1
    MASS MASS 7.3.53.1 7.3-53.1
    Matrix Matrix 1.3.2 1.3-2
    matrixStats matrixStats 0.58.0 0.58.0
    mgcv mgcv 1.8.34 1.8-34
    mime mime 0.10 0.10
    miniUI miniUI 0.1.1.1 0.1.1.1
    minqa minqa 1.2.4 1.2.4
    multcomp multcomp 1.4.16 1.4-16
    munsell munsell 0.5.0 0.5.0
    mvtnorm mvtnorm 1.1.1 1.1-1
    nlme nlme 3.1.152 3.1-152
    nloptr nloptr 1.2.2.2 1.2.2.2
    officer officer 0.3.16 0.3.16
    pander pander 0.6.3 0.6.3
    patchwork patchwork 1.1.1 1.1.1
    pillar pillar 1.6.1 1.6.1
    pkgbuild pkgbuild 1.2.0 1.2.0
    pkgconfig pkgconfig 2.0.3 2.0.3
    plotly plotly 4.9.3 4.9.3
    plyr plyr 1.8.6 1.8.6
    posterior posterior 0.1.3 0.1.3
    prettyunits prettyunits 1.1.1 1.1.1
    processx processx 3.4.5 3.4.5
    productplots productplots 0.1.1 0.1.1
    projpred projpred 2.0.2 2.0.2
    promises promises 1.2.0.1 1.2.0.1
    ps ps 1.5.0 1.5.0
    purrr purrr 0.3.4 0.3.4
    R6 R6 2.5.0 2.5.0
    Rcpp Rcpp 1.0.6 1.0.6
    RcppArmadillo RcppArmadillo 0.10.2.1.0 0.10.2.1.0
    RcppEigen RcppEigen 0.3.3.9.1 0.3.3.9.1
    RcppParallel RcppParallel 5.0.3 5.0.3
    readxl readxl 1.3.1 1.3.1
    reshape reshape 0.8.8 0.8.8
    reshape2 reshape2 1.4.4 1.4.4
    rjson rjson 0.2.20 0.2.20
    rlang rlang 0.4.11 0.4.11
    rmarkdown rmarkdown 2.8 2.8
    rprojroot rprojroot 2.0.2 2.0.2
    rsconnect rsconnect 0.8.16 0.8.16
    rstan rstan 2.26.1 2.26.1
    rstanarm rstanarm 2.21.1 2.21.1
    rstantools rstantools 2.1.1 2.1.1
    rstudioapi rstudioapi 0.13 0.13
    rvest rvest 0.3.6 0.3.6
    sandwich sandwich 3.0.0 3.0-0
    sass sass 0.3.1 0.3.1
    scales scales 1.1.1 1.1.1
    sessioninfo sessioninfo 1.1.1 1.1.1
    shiny shiny 1.6.0 1.6.0
    shinyjs shinyjs 2.0.0 2.0.0
    shinystan shinystan 2.5.0 2.5.0
    shinythemes shinythemes 1.2.0 1.2.0
    StanHeaders StanHeaders 2.26.1 2.26.1
    statmod statmod 1.4.35 1.4.35
    stringi stringi 1.6.2 1.6.2
    stringr stringr 1.4.0 1.4.0
    survival survival 3.2.7 3.2-7
    svglite svglite 1.2.3.2 1.2.3.2
    systemfonts systemfonts 1.0.1 1.0.1
    TH.data TH.data 1.0.10 1.0-10
    threejs threejs 0.3.3 0.3.3
    tibble tibble 3.1.2 3.1.2
    tidyr tidyr 1.1.3 1.1.3
    tidyselect tidyselect 1.1.1 1.1.1
    utf8 utf8 1.2.1 1.2.1
    uuid uuid 0.1.4 0.1-4
    V8 V8 3.4.0 3.4.0
    vctrs vctrs 0.3.8 0.3.8
    vipor vipor 0.4.5 0.4.5
    viridisLite viridisLite 0.3.0 0.3.0
    webshot webshot 0.5.2 0.5.2
    withr withr 2.4.1 2.4.1
    xfun xfun 0.23 0.23
    xml2 xml2 1.3.2 1.3.2
    xtable xtable 1.8.4 1.8-4
    xts xts 0.12.1 0.12.1
    yaml yaml 2.2.1 2.2.1
    zip zip 2.1.1 2.1.1
    zoo zoo 1.8.8 1.8-8
    Table continues below
      path
    abind /Library/Frameworks/R.framework/Versions/4.0/Resources/library/abind
    assertthat /Library/Frameworks/R.framework/Versions/4.0/Resources/library/assertthat
    backports /Library/Frameworks/R.framework/Versions/4.0/Resources/library/backports
    base64enc /Library/Frameworks/R.framework/Versions/4.0/Resources/library/base64enc
    bayesplot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bayesplot
    beeswarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/beeswarm
    bookdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bookdown
    boot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/boot
    bridgesampling /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bridgesampling
    brms /Library/Frameworks/R.framework/Versions/4.0/Resources/library/brms
    Brobdingnag /Library/Frameworks/R.framework/Versions/4.0/Resources/library/Brobdingnag
    bslib /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bslib
    callr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/callr
    cellranger /Library/Frameworks/R.framework/Versions/4.0/Resources/library/cellranger
    checkmate /Library/Frameworks/R.framework/Versions/4.0/Resources/library/checkmate
    cli /Library/Frameworks/R.framework/Versions/4.0/Resources/library/cli
    cmdstanr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/cmdstanr
    coda /Library/Frameworks/R.framework/Versions/4.0/Resources/library/coda
    codetools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/codetools
    collapse /Library/Frameworks/R.framework/Versions/4.0/Resources/library/collapse
    colorspace /Library/Frameworks/R.framework/Versions/4.0/Resources/library/colorspace
    colourpicker /Library/Frameworks/R.framework/Versions/4.0/Resources/library/colourpicker
    crayon /Library/Frameworks/R.framework/Versions/4.0/Resources/library/crayon
    crosstalk /Library/Frameworks/R.framework/Versions/4.0/Resources/library/crosstalk
    curl /Library/Frameworks/R.framework/Versions/4.0/Resources/library/curl
    data.table /Library/Frameworks/R.framework/Versions/4.0/Resources/library/data.table
    DBI /Library/Frameworks/R.framework/Versions/4.0/Resources/library/DBI
    digest /Library/Frameworks/R.framework/Versions/4.0/Resources/library/digest
    distributional /Library/Frameworks/R.framework/Versions/4.0/Resources/library/distributional
    dplyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/dplyr
    DT /Library/Frameworks/R.framework/Versions/4.0/Resources/library/DT
    dygraphs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/dygraphs
    ellipsis /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ellipsis
    emmeans /Library/Frameworks/R.framework/Versions/4.0/Resources/library/emmeans
    english /Library/Frameworks/R.framework/Versions/4.0/Resources/library/english
    EnvStats /Library/Frameworks/R.framework/Versions/4.0/Resources/library/EnvStats
    estimability /Library/Frameworks/R.framework/Versions/4.0/Resources/library/estimability
    evaluate /Library/Frameworks/R.framework/Versions/4.0/Resources/library/evaluate
    extraDistr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/extraDistr
    fansi /Library/Frameworks/R.framework/Versions/4.0/Resources/library/fansi
    farver /Library/Frameworks/R.framework/Versions/4.0/Resources/library/farver
    fastmap /Library/Frameworks/R.framework/Versions/4.0/Resources/library/fastmap
    flextable /Library/Frameworks/R.framework/Versions/4.0/Resources/library/flextable
    forcats /Library/Frameworks/R.framework/Versions/4.0/Resources/library/forcats
    gamm4 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gamm4
    gdtools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gdtools
    generics /Library/Frameworks/R.framework/Versions/4.0/Resources/library/generics
    ggbeeswarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggbeeswarm
    ggdist /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggdist
    ggmosaic /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggmosaic
    ggplot2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggplot2
    ggrepel /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggrepel
    ggridges /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggridges
    ggthemes /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggthemes
    glue /Library/Frameworks/R.framework/Versions/4.0/Resources/library/glue
    gridExtra /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gridExtra
    gtable /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gtable
    gtools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gtools
    hdrcde /Library/Frameworks/R.framework/Versions/4.0/Resources/library/hdrcde
    here /Library/Frameworks/R.framework/Versions/4.0/Resources/library/here
    highr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/highr
    htmltools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/htmltools
    htmlwidgets /Library/Frameworks/R.framework/Versions/4.0/Resources/library/htmlwidgets
    httpuv /Library/Frameworks/R.framework/Versions/4.0/Resources/library/httpuv
    httr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/httr
    igraph /Library/Frameworks/R.framework/Versions/4.0/Resources/library/igraph
    inline /Library/Frameworks/R.framework/Versions/4.0/Resources/library/inline
    isoband /Library/Frameworks/R.framework/Versions/4.0/Resources/library/isoband
    jquerylib /Library/Frameworks/R.framework/Versions/4.0/Resources/library/jquerylib
    jsonlite /Library/Frameworks/R.framework/Versions/4.0/Resources/library/jsonlite
    kableExtra /Library/Frameworks/R.framework/Versions/4.0/Resources/library/kableExtra
    knitr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/knitr
    labeling /Library/Frameworks/R.framework/Versions/4.0/Resources/library/labeling
    later /Library/Frameworks/R.framework/Versions/4.0/Resources/library/later
    lattice /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lattice
    lazyeval /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lazyeval
    lifecycle /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lifecycle
    lme4 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lme4
    loo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/loo
    magrittr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/magrittr
    markdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library/markdown
    MASS /Library/Frameworks/R.framework/Versions/4.0/Resources/library/MASS
    Matrix /Library/Frameworks/R.framework/Versions/4.0/Resources/library/Matrix
    matrixStats /Library/Frameworks/R.framework/Versions/4.0/Resources/library/matrixStats
    mgcv /Library/Frameworks/R.framework/Versions/4.0/Resources/library/mgcv
    mime /Library/Frameworks/R.framework/Versions/4.0/Resources/library/mime
    miniUI /Library/Frameworks/R.framework/Versions/4.0/Resources/library/miniUI
    minqa /Library/Frameworks/R.framework/Versions/4.0/Resources/library/minqa
    multcomp /Library/Frameworks/R.framework/Versions/4.0/Resources/library/multcomp
    munsell /Library/Frameworks/R.framework/Versions/4.0/Resources/library/munsell
    mvtnorm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/mvtnorm
    nlme /Library/Frameworks/R.framework/Versions/4.0/Resources/library/nlme
    nloptr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/nloptr
    officer /Library/Frameworks/R.framework/Versions/4.0/Resources/library/officer
    pander /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pander
    patchwork /Library/Frameworks/R.framework/Versions/4.0/Resources/library/patchwork
    pillar /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pillar
    pkgbuild /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pkgbuild
    pkgconfig /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pkgconfig
    plotly /Library/Frameworks/R.framework/Versions/4.0/Resources/library/plotly
    plyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/plyr
    posterior /Library/Frameworks/R.framework/Versions/4.0/Resources/library/posterior
    prettyunits /Library/Frameworks/R.framework/Versions/4.0/Resources/library/prettyunits
    processx /Library/Frameworks/R.framework/Versions/4.0/Resources/library/processx
    productplots /Library/Frameworks/R.framework/Versions/4.0/Resources/library/productplots
    projpred /Library/Frameworks/R.framework/Versions/4.0/Resources/library/projpred
    promises /Library/Frameworks/R.framework/Versions/4.0/Resources/library/promises
    ps /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ps
    purrr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/purrr
    R6 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/R6
    Rcpp /Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp
    RcppArmadillo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo
    RcppEigen /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen
    RcppParallel /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppParallel
    readxl /Library/Frameworks/R.framework/Versions/4.0/Resources/library/readxl
    reshape /Library/Frameworks/R.framework/Versions/4.0/Resources/library/reshape
    reshape2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/reshape2
    rjson /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rjson
    rlang /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rlang
    rmarkdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rmarkdown
    rprojroot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rprojroot
    rsconnect /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rsconnect
    rstan /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan
    rstanarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstanarm
    rstantools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstantools
    rstudioapi /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstudioapi
    rvest /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rvest
    sandwich /Library/Frameworks/R.framework/Versions/4.0/Resources/library/sandwich
    sass /Library/Frameworks/R.framework/Versions/4.0/Resources/library/sass
    scales /Library/Frameworks/R.framework/Versions/4.0/Resources/library/scales
    sessioninfo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/sessioninfo
    shiny /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shiny
    shinyjs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shinyjs
    shinystan /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shinystan
    shinythemes /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shinythemes
    StanHeaders /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders
    statmod /Library/Frameworks/R.framework/Versions/4.0/Resources/library/statmod
    stringi /Library/Frameworks/R.framework/Versions/4.0/Resources/library/stringi
    stringr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/stringr
    survival /Library/Frameworks/R.framework/Versions/4.0/Resources/library/survival
    svglite /Library/Frameworks/R.framework/Versions/4.0/Resources/library/svglite
    systemfonts /Library/Frameworks/R.framework/Versions/4.0/Resources/library/systemfonts
    TH.data /Library/Frameworks/R.framework/Versions/4.0/Resources/library/TH.data
    threejs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/threejs
    tibble /Library/Frameworks/R.framework/Versions/4.0/Resources/library/tibble
    tidyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/tidyr
    tidyselect /Library/Frameworks/R.framework/Versions/4.0/Resources/library/tidyselect
    utf8 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/utf8
    uuid /Library/Frameworks/R.framework/Versions/4.0/Resources/library/uuid
    V8 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/V8
    vctrs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/vctrs
    vipor /Library/Frameworks/R.framework/Versions/4.0/Resources/library/vipor
    viridisLite /Library/Frameworks/R.framework/Versions/4.0/Resources/library/viridisLite
    webshot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/webshot
    withr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/withr
    xfun /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xfun
    xml2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xml2
    xtable /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xtable
    xts /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xts
    yaml /Library/Frameworks/R.framework/Versions/4.0/Resources/library/yaml
    zip /Library/Frameworks/R.framework/Versions/4.0/Resources/library/zip
    zoo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/zoo
    Table continues below
      loadedpath
    abind /Library/Frameworks/R.framework/Versions/4.0/Resources/library/abind
    assertthat /Library/Frameworks/R.framework/Versions/4.0/Resources/library/assertthat
    backports /Library/Frameworks/R.framework/Versions/4.0/Resources/library/backports
    base64enc /Library/Frameworks/R.framework/Versions/4.0/Resources/library/base64enc
    bayesplot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bayesplot
    beeswarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/beeswarm
    bookdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bookdown
    boot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/boot
    bridgesampling /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bridgesampling
    brms /Library/Frameworks/R.framework/Versions/4.0/Resources/library/brms
    Brobdingnag /Library/Frameworks/R.framework/Versions/4.0/Resources/library/Brobdingnag
    bslib /Library/Frameworks/R.framework/Versions/4.0/Resources/library/bslib
    callr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/callr
    cellranger /Library/Frameworks/R.framework/Versions/4.0/Resources/library/cellranger
    checkmate /Library/Frameworks/R.framework/Versions/4.0/Resources/library/checkmate
    cli /Library/Frameworks/R.framework/Versions/4.0/Resources/library/cli
    cmdstanr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/cmdstanr
    coda /Library/Frameworks/R.framework/Versions/4.0/Resources/library/coda
    codetools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/codetools
    collapse /Library/Frameworks/R.framework/Versions/4.0/Resources/library/collapse
    colorspace /Library/Frameworks/R.framework/Versions/4.0/Resources/library/colorspace
    colourpicker /Library/Frameworks/R.framework/Versions/4.0/Resources/library/colourpicker
    crayon /Library/Frameworks/R.framework/Versions/4.0/Resources/library/crayon
    crosstalk /Library/Frameworks/R.framework/Versions/4.0/Resources/library/crosstalk
    curl /Library/Frameworks/R.framework/Versions/4.0/Resources/library/curl
    data.table /Library/Frameworks/R.framework/Versions/4.0/Resources/library/data.table
    DBI /Library/Frameworks/R.framework/Versions/4.0/Resources/library/DBI
    digest /Library/Frameworks/R.framework/Versions/4.0/Resources/library/digest
    distributional /Library/Frameworks/R.framework/Versions/4.0/Resources/library/distributional
    dplyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/dplyr
    DT /Library/Frameworks/R.framework/Versions/4.0/Resources/library/DT
    dygraphs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/dygraphs
    ellipsis /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ellipsis
    emmeans /Library/Frameworks/R.framework/Versions/4.0/Resources/library/emmeans
    english /Library/Frameworks/R.framework/Versions/4.0/Resources/library/english
    EnvStats /Library/Frameworks/R.framework/Versions/4.0/Resources/library/EnvStats
    estimability /Library/Frameworks/R.framework/Versions/4.0/Resources/library/estimability
    evaluate /Library/Frameworks/R.framework/Versions/4.0/Resources/library/evaluate
    extraDistr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/extraDistr
    fansi /Library/Frameworks/R.framework/Versions/4.0/Resources/library/fansi
    farver /Library/Frameworks/R.framework/Versions/4.0/Resources/library/farver
    fastmap /Library/Frameworks/R.framework/Versions/4.0/Resources/library/fastmap
    flextable /Library/Frameworks/R.framework/Versions/4.0/Resources/library/flextable
    forcats /Library/Frameworks/R.framework/Versions/4.0/Resources/library/forcats
    gamm4 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gamm4
    gdtools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gdtools
    generics /Library/Frameworks/R.framework/Versions/4.0/Resources/library/generics
    ggbeeswarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggbeeswarm
    ggdist /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggdist
    ggmosaic /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggmosaic
    ggplot2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggplot2
    ggrepel /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggrepel
    ggridges /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggridges
    ggthemes /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ggthemes
    glue /Library/Frameworks/R.framework/Versions/4.0/Resources/library/glue
    gridExtra /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gridExtra
    gtable /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gtable
    gtools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/gtools
    hdrcde /Library/Frameworks/R.framework/Versions/4.0/Resources/library/hdrcde
    here /Library/Frameworks/R.framework/Versions/4.0/Resources/library/here
    highr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/highr
    htmltools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/htmltools
    htmlwidgets /Library/Frameworks/R.framework/Versions/4.0/Resources/library/htmlwidgets
    httpuv /Library/Frameworks/R.framework/Versions/4.0/Resources/library/httpuv
    httr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/httr
    igraph /Library/Frameworks/R.framework/Versions/4.0/Resources/library/igraph
    inline /Library/Frameworks/R.framework/Versions/4.0/Resources/library/inline
    isoband /Library/Frameworks/R.framework/Versions/4.0/Resources/library/isoband
    jquerylib /Library/Frameworks/R.framework/Versions/4.0/Resources/library/jquerylib
    jsonlite /Library/Frameworks/R.framework/Versions/4.0/Resources/library/jsonlite
    kableExtra /Library/Frameworks/R.framework/Versions/4.0/Resources/library/kableExtra
    knitr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/knitr
    labeling /Library/Frameworks/R.framework/Versions/4.0/Resources/library/labeling
    later /Library/Frameworks/R.framework/Versions/4.0/Resources/library/later
    lattice /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lattice
    lazyeval /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lazyeval
    lifecycle /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lifecycle
    lme4 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/lme4
    loo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/loo
    magrittr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/magrittr
    markdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library/markdown
    MASS /Library/Frameworks/R.framework/Versions/4.0/Resources/library/MASS
    Matrix /Library/Frameworks/R.framework/Versions/4.0/Resources/library/Matrix
    matrixStats /Library/Frameworks/R.framework/Versions/4.0/Resources/library/matrixStats
    mgcv /Library/Frameworks/R.framework/Versions/4.0/Resources/library/mgcv
    mime /Library/Frameworks/R.framework/Versions/4.0/Resources/library/mime
    miniUI /Library/Frameworks/R.framework/Versions/4.0/Resources/library/miniUI
    minqa /Library/Frameworks/R.framework/Versions/4.0/Resources/library/minqa
    multcomp /Library/Frameworks/R.framework/Versions/4.0/Resources/library/multcomp
    munsell /Library/Frameworks/R.framework/Versions/4.0/Resources/library/munsell
    mvtnorm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/mvtnorm
    nlme /Library/Frameworks/R.framework/Versions/4.0/Resources/library/nlme
    nloptr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/nloptr
    officer /Library/Frameworks/R.framework/Versions/4.0/Resources/library/officer
    pander /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pander
    patchwork /Library/Frameworks/R.framework/Versions/4.0/Resources/library/patchwork
    pillar /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pillar
    pkgbuild /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pkgbuild
    pkgconfig /Library/Frameworks/R.framework/Versions/4.0/Resources/library/pkgconfig
    plotly /Library/Frameworks/R.framework/Versions/4.0/Resources/library/plotly
    plyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/plyr
    posterior /Library/Frameworks/R.framework/Versions/4.0/Resources/library/posterior
    prettyunits /Library/Frameworks/R.framework/Versions/4.0/Resources/library/prettyunits
    processx /Library/Frameworks/R.framework/Versions/4.0/Resources/library/processx
    productplots /Library/Frameworks/R.framework/Versions/4.0/Resources/library/productplots
    projpred /Library/Frameworks/R.framework/Versions/4.0/Resources/library/projpred
    promises /Library/Frameworks/R.framework/Versions/4.0/Resources/library/promises
    ps /Library/Frameworks/R.framework/Versions/4.0/Resources/library/ps
    purrr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/purrr
    R6 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/R6
    Rcpp /Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp
    RcppArmadillo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppArmadillo
    RcppEigen /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen
    RcppParallel /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppParallel
    readxl /Library/Frameworks/R.framework/Versions/4.0/Resources/library/readxl
    reshape /Library/Frameworks/R.framework/Versions/4.0/Resources/library/reshape
    reshape2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/reshape2
    rjson /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rjson
    rlang /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rlang
    rmarkdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rmarkdown
    rprojroot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rprojroot
    rsconnect /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rsconnect
    rstan /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan
    rstanarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstanarm
    rstantools /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstantools
    rstudioapi /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstudioapi
    rvest /Library/Frameworks/R.framework/Versions/4.0/Resources/library/rvest
    sandwich /Library/Frameworks/R.framework/Versions/4.0/Resources/library/sandwich
    sass /Library/Frameworks/R.framework/Versions/4.0/Resources/library/sass
    scales /Library/Frameworks/R.framework/Versions/4.0/Resources/library/scales
    sessioninfo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/sessioninfo
    shiny /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shiny
    shinyjs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shinyjs
    shinystan /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shinystan
    shinythemes /Library/Frameworks/R.framework/Versions/4.0/Resources/library/shinythemes
    StanHeaders /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders
    statmod /Library/Frameworks/R.framework/Versions/4.0/Resources/library/statmod
    stringi /Library/Frameworks/R.framework/Versions/4.0/Resources/library/stringi
    stringr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/stringr
    survival /Library/Frameworks/R.framework/Versions/4.0/Resources/library/survival
    svglite /Library/Frameworks/R.framework/Versions/4.0/Resources/library/svglite
    systemfonts /Library/Frameworks/R.framework/Versions/4.0/Resources/library/systemfonts
    TH.data /Library/Frameworks/R.framework/Versions/4.0/Resources/library/TH.data
    threejs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/threejs
    tibble /Library/Frameworks/R.framework/Versions/4.0/Resources/library/tibble
    tidyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/tidyr
    tidyselect /Library/Frameworks/R.framework/Versions/4.0/Resources/library/tidyselect
    utf8 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/utf8
    uuid /Library/Frameworks/R.framework/Versions/4.0/Resources/library/uuid
    V8 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/V8
    vctrs /Library/Frameworks/R.framework/Versions/4.0/Resources/library/vctrs
    vipor /Library/Frameworks/R.framework/Versions/4.0/Resources/library/vipor
    viridisLite /Library/Frameworks/R.framework/Versions/4.0/Resources/library/viridisLite
    webshot /Library/Frameworks/R.framework/Versions/4.0/Resources/library/webshot
    withr /Library/Frameworks/R.framework/Versions/4.0/Resources/library/withr
    xfun /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xfun
    xml2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xml2
    xtable /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xtable
    xts /Library/Frameworks/R.framework/Versions/4.0/Resources/library/xts
    yaml /Library/Frameworks/R.framework/Versions/4.0/Resources/library/yaml
    zip /Library/Frameworks/R.framework/Versions/4.0/Resources/library/zip
    zoo /Library/Frameworks/R.framework/Versions/4.0/Resources/library/zoo
    Table continues below
      attached is_base date
    abind FALSE FALSE 2016-07-21
    assertthat FALSE FALSE 2019-03-21
    backports FALSE FALSE 2020-12-09
    base64enc FALSE FALSE 2015-07-28
    bayesplot TRUE FALSE 2021-01-10
    beeswarm FALSE FALSE 2016-04-25
    bookdown FALSE FALSE 2021-04-22
    boot TRUE FALSE 2021-02-12
    bridgesampling FALSE FALSE 2020-02-26
    brms TRUE FALSE 2021-02-26
    Brobdingnag FALSE FALSE 2018-08-13
    bslib FALSE FALSE 2021-01-25
    callr FALSE FALSE 2020-10-13
    cellranger FALSE FALSE 2016-07-27
    checkmate FALSE FALSE 2020-02-06
    cli FALSE FALSE 2021-04-26
    cmdstanr TRUE FALSE 2020-12-07
    coda FALSE FALSE 2020-09-30
    codetools FALSE FALSE 2020-11-04
    collapse FALSE FALSE 2021-01-12
    colorspace TRUE FALSE 2020-11-11
    colourpicker FALSE FALSE 2020-09-14
    crayon FALSE FALSE 2021-02-08
    crosstalk FALSE FALSE 2021-01-12
    curl FALSE FALSE 2019-12-02
    data.table TRUE FALSE 2020-12-30
    DBI FALSE FALSE 2021-01-15
    digest FALSE FALSE 2020-10-24
    distributional FALSE FALSE 2021-02-02
    dplyr TRUE FALSE 2021-05-05
    DT FALSE FALSE 2021-01-06
    dygraphs FALSE FALSE 2018-07-11
    ellipsis FALSE FALSE 2021-04-29
    emmeans FALSE FALSE 2021-02-03
    english TRUE FALSE 2020-01-26
    EnvStats TRUE FALSE 2020-10-21
    estimability FALSE FALSE 2018-02-11
    evaluate FALSE FALSE 2019-05-28
    extraDistr TRUE FALSE 2020-09-07
    fansi FALSE FALSE 2021-01-15
    farver FALSE FALSE 2020-01-16
    fastmap FALSE FALSE 2021-01-25
    flextable TRUE FALSE 2021-02-01
    forcats FALSE FALSE 2021-01-27
    gamm4 FALSE FALSE 2020-04-03
    gdtools FALSE FALSE 2021-01-06
    generics FALSE FALSE 2020-10-31
    ggbeeswarm TRUE FALSE 2017-08-07
    ggdist TRUE FALSE 2021-01-04
    ggmosaic TRUE FALSE 2021-02-23
    ggplot2 TRUE FALSE 2020-12-30
    ggrepel FALSE FALSE 2021-01-15
    ggridges FALSE FALSE 2021-01-08
    ggthemes TRUE FALSE 2021-01-20
    glue FALSE FALSE 2020-08-27
    gridExtra FALSE FALSE 2017-09-09
    gtable FALSE FALSE 2019-03-25
    gtools FALSE FALSE 2020-03-31
    hdrcde FALSE FALSE 2021-01-18
    here TRUE FALSE 2020-12-13
    highr FALSE FALSE 2021-04-16
    htmltools FALSE FALSE 2021-01-22
    htmlwidgets FALSE FALSE 2020-12-10
    httpuv FALSE FALSE 2021-01-13
    httr FALSE FALSE 2020-07-20
    igraph FALSE FALSE 2020-10-06
    inline FALSE FALSE 2020-12-01
    isoband FALSE FALSE 2020-12-01
    jquerylib FALSE FALSE 2020-12-17
    jsonlite FALSE FALSE 2020-12-09
    kableExtra TRUE FALSE 2021-02-20
    knitr TRUE FALSE 2021-04-24
    labeling FALSE FALSE 2020-10-20
    later FALSE FALSE 2020-06-05
    lattice FALSE FALSE 2020-04-02
    lazyeval FALSE FALSE 2019-03-15
    lifecycle FALSE FALSE 2021-02-15
    lme4 FALSE FALSE 2020-12-01
    loo FALSE FALSE 2020-12-09
    magrittr TRUE FALSE 2020-11-17
    markdown FALSE FALSE 2019-08-07
    MASS TRUE FALSE 2021-02-12
    Matrix FALSE FALSE 2021-01-06
    matrixStats FALSE FALSE 2021-01-29
    mgcv FALSE FALSE 2021-02-16
    mime FALSE FALSE 2021-02-13
    miniUI FALSE FALSE 2018-05-18
    minqa FALSE FALSE 2014-10-09
    multcomp FALSE FALSE 2021-02-08
    munsell FALSE FALSE 2018-06-12
    mvtnorm FALSE FALSE 2020-06-09
    nlme FALSE FALSE 2021-02-04
    nloptr FALSE FALSE 2020-07-02
    officer TRUE FALSE 2021-01-04
    pander TRUE FALSE 2018-11-06
    patchwork TRUE FALSE 2020-12-17
    pillar FALSE FALSE 2021-05-16
    pkgbuild FALSE FALSE 2020-12-15
    pkgconfig FALSE FALSE 2019-09-22
    plotly FALSE FALSE 2021-01-10
    plyr FALSE FALSE 2020-03-03
    posterior TRUE FALSE 2020-12-07
    prettyunits FALSE FALSE 2020-01-24
    processx FALSE FALSE 2020-11-30
    productplots FALSE FALSE 2016-07-02
    projpred FALSE FALSE 2020-10-28
    promises FALSE FALSE 2021-02-11
    ps FALSE FALSE 2020-12-05
    purrr FALSE FALSE 2020-04-17
    R6 FALSE FALSE 2020-10-28
    Rcpp TRUE FALSE 2021-01-15
    RcppArmadillo FALSE FALSE 2021-02-09
    RcppEigen FALSE FALSE 2020-12-17
    RcppParallel FALSE FALSE 2021-02-24
    readxl TRUE FALSE 2019-03-13
    reshape FALSE FALSE 2018-10-23
    reshape2 FALSE FALSE 2020-04-09
    rjson TRUE FALSE 2018-06-08
    rlang FALSE FALSE 2021-04-30
    rmarkdown FALSE FALSE 2021-05-07
    rprojroot FALSE FALSE 2020-11-15
    rsconnect FALSE FALSE 2019-12-13
    rstan FALSE FALSE 2021-02-19
    rstanarm TRUE FALSE 2020-07-20
    rstantools FALSE FALSE 2020-07-06
    rstudioapi FALSE FALSE 2020-11-12
    rvest FALSE FALSE 2020-07-25
    sandwich FALSE FALSE 2020-10-02
    sass FALSE FALSE 2021-01-24
    scales FALSE FALSE 2020-05-11
    sessioninfo TRUE FALSE 2018-11-05
    shiny FALSE FALSE 2021-01-25
    shinyjs FALSE FALSE 2020-09-09
    shinystan FALSE FALSE 2018-05-01
    shinythemes FALSE FALSE 2021-01-25
    StanHeaders FALSE FALSE 2021-02-19
    statmod FALSE FALSE 2020-10-19
    stringi FALSE FALSE 2021-05-17
    stringr FALSE FALSE 2019-02-10
    survival FALSE FALSE 2020-09-28
    svglite FALSE FALSE 2020-07-07
    systemfonts FALSE FALSE 2021-02-09
    TH.data FALSE FALSE 2019-01-21
    threejs FALSE FALSE 2020-01-21
    tibble FALSE FALSE 2021-05-16
    tidyr FALSE FALSE 2021-03-03
    tidyselect FALSE FALSE 2021-04-30
    utf8 FALSE FALSE 2021-03-12
    uuid FALSE FALSE 2020-02-26
    V8 FALSE FALSE 2020-11-04
    vctrs FALSE FALSE 2021-04-29
    vipor FALSE FALSE 2017-03-22
    viridisLite FALSE FALSE 2018-02-01
    webshot FALSE FALSE 2019-11-22
    withr FALSE FALSE 2021-01-26
    xfun FALSE FALSE 2021-05-15
    xml2 FALSE FALSE 2020-04-23
    xtable FALSE FALSE 2019-04-21
    xts FALSE FALSE 2020-09-09
    yaml FALSE FALSE 2020-02-01
    zip FALSE FALSE 2020-08-27
    zoo FALSE FALSE 2020-05-02
    Table continues below
      source md5ok
    abind CRAN (R 4.0.0) NA
    assertthat CRAN (R 4.0.0) NA
    backports CRAN (R 4.0.2) NA
    base64enc CRAN (R 4.0.0) NA
    bayesplot CRAN (R 4.0.2) NA
    beeswarm CRAN (R 4.0.0) NA
    bookdown CRAN (R 4.0.2) NA
    boot CRAN (R 4.0.2) NA
    bridgesampling CRAN (R 4.0.0) NA
    brms Github () NA
    Brobdingnag CRAN (R 4.0.0) NA
    bslib CRAN (R 4.0.2) NA
    callr CRAN (R 4.0.2) NA
    cellranger CRAN (R 4.0.0) NA
    checkmate CRAN (R 4.0.0) NA
    cli CRAN (R 4.0.2) NA
    cmdstanr local NA
    coda CRAN (R 4.0.2) NA
    codetools CRAN (R 4.0.2) NA
    collapse CRAN (R 4.0.2) NA
    colorspace CRAN (R 4.0.2) NA
    colourpicker CRAN (R 4.0.2) NA
    crayon CRAN (R 4.0.2) NA
    crosstalk CRAN (R 4.0.2) NA
    curl CRAN (R 4.0.0) NA
    data.table CRAN (R 4.0.2) NA
    DBI CRAN (R 4.0.2) NA
    digest CRAN (R 4.0.2) NA
    distributional CRAN (R 4.0.2) NA
    dplyr CRAN (R 4.0.2) NA
    DT CRAN (R 4.0.2) NA
    dygraphs CRAN (R 4.0.0) NA
    ellipsis CRAN (R 4.0.2) NA
    emmeans CRAN (R 4.0.2) NA
    english CRAN (R 4.0.0) NA
    EnvStats CRAN (R 4.0.2) NA
    estimability CRAN (R 4.0.0) NA
    evaluate CRAN (R 4.0.0) NA
    extraDistr CRAN (R 4.0.2) NA
    fansi CRAN (R 4.0.2) NA
    farver CRAN (R 4.0.0) NA
    fastmap CRAN (R 4.0.2) NA
    flextable CRAN (R 4.0.2) NA
    forcats CRAN (R 4.0.2) NA
    gamm4 CRAN (R 4.0.2) NA
    gdtools CRAN (R 4.0.2) NA
    generics CRAN (R 4.0.2) NA
    ggbeeswarm CRAN (R 4.0.0) NA
    ggdist CRAN (R 4.0.2) NA
    ggmosaic CRAN (R 4.0.2) NA
    ggplot2 CRAN (R 4.0.2) NA
    ggrepel CRAN (R 4.0.2) NA
    ggridges CRAN (R 4.0.2) NA
    ggthemes CRAN (R 4.0.2) NA
    glue CRAN (R 4.0.2) NA
    gridExtra CRAN (R 4.0.0) NA
    gtable CRAN (R 4.0.0) NA
    gtools CRAN (R 4.0.0) NA
    hdrcde CRAN (R 4.0.2) NA
    here CRAN (R 4.0.2) NA
    highr CRAN (R 4.0.2) NA
    htmltools CRAN (R 4.0.2) NA
    htmlwidgets CRAN (R 4.0.2) NA
    httpuv CRAN (R 4.0.2) NA
    httr CRAN (R 4.0.2) NA
    igraph CRAN (R 4.0.2) NA
    inline CRAN (R 4.0.2) NA
    isoband CRAN (R 4.0.2) NA
    jquerylib CRAN (R 4.0.2) NA
    jsonlite CRAN (R 4.0.2) NA
    kableExtra CRAN (R 4.0.2) NA
    knitr CRAN (R 4.0.2) NA
    labeling CRAN (R 4.0.2) NA
    later CRAN (R 4.0.0) NA
    lattice CRAN (R 4.0.2) NA
    lazyeval CRAN (R 4.0.0) NA
    lifecycle CRAN (R 4.0.2) NA
    lme4 CRAN (R 4.0.2) NA
    loo CRAN (R 4.0.2) NA
    magrittr CRAN (R 4.0.2) NA
    markdown CRAN (R 4.0.0) NA
    MASS CRAN (R 4.0.2) NA
    Matrix CRAN (R 4.0.2) NA
    matrixStats CRAN (R 4.0.2) NA
    mgcv CRAN (R 4.0.2) NA
    mime CRAN (R 4.0.2) NA
    miniUI CRAN (R 4.0.0) NA
    minqa CRAN (R 4.0.0) NA
    multcomp CRAN (R 4.0.2) NA
    munsell CRAN (R 4.0.0) NA
    mvtnorm CRAN (R 4.0.0) NA
    nlme CRAN (R 4.0.2) NA
    nloptr CRAN (R 4.0.2) NA
    officer CRAN (R 4.0.2) NA
    pander CRAN (R 4.0.2) NA
    patchwork CRAN (R 4.0.2) NA
    pillar CRAN (R 4.0.2) NA
    pkgbuild CRAN (R 4.0.2) NA
    pkgconfig CRAN (R 4.0.0) NA
    plotly CRAN (R 4.0.2) NA
    plyr CRAN (R 4.0.0) NA
    posterior local NA
    prettyunits CRAN (R 4.0.0) NA
    processx CRAN (R 4.0.2) NA
    productplots CRAN (R 4.0.2) NA
    projpred CRAN (R 4.0.2) NA
    promises CRAN (R 4.0.2) NA
    ps CRAN (R 4.0.2) NA
    purrr CRAN (R 4.0.0) NA
    R6 CRAN (R 4.0.2) NA
    Rcpp CRAN (R 4.0.2) NA
    RcppArmadillo CRAN (R 4.0.2) NA
    RcppEigen CRAN (R 4.0.2) NA
    RcppParallel CRAN (R 4.0.2) NA
    readxl CRAN (R 4.0.0) NA
    reshape CRAN (R 4.0.2) NA
    reshape2 CRAN (R 4.0.0) NA
    rjson CRAN (R 4.0.2) NA
    rlang CRAN (R 4.0.2) NA
    rmarkdown CRAN (R 4.0.2) NA
    rprojroot CRAN (R 4.0.2) NA
    rsconnect CRAN (R 4.0.0) NA
    rstan local NA
    rstanarm CRAN (R 4.0.2) NA
    rstantools CRAN (R 4.0.2) NA
    rstudioapi CRAN (R 4.0.2) NA
    rvest CRAN (R 4.0.2) NA
    sandwich CRAN (R 4.0.2) NA
    sass CRAN (R 4.0.2) NA
    scales CRAN (R 4.0.0) NA
    sessioninfo CRAN (R 4.0.0) NA
    shiny CRAN (R 4.0.2) NA
    shinyjs CRAN (R 4.0.2) NA
    shinystan CRAN (R 4.0.0) NA
    shinythemes CRAN (R 4.0.2) NA
    StanHeaders local NA
    statmod CRAN (R 4.0.2) NA
    stringi CRAN (R 4.0.2) NA
    stringr CRAN (R 4.0.0) NA
    survival CRAN (R 4.0.2) NA
    svglite CRAN (R 4.0.2) NA
    systemfonts CRAN (R 4.0.2) NA
    TH.data CRAN (R 4.0.0) NA
    threejs CRAN (R 4.0.0) NA
    tibble CRAN (R 4.0.2) NA
    tidyr CRAN (R 4.0.2) NA
    tidyselect CRAN (R 4.0.2) NA
    utf8 CRAN (R 4.0.2) NA
    uuid CRAN (R 4.0.2) NA
    V8 CRAN (R 4.0.2) NA
    vctrs CRAN (R 4.0.2) NA
    vipor CRAN (R 4.0.0) NA
    viridisLite CRAN (R 4.0.0) NA
    webshot CRAN (R 4.0.0) NA
    withr CRAN (R 4.0.2) NA
    xfun CRAN (R 4.0.2) NA
    xml2 CRAN (R 4.0.0) NA
    xtable CRAN (R 4.0.0) NA
    xts CRAN (R 4.0.2) NA
    yaml CRAN (R 4.0.0) NA
    zip CRAN (R 4.0.2) NA
    zoo CRAN (R 4.0.0) NA
      library
    abind /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    assertthat /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    backports /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    base64enc /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    bayesplot /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    beeswarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    bookdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    boot /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    bridgesampling /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    brms /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    Brobdingnag /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    bslib /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    callr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    cellranger /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    checkmate /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    cli /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    cmdstanr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    coda /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    codetools /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    collapse /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    colorspace /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    colourpicker /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    crayon /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    crosstalk /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    curl /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    data.table /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    DBI /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    digest /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    distributional /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    dplyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    DT /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    dygraphs /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ellipsis /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    emmeans /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    english /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    EnvStats /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    estimability /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    evaluate /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    extraDistr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    fansi /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    farver /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    fastmap /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    flextable /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    forcats /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    gamm4 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    gdtools /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    generics /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggbeeswarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggdist /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggmosaic /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggplot2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggrepel /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggridges /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ggthemes /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    glue /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    gridExtra /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    gtable /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    gtools /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    hdrcde /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    here /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    highr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    htmltools /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    htmlwidgets /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    httpuv /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    httr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    igraph /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    inline /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    isoband /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    jquerylib /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    jsonlite /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    kableExtra /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    knitr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    labeling /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    later /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    lattice /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    lazyeval /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    lifecycle /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    lme4 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    loo /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    magrittr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    markdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    MASS /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    Matrix /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    matrixStats /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    mgcv /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    mime /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    miniUI /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    minqa /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    multcomp /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    munsell /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    mvtnorm /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    nlme /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    nloptr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    officer /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    pander /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    patchwork /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    pillar /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    pkgbuild /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    pkgconfig /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    plotly /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    plyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    posterior /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    prettyunits /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    processx /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    productplots /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    projpred /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    promises /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    ps /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    purrr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    R6 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    Rcpp /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    RcppArmadillo /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    RcppEigen /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    RcppParallel /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    readxl /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    reshape /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    reshape2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rjson /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rlang /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rmarkdown /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rprojroot /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rsconnect /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rstan /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rstanarm /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rstantools /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rstudioapi /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    rvest /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    sandwich /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    sass /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    scales /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    sessioninfo /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    shiny /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    shinyjs /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    shinystan /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    shinythemes /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    StanHeaders /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    statmod /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    stringi /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    stringr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    survival /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    svglite /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    systemfonts /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    TH.data /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    threejs /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    tibble /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    tidyr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    tidyselect /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    utf8 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    uuid /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    V8 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    vctrs /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    vipor /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    viridisLite /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    webshot /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    withr /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    xfun /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    xml2 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    xtable /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    xts /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    yaml /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    zip /Library/Frameworks/R.framework/Versions/4.0/Resources/library
    zoo /Library/Frameworks/R.framework/Versions/4.0/Resources/library

References

Betancourt, Michael. 2016. Diagnosing Suboptimal Cotangent Disintegrations in Hamiltonian Monte Carlo,” April. http://arxiv.org/abs/1604.00695.
Bürkner, Paul-Christian. 2017. brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.
———. 2018. “Advanced Bayesian Multilevel Modeling with the R Package brms.” The R Journal 10 (1): 395–411. https://doi.org/10.32614/RJ-2018-017.
Carpenter, Bob, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A Probabilistic Programming Language.” Journal of Statistical Software 76 (1): 1–32.
Gabry, Jonah, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman. 2019. “Visualization in Bayesian Workflow.” J. R. Stat. Soc. A 182: 389–402. https://doi.org/10.1111/rssa.12378.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2020. “Rstanarm: Bayesian Applied Regression Modeling via Stan.” https://mc-stan.org/rstanarm.
Kampen, Jeroen J A van, David A M C van de Vijver, Pieter L A Fraaij, Bart L Haagmans, Mart M Lamers, Nisreen Okba, Johannes P C van den Akker, et al. 2020. Shedding of infectious virus in hospitalized patients with coronavirus disease-2019 (COVID-19): duration and key determinants.” medRxiv, June, 2020.06.08.20125310.
Kucukelbir, Alp, Rajesh Ranganath, Andrew Gelman, and David M Blei. 2015. Automatic Variational Inference in Stan,” June. http://arxiv.org/abs/1506.03431.
Perera, Ranawaka Apm, Eugene Tso, Owen T Y Tsang, Dominic N C Tsang, Kitty Fung, Yonna W Y Leung, Alex W H Chin, et al. 2020. SARS-CoV-2 virus culture from the upper respiratory tract: Correlation with viral load, subgenomic viral RNA and duration of illness.” medRxiv, July, 2020.07.08.20148783.
Vehtari, Aki, Andrew Gelman, Daniel Simpson, Bob Carpenter, and Paul-Christian Bürkner. 2020. Rank-Normalization, Folding, and Localization: An Improved \(\widehat\{R\}\) for Assessing Convergence of MCMC.” Bayesian Analysis.
Wood, S. N. 2003. “Thin-Plate Regression Splines.” Journal of the Royal Statistical Society (B) 65 (1): 95–114.
Wölfel, Roman, Victor M Corman, Wolfgang Guggemos, Michael Seilmaier, Sabine Zange, Marcel A Müller, Daniela Niemeyer, et al. 2020. Virological assessment of hospitalized patients with COVID-2019.” Nature 581 (7809): 465–69.

  1. i.e., 95% of the mass of the posterior are below 0.21, where a one unit increase corresponds to an increase of the estimated parameters by the factor 1.23↩︎

  2. Given that subject data is sparse, this is, however, unlikely to happen.↩︎