Discrete time chain binomial model to estimate vaccine efficacy

We estimate transmission rates using a chain binomial model where we assume that the rate of susceptible animals acquiring infection per unit time is given by:

\[ R(S\to I) = \beta \frac{(I+\epsilon_{I} I^{V})}{N}\]

where \(I\) is the number of infected (unvaccinated) animals, \(I^{V}\) the number of vaccinated animals that subsequently became infected, \(N\) is the size of the group and \(\beta\) is the per capita transmission rate per infectious animal. The direct vaccine efficacy (\(\epsilon_{S}\)) reduces the rate that vaccinated animals become infected relative to controls. \(\epsilon_{I}\) measures the reduction in infectiousness of vaccinated animals that become infected (again relative to unvaccinated controls.)

In formulating a chain binomial model we make the approximation that the rate of transmission is constant between the two time points and depends only on the time between the observations (\(\Delta t\)) and the values of the state variables (\(S_{t},V_{t},I_{t},I^{V}_{t},N\)) at the earlier time point. Assuming this the probability of transmission occurring in a discrete interval of time \(\Delta t\) is:

\[ P(S \to I) = 1 - e^{\Delta t \beta \frac{(I_{t-1}+\epsilon_{I} I_{t-1}^{V})}{N_{t-1}}} \]

\[ P(V \to I^V) = 1 - e^{\Delta t \epsilon_{S} \beta \frac{(I_{t-1}+\epsilon_{I} I_{t-1}^{V})}{N_{t-1}}} \]

The number of new cases (\(C\)) in the interval between \(t\) and \(\Delta t\) for susceptibles will then be:

\[ C(t,t+\Delta t) \sim \mathrm{Binom}(S_{t}, P(S \to I)) \]

and likewise for vaccinates (\(C_{v}\)):

\[ C_{v}(t,t+\Delta t) \sim \mathrm{Binom}(S_{t}, P(S \to I)) \]

As the probability of infection takes the same functional form as the complimentary log-log transformation, the transmission rate (\(\beta\)) and vaccine efficacy (\(\epsilon_{S}\)) can be estimated using a generalised linear model (GLM) with binomial response (and complementary log-log link function).

Loss of animals and staggered test dates during Phase II

As per the experimental protocol a number of animals had to be euthanized on welfare grounds before the end of the experiment. A total of 5, 4, 4, and 3 animals in total were removed early from groups 1,2,3 and 4 respectively. The majority of removals in each group (except for group 3) were seeders consisting of 3/5, 3/4, 1/4 and 2/3 respectively.

ggplot(DST1_tot_def %>% filter(Treatment == 'C') %>% group_by(Group,t,Batch) %>% select(N_S) %>% arrange(Group,t),aes(x=t,y=N_S,col=Batch)) + geom_path() + facet_wrap(~Group) + geom_point() + xlab('Time') + ylab('Number of Seeders')
## Adding missing grouping variables: `Group`, `t`, `Batch`

For the purposes of analysis we assume that the number of infectious animals for each interval between successive tests consists of the total number of seeder animals and sentinel animals with a previous positive test result that were present within the barn for the whole period. Sentinel animals that were removed before testing positive will not contribute to estimates of transmission rates except with respect to the size of the group which is also updated for each observation interval. In Phase II, delays in the recruitment of calves meant that there were two parallel entry points into Groups 3 and 4. As a consequence the numbers of infectious (and surviving) animals between successive tests are different for these two batches of animals (I and II) and for analysis are treated as two subgroups.

ggplot(DST1_tot_def %>% filter(Treatment == 'C') %>% group_by(Group,t,Batch) %>% select(N) %>% arrange(Group,t),aes(x=t,y=N,col=Batch)) + geom_path() + facet_wrap(~Group) + geom_point() + xlab('Time') + ylab('Group Size')
## Adding missing grouping variables: `Group`, `t`, `Batch`

Logical infection history

Infectious status is inferred through the response to a set of candidate DIVA (Differentiates Infected from Vaccinated Animals) tests. For the two variants of the DIVA test (DST1 and DST10, IGRA and skin tests respectively) we construct a logical infection history where animals are considered to be infected - and potentially infectious - from the first positive test. For our main analysis we focus on the DST1 (IGRA) which was carried out at two monthly intervals for sentinel animals. The number of new infections per observation period within each group is calculated as the number of animals that switch diagnostic status within that interval.

In Phase II seeder animals were tested, but only at a lower frequency than the sentinels. As such, we need to impute number of infectious seeder animals in groups 3 & 4 based on their status at the end of Phase I and end of Phase II. Given the small number of animals that changed status between these two points we did not expect this to affect our results, but to carry out a sensitivity analysis we considered three methods of imputation. The default method (presented below) we change the status of seeder animals only at the point that additional tests were carried out. To set absolute upper and lower bounds we also considered the cases where seeders were assumed to be fixed to their status at the end of Phase I (“minimum”) or end of Phase II (“maximum”). In Appendix 2 we present a comparison of the estimates based on these three methods of imputation and for the two alternative diagnostic tests (DST1 and DST10).

Logical infection history based on DST1 (IGRA) test status

Interim analysis and comparison of attack rates between experimental groups

An interim analysis was carried out at the end of Phase I to ensure that the one year contact time was sufficient to ensure that the stop/go conditions for progression to Phase II were satisfied:

  • An experimental reproduction ratio of > 1.5
  • That at least half of the unvaccinated sentinels had converted to be test-positive

To estimate the experimental reproduction ratio we fitted a reduced model to the Phase I data with treatment as the only explanatory variable. Here we fit the same model to each experimental group in turn to explore the consistency of the attack rate within these barns.

# Fit reduced model to each experimental group
stan_glm.fitg1<-stan_glm(cbind(C,S) ~ Treatment + offset(log(deltat*(IC+I_S+IV+IV_S)/N)),
family=binomial(link=cloglog),
data=DST1_tot_def %>% filter(Group==1) %>% mutate(FIV=(IV+IV_S)/(IC+IV+IV_S+I_S)),refresh=0,chains = 8, iter = 20000)

stan_glm.fitg2<-stan_glm(cbind(C,S) ~ Treatment + offset(log(deltat*(IC+I_S+IV+IV_S)/N)),
family=binomial(link=cloglog),
data=DST1_tot_def %>% filter(Group==2) %>% mutate(FIV=(IV+IV_S)/(IC+IV+IV_S+I_S)),refresh=0,chains = 8, iter = 20000)

stan_glm.fitg3<-stan_glm(cbind(C,S) ~ Treatment + offset(log(deltat*(IC+I_S+IV+IV_S)/N)),
family=binomial(link=cloglog),
data=DST1_tot_def %>% filter(Group==3) %>% mutate(FIV=(IV+IV_S)/(IC+IV+IV_S+I_S)),refresh=0,chains = 8, iter = 20000)

stan_glm.fitg4<-stan_glm(cbind(C,S) ~ Treatment + offset(log(deltat*(IC+I_S+IV+IV_S)/N)),
family=binomial(link=cloglog),
data=DST1_tot_def %>% filter(Group==4) %>% mutate(FIV=(IV+IV_S)/(IC+IV+IV_S+I_S)),refresh=0,chains = 8, iter = 20000)

bayes_Rg = tibble(test='DST1',Group=1,treatment='Unvaccinated',R0=exp(as.data.frame(stan_glm.fitg1)[,1])*365)
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=1,treatment='Vaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fitg1)[,1:2]))*365))
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=2,treatment='Unvaccinated',
                                         R0=exp(as.data.frame(stan_glm.fitg2)[,1])*365))
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=2,treatment='Vaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fitg2)[,1:2]))*365))
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=3,treatment='Unvaccinated',
                                         R0=exp(as.data.frame(stan_glm.fitg3)[,1])*365))
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=3,treatment='Vaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fitg3)[,1:2]))*365))
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=4,treatment='Unvaccinated',
                                         R0=exp(as.data.frame(stan_glm.fitg4)[,1])*365))
bayes_Rg = bayes_Rg %>% bind_rows(tibble(test='DST1',Group=4,treatment='Vaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fitg4)[,1:2]))*365))

bayes_Rg <- bayes_Rg %>% mutate(Group = factor(Group))
pRG <- ggplot(bayes_Rg,aes(x=R0,col=Group,fill=Group)) + 
  geom_density(alpha=0.5) + 
  facet_wrap(~treatment) + xlab('Experimental R') + theme(text = element_text(size = 10))

ggsave(pRG, width=4.75, height=4.75/2,file="../Manuscript_figures/Fig_S3.pdf")
print(bayes_Rg %>% group_by(treatment,Group) %>% summarise(R = signif(median(R0),3), 
                                           lower = signif(quantile(R0,0.025),2), 
                                           upper = signif(quantile(R0,0.975),2)))
## # A tibble: 8 × 5
## # Groups:   treatment [2]
##   treatment    Group     R lower upper
##   <chr>        <fct> <dbl> <dbl> <dbl>
## 1 Unvaccinated 1     3.04  1.8     4.7
## 2 Unvaccinated 2     3.27  1.9     5.2
## 3 Unvaccinated 3     2.95  1.6     4.9
## 4 Unvaccinated 4     0.678 0.22    1.5
## 5 Vaccinated   1     1.36  0.71    2.3
## 6 Vaccinated   2     1.39  0.79    2.3
## 7 Vaccinated   3     0.835 0.31    1.8
## 8 Vaccinated   4     0.391 0.079   1.1
# Unvaccinated treatment, want to compare R from group 4 to groups 1,2,3

group_4 <- unlist(bayes_Rg %>% filter(Group==4 & treatment=='Unvaccinated') %>% select(R0))
group_123 <- unlist(bayes_Rg %>% filter(Group!=4 & treatment=='Unvaccinated') %>% select(R0))


# Vaccinated treatment, want to compare R from groups 3,4 to groups 1 & 2

group_3v <- unlist(bayes_Rg %>% filter(Group==3 & treatment=='Vaccinated') %>% select(R0))
group_4v <- unlist(bayes_Rg %>% filter(Group==4 & treatment=='Vaccinated') %>% select(R0))
group_12v <- unlist(bayes_Rg %>% filter(Group!=3 & Group!=4 & treatment=='Vaccinated') %>% select(R0))

The posterior probability that the transmission rate in group 4 is less than groups 1-3 is 1. The posterior probability that the transmission rate in group 3 is less than in groups 1&2 is 0.84. The posterior probability that the transmission rate in group 4 is less than in groups 1&2 is 0.98.

Default model estimates

We use the rstanarm package in R to estimate the model within a Bayesian framework and obtain posterior estimates for the vaccine efficacy and average base transmission rate within the two experimental barns. Convergence was assessed through standard MCMC diagnostics (presented in more detail in Appendix 1 below).

# DST 1
stan_glm.fit1<-stan_glm(cbind(C,S) ~ Treatment + FIV + offset(log(deltat*(IC+I_S+IV+IV_S)/N)),
family=binomial(link=cloglog),
data=DST1_tot_def %>% mutate(FIV=(IV+IV_S)/(IC+IV+IV_S+I_S)),refresh=0,chains = 8, iter = 20000)
bayes_eff = tibble(test='DST1',efficacy='Susceptibility',value=1-exp(as.data.frame(stan_glm.fit1)[,2]))
bayes_eff = bayes_eff %>% bind_rows(tibble(test='DST1',efficacy='Infectiousness',value=1-exp(as.data.frame(stan_glm.fit1)[,3])))
bayes_eff = bayes_eff %>% bind_rows(tibble(test='DST1',efficacy='Total',value=1-(exp(rowSums(as.data.frame(stan_glm.fit1)[,2:3])))))

bayes_R0 = tibble(test='DST1',group='Unvaccinated-Unvaccinated',R0=exp(as.data.frame(stan_glm.fit1)[,1])*365)
bayes_R0 = bayes_R0 %>% bind_rows(tibble(test='DST1',group='Unvaccinated-Vaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fit1)[,1:2]))*365))
bayes_R0 = bayes_R0 %>% bind_rows(tibble(test='DST1',group='Vaccinated-Unvaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fit1)[,c(1,3)]))*365))
bayes_R0 = bayes_R0 %>% bind_rows(tibble(test='DST1',group='Vaccinated-Vaccinated',
                                         R0=exp(rowSums(as.data.frame(stan_glm.fit1)[,c(1:3)]))*365))

direct <- bayes_eff %>% filter(efficacy=='Susceptibility') %>% select(value)
indirect <- bayes_eff %>% filter(efficacy=='Infectiousness') %>% select(value)

The posterior probability that the efficacy to reduce infectiousness is greater than the direct protection is: 0.8641875.

Posterior predictive checks

In order to assess the fit of the estimated model we form baysian predictive p-values based on the proportion of zeros, maximum and mean values and find no evidence for a significant lack of fit.

require(bayesplot)

y <- DST1_tot_def$C

y_pred <- posterior_predict(stan_glm.fit1)

color_scheme_set("brightblue")
ppc_dens_overlay(y, y_pred)

ppc_hist(y, y_pred[1:5, ])

prop_zero <- function(x) mean(x == 0)

ppc_stat(y, y_pred, stat = "prop_zero", binwidth = 0.005)

ppc_stat(y, y_pred, stat = "max")

ppc_stat(y, y_pred, stat = "mean")

Results (Main figure from manuscript)

## # A tibble: 3 × 4
##   efficacy         e_s lower upper
##   <chr>          <dbl> <dbl> <dbl>
## 1 Infectiousness 0.738  0.46  0.89
## 2 Susceptibility 0.577  0.34  0.73
## 3 Total          0.89   0.74  0.96
## # A tibble: 4 × 6
## # Groups:   test [1]
##   test  group                         R lower upper prob_1
##   <chr> <chr>                     <dbl> <dbl> <dbl>  <dbl>
## 1 DST1  Unvaccinated-Unvaccinated 3.11   2.3    4.1  0    
## 2 DST1  Unvaccinated-Vaccinated   1.31   0.9    1.8  0.075
## 3 DST1  Vaccinated-Unvaccinated   0.812  0.35   1.6  0.72 
## 4 DST1  Vaccinated-Vaccinated     0.343  0.14   0.7  1

Appendix 1: model tables and untransformed parameter estimates

Vaccine efficacy and transmission parameters are calculated via log transforms of the raw parameters. In this appendix we present the regression output for the transformed parameters for reference, along with convergence statistics (Rhat, msce, n_eff).

DST1

summary(stan_glm.fit1)
## 
## Model Info:
##  function:     stan_glm
##  family:       binomial [cloglog]
##  formula:      cbind(C, S) ~ Treatment + FIV + offset(log(deltat * (IC + I_S + 
##     IV + IV_S)/N))
##  algorithm:    sampling
##  sample:       80000 (posterior sample size)
##  priors:       see help('prior_summary')
##  observations: 72
##  predictors:   3
## 
## Estimates:
##               mean   sd   10%   50%   90%
## (Intercept) -4.8    0.2 -5.0  -4.8  -4.6 
## TreatmentV  -0.9    0.2 -1.2  -0.9  -0.6 
## FIV         -1.4    0.4 -1.9  -1.3  -0.9 
## 
## Fit Diagnostics:
##            mean   sd   10%   50%   90%
## mean_PPD 1.1    0.2  0.9   1.1   1.3  
## 
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
## 
## MCMC diagnostics
##               mcse Rhat n_eff
## (Intercept)   0.0  1.0  72319
## TreatmentV    0.0  1.0  51414
## FIV           0.0  1.0  40902
## mean_PPD      0.0  1.0  75969
## log-posterior 0.0  1.0  30615
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

Appendix 2: Sensitivity analysis

Comparison of estimates of vaccine efficacy for the two alternative diagnostic tests (DST1 & DST10) and three methods of imputing the status of seeder animals in Phase II. Estimates are consistent across all of the imputed trajectories.

## # A tibble: 18 × 6
## # Groups:   test, imputation [6]
##    test  imputation efficacy         e_s lower upper
##    <chr> <chr>      <chr>          <dbl> <dbl> <dbl>
##  1 DST1  Default    Infectiousness    74    46    89
##  2 DST1  Default    Susceptibility    58    34    73
##  3 DST1  Default    Total             89    74    96
##  4 DST1  Maximum    Infectiousness    77    53    90
##  5 DST1  Maximum    Susceptibility    57    34    73
##  6 DST1  Maximum    Total             90    77    96
##  7 DST1  Minimum    Infectiousness    72    42    88
##  8 DST1  Minimum    Susceptibility    58    34    73
##  9 DST1  Minimum    Total             88    71    96
## 10 DST10 Default    Infectiousness    68    32    87
## 11 DST10 Default    Susceptibility    46    14    66
## 12 DST10 Default    Total             83    57    94
## 13 DST10 Maximum    Infectiousness    71    39    88
## 14 DST10 Maximum    Susceptibility    46    14    66
## 15 DST10 Maximum    Total             84    61    94
## 16 DST10 Minimum    Infectiousness    67    30    87
## 17 DST10 Minimum    Susceptibility    46    14    66
## 18 DST10 Minimum    Total             82    55    94
## # A tibble: 18 × 6
## # Groups:   test, imputation [6]
##    test  imputation efficacy         e_s lower upper
##    <chr> <chr>      <chr>          <dbl> <dbl> <dbl>
##  1 DST1  Default    Infectiousness    74    46    89
##  2 DST1  Maximum    Infectiousness    77    53    90
##  3 DST1  Minimum    Infectiousness    72    42    88
##  4 DST10 Default    Infectiousness    68    32    87
##  5 DST10 Maximum    Infectiousness    71    39    88
##  6 DST10 Minimum    Infectiousness    67    30    87
##  7 DST1  Default    Susceptibility    58    34    73
##  8 DST1  Maximum    Susceptibility    57    34    73
##  9 DST1  Minimum    Susceptibility    58    34    73
## 10 DST10 Default    Susceptibility    46    14    66
## 11 DST10 Maximum    Susceptibility    46    14    66
## 12 DST10 Minimum    Susceptibility    46    14    66
## 13 DST1  Default    Total             89    74    96
## 14 DST1  Maximum    Total             90    77    96
## 15 DST1  Minimum    Total             88    71    96
## 16 DST10 Default    Total             83    57    94
## 17 DST10 Maximum    Total             84    61    94
## 18 DST10 Minimum    Total             82    55    94

Save posterior draws

dst1_post <- tibble(beta=exp(as.data.frame(stan_glm.fitg1)[,1]),
                    eff_S=1-exp(as.data.frame(stan_glm.fit1)[,2]),
                    eff_I=1-exp(as.data.frame(stan_glm.fit1)[,3]))

dst10_post <- tibble(beta=exp(as.data.frame(stan_glm.fit10)[,1]),
                    eff_S=1-exp(as.data.frame(stan_glm.fit10)[,2]),
                    eff_I=1-exp(as.data.frame(stan_glm.fit10)[,3]))

save(dst1_post,dst10_post,file='../Posterior/dst_post.RData')
write.table(dst1_post,file='../Posterior/dst1_post.csv',row.names=FALSE)
write.table(dst10_post,file='../Posterior/dst10_post.csv',row.names=FALSE)