Journal article Open Access

Use of a null assumption to re-analyze data collected through a rolling cohort subject to selection bias due to informative censoring

Mark Reeder

A novel method of estimating selection bias due to informative censoring in a matched rolling cohort is demonstrated on a recently published and highly influential study. The bias arises chiefly because those with Covid-19 symptoms are prevented from obtaining a vaccine. The study explored the efficacy of the BNT162b2 mRNA Covid-19 vaccine in a nationwide mass vaccination setting in Israel, and was conducted over a 44-day study period spanning December 2020 through early February 2021. The present approach uses the published data to establish the population entering the study; the number of matches remaining at the end of the study period is also known. Since the time-wise distribution of those censored was not made available, two different distribution patterns were compared. Simple probability rules were applied to estimate the number who rolled out of the cohort on any given day. Under those circumstances, a null assumption (i.e., that the vaccine has no effect) for the exposed group clearly led to a different value for the outcome of interest, demonstrating the effect of the bias. The expected value from the null assumption could then be compared to the actual measured value to yield efficacy. It was found that the time-wise distribution of those rolling out strongly influenced the level of bias. Discussion of the process includes a brief overview of quantities known from the Kaplan-Meier method of data presentation, from which several relevant inferences may be drawn. First, a transparent listing of the number of matches which both enter and exit the cohort for all possible combinations of days within the study period would lead to a more robust approximation of the selection bias due to informative censoring; the literature suggests that such a practice be implemented. Second, the literature also suggests that the estimate would likely be further improved if hazard ratios were modeled, and this effect was confirmed herein.
Two other examples from the literature are explored and lead to a question: Is it possible that a similar selection bias due to informative censoring might also be responsible, at least in part, for the discrepancy between research studies that report high vaccine efficacy and the relatively poor outcomes observed in real-world environments? If so, the use of a booster to improve outcomes is called into question. All studies and reports on Covid-19 vaccine efficacy should be assessed to determine the significance of selection bias due to informative censoring, and should provide a fully transparent description of how this often-neglected bias is taken into account.
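
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the procedure used in the manuscript. It assumes, for simplicity, that all matches enter on day 0, that censoring follows a chosen set of per-day weights, and that the daily event rate under the null is constant. The function names and every number below are hypothetical placeholders, not values from the study.

```python
def at_risk_profile(n_start, n_end, days, weights):
    """Matches still at risk at the start of each day.

    The (n_start - n_end) censored matches are spread over the study
    period in proportion to the per-day `weights` (an assumed pattern).
    """
    total_censored = n_start - n_end
    total_weight = sum(weights)
    profile = []
    remaining = float(n_start)
    for d in range(days):
        profile.append(remaining)
        remaining -= total_censored * weights[d] / total_weight
    return profile


def expected_events_under_null(profile, daily_rates):
    """Expected events in the exposed group if the vaccine had no effect."""
    return sum(n * r for n, r in zip(profile, daily_rates))


def efficacy(observed_events, expected_events):
    """Efficacy as one minus the ratio of observed to null-expected events."""
    return 1.0 - observed_events / expected_events


# Hypothetical inputs, chosen only to make the example run.
DAYS = 44
N_START, N_END = 100_000, 20_000
rates = [0.0005] * DAYS  # assumed constant daily event rate under the null

uniform = [1] * DAYS                           # censoring spread evenly
front_loaded = [DAYS - d for d in range(DAYS)]  # censoring concentrated early

expected_uniform = expected_events_under_null(
    at_risk_profile(N_START, N_END, DAYS, uniform), rates)
expected_front = expected_events_under_null(
    at_risk_profile(N_START, N_END, DAYS, front_loaded), rates)

print(expected_uniform, expected_front)
```

With the same entering and remaining counts, the two assumed censoring patterns give different null-expected values, so the efficacy computed from a fixed observed count also differs. This mirrors, in a simplified way, the abstract's remark that the time-wise distribution of those rolling out influences the result.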

I realize the title is bland, but the content is anything but... This manuscript has been submitted to the journal "Measurement Science and Technology" and is presently out for review.