COVID-19 early pandemic scenario in India compared to China and rest of the world: a data driven and model analysis

The coronavirus disease 2019 (COVID-19) global pandemic is ongoing and devastating. This study aimed to explore the early COVID-19 pandemic situation in the Indian context compared with that of China (the primary epicenter of the ongoing COVID-19 pandemic) and of countries outside China. Various epidemiologic parameters of the early COVID-19 pandemic in India were determined and compared with those of China and the rest of the world: by data-driven analysis, the basic reproduction number (R0), average reproduction number (R), effective reproduction number (Re), daily growth rate (DGR), case fatality rate (CFR) and case recovery rate (CRR); and by model analysis, R0, R, Re, serial interval (SI), transmission rate (β) and recovery rate (γ). The DGR, CRR, CFR, and SI of COVID-19 in India were 17%, 8.25%, 1.87%, and 5.76 days, respectively. The data-driven estimates of R0, R and Re were 1.03, 1.73, and 1.35, respectively. The exponential and SIR models gave higher estimates of R0, R and Re. The data-driven as well as estimated COVID-19 cases reflected the growing nature of the epidemic in India and the world excluding China, whereas those in China indicated that the affected population had been infected and had moved into the recovered stage. The Re values in India before and after lockdown were 1.62 and 1.37, respectively, with SI of 5.52 days and 5.98 days, respectively. The current findings reflected the effectiveness of lockdowns; therefore, strong social distancing is important for an early end of the COVID-19 pandemic.


Introduction
The coronavirus disease 2019 (COVID-19) is a rapidly spreading respiratory illness caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The COVID-19 pandemic started in December 2019 in Wuhan, China [1], and has spread worldwide. As of April 9, 2020, it had infected 1,604,252 people globally, including 81,907 cases with 3,336 deaths in China, and 1,522,345 cases with 92,378 deaths in 203 countries/territories/areas outside China [2]. In India, the first COVID-19 case was reported on January 30, 2020, followed by three cases on February 3, 2020, in students who had returned from Wuhan University; no new cases were reported until March 1, 2020, and two more positive cases were reported on March 2, 2020 [3]. Confirmed cases in India crossed 50 on March 10, 2020, amongst which 39 had an international travel history, and 16 out of the first 44 cases were foreign nationals. The number of confirmed cases increased to 100 on March 14, 2020, and crossed 1,000 on March 29, 2020, and 14,000 on April 17, 2020, with 486 deaths [2]. To contain the ongoing COVID-19 pandemic, India announced a 21-day lockdown that commenced on March 25, 2020 [3], and was thereafter extended up to May 3, 2020. To control the spread of the virus and to break the chain of transmission, non-pharmaceutical strategies, such as testing and isolation of COVID-19 cases, contact tracing and quarantine combined with social distancing, masking in public and practicing personal protection, are of great significance for early epidemic control [4][5][6]. In addition, exploring the epidemiological features of COVID-19 will help to understand the nature of the disease in the Indian context of the ongoing global COVID-19 pandemic.
Modeling plays a significant role in understanding the basic epidemiology of an infectious disease and in evaluating the effectiveness of control strategies. The current study explores the early pandemic situation of COVID-19 in India, compared with the situation in China, which remains the primary epicenter of the ongoing COVID-19 pandemic, as well as in countries outside China, utilizing data available up to April 9, 2020. This study thus estimates the epidemiologic parameters of the early COVID-19 pandemic, including the basic reproduction number (R0), average reproduction number (R), effective reproduction number (Re), serial interval (SI), transmission rate (β), and recovery rate (γ), based on publicly available epidemiological data with exponential and SIR (susceptible-infectious-recovered) model analysis, in order to identify the underlying epidemiologic pattern, evaluate the efficacy of control strategies, and forecast the epidemiologic dynamics.

Retrieval and processing of data
Data from the early stage of the ongoing COVID-19 pandemic, including the cumulative numbers of infected, infectious and recovered cases, and deaths, were retrieved from Worldometer, as available up to April 9, 2020, for three geographical locations: India, China, and countries outside China [2]. The lockdown effect in India was studied up to April 17, 2020, using data retrieved from Worldometer [2]. The data were processed to calculate the daily growth rate (DGR), case recovery rate (CRR), and case fatality rate (CFR).
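The three data-driven quantities can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's own processing script: the CFR is taken as cumulative deaths divided by cumulative cases (the definition given in the Discussion), the CRR is assumed to be the analogous ratio for recoveries, and the DGR is assumed to be the day-over-day percent change in cumulative cases. The function names are hypothetical. The death and case counts are those quoted in this paper for April 9, 2020.

```python
def daily_growth_rate(cumulative_cases):
    """Percent change in cumulative cases from one day to the next (assumed DGR definition)."""
    rates = []
    for prev, curr in zip(cumulative_cases, cumulative_cases[1:]):
        # Mark days with no prior cases as None to avoid dividing by zero.
        rates.append(None if prev == 0 else 100.0 * (curr - prev) / prev)
    return rates


def case_rate(events, cases):
    """Percent of cumulative cases with a given outcome (deaths for CFR, recoveries for CRR)."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * events / cases


# Cumulative deaths and cases as of April 9, 2020, as quoted in the text.
cfr_india = case_rate(events=227, cases=6_725)
cfr_china = case_rate(events=3_336, cases=81_907)
cfr_world_excl_china = case_rate(events=92_378, cases=1_522_345)

# Hypothetical cumulative series, used only to show the DGR calculation.
example_dgr = daily_growth_rate([100, 120, 150, 180])
```

With these inputs the CFR values round to the percentages reported in the Results for the three locations.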

Exponential model
The short-term prediction of the pandemic was done using an exponential model, with the nonlinear least squares method for data fitting and parameter estimation [7]. In India and the countries outside China, the pandemic dynamics were initially exponential with a growth rate λ, such that the size of the infected population at time t was given by $I(t) = I_0 e^{\lambda t}$, where $I_0$, a constant, was estimated from the fitted exponential curve of the cumulative number of infected cases. This exponential equation was used to determine the SI, which is the time period between the onset of symptoms in the infector (a primary COVID-19 patient) and the onset of symptoms in the infectee (a patient receiving the infection from the infector) [8]. The SI may be the same as the generation time (renewal time of the infected population), considering the onset of symptoms to be the same as the onset of infectiousness [9].
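A minimal sketch of a nonlinear least squares fit of the exponential curve is given below. It is not the fitting code used in the study: it assumes a Gauss-Newton iteration started from a log-linear fit, written with the standard library only, and the function name `fit_exponential` and the synthetic data are hypothetical.

```python
import math


def fit_exponential(days, cases, iterations=50):
    """Fit cases ~ i0 * exp(lam * t) by Gauss-Newton least squares."""
    # Starting guess from a straight-line fit to log(cases).
    n = len(days)
    logs = [math.log(c) for c in cases]
    t_mean = sum(days) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in days)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(days, logs))
    lam = sxy / sxx
    i0 = math.exp(y_mean - lam * t_mean)

    for _ in range(iterations):
        # Accumulate J^T J and J^T r for the current (i0, lam).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for t, c in zip(days, cases):
            e = math.exp(lam * t)
            r = c - i0 * e          # residual
            j1 = e                  # d(model)/d(i0)
            j2 = i0 * t * e         # d(model)/d(lam)
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        det = a11 * a22 - a12 * a12
        if det == 0:
            break
        # Solve the 2x2 normal equations for the update step.
        d_i0 = (a22 * g1 - a12 * g2) / det
        d_lam = (a11 * g2 - a12 * g1) / det
        i0 += d_i0
        lam += d_lam
        if abs(d_i0) < 1e-10 and abs(d_lam) < 1e-12:
            break
    return i0, lam


# Synthetic, noise-free data (not study data): i0 = 50, lam = 0.17 per day.
days = list(range(15))
cases = [50.0 * math.exp(0.17 * t) for t in days]
i0_hat, lam_hat = fit_exponential(days, cases)
```

On real case counts the residuals are not zero, so the Gauss-Newton step refines the log-linear starting values rather than leaving them unchanged.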
The R0 of COVID-19 was defined as the average number of secondary infections produced by an infected individual in an otherwise susceptible host population [10]. Using the probability density function (PDF) of the SI, R0 was expressed as $R_0 = 1/\mathrm{MGF}(-\lambda)$, where MGF is the moment generating function of the SI.
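The footnote to Table 1 states that the exponential-model R0 is based on the fitted growth rate and the MGF of the SI distribution. A common form of that relation is R0 = 1/M(-λ), where M is the MGF of the SI. The sketch below evaluates it under the assumption of a gamma-distributed SI; the gamma choice, the function name and the example inputs are illustrative assumptions and are not taken from this study.

```python
def r0_from_growth_rate(lam, si_mean, si_sd):
    """R0 = 1 / MGF(-lam), assuming a gamma-distributed serial interval."""
    shape = (si_mean / si_sd) ** 2      # gamma shape k
    scale = si_sd ** 2 / si_mean        # gamma scale theta
    # Gamma MGF: M(s) = (1 - scale * s) ** (-shape), valid for s < 1 / scale.
    mgf_at_minus_lam = (1.0 + scale * lam) ** (-shape)
    return 1.0 / mgf_at_minus_lam


# Hypothetical inputs: growth rate 0.1 per day, SI mean 5 days, SD 5 days.
# With mean equal to SD the gamma reduces to an exponential distribution,
# for which R0 = 1 + lam * mean.
r0_example = r0_from_growth_rate(lam=0.1, si_mean=5.0, si_sd=5.0)
```

A zero growth rate gives R0 = 1 for any SI distribution, which is a quick sanity check on the formula.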

SIR model
A time-varying SIR model was developed on the basis of the cumulative number of COVID-19 cases per day. The total population size (N) was divided into three mutually exclusive infection statuses, assuming that any infectious person (I) has a probability of contacting any susceptible person (S) and later recovers (R), so that $N = S + I + R$. The dynamics of the pandemic in the three geographical locations were modeled using the following three differential equations: $dS/dt = -\beta S I/N$, $dI/dt = \beta S I/N - \gamma I$, $dR/dt = \gamma I$, where $\beta$ and $\gamma$ represent the transmission rate and the recovery rate, respectively, defined as the probability of a susceptible-infected contact resulting in a new infection, and the probability of an infected case recovering and moving into the resistant phase, respectively. The reproduction number was calculated as the ratio of the transmission rate to the recovery rate obtained from SIR modeling. The Re, defined as the mean number of secondary cases generated by a primary case at time t in a population, was calculated as an indicator of COVID-19 transmission both before and after the interventions [10]. The dynamic SIR model determines the reproduction numbers and forecasts the end date and the final size of the epidemic, taking the cumulative infected cases and the total population of a country/region as the initial population size N in the optimal and optional models, respectively.
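A minimal sketch of the SIR dynamics follows. It is not the time-varying fitting procedure used in the study: it assumes constant β and γ, the standard frequency-dependent form with the contact term divided by N, and a simple forward Euler integrator. The population size and seed are hypothetical, while β = 0.32 and γ = 0.06 are the averages reported in the Discussion for the optimal India model.

```python
def simulate_sir(n, i0, beta, gamma, days, steps_per_day=100):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    s, i, r = float(n - i0), float(i0), 0.0
    dt = 1.0 / steps_per_day
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))   # one record per simulated day
    return trajectory


beta, gamma = 0.32, 0.06               # averages reported for the optimal India model
reproduction_number = beta / gamma     # ratio of transmission rate to recovery rate

# Hypothetical population and seed, chosen only to exercise the integrator.
path = simulate_sir(n=10_000, i0=10, beta=beta, gamma=gamma, days=120)
peak_infectious = max(i for _, i, _ in path)
```

The ratio β/γ evaluates to about 5.33, in line with the average R reported for that model, and S + I + R stays equal to N throughout the run.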

Results
The CFR in India and the world excluding China were 3.38% and 6.07%, respectively, with overall death tolls of 227 and 92,378, respectively, as of April 9, 2020 (Figure 1c and Figure 2c). The CFR in China reached an equilibrium, attaining 4.07% at the later stage of the COVID-19 epidemic, with a single new death and 3,336 cumulative deaths as of April 9, 2020 (Figure 3c). The data-driven as well as estimated COVID-19 cases as represented in Figure 1b [...] to yield R0 = 4.85 (Table 1). The SI of COVID-19 in China, calculated from DGR, showed a mean of 9.9 days (95% CI: 8.32-11.49), which in the early phase of the epidemic was <2 days (Figure 3b). Longer SI was observed for India and the world excluding China (Figure 1d and Figure 2d), while China displayed SI between <5 and >5 days (Figure 3d). Table 1 presents the R0, R, and Re values estimated from data based on DGR, and from the exponential and SIR mathematical models, for the three locations.
The mean Re for the pre-lockdown period (before March 25, 2020) was 1.62 (95% CI: 1.36-1.89), and the mean Re for the post-lockdown period was 1.1 (95% CI: 0.84-1.37) (Figure 4); the decline of Re indicated the effectiveness of the interventions introduced at the early stage of the ongoing COVID-19 pandemic. The association of reproduction number with other epidemiological parameters is shown in Table 2.

Table footnotes:
# Based on fitted exponential growth rate and MGF of SI distribution.
* The optimal and optional SIR models for India were based on the cumulative infected case number = 6,725 on April 9, 2020, and the population of India = 13×10^8, as the initial population size, respectively.
** The optimal and optional SIR models for countries outside China were based on the cumulative infected case number = 1,522,345 on April 9, 2020, and the world population excluding China = 37.6×10^8, as the initial population size, respectively.
*** The optimal and optional SIR models for China were based on the cumulative infected case number = 81,907 on April 9, 2020, and the population of China = 13.8×10^8, as the initial population size, respectively.

Discussion
The World Health Organization, on March 11, 2020, declared COVID-19, caused by infection with SARS-CoV-2, a pandemic after it had spread rapidly worldwide from China [11]. In the present study, the dynamic changes of the epidemiologic parameters were taken into account at three different geographical locations: India, China, and countries outside China, representing the variation in DGR, CRR, CFR, SI and R. An undulating pattern of DGR of COVID-19 cases was observed for India as well as the world excluding China, while in China the DGR was <0.3% from February 23, 2020 onwards. Sanche et al. [12] estimated the growth rate of the early COVID-19 outbreak in China as 0.21-0.30 per day. The CFR, defined as the ratio of deaths from COVID-19 to the number of cases, provides an assessment of virulence/clinical severity. Therefore, to gain early insight into the severity of the ongoing COVID-19 pandemic, the CFR of the disease was also estimated. Yang et al. [13], using a data-driven statistical method, estimated the CFR in the early phase of COVID-19 as 0.15% in mainland China without Hubei, 1.41% in Hubei province without Wuhan city, and 5.25% in Wuhan. The COVID-19 CFR in China was reported to be as high as 15%, due to the low initial case number, and later decreased to 3.4% [14]. As of March 17, 2020, the CFR of COVID-19 in Italy was recorded as 7.2%, higher than the CFR (2.3%) in China as of February 11, 2020 [15]; interestingly, Italy had a CFR of 2.3% as of February 28, 2020, which was identical to the CFR of China [16].
The estimated SI of 5.44, 6.1, and 9.9 days for India, countries outside China, and China, respectively, as of April 9, 2020, indicated decreasing COVID-19 transmission from one generation of cases to the next. The SI, representing the time from illness onset in a primary case to illness onset in a secondary one, is imperative for recognizing the turnover of case generations and COVID-19 transmissibility [17]. Sanche et al. [12] determined the doubling time of COVID-19 in the early outbreak in China as 2.3-3.3 days, reflecting the high transmission rate of SARS-CoV-2. The distribution of the COVID-19 SI is a critical input for determining R0 and the extent of interventions required to control the epidemic [18]. The mean SI for COVID-19 estimated as of February 8, 2020, by Du et al. [19] was 3.96 days, which was shorter than that found in our study and than those calculated for SARS (8.4 days) [20] or MERS (14.6 days) [21], implying that contact tracing strategies must contend with the fast replacement of case generations, with the number of contacts potentially surpassing the capacity of existing healthcare facilities [22]. The mean incubation period (time from exposure to symptom development) for COVID-19 was ≈5 days (range 1-14 days) [23]. An SI longer than the incubation period is indicative of symptomatic COVID-19 transmission [22], as observed for India and the world excluding China in the current study, while China displayed SIs from <5 to >5 days, implying both pre-symptomatic and symptomatic COVID-19 transmission. The estimates of R0 and R produced by the SIR mathematical model were higher than those of the statistical exponential growth model and the data-driven estimates based on DGR, possibly owing to the underlying assumptions of the mathematical models.
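The growth rate range (0.21-0.30 per day) and the doubling time range (2.3-3.3 days) quoted from Sanche et al. [12] can be cross-checked against each other. The sketch below assumes pure exponential growth, for which the doubling time equals ln 2 divided by the growth rate; the function name is hypothetical.

```python
import math


def doubling_time(growth_rate_per_day):
    """Days for cases to double under exponential growth: ln(2) / lambda."""
    if growth_rate_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / growth_rate_per_day


fastest = doubling_time(0.30)   # upper end of the quoted growth rate range
slowest = doubling_time(0.21)   # lower end of the quoted growth rate range
```

Under this assumption the two quoted ranges agree to one decimal place.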
Earlier studies showed R0 for COVID-19 ranging from 1.4 to 6.49, because of variation in the sources of data, time periods, and modelling methods used [23,24]. The current estimates showed R0 in the order world (excluding China) > China > India with the DGR and SIR methods, while R and Re followed the pattern world (excluding China) > India > China. This could be explained by the fact that during the initial phase of the epidemic the burden of COVID-19 infection was greater in the world as a whole, with most of the infected cases coming from China, while in the later stage China had contained SARS-CoV-2 transmission by the time the world excluding China, and India in its evolving stage, had escalating growth of COVID-19 cases. This leads to higher reproduction numbers than in China, which had the lowest proportion of people susceptible to COVID-19, a gradual decrease in the number of cumulative infectious cases, and daily new deaths falling to one as of April 9, 2020.
The R, which determines the transmissibility of SARS-CoV-2, represents the average number of new COVID-19 cases generated by each infected individual; its initial value is regarded as R0, the average number of new infections per infected COVID-19 case, while the Re has been defined as the average number of secondary cases generated per primary case, with symptom onset, at time t [10]. The Re varies as the outbreak progresses over time and may be affected by control measures. In the current study, the optimal and optional SIR models for India estimated an average R of 5.33 (95% CI: 3.05-11.33; average β = 0.32, average γ = 0.06) and 3.67 (95% CI: 2.13-7.26; average β = 0.22, average γ = 0.06), respectively, calculated from March 5, 2020 to April 9, 2020, with R0 = 1.0. The estimates of the optimal and optional SIR models for countries outside China showed an average R of 9.21 (95% CI: 8.19-10.35; average β = 0.188, average γ = 0.0204) and 7.3 (95% CI: 7.24-7.26; average β = 0.1491, average γ = 0.0204), respectively, as calculated between February 2, 2020 and April 9, 2020, with R0 = 7.5. The optimal and optional SIR model predictions for average R in China were 3.05 (95% CI: 2.89-3.22; average β = 0.128, average γ = 0.042) and 0.34 (95% CI: 0.33-0.35; average β = 0.0143, average γ = 0.042), respectively, as calculated between January 23, 2020 and April 9, 2020, with R0 = 7.43 and 7.38, respectively. As per the estimate of Zhao et al. [7], R0 for SARS-CoV-2 ranged from 2.24 to 3.58, which was >1, indicating escalation of the epidemic at time t (unless effective control measures are implemented), while Re <1 defines shrinking of the epidemic size at time t [10]; the epidemic fades away when the transmissibility is reduced by a fraction 1-1/R0 [25].
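The reported average R values can be checked against the reported average β and γ, since the SIR reproduction number is their ratio. The sketch below does this and also evaluates the critical reduction 1-1/R0 for the R0 range of Zhao et al. [7]. The dictionary layout and function names are hypothetical, and small differences in the second decimal are expected from rounding of the published averages.

```python
def reproduction_number(beta, gamma):
    """SIR reproduction number as the ratio of transmission to recovery rate."""
    return beta / gamma


def critical_reduction(r0):
    """Fraction by which transmissibility must fall for R to drop to 1."""
    if r0 <= 1:
        return 0.0
    return 1.0 - 1.0 / r0


# (average beta, average gamma, reported average R), as quoted in the text.
reported = {
    "India, optimal": (0.32, 0.06, 5.33),
    "India, optional": (0.22, 0.06, 3.67),
    "Outside China, optimal": (0.188, 0.0204, 9.21),
    "Outside China, optional": (0.1491, 0.0204, 7.3),
    "China, optimal": (0.128, 0.042, 3.05),
    "China, optional": (0.0143, 0.042, 0.34),
}

ratios = {name: reproduction_number(b, g) for name, (b, g, _) in reported.items()}

# Critical reduction for the R0 range 2.24-3.58 of Zhao et al. [7].
low_threshold = critical_reduction(2.24)
high_threshold = critical_reduction(3.58)
```

For R0 between 2.24 and 3.58 the required reduction in transmissibility is roughly 55% to 72%.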
The epidemic size estimates for India from optimal SIR modeling, ~17,886 (as of April 9, 2020, with 5,863 infectious cases) and ~30,284 (as of April 15, 2020, with 12,370 infectious cases), corresponded to an estimated end of the epidemic on June 3, 2020, and June 9, 2020, respectively. Similarly, the estimated epidemic sizes for the world excluding China, ~4,362,777 (as of April 9, 2020, with 1,150,985 infectious cases) and ~6,303,651 (as of April 15, 2020, with 2,004,136 infectious cases), corresponded to an end of the epidemic on June 5, 2020, and June 11, 2020, respectively. Ranjan [26] reported R0 for the COVID-19 pandemic in India as 1.4-3.9, and the model in that study predicted an equilibrium of the disease by the end of May 2020, assuming no impact of community transmission. China, however, has contained the local transmission of SARS-CoV-2 with 76-day lockdowns (since January 23, 2020), which have begun to lift in response to the slowing of the pandemic, and this might be informative for public health policy making in other countries [24]. Herein, there was a significant (p<0.05) association of R with time and SI, globally. Other epidemiologic parameters, including the number of infectious cases and total deaths, were significantly related to R in China, implying that these two variables serve as an index of R in epidemiologic evolutionary dynamics. However, CFRs in our study were not significantly associated with R at the three geographical locations, irrespective of the epidemiologic stage. The rate of deaths per day has emerged as a useful parameter for tracking the evolution of COVID-19 spread in different regions [27].
Mathematical models and studies have demonstrated the association between the reduction of Re (an indicator of SARS-CoV-2 transmission) to <1 and the implementation of public health measures amidst the COVID-19 pandemic [24,28]. India's strategies of massive social distancing, with lockdown since March 25, 2020, combined with mass contact tracing, quarantine, and case isolation, help curb the spread of SARS-CoV-2 and might be relevant for adoption by other countries [29]. Zhao and Chen [30] reported Re >1 (range: 1.2-5.9) in China before January 30, 2020, and Re <1 (range: 0.51-0.53) after January 30, 2020, suggesting the effectiveness of non-pharmaceutical control strategies in averting SARS-CoV-2 transmission. There has been a gradual decrease of Re from a high level of transmission (Re: 3.1, 2.6, and 1.9) to <1 (Re: 0.9 or 0.5) due to increasing implementation of public health measures in the early stage of the COVID-19 pandemic [5]. The transmission of SARS-CoV-2 is, however, escalating rapidly worldwide, and multiple countries outside China are experiencing the devastating consequences of the COVID-19 pandemic, indicating shortfalls in preparedness for effective and early containment of the disease. Thus, it is crucial to understand how the stringent strategies (mass lockdown alongside mass testing) prevented the spread of SARS-CoV-2 in China [31], and to implement them in countries outside China [24,32].

Conclusion
Combined with the available evidence, the current findings support the adoption of non-pharmaceutical interventions to shrink the peak intensity of outbreaks, while the lifting of these measures results in a resurgence of SARS-CoV-2 transmission. Therefore, highly effective social distancing strategies help reduce the extent of SARS-CoV-2 infections that strain health care systems, unless critical care capacity is expanded substantially, or a treatment or vaccine becomes available. Regular and urgent surveillance of SARS-CoV-2 (even after ample control of transmission has been achieved, if any) is also essential to remain vigilant for a plausible resurgence of COVID-19 outbreaks. Determining the CFR of COVID-19 and accounting for daily deaths will help track the dynamics of disease spread, formulate control measures and plan the supplies of health care systems to mitigate the global crisis of the ongoing COVID-19 pandemic. Thus, this early epidemiologic study of COVID-19, which originated in China and spread globally, will help determine the evolutionary dynamics of the pandemic in the Indian as well as global contexts for its future management and mitigation.