PHYSICAL REVIEW X 2, 031005 (2012)
Early Real-Time Estimation of the Basic Reproduction Number of Emerging Infectious Diseases
Bahman Davoudi,1 Joel C. Miller,2 Rafael Meza,1 Lauren Ancel Meyers,3 David J. D. Earn,4 and Babak Pourbohloul1,5,* 1Mathematical Modeling Services, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada 2Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA 3Section of Integrative Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, USA 4Department of Mathematics & Statistics and the M. G. DeGroote Institute of Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada 5School of Population & Public Health, University of British Columbia, Vancouver, British Columbia, Canada (Received 22 July 2011; revised manuscript received 6 April 2012; published 30 July 2012) When an infectious disease strikes a population, the number of newly reported cases is often the only available information during the early stages of the outbreak. An important goal of early outbreak analysis is to obtain a reliable estimate for the basic reproduction number, R0. Over the past few years, infectious disease epidemic processes have gained attention from the physics community. Much of the work to date, however, has focused on the analysis of an epidemic process in which the disease has already spread widely within a population; conversely, very little attention has been paid, in the physics literature or elsewhere, to formulating the initial phase of an outbreak. Careful analysis of this phase is especially important as it could provide policymakers with insight on how to effectively control an epidemic in its initial stage. We present a novel method, based on the principles of network theory, that enables us to obtain a reliable real-time estimate of the basic reproduction number at an early stage of an outbreak. Our method takes into account the possibility that the infectious period has a wide distribution and that the degree distribution of the underlying contact network is heterogeneous. We validate our analytical framework with numerical simulations.
DOI: 10.1103/PhysRevX.2.031005 Subject Areas: Complex Systems, Interdisciplinary Physics, Statistical Physics
I. INTRODUCTION infection transmission. Consequently, a wide array of out- comes is possible, ranging from the outbreak fizzling out, The basic reproduction number, R0, is a fundamental even in the absence of an intervention, to circumstances characteristic of the spread of an infectious disease. It is where the initial stage expands into a large-scale epidemic. generally defined as the expected number of new infections Once a full-blown epidemic develops, several assumptions caused by a typical individual during the entire period of can be made that simplify the estimation of R0, as has his or her infection in a totally susceptible population been discussed in detail in the literature [1,3,5,8]. [1–3]. Because R0 is a simple scalar quantity, and perhaps In many cases, it is necessary to assess the impact of because in many circumstances it determines the expected various intervention strategies before a large-scale epi- (average) final size of an outbreak [3–7], it has been widely demic occurs. In doing so, stochastic manifestations of used to gauge the degree of threat that a specific infectious agent will pose as an outbreak progresses [8–10]. While it disease transmission, as well as the underlying structure of the contact network, should be taken into account. The is clear that knowing the value of R0 can be very useful for policymakers in planning a response, it is not as straight- first aspect has been widely studied. For example, the Reed-Frost model is a chain-binomial stochastic model forward to obtain a reliable estimate of R0, especially in the early stages of an outbreak, before large-scale, uncon- where each infected individual can infect susceptible indi- trolled transmission has taken place and before the basic viduals and they are all assumed to have the same contact biology and transmission pathways of the pathogen have rate [11–14]. Another example is the methodology devel- been characterized. oped by Becker [15] and by Ball and Donnelly [16], which Early in an outbreak, the pattern of disease spread is is based on a branching process susceptible-infected- predominantly influenced by the probabilistic nature of recovered (SIR) model. Branching processes have received wide attention because they facilitate the evaluation of the basic reproduction number as well as the final epidemic *Corresponding author: Mathematical Modeling Services, British Columbia Centre for Disease Control, 655 West 12th size and epidemic probability [17]. More recently, the 2003 Avenue, Vancouver, BC, V5Z 4R4 Canada. global outbreak of severe acute respiratory syndrome [email protected] (SARS) inspired the development of new methodologies based on the daily number of new cases and the distribution Published by the American Physical Society under the terms of the Creative Commons Attribution 3.0 License. Further distri- of the serial interval between successive infections [18–21]. bution of this work must maintain attribution to the author(s) and However, none of these methods take into consideration the the published article’s title, journal citation, and DOI. influence of the contact network underlying an epidemic
2160-3308=12=2(3)=031005(15) 031005-1 Published by the American Physical Society BAHMAN DAVOUDI et al. PHYS. REV. X 2, 031005 (2012) process. Alternatively, it is assumed that the contact network influenza virus in Japan. Although all of these constitute is a classic random graph. an important advancement in the literature, none of Several new methods to estimate the basic reproduction them simultaneously addresses analytically the stochastic- number R0 were proposed during or shortly after the 2009 ity due to the underlying contact network and the trans- H1N1 influenza pandemic. Notably, Nishiura et al. [22] mission process. employed an age-structured model to derive an estimate for R0. Katriel et al. [23] used a new discrete-time stochastic II. OUTLINE SUMMARY epidemic SIR model that explicitly takes into account the disease’s specific generation-time distribution and the in- In the following, we first describe the basic notion of trinsic demographic stochasticity inherent to the infection a contact network model. We then define the infection process. Balcan et al. [24] employed a method that is based hazard or infectivity function, the removal hazard or re- on the distribution of the arrival times of the H1N1 influ- moval function, the transmissibility of an infectious agent, enza virus in 12 different countries seeded by the Mexico and the removal probability. We then derive a stochastic epidemic using 1 106 computationally simulated epi- renewal equation that describes the rate of newly infected demics. Nishiura et al. [25] also developed a discrete-time individuals at any given time t as a function of the number stochastic model that accounts for demographic stochastic- of newly infected individuals up to time t. We then show ity and conditional measurement and applied it to estimate that during the exponential-growth phase of an epidemic the R0 value using the weekly incidence of the H1N1 (also referred here as the exponential regime), the renewal
TABLE I. Summary of the parameters used in this paper.
Quality Symbol Description
Degree distribution pk Probability that a randomly chosen vertex has degree k. Average degree z1 Average degree of vertices in a network calculated by z1 ¼hki. Excess degree Zx Average degree of a vertex chosen by sampling an edge 2 (calculated by Zx ¼hk kihki) Infection hazard or ið Þ Instantaneous rate of infection. ið Þ gives the probability of disease infectivity function transmission across an edge between infection age and þ , given it occurred after age . Removal hazard or rð Þ Instantaneous removal rate. rð Þ gives the removal probability of an removal function infectious individual between its infection age and þ , given it occurred after age . Transmissibility Tð Þ Probability of diseaseR transmission by infection age . It is calculated by 0 0 Tð Þ¼1 exp½ 0 rð d Þ . Removal distribution 1 ð Þ ð Þ gives the probabilityR of not being removed by age of infection . 0 0 ð Þ¼1 exp½ 0 rð Þd . Removal-probability density c ð Þ c ð Þ¼ d ð Þ=d R 1 Expected transmissibility T Probability of disease transmission along one edge, T ¼ 0 c ð ÞTð Þd . Basic reproduction number R0 Expected number of infections a typical infected individual can cause in a fully susceptible population, R0 ¼ ZxT. Rate of new infections J~ðtÞa J~ðtÞ t gives the number of new infections between times t and t þ t. ~ ð ; tÞ Fraction of active S-I edges where the disease is actually transmitted exactly at time t. R I~ t t I~ t J~ t ; t d Number of infectious individuals ð Þ Total number of infectious individuals at time , ¼ 0R ð Þ ð Þ . ~ ~ t ~ Number of removed individuals RðtÞ Total number of removed individuals at time t, RðtÞ¼ 0 Jðt Þ½1 ð ; tÞ d . R~rðtÞ Total number of removed individuals at time t whose predecessor is already removed. R~iðtÞ Total number of removed individuals at time t whose predecessor is still infectious. I~rðtÞ Total number of excess links of removed individuals, calculated by Eq. (22). I~iðtÞ Total number of infectious individuals at time t whose predecessor is still infectious. ~ r ZxðtÞ Total number of excess links of removed individuals, calculated by Eq. (22). Transmissibility of removed TrðtÞ Gives the transmissibility of removed individuals at time t and is individuals. calculated by Eqs. (23) and (25) ð Þ ð Þ gives the expected number of new infections produced by an infectious individual between ages of infection and þ d . Generation-interval distribution. ^ð Þ ^ð Þ gives the conditional probability that given an infection, it occurred between ages of infection and þ d .
031005-2 EARLY REAL-TIME ESTIMATION OF THE BASIC ... PHYS. REV. X 2, 031005 (2012) equation reduces to the well-known Wallinga-Lipsitch some diseases, after a temporary recovery, the person equation for R0 [26]. In this case, we also obtain an equa- may become infectious again. Knowing that an individual tion that expresses the generation-interval distribution in acquires infection at a given time ti, various states of terms of the transmissibility and the removal distribution infectiousness for this individual can be encapsulated function. We next define the basic reproduction number for within the infection hazard or infectivity function, ið Þ. removed individuals and write two independent equations The infectivity function measures the instantaneous risk of for this quantity. Equating these two equations acts as a disease transmission across an edge. This implies that for constraint that allows us to estimate one unknown model small , the conditional probability that infection occurs parameter, either a disease or a network structure parameter. across an edge between times and þ , given that it We then derive an algorithm to estimate one unknown did not occur by time , can be approximated by ið Þ . model parameter based on this constraint and the number Typically, ið Þ is initially zero during the latent period; it of newly infected individuals up to time t, and show how we increases to a certain level and then declines during the can estimate R0 with this algorithm. We finally present infectious period, before finally vanishing and returning to numerical examples of the methodology, which show its zero at the time of permanent recovery. Figure 1 shows four accuracy in estimating R0 for different contact networks hypothetical infectivity functions, the first of which is the and disease parameters. typical case. In practice, the functional form of ið Þ We summarize a list of the main parameters introduced should be estimated from the actual transmission profile in this paper and their definitions in Table I. corresponding to a specific disease. Note that the only technical restriction on ið Þ is that it must be a non- III. NETWORK BASIS negative integrable function. Given the infectivity function, one can evaluate the This section briefly introduces the idea of contact- probability that an individual transmits the disease to one network epidemiology and defines the key concepts of of his or her contacts during a specific time period. Let Tð Þ infection rate, removal rate, transmissibility, and removal- denote the probability of disease transmission along one probability density. We map a collection of N individuals to edge for an individual with infection age . Then Tð Þ a network where each vertex represents an individual and satisfies [27,29] each edge shows a pathway of possible infection trans- Z 0 0 mission between two individuals. We use k to denote the Tð Þ¼1 exp ið Þd : (1) degree of a given individual (the number of contacts that 0 he or she has) represented graphically by the number of In general, the time to removal varies from one individ- edges emanating from a vertex, and we use pk to denote ual to another and there is no a priori knowledge of the the degree distribution [the probability that a randomly exact value of this quantity for each individual. Therefore chosen vertex has degree k (k contacts)]. Several impor- we must account for its variability as well. Let rð Þ denote tant quantities can be derived once a network’s degree the removal hazard or removal function, i.e., the instanta- distribution is known.P The moments of the degree distri- neous rate of removal for an individual with infection age kn 1 knp n 1 k bution are h i¼ ¼0 .For ¼ , h i is the aver- . This implies that for small , the conditional probabil- age number of nearest neighbors of a randomly chosen ity that an individual is removed between times and individual, which we denote z1. The average number of þ , given that he or she is not removed by time , second nearest neighbors of a randomly selected individ- can be approximated by rð Þ . The removal function 2 ual, z2, can be expressed as hk i hki [27]. To estimate indicates how quickly the infectious individuals are re- R0, we count the number of edges along which an indi- moved from disease dynamics as a function of the duration vidual can infect others, once that individual has become of their infection. This can be related to death or various infected. This quantity, usually termed excess degree, interventions such as hospitalization or quarantine, reduc- represents the number of edges emanating from a vertex tion of social activity due to severity of illness, and behav- (individual), excluding the edge that was the source of the ior change. Let ð Þ denote the probability that an infection. One can show that the average excess degree individual has a time to removal which is greater than or z2 (Zx) is given by the ratio [28]. equal to . Then [29] z1 We denote the time at which an individual acquires the Z 0 0 t ð Þ¼exp rð Þd ; (2) infection by i, and the time since acquiring infection by 0 ¼ t ti (which is also known as the age of infection). 0 While harboring the infection, the individual is first latent subject to the condition ð1Þ ¼ . The removal probabil- c d ð Þ (infected but not yet infectious) and then infectious (either ityR density function is given by ð Þ¼ d (or ð Þ¼ 1 0 0 symptomatically or asymptomatically). The individual c ð Þd ). may also recover, by which we mean only that he or Using Eq. (1) and c ð Þ [or ð Þ] one can calculate the she can no longer transmit the infection, not that he or expected transmissibility, i.e., the probability of disease she has necessarily completely cleared the pathogen. For transmission, across a given edge:
031005-3 BAHMAN DAVOUDI et al. PHYS. REV. X 2, 031005 (2012)
FIG. 1. Hypothetical infectivity functions ið Þ. They show the general infectivity patterns that can occur, varying by complexity of the disease. The top left panel shows the infectivity function of a very generic disease, the top right panel shows the infectivity function of an HIV type disease, the bottom left displays the infectivity function of any recurrent disease such as chicken pox, and finally, the bottom right panel exhibits the infectivity function for an influenza type disease.
Z Z 1 1 dTð Þ number of newly infected individuals becomes significant, T ¼ c ð ÞTð Þd ¼ ð Þ d 0 0 d and stochastic fluctuations are smoothed out (exponential R R Z 1 regime). The progression of disease spread starts to decline rðuÞdu iðuÞdu ¼ rð Þe 0 ½1 e 0 d : (3) as the cumulative number of infected cases becomes com- 0 parable to the size of the network, at which point network The basic reproduction number, which represents the finite-size effects become important (declining regime) [28]. average number of infections caused by a typical infected From now on, we use the tilde notation to make the individual in a fully susceptible population, can be written distinction between the realization of a stochastic process as the product of the expected excess degree and the (with tilde) and its mean field value (without tilde). We ~ expected transmissibility [27] define jðtÞ as the time series of infection events, which is a sum of the Dirac functions located at each infection time. R Z T: 0 ¼ x (4) The case count C~ðt; tÞ that gives the number of infections t t t C~ t; t Rbetween times and þ can be expressed as ð Þ¼ tþ t ~ 0 0 ~ ~ t jðt Þdt . We define JðtÞ¼Cðt; tÞ= t as the inci- IV. DISEASE DYNAMICS ON NETWORKS dence rate of new infections at time t, where t is the In this section, we present some examples of the spread maximum time resolution. In the exponential regime, the of an infectious agent on a contact network. The pattern of incidence rate of new infections grows exponentially and disease spread on a network can be categorized into three therefore can be expressed as different regimes: stochastic, exponential, and declining. ~ JðtÞ’J0 expð tÞ; (5) The process of disease spread is stochastic in nature, given that the disease transmission along an edge occurs in a for some >0. Figure 2 represents the three regimes, ln J t probabilistic manner and that the degree of the next ½ ð Þ , in terms of time t. infected individual cannot be determined a priori. The t stochastic behavior is dominant in the initial stage of disease spread when the number of infectious individuals A. Stochastic dynamics of disease is comparatively small (stochastic regime). The effect In this section, we outline a general framework to esti- of stochasticity becomes much less pronounced when the mate the basic reproduction number assuming that all
031005-4 EARLY REAL-TIME ESTIMATION OF THE BASIC ... PHYS. REV. X 2, 031005 (2012)
2
Stochastic regime Exponential regime Declining regime 1 ] ) t ( ~ J t
δ 0 ln[
δ
-1
-2 5 101520253035404550 t
FIG. 2. Two hypothetical realizations of an epidemic process on a network. The left, middle, and right sections represent stochastic, exponential, and declining regimes, respectively. information about a specific realization of the epidemic Lotka renewal equation for population growth [30,31]; process up to time t is known. We start by first deriving a here it is applied to epidemic dynamics when taking into renewal equation for the rate of new infections, J~ðtÞ. account the structure of the underlying contact network and the stochasticity inherent to the transmission process. 1. A renewal equation for J~ðtÞ Let us consider the first person in the population infected 2. Exponential regime and the with the disease and assume that his or her infection generation-interval distribution occurred at time 0. From Eq. (3), we can infer that the The importance of Eq. (8) is in its applicability to the expected number of infections that this individual will early stage of an outbreak. When advancing to the next cause by time t is given by stage, i.e., the exponential regime, the evaluation of (8) can R Z t dTð Þ be simplified. Indeed, an estimation of 0 in the exponen- Z ð Þ d tial regime has been well studied, analytically. An excel- x d (6) 0 lent account for this analytical framework is presented by (assuming that his or her excess degree is equal to the Wallinga and Lipsitch [26]. In this section, we show how, average excess degree). This leads in the limit t !1to the as a special case, our general framework can reduce to usual value of R0 ¼ ZxT. The above expression also their finding in the exponential-regime limit. During the implies that the mean contribution of this individual to exponential regime we can ignore stochastic fluctuations the incidence rate of new infections at his or her infection and replace all quantities with their expected values. In age is given by particular, if we ignore stochastic effects we can rewrite the dTð Þ renewal equation (8)as Zx ð Þ : (7) d Z t dTð Þ JðtÞ¼Zx Jðt Þ ð Þ d ; (9) The above equations can be readily generalized to 0 d address the random process of infection spread on a contact ~ ; t dTð Þ network. In particular, one can compute the contribution of where we used ð Þ d . t J~ t t dTð Þ the individuals infected at time , ð Þ , to the Let ð Þ Zx ð Þ d . Equation (9) then takes the number of new infections occurring in the initial stage of simpler form an outbreak at time t, J~ðtÞ t, namely, Z t Z J t J t d ; ~ t ~ ~ ð Þ¼ ð Þ ð Þ (10) JðtÞ t ¼ Zx Jðt Þ t ð Þ ð ; tÞd ; (8) 0 0 ~ which is the well-known Lotka renewal equation [30,31]. where ð ; tÞ denotes the fraction of those edges where From the definitions of ð Þ, the expected transmissibility t disease is actually transmitted exactly at time . This is a [Eq. (3)] and theR basic reproduction number [Eq. (4)], dTð Þ 1 random function with the expectation given by d . Note we can see that 0 ð Þd ¼ R0. Substituting JðtÞ that Zx and ð Þ are both functions of t in the most general expð tÞ in Eq. (10) and taking the limit when t !1we case. Expression (8) is a generalization of the classical obtain
031005-5 BAHMAN DAVOUDI et al. PHYS. REV. X 2, 031005 (2012)
Z 1 Z 1 1 ¼ expð Þ ð Þd : (11) 1 ¼ Zx i expð rÞ expð Þ expð i Þd (14) 0 0 It is worth mentioning that the exact exponential regime can Z i ; t ¼ x (15) be reached when !1for an infinite-size network and that i þ r þ is why it is valid to take the limit. This means that there should be a slight deviation from the exponential behavior or for a ‘‘finite-size’’ system at ‘‘finite time’’ once the outbreak ¼ðZx 1Þ i r: (16) has surpassed the stochastic regime. Dividing both sides of Therefore, we can express the rate of exponential growth of Eq. (11)byR0 we find that [26] an epidemic in terms of the mean excess degree (Zx), the 1 Z 1 infectivity ( i), and the removal ( r) rates. exp ^ d ; R ¼ ð Þ ð Þ (12) Furthermore, we can also explicitly compute the 0 0 generation-interval distribution [Eq. (13)] ^ =R ð þ Þ where ð Þ¼ ð Þ 0 is defined as thegeneration-interval ^ð Þ¼ð i þ rÞe i r : (17) distribution. This equation relates the basic reproduction For constant parameters, the generation interval is an ex- number to the Laplace transform of the generation-interval 1= distribution in the asymptotic case (infinite size, infinite ponential random variable with mean ð i þ rÞ. Using time). Now, in our formulation the generation-interval distri- this fact and Eq. (12) we obtain the following expression for R0: bution can be written as R i þ r þ 1 : dT 0 ¼ ¼ þ (18) ð Þ ð Þ i þ r i þ r ^ R d : ð Þ¼ dT 0 (13) This equation relates the value of R0 to the rate of growth 1 ð 0Þ ð Þ d 0 0 d 0 during the exponential phase of an epidemic ( ), the This equation describes how the transmissibility, Tð Þ,and infection rate ( i), and the removal rate ( r). Notice that, the distribution of the time to removal, ð Þ, determine in contrast to the results obtained from a deterministic SIR together the generation-interval distribution. model, where the mean generation interval is equal to the mean duration of infection [26], here we find that the mean generation interval also depends on the infection rate. 3. The generation-interval distribution Figure 3 shows two examples of logarithms (base e)of for constant parameters the epidemic curves ð ln½J~ðtÞ Þ in the three regimes for Equation (11) has a simpler form for constants r and i. a binomial (left panel) and exponential network (right This can be obtained by replacing c ð Þ¼ r expð r Þ panel). The algorithm used to simulate the spread of an and Tð Þ¼1 expð i Þ in Eq. (11): infectious agent on a contact network is described in the
5 5
y=0.26t+const y=0.516t+const
4 4
3 3 ] ] ) ) t t ( ( ~ ~ J J ln[ ln[ 2 2
1 1
0 0 0 20406080 10 20 30 40 50 t t ~ FIG. 3. The logarithm (base e) of the rate of new infections lnJðtÞ for a binomial network with z1 ¼ 5 (left panel) and for an exponential network with ¼ 4 (right panel), with i ¼ 0:127 71 and r ¼ 0:25. Two independent epidemic realizations are shown in each panel (green and blue). The solid red line shows the tangent of ln½J~ðtÞ in the exponential regime.
031005-6 EARLY REAL-TIME ESTIMATION OF THE BASIC ... PHYS. REV. X 2, 031005 (2012)
Appendix. The left panel of Fig. 3 shows two epidemic events unfolding (two different initial index cases) N k N k on a binomial network pk ¼ k p ð1 pÞ with N¼100000 nodes, z1 ¼ 5 [or p ¼ z1=ðN 1Þ], Zx ¼ 5, and f i ¼ 0:127 71; r ¼ 0:25g. Using Eq. (3), the ex- pected transmissibility can be calculated as T ¼ 0:338. The two solid lines y1, y2 ¼ t þ const with ¼ 0:260 84 are the tangent of ln½J~ðtÞ during the exponential regime [Eq. (16)]. This shows the consistency between the simulated epidemic curve and its expected (exponential-growth) behavior. In the right panel, we show the same results for an exponential network with pk ¼ 1= k= ð1 e Þe , ¼ 4, z1 ¼ 3:52, Zx ¼ 7:0416, and f i ¼ 0:127 71; r ¼ 0:25g. The lines y1, y2 ¼ t þ const with ¼ 0:521 58 are again the tangent of ln½J~ðtÞ during the exponential regime. Notice that, although here we know the ‘‘true’’ value of , in practice it can be estimated from real-life time series data if the outbreak progresses beyond the stochastic regime. FIG. 4. This figure illustrates schematically the dependency of the rate of new infections, J~ðtÞ (blue curve), on its past values. 4. The number of infected and removed Only a fraction of the cases infected at time t , J~ðt Þ t, individuals at time t contributes to the infections at time t, J~ið ;tÞ t¼J~ðt Þ t ð Þ ~r ~ Using the quantities introduced in the previous sections, (red curve); the rest, J ð ; tÞ t ¼ Jðt Þ t½1 ð Þ ,have we now derive other expressions that will be helpful in already been removed. estimating the basic reproduction number, R0. As an out- break progresses, at any given time there is a population of B. The transmissibility of removed individuals infectious individuals, I~ðtÞ, and a population of removed Our methodology to estimate R0 is based on a detailed individuals, R~ðtÞ. The number of affected individuals—the analysis of the characteristics of removed individuals. This total number of infected individuals at time t and those who is because the history of removed individuals contains all are recovered or removed by time t—is given by Z t the information about the mechanisms of disease trans- I~ðtÞþR~ðtÞ¼N S~ðtÞ¼ J~ðt Þd ; (19) mission and recovery process. In particular, the period of 0 infection of these individuals can help us estimate the where S~ðtÞ denotes the number of susceptible individuals at distribution of removal times. Furthermore, since removed time t. As illustrated in Fig. 4, the total number of infected individuals have already had the opportunity to transmit cases can be written as the disease, the fraction of the contacts that they actually Z t Z t infected contains a lot of information about the transmis- ~ ~i ~ ~ IðtÞ¼ J ð ; tÞd ¼ Jðt Þ ð Þd ; (20) sibility of disease. In an ideal world, a full characterization 0 0 of the infection history of each removed individual would which, in turn, implies that be enough to estimate R0. However, in reality, it is Z Z t t extremely difficult to know which infected individuals R~ðtÞ¼ J~rð ; tÞd ¼ J~ðt Þ½1 ð Þ d ; (21) 0 0 have already been removed and what fraction of their potential infections actually occurred. Therefore, we where Jið ; tÞ¼J~ðt Þ ð Þ and Jrð ; tÞ¼J~ðt Þ 1 instead derive theoretical expressions that can help us ½ ð Þ . estimate some of these quantities. Figure 5 shows the number of infectious (left panel) and First, we write an expression for the total number of removed (right panel) individuals for a disease that spreads secondary contacts of those individuals already removed either on a binomial or an exponential network. In both by time t. Using ideas similar to those above, the total panels, the curves consisting of red symbols correspond to excess degree of removed individuals can be written as the computer simulation of an epidemic on the correspond- Z t ing network; during the simulation, the new case counts are ~ r ~ J~ t Z xðtÞ¼Zx Jðt Þ½1 ð Þ d : (22) recorded to create a synthetic ‘‘time series’’ for ð Þ. The 0 solid curves correspond to Eqs. (20)or(21) [for I~ðtÞ or ~ ~ r RðtÞ] for the corresponding network. These figures show a Here, ZxðtÞ represents the total number of edges of perfect agreement for both networks between the analytical already removed individuals that could have transmitted formulas and the case counts from the simulation. the infection by time t. However, only a fraction of these
031005-7 BAHMAN DAVOUDI et al. PHYS. REV. X 2, 031005 (2012)
15000
Sim. Exp. 60000 Ana. Exp.
Sim. Bino.
10000 Ana. Bino.
40000 ) ) t t ( ( I ~ ~ R
Sim. Exp.
5000 Ana. Exp. 20000 Sim. Bino.
Ana. Bino.
0 0 20 40 60 80 20 40 60 80 t t
FIG. 5. Number of infectious (left panel) and removed individuals (right panel) as a function of time for the binomial (z1 ¼ 5, i ¼ 0:127 71, and r ¼ 0:25) and exponential ( ¼ 4) networks. The solid curves come from the evaluation of Eqs. (20) and (21), and the symbols come from the direct counting of the infectious and removed individuals at any given time for a specific realization of the process. links actually transmitted the disease successfully. This before time t, defined as c~ rð ; tÞ. The quantity c~ rð ; tÞ ~r ~r ~r latter fraction is given by I ðtÞþR ðtÞ, where I ðtÞ and is proportional to the number of individuals already R~r t ð Þ represent, respectively, the number of infectious and removed by time t that were removedR after exactly units t ~ r t ~ 0 0 removed individuals at time whose predecessor has of time, i.e., c ð ; tÞ / c ð Þ 0 Jð Þd . This already been removed. The ratio of these two quantities probability function, after incorporating the proper normal- represents the fraction of potential transmissions that ization, can be written as actually occurred, which we shall refer to as the expected R ~r t ~ 0 0 transmissibility of removed individuals or T ðtÞ; r c ð Þ 0 Jð Þd c~ ð ; tÞ¼R R : (24) t c 0 t 00 J~ 00 d 0d 00 I~r t R~r t 0 ð Þ 0 ð Þ T~ r t ð Þþ ð Þ : ð Þ ~ r (23) ZxðtÞ The expected transmissibility of removed individuals can then be calculated as [see Eq. (3)] Estimates for the expected values for these quantities Z can, in principle, be calculated based on the rate of new t T~ rðtÞ¼ c~ rð ; tÞT~ð ; tÞd ; (25) infections, JðtÞ, using arguments similar to those above. 0 These expressions are derived in the next section. T~ ; t T Equations (11) and (23) form a set of equations that allows where ð Þ is an extension of ð Þ that takes into account us to find two unknowns, for instance, the amplitude and the stochastic effects in the disease-transmission process variance of the infectivity profile; a detailed analysis of this (represented by the explicit dependence of this quantity t subject merits a separate manuscript. Here, we solely use on ). Eq. (23) to estimate one quantity. D. The basic reproduction number of removed ~ C. Expression for the transmissibility individuals and an equation for R0 ~r of removed individuals, T ðtÞ The total excess degree of removed individuals [given in As discussed in the previous section, our methodology is Eq. (22)] takes the simpler form: based on a careful analysis of the characteristics of the Z~ rðtÞ¼Z R~ðtÞ: removed individuals. In particular, the expected transmis- x x (26) T~r t sibility of removed individuals, ð Þ, will play a crucial Combining the last equation with Eq. (23) one obtains role in the estimation of R0. We now derive an alternative T~r t I~r t R~r t expression for ð Þ and other expressions related to ~ r ð Þþ ð Þ T ðtÞZx ¼ : (27) Eq. (23). R~ðtÞ The expected transmissibility of removed individuals, T~rðtÞ, can also be obtained as an extension of Eq. (3)by We define the right-hand side of the previous equation replacing the removal distribution, c ð Þ, with the condi- as the reproduction number of the removed individuals ~ r tional distribution of removal time, given that it occurred R0ðtÞ, i.e.,
031005-8 EARLY REAL-TIME ESTIMATION OF THE BASIC ... PHYS. REV. X 2, 031005 (2012)