Downloaded by guest on September 26, 2021 catrUiest,Hmlo,O 8 L,Canada; 4L8, Canada; L8S ON Hamilton, University, McMaster Canada; 3K7, 20892; N6A MD ON Bethesda, London, NIH, Center, International Fogarty Studies, a Park Woo numbers Sang reproduction to growth link correctly intervals serial Forward-looking PNAS person another infects individual that (7). when (infectee) (infec- and generation individual an infected the when is between tor) where time the The (6), as defined 5). is distribution interval (4, interval reports generation case the from robustly rate estimated growth be often rate growth can exponential population-level the com- on based A is observed. be estimating be of typically cannot method will mon some infections when observed down on biased based estimates Direct . disease any of provide scale to time or fails the individuals also about aver- among information number heterogeneity reproduction an capture The represents to space. simple fails across number it in reproduction 3), it (2, and the eliminate age population, Since to the (2). necessary in intervention cases spread of will amount the an which to extent number population— reproduction susceptible fully “basic” a the in by value caused is The infections infection. number secondary primary of reproduction a number The average the (1). as COVID-19 defined of rent T modeling disease infectious interval generation genera- and reassessing intervals, for serial intervals, framework tion a provides contact early and of efforts importance tracing the demonstrates study This 2.6. serial initial of of tor defined estimates biases improperly context 2020) using this in 8, province; intervals February Hubei outside to China January in of (21 intervals. estimates early serial interval-based to in serial method changes the neglecting apply from We arise address- for the that method over heuristic biases a ways ing present We complex epidemic. in an epidemic of changing course censoring, by by both and affected dynamics are incorrect intervals give Forward-looking intervals, dynamics, mates. intrinsic population-level and neglect backward, which time measure which cases intervals, symptomatic of link process reliably therefore renewal and the describe correctly infec- an tor, of onset symptom from epidemic, forward for time an measuring intervals, of implications course the the over and vary suggest intervals these studies how of Recent explore estimates in chain. biases substitution onset transmission such symptom a that between in time links the successive intervals, Because serial transmission. often substitute epidemiologists onward observe, to and difficult are intervals infection generation intervals, between generation time by 2020) linked the 4, are June review They for quantities. (received 2020 demiological 4, December approved number and reproduction Norway, The Oslo, Oslo, of University Stenseth, Chr. Nils by Edited and 30332; Weitz S. Joshua eateto clg n vltoayBooy rneo nvriy rneo,N 08544; NJ Princeton, University, Princeton Biology, Evolutionary and Ecology of Department siaigterpouto number reproduction the Estimating hrceitc fa mrigeiei,sc stecur- the as such epidemic, emerging an of characteristics number reproduction he 01Vl 1 o e2011548118 2 No. 118 Vol. 2021 g colo ilgclSine,GogaIsiueo ehooy tat,G 30332; GA Atlanta, Technology, of Institute Georgia Sciences, Biological of School i rneo colo ulcadItrainlAfis rneo nvriy rneo,N 08544 NJ Princeton, University, Princeton Affairs, International and Public of School Princeton r n h erdcinnumber reproduction the and a,1 | g,h eilinterval serial aya Sun Kaiyuan , ra .Grenfell T. Bryan , R R R d R eateto ilg,MMse nvriy aitn NLS48 Canada; 4L8, L8S ON Hamilton, University, McMaster Biology, of Department with n h rwhrate growth the and siain owr-okn serial Forward-looking estimation. ertebgnigo nepidemic an of beginning the near R | erdcinnumber reproduction r soeo h otimportant most the of one is R ncnrs,backward-looking contrast, In . R R b siae o COVID-19. for estimates 0 o h OI-9outbreak COVID-19 the for alw st rdc the predict to us —allows ai Champredon David , R a,b,i R sotnchallenging. often is ae on based n oahnDushoff Jonathan and , f .G eroeIsiuefrIfciu ies eerh catrUiest,Hmlo,O 8 4L8, L8S ON Hamilton, University, McMaster Research, Disease Infectious for Institute DeGroote G. M. R R r r rtclepi- critical are r ikdby linked are yu oafac- a to up by | r eewe Here . c eateto ahlg n aoaoyMdcn,Uiest fWsenOntario, Western of University Medicine, Laboratory and Pathology of Department r R which , esti- c ihe Li Michael , doi:10.1073/pnas.2011548118/-/DCSupplemental hsatcecnan uprigifrainoln at online information supporting contains article This 1 under distributed BY).y is (CC article access open This Submission.y Direct PNAS a is article This interest.y competing no declare authors The ulse eebr2,2020. 23, December Published J.D. research; and B.T.G., performed J.S.W., D.J.D.E., J.D. J.D. B.M.B., and and paper. M.L., the D.C., B.T.G., J.S.W., wrote K.S., S.W.P., J.S.W., D.J.D.E., and D.J.D.E., B.M.B., data; analyzed B.M.B., S.W.P. K.S., S.W.P., M.L., research; D.C., designed K.S., S.W.P., contributions: Author n 7.Ti ofso saprn vni tnadsoftware 16 standard refs. in (e.g., even explic- intervals apparent cases, is two confusion some the This in of 17). definitions and, and the 11–15), conflate refs. itly (e.g., on them distinguishing COVID-19 clearly between without about intervals serial inferences and unfolds. generation base concep- pandemic both to COVID-19 better differ- the continue a as their Researchers for clearer understanding need becoming for is the framework ences (7), theoretical ago and decade tual a over defined generation positive. whereas be (11), must intervals transmission pres- the presymptomatic in values of negative Serial take ence even (7–10). cases, contexts some many generation serial in in can, than mean intervals contexts, same variances many the larger have in have but intervals particu- to that, expected noted In are have intervals quantities. transmission, studies disease different previous of lar, fundamentally and scale measure time infector serial and the an they generation measure when While both between (7). intervals time symptoms genera- develop the serial practice, infectee The as an in intervals. defined serial observe is with to replaced interval often difficult are intervals be tion can which events, owo orsodnemyb drse.Eal [email protected] Email: addressed. be may correspondence whom To etta h w itiuin iedfeetetmtsof estimates different rate give growth number distributions reproduction two the the but sug- that key, studies Recent gest are analyses. distributions outbreak in interval quantities different, serial and generation The Significance s fsra nevldsrbtosi estimating in principled distributions the interval to . serial changes practical of for use provides basis study theoretical Our outside methods. the inference China, prior sys- from revealing in 2020), biases 8, data tematic February to interval 21 (January serial province Hubei COVID-19 frame- our using to apply as We work distribution. estimate interval same generation the the and gives cohort, reference correct r lhuhteedsrbtoswr lal n distinctly and clearly were distributions these Although infection between time measure intervals generation Since d,e,f n h eilitra itiuin hndfie rmthe from defined when distribution, interval serial the and d h ejmnM Bolker M. Benjamin , colo hsc,GogaIsiueo ehooy tat,GA Atlanta, Technology, of Institute Georgia Physics, of School b y iiino nentoa pdmooyadPopulation and International of Division r ee eso htestimating that show we Here, . https://doi.org/10.1073/pnas.2011548118 e eateto ahmtc n Statistics, and Mathematics of Department R sifre rmteobserved the from inferred as raieCmosAtiuinLcne4.0 License Attribution Commons Creative d,e,f . y ai .D Earn D. J. David , https://www.pnas.org/lookup/suppl/ R R ae on based e,f during | f12 of 1 , r

POPULATION BIOLOGY for estimating R, such as EpiEstim, in which the serial interval an infector and an infectee (e.g., generation and serial intervals). We define distribution is used to infer time-dependent R (18). These stud- the delay as the time difference between the primary event and the sec- ies are examples of many—indeed, it is a common practice to use ondary event. In some cases, the primary event always occurs before the the serial and generation intervals interchangeably. secondary event (e.g., the time from infection to onset of symptoms in a One source of confusion arises from an apparent discrepancy single individual, or the generation interval between two individuals). In other cases, the delay can sometimes be negative (e.g., the time from onset between the generation interval and serial interval viewpoints. of symptoms to onset of infectiousness in a single individual, or the serial While the epidemic is growing exponentially, the spread of infec- interval between two individuals). tion can be characterized as a renewal process based on previous At the individual level, we can define the time distribution between of infection, the associated generation interval distri- a primary and a secondary event that we expect to observe for a single bution, and the average infectiousness of an infected individual. infected individual by averaging across individual characteristics—we refer It is well established that this renewal formulation allows us to to this distribution as the intrinsic distribution. For example, the intrin- link the exponential growth rate of an epidemic r with its repro- sic distribution describes the expected time distribution duction number R using the generation interval distribution (6). from infection to symptom onset of an infected individual. Likewise, the However, the serial interval distribution also describes a renewal intrinsic generation interval distribution describes the expected time distri- bution of infectious contacts made by an infected individual. However, the process—in this case, the creation of a new symptomatic case intrinsic time distributions are not always equivalent to the corresponding based on a symptomatic case in the previous generation. Since realized time distributions at the population level (i.e., the distribution of both renewal processes, based on either generation or serial time between actual primary and secondary events that occur during an interval distributions, describe the same underlying exponen- epidemic; Fig. 1). For example, an infectious contact results in infection tially growing system, both should provide the same correct link only if the contacted individual is susceptible (and has not already been between the reproduction number R and the epidemic growth infected)—this is one mechanism that causes realized generation intervals rate r. (time between actual infection events) to differ from the intrinsic genera- In contexts where the serial and generation interval distribu- tion intervals (time between infection and infectious contacts) (21). In this tions differ, current theory has no explanation for how two dif- example, the difference between intrinsic and realized time distributions R can be attributed to the fact that the fraction of susceptible individuals is ferent distributions could provide identical estimates of from itself dynamic. r. In fact, recent theory suggests that using the serial interval At the population level, we model realized time delays between a pri- can underestimate the reproduction number (19, 20). However, mary and a secondary event from a cohort perspective. A cohort consists of these studies rely on intrinsic distributions of incubation peri- all individuals whose (primary or secondary) event occurred at a given time. ods and generation intervals that neglect the population-level For example, when we are measuring incubation periods, a primary cohort dynamics of disease spread. consists of all individuals who became infected at time p, while a secondary Here we show that, by correctly defining and calculating the cohort consists of all individuals whose symptom onset occurred at time s. “forward” serial interval distribution (i.e., a distribution of serial Similarly, when we are measuring serial intervals, a primary cohort consists intervals from a cohort of infectors that developed symptoms of all infectors who became symptomatic at time p. Then, for a primary cohort at time p, we can define the distribution of realized delays between at the same time) that connects symptom onset dates, we can primary and secondary events. We refer to this distribution as the forward resolve this discrepancy. These forward intervals are different delay distribution and denote it as fp(τ). from the “intrinsic” serial intervals that previous studies have Likewise, we define the backward delay distribution bs(τ) for a secondary relied on (7–10, 19). During an ongoing epidemic, all observed cohort at time s: The backward delay distribution describes the time delays epidemiological delays (e.g., incubation period) between pri- between primary and secondary events given that the secondary event mary (e.g., infection) and secondary (e.g., symptom onset) events occurred at time s. For example, the backward incubation period distribu- are subject to backward biases: When the incidence of pri- tion at time s describes incubation periods for a cohort of individuals who mary events is increasing (or decreasing), we are more likely became symptomatic at time s. Likewise, the backward serial interval dis- to observe shorter (respectively, longer) intervals. In particular, tribution at time s describes serial intervals for a cohort of infectees who became symptomatic at time s. when we consider forward serial interval distributions, the incu- Both forward and backward perspectives must yield identical measure- bation periods of the infectors are subject to backward biases, ment (e.g., the length of the incubation period of a given individual is the because we have to look backward in time from their symp- same whether measured forward from the time of infection or backward tom onset to infection. Therefore, the realized incubation period from the time of symptom onset). Consequently, no matter how delays are distributions of the infector and the infectee can differ dynam- distributed, if P and S represent the sizes of primary and secondary cohorts, ically, even if the intrinsic analogs of the same distributions are then we can express the total density of intervals τ between calendar times expected to be equivalent. p and s (i.e., τ = s − p) as follows: We develop a cohort-based framework for characterizing and comparing realized serial intervals, as well as any other epi- W(p)P(p)fp(τ) = S(s)bs(τ), [1] demiological delays, and show that the initial forward serial interval distribution correctly estimates R from r. Conversely, where W(p), the “weight” of the primary cohort, represents the average using inaccurately defined serial intervals or failing to account number of forward intervals that an individual in cohort P(p) produces over for changes in the observed serial interval distributions over the the course of their infection. When we measure within-individual delays, course of an epidemic can considerably bias estimates of R. we expect W(p) ≤ 1 because only a subset of individuals who experience For example, in our analysis of the COVID-19 serial intervals the primary event (e.g., infection) will eventually experience the secondary event (e.g., symptom onset). For between-individual delays, we expect from China, outside Hubei province, we find that the original W(p) to change throughout an epidemic, because individuals infected ear- R0 estimates based on aggregated serial interval data underesti- lier in an epidemic will infect more individuals, on average, than those mated R0 by a factor of 2.0 to 2.6. We further lay out several infected later. principles to consider in using information about serial inter- Substituting p = s − τ, it follows that vals and other epidemiological time delays to correctly infer

the initial reproduction number during the early stages of an W(s − τ)P(s − τ)fs−τ (τ) bs(τ) = . [2] outbreak. S(s)

Methods If we are considering incubation periods, the left-hand side of this equation Intrinsic, Forward, and Backward Delay Distributions. A time delay between is the probability density that an individual who became symptomatic at two epidemiological events can involve either one infected individual (e.g., time s had an incubation period of length τ. From the right-hand side, we incubation period: infection and symptom onset of an individual) or two— see that this probability density depends on the weight parameter W(s − τ)

2 of 12 | PNAS Park et al. https://doi.org/10.1073/pnas.2011548118 Forward-looking serial intervals correctly link epidemic growth to reproduction numbers Downloaded by guest on September 26, 2021 Downloaded by guest on September 26, 2021 owr-okn eilitrascretyln pdmcgot orpouto numbers reproduction to growth epidemic link correctly intervals serial Forward-looking al. et Park eta rwhpae ecnsubstitute can bias we obtain of phase, to amount (f growth (W the distribution when nential inci- delay parameter calculate delays When forward weight can interevent time. the the we that through longer) exponentially, Assuming (decreasing) (or exactly. growing means increasing shorter time is been same observe dence has indi- the to size on at tend cohort Conditioning occurred size have observations: cohort we events primary of that secondary the conditionality whose in to Back- viduals changes due change. on time, not depend does over distributions distribution delay delay forward ward corresponding their if to harder other is will it to intervals more If due proportionally then short. (22–24): as infection, be of to interventions course well the due or in as later change progresses, infect depletion, behavioral epidemic susceptible like an of factors as process shorter dynamical gener- become the forward often example, For intervals interval epidemic dynamics. Between- serial ation an epidemic behavior. or on generation individual of depend as or course such distributions, policies distributions, the health delay over public forward from individual in change time changes may of to an testing, distribution due of to scale the time onset like the distributions, symptom at realized invariant remain Other equivalent realized and outbreak. are distributions Some distributions, intrinsic dynamics. period their epidemic to incubation by like affected distributions, directly forward dis- not delay are forward within-individual tributions Typically, time. over distributions delay trsa time at starts fidvdasifce ttime time-varying f at time the infected earlier infection), individuals the symptomatic of at of size proportion cohort the primary case, this (in invariant. Forward time intervals remain dynamics. Backward intervals population-level depletion). forward susceptible on when through even dependent intervals sizes not generation cohort of in are changes contraction and to (e.g., characteristics due dynamics change individual epidemiological can to of (blue) due average change reflect can (green) (black) intervals intervals Intrinsic distribution. period h owr nuainpro distribution, period incubation forward the (C distribution. period incubation h owr eilitra o ooto netr h eaesmtmtca time at symptomatic became who infectors of cohort a for interval distribution, period serial incubation forward backward The the (B) from distribution. drawn period incubation intrinsic the case, 1. Fig. s−τ Eq. backward and forward in changes the drive mechanisms different Several (τ τ i hscs,tepoaiiydniyta nicbto eidthat period incubation an that density probability the case, this (in ) 2 i1 h nrni eilitra o ooto niiul netda time at infected individuals of cohort a for interval serial intrinsic The (A) intervals. serial backward and forward, intrinsic, of Illustration ugssta akaddlydsrbtoscag vrtm even time over change distributions delay backward that suggests sdanfo h nrni nuainpro distribution, period incubation intrinsic the from drawn is s − τ b nsa time at ends 0 (τ ) [ = W (p) C B A (0)P ≈ s s). W − h akadsra nevlfrachr fifceswobcm ypoai ttime at symptomatic became who infectees of cohort a for interval serial backward The ) (0)/S 0)rmi osatdrn h expo- the during constant remain (0)) τ ,adtefraddlydistribution delay forward the and ), (0)] P (s exp(−r τ − g P sdanfo h akadgnrto nevldsrbto,and distribution, interval generation backward the from drawn is τ (t i hscs,tenumber the case, this (in ) ) τ = )f 0 P (τ 0 exp(rt (0) ), τ p g (τ sdanfo h owr eeainitra itiuin and distribution, interval generation forward the from drawn is ) ≈ nEq. in ) f 0 (τ τ )and )) g sdanfo h nrni eeainitra itiuin and distribution, interval generation intrinsic the from drawn is [3] 2 r nrni oidvdas(n o eedn neiei yaisat not dynamics but epidemic forward on of true dependent generally not is that (and population-level)—something individuals the to intrinsic incu- realized are of distributions However, periods, 19). and 10, bation serial 8, (7, the when distributions that mean same concluded the same studies the have These intervals infectee. generation the and infector the and respectively, infectee, their expressed when often have and studies symptomatic Previous intervals becomes serial (7). infector symptomatic becomes an infectee when between Distributions. time Interval Serial Realized by work the as Eq. well distribution. that as showed respectively, who infectee, 19, an ref. and generation per- infector backward the an from and of intervals period). forward general- generation spective compared realized incubation and describe who the to distributions 24, distributions (e.g., ref. delay interval that by disease epidemiological delays work a all the time of to ize measuring history apply are life ideas we (unless the These if distribution to even delay intrinsic equilibrium), forward are at fast- the is a from disease within differ the intervals will distribution backward delay shorter (high observe epidemic to growing expect we as long disease, as interval ward where rate growth exponential the the during on distribution only r depends delay phase backward growth the exponential Therefore, constant. ization where [W n h nta owr ea distribution delay forward initial the and h enbcwr nevlwl eawy hre hntema for- mean the than shorter always be will interval backward mean The (0)P r (0)/S τ steepnnilgot ae Since rate. growth exponential the is i1 and (0)] τ s τ −1 τ ntefr Fg 1A) (Fig. form the in i2 i1 = and ersn nuainproso nifco n an and infector an of periods incubation represent R −∞ ∞ τ r i2 r ,ales en qa.I eea,tebackward the general, In equal. being else all ), ilb dnia nyi easm htthey that assume we if only identical be will , exp(−r τ > τ g s .Ee o ifrn pdmc ftesame the of epidemics different for Even 0. 3 = ersnstegnrto nevlbetween interval generation the represents od o h akadgnrto interval generation backward the for holds (τ τ i2 τ g 0 sdanfo h akadincubation backward the from drawn is + )f https://doi.org/10.1073/pnas.2011548118 h eilitra sdfie sthe as defined is interval serial The 0 τ (τ i2 ) 0 − dτ ) f 0 nti case, this In s. τ . b i1 τ 0 0 , i2 orsod otenormal- the to corresponds sapoaiiydistribution, probability a is τ sdanfo h forward the from drawn is i1 and nti case, this In p. τ i2 τ τ i2 i1 r rw from drawn are PNAS sdanfrom drawn is sdanfrom drawn is | nthis In p. f12 of 3 τ i1 [4] is

POPULATION BIOLOGY backward incubation period distributions. We refer to the definition Eq. 4 by the number of individuals who developed symptoms at time p, j(p) = R ∞ 0 0 as the intrinsic serial interval (Fig. 1A). −∞ T (p − τ , p) dτ , we obtain the serial reproduction number, To correctly link the realized serial interval distribution to the renewal process between infections based on symptom onset dates, we must use R ∞ T ( , + τ 0) dτ 0 the forward serial interval (i.e., use the perspective of a cohort of infectors −∞ p p Rs(p) = , [9] that share the same symptom onset time). Given that an infector became j(p) symptomatic at time p, to calculate the forward serial interval, we first go backward in time to when the infector was infected, and then forward in which we define as the average number of infections generated by an indi- time to when the infectee was infected, and then forward again to when vidual who developed symptoms at time p. Combining the forward serial τ the infectee became symptomatic. In Fig. 1B, we see that i1 is drawn from interval distribution with the serial reproduction number completes the the backward incubation period distribution of the cohort of infectors who renewal process between symptomatic cases, became symptomatic at time p, τg is drawn from the forward generation interval distribution of the cohort of infectors who became infected at time Z ∞ p − τ , and τ is drawn from the forward incubation period distribution i1 i2 j(t) = Rs(t − τ)j(t − τ)ft−τ (τ) dτ. [10] of the cohort of infectees who became infected at time p − τi1 + τg. Like- −∞ wise, we can define the backward serial interval distribution for a cohort of infectees who became symptomatic at time s (Fig. 1C). This conceptual This framework allows us to understand changes in the realized serial framework demonstrates that the distributions of τi1, τg, and τi2 (and there- intervals for any epidemic model and properly link serial interval distribu- fore the distributions of realized serial intervals) depend on the reference tions with the renewal process. In addition, assuming that the reproduction cohort, which is defined by temporal direction (forward or backward) and a number as well as the forward serial interval distribution remain constant particular reference time. during the exponential growth phase, we can substitute j(t) ≈ j(0) exp(rt), To calculate realized serial interval distributions, we begin by modeling Rs(t) ≈ Rs(0), and ft−τ (τ) ≈ f0(τ) to obtain T (p, s): the total density of serial intervals that start (when infectors develop symptoms) at time p and end (when infectees develop symptoms) at time s. Z ∞ For simplicity, we assume that all infected individuals eventually develop 1 = exp(−rτ)f0(τ)dτ. [11] symptoms. Then, the density of serial intervals between times p and s, given Rs(0) −∞ that the infectors became infected at time α1 ≤ p and the infectees became infected at time α2 ≤ s, depends on the amount of infection that occurs Therefore, the initial forward serial interval distribution, f0(τ), provides the between times α1 and α2 as well as the density of forward incubation correct link between the exponential growth rate r and the initial serial periods between α1 and p (realized incubation periods of infectors) and reproduction number Rs(0). We revisit this idea later in Linking r and R between α2 and s (realized incubation periods of infectees), and show that the initial forward serial interval distribution provides the same r–R link as the intrinsic generation interval distribution. T |α α = R α × α × − α α − α (p, s 1, 2) c( 1) i( 1) hα1 (p 1, 2 1) | {z } |{z} | {z } case incidence Epidemic Model. We illustrate changes in forward and backward serial inter- reproduction joint density of of forward incubation number infection vals over the course of an epidemic by applying our framework to a specific periods p−α1 and forward generation intervals α2−α1 example of an epidemic model. We model disease spread with a renewal (of infectors) equation model (10, 26–30). Ignoring births and deaths, changes in the pro- × ` − α portion of susceptible individuals S(t) and incidence of infection i(t) can be α2 (s 2) , [5] | {z } described as marginal density of forward incubation periods s−α2 dS (of infectees) = −i(t) dt where the case reproduction number R (α ) is defined as the average Z ∞ c 1 i(t) = R(t) i(t − τ)g(τ) dτ, [12] number of secondary infections that an individual infected at time α1 will 0 generate over the course of their infection (25). We describe the forward incubation periods and the forward generation intervals using a joint prob- where R(t) is the instantaneous reproduction number [i.e., the average ability distribution because onset of symptoms and transmission potential number of infections that an individual infected at time t will generate if jointly depend on the life history of a disease; for example, if an infected conditions at time t remain unchanged (25)], and g(τ) is the intrinsic gener- individual can only transmit the disease after symptom onset, the forward ation interval distribution [i.e., the forward generation interval distribution generation interval will necessarily be longer than the forward incubation of a primary case in a population where changes in R(t) are negligible period. (24)]. This model assumes that g(τ) remains constant through time—in other The total density of serial intervals between times p and s can now be words, that epidemic dynamics are driven by changes in transmission rate. obtained by integrating over all possible infection times for the infector This assumption may not be well suited to individual-based intervention and the infectee, such as case isolation (25); nonetheless, this form has been widely used in Z p Z s the literature and has been successfully applied in modeling the current T (p, s) = T (p, s|α1, α2) dα2 dα1. [6] COVID-19 pandemic (31). −∞ α 1 Here, changes in reproduction number can be modeled as a product of the basic reproduction number R0, proportion susceptible S(t), and a Then, the forward serial interval distribution fp(τ) is given by the density of intervals of length τ starting at time p, relative to the total number of serial time-dependent factor M(t) (for example, accounting for nonpharmaceuti- intervals starting at time p, cal interventions and behavioral changes): R(t) = R0S(t)M(t); ref. 32 used a similar framework to evaluate the impact of nonpharmaceutical inter- T (p, p + τ) ventions on the spread of COVID-19 in 11 countries. Then, the forward f (τ) = . [7] p R ∞ 0 0 generation interval for a cohort of individuals that were infected at time −∞ T (p, p + τ ) dτ p follows (see ref. 14),

Likewise, the backward serial interval distribution bs(τ) is given by the den- sity of intervals of length τ ending at s, relative to the total number of serial g(τ)S(p + τ)M(p + τ) g (τ) = , [13] intervals ending at s, p R ∞ 0 0 0 0 0 g(τ )S(p + τ )M(p + τ ) dτ T (s − τ, s) b (τ) = . [8] s R ∞ 0 0 −∞ T (s − τ , s) dτ which allows us to separate the joint probability distribution hp of the for- ward incubation period and the forward generation interval distribution The denominator of the forward serial interval distribution (Eq. 7) then as a product of the proportion of susceptible individuals S and the joint corresponds to the total number of infections generated by infected individ- probability distribution h of the forward incubation period and the intrinsic uals who themselves developed symptoms at time p. Dividing this quantity generation intervals,

4 of 12 | PNAS Park et al. https://doi.org/10.1073/pnas.2011548118 Forward-looking serial intervals correctly link epidemic growth to reproduction numbers Downloaded by guest on September 26, 2021 Downloaded by guest on September 26, 2021 h nta pdmcgot hs ie,linking in (i.e., interested phase are we growth Since epidemic simulations. initial our the in depletion susceptible on vs. only forward (i.e., perspective the on over depend vary and epidemic backward). distributions an time of epidemiological course realized the the how Eqs. calculated illustrate into then to quantities are distributions these interval substituting serial by backward and forward The ae nteElrLtaeuto 6.Hr,w ou nteetmtsof estimates the on focus population, number we reproduction Here, basic (6). equation the Euler–Lotka the on based effects qualitative addition, In period. this of during constant roughly remain follows: as defined is model this for reproduction case the Finally, i eeainitra itiuinpoie h orc ikbtenthe between link correct the rate provides growth exponential distribution interval generation sic exponentially: (S grows infection constant approximately remains susceptible Linking serial of and shape generation the in on changes depend however, and still robust; curve can be intervals epidemic to the expected of are shape analysis detailed our from draw general we Therefore, framework. conclusions modeling this under depletion susceptible of eilitra distribution interval serial period. transmission presymptomatic Eq. distribution intervals, joint generation the of shape the in constant normalization R the where yais n sgvnby given is and distribution dynamics, interval serial intrinsic The by given is distribution interval serial forward initial the Here, positive), necessarily not is forward interval serial initial forward the as to r refer during distribution we distribution interval phase—which serial interval serial growth symptomatic forward exponential the between the expect process we serial renewal Therefore, forward the cases. distribution, describe interval distributions generation interval intrinsic the to Analogous life the represents as it it as denote epidemic, we an disease; of a course of the history not over does cohorts distribution across period incubation vary forward the that assume further We owr-okn eilitrascretyln pdmcgot orpouto numbers reproduction to growth epidemic link correctly intervals serial Forward-looking al. et Park −∞ ∞ steitiscgnrto nevldsrbto nt,hwvr htthe that however, (note, distribution interval generation intrinsic the as M ic ed o aeayasmtosabout assumptions any make not do we Since S3. section Appendix, SI efrhrcmaeti ihteetmt of estimate the with this compare further We o ipiiy elet we simplicity, For f 0 htreduces that f (τ 0 (τ r ) = dτ ) and h φ p 1 S (x (t R. = Z , ) −∞ τ ≈ .W rvd ahmtclpofo hsrelationship this of proof mathematical a provide We 1. 0 ) R R(t uigteiiilpaeo neiei,teproportion the epidemic, an of phase initial the During = 1), c Z (t R α R oooial vrtm ilb iia oteimpact the to similar be will time over monotonically ) ) 0 τ 1 ∞ = R intrinsic R M exp(r 1 1 1 r R R 0 0 18 q(τ f 0 = g(τ n h nta erdcinnumber reproduction initial the and ∞ 0 = `(x 0 = t siaetesm au of value same the estimate —to n sueta pdmcdnmc depend dynamics epidemic that assume and 1 Z α od,i eea,wehro o hr sa is there not or whether general, in holds, Z h(x = ), Z ) ) 0 i h(x −∞ 1 0 = = (t ∞ ∞ Z )h(−α ∞ 0 R ) −∞ Z Z , , g(τ ≈ ∞ τ φ exp(−r 0 0 0 τ exp(−r h ∞ ∞ 0 hn ehave we Then, `. )S i sdtrie yterqieetthat requirement the by determined is tevleof value (the )S 0 )S exp(−r ewe nuainprosadthe and periods incubation between (p 1 exp(rt h(x h(x (p (t , q(τ α + + + M , , 2 τ τ 7 τ τ τ − osntdpn nepidemic on depend not does ) τ )g(τ τ )f . dx ) dτ ) )M .Drn hspro,teintrin- the period, this During ). and 0 τ )M 0 )M α )q(τ (τ (p (t 1 dτ. ) (t . , r )`(τ (p )dτ. ) eueti framework this use We 8. + + R to ≈ )dτ. + R τ 0 τ S − ,w expect we R), τ ) 0) n niec of incidence and (0)), dτ. ) ae nteintrinsic the on based naflysusceptible fully a in 0 α dτ ) 2 dα ) 0 dx R 2 0 0 dα R . o given a for 1 = , R(t R 0 [19] [16] [15] [14] [20] [18] [17] S to ) (0) hr h omlzto constant R normalization the where lduigabvraelgnra itiuinwt orltos(ntelog the mod- (on is correlations with distribution distribution probability scale) log-normal joint bivariate distributions, a The normal using exponentiated. eled underlying later the of are deviations which standard and mean mean the log with deviation distribution dard log-normal a using mean log σ with distribution normal period incubation intrinsic Mean evaluate to can and distribution epidemic interval distribution of serial an estimates interval Euler–Lotka the of the of serial affect course definitions realized characterize different the the how to over which COVID-19 change to can for degree estimates the parameter use We Results for allowing period, incubation the transmission. than presymptomatic shorter be to log- intervals bivariate generation the of scale) log distribution: the normal (on four coefficients consider correlation on we the simplicity, for based For values 1). parameterized (Table COVID-19 are of for develop intervals distributions estimates parameter Marginal generation who later. and individuals transmit periods to is, positively likely incubation be that more to are period; later intervals incubation symptoms generation the the symptom with expect of time correlated generally the we around peaks (11), SARS-CoV-2 of onset load viral the that model, Given equation renewal the (in Eq. intervals generation intrinsic distribution and probability distri- periods joint lognormal the bivariate model a to use incubation we bution between Here, distribution intervals. generation joint and the periods on depend distribution interval Parameterization. Model reflect to order in Instead, COVID-19. model of equivalent. the dynamics in are transmission not transmission the periods do (SI presymptomatic exposed we for infector however, and allow an paper, we incubation of this onset the in period symptom that else incubation between assume Everywhere the time S5). of section the independent Appendix, 2) is only and infection can onset individuals and infected symptom the dur- 1) after distributions because case, phase same transmit this infec- growth the In exponential follow and the case. intervals symptoms ing generation special of and a serial onset provides forward that simultaneously), (i.e., occur equivalent shape tiousness and are the incubation the periods Susceptible–Exposed–Infected– on that assumption the exposed depends additional example, the distribution under For model, interval distribution. Recovered shape joint serial exact the the forward However, of initial distributions. forward the interval intrin- initial the serial of than the and mean larger Therefore, generation a occur. have sic generally to will distribution likely interval less (Eq. serial is short transmission be to tomatic periods incubation backward rate growth tial S4). section Appendix, (SI simplicity oilsrt elsi antdso ievrigprpcieefcsi the in effects COVID-19 time-varying/perspective of pandemic. of current characteristics magnitudes match realistic to illustrate chosen to are distributions interval tion Values simulations for used Parameter values Parameter 1. Table of Ditiscicbto period incubation intrinsic SD Ditiscgnrto interval generation intrinsic SD interval generation intrinsic Mean −∞ I ∞ = h nrni nuainpro itiuini aaeeie sn log- a using parameterized is distribution period incubation intrinsic The exponen- the on depends distribution interval serial forward initial The g, ,wiealwn o h osblt htte ih ecorrelated. be might they that possibility the for allowing while 12), 2 h nrni eeainitra itiuini parameterized is distribution interval generation intrinsic The 0.42. q(τ f ρ 0 and , = dτ ) q(τ h nrni nuainpro n genera- and period incubation intrinsic The 0 .75} 0.5, 0.25, {0, ) = q = σ .Rte hnnmrclyitgaigoe lsdforms closed over integrating numerically than Rather 1. oestimate to G φ r 1 = o atgoigeiei (high epidemic fast-growing a For . q Z 7 o enadlgsadr eitosrepresent deviations standard log and mean Log 0.37. ρ −∞ = 0 5 hsprmtrzto losfor allows parameterization This 0.75. 0.5, 0.25, 0, ehv hw httednmc fteserial the of dynamics the that shown have We Z α τ R 1 h(−α 0 euesmlto-ae prahsfor approaches simulation-based use we , φ q 1 µ , sdtrie yterqieetthat requirement the by determined is https://doi.org/10.1073/pnas.2011548118 α I = 2 − 2adlgsadr deviation standard log and 1.62 R α 0 1 efrhradeshow address further We . )`(τ ,maigta presymp- that meaning 3), . d 5.5 . d 2.4 . d 1.9 d 5.0 µ − h G α fitiscincubation intrinsic of = 2 dα ) 4adlgstan- log and 1.54 r ,w xetthe expect we ), 2 PNAS dα 1 , | Source f12 of 5 (33) (33) (34) (34) [21]

POPULATION BIOLOGY the observed serial intervals, measured through contact trac- longer serial intervals (and a corresponding lower proportion of ing, are affected by right censoring during an ongoing epidemic negative intervals). and provide a heuristic method for addressing biases that can Comparing the shapes of the initial forward serial interval arise from using serial interval data to estimate R0. Finally, we distribution (Eq. 19) and the intrinsic generation interval dis- analyze serial interval data from the COVID-19 epidemic in tribution allows us to better understand how different forward China, outside Hubei province, based on 468 transmission events distributions lead to identical estimates of R0. In general, dis- reported between January 21 and February 8, 2020, under our tributions with higher means and less variability lead to higher framework. R0 for a given r (6, 35, 36). When incidence is growing expo- nentially, forward serial intervals have higher means (Fig. 2C) Realized Serial Interval Distributions during the Exponential Growth and squared coefficients of variation (Fig. 2D) than the intrin- Phase. Fig. 2 shows Euler–Lotka estimates of R0 based on dif- sic generation interval distribution. The effects of higher means ferent definitions of the serial interval. When the initial forward (which increase R0) exactly cancel those of higher variability serial interval distribution f0(τ) is used, estimates (from Eq. (which decrease R0). On the other hand, intrinsic serial intervals 18) exactly match the (correct) generation interval-based esti- (Eq. 21) have the same mean (equal to the mean initial forward mates (Eq. 17) for all values of the correlation ρ between the serial at r = 0 in Fig. 2C) as the intrinsic generation intervals but intrinsic incubation period and the intrinsic generation interval are more variable (also see squared coefficient of variation of (Fig. 2A). When the intrinsic distributions are used, however, the initial forward serial interval distribution at r = 0 in Fig. 2D); estimates based on the serial interval (Eq. 20) underestimate R0: therefore, we underestimate R0 when we use the intrinsic serial As r increases, Rintrinsic saturates and eventually decreases due interval distribution. to the increasing inferred importance of negative serial intervals (Fig. 2B). While the initial forward serial intervals during the Realized Serial Interval Distributions during an Ongoing Epidemic. exponential growth phase can also be negative, their effects are The initial forward serial interval distribution captures the expo- appropriately balanced, because faster epidemic growth leads to nential growth phase of an epidemic. We now explore how

AB

CD

Fig. 2. Estimates of the reproduction number from the exponential growth rate based on serial and generation interval distributions. (A) The ini- tial forward serial interval distributions give the correct link between the exponential growth rate r and the reproduction number R0, for any correlation ρ between intrinsic incubation period and intrinsic generation interval of the underlying bivariate log-normal distribution. (B) The intrin- sic serial interval distributions give an incorrect link between r and R0.(C) The mean initial forward serial interval during the exponential growth phase increases with r.(D) The squared coefficient of variation of the initial forward serial intervals during the exponential growth phase decreases with r.

6 of 12 | PNAS Park et al. https://doi.org/10.1073/pnas.2011548118 Forward-looking serial intervals correctly link epidemic growth to reproduction numbers Downloaded by guest on September 26, 2021 Downloaded by guest on September 26, 2021 owr-okn eilitrascretyln pdmcgot orpouto numbers reproduction to growth epidemic link correctly intervals serial Forward-looking al. et Park number low See to simplicity. due values. for stochasticity parameter other, initial for and each 1 delays Table of time See of independent intervals. censoring be left delays simulations. to (e.g., backward stochastic dynamics and 10 assumed Forward transient of delay. are possible results forward set remove initial the to we mean represent order the infections), (A) In represent lines of 1. lines Gray Dashed Fig. simulation. simulations. to deterministic stochastic according 10 a colored of of are average results the the represent represent (B–G) lines points (B–G) Colored colored (E and interval. (A) serial Black and interval. interval, generation period, incubation forward 3. Fig. infected were they when 38). estimate ref. to their (e.g., try observe and estimating typically date we are onset individual, we symptom an when of Similarly, period incubation infected. backward the were and the they when ask often whom and by individuals are infected observed identify epidemic typically the We ongoing ones: because an important and during is forward intervals distributions the China between backward in differences the primary the COVID-19 our is understanding of distribution interval focus, serial dynamics assume forward further the transmission While and we (37). the S1 1; reflect sections Table , Appendix in to (SI parameters simulations equations using stochastic renewal S2) and course the the deterministic on over using based vary epidemic, can an intervals of serial backward and forward E B A pdmooia yaisadcagsi enfradadbcwr ea itiuin.( distributions. delay backward and forward mean in changes and dynamics Epidemiological t = otefis iepitwe al niec sgetrta 0.Itiscicbto eid n nrni eeainintervals generation intrinsic and periods incubation Intrinsic 100. than greater is incidence daily when point time first the to 0 FG CD R i.S1 Fig. Appendix, SI 0 2 = hne ntema akadicbto eid eeainitra,adserial and interval, generation period, incubation backward mean the in Changes –G) . 5, eid )tefradgnrto nevl n )teforward the 3) and interval, generation forward 1 the (Fig. 2) intervals period, three of tions interval serial forward mean 3D). the (Fig. contrast, time rapidly over In declines decreases 24). population (22, susceptible 3C) mean the (Fig. when The is is incidence 3B). when which slightly (Fig. high, constant decreases assumption remains interval generation by period forward simula- epidemic, incubation individual-based an forward throughout on mean based The tions. realizations stochastic (Eq. equation ing renewal 3 the on (Fig. based 3 forward (Fig. mean ward the with h owr eilitra itiuin eedo distribu- on depend distributions interval serial forward The together 3A) (Fig. dynamics epidemiological the shows 3 Fig. o iuain ihcreae nuainprosadgeneration and periods incubation correlated with simulations for ea itiuin fadtriitcmodel deterministic a of distributions delay E–G) hne ntemean the in Changes (B–D) time. over incidence Daily A) :1 h akadincubation backward the 1) B): https://doi.org/10.1073/pnas.2011548118 n h enback- mean the and B–D) n ftecorrespond- the of and 12) PNAS | f12 of 7

POPULATION BIOLOGY incubation period. In these simulations, both forward incubation an unbiased estimate of the basic reproduction number, we need period (Fig. 3B) and generation interval (Fig. 3C) distributions to estimate the initial forward serial interval distribution—that is, remain roughly constant; therefore, changes in the forward serial serial intervals based on cohorts of infectors who share the same interval distributions (Fig. 3D) are predominantly driven by symptom onset time, at the early stage of the epidemic. However, changes in the backward incubation period distribution, whose researchers typically use all available information to estimate mean increases over time as the growth rate of disease incidence epidemiological parameters (e.g., aggregating all serial intervals slows and then reverses. In general, relative contributions of the observed until certain time of an epidemic). For example, ref. three distributions depend on their shapes, correlations between 18 recently suggested that up-to-date serial interval data are intrinsic incubation periods and generation intervals, and overall necessary to accurately estimate the reproduction number. We epidemiological dynamics. explore the consequences of neglecting changes in the realized We see similar qualitative patterns in all three backward delays serial interval distribution on estimates of the basic reproduction (Fig. 3 E–G and Eq. 2), because they are predominantly driven number. by the rate of change in incidence, which, in turn, affects rela- When an epidemic is ongoing, the observed serial inter- tive cohort sizes. When incidence is increasing, individuals are vals are subject to right censoring because we cannot observe more likely to have been infected recently, and therefore we are a serial interval if either an infector or an infectee has not more likely to observe shorter intervals (Eq. 3). Similarly, when yet developed symptoms. For example, if we were to measure incidence decreases, we are more likely to observe longer inter- serial intervals on day 8 as in Fig. 4A, we will only be able to vals. Neglecting these changes will bias the inference of intrinsic observe the first six events (ID 1 to 6). Fig. 4B demonstrates distributions from observed distributions. how the effect of right censoring in the observed serial inter- vals translates to the underestimation of the basic reproduction Observed Serial Interval Distributions. Now, we turn to practical number R0 in our stochastic simulations (assuming R0 = 2.5 as issues of estimating the reproduction number from the observed in Fig. 3). Notably, even if we could observe and aggregate all serial interval data during on ongoing epidemic. In order to have serial intervals across all transmission pairs after the epidemic

AB

C D

Fig. 4. Estimating the reproduction number from the observed serial intervals. (A) Schematic representation of line list data collected during an epidemic. (B) Estimates of R0 based on all observed serial intervals completed by a given time. (C) Schematic representation of line list data rearranged by symptom onset date of infectors. (D) Estimates of R0 based on all observed serial intervals started by a given time. Black dashed lines represent the mean initial forward serial interval and R0. Black solid lines represent the mean intrinsic serial interval and Rintrinsic. Colored solid lines represent the mean estimates of R0 across 10 stochastic simulations. Colored ribbons represent the range of estimates of R0 across 10 stochastic simulations.

8 of 12 | PNAS Park et al. https://doi.org/10.1073/pnas.2011548118 Forward-looking serial intervals correctly link epidemic growth to reproduction numbers Downloaded by guest on September 26, 2021 Downloaded by guest on September 26, 2021 owr-okn eilitrascretyln pdmcgot orpouto numbers reproduction to growth epidemic link correctly intervals serial Forward-looking al. et 13. Park ref. of across materials delays supplementary observable from minimum of taken and were estimates maximum data Cohort-averaged the The (D) represent CIs. dates. lines bootstrap onset dashed 95% The symptom associated fits. reported smoothing of scatterplot range estimated the (C locally secondary estimated and the (B) represent lines primary of (B dates data. onset tracing symptom contact the in included pairs of set we a that is over difference averaging major to the analogous information 4D); is all (Fig. time using intervals as certain just backward a intervals, forward to of up set a over became ing increase who we we as infectors is, time change from that before intervals analysis; symptomatic the serial into observed cohorts analyze recent more incorporate i.5. Fig. of estimate the (par- as can distribution well we interval as serial Then, mean) the distribution. its case of interval ticularly primary shape inter- a serial the of how serial forward date compare earlier, the onset onset showed symptom us we same symptom give the as the share 4C); that on (Fig. vals based infectors and of intervals interval date serial serial forward observed initial group mean the of therefore estimate the in collected interpreted been be must care. have epidemic with an that of intervals periods different serial epi- throughout of aggre- the Therefore, distributions if 3C). interval (Fig. gated phase population generation the depletion through forward backward burnt susceptible the demic to the of value during subject intrinsic contraction distribution the to be underestimate and even due longer would periods slightly we incubation no fact, In the will biases. as intervals distribution, generation intrin- interval the for- to serial converges mean sic distribution interval initial serial the observed The underestimate therefore (and still interval serial would ward we ended, has ee epoieahuitcwyo sesn oeta biases potential assessing of way heuristic a provide we Here, bevdsra nevl fCVD1 n ootaeae siae of estimates cohort-averaged and COVID-19 of intervals serial Observed R 0 ersetvl.W a erag h iels and list line the rearrange can We retrospectively. t C AB hsapoc saaoost averag- to analogous is approach This . t n vlaehwteestimates the how evaluate and R 0 ,lkl yalreamount. large a by likely ), and C owr n akadsra nevl vrtm.Sra nevldt aebe rue ae nthe on based grouped been have data interval Serial time. over intervals serial backward and Forward ) ae.Pit ersn h en.Vria ro asrpeette9%eutie unie.Solid quantiles. equitailed 95% the represent bars error Vertical means. the represent Points cases. ) R 0 hnea we as change erae vrtm.Wietedces slkl ob affected be to likely is decrease the While time. over 95% decreases and 5C) 5B mean (Fig. Fig. interval their secondary quantiles. serial compute and backward respectively—and 5B) and symptom distributions, (Fig. the forward quan- primary by to to them the order cases—corresponding group of In we 40). dates intervals, ref. onset serial in in epidemic 1 changes COVID-19 figure tify (compare total a China a resembling in of individuals), curve (consisting unique pairs 752 transmission of 468 within and (95% individuals d) d 3.96 4.39 5A of to interval Fig. serial d mean 2020. 3.53 the 8, estimated CI: February (13) and al. from 21 et (13) January Du between al. transmis- reported et 468 events on Du based sion province, by Hubei outside collected China, Pandemic. COVID-19 mainland of COVID-19 intervals the serial to Applications the on effect in their changes and dynamical distributions of detect estimates interval to serial us forward allows the distribu- growth interval approach exponential serial This forward the tions. the in after changes soon reflecting period, rapidly cohort-averaged the decrease However, estimates period. this during inference cise and exponen- interval the serial 4 mean Fig. During the rather period. of period, certain estimates R certain a the a phase, in growth in end tial begin that that those intervals than serial on focus R D 0 0 ypo ne ae falidvdaswti 6 transmission 468 within individuals all of dates onset Symptom (A) R. suigduln eido n 1,3) ibn ersn the represent Ribbons 39). (14, d 8 and 6 of period doubling assuming r ossetwt h revle(e iiilfrad in forward” “initial (see value true the with consistent are B and hw h itiuino ypo ne ae fall of dates onset symptom of distribution the shows R ;adn oedt losu omk oepre- more make to us allows data more adding D); 0 . hw httema owr eilinterval serial forward mean the that shows R 0 f13 9%C:11 o1.48). to 1.16 CI: (95% 1.32 of https://doi.org/10.1073/pnas.2011548118 ial,w reanalyze we Finally, PNAS | f12 of 9

POPULATION BIOLOGY by the right censoring (indicated by the closeness between the grouped serial intervals by the symptom onset date of infectors quantiles of the observed serial intervals and maximum observ- across 14-d periods and found that the mean forward serial inter- able serial intervals), the increase in the proportion of negative val decreased from 7.8 d to 2.6 d. While they attributed the serial intervals indicates changes in the forward serial inter- decrease in serial intervals to reduction of the isolation delay, val distribution; this proportion is unlikely to be affected by their regression analysis showed that isolation delays explain only left censoring (based on the gap between the quantiles of the 51.5% of the variation in serial intervals (they could explain up to observed serial intervals and minimum observable serial inter- 72% of the variance by including other intervention measures). vals). The decrease in the mean forward serial interval was Our framework provides an explanation for the remaining vari- probably driven by interventions against spread. Interventions ation: Changes in the backward incubation period during the during this time period both decreased (and then reversed) the decreasing phase of an epidemic act to further shorten serial growth rate of COVID-19 cases—thus increasing the backward intervals due to increased amount of presymptomatic transmis- incubation period—and also reduced generation intervals, by sion (even in the absence of nonpharmaceutical interventions). preventing infections once cases were identified. Both of these Isolation delays and other intervention measures affect the would have acted to reduce the forward serial interval. Fig. 5C amount of onward transmission, and therefore the distribution shows that the mean backward serial interval increased over of realized (forward) generation intervals. They therefore are not time, also likely driven directly by the decrease in COVID-19 expected to explain all of the variation in forward serial intervals, infections. since these additionally depend on both the backward incubation While the qualitative changes in the mean forward and back- period of the infector and the forward incubation period of the ward serial interval are consistent with our earlier simulations infectee (Fig. 1B). (Fig. 3), the initial mean forward serial interval (Fig. 5B) appears Our results support the use of serial interval distributions to be larger than what we calculated based on previously esti- for calculating the R during the exponential growth phase, mated incubation period and generation interval distributions but they also reveal gaps in current practices in incorporating (Fig. 2C). This difference may imply that the incubation period serial interval distributions into outbreak analyses. For example, and generation interval (Table 1) were underestimated, as nei- ref. 18 recently emphasized the importance of using up-to- ther study explicitly accounted for the fact that the observed date serial interval data for accurate estimation of time-varying intervals were drawn from the backward distributions and were reproduction numbers. However, our results show that, if obser- likely to have been censored. vational biases in the forward serial interval through time are Fig. 5D shows the cohort-averaged estimates of R0, which not accounted for, using up-to-date serial interval data can actu- remain roughly constant until day January 17 and suddenly ally exacerbate the underestimation of R in the initial growth decrease; this sudden decrease is due to changes in the forward phase of an outbreak. Future studies should explore how neglect- serial intervals consistent with the dynamics seen in our simu- ing changes in the forward serial interval distribution can affect lations (Fig. 4). The cohort-averaged estimates of R0 based on the estimates of R beyond the exponential growth phase, and the early forward serial intervals are also consistent with pre- potentially reassess existing estimates of R. We also suggest that vious estimates of R0 of the COVID-19 epidemic in China (1, modelers should aim to characterize spatiotemporal variation in 37): R0 = 2.6 (95% CI: 2.2 to 3.1) and R0 = 3.4 (95% CI: 2.7 forward serial interval distributions. These modeling approaches to 4.3) based on a doubling period of 8 d or 6 d, respectively, should be coupled with epidemiological investigation through using serial interval data from infectors who developed symp- contact tracing. Going forward, an additional advantage of early, toms by January 17. These early cohort-averaged estimates of R0 intensive contact tracing of emerging diseases is that it provides are unlikely to be affected by the right censoring, as we expect the the best information to characterize the initial forward serial degree of right censoring to be low (Fig. 5A). Therefore, the orig- interval distribution. inal R0 estimate of 1.32 (95% CI: 1.16 to 1.48), which neglects Our study underlines the fact that the serial interval distribu- the changes in the forward serial interval distribution, underesti- tion depends not only on the generation interval and incubation mates R0 by a factor of 2.0 to 2.6. This example demonstrates period distributions but also on the correlation between their the danger of using the observed serial intervals to calculate duration in a given individual. Here, we use a bivariate lognormal the reproduction number without organizing serial intervals into distribution to capture these correlations phenomenologically cohorts. and to show that realized serial intervals can decrease over time in the context of COVID-19. Although their true correlation Discussion will depend on viral load dynamics, we expect our conclusions Generation and serial intervals determine the time scale of about decreasing serial intervals of COVID-19 to be robust, as disease transmission, and are therefore critical to dynamical individuals with longer incubation periods will generally have a modeling of infectious outbreaks. We have shown that the initial longer time window to transmit before symptom onset. In gen- forward serial interval distribution—measured from the cohort eral, the impact of increasing backward incubation periods on of infectors who developed symptoms during the exponential the forward serial intervals are likely to be disease specific—for growth phase of an epidemic—provides the correct link between example, we show, in SI Appendix, section S5, that the initial for- the exponential growth rate r and the initial reproduction num- ward serial interval distribution can be equivalent to the intrin- ber R. In general, the forward serial interval distributions will sic generation interval distribution, regardless of the growth not match the intrinsic serial interval distribution (which has rate r, due to independence between the incubation period the same mean as the intrinsic generation interval distribution) and time from symptom onset to transmission and the lack because the incubation period of the infectors (conditional on of presymptomatic transmission. Future studies trying to inter- their symptom onset date of the infector) will be subject to back- pret realized serial intervals should consider carefully the joint ward biases. In particular, the mean forward serial interval can distribution between the generation intervals and incubation decrease over time for COVID-19, as individuals who develop periods. symptoms later in an epidemic are more likely to have longer In closing, we lay out a few practical principles for analyz- incubation periods, and therefore have greater opportunity to ing and interpreting serial interval data. First, serial intervals transmit presymptomatically. Failing to account for these effects should be cohorted based on the symptom onset date of the can result in underestimation of initial R. infector (and not of the infectee) whenever possible. Previous Recently, Ali et al. (41) also showed that forward serial studies have often regarded serial intervals as an intrinsic quan- intervals of COVID-19 decreased through time in China. They tity, having the same mean as the intrinsic generation interval

10 of 12 | PNAS Park et al. https://doi.org/10.1073/pnas.2011548118 Forward-looking serial intervals correctly link epidemic growth to reproduction numbers Downloaded by guest on September 26, 2021 Downloaded by guest on September 26, 2021 6 .M nesn .Hetrek .Kikneg .D olnsot,Hwwill How Hollingsworth, Hellewell D. J. T. 17. Klinkenberg, D. Heesterbeek, H. Anderson, M. R. 16. 5 .Zhao domestic S. potential 15. the forecasting and Nowcasting Leung, M. G. Leung, K. Wu, T. J. 14. Du Z. 13. Abbott S. 12. He X. SEIR Erlang-distributed 11. the of Equivalence Earn, D. J. D. Dushoff, J. Champredon, D. 10. owr-okn eilitrascretyln pdmcgot orpouto numbers reproduction to growth epidemic link correctly intervals serial Forward-looking al. et Park rmr vn iecnhl vi akadbae (although changes potential biases detect distribution. as backward the well in as avoid exist) still help can biases can the censoring feature by time pervasive delays a event epidemiological be primary Cohorting to dynamics. expected outbreak backward due are distributions of sizes the delay cohort backward for the changing the of correct in to epidemiologi- changes phases to but other early (42), tried biases to the have apply during epidemics studies they COVID-19 Some that distributions. show generalize cal and we 24); ideas 23, (19, these intervals backward generation vs. forward of of measurement importance the Pre- shown have distributions. studies time vious epidemiological measured defining fully to data issues interval these serial interpreting but dynamics. when intervals, transmission mind inform serial in about kept data be fol- should limited to epidemic to hard the be due sometimes low, of will In trajectory recommendations context. these epidemiological a practice, provide by to possible, accompanied whenever curve, be interval should serial Finally, data specific. epi- disease are than intervals rather given serial specific, because a demic care, with of done epidemics be should across serial disease information intrinsic Third, interval 4). the serial (Fig. throughout obtain depletion applying susceptible observed not to due be do distribution, we can interval epidemic, intervals unmitigated serial Even an realized biases: be epidemiological all be can and when should distribution censoring interval different periods to serial epidemic subject realized and the because cohorts avoided different observed inter- across of serial aggregating vals mean) Second, dynamics. the epidemic through to (and changes due and time, distribution expectation, this the from differs but intervals serial 19), 10, 8, (7, .D .T es,J alna .Dne,M .Bvn siaigtegnrto inter- generation the Estimating Boven. V. M. Donker, T. Wallinga, J. incubation Beest, Te and E. D. infectivity 9. between correlation The Nishiura, relationship H. the Klinkenberg, D. shape 8. intervals generation How 7. Lipsitch, M. rates. growth Wallinga, epidemic initial Estimating influenza. Earn, J. D. pandemic J. 6. D. 1918 Bolker, M. of B. Transmissibility Dushoff, J. Lipsitch, Ma, J. M. 5. Robins, M. J. Mills, computation E. C. the and 4. definition the On Metz, A. J. on Heesterbeek, A. preprints J. Diekmann, of O. Impact 3. May, M. epidemic: R. the Anderson, M. in R. 2. Early Mandl, D. K. Majumder, S. M. 1. oebody u td nelnsteiprac fcare- of importance the underlines study our broadly, More n contacts. and epidemic?. COVID-19 the of course Lancet the influence measures mitigation country-based outbreak. early the (2020). during taaa033 (COVID-19) 27, disease coronavirus novel the of Med. Nat. n nentoa pedo h 09no ubekoiiaigi ua,Cia A China: Wuhan, in originating outbreak study. 2019-nCoV modeling the of spread international and Dis. Infect. Emerg. 2020. April 20 Accessed varying-transmission.html. https://cmmid.github.io/topics/covid19/current-patterns-transmission/global-time- equation. renewal the (2018). and model settings. epidemic social of range a in (H1N1) (2013). A influenza of val cases. two with households from (2011). 52–60 estimated measles, of period models. epidemic in times (2007). generation 311 on note A Svensson, numbers. A. reproductive and (2007). rates growth between Biol. Math. Bull. Nature populations. ratio reproduction basic the of 1991). Press, University (Oxford transmissibility. (2020). covid-19 about discourse global ˚ eilitra fCVD1 mn ulcyrpre ofimdcases. confirmed reported publicly among COVID-19 of interval Serial al., et eprldnmc nvrlsedn n rnmsiiiyo COVID-19. of transmissibility and shedding viral in dynamics Temporal al., et 3–3 (2020). 931–934 395, 0–0 (2004). 904–906 432, eilitra ndtriigteetmto frpouto number reproduction of estimation the determining in interval Serial al., et eprlvraini rnmsindrn h OI-9outbreak. COVID-19 the during transmission in variation Temporal al., et 7–7 (2020). 672–675 26, esblt fcnrligCVD1 ubek yioaino cases of isolation by outbreaks COVID-19 controlling of Feasibility al., et .Mt.Biol. Math. J. actGo.Health Glob. Lancet 4–6 (2014). 245–260 76, Lancet 3114 (2020). 1341–1343 26, 8–9 (2020). 689–697 395, 6–8 (1990). 365–382 28, netosDsae fHmn:Dnmc n Control and Dynamics Humans: of Diseases Infectious R 0 48e9 (2020). e488–e496 8, nmdl o netosdsae nheterogeneous in diseases infectious for models in IMJ pl Math. Appl. J. SIAM actGo.Health Glob. Lancet rc il Sci. Biol. Proc. Epidemiology ah Biosci. Math. .Ter Biol. Theor. J. 3258–3278 78, 599–604 274, e627–e630 8, .Ta.Med. Trav. J. 244–250 24, 300– 208, 284, 6 .W ak .Capeo,J .Wiz .Dsof rcia generation-interval- practical A Dushoff, J. Weitz, S. for J. Challenges Champredon, ebola: D. of Park, W. transmission S. post-death 36. Modeling Dushoff, J. Weitz, S. J. 35. Ferretti L. 34. Lauer A. S. 33. Flaxman S. 32. Gostic M. repro- K. basic 31. the of estimation Model-consistent smallpox Heesterbeek, a P. of A. control J. the Roberts, for G. model M. equation integral 30. An Roberts, G. M. Aldis, K. G. 29. 8 .G oet,Mdln taeisfrmnmzn h mato nipre exotic imported an of impact the minimizing for strategies Modeling Roberts, G. M. 28. Heesterbeek. P. A. J. Diekmann, O. of 27. concept The Dietz. K. Heesterbeek, P. A. J. emerging an 26. in numbers reproduction household and individual infectious- Estimating in Fraser, intervals C. generation 25. realized and Intrinsic Dushoff, Impli- J. Champredon, disease: D. infectious an 24. of time generation the in variations Time Nishiura, data H. epidemic and 23. contraction interval Generation Robins. distributions M. J. generation-interval Lipsitch, M. Inferring Kenah, E. Dushoff, J. 22. Champredon, D. Park, W. S. 21. remedies. and Ganyani T. Biases epidemics: 20. emerging in Estimation Tomba, S. G. Britton, T. 19. Thompson N. R. 18. hsrpr r hs fteatosadd o eesrl ersn the represent necessarily Services. Human not and Health do of and Department or authors in NIH US the conclusions the of of and position by those official findings supported are The was report J.S.W. (W911NF1910384). this (CIHR). Office supported Research Natu- Research was Health by J.D. Army of Infectious (NSERC). Institutes supported Council for Canadian was Research Institute by Engineering D.J.D.E. and DeGroote University. Sciences G. ral McMaster Michael Research, the Disease from funding response ACKNOWLEDGMENTS. and in article included the are data study All (https://github.com/parksw3/serial). repository Availability. Data inter- estimating serial in of use pandemic. estimates COVID-19 their reassessing distributions—and for study Our val rationale 46–49). 13, a (11, epidemic constant provides an remain of distributions course interval the of serial throughout the explic- estimates that or implicitly assume the COVID-19 itly of on intervals serial the effect of the- serial estimates its our in and for variation temporal intervals support demonstrating further framework, provides oretical China from COVID-19 of scale of estimates time is the bias the there also may Neglecting asymptomatic, transmission asymptomatic interval. is will serial pair symptom-based transmission intervals no a serial in observed individuals of the the estimates in the Biases bias 44). necessarily 43, (11, dif- this ficulty identifying exacerbates and COVID-19 asymptomatic practice, of and transmission general, In presymptomatic in difficult, exactly. epi- is whom relevant known infected who all is including delays, process, demiological transmission entire the that ept hs iiain,oraayi fsra nevl of intervals serial of analysis our limitations, these Despite and symptoms develop individuals all that assume we Here, ae praht nern h tegho pdmc rmterspeed. their from epidemics of (2019). strength 12–18 the 27, inferring to approach based control. for opportunities and inference tracing. contact digital (2020). application. 577–582 and Estimation cases: confirmed reported publicly Europe. in COVID-19 number, infection. emerging an of incidence (2007). the 803–816 from number duction outbreak. iess oe ulig nlssadInterpretation infection. and Analysis Building, 5. vol. Model Diseases: (1996). 89–110 epidemic. transmission. disease potential. transmission quantify appropriately Eng. to sampling for cations analysis. data. 2020. contact-tracing from March data, onset symptom (2020). on based 19) Interface Soc. outbreaks. disease infectious during 5–6 (2010). 851–869 7, ah Biosci. Math. R uniyn ASCV2tasiso ugsseiei oto with control epidemic suggests transmission SARS-CoV-2 Quantifying al., et rc o.Sc od il Sci. Biol. B Lond. Soc. Roy. Proc. lSOne PloS ah Biosci. Math. t siaigtegnrto nevlfrcrnvrsdsae(COVID- disease coronavirus for interval generation the Estimating al., et h nuainpro fcrnvrsdsae21 CVD1)from (COVID-19) 2019 disease coronavirus of period incubation The al., et eRi:010/000.82145 2 uut2020). August (28 medRxiv:10.1101/2020.06.18.20134858 . siaigteefcso o-hraetclitretoson interventions non-pharmaceutical of effects the Estimating al., et rcia osdrtosfrmauigteefciereproductive effective the measuring for considerations Practical al., et 0860(2019). 20180670 16, IAppendix SI mrvdifrneo ievrigrpouto numbers reproduction time-varying of inference Improved al., et l aaadcd r trdi ulcyaalbeGitHub available publicly a in stored are code and data All 78(2007). e758 2, Nature rc il Sci. Biol. Proc. 17 (2008). 71–79 213, Science –2(2005). 1–22 195, ..adDJDE r rtflfrCVD1 rapid COVID-19 for grateful are D.J.D.E. and J.D. .R o.Interface Soc. R. J. . 5–6 (2020). 257–261 584, ab96(2020). eabb6936 368, 0506(2015). 20152026 282, Epidemics ahmtclEieilg fInfectious of Epidemiology Mathematical c.Rep. Sci. https://doi.org/10.1073/pnas.2011548118 4121 (2004). 2411–2415 271, utemr,we n of one when Furthermore, R. 0979(2020). 20190719 17, R 036(2019). 100356 29, 0 71(2015). 8751 5, neiei theory. epidemic in Eurosurveillance otexisting Most R. n.Itr.Med. Intern. Ann. Jh ie,2000), Wiley, (John drn the R—during PNAS .Mt.Biol. Math. J. tt Neerl. Stat. ah Biosci. Math. 2000257 25, | R Epidemics 1o 12 of 11 (45). 172, .R. J. 55, 50,

POPULATION BIOLOGY 37. S. W. Park et al., Reconciling early-outbreak estimates of the basic reproductive 44. W. E. Wei, Presymptomatic transmission of SARS-CoV-2—Singapore, January 23– number and its uncertainty: Framework and applications to the novel coronavirus March 16, 2020. Morbidity Mortality Weekly Rep. 69, 411–415 (2020). (SARS-CoV-2) outbreak. J. R. Soc. Interface 17, 20200144 (2020). 45. S. W. Park, D. M. Cornforth, J. Dushoff, J. S. Weitz, The time scale of asymptomatic 38. J. A. Backer, D. Klinkenberg, J. Wallinga, Incubation period of 2019 novel coronavirus transmission affects estimates of epidemic potential in the COVID-19 outbreak. (2019-nCoV) infections among travellers from Wuhan, China, 20–28 January 2020. Epidemics 69, 100392 (2020). Eurosurveillance 25, 2000062 (2020). 46. H. Nishiura, N. M. Binton, A. R. Akhmetzhanov. Serial interval of novel coronavirus 39. Q. Li et al., Early transmission dynamics in Wuhan, China, of novel coronavirus– (COVID-19) infections. Int. J. Infect. Dis. 93, 284–286 (2020). infected pneumonia. N. Engl. J. Med. 382, 1199–1207 (2020). 47. L. Tindale et al., Transmission interval estimates suggest pre-symptomatic spread of 40. A. Pan et al., Association of public health interventions with the epidemiology of the COVID-19. medRxiv:10.1101/2020.03.03.20029983 (6 March 2020). COVID-19 outbreak in wuhan, China. J. Am. Med. Assoc. 323, 1915–1923 (2020). 48. S. Zhao et al., Estimating the serial interval of the novel coronavirus disease 41. S. T. Ali et al., Serial interval of SARS-CoV-2 was shortened over time by nonpharma- (COVID-19): A statistical analysis using the public data in Hong Kong from Jan- ceutical interventions. Science 369, 1106–1109 (2020). uary 16 to February 15, 2020. medRxiv:10.1101/2020.02.21.20026559 (25 February 42. R. Verity et al., Estimates of the severity of coronavirus disease 2019: A model-based 2020). analysis. Lancet Infect. Dis. 20, 669–677 (2020). 49. J. Zhang et al., Evolving epidemiology and transmission dynamics of coronavirus dis- 43. Y. Bai et al., Presumed transmission of COVID-19. JAMA 323, ease 2019 outside Hubei province, China: A descriptive and modeling study. Lancet 1406–1407 (2020). Infect. Dis. 20, P793–P802 (2020).

12 of 12 | PNAS Park et al. https://doi.org/10.1073/pnas.2011548118 Forward-looking serial intervals correctly link epidemic growth to reproduction numbers Downloaded by guest on September 26, 2021