
arXiv:2101.04620v2 [eess.SP] 21 Apr 2021

Quickest Detection and Forecast of Pandemic Outbreaks: Analysis of COVID-19 Waves

Giovanni Soldi, Nicola Forti, Domenico Gaglione, Paolo Braca, Senior Member, IEEE, Leonardo M. Millefiori, Member, IEEE, Stefano Marano, Senior Member, IEEE, Peter K. Willett, Fellow, IEEE, and Krishna R. Pattipati, Life Fellow, IEEE

Abstract: The COVID-19 pandemic has, worldwide and up to December 2020, caused over 1.7 million deaths, and put the world's most advanced healthcare systems under heavy stress. In many countries, drastic restrictive measures adopted by political authorities, such as national lockdowns, have not prevented the outbreak of new pandemic waves. In this article, we propose an integrated detection-estimation-forecasting framework that, using publicly available data, is designed to: (i) learn relevant features of the pandemic (e.g., the infection rate); (ii) detect as quickly as possible the onset (or the termination) of an exponential growth of the contagion; and (iii) reliably forecast the evolution. The proposed solution is validated by analyzing the COVID-19 second and third waves in the USA.

Index Terms: Pandemic modeling and prediction, compartmental model, Bayesian filtering, quickest detection, COVID-19 second waves.

G. Soldi, N. Forti, D. Gaglione, P. Braca, and L. M. Millefiori are with the NATO Centre for Maritime Research and Experimentation (CMRE). Their work was supported by the NATO Allied Command Transformation (ACT) sponsored program, Data Knowledge Operational Effectiveness (DKOE). S. Marano is with University of Salerno. P. Willett and K. Pattipati are with the University of Connecticut. The work of P. Willett was supported in part by AFOSR under Contract FA9500-18-1-0463. The work of K. Pattipati was supported in part by the U.S. ONR, under Grant N00014-18-1-1238, in part by the U.S. NRL under Grant N00173-16-1-G905, and in part by NASA's Space Technology Research Grants Program under Grant 80NSSC19K1076.

I. INTRODUCTION

On March 11, 2020, the World Health Organization (WHO) declared the COVID-19 disease a pandemic. Since then, many governments, hampered by the lack of an effective cure, decided to undertake extraordinary social measures, such as travel bans, closure of schools, universities, shops, factories, and even national lockdowns, causing disruptive changes in social behavior, global mobility patterns, and the economies, see e.g. [1]. These measures resulted in effectively reducing the rate and slowing the spread of the pandemic [2], thereby bringing the number of cases under control and relieving the pressure on the intensive care units. However, because of the premature relaxation of these measures, new waves of COVID-19 cases are rampant in many countries around the world. Despite the experience of the first wave, many governments have failed to detect these new exponential growth patterns early, and, consequently, have either applied light and ineffective countermeasures or have acted too late. This suggests that it is of paramount importance to develop advanced models and algorithms that are able to detect the onset of an exponential growth phase as quickly as possible, and to forecast the incipient evolution of the infection in order to provide local and governmental authorities with enhanced real-time decision support.

Leveraging our knowledge in quickest detection techniques [3], [4], target tracking [5], and adaptive Bayesian filtering, we propose a framework that, based on data provided on a daily basis by authorities (e.g., number of infected and recovered), is able to i) learn relevant features of the pandemic, e.g., the infection rate, ii) detect as quickly as possible the passages from a controlled regime, i.e., a phase during which the number of new cases is under control, to (or back from) a critical one, i.e., a phase characterized by an exponential growth in the number of infected, and iii) reliably forecast the pandemic evolution. We exploit recently developed tools for quickest detection of COVID-19 pandemic onset, learning of its peculiar features, and forecasting its evolution. The quickest detection task relies on a method recently presented in [6] and [7], that is a version of the celebrated Page's CUSUM test [3], [4] specifically tailored to non-stationary pandemic data. This method, called the mean agnostic sequential test (MAST), is able to detect the onset of an exponential pandemic growth by properly trading off the delay in declaring an intervention against the risk of incorrectly declaring an outbreak. The MAST, properly adjusted (inverting the roles of the hypotheses), is also able to detect the termination of a pandemic wave. As for the learning and forecasting tasks, compartmental epidemiological models, such as the SIR and SEIR models (cf. Section II), are used to describe the pandemic evolution. Model parameters, such as the infection and recovery rates, are considered time-varying, and are learned together with the posterior probability distributions of the main epidemiological quantities, e.g., the numbers of infected and recovered individuals; see details in [8].

Key to the accuracy of the forecast is to know which recent data apply to it; that is, we need the change-points between controlled and critical regimes. For this reason, in this paper, we combine the quickest detection approach with the Bayesian forecast to develop an integrated detection-forecast framework. In particular, detecting the beginning and the termination of a pandemic wave through MAST enables a more reliable infection rate estimation to be adopted in accurate forecast of pandemic evolution up to several weeks after the detection. The comprehensive set of tools provided by the proposed framework might assist the authorities in evaluating the implementation of pandemic countermeasures. The effectiveness of the proposed framework is assessed by detecting the onsets and terminations of the second and third waves of COVID-19 in the USA, starting from May 1, 2020, and forecasting the evolution of the contagion up to December 13, 2020.

The remainder of this paper is organized as follows. Section II presents the most used compartmental models for epidemiological modeling. Section III describes the proposed framework that includes the Bayesian learning of the model parameters, the quickest detection of an exponential growth, and the forecasting of the contagion. Data analysis of the second and third waves of COVID-19 in the USA is presented in Section IV, and concluding remarks are provided in Section V.

II. EPIDEMIOLOGICAL MODELING

Compartmental epidemiological models assume that a given population is partitioned into a predefined number of compartments (population subgroups), where each compartment represents a pandemic state that an individual can occupy. The SIR model [9] accounts for three compartments, specifically, susceptible (S), infected (I), and recovered (R) individuals. A susceptible individual can contract the virus at a fixed constant “infection” rate, denoting the rate at which the individual comes in contact with an infected individual. If infected, an individual develops the disease and is transitioned to the infected compartment. Finally, an infected individual recovers or passes away at a constant “recovery” rate, thereby moving to the recovered compartment. Recovered people are considered permanently immune.

Over the years, more complex extensions of the SIR model have been developed. For example, the SEIR model assumes that a susceptible individual does not develop the disease symptoms immediately, but only after an incubation period of a certain duration (in the COVID-19 case, this duration ranges from 3 to 15 days, with a median value of 5.2 days). Therefore, susceptible individuals go through an exposed (E) compartment before developing evident symptoms, and, eventually, move to the infected compartment. In the SEIRQ model [10], an extra compartment is added for individuals who have contracted the virus and are quarantined (Q). A further extension is represented by the generalized SEIR (GSEIR) [11], that includes three more compartments, i.e., insusceptible, quarantined, and death. The SIR-X model [12] takes into account restrictive measures, such as closure of schools and shops, or complete lockdown, by removing susceptible individuals from the disease spreading process. The majority of epidemiological models described above assume that the disease spreads inside a unique population, e.g., city, region, country. Metapopulation models [13] go beyond compartmental models by adding a further spatial dimension and considering a network of spatially separated subpopulations among which individuals can move freely, and come in contact with each other.

Most of these compartmental models describe the flow dynamics from one compartment to another by means of a set of stochastic differential equations. In most cases, the main model parameters are fixed and do not vary with time. In our proposed framework, described in the following sections, we assume that relevant epidemiological model parameters are time-varying to better capture the effects of mobility and possible restrictive measures. These parameters are then estimated online along with the epidemiological model states.

III. LEARNING, DETECTION AND FORECAST

Our proposed decision-directed estimation (learning)-detection-forecasting framework is presented in Fig. 1. The sketch reads from left to right, and describes the main stages using the SIR epidemiological model; nevertheless, other models, such as those introduced in Section II, can be employed. Moreover, we describe the framework using the sequences of daily new positive individuals and the cumulative number of healed people and fatalities, that are grouped under the “recovered” or removed compartment. The use of these sequences, among many others, is due to the fact that this information is available for a large number of countries and territories, which makes the proposed framework directly applicable to data from different areas of the world. Nevertheless, the algorithm is general enough to be extended and used with different and richer data including, but not limited to, the number of swab tests and the sequence of hospitalized individuals [6]. In this regard, we note that the use of the sequence of hospitalized individuals for the quickest detection of a pandemic wave has been recently investigated in [14]. The analysis shows that, even though the number of hospitalized individuals is less susceptible to reporting errors, the detection obtained by using the number of infected individuals is usually quicker. The following subsections provide a tutorial description of each stage of the framework.

[Figure 1: block diagram, read from left to right. INPUT DATA: publicly available data (such as daily new positives and recovered individuals) are collected daily. PRE-PROCESSING and GROWTH RATE: the pandemic growth rate is computed as the ratio between two consecutive daily positive case counts; when the growth rate is above 1 the pandemic is in a critical regime, otherwise it is in a controlled regime; since the growth rate fluctuates randomly, a proper statistical test is required. BAYESIAN LEARNING: learning is based on compartmental models (e.g., SIR); model states (S, I, R) and parameters (V and W, the infection and recovery rates, shown with confidence intervals) are sequentially learned from the data. QUICKEST DETECTION: MAST statistics are compared with a threshold to detect the onset and termination of pandemic waves, namely, the beginning of critical or controlled regimes; in the example, as of TODAY, the pandemic is in a critical regime. PARAMETERS FORECAST: the output of the quickest detection controls the epidemiological parameters and pandemic forecast; in a critical regime the infection rate is expected to increase, as well as the number of infected, whereas in a controlled regime the infection rate is expected to decrease. PANDEMIC FORECAST: the forecasted model parameters are used to predict the evolution of the pandemic.]

Fig. 1. Notional sketch of the proposed decision-directed forecasting framework.

A. Bayesian learning of pandemic evolution

The objective of the Bayesian learning step, shown at the top of Fig. 1, is to track the day-by-day evolution of the epidemiological model states, specifically the number of infected (I) and recovered (R) individuals, as well as the model parameters, i.e., infection rate V and recovery rate W, through the daily (possibly partial) observations of the states. These observations are affected by a certain level of randomness, unavoidable in real-world measurements, modeled as superimposed “noise”. The approach we adopt, recently proposed in [8], is based on the discretization of the continuous stochastic differential equations that describe the compartmental epidemiological model, and on the assumption that the model parameters are time-varying. Then, by applying basic principles of Bayesian sequential estimation, that involve a prediction and an update step, the posterior probability density functions (pdfs) of the model parameters, as well as the pdfs of the model states, are computed. Specifically, during the prediction step, the pdfs of the parameters V and W, and of the states I and R, obtained the day before, are predicted according to the dynamic model defined by the set of discrete-time stochastic difference equations; during the update step, the new observations are processed and used to refine those predicted pdfs, finally providing the posterior pdfs of model states and parameters at the current time. The pictorial graphs within the Bayesian learning box in Fig. 1 are examples of the estimated infection and recovery rates over time, and of their confidence intervals. An efficient implementation of the proposed method, based on mixture models, is presented in [8]; therein, a concrete example of the application to Italian and US data is also provided. The same implementation, enhanced by the information provided by the quickest detection step, will be used for the data analysis in Section IV.

B. Quickest detection of pandemic onset

The quickest detection step, shown at the bottom of Fig. 1, is designed to recognize, as quickly as possible, the passages from a controlled to a critical regime of the pandemic, and vice-versa. The detection procedure that we adopt, proposed in [6], [7], is based on the growth rate sequence, computed daily as the ratio between two consecutive new positive case counts; this is preceded by a pre-processing of the sequence of daily new positive individuals to mitigate gross errors and weekly fluctuations in the reported data. Intuitively, if the growth rate of infected individuals is below unity, the pandemic is under control and will wane; otherwise, if the growth rate is above 1, the contagion is still spreading. However, the growth rate is randomly fluctuating, and its simple observation is not adequate to declare the onset or termination of a pandemic wave, thus requiring the design of a specific statistical test.

The growth rate is modeled as normally distributed with unknown and time-varying mean, whereas the standard deviation is re-estimated daily from a sliding window of the available data. To detect regime transitions, we rely on the generalized likelihood ratio test (GLRT) approach, see e.g. [7], which has proven its effectiveness in applications with unknown parameters in the statistical distribution of data. Specifically, the GLRT solution to the quickest detection problem of interest amounts to recursively computing the MAST decision statistic that depends only on the observed growth rate and on its estimated standard deviation. Then, a regime change is declared when the MAST decision statistic exceeds a predefined threshold, which is selected to trade off the decision delay, i.e., the average time elapsed from the actual change of regime to the detection, and the so-called risk, i.e., the reciprocal of the mean time between two consecutive false alarms; a false alarm is defined as a threshold crossing when no change has occurred. The risk plays the role of the false alarm probability in the classical detection theory. The results on the performance of MAST, presented in [6], [7], show that the decision delay required to reveal the onset of an exponential phase is in the order of a few days, with a risk that scales exponentially with the delay. The pictorial graph in the quickest detection box of Fig. 1 shows three MAST statistics exceeding a fixed threshold (dashed horizontal line), each corresponding to, respectively, the onset of the second pandemic wave (continuous magenta line), the termination of the second wave (continuous yellow line), and the onset of the third wave (dashed magenta line). A fourth MAST statistic (yellow dashed line) corresponding to the termination of the third wave has not, as of this writing, exceeded the threshold; therefore, in the example depicted in Fig. 1, the pandemic is still in a critical regime. As described in the next subsection, the output of the quickest detection step controls the forecast of the pandemic evolution.

C. Forecasting of pandemic evolution

Once a transition from a controlled to a critical regime, or vice-versa, is detected through MAST, an infection rate evolution strategy is hypothesized. An infection rate strategy is the hypothesized evolution of infection rate V that depends on its natural evolution and how the authorities and population respond to regime transitions. We call these strategies "scenarios". Specifically, when an outbreak is declared (critical regime), the hypothesized infection rate slope (i.e., the derivative of the infection rate continuously estimated through the Bayesian learning algorithm) is positive (or zero) as shown in the lowermost pictorial graph in the parameters forecast box of Fig. 1; whereas when the termination of a pandemic wave is declared (controlled regime), the hypothesized infection rate slope is negative (or zero) as shown in the uppermost pictorial graph. Then, due to the fact that the SIR epidemiological model is nonlinear (as are all the epidemiological models described in Section II), the forecast of the pandemic evolution is produced via ensemble forecasting (see [8] and references therein), i.e., a Monte Carlo approach that produces a set (or ensemble) of forecasts. Specifically, first the posterior pdf of the epidemiological model states (I and R) is sampled to obtain an initial ensemble, then this ensemble is propagated forward in time, according to the epidemiological dynamic model and using the hypothesized infection rate, up to a forecast horizon of a given number of days. Evidently, the quality of the pandemic forecasts depends on the assumed evolution of the infection rate, which, in turn, depends on how authorities and people are expected to respond. The mean and standard deviation of the ensemble represent, respectively, the evolution of the pandemic, in terms of infected and recovered individuals, and its confidence interval. The pictorial graphs in the pandemic forecast box of Fig. 1 show the pandemic forecasts in case a critical regime is declared (lowermost graph), and in case a controlled regime is declared (uppermost graph).

IV. ANALYSIS OF SECOND AND THIRD WAVES IN USA
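As a concrete reference for the detection results reported in this section, the following minimal sketch computes the growth-rate sequence of Section III-B and runs a sequential test on it. It is not the MAST of [6], [7]: the test is the classical Page's CUSUM for a Gaussian mean shift of known size, used here only as a stand-in. The 7-day moving average, the function names, and the values of `shift`, `sigma`, and `threshold` are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def growth_rate(daily_new, window=7):
    """Growth rate of a smoothed daily-new-positives sequence.

    The moving average is an assumed pre-processing step that damps
    weekly reporting fluctuations; the paper does not specify its own.
    """
    counts = np.asarray(daily_new, dtype=float)
    kernel = np.ones(window) / window
    smooth = np.convolve(counts, kernel, mode="valid")
    return smooth[1:] / smooth[:-1]  # ratio of consecutive days


def cusum_onset(rates, sigma, shift=0.02, threshold=5.0):
    """Page's CUSUM for a Gaussian mean shift from 1 to 1 + shift.

    Stand-in for MAST (hypothetical parameters): returns the index of the
    first threshold crossing, or None if the statistic never crosses.
    """
    stat = 0.0
    for day, g in enumerate(rates):
        # Log-likelihood ratio of N(1 + shift, sigma^2) versus N(1, sigma^2).
        llr = (shift / sigma**2) * (g - 1.0 - shift / 2.0)
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return day
    return None


def mean_days_between_false_alarms(risk):
    """Risk is the reciprocal of the mean time between false alarms."""
    return 1.0 / risk


# Synthetic example: 30 flat days followed by 30 days of 5% daily growth.
flat = np.full(30, 1000.0)
wave = 1000.0 * 1.05 ** np.arange(1, 31)
rates = growth_rate(np.concatenate([flat, wave]))
onset_index = cusum_onset(rates, sigma=0.02)
years = mean_days_between_false_alarms(1e-4) / 365.25
```

On the synthetic series the statistic stays at zero while the counts are flat and crosses the threshold a few days after growth begins. The last line restates the definition of risk: a risk of 10^-4 per day corresponds to a mean of 10^4 days between false alarms, i.e., roughly 27 years.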

The proposed decision-directed forecasting framework is applied to the COVID-19 dataset from the USA in order to recognize the beginning and the termination of the second and third waves, and infer the progression of the pandemic. As described earlier, Bayesian learning uses the number of infected individuals and the number of recovered individuals (which encompasses the total recoveries plus deaths), while the quickest detection, implemented through the MAST, uses the number of daily new positive individuals. These numbers, along with many others, are provided daily by the authorities, and have been collected and made publicly available by the Johns Hopkins University (JHU) since the beginning of the COVID-19 emergency [15]. Fig. 2 shows the 21-day average of numbers of infected, recovered, and daily new positives from March 1 to December 13, 2020. There are clearly three identifiable waves, each characterized by rapid growth in the number of daily new positives; these regions of exponential growth are highlighted with a darker background for clarity. Actually, the third wave is likely to be a delayed second wave in different geographic regions of the USA, as a state-by-state analysis seems to imply.

[Figure 2: number of infected and number of recovered (left axis, # individuals in millions) and number of daily new positives (right axis, # individuals in thousands) versus date, from 01-Mar to 06-Dec.]

Fig. 2. 21-day average of numbers of infected and recovered individuals (left axis), and number of daily new positive individuals (right axis) in the USA since March 1, 2020 (data from JHU [15]). Darker areas correspond to the three waves of the pandemic, characterized by exponential growth in the number of daily new positives.

The MAST statistic used for the onset detection of the second wave is calculated from May 1, and is shown in solid magenta in Fig. 3. The threshold is obtained from the analysis in [6], and is set assuming a risk level of 10^-4, which corresponds to accepting, on average, a false detection every 27 years. Note that different risk levels might be used for the detection of the onset and the termination of a wave; however, previous analyses have shown that this would change the time a detection is declared by only a few days [14]. The onset of the second wave is declared on June 22, the day on which the statistic crosses the threshold for the first time. On this day, the MAST statistic to detect the termination of the second wave is also initiated, shown in solid dark yellow in Fig. 3, leading to a termination detection on August 12. Likewise, the onset of the third wave is declared on September 29, and it is still ongoing as of December 13.

[Figure 3: MAST statistic versus stopping day, from 01-Mar to 06-Dec, with the statistics for onset and termination detection of the 2nd and 3rd waves, the threshold, and the marked dates 1-May, 22-Jun, 12-Aug, and 29-Sep.]

Fig. 3. MAST statistics are computed, starting from May 1, for the onset detection of the second wave (magenta solid line) and the third wave (magenta dashed line), and for the termination detection of the second wave (dark yellow solid line) and the third wave (dark yellow dashed line). The threshold (black dashed line) corresponds to a risk level of 10^-4. The onset of the second wave is declared on June 22, and its termination on August 12. The onset of the third wave is declared on September 29, and it is still ongoing.

While the MAST statistic is computed for detection purposes, the Bayesian learning algorithm continuously estimates the epidemiological model's states and parameters. As an example, Fig. 4 shows in blue the infection rate estimated from May 1 to December 13, along with its 90% confidence interval. For forecasting, one needs to incorporate control policies in the form of scenarios. To do this, from June 22 (day of detection of the second wave) onward, two possible progressions of the infection rate are envisaged and the concomitant forecasts reported. Forecast “A”, depicted with a dashed red line, assumes that the infection rate so far learned keeps increasing (or decreasing) with the same slope for 15 days (this is a reasonable assumption given the range of COVID-19’s incubation period), then maintains the attained value for the remaining period; this mimics a scenario in which no countermeasures are taken to slow down the infection rate. Forecast “B”, instead, mimics a scenario in which restrictions are applied to limit the pandemic. Therefore, if, on the day of forecasting, the infection rate is increasing, one assumes that it keeps increasing for 15 days with the same estimated slope, then decreases for 30 days with the opposite slope (to model the rightward skew in infection distribution), and finally maintains the attained value for the remaining period; this is illustrated with a dashed green line in Fig. 4. On the other hand, if, on the day of forecasting, the infection rate is decreasing, it keeps decreasing for 15 days and then maintains the realized value for the remaining period, as in forecast scenario A. Unlike [8], the estimated slope of the infection rate on a given day is computed by averaging the slopes since the day MAST declares a change (either of the onset or the termination of a pandemic wave). If the slope is not coherent with the declared regime (that is, positive slope under critical regime and negative slope under controlled regime) because of the random fluctuation in the infection rate estimation, then the slope used in the forecast is set to zero. Other strategies can be investigated for improving the forecast reliability; however, such investigations are delegated to future work.

[Figure 4: infection rate (0 to .050) versus date, from 01-May to 11-Dec, showing Bayesian learning (all data), forecast A (no restriction), and forecast B (with restriction).]

Fig. 4. The blue solid line represents the infection rate estimated from May 1 to December 13; the light blue area is its 90% confidence interval. Forecast scenario A (red dashed line) and forecast scenario B (green dashed line) are two possible progressions of the infection rate envisaged on June 22, the day on which the onset of the second wave is detected; the light red area and the light green area represent the 90% confidence interval of the two forecasts, respectively.

The evolution of the pandemic in terms of the number of infected individuals in these two cases, i.e., forecast scenario A and forecast scenario B, is shown in Fig. 5 with a dashed red line and a dashed green line, respectively, and compared with the true number of infected individuals, i.e., the blue solid line. Both the forecasts follow the actual evolution of the number of infected, particularly in the next 1-2 months.
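The two scenarios and the ensemble forecast described above can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the implementation of [8]: the SIR step is a noise-free one-day Euler update, the posterior pdf of the states is replaced by independent Gaussians, the recovery rate is held constant, and the rate is clipped at zero. All function names and numerical values are hypothetical.

```python
import numpy as np


def rate_path(v0, slope, horizon, restrict=False, hold=15, decay=30):
    """Hypothesized infection rate for each of `horizon` future days.

    Scenario A (restrict=False): keep the slope for `hold` days, then freeze.
    Scenario B (restrict=True): as A, but a rising rate then falls with the
    opposite slope for `decay` days before freezing.
    """
    v, path = v0, []
    for day in range(1, horizon + 1):
        if day <= hold:
            v += slope
        elif restrict and slope > 0 and day <= hold + decay:
            v -= slope
        v = max(v, 0.0)  # assumption: a rate cannot become negative
        path.append(v)
    return np.array(path)


def sir_step(s, i, r, v, w, population):
    """One-day Euler step of a noise-free SIR model (simplifying assumption)."""
    new_infected = v * s * i / population
    new_recovered = w * i
    return s - new_infected, i + new_infected - new_recovered, r + new_recovered


def ensemble_forecast(i_mean, i_std, r_mean, r_std, population, v_path, w,
                      members=500, seed=0):
    """Mean and standard deviation of infected counts over an ensemble."""
    rng = np.random.default_rng(seed)
    # Assumption: Gaussian stand-in for the posterior pdf of the states.
    i = np.clip(rng.normal(i_mean, i_std, members), 0.0, None)
    r = np.clip(rng.normal(r_mean, r_std, members), 0.0, None)
    s = population - i - r
    infected = []
    for v in v_path:
        s, i, r = sir_step(s, i, r, v, w, population)  # vectorized over members
        infected.append(i)
    infected = np.array(infected)  # shape: (horizon, members)
    return infected.mean(axis=1), infected.std(axis=1)


# Hypothetical numbers, chosen only to exercise the code.
path_a = rate_path(0.030, 0.001, horizon=60)
path_b = rate_path(0.030, 0.001, horizon=60, restrict=True)
mean_a, std_a = ensemble_forecast(1e6, 5e4, 2e6, 5e4, 330e6, path_a, w=0.02)
mean_b, std_b = ensemble_forecast(1e6, 5e4, 2e6, 5e4, 330e6, path_b, w=0.02)
```

The two rate paths coincide for the first 15 days and diverge afterwards, so the ensemble means agree over that stretch and scenario B ends with fewer infected. Setting `slope` to zero reproduces the rule applied when the estimated slope is not coherent with the declared regime.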
Forecast B clearly # Individuals (in2 millions) foresees a lower number of infected compared to fore- 1 cast A, since it is assumed that countermeasures to the 0 01-May 29-May 26-Jun 24-Jul 21-Aug 18-Sep 16-Oct 13-Nov 11-Dec spreading of the virus will take place after 15 days from Date the detection of the onset. A more rigorous evaluation of Fig. 5. Evolution of the number of infected individuals according to the proposed Bayesian learning and forecast algorithm is forecast scenario A (red dashed line) and forecast scenario B (green provided in Fig. 6, that shows the mean absolute percent- dashed line) from June 22, the day on which the onset of the second wave is detected. The blue solid line represents the observed number age error (MAPE) computed on the number of infected of infected individuals from May 1 to December 13. individuals for both strategies and for two different fore- cast horizons , i.e., 2 and 4 weeks. Apart from the 40 time intervals of roughly between July 19 and August Proposed, Strategy A - Horizon 4 Weeks 35 Proposed, Strategy B - Horizon 4 Weeks 13, and October 22 and November 14, the MAPE is be-

30 GSEIR-fit - Horizon 4 Weeks low 5% for both forecast scenarios A and B. Between Proposed, Strategy A - Horizon 2 Weeks July 19 and August 13, the MAPE increases — still 25 Proposed, Strategy B - Horizon 2 Weeks below 5% and 15% for horizons of 2 and 4 weeks, re- GSEIR-fit - Horizon 2 Weeks 20 spectively — as effect of the reduction of the infection

rate that follows its peak reached on August 2 (see Fig. 4). Indeed, independently of the forecast scenario, one still assumes, without any further knowledge, that the infection rate keeps increasing for 15 days, and the closer one gets to August 2, the more the hypothesized infection rate deviates from the actual one. The same happens at the end of October when the infection rate, after attaining its minimum value, starts increasing again.

A comparison with an alternative curve-fitting approach, called GSEIR-fit [11], is also provided. The GSEIR-fit employs a nonlinear least squares fitting algorithm that, using the number of infected and recovered individuals, computes the six parameters of the GSEIR compartmental model and then uses them to forecast the evolution of the pandemic. As shown in Fig. 6, the proposed forecast algorithm outperforms the GSEIR-fit approach for both forecast horizons of 2 and 4 weeks. Even for longer-term forecasts (not reported in Fig. 6), e.g., 8 weeks, the time-averaged MAPE is 12.2% and 10.9% for the proposed algorithm with forecast strategies A and B, respectively, and 16.2% for the GSEIR-fit approach. It is worth noting that the same curve-fitting approach using the SIR and SIR-X epidemiological models leads to less accurate forecasts than those obtained with the GSEIR-fit and, consequently, than those obtained with the proposed algorithm. Indeed, the forecast obtained with the SIR-fit approach presents a time-averaged MAPE of 101.8% and 137.2% for forecast horizons of 2 and 4 weeks, respectively; the time-averaged MAPE obtained with the SIR-X-fit approach, instead, is 25.9% and 27.5% for forecast horizons of 2 and 4 weeks, respectively.

[Figure: vertical axis "Mean Absolute Percentage Error" (0 to 15); horizontal axis "Day of Forecasting" (01-May to 11-Dec).]
Fig. 6. Mean absolute percentage error (MAPE) of the forecast of the pandemic evolution performed with the proposed algorithm (for scenarios A and B) and the GSEIR-fit, on different days (abscissa) and for different time horizons, i.e., 2 and 4 weeks (depicted with solid and dashed lines, respectively).

V. CONCLUSION

Leveraging known concepts from the fields of signal processing and communication, we have proposed an integrated detection-estimation-forecasting framework that is able to reliably detect the onset and termination of pandemic waves, as well as to forecast the epidemiological evolution. A pandemic wave onset (termination) is determined by an infection rate increase (decrease), also referred to as the critical (controlled) regime. The detection of such regimes and the ability to learn relevant epidemiological factors are crucial to determine an infection rate evolution scenario for reliable pandemic forecasting. Experimental validation on COVID-19 data from the USA has shown that the proposed framework is able to reliably detect two consecutive exponential outbreaks on June 22 and September 29, and to forecast the pandemic evolution over time horizons ranging from 2 to 4 weeks, while maintaining a mean absolute percentage error between 5% and 15%.

Learning and forecasting, as described in this paper, are based on the classical SIR model. However, the proposed methodology is general enough to accommodate more detailed compartmental models. These would allow the modeling of additional mechanisms, such as the effect of the vaccination campaign and social distancing measures, as well as the prediction of other metrics, such as hospitalizations. Further extensions might include the use of metapopulation models to better describe the diffusion of the infection among geographically distributed subpopulations.

REFERENCES

[1] L. M. Millefiori et al., “COVID-19 impact on global maritime mobility,” Nature Comm. (under review), 2020.
[2] J. Dehning et al., “Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions,” Science, vol. 369, no. 6500, Jul. 2020.
[3] M. Basseville et al., Detection of Abrupt Changes: Theory and Application. Upper Saddle River, NJ, USA: Prentice Hall, 1993.
[4] H. V. Poor et al., Quickest Detection. Cambridge, UK: Cambridge University Press, 2009.
[5] Y. Bar-Shalom et al., Tracking and Data Fusion. Storrs, CT, USA: YBS Publishing, 2011.
[6] P. Braca et al., “Decision support for the quickest detection of critical COVID-19 phases,” Sci. Rep. (accepted), 2021.
[7] P. Braca et al., “Quickest detection of COVID-19 pandemic onset,” IEEE Signal Process. Lett. (accepted), 2021.
[8] D. Gaglione et al., “Adaptive Bayesian learning and forecasting of epidemic evolution - Data analysis of the COVID-19 outbreak,” IEEE Access, vol. 8, pp. 175 244–175 264, Sep. 2020.
[9] O. N. Bjørnstad et al., “Modeling infectious epidemics,” Nat. Methods, vol. 17, no. 5, pp. 455–456, Apr. 2020.
[10] Z. Hu et al., “Evaluation and prediction of the COVID-19 variations at different input population and quarantine strategies, a case study in Guangdong province, China,” Int. J. Infect. Dis., vol. 95, pp. 231–240, Jun. 2020.
[11] L. Peng et al., “Epidemic analysis of COVID-19 in China by dynamical modeling,” arXiv, Feb. 2020. [Online]. Available: https://arxiv.org/abs/2002.06563
[12] B. F. Maier et al., “Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China,” Science, vol. 368, no. 6492, pp. 742–746, May 2020.
[13] M. Chinazzi et al., “The effect of travel restrictions on the spread of the 2019 novel Coronavirus (COVID-19) outbreak,” Science, vol. 368, no. 6489, pp. 395–400, Apr. 2020.
[14] P. Braca et al., “MAST: A quickest detection procedure for COVID-19 epidemiological data to trigger strategic decisions,” in Proc. EUSIPCO-21 (submitted), Dublin, Ireland, Aug. 2021.
[15] E. Dong et al., “An interactive web-based dashboard to track COVID-19 in real time,” Lancet Infect. Dis., vol. 20, no. 5, pp. 533–534, May 2020.

Giovanni Soldi is a Scientist at CMRE. He received his Ph.D. degree in Signal Processing from Télécom ParisTech in 2016.

Nicola Forti is a Scientist at CMRE. He received his Ph.D. degree in Information Engineering from University of Florence, Italy, in 2016.

Domenico Gaglione is a Scientist at CMRE. He obtained the Ph.D. degree in Signal Processing from University of Strathclyde in 2017.

Paolo Braca is a Scientist at CMRE. He received the Ph.D. degree in Information Engineering from University of Salerno, Italy, in 2010.

Leonardo M. Millefiori is a Scientist at CMRE. He received the M.Sc. degree in Communication Engineering from Sapienza University of Rome, Italy, in 2013.

Stefano Marano received the Ph.D. degree in Electronic Engineering and Computer Science from the University of Naples, Italy, in 1997. He is currently a Professor with DIEM, University of Salerno, Italy.

Peter K. Willett received the Ph.D. degree from Princeton University, NJ, in 1986. He is currently a Professor with the Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA.

Krishna R. Pattipati is the Board of Trustees Distinguished Professor and the UTC Chair Professor in systems engineering with the Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA.
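
As an illustrative note on the accuracy metric used in the comparison above, the following is a minimal sketch of how a MAPE and its time average could be computed, assuming the standard definition (mean of |actual - forecast| / |actual| over the forecast horizon, expressed in percent). The function names `mape` and `time_averaged_mape` and the sample numbers are hypothetical and are not taken from the paper, which may define or normalize the error differently.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error (in percent) over one forecast horizon.

    Assumes the standard definition; `actual` must not contain zeros.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def time_averaged_mape(per_day_mape: Sequence[float]) -> float:
    """Average of the MAPE values obtained on different forecasting days."""
    if len(per_day_mape) == 0:
        raise ValueError("at least one MAPE value is required")
    return sum(per_day_mape) / len(per_day_mape)


if __name__ == "__main__":
    # Hypothetical daily counts of infected individuals and a forecast of them.
    observed = [100.0, 200.0, 400.0]
    predicted = [110.0, 180.0, 400.0]
    print(f"MAPE: {mape(observed, predicted):.2f}%")
    # Hypothetical MAPE values obtained on three different forecasting days.
    print(f"Time-averaged MAPE: {time_averaged_mape([5.0, 10.0, 15.0]):.2f}%")
```

In this sketch, a forecast issued on a given day yields one MAPE value per horizon (e.g., 2 or 4 weeks), and the time-averaged figure is the plain mean of those values across forecasting days.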