
DECEMBER 2020 LENSSEN ET AL. 2387

Seasonal Forecast Skill of ENSO Teleconnection Maps

NATHAN J. L. LENSSEN (a), LISA GODDARD (b), AND SIMON MASON (b)

(a) International Research Institute for Climate and Society, and Department of Earth and Environmental Sciences, Columbia University, Palisades, New York
(b) International Research Institute for Climate and Society, Columbia University, Palisades, New York

(Manuscript received 27 November 2019, in final form 3 September 2020)

ABSTRACT: El Niño–Southern Oscillation (ENSO) is the dominant source of seasonal climate predictability. This study quantifies the historical impact of ENSO on seasonal precipitation through an update of the global ENSO teleconnection maps of Mason and Goddard. Many additional teleconnections are detected due to better handling of missing values and 20 years of additional, higher quality data. These global teleconnection maps are used as deterministic and probabilistic empirical seasonal forecasts in a verification study. The probabilistic empirical forecast model outperforms climatology in the tropics, demonstrating the value of a forecast derived from the expected precipitation anomalies given the ENSO phase. Incorporating uncertainty due to SST prediction shows that teleconnection maps are skillful in predicting tropical precipitation up to a lead time of 4 months. The historical IRI seasonal forecasts generally outperform the empirical forecasts made with the teleconnection maps, demonstrating the additional value of state-of-the-art dynamical-based seasonal forecast systems. Additionally, the probabilistic empirical seasonal forecasts are proposed as reference forecasts for future skill assessments of real-time seasonal forecast systems.

KEYWORDS: Teleconnections; ENSO; Forecast verification/skill; Seasonal forecasting; Statistical forecasting; Climate services

1. Introduction

El Niño–Southern Oscillation (ENSO) is a major driver of precipitation variability worldwide (Ropelewski and Halpert 1987, 1989; Mason and Goddard 2001). The robust precipitation teleconnections and associated societal impacts of ENSO make it the primary source of skill for seasonal forecasts (Livezey and Timofeyeva 2008; Barnston et al. 2010). Immense work has gone into understanding the dynamics, variability, and predictability of ENSO (Yeh et al. 2018; Timmermann et al. 2018) to improve replication in Earth system models (ESMs) critical to forecasts of climate variability and projections of climate change (Bellenger et al. 2014). Research on understanding, quantifying, and communicating seasonal forecasts and ENSO-driven precipitation variability reaches far beyond the physical sciences, with decision makers in fields such as agriculture (Rahman et al. 2016), water management (Crochemore et al. 2016), and public health (Borbor-Mendoza et al. 2016) utilizing forecasts of seasonal precipitation.

The gold standard for seasonal forecasts is a dynamical forecast that has been postprocessed to address systematic model biases, then tailored for specific users (Cash and Buizer 2005; Kumar et al. 2020). Effective tailoring and translation of seasonal forecasts to users is difficult and is a current focus of the World Meteorological Organization (Kumar et al. 2020). If the generation, translation, or communication of a forecast fails, users do not have proper access to that forecast and turn to alternative forecasts. One of the most accessible alternative forecasts during ENSO events is a teleconnection map representing the timing and spatial extent of ENSO impacts, such as the cartoon maps shown in Figs. 1 and 2. Similar representations of ENSO impacts are also prominent on NOAA¹ and WMO² web pages. Quantifying the skill of forecasts made with cartoon teleconnection maps is an important benchmark for assessing and communicating the value of state-of-the-art forecast systems.

¹ https://www.climate.gov/enso#qt-el_ni_o_la_ni_a_el_ni_o_southern-ui-tabs4 (accessed 30 Mar 2020).
² https://public.wmo.int/en/About-us/FAQs/faqs-el-ninola-nina (accessed 30 Mar 2020).

The first goal of this study is to update the global analysis of robust historical ENSO-precipitation teleconnections of Mason and Goddard (2001) (hereafter MG01). The resulting global maps of ENSO teleconnections are synthesized to update the IRI ENSO cartoon (Figs. 1 and 2). Much of the recent research in ENSO impacts has focused on regional impacts (see Yeh et al. 2018 for a review), and there have only been a few studies applying a single method for detecting ENSO impacts globally since MG01 (Davey et al. 2014; Lin and Qian 2019). This study detects teleconnections globally using an analysis period of 1951–2016 with a method to mask missing data and increase the statistical signal of ENSO-related precipitation anomalies.

Denotes content that is immediately available upon publication as open access.

Supplemental information related to this paper is available at the Journals Online website: https://doi.org/10.1175/WAF-D-19-0235.s1.

Corresponding author: Nathan J. L. Lenssen, lenssen@iri.columbia.edu

WEATHER AND FORECASTING, VOLUME 35

DOI: 10.1175/WAF-D-19-0235.1 © 2020 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

FIG. 1. The cartoon El Niño teleconnection map issued by the IRI as updated by this study. Precipitation impacts detected in this study and supported with regional literature are displayed in a concise and easy to read format.

FIG. 2. The cartoon La Niña teleconnection map issued by the IRI as updated by this study.

The second goal is to quantify the skill of ENSO teleconnection maps. To quantify the skill in a methodical way, several empirical (statistical) forecast methods are developed to emulate ad hoc forecasts made using ENSO teleconnection maps. These forecasts are hereafter referred to as ENSO-based forecasts (EBFs). The EBF methods are simple statistical models representing various interpretations of the ENSO teleconnection maps rather than complex statistical models with many predictors that aim to maximize skill. The performance of the EBFs is compared with the historical IRI seasonal precipitation forecasts, providing an evaluation of the IRI's forecasts through 2016 (Goddard et al. 2003; Barnston et al. 2010).

The EBFs provide a useful benchmark for the evaluation of real-time forecast systems. Seasonal forecast skill is quantified through comparison of some forecast attribute with a reference forecast (Mason 2018). The reference is generally climatology: a forecast only containing information on the long-term mean climate. However, the robust ENSO impacts on seasonal precipitation motivate using the simple ENSO-based forecasts as an alternative reference for seasonal forecasts. The use of more stringent alternative reference forecasts is not novel; the ENSO climatology and persistence (CLIPER) forecasting scheme of Knaff and Landsea (1997) uses simple statistical models as a physically motivated reference forecast for evaluating ENSO forecasts. Using an EBF as a reference for seasonal forecasting follows similar reasoning by defining skill as the value added over the empirical, historical impact of ENSO on precipitation. As an example application, the skill of the historical IRI seasonal forecasts is compared relative to both the EBF and climatology reference forecasts. The reference forecasts are available for download³ and will be updated monthly.

³ http://iridl.ldeo.columbia.edu/expert/home/.lenssen/.realtimeReferenceForecasts.

In section 2, the historical precipitation, historical ENSO, and historical forecast data used in this study are outlined. In section 3, methodologies for the updated teleconnection maps of MG01 and the generation of EBFs are detailed. Section 4 presents results from the updated teleconnection maps, updated cartoon representation of these teleconnection maps, and the EBF verification. Section 5 summarizes the results and provides some suggestions for future applications.

2. Data

a. Historical precipitation

1) CRU TS 4.01

The Climatic Research Unit (CRU) TSv4.01 dataset is used to quantify historical ENSO precipitation impacts due to its long record with global coverage at high resolution and its uncertainty quantification (Harris et al. 2020). Station data are quality controlled and interpolated onto a 0.5° grid using triangular linear interpolation. Full coverage of land grid cells is achieved by providing climatology values for grid boxes with insufficient observational data. Data quality is reported through "number of contributing stations" fields, and a grid cell with two or fewer contributing stations is filled with the climatology value. The study is restricted to 1951–2016 due to data sparsity issues before 1950.

Properly accounting for grid boxes that contain climatology-only values in their monthly time series is necessary for accurate ENSO impacts. Since the ENSO impacts are detected through anomalously wet or dry seasons, climatological values due to missing data erroneously reduce the signal of ENSO on seasonal precipitation. The CRU TS dataset has nearly stationary coverage from 1951 to 1990, but because of reduced reporting of the global observational record, there is a steady decrease in coverage from 1990 to the present (Fig. 3b). This loss of coverage in recent decades is not uniform across the globe (Fig. 3a); the losses are greatest in the tropics where the precipitation signals from ENSO are most robust. The following probabilistic teleconnection maps should be viewed as a lower bound on the historical likelihood of the expected precipitation anomaly, particularly for the tropics, as robust teleconnections may be hidden in the missing data. The recent reduced reporting along with other potential issues in data quality motivates future regional studies incorporating national data not reported to global sources.

The ENSO impacts analysis is performed on the native 0.5° grid as well as a 2.5° grid. The 2.5° resolution is chosen to match the resolution of the IRI seasonal forecasts. The results from the 2.5° impacts analysis agree with the 0.5° resolution analysis. A 2.5° grid box has sufficient reporting station data if at least half of the contained 0.5° grid boxes have sufficient reporting data, balancing reductions in signal due to climatological values while not overaggressively masking.

2) CPC CMAP

The NOAA Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP) dataset (Xie and Arkin 1997) blends remotely sensed and precipitation gauge data from 1979 to the present, allowing for greater global coverage over an era when station coverage is declining. It is used to verify the global seasonal forecasts over the 1997–2016 time frame, providing continuity with previous verification studies of the IRI seasonal forecasts such as Barnston et al. (2010). Repeating the verification (analysis not shown) with the GPCP merged analysis (Adler et al. 2003) results in no major changes to the conclusions.
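The coverage rule for the 2.5° grid described in section 2a(1) can be illustrated with a short sketch. This is not the authors' R code; the function name, argument names, and the use of NumPy are assumptions made for illustration. It treats a 0.5° cell as sufficient when it has more than two contributing stations and counts every contained 0.5° cell (land or ocean) when forming the fraction, which is a simplification.

```python
import numpy as np


def sufficient_coarse_mask(station_counts, block=5, min_stations=3, min_fraction=0.5):
    """Flag 2.5-degree boxes with enough reporting 0.5-degree cells.

    station_counts: 2D array (lat, lon) of contributing-station counts on
    the 0.5-degree grid for a single time step (hypothetical input layout).
    """
    # A fine cell with two or fewer stations holds a climatology value.
    fine_ok = np.asarray(station_counts) >= min_stations
    nlat, nlon = fine_ok.shape
    if nlat % block or nlon % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    # Group the fine grid into block x block tiles (5 x 5 cells per 2.5-degree box).
    tiles = fine_ok.reshape(nlat // block, block, nlon // block, block)
    # Fraction of sufficient fine cells inside each coarse box.
    fraction = tiles.mean(axis=(1, 3))
    return fraction >= min_fraction
```

Applied to each monthly "number of contributing stations" field, the returned Boolean array marks the coarse boxes whose values would be retained rather than masked as climatology-only.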


FIG. 3. (a) The spatial distribution of coverage in the CRU TS 4.01 dataset visualized through the instantaneous coverage in 1990 when CRU has approximately maximum coverage and 2015, which is reflective of the present-day coverage. (b) The time evolution of coverage in CRU TS 4.01 from 1950 to 2016.

b. Seasonal Niño-3.4 SST index

The state of ENSO at a seasonal time scale is represented by the NOAA CPC oceanic Niño index (ONI).⁴ The ONI is the seasonal mean of the monthly Niño-3.4 index. Again following the CPC, the monthly Niño-3.4 index is the monthly mean sea surface temperature anomaly in the central equatorial Pacific (5°N–5°S, 170°–120°W) calculated using the Extended Reconstructed Sea Surface Temperature version 5 (ERSSTv5) dataset (Huang et al. 2017).

⁴ https://www.cpc.ncep.noaa.gov/data/indices/oni.ascii.txt (accessed 28 Jul 2020).

c. Historical forecasts

1) IRI SEASONAL FORECASTS

The IRI history of seasonal precipitation forecasts from 1997 to 2016 serves as an example state-of-the-art seasonal forecast. These seasonal forecasts were first issued in October 1997 and were produced quarterly until August 2001, when the frequency of issuance increased to monthly. The forecasts are issued globally over land on a 2.5° grid with leads up to 4 months.

By using historical forecasts rather than hindcasts, these forecast data capture the evolution of the seasonal forecast system. Prior to 2017, the core of the IRI forecast system consisted primarily of dynamical two-tiered models in which SST fields were predicted with a combination of dynamical and statistical models followed by the atmospheric response as simulated with dynamical atmospheric general circulation models (AGCMs) (Goddard et al. 2003; Barnston et al. 2010). Final forecasts were determined after statistical postprocessing of the multimodel AGCM ensembles and minor subjective modifications to reduce noise and known regional issues in the dynamical models and postprocessing methods (Goddard et al. 2003; Barnston et al. 2010).

Since 2017, the IRI's seasonal climate forecast system begins with raw output from NOAA's North American Multimodel Ensemble Project (NMME) (Kirtman et al. 2014). After removal of mean, lead-time-dependent biases, each model is corrected for systematic biases and calibrated using extended logistic regression (Wilks 2009). The calibrated forecasts from individual models are combined with equal weight into one multimodel forecast.

2) IRI ENSO FORECASTS

The IRI began to issue real-time monthly ENSO forecasts in March 2002, incorporating the many dynamical and statistical ENSO forecasts run at climate centers around the world. These ENSO forecasts became a joint effort with NOAA's Climate Prediction Center (CPC) in late 2011. The forecasts are issued as probabilities of El Niño, neutral, and La Niña conditions with lead times up to 9 months. The forecasts are objective probabilities from a simple counting of the models in each ENSO category and are available from June 2004 to 2016.

3. Methods

The analyses were performed in the open source language R (R Core Team 2020), and the code needed to replicate the following analyses and generate all figures in this report is available in a GitHub repository.⁵ The source data are available from the authors upon request.

⁵ https://github.com/nlenssen/ENSOImpacts.

a. Global ENSO precipitation impacts methods

Teleconnection maps are constructed using contingency tables as in MG01; these maps indicate the local frequency of occurrence of categorized precipitation conditional on the state of ENSO. Maps are computed for each season and both ENSO phases. Locations with a locally statistically significant response according to a hypergeometric test (modified to account for purely climatological data in the CRU dataset) are identified.

The method involves three steps; changes and updates to the MG01 method are italicized:

1) Identify all ENSO events since 1950 where the ONI exceeds an absolute anomaly of 0.5°C for at least 5 consecutive months and has a maximum absolute seasonal anomaly of at least 1°C.
2) Calculate the relative frequency of below-normal, normal, and above-normal (defined by terciles) precipitation for each land surface grid box, conditioned on the ENSO state, but excluding years with climatology-only data.
3) Determine the statistical significance of above- or below-normal signal according to a hypergeometric test with the sample sizes determined by the number of El Niño or La Niña events and total years with sufficient data.

Step 1 of the updated analysis differs from MG01 by defining ENSO events according to the ONI rather than taking the top 8 or 11 events. NOAA CPC defines an ENSO event as a period where the ONI exceeds an absolute anomaly of 0.5°C for at least five consecutive months.⁶ Selecting only ENSO events with maximum absolute anomalies of at least 1°C excludes weak ENSO events that result in reduced skill in seasonal precipitation forecasts due to correspondingly weaker atmospheric responses (Goddard and Dilley 2005). Providing an absolute definition of a moderate or stronger ENSO event, rather than a relative one as in the case of MG01, allows continuous updates as events that were included in previous versions will always be included.

⁶ https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php (accessed 28 Jul 2020).

The changes to steps 2 and 3 are to exclude climatology-only values in the analysis by only using time points with sufficient station data. For each grid box independently, time points are removed from the series where CRU TS does not have sufficient reporting station data. Masking for data sparsity results in intra-gridbox differences in the number of years on record and the number of El Niño and La Niña events. The differences in sample sizes and ENSO events are inherently taken into account in the p values of the hypergeometric test. However, a small p value does not necessarily imply a large effect, but rather some combination of a large effect and a large sample size.

A p value is determined using the hypergeometric test with the null hypothesis that each precipitation tercile is equally likely to occur regardless of the ENSO state. The corresponding alternative hypothesis is that ENSO does change the distribution of seasonal precipitation for a given location. As described in detail in MG01, this problem can be posed as a test for independence on a 3 × 2 contingency table (Agresti 1990). The corresponding null distribution is a hypergeometric distribution with tail probabilities calculated using Fisher's exact test (Fisher 1935; Agresti 1990; Lehmann and Romano 2005).

The counts of observed precipitation terciles for each ENSO phase cannot be easily compared for different locations because of the different sample sizes resulting from the masking of climatological values. Thus, hypergeometric tests are performed on each gridbox series to calculate a p value. The resulting impact maps show historical empirical probabilities that have been masked for robust impacts, providing a measure of the effect of ENSO on precipitation while still accounting for statistical significance. Direct visualization of the p values is avoided as a low p value can be an indication of a large effect size and/or a large sample size (Sullivan and Feinn 2012).

Dry season areas are defined similarly to MG01. A location is determined dry for a given season if either (i) the climatology for that season is less than 15% of the annual total and greater than 50 mm or (ii) the lower tercile for that season is less than 10 mm of rain. The absolute cutoff in criterion (ii) is increased from 0 mm in MG01 to 10 mm to reduce artifacts in the dry mask arising from the interpolation in CRU TS.

b. ENSO-based forecast (EBF) models

Three versions of empirical seasonal precipitation forecast models are developed with statistical models trained solely on the historical ENSO impacts (Table 1). Two known-ENSO EBF models, one deterministic and one probabilistic, assume that the state of ENSO is known. The probabilistic forecast-ENSO EBF model accounts for uncertainty in the seasonal forecast of precipitation due to limited predictability of the ENSO state.

TABLE 1. Summary of the three ENSO-based forecast (EBF) methods.

Model                              Probabilistic?   Known ENSO state?
Deterministic known-ENSO EBF       No               Yes
Probabilistic known-ENSO EBF       Yes              Yes
Probabilistic forecast-ENSO EBF    Yes              No

The three EBF models represent increasing complexity in how a decision maker might factor known ENSO teleconnections into planning or preparedness during an ENSO event. As such, the skill of forecasts issued with these EBF models represents the skill of forecasts made with ENSO impact maps. Hindcasts made with the three EBF models are compared with the IRI seasonal forecast to quantify the value added by state-of-the-art seasonal forecasting and identify possible areas of improvement.

Broadly speaking, seasonal climate forecasts contain two sources of uncertainty arising from the underlying dynamical systems: uncertainty from predicting the global SST pattern and uncertainty from predicting the atmospheric response given SST patterns. Two known-ENSO EBF models assume perfect information of the ENSO state and effectively ignore the uncertainty in SST prediction. These serve as an upper bound in forecast skill of the ENSO teleconnection map or cartoon. The forecast-ENSO EBF model uses the historical ENSO forecasts to represent the uncertainty arising from prediction of the SST state. Note that this framework does not account for the uncertainty in seasonal forecasts arising due to model biases and postprocessing.

Following a widely used format for seasonal forecasts, the EBF models issue probabilistic forecasts with the forecasted likelihood of each climatological tercile (above normal, near normal, and below normal) = (AN, NN, BN) represented as probabilities that sum to one. The climatological forecast is then (1/3, 1/3, 1/3).

Hindcasts are issued with each of the three EBF configurations over the period 1997–2016. This is the period of the available IRI forecasts, the example historical state-of-the-art forecast used for this study. The three EBF models described in detail below are based on precipitation anomaly maps calculated from the out-of-sample time period of 1951–96. For the real-time EBF forecasts available for download, the EBF forecast models are trained on the full historical record.⁷ While the precipitation impacts analysis was done on both the 0.5° and 2.5° grids, the forecasts in this study use the 2.5° global grid to allow direct comparison with the IRI historical forecasts.

1) DETERMINISTIC KNOWN-ENSO EBF MODEL

The deterministic known-ENSO EBF model assumes perfect information of the ENSO state. Given an El Niño or La Niña, the deterministic known-ENSO EBF model issues a forecast with 100% above-normal or below-normal probability in regions with a robust impact. A climatology forecast is issued in grid boxes without statistically significant impacts and globally for ENSO-neutral seasons. While a deterministic forecast is drastically overconfident for seasonal precipitation forecasting, it represents how decision makers may interpret and act upon ENSO teleconnection maps; during an ENSO event, the precipitation impacts are viewed as an expectation, that is, as a certain forecast. This forecast is innately overconfident and expected to verify poorly when using probabilistic verification methods that measure reliability.

2) PROBABILISTIC KNOWN-ENSO EBF MODEL

Increasing in complexity, the probabilistic known-ENSO EBF model uses the same criteria for issuing a nonclimatological forecast as the deterministic known-ENSO EBF, but instead issues the historical probability of observing each precipitation tercile. For a location and season with robust impacts, the forecast is the empirical probabilities of the three terciles from the out-of-sample historical record. As with the deterministic known-ENSO EBF model, nonclimatological forecasts are only issued during active ENSO events. Terciles with zero empirical probability are given nominal probability (~2%) by assuming there is one additional year that is split climatologically between terciles. As before, a climatology forecast is issued globally during ENSO-neutral conditions and for any location that does not have a significant impact. The probabilistic known-ENSO EBF model provides a more realistic representation of the uncertainty in predicting atmospheric responses to ENSO and is expected to outperform the deterministic forecast on any probabilistic verification method that takes reliability into account.

3) PROBABILISTIC FORECAST-ENSO EBF MODEL

The probabilistic forecast-ENSO EBF model accounts for uncertainty in predicting the ENSO state in addition to the uncertainty in the atmospheric response. Historical IRI ENSO forecasts are used to quantify the limited predictability of ENSO in the hindcast study. The probabilistic forecast-ENSO EBF model issues forecasts as the weighted average of the El Niño, neutral, and La Niña probabilistic known-ENSO EBF forecasts for a given season with weights set by the ENSO forecast. For example, given a probabilistic forecast of the ENSO state, the issued probability of AN precipitation is calculated as

P(AN) = P(El Niño) P(AN | El Niño)
      + P(neutral) P(AN | neutral)
      + P(La Niña) P(AN | La Niña),    (1)

where the first term in each line of the right-hand side is the forecast probability of the ENSO state and the second term is the historical probability of AN precipitation under each ENSO state. The forecast probabilities for NN and BN precipitation are calculated similarly. The forecast-ENSO EBF model issues identical forecasts to the probabilistic known-ENSO EBF model during ENSO events as the probability for El Niño or La Niña is then 100%.

Probabilistic forecast-ENSO EBF hindcasts are issued from mid-2004 to 2016 as the historical IRI ENSO forecast is first available in 2004. The hindcasts are issued at leads of 1–4 months, which permits investigation into the effect of lead time on skill of the different methods. The probabilistic forecast-ENSO EBF model is the most promising candidate for benchmarking state-of-the-art seasonal forecasting systems as it better represents the uncertainty in seasonal forecasts.
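As an illustration of how the pieces of sections 3a and 3b fit together for a single grid box and season, the following minimal Python sketch chains a one-sided hypergeometric tail probability, the nominal-probability adjustment, and the weighted average of Eq. (1). It is not the authors' R implementation. All function names and the example counts are hypothetical, the test is simplified to one tercile at a time, and the pseudo-year is assumed to be added in every case rather than only when a tercile count is zero. The 0.10 threshold follows the significance level quoted in the Fig. 4 caption.

```python
from math import comb

import numpy as np

# Tercile order follows the text: (AN, NN, BN).
CLIMATOLOGY = np.array([1 / 3, 1 / 3, 1 / 3])


def hypergeom_upper_tail(k, total_years, tercile_years, enso_years):
    """P(X >= k): chance that at least k ENSO years fall in one tercile
    if ENSO had no effect (sampling years without replacement)."""
    upper = min(tercile_years, enso_years)
    favorable = sum(
        comb(tercile_years, i) * comb(total_years - tercile_years, enso_years - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(total_years, enso_years)


def known_enso_probs(tercile_counts):
    """Empirical tercile probabilities with one pseudo-year split 1/3 each.
    Assumption: the pseudo-year is always added, not only for zero counts."""
    counts = np.asarray(tercile_counts, dtype=float)
    return (counts + CLIMATOLOGY) / (counts.sum() + 1.0)


def forecast_enso_probs(p_enso, p_el_nino, p_neutral, p_la_nina):
    """Eq. (1): weight the conditional tercile forecasts by the ENSO forecast."""
    weights = np.asarray(p_enso, dtype=float)
    conditional = np.vstack([p_el_nino, p_neutral, p_la_nina])
    return weights @ conditional


# Hypothetical grid box: 66 usable years, 22 of them above normal, and
# 15 El Niño years with (AN, NN, BN) counts of (11, 4, 0).
p_value = hypergeom_upper_tail(11, 66, 22, 15)
el_nino = known_enso_probs([11, 4, 0]) if p_value < 0.10 else CLIMATOLOGY
# Hypothetical ENSO forecast: 60% El Niño, 30% neutral, 10% La Niña,
# with no robust La Niña impact at this grid box.
forecast = forecast_enso_probs([0.6, 0.3, 0.1], el_nino, CLIMATOLOGY, CLIMATOLOGY)
```

With 15 El Niño years, the pseudo-year gives an unobserved tercile a probability of (1/3)/16, or about 2%, consistent with the nominal value quoted in the text. Setting the ENSO forecast to (1, 0, 0) returns the known-ENSO probabilities unchanged, matching the statement that the two probabilistic models agree when the ENSO state is certain.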

c. Forecast verification methods 7 http://iridl.ldeo.columbia.edu/expert/home/.lenssen/.realtime The quality of the EBFs and IRI forecasts is quantified ReferenceForecasts. through metrics of resolution, reliability, and discrimination

Unauthenticated | Downloaded 10/03/21 03:54 AM UTC DECEMBER 2020 L E N S S E N E T A L . 2393

TABLE 2. Descriptions of the forecast attributes referenced in make comparison of the deterministic known-ENSO EBF and the study. the other forecasts impossible.

Attribute Question answered Score 4. Results Reliability Does the outcome occur as Reliability score frequently as forecasted a. Global ENSO teleconnection maps on average? The global ENSO impacts analysis presented in section 3a is Resolution Does the outcome differ Resolution score used to generate global maps of robust ENSO-related precip- given different forecasts? itation impacts for each season, ENSO state, and tercile cate- Discrimination Do the forecasts distinguish GROC higher categories gory with empirical probabilities demonstrating the historical from lower? effect of ENSO in an intuitive way. Higher historical proba- Skill How does the forecast RPSS bility indicates more consistent, and therefore predictable perform relative to a precipitation anomalies given a nonneutral ENSO state. In reference forecast? addition, using probabilities to describe the effect of ENSO on precipitation motivates probabilistic precipitation forecasts according to historical impacts. First, these teleconnection (Table 2). See the appendix for further details on the forecast maps are discussed in the context of MG01 and existing re- verification scores used in the study. gional and global teleconnection studies. Second, updates to The analysis is divided into three sections. First, the per- the IRI ENSO teleconnection cartoons (Figs. 1 and 2) are formance of the two known-ENSO EBFs and the IRI forecast described. are compared. All skill calculations in this portion of the study use climatology as the reference forecast, following the cur- 1) ANALYSIS OF TELECONNECTION MAPS rent practice in seasonal forecast verification (Jolliffe and Teleconnection maps for nonoverlapping seasons (DJF, Stephenson 2012; Mason 2018). The climatological forecast for MAM, JJA, and SON) are shown in Figs. 4 and S3–S5 in the tercile forecasts is a probability of 1/3 issued to each of the online supplemental material. Maps and source data for all precipitation terciles. 
The deterministic known-ENSO EBF 12 seasons can be visualized and downloaded in a variety of represents real-time forecasts that could be, and often are, is- formats.8 sued with the use of cartoon teleconnection maps such as The updated teleconnection maps reproduce all of the Figs. 1 and 2. The probabilistic known-ENSO EBF emulates a teleconnections discussed in previous global teleconnec- forecast issued using probabilistic teleconnection maps such as tion studies (Ropelewski and Halpert 1987; Mason and Fig. 4. By verifying the known-ENSO EBFs, the skill of these Goddard 2001; Davey et al. 2014; Lin and Qian 2019). An very simple seasonal predictions is quantified. additional 20 years of data with multiple strong ENSO events Second, the skill of the IRI forecast is calculated relative to along with the improvements in methodology leads to greater reference forecasts made with the probabilistic known-ENSO statistical power and better estimates of the empirical proba- EBF model in addition to the traditional climatology refer- bility of anomalous seasonal precipitation when compared ence. The skill of the IRI forecast relative to the EBF reference with MG01. The remainder of this section discusses tele- provides a measure of the value added by a calibrated MME connections not found in MG01 by season during El Niño and forecast over ENSO teleconnection maps. then La Niña. References to relevant studies are provided Third, skill as a function of lead time is calculated for the IRI when available and include the first study to identify the re- forecast model and probabilistic forecast-ENSO EBF model. gional teleconnection. Understanding how skill falls off with lead time in the EBF For the DJF season during El Niño(Fig. 4a), a much larger model provides another important baseline metric for state-of- above-normal precipitation signal than MG01 is found across the-art seasonal forecast systems. 
the southern United States, Caribbean, and Mexico (Horel Both Brier- (Brier 1950) and ignorance-based (Roulston and and Wallace 1981; Cayan et al. 1999). Additionally, an above- Smith 2002) scores were used to verify the ENSO-based and normal precipitation anomaly is found in Lake Victoria region IRI forecasts. Since the results from the Brier- and ignorance- in equatorial eastern Africa. Below-normal anomalies are based verifications qualitatively agree, only the Brier-based found in the El Niño DJF season in the southern tropical Andes results are presented for three reasons. First, the Brier-based region stretching from southern Peru through Bolivia into scores remain proper when averaged over both time and space northern Chile and Argentina (Vuille et al. 2000; Garreaud and and the spatial and temporal averages commute. Second, the Aceituno 2001; Sulca et al. 2018) as well as the northern United resolution–reliability decomposition of the Brier score extends States and Canada in agreement with the above-normal signal to naturally to the tercile setting (Epstein 1969; Murphy 1971). the south (Horel and Wallace 1981; Cayan et al. 1999). The ignorance-based score decomposition requires the non- For the MAM season during El Niño (Fig. S3a), an above- local ranked ignorance score, removing one of the supposed normal precipitation anomalies are found in the southern advantages of working with ignorance-based scores (Weijs et al. 2010; Tödter and Ahrens 2012). Finally, the ignorance score of an incorrect deterministic forecast is infinite. While this feature can be argued to be appropriate, infinite values 8 http://iridl.ldeo.columbia.edu/home/.lenssen/.ensoTeleconnections/.

Unauthenticated | Downloaded 10/03/21 03:54 AM UTC 2394 WEATHER AND FORECASTING VOLUME 35

FIG. 4. The empirical probability (from 1951 to 2016) of observing above-normal (cool colors) and below-normal (warm colors) seasonal anomalies in DJF during (a) El Niño and (b) La Niña events. Areas considered dry are masked in light red and areas without a significant signal at the α = 0.10 significance level are masked in white. Maps for all 12 seasons and both ENSO states are available at http://iridl.ldeo.columbia.edu/home/.lenssen/.ensoTeleconnections/.
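As a rough illustration of the quantity mapped in Fig. 4, the sketch below computes, for a single grid box, the empirical probability of an above-normal season given an ENSO phase. The function names, the rank-based tercile assignment, and the toy data are assumptions made for illustration. The significance test used in the study is described in its methods section, which is not reproduced here; the one-sided hypergeometric tail probability below is a stand-in and not necessarily the authors' test.

```python
from math import comb


def tercile_categories(values):
    """Label each value 0 (below), 1 (near), or 2 (above normal) by rank.

    Ties are broken by sort order, which is a simplification.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    cats = [0] * n
    for rank, i in enumerate(order):
        cats[i] = min(3 * rank // n, 2)
    return cats


def empirical_probability(cats, event_idx, category):
    """Fraction of the ENSO-event years that fall in the given category."""
    hits = sum(1 for i in event_idx if cats[i] == category)
    return hits / len(event_idx), hits


def hypergeom_upper_tail(hits, n_events, n_in_cat, n_total):
    """P(X >= hits) if the event years were drawn at random without
    replacement from all years (assumed stand-in significance test)."""
    upper = min(n_events, n_in_cat)
    favourable = sum(
        comb(n_in_cat, k) * comb(n_total - n_in_cat, n_events - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(n_total, n_events)


# Toy record: 12 seasons of precipitation, 4 of them hypothetical El Niño years.
precip = [float(v) for v in range(12)]
el_nino_years = [8, 9, 10, 11]
cats = tercile_categories(precip)
prob_above, hits = empirical_probability(cats, el_nino_years, category=2)
p_value = hypergeom_upper_tail(hits, len(el_nino_years), cats.count(2), len(cats))
```

In this toy case every El Niño year falls in the wettest tercile, so the empirical probability is 1 and the tail probability is small. A grid box would be left unmasked only if the tail probability fell below the chosen significance level.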

United States, Caribbean, and Mexico (Horel and Wallace 1981; Cayan et al. 1999) as well as Peru (Carrillo 1892; Cai et al. 2020). The wintertime below-normal precipitation anomalies in the southern tropical Andes are shown to persist into the spring (Vuille et al. 2000; Garreaud and Aceituno 2001; Sulca et al. 2018).

For El Niño JJA (Fig. S4a), widespread above-normal anomalies are found in the Bolivia lowlands (Ronchail and Gallaire 2006; Garreaud et al. 2009) and Argentina’s Patagonia (Compagnucci and Araneo 2007; Garreaud et al. 2009). Below-normal precipitation anomalies are found in southern Mexico into Central America (Magana et al. 2003) and northern China (Ronghui and Wu 1989; Zhang et al. 1999). Additionally, an above-normal anomaly is found centered in France. While a significant impact of ENSO on boreal winter and spring precipitation over Europe has been demonstrated (Brönnimann et al. 2007; Yeh et al. 2018), no work could be found documenting a possible summer teleconnection in France.

During El Niño SON (Fig. S5a), a much greater spatial extent of below-normal precipitation is now detected throughout northern South America than was in MG01, likely due to the longer data record (Ropelewski and Halpert 1987; Grimm 2003).

The updated study found fewer additional teleconnections beyond MG01 during La Niña periods in agreement with the

fewer robust La Niña teleconnections already identified. Additional teleconnections detected during DJF (Fig. 4b) are above-normal anomalies in the northwest United States (Horel and Wallace 1981) and strengthened below-normal anomalies found across the southern United States, Caribbean, and Mexico (Cayan et al. 1999).

For MAM (Fig. S3b), above-normal anomalies are found on the Iberian Peninsula (Rodó et al. 1997; Brönnimann et al. 2007) and below-normal anomalies stretch between eastern Afghanistan and Pakistan to the Caspian Sea (Barlow et al. 2002), and across western Africa. The detected Sahelian and equatorial West African dry anomalies associated with La Niña are in disagreement with Balas et al. (2007), where SST composite analysis suggested that drying in the region is associated with El Niño–like SST conditions. The relative importance of the mechanisms driving precipitation in the Sahel and equatorial West Africa is difficult to untangle due to the various temporal and spatial scales of variability (Nicholson et al. 2018). In particular, it has been shown that La Niña conditions increase intraseasonal precipitation variability during MAM in equatorial Africa (Sandjon et al. 2012).

During JJA (Fig. S4b), above-normal anomalies stretch from Nepal through Bangladesh. No existing studies documenting this teleconnection could be found, suggesting that further investigation into the underlying data and dynamical mechanisms is needed.

For SON (Fig. S5b), below-normal anomalies are found across the Iberian Peninsula (Rodó et al. 1997; Brönnimann et al. 2007). Both the MAM and SON anomalies in the Iberian Peninsula have been investigated previously with inconclusive results, as discussed in Brönnimann et al. (2007), suggesting further study of the region is needed. Studies have found teleconnections assuming linearity of the precipitation response (Lloyd-Hughes and Saunders 2002). However, the nonlinear teleconnections found in this investigation suggest methods that do not assume linearity may be necessary.

The broad agreement of the results with regional studies provides confidence in the updated global teleconnection maps. The maps provide a starting place for future studies incorporating regional data sources as well as more sophisticated statistical and dynamical methodologies. The teleconnections with no relevant global or regional studies motivate further analysis with regional and national data sources to verify these findings as well as dynamical studies of the underlying mechanisms.

2) UPDATE OF IRI ENSO CARTOON

The global teleconnection maps are used to update the simplified IRI cartoon teleconnection maps for El Niño and La Niña (Figs. 1 and 2). Additions to the map from the previous version fall into two categories. The first category is teleconnections that were detected in MG01 and included in the previous version of the cartoon but now have larger spatial or temporal extent. These are expected with the additional data and data cleaning in the updated analysis, particularly in regions with limited data. The second category of additions to the cartoon is teleconnections that were not included in the original cartoon; these are the regions discussed in the previous section. As a final criterion for additions, regions are only added if previous regional studies provide a general consensus on the teleconnection. The IRI cartoon teleconnection maps prior to the update are included in the supplemental materials (Figs. S1 and S2) to facilitate comparison.

The text on the cartoons has been updated to encourage the use of probabilistic representations of ENSO teleconnections in applications. As is evident from the low historical probabilities of many teleconnections, a deterministic representation of ENSO’s effect on precipitation is potentially misleading. The utility of using probabilistic teleconnection maps is explored in the following section through the relative skill of the deterministic and probabilistic EBFs.

Comparing with recent global studies (Davey et al. 2014; Lin and Qian 2019), the updated IRI cartoon contains nearly all teleconnections present in both studies and includes additional detail in many regions. The cartoon produced in Davey et al. (2014) agrees very well, as expected from the shared methodology. Minor differences arise from Davey et al. (2014) using a much shorter record, 1979–2012 as opposed to the 1951–2016 used in this study. Direct comparison with the schematic of Lin and Qian (2019) is less straightforward as their method determines timing relative to the maximum and only shows linear teleconnections, but the rough location and timing of the teleconnections agree. One notable difference is that Lin and Qian (2019) include precipitation teleconnections in subpolar Asia and North America that are not included on the IRI updated cartoon.

b. Assessment of known-ENSO EBFs

The study’s goal of quantifying the skill of empirical seasonal forecasts that emulate globally is addressed through verification of hindcasts from the EBF methods. Since the EBF methods utilize robust teleconnections, one would expect them to be skillful. As shown by the teleconnection maps, seasonal precipitation responds stochastically to ENSO. Comparing deterministic and probabilistic forecasts based on ENSO teleconnections illustrates the value of including uncertainty in the forecast method. Comparing the statistical EBFs with the IRI forecast, a forecast system primarily based on dynamical models, shows the added value of a state-of-the-art forecast and highlights geographic regions or ENSO events that need further investigation.

Comparison of the IRI forecast with the known-ENSO EBFs across various forecast attributes (Figs. 5–11) indicates that the IRI forecast performs best, followed by the probabilistic known-ENSO EBF, with the deterministic known-ENSO EBF having the least skill. Additionally, the improvement in skill over time of a historical real-time forecast captures the evolution of the forecast system, both in the development of dynamical models and calibration/combination approaches. The general increase in IRI forecast skill over time illustrates these improvements, particularly during the larger ENSO events. The three most extreme events during the study period were the 1997/98 and 2015/16 El Niños and the 2010–12 La Niña. Each metric presented shows the increased performance of the IRI forecast relative to the probabilistic known-ENSO EBF as time progresses. While the improvement is a qualitative result


as it is made from a sample size of three for a highly variable system, it is encouraging to observe that the IRI forecast has captured each major ENSO event better than the previous one while the known-ENSO EBF skill remains relatively constant.

FIG. 5. The (a) mean resolution and (b) mean discrimination over the tropics (30°S–30°N) of the three forecasts. The mean resolution and discrimination over the total record are denoted by the values with color corresponding to the forecasts.

Looking more specifically at the forecast attributes, the mean resolution, which measures the dependence of the outcome on the forecast (Murphy 1973), is calculated for the tropics (30°S–30°N) over time (Fig. 5a). The corresponding spatial patterns of annual mean resolution for the IRI forecast and the known-ENSO EBFs (Fig. 6) are similar, with highest resolution in regions with strong ENSO teleconnections. The greatest resolution occurs in the Maritime Continent region where ENSO modulates the climate most directly. The IRI forecast generally has greater resolution than the EBFs in teleconnection regions, echoing the tropical mean time series. However, the known-ENSO EBFs exhibit comparable or greater resolution over northwest Colombia, the Namibia region in southwest Africa, and eastern Australia, suggesting that some ENSO teleconnections may be inadequately represented in some or all of the dynamical models that contribute to the IRI MME forecast or that there may be low-quality observations. In addition, the EBF forecast exhibits some, albeit low, resolution throughout the extratropics with skill extending into subpolar regions in the Northern Hemisphere. These results


are echoed in the spatial distribution of the discrimination (Fig. 7), suggesting the need for a more detailed investigation into the teleconnections as well as the cause of extratropical skill of the EBFs.

FIG. 6. Spatial distribution of the resolution score averaged over all 12 seasons for the (a) IRI forecast and (b) probabilistic known-ENSO EBF. Higher values are indicative of better forecasts as the outcome is more conditioned on the forecast probability.

The discrimination (Figs. 5b and 7) quantifies the dependence of the forecast on the outcome (Murphy 1991). The areas under the generalized relative operating characteristics (GROC) curve (Mason and Weigel 2009) of the EBFs are very similar when they issue nonclimatological forecasts since the GROC rewards forecasts that put the highest forecast probabilities on the category that is ultimately observed regardless of the probability issued. The IRI forecast and EBF have similar spatial patterns of skill with the IRI discrimination generally higher. The qualitative agreement between the resolution and discrimination in both the spatial and temporal averages suggests that the Brier-based scores capture the information content of the forecasts reasonably well.

The reliability time series (Fig. 8) measures the ability of each forecast to represent the probability of outcomes (Murphy 1973). The deterministic forecast has very poor and highly variable reliability due to the inherent uncertainty in predicting variations in seasonal precipitation. Reliability diagrams (Fig. 9) are examined for mean and conditional forecast bias. Reflective of the bias correction, the IRI forecast exhibits very small conditional bias for each of the three terciles. The EBFs are overconfident for all terciles. That is, they systematically issue higher forecast probabilities than the observed relative frequencies.

The combined reliability and resolution skill of the forecasts with respect to climatology is quantified by the ranked probability skill score (RPSS) (Figs. 10 and 11). The deterministic known-ENSO EBF performs poorly in RPSS primarily due to its poor reliability. The ranking of forecast systems observed in resolution and discrimination scores holds, with the IRI forecast as the most skillful on average and at most time points. The mean RPSS fields (Figs. 11a,b) closely mirror the spatial pattern seen in the resolution (Fig. 6), reflecting that the majority of the


spatial variability in RPSS arises from the spatial variability in resolution.

FIG. 7. A comparison of the (a) IRI forecast and (b) probabilistic known-ENSO EBF discrimination as quantified by the GROC score. The EBFs issue maximum probability on the same category in nearly all cases resulting in similar discrimination.

The resolution–reliability decomposition and skill calculation were also performed with ignorance-based scores for the IRI and probabilistic EBFs with no substantial change to the results. The resolution, discrimination, and RPSS field calculations were performed seasonally in addition to the annual values presented and indicated that increased skill generally aligns with a region’s rainy season, in agreement with Barnston et al. (2010).

c. Climatology versus ENSO reference forecasts

The added skill provided by the IRI forecast over the probabilistic known-ENSO EBF forecast is calculated with the RPSS using the EBF as the reference forecast instead of the usual climatological forecast. The spatial pattern of this RPSS (Fig. 11c) roughly mirrors that of the RPSS with a climatology reference forecast (Fig. 11a), suggesting that dynamical models in the IRI forecast are adding skill in the majority of ENSO teleconnection regions. The widespread positive skill illustrates the value added by the IRI forecast over the EBF for the majority of the world. However, a few teleconnection regions show negative skill, indicating that the EBF forecast has higher skill over the study period. The most striking regions of negative IRI forecast skill relative to the EBF are western India, southwestern Africa, and eastern Australia. These negative skill regions are also found in the rainy season RPSS in addition to the annual RPSS presented in Fig. 11c, further motivating closer investigation.

The global RPSS skill of the IRI forecast is slightly higher when the ENSO-based reference forecast is used as the baseline (Fig. 12a), but the tropics skill decreases substantially (Fig. 12b). The additional value of the IRI forecast over the EBF is summarized by the positive global skill under the EBF reference during the majority of nearly every ENSO event in both the global and tropics means. Under the ENSO-based


reference forecast, the total tropics skill decreases by nearly 50%; this large decrease reflects the greater presence of robust teleconnections in the tropics.

FIG. 8. The mean negative reliability over the tropics (30°S–30°N) of the three forecasts. Negative reliability is plotted to remain consistent with the other verification plots where high values on the plot represent good forecast performance.

d. Lead-time dependence

The RPSS and resolution skill as a function of lead time are calculated for the IRI and probabilistic forecast-ENSO EBF forecasts (Fig. 13). Both the IRI forecast and EBF have positive RPSS skill relative to climatology in the tropics at all lead times, with the IRI forecast exhibiting uniformly greater skill. The higher resolution of the IRI forecast demonstrates that the additional skill of the IRI forecast is due to a greater information content, and not just improved reliability due to calibration. The IRI forecast has positive skill at the global level to at least 4-month lead times, but the EBF has nearly zero global skill over climatology. This is to say that the utility of the EBF in the extratropics is limited, but it still may have longer-lead skill in regions with strong teleconnections and regional skill such as the southwestern United States. As with the known-ENSO EBF verification, the verification was replicated using ignorance-based scores, finding no substantive difference in results.

5. Discussion and conclusions

This study updates Mason and Goddard (2001) with new data and better data quality control. The resulting probabilistic precipitation impact maps provide a dataset expanding the teleconnections found in MG01 to include many previously studied teleconnections. These maps are synthesized to update the IRI precipitation impact cartoons. Investigation of the

FIG. 9. Reliability diagrams for the (a) IRI forecast, (b) probabilistic known-ENSO EBF, and (c) deterministic known-ENSO EBF with the forecast probability on the x axis and the corresponding frequency of observed outcome on the y axis. Histograms indicate the distribution and quantity of forecast probabilities issued. Dotted lines show a weighted linear fit of the reliability curve with weights determined by the number of forecasts issued in for a probability.
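The construction behind a reliability diagram of this kind can be sketched as follows: forecast probabilities are grouped into bins, the observed relative frequency is computed per bin, and a straight line is fitted with each bin weighted by its forecast count. The bin width, function names, and toy data are illustrative assumptions and not the settings used for Fig. 9.

```python
from collections import defaultdict


def reliability_curve(probs, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count)
    for every non-empty probability bin, in increasing bin order."""
    bins = defaultdict(lambda: [0.0, 0.0, 0])
    for p, o in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p = 1.0 goes in the last bin
        acc = bins[k]
        acc[0] += p
        acc[1] += o
        acc[2] += 1
    return [(s / n, h / n, n) for _, (s, h, n) in sorted(bins.items())]


def weighted_linear_fit(points):
    """Weighted least squares fit y = intercept + slope * x,
    with weights equal to the number of forecasts in each bin."""
    w = sum(n for _, _, n in points)
    xbar = sum(n * x for x, _, n in points) / w
    ybar = sum(n * y for _, y, n in points) / w
    sxx = sum(n * (x - xbar) ** 2 for x, _, n in points)
    if sxx == 0.0:
        raise ValueError("need at least two distinct forecast probabilities")
    sxy = sum(n * (x - xbar) * (y - ybar) for x, y, n in points)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Toy overconfident forecast: 20% and 80% are issued, but the event
# occurs 40% and 60% of the time, respectively.
probs = [0.2] * 5 + [0.8] * 5
outcomes = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
intercept, slope = weighted_linear_fit(reliability_curve(probs, outcomes))
```

A perfectly reliable forecast lies on the diagonal (slope 1, intercept 0). An overconfident forecast, as described for the EBFs, produces a fitted slope below 1, as in the toy example.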


maps suggests a boreal summer El Niño above-normal precipitation teleconnection in France and a boreal summer La Niña below-normal teleconnection in the greater Bangladesh region, warranting further study.

FIG. 10. The (a) global and (b) tropics mean RPSS for the IRI forecast and two known-ENSO EBFs. The results are generally consistent between the global and tropical series. Total RPSS scores over the record are given by the values in the bottom-right corner.

There is some predictive skill in the simplest of the ENSO-based statistical forecast models as quantified by resolution and discrimination. Incorporating probabilistic information in the forecast increases resolution in addition to the expected increase in reliability. Additionally, including a realistic representation of the SST prediction uncertainty allows forecasts to be made at a variety of lead times and in borderline ENSO conditions where the deterministic and probabilistic forecasts only issue climatological information.

The IRI forecasts, serving as example state-of-the-art seasonal forecasts using statistically calibrated dynamical models, provide more skill than the EBFs in nearly all regions and seasons. Including uncertainty arising from limited predictability of the state of the tropical Pacific at various lead times allows comparison of the EBF and IRI forecast as lead time changes. The probabilistic forecast-ENSO EBF exhibits skill in the tropics at leads out to at least 4 months, but the IRI forecast uniformly provides additional skill. Thus, users should generally avoid using simplified ENSO teleconnection maps as forecasts when state-of-the-art forecasts are accessible.

The widespread implementation of state-of-the-art forecasts is an ongoing global project (Kumar et al. 2020). The trade-off


FIG. 11. Spatial distribution of RPSS averaged over all 12 seasons for the (a) IRI forecast and (b) probabilistic known-ENSO EBF. (c) The RPSS of the IRI forecast using the prob- abilistic known-ENSO EBF as the reference is green where the IRI forecast has additional skill over the EBF and pink where it underperforms.
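The skill calculation behind panels (a) to (c) can be illustrated with a short sketch that follows Eqs. (A3) and (A4) of the appendix: the tercile RPS is half the sum of two binary Brier scores, and the RPSS compares the aggregated RPS of a forecast with that of a reference, which may be climatology or another forecast such as the EBF. The function names and the toy probabilities are hypothetical, and the gridbox area weighting used in the study is omitted.

```python
# Tercile categories: 0 = below normal (BN), 1 = near normal (NN),
# 2 = above normal (AN).


def tercile_rps(probs, obs):
    """RPS of one tercile forecast as half the sum of two Brier scores."""
    p_bn, p_nn, p_an = probs
    # Break between BN and NN: the event is {NN, AN}.
    bs_minus = ((p_nn + p_an) - (1.0 if obs >= 1 else 0.0)) ** 2
    # Break between NN and AN: the event is AN.
    bs_plus = (p_an - (1.0 if obs == 2 else 0.0)) ** 2
    return 0.5 * (bs_minus + bs_plus)


def mean_rps(forecasts, observations):
    """Unweighted mean RPS over a set of forecast/observation pairs."""
    total = sum(tercile_rps(p, o) for p, o in zip(forecasts, observations))
    return total / len(observations)


def rpss(forecasts, reference, observations):
    """Skill of `forecasts` relative to `reference`; positive is better."""
    return 1.0 - mean_rps(forecasts, observations) / mean_rps(reference, observations)


# Toy example with three seasons and hypothetical forecast probabilities.
observed = [2, 0, 1]
climatology = [(1 / 3, 1 / 3, 1 / 3)] * 3
forecast_a = [(0.2, 0.3, 0.5), (0.5, 0.3, 0.2), (0.3, 0.4, 0.3)]
forecast_b = [(0.3, 0.3, 0.4), (0.4, 0.3, 0.3), (0.3, 0.4, 0.3)]

skill_vs_climatology = rpss(forecast_a, climatology, observed)
skill_vs_forecast_b = rpss(forecast_a, forecast_b, observed)
```

Swapping the reference from climatology to a second forecast is the only change needed to move from the quantity in panel (a) to the one in panel (c). A value above zero means the first forecast adds skill over the reference.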


between simplicity and skill explored in the verification of the EBFs can inform the dissemination and suggested use of climate information. While the WMO continues to facilitate and encourage use of state-of-the-art forecasts, simplified EBFs may be more appropriate for users with limited resources as they are better forecasts than no information in many regions. However, the superior skill of the probabilistic EBF over the deterministic EBF shows that communication of uncertainty in the impacts of ENSO is critical in the proper adoption of climate information for decision making (Suarez and Patt 2004; Hansen 2005; Roulston et al. 2006; Hirschberg et al. 2011). Potential communication methods could include the impact maps from this study or customized regional info-sheets that back up probabilities with historical data.

FIG. 12. The (a) global and (b) tropics mean RPSS for the three forecasts with reference forecasts of climatology and the probabilistic known-ENSO EBF. The EBF is equal to climatology in periods of neutral ENSO. Total RPSS scores over the record are given by the values in the bottom-right corner.

The verification study also reveals regions where forecast methods may be underperforming relative to the simple ENSO-based method. Predictions based on historical teleconnections exhibit greater skill in southwestern Africa, eastern Australia, and around the Bay of Bengal as shown by the probabilistic known-ENSO EBF. Also, greater EBF skill is observed in parts of the Northern Hemisphere extratropics where more work is necessary to understand the nature of these results.

While the state-of-the-art forecast system generally outperforms the EBF system, especially in recent years, there is


FIG. 13. IRI forecast and probabilistic forecast-ENSO EBF forecast (a) RPSS and (b) resolution as a function of lead time.
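The resolution shown in panel (b) is one term of the Brier score decomposition BS = REL − RES + UNC given in the appendix. The sketch below computes all three terms for a binary event by grouping cases on the issued probability. It uses the common mean-normalized form of the decomposition and omits the gridbox area weighting described in the appendix; both choices, along with the function name and toy data, are assumptions made for illustration.

```python
from collections import defaultdict


def brier_decomposition(probs, outcomes):
    """Return (BS, REL, RES, UNC) for binary outcomes coded 0 or 1.

    Cases are grouped by the exact probability issued, so the identity
    BS = REL - RES + UNC holds up to floating point rounding.
    """
    n = len(probs)
    groups = defaultdict(list)
    for p, o in zip(probs, outcomes):
        groups[p].append(o)

    base_rate = sum(outcomes) / n
    rel = 0.0
    res = 0.0
    for p, obs in groups.items():
        freq = sum(obs) / len(obs)  # observed frequency given this forecast
        rel += len(obs) * (p - freq) ** 2
        res += len(obs) * (freq - base_rate) ** 2
    rel /= n
    res /= n
    unc = base_rate * (1.0 - base_rate)
    bs = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    return bs, rel, res, unc


# Toy forecast that is perfectly reliable and has nonzero resolution.
probs = [0.2] * 5 + [0.8] * 5
outcomes = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
bs, rel, res, unc = brier_decomposition(probs, outcomes)
```

A forecast that always issues the same probability forms a single group whose observed frequency equals the base rate, so its resolution is zero. This is why a purely climatological forecast carries no resolution, and why resolution falling toward zero with lead time signals a loss of information content.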

utility for these EBFs in the forecast development process. In the future, the IRI’s real-time forecast will be verified with EBFs in addition to the standard climatology. The alternative verification provides new information about the additional skill provided by a state-of-the-art forecast in regions with and without known teleconnections. These forecasts are available for download9 and will be updated monthly for use in other real-time forecasting systems.

9 http://iridl.ldeo.columbia.edu/expert/home/.lenssen/.realtime ReferenceForecasts.

Acknowledgments. The authors thank the three anonymous reviewers for their constructive comments. This work was funded by ACToday, the first project of Columbia University’s World Projects. N.L. was partially funded by a National Science Foundation Graduate Research Fellowship under Grant NSF DGE 16-44869. Thanks to Francesco Fiondella for designing the updated ENSO teleconnection cartoon and Tony Barnston, John del Corral, and Jeff Turmelle for all of their help tracking down historical forecasts. Also thanks to Weston Anderson, Yochanan Kushnir, and Ángel Muñoz for their comments and discussion.

APPENDIX

Forecast Verification Methods

The following section provides a brief introduction to the probabilistic forecast verification methods used in the study. The skill of the IRI forecasts and EBFs is quantified through their resolution, reliability, and discrimination. A summary of the verification attributes is given in Table 2. See Jolliffe and Stephenson (2012) and the citations therein for more information on forecast verification.

a. Forecast resolution and reliability: Ranked probability score

The resolution of a forecast describes how much the outcome varies for different forecasts, with a resolution score quantifying how much the outcome changes for different forecast probabilities. High resolution is critical to the performance of a forecast system. If the outcome does not depend on the forecast, the forecast is not providing any useful information. A forecast’s resolution cannot be improved through local postprocessing; the resolution is a measure of the inherent information in the forecast.

Reliability is a measure of how well forecast probabilities agree with the observed frequency of an event. Probabilistic statements are used to express uncertainty in a forecast and the probabilities issued should accurately reflect the underlying uncertainty. Forecasts that are over or underconfident are not properly calibrated to the actual occurrence of outcomes and are said to have poor reliability.

The Brier score simultaneously quantifies the resolution and reliability (Brier 1950) but can be decomposed into reliability and resolution components to evaluate each attribute separately. The Brier score (BS) for a collection of forecasts of a binary event/nonevent pair of size n is

$$\mathrm{BS} = \sum_{i=1}^{n} (p_i - d_i)^2, \tag{A1}$$

where $p_i$ is the forecasted probability of an event and $d_i = \mathbf{1}(\text{event occurs})$. The score takes on lower values for better forecasts. We use the Brier score because it is a ubiquitous probabilistic verification score and can be decomposed into reliability and resolution components (Murphy 1973). The Brier score can be written as

$$\mathrm{BS} = \mathrm{REL} - \mathrm{RES} + \mathrm{UNC}, \tag{A2}$$

where RES is the resolution, REL is the reliability, and UNC is the observational uncertainty, which is independent of the forecast. Since lower Brier scores are better, we seek small values for the reliability, indicating little departure from a perfectly reliable forecast. Similarly, a better forecast will have higher resolution with the outcome more strongly conditioned on the forecast. When calculating global and regional mean Brier scores and the respective reliability and resolution scores, it is critical to weight the forecasts by their gridbox area. A complete derivation of the weighted Brier decomposition used in this study is provided in Young (2010).

The ranked probability score (RPS) is the multicategory generalization of the Brier score and is used in this study to extend the Brier decomposition to tercile forecasts (Epstein 1969; Murphy 1971). One formulation of the RPS is as the sum of multiple Brier scores with the binary event boundary appropriately defined. For a tercile forecast, we write the RPS in terms of Brier scores as

$$\mathrm{RPS} = \frac{1}{2}\left(\mathrm{BS}_{-} + \mathrm{BS}_{+}\right), \tag{A3}$$

where $\mathrm{BS}_{-}$ is the Brier score with an event defined as near-normal or above-normal seasonal precipitation and $\mathrm{BS}_{+}$ is the Brier score with an event defined as above-normal seasonal precipitation (Jolliffe and Stephenson 2012). In other words, $\mathrm{BS}_{-}$ turns the tercile forecast into a binary forecast by placing a break between below normal and near normal, resulting in (nonevent, event) = (BN, {NN, AN}). Likewise, $\mathrm{BS}_{+}$ places the break between near normal and above normal, resulting in (nonevent, event) = ({BN, NN}, AN).

Writing the RPS as the sum of Brier scores allows for the extension of the reliability/resolution Brier score decomposition to tercile forecasts. The calculation of the RPS provides information on two of the key forecast attributes. Additionally, using this formulation of the RPS demystifies the statistic by showing it as the tercile generalization of the Brier score.

b. Discrimination: Generalized ROC score

With resolution quantifying how much the outcomes are conditioned on the forecasts, it is also informative to ask how the forecasts are conditioned on the outcomes. Discrimination scores answer the question: ‘‘Are forecasts different for different outcomes?’’ A forecast with poor discrimination will issue similar forecasts regardless of the outcome, providing little-to-no information on the outcome. As with resolution, forecast discrimination cannot be improved locally through postprocessing.

Forecast discrimination is quantified with the generalized relative operating characteristic (ROC) score, also known as the GROC score (Mason and Weigel 2009). As the name suggests, the score is a multicategory generalization of the area under the curve of an ROC curve used for binary events. The GROC determines the probability that a forecast correctly predicts the outcome and normalizes such that random guessing achieves a baseline probability of 0.5. The GROC is desirable as it is bounded in [0, 1] with higher values indicating greater discrimination, allowing for intuitive comparison of forecasts. Much like resolution, the GROC provides a measure of a forecast’s information and does not require that a forecast has sufficient resolution to be deemed skillful as in the RPSS. The GROC and resolution scores can be seen as additional measures of forecast skill alongside the RPSS.

The discrimination is an interesting attribute as it reveals the shared information pool of the two known-ENSO EBFs. During ENSO events, all of the forecasts have probability maxima for the same tercile. If this outcome occurs, all forecasts are deemed to have predicted this outcome equally correctly and are rewarded nearly the same. Minor differences arise between the known-ENSO deterministic and probabilistic EBFs due to nonclimatological forecasts that issue equal probabilities to two categories.

c. Forecast skill: Ranked probability skill score

A major question of the study is ‘‘What is the skill of the EBFs?’’ with skill defined as the performance of the EBFs compared with some reference forecast. Climatology is often used as a reference forecast since it represents a state of no information past the mean climate state. Rephrasing the question, calculating a skill with respect to climatology is equivalent to asking ‘‘How much better (or worse) does an EBF perform relative to a forecast of no information?’’

This study uses the ranked probability skill score (RPSS) as the measure of skill. Mathematically, the RPSS is

$$\mathrm{RPSS} = 1 - \frac{\mathrm{RPS}_{\mathrm{forecast}}}{\mathrm{RPS}_{\mathrm{climatology}}}, \tag{A4}$$

where RPS is the ranked probability score discussed above. Since the RPSS is derived from the RPS of a forecast, the RPSS is a combined measure of a forecast’s resolution and reliability compared with that of the climatology reference forecast. Values of the RPSS greater than zero indicate that the forecast outperforms the climatological forecast.

The RPS being a combination of resolution and reliability comes with advantages and disadvantages when used to evaluate forecasts. Since the RPS combines two important forecast attributes in a single score, it can be used to compare combined forecast resolution and reliability with a single score. However, since the RPS does not make clear if resolution or reliability is the cause of a good or poor forecast, individual measures of reliability and resolution are important to include in a verification study.

REFERENCES

Adler, R. F., and Coauthors, 2003: The Version-2 Global Precipitation Climatology Project (GPCP) Monthly Precipitation Analysis (1979–present). J. Hydrometeor., 4, 1147–1167, https://doi.org/10.1175/1525-7541(2003)004<1147:TVGPCP>2.0.CO;2.

Agresti, A., 1990: Categorical Data Analysis. 1st ed. Wiley, 558 pp.

Balas, N., S. E. Nicholson, and D. Klotter, 2007: The relationship of rainfall variability in west central Africa to sea-surface temperature fluctuations. Int. J. Climatol., 27, 1335–1349, https://doi.org/10.1002/joc.1456.

Barlow, M., H. Cullen, and B. Lyon, 2002: Drought in central and southwest Asia: La Niña, the warm pool, and Indian Ocean precipitation. J. Climate, 15, 697–700, https://doi.org/10.1175/1520-0442(2002)015<0697:DICASA>2.0.CO;2.


Barnston, A. G., S. Li, S. J. Mason, D. G. DeWitt, L. Goddard, and X. Gong, 2010: Verification of the first 11 years of IRI’s seasonal climate forecasts. J. Appl. Meteor. Climatol., 49, 493–520, https://doi.org/10.1175/2009JAMC2325.1.
Bellenger, H., E. Guilyardi, J. Leloup, M. Lengaigne, and J. Vialard, 2014: ENSO representation in climate models: From CMIP3 to CMIP5. Climate Dyn., 42, 1999–2018, https://doi.org/10.1007/s00382-013-1783-z.
Borbor-Mendoza, M., and Coauthors, 2016: WHO/WMO climate services for health: Improving public health decision-making in a new climate. WHO/WMO, 218 pp., https://public.wmo.int/en/resources/library/climate-services-health-case-studies.
Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. Mon. Wea. Rev., 78, 1–3, https://doi.org/10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2.
Brönnimann, S., E. Xoplaki, C. Casty, A. Pauling, and J. Luterbacher, 2007: ENSO influence on Europe during the last centuries. Climate Dyn., 28, 181–197, https://doi.org/10.1007/S00382-006-0175-Z.
Cai, W., and Coauthors, 2020: Climate impacts of the El Niño–Southern Oscillation on South America. Nat. Rev. Earth Environ., 1, 215–231, https://doi.org/10.1038/s43017-020-0040-3.
Carrillo, C. N., 1892: Disertación sobre las corrientes y estudios de la corriente Peruana de Humboldt. Bol. Soc. Geogr. Lima, 11, 72–110.
Cash, D. W., and J. Buizer, 2005: Knowledge–Action Systems for Seasonal to Interannual Climate Forecasting: Summary of a Workshop. National Academy Press, 44 pp., https://doi.org/10.17226/11204.
Cayan, D. R., K. T. Redmond, and L. G. Riddle, 1999: ENSO and hydrologic extremes in the western United States. J. Climate, 12, 2881–2893, https://doi.org/10.1175/1520-0442(1999)012<2881:EAHEIT>2.0.CO;2.
Compagnucci, R. H., and D. C. Araneo, 2007: Alcances de El Niño como predictor del caudal de los ríos Andinos Argentinos. IMTA-TC, 22, 23–35.
Crochemore, L., M.-H. Ramos, and F. Pappenberger, 2016: Bias correcting precipitation forecasts to improve the skill of seasonal streamflow forecasts. Hydrol. Earth Syst. Sci., 20, 3601–3618, https://doi.org/10.5194/hess-20-3601-2016.
Davey, M., A. Brookshaw, and S. Ineson, 2014: The probability of the impact of ENSO on precipitation and near-surface temperature. Climate Risk Manage., 1, 5–24, https://doi.org/10.1016/j.crm.2013.12.002.
Epstein, E. S., 1969: A scoring system for probability forecasts of ranked categories. J. Appl. Meteor., 8, 985–987, https://doi.org/10.1175/1520-0450(1969)008<0985:ASSFPF>2.0.CO;2.
Fisher, R. A., 1935: The logic of inductive inference. J. Roy. Stat. Soc., 98, 39–82, https://doi.org/10.2307/2342435.
Garreaud, R. D., and P. Aceituno, 2001: Interannual rainfall variability over the South American altiplano. J. Climate, 14, 2779–2789, https://doi.org/10.1175/1520-0442(2001)014<2779:IRVOTS>2.0.CO;2.
Garreaud, R. D., M. Vuille, R. Compagnucci, and J. Marengo, 2009: Present-day South American climate. Palaeogeogr. Palaeoclimatol. Palaeoecol., 281, 180–195, https://doi.org/10.1016/j.palaeo.2007.10.032.
Goddard, L., and M. Dilley, 2005: El Niño: Catastrophe or opportunity. J. Climate, 18, 651–665, https://doi.org/10.1175/JCLI-3277.1.
Goddard, L., A. G. Barnston, and S. J. Mason, 2003: Evaluation of the IRI’s “net assessment” seasonal climate forecasts: 1997–2001. Bull. Amer. Meteor. Soc., 84, 1761–1782, https://doi.org/10.1175/BAMS-84-12-1761.
Grimm, A. M., 2003: The El Niño impact on the summer monsoon in Brazil: Regional processes versus remote influences. J. Climate, 16, 263–280, https://doi.org/10.1175/1520-0442(2003)016<0263:TENIOT>2.0.CO;2.
Hansen, J. W., 2005: Integrating seasonal climate prediction and agricultural models for insights into agricultural practice. Philos. Trans. Roy. Soc. London, B360, 2037–2047, https://doi.org/10.1098/rstb.2005.1747.
Harris, I., T. J. Osborn, P. Jones, and D. Lister, 2020: Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Sci. Data, 7, 109, https://doi.org/10.1038/s41597-020-0453-3.
Hirschberg, P. A., and Coauthors, 2011: A weather and climate enterprise strategic implementation plan for generating and communicating forecast uncertainty information. Bull. Amer. Meteor. Soc., 92, 1651–1666, https://doi.org/10.1175/BAMS-D-11-00073.1.
Horel, J. D., and J. M. Wallace, 1981: Planetary-scale atmospheric phenomena associated with the southern oscillation. Mon. Wea. Rev., 109, 813–829, https://doi.org/10.1175/1520-0493(1981)109<0813:PSAPAW>2.0.CO;2.
Huang, B., and Coauthors, 2017: Extended Reconstructed Sea Surface Temperature, version 5 (ERSSTv5): Upgrades, validations, and intercomparisons. J. Climate, 30, 8179–8205, https://doi.org/10.1175/JCLI-D-16-0836.1.
Jolliffe, I. T., and D. B. Stephenson, Eds., 2012: Forecast Verification: A Practitioner’s Guide in Atmospheric Science. 2nd ed. Wiley, 292 pp.
Kirtman, B. P., and Coauthors, 2014: The North American Multimodel Ensemble: Phase-1 seasonal-to-interannual prediction; phase-2 toward developing intraseasonal prediction. Bull. Amer. Meteor. Soc., 95, 585–601, https://doi.org/10.1175/BAMS-D-12-00050.1.
Knaff, J. A., and C. W. Landsea, 1997: An El Niño–Southern Oscillation climatology and persistence (CLIPER) forecasting scheme. Wea. Forecasting, 12, 633–652, https://doi.org/10.1175/1520-0434(1997)012<0633:AENOSO>2.0.CO;2.
Kumar, A., and Coauthors, 2020: Guidance on operational practices for objective seasonal forecasting. WMO 1246, 106 pp., https://library.wmo.int/doc_num.php?explnum_id=10314.
Lehmann, E. L., and J. P. Romano, 2005: Testing Statistical Hypotheses. 3rd ed. Springer, 784 pp.
Lin, J., and T. Qian, 2019: A new picture of the global impacts of El Niño–Southern Oscillation. Sci. Rep., 9, 17543, https://doi.org/10.1038/s41598-019-54090-5.
Livezey, R. E., and M. M. Timofeyeva, 2008: The first decade of long-lead U.S. seasonal forecasts. Bull. Amer. Meteor. Soc., 89, 843–854, https://doi.org/10.1175/2008BAMS2488.1.
Lloyd-Hughes, B., and M. A. Saunders, 2002: Seasonal prediction of European spring precipitation from El Niño–Southern Oscillation and local sea-surface temperatures. Int. J. Climatol., 22, 1–14, https://doi.org/10.1002/joc.723.
Magana, V., J. Vázquez, J. Párez, and J. Pérez, 2003: Impact of El Niño on precipitation in Mexico. Geofis. Int., 42, 313–330.
Mason, S. J., 2018: Guidance on verification of operational seasonal climate forecasts. WMO 1220, 81 pp., https://library.wmo.int/doc_num.php?explnum_id=4886.
Mason, S. J., and L. Goddard, 2001: Probabilistic precipitation anomalies associated with ENSO. Bull. Amer. Meteor. Soc., 82, 619–638, https://doi.org/10.1175/1520-0477(2001)082<0619:PPAAWE>2.3.CO;2.
Mason, S. J., and A. P. Weigel, 2009: A generic forecast verification framework for administrative purposes. Mon. Wea. Rev., 137, 331–349, https://doi.org/10.1175/2008MWR2553.1.


Murphy, A. H., 1971: A note on the ranked probability score. J. Appl. Meteor., 10, 155–156, https://doi.org/10.1175/1520-0450(1971)010<0155:ANOTRP>2.0.CO;2.
Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600, https://doi.org/10.1175/1520-0450(1973)012<0595:ANVPOT>2.0.CO;2.
Murphy, A. H., 1991: Forecast verification: Its complexity and dimensionality. Mon. Wea. Rev., 119, 1590–1601, https://doi.org/10.1175/1520-0493(1991)119<1590:FVICAD>2.0.CO;2.
Nicholson, S. E., C. Funk, and A. H. Fink, 2018: Rainfall over the African continent from the 19th through the 21st century. Global Planet. Change, 165, 114–127, https://doi.org/10.1016/j.gloplacha.2017.12.014.
R Core Team, 2020: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, accessed 12 March 2020, https://www.R-project.org/.
Rahman, T., J. Buizer, and Z. Guido, 2016: The economic impact of seasonal drought forecast information service in Jamaica, 2014-15. Paper prepared for the U.S. Agency for International Development (USAID), 59 pp., https://www.climatelinks.org/sites/default/files/asset/document/Economic-Impact-of-Drought_Information_Service_FINAL.pdf.
Rodó, X., E. Baert, and F. A. Comín, 1997: Variations in seasonal rainfall in Southern Europe during the present century: Relationships with the North Atlantic Oscillation and the El Niño–Southern Oscillation. Climate Dyn., 13, 275–284, https://doi.org/10.1007/s003820050165.
Ronchail, J., and R. Gallaire, 2006: ENSO and rainfall along the Zongo valley (Bolivia) from the Altiplano to the Amazon basin. Int. J. Climatol., 26, 1223–1236, https://doi.org/10.1002/joc.1296.
Ronghui, H., and Y. Wu, 1989: The influence of ENSO on the summer climate change in China and its mechanism. Adv. Atmos. Sci., 6, 21–32, https://doi.org/10.1007/BF02656915.
Ropelewski, C. F., and M. S. Halpert, 1987: Global and regional scale precipitation patterns associated with the El Niño/Southern Oscillation. Mon. Wea. Rev., 115, 1606–1626, https://doi.org/10.1175/1520-0493(1987)115<1606:GARSPP>2.0.CO;2.
Ropelewski, C. F., and M. S. Halpert, 1989: Precipitation patterns associated with the high index phase of the Southern Oscillation. J. Climate, 2, 268–284, https://doi.org/10.1175/1520-0442(1989)002<0268:PPAWTH>2.0.CO;2.
Roulston, M. S., and L. A. Smith, 2002: Evaluating probabilistic forecasts using information theory. Mon. Wea. Rev., 130, 1653–1660, https://doi.org/10.1175/1520-0493(2002)130<1653:EPFUIT>2.0.CO;2.
Roulston, M. S., G. E. Bolton, A. N. Kleit, and A. L. Sears-Collins, 2006: A laboratory study of the benefits of including uncertainty information in weather forecasts. Wea. Forecasting, 21, 116–122, https://doi.org/10.1175/WAF887.1.
Sandjon, A. T., A. Nzeukou, and C. Tchawoua, 2012: Intraseasonal atmospheric variability and its interannual modulation in central Africa. Meteor. Atmos. Phys., 117, 167–179, https://doi.org/10.1007/s00703-012-0196-6.
Suarez, P., and A. G. Patt, 2004: Cognition, caution, and credibility: The risks of climate forecast application. Risk Decis. Policy, 9, 75–89, https://doi.org/10.1080/14664530490429968.
Sulca, J., K. Takahashi, J.-C. Espinoza, M. Vuille, and W. Lavado-Casimiro, 2018: Impacts of different ENSO flavors and tropical Pacific convection variability (ITCZ, SPCZ) on austral summer rainfall in South America, with a focus on Peru. Int. J. Climatol., 38, 420–435, https://doi.org/10.1002/joc.5185.
Sullivan, G. M., and R. Feinn, 2012: Using effect size – or why the p value is not enough. J. Grad. Med. Educ., 4, 279–282, https://doi.org/10.4300/JGME-D-12-00156.1.
Timmermann, A., and Coauthors, 2018: El Niño–Southern Oscillation complexity. Nature, 559, 535–545, https://doi.org/10.1038/s41586-018-0252-6.
Tödter, J., and B. Ahrens, 2012: Generalization of the ignorance score: Continuous ranked version and its decomposition. Mon. Wea. Rev., 140, 2005–2017, https://doi.org/10.1175/MWR-D-11-00266.1.
Vuille, M., R. S. Bradley, and F. Keimig, 2000: Interannual climate variability in the Central Andes and its relation to tropical Pacific and Atlantic forcing. J. Geophys. Res., 105, 12 447–12 460, https://doi.org/10.1029/2000JD900134.
Weijs, S. V., R. van Nooijen, and N. van de Giesen, 2010: Kullback–Leibler divergence as a forecast skill score with classic reliability–resolution–uncertainty decomposition. Mon. Wea. Rev., 138, 3387–3399, https://doi.org/10.1175/2010MWR3229.1.
Wilks, D. S., 2009: Extending logistic regression to provide full-probability-distribution MOS forecasts. Meteor. Appl., 16, 361–368, https://doi.org/10.1002/met.134.
Xie, P., and P. A. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539–2558, https://doi.org/10.1175/1520-0477(1997)078<2539:GPAYMA>2.0.CO;2.
Yeh, S.-W., and Coauthors, 2018: ENSO atmospheric teleconnections and their response to greenhouse gas forcing. Rev. Geophys., 56, 185–206, https://doi.org/10.1002/2017RG000568.
Young, R. M. B., 2010: Decomposition of the Brier score for weighted forecast-verification pairs. Quart. J. Roy. Meteor. Soc., 136, 1364–1370, https://doi.org/10.1002/qj.641.
Zhang, R., A. Sumi, and M. Kimoto, 1999: A diagnostic study of the impact of El Niño on the precipitation in China. Adv. Atmos. Sci., 16, 229–241, https://doi.org/10.1007/BF02973084.
