
FEBRUARY 2018 GASCÓN ET AL. 89

Improving Predictions of Precipitation Type at the Surface: Description and Verification of Two New Products from the ECMWF Ensemble

ESTÍBALIZ GASCÓN, TIM HEWSON, AND THOMAS HAIDEN
European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom

(Manuscript received 9 August 2017, in final form 7 November 2017)

ABSTRACT

The medium-range ensemble (ENS) from the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS) is used to create two new products intended to face the challenges of precipitation-type forecasting. The first is a map product that represents which precipitation type is most likely whenever the probability of precipitation is >50% (also including information on lower probability outcomes). The second is a meteogram product that shows the temporal evolution of the instantaneous precipitation-type probabilities for a specific location, classified into three categories of precipitation rate. A minimum precipitation rate is also used to distinguish dry from precipitating conditions, setting this value according to type, in order to try to enforce a zero frequency bias for all precipitation types. The new products differ from other ECMWF products in three important respects: first, the input variable is discretized, rather than continuous; second, the postprocessing increases the output information content; and, third, the map-based product condenses information into a more accessible format. The verification of both products was carried out using four months’ worth of 3-hourly observations of present weather from manual surface synoptic observations (SYNOPs) in Europe during the 2016/17 winter period. This verification shows that the IFS is highly skillful when forecasting rain and snow, but only moderately skillful for freezing rain and rain and snow mixed, while the ability to predict the occurrence of ice pellets is negligible. Typical outputs are also illustrated via a freezing-rain case study, showing interesting changes with lead time.

Denotes content that is immediately available upon publication as open access.

Corresponding author: Estíbaliz Gascón, estibaliz.gascon@ecmwf.int

DOI: 10.1175/WAF-D-17-0114.1
© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).
WEATHER AND FORECASTING VOLUME 33

1. Introduction

One of the greatest difficulties facing forecasters during the winter is the accurate identification of precipitation type at ground level (Ralph et al. 2005). Certain types of precipitation can be a threat to human health and public safety and can disrupt travel and commerce, seriously affecting the economy (Ralph et al. 2005; Reeves et al. 2016). Freezing rain (FZRA) is particularly hazardous due to its ice-loading effects on power wires, and because it can make travel extremely dangerous. In the most severe cases with heavy and prolonged freezing precipitation, the consequences can be catastrophic, with collapsed power lines causing prolonged power outages, with travel networks of many types completely paralyzed, and with major long-term damage to infrastructure and vegetation (DeGaetano 2000; Chang et al. 2007; Call 2010). Accurate predictions from weather forecast models of timing (onset and duration), intensity, spatial extent, and phase (i.e., precipitation type) are crucial for decision-making and can help minimize the potential impacts (Branick 1997; Ikeda et al. 2013; Grout et al. 2012; Ikeda et al. 2017). Nevertheless, although it is self-evident that correct predictions of precipitation type are vitally important, only limited attention has been paid to wintertime precipitation-type forecasting in Europe.

There are numerous sources of uncertainty in precipitation-type forecasts; in particular, mixed phases [FZRA, ice pellets (IP), and rain and snow mixed (RASN)] are not well predicted (Wandishin et al. 2005; Reeves et al. 2014; Elmore et al. 2015; Ikeda et al. 2017) and continue to pose a substantial forecast challenge for numerical weather prediction (NWP) models. The thermodynamic structures of the atmosphere in IP and FZRA situations are so similar that small errors in the predicted thicknesses of an elevated melting layer and/or a near-surface subzero layer can result in an incorrect prediction of precipitation type at ground level (Reeves et al. 2014, 2016). Precipitation rate also plays an important role in the correct determination of precipitation type, because melting, which is an integral feature of IP/FZRA situations, will absorb latent heat from the atmosphere and thereby cool and modify the thermodynamic structure in proportion to precipitation rate. Furthermore, precipitation rate is influenced by snowflake type, density, and the degree of riming, as well as by interaction with other particles during passage through different atmospheric layers with different temperature and moisture profiles (Sankaré and Thériault 2016; Reeves et al. 2016; Ikeda et al. 2017). Temporal variability is an added complication since rain (RA) and snow (SN) are in general longer-lived phenomena than FZRA, IP, or RASN. Various authors (such as Reeves 2016) have also highlighted the existence of strong biases in precipitation-type forecasts, especially for FZRA and IP (Manikin et al. 2004; Schuur et al. 2012; Reeves et al. 2014), but also with RA and SN. The causes of these systematic errors can in principle be diagnosed using observational data (Reeves et al. 2014). Forbes et al. (2014) used two freezing-rain case studies to compare the latest version of the cloud and precipitation parameterizations, in Integrated Forecasting System (IFS) cycle 41r1 (2015), with the version in IFS cycle 36r4 (2010). In the newer version, the freezing-rain processes for elevated warm layers are modified. They include a more representative time scale for the refreezing of raindrops that depends on the temperature and crucially on whether the snow particles have completely melted or not (Zerr 1997). Forbes et al. (2014) found large errors in the previous depiction of precipitation type, specifically an inability to predict the extent of the freezing-rain event, whereas the model with the new physics is able to predict freezing rain that is in general agreement with the observations. The present study is an extension of Forbes et al. (2014), who showed the advantages of using ensemble forecasts of precipitation type and a capacity to detect potential freezing-rain areas even with low precipitation rate thresholds.

The correct choice of observations is another important aspect of precipitation-type verification, but few authors have paid attention to this topic (Huntemann et al. 2014; Reeves 2016). When the true surface temperature is near to 0°C, small errors in forecasts can have a large impact on the precipitation-type forecast (Reeves 2016). Similarly, height differences between the model and the true orography at observation sites can be another source of uncertainty in the verification process. The main surface observations used by national meteorological services in Europe are surface synoptic observations (SYNOPs), of both automatic and manual types. Manual SYNOP observations are generated by trained observers and are generally accurate with regard to the determination of precipitation type. However, automatic SYNOP station precipitation-type reports are often erroneous, with mixed precipitation types often misdiagnosed (Elmore et al. 2015).

Regarding NWP, sophisticated microphysical parameterization schemes are widely used in high-resolution regional forecast models, which should help with precipitation-type prediction, but even with such complex algorithms, correctly predicting what phase of precipitation ends up at the ground remains a challenging task (Ikeda et al. 2013). The study by Thériault et al. (2010) demonstrated that precipitation type at the ground is highly sensitive to temperature profile variations as small as ±0.5°C, meaning that predictive difficulties are particularly acute within snow–rain transition regions. Other factors such as proximity to water bodies, terrain height variations, or the precipitation rate are very important as well (Stewart 1985; Bernstein 2000; Robbins and Cortinas 2002; Minder and Durran 2011). Some authors consider these uncertainties to be difficult to reduce, but they can potentially be quantified by the use of ensemble forecasts (Cortinas et al. 2002; Manikin et al. 2004; Wandishin et al. 2005; Shafer and Rudack 2014; Scheuerer et al. 2017). Brooks et al. (1996), Wandishin et al. (2005), and Reeves (2016) point out that a more desirable approach to increasing the accuracy of precipitation-type forecasts for mixed precipitation events is to use ensemble prediction to provide probabilistic forecasts of precipitation type. Naturally, this provides the forecaster with a broader perspective on the likelihood of occurrence of different mixed phases during (potential) FZRA episodes. Wandishin et al. (2005) published the first study to investigate extensively the potential use of ensembles for forecasting precipitation types during the winter period. They used 10 ensemble members and examined 0–48-h lead times, showing that ensemble forecasts have the capacity to be of substantial value to potential users and that skill increased with the number of members, especially for mixed-phase precipitation forecasts.

ECMWF IFS ensemble forecasts (ENS) have been operational for over 25 years and currently comprise 1 control and 50 perturbed forecasts running out to 15 days, twice per day. Instantaneous surface precipitation type is one of the ENS output variables, and this takes one of six different values: RA, SN, wet snow (WSN), RASN, FZRA, and IP. In addition, one can compute the total instantaneous precipitation rate at the surface by summing together convective/stratiform precipitation rate values valid at a particular time (rather than the typical averaged or accumulated over a period), and this can be combined with the precipitation-type diagnostic. In the IFS model the melting and refreezing parameterizations must be formulated in terms of its prognostic variables for precipitation: rain and snow. Precipitation type is diagnosed from the ratio of rain and snow at the surface and the profile of precipitation and temperature above. Whether the precipitation refreezes or not in the air below also affects the temperature profile, resulting in colder temperatures in the layer if the particles remain as supercooled water. The latent heat of fusion is instead transferred to the surface with the rain freezing on impact, and this leads to a relative warming of the surface and near-surface temperatures. More information about the algorithms can be found in Forbes et al. (2014).

As part of ECMWF’s contribution to the Enhancing Emergency Management and Response to Extreme Weather and Climate Events (ANYWHERE) project, two new products have been developed in this study based on ENS forecasts of precipitation type: a map showing the most probable precipitation type (PREFptype) and a meteogram showing the probability of precipitation type (PROBptype) for a user-selected site. These new products highlight the advantages of using ENS forecasts to infer precipitation type, especially in more challenging situations where there is a risk of SN or FZRA. We have also developed a methodology for verifying precipitation-type forecasts using SYNOP observations. The instantaneous precipitation rate variable, which can acquire minuscule values, is also used concurrently and in a new way to define cutoffs between precipitating and dry conditions. In using this rate variable, we aimed to minimize, for each precipitation type, any frequency biases relative to SYNOP reports. In turn this approach should reduce the numbers of misses and/or false alarms. The methods and datasets used in the creation and validation of these new products are explained in section 2. Sections 3 and 4 describe the verification procedure and results for the meteogram and map products, respectively. Section 5 describes a FZRA case study using both products. Concluding remarks are provided in section 6.

2. Datasets and products

a. Forecasts and products

At ECMWF, the precipitation-type output variable is provided by the IFS deterministic/high-resolution (HRES) and ENS runs. This variable describes the type of precipitation falling at the surface at each forecast time step (and denotes dry if the total precipitation rate is zero). RA, SN, FZRA, and IP have been considered in other studies related to precipitation-type forecasts (Wandishin et al. 2005; Reeves et al. 2014; Elmore et al. 2015). However, RASN (partly melted falling snow) has not been considered in these studies. The version of IFS used for this study was cycle 43r1 (operational at ECMWF from November 2016 to July 2017) with 1 control and 50 perturbed forecasts (the ENS). For the verification of the new ECMWF precipitation-type products we used the 0000 UTC base time and evaluated up to a lead time of 168 h. In this ENS configuration, the spatial resolution is approximately 18 km while the temporal resolution of the output available internally at ECMWF is 1 hourly from T + 0 to T + 90 h, 3 hourly from T + 90 to T + 144 h, and 6 hourly from T + 144 h onward. For the verification we split the 7 days into 24-h periods and focused on the domain 33°–71°N, 11°W–35°E. HRES precipitation-type forecasts have given (and continue to give) useful information to users for winter precipitation forecasting (Forbes et al. 2014). However, the new tools described in this paper go further and provide information even when the probabilities of occurrence are very low, and they add more detailed information specific to each precipitation type.

The PREFptype maps (Figs. 1a,b) show, in color, which of the six precipitation types is most likely whenever the probability of any type of precipitation is >50%. Then, shading darkness is further used to denote what the probability (of the type denoted by the color) actually is, in three probability ranges: up to 50%, 50%–70%, and more than 70%. To expand on this, particularly for longer leads when probabilities above 50% become increasingly rare, we also use two gray shades to denote when the probability of any type of precipitation is 10%–30% or 30%–50%. This overall structure has been carefully designed to try to extract, compress, and display as much information as possible from the input ensemble data, in a meaningful format, taking into account the requirements of what might be a typical user. Figures 1a and 1b show example forecast output for a winter storm in the northeastern United States, valid at 1200 UTC 14 March 2017, with base times of 0000 UTC 14 March and 0000 UTC 11 March, respectively. Both PREFptype maps show a similar structure but note, first, that FZRA and IP were the most probable types in some regions in the shortest-range forecast (Fig. 1a) but not 3 days in advance (Fig. 1b), and second that for the longer lead time (Fig. 1b) we see more gray areas, which is typical. Another design aspect for the user to appreciate is that whenever the lightest shade of a given color


FIG. 1. A significant heavy snowfall event occurred near the U.S. East Coast on 14 Mar 2017. Map products show the PREFptype, valid at 1200 UTC 14 Mar, from base times of 0000 UTC (a) 14 Mar (yellow circle is the position of New York City) and (b) 11 Mar. The PROBptype meteogram plots at New York City (40.72°N, 74°W) for base times of 0000 UTC (c) 14 and (d) 11 Mar 2017, both valid up to a 120-h lead time.
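As an illustration of how the map product could be derived from the ensemble at a single grid point and time, the following minimal Python sketch counts the members predicting each type and assigns the color and shading class described in section 2a. It is not the operational ECMWF code. The function names, the handling of exact boundary values (e.g., whether 50% falls in the lower or upper range), and the minimum rates assumed for WSN and IP are hypothetical; the minimum rates for RA, SN, RASN, and FZRA are those proposed in section 3a.

```python
from collections import Counter

# Minimum rate (mm/h) below which a member counts as dry.
# RA, SN, RASN and FZRA follow section 3a; WSN and IP are placeholder assumptions.
R_MIN = {"RA": 0.12, "SN": 0.05, "WSN": 0.05, "RASN": 0.10, "FZRA": 0.05, "IP": 0.05}


def wet_types(members):
    """Types of the members whose rate reaches the minimum for their type.

    members: list of (ptype, rate_mm_per_h), one entry per ensemble member;
    dry members can be given as (None, 0.0).
    """
    return [ptype for ptype, rate in members if ptype in R_MIN and rate >= R_MIN[ptype]]


def type_probabilities(members):
    """Fraction of ensemble members predicting each precipitation type."""
    counts = Counter(wet_types(members))
    return {ptype: counts[ptype] / len(members) for ptype in R_MIN}


def pref_ptype(members):
    """Return (most probable type or None, shading class) for one map pixel."""
    probs = type_probabilities(members)
    p_any = len(wet_types(members)) / len(members)
    if p_any > 0.5:
        best = max(probs, key=probs.get)
        p_best = probs[best]
        if p_best <= 0.5:
            return best, "up to 50%"
        if p_best <= 0.7:
            return best, "50-70%"
        return best, "more than 70%"
    if p_any >= 0.3:
        return None, "gray 30-50%"
    if p_any >= 0.1:
        return None, "gray 10-30%"
    return None, "none"


# Ten-member toy ensemble: six snow, two rain, two dry.
example = [("SN", 0.5)] * 6 + [("RA", 0.3)] * 2 + [(None, 0.0)] * 2
print(pref_ptype(example))
```

The per-type fractions returned by `type_probabilities` are the quantities a meteogram would plot at each time step; splitting them further by rate category would follow the same counting pattern.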

(except gray) appears on a map, the user immediately knows that more than one precipitation type has been predicted at that time, which can serve as an initial alarm bell for “uncertainty.” In particular, one can see examples of this in Fig. 1b. We envisage that in the first instance users might view this product and display an animation over a range of lead times, and then focus in on a particular location/event using the meteogram product (Figs. 1c,d), which provides much more detailed probabilistic information.

The PROBptype or meteogram product itself (Figs. 1c,d) depicts the temporal evolution of probabilities for a specific location in bar chart format. Here, the shading provides much more detail regarding the probabilities for different precipitation types and also includes information pertaining to the instantaneous total precipitation rate (another IFS variable), which can be key for determining the severity of impacts of, for example, potential freezing-rain or snowfall events. We define three different categories of precipitation type depending on the precipitation rate: from Rmin to 0.2 mm h⁻¹ (low intensity, where Rmin is the minimum permissible rate for each precipitation type), from 0.2 to 1 mm h⁻¹ (moderate intensity), and greater than 1 mm h⁻¹ (high intensity). One clear example of the advantages of this product can be seen in the New York meteograms; the first one is valid from 14 to 18 March 2017 (Fig. 1c) and the second one from 11 to 15 March 2017 (Fig. 1d). Although both meteograms forecasted relatively heavy snowfall (intensities greater than 1 mm h⁻¹) in New York City on 14 March, the most recent prediction (Fig. 1c) shows higher probabilities


FIG. 2. Map showing the SYNOP manual stations used. The verification covered the area from 33° to 71°N and from 11°W to 35°E.

(close to 100%) of heavy snowfall during several hours and includes some probabilities of freezing rain. However, 3 days in advance of the event, probabilities of snow were mixed with wet snow and even with some rain, but provided no evidence of FZRA.

The constant Rmin was defined on the basis of a 4-month verification training period that utilized present weather reports from SYNOP observations. For each type of precipitation the objective was to come as close as possible to removing any biases at all lead times beyond day 1 (as far as the wintertime SYNOP observations were concerned). The procedure adopted was iterative, wherein five values of the precipitation rate threshold were applied in order to calculate the bias for each precipitation type: 0.02, 0.05, 0.07, 0.1, and 0.12 mm h⁻¹. For our computations, we considered each ensemble member as a binary forecast result (occurrence/nonoccurrence), obtaining 51 forecasts nine times per day (3 hourly), giving a total of 459 forecasts per day in each location. Each precipitation type was studied and verified probabilistically for the PROBptype product, while the data from PREFptype were considered to constitute binary observations (occurrence or nonoccurrence) in the verification.

During the design of the products, we received continuous feedback from volunteer forecasters, adding extra features that they requested, as well as removing information that was not relevant for them. Training courses were also used to show how the products could/should be used and to get further feedback regarding how the products were actually being interpreted. In these various ways the products have been optimized for potential users, and at the same time those users have been and will continue to be trained in their usage.

b. Observations

The verification of the two new products was performed exclusively using 3-hourly observations of present weather from manned SYNOP stations in Europe. Despite the high density of automatic SYNOP stations in Europe, we refrained from using them in this study because the present weather sensing and coding can be unreliable, particularly in mixed-phase precipitation scenarios (Elmore et al. 2015). The period analyzed ran from 15 October 2016 to 15 February 2017 (4 months over winter). The aim here was to assess the most recent ECMWF model cycle running over a winter period (cycle 43r1). Although both products were originally created with 1-h time resolution, only 3-hourly verification was possible as a result of the absence of more frequent SYNOP manual observations in ECMWF archives. One of the difficulties encountered in this validation was the inconsistent frequency of present weather observations for the different stations in the archives [an issue also noted by Carriére et al. (2000)]. The total number of stations used in this study is 1050 (Fig. 2). However, not all stations are open 24 h a day, so the nominal maximum frequency of SYNOP observations providing a current weather group during the study period varied with time of the day. The original present weather reports were classified into one of five different categories: RA, SN, RASN, FZRA, and IP (Table 1). WSN was not considered separately because of the lack of direct observations for its verification; instead, the WSN forecasts were classified as SN. In future work one could conceivably use measurements of 2-m temperature (as in the IFS code) or visibility (see Ludlam 1980)


TABLE 1. Precipitation-type classification from the SYNOP manual present weather code, including the number of observations of each precipitation type during the verification period. Total number of present weather observations = 681 155.

Precipitation type / SYNOP code / Description

RA (49 892 obs)
    51                        Drizzle, not freezing; continuous slight
    52, 53, 54, 55            Drizzle, not freezing; intermittent or continuous; moderate or heavy
    58, 59                    Drizzle and rain; slight, moderate, or heavy
    60, 61, 62, 63, 64, 65    Rain, not freezing; intermittent or continuous; slight, moderate, or heavy
    80, 81, 82                Rain showers; slight, moderate, or heavy/violent
    87, 88                    Showers of snow pellets or small hail, with or without rain or rain and snow mixed; slight, moderate, or heavy
    89, 90, 91, 92, 93, 94    Showers of hail, with or without rain or rain and snow mixed; not associated with thunder
    95, 96, 97, 99            Thunderstorm with or without hail but with rain and/or snow; slight, moderate, or heavy

SN (34 591 obs)
    70, 71, 72, 73, 74, 75    Intermittent or continuous fall of snowflakes; slight, moderate, or heavy
    76, 77, 78                Diamond dust or snow grains or isolated starlike snow crystals
    85, 86                    Snow showers; slight, moderate, or heavy
    87, 88                    Showers of snow pellets or small hail, with or without rain or rain and snow mixed; slight, moderate, or heavy
    89, 90, 93, 94            Showers of hail, with or without rain or rain and snow mixed; not associated with thunder
    95, 97                    Thunderstorm with or without hail but with rain and/or snow; slight, moderate, or heavy

RASN (3362 obs)
    68, 69                    Rain or drizzle and snow; slight, moderate, or heavy
    83, 84                    Showers of rain and snow mixed; slight, moderate, or heavy
    87, 88                    Showers of snow pellets or small hail, with or without rain or rain and snow mixed; slight, moderate, or heavy
    89, 90, 93, 94            Showers of hail, with or without rain or rain and snow mixed; not associated with thunder
    95, 97                    Thunderstorm with or without hail but with rain and/or snow; slight, moderate, or heavy

FZRA (538 obs)
    57                        Drizzle, freezing; moderate or heavy (dense)
    66, 67                    Rain, freezing; slight, moderate, or heavy

IP (1315 obs)
    77                        Snow grains
    79                        Ice pellets

Observed precipitation but considered as “no precipitation”
    50                        Drizzle, not freezing; intermittent
    56                        Drizzle, freezing; slight
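The classification in Table 1 can be expressed as a simple lookup. The Python sketch below transcribes the code lists exactly as they appear in the table; the dictionary and function names are hypothetical, and the snippet is only an illustration of how a report that belongs to several categories could be handled, not the verification code used in the study.

```python
# SYNOP present weather codes per category, transcribed from Table 1.
SYNOP_CATEGORIES = {
    "RA": {51, 52, 53, 54, 55, 58, 59, 60, 61, 62, 63, 64, 65,
           80, 81, 82, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 99},
    "SN": {70, 71, 72, 73, 74, 75, 76, 77, 78, 85, 86,
           87, 88, 89, 90, 93, 94, 95, 97},
    "RASN": {68, 69, 83, 84, 87, 88, 89, 90, 93, 94, 95, 97},
    "FZRA": {57, 66, 67},
    "IP": {77, 79},
}

# Codes that report precipitation but are treated as "no precipitation".
NO_PRECIP_CODES = {50, 56}


def classify(code):
    """Return the set of Table 1 categories a present weather code belongs to.

    An empty set means the report is verified as dry, either because the code
    is in NO_PRECIP_CODES or because it is not a precipitation code at all.
    """
    return {cat for cat, codes in SYNOP_CATEGORIES.items() if code in codes}


# A shower of snow pellets (code 87) counts as an observed event in three categories.
print(sorted(classify(87)))
```

Returning a set rather than a single label keeps the non-exclusive nature of some reports explicit, so each category can be verified against its own list of observed events.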

as the basis for separating SN from WSN, but here we retain the simpler approach. Some types of present weather observations were easy to classify, but the classification of mixed-phase precipitation was not so straightforward. For example, how large a fraction of water do you need before RASN becomes RA or, for that matter, SN? Also, not all SYNOP present weather reports had an exclusive classification, meaning that the same observation could correspond to two or three different categories (e.g., “rain or drizzle and/or snow,” with codes 68 and 69, are included in both the RA and RASN categories). Slight freezing drizzle (code 56) was considered “no precipitation”; however, slight continuous not freezing drizzle (code 51) was included in the RA category (Table 1). This decision was made after many tests during the verification process in an attempt to avoid including extra bias due to the wrong classification of the observations (not shown). As we described in section 1, one important source of uncertainty is the height difference between the model and observations, which is critical in near-freezing temperatures. For this reason, SYNOP stations with an altitude difference of more than 200 m relative to the closest ENS point were removed from the verification.

3. Verification of ENS precipitation type (meteogram product)

a. Reducing systematic bias

Here, we first describe the methodology for rate-related frequency bias correction for the precipitation-type variable; that is, we define Rmin for each precipitation type. The target of this procedure was to make the total


FIG. 3. Frequency bias for ENS forecast with different precipitation rate thresholds (mm h⁻¹) and for each of the five precipitation types at (a) 24–48- and (b) 96–120-h lead times.

frequency of occurrence of each precipitation type, within forecasts, over all the observation sites, equal to the observed frequency of occurrence at those sites (i.e., frequency bias = 1). We examined multiple lead times, anticipating that there could be some lead-time-based drift in frequencies, though not expecting this would be a major factor. Figure 3a shows the frequency bias calculated over our 4-month wintertime verification training period, as a function of different precipitation rate thresholds (0.02, 0.05, 0.07, 0.1, and 0.12 mm h⁻¹) at 24–48-h lead times. This day-2 lead was initially used as a focal point instead of day 1 to avoid any model spinup effects and also in recognition of the fact that ECMWF’s primary goal is not short-range forecasting. Naturally, frequency bias diminishes as the precipitation rate threshold increases. All precipitation types, except IP, present a positive frequency bias with a precipitation rate of 0.02 mm h⁻¹, which suggests that this limit is too low as it leads to overprediction. Indeed, the bias reaches almost 2 for RA. A value of Rmin = 0.05 mm h⁻¹ seems to be the most suitable limit for the SN and FZRA forecasts, giving a bias close to 1. However, for RASN and RA we need a higher rate and set Rmin = 0.1 mm h⁻¹ for RASN and 0.12 mm h⁻¹ for RA. So, from a bias-reduction perspective it is clearly beneficial to apply different precipitation rate thresholds for each precipitation type. IP exhibits a different behavior compared to the other precipitation types: whatever the precipitation rate threshold, a large underestimation occurs.

The frequency biases for ENS at 96–120-h lead times (Fig. 3b) are quite similar to those at 24–48-h lead times (Fig. 3a), suggesting that the bias is not in general a function of lead time (beyond day 1), and so we apply the same Rmin values across the lead-time range. Although only two lead-time ranges are shown here, we did in fact analyze seven sets (from 0–24 to 144–168 h), and very similar results were obtained for all of them. Equally, we found no evidence of spinup effects on day 1, which from both a user’s and a modeler’s perspective is encouraging (not shown).

To evaluate the advantages of postprocessing using Rmin in probabilistic forecasts of precipitation type, reliability diagrams for each precipitation type with different thresholds were constructed. Reliability diagrams (Murphy and Winkler 1977; Wilks 1995) compare the forecast probabilities against the frequency of an event occurrence and therefore measure how closely the forecast probabilities of an event correspond to the actual chance of the event occurring. In this section, two different lead times for evaluating performance are again considered (24–48 and 96–120 h), with two different precipitation rate thresholds (see Fig. 4). The black solid diagonal line represents perfect reliability. RA forecasts are reasonably reliable for both lead times and Rmin settings (Fig. 4a), though the larger Rmin (blue) gives better results throughout. The SN forecasts are also reasonably reliable (Fig. 4b), but if the larger Rmin setting was used, and probabilities were low (10%–30% say), too many events would be missed. So results for both RA and SN are consistent with the recommendations from the previous section regarding Rmin. The FZRA (Fig. 4c) and RASN (Fig. 4d) forecasts are not good but show some limited skill, though the sample size seems insufficient to highlight the benefits of the recommended Rmin values. Also of note is the fact that high


FIG. 4. Reliability diagrams for ENS forecasts at 24–48- (solid lines) and 96–120-h (dashed lines) lead times for precipitation rate thresholds of 0.05 mm h⁻¹ (red lines), 0.12 mm h⁻¹ [blue lines in (a)], and 0.1 mm h⁻¹ [blue lines in (b)–(e)] for (a) RA, (b) SN, (c) FZRA, (d) RASN, and (e) IP. The inset histograms denote frequency of forecast usage of each probability bin for the 0.05 mm h⁻¹ precipitation rate threshold.
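The frequency-bias calibration of section 3a can be sketched in a few lines of Python. The sketch treats every ensemble member as a binary forecast, as stated in section 2a, and picks from the candidate thresholds the one whose bias is closest to 1. The function names, the data layout, the inclusive comparison `rate >= threshold`, and the toy data are assumptions made for illustration; this is not the verification tool used at ECMWF.

```python
# Candidate precipitation rate thresholds (mm/h) listed in section 3a.
CANDIDATE_THRESHOLDS = [0.02, 0.05, 0.07, 0.1, 0.12]


def frequency_bias(member_forecasts, observations, ptype, threshold):
    """Forecast frequency of `ptype` divided by its observed frequency.

    member_forecasts: one list per case (site and time), holding a
        (ptype, rate_mm_per_h) tuple for each ensemble member.
    observations: the observed type for each case, or None when dry.
    A member counts as forecasting `ptype` only if its rate reaches `threshold`.
    """
    n_forecasts = sum(len(members) for members in member_forecasts)
    n_forecast_events = sum(
        1
        for members in member_forecasts
        for fc_type, rate in members
        if fc_type == ptype and rate >= threshold
    )
    n_observed_events = sum(1 for ob in observations if ob == ptype)
    if n_forecasts == 0 or n_observed_events == 0:
        raise ValueError("frequency bias is undefined without forecasts and observed events")
    forecast_frequency = n_forecast_events / n_forecasts
    observed_frequency = n_observed_events / len(observations)
    return forecast_frequency / observed_frequency


def choose_r_min(member_forecasts, observations, ptype, thresholds=CANDIDATE_THRESHOLDS):
    """Pick the candidate threshold whose frequency bias is closest to 1."""
    return min(
        thresholds,
        key=lambda t: abs(frequency_bias(member_forecasts, observations, ptype, t) - 1.0),
    )


# Toy data set: four cases with two members each; rain observed in two cases.
toy_forecasts = [
    [("RA", 0.03), ("RA", 0.30)],
    [("RA", 0.06), ("RA", 0.50)],
    [("RA", 0.04), ("RA", 0.08)],
    [("RA", 0.01), ("SN", 0.20)],
]
toy_observations = ["RA", "RA", None, None]

for t in CANDIDATE_THRESHOLDS:
    print(t, frequency_bias(toy_forecasts, toy_observations, "RA", t))
print("selected Rmin:", choose_r_min(toy_forecasts, toy_observations, "RA"))
```

In the toy data the lowest threshold overpredicts rain because very light member rates are counted as events, which mirrors the behavior reported for the 0.02 mm h⁻¹ threshold.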

probability forecasts of either FZRA or RASN, though rarely occurring, are generally far too confident. This is a typical characteristic of reliability diagrams for parameters that are generally not well predicted. Finally, Fig. 4e shows that probabilistic forecasts for IP cannot be relied upon.

Clearly, sample size affects the above results (Table 1). Frequencies of IP and FZRA are very low compared with, for example, RA. On the other hand, from the perspective of severe winter weather prediction, it is somewhat encouraging that in spite of this FZRA forecasts, at least, do have some reliability at day 2. This is probably because there is more spatial continuity/extent in FZRA during an FZRA event than there would be for IP during an IP event. Examining the model output, one tends to see that IP conditions are predicted to occur in narrow bands (i.e., having a relatively one-dimensional structure), which makes predictive accuracy very vulnerable to slight lateral displacement errors (and also explains the very low frequencies of higher probability forecasts). FZRA zones are typically more two-dimensional.

One disadvantage of postprocessing in general is that there is always a need to recalibrate each time a related significant change is made in a new model cycle. However, at ECMWF the experimental runs covering many winter months are always carried out in advance of the release of a new cycle, and a verification tool has been automated to recalibrate the products in case the biases change significantly. This will allow the recalibration of Rmin to also be done in advance, and, as illustrated above, four months’ worth of reruns should be sufficient for this purpose. It is highly probable that the bias varies seasonally (e.g., we would expect a larger rate threshold for RA in summer); however, this new tool is mainly intended for use in the winter season, so we have prioritized the correction for this period of the year.

b. ROC curves

The relative (or receiver) operating characteristic (ROC) diagram (Mason 1982) is widely used to evaluate the quality of probabilistic forecasts (Stanski et al. 1989; Buizza and Palmer 1998; Mason and Graham 1999). It plots the hit rate H against the false alarm rate F, for different probability thresholds. The main diagonal corresponds to random forecasts (H = F), and the area under the ROC curve (AUC; Hanley and McNeil 1982) is taken as a measure of skill, with values between 0.5 (random forecast) and 1 (perfect forecast). For the verification of each precipitation type, we first apply the Rmin filter for each precipitation type, as discussed above. Following the filtering, the verification is performed without taking into account the intensity of the precipitation, only the probabilities of occurrence. Although ROC curves do not of themselves provide any measure of reliability, in our case we have tried to maximize reliability using the Rmin thresholds. However, the use of this approach does not mean that we will get perfect reliability at every probability threshold.

ROC curves for each precipitation type and at seven different lead times are shown in Fig. 5, wherein probability thresholds were assigned at 2% intervals (although labels were added only at 10% intervals, for the shortest-range forecasts; shown in red). The AUC score for each category and each time step is indicated in the bottom-right corner of each diagram. The ROC curves for RA (Fig. 5a) and for SN (Fig. 5b) seem quite similar; however, RA is a bit more skillfully predicted than SN. The differences between the RA and SN ROC plots are clearer in the AUC values (gray boxes). AUC values for RA precipitation (Fig. 5a) are between 0.90 at 0–24-h lead time and 0.81 at 144–168-h lead time. In the SN case (Fig. 5b), these values range from 0.87 to 0.83 at the same lead times. From days 5 to 7 the RA AUC is slightly lower than the SN AUC, as a result of the fact that F is greater for RA than for SN at these lead times. As in the analysis of the reliability (Fig. 4), the ROC curves for FZRA (Fig. 5c) and RASN (Fig. 5d) are quite similar, probably because of their lower frequencies in the study sample. For FZRA and RASN the first day exhibits slightly less skill in the ROC curves when compared with the second day. One hypothesis could be that spinup in precipitation processes is to blame, although with just this information we cannot be sure, and indeed the frequency bias adjustment procedure described above suggested that spinup was not a major issue. The AUC index for FZRA varies between 0.72 at 0–24 h and 0.59 at 144–168 h, indicating slight skill, especially at earlier lead times. RASN (Fig. 5d) shows similar overall skill to FZRA; however, it is slightly worse than FZRA at shorter lead times and better at longer lead times. In fact, the skill in RASN forecasts using this metric does not vary much with lead time. Finally, the ability to predict IP is almost negligible, with all curves close to the diagonal. Wandishin et al. (2005) published one of the first studies documenting the generation and verification of a precipitation-type short-range ensemble forecast product for the winter season using temperature vertical profile forecasting. That study shows much better results in the AUC, with values around 0.95 for RA, 0.86 for FZRA, 0.96 for SN, and 0.80 for IP. This significant difference in the results is because they only considered sites at which precipitation was both observed and forecast by at least one ensemble member. In the present study, we wanted to evaluate the

Unauthenticated | Downloaded 09/23/21 01:57 PM UTC 98 WEATHER AND FORECASTING VOLUME 33

precipitation-type forecast, including the no-occurrence case, while in their study the problem of precipitation type was separated from the occurrence/nonoccurrence.

FIG. 5. ROC curves at different lead times, up to day 7, for (a) RA, (b) SN, (c) FZ, (d) RASN, and (e) IP. The curves are the plots of hit rate vs false alarm rate for each decision threshold (2% interval used). Labels, at 10% intervals, are shown for the day-1 forecasts only (in red). The 45° black line represents no skill. The AUC for each lead time is shown in a gray box.

c. Relative economic value

Previous subsections contained several verification indices that provide the user with information about the usefulness of the PROBptype. The precipitation-type forecast value itself can be evaluated using a standard cost–loss model (Richardson 2000; Wilks 2001) or with more complex methodologies that incorporate the uncertainty in forecast probability derived from an ensemble, as Allen and Eckel (2012) propose. In this paper we only use the simpler first approach. The basic premise of the cost–loss problem is that a decision-maker is faced with the uncertain prospect of some kind of weather event. The user will be able to protect against the effects of this event, which incurs a cost, while the opposite scenario (occurrence of the event without protective action) results in a loss to the user. The protection cost occurs whenever the final decision is to protect, whether or not the weather event occurs. However, if no protective action is taken and the event does not occur, the economic impact is 0 (Wilks 2001). The users will have available the probability of occurrence p of this weather event for decision-making. If protection is chosen, the cost will be incurred (with a probability of 1), but the loss will be 0. If protection is not chosen, loss will be suffered with probability p. Therefore, the optimal time to pay for protective action is when the probability of occurrence of the event is more than the user's cost–loss ratio. So, the "relative value" of a forecast system is defined as the reduction in expenditure that it would lead to divided by the reduction that would be achieved by using a perfect forecast.

The benefit of the probabilistic approach is demonstrated in the cost–loss model by the flexibility it adds to the choice of a decision-making strategy: for a low cost–loss ratio application, it is a good choice to take action even for low forecasted probabilities. On the other hand, for large cost–loss ratio applications, costs can be reduced by taking action only for forecast probabilities close to 100%. The envelope of the value added by all possible strategies (one for each threshold value of the probability) is shown by the full curve on the relative


value diagram (see Fig. 6 for examples). Two parameters can summarize the information provided by cost–loss value plots: the maximum relative value Vmax and the width of the value curve Vwid. The second parameter provides information about the range of users for whom the forecast system would provide positive value (the larger Vwid is, the more users will obtain benefit from the product, assuming there is a somewhat even spread of users' cost–loss ratios).

FIG. 6. Cost–loss value plots for four precipitation types: (a) RA, (b) SN, (c) FZ, and (d) RASN. The envelope of the relative value added by all possible strategies (one for each threshold value of the probability) is shown by the full curves in the relative-value diagrams. Each line corresponds to a different lead time range, from 0–24 to 144–168 h (see legends).

We show the cost–loss ratio curves for four precipitation types: RA (Fig. 6a), SN (Fig. 6b), FZRA (Fig. 6c), and RASN (Fig. 6d), at six different lead-time ranges, from 0–24 to 144–168 h. These relative values have been calculated with a sample climatology of four winter months, as is the case for the rest of the analysis in this paper. IP cases have been removed from this study due to the marginal utility of their cost–loss results for users; however, RASN was kept due to the close connection with SN events. The relative values for RA (Fig. 6a) and SN (Fig. 6b) show quite similar shapes in the plots although there are some differences, especially in their behavior at different lead times. At 0–24-h lead time, Vmax is between 0.6 and 0.7 for RA and slightly higher for SN. While Vmax decreases with increasing lead time for RA, for SN it remains quite similar up to T + 72 h. Beyond T + 72 h, Vmax is reduced more slowly for SN than for RA. This is similar to the behavior we saw in AUC for these precipitation types (Figs. 5a,b). At the same time, Vwid for SN, for lead times from 24 to 96 h, is higher than it is for RA, with cost–loss values extending up to about 0.8, meaning that this product's SN forecasts would be useful for a greater range of users (greater Vwid) than would its RA forecasts. FZRA and RASN predictions (Figs. 6c,d, with reduced scale ranges


TABLE 2. Standard 2 × 2 contingency table for dichotomous forecasts.

                      Event observed
Event forecast        Yes              No
Yes                   A (hits)         B (false alarms)
No                    C (misses)       D (correct negatives)
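
The relative value discussed in the text around Table 2 can be illustrated numerically from the same contingency counts. The following is only a minimal Python sketch. It assumes the standard cost–loss formulation of Richardson (2000), in which the reference expense is the cheaper of always protecting or never protecting (a climatological baseline that the text here does not spell out). The function names, the argument `cost_loss_ratio`, and any counts passed in are illustrative and are not taken from the paper.

```python
def relative_value(hits, false_alarms, misses, correct_negatives, cost_loss_ratio):
    """Relative economic value for one probability threshold.

    Counts follow Table 2 (A, B, C, D). Expenses are per forecast and in
    units of the loss L, so the protection cost equals the cost-loss ratio.
    """
    n = hits + false_alarms + misses + correct_negatives
    base_rate = (hits + misses) / n
    if not 0.0 < cost_loss_ratio < 1.0 or not 0.0 < base_rate < 1.0:
        raise ValueError("cost-loss ratio and base rate must lie strictly between 0 and 1")
    # Forecast user: pays the cost on every "yes" forecast, suffers the loss on every miss.
    expense_forecast = (hits + false_alarms) / n * cost_loss_ratio + misses / n
    # Baseline (assumption): always protect or never protect, whichever is cheaper.
    expense_climate = min(cost_loss_ratio, base_rate)
    # Perfect forecast: protect only when the event actually occurs.
    expense_perfect = base_rate * cost_loss_ratio
    return (expense_climate - expense_forecast) / (expense_climate - expense_perfect)


def value_envelope(tables, cost_loss_ratio):
    """Best relative value over all probability thresholds (one table per threshold)."""
    return max(relative_value(*table, cost_loss_ratio) for table in tables)
```

Under these assumptions a table with no misses and no false alarms gives a relative value of 1, and a strategy that never protects gives 0 whenever the base rate is below the cost–loss ratio. Evaluating `value_envelope` over a range of cost–loss ratios traces the kind of envelope curve plotted in Fig. 6.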

compared to Figs. 6a,b) show smaller relative values for the first lead-time range and cease to have value for users beyond 48 and 24 h, respectively. This verification was developed by comparing the exact time and location of the observations in relation to the forecasts, so predictions only slightly displaced in space or time count as incorrect (which can often happen for short-lived events and for FZRA cases, as an example in section 5 will show). Also, FZRA events are rare, implying that the user should be ready to take action on the basis of forecasts of low probability of occurrence. Although this study has not compared the relative value for the ECMWF ENS with the relative value of using just a single deterministic forecast (e.g., from HRES here), Richardson (2000) demonstrated that the added value of large ensembles, like the ECMWF ENS, is particularly important for users with low cost–loss ratios and for rarer events, because of the ensemble's ability to sample the tails of probability space.

4. Verification of favored precipitation type (map product)

The PREFptype was verified as a dichotomous (yes–no) forecast, and it was applied only for colored areas (well-defined precipitation type). Performance diagrams (Roebber 2009) can relate four verification indices in the same plot: H, success ratio (SR), frequency bias (FB), and critical success index (CSI; also known as threat score). This diagram is similar to a Taylor diagram (Taylor 2001) but is useful for dichotomous (yes–no) forecasts. Based on a 2 × 2 contingency table (Table 2), these scores are defined as

$$H = \frac{A}{A + C}, \tag{1}$$

$$\mathrm{FB} = \frac{A + B}{A + C}, \tag{2}$$

$$\mathrm{CSI} = \frac{A}{A + B + C}, \quad \text{and} \tag{3}$$

$$\mathrm{SR} = \frac{A}{A + B} = 1 - \mathrm{FAR}, \tag{4}$$

where the false alarm ratio (FAR) is

$$\mathrm{FAR} = \frac{B}{A + B}. \tag{5}$$

These indices are mathematically related, and the geometrical representation in a single diagram allows accuracy, bias, reliability, and skill to be simultaneously visualized. Figure 7 is a performance diagram showing results for all precipitation types at each lead time. Dashed lines represent bias scores with labels on the outward extension of the line, and labeled solid contours are for the CSI. Green dots correspond to RA, blue dots to SN, red to FZRA, turquoise to RASN, and orange indicates IP. The different dot sizes represent different lead times, so the smaller the point, the longer the lead time. In the original conceptualization of this diagram, a perfect forecast would lie in the top-right corner; however, this is a postprocessed product where we obtained the PREFptype, thereby eliminating the possibilities of other precipitation types, so the verification results are in effect portrayed for the product itself, and not specifically for the precipitation-type variable in the ENS.

FIG. 7. Performance diagram for the PREFptype for each type of precipitation and for multiple lead times. Labeled solid contours represent the CSI and dashed lines are FB with labels along the outward extension of the line. Differing sizes of the points indicate the six different lead times (the bigger the size, the shorter the lead time, from 0–24 to 144–168 h).

For RA and SN (Fig. 7), the earliest lead times are clustered toward the center of the diagram, close to the bias = 1 line, especially RA at 0–24-h lead time (the biggest green dot) with maximum H between 0.5 and 0.6 (and with a similar result for SR). For the same precipitation types, values of CSI between 0.3 and 0.4 are observed, decreasing to 0.1 as we move on to day-6


TABLE 3. SEDI values for FZRA, IP, and RASN at different lead times.

        0–24 h   24–48 h   48–72 h   72–96 h   96–120 h   120–144 h   144–168 h
FZRA    0.61     0.51      0.46      0.41      0.29       0           0
IP      0.08     0.14      0         0         0          0           0
RASN    0.38     0.33      0.30      0.27      0.22       0.22        0.19
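
The scores behind the performance diagram and Table 3 follow directly from the contingency counts of Table 2. The Python sketch below evaluates Eqs. (1)–(5) and the SEDI of Eq. (6). The false alarm rate F = B/(B + D) is the standard definition and is assumed here, since the text names F without writing it out. The function names and the sample counts are hypothetical and serve only as illustration.

```python
import math


def contingency_scores(a, b, c, d):
    """Scores from Table 2 counts: a hits, b false alarms, c misses, d correct negatives."""
    return {
        "H": a / (a + c),         # hit rate, Eq. (1)
        "F": b / (b + d),         # false alarm rate (standard definition, assumed)
        "FB": (a + b) / (a + c),  # frequency bias, Eq. (2)
        "CSI": a / (a + b + c),   # critical success index, Eq. (3)
        "SR": a / (a + b),        # success ratio, Eq. (4)
        "FAR": b / (a + b),       # false alarm ratio, Eq. (5)
    }


def sedi(h, f):
    """Symmetric extremal dependency index, Eq. (6), from hit rate h and false alarm rate f."""
    if not (0.0 < h < 1.0 and 0.0 < f < 1.0):
        raise ValueError("SEDI is undefined when H or F equals 0 or 1")
    numerator = math.log(f) - math.log(h) - math.log(1 - f) + math.log(1 - h)
    denominator = math.log(f) + math.log(h) + math.log(1 - f) + math.log(1 - h)
    return numerator / denominator


scores = contingency_scores(20, 30, 20, 930)  # hypothetical counts for a rare event
print(scores["H"], scores["SR"], sedi(scores["H"], scores["F"]))
```

A forecast with H = F has no discrimination and gives SEDI = 0, which matches the random-forecast diagonal of the ROC diagram.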

forecasts. As seen in the probability of precipitation-type ROC curves (Fig. 5), the skill levels for FZRA and RASN on the PREFptype product are low, but there is still some predictability. In this case, the H is lower (values not higher than 0.2) than with the probability product (more than 0.4 in Fig. 5c). Finally, the forecast skill of this product is minimal for RASN and completely negligible for IP.

Because FZRA, RASN, and IP are usually transient mixed precipitation phases, identification on the PREFptype will tend to correspond, on average, to lower probabilities than one sees on average for RA and SN on that product. For similar reasons, as we go to longer lead times, the frequency with which one sees these types on the map product decreases very rapidly (note that in Fig. 7 the nominal bias for this product decreases more rapidly with lead time for mixed-phase types than it does for RA and SN). Although this verification diagram does not incorporate the verification metrics that are most strongly affected by base rate dependency or the no-occurrence cases, such as percent correct (PC) and F, it still uses H, SR, and CSI, which are all potentially affected by it. When an event becomes rarer, these quantities or indices tend toward 0, because the entries in the contingency table tend to zero at different rates. One way to solve this issue can be through the computation of a symmetric extremal dependency index (SEDI), which has many beneficial properties that are not present in most other verification measures used with rare events (Ferro and Stephenson 2011). The SEDI is based on the F and H indices. The score is defined as

$$\mathrm{SEDI} = \frac{\log F - \log H - \log(1 - F) + \log(1 - H)}{\log F + \log H + \log(1 - F) + \log(1 - H)}. \tag{6}$$

Table 3 shows the SEDI index values for the three rare events studied in this paper (FZRA, IP, and RASN) at different lead times. This index gives a better idea of the skill for these three precipitation types in the PREFptype product than does the performance diagram, where the values of the indices were a bit too small to be usefully compared at different lead times. FZRA shows better skill at shorter lead times with a maximum value of 0.61, which progressively decreases with lead time, reaching zero at 120–144 and 144–168 h. RASN is more stable over lead times, always keeping low values that vary from 0.38 at 0–24 h to 0.19 at 144–168 h. Finally, ice pellet forecasts have no skill, confirming earlier results from the PROBptype product verification.

So we will summarize one or two of the key verification results for the PREFptype product. On days 1 and 2, if the PREFptype is showing SN or RA falling at a given time, there is a 50%–60% chance that observations will show the same, while for FZRA the maximum chance is only about 15% on day 1 and 5%–10% on day 2. This reiterates that the correct prediction of falling precipitation, even at day 1, is challenging, and for FZRA it is particularly difficult. This is true for the model and indeed for human forecasters also, because of the narrow range of vertical atmospheric profiles required for FZRA to be a possibility. Physically, one might expect many of the forecast failures for SN and RA to be related to showery situations. However, for FZRA this is unlikely to be a contributor, and the primary explanations will probably be the finely balanced nature of the related synoptic situation, and also the fact that the IFS model in its current form cannot represent freezing drizzle from supercooled water (see Forbes et al. 2014). By design, the PREFptype also tends to "underrepresent" instances of specific precipitation types at longer leads, because it needs to have a probability of (any) precipitation falling greater than 50%, and as the ENS becomes more dispersed with lead time, probabilities in general terms tend to migrate toward the (model) climatological probability, which for most parts of Europe is well below 50% (we saw an example of this type of behavior in Fig. 1, albeit for another continent). Nonetheless, we believe that the PREFptype is a useful and compact resource for forecasters, provided there is an understanding of its method of construction. It can be made even more valuable when used in conjunction with the meteogram products. In the near future ECMWF users will be able to click on the PREFptype map product via a web interface and immediately view the corresponding PROBptype meteogram product for anywhere in the world.

5. Freezing rain in Finland: A case study

From the night of 27 February to noon on 28 February 2017, the southeastern part of Finland suffered from a


heavy freezing drizzle and FZRA event that provoked numerous problems on the roads and impacted the electrical power grid. After a period of snowfall the day before, a low center situated in the North Sea transported warmer and moister air from the south, primarily in the 900–700-hPa layer, which created perfect conditions for FZRA occurrence (Fig. 8).

FIG. 8. Mean sea-level pressure (black lines) for valid time 0600 UTC 28 Feb 2017 and 12-h accumulated precipitation (shaded area) for the period from 0000 to 1200 UTC 28 Feb 2017. Both have base times at 0000 UTC 27 Feb 2017. The red box corresponds to the zoomed-in area in Fig. 9.

Figure 9a shows the PREFptype product generated by ENS model output from a base time at 0000 UTC 27 February 2017 and valid 24 h later. An extensive area of FZRA is observed on the PREFptype product, situated in the southeastern part of Finland, matching fairly well with the observations from the nearest SYNOP stations. Examining the forecast from the same base time but for a 33-h lead (Fig. 9b), the signal of FZRA continues, and the probabilities in the center of the affected area are between 50% and 70%. The rest of the observations also match quite well with the PREFptype. For forecasts from the 0000 UTC 26 February 2017 base time (more than 48 h before the FZRA event started) with a 54-h lead time (Fig. 9c), a smaller but still clear signal of FZRA is observed on the map, with probabilities below 50% but with some specific points indicating probabilities between 50% and 70% in the far west of Russia. Finally, in 78-h forecasts from 0000 UTC 25 February 2017 (Fig. 9d), the FZRA signal is not so clear, as it is concentrated in Russia, but there are one or two points of IP, which needs a very similar atmospheric structure to that required for FZRA formation.

Figure 10 shows the nearest radiosonde ascent in the area (pink star in Fig. 9a), situated in Jokioinen Ilmala (60.81°N, 23.50°E). Even though this station was not exactly in the position where FZRA was observed, it was the nearest available and can reflect the general atmospheric conditions in the zone. Evidently, the lower troposphere is characterized by a very shallow layer of cold and wet air around 0°C near the ground with somewhat warmer moist air above it up to 850 hPa. Above that a saturated layer below 0°C is observed up to 550 hPa. This profile, with the presence of both an elevated warmer layer and a layer of subfreezing air adjacent to the surface, in principle creates the perfect conditions for FZRA formation. However, if the warm layer was not sufficiently extensive to melt all the snow, ice pellets could result instead.

According to reports, one of the most affected places in the region was the town of Mikkeli (black star in Fig. 9a). Meteograms of the PROBptype product for this location are shown in Fig. 11; these run out to 7-day leads and start from four consecutive 0000 UTC base times. Figure 11a is for 24 h before the FZRA event started. High probabilities of FZRA and IP together are denoted during the first half of 28 February, up to more than 70%, of which 50% or so is for FZRA. The domination of FZRA over other precipitation types was seen as well in the PREFptype map (Figs. 9a,b). The transition from SN (the day before) to FZRA, and then RA (during the second half of 28 February) is not unusual, being typical of warm fronts. The meteogram also indicates that for many of the ENS members that predicted FZRA the associated precipitation rate was between 0.2 and 1 mm h⁻¹. The next two meteograms


FIG. 9. A significant FZRA event occurred across southeast Finland on 28 Feb 2017. Map products show the PREFptype, valid 28 Feb, at different base times. The observed precipitation types from SYNOP reports at the same times are plotted as symbols (dry is not shown). (a) The 24-h lead-time forecast for valid time 0000 UTC 28 Feb (pink star is the sounding site at Jokioinen and black star is Mikkeli; see text). (b) The 33-h forecast for valid time 0900 UTC 28 Feb. (c) The 54-h forecast for valid time 0600 UTC 28 Feb. (d) The 78-h forecast for valid time 0600 UTC 28 Feb.
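
The logic behind a map product such as the one in Fig. 9 can be illustrated with a small Python sketch for a single grid point. It combines the Rmin filter, the requirement that the probability of any precipitation exceeds 50%, and the three probability ranges quoted in the conclusions. This is an illustration under stated assumptions and is not the operational ECMWF code: the function name, the data layout (one `(type, rate)` pair per ensemble member), the use of `>=` at the thresholds, and the handling of ties are all assumptions. WSN is omitted because its Rmin is not given in this part of the paper.

```python
from collections import Counter

# Minimum precipitation rates (mm/h) per type, as quoted in the paper's conclusions.
R_MIN = {"RA": 0.12, "RASN": 0.10, "SN": 0.05, "FZRA": 0.05, "IP": 0.05}


def preferred_type(members):
    """members: list of (ptype, rate_mm_per_h), one entry per ensemble member.

    Returns (ptype, probability, label), or None when the probability of
    any precipitation is not above 50%.
    """
    # A member counts as precipitating only if its rate reaches Rmin for its type
    # (treating a rate equal to Rmin as precipitating is an assumption).
    wet = [ptype for ptype, rate in members if rate >= R_MIN[ptype]]
    if len(wet) / len(members) <= 0.5:
        return None
    ptype, count = Counter(wet).most_common(1)[0]
    probability = count / len(members)
    if probability > 0.7:
        label = "high"
    elif probability >= 0.5:
        label = "moderate"
    else:
        label = "low"
    return ptype, probability, label
```

For example, if 30 of 51 members give FZRA above its Rmin, 10 give SN, and the remaining members are dry, the sketch returns FZRA with a probability of about 59%, which falls in the moderate range.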


with base times of 0000 UTC 26 February (Fig. 11b) and 25 February 2017 (Fig. 11c) show lower probabilities of FZRA (not higher than 25%), but on the other hand there is a relatively consistent signal for the period when FZRA was observed. Finally, the meteogram initialized 4 days before the episode (Fig. 11d) does not show FZRA, but we have two ensemble members that show IP as the precipitation type at some moments during 28 February, indicative of the possibility that there could be an elevated warm layer in the model output. While two ensemble members forecasting FZRA (only 4% probability) would not be enough for decision-makers to act, it is nonetheless a small signal that alerts the user to at least pay more attention to the forecasts of the following days. Moreover, for this longer-lead forecast the time step between bars is only 3 h (compared to 1 h for the later data times), so the user would also need to be alert to the possibility of a short-lived weather event not being fully captured.

FIG. 10. The 0000 UTC 28 Feb 2017 sounding from Jokioinen (60.81°N, 23.50°E; location shown in Fig. 9a with a pink star).

This case study has shown how these new products based on the precipitation-type variable from ENS have been able to forewarn of a severe FZRA episode 3 days in advance, but with a clearer signal in the meteogram product that can help in decision-making for local or regional warnings. As stated above, we would recommend that users start with the PREFptype map product, then for possible events investigate in more detail what the actual probabilities and rates are using the meteogram product. Unsurprisingly, the probabilities of FZRA decrease with lead time in the meteograms, but would have provided early warning that an FZRA event could happen. Users could conceivably act upon this information, taking protective action, especially if there was supporting evidence from other sources. However, recall also that Fig. 6c suggested that overall there is no economic value in FZRA forecasts for lead times beyond day 2 (when using the ENS in isolation).

6. Conclusions

This paper describes two new probabilistic products based on the instantaneous precipitation-type variable in ECMWF ENS forecasts. Together they provide a new forecast tool for decision-makers related to high-impact weather and exploit probabilistic forecasts for this purpose, which is almost certainly better than just taking a deterministic viewpoint. The PREFptype map product shows which of the six precipitation types (RA, SN, RASN, FZRA, IP, and WSN) is most probable whenever the probability of some precipitation is >50%. This product is classified in three different ranges of probabilities: up to 50% (low probability), from 50% to 70% (moderate probability), and higher than 70% (high probability). As a complementary product, the PROBptype meteogram product represents the temporal evolution of precipitation-type probabilities for a


specific location, also incorporating important additional information regarding the precipitation rate. The instantaneous total precipitation rate is shown in three different categories: from a minimum Rmin to 0.2 mm h⁻¹ (low intensity), from 0.2 to 1 mm h⁻¹ (medium intensity), and greater than 1 mm h⁻¹ (high intensity), providing an indication of the potential severity of SN or FZRA events. So if we consider the two products together, from a user perspective, the map product can first deliver a useful initial ENS-based overview of a given weather situation, while the meteogram product would then allow the user to drill down, for a given site, to see all the probabilities, their temporal evolution, and vital additional information concerning precipitation rates. In this way the user will be able to make better-informed decisions regarding severe winter weather, like FZRA. This is true even if the actions were only to amount to putting standby measures in place well in advance of a possible (low probability) event.

FIG. 11. The PROBptype plots at Mikkeli (61.69°N, 27.27°E; location shown in Fig. 9a with a black star) for base time at 0000 UTC (a) 27, (b) 26, (c) 25, and (d) 24 Feb 2017 and valid up to 168-h lead time.

In creating the new products, a new methodology for reducing the systematic model bias and defining the minimum precipitation rate for each precipitation type was established. For this we used the instantaneous precipitation rate variable, applying different precipitation rate thresholds for each precipitation type (to classify dry from precipitating) to try to enforce a zero frequency bias, relative to manual SYNOP present weather observations. This led to the thresholds (Rmin) being 0.12 mm h⁻¹ for RA, 0.1 mm h⁻¹ for RASN, and 0.05 mm h⁻¹ for SN, FZRA, and IP, which are now being used as semipermanent filters for the final products, to help reduce misses and false alarms. When model physics (or other) changes impact the precipitation rate and precipitation type, we expect to have to recalibrate the results. Reliability diagrams showed that the main positive impact of applying this technique was in practice to reduce the overestimation of RA and to reduce the underestimation of SN. For the rest of the precipitation types the benefits are not so clear-cut, but there is no evidence of any degradation.

A complete 4-month verification of winter weather precipitation-type probabilities was developed for both products, at seven different 24-h lead time ranges from


0–24 to 144–168 h. Three-hourly manual SYNOP observations were used for the verification. Also, SYNOP observations were previously tested to reduce the uncertainty arising from incorrect classifications of the reported present weather. ROC curves and AUC values show that, for RA and SN, the levels of forecast skill are quite similar, although the RA forecasts have somewhat greater skill. For FZRA and RASN, the reliability is relatively lower and, similarly, the ROC curve skill is also quite low (though not negligible, at least for shorter lead times), probably due in part to limited occurrences during the study period. A common feature for all the precipitation types is, unsurprisingly, that the uncertainty increases as lead time increases. Finally, the ability to predict IP is almost negligible, as we expected from physical considerations and by referring to other studies.

A cost–loss value analysis also indicates that the RA and SN probabilities can be useful for decision-making for a broad number of users, while FZRA and RASN show reduced relative values for the first lead time ranges and nonexistent relative value for longer leads. However, in practice, as was shown with the freezing-rain case study, the true value beyond day 2 may in fact be underestimated with this metric if one takes into account the effects of small displacements in space or time, which are more prevalent at longer leads. The IP relative value was 0 for all the lead times, so it did not present any utility for users. One question that arises is whether IP should remain as a separate category in the verification or be merged with SN events. Alternatively, IP could be merged with FZRA because, although IP itself does not present any important risk, the atmospheric conditions conducive to IP are very similar to those that are conducive to FZRA. As of now, there is no clear-cut answer here. Similar considerations could be applied for RASN, merging this category with SN.

Verification of the PREFptype map product was developed using a performance diagram, which is a very useful tool for simultaneously representing different parameters and verification indices. This verification brought similar results, but they were classified in dichotomous terms (occurrence or nonoccurrence). As was seen in the ROC curves, the skill for FZRA and RASN as represented on the PREFptype is not good, but there is some predictability. RA and SN have the best forecast skill, but that decreases considerably with lead time, while IP forecasts had no skill.

Finally, the FZRA case study described in section 5 showed how the new products from ENS, when used together, could forewarn of a severe FZRA episode 3 days in advance, but with a clearer signal in the meteogram product, which can help in decision-making for local or regional warnings. While the probabilities of FZRA decreased for longer lead times, as we expected, they still provide some useful information for the users.

A limitation of our approach is that we are working with instantaneous parameters of the model, so at longer lead times one may miss events because they may fall in between the model time steps displayed. This is an area to consider for future work, but the output we do provide does not claim to portray any more than what is happening at given snapshots in time, and a typical user should readily appreciate any innate limitations of this. Additional future work will include an assessment of the new precipitation-type products in other regions that suffer from weather hazards like heavy snowfall or freezing rain (particularly North America), since the output is global. Within the framework of the ANYWHERE project, the relative economic value for different regions in Europe will be evaluated to provide a tailored guide to the value of these products for users.

Acknowledgments. This work has been supported by the European Horizon 2020 research project ANYWHERE (EC-HORIZON2020-PR700099-ANYWHERE). The authors also wish to thank the Hungarian Meteorological Service (OMSZ), particularly Istvan Ihasz, who provided the original idea for the meteogram product, and Zied Ben Bouallegue from ECMWF for his valuable advice regarding the relative economic value subsection.

REFERENCES

Allen, M., and F. Eckel, 2012: Value from ambiguity in ensemble forecasts. Wea. Forecasting, 27, 70–84, https://doi.org/10.1175/WAF-D-11-00016.1.

Bernstein, B. C., 2000: Regional and local influences on freezing drizzle, freezing rain, and ice pellet events. Wea. Forecasting, 15, 485–508, https://doi.org/10.1175/1520-0434(2000)015<0485:RALIOF>2.0.CO;2.

Branick, M. L., 1997: A climatology of significant winter-type weather events in the contiguous United States, 1982–94. Wea. Forecasting, 12, 193–207, https://doi.org/10.1175/1520-0434(1997)012<0193:ACOSWT>2.0.CO;2.

Brooks, H. E., J. V. Cortinas, P. R. Janish, and D. Stensrud, 1996: Application of short-range numerical ensembles to the forecasting of hazardous winter weather. Preprints, 11th Conf. on Numerical Weather Prediction, Norfolk, VA, Amer. Meteor. Soc., J70–J71.

Buizza, R., and T. N. Palmer, 1998: Impact of ensemble size on ensemble prediction. Mon. Wea. Rev., 126, 2503–2518, https://doi.org/10.1175/1520-0493(1998)126<2503:IOESOE>2.0.CO;2.

Call, D. A., 2010: Changes in ice storm impacts over time: 1886–2000. Wea. Climate Soc., 2, 23–35, https://doi.org/10.1175/2009WCAS1013.1.


Carriére, J.-M., C. Lainard, C. Le Bot, and F. Robart, 2000: A climatological study of surface freezing precipitation in Europe. Meteor. Appl., 7, 229–238, https://doi.org/10.1017/S1350482700001560.
Chang, S. E., T. L. McDaniels, J. Mikawoz, and K. Peterson, 2007: Infrastructure failure interdependencies in extreme events: Power outage consequences in the 1998 ice storm. Nat. Hazards, 41, 337–358, https://doi.org/10.1007/s11069-006-9039-4.
Cortinas, J. V., Jr., K. F. Brill, and M. E. Baldwin, 2002: Probabilistic forecasts of precipitation type. Preprints, 16th Conf. on Probability and Statistics in the Atmospheric Sciences, Orlando, FL, Amer. Meteor. Soc., 3.13, https://ams.confex.com/ams/pdfpapers/30176.pdf.
DeGaetano, A. T., 2000: Climatic perspective and impacts of the 1998 northern New York and New England ice storm. Bull. Amer. Meteor. Soc., 81, 237–254, https://doi.org/10.1175/1520-0477(2000)081<0237:CPAIOT>2.3.CO;2.
Elmore, K., H. Grams, D. Apps, and H. Reeves, 2015: Verifying forecast precipitation type with mPING. Wea. Forecasting, 30, 656–667, https://doi.org/10.1175/WAF-D-14-00068.1.
Ferro, C., and D. Stephenson, 2011: Extremal dependence indices: Improved verification measures for deterministic forecasts of rare binary events. Wea. Forecasting, 26, 699–713, https://doi.org/10.1175/WAF-D-10-05030.1.
Forbes, R., I. Tsonevsky, T. Hewson, and M. Leutbecher, 2014: Towards predicting high-impact freezing rain events. ECMWF Newsletter, No. 141, ECMWF, Reading, United Kingdom, 15–21, https://www.ecmwf.int/en/elibrary/17334-towards-predicting-high-impact-freezing-rain-events.
Grout, T., Y. Hong, J. Basara, B. Balasundaram, Z. Kong, and S. T. S. Bukkapatnam, 2012: Significant winter weather events and associated socioeconomic impacts (federal aid expenditures) across Oklahoma: 2000–10. Wea. Climate Soc., 4, 48–58, https://doi.org/10.1175/WCAS-D-11-00057.1.
Hanley, J. A., and B. J. McNeil, 1982: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143, 29–36, https://doi.org/10.1148/radiology.143.1.7063747.
Huntemann, T., M. Schenk, and P. Fajman, 2014: Verification of precipitating weather type forecasts in the National Digital Forecast Database and National Digital Guidance Database. 39th National Weather Association Annual Meeting, Salt Lake City, UT, NWA, P1.64, http://www.nws.noaa.gov/mdl/synop/papers/NWA2014_P1.64_Huntemann_etal.pdf.
Ikeda, K., M. Steiner, J. Pinto, and C. Alexander, 2013: Evaluation of cold-season precipitation forecasts generated by the hourly updating High-Resolution Rapid Refresh model. Wea. Forecasting, 28, 921–939, https://doi.org/10.1175/WAF-D-12-00085.1.
——, ——, and G. Thompson, 2017: Examination of mixed-phase precipitation forecasts from the High-Resolution Rapid Refresh model using surface observations and sounding data. Wea. Forecasting, 32, 949–967, https://doi.org/10.1175/WAF-D-16-0171.1.
Ludlam, F. H., 1980: Clouds and Storms: The Behavior and Effect of Water in the Atmosphere. The Pennsylvania State University Press, 488 pp.
Manikin, G. S., K. F. Brill, and B. Ferrier, 2004: An Eta model precipitation type mini-ensemble for winter weather forecasting. 20th Conf. on Weather Analysis and Forecasting/16th Conf. on Numerical Weather Prediction, Seattle, WA, Amer. Meteor. Soc., 23.1, https://ams.confex.com/ams/84Annual/techprogram/paper_73517.htm.
Mason, I., 1982: A model for assessment of weather forecasts. Aust. Meteor. Mag., 30, 291–303.
Mason, S. J., and N. E. Graham, 1999: Conditional probabilities, relative operating characteristics, and relative operating levels. Wea. Forecasting, 14, 713–725, https://doi.org/10.1175/1520-0434(1999)014<0713:CPROCA>2.0.CO;2.
Minder, J. R., and D. R. Durran, 2011: Mesoscale controls on the mountainside snow line. J. Atmos. Sci., 68, 2107–2127, https://doi.org/10.1175/JAS-D-10-05006.1.
Murphy, A. H., and R. L. Winkler, 1977: Can weather forecasters formulate reliable probability forecasts of precipitation and temperature. Natl. Wea. Dig., 2, 2–9.
Ralph, F. M., and Coauthors, 2005: Improving short-term (0–48 h) cool-season quantitative precipitation forecasting: Recommendations from a USWRP workshop. Bull. Amer. Meteor. Soc., 86, 1619–1632, https://doi.org/10.1175/BAMS-86-11-1619.
Reeves, H. D., 2016: The uncertainty of precipitation-type observations and its effect on the validation of forecast precipitation type. Wea. Forecasting, 31, 1961–1971, https://doi.org/10.1175/WAF-D-16-0068.1.
——, K. L. Elmore, A. Ryzhkov, T. Schuur, and J. Krause, 2014: Sources of uncertainty in precipitation-type forecasting. Wea. Forecasting, 29, 936–953, https://doi.org/10.1175/WAF-D-14-00007.1.
——, A. V. Ryzhkov, and J. Krause, 2016: Discrimination between winter precipitation types based on spectral-bin microphysical modeling. J. Appl. Meteor., 55, 1747–1761, https://doi.org/10.1175/JAMC-D-16-0044.1.
Richardson, D. S., 2000: Skill and relative economic value of the ECMWF Ensemble Prediction System. Quart. J. Roy. Meteor. Soc., 126, 649–667, https://doi.org/10.1002/qj.49712656313.
Robbins, C. C., and J. V. Cortinas Jr., 2002: Local and synoptic environments associated with freezing rain in the contiguous United States. Wea. Forecasting, 17, 47–65, https://doi.org/10.1175/1520-0434(2002)017<0047:LASEAW>2.0.CO;2.
Roebber, P., 2009: Visualizing multiple measures of forecast quality. Wea. Forecasting, 24, 601–608, https://doi.org/10.1175/2008WAF2222159.1.
Sankaré, H., and J. M. Thériault, 2016: On the relationship between the snowflake type aloft and the surface precipitation types at temperatures near 0°C. Atmos. Res., 180, 287–296, https://doi.org/10.1016/j.atmosres.2016.06.003.
Scheuerer, M., G. Scott, M. H. Thomas, and E. S. Phillip, 2017: Probabilistic precipitation-type forecasting based on GEFS ensemble forecasts of vertical temperature profiles. Mon. Wea. Rev., 145, 1401–1412, https://doi.org/10.1175/MWR-D-16-0321.1.
Schuur, T. J., H.-S. Park, A. V. Ryzhkov, and H. D. Reeves, 2012: Classification of precipitation types during transitional winter weather using the RUC model and polarimetric radar retrievals. J. Appl. Meteor. Climatol., 51, 763–779, https://doi.org/10.1175/JAMC-D-11-091.1.
Shafer, P. E., and D. E. Rudack, 2014: Experimental MOS precipitation type guidance from the ECMWF. 22nd Conf. on Probability and Statistics in the Atmospheric Sciences, Atlanta, GA, Amer. Meteor. Soc., 6.4, https://ams.confex.com/ams/94Annual/webprogram/Paper234514.html.
Stanski, H. R., L. J. Wilson, and W. R. Burrows, 1989: Survey of common verification methods in meteorology. WMO World Weather Watch Tech. Rep. WMO TD 358, 114 pp.

Stewart, R. E., 1985: Precipitation types in winter storms. Pure Appl. Geophys., 123, 597–609, https://doi.org/10.1007/BF00877456.
Taylor, K. E., 2001: Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res., 106, 7183–7192, https://doi.org/10.1029/2000JD900719.
Thériault, J. M., R. E. Stewart, and W. Henson, 2010: On the dependence of winter precipitation types on temperature, precipitation rate, and associated features. J. Appl. Meteor. Climatol., 49, 1429–1442, https://doi.org/10.1175/2010JAMC2321.1.
Wandishin, M., M. Baldwin, S. Mullen, and J. Cortinas Jr., 2005: Short-range ensemble forecasts of precipitation type. Wea. Forecasting, 20, 609–626, https://doi.org/10.1175/WAF871.1.
Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences: An Introduction. Academic Press, 467 pp.
——, 2001: A skill score based on economic value for probability forecasts. Meteor. Appl., 8, 209–219, https://doi.org/10.1017/S1350482701002092.
Zerr, R. J., 1997: Freezing rain: An observational and theoretical study. J. Appl. Meteor., 36, 1647–1661, https://doi.org/10.1175/1520-0450(1997)036<1647:FRAOAT>2.0.CO;2.
