A Local Large Hail Probability Equation for Columbia, Sc
Total Page:16
File Type:pdf, Size:1020Kb
EASTERN REGION TECHNICAL ATTACHMENT NO. 98-6 AUGUST, 1998 A LOCAL LARGE HAIL PROBABILITY EQUATION FOR COLUMBIA, SC Mark DeLisi NOAA/National Weather Service Forecast Office West Columbia, SC Editor’s Note: The author’s current affiliation is NWSFO Mt. Holly, NJ. 1. INTRODUCTION Skew-T/Hodograph Analysis and Research Program, or SHARP (Hart and Korotky Billet et al. (1997) derived a successful 1991). The dependent variable, hail diameter probability of large hail (diameter greater than or the absence of hail, was derived from or equal to 0.75 inch) equation as an aid in spotter reports for the CAE warning area from National Weather Service (NWS) severe May, 1995 through September, 1996. There thunderstorm warning operations. Hail of that were 136 cases used to develop the regression diameter or larger requires a severe equation. thunderstorm warning be issued by NWS offices. Shortly thereafter, a Weather The LPLH was verified using 69 spotter Surveillance Radar-1988 Doppler (WSR- reports from the period September, 1996 88D) radar probability of severe hail (POSH) through September, 1997. The POSH was became available for warning operations (Witt verified as well. Verification statistics et al. 1998). This study recreates the steps included the Brier score and the chi-square taken in Billet et al. (1997) to derive a local (32) statistic. probability of large hail equation (LPLH) for the Columbia, SC National Weather Service The goal of this study was to develop an Forecast Office (CAE) warning area, and it objective method to estimate the probability assesses the utility of the LPLH relative to the of large hail for use in forecast operations. POSH. Since the WSR-88D algorithms also produce a POSH, the locally produced method was A logistic regression approach was used to verified in comparison to determine if the develop a LPLH for the CAE warning area. local regression equation was more accurate. Independent variables included in the LPLH Since Billet et al. (1997) used data from a were vertically integrated liquid (VIL) and the WSR-88D and an upper air sounding that ratio of VIL to echo top height (ET), both were co-located while this study did not, computed by the WSR-88D; and 500-hPa similarities and differences in the results from temperature at Peachtree City, Georgia (FFC) Billet et al. (1997) are discussed. and Charleston, South Carolina (CHS) extracted from upper air soundings using the 1 2. DATA index, total totals index, bulk Richardson number, equilibrium level, echo top height The dependent variable in the study was the (ET), speed shear in the layer between 200 h- categorical (binary) representation of the Pa and 500 h-Pa, and vertically integrated occurrence of large hail. Dependent variable liquid (VIL) density. VIL density, which is data were obtained from reports from local the ratio of VIL to ET, has been shown to be fire, police, and postal workers, trained useful in discriminating between severe and spotters, NWS employees, and others. non-severe storms (Paxton and Shepherd Independent variable data were obtained from 1993). the WSR-88D and upper air soundings from Peachtree City, Georgia (FFC) and WSR-88D independent variable values came Charleston, South Carolina (CHS) archived from storm interrogation within 20 minutes of locally and extracted via SHARP. the time of hail, or in the cases when hail was reported not to have occurred, from storm Potential independent variables include the interrogation during the same period at the following: time of maximum VIL. This information came from local Archive III and Archive IV 1) vertically integrated liquid optical disks. Most of the time, upper air 2) vertically integrated liquid density variable values came from the most recent 3) lifted index FFC soundings. The soundings were 4) convective available potential energy modified at the surface to match CAE 5) sweat index temperature and dew point at the time of hail 6) total totals index occurrence. 7) bulk Richardson number 8) 700-500-hPa lapse rate The FFC sounding is launched approximately 9) 850-hPa temperature 210 miles from CAE, but it was determined to 10) 700-hPa temperature be the most representative sounding of the 11) 500-hPa temperature CAE warning area in most cases. All other 12) 300-hPa temperature soundings except the WSO Greensboro 13) precipitable water (GSO) and CHS soundings are launched 14) 700-hPa dewpoint depression farther away from CAE than the FFC 15) freezing level sounding. The GSO sounding is launched 16) wet-bulb zero level too far north to be as representative as the 17) equilibrium level FFC sounding. The fairly frequent marine 18) echo top height influence on the CHS sounding usually 19) storm-relative helicity rendered it less representative than the FFC 20) storm-relative directional shear sounding. However, in four of the 136 cases, 21) positive shear there was an air-mass boundary between CAE 22) mean storm relative inflow (sfc to 2 km) and FFC that made the most recent FFC 23) 500-200-hPa speed shear sounding less representative of the atmosphere over CAE than the CHS The list of potential independent variables sounding. In those cases, the CHS sounding was similar to Billet et al. (1997). Variables was used. The CHS sounding is launched not utilized by Billet et al. (1997) were: sweat approximately 95 miles from CAE. 2 In order to maximize the number of cases for that is common to locations with sparse the logistic regression technique, virtually all populations was that the occurrence of hail hail reports and all verifiable reports of no may be unreported and/or undetected. The hail were counted as cases. There were only CAE warning area has many parts that are two limiting criteria. The first was that only sparsely populated. This study went to one report per thunderstorm cell was retained. considerable lengths to minimize these For a given cell, when there were more than sources of error. one report of hail, one hail report was chosen at random. 3. EQUATION When there were one report of hail and one or more reports of no hail, the hail report was A commercially available statistical software chosen. When there were no reports of hail package was used to derive a probability of and more than one report of no hail, one no- large hail equation using logistic regression hail report was chosen at random. The second analysis. The logistic regression approach limiting criterion was that all cases were results in an equation that expresses the between 14 and 105 miles from the radar site. probability of some categorical event (in this The minimum distance was necessary to avoid case, the occurrence of large hail) as the sum problems with the radar’s “cone of silence.” of weighted values of quantifiable variables. The maximum distance was the shortest A complete explanation of logistic regression distance that encompassed the entire CAE can be found in Freeman (1987). warning area. There were 136 cases used to derive the equation. Of those 136 cases, 44 The statistic used to select independent were of no hail, 37 were of small hail variables in this approach is the likelihood (diameter less than 0.75 inch), and 55 were of ratio chi-square statistic (G2). G2 relates large hail. directly to the discrepancies between (a) the predicted number of categorical event Two potential sources of error in the occurrences and non-occurrences from a independent data (sounding based) were that model and (b) the actual number of event the radar site and the sounding sites were not occurrences and non-occurrences. A model collocated, and that soundings were not with a high G2 value has little or no predictive coincident in time with storms. Another value; so, the more model G2 is reduced by possible source of error in the independent inclusion of a given independent variable, the data (radar based) was that in most cases only better the predictive value of that variable. A a data interval was available for VIL. A VIL p-value is associated with the reduction in G2 of 40 up to but not including 45 kg m-2 was by a given variable, and in this case the p- designated as 40 kg m-2, a VIL of 45 up to but value is the chance of obtaining such a not including 50 kg m-2 was designated as 45 reduction randomly. Therefore, low p-values kg m-2, and so on. are associated with variables that have high predictive value. A source of error in the dependent data was the estimation of hail sizes by spotters. Very The first step of this regression approach is to few spotters actually measure hail diameter. run each potential independent variable in a Another source of error in the dependent data one-independent variable equation, and the 3 variable that results in the greatest reduction correlated, the results themselves are not in G2 is kept (provided the associated p-value contaminated. Among the three independent is less than or equal to .05). The next step is variables retained in this study, only VIL and to run each remaining variable together with VIL density were significantly correlated. the one kept from the previous step in a two- Pearson’s product-moment correlation independent variable equation. The one that coefficient between the two values was .7578. results in the greatest reduction in G2 is also The p-value of the correlation was zero with kept. This process is continued until the 134 degrees of freedom. In this case, the p- reduction in G2 resulting from keeping the value meant that the probability of obtaining “best” remaining independent variable has an such a high correlation coefficient between associated p-value greater than .05.