<<

AUGUST 2008 FRIEDERICHSANDHENSE 659

A Probabilistic Forecast Approach for Daily Precipitation Totals

PETRA FRIEDERICHS AND ANDREAS HENSE Meteorological Institute, Bonn, Germany

(Manuscript received 12 June 2007, in final form 27 November 2007)

ABSTRACT

Commonly, postprocessing techniques are employed to calibrate a model forecast. Here, a probabilistic postprocessor is presented that provides calibrated probability and quantile forecasts of precipitation on the local scale. The forecasts are based on large-scale circulation patterns of the 12-h forecast from the NCEP high-resolution (GFS). The censored quantile regression is used to estimate se- lected quantiles of the precipitation amount and the probability of the occurrence of precipitation. The approach accounts for the mixed discrete-continuous character of daily precipitation totals. The forecasts are verified using a new verification score for quantile forecasts, namely the censored quantile verification (CQV) score. The forecast approach is as follows: first, a canonical correlation is employed to correct systematic deviations in the GFS large-scale patterns compared with the NCEP–NCAR reanalysis or the 40-yr ECMWF Re-Analysis (ERA-40). Second, the statistical quantile model between the large-scale circulation and the local precipitation quantile is derived using NCEP and ERA-40 reanalysis data. Then, the statistical quantile model is applied to 12-h forecasts provided by the GFS forecast system. The probabilistic forecasts are reliable and the relative gain in performance of the quantile as well as the probability forecasts compared to the climatological forecasts range between 20% and 50%. The importance of the various parts of the postprocessing is assessed, and the performance is compared to forecasts based on the direct pre- cipitation output from the ECMWF forecast system.

1. Introduction The perfect prog method (PP; Klein 1971) and model output statistics (MOS; Glahn and Lowry 1972) have Quantitative forecasts of precipitation including its been successfully applied in numerical predic- extremes are of high socioeconomic interest. Although tion to recalibrate the direct model output to local con- present-day global weather forecast models provide re- ditions. In contrast to MOS, PP ignores the model fore- liable forecasts of the atmospheric large-scale circula- cast error, but has the advantage of available large tion, they cannot provide realistic descriptions of local datasets. Kalnay (2003) and Marzban et al. (2005) dis- weather variability. This is because of the horizontally cuss the pros and cons of both approaches. They pro- restricted resolution (30–100 km at best for global mod- pose a combination of PP and MOS, which uses re- els), but is also because of missing cloud dynamical and analysis for the development of the regression equa- microphysical processes. For this reason, a description tions, and denote this approach RAN as an acronym for and forecast of local weather phenomena, and particu- reanalysis. MOS is generally based on multiple linear larly extreme events, can only be achieved through a regression and therefore applies to the postprocessing combination of dynamical and statistical analysis meth- and forecasting of expectation values. Other ap- ods, where a stable and significant statistical model proaches use statistical correction based on the analog based on a priori physical reasoning establishes, a pos- method (Zorita and von Storch 1999; Hamill and terior, a calibrated model between the local conditions Whitaker 2006). Vislocky and Young (1989) used dif- and the large-scale circulation. ferent PP models based on an analog model and logistic regression as predictors in an MOS approach. Bremnes Corresponding author address: Petra Friederichs, Meteorologi- (2004) was the first to apply quantile regression for cal Institute, Auf dem Hügel 20, 53121 Bonn, Germany. precipitation forecasts in the context of numerical E-mail: [email protected] weather prediction.

DOI: 10.1175/2007WAF2007051.1

© 2008 American Meteorological Society Unauthenticated | Downloaded 10/02/21 05:08 AM UTC

WAF2007051 660 WEATHER AND FORECASTING VOLUME 23

The stochastic character of weather requires a proba- National Center for Atmospheric Research (NCEP– bilistic treatment. Furthermore, probabilistic forecasts NCAR) reanalysis dataset. , or more gen- provide a measure of uncertainty that might be impor- eral statistical postprocessing, seeks a statistical model tant for decision makers. Statistical postprocessing of between the local variable and the large-scale model deterministic model forecasts should thus also provide output. In FH it is shown how the statistical postpro- probabilistic forecast measures. Here, our focus is on cessing derives conditional quantiles of precipitation at daily precipitation totals. Of interest are, for example, one station given the large-scale circulation of the the probability of the occurrence of precipitation, and NCEP–NCAR reanalysis. the expected amount. Particular attention is also given In this paper we will extend and modify the postpro- to extreme precipitation events. cessing in several ways. The conditional quantile model A complete probabilistic description of a variable is is trained on either the NCEP–NCAR or the 40-yr Eu- obtained by an estimate of the conditional distribution ropean Centre for Medium-Range Weather Forecasts function—conditional on the large-scale model fore- (ECMWF) Re-Analysis (ERA-40) reanalysis data, and cast. However, there is no general agreement that pre- it is then applied to NCEP high-resolution Global Fore- cipitation can be adequately modeled by a single para- cast System (GFS) 12-h forecasts. A canonical correla- metric distribution. Vrac and Naveau (2007) proposed tion analysis (CCA) is used to correct for systematic a mixture of gamma and generalized Pareto distribu- model forecast errors relative to the reanalysis in the tions for modeling precipitation and combined this with large-scale circulation patterns. As this approach uses a nonhomogeneous stochastic weather-typing approach reanalysis for the training of the probabilistic model, we for statistical downscaling. Instead of a parametric es- denote this method the RAN approach, as in Marzban timate of the conditional distribution function, the con- et al. (2005), although they use multiple linear regres- ditional distribution function can be estimated at given sion instead of CCA. values or thresholds. One possibility to do so is to use Forecast skill is assessed using the censored quantile the logistic multiple regression (Hamill et al. 2004), verification (CQV) score (FH) and the Brier score. A which gives an estimate of the conditional probability skill score is defined with respect to a climatological that an a priori fixed threshold is exceeded. This is a forecast. The performance of various approaches is valuable approach for a priori defined thresholds, par- compared. Those approaches are the RAN approach; a ticularly for the zero precipitation threshold. Another PP approach, where no calibration of the large-scale approach would be to estimate the conditional quantile patterns is applied; a probabilistic MOS approach (P- function at a given probability ␶. The method estimates MOS), where the training of the statistical postprocess- the precipitation amount that is exceeded with a prob- ing relies on 5 yr of short-term GFS model forecasts; ability of 1 Ϫ ␶. In this case, there is no need to a priori and a downscaling (D) approach that is similar to the define thresholds that could be very site or user specific, PP approach but uses reanalysis instead of forecasts. but representative quantiles such as the 0.5 quantile The differences assess the roles of the various steps in (median), bounds for “normal” conditions such as the the RAN postprocessing. The results are also compared 0.25 and 0.75 quantiles, and extreme quantiles, for ex- to a somewhat simpler approach that uses calibrated ample, the 0.05 and 0.95 quantiles, can be used. precipitation forecasts from the ECMWF direct model Here, we demonstrate a probabilistic forecast ap- output. The comparison should provide some reference proach that derives such probabilistic measures from skill in order to rank the skill obtained with the RAN the output of a single deterministic model forecast. The approach. For comparability, the calibration of the probabilistic measures are the probability of precipita- ECMWF direct model output again uses censored tion above zero, and the quantile function at given quantile regression, but no information of the large- probabilities ␶. The method employs censored quantile scale circulation is included. regression (QR; Koenker 2005; Powell 1986) and logis- Section 2 briefly describes the censored quantile re- tic regression (Fahrmeir and Tutz 1994). Our statistical gression approach. For a detailed description the postprocessing estimates the probability of precipita- reader is referred to FH. Section 3 introduces the data tion and selected quantiles conditional on the forecasts used in this study, and the forecast approach is pre- of the large-scale circulation. sented in section 4. Section 5 presents the results using The forecast approach follows Friederichs and Hense the NCEP–NCAR and ERA-40 reanalyses, and com- (2007, hereafter FH), where we presented a downscal- pares the various alternative approaches. Section 6 con- ing approach for daily station rainfall data using the cludes the article. An appendix discusses in more detail National Centers for Environmental Prediction– the quantile regression and the QV score.

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 661

2. Censored quantile regression efficients of the model and are identical to ␤ in (1). The term ␥TX⑀ is an error term that accounts for a linear Our forecast approach employs censored quantile re- dependency of the square root error variance on the gression. The concept of quantile regression (QR) was covariate X (heteroscedasticity). Censoring is applied developed by Koenker and Bassett (1978) as a compre- using the maximum function by taking only values of hensive strategy in order to complete the regression zero and above, so that censoring is expressed as Y ϭ picture. While standard regression estimates condi- max(0, Y*). tional mean surfaces, QR gives conditional quantile es- If the ordering of the data is not changed, or more timates and, thus, a more complete measure of the con- generally, for every nondecreasing function h, the ditional distribution particularly in the case of nonnor- equality mality. The conditional quantile model coefficients are ͑␶| ͒ ϭ ͓ ͑␶| ͔͒ ͑ ͒ estimated such that they minimize a loss function de- Qh͑y*͒ X h Qy* X 2 rived from the absolute deviations [least absolute de- holds. This property of the quantile function is used in viation (LAD) or the L1 method; see the appendix]. censored quantile regression. The conditional quantile This is equivalent to the maximization of the likelihood ␶ ␶ function Qy( |X) at a probability is derived by function of the data, which can be formulated by inde- ϭ ˆ ͑␶| ͒ ϭ ˆ ͑␶| ͒ ϭ ͓ ͑␶| ͔͒ pendently distributed asymmetric Laplace densities yˆ␶ Qy X Qmax͑0, y*͒ X max Qy* X (Yu and Moyeed 2001). In this section, censored QR is T ϭ max͑0, ␤ˆ ␶ X͒. ͑3͒ introduced in a nutshell. For more insight see FH, and ␤ˆ references therein. A very comprehensive description The coefficients ␶ are estimated by minimizing a of quantile regression and related subjects is given in piecewise linear censored LAD function (Powell 1986): the monograph by Koenker (2005). Calculations are ␤ˆ ϭ ␳ ͓ Ϫ ͑ ␤⌻ ͔͒ ͑ ͒ ␶ arg min͚ ␶ yn max 0, ␶ xn , 4 performed using the R programming language (R De- n velopment Core Team 2007). where ␳␶(u) is the so-called check function, with ␳␶(u) ϭ To account for the mixed discrete-continuous char- ␶u if u Ն 0 and ␳␶(u) ϭ (␶ – 1)u if u Ͻ 0. The index n acter of daily precipitation totals, precipitation is rep- denotes the members of the training sample. More de- resented by a so-called censored variable. Censoring is tails on the quantile function and its relation to the a term that originates from survival analysis. In the case optimization problem are given in the appendix. of life data, the variable of interest is the time until The minimization of (4) is much more complex than failure. If the failure occurs during the test period, the minimizing the LAD function of the noncensored QR value is complete. However, if the test ends before the ͚ ␳ Ϫ ␤T model, n ␶[yn ␶ xn]. We thus use a three-step ap- failure has occurred, the value is incomplete and is said proach following Chernozhukov and Hong (2002), to be censored. The censoring line is the end time of the which is described in more detail in FH. This approach test period and represents an upper bound of observa- estimates the noncensored QR model, but on a sub- tion, although the underlying process has no upper sample for which the estimated conditional probability bound. of the occurrence of precipitation is larger than the In the case of precipitation, we assume a hypothetic respective probability ␶. The procedure is as follows. process Y* that is observed through the amount of pre- First, the probability of the occurrence of precipitation cipitation Y. The zero precipitation line is assumed to P(y Ͼ 0|X) is estimated using a logistic regression represent a censoring line, which acts as a lower bound (with a probit function). A subsample {yn} is selected as of observation. The hypothetic process could well have defined by {y|P(y Ͼ 0|X) Ͼ ␶}. In a second step, the ␤␶ values below zero, although it is not observed through coefficients are estimated by minimizing the noncen- the observable “precipitation.” Note that this is a sta- sored LAD function over the subsample {yn}. A third tistical construct and does not assume a real physical step repeats the second step, but now on an updated process. ␤ˆ T Ͼ subsample {yn} defined by {y| ␶ X 0}. The coeffi- Let Y be the univariate censored response variable cients ␤␶ are updated by minimizing the noncensored (e.g., daily precipitation totals) and X the conditioning LAD function on the new subsample. Optionally, this multivariate variable. Then, the statistical censored lin- step can be repeated several times. Thus, besides the ear model is conditional ␶ quantile, the method also estimates the Y|X ϭ max͑0, ␤TX ϩ ␥TX⑀͒, ⑀ ϳ IID. ͑1͒ probability of noncensoring, hence the conditional probability of precipitation above zero, which is esti- The noncensored process Y* is modeled by Y*|X ϭ mated in step one. ␤TX ϩ ␥TX⑀, where ␤ are the unknown regression co- Additionally, the censored quantile regression natu-

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC 662 WEATHER AND FORECASTING VOLUME 23

rally provides a scoring rule, the censored quantile veri- after the initialization of the model forecasts. Observa- fication (CQV) score: tions at German weather stations are provided by the German Weather Service (Deutscher Wetterdienst, ϭ ␳ ͓ Ϫ ͑ ␤ˆ T ͔͒ ͑ ͒ CQV ͚ ␶ yn max 0, ␶ xn , 5 DWD) within the priority project “Quantitative pre- n cipitation forecasts” of the German Research Founda- ϭ where yn and xn, n 1,..., N, are taken from the tion. We have chosen 50 stations in the region of Rhine- forecast and verification sample of Y and X. Further land-Palatinate that are almost complete for the fore- explication on the separation of training and forecast cast period from 2002 to 2005 and that have sufficient samples is given in section 4. data coverage for the training period from 1958 to 2000. The CQV score is a proper scoring rule (Gneiting To account for seasonal nonstationarities, the data are and Raftery 2007; Bröcker and Smith 2007), which dis- divided into a cold season, November–March (winter), courages hedging on the part of the forecaster (Murphy and a warm season, May–September (summer). 1973). The CQV score is a positive definite function, The statistical postprocessing is derived on the basis and its expected minimum is obtained if the forecasts of reanalysis data, both from the NCEP–NCAR re- correspond to the conditional ␶ quantile (see the ap- analysis project (Kalnay and et al. 1996) and the ERA- pendix). Its expectation is zero only if the forecast is 40 reanalysis project (Uppala and et al. 2005). Unfor- perfect and the underlying process deterministic. To tunately, no GFS precipitation forecasts are available. assess the relative gain in performance of a forecast Instead, we used 5 yr of the deterministic ECMWF with respect to a reference forecast, we construct a skill precipitation forecasts (European Centre for Medium- score analogously to the Brier skill score as Range Weather Forecasts 2006) over Germany for the CQV͑␶͒ period from 2001 to 2005. The ECMWF forecast system CQVSS͑␶͒ ϭ 1 Ϫ . ͑6͒ provides precipitation forecasts on a 0.4° ϫ 0.4° hori- CQV ͑␶͒ ref zontal grid. As a predictor, we use the 6–30-h accumu- The CQVSS can takes values on the half-open interval lated precipitation of the 0000 UTC initial time fore- (Ϫϱ, 1]. A zero CQVSS indicates no gain with respect cast, which approximately corresponds to the observa- to the reference forecast, while a value of one indicates tional accumulation period. The ECMWF forecasts are a perfect and deterministic forecast. Likewise, the prob- intended to provide some reference skill in order to ability forecast of the occurrence of precipitation is rank the skill obtained with the RAN approach. verified by using the Brier skill score (Brier 1950). Note that the CQVSS as well as the Brier skill score are only 4. The forecast approach asymptotically proper for very large samples (Murphy 1973). Except when using ECMWF forecasts, the forecast approaches are based on a combined phase state vector ␨ ␻ 3. Data (XGFS, XRe)of 850, 850, and PWat at grid points in the area of 45°–59°N, 1°–20°E. To give equal weight to the A very common problem to model output statistic is different meteorological variables, the time series have the lack of a sufficiently long and homogeneous train- to be weighted. This is done by multiplying each vari- ing dataset. We have access to 5 yr (2001–05) of fore- able with the inverse of its spatial mean temporal stan- Ϫ4 Ϫ1 ␨ Ϫ1 casts produced for the 12-h period after the 0000 UTC dard deviation, which is of order 10 s for 850,10 Ϫ1 ␻ Ϫ2 initialization time from the NCEP high-resolution Pa s for 850, and 50 kg m for PWat. The ECMWF Global Forecast System (GFS; NOAA/EMC 2003). state vector XEC contains the total precipitation The data were archived at and kindly provided by Wet- amount at four neighboring grid points. terOnline GmbH. The analysis (0-h forecast) was not To assess the importance of the different parts in the stored, so no analyses with the GFS model are avail- postprocessing, we apply several approaches in addi- able. Three variables of the GFS forecasts are available tion to the RAN approach. All approaches are summa- over a region of 45°–59°N, 1°–20°E over western Eu- rized in Table 1. For the reason of comparability, the rope: precipitable water (PWat), relative vorticity at the verification period always extends from 2002 to 2005. ␨ 850-hPa pressure level ( 850), and vertical velocity The quantile forecasts are verified using the CQV and ␻ ( 850) at the 850-hPa pressure level. the Brier score. A skill score is derived with respect to Our study aims to provide reliable forecasts of daily a climatological forecast. The climatological probability totals of precipitation at German weather stations. The of precipitation and the climatological quantiles are es- daily totals are measured from 0730 to 0730 UTC of the timated from the forecast period from 2002 to 2005 for next day. The observational period thus starts 7.5 h the respective season. The climatology is estimated for

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 663

TABLE 1. Forecast approaches employed in this study.

Acronym Training QR model Forecast of quantiles Variables ␨ ␻ RAN ERA-40/GFS ERA-40 1958–2000 GFS 2002–05 14 CCA modes ( 850, 850, and PWat) between ERA-40 and GFS (2001) ␨ ␻ RAN NCEP/GFS NCEP–NCAR 1948–2000 GFS 2002–05 14 CCA modes ( 850, 850, and PWat) between NCEP–NCAR and GFS (2001) ␨ ␻ PP ERA-40/GFS ERA-40 1958–2001 GFS 2002–05 (interpolated on an 14 EOFs ( 850, 850, and PWat) of ERA-40 grid) ERA-40 (1958–2001) ␨ ␻ PP NCEP/GFS NCEP–NCAR 1948–2001 GFS 2002–05 (interpolated on an 14 EOFs ( 850, 850, and PWat) of NCEP–NCAR grid) NCEP–NCAR (1948–2001) ␨ ␻ D NCEP NCEP–NCAR 1948–2001 NCEP–NCAR 2002–05 14 EOFs ( 850, 850, and PWat) of NCEP–NCAR ␨ ␻ P-MOS GFS GFS 2001–05 GFS 2002–05 10 EOFs ( 850, 850, and PWat) of GFS (cross validation) DMO ECMWF ECMWF 2001–05 ECMWF 2001–05 Four grid points of ECMWF precipitation (cross validation)

ˆ each station separately, to avoid artificial skill (Hamill whereas ERe is derived from the reanalysis period 1948– and Juras 2006). The sampling error of the skill scores 2001 (NCEP–NCAR) or 1958–2001 (ERA-40), respec- is estimated using the bootstrap method (Efron and tively. Tibshirani 1993). In a second step, the NCEP–NCAR and ERA-40 reanalyses for the period 1948/1958–2000 are projected a. The RAN approach onto the canonical patterns from step 1. This period serves as the training period for the censored QR Our forecast approach follows the RAN approach of model between the projections of the NCEP–NCAR Marzban et al. (2005); that is, it uses reanalysis for the and ERA-40 reanalyses onto their canonical patterns development of the censored QR model. The access to (U) and the DWD station precipitation totals at a se- a large training database is important for the estimation lected station, notably the estimation of the coefficients of a stable QR model as shown in FH. The RAN fore- ␤␶. The QR model reads cast approach is illustrated in Fig. 1. In a first step, a CCA is performed to derive patterns (denoted as ca- | ϭ ͑ ␤T ˆ ˆ ϩ ⑀ ͒ ͑ ͒ Y XRe max 0, ␶ UEReXRe ␶ , 8 nonical patterns) in the GFS forecasts and the NCEP– ⑀ NCAR and ERA-40 reanalyses with the maximum cor- where ␶ accounts for the error in the QR model. The ␤ relation. Prior to the CCA, a reduction of spatial de- model coefficients ␶ are estimated using data from the grees of freedom is needed for the gridded datasets training period 1948–2000 or 1958–2000, respectively. (NCEP–NCAR reanalysis, ERA-40 reanalysis, and The third step is the forecast step for the remaining GFS forecasts). This reduction is performed using a period from 2002 to 2005. The forecast first corrects for principal component analysis (Barnett and Preisendor- systematic errors in the large-scale variables in the GFS fer 1987). The CCA patterns are derived for the daily forecasts using the CCA model. Then, the censored QR fields of the respective season in the year 2001 (Fig. 1). model is applied to derive estimates of the conditional ␶ The CCA model between the GFS forecasts and the quantile of the station precipitation y. So the forecasts ␶ reanalysis (Re) reads as follows: of the conditional quantiles of precipitation are de- rived as ϭ ϩ ⑀ ͑ ͒ UEReXRe VEGFSXGFS CCA, 7 ϭ ˆ ͑␶| ͒ ϭ ͑ ␤ˆ T ˆ ˆ ͒ ͑ ͒ yˆ␶ Qy XGFS max 0, ␶ VEGFSXGFS , 9 where XRe and XGFS are the gridded and scaled data of for the remaining period from 2002 to 2005. The RAN the reanalysis and GFS forecasts, ERe and EGFS are the matrices composed by empirical orthogonal functions, approaches are denoted as RAN ERA-40/GFS and RAN NCEP/GFS. Note that the CCA model avoids the and EReXRe and EGFSXGFS constitute the correspond- ing principal components (PCs) or new variables. Here, interpolation of the data onto a common grid. ⑀ denotes the error term of the CCA model, and U CCA b. The perfect prog approach and V are matrices containing the canonical pattern or transformations from the CCA. The estimates Uˆ , Vˆ , and Here, we want to assess the importance of the CCA ˆ EGFS are derived from the daily data of the year 2001, correction. The reanalysis-derived model is directly ap-

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC 664 WEATHER AND FORECASTING VOLUME 23

FIG. 1. Illustration of the three-step RAN forecast approaches (GFS/ERA40 and GFS/NCEP). plied to the GFS forecasts without the CCA step. As or May–September (MJJAS)] is withheld from the both the ERA-40 and the NCEP–NCAR grids have training data, and the forecast lower resolution than the GFS forecasts, the GFS fore- T yˆ␶ ϭ max͑0, ␤ˆ ␶ E X ͒ ͑12͒ casts are interpolated onto the respective reanalysis GFS GFS grids. The QR model is now trained between the first is derived for each target season in the period from leading EOFs of the reanalysis data and the DWD sta- 2002 to 2005. The training data also include the data tion precipitation. The forecasts are estimated through from the year 2001.

ϭ ͑ ␤ˆ T ˆ ͒ ͑ ͒ e. Using precipitation (ECMWF) as a predictor yˆ␶ max 0, ␶ EReXGFS . 10 This censored QR approach, denoted as DMO Here, the systematic model and forecast errors are ig- ECMWF, uses only precipitation forecasts. As no GFS nored (prefect prog); hence, the approaches are de- precipitation forecasts were available, we use the noted as PP ERA-40/GFS and PP NCEP/GFS. Differ- ECMWF forecasts instead. This approach assesses the ences in skill are due to forecast errors as well as to skill obtained using direct precipitation forecasts. This systematic differences between reanalysis and interpo- approach is not directly comparable with the GFS fore- lated GFS forecasts. casts based on the large-scale information only, as it introduces a new model with a higher resolution. c. The downscaling approach Rather it should give some reference to rank the ob- Another approach is included that is obtained for tained skill scores. downscaled precipitation using the NCEP–NCAR re- The covariate to derive the forecast is again a multi- analysis. The approach is denoted as D NCEP. The variate covariate; however, it only consists of forecasts training of the QR model is based on the NCEP– of total precipitation at the four nearest grid points NCAR reanalysis from 1948 to 2001. The quantile es- around the respective station; therefore, no reduction timates for the period from 2002 to 2005 are obtained of degrees of freedom is needed. However, a simple from calibration is needed to derive probabilities and quan- tiles from the deterministic DMO, which are compa- ϭ ͑ ␤ˆ T ˆ ͒ ͑ ͒ yˆ␶ max 0, ␶ EReXRe . 11 rable. As for the P-MOS GFS approach, we use cross The differences in skill between RAN NCEP/GFS and validation for training and verification. The conditional D NCEP are mainly because of forecast errors. quantiles of the target season are estimated as ϭ ͑ ␤ˆ T ͒ ͑ ͒ yˆ␶ max 0, ␶ XEC , 13 d. The P-MOS approach where XEC is a five-dimensional vector containing the To emphasize the benefit of the RAN approach, we forecasts of total precipitation at the four grid points included a probabilistic MOS approach, where only the nearest to the station location and a constant compo- relatively short GFS forecasts are used for training the nent of one for the intercept. postprocessing. The training and verification process only uses the GFS forecasts and is denoted as P-MOS 5. Results GFS. The forecasts are derived using cross validation, We first discuss the results obtained with the RAN where one target season [November–March (NDJFM) postprocessing. To present meaningful results, the sen-

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 665

FIG. 2. The CQV and Brier skill score for censored QR forecasts for four DWD stations for (a) winter and (b) summer months. ␨ ␻ Training of the QR model is performed on ERA-40 and NCEP–NCAR reanalyses 850, 850, and PWat, and on ECMWF precipitation forecasts. A forecast is performed on the basis of GFS (GFS/ERA40 and GFS/NCEP) or on ECMWF precipitation forecasts (see Table 1). The shading of the Brier skill score for the probability forecast of the occurrence of precipitation is dark, and the CQV skill score is shown for the 0.25-, 0.5-, 0.75-, 0.90-, 0.95-, and 0.99-quantile forecasts with gray shading. The error bars indicate the sampling errors of the CQV and Brier skill score.

sitivity of the quantile forecasts and the CQV skill Berkum (Helmbach-Düttling) represents the station scores to different settings has to be assessed. First, the with the lowest skill, and Schmelz-Hüttersdorf (Her- optimal number of EOF and canonical patterns that meskeil) the station with highest skill, in winter (sum- enter the QR model has to be defined. It turned out mer). Skill in winter is generally larger than in summer. that an optimal number of EOFs is of the order of The Brier skill score ranges between 23% (19%) and 12–14, and the QR model then gives the best results 50% (40%) in winter (summer). The CQV skill score is with about the same number of CCA modes (not generally largest for the 0.9 or 0.95 quantiles and is of shown). We have chosen 14 EOFs to represent the GFS the order of the Brier skill score. The CQV skill score forecasts and the NCEP–NCAR and ERA-40 reanaly- of the 0.99 quantile is smaller and has a large uncer- ses in the CCA model. All 14 CCA modes enter the QR tainty. The least skill is obtained for the 0.25 quantile. model. Although the length of the training period for However, this quantile lies only above the zero line, if the CCA correction is only 1 yr (or about 150 days for the probability of precipitation is above 0.75. The error each season), it turned out to suffice for our purpose. A bars given in Fig. 2 represent the 95% confidence in- longer period (two seasons) for the CCA correction in terval of the CQV skill score estimate derived by a turn reduces the period for the verification and did not bootstrap approach. No significant differences exist be- significantly increase the skill. tween the forecasts that are derived using either the Figure 2 shows the CQV skill scores for four stations: ERA-40 or the NCEP–NCAR reanalyses. Rhineland-Palatinate, Wachtberg-Berkum (50.62°N, Figure 2 also represents the CQV skill score obtained 7.13°E, 220 m), Schmelz-Hüttersdorf (49.42°N, 6.83°E, using the calibrated DMO ECMWF forecasts. For all 223 m), Helmbach-Düttling (50.60°N, 6.55°E, 380 m), stations, the DMO ECMWF approach outperforms and Hermeskeil (49.65°N, 6.93°E, 480 m). Wachtberg- the RAN approaches. The CQV skill score reaches

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC 666 WEATHER AND FORECASTING VOLUME 23

FIG. 3. Reliability for censored QR forecasts for four DWD stations for (a) winter and (b) summer months. Training and forecast approaches are as in Fig. 2. Horizontal lines indicate the theoretical probability, and the gray dots the empirical probability of the conditional quantile forecasts (␶ values as in Fig. 2). The vertical bars represent the 95% error interval of the empirical probabilities estimated from a 1000-member bootstrap sample.

70% (50%) for some stations in winter (summer). The and the observation is also zero, then it is not possible sampling error of the CQV skill score is comparable to to decide whether it is below or above the quantile that of the RAN approaches. Further studies are forecasts. Counting the zero quantile forecasts intro- needed in order to assess whether the differences are duces an artificial bias into the reliability, which is par- due to different model performance (e.g., due to the ticularly strong for the lower quantiles, as they are more higher horizontal resolution of the ECMWF forecasts), often censored. The reliability of the censoring, that is, or the choice of the covariates, or both. the reliability of the categorical forecast of precipitation To further verify the quantile forecasts, we investi- above zero, should be assessed separately. gated whether the quantile forecasts are reliable, that Figure 3 compares the theoretical (horizontal lines) is, show systematic errors (Fig. 3). If a quantile forecast and the empirical probabilities (dots) of the conditional is reliable, then it is expected that a fraction ␶ of the ␶-quantile forecasts. The 95% sampling error inter- observed precipitation values lies below the conditional vals of the empirical probability estimates are indicated ␶ quantiles. This fraction is denoted as the empirical by vertical lines. They are estimated from a 1000- probability. Note that the empirical probabilities can member bootstrap sample. Although some significant only be estimated for those forecasts where the condi- deviations from the theoretical values occur, particu- tional quantile forecast is above zero. If the quantile larly for the 0.25-quantile forecasts, a ␹2 test for count forecast is censored, which means that it is set to zero, data (Wilks 1995) does not reject the null hypothesis

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 667

FIG. 4. CQV and Brier skill score for censored QR forecasts for four DWD stations for (a) winter and (b) summer months (as in Fig. 2) but for different forecast approaches (see Table 1). of reliability for any of the forecast approaches. Simi- different parts of the RAN approach. Therefore, the larly, the mean occurrence of precipitation lies within CQV skill scores of the remaining forecast approaches the uncertainty of the mean forecast occurrence (not are displayed in Fig. 4. The PP ERA-40/GFS and PP shown). NCEP/GFS approaches highlight the benefits of the As the conditional quantiles are estimated indepen- CCA correction, which should correct for systematic dently for the different levels of ␶, it may occur that the errors in the forecasts of the large-scale circulation that quantile forecasts cross (i.e., the conditional 0.9 quan- are due to the forecast errors as well as to systematic tile is larger than the 0.95 quantile). This would consti- differences between the reanalysis and the GFS model. tute an invalid distribution. Friederichs and Hense For all stations and seasons, the skill of PP ERA-40/ (2007) have shown that oversampling and a short train- GFS is comparable to RAN ERA-40/GFS. The spatial ing dataset can lead to an increased number of crossing resolution of each model is comparable (1.25° for quantiles. For the four stations shown in Fig. 2, and for ERA-40 and 1.0° for GFS); hence, the systematic dif- the six quantile forecasts for each day of the winter and ferences between the ERA-40 and GFS large-scale pat- summer seasons from 2002 to 2005 (1217 days), cross- terns as well as the systematic forecast errors seem ing occurred five times for the GFS/ERA-40, never for small. In contrast, the performance of PP NCEP/GFS is the GFS/NCEP forecast approach, and 333 times reduced compared to RAN NCEP/GFS for all stations for the forecasts using the QR model based on the and seasons. This suggests that the systematic model ECMWF precipitation forecasts. Even though, the differences are larger between NCEP–NCAR (2.5° ECMWF forecast approach provides more skillful resolution) and GFS. quantile forecasts, it predicts an invalid distribution sig- Similar conclusions are suggested by the perfor- nificantly more often than does the GFS/ERA-40 mance of the downscaling approach (D NCEP). The (GFS/NCEP) reanalysis approach. These results are skill for the downscaling, which should not contain completely consistent with FH. forecast errors, is comparable to those of the RAN ap- We now want to investigate the importance of the proaches. Thus, the systematic model errors have a

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC 668 WEATHER AND FORECASTING VOLUME 23

FIG. 5. Example of censored QR forecasts at the Schmelz-Huettersdorf station for (a) January and (b) August 2002 using the ERA-40 reanalysis and GFS forecast. The left side of the panels shows the climatological forecast. Gray bars give the forecast probability of precipitation (right axis), black circles are the observed precipitation sums, and the box-and-whiskers graphs indicate the 0.25, 0.50, and 0.75 quantiles as well as the extreme quantiles (0.90, 0.95, and 0.99), as indicated for the climatological forecast. The left axis is in mm. larger effect on the skill than do the forecast errors. and Brier skill scores. The number of crossing quantiles Note that the forecast lead time is 12 h and that the that are of the order of 1/1000 is only slightly increased forecast error will certainly be more pronounced with when compared to the RAN approaches. increasing lead time. Figure 5 shows an example of forecasts derived using Finally, we assess the benefit of having large training the RAN ERA-40/GFS approach for January (Fig. 5a) samples. As the training of the P-MOS GFS approach and August 2002 (Fig. 5b). The winter and summer only relies on four seasons of daily data, the data re- climatologies of the daily precipitation totals as esti- duction of the large-scale fields uses the first 10 leading mated from the period 2002 to 2005 are indicated in a EOFs. The skill is reduced for almost all stations com- box on the left-hand side of the panels. The climato- pared to the RAN ERA40/GFS or RAN NCEP/GFS logical probability of precipitation amounts to 0.56 in approaches. More importantly, the uncertainty of the winter and 0.44 in summer. In summer, the climatologi- quantile forecasts is largely increased due to the short cal 0.25 and 0.50 quantiles lie on the zero precipitation training period, which results in highly variable CQV line. The 0.9 quantile amounts to 8.7 mm (8.2 mm) and

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 669

FIG. 6. Observed precipitation for all 50 stations on (a) 27 Jan and (c) 20 Aug 2002, and 0.95 quantile forecasts at 50 stations on (b) 27 Jan and (d) 20 Aug 2002.

the 0.99 quantile to 26.5 mm (27.9 mm) in winter (sum- 13 to 18 August as well as the extreme rainfall on 20 mer). The black dots indicate the observed precipita- August of about 40 mm. tion amounts. The first 10 days of January 2002 were We now look more closely at 27 January and 20 Au- very dry. Accordingly, the probability of precipitation gust 2002. During both days, one a winter day and one estimated for those days lies below 0.2. In the second a summer day, the total precipitation exceeded 40 mm half of January 2002, the probability of precipitation at one station. Figures 6a and 6c show the observed was much higher, with a probability reaching almost precipitation totals at 50 stations in Rhineland- one on 27 January. On that day, the value that could be Palatinate on 27 January and 20 August 2002. On 27 exceeded with a probability of 0.1 amounts to 24 mm, January, small cyclones passed over Germany within a and the 0.99 quantile lies at 40 mm. The observed value distinct west-southwest drift. The situation on 20 Au- amounts to 17 mm. The forecasts follow the observa- gust was marked by two small low pressure systems tions and capture nicely the range of variance of the coming from France that transported warm and humid daily precipitation totals. Similar good correspondence air from the southwest, leading to shower and thunder- is obtained for August 2002 at the Schmeltz-Hütters- storms. On both days, the observed precipitation totals dorf station. The forecast captures the dry period from strongly vary between the stations with values ranging

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC 670 WEATHER AND FORECASTING VOLUME 23 from 1.5 (0.7) to 44.7 mm (41 mm) on 27 January (20 The performance of the probabilistic RAN approach, August). which is based on the large-scale atmospheric patterns, The 0.95 quantile forecasts for the respective winter is compared to a DMO approach that uses precipita- and summer days are shown in Fig. 6 (right panels). On tion forecasts from the ECMWF forecast system. The 27 January, the 0.95 quantile forecasts range between ECMWF DMO approach provides significantly more 14.7 and 37.7 mm. In contrast to the observed precipi- skill than do the RAN approaches. Further studies are tation totals, the quantile forecasts are much smoother needed to assess those differences. Other forecast vari- in space. But this is to be expected since quantiles ables should be investigated and, in particular, the GFS similar to the expectation values characterize the prob- precipitation forecasts should be included. The occur- ability distributions, while the observations constitute rence of crossing quantiles of the DMO ECMWF fore- a sample of the realizations of those distributions. casts is small but not negligible. It indicates that al- The 0.95 quantile forecasts on 20 August show even though the skill is large, the data basis is not large smaller spatial variance than in January and range enough to estimate a stable QR model. around 20 mm. Several other forecast approaches are tested in order to assess the effects of the different components of the 6. Conclusions RAN approach. The advantage of the CCA correction We have presented a probabilistic postprocessing ap- is that an interpolation onto a common grid is not nec- proach that derives probabilistic measures of daily pre- essary. It turns out that the CCA correction of the cipitation on the basis of the large-scale circulation pat- large-scale patterns is important when using NCEP– terns of a single deterministic model forecast. Here, NCAR reanalysis data to train the censored QR model, 12-h forecasts are taken from the NCEP high- whereas no skill is lost when using ERA-40 reanalysis resolution GFS. Observations are daily precipitation data. So a correction of the systematic differences be- totals at 50 weather stations in the region of Rhineland- tween the NCEP–NCAR and GFS large-scale patterns Palatinate, Germany, provided by the German Weather is important. The forecast error seems small for the Service. The postprocessing employs canonical correla- 12-h forecasts, as the performance of the RAN ap- tion analysis, logistic regression, and censored quantile proach is comparable to the performance of the down- regression. It constitutes a probabilistic extension to scaling based on the NCEP–NCAR reanalysis. A P- standard model output statistics. MOS approach that relies solely on the large-scale pat- As Friederichs and Hense (2007) pointed out, a suf- terns of the GFS forecasts provides a less stable and less ficiently long training period is needed in order to re- meaningful statistical postprocessing, as the training pe- duce the probability of an invalid distribution forecast. riod is too short. Here, the RAN approach is clearly Following the RAN approach of Marzban et al. (2005), outperforming the P-MOS approach. the training of the probabilistic model, namely the cen- The objective of this study was to draw attention to sored quantile model, is derived on reanalysis data, ei- the method of censored quantile regression, which pro- ther from the ERA-40 or NCEP–NCAR reanalyses. vides a tool to formulate a probabilistic postprocessing On the basis of more than 40 yr of daily data, a stable of numerical weather forecasts. The utilization of re- and meaningful QR model is estimated. A CCA cor- analysis for the training of a probabilistic MOS system rects for systematic differences due to different grids together with the RAN-type approach is appropriate to and systematic model and forecast errors between the avoid problems due to the data shortage problems. The large-scale circulation patterns of the reanalyses and RAN approach represents a valuable and efficient post- the GFS forecasts. processing scheme to derive local probabilistic precipi- Forecast skill is assessed using the CQV and Brier tation forecasts from a single deterministic model fore- skill score, where the reference forecast constitutes an cast. estimate of the climatological distribution of precipita- tion. The reliability of the quantile forecasts is good, Acknowledgments. We gratefully acknowledge fruit- and the occurrence of crossing quantile forecasts is neg- ful discussions with Martin Goeber, Armin Mathes, and ligible. Skill is generally larger in winter than in sum- Jan Keller. Special thanks are also due to three anony- mer, and ranges between 20% and 50% depending on mous reviewers, for their detailed comments, and to the respective quantile, season, and station. Although WetterOnline GmbH for providing us with the GFS the NCEP–NCAR reanalysis data have lower resolu- forecasts. Parts of the research were carried out within tion than do the ERA-40 data, no significant differ- the DFG-funded Project PP1167-PQP and at the Inter- ences are obtained when using NCEP–NCAR or ERA- disciplinary Centre of Complex Systems (IZKS) of the 40 for the training of the quantile model. University of Bonn.

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 671

APPENDIX

Quantiles and the Quantile Verification Score

This appendix gives some preliminary details about quantile estimation, notably that the quantiles can be expressed as the solution of an optimization problem (Koenker 2005). Standard linear regression minimizes the least square error, and the solution of an optimiza- tion problem that aims at minimizing the expected square error is the expectation value. And so the mean square error is an adequate measure of the perfor- mance of such a forecast. Analogously, quantile regres- sion minimizes a function of the least absolute differ- ␳ ences. This optimization problem is the basic principle FIG. A1. Check function of ␶(u). for quantile regression, and the solution is the condi- tional quantile function at a probability ␶. Likewise, the The check function ␳␶(u) is a function of the least ab- optimization problem defines a verification score, the solute deviations with u ϭ y Ϫ yˆ␶ and is displayed in Fig. QV or CQV score, which is discussed in some detail A1. Here, I() is an indicator function, which takes the here. value 1 if the condition in the brackets is valid, and 0 if Let Y be a univariate random variate with distribu- the condition is not valid. The estimate yˆ␶ should be tion F(y) ϭ prob(Y Յ y). The quantile function is de- determined such that it minimizes the expected loss fined as [E[] indicates the expectation under F(y)]: ͑␶͒ ϭ Ϫ1͑␶͒ ϭ ͕ | ͑ ͒ Ն ␶͖ ͑ ͒ Q F inf y F y , A1 yˆ␶ ͓␳ ͑ Ϫ ͔͒ ϭ ͵ ͑␶ Ϫ ͒͑ Ϫ ͒ ͑ ͒ Ϫ1 E ␶ Y yˆ␶ 1 y yˆ␶ dF y and F (␶) ϭ y␶ is called the ␶ quantile. Quantiles arise Ϫϱ

from a simple optimization problem, which is defined ϱ through the so-called check function: ϩ ͵ ␶͑ Ϫ ͒ ͑ ͒ ͑ ͒ y yˆ␶ dF y . A3 yˆ␶ ␳␶͑u͒ ϭ u͓␶ Ϫ I͑u Ͻ 0͔͒ ϭ ͑␶ Ϫ 1͒uI͑u Ͻ 0͒ ϩ ␶uI͑u Ն 0͒ for some ␶ ∈ ͑0, 1͒. ͑A2͒ Differentiating with respect to yˆ␶ results in

Ѩ yˆ␶ ϱ ͓␳ ͑ Ϫ ͔͒ ϭ ͑ Ϫ ␶͒ ͑ ͒ Ϫ ␶ ͑ ͒ ϭ ͑ ͒ Ϫ ␶ ϭ! ͑ ͒ Ѩ E ␶ y yˆ␶ 1 ͵ dF y ͵ dF y F yˆ␶ 0, A4 yˆ␶ Ϫϱ yˆ␶

Ϫ1 so a necessary condition of yˆ␶ to minimize the loss func- ␶ quantile, with F(y␶) ϭ ␶ or y␶ ϭ F (␶), then S(y,.)is tion is that F(yˆ␶) ϭ ␶, which is indeed the ␶ quantile. proper if for each yˆ␶ Analogously, it can be shown that the value that mini- 2 mizes E[(Y Ϫ yˆ) ] is the expectation value E[Y]ofY. E͓S͑Y, yˆ␶͔͒ Ն E͓S͑Y, y␶͔͒, ͑A5͒ The loss function ␳␶(y Ϫ yˆ␶) is proposed as a scoring rule for quantile forecast verification, the QV score (for and strictly proper if equality is only obtained if yˆ␶ ϭ y␶. censored data, the CQV score). It will be shown in the The expectation of each score can be trivially decom- following that the expected loss attains a minimum if posed into the forecast is perfect. This proves that the QV score is Ϫ1 ͓ ͑ ͔͒ ϭ ͓ ͑ ͔͒ ϩ ͕ ͓ ͑ ͔͒ Ϫ ͓ ͑ ͔͖͒ proper, but also that yˆ␶ ϭ F (␶) is a necessary and E S Y, yˆ␶ E S Y, y␶ E S Y, yˆ␶ E S Y, y␶ . sufficient condition for a minimum of the expected loss. ͑A6͒ We first recall the definition of a proper score (for de- tails see Gneiting and Raftery (2007)). Let S(y, yˆ␶) ϭ If S(Y, .) is a proper score, then the term in the braces ␳␶(y Ϫ yˆ␶) denote the QV score of a forecast yˆ␶ of the ␶ is positive definite. Such a decomposition for the QV quantile of the random variate Y ϳ F (y). Let y␶ be the score can be derived as follows:

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC 672 WEATHER AND FORECASTING VOLUME 23

ϱ ͓␳ ͑ Ϫ ͔͒ ϭ ͵ ␳ ͑ Ϫ ͒ ͑ ͒ E ␶ y yˆ␶ ␶ y yˆ␶ dF y Ϫϱ

yˆ␶ ϱ ϭ ͵ ͑␶ Ϫ ͒͑ Ϫ ͒ ͑ ͒ ϩ ͵ ␶͑ Ϫ ͒ ͑ ͒ 1 y yˆ␶ dF y y yˆ␶ dF y Ϫϱ yˆ␶

ϱ yˆ␶ ϱ yˆ␶ ϭ ͵ ␶͑ Ϫ ͒ ͑ ͒ Ϫ ͵ ͑ Ϫ ͒ ͑ ͒ Ϫ ͵ ␶͑ Ϫ ͒ ͑ ͒ ϩ ͵ ͑ Ϫ ͒ ͑ ͒ y y␶ dF y y y␶ dF y yˆ␶ y␶ dF y yˆ␶ y␶ dF y Ϫϱ Ϫϱ Ϫϱ Ϫϱ

ϱ y␶ yˆ␶ ϱ ϭ ͵ ␶͑ Ϫ ͒ ͑ ͒ Ϫ ͵ ͑ Ϫ ͒ ͑ ͒ Ϫ ͵ ͑ Ϫ ͒ ͑ ͒ Ϫ ␶͑ Ϫ ͒͵ ͑ ͒ y y␶ dF y y y␶ dF y y y␶ dF y yˆ␶ y␶ dF y Ϫϱ Ϫϱ y␶ Ϫϱ

yˆ␶ ϩ ͑ Ϫ ͒͵ ͑ ͒ yˆ␶ y␶ dF y Ϫϱ

yˆ␶ ϭ ␳ ͑ Ϫ ͒ ϩ ͕͑ ͑ ͒ Ϫ ␶͒͑ Ϫ ͒ Ϫ ͵ ͑ Ϫ ͒ ͑ ͖͒ ͑ ͒ E ␶ Y y␶ F yˆ␶ yˆ␶ y␶ y y␶ dF y . A7 y␶

Because of the mean value theorem for integration, the extreme precipitation using censored quantile regression. integral in the braces is bounded by Mon. Wea. Rev., 135, 2365–2378. Glahn, H. R., and D. A. Lowry, 1972: The use of model output yˆ␶ statistics (MOS) in objective weather forecasting. J. Appl. Յ ͵ ͑ Ϫ ͒ ͑ ͒ Յ ͑ Ϫ ͒͑ ͑ ͒ Ϫ ␶͒ 0 y y␶ dF y yˆ␶ y␶ F yˆ␶ . Meteor., 11, 1203–1211. y␶ Gneiting, T., and A. E. Raftery, 2007: Strictly proper scoring ͑A8͒ rules, prediction, and estimation. J. Amer. Stat. Assoc., 102, 359–378. As F(y) is a positive definite and monotonically in- Hamill, T. M., and J. Juras, 2006: Measuring forecast skill: Is it real skill or is it the varying climatology? Quart. J. Roy. Meteor. creasing function of y, the term in the brackets is posi- Soc., 132, 2905–2923, doi:10.1256/qj.06.25. tive definite. Thus, the QV score is a proper score. ——, and J. S. Whitaker, 2006: Probabilistic quantitative precipi- tation forecasts based on reforecast analogs: Theory and ap- REFERENCES plication. Mon. Wea. Rev., 134, 3209–3229. ——, ——, and X. Wei, 2004: Ensemble reforecasting: Improving Barnett, T. P., and R. Preisendorfer, 1987: Origins and levels of medium-range forecast skill using retrospective forecasts. monthly and seasonal forecast skill for United States surface Mon. Wea. Rev., 132, 1434–1447. air temperatures determined by canonical correlation analy- Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation and sis. Mon. Wea. Rev., 115, 1825–1850. Predictability. Cambridge University Press, 341 pp. Bremnes, J. B., 2004: Probabilistic forecasts of precipitation in ——, and Coauthors, 1996: The NCEP/NCAR 40-Year Reanaly- terms of quantiles using NWP model output. Mon. Wea. Rev., sis Project. Bull. Amer. Meteor. Soc., 77, 437–471. 132, 338–347. Klein, W. H., 1971: Computer prediction of precipitation prob- Brier, G. W., 1950: Verification of forecasts expressed in terms of ability in the United States. J. Appl. Meteor., 10, 903–915. probability. Mon. Wea. Rev., 78, 1–3. Koenker, R., 2005: Quantile Regression. Econometric Society Bröcker, J., and L. A. Smith, 2007: Scoring probabilistic forecasts: Monogr., Vol. 38, Cambridge University Press, 349 pp. The importance of being proper. Wea. Forecasting, 22, 382– 388. ——, and B. Bassett, 1978: Regression quantiles. Econometrica, 46, 33–49. Chernozhukov, V., and H. Hong, 2002: Three-step censored quan- tile regression, with an application to extramarital affairs. J. Marzban, C., S. Sandgathe, and E. Kalnay, 2005: MOS, perfect Amer. Stat. Assoc., 97, 872–882. prog, and reanalysis data. Mon. Wea. Rev., 134, 657–663. Efron, B., and R. J. Tibshirani, 1993: An Introduction to the Boot- Murphy, A. H., 1973: Hedging and skill scores for probability strap. Chapman and Hall, 436 pp. forecasts. J. Appl. Meteor., 12, 215–223. European Centre for Medium-Range Weather Forecasts, cited NOAA/EMC, 2003: The GFS . NCEP Office 2006: ECMWF operational analysis data (March 1994– Note 442, 14 pp. [Available online at http://www.emc.ncep. present). [Available online at http://badc.nerc.ac.uk/data/ noaa.gov/officenotes/newernotes/on442.pdf.] ecmwf-op/.] Powell, J. L., 1986: Censored regression quantiles. J. Economet- Fahrmeir, L., and G. Tutz, 1994: Multivariate Statistical Modelling rics, 32, 143–155. Based on Generalized Linear Models. Springer Series in Sta- R Development Core Team, cited 2007: R: A language and envi- tistics, Springer, 425 pp. ronment for statistical computing. [Available online at http:// Friederichs, P., and A. Hense, 2007: Statistical downscaling of www.R-project.org.]

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC AUGUST 2008 FRIEDERICHSANDHENSE 673

Uppala, S. M., and Coauthors, 2005: The ERA-40 Re-analysis. Wilks, D., 1995: Statistical Methods in the Atmospheric Sciences: Quart. J. Roy. Meteor. Soc., 131, 2961–3012. An Introduction. International Geophysics Series, Vol. 59. Vislocky, R. L., and G. S. Young, 1989: The use of perfect prog Academic Press, 464 pp. forecasts to improve model output statistics forecasts of pre- Yu, K., and R. Moyeed, 2001: Bayesian quantile regression. Stat. cipitation probability. Wea. Forecasting, 4, 202–209. Prob. Lett., 54, 437–447. Vrac, M., and P. Naveau, 2007: Stochastic downscaling of precipi- Zorita, E., and H. von Storch, 1999: The analog method as a tation: From dry events to heavy rainfalls. Water Resour. Res., simple statistical downscaling technique: Comparison with 43, W07402, doi:10.1029/2006WR005308. more complicated methods. J. Climate, 12, 2474–2489.

Unauthenticated | Downloaded 10/02/21 05:08 AM UTC