WATER RESOURCES RESEARCH, VOL. 31, NO. 8, PAGES 2019-2025, AUGUST 1995

A comparison of unbiased and plotting-position estimators of L moments

J. R. M. Hosking and J. R. Wallis
IBM Research Division, T. J. Watson Research Center, Yorktown Heights, New York

Abstract. Plotting-position estimators of L moments and L moment ratios have several disadvantages compared with the "unbiased" estimators. For general use, the "unbiased" estimators should be preferred. Plotting-position estimators may still be useful for estimating extreme upper tail quantiles in regional frequency analysis.

Copyright 1995 by the American Geophysical Union. Paper number 95WR01230. 0043-1397/95/95WR-01230$05.00

Probability-Weighted Moments and L Moments

Probability-weighted moments of a random variable \(X\) with cumulative distribution function \(F(\,)\) and quantile function \(x(\,)\) were defined by Greenwood et al. [1979] to be the quantities

\[ M_{p,r,s} = E[X^p \{F(X)\}^r \{1 - F(X)\}^s] = \int_0^1 \{x(u)\}^p u^r (1 - u)^s \, du. \]

Particularly useful special cases are the probability-weighted moments

\[ \alpha_r = M_{1,0,r} = \int_0^1 (1 - u)^r x(u) \, du, \]

\[ \beta_r = M_{1,r,0} = \int_0^1 u^r x(u) \, du. \]

Certain linear combinations of probability-weighted moments can be directly interpreted as measures of the location, scale, and shape of the probability distribution. These are the L moments, defined by Hosking [1990] to be the quantities

\[ \lambda_r = \int_0^1 x(u) P^*_{r-1}(u) \, du. \]

Here \(P^*_r(u)\) is the rth shifted Legendre polynomial, defined by

\[ P^*_r(u) = \sum_{k=0}^{r} p^*_{r,k} u^k, \]

where

\[ p^*_{r,k} = (-1)^{r-k} \binom{r}{k} \binom{r+k}{k} = \frac{(-1)^{r-k} (r+k)!}{(k!)^2 (r-k)!}. \]

L moments are given in terms of the probability-weighted moments by

\[ \lambda_{r+1} = (-1)^r \sum_{k=0}^{r} p^*_{r,k} \alpha_k = \sum_{k=0}^{r} p^*_{r,k} \beta_k. \]

It is convenient to define dimensionless versions of L moments; this is achieved by dividing the higher-order L moments by the scale measure \(\lambda_2\). The L moment ratios \(\tau_r\), \(r = 3, 4, \ldots\), are defined by

\[ \tau_r = \lambda_r / \lambda_2. \]

L moment ratios measure the shape of a distribution independently of its scale of measurement. The ratios \(\tau_3\) ("L skewness") and \(\tau_4\) ("L kurtosis") are now widely used as measures of skewness and kurtosis, respectively [e.g., Schaefer, 1990; Pilon and Adamowski, 1992; Royston, 1992; Stedinger et al., 1992; Vogel and Fennessey, 1993].

Estimators

Given an ordered sample of size \(n\), \(x_{1:n} \le x_{2:n} \le \cdots \le x_{n:n}\), there are two established ways of estimating the probability-weighted moments and L moments of the distribution from which the sample was drawn. Consider first the probability-weighted moment \(\alpha_r\). Landwehr et al. [1979a] used the unbiased estimator

\[ a_r = n^{-1} \sum_{j=1}^{n} \frac{(n-j)(n-j-1) \cdots (n-j-r+1)}{(n-1)(n-2) \cdots (n-r)} \, x_{j:n}. \]

Landwehr et al. [1979b] preferred the estimator

\[ \tilde{\alpha}_r = n^{-1} \sum_{j=1}^{n} (1 - p_{j:n})^r x_{j:n}, \]

with \(p_{j:n} = (j - 0.35)/n\). We call this a "plotting-position estimator" because \(p_{j:n}\) is a plotting position, a distribution-free estimator of the nonexceedance probability of \(x_{j:n}\).

Estimators can similarly be obtained for the probability-weighted moments \(\beta_r\) and for L moments and L moment ratios. For L moments, unbiased estimators are given by

\[ l_{r+1} = (-1)^r \sum_{k=0}^{r} p^*_{r,k} a_k. \]

The analogous estimators of L moment ratios are

\[ t_r = l_r / l_2; \]

these estimators are not exactly unbiased, but from their construction it is convenient to refer to them as "unbiased" estimators. Plotting-position estimators of L moments and L moment ratios are given by

\[ \tilde{\lambda}_{r+1} = (-1)^r \sum_{k=0}^{r} p^*_{r,k} \tilde{\alpha}_k, \qquad \tilde{\tau}_r = \tilde{\lambda}_r / \tilde{\lambda}_2. \]

Unbiased and plotting-position estimators are asymptotically equivalent in large samples: the difference between them is of stochastic order \(n^{-1}\). The question of which is preferable in practice therefore depends on the estimators' properties in small and moderate samples.

Landwehr et al. [1979b] recommended plotting-position estimators of probability-weighted moments because when used to fit the Wakeby distribution they gave more accurate estimates of the upper tail quantiles. Hosking et al. [1985b] and Hosking and Wallis [1987] reached the same conclusion for the generalized extreme value and generalized Pareto distributions, respectively. These authors were principally concerned with "floodlike" distributions having \(\lambda_2/\lambda_1\) (the L moment analogue of the usual coefficient of variation) and \(\tau_3\) both in the approximate range 0.1 to 0.3.

More recently, L moments have found other applications: as summary statistics of data samples [Hosking, 1990], as a tool for identifying the appropriate distribution to fit to a data set [Chowdhury et al., 1991; Hosking and Wallis, 1993; Guttman et al., 1993] and in the fitting of distributions to data, such as monthly precipitation totals, that are much less skew than annual maximum flood data [Guttman et al., 1993]. For these applications, plotting-position estimators have some undesirable properties, which we discuss in the following sections. Furthermore, the advantages of plotting-position estimators over unbiased estimators for quantile estimation are slight, except in some regional frequency analyses. We therefore conclude that for general use, unbiased estimators are preferable to plotting-position estimators. The only exception is the estimation of extreme upper tail quantiles in regional frequency analysis: here the plotting-position estimators sometimes outperform the unbiased estimators.

Our discussion of plotting-position estimators concentrates on the estimators based on the plotting position \(p_{j:n} = (j - 0.35)/n\) and denoted by \(\tilde{\lambda}_r\) and \(\tilde{\tau}_r\). The same general principles, however, apply to any reasonable choice of the plotting position \(p_{j:n}\), and in particular to any plotting position of the form \(p_{j:n} = (j + \gamma)/(n + \delta)\) with \(\delta > \gamma > -1\); when emphasizing that a result is valid for any such estimator, we denote the plotting-position estimators by \(\tilde{\lambda}_r[\gamma, \delta]\) and \(\tilde{\tau}_r[\gamma, \delta]\).

Invariance Under Location Shift

If a constant is added to each data value, it is desirable that the estimate of a distribution's location parameter should increase by the same constant while the estimates of scale and shape parameters of the distribution remain unchanged. This is the case for unbiased estimates of probability-weighted moments and L moments, but is not true for plotting-position estimators.

This lack of invariance of plotting-position estimators has been noted by Landwehr et al. [1979b, equation (12)] and Hosking [1986a, p. 23] and illustrated by Sinclair and Ahmad [1988] and Fill and Stedinger [1995]. The effect can be significant in small samples. As an example, we use Sinclair and Ahmad's example data set of 16 annual maximum flows of the river Annan at Brydekirk, Scotland, 1967-1982. The data are listed in Table 1. For this data set the plotting-position estimators are \(\tilde{\lambda}_2 = 47.1\), \(\tilde{\tau}_3 = 0.227\), and \(\tilde{\tau}_4 = 0.323\). If a constant value of 250 is subtracted from each data value, the plotting-position estimators change to \(\tilde{\lambda}_2 = 42.4\), \(\tilde{\tau}_3 = 0.261\), and \(\tilde{\tau}_4 = 0.248\). The differences between these values are clearly substantial. For comparison, the "unbiased" estimators are \(l_2 = 44.4\), \(t_3 = 0.241\), and \(t_4 = 0.315\).

Table 1. Annual Maximum Flows of the River Annan at Brydekirk, Scotland, 1967-1982

Year    Maximum Flow, m³ s⁻¹
1967    453.3
1968    268.4
1969    307.4
1970    257.4
1971    250.6
1972    260.9
1973    150.5
1974    263.6
1975    256.3
1976    214.5
1977    474.0
1978    308.5
1979    285.7
1980    256.1
1981    306.5
1982    390.2

Impossible Values

A hitherto unremarked aspect of the lack of invariance of plotting-position estimators is that for some sample configurations the plotting-position estimators of L moments can take numerical values that would be impossible for the population L moments of any distribution. For example, consider \(\tilde{\lambda}_2[\gamma, \delta]\). It is straightforward to show that

\[ \tilde{\lambda}_2[\gamma, \delta] = \frac{n-1}{n+\delta} \, l_2 + \frac{1 + 2\gamma - \delta}{n+\delta} \, l_1. \]

Unless \(1 + 2\gamma - \delta = 0\) (in which case the plotting position is symmetric, with \(p_{i:n} = 1 - p_{n-i+1:n}\)), \(\tilde{\lambda}_2[\gamma, \delta]\) can take negative values. For example, for a sample of size \(n = 20\) that has \(l_1 = -100\) and \(l_2 = 1\), we would have \(\tilde{\lambda}_2 = -0.55\). It is clearly unsatisfactory for a scale estimator to be able to take negative values. One can similarly construct samples for which \(\tilde{\tau}_3 > 1\).

Bias

The most important practical disadvantage of plotting-position estimators is their bias, which except for a fairly narrow range of parent distributions can be disturbingly high. For example, the estimator \(\tilde{\tau}_3\) has low bias when the population value of \(\tau_3\) is near 0.2, but its bias is large enough to be of practical concern in small and moderate samples drawn from distributions with \(\tau_3\) not close to 0.2.

Figure 1 illustrates the bias of estimators of L skewness and L kurtosis for sample size \(n = 20\). Bias is shown for both the plotting-position estimators \(\tilde{\tau}_3\) and \(\tilde{\tau}_4\) and the unbiased estimators \(t_3\) and \(t_4\). Bias is shown graphically, by arrows that lead from the population values \(\tau_3\) and \(\tau_4\) to the means of the sample estimators. Results are based on 10,000 simulations of distributions with fixed values of \(\lambda_1 = 1\) and \(\lambda_2 = 0.2\), and \(\tau_3\) and \(\tau_4\) taking values at intervals of 0.05 over the range \(0 \le \tau_3 \le 0.5\), \(0 \le \tau_4 \le 0.3\). The parent distributions are kappa distributions when there exists a kappa distribution with the given \(\tau_3\) and \(\tau_4\) values, i.e., for all points on or below the "GLO" line \(\tau_4 = (5\tau_3^2 + 1)/6\) in Figure 1, and Wakeby distributions otherwise.

the 2-hour-duration annual maxima are available for 118 gaging sites, with record lengths varying from 7 to 46 years and averaging 31 years. As in Schaefer's analysis, the 118 sites were divided into 13 regions, each region representing a different range of mean annual precipitation. Region sizes varied from 5 sites with 176 station years of data to 13 sites with 392 station years of data. The regional averages of L skewness and L kurtosis for the 13 regions, calculated using both unbiased and plotting-position estimators, are shown in Figure 2. Figure 2

Figure 1. Bias of sample L skewness and L kurtosis statistics for sample size 20, for (a) plotting-position estimators and (b) unbiased estimators. Arrows lead from the population values to the mean of the sample statistics. Solid lines are the population L skewness-L kurtosis relationships for the generalized logistic (GLO), generalized extreme value (GEV), and generalized Pareto (GPA) distributions. Results are based on simulated samples drawn from kappa and Wakeby distributions.
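As a concrete illustration of the two families of estimators defined in the Estimators section, the following Python sketch computes "unbiased" and plotting-position sample L moments and applies them to the Table 1 data, with and without the location shift of 250. It is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, and both estimators are computed through the \(\beta_r\) form of the probability-weighted moments rather than the \(\alpha_r\) form written out in the text (the two forms give the same L moments). The last lines construct an evenly spaced sample, chosen here only for illustration, that has \(l_1 = -100\) and \(l_2 = 1\) with \(n = 20\).

```python
from math import comb


def legendre_coeff(r, k):
    """Coefficient p*_{r,k} of the r-th shifted Legendre polynomial."""
    return (-1) ** (r - k) * comb(r, k) * comb(r + k, k)


def combine(beta):
    """L moments lambda_{r+1} = sum_k p*_{r,k} beta_k, for r = 0, 1, ..."""
    return [sum(legendre_coeff(r, k) * beta[k] for k in range(r + 1))
            for r in range(len(beta))]


def unbiased_l_moments(data, nmom=4):
    """'Unbiased' sample L moments l_1..l_nmom (assumes len(data) >= nmom)."""
    x = sorted(data)
    n = len(x)
    b = []
    for r in range(nmom):
        total = 0.0
        for j, xj in enumerate(x, start=1):
            # weight = (j-1)(j-2)...(j-r) / ((n-1)(n-2)...(n-r))
            weight = 1.0
            for i in range(1, r + 1):
                weight *= (j - i) / (n - i)
            total += weight * xj
        b.append(total / n)
    return combine(b)


def plotting_position_l_moments(data, nmom=4, gamma=-0.35, delta=0.0):
    """Plotting-position L moments with p_{j:n} = (j + gamma) / (n + delta)."""
    x = sorted(data)
    n = len(x)
    p = [(j + gamma) / (n + delta) for j in range(1, n + 1)]
    beta = [sum(pj ** r * xj for pj, xj in zip(p, x)) / n for r in range(nmom)]
    return combine(beta)


def summary(lmom):
    """Return (scale, L skewness, L kurtosis) from [l1, l2, l3, l4]."""
    return lmom[1], lmom[2] / lmom[1], lmom[3] / lmom[1]


# Table 1: annual maximum flows of the river Annan, 1967-1982 (m^3/s).
annan = [453.3, 268.4, 307.4, 257.4, 250.6, 260.9, 150.5, 263.6,
         256.3, 214.5, 474.0, 308.5, 285.7, 256.1, 306.5, 390.2]
shifted = [v - 250.0 for v in annan]

for label, sample in (("original", annan), ("minus 250", shifted)):
    u = summary(unbiased_l_moments(sample))
    p = summary(plotting_position_l_moments(sample))
    print(label, "unbiased:", [round(v, 3) for v in u])
    print(label, "plotting-position:", [round(v, 3) for v in p])

# Illustrative evenly spaced sample with n = 20, l1 = -100 and l2 = 1.
spaced = [-103.0 + (2.0 / 7.0) * j for j in range(1, 21)]
print("scale estimate:", plotting_position_l_moments(spaced, nmom=2)[1])
```

Run on the Table 1 data, the unbiased scale and ratio estimates should be unchanged by the shift, while the plotting-position estimates should move between the two sets of values quoted in the section on invariance under location shift. The evenly spaced sample should give a negative plotting-position scale estimate, in line with the section on impossible values.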