<<

RADIO SCIENCE Journal of Research NBSjUSNC-URSI Vol. 68D, No.9, September 1964 Statistical Inference for Rayleigh Distributions

M. M. Siddiqui 1

Contribution From Boulder Laboratories, National Bureau of Standards, Boulder, Colo.

(R eceived D ecember 6, 1963; revised May 7, 1964)

The main inference problems related to t he Rayleigh distribution a re the es t im atiop of its a nd the t est of t he hypothesis that a given set of observations is from such a di stribution. It is shown that (in case of radio signals) t he most effieient estimate of the parameter is obtain ed using t he sample power. Compli cations may a ri se wh en data are missin g or a re a utocorrelated. Methods are given to deal wit h such complicatio ns also.

I. Introduction (1 ) Com])lete incoherence, whi ch that each of A t is distri buted uniformly o,-er t he interval In a preyious paper [Siddiqui, 19621 some problems (0, 27r). of estimation and testin g of hypotheses were di s­ (2) Independence of phase and amplitude, i. e., cussed in co nnection with t he Rayleigh distribution. R i and A i are independent random Yclri ables. In the present survey article, for the sake of complete­ (3) lbsence of a dominant vector, defined by the ness, so me parts or the previous paper will be re­ condition of t he Central Limit Theorem [Cramer, peated. Howe,-er, an attempt will be made to 1951, pp. 215- 216, 185- ] 86]. If t he ?Lrst two con di­ condense the material already co \rered . tions f),re satisfied, this third co ndi tion will be sftti ::1ed if: (E'Z:~RDII3 (E'Z: ltD - 1I2 0 as n -HXJ . 1.1. Nota tion Thus, in particuhr, if R i have id entical di stributions or if Ri= r, a con stant for all i, then (3) is satisfied . The function P will be llsed generi cally for any The condition (3) will not be satis:fied if, say, a distribution function and J) for any density Jun ction. domi nant \'ector is prese nt. Under Sll Ch circum­ Th118 P(x) and P(y), in ge nernl , will not be the same stances the .t\akagami-Rice distribu tion applies functions. If X is ft rftndom vftriftble EX will [Nakagami, 1940; R ice, 1944, tllld 19451. . stand for the of X find yur X for the If tho ftbo "e assu mptions hold R cos A and H sm yariance~ of X. Thus A are asymptotically independent normal vari ates, each wi th men.1l zero and \'urianco (] /2)

and if P(x) is differentiable, P'(x)=p(x) . and R and A a re asymptoticall y independent. We, 2 . General Properties of Rayleigh therefore, co nsider a l" n.ndom "ariable H, which has Distributions the probability density a nd distribution function, respecti vely 'iVhen an incoming plane wave (electromagnetic, _ _ 2/ 2 () _ 2/ 2 < sound, or some other kind) passes through a scat­ p(r) = 2

Thus The only essential parameter to be estimated in a Rayleigh power distribution is the mean 'Y. All other E (In Z) = In 'Y + '" (1) = In 'Y - 0.577 char acteristics of the distribution such as the distri­ bution function, momemts and percentiles are func­ val' (In Z) = "" (1) = 7l' 2/6= 1.64493, tions of 'Y. Let ZI , . .. , ZN denote a complete sample of N independent observations from the dis­ where ", (x) = r' (x) jr(x). It is customary to measure tribution. The likelihood function (the joint prob­ power in decibels. Thus, writing Q= 10 loglo ability density of Zl, . .. , Z,v considered as a Z = 20 10glO R , we have function of 'Y) is given by EQ= (10 10glOe) E(ln Z) = 4.343 (In 'Y-0.577)

val' Q= (10 10gl oe)2 val' (In Z) =,3l.0 (2.7) This shows that J:,Z i a sufficient statistic for 'Y' 1006 Since E'Z,Zi=Ny, the unbiased 'sufficien t (also the and val' c1l2 are possible. In fact, from (3 .2) and maximum likelihood) estimator of I' is the sample (3.3) mean, (3. 1) E l /2 f (N + 1/2) l /2 1/2_[1 fZ(N + 1/2)J 1 C N l/2 r (N) I' ,val' c - Nf2(N) 1'. The variance of c is, of course, val' c= I'z/N, and if c' is any other unbiased estimator of I' then Thus an unbiased estimate of ER and its variance val' c-N which is an improvement over the biased estimate m. On the other hand, let R = N - I 'Z,fR i, then EX'= [r (N) ]-11'" xI+N - 1e-xdx= r(t+ N) /r (N). ER= ER, (3.4) ,.... 4- 7r val' R = 4N "(. Since c is the sufficient statistic for the family of Rayleigh distributions { P(z) : 0::;: 1'< co}, all other Thus the relati ve asymp totic efficiency, e(R), of 11, is characteristics of the distribution should be estimated i.n terms of c. For exampJe e(R )= "arm* 7r 091 varR 4 (4- 7r) ., (3.5) i. e., the sample mean 11, is a less efficient estimator is the maximum likelihood estimator of P(z) and of the population mean ER than the estimator m * based on the suffi cien t statistic c. ~(p) =- c In (l - p), o::;:p< co, (3 .6) If the measurements are in decibels, that is, if our sample consists of Qi= 10 10glO Z i, i = .l , 2, ... '. ,N, that of z( p). In general, if a function }(I') has an then we note from (2. 7) that an estImatIOn of EQ, unbiased estimator, it also has an unbiased estimator which involves In 1', is required. "Ve can directly based on the suffi cient statistic c, and this estimator evaluate E(In c) and var (In c) by differentiating is the most efficient estimator of}(I'). Also let hex) (3.4) once and twice wi tIL respect to : and then be a function which in some neighborhood of the setting t= O. We note that point X= I' is continuous and has con tinuous deri Ira­ tives of the first and second order. Then, if E(ln X) J= f (J) (N)/f (N) , j = l , 2, ... , Ih(x) l< eKX for all x2': 0 where K is some constant independent of x, it can be shown that, for suffi­ N - l 1/; (N)=r' (N) /f (N) = 1/; (1)+ L, 1'-1, ciently large N, r = 1

(3. 7) Thus, since In c= ln X + ln I' - ln N,

For exampJe if we wi sh to estimate the mean E (In c) = ln I' - ln N + 1/; (N) = ln I' + O (N-I ) amplitude ER= (7rl')1 IZ/2, we consider m = (7rc) 1/2/2 and from (3.7) var (In c) = var (In X) = 1/;' (N). Thus, to order N - I, the unbiased estimator of EQ based on cis

However, III this case, exact evaluations of Ec1lZ Q* = 4.343 (In c- O.577 ), val' Q* = (4.343)21/;' (N). (3.9) 1007 On the other hand, if Q=N-1 ~Qi' EQ=EQ and confidence limits for "(. If N?,IOO, these limits are approximately Nl/2c(Nl/ 2±Xa)-1. -Q (4.3437r) 2 val' 6N' 3 .3. Estimation From Order

An estimate of"( can be made with reaso nably high Thus the efficiency eN(Q) of Q relative to Q* is efficiency from a few order statistics. Let Z1, .. . , ZN denote, as before, a sample of N independent observations on Z and let Y 1 ::; Y~ ::; ... ::; Y N be their ordered values, e.g., Y1 =min (ZI' ..., ZN), YN = max (ZI' ..., ZN)' The probability density For N = I, eN(Q)= I, otherwise eN(Q) < l. Using fun ction of Yk is given by the Euler-Maclaurin formula N! p(y) (k- l ) l(N- k ) !"( exp [-(N- ~j (r) = ( kj (x)dx-I /2[j(k)-j(0)] r-O ) 0 lc + l )y/'Y][ I -exp (_y/'Y))k-I,

+1~ [Pl) (k)-fl) (0)]- ... (3 .10) The generating function of Y k is to sum the series in eN(Q) we obtain, for large N(?10), From

Thus the efficiency of Q is approximately 61 per­ (3. 12) cen t. We have thus established the fact that when a Using the Euler-Maclamin formula (3. 10) to approxi­ complete random sample is available, one should mate these summations, I have shown elsewhere invariably use estimates of population chara('ter­ [Siddiqui, 1963] that the optimum unbiased estimator istics using appropriate functions of the sufficient from a single order statistic corresponds to statistic c. The use of the other estimators men­ tioned above involves considerable loss of efficiency. k~0.7968 1 (N+ 1) - 0.39841 + 1.16312 (N + 1)- 1 3.2. Confidence Limits (3. 13)

We noted that 2Nc/'Y is a x2 variate with 2N deg where the yalue of k thus obtained is rounded off to of freedom. To set up con:6.dence limits on 'Y with the nearest integer. For this k, 1- a 2 a confidence coefficient we determine from x k tables two numbers a and b, corresponding to 2N cI = Y k /22 a i ~0.6275Yk (3. 14) deg of freedom, such that 1

is an unbiased estimator of 'Y with efficiency

Then a::; 2Nch::; b has probability 1- a and

(3.11) Comparing the effi ciencies CN (CI ) and eN( Q), we are the desired confidence limits. If h is a monotone notice that even a single order statistic, properly function from (0, 00), then h(2Nc/b) and h(2Nc/a) chosen, is more effi cient than the mean of all the will be 100 (1 - a) percen t confidence limi ts [or h ('Y) . decibel values for estimating the parameter of a Thus, for example, EQ and P(z) are monotone func­ R ayleigh distribution. It may also be noted that. tions of ,,(, and confidence limits for them can be the efficiency of the estimator based on the median. easily written down. i.e., the order statistic corresponding to k~ (J /2) If N?, 15 , x= (4Nc/'Y)1 /2- (4N- l )I /2 is approxi­ (N+ l ), is only 48 percent. mately a standard normal variate [Cramer, 1951 , The asymptotically (N----'7 oo) optimum unbiased p. 251]. If Xa is the number such that estimator of "( from two order statistics is given by

c2 = 0.523Ym + 0.179Yn , m~0.639(N+ 1), n~0.927(N+ 1 ), (3 .1 5) then 4Nc[(4N-I)I/2 :l: xa] - 2 are 100(1 -0') percent with asymptotic efficiency e(c2) = 0.82. Sarhan, 1008 --_._--

Greenberg, and Ogawa [1963] have gi ven asymptoti­ Generally , the correllction function a (s) will be cally optimum unbiased estimators of ')' from 1 to 15 unknown and will h ave to be estimated from the order statistics. A linear combination of as few as sample. Let fi ITe properly chosen order statistics provides an estimator of')' whose efficiency exceeds 94 p er cent. T S A big ad ITan tage of the es ti mators based on order C(s) = (T -s) - I i - Z(t)Z(t +s)clt. statistics is that only a p ar tial knowledge of the Then sample is required. They are especially good when a(s) =C(s) /c~- l low values of the sample are poorly recorded, not recorded or missing. Let us say t hat the lowest 60 is a consistent estima tor of a(s). Usually, it is percent of the sample is missing. ' Ve can still desirable to fit to a(s) some ma t hematically specified construct the estimators CI and C2, although the function such as exp (-,u lsl), exp (- ,us"), ,u (,u +S2)-]. sample mean (even median) can not be determined. Before choosing an approximating function , howeyer, The optimum estimators from more than one order a more carehJ investigation of a(s) near s= O is statistic cannot be co nstructed if, say the highest necessary. For example, if we take exp (-,ulsl) to 8 percent or more values in the sample are missing. represent a(s) then, sin ce this f unction is Ilo t differ­ If only the highest values are missing so that we are entiable ftt s= O, we will be co mpelled to co nclude left with Y1, ... , Y m , m~N, the most effi cient that the process Z(t) is n ot difrerentiabIe, a nd this unbiased estimator of ')' from these order statistics is may not be a I'ery desirable state of ,tfLtirs. For the sake of sim pli city we Lake the uni t of ti me (3 .16) such t lliLt T = I . Let

i = l, 2, ... , N, and has efficiency e(cm) = m /N. In fact, 2mcmh is a x2 variate [Epstein and Sobel 19541 with 2m deg of freedom.. Confidence limits b ased on Cm can be where N is gi I'en different I"

3.4. Estimation From Correlated Observations N SN= ~ YiYi-l, and NSN' In tbis section we co nsider Z = Z(t) to be a station­ 1 ary Rayleigh process and assume that for all t a nd s It can be shown that E Z (t )= ')', E Z (t) Z (t + s) = ')' 2( I + a(s), (3 .17) so that a(s) is the correlation function of t he process. Note th ata(O) = I , a( -s) = a(s). Let Z(t) be observed wh ere a' (-0) is the left-hand and a' (+ 0) is the o ITer an inter val of time O ~ t ~ T. From the a nalogy rig ht-hand derilratil e of a(s) at 8= 0. If a(s) is of the random sample estim ate, we use the sample differe nt iable at zero t hen this Ii l1li t is ;"eI" O, other­ mean wise n ot equal to zero. Thus Lhe beha lTiol" of SN as N increases will tell us wh ether a(s) is differentiable cT = T - I CZ (t )clt (3 .18) at s= o or not. Now sllpposing a(s) is t wi ce differ­ •• 0 entiable at s = o with al/(O) ~ O , we must hal'e to estimate ')'. CT is unbiased and has I'ariance plim S N=O, plim NSN= - ')' 2a" (0). (3. 19) We already know, in this case, that a' (0) = O. If exp (- ,uS2) seems to represent a(s), then an estimate We note that if we choose to take a discrete set of of ,u is obtained as m =NSN/ (2c},). equ ally spaced observations Z(h), Z(2h), .. . , Z (Nh), then the mean, CN, will have the I'aria nce 4. Tests of Statistical Hypotheses ')'2 2')'2 N- l ( S ) 4.1. Tests for the Distribution Function val' CN = N+ N ~ I - N a(sh). (3.20) Let Z], .. . , Z N denote indep endent obser vations The exact distribution of CT or CN is unknown. How- on a nonnegative I'ariate Z. We wi sh to test the ever , let N' be defined by the equation statistical hypothesis that Z h as Rayleigh power dis tribution. We employ the x2 goodness-of-fit var c= ')'2/N', i.e., N' = ,),2/var C, (3 .21 ) test for this purpose. We first calculate C= N - l ~t'Z i ; then the numbers Xi = - C In (l-p;), p i= i /m, i = l , where var C is either (3 .] 9) or (3 .20). Then 2N'c°l' h 2, .. . , m - l , where m~5. From (3.6) we see th at or 2N'CNi'Y is approximately a x2 type variate with Xi are lOOP i percentiles of the Rayleigh power distri­ 2N' deg of fr eedom. bution with ')' = C. The expected number of observa- 1009 tions in each of the intervals (0, XI )' (XI, x2), . .. , same distributions. Let C1 and C2 be the means of the (Xm- I, ro ) is N lm. Let f1 ' . .. ,fm be the number of samples of sizes Nl and N , from Rayleigh distribu­ obsBtTed Z /s falling in these intervals. The statistic tions with means "II, and 1'2, respectively. Then, under the hypothesis 1'1 = 1' 2, Nm ""L.Jml. f2_, N (4.1) is asymptotically a x2 variate with m - 2 deg of is a Fisher-Snedecor F variate with the indicated freedom under the hypothesis t hat Z is a Rayleigh degrees of freedom. We assume that CI "2 C2; if not variate. \,ye preassign some critical probability we simply invert the ratio and interchange the level a ( = 0.05 or 0.01 , say) and from x2 tables de­ degrees of freedom, preassign a sig ni:ficance level a termine the number x ~ such that Pr(x 2 "2 x ~) = a. and test for the significancE' of the calculated F by If the obsenTed X2< x ~ , we accept the hypothesis comparing it with t he upper lO Oa /2 percentage that Z is a Rayleigh variate, otherwise we reject it. point of the F distribution. If the observations in If the sample is truncated at one or both ends, we each sample al"e au tocorrelated, we modify the use order statistics to estimate I' and proceed as above degrees of freedom from N to N' according to (3.21). changing the fiTst and last intervals, if necessary, to If the F -test indicates acceptance of the h ypothesis read (0, x) and (y, ro ) where X is a number slightly 1'1 = 1' 2=1', the common mean, /" is estimated by less than the lowest recorded observation and y is a C= N - l(NIC1 + N 2C2),N= Nl + N 2' number slightly greater than the highest recorded observation. The expected number of observations 5. References in (0, x) is N(l -e-x/ C) and in (y, ro) is N e-Y/ c• We calculate Cram er, H. (1951), M athem atical methods of stat istics (Princeton University Press, Princeton) . Epstein, B., and M. Sobel (1954) , Some t heorems relevant to life t esting from an exponent ial distribution, Annals of Mathem atical Stat istics 25, 373- 38l. where fo = obser ved, f e= expected frequency. N akagami, Minoru (1940) , StllCly on t he r esulta nt l1lnplit. ude In case of correlated observations it is necessary to of m any vibrations whose phases a nd amplitudes are multiply the value of x2 thus obtained by Nfl IN random, Nippon Elec. Comm. Eng. No. 22, 69- 92. Rice, S. O. (1944), M athematicftl ft n alysis of r andom noise, where Nfl is the equivalent number of independent Bell System Tech. J. 23, 282- 332. (Also publish ed as 2 Bell Telephone Monogn tph B 1589 and included in selected observations. Thus x~ = (Nil IN) ~ (foj!e) should papers on noise a nd stochastic processes, N . Wax, cd. 2 Dover Publications, New York, N .Y. 1954.) be considered as an approximate x variate with m - 2 Rice, S. O. (1945), M athematical an alysis of r andom noise, degrees of freedom. It is difficult to determine Nfl . Bell System T ech . J. 24, 46- 156. However, if a(s) is the correlation function of Z (t) Sarhan, A. E., B . G. Greenberg, and J . Ogawa (1963) , Simpli­ process and the sample consis ts ofZ(h), . .. , Z (Nh) , fi ed estimfttes for t he exponential distribut ion, Annals of M athem fttical Statistics, 34, 102- 116. Siddiqui, M. NL (H)62), Some problems connected wit h Rayleigh distribut ions, J. R es. NBS 66D (Radio Prop.), No. ~ , 167- 174. Siddiqui, M . M . (1963), Opt imum estim ators of the p ftr am­ cters of negative exponential distributions from, one or appears to be a reasonably good approximation to two order st atistics, Annals of M athematicftl Statistics, Nil IN. 34, 117- 121.

4 .2. Comparison of Two Samples

Sometimes we wish to test the hypothesis that two independent samples, which are known to be from Rayleigh power distributions, are from 1,he (Paper 68D9- 400)

1010