Empirical Process Proof of the Asymptotic Distribution of Sample
Total Page:16
File Type:pdf, Size:1020Kb
Spring 1998 John Rust Economics 551b 37 Hillhouse, Rm. 27 Empirical Pro cess Pro of of the Asymptotic Distribution of Sample Quantiles th ~ De nition: Given 2 0; 1, the quantile of a random variable X with CDF F is de ned by: 1 F = inf fx j F x g: th th Note that is the median, is the 25 p ercentile, etc. Further if we de ne the 0 quantile :5 :25 as = lim and de ne similarly, it is easy to see that these are the lower and upp er 0 1 !0 ~ ~ p oints in the supp ort of X i.e. the minimum and maximum p ossible values of X which might ~ be 1 and +1 if X has unb ounded supp ort. Note also that if F is strictly increasing in a 1 neighb orho o d of , then = F is the usual inverse of the CDF F . If F happ ens to have \ at" sections, sayaninterval of p oints x satisfying F x= , then is the smallest x in this interval. The following lemma, a slightly mo di ed version of a lemma from R. J. Ser ing, 1980 Approximation Theorems of Mathematical Statistics Wiley, New York, provides some 1 basic prop erties of the quantile function F : 1 Lemma 1: Let F be a CDF. The quantile function F , 2 0; 1 is non-decreasing and left continuous, and satis es: 1 1. F F x x; 1 <x<1 1 2. F F ; 0 < <1 1 1 3. If F is strictly increasing in a neighb orho o d of = F wehave: F F = and 1 F F = . 1 4. F x if and only if x F . ~ ~ De nition: Let X ;:::;X bearandom sample of size N from a CDF F . Then the sample 1 N quantile ^ , 2 0; 1 is de ned by: 1 ^ = F ; N where F is the empirical CDF de ned by: N N X 1 ~ F x= I fX xg: N i N i=1 1 1 1 ~ ~ Thus ^ = F :5 is the sample median F :5 = medX ;:::;X , and ^ = F 0 is :5 1 N 0 N N N 1 1 ~ ~ the sample minimum, F 0 = minX ;:::;X , and ^ = F 1 is the sample maximum, 1 N 1 N N 1 ~ ~ F 1 = maxX ;:::;X . Since empirical CDF's have jumps of size 1=N unless more than 1 N N ~ one of the fX g's take the same value, then we can b ound the maximum di erence b etween i 1 and F F in Lemma 1-2 as follows: 1 ~ ~ Lemma 2: Let X ;:::;X bearandom sample from a CDF F and suppose that in this 1 N ~ ~ ~ ~ sample each X happens to be distinct, so that by reindexing we have X < X < < X < i 1 2 N1 ~ X . Then for al l 2 0; 1 we have: N 1 1 jF F j : N N N The following theorem shows that the asymptotic distribution of the sample quantiles ^ for 2 0; 1 are normally distributed. It is imp ortant to note that we exclude the two cases =0 and = 1 in this theorem since the asymptotic distribution of these extreme value statistics is very di erent and generally non-normal. ~ ~ Theorem: Let X ;:::;X be IID draws from a CDF F with continuous density f . Then if 1 N f > 0, we have: p p 1 1 2 N ^ = N F F = N0; ; N where: 1 2 = : 2 f Pro of: The Central Limit Theorem for IID random variables implies that for any x in the supp ort of F wehave: p 2 N F x F x = N 0; ; N 2 1 where = F x[1 F x]. Letting x = = F and using Lemma 1-3 wehave: p p 1 1 N F F = N F F F F = N 0; 1 : N N Furthermore, the prop ertyofstochastic equicontinuity from the theory of empirical processes see D. Andrews, 1996 Handbook of Econometrics vol. 4 for an accessible intro duction and de nition of sto chastic equicontinuity, wehave that the result given ab ove is una ected if we replace by a consistent estimate ^ : p p 1 1 N F ^ F ^ = N F F F F = N 0; 1 : N N N N Now note that Lemma 1-2 implies that p p 1 1 1 N F F N F F F F : N N N N ~ However since the true CDF F has a density, the probability of observing duplicate fX g's is i zero, so Lemma 2 implies that with probability1 wehave: p p p 1 1 1 N F F F F = N F F + O 1= N ; N p N N N which implies that: p 1 = N 0; 1 : N F F N 1 Nowwe apply the Delta theorem, i.e. wedoaTaylor series expansion of F F ab out the N 1 limiting p oint = F to get: h i 1 1 1 1 F F = F F + f ~ F F ; N N 2 where ~ is a p oint on the line segmentbetween ^ and . Using the result ab ove and Lemma 1-3 wehave: ! 1 p p F F 1 1 2 N N F F = N = N 0; ; N f ~ where wehave used Slutsky's Theorem and the fact that ~ ! since ^ ! with probability 1. 3.