Lecture 4: Estimating a Univariate Distribution
In this lecture we consider estimating the CDF and PDF of a continuous random variable based on a simple random sample from it. Suppose the data we have are a sample Y_1, Y_2, ..., Y_m drawn independently and identically from the distribution F.

Estimating the Cumulative Distribution Function of X

The natural estimator of the CDF is the empirical cumulative distribution function, denoted by F_m(y): the proportion of the sample data that do not exceed the value y. This function is also called the empirical distribution function or the sample distribution function. Mathematically,

$$F_m(y) = \frac{1}{m} \sum_{j=1}^{m} I(Y_j \le y),$$

where

$$I(S) = \begin{cases} 1 & \text{if the event } S \text{ is true} \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

is the indicator function. Note that F_m(y) is a step function of y with jumps of 1/m at the ordered values of the sample data.

How well does F_m(y) estimate F(y)?

For a fixed y, F_m(y) is itself a random variable. The exact distribution of m F_m(y) is binomial with m trials and probability of success F(y), because m F_m(y) counts how many of the m independent observations fall at or below y, and each does so with probability F(y). This leads to:

Theorem. For each value of y, F_m(y) is a consistent estimator of F(y). The sequence F_m(y), m = 1, 2, ..., is also asymptotically normal:

$$F_m(y) \sim AN\left( F(y),\ \frac{F(y)(1 - F(y))}{m} \right), \qquad -\infty < y < \infty,$$

as m → ∞.

Some important notation

The Gaussian or Normal Distribution. The notation N(µ, σ²) is used to denote a normal (or Gaussian) distribution with mean µ and variance σ². The standard normal is N(0, 1), and the corresponding CDF is often denoted by Φ(x), −∞ < x < ∞.

Asymptotic Convergence of Distributions. Consider a sequence of random variables X_1, X_2, ..., where the mth random variable has CDF F_m(x), and suppose X has CDF H(x).

We say that X_m converges in distribution to X if, at each continuity point x of H(x),

$$\lim_{m \to \infty} F_m(x) = H(x).$$

This concept measures a sense in which the X_m are "cross-sectionally" close to X when the sample size is large. It does not focus on how close a particular realization of the sequence X_m is to X, only on the aggregate behavior.

We say that X_m converges with probability one to X if

$$P\left( \lim_{m \to \infty} X_m = X \right) = 1.$$

This concept measures a sense in which the X_m are "longitudinally" close to X when the sample size is large. If a sequence converges with probability one, then it also converges in distribution.

We say the sequence is asymptotically normal with "mean" µ_m and "variance" σ_m² > 0 if

$$\frac{X_m - \mu_m}{\sigma_m}$$

converges in distribution to a standard normal distribution. In this situation H(x) = Φ(x), which is continuous for each −∞ < x < ∞. For additional information see Kelly (1994) or Serfling (1980).

Notation for the convergence properties of sequences

1. Deterministic sequences: Let x_n and y_n be two real-valued deterministic (nonrandom) sequences. Then, as n → ∞,
   (a) x_n = O(y_n) if and only if lim sup_{n→∞} |x_n / y_n| < ∞,
   (b) x_n = o(y_n) if and only if lim_{n→∞} |x_n / y_n| = 0.

2. Random sequences: Let X_n and Y_n be two real-valued random sequences. Then, as n → ∞,
   (a) X_n = O_p(Y_n) if and only if for every ε > 0 there exist δ and N such that P(|X_n / Y_n| > δ) < ε for all n > N,
   (b) X_n = o_p(Y_n) if and only if for every ε > 0, lim_{n→∞} P(|X_n / Y_n| > ε) = 0.

The asymptotic normality result above states that there is convergence for each individual value of y.
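As a concrete numerical illustration of this pointwise result, here is a minimal Python sketch (not from the lecture) that builds F_m(y) from a simulated sample and uses the asymptotic normal approximation above to form an approximate 95% pointwise confidence interval for F(y). The function names (ecdf, pointwise_ci), the simulated standard normal sample, and the sample size are our own choices for illustration.

```python
import numpy as np
from scipy.stats import norm

def ecdf(sample):
    """Return the function y -> F_m(y): the proportion of sample values <= y."""
    data = np.sort(np.asarray(sample, dtype=float))
    m = len(data)
    def F_m(y):
        # side="right" counts observations that are <= y
        return np.searchsorted(data, y, side="right") / m
    return F_m

def pointwise_ci(sample, y, level=0.95):
    """Approximate CI for F(y), using F_m(y) ~ AN(F(y), F(y)(1 - F(y)) / m)."""
    m = len(sample)
    Fm_y = ecdf(sample)(y)
    z = norm.ppf(0.5 + level / 2)        # e.g. 1.96 for level = 0.95
    se = np.sqrt(Fm_y * (1 - Fm_y) / m)  # plug-in estimate of the std. dev.
    return Fm_y - z * se, Fm_y + z * se

# Simulated example: standard normal data, so the true F(y) is Phi(y).
rng = np.random.default_rng(0)
sample = rng.normal(size=200)
lo, hi = pointwise_ci(sample, y=0.0)
print(f"F_200(0) = {ecdf(sample)(0.0):.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
# The true value is Phi(0) = 0.5.
```

Plugging F_m(y) into the variance F(y)(1 − F(y))/m is the usual plug-in substitution; the resulting interval is only as accurate as the normal approximation for the given m.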
One commonly used measure of the global closeness of F_m(y) to F(y) is the Kolmogorov-Smirnov distance

$$D_m = \sup_{-\infty < y < \infty} |F_m(y) - F(y)|.$$

The convergence of F_m(y) to F(y) occurs simultaneously for all y in the sense that D_m converges to zero with probability one, that is, P[lim_{m→∞} D_m = 0] = 1. In this sense, for large sample sizes the deviation between F_m(y) and F(y) will be small for all y. See Serfling (1980).

Estimation of the Quantile Function

Recall:

$$Q(p) = F^{-1}(p) = \inf_x \{ x : F(x) \ge p \}.$$

The natural estimator of Q(p) is the pth quantile of the sample distribution function F_m(y), defined by

$$Q_m(p) = \inf \{ y : F_m(y) \ge p \}.$$

The properties of Q_m(p) as an estimator of Q(p) are similar to those of F_m(y) as an estimator of F(y).

Theorem. Assume that 0 < p < 1, and suppose F(y) possesses a density f(y) in a neighborhood of Q(p) that is positive and continuous at Q(p). Then, as m → ∞,

$$Q_m(p) \sim AN\left( Q(p),\ \frac{p(1 - p)}{m f^2(Q(p))} \right).$$

These estimators have the drawback that they are step functions, while F(y) and Q(p) are usually continuous and much smoother. This suggests that alternative estimators exist that may better reflect the properties of F(y). In particular, if we had a smooth estimator of f(y), say f̂(y), we could estimate F(y) by

$$\int_0^y \hat{f}(x)\, dx.$$
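Continuing the hedged sketch from above, both the sample quantile function Q_m(p) and the Kolmogorov-Smirnov distance D_m can be computed directly from the order statistics, since F_m jumps by 1/m at each ordered data value. The names sample_quantile and ks_distance are illustrative, not the lecture's code.

```python
import numpy as np
from scipy.stats import norm

def sample_quantile(sample, p):
    """Q_m(p) = inf{ y : F_m(y) >= p }: the smallest order statistic Y_(k)
    with F_m(Y_(k)) = k/m >= p, i.e. k = ceil(p * m)."""
    data = np.sort(np.asarray(sample, dtype=float))
    m = len(data)
    k = max(int(np.ceil(p * m)), 1)
    return data[k - 1]

def ks_distance(sample, true_cdf):
    """D_m = sup_y |F_m(y) - F(y)|.  The supremum is attained at a data
    point (just after a jump of F_m) or in the limit just before one."""
    data = np.sort(np.asarray(sample, dtype=float))
    m = len(data)
    F = true_cdf(data)
    upper = np.arange(1, m + 1) / m - F  # F_m just after each jump, minus F
    lower = F - np.arange(0, m) / m      # F minus F_m just before each jump
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(0)
sample = rng.normal(size=200)
print("Q_200(0.5) =", sample_quantile(sample, 0.5))   # near the true median 0
print("D_200      =", ks_distance(sample, norm.cdf))  # -> 0 as m grows, w.p. 1
```

Because F_m only changes at the data points, checking k/m − F(Y_(k)) and F(Y_(k)) − (k−1)/m at each order statistic is enough to find the supremum exactly.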
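Finally, here is a minimal sketch of the smoothing idea in the last paragraph, assuming a Gaussian kernel density estimate: since each kernel bump integrates to a normal CDF, the smoothed CDF estimate is simply the average of Φ((y − Y_j)/h) over the sample. The bandwidth rule of thumb and the integration over the entire lower tail (rather than from 0) are assumptions made here for the illustration, not choices made in the lecture.

```python
import numpy as np
from scipy.stats import norm

def smooth_cdf(sample, h=None):
    """Smooth CDF estimate: integrate a Gaussian-kernel density estimate,
    which gives F_hat(y) = (1/m) * sum_j Phi((y - Y_j) / h)."""
    data = np.asarray(sample, dtype=float)
    m = len(data)
    if h is None:
        # Silverman-style bandwidth rule of thumb (an assumption here)
        h = 1.06 * data.std(ddof=1) * m ** (-0.2)
    def F_hat(y):
        y = np.atleast_1d(np.asarray(y, dtype=float))
        return norm.cdf((y - data[:, None]) / h).mean(axis=0)
    return F_hat

rng = np.random.default_rng(0)
F_hat = smooth_cdf(rng.normal(size=200))
print(F_hat([-1.0, 0.0, 1.0]))  # compare with Phi: about [0.159, 0.500, 0.841]
```

Unlike F_m(y), this estimate is continuous and differentiable in y, which is exactly the smoothness the step-function estimators lack.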