Chapter 1: Estimation Theory Advanced Econometrics - HEC Lausanne
Christophe Hurlin
University of Orléans
November 20, 2013
Christophe Hurlin (University of Orléans) Advanced Econometrics - HEC Lausanne November20,2013 1/147 Section 1
Introduction
Estimation problem
Let us consider a continuous random variable $Y$ characterized by a marginal probability density function $f_Y(y;\theta)$ for $y \in \mathbb{R}$ and $\theta \in \Theta$. The parameter $\theta$ is unknown.
Let $\{Y_1, \ldots, Y_N\}$ be a random sample of i.i.d. random variables that have the same distribution as $Y$.
We have one realisation $\{y_1, \ldots, y_N\}$ of this sample. How can we estimate the parameter $\theta$?
Remarks
1 The estimation problem can be extended to the case of an econometric model. In this case we consider two variables $Y$ and $X$ and a conditional pdf $f_{Y|X=x}(y;\theta)$ that depends on an unknown parameter or a vector of unknown parameters $\theta$.
2 In this chapter, we do not derive the estimators (for the estimation methods, see the next chapters). We take as given an estimator $\hat{\theta}$ of $\theta$, whatever the estimation method used, and we study its finite sample and large sample properties.
Notations: In this course, I will (try to...) follow some conventions of notation.
$Y$ random variable
$y$ realisation
$f_Y(y)$ probability density or mass function
$F_Y(y)$ cumulative distribution function
$\Pr(\cdot)$ probability
$\mathbf{y}$ vector
$\mathbf{Y}$ matrix
Problem: this system of notation does not distinguish between a vector (matrix) of random elements and a vector (matrix) of non-stochastic elements (realisations). See Abadir and Magnus (2002), Notation in econometrics: a proposal for a standard, Econometrics Journal.
The outline of this chapter is the following:
Section 2: What is an estimator?
Section 3: Finite sample properties
Section 4: Large sample properties
Subsection 4.1: Almost sure convergence
Subsection 4.2: Convergence in probability
Subsection 4.3: Convergence in mean square
Subsection 4.4: Convergence in distribution
Subsection 4.5: Asymptotic distributions
Section 2
What is an Estimator?
Objectives
1 Define the concept of estimator.
2 Define the concept of estimate.
3 Sampling distribution.
4 Discussion about the notion of "good" estimator.
Definition (Point estimator)
A point estimator is any function $T(Y_1, Y_2, \ldots, Y_N)$ of a sample. Any statistic is a point estimator.
Example (Sample mean)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample mean (or average)
$$\bar{Y}_N = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
is a point estimator (or an estimator) of $m$.
Example (Sample variance)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$$
is a point estimator (or an estimator) of $\sigma^2$.
Fact An estimator $\hat{\theta}$ is a random variable.
Consequence: $\hat{\theta}$ has a (marginal or conditional) probability distribution. This sampling distribution is characterized by a probability density function (pdf) $f_{\hat{\theta}}(u)$.
Definition (Sampling Distribution) The probability distribution of an estimator (or a statistic) is called the sampling distribution.
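The idea of a sampling distribution can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: normal data with $m = 0$ and $\sigma = 1$, a sample size of 5 and 2,000 replications are arbitrary choices, and the function name `sample_mean_draws` is hypothetical.

```python
import random
import statistics


def sample_mean_draws(n_obs, n_rep, m=0.0, sigma=1.0, seed=0):
    """Return n_rep realisations of the sample mean, each from n_obs normal draws."""
    rng = random.Random(seed)  # seeded generator so the run is reproducible
    return [
        statistics.fmean(rng.gauss(m, sigma) for _ in range(n_obs))
        for _ in range(n_rep)
    ]


# Each replication is a new sample, hence a new estimate of m.
draws = sample_mean_draws(n_obs=5, n_rep=2000)
print("first three estimates:", [round(d, 3) for d in draws[:3]])
print("mean of the estimates:", round(statistics.fmean(draws), 3))
print("variance of the estimates:", round(statistics.pvariance(draws), 3))
```

The list `draws` is an empirical approximation of the sampling distribution of the sample mean: the estimator is the rule, and each element of the list is one estimate.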
Fact An estimator $\hat{\theta}$ is a random variable.
Consequence: The sampling distribution of $\hat{\theta}$ is characterized by moments such as the expectation $E(\hat{\theta})$, the variance $V(\hat{\theta})$ and, more generally, the $k$th central moment defined by:
$$E\left[ \left( \hat{\theta} - E(\hat{\theta}) \right)^k \right] = \int \left( u - \mu_{\hat{\theta}} \right)^k f_{\hat{\theta}}(u) \, du \quad \forall k \in \mathbb{N}$$
$$E(\hat{\theta}) = \mu_{\hat{\theta}} = \int u \, f_{\hat{\theta}}(u) \, du$$
Definition (Point estimate) A (point) estimate is the realized value of an estimator (i.e. a number) that is obtained when a sample is actually taken. For an estimator $\hat{\theta}$ it can be denoted by $\hat{\theta}(y)$.
Example (Point estimate)
For instance $\bar{y}_N$ is an estimate of $m$.
$$\bar{y}_N = \frac{1}{N} \sum_{i=1}^{N} y_i$$
If $N = 3$ and $\{y_1, y_2, y_3\} = \{3, -1, 2\}$ then $\bar{y}_N = 1.333$.
If $N = 3$ and $\{y_1, y_2, y_3\} = \{4, -8, 1\}$ then $\bar{y}_N = -1$.
etc.
Question: What constitutes a good estimator?
1 The search for good estimators constitutes much of econometrics.
2 An estimator is a rule or strategy for using the data to estimate the parameter. It is defined before the data are drawn.
3 Our objective is to use the sample data to infer the value of a parameter or set of parameters, which we denote $\theta$.
4 Sampling distributions are used to make inferences about the population. The issue is to know whether the sampling distribution of the estimator $\hat{\theta}$ is informative about the value of $\theta$.
Question (cont'd): What constitutes a good estimator?
1 Obviously, some estimators are better than others.
  1 To take a simple example, your intuition should convince you that the sample mean would be a better estimator of the population mean than the sample minimum; the minimum is almost certain to underestimate the mean.
  2 Nonetheless, the minimum is not entirely without virtue; it is easy to compute, which is occasionally a relevant criterion.
2 The idea is to study the properties of the sampling distribution and especially its moments such as $E(\hat{\theta})$ (for the bias), $V(\hat{\theta})$ (for the precision), etc.
Question (cont'd): What constitutes a good estimator? Estimators are compared on the basis of a variety of attributes.
1 Finite sample properties (or finite sample distribution) of estimators are those attributes that can be compared regardless of the sample size (SECTION 3).
2 Some estimation problems involve characteristics that are unknown in finite samples. In these cases, estimators are compared on the basis of their large sample, or asymptotic, properties (SECTION 4).
Key Concepts Section 2
1 Point estimator
2 Point estimate
3 Sampling distribution
Section 3
Finite Sample Properties
Objectives
1 Define the concept of finite sample distribution.
2 Finite sample properties => What is a good estimator?
3 Unbiased estimator.
4 Comparison of two unbiased estimators.
5 FDCR or Cramer Rao bound.
6 Best Linear Unbiased Estimator (BLUE).
Definition (Finite sample properties and finite sample distribution) The finite sample properties of an estimator $\hat{\theta}$ correspond to the properties of its finite sample distribution (or exact distribution) defined for any sample size $N \in \mathbb{N}$.
Two cases:
1 In some particular cases, the finite sample distribution of the estimator is known. It corresponds to the distribution of the random variable $\hat{\theta}$ for any sample size $N$.
2 In most cases, the finite sample distribution is unknown, but we can study some specific moments (mean, variance, etc.) of this distribution (finite sample properties).
Example (Sample mean and finite sample distribution)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The estimator $\hat{m} = \bar{Y}_N$ (sample mean) also has a normal distribution:
$$\hat{m} = \frac{1}{N} \sum_{i=1}^{N} Y_i \sim \mathcal{N}\left( m, \frac{\sigma^2}{N} \right) \quad \forall N \in \mathbb{N}$$
Consequence: the finite sample distribution of $\hat{m}$ for any $N \in \mathbb{N}$ is fully characterized by $m$ and $\sigma^2$ (parameters that can be estimated).
Example: if $N = 3$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/3)$; if $N = 10$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/10)$, etc.
Proof: The sum of independent normal variables has a normal distribution with:
$$E(\hat{m}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{N m}{N} = m$$
$$V(\hat{m}) = V\left( \frac{1}{N} \sum_{i=1}^{N} Y_i \right) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{N \sigma^2}{N^2} = \frac{\sigma^2}{N}$$
since the variables $Y_i$ are
independent (then $\mathrm{cov}(Y_i, Y_j) = 0$ for $i \neq j$)
identically distributed (then $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, $\forall i \in [1, \ldots, N]$).
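The two moments derived in this proof can be checked numerically. The following is a minimal Monte Carlo sketch, assuming the arbitrary values $m = 2$ and $\sigma = 3$ (so $\sigma^2 = 9$); the function name `moments_of_sample_mean` is hypothetical.

```python
import random
import statistics


def moments_of_sample_mean(n_obs, n_rep=20000, m=2.0, sigma=3.0, seed=0):
    """Simulated mean and variance of the sample mean of n_obs N(m, sigma^2) draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(m, sigma) for _ in range(n_obs))
        for _ in range(n_rep)
    ]
    return statistics.fmean(means), statistics.pvariance(means)


for n_obs in (3, 10):
    avg, var = moments_of_sample_mean(n_obs)
    # Theory: E(m_hat) = m = 2 and V(m_hat) = sigma^2 / N = 9 / N.
    print(n_obs, round(avg, 3), round(var, 3), "theoretical variance:", 9.0 / n_obs)
```

The simulated variance should be close to $\sigma^2/N$ for each $N$, up to Monte Carlo error.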
Remarks
1 Except in very particular cases (normally distributed samples), the exact distribution of the estimator is very difficult to calculate.
2 Sometimes, it is possible to derive the exact distribution of a transformed variable $g(\hat{\theta})$, where $g(\cdot)$ is a continuous function.
Example (Sample variance and finite sample distribution)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$$
is an estimator of $\sigma^2$. The transformed variable $(N-1) S_N^2 / \sigma^2$ has a Chi-squared (exact / finite sample) distribution with $N - 1$ degrees of freedom:
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Proof: see Chapter 4.
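A quick numerical check of this result compares the simulated mean and variance of $(N-1) S_N^2/\sigma^2$ with those of a $\chi^2(N-1)$ variable, namely $N-1$ and $2(N-1)$. This is a sketch under assumed values ($N = 5$, $\sigma = 2$); `scaled_variance_draws` is a hypothetical helper name.

```python
import random
import statistics


def scaled_variance_draws(n_obs, n_rep=20000, m=0.0, sigma=2.0, seed=0):
    """Return n_rep realisations of (N - 1) * S_N^2 / sigma^2 for normal samples."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_rep):
        sample = [rng.gauss(m, sigma) for _ in range(n_obs)]
        y_bar = statistics.fmean(sample)
        sum_sq = sum((y - y_bar) ** 2 for y in sample)  # equals (N - 1) * S_N^2
        draws.append(sum_sq / sigma ** 2)
    return draws


stat = scaled_variance_draws(n_obs=5)
# A chi-squared(4) variable has mean 4 and variance 8.
print(round(statistics.fmean(stat), 2), round(statistics.pvariance(stat), 2))
```

Matching the first two moments does not prove the distributional result, but it is a cheap sanity check.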
Fact In most cases, it is impossible to derive the exact / finite sample distribution of the estimator (or of a transformed variable).
Two reasons:
1 In some cases, the exact distribution of $Y_1, Y_2, \ldots, Y_N$ is known, but the function $T(\cdot)$ is too complicated to derive the distribution of $\hat{\theta}$:
$$\hat{\theta} = T(Y_1, \ldots, Y_N) \sim \; ??? \quad \forall N \in \mathbb{N}$$
2 In most cases, the distribution of the sample variables $Y_1, Y_2, \ldots, Y_N$ is unknown:
$$\hat{\theta} = T(Y_1, \ldots, Y_N) \sim \; ??? \quad \forall N \in \mathbb{N}$$
Question: how can we evaluate the finite sample properties of the estimator $\hat{\theta}$ when its finite sample distribution is unknown?
$$\hat{\theta} \sim \; ??? \quad \forall N \in \mathbb{N}$$
Solution: We will focus on some specific moments of this (unknown) finite sample (sampling) distribution in order to study some properties of the estimator $\hat{\theta}$ and determine whether it is a "good" estimator or not.
Definition (Unbiased estimator) An estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased if the mean of its sampling distribution is $\theta$:
$$E(\hat{\theta}) = \theta$$
or
$$E(\hat{\theta} - \theta) = \mathrm{Bias}(\hat{\theta} \mid \theta) = 0$$
implies that $\hat{\theta}$ is unbiased. If $\theta$ is a vector of parameters, then the estimator is unbiased if the expected value of every element of $\hat{\theta}$ equals the corresponding element of $\theta$.
Source: Greene (2007), Econometrics
Example (Bernoulli distribution)
Let $Y_1, Y_2, \ldots, Y_N$ be a random sample from a Bernoulli distribution with a success probability $p$. An unbiased estimator of $p$ is
$$\hat{p} = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = p$, we have:
$$E(\hat{p}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{pN}{N} = p$$
Example (Uniform distribution)
Let $Y_1, Y_2, \ldots, Y_N$ be a random sample from a uniform distribution $U_{[0,\theta]}$. An unbiased estimator of $\theta$ is
$$\hat{\theta} = \frac{2}{N} \sum_{i=1}^{N} Y_i$$
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = (\theta + 0)/2 = \theta/2$, we have:
$$E(\hat{\theta}) = E\left( \frac{2}{N} \sum_{i=1}^{N} Y_i \right) = \frac{2}{N} \sum_{i=1}^{N} E(Y_i) = \frac{2}{N} \cdot \frac{N\theta}{2} = \theta$$
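The unbiasedness of $\hat{\theta} = (2/N) \sum Y_i$ can be illustrated by averaging the estimator over many simulated samples. This is only a sketch: the values $\theta = 5$ and $N = 4$ are arbitrary assumptions, and `average_estimate` is a hypothetical name.

```python
import random
import statistics


def average_estimate(theta, n_obs, n_rep=20000, seed=0):
    """Average of theta_hat = 2 * sample mean over n_rep samples from U[0, theta]."""
    rng = random.Random(seed)
    estimates = [
        2.0 * statistics.fmean(rng.uniform(0.0, theta) for _ in range(n_obs))
        for _ in range(n_rep)
    ]
    return statistics.fmean(estimates)


# Even with a very small sample (N = 4) the average estimate is close to theta.
print(round(average_estimate(theta=5.0, n_obs=4), 3))
```

Individual estimates can be far from $\theta$ when $N$ is small; unbiasedness is only a statement about their average.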
Example (Multiple linear regression model) Consider the model
$$y = X\beta + \mu$$
where $y \in \mathbb{R}^N$, $X \in \mathcal{M}_{N \times K}$ is a nonrandom matrix, $\beta \in \mathbb{R}^K$ is a vector of parameters, $E(\mu) = 0_{N \times 1}$ and $V(\mu) = \sigma^2 I_N$. The OLS estimator
$$\hat{\beta} = \left( X^\top X \right)^{-1} X^\top y$$
is an unbiased estimator of $\beta$.
Proof: Since $y = X\beta + \mu$, $X \in \mathcal{M}_{N \times K}$ is a nonrandom matrix and $E(\mu) = 0$, we have
$$E(y) = X\beta$$
As a consequence:
$$E(\hat{\beta}) = \left( X^\top X \right)^{-1} X^\top E(y) = \left( X^\top X \right)^{-1} X^\top X \beta = \beta$$
The estimator $\hat{\beta}$ is unbiased.
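The result can be illustrated in the special case $K = 2$ (an intercept plus one regressor), where $(X^\top X)^{-1} X^\top y$ reduces to the usual closed form for the slope and the intercept. The sketch below keeps the regressor fixed across replications, as in the nonrandom $X$ assumption; the normal errors, the parameter values and the function names are assumptions made for illustration only.

```python
import random
import statistics


def ols_simple(x, y):
    """OLS intercept and slope of y on (1, x): the K = 2 case of (X'X)^{-1} X'y."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def average_ols(beta0, beta1, x, sigma=1.0, n_rep=5000, seed=0):
    """Average OLS estimates over n_rep error draws, with x held fixed (nonrandom)."""
    rng = random.Random(seed)
    sum0 = sum1 = 0.0
    for _ in range(n_rep):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        b0, b1 = ols_simple(x, y)
        sum0 += b0
        sum1 += b1
    return sum0 / n_rep, sum1 / n_rep


x_fixed = [float(i) for i in range(10)]
# The averages should be close to the true values (1.0, 0.5).
print(average_ols(1.0, 0.5, x_fixed))
```

Only $E(\mu) = 0$ matters for unbiasedness; normality of the errors is used here purely as a convenient way to generate data.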
Remark:
Even though it does not strictly belong in a section devoted to the finite sample properties of estimators, we can introduce here the notion of asymptotically unbiased estimator (which can be considered as a large sample property).
Here we assume that the estimator $\hat{\theta} = \hat{\theta}_N$ depends on the sample size $N$.
Definition (Asymptotically unbiased estimator)
The sequence of estimators $\hat{\theta}_N$ (with $N \in \mathbb{N}$) is asymptotically unbiased if
$$\lim_{N \to \infty} E(\hat{\theta}_N) = \theta$$
Example (Sample variance)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The uncorrected sample variance defined by
$$\tilde{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$$
is a biased estimator of $\sigma^2$ but is asymptotically unbiased.
Proof: We know that:
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$$
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Since we have a relationship between $\tilde{S}_N^2$ and $S_N^2$, such that:
$$\tilde{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2 = \frac{N-1}{N} S_N^2$$
we get:
$$\frac{N}{\sigma^2} \tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Proof (cont'd):
$$\frac{N}{\sigma^2} \tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Reminder: If $X \sim \chi^2(v)$, then $E(X) = v$ and $V(X) = 2v$. As a consequence:
$$E\left( \frac{N}{\sigma^2} \tilde{S}_N^2 \right) = N - 1$$
or equivalently:
$$E\left( \tilde{S}_N^2 \right) = \frac{N-1}{N} \sigma^2 \neq \sigma^2$$
So, $\tilde{S}_N^2 = (1/N) \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$ is a biased estimator of $\sigma^2$.
Proof (cont'd): But $\tilde{S}_N^2 = (1/N) \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$ is asymptotically unbiased since:
$$\lim_{N \to \infty} E\left( \tilde{S}_N^2 \right) = \lim_{N \to \infty} \frac{N-1}{N} \sigma^2 = \sigma^2$$
Remark: Even in a more general framework (non-normal), the sample variance (with a correction for small sample) is an unbiased estimator of $\sigma^2$:
$$S_N^2 = \underbrace{(N-1)^{-1}}_{\text{correction for small sample}} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2 \qquad E\left( S_N^2 \right) = \sigma^2$$
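The bias of the uncorrected variance and the effect of the small sample correction are easy to see in a simulation. The sketch below assumes standard normal data and $N = 5$, so the theoretical expectations are $(N-1)/N \cdot \sigma^2 = 0.8$ and $\sigma^2 = 1$; the function name is hypothetical.

```python
import random
import statistics


def average_variance_estimates(n_obs, n_rep=20000, sigma=1.0, seed=0):
    """Average of the uncorrected (1/N) and corrected (1/(N-1)) variance estimators."""
    rng = random.Random(seed)
    total_uncorrected = total_corrected = 0.0
    for _ in range(n_rep):
        sample = [rng.gauss(0.0, sigma) for _ in range(n_obs)]
        y_bar = statistics.fmean(sample)
        sum_sq = sum((y - y_bar) ** 2 for y in sample)
        total_uncorrected += sum_sq / n_obs      # divides by N
        total_corrected += sum_sq / (n_obs - 1)  # divides by N - 1
    return total_uncorrected / n_rep, total_corrected / n_rep


for n_obs in (5, 50):
    uncorrected, corrected = average_variance_estimates(n_obs)
    # The gap between the two averages shrinks as N grows.
    print(n_obs, round(uncorrected, 3), round(corrected, 3))
```

With $N = 50$ the uncorrected estimator averages about $0.98$, which illustrates why it is called asymptotically unbiased.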
Unbiasedness is interesting per se but not so much!
1 The absence of bias is not a sufficient criterion to discriminate among competing estimators.
2 There may exist many unbiased estimators for the same parameter (vector) of interest.
Example (Estimators)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$. The statistics
$$\hat{m}_1 = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
$$\hat{m}_2 = Y_1$$
are unbiased estimators of $m$.
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = m$, we have:
$$E(\hat{m}_1) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{N m}{N} = m$$
$$E(\hat{m}_2) = E(Y_1) = m$$
Both estimators $\hat{m}_1$ and $\hat{m}_2$ of the parameter $m$ are unbiased.
How to compare two unbiased estimators?
When two (or more) estimators are unbiased, the best one is the most precise, i.e. the estimator with the minimum variance.
Comparing two (or more) unbiased estimators becomes equivalent to comparing their variance-covariance matrices.
Definition
Suppose that $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators. $\hat{\theta}_1$ dominates $\hat{\theta}_2$, i.e. $\hat{\theta}_1 \succeq \hat{\theta}_2$, if and only if
$$V(\hat{\theta}_1) \leq V(\hat{\theta}_2)$$
In the case where $\hat{\theta}_1$, $\hat{\theta}_2$ and $\theta$ are vectors, this inequality becomes:
$$V(\hat{\theta}_2) - V(\hat{\theta}_1) \text{ is a positive semi-definite matrix}$$
[Figure: sampling densities of two unbiased estimators, "Estimator 1" and "Estimator 2", plotted against $\theta$]
Example (Estimators)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$. The estimator $\hat{m}_1 = N^{-1} \sum_{i=1}^{N} Y_i$ dominates the estimator $\hat{m}_2 = Y_1$.
Proof: The two estimators $\hat{m}_1$ and $\hat{m}_2$ are unbiased, so they can be compared in terms of variance (precision):
$$V(\hat{m}_1) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{N \sigma^2}{N^2} = \frac{\sigma^2}{N} \quad \text{since the } Y_i \text{ are i.i.d.}$$
$$V(\hat{m}_2) = V(Y_1) = \sigma^2$$
So, $V(\hat{m}_1) \leq V(\hat{m}_2)$: the estimator $\hat{m}_1$ is preferred to $\hat{m}_2$, $\hat{m}_1 \succeq \hat{m}_2$.
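The comparison of the two estimators can be reproduced numerically. This sketch assumes normal data with $m = 1$, $\sigma = 2$ and $N = 10$ (the example itself only requires i.i.d. data with finite variance), so the theoretical variances are $\sigma^2/N = 0.4$ and $\sigma^2 = 4$; the function name is hypothetical.

```python
import random
import statistics


def variances_of_two_estimators(n_obs, n_rep=20000, m=1.0, sigma=2.0, seed=0):
    """Simulated variances of m1_hat (sample mean) and m2_hat (first observation)."""
    rng = random.Random(seed)
    m1_draws, m2_draws = [], []
    for _ in range(n_rep):
        sample = [rng.gauss(m, sigma) for _ in range(n_obs)]
        m1_draws.append(statistics.fmean(sample))  # m1_hat uses all N observations
        m2_draws.append(sample[0])                 # m2_hat uses Y_1 only
    return statistics.pvariance(m1_draws), statistics.pvariance(m2_draws)


v1, v2 = variances_of_two_estimators(n_obs=10)
print("V(m1_hat) ~", round(v1, 3), " V(m2_hat) ~", round(v2, 3))
```

Both estimators are centred on $m$, but $\hat{m}_1$ is about $N$ times more precise.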
Question: is there a bound for the variance of the unbiased estimators?
Definition (Cramér-Rao or FDCR bound)
Let $X_1, \ldots, X_N$ be an i.i.d. sample with pdf $f_X(\theta; x)$. Let $\hat{\theta}$ be an unbiased estimator of $\theta$; i.e., $E_\theta(\hat{\theta}) = \theta$. If $f_X(\theta; x)$ is regular then
$$V_\theta(\hat{\theta}) \geq I_N^{-1}(\theta_0) = \text{FDCR or Cramér-Rao bound}$$
where $I_N(\theta_0)$ denotes the Fisher information number for the sample evaluated at the true value $\theta_0$. If $\theta$ is a vector then this inequality means that $V_\theta(\hat{\theta}) - I_N^{-1}(\theta_0)$ is positive semi-definite.
FDCR: Fréchet - Darmois - Cramér and Rao
Remark: we will define the Fisher information matrix (or number) in Chapter 2 (Maximum Likelihood Estimation).
Definition (Efficiency) An estimator is efficient if its variance attains the FDCR (Fréchet - Darmois - Cramér - Rao) or Cramér-Rao bound:
$$V_\theta(\hat{\theta}) = I_N^{-1}(\theta_0)$$
where $I_N(\theta_0)$ denotes the Fisher information matrix associated with the sample, evaluated at the true value $\theta_0$.
Finally, note that in some cases we further restrict the set of estimators to linear functions of the data.
Definition (BLUE estimator) An estimator is the minimum variance linear unbiased estimator or best linear unbiased estimator (BLUE) if it is a linear function of the data and has minimum variance among linear unbiased estimators.
Remark: the term "linear" means that the estimator $\hat{\theta}$ is a linear function of the data $Y_i$:
$$\hat{\theta}_j = \sum_{i=1}^{N} \omega_{ij} Y_i$$
Key Concepts Section 3
1 Finite sample distribution
2 Finite sample properties
3 Bias and unbiased estimator
4 Comparison of unbiased estimators
5 Cramér-Rao or FDCR bound
6 Efficient estimator
7 Linear estimator
8 BLUE estimator
Section 4
Asymptotic Properties
Problem:
1 Let us consider an i.i.d. sample $Y_1, Y_2, \ldots, Y_N$, where $Y$ has a pdf $f_Y(y;\theta)$ and $\theta$ is an unknown parameter.
2 We assume that $f_Y(y;\theta)$ is also unknown (we do not know the distribution of $Y_i$).
3 We consider an estimator $\hat{\theta}$ (also denoted $\hat{\theta}_N$ to show that it depends on $N$) such that
$$\hat{\theta} = T(Y_1, Y_2, \ldots, Y_N)$$
4 The finite sample distribution of $\hat{\theta}_N$ is unknown:
$$\hat{\theta}_N \sim \; ??? \quad \forall N \in \mathbb{N}$$
Question: what is the behavior of the random variable $\hat{\theta}_N$ when the sample size $N$ tends to infinity?
Definition (Asymptotic theory) Asymptotic or large sample theory consists in the study of the distribution of the estimator when the sample size is sufficiently large.
The asymptotic theory is fundamentally based on the notion of convergence...
We are mainly concerned with four modes of convergence:
1 Almost sure convergence
2 Convergence in probability
3 Convergence in quadratic mean
4 Convergence in distribution.
4.1. Almost Sure Convergence
Definition (Almost sure convergence)
Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$ converges almost surely (or with probability 1 or strongly) to a constant $c$, if, for every $\varepsilon > 0$,
$$\Pr\left( \lim_{N \to \infty} \left| X_N - c \right| < \varepsilon \right) = 1$$
or equivalently if:
$$\Pr\left( \lim_{N \to \infty} X_N = c \right) = 1$$
It is written
$$X_N \xrightarrow{a.s.} c$$
Comments
1 Almost sure convergence means that the values of $X_N$ approach the value $c$, in the sense (hence "almost surely") that the events for which $X_N$ does not converge to $c$ have probability 0.
2 In other words, when $N$ tends to infinity, the random variable $X_N$ tends to a degenerate random variable (a random variable which only takes a single value $c$), whose distribution reduces to a probability mass at $c$.
[Figure: distribution of a degenerate random variable, with all the probability mass at $c = 2$]
Definition (Strong consistency)
A point estimator $\hat{\theta}_N$ of $\theta$ is strongly consistent if:
$$\hat{\theta}_N \xrightarrow{a.s.} \theta$$
Comments When $N \to \infty$, the estimator tends to a degenerate random variable that takes a single value equal to $\theta$.
The crème de la crème (best of the best) of the estimators.
4.2. Convergence in Probability
Definition (Convergence in probability)
Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$ converges in probability to a constant $c$, if, for any $\varepsilon > 0$,
$$\lim_{N \to \infty} \Pr\left( \left| X_N - c \right| > \varepsilon \right) = 0$$
It is written
$$X_N \xrightarrow{p} c \quad \text{or} \quad \operatorname{plim} X_N = c$$
[Figure: density of $X_N$ around $c = 2$ with the band from $c - \varepsilon$ to $c + \varepsilon$; the area outside the band ("This area tends to 0") illustrates $\lim_{N \to \infty} \Pr(|X_N - c| > \varepsilon) = 0$]
[Figure: the same density for a very small $\varepsilon$, sharply concentrated around $c = 2$]
Comments
1 The general idea is the same as for a.s. convergence: $X_N$ tends to a degenerate random variable equal to $c$ (even if this is not exactly the case).
2 But when $X_N$ is very likely to be close to $c$ for large $N$, what about the location of the remaining small probability mass which is not close to $c$?
3 Convergence in probability allows more erratic behavior in the converging sequence than almost sure convergence.
Remark The notation
$$X_N \xrightarrow{p} X$$
where $X$ is a random element (scalar, vector, matrix) means that the variable $X_N - X$ converges to $c = 0$:
$$X_N - X \xrightarrow{p} 0$$
Definition (Weak consistency)
A point estimator $\hat{\theta}_N$ of $\theta$ is (weakly) consistent if:
$$\hat{\theta}_N \xrightarrow{p} \theta$$
Remark: In econometrics, in most cases, we only consider weak consistency. When we say that an estimator is "consistent", it generally refers to convergence in probability.
Lemma (Convergence in probability)
Let $X_N$ be a sequence of random variables indexed by the sample size and $c$ a constant. If
$$\lim_{N \to \infty} E(X_N) = c$$
$$\lim_{N \to \infty} V(X_N) = 0$$
then $X_N$ converges in probability to $c$ as $N \to \infty$:
$$X_N \xrightarrow{p} c$$
Example (Consistent estimator)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, where $\sigma^2$ is known and $m$ is unknown. The estimator $\hat{m}$, defined by
$$\hat{m} = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
is a consistent estimator of $m$.
Proof: Since $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, we have:
$$E(\hat{m}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = m$$
$$\lim_{N \to \infty} V(\hat{m}) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \lim_{N \to \infty} \frac{\sigma^2}{N} = 0$$
The estimator $\hat{m}$ is (weakly) consistent:
$$\hat{m} \xrightarrow{p} m$$
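Weak consistency can be visualised by estimating $\Pr(|\hat{m} - m| > \varepsilon)$ for increasing sample sizes. The sketch below is an illustration under assumed values (standard normal data, $\varepsilon = 0.2$, 500 replications); the function name `exceedance_frequency` is hypothetical.

```python
import random
import statistics


def exceedance_frequency(n_obs, eps=0.2, n_rep=500, m=0.0, sigma=1.0, seed=0):
    """Share of replications in which |m_hat - m| exceeds eps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        m_hat = statistics.fmean(rng.gauss(m, sigma) for _ in range(n_obs))
        if abs(m_hat - m) > eps:
            hits += 1
    return hits / n_rep


for n_obs in (10, 100, 1000):
    # The frequency should shrink towards 0 as N grows.
    print(n_obs, exceedance_frequency(n_obs))
```

The frequencies are Monte Carlo approximations of the probability in the definition of convergence in probability, so they fluctuate from one seed to another.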
Example (Consistent estimator)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance defined by
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( Y_i - \bar{Y}_N \right)^2$$
is a (weakly) consistent estimator of $\sigma^2$.
Proof: We know that for a normal sample:
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
$$E\left( \frac{(N-1)}{\sigma^2} S_N^2 \right) = N - 1 \qquad V\left( \frac{(N-1)}{\sigma^2} S_N^2 \right) = 2(N-1)$$
We get immediately:
$$E\left( S_N^2 \right) = \sigma^2$$
$$\lim_{N \to \infty} V\left( S_N^2 \right) = \lim_{N \to \infty} \frac{2\sigma^4}{N-1} = 0$$
The estimator $S_N^2$ is (weakly) consistent: $S_N^2 \xrightarrow{p} \sigma^2$.
Lemma (Chain of implication) Almost sure convergence implies convergence in probability:
$$\xrightarrow{a.s.} \; \Longrightarrow \; \xrightarrow{p}$$
where the symbol "$\Longrightarrow$" means "implies". The converse is not true.
Comments
1 One of the main applications of convergence in probability and almost sure convergence is the law of large numbers.
2 The law of large numbers tells you that the sample mean converges in probability (weak law of large numbers) or almost surely (strong law of large numbers) to the population mean:
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow[N \to \infty]{} E(X_i)$$
Theorem (Weak law of large numbers, Khinchine)
If $\{X_i\}$, for $i = 1, \ldots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables with finite mean $E(X_i) = \mu$ ($< \infty$), then the sample mean $\bar{X}_N$ converges in probability to $\mu$:
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow{p} E(X_i) = \mu$$
Theorem (Strong law of large numbers, Kolmogorov)
If $\{X_i\}$, for $i = 1, \ldots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables such that $E(X_i) = \mu$ ($< \infty$) and $E(|X_i|) < \infty$, then the sample mean $\bar{X}_N$ converges almost surely to $\mu$:
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow{a.s.} E(X_i) = \mu$$
Illustration:
1 Let us consider a random variable $X_i \sim U_{[0,10]}$ and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.
2 Compute the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$.
3 Repeat this procedure 500 times. We get 500 realisations of the sample mean $\bar{x}_N$.
4 Build a histogram of these 500 realisations.
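The four steps of this illustration can be sketched as follows. The code uses unit-width bins on $[0, 10]$ and a text summary instead of a plotted histogram; these choices, the seed and the function names are assumptions made for the illustration, and larger sample sizes such as $N = 10{,}000$ can be added to the loop at the cost of a longer run.

```python
import random
import statistics


def simulate_sample_means(n_obs, n_rep=500, seed=0):
    """Steps 1-3: n_rep realisations of the mean of n_obs draws from U[0, 10]."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.uniform(0.0, 10.0) for _ in range(n_obs))
        for _ in range(n_rep)
    ]


def histogram_counts(values, n_bins=10, low=0.0, high=10.0):
    """Step 4: number of values falling in each of n_bins equal-width bins."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for v in values:
        index = min(int((v - low) / width), n_bins - 1)  # put v == high in the last bin
        counts[index] += 1
    return counts


for n_obs in (10, 100, 1000):
    means = simulate_sample_means(n_obs)
    print("N =", n_obs, "counts per bin:", histogram_counts(means),
          "std. dev.:", round(statistics.pstdev(means), 3))
```

As $N$ grows, the counts pile up in the two bins next to the population mean $E(X_i) = 5$ and the standard deviation of the 500 sample means shrinks.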
[Figure: histograms of the 500 realisations of the sample mean, four panels: N = 10, N = 100, N = 1,000 and N = 10,000; horizontal axis from 0 to 10]
Proof: There are many proofs of the law of large numbers. Most of them use the additional assumption of finite variance $V(X_i) = \sigma^2$ and Chebyshev's inequality.
Theorem (Chebyshev's inequality) Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,
$$\Pr\left( \left| X - \mu \right| \geq k\sigma \right) \leq \frac{1}{k^2}$$
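Chebyshev's inequality holds for any distribution with finite variance, so the bound is often far from tight. The sketch below compares the empirical tail frequency with $1/k^2$ for an exponential distribution with rate 1 (for which $\mu = \sigma = 1$); the choice of distribution, the number of draws and the function name are assumptions made for illustration.

```python
import random


def tail_frequency(k, n_draws=50000, seed=0):
    """Empirical Pr(|X - mu| >= k * sigma) for X exponential with rate 1."""
    rng = random.Random(seed)
    mu = sigma = 1.0  # mean and standard deviation of the rate-1 exponential
    hits = sum(
        1 for _ in range(n_draws)
        if abs(rng.expovariate(1.0) - mu) >= k * sigma
    )
    return hits / n_draws


for k in (1.5, 2.0, 3.0):
    print("k =", k, "frequency:", round(tail_frequency(k), 4),
          "Chebyshev bound:", round(1.0 / k ** 2, 4))
```

In each case the observed frequency stays below the bound, usually by a wide margin.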
Proof (cont'd): Under the assumption of i.i.d. $(\mu, \sigma^2)$ random variables, we have that: