Chapter 1: Estimation Theory Advanced Econometrics - HEC Lausanne


Christophe Hurlin

University of Orléans

November 20, 2013

Christophe Hurlin (University of Orléans), Advanced Econometrics - HEC Lausanne, November 20, 2013

Section 1

Introduction

Estimation problem

Let us consider a continuous random variable $Y$ characterized by a marginal probability density function $f_Y(y; \theta)$ for $y \in \mathbb{R}$ and $\theta \in \Theta$. The parameter $\theta$ is unknown.

Let $\{Y_1, \dots, Y_N\}$ be a random sample of i.i.d. random variables that have the same distribution as $Y$.

We have one realisation $\{y_1, \dots, y_N\}$ of this sample.

How to estimate the parameter $\theta$?

Remarks

1 The estimation problem can be extended to the case of an econometric model. In this case we consider two variables $Y$ and $X$ and a conditional pdf $f_{Y|X=x}(y; \theta)$ that depends on a parameter or a vector of unknown parameters $\theta$.

2 In this chapter, we don't derive the estimators (for the estimation methods, see next chapters). We admit that we have an estimator $\hat{\theta}$ for $\theta$, whatever the estimation method used, and we study its finite sample and large sample properties.

Notations: In this course, I will (try to...) follow some conventions of notation.

$Y$: random variable
$y$: realisation
$f_Y(y)$: probability density or mass function
$F_Y(y)$: cumulative distribution function
$\Pr(\cdot)$: probability
$\mathbf{y}$: vector
$\mathbf{Y}$: matrix

Problem: this system of notations does not distinguish between a vector (matrix) of random elements and a vector (matrix) of non-stochastic elements (realisation). See Abadir and Magnus (2002), Notation in econometrics: a proposal for a standard, Econometrics Journal.

The outline of this chapter is the following:

Section 2: What is an estimator?
Section 3: Finite sample properties
Section 4: Large sample properties
   Subsection 4.1: Almost sure convergence
   Subsection 4.2: Convergence in probability
   Subsection 4.3: Convergence in mean square
   Subsection 4.4: Convergence in distribution
   Subsection 4.5: Asymptotic distributions

Section 2

What is an Estimator?

Objectives

1 Define the concept of estimator.

2 Define the concept of estimate.

3 Sampling distribution.

4 Discussion about the notion of a "good" estimator.

Definition (Point estimator)

A point estimator is any function $T(Y_1, Y_2, \dots, Y_N)$ of a sample. Any statistic is a point estimator.

Example (Sample mean)

Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample mean (or average)

$$\bar{Y}_N = \frac{1}{N} \sum_{i=1}^{N} Y_i$$

is a point estimator (or an estimator) of $m$.

Example (Sample variance)

Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance

$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$

is a point estimator (or an estimator) of $\sigma^2$.

Fact

An estimator $\hat{\theta}$ is a random variable.

Consequence: $\hat{\theta}$ has a (marginal or conditional) probability distribution. This sampling distribution is characterized by a probability density function (pdf) $f_{\hat{\theta}}(u)$.

Definition (Sampling Distribution)

The probability distribution of an estimator (or a statistic) is called the sampling distribution.

Consequence: The sampling distribution of $\hat{\theta}$ is characterized by moments such as the expectation $E(\hat{\theta})$, the variance $V(\hat{\theta})$ and, more generally, the $k$th central moment defined by:

$$E\left[\left(\hat{\theta} - E(\hat{\theta})\right)^k\right] = \int \left(u - \mu_{\hat{\theta}}\right)^k f_{\hat{\theta}}(u)\, du \quad \forall k \in \mathbb{N}$$

$$\mu_{\hat{\theta}} = E(\hat{\theta}) = \int u\, f_{\hat{\theta}}(u)\, du$$

Definition (Point estimate)

A (point) estimate is the realized value of an estimator (i.e. a number) that is obtained when a sample is actually taken. For an estimator $\hat{\theta}$ it can be denoted by $\hat{\theta}(y)$.

Example (Point estimate)

For instance $\bar{y}_N$ is an estimate of $m$.

$$\bar{y}_N = \frac{1}{N} \sum_{i=1}^{N} y_i$$

If $N = 3$ and $\{y_1, y_2, y_3\} = \{3, -1, 2\}$ then $\bar{y}_N = 1.333$.
If $N = 3$ and $\{y_1, y_2, y_3\} = \{4, -8, 1\}$ then $\bar{y}_N = -1$.
etc.
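The two point estimates can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `sample_mean_estimate` is hypothetical, and the samples are read as {3, -1, 2} and {4, -8, 1}, an assumption that is consistent with the reported mean of 1.333 for the first sample.

```python
from statistics import mean


def sample_mean_estimate(sample):
    """Return the realized value of the sample-mean estimator for one sample."""
    return mean(sample)


# Assumed realisations (signs reconstructed so that the first mean equals 1.333).
first = sample_mean_estimate([3, -1, 2])
second = sample_mean_estimate([4, -8, 1])

print(round(first, 3), second)
```

The same rule (the estimator) applied to two different realisations gives two different numbers (the estimates).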

Question: What constitutes a good estimator?

1 The search for good estimators constitutes much of econometrics.

2 An estimator is a rule or strategy for using the data to estimate the parameter. It is de…ned before the data are drawn.

3 Our objective is to use the sample data to infer the value of a parameter or set of parameters, which we denote θ.

4 Sampling distributions are used to make inferences about the population. The issue is to know if the sampling distribution of the estimator $\hat{\theta}$ is informative about the value of $\theta$...

Question (cont'd): What constitutes a good estimator?

1 Obviously, some estimators are better than others.

   1 To take a simple example, your intuition should convince you that the sample mean would be a better estimator of the population mean than the sample minimum; the minimum is almost certain to underestimate the mean.
   2 Nonetheless, the minimum is not entirely without virtue; it is easy to compute, which is occasionally a relevant criterion.

2 The idea is to study the properties of the sampling distribution and especially its moments such as $E(\hat{\theta})$ (for the bias), $V(\hat{\theta})$ (for the precision), etc.

Question (cont'd): What constitutes a good estimator? Estimators are compared on the basis of a variety of attributes.

1 Finite sample properties (or finite sample distribution) of estimators are those attributes that can be compared regardless of the sample size (SECTION 3).

2 Some estimation problems involve characteristics that are unknown in finite samples. In these cases, estimators are compared on the basis of their large sample, or asymptotic, properties (SECTION 4).

Key Concepts Section 2

1 Point estimator
2 Point estimate
3 Sampling distribution

Section 3

Finite Sample Properties

Objectives

1 Define the concept of finite sample distribution.

2 Finite sample properties => What is a good estimator?

3 Unbiased estimator.

4 Comparison of two unbiased estimators.

5 FDCR or Cramer Rao bound.

6 Best Linear Unbiased Estimator (BLUE).

Definition (Finite sample properties and finite sample distribution)

The finite sample properties of an estimator $\hat{\theta}$ correspond to the properties of its finite sample distribution (or exact distribution) defined for any sample size $N \in \mathbb{N}$.

Two cases:

1 In some particular cases, the finite sample distribution of the estimator is known. It corresponds to the distribution of the random variable $\hat{\theta}$ for any sample size $N$.

2 In most cases, the finite sample distribution is unknown, but we can study some specific moments (mean, variance, etc.) of this distribution (finite sample properties).

Example (Sample mean and finite sample distribution)

Assume that $Y_1, Y_2, \dots, Y_N$ are $\mathcal{N}$.i.d. $(m, \sigma^2)$ random variables (normally and independently distributed). The estimator $\hat{m} = \bar{Y}_N$ (sample mean) also has a normal distribution:

$$\hat{m} = \frac{1}{N} \sum_{i=1}^{N} Y_i \sim \mathcal{N}\left(m, \frac{\sigma^2}{N}\right) \quad \forall N \in \mathbb{N}$$

Consequence: the finite sample distribution of $\hat{m}$ for any $N \in \mathbb{N}$ is fully characterized by $m$ and $\sigma^2$ (parameters that can be estimated).

Example: if $N = 3$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/3)$; if $N = 10$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/10)$, etc.

Proof: The sum of independent normal variables has a normal distribution with:

$$E(\hat{m}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{Nm}{N} = m$$

$$V(\hat{m}) = V\left(\frac{1}{N} \sum_{i=1}^{N} Y_i\right) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N}$$

since the variables $Y_i$ are

independent (then $\mathrm{cov}(Y_i, Y_j) = 0$ for $i \neq j$)
identically distributed (then $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, $\forall i \in [1, \dots, N]$).
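A small Monte Carlo experiment can illustrate this result. The sketch below uses illustrative values m = 2, σ = 3 and N = 10 (assumptions, not taken from the course) and checks that the simulated sample means have an average close to m and a variance close to σ²/N. The helper name `simulate_sample_means` is hypothetical.

```python
import random


def simulate_sample_means(n, m, sigma, replications, seed=0):
    """Draw `replications` normal samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(m, sigma) for _ in range(n)) / n
        for _ in range(replications)
    ]


# Illustrative parameter values (assumptions for this sketch only).
n, m, sigma = 10, 2.0, 3.0
means = simulate_sample_means(n, m, sigma, replications=20000)

empirical_mean = sum(means) / len(means)
empirical_var = sum((x - empirical_mean) ** 2 for x in means) / len(means)
theoretical_var = sigma ** 2 / n  # sigma^2 / N from the proof

print(f"mean of the sample means: {empirical_mean:.3f} (theory: {m})")
print(f"variance of the sample means: {empirical_var:.3f} (theory: {theoretical_var})")
```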

Remarks

1 Except in very particular cases (normally distributed samples), the exact distribution of the estimator is very difficult to calculate.

2 Sometimes, it is possible to derive the exact distribution of a transformed variable $g(\hat{\theta})$, where $g(.)$ is a continuous function.

Example (Sample variance and finite sample distribution)

Assume that $Y_1, Y_2, \dots, Y_N$ are $\mathcal{N}$.i.d. $(m, \sigma^2)$ random variables. The sample variance

$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$

is an estimator of $\sigma^2$. The transformed variable $(N-1) S_N^2 / \sigma^2$ has a Chi-squared (exact / finite sample) distribution with $N-1$ degrees of freedom:

$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$

Proof: see Chapter 4.
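The Chi-squared result can be checked numerically through its first two moments: a χ²(N-1) variable has mean N-1 and variance 2(N-1). The following sketch is a simulation under assumed values (N = 5, m = 0, σ = 2), not a proof; the function name `scaled_sample_variance` is hypothetical.

```python
import random


def scaled_sample_variance(sample, sigma):
    """Return (N-1) * S_N^2 / sigma^2 for one sample."""
    n = len(sample)
    ybar = sum(sample) / n
    s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)
    return (n - 1) * s2 / sigma ** 2


rng = random.Random(1)
n, m, sigma, replications = 5, 0.0, 2.0, 20000  # illustrative values

draws = [
    scaled_sample_variance([rng.gauss(m, sigma) for _ in range(n)], sigma)
    for _ in range(replications)
]
avg = sum(draws) / replications
var = sum((d - avg) ** 2 for d in draws) / replications

print(f"simulated mean: {avg:.3f} (chi-squared mean: {n - 1})")
print(f"simulated variance: {var:.3f} (chi-squared variance: {2 * (n - 1)})")
```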

Fact

In most cases, it is impossible to derive the exact / finite sample distribution for the estimator (or a transformed variable).

Two reasons:

1 In some cases, the exact distribution of $Y_1, Y_2, \dots, Y_N$ is known, but the function $T(.)$ is too complicated to derive the distribution of $\hat{\theta}$:

$$\hat{\theta} = T(Y_1, \dots, Y_N) \sim\ ??? \quad \forall N \in \mathbb{N}$$

2 In most cases, the distribution of the sample variables $Y_1, Y_2, \dots, Y_N$ is unknown...

$$\hat{\theta} = T(Y_1, \dots, Y_N) \sim\ ??? \quad \forall N \in \mathbb{N}$$

Question: how to evaluate the finite sample properties of the estimator $\hat{\theta}$ when its finite sample distribution is unknown?

$$\hat{\theta} \sim\ ??? \quad \forall N \in \mathbb{N}$$

Solution: We will focus on some specific moments of this (unknown) finite sample (sampling) distribution in order to study some properties of the estimator $\hat{\theta}$ and determine if it is a "good" estimator or not.

Definition (Unbiased estimator)

An estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased if the mean of its sampling distribution is $\theta$:

$$E(\hat{\theta}) = \theta$$

or

$$E(\hat{\theta} - \theta) = \mathrm{Bias}(\hat{\theta} \mid \theta) = 0$$

implies that $\hat{\theta}$ is unbiased. If $\theta$ is a vector of parameters, then the estimator is unbiased if the expected value of every element of $\hat{\theta}$ equals the corresponding element of $\theta$.


Source: Greene (2007), Econometrics

Example (Bernoulli distribution)

Let $Y_1, Y_2, \dots, Y_N$ be a random sample from a Bernoulli distribution with a success probability $p$. An unbiased estimator of $p$ is

$$\hat{p} = \frac{1}{N} \sum_{i=1}^{N} Y_i$$

Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = p$, then we have:

$$E(\hat{p}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{pN}{N} = p$$

Example (Uniform distribution)

Let $Y_1, Y_2, \dots, Y_N$ be a random sample from a uniform distribution $\mathcal{U}_{[0,\theta]}$. An unbiased estimator of $\theta$ is

$$\hat{\theta} = \frac{2}{N} \sum_{i=1}^{N} Y_i$$

Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = (\theta + 0)/2 = \theta/2$, then we have:

$$E(\hat{\theta}) = E\left(\frac{2}{N} \sum_{i=1}^{N} Y_i\right) = \frac{2}{N} \sum_{i=1}^{N} E(Y_i) = \frac{2}{N} \cdot \frac{N\theta}{2} = \theta$$
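Unbiasedness of this estimator can be illustrated by simulation: averaged over many samples, twice the sample mean should be close to θ. The sketch below assumes θ = 5 and N = 20 purely for illustration; `theta_hat` is a hypothetical helper name.

```python
import random


def theta_hat(sample):
    """Estimator of theta for a U[0, theta] sample: twice the sample mean."""
    return 2.0 * sum(sample) / len(sample)


rng = random.Random(2)
theta, n, replications = 5.0, 20, 20000  # illustrative values

estimates = [
    theta_hat([rng.uniform(0.0, theta) for _ in range(n)])
    for _ in range(replications)
]
avg_estimate = sum(estimates) / replications

print(f"average of the estimates: {avg_estimate:.3f} (true theta: {theta})")
```

Each individual estimate differs from θ; only the mean of the sampling distribution coincides with it.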

Example (Multiple linear regression model)

Consider the model

$$y = X\beta + \mu$$

where $y \in \mathbb{R}^N$, $X \in \mathcal{M}_{N \times K}$ is a nonrandom matrix, $\beta \in \mathbb{R}^K$ is a vector of parameters, $E(\mu) = 0_{N \times 1}$ and $V(\mu) = \sigma^2 I_N$. The OLS estimator

$$\hat{\beta} = \left(X^\top X\right)^{-1} X^\top y$$

is an unbiased estimator of $\beta$.

Proof: Since $y = X\beta + \mu$, $X \in \mathcal{M}_{N \times K}$ is a nonrandom matrix and $E(\mu) = 0$, we have $E(y) = X\beta$. As a consequence:

$$E(\hat{\beta}) = \left(X^\top X\right)^{-1} X^\top E(y) = \left(X^\top X\right)^{-1} X^\top X\beta = \beta$$

The estimator $\hat{\beta}$ is unbiased.
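As an illustration, the unbiasedness of OLS can be checked by simulation in the simplest case of one regressor plus a constant (K = 2), where the OLS formula reduces to the usual closed-form intercept and slope. The values β = (1, 0.5), σ = 2 and the regressor grid are assumptions made for this sketch, and `ols_simple` is a hypothetical helper.

```python
import random


def ols_simple(x, y):
    """OLS intercept and slope for y = b0 + b1 * x + u (the K = 2 case)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


rng = random.Random(3)
x = [float(i) for i in range(1, 21)]  # fixed (nonrandom) regressor
beta0, beta1, sigma, replications = 1.0, 0.5, 2.0, 5000  # illustrative values

estimates = []
for _ in range(replications):
    # New error draws each time; the regressor stays fixed.
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    estimates.append(ols_simple(x, y))

avg_b0 = sum(e[0] for e in estimates) / replications
avg_b1 = sum(e[1] for e in estimates) / replications
print(f"average intercept: {avg_b0:.3f} (true: {beta0})")
print(f"average slope: {avg_b1:.3f} (true: {beta1})")
```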

Remark:

Even if it is not relevant in a section devoted to the finite sample properties of estimators, we can introduce here the notion of asymptotically unbiased estimator (which can be considered as a large sample property).

Here we assume that the estimator $\hat{\theta} = \hat{\theta}_N$ depends on the sample size $N$.

Definition (Asymptotically unbiased estimator)

The sequence of estimators $\hat{\theta}_N$ (with $N \in \mathbb{N}$) is asymptotically unbiased if

$$\lim_{N \to \infty} E(\hat{\theta}_N) = \theta$$

Example (Sample variance)

Assume that $Y_1, Y_2, \dots, Y_N$ are $\mathcal{N}$.i.d. $(m, \sigma^2)$ random variables. The uncorrected sample variance defined by

$$\tilde{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$

is a biased estimator of $\sigma^2$ but is asymptotically unbiased.

Proof: We know that:

$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$

$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$

Since we have a relationship between $\tilde{S}_N^2$ and $S_N^2$, such that:

$$\tilde{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2 = \frac{N-1}{N} S_N^2$$

then we get:

$$\frac{N}{\sigma^2} \tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$

Proof (cont'd):

$$\frac{N}{\sigma^2} \tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$

Reminder: If $X \sim \chi^2(v)$, then $E(X) = v$ and $V(X) = 2v$. By definition:

$$E\left(\frac{N}{\sigma^2} \tilde{S}_N^2\right) = N - 1$$

or equivalently:

$$E(\tilde{S}_N^2) = \frac{N-1}{N} \sigma^2 \neq \sigma^2$$

So, $\tilde{S}_N^2 = (1/N) \sum_{i=1}^{N} (Y_i - \bar{Y}_N)^2$ is a biased estimator of $\sigma^2$.

Proof (cont'd): But $\tilde{S}_N^2 = (1/N) \sum_{i=1}^{N} (Y_i - \bar{Y}_N)^2$ is asymptotically unbiased since:

$$\lim_{N \to \infty} E(\tilde{S}_N^2) = \lim_{N \to \infty} \frac{N-1}{N} \sigma^2 = \sigma^2$$

Remark: Even in a more general framework (non-normal), the sample variance (with a correction for small sample) is an unbiased estimator of $\sigma^2$:

$$S_N^2 = \underbrace{(N-1)^{-1}}_{\text{correction for small sample}} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2 \qquad E(S_N^2) = \sigma^2$$
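The bias of the uncorrected variance and the effect of the small sample correction can be seen in a short simulation. This is a sketch under assumed values (N = 5, σ = 2, normal draws); with these values the theory gives an expectation of (N-1)σ²/N = 3.2 for the uncorrected estimator and σ² = 4 for the corrected one. The helper name `both_variances` is hypothetical.

```python
import random


def both_variances(sample):
    """Return (uncorrected, corrected) sample variances: divisors N and N-1."""
    n = len(sample)
    ybar = sum(sample) / n
    sum_sq = sum((y - ybar) ** 2 for y in sample)
    return sum_sq / n, sum_sq / (n - 1)


rng = random.Random(4)
n, sigma, replications = 5, 2.0, 40000  # illustrative values

total_uncorrected = 0.0
total_corrected = 0.0
for _ in range(replications):
    uncorrected, corrected = both_variances([rng.gauss(0.0, sigma) for _ in range(n)])
    total_uncorrected += uncorrected
    total_corrected += corrected

avg_uncorrected = total_uncorrected / replications
avg_corrected = total_corrected / replications
print(f"uncorrected: {avg_uncorrected:.3f} (theory: {(n - 1) / n * sigma ** 2:.1f})")
print(f"corrected:   {avg_corrected:.3f} (theory: {sigma ** 2:.1f})")
```

Increasing `n` in this sketch shrinks the gap between the two averages, which is the asymptotic unbiasedness stated in the proof.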

Unbiasedness is interesting per se but not so much!

1 The absence of bias is not a sufficient criterion to discriminate among competing estimators.

2 There may exist many unbiased estimators for the same parameter (vector) of interest.

Example (Estimators)

Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$. The statistics

$$\hat{m}_1 = \frac{1}{N} \sum_{i=1}^{N} Y_i$$

$$\hat{m}_2 = Y_1$$

are unbiased estimators of $m$.

Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = m$, then we have:

$$E(\hat{m}_1) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{Nm}{N} = m$$

$$E(\hat{m}_2) = E(Y_1) = m$$

Both estimators $\hat{m}_1$ and $\hat{m}_2$ of the parameter $m$ are unbiased.

How to compare two unbiased estimators?

When two (or more) estimators are unbiased, the best one is the most precise, i.e. the estimator with the minimum variance.

Comparing two (or more) unbiased estimates becomes equivalent to comparing their variance-covariance matrices.

Definition

Suppose that $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators. $\hat{\theta}_1$ dominates $\hat{\theta}_2$, i.e. $\hat{\theta}_1 \succeq \hat{\theta}_2$, if and only if

$$V(\hat{\theta}_1) \leq V(\hat{\theta}_2)$$

In the case where $\hat{\theta}_1$, $\hat{\theta}_2$ and $\theta$ are vectors, this inequality becomes:

$$V(\hat{\theta}_2) - V(\hat{\theta}_1) \text{ is a positive semi-definite matrix}$$

[Figure: sampling densities of Estimator 1 and Estimator 2, plotted against θ]

Example (Estimators)

Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$. The estimator $\hat{m}_1 = \frac{1}{N} \sum_{i=1}^{N} Y_i$ dominates the estimator $\hat{m}_2 = Y_1$.

Proof: The two estimators $\hat{m}_1$ and $\hat{m}_2$ are unbiased, so they can be compared in terms of variance (precision):

$$V(\hat{m}_1) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N} \quad \text{since the } Y_i \text{ are i.i.d.}$$

$$V(\hat{m}_2) = V(Y_1) = \sigma^2$$

So, $V(\hat{m}_1) \leq V(\hat{m}_2)$, the estimator $\hat{m}_1$ is preferred to $\hat{m}_2$: $\hat{m}_1 \succeq \hat{m}_2$.
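The variance comparison can be illustrated by simulating both estimators on the same samples. The sketch below assumes N = 10, m = 1, σ = 1 and normal draws (illustrative choices; the result only requires i.i.d. data with finite variance), so the theory gives a variance of 0.1 for the sample mean and 1 for the first observation.

```python
import random


def variance(values):
    """Population-style variance (divisor len(values)) of a list of numbers."""
    avg = sum(values) / len(values)
    return sum((v - avg) ** 2 for v in values) / len(values)


rng = random.Random(8)
n, m, sigma, replications = 10, 1.0, 1.0, 20000  # illustrative values

m1_draws, m2_draws = [], []
for _ in range(replications):
    sample = [rng.gauss(m, sigma) for _ in range(n)]
    m1_draws.append(sum(sample) / n)  # sample mean
    m2_draws.append(sample[0])        # first observation only

var_m1 = variance(m1_draws)
var_m2 = variance(m2_draws)
print(f"V(m1) = {var_m1:.3f} (theory: {sigma ** 2 / n})")
print(f"V(m2) = {var_m2:.3f} (theory: {sigma ** 2})")
```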


Question: is there a bound for the variance of the unbiased estimators?

Definition (Cramer-Rao or FDCR bound)

Let $X_1, \dots, X_N$ be an i.i.d. sample with pdf $f_X(\theta; x)$. Let $\hat{\theta}$ be an unbiased estimator of $\theta$; i.e., $E_\theta(\hat{\theta}) = \theta$. If $f_X(\theta; x)$ is regular then

$$V_\theta(\hat{\theta}) \geq I_N^{-1}(\theta_0) = \text{FDCR or Cramer-Rao bound}$$

where $I_N(\theta_0)$ denotes the Fisher information number for the sample evaluated at the true value $\theta_0$. If $\theta$ is a vector then this inequality means that $V_\theta(\hat{\theta}) - I_N^{-1}(\theta_0)$ is positive semi-definite.

FDCR: Fréchet - Darmois - Cramér and Rao

Remark: we will define the Fisher information matrix (or number) in Chapter 2 (Maximum Likelihood Estimation).

Definition (Efficiency)

An estimator is efficient if its variance attains the FDCR (Fréchet - Darmois - Cramér - Rao) or Cramer-Rao bound:

$$V_\theta(\hat{\theta}) = I_N^{-1}(\theta_0)$$

where $I_N(\theta_0)$ denotes the Fisher information matrix associated to the sample evaluated at the true value $\theta_0$.

Finally, note that in some cases we further restrict the set of estimators to linear functions of the data.

Definition (BLUE estimator)

An estimator is the minimum variance linear unbiased estimator or best linear unbiased estimator (BLUE) if it is a linear function of the data and has minimum variance among linear unbiased estimators.

Remark: the term "linear" means that the estimator $\hat{\theta}$ is a linear function of the data $Y_i$:

$$\hat{\theta}_j = \sum_{i=1}^{N} \omega_{ij} Y_i$$

Key Concepts Section 3

1 Finite sample distribution
2 Finite sample properties
3 Bias and unbiased estimator
4 Comparison of unbiased estimators
5 Cramer-Rao or FDCR bound
6 Efficient estimator
7 Linear estimator
8 BLUE estimator

Section 4

Asymptotic Properties

Problem:

1 Let us consider an i.i.d. sample $Y_1, Y_2, \dots, Y_N$, where $Y$ has a pdf $f_Y(y; \theta)$ and $\theta$ is an unknown parameter.

2 We assume that $f_Y(y; \theta)$ is also unknown (we do not know the distribution of $Y_i$).

3 We consider an estimator $\hat{\theta}$ (also denoted $\hat{\theta}_N$ to show that it depends on $N$) such that

$$\hat{\theta}_N = T(Y_1, Y_2, \dots, Y_N)$$

4 The finite sample distribution of $\hat{\theta}_N$ is unknown...

$$\hat{\theta}_N \sim\ ??? \quad \forall N \in \mathbb{N}$$

Question: what is the behavior of the random variable $\hat{\theta}_N$ when the sample size $N$ tends to infinity?

Definition (Asymptotic theory)

Asymptotic or large sample theory consists in the study of the distribution of the estimator when the sample size is sufficiently large.

The asymptotic theory is fundamentally based on the notion of convergence...


We are mainly concerned with four modes of convergence:

1 Almost sure convergence

2 Convergence in probability

3 Convergence in quadratic mean

4 Convergence in distribution.


4.1. Almost Sure Convergence

Definition (Almost sure convergence)

Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$ converges almost surely (or with probability 1 or strongly) to a constant $c$, if, for every $\varepsilon > 0$,

$$\Pr\left(\lim_{N \to \infty} \left|X_N - c\right| < \varepsilon\right) = 1$$

or equivalently if:

$$\Pr\left(\lim_{N \to \infty} X_N = c\right) = 1$$

It is written

$$X_N \xrightarrow{a.s.} c$$

Comments

1 The almost sure convergence means that the values of $X_N$ approach the value $c$, in the sense (hence "almost surely") that the events for which $X_N$ does not converge to $c$ have probability 0.

2 In other words, it means that when $N$ tends to infinity, the random variable $X_N$ tends to a degenerate random variable (a random variable which only takes a single value $c$), whose distribution is a point mass at $c$.

[Figure: degenerate distribution with a point mass at c = 2]

Definition (Strong consistency)

A point estimator $\hat{\theta}_N$ of $\theta$ is strongly consistent if:

$$\hat{\theta}_N \xrightarrow{a.s.} \theta$$

Comments

When $N \to \infty$, the estimator tends to a degenerate random variable that takes a single value equal to $\theta$.

The crème de la crème (best of the best) of the estimators....


4.2. Convergence in Probability

Definition (Convergence in probability)

Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$ converges in probability to a constant $c$, if, for any $\varepsilon > 0$,

$$\lim_{N \to \infty} \Pr\left(\left|X_N - c\right| > \varepsilon\right) = 0$$

It is written

$$X_N \xrightarrow{p} c \quad \text{or} \quad \mathrm{plim}\, X_N = c$$

$X_N \xrightarrow{p} c$ if $\lim_{N \to \infty} \Pr(|X_N - c| > \varepsilon) = 0$

[Figure: density of $X_N$ around c = 2 with the bounds c - ε and c + ε marked; the area outside these bounds tends to 0]

$X_N \xrightarrow{p} c$ if $\lim_{N \to \infty} \Pr(|X_N - c| > \varepsilon) = 0$ for a very small $\varepsilon$...

[Figure: density of $X_N$ sharply concentrated around c = 2]

Comments

1 The general idea is the same as for the a.s. convergence: $X_N$ tends to a degenerate random variable (even if it is not exactly the case) equal to $c$.

2 But when XN is very likely to be close to c for large N, what about the location of the remaining small probability mass which is not close to c?...

3 Convergence in probability allows more erratic behavior in the converging sequence than almost sure convergence.

Remark

The notation

$$X_N \xrightarrow{p} X$$

where $X$ is a random element (scalar, vector, matrix) means that the variable $X_N - X$ converges in probability to $c = 0$:

$$X_N - X \xrightarrow{p} 0$$

Definition (Weak consistency)

A point estimator $\hat{\theta}_N$ of $\theta$ is (weakly) consistent if:

$$\hat{\theta}_N \xrightarrow{p} \theta$$

Remark: In econometrics, in most cases, we only consider weak consistency. When we say that an estimator is "consistent", it generally refers to convergence in probability.

Lemma (Convergence in probability)

Let $X_N$ be a sequence of random variables indexed by the sample size and $c$ a constant. If

$$\lim_{N \to \infty} E(X_N) = c$$

$$\lim_{N \to \infty} V(X_N) = 0$$

then $X_N$ converges in probability to $c$ as $N \to \infty$:

$$X_N \xrightarrow{p} c$$

Example (Consistent estimator)

Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, where $\sigma^2$ is known and $m$ is unknown. The estimator $\hat{m}$, defined by

$$\hat{m} = \frac{1}{N} \sum_{i=1}^{N} Y_i$$

is a consistent estimator of $m$.

Proof: Since $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, we have:

$$E(\hat{m}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = m$$

$$\lim_{N \to \infty} V(\hat{m}) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \lim_{N \to \infty} \frac{\sigma^2}{N} = 0$$

The estimator $\hat{m}$ is (weakly) consistent:

$$\hat{m} \xrightarrow{p} m$$
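Consistency of the sample mean can be visualised by estimating the probability of missing m by more than ε for growing sample sizes. The sketch below is a simulation under assumed values (m = 1, σ = 2, ε = 0.2, normal draws); the function name `exceedance_frequency` is hypothetical.

```python
import random


def exceedance_frequency(rng, n, m, sigma, eps, replications):
    """Share of replications in which the sample mean misses m by more than eps."""
    misses = 0
    for _ in range(replications):
        m_hat = sum(rng.gauss(m, sigma) for _ in range(n)) / n
        if abs(m_hat - m) > eps:
            misses += 1
    return misses / replications


rng = random.Random(5)
m, sigma, eps = 1.0, 2.0, 0.2  # illustrative values

frequencies = {
    n: exceedance_frequency(rng, n, m, sigma, eps, replications=1000)
    for n in (10, 100, 1000)
}
for n, freq in frequencies.items():
    print(f"N = {n}: estimated Pr(|m_hat - m| > {eps}) = {freq:.3f}")
```

The estimated probabilities shrink toward 0 as N grows, which is what convergence in probability requires for every fixed ε.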

Example (Consistent estimator)

Assume that $Y_1, Y_2, \dots, Y_N$ are $\mathcal{N}$.i.d. $(m, \sigma^2)$ random variables. The sample variance defined by

$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$

is a (weakly) consistent estimator of $\sigma^2$.

Proof: We know that for a normal sample:

$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$

$$E\left(\frac{(N-1)}{\sigma^2} S_N^2\right) = N - 1 \qquad V\left(\frac{(N-1)}{\sigma^2} S_N^2\right) = 2(N-1)$$

We get immediately:

$$E(S_N^2) = \sigma^2$$

$$\lim_{N \to \infty} V(S_N^2) = \lim_{N \to \infty} \frac{2\sigma^4}{N-1} = 0$$

The estimator $S_N^2$ is (weakly) consistent: $S_N^2 \xrightarrow{p} \sigma^2$.

Lemma (Chain of implication)

The almost sure convergence implies the convergence in probability:

$$\xrightarrow{a.s.} \implies \xrightarrow{p}$$

where the symbol "$\implies$" means "implies". The converse is not true.


Comments

1 One of the main applications of the convergence in probability and the almost sure convergence is the law of large numbers.

2 The law of large numbers tells you that the sample mean converges in probability (weak law of large numbers) or almost surely (strong law of large numbers) to the population mean:

$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow[N \to \infty]{} E(X_i)$$

Theorem (Weak law of large numbers, Khinchine)

If $\{X_i\}$, for $i = 1, \dots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables with finite mean $E(X_i) = \mu$ ($< \infty$), then the sample mean $\bar{X}_N$ converges in probability to $\mu$:

$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow{p} E(X_i) = \mu$$

Theorem (Strong law of large numbers, Kolmogorov)

If $\{X_i\}$, for $i = 1, \dots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables such that $E(X_i) = \mu$ ($< \infty$) and $E(|X_i|) < \infty$, then the sample mean $\bar{X}_N$ converges almost surely to $\mu$:

$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow{a.s.} E(X_i) = \mu$$


Illustration:

1 Let us consider a random variable $X_i \sim U_{[0,10]}$ and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.

2 Compute the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$.

3 Repeat this procedure 500 times. We get 500 realisations of the sample mean $\bar{x}_N$.

4 Build a histogram of these 500 realisations.
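The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, assuming NumPy; the function name `sample_means`, the seed and the choice of 20 histogram bins are assumptions, not part of the original illustration.

```python
import numpy as np

def sample_means(n, reps=500, seed=0):
    """Steps 1-3: draw `reps` i.i.d. U[0,10] samples of size n and return their means."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=(reps, n))
    return x.mean(axis=1)

for n in (10, 100, 1000, 10000):
    means = sample_means(n)
    # Step 4: histogram of the 500 realisations on the support [0, 10]
    counts, edges = np.histogram(means, bins=20, range=(0.0, 10.0))
    print(n, round(float(means.mean()), 3), round(float(means.std()), 3))
```

As N increases, the realisations concentrate around the population mean $E(X_i) = 5$, which is what the weak law of large numbers predicts.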

Figure: histograms of the 500 realisations of the sample mean $\bar{x}_N$, in four panels (N = 10, N = 100, N = 1,000 and N = 10,000). Horizontal axis: value of the sample mean, from 0 to 10; vertical axis: from 0 to 20.


Proof: There are many proofs of the law of large numbers. Most of them use the additional assumption of finite variance $V (X_i) = \sigma^2$ and Chebyshev's inequality.

Theorem (Chebyshev's inequality)
Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,
$$\Pr \left( |X - \mu| \geq k \sigma \right) \leq \frac{1}{k^2}$$

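Proof (cont'd): Under the assumption of i.i.d. $(\mu, \sigma^2)$, we have that:

$$E \left( \bar{X}_N \right) = \mu \qquad V \left( \bar{X}_N \right) = \frac{\sigma^2}{N}$$
Given Chebyshev's inequality, we get for $k > 0$:
$$\Pr \left( \left| \bar{X}_N - \mu \right| \geq k \frac{\sigma}{\sqrt{N}} \right) \leq \frac{1}{k^2}$$

Let us define $\varepsilon > 0$ such that
$$\varepsilon = \frac{k \sigma}{\sqrt{N}} \iff k = \frac{\varepsilon \sqrt{N}}{\sigma}$$
Then we get for any $\varepsilon > 0$:
$$\Pr \left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right) \leq \frac{\sigma^2}{\varepsilon^2 N}$$

So, when $N \to \infty$, the limit of this probability is necessarily equal to 0 (a probability is non-negative, so "$\leq 0$" means "$= 0$"):

$$\lim_{N \to \infty} \Pr \left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right) = 0 \quad \forall \varepsilon > 0$$

Since $\Pr \left( \left| \bar{X}_N - \mu \right| < \varepsilon \right) = 1 - \Pr \left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right)$, we have:

$$\lim_{N \to \infty} \Pr \left( \left| \bar{X}_N - \mu \right| < \varepsilon \right) = 1 \quad \forall \varepsilon > 0$$

This is the WLLN. Recall also that:

$$\bar{X}_N \xrightarrow{a.s.} \mu \ \text{(SLLN)} \implies \bar{X}_N \xrightarrow{p} \mu \ \text{(WLLN)}$$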
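The Chebyshev bound $\sigma^2 / (\varepsilon^2 N)$ used in this proof can be compared with simulated tail frequencies. The sketch below is an illustration under stated assumptions: it reuses the $U_{[0,10]}$ design of the earlier illustration (so $\mu = 5$ and $\sigma^2 = 100/12$), and the names `tail_frequency` and `chebyshev_bound` as well as $\varepsilon = 0.5$ are hypothetical choices.

```python
import numpy as np

def tail_frequency(n, eps=0.5, reps=2000, seed=0):
    """Simulated frequency of the event |mean - mu| >= eps for U[0,10] samples of size n."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=(reps, n))
    dev = np.abs(x.mean(axis=1) - 5.0)
    return float(np.mean(dev >= eps))

def chebyshev_bound(n, eps=0.5, sigma2=100.0 / 12.0):
    """Upper bound sigma^2 / (eps^2 * N), capped at 1 since it bounds a probability."""
    return min(1.0, sigma2 / (eps**2 * n))

for n in (10, 100, 1000):
    print(n, tail_frequency(n), round(chebyshev_bound(n), 4))
```

The bound is loose, but both the bound and the simulated frequency go to 0 as N grows, which is all the proof requires.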


Remarks

1 These two theorems consider a sequence of independently and identically distributed (i.i.d.) random variables (as a consequence, with the same mean $E (X_i) = \mu$, $\forall i = 1, .., N$).

2 There are alternative versions of the law of large numbers for independent but not identically (heterogeneously) distributed random variables with $E (X_i) = \mu_i$ (cf. Greene, 2007):

   1 Chebychev's Weak Law of Large Numbers.

   2 Markov's Strong Law of Large Numbers.

Theorem (Slutsky's theorem)
Let $X_N$ and $Y_N$ be two sequences of random variables where $X_N \xrightarrow{p} X$ and $Y_N \xrightarrow{p} c$, where $c \neq 0$, then:
$$X_N + Y_N \xrightarrow{p} X + c$$
$$X_N Y_N \xrightarrow{p} c X$$
$$\frac{X_N}{Y_N} \xrightarrow{p} \frac{X}{c}$$

Remark: This also holds for sequences of random matrices. The last statement reads: if $X_N \xrightarrow{p} X$ and $Y_N \xrightarrow{p} \Omega$ then
$$Y_N^{-1} X_N \xrightarrow{p} \Omega^{-1} X$$
provided that $\Omega^{-1}$ exists.

Example
Let us consider the multiple linear regression model

$$y_i = x_i^{\top} \beta + \mu_i$$

where $x_i = (x_{i1} .. x_{iK})^{\top}$ is a $K \times 1$ vector of random variables, $\beta = (\beta_1 ... \beta_K)^{\top}$ is a $K \times 1$ vector of parameters, and where the error term $\mu_i$ satisfies $E (\mu_i) = 0$ and $E (\mu_i | x_{ij}) = 0$, $\forall j = 1, .., K$.

Question: show that the OLS estimator defined by

$$\hat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right)$$
is a consistent estimator of $\beta$.

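Proof: let us rewrite the OLS estimator as:

$$\begin{aligned}
\hat{\beta} &= \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right) \\
&= \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i \left( x_i^{\top} \beta + \mu_i \right) \right) \\
&= \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i x_i^{\top} \right) \beta + \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right) \\
&= \beta + \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)
\end{aligned}$$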

Proof (cont'd): By multiplying and dividing by $N$, we get:

$$\hat{\beta} = \beta + \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right)$$

1 By using the (weak) law of large numbers (Khinchine's theorem), we have:

$$\frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top} \xrightarrow{p} E \left( x_i x_i^{\top} \right) \qquad \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \xrightarrow{p} E (x_i \mu_i)$$

2 By using Slutsky's theorem:

$$\hat{\beta} \xrightarrow{p} \beta + E^{-1} \left( x_i x_i^{\top} \right) E (x_i \mu_i)$$

Reminder: If $X$ and $Y$ are two random variables, then

$$E (X | Y) = 0 \implies E (XY) = 0$$
The reverse is not true. More precisely:
$$E (X | Y) = 0 \implies \begin{cases} cov (X, Y) = E (XY) - E (X) E (Y) = 0 \\ E (X) = 0 \end{cases}$$

and therefore

$$E (X | Y) = 0 \implies E (XY) = 0$$

Proof (cont'd):
$$\hat{\beta} \xrightarrow{p} \beta + E^{-1} \left( x_i x_i^{\top} \right) E (x_i \mu_i)$$
Since
$$E (\mu_i | x_{ij}) = 0 \quad \forall j = 1, .., K \implies E (\mu_i x_i) = 0_{K \times 1}$$
we have
$$\hat{\beta} \xrightarrow{p} \beta$$
The OLS estimator $\hat{\beta}$ is (weakly) consistent.
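The consistency of the OLS estimator can be illustrated by simulation. The sketch below is not part of the original proof; it assumes NumPy, standard normal regressors and an error term drawn independently of the regressors (so that $E(\mu_i | x_{ij}) = 0$ holds). The names `ols`, `simulate_ols` and the vector `beta_true` are hypothetical.

```python
import numpy as np

def ols(x, y):
    """OLS estimator: solve (sum x_i x_i') b = sum x_i y_i without an explicit inverse."""
    return np.linalg.solve(x.T @ x, x.T @ y)

def simulate_ols(n, beta, seed=0):
    """Draw one sample of size n from y_i = x_i' beta + u_i and return the OLS estimate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, beta.size))
    u = rng.normal(size=n)  # independent of x, hence E(u | x) = 0
    y = x @ beta + u
    return ols(x, y)

beta_true = np.array([1.0, -2.0, 0.5])
for n in (50, 500, 50000):
    err = float(np.max(np.abs(simulate_ols(n, beta_true) - beta_true)))
    print(n, err)
```

The largest estimation error should become small for large N, in line with $\hat{\beta} \xrightarrow{p} \beta$.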


We are mainly concerned with four modes of convergence:

1 Almost sure convergence

2 Convergence in probability

3 Convergence in quadratic mean

4 Convergence in distribution.


Section 4

Asymptotic Properties

4.3. Convergence in Mean Square

Definition (Convergence in mean square)

Let $\{X_i\}$, for $i = 1, .., N$, be a sequence of real-valued random variables such that $E \left( |X_N|^2 \right) < \infty$. $X_N$ converges in mean square to a constant $c$ if:
$$\lim_{N \to \infty} E \left( |X_N - c|^2 \right) = 0$$
It is written
$$X_N \xrightarrow{m.s.} c$$

Remark: It is the least useful notion of convergence, except for proving convergence in probability.

Lemma (Chain of implication)
The convergence in mean square implies the convergence in probability:

$$\xrightarrow{m.s.} \implies \xrightarrow{p}$$
where the symbol "$\implies$" means "implies". The converse is not true.
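Why mean square convergence implies convergence in probability can be seen in one line. The following worked step uses Markov's inequality, which is not stated in these slides and is brought in here as an assumption of the sketch: for any $\varepsilon > 0$,

```latex
\Pr\left( |X_N - c| \geq \varepsilon \right)
  = \Pr\left( |X_N - c|^2 \geq \varepsilon^2 \right)
  \leq \frac{E\left( |X_N - c|^2 \right)}{\varepsilon^2}
  \xrightarrow[N \to \infty]{} 0
```

The right-hand side vanishes by the definition of convergence in mean square, so the left-hand side vanishes as well, which is the definition of convergence in probability to $c$.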


Section 4

Asymptotic Properties

4.4. Convergence in Distribution

Definition (Convergence in distribution)

Let $X_N$ be a sequence of random variables indexed by the sample size, with cdf $F_N (.)$. $X_N$ converges in distribution to a random variable $X$ with cdf $F (.)$ if
$$\lim_{N \to \infty} F_N (x) = F (x)$$
for all $x$ at which $F (.)$ is continuous. It is written:
$$X_N \xrightarrow{d} X$$

Comment: In general, we have:

$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{d} \underbrace{X}_{\text{random var.}}$$
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{p} \underbrace{c}_{\text{constant}}$$
In the case where
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{p} \underbrace{X}_{\text{random var.}} \quad \text{it means} \quad \underbrace{X_N - X}_{\text{random var.}} \xrightarrow{p} \underbrace{0}_{\text{constant}}$$

Lemma (Chain of implication)
The convergence in probability implies the convergence in distribution:

$$\xrightarrow{p} \implies \xrightarrow{d}$$
where the symbol "$\implies$" means "implies". The converse is not true.

Definition (Asymptotic distribution)

If XN converges in distribution to X , where FN (.) is the cdf of XN , then F (.) is the cdf of the limiting or asymptotic distribution of XN .

Consequence: Generally, we denote:

$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{d} \underbrace{\mathcal{L}}_{\text{asy. distribution}}$$

It means that $X_N$ converges in distribution to a random variable $X$ that has a distribution $\mathcal{L}$.

Example
$$X_N \xrightarrow{d} \mathcal{N} (0, 1)$$
means that $X_N$ converges in distribution to a random variable $X$ that is normally distributed, or that $X_N$ has an asymptotic standard normal distribution.

Definition (Asymptotic mean and variance)

The asymptotic mean and variance of a random variable $X_N$ are the mean and variance of the asymptotic or limiting distribution, assuming that the limiting distribution and its moments exist. These moments are denoted by
$$E_{asy} (X_N) \qquad V_{asy} (X_N)$$

Definition (Asymptotically normally distributed estimator)
A consistent estimator $\hat{\theta}$ of $\theta$ is said to be asymptotically normally distributed (or asymptotically normal) if:
$$\sqrt{N} \left( \hat{\theta} - \theta_0 \right) \xrightarrow{d} \mathcal{N} (0, \Sigma_0)$$
Equivalently, $\hat{\theta}$ is asymptotically normal if:

$$\hat{\theta} \overset{asy}{\approx} \mathcal{N} \left( \theta_0, N^{-1} \Sigma_0 \right)$$
The asymptotic variance of $\hat{\theta}$ is then defined by:
$$V_{asy} \left( \hat{\theta} \right) \equiv avar \left( \hat{\theta} \right) = N^{-1} \Sigma_0$$

Section 4

Asymptotic Properties

4.5. Asymptotic Distributions

Let's go back to our estimation problem

We consider a (strongly) consistent estimator $\hat{\theta}_N$ of the true parameter $\theta_0$:
$$\hat{\theta}_N \xrightarrow{a.s.} \theta_0 \implies \hat{\theta}_N \xrightarrow{p} \theta_0$$
This estimator has a degenerate asymptotic distribution (point-mass distribution), since when $N \to \infty$,
$$\lim_{N \to \infty} f_{\hat{\theta}_N} (x) = f (x)$$
where $f_{\hat{\theta}_N} (.)$ is the pdf of $\hat{\theta}_N$ and $f (x)$ is defined by:

$$f (x) = \begin{cases} 1 & \text{if } x = \theta_0 \\ 0 & \text{otherwise} \end{cases}$$

Conclusion: one needs more than consistency to do inference (tests about the true value of $\theta$, etc.).

Solution: we will transform the estimator $\hat{\theta}_N$ to get a transformed variable that has a non-degenerate asymptotic distribution, and use it to derive the asymptotic distribution of the estimator.

It is the general idea of the Central Limit Theorem for a particular estimator: the sample mean...

Theorem (Lindeberg–Levy Central Limit Theorem, univariate)

Let $X_1, .., X_N$ denote a sequence of independent and identically distributed random variables with finite mean $E (X_i) = \mu$ and finite variance $V (X_i) = \sigma^2$. Then the sample mean $\bar{X}_N = N^{-1} \sum_{i=1}^{N} X_i$ satisfies

$$\sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 \right)$$

Comment:

1 The result is quite remarkable as it holds regardless of the form of the parent distribution (the distribution of $X_i$).

2 The central limit theorem requires virtually no assumptions (other than independence and finite variances) to end up with normality: normality is inherited from the sums of "small" independent disturbances with finite variance.

Proof: Rao (1973).

Illustration:

1 Let us consider a random variable $X_i \sim \chi^2 (2)$, such that $E (X_i) = 2$ and $V (X_i) = 4$, and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.

2 Compute the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ and the transformed variable $\sqrt{N} (\bar{x}_N - 2) / 2$.

3 Repeat this procedure 5,000 times. We get 5,000 realisations of this transformed variable.

4 Build a histogram (and a nonparametric kernel estimate of the pdf) of these 5,000 realisations and compare it to the standard normal pdf.
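The steps of this illustration can be sketched as follows. This is a minimal sketch assuming NumPy; the names `standardized_means` and `skewness` are hypothetical, the kernel estimate of step 4 is replaced by simple summary statistics, and N is capped at 1,000 to keep memory use modest.

```python
import numpy as np

def standardized_means(n, reps=5000, seed=0):
    """Steps 1-3: realisations of sqrt(N) * (mean - 2) / 2 for chi-square(2) samples."""
    rng = np.random.default_rng(seed)
    x = rng.chisquare(df=2, size=(reps, n))
    return np.sqrt(n) * (x.mean(axis=1) - 2.0) / 2.0

def skewness(z):
    """Sample skewness; it is 0 for a symmetric distribution such as the normal."""
    c = z - z.mean()
    return float(np.mean(c**3) / np.mean(c**2) ** 1.5)

for n in (10, 100, 1000):
    z = standardized_means(n)
    print(n, round(float(z.mean()), 3), round(float(z.var()), 3), round(skewness(z), 3))
```

The mean and variance should be close to 0 and 1 for every N, while the skewness inherited from the $\chi^2(2)$ parent distribution fades as N grows.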

Figure: histograms of the 5,000 realisations of $\sqrt{N} (\bar{x}_N - 2) / 2$, with the kernel estimate and the standard normal pdf overlaid, in four panels (N = 10, N = 100, N = 1,000 and N = 10,000).


Definition
The convergence result (CLT)

$$\sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 \right)$$
can be understood as:
$$\bar{X}_N \overset{asy}{\approx} \mathcal{N} \left( \mu, \frac{\sigma^2}{N} \right)$$
where the symbol $\overset{asy}{\approx}$ means "asymptotically distributed as". The asymptotic mean and variance of the sample mean are then defined by:

$$E_{asy} \left( \bar{X}_N \right) = \mu \qquad V_{asy} \left( \bar{X}_N \right) = \frac{\sigma^2}{N}$$

Speed of convergence: why study $\sqrt{N} \bar{X}_N$ in the CLT?

1 For simplicity, let us assume that $\mu = E (X_i) = 0$ and let us study the asymptotic behavior of $N^{\alpha} \bar{X}_N$:

$$V \left( N^{\alpha} \bar{X}_N \right) = N^{2\alpha} V \left( \bar{X}_N \right) = N^{2\alpha} \frac{\sigma^2}{N} = N^{2\alpha - 1} \sigma^2$$

2 If we assume that $\alpha > 1/2$, then $2\alpha - 1 > 0$ and the asymptotic variance of $N^{\alpha} \bar{X}_N$ is infinite:

$$\lim_{N \to \infty} V \left( N^{\alpha} \bar{X}_N \right) = +\infty$$

3 If we assume that $\alpha < 1/2$, then $2\alpha - 1 < 0$ and $N^{\alpha} \bar{X}_N$ has a degenerate distribution:

$$\lim_{N \to \infty} V \left( N^{\alpha} \bar{X}_N \right) = 0$$

4 As a consequence, $\alpha = 1/2$ is the only choice to get a finite and positive variance:
$$V \left( \sqrt{N} \bar{X}_N \right) = \sigma^2$$
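The role of the exponent can be made concrete by evaluating $N^{2\alpha - 1} \sigma^2$ for a few values of $\alpha$ and N. The helper name `scaled_variance` below is hypothetical and the snippet is only a numerical illustration of the formula, not a simulation.

```python
def scaled_variance(n, alpha, sigma2=1.0):
    """V(N^alpha * sample mean) = N^(2*alpha - 1) * sigma^2 for i.i.d. data with variance sigma2."""
    return n ** (2 * alpha - 1) * sigma2

for alpha in (0.25, 0.5, 0.75):
    values = [scaled_variance(n, alpha) for n in (10, 1000, 100000)]
    print(alpha, values)
```

For $\alpha = 0.25$ the values shrink towards 0, for $\alpha = 0.75$ they grow without bound, and only $\alpha = 0.5$ keeps the variance fixed at $\sigma^2$.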

Summary: Let $X_1, .., X_N$ denote a sequence of independent and identically distributed random variables with finite mean $E (X_i) = \mu$ and finite variance $V (X_i) = \sigma^2$. Then, the sample mean

$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i$$
satisfies
$$\text{WLLN:} \quad \bar{X}_N \xrightarrow{p} \mu$$
$$\text{CLT:} \quad \sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 \right)$$

The central limit theorem does not assert that the sample mean tends to normality. It is the transformation of the sample mean that has this property:

$$\text{WLLN:} \quad \bar{X}_N \xrightarrow{p} \mu$$
$$\text{CLT:} \quad \sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 \right)$$

Theorem (Lindeberg–Levy Central Limit Theorem, multivariate)

Let $x_1, .., x_N$ denote a sequence of independent and identically distributed random $K \times 1$ vectors with finite mean $E (x_i) = \mu$ and finite $K \times K$ variance covariance matrix $V (x_i) = \Sigma$. Then the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ satisfies

$$\underbrace{\sqrt{N} \left( \bar{x}_N - \mu \right)}_{K \times 1} \xrightarrow{d} \mathcal{N} \Big( \underbrace{0}_{K \times 1}, \underbrace{\Sigma}_{K \times K} \Big)$$


Remark: there exist other versions of the CLT, especially for independent but not identically (heterogeneously) distributed variables

1 Lindeberg–Feller Central Limit Theorem for unequal variances.

2 Liapounov Central Limit Theorem for unequal means and variances.

For more details, see: Greene W. (2007), Econometric Analysis, sixth edition, Pearson Prentice Hall.

Question: from the CLT (univariate or multivariate), and the asymptotic distribution of $\bar{X}_N$, how to derive the asymptotic distribution of an estimator $\hat{\theta}$ that depends on the sample mean?

$$\hat{\theta} = g \left( \bar{X}_N \right) \overset{asy}{\approx} \ ???$$

Theorem (Continuous mapping theorem)

Let $\{X_i\}$, for $i = 1, .., N$, be a sequence of real-valued random variables and $g (.)$ a continuous function:

$$\text{if } X_N \xrightarrow{a.s.} X \text{ then } g (X_N) \xrightarrow{a.s.} g (X)$$
$$\text{if } X_N \xrightarrow{p} X \text{ then } g (X_N) \xrightarrow{p} g (X)$$
$$\text{if } X_N \xrightarrow{d} X \text{ then } g (X_N) \xrightarrow{d} g (X)$$

Example (multiple linear regression model)
Let us consider the multiple linear regression model

$$y_i = x_i^{\top} \beta + \mu_i$$

where $x_i = (x_{i1} .. x_{iK})^{\top}$ is a $K \times 1$ vector of random variables, $\beta = (\beta_1 ... \beta_K)^{\top}$ is a $K \times 1$ vector of parameters, and where the error term $\mu_i$ satisfies $E (\mu_i) = 0$, $V (\mu_i) = \sigma^2$ and $E (\mu_i | x_{ij}) = 0$, $\forall j = 1, .., K$.

Question: show that the OLS estimator satisfies

$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 E^{-1} \left( x_i x_i^{\top} \right) \right)$$

Proof:

1 Rewrite the OLS estimator as:

$$\hat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right) = \beta_0 + \left( \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)$$

2 Normalize the vector $\hat{\beta} - \beta_0$:

$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sqrt{N} \, \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right)$$

Reminder: if $x$ is a vector of random variables and $Y$ is a scalar (random variable) such that $E (xY) = 0$, then

$$V (xY) = E \left( x \, E \left( Y^2 | x \right) x^{\top} \right)$$

Proof (cont'd):

3 Using the WLLN and the continuous mapping theorem (CMT):

$$\left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \xrightarrow{p} E^{-1} \left( x_i x_i^{\top} \right)$$

4 Using the CLT:

$$\sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i - E (x_i \mu_i) \right) \xrightarrow{d} \mathcal{N} \left( 0, V (x_i \mu_i) \right)$$

with $E (\mu_i | x_{ik}) = 0$, $\forall k = 1, .., K \implies E (x_i \mu_i) = 0$ and

$$\begin{aligned}
V (x_i \mu_i) &= E \left( x_i \mu_i \mu_i x_i^{\top} \right) = E \left( E \left( x_i \mu_i \mu_i x_i^{\top} | x_i \right) \right) \\
&= E \left( x_i V (\mu_i | x_i) x_i^{\top} \right) = \sigma^2 E \left( x_i x_i^{\top} \right)
\end{aligned}$$

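Proof (cont'd): we have

$$\left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \xrightarrow{p} E^{-1} \left( x_i x_i^{\top} \right)$$
$$\sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 E \left( x_i x_i^{\top} \right) \right)$$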

Theorem (Slutsky's theorem for convergence in distribution)
Let $X_N$ and $Y_N$ be two sequences of random variables where $X_N \xrightarrow{d} X$ and $Y_N \xrightarrow{p} c$, where $c \neq 0$, then:
$$X_N + Y_N \xrightarrow{d} X + c$$
$$X_N Y_N \xrightarrow{d} c X$$
$$\frac{X_N}{Y_N} \xrightarrow{d} \frac{X}{c}$$

If $Y_N$ and $X_N$ are matrices/vectors, then $Y_N^{-1} X_N \xrightarrow{d} c^{-1} X$ with
$$V \left( c^{-1} X \right) = c^{-1} \, V (X) \left( c^{-1} \right)^{\top}$$

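Proof (cont'd): By using Slutsky's theorem (for convergence in distribution), we have:

$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top} \right)^{-1} \left( \sqrt{N} \, \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right) \xrightarrow{d} \mathcal{N} (\Pi, \Omega)$$
with
$$\Pi = E^{-1} \left( x_i x_i^{\top} \right) \times 0 = 0$$
$$\Omega = E^{-1} \left( x_i x_i^{\top} \right) \sigma^2 E \left( x_i x_i^{\top} \right) E^{-1} \left( x_i x_i^{\top} \right) = \sigma^2 E^{-1} \left( x_i x_i^{\top} \right)$$
Finally, we have:

$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 E^{-1} \left( x_i x_i^{\top} \right) \right)$$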
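The asymptotic variance $\sigma^2 E^{-1}(x_i x_i^{\top})$ can be compared with a Monte Carlo estimate. The sketch below is illustrative only and rests on assumptions that are not in the slides: standard normal regressors (so that $E(x_i x_i^{\top}) = I_K$), homoskedastic normal errors with $\sigma = 1.5$, $N = 400$, and the hypothetical function name `normalized_ols_errors`.

```python
import numpy as np

def normalized_ols_errors(n, beta0, sigma=1.5, reps=2000, seed=0):
    """Realisations of sqrt(N) * (beta_hat - beta0) over `reps` simulated samples."""
    rng = np.random.default_rng(seed)
    k = beta0.size
    out = np.empty((reps, k))
    for r in range(reps):
        x = rng.normal(size=(n, k))     # E(x_i x_i') = I_K by construction
        u = sigma * rng.normal(size=n)  # homoskedastic errors, independent of x
        b = np.linalg.solve(x.T @ x, x.T @ (x @ beta0 + u))
        out[r] = np.sqrt(n) * (b - beta0)
    return out

z = normalized_ols_errors(n=400, beta0=np.array([1.0, -2.0]))
emp_cov = np.cov(z, rowvar=False)
theory_cov = 1.5**2 * np.eye(2)  # sigma^2 * E(x_i x_i')^{-1}
print(emp_cov.round(3))
print(theory_cov)
```

The empirical covariance matrix of the normalized estimation errors should be close to the theoretical one, up to simulation noise.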

Definition (univariate Delta method)

Let $Z_N$ be a sequence of random variables indexed by the sample size $N$ such that
$$\sqrt{N} (Z_N - \mu) \xrightarrow{d} \mathcal{N} \left( 0, \sigma^2 \right)$$
If $g (.)$ is a continuous and continuously differentiable function with $g (\mu) \neq 0$ and not involving $N$, then
$$\sqrt{N} \left( g (Z_N) - g (\mu) \right) \xrightarrow{d} \mathcal{N} \left( 0, \left( \left. \frac{\partial g (x)}{\partial x} \right|_{\mu} \right)^2 \sigma^2 \right)$$
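A nonlinear example may help to see the delta method at work. The sketch below is an assumption-laden illustration, not taken from the slides: $Z_N$ is the mean of N i.i.d. $\mathcal{N}(\mu, \sigma^2)$ draws with $\mu = 2$ and $\sigma = 1$, and $g(x) = x^2$, so that $\partial g / \partial x$ evaluated at $\mu$ equals $2\mu$ and the asymptotic variance is $(2\mu)^2 \sigma^2 = 16$. The function name `delta_method_check` is hypothetical.

```python
import numpy as np

def delta_method_check(n=2000, mu=2.0, sigma=1.0, reps=2000, seed=0):
    """Return (simulated variance of sqrt(N) * (g(Z_N) - g(mu)), delta-method variance) for g(x) = x^2."""
    rng = np.random.default_rng(seed)
    z_n = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    t = np.sqrt(n) * (z_n**2 - mu**2)
    return float(t.var()), (2.0 * mu) ** 2 * sigma**2

simulated, theoretical = delta_method_check()
print(simulated, theoretical)
```

The two numbers should agree up to simulation noise.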

Multivariate Delta method
Let $Z_N$ be a sequence of random vectors indexed by the sample size such that

$$\sqrt{N} (Z_N - \mu) \xrightarrow{d} \mathcal{N} (0, \Sigma)$$
If $g (.)$ is a continuous and continuously differentiable multivariate function with $g (\mu) \neq 0$ and not involving $N$, then

$$\sqrt{N} \left( g (Z_N) - g (\mu) \right) \xrightarrow{d} \mathcal{N} \left( 0, \left. \frac{\partial g (x)}{\partial x^{\top}} \right|_{\mu} \Sigma \left( \left. \frac{\partial g (x)}{\partial x^{\top}} \right|_{\mu} \right)^{\top} \right)$$

Example (Gamma distribution)

Let $X_1, .., X_N$ denote a sequence of independent and identically distributed random variables. We assume that $X_i \sim \Gamma (\alpha, \beta)$ (gamma distribution) with $E (X) = \alpha \beta$ and $V (X) = \alpha \beta^2$, $\alpha > 0$, $\beta > 0$ and a pdf defined by:

$$f_X (x; \alpha, \beta) = \frac{x^{\alpha - 1} \exp \left( - x / \beta \right)}{\Gamma (\alpha) \beta^{\alpha}}$$

$\forall x \in [0, +\infty[$, where $\Gamma (\alpha) = \int_0^{\infty} t^{\alpha - 1} \exp (-t) \, dt$ denotes the Gamma function. (The pdf is not needed in this exercise; it is given for reference.) We assume that $\alpha$ is known.

Question: What is the asymptotic distribution of the estimator $\hat{\beta}$ defined by:

$$\hat{\beta} = \frac{1}{\alpha N} \sum_{i=1}^{N} X_i$$

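Solution: The estimator $\hat{\beta}$ is defined by:

$$\hat{\beta} = \frac{1}{\alpha N} \sum_{i=1}^{N} X_i$$

Since $X_1, .., X_N$ are i.i.d. with $E (X) = \alpha \beta$ and $V (X) = \alpha \beta^2$, we can apply the Lindeberg–Levy CLT, and we get immediately:

$$\sqrt{N} \left( \bar{X}_N - \alpha \beta \right) \xrightarrow{d} \mathcal{N} \left( 0, \alpha \beta^2 \right)$$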

Solution (cont'd): If we define $g (x) = x / \alpha$, with

$$g \left( E \left( \bar{X}_N \right) \right) = g (\alpha \beta) = \beta \neq 0$$
$$\hat{\beta} = \frac{1}{\alpha} \bar{X}_N = g \left( \bar{X}_N \right)$$
$$\sqrt{N} \left( \bar{X}_N - \alpha \beta \right) \xrightarrow{d} \mathcal{N} \left( 0, \alpha \beta^2 \right)$$
By using the delta method, we have:
$$\sqrt{N} \left( g \left( \bar{X}_N \right) - g (\alpha \beta) \right) \xrightarrow{d} \mathcal{N} \left( 0, \left( \left. \frac{\partial g (z)}{\partial z} \right|_{\alpha \beta} \right)^2 \alpha \beta^2 \right)$$
Since $\partial g (z) / \partial z = \partial (z / \alpha) / \partial z = 1 / \alpha$, we have:

$$\sqrt{N} \left( \hat{\beta} - \beta \right) \xrightarrow{d} \mathcal{N} \left( 0, \frac{\beta^2}{\alpha} \right)$$
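The result $\beta^2 / \alpha$ can be checked by simulation. The following sketch assumes NumPy, whose gamma generator is parameterised by `shape` (here $\alpha$) and `scale` (here $\beta$), matching $E(X) = \alpha\beta$ and $V(X) = \alpha\beta^2$; the values $\alpha = 3$, $\beta = 2$, $N = 500$ and the name `normalized_beta_hat` are illustrative assumptions.

```python
import numpy as np

def normalized_beta_hat(n, alpha=3.0, beta=2.0, reps=5000, seed=0):
    """Realisations of sqrt(N) * (beta_hat - beta), where beta_hat = sample mean / alpha."""
    rng = np.random.default_rng(seed)
    x = rng.gamma(shape=alpha, scale=beta, size=(reps, n))
    beta_hat = x.mean(axis=1) / alpha
    return np.sqrt(n) * (beta_hat - beta)

z_gamma = normalized_beta_hat(n=500)
print(float(z_gamma.var()), 2.0**2 / 3.0)  # simulated variance vs beta^2 / alpha
```

Because $g(x) = x / \alpha$ is linear, the variance formula holds exactly for every N in this example; only the normal shape of the distribution is asymptotic.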

Key Concepts Section 4

1 Almost sure convergence
2 Convergence in probability
3 Law of large numbers: Khinchine's and Kolmogorov's theorems
4 Weakly and strongly consistent estimator
5 Slutsky's theorem
6 Convergence in mean square
7 Convergence in distribution
8 Asymptotic distribution and asymptotic variance
9 Lindeberg–Levy Central Limit Theorem (univariate and multivariate)
10 Continuous mapping theorem
11 Delta method

End of Chapter 1