
Anderson

Dept. of

University of Pittsburgh GSPH

A Lecture on the Central Limit Theorem

One of the fundamental notions of statistical reasoning is the idea that one can make reasonable inferences from a sample of observations randomly drawn from a population of potential observations. Figure 1 depicts a crude schema of how this inference is made.

[Figure 1: An overview of how a population and a random sample are linked through inferential statistics. The diagram connects the population to the random sample through probability (the mathematics of chance), and connects the sample back to the population through inferential statistics.]

First, one must define a population from which observations can be drawn. This population of observations has some distribution. The distribution of observations can be characterized by its mean, variance, and other "population parameters".

From the population, one attempts to draw a random sample of observations. By "random" we mean that all observations have equal chances of being drawn for the sample. Using the random sample together with the laws of probability, one can then make statistical inferences about population parameters. One of the most important distributions used in statistical inference is the normal (Gaussian) or "bell-shaped" distribution. One reason that the normal distribution has a central role in statistics is because of the "Central Limit Theorem". The Central Limit Theorem essentially states that if one takes many different samples of size n (where n is reasonably large) from any distribution with a known mean, then the mean values of those samples will approximate a normal distribution, even though the distribution which was sampled from may not be normally distributed.

It is the goal of this lecture to state the Central Limit Theorem mathematically and to demonstrate how it works with some simulated data.

Statement of the Central Limit Theorem

If $X_1, X_2, \ldots, X_n$ are independently and identically distributed (i.i.d.) random variables, each with mean $\mu$ and $\sigma^2 < \infty$, then the distribution of the arithmetic mean of the $X_i$'s is approximately normally distributed with mean $\mu$ and variance $\sigma^2/n$ as $n$ gets large. This property is true even if the individual $X_i$'s are not normally distributed, for example, if they are from a skewed distribution. The central limit theorem (c.l.t.) is written mathematically as: if $X_1, X_2, \ldots, X_n$ are i.i.d. $D(\mu, \sigma^2)$ for any distribution $D$, then

$$\frac{\sum_{i=1}^{n} X_i}{n} = \bar{X} \;\dot\sim\; N\!\left(\mu, \sigma^2/n\right) \quad \text{as } n \to \infty. \tag{1}$$
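To see equation (1) at work numerically, one can simulate repeated samples from a skewed distribution and check that the simulated means have mean roughly $\mu$ and standard deviation roughly $\sigma/\sqrt{n}$. The short Python sketch below does this for an exponential distribution; the choice of distribution, the sample size of 50, the 10,000 replicates, and the random seed are all arbitrary choices for this sketch (the handout's own example, using S-Plus and the uniform distribution, appears later).

import numpy as np

rng = np.random.default_rng(0)

# Exponential distribution with rate 1: mu = 1, sigma^2 = 1 (a skewed choice of D).
mu, sigma = 1.0, 1.0
n = 50               # observations per sample
n_samples = 10_000   # number of simulated samples

# Each row is one sample of size n; take the mean of each row.
xbar = rng.exponential(scale=1.0, size=(n_samples, n)).mean(axis=1)

print("mean of sample means:", xbar.mean(), "vs mu =", mu)
print("sd of sample means:  ", xbar.std(ddof=1), "vs sigma/sqrt(n) =", sigma / np.sqrt(n))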

Consequences of the Central Limit Theorem

1. If we take the mean, $\bar{X}$, of a large sample of identically distributed random variables, then we expect it to be a more and more accurate estimate of the true population mean, $\mu$. That is, the variance of $\bar{X}$ gets smaller and smaller as $n$ gets larger.
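A brief sketch of this point, using standard normal observations and a few increasing sample sizes (both arbitrary choices): the empirical standard deviation of $\bar{X}$ across repeated samples shrinks like $\sigma/\sqrt{n}$.

import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0  # standard normal observations: mu = 0, sigma = 1

for n in (10, 100, 1000):
    # 5,000 repeated samples of size n; one sample per row, one mean per row.
    means = rng.normal(loc=0.0, scale=sigma, size=(5000, n)).mean(axis=1)
    print(f"n = {n:4d}: sd of sample means = {means.std(ddof=1):.4f}, "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.4f}")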

2. Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. $\sim D$ with mean $\mu$ and variance $\sigma^2$. Then

$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\left[\mathrm{Var}(X_1) + \ldots + \mathrm{Var}(X_n)\right] = \frac{1}{n^2}\underbrace{\left[\sigma^2 + \ldots + \sigma^2\right]}_{n\ \text{times}} = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}. \tag{2}$$

Thus, $\mathrm{Var}(\bar{X}) = \sigma^2/n$. The square root of this quantity, $\sigma/\sqrt{n}$, is called the standard error of the mean. The standard error of the mean (s.e.m.) is estimated from a sample of observations by $\hat{\sigma}_{\bar{x}} = \hat{\sigma}/\sqrt{n} = s/\sqrt{n}$, where

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]}.$$

The difference between the standard deviation and the standard error of the mean is that the standard deviation measures the variability of individual observations about the mean, whereas the standard error of the mean measures the variability of the mean of $n$ observations and so describes the precision of $\bar{X}$ as an estimate of $\mu$ rather than the variation of individual observations in the population.
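To make the formulas concrete, the sketch below computes $s$ from both the definitional and the computational forms above for a single simulated sample, together with the estimated standard error $s/\sqrt{n}$. The exponential sample used here is an arbitrary choice, and numpy's ddof=1 standard deviation is printed only as a cross-check.

import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=30)   # one arbitrary sample of n = 30 observations
n = x.size

# Definitional form: sum of squared deviations about the sample mean, divided by n - 1.
s_def = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))

# Computational form: sum(x_i^2) - (1/n) * (sum(x_i))^2, divided by n - 1.
s_comp = np.sqrt((np.sum(x ** 2) - np.sum(x) ** 2 / n) / (n - 1))

sem = s_def / np.sqrt(n)                  # estimated standard error of the mean

print("s (definitional):  ", s_def)
print("s (computational): ", s_comp)
print("numpy std(ddof=1): ", x.std(ddof=1))
print("estimated s.e.m.:  ", sem)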

3. If we take the mean of a large number of observations which come from even a very highly skewed distribution, then that mean is approximately normally distributed. One can test this by simulating a number of these "samples" of observations [see the simulation example].
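One possible version of such a simulation (the highly skewed lognormal distribution, the sample size, and the number of samples below are arbitrary choices): standardize the simulated means as in the c.l.t. and compare their behavior with that of a standard normal distribution.

import numpy as np

rng = np.random.default_rng(3)

# Lognormal(0, 1): highly right-skewed, with known mean and variance.
mu = np.exp(0.5)                      # E[X] = exp(1/2)
sigma = np.sqrt((np.e - 1) * np.e)    # Var[X] = (e - 1) * e

n, n_samples = 100, 20_000
means = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, n)).mean(axis=1)

# Standardize the sample means; if the c.l.t. applies, z should be roughly N(0, 1).
z = (means - mu) / (sigma / np.sqrt(n))
print("empirical 2.5% and 97.5% quantiles:", np.quantile(z, [0.025, 0.975]))
print("normal reference quantiles:         -1.96, 1.96")
print("Pr(|z| < 1.96), empirical:", np.mean(np.abs(z) < 1.96))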

4. As was indicated earlier, the mean of a random sample of $n$ observations, $\bar{x}$, estimates the population mean, $\mu$. In the case where the individual random variables, that is, the $X_i$, are i.i.d. $N(\mu, \sigma^2)$, $i = 1, \ldots, n$, the mean of these random variables, $\bar{X}$, is said to be an unbiased estimator of $\mu$, that is, $E(\bar{X}) = \mu$. $\bar{X}$ is also said to be a minimum variance estimator of $\mu$, i.e., $\mathrm{Var}(\bar{X}) \le \mathrm{Var}(f(X_1, \ldots, X_n))$ for any function, $f$, of the $X_i$'s. The sample estimate, $\bar{x}$, is said to be a point estimate of $\mu$.
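The unbiasedness and minimum-variance claims can also be checked by simulation. In the sketch below (the values $\mu = 5$, $\sigma = 2$, and $n = 25$ are arbitrary), the average of many simulated values of $\bar{X}$ is close to $\mu$, and $\bar{X}$ has smaller variance than the sample median, one particular function $f(X_1, \ldots, X_n)$ of the sample.

import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n = 5.0, 2.0, 25

samples = rng.normal(loc=mu, scale=sigma, size=(20_000, n))
xbar = samples.mean(axis=1)            # the sample mean of each simulated sample
med = np.median(samples, axis=1)       # another function f(X_1, ..., X_n) of each sample

print("average of X-bar:", xbar.mean(), " (unbiased: should be near mu =", mu, ")")
print("Var(X-bar):      ", xbar.var(ddof=1), " (theory: sigma^2/n =", sigma**2 / n, ")")
print("Var(median):     ", med.var(ddof=1), " (larger than Var(X-bar) for normal data)")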

5. The Central Limit Theorem allows us to make probability statements about an arithmetic mean from any type of distribution when the sample size, $n$, is large, i.e., for large $n$,

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \;\dot\sim\; N(0, 1) \qquad \left(\text{for example, } \Pr\!\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \approx .95\right). \tag{3}$$

If the original distribution is normal, then equation (3) will hold exactly for all $n$, that is, if the $X_i$ are i.i.d. $N(\mu, \sigma^2)$, $i = 1, \ldots, n$, then

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \;\sim\; N(0, 1) \qquad \left(\text{for example, } \Pr\!\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95\right). \tag{4}$$

Equation (4) is useful for constructing confidence intervals for $\mu$.
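As a sketch of how equation (4) leads to a confidence interval (with arbitrary values of $\mu$, $\sigma$, and $n$, and with $\sigma$ treated as known), the interval $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ should contain $\mu$ in about 95% of repeated samples:

import numpy as np

rng = np.random.default_rng(5)
mu, sigma, n = 10.0, 3.0, 40
n_reps = 10_000

samples = rng.normal(loc=mu, scale=sigma, size=(n_reps, n))
xbar = samples.mean(axis=1)
half_width = 1.96 * sigma / np.sqrt(n)        # sigma treated as known, as in equation (4)

# Fraction of simulated intervals x-bar +/- half_width that contain the true mean mu.
covered = (xbar - half_width < mu) & (mu < xbar + half_width)
print("coverage of the 95% interval:", covered.mean())   # should be close to 0.95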

Example

Consider the histograms in Figure 2A below. These histograms were generated using a statistical package called S-Plus and show the distributions of 64 samples of standard uniform random numbers. Each sample consists of 30 numbers. A standard uniform distribution is a continuous and flat distribution which takes on values between 0 and 1, inclusive. The expected value (mean) of a uniform distribution is 0.5. Figure 2B, at the end of this handout, is a histogram of the mean values of the 64 samples. Notice that the histogram of the means looks roughly like a normal distribution and has a mean value which is near the mean value of a uniform distribution.

[Figure 2A: 64 simulations of 30 standard uniform random numbers. Each of the 64 panels is a histogram of one simulated sample (xu[, i]) on a horizontal axis running from 0 to 1.]

[Figure 2B: Distribution of the 64 means (mxu) from standard uniform samples of size 30. A histogram of the 64 sample means on a horizontal axis running from about 0.4 to 0.7, centered near 0.5.]
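The figures above were generated with S-Plus; a rough Python translation of the same simulation (64 samples of 30 standard uniform random numbers, followed by the mean of each sample) might look like the sketch below. The names xu and mxu mirror the labels that appear in the figures; the seed and the use of matplotlib for the histogram are choices made here, not part of the original S-Plus code.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)

# 64 samples of 30 standard uniform random numbers, one sample per column (as in Figure 2A).
xu = rng.uniform(low=0.0, high=1.0, size=(30, 64))

# Mean of each of the 64 samples (as in Figure 2B).
mxu = xu.mean(axis=0)

print("mean of the 64 sample means:", mxu.mean())       # should be near 0.5
print("sd of the 64 sample means:  ", mxu.std(ddof=1))  # should be near sqrt(1/12)/sqrt(30)

# Histogram of the 64 means; it should look roughly normal and be centered near 0.5.
plt.hist(mxu, bins=10)
plt.xlabel("mxu")
plt.show()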