Anderson
Dept. of Biostatistics
University of Pittsburgh GSPH
A Lecture on the Central Limit Theorem
One of the fundamental notions of statistical reasoning is the idea that one can make
reasonable inferences from a sample of observations randomly drawn from a population
of potential observations. Figure 1 depicts a crude schema of how statistical inference is made.
FIGURE 1: AN OVERVIEW OF HOW A POPULATION AND A RANDOM SAMPLE ARE
LINKED THROUGH INFERENTIAL STATISTICS
[Diagram labels: POPULATION; random sample; descriptive statistics; PROBABILITY (The
Mathematics of Chance); Inferential Statistics]
First, one must define a population from which observations can be drawn. This population
of observations has some distribution. The distribution of observations can be characterized
by its mean, standard deviation, skewness, kurtosis and other "population parameters".
From the population, one attempts to draw a random sample of observations. By "random"
we mean that all observations have equal chances of being drawn for the sample. Using the
random sample together with the laws of probability, one can then make statistical inferences
about population parameters. One of the most important distributions used in statistical
inference is the normal (Gaussian or "bell-shaped") distribution. One reason that the normal
distribution has a central role in statistics is the "Central Limit Theorem".
The Central Limit Theorem essentially states that if one takes many different samples of size n
(where n is reasonably large) from any distribution with a finite mean and variance, then the
mean values of those samples will be approximately normally distributed, even though the
distribution which was sampled from may not be normal.
It is the goal of this lecture to state the Central Limit Theorem mathematically and to
demonstrate how it works with some simulated data.
Statement of the Central Limit Theorem
If $X_1, X_2, \ldots, X_n$ are independently and identically distributed (i.i.d.) random variables,
each with mean $\mu$ and variance $\sigma^2 < \infty$, then the distribution of the arithmetic mean of
the $X_i$'s is approximately normal with mean $\mu$ and variance $\sigma^2/n$ as $n$ gets
large. This property is true even if the individual $X_i$'s are not normally distributed, for
example, if they are from a skewed distribution. The central limit theorem (c.l.t.) is written
mathematically as: if $X_1, X_2, \ldots, X_n$ are i.i.d. $D(\mu, \sigma^2)$ for any distribution $D$, then

$$\frac{\sum_{i=1}^{n} X_i}{n} = \bar{X} \mathrel{\dot\sim} N(\mu, \sigma^2/n) \quad \text{as } n \to \infty. \tag{1}$$
Consequences of the Central Limit Theorem
1. If we take the mean, $\bar{X}$, of a large sample of identically distributed random
variables, then we expect it to be a more and more accurate estimate of the true
population mean, $\mu$, because the variance of $\bar{X}$ gets smaller and smaller as n
gets larger.
2. Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. $D$ with mean $\mu$ and variance $\sigma^2$. Then

$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\left[\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)\right] = \frac{1}{n^2}\big[\underbrace{\sigma^2 + \cdots + \sigma^2}_{n \text{ times}}\big] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}. \tag{2}$$

Thus, $\mathrm{Var}(\bar{X}) = \sigma^2/n$. The square root of this quantity is $\sigma/\sqrt{n}$ and is called the standard
error of the mean. The standard error of the mean (s.e.m.) is estimated from a sample
of observations by $\hat{\sigma}_{\bar{x}} = \hat{\sigma}/\sqrt{n} = s/\sqrt{n}$ where

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]}.$$

The difference between the standard deviation and the standard error of the mean is
that the standard deviation measures the variability of individual observations about
the mean, whereas the standard error of the mean measures the variability of the mean
of n observations, that is, how precisely the sample mean estimates the population mean.
3. If we take the mean of a large number of observations which come from even a very
highly skewed distribution, then that mean is approximately normally distributed. One
can test this by simulating a number of these "samples" of observations [See simulation
example].
4. As was indicated earlier, the mean of a random sample of n observations, $\bar{x}$, estimates
the population mean, $\mu$. In the case where the individual random variables, that is,
the $X_i$, are i.i.d. $N(\mu, \sigma^2)$, $i = 1, \ldots, n$, then the mean of these random variables, $\bar{X}$,
is said to be an unbiased estimator of $\mu$, that is, $E(\bar{X}) = \mu$. $\bar{X}$ is also said to be a
minimum variance estimator of $\mu$, i.e., $\mathrm{Var}(\bar{X}) \le \mathrm{Var}(f(X_1, \ldots, X_n))$ for any function,
$f$, of the $X_i$'s that is itself an unbiased estimator of $\mu$. The sample estimate, $\bar{x}$, is said
to be a point estimate of $\mu$.
5. The Central Limit Theorem allows us to make probability statements about an arithmetic
mean from any type of distribution when the sample size, n, is large, i.e., for
large n,

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \mathrel{\dot\sim} N(0, 1); \quad \text{for example, } \Pr\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \approx .95. \tag{3}$$

If the original distribution is normal, then for all n, equation 3 will hold exactly, that
is, if the $X_i$ are i.i.d. $N(\mu, \sigma^2)$, $i = 1, \ldots, n$, then

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1); \quad \text{for example, } \Pr\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95. \tag{4}$$

Equation 4 is useful for constructing confidence intervals for $\mu$.
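Two of the consequences above lend themselves to a short numerical illustration: the estimate of the standard error of the mean (consequence 2) and the 95% probability statement (consequence 5). The sketch below is an illustration under stated assumptions, not the lecture's own code. The data vector `xs` is made up, the function names `sample_sd` and `coverage` are hypothetical, and the coverage check assumes normal data with known $\sigma$, so that rearranging the inequality in equation 4 gives the interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

def sample_sd(xs):
    """Sample standard deviation s, computed by both forms given in consequence 2."""
    n = len(xs)
    xbar = sum(xs) / n
    # Definitional form: squared deviations about the sample mean.
    s_def = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    # Computational form: sum of squares minus (sum)^2 / n.
    s_comp = math.sqrt((sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1))
    return s_def, s_comp

def coverage(n, reps, mu=0.0, sigma=1.0, seed=1):
    """Fraction of simulated samples whose interval xbar +/- 1.96*sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

# Hypothetical observations, used only to exercise the formulas.
xs = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]
s_def, s_comp = sample_sd(xs)
sem = s_def / math.sqrt(len(xs))  # estimated standard error of the mean, s / sqrt(n)
print("s (definitional):", s_def)
print("s (computational):", s_comp)
print("estimated s.e.m.:", sem)

cov = coverage(n=30, reps=5000)
print("simulated coverage of the 95% interval:", cov)
```

The two forms of $s$ agree up to rounding error, and the simulated coverage should be close to .95. The computational form can lose precision when the mean is large relative to the spread, which is one reason the definitional form is usually preferred in software.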
Example
Consider the histograms in Figure 2A. These histograms were generated
using a statistical package called S-Plus and show the distributions of 64 samples
of standard uniform random numbers. Each sample consists of 30 numbers. A standard
uniform distribution is a continuous and flat distribution which takes on values between 0
and 1, inclusive. The expected value (mean) of a standard uniform distribution is 0.5. Figure 2B
is a histogram of the mean values of the 64 samples. Notice that the histogram
of the means looks roughly like a normal distribution and has a mean value which is near
the mean value of a uniform distribution.
Figure 2A: 64 Simulations of 30 Standard Uniform Random Numbers
[8 x 8 grid of histograms, one per sample; x-axis: xu[, i], ticks at 0.0, 0.4, 0.8; y-axis: count]
Figure 2B: Distribution of 64 means from Standard Uniform distributions of size 30
[Histogram; x-axis: mxu, ticks at 0.4, 0.5, 0.6, 0.7; y-axis: count, 0 to 15]
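The simulation in this example can be approximated in a few lines. The original figures were produced in S-Plus and that code is not shown in the handout; the Python sketch below is an assumed equivalent that borrows the names `xu` and `mxu` from the axis labels of Figures 2A and 2B. The seed is arbitrary, so the numbers will not match the original figures.

```python
import math
import random
import statistics

rng = random.Random(64)  # arbitrary seed, chosen only for reproducibility
n_samples, size = 64, 30

# 64 samples, each of 30 standard uniform random numbers (values in [0, 1)).
xu = [[rng.random() for _ in range(size)] for _ in range(n_samples)]

# One mean per sample: the quantity whose histogram is shown in Figure 2B.
mxu = [statistics.fmean(sample) for sample in xu]

# Theory: a standard uniform has mean 1/2 and variance 1/12, so the sample
# means should have mean 0.5 and standard error sqrt((1/12) / 30), about 0.0527.
theory_se = math.sqrt((1 / 12) / size)

print("mean of the 64 means:", statistics.fmean(mxu), "(theory: 0.5)")
print("sd of the 64 means:  ", statistics.stdev(mxu), "(theory:", theory_se, ")")
print("range of the means:  ", min(mxu), "to", max(mxu))
```

A standard error of about 0.05 means nearly all of the 64 means should fall within roughly 0.35 to 0.65, which is consistent with the horizontal axis of Figure 2B.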