
Ismor Fischer, 5/29/2012 5.2-1

5.2 Formal Statement and Examples

Sampling Distribution of a Normal Variable

Given a random variable X. Suppose that the population distribution of X is known to be normal, with mean µ and variance σ², that is, X ~ N(µ, σ). Then, for any sample size n, it follows that the sampling distribution of X̄ is normal, with mean µ and variance σ²/n, that is, X̄ ~ N(µ, σ/√n).

Comments:

σ/√n is called the “standard error of the mean,” denoted SEM, or more simply, s.e.

The corresponding Z-score transformation formula is Z = (X̄ − µ) / (σ/√n) ~ N(0, 1).

Example: Suppose that the ages X of a certain population are normally distributed, with mean µ = 27.0 years, and standard deviation σ = 12.0 years, i.e., X ~ N(27, 12).

The probability that the age of a single randomly selected individual is less than 30 years is P(X < 30) = P(Z < (30 − 27)/12) = P(Z < 0.25) = 0.5987.

Now consider all random samples of size n = 36 taken from this population. By the above, their mean ages X̄ are also normally distributed, with mean µ = 27 yrs as before, but with standard error σ/√n = 12 yrs/√36 = 2 yrs. That is, X̄ ~ N(27, 2).

In this population, the probability that the average age of 36 random people is under 30 years old is much greater than the probability that the age of one random person is under 30 years old.

Exercise: Compare the two probabilities of being under 24 years old.

The probability that the mean age of a single sample of n = 36 randomly selected individuals is less than 30 years is P(X̄ < 30) = P(Z < (30 − 27)/2) = P(Z < 1.5) = 0.9332.

Exercise: Compare the two probabilities of being between 24 and 30 years old.
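The two probabilities in this example can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name normal_cdf is an illustrative choice, not something defined in these notes, and it computes the normal CDF through the error function.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X < x) for X ~ N(mean, sd), computed via the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 27.0, 12.0, 36
se = sigma / math.sqrt(n)  # standard error of the mean: 12 / 6 = 2

p_single = normal_cdf(30, mu, sigma)  # one individual: Z < 0.25
p_mean = normal_cdf(30, mu, se)       # mean of n = 36 individuals: Z < 1.5

print(f"P(X < 30)    = {p_single:.4f}")
print(f"P(Xbar < 30) = {p_mean:.4f}")
```

The same function, called with other cutoffs in place of 30, can be used to work the two exercises.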

If X ~ N(µ, σ) approximately, then X̄ ~ N(µ, σ/√n) approximately. (The larger the value of n, the better the approximation.) In fact, more is true...

IMPORTANT GENERALIZATION:

The Central Limit Theorem

Given any random variable X, discrete or continuous, with finite mean µ and finite variance σ². Then, regardless of the shape of the population distribution of X, as the sample size n gets larger, the sampling distribution of X̄ becomes increasingly closer to normal, with mean µ and variance σ²/n, that is, X̄ ~ N(µ, σ/√n), approximately.

More formally, Z = (X̄ − µ) / (σ/√n) → N(0, 1) as n → ∞.

Intuitively perhaps, there is less variation between different sample mean values than there is between different population values. This formal result states that, under very general conditions, the sampling variability is usually much smaller than the population variability, and it also gives the precise form of the “limiting distribution” of the sample mean.

What if the population standard deviation σ is unknown? Then it can be replaced by the sample standard deviation s, provided n is large. That is, X̄ ~ N(µ, s/√n) approximately, if n ≥ 30 or so, for “most” distributions (... but see example below). Since the value s/√n is a sample-based estimate of the true standard error s.e., it is commonly denoted ŝ.e. (s.e. with a hat).
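As a small illustration of the quantity s/√n, here is a hedged sketch in Python. The function name estimated_standard_error and the data values are hypothetical, chosen only to show the arithmetic; the sample is far smaller than the n ≥ 30 guideline, so it says nothing about how good the normal approximation would be.

```python
import math
import statistics

def estimated_standard_error(sample):
    """Return s / sqrt(n), where s is the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD, dividing by n - 1
    return s / math.sqrt(n)

# Hypothetical data, for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]
se_hat = estimated_standard_error(values)
print(f"estimated standard error = {se_hat:.4f}")
```

Note that statistics.stdev uses the n − 1 divisor, which matches the usual definition of the sample standard deviation s.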

Because the mean µ_X̄ of the sampling distribution is equal to the mean µ_X of the population distribution (i.e., E[X̄] = µ_X), we say that X̄ is an unbiased estimator of µ_X. In other words, the sample mean is an unbiased estimator of the population mean. A biased sample estimator is a statistic θ̂ whose “expected value” either consistently overestimates or underestimates its intended population parameter θ.

 Many other versions of CLT exist, related to so-called Laws of Large Numbers.


Example: Consider a(n infinite) population of paper notes, 50% of which are blank, 30% are ten-dollar bills, and the remaining 20% are twenty-dollar bills.

Experiment 1: Randomly select a single note from the population.

Random variable: X = $ amount obtained

x      f(x) = P(X = x)
0      .5
10     .3
20     .2

Mean µ_X = E[X] = (.5)(0) + (.3)(10) + (.2)(20) = $7.00

Variance σ_X² = E[(X − µ_X)²] = (.5)(−7)² + (.3)(3)² + (.2)(13)² = 61

Standard deviation σ_X = √61 = $7.81
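The mean, variance, and standard deviation of Experiment 1 follow directly from the probability table. A minimal sketch, assuming the table is stored as a Python dictionary (the names pmf, pmf_mean, and pmf_variance are illustrative, not from the notes):

```python
import math

# Population distribution from Experiment 1: dollar amount -> probability.
pmf = {0: 0.5, 10: 0.3, 20: 0.2}

def pmf_mean(pmf):
    """Expected value: sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

def pmf_variance(pmf):
    """Variance: sum of (x - mean)^2 * P(X = x)."""
    mu = pmf_mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

mu_x = pmf_mean(pmf)
var_x = pmf_variance(pmf)
sd_x = math.sqrt(var_x)
print(f"mean = {mu_x:.2f}, variance = {var_x:.2f}, sd = {sd_x:.2f}")
```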

Experiment 2: Each of n = 2 people randomly selects a note, and they split the winnings.

Random variable: X̄ = $ sample mean amount obtained per person

x̄      (x1, x2)     Probability
0      (0, 0)       .5 × .5 = 0.25
5      (0, 10)      .5 × .3 = 0.15
10     (0, 20)      .5 × .2 = 0.10
5      (10, 0)      .3 × .5 = 0.15
10     (10, 10)     .3 × .3 = 0.09
15     (10, 20)     .3 × .2 = 0.06
10     (20, 0)      .2 × .5 = 0.10
15     (20, 10)     .2 × .3 = 0.06
20     (20, 20)     .2 × .2 = 0.04

x̄      f(x̄) = P(X̄ = x̄)
0      .25
5      .30 = .15 + .15
10     .29 = .10 + .09 + .10
15     .12 = .06 + .06
20     .04

Mean µ_X̄ = (.25)(0) + (.30)(5) + (.29)(10) + (.12)(15) + (.04)(20) = $7.00 = µ_X !!

Variance σ_X̄² = (.25)(−7)² + (.30)(−2)² + (.29)(3)² + (.12)(8)² + (.04)(13)² = 30.5 = 61/2 = σ_X²/n !!

Standard deviation σ_X̄ = $5.52 = σ_X/√n !!


Experiment 3: Each of n = 3 people randomly selects a note, and they split the winnings.

Random variable: X̄ = $ sample mean amount obtained per person

x̄       (x1, x2, x3)    Probability
0       (0, 0, 0)       .5 × .5 × .5 = 0.125
3.33    (0, 0, 10)      .5 × .5 × .3 = 0.075
6.67    (0, 0, 20)      .5 × .5 × .2 = 0.050
3.33    (0, 10, 0)      .5 × .3 × .5 = 0.075
6.67    (0, 10, 10)     .5 × .3 × .3 = 0.045
10      (0, 10, 20)     .5 × .3 × .2 = 0.030
6.67    (0, 20, 0)      .5 × .2 × .5 = 0.050
10      (0, 20, 10)     .5 × .2 × .3 = 0.030
13.33   (0, 20, 20)     .5 × .2 × .2 = 0.020
3.33    (10, 0, 0)      .3 × .5 × .5 = 0.075
6.67    (10, 0, 10)     .3 × .5 × .3 = 0.045
10      (10, 0, 20)     .3 × .5 × .2 = 0.030
6.67    (10, 10, 0)     .3 × .3 × .5 = 0.045
10      (10, 10, 10)    .3 × .3 × .3 = 0.027
13.33   (10, 10, 20)    .3 × .3 × .2 = 0.018
10      (10, 20, 0)     .3 × .2 × .5 = 0.030
13.33   (10, 20, 10)    .3 × .2 × .3 = 0.018
16.67   (10, 20, 20)    .3 × .2 × .2 = 0.012
6.67    (20, 0, 0)      .2 × .5 × .5 = 0.050
10      (20, 0, 10)     .2 × .5 × .3 = 0.030
13.33   (20, 0, 20)     .2 × .5 × .2 = 0.020
10      (20, 10, 0)     .2 × .3 × .5 = 0.030
13.33   (20, 10, 10)    .2 × .3 × .3 = 0.018
16.67   (20, 10, 20)    .2 × .3 × .2 = 0.012
13.33   (20, 20, 0)     .2 × .2 × .5 = 0.020
16.67   (20, 20, 10)    .2 × .2 × .3 = 0.012
20      (20, 20, 20)    .2 × .2 × .2 = 0.008

x̄       f(x̄) = P(X̄ = x̄)
0.00    .125
3.33    .225 = .075 + .075 + .075
6.67    .285 = .050 + .045 + .050 + .045 + .045 + .050
10.00   .207 = .030 + .030 + .030 + .027 + .030 + .030 + .030
13.33   .114 = .020 + .018 + .018 + .020 + .018 + .020
16.67   .036 = .012 + .012 + .012
20.00   .008

Mean µ_X̄ = Exercise = $7.00 = µ_X !!!

Variance σ_X̄² = Exercise = 20.333 = 61/3 = σ_X²/n !!!

Standard deviation σ_X̄ = $4.51 = σ_X/√n !!!
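The tables in Experiments 2 and 3 can be reproduced by enumerating every possible sample and accumulating probabilities. The sketch below is one way to do that, assuming independent draws from the note population; the function names are illustrative, and exact fractions are used so that the results are free of rounding error.

```python
from fractions import Fraction
from itertools import product
import math

# Population from Experiment 1, with exact probabilities.
pmf = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}

def sampling_distribution(pmf, n):
    """Exact distribution of the sample mean of n independent draws."""
    dist = {}
    for outcome in product(pmf, repeat=n):  # all ordered samples of size n
        prob = math.prod(pmf[x] for x in outcome)
        xbar = Fraction(sum(outcome), n)
        dist[xbar] = dist.get(xbar, 0) + prob
    return dist

def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum(p * (x - mu) ** 2 for x, p in dist.items())
    return mu, var

for n in (1, 2, 3):
    mu, var = mean_and_variance(sampling_distribution(pmf, n))
    print(f"n = {n}: mean = {float(mu):.2f}, variance = {float(var):.3f}, "
          f"sd = {math.sqrt(var):.2f}")
```

Enumeration visits 3^n samples, so it is practical only for small n; it is meant as a check on the hand calculations, not as a general method.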

The tendency toward a normal distribution becomes stronger as the sample size n gets larger, despite the mild skew in the original population values. This is an empirical consequence of the Central Limit Theorem.

For most such distributions, n ≥ 30 or so is sufficient for a reasonable normal approximation to the sampling distribution. In fact, if the distribution is symmetric, then convergence to a bell curve can often be seen for much lower n, say only n = 5 or 6. Recall also, from the first result in this section, that if the population is normally distributed (with known σ), then so will be the sampling distribution, for any n.
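One way to quantify how the mild skew fades is to compute the skewness coefficient (third central moment divided by the cube of the standard deviation) of the sampling distribution for several n. The notes do not use this measure; it is introduced here only as an assumption for illustration. For independent draws it equals the population skewness divided by √n, which the sketch below checks on the note population by building the exact distribution of the sample total through repeated convolution.

```python
import math

pmf = {0: 0.5, 10: 0.3, 20: 0.2}  # population from Experiment 1

def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent discrete variables."""
    out = {}
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

def skewness(dist):
    """Third central moment divided by the cube of the standard deviation."""
    mu = sum(x * p for x, p in dist.items())
    var = sum(p * (x - mu) ** 2 for x, p in dist.items())
    third = sum(p * (x - mu) ** 3 for x, p in dist.items())
    return third / var ** 1.5

def skewness_of_sample_mean(pmf, n):
    total = dict(pmf)
    for _ in range(n - 1):
        total = convolve(total, pmf)
    # Dividing the total by n rescales it but leaves skewness unchanged.
    return skewness(total)

for n in (1, 2, 3, 30):
    print(n, round(skewness_of_sample_mean(pmf, n), 3))
```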

BUT BEWARE....


However, if the population distribution of X is highly skewed, then the sampling distribution of X̄ can be highly skewed as well (especially if n is not very large), i.e., relying on the CLT can be risky! (Although, sometimes using a transformation, such as ln(X) or √X, can restore a bell shape to the values. Later…)
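The warning above can be explored with a small simulation. The sketch below assumes a lognormal population with parameters 0 and 1.5, which is simply a convenient heavily right-skewed choice and is not the population used in these notes; the seed, the function names, and the use of the skewness coefficient as a summary are likewise illustrative assumptions.

```python
import random
import statistics

def sample_skewness(values):
    """Skewness coefficient of a list of numbers (population form)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

def simulated_sample_means(n, reps, rng):
    """Means of `reps` random samples of size n from the assumed population."""
    # Assumed population: lognormal(0, 1.5), heavily right-skewed.
    return [statistics.fmean(rng.lognormvariate(0.0, 1.5) for _ in range(n))
            for _ in range(reps)]

rng = random.Random(2012)  # fixed seed so the run is repeatable
for n in (30, 100):
    means = simulated_sample_means(n, 1000, rng)
    print(n, round(sample_skewness(means), 2))
```

With a population this skewed, one would expect the printed skewness to remain clearly positive at both sample sizes, in line with the caution about relying on the CLT.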

Example: The two graphs on the bottom of this page are simulated sampling distributions for the highly skewed population shown below. Both are density plots based on the means of 1000 random samples; the first corresponds to samples of size n = 30, the second to n = 100. Note that skew is still present!

Population Distribution