Introduction to Sampling Distributions and Statistical Estimation


James H. Steiger

November 12, 2003


1. Topics for this Module

1. Parameters and Statistics
   (a) Reversing the Information Flow in Statistical Inference
2. Sampling Distributions
3. Sampling Error
4. Principles of “Good Estimation”
   (a) Unbiasedness
   (b) Consistency
5. Practical vs. Theoretical Considerations

2. Parameters and Statistics

In statistical estimation we use a statistic (a function of a sample) to estimate a parameter, a numerical characteristic of a statistical population. In the preceding discussion of the binomial distribution, we discussed a well-known statistic, the sample proportion $\hat{p}$, and how its long-run distribution over repeated samples can be described, using the binomial process and the binomial distribution as models. We found that, if the binomial model is correct, we can describe the exact distribution of $\hat{p}$ (over repeated samples) if we know N and p, the parameters of the binomial distribution. The situation can be diagrammed as in the top of Figure 1 below.

2.1. Reversing the Information Flow in Statistical Inference

Probability theory tells us about the long-run behavior of $\hat{p}$, but this requires specification of precisely what we do not know, i.e., p, the proportion in the population. However, what we would like to have is something different from what probability theory provides directly. Generally what we have is one sample, of size N, and what we would like probability theory to provide for us is knowledge about p on the basis of the information in our data. But it doesn’t, and so probability theory, as important as it is, provides just the beginning point for statistical inference.

[Figure 1: Reversing the information flow in statistical inference. Top panel, “What probability theory gives you”: the parameters (N, p) and the assumptions (binomial process) lead to the behavior of $\hat{p}$ over repeated samples of size N. Bottom panel, “What you want”: the observation of a single $\hat{p}$, based on a single sample of size N, leads to a reasonable inference about p.]

Looking back at Figure 1, what we would like is to turn around the direction of information flow. In probability theory, knowledge of p and N leads to knowledge about the long-run behavior of $\hat{p}$. In statistical inference, we would like something else: a method to use knowledge of $\hat{p}$ and N to lead to knowledge of p.
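The “probability theory” direction of Figure 1 can be sketched in a few lines of Python. This is only an illustration under the binomial model; the function name `phat_distribution` and the values N = 10 and p = 0.3 are assumptions made for the example, not part of the original notes.

```python
from math import comb


def phat_distribution(n, p):
    """Exact distribution of the sample proportion under a binomial model.

    Returns a dict mapping each possible value k/n to its probability.
    """
    return {k / n: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


# Hypothetical parameter values, chosen only for illustration.
dist = phat_distribution(n=10, p=0.3)
for phat, prob in dist.items():
    print(f"p_hat = {phat:.1f}   probability = {prob:.4f}")
```

Note that the function needs p as an input. With only one observed $\hat{p}$ and no knowledge of p, this calculation cannot be run, which is exactly the gap that statistical inference has to fill.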

3. Sampling Distributions

A statistic is any function of the sample. Over repeated samples, statistics will almost always vary in value. So, over repeated samples, a statistic will have a sampling distribution. Sampling distributions have several characteristics:

1. Exact sampling distributions are difficult to derive.
2. They are often different in shape from the distribution of the population from which they are sampled.
3. They often vary in shape (and in other characteristics) as a function of N.
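Characteristics 2 and 3 can be seen in a small simulation. The sketch below is an illustration, not part of the original notes: it assumes an exponential population (which is strongly right-skewed), uses the sample mean as the statistic, and picks the seed, the sample sizes, and the number of repetitions arbitrarily.

```python
import random
import statistics


def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an exponential population; return their means."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]


def skewness(values):
    """Moment-based skewness; it is 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(2003)  # arbitrary seed so the run is repeatable
results = {}
for n in (2, 10, 50):
    means = sample_means(n, reps=5000, rng=rng)
    results[n] = (statistics.pstdev(means), skewness(means))
    print(f"N = {n:2d}   spread of the mean = {results[n][0]:.3f}   skewness = {results[n][1]:.3f}")
```

As N grows, the simulated sampling distribution of the mean should become both narrower and more symmetric, so its shape differs from that of the skewed population and changes with N.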


4. Sampling Error

Consider any statistic $\hat{\theta}$ used to estimate a parameter $\theta$. For any given sample of size N, it is virtually certain that $\hat{\theta}$ will not be equal to $\theta$. We can describe the situation with the following equation in random variables

$$\hat{\theta} = \theta + \varepsilon$$

where $\varepsilon$ is called “sampling error,” and is defined tautologically as

$$\varepsilon = \hat{\theta} - \theta \tag{1}$$

i.e., the amount by which $\hat{\theta}$ is wrong. In most situations, $\varepsilon$ can be either positive or negative.

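Equation 1 can be made concrete with simulated data. In the sketch below the population proportion p is known only because we generate the data ourselves; in a real study the sampling error could not be computed. The helper `sample_proportion`, the seed, and the values of p and N are assumptions made for this illustration.

```python
import random


def sample_proportion(n, p, rng):
    """Proportion of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n)) / n


rng = random.Random(1)  # arbitrary seed so the run is repeatable
p = 0.3                 # hypothetical population proportion
n = 50                  # hypothetical sample size

errors = []
for i in range(5):
    p_hat = sample_proportion(n, p, rng)
    error = p_hat - p   # Equation 1: sampling error = estimate - parameter
    errors.append(error)
    print(f"sample {i + 1}: p_hat = {p_hat:.2f}   sampling error = {error:+.2f}")
```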

5. Principles of “Good Estimation”

A statistic that is used to estimate a particular parameter is called an estimator of that parameter. A realized value of the estimator is called an estimate of the parameter. Although some of the estimators that we use (like the sample mean $\bar{X}$ and the sample proportion $\hat{p}$) are the sample analogs of the population quantities they estimate, many other estimators (for example, $s^2$, the sample variance) are not.

For any parameter, there are many possible estimators. Generally, an estimator in wide use has achieved popularity because it satisfies one or more optimality criteria, i.e. qualities that a good estimator is supposed to have. Below, we discuss a number of commonly used criteria for a good estimator.

5.1. Unbiasedness

Definition 5.1 (Unbiased Estimator) An estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased if $E(\hat{\theta}) = \theta$, or, equivalently, if $E(\varepsilon) = 0$, where $\varepsilon$ is sampling error as defined in Equation 1.

Ideally, we would like the positive and negative errors of an estimator to balance out in the long run, so that, on average, the estimator is neither high (an overestimate) nor low (an underestimate).
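For a very small population, the expected value of an estimator can be computed exactly by listing every possible sample. The sketch below is an illustration only: the four-value population, the sample size of 3, and sampling with replacement (so that all samples are equally likely) are assumptions made for the example. It compares the sample mean and two versions of the sample variance, one dividing by N - 1 and one dividing by N.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]  # hypothetical tiny population
n = 3                      # hypothetical sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor


# Every ordered sample of size n drawn with replacement is equally likely.
samples = list(product(population, repeat=n))


def expected(statistic):
    """Exact expected value of a statistic over all equally likely samples."""
    return sum(statistic(s) for s in samples) / len(samples)


e_mean = expected(lambda s: Fraction(sum(s), len(s)))
e_var_n1 = expected(lambda s: variance(s, n - 1))
e_var_n = expected(lambda s: variance(s, n))

print("population mean:", mu, "  E(sample mean):", e_mean)
print("population variance:", sigma2)
print("E(variance, divisor N-1):", e_var_n1)
print("E(variance, divisor N):  ", e_var_n)
```

Under these assumptions the sample mean and the N - 1 version of the variance have expected values equal to the parameters they estimate, while the divisor-N version falls short on average, i.e. its expected sampling error is negative.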

5.2. Consistency

We would like an estimator to get better and better as N gets larger and larger, otherwise we are wasting our effort gathering a larger sample. If we define some error tolerance $\epsilon$, we would like to be sure that the absolute sampling error $|\varepsilon|$ is almost certainly less than $\epsilon$ if we let N get large enough. Formally we say the following.

Definition 5.2 (Consistency) An estimator $\hat{\theta}$ of a parameter $\theta$ is consistent if for any error tolerance $\epsilon > 0$, no matter how small, a sequence of statistics $\hat{\theta}_N$ based on a sample of size N will satisfy the following

$$\lim_{N \to \infty} \Pr\left( \left| \hat{\theta}_N - \theta \right| < \epsilon \right) = 1 \tag{2}$$

Example 5.1 (An Unbiased, Inconsistent Estimator) Consider the statistic $D = (X_1 + X_2)/2$ as an estimator for the population mean. No matter how large N is, D always takes the average of just the first two observations. This statistic has an expected value of $\mu$, the population mean, since

$$E\left( \tfrac{1}{2} X_1 + \tfrac{1}{2} X_2 \right) = \tfrac{1}{2} E(X_1) + \tfrac{1}{2} E(X_2) = \tfrac{1}{2}\mu + \tfrac{1}{2}\mu = \mu$$

but it does not keep improving in accuracy as N gets larger and larger.
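The contrast between D and the ordinary sample mean can be checked by simulation. The sketch below estimates the probability in Equation 2 for both estimators at several sample sizes. It assumes a normal population with mean 0 and standard deviation 1; the tolerance of 0.25, the seed, and the number of repetitions are arbitrary choices for this illustration.

```python
import random
import statistics


def coverage(estimator, n, tol, reps, rng, mu=0.0, sigma=1.0):
    """Estimate Pr(|estimate - mu| < tol) by simulation for samples of size n."""
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(estimator(sample) - mu) < tol:
            hits += 1
    return hits / reps


def first_two(sample):
    """D = (X1 + X2) / 2: ignores every observation after the second."""
    return (sample[0] + sample[1]) / 2


rng = random.Random(12)  # arbitrary seed so the run is repeatable
tol = 0.25               # hypothetical error tolerance
table = {}
for n in (2, 20, 200):
    table[n] = (
        coverage(first_two, n, tol, 2000, rng),
        coverage(statistics.fmean, n, tol, 2000, rng),
    )
    print(f"N = {n:3d}   Pr(|D - mu| < tol) = {table[n][0]:.3f}   "
          f"Pr(|mean - mu| < tol) = {table[n][1]:.3f}")
```

The estimated probability for the sample mean should climb toward 1 as N grows, while the probability for D should stay roughly where it started, since D never uses the additional observations.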

5.3. Efficiency

All other things being equal, we prefer estimators with smaller sampling errors. Several reasonable measures of “smallness” suggest themselves: (a) the average absolute error, and (b) the average squared error. Consider the latter. The variance of an estimator can be written

$$\sigma^2_{\hat{\theta}} = E\left[ \left( \hat{\theta} - E(\hat{\theta}) \right)^2 \right]$$

and when the estimator is unbiased, $E(\hat{\theta}) = \theta$, so the variance becomes

$$\sigma^2_{\hat{\theta}} = E\left[ \left( \hat{\theta} - \theta \right)^2 \right] = E\left( \varepsilon^2 \right)$$

since $\hat{\theta} - \theta = \varepsilon$.

For an unbiased estimator, the sampling variance is also the average squared error, and is a direct measure of how inaccurate the estimator is, on average. More generally, though, one can think of sampling variance as the randomness, or noise, inherent in a statistic. (The parameter is the “signal.”) Such noise is generally to be avoided.

Consequently, the efficiency of a statistic is inversely related to its sampling variance, i.e.

$$\text{Efficiency}(\hat{\theta}) = \frac{1}{\sigma^2_{\hat{\theta}}}$$

The relative efficiency of two statistics is the ratio of their efficiencies, which is the inverse of the ratio of their sampling variances.

Example 5.2 (Relative Efficiency) Suppose statistic A has a sampling variance of 5, and statistic B has a sampling variance of 10. The relative efficiency of A relative to B is 2.
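The arithmetic of Example 5.2 can be written out as a short sketch. The function names `efficiency` and `relative_efficiency` are chosen here for illustration and do not come from any library.

```python
def efficiency(sampling_variance):
    """Efficiency is the reciprocal of the sampling variance."""
    return 1.0 / sampling_variance


def relative_efficiency(var_a, var_b):
    """Efficiency of A relative to B.

    This is efficiency(A) / efficiency(B), which simplifies to the
    inverse ratio of the sampling variances, var_b / var_a.
    """
    return var_b / var_a


# Example 5.2: statistic A has sampling variance 5, statistic B has 10.
print(relative_efficiency(5, 10))
```

A value above 1 means that A is the less noisy of the two statistics; swapping the arguments gives the efficiency of B relative to A.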

5.4. Sufficiency

An estimator $\hat{\theta}$ is sufficient for estimating $\theta$ if it uses all the information about $\theta$ available in a sample. The formal definition is as follows:

Definition 5.3 (Sufficient Statistic) Recalling that any statistic is a function of the sample, define $\hat{\theta}(S)$ to be a particular value of an estimator $\hat{\theta}$ based on a specific sample S. An estimator $\hat{\theta}$ is a sufficient statistic for estimating $\theta$ if the conditional distribution of the sample S given $\hat{\theta}(S)$ does not depend on $\theta$.

The fact that once the distribution is conditioned on $\hat{\theta}$ it no longer depends on $\theta$ shows that all the information that $\theta$ might “reveal in the sample” is captured by $\hat{\theta}$.
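Definition 5.3 can be checked exactly in a simple case. The sketch below assumes N independent 0/1 observations with success probability p, and conditions on the number of successes, which carries the same information as the sample proportion. The sample size of 3 and the two values of p are arbitrary choices for this illustration.

```python
from fractions import Fraction
from itertools import product


def conditional_given_sum(n, p):
    """P(sample | number of successes) for every 0/1 sample of size n."""
    # Joint probability of each ordered sample under independence.
    joint = {s: p ** sum(s) * (1 - p) ** (n - sum(s)) for s in product((0, 1), repeat=n)}
    # Probability of each possible number of successes.
    totals = {}
    for s, prob in joint.items():
        totals[sum(s)] = totals.get(sum(s), 0) + prob
    return {s: prob / totals[sum(s)] for s, prob in joint.items()}


# Two different hypothetical values of the parameter p.
cond_a = conditional_given_sum(3, Fraction(1, 4))
cond_b = conditional_given_sum(3, Fraction(3, 5))

print(cond_a[(1, 0, 0)], cond_b[(1, 0, 0)])
print(cond_a == cond_b)
```

The two conditional distributions are identical even though p differs: once the number of successes is known, the particular order of successes and failures tells us nothing further about p.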

5.5. Maximum Likelihood

The likelihood of a sample of N independent observations is simply the product of the probability densities of the individual observations. Of course, if you don’t know the parameters of the population distribution, you cannot compute the probability density of an observation. The principle of maximum likelihood says that the best estimate of a population parameter is the one that makes the sample most likely. Deriving estimators by the principle of maximum likelihood often requires calculus to solve the maximization problem, and so we will not pursue the topic here.
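Although the derivation usually requires calculus, the idea can be illustrated by brute force. The sketch below assumes 0/1 observations (so the “densities” are simply probabilities), uses made-up data, and tries a grid of candidate values of p, keeping the one that makes the observed sample most likely. It is a numerical illustration, not a derivation.

```python
def likelihood(p, observations):
    """Product of the probabilities of the individual independent 0/1 observations."""
    result = 1.0
    for x in observations:
        result *= p if x == 1 else (1 - p)
    return result


# Hypothetical data: 3 successes in 10 trials.
observations = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# Candidate parameter values 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: likelihood(p, observations))

print("value of p that maximizes the likelihood:", best_p)
print("sample proportion:", sum(observations) / len(observations))
```

For these data the grid search lands on the sample proportion, which is consistent with the sample proportion being the maximum likelihood estimator of p in the binomial model.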


6. Practical vs. Theoretical Considerations

In any particular situation, depending on circumstances, you may have an overriding consideration that causes you to ignore one or more of the above considerations, for example the need to make as small an error as possible when using your own data. In some situations, any additional error of estimation can be extremely costly, and practical considerations may dictate a biased estimator if it can be guaranteed that the bias will reduce $\varepsilon$ for that sample.
