
Outline for Class meeting 16 (Chapter 6, Lohr, 4/3/06) Unequal probability, With Replacement Sampling

I. (Semi-review) One-stage sampling

We may want to give every unit in the population its own probability of selection. The units can be individual observations or psu's.

A. Example: Library data
Let $y_i$ = # of inquiries (variable of interest); let $x_i$ = # of circulated items.

Suppose I sample with replacement, and with probability proportional to $x_i$; i.e., let
$$\psi_i = \frac{x_i}{\sum_{j=1}^{N} x_j}$$
at each draw. I make n draws in all, but don't necessarily get n different units. (Why do we sample with replacement?)

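B. How do I implement such a design? One way is the cumulative size method:
Step 1. Compute the cumulative sums $x_1,\ x_1 + x_2,\ \ldots,\ x_1 + x_2 + \cdots + x_N = t_x$.
Step 2. Select a random integer between 1 and $t_x$, say $r$.
Step 3. If $1 + \sum_{j=1}^{i-1} x_j \le r \le \sum_{j=1}^{i} x_j$, then select unit $i$ into the sample.
Step 4. Repeat Steps 2 - 3 until n draws have been made (the cumulative sums from Step 1 need to be computed only once).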
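The cumulative size method can be sketched in Python as follows. This is a minimal illustration, not part of the text: the function name `cumulative_size_sample` and its arguments are hypothetical, and the size measures are assumed to be positive integers.

```python
import bisect
import itertools
import random


def cumulative_size_sample(sizes, n, rng=None):
    """Draw n units with replacement, with probability proportional to size.

    sizes: positive integer size measures x_1, ..., x_N (an assumption).
    Returns 0-based unit indices, one per draw; repeats are possible.
    """
    rng = rng or random.Random()
    # Step 1: cumulative totals x_1, x_1 + x_2, ..., t_x (computed once).
    cum = list(itertools.accumulate(sizes))
    t_x = cum[-1]
    draws = []
    for _ in range(n):
        # Step 2: random integer r with 1 <= r <= t_x.
        r = rng.randint(1, t_x)
        # Step 3: pick the first unit whose cumulative total is >= r.
        draws.append(bisect.bisect_left(cum, r))
    return draws


if __name__ == "__main__":
    # Hypothetical size measures for three units.
    print(cumulative_size_sample([3, 2, 5], n=4, rng=random.Random(0)))
```

The binary search in Step 3 is only a convenience; a linear scan over the cumulative totals selects the same unit.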

II. Estimators for one-stage sampling
A. Define
$\pi_i$ = Pr[ith unit is selected in sample]
and
$\psi_i$ = Pr[ith unit is selected on each draw]

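The two estimators are
1. The Horvitz-Thompson estimator (rarely used)
$$\hat t_{HT} = \sum_{i \in S} \frac{y_i}{\pi_i} = \sum_{i \in S} \frac{y_i}{1 - (1 - \psi_i)^n}$$
2. Another estimator (usually used)
$$\hat t_\psi = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{\psi_i}$$
B. Properties of $\hat t_\psi$
1. It is unbiased.
2. Its variance is
$$V(\hat t_\psi) = \frac{1}{n} \sum_{i=1}^{N} \psi_i \left( \frac{y_i}{\psi_i} - t \right)^2 \qquad (*)$$
3. Its variance can be estimated by
$$v(\hat t_\psi) = \frac{1}{n} \left[ \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{y_i}{\psi_i} - \hat t_\psi \right)^2 \right] \qquad (**)$$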
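As a small illustration, the usual estimator and its variance estimate (**) can be computed directly from the n draws. The function name `pps_wr_estimate` and the example numbers are made up for this sketch; each draw contributes one entry, so a unit selected twice appears twice.

```python
def pps_wr_estimate(y, psi):
    """Return (t_hat, v_hat) for a with-replacement unequal-probability sample.

    y:   observed y_i, one entry per draw (repeated units appear repeatedly).
    psi: the corresponding per-draw selection probabilities psi_i.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two draws to estimate the variance")
    # z_i = y_i / psi_i is one "observation" of the total per draw.
    z = [yi / pi for yi, pi in zip(y, psi)]
    t_hat = sum(z) / n                                   # estimator of t
    s2 = sum((zi - t_hat) ** 2 for zi in z) / (n - 1)    # sample variance of z
    return t_hat, s2 / n                                 # (**) is s^2 / n


if __name__ == "__main__":
    # Hypothetical draws: unit values and their draw probabilities.
    t_hat, v_hat = pps_wr_estimate([10, 20, 10], [0.1, 0.25, 0.1])
    print(t_hat, v_hat)
```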

C. Proving properties of $\hat t_\psi$
When you do sampling with replacement, you are back in the iid world of math stat. Define a random variable $Z$ that takes the value
$$z_i = \frac{y_i}{\psi_i} \quad \text{with probability } \psi_i, \qquad i = 1, \ldots, N.$$
Observe this random variable n times.

Thinking of the sampling design in this way allows you to use the machinery of iid mathematical statistics.
1. We know
$$\mu_z = \sum_{i=1}^{N} \psi_i \frac{y_i}{\psi_i} = \sum_{i=1}^{N} y_i = t.$$
2. We know that the sample mean
$$\bar z = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{\psi_i} = \hat t_\psi$$
is an unbiased estimator of $\mu_z$.

3. We know that the variance of the sample mean is
$$V(\bar z) = \frac{1}{n} \sigma_z^2 = \frac{1}{n} E[Z - \mu_z]^2 = \frac{1}{n} \sum_{i=1}^{N} \psi_i (z_i - \mu_z)^2 = (*).$$

4. We can estimate the variance of the sample mean by
$$v(\bar z) = \frac{s^2}{n} = \frac{1}{n} \left[ \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \bar z)^2 \right] = (**).$$

III. What probability of selection should be used?
A. The variance can be driven to 0 if $\psi_i \propto y_i$ (!), since then $y_i/\psi_i = t$ for every unit and every term in (*) vanishes. So the goal is to pick probabilities proportional to some characteristic that is as highly correlated as possible with $y_i$.
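A quick numerical check of this point, using exact fractions and a made-up population of three units (all values here are hypothetical, and the $y_i$ are assumed positive so that $\psi_i = y_i/t$ is a valid probability):

```python
from fractions import Fraction


def design_variance(y, psi, n):
    """Variance formula (*): (1/n) * sum_i psi_i * (y_i / psi_i - t)^2."""
    t = sum(y)
    return sum(p * (yi / p - t) ** 2 for yi, p in zip(y, psi)) / n


y = [Fraction(v) for v in (2, 3, 5)]       # hypothetical population values
t = sum(y)
equal = [Fraction(1, 3)] * 3               # equal draw probabilities
ideal = [yi / t for yi in y]               # psi_i = y_i / t (needs y, so not usable in practice)
x = [Fraction(v) for v in (3, 3, 4)]       # hypothetical size measure correlated with y
by_size = [xi / sum(x) for xi in x]        # psi_i proportional to x_i

if __name__ == "__main__":
    for name, psi in [("equal", equal), ("ideal", ideal), ("by_size", by_size)]:
        print(name, design_variance(y, psi, n=1))
```

With the ideal probabilities the variance is exactly zero; probabilities based on a correlated size measure fall between that and the equal-probability design in this example.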

B. When sample units are psu's, the probability is chosen proportional to size of the psu. Called probability-proportional-to-size (pps) sampling. Also called dollar unit sampling in accounting.

IV. Two-stage sampling
Sometimes psu’s are selected with replacement with unequal probability, but then ssu’s within the psu are subsampled.
A. Estimator

1. As before, define
$\psi_i$ = Pr[ith unit (psu) is selected on each draw]
The most commonly used estimator is $\hat t_\psi$. It is similar to the estimator for one-stage designs of the same notation (we’ve run out of notation apparently!). Since we cannot observe $t_i$, we must estimate it. Any kind of sample design we wish can be used within the psu to estimate this total, but it must be determined in advance and must be the same design each time the psu is sampled. Then the two-stage estimator is
$$\hat t_\psi = \frac{1}{n} \sum_{i=1}^{n} \frac{\hat t_i}{\psi_i}.$$
Note that the same psu can be sampled more than once, and unlike the one-stage case, each time it enters the sample its estimated total may be different, because a different sample may be chosen.

(Note: Lohr writes this estimator differently, using the Qi notation, where Qi = # of times psu i is chosen into the sample. But the estimator she writes is the same as ours.)
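The two-stage estimator can be sketched as below. This is only an illustration under stated assumptions: the within-psu design is taken to be a simple random sample of m ssu's without replacement, with the psu total estimated by (psu size) x (subsample mean); the function names and the list-of-lists data layout are hypothetical.

```python
import random


def estimate_psu_total(ssu_values, m, rng):
    """Estimate a psu total from an SRS of m ssu's (assumed within-psu design)."""
    sample = rng.sample(ssu_values, m)
    return len(ssu_values) * sum(sample) / m


def two_stage_estimate(psus, psi, n, m, rng=None):
    """Draw n psu's with replacement (probabilities psi) and return t_hat.

    psus: list of lists holding the ssu values of each psu (hypothetical layout).
    A psu drawn more than once is subsampled independently each time,
    so its estimated total can differ between draws.
    """
    rng = rng or random.Random()
    draws = rng.choices(range(len(psus)), weights=psi, k=n)
    z = [estimate_psu_total(psus[i], m, rng) / psi[i] for i in draws]
    return sum(z) / n


if __name__ == "__main__":
    psus = [[1, 2, 4], [3, 4, 8, 9], [5, 6, 7]]    # made-up population
    sizes = [len(p) for p in psus]
    psi = [s / sum(sizes) for s in sizes]           # pps by number of ssu's
    print(two_stage_estimate(psus, psi, n=4, m=2, rng=random.Random(0)))
```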

B. Properties of $\hat t_\psi$
1. It is unbiased. (You should be able to prove this.)

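2. Its variance is
$$V(\hat t_\psi) = \frac{1}{n} \sum_{i=1}^{N} \psi_i \left( \frac{t_i}{\psi_i} - t \right)^2 + \frac{1}{n} \sum_{i=1}^{N} \frac{V_i}{\psi_i}$$
where $V_i = \mathrm{Var}(\hat t_i \mid i)$ and $t_i$ is the total for psu $i$. This is pretty tedious to prove, but uses the ideas we discussed in class last week.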

3. The good news is that its variance can be estimated by
$$v(\hat t_\psi) = \frac{1}{n} \left[ \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{\hat t_i}{\psi_i} - \hat t_\psi \right)^2 \right] = \frac{1}{n} s^2$$
where $s^2$ is the same as in the one-stage case. It is just the sample variance of the n values $\hat t_i / \psi_i$ built from the estimated psu totals.

C. Example

Telephone surveys are often done using a two-stage, unequal probability with replacement design. See the steps on p. 200 of your text.

This is called the Waksberg-Mitofsky method. For this design, $\psi_i = M_i / K$, where $M_i$ is the number of residential numbers in psu $i$ and $K$ is the number of residential numbers in the universe. The cleverness of this design is that it is not necessary to know $\psi_i$ in order to calculate the estimate if the same number of phone #’s are sampled from each psu.
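To see why $\psi_i$ drops out, here is a short worked sketch. It assumes (this is not stated above) that $k$ residential numbers are sampled from each selected psu and that the psu total is estimated by $\hat t_i = (M_i/k) \sum_{j=1}^{k} y_{ij}$, where $y_{ij}$ is the response for the jth sampled number in psu $i$.

```latex
\[
\frac{\hat t_i}{\psi_i}
  = \frac{(M_i/k)\sum_{j=1}^{k} y_{ij}}{M_i/K}
  = \frac{K}{k}\sum_{j=1}^{k} y_{ij},
\qquad
\hat t_\psi
  = \frac{1}{n}\sum_{i=1}^{n}\frac{\hat t_i}{\psi_i}
  = \frac{K}{nk}\sum_{i=1}^{n}\sum_{j=1}^{k} y_{ij}.
\]
```

The $M_i$ cancel, so under these assumptions the estimated mean per residential number, $\hat t_\psi / K$, is just the plain average of the $nk$ sampled values.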