2.7 Randomization Theory Resultsfor Simple Random Sampling * 43
in the alphabet is associated with the characteristic of interest. However, systematic sampling is not the same as simple random sampling; it does not have the property that every possible group ofn units has the same probability of being the sample. In the preceding example, it is impossible to have students 345 and 346 both appear in the sample. Systematic sampling is technically a form ofcluster sampling, as will be discussed in Chapter 5. Most of the time, a systematic sample gives results comparable to those of an SRS, and SRS methods can be used in the analysis. If the population is in random order, the systematic sample will be much like an SRS. The population itself can be thought of as being mixed. In the quote at the beginning of the chapter, Sorensen reports that President Kennedy used to read a systematic sample of letters written to him at the White House. This systematic sample most likely behaved much like a random sample. Note that Kennedy was well aware that the letters he read, although representative of letters written to the White House, were not at all representative of public opinion. Systematic sampling does not necessarily give a representative sample, though, if the listing of population units is in some periodic or cyclical order. If male and female names alternate in the list, for example, and k is even, the systematic sample will contain either all men or all women-this cannot be considered a representative sample. In ecological surveys done on agricultural land, a ridge-and-fuITow topogra- phy may be present that would lead to a periodic pattern ofvegetation. Ifa systematic sampling scheme follows the same cycle, the sample will not behave like an SRS. On the other hand, some populations are in increasing or decreasing order. A list of accounts receivable may be ordered from largest amount to smallest amount. In this case, estimates from the systematic sample may have smaller (but unestimable) variance than comparable estimates from the SRS. A systematic sample from an ordered list of accounts receivable is forced to contain some large amounts and some small amounts. It is possible for an SRS to contain all small amounts or all large amounts, so there may be more variability among the sample means of all possible SRSs than there is among the sample means of all possible systematic samples. In systematic sampling, we must still have a sampling frame and be careful when defining the target population. Sampling every 20th student to enter the library will not give a representative sample of the student body. Sampling every 10th person exiting an airplane, though, will probably give a representative sample of the persons on that flight. The sampling frame for the airplane passengers is not written down, but it exists all the same. 2.7 Randomization Theory Results for Simple Random Sampling*l In this section we show that y is an unbiased estimator ofYu: Yu is the average of all possible values of ys if we could examine all possible SRSs S that could be chosen. We also calculate the variance of ygiven in Equation (2.7) and show that the estimator in Equation (2.9) is unbiased over repeated sampling.
I An asterisk (*) indicates a section, chapter, or exercise that requires more mathematical background.
= N.
I- N
=
N -
N
=
(E[ZiD -
E[Zi] )2 V(Zi) n) ( n 22 n ( n
that Note
ZN. , ...
Z!,
variables
random
the of
properties
using calculated y also is of variance The
N n
i=!
N n
i=!
i=!
=
[
= Yu·
- = L....J
- - L....J
L....J
- - i "Z Y
- "Yi
Yi E[
"n
Yi
E
- -]
N N ] N
and
,N
=- E[Z] = E[ZiJ
n 2
(2.18), Equation of consequence a As
N
samples
possible of number
=
=
I) ---::------:--=-----,-,-----=-- Zi P(
n
( n-l
unit i including samples of number
-1) N
so
1,
- size N of population a from drawn be may 1 - n
size of
samples
possible
(
of
total A population. the in units I - N other the
from
chosen
be must
sample
the in
units 1 - n other the then sample, the in is unit i
if
that
note
this, see
To
SRS. an of
definition the from follows (2.18) in probability The
=
(2.18)
N'
sample) in = unit i = P(select = 1) P(Zi Jri
n
with variables random Bernoulli
distributed
identically are
N} Z
{Z , ... 1, population, the in units the N of out
units
n
of
SRS an
choose
we When
quantities. fixed are the Yi'S theory,
randomization
to
according
because,
equation
above the in variables random only the are Zi'S The
n n i=1 iES
"Z
i-· L....J
= - L....J y=
Yi "Yi -
N
Then
otherwise
0
,- Z
{I ._
sample the in is unit i if
define
(1944),
Cornfield in done As sampling. random eni simple in mean
sample the of
properties
deriving
for works theory randomization the how see Let's
variables.
random
of
distribution the about
assumptions any make not need inference-we
to
approach nonparametric
a
provides approach theory randomization The
sample.
the in
be to
units
selecting of
probabilities the from arise used probabilities any
numbers-
unknown
but
fixed be
to considered are y/s the sampling, to approach
design-based) called
(also
theory
randomization the In mean J-L. with distributed
normally
are 's the Yi that
assume instance, for not, Yu. do We
estimating for
unbiased
y
is
that
ascertain to order in 's
the Yi
about made are assumptions distributional No
Samples Probability Simple 2: 44 Chapter ------:. 2.7 Randomization Theory Results for Simple Random Sampling* 45
For i =1= j, E[Z;Zj] = P(Z; = 1and Zj = 1) = P(Zj = 1 I Z; = l)P(Z; = 1)
Because the population is finite, the Z;'s are not quite independent-if we know that unit i is in the sample, we do have a small amount of information about whether unit j is in the sample, reflected in the conditional probability P(Zj = 1 I Z; = 1). Consequently, for i =1= j, COV(Zi, Zj) = E[ZiZj] - E[ZilE[Zj] n - 1 n (n )2 =N-1N-N (1- We use the covariance (Cov) of Zi and Zj to calculate the variance of y; see Ap- pendix B for properties of covariances. The negative covariance of Zi and Zj is the source of the fpc.
V(y) = :2 V (t ZiY;)
= :2 COV ( t ZiY;, t ZjYj) l[N ] = n2 8 y;V(Z;) + 8 YiYj COV(Zi, Zj)
="21 [n- ( 1- -n) LN Yi2 - LNNL Y;Yj-_-1 (1 --n )( -n )] n NNi=1 i=1 Hi N 1 NN
= Y;- YiYj] n NN[ti=1 N 1 t;=1 tj""i
= (1 - 1 [(N - 1) t l -(t Yi)2 +t l] n N N(N - 1) i=1 i=1 i=1
To show that the estimator in (2.9) is an unbiased estimator of the variance, we need to show that E[s2] = S2. The argument proceeds much like the previous one. Since S2 = (Yi - Yu? I(N - 1), it makes sense when trying to find an unbiased estimator to find the expected value of LiES(Yi - y)2 and then find the multiplicative
population,
finite
the
for
values
actual
The
model.
some
from
N
generated
2 Y
,
••• Y ,
Y
j ,
variables
random
of
thinking
by
sampling
to
approach this extend can We
statistics.
variou
of
values
expected
find
to
distribution
normal
the
and
variables
random dent
indepen of
properties used
and a
variance
and
J.1,
mean
2 with
distribution
normal a
fron
distributed
identically
and
independent
n
were
2 Y
1
, Y •.• ,
Y , that
example,
for
assumec
you
Thus
variables.
random
those of
realizations
were
values
sample
actual
th and
distribution,
probability
some
followed
that
{Yd
variables
random
had
you
Then
inference. to
approach
different a
learned
you
class, statistics basic your In
sample.
the
in
included
aJ units
that
probabilities
the are
y
of
variance
and
value
expected the
finding in
probabilities
only
The Yu.
equals S
samples
possible
all yS
for
of
average
the
becau:
unbiased
is y
and
values,
fixed
be to
considered
were YN
, ...
Y2, Yl,
theory:
randomizatic
using
y
mean
sample
the
of
properties
found, we 2.7 Section In
generator.
numb
random
the for
value
starting
different
a
used
we
had
sampled
been have
cou
units
nonsampled
the
that is
sampled
not
units
and
sampled
units
between
relationsh
only
the
inference,
sampling to
approach
theory,
randomization
or
based,
desig
a
In
not.
or
sample
the
in
is
unit ith
the
whether
us
tell that
variables
random
simI
are
They
Yi:
responses
the
with
concerned not
are
theory
randomization
in
variab
random
The
you.
to
strange
seemed
probably
section
preceding
the in
proofs
1
of
design
the
in theory
randomization studied have you Unless
Sampling*
Random Simple for Model A
2.8
Thus,
-1)S2. = (n
NN
S2 n - N _ S2
1) = - n(N
N
N i=1
(1-
!!...)S2 -
yu)2 - t(Yi = !!...
1=1
nV(y)
- yu)2] -
Zi(Yi [t = E
lES
[I:
)2] Yu -
n(y - YU)2 -
(Yi = E
IES
[I: lES
yu)}2]
-
- Yu) - {(Yi E = y)2] - [I:(Yi E unbiasedness: the give will that constant Samples Probability Simple 2: Chapter 46