<<

Chapter 4 Stratified An important objective in any estimation problem is to obtain an estimator of a population parameter which can take care of the salient features of the population. If the population is homogeneous with respect to the characteristic under study, then the method of simple random sampling will yield a homogeneous , and in turn, the sample will serve as a good estimator of the population mean. Thus, if the population is homogeneous with respect to the characteristic under study, then the sample drawn through simple random sampling is expected to provide a representative sample. Moreover, the of the sample mean not only depends on the sample size and but also on the population variance. In order to increase the precision of an estimator, we need to use a sampling scheme which can reduce the heterogeneity in the population. If the population is heterogeneous with respect to the characteristic under study, then one such sampling procedure is a .

The basic idea behind the stratified sampling is to  divide the whole heterogeneous population into smaller groups or subpopulations, such that the sampling units are homogeneous with respect to the characteristic under study within the subpopulation and  heterogeneous with respect to the characteristic under study between/among the subpopulations. Such subpopulations are termed as strata.  Treat each subpopulation as a separate population and draw a sample by SRS from each stratum. [Note: ‘Stratum’ is singular and ‘strata’ is plural].

Example: In order to find the average height of the students in a school of class 1 to class 12, the height varies a lot as the students in class 1 are of age around 6 years, and students in class 10 are of age around 16 years. So one can divide all the students into different subpopulations or strata such as Students of class 1, 2 and 3: Stratum 1 Students of class 4, 5 and 6: Stratum 2 Students of class 7, 8 and 9: Stratum 3 Students of class 10, 11 and 12: Stratum 4 Now draw the samples by SRS from each of the strata 1, 2, 3 and 4. All the drawn samples combined together will constitute the final stratified sample for further analysis.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 1

Notations: We use the following symbols and notations: N : Population size k : Number of strata th Ni : Number of sampling units in i strata

k NN  i i1

th ni : Number of sampling units to be drawn from i stratum.

k nn  i : Total sample size i1

Population (N units)

Stratum 1 Stratum 2 Stratum k N1 units N2 units Nk units … … …

Sample Sample Sample

1 2 … … … k n1 units n2 units nk

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 2

Procedure of stratified sampling

th Divide the population of N units into k strata. Let the i stratum has Ni1, 1,2,..., k number of units.  Strata are constructed such that they are non-overlapping and homogeneous with respect to the

k characteristic under study such that  NNi  . i1

th  Draw a sample of size ni from i (ik 1,2,..., ) stratum using SRS (preferably WOR) independently from each stratum.  All the sampling units drawn from each stratum will constitute a stratified sample of size

k nn  i . i1

Difference between stratified and schemes In stratified sampling, the strata are constructed such that they are  within homogeneous and  among heterogeneous.

In cluster sampling, the clusters are constructed such that they are  within heterogeneous and  among homogeneous. [Note: We discuss the cluster sampling later.]

Issues in the estimation of parameters in stratified sampling

th Divide the population of N units in k strata. Let the i stratum has Nii , 1,2,..., k number of units. Note that there are k independent samples drawn through SRS of sizes nn, ,..., n from each of the 12 k strata. So, one can have k estimators of a parameter based on the sizes nn12, ,..., nk respectively. Our interest is not to have k different estimators of the parameters, but the ultimate goal is to have a single estimator. In this case, an important issue is how to combine the different sample information together into one estimator, which is good enough to provide information about the parameter.

We now consider the estimation of population mean and population variance from a stratified sample.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 3

Estimation of population mean and its variance Let Y :characteristic under study,

th th yij :value of j unit in i stratum j = 1,2,…,ni, i = 1,2,...,k,

Ni 1 th Yyiij  :population mean of i stratum Ni j1

ni 1 th yyiij  : sample mean from i stratum ni j1

kk 1 Ni YNYwYii ii:population mean where wi  . N ii11 N

Estimation of population mean: First, we discuss the estimation of the population mean. Note that the population mean is defined as the weighted of stratum in the case of stratified sampling where the weights are provided in terms of strata sizes. 1 k Based on the expression YNY  ii, one may choose the sample mean N i1 1 k yny  ii n i1 as a possible estimator of Y .

Since the sample in each stratum is drawn by SRS, so

E()yYii , thus 1 k Ey()  nEyii ( ) n i1 1 k

  nYii n i1  Y and y turns out to be a biased estimator of Y . Based on this, one can modify y so as to obtain an unbiased estimator of Y . Consider the stratum mean which is defined as the of strata sample means with strata sizes as weights given by 1 k yNystii  . N i1 Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 4

Now 1 k Ey()stii  NE ()y N i1 1 k   NYi i N i1  Y

Thus yst is an unbiased estimator of Y .

Variance of yst

kkni 2 Var() ystii w Var () y w ijij w Cov (,). y y iijj1()11 Since all the samples have been drawn independently from each of the strata by SRSWOR so

Cov(,yyij ) 0,i j

Nnii 2 Var()yii S Nnii where

Ni 221 SYYiij().i Ni 1 j1 Thus k 22Nnii Var()ysti  wS i i1 Nnii k 2 2 nSii  wi 1. i1 Nnii

2 Observe that Var() yst is small when Si is small. This observation suggests how to construct the strata.

2 If Si is small for all i = 1,2,...,k, then Var() yst will also be small. That is why it was mentioned earlier

2 that the strata are to be constructed such that they are within homogeneous, i.e., Si is small and among heterogeneous.

For example, the units in geographical proximity will tend to be more closer. The consumption pattern in the households will be similar within a lower income group housing society and within a higher income group housing society, whereas they will differ a lot between the two housing societies based on income.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 5

Estimate of Variance Since the samples have been drawn by SRSWOR, so

22 Es()ii S

ni 221 wheresiiji (yy ) ni 1 j1

 Nnii 2 andVar ( yii )  s Nnii k  2  soVar ( yst )  w i Var ( y i ) i1 k 22Nnii  wsii  Nn i1 ii

Note: If SRSWR is used instead of SRSWOR for drawing the samples from each stratum, then in this case

k ywyst  i i i1

Ey()st  Y kk2 222Nii1  Var() yst w i S i w i ii11Nnii n i k 22  wsii Var() yst   i1 ni 1 Ni where 22 (yy ) . iijin  i j1

Advantages of stratified sampling 1. of known precision may be required for certain parts of the population. This can be accomplished with a more careful investigation to a few strata. Example: In order to know the direct impact of the hike in petrol prices, the population can be divided into strata like lower income group, middle-income group and higher income group. Obviously, the higher income group is more affected than the lower-income group. So more careful investigation can be made in the higher income group strata. 2. Sampling problems may differ in different parts of the population. Example: To study the consumption pattern of households, the people living in houses, hotels, hospitals, prison etc. are to be treated differently. Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 6

3. Administrative convenience can be exercised in stratified sampling. Example: In taking a sample of villages from a big state, it is more administratively convenient to consider the districts as strata so that the administrative set up at district level may be used for this purpose. Such administrative convenience and the convenience in the organization of fieldwork are important aspects in national level surveys. 4. Full cross-section of the population can be obtained through stratified sampling. It may be possible in SRS that some large part of the population may remain unrepresented. Stratified sampling enables one to draw a sample representing different segments of the population to any desired extent. The desired degree of representation of some specified parts of the population is also possible. 5. Substantial gain in is achieved if the strata are formed intelligently. 6. In the case of skewed population, use of stratification is of importance since larger weight may have to be given for the few extremely large units, which in turn reduces the sampling variability. 7. When estimates are required not only for the population but also for the subpopulations, then the stratified sampling is helpful. 8. When the sampling frame for subpopulations is more easily available than the sampling frame for the whole population, then stratified sampling is helpful. 9. If the population is large, then it is convenient to sample separately from the strata rather than the entire population. 10. The population mean or population total can be estimated with higher precision by suitably providing the weights to the estimates obtained from each stratum.

Allocation problem and choice of sample sizes is different strata

Question: How to choose the sample sizes nn12, ,..., nk so that the available resources are used in an effective way? There are two aspects of choosing the sample sizes: (i) Minimize the cost of for a specified precision. (ii) Maximize the precision for a given cost.

Note: The sample size cannot be determined by minimizing both the cost and variability simultaneously. The cost function is directly proportional to the sample size, whereas variability is inversely proportional to the sample size. Based on different ideas, some allocation procedures are as follows: Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 7

1. Equal allocation

Choose the sample size ni to be the same for all the strata. Draw samples of equal size from each stratum. Let n be the sample size and k be the number of strata, then n niki for all 1,2,..., . k

2. Proportional allocation

For fixed k, select ni such that it is proportional to stratum size Ni , i.e., nN ii or nNii  where  is the constant of proportionality.

k ni   i1 or nN  n  . N n ThusnNii  . N Such allocation arises from considerations like operational convenience.

3. Neyman or optimum allocation This allocation considers the size of strata as well as variability nNSiii * nNSiii  where  * is the constant of proportionality.

kk * nNSiii  ii11 k * or nNS   ii i1

* n or  k .  NSii i1

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 8

nNii S Thus ni  k .  NSii i1

k

This allocation arises when the Var yst  is minimized subject to the constraint ni (prespecified). i1

There are some limitations to the optimum allocation. The knowledge of Sii (1,2,...,) k is needed to know ni . If there are more than one characteristics, then they may lead to conflicting allocation.

Choice of sample size based on the cost of survey and variability The cost of the survey depends upon the nature of the survey. A simple choice of the cost function is

k CC0  Cnii i1 where C : total cost C : overhead cost, e.g., setting up the office, training people etc 0 C : cost per unit in the ith stratum i k Cnii: total cost within the sample. i1

To find ni under this cost function, consider the Lagrangian function with a Lagrangian multiplier as

2 Var() yst ( C C0 ) kk 22211 wSCniiii ii11nNii kkk22 22 wSii2 wS ii  Cnii iii111nNii 2 k  wSii   Cnii terms independent of n i . i1 ni Thus  is minimum when wS ii  Cn for all i n ii i 1 wSii or ni  .  Ci Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 9

How to determine ? There are two ways to determine  . (i) Minimize variability for a fixed cost. (ii) Minimize cost for given variability. We consider both cases.

(i) Minimize variability for fixed cost

* Let CC 0 be the pre-specified cost which is fixed.

k * So Cnii C0 i1 k wSii * or CCi  0 i1  Ci k  CwSiii i1 or  * . C0

1 wSii Substituting  in the expression for ni  , the optimum ni is obtained as  Ci

 * * wSii C0 ni  k . Ci   CwSiii i1 * The required sample size to estimate Y such that the variance is minimum for the given cost CC 0 is

k * nn  i . i1

(ii) Minimize cost for a given variability

Let VV 0 be the pre-specified variance. Now determine ni such that

k 11 22 wSii V0 i1 nNii kk22 22 wSii wS ii or V0 ii11nNii

kk22  Ci 22 wSii or wSii V0 ii11wSii N i

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 10

k 22 wSii V0   i1 NwSiii1 or  k (after substituting ni ).  Ci  wSii C i i1

Thus the optimum ni is

k wSii C i wS   ii i1 ni  k 22 . Ci wSii V0   i1 Ni So the required sample size to estimate Y such that cost C is minimum for a

k  prespecified variance V0 is nn  i . i1

Sample size under proportional allocation for fixed cost and for fixed variance

k (i) If cost CC 0 is fixed then CCn0   ii. i1 n Under proportional allocation, nNnw iiiN

k So CnwC0   ii i1

C0 or n  k . wCii i1 Cw Thus n  oi . i wC  ii k The required sample size to estimateYnn in this case is  i . i1

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 11

(ii) If variance = V0 is fixed, then

k 11 22 wSii V0 i1 nNii kk22 22 wSii wS ii or V0 ii11nNii kk22 22 wSii wS ii orVnnw0 (usingii  ) ii11nwii N k 22  wSii i1 or n  k 22 wSii V0   i1 Ni k 22  wSii ornw i1 . ii k 22 wSii V0   i1 Ni This is known as Bowley’s allocation.

Variances under different allocations

Now we derive the variance of yst under proportional and optimum allocations. (i) Proportional allocation Under proportional allocation n nN iiN and

k Nnii 22 Var() yst  w i S i i1 Nnii n NN 2 k iiN Var() y N i S 2 prop stn  i i1 NNN iiN Nn k NS2   ii Nni1 N k Nn 2  wSii. Nn i1

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 12

(ii) Optimum allocation Under optimum allocation

nNii S ni  k  NSii i1 k 11 22 Vyopt() st wS i i i1 nNii kkwS22 wS 22 ii ii ii11nNii k NS kk ii wS22 22iii1 wSii ii11nNii S N i   kkk1 NS wS22 . iiNS i i 2 ii iii111nN Ni

kk2222 k k 1 NSii wSii 11  2    wSii   wS ii. n ii11NNnNi  i  1 i  1

Comparison of of the sample mean under SRS with stratified mean under proportional and optimal allocation: (a) Proportional allocation: Nn Vy() S2 SRS Nn

k 2 Nn NSii Vypopr () st   . Nni1 N

2 2 In order to compare VSRS()andy V prop (y st ), first we attempt to express S as a function of Si . Consider

k Ni 22 (1)NS ( YYij  ) ij11 2 k Ni  ()()YYij i YY i ij11 kkNNii 22 ()YYij i  () YY i ij11 ij  11 kk 22 (1)NSii NYY ii ( ) ii11 Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 13

kk N 1 22NNii1 2 SSYYii(). NNNii11

For simplification, we assume that Ni is large enough to permit the approximation N 1 N 1 i 1and 1. NNi Thus

kk 22NNii 2 SSYYii ( ) ii11NN

kk Nn22 NnNNii Nn  2 Nn- orSSii ( YY  ) (Premultiply by on both sides) Nn Nnii11 N Nn N Nn Nn k 2 VarSRS ( Y ) V prop ( yst ) w i ( Y i  Y ) Nn i1

k 2 Since wYii ( Y ) 0, i1

Varprop() y st VarSRS ().y

A larger gain in the difference is achieved when Yi differs from Y more.

(b) Optimum allocation

kk2 11 2 Vyopt() st wS i i wS i i . nNii11 Consider

kkk2 Nn 22 11 Vprop() y st V opt () y st wS i i  wS i i  wS i i Nniii111 n N kk2 1 2  wSii wS ii n ii11 k 1122  wSii S nni1 k 1 2  wSii() S n i1 where

k SwS  ii i1

Varprop() y st Var opt ()0 y st orVaropt ( y st ) Var prop ( y st ). Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 14

The larger gain in efficiency is achieved when Si differs from S more. Combining the results in (a) and (b), we have

Varopt() y st Var prop () y st Var SRS () y

Estimate of variance and confidence intervals

2 th Under SRSWOR, an unbiased estimate of Si for the i stratum (i = 1,2,...,k) is

ni 221 siiji().yy ni 1 j1 In stratified sampling,

k 22Nnii Var() ysti  w S i . i1 Nnii

So, an unbiased estimate of Var() yst is

k  22Nnii Var() yst  w i s i i1 Nnii kkws22 ws 22 ii ii ii11nNii kk22 wsii 1 2 wsii ii11nNi The second term in this expression represents the reduction due to finite population correction. The confidence limits of Y can be obtained as  ytVarystst ()

 assuming yst is normally distributed and Var() yst is well determined so that t can be read from normal distribution tables. If only few degrees of freedom are provided by each stratum, then t values are obtained from the table of student’s t-distribution.  The distribution of Var() yst is generally complex. An approximate method of assigning an effective

k 2 2  gsii  i1 number of degrees of freedom ()n to Var() y is ne  24 e st k gs  ii i1 ni 1 k NNii() n i where gi  and Min(1) niei n  (1) n  assuming yij are normally distributed. ni i1 Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 15

Modification of optimal allocation Sometimes in the optimal allocation, the size of subsample exceeds the stratum size. In such a case,

replace ni by Ni and recompute the rest of nsi ' by the revised allocation.

For example, if nN11 , then take the revised nsi ' as

nN11 and

()nNwS 1 ii niki k ;2,3,...,  wSii i2 provided nNii for all i = 2,3,…,k.

Suppose in revised allocation, we find that nN22 then the revised allocation would be

nN11

nN22

()nN12 NwSii niki k ; 3,4,..., .  wSii i3 provided nNii for all ik 3,4,..., .

We continue this process until every nNii .

In such cases, the formula for the minimum variance of yst need to be modified as ()*2wS *2 wS Min Var() y ii ii st nN* *  where  denotes the summation over the strata in which nNii and n* is the revised total sample size in the strata.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 16

Stratified sampling for proportions If the characteristic under study is qualitative in nature, then its values will fall into one of the two mutually exclusive complimentary classes C and C’. Ideally, only two strata are needed in which all the units can be divided depending on whether they belong to C or its complement C’. Thus is difficult to achieve in practice. So the strata are constructed such that the proportion in C varies as much as possible among strata. Let

Ai th Pi  :Proportion of units in C in the i stratum Ni

ai th pi  :Proportion of units in C in the sample from the i stratum ni An estimate of population proportion based on the stratified sampling is

k Npii pst   . i1 N which is based on the indicator variable 1 whenjth unit belongs to theiC th stratum is in Yij   0otherwise andypst st .

2 Ni Here SPQiii Ni 1 where QPii1 .

k Nnii 22 Also Var() ystii  w S . i1 Nnii

1 k NN2 () n PQ So Var() p  iiiii. st 2  NNi1 ii1 n If the finite population correction can be ignored, then

k 2 PiiQ Var() pst  w i . i1 ni

If the proportional allocation is used for ni , then the variance of pst is

Nn 1 k NPQ2 Var() p  iii prop st NNnN 1 i1 i Nn k   wPQii i Nn i1

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 17 and its estimate is

k  Nn piiq Varprop () pst  w i . Nni1 ni 1

The best choice of ni such that it minimizes the variance for fixed total sample size is

NPQii i nNii Ni 1

 NPQiii

NPQiii Thus nni  k .  NPQiii i1

k Similarly, the best choice of ni such that the variance is minimum for fixed cost CC0  Cnii is i1

PQ nN ii i C n  i . i k PQii  Ni i1 Ci

Estimation of the gain in precision due to stratification An obvious question crops up that what is the advantage of stratifying a population in the sense that instead of using SRS, the population is divided into various strata? This is answered by estimating the variance of estimators of population mean under SRS (without stratification) and stratified sampling by evaluating   VarSRS () y Var ( y ) st .  Var() yst This gives an idea about the gain in efficiency due to stratification.

Nn 2 2 2 2 Since VarSRS ()y  S ,so there is a need to express S in terms of Si . How to estimate S based Nn on a stratified sample?

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 18

Consider

k Ni 22 (1)NS ( YYij  ) ij11 2 k Ni  ()()YYij i YY i ij11

kkNi 22 ()YYij  NYY i () i ij11 i  1 kk 22 (1)NSii NYY ii ( ) ii11 kk 222 (1)NSNwYYii ii  . ii11

2 22 2 In order to estimate S , we need to estimates of SYii,and. YWe consider their estimation one by one.

2 (I) For an estimate of Si , we have

22 E()sSii

ˆ 22 So Ssii .

2 (II) For estimate of Yi , we know

22 Vary()ii Ey ( )[()] Ey i 22 Ey()ii Y or YEyVary22()(). ii i

2 An unbiased estimate of Yi is

ˆ 22 YyVaryii () i

22Nnii ysii. Nnii

(III) For the estimation of Y 2, we know

22 Var() yst E ()[()] y st E y st 22 Ey()st Y YEyVary22()  () st st

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 19

So, an estimate of Y 2 is

ˆ 22 YyVaryst() st k 222Nnii ywsstii.  Nn i1 ii Substituting these estimates in the expression (1)nS 2 as follows, the estimate of S 2 is obtained as

kk 2222 (1)(NS Nii  1) SNwYY ii  ii11 kk ˆˆ22221 N ˆˆ as S(1) Nii S w ii Y Y NN11ii11

kk k 1 222222N Nnii Nn ii  Nsii1  wy ii s i  y st ws ii  NNNnNn11  ii11ii i  1 ii  kkk 1 222N Nnii Nsii1()(1). wyy iisti ww i s i NN11iii111 Nnii Thus

 Nn 2 VarSRS () y Sˆ Nn kkk Nn222 NNn() Nnii  (1)Nsii w iisti (yy )ww (1) i s i NN(1) niii111 nNN (1) Nnii  and k  Nnii 22 Var() yst  w i s i . i1 Nnii

Substituting these expressions in   VarSRS () y Var ( y ) st ,  Var() yst the gain in efficiency due to stratification can be obtained.

If any other particular allocation is used, then substituting the appropriate n under that allocation, such i gain can be estimated.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 20

Interpenetrating subsampling Suppose a sample consists of two or more subsamples which are drawn according to the same sampling scheme. The samples are such that each subsample yields an estimate of the parameter. Such subsamples are called interpenetrating subsamples.

The subsamples need not necessarily be independent. The assumption of independent subsamples helps in obtaining an unbiased estimate of the variance of the composite estimator. This is even helpful if the sample design is complicated and the expression for variance of the composite estimator is complex.

Let there be g independent interpenetrating subsamples and tt12, ,..., tg be g unbiased estimators of

th parameter  where tjj (1,2,...,) g is based on j interpenetrating subsample.

Then an unbiased estimator of  is given by

g ˆ 1  ttj ,say. g j1 Then EEt()ˆ  ()  and

g ˆ 1 2 Var()  Var () t ( tj  t ). gg(1) j1 Note that

1 g EVart() E ( t )22 gt ( )   j gg(1) j1 1 g

Var() tj gVar () t gg(1) j1 1 ()()()g2 g Var t Var t gg(1)

If the distribution of each estimator tj is symmetric about  , then the of  can be obtained by

g1 1  PMintt(12 , ,..., tgg ) Maxtt ( 12 , ,..., t )  1 . 2 Implementation of interpenetrating subsamples in stratified sampling

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 21

Consider the set up of stratified sampling. Suppose that each stratum provides an independent interpenetrating subsample. So based on each stratum, there are L independent interpenetrating subsamples drawn according to the same sampling scheme.

ˆ th th Let Yij() tot be an unbiased estimator of the total of j stratum based on the i subsample , i = 1,2,...,L; j = 1,2,...,k.

An unbiased estimator of the jth stratum total is given by

J ˆˆ1 YYj()tot  ij () tot L i1 ˆ and an unbiased estimator of the variance of Yj()tot is given by

L  ˆˆˆ1 2 Var() Yj() tot ( Y ij () tot Y j () tot ). LL(1) i1

Thus an unbiased estimator of population total Ytot is

kLk ˆˆ1 ˆ YYtot j() tot Y ij () tot jij111k And an unbiased estimator of its variance is given by

k ˆˆ Var() Ytot  Var ( Y j() tot ) j1 Lk 1 ˆˆ2  (YYij() tot j () tot ) . LL(1) ij11

Post Stratifications

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 22

Sometimes the stratum to which a unit belongs may be known after the field survey only. For example, the age of persons, their educational qualifications etc. can not be known in advance. In such cases, we adopt the post-stratification procedure to increase the precision of the estimates.

Note: This topic is to be read after the next module on ratio method of estimation. Since it is related to the stratification, so it is given here.

In post-stratification,  draw a sample by simple random sampling from the population and carry out the survey.  After the completion of the survey, stratify the sampling units to increase the precision of the estimates.

Assume that the stratum size Ni is fairly accurately known. Let m : number of sampling units from ith stratum, i = 1,2,...,k. i k mni  . i1

Note that mi is a (and that is why we are not using the symbol ni as earlier).

Assume n is large enough or the stratification is such that the probability that some m  0 is negligibly i small. In case, mi  0 for some strata, two or more strata can be combined to make the sample size non- zero before evaluating the final estimates.

A post stratified estimator of the population mean Y is 1 k yNypost  i i . N i1 Now 1 k Ey()post E NEymm i ( i12 ,,...,) m k N i1 1 k  ENY ii N i1 Y

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 23

 VaryEVarymmmVarEymmm(post ) ( post12 , ,..., k ) ( post 12 , ,..., k ) k 11 22 Ew ii SVarY() i1 mNii

k 2211  wEii S(Since VarY ( )  0). i1 mNii

11 To findE  , proceed as follows :  mNii Consider the estimate of ratio based on ratio method of estimation as

nN yY yYjj RRˆ jj11,.   xXnN xXjj jj11 We know that Nn RSS2  ER()ˆ  R .XXY . Nn X 2 1 ifjith unit belongs to th stratum Let x j   0otherwise  and y j 1for all j = 1,2,...,N.

ˆ 2 Then RR,and Sx reduces to

n  y j ˆ j1 n R n ni  x j j1 N Yj j1 N R N Ni  X j j1 111N  NN22  SXNXNNN222 ii xj  i2  i NNNNN111j1    11N NN SXYNXYNi 0. xy j j i 2 NNN11j1 

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 24

Using these values in E()RRˆ  ,we have

 ˆ nNNN()() n N Ni ER ( ) R E   2 . nNii nNN i(1) Thus

11N NN()() n N Ni 1 E  22  nNnNnNNii i i(1) N i ()NnN  N 1 1. nN(1) Nii Nn n

Replacing mi in place of ni , we obtain

11()NnN  N 1  E  1    mNnNN(1) Nnn ii i  i 

Now substitute this in the expression of Var() y post as

k 2211 Var() ypost w i E S i i1 mNii

k 22Nn N N 1  wSii.1 i1 (1)NnNnNn ii

k Nn 22111  wSii1 nN(1) i1  wii nw n Nn k 1 wS2 n 1 2  ii nN(1) i1  wi Nn k (1)nw w S 2 2  iii nN(1) i1 Nn kkNn  wS22(1 w ) S . ii2 i i nN(1)ii11 n (1) N Assuming NN1. Nnnn Nn V ()y wS22 (1) w S postNn i i n2 N i i ii11 Nn n Vy() (1). wS2 prop st2   i i Nn i1

The second term is the contribution to the variance of ypost due to msi ' not being proportionately distributed.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 25

22 If SSiw , say for all i, then the last term in the expression is Nnkk Nn (1wS )22 S ( k 1) (Since w 1) 22iw w  i  Nnii11 Nn

kNn1 2  Sw nNn k 1  Var(). yst n n The increase in the variance over Var ()y is small if the average sample size n  per stratum is prop st 2 reasonably large.

Thus a post-stratification with a large sample produces an estimator which is almost as precise as an estimator in the stratified sampling with proportional allocation.

Sampling Theory| Chapter 4 | Stratified Sampling | Shalabh, IIT Kanpur Page 26