<<

Chapter 5 Ratio and Product Methods of Estimation

An important objective in any statistical estimation procedure is to obtain the estimators of parameters of interest with more precision. It is also well understood that incorporation of more information in the estimation procedure yields better estimators, provided the information is valid and proper. Use of such auxiliary information is made through the ratio method of estimation to obtain an improved estimator of the population . In ratio method of estimation, auxiliary information on a variable is available, which is linearly related to the variable under study and is utilized to estimate the population mean.

Let Y be the variable under study and X be an auxiliary variable which is correlated with Y . The observations xi on X and yi on Y are obtained for each unit. The population mean X of X

(or equivalently the population total Xtot ) must be known. For example, xi 's may be the values of ysi ' from - some earlier completed , - some earlier surveys, - some characteristic on which it is easy to obtain information etc.

th th For example, if yi is the quantity of fruits produced in the i plot, then xi can be the area of i plot or the production of fruit in the same plot in the previous year.

Let (x11 ,yxy ),( 2 , 2 ),...,( xynn , ) be the random sample of size n on the paired variable (X, Y) drawn, preferably by SRSWOR, from a population of size N. The ratio estimate of the population mean Y is y YXRXˆ ˆ R x

N assuming the population mean X is known. The ratio estimator of population total YYtot  i is i1

ˆ ytot YXRtot() tot xtot N n n where XXtot  i is the population total of X which is assumed to be known, yytot  i and xtot x i i1 i1 i1 ˆ are the sample totals of Y and X respectively. The YR()tot can be equivalently expressed as y YXˆ  Rtot() x tot ˆ  RX tot . Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 1

Y Looking at the structure of ratio estimators, note that the ratio method estimates the relative change tot Xtot

yi that occurred after (,xiiy ) were observed. It is clear that if the variation among the values of and is xi y y nearly same for all i = 1,2,...,n then values of tot (or equivalently ) vary little from sample to sample xtot x and the ratio estimate will be of high precision.

Bias and mean squared error of ratio estimator:

Assume that the random sample (xii ,yi ), 1,2,..., n is drawn by SRSWOR and population mean X is known. Then

N  n ˆ 1 yi E()YXR   N i1 xi  n  Y (in general).

yy2 Moreover, it is difficult to find the exact expression for  . So we approximate them and EEand 2 x x proceed as follows: Let yY 0 y (1o )Y Y xX x (1 )X . 11X Since SRSWOR is being followed, so

E()0 0 

E()1  0 1 E() 22EY (y ) 0 Y 2 1 Nn  S 2 YNn2 Y f S 2  Y nY2 f  C 2 n Y Nn 1 N S 22Y is the related to Y. wherefS ,YiY ( YYC  ) and  NN1 i1 Y

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 2

Similarly, f EC() 22 1 n X 1 EExXyY()  [()()] 01 XY 11Nn N ()()X iiXY Y XY Nn N 1 i1 1 f  . S XY n XY 1 f  SS XY n XY f SS   XY nXY f  CC n XY S where C  X is the coefficient of variation related to X and  is the population correlation coefficient X X between X and Y.

ˆ Writing YR in terms of  ',s we get y YXˆ  R x (1  )Y  0 X (1 1 ) X 1 (101 )(1  ) Y

1 Assuming 1  1, the term (11 ) may be expanded as an infinite series and it would be convergent.

xX Such an assumption that 1, i.e., a possible estimate x of the population mean X lies X between 0 and 2 X . This is likely to hold if the variation in x is not large. In order to ensure that variation in x is small, assume that the sample size n is fairly large. With this assumption,

YYˆ (1 )(1 2 ...) R 011 2 Y (1011 10 ...).

ˆ So the estimation error of YR is

ˆ 2 YYYR (011   10  ...).

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 3

In case, when the sample size is large, then 01and are likely to be small quantities and so the terms involving second and higher powers of 01and  would be negligibly small. In such a case ˆ YYYR  () 01 and ˆ EY()0.R  Y So the ratio estimator is an unbiased estimator of the population mean up to the first order of approximation.

If we assume that only terms of 01and involving powers more than two are negligibly small (which is more realistic than assuming that powers more than one are negligibly small), then the estimation error of ˆ YR can be approximated as

YYYˆ  () 2  R 011 10 ˆ Then the bias of YR is given by

ˆ ff2 EY()00RXXy Y Y  C  C C nn  f Bias() Yˆˆ E ( Y Y ) YC ( C  C ). RRn XXY upto the second order of approximation. The bias generally decreases as the sample size grows large.

ˆ The bias of YR is zero, i.e., ˆ Bias ( YR ) 0 2 if E (101 ) 0 Var() x Cov (,) x y or if 0 XXY2 1 X or if2 Var ( x ) Cov ( x , y ) 0 XY Cov(, x y ) or ifVar ( x ) 0 (assuming X  0) R YCovxy(, ) or if R  XVarx() which is satisfied when the regression line of Y on X passes through the origin.

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 4

Now, to find the mean squared error, consider

ˆˆ2 MSE() YRR E ( Y Y ) 22 2 EY(...)011 10  22 2 EY(2).01  01

Under the assumption 1 1 and the terms of 01and  involving powers, more than two are negligibly small, ff2 f MSE() Yˆ  Y22 C C 2  C C RXYXYnnn  Yf2 CC22 2 CC n XY Xy up to the second-order of approximation.

Efficiency of ratio estimator in comparison to SRSWOR Ratio estimator is a better estimate of Y than sample mean based on SRSWOR if ˆ MSE ( YRSRS ) Var ( y )

222ff 22 or ifYCCCCYC (XY 2 XY )  Y nn 2 or ifCCCXXY 2 0 1 C or if  X . 2 CY Thus ratio estimator is more efficient than the sample mean based on SRSWOR if 1 C  X if R 0 2 C Y 1 C and X if R  0. 2 CY It is clear from this expression that the success of ratio estimator depends on how close is the auxiliary information to the variable under study.

Upper limit of ratio estimator: Consider

Cov(,) Rˆˆˆ x E ( R x ) E ()() R E x y ˆ E xEREx()() x YERX()ˆ .

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 5

Thus YCovRx(,)ˆ ER()ˆ  XX Cov(,) Rˆ x R X Bias() Rˆˆ E () R R Cov(,) Rˆ x  X

ˆˆ  Rx, R x X where is the correlation between Rxˆ and ; and  are the standard errors of Rxˆ and Rxˆ , Rˆ x respectively. Thus

ˆˆx Bias() Rˆ  Rx, R X  Rˆ x  1. X Rxˆ , assuming X  0. Thus

Bias() Rˆ   x  X Rˆ Bias() Rˆ or  C X  Rˆ ˆ where CX is the coefficient of variation of X. If CX < 0.1, then the bias in R may be safely regarded as negligible in relation to the of Rˆ.

ˆ Alternative form of MSE ()YR Consider

NN 2 2 ()()()YRXii YY i  YRX i ii11 N 2 ()()(Using)YYii RXX Y  RX i1 NN N 22 2 ()YYii R ( X  X )2( R  X ii  XYY )() ii11 i  1 N 1 2222 ()YRXSiiY RS X 2. RS XY N 1 i1

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 6

ˆ The MSE of YR has already been derived which is now expressed again as follows: f MSE() Yˆ  Y22 ( C C 2 2 C C ) RYXXYn 22 f 2 SSYX S XY Y 22 2 nYXXY 22 fY22 Y Y 22SSYX 2 S XY nY X X

f 222 SRSRSYX 2 XY n N f 2  (YRXii ) nN(1) i1 N Nn 2  (YRXii ) . nN(1) N  i1

ˆ Estimate of MSE() YR ˆ Let UYRXiii i,  1,2,.., N then MSE of YR can be expressed as f 1 N f MSE() Yˆ  ( U U )=22 S RiUnN1 n i1 N 221 whereSUUUi ( ) . N 1 i1 ˆ Based on this, a natural estimator of MSE()YR is f MSE() Yˆ  s2 Run n 221 wheresuuui ( ) n 1 i1 1 n 2 ()()yyRxxˆ  ii n 1 i1 222ˆˆ sRsRsyx 2, xy y Rˆ  . x Based on the expression

N ˆ f 2 MSE() YRii ( Y RX ), nN(1) i1 ˆ an estimate of MSE() YR is f n MSE() Yˆ  ( y Rxˆ )2 Riinn(1)  i1 . f (2).sRsRs222ˆˆ n y xxy

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 7

Confidence interval of ratio estimator If the sample is large so that the normal approximation is applicable, then the 100(1-)% confidence intervals of Y and R are

ˆˆˆˆ  YRRRR Z Var(), Y Y Z Var () Y 22 and

ˆˆˆˆ  R Z Var(), R R Z Var () R 22 respectively where Z is the normal derivate to be chosen for a given value of confidence coefficient 2 (1 ). If (,x y ) follows a bivariate , then ()yRx is normally distributed. If SRS is followed for drawing the sample, then assuming R is known, the yRx

Nn (2)sRsRs222 Nn yx xy is approximately N(0,1).

This can also be used for finding confidence limits, see Cochran (1977, Chapter 6, page 156) for more details.

Conditions under which the ratio estimate is optimum ˆ The ratio estimate YR is the best linear unbiased estimator of Y when

(i) the relationship between yi and xi is linear passing through origin., i.e.

yxeiii ,

where esi ' are independent with Ee(/)ii x  0 and  is the slope parameter

(ii) this line is proportional to xi , i.e.

2 Var(/) yii x E () e i Cx i where C is constant.

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 8

n Proof. Consider the linear estimate of  because ˆ where yxe   and  ‘s are constant.   iiy iii i i1 ˆ Then  is unbiased if YX  as Ey() X Ee (ii / x ).

If n sample values of xi are kept fixed and then in repeated sampling

n ˆ Ex ( )    ii i1 nn ˆ 22 andVar ( )iiiii Var ( y / x ) C x ii11

n ˆ So Ex() whenii 1. i1

Consider the minimization of Var(/) yii x subject to the condition for being the unbiased estimator

n iix 1 using the Lagrangian function. Thus the Lagrangian function with Lagrangian multiplier is i1

n Var(/)2( yii x ii x 1.) i1 nn 2 Cx1 iii2( x  1). ii11 Now  0iix xi i ,  1,2,.., n i  n 01iix   i1 n Usingiix  1 i1 n orxi  1 i1 1 or  . nx Thus 1   i nx

n y  i y and so ˆ i1 . nx x

Thus ˆ is not only superior to y but also the best in the class of linear and unbiased estimators.

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 9

Alternative approach: This result can alternatively be derived as follows: y Y The ratio estimator Rˆ  is the best linear unbiased estimator of R  if the following two x X conditions hold: (i) For fixed x, Ey ( )  x , i.e., the line of regression of y on x is a straight line passing through the origin. (ii) For fixed x , Var( x ) x , i.e., Var ( x ) x where is constant of proportionality.

Proof: Let yyyy(1) , 2 ,...,nn ) ' and xxxx ( 1 , 2 ,..., ) ' be two vectors of observations on

ys'and'. xs Hence for any fixed x , Ey()  x

Var( y ) diag( x12 , x ,..., xn ) where diag(x12 ,xx ,...,n ) is the diagonal matrix with x12,xx ,..., n as the diagonal elements.

The best linear unbiased estimator of  is obtained by minimizing

Syx21()'()  yx  n ()yx  2   ii. i1 xi Solving S 2  0 

n ˆ  ()0yxii i1 y or ˆ Rˆ . x ˆ ˆ ˆ Thus R is the best linear unbiased estimator of R . Consequently, RXY R is the best linear unbiased estimator of Y.

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 10

Ratio estimator in Suppose a population of size N is divided into k strata. The objective is to estimate the population mean Y using the ratio method of estimation.

th In such a situation, a random sample of size ni is being drawn from the i strata of size Ni on the variable under study Y and auxiliary variable X using SRSWOR. Let

th th yij : j observation on Y from i strata

th th xij : j observation on X from i strata i =1, 2,…,k; j 1,2,...,ni .

An estimator of Y based on the philosophy of stratified sampling can be derived in the following two possible ways:

1. Separate ratio estimator - Employ first the ratio method of estimation separately in each stratum and obtain ratio estimator

Yiˆ 1, 2, .., k , assuming the stratum mean X to be known. Ri i - Then combine all the estimates using weighted .

This gives the separate ratio estimator as ˆ k NY ˆ iRi YRs   i1 N k  wYˆ  iRi i1

k yi  wXii i1 xi

ni 1 th where yyiij  : sample mean of Y from i strata ni j1

ni 1 th xiij  x :sample mean of X from i strata ni j1

Ni 1 th Xxiij  :mean of all the X units in i stratum Ni j1 No assumption is made that the true ratio remains constant from stratum to stratum. It depends on information on each X i .

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 11

2. Combined ratio estimator: - Find first the stratum mean of Ys'and' Xs as k ywystii  i1 k xstii  wx. i1 - Then define the combined ratio estimator as

ˆ yst YXRc  xst N where X is the population mean of X based on all the NN  i units. It does not depend on individual i1 stratum units. It does not depend on information on each Xi but only on X .

Properties of separate ratio estimator:

k k Note that there is an analogy between YwY  ii and YwYRs  i Ri . i1 i1 y We already have derived the approximate bias of YXˆ  as R x Yf EY()ˆ  Y ( C2  C C ). R n xXY ˆ So for YRi , we can write

ˆ fi 2 EY (Ri ) Y i Y i ( C ix  i C iX C iY ) ni 11NNii whereYyXx , iijiijNN iijj11 2 2 Nnii 22Siy S ix fCCiiyix,,22 , NYXii i

NNii 222211 SYYSXXiy ( ij i ) , ix ( ij i ) , NNii11jj11 th i : correlation coefficient between the observation on X and Y in i stratum

th Cix : coefficient of variation of X values in i sample. Thus

ˆˆk EY()Rs  wEY i () Ri i1 k fi 2  wYii Y i( C ixiixiy CC i1 ni k wYii f i 2 YCCC ()ix  i ix iy i1 ni

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 12

ˆˆ Bias() YRs E () Y Rs Y k wYii f i  CCix() ix i C iy i1 ni upto the second order of approximation. Assuming finite population correction to be approximately 1, nnk /and, CC and are the same iixiyi for all the strata as CC, and  respectively, we have xy k Bias() Yˆ  ( C2  C C ). Rsn x x y

Thus the bias is negligible when the sample size within each stratum should be sufficiently large and YRs is unbiased when CCix  iy . ˆ ˆ Now we derive the approximate MSE of YRs . We already have derived the MSE of YR earlier as

Yf2 MSE() Yˆ  ( C22 C 2 C C ) RXYxyn

N f 2 ()YRXii nN(1) i1 Y where R  . X Thus the MSE of ratio estimate up to the second order of approximation based on ith stratum is f MSE() Yˆ i ( C22 C 2 C C ) RinN(1) iX iY i iX iY ii Ni fi 2 ()YRXij i ij nNii(1) j1 and so k ˆˆ2 MSE() YRs  w i MSE () Y Ri i1 k 2 wfii 22 2  YCi(2) iX C iY i CC iX iY i1 ni

k Ni 22fi wYRXiijiij() ij11nNii(1)

ˆ 22 2 An estimate of MSE()YRs can be found by substituting the unbiased estimators of SSiX,and iY S iXY as

22 th ssix,and iy s ixy , respectively for i stratum and Riii YX/ can be estimated by ryxiii /.

k 2  ˆ wfii 222 MSE() YRs  ( s iy r i s ix 2 r i s ixy ). i1 ni Also k 2 ni  ˆ wfii 2 MSE() YRs ( y ij r i x ij ) ij11nnii(1)

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 13

Properties of combined ratio estimator: Here

k  wyii ˆ ist1 y ˆ YXXRXRCk c . xst  wxii i1 ˆ It is difficult to find the exact expression of bias and mean squared error of YRc , so we find their approximate expressions.

Define y Y   st 1 Y x  X   st 2 X

E()1  0

E()0 2 

kkNnwS 22 fwS 22 ffS 2 E(2 )i i iiY i iiY Recall that in case of YEˆ , (22 ) Y C 1 22RY1 2 ii11Nnii Y n i Y nY n k fwS22 E()2 iiiX  2   2 i1 nXi k 2 fSiiXY Ew()12   i . i1 nXYi

Thus assuming  2 1,

ˆ (1 1 )Y YXRC  (1  2 ) X 2 Y (1122 )(1  ...) 2 Y (112122 ...)

ˆ Retaining the terms up to order two due to the same reason as in the case of YR ,

YYˆ  (1  2 ) RC 12122 ˆ 2 YYYRC ()12122   

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 14

ˆ The approximate bias of YRc up to the second-order of approximation is ˆˆ Bias() YRc E ( Y Rc Y ) 2  YE()12122  YEE00   2 12 2 k fSS2 YwiiXiXY2  i 2 i1 nXXYi  k fS2  SS Ywi2 iX i iX iY  i 2 i1 nXi  XY k Y fSSiiXiiY2   wSiiX Xni1 i  XY k f RwSCi 2  C  iiXiX iiY i1 ni Y where R  ,  is the correlation coefficient between the observations on YXand in the ith stratum, X i

CCixand iy are the coefficients of variation of XYand respectively in the ith stratum.

The mean squared error upto second order of approximation is

ˆˆ2 MSEY()Rc EY ( Rc Y ) 22  YE()12122  222  YE(2)12  12

k 22 2 fSSS2 YwiiXiYiXY2  i 22 i1 nXYXYi  k fSS222 SS Yw22i iX iY i iX iY  i 22 i1 nXYXYi  YY22k f  Y i wSS2222 SS 22 i iX iY i iX iY  YnXi1 i  X  k f i wRS222(2). S 2  RSS  i iX iY i iX iY i1 ni

22 An estimate of MSE() YRc can be obtained by replacing SSiX, iY and S iXY by their unbiased estimators

22 Y y ssix, iy and s ixy respectively whereas R  is replaced by r  . Thus the following estimate is obtained: X x k wf2  ii 22 2 MSE() YRc  r s ix s iy 2 rs ixy . i1 ni

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 15

Comparison of combined and separate ratio estimators ˆ ˆ An obvious question arises that which of the estimates YRs or YRc is better. So we compare their MSEs. Note that the only difference in the term of these MSEs is due to the form of ratio estimate. It is y  RMSEYi in (ˆ ) iRsx i Y  RMSEY in (ˆ ). X Rc Thus ˆˆ MSE() YRc  MSE () Y Rs k wf2 ii222  ()2()RRSiiX RRSS i iiXiY i1 ni k wf2 ii22 2  ()2()(RRi S iX RRRS i i iX i SS iX iY ). i1 ni

The difference  depends on

(i) The magnitude of the difference between the strata ratios ()Ri and whole population ratio (R).

2 (ii) The value of ()RiSSS ix  i ix iy is usually small and vanishes when the regression line of y on x is linear and passes through origin within each stratum. See as follows:

2 RSi ix i S ix S iy 0

iixiySS Ri  2 Six which is the estimator of the slope parameter in the regression of y on x in the ith stratum. In such a case

MSE ( Yˆˆ ) MSE ( Y ) Rc Rs but Bias ( Yˆˆ ) Bias ( Y ). Rc Rs

ˆ So unless Ri varies considerably, the use of YRc would provide an estimate of Y with negligible bias and ˆ precision as good as YRs . ˆ  If RiRs RY, can be more precise but bias may be large.

If R  RY, ˆ can be as precise as Yˆ but its bias will be small. It also does not require knowledge  iRc Rs of XX, ,..., X . 12 k

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 16

Ratio estimators with reduced bias: ˆˆˆ The ratio type estimators that are unbiased or have smaller bias than RY,orRRctot Y () are useful in sample surveys. There are several approaches to derive such estimators. We consider here two such approaches:

1. Unbiased ratio – type estimators: Y Under SRS, the ratio estimator has form X to estimate the population mean Y . As an alternative to x this, we consider following as an estimator of the population mean

1 n Y ˆ i . YXRo   nXi1 i

Yi Let Rii , 1,2,.., N , Xi then

ˆ 1 n YRXRi0   n i1  rX where 1 n rR  i n i1 ˆˆ Bias ( YRR00 ) E ( Y ) Y E (rX ) Y E (rX ) Y . Since 11nN Er ( )  ( Ri ) nNii11 1 n   R n i1  R . ˆ So Bias ( YR0 ).RX Y

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 17

Nn Using the result that under SRSWOR, Cov(, x y ) S , it also follows that Nn XY Nn 1 N Cov(, r x ) ( Rii R )( X X ) Nn N 1 i1 Nn 1 N () RXii NRX Nn N 1 i1 N Nn 1 Yi () XNRXi nN1 i1 Xi Nn 1 ()NY NRX Nn N 1 Nn 1 [Bias ( Yˆ )]. nN1 R0 Nn Nn Thus using the result that in SRSWOR, Cov(, x y ) S , and therefore Cov(, r x ) S , we Nn XY Nn RX have nN(1) Bias() Yˆ  Cov (,) r x Ro Nn nN(1) N n  S Nn Nn RX N 1 SRX N 1 N where SRRXXRX()(). i i N 1 i1 The following result helps in obtaining an unbiased estimator of a population mean: Since under SRSWOR set up,

Es (xy )  S xy 1 n wheresxxyyxy ( i )( i ), n 1 i1 1 N SXXYYxy()(). i i N 1 i1 ˆ So an unbiased estimator of the bias in Bias() YRRX0  ( N  1) S is obtained as follows: (1)N  Bias() Yˆ  s Rrx0 N N 1 n ()()rrxxii   Nn(1) i1 N 1 n () rxii  nr x Nn(1) i1

n N 1 yi  xi nr x Nn(1) i1 xi N 1 ().ny  nr x Nn(1)

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 18

So  nN(1) Bias ( Yˆˆ ) E Y Y ( y r x ). RR00  Nn(1) Thus E Yˆˆ Bias ( Y ) Y RR00

ˆ nN(1) orEYR0  ( y rx ) Y . Nn(1) Thus nN(1) nN (1) YyrxrXyrxˆ  ( ) ( ) R0 Nn(1) Nn (1) is an unbiased estimator of the population mean.

2. Jackknife method for obtaining a ratio estimate with lower bias Jackknife method is used to get rid of the term of order 1/n from the . Suppose the

ER()ˆ can be expanded after ignoring finite population correction as aa ER(ˆ ) R 12   ... nn2 Let n = mg and the sample is divided at random into g groups, each of size m. Then ga ga EgR()ˆ  gR 12   ... gm g22 m

aa gR 12 ... mgm2 * y Let ˆ *  i where the * denotes the summation over all values of the sample except the ith group. Ri  *   xi ˆ * So Ri is based on a simple random sample of size m(g - 1), so we can express aa ER (ˆ * ) R 12   ... i mg(1)(1) m22 g or aa Eg ( 1) Rˆ * ( g 1) R 12  ... i mmg2 (1) Thus a EgRgˆˆ ( 1) R*  R 2  ... i gg(1) m2 or ag EgRgˆˆ ( 1) R*  R 2  ... i ng2 1

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 19

1 Hence the bias of gRˆˆ(1)g R* is of order . i n2 Now g estimates of this form can be obtained, one estimator for each group. Then the jackknife or Quenouille’s estimator is the average of these of estimators

g ˆ  Ri RgRgˆˆ(1).i1 Q g

Product method of estimation: 1 C The ratio estimator is more efficient than the sample mean under SRSWOR if   .,x if R  0, which 2 Cy 1 C is usually the case. This shows that if auxiliary information is such that   x , then we cannot use 2 Cy the ratio method of estimation to improve the sample mean as an estimator of the population mean. So there is a need for another type of estimator which also makes use of information on auxiliary variable X. Product estimator is an attempt in this direction. The product estimator of the population mean Y is defined as ˆ yx YP  . X assuming the population mean X to be known

ˆ We now derive the bias and of Yp .

yY x X Let ,, 01YX ˆ (i) Bias of Yp .

ˆ We write Yp as yx YYˆ (1 )(1 ) p X 01

Y (10101 ). ˆ Taking expectation, we obtain bias of Yp as 1 f Bias() Yˆ  E ( ) Cov (,) y x S , pxy01 XnX ˆ ˆ which shows that bias of Yp decreases as n increases. Bias of Yp can be estimated by f Bias() Yˆ  s . pxynX

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 20

ˆ (ii) MSE of Yp : ˆ ˆ Writing Yp is terms of 01and  , we find that the mean squared error of the product estimator Yp up to second order of approximation is given by

ˆˆ2 MSEY()pp EY ( Y ) 22 YE (1012 ) 222 YE (10 2 12 ).

Here terms in (,10 ) of degrees greater than two are assumed to be negligible. Using the expected values, we find that f MSE() Yˆ  S222 R S 2 RS . pYXXYn 

ˆ (iii) Estimation of MSE of Yp ˆ The mean squared error of Yp can be estimated by f MSE() Yˆ  s222 r s 2 rs pyxxyn  where ryx / .

(iv) Comparison with SRSWOR: From the of the sample mean under SRSWOR and the product estimator, we obtain f Var() y MSE ( Yˆ ) RS (2 S RS ), SRS pn X Y X f where Var() y S 2 which shows that Yˆ is more efficient than the simple mean y for SRSn Y p 1 C  x ifR  0 2 Cy and for 1 C  x ifR  0. 2 Cy

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 21

Multivariate Ratio Estimator

Let y be the study variable and X 12,XX ,..., p be p auxiliary variables assumed to be corrected with y .

Further, it is assumed that X 12,XX ,..., p are independent. Let YX,12 , X ,..., Xp be the population means of the variables y , X 12,XX ,..., p . We assume that a SRSWOR of size n is selected from the population of N units. The following notations will be used.

2 SXi : the population mean sum of squares for the variate i , 2 sXi : the sample mean sum of squares for the variate i , 2 S0 : the population mean sum of squares for the study variable y , 2 s0 : the sample mean sum of squares for the study variable y ,

Si CXi  : coefficient of variation of the variatei , X i S Cy 0 : coefficient of variation of the variate , 0 Y

Siy i  : coefficient of correlation betweenyX andi , SSi 0 ˆ y YRi  :ratio estimator of,basedonYXi X i where ip1,2,..., . Then the multivariate ratio estimator of Y is given as follows.

ˆˆpp YwYwMR i Ri,1 i ii11 p X  ywi .  i x i1 i

(i) Bias of the multivariate ratio estimator: ˆ The approximate bias of YRi up to the second order of approximation is f Bias() Yˆ  Y ( C2  C C ). Rin i i i 0 ˆ The bias of YMR is obtained as

p Yf Bias() Yˆ  w ( C2  C C ) MR in i i i 0 i1 Yf p  wCii(). C i i C0 n i1

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 22

(ii) Variance of the multivariate ratio estimator: ˆ The variance of YRi up to the second-order of approximation is given by

f Var() Yˆ  Y22 ( C C 2 2 C C ). Rin 00 i i i ˆ The variance of YMR up to the second-order of approximation is obtained as f p ˆ 2222 Var() YMRiiii Y w ( C00 C 2 C C ). n i1

Sampling Theory| Chapter 5 | Ratio & Product Methods of Estimation | Shalabh, IIT Kanpur Page 23