Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001

THE DESIGN EFFECT: DO WE KNOW ALL ABOUT IT?

Inho Park and Hyunshik Lee, Westat Inho Park, Westat, 1650 Research Boulevard, Rockville, Maryland 20850

Key Words: Complex sample, Simple random where y is an estimate of the population mean (Y ) sample, pps , Point and under a complex design with the sample size of n, f is a estimation, Ratio mean, Total 2 = ()− −1∑N − 2 sampling fraction, and S y N 1 i=1(yi Y ) is 1. Introduction the population element variance of the y-variable. In general, the design effect can be defined for any The design effect is widely used in survey meaningful statistic computed from the sample selected sampling for planning a sample design and to report the from the complex sample design. effect of the sample design in estimation and analysis. It The Deff is a population quantity that depends on is defined as the ratio of the variance of an estimator the sample design and the statistic. The same parameter under a sample design to that of the estimator under can be estimated by different and their simple random sampling. An estimated design effect is Deff’s are different even under the same design. routinely produced by sample survey software such as Therefore, the Deff includes not only the efficiency of WesVar and SUDAAN. the design but also the efficiency of the estimator. This paper examines the relationship between the Särndal et al. (1992, p. 54) made this point clear by design effects of the total estimator and the ratio mean defining it as a function of the design (p) and the estimator, under unequal probability sampling. After a θˆ θ brief review on various definitions and practical usage estimator ( ) for the population parameter .Thus, of the design effect, we consider a decomposition of the we may write it as θˆ = θˆ θˆ′ design effect of the total estimator in term of that of the Deff(p, ) Varp ( ) / Varsrswor ( ) (2.2) ratio mean. In addition, a model-assisted method is used to derive some useful approximate formulae, with where θˆ′ is the usual form of estimator for θ under which we make further comparison of the design effects srswor, which is normally different from θˆ .For of the two estimators. The approximate formulae example, to estimate the population mean, one may use derived here are also compared with the well-known θˆ = Kish’s approximate formula (Kish, 1965, 1995). the ratio mean ∑s wi yi / ∑s wi with sampling Finally, we apply our formulae to an artificial weights w but θˆ′ would be the sample mean ∑ y / n . population created from a complex health survey data i s i for illustration. We will see the effect of a particular statistic θˆ on the design effect in Section 3. In particular, we will show 2. Definition and Use of Design Effect in that the Deff for the Hansen-Hurwitz (or Horvitz- Practice Thompson) estimator for the population total can be very different from the ratio mean for the population The precursor of the design effect that has been mean. popularized by Kish (1965) was used by Cornfield Cochran (1977, p. 85) stated “The design effect (1951). He defined the efficiency of a complex has two primary uses – in sample size determination sampling design as the ratio of the variance of a statistic and in appraising the efficiency of more complex under simple random sampling without replacement plans.” These are still main uses of the design effect. (srswor)withasamplesizeofn to the variance of the However, other important uses have emerged; to statistic under the complex design with the same sample compute the effective sample size and to modify the size. The inverse of the ratio defined by Cornfield was inferential statistic in data analysis as Scott and Rao also used by other sampling statisticians. For example, (1981) used the average design effect to modify the Hansen, Hurwitz, and Madow (1953, pp. 259-270) Pearson type χ 2 -test statistic for complex samples. discussed the increase of the variance due to clustering effect of the over srswor. However, the Kish (1992) later advocated using a slightly name, the design effect, or Deff in short was coined and different Deff, which is called Deft and uses the simple defined formally by Kish (1965, Section 8.2, p. 258) as random sampling with replacement (srswr) variance in “the ratio of the actual variance of a sample to the the denominator. His logic was that without- variance of a of the same replacement sampling is a part of design and should be numberofelements.”TheDeffforanestimateofthe captured in the definition. Deft is also easier to use for population mean is given by making inferences. Another reason Kish quoted is that the finite population correction factor (1− f ) may be = 2 Deff Var( y) / {(1- f ) S y / n} (2.1) difficult to compute in some situations. The new correlation between the sampling weights and the definition is given by survey variable (y). However, if the correlation is θˆ = θˆ θˆ′ present, the formula must be modified as studied by Deft(p, ) Varp ( ) / Varsrswr ( ) (2.3) Spencer (2000). We also examine this point further along with Spencer’s approximate formulae in or Deft2 ( p,θˆ) = Var (θˆ) / Var (θˆ) . Survey data p srswr Section 4. software such as WesVar and SUDAAN produce However, when one comes to the Horvitz- Deft2 instead of Deff. Thompson type of total estimator, the situation becomes When the population parameter is the total (Y), very different. Particularly, when the weights are poorly the unbiased estimator is the (weighted) sample total, correlated with the y-variable and the weights vary a lot, the design effect for the total estimator can be much ˆ = ∑ namely, Y s wi yi ,where wi is the properly defined greater than that for the ratio mean. In Section 3, we sampling weight and the summation is over the sample analyze this fact in detail for unequal probability s. When the mean is the parameter of interest, it is sampling. usually estimated by the ratio mean, that is, Yˆ = ∑ w y ∑ w . It is a special case of the ratio 3. DecompositionofDesignEffectofSample s i i s i Total Under Unequal Probability Sampling ≡ ∈ estimator, ∑s wi yi ∑s wi xi ,where xi 1 for all i s . One common misconception about the design Consider a sample design (denoted by p )witha ˆ effects of Yˆ and Y is that they in values are similar. sample size of n drawn by unequal probability However, it has been observed that the design effect of sampling with replacement from a finite population U Yˆ , deft 2 ( p,Yˆ) , is much larger than that of the ratio of size N.Let yk denote the y-value of element k and ˆ let pk denote the associated selection probability, mean, deft 2 ( p,Y ) for human populations. For where ∑ p = 1 . Note that the n draws are example, see Kish (1987) and Hansen, Hurwitz, and U k Madow (1953, p. 608). Some explanation can be found independent since sample selection is with replacement. in Särndal et al. (1992, p. 65 and p. 133) who showed If pk are proportional to a positive size measure xk , 2 ∝ that deft ( p,Yˆ) depends on the relative variation of that is, pk xk , then the sampling scheme is the y-variable under Bernoulli sampling and a special probability proportional to size (pps) sampling. case of one-stage cluster sampling. This dependence Let ki represent the element selected on the i-th contradicts what the design effect is intended to draw. Then the Hansen-Hurwitz’s estimator of the finite measure as Kish (1995) explicitly described: “Deft are population total, Y = ∑ y , is given as used to express the effects of sample design beyond the U k n 2 ˆ = 1 elemental variability ( S y / n ), removing both the units Y ∑ yk pk . (3.1) n = i i of measurement and sample size as nuisance i 1 It is easy to see that E (y / p ) = Y and thus Yˆ is parameters. With the removal of S y , the units, and the p ki ki sample size n,thedesigneffectsonthesamplingerrors the average of n unbiased estimators of Y .For are made generalizable (transferable) to other statistics simplicity, we use i to denote ki unless any confusion and to other variables, within the same survey, and even to other surveys.” His statement is loosely true for the arises. The variance of Yˆ can be written as ˆ 1 ratio mean Y , as the usual approximate design effect Var()Yˆ = ∑ p {}()y / p −Y 2 n U i i i formula is independent of a particular y-variable. A (3.2) 2 = 1 −1 − 2 frequently used approximate formula for the Deft of ∑U pi (yi piY) the ratio mean is given by Kish (1987) n (see Särndal et al., 1992, pp. 51-52). 2 ˆ = + ρ − + 2 Deft ( p,Y ) {1 (b 1)}(1 cv w ) (2.4) Now consider the estimation of the finite where p is an unequal probability sampling of clusters population mean, Y = Y / N .Itcanbeestimatedbythe design, ρ is the intraclass correlation (often called ˆ = ˆ ˆ ˆ = n ratio mean Y Y / N where N ∑ = 1 (npi ) is an within cluster homogeneity measure), b is the average i 1 estimator of the population size N. Using Taylor 2 cluster size, and cvw is the relative variance of the linearization, as shown in Särndal et al. (1992, pp. 172- sampling weights. Gabler, Haeder, and Lahiri (1999) ˆ 176), Y can be approximated as justified expression (2.4) using a superpopulation − model. This formula is valid only when there is no Yˆ =& Y + N 1dˆ , (3.3) ˆ = −1 n = − 2 where d n ∑ i=1di pi with di yi Y .Using Note that g is a decreasing function of CVy and expression (3.3), an approximate variance of Yˆ is 2 ˆ 2 ˆ 2 thus Deft (Y ) becomes larger than Deft (Y ) as CVy obtained as decreases. This imbalance can be dramatic when pi ’s ˆ ≡ −2 ˆ = −2 1 ∑ −1 2 AVar(Y ) N Var(d) N U pi di (3.4) − 2 n vary a lot, pi ’s and ( yi Y ) ’s are uncorrelated, and where the equation holds since ∑ d = 0 . 2 U i CVy is small (see Section 5 for an example). ˆ ˆ Now we derive the design effects of Y and Y . When p ’s and ( y −Y )2 ’s are uncorrelated, Under srswr of size n, the total estimator is given by i i which happens often in social or health surveys, we can Yˆ = Ny and its variance by Var (Ynˆ ) = N 2S 2 / , s s srswr s y −1 − 2 − 2 approximate ∑U pi (yi Y ) by nW ∑U (yi Y ) and = ∑ n where ys i=1 yi n . then expression (3.6) can be simplified as From (3.2) and assuming N is large (so that ˆ Deft2 (Y ) =& nW / N , N /(N −1) =& 1), an approximate design effect of Yˆ is where W = ∑ w / N with w = 1/(np ) .Laterwe obtained as U i i i −1 2 + 2 ∑ p ( y − p Y) will see that nW / N can be estimated by 1 cvw Deft2 (Yˆ) =& U i i i . (3.5) − 2 2 ∑U N(yi Y ) where cvw is the relative variance of the sampling ˆ = 2 ˆ = weights, wi ’s. Also, since Varsrswr (Ys ) S y / n ,whereYs ys ,it follows from (3.4) that Remark 3.1 (Binary Variable: total vs. proportion). −1 2 ˆ ∑ p (y −Y ) For a binary variable, we can show that Deft2 (Y ) =& U i i . (3.6) − 2 2 =& −Y Y p = Y ∑U N(yi Y ) CVy (1 ) / . When iscloseto1,then 2 2 ˆ 2 2 Note that Deft (Yˆ) comes closer to Deft (Y ) and g(CVy ) in (3.7) can be very large, while g(CVy ) both approach to 1 as pi approaches to a constant, say becomes small and thus the two design effects are close → 1/ N , for all i ∈U . Note also that sample design p is as Y 0 . omitted from expressions (3.5) and (3.6) for brevity’s sake. Remark 3.2 (Comparison with a Model-based In addition, we see from expressions (3.5) and Approach). The difference between Deft 2 (Yˆ) and (3.6) that the only difference between these expressions 2 ˆ 2 is in the square deviations of the numerator. Using the Deft (Y ) , that is, the quantity g(CVy ) in decom- standard ANOVA decomposition to the numerator of position (3.7), cannot be revealed by a model-based (3.5), we can write approach used by Gabler, Haeder, and Lahiri (1999), −1 − 2 = −1 − 2 ∑U pi (yi piY ) ∑U pi (yi Y ) since yi ’s are treated as random variables while wi ’s + 2 2 −1 − 2 as fixed constants. Under the model-based approach, N Y ∑U pi ( pi P) 2 ˆ 2 ˆ − −1 − − Deft (Y) is different from Deft (Y ) by a factor of 2NY ∑U pi (yi Y )( pi P) ˆ 2 = = (N / N) , which is negligible in comparison with where P ∑U pi / N 1/ N .Hence,wehave 2 g(CVy ) . More discussion will come in Section 4. ˆ 1 − 2 Deft2 (Yˆ) =& Deft2 (Y ) + ∑ p 1( p − P) 2 U i i CVy 4. Design Effects when the Survey Variable is Linearly Related with the Sampling Weight 2 1 − − ∑ p 1(y −Y )(p − P) 2 U i i i CVy Y Assuming a linear relation between the y-value and the selection probability, Spencer (2000) derived an = 2 + 2 Deft (Y ) g(CVy ) (3.7) approximate formula to the design effect of the Hansen- Hurwitz estimator under unequal probability sampling where CV = S Y denotes the coefficient of y y with-replacement. However, it appears that he did not variation (CV) of the y-variable. The last two terms are recognize the difference between the design effects of denoted by g as a function of CV 2 . the total and the ratio mean estimators, as he compared y directly his approximate formula for the total estimator + 2 ˆ = 2 ˆ with Kish’s approximate formula 1 cvw , which was Since Varsrswr (Ys ) S y / n , the design effect of Y originally derived for the ratio mean estimator. In this under (S1) and (S2) is given as section, we use Spencer’s set-up to derive approximate 2 ˆ formula for the mean estimator and then we compare Deft (Y | S1,S2) them with those for the total estimator in the same B2 1 context of Section 3. =& (1− R2 )nW / N + ()nW / N −1 (4.3) yp 2 2 Consider the of selection S y N probability p on y given by 2 i i  R  = − 2 +  yp  ()− (1 Ryp )nW / N   nW / N 1 , = + +  CVp  (S1) yi A Bpi ei . The least-square regression coefficients of this model at where the second expression follows since = = the population level are given by A = Y − BP and B RypS y / S p and P 1/ N . Note that S p denotes = − − − 2 B ∑U ( yi Y )(pi P) ∑U ( pi P) .Asbefore, the population of the pi ’s and CVp = = wi 1/(npi ) and W ∑U wi / N .Theratiomean is the of pi ’s. ˆ = 2 = estimator has an approximate variance AVar(Y ) = Since E(wi ) N / n and E(wi ) NW / n ,we −2 − 2 = + have E(w2 ) / E 2 (w ) = nW / N . Based on these N ∑U wi (yi Y ) as given in (3.4). Since Y A i i − = − + × relations, Spencer substituted nW / N in his derivation BP and yi Y B( pi P) ei ,wehave ∑U wi (Section 5) by an estimate − 2 = 2 − + 2 − ( yi Y ) ∑U ei wi 2B∑U eiwi / N B (W / N 1/ n). 2 ∑s wi / n In parallel with Spencer (2000), under (S1) we can . (4.4) ()∑ 2 write s wi / n ˆ Note that expression (4.4) is equivalent to Kish’s N 2AVar(Y | S1) + 2 approximate formula 1 cvw for the ratio mean = (N −1)R S S + (N −1)(1− R2 )S 2W (4.1) e2w e2 w yp y ˆ estimator Y . Substituting further R and CV by − + 2 ()− yp p 2BRewSeSw B W / N 1/ n their respective sample estimates, say ryp and cv p ,in where S , S 2 , S denote the (finite) population e e w expression (4.3), we obtain the following estimator of 2 the design effect of the ratio mean estimator: standard deviations of the ei ’s, the ei ’s, the wi ’s, 2 respectively, and R , R 2 , R denote the ˆ ryp ew e w yp deft2 (Y | S1,S2) = (1+ cv2 )(1− r 2 ) + cv2 (4.5) w yp 2 w 2 cv p population correlations of pairs ( ei , wi ), ( ei , wi )and = Kish’s formula immediately follows by setting ryp =0, ( yi , pi ), respectively. For example, Ryw that is, ∑ − − − U ( yi Y )(wi W ) {(N 1)S y Sw} .Toobtain(4.1), ˆ deft2 (Y | S1,S2,r = 0) = 1+ cv2 . (4.6) we have used the following expressions given in yp w Meanwhile, Spencer (2000) derived the Spencer (2000, Section 5): ∑ e2w = (N −1)R × U i i e2w population design effect for the total estimator under S S + (N −1)WS 2 (1− R2 ) and ∑ e w = (N −1)× (S1) and (S2) as follows: e2 w y yp U i i Deft2 (Yˆ | S1,S2) RewSeSw . If the regression model (S1) fits well to the =& − 2 + 2 ()− population and the error variance is roughly (1 Ryp )nW / N (A/ S y ) nW / N 1 (4.7a) homogeneous so that which can be rewritten using A = Y − BP as 2 ˆ (S2) R =& 0 and R =& 0 , Deft (Y | S1,S2) ew e2w then expression (4.1) further simplifies to  2 2  Ryp 1  ˆ =& (1− R )nW / N + − ()nW / N −1 (4.7b) N 2AVar(Y | S1,S2) yp  CV CV  (4.2)  p y  = − 2 2 − + 2 ()− (1 Ryp )S y (N 1)W B W / N 1/ n . An estimator of this approximate design effect is then given by deft2 (Yˆ | S1,S2) variable (x) is highly correlated with the y-variable (see also Lehtonen and Pahkinen, 1995, p. 110). 2  r  (4.8) = + 2 − 2 +  yp − 1  2 (1 cvw )(1 ryp )   cvw 5. Example  cv p cv y  If r = 0 , then Spencer’s formula simplifies to In this section we discuss comparison between yp the design effects of the sample total and the ratio mean cv2 estimator using an artificial dataset created from a deft2 (Yˆ | S1, S2, r = 0) = 1 + cv2 + w , (4.9) yp w 2 health survey. The dataset is a subset of the adult data cv y from the U.S. third National Health and Nutrition which does not reduce to Kish’s formula unless Examination Survey (NHANES III), which is given as a demo file in WesVar version 4. From the dataset, cv2 / cv2 = 0 . w y N=5,000 records were selected by simple random sampling without replacement to construct an artificial Remark 4.1 (Spencer’s approximate formula). In his finite population. Sampled were only the records with original derivation, Spencer proposed the following complete responses to the four variables CIGNUM formula to estimate the design effect: (number of cigarettes smoked per day), SYSTOLIC (1− r 2 )(1+ cv 2 ) + (a / s )2 cv 2 , (average systolic blood pressure), DIASTOL (average yp w y w diastolic blood pressure) and HEIGHT (height without where a and s y are the sample estimates for A and shoes - inches). The inverse of the final weight in the demo file was used as the measure of size (MOS) for + 2 + 2 2 2 S y , respectively or 1 cvw cvwa / s y as a special our sampling experiment. Note that the final weight in case for r = 0 with saying that “this is close to the demo file is different from the NHANES III final yp weight that was obtained by further adjusting the Kish’s approximation when a / s y is near zero.” weight by poststratification. = = ˆ = However, if ryp 0 ,then a Y and a / s y 1/ cv y , Table 1. Parameters of the artificial population and it becomes exactly (4.9). Variable (y) Y Y CVy Ryp CIGNUM 18,620 3.72 2.3781 -.0966 Remark 4.2 (Comparison of Deft2 (Yˆ) and SYSTOLIC 640,288 128.06 .1700 .1300 ˆ Deft2 (Y ) ). Using Cauchy-Schwarz inequality, it is DIASTOL 377,920 75.58 .1628 .0278 HEIGHT 331,269 66.25 .0606 -.1058 easy to show that nW / N ≥1. Assume y ≥ 0 so that i pi 1 1/5,000 1.3910 1.0000 > CVy 0 . From (4.3) and (4.7b), it follows that wi 846,946 169.39 1.2350 -.4104 2 ˆ ≥ 2 ˆ ≤ Deft (Y | S1, S2) Deft (Y | S1, S2) iff 2Ryp Table 1 presents several parameters of the artificial population on the selected four variables = CVp / CVy (the equality holds iff 2Ryp CVp / CVy ). including the selection probabilities pi ’s and the = 2 ˆ ≥ 2 ˆ If Ryp 0 , Deft (Y | S1,S2) Deft (Y | S1,S2) and sampling weights wi ’s. The survey variables are the equality holds under srswr, in which case weakly correlated with the selection probability, where ˆ Ryp ranges from -.1058 to .1300. CVy for Deft2 (Yˆ) = Deft2 (Y ) = 1. SYSTOLIC, DIASTOL and HEIGHT is less than .20, ∝ = Under ppswr (i.e., pi xi ), pi xi ∑U xi while for CIGNUM it is 2.3781, which is much larger = = than others. and thus we have CVp CVx and Ryp Ryx We used ppswr with n = 100 . < ∈ provided pi 1 for all i U . Therefore, we have Table 2 displays the decomposition of the 2 ˆ ≥ 2 ˆ ≤ population design effect in (3.7) for the four survey Deft (Y | S1,S2) Deft (Y | S1,S2) iff 2Ryx variables. Since CIGNUM has a large CVy ,the CV / CV . On the other hand, the inequality can be x y ˆ difference between Deft 2 (Yˆ) and Deft 2 (Y ) is small. reversed when R is large (i.e., close to 1) and even yx However, the difference is very large for the other three 2 ˆ 2 Deft (Y) can be well below 1. This happens variables due to small CVy , which causes g(CVy ) to frequently in business surveys where the y-variable is 2 ˆ heavily skewed to the right and the size measure become dominant in Deft (Y ) for these three variables. Table 2. Decomposition of design effect in (3.7) usually large and the design effect can be much smaller than 1. However, we can use the design effect in 2 ˆ 2 ˆ g(CV2) business surveys as well to measure the gain by the Variable Deft (Y ) Deft (Y ) y sample design (instead of the loss, which is usually the CIGNUM 4.837 5.676 .837 case in social surveys) over the simple random SYSTOLIC 2.833 78.205 75.372 sampling. Business surveys often employ pps sampling DIASTOL 3.028 91.377 88.349 or size-stratification. In this context, it would be worth HEIGHT 3.488 668.036 664.548 while to pursue further on this research to take into account of without-replacement sampling and the finite population correction (fpc). The fpc can be important Table 3 shows Pearson’s correlations of the particularly in business surveys. We are currently 2 sampling weights wi with ei and ei , and approximate working on the extension of our formulae to multi-stage values to the population design effects of the ratio mean sampling. We will also try to tackle these problems if and the sample total estimators. Although the plots of time permits. Lastly, we would like to point out the importance data points (xi , yi ) show less than perfect linear of an efficient estimation technique such as relationships between xi and yi (i.e., pi and yi ), poststratification for estimation of the total, when Ryp Table 3 reveals that R and R are nearly zero and ew e2w is close to zero. the approximate formulae (4.3) and (4.7a) under (S1) and (S2) trace fairly closely the true design effect given 7. References in Table 2. Cochran, W. G. (1977). Sampling Techniques.3rded. Table 3. Approximate design effects under linear New Work: Wiley. relation between pi ’s and yi ’s Cornfield, J. (1951). Modern Methods in the Sampling of Human Populations. American Journal of Public Goodness of Fit 2 Health, 41, pp. 654-661. Approximate Deft Gabler, S., Haeder, S., and Lahiri, P. (1999). A Model R R 2 2 ˆ 2 ˆ Variable ew e w Deft (Y ) Deft (Y ) Based Justification of Kish’s Formula for Design Effects for Weighting and Clustering. Survey CIGNUM 0.079 0.097 3.368 3.930 Methodology, 25, pp.105-106. SYSTOLIC -0.093 -0.078 3.351 83.293 Hansen, M. H., Hurwitz, W. N., and Madow, W. G. DIASTOL -0.022 -0.045 3.386 92.857 (1953). Sample Survey Methods and Theory,Vol.I. HEIGHT 0.058 0.016 3.364 660.107 New York: Wiley. Kish, L. (1965). Survey Sampling, New York: John 6. Concluding Remarks Wiley & Sons. Kish, L. (1987). Weighting in Deft. The Survey Kish originally intended the design effect as a Statistician, June 1987. measure of the effect of the sample design in parameter Kish, L. (1992). Weighting for Unequal p . Journal of estimation, which is independent of the elemental i variability of a particular y-variable, the unit of Official Statistics, 8, pp.183-200. measurement, and the sample size. He is largely Lehtonen, R., and Pahkinen, E. J. (1995). Practical successful for the ratio mean estimator to achieve this Methods for Design and Analysis of Complex goal. Probably due to this success, it is commonly Surveys. New York: Wiley. misconceived that the design effect of the total Särndal, C.E., Swensson, B., and Wretman, J. (1992). estimator is similar to that of the ratio mean estimator. Model Assisted Survey Sampling.NewYork: However, as clearly demonstrated in this paper, the Springer-Verlag. design effect of the Horvitz-Thompson type estimator Scott, A. J., and Rao, J. N. K. (1981). Chi-squared Tests for the total under an unequal probability sampling for Contingency Tables with Proportions estimated design is not only dependent on the variability of y- from Survey Data. In: D. Krewski, R. Platek, and J. N. K. Rao (eds.), Current Topics in Survey Sampling. variable ( CV ) but also can be greatly influenced by it. y New York: Academic Press, pp. 247-265. Another important factor in determining the design Spencer, B. D. (2000). An Approximate Design Effect effect is the correlation ( Ryp ) between the selection for Unequal Weighting When Measurements May probability and the y-variable as shown in Section 4. Correlate With Selection Probabilities. Survey Kish’s approximate design effect formulae were Methodology, 26, pp. 137-138. derived assuming tacitly that this correlation is zero. Westat (2001). WesVar 4.0 User’s Guide. Rockville, This may be the reason why the design effect is not MD: Westat, Inc. used in business surveys, where the correlation is