Catalogue no. 12-001-XIE

Survey Methodology

December 2004 How to obtain more information

Specific inquiries about this product and related statistics or services should be directed to: Business Survey Methods Division, Statistics Canada, Ottawa, Ontario, K1A 0T6 (telephone: 1 800 263-1136).

For information on the wide range of data available from Statistics Canada, you can contact us by calling one of our toll-free numbers. You can also contact us by e-mail or by visiting our website.

National inquiries line 1 800 263-1136 National telecommunications device for the hearing impaired 1 800 363-7629 Depository Services Program inquiries 1 800 700-1033 Fax line for Depository Services Program 1 800 889-9734 E-mail inquiries [email protected] Website www.statcan.ca

Information to access the product

This product, catalogue no. 12-001-XIE, is available for free. To obtain a single issue, visit our website at www.statcan.ca and select Our Products and Services.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner and in the official language of their choice. To this end, the Agency has developed standards of service that its employees observe in serving its clients. To obtain a copy of these service standards, please contact Statistics Canada toll free at 1 800 263-1136. The service standards are also published on www.statcan.ca under About Statistics Canada > Providing services to Canadians. Statistics Canada Business Survey Methods Division

December 2004

Published by authority of the Minister responsible for Statistics Canada

© Minister of Industry, 2006

All rights reserved. The content of this electronic publication may be reproduced, in whole or in part, and by any means, without further permission from Statistics Canada, subject to the following conditions: that it be done solely for the purposes of private study, research, criticism, review or newspaper summary, and/or for non-commercial purposes; and that Statistics Canada be fully acknowledged as follows: Source (or “Adapted from”, if appropriate): Statistics Canada, year of publication, name of product, catalogue number, volume and issue numbers, reference period and page(s). Otherwise, no part of this publication may be reproduced, stored in a retrieval system or transmitted in any form, by any means—electronic, mechanical or photocopy—or for any purposes without prior written permission of Licensing Services, Client Services Division, Statistics Canada, Ottawa, Ontario, Canada K1A 0T6.

April 2006

Catalogue no. 12-001-XIE ISSN 1492-0921

Frequency: semi-annual

Ottawa

Cette publication est disponible en français sur demande (no 12-001-XIF au catalogue).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued cooperation and goodwill. 4 Park and Lee: Design Effects for the Weighted Mean and Total Under Complex Surveys Vol. 30, No. 2, pp. 183-193 Statistics Canada, Catalogue No. 12-001

Design Effects for the Weighted Mean and Total Estimators Under Complex Survey Sampling

Inho Park and Hyunshik Lee 1

Abstract We revisit the relationship between the design effects for the weighted total and the weighted mean estimator under complex survey sampling. Examples are provided under various cases. Furthermore, some of the misconceptions surrounding design effects will be clarified with examples.

Key Words: ; pps sampling; Multistage sampling; Self-weighting; Poststratification; Intracluster correlation coefficient.

1. Introduction 2. A Brief Review on Definition and Use of Design Effect in Practice The design effect is widely used in survey sampling for A precursor of the design effect that has been developing a sampling design and for reporting the effect of popularized by Kish (1965) was used by Cornfield (1951). the sampling design in estimation and analysis. It is defined He defined the efficiency of a complex sampling design for as the ratio of the of an estimator under a complex estimating a population proportion as the ratio of the sampling design to that of the estimator under simple variance of the proportion estimator under simple random random sampling with the same sample size. An estimated sampling with replacement (srswr) to the corresponding design effect is routinely produced by computer software variance under a simple random design packages for complex surveys such as WesVar and with the same sample size. The inverse of the ratio defined SUDAAN. It was originally intended and defined for the by Cornfield (1951) was also used by others. For example, weighted (ratio) estimator of the population mean (Kish Hansen, Hurwitz and Madow (1953, Vol. I, pages 1995). However, a common practice has been to apply this 259 − 270) discussed the increase of the relative variance concept for other statistics such as the weighted total of a due to the clustering effect of cluster estimator often with success but at times with confusion and sampling over simple random sampling without misunderstanding. The latter situation occurs particularly replacement (srswor). The name, design effect, or Deff in when simple but useful results derived under a relatively short, however, was coined and defined formally by Kish simple sampling design are applied to more complex (1965, section 8.2, page 258) as “the ratio of the actual problems. In this paper, we examine the relationship variance of a sample to the variance of a simple random between the design effects for the weighted total estimator sample of the same number of elements” (for more history, and the weighted mean estimator under various complex see also Kish 1995, page 73 and references cited therein). survey sampling designs. In section 2, we briefly review the Suppose that we are interested in estimating the definition of the design effect and its practical usage while population mean (Y ) of a variable y from a sample of discussing some of the misconceptions surrounding design size m drawn by a complex sampling design denoted by effects for the weighted total and mean estimators. p from a population of size M. Kish’s Deff for an Subsequently, in section 3, we analyze the difference estimate (y ) is given by between the design effect for the weighted total estimator p and that for the weighted mean estimator under a two-stage V (y ) Deff = p p (2.1) sampling design followed by a discussion regarding the − 2 (1 f ) S y m design effects under various two-stage sampling designs and = some more general cases in section 4. We try to clarify where V p denotes variance with respect to p, f m /M is 2 = − −1 some of the misconceptions with these examples. Finally, the overall sampling fraction, and S y (M 1) M − 2 we summarize our discussion in section 5. ∑k =1 (yk Y ) is the population element variance of the

1. Inho Park and Hyunshik Lee, Westat, Inc. 1650 Research Blvd., Rockville, MD 20850, U.S.A. E-mail: [email protected]. Survey Methodology, December 2004 5 y-variable. Although the design effect was originally was also noted in, for example, Kish (1987) and Barron and intended and defined for an estimator of the population Finch (1978). Some explanation can be found in Hansen mean (Kish 1995), it can be defined for any meaningful et al. (1953, Vol. I, pages 336 – 340) who showed that the statistic computed from a sample selected by a complex difference arises from the relative variance of the cluster sampling design. sizes. More recently Särndal et al. (1992, pages 315 – 318) The Deff is a population quantity that depends on the showed that contrary to the case of Yˆ, the design effect sampling design and refers to a particular statistic estimating for Yˆ depends on the (relative) variation of the y-variable. a particular population parameter of interest. Different In fact, even the design effect for Yˆ may depend on the estimators can estimate the same parameter and their design (relative) variation of the y-variable, which we will discuss effects are different even under the same design. Therefore, in section 4. This dependence contradicts what the design the design effect includes not only the efficiency of the effect is intended to measure as Kish (1995) explicitly design but also the efficiency of the estimator. Särndal, described: Swensson, and Wretman (1992, page 54) made this point clear by defining it as a function of the design ( p) and the “Deft are used to express the effects of sample design 2 estimator (θˆ) for the population parameter (θ = θ (y)). beyond the elemental variability (S y /m), removing both Thus, we may write it as the units of measurement and sample size as nuisance parameters. With the removal of S , the units, and the V (θˆ) y Deff (θˆ) = p sample size m, the design effects on the sampling errors p ˆ′ Vsrswor (θ ) are made generalizable (transferable) to other statistics where θˆ′ is the usual form of an estimator for θ under and to other variables, within the same survey, and even srswor, which is normally different from θˆ. For example, to to other surveys.” estimate the population mean, one may use the weighted His statement may be loosely true for the weighted mean (ratio) mean θˆ = ∑ w y /∑ w with sampling weights s k k s k Yˆ as expressed in the frequently used sample approximate w but θˆ′ would be the simple sample mean ∑ y /m, k s k formula for Deft 2 ( p, Yˆ ) given by Kish (1987): where the summation is over the sample s. We will see the p effect of particular estimators θˆ on the design effect in the Deft 2 (Yˆ ) = {1+ ρ(m −1) }()1+ cv 2 (2.2) later sections. p w Kish (1995) later advocated using a somewhat different definition, which is called Deft and uses the srswr variance where the sample design p contains complex features such as unequal weighting and cluster sampling, ρ = ρ (y) is the in the denominator on the ground that without-replacement p sampling is a part of the design and should be captured in intraclass correlation coefficient (often called within cluster the definition. He also reasoned that Deft is easier to use for homogeneity measure), m is the average cluster sample size, and cv 2 is the sample relative variance of the weights. making inferences and that it is better to define the design w effect without the finite population correction factor (1− f ) Strictly speaking, this formula is not independent of the ρ because the factor is difficult to compute in some situations. y-variable because is dependent on the y-variable. Also, The new definition is given by the design effect may not be free of the unit of measurement unless V (Yˆ ) is expressed in a factorial form of S 2 /m. ˆ p y Vp (θ) See Park and Lee (2002). This formula (2.2) is valid only Deft (θˆ) = p ˆ′ Vsrswr (θ ) when there is no correlation between the sampling weights and the survey variable y. However, if the correlation is 2 ˆ = ˆ ˆ′ or Deft p (θ) V p (θ) /Vsrswr (θ ). Survey data software 2 present, the formula may need to be modified as studied by such as WesVar and SUDAAN produce Deft instead of Spencer (2000) and Park and Lee (2001). In the following Deff. We will use this definition in this paper. section, we elaborate this aspect in detail for two-stage When the population parameter is the total (Y), the sampling and we will also examine this point further in unbiased estimator is the weighted sample total, namely, section 4.1. ˆ = Y ∑s wk yk . When the population mean is the parameter of interest, it is usually estimated by the weighted mean, that ˆ = is, Y ∑∑sswk yk / wk . It is a special case of the ratio 3. Decomposition of the Design Effect Under ≡ ∈ estimator, ∑s wk yk /∑s wk xk , where xk 1 for all k s. Two-Stage Sampling One common misconception about the design effects for Yˆ and Yˆ is that they are similar in values. However, it has We consider a sampling design conducted in two stages. ˆ 2 ˆ = = been observed that the design effect for Y, Deft p (Y ), Suppose that a population U {k:k 1,…, M} with M ˆ 2 ˆ tends to be much larger than that for Y , Deft p (Y ). This elements is grouped into N clusters of size M i such that

Statistics Canada, Catalogue No. 12-001 6 Park and Lee: Design Effects for the Weighted Mean and Total Estimators Under Complex Surveys Sampling

= N = = 2 M ∑ii=1 M . The first stage sample sa {i :i 1,…, n} ⎡ N N ⎤ ˆ = 1 ⎛ − M i ⎞ + ˆ of n clusters (primary sampling units, or PSUs in AVp (Y ) 2 ⎢∑ wi ⎜Yi Y ⎟ ∑ wi Vb (Di )⎥. (3.3) M i=1 ⎝ M ⎠ i=1 abbreviation) is selected with replacement from N clusters ⎣⎢ ⎦⎥ N = = = n with probabilities pi , where ∑i =1 pi 1. Let pa Pr(sa ) If a simple random sample of size m ∑i =1 mi is denote the first stage sampling design. The second stage selected with replacement from the population U, then a = = = sample sbi { j : j 1,…, mi } of mi elements (secondary sample mean ysrs ∑s yk / m and its expansion sampling units or SSUs in abbreviation) is then selected ˆ = = 1 Ysrs M ysrs ∑ yk (3.4) independently from each PSU i selected at the first stage f s = according to some arbitrary sampling design, say pbi Pr(s | s ) where i ∈ s . Denote the total sample of ele- would serve as the estimators of the population mean Y bi a a = ments and the overall sampling design by s = ∪ ∈ s and and total Y, respectively, under srswr, where f m / M is i sa bi p = Pr(s), respectively. Associated with the j th element in the overall sampling fraction. Their under this th ˆ = 2 the i cluster is a survey characteristic y , j = 1,…, M , sampling design are given as Vsrswr (Ysrs ) M Vsrswr (ysrs ), ij i = −1 2 2 = − −1 i =1,…, N. For a given i ∈ s , let w be the second stage where Vsrswr (ysrs ) m S y and S y (M 1) a j|i − 2 ˆ = ∑s (y Y ) . We note that m is the achieved sample size, sampling weights such that an estimator of the form Yi k mi = Mi which is a random quantity in general. From (3.1), (3.3), and ∑ j=1 w j|i yij is unbiased for the cluster total Yi ∑ j =1 yij , ˆ = above expressions with m replaced by its m′ that is, Eb (Yi ) Yi , where Eb represents the expectation = with respect to the overall sampling design p, i.e., with respect to the second stage sampling. Let wi 1/(npi ) = N m′ = E (m), the design effects for Yˆ and Yˆ can be written be the first stage sampling weights and let Y ∑i =1Yi be p = as the population total. It is easy to show that Ea (Yi / pi ) Y. ∈ n N 2 N Assuming that Yi are known for i sa , ∑i =1 wi Yi is the m′ ⎧ ⎛ Y ⎞ ⎛ Yˆ ⎞⎪⎫ Deft2 (Yˆ) = w i − p + wV ⎜ i ⎟ (3.5) average of n unbiased estimators of Y so that p 2 ⎨ ∑ i ⎜ i ⎟ ∑ i b ⎜ ⎟⎬ CV i=1 ⎝ Y ⎠ i=1 Y ⎪ n = y ⎩ ⎝ ⎠⎭ Ea (∑i =1 wi Yi ) Y, where Ea denotes the expectation with respect to the first stage sampling design. Note that both and stages are sampling with replacement. Accordingly, it is 2 m′ ⎧ N ⎛ Y M ⎞ N ⎛ Dˆ ⎞⎪⎫ Deft2 (Yˆ ) ≅ w i − i + wV ⎜ i ⎟ (3.6) possible that the same sampling unit (either cluster or p 2 ⎨∑ i ⎜ ⎟ ∑ i b ⎜ ⎟⎬ CV = Y M = Y element) is selected more than once but they are treated y ⎩ i 1 ⎝ ⎠ i 1 ⎝ ⎠⎭⎪ differently. Define the overall sampling weights by where CV 2 = S 2 /Y 2 represents the population relative = ˆ = n mi y y wij wi w j| i . Clearly, Y ∑∑i =1 j=1 wij yij is unbiased for n variance of the y -variable. From these expressions, the Y, that is, E (Yˆ) = E E (Yˆ) = E (∑ = w Y ) = Y, p a b a i 1 i i difference in design effects for Yˆ and Yˆ can be written as where E represents the expectation with respect to p. The p follows. variance of Yˆ can be written as 2 ˆ − 2 ˆ ≅ Δ + Δ ˆ = ˆ + ˆ Deft p (Y ) Deft p (Y ) a b , (3.7) V p (Y ) Va Eb (Y ) EaVb (Y ) N N where = − 2 + ˆ wi (Yi piY ) wiVb (Yi ) (3.1) ∑ ∑ ⎧ N 2 2 ⎫ i=1 i=1 m′ ⎪ ⎡⎛ Y ⎞ ⎛ Y M ⎞ ⎤⎪ Δ = ⎢⎜ i − ⎟ − ⎜ i − i ⎟ ⎥ a 2 ⎨∑ wi ⎜ pi ⎟ ⎜ ⎟ ⎬ where V ,V and V represent variances defined with CV = Y Y M p a b y ⎩⎪i 1 ⎣⎢⎝ ⎠ ⎝ ⎠ ⎦⎥⎭⎪ respect to the overall, the first stage, and the second stage sampling. See Särndal et al. (1992, pages 151 – 152). and A commonly used estimator for the population mean m′ ⎪⎧ N ⎡ ⎛ Yˆ ⎞ ⎛ Dˆ ⎞⎤⎪⎫ = Δ = ⎜ i ⎟ − ⎜ i ⎟ Y Y /M is the weighted (ratio) estimator given by b 2 ⎨∑ wi ⎢Vb Vb ⎥⎬. = ⎜ Y ⎟ ⎜ Y ⎟ ˆ = ˆ ˆ ˆ = n mi CVy ⎪i 1 ⎢ ⎝ ⎠ ⎝ ⎠⎥⎪ Y Y /M where M ∑∑i =1 j =1 wij . Using Taylor linear- ⎩ ⎣ ⎦⎭ ization, as shown in Särndal et al. (1992, pages 176 – 178), Δ Δ Yˆ can be approximated as The two components a and b in expression (3.7) reflect the differences arising from the respective sources of ˆ ≅ + −1 ˆ Y Y M D (3.2) variation from the first and second stages of sampling. Of ˆ = n mi course, the second component disappears if all the elements where D ∑∑i =1 j =1 wij dij is an unbiased estimator of the = N M i = − in selected clusters are observed since it becomes a single- population total D ∑∑i =1 j =1 dij of dij yij Y , which stage design or if a simple random sample is selected in the represents the deviation of yij from the population mean = = Mi = − second stage. This is because both variances V (Yˆ ) and Y . Note that D 0. Denoting Di ∑ j =1 dij Yi M i Y b i ˆ = mi V (Dˆ ) are equivalent under the aforementioned conditions, and Di ∑ j =1 wj|i dij , we obtain the approximate variance b i ˆ ˆ = ˆ = = of Y from expression (3.2) as that is, 1) Vb (Yi ) Vb (Di ) 0 if w j |i 1 for all i and j, and

Statistics Canada, Catalogue No. 12-001 Survey Methodology, December 2004 7

ˆ = ˆ ≥ = 2) Vb (Yi ) Vb (Di ) 0 if wj |i M i /mi for all i and j. In special case of (3.9). Note that the quantity Ap (y) in ′⋅ 2 2 other words, expression (3.9) approximately reduces to m CVM /CVy Δ = = when p =1/ N for all i. b 0 if w j|i ci for all i and j, (3.8) i Example 3.2 shows that when unequal cluster sizes are where ci are nonnegative constants and not necessarily not reflected in the sampling design, the relative efficiency equal for different clusters. Meanwhile, we can show that of Yˆ over Yˆ depends in part on the relative variability of ∝ ⎧ 0 if pi M i , cluster sizes. If the cluster means are all equal, then cluster Δ = ⎪ ∝ sampling makes Yˆ more efficient than Yˆ, vice versa if all a ⎨ Ap (y) if Yi M i , (3.9) the cluster totals are equal. On the other hand, if all clusters ⎪− A (y) if p ∝Y , ⎩ p i i are equal in size, no difference in the design effects arises by = ′ 2 N − 2 for all i, where Ap (y) (m / CVy )∑i=1 wi ( pi M i / M ) . simple random sampling of clusters. Note that Ap (y) is a nonnegative quantity and also that the In section 4, we utilize the results derived in this section conditions in expression (3.9) can be restated, respectively, to discuss other examples used in the sampling literature. = = = = as pi M i /M , Yi Y , and pi Yi /Y, where Yi Yi /M i = for all i 1, …, N. This result reveals the effect of cluster 4. Examples on the Design Effect sampling on the precision of the two estimators. For = in the Sampling Literature example, if pi M i /M , cluster sampling makes no difference in the precision of the two estimators. On the 4.1 Unequal Probability Element Sampling = ˆ ˆ other hand, if pi Yi /Y, Y becomes more efficient than Y Consider an unequal probability element sampling design in precision under cluster sampling, whereas the cluster without clustering. The discussion in section 3 applies to sampling favors Yˆ over Yˆ in terms of precision if Y = Y ≡ = i this example with M i 1 for all i 1, …, N and thus, for all i. = m n. For brevity’s sake, we use lower cases yi to denote Now, let us consider some examples of the conditions of the value of the y-variable, and we also assume that N is (3.8) and (3.9). large so that N /(N −1) ≅ 1. Due to the absence of the ˆ Example 3.1 For one or two-stage cluster design with pps second stage sampling variation, the design effects for Y = = and Yˆ given in expressions (3.5) and (3.6) reduce to cluster sampling using pi M i / M and wj |i ci for all N = Δ = − i 1, …, N, we have from (3.8) and (3.9) that a 1 − 2 ∑ pi (yi piY) Δ = 0, that is, there is no difference in the design effects for = b Deft 2 (Yˆ) ≅ i 1 (4.1) Yˆ and Yˆ . p N − 2 The same result as given in example 3.1 can be achieved ∑ N(yi Y ) i =1 by Yˆ = M Y . This estimator is the ratio estimator, which can be used if M is known. The case that overall sampling and N weights are a constant for all the elements (i.e., self- −1 − 2 ∑ pi (yi Y ) weighting sampling design) is a well known special case. 2 i =1 Deft (Yˆ ) ≅ . (4.2) p N We will come back to this in section 4. 2 − ∑ N(yi Y ) Example 3.2 One-stage simple random cluster sampling or i =1 two-stage sample design with srs for both stages. Under Further let us consider an example where the survey these designs, we have w = c and p =1/ N for all i and j |i i i variable y is not correlated with the selection probability pi . Δ = j and thus, it follows from (3.8) and (3.9) that b 0 and Example 4.1 Unequal probability element sampling with no ⎧ ⎪ correlation between yi and pi . When yi and pi are not ≡ N −1 − 2 ⎪0 if M i M 0 for all i, correlated, we can approximate ∑i =1 pi (yi Y ) by N 2 −1 N ⎪ 2 nW ∑ = (y −Y ) , where W = N ∑ = w . Note that ⎪ CVM i 1 i i 1 i Δ = ′ −1 n −1 n 2 a ⎨m if Yi are all equal, (3.10) = = 2 E p (n ∑i =1 wi ) N /n, E p (n ∑i =1 wi ) NW /n and ⎪ CVy −1 n 2 2 −1 n E (n ∑ = w ) / E (n ∑ = w ) = nW /N. Thus, ⎪ CV 2 p i 1 i p i 1 i ⎪− m′ M if Y are all equal, 2 ˆ ≅ 2 i Deft p (Y ) nW N ⎩⎪ CVy ()()−1 n 2 2 −1 n − = E n w E n w . (4.3) ′= ′ 2 = 2 N − 2 p ∑i=1 i p ∑i=1 i where m m /n, CVM M ∑i=1 (M i M ) /N denotes = ≥ the relative variance of cluster sizes M i , and M M / N It is easy to show that nW /N 1 using the Cauchy- denotes the average size of clusters. The conditions in (3.10) Schwarz inequality (Apostol 1974, page 14). In addition, also satisfy the conditions in (3.9) and therefore, (3.10) is a routine calculations show from (4.1) and (4.2) that

Statistics Canada, Catalogue No. 12-001 8 Park and Lee: Design Effects for the Weighted Mean and Total Estimators Under Complex Surveys Sampling

2 ˆ − 2 ˆ which is inefficient for estimation of Y, the inefficiency Deft p (Y ) Deft p (Y ) diminishes with the ratio estimation. Next, we consider the ≅ −2 {}N −1 − 2 − −1 N − − CVy ∑i=1 pi ( pi p) 2Y ∑i=1 pi (yi Y )( pi p) opposite case where the y-variable is correlated with the −2 ˆ = − selection probability pi , where the efficiency of Y CVy (nW N 1), increases. = −1 N = where p N ∑i =1 pi 1/ N. The latter expression is N −1 − 2 = − N Example 4.2 Unequal probability element sampling where obtained from ∑i =1 pi ( pi p) nW / N 1 and ∑i =1 −1 p (y −Y )( p − p) ≅ 0 because y and p are uncorre- yi is correlated with pi . Suppose that yi is linearly related i i i i i = + + lated. Consequently, with pi by yi A Bpi ei , where A and B are the least- square regression coefficients of the model for the (finite) population and e is the corresponding residual. Further- 2 ˆ − 2 ˆ ≅ −2 {}2 ˆ − i Deft p (Y) Deft p (Y ) CVy Deft p (Y ) 1 more, assume that the regression model fits well to the or population data and the error variance is roughly homo- ≅ ≅ geneous so that R 0 and R 2 0, where R and 2 ˆ ≅ + −2 2 ˆ − −2 ew e w ew Deft p (Y) (1 CVy ) Deft p (Y ) CVy . (4.4) Re2w denote the population correlations of pairs (ei , wi ) and 2 N 2 2 (e , w ), respectively. For example, R = = From (4.4), it is clear that Deft (Yˆ) ≥ Deft (Yˆ ) if i i ew ∑i 1 p p − − − = N 2 ˆ ≥ 2 ˆ = (ei E)(wi W ) /{(N 1) SeSw}, where E ∑i =1 ei /N; Deft p (Y ) 1 and the equality holds if Deft p (Y ) 1 or = 2 ˆ < 2 ˆ + 2 < Se and Sw are the population standard deviations of ei and W N /n. Also, Deft p (Y) Deft p (Y ) if 1/(1 CVy ) 2 ˆ < wi , respectively. Then the design effects given by (4.1) and Deft p (Y ) 1. Example 4.1 shows that Yˆ tends to have a larger design (4.2) reduce to ˆ 2 ˆ ≅ − 2 effect than Y if the correlation between yi and pi is weak Deft p (Y) (nW N ) (1 R yp ) 2 ˆ and Deft (Y )≥1. 2 p ⎛ R yp 1 ⎞ The customary quantification of the effect of unequal + (nW N − 1)⎜ − ⎟ ⎜ ⎟ (4.7) weights on the design efficiency shown in (2.2) is due to ⎝ CVp CVy ⎠ Kish (1965, 11.7). He considered cases where the unequal and weights arise from “haphazard” or “random” sources such Deft 2 (Yˆ ) ≅ (nW N ) (1− R 2 ) as frame problems or non-response adjustments. Assuming p yp 2 that (1) a random sample of size n selected with replacement ⎛ R ⎞ + (nW N −1)⎜ yp ⎟ , is divided into G weighting classes such that the same ⎜ ⎟ (4.8) ⎝ CV p ⎠ weight wg is assigned to ng sampling units within class g = G respectively, where R is the population correlation and n ∑g =1 ng , and that (2) all G weighting class yp 2 = 2 between y and p and CV is the population coefficient variances are equal to the unit variance of y, i.e., S yg S y i i p for all g =1, …,G, he proposed a quantity given as of variation of pi (see Park and Lee (2001) for proof ). It 2 ˆ ≥ 2 ˆ 2 follows from (4.7) and (4.8) that Deft p (Y) Deft p (Y ) if G ⎛ G ⎞ 2 ˆ = 2 ⎜ ⎟ and only if Deft Kish (Y ) n ∑ ng wg ∑ ng wg , (4.5) = ⎜ = ⎟ g 1 ⎝ g 1 ⎠ ≤ 2Ryp CVp CVy , (4.9) to measure the increment in the variance of Yˆ in = comparison with the hypothesized variance under srswr of where the equality holds if and only if 2Ryp CVp / CVy . size n. The rationale behind the above derivation is that the Also, the inequality is reversed when the inequality in (4.9) loss in precision of Yˆ due to haphazard unequal weighting becomes opposite. can be approximated by the ratio of the variance under The condition (4.9) indicates that Yˆ tends to be less ˆ disproportionate to that under the efficient in terms of precision than Y whenever Ryp is proportionate stratified sampling. small. Thus, we see that Ryp plays an important role in = = In (4.5), letting ng 1 for all g and thus, n G, Kish determining the design efficiency of unequal probability (1992) later proposed a well-known approximate formula sampling on Yˆ and Yˆ and their relative efficiency. given as In an attempt to develop an approximate expression to 2 n ⎛ n ⎞ the design effect when y is correlated with p , Spencer 2 ˆ = 2 ⎜ ⎟ = + 2 i i Deft Kish (Y ) n ∑ wi ∑ wi 1 cv w , (4.6) (2000) proposed a sample approximate formula for Yˆ and = ⎜ = ⎟ i 1 ⎝ i 1 ⎠ compared it with Kish’s approximate formula (4.6) for the 2 = −1 n − 2 2 = where cvw n ∑i =1(wi w) / w is the sample relative special case of Ryp 0. As seen in example 4.2, the two variance and w is the sample mean of wi . Note that (4.6) design effects (4.7) and (4.8) are not equal unless is a sample approximate of (4.3). For a sampling design W = N / n (see Park and Lee (2001) for more discussion

Statistics Canada, Catalogue No. 12-001 Survey Methodology, December 2004 9 and some numerical examples). In addition, this special case − − 2 ˆ = ⎛ N 1⎞ ⎛ + M N ⎞ Deft p (Y) ⎜ ⎟ ⎜1 δ⎟ provides the same condition as for example 4.1 and thus, the ⎝ N ⎠ ⎝ N −1 ⎠ two approximate design effect formulae (4.7) and (4.8) are 2 1 N ⎛ M ⎞⎛ Y ⎞ equivalent to (4.4) and (4.3), respectively. + (M − M )⎜ i ⎟⎜ i ⎟ (4.13) ⋅ 2 ∑ i ⎜ ⎟⎜ ⎟ N CVy i =1 ⎝ M ⎠⎝ Y ⎠ 4.2 One-Stage Cluster Sampling Consider a one-stage cluster sampling, where every and element in a sampled cluster is included in the sample, i.e., ⎛ N − 1⎞ ⎛ M − N ⎞ m ≡ M for all i∈s . Due to the absence of the second Deft 2 (Yˆ ) ≅ 1 + δ i i a p ⎜ ⎟ ⎜ − ⎟ stage sampling variation, the variance of Yˆ takes only the ⎝ N ⎠ ⎝ N 1 ⎠ 2 first term of expression (3.1) and it can be decomposed as 1 N ⎛ M ⎞⎛ D ⎞ + (M − M )⎜ i ⎟⎜ i ⎟ , (4.14) 2 ∑ i ⎜ ⎟⎜ ⎟ N − N N ⋅ CV = M Y − 2 = M (N 1) 2 + 2 y i 1 ⎝ ⎠⎝ ⎠ ∑ wi (Yi piY ) S yB ∑ wi QiYi , (4.10) i =1 n i =1 = 2 ˆ − 2 ˆ ∝ where M M / N. Since Deft p (Y) Deft p (Y ) − N 2 = − 1 N − 2 = ∑ = M (M − M )(2Y −Y ), the inequality between design where S yB (N 1) ∑i =1 M i (Yi Y ) and Qi i 1 i i i − = = effects for Yˆ and Yˆ depends on the joint distribution of Y M i (M i pi M ) for i 1, …, N. Note that Qi 0 if i = and M . pi M i / M , that is, pi is proportional to the cluster size i 2 M i. Also, note that S yB is the between-cluster mean square Example 4.4 One-stage simple random sampling of clusters deviation in an analysis of variance. Denoting the within- of equal-size. In this case, we have M ≡ M and 2 = − −1 N i 0 cluster mean square deviation as S yW (M N) ∑i=1 p =1/ N for all i =1, …, N and both design effects in 2 2 2 i Mi − = + − − ∑ j=1 (yij Yi ) , write S yB S y {1 δ(M N) /(N 1)} (4.13) and (4.14) can be approximated by the same quantity = − 2 2 with δ 1 S yW / S y . Since the expected sample size is given as m′ = nM , the design effect for Yˆ can be written from − − (4.10) as ⎛ N 1⎞ ⎡ + N(M 0 1) ⎤ ⎜ ⎟ ⎢1 δ⎥, (4.15a) ⎝ N ⎠ ⎣ N −1 ⎦ − − 2 ˆ = ⎛ N 1⎞ ⎛ + M N ⎞ Deft p (Y ) ⎜ ⎟ ⎜1 δ⎟ since M − M = 0 for all i =1, …, N. ⎝ N ⎠ ⎝ N −1 ⎠ i To introduce the clustering effect on variance estimation, 2 nM N w Q ⎛ Y ⎞ one often uses the simplest form of one-stage simple + i i ⎜ i ⎟ . (4.11) 2 ∑ 2 ⎜ ⎟ random cluster sampling as in example 4.4. For example, CVy i =1 M i ⎝ Y ⎠ see Cochran (1977, section 9.4), Lehtonen and Pahkinen Similarly, the design effect for Yˆ can be expressed as (1995, page 91), and Lohr (1999, section 5.2.2). Although these authors adopted a without-replacement sampling scheme, we compare their formulae with our formulae with ⎛ N −1⎞ ⎛ M − N ⎞ Deft 2 (Yˆ ) ≅ ⎜ ⎟ ⎜1+ δ⎟ the with – replacement sampling assumption for the sake of p N N −1 ⎝ ⎠ ⎝ ⎠ both simplicity and consistency. Furthermore, the compar- 2 nM N w Q ⎛ D ⎞ ison is valid because their formulae are defined with the + i i ⎜ i ⎟ . (4.12) 2 ∑ 2 ⎜ ⎟ finite population correction incorporated in both numerator CVy i =1 M i ⎝ Y ⎠ and denominator so that its effect is basically cancelled out. Cochran (1977, section 9.4) derived We observe that the design effect for Yˆ differs from that for M Yˆ in the second term containing D = ∑ =i (y −Y ) instead NM −1 i i 1 ij Deft 2 (Yˆ ) = 0 []1+ (M −1)ρ of Y . In addition, we note that the quantity δ = δ (y) is p − 0 i p M 0 (N 1) the adjusted coefficient of determination (R2 ) in the adj ≅ 1+ (M −1)ρ, (4.15b) regression analysis context. It may be called a homogeneity 0 measure. For more discussion on δ, see Särndal et al. where ρ is called the intracluster correlation coefficient (1992, pages 130 – 131) and Lohr (1999, page 140). defined by

N M 0 Example 4.3 One-stage simple random sampling of − − 2∑ ∑ (yij Y )(yij Y ) clusters. In this example, if p =1/ N for all i =1, …, N, = > = i ρ = i 1 j k 1 . (4.15c) the two design effects in (4.11) and (4.12) reduce, N M 0 − − 2 respectively, to (M 0 1)∑ ∑ (yij Y ) i=1 j=1

Statistics Canada, Catalogue No. 12-001 10 Park and Lee: Design Effects for the Weighted Mean and Total Estimators Under Complex Surveys Sampling

N M 2 2 Rewriting ∑ = [∑ =0 (y −Y )] = M (N −1)S and 4.3 Self-Weighting Designs i 1 j 1 ij 0 yB 2 2 2 N M 0 − = − = − + − ∑i =1 ∑ j =1 (yij Y ) (NM 0 1) S y (N 1)S yB N(M 0 1) In a self-weighting sample, every sample element has the 2 S yW , it is easy to show that same weight. This leads to simple forms for both total and ˆ = N M 0 mean estimators. They are given by Y y/ f and − − ˆ = = 2∑ ∑ (yij Y )(yik Y ) Y y/m, where f m/M is the overall sampling fraction = > = i 1 j k 1 = n mi and y ∑i=1 ∑ j=1 yij is the sample total. Then just like 2 N ⎡ M 0 ⎤ N M 0 simple random sampling as shown in (3.4), the two = − − − 2 ∑ ⎢∑ (yij Y )⎥ ∑ ∑ (yij Y ) estimators have the same design effect. i=1 j=1 i=1 j=1 ⎣ ⎦ A self-weighting sampling design can be implemented in = − []− 2 − 2 (M 0 1) (NM 0 1)S y NM 0 S y w various ways by synchronizing the first stage sampling method with the second stage sampling method (e.g., Kish and, thus, from (4.15c), ρ =1−{NM /(NM −1)} 0 0 1965, section 7.2). For example, if equal probability (S 2 / S 2 ) ≅ δ assuming M ≡ M for all i =1, …, N, yW y i 0 sampling is used for the first stage sampling, then the − ≅ − NM 0 /(NM 0 1) 1. Therefore, further assuming (N 1) / − − second stage should be sampled by an equal probability N ≅1 and (NM −1) M 1(N −1) 1 ≅1, both design effect 0 0 sampling method with a uniform sampling fraction for all formulae (4.15a) and (4.15b) are approximately equivalent PSUs. As a special case of this, where an srs of PSUs of to 1+ (M −1)δ. Other authors arrived at the same 0 equal size (i.e., M = M for all i ) is selected, Hansen δ ρ i 0 approximate formula. This is because and essentially et al. (1953, Vol. II, pages 162 – 163) showed measure the same thing, which is the cluster homogeneity. ˆ ˆ 2 ˆ ≅ 1 2 []+ − Under this situation, two estimators Y and Y have the CVp (Y ) CVy 1 ρ(m 1) , (4.17) same design effect as discussed in example 3.2. Note that m 2 ˆ = ˆ 2 ˆ this is a simple case of a self-weighting sampling design. where CVp (Y ) Vp (Y ) / Y is the relative variance of Y Särndal et al. (1992, section 8.7) compared the design under the sampling design p and ρ is the intracluster effects for the two estimators under the setting of example correlation coefficient as defined in (4.15c). Since the ˆ −1 2 4.3. They also derived a simplified expression 1+ (M −1)δ relative variance of Y under srswr is m CVy , the well ˆ for (4.13) and (4.14), assuming the covariances of M i with known approximate design effect formula for Y under a 2 2 M i Yi and M i Di are ignorable. Their discussion on the self-weighting design follows immediately as difference between total and mean estimators boils down to Deft 2 (Yˆ ) =1 + ρ(m −1). (4.18) Δ p a in example 3.2. They also noted that the design effect can be much more severe for the population total than for For one-stage cluster designs, we showed similar forms the population mean because more is lost through sampling given in (4.15a) and (4.15b) (see also Yamane 1967, section of clusters when the total is estimated than when the mean is 8.7). Hansen et al. (1953, Vol. II. page 204) further showed 2 ˆ = 2 ˆ estimated. CVp (Y ) CVp (Y ) for a sample design that employs A common practice to handle unequal cluster sizes is to simple random sampling at both stages. This implies that Yˆ ˆ use a more efficient sampling method that incorporates the and Y have the same design effect. size difference such as pps sampling of clusters. Expressions 4.4 Two-Stage Unequal Probability Sampling (4.11) and (4.12) can be applied to arbitrary selection Let us first consider the following example. probabilities pi , where pi are set to be proportional to ≥ Example 4.5 A two-stage sampling design where n PSUs some size measures Z i 0. The difference between the ˆ design effects for Yˆ and Y is explained by Δ in (3.9), or are selected with replacement with probability pi and an a ≥ alternatively equal size simple random sample of m0 2 elements is selected with replacement from each selected PSU. With 2 2 m′ N w Q ⎡⎛ Y ⎞ ⎛ D ⎞ ⎤ routine calculations and simplification, we can show that Δ = i i ⎢⎜ i ⎟ − ⎜ i ⎟ ⎥. (4.16) a 2 ∑ 2 ⎜ ⎟ ⎜ ⎟ 2 ˆ ≅ + − + * CV = M Y Y Deft p (Y ) 1 (m0 1)τ W y , (4.19) y i 1 ⎣⎢⎝ ⎠ ⎝ ⎠ ⎦⎥ where The term Qi in (4.16) represents the effect of pi on the N − 2 + − −1 2 variance estimation when size measures other than the (N 1)S yB ∑ (m0 1) S yi = actual cluster sizes M are used. Thomsen, Tesfu, and τ = i 1 , (4.20) i N Binder (1986) considered the effect of an out-dated size − 2 + − 2 (N 1)S yB ∑ (M i 1)S yi measure among other factors under two-stage sampling with i=1 2 −1 2 * = − M i − = ˆ = simple random sample of element at the second stage. We S yi (M i 1) ∑ j =1 (yij Yi ) ,Wy Wy /Vsrswr (Ysrs ) 2 2 2 2 Ni + will come back to this in section 4.4. (m0 / CVy ) ∑i =1 (Qi / pi M )(Yi /Y ) (1 CVyi / m0 ), and

Statistics Canada, Catalogue No. 12-001 Survey Methodology, December 2004 11

2 = 2 2 2 ˆ ˆ 2 CVyi S yi /Yi denotes the within-cluster relative variance differs from Deft p (Y ) only by a factor of (M / M ) , of the y-variable. Similarly, although the actual difference can be much more pronounced as we have showed in this paper (e.g., Deft 2 (Yˆ ) ≅ 1 + (m −1)τ + W * , (4.21) p 0 d expressions (3.7) and (4.23)). * = ˆ = 2 N 2 where Wd Wd /Vsrswr (Ysrs ) (m0 /CVy )∑i=1 (Qi /pi M ) 4.5 More General Cases 2 + 2 2 (Di /Y ) (1 CVdi / m0 ), and Di and CVdi are defined = − Weighting survey data involves not only sampling with the transformed variable d (dij yij Y ) analogously 2 weights but also various weighting adjustments such as to Yi and CVyi , respectively. (Detailed derivations of expressions (4.19) and (4.21) are available from the post-stratification, raking, and nonresponse compensation. = We consider these general cases here. authors.) For the case with mi m0 for all i, the difference in the design effects given in (4.19) and (4.21) reduces to We can rewrite the first-order Taylor approximation to ˆ = ˆ ˆ (3.7) or (4.16). There is no contribution from the second the weighted mean estimator Y Y / M given in (3.2) as ˆ − ≅ ˆ − + ˆ − stage sampling to the difference. (Y Y ) /Y (Y Y ) /Y (M M ) / M. Taking variance Coming back to Thomsen et al. (1986) who studied the on both sides, effect of using an outdated measure of size on the variance, 2 ˆ ≅ 2 ˆ + 2 ˆ CVp (Y ) CVp (Y ) CVp (M ) the above discussion on Yˆ parallels with their discussion. + ˆ ˆ ˆ ˆ (4.22) The only difference is that they assumed a without- 2R p (Y , M ) CVp (Y )CVp (M ), replacement sampling scheme at the second stage. Note, 2 ˆ 2 ˆ 2 ˆ where CVp (Y ),CVp (Y ),CVp (M ) are the relative vari- however, that the definition of τ in Thomsen et al. (1986) is ˆ ˆ ˆ ˆ ˆ ances of Y, Y , and M respectively, and Rp (Y , M ) is the slightly different from (4.20) and from δ in section 4.2. correlation coefficient of Yˆ and Mˆ with respect to the However, there is a close connection between them. To see complex sampling design p and any weighting adjustments. τ this, let us write the as a function of some quantities bi ’s Since the relative variances of simple sample total and mean associated with PSUs as follows: ˆ 2 ˆ = 2 = −1 2 Ysrs and ysrs are CVsrswr (Ysrs ) CVsrswr (ysrs ) m CVy N − 2 − 2 under srswr of size m, it follows from (4.22) that (N 1)S yB ∑ bi S yi = τ(b ) = i 1 . i N Deft 2 (Yˆ) ≅ Deft 2 (Yˆ ) − 2 + − 2 p p (N 1)S yB ∑ (M i 1)S yi i=1 + ˆ ˆ ∇ ˆ + ∇ 2 (4.23) 2R p (Y , M ) p (y)Deft p (Y ) p (y), = Then the τ in Thomsen et al. (1986) is obtained with bi 1, − − ∇ = ˆ the τ in example 4.5 with 1/(m0 1), and δ in section 4.2 where p (y) CVp (M ) / CVsrswr (ysrs ) is nonnegative. − N − − with (M i 1) /{∑i=1 (M i 1) /(N 1)}. Equating Kish’s As an illustration, consider a binary variable y, where ˆ ˆ 2 ≅ − ∇ formula (4.18) for Y to (4.19) for Y, they obviously over- CVy (1 Y ) /Y and, thus, p (y) can be arbitrarily looked that the design effects for Yˆ and Yˆ can be very large as Y approaches 1 or small as Y approaches zero ˆ ≠ ∇ different. assuming CVp (M ) 0. When p (y) is near zero, the For more general cases, Kish (1987) proposed the two design effects are nearly equal. Otherwise, one is larger ˆ ∇ following popular formula for Y : than the other depending on the values of p (y) and ˆ ˆ G Rp (Y , M ). When the sampling weights are benchmarked 2 ˆ n∑ ng wg to the known population size M , Yˆ and Y have the same = Deft 2 (Yˆ ) = g 1 []1 + ρ(m − 1) design effect since Mˆ = M and CV (Mˆ ) = 0. In this case, Kish 2 p ⎛ G ⎞ Yˆ is not affected by the benchmarking but Yˆ = M Yˆ, ⎜ n w ⎟ ⎜ ∑ g g ⎟ which is a ratio estimator. Note that poststratification or ⎝ g =1 ⎠ raking procedures may be used if population size infor- = + 2 [ + ρ − ] (1 cv w ) 1 (m 1) . mation is available at subpopulation level and we also get This was obtained by applying (4.5) (or (4.6)) and (4.18) equivalent design effects. In general, however, we have 2 ˆ ≥ 2 ˆ recursively to incorporate the effects of both clustering and Deft p (Y ) Deft p (Y ) if unequal weights. Gabler, Haeder and Lahiri (1999) justified ∇ (y) ˆ ˆ ˆ ≥ − 1 p the above formula for Y using a superpopulation model R p (Y , M ) or 2 ˆ defined for the cross-classification of N clusters and G Deft p (Y ) weighting classes. However, the difference between the ˆ 1 CVp (M ) ˆ ˆ ˆ ≥ − (4.24) design effects for Y and Yˆ cannot be exposed by such a R p (Y , M ) , 2 ˆ CVp (Y ) model-based approach, since yk is treated as a random 2 ˆ variable while wk as fixed. Under this approach, Deft p (Y ) and vice versa.

Statistics Canada, Catalogue No. 12-001 12 Park and Lee: Design Effects for the Weighted Mean and Total Estimators Under Complex Surveys Sampling

It is illuminating to look at some specific situations. For Nutrition Examination Survey (NHANES III), which is ˆ ˆ ≥ 2 ˆ > 2 ˆ example, if Rp (Y , M ) 0, then Deft p (Y ) Deft p (Y ), given as a demo file in WesVar version 4.0. NHANES III is ˆ ˆ < however, a negative correlation (i.e., Rp (Y , M ) 0 ) a nationwide large-scale medical examination survey based 2 ˆ ≤ 2 ˆ doesn’t necessarily lead to Deft p (Y ) Deft p (Y ). For a on a stratified multistage sampling design, for which the ˆ ˆ = special case of Rp (Y , M ) 0, the difference is given by Fay’s modified balance repeated replication (BRR) method CV 2 (Mˆ ) was employed for variance estimation. (See Judkins 1990 2 ˆ − 2 ˆ ≅ p Deft p (Y ) Deft p (Y ) 2 . (4.25) for more details on Fay’s method.) We used only 19,793 CVsrswr (ysrs ) records with complete responses to those characteristics listed in Table 1. Note that the weight in the demo file is Figure 1 shows graphically the relation between the two different from the NHANES III final weight that was design effects. The expression in (4.23) is plotted for some obtained by poststratification. For more detailed information fixed values of R (Yˆ, Mˆ ) and ∇ (y). The solid line p p on the demo file, see Westat (2001). passing through the origin which represents equal design Table 1 presents the design effects for Yˆ and Yˆ, and effects is the reference line. As the graphs show, the component terms of (4.23) for the selected characteristics. comparison is not clear-cut. When R (Yˆ, Mˆ ) < 0, p Note that ∇ (y) monotonically decreases in CV given Deft2 (Yˆ) ≥ Deft2 (Yˆ ) for small Deft 2 (Yˆ ) but the relation p y p p p that m =19,793 and cv (Mˆ ) = 3.2%. Although ∇ (y) flips over as Deft2 (Yˆ ) grows larger. p p p tends to be the determinant factor in the difference of the Hansen et al. (1953, Vol. I, pages 338 – 339) indicated design effects, R (Yˆ , Mˆ ) can be important when it is that R (Yˆ, Mˆ ) would often be close to 0. Under this p p negative. For example, for two race/ethnicity characteristics, 2 ˆ ≅ situation, expression (4.25) is also written as Deft p (Y ) African American and Hispanic, the negative values, – 0.67 2 ˆ [ + 2 ˆ 2 ˆ ] Deft p (Y ) 1 CVp (M ) / CV p (Y ) , from which we get ˆ ˆ 2 and – 0.24 of R (Y , M ) were responsible for Deft (Yˆ) < 2 ˆ ≥ 2 ˆ p p Deft p (Y ) Deft p (Y ).This special case was studied by 2 ˆ ˆ Deft p (Y ). Some design effects for Y are huge. This is not Jang (2001). However, this doesn’t seem necessary as can the case with the NHANES III poststratified final weights, be seen in the following example. ˆ ˆ with which Y and Y have the same design effect. This Example 4.6 To illustrate the relationship between the illustrates the importance of benchmarking weight design effects for Yˆ and Yˆ, we used a data set for the adjustments for total estimates. adults collected from the U.S. Third National Health and

9 5 2 9 5 2 0 ...... 0 0 0 0 0 0 . 0 0 2 x . 0 = 0 - y x 2 . = y 0 -

.5 5 0 . - -0

.9 0 -

Design effect for total Design effect for total

0.9 2 4 6 8 10 12 2 4 6 8 10 12 -

24681012 24681012 Design effect for mean Design effect for mean ∇ = ∇ = (a) p (y) 1.0 (b) p (y) 2.5

2 ˆ 2 ˆ ∇ = ∇ = Figure 1. Plots of Deft p (Y) versus Deft p (Y ) for (a) p (y) 1. 0 (b) p (y) 2.5. The solid line corresponds to 2 ˆ = 2 ˆ ˆ ˆ = Deft p (Y ) Deft p (Y). Other lines correspond to Rp (Y ,M ) – 0.9, – 0.5, – 0.2, 0, 0.2, 0.5, 0.9, respectively.

Statistics Canada, Catalogue No. 12-001 Survey Methodology, December 2004 13

Table 1 Comparison of the design effects for the weighted total and mean using a subset of the adult data file from the U.S. Third National Health and Nutrition Examination Survey (NHANES III)

Mean Total ˆ cv (Mˆ ) Estimate Deft2 cv Estimate Deft2 cv cv r (Y , Mˆ ) ∇ (y) − p Characteristic y p p ˆ 2cv p (Y ) Has smoked 100+ Yes 0.53 4.13 0.014 98,397,795 31.31 0.038 0.944 0.20 4.83 – 0.58 cigarettes in life?

Has diabetes? Yes 0.05 1.75 0.040 9,783,307 1.92 0.042 4.246 – 0.34 1.07 – 0.31 No 0.95 1.75 0.002 176,341,218 393.47 0.033 0.236 0.34 19.35 – 5.53 Has hypertension/ Yes 0.23 3.42 0.024 42,939,866 7.96 0.037 1.826 – 0.18 2.50 – 0.37 0.77 3.42 0.007 143,184,660 78.44 0.034 0.548 high blood pressure? No 0.18 8.32 – 1.22

AFRICAN 0.12 7.64 0.054 21,567,028 4.21 0.040 2.762 – 0.67 1.65 – 0.11 Race/Ethnicity AMERICAN*

HISPANIC* 0.05 6.70 0.079 9,550,326 6.48 0.078 4.300 – 0.24 1.06 – 0.08

Gender MALE 0.48 1.40 0.009 88,725,967 19.18 0.033 1.048 – 0.11 4.35 – 1.55

FEMALE 0.52 1.40 0.008 97,398,559 25.39 0.034 0.954 0.11 4.77 – 1.70 Number of cigarettes 5.25 6.42 0.037 977,225,826 10.51 0.047 2.044 0.09 2.23 0.17 smoked per day – – –

Population Size – – – – 186,124,526 – 0.032 – – – –

ˆ Note: * denotes the cases where the design effect for Yˆ is smaller than that for Y . 5. Conclusion the study variable and its relations to the sampling design on the statistic. As complex survey software packages routinely We studied the design effects of the two most widely produce the design effect, it seems appropriate to warn the used estimators for the population mean and total in sample user of the packages of these rather obscure facts about the surveys under various with-replacement sampling schemes. design effect. We do not think the employment of with-replacement sampling is necessarily a serious limitation because we can Acknolwedgement see things more clearly without muddling the math with probably unnecessary complications with without-replace- The authors thank Louis Rizzo at Westat, an associate ment sampling schemes. Furthermore, the effect of the finite editor, and two referees for their helpful comments and population correction is largely canceled out in our suggestions on an earlier version of this paper. formulation of the design effect and so the results are quite comparable with traditional design effects for without- References replacement sampling. Therefore, our findings should be nd useful in practice. We summarize our key findings below. Apostol, T.M. (1974). Mathematical Analysis. 2 Ed. Reading, MA: Addison-Wesley. Kish’s well-known approximate formulae for the design effect for (ratio type) weighted mean estimators are not Barron, E.W., and Finch, R.H. (1978). Design Effects in a complex multistage sample: The Survey of Low Income Aged and easily generalized in their form and concepts to more Disabled (SLIAD), Proceedings of the Section on Survey general problems, especially weighted total estimators Research Methods, American Statistical Association, 400-405. ˆ contrary to what many people would perceive. In fact, Y Cochran, W.G. (1977). Sampling Techniques. 3rd Ed. New York: John and Yˆ often have very different design effects unless the Wiley & Sons, Inc. sampling design is self-weighting or the sampling weights Cornfield, J. (1951). Modern methods in the sampling of human are benchmarked to the known population size. In addition, populations. American Journal of Public Health, 41, 654-661. the design effect is in general not free from the distribution Gabler, S., Haeder, S. and Lahiri, P. (1999). A model based of the study variable even for the mean estimator, let alone justification of Kish’s formula for design effects for weighting and clustering. Survey Methodology, 25, 105-106. the total estimator. Furthermore, the correlation of the study variable with the weights used in estimation can be an Hansen, M.H., Hurwitz, W.N. and Madow, W.G. (1953). Sample Survey Methods and Theory. Vol. I, New York: John Wiley & important factor in determining the design effect. Therefore, Sons, Inc. apart from its original intention, the design effect measures Hansen, M.H., Hurwitz, W.N. and Madow, W.G. (1953). Sample not only the effect of a complex sampling design on a Survey Methods and Theory, Vol. II, New York: John Wiley & particular statistic but also the effects of the distribution of Sons, Inc.

Statistics Canada, Catalogue No. 12-001 14 Park and Lee: Design Effects for the Weighted Mean and Total Estimators Under Complex Surveys Sampling

Judkins, D.R. (1990). Fay’s method for variance estimation, Journal Park, I., and Lee, H. (2001). The design effect: do we know all about of Official Statistics, 6, 223-239. it? Proceedings of the Section on Survey Research Methods, American Statistical Association. In CD-ROM. Kish, L. (1965). Survey Sampling. New York: John Wiley & Sons, Inc. Park, I., and Lee, H. (2002). A revisit of design effects under unequal Kish, L. (1987). Weighting in Deft2. The Survey Statistician. June probability sampling. The Survey Statistician, 46, 23-26. 1987. Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. New York: Springer-Verlag. Kish, L. (1992). Weighting for unequal pi . Journal of Official Statistics, 8, 183-200. Spencer, B.D. (2000). An approximate design effect for unequal Kish, L. (1995). Methods for design effects. Journal of Official weighting when measurements may correlate with selection Statistics, 11, 55-77. probabilities. Survey Methodology, 26, 137-138. Jang, D. (2001). On procedures to summarize variances for survey Thomsen, I., Tesfu, D. and Binder, D.A. (1986). Estimation of Design estimates. Proceedings of the Survey Research Methods of the Effects and Intraclass Correlations When Using Outdated American Statistical Association. In CD-ROM. Measures of Size. International Statistical Review, 54, 343-349. Lehtonen, R., and Pahkinen, E.J. (1995). Practical Methods for Westat (2001). WesVar 4.0 User’s Guide. Rockville, MD: Westat, Design and Analysis of Complex Surveys. New York: John Wiley Inc. & Sons, Inc. Yamane, T. (1967). Elementary Sampling Theory. New Jersey: Lohr, S.L. (1999). Sampling: Design and Analysis. Pacific Grove, Prentice-Hall. CA: Brooks/Cole.

Statistics Canada, Catalogue No. 12-001