A Design Effect Measure for Calibration Weighting in Cluster Samples

JSM 2014 - Survey Research Methods Section A Design Effect Measure for Calibration Weighting in Cluster Samples Kimberly Henry1 and Richard Valliant2 1Statistics of Income, Internal Revenue Service 77 K Street, NE, Washington, DC 20002 2 Universities of Michigan & Maryland, College Park, MD 20854. Abstract We propose a model-based extension of weighting design-effect measures for two-stage sampling when calibration weighting is used. Our proposed design effect measure captures the joint effects of a non- epsem sampling design, unequal weights produced using calibration adjustments, and the strength of the association between an analysis variable and the auxiliaries used in calibration. The proposed measure is compared to existing design effect measures in an example involving household-type data. Key words: Auxiliary data; Kish weighting design effect; Spencer design effect; generalized regression estimator 1. Introduction The most popular measure for gauging the effect of differential weighting on the precision of an estimator is Kish’s (1965, 1992) design-based design effect. Gabler et al. (1999) showed that, for cluster sampling, this estimator is a special form of a design effect produced using variances from random effects models, with and without intra-class correlations. Spencer (2000) proposed a simple model-based approach that depends on a single covariate to estimate the impact on variance of using variable weights. However, these approaches do not provide a summary measure of the impact of the gains in precision that may accrue from sampling with varying probabilities and using a calibration estimator like the general regression (GREG) estimator. While the Kish design effects attempt to measure the impact of variable weights, as noted in Kish (1992), they are informative only under special circumstances, do not account for alternative variables of interest, and can incorrectly measure the impact of differential weighting in some circumstances. Spencer’s approach holds for with-replacement single-stage sampling for a very simple estimator of the total constructed with inverse-probability weights with no further adjustments. There are also few empirical examples comparing these measures in the literature. Henry and Valliant (2013) extended this to gauge the impact of differential calibration weights in single-stage samples. In particular, the existing measures, reviewed in Section 2, may not accurately produce design effects for unequal weighting induced by calibration adjustments. These are often applied to reduce variances and correct for undercoverage and/or nonresponse in surveys (e.g., Särndal and Lundström 2005; Kott 2009). When the calibration covariates are correlated with the coverage/response mechanism, calibration weights can improve the mean squared error (MSE) of an estimator. In many applications, since calibration involves element-level adjustments, calibration weights can vary more than the base weights or category- based nonresponse or poststratification adjustments (Kalton and Cervantes-Flores 2003; Brick and Montaquila 2009). Thus, an ideal measure of the impact of calibration weights incorporates not only the correlation between the survey variable of interest y and the weights, but also the correlation between y and the calibration covariates x to avoid “penalizing” weights for the mere sake that they vary. 1 1564 JSM 2014 - Survey Research Methods Section We extend these existing design effects to produce a new measure that summarizes the impact of calibration weight adjustments before and after they are applied to two-stage cluster-sample survey weights. The proposed measure in Section 3 accounts for the joint effect of a non-epsem sample design and unequal weight adjustments in the larger class of calibration estimators. Our summary measure incorporates the survey variable like Spencer’s model, but also uses a generalized regression variance to reflect multiple calibration covariates and the cluster sample design. In section 4, we apply the estimators in a case study involving household-type survey data and demonstrate empirically how the proposed estimator outperforms the existing methods in the presence of unequal calibration weights. 2. Existing Methods In this section, we specify notation and summarize the existing design effect measures for cluster sampling. The assumptions used to derive each of these are also presented. 2.1. Notation We consider that a finite population of M elements is partitioned into N clusters, with sizes Mi , and denoted by Uiji , : 1, , NjM , 1, , i . On element ij, , an analysis variable yij is observed. NM The population total of the y ’s is Ty i and the population variance is ij11ij NM N 21M i yY, where M M and YTM . We select a sample of n clusters yijij11 i1 i and mi elements within cluster i , using two-stage sampling from U and obtain a set of n s ij, : i 1, , nj , 1, , m respondents. The total sample size of elements is mm . i i1 i In the following sections, when a two-stage design is considered, we assume that clusters are selected with replacement. The selection probability of cluster i on a single draw is pi . Within cluster i , the sample of mi elements is selected via simple random sampling without replacement (srswor). Assuming that we have probability-with-replacement (pwr) sampling of clusters, the probability of selection for n clusters is approximately iii11 p np (if pi is not too large), where pi is the one-draw selection probability. The second-stage selection probability is ji mMii for element j in cluster i . Then the overall selection probability is approximately ij iji np i m i M i . We estimate the total nm1 M nm from the sample using with the pwr-estimator Tywyˆ iii , pwr, yij11 ij ij 11 ij ij npii m where wMnpmij i i i . 2 1565 JSM 2014 - Survey Research Methods Section 2.2. GREG Weight Adjustments Case weights resulting from calibration on benchmark auxiliary variables can be defined with a global regression model for the survey variables (see Kott 2009 for a review). Deville and Särndal (1992) proposed the calibration approach that involves minimizing a distance function between the base weights and final weights to obtain an optimal set of survey weights. Specifying alternative calibration distance functions produces alternative estimators. Suppose that a single-stage probability sample of m elements is selected with i being the selection probability of element i and xi a vector of p auxiliaries associated with element i. A least squares distance function produces the general regression estimator (GREG): TTˆˆˆˆBTT T gy , (1) GREG HTy x HTx is i i i where Tyˆ is the Horvitz-Thompson estimator of the population total of y, HTyis i i N Txˆ is the vector of HT estimated totals for the auxiliary variables, Tx is the HTxis i i x i1 i ˆ 111T corresponding vector of known totals, BAXV s ssssΠ y s is the regression coefficient, with T 11 T AXVs ssssΠ X s, Xs is the matrix of xi values in the sample, Vssi diag v is the diagonal of the variance matrix specified under the working model (defined below), and Πs diag i . In the second ˆ T 11 expression for the GREG estimator in (1), gixHTxsii1 TT Axv is the “g-weight,” such that the case weights are wgiii for each sample element i . T The GREG estimator for a total is model-unbiased under the associated working model, yiix β i, ii~0, v . The GREG is consistent and approximately design-unbiased when the sample size is large (Deville and Särndal 1992). When the model is correct, the GREG estimator achieves efficiency gains. If the model is incorrect, then the efficiency gains will be dampened (or nonexistent) but the GREG estimator is still approximately design-unbiased. Relevant to this work, the variance of the GREG estimator can be used to approximate the variance of any calibration estimator (Deville and Särndal 1992) when the sample size is large. This result holds for multistage sampling and allows us to produce one design effect measure applicable to all estimators in the family of calibration estimators. 2.3. Direct Design-Effect Measures For a given non-epsem sample and estimator Tˆ for the finite population total T , one definition for the direct design effect (Kish 1965) is ˆˆ ˆ Deff T Var T Varsrswr T srswr . (2) where TMymˆ is the expansion estimator under simple random sampling with srswrks k ˆ 22 replacement (srswr) and Var Tsrswr M y m . We refer to this as a “direct” estimator because it uses theoretical variances in the numerator and denominator. The alternatives that are presented subsequently use various approximations to the components in (2). The design effect in (2) measures the size of the 3 1566 JSM 2014 - Survey Research Methods Section variance of the estimator Tˆ under the design , relative to the variance of the estimator of the same total if an srswr of the same size had been used. ˆ We can approximate the variance of any calibration estimator Tcal using the approximate variance of the GREG (Deville and Särndal 1992), in which case ˆˆˆ Deff Tcal Var GREG T cal Var srswr T srswr . (3) As described in Sec. 3, our proposed design effect is a model-based approximation to (3). 2.4. Kish’s “Haphazard-Sampling” Design-Effect for Unequal Single-stage Sample Weights Kish (1965, 1990) proposed the “design effect due to weighting” as a measure to quantify the loss of T precision due to using unequal and inefficient weights. For w ww1,, m denoting the weights from a simple random sample without replacement (srswor) of m elements, this measure is 2 deffK ww1 CV 2 , (4) mw2 w isii is 2 where CVw m12 w w w is the coefficient of variation of the weights, and is i wm 1 w. Expression (4) is derived from the ratio of the variance of the weighted survey mean is i under disproportionate stratified srswor to the variance under proportionate stratified srswor when all stratum element variances are equal (Kish 1992).

A Design Effect Measure for Calibration Weighting in Cluster Samples

A Primer for Using and Understanding Weights with National Datasets

Design Effects in School Surveys

Computing Effect Sizes for Clustered Randomized Trials

IBM SPSS Complex Samples Business Analytics

The Design Effect: Do We Know All About It?

Design Effect

Estimation of Design Effects for ESS Round II

Selection Bias in Web Surveys and the Use of Propensity Scores

Version 4.1 Effect Size and Standard Error Formulas

UC Berkeley UC Berkeley Electronic Theses and Dissertations

Example of the Impact of Weights and Design Effects on Contingency Tables and Chi-Square Analysis

Sample Size Calculation and Development of Sampling Plan