Computing Effect Sizes for Clustered Randomized Trials
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Choosing the Sample
CHAPTER IV CHOOSING THE SAMPLE This chapter is written for survey coordinators and technical resource persons. It will enable you to: U Understand the basic concepts of sampling. U Calculate the required sample size for national and subnational estimates. U Determine the number of clusters to be used. U Choose a sampling scheme. UNDERSTANDING THE BASIC CONCEPTS OF SAMPLING In the context of multiple-indicator surveys, sampling is a process for selecting respondents from a population. In our case, the respondents will usually be the mothers, or caretakers, of children in each household visited,1 who will answer all of the questions in the Child Health modules. The Water and Sanitation and Salt Iodization modules refer to the whole household and may be answered by any adult. Questions in these modules are asked even where there are no children under the age of 15 years. In principle, our survey could cover all households in the population. If all mothers being interviewed could provide perfect answers, we could measure all indicators with complete accuracy. However, interviewing all mothers would be time-consuming, expensive and wasteful. It is therefore necessary to interview a sample of these women to obtain estimates of the actual indicators. The difference between the estimate and the actual indicator is called sampling error. Sampling errors are caused by the fact that a sample&and not the entire population&is surveyed. Sampling error can be minimized by taking certain precautions: 3 Choose your sample of respondents in an unbiased way. 3 Select a large enough sample for your estimates to be precise. -
A Primer for Using and Understanding Weights with National Datasets
The Journal of Experimental Education, 2005, 73(3), 221–248 A Primer for Using and Understanding Weights With National Datasets DEBBIE L. HAHS-VAUGHN University of Central Florida ABSTRACT. Using data from the National Study of Postsecondary Faculty and the Early Childhood Longitudinal Study—Kindergarten Class of 1998–99, the author provides guidelines for incorporating weights and design effects in single-level analy- sis using Windows-based SPSS and AM software. Examples of analyses that do and do not employ weights and design effects are also provided to illuminate the differ- ential results of key parameter estimates and standard errors using varying degrees of using or not using the weighting and design effect continuum. The author gives recommendations on the most appropriate weighting options, with specific reference to employing a strategy to accommodate both oversampled groups and cluster sam- pling (i.e., using weights and design effects) that leads to the most accurate parame- ter estimates and the decreased potential of committing a Type I error. However, using a design effect adjusted weight in SPSS may produce underestimated standard errors when compared with accurate estimates produced by specialized software such as AM. Key words: AM, design effects, design effects adjusted weights, normalized weights, raw weights, unadjusted weights, SPSS LARGE COMPLEX DATASETS are available to researchers through federal agencies such as the National Center for Education Statistics (NCES) and the National Science Foundation (NSF), to -
Design Effects in School Surveys
Joint Statistical Meetings - Section on Survey Research Methods DESIGN EFFECTS IN SCHOOL SURVEYS William H. Robb and Ronaldo Iachan ORC Macro, Burlington, Vermont 05401 KEY WORDS: design effects, generalized variance nutrition, school violence, personal safety and physical functions, school surveys, weighting effects activity. The YRBS is designed to provide estimates for the population of ninth through twelfth graders in public and School surveys typically involve complex private schools, by gender, by age or grade, and by grade and multistage stratified sampling designs. In addition, survey gender for all youths, for African-American youths, and for objectives often require over-sampling of minority groups, Hispanic youths. The YRBS is conducted by ORC Macro and the reporting of hundreds of estimates computed for under contract to CDC-DASH on a two year cycle. many different subgroups of interest. Another feature of For the cycle of the YRBS examined in this paper, 57 these surveys that limits the ability to control the precision primary sampling units (PSUs) were selected from within 16 for the overall population and key subgroups simultaneously strata. Within PSUs, at least three schools per cluster is that intact classes are selected as ultimate clusters; thus, spanning grades 9 through 12 were drawn. Fifteen additional the designer cannot determine the ethnicity of individual schools were selected in a subsample of PSUs to represent students nor the ethnic composition of classes. very small schools. In addition, schools that did not span the ORC Macro statisticians have developed the design entire grade range 9 through 12 were combined prior to and weighting procedures for numerous school surveys at the sampling so that every sampling unit spanned the grade national and state levels. -
Evaluating Probability Sampling Strategies for Estimating Redd Counts: an Example with Chinook Salmon (Oncorhynchus Tshawytscha)
1814 Evaluating probability sampling strategies for estimating redd counts: an example with Chinook salmon (Oncorhynchus tshawytscha) Jean-Yves Courbois, Stephen L. Katz, Daniel J. Isaak, E. Ashley Steel, Russell F. Thurow, A. Michelle Wargo Rub, Tony Olsen, and Chris E. Jordan Abstract: Precise, unbiased estimates of population size are an essential tool for fisheries management. For a wide variety of salmonid fishes, redd counts from a sample of reaches are commonly used to monitor annual trends in abundance. Using a 9-year time series of georeferenced censuses of Chinook salmon (Oncorhynchus tshawytscha) redds from central Idaho, USA, we evaluated a wide range of common sampling strategies for estimating the total abundance of redds. We evaluated two sampling-unit sizes (200 and 1000 m reaches), three sample proportions (0.05, 0.10, and 0.29), and six sampling strat- egies (index sampling, simple random sampling, systematic sampling, stratified sampling, adaptive cluster sampling, and a spatially balanced design). We evaluated the strategies based on their accuracy (confidence interval coverage), precision (relative standard error), and cost (based on travel time). Accuracy increased with increasing number of redds, increasing sample size, and smaller sampling units. The total number of redds in the watershed and budgetary constraints influenced which strategies were most precise and effective. For years with very few redds (<0.15 reddsÁkm–1), a stratified sampling strategy and inexpensive strategies were most efficient, whereas for years with more redds (0.15–2.9 reddsÁkm–1), either of two more expensive systematic strategies were most precise. Re´sume´ : La gestion des peˆches requiert comme outils essentiels des estimations pre´cises et non fausse´es de la taille des populations. -
Cluster Sampling
Day 5 sampling - clustering SAMPLE POPULATION SAMPLING: IS ESTIMATING THE CHARACTERISTICS OF THE WHOLE POPULATION USING INFORMATION COLLECTED FROM A SAMPLE GROUP. The sampling process comprises several stages: •Defining the population of concern •Specifying a sampling frame, a set of items or events possible to measure •Specifying a sampling method for selecting items or events from the frame •Determining the sample size •Implementing the sampling plan •Sampling and data collecting 2 Simple random sampling 3 In a simple random sample (SRS) of a given size, all such subsets of the frame are given an equal probability. In particular, the variance between individual results within the sample is a good indicator of variance in the overall population, which makes it relatively easy to estimate the accuracy of results. SRS can be vulnerable to sampling error because the randomness of the selection may result in a sample that doesn't reflect the makeup of the population. Systematic sampling 4 Systematic sampling (also known as interval sampling) relies on arranging the study population according to some ordering scheme and then selecting elements at regular intervals through that ordered list Systematic sampling involves a random start and then proceeds with the selection of every kth element from then onwards. In this case, k=(population size/sample size). It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the kth element in the list. STRATIFIED SAMPLING 5 WHEN THE POPULATION EMBRACES A NUMBER OF DISTINCT CATEGORIES, THE FRAME CAN BE ORGANIZED BY THESE CATEGORIES INTO SEPARATE "STRATA." EACH STRATUM IS THEN SAMPLED AS AN INDEPENDENT SUB-POPULATION, OUT OF WHICH INDIVIDUAL ELEMENTS CAN BE RANDOMLY SELECTED Cluster sampling Sometimes it is more cost-effective to select respondents in groups ('clusters') Quota sampling Minimax sampling Accidental sampling Voluntary Sampling …. -
IBM SPSS Complex Samples Business Analytics
IBM Software IBM SPSS Complex Samples Business Analytics IBM SPSS Complex Samples Correctly compute complex samples statistics When you conduct sample surveys, use a statistics package dedicated to Highlights producing correct estimates for complex sample data. IBM® SPSS® Complex Samples provides specialized statistics that enable you to • Increase the precision of your sample or correctly and easily compute statistics and their standard errors from ensure a representative sample with stratified sampling. complex sample designs. You can apply it to: • Select groups of sampling units with • Survey research – Obtain descriptive and inferential statistics for clustered sampling. survey data. • Select an initial sample, then create • Market research – Analyze customer satisfaction data. a second-stage sample with multistage • Health research – Analyze large public-use datasets on public health sampling. topics such as health and nutrition or alcohol use and traffic fatalities. • Social science – Conduct secondary research on public survey datasets. • Public opinion research – Characterize attitudes on policy issues. SPSS Complex Samples provides you with everything you need for working with complex samples. It includes: • An intuitive Sampling Wizard that guides you step by step through the process of designing a scheme and drawing a sample. • An easy-to-use Analysis Preparation Wizard to help prepare public-use datasets that have been sampled, such as the National Health Inventory Survey data from the Centers for Disease Control and Prevention -
Introduction to Survey Sampling and Analysis Procedures (Chapter)
SAS/STAT® 9.3 User’s Guide Introduction to Survey Sampling and Analysis Procedures (Chapter) SAS® Documentation This document is an individual chapter from SAS/STAT® 9.3 User’s Guide. The correct bibliographic citation for the complete manual is as follows: SAS Institute Inc. 2011. SAS/STAT® 9.3 User’s Guide. Cary, NC: SAS Institute Inc. Copyright © 2011, SAS Institute Inc., Cary, NC, USA All rights reserved. Produced in the United States of America. For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication. The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated. U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987). SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513. 1st electronic book, July 2011 SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228. -
Sampling and Household Listing Manual [DHSM4]
SAMPLING AND HOUSEHOLD LISTING MANuaL Demographic and Health Surveys Methodology This document is part of the Demographic and Health Survey’s DHS Toolkit of methodology for the MEASURE DHS Phase III project, implemented from 2008-2013. This publication was produced for review by the United States Agency for International Development (USAID). It was prepared by MEASURE DHS/ICF International. [THIS PAGE IS INTENTIONALLY BLANK] Demographic and Health Survey Sampling and Household Listing Manual ICF International Calverton, Maryland USA September 2012 MEASURE DHS is a five-year project to assist institutions in collecting and analyzing data needed to plan, monitor, and evaluate population, health, and nutrition programs. MEASURE DHS is funded by the U.S. Agency for International Development (USAID). The project is implemented by ICF International in Calverton, Maryland, in partnership with the Johns Hopkins Bloomberg School of Public Health/Center for Communication Programs, the Program for Appropriate Technology in Health (PATH), Futures Institute, Camris International, and Blue Raster. The main objectives of the MEASURE DHS program are to: 1) provide improved information through appropriate data collection, analysis, and evaluation; 2) improve coordination and partnerships in data collection at the international and country levels; 3) increase host-country institutionalization of data collection capacity; 4) improve data collection and analysis tools and methodologies; and 5) improve the dissemination and utilization of data. For information about the Demographic and Health Surveys (DHS) program, write to DHS, ICF International, 11785 Beltsville Drive, Suite 300, Calverton, MD 20705, U.S.A. (Telephone: 301-572- 0200; fax: 301-572-0999; e-mail: [email protected]; Internet: http://www.measuredhs.com). -
Taylor's Power Law and Fixed-Precision Sampling
87 ARTICLE Taylor’s power law and fixed-precision sampling: application to abundance of fish sampled by gillnets in an African lake Meng Xu, Jeppe Kolding, and Joel E. Cohen Abstract: Taylor’s power law (TPL) describes the variance of population abundance as a power-law function of the mean abundance for a single or a group of species. Using consistently sampled long-term (1958–2001) multimesh capture data of Lake Kariba in Africa, we showed that TPL robustly described the relationship between the temporal mean and the temporal variance of the captured fish assemblage abundance (regardless of species), separately when abundance was measured by numbers of individuals and by aggregate weight. The strong correlation between the mean of abundance and the variance of abundance was not altered after adding other abiotic or biotic variables into the TPL model. We analytically connected the parameters of TPL when abundance was measured separately by the aggregate weight and by the aggregate number, using a weight–number scaling relationship. We utilized TPL to find the number of samples required for fixed-precision sampling and compared the number of samples when sampling was performed with a single gillnet mesh size and with multiple mesh sizes. These results facilitate optimizing the sampling design to estimate fish assemblage abundance with specified precision, as needed in stock management and conservation. Résumé : La loi de puissance de Taylor (LPT) décrit la variance de l’abondance d’une population comme étant une fonction de puissance de l’abondance moyenne pour une espèce ou un groupe d’espèces. En utilisant des données de capture sur une longue période (1958–2001) obtenues de manière cohérente avec des filets de mailles variées dans le lac Kariba en Afrique, nous avons démontré que la LPT décrit de manière robuste la relation entre la moyenne temporelle et la variance temporelle de l’abondance de l’assemblage de poissons capturés (peu importe l’espèce), que l’abondance soit mesurée sur la base du nombre d’individus ou de la masse cumulative. -
The Design Effect: Do We Know All About It?
Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001 THE DESIGN EFFECT: DO WE KNOW ALL ABOUT IT? Inho Park and Hyunshik Lee, Westat Inho Park, Westat, 1650 Research Boulevard, Rockville, Maryland 20850 Key Words: Complex sample, Simple random where y is an estimate of the population mean (Y ) sample, pps sampling, Point and variance under a complex design with the sample size of n, f is a estimation, Ratio mean, Total estimator 2 = ()− −1∑N − 2 sampling fraction, and S y N 1 i=1(yi Y ) is 1. Introduction the population element variance of the y-variable. In general, the design effect can be defined for any The design effect is widely used in survey meaningful statistic computed from the sample selected sampling for planning a sample design and to report the from the complex sample design. effect of the sample design in estimation and analysis. It The Deff is a population quantity that depends on is defined as the ratio of the variance of an estimator the sample design and the statistic. The same parameter under a sample design to that of the estimator under can be estimated by different estimators and their simple random sampling. An estimated design effect is Deff’s are different even under the same design. routinely produced by sample survey software such as Therefore, the Deff includes not only the efficiency of WesVar and SUDAAN. the design but also the efficiency of the estimator. This paper examines the relationship between the Särndal et al. (1992, p. 54) made this point clear by design effects of the total estimator and the ratio mean defining it as a function of the design (p) and the estimator, under unequal probability sampling. -
Design Effect
2005 YOUTH RISK BEHAVIOR SURVEY Design Effect The results from surveys that use samples such as the Youth Risk Behavior Survey (YRBS) are affected by two types of errors: non-sampling error and sampling error. Nonsampling error usually is caused by procedural mistakes or random or purposeful respondent errors. For example, nonsampling errors would include surveying the wrong school or class, data entry errors, and poor question wording. While numerous measures were taken to minimize nonsampling error for the Youth Risk Behavior Survey (YRBS), nonsampling errors are impossible to avoid entirely and difficult to evaluate or measure statistically. In contrast, sampling error can be described and evaluated. The sample of students selected for your YRBS is only one of many samples of the same size that could have been drawn from the population using the same design. Each sample would have yielded slightly different results had it actually been selected. This variation in results is called sampling error and it can be estimated using data from the sample that was chosen for your YRBS. A measure of sampling error that is often reported is the "standard error" of a particular statistic (mean, percentage, etc.), which is the square root of the variance of the statistic across all possible samples of equal size and design. The standard error is used to calculate confidence intervals within which one can be reasonably sure the true value of the variable for the whole population falls. For example, for any given statistic, the value of that statistic as measured in 95 percent of all independently drawn samples of equal size and design would fall within a range of plus or minus 1.96 times the standard error of that statistic. -
Estimation of Design Effects for ESS Round II
Estimation of Design Effects for ESS Round II Documentation Matthias Ganninger Tel: +49 (0)621-1246-189 E-Mail: [email protected] November 30, 2006 1 Introduction For the calculation of the effective sample size, neff = nnet/deff, as well as for the determination of the required net sample size in round III, nnet-req(III) = n0 × deff, the design effect is calculated based on ESS round II data. The minimum required effective sample size, n0, is 800 for countries with less than 2 Mio. inhabitants aged 15 and over and 1 500 for all other countries. In the above formulas, nnet denotes the net sample size and deff is a com- bination of two separate design effects: firstly, the design effect due to un- equal selection probabilities, deffp, and, secondly, the design effect due to clustering, deffc. These two quantities, multiplied with each other, then, give the overall design effect according to Gabler et al. (1999): deff = deffp × deffc. This document reports in detail how these quantities are calculated. The remaining part of this section gives an overview of the sampling schemes in the participating countries. In section 2, a description of the preliminary work, foregoing the actual calculation, is given. Then, in section 3, intro- duces the calculation of the design effect due to clustering (deffc) and due to unequal selection probabilities (deffp). The last section closes with some remarks on necessary adjustments of the ESS data to fit the models. The design effects based on ESS round II data are compared to those from round I in section 4.