Stratified Cluster Sampling Under Multiplicative Model for Quantitative Sensitive Question Survey

STRATIFIED CLUSTER SAMPLING UNDER MULTIPLICATIVE MODEL FOR QUANTITATIVE SENSITIVE QUESTION SURVEY Pu Xiangke, Gao Ge, Fan Yubo and Wang Mian SUMMARY Advanced sampling methods and corresponding formulas are total probability formulas. The performance of the sampling seldom applied to quantitative sensitive question survey. This methods and formulas in survey of cheating on exams on Dus- paper develops a series of formulas for parameter estimation hu Lake Campus of Soochow University, China, are provided. in cluster sampling and stratified cluster sampling under mul- The reliability of the survey methods and formulas for quanti- tiplicative model on the basis of classic sampling theories and tative sensitive question survey is found to be high. Introduction sponse error, randomized re- sitive question survey. However, more, formulas for simple ran- sponse models have been devel- this method is vulnerable to dom sampling may also be So-called sensitive questions oped for collecting sensitive sampling error because the ran- wrongly used when a relatively are some private issues such as information after the pioneering domness of the selection may complicated sampling procedure acquired immune deficiency work of Warner (1965). The result in a sample that doesn’t is conducted in sensitive ques- syndrome (AIDS), drug addic- multiplicative model was de- reflect the makeup of the popu- tion survey and evaluation of tion, gambling, prostitution, signed for quantitative sensitive lation and it is cumbersome reliability of sampling methods driving while intoxicated, per- question survey, with advan- when sampling from a large and corresponding formulas sonal income, tax evasion, pre- tages of simple design, small target population. Stratified under randomized response marital sex, venereal diseases, sample size and small sampling sampling is superior to simple models for sensitive question homosexual tendencies, and so error (Wang, 2003; Raghunath random sampling in reducing survey has seldom been set. on. It is difficult to get honest and Georg, 2006). sampling error and cluster sam- In this paper, designs for answers by asking direct ques- Simple random sampling has pling is less costly, but these cluster sampling and stratified tions on sensitive issues in sur- been considered a useful sam- methods are not so popular be- cluster sampling under multipli- vey research (Tourangeau and pling method under multiplica- cause they are not too simple cative model and corresponding Yan, 2007). To reduce the re- tive model for quantitative sen- (Ding and Gao, 2008). Further- formulas for parameter estima- KEYWORDS / Multiplicative Model / Parameter Estimation / Quantitative Sensitive Question / Stratified Cluster Sampling / Received: 09/01/2010. Modified: 09/05/2011. Accepted: 09/07/2011. Pu Xiangke. Ph.D. candidate in Gao Ge. M.Sc. in Biostatistics, P.R.China. e-mail: gaoge@ Wang Mian. M.Sc. in Epidemiol- Epidemiology and Health Sta- Soochow University, China. suda.edu.cn ogy and Health Statistics. So- tistics, Soochow University, Professor, Soochow University, Fan Yubo. Ph.D. candidate in ochow University, China. China. Technologist-in-charge, China. Address: Department of Epidemiology and Health Sta- Institute of Hepatology, Third Health Statistics, School of tistics. Soochow University, People’s Hospital, Changzhou, Radiation Medicine and Public China. China. e-mail: [email protected] Health, Suzhou, 215123, NOV 2012, VOL. 37 Nº 11 0378-1844/12/11/833-05 $ 3.00/0 833 MUESTREO POR CONGLOMERADOS ESTRATIFICADOS BAJO UN MODELO MULTIPLICATIVO PARA PREGUNTAS SENSIBLES EN ENCUESTAS CUANTITATIVAS Pu Xiangke, Gao Ge, Fan Yubo y Wang Mian RESUMEN Los métodos de muestreo avanzados y las correspondientes total. Se ilustra el desempeño de los métodos de muestreo y fórmulas son raramente aplicados en preguntas sensibles de las fórmulas en una encuesta de engaños en exámenes en el encuestas cuantitativas. Este trabajo desarrolla una serie de Campus Dushu Lake de la Universidad de Soochow, China. La fórmulas para parámetros de estimación en muestreo de con- confiabilidad de los métodos de encuesta y las fórmulas para glomerados estratificados bajo un modelo multiplicativo basa- encuestas cuantitativas de preguntas sensibles resulta ser alta. das en teorías clásicas de muestreo y fórmulas de probabilidad AMOSTRAGEM ESTRATIFICADA POR CONGLOMERADOS SOB UM MODELO MULTIPLICATIVO PARA PERGUNTAS SENSÍVEIS EM PESQUISAS QUANTITATIVAS Pu Xiangke, Gao Ge, Fan Yubo e Wang Mian RESUMO Os métodos avançados de amostragem e as corresponden- probabilidade total. Ilustra-se o desempenho dos métodos de tes fórmulas são raramente aplicados em perguntas sensíveis amostragem e as fórmulas em uma pesquisa de enganos em de pesquisas quantitativas. Este trabalho desenvolve uma sé- exames no Campus Dushu Lake da Universidade de Soochow, rie de fórmulas para estimação de parâmetros em amostragem China. A confiabilidade dos métodos de pesquisa e as fórmulas de conglomerados estratificados sob um modelo multiplicativo para pesquisas de perguntas sensitivas quantitativas resultam baseadas em teorias clássicas de amostragem e fórmulas de ser alta. tion are provided. These com- eral clusters (primary units), the ith cluster contains Mi of the subunits in each clus- plex sampling methods have and each cluster is composed subunits. Then, n clusters are ter, and been employed in survey of of secondary units. Some drawn randomly from the cheating on Dushu Lake Cam- clusters are randomly selected population. is the sampling ratio. pus of Soochow University, from the population. Then, Estimation of the population China, and may be applicable the multiplicative model mean and its variance for to a large population. (above) is applied to all the quantitative sensitive question When Mi=M, secondary units of selected survey. Suppose the mean of Methods for Quantitative clusters for the quantitative the response values in the ith Sensitive Question Survey sensitive question survey. (4) cluster is µi, yi denotes the sum of values in the ith clus- Multiplicative model Stratified cluster sampling ter and the population mean is where f=nM / NM=n/N is the under the multiplicative µ. By Cochran (1977) and sampling ratio. In the multiplicative model model Wang and Gao (2006) the es- (Wang, 2003), a randomized timator of the population Calculation of µi and yi. Sup- device is designed to ran- Before cluster sampling, mean can be stated as pose µi denotes the mean of domly generate an integer the population is divided variables with sensitive charac- between 0 and 9. In the de- into different strata by char- ter in the ith cluster, µiz device, ten balls of identical acters. Then, cluster sam- notes the mean of numerical size are respectively labeled pling is applied to each stra- values of the answers in the by the ten integers. By a ran- tum. In each stratum, differ- (1) ith cluster, and µy denotes the domization process, each re- ent clusters (primary units) and when Mi=M, mean of all the random num- spondent takes a ball labeled are also composed of sec- bers in the randomized device. by an integer and multiplies ondary units. And then, the Then, from characteristics of the integer by the numerical multiplicative model is em- means (Wang et al., 2006), value of his response to the ployed to those secondary (2) µ = µ µ , i= 1,2,...,n (5) quantitative sensitive question units of selected clusters for Also following Cochran iz i y so as to get a final result. the quantitative sensitive (1977), the estimator of the vari- µ = µ /µ , i= 1,2,...,n (6) question survey. ance of can be obtained as i iz y Cluster sampling under the multiplicative model on Deduction of Formulas and yi = Miµi, i= 1,2,...,n quantitative sensitive questions Formulas for cluster Formulas for stratified sampling cluster sampling It is convenient to use a (3) cluster sampling method. The Suppose the population is where is the mean Suppose the population is population is divided into sev- divided into N clusters, and composed of L strata, and 834 NOV 2012, VOL. 37 Nº 11 the hth stratum contains Nh tum according to the number Table I clusters (primary units), and of subunits. The mean of times of cheating of students the ith cluster contains Mih As samples of each stratum in 38 classes in last two semesters secondary units. The popula- are independent, from Eq. 11, by twice repeated stratified cluster tion contains N secondary its variance is sampling under multiplicative model units and nh clusters are randomly drawn from the hth (12) Serial Serial stratum. numbers of numbers According to Eqs. 8 and undergraduate µi1 µ'i1 of graduate µi2 µ'i2 Estimation of the population 10, the estimator of V( ) classes classes mean and its variance of the is v( ). And from Eq. 12, 1 0.8085 0.9787 1 1.3990 1.3788 hth stratum. From Eq. 1, the v( ) will be the estimator of 2 1.9222 1.7333 2 2.3308 2.6925 estimator of the population V( ). 3 1.2278 1.4722 3 2.3457 2.2370 mean (µh) of the hth stratum 4 1.5934 1.5177 4 0.8322 1.1111 can be obtained as Calculation of µih and yih 5 1.0505 0.9748 5 0.9660 0.9669 6 0.6333 0.4858 6 0.8133 0.9622 Let µih denote the mean of 7 1.1358 0.9926 7 0.9235 0.8889 h= 1,2,…, L variables with sensitive char- 8 1.0922 1.0118 8 0.7488 0.9220 (7) acter in the ith cluster of the 9 0.9827 0.8494 9 0.5383 0.5333 10 1.0431 1.1066 10 0.8009 0.7824 hth stratum, µiz denote the and from Eq. 3, we can get average value of all the an- 11 1.1376 1.4233 11 0.7565 0.7488 the estimator of the variance swers in the ith cluster of 12 0.7727 0.6010 12 0.7546 0.7593 of : the hth stratum, and µ be 13 1.2691 1.1012 13 0.7565 0.7234 y 14 0.8636 0.8232 14 0.4921 0.4815 15 0.9082 0.9034 15 0.9975 0.9728 16 1.0000 0.8485 16 0.8931 0.8637 17 0.1401 0.0821 17 0.7677 0.6717 18 0.9312 0.9206 18 0.9333 0.8404 where is the 19 0.8360 0.7196 the average value of all the 20 0.4840 0.5284 random numbers in the ran- mean of subunits of each domized device.

Stratified Cluster Sampling Under Multiplicative Model for Quantitative Sensitive Question Survey

Choosing the Sample

Computing Effect Sizes for Clustered Randomized Trials

Evaluating Probability Sampling Strategies for Estimating Redd Counts: an Example with Chinook Salmon (Oncorhynchus Tshawytscha)

Cluster Sampling

Introduction to Survey Sampling and Analysis Procedures (Chapter)

Sampling and Household Listing Manual [DHSM4]

Taylor's Power Law and Fixed-Precision Sampling

METHOD GUIDE 3 Survey Sampling and Administration

Use of Sampling in the Census Technical Session 5.2

How Big of a Problem Is Analytic Error in Secondary Analyses of Survey Data?

Guidelines for Evaluating Prevalence Studies

Accounting for Cluster Sampling in Constructing Enumerative Sequential Sampling Plans