Chapter 5 Random Sampling and Sampling Distributions Table of Contents
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Hypothesis Testing with Two Categorical Variables 203
Chapter 10 distribute or Hypothesis Testing With Two Categoricalpost, Variables Chi-Square copy, Learning Objectives • Identify the correct types of variables for use with a chi-square test of independence. • Explainnot the difference between parametric and nonparametric statistics. • Conduct a five-step hypothesis test for a contingency table of any size. • Explain what statistical significance means and how it differs from practical significance. • Identify the correct measure of association for use with a particular chi-square test, Doand interpret those measures. • Use SPSS to produce crosstabs tables, chi-square tests, and measures of association. 202 Part 3 Hypothesis Testing Copyright ©2016 by SAGE Publications, Inc. This work may not be reproduced or distributed in any form or by any means without express written permission of the publisher. he chi-square test of independence is used when the independent variable (IV) and dependent variable (DV) are both categorical (nominal or ordinal). The chi-square test is member of the family of nonparametric statistics, which are statistical Tanalyses used when sampling distributions cannot be assumed to be normally distributed, which is often the result of the DV being categorical rather than continuous (we will talk in detail about this). Chi-square thus sits in contrast to parametric statistics, which are used when DVs are continuous and sampling distributions are safely assumed to be normal. The t test, analysis of variance, and correlation are all parametric. Before going into the theory and math behind the chi-square statistic, read the Research Examples for illustrations of the types of situations in which a criminal justice or criminology researcher would utilize a chi- square test. -
Esomar/Grbn Guideline for Online Sample Quality
ESOMAR/GRBN GUIDELINE FOR ONLINE SAMPLE QUALITY ESOMAR GRBN ONLINE SAMPLE QUALITY GUIDELINE ESOMAR, the World Association for Social, Opinion and Market Research, is the essential organisation for encouraging, advancing and elevating market research: www.esomar.org. GRBN, the Global Research Business Network, connects 38 research associations and over 3500 research businesses on five continents: www.grbn.org. © 2015 ESOMAR and GRBN. Issued February 2015. This Guideline is drafted in English and the English text is the definitive version. The text may be copied, distributed and transmitted under the condition that appropriate attribution is made and the following notice is included “© 2015 ESOMAR and GRBN”. 2 ESOMAR GRBN ONLINE SAMPLE QUALITY GUIDELINE CONTENTS 1 INTRODUCTION AND SCOPE ................................................................................................... 4 2 DEFINITIONS .............................................................................................................................. 4 3 KEY REQUIREMENTS ................................................................................................................ 6 3.1 The claimed identity of each research participant should be validated. .................................................. 6 3.2 Providers must ensure that no research participant completes the same survey more than once ......... 8 3.3 Research participant engagement should be measured and reported on ............................................... 9 3.4 The identity and personal -
Lecture 1: Why Do We Use Statistics, Populations, Samples, Variables, Why Do We Use Statistics?
1pops_samples.pdf Michael Hallstone, Ph.D. [email protected] Lecture 1: Why do we use statistics, populations, samples, variables, why do we use statistics? • interested in understanding the social world • we want to study a portion of it and say something about it • ex: drug users, homeless, voters, UH students Populations and Samples Populations, Sampling Elements, Frames, and Units A researcher defines a group, “list,” or pool of cases that she wishes to study. This is a population. Another definition: population = complete collection of measurements, objects or individuals under study. 1 of 11 sample = a portion or subset taken from population funny circle diagram so we take a sample and infer to population Why? feasibility – all MD’s in world , cost, time, and stay tuned for the central limits theorem...the most important lecture of this course. Visualizing Samples (taken from) Populations Population Group you wish to study (Mostly made up of “people” in the Then we infer from sample back social sciences) to population (ALWAYS SOME ERROR! “sampling error” Sample (a portion or subset of the population) 4 This population is made up of the things she wishes to actually study called sampling elements. Sampling elements can be people, organizations, schools, whales, molecules, and articles in the popular press, etc. The sampling element is your exact unit of analysis. For crime researchers studying car thieves, the sampling element would probably be individual car thieves – or theft incidents reported to the police. For drug researchers the sampling elements would be most likely be individual drug users. Inferential statistics is truly the basis of much of our scientific evidence. -
Assessment of Socio-Demographic Sample Composition in ESS Round 61
Assessment of socio-demographic sample composition in ESS Round 61 Achim Koch GESIS – Leibniz Institute for the Social Sciences, Mannheim/Germany, June 2016 Contents 1. Introduction 2 2. Assessing socio-demographic sample composition with external benchmark data 3 3. The European Union Labour Force Survey 3 4. Data and variables 6 5. Description of ESS-LFS differences 8 6. A summary measure of ESS-LFS differences 17 7. Comparison of results for ESS 6 with results for ESS 5 19 8. Correlates of ESS-LFS differences 23 9. Summary and conclusions 27 References 1 The CST of the ESS requests that the following citation for this document should be used: Koch, A. (2016). Assessment of socio-demographic sample composition in ESS Round 6. Mannheim: European Social Survey, GESIS. 1. Introduction The European Social Survey (ESS) is an academically driven cross-national survey that has been conducted every two years across Europe since 2002. The ESS aims to produce high- quality data on social structure, attitudes, values and behaviour patterns in Europe. Much emphasis is placed on the standardisation of survey methods and procedures across countries and over time. Each country implementing the ESS has to follow detailed requirements that are laid down in the “Specifications for participating countries”. These standards cover the whole survey life cycle. They refer to sampling, questionnaire translation, data collection and data preparation and delivery. As regards sampling, for instance, the ESS requires that only strict probability samples should be used; quota sampling and substitution are not allowed. Each country is required to achieve an effective sample size of 1,500 completed interviews, taking into account potential design effects due to the clustering of the sample and/or the variation in inclusion probabilities. -
Sampling Methods It’S Impractical to Poll an Entire Population—Say, All 145 Million Registered Voters in the United States
Sampling Methods It’s impractical to poll an entire population—say, all 145 million registered voters in the United States. That is why pollsters select a sample of individuals that represents the whole population. Understanding how respondents come to be selected to be in a poll is a big step toward determining how well their views and opinions mirror those of the voting population. To sample individuals, polling organizations can choose from a wide variety of options. Pollsters generally divide them into two types: those that are based on probability sampling methods and those based on non-probability sampling techniques. For more than five decades probability sampling was the standard method for polls. But in recent years, as fewer people respond to polls and the costs of polls have gone up, researchers have turned to non-probability based sampling methods. For example, they may collect data on-line from volunteers who have joined an Internet panel. In a number of instances, these non-probability samples have produced results that were comparable or, in some cases, more accurate in predicting election outcomes than probability-based surveys. Now, more than ever, journalists and the public need to understand the strengths and weaknesses of both sampling techniques to effectively evaluate the quality of a survey, particularly election polls. Probability and Non-probability Samples In a probability sample, all persons in the target population have a change of being selected for the survey sample and we know what that chance is. For example, in a telephone survey based on random digit dialing (RDD) sampling, researchers know the chance or probability that a particular telephone number will be selected. -
Summary of Human Subjects Protection Issues Related to Large Sample Surveys
Summary of Human Subjects Protection Issues Related to Large Sample Surveys U.S. Department of Justice Bureau of Justice Statistics Joan E. Sieber June 2001, NCJ 187692 U.S. Department of Justice Office of Justice Programs John Ashcroft Attorney General Bureau of Justice Statistics Lawrence A. Greenfeld Acting Director Report of work performed under a BJS purchase order to Joan E. Sieber, Department of Psychology, California State University at Hayward, Hayward, California 94542, (510) 538-5424, e-mail [email protected]. The author acknowledges the assistance of Caroline Wolf Harlow, BJS Statistician and project monitor. Ellen Goldberg edited the document. Contents of this report do not necessarily reflect the views or policies of the Bureau of Justice Statistics or the Department of Justice. This report and others from the Bureau of Justice Statistics are available through the Internet — http://www.ojp.usdoj.gov/bjs Table of Contents 1. Introduction 2 Limitations of the Common Rule with respect to survey research 2 2. Risks and benefits of participation in sample surveys 5 Standard risk issues, researcher responses, and IRB requirements 5 Long-term consequences 6 Background issues 6 3. Procedures to protect privacy and maintain confidentiality 9 Standard issues and problems 9 Confidentiality assurances and their consequences 21 Emerging issues of privacy and confidentiality 22 4. Other procedures for minimizing risks and promoting benefits 23 Identifying and minimizing risks 23 Identifying and maximizing possible benefits 26 5. Procedures for responding to requests for help or assistance 28 Standard procedures 28 Background considerations 28 A specific recommendation: An experiment within the survey 32 6. -
Chapter 8 Fundamental Sampling Distributions And
CHAPTER 8 FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTIONS 8.1 Random Sampling pling procedure, it is desirable to choose a random sample in the sense that the observations are made The basic idea of the statistical inference is that we independently and at random. are allowed to draw inferences or conclusions about a Random Sample population based on the statistics computed from the sample data so that we could infer something about Let X1;X2;:::;Xn be n independent random variables, the parameters and obtain more information about the each having the same probability distribution f (x). population. Thus we must make sure that the samples Define X1;X2;:::;Xn to be a random sample of size must be good representatives of the population and n from the population f (x) and write its joint proba- pay attention on the sampling bias and variability to bility distribution as ensure the validity of statistical inference. f (x1;x2;:::;xn) = f (x1) f (x2) f (xn): ··· 8.2 Some Important Statistics It is important to measure the center and the variabil- ity of the population. For the purpose of the inference, we study the following measures regarding to the cen- ter and the variability. 8.2.1 Location Measures of a Sample The most commonly used statistics for measuring the center of a set of data, arranged in order of mag- nitude, are the sample mean, sample median, and sample mode. Let X1;X2;:::;Xn represent n random variables. Sample Mean To calculate the average, or mean, add all values, then Bias divide by the number of individuals. -
Lesson 3: Sampling Plan 1. Introduction to Quantitative Sampling Sampling: Definition
Quantitative approaches Quantitative approaches Plan Lesson 3: Sampling 1. Introduction to quantitative sampling 2. Sampling error and sampling bias 3. Response rate 4. Types of "probability samples" 5. The size of the sample 6. Types of "non-probability samples" 1 2 Quantitative approaches Quantitative approaches 1. Introduction to quantitative sampling Sampling: Definition Sampling = choosing the unities (e.g. individuals, famililies, countries, texts, activities) to be investigated 3 4 Quantitative approaches Quantitative approaches Sampling: quantitative and qualitative Population and Sample "First, the term "sampling" is problematic for qualitative research, because it implies the purpose of "representing" the population sampled. Population Quantitative methods texts typically recognize only two main types of sampling: probability sampling (such as random sampling) and Sample convenience sampling." (...) any nonprobability sampling strategy is seen as "convenience sampling" and is strongly discouraged." IIIIIIIIIIIIIIII Sampling This view ignores the fact that, in qualitative research, the typical way of IIIIIIIIIIIIIIII IIIII selecting settings and individuals is neither probability sampling nor IIIII convenience sampling." IIIIIIIIIIIIIIII IIIIIIIIIIIIIIII It falls into a third category, which I will call purposeful selection; other (= «!Miniature population!») terms are purposeful sampling and criterion-based selection." IIIIIIIIIIIIIIII This is a strategy in which particular settings, persons, or activieties are selected deliberately in order to provide information that can't be gotten as well from other choices." Maxwell , Joseph A. , Qualitative research design..., 2005 , 88 5 6 Quantitative approaches Quantitative approaches Population, Sample, Sampling frame Representative sample, probability sample Population = ensemble of unities from which the sample is Representative sample = Sample that reflects the population taken in a reliable way: the sample is a «!miniature population!» Sample = part of the population that is chosen for investigation. -
MRS Guidance on How to Read Opinion Polls
What are opinion polls? MRS guidance on how to read opinion polls June 2016 1 June 2016 www.mrs.org.uk MRS Guidance Note: How to read opinion polls MRS has produced this Guidance Note to help individuals evaluate, understand and interpret Opinion Polls. This guidance is primarily for non-researchers who commission and/or use opinion polls. Researchers can use this guidance to support their understanding of the reporting rules contained within the MRS Code of Conduct. Opinion Polls – The Essential Points What is an Opinion Poll? An opinion poll is a survey of public opinion obtained by questioning a representative sample of individuals selected from a clearly defined target audience or population. For example, it may be a survey of c. 1,000 UK adults aged 16 years and over. When conducted appropriately, opinion polls can add value to the national debate on topics of interest, including voting intentions. Typically, individuals or organisations commission a research organisation to undertake an opinion poll. The results to an opinion poll are either carried out for private use or for publication. What is sampling? Opinion polls are carried out among a sub-set of a given target audience or population and this sub-set is called a sample. Whilst the number included in a sample may differ, opinion poll samples are typically between c. 1,000 and 2,000 participants. When a sample is selected from a given target audience or population, the possibility of a sampling error is introduced. This is because the demographic profile of the sub-sample selected may not be identical to the profile of the target audience / population. -
Permutation Tests
Permutation tests Ken Rice Thomas Lumley UW Biostatistics Seattle, June 2008 Overview • Permutation tests • A mean • Smallest p-value across multiple models • Cautionary notes Testing In testing a null hypothesis we need a test statistic that will have different values under the null hypothesis and the alternatives we care about (eg a relative risk of diabetes) We then need to compute the sampling distribution of the test statistic when the null hypothesis is true. For some test statistics and some null hypotheses this can be done analytically. The p- value for the is the probability that the test statistic would be at least as extreme as we observed, if the null hypothesis is true. A permutation test gives a simple way to compute the sampling distribution for any test statistic, under the strong null hypothesis that a set of genetic variants has absolutely no effect on the outcome. Permutations To estimate the sampling distribution of the test statistic we need many samples generated under the strong null hypothesis. If the null hypothesis is true, changing the exposure would have no effect on the outcome. By randomly shuffling the exposures we can make up as many data sets as we like. If the null hypothesis is true the shuffled data sets should look like the real data, otherwise they should look different from the real data. The ranking of the real test statistic among the shuffled test statistics gives a p-value Example: null is true Data Shuffling outcomes Shuffling outcomes (ordered) gender outcome gender outcome gender outcome Example: null is false Data Shuffling outcomes Shuffling outcomes (ordered) gender outcome gender outcome gender outcome Means Our first example is a difference in mean outcome in a dominant model for a single SNP ## make up some `true' data carrier<-rep(c(0,1), c(100,200)) null.y<-rnorm(300) alt.y<-rnorm(300, mean=carrier/2) In this case we know from theory the distribution of a difference in means and we could just do a t-test. -
Chapter 3: Simple Random Sampling and Systematic Sampling
Chapter 3: Simple Random Sampling and Systematic Sampling Simple random sampling and systematic sampling provide the foundation for almost all of the more complex sampling designs that are based on probability sampling. They are also usually the easiest designs to implement. These two designs highlight a trade-off inherent in all sampling designs: do we select sample units at random to minimize the risk of introducing biases into the sample or do we select sample units systematically to ensure that sample units are well- distributed throughout the population? Both designs involve selecting n sample units from the N units in the population and can be implemented with or without replacement. Simple Random Sampling When the population of interest is relatively homogeneous then simple random sampling works well, which means it provides estimates that are unbiased and have high precision. When little is known about a population in advance, such as in a pilot study, simple random sampling is a common design choice. Advantages: • Easy to implement • Requires little advance knowledge about the target population Disadvantages: • Imprecise relative to other designs if the population is heterogeneous • More expensive to implement than other designs if entities are clumped and the cost to travel among units is appreciable How it is implemented: • Select n sample units at random from N available in the population All units within the population must have the same probability of being selected, therefore each and every sample of size n drawn from the population has an equal chance of being selected. There are many strategies available for selecting a random sample. -
Sampling Distribution of the Variance
Proceedings of the 2009 Winter Simulation Conference M. D. Rossetti, R. R. Hill, B. Johansson, A. Dunkin, and R. G. Ingalls, eds. SAMPLING DISTRIBUTION OF THE VARIANCE Pierre L. Douillet Univ Lille Nord de France, F-59000 Lille, France ENSAIT, GEMTEX, F-59100 Roubaix, France ABSTRACT Without confidence intervals, any simulation is worthless. These intervals are quite ever obtained from the so called "sampling variance". In this paper, some well-known results concerning the sampling distribution of the variance are recalled and completed by simulations and new results. The conclusion is that, except from normally distributed populations, this distribution is more difficult to catch than ordinary stated in application papers. 1 INTRODUCTION Modeling is translating reality into formulas, thereafter acting on the formulas and finally translating the results back to reality. Obviously, the model has to be tractable in order to be useful. But too often, the extra hypotheses that are assumed to ensure tractability are held as rock-solid properties of the real world. It must be recalled that "everyday life" is not only made with "every day events" : rare events are rarely occurring, but they do. For example, modeling a bell shaped histogram of experimental frequencies by a Gaussian pdf (probability density function) or a Fisher’s pdf with four parameters is usual. Thereafter transforming this pdf into a mgf (moment generating function) by mgf (z)=Et (expzt) is a powerful tool to obtain (and prove) the properties of the modeling pdf . But this doesn’t imply that a specific moment (e.g. 4) is effectively an accessible experimental reality.