Elementary Forest Sampling


ELEMENTARY FOREST SAMPLING

FRANK FREESE
Southern Forest Experiment Station, Forest Service

Agriculture Handbook No. 232
December 1962
U.S. Department of Agriculture, Forest Service

Reviewed and approved for reprinting, November 1976.

I should like to express my appreciation to Professor George W. Snedecor of the Iowa State University Statistical Laboratory and to the Iowa State University Press for their generous permission to reprint tables 1, 3, and 4 from their book Statistical Methods, 5th edition. Thanks are also due to Dr. C. I. Bliss of the Connecticut Agricultural Experiment Station, who originally prepared the material in table 4. I am indebted to Professor Sir Ronald A. Fisher, F.R.S., Cambridge, and to Dr. Frank Yates, F.R.S., Rothamsted, and to Messrs. Oliver and Boyd Ltd., Edinburgh, for permission to reprint table 2 from their book Statistical Tables for Biological, Agricultural, and Medical Research.

FRANK FREESE
Southern Forest Experiment Station

For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402 - Price $1.60. 25% discount allowed on orders of 100 or more to one address. Stock No. 001-000-[illegible]-2 / Catalog No. A 1.76:232. There is a minimum charge of $1.00 for each mail order.

CONTENTS

Basic concepts
  Why sample?
  Populations, parameters, and estimates
  Bias, accuracy, and precision
  Variables, continuous and discrete
  Distribution functions
Tools of the trade
  Subscripts, summations, and brackets
  Variance
  Standard errors and confidence limits
  Expanded variances and standard errors
  Coefficient of variation
  Covariance
  Correlation coefficient
  Independence
  Variances of products, ratios, and sums
  Transformations of variables
Sampling methods for continuous variables
  Simple random sampling
  Stratified random sampling
  Regression estimators
  Double sampling
  Sampling when units are unequal in size (including pps sampling)
  Two-stage sampling
  Two-stage sampling with unequal-sized primaries
  Systematic sampling
Sampling methods for discrete variables
  Simple random sampling - classification data
  Cluster sampling for attributes
  Cluster sampling for attributes - unequal-sized clusters
  Sampling of count variables
Some other aspects of sampling
  Size and shape of sampling units
  Estimating changes
  Design of sample surveys
References for additional reading
Practice problems in subscript and summation notation
Tables
  1. Ten thousand randomly assorted digits
  2. The distribution of t
  3. Confidence intervals for binomial distribution
  4. Arcsin transformation

ELEMENTARY FOREST SAMPLING

This is a statistical cookbook for foresters. It presents some sampling methods that have been found useful in forestry. No attempt is made to go into the theory behind these methods. This has some dangers, but experience has shown that few foresters will venture into the intricacies of statistical theory until they are familiar with some of the common sampling designs and computations. The aim here is to provide that familiarity. Readers who attain such familiarity will be able to handle many of the routine sampling problems. They will also find that many problems have been left unanswered and many ramifications of sampling ignored. It is hoped that when they reach this stage they will delve into more comprehensive works on sampling.
Several very good ones are listed on page 78.

BASIC CONCEPTS

Why Sample?

Most human decisions are made with incomplete knowledge. In daily life, a physician may diagnose disease from a single drop of blood or a microscopic section of tissue; a housewife judges a watermelon by its "plug" or by the sound it emits when thumped; and amid a bewildering array of choices and claims we select toothpaste, insurance, vacation spots, mates, and careers with but a fragment of the total information necessary or desirable for complete understanding. All of these we do with the ardent hope that the drop of blood, the melon plug, and the advertising claim give a reliable picture of the population they represent.

In manufacturing and business, in science, and no less in forestry, partial knowledge is a normal state. The complete census is rare; the sample is commonplace. A ranger must advertise timber sales with estimated volume, estimated grade yield and value, estimated cost, and estimated risk. The nurseryman sows seed whose germination is estimated from a tiny fraction of the seedlot, and at harvest he estimates the seedling crop with sample counts in the nursery beds. Enterprising pulp companies, seeking a source of raw material in sawmill residue, may estimate the potential tonnage of chippable material by multiplying reported production by a set of conversion factors obtained at a few representative sawmills.

However desirable a complete measurement may seem, there are several good reasons why sampling is often preferred.

In the first place, complete measurement or enumeration may be impossible. The nurseryman might be somewhat better informed if he knew the germinative capacity of all the seed to be sown, but the destructive nature of the germination test precludes testing every seed.
For identical reasons, it is impossible to measure the bending strength of all the timbers to be used in a bridge, the tearing strength of all the paper to be put into a book, or the grade of all the boards to be produced in a timber sale. If the tests were permitted, no seedlings would be produced, no bridges would be built, no books printed, and no stumpage sold. Clearly where testing is destructive, some sort of sampling is inescapable.

In other instances total measurement or count is not feasible. Consider the staggering task of testing the quality of all the water in a reservoir, weighing all the fish in a stream, counting all the seedlings in a 500-bed nursery, enumerating all the egg masses in a turpentine beetle infestation, measuring diameter and height of all the merchantable trees in a 10,000-acre forest. Obviously, the enormity of the task would demand some sort of sampling procedure.

It is well known that sampling will frequently provide the essential information at a far lower cost than a complete enumeration. Less well known is the fact that this information may at times be more reliable than that obtained by a 100-percent inventory. There are several reasons why this might be true. With fewer observations to be made and more time available, measurement of the units in the sample can be and is more likely to be made with greater care. In addition, a portion of the saving resulting from sampling could be used to buy better instruments and to employ or train higher caliber personnel. It is not hard to see that good measurements on 5 percent of the units in a population could provide more reliable information than sloppy measurements on 100 percent of the units.
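
The comparison between careful measurement of 5 percent of the units and sloppy measurement of all of them can be illustrated with a small simulation. The sketch below is not from the handbook: the population (10,000 tree volumes), the size of the errors, and the function name compare_census_and_sample are all hypothetical. It also assumes that "sloppy" means a systematic 10-percent overestimate on every reading; purely random errors would largely average out over a full census, so the result depends on that assumption.

```python
import random
import statistics


def compare_census_and_sample(seed=1962, n_units=10_000, sample_fraction=0.05):
    """Compare a biased 100-percent inventory with a careful 5-percent sample.

    All numbers are illustrative assumptions, not values from the handbook.
    Returns (true_mean, census_error, sample_error).
    """
    rng = random.Random(seed)

    # Hypothetical population: the true value of every unit (e.g. tree volume).
    true_values = [rng.gauss(20.0, 5.0) for _ in range(n_units)]
    true_mean = statistics.fmean(true_values)

    # "Sloppy" census: every unit is measured, but each reading is assumed
    # to run 10 percent high, with a large random error on top.
    census = [v * 1.10 + rng.gauss(0.0, 2.0) for v in true_values]
    census_error = abs(statistics.fmean(census) - true_mean)

    # Careful sample: 5 percent of the units drawn at random without
    # replacement, each measured with a small, unbiased error.
    n_sample = int(n_units * sample_fraction)
    chosen = rng.sample(true_values, n_sample)
    careful = [v + rng.gauss(0.0, 0.2) for v in chosen]
    sample_error = abs(statistics.fmean(careful) - true_mean)

    return true_mean, census_error, sample_error


if __name__ == "__main__":
    true_mean, census_error, sample_error = compare_census_and_sample()
    print(f"true mean:              {true_mean:.2f}")
    print(f"error of sloppy census: {census_error:.2f}")
    print(f"error of careful sample: {sample_error:.2f}")
```

Under these assumptions the census mean is off by roughly the size of the bias no matter how many units are measured, while the error of the sample mean shrinks with the square root of the sample size.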
Finally, since sample data can be collected and processed in a fraction of the time required for a complete inventory, the information obtained may be more timely. Surveying 100 percent of the lumber market is not going to provide information that is very useful to a seller if it takes 10 months to complete the job.

Populations, Parameters, and Estimates

The central notion in any sampling problem is the existence of a population. It is helpful to think of a population as an aggregate of unit values, where the "unit" is the thing upon which the observation is made,