Springer Series in Statistics

Advisors: P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger

Alho/Spencer: Statistical Demography and Forecasting.
Andersen/Borgan/Gill/Keiding: Statistical Models Based on Counting Processes.
Atkinson/Riani: Robust Diagnostic Regression Analysis.
Atkinson/Riani/Cerioli: Exploring Multivariate Data with the Forward Search.
Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition.
Borg/Groenen: Modern Multidimensional Scaling: Theory and Applications, 2nd edition.
Brockwell/Davis: Time Series: Theory and Methods, 2nd edition.
Bucklew: Introduction to Rare Event Simulation.
Cappé/Moulines/Rydén: Inference in Hidden Markov Models.
Chan/Tong: Chaos: A Statistical Perspective.
Chen/Shao/Ibrahim: Monte Carlo Methods in Bayesian Computation.
Coles: An Introduction to Statistical Modeling of Extreme Values.
David/Edwards: Annotated Readings in the History of Statistics.
Devroye/Lugosi: Combinatorial Methods in Density Estimation.
Efromovich: Nonparametric Curve Estimation: Methods, Theory, and Applications.
Eggermont/LaRiccia: Maximum Penalized Likelihood Estimation, Volume I: Density Estimation.
Fahrmeir/Tutz: Multivariate Statistical Modelling Based on Generalized Linear Models, 2nd edition.
Fan/Yao: Nonlinear Time Series: Nonparametric and Parametric Methods.
Farebrother: Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume I: Two Crops.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume II: Three or More Crops.
Ferraty/Vieu: Nonparametric Functional Data Analysis: Models, Theory, Applications, and Implementation.
Ghosh/Ramamoorthi: Bayesian Nonparametrics.
Glaz/Naus/Wallenstein: Scan Statistics.
Good: Permutation Tests: Parametric and Bootstrap Tests of Hypotheses, 3rd edition.
Gouriéroux: ARCH Models and Financial Applications.
Gu: Smoothing Spline ANOVA Models.
Györfi/Kohler/Krzyżak/Walk: A Distribution-Free Theory of Nonparametric Regression.
Haberman: Advanced Statistics, Volume I: Description of Populations.
Hall: The Bootstrap and Edgeworth Expansion.
Härdle: Smoothing Techniques: With Implementation in S.
Harrell: Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis.
Hart: Nonparametric Smoothing and Lack-of-Fit Tests.
Hastie/Tibshirani/Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
Hedayat/Sloane/Stufken: Orthogonal Arrays: Theory and Applications.
Heyde: Quasi-Likelihood and its Application: A General Approach to Optimal Parameter Estimation.

(continued after index)

Yves Tillé

Sampling Algorithms

Yves Tillé
Institut de Statistique, Université de Neuchâtel
Espace de l’Europe 4, Case postale 805
2002 Neuchâtel, Switzerland

Library of Congress Control Number: 2005937126
ISBN-10: 0-387-30814-8
ISBN-13: 978-0387-30814-2

© 2006 Springer Science+Business Media, Inc.

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, Inc., 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America. (MVY)

987654321

springer.com

Preface

This book is based upon courses on sampling algorithms.
After having used scattered notes for several years, I decided to rewrite the material completely and in a consistent way. The books of Brewer and Hanif (1983) and Hájek (1981) have been my works of reference. Brewer and Hanif (1983) drew up an exhaustive list of sampling methods with unequal probabilities, which was probably very tedious work. The posthumous book of Hájek (1981) contains an attempt at writing a general theory for conditional Poisson sampling. Since the publication of these books, things have improved. So many new sampling techniques have been proposed that it is difficult to form a general idea of the interest of each of them. I do not claim to give an exhaustive list of these new methods. Above all, I would like to propose a general framework in which it will be easier to compare existing methods. Furthermore, forty-six algorithms are precisely described, which allows the reader to implement the described methods easily.

This book is an opportunity to present a synthesis of my research and to develop my convictions on the question of sampling. At present, with the splitting method, it is possible to construct an infinite number of new sampling methods with unequal probabilities. I am, however, convinced that conditional Poisson sampling is probably the best solution to the problem of sampling with unequal probabilities, although one can object that other procedures provide very similar results. Another conviction is that the joint inclusion probabilities are not used for anything. I also advocate the use of the cube method, which allows balanced samples to be selected.

I would also like to apologize for all the techniques that are not cited in this book. For example, I do not mention all the methods called “order sampling” because the methods for coordinating samples are not examined in this book. They could be the topic of another publication.
This material is aimed at experienced statisticians who are familiar with the theory of survey sampling, at Ph.D. students who want to improve their knowledge of the theory of sampling, and at practitioners who want to use or implement modern sampling procedures. The R package “sampling”, available on the Web site of the Comprehensive R Archive Network (CRAN), contains an implementation of most of the described algorithms. I refer the reader to the books of Mittelhammer (1996) and Shao (2003) for questions of inferential statistics, and to the book of Särndal et al. (1992) for general questions related to the theory of sampling.

Finally, I would like to thank Jean-Claude Deville, who taught me a great deal about sampling when we worked together at the École Nationale de la Statistique et de l’Analyse de l’Information in Rennes from 1996 to 2000. I thank Yves-Alain Gerber, who produced most of the figures of this book. I am also grateful to Cédric Béguin, Ken Brewer, Lionel Qualité, and Paul-André Salamin for their constructive comments on a previous version of this book. I am particularly indebted to Lennart Bondesson for his critical reading of the manuscript, which allowed me to improve this book considerably, and to Leon Jang for correction of the proofs.

Neuchâtel, October 2005
Yves Tillé

Contents

Preface

1 Introduction and Overview
  1.1 Purpose
  1.2 Representativeness
  1.3 The Origin of Sampling Theory
    1.3.1 Sampling with Unequal Probabilities
    1.3.2 Conditional Poisson Sampling
    1.3.3 The Splitting Technique
    1.3.4 Balanced Sampling
  1.4 Scope of Application
  1.5 Aim of This Book
  1.6 Outline of This Book

2 Population, Sample, and Sampling Design
  2.1 Introduction
  2.2 Population and Variable of Interest
  2.3 Sample
    2.3.1 Sample Without Replacement
    2.3.2 Sample With Replacement
  2.4 Support
  2.5 Convex Hull, Interior, and Subspaces Spanned by a Support
  2.6 Sampling Design and Random Sample
  2.7 Reduction of a Sampling Design With Replacement
  2.8 Expectation and Variance of a Random Sample
  2.9 Inclusion Probabilities
  2.10 Computation of the Inclusion Probabilities
  2.11 Characteristic Function of a Sampling Design
  2.12 Conditioning a Sampling Design
  2.13 Observed Data and Consistency
  2.14 Statistic and Estimation
  2.15 Sufficient Statistic
  2.16 The Hansen-Hurwitz (HH) Estimator
    2.16.1 Estimation of a Total
    2.16.2 Variance of the Hansen-Hurwitz Estimator
    2.16.3 Variance Estimation
  2.17 The Horvitz-Thompson (HT) Estimator
    2.17.1 Estimation of a Total
    2.17.2 Variance of the Horvitz-Thompson Estimator
    2.17.3 Variance Estimation
  2.18 More on Estimation in Sampling With Replacement