Use of Sampling in the Census Technical Session 5.2

Regional Workshop on the Operational Guidelines of the WCA 2020
Dar es Salaam, Tanzania, 23-27 March 2020
Eloi OUEDRAOGO, Statistician, Agricultural Census Team, FAO Statistics Division (ESS)

CONTENTS
• Complete enumeration censuses versus sample-based censuses
• Uses of sampling at other census stages
• Sample designs based on list frames
• Sample designs based on area frames
• Sample designs based on multiple frames
• Choice of sample design
• Country examples

Background
As mentioned in the WCA 2020, Vol. 1 (Chapter 4, paragraph 4.34), when deciding whether to conduct a census by complete or sample enumeration, in addition to efficiency considerations (precision versus costs), one should take into account:
• desired level of aggregation for census data;
• use of the census as a frame for ongoing sampling surveys;
• data content of the census; and
• capacity to deal with sampling methods and subsequent statistical analysis based on samples.

Complete enumeration versus sample enumeration census

Advantages of complete enumeration
1. Reliable census results for the smallest administrative and geographic units and on rare events (such as crops/livestock types).
2. Provides a reliable frame for the organization of subsequent regular infra-annual and annual sample surveys. In terms of frames, it is much less demanding in respect of the holdings' characteristics.
3. Requires fewer highly qualified statistical personnel with expert knowledge of sampling methods than a census conducted on a sample basis. This is important in countries with limited technical expertise.
4. Aggregating data from a complete enumeration is straightforward and does not involve statistical estimation.

Advantages of sample enumeration
1. Is generally less costly than a complete enumeration.
2. Helps to decrease the overall response burden.
3. Requires a smaller number of enumerators and supervisors than a census conducted by complete enumeration. Consequently, the non-sampling errors can be expected to be lower because of the employment of better trained enumerators and supervisors and better quality control.
4. Requires less processing capacity, and the results are usually available sooner.

Complete enumeration versus sample enumeration census (contd.)

Disadvantages of complete enumeration
1. High cost and administrative complexity.
2. High overall response burden.
3. Requires a very large number of field staff. As a result:
   • candidates with the desired qualifications might not be available in the required number;
   • the standard might be lowered;
   • adequate training of a large number of field census staff in a short period of time is also challenging, with a consequent effect on data quality.
4. The amount of data to be processed is very large. The results may be considerably delayed if sufficient data processing capacity is not in place.

Disadvantages of sample enumeration
1. The amount of subnational data and cross-tabulations that can be produced is limited.
2. Cannot provide accurate information on events that occur infrequently.
3. May not ensure an adequate or complete frame for subsequent agricultural surveys.
4. Requires a reliable sampling frame.
5. Auxiliary information (such as total area of the holding, area by main land use types, number of livestock by main types) is needed for a sound sample design.
6. Requires personnel who are well trained in sampling methods and analysis.
7. Analysing the data from a sample enumeration requires the use of more complicated techniques.

Factors for consideration in choosing between complete or sample enumeration
◦ Sample enumeration is an optimal alternative where there is severe limitation of funds and personnel.
◦ However, it requires a reliable sampling frame, as well as the capacity to deal with sampling methods and subsequent statistical analysis based on samples.
◦ A decision will depend on the level at which the census data are required, i.e. for the entire country, provinces, districts or even smaller administrative units (such as communities).
◦ Even countries that lack resources should seriously consider undertaking at least a part of the census items on a complete enumeration basis. This ensures a good base for preparing an efficient sampling design, e.g. for planning future agricultural surveys to collect current agricultural statistics.

Combination of complete and sample enumeration
Complete and sample enumeration can be combined in different ways:
◦ Complete enumeration in the most important agricultural regions of the country and/or in regions with easy access, and a sample of villages or enumeration areas (EAs) for the rest of the country.
◦ Complete enumeration for some types of holdings or for those above an established threshold, and sample enumeration for the remaining holdings.
In the second case the following scenarios could be used:
◦ Complete enumeration for agricultural holdings with the largest contribution to the agricultural production, which usually constitute the bulk of the holdings. The remaining holdings are enumerated on a sample basis.
◦ Complete enumeration for large or "special" holdings, which may account for a significant contribution to agricultural production, and sample enumeration for smaller holdings.

Use of sampling for census enumeration
• In a classical census, sampling may be applied when:
   o using the short-long questionnaire concept (the short questionnaire is administered to the whole target population of holdings, while the long questionnaire is administered only to a sample of such holdings); or
   o conducting a sample-based census as a single one-off operation.
• In the modular approach, sampling is needed to select the holdings to which the supplementary module(s) are applied.
• When the census is part of an integrated census/survey modality, rotating modules are conducted on a sample basis.
• Use of registers as a source of census data could also be combined with field enumeration on a sample basis.
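The combination of complete enumeration above a threshold with sample enumeration for the remaining holdings can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method prescribed by the WCA 2020: the function name `estimate_total`, the dictionary keys, the threshold value and the use of simple random sampling with a plain expansion weight for the smaller holdings are all hypothetical choices made for the example.

```python
import random


def estimate_total(frame, size_key, value_key, threshold, sample_size, seed=None):
    """Estimate a population total from a combined design (illustrative only).

    Holdings whose size is at or above `threshold` are all enumerated.
    The remaining holdings are sampled by simple random sampling without
    replacement and expanded by the weight N / n.
    """
    rng = random.Random(seed)

    # Completely enumerated part: every holding at or above the threshold.
    take_all = [h for h in frame if h[size_key] >= threshold]
    # Sampled part: all other holdings.
    remainder = [h for h in frame if h[size_key] < threshold]

    if remainder and sample_size < 1:
        raise ValueError("sample_size must be at least 1 when holdings remain")

    n = min(sample_size, len(remainder))
    sample = rng.sample(remainder, n)

    # Expansion weight: each sampled holding represents N / n holdings.
    weight = len(remainder) / n if n > 0 else 0.0

    certain_part = sum(h[value_key] for h in take_all)
    sampled_part = weight * sum(h[value_key] for h in sample)
    return certain_part + sampled_part


if __name__ == "__main__":
    # Hypothetical frame: 100 small holdings and 5 large ones.
    frame = [{"area": 1.0 + (i % 4), "cattle": i % 6} for i in range(100)]
    frame += [{"area": 50.0 + i, "cattle": 100 + i} for i in range(5)]
    print(estimate_total(frame, "area", "cattle", threshold=20.0,
                         sample_size=20, seed=1))
```

The large holdings contribute to the estimate without sampling error, so all sampling variability comes from the smaller holdings, which is the usual motivation for treating large or "special" holdings as a take-all group.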
Uses of sampling at other census stages
Sampling techniques can also be used:
• During the preparation phase, in pilot censuses, to test census instruments and procedures.
• For quality checks during field operations, when the work of enumerators and supervisors is evaluated using sampling to avoid selection bias.
• To improve the census coverage (holdings outside of the list frame).
• For non-response follow-up.
• In post-enumeration surveys (PES), to assess the census coverage and the accuracy of responses.
• For the preparation of flash (preliminary) census results to be disseminated shortly after census data collection.

Sampling techniques used in a census
The following sampling techniques can be used for the construction of probability sample designs for agricultural censuses and surveys:
o simple random sampling (SRS),
o systematic sampling (SYS),
o stratified sampling (STR),
o sampling with probability proportional to size (PPS),
o multivariate probability proportional to size,
o cluster sampling,
o multi-stage sampling, etc.
Advances in sampling theory, such as calibration and ratio and regression estimation, may also be used to improve the reliability of census data collected by sample enumeration.

One-stage or element sampling
• Some conditions are essential for cost-efficient element sampling (using SRS, SYS, STR or PPS):
   o A fairly complete and up-to-date frame (listing of elements, e.g. holdings or households) of the target population must be available.
   o Locating the elements and collecting the data must be feasible and economical.
   o Techniques such as PPS and STR have strong requirements for prior auxiliary information for each element in the population.
Therefore, element sampling is not always feasible in agricultural censuses (ACs), especially in countries that do not have a well-established system of agricultural surveys.
For a given sample size:
   o Sampling errors are smaller in element sampling than in cluster sampling.
   o However, element sampling involves larger frame development and data collection costs, because the sample is more widely dispersed than, for example, under a two-stage sample design.

Clustering and multi-stage sampling procedure
Typically, the sample design for an AC consists of a combination of various sample selection techniques. A manageable sample design often involves clustering and several stages of sampling.
In cluster sampling, a sample of clusters (primary sampling units, PSUs, such as districts, villages or EAs) is first drawn from the population of clusters. In the next stage, either all elements of the sampled clusters are taken (one-stage cluster sampling), or a sample of elements (secondary sampling units, SSUs) is drawn from each sampled cluster (two-stage cluster sampling).
The practical aspects of sampling and data collection are the main motivation for the use of clustering.
◦ An important advantage of cluster sampling is that a sampling frame at the element level (e.g. the agricultural holding, AH) is not needed for the whole population, whereas cluster-level frames are often accessible.
◦ Cluster sampling is especially motivated by cost efficiency, that is, the relatively low cost per sample element (AH), because of lower costs for both listing and data collection (locating).

Clustering and multi-stage sampling procedure (contd.)
• However, in practice, clusters (such as villages or EAs) tend to be internally homogeneous, and this intra-cluster homogeneity increases standard errors and thus decreases statistical efficiency. Therefore, when constructing a sample design for ACs and surveys, more clusters will need to be selected and then subsampled using measures of size.
• The multi-stage sampling
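The loss of efficiency caused by intra-cluster homogeneity can be illustrated with a common textbook approximation that is not part of the slides: for clusters from which the same number m of elements is taken, the design effect is roughly deff = 1 + (m - 1)ρ, where ρ is the intra-cluster correlation. The sketch below assumes equal subsample sizes per cluster and uses hypothetical function names and illustrative values of ρ and m.

```python
def design_effect(cluster_size, rho):
    """Approximate design effect for equal-sized cluster subsamples.

    cluster_size: number of elements taken per sampled cluster (m).
    rho: intra-cluster correlation (assumed known for the illustration).
    """
    return 1.0 + (cluster_size - 1) * rho


def effective_sample_size(n_clusters, cluster_size, rho):
    """Size of a simple random sample with roughly the same precision."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, rho)


if __name__ == "__main__":
    rho = 0.1  # illustrative value only
    # Three allocations of the same total sample of 1,000 holdings.
    for n_clusters, m in [(50, 20), (100, 10), (200, 5)]:
        deff = design_effect(m, rho)
        n_eff = effective_sample_size(n_clusters, m, rho)
        print(f"{n_clusters} clusters x {m} holdings: "
              f"deff = {deff:.2f}, effective n = {n_eff:.0f}")
```

Under this approximation, spreading a fixed total sample over more clusters with fewer holdings each lowers the design effect, which is consistent with the advice to select more clusters and then subsample within them. The gain has to be weighed against the higher travel and listing costs of visiting more clusters.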
