Statistical Basis for Sampling
PART 4 STATISTICAL BASIS FOR SAMPLING

I. INTRODUCTION

The question of how many samples need to be taken, and from which locations, in order to obtain a representative sample is often a perplexing one. The appropriate location and number of samples at a particular site depend on a variety of factors, including the degree of accuracy desired, the spatial and temporal variability of the media to be sampled, and the cost of collecting and analyzing the samples. Intuitively, the greater the number of samples taken, the greater the data accuracy. However, a larger number of samples also brings greater sampling and analytical time and costs. One of the objectives of any sampling program should be to obtain the most accurate data possible while minimizing these costs. One way to accomplish this objective is to use statistically valid sampling strategies, in which the appropriate sample number can be estimated and the sampling locations can be chosen without bias.

Often, non-statistical (judgment) sampling strategies are used to select the sampling locations and to estimate how many samples are needed. In some sampling situations the use of non-statistical methods can lead to inaccurate results. The use of non-statistical methods for selecting sampling locations is discussed below.

II. SAMPLING LOCATION SELECTION

A. Non-Statistical Sampling Strategies

Non-statistical, or judgment, sampling relies on the sampler's experience and knowledge of the waste distribution to determine the locations that will provide the most representative sampling. The validity of such sampling depends on the accuracy of this knowledge. The data can be biased intentionally or because of inadequate knowledge of the waste distribution. Because judgment sampling is subject to bias, its validity can be questioned in legal hearings and prosecutions. When judgment is used, the degree of bias introduced, i.e., the sampling error, cannot be estimated, since the equations for estimating error are based on the assumption that the probability of a location being included in a sample is known. Thus, documentation of why a particular sampling location was chosen is critical when judgment is used.

If the limitations of judgment sampling are known to the sampler and are taken into account, judgment sampling can be a very useful technique and can often be the preferred sampling method. For instance, judgment sampling might be the method of choice when the sampling objective is to identify or verify specific hazards present on site rather than to accurately quantify them. In this case, a statistical sampling strategy may follow. Judgment sampling may be equally appropriate in other sampling situations.

B. Statistical Sampling Strategies

Statistical sampling strategies, i.e., random sampling, can often increase data accuracy while eliminating sampler bias. Random sampling relies on the theory of random chance probabilities to choose the most representative sample. Little or no knowledge of the waste distribution is needed. Unlike judgment sampling, the error in data accuracy of a random sampling scheme can be objectively measured because the probability of selecting each sampling location is known. A random numbers table or a random numbers generator should be used to select the sampling locations, thereby eliminating any bias of the sample collector (Table 4-1).

Random sampling is often used at dump sites involving large numbers of drums of unknown contents in order to obtain unbiased, accurate data while minimizing analytical costs. On smaller sites, where there is generally more knowledge of the materials present, judgment sampling may be preferable to random sampling.
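
As an illustration of selecting sampling locations with a random numbers generator, the following Python sketch draws distinct unit numbers from a consecutively numbered population. The function name `select_random_units`, the population of 40 drums, the sample count of 8, and the seed are hypothetical values chosen for the example and do not come from this manual.

```python
import random


def select_random_units(population_size, sample_count, seed=None):
    """Return sample_count distinct unit numbers drawn from 1..population_size.

    Every unit has the same chance of being chosen, and no unit is
    selected twice (sampling without replacement).
    """
    if not 0 < sample_count <= population_size:
        raise ValueError("sample_count must be between 1 and population_size")
    # A dedicated generator keeps the draw reproducible when a seed is
    # recorded in the sampling documentation.
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_count))


# Hypothetical example: 40 numbered drums, 8 of which are to be sampled.
drums_to_sample = select_random_units(population_size=40, sample_count=8, seed=1)
```

Recording the seed alongside the selected numbers is one way to let a reviewer reproduce the selection later; whether that is required depends on the sampling plan.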

On drum sites where random sampling is to be used, the drums should first be separated into groups according to content, based on information provided by manifests, records, labels, etc. Drums from each group are then chosen at random according to random sampling techniques. Random sampling is also often used for sampling lagoons, ponds, and other surface waters. Here, the area of concern is divided into a two- or three-dimensional grid, and the grid points to be sampled are chosen randomly.

A number of statistical sampling strategies are available to produce an unbiased, representative sampling plan. Among these are the simple random, the systematic random, and the stratified random strategies. The principles behind these strategies and the situations in which each would be useful are discussed below.

1. Simple random

Simple random sampling is a statistical sampling method that requires little or no prior knowledge of the waste distribution. This strategy relies on random chance probability theory, in which each sampling location has an equal and known probability of being selected. Because the probability of selection is known, the sampling error can be accurately estimated. Generally, the area of interest is partitioned into a two- or three-dimensional grid pattern and random coordinates are selected for sampling.

2. Systematic random

Systematic random sampling is a refinement of simple random sampling that can produce a more efficient sampling survey in certain circumstances. A systematic random scheme can improve efficiency by reducing the sampling error while maintaining the same sample number, by reducing the number of samples required to achieve a specified sampling error, or by reducing the cost of collection.
Like simple random sampling, systematic random sampling requires little or no knowledge of the waste distribution; however, bias and imprecision may be introduced if unrecognized trends or cycles exist. Two examples of selecting sampling locations using systematic sampling are 1) randomly selecting a transect (an arbitrary dividing line) or transects and then sampling at a preselected interval, or 2) preselecting both the transect or transects and the sampling interval and then beginning the sampling from a randomly selected starting point.

TABLE 4-1
RANDOM NUMBERS TABLE *

03 47 43 73 86   36 96 47 36 61   46 98 63 71 62
97 74 24 67 62   42 31 14 57 20   42 53 32 37 32
16 76 62 27 66   56 50 26 71 07   32 90 79 78 53
12 56 85 99 26   96 96 68 27 31   05 03 72 93 15
55 59 56 35 64   38 54 82 46 22   31 62 43 09 90
16 22 77 94 39   49 54 43 54 82   17 37 93 23 78
84 42 17 53 31   57 24 55 06 88   77 04 74 47 67
63 01 63 78 59   16 95 55 67 19   98 10 50 71 75
33 21 12 34 29   78 64 56 07 82   52 42 07 44 38
57 60 86 32 44   09 47 27 96 54   49 17 46 09 62
18 18 07 92 46   44 17 16 58 09   79 83 86 19 62
26 62 38 97 75   84 16 07 44 99   83 11 46 32 24
23 42 40 64 74   82 97 77 77 81   07 45 32 14 08
52 36 28 19 95   50 92 26 11 97   00 56 76 31 38
37 85 94 35 12   83 39 50 08 30   42 34 07 96 88
70 29 17 12 13   40 33 20 38 26   13 89 51 03 74
56 62 18 37 35   96 83 50 87 75   97 12 25 93 47
99 49 57 22 77   88 42 95 45 72   16 64 36 16 00
16 08 15 04 72   33 27 14 34 09   45 59 34 68 49
31 16 93 32 43   50 27 89 87 19   20 15 37 00 49

*How to use the Random Numbers Table:

1. If sampling containerized wastes (e.g., drums, sacks), segregate the containers according to waste type based on available information, e.g., container markings and labels. Number containers with the same waste type consecutively, starting from 01. If sampling surface waters, divide the area into a two- or three-dimensional grid and number the grid locations.

2. Determine the number of samples you need to take. For routine surveillance sampling, one or two samples are usually adequate and judgment sampling is suitable. For regulatory or research purposes, however, a larger sample size (such as one sample for every group of five containers) taken at random will generate more statistically valid data.

3. Using the random numbers table, choose any number as a starting point.

4. From this number, go in any direction until you have selected the predetermined number of samples with no repetitions. Numbers larger than the population size are ineligible.

3. Stratified random sampling

Unlike both simple random and systematic random sampling, stratified random sampling requires some prior knowledge of the waste distribution. When the waste is known or assumed to be stratified, for example, when an oil layer is thought to overlie a lower aqueous layer of a lagoon, the sampling efficiency can be improved by dividing the area to be sampled into strata that are more homogeneous than the total area. Simple random sampling techniques can then be used to sample each stratum independently. Each stratum is subdivided into grids and the sampling locations are then selected randomly. If the area is known to be vertically stratified, the sampling locations within each stratum are selected randomly and then selected depths are sampled. If the area is known or assumed to be horizontally stratified, the sampling locations within each stratum are also selected randomly, but the total depth is sampled.

When the analytical results have been obtained, an analysis of variance (ANOVA) should be performed to determine whether the strata differ significantly. The result indicates whether the assumption that stratification exists was correct, and thus whether the use of stratified random sampling was statistically valid.
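
The ANOVA check described above can be sketched in Python as a one-way analysis of variance computed by hand. This is a minimal illustration, not a procedure prescribed by this manual: the function name `one_way_anova_f`, the concentration values for the two strata, and the 5 percent significance level are all assumptions made for the example. The critical value 7.71 is the tabulated F value for 1 and 4 degrees of freedom at that level.

```python
def one_way_anova_f(strata):
    """Return (F, df_between, df_within) for a list of per-stratum result lists."""
    k = len(strata)
    n_total = sum(len(s) for s in strata)
    if k < 2 or any(len(s) == 0 for s in strata) or n_total <= k:
        raise ValueError("need at least two non-empty strata and more results than strata")

    grand_mean = sum(sum(s) for s in strata) / n_total
    ss_between = 0.0  # variation of stratum means around the grand mean
    ss_within = 0.0   # variation of results around their own stratum mean
    for s in strata:
        stratum_mean = sum(s) / len(s)
        ss_between += len(s) * (stratum_mean - grand_mean) ** 2
        ss_within += sum((x - stratum_mean) ** 2 for x in s)
    if ss_within == 0:
        raise ValueError("no within-stratum variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical analytical results for two strata of a lagoon.
oil_layer = [10.0, 12.0, 14.0]
aqueous_layer = [2.0, 3.0, 4.0]

f_stat, df_between, df_within = one_way_anova_f([oil_layer, aqueous_layer])

# Tabulated critical F for (1, 4) degrees of freedom at the 0.05 level.
F_CRITICAL = 7.71
strata_differ = f_stat > F_CRITICAL
```

For these made-up values the F statistic is 48.6, well above the critical value, so the strata would be judged significantly different and the stratified design would be supported. With other numbers of strata or samples, the critical value must be looked up for the corresponding degrees of freedom.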