10: Inference from Small Samples

10.1 Refer to Table 4, Appendix I, indexing df along the left or right margin and tα across the top.

a  t.05 = 2.015 with 5 df
b  t.025 = 2.306 with 8 df
c  t.10 = 1.330 with 18 df
d  t.025 ≈ 1.96 with 30 df

10.2 The value P(t > tα) = α is the tabled entry for a particular number of degrees of freedom.

a  For a two-tailed test with α = .01, the critical value for the rejection region cuts off α/2 = .005 in each tail of the t distribution, so that t.005 = 3.055. The null hypothesis H0 will be rejected if t > 3.055 or t < −3.055 (which you can also write as |t| > 3.055).
b  For a right-tailed test, the critical value that separates the rejection and nonrejection regions is a value of t (called tα) such that P(t > tα) = α = .05 with df = 16. That is, t.05 = 1.746. The null hypothesis H0 will be rejected if t > 1.746.
c  For a two-tailed test with α/2 = .025 and df = 25, H0 will be rejected if |t| > 2.060.
d  For a left-tailed test with α = .01 and df = 7, H0 will be rejected if t < −2.998.

10.3

a  The p-value for a two-tailed test is defined as p-value = P(|t| > 2.43) = 2P(t > 2.43), so that P(t > 2.43) = p-value/2. Refer to Table 4, Appendix I, with df = 12. The exact probability P(t > 2.43) is unavailable; however, it is evident that t = 2.43 falls between t.025 = 2.179 and t.01 = 2.681. Therefore, the area to the right of t = 2.43 must be between .01 and .025. Since .01 < p-value/2 < .025, the p-value can be bounded as .02 < p-value < .05.
b  For a right-tailed test, p-value = P(t > 3.21) with df = 16. Since the value t = 3.21 is larger than t.005 = 2.921, the area to its right must be less than .005, and you can bound the p-value as p-value < .005.
c  For a two-tailed test, p-value = P(|t| > 1.19) = 2P(t > 1.19), so that P(t > 1.19) = p-value/2.
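The table-bracketing step in part a above can be checked mechanically. A minimal Python sketch (standard library only; the df = 12 critical values are ordinary Table 4 entries) recovers the same bounds:

```python
# Bounding a two-tailed p-value from t-table entries, as in Exercise 10.3a.
# With df = 12 the observed t = 2.43 falls between t.025 = 2.179 and t.01 = 2.681,
# so .01 < P(t > 2.43) < .025 and .02 < p-value < .05.
t_obs = 2.43
table_df12 = {0.10: 1.356, 0.05: 1.782, 0.025: 2.179, 0.01: 2.681, 0.005: 3.055}

# Tail areas whose critical values bracket |t_obs|: smaller tail area
# corresponds to a larger critical value.
tail_upper = min(a for a, crit in table_df12.items() if crit < abs(t_obs))  # 0.025
tail_lower = max(a for a, crit in table_df12.items() if crit > abs(t_obs))  # 0.01
lo, hi = 2 * tail_lower, 2 * tail_upper  # two-tailed bounds: (.02, .05)
print(lo, hi)
```

The same comparison logic gives the one-tailed bounds in parts b and d directly, without the doubling.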
From Table 4 with df = 25, t = 1.19 is smaller than t.10 = 1.316, so that p-value/2 > .10 and p-value > .20.
d  For a left-tailed test, p-value = P(t < −8.77) = P(t > 8.77) with df = 7. Since the value t = 8.77 is larger than t.005 = 3.499, the area to its right must be less than .005, and you can bound the p-value as p-value < .005.

10.4

a  The stem and leaf plot is shown below. Notice the mounded shape of the data, which justifies the normality assumption.

Stem-and-Leaf Display: Scores
Stem-and-leaf of Scores  N = 20
Leaf Unit = 1.0
 1   5  7
 2   6  2
 5   6  578
 8   7  123
(4)  7  5567
 8   8  244
 5   8  669
 2   9  13

b  Using the formulas given in Chapter 2 or your scientific calculator, calculate x̄ = Σxi/n = 1533/20 = 76.65 and s² = [Σxi² − (Σxi)²/n]/(n − 1) = [119419 − (1533)²/20]/19 = 100.7658, so that s = 10.0382.
c  Small-sample confidence intervals are quite similar to their large-sample counterparts; however, these intervals must be based on the t distribution. Thus, the confidence interval for a single population mean is x̄ ± t(α/2) s/√n, where t(α/2) is a value of t (Table 4) based on df = n − 1 degrees of freedom that has area α/2 to its right. For this exercise, n = 20, x̄ = 76.65, s = 10.0382, and t.025 with n − 1 = 19 degrees of freedom is 2.093. Hence the 95% confidence interval is 76.65 ± 2.093(10.0382/√20) ⇒ 76.65 ± 4.70, or 71.95 < µ < 81.35. Intervals constructed using this procedure will enclose µ 95% of the time in repeated sampling. Hence, we are fairly certain that this particular interval encloses µ.

10.5

a  Using the formulas given in Chapter 2, calculate Σxi = 70.5 and Σxi² = 499.27. Then x̄ = 70.5/10 = 7.05 and s² = [499.27 − (70.5)²/10]/9 = .249444, so that s = .4994.
b  With df = n − 1 = 9, the appropriate value of t is t.01 = 2.821 (from Table 4) and the 99% upper one-sided confidence bound is x̄ + t.01 s/√n ⇒ 7.05 + 2.821(.4994/√10) ⇒ 7.05 + .446, or µ < 7.496.
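The shortcut-formula arithmetic in 10.4 and 10.5 is easy to verify numerically. A minimal Python sketch (standard library only), using the summary statistics from Exercise 10.4:

```python
import math

# Exercise 10.4: 95% t interval for the mean from summary statistics.
n = 20
sum_x = 1533.0     # sum of the observations
sum_x2 = 119419.0  # sum of the squared observations

xbar = sum_x / n                            # sample mean, 76.65
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)      # shortcut variance formula, 100.7658
s = math.sqrt(s2)                           # 10.0382
t_025 = 2.093                               # t.025 with 19 df, from Table 4
margin = t_025 * s / math.sqrt(n)           # about 4.70

print(xbar - margin, xbar + margin)         # about (71.95, 81.35)
```

Swapping in n = 10, Σxi = 70.5, Σxi² = 499.27, and t.01 = 2.821 reproduces the one-sided bound of Exercise 10.5b.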
Intervals constructed using this procedure will enclose µ 99% of the time in repeated sampling. Hence, we are fairly certain that this particular interval encloses µ.
c  The hypothesis to be tested is H0: µ = 7.5 versus Ha: µ < 7.5, and the test statistic is t = (x̄ − µ0)/(s/√n) = (7.05 − 7.5)/(.4994/√10) = −2.849. The rejection region with α = .01 and n − 1 = 9 degrees of freedom is located in the lower tail of the t distribution and is found from Table 4 as t < −t.01 = −2.821. Since the observed value of the test statistic falls in the rejection region, H0 is rejected and we conclude that µ is less than 7.5.
d  Notice that the 99% upper one-sided confidence bound for µ does not include the value µ = 7.5. This confirms the result of the hypothesis test in part c, in which we concluded that µ is less than 7.5.

10.6

a  Using the formulas given in Chapter 2, calculate Σxi = 12.55 and Σxi² = 13.3253. Then x̄ = 12.55/14 = .896 and s = √([13.3253 − (12.55)²/14]/13) = .3995. With df = n − 1 = 13, the appropriate value of t is t.025 = 2.160 (from Table 4) and the 95% confidence interval is .896 ± 2.160(.3995/√14) ⇒ .896 ± .231, or .665 < µ < 1.127. Intervals constructed using this procedure will enclose µ 95% of the time in repeated sampling. Hence, we are fairly certain that this particular interval encloses µ.
b  Calculate x̄ = 4.9/4 = 1.225 and s = √([6.0058 − (4.9)²/4]/3) = .0332, and the 95% confidence interval is 1.225 ± 3.182(.0332/√4) ⇒ 1.225 ± .053, or 1.172 < µ < 1.278. The interval is narrower than the interval in part a, even though the sample size is smaller, because the data are much less variable.
c  For white tuna in water, x̄ = 10.24/8 = 1.28 and s = √([13.235 − (10.24)²/8]/7) = .1351, and the 95% confidence interval is 1.28 ± 2.365(.1351/√8) ⇒ 1.28 ± .113, or 1.167 < µ < 1.393.
For light tuna in oil, x̄ = 12.62/11 = 1.147 and s = √([19.0828 − (12.62)²/11]/10) = .6785, and the 95% confidence interval is 1.147 ± 2.228(.6785/√11) ⇒ 1.147 ± .456, or .691 < µ < 1.603.

[Figure: plot of mean cost for the four treatments Light Oil, Light Water, White Oil, and White Water.]

The plot of the four treatment means shows substantial differences in variability. The cost of light tuna in water appears to be the lowest, and quite different from either of the white tuna varieties.

10.7 Similar to previous exercises. The hypothesis to be tested is H0: µ = 5 versus Ha: µ < 5. Calculate x̄ = 29.6/6 = 4.933 and s² = [146.12 − (29.6)²/6]/5 = .01867, so that s = .1366. The test statistic is t = (x̄ − µ0)/(s/√n) = (4.933 − 5)/(.1366/√6) = −1.195. The critical value of t with α = .05 and n − 1 = 5 degrees of freedom is t.05 = 2.015 and the rejection region is t < −2.015. Since the observed value does not fall in the rejection region, H0 is not rejected. There is no evidence to indicate that the dissolved oxygen content is less than 5 parts per million.

10.8 Calculate x̄ = 608/10 = 60.8 and s² = [37538 − (608)²/10]/9 = 63.5111, so that s = 7.9694. The 95% confidence interval based on df = 9 is 60.8 ± 2.262(7.9694/√10) ⇒ 60.8 ± 5.701, or 55.099 < µ < 66.501.

10.9

a  Similar to previous exercises. The hypothesis to be tested is H0: µ = 100 versus Ha: µ < 100. Calculate x̄ = 1797.095/20 = 89.85475 and s² = [165,697.7081 − (1797.095)²/20]/19 = 222.1151, so that s = 14.9035. The test statistic is t = (x̄ − µ0)/(s/√n) = (89.85475 − 100)/(14.9035/√20) = −3.044. The critical value of t with α = .01 and n − 1 = 19 degrees of freedom is t.01 = 2.539 and the rejection region is t < −2.539. Since the observed value of the test statistic falls in the rejection region, H0 is rejected, and we conclude that the mean is less than 100.
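The same summary-statistic arithmetic drives the test in Exercise 10.9. A minimal Python sketch (standard library only, with the Table 4 critical value hard-coded):

```python
import math

# Exercise 10.9: left-tailed test of H0: mu = 100 vs Ha: mu < 100 at alpha = .01.
n = 20
sum_x = 1797.095
sum_x2 = 165697.7081

xbar = sum_x / n                          # 89.85475
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)    # 222.1151
s = math.sqrt(s2)                         # 14.9035
t_stat = (xbar - 100) / (s / math.sqrt(n))  # about -3.044
t_01 = 2.539                              # t.01 with 19 df, from Table 4
reject = t_stat < -t_01                   # True: t falls in the rejection region
print(t_stat, reject)
```

Because −3.044 < −2.539, `reject` is True, matching the decision reached from the table.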