Test of a Hypothesis About Μ When Value of Sigma Is Unknown I s1

PSY 2010 Lecture Notes One Way Analysis of VarianceExample problemA professor teaches the same class to students from three different populations. The first is a population of "regular" day students. The second is a population of students attending at night. The third is a population of students working for a large corporation and meeting in a room provided by the corporation. The same test is given to all three classes. The professor wonders whether the mean final exam performance of students in the three populations will be equal. DataRegular: 58 69 67 80 91 86 94 Night: 82 92 96 99 86 93 99 Corporate: 72 85 89 75 79 80 94 Summary statisticsGroup Mean SD Regular 77.86 13.508 Night 92.43 6.451 Corporate 82.00 7.789The main question here is this: How can the means of three populations be compared?It appears that the night students did best and that the regular students did worst with the corporate students in the middle.But are those sample differences statistically significant?How can the sample mean differences be used to make a statement about differences between the three populations of students represented here – Regular, Night, or Corporate?One possibility: Multiple independent groups t-tests.1st t-test: Regular vs. Night 2nd t-test: Regular vs. Corporate 3rd t-test: Night vs. CorporateUse the results of those 3 t-tests to determine whether or not the 3 population means were equal or not.Two problems with the multple t-test solution:This method is not efficient. What if there were 5 groups? That would require 10 t-tests!!!This method does not permit precise computation of a p-value for the main question: Are there differences in the means of the populations represented by these 3 samples?Solution: The analysis of variance. Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 1 5/18/2018 One Way Analysis of Variance: Overview(One way means that the groups differ on only one dimension – type of student in the above example.)SituationWe wish to test the hypothesis that the means of several (2 or more) populations are equal. The populations are independent - in the sense of the independent groups t-test. This means that the scores in the several populations are not matched or paired. (There is a matched/paired scores version of the analysis of variance, but we won’t cover it in this semester.)General ProcedureTake a sample of scores from each population. The sample sizes need not be equal. Compute the mean of each sample. If the values of the sample means are all "close" to each other, then retain the hypothesis of equality of the population means. But, if the values of the sample means are not "close" to each other, then reject the hypothesis of equality of the population means.As in previous hypothesis testing situations, we first translate “close” and “far” into probabilistic terms. As before, for many data analysts “far” is any outcome whose probability is .05 or less. You can substitute your own significance level for .05. To determine the probability associated with an outcome, statisticians employ a statistic, originally conceived by Sir R. A. Fisher, called the F statistic.Test Statistic: F StatisticEqual sample size formula Common Sample size * Variance of Means F = ------Mean of Sample variances where n = common sample sizeK = No. of means being compared2 = Variance of sample means SX 2 Si = Variance of scores within group iBiderman’s P2010 Handouts Topic 15: Analysis of Variance - 2 5/18/2018 Unequal Sample Size formulaWhew!! I’m glad we have computers to do this. where ni = No. of scores in group iN = n1 + n2 + . . . + nK = Total no. of scores observed.X - = Mean of all the N scores.Numerator df = K - 1Denominator df = N - KBiderman’s P2010 Handouts Topic 15: Analysis of Variance - 3 5/18/2018 Why do the above? More Than You Ever Wanted to Know about FThe analysis of variance involves estimating the population variance in two different ways and then comparing those two estimates.The numerator of the F formulaThe numerator of the F statistic is called Mean Square Between Groups. It is an estimate of the variance of the scores in the population computed from the variability of the sample means.It’s based on the following from Chapter 15, p. 178: (Standard error of the mean.)2 2 2 2 This means that σ X-bar = σ / n. If so, then n * σ X-bar = σ2 2 We don’t know the values of σ X-bar, but we can estimate it using S X-bar, the variance of the means of the samples.2 So, we have an estimate of the population variance, n * S X-bar.This is an unbiased estimate of σ2 if the population means are equal. Key facts. 2 2 But if the population means are not equal, n * S X-bar will be larger than σ .The denominator of the F formulaThe denominator of the F statistic is called Mean Square Within Groups. The denominator is simply the arithmetic average, i.e., the mean, of the individual sample variances. Mean Square Within Groups is also an estimate of σ2. It is always unbiased.The F statisticThe F statistic is simply the ratio of the two values: 2 Mean Square Between Groups n * S X-bar F = ------= ------Mean Square Within Groups Average of within-group variancesValues expected if the Null Hypothesis is trueIf the null hypothesis of no difference in population means is true, the value of F should be about equal to 1. Values not expected if the Null is true If the null is false, F should be larger than 1. Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 4 5/18/2018 One Way Analysis of Variance: Worked Out ExampleProblemA professor teaches the same class to students from three different populations. The first is a population of "regular" day students. The second is a population of students attending at night. The third is a population of students working for a large corporation and meeting in a room provided by the corporation. The same test is given to all three classes. The professor wonders whether the mean final exam performance of students in the three populations will be equal. Statement of Hypotheses In SPSS . . . H0: µ1 = µ2 = µ3. H1: At least 1 inequality.Test statisticF statistic for the one way analysis of variance.DataRegular: 58 69 67 80 91 86 94 Night: 82 92 96 99 86 93 99 Corporate: 72 85 89 75 79 80 94 Summary statisticsGroup Mean SD Regular 77.86 13.508 Night 92.43 6.451 Corporate 82.00 7.789Variance of the sample means is 7.5082 = 56.370Conclusion7*7.5082 394.590 F = ------= ------= 4.157 (13.5082+6.4512+7.7892) ------94.917 3 The F value of 4.157 is larger than 1. Since we would expect it to be equal to 1 if the null were true, this kind of suggests that the null is not true. But it might be possible to get an F of 4.157 even when the null was true. What would be the probability of that? The value we’re looking for is the p-value for the F statistic. If that p-value is less than or equal to .05, then we can feel comfortable in rejecting the null hypothesis of equal population means. Otherwise we’ll retain it.The following shows how SPSS was used to conduct the analysis. The SPSS output reports the p-value associated with the F statistic. Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 5 5/18/2018 One way analysis of variance using SPSS Analyze -> Compare Means -> OnewayPut the name of the variable being analyzed (the dependent variable) in this box.Click on the Options button to open the Options Dialog box.Put the name of the variable which designates the groups being compared in this box.Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 6 5/18/2018 OnewayDescriptives examscorN Mean Std. Deviation Std. Error 95% Confidence Interval for Mean Minimum MaximumLower Bound Upper Bound1 7 77.86 13.508 5.106 65.36 90.35 58 942 7 92.43 6.451 2.438 86.46 98.40 82 993 7 82.00 7.789 2.944 74.80 89.20 72 94Total 21 84.10 11.175 2.439 79.01 89.18 58 99ANOVA examscorSum of Squares df Mean Square F Sig.Between Groups 789.238 2 394.619 4.157 .033Within Groups 1708.571 18 94.921Total 2497.810 20Means PlotsBiderman’s P2010 Handouts Topic 15: Analysis of Variance - 7 5/18/2018 A Second ExampleAn Industrial/Organizational Psychologist compared three methods of teaching workers the safety rules which were supposed to be in force in the organization. Method L consisted of two days of classes, mostly lecture. Method C consisted of two days of interactive computer games designed to teach the basics of safety in the organization. Method W consisted of two days of filling out a workbook covering the safety rules. The dependent variable in the research was the number of problems answered correctly on a 50 question multiple choice exam given on the last day of training. The data are as follows:Group L 30 31 32 28 39 30 33 27 28 29 34 35 32 25 28 29 32 30 31 35 27 26 31 30 29 28 34 31 29 30Group C 23 29 30 27 28 31 32 33 29 27 28 30 29 30 27 29 26 25 24 28 29 30 34 31 30 24 26 27 30 22Group W 42 42 41 43 41 40 48 43 48 42 46 47 40 41 40 38 39 41 46 45 38 39 39 40 43 46 40 48 49 50HypothesesNull Hypothesis: µL = µC = µWAlternative Hypothesis: Some inequality between the values of the means.Post Hoc TestsIf the p-value associated with F is larger than .05, then that means there are no differences between the population means. We retain the null and that’s it.But if the p-value is <= .05, then that means there is “some inequality between the values of the means.”Unfortunately, that’s all the overall test tells us. It does NOT tell us where the inequality lies.For this reason, analysts usually follow up a significant overall test with special tests, called post hoc tests, designed to identify just which means are different.In this example, the use of the Tukey b post hoc test is illustrated.Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 8 5/18/2018 A portion of the data matrixSpecifying the analyses: Analyze -> Compare means -> One-Way ANOVA . . .Click on the Post Hoc button to open the Post Hocs dialog box.Check Tukey’s-b.Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 9 5/18/2018Click on the Options button to choose options.Check Descriptive, Homogeneity of variance test and Means plot. Click on the Options button to choose options.Check Descriptive, Homogeneity of variance test and Means plot.The Output OnewayDescriptives numcorrectN Mean Std. Deviation Std. Error 95% Confidence Interval for Mean Minimum MaximumLower Bound Upper Bound1 18 30.67 3.290 .775 29.03 32.30 25 392 42 28.79 2.876 .444 27.89 29.68 22 353 30 42.83 3.563 .651 41.50 44.16 38 50Total 90 33.84 7.167 .755 32.34 35.35 22 50ANOVA numcorrectSum of Squares df Mean Square F Sig.Between Groups 3680.584 2 1840.292 179.644 .000Within Groups 891.238 87 10.244Total 4571.822 89Biderman’s P2010 Handouts Topic 15: Analysis of Variance - 10 5/18/2018 Post Hoc TestsHomogeneous Subsets numcorrect Means that are in different Tukey B columns are significantly group N Subset for alpha = 0.05 different from each other. 1 22 42 28.79 Means that are in the same column are NOT significantly 1 18 30.67 different. 3 30 42.83 Means for groups in homogeneous subsets are displayed. In this instance, the Group 3 a. Uses Harmonic Mean Sample Size = 26.620. mean is significantly larger than b. The group sizes are unequal. The harmonic mean of the group either the Group 2 or Group 1 sizes is used. Type I error levels are not guaranteed. mean.Means PlotsBiderman’s P2010 Handouts Topic 15: Analysis of Variance - 11 5/18/2018

Test of a Hypothesis About Μ When Value of Sigma Is Unknown I s1

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support