BHS 307 – Statistics for the Behavioral Sciences
PSY 307 – Statistics for the Behavioral Sciences

Chapter 14 – t-Test for Two Independent Samples

Independent Samples

Observations in one sample are not paired on a one-to-one basis with observations in the other sample.

Effect: any difference between two population means.

Hypotheses:

Null H0: $\mu_1 - \mu_2 = 0$ (two-tailed) or $\mu_1 - \mu_2 \le 0$ (one-tailed)
Alternative H1: $\mu_1 - \mu_2 \ne 0$ (two-tailed) or $\mu_1 - \mu_2 > 0$ (one-tailed)

The Difference Between Two Sample Means

[Figure: the effect size, shown as $\bar{X}_1$ minus $\bar{X}_2$.]

The null hypothesis (H0) is that these two means come from underlying populations with the same mean, so the difference between them is 0 and $\mu_1 - \mu_2 = 0$.

Sampling Distribution of Differences in Sample Means

[Figure: all possible $\bar{X}_1 - \bar{X}_2$ difference scores that could occur by chance, with $\mu_1 - \mu_2$, the observed $\bar{X}_1 - \bar{X}_2$, and two critical values marked.]

Does our $\bar{X}_1 - \bar{X}_2$ exceed the critical value? YES: reject the null (H0).

What if the Difference is Smaller?

[Figure: the same distribution of all possible $\bar{X}_1 - \bar{X}_2$ difference scores that could occur by chance, with $\mu_1 - \mu_2$, the observed $\bar{X}_1 - \bar{X}_2$, and two critical values marked.]

Does our $\bar{X}_1 - \bar{X}_2$ exceed the critical value? NO: retain the null (H0).

Distribution of the Differences

In a one-sample case, the mean of the sampling distribution is the population mean.
In a two-sample case, the mean of the sampling distribution is the difference between the two population means.
The standard deviation of the difference scores is the standard error of this distribution.

Formulas for t-test (independent)

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_{hyp}}{s_{\bar{X}_1 - \bar{X}_2}}$$

Estimated standard error:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

$$s_p^2 = \frac{SS_1 + SS_2}{df} = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$

$$SS_1 = \sum X_1^2 - \frac{(\sum X_1)^2}{n_1} \qquad SS_2 = \sum X_2^2 - \frac{(\sum X_2)^2}{n_2}$$

Estimated Standard Error

Pooled variance: the variance common to both populations is estimated by combining the two sample variances.
The variance average is computed by weighting each group variance by its degrees of freedom (df), then dividing by the combined df.
Df for pooled variance: $n_1 + n_2 - 2$

Confidence Intervals for t

The confidence interval for two independent samples is:

$$\bar{X}_1 - \bar{X}_2 \pm (t_{conf})(s_{\bar{X}_1 - \bar{X}_2})$$

Find the appropriate value of t in the t table using the formula for df.
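The formulas for the independent-samples t test and its confidence interval can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function names and the two groups of scores are made up for the example, and the critical value 2.306 is the standard two-tailed .05 t-table entry for df = 8.

```python
from math import sqrt

def sum_of_squares(scores):
    """SS = sum(X^2) - (sum X)^2 / n, the computational formula."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

def independent_t(sample1, sample2, hyp_diff=0.0):
    """Pooled-variance t for two independent samples.

    hyp_diff is the hypothesized difference (mu1 - mu2), 0 under H0.
    Returns (t, df, mean difference, estimated standard error).
    """
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: combined SS divided by combined df.
    pooled_var = (sum_of_squares(sample1) + sum_of_squares(sample2)) / df
    # Estimated standard error of the difference between means.
    std_err = sqrt(pooled_var / n1 + pooled_var / n2)
    mean_diff = sum(sample1) / n1 - sum(sample2) / n2
    t = (mean_diff - hyp_diff) / std_err
    return t, df, mean_diff, std_err

def confidence_interval(mean_diff, std_err, t_conf):
    """(X1bar - X2bar) +/- t_conf * standard error; t_conf comes from a t table."""
    margin = t_conf * std_err
    return mean_diff - margin, mean_diff + margin

# Hypothetical scores, invented for illustration only.
group1 = [6, 8, 10, 12, 14]   # mean 10, SS 40
group2 = [3, 5, 7, 9, 11]     # mean 7,  SS 40

t, df, mean_diff, std_err = independent_t(group1, group2)
print(t, df)                  # 1.5 8

T_CRIT = 2.306                # t table, df = 8, two-tailed alpha = .05
print("reject H0" if abs(t) > T_CRIT else "retain H0")   # retain H0
print(confidence_interval(mean_diff, std_err, T_CRIT))   # about (-1.612, 7.612)
```

With these scores the observed t of 1.5 does not exceed the critical value, so H0 is retained, and the 95% interval includes 0, which agrees with that decision.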
Across repeated samples, intervals built this way will contain the true difference in population means the stated percentage of the time (for example, 95% of the time for a 95% confidence interval).

Assumptions

Both populations are normally distributed with equal variance.
With equal sample sizes > 10, valid results will occur even with non-normal populations.
Equate sample sizes to minimize the effects of unequal variance.
Increase sample size to minimize the effects of non-normality.

Population Correlation Coefficient

Two correlated variables are similar to a matched sample because in both cases, observations are paired.
A population correlation coefficient ($\rho$) would represent the mean of the r's for all possible pairs of samples.

Hypotheses:

H0: $\rho = 0$
H1: $\rho \ne 0$

t-Test for Rho ($\rho$)

Similar to a t-test for a single group.
Tests whether the value of r is significantly different from what might occur by chance.
Do the two variables vary together by accident or because of an underlying relationship?

Formula for t

$$t = \frac{r - \rho_{hyp}}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

The denominator is the standard error of prediction.

Calculating t for Correlated Variables

Except that r is used in place of $\bar{X}$, the formula for calculating the t statistic is the same.
The standard error of prediction is used in the denominator in place of the standard deviation.
Compare against the critical value for t with df = n – 2 (n = number of pairs).

Importance of Sample Size

Lower values of r become significant with greater sample sizes: as n increases, the critical value of t decreases, so it is easier to obtain a significant result.

Cohen's rule of thumb:

.10 = weak relationship
.30 = moderate relationship
.50 = strong relationship
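The effect of sample size on the t test for a correlation can be illustrated with a short Python sketch. It is not from the original slides: the function name `t_for_r` and the sample sizes are made up for the example, and the critical values 2.228 (df = 10) and 1.984 (df = 100) are standard two-tailed .05 t-table entries.

```python
from math import sqrt

def t_for_r(r, n_pairs, rho_hyp=0.0):
    """t = (r - rho_hyp) / sqrt((1 - r^2) / (n - 2)), with df = n - 2."""
    df = n_pairs - 2
    std_err = sqrt((1 - r ** 2) / df)
    return (r - rho_hyp) / std_err, df

# The same "moderate" r of .30 tested at two hypothetical sample sizes.
t_small, df_small = t_for_r(0.30, n_pairs=12)    # df = 10
t_large, df_large = t_for_r(0.30, n_pairs=102)   # df = 100

# Critical values from a t table, two-tailed alpha = .05.
CRIT = {10: 2.228, 100: 1.984}

for t, df in [(t_small, df_small), (t_large, df_large)]:
    decision = "reject H0" if abs(t) > CRIT[df] else "retain H0"
    print(f"df = {df}: t = {t:.3f}, critical t = {CRIT[df]}, {decision}")
```

With 12 pairs, t is about 0.99 and H0 is retained. With 102 pairs, the same r gives t of about 3.14 and H0 is rejected. Two things change as n grows: the critical value drops slightly, and the denominator of the t ratio shrinks, which makes the observed t larger.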