
Hypothesis Tests: Two Independent Samples

Cal State Northridge Ψ320 Andrew Ainsworth PhD

Psy 320 - Cal State Northridge

Major Points
• What are independent samples?
• Distribution of differences between means
• An example
• Heterogeneity of variance
• Confidence limits

Independent Samples
• Samples are independent if the participants in each sample are not related in any way
• Samples can be considered independent for 2 basic reasons
  ▫ First, samples are randomly selected from 2 separate populations

Independent Samples
• Getting Independent Samples
  ▫ Method #1
[Diagram: random selection of one sample from each of two populations]

Independent Samples
• Samples can be considered independent for 2 basic reasons
  ▫ Second, participants are randomly selected from a single population
  ▫ Subjects are then randomly assigned to one of 2 samples

Independent Samples
• Getting Independent Samples
  ▫ Method #2
[Diagram: Population → Random Selection and Random Assignment → Sample #1 and Sample #2]
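The two ways of getting independent samples can be sketched in a few lines of Python. This is only an illustration using the standard `random` module; the population lists, the seed, and the function names `method_1` and `method_2` are hypothetical and do not come from the slides.

```python
import random

def method_1(pop_a, pop_b, n1, n2, rng):
    """Method #1: randomly select one sample from each of two populations."""
    return rng.sample(pop_a, n1), rng.sample(pop_b, n2)

def method_2(population, n1, n2, rng):
    """Method #2: randomly select n1 + n2 people from one population,
    then randomly assign them to two samples."""
    selected = rng.sample(population, n1 + n2)  # random selection
    rng.shuffle(selected)                       # random assignment
    return selected[:n1], selected[n1:]

rng = random.Random(320)                # fixed seed so the sketch is repeatable
people = list(range(1000))              # hypothetical population of ID numbers
sample_1, sample_2 = method_2(people, 8, 10, rng)
print(len(sample_1), len(sample_2))     # 8 10
print(set(sample_1) & set(sample_2))    # set(): nobody is in both samples
```

Either way, no participant appears in both samples, which is what makes the samples independent.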

Sampling Distributions Again
• Suppose we draw all possible pairs of independent samples of size n1 and n2, respectively, from two populations.
• Calculate $\bar{X}_1$ and $\bar{X}_2$ for each pair.
• Record the difference between these values.
• What have we created?
• The Sampling Distribution of the Difference Between Means

Sampling Distribution of the Difference Between Means

Pair   Population 1   Population 2   Calculate Statistic
1      Sample 1       Sample 2       $\bar{X}_{11} - \bar{X}_{21}$
2      Sample 1       Sample 2       $\bar{X}_{12} - \bar{X}_{22}$
3      Sample 1       Sample 2       $\bar{X}_{13} - \bar{X}_{23}$
...    ...            ...            ...
∞      Sample 1       Sample 2       $\bar{X}_{1\infty} - \bar{X}_{2\infty}$

• Mean of the sampling distribution of $\bar{X}_1$ is µ1
• Mean of the sampling distribution of $\bar{X}_2$ is µ2
• So, the mean of the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is µ1 - µ2

Sampling Distribution of the Difference Between Means
• Shape
  ▫ approximately normal if
    - both populations are approximately normal
    - or the sample sizes n1 and n2 are reasonably large.

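Sampling Distribution of the Difference Between Means
• Variability (variance sum rule)
  ▫ The variance of the sum or difference of 2 independent samples is equal to the sum of their variances

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2} = \frac{\sigma^2_{X_1}}{n_1} + \frac{\sigma^2_{X_2}}{n_2}$$

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma^2_{X_1}}{n_1} + \frac{\sigma^2_{X_2}}{n_2}}; \quad \text{remembering that } \sigma_{\bar{X}_1} = \frac{\sigma_{X_1}}{\sqrt{n_1}}$$

  ▫ Think Pythagoras: $c^2 = a^2 + b^2$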
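A small simulation can make the variance sum rule concrete. The sketch below draws many pairs of independent samples, records the difference between their means, and compares the result with $\mu_1 - \mu_2$ and with the standard error formula. The population values (means 10 and 8, standard deviations 2 and 3), the number of repetitions, and the function name `simulate_differences` are assumptions chosen for illustration; we cannot literally draw "all possible" pairs, so a large number of random pairs stands in for them.

```python
import random
import statistics
from math import sqrt

def simulate_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps, rng):
    """Draw `reps` pairs of independent samples and record mean1 - mean2."""
    diffs = []
    for _ in range(reps):
        mean1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(mean1 - mean2)
    return diffs

# Hypothetical normal populations, chosen only for illustration.
mu1, sigma1, n1 = 10.0, 2.0, 8
mu2, sigma2, n2 = 8.0, 3.0, 10

diffs = simulate_differences(mu1, sigma1, n1, mu2, sigma2, n2,
                             20000, random.Random(320))

theory_se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # variance sum rule
print("mean of differences:", round(statistics.fmean(diffs), 2),
      "| theory:", mu1 - mu2)
print("SD of differences:  ", round(statistics.stdev(diffs), 2),
      "| theory:", round(theory_se, 2))
```

With this many repetitions, the simulated mean and standard deviation of the differences should land very close to the theoretical values.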

Sampling Distribution of the Difference Between Means
• Sampling distribution of $\bar{X}_1 - \bar{X}_2$
[Figure: sampling distribution centered at µ1 - µ2, with standard error $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$]

Converting Mean Differences into Z-Scores
• In any normal distribution with a known mean and standard deviation, we can calculate z-scores to estimate probabilities.

$$z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

• What if we don’t know the σ’s?
• You got it, we guess with s!

Converting Mean Differences into t-Scores
• We can use $s_{\bar{X}_1 - \bar{X}_2}$ to estimate $\sigma_{\bar{X}_1 - \bar{X}_2}$

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$

• If:
  ▫ The variances of the samples are roughly equal (homogeneity of variance)
  ▫ We estimate the common variance by pooling (averaging)

Homogeneity of Variance

• We assume that the variances are equal because:
  ▫ If they’re from the same population they should be equal (approximately)
  ▫ If they’re from different populations they need to be similar enough for us to justify comparing them (“apples to oranges” and all that)

Pooling Variances
• So far, we have taken 1 sample estimate (e.g. mean, SD, variance) and used it as our best estimate of the corresponding population parameter
• Now we have 2 samples; the average of the 2 sample estimates is a better estimate of the parameter than either alone

Pooling Variances

• Also, if there are 2 groups and one group is larger, then it should “give” more to the average estimate (because it is a better estimate itself)
• Assuming equal variances and pooling allows us to use an estimate of the population that is smaller than if we did not assume they were equal

Calculating Pooled Variance
• Remembering that $s^2 = \frac{SS}{df} = \frac{SS}{n - 1}$
• $s_p^2$ is the pooled estimate: we pool SS and df to get

$$s_p^2 = \frac{SS_{pooled}}{df_{pooled}} = \frac{SS_1 + SS_2}{df_1 + df_2} = \frac{SS_1 + SS_2}{(n_1 - 1) + (n_2 - 1)} = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$

Calculating Pooled Variance

$$s_p^2 = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$

Method 1:
$$SS_1 = \sum (X_{i1} - \bar{X}_1)^2 \qquad SS_2 = \sum (X_{i2} - \bar{X}_2)^2$$

Method 2:
$$SS_1 = (n_1 - 1)s_1^2 \qquad SS_2 = (n_2 - 1)s_2^2$$

Calculating Pooled Variance

• When sample sizes are equal (n1 = n2) it is simply the average of the 2

$$s_p^2 = \frac{s_1^2 + s_2^2}{2}$$

• When the n's are unequal we need a weighted average

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
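The weighted-average formula can be checked with a few lines of Python. This is a minimal sketch: the function name `pooled_variance` and the summary statistics passed to it are hypothetical values chosen to show how the weighting behaves, not data from the example.

```python
def pooled_variance(n1, var1, n2, var2):
    """Weighted average of two sample variances, weighted by df = n - 1."""
    ss1 = (n1 - 1) * var1          # SS recovered from s^2 (Method 2)
    ss2 = (n2 - 1) * var2
    return (ss1 + ss2) / (n1 + n2 - 2)

# Hypothetical summary statistics, for illustration only.
print(pooled_variance(10, 4.0, 10, 6.0))   # equal n: the simple average, 5.0
print(pooled_variance(5, 4.0, 21, 6.0))    # unequal n: pulled toward the larger group
```

With equal n the result is exactly the midpoint of the two variances; with unequal n it moves toward the variance of the larger group, which is the "gives more" idea from the pooling slides.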

Calculating SE for the Differences Between Means

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

• Once we calculate the pooled variance we just substitute it in for the sigmas earlier
• If n1 = n2 then

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Example: Violent Video Games

• Does playing violent video games increase aggressive behavior?
• Two independent randomly selected/assigned groups
  ▫ Grand Theft Auto: San Andreas (violent: 8 subjects) vs. NBA 2K7 (non-violent: 10 subjects)
  ▫ We want to compare the mean number of aggressive behaviors following game play

Hypotheses (Steps 1 and 2)
• 2-tailed
  ▫ H0: µ1 - µ2 = 0, or type of video game has no impact on aggressive behaviors
  ▫ H1: µ1 - µ2 ≠ 0, or type of video game has an impact on aggressive behaviors
• 1-tailed
  ▫ H0: µ1 - µ2 ≤ 0, or GTA leads to the same number of or fewer aggressive behaviors than NBA
  ▫ H1: µ1 - µ2 > 0, or GTA leads to more aggressive behaviors than NBA

Distribution, Test and Alpha (Steps 3 and 4)
• We don’t know σ for either group, so we cannot perform a Z-test
• We have to estimate σ with s, so it is a t-test (t distribution)
• There are 2 independent groups
• So, an independent samples t-test
• Assume alpha = .05

Decision Rule (Step 5)

• df total = (n1 – 1) + (n2 – 1) = n1 + n2 – 2
  ▫ Group 1 has 8 subjects: df = 8 – 1 = 7
  ▫ Group 2 has 10 subjects: df = 10 – 1 = 9
• df total = (n1 – 1) + (n2 – 1) = 7 + 9 = 16 OR df total = n1 + n2 – 2 = 8 + 10 – 2 = 16
• 1-tailed t.05(16) = ____; if t_o > ____, reject H0 (1-tailed)
• 2-tailed t.05(16) = ____; if |t_o| > ____, reject H0 (2-tailed)

t Distribution: Critical Values of Student's t

1-tailed   0.1     0.05    0.025    0.01     0.005    0.0005
2-tailed   0.2     0.1     0.05     0.02     0.01     0.001
df
1          3.078   6.314   12.706   31.821   63.657   636.619
2          1.886   2.920   4.303    6.965    9.925    31.598
3          1.638   2.353   3.182    4.541    5.841    12.941
4          1.533   2.132   2.776    3.747    4.604    8.610
5          1.476   2.015   2.571    3.365    4.032    6.859
6          1.440   1.943   2.447    3.143    3.707    5.959
7          1.415   1.895   2.365    2.998    3.499    5.405
8          1.397   1.860   2.306    2.896    3.355    5.041
9          1.383   1.833   2.262    2.821    3.250    4.781
10         1.372   1.812   2.228    2.764    3.169    4.587
11         1.363   1.796   2.201    2.718    3.106    4.437
12         1.356   1.782   2.179    2.681    3.055    4.318
13         1.350   1.771   2.160    2.650    3.012    4.221
14         1.345   1.761   2.145    2.624    2.977    4.140
15         1.341   1.753   2.131    2.602    2.947    4.073
16         1.337   1.746   2.120    2.583    2.921    4.015
17         1.333   1.740   2.110    2.567    2.898    3.965
18         1.330   1.734   2.101    2.552    2.878    3.922
19         1.328   1.729   2.093    2.539    2.861    3.883
20         1.325   1.725   2.086    2.528    2.845    3.850
21         1.323   1.721   2.080    2.518    2.831    3.819
22         1.321   1.717   2.074    2.508    2.819    3.792
23         1.319   1.714   2.069    2.500    2.807    3.767
24         1.318   1.711   2.064    2.492    2.797    3.745
25         1.316   1.708   2.060    2.485    2.787    3.725
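The decision rule for this example can be written as a small lookup against the df = 16 row of the table. This is a sketch only: the dictionary `T_CRIT_DF16` and the function `reject_null` are hypothetical names, and the only values included are the four df = 16 entries for alpha = .05 and .01.

```python
# Critical values for df = 16, copied from the table of Student's t.
# Outer key: number of tails; inner key: alpha.
T_CRIT_DF16 = {
    1: {0.05: 1.746, 0.01: 2.583},
    2: {0.05: 2.120, 0.01: 2.921},
}

def reject_null(t_obs, alpha=0.05, tails=2):
    """Return True when the observed t falls beyond the critical value."""
    cutoff = T_CRIT_DF16[tails][alpha]
    # A 2-tailed test rejects in either direction, so compare |t|.
    return abs(t_obs) > cutoff if tails == 2 else t_obs > cutoff

n1, n2 = 8, 10
df = (n1 - 1) + (n2 - 1)
print(df)                          # 16
print(T_CRIT_DF16[1][0.05])        # 1.746
print(T_CRIT_DF16[2][0.05])        # 2.12
print(reject_null(1.0, tails=1))   # False (1.0 is a made-up t value)
```

The 2-tailed cutoff is larger than the 1-tailed cutoff at the same alpha because the 5% rejection region is split between the two tails.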

Calculate t_o (Step 6): Data

Subject   GTA     NBA
1         12      9
2         9       9
3         11      8
4         13      6
5         8       11
6         10      7
7         9       8
8         10      11
9         -       8
10        -       7

Mean      10.25   8.4
s         1.669   1.647
s²        2.786   2.713
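The summary rows can be reproduced from the raw scores with Python's standard `statistics` module. The variable names `gta` and `nba` are simply labels for the two columns of data.

```python
import statistics

gta = [12, 9, 11, 13, 8, 10, 9, 10]          # violent game, n = 8
nba = [9, 9, 8, 6, 11, 7, 8, 11, 8, 7]       # non-violent game, n = 10

for name, scores in (("GTA", gta), ("NBA", nba)):
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)             # sample SD (n - 1 in the denominator)
    var = statistics.variance(scores)         # sample variance
    print(f"{name}: n={len(scores)} mean={mean:.2f} s={sd:.3f} s2={var:.3f}")
```

Computed from the raw scores, the NBA variance comes out at about 2.711; the 2.713 shown in the table is what results from squaring the rounded s = 1.647, so the two agree to within rounding.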

Calculate t_o (Step 6)

• We have means of 10.25 (GTA) and 8.4 (NBA), but both of these are sample means.
• We want to test differences between sample means.
  ▫ Not between a sample and a population mean

Calculate t_o (Step 6)
• The t for 2 groups is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$

• But under the null µ1 - µ2 = 0, so:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$$

Calculate t_o (Step 6)
• We need to calculate $s_{\bar{X}_1 - \bar{X}_2}$, and to do that we need to first calculate $s_p^2$

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

s_p² = [ __(____) + __(____) ] / ( __ + __ - 2 ) = ( _____ + _____ ) / ____ = ______

Calculate t_o (Step 6)
• Since $s_p^2$ is _____ we can insert this into the equation for $s_{\bar{X}_1 - \bar{X}_2}$

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$

s = √( ____/__ + ____/__ ) = √( ___ + ___ ) = ____

Therefore,

$$t_o = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$$

t_o = ( ___ - ___ ) / ____ = ____ / ____ = ___
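For checking hand calculations, the whole of Step 6 can be sketched in Python from the raw scores. The function name `independent_t` is hypothetical; the formulas are the pooled variance, the standard error of the difference, and the t ratio under the null hypothesis µ1 - µ2 = 0.

```python
import statistics
from math import sqrt

def independent_t(sample1, sample2):
    """Pooled-variance t for two independent samples (null: mu1 - mu2 = 0)."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)   # pooled variance
    se = sqrt(sp2 / n1 + sp2 / n2)                              # SE of the difference
    t_obs = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    return sp2, se, t_obs

gta = [12, 9, 11, 13, 8, 10, 9, 10]
nba = [9, 9, 8, 6, 11, 7, 8, 11, 8, 7]

sp2, se, t_obs = independent_t(gta, nba)
print(f"pooled variance = {sp2:.3f}, SE = {se:.4f}, t = {t_obs:.3f}")
```

The standard error and t from this sketch should match the "equal variances assumed" row of the SPSS output at the end of the deck (.78571 and 2.355). Working from the rounded variances 2.786 and 2.713 by hand gives a pooled variance of about 2.745 rather than 2.744, a difference that does not change t at three decimals.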

Make your decision (Step 7)
• Since ____ > ____ we would reject the null hypothesis under a 2-tailed test
• Since ____ > ____ we would reject the null hypothesis under a 1-tailed test
• There is evidence that violent video games increase aggressive behaviors

Heterogeneous Variances
• Refers to the case of unequal population variances
• We don’t pool the sample variances; we just add the two terms $s_1^2/n_1$ and $s_2^2/n_2$
• We adjust df and look t up in tables for the adjusted df
• For the adjustment, use minimum df = smaller n - 1
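Applied to the video game data, the unpooled calculation looks like the sketch below. The function name `unpooled_t` is hypothetical. The slide's adjustment is the conservative minimum df; the Welch-Satterthwaite df computed alongside it is not on the slide and is included here as an assumption about where the fractional df in the SPSS output comes from.

```python
import statistics
from math import sqrt

def unpooled_t(sample1, sample2):
    """t without pooling: add the two variance terms instead of averaging them."""
    n1, n2 = len(sample1), len(sample2)
    a = statistics.variance(sample1) / n1      # s1^2 / n1
    b = statistics.variance(sample2) / n2      # s2^2 / n2
    se = sqrt(a + b)
    t_obs = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    df_min = min(n1, n2) - 1                   # rule from the slide: smaller n - 1
    # Assumption (not from the slide): Welch-Satterthwaite approximation.
    df_welch = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return se, t_obs, df_min, df_welch

gta = [12, 9, 11, 13, 8, 10, 9, 10]
nba = [9, 9, 8, 6, 11, 7, 8, 11, 8, 7]

se, t_obs, df_min, df_welch = unpooled_t(gta, nba)
print(f"SE = {se:.5f}, t = {t_obs:.3f}, min df = {df_min}, Welch df = {df_welch:.3f}")
```

The standard error, t, and Welch df should agree with the "equal variances not assumed" row of the SPSS output (.78697, 2.351, and 15.048). The minimum-df rule gives 7, which is more conservative because it sends us to a larger critical value.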

Effect Size for Two Groups
• Extension of what we already know.
• We can often simply express the effect as the difference between means (e.g. 1.85)
• We can scale the difference by the size of the standard deviation.
  ▫ Gives d (aka Cohen’s d)
  ▫ Which standard deviation?

Effect Size
• Use either standard deviation or their pooled average.
  ▫ We will pool because neither has any claim to priority.
  ▫ Pooled st. dev. = √2.745 = 1.657

Effect Size, cont.

$$d = \frac{\bar{X}_{viol} - \bar{X}_{nonviol}}{s_{pooled}} = \frac{10.25 - 8.4}{1.657} = \frac{1.85}{1.657} = 1.12$$

• This difference is approximately 1.12 standard deviations
• This is a very large effect
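The same effect size can be computed directly from the raw scores. This is a minimal sketch with a hypothetical function name, `cohens_d`, using the pooled standard deviation as the scaling unit.

```python
import statistics
from math import sqrt

def cohens_d(sample1, sample2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    ss1 = (n1 - 1) * statistics.variance(sample1)
    ss2 = (n2 - 1) * statistics.variance(sample2)
    s_pooled = sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return (statistics.mean(sample1) - statistics.mean(sample2)) / s_pooled

gta = [12, 9, 11, 13, 8, 10, 9, 10]          # violent game
nba = [9, 9, 8, 6, 11, 7, 8, 11, 8, 7]       # non-violent game

print(round(cohens_d(gta, nba), 2))           # 1.12
```

Dividing by the pooled standard deviation expresses the 1.85-point difference in standard deviation units, so it can be compared across studies that use different measures.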

Confidence Limits

$$CI_{.95} = (\bar{X}_1 - \bar{X}_2) \pm (t_{.05,\ 2\text{-tailed}} \times s_{\bar{X}_1 - \bar{X}_2})$$

= (____ - ____) ± (____ × ____)
= ____ ± ____
= ____ < µ1 - µ2 < ____
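The interval can be checked with a short calculation. In this sketch the function name `confidence_limits` is hypothetical; the inputs are the two sample means from the data slide, the 2-tailed .05 critical value for df = 16 from the t table (2.120), and the standard error of the difference reported in the SPSS output (.78571).

```python
def confidence_limits(mean_diff, t_crit, se):
    """Return (lower, upper) limits: mean difference plus or minus t * SE."""
    margin = t_crit * se
    return mean_diff - margin, mean_diff + margin

mean_diff = 10.25 - 8.4     # GTA mean minus NBA mean
t_crit = 2.120              # t table: df = 16, 2-tailed alpha = .05
se = 0.78571                # SE of the difference (pooled), from the SPSS output

lower, upper = confidence_limits(mean_diff, t_crit, se)
print(f"{lower:.3f} < mu1 - mu2 < {upper:.3f}")
```

The limits come out very close to the SPSS values of .18436 and 3.51564; the small gap is due to rounding the critical value to 2.120. Because both limits are above 0, the interval excludes a difference of zero.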

Confidence Limits
• p = .95 that, based on the sample values, the interval formed in this way includes the true value of µ1 - µ2
• The probability that the interval includes µ1 - µ2, given the value of $\bar{X}_1 - \bar{X}_2$
• Does 0 fall in the interval? What does that mean?

Computer Example

• SPSS reproduces the t value and the confidence limits.
• All other statistics are the same as well.
• Note the different df depending on homogeneity of variance.
• Remember: don’t pool if there is heterogeneity, and modify the degrees of freedom

SPSS output

Group Statistics (AGGRESS)

GAME       N    Mean      Std. Deviation   Std. Error Mean
1.00 gta   8    10.2500   1.66905          .59010
2.00 nba   10   8.4000    1.64655          .52068

Independent Samples Test (AGGRESS)

Levene's Test for Equality of Variances: F = .005, Sig. = .942

t-test for Equality of Means

                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       2.355   16       .032              1.8500            .78571                  .18436         3.51564
Equal variances not assumed   2.351   15.048   .033              1.8500            .78697                  .17308         3.52692
