
INFERENCE FOR CONTRASTS (Chapter 4)

Recall: A contrast is a linear combination of effects

\[ \sum_{i=1}^{v} c_i \tau_i \]

with coefficients summing to zero: \( \sum_{i=1}^{v} c_i = 0 \).

Specific types of contrasts of interest include:

• Differences in effects
• Differences in means
• A special type of difference in means, often of interest in an experiment with a control group: the difference between the control group effect and the mean of the other treatment effects.

Recall: The least squares estimator of the contrast \( \sum_{i=1}^{v} c_i \tau_i \) is

\[ \sum_{i=1}^{v} c_i \hat{\tau}_i = \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} . \]

It's unbiased:

\[
E\left( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} \right)
= \sum_{i=1}^{v} c_i E(\overline{Y}_{i\cdot})
= \sum_{i=1}^{v} c_i (\mu + \tau_i)
= \sum_{i=1}^{v} c_i \mu + \sum_{i=1}^{v} c_i \tau_i
= \sum_{i=1}^{v} c_i \tau_i ,
\]

since \( \sum_{i=1}^{v} c_i \mu = \left( \sum_{i=1}^{v} c_i \right) \mu = 0 \).
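To make the estimator concrete, here is a minimal Python sketch (not part of the original notes) that computes \( \sum c_i \overline{Y}_{i\cdot} \) for made-up data with v = 3 treatment groups; all data values, group sizes, and coefficients below are hypothetical.

    import numpy as np

    # Hypothetical observations Y_it for v = 3 treatment groups
    # (unequal replications r_i are allowed).
    groups = [
        np.array([23.1, 24.8, 22.5]),        # treatment 1
        np.array([27.0, 26.2, 28.1, 27.4]),  # treatment 2
        np.array([25.5, 24.9, 26.0]),        # treatment 3
    ]
    c = np.array([1.0, -0.5, -0.5])          # contrast coefficients

    assert abs(c.sum()) < 1e-12              # a contrast requires sum(c_i) = 0

    ybar = np.array([g.mean() for g in groups])  # group means Ybar_i.
    estimate = c @ ybar                          # least squares estimate of sum c_i tau_i
    print(estimate)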

Recall the model assumptions:

• \( Y_{it} = \mu + \tau_i + \epsilon_{it} \).
• For each i and t, \( \epsilon_{it} \sim N(0, \sigma^2) \).
• The \( \epsilon_{it} \) are independent random variables.

These imply that the \( Y_{it} \)'s are independent, with

\[ Y_{it} \sim N(\mu + \tau_i ,\ \sigma^2). \]

Since the \( Y_{it} \)'s are independent, each \( \overline{Y}_{i\cdot} \), as a linear combination of independent normal random variables, is also normal. Since each \( \overline{Y}_{i\cdot} \) is a linear combination of the \( Y_{it} \)'s for the i-th treatment group only, it follows that the \( \overline{Y}_{i\cdot} \)'s are independent. Thus

\[
\operatorname{Var}\left( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} \right)
= \sum_{i=1}^{v} c_i^2 \operatorname{Var}(\overline{Y}_{i\cdot})
= \sum_{i=1}^{v} c_i^2 \frac{\sigma^2}{r_i}
= \sigma^2 \sum_{i=1}^{v} \frac{c_i^2}{r_i} .
\]

Since the contrast estimator \( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} \) is a linear combination of the independent normal random variables \( \overline{Y}_{i\cdot} \), it too must be normal. Summarizing:

\[
\sum_{i=1}^{v} c_i \overline{Y}_{i\cdot}
\sim N\left( \sum_{i=1}^{v} c_i \tau_i ,\ \sigma^2 \sum_{i=1}^{v} \frac{c_i^2}{r_i} \right).
\]
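The mean and variance in this summary are easy to check by simulation. The following Python sketch (my addition, with assumed values of \( \mu, \sigma, \tau_i, r_i \)) draws many replicates of the contrast estimator and compares its empirical mean and variance with \( \sum c_i \tau_i \) and \( \sigma^2 \sum c_i^2 / r_i \).

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 10.0, 2.0
    tau = np.array([0.0, 1.5, -1.5])   # assumed treatment effects
    r = np.array([3, 4, 3])            # replications r_i
    c = np.array([1.0, -0.5, -0.5])    # contrast coefficients

    # Simulate the contrast estimator sum c_i Ybar_i. many times.
    n_sim = 100_000
    ybar = np.column_stack([
        rng.normal(mu + tau[i], sigma, size=(n_sim, r[i])).mean(axis=1)
        for i in range(3)
    ])
    est = ybar @ c

    print(est.mean(), c @ tau)                     # mean matches sum c_i tau_i
    print(est.var(), sigma**2 * np.sum(c**2 / r))  # variance matches sigma^2 sum c_i^2/r_i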

Standardizing,

\[
\frac{ \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} - \sum_{i=1}^{v} c_i \tau_i }
     { \sigma \sqrt{ \sum_{i=1}^{v} c_i^2 / r_i } }
\sim N(0,1). \qquad (*)
\]

Using the estimate msE for \( \sigma^2 \), we obtain the standard error for the contrast estimator \( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} \):

\[
\operatorname{se}\left( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} \right)
= \sqrt{ \mathrm{msE} \sum_{i=1}^{v} \frac{c_i^2}{r_i} } .
\]

In terms of random variables:

\[
\operatorname{SE}\left( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} \right)
= \sqrt{ \mathrm{MSE} \sum_{i=1}^{v} \frac{c_i^2}{r_i} } .
\]

Replacing the standard deviation of the contrast by the standard error in (*) gives

\[
\frac{ \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} - \sum_{i=1}^{v} c_i \tau_i }
     { \sqrt{ \mathrm{MSE} \sum_{i=1}^{v} c_i^2 / r_i } } ,
\]

which no longer has a normal distribution because of the substitution of MSE for \( \sigma \). The usual trick works:

\[
\frac{ \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} - \sum_{i=1}^{v} c_i \tau_i }
     { \sqrt{ \mathrm{MSE} \sum_{i=1}^{v} c_i^2 / r_i } }
= \frac{ \left( \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} - \sum_{i=1}^{v} c_i \tau_i \right) \Big/ \ \sigma \sqrt{ \sum_{i=1}^{v} c_i^2 / r_i } }
       { \sqrt{ \mathrm{MSE} / \sigma^2 } } . \qquad (**)
\]

As mentioned before, \( \mathrm{MSE}/\sigma^2 = \mathrm{SSE}/[(n-v)\sigma^2] \sim \chi^2(n-v)/(n-v) \). It can be proved that the numerator and denominator in (**) are independent. Thus

\[
\frac{ \sum_{i=1}^{v} c_i \overline{Y}_{i\cdot} - \sum_{i=1}^{v} c_i \tau_i }
     { \sqrt{ \mathrm{MSE} \sum_{i=1}^{v} c_i^2 / r_i } }
\sim t(n-v).
\]

We can use this as a test statistic to do inference (confidence intervals and hypothesis tests) for contrasts.

Example: In the battery experiment, treatments 1 and 2 were alkaline batteries, while types 3 and 4 were heavy duty. To compare the alkaline with the heavy duty, we consider the difference of means contrast

\[ D = \tfrac{1}{2}(\tau_1 + \tau_2) - \tfrac{1}{2}(\tau_3 + \tau_4). \]

• Find a 95% confidence interval for the contrast. (A computational sketch appears after the comments below.)
• State precisely what the resulting confidence interval means.
• Perform a hypothesis test with null hypothesis: the means for the two types are equal.

Comments:

1. For a two-sided test, we could also do an F-test with test statistic t².

2. A very similar analysis shows:

• The standard error for the i-th treatment mean \( \mu + \tau_i \) is \( \sqrt{ \mathrm{msE} / r_i } \).
• The test statistic
\[
\frac{ \overline{Y}_{i\cdot} - (\mu + \tau_i) }{ \sqrt{ \mathrm{MSE} / r_i } }
\]
has a t-distribution with n − v degrees of freedom.

So we can do hypothesis tests and form confidence intervals for a single mean.

3. We haven't done examples of finding confidence intervals or hypothesis tests for effect differences or for treatment means, since in practice in ANOVA one does not usually do just one test or confidence interval, so modified techniques for multiple comparisons are needed.

The Problem of Multiple Comparisons

Suppose we want to form confidence intervals for two means or for two effect differences. If we formed a 95% confidence interval for, say, \( \tau_1 - \tau_2 \), and another 95% confidence interval for \( \tau_3 - \tau_4 \), we would get two intervals, say (a, b) and (c, d), respectively.

These would mean:

1. We have produced (a, b) by a method which, for 95% of all completely randomized samples of the same size with the specified number in each treatment, yields an interval containing \( \tau_1 - \tau_2 \), and

2. We have produced (c, d) by a method which, for 95% of all completely randomized samples of the same size with the specified number in each treatment, yields an interval containing \( \tau_3 - \tau_4 \).

But there is absolutely no reason to believe that the 95% of samples in (1) are the same as the 95% of samples in (2). If we let A be the event that the confidence interval for \( \tau_1 - \tau_2 \) actually contains \( \tau_1 - \tau_2 \), and let B be the event that the confidence interval for \( \tau_3 - \tau_4 \) actually contains \( \tau_3 - \tau_4 \), the best we can say in general is the following:

P(obtaining a sample giving a confidence interval for \( \tau_1 - \tau_2 \) that actually contains \( \tau_1 - \tau_2 \) and also giving a confidence interval for \( \tau_3 - \tau_4 \) that actually contains \( \tau_3 - \tau_4 \))

\[
\begin{aligned}
&= P(A \cap B) = 1 - P((A \cap B)^C) \\
&= 1 - P(A^C \cup B^C) \\
&= 1 - [P(A^C) + P(B^C) - P(A^C \cap B^C)] \\
&= 1 - P(A^C) - P(B^C) + P(A^C \cap B^C) \\
&\ge 1 - P(A^C) - P(B^C) \\
&= 1 - 0.05 - 0.05 = 0.90.
\end{aligned}
\]

Similarly, if we were forming k 95% confidence intervals, our "confidence" that for all of them the corresponding true effect difference would lie in the corresponding CI would be reduced to 1 − 0.05k. Thus, other techniques are needed for such "simultaneous inference" (or "multiple comparisons").
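The bound 1 − 0.05k used above is an instance of the Bonferroni inequality, and inverting it gives one of the modified techniques alluded to here: to guarantee overall confidence of at least 95% across k intervals, build each individual interval at level 1 − 0.05/k. The short Python sketch below (my addition, with illustrative values of k, n, and v) computes both the bound and the adjusted critical value.

    from scipy import stats

    # Bonferroni bound: k individual 95% intervals guarantee only
    # overall confidence >= 1 - 0.05*k.
    for k in (2, 5, 10):
        print(k, 1 - 0.05 * k)

    # To guarantee overall 95% confidence across k two-sided intervals,
    # build each at level 1 - 0.05/k; illustrative df = n - v with n = 16, v = 4.
    k, n, v = 5, 16, 4
    print(stats.t.ppf(1 - 0.025 / k, df=n - v))  # adjusted t critical value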