Analysis of Variance
OPRE 6301 Introduction . . .
The purpose of analysis of variance is to compare two or more populations of interval data. Specifically, we are interested in determining whether differences exist between the population means.
Although we are interested in a comparison of means, the method works by analyzing the sample variance. The intuitive reason for this is that by studying the “sources” of variation in the sample data, we can decide whether any observed differences among multiple sample means can be attributed to chance, or whether they are indicative of actual differences among the means of the corresponding populations.
Analysis of variance is one of the most widely used meth- ods in statistics.
1 One-Way Analysis of Variance . . .
Consider the following scenarios:
— We wish to decide on the basis of sample data whether there is really a difference in the effectiveness of three methods of teaching computer programming. — We wish to compare the average yield per acre of wheat when different types of fertilizers are used. — We wish to find out whether differences exist between the mean travel time from home to work/school along three different routes. — We wish to find out whether differences exist between average starting salary for MBA graduates from dif- ferent schools.
In all of these examples, we are trying to find out the effect of a single “factor” on a population; this explains the description “one-way.”
2 Example: Advertising Strategy . . .
We will illustrate the method using the following concrete example.
An apple juice manufacturer is planning to develop a new product — a liquid concentrate. The marketing manager has to decide how to market the new product. Three strategies are considered: • Emphasize convenience of using the product. • Emphasize the quality of the product. • Emphasize the product’s low price. An experiment was conducted as follows: • An advertising campaign was launched in three cities. • In each city, only one of the three characteristics of the new product (convenience, quality, and price) was emphasized.
3 Weekly sales were recorded for twenty weeks following
the beginning of the campaigns (see Xm15-01.xls).
C o n v n c e Q u a l i t y P r i c e
5 2 9 8 0 4 6 7 2
6 5 8 6 3 0 5 3 1
7 9 3 7 7 4 4 4 3
5 1 4 7 1 7 5 9 6
6 6 3 6 7 9 6 0 2
7 1 9 6 0 4 5 0 2
7 1 1 6 2 0 6 5 9
6 0 6 6 9 7 6 8 9
4 6 1 7 0 6 6 7 5
5 2 9 6 1 5 5 1 2
4 9 8 4 9 2 6 9 1
6 6 3 7 1 9 7 3 3
6 0 4 7 8 7 6 9 8
4 9 5 6 9 9 7 7 6
4 8 5 5 7 2 5 6 1
5 5 7 5 2 3 5 7 2
3 5 3 5 8 4 4 6 9
5 5 7 6 3 4 5 8 1
5 4 2 5 8 0 6 7 9
6 1 4 6 2 4 5 3 2
4 Research Hypothesis
We wish to answer the following question: Do differences in sales exist between the test markets (i.e., advertising strategies)? In other words, our research hypothesis is that at least two of the three mean sales differ.
Therefore,
H0 : µ1 = µ2 = µ3 H1 : At least two of the µjs differ
We now need to develop an appropriate statistic to test the above pair of hypotheses.
5 Notation
Let k = total number of populations (k = 3 in this example) nj = sample size for the jth population (nj = 20 for all 1 ≤ j ≤ k in this example) k n = j=1 nj , the total number of observations
Xij =P the ith observation from the jth population, where 1 ≤ j ≤ k and for given j,1 ≤ i ≤ nj.
X¯j = the sample mean of all observations from the jth population, for 1 ≤ j ≤ k; formally, n 1 j X¯ = X . j n ij j i=1 X sj = the standard deviation of all observations from the jth population X¯ = the grand mean of all observations; formally, n 1 k j 1 k X¯ = X = n X¯ . n ij n j j j=1 i=1 j=1 X X X
6
Thus,
F i r s t o b s e r v a t i o n ,
X 1 1
X X
1 1 k
2
f
i r s t s a m p l e
1
x 2
k
x x
2 2 2
.
. .
.
S
e c o n d o b s e r v a t i o n ,
. .
.
. .
X n 1 , 1
s e c o n d s a m p l e
X X
n , n k , k
2 2
n
1
n n
2
k
x
1
x x
2
k
S a m p l e s i z e
S a m p l e m e a n
Note that in general, the njs do not have to be the same.
7 Generic Terminology
In the context of this example,
Response Variable: weekly sales, denoted by X above Responses: actual sales values, which are the observed
values of the Xijs Experimental Unit: weeks in the three cities when we recored sales figures Factor: the criterion by which we classify the popula- tions — advertising strategy; another possible factor is advertising medium (TV, newspaper, or internet) Factor Levels: possible treatments for a given fac- tor — convenience, quality, or price
8 Test Statistic
We will consider two measures of variability. The idea is
depicted below ...
3 0
2 5
2 0
x 3 =
2 0
3
x =
2 0
2 0
1 9
1 5
x 2 =
1 5
1 6
x 2 =
1 5
1 4
1 0
x 1 =
1 2
1 1
1 0
1
x =
1 0
1 0
9
9
7
A s m a l l v a r i a b i l i t y w i t h i n T h e s a m p l e m e a n s a r e t h e s a m e a s b e f o r e ,
1
t h e s a m p l e s m a k e s i t e a s i e r b u t t h e l a r g e r w i t h i n s a m p l e v a r i a b i l i t y
T r e a t m e n t 3
T r e a t m e n t 1 T r e a t m e n t 2 T r e a t m e n t 3 T r e a t m e n t 1 T r e a t m e n t 2
t o d r a w a c o n c l u s i o n a b o u t t h e m a k e s i t h a r d e r t o d r a w a c o n c l u s i o n
p o p u l a t i o n m e a n s . a b o u t t h e p o p u l a t i o n m e a n s .
9 Formally, we assume that the Xijs are independent and normally distributed with means µj and a common vari- ance σ2. That is,
X¯ij = µj + ǫij , (1) where the ǫijs, the errors, are normally distributed with mean 0 and variance σ2.
Now, if the null hypothesis is true, say with all µjs equal to µ, then the key idea is that we can estimate σ2 in two ways . . .
10 Method 1: For each j, the nj observations X1j, X2j,
..., Xnj,j can be viewed as a sample of size nj from a normal population with mean µ and variance σ2. It follows that n 1 j (X − X¯ )2 σ2 ij j i=1 2 X has a χ distribution with nj − 1 degrees of freedom (see equation (3) in notes for Chapter 12). Furthermore, summing the above over j shows that
n 1 k j (X − X¯ )2 σ2 ij j j=1 i=1 X X has a χ2 distribution with n − k degrees of freedom k (recall that j=1 nj = n). 2 2 Since E(χ )=P ν, we see that σ can be estimated by n 1 k j MSE ≡ (X − X¯ )2 , (2) n − k ij j j=1 i=1 X X where MSE is for mean square for errors (cf. the white groupings in the previous figure).
11 Method 2: Recall that for each j, the sample mean X¯j also is a normally distributed variable with mean µ 2 and variance σ /nj. Since we have one mean for each j, i.e., for each treatment, it is intuitive that one should be able to estimate σ2 by looking at the variability of the treatment means around the grand mean. Indeed, it can be shown that
1 k n (X¯ − X¯ )2 σ2 j j j=1 X (think of this as weighting the squared deviation (X¯j− ¯ 2 X¯ ) by nj, the number of observations for treatment j)hasa χ2 distribution with k−1 degrees of freedom. Again, since E(χ2) = ν, we see that σ2 can also be estimated by
1 k MST ≡ n (X¯ − X¯ )2 , (3) k − 1 j j j=1 X where MST is for mean square for treatments (cf. the horizontal pink lines in the previous figure).
12 Since both (2) and (3) are unbiased point estimates for σ2, we expect the two estimates to be close. This suggests that we consider the statistic
MST F ≡ . (4) MSE
If the null hypothesis is true, we expect F to be close to 1. If, on the other hand, the null hypothesis is not true, we expect F to be pulled up by a greater MST (again, refer to the previous figure). Thus, the F statistic is a measure of relative variability.
How big an F value is considered sufficient for us to reject
H0? The answer depends on the sampling distribution of the F statistic, which turns out to be the so-called (natu- rally) F distribution. (A discussion of the F distribution can be found on pp. 270–274 of the text.)
13 The F distribution is characterized by two parameters ν1 and ν2. In our setting, these parameters correspond to the degrees of freedom for MST and MSE, respectively, and hence ν1 = k − 1 and ν2 = n − k.
The F critical value for a given right-tail probability α can be found by the Excel function FINV(α, ν1, ν2).
In our example, we have ν1 =3−1=2and ν2 = 60−3= 57. For α =0.05, this yields the critical value
FINV(0.05, 2, 57) = 3.16 .
We are now ready to test H0 ...
14 The F Test
The MST and the MSE are usually computed via SST MST = k − 1 and SSE MSE = , n − k respectively, where k ¯ 2 SST ≡ nj(X¯j − X¯ ) , (5) j=1 X which is called the sum of squares for treatment, and k nj 2 SSE ≡ (Xij − X¯j) , (6) j=1 i=1 X X which is called the sum of squares for errors.
15 For our example, ...
SST:
1 2 3
k
2
j j
1
j
=
T h e g r a n d m e a n i s c a l c u l a t e d b y
= 2 0 ( 5 7 7 . 5 5 6 1 3 . 0 7 ) 2 +
+ 2 0 ( 6 5 3 . 0 0 6 1 3 . 0 7 ) 2 +
1 1 2 2 k k
+ 2 0 ( 6 0 8 . 6 5 6 1 3 . 0 7 ) 2
1 2 k
= 5 7 , 5 1 2 . 2 3
SSE:
j
2 2 3 3 2
2 2
= = 1 1
16 It follows that SST/(k − 1) F = SSE/(n − k) 57512.23/2 = 509983.50/57 = 3.23 . Since 3.23 exceeds 3.16, the F critical value at α =0.05, there is sufficient evidence to reject H0 in favor of H1 and we conclude that at least one mean sales under these three strategies differ.
As usual, we can also determine the right-side p-value associated F =3.23 via
FDIST(3.23, 2, 57) = 0.0469 ,
which is seen to be lower than 0.05.
0 . 1
0 . 0 8
0 . 0 6
p V a l u e = P ( F > 3 . 2 3 ) = . 0 4 6 9
0 . 0 4
0 . 0 2
0
0 1 2 3 4