Topic 2. Distributions, hypothesis testing, and sample size determination

The Student t distribution (ST&D pp. 56 and 77). Consider a repeated drawing of samples of size n = 5 from a normal population. For each sample compute Ȳ, s, s_Ȳ, and another statistic, t:

t(n−1) = (Ȳ − µ)/s_Ȳ      (Remember Z = (Ȳ − µ)/σ_Ȳ)

The t statistic is the number of standard errors that separate Ȳ from its hypothesized mean µ.

df=n-1=4

Critical values satisfy |t| > |Z| → less sensitivity. This is the price we pay for being uncertain about the population variance. Fig. 1. Distribution of t (df = 4) compared to Z. The t distribution is symmetric, but wider and flatter than the Z distribution, lying under it at the center and above it in the tails.


When n increases, the t distribution tends toward the normal (Z) distribution.
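This center-versus-tails behavior can be checked numerically. The sketch below uses only the Gamma-function form of the Student t density (standard formula, not from the text); the df = 4 case corresponds to Fig. 1:

```python
import math

def t_pdf(x, df):
    """Student t density from the Gamma-function formula."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def z_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# t(4) lies below Z at the center and above it in the tails:
print(round(t_pdf(0, 4), 3), round(z_pdf(0), 3))   # 0.375 0.399
print(round(t_pdf(3, 4), 4), round(z_pdf(3), 4))   # 0.0197 0.0044
```

Increasing df in `t_pdf` brings both values toward the normal density, illustrating the convergence.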

2. 2. Confidence limits based on sample statistics (ST&D p. 77)

The general formula for any parameter θ is: estimated θ ± critical value × standard error of the estimated θ

So, for a population mean estimated via a sample mean:

  Y  t  sY ,n1 2 The statistic Y is distributed about  according to the t distribution so it satisfies

Y s Y s P{ - t /2, n-1 Y    + t /2, n-1 Y }= 1- 

For a confidence interval of size 1-, use a t value corresponding to  /2.

Therefore the confidence interval is

s s Y - t /2, n-1 Y    Y + t /2, n-1 Y

These two terms represent the lower and upper 1-  confidence limits of the mean. The interval between these terms is called confidence interval (CI).

Example: Data Set 1 of 14 Hordeum malt extraction values

Ȳ = 75.94, s = 1.23, s_Ȳ = s/√14 = 0.3279. A table gives the t_{0.025,13} value of 2.160.

95% CI for µ = 75.94 ± 2.160 × 0.3279 → [75.23, 76.65]

If we repeatedly obtained samples of size 14 from the population and constructed these limits for each, we expect 95% of the intervals to contain the true mean.
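The interval above can be reproduced with a few lines of arithmetic (a sketch; the critical value 2.160 is taken from the t table as in the text, and the small rounding of s does not change the limits at two decimals):

```python
import math

# Barley malt-extract example: n = 14, Ybar = 75.94, s = 1.23,
# table value t_{0.025,13} = 2.160 (from the text).
n, y_bar, s, t_crit = 14, 75.94, 1.23, 2.160

se = s / math.sqrt(n)                    # standard error of the mean
half = t_crit * se                       # half-length of the 95% CI
lower, upper = y_bar - half, y_bar + half
print(round(lower, 2), round(upper, 2))  # 75.23 76.65
```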

Fig. 2. Twenty 95% confidence intervals around the true mean. One out of 20 intervals does not include the true mean.


2. 3. Hypothesis testing and power of the test (ST&D 94).

Example: barley data. Ȳ = 75.94, s_Ȳ = s/√n = 0.3279, t_{0.025,13} = 2.160, CI: [75.23, 76.65]

1) Choose a null hypothesis: test H0: µ = 78 against H1: µ ≠ 78.
2) Choose a significance level: assign α = 0.05.
3) Calculate the test statistic t:

t = (Ȳ − µ)/s_Ȳ = (75.94 − 78.00)/0.3279 = −6.28

(Interpretation: the sample mean is 6.3 SE from the hypothetical mean of 78. Too far!)

4) Compare the absolute value of the test statistic to the critical value: |−6.28| > 2.16

5) Since the absolute value of the test statistic is larger, we reject H0.

This is equivalent to calculating a 95% confidence interval for the mean.

Since o (78) is not within the CI [75.23- 76.65] we reject Ho.

α is called the significance level of the test (≤ 0.05): the probability of incorrectly rejecting a true H0, a Type I error.

β is the Type II error: to incorrectly accept H0 when it is false.

                          H0 accepted            H0 rejected
Null hypothesis true      Correct decision       Type I error = α
Null hypothesis false     Type II error = β      Correct decision = Power = 1 − β

Power of the test: 1 − β is the power of the test, and represents the probability of correctly rejecting a false null hypothesis. It is a measure of the ability of the test to detect an alternative mean or a significant difference when it is real.

Note that for a given Ȳ and s, if two of the three quantities α, β, and n are specified, then the third can be determined. Choose the right number of replications to keep the Type I error α and the Type II error β under the desired limits (e.g. α ≤ 0.05 and β ≤ 0.20).


Power of a test (ST&D pg 118-119)

Power = 1 − β = P(Z ≥ Z_{α/2} − (µ1 − µ0)/σ_Ȳ)   or   P(t ≥ t_{α/2} − (µ1 − µ0)/s_Ȳ)

where (µ1 − µ0)/s_Ȳ is the difference between the two means in SE units.

What is the power for H0: µ = 74.88 in the barley data set against H1: µ = 75.94? Here α = 0.05, n = 14 (t_{0.025,13} = 2.160), and s_Ȳ = 0.32795.

75.94  74.88 Power  1    P(t  2.160  )  P(t  1.072)  0.85 0.32795

The probability of the Type II error β we are looking for is the shaded area to the left of the critical value under the lower curve.

Fig. 3. Type I and Type II errors in the barley data set. The upper curve is the distribution when H0 is true, with rejection regions of area α/2 in each tail; the lower curve is the distribution when H0 is false, with area β in the acceptance region and 1 − β in the rejection region.

The area 1 − β in the rejection region = the probability that Ȳ > 75.588 under H1 = power =

P(t > (75.588 − 75.94)/0.32795) = P(t > −1.072) = 0.85, same as above!
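The same computation can be sketched with the standard normal CDF standing in for the t(13) tail (a normal approximation; it gives ≈0.86 versus the exact t value of 0.85):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

se, t_crit = 0.32795, 2.160
shift = (75.94 - 74.88) / se       # distance between the means in SE units
power = 1.0 - phi(t_crit - shift)  # P(Z >= t_crit - shift)
print(round(power, 2))             # 0.86
```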

The magnitude of β depends on:
1. The Type I error rate α
2. The distance between the two means under consideration
3. The number of observations n (since s_Ȳ = s/√n)

When the distance between the two means is reduced, β increases.

Variation of power as a function of the distance between the alternative hypotheses (Biometry, Sokal and Rohlf); curves for n = 35 (SE = 0.7) and n = 5 (SE = 1.7).

2.3.2. Power of the test for the difference between the means of two samples

Two types of alternative to H0: µ1 − µ2 = 0:
H1: µ1 − µ2 ≠ 0 (two-tail test) → t value from the top of the t table
H1: µ1 − µ2 > 0 (one-tail test) → t value from the bottom of the t table

The general power formula, for both equal and unequal sample sizes, is:

Power = P(t ≥ t_{α/2} − |µ1 − µ2| / s_{Ȳ1−Ȳ2}) = P(t ≥ t_{α/2} − |µ1 − µ2| / √(2 s²_pooled / N))

where s²_pooled is a weighted average of the sample variances, given by:

s²_pooled = [(n1 − 1)s1² + (n2 − 1)s2²] / [(n1 − 1) + (n2 − 1)]

and N = 2 n1 n2 / (n1 + n2).

When n1 = n2 = n (equal sample sizes) the formulas reduce to:

s²_pooled = [(n1 − 1)s1² + (n2 − 1)s2²] / [(n1 − 1) + (n2 − 1)] = (n − 1)(s1² + s2²) / [2(n − 1)] = (s1² + s2²)/2

N = 2 n1 n2 / (n1 + n2) = 2n² / 2n = n

Power = P(t ≥ t_{α/2} − |µ1 − µ2| / s_{Ȳ1−Ȳ2}) = P(t ≥ t_{α/2} − |µ1 − µ2| / √(2 s²_pooled / n))

The variance of the difference between two random variables is the sum of the variances (errors always add) (ST&D 113-115).

The degrees of freedom for the critical t_{α/2} are:

General case: (n1 − 1) + (n2 − 1)
Equal sample sizes: 2(n − 1)
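The equal-n power formula can be sketched in code, again with a normal approximation to the t tail. The input values in the example calls are purely illustrative (not from the text); the t critical values 2.306 (df = 8) and 2.024 (df = 38) are standard two-sided 5% table entries:

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sample_power(mu_diff, s1_sq, s2_sq, n, t_crit):
    """Power = P(t >= t_{a/2} - |mu1 - mu2| / sqrt(2 s_pooled^2 / n)),
    for equal sample sizes n, with the t tail approximated by the Z tail."""
    s_pooled_sq = (s1_sq + s2_sq) / 2          # pooled variance when n1 = n2
    se_diff = math.sqrt(2 * s_pooled_sq / n)   # SE of the difference of means
    return 1.0 - phi(t_crit - abs(mu_diff) / se_diff)

# Hypothetical inputs: the same true difference is detected more often with larger n.
print(round(two_sample_power(1.0, 2.0, 2.0, 5, 2.306), 2))
print(round(two_sample_power(1.0, 2.0, 2.0, 20, 2.024), 2))
```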

2. 4. 2. Sample size for estimating µ when σ is known: using the Z statistic

If the population variance σ² is known, the Z statistic may be used.

Z = (Ȳ − µ)/σ_Ȳ, so CI = Ȳ ± Z_{α/2} σ_Ȳ, or Ȳ ± Z_{α/2} σ/√n

The formula for d, the half-length of the confidence interval for the mean, is

d = Z_{α/2} σ_Ȳ = Z_{α/2} σ/√n      [CI = Ȳ ± d]

This can be rearranged to estimate the required sample size in terms of the population variance. For α = 0.05:

n = Z²_{α/2} σ² / d² = Z²_{α/2} (σ/d)² = (1.96)² (σ/d)² ≈ 3.84 (σ/d)²

So if d = σ      → n ≈ 4
d = 0.5σ   → n ≈ 16
d = 0.25σ  → n ≈ 64

The equation may be expressed in terms of the coefficient of variation:

n = Z²_{α/2} σ² / d² = Z²_{α/2} CV² / (d/µ)²

CV = s/Ȳ (as a proportion, not a %); d/µ is the CI half-length as a fraction of the population mean. For example, d/µ ≤ 0.1 means that the full length (2d) of the confidence interval should not be larger than two tenths of the population mean (2d/µ ≤ 0.2).

Example: The CVs of yield trials in our experimental station are never higher than 15%. How many replications are necessary to have a 95% CI for the true mean of less than 1/10 of the average yield?

2d/µ = 0.1, so d/µ = 0.1/2 = 0.05
n = (1.96)² (0.15)² / (0.05)² = 34.6 ≈ 35

2. 4. 3. Sample size for the estimation of the mean when σ² is unknown: Stein's two-stage sample

Consider a 100(1 − α)% confidence interval about some mean µ:

Y t s Y t s - ,n1 Y    + ,n1 Y 2 2

The half-length (d) of this confidence interval is therefore:

s d  t s  t ,n1 Y ,n1 2 2 n This formula can be rearranged to estimate necessary sample size n

n = t²_{α/2, n−1} s² / d² ≈ Z²_{α/2} s² / d²

Stein's two-stage procedure involves using a pilot study to estimate s².

Note that n now appears on both sides of the equation (through the degrees of freedom of t), so an iterative approach is needed.

Example: An experimenter wants to estimate the mean height of certain plants. From a pilot study of 5 plants, he finds that s = 10 cm. What is the required sample size if he wants the total length of a 95% CI about the mean to be no longer than 5 cm (i.e. d = 2.5 cm)?

Using n = t²_{α/2, n−1} s² / d², n is estimated iteratively:

initial n    t_{5%, df}    estimated n
5            2.776         (2.776)²(10)²/2.5² = 123
123          1.96          (1.96)²(10)²/2.5² = 62
62           2.00          64
64           2.00          64

Thus with 64 observations, he could estimate the true mean with a CI no longer than 5 cm at α = 0.05.

To accelerate the iteration you can start with Z:

n = Z² s² / d² = (1.96)²(10)²/2.5² = 62
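The whole two-stage iteration can be sketched as follows. The t-quantile lookup is an assumption of this sketch: only a few two-sided 5% table entries are included, and the largest tabulated df not exceeding the current df is used (a slightly conservative choice); it converges to the same n = 64 as the table above:

```python
import math

# Two-sided 5% critical t values from a standard t table (partial, an
# assumption of this sketch; a fuller table would refine intermediate steps).
T_TABLE = {4: 2.776, 30: 2.042, 60: 2.000, 120: 1.980}

def t_crit(df):
    """Largest tabulated df that does not exceed df (conservative lookup)."""
    keys = [k for k in sorted(T_TABLE) if k <= df]
    return T_TABLE[keys[-1]] if keys else T_TABLE[min(T_TABLE)]

def stein_sample_size(s, d, n0, max_iter=20):
    """Iterate n = t^2 s^2 / d^2 until n stabilizes."""
    n = n0
    for _ in range(max_iter):
        new_n = math.ceil((t_crit(n - 1) * s / d) ** 2)
        if new_n == n:
            break
        n = new_n
    return n

print(stein_sample_size(s=10, d=2.5, n0=5))  # 64
```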

2. 4. 4. Sample size estimation for a comparison of two means

When testing the hypothesis H0: µ = µ0, we can take into account the possibility of a Type I and a Type II error simultaneously.

To calculate n we need to know the alternative µ1, or at least the minimum difference we wish to detect between the means, δ = |µ0 − µ1|. The formula for computing n, the number of observations on each treatment, is:

n = 2 (σ/δ)² (Z_{α/2} + Z_β)²

2 For = 0.05 and = 0.20, z= 0.8416 and z /2 = 1.96, (Z/2 + Z) =7.85 8 We can define in terms of   If δ = 2σ, n ≈ 4  If δ = 1σ, n ≈ 16  If δ = 0.5σ, n ≈ 64

We rarely know σ² and must estimate it via sample variances:

n = 2 (s_pooled/δ)² (t_{α/2, n1+n2−2} + t_{β, n1+n2−2})², where s²_pooled = (s1² + s2²)/2

n is estimated iteratively. If no estimate of s is available, the equation may be expressed in terms of the CV, with δ as a proportion of the mean:

 2 2 2 n  2 [(/) / ( (Z/2 + Z)  2(CV/%) (Z/2 + Z)

Example: Two varieties are compared for yield, with a previously estimated s² = 2.25 (s = 1.5). How many replications are needed to detect a difference of 1.5 tons/acre with α = 5% and β = 20%?

 2 2 Approximate: n  2 (/ (Z/2 + Z) = 2 (1.5/1.5) (1.96+0.8416)= 15.7  2 Then use n = 2 (s /  (t/2 + t) to estimate the sample size iteratively.

guesstimate n   df = 2(n−1)   t_{0.025}   t_{0.20}   estimated n
16              30            2.0423      0.8538     16.8
17              32            2.0369      0.8530     16.7

The answer is that there should be 17 replications of each variety.
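The approximate (Z-based) first step can be sketched as:

```python
def n_two_means(sigma, delta, z_alpha2=1.96, z_beta=0.8416):
    """Replications per treatment: n = 2 (sigma/delta)^2 (Z_a/2 + Z_b)^2."""
    return 2 * (sigma / delta) ** 2 * (z_alpha2 + z_beta) ** 2

# Yield example: s = 1.5, delta = 1.5 tons/acre, alpha = 5%, beta = 20%.
n = n_two_means(sigma=1.5, delta=1.5)
print(round(n, 1))  # 15.7
```

Replacing the Z values with the t values for df = 2(n − 1) and re-running, as in the table, refines this to 17.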

2. 4. 5. Sample size to estimate the population variance σ²

The chi-squared (χ²) distribution is used to establish confidence intervals around the sample variance as a way of estimating the true, unknown population variance.

2. 4. 5. 1. The Chi-square distribution (ST&D p. 55)

Fig. Distribution of χ², for 2, 4, and 6 degrees of freedom.

Relation between the normal and chi-square distributions. The χ² distribution with df = n is defined as the sum of squares of n independent, normally distributed variables with zero means and unit variances.

χ²_{α, df=1} = Z²_{(0,1), α/2}

e.g. χ²_{1, 0.05} = 3.84 and Z²_{(0,1), 0.025} = t²_{∞, 0.025} = (1.96)² = 3.84

Note: Z values from both tails go into the upper tail of the χ² because of the disappearance of the minus sign in the squaring. For this reason we use α for the χ² and α/2 for Z and t.

Σ_{i=1..n} Z_i² = Σ_{i=1..n} (Y_i − µ)²/σ² = χ²_n

If we estimate the parametric mean µ with a sample mean, we obtain:

Σ_{i=1..n} (Y_i − Ȳ)²/σ² = (n − 1)s²/σ²

due to: s² = Σ (Y_i − Ȳ)²/(n − 1), i.e. (n − 1)s² = Σ (Y_i − Ȳ)²

This expression, which has a χ²_{n−1} distribution, provides a relationship between the sample variance and the parametric variance.

2. 4. 5. 2. Confidence interval for σ²

We can make the following statement about the ratio (n − 1)s²/σ², which has a χ²_{n−1} distribution:

P{χ²_{1−α/2, n−1} ≤ (n − 1)s²/σ² ≤ χ²_{α/2, n−1}} = 1 − α

Simple algebraic manipulation of the quantities in the inequality yields

2 2 2 2 P {  1-/2, n-1 /(n-1)  s /   /2, n-1/(n-1)} = 1 -  which is useful when the precision of s2 can be expressed in terms of the % of 2.

Or, inverting the ratio and moving (n − 1):

P{(n − 1)s²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)s²/χ²_{1−α/2, n−1}} = 1 − α

which is useful to construct 95% confidence intervals for σ².

Example: What sample size is required to obtain an estimate of σ that deviates no more than 20% from the true value of σ, with 90% confidence?

P{0.8 < s/σ < 1.2} = 0.90 = P{0.64 < s²/σ² < 1.44} = 0.90, thus

χ²_{1−α/2, n−1}/(n − 1) = 0.64 and χ²_{α/2, n−1}/(n − 1) = 1.44

2 Since  is not symmetrical, the above two solutions may not identical for small n. The computation involves an arbitrary initial n and an iterative process:

df 1 - /2 = 95% /2 = 5% 2 2 2 2 n (n-1)  (n-1)  (n-1) /(n-1)  (n-1)  (n-1) /(n-1) 21 20 10.90 0.545 31.4 1.57 31 30 18.50 0.616 43.8 1.46 41 40 26.50 0.662 55.8 1.40 36 35 22.46 0.642 49.8 1.42 35 35 21.66 0.637 48.6 1.43

Thus a rough estimate of the required sample size is n ≈ 36.
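The iteration can be sketched with the Wilson-Hilferty approximation to the χ² quantile, an assumption of this sketch (exact table values, as above, would replace `chi2_q`):

```python
import math

def chi2_q(df, z):
    """Wilson-Hilferty approximation to the chi-square quantile; z is the
    standard-normal quantile (z = 1.645 -> upper 5% point, -1.645 -> lower)."""
    h = 2.0 / (9.0 * df)
    return df * (1.0 - h + z * math.sqrt(h)) ** 3

def n_for_sigma(rel_err=0.20, z=1.645):
    """Smallest n with chi2_{0.95,n-1}/(n-1) >= (1-rel)^2 and
    chi2_{0.05,n-1}/(n-1) <= (1+rel)^2, i.e. 90% confidence that s is
    within rel_err of sigma, as in the example."""
    lo, hi = (1 - rel_err) ** 2, (1 + rel_err) ** 2
    n = 3
    while not (chi2_q(n - 1, -z) / (n - 1) >= lo and
               chi2_q(n - 1, z) / (n - 1) <= hi):
        n += 1
    return n

print(n_for_sigma())  # 36
```

Both ratios approach 1 as n grows (the lower one from below, the upper one from above), so the first n satisfying both conditions is the answer.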
