Introductory Statistics Refresher

Dr. Julia L. Sharp

Short Course on Introductory Statistics Part III

Sharp (Clemson University) ASA

Hypothesis Testing

As an example, suppose that I claim that I am an excellent free throw shooter, making 80% or more of my free throw shots.

Given a claim. Gathered evidence. Assessed the evidence using the claim.

Hypothesis Testing

1. State the null and alternative hypotheses.
2. State the Type I and Type II Errors for the hypotheses.
3. State the level of significance (maximum acceptable α).
4. Check assumptions.
5. Compute the test statistic.
6. Calculate the p-value.
7. Compare the p-value with the level of significance.
8. Make a decision regarding the null hypothesis.
9. Draw a conclusion in terms of the problem.

Hypothesis Testing Definitions

Null Hypothesis: (Ho) a statement of no effect or no change. This statement is assumed to be true unless sufficient evidence is gathered to reject this hypothesis.

Alternative Hypothesis: (Ha) the research hypothesis. This is the statement that one wishes to support as being true. This is done by gathering evidence against the null hypothesis.

Type I Error: an error that occurs if the null hypothesis is rejected when it is true.

The probability of a Type I error is denoted as α

Type II Error: an error that occurs if the null hypothesis is not rejected when it is false.

The probability of a Type II error is denoted as β

Hypothesis Testing Definitions

State of Nature

Decision             Ho is True          Ho is False
Reject Ho            Type I error        Correct decision
Fail to Reject Ho    Correct decision    Type II error

More Hypothesis Testing Definitions

Test statistic: a quantity computed from the sample data that depends on the value of the parameter being tested

Level of significance: the maximum allowable chance of making a Type I error that the researcher is willing to accept

P-value: the probability, computed assuming the null hypothesis is true, that a test statistic will be as or more extreme than the test statistic that was actually observed.

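Small Sample P-value Method

$H_o: \mu = \mu_0$

Test statistic:

$$t_{obs} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$$

$H_a: \mu < \mu_0$      P-value: $P(T < t_{obs})$
$H_a: \mu > \mu_0$      P-value: $P(T > t_{obs})$
$H_a: \mu \neq \mu_0$   P-value: $2P(T > |t_{obs}|)$

where $T$ has a t distribution with $n - 1$ degrees of freedom.

Decision Rule: Reject $H_o$ if the p-value is less than or equal to $\alpha$; otherwise, fail to reject $H_o$.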

P-value Method Example

Suppose that we would like to conduct a test to determine if the average Phosphorus leaching is less than 50 mm. Recall that the sample mean from 32 lysimeter samples is 44.7166 and the sample standard deviation is 7.8069. Use a significance level of 0.05.

State the hypotheses.

Compute the test statistic.

Determine the p-value.

P-value Method Example

Suppose that we would like to conduct a test to determine if the average Phosphorus leaching is less than 50 mm. Use a significance level of 0.05.

Make a decision regarding Ho.

State the conclusion in terms of the problem.
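The steps of this example can be traced numerically. What follows is a minimal Python sketch, not part of the original slides, that uses the summary statistics stated in the example (n = 32, sample mean 44.7166, sample standard deviation 7.8069, hypothesized mean 50, significance level 0.05). The helper name `t_cdf` is an assumption introduced here; it approximates the cumulative probability of the t distribution by numerical integration so that only the standard library is needed.

```python
import math


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for a t distribution with df degrees of freedom.

    Integrates the t density from 0 to t with Simpson's rule and adds 0.5
    (the probability below 0). `steps` must be even.
    """
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # negative when t < 0, so the integral is signed
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


# Summary statistics stated in the example
n = 32          # number of lysimeter samples
ybar = 44.7166  # sample mean
s = 7.8069      # sample standard deviation
mu0 = 50.0      # hypothesized mean under Ho
alpha = 0.05    # level of significance

# Ho: mu = 50 versus Ha: mu < 50 (lower-tailed test)
t_obs = (ybar - mu0) / (s / math.sqrt(n))
p_value = t_cdf(t_obs, n - 1)  # P(T < t_obs) with n - 1 = 31 df
reject = p_value <= alpha

print(f"t_obs = {t_obs:.4f}, df = {n - 1}, p-value = {p_value:.7f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

Under these inputs the sketch gives a test statistic of about -3.8283 and a p-value far below 0.05, so Ho is rejected: the data provide evidence that the average is less than 50 mm.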

Example

Riddle and Bergström (2013) describe several experiments to examine Phosphorus leaching from two soils. A table of results from one of the experiments is reproduced below. There were four different rain simulations used and two soil types (clay and sand). The amount of drainage water collected from lysimeters was recorded.

Riddle, M. U. and Bergström, L. (2013). “Phosphorus leaching from two soils with catch crops exposed to freeze-thaw cycles,” Agronomy Journal, 105(3): 803-811.

Hypothesis Test: Phosphorus Leaching

Conduct a test to determine if the average Phosphorus leaching is less than 50 mm.

	One Sample t-test

data:  drain$drainage
t = -3.8283, df = 31, p-value = 0.0002936
alternative hypothesis: true mean is less than 50
95 percent confidence interval:
     -Inf 47.05657
sample estimates:
mean of x
 44.71664

Inferences Comparing Two Population Central Values

Compare the average responses in two groups.

Assumptions:

- Independent random samples of n1 observations from one population and n2 observations from a second population are selected.
- Samples are selected from normal distributions or large sample sizes are used.

GOAL: Make inference about the difference between the population means.

              Population                     Sample
              Mean    Standard Deviation     Size    Mean    Standard Deviation
1             µ1      σ1                     n1      ȳ1      s1
2             µ2      σ2                     n2      ȳ2      s2

Inference for Two Population Means: Example

Riddle and Bergström (2013) describe several experiments to examine Phosphorus leaching from two soils. A table of results from one of the experiments is reproduced below. There were four different rain simulations used and two soil types (clay and sand). The amount of drainage water collected from lysimeters was recorded.

Suppose that we would like to compare the average amount of drainage water collected from clay soil to the average amount of drainage water collected from sandy soil.

Distribution of $\bar{Y}_1 - \bar{Y}_2$

Suppose two independent random variables $Y_1$ and $Y_2$ are normally distributed with appropriate means and variances:

The sampling distributions of $\bar{Y}_1$ and $\bar{Y}_2$ are:

The sampling distribution of $\bar{Y}_1 - \bar{Y}_2$ is:

The mean of the sampling distribution is:

The variance of the sampling distribution is:
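The prompts on this slide can be completed with standard results for independent normal random variables. The following is a sketch, not taken from the slides, that assumes $Y_1$ has mean $\mu_1$ and variance $\sigma_1^2$, $Y_2$ has mean $\mu_2$ and variance $\sigma_2^2$, and that $\bar{Y}_1$ and $\bar{Y}_2$ are the means of independent random samples of sizes $n_1$ and $n_2$.

```latex
% Population distributions (assumed)
Y_1 \sim N(\mu_1, \sigma_1^2), \qquad Y_2 \sim N(\mu_2, \sigma_2^2)

% Sampling distributions of the two sample means
\bar{Y}_1 \sim N\!\left(\mu_1, \frac{\sigma_1^2}{n_1}\right), \qquad
\bar{Y}_2 \sim N\!\left(\mu_2, \frac{\sigma_2^2}{n_2}\right)

% Sampling distribution of the difference (independence lets the variances add)
\bar{Y}_1 - \bar{Y}_2 \sim
N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)

% Mean, variance, and standard deviation of that sampling distribution
E(\bar{Y}_1 - \bar{Y}_2) = \mu_1 - \mu_2, \qquad
\operatorname{Var}(\bar{Y}_1 - \bar{Y}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, \qquad
\sigma_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
```

When the two population variances are equal to a common $\sigma^2$, the variance simplifies to $\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$.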

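Inference for Comparing Two Population Means: Independent Samples

Population variances $\sigma_1^2$ and $\sigma_2^2$:

Equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$)
  Variance of $\bar{Y}_1 - \bar{Y}_2$: $\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$
  Variance Estimate: $s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$

Unequal ($\sigma_1^2 \neq \sigma_2^2$)
  Variance of $\bar{Y}_1 - \bar{Y}_2$: $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
  Variance Estimate: $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$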

Independent Samples, Equal Variances: Hypothesis Tests for Comparing Two Population Means

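$H_o: \mu_1 - \mu_2 = D_0$

$H_a: \mu_1 - \mu_2 < D_0$      $H_a: \mu_1 - \mu_2 > D_0$      $H_a: \mu_1 - \mu_2 \neq D_0$

Test statistic:

$$t_{obs} = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$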

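Independent Samples, Equal Variances: Hypothesis Test P-values

$H_o: \mu_1 - \mu_2 = D_0$

$H_a: \mu_1 - \mu_2 < D_0$      P-value: $P(T < t_{obs})$
$H_a: \mu_1 - \mu_2 > D_0$      P-value: $P(T > t_{obs})$
$H_a: \mu_1 - \mu_2 \neq D_0$   P-value: $2P(T > |t_{obs}|)$

where $T$ has a t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Decision Rule: Reject $H_o$ if the p-value is less than or equal to $\alpha$; otherwise, fail to reject $H_o$.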

of Treatment to Experimental Units

Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil?

Inference for Two Means

Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil? Use a significance level of 0.05.

Inference for Two Means, Independent Samples, Equal Variances: Confidence Interval

A 100(1 − α)% confidence interval for the difference in population means is

$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

	Two Sample t-test

data:  drain$drainage by drain$soil
t = 1.5148, df = 30, p-value = 0.1403
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.42650  9.61915
sample estimates:
mean in group clay mean in group sand
          46.76480           42.66848
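The reported test statistic, p-value, and confidence interval can be approximately reproduced from summary numbers that appear in these slides. The Python sketch below is an illustration rather than the analysis actually run for the course. It assumes 16 observations per soil type (consistent with df = 30 and with the two group means averaging to the overall mean of 44.71664), and it recovers the pooled variance from the overall sample standard deviation of 7.8069 by subtracting the between-group sum of squares from the total sum of squares. The helpers `t_cdf` and `t_quantile` are hypothetical names that use numerical integration and bisection in place of a statistics library.

```python
import math


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for a t distribution (Simpson's rule, even steps)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def t_quantile(prob, df):
    """Find q with P(T <= q) = prob by bisection; assumes 0.5 < prob < 1."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Values reported in the slides
mean_clay, mean_sand = 46.76480, 42.66848
n_total, s_total = 32, 7.8069  # all 32 observations pooled into one sample
alpha = 0.05
d0 = 0.0                       # hypothesized difference under Ho

# Assumption: the 32 observations split evenly between the two soils
n1 = n2 = 16

# Pooled variance = within-group sum of squares / (n1 + n2 - 2),
# where within = total - between
grand_mean = (n1 * mean_clay + n2 * mean_sand) / (n1 + n2)
ss_total = (n_total - 1) * s_total ** 2
ss_between = (n1 * (mean_clay - grand_mean) ** 2
              + n2 * (mean_sand - grand_mean) ** 2)
df = n1 + n2 - 2
sp = math.sqrt((ss_total - ss_between) / df)

# Test statistic and two-sided p-value
se = sp * math.sqrt(1 / n1 + 1 / n2)
t_obs = ((mean_clay - mean_sand) - d0) / se
p_value = 2 * (1 - t_cdf(abs(t_obs), df))

# 100(1 - alpha)% confidence interval for mu1 - mu2
margin = t_quantile(1 - alpha / 2, df) * se
ci_low = (mean_clay - mean_sand) - margin
ci_high = (mean_clay - mean_sand) + margin

print(f"t = {t_obs:.4f}, df = {df}, p-value = {p_value:.4f}")
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
```

Because the inputs are rounded, the results agree with the printed output only to about three decimal places. With a p-value above 0.05 and an interval that contains 0, Ho is not rejected at the 0.05 level.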


Independent Samples, Unequal Variances: Hypothesis Tests for Comparing Two Population Means

$H_o: \mu_1 - \mu_2 = D_0$

$H_a: \mu_1 - \mu_2 < D_0$      $H_a: \mu_1 - \mu_2 > D_0$      $H_a: \mu_1 - \mu_2 \neq D_0$

Test statistic:

$$t'_{obs} = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Distribution of the Test Statistic

$$t' \mathrel{\dot\sim} t(df)$$

where

$$df = \frac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}$$

and

$$c = \frac{s_1^2/n_1}{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
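To make the degrees-of-freedom formula concrete, here is a small Python sketch of the unequal-variance test statistic and its approximate df. The function name `welch_t` and the summary statistics passed to it are hypothetical, chosen only for illustration; the slides do not report separate standard deviations for the two soils.

```python
import math


def welch_t(ybar1, s1, n1, ybar2, s2, n2, d0=0.0):
    """Return (t', df) for two independent samples with unequal variances.

    t' = ((ybar1 - ybar2) - d0) / sqrt(s1^2/n1 + s2^2/n2)
    df = (n1-1)(n2-1) / ((1-c)^2 (n1-1) + c^2 (n2-1)),
    with c = (s1^2/n1) / (s1^2/n1 + s2^2/n2).
    """
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    t_prime = ((ybar1 - ybar2) - d0) / math.sqrt(v1 + v2)
    c = v1 / (v1 + v2)
    df = (n1 - 1) * (n2 - 1) / ((1 - c) ** 2 * (n1 - 1) + c ** 2 * (n2 - 1))
    return t_prime, df


# Hypothetical summary statistics (not from the slides)
t_prime, df = welch_t(ybar1=46.8, s1=9.0, n1=16, ybar2=42.7, s2=6.0, n2=16)
print(f"t' = {t_prime:.4f}, approximate df = {df:.2f}")

# With equal sample sizes and equal sample variances, c = 0.5 and the
# approximate df equals the pooled value n1 + n2 - 2
t_eq, df_eq = welch_t(46.8, 7.5, 16, 42.7, 7.5, 16)
print(f"equal-variance check: df = {df_eq:.2f}")
```

The approximate df is generally not an integer, and it never exceeds the pooled value $n_1 + n_2 - 2$; the more the two sample variances differ, the smaller it becomes.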

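Independent Samples, Unequal Variances: Hypothesis Test P-values

$H_o: \mu_1 - \mu_2 = D_0$

$H_a: \mu_1 - \mu_2 < D_0$      P-value: $P(T < t'_{obs})$
$H_a: \mu_1 - \mu_2 > D_0$      P-value: $P(T > t'_{obs})$
$H_a: \mu_1 - \mu_2 \neq D_0$   P-value: $2P(T > |t'_{obs}|)$

where $T$ has a t distribution with the approximate degrees of freedom $df$.

Decision Rule: Reject $H_o$ if the p-value is less than or equal to $\alpha$; otherwise, fail to reject $H_o$.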

Inference for Two Means (Unequal Variances): Example Using PROC TTEST

Is the average amount of drainage water collected from clay soil different from the average amount of drainage water collected from sandy soil? Use a significance level of 0.05.

Inference for Two Means, Independent Samples, Unequal Variances: Confidence Interval

A 100(1 − α)% confidence interval for the difference in population means when the population variances are not equal is

$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where

$$df = \frac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}$$

and

$$c = \frac{s_1^2/n_1}{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
