Review: The pivotal quantity method, the p-value, & some basic tests
We have learned three approaches to derive tests:
- Likelihood Ratio Test
- Roy's Union-Intersection Principle
- Pivotal Quantity Method
Part 1. Review of the Pivotal Quantity Method
Example. Inference on one population mean, when the population is normal and the population variance is known: the Z-test.
Definition: the Pivotal Quantity (P.Q.):
A pivotal quantity is a function of the sample and the parameter of interest whose distribution is entirely known (it does not depend on any unknown parameter).
1. We start by looking at the point estimator of $\mu$: $\bar{X} \sim N(\mu, \sigma^2/n)$.

* Is $\bar{X}$ a pivotal quantity for $\mu$?
→ No: its distribution depends on $\mu$, which is unknown.

* A function of $\bar{X}$ and $\mu$: $\bar{X} - \mu \sim N(0, \sigma^2/n)$
→ Yes, this is a pivotal quantity ($\sigma^2$ is known).

* Another function of $\bar{X}$ and $\mu$:
$Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ is our P.Q.
→ Yes, this is also a pivotal quantity.

So the pivotal quantity is not unique.
2. Now that we have found the pivotal quantity Z, we derive the test for $\mu$ using Z.

Definition: The test statistic is the P.Q. with the value of the parameter of interest under the null hypothesis ($H_0$) inserted (in this case, $\mu = \mu_0$):

$Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ is our test statistic.

That is, given that $H_0: \mu = \mu_0$ is true, $Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$.
3. Derive the decision threshold for the test based on the Type I error rate, i.e., the significance level $\alpha$. For the pair of hypotheses

$H_0: \mu = \mu_0$ versus $H_a: \mu > \mu_0$

it is intuitive that one should reject the null hypothesis, in support of the alternative hypothesis, when the sample mean is larger than $\mu_0$. Equivalently, this means rejecting when the test statistic $Z_0$ is larger than a certain positive value c. The question is what the exact value of c is, and that can be determined from the significance level $\alpha$, that is, from how much Type I error we allow ourselves to commit.

Setting

$\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0) = P(Z_0 \ge c \mid H_0: \mu = \mu_0)$,

we see immediately from the N(0,1) pdf that $c = z_\alpha$, the upper-$\alpha$ quantile of N(0,1).

∴ At the significance level $\alpha$, we reject $H_0$ in favor of $H_a$ if $Z_0 \ge z_\alpha$.
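This rejection rule can be sketched numerically with the Python standard library (my addition, not part of the notes; the sample numbers below are invented for illustration):

```python
# Sketch of the one-sided Z-test: H0: mu = mu0 vs Ha: mu > mu0,
# with sigma known.  The inputs below are invented for illustration.
import math
from statistics import NormalDist

def z_test_upper(xbar, mu0, sigma, n, alpha=0.05):
    """Return (observed z0, threshold z_alpha, reject H0?)."""
    z0 = (xbar - mu0) / (sigma / math.sqrt(n))
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha quantile
    return z0, z_alpha, z0 >= z_alpha

z0, c, reject = z_test_upper(xbar=5.3, mu0=5.0, sigma=1.0, n=25)
print(z0, c, reject)  # z0 = 1.5 < z_0.05 ≈ 1.645, so H0 is not rejected
```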
Other Hypotheses

$H_0: \mu = \mu_0$ versus $H_a: \mu < \mu_0$ (one-sided, or one-tailed, test)

Test statistic: $Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$ under $H_0$

$\alpha = P(Z_0 \le c \mid H_0: \mu = \mu_0) \Rightarrow c = -z_\alpha$; reject $H_0$ if $Z_0 \le -z_\alpha$.
$H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$ (two-sided, or two-tailed, test)

Test statistic: $Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$ under $H_0$

$\alpha = P(|Z_0| \ge c \mid H_0) = P(Z_0 \ge c \mid H_0) + P(Z_0 \le -c \mid H_0) = 2\,P(Z_0 \ge c \mid H_0)$

$\Rightarrow P(Z_0 \ge c \mid H_0) = \alpha/2 \Rightarrow c = z_{\alpha/2}$

Reject $H_0$ if $|Z_0| \ge z_{\alpha/2}$.
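A quick numerical check (my addition, standard library only) that the two-sided cutoff $z_{\alpha/2}$ indeed gives overall size $\alpha$:

```python
# Verify numerically that c = z_{alpha/2} gives P(|Z0| >= c | H0) = alpha.
from statistics import NormalDist

alpha = 0.05
c = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
size = 2 * (1 - NormalDist().cdf(c))      # P(|Z0| >= c) under H0
print(round(c, 3), round(size, 3))        # 1.96 0.05
```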
Part 2. We have just discussed the rejection-region approach to decision making. There is another approach to decision making: the p-value approach.
*Definition: p-value: the probability of observing a test statistic value as extreme as, or more extreme than, the one we observed, given that the null hypothesis is true.
In each case $H_0: \mu = \mu_0$, and the observed value of the test statistic is $z_0 = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, where $Z_0 \sim N(0,1)$ under $H_0$.

(1) $H_a: \mu > \mu_0$: p-value $= P(Z_0 \ge z_0 \mid H_0)$, the area under the N(0,1) pdf to the right of $z_0$.
(2) $H_a: \mu < \mu_0$: p-value $= P(Z_0 \le z_0 \mid H_0)$, the area under the N(0,1) pdf to the left of $z_0$.
(3) $H_a: \mu \ne \mu_0$: p-value $= P(|Z_0| \ge |z_0| \mid H_0) = 2\,P(Z_0 \ge |z_0| \mid H_0)$, twice the area to the right of $|z_0|$.
[Figures: N(0,1) pdf with the shaded tail area for each case: (1) $H_a: \mu > \mu_0$, (2) $H_a: \mu < \mu_0$, (3) $H_a: \mu \ne \mu_0$.]
The way we draw conclusions is the same for all three pairs of hypotheses:
we reject $H_0$ in favor of $H_a$ if and only if p-value < $\alpha$.
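The three p-value formulas can be coded directly; this is a minimal sketch (my addition), with $\Phi$ denoting the N(0,1) cdf:

```python
# p-values for the three alternatives, given an observed z0.
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf

def z_p_value(z0, alternative):
    if alternative == "greater":    # Ha: mu > mu0
        return 1 - Phi(z0)
    if alternative == "less":       # Ha: mu < mu0
        return Phi(z0)
    if alternative == "two-sided":  # Ha: mu != mu0
        return 2 * (1 - Phi(abs(z0)))
    raise ValueError(alternative)

for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_p_value(1.5, alt), 4))
```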
Part 3. CLT: the large-sample scenario. Any population (usually non-normal, since the exact tests should be used if the population is normal), but the sample size is large (this usually refers to n ≥ 30).
Theorem (The Central Limit Theorem). Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. Then

$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0,1)$ as $n \to \infty$.
* When n is large enough (n ≥ 30),

$Z = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0,1)$ (approximately), by the CLT and Slutsky's Theorem.

Therefore the pivotal quantities (P.Q.'s) for this scenario are

$Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ or $Z = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0,1)$ (approximately).

Use the first P.Q. if $\sigma$ is known, and the second when $\sigma$ is unknown.
The derivations of the hypothesis tests (rejection region and p-value) are almost the same as the derivation of the exact Z-test discussed above.
In each case $H_0: \mu = \mu_0$, with test statistic $Z_0 = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0,1)$ (approximately) under $H_0$.

Rejection region: we reject $H_0$ in favor of $H_a$ at significance level $\alpha$ if

(1) $H_a: \mu > \mu_0$: $Z_0 \ge z_\alpha$; p-value $= P(Z_0 \ge z_0 \mid H_0)$, the area under the N(0,1) pdf to the right of $z_0$.
(2) $H_a: \mu < \mu_0$: $Z_0 \le -z_\alpha$; p-value $= P(Z_0 \le z_0 \mid H_0)$, the area under the N(0,1) pdf to the left of $z_0$.
(3) $H_a: \mu \ne \mu_0$: $|Z_0| \ge z_{\alpha/2}$; p-value $= 2\,P(Z_0 \ge |z_0| \mid H_0)$, twice the area to the right of $|z_0|$.
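The CLT + Slutsky claim can be illustrated by simulation (my addition, not part of the notes): draw samples of size n = 50 from a non-normal population, here Exponential with mean 1, and check that the studentized mean behaves approximately like N(0,1).

```python
# Monte Carlo illustration: for a non-normal population (Exponential,
# mu = 1) and n = 50, (Xbar - mu) / (S / sqrt(n)) is approximately N(0,1).
import math
import random
import statistics

random.seed(0)
mu, n, reps = 1.0, 50, 20000
zs = []
for _ in range(reps):
    x = [random.expovariate(1.0) for _ in range(n)]
    xbar, s = statistics.fmean(x), statistics.stdev(x)
    zs.append((xbar - mu) / (s / math.sqrt(n)))

# Mean and standard deviation of the simulated statistic should be
# close to the N(0,1) values 0 and 1 (not exact, since n is finite).
print(round(statistics.fmean(zs), 2), round(statistics.stdev(zs), 2))
```

The agreement is approximate: for a skewed population the studentized mean is still slightly skewed at n = 50, which is exactly why the statement is a large-sample result.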
Example. Normal population, but the population variance is unknown.

100 years ago, people used the Z-test $Z_0 = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$.
This is OK for large n (n ≥ 30) per the CLT (Scenario 2), but it is NOT OK if the sample size is small.
“A Student of Statistics”
– pen name of William Sealy Gosset (June 13, 1876–October 16, 1937)
“The Student’s t-test”

P.Q.: $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$

(exact t-distribution with n − 1 degrees of freedom)
http://en.wikipedia.org/wiki/William_Sealy_Gosset
Wrong test for a two-sided alternative hypothesis: reject $H_0$ if $|z_0| \ge z_{\alpha/2}$.

Right test for a two-sided alternative hypothesis: reject $H_0$ if $|t_0| \ge t_{n-1,\,\alpha/2}$.
(Because the t-distribution has heavier tails than the normal distribution, $t_{n-1,\,\alpha/2} > z_{\alpha/2}$, so the z cutoff rejects too often.)
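The heavier-tails point in numbers (my addition; the t critical value is hard-coded from a standard t table, since the Python standard library has no t quantile function):

```python
# The t critical value exceeds z_{alpha/2}, so using the z cutoff with
# small n and unknown sigma rejects H0 too often.
from statistics import NormalDist

t_crit = 2.262                        # t_{9, 0.025}, table value (n = 10)
z_crit = NormalDist().inv_cdf(0.975)  # z_{0.025} ≈ 1.96
print(t_crit > z_crit)                # True: the t cutoff is strictly larger
```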
Right Test

$H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$

* Test statistic: under $H_0$, $T_0 = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$

* Rejection region: reject $H_0$ at significance level $\alpha$ if the observed test statistic value satisfies $|t_0| \ge t_{n-1,\,\alpha/2}$.

* p-value $= 2\,P(T_{n-1} \ge |t_0|)$, i.e., twice the tail area under the $t_{n-1}$ pdf to the right of $|t_0|$.
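A small worked sketch of the two-sided t-test (my addition; the data values are invented, and the critical value $t_{5,\,0.025} = 2.571$ is hard-coded from a standard t table since the standard library has no t quantile):

```python
# One-sample two-sided t-test sketch: compute t0 and df, then compare
# |t0| to the hard-coded table value t_{5, 0.025} = 2.571.
import math
import statistics

def t_statistic(x, mu0):
    """Return the observed t0 and the degrees of freedom n - 1."""
    n = len(x)
    xbar, s = statistics.fmean(x), statistics.stdev(x)
    t0 = (xbar - mu0) / (s / math.sqrt(n))
    return t0, n - 1

x = [5.1, 4.8, 5.6, 5.0, 5.3, 4.9]   # invented data, n = 6
t0, df = t_statistic(x, mu0=5.0)
print(round(t0, 3), df, abs(t0) >= 2.571)  # fail to reject H0 at alpha = 0.05
```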
Further Review:

1. Definition: the t-distribution. $T = \dfrac{Z}{\sqrt{W/k}} \sim t_k$, where

$Z \sim N(0,1)$,
$W \sim \chi^2_k$ (chi-square distribution with k degrees of freedom), and
Z and W are independent.
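This definition can be checked by simulation (my addition): build W as a sum of k squared standard normals and verify that T has the variance of $t_k$, namely $k/(k-2)$ for $k > 2$.

```python
# Simulation check of T = Z / sqrt(W / k) ~ t_k with Z ~ N(0,1),
# W ~ chi-square(k) (sum of k squared normals), Z and W independent.
import math
import random
import statistics

random.seed(1)
k, reps = 10, 50000
ts = []
for _ in range(reps):
    z = random.gauss(0, 1)
    w = sum(random.gauss(0, 1) ** 2 for _ in range(k))  # chi-square(k)
    ts.append(z / math.sqrt(w / k))

# Sample mean should be near 0; sample variance near k/(k-2) = 1.25.
print(round(statistics.fmean(ts), 2), round(statistics.variance(ts), 2))
```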
2. Definition: the chi-square distribution, via the gamma distribution: $\chi^2_k = \text{Gamma}(\alpha = k/2, \beta = 2)$.
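A quick moment check of this identification (my addition): a chi-square(k) variable built from squared normals and a Gamma(k/2, 2) variable should share the mean k (and variance 2k).

```python
# Numerical check that chi-square(k) coincides with Gamma(alpha = k/2,
# beta = 2): compare Monte Carlo means, both of which should be near k.
import random
import statistics

random.seed(2)
k, reps = 6, 50000
chisq = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(reps)]
gam = [random.gammavariate(k / 2, 2) for _ in range(reps)]

print(round(statistics.fmean(chisq), 1), round(statistics.fmean(gam), 1))
```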