Review: The pivotal quantity method, the p-value, & some basic tests We have learned three approaches to derive the tests: - Likelihood Ratio Test - Roy’s Union Intersection Principle - Pivotal Quantity Method Part 1. ***Now we review the Pivotal Quantity Method. Example. Inference on 1 population mean, when the population is normal and the population variance is known the Z-test. Definition : the Pivotal Quantity (P.Q.) : A pivotal quantity is a function of the sample and the parameter of interest. Furthermore, its distribution is entirely known. 2 1. We start by looking at the point estimator of . X ~ N(, ) n * Is X a pivotal quantity for ? → X is not because is unknown. 2 * function of X and : X ~ N(0, ) n 1 → Yes, it is pivotal quantity. * Another function of X and : X Z ~ N(0,1) is our P.Q. n Yes, it is a pivotal quantity. So, Pivotal Quantity is not unique. 2. Now that we have found the pivotal quantity Z, we shall start the derivation for the test for µ using the pivotal quantity Z Definition: The test statistic is the PQ with the value of the parameter of interest under the null hypothesis ( H 0 ) inserted (*in this case, it is 휇 = 휇): H0 X 0 Z 0 ~ N(0,1) is our test statistic. n X 0 That is, given H : in true Z 0 ~ N(0,1) 0 0 n 3. * Derive the decision threshold for your test based on the Type I error rate the significance level For the pair of hypotheses: H : H : a 0 0 0 versus It is intuitive that one should reject the null hypothesis, in support of the alternative hypothesis, when the sample mean is larger than 0 . Equivalently, this means when the test statistic Z0 is larger than certain positive value c - the question is what is 2 the exact value of c -- and that can be determined based on the significance level α—that is, how much Type I error we would allow ourselves to commit. Setting: P(Type I error) = P(reject H | H ) = 0 0 P(Z c | H : ) 0 0 0 c z We will see immediately that from the pdf plot below. ∴ At the significance level α, we will reject H 0 in favor of H a if Z 0 Z Other Hypotheses H : 0 0 (one-sided test or one-tailed test) Ha : 0 X Test statistic : Z 0 ~ N (0,1) 0 / n P(Z 0 c | H 0 : 0 ) c Z 3 H : 0 0 (Two-sided or Two-tailed test) H a : 0 X Test statistic : Z 0 ~ N (0,1) 0 / n P(| Z 0 | c | H 0 ) P(Z 0 c | H 0 ) P(Z 0 c | H 0 ) 2 P(Z 0 c | H 0 ) P(Z c | H ) 2 0 0 c Z 2 Reject H 0 if | Z 0 | Z 2 4 Part 2. We have just discussed the “rejection region” approach for decision making. There is another approach for decision making, it is “p-value” approach. *Definition: p-value – it is the probability that we observe a test statistic value that is as extreme, or more extreme, than the one we observed, given that the null hypothesis is true. H 0: 0 H0: 0 H 0: 0 H a : 0 Ha : 0 H a : 0 H0 X 0 Observed value of test statistic Z 0 N(0,1) n ~ p-value p-value p-value P(| Z 0 || z0 || H 0 ) P(Z 0 z0 | H 0 ) P(Z 0 z0 | H 0 ) 2 P(Z 0 || z0 || H 0 ) (1) the area under (2) the area under (3) twice the area to N(0,1) pdf to the N(0,1) pdf to the the right of | z0 | right of z0 left of z0 H 0: 0 (1) H a : 0 5 H0: 0 (2) Ha : 0 H0: 0 (3) H a : 0 The way we make conclusions is the same for all hypotheses. We reject 푯ퟎ in favor of 푯풂 iff p-value < α 6 Part 3. CLT -- the large sample scenario: Any population (*usually non-normal – as the exact tests should be used if the population is normal), however, the sample size is large (this usually refers to: n ≥ 30) Theorem. The Central Limit Theorem Let X 1 , X 2 ,, X n be a random sample from a population with mean and variance 2 , we have: X n N (0,1) n * When n is large enough (n 30) , X Z N(0,1) (approximately) – by CLT and the S n ~ Slutsky’s Theorem Therefore the pivotal quantities (P.Q.’s) for this scenario: X X Z~ NZ (0,1) or ~ N (0,1) n S n Use the first P.Q. if σ is known, and the second when σ is unknown. 7 The derivation of the hypothesis tests (rejection region and the p-value) are almost the same as the derivation of the exact Z-test discussed above. H 0: 0 H0: 0 H 0: 0 H a : 0 Ha : 0 H a : 0 X 0 Test Statistic Z 0 N(0,1) S n ~ Rejection region : we reject H 0 in favor of H a at the significance level if Z 0 Z Z0 Z | Z 0 | Z 2 p-value p-value p-value P(| Z 0 || z0 || H 0 ) P(Z 0 z0 | H 0 ) P(Z 0 z0 | H 0 ) 2 P(Z 0 || z0 || H 0 ) (1) the area under (2) the area under (3) twice the area to N(0,1) pdf to the N(0,1) pdf to the the right of | z0 | right of z0 left of z0 Example. Normal Population, but the population variance is unknown 100 years ago – people use Z-test This is OK for n large (n 30) per the CLT (Scenario 2) This is NOT ok if the sample size is samll. “A Student of Statistics” – pen name of William Sealy Gosset (June 13, 1876–October 16, 1937) 8 “The Student’s t-test” X P.Q. T ~ t S/ n n1 (Exact t-distribution with n-1 degrees of freedom) “A Student of Statistics” – pen name of William Sealy Gosset (June 13, 1876–October 16, 1937) http://en.wikipedia.org/wiki/William_Sealy_Gosset “The Student’s t-distribution” X P.Q. T ~ t S/ n n1 (Exact t-distribution with n-1 degrees of freedom ) Wrong Test for a 2-sided alternative hypothesis Reject H 0 if |푧| ≥ 푍/ Right Test for a 2-sided alternative hypothesis Reject H if |푡 | ≥ 푡 0 ,/ 9 (Because t distribution has heavier tails than normal distribution.) Right Test H 0: 0 * Test Statistic H a : 0 H0 X 0 T0 tn1 S n ~ * Reject region : Reject H 0 at if the observed test statistic value |푡| ≥ 푡,/ * p-value p-value = shaded area * 2 10 Further Review: 1. Definition : t-distribution Z T ~ t k W k Z ~ N (0,1) 2 W ~ k (chi-square distribution with k degrees of freedom) Z &W are independent. 2. Def 1 : chi-square distribution : from the definition of the gamma distribution: gamma(α = k/2, β = 2) / MGF: 푀(푡) = mean & varaince: 퐸(푊) = 푘; 푉푎푟(푊) =2푘 i.i.d . Z , Z ,, Z N (0,1) Def 2 : chi-square distribution : Let 1 2 k ~ , k 2 2 then W Z i ~ k i1 11 Example. Inference on 2 population means, when both populations are normal and we have two independent samples. Furthermore the population variances are unknown but 2 2 2 equal (1 2 ) pooled-variance t-test. iid Data: X,, X ~(, N 2 ) 1n1 1 1 iid 2 Y,, Y ~(, N ) 1n2 2 2 Goal: Compare 1 and 2 Pivotal Quantity Approach: 1) Point estimator: 2 2 1 2 1 1 2 ˆ1 ˆ 2XYN~( 1 2 , )( N 1 2 ,( )) n1 n 2 n 1 n 2 2) Pivotal quantity: (X Y )( ) Z 1 2 ~ N (0,1) not the PQ, since we 1 1 n1 n 2 don’t know 2 (n 1) S 2 (n 1) S 2 1 1 ~ 2 , 2 2 ~ 2 , and they are 2 n1 1 2 n2 1 2 2 independent ( S1 & S2 are independent because these two samples are independent to each other) (n 1) S2 ( n 1) S 2 W 1 1 2 2 ~ 2 2 2 n1 n 2 2 12 i... i d 2 2 2 Definition: WZZ1 2 Zk , when Zi ~ N (0,1) , then 2 W ~ k . Definition: 2 t-distribution: Z~ N (0,1) , W ~ k , and Z & W are Z independent, then T ~ t W k k (X Y )( 1 2 ) 1 1 Z n1 n 2 (X Y )( 1 2 ) T ~ tn n 2 W (n 1) S2 ( n 1) S 2 1 1 1 2 1 1 2 2 S 2 2 p n1 n 2 2 n 1 n 2 n1 n 2 2 . 2 2 2 (n1 1) S 1 ( n 2 1) S 2 where S p is the pooled variance. n1 n 2 2 This is the PQ of the inference on the parameter of interest (1 2 ) 3) Confidence Interval for (1 2 ) 13 1 Pt ( Tt ) n n2, n n 2, 1 22 1 2 2 (X Y )( 1 2 ) 1 Pt ( t ) n n2, n n 2, 1 221 1 1 2 2 S p n1 n 2 1 1 1 1 1(P t Sp ()() X Y 1 2 t S p ) n n2, n n 2, 1 22n1 n 2 1 2 2 n 1 n 2 1 1 1 1 1PXYt ( Sp 1 2 XYt S p ) n n2, n n 2, 1 22 n1 n 2 1 2 2 n1 n 2 This is the 100(1 )% C.I for (1 2 ) 4) Test: (XY ) c H0 Test statistic: T 0 ~ t 01 1 n1 n 2 2 S p n1 n 2 H0: 1 2 c 0 a) (The most common situation Ha :1 2 c 0 is c0 0 H0: 1 2 ) 14 At the significance level α, we reject H0 in favor of H iff T t a 0n1 n 2 2, If P value P( T0 t 0 | H 0 ) , reject H0 H0: 1 2 c 0 b) Ha :1 2 c 0 At significance level α, reject H0 in favor of H a iff T t 0n1 n 2 2, If P value P( T0 t 0 | H 0 ) , reject H0 H0: 1 2 c 0 c) Ha :1 2 c 0 At α=0.05, reject H0 in favor of H a iff |T0 | t n n 2, 1 2 2 If P value2 P ( T0 | t 0 || H 0 ) , reject H0 15 Inference on two population variances * Both pop’s are normal, two independent samples i.i.d.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages23 Page
-
File Size-