Hypothesis Testing I
Total Page:16
File Type:pdf, Size:1020Kb
Introduction Wald test p-values t-test Quiz Hypothesis testing I. Botond Szabo Leiden University Leiden, 26 March 2018 Introduction Wald test p-values t-test Quiz Outline 1 Introduction 2 Wald test 3 p-values 4 t-test 5 Quiz Introduction Wald test p-values t-test Quiz Example Suppose we want to know if exposure to asbestos is associated with lung disease. We take some rats and randomly divide them into two groups. We expose one group to asbestos and then compare the disease rate in two groups. We consider the following two hypotheses: 1 The null hypothesis: the disease rate is the same in both groups. 2 Alternative hypothesis: The disease rate is not the same in two groups. If the exposed group has a much higher disease rate, we will reject the null hypothesis and conclude that the evidence favours the alternative hypothesis. Introduction Wald test p-values t-test Quiz Null and alternative hypotheses We partition the parameter space Θ into sets Θ0 and Θ1 and wish to test H0 : θ 2 Θ0 versus H1 : θ 2 Θ1: We call H0 the null hypothesis and H1 the alternative hypothesis. In our previous example θ is the difference between the disease rates in the exposed and not exposed groups, Θ0 = f0g; and Θ1 = (0; 1): Introduction Wald test p-values t-test Quiz Rejection region Let T be a random variable and let T be the range of X : We test the hypothesis by finding a subset of outcomes R ⊂ X ; called the rejection region. If T 2 R; we reject the null hypothesis, otherwise we retain it: T 2 R ) reject H0 T 2= R ) retain H0: In our previous example T is an estimate of the difference between the disease rates in the exposed and not exposed groups. We reject the null hypothesis if T is large. Introduction Wald test p-values t-test Quiz Test statistic and critical value Usually, the rejection region R is of the form R = fx : T (x) > cg; where T is a test statistic and c is critical value. The problem in hypothesis testing is to find an appropriate test statistic T and an appropriate critical value c: Introduction Wald test p-values t-test Quiz Warning There is a tendency to use hypothesis testing even when they are not appropriate. Point estimates and confidence intervals are at times a better option. Use testing only with well-defined hypotheses and in general take care of formulating them: things can go wrong when testing ill-defined or overly complex hypotheses. Introduction Wald test p-values t-test Quiz Type I and type II errors By rejecting H0 when H0 is true we commit type I error. By retaining H0 when H1 is true we commit type II error. Retain Null Reject Null H0 true X type I error H1 true type II error X Introduction Wald test p-values t-test Quiz Power function Definition The power function of a test T with rejection region R is defined by β(θ) = Pθ(T 2 R): The size of a test is defined to be α = sup β(θ): θ2Θ0 A test is said to be of level α; if its size is less or equal to α: Introduction Wald test p-values t-test Quiz Comments Let Θ0 = fθ0g and Θ1 = fθ1g: In this case β(θ0) is the probability of type I error. β(θ1) gives 1 minus the probability of type II error. In the ideal case we would like the probabilities of both types of errors be equal to zero. Except trivial cases, this is not possible. So we concentrate on tests of size or level α; where α is taken to be small, e.g. α = 0:05: Introduction Wald test p-values t-test Quiz Hypotheses and tests A hypothesis of the form θ = θ0 is called a simple hypothesis. A hypothesis of the form θ > θ0 or θ < θ0 (or both) is called a composite hypothesis. A test of the form H0 : θ = θ0 versus H1 : θ 6= θ0 is called a two-sided test. A test of the form H0 : θ ≤ θ0 versus H1 : θ > θ0 or H0 : θ ≥ θ0 versus H1 : θ < θ0 is called a one-sided test. Introduction Wald test p-values t-test Quiz Example Example Let X1;:::; Xn ∼ N(µ, σ); with σ known. We want to test H0 : µ ≤ 0 versus µ > 0: Hence Θ0 = (−∞; 0] and Θ1 = (0; 1): Let T = X n and conisder the test reject H0 if T > c: The rejection region is R = f(x1;:::; xn): T (x1;:::; xn) > cg : Introduction Wald test p-values t-test Quiz Example (continued) Example The power function is β(µ) = Pµ(X n > c) p p n(X − µ) n(c − µ) = n > Pµ σ σ p n(c − µ) = Z > P σ p n(c − µ) = 1 − Φ : σ The power function is increasing in µ. Introduction Wald test p-values t-test Quiz Example (continued) Example Hence p nc size = sup β(µ) = β(0) = 1 − Φ : µ≤0 σ For size α test we set this equal to α and solve for c to get σΦ−1(1 − α) c = p : n We reject the null hypothesis when X n > c: Equivalently, when p nX n > z = Φ−1(1 − α): σ α Introduction Wald test p-values t-test Quiz Example 2 Example Let X be an observation from a Uniform(0; θ) distribution. Suppose one wants to test H0 : θ = 2 versus H1 : θ 6= 2. Suppose one rejects H0 if X ≥ 1:9. Compute the probability of type 1 error. Introduction Wald test p-values t-test Quiz Most powerful test It would be desirable to find the test with highest power under H1 among all size α tests. Such a test, if it exists, is called the most powerful test. Finding such tests is not easy, and often they don't even exist. We shall instead consider several widely used tests. Introduction Wald test p-values t-test Quiz Wald test Definition Let θ be a scalar parameter and θ^ an estimate of θ with ^se its estimated standard error. Consider the test H0 : θ = θ0 versus H1 : θ 6= θ0: Assume that θ^ is asymptotically normal: θ^ − θ 0 N(0; 1): ^se The size α Wald test is: reject H0 when jW j > zα/2; where θ^ − θ W = 0 : ^se Introduction Wald test p-values t-test Quiz Size of the Wald test Theorem Asymptotically, the Wald test has size α; i.e. Pθ0 (jW j > zα/2) ! α as n ! 1: Proof. We have ! jθ^ − θ j (jW j > z ) = 0 > z Pθ0 α/2 Pθ0 ^se α/2 ! P(jZj > zα/2) = α: Introduction Wald test p-values t-test Quiz Remark An alternative version of the Wald test statistic is θ^ − θ W = 0 ; se0 where se0 is the standard error of θ^ computed at θ0: Both versions of the test are valid. Introduction Wald test p-values t-test Quiz Power of the Wald test Theorem ∗ ∗ Suppose the true value of θ is θ 6= θ0: The power β(θ ) (the probability of correctly rejecting the null hypothesis) is approximately given by ! ! θ^ − θ∗ θ^ − θ∗ 1 − Φ + z + Φ − z : ^se α/2 ^se α/2 Introduction Wald test p-values t-test Quiz Example Example Let X1;:::; Xm and Y1;:::; Yn be two independent samples from distributions with means µ1 and µ2; respectively. Let us test the null hypothesis that µ1 = µ2: Write this as H0 : δ = 0 versus H1 : δ 6= 0; where δ = µ1 − µ2: The plug-in estimate of δ is δ^ = X m − Y n with q 2 2 s1 s2 2 2 estimated standard error ^se= m + n ; where s1 ; s2 are sample variances. The size α Wald test rejects H0 when jW j > zα/2; where δ^ − 0 X − Y W = = m n : q 2 2 ^se s1 s2 m + n Introduction Wald test p-values t-test Quiz Example 2 Example Let us assume that we flip a coin 100 times and 25 out of the 100 we got head. Do we have strong enough evidence to say that the coin is not fair? Introduction Wald test p-values t-test Quiz Relation to confidence intervals Theorem The size α Wald test rejects H0 : θ = θ0 versus H1 : θ 6= θ0 if and only if θ0 2= C; where ^ ^ C = (θ − ^sezα/2; θ + ^sezα/2): In words, testing the hypothesis with Wald test isequivalent to checking whether the null value θ0 is in the confidence interval. Introduction Wald test p-values t-test Quiz Warning When we reject H0; we say the result is statistically significant. The result may be statistically significant without the effect size being large. In such a case we have a result that is statistically significant, but perhaps not scientifically or practically significant. Any confidence interval that excludes θ0 corresponds to rejecting θ0: But θ0 can be close to an endpoint of the confidence interval (not scientifically significant), or far away (scientifically significant). Introduction Wald test p-values t-test Quiz Heuristics Consider a hypothesis test with test statistic T and the rejection region Rα at level α: Fix α: Upon observing the data X we evaluate the statistic T (X ) and ask whether the test rejects the null hypothesis at level α: If yes, we gradually decrease α; while asking the same question. Eventually we arrive at α at which the test does not reject the null hypothesis.