Hypothesis Testing and Likelihood Ratio Tests

We will adopt the following model for observed data. The distribution of Y = (Y₁, ..., Yₙ) is considered known except for some parameter θ, which may be a vector θ = (θ₁, ..., θₖ); θ ∈ Θ, the parameter space. The parameter space will usually be an open set. If Y is a continuous random variable, its probability density function (pdf) will be denoted f(y; θ). If Y is discrete, then f(y; θ) represents the probability mass function (pmf); f(y; θ) = P_θ(Y = y).

A statistical hypothesis is a statement about the value of θ. We are interested in testing the null hypothesis H₀: θ ∈ Θ₀ versus the alternative hypothesis H₁: θ ∈ Θ₁, where Θ₀, Θ₁ ⊂ Θ. Naturally Θ₀ ∩ Θ₁ = ∅, but we need not have Θ₀ ∪ Θ₁ = Θ. A hypothesis test is a procedure for deciding between H₀ and H₁ based on the sample data. It is equivalent to a critical region: a critical region is a set C ⊂ ℝⁿ such that if y = (y₁, ..., yₙ) ∈ C, H₀ is rejected. Typically C is expressed in terms of the value of some test statistic, a function of the sample data. For example, we might have

    C = {(y₁, ..., yₙ) : (ȳ − µ₀)/(s/√n) ≥ 3.324}.
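To make the critical region idea concrete, here is a minimal sketch that evaluates the example statistic (ȳ − µ₀)/(s/√n) against the critical value 3.324. The sample values and µ₀ below are made-up numbers for illustration, not anything from the notes.

```python
# A minimal sketch of applying a critical region: compute the test
# statistic (ybar - mu0) / (s / sqrt(n)) and compare it to the
# critical value 3.324 from the example above.
import numpy as np

y = np.array([12.1, 14.3, 13.8, 15.2, 13.5, 14.9, 13.1, 14.4])  # hypothetical sample
mu0 = 12.0          # hypothesized mean under H0 (made-up value)
n = len(y)

ybar = y.mean()
s = y.std(ddof=1)   # sample standard deviation
t_stat = (ybar - mu0) / (s / np.sqrt(n))

# y lies in the critical region C exactly when the statistic is at
# least the critical value, in which case H0 is rejected.
reject_H0 = t_stat >= 3.324
print(f"test statistic = {t_stat:.3f}, reject H0: {reject_H0}")
```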
The number 3.324 here is called a critical value of the test statistic (Ȳ − µ₀)/(S/√n).

If y ∈ C but θ ∈ Θ₀, we have committed a Type I error. If y ∉ C but θ ∈ Θ₁, we have committed a Type II error. The ideal hypothesis test would simultaneously minimize the probabilities of both types of error, but this turns out to be impossible in principle. So what we do is to select a significance level

    α = max_{θ∈Θ₀} P_θ(Y ∈ C)

to be a small number; α = 0.05 and α = 0.01 are traditional. Then a "good" test is one with a small probability of Type II error, among all possible tests with significance level α. When y ∈ C for a test of level α (that is, H₀ is rejected), we say the results are statistically significant at level α.

For a particular set of data, the smallest significance level that leads to the rejection of H₀ is called the p-value. H₀ is rejected if and only if p ≤ α.

P_θ(Y ∈ C) can be viewed as a function of θ. If θ ∈ Θ₁, we refer to this quantity as the power of the test C. For any good test, we will have P_θ(Y ∈ C) ↑ 1 as n → ∞ for each θ ∈ Θ₁. This provides a way to choose sample size: for a fixed value θ ∈ Θ₁ that is of scientific interest, choose n large enough so that the probability of rejecting H₀ is acceptably high.

How do we construct good hypothesis tests? Usually it is hard to beat the likelihood ratio tests, which are defined as follows:

    C = {y : λ ≤ k},  where  λ = max_{θ∈Θ₀} L(θ; y) / max_{θ∈Θ} L(θ; y).

The value of k (0 < k < 1) varies from problem to problem. It is chosen so that the test will have significance level ("size") α. Notice that the denominator of λ is just the likelihood function evaluated at the MLE. The numerator is the likelihood function evaluated at a sort of restricted MLE, in which θ is forced to stay in the set Θ₀. Also notice that when λ = 0, H₀ is always rejected, and when λ = 1, H₀ is never rejected.
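The following sketch computes λ in one concrete case, assumed here for illustration: Y₁, ..., Yₙ iid normal with both the mean µ and variance σ² unknown, testing H₀: µ = µ₀. In this model both maximizations have closed forms: the unrestricted MLE is ȳ together with the mean squared deviation from ȳ, while the restricted MLE fixes µ = µ₀ and maximizes over σ² alone. The data and µ₀ are the same made-up values as in the previous sketch.

```python
# A sketch of the likelihood ratio for the normal model with unknown
# mean and variance, testing H0: mu = mu0 (illustrative example, not
# from the notes).
import numpy as np
from scipy.stats import norm

def normal_loglik(y, mu, sigma2):
    """Log-likelihood of an iid N(mu, sigma2) sample."""
    return norm.logpdf(y, loc=mu, scale=np.sqrt(sigma2)).sum()

y = np.array([12.1, 14.3, 13.8, 15.2, 13.5, 14.9, 13.1, 14.4])  # hypothetical data
mu0 = 12.0

# Unrestricted MLE: maximize over (mu, sigma^2); closed form here.
mu_hat = y.mean()
sigma2_hat = ((y - mu_hat) ** 2).mean()

# Restricted MLE under H0: mu is forced to mu0, sigma^2 still free.
sigma2_hat0 = ((y - mu0) ** 2).mean()

# lambda = restricted maximized likelihood / unrestricted maximized likelihood
log_lam = normal_loglik(y, mu0, sigma2_hat0) - normal_loglik(y, mu_hat, sigma2_hat)
lam = np.exp(log_lam)   # 0 < lambda <= 1; small lambda is evidence against H0
print(f"lambda = {lam:.4f}")
```

For this particular model, λ works out to be a decreasing function of the squared one-sample t statistic, which is one way the exact t-test arises from the likelihood ratio approach described below.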
There are two main versions of the likelihood ratio approach. One version leads to exact tests, and the other leads to large-sample approximations. The exact likelihood ratio tests are obtained by working on the critical region C and re-expressing it in terms of the value of some test statistic whose distribution (given that H₀ is true) is known exactly. This is where we get most of the standard statistical tests, including t-tests and F-tests in regression and the analysis of variance.

Sometimes, after the critical region C has been re-expressed in terms of some seemingly convenient test statistic, nobody can figure out its distribution under H₀. This means we are unable to choose k ∈ (0, 1) so that the test has size α. And if the MLE has to be approximated numerically, there is little hope of an exact test. In such cases we resort to large-sample likelihood ratio tests; we will use the following result.

Let θ = (θ₁, ..., θₚ); we want to test H₀: θ ∈ Θ₀ versus H₁: θ ∈ Θ₁, where Θ₀ ∪ Θ₁ = Θ. Let r ≤ p be the number of parameters θⱼ whose values are restricted by H₀. Then under some smoothness conditions, G = −2 log(λ) has (for large n) an approximate χ² distribution with r degrees of freedom. We reject H₀ when G is greater than the critical value of the χ² distribution. If g is the observed value of the statistic G calculated from the sample data, the p-value is 1 − γ(g), where γ is the cumulative distribution function of a χ² distribution with r degrees of freedom.
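The large-sample p-value is a one-line computation with the χ² survival function, as the sketch below shows. In the normal-model sketch above, H₀ restricts only µ, so r = 1 and g = −2 × log_lam; the value of g used here is a made-up stand-in.

```python
# A sketch of the large-sample likelihood ratio p-value. The observed
# value g below is made up; in practice g = -2 * log(lambda) computed
# from the data, and r is the number of parameters restricted by H0.
from scipy.stats import chi2

g = 4.71   # hypothetical observed value of G = -2 log(lambda)
r = 1      # e.g., H0: mu = mu0 restricts one parameter

p_value = chi2.sf(g, df=r)   # equals 1 - gamma(g), gamma = CDF of chi-squared with r df
print(f"p-value = {p_value:.4f}")  # reject H0 at level alpha iff p_value <= alpha
```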
