Hypothesis Testing We Saw Last Week That We Can Never Know the Population Parameters Without Measuring the Entire Population
Total Page:16
File Type:pdf, Size:1020Kb
Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Interpreting a hypothesis test Common types of hypothesis test Common types of hypothesis test Power calculations Power calculations Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals Introduction Hypothesis Testing We saw last week that we can never know the population parameters without measuring the entire population. Mark Lunt We can, however, make inferences about the population parameters from random samples. Centre for Epidemiology Versus Arthritis University of Manchester Last week, we saw how we can create a confidence interval, within which we are reasonably certain the population parameter lies. This week, we will see a different type of inference: is there evidence that the parameter does not take a particular value ? 10/11/2020 Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test Components of Hypothesis test Common types of hypothesis test Test statistics Common types of hypothesis test Test statistics Power calculations Examples Power calculations Examples Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals Hypothesis Testing Significance Testing Fisher’s p-value was very informal way to assess evidence Form the Null Hypothesis against null hypothesis Calculate probability of observing data if null hypothesis is Neyman and Pearson developed more formal approach: true (p-value) significance testing Low p-value taken as evidence that null hypothesis is Based on decision making: rule for deciding whether or not unlikely to reject the null hypothesis Originally, only intended as informal guide to strength of Clinically, need to make decisions. Scientifically, may be evidence against null hypothesis more appropriate to retain uncertainty. Introduces concepts of power, significance Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test Components of Hypothesis test Common types of hypothesis test Test statistics Common types of hypothesis test Test statistics Power calculations Examples Power calculations Examples Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals The Null Hypothesis The Alternative Hypothesis Simplest acceptable model. “Null Hypothesis is untrue” If the null hypothesis is true, the world is uninteresting. Covers any other possibility. Must be possible to express numerically (“test statistic”). May be one-sided, if effect in opposite direction is as Sampling distribution of test statistic must be known. uninteresting as the null hypothesis Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test Components of Hypothesis test Common types of hypothesis test Test statistics Common types of hypothesis test Test statistics Power calculations Examples Power calculations Examples Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals One and Two-sided tests Test Statistic Good example: χ2 test. 2 χ test measures difference between expected and Null hypothesis distribution must be known. observed frequencies Expected value if null hypothesis is true. Only unusually large differences are evidence against null Variation due to sampling error (standard error) if null hypothesis. hypothesis is true. Bad example: clinical trial From this distribution, probability of any given value can be A drug company may only be interested in how much better its drug is than the competition. calculated. Easier to get a significant difference with a one-sided test. Can be a mean, proportion, correlation coefficient, The rest of the world is interested in differences in either regression coefficient etc. direction, want to see a two-sided test. One-sided tests are rarely justified Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test Components of Hypothesis test Common types of hypothesis test Test statistics Common types of hypothesis test Test statistics Power calculations Examples Power calculations Examples Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals Normally Distributed Statistics Test statistic: Normal distribution Many test statistics can be considered normally distributed, if sample is large enough. .4 If the test statistic T has mean µ and standard error σ, then T − µ σ has a normal distribution with mean 0 and standard .3 error 1. σ We do not know , we only have estimate s. y .2 T − µ If our sample is of size n, s has a t-distribution with n − 1 d.f. Hence the term “t-test”. .1 If n ≥ 100, a normal distribution is indistinguishable from the t-distribution. 0 −4 −2 0 2 4 Extreme values less unlikely with a t-distribution than a x normal distribution. Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test Components of Hypothesis test Common types of hypothesis test Test statistics Common types of hypothesis test Test statistics Power calculations Examples Power calculations Examples Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals T-distribution and Normal Distribution Non-Normally Distributed Statistics .4 Statistics may follow a distribution other than the normal .3 distribution. χ2 y .2 Mann-Whitney U Many will be normally distributed in large enough samples .1 Tables can be used for small samples. Can be compared to quantiles of their own distribution 0 −4 −2 0 2 4 x Normal t−distribution (10 d.f.) Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test Components of Hypothesis test Common types of hypothesis test Test statistics Common types of hypothesis test Test statistics Power calculations Examples Power calculations Examples Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals 2 Test Statistic: χ4 Example 1: Height and Gender .2 Null hypothesis On average, men and women are the same height .15 Alternative Hypothesis One gender tends to be taller than the other. y .1 Test Statistic Difference in mean height between men and women. .05 One-Sided Hypotheses Men are taller than women Women are taller than men 0 0 5 10 15 x Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Components of Hypothesis test Interpreting a hypothesis test p-values Common types of hypothesis test Test statistics Common types of hypothesis test Errors Power calculations Examples Power calculations Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals Example 2: Drinks preferences The p-value Null hypothesis Equal numbers of people prefer Coke and Pepsi Probability of obtaining a value of the test statistic at least Alternative Hypothesis Most people prefer one drink to the as extreme as that observed, if the null hypothesis is true. other Small value ) data obtained was unlikely to have occurred Test Statistic Several possibilities: under null hypothesis Difference in proportions preferring each drink Data did occur, so null hypothesis is probably not true. Ratio of proportions preferring each drink Originally intended as informal way to measure strength of One-Sided Hypotheses evidence against null hypothesis More people prefer Coke It it not the probability that the null hypothesis is true. More people prefer Pepsi Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Interpreting a hypothesis test p-values p-values Common types of hypothesis test Common types of hypothesis test Errors Errors Power calculations Power calculations Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals Interpreting the p-value Factors Influencing p-value 0 ≤ p ≤ 1 Effect Size: a big difference is easier to find than a small Large p (≥ 0:2, say) ) no evidence against null hypothesis difference. Sample Size: The more subjects, the easier to find a p ≤ 0:05 ) there is some evidence against null difference hypothesis Always report actual p-values, not p < 0:05 or p > 0:05 Effect is “statistically significant at the 5% level” NS is unforgivable 0.05 is an arbitrary value: 0.045 is very little different from “No significant difference” can mean “no difference in 0.055. population” or “Sample size was too small to be certain” Smaller p ) stronger evidence Statistically significant difference may not be clinically Large p-value not evidence that null hypothesis is true. significant. Formulating a hypothesis test Formulating a hypothesis test Interpreting a hypothesis test Interpreting a hypothesis test p-values p-values Common types of hypothesis test Common types of hypothesis test Errors Errors Power calculations Power calculations Hypothesis tests and confidence intervals Hypothesis tests and confidence intervals Interpreting a significance test Meta-Analysis Example Significance test asks “Is the null hypothesis true” Ten studies Answers either “Probably not” or “No comment” 50 unexposed and 50 exposed in each Answers are interpreted as “No” or “Yes” Prevalence 10% in unexposed, 15% in exposed Misinterpretation leads to incorrect conclusions True RR = 1.5 Formulating