Biost 517 Applied Biostatistics I Lecture Outline
Total Page:16
File Type:pdf, Size:1020Kb
Applied Biostatistics I, AUT 2012 November 26, 2012 Lecture Outline Biost 517 • Comparing Independent Proportions – Large Samples (Uncensored) Applied Biostatistics I • Chi Squared Test – Small Samples (Uncensored) Scott S. Emerson, M.D., Ph.D. • Fisher’s Exact Test Professor of Biostatistics University of Washington • Adjusted Chi Squared or Fisher’s Exact Test Lecture 13: Two Sample Inference About Independent Proportions November 26, 2012 1 2 © 2002, 2003, 2005 Scott S. Emerson, M.D., Ph.D. Summary Measures Comparing Independent Proportions • Comparing distributions of binary variables across two groups • Difference of proportions (most common) – Most common Large Samples (Uncensored) – Inference based on difference of estimated proportions • Ratio of proportions – Of most relevance with low probabilities – Often actually use odds ratio 3 4 Part 1:1 Applied Biostatistics I, AUT 2012 November 26, 2012 Data: Contingency Tables Large Sample Distribution • The cross classified counts • With totally independent data, we use the Central Limit Response Theorem Yes No | Tot Group 0 a b | n0 • Proportions are means 1 c d | n1 – Sample proportions are sample means Total m0 m1 | N • Standard error estimates for each group’s estimated proportion based on the mean – variance relationship 5 6 Asymptotic Sampling Distn Asymptotic Confidence Intervals • Comparing two binomial proportions • Confidence interval for difference between two binomial proportions n0 Suppose independent X i ~ B1, p0 X i X ~ B n0 , p0 i1 We want to make inference about p p n1 1 0 and Y ~ B 1, p Y Y ~ B n , p i 1 i 1 1 100(1)% confidence interval is i1 We want to make inference about p p ˆ ˆ ˆ ˆ 1 0 z1 / 2 se , z1 / 2 se X p (1 p ) Y p (1 p ) pˆ ~ N p , 0 0 pˆ ~ N p , 1 1 0 n 0 n 1 n 1 n 0 0 1 1 ˆ pˆ1 pˆ 0 p (1 p ) p (1 p ) pˆ1(1 pˆ1) pˆ 0 (1 pˆ 0 ) ˆ ˆ ˆ 1 1 0 0 Estimate se ˆ p1 p0 ~ N p1 p0 , n1 n0 n1 n0 7 8 Part 1:2 Applied Biostatistics I, AUT 2012 November 26, 2012 Asymptotic Hypothesis Tests Goodness of Fit Test • Test statistic for difference between two binomial proportions • An alternative derivation of the asymptotic test of binomial proportions as a special case of the “goodness of fit test” • For contingency tables of arbitrary size Suppose we want to test H 0 : p1 p0 0 – “R by C” table – E.g., tumor grade by bone scan score in PSA data set ˆ Under the null hypothesis, Z ~ N(0,1) seˆ Under H0 the entire distributions are equal so pˆ(1 pˆ) pˆ(1 pˆ) X Y Estimate seˆ with pˆ n1 n0 n0 n1 9 10 . tabulate grade bss, row col cell Goodness of Fit Test | bss grade | 1 2 3 | Total 1 | 1 3 6 | 10 • Given categorical random variables, we often have scientific | 10.00 30.00 60.00 | 100.00 questions that relate to the frequency distribution for the | 20.00 27.27 25.00 | 25.00 measurements | 2.50 7.50 15.00 | 25.00 2 | 2 4 9 | 15 • Examples: | 13.33 26.67 60.00 | 100.00 – Checking for Poisson, normal, etc. | 40.00 36.36 37.50 | 37.50 – Distribution of phenotypes to agree with genetic theory | 5.00 10.00 22.50 | 37.50 – Comparing distributions of categorical data across 3 | 2 4 9 | 15 populations | 13.33 26.67 60.00 | 100.00 – Independence of random variables | 40.00 36.36 37.50 | 37.50 | 5.00 10.00 22.50 | 37.50 Total | 5 11 24 | 40 | 12.50 27.50 60.00 | 100.00 | 100.00 100.00 100.00 | 100.00 11 | 12.50 27.50 60.00 | 100.00 Part 1:3 Applied Biostatistics I, AUT 2012 November 26, 2012 Aside: Sampling Nomenclature Basic Idea • We often characterize the sampling scheme according to the • We compare the observed counts in each cell to the number total counts that were fixed by design we might have expected – Poisson sampling: none – “Observed – Expected” – Multinomial sampling: total counts – Counts, NOT proportions – Binomial sampling: either row or column totals – Hypergeometric sampling: both row and column totals 13 14 Sampling Distribution Test Statistic • The test to see whether the data fits the hypotheses (hence • Pearson’s chi-square goodness of fit statistic in the general “goodness of fit”) case – Assuming only one “constraint” on the cells • Observed counts in each cell are assumed to be Poisson random variables – Standard error is the square root of the mean Given K categories, and for the ith cell Oi the observed count and • Test statistic based on sum of Z scores for each cell – Actually done as squared Z scores Ei the expected count (typically Ei pi N) 2 – Sum of squared normals has a chi squared distn K H0 2 Oi Ei 2 ~ K 1 i1 Ei 15 16 Part 1:4 Applied Biostatistics I, AUT 2012 November 26, 2012 Determining the “Expected” Test for Independence • The scientific null hypothesis usually specifies the probability • Chi-square test for independence (or association) that a random observation would belong in each cell R C table for two categorical variables, in the r,cth cell • Most often: – Testing independence of variables Orc the observed count and • Probabilities in each cell are expected to be the product of Erc pˆ r pˆ c N the expected count the marginal distributions 2 R C H 0 2 Oij Eij 2 • Other uses ~ R1 C 1 i1 j1 E – Testing “goodness of fit” to distributions ij – Testing agreement with genetic hypotheses where pˆ r and pˆ c are the estimated proportions are for the row and column margins, respectively 17 18 . tabulate grade bss, row col cell . tabulate grade bss, row col exp | bss | bss grade | 1 2 3 | Total grade | 1 2 3 | Total 1 | 1 3 6 | 10 1 | 1 3 6 | 10 | 10.00 30.00 60.00 | 100.00 | 1.3 2.8 6.0 | 10.0 | 20.00 27.27 25.00 | 25.00 | 10.00 30.00 60.00 | 100.00 | 2.50 7.50 15.00 | 25.00 | 20.00 27.27 25.00 | 25.00 2 | 2 4 9 | 15 2 | 2 4 9 | 15 | 13.33 26.67 60.00 | 100.00 | 1.9 4.1 9.0 | 15.0 | 40.00 36.36 37.50 | 37.50 | 13.33 26.67 60.00 | 100.00 | 5.00 10.00 22.50 | 37.50 | 40.00 36.36 37.50 | 37.50 3 | 2 4 9 | 15 3 | 2 4 9 | 15 | 13.33 26.67 60.00 | 100.00 | 1.9 4.1 9.0 | 15.0 | 40.00 36.36 37.50 | 37.50 | 13.33 26.67 60.00 | 100.00 | 5.00 10.00 22.50 | 37.50 | 40.00 36.36 37.50 | 37.50 Total | 5 11 24 | 40 Total | 5 11 24 | 40 | 12.50 27.50 60.00 | 100.00 | 5.0 11.0 24.0 | 40.0 | 100.00 100.00 100.00 | 100.00 | 12.50 27.50 60.00 | 100.00 | 12.50 27.50 60.00 | 100.00 | 100.00 100.00 100.00 | 100.00 Part 1:5 Applied Biostatistics I, AUT 2012 November 26, 2012 In 2 x 2 Contingency Tables Elevator Statistics • The chi squared test and the Z test comparing binomial • Well, nearly elevator statistics proportions are exactly the same test • The chi squared statistic is just the square of the Z statistic 0 a b n0 • The chi squared test P value will be the same as the two- 1 c d n1 sided P value for the Z test m0 m1 N • Most software packages have a tendency to tell you the value of the chi squared statistic ad bc 2 N N 2 Z ad bc n0n1m0m1 n0n1m0m1 21 22 Need for Large Samples Other Large Sample Tests • Note that because the goodness of fit test is relying on • “Likelihood Ratio Test” asymptotic properties, it is only valid in “large” samples • Also has chi squared distribution in large samples • A commonly used rule of thumb if that the expected counts be – But not the same statistic as “chi squared statistic” greater than 5 in the vast majority of the cells • Good large sample properties – Often most powerful • Less commonly used in 2 x 2 tables – We will use it more often in logistic regression 23 24 Part 1:6 Applied Biostatistics I, AUT 2012 November 26, 2012 Stata Commands: CI Ex: Stata Commands • “cs respvar groupvar, level(#)” • Example: Hepatomegaly by treatment group in PBC data set • Both variables must be coded as 0 and 1 • Response will be called “cases” and “noncases” . g tx= 2 – treatmnt – CI can be found under “Risk difference” . cs hepmeg tx – Chi squared statistic and two-sided P value | tx | | Exposed Unexposed | Total • “tabulate respvar groupvar, row col chi2 lr” Cases | 73 87 | 160 – Row and column percentages Noncases | 84 66 | 150 Total | 157 153 | 310 – Chi squared and likelihood ratio P values | | Risk | .4649682 .5686275 | .516129 25 26 Ex: Stata Commands (cont.) Interpretation • Example: Hepatomegaly by treatment group in PBC data set • With 95% confidence, the true difference in the prevalence of (cont.) hepatomegaly at baseline is between .007 higher in treatment group and .214 lower in treatment group. | Point | • (What are these populations? At baseline, we only have | estimate | [95% Conf. Interval] samples that are treated and control. Both groups were Risk diff | -.1037 | -.2143 .0070 drawn from the same population.) Risk ratio | .8177 | .6580 1.0161 Prev frac ex | .1823 | -.0161 .3420 • Based on two sided P = .0679, we cannot reject the null Prev frac pop | .0923 | _ _ hypothesis of equality chi2(1) = 3.33 Pr>chi2 = 0.0679 27 28 Part 1:7 Applied Biostatistics I, AUT 2012 November 26, 2012 Caveat: Sad Fact of Life Comparison to t Test • Different variance estimates are typically used for CI and • Hepatomegaly by treatment group using two sample t test hypothesis tests with unequal variances – Z test uses the standard normal distribution and does not • We can see disagreement between the conclusion reached use the sample variance by CI and P value – The P value might be less than .05, but the CI contain 0 .ttest hepmeg, by(tx) unequal – The P value might be greater than .05, but the CI exclude Two-sample t test; unequal variances 0: N obs= 157 0 1: N obs= 153 Variable | Mean St Err t P>|t| [95% CI] 0 | .465 .0399 11.6 0.0000 .386 .544 1 | .569 .0402 14.2 0.0000 .489 .648 diff |-.104 .0566 -1.83 0.0682 -.215 .008 29 30 Comparison to