Confidence Intervals

PoCoG Biostatistical Clinic Series
Joseph Coll, PhD | Biostatistician

Introduction
› Introduction to Confidence Intervals
› Calculating a 95% Confidence Interval for the Mean of a Sample
› Correspondence Between Hypothesis Tests and Confidence Intervals

Introduction to Confidence Intervals
› Each time we take a sample from a population and calculate a statistic (e.g. a mean or proportion), the value of the statistic will be a little different.
› If someone asks for your best guess of the population parameter, you would use the value of the sample statistic.
› But how well does the statistic estimate the population parameter?
› It is possible to calculate a range of values based on the sample statistic that encompasses the true population value with a specified level of probability (confidence).
› DEFINITION: A 95% confidence interval for the mean of a sample is a range of values which we can be 95% confident includes the mean of the population from which the sample was drawn.
› If we took 100 random samples from a population and calculated a mean and 95% confidence interval for each, then approximately 95 of the 100 confidence intervals would include the population mean.
› A single 95% confidence interval is the set of possible values that the population mean could have that are not statistically different from the observed sample mean.
› A 95% confidence interval is a set of possible true values of the parameter of interest (e.g. mean, correlation coefficient, odds ratio, difference of means, proportion) that are consistent with the data.

Calculating a 95% Confidence Interval for the Mean of a Sample
› x̄ ± k × (standard deviation / √n)
› Where x̄ is the mean of the sample
› k is a constant that depends on the hypothesized distribution of the sample mean, the sample size and the amount of confidence desired.
› n is the number of observations in the sample
› Note that (standard deviation / √n) is the standard error of the mean, a measure of how precisely the sample mean estimates the population mean.
› The width of the confidence interval is affected by:
  - The level of confidence
  - The sample size
  - The variability of the data (standard deviation)

Example – Continuous Outcome
› A sample of 10 yields the following values: 1.18, 3.51, 2.56, 1.64, 2.77, 2.87, 2.01, 2.31, 3.71, 2.57
› The mean is 2.512 and the standard deviation is 0.779.
› The values for k were obtained from a table of values for the t distribution with n - 1 = 9 degrees of freedom.

Confidence Level   k       Confidence Interval   Width
90%                1.833   2.06 – 2.96           0.90
95%                2.262   1.96 – 3.07           1.11
99%                3.250   1.71 – 3.31           1.60

Example (continued)
› If we increase the sample size and hold the confidence at 95%:

Sample Size   k       95% CI        Width
10            2.262   1.96 – 3.07   1.11
20            2.093   2.15 – 2.88   0.73
30            2.045   2.22 – 2.80   0.58
60            2.001   2.31 – 2.71   0.40

Example (continued)
› If we are able to reduce the standard deviation (holding the sample size at 10 and assuming 95% confidence):

Standard Deviation   k       95% CI        Width
0.779                2.262   1.96 – 3.07   1.11
0.600                2.262   2.08 – 2.94   0.86
0.400                2.262   2.23 – 2.80   0.57

Proportions
› Approximate confidence intervals for proportions can be obtained using the same formula.
› For proportions, the underlying statistical distribution is binomial with mean p. The standard deviation is the square root of p*(1-p).
› The confidence intervals are “approximate” because the values for k are obtained from a table of values for the t distribution and assume normal theory.
› For proportions, the standard deviation is a function of the mean.
› When in doubt, use the maximum standard deviation of 0.50 obtained when the proportion is 0.50.
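The dependence of the standard deviation on the proportion, and the reuse of the mean formula for proportions, can be checked numerically. The following is a minimal sketch using only the Python standard library. The function names `proportion_sd` and `approx_proportion_ci` are hypothetical, the value of k is copied from a t table rather than computed, and the proportion of 0.50 in a sample of 30 is an assumed illustration, not data from this presentation.

```python
import math


def proportion_sd(p):
    """Standard deviation sqrt(p * (1 - p)) for a proportion p."""
    return math.sqrt(p * (1 - p))


def approx_proportion_ci(p, n, k):
    """Approximate interval p ± k * sqrt(p * (1 - p)) / sqrt(n).

    k is supplied by the caller, e.g. read from a t table.
    """
    margin = k * proportion_sd(p) / math.sqrt(n)
    return p - margin, p + margin


# The standard deviation peaks at p = 0.50 and is symmetric around it.
for p in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"p = {p:.2f}  sd = {proportion_sd(p):.2f}")

# Hypothetical illustration: proportion 0.50 in a sample of 30,
# with k = 2.045 taken from a t table (29 degrees of freedom, 95%).
lower, upper = approx_proportion_ci(0.50, 30, 2.045)
print(f"approximate 95% CI: {lower:.2f} - {upper:.2f}")
```

Because the interval relies on normal theory, it is a rough guide only; for small samples or proportions near 0 or 1 the limits can fall outside the range 0 to 1.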
Proportion   Standard Deviation
0.10         0.30
0.25         0.43
0.50         0.50
0.75         0.43
0.90         0.30

Example – Proportions
› Suppose the prevalence of depression in a population is known to be 20%.
› The mean proportion is 0.20 and the standard deviation is 0.40.

Sample Size   k       95% CI         Width
10            2.262   -0.09 – 0.49   0.57
20            2.093   0.01 – 0.39    0.37
30            2.045   0.05 – 0.35    0.30
60            2.001   0.10 – 0.30    0.21

› With a sample of 10 the lower limit falls below 0, which is not a possible proportion; this is one sign that the interval is only approximate.

Correspondence Between Hypothesis Tests and Confidence Intervals
› If the p-value for testing H0: μ = c is less than 0.05, then the 95% confidence interval will not contain c.
› Conversely, if the p-value for testing H0: μ = c is greater than 0.05, then the 95% confidence interval will contain c.
› Similar statements hold for other levels of significance (e.g. 0.01 with a 99% interval, 0.10 with a 90% interval).
› EXAMPLE: Using our sample of 10 observations, if we test H0: mean = 2 versus HA: mean ≠ 2, the p-value is 0.067. We cannot reject the null hypothesis at the 5% level, but can reject the null at the 10% level of significance.
› The 95% CI is 1.96 – 3.07, which includes the value 2.
› The 90% CI is 2.06 – 2.96, which does not include the value 2.

Reporting Options
› Report the mean and standard deviation
› Report that the observed difference is (or is not) statistically significant
› Report the actual p-value
› Report the 95% confidence interval
› Report a combination or all of the above

P-value or Confidence Interval (or both)? Pro Confidence Interval
› One of the biggest objections to statistical tests is that they answer the question “Is there a difference?” instead of the more clinically meaningful question “Is the difference large enough to recommend a change in the standard of care?”
› A statistically significant p-value can be associated with a clinically insignificant difference due only to a large sample size. A clinically significant difference can have a non-significant p-value if the sample size is too small. P-values are not good descriptive statistics.
› With the use of a confidence interval, there is explicit recognition that a single study sample can give only an estimate of the true population parameter.

Statistical versus Clinical Significance
› Consider the following systolic blood pressure examples:
› Case 1 – Statistical significance without clinical significance
  - n1 = n2 = 500
  - Drug A: Mean SBP = 140.2 mmHg
  - Drug B: Mean SBP = 140.3 mmHg
  - p < 0.001
› Case 2 – Clinical significance without statistical significance
  - n1 = n2 = 10
  - Drug A: Mean SBP = 124.6 mmHg
  - Drug B: Mean SBP = 149.8 mmHg
  - p = 0.31

P-value or Confidence Interval (or both)? Con Confidence Interval
› Confidence intervals can be difficult to calculate, especially when the distribution of the sample is not bell-shaped or if the statistical model is complicated.
› The p-value provides an indication of the degree of significance (or how far outside of the confidence interval the null value lies). The confidence interval just lets you say whether the null value is inside or outside of the confidence interval.

Acknowledgements
› The material presented in this webinar was adapted from a Biostatistics Short Course developed by Lynn Ackerson PhD, Becki Bucher Bartelson PhD and David McCormick MS.
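The continuous-outcome example, and its correspondence with the test of H0: mean = 2, can be reproduced with a short script. This is a minimal sketch using only the Python standard library. The names `mean_confidence_interval`, `contains` and `T_CRITICAL_DF9` are hypothetical, and the critical values are copied from a t table for 9 degrees of freedom rather than computed. Because the ten listed values are rounded, the output may differ from the tabulated results in the last decimal place.

```python
import math
import statistics

# The ten observations from the continuous-outcome example.
sample = [1.18, 3.51, 2.56, 1.64, 2.77, 2.87, 2.01, 2.31, 3.71, 2.57]

# Two-sided t critical values for 9 degrees of freedom (n = 10),
# copied from a t table (an assumption of this sketch, not computed here).
T_CRITICAL_DF9 = {0.90: 1.833, 0.95: 2.262, 0.99: 3.250}


def mean_confidence_interval(values, k):
    """Return (lower, upper) for mean ± k * (standard deviation / sqrt(n))."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    margin = k * sd / math.sqrt(n)
    return mean - margin, mean + margin


def contains(interval, value):
    """True if value lies inside the closed interval (lower, upper)."""
    lower, upper = interval
    return lower <= value <= upper


# Interval and width at each confidence level.
for level, k in T_CRITICAL_DF9.items():
    lower, upper = mean_confidence_interval(sample, k)
    print(f"{level:.0%} CI: {lower:.2f} - {upper:.2f} (width {upper - lower:.2f})")

# Correspondence with the two-sided test of H0: mean = 2.
null_value = 2.0
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
t_stat = (statistics.mean(sample) - null_value) / standard_error
for level, k in T_CRITICAL_DF9.items():
    inside = contains(mean_confidence_interval(sample, k), null_value)
    rejected = abs(t_stat) > k  # reject H0 at significance level 1 - level
    print(f"{level:.0%}: null value inside CI = {inside}, H0 rejected = {rejected}")
```

At every level the two printed flags are opposites: the null value lies inside the interval exactly when the test at the matching significance level does not reject it.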
