Estimation: Understanding Confidence Intervals


Statistical Notes IV
A Cook, A Sheikh

Adrian Cook, Statistician; Aziz Sheikh, NHS R&D National Primary Care Training Fellow; Department of Primary Health Care & General Practice, Imperial College School of Medicine. Correspondence to: Adrian Cook, Department of Primary Health Care & General Practice, ICSM, Charing Cross Campus, Reynolds Building, St Dunstan's Road, London W6 8RP; [email protected]. Date submitted: 19/7/00. Date accepted: 17/10/00. Prim Care Respir J 2000;9(3):48-50.

INTRODUCTION

The previous paper of this series described hypothesis testing, showing how a p-value indicates the likelihood of a genuine difference existing between two or more groups. An alternative to using p-values alone, to show whether or not differences exist, is to report the actual size of the difference. Since a difference observed between groups is subject to random variation, it becomes necessary to present not only the difference, but also a range of values around the observed difference within which it is believed the true value will lie. Such a range is known as a confidence interval.

Confidence intervals may be calculated for means, proportions, differences between means or proportions, relative risks, odds ratios and many other summary statistics. Here we describe in detail only one simple calculation, that of a confidence interval for a proportion. Using examples from the literature we look at the interpretation of other confidence intervals, and show the relationship between p-values and confidence intervals.

WHY P-VALUES ALONE ARE NOT ENOUGH

P-values express statistical significance, but statistically significant results may have little clinical significance. This is particularly the case with large studies that have power to detect very small differences. For example, an improvement in average peak flow of 1 l/min, when comparing intervention and control groups, may be statistically significant but clearly has little clinical benefit.

A further drawback of p-values is the emphasis placed on p=0.05, a value chosen purely by convention but which has spawned a tendency to dismiss anything larger and focus attention on smaller values only. Presenting a confidence interval instead requires the reader to think about what the values actually mean, and so to interpret the results more fully.

CALCULATING THE CONFIDENCE INTERVAL FOR A PROPORTION

Kaur et al carried out a prevalence study of asthma symptoms and diagnosis in British 12-14 year olds.1 From a sample of 27,507 children, 20.8% (n=5,736) reported 'ever having' asthma. Assuming the sample to be representative, the true national prevalence should be close to this figure, but it remains unknown, and a different sample would probably yield a slightly different estimate. To calculate a range likely to contain the true figure, we need to know by how much the sample proportion is likely to vary. Put more technically, we need to know the standard deviation of the sample statistic. This is known as the standard error.

One way of finding the standard error would be to take several more samples, calculate the proportion 'ever having' asthma separately in each sample, then calculate the standard deviation of these proportions. Fortunately, such labour is unnecessary because it has been shown that most summary statistics follow normal distributions, particularly when sample sizes are large. Furthermore, the standard deviations of these distributions are directly related to the standard deviation of the original data. The situation is illustrated in Figure 1, which shows a possible distribution of data together with the expected distribution of mean values generated by data samples.

[Figure 1. Expected distribution of a sample mean]

With data from a normally distributed variable, 95% of observations should lie within two standard deviations of the mean. Having said that a sample statistic is expected to be normally distributed, it follows that on 95% of occasions it will be less than two standard errors from its true value. The probability that the range formed by the sample statistic plus or minus two standard errors contains the true value is therefore 95%. This range is the 95% confidence interval. Formulae for the standard error of common summary statistics are given in Table 1; those for other statistics are usually readily available in textbooks.

Table 1. Formulae for calculating standard errors

  Summary statistic           Statistic                                      Standard error (SE)
  Mean                        mean of x                                      sd(x) / sqrt(n)
  Proportion                  p = cases / (cases + noncases)                 sqrt[ p(1-p) / n ]
  Difference in means         mean(x1) - mean(x2)                            sqrt[ sd(x1)^2/n1 + sd(x2)^2/n2 ]
  Difference in proportions   p1 - p2                                        sqrt[ p1(1-p1)/n1 + p2(1-p2)/n2 ]
  Relative risk*              RR = [cases1/(cases1+noncases1)] /             sqrt[ 1/cases1 - 1/(cases1+noncases1)
                                   [cases2/(cases2+noncases2)]                     + 1/cases2 - 1/(cases2+noncases2) ]
  Odds ratio*                 OR = [cases1/noncases1] /                      sqrt[ 1/cases1 + 1/noncases1
                                   [cases2/noncases2]                              + 1/cases2 + 1/noncases2 ]

  * Standard error of the logged statistic
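The formulae in Table 1 translate directly into code. As an illustrative aside (not part of the original article), the short Python sketch below implements the standard errors used here; the function names are this note's own, and the conventional multiplier 1.96 is used in place of the rounded value of two.

```python
# Illustrative implementations of the standard-error formulae in Table 1.
# They assume the large-sample normal approximation used throughout the paper.
import math

def se_proportion(cases, n):
    """Standard error of a proportion p = cases / n."""
    p = cases / n
    return math.sqrt(p * (1 - p) / n)

def se_diff_proportions(cases1, n1, cases2, n2):
    """Standard error of the difference between two independent proportions."""
    p1, p2 = cases1 / n1, cases2 / n2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_log_relative_risk(cases1, noncases1, cases2, noncases2):
    """Standard error of the *logged* relative risk (see Table 1 footnote)."""
    return math.sqrt(1 / cases1 - 1 / (cases1 + noncases1)
                     + 1 / cases2 - 1 / (cases2 + noncases2))

def se_log_odds_ratio(cases1, noncases1, cases2, noncases2):
    """Standard error of the *logged* odds ratio (see Table 1 footnote)."""
    return math.sqrt(1 / cases1 + 1 / noncases1
                     + 1 / cases2 + 1 / noncases2)

def ci_95(estimate, se):
    """Approximate 95% confidence interval: estimate +/- 1.96 standard errors."""
    return estimate - 1.96 * se, estimate + 1.96 * se
```

Applied to the asthma example, se_proportion(5736, 27507) is roughly 0.0025 (0.25%), and ci_95(0.208, 0.0025) recovers the 20.3% to 21.3% interval quoted in the next paragraph.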
For the example of asthma prevalence, the standard error can be calculated as 0.25%, giving a 95% confidence interval of 20.3% to 21.3%. The narrowness of this confidence interval reflects the large sample size, illustrating how certainty in a result grows as the number of observations increases, resulting in smaller standard errors and narrower confidence intervals.

INTERPRETING CONFIDENCE INTERVALS FOR RELATIVE RISKS

Wald and Watt compared all-cause mortality among different types of smoker with that of lifelong non-smokers.2 Compared to non-smokers, the relative risk (RR) of mortality among former cigarette smokers was 1.11 (95% confidence interval 0.92 to 1.34). The best estimate of the effect on mortality is an increase of 11% (RR=1.11), but the possibility of no effect (RR=1.0) remains. Among current smokers the relative mortality was 2.26 (95% confidence interval 1.97 to 2.58). This confidence interval does not include RR=1.0, and so we can be confident that mortality is higher among current smokers.

Among pipe and cigar smokers who had never smoked cigarettes, mortality compared to non-smokers was 1.23 (95% confidence interval 0.99 to 1.75). Relative mortality is higher than that of ex-smokers, but the confidence interval is much wider. The greater width is partly due to the pipe/cigar group being smaller than the ex-smokers group; one more or one fewer death therefore has a greater effect on the mortality estimate, and the wider confidence interval reflects the less stable result.

Table 2. Relative all-cause mortality of different smoking groups

  Smoking group                                 n      Died   RR†    95% CI
  Lifelong non-smoker                           6539   346    1.00   -
  Former cigarette smoker                       1465   162    1.11   0.92 to 1.34
  Pipe/cigar smoker, never smoked cigarettes    1309   311    1.23   0.99 to 1.75
  Current cigarette smoker                      4182   540    2.26   1.97 to 2.58

  † Adjusted for age at entry to study
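The relative-risk intervals above are constructed on the log scale, using the standard error formula from Table 1. As an illustrative aside (not part of the original article), the sketch below applies that formula to the raw counts in Table 2 for current cigarette smokers; because the published relative risks are adjusted for age at entry, the crude figures computed here differ from those shown in the table.

```python
# Crude relative risk of death for current cigarette smokers versus lifelong
# non-smokers, with a 95% CI computed on the log scale (Table 1 formula).
# The published RR of 2.26 (1.97 to 2.58) is age-adjusted, so these crude
# figures come out higher.
import math

deaths_smokers, n_smokers = 540, 4182          # current cigarette smokers
deaths_nonsmokers, n_nonsmokers = 346, 6539    # lifelong non-smokers

rr = (deaths_smokers / n_smokers) / (deaths_nonsmokers / n_nonsmokers)
se_log_rr = math.sqrt(1 / deaths_smokers - 1 / n_smokers
                      + 1 / deaths_nonsmokers - 1 / n_nonsmokers)

lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
print(f"Crude RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")  # ~2.44 (2.14 to 2.78)
```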
CONFIDENCE INTERVALS AND P-VALUES

It may have become apparent that the statistical significance of differences can be gleaned from confidence intervals. A confidence interval containing 1.0 for a relative risk or an odds ratio means we are less than 95% sure that a genuine difference exists; a significance test of the difference would thus give p>0.05. Similarly, a confidence interval not including 1.0 corresponds to p<0.05, while an interval bounded at one end by exactly 1.0 would give p=0.05. A similar situation exists with confidence intervals for differences in means or proportions, the only difference being that no effect is represented by the value 0.0 rather than 1.0.

The practice of reporting confidence intervals together with p-values is questionable, p-values adding little information for the informed reader. An exception to this rule occurs when a large number of confidence intervals are reported; in this instance the generally discouraged habit of replacing p-values with stars indicating p<0.05 and p<0.01 becomes useful, allowing a rapid overview of the results.

CONCLUSIONS

Here we have outlined the theory and practice of calculating confidence intervals, and given pointers toward their meaningful interpretation. Clinical significance may be gauged both from the point estimate of the difference and from the upper and lower confidence limits. Whether or not a confidence interval contains unity for a relative difference, or zero for an absolute difference, reveals statistical significance. Because they convey both aspects of significance, confidence intervals have become the strongly preferred way of presenting results.

Recommended reading

Gardner MJ, Altman DG. Statistics with confidence. London: BMJ, 1993.

References

1. Kaur B, Ross Anderson H, Austin J, et al. Prevalence of asthma symptoms, diagnosis, and treatment in 12-14 year old children across Great Britain (International Study of Asthma and Allergies in Childhood, ISAAC UK). BMJ 1998;316:118-24.
2. Wald NJ, Watt HC. Prospective study of effect of switching from cigarettes to pipes or cigars on mortality from three smoking related diseases. BMJ 1997;314:1860-3.

Acknowledgements

AS is supported by an NHS R&D National Primary Care Training Fellowship.
Recommended publications
  • Statistical Significance Testing in Information Retrieval: An Empirical Analysis of Type I, Type II and Type III Errors
    Statistical Significance Testing in Information Retrieval: An Empirical Analysis of Type I, Type II and Type III Errors. Julián Urbano ([email protected]), Harlley Lima ([email protected]), Alan Hanjalic ([email protected]); Delft University of Technology, The Netherlands. ABSTRACT: Statistical significance testing is widely accepted as a means to assess how well a difference in effectiveness reflects an actual difference between systems, as opposed to random noise because of the selection of topics. According to recent surveys on SIGIR, CIKM, ECIR and TOIS papers, the t-test is the most popular choice among IR researchers. However, previous work has suggested computer intensive tests like the bootstrap or the permutation test, based mainly on theoretical arguments. On empirical grounds, others have suggested non-parametric alternatives such as the Wilcoxon test. Indeed, the question of which tests we should use has accompanied IR and related fields for decades now. 1 INTRODUCTION: In the traditional test collection based evaluation of Information Retrieval (IR) systems, statistical significance tests are the most popular tool to assess how much noise there is in a set of evaluation results. Random noise in our experiments comes from sampling various sources like document sets [18, 24, 30] or assessors [1, 2, 41], but mainly because of topics [6, 28, 36, 38, 43]. Given two systems evaluated on the same collection, the question that naturally arises is "how well does the observed difference reflect the real difference between the systems and not just noise due to sampling of topics"? Our field can only advance if the published retrieval methods truly
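The excerpt above contrasts the t-test with computer-intensive alternatives such as the permutation test. As a rough, self-contained illustration (the per-topic scores below are made up, not data from the paper), a paired randomisation test can be sketched as follows:

```python
# A minimal sketch of a paired (topic-by-topic) permutation test:
# under the null hypothesis the system labels are exchangeable, so the sign
# of each per-topic difference can be flipped at random.
import random

scores_a = [0.42, 0.55, 0.31, 0.60, 0.48, 0.52, 0.39, 0.45]
scores_b = [0.40, 0.50, 0.35, 0.58, 0.44, 0.49, 0.41, 0.43]

diffs = [a - b for a, b in zip(scores_a, scores_b)]
observed = abs(sum(diffs)) / len(diffs)

n_resamples = 10_000
count = 0
for _ in range(n_resamples):
    # Randomly swap the system labels on each topic (flip the sign of the difference).
    resampled = [d if random.random() < 0.5 else -d for d in diffs]
    if abs(sum(resampled)) / len(diffs) >= observed:
        count += 1

p_value = count / n_resamples   # two-sided p-value under the permutation null
print(f"observed mean difference {observed:.3f}, p ~ {p_value:.3f}")
```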
  • Tests of Hypotheses Using Statistics
    Tests of Hypotheses Using Statistics. Adam Massey and Steven J. Miller, Mathematics Department, Brown University, Providence, RI 02912. Abstract: We present the various methods of hypothesis testing that one typically encounters in a mathematical statistics course. The focus will be on conditions for using each test, the hypothesis tested by each test, and the appropriate (and inappropriate) ways of using each test. We conclude by summarizing the different tests (what conditions must be met to use them, what the test statistic is, and what the critical region is). Contents: 1 Types of Hypotheses and Test Statistics 2; 1.1 Introduction 2; 1.2 Types of Hypotheses 3; 1.3 Types of Statistics 3; 2 z-Tests and t-Tests 5; 2.1 Testing Means I: Large Sample Size or Known Variance 5; 2.2 Testing Means II: Small Sample Size and Unknown Variance 9; 3 Testing the Variance 12; 4 Testing Proportions 13; 4.1 Testing Proportions I: One Proportion 13; 4.2 Testing Proportions II: K Proportions 15; 4.3 Testing r x c Contingency Tables 17; 4.4 Incomplete r x c Contingency Tables 18; 5 Normal Regression Analysis 19; 6 Non-parametric Tests 21; 6.1 Tests of Signs 21; 6.2 Tests of Ranked Signs 22; 6.3 Tests Based on Runs 23; 7 Summary 26; 7.1 z-tests 26; 7.2 t-tests 27; 7.3 Tests comparing means 27; 7.4 Variance Test 28; 7.5 Proportions 28; 7.6 Contingency Tables. E-mail: [email protected]; E-mail: [email protected]
  • Statistical Significance
    Statistical significance. In statistical hypothesis testing,[1][2] statistical significance (or a statistically significant result) is attained whenever the observed p-value of a test statistic is less than the significance level defined for the study.[3][4][5][6][7][8][9] The p-value is the probability of obtaining results at least as extreme as those observed, given that the null hypothesis is true. The significance level, α, is the probability of rejecting the null hypothesis, given that it is true.[10] This statistical technique for testing the significance of results was developed in the early 20th century. In any experiment or observation that involves drawing a sample from a population, there is always the possibility that an observed effect would have occurred due to sampling error alone.[11][12] But if the p-value of an observed effect is less than the significance level, an investigator may conclude that that effect reflects the characteristics of the population. 1.1 Related concepts: The significance level α is the threshold for p below which the experimenter assumes the null hypothesis is false, and something else is going on. This means α is also the probability of mistakenly rejecting the null hypothesis, if the null hypothesis is true.[22] Sometimes researchers talk about the confidence level γ = (1 − α) instead. This is the probability of not rejecting the null hypothesis given that it is true.[23][24] Confidence levels and confidence intervals were introduced by Neyman in 1937.[25] 2 Role in statistical hypothesis testing
  • Understanding Statistical Hypothesis Testing: the Logic of Statistical Inference
    Review Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference Frank Emmert-Streib 1,2,* and Matthias Dehmer 3,4,5 1 Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33100 Tampere, Finland 2 Institute of Biosciences and Medical Technology, Tampere University, 33520 Tampere, Finland 3 Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr Campus, 4040 Steyr, Austria 4 Department of Mechatronics and Biomedical Computer Science, University for Health Sciences, Medical Informatics and Technology (UMIT), 6060 Hall, Tyrol, Austria 5 College of Computer and Control Engineering, Nankai University, Tianjin 300000, China * Correspondence: [email protected]; Tel.: +358-50-301-5353 Received: 27 July 2019; Accepted: 9 August 2019; Published: 12 August 2019 Abstract: Statistical hypothesis testing is among the most misunderstood quantitative analysis methods from data science. Despite its seeming simplicity, it has complex interdependencies between its procedural components. In this paper, we discuss the underlying logic behind statistical hypothesis testing, the formal meaning of its components and their connections. Our presentation is applicable to all statistical hypothesis tests as generic backbone and, hence, useful across all application domains in data science and artificial intelligence. Keywords: hypothesis testing; machine learning; statistics; data science; statistical inference 1. Introduction We are living in an era that is characterized by the availability of big data. In order to emphasize the importance of this, data have been called the ‘oil of the 21st Century’ [1]. However, for dealing with the challenges posed by such data, advanced analysis methods are needed.
  • What Are Confidence Intervals and P-Values?
    What is...? series, second edition: Statistics. Supported by sanofi-aventis. What are confidence intervals and p-values? Huw TO Davies PhD, Professor of Health Care Policy and Management, University of St Andrews; Iain K Crombie PhD FFPHM, Professor of Public Health, University of Dundee. • A confidence interval calculated for a measure of treatment effect shows the range within which the true treatment effect is likely to lie (subject to a number of assumptions). • A p-value is calculated to assess whether trial results are likely to have occurred simply through chance (assuming that there is no real difference between new treatment and old, and assuming, of course, that the study was well conducted). • Confidence intervals are preferable to p-values, as they tell us the range of possible effect sizes compatible with the data. • p-values simply provide a cut-off beyond which we assert that the findings are 'statistically significant' (by convention, this is p<0.05). • A confidence interval that embraces the value of no difference between treatments indicates that the treatment under investigation is not significantly different from the control. • Confidence intervals aid interpretation of clinical trial data by putting upper and lower bounds on the likely size of any true effect. • Bias must be assessed before confidence intervals can be interpreted. Even very large samples and very narrow confidence intervals can mislead if they come from biased studies. • Non-significance does not mean 'no effect'. Small studies will often report non-significance even when there are important, real effects which a large study would have detected.
  • Understanding Statistical Significance: a Short Guide
    UNDERSTANDING STATISTICAL SIGNIFICANCE: A SHORT GUIDE Farooq Sabri and Tracey Gyateng September 2015 Using a control or comparison group is a powerful way to measure the impact of an intervention, but doing this in a robust way requires statistical expertise. NPC’s Data Labs project, funded by the Oak Foundation, aims to help charities by opening up government data that will allow organisations to compare the longer-term outcomes of their service to non-users. This guide explains the terminology used in comparison group analysis and how to interpret the results. Introduction With budgets tightening across the charity sector, it is helpful to test whether services are actually helping beneficiaries by measuring their impact. Robust evaluation means assessing whether programmes have made a difference over and above what would have happened without them1. This is known as the ‘counterfactual’. Obviously we can only estimate this difference. The best way to do so is to find a control or comparison group2 of people who have similar characteristics to the service users, the only difference being that they did not receive the intervention in question. Assessment is made by comparing the outcomes for service users with the comparison group to see if there is any statistically significant difference. Creating comparison groups and testing for statistical significance can involve complex calculations, and interpreting the results can be difficult, especially when the result is not clear cut. That’s why NPC launched the Data Labs project to respond to this need for robust quantitative evaluations. This paper is designed as an introduction to the field, explaining the key terminology to non-specialists.
  • The Independent Samples t Test
    CHAPTER 7. Comparing Two Group Means: The Independent Samples t Test. After reading this chapter, you will be able to: Differentiate between the one-sample t test and the independent samples t test; Summarize the relationship among an independent variable, a dependent variable, and random assignment; Interpret the conceptual ingredients of the independent samples t test; Interpret an APA style presentation of an independent samples t test; Hand-calculate all ingredients for an independent samples t test; Conduct and interpret an independent samples t test using SPSS. In the previous chapter, we discussed the basic principles of statistically testing a null hypothesis. We highlighted these principles by introducing two parametric inferential statistical tools, the z test and the one-sample t test. Recall that we use the z test when we want to compare a sample to the population and we know the population parameters, specifically the population mean and standard deviation. We use the one-sample t test when we do not have access to the population standard deviation. In this chapter, we will add another inferential statistical tool to our toolbox. Specifically, we will learn how to compare the difference between means from two groups drawn from the same population to learn whether that mean difference might exist in the population. CONCEPTUAL UNDERSTANDING OF THE STATISTICAL TOOL. The Study. As a little kid, I was afraid of the dark.
  • Confidence Intervals and Statistical Significance - Statistical Literacy Guide
    Statistical literacy guide Confidence intervals and statistical significance Last updated: June 2009 Author: Ross Young & Paul Bolton This guide outlines the related concepts of confidence intervals and statistical significance and gives examples of the standard way that both are calculated or tested. In statistics it is important to measure how confident we can be in the results of a survey or experiment. One way of measuring the degree of confidence in statistical results is to review the confidence interval reported by researchers. Confidence intervals describe the range within which a result for the whole population would occur for a specified proportion of times a survey or test was repeated among a sample of the population. Confidence intervals are a standard way of expressing the statistical accuracy of a survey-based estimate. If an estimate has a high error level, the corresponding confidence interval will be wide, and the less confidence we can have that the survey results describe the situation among the whole population. It is common when quoting confidence intervals to refer to the 95% confidence interval around a survey estimate or test result, although other confidence intervals can be reported (e.g. 99%, 90%, even 80%). Where a 95% confidence interval is reported then we can be reasonably confident that the range includes the ‘true’ value for the population as a whole. Formally we would expect it to contain the ‘true’ value 95% of the time. The calculation of a confidence interval is based on the characteristics of a normal distribution. The chart below shows what a normal distribution of results is expected to look like.
  • Statistical Approaches to Uncertainty: P Values and Confidence Intervals Unpacked
    EBM notebook. Statistical approaches to uncertainty: p values and confidence intervals unpacked. Introduction. What is statistical uncertainty? Statistical uncertainty is the uncertainty (present even in a representative sample) associated with the use of sample data to make statements about the wider population. Why do we need measures of uncertainty? It usually is not feasible to include all individuals from a target population in a single study. For example, in a randomised controlled trial (RCT) of a new treatment for hypertension, it would not be possible to include all individuals with hypertension. Instead, a sample (a small subset of this population) is allocated to receive either the new or the standard treatment. What are the measures of uncertainty? Either hypothesis tests (with the calculation of p values) or confidence intervals (CIs) can quantify the amount of statistical uncertainty present in a study, though CIs are usually preferred. In a previous Statistics Note, we defined terms such as experimental event risk (EER), control event risk (CER), absolute risk reduction (ARR), and number needed to treat (NNT).1 Estimates such as ARR and NNT alone, however, do not describe the uncertainty around the results. P values and CIs provide additional information to help us determine whether the results are both clinically and statistically significant (table 1).

    Table 1. Outcomes in the RCT comparing streptomycin with bed rest alone in the treatment of tuberculosis
      Intervention            Survival   Death   Risk of death         ARR                    95% CI          P value
      Streptomycin (n = 55)   51         4       4/55 = 7.3% (EER)     25.9% - 7.3% = 19.6%   5.7% to 33.6%   0.006
      Bed rest (n = 52)       38         14      14/52 = 25.9% (CER)

    The ARR (the difference in risk) is estimated to be 19.6% with a 95% CI of 5.7% to 33.6%.
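As a quick check of the arithmetic in the table above, the sketch below recomputes the absolute risk reduction and an approximate 95% confidence interval using the normal approximation for a difference in proportions (an illustrative aside, not part of the source note; small discrepancies from the quoted 5.7% to 33.6% come from rounding and the exact method used):

```python
# ARR and approximate 95% CI for the streptomycin trial counts quoted above.
import math

deaths_strep, n_strep = 4, 55     # streptomycin: EER = 4/55 ~ 7.3%
deaths_bed, n_bed = 14, 52        # bed rest:     CER = 14/52 ~ 25.9%

eer = deaths_strep / n_strep
cer = deaths_bed / n_bed
arr = cer - eer                   # absolute risk reduction, ~19.6%

# Normal-approximation standard error for a difference in proportions.
se = math.sqrt(eer * (1 - eer) / n_strep + cer * (1 - cer) / n_bed)
low, high = arr - 1.96 * se, arr + 1.96 * se   # ~5.8% to 33.5%
print(f"ARR {arr:.1%}, 95% CI {low:.1%} to {high:.1%}")
```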
  • Introduction to Hypothesis Testing
    CHAPTER 8. Introduction to Hypothesis Testing. Chapter outline: 8.1 Inferential Statistics and Hypothesis Testing; 8.2 Four Steps to Hypothesis Testing; 8.3 Hypothesis Testing and Sampling Distributions; 8.4 Making a Decision: Types of Error; 8.5 Testing a Research Hypothesis: Examples Using the z Test; 8.6 Research in Focus: Directional Versus Nondirectional Tests; 8.7 Measuring the Size of an Effect: Cohen's d; 8.8 Effect Size, Power, and Sample Size; 8.9 Additional Factors That Increase Power; 8.10 SPSS in Focus: A Preview for Chapters 9 to 18; 8.11 APA in Focus: Reporting the Test Statistic and Effect Size. LEARNING OBJECTIVES. After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative hypothesis, level of significance, test statistic, p value, and statistical significance. 3 Define Type I error and Type II error, and identify the type of error that researchers control. 4 Calculate the one-independent sample z test and interpret the results. 5 Distinguish between a one-tailed and two-tailed test, and explain why a Type III error is possible only with one-tailed tests. 6 Explain what effect size measures and compute a Cohen's d for the one-independent sample z test. 7 Define power and identify six factors that influence power. 8 Summarize the results of a one-independent sample z test in American Psychological Association (APA) format. 8.1 INFERENTIAL STATISTICS AND HYPOTHESIS TESTING. We use inferential statistics because it allows us to measure behavior in samples to learn more about the behavior in populations that are often too large or inaccessible.
  • One-Way Analysis of Variance: Comparing Several Means
    One-Way Analysis of Variance: Comparing Several Means Diana Mindrila, Ph.D. Phoebe Balentyne, M.Ed. Based on Chapter 25 of The Basic Practice of Statistics (6th ed.) Concepts: . Comparing Several Means . The Analysis of Variance F Test . The Idea of Analysis of Variance . Conditions for ANOVA . F Distributions and Degrees of Freedom Objectives: Describe the problem of multiple comparisons. Describe the idea of analysis of variance. Check the conditions for ANOVA. Describe the F distributions. Conduct and interpret an ANOVA F test. References: Moore, D. S., Notz, W. I, & Flinger, M. A. (2013). The basic practice of statistics (6th ed.). New York, NY: W. H. Freeman and Company. Introduction . The two sample t procedures compare the means of two populations. However, many times more than two groups must be compared. It is possible to conduct t procedures and compare two groups at a time, then draw a conclusion about the differences between all of them. However, if this were done, the Type I error from every comparison would accumulate to a total called “familywise” error, which is much greater than for a single test. The overall p-value increases with each comparison. The solution to this problem is to use another method of comparison, called analysis of variance, most often abbreviated ANOVA. This method allows researchers to compare many groups simultaneously. ANOVA analyzes the variance or how spread apart the individuals are within each group as well as between the different groups. Although there are many types of analysis of variance, these notes will focus on the simplest type of ANOVA, which is called the one-way analysis of variance.
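The "familywise" error accumulation described in the excerpt above can be quantified under the simplifying assumption of independent comparisons: with k tests each at α = 0.05, the chance of at least one false positive is 1 − (1 − α)^k. A tiny sketch (not from the source notes):

```python
# Familywise Type I error rate for k independent comparisons at alpha = 0.05:
# probability of at least one false positive is 1 - (1 - alpha)^k.
alpha = 0.05
for k in (1, 3, 6, 10):
    familywise = 1 - (1 - alpha) ** k
    print(f"{k:2d} comparisons: familywise error rate ~ {familywise:.2f}")
# Prints roughly 0.05, 0.14, 0.26, 0.40 (assumes the comparisons are independent).
```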
  • INTRODUCTION to ONE-WAY ANALYSIS of VARIANCE Dale Berger, Claremont Graduate University
    INTRODUCTION TO ONE-WAY ANALYSIS OF VARIANCE Dale Berger, Claremont Graduate University http://wise.cgu.edu The purpose of this paper is to explain the logic and vocabulary of one-way analysis of variance (ANOVA). The null hypothesis (H0) tested by one-way ANOVA is that two or more population means are equal. A statistically significant test indicates that observed data sampled from each of the populations would be unlikely if the null hypothesis were true. The logic used in ANOVA to compare means of multiple groups is similar to that used with the t-test to compare means of two independent groups. When one-way ANOVA is applied to the special case of two groups, one-way ANOVA gives identical results as the t-test. Not surprisingly, the assumptions needed for the t-test are also needed for ANOVA. We need to assume: 1) random, independent sampling from the k populations; 2) normal population distributions for each of the k populations; 3) equal variances within the k populations. Assumption 1 is crucial for any inferential statistic. As with the t-test, Assumptions 2 and 3 can be relaxed when large samples are used, and Assumption 3 can be relaxed when the sample sizes are roughly the same for each group even for small samples. (If there are extreme outliers or errors in the data, we need to deal with that problem first.) As a first step in this introduction to ANOVA, we will review the t-test for two independent groups, to prepare for an extension to ANOVA. Review of the t-test to test the equality of means for two independent groups Let us start with a small example.
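The excerpt above notes that one-way ANOVA applied to the special case of two groups gives identical results to the independent samples t test. A short numerical check (an illustrative sketch with made-up data, assuming SciPy is available) confirms that F equals t squared and the p-values coincide:

```python
# Check that one-way ANOVA on two groups matches the pooled-variance t test.
from scipy import stats

group1 = [23.1, 25.4, 22.8, 26.0, 24.3, 23.7]
group2 = [27.2, 26.5, 28.1, 25.9, 27.8, 26.4]

res_t = stats.ttest_ind(group1, group2)   # independent samples t test (equal variances)
res_f = stats.f_oneway(group1, group2)    # one-way ANOVA F test

print(f"t = {res_t.statistic:.3f}, t^2 = {res_t.statistic ** 2:.3f}, F = {res_f.statistic:.3f}")
print(f"p (t test) = {res_t.pvalue:.4f}, p (ANOVA) = {res_f.pvalue:.4f}")  # identical p-values
```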