Standard 4: Determining Adequate Sample Sizes

DILEMMA AUTHORS: Ingeborg van der Tweel, PhD,a Lisa Askie, PhD, MPH, BN, NICC, RM, RN,b Ben Vandermeer, MSc,c Susan There are many challenges to be faced when conducting randomized Ellenberg, PhD,d Ricardo M. Fernandes, MD,e Haroon controlled trials (RCTs) in pediatric research. One important challenge Saloojee, MD, MBBCh, FCPaed,f Dirk Bassler, MD, MSC,g is the determination of an appropriate sample size. Recruiting more Douglas G. Altman, DSc,h Martin Offringa, MD, PhD,i and children than necessary risks unnecessary overexposure of children to Johanna H. van der Lee, MD, PhD,j for the StaR Child an inferior treatment, whereas underestimating the sample required Health Group a will lead to inconclusive or unreliable results. Both options pose im- , Julius Centre for Health Sciences and Primary Care, University Medical Centre, Utrecht, Netherlands; bNHMRC portant ethical dilemmas for the pediatric researcher. Clinical Trials Centre, University of Sydney, Sydney, Australia; Reviews have concluded that sample size calculations are frequently cDepartment of Pediatrics, Alberta Research Centre for Health Evidence, University of Alberta, Edmonton, Alberta, Canada; based on inaccurate assumptions regarding the key pieces of informa- dDepartment of Biostatistics and , University of tion needed.1,2 For example, a recent “failed” pediatric RCT highlighted Pennsylvania School of , Philadelphia, Pennsylvania; the commonly encountered problem of underestimating the SD of a eDepartment of Pediatrics, Santa Maria Hospital, Laboratory of 3 Clinical Pharmacology and Therapeutics, Faculty of Medicine, continuous primary outcome variable. This leads to underestimation of Instituto de Medicina Molecula, Lisboa, Portugal; fDivision of the necessary sample size, inadequate statistical power, and, conse- Community Pediatrics, Department of Pediatrics and Child quently, an unanswered study question. Similarly, an incorrect estimation Health, University of the Witwatersrand, Johannesburg, South g of the frequency with which a dichotomous outcome, or event, occurs in Africa; Centre for Pediatric Clinical Studies and Department and Department of Neonatology, University Children’s Hospital, the control group of a trial (known as the control or baseline event rate) Tuebingen, Germany; hCentre for in Medicine, University will also lead to incorrect sample size estimation. This standards article of Oxford, Oxford, United Kingdom; iChild Health Evaluative uses a series of scenarios to assist pediatric researchers in not only Sciences, Research Institute, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada; and jDepartment determining an adequate trial sample size but also how to proceed when of Pediatric Clinical Epidemiology, Emma Children’s Hospital, this sample size may be difficult to achieve. Recommendations for Academic Medical Centre, Amsterdam, Netherlands practice are summarized in Table 1. KEY WORDS Bayesian methods, nuisance estimates, pediatric, sample size, StaR Child Health GUIDANCE ABBREVIATIONS CINV—chemotherapy-induced nausea and vomiting Methods to calculate adequate sample sizes are described for supe- RCT—randomized controlled trial — riority, noninferiority, and cluster-randomized trials with various types RSV respiratory syncytial virus of primary outcome variables, including continuous, dichotomous, or Drs van der Tweel, Askie, and Vandermeer wrote the first draft of the article; Drs Ellenberg, Fernandes, Saloojee, Bassler, Altman, 4–10 time-to-event data. In this article, we focus primarily on superiority Offringai, and van der Lee contributed to the writing of the article; trial designs with continuous or dichotomous outcomes. Informa- Drs van der Tweel, Askie, Vandermeer, Ellenberg, Fernandes, tion on noninferiority or equivalence designs, as well as on cluster- Saloojee, Bassler, Altman, and van der Lee participated in regular fi 9,10 meetings and conference calls, identi ed the issues, and drafted randomized designs, can be found in the literature. RCTs generally the manuscript; Drs van der Tweel, Askie, Vandermeer, Ellenberg, have many outcomes of interest, but the number of patients required Fernandes, Saloojee, Bassler, Altman, Offringa, and van der Lee (and the main statistical analysis) should be based on the primary participated in identifying the evidence base for StaR Child Health outcome. standards; and Drs van der Tweel, Askie, Vandermeer, Ellenberg, Fernandes, Saloojee, Bassler, Altman, Offringa, and van der Lee A sample size calculation for a standard (2-sided, superiority) RCT is agree with the final version. based on 3 values. First, one must specify the target difference be- This is the fourth in a series of standard articles resulting from tween the outcomes for those children who received the new treatment an ongoing process in which a group of invited experts called a Standard Development Group from StaR Child Health compared with those who did not, based on what is judged clinically assembles and exchanges information about methods for relevant or meaningful. Next, one must specify the level of risk one is pediatric trial design, conduct, and reporting. More detailed willing to take that, by chance, the trial will erroneously conclude information about this topic can be found in the introductory a clinically relevant difference in the primary outcome. This is known as article of this supplement or at the StaR Child Health Web site (www.starchildhealth.org). the type I error rate or a (the probability of a false-positive conclusion). Finally, one must specify the probability that the trial will correctly (Continued on last page) detect a difference in the primary outcome between the control and

S138 VAN DER TWEEL et al Downloaded from www.aappublications.org/news by guest on September 24, 2021 SUPPLEMENT ARTICLE

TABLE 1 Recommendations for Practice with is considered clinically rel- • Sample size calculation is a matter of Good Clinical Practice in the design phase of an RCT evant. Assuming a 2-sided type I error of • When previous information is available from similar pediatric populations, sample size calculation can be .05 (5%) and power of 80%, by entering performed using standard methods or software • When the only previous information is from adult or other pediatric populations or when no previous these numbers into a standard sample information is available, alternative methods should be considered size software package, it can be deter- • Parameter values used for the sample size calculation (eg, type I error, power, SD, control event rate, mined that the minimum required sam- fi expected treatment difference) should be clearly mentioned in the nal trial report ple size for a new trial would be 86 • Good follow-up procedures are essential so children are not lost to follow up • Always consult a statistician or methodologist about the sample size calculation during the planning phase childrenineacharm. of a trial If information is available from several comparable (but often small) RCTs or the new intervention if the true differ- The following scenarios describe sit- subgroups within RCTs, these can be ence is the size of the target difference. uations that may be encountered when combined in a more systematic and This is known as the statistical power determining sample size for a pediatric quantitative way, known as meta- (the probability of a true positive con- and possible approaches. analysis, to obtain an estimate of a nui- a clusion). Conventional values for and The overall approach is summarized sance parameter.12–14 Anyone planning power are .05 (or 5%) and 80% to 90%, in Fig 1. to conduct a new trial should first un- respectively. Requiring a smaller prob- dertake a thorough review of the avail- ability of a type I error, larger power, or SCENARIO 1 able literature (known as a systematic smaller clinically relevant difference You already have the required infor- review).12 The information from the will increase the necessary sample size. mation from other similar pediatric identified studies can be entered into a A smaller value for may be appro- populations. How do you calculate the readily available meta-analysis soft- priate to reduce the chance of a false- required sample size for your new ware to obtain estimates of various positive conclusion when an effective trial? nuisance parameters.15 These parame- standard therapy is tested against ters are obtained by looking at the event a new, competing treatment. A higher Possible Approaches rate for the combined control groups power could be considered if, in com- Ifinformationisavailablefromprevious for dichotomous outcomes or the SD for paring treatments to prevent a common studies for a reliable estimate of the the combined control groups for con- disease, one would not miss a safe, in- nuisance parameters, then a standard tinuous outcomes. If the evidence is expensive, possibly effective treat- sample size calculation can be made limited, such as in Example 1 with the 2 ment.11 based on whatever clinically meaning- small trials, one might be more con- To make a sample size calculation, so- ful difference is considered important servative and take a higher value than called nuisance parameters also need to detect. This can be done by using just the average. It is further recom- to be estimated. In statistics, a nuisance methods previously described4–10 or mend to examine how sensitive the parameter is any parameter which is using readily available software. sample size is to variation in the nui- not of immediate interest but that must sance parameter.16 be accounted for in the analysis of Example 1 those parameters which are of interest. There have been 2 previous small trials For example, for continuous outcomes, of corticosteroids to treat respiratory Example 2 such as blood pressure or duration of syncytial virus (RSV) lower respiratory For decades, methylxanthines (eg, caf- mechanical ventilation, an estimate of tract infections in young children, but feine,theophylline,aminophylline)have the SD is needed. For dichotomous out- more evidence is needed. These 2 trials been commonly given to preterm comes, such as dead/alive, an accurate found similar reductions in the primary infants to reduce their rates of apnea. estimate of the control or baseline event outcome (ie, hours of mechanical ven- In 2001, a indicated rate is needed. To estimate the values tilation required). Thus, a possible ap- that this treatment had only been for these nuisance parameters, infor- proach would be to calculate the SD assessed in 5 small trials with a total of mation is needed from comparable estimate based on the average SD from 192 randomized infants.17 Using this in- previous studies. Ideally, such studies these 2 trials. In this case, the average formation, a group of pediatric trialists wouldhaveincludedsimilarpopulations SD from the 2 trials was 84 hours. estimated that the number of infants of children; however, this is often not the A reduction of 36 hours’ ventilation needed to be randomized to provide case. with corticosteroid treatment compared definitive evidence of both long- and

PEDIATRICS Volume 129, Supplement 3, June 2012 S139 Downloaded from www.aappublications.org/news by guest on September 24, 2021 FIGURE 1 Determining sample size. short-term benefits (and potential SCENARIO 2 if the population studied differs from harms) would be ∼2000. This finding the type of children you are interested The only previous information you have was based on an estimate of the base- in. The 2 examples that follow here is from either adult populations or pe- line event rate of death or neurodevel- show how information obtained from diatric populations that differ from your opment disability (the primary outcome) either other pediatric populations (Ex- group of interest. How do you calculate of 20% and having 80% power to detect ample 3) or adults (Example 4) can in- the required sample size for your new a 25% relative reduction in the risk of form a sample size calculation for a trial? this outcome. A multicenter interna- new trial. tional trial subsequently randomized 2006 infants to treatment and dem- Possible Approaches Example 3 onstrated conclusively that caffeine In this scenario, the most common Followingonfrom Example1,you wishto compared with placebo significantly available sources of nuisance esti- investigate, in a placebo-controlled trial, reduced the risk of death or disability in mates are studies of similar design whether dexamethasone is effective in preterm infants.18 evaluating similar treatments, even reducing the duration of mechanical

S140 VAN DER TWEEL et al Downloaded from www.aappublications.org/news by guest on September 24, 2021 SUPPLEMENT ARTICLE ventilation in children aged ,24 months Previous information may be sourced about the of estimates from with severe RSV lower respiratory tract from previous trial data and/or expert these existing studies, it may be better infections. The best available data are opinion and will be combined with the to proceed to scenario 3, rather than from a trial published 6 years ago that new trial data to yield effect estimates. use inaccurate estimates. used a different type of corticosteroid With this approach, a reduction of the and included children admitted to the number of trial participants to include SCENARIO 3 hospital with RSV bronchiolitis (both can often be expected. ventilated and nonventilated children). The previous information needed to Based on data of a subgroup of venti- calculate a pediatric trial sample size is latedchildrenfromthistrial,theSDwas Example 5 either not available from relevant pe- estimated as 98 hours. By using these Bayesian methods were used to obtain diatric or adult populations or any in- estimates, if you designed a new trial with a sample size in a trial comparing in- formation that is available is considered 80% power and a 2-sided a of .05, you travenous immunoglobulin and plas- too unreliable for use. How do you cal- would need to recruit, at minimum, 117 mapheresis in treating Guillain-Barré culate the required sample size for your children to each arm of the trial to detect syndrome in children.21 The authors new trial? adifferenceof36ventilationhours. showed that using standard methods to obtain a sample size for their non- Possible Approaches inferiority design would have required a An internal pilot is 1 possible solution. Example 4 sample size beyond what was reasonably First, the nuisance parameter (eg, SD) Ginger therapy has been effective in re- feasible. By using previous information is estimated for the primary outcome ducing chemotherapy-induced nausea basedonempiricaldatafromadultRCTs from a previous adult trial. With this and vomiting (CINV) in adults. However, combined with expert opinion, they were estimate, a preliminary sample size no data were available regarding the able to greatly reduce the required calculation can be made. A pilot phase antiemetic efficacy of ginger in pediatric sample size. for a trial can then be commenced in patients receiving chemotherapy. A new children, and after a prespecified num- trial is planned starting with the adult However, if the previous information is ber of patients are included, the nui- information, which showed a CINV rate fi incorrectly speci ed or trial data are sance parameter can be re-estimated ranging from 30% to 90%.19 If a relative not consistent with previous informa- on the basis of available information. reduction in the rate of CINV of 20% is tion, an over- or underpowered study Thisinformationcanbeusedtoreassess considered clinically relevant (this is may still result, as is the case in all sam- the final sample size requirements. The another assumption, which may vary ple size calculations. This problem can first part of the data are considered an because the participants are children be exacerbated when the previous in- internal pilot (ie, the participants en- rather than adults), and a type I error formation comes from studies conduc- rolled in this phase of the trial will be of .05 and power of 80% is assumed, ted in adults. Caution must be exercised part of the final sample size). However, thesamplesizecalculationwilldepend when translating previous information because there has been an interim greatly on the control event rate. If a fi from adults or speci c pediatric popu- “look” at the data, there is a risk of in- control event rate of 60% is assumed lations to (other types of) children, as flating the trial’s type I error. If the in- (approximately the middle of the 30%– there may be differences in disease terim look is used only to estimate the 90% found in adults), then a new fi de nitions/criteria and outcomes, as nuisance parameter from the control trial would require at least 267 patients in well as differences in biology, physiol- group data and the treatment difference each arm. However, if the maximal con- ogy, pharmacology, and measurement. is not estimated, then this inflation will trol event rate found in adults (90%) is Importantly, children may have larger be negligible.24–28 assumed, the new pediatric trial would variations in some outcomesthan adults, require only 71 patients per arm. especially when a wide age range is included in the study. The issues Example 6 Another way of addressing the issue of regarding the use of appropriate lung If, in Example 3, you considered the using information from adult or other function reference equations in trials previous trial information to be too pediatric populations is to take a Bayes- of asthma and other respiratory con- unreliable, you could undertake an in- ian approach.20–22 This approach incor- ditions are 1 such example.23 Better ternal pilot study of 30 children in each porates previous information in the solutions are needed to help combat arm. Suppose, from the blinded pilot assessment of data from the new trial. this problem. If one is not confident data, the common SD was 108 hours

PEDIATRICS Volume 129, Supplement 3, June 2012 S141 Downloaded from www.aappublications.org/news by guest on September 24, 2021 (not 98 hours as had been estimated in Possible Approaches this trial design can only be used if the previous trial). Based on these pilot The European Agency guide- an outcome can be measured several data, your re-estimated sample size is line on clinical trials in small popula- times per patient (eg, blood pres- at least 142 children in each arm of the tions29 states that “… no methods sure, quality of life). Furthermore, re- trial to detect a difference of 36 hours in exist that are relevant to small studies peated measures require relatively ventilation duration. Thus, you would that are not also applicable to large sophisticated statistical design and need to recruit another 112 children in studies. However, it may be that in analysis. each arm in addition to those already conditions with small and very small included in the pilot study. populations, less conventional and/ Meta-analysis of N-of-1 Trials or less commonly seen methodologi- Another possibility if no previous esti- cal approaches may be acceptable if Another potential approach for address- mate for the SD can be derived is to use they help to improve the interpretability ing small available sample sizes in rare theeffectsizetocalculatetheneces- of the study results.” Some suggested diseases is to meta-analyze “N-of-1 trials.” sary sample size. The is the approaches29,30 for addressing the N-of-1 trials include only 1 participant standardized treatment difference. issue of limited availability of chil- pertrialwhoactsashisorherown For a continuous outcome, this is the dren for inclusion in a trial are given control, using a crossover approach. ratio of the treatment difference to the here. Typically, the participant has repeated measures of the same outcome. Vari- SD. Cohen proposed values for small, First, one could switch to another (type ous modeling methods have been de- moderate, and large effect sizes for of) outcome, but only if this also rep- 7 veloped to combine (meta-analyze) the various types of outcomes. However, resents a clinically relevant outcome. results of such trials.33 This approach as effect sizes are abstract concepts, Another approach could be to extend fi could be useful in designing pediatric it is sometimes dif cult to translate the follow-up time so more events will pilot studies, when funding and/or par- them into clinically interpretable accrue. quantities. ticipants are difficult to obtain. However, this design will only be feasible when Example 7 Crossover Design dealing with short-term temporary outcomes (eg, pain) and no carryover Suppose no information was available A simple method for reducing the effect.34 to estimate the SD in the trial described sample size needed is to use a different in Example 3. We could design a trial design. A crossover study exposes all with 80% power and a 2-sided a of .05 participants to both the new interven- Sequential Designs to detect a standardized difference of tion and the control treatment. This method is only suitable in specific cir- Most RCTs estimate the sample size in .5, which, according to Cohen, would be the design phase of the trial and sub- a “moderate” effect.7 This may be a rea- cumstances; namely, when evaluating interventions with a temporary effect sequently include, randomize, and fol- sonable assumption (rather than as- low up their patients until the primary suming a very small or large effect size) in the treatment of stable, chronic con- ditions and when the outcomes are outcome values are obtained for all given no available previous information participants. Trials using sequential is available. With these assumptions, short-term and reversible (eg, head- ache relief). It is also important to allow designs provide for regular analyses of at least 63 patients would need to be interim data with the possibility of recruited in each arm. Note that the enough timeto passbetween treatment exposures to ensure that there is no terminating the trial before full accrual effect size calculated in Example 3 would has been achieved if the data reach have been 36 of 98 (ie, 0.37), which could carryover effect from 1 treatment to the other.31 a predetermined threshold for making be considered small to moderate. adefinitive judgment about the results. It is widely recognized that an “adjust- SCENARIO 4 Repeated Measures ment” of the type I error for each of the The number of children required to ob- The sample size needed can also be interim analyses is required to avoid tain reliable trial results is large, but the reduced by undertaking repeated increasing the probability of a false- number of children potentially avail- measures of the primary outcome in positive conclusion if the trial were to able for inclusion in the trial is limited each trial patient. Multiple observations be stopped early on the basis of an (eg, a rare disease). What are the avail- per patient will reduce the number of interim analysis. Various ways of ad- able options? patients required.32 Again, however, justment have been described.33–36

S142 VAN DER TWEEL et al Downloaded from www.aappublications.org/news by guest on September 24, 2021 SUPPLEMENT ARTICLE

Although sequential designs may on combined sample size from the 4 the required sample size in some occasion lead to a smaller sample size trials of 1800 children that the outcome cases, possibly presenting complica- when the interim data demonstrate of most public health significance (ie, tions to researchers and funders. definitively that 1 treatment is better a reduction in obesity Further research should investigate than another, if there are serious safety from 20% to 15%) will be able to be when these methods may be most concerns or further study is unlikely reliably detected. These trialists have appropriate.Theimplicationsofapro- to demonstrate any difference, such thus formed the Early Prevention of spective meta-analysis have yet to designs cannot be relied on to reduce Obesity in Children Collaboration38 be fully appreciated. Specifically, it the sample size. to undertake a prospective meta- is not clear how results may differ analysis to achieve the required sam- from a multicenter trial on the same plesizeandanswertheprimary topic. Although the prospective na- Prospective Meta-analysis research question. ture of the collaboration overcomes Traditionally, when a large sample size many potential limitations of multi- is required, a multicenter trial is often Adaptive Designs ple separate studies, exploration of fi fi needed to ensure suf cient numbers of Adaptive designs are flexible designs the bene ts and limitations of this children are recruited in a reasonable that permit modifications during the type of research endeavor will be period of time (see Example 2). How- course of the trial. The flexibility covers important. ever, conducting a large, multicenter a wide range of possible adaptations, trial is not always possible due to including changes in sample size. For ex- CONCLUSIONS funding, feasibility, and timeline ample, the internal pilot design discussed constraints. A possible solution is to earlier can be viewed as an adaptive This standards article provides an in- prospectively plan a series of in- design, as can sequential designs. The troduction to a variety of methods to dividual trials that all follow identical price paid for this flexibility is that the overcome common challenges faced (or very similar) protocols but exist implementation of such designs is often when attempting to derive sample size independently of 1 another. The tria- complex and, in some cases, can reveal estimates in pediatric research. The lists form a collaborative group and interim trends that optimally should be desire to reduce exposure of trial prospectively agree to collect core, kept confidential during the study.39,40 participants to any potential harm key data items in a common format Adaptive designs can also lead to larger must be carefully balanced with the and also agree to share these data sample sizes than originally estimated. need to obtain an adequate sample on trial completion for inclusion in These factors may complicate assess- size to ensure the validity of trial results. a combined meta-analysis. This is ments of the overall trial cost and Fortunately, there are well-described 37 known as a prospective meta-analysis. should thus be taken into consider- methods to determine sample size that By agreeing to harmonize the data ation when deciding whether to use capitalize on previous data, whether fi items, de nitions, coding, and stand- these methods during the trial’splan- froma similarordifferent population,or ards before the data are collected, ning phase. on a systematic review. To supplement the prospective meta-analysis tria- data derived from different populations, lists guarantee that the amount of internal pilot studies or a Bayesian RESEARCH AGENDA data available for later synthesis is approach can be used to integrate new maximized. This Standard Development Group has data and/or expert opinion into the identified a number of areas for future sample size calculation. Crossover research related to determining sam- studies, repeated measures, and the Example 8 ple sizes in pediatric research. An eval- use of effect sizes can overcome a lack Four trials are currently underway uation of the use of information from ofprevious dataoraverysmall potential to assess whether interventions com- adult studies for pediatric trial designs subject pool. The use of sequential menced very early in infancy reduce would clarify situations or conditions in designs may allow early termination of childhood obesity. Each of the 4 trials which this type of extrapolation could trials when results are definitive before (with between 400 and 800 children lead to unacceptable errors in sample full accrual is achieved but cannot be enrolled in each) has sufficient statis- size calculation. Although the use of relied on as a way to conduct a smaller tical power to detect meaningful Bayesian methods and adaptive de- trial. Such designs should weigh limit- differences in BMI z scores at 2 years signs may reduce the overall sample ing the exposure of trial participants of age. However, it is only with the size, such designs can also increase to potentially harmful treatments

PEDIATRICS Volume 129, Supplement 3, June 2012 S143 Downloaded from www.aappublications.org/news by guest on September 24, 2021 with ensuring that the final results are absence of resources or possibilities are no exception. Consultation with sufficient for reliable benefit-to-risk for a multicenter trial. a methodologist or statistician in the assessments. Finally, the use of a pro- Ultimately, appropriate time and at- design phase can minimize the number spective meta-analysis allows research- tention must be dedicated to the de- of childrenexposed tothe potentialrisks erstocollectivelycontributetheirdatato termination of sample size in the of studyparticipationwhileensuringthe achieve the necessary power in the planning of any trial, and pediatric trials validity of the research findings.

REFERENCES

1. Charles P, Giraudeau B, Dechartres A, meta-analysis. Stat Med. 2007;26(12):2479– 24. Wittes J, Brittain E. The role of internal pilot Baron G, Ravaud P. Reporting of sample 2500 studies in increasing the efficiency of clin- size calculation in randomised controlled 14. Whitehead A. Meta-analysis of Controlled ical trials. Stat Med. 1990;9(1–2):65–71; dis- trials: review. BMJ. 2009;338:b1732 Clinical Trials. Chichester, West Sussex, cussion 71–72 2. Vickers AJ. Underpowering in randomized United Kingdom: Wiley; 2002 25. Freidlin B, Korn EL. Release of data from an trials reporting a sample size calculation. 15. Cochrane Collaboration. Review Manager ongoing randomized clinical trial for sam- J Clin Epidemiol. 2003;56(8):717–720 (RevMan). Available at: www.ims.cochrane. ple size adjustment or planning. Stat Med. 3. van der Lee JH, Tanck MW, Wesseling J, org/revman/download. Accessed April 28, 2007;26(22):4074–4082 Offringa M. Pitfalls in the design and 2012 26. Kieser M, Friede T. Re-calculating the sam- analysis of paediatric clinical trials: a case 16. Julious SA, Owen RJ. Sample size calcu- ple size in internal pilot study designs with ‘ ’ of a failed multi-centre study, and po- lations for clinical studies allowing for un- control of the type I error rate. Stat Med. tential solutions. Acta Paediatr. 2009;98 – certainty about the . Pharm Stat. 2000;19(7):901 911 (2):385–391 2006;5(1):29–37 27. Proschan MA, Liu Q, Hunsberger S. Practi- 4. Lachin JM. Introduction to sample size de- fi 17. Henderson-Smart DJ, Steer P. Methylxanthine cal midcourse sample size modi cation in termination and power analysis for clinical clinical trials. Control Clin Trials. 2003;24 treatment for apnea in preterm infants. trials. Control Clin Trials. 1981;2(2):93–113 (1):4–15 Cochrane Database Syst Rev. 2001;(3): 5. Freedman LS. Tables of the number of CD000140 28. Gould AL, Shih WJ. Sample size re- patients required in clinical trials using the estimation without unblinding for normally 18. Schmidt B, Roberts RS, Davis P, et al; Caffeine . Stat Med. 1982;1(2):121–129 distributed outcomes with unknown vari- for Apnea of Prematurity Trial Group. Long- 6. Donner A. Approaches to sample size esti- ance. Comm Statist Theory Methods.1992; term effects of caffeine therapy for apnea mation in the design of clinical trials— 21(10):2833–2853 of prematurity. N Engl J Med. 2007;357(19): a review. Stat Med. 1984;3(3):199–214 29. Committee for Medicinal Products for Human 1893–1902 7. Cohen J. A power primer. Psychol Bull.1992; Use. Guideline on Clinical Trials in Small Pop- 19. Pillai AK, Sharma KK, Gupta YK, Bakhshi S. 112(1):155–159 ulations. London, United Kingdom: European Anti-emetic effect of ginger powder versus 8. Julious SA. Sample sizes for clinical trials Medicines Agency; 2006 placebo as an add-on therapy in children with normal data. Stat Med. 2004;23(12): 30. Evans, Jr, CH and Ildstad ST, eds. Committee and young adults receiving high emeto- 1921–1986 on Strategies for Small-Number-Participant. genic chemotherapy. Pediatr Blood Cancer. 9. Dann RS, Koch GG. Methods for one-sided Small Clinical Trials: Issues and Challenges. 2011;56(2):234–238 testing of the difference between pro- Washington, DC: National Academy Press; portions and sample size considerations 20. Whitehead J, Valdés-Márquez E, Johnson 2001 related to non-inferiority clinical trials. P, Graham G. Bayesian sample size for 31. Senn S. Cross-over Trials in Clinical Re- Pharm Stat. 2008;7(2):130–141 exploratory clinical trials incorporating search. 2nd ed. Chichester, West Sussex, 10. Friedman LM, Furberg CD, DeMets DL. historical data. Stat Med. 2008;27(13): United Kingdom: Wiley; 2002 – Fundamentals of Clinical Trials. 4th ed. New 2307 2327 32. Winkens B, Schouten HJ, van Breukelen GJ, York, NY: Springer; 2010 21. Goodman SN, Sladky JT. A Bayesian ap- Berger MP. Optimal number of repeated 11. Piantadosi S. Clinical Trials. A Methodo- proach to randomized controlled trials in measures and group sizes in clinical trials logical Perspective. New York, NY: John children utilizing information from adults: with linearly divergent treatment effects. Wiley & Sons; 1997 the case of Guillain-Barré syndrome. Clin Contemp Clin Trials. 2006;27(1):57–69 – – 12. Clarke M, Hopewell S, Chalmers I. Clin- Trials. 2005;2(4):305 310, discussion 364 378 33. Haybittle JL. Repeated assessment of results ical trials should begin and end with 22. Schoenfeld DA, Hui Zheng , Finkelstein DM. in clinical trials of cancer treatment. Br J systematic reviews of relevant evidence: 12 Bayesian design using adult data to aug- Radiol. 1971;44(526):793–797 years and waiting. Lancet. 2010;376(9734): ment pediatric trials. Clin Trials. 2009;6(4): 34. Peto R, Pike MC, Armitage P, et al. Design 20–21 297–304 and analysis of randomized clinical trials 13. Sutton AJ, Cooper NJ, Jones DR, Lambert PC, 23. Stanojevic S, Wade A, Stocks J. Reference requiring prolonged observation of each Thompson JR, Abrams KR. Evidence-based values for lung function: past, present and patient. I. Introduction and design. Br J sample size calculations based upon updated future. Eur Respir J. 2010;36(1):12–19 Cancer. 1976;34(6):585–612

S144 VAN DER TWEEL et al Downloaded from www.aappublications.org/news by guest on September 24, 2021 SUPPLEMENT ARTICLE

35. O’Brien PC, Fleming TR. A multiple testing eds. Cochrane Handbook for Systematic prospective meta-analysis. BMC Public procedure for clinical trials. Biometrics. Reviews of Interventions Version 5.1.0 Health. 2010;10:728 1979;35(3):549–556 (updated March 2011). The Cochrane Col- 39. Bauer P, Köhne K. Evaluation of 36. DeMets DL, Lan KK. Interim analysis: the laboration; 2011. Available at: www.cochrane- with adaptive interim analyses. Biometrics. alpha spending function approach. Stat handbook.org. Accessed April 28, 2012 1994;50(4):1029–1041 Med. 1994;13(13–14):1341–1352; discussion 38. Askie LM, Baur LA, Campbell K, et al; EP- 40. Bauer P, Brannath W. The advantages and 1353–1356 OCH Collaboration Group. The Early Pre- disadvantages of adaptive designs for clini- 37. Ghersi D, Berlin J, Askie L. Chapter 19: Pro- vention of Obesity in CHildren (EPOCH) cal trials. Drug Discov Today. 2004;9(8): spective meta-analysis. In: Higgins J, Green S, Collaboration—an individual patient data 351–357

(Continued from first page) www.pediatrics.org/cgi/doi/10.1542/peds.2012-0055G doi:10.1542/peds.2012-0055G Accepted for publication Mar 23, 2012 Address correspondence to Martin Offringa, MD, PhD, Senior Scientist and Program Head, Child Health Evaluative Sciences, Research Institute, The Hospital for Sick Children, 555 University Ave, Toronto, Ontario, Canada M5G 1X8. E-mail: [email protected] PEDIATRICS (ISSN Numbers: Print, 0031-4005; Online, 1098-4275). Copyright © 2012 by the American Academy of Pediatrics FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.

PEDIATRICS Volume 129, Supplement 3, June 2012 S145 Downloaded from www.aappublications.org/news by guest on September 24, 2021 Standard 4: Determining Adequate Sample Sizes Ingeborg van der Tweel, Lisa Askie, Ben Vandermeer, Susan Ellenberg, Ricardo M. Fernandes, Haroon Saloojee, Dirk Bassler, Douglas G. Altman, Martin Offringa and Johanna H. van der Lee Pediatrics 2012;129;S138 DOI: 10.1542/peds.2012-0055G

Updated Information & including high resolution figures, can be found at: Services http://pediatrics.aappublications.org/content/129/Supplement_3/S138 References This article cites 29 articles, 2 of which you can access for free at: http://pediatrics.aappublications.org/content/129/Supplement_3/S138 #BIBL Subspecialty Collections This article, along with others on similar topics, appears in the following collection(s): Medical Education http://www.aappublications.org/cgi/collection/medical_education_su b Permissions & Licensing Information about reproducing this article in parts (figures, tables) or in its entirety can be found online at: http://www.aappublications.org/site/misc/Permissions.xhtml Reprints Information about ordering reprints can be found online: http://www.aappublications.org/site/misc/reprints.xhtml

Downloaded from www.aappublications.org/news by guest on September 24, 2021 Standard 4: Determining Adequate Sample Sizes Ingeborg van der Tweel, Lisa Askie, Ben Vandermeer, Susan Ellenberg, Ricardo M. Fernandes, Haroon Saloojee, Dirk Bassler, Douglas G. Altman, Martin Offringa and Johanna H. van der Lee Pediatrics 2012;129;S138 DOI: 10.1542/peds.2012-0055G

The online version of this article, along with updated information and services, is located on the World Wide Web at: http://pediatrics.aappublications.org/content/129/Supplement_3/S138

Pediatrics is the official journal of the American Academy of Pediatrics. A monthly publication, it has been published continuously since 1948. Pediatrics is owned, published, and trademarked by the American Academy of Pediatrics, 345 Park Avenue, Itasca, Illinois, 60143. Copyright © 2012 by the American Academy of Pediatrics. All rights reserved. Print ISSN: 1073-0397.

Downloaded from www.aappublications.org/news by guest on September 24, 2021