
American Journal of Biomedical Science & Research
www.biomedgrid.com
ISSN: 2642-1747

Review Article
Copyright@ Shein-Chung Chow

Statistical Test for Composite Hypothesis in Clinical Research

Fuyu Song1, Xinyi Ma2, and Shein-Chung Chow3*

1Center for Food and Drug Inspection, National Medical Products Administration, China

2Amherst College, Amherst, Massachusetts, USA

3Professor of Biostatistics and Bioinformatics, Duke University School of Medicine, USA

*Corresponding author: Shein-Chung Chow, Professor of Biostatistics and Bioinformatics, Duke University School of Medicine, 2424 Erwin Road, Room 11037, Durham, NC 27705, USA.

To Cite This Article: Shein-Chung Chow, Statistical Test for Composite Hypothesis in Clinical Research. 2020 - 10(2). AJBSR.MS.ID.001485. DOI: 10.34297/AJBSR.2020.10.001485.

Received: August 24, 2020; Published: September 03, 2020

Abstract

In clinical evaluation of the safety and efficacy of a test treatment under investigation, a typical approach is to test the null hypothesis of no treatment difference in efficacy in randomized clinical trials (RCTs). The investigator would reject the null hypothesis of no treatment difference and then conclude the alternative hypothesis that the treatment is efficacious. In practice, however, this typical approach based on a test for efficacy alone may not be appropriate for a full assessment of both efficacy and safety of the test treatment under study. Alternatively, Chow and Shao [1] suggested testing a composite hypothesis by taking both safety and efficacy into consideration. In this article, an appropriate statistical test for a composite hypothesis of non-inferiority in efficacy and superiority in safety is derived. The impact on power calculation for sample size requirement when switching from a single hypothesis (e.g., for efficacy) to a composite hypothesis (e.g., for both safety and efficacy) is also examined.

Keywords: Randomized clinical trial (RCT); Composite Hypothesis; Non-inferiority; Superiority; Power Calculation

This work is licensed under Creative Commons Attribution 4.0 License. AJBSR.MS.ID.001485.

Introduction

In clinical evaluation of a test treatment under investigation, a traditional approach is to first test a null hypothesis that there is no treatment difference in efficacy alone in randomized clinical trials (RCTs) [2]. The investigator would reject the null hypothesis of no treatment difference and then conclude the alternative hypothesis that there is a difference in favor of the test treatment under investigation. As a result, if there is sufficient power for correctly detecting a clinically meaningful difference when such a difference truly exists, the test treatment is then claimed to be efficacious. The test treatment will be reviewed and approved by the regulatory agency if it is well tolerated and there appear to be no safety concerns. In practice, the intended clinical trial is often powered for achieving the study objective with a desired power (say 80%) at a pre-specified level of significance (say 5%).

This traditional approach, however, may not be appropriate because one single primary efficacy endpoint cannot fully assess the performance of the treatment with respect to both efficacy and safety under study. Statistically, the traditional approach based on a single primary efficacy endpoint for clinical evaluation of both safety and efficacy is a conditional approach (i.e., conditional on safety performance). It should be noted that under the traditional (conditional) approach, the observed safety profile may not be of any statistical meaning (i.e., the observed safety profile could be by chance alone and may not be reproducible). In addition, the traditional approach for clinical evaluation of both efficacy and safety may have inflated the false positive rate of the test treatment in treating the disease under investigation.

In the past several decades, the traditional approach has been found to be inefficient because many drug products have been withdrawn from the marketplace because of unreasonable risks to patients. For illustration purposes, Table 1 provides a list of significant drug withdrawals between 2000 and 2010 [3]. As can be seen from Table 1, most drugs withdrawn from the marketplace were withdrawn because of safety concerns (i.e., unreasonable risks to the patients). These unreasonable risks to the patients include unexpected adverse effects that were not detected during late phase clinical trials and were only apparent from post-marketing surveillance data from the wider patient population (Table 1).

To take both safety and efficacy into consideration in clinical trials, Chow and Shao [1] suggested testing a composite hypothesis by testing non-inferiority/superiority or equivalence of safety and efficacy as compared to a control (e.g., placebo control or active control). As an example, a commonly considered composite hypothesis is $H_0$: not $NS$ versus $H_a$: $NS$, where $N$ represents testing for non-inferiority in efficacy and $S$ is for testing superiority in safety. Under the composite hypothesis, it is of interest to examine the impact on power calculation for sample size requirement when switching from testing a single hypothesis (i.e., for efficacy alone) to testing a composite hypothesis (i.e., for both safety and efficacy).

In the next section, several composite hypotheses which take both safety and efficacy into consideration are proposed. In Section 3, for illustration purposes, statistical methods for testing the composite hypothesis $H_0$: not $NS$ versus $H_a$: $NS$ are derived. Section 4 studies the impact on power calculation for sample size requirement when switching from testing a single hypothesis (for efficacy alone) to testing a composite hypothesis (for both safety and efficacy). In Section 5, some concluding remarks are provided.

Hypotheses for Clinical Evaluation

In clinical trials, for clinical evaluation of efficacy, commonly considered approaches include tests for hypotheses of superiority (S), non-inferiority (N), or (therapeutic) equivalence (E). For safety assessment, the investigator usually examines the safety profile in terms of adverse events and other safety parameters such as laboratory testing to determine whether the test treatment is either better (superiority), non-inferior (non-inferiority) or similar (equivalence) as compared to the control. As an alternative to the traditional approach, Chow and Shao [1] suggested testing a composite hypothesis that takes into consideration both safety and efficacy. For illustration purposes, Table 2 provides a summary of all possible scenarios of composite hypotheses for clinical evaluation of safety and efficacy of a test treatment under investigation.

Statistically, we would reject the null hypothesis at a pre-specified level of significance and conclude the alternative hypothesis with a desired power. For example, the investigator may be interested in testing non-inferiority in efficacy and superiority in safety of a test treatment as compared to a control. In this case, we can consider testing the null hypothesis $H_0$: not $NS$, where $N$ denotes non-inferiority in efficacy and $S$ represents superiority of safety. We would reject the null hypothesis and conclude the alternative hypothesis $H_a$: $NS$, i.e., the test treatment is non-inferior to the active control agent and its safety profile is superior to the active control agent. To test the null hypothesis $H_0$: not $NS$, appropriate statistical tests should be derived under the null hypothesis. The derived test can then be evaluated for achieving the study objectives with a desired power under the alternative hypothesis. The selected sample size will ensure that the intended trial will achieve the study objectives of (i) establishing non-inferiority of the test treatment in efficacy and (ii) showing superiority of the safety profile of the test treatment at a pre-specified level of significance.

Note that for testing $H_0$: not $NS$ versus $H_a$: $NS$, the alternative hypothesis is that the test treatment is non-inferior (N) in efficacy and superior (S) in safety. Thus, the null hypothesis is not $NS$, i.e., the test treatment is inferior in efficacy or the test treatment is not superior in safety. The null hypothesis therefore consists of three subsets: (i) the test treatment is inferior in efficacy and superior in safety; (ii) the test treatment is non-inferior in efficacy and not superior in safety; (iii) the test treatment is inferior in efficacy and not superior in safety. It would be complicated to consider all three subsets when deriving an appropriate statistical test under the null hypothesis.

It should also be noted that in the interest of controlling the overall type I error rate at the $\alpha$ level, appropriate $\alpha$ levels (say $\alpha_1$ for efficacy and $\alpha_2$ for safety) should be chosen. When switching from single hypothesis testing to composite hypothesis testing, an increase in sample size is expected.

Testing Composite Hypothesis of Safety and Efficacy

For illustration purposes, consider the following composite hypothesis:

$$H_0: \text{not } NS \quad \text{versus} \quad H_a: NS, \tag{1}$$

where $N$ represents testing for non-inferiority in efficacy and $S$ is for testing superiority in safety.

Derivation of Statistical Test Under the Composite Null Hypothesis

To test the null hypothesis $H_0$: not $NS$, appropriate statistical tests should be derived under the null hypothesis. Let $X$ and $Y$ be the efficacy and safety endpoint, respectively. Assume that $(X, Y)$ follows a bi-variate normal distribution with mean $(\mu_X, \mu_Y)$ and variance-covariance matrix $\Sigma$, i.e., $(X, Y) \sim N((\mu_X, \mu_Y), \Sigma)$, where

$$\Sigma = \begin{pmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix}.$$


Suppose that the investigator is interested in testing non-inferiority in efficacy and superiority in safety of a test treatment as compared to a control. The following composite hypotheses may be considered:

$$H_0: \mu_{X1}-\mu_{X2} \le -\delta_X \ \text{ or } \ \mu_{Y1}-\mu_{Y2} \le \delta_Y \quad \text{vs.} \quad H_a: \mu_{X1}-\mu_{X2} > -\delta_X \ \text{ and } \ \mu_{Y1}-\mu_{Y2} > \delta_Y, \tag{2}$$

where $(\mu_{X1}, \mu_{Y1})$ and $(\mu_{X2}, \mu_{Y2})$ are the means of $(X, Y)$ for the test treatment and the control, respectively, and $\delta_X$ and $\delta_Y$ are the corresponding non-inferiority margin and superiority margin. Note that $\delta_X$ and $\delta_Y$ are positive constants. If the null hypothesis is rejected based on a statistical test, we conclude that the test treatment is non-inferior to the control in efficacy endpoint $X$, and is superior over the control in safety endpoint $Y$.

To test the above composite hypotheses, suppose that a random sample of $(X, Y)$ is collected from each treatment arm. In particular, $(X_{11}, Y_{11}), \ldots, (X_{1n_1}, Y_{1n_1})$ are i.i.d. $N((\mu_{X1}, \mu_{Y1}), \Sigma)$, which is the random sample from the test treatment, and $(X_{21}, Y_{21}), \ldots, (X_{2n_2}, Y_{2n_2})$ are i.i.d. $N((\mu_{X2}, \mu_{Y2}), \Sigma)$, which is the random sample from the control treatment, where i.i.d. stands for independent and identically distributed. Let $\bar X_1$ and $\bar X_2$ be the sample means of $X$ in the test treatment and the control, respectively. Similarly, $\bar Y_1$ and $\bar Y_2$ are the sample means of $Y$ in the test treatment and the control, respectively. It can be verified that the sample mean vector $(\bar X_i, \bar Y_i)$ follows a bi-variate normal distribution. In particular, $(\bar X_i, \bar Y_i)$ follows $N((\mu_{Xi}, \mu_{Yi}), n_i^{-1}\Sigma)$. Since $(\bar X_1, \bar Y_1)$ and $(\bar X_2, \bar Y_2)$ are independent bi-variate normal vectors, it follows that $(\bar X_1-\bar X_2, \bar Y_1-\bar Y_2)$ is also normally distributed as $N((\mu_{X1}-\mu_{X2}, \mu_{Y1}-\mu_{Y2}), (n_1^{-1}+n_2^{-1})\Sigma)$. For simplicity, we assume $\Sigma$ is known, i.e., the values of the parameters $\sigma_X^2$, $\sigma_Y^2$ and $\rho$ are known. To test the composite hypothesis for both efficacy and safety, we may consider the following test statistics

$$T_X = \frac{\bar X_1-\bar X_2+\delta_X}{\sqrt{n_1^{-1}+n_2^{-1}}\,\sigma_X} \quad \text{and} \quad T_Y = \frac{\bar Y_1-\bar Y_2-\delta_Y}{\sqrt{n_1^{-1}+n_2^{-1}}\,\sigma_Y},$$

respectively. Then, we have

$$P(T_X > C_1, T_Y > C_2) = P\left(U_X > C_1-\frac{\mu_{X1}-\mu_{X2}+\delta_X}{\sqrt{n_1^{-1}+n_2^{-1}}\,\sigma_X},\ U_Y > C_2-\frac{\mu_{Y1}-\mu_{Y2}-\delta_Y}{\sqrt{n_1^{-1}+n_2^{-1}}\,\sigma_Y}\right), \tag{3}$$

where $(U_X, U_Y)$ is the standard bi-variate normal random vector, i.e., a bi-variate normal random vector with zero means, unit variances and a correlation coefficient of $\rho$. Under the null hypothesis $H_0$ that $\mu_{X1}-\mu_{X2} \le -\delta_X$ or $\mu_{Y1}-\mu_{Y2} \le \delta_Y$, it can be shown that the upper limit of $P(T_X > C_1, T_Y > C_2)$ is the maximum of the two probabilities, i.e., $\max\{1-\Phi(C_1), 1-\Phi(C_2)\}$, where $\Phi$ is the cumulative distribution function of the standard normal distribution. A brief proof is given below.

For given constants $a_1$ and $a_2$ and a standard bi-variate normal vector

$$(U_X, U_Y) \sim N\left((0, 0), \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right),$$

we have

$$\begin{aligned}
P(U_X > a_1, U_Y > a_2) &= \frac{1}{2\pi\sqrt{1-\rho^2}}\int_{a_1}^{+\infty}\int_{a_2}^{+\infty}\exp\left(-\frac{x^2+y^2-2\rho xy}{2(1-\rho^2)}\right)dy\,dx \\
&= \frac{1}{\sqrt{2\pi}}\int_{a_1}^{+\infty}\exp\left(-\frac{x^2}{2}\right)\int_{a_2}^{+\infty}\frac{1}{\sqrt{2\pi(1-\rho^2)}}\exp\left(-\frac{(y-\rho x)^2}{2(1-\rho^2)}\right)dy\,dx \\
&= 1-\Phi(a_1)-\frac{1}{\sqrt{2\pi}}\int_{a_1}^{+\infty}\Phi\left(\frac{a_2-\rho x}{\sqrt{1-\rho^2}}\right)\exp\left(-\frac{x^2}{2}\right)dx.
\end{aligned} \tag{4}$$

Since the joint distribution of $(U_X, U_Y)$ is symmetric, (4) is also equal to

$$1-\Phi(a_2)-\frac{1}{\sqrt{2\pi}}\int_{a_2}^{+\infty}\Phi\left(\frac{a_1-\rho y}{\sqrt{1-\rho^2}}\right)\exp\left(-\frac{y^2}{2}\right)dy. \tag{5}$$

Based on (3), $P(T_X > C_1, T_Y > C_2)$ can be expressed by (4) and (5) with $a_1$ and $a_2$ replaced by

$$D_1 = C_1-\frac{\mu_{X1}-\mu_{X2}+\delta_X}{\sqrt{n_1^{-1}+n_2^{-1}}\,\sigma_X} \quad \text{and} \quad D_2 = C_2-\frac{\mu_{Y1}-\mu_{Y2}-\delta_Y}{\sqrt{n_1^{-1}+n_2^{-1}}\,\sigma_Y},$$

respectively. Under the null hypothesis $H_0$ that $\mu_{X1}-\mu_{X2} \le -\delta_X$ or $\mu_{Y1}-\mu_{Y2} \le \delta_Y$, it is true that either $D_1 \ge C_1$ or $D_2 \ge C_2$. Since the integrals in (4) and (5) are positive, $P(T_X > C_1, T_Y > C_2 \mid H_0) < \max\{1-\Phi(C_1), 1-\Phi(C_2)\}$.

To complete the proof, we need to show that for any $\varepsilon > 0$ and given values of the other parameters, there exist values of $\mu_{X1}-\mu_{X2}$ and $\mu_{Y1}-\mu_{Y2}$ for which $P(T_X > C_1, T_Y > C_2 \mid H_0)$ is larger than $1-\Phi(C_1)-\varepsilon$, and values for which it is larger than $1-\Phi(C_2)-\varepsilon$. Let $\mu_{X1}-\mu_{X2} = -\delta_X$, so that $D_1 = C_1$. Then (4) becomes


$$1-\Phi(C_1)-\frac{1}{\sqrt{2\pi}}\int_{C_1}^{+\infty}\Phi\left(\frac{D_2-\rho x}{\sqrt{1-\rho^2}}\right)\exp\left(-\frac{x^2}{2}\right)dx. \tag{6}$$

For $\rho > 0$, there exists a negative value $K$ such that when $D_2 < K$,

$$\Phi\left(\frac{D_2-\rho x}{\sqrt{1-\rho^2}}\right) < \varepsilon$$

for any $x$ in $[C_1, +\infty)$. For sufficiently large $\mu_{Y1}-\mu_{Y2}$, it can happen that $D_2 < K$. Therefore, for sufficiently large $\mu_{Y1}-\mu_{Y2}$, (6) $> 1-\Phi(C_1)-\varepsilon$. For $\rho \le 0$, express the integral in (6) as $I_1+I_2$, where

$$I_1 = \int_{C_1}^{E}\Phi\left(\frac{D_2-\rho x}{\sqrt{1-\rho^2}}\right)\exp\left(-\frac{x^2}{2}\right)dx \quad \text{and} \quad I_2 = \int_{E}^{+\infty}\Phi\left(\frac{D_2-\rho x}{\sqrt{1-\rho^2}}\right)\exp\left(-\frac{x^2}{2}\right)dx,$$

and $E$ is chosen such that

$$I_2 \le \int_{E}^{+\infty}\exp\left(-\frac{x^2}{2}\right)dx < 0.5\varepsilon.$$

The first inequality holds as the cumulative distribution function is always $\le 1$. For a chosen value of $E$, the argument for $\rho > 0$ can be applied to prove $I_1 < 0.5\varepsilon$ for sufficiently large $\mu_{Y1}-\mu_{Y2}$. Hence, $P(T_X > C_1, T_Y > C_2 \mid H_0)$ is greater than $1-\Phi(C_1)-\varepsilon$ for $\mu_{X1}-\mu_{X2} = -\delta_X$ and sufficiently large $\mu_{Y1}-\mu_{Y2}$. Similarly, it can be proved that $P(T_X > C_1, T_Y > C_2 \mid H_0)$ is greater than $1-\Phi(C_2)-\varepsilon$ for $\mu_{Y1}-\mu_{Y2} = \delta_Y$ and sufficiently large $\mu_{X1}-\mu_{X2}$. This completes the proof.

Therefore, the type I error of the test based on $T_X$ and $T_Y$ can be controlled at the level of $\alpha$ by appropriately choosing the corresponding critical values $C_1$ and $C_2$. Denote by $Z_\alpha$ the upper $\alpha$-percentile of the standard normal distribution. Then, the power function of the above test is $P(T_X > Z_\alpha, T_Y > Z_\alpha)$, which can be calculated from (3) and the cumulative distribution function of the standard bi-variate normal distribution.

The Impact on Power Calculation for Sample Size

Fixed Power Approach

In practice, when switching from testing a single hypothesis (i.e., based on a single study endpoint such as the efficacy endpoint in clinical trials) to testing a composite hypothesis (i.e., based on two study endpoints such as both safety and efficacy endpoints in clinical trials), an increase in sample size is expected. Let $X$ be the efficacy endpoint in clinical trials. Consider testing the following single non-inferiority hypothesis with a non-inferiority margin of $\delta_X$:

$$H_{01}: \mu_{X1}-\mu_{X2} \le -\delta_X \quad \text{vs.} \quad H_{a1}: \mu_{X1}-\mu_{X2} > -\delta_X.$$

Then, a commonly used test is to reject the null hypothesis $H_{01}$ at the $\alpha$ level of significance if $T_X > Z_\alpha$. The total sample size for concluding that the test treatment is non-inferior to the control with $1-\beta$ power if the difference of means satisfies $\mu_{X1}-\mu_{X2} > -\delta_X$ is

$$N_X = \frac{(1+r)^2(Z_\alpha+Z_\beta)^2\sigma_X^2}{r(\mu_{X1}-\mu_{X2}+\delta_X)^2},$$

where $r = n_2/n_1$ is the sample size allocation ratio between the control and test treatment [4]. Table 3 gives the total sample size ($N_X$) for the test of non-inferiority based on efficacy endpoint $X$ and the total sample size ($N$) for testing the composite hypothesis based on both efficacy endpoint $X$ and safety endpoint $Y$, for various scenarios. In particular, we calculated sample sizes for $\alpha = 0.05$, $\beta = 0.20$, $\mu_{Y1}-\mu_{Y2}-\delta_Y = 0.3$, $\sigma_Y = 1$, and several values of $\Delta = \mu_{X1}-\mu_{X2}+\delta_X$ and other parameters. For a hypothesis of superiority of the test treatment in safety, i.e., the component with respect to safety in the composite hypothesis, the preceding specified values of type I error rate, power, $\mu_{Y1}-\mu_{Y2}-\delta_Y$ and $\sigma_Y$ require a total sample size $N_Y = 275$.

For many scenarios in Table 3, the total sample size $N$ for the test of the composite hypothesis is much larger than the sample size for the test of non-inferiority in efficacy ($N_X$). However, it happens in some cases that they are the same or their difference is quite small. Actually, $N$ is associated with the sample sizes for the individual tests of non-inferiority in efficacy ($N_X$) and of superiority in safety ($N_Y$), and the correlation coefficient ($\rho$) between $X$ and $Y$. When a large difference exists between $N_X$ and $N_Y$, $N$ is quite close to the larger of $N_X$ and $N_Y$ and changes little with $\rho$. In this numerical study, for $N_X$ = 69 and 39 (<< 275), $N$ is mostly equal to 275; for $N_X$ = 1392 and 619 (>> 275), the difference between $N$ and $N_X$ is 0 or negligible compared with the size of $N$. In the preceding four scenarios, a change in the correlation coefficient between $X$ and $Y$ has little impact on $N$. On the other hand, the larger of $N_X$ and $N_Y$ is not always close to $N$, especially when $N_X$ and $N_Y$ are close to each other. For example, in Table 3, when $N_X$ is equal to 275 (= $N_Y$), $N$ is 352 for $\rho$ = 0.5, and 373 for $\rho$ = 0. In addition, the results in Table 3 suggest that the correlation coefficient between $X$ and $Y$ is unlikely to have great influence on $N$, especially when the difference between $N_X$ and $N_Y$ is quite substantial. The above findings are consistent with the underlying 'rule': when the two sample sizes are substantially different, taking $N$ as the larger of $N_X$ and $N_Y$ will ensure that the powers of the two individual tests for efficacy and safety are essentially 1 and $1-\beta$, 'resulting' in a power of $1-\beta$ for the test of the composite hypotheses; when $N_X$ and $N_Y$ are close to each other, taking $N$ as the larger of $N_X$ and $N_Y$ will power the test of the composite hypotheses at about $(1-\beta)^2$. Therefore, a significant increment in $N$ is required


for achieving a power of $1-\beta$.

Fixed Sample Size Approach

Based on the sample sizes in Table 3, the power of the test of the composite hypothesis $H_0$ was calculated, with results presented in Table 4, where $P$ is the power of the test of the composite hypothesis with $N_X$ in Table 3 and $P_M$ is the power of the same test with $\max(N_X, 275)$. With sample size $N_X$, the power of the test of the composite hypothesis is never greater than the target value of 80%, as $N_X$ is never larger than $N$ in Table 3. In some cases with $\sigma_X = 1.5 > \sigma_Y = 1.0$, $N_X = N$, and hence the corresponding $P$ = 80%. However, $P$ is less than 60% for many cases in our numerical study. The worst scenario is $P$ = 4.3% when $N_X$ = 39 for $\sigma_X$ = 0.5, $\rho$ = -1 and $\Delta$ = 0.4. Therefore, a test that is sized for achieving a certain power in testing the hypothesis of efficacy only may not have enough power to reject the null hypothesis of the composite hypothesis of both efficacy and safety using sample size $N_X$. Interestingly, when testing the composite hypothesis with $\max(N_X, 275)$, the power $P_M$ is close to the target value of 80% in most scenarios. Some exceptions happen when $N_X$ is close to 275 (corresponding to ($\Delta$ = 0.3, $\sigma_X$ = 1.0) and ($\Delta$ = 0.4, $\sigma_X$ = 1.5)), such that a significant increment in sample size from $\max(N_X, 275)$ to $N$ is required. This suggests taking $N$ as the larger of the two sample sizes $N_X$ and $N_Y$ for testing the hypothesis of each individual endpoint when one of the two is much larger, say, one-fold larger than the other (Table 4).

Table 1: Significant Withdrawals of Drug Products between 2000-2010.

| Drug name | Withdrawn | Remarks |
|---|---|---|
| Troglitazone (Rezulin) | 2000 | Withdrawn because of risk of hepatotoxicity; superseded by pioglitazone and rosiglitazone. |
| Alosetron (Lotronex) | 2000 | Withdrawn because of risk of fatal complications of constipation; reintroduced 2002 on a restricted basis. |
| Cisapride (Propulsid) | 2000s | Withdrawn in many countries because of risk of cardiac arrhythmias. |
| Amineptine (Survector) | 2000 | Withdrawn because of hepatotoxicity, dermatological side effects, and abuse potential. |
| Phenylpropanolamine (Propagest, Dexatrim) | 2000 | Withdrawn because of risk of stroke in women under 50 years of age when taken at high doses (75 mg twice daily) for weight loss. |
| Trovafloxacin (Trovan) | 2001 | Withdrawn because of risk of liver failure. |
| Cerivastatin (Baycol, Lipobay) | 2001 | Withdrawn because of risk of rhabdomyolysis. |
| Rapacuronium (Raplon) | 2001 | Withdrawn in many countries because of risk of fatal bronchospasm. |
| Rofecoxib (Vioxx) | 2004 | Withdrawn because of risk of myocardial infarction. |
| Mixed amphetamine salts (Adderall XR) | 2005 | Withdrawn in Canada because of risk of stroke. See Health Canada press release. The ban was later lifted because the death rate among those taking Adderall XR was determined to be no greater than those not taking Adderall. |
| Hydromorphone extended-release (Palladone) | 2005 | Withdrawn because of a high risk of accidental overdose when administered with alcohol. |
| Pemoline (Cylert) | 2005 | Withdrawn from U.S. market because of hepatotoxicity. |
| Natalizumab (Tysabri) | 2005-2006 | Voluntarily withdrawn from U.S. market because of risk of progressive multifocal leukoencephalopathy (PML). Returned to market July 2006. |
| Ximelagatran (Exanta) | 2006 | Withdrawn because of risk of hepatotoxicity (liver damage). |
| Pergolide (Permax) | 2007 | Voluntarily withdrawn in the U.S. because of the risk of heart valve damage. Still available elsewhere. |
| Tegaserod (Zelnorm) | 2007 | Withdrawn because of imbalance of cardiovascular ischemic events, including heart attack and stroke. Was available through a restricted access program until April 2008. |
| Aprotinin (Trasylol) | 2007 | Withdrawn because of increased risk of complications or death; permanently withdrawn in 2008 except for research use. |
| Lumiracoxib (Prexige) | 2007-2008 | Progressively withdrawn around the world because of serious side effects, mainly liver damage. |
| Rimonabant (Accomplia) | 2008 | Withdrawn around the world because of risk of severe depression and suicide. |
| Efalizumab (Raptiva) | 2009 | Withdrawn because of increased risk of progressive multifocal leukoencephalopathy; to be completely withdrawn from market by June 2009. |
| Sibutramine (Reductil) | 2010 | Withdrawn in Europe because of increased cardiovascular risk. |


Table 2: Composite Hypotheses for Clinical Evaluation.

| Efficacy \ Safety | N | S | E |
|---|---|---|---|
| N | NN | NS | NE |
| S | SN | SS | SE |
| E | EN | ES | EE |

Note: N=Non-inferiority; S=Superiority; E=Equivalence
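For the NS entry of Table 2, the three subsets of the null hypothesis described in the text can be restated compactly. The display below is only a restatement in set notation, using the margins of hypotheses (2); the complement symbols $\bar N$ and $\bar S$ are shorthand introduced here and do not appear in the original derivation.

```latex
\begin{aligned}
N &= \{\mu_{X1}-\mu_{X2} > -\delta_X\}, \qquad
S = \{\mu_{Y1}-\mu_{Y2} > \delta_Y\}, \\
H_a &: N \cap S, \\
H_0 &: \overline{N \cap S}
   = (\bar N \cap S) \,\cup\, (N \cap \bar S) \,\cup\, (\bar N \cap \bar S).
\end{aligned}
```

The three pieces correspond, in order, to subsets (i), (ii) and (iii) of the null hypothesis, so a valid test has to control the rejection probability over all of them.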

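Table 3: Comparison of Sample Size between Tests for Composite Hypothesis and Single Hypothesis.

| $\sigma_X$ | $\rho$ | $N_X$ ($\Delta$ = 0.2) | $N$ ($\Delta$ = 0.2) | $N/N_X$ ($\Delta$ = 0.2) | $N_X$ ($\Delta$ = 0.3) | $N$ ($\Delta$ = 0.3) | $N/N_X$ ($\Delta$ = 0.3) | $N_X$ ($\Delta$ = 0.4) | $N$ ($\Delta$ = 0.4) | $N/N_X$ ($\Delta$ = 0.4) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | -1.0 | 155 | 304 | 1.96 | 69 | 276 | 4 | 39 | 275 | 7.05 |
| 0.5 | -0.5 | 155 | 303 | 1.95 | 69 | 276 | 4 | 39 | 275 | 7.05 |
| 0.5 | 0 | 155 | 300 | 1.94 | 69 | 276 | 4 | 39 | 275 | 7.05 |
| 0.5 | 0.5 | 155 | 289 | 1.86 | 69 | 275 | 3.99 | 39 | 275 | 7.05 |
| 0.5 | 1 | 155 | 275 | 1.77 | 69 | 275 | 3.99 | 39 | 275 | 7.05 |
| 1.0 | -1.0 | 619 | 647 | 1.05 | 275 | 381 | 1.39 | 155 | 304 | 1.96 |
| 1.0 | -0.5 | 619 | 646 | 1.04 | 275 | 381 | 1.39 | 155 | 303 | 1.95 |
| 1.0 | 0 | 619 | 642 | 1.04 | 275 | 373 | 1.36 | 155 | 300 | 1.94 |
| 1.0 | 0.5 | 619 | 629 | 1.02 | 275 | 352 | 1.28 | 155 | 289 | 1.86 |
| 1.0 | 1 | 619 | 619 | 1 | 275 | 275 | 1 | 155 | 275 | 1.77 |
| 1.5 | -1.0 | 1392 | 1392 | 1 | 619 | 647 | 1.05 | 348 | 433 | 1.24 |
| 1.5 | -0.5 | 1392 | 1392 | 1 | 619 | 646 | 1.04 | 348 | 432 | 1.24 |
| 1.5 | 0 | 1392 | 1392 | 1 | 619 | 642 | 1.04 | 348 | 424 | 1.22 |
| 1.5 | 0.5 | 1392 | 1392 | 1 | 619 | 629 | 1.02 | 348 | 402 | 1.16 |
| 1.5 | 1 | 1392 | 1392 | 1 | 619 | 619 | 1 | 348 | 348 | 1 |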
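The sample sizes in Table 3 can be approximated with a short script. The following is a minimal sketch, not the authors' program: it assumes equal critical values $Z_\alpha$ for both endpoints, rounds the total sample size up to the next integer, evaluates the joint rejection probability by Simpson integration of the single-integral form (4), and only handles $|\rho| < 1$. All function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf, pdf, inv_cdf


def single_test_total_n(alpha, beta, sigma, effect, r=1.0):
    """Total N = (1+r)^2 (z_alpha+z_beta)^2 sigma^2 / (r * effect^2), rounded up.

    `effect` is mu_X1 - mu_X2 + delta_X (non-inferiority) or
    mu_Y1 - mu_Y2 - delta_Y (superiority); r = n2 / n1.
    """
    z_sum = STD_NORMAL.inv_cdf(1.0 - alpha) + STD_NORMAL.inv_cdf(1.0 - beta)
    return math.ceil((1.0 + r) ** 2 * z_sum ** 2 * sigma ** 2 / (r * effect ** 2))


def upper_orthant(a1, a2, rho, steps=1000):
    """P(U_X > a1, U_Y > a2) for a standard bivariate normal, via equation (4)."""
    if abs(rho) >= 1.0:
        raise ValueError("this sketch assumes |rho| < 1")
    scale = math.sqrt(1.0 - rho * rho)
    upper = max(a1, 0.0) + 10.0  # the normal density is negligible beyond this
    h = (upper - a1) / steps     # steps must be even for Simpson's rule

    def integrand(x):
        return STD_NORMAL.cdf((a2 - rho * x) / scale) * STD_NORMAL.pdf(x)

    total = integrand(a1) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(a1 + i * h)
    return 1.0 - STD_NORMAL.cdf(a1) - total * h / 3.0


def composite_power(n_total, alpha, effect_x, sigma_x, effect_y, sigma_y, rho, r=1.0):
    """Power P(T_X > z_alpha, T_Y > z_alpha) at total sample size n_total."""
    n1 = n_total / (1.0 + r)
    n2 = n_total - n1
    se = math.sqrt(1.0 / n1 + 1.0 / n2)
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    d1 = z_alpha - effect_x / (se * sigma_x)
    d2 = z_alpha - effect_y / (se * sigma_y)
    return upper_orthant(d1, d2, rho)


def composite_total_n(alpha, beta, effect_x, sigma_x, effect_y, sigma_y, rho, r=1.0):
    """Smallest total N whose composite power reaches 1 - beta."""
    # The composite power never exceeds either marginal power, so the
    # search can start at the larger of the two single-test sample sizes.
    n = max(single_test_total_n(alpha, beta, sigma_x, effect_x, r),
            single_test_total_n(alpha, beta, sigma_y, effect_y, r))
    while composite_power(n, alpha, effect_x, sigma_x, effect_y, sigma_y, rho, r) < 1.0 - beta:
        n += 1
    return n


if __name__ == "__main__":
    # Scenario sigma_X = 0.5, Delta = 0.2, rho = 0 (Table 3 lists N_X = 155, N = 300).
    print(single_test_total_n(0.05, 0.20, 0.5, 0.2))
    print(composite_total_n(0.05, 0.20, 0.2, 0.5, 0.3, 1.0, 0.0))
```

Starting the search at $\max(N_X, N_Y)$ reflects the observation in the text that the composite test needs at least as many subjects as either individual test. The cases $\rho = \pm 1$ in Table 3 are degenerate for this integral form and would need separate handling.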

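Table 4: Power (%) of Test of Composite Hypothesis.

| $\sigma_X$ | $\rho$ | $P$ ($\Delta$ = 0.2) | $P_M$ ($\Delta$ = 0.2) | $P$ ($\Delta$ = 0.3) | $P_M$ ($\Delta$ = 0.3) | $P$ ($\Delta$ = 0.4) | $P_M$ ($\Delta$ = 0.4) |
|---|---|---|---|---|---|---|---|
| 0.5 | -1.0 | 38.9 | 75.3 | 14.7 | 80 | 4.3 | 80 |
| 0.5 | -0.5 | 41.9 | 75.4 | 22 | 80 | 14.2 | 80 |
| 0.5 | 0 | 47.1 | 76.2 | 27.7 | 80 | 19.2 | 80 |
| 0.5 | 0.5 | 52.9 | 78.1 | 32.3 | 80 | 22.8 | 80 |
| 0.5 | 1 | 58.8 | 80 | 34.5 | 80 | 23.9 | 80 |
| 1.0 | -1.0 | 78.2 | 78.2 | 60.1 | 60.1 | 38.9 | 75.3 |
| 1.0 | -0.5 | 78.2 | 78.2 | 60.9 | 60.9 | 41.9 | 75.4 |
| 1.0 | 0 | 78.6 | 78.6 | 64 | 64 | 47.1 | 76.2 |
| 1.0 | 0.5 | 79.4 | 79.4 | 68.8 | 68.8 | 52.9 | 78.1 |
| 1.0 | 1 | 80 | 80 | 80 | 80 | 58.8 | 80 |
| 1.5 | -1.0 | 80 | 80 | 78.2 | 78.2 | 67.6 | 67.6 |
| 1.5 | -0.5 | 80 | 80 | 78.2 | 78.2 | 68 | 68 |
| 1.5 | 0 | 80 | 80 | 78.6 | 78.6 | 70.1 | 70.1 |
| 1.5 | 0.5 | 80 | 80 | 79.4 | 79.4 | 73.7 | 73.7 |
| 1.5 | 1 | 80 | 80 | 80 | 80 | 80 | 80 |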
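The powers in Table 4, and the type I error bound derived earlier, can also be checked by simulation. The sketch below is an illustration under stated assumptions rather than the authors' computation: it assumes equal allocation, a known $\Sigma$, and critical value $Z_\alpha$ for both endpoints, and it draws the pair $(T_X, T_Y)$ directly from its bi-variate normal distribution instead of simulating patient-level data. The function name and its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_rejection_rate(n_total, effect_x, sigma_x, effect_y, sigma_y, rho,
                            alpha=0.05, n_sim=100_000, seed=1):
    """Monte Carlo estimate of P(T_X > z_alpha, T_Y > z_alpha).

    effect_x = mu_X1 - mu_X2 + delta_X and effect_y = mu_Y1 - mu_Y2 - delta_Y.
    With Sigma known, (T_X, T_Y) is bivariate normal with unit variances,
    correlation rho, and means effect / (se * sigma).
    """
    rng = random.Random(seed)
    se = math.sqrt(4.0 / n_total)  # sqrt(1/n1 + 1/n2) with n1 = n2 = n_total / 2
    mean_x = effect_x / (se * sigma_x)
    mean_y = effect_y / (se * sigma_y)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    resid = math.sqrt(max(0.0, 1.0 - rho * rho))  # zero when rho = +/-1
    hits = 0
    for _ in range(n_sim):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        t_x = mean_x + u
        t_y = mean_y + rho * u + resid * v  # correlated with t_x at level rho
        if t_x > z_alpha and t_y > z_alpha:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # sigma_X = 0.5, Delta = 0.2, rho = 0: power with N_X = 155 and with max(N_X, 275).
    print(simulate_rejection_rate(155, 0.2, 0.5, 0.3, 1.0, 0.0))
    print(simulate_rejection_rate(275, 0.2, 0.5, 0.3, 1.0, 0.0))
    # Null boundary: efficacy effect exactly at the margin, very large safety effect.
    print(simulate_rejection_rate(275, 0.0, 1.0, 2.0, 1.0, 0.5))
```

For the first two calls the estimates should fall close to the Table 4 entries $P$ = 47.1% and $P_M$ = 76.2%, up to Monte Carlo error. The third call sits on the boundary of the null hypothesis used in the proof, so the rejection rate should be close to $\alpha$ = 0.05, which illustrates that the upper limit $\max\{1-\Phi(C_1), 1-\Phi(C_2)\}$ is approached but not exceeded.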


Concluding Remarks

In clinical evaluation of a test treatment in randomized clinical trials, the traditional (conditional) approach of testing a single hypothesis for efficacy alone is not efficient because the observed safety profile could be by chance alone and may not be reproducible. Thus, testing a composite hypothesis which takes both safety and efficacy into consideration (e.g., testing non-inferiority in efficacy and testing superiority of safety as compared to a control) is recommended. In practice, sample size is expected to increase when switching from single hypothesis testing (the traditional approach) to composite hypothesis testing for clinical evaluation of a test treatment under investigation.

For illustration purposes, in this article, we assume that both efficacy and safety data $(X, Y)$ are continuous variables which follow a bi-variate normal distribution. Statistical tests were derived under the framework of the bi-variate normal distribution. In practice, efficacy and safety data $(X, Y)$ could be either a continuous variable, a binary response, or time-to-event data. A similar idea can be applied (i) to derive an appropriate statistical test under the null hypothesis and (ii) to determine the impact on power calculation for sample size requirement when switching from testing a single hypothesis (for efficacy) to testing a composite hypothesis for both safety and efficacy [5,6]. It, however, should be noted that closed forms and/or formulas for sample size calculation and the relationships between the single hypothesis and the composite hypothesis may not exist. In this case, clinical trial simulation may be useful.

References

1. Chow SC, Shao J (2002) Statistics in Drug Research – Methodologies and Recent Development. Marcel Dekker Inc, New York, USA.
2. FDA (1988) Guideline for the Format and Content of the Clinical and Statistical Sections of New Drug Applications. U.S. Food and Drug Administration, Rockville MD.
3. Wikipedia (2010) List of withdrawn drugs.
4. Chow SC, Shao J, Wang H, Lokhnygina Y (2017) Sample Size Calculation in Clinical Research. 3rd Edition, Taylor & Francis, New York, USA.
5. Chow SC, Huang Z (2019) Innovative thinking on endpoint selection in clinical trials. Journal of Biopharmaceutical Statistics 29(5): 941-951.
6. Chow SC, Liu JP (2013) Design and Analysis of Clinical Trials – Revised and Expanded, 3rd Edition, John Wiley & Sons, New York, USA.
