Importance of General Linear Models

Paper AS15 - PHUSE US 2020

Importance of General Linear Models (GLM) in Clinical Research

Venkata Ikkurthy, Bayer, Whippany, USA
Ritesh Dhimmar, Bayer, Whippany, USA

ABSTRACT

A fully randomized clinical trial is necessary to conduct unbiased and objective research. However, randomization can also hide relationships between study subjects that could be of interest to the investigator. Classification and similarity techniques are powerful tools for discovering these hidden relationships. Among these, general linear models help to find effects between and within treatment groups. This paper provides an example of a general linear model, ANOVA (Analysis of Variance), for different treatment groups, analyzing the data between and within the treatment groups using PROC GLM.

Keywords: ANOVA, least squares, SAS, PROC GLM, regression, interaction effect, linear, mathematical approach.

INTRODUCTION

This paper explains the importance and application of linear models in clinical trials. Most statistical analyses are based on linear models, and most analyses of linear models can be performed with three SAS® procedures: REG, ANOVA, and GLM. These statistical procedures provide the power and flexibility for almost all linear model analyses. This paper explains the basic statistical concepts needed to understand and use the linear models that can be analyzed with these SAS statistical procedures.

STATISTICAL BACKGROUND

Variation is inevitable, so we need to identify its causes. Variation between groups is generally called assignable variation, and variation within groups is called chance variation. Mathematically, we need to identify the assignable variation and eliminate it. Chance variation cannot be eliminated, but it can be minimized using the minima and maxima of differential calculus and the principle of least squares.
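As a sketch of how differential calculus and the principle of least squares work together, consider a straight-line model with response Y and predictor X observed on n subjects. The notation below is an assumption introduced for illustration; it is the standard textbook derivation rather than anything specific to this paper's data.

```latex
% Sum of squared residuals (chance variation) for the line Y = \beta_0 + \beta_1 X
S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_i \right)^2

% Setting both partial derivatives to zero gives the normal equations
\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_i \right) = 0
\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} X_i \left( Y_i - \beta_0 - \beta_1 X_i \right) = 0

% Solving the normal equations yields the least squares estimates
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}
```

Because S is a convex quadratic in the two coefficients, this stationary point is a minimum whenever the X values are not all equal, so the fitted line leaves the smallest possible residual sum of squares.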
GLM PROCEDURE (PROC GLM)

• Linear regression analysis
• Analysis of variance (ANOVA)
• Analysis of covariance (ANCOVA)

SIMPLE LINEAR REGRESSION ANALYSIS

Simple linear regression analysis is used to analyze the relationship between a response, Y, and a predictor, X.

Objectives:

• To assess the significance of the predictor in explaining the variability or behavior of the response.
• To predict the values of the response given the values of the predictor.

Study design: A multi-center, randomized, double-blind, parallel study conducted at four study centers. 75 patients in 4 centers are randomized in equal numbers to one of three treatment groups: high dose of anti-hypertensive medication (HI), low dose of anti-hypertensive medication (LO), and placebo (PB).

Objectives:

• To compare the efficacy of the three treatment groups in terms of the mean decrease in diastolic blood pressure (DBP).
• To compare the safety of the three treatment groups by evaluating their adverse drug reaction (ADR) rates (i.e., diarrhea).

Question: Do the data show any linear relationship between the mean change in diastolic blood pressure (bpdiach) and patients' ages (age)?

    ods graphics on;
    proc glm data = bp plots = all;
      /* regress the DBP change on age */
      model bpdiach = age;
      title1 'Simple Linear Regression';
      title2 'DBP changes from baseline to endpoint vs. Age';
    quit;
    ods graphics off;

Simple linear regression syntax:

    proc glm data = SAS-data-set;
      model Y = X;
    quit;

Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

Multiple linear regression syntax:

    proc glm data = SAS-data-set;
      model Y = X Z;
    quit;

Model: $Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \varepsilon_i$

Question: Do the data show any linear relationship between the mean change in diastolic blood pressure (bpdiach) and patients' ages (age) and baseline diastolic blood pressure (bpdia0)?

    ods graphics on;
    proc glm data = bp plots = all;
      /* regress the DBP change on age and baseline DBP */
      model bpdiach = age bpdia0;
      title1 'Multiple Linear Regression';
      title2 'DBP changes from baseline to endpoint vs. Age & DBP baseline';
    quit;
    ods graphics off;

Model: $\mathrm{bpdiach} = \beta_0 + \beta_1 \cdot \mathrm{age} + \beta_2 \cdot \mathrm{bpdia0} + \varepsilon$

Conclusions from the fitted model:

• We fail to reject $\beta_1 = 0$ at the 0.05 level, i.e., we cannot conclude that there is a linear relationship between DBP change from baseline to endpoint and age.
• We reject $\beta_2 = 0$ at the 0.05 level, i.e., we can conclude that there is a linear relationship between DBP change from baseline to endpoint and baseline DBP.

ANALYSIS OF VARIANCE (ONE-WAY CLASSIFICATION)

Goal: To simultaneously compare two or more group means based on independent samples from each group.

Assumptions:

• Normality: the samples are from normally distributed populations.
• Independence: observations are independent within each group.
• Variance homogeneity: the within-group variance is constant across groups.

In clinical trials, the ANOVA method might be appropriate for comparing mean responses among a number of parallel-dose groups or among various strata based on patients' background information, such as race, age group, or disease severity.

Example

Question: Are the mean changes in diastolic blood pressure (bpdiach) significantly different among treatment groups (trt)?
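This question can be written as a formal hypothesis test. The block below is a sketch of the standard one-way ANOVA formulation; the symbols (k groups, n_i observations in group i, N observations in total) are assumptions introduced for illustration and do not appear in the paper.

```latex
% One-way ANOVA model for k treatment groups (k = 3 here: HI, LO, PB)
Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2),
\qquad i = 1, \dots, k, \quad j = 1, \dots, n_i

% Null and alternative hypotheses
H_0 : \mu_1 = \mu_2 = \dots = \mu_k
\qquad \text{vs.} \qquad
H_1 : \text{at least two of the } \mu_i \text{ differ}

% Between-group (assignable) and within-group (chance) sums of squares
\mathrm{SSB} = \sum_{i=1}^{k} n_i \left( \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot} \right)^2,
\qquad
\mathrm{SSW} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( Y_{ij} - \bar{Y}_{i\cdot} \right)^2

% Test statistic; under H_0 it follows an F distribution with (k-1, N-k) degrees of freedom
F = \frac{\mathrm{SSB} / (k - 1)}{\mathrm{SSW} / (N - k)}
```

For the study described here, with three treatment groups and assuming all 75 randomized patients contribute a DBP change, the reference distribution would have (2, 72) degrees of freedom. A large F indicates that the variation between treatment groups is large relative to the variation within them.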
    ods graphics on;
    proc glm data = bp plots = all;
      class trt;
      model bpdiach = trt;
      /* least squares means with standard errors, pairwise p-values, and confidence limits */
      lsmeans trt / stderr pdiff cl;
      /* coefficients follow the sorted order of the trt levels, assumed here to be HI, LO, PB */
      estimate 'High Dose - Placebo' trt 1 0 -1;
      estimate 'Active - Placebo' trt 0.5 0.5 -1;
    quit;
    ods graphics off;

ANALYSIS OF VARIANCE (TWO-WAY CLASSIFICATION)

Question: Are the mean changes in diastolic blood pressure (bpdiach) significantly different among treatment groups (trt) and centers (center)?

    ods graphics on;
    proc glm data = bp plots = all;
      class trt center;
      /* main effects of treatment and center plus their interaction */
      model bpdiach = trt center trt*center;
      lsmeans trt / stderr pdiff cl;
      /* coefficients follow the sorted order of the trt levels, assumed here to be HI, LO, PB */
      estimate 'High Dose - Placebo' trt 1 0 -1;
      estimate 'Active - Placebo' trt 0.5 0.5 -1;
    quit;
    ods graphics off;

ANALYSIS OF COVARIANCE (ANCOVA)

Goal: To compare response means among two or more groups, adjusted for a quantitative concomitant variable, or "covariate", thought to influence the response.

The mean response of the i-th group is modeled as $\mu_i = \alpha_i + \beta X$, where $X$ is the covariate, $\alpha_i$ is the intercept of the i-th group, and $\beta$ is the slope common to all groups.

ANCOVA combines regression and ANOVA methods by fitting simple linear regression models within each group and comparing the regressions among groups.

Example

Syntax for ANCOVA:

    ods graphics on;
    proc glm data = bp plots = all;
      class trt;
      /* treatment effect adjusted for baseline DBP as the covariate */
      model bpdiach = trt bpdia0;
      lsmeans trt / stderr pdiff cl;
      /* coefficients follow the sorted order of the trt levels, assumed here to be HI, LO, PB */
      estimate 'High Dose - Placebo' trt 1 0 -1;
      estimate 'Active - Placebo' trt 0.5 0.5 -1;
    quit;
    ods graphics off;

ACKNOWLEDGMENTS

The authors wish to thank their colleague, Iraj Mohebalian, for his valuable comments on this paper, and Joerg Guettner and Michael Badlani of Bayer management, who have long encouraged and supported the presentation of this paper.

CONTACT INFORMATION

Name: Venkata Ikkurthy
Enterprise: Bayer U.S. LLC
Address: 100 Bayer Boulevard, P.O. Box 915
City, State ZIP: Whippany, NJ 07981-0915
Phone: +1 732 662 0832
E-mail: [email protected]

Name: Ritesh Dhimmar
Enterprise: Bayer U.S. LLC
Address: 100 Bayer Boulevard, P.O. Box 915
City, State ZIP: Whippany, NJ 07981-0915
Phone: +1 551 998 9086
E-mail: [email protected]

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are trademarks of their respective companies.
