Advanced Regression Methods Symposium on Updates on Clinical Research Methodology March 18, 2013
Recommended publications
-
An Introduction to Poisson Regression Russ Lavery, K&L Consulting Services, King of Prussia, PA, U.S.A
NESUG 2010 Statistics and Analysis. An Animated Guide: An Introduction To Poisson Regression. Russ Lavery, K&L Consulting Services, King of Prussia, PA, U.S.A.
ABSTRACT: This paper will be a brief introduction to Poisson regression (theory, steps to be followed, complications and interpretation) via a worked example. It is hoped that this will increase motivation towards learning this useful statistical technique.
INTRODUCTION: Poisson regression is available in SAS through the GENMOD procedure (generalized modeling). It is appropriate when: 1) the process that generates the conditional Y distributions would, theoretically, be expected to be a Poisson random process, 2) there is no evidence of overdispersion, and 3) the mean of the marginal distribution is less than ten (preferably less than five and ideally close to one).
THE POISSON DISTRIBUTION: The Poisson distribution is a discrete distribution and is appropriate for modeling counts of observations. Counts are observed cases, like the count of measles cases in cities. You can simply model counts if all data were collected in the same measuring unit (e.g. the same number of days or same number of square feet). You can use the Poisson distribution for modeling rates (rates are counts per unit) if the units of collection were different.
[Figure: percent of observations where the random variable X is expected to have the value x, given that the Poisson distribution has a mean of λ, $P(X = x, \lambda) = e^{-\lambda} \lambda^{x} / x!$, plotted for λ = 0.3, 0.5, 0.8, 1, 3, 5, 8, 20.] -
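As a small illustration of the Poisson probability formula quoted in the excerpt above, the following sketch evaluates it with the Python standard library only. The function name `poisson_pmf` and the chosen means are illustrative assumptions, not part of the original paper (which works in SAS).

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with mean lam (hypothetical helper)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Tabulate P(X = 0..5) for a few small means, the range the excerpt recommends.
for lam in (0.5, 1.0, 5.0):
    probs = [round(poisson_pmf(x, lam), 3) for x in range(6)]
    print(lam, probs)
```

For small means most of the probability sits on 0 and 1, which is one way to see why the excerpt prefers marginal means close to one.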
Generalized Linear Models (GLMs)
San José State University, Math 261A: Regression Theory & Methods. Generalized Linear Models (GLMs). Dr. Guangliang Chen
This lecture is based on the following textbook sections:
• Chapter 13: 13.1 – 13.3
Outline of this presentation:
• What is a GLM?
• Logistic regression
• Poisson regression
What is a GLM? In ordinary linear regression, we assume that the response is a linear function of the regressors plus Gaussian noise:
$$y = \underbrace{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}_{\text{linear form } x'\beta} + \underbrace{\epsilon}_{N(0,\sigma^2)\ \text{noise}} \sim N(x'\beta, \sigma^2)$$
The model can be reformulated in terms of
• distribution of the response: $y \mid x \sim N(\mu, \sigma^2)$, and
• dependence of the mean on the predictors: $\mu = E(y \mid x) = x'\beta$
[Figure: fitted line $\beta_0 + \beta_1 x$ with beta = (1, 2); horizontal axis x, vertical axis y.]
Generalized linear models (GLM) extend linear regression by allowing the response variable to have
• a general distribution (with mean $\mu = E(y \mid x)$) and
• a mean that depends on the predictors through a link function g:
That is, $g(\mu) = x'\beta$ or equivalently, $\mu = g^{-1}(x'\beta)$.
In GLM, the response is typically assumed to have a distribution in the exponential family, which is a large class of probability distributions that have pdfs of the form $f(x \mid \theta) = a(x)\, b(\theta) \exp(c(\theta) \cdot T(x))$, including
• Normal - ordinary linear regression
• Bernoulli - Logistic regression, modeling binary data
• Binomial - Multinomial logistic regression, modeling general categorical data
• Poisson - Poisson regression, modeling count data
• Exponential, Gamma - survival analysis -
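The link-function relationship in the slides above, where the mean is obtained by applying the inverse link to the linear predictor, can be sketched as follows. The dictionary `LINKS`, the function `mean_response`, and the pairing of each link with a distribution are illustrative assumptions using the usual canonical links; they are not taken from the lecture.

```python
import math

# (link g, inverse link g^{-1}) for three common GLMs; names are hypothetical.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),            # normal response
    "logit": (lambda mu: math.log(mu / (1 - mu)),
              lambda eta: 1 / (1 + math.exp(-eta))),         # Bernoulli response
    "log": (math.log, math.exp),                             # Poisson response
}

def mean_response(x, beta, link="identity"):
    """Return mu = g^{-1}(x' beta) for one observation x."""
    eta = sum(xj * bj for xj, bj in zip(x, beta))  # linear predictor x' beta
    _, inverse_link = LINKS[link]
    return inverse_link(eta)

# Same linear predictor, three different mean scales.
x, beta = [1.0, 0.5], [1.0, 2.0]
for name in LINKS:
    print(name, mean_response(x, beta, name))
```

Only the inverse link changes between the three models; the linear predictor is computed identically, which is the sense in which GLMs extend linear regression.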
Moderated Mediation Analysis: a Review and Application to School Climate Research
Practical Assessment, Research, and Evaluation Volume 25 Article 5 2020 Moderated Mediation Analysis: A Review and Application to School Climate Research Kelly D. Edwards University of Virginia Timothy R. Konold University of Virginia Follow this and additional works at: https://scholarworks.umass.edu/pare Part of the Educational Assessment, Evaluation, and Research Commons, Educational Methods Commons, and the Social Statistics Commons Recommended Citation Edwards, Kelly D. and Konold, Timothy R. (2020) "Moderated Mediation Analysis: A Review and Application to School Climate Research," Practical Assessment, Research, and Evaluation: Vol. 25, Article 5. Available at: https://scholarworks.umass.edu/pare/vol25/iss1/5 This Article is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Practical Assessment, Research, and Evaluation by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact [email protected]. Moderated Mediation Analysis: A Review and Application to School Climate Research Cover Page Footnote We thank members of our research team including Dewey Cornell, Anna Grace Burnette, Brittany Zellers Crowley, Katrina Debnam, Francis Huang, Yuane Jia, Jennifer Maeng, and Shelby Stohlman. This project was supported by Grant #NIJ 2017-CK-BX-007 awarded by the National Institute of Justice, Office of Justice Programs, U.S. Department of Justice. Surveying was conducted in collaboration with the Center for School and Campus Safety at the Virginia Department of Criminal Justice Services. The opinions, findings, and conclusions or recommendations expressed in this report are those of the authors and do not necessarily reflect those of the U.S. -
Mediation and Moderation Analyses with R
Mediation and Moderation Analyses with R. Stephen D. Short (College of Charleston). Saturday, February 28, 2015
Overview
• Mediation analysis in R: simple mediation model example; multiple mediator model example
• Moderation analysis in R: continuous moderator model example; simple slope figures
• Tips
For slides and code please visit http://stephendshort.wix.com/psyc
Mediation
Occurs when the effect of one variable (X) on another variable (Y) "passes through" a third variable (M):
$$M = a_0 + aX + e_M$$
$$Y = b_0 + bM + c'X + e_Y$$
The indirect effect is quantified as $ab$.
Notable Mediation Packages Available in R
R packages for mediation analyses:
• BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2015)
• bmem (Zhang & Wang, 2011)
• mediation (Tingley, Yamamoto, Hirose, Keele, & Imai, 2014)
• powerMediation (Qiu, 2015)
• RMediation (Tofighi & MacKinnon, 2010)
Functions within other packages:
• mediate() in psych package (Revelle, 2012)
• mediation() in MBESS package (Kelley & Lai, 2012)
Note. This is not a complete list, but merely suggestions for social science researchers.
Example 1: Data From Pollack, VanEpps, & Hayes (2012)
Also example data in Hayes (2013) mediation text. Does economic stress (X) lead to a desire to withdraw from small business (Y), as a result of negative affect (M)?
• N = 262 small business owners
• X = estress (1-7 Likert scale)
• M = affect (1-5 Likert scale)
• Y = withdraw (1-7 Likert scale)
Example data available from www.afhayes.com -
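The two mediation regressions in the slides above (M on X, then Y on M and X) can be sketched with ordinary least squares in plain Python. The slides themselves use R packages; this Python version, the helper names, and the six-point toy data set are assumptions made purely for illustration and are not the Pollack, VanEpps, & Hayes data.

```python
def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def mediation_paths(x, m, y):
    """OLS estimates of a (M on X), b and c' (Y on M and X), and the total effect c."""
    a = cross(x, m) / cross(x, x)
    # Solve the centered 2x2 normal equations for the slopes of Y on (M, X).
    det = cross(m, m) * cross(x, x) - cross(m, x) ** 2
    b = (cross(m, y) * cross(x, x) - cross(x, y) * cross(m, x)) / det
    c_prime = (cross(x, y) * cross(m, m) - cross(m, y) * cross(m, x)) / det
    c = cross(x, y) / cross(x, x)  # total effect from regressing Y on X alone
    return {"a": a, "b": b, "c_prime": c_prime, "indirect": a * b, "total": c}

# Hypothetical toy data, chosen only so the regressions are well defined.
x = [1, 2, 3, 4, 5, 6]
m = [2, 1, 4, 3, 6, 5]
y = [1, 3, 2, 5, 4, 7]
paths = mediation_paths(x, m, y)
print(paths)
```

With least-squares estimates the total effect decomposes as c = c' + ab, so the indirect effect ab is the part of the X to Y relationship that runs through M. In practice the packages listed in the slides also provide interval estimates, which this sketch does not.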
Generalized Linear Models with Poisson Family: Applications in Ecology
UNIVERSITY OF ABOMEY-CALAVI, FACULTY OF AGRONOMIC SCIENCES. Master Program in Statistics, Major Biostatistics, 1st batch. Generalized linear models with Poisson family: applications in ecology. A thesis submitted to the Faculty of Agronomic Sciences in partial fulfillment of the requirements for the degree of the Master of Sciences in Biostatistics. Presented by: LOKONON Enagnon Bruno. Supervisor: Pr Romain L. GLELE KAKAÏ, Professor of Biostatistics and Forest estimation. Academic year: 2014-2015.
Certification
I certify that this work has been achieved by LOKONON E. Bruno under my entire supervision at the University of Abomey-Calavi (Benin) in order to obtain his Master of Science degree in Biostatistics. Pr Romain L. GLELE KAKAÏ, Professor of Biostatistics and Forest estimation
Acknowledgements
This research was supported by WAAPP/PPAAO-BENIN (West African Agricultural Productivity Program/ Programme de Productivité Agricole en Afrique de l'Ouest). This dissertation could only have been possible through the generous contributions of many people. First and foremost, I am grateful to my supervisor Pr Romain L. GLELE KAKAÏ, Professor of Biostatistics and Forest estimation, who tirelessly played a key role in orientation, scientific writing and mentoring during this research. In particular, I thank him for his prompt availability whenever needed. -
Heteroscedastic Errors
Heteroscedastic Errors (Richard Lockhart, STAT 350: Heteroscedastic Errors and GLIM)
• Sometimes plots and/or tests show that the error variances $\sigma_i^2 = \mathrm{Var}(\epsilon_i)$ depend on $i$.
• Several standard approaches to fixing the problem, depending on the nature of the dependence:
  • Weighted Least Squares.
  • Transformation of the response.
  • Generalized Linear Models.
Weighted Least Squares
• Suppose variances are known except for a constant factor.
• That is, $\sigma_i^2 = \sigma^2 / w_i$.
• Use weighted least squares. (See Chapter 10 in the text.)
• This usually arises realistically in the following situations:
  • $Y_i$ is an average of $n_i$ measurements where you know $n_i$. Then $w_i = n_i$.
  • Plots suggest that $\sigma_i^2$ might be proportional to some power of some covariate: $\sigma_i^2 = k x_i^{\gamma}$. Then $w_i = x_i^{-\gamma}$.
Variances depending on (mean of) Y
• Two standard approaches are available:
  • Older approach is transformation.
  • Newer approach is use of generalized linear model; see STAT 402.
Transformation
• Compute $Y_i^* = g(Y_i)$ for some function $g$ like logarithm or square root.
• Then regress $Y_i^*$ on the covariates.
• This approach sometimes works for skewed response variables like income;
• after transformation we occasionally find the errors are more nearly normal, more homoscedastic and that the model is simpler.
• See page 130ff and check under transformations and Box-Cox in the index.
Generalized Linear Models
• Transformation uses the model $E(g(Y_i)) = x_i^T \beta$ while generalized linear models use $g(E(Y_i)) = x_i^T \beta$.
• Generally latter approach offers more flexibility. -
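The weighted least squares idea in the slides above, where each observation is weighted by $w_i$ because its variance is $\sigma^2 / w_i$, can be sketched for a single covariate as follows. The function name and the example numbers are hypothetical; the case shown is the one from the slides in which each response is an average of $n_i$ measurements, so $w_i = n_i$.

```python
def weighted_least_squares(x, y, w):
    """Fit y = b0 + b1*x by minimizing sum_i w_i * (y_i - b0 - b1*x_i)**2."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data: each y_i is the average of n_i measurements, so w_i = n_i.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.1, 7.9, 11.2, 13.8]
n = [10, 2, 2, 10]
print(weighted_least_squares(x, y, n))
```

Observations built from more measurements pull the fitted line harder, which is the intended effect of setting the weights to the inverse of the relative variances.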
Generalized Linear Models
CHAPTER 6 Generalized linear models
6.1 Introduction
Generalized linear modeling is a framework for statistical analysis that includes linear and logistic regression as special cases. Linear regression directly predicts continuous data $y$ from a linear predictor $X\beta = \beta_0 + X_1\beta_1 + \cdots + X_k\beta_k$. Logistic regression predicts $\Pr(y = 1)$ for binary data from a linear predictor with an inverse-logit transformation. A generalized linear model involves:
1. A data vector $y = (y_1, \ldots, y_n)$
2. Predictors $X$ and coefficients $\beta$, forming a linear predictor $X\beta$
3. A link function $g$, yielding a vector of transformed data $\hat{y} = g^{-1}(X\beta)$ that are used to model the data
4. A data distribution, $p(y \mid \hat{y})$
5. Possibly other parameters, such as variances, overdispersions, and cutpoints, involved in the predictors, link function, and data distribution.
The options in a generalized linear model are the transformation $g$ and the data distribution $p$.
• In linear regression, the transformation is the identity (that is, $g(u) \equiv u$) and the data distribution is normal, with standard deviation $\sigma$ estimated from data.
• In logistic regression, the transformation is the inverse-logit, $g^{-1}(u) = \mathrm{logit}^{-1}(u)$ (see Figure 5.2a on page 80) and the data distribution is defined by the probability for binary data: $\Pr(y = 1) = \hat{y}$.
This chapter discusses several other classes of generalized linear model, which we list here for convenience:
• The Poisson model (Section 6.2) is used for count data; that is, where each data point $y_i$ can equal 0, 1, 2, .... The usual transformation $g$ used here is the logarithmic, so that $g(u) = \exp(u)$ transforms a continuous linear predictor $X_i\beta$ to a positive $\hat{y}_i$. The data distribution is Poisson. It is usually a good idea to add a parameter to this model to capture overdispersion, that is, variation in the data beyond what would be predicted from the Poisson distribution alone. -
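To make the Poisson model with a logarithmic transformation concrete, here is a minimal sketch that fits $\log \hat{y}_i = \beta_0 + \beta_1 x_i$ by Newton-Raphson on the Poisson log-likelihood. This is a generic textbook procedure and not the method or software used in the chapter; the function name, the fixed iteration count, and the example data are assumptions, and the sketch has no step-size safeguards.

```python
import math

def fit_poisson_log_link(x, y, iterations=25):
    """Estimate (b0, b1) in log(mu_i) = b0 + b1*x_i by Newton-Raphson."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum over i of (y_i - mu_i) * (1, x_i)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix: sum over i of mu_i * (1, x_i)(1, x_i)'
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, using the 2x2 inverse
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts observed at five covariate values.
x = [0, 1, 2, 3, 4]
y = [2, 2, 3, 4, 6]
print(fit_poisson_log_link(x, y))
```

The exponential of the fitted linear predictor is always positive, which is why the logarithmic transformation suits count data. Overdispersion, mentioned at the end of the excerpt, is not handled by this sketch.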
Smartphone Use and Academic Performance of University Students: a Mediation and Moderation Analysis
sustainability Article Smartphone Use and Academic Performance of University Students: A Mediation and Moderation Analysis Rizwan Raheem Ahmed 1,* , Faryal Salman 2, Shahab Alam Malik 1, Dalia Streimikiene 3,* , Riaz Hussain Soomro 2 and Munwar Hussain Pahi 4 1 Faculty of Management Sciences, Indus University, Block-17, Gulshan, Karachi 75300, Pakistan; [email protected] 2 Institute of Health Management, Dow University of Health Sciences, Mission Road, Karachi 74200, Pakistan; [email protected] (F.S.); [email protected] (R.H.S.) 3 Institute of Sport Science and Innovations, Lithuanian Sports University, Sporto str. 6, Kaunas 44221, Lithuania 4 College of Business Management, PAF-KIET University, Korangi Creek, Karachi 75190, Pakistan; [email protected] * Correspondence: [email protected] (R.R.A.); [email protected] (D.S.) Received: 3 December 2019; Accepted: 1 January 2020; Published: 6 January 2020 Abstract: The purpose of the undertaken study is to examine the influence of smartphones on the performance of university students in Pakistan. This paper also investigates the functions of a smartphone as exogenous predictors such as smartphone applications, multimedia messaging service (MMS), short message service (SMS), warp-speed processing, and entertainment on the academic performance of a student. This paper also addresses the impact of electronic word of mouth (eWOM) and attitude as mediating variables between exogenous and endogenous variables. Finally, we incorporated technology and addiction as moderating variables between independent variables and the outcome variable to measure the influence of moderating variables. We have taken 684 responses from seven universities in Pakistan and employed the SEM-based multivariate approach for the analysis of the data. -
Moderation Fundamentals
Moderation Fundamentals:
- Moderation refers to a change in the relationship between an independent variable and a dependent variable, depending on the level of a third variable, termed the moderator variable. Moderating effects are also referred to as interaction and conditioning effects.
* For two continuous variables, moderation means that the slope of the relationship between the independent and dependent variable varies (i.e., increases or decreases) according to the level of the moderator variable.
* For a continuous independent variable and a categorical moderator variable, moderation means that the slope of the relationship between the independent and dependent variable differs across the groups represented by the categorical moderator variable.
* For a categorical independent variable and a continuous moderator variable, moderation means that the differences between the group means represented by the levels of the categorical independent variable differ according to the level of the moderator variable.
* For two categorical variables, moderation means that the difference between the group means for the categorical independent variable differs depending on group membership on the moderator variable.
- When the predictor and moderator variables are continuous, a single product term is needed to capture the moderating effect. When one variable is continuous and the other is categorical, the required number of product terms is g – 1, where g equals the number of groups represented by the categorical variable. When both variables are categorical, the required number of product terms is (g1 – 1)(g2 – 1), where g1 and g2 are the number of groups represented by the two categorical variables.
- Interactions can range up to the kth order, where k represents the number of variables on the right side of the equation. -
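The product-term counts described above, and the idea that a slope varies with the level of a continuous moderator, can be sketched as two small helpers. Both function names are hypothetical, and the regression form Y = b0 + b_x X + b_z Z + b_xz XZ used in the second helper is the standard two-variable interaction model, assumed here for illustration.

```python
def product_terms_needed(groups_x=None, groups_z=None):
    """Number of product terms for an X-by-Z interaction.

    Pass None for a continuous variable, or the number of groups g for a
    categorical variable (which is coded with g - 1 indicator variables).
    """
    terms_x = 1 if groups_x is None else groups_x - 1
    terms_z = 1 if groups_z is None else groups_z - 1
    return terms_x * terms_z

def simple_slope(b_x, b_xz, z):
    """Slope of Y on X at moderator value z in Y = b0 + b_x*X + b_z*Z + b_xz*X*Z."""
    return b_x + b_xz * z

print(product_terms_needed())          # both continuous
print(product_terms_needed(None, 3))   # continuous X, 3-group moderator
print(product_terms_needed(3, 4))      # 3-group X, 4-group moderator
print(simple_slope(0.5, 0.2, 2.0))     # slope of X when the moderator equals 2
```

A nonzero coefficient on the product term is what makes the simple slope change with z; when it is zero the slope is the same at every level of the moderator.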
Bayesian Hierarchical Poisson Regression Model for Overdispersed Count Data
Bayesian Hierarchical Poisson Regression Model for Overdispersed Count Data
Overview
This example uses the RANDOM statement in the MCMC procedure to fit a Bayesian hierarchical Poisson regression model to overdispersed count data. The RANDOM statement, available in SAS/STAT 9.3 and later, provides a convenient way to specify random effects with substantially improved performance. Overdispersion occurs when count data appear more dispersed than expected under a reference model. Overdispersion can be caused by positive correlation among the observations, an incorrect model, an incorrect distributional specification, or incorrect variance functions. The example displays how Bayesian hierarchical Poisson regression models are effective in capturing overdispersion and providing a better fit. The SAS source code for this example is available as a text file attachment. In Adobe Acrobat, right-click the icon in the margin and select Save Embedded File to Disk. You can also double-click the icon to open the file immediately.
Analysis
Count data frequently display overdispersion (more variation than expected from a standard parametric model). Breslow (1984) discusses these types of models and suggests several different ways to model them. Hierarchical Poisson models have been found effective in capturing the overdispersion in data sets with extra Poisson variation. Hierarchical Poisson regression models are expressed as Poisson models with a log link and a normal variance on the mean parameter. More formally, a hierarchical Poisson regression model is written as
$$Y_{ij} \mid \mu_{ij} \sim \text{Poisson}(\mu_{ij})$$
$$\log(\mu_{ij}) = X_i \beta + \epsilon_{ij}$$
$$\epsilon_{ij} \sim \text{normal}(0, \sigma^2)$$
for $i = 1, \ldots, n$, $j = 1, \ldots, J$, and $y = 0, 1, 2, \ldots$. -
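One way to see why a normal random effect on the log mean captures overdispersion is to compute the marginal mean and variance of such a count directly. The sketch below uses the standard moments of a Poisson distribution mixed over a lognormal mean; it is not the SAS MCMC example itself, and the function name and parameter values are assumptions for illustration.

```python
import math

def lognormal_poisson_moments(eta, sigma):
    """Marginal mean and variance of Y when Y | eps ~ Poisson(exp(eta + eps))
    and eps ~ normal(0, sigma**2)."""
    mean = math.exp(eta + sigma ** 2 / 2)
    variance = mean + mean ** 2 * (math.exp(sigma ** 2) - 1)
    return mean, variance

# The variance-to-mean ratio is 1 for a plain Poisson and grows with sigma.
for sigma in (0.0, 0.5, 1.0):
    mean, variance = lognormal_poisson_moments(1.0, sigma)
    print(sigma, round(mean, 3), round(variance / mean, 3))
```

When sigma is zero the model reduces to an ordinary Poisson with variance equal to the mean; any positive sigma pushes the variance above the mean, which is the extra-Poisson variation the excerpt describes.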
The Great Moderation and the Relationship Between Output Growth and Its Volatility
The Great Moderation and the Relationship between Output Growth and Its Volatility WenShwo Fang Department of Economics Feng Chia University 100 WenHwa Road Taichung, TAIWAN [email protected] and Stephen M. Miller* College of Business University of Nevada, Las Vegas 4505 Maryland Parkway Las Vegas, Nevada, USA 89154-6005 [email protected] Abstract: This study examines the effect of the Great Moderation on the relationship between U.S. output growth and its volatility over the period 1947 to 2006. First, we consider the possible effects of structural change in the volatility process. In so doing, we employ GARCH-M and ARCH-M specifications of the process describing output growth rate and its volatility with and without a one-time structural break in volatility. Second, our data analyses and empirical results suggest no significant relationship between the output growth rate and its volatility, favoring the traditional wisdom of dichotomy in macroeconomics. Moreover, the evidence shows that the time-varying variance falls sharply or even disappears once we incorporate a one-time structural break in the unconditional variance of output starting 1982 or 1984. That is, the integrated GARCH effect proves spurious. Finally, a joint test of a trend change and a one-time shift in the volatility process finds that the one-time shift dominates. Keywords: Great Moderation, economic growth and volatility, structural change in variance, IGARCH JEL classification: C32; E32; O40 * Corresponding author 1. Introduction Macroeconomic volatility declined substantially during the past 20 years. Kim and Nelson (1999), McConnell and Perez-Quiros (2000), Blanchard and Simon (2001), Stock and Watson (2003), and Ahmed, Levin, and Wilson (2004), among others, document this Great Moderation in the volatility of U.S. -
Using Geographically Weighted Poisson Regression for County-Level Crash Modeling in California ⇑ Zhibin Li A,B, , Wei Wang A,1, Pan Liu A,2, John M
Safety Science 58 (2013) 89–97. Journal homepage: www.elsevier.com/locate/ssci
Using Geographically Weighted Poisson Regression for county-level crash modeling in California
Zhibin Li a,b, Wei Wang a, Pan Liu a, John M. Bigham b, David R. Ragland b
a School of Transportation, Southeast University, Si Pai Lou #2, Nanjing 210096, China
b Safe Transportation Research and Education Center, Institute of Transportation Studies, University of California, Berkeley, 2614 Dwight Way #7374, Berkeley, CA 94720-7374, United States
Article history: Received 25 December 2012; Received in revised form 11 March 2013; Accepted 13 April 2013
Keywords: Safety; Crash; County-level; Geographically Weighted Regression
Abstract: Development of crash prediction models at the county-level has drawn the interests of state agencies for forecasting the normal level of traffic safety according to a series of countywide characteristics. A common technique for the county-level crash modeling is the generalized linear modeling (GLM) procedure. However, the GLM fails to capture the spatial heterogeneity that exists in the relationship between crash counts and explanatory variables over counties. This study aims to evaluate the use of a Geographically Weighted Poisson Regression (GWPR) to capture these spatially varying relationships in the county-level crash data. The performance of a GWPR was compared to a traditional GLM. Fatal crashes and countywide factors including traffic patterns, road network attributes, and socio-demographic characteristics were collected from the 58 counties in California. Results showed that the GWPR was useful in capturing the spatially non-stationary relationships between crashes and predicting factors at the county level.