On Modified Skew Logistic Regression Model and Its Applications

STATISTICA, anno LXXV, n. 4, 2015 ON MODIFIED SKEW LOGISTIC REGRESSION MODEL AND ITS APPLICATIONS C. Satheesh Kumar1 Department of Statistics, University of Kerala, Trivandrum-695 581, India L. Manju Department of Community Medicine, Sree Gokulam Medical College, Trivandrum-695 607, India 1. Introduction Regression methods are usually used for studying the relationship between a re- sponse variable and one or more explanatory variables. Over the last decade the logistic regression model, also known as logit model, has become the standard method of analysis when the outcome variable is dichotomous in nature and it has been found applications in several areas of scientific studies such as bioassay problems (Finney, 1952), study of income distributions (Fisk, 1961), analysis of survival data (Plackett, 1959) and modelling of the spread of an innovation (Oliver, 1969). The main drawback of a logit model is that it consider variables of only symmetric and unimodal nature. But asymmetry may arise in several practical situations where the logit model is not appropriate. So through this paper we develop certain regression models based on a modified version of the skew-logistic distribution and compare it with the existing logit model as well as a regression model based on the skew-logistic distribution of (Nadarajah, 2009). The paper is organized as follows. In section 2, we describe some important aspects of the skew-logistic regression model(SLRM) and propose a modified version of the SLRM, which we termed as the “Modified Skew Logistic Regression Model (MSLRM)”. In section 3, we obtain some structural properties of the modified skew logistic distribution. In section 4, we consider the estimation of the parameters of the MSLRM and in section 5, two real life medical datasets are considered for illustrating the usefulness of the model compared to both the logit and skew logit models. In section 6, a generalized likelihood ratio test procedure is suggested for testing the significance of the parameters and a simulation study is also conducted to test the efficiency of the maximum likelihood estimators(MLEs) of MSLRM. 1 Corresponding Author. E-mail: [email protected] 362 S. Kumar and L. Manju 2. Skew Logistic and Modified Skew Logitic regression models A random variable X is said to follow the logistic distribution (LD) if its probability density function (p.d.f) is of the following form. e−x f(x)= , (1) (1 + e−x)2 where x ∈ R = (−∞, +∞). The logistic regression model(LRM) is given by 1 p = , (2) 1+ e−z in which s z = a + brXr. (3) r=1 X The importance of the logistic regression model is due to its mathematical flexibility and in several medical applications it provides clinically meaningful in- terpretations (Hosmer and Lemeshow, 2000). Nadarajah (2009) developed a modified version of the logistic distribution, similar to the skew-normal distribution of Azzalini (1985), namely skew-logistic distribution(SLD), which he defined through the following p.d.f, in which x ∈ R,β > 0 and λ ∈ R. − x 2e( β ) f(x)= (4) −x 2 −λx β 1+ e( β ) 1+ e( β ) h i h i When λ = 0, (4) reduces to p.d.f of the standard logistic distribution given in (1). The main feature of skew-logistic distribution is that a new parameter λ is included here for controlling the skewness and kurtosis. Now the skew-logistic regression model (SLRM) can be obtained through the following double series representation. ∞ ∞ − − (1+λ+λj+k)z 1 2 1 ( β ) 2 j k 1+λ+λj+k e if z< 0 j=0 =0 p = P P (5) ∞ ∞ −1 −2 − (1+λj+k)z 1 ( β ) 1 − 2 j k 1+λj+k e if z> 0 j=0 =0 P P where z is as defined in (3). Here we consider a modified version of the skew-logistic distribution namely “the modified skew-logistic distribution(MSLD)” through the following p.d.f, in which x ∈ R, α ≥−1 and β > 0. 2 e−x αe−βx f(x; α, β)= 1+ (6) α +2 −x 2 1+ e−βx (1 + e ) Clearly when α = 0 and/or β = 0, the p.d.f reduces to the p.d.f of the logistic distribution and when α = -1 the p.d.f reduces to the p.d.f of the skew-logistic On Modified skew logistic regression model 363 density given in (4). The p.d.f f(x; α, β) of the MSLD(α, β) can also be expressed interms of the following single series as well as double series representations, ∞ − − α P 1 e( 1+βj)x − ( j ) 2 e x j=0 −x + −x , if x< 0 α+2 (1+e )2 (1+e )2 f(x; α, β)= (7) ∞ − − α P 1 e (1+β+βj)x − ( j ) 2 e x j=0 α+2 (1+e−x)2 + (1+e−x)2 , if x> 0 ∞ ∞ ∞ 2 −2 (1+k)x −1 −2 (1+βj+k)x α+2 k e + α j k e , ifx< 0 "k=0 j=0 k=0 # f(x; α, β)= P P P (8) ∞ ∞ ∞ 2 −2 −(1+k)x −1 −2 −(1+β+βj+k)x α+2 k e + α j k e , ifx> 0 "k=0 j=0 k=0 # P P P The modified skew-logistic regression model(MLRM) is given by the following double series representation based on (8). ∞ ∞ ∞ 2 −2 e(1+k)z −1 −2 e(1+βj+k)z α+2 k (1+k) + α j k (1+βj+k) , if z< 0 "k=0 j=0 k=0 # P P P p = (9) ∞ ∞ ∞ − −(1+k)z − − −(1+β+βj+k)z 1 − 2 2 e + α 1 2 e , if z ≥ 0 α+2 k (1+k) j k (1+β+βj+k) "k=0 j=0 k=0 # P P P where z is as given in (3). For the derivation of (9), see Appendix I. A graphical representation of MSLRM for particular values of α and β = 2 is given in Figure 1. 3. Some structural properties of modified skew logistic model Proposition 1. If X follows MSLD(α, β), then Y = −X follows a convex mixture of standard logistic and skew-logistic distributions. Proof. The p.d.f f (y) of Y is the following , for y ∈ R , β ∈ R and α ≥−1. dx f (y)= f (−y; α, β) dy y βy 2 e α e = α+2 (1+ey )2 1+ 1+eβy h i − 2 e y α = α+2 (1+e−y )2 1+ 1+e−βy h i − − 2 e y 2α e y 1 − , = α+2 (1+e−y )2 + α+2 (1+e−y )2 1+e βy 364 S. Kumar and L. Manju Figure 1 – Plots of regression function of MSLD(α, 2) for different values of α . which shows that the p.d.f of Y can be considered as a convex mixture of the p.d.f of the standard logistic and and skew-logistic distributions. Proposition 2. If X follows MSLD(α, β), then Y = |X| follows a half logistic distribution. Proof. For y> 0, the p.d.f f (y) of Y is dx dx f (y)= f (y; α, β) dy + f (−y; α, β) dy −y −βy −y 2 e e 2 e α α − − = α+2 (1+e−y )2 1+ 1+e βy + α+2 (1+e−y )2 1+ 1+e βy h i h i −y 2e , = (1+e−y )2 which is the p.d.f of the half logistic distribution. Proposition 3. If X follows MSLD(α, β), then Y = X1/c,c ∈ (0, 1] follows a distribution with p.d.f c c 2c yc−1e−y e−βy f(y)= 1+ α α +2 −yc 2 1+ e−βyc (1 + e ) Proof. For any y > 0, the p.d.f f (y) of Y = X1/c is given by − − c − c 2 yc 1e y e βy dx f(y)= − c 1+ α − c α+2 (1+e y )2 1+e βy dy h i c−1 −yc − c 2c y e e βy = − c 1+ α − c α+2 (1+e y )2 1+e βy h i On Modified skew logistic regression model 365 Proposition 4. The first four raw moments of MSLD(α, β) are given by ∞ ∞ ′ 2α −1 −2 1 (−1) µ = + , (10) 1 α +2 j k 2 2 j=0 " (1 + β + βj + k) (1 + βj + k) # X kX=0 ∞ − ∞ ∞ 4 2 −1 −2 1 1 µ′ = 2 k + α + , (11) 2 α +2 (1 + k)3 j k 3 3 k=0 j=0 k=0 " (1 + β + βj + k) (1 + βj + k) # X X X ∞ ∞ 12α −1 −2 1 (−1) µ′ = + (12) 3 α +2 j k 4 4 j=0 "(1 + β + βj + k) (1 + βj + k) # X Xk=0 and ∞ − ∞ ∞ 48 2 −1 −2 1 1 µ′ = 2 k + α + . (13) 4 α +2 (1 + k)5 j k 5 5 k=0 j=0 k=0 " (1 + β + βj + k) (1 + βj + k) # X X X Proof. By using the double series representation of the p.d.f of the MSLD(α, β) as given in (8) we obtain its first raw moment as ∞ ′ µ1 = xf(x)dx −∞ R ∞ 0 ∞ ∞ 0 2 −2 (1+k)x −1 −2 (1+βj+k)x = α+2 k xe dx + α j k xe dx + "k=0 −∞ j=0 k=0 −∞ P∞ ∞R P∞ P∞ ∞R −2 −(1+k)x −1 −2 −(1+β+βj+k) k xe dx + α j k xe dx , k=0 0 j=0 k=0 0 # P R P P R which gives (10), by using the standard results of integration.

On Modified Skew Logistic Regression Model and Its Applications

Logistic Regression, Dependencies, Non-Linear Data and Model Reduction

Logistic Regression Maths and Statistics Help Centre

Generalized Linear Models

An Introduction to Logistic Regression: from Basic Concepts to Interpretation with Particular Attention to Nursing Domain

Variance Partitioning in Multilevel Logistic Models That Exhibit Overdispersion

Randomization Does Not Justify Logistic Regression

Chapter 19: Logistic Regression

An Introduction to Biostatistics: Part 2

Lecture 9: Logistic Regression (V2)

Logistic Regression, Part I: Problems with the Linear Probability Model

Generalized Linear Models Link Function the Logistic Equation Is

Binary Logistic Regression