<<

UNIVERSITY OF CALIFORNIA RIVERSIDE

Estimation of the Parameters of by Approximating the Ratio of the Normal Density and Distribution Functions

A Dissertation submitted in partial satisfaction of the requirements for the degree of

Doctor of Philosophy

in

Applied

by

Debarshi Dey

August, 2010

Dissertation Committee: Dr. Subir Ghosh, Chairperson Dr. Barry Arnold Dr. Aman Ullah

Copyright by Debarshi Dey 2010

The Dissertation of Debarshi Dey is approved:

______

______

______

Committee Chairperson

University of California, Riverside

ACKNOWLEDGEMENTS

I would take this opportunity to express my sincere gratitude and thanks to my advisor,

Professor Subir Ghosh for his continuous and untiring guidance over the course of the last five years. His sincere interest in not only my academic progress, but also my personal well-being, has been a source of sustained motivation for me.

I would also like to extend my sincere thanks to Professor Barry Arnold, of the

Department of Statistics, UC Riverside, and Professor Aman Ullah, of the Department of

Economics, UC Riverside, for graciously accepting to serve on my PhD Committee and for their valuable time and advice.

I wish to thank the entire faculty of the Department of Statistics for enriching me with their vast knowledge in various fields of Statistics.

I would like to thank the entire staff of the Department of Statistics who were always ready with their help.

My friends here at UCR, have also been a source of major strength and support for the last five years. Your friendship has made my experience at Riverside all the more memorable and fulfilling.

iv

A very special thanks to my wife, Trupti, for being with me and for being my pillar of strength. Though she joined me only a few months ago, her boundless affection, and immense support are invaluable to me in this accomplishment.

I would like to thank my parents, because it is for them and their hard work and sacrifice that made me achieve whatever little I have achieved. Without their constant support, and their immense faith in me, I might not have pursued and persevered. Together with them, my younger brother, Tukan, and my Grandmother, Danni, are equally committed to my success, and have taken immense pride in my accomplishments.

Finally I would like to express my most humble gratitude to my Divine Master,

Bhagawan Sri Sathya Sai Baba, without whose Grace and Benevolence I could not have achieved anything.

v

ABSTRACT OF THE DISSERTATION

Estimation of the Parameters of Skew Normal Distribution Using Linear Approximations of the Ratio of the Normal Density and Distribution Functions

by

Debarshi Dey

Doctor of Philosophy, Graduate Program in Applied Statistics University of California, Riverside, August 2010 Dr Subir Ghosh, Chairperson

The normal distribution is symmetric and enjoys many important properties. That is why it is widely used in practice. Asymmetry in is a situation where the normality assumption is not valid. Azzalini (1985) introduces the skew normal distribution reflecting varying degrees of . The skew normal distribution is mathematically tractable and includes the normal distribution as a special case. It has three parameters: location, scale and shape. In this thesis we attempt to respond to the complexity and challenges in the maximum likelihood estimates of the three parameters of the skew normal distribution. The complexity is traced to the ratio of the normal density and distribution function in the likelihood equations in the presence of the skewness parameter. Solution to this problem is obtained by approximating this ratio by linear and non-linear functions. We observe that the linear approximation performs quite

vi satisfactorily. In this thesis, we present a method of estimation of the parameters of the skew normal distribution based on this linear approximation. We define a performance measure to evaluate our approximation and estimation method based on it. We present the simulation studies to illustrate the methods and evaluate their performances.

vii

Contents

1. Introduction 1

1.1 Motivation and Historical Development...... 2

1.2 Normal Distribution and Simple ...... 5

1.3 Skew Normal Distribution and Regression...... 8

1.4 Thesis Description...... 10

viii

2. The Univariate Skew Normal Distribution 11

2.1 The Univariate Skew Normal Distribution...... 12

2.2 Moments of the Univariate Skew Normal Distribution...... 15

2.3 and Maximum Likelihood Estimates...... 20

2.4 Challenges of the Maximum Likelihood Estimates of the Skew Normal Dsitribution...... 30

2.5 Literature review on challenges...... 33

3. Approximations of the ratio of the Standard Normal Density

and Distribution Functions 35

3.1 Introduction...... 36

3.2 Motivation...... 38

3.3 Fitting the Linear Approximation to the Ratio...... 41

3.4 Fitting the Non-linear Approximation to the Ratio...... 46

4. Estimation of the of the Standard

Skew Normal Distribution 51

4.1 Introduction...... 52

4.2 The Estimation Procedure using A (z) ...... 53

ix

4.2.1 Case I : Covariate X Present...... 54

4.2.2 Case II : Covariate X Absent...... 56

4.2.3 Measure of ...... 57

4.3 The Estimation Procedure using B (z) ...... 58

4.3.1 Case I : Covariate X Present...... 59

4.3.2 Case II : Covariate X Absent...... 61

4.3.3 Measure of Goodness of Fit...... 63

4.4 A Simulated Data...... 64

4.4.1 Case I : Covariate X Present...... 65

4.4.2 Case II : Covariate X Absent...... 70

4.5 Estimating Bias and Accuracy in Approximations using Simulations...... 75

5. Estimation of Location, Scale and Shape Parameter of a Skew Normal Distribution 80

5.1 Introduction...... 81

5.2 Relations Among Estimated Parameters...... 82

5.3 Estimation Procedure using A (z) ...... 91

5.4 A simulated Data...... 99

5.5 Estimating Bias and Accuracy in Approximation using Simulations...... 104

6. Conclusion 124

Bibliography 127

x

List of Figures

2.1 The pdf of Z ~ SN(0,1,) ,   1,2 and10 ...... 14

2.2 Probability that Z < 0 for values of  ranging from 0 to 30...... 31

3.1 Plots of R (z) against z for   0.5, 1 and  2 ...... 37

3.2 Plots of A (z) against z for , and ...... 42

3.3 Plot of against z for   0.5 (continuous lines) and

A (z) against z for (dotted lines)...... 43

3.4 Plot of against z for   1 (continuous lines) and against z for   1 (dotted lines)...... 44

xi

3.5 Plot of R (z) against z for   2 (continuous lines) and

A (z) against z for   2 (dotted lines)...... 45

3.6 Plots of B (z) against z for   0.5, 1 and  2 ...... 47

3.7 Plot of against z for   0.5 (continuous lines) and

B (z) against z for (dotted lines)...... 48

3.8 Plot of against z for   1 (continuous lines) and against z for   1 (dotted lines)...... 49

3.9 Plot of against z for (continuous lines) and against z for (dotted lines)...... 50

4.1 Plots of both R (z) and Aˆ (z) against z when covariate X is present...... 67 1 1

4.2 Plots of both and Bˆ (z) against z when covariate X is present...... 69 1

4.3 Plots of both and against z when covariate X is absent...... 72

4.4 Plots of both and against z when covariate X is absent...... 74

5.1 Plot of the estimated log-likelihood lˆ  against the values of b [0,1) . . . . . 102

5.2 Plot of the estimated log-likelihood against the values of b[0.35,0.50] ...... 103

xii

List of Tables

ˆ ˆ2 ˆ 4.1. The values of Q1 ,, , Q3, and SD for  ,  ˆ , ˆ ,

and Ave A when  0  30, 1  5,   0.5, and   0.5 in approximating (3.1) by (3.2)...... 77

4.2. The values of Q1 ,Median, Mean, Q3, and SD for , , , and when , and  1.5 in approximating (3.1) by (3.2)...... 77

4.3. The values of Q1 ,Median, Mean, Q3, and SD for , , , and when  1.0, and in approximating (3.1) by (3.2)...... 77

4.4. The values of Q1 ,Median, Mean, Q3, and SD for , , , and when  1.0, and  1.5 in approximating (3.1) by (3.2)...... 78

xiii

ˆ 2 4.5. The values of Q1 ,Median, Mean, Q3, and SD for  , ˆ ˆ , ˆˆ ,

and Ave B when  0  30, 1  5,   0.5, and   0.5 in approximating (3.1) by (3.3)...... 78

4.6. The values of Q1 ,Median, Mean, Q3, and SD for , , , and when , and  1.5 in approximating (3.1) by (3.3)...... 78

4.7. The values of Q1 ,Median, Mean, Q3, and SD for , , , and when  1.0, and in approximating (3.1) by (3.3)...... 79

4.8. The values of Q1 ,Median, Mean, Q3, and SD for , , , and when  1.0, and  1.5 in approximating (3.1) by (3.3)...... 79

5.1 The true values of  , the sample size n, and the proportion of times the sign of is correctly determined...... 105

5.2 The values of Q1 ,Median, Mean, Q3, and SD for

b ,ˆ0 , ˆ1 , ˆ , ,ˆ when n=20, 0  30, 1  5,   0.05, and   0.5in approximating (3.1) by (3.2)...... 108

5.3 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, ,

and in approximating (3.1) by (3.2)...... 108

xiv

5.4 The values of Q1 ,Median, Mean, Q3, and SD for ˆ b ,ˆ0 , ˆ1 , ˆ ,  ,ˆ when n=100,  0  30, 1  5,   0.05, and   0.5in approximating (3.1) by (3.2)...... 109

5.5 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2)...... 109

5.6 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=20, , and  1.0in approximating (3.1) by (3.2)...... 110

5.7 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, , and  1.0in approximating (3.1) by (3.2)...... 110

5.8 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=100, , and in approximating (3.1) by (3.2)...... 111

5.9 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2)...... 111

5.10 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=20, , and   2.0 in approximating (3.1) by (3.2)...... 112

xv

5.11 The values of Q1 ,Median, Mean, Q3, and SD for ˆ b ,ˆ0 , ˆ1 , ˆ ,  ,ˆ when n=50,  0  30, 1  5,   0.05, and   2.0 in approximating (3.1) by (3.2)...... 112

5.12 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=100, , and in approximating (3.1) by (3.2)...... 113

5.13 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2)...... 113

5.14 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=20, , and   3.0in approximating (3.1) by (3.2)...... 114

5.15 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, , and   3.0in approximating (3.1) by (3.2)...... 114

5.16 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=100, , and in approximating (3.1) by (3.2)...... 115

5.17 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2). )...... 115

xvi

5.18 The values of Q1 ,Median, Mean, Q3, and SD for ˆ b ,ˆ0 , ˆ1 , ˆ ,  ,ˆ when n=20,  0  30, 1  5,   0.05, and   0.5in approximating (3.1) by (3.2)...... 116

5.19 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, , and   0.5in approximating (3.1) by (3.2)...... 116

5.20 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=100, , and in approximating (3.1) by (3.2)...... 117

5.21 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2)...... 117

5.22 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=20, , and   1.0in approximating (3.1) by (3.2)...... 118

5.23 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, , and   1.0in approximating (3.1) by (3.2)...... 118

5.24 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=100, , and in approximating (3.1) by (3.2)...... 119

xvii

5.25 The values of Q1 ,Median, Mean, Q3, and SD for ˆ b ,ˆ0 , ˆ1 , ˆ ,  ,ˆ when n=500,  0  30, 1  5,   0.05, and   1.0in approximating (3.1) by (3.2)...... 119

5.26 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=20, , and   2.0in approximating (3.1) by (3.2)...... 120

5.27 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, , and   2.0in approximating (3.1) by (3.2)...... 120

5.28 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=100, , and in approximating (3.1) by (3.2)...... 121

5.29 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2)...... 121

5.30 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=20, , and   3.0in approximating (3.1) by (3.2)...... 122

5.31 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=50, , and   3.0in approximating (3.1) by (3.2)...... 122

xviii

5.32 The values of Q1 ,Median, Mean, Q3, and SD for ˆ b ,ˆ0 , ˆ1 , ˆ ,  ,ˆ when n=100,  0  30, 1  5,   0.05, and   3.0in approximating (3.1) by (3.2)...... 123

5.33 The values of Q1 ,Median, Mean, Q3, and SD for , , , , , when n=500, , and in approximating (3.1) by (3.2)...... 123

xix

Chapter 1

Introduction

1

1.1 Motivation and Historical Development

In drawing statistical inferences under the parametric framework, we assume a distribution that describes the data in the best possible way. The celebrated Gaussian distribution is the most popular distribution for describing a data. Its popularity has been driven by its analytical simplicity, associated , its multivariate extension- both the marginals and conditionals being normal, additivity and other properties. However, there are numerous situations, where the Gaussian distribution assumption may not be valid. Alternatively, many near normal distributions have been proposed (Mudholkar and Hutson (2000), Turner(1960), Prentice ( 1975) and

Azzalini ( 1985)). These families describe the variations from normality, share some desirable properties of normal distributions to some extent, and also include the normal distribution as a special case. Under many situations when the data cannot be satisfactorily modeled by a normal distribution, these para metric distributions provide alternatives in drawing inferences.

Some of these families deal with the deviations from symmetry in the normal distribution. They are analytically tractable, contain reasonable degrees of skewness and

2 , and include the normal distribution as a special case. One such distribution is the skew normal distribution, which was proposed by Azzalini (1985). While the normal distribution with its symmetry has only location and scale parameters, the skew normal distribution has an additional shape parameter describing the skewness. From practical standpoint, this is a very desirable property, where in many real life situations, some skewness is always present in the data. In addition, the skew normal distribution shares many important properties of the normal distribution: for example, the skew normal densities are unimodal, their support is the real line, and the square of a skew normal has the Chi-square distribution with one degree of freedom.

Arnold et al. (1993) provided the following motivation for the skew normal model.

Let X be the GPA and Y be the SAT score of students who want to be admitted in colleges. We assume (X,Y) to follow a bivariate normal distribution. This implies the marginal distribution of both X and Y are normal. Suppose we consider a college, where the admitted students have an above average SAT score. If we consider the X values of the admitted students in that college, the distribution of the X values follows a skew normal distribution.

Although the skew normal distribution has some attractive properties, there are some complexities and challenges present in drawing the inference on its parameters.

Detailed discussions on these issues can be found in Sartori (2006), Dalla Valle (2004),

Pewsey (2000), Monti (2003), and others. Different methods in overcoming these issues

3 are presented by Azzalini and Capatitanio (1999), Liseo and Loperfido (2002), Ana Clara

Monti (2003) , Sartori (2005).

This thesis identifies a source of the complexity in the maximum likelihood estimation of the skew normal parameter. The source is in fact the ratio of a normal density and distribution functions, in the presence of location, scale and shape parameters of the skew normal distribution. We propose approximations of this ratio by linear and non-linear functions. We observe that the linear approximation performs quite satisfactorily in approximating the complex function. Hence, we used the linear approximation for our estimation procedure. We consider a linear regression setup, where the of the skew normal distribution is a linear function of a covariate X.

4

1.2 Normal Distribution and Simple Linear

Regression

The normal distribution is widely used in describing data in many applications. The distribution is symmetric with mean  and  , and approximately

99.7% of the distribution lies in the   3 . The function of a normal random variable X is

2 1  1  x     f (x;, )  exp    ,    x  . (1.1)  2  2    

We denote, X ~ N(,) .

We consider a setup, with Y as the response variable and X as the predictor variable.

A simple linear regression model with the normality assumption is

Yi   0   1xi   i , (1.2) where,

Yi is the response variable,

xi is the predictor variable,

 0 is the intercept parameter,

5

 1 is the slope parameter, and

 are i.i.d. , for . i N(0, ) i 1,...,n

The unknown parameters are  0 , and  .

x Y N(  x , ) Thus, for fixed values of i , the observed i ’s are independent 0 1 i ,

.

The joint probability distribution function of Y1,Y2 ,...,Yn , is then given by

L 0 , 1 , y1 ,..., yn  f (y1,..., yn ; 0 , 1,)

n   f (yi ; 0 , 1, ) i1

n  n 2   1   1  yi   0   1 xi      exp    . (1.3)  2   2 i1    

The log-likelihood function is given by

n 2 n 1  yi   0   1 xi  log L 0 , 1, y1,..., yn   nlog  log2    . (1.4) 2 2 i1   

The likelihood equations for  0 , 1 , and  are obtained by maximizing the log-likelihood equation given in (1.4) with respect to and respectively.

6

The likelihood equations are given by:

n yi   0   1 xi   0, (1.5) i1

n  xi yi   0   1 xi   0, (1.6) i1 and

n 2 2 yi   0   1xi   n . (1.7) i1

Solving (1.5)-(1.7) we get the maximum likelihood estimators of  0 , 1 , and  as :

n xi  xyi  y ˆ i1  1  n , (1.8) 2 xi  x i1

ˆ0  y ˆ1x , (1.9) and

1 n ˆ 2 ˆ ˆ 2   yi   0   1 xi  . (1.10) n i1

The fitted values of yi ’s , i 1,...,n , are then

yˆi  ˆ0  ˆ1 xi . (1.11)

The residuals are given by yˆ i  yi  , .

7

1.3 Skew Normal Distribution and Regression

We present the skew normal distribution and its properties in detail in Chapter 2.

Here we introduce the skew normal random variable and an outline of our research.

A random variable Y is said to have a skew normal distribution with location parameter  ,  , and skewness parameter  , if its probability density function is given by:

2  y     y    f (y;, , )     ,    y   , (1.12)        where, (.) is the probability density function and .is the cumulative distribution function.

We denote, Y ~ SN(,,).

We consider a simple linear regression setup, with Y as the response variable and X as the predictor variable.

A simple linear regression model with the skew-normality assumption is

Yi   0 1xi i , (1.13) where,

Yi is the response variable,

8 xi is the predictor variable,

 0 is the intercept parameter,

 1 is the slope parameter, and

 N(0, ) i are i.i.d. , for i 1,...,n .

The unknown parameters are , ,  and  .

Thus, for fixed values of , the observed Yi ’s are independent SN( 0 1xi ,,) ,

.

We have four unknown parameters of interest, namely, , , and . In this thesis, we resolve the challenges and the complexities involved with the maximum likelihood estimation method for estimating the parameters , , and , and present a new method for solving the maximum likelihood estimating equations.

9

1.4 Thesis Description

In Chapter 2, we present the skew normal distribution and its properties, its moments, likelihood function and maximum likelihood estimates, and the associated

Information matrix. We also discuss the challenges of the maximum likelihood estimates and present a review of the literature on the challenges.

In Chapter 3, we identify the ratio of normal density and distribution functions in the presence of the shape parameter as the source of the complexity in estimating the parameters, and propose a linear and non-linear approximation of the ratio.

In Chapter 4, we present a procedure for the estimation of the shape parameter of the skew normal distribution, assuming that the location and the scale parameter are known. We also evaluate the performance of the estimation procedure by a simulation study.

In Chapter 5, we present a procedure for the estimation of the location, scale and the shape parameter of the skew normal distribution. Here we use only the linear approximation of the ratio for our estimation procedure. We evaluate the performance of the estimators by a simulation study.

10

Chapter 2

The Univariate Skew Normal

Distribution.

11

2.1 The Univariate Skew Normal

Distribution

In this Section we present the univariate skew normal distribution, introduced by

Azzalini (1985). We then discuss several properties and moments of this distribution.

Definition 1. A random variable Z is said to have a standard normal distribution if its probability density function (pdf) is given by

1 2 (z)  ez / 2 ,   z   . (2.1) 2

We describe a standard normal random distribution by Z ~ N(0,1) .

The cumulative distribution function (cdf) of a standard normal random variable will be denoted by ., where

z (z)   udu . (2.2) 

12

Definition 2. [Azzalini(1985)] A random variable X is said to have a skew normal distribution if its pdf is given by

f (x)  2(x)(x) ,    x  , (2.3) where  , a real number, is the skewness parameter, . and . are the standard normal pdf and cdf respectively, as given in (2.1) and (2.2).

For brevity we write Z ~ SN(0,1,).

The cdf of a skew normal distribution is denoted by z; , where

z  z;  f u du       . (2.4) 

Figure 2.1 shows the shape of the pdf (2.3) for three values of the skewness parameter  , namely 1, 2 and 10. We observe that the shape of the pdf becomes increasingly skewed to the right with the increase in the value of . When   1, the shape becomes slightly skewed to the right, and when  10, the shape becomes close to the pdf of a half normal random variable. The right-tails of these distributions for the above three values of become virtually indistinguishable for the values of z greater than 2.

13

Figure 2.1 The pdf of Z ~ SN(0,1,) ,  1,2 and10 .

We now state the properties P1-P5 of :

P1. When   0 , Z ~ N(0,1) ,

P2. As   , f (z) tends to 2(z)I Z 0 , which is the half normal pdf,

14

P3.  Z ~ SN(0,1,) ,

P4. z; 1  z;,

P5. log f (z)is a concave function of z and hence the pdf f (z) is a unimodal function

of z,

2 2 P6. Z ~ 1 .

2.2 Moments of the Univariate Skew

Normal Distribution.

In this Section we derive the generating function and the moments of a skew normal random variable.

Lemma 1. Let Z be a standard normal random variable and let h and k be real numbers.

Then

 k  EhZ  k   for all h and k, (2.5)  1 h 2 

where . is the standard normal distribution function.

15

Proof. Let Z be a standard normal random variable.

For any real h and k, we write EhZ  k as

 h,k  EhZ  k hz  kzdz . (2.6) 

Differentiating (2.6) with respect to k we get:

h,k   hz  kzdz . k 

 1  1 2 2    exp hz  k  z dz . 2   2 

2 1  k 2    (1 h 2 )  hk    exp  exp  z   dz 2  2   2 2 .  2(1 h )    1 h  

 hk  Letting u  1 h2  z   , we have  1 h2 

h,k 1  k 2  1   u 2   exp  exp du 2  2   . k 2 1 h  2(1 h )  2   2 

1  k     . 2  2  1 h  1 h 

16

Now integrating with respect to k , we have

 k  h,k    .  1 h 2 

This proves the lemma.

Theorem 1. When Z ~ SN(0,1,) , the moment generating function of Z is

 t 2   t  M Z (t)  2exp   . (2.7)    2   2   1  

 M (t)  E exp tz  2 exp tz (z)(z)dz Proof: Z       

2   t  1  1 2   2exp   exp z  t (t)dt  2  2  2 

2   t  1  1 2   2exp   exp u u  tdu  2  2  2 

 t 2   2exp Eu  t, where U ~ N(0,1) .  2 

From Lemma 1 we get

 t 2   t  M Z (t)  2exp   .    2   2   1  

17

The first moment of a skew normal random variable Z is given by

  t 2    t   t 2   t  EZ   M '(0)  2exp     t exp     2  2     2  2 1  1  2 1           t0

  2 0 1 2

2   .  1 2

The second moment of a skew normal random variable Z is given by

2   t 2     t   t 2    t  EZ 2   M ''(0)  2exp   '   2t exp      2  2   2   2  2  2     1    1     1   1  

 t 2   t   t 2 1exp      2  2 1      t0

 20

 1.

We get

2  EZ   , (2.8)  1 2

18

and

22 VarZ   1 . (2.9)  1 2 

In practice, it is common to work with a location and scale transformation Y   Z , where  is a real number and   0 . Hence the density for the random variable Y distributed as SN(,,) is

2  y     y    f (y;, , )     ,    y   . (2.10)       

The expectation and of Y are given by

2  EY     , (2.11)  1 2 and

2 2  2  VarY    1 2 . (2.12)   1  

19

2.3 Likelihood function and maximum

likelihood estimates.

Let y1 , y2, …, yn be an independent and identically distributed sample from

SN(,,) , where  ,  and  are unknown and , and  ( 0) are real numbers.

Then the likelihood function is:

2n n  y     y    L(, ,)   i   i n      . (2.13)  i1      

The log-likelihood function is given by

l,,  log L,,

n  y    n  y     nlog 2  nlog  log i   log i . (2.14) i1    i1   

We define

 y     i     W y   . (2.15) i  y     i    

20

The likelihood equations for  ,  and  are obtained by maximizing the log- likelihood function given in (2.14) with respect to , and respectively. The likelihood equations obtained by taking partial derivatives of the likelihood equations with respect to , and are:

n n l,,  yi        W(yi )  0, (2.16)  i1    i1

n 2 n l,,  yi     yi     n      W(yi )  0 , (2.17)  i1    i1   

n l,,  yi      W(yi )  0. (2.18)  i1   

Let ˆ ,ˆ andˆ be the solutions for  , and  of the equations (2.16)-(2.18).

We write

 y  ˆ  ˆ i  ˆ Wˆ y     . (2.19) i  y  ˆ  ˆ i   ˆ 

Then from (2.16)-(2.18), we get

n ˆ n  yi    ˆ ˆ    W yi , (2.20) i1  ˆ  i1

21

n ˆ ˆ  yi    W yi    0 , (2.21) i1  ˆ 

2 n  y  ˆ   i   n . (2.22) i1  ˆ 

The Fisher Information Matrix is given by,

   2    2    2  E ln L , ,  E ln L , ,  E ln L , ,    2                      2    2    2  I , ,   E ln L , ,  E ln L , ,  E ln L , ,  .        2                   2    2    2   E ln L , ,  E ln L , ,  E ln L , ,          2              

We let

2 p  , 

 Y    Z    , and   

 2  k Z   ak  EZ   , k=0,1,2.  Z  

22

Now we derive the elements of the Fisher-Information matrix.

First we have

  2        E ln L , ,   E  ln L ,,   2             

   Y        i   n n        Yi          E      2    i1     i1  Yi              

n n    Zi  Zi    E       i1   i1 Zi 

2  n 2 n Z Z   Z     E   i i   i    2  2  Z   Z   i1  i  i  

n na 2 2 n  Z Z    0  i i 2Z Z dZ 2 2 2   i i i    i1  Zi 

n na 2 23 n  1  Z 2 1 2    0  Z exp  i dZ 2 2 2   i   i .    i1 2  2 

2 2 2 Let u  Zi 1  .

2 Hence udu  (1  )Zi dZi .

23

Thus we have

2 2 2 3  u    n na  2 n 1   E ln L,,   0  ue 2 du  2  2 2 2   2 .       i1 2 1  

It is known that the first moment of a standard normal random variable is zero.

Thus,

  2  n na 2  E ln L ,,    0  2   2 2 .     

Next we have

  2    2   E ln L, ,  E ln L,,      

      E  ln L, ,    

   Y        i   n n        Yi          E      2    i1     i1  Yi              

n n    Zi  Zi    E       i1   i1 Zi 

24

2  n  2Z  n   Z   Z    i 2 2  i   i    E   Z i 1  Z i      2  2  Z   Z   i1 i1  i  i  

3 n  2 2np  Zi Zi    2Zi Zi dZ i 2 2 2    1   i1  Zi 

 n  Z  n2a  i 2Z Z dZ  1 2   i i i 2  i1 Zi  

3 n  2 2 2np 2 1 2  Zi 1     Zi exp dZ i 2 2 2    1   i1 2  2 

2 n  1  Z 2 1 2  n2a  exp  i dZ  1 2     i 2  i1 2  2  

3 n  2 2np 2 1 2  u    u exp du 2 2 2   2 3/ 2  1   i1 2 1    2 

2 n  1  u 2  n2a  exp du  1 2   2 2  i1 2 1   2  

It is known that the second moment of a standard normal random variable is one.

Thus,

  2  2np n3 p np n2a  E ln L ,,      1    3/ 2 2     2 1 2  2 1 2   2 1 2 

25

2 2 np1 2  n a1  3/ 2  2 .  2 1 2  

Next we have,

  2    2   E ln L, ,  E ln L,,      

      E  ln L, ,    

   Y        i   n        Yi         E     i1     Yi              

2  n    1 2 2 Zi   Zi     E Zi  1  Zi      Z   Z   i1  i  i  

2 n  Z 2Z    i i 2 Z  Z dZ    i   i  i  i1  Zi 

1 n  Z  na  i 2 Z  Z dZ  1    i   i  i  i1 Zi  

2  np np na1   3/ 2    1 2   1 2 

26

  n  p    a1 .  2 3/ 2    1   

Next we have,

   Y        i  2  n 2 n          2n 1 Yi      Yi         E ln L, ,  E         2             i1     i1     Yi              

2  n 2 n    2n 3Zi  2 3 Zi  2  Zi     E    Zi  2Zi   Zi     2   2  2  Z   Z   i1 i1  i  i  

Using the fact that the odd moments of a standard normal are zero and the fact that the second moment is one, we have

  2  n  E ln L , ,   1 2a  2   2  2 .    

27

Next we have,

  2    2   E ln L,,  E ln L, ,      

      E  ln L, ,    

   Y        i   n        Yi         E     i1     Yi              

2  n    1 2 3 Zi  2  Zi     E  Zi  Zi   Zi      Z   Z   i1  i  i  

Using the fact that the odd moments of a standard normal are zero and the fact that the second moment is one, we have

2    na2  E ln L, ,   .    

Finally we have,

   Y        i  2  n           Yi         E ln L, ,  E    2         i1     Yi              

28

2  n    ZiZi  2  Zi     E   Zi     Z   Z   i1  i  i  

 na2 .

Hence the Fisher Information matrix is given by:

  2 np 1 22 n2 a n  p   n1  a0    1    3 / 2  a1   2 2 2 3 / 2 2 2    1     1     2 2   np1 2  n a1 n 2 na2  I  3 / 2  2 2 1  a2   . (2.21)  2 1 2       na na2   n  p   2    a1     2 3/ 2   1    

29

2.4 Challenges of the Maximum

Likelihood Estimates of the

Univariate Skew Normal Distribution.

We now discuss the two main problems that arise in the maximum likelihood estimation of the parameters of the skew normal distribution.

Firstly, the likelihood function with respect to  might be unbounded.

Consequently, the estimate of becomes infinite, though in reality is finite. When the sample size n is small, this situation arises more frequently. We explain this situation for

SN(0,1,) . Let be z1 , z2 , …, zn be a random sample from Z ~ SN(0,1,) . From (2.14), we can write the log-likelihood equation as

n n 1 2 l(0,1,)  ln L0,1,    zi  log zi  (2.22) 2 i1 i1

30

When z1 , z2 , …, zn are all positive , l(0,1,) is an increasing function of  . Hence the estimate of  becomes unbounded. The of unbounded estimates of  decreases as the sample size n increases. For instance, if   5 and n=20, we have the probability 0.273 of having all positive observations. The probability decreases to 0.002 for n=100. But for large values of  , the probability of getting an unbounded estimate of is still quite high.

Figure 2.2 Probability that Z < 0 for values of  ranging from 0 to 30.

31

For Z ~ SN(0,1,) , we calculate P(Z < 0) for  ranging from 0 to 30, and present them in Figure 2.2. As increases, the probability for the observations to be all positive in a small sample is very high. The reverse is true when  is negative, that results in a sample with all negative values.

Secondly, the Information matrix becomes singular when   0 . This singularity can be traced to the parameter redundancy of the parameterization for the normal case, a fact identified using the results of Catchpole & Morgan (1997). They identify an model as being parameter redundant if the mean can be expressed using a reduced number of parameters. From equation (1.3.7) , E(Y) is a function of all three parameters  , and , whereas for , it is just a function of one parameter, .

32

2.5 Literature Review on Challenges.

In this section, we are reviewing the different procedures described in the literature for the situations with unbounded estimates of  .

In order to deal with the problem of unbounded estimates of , in the SN model,

Azzalini and Capatitanio (1999) propose to stop the maximization procedure when the log-likelihood value is not significantly lower than the maximum.

Sartori (2005) proposes to reduce the asymptotic bias of the maximum likelihood estimate by of a penalization to the likelihood function. He proposes a two step procedure for estimating the parameters of SN Distribution. Initially the estimators

ˆ and ˆ of  and  are computed. Then in the second step,   ˆ and   ˆ are fixed, and then a bias preventive method proposed by Firth (1993) is applied to the score function of the skewness parameter in order to give a finite estimate.

33

Liseo and Loperfido (2002) performs a default Bayesian analysis for the skew normal distribution, to show that the Jeffrey’s prior to  is proper.

Ana Clara Monti (2003) uses the minimum chi-square method proposed by Neyman

(1949) to estimate parameters of the distribution of discrete or .

In the SN model, the singularity of the information matrix can be removed by a certain parameterization of the parameters. (Azzalini,1985; Azzalini and Capitanio,

1999; Chiogna, 1997; Pewsey, 2000).

34

Chapter 3

Approximations of the Ratio of the

Standard Normal Density and

Distribution Functions

35

3.1 Introduction.

In this chapter we introduce a linear and a non-linear approximation of the ratio of the standard normal density and distribution functions in the presence of an unknown constant representing the shape of the skew normal distribution. The purpose of this approximation is to estimate the skew normal shape parameter.

In (2.15), we have defined the ratio of the standard normal density and distribution functions W(y) as

 y       W(y)    .  y        

y   We write z  and W(y) as R (z) , where  

z R (z)  . (3.1)  z

36

2 The numerical value of R (0) is .  

Figure 3.1 is showing the graphs of R (z) against z for   0.5, 1 and  2 . The graphs in Figure 3.1 intersect at z  0. It is seen that the slope of is positive when  takes negative values and negative when takes positive values. It is also noticed that the magnitude of the slope increases for bigger values of  .

Figure 3.1. Plots of against z for , and .

37

For a given  , we want to approximate R (z) by the following linear and non-  linear function for  3  z  3 ,

2 A (z)   z , (3.2)  

2 B (z)    expz1, (3.3)   where  and  are unknown constants and

1 if   0;    1 if   0.

These approximations become weaker for the values of z satisfying z  3 .

3.2 Motivation.

We consider a random variable Y satisfying

Y   Z , (3.4) where,  is a real number and   0 and Z ~ SN(0,1,) .

The random variable Y is distributed as SN(,,) .

38

Now we consider two cases:

i.  changes with a covariate X and we write

   0   1x , (3.5)

where x is a given value of the covariate X.

ii. No such covariate X is available.

Case I. Covariate X Present

The random variable Y in this case is distributed skew normal with density

2  y   0   1x   y   0   1 x  f (y;, 0 , 1, )     ,    y   . (3.6)       

We denote the distribution with the density in (3.6) as SN( 0   1x,,) .

We now consider n independent observations yi , xi from the skew normal distribution with density (3.6).

y     x We denote z  i 0 1 i , i 1,...,n , and i 

z  R (z )  i  i . zi 

39

The Maximum Likelihood Equations can then be expressed as

n n

 zi   R  zi , (3.7) i1 i1

n n

 xi zi   xi R zi , (3.8) i1 i1

n

 zi R zi   0, (3.9) i1

n 2  zi  n . (3.10) i1

Case II. Covariate X Absent

This situation is discussed in section 2.2.

The maximum likelihood equations can be expressed as:

, (3.11)

, (3.12)

. (3.13)

40

We observe that the complexity in the equations (3.7)-(3.10) and in (3.11)-(3.13)

is due to the presence of the complex function R (z) . We propose to deal with this

A (z) B (z) complexity by approximating by  and  as given in (3.2) and (3.3).

We use equations (3.7)-(3.10) and (3.11)-(3.13) to estimate the parameters  in

,  in , and  in B (z) based on the values of zi , i 1,...,n.

3.3 Fitting the Linear Approximation to the

ratio.

 R (z) In Section (3.1) we have seen that for a given ,  is approximated by

as given in (3.2). In this section we examine how well the linear function A (z) approximates .

In Figure 3.2 we plot A (z) against z for   0.5, 1 and  2 . We consider  to take the following values:  0.284 ,  0.450 and  0.569 for , and

41 respectively. As in Figure 3.1, the graphs in Figure 3.2 intersect at z=0 and the slope of

A (z) is positive when  takes negative values and negative when takes positive values. Also the magnitude of the slope increases for bigger values of  .

Figure 3.2. Plots of A (z) against z for   0.5, 1 and  2 .

R (z) R (z) To compare  and , we plot  (continuous line) and (dotted line) against z values for in the same graph. We present the plots in Figure 3.3.

We draw similar graphs for   1 and respectively in figure 3.4 and 3.5. We

42 consider   0.284 ,  0.450 and  0.569 for   0.5, 1 and  2 respectively.

We observe that, in all the three graphs, the linear function A (z) quite accurately

approximates R (z) for the z values considered.

Figure 3.3. Plot of against z for   0.5 (continuous lines) and

against z for   0.5 (dotted lines).

43

Figure 3.4. Plot of R (z) against z for   1 (continuous lines) and

A (z) against z for   1 (dotted lines).

44

Figure 3.5. Plot of R (z) against z for   2 (continuous lines) and

A (z) against z for   2 (dotted lines).

45

3.4 Fitting the Non-Linear Approximation to the

ratio.

 R (z) In Section (3.2) we have seen that for a given ,  is approximated by

B (z) as given in (3.3). In this section we examine how well the linear function B (z) approximates .

In Figure 3.6 we plot against z for   0.5, 1 and  2 . We consider

 to take the following values: 0.3620 , 0.7768 and 1.10810 for , and

respectively. As in Figure 3.1, the graphs in Figure 3.6 intersect at z = 0 and the

slope of B (z) is positive when takes negative values and negative when takes positive values. Also the magnitude of the slope increases for bigger values of  .

46

Figure 3.6. Plots of B (z) against z for   0.5, 1 and  2 .

R (z) R (z) To compare  and , we plot  (continuous line) and (dotted line) against z values for in the same graph. We present the plots in Figure 3.7.

We draw similar graphs for   1 and respectively in figure 3.8 and 3.9. We consider   0.3620 , 0.7768 and 1.10810 for   0.5, and respectively. We observe that, in all the three graphs, the non-linear function quite accurately

approximates R (z) for the z values considered.

47

Figure 3.7. Plot of R (z) against z for   0.5 (continuous lines) and

B (z) against z for (dotted lines).

48

Figure 3.8. Plot of R (z) against z for   1 (continuous lines) and

B (z) against z for   1 (dotted lines).

49

Figure 3.9. Plot of R (z) against z for   2 (continuous lines) and

B (z) against z for   2 (dotted lines).

50

Chapter 4

Estimation of the Shape parameter of the Standard Skew Normal

Distribution

51

4.1 Introduction.

In Chapter 3 we have discussed that the complexity in solving the maximum likelihood equations arise from the presence of the ratio of the normal density and

distribution function R (z) in the likelihood equations. We propose to solve this

complexity by approximating first by a linear function A (z) as given in (3.2) and

then by a non-linear function B (z) as given in (3.3). We have seen in chapter 3 that when we consider a covariate X to be present, we have four parameters of interest,

namely,  0 ,1, and  , and when we consider no covariate to be present, we have

three parameters of interest, namely  0 , and . In this chapter we assume  0 , 1 and

 to be known. The unknown parameters then are and  , when we consider the linear approximation for , and and  , when we consider the non-linear approximation for .

When we consider a covariate X to be present, we have a data set of n pairs of

observations (xi , yi ) ,i 1,...,n, where xi ’s are observations of the covariate X and yi ’s

are the observed values of the dependent variable Y, where, Y ~ SN( 0   1x,,) .

With the known values of we get the observed values of zi , where,

y     x z  i 0 1 i , i 1,...,n , from the observed values of . Now we have pairs of i 

52 observations (xi , zi ) . When we consider no covariate to be present, we have n

observations yi , i 1,...,n, from random variable Y, where, Y ~ SN( 0 ,,) . Assuming

y    and  to be known, we transform , to z , where, z  i 0 . 0 i i 

Based on these observations, we discuss an estimation procedure for estimating

the parameters  in R (z) ,  in A (z) , and  in B (z) , both when a covariate X is present and when it is absent. From the estimates of , and we obtain the estimates of ,  and  . We also present the performance of our estimation procedure with simulation results.

4.2 The estimation Procedure using

In this chapter we assume that the parameters  0 , 1 and  are known. In this section we present the estimation procedure for estimating the parameters  and

using the linear approximation given in (3.2) for given in (3.1), in the presence and absence of a co-variate X. In Section 4.2.1 we consider the estimation procedure in the presence of a co-variate X and in Section 4.2.2 we consider the estimation procedure in the absence of any co-variate. In Section 4.2.3 we discuss about a measure of goodness of fit to observe how the function performs in approximating .

53

4.2.1 Case I. Covariate X Present

We have n pairs of observations (xi , yi ) , i 1,...,n , where xi , i 1,...,n, are

observations of the covariate X and yi , i 1,...,n, are the observed values of the

Y ~ SN(   x,,) dependent variable Y, where, 0 1 . From the ’s we calculate the

y     x z ’s, where, z  i 0 1 i , . Here we note that, though we assume that i i 

the parameters  0 , 1 and  are known, we present the estimating equations with

respect to  0 , 1, and  , so that we can use them to estimate the unknown parameters and  .

We assume R (z)  A (z) from (3.1) and (3.2). Then the estimating equations

with respect to  0 , 1, and  can be written as:

n n  2  z     z  , (4.1)  i  i  i1 i1   

n n  2  x z   x   z  , (4.2)  i i  i  i  i1 i1   

n  2  z   z   0, (4.3)  i  i  i1   

54

n 2  zi  n . (4.4) i1

Simplifying (4.1)-(4.3), and using (4.4), we can write them in the following form:

 n   n   n  zi  2  zi      i1    i1  n n    n  x x z    x z  (4.5)  i  i i    i i  i1 i1  2  i1  n     0        zi n    i1   

The equation (4.5) can be expressed in the general form

W   , (4.6)

 n   n   n  zi   2    zi   i1     i1  n n    n where, W   x x z  ,   and    x z  .  i  i i     i i  i1 i1  2  i1  n     0         zi n    i1   

We assume Rank(W)  2 , and the estimates of  and 2 can be expressed in the general form (Rao(1973))

ˆ  W'W 1W' . (4.7)

From (4.7), we get the estimates ˆ and ˆ2ˆ , from where we can calculate and ˆˆ .

ˆ ˆ We denote A (z) at   ˆ by A (z) .

55

4.2.2 Case II. Covariate X Absent

Here we have n observations yi , i 1,...,n, from the random variable Y, where,

y   Y ~ SN( ,,) . From the ’s we calculate the z ’s, where, z  i 0 , i 1,...,n . 0 i i 

Here we note that, though we assume that the parameters  0 and  are known, we

present the estimating equations with respect to  0 ,  , and  , so that we can use them to estimate the unknown parameters and  .

We assume R (z)  A (z) from (3.1) and (3.2). Then the estimating equations

with respect to  0 , and  can be written as:

n n  2  z     z  , (4.8)  i  i  i1 i1   

n  2  z   z   0, (4.9)  i  i  i1   

n 2  zi  n . (4.10) i1

Simplifying (4.8)-(4.9), and using (4.10), we can write them in the following form:

 n  2   n   n  zi    zi   i1     i1        (4.11)  n  2    z n   0  i      i1    

56

The equation (4.11) can be expressed in the general form

W   . (4.12)

 n     n  n z 2   i     zi   i1      i1  where, W    ,     and     .  n   2  z n    0   i       i1     

We assume Rank(W)  2 , and the estimates of  and 2 can be expressed in the general form (Rao(1973))

ˆ  W'W 1W' . (4.13)

From (4.13), we get the estimates ˆ and ˆ2ˆ , from where we can calculate and

ˆˆ .

ˆ ˆ We denote A (z) at   ˆ by A (z) .

4.2.3 Measure of goodness of fit.

For evaluating goodness of fit for approximating R (z) given in (3.1) by given

in (3.2) for each value of zi , i 1,...,n we define

 (z )  R (z )  Aˆ (z ) A i  i  i . (4.14)

57

For a particular value of zi , (4.6) gives us a measure of the absolute difference

between the true value of the ratio R (zi ) and the estimated value of the linear

ˆ approximation A (zi ) .

To get a measure of how close the estimated linear approximation is to the true ratio,

we calculate Ave A , the average of  A (zi ) over all the values of ’s, i 1,...,n .

is given by

n Ave A    A (zi ) . (4.15) i1

A small value of Ave A indicates a satisfactory approximation of R (z) by A (z) .

4.3 The estimation Procedure using B (z)

In this Section we present the estimation procedure for estimating the parameters

 and  using the linear approximation B (z) defined in (3.3) for defined in

(3.1), in the presence and absence of a co-variate X. In Section 4.3.1 we consider the estimation procedure in the presence of a co-variate X and in Section 4.3.2 we consider the estimation procedure in the absence of any co-variate. In Section 4.3.3 we discuss

58 about a measure of goodness of fit to observe how the function B (z) performs in

approximating R (z) . Here we will assume   0 , hence we can write (3.3) as

2 B (z)    exp z1. (4.16)  

4.3.1 Case I. Covariate X Present

We have n pairs of observations (xi , yi ) , i 1,...,n , where xi , i 1,...,n, are

observations of the covariate X and yi , i 1,...,n, are the observed values of the

Y ~ SN(   x,,) dependent variable Y, where, 0 1 . From the ’s we calculate the

y     x z ’s, where, z  i 0 1 i , . Here we note that, though we assume that i i 

the parameters  0 , 1 and  are known, we present the estimating equations with

respect to  0 , 1, and , so that we can use them to estimate the unknown parameters  and  .

59

We assume R (z)  B (z) from (3.1) and (4.16). Then the estimating equations with

respect to  0 , 1, and  can be written as:

n n  2  z      exp  z 1  , (4.17)  i    i   i1 i1   

n n  2  x z   x    exp  z 1  , (4.18)  i i  i    i   i1 i1   

n  2  z    exp  z 1   0, (4.19)  i    i   i1   

n 2  zi  n . (4.20) i1

Simplifying (4.17)-(4.19), and using (4.20), we can write them in the following form:

 n   n   n exp zi 1  2  zi      i1    i1  n n    n  x x exp z 1    x z  (4.21)   i  i i    i i  i1 i1  2  i1  n n     0        zi  zi exp zi 1    i1 i1   

The equation (4.21) can be expressed in the general form

W   , (4.22)

60

 n   n   n exp zi 1   2    zi   i1     i1  n n    n where, W   x x exp z 1  ,   and    x z  .   i  i i     i i  i1 i1  2  i1  n n     0         zi  zi exp zi 1    i1 i1   

We assume Rank(W)  2 , and the estimates of  and 2  can be expressed in the general form (Rao(1973))

ˆ  W'W 1W' . (4.23)

From (4.23), we get the estimates ˆ and ˆ2 ˆ , from where we can calculate and

ˆˆ .

ˆ ˆ ˆ We denote B (z) at    by B (z) .

4.3.2 Case II. Covariate X Absent

Here we have n observations yi , i 1,...,n, from the random variable Y, where,

y   Y ~ SN( ,,) . From the ’s we calculate the z ’s, where, z  i 0 , i 1,...,n . 0 i i 

Here we note that, though we assume that the parameters  0 and  are known, we

61 present the estimating equations with respect to  0 , , and , so that we can use them to estimate the unknown parameters  and  .

We assume R (z)  B (z) from (3.1) and (4.16). Then the estimating equations

with respect to  0 , and  can be written as:

n n  2  z      exp  z 1  , (4.24)  i    i   i1 i1   

n  2  z    exp  z 1   0, (4.25)  i    i   i1   

n 2  zi  n . (4.26) i1

Simplifying (4.24)-(4.25), and using (4.26), we can write them in the following form:

 n  2   n   n exp zi 1    zi   i1     i1        (4.27)  n n  2    z z exp  z 1   0  i  i   i        i1 i1    

The equation (4.27) can be expressed in the general form

W   , (4.28)

 n     n  n exp  z 1 2    i       zi   i1      i1  where, W    ,     and     .  n n   2  z z exp  z 1    0   i  i   i         i1 i1     

62

We assume Rank(W)  2 , and the estimates of  and 2  can be expressed in the general form (Rao(1973))

ˆ  W'W 1W' . (4.29)

From (4.28), we get the estimates ˆ and ˆ2 ˆ , from where we can calculate and

ˆˆ .

ˆ ˆ ˆ We denote B (z) at    by B (z) .

4.3.3 Measure of goodness of fit.

For evaluating goodness of fit for approximating R (z) given in (3.1) by B (z) given in

(4.16) for each value of zi , i 1,...,n we define

 (z )  R (z )  Bˆ (z ) . (4.30) B i  i  i

For a particular value of , (4.30) gives us a measure of the absolute difference

between the true value of the ratio R (zi ) and the estimated value of the linear

ˆ approximation B (zi ) .

63

To get a measure of how close the estimated linear approximation is to the true ratio,

we calculate Ave B , the average of  B (zi ) over all the values of zi ’s, i 1,...,n .

is given by

n Ave B    B (zi ) . (4.31) i1

A small value of Ave B indicates a satisfactory approximation of R (z) by B (z) .

4.4 A simulated Data

In order to evaluate the performance of the estimation procedure given in Section

4.2 and 4.3 we present the estimated values of  ,  and  using a simulated data. We consider the case when co-variate X is present and the case when it is absent. Here we

assume the parameters  0 , 1 and  are known.

We generate zi , i 1,...,n from SN(0,1,) , where n  20 and   1, keeping nine decimal places. The rounded values at the second decimal place are

1.11, 0.45, -0.26, -0.18, 0.92, 1.48, 1.03, 0.32, 2.03, 1.93, 0.07, 0.67, -0.15, 0.31, -0.11, -

1.23,0.38,0.01,1.25,-0.52. (Dataset 4.1)

We now treat the value of  to be unknown. The unknown parameters are then

and , when we consider A (z) and and  , when we consider .

64

4.4.1 Case I. Covariate X present

The xi , i 1,...,20 (Exercise 12.25, Page 522, Mendenhall, Beaver and Beaver (2009)) values are

100, 96, 88, 100, 100, 96, 80, 68, 92, 96, 88, 92, 68, 84, 84, 88, 72, 88, 72, 88.

(Dataset 4.2)

We consider  0  30 ,  1  5 and   0.05.

We generate a random variable Y from Dataset 1 and 2, such that,

yi   0   1xi zi , i 1,...,n .

We can then write Y ~ SN( 0   1 X  30  5X,  0.05, 1) .

The yi values , obtained are

470.0555, 450.0225, 409.9870, 469.9910, 470.0460, 450.0740, 370.0515, 310.0160,

430.1015, 450.0965, 410.0035, 430.0335, 309.9925, 390.0155, 389.9945, 409.9385,

330.0190, 410.0005, 330.0625, 409.9740. (Dataset 4.3)

We recall here that the values of  0 , 1 and  are known, while  is unknown. Hence

y     x the z values are known, where, z  i 0 1 i , . We now work with the i i 

dataset (xi , yi ) transformed into (xi , zi ) , .

65

4.4.1 (A) Estimation using linear approximation A (z) .

We now follow the procedure discussed in Section 4.2.1, where we approximated

R (z) A (z)    in (3.1) by  given in (3.2) to estimate the parameters and in the presence of a covariate X.

In the Equation (4.7), we have:

 20 9.51   9.51      W  1740 862.24 , and   862.24 . 9.51 20   0 

We obtain from (4.7) and as:

ˆ  0.8119577 ,

ˆ  0.4662167.

ˆ Hence the estimated linear function A (z) is,

Aˆ (z)  0.7978846  0.3785406z  .

The measure of goodness of fit is calculated to be

Ave A  0.09446111.

We observe that the value of Ave A is quite small, and hence we can conclude that

ˆ our estimated linear function A (z) has satisfactorily approximated R (z) .

66

Figure 4.1 Plots of both R (z) and Aˆ (z) against z when covariate X is present. 1 1

In Figure 4.1 we plot both and against z . From this figure we observe that

for the data considered, the approximation A (z) for is quite strong within the range of z values between [-0.5,2] and moderately strong in the range [-2,-0.5].

67

4.4.1 (B) Estimation using non-linear approximation B (z) .

We now follow the procedure discussed in Section 4.3.1, where we approximated

R (z) B (z)   in (3.1) by  given in (3.3) to estimate the parameters and  in the presence of the covariate X.

In the Equation (4.23), we have:

 20  3.034   9.51      W  1740  276.732 , and   862.24 . 9.51 11.676   0 

We obtain from (4.23) the estimates of and as:

ˆ  0.7133538,

ˆ  0.90986 .

ˆ Hence the estimated non-linear function B (z) is

ˆ B (z)  0.7978846  0.649052exp z1.

The measure of goodness of fit is calculated to be Ave Δ_B = 0.09331804.

We observe that the value of Ave Δ_B is quite small, and hence we conclude that our estimated non-linear function B̂_λ(z) satisfactorily approximates R_λ(z).


In Figure 4.2 we plot both R_1(z) and B̂_λ(z) against z. From this figure and the data considered, we find that the approximation of R_1(z) is very strong for z in [-1, 1] and moderately strong in the ranges [-2, -1] and [1, 2].

Figure 4.2 Plots of both R_1(z) and B̂_λ(z) against z when the covariate X is present.


4.4.2 Case II. Covariate X absent

We consider  0  30 ,  1  5 and   0.05.

We generate a random variable Y from Dataset 1 and 2, such that,

yi   0   1xi zi , i 1,...,n .

We can then write Y ~ SN( 0  30,  0.05, 1).

The yi values , i 1,...,20 obtained are

-29.9445, -29.9775, -30.0130, -30.0090, -29.9540, -29.9260, -29.9485, -29.9840,

-29.8985, -29.9035, -29.9965, -29.9665, -30.0075, -29.9845, -30.0055, -30.0615,

-29.9810, -29.9995, -29.9375, -30.0260. (Dataset 4.4)

We recall here that the values of β0 and σ are known, while λ is unknown. Hence the z_i values are known, where z_i = (y_i - β0)/σ, i = 1, ..., n. We now work with the dataset (y_i) transformed into (z_i).


4.4.2 (A) Estimation using the linear approximation A_λ(z)

We now follow the procedure discussed in Section 4.2.2, where we approximate R_λ(z) in (3.1) by A_λ(z) given in (3.2), to estimate the parameters λ and β in the absence of any covariate.

In Equation (4.7), we have

\[
W=\begin{pmatrix}20 & 9.51\\ 9.51 & 20\end{pmatrix},
\qquad\text{and the vector}\qquad
\begin{pmatrix}9.51\\ 0\end{pmatrix}.
\]

We obtain from (4.13) the estimates of λ and β as

λ̂ = 0.770062,  β̂ = -0.4926799,

so that β̂λ̂ = -0.3793941.

Hence the estimated linear function Â_λ(z) is

\[
\hat A_\lambda(z)=0.7978846-0.3793941\,z.
\]

The measure of goodness of fit is calculated to be Ave Δ_A = 0.09402337.

We observe that the value of Ave Δ_A is quite small, and hence we conclude that our estimated linear function Â_λ(z) satisfactorily approximates R_λ(z).


Figure 4.3 Plots of both R_1(z) and Â_λ(z) against z when the covariate X is absent.

In Figure 4.3 we plot both R_1(z) and Â_λ(z) against z. This figure is very similar to Figure 4.1, and we observe that, for the data considered, the approximation of R_1(z) is quite strong for z in [-0.5, 2] and moderately strong in [-2, -0.5].


4.4.2 (B) Estimation using the non-linear approximation B_λ(z)

We now follow the procedure discussed in Section 4.3.2, where we approximate R_λ(z) in (3.1) by B_λ(z) given in (4.16), to estimate the parameters λ and δ in the absence of any covariate.

In Equation (4.23), we have

\[
W=\begin{pmatrix}20 & -3.034\\ 9.51 & 11.676\end{pmatrix},
\qquad\text{and the vector}\qquad
\begin{pmatrix}9.51\\ 0\end{pmatrix}.
\]

We obtain from (4.23) the estimates of λ and δ as

λ̂ = 0.6799513,  δ̂ = -0.9557566,

so that δ̂λ̂ = -0.649868.

Hence the estimated non-linear function B̂_λ(z) is

\[
\hat B_\lambda(z)=0.7978846-0.649868\,\left(1-e^{-z}\right).
\]

The measure of goodness of fit is calculated to be Ave Δ_B = 0.0931951.

We notice that the value of Ave Δ_B is quite small, and hence we conclude that our estimated non-linear function B̂_λ(z) satisfactorily approximates R_λ(z).


In Figure 4.4 we plot both R_1(z) and B̂_λ(z) against z. This figure is very similar to Figure 4.2, and we notice that the approximation of R_1(z) is very strong for z in [-1, 1] and moderately strong in [-2, -1] and [1, 2].

Figure 4.4 Plots of both R_1(z) and B̂_λ(z) against z when the covariate X is absent.


4.5 Estimating Bias and Accuracy in Approximations Using Simulations

We now obtain 100,000 datasets by repeating the simulation method described in the earlier sections 100,000 times. We consider different values of the parameters β0, β1, σ and λ; however, we present here the outcomes only for a few sets of parameter values. For a fixed set of parameter values, we generate 100,000 datasets, and from each dataset we obtain the numerical values of λ̂, of λ̂²β̂ or λ̂²δ̂, and of λ̂β̂ or λ̂δ̂ for the approximation in (3.2) or (3.3), using equation (4.5) or (4.11). We also calculate the numerical values of Δ_A(z_i) or Δ_B(z_i) for the twenty z values in each of the 100,000 datasets, and then calculate their average, Ave Δ_A or Ave Δ_B, over the twenty calculated values. Thus, from the 100,000 datasets we get 100,000 values of each estimate and 100,000 values of Ave Δ_A or Ave Δ_B. We present the first quartile Q1, median, mean, third quartile Q3, and standard deviation (SD) of these 100,000 values in all the cases. In Tables 4.1-4.8 we present only four out of the many sets of parameter values for which we have carried out our calculations.
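An outline of the replication loop and the summaries reported in Tables 4.1-4.8 is sketched below; estimate_lambda stands in for the estimator of Sections 4.2-4.3 and rskewnormal for the generator sketched in Section 4.4, both placeholders rather than the actual code used.

    import numpy as np

    def summarize(values):
        # Q1, Median, Mean, Q3 and SD, as reported in Tables 4.1-4.8.
        q1, med, q3 = np.percentile(values, [25, 50, 75])
        return q1, med, float(np.mean(values)), q3, float(np.std(values, ddof=1))

    # lam_hats = [estimate_lambda(rskewnormal(20, lam=0.5, rng=r))
    #             for r in range(100_000)]
    # print(summarize(lam_hats))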

In Tables 4.1-4.8, the numerical values of the difference between λ and Median λ̂ can be considered as estimates of the bias in λ̂ as an estimate of λ in (3.1). The numerical values of (λ - Median λ̂) are -0.0189 in Table 4.1, -0.0440 in Table 4.2, -0.0195 in Table 4.3, and -0.0500 in Table 4.4 when we estimate λ by approximating (3.1) by (3.2). The numerical values of (λ - Median λ̂) are 0.0560 in Table 4.5, 0.0360 in Table 4.6, 0.0556 in Table 4.7, and 0.0370 in Table 4.8 when we estimate λ by approximating (3.1) by (3.3). The small numerical values of the estimated biases calculated from Tables 4.1-4.8 indicate the absence of any alarming bias in the estimates of λ. We observe that the approximation in (3.2) provides a small overestimate of λ; on the other hand, the approximation in (3.3) provides a small underestimate of λ. We get a similar picture when we use (λ - Mean λ̂) to estimate the bias in λ̂ as an estimate of λ in (3.1). The median values of Ave Δ_A are 0.0857 in Table 4.1, 0.1334 in Table 4.2, 0.0853 in Table 4.3, and 0.1336 in Table 4.4. The median values of Ave Δ_B are 0.1347 in Table 4.5, 0.0589 in Table 4.6, 0.1347 in Table 4.7, and 0.0590 in Table 4.8. The goodness of the two approximating functions in (3.2) and (3.3), as indicated by the numerical values of Ave Δ_A and Ave Δ_B, is comparable; both are sufficiently strong approximations of (3.1).

We observe that the non-linear approximation given in (3.3) does not perform significantly better than the linear approximation given in (3.2). Thus, in the next chapter, we restrict ourselves to the linear approximation, which is much simpler and quite satisfactory, in developing a new estimation procedure.


Table 4.1. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²β̂, λ̂β̂, and Ave Δ_A when β0 = -30, β1 = 5, σ = 0.5, and λ = 0.5, in approximating (3.1) by (3.2).

Table 4.2. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²β̂, λ̂β̂, and Ave Δ_A when β0 = -30, β1 = 5, σ = 0.5, and λ = 1.5, in approximating (3.1) by (3.2).

Table 4.3. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²β̂, λ̂β̂, and Ave Δ_A when β0 = -30, β1 = 5, σ = 1.0, and λ = 0.5, in approximating (3.1) by (3.2).

Table 4.4. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²β̂, λ̂β̂, and Ave Δ_A when β0 = -30, β1 = 5, σ = 1.0, and λ = 1.5, in approximating (3.1) by (3.2).

Table 4.5. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²δ̂, λ̂δ̂, and Ave Δ_B when β0 = -30, β1 = 5, σ = 0.5, and λ = 0.5, in approximating (3.1) by (3.3).

Table 4.6. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²δ̂, λ̂δ̂, and Ave Δ_B when β0 = -30, β1 = 5, σ = 0.5, and λ = 1.5, in approximating (3.1) by (3.3).

Table 4.7. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²δ̂, λ̂δ̂, and Ave Δ_B when β0 = -30, β1 = 5, σ = 1.0, and λ = 0.5, in approximating (3.1) by (3.3).

Table 4.8. The values of Q1, Median, Mean, Q3, and SD for λ̂, λ̂²δ̂, λ̂δ̂, and Ave Δ_B when β0 = -30, β1 = 5, σ = 1.0, and λ = 1.5, in approximating (3.1) by (3.3).


Chapter 5

Estimation of Location, Scale and Shape Parameters of a Skew Normal Distribution


5.1 Introduction

In Chapter 4 we discussed the procedure to estimate λ in R_λ(z), β in A_λ(z), and δ in B_λ(z), assuming the parameters β0, β1 and σ to be known and considering a covariate X to be present. In this chapter we assume all the parameters, i.e., β0, β1, σ and λ, to be unknown. To estimate the unknown parameters, we approximate R_λ(z) in (3.1) by the linear function A_λ(z) in (3.2). Thus we have one additional unknown parameter, β.

We consider a dataset of n pairs of observations (x_i, y_i), i = 1, ..., n, where the x_i are observations of the covariate X and the y_i are the observed values of the dependent variable Y, with Y ~ SN(β0 + β1x, σ, λ). We rewrite the estimating equations (3.7)-(3.10) using A_λ(z) in place of R_λ(z). Based on the observations (x_i, y_i) and the estimating equations, we discuss some important relationships that exist between the estimated parameter values. Using these relationships, we present a procedure for estimating the unknown parameters. We also examine the performance of our estimation procedure with simulation results.


5.2 Relations Among Estimated Parameters

In this section, we obtain some interesting relations among the estimated parameter values. Using these relations, we present an estimation procedure for the unknown parameters.

By substituting R (z)  A (z) in the maximum likelihood equations given in (3.7)-

(3.10), we get:

n n n   yi   0   1xi   yi  n 0   1  xi   0.7978846  , (5.1) i1 i1 i1    

n n n n 2   yi   0   1xi   xi yi   0  xi   1  xi    xi 0.7978846   , (5.2) i1 i1 i1 i1    

n n   yi   0   1 xi    yi   0   1 xi   yi 0.7978846     0 0.7978846    i1     i1    

n   yi   0   1xi   1 xi 0.7978846  , (5.3)   i1    and

n 2 2 n  yi   0   1xi  . (5.4) i1


We note that \(\sqrt{2/\pi}=0.7978846\).

ˆ Letˆ0 ,ˆ1 , ˆ , ˆ and  be the solutions for  0 , 1 ,  ,  and  in the equations (5.1)-

(5.4).

We define

\[
\hat w_i=y_i-\hat\beta_0-\hat\beta_1 x_i. \qquad (5.5)
\]

From (5.1)-(5.4), using (5.5), we write

\[
(1-\hat\beta\hat\lambda^{2})\sum_{i=1}^{n}\hat w_i=0.7978846\,n\hat\lambda\hat\sigma, \qquad (5.6)
\]

\[
(1-\hat\beta\hat\lambda^{2})\sum_{i=1}^{n} x_i\hat w_i=0.7978846\,n\hat\lambda\hat\sigma\,\bar x, \qquad (5.7)
\]

\[
0.7978846\sum_{i=1}^{n}\hat w_i=-\frac{\hat\beta\hat\lambda}{\hat\sigma}\sum_{i=1}^{n}\hat w_i^{2}, \qquad (5.8)
\]

\[
n\hat\sigma^{2}=\sum_{i=1}^{n}\hat w_i^{2}. \qquad (5.9)
\]

From (5.8) and (5.9) we get

\[
\sum_{i=1}^{n}\hat w_i=-\frac{n\hat\beta\hat\lambda\hat\sigma}{0.7978846},
\qquad\text{that is,}\qquad
\bar{\hat w}=-\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}. \qquad (5.10)
\]

Multiplying both sides of (5.6) by β̂/n, and using (5.10), we have

\[
\left\{\hat\beta-(\hat\beta\hat\lambda)^{2}\right\}\bar{\hat w}=0.7978846\,\hat\beta\hat\lambda\hat\sigma. \qquad (5.11)
\]

From (5.6)-(5.11) we obtain Observations 1-11 relating the estimated parameter values.


Observation 1. If 1 - β̂λ̂² ≠ 0, then the maximum likelihood estimate of β1 is given by

\[
\hat\beta_1=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^{2}}. \qquad (5.12)
\]

Proof: When 1 - β̂λ̂² ≠ 0, multiplying both sides of (5.6) by x̄ and subtracting the result from (5.7), we obtain

\[
\sum_{i=1}^{n}(x_i-\bar x)\,y_i=\hat\beta_1\sum_{i=1}^{n}(x_i-\bar x)\,x_i. \qquad (5.13)
\]

Then (5.12) follows immediately from (5.13).

Note: The estimate of β1 given in (5.12) for SN(β0 + β1x, σ, λ) is exactly the same as the estimate of β1 for N(β0 + β1x, σ).

Observation 2. We have

\[
\hat\beta-(\hat\beta\hat\lambda)^{2}=-0.7978846^{2}. \qquad (5.14)
\]

Proof: Combining (5.10) and (5.11), we obtain (5.14).


Observation 3. We have

\[
(\hat\beta\hat\lambda)^{2}\le 0.7978846^{2}. \qquad (5.15)
\]

Proof: We know that

\[
n\bar{\hat w}^{2}\le\sum_{i=1}^{n}\hat w_i^{2}. \qquad (5.16)
\]

Then from (5.9), (5.10) and (5.16), the inequality (5.15) follows.

Observation 4. We have

\[
-0.7978846^{2}\le\hat\beta<0. \qquad (5.17)
\]

Proof: In (3.2), β < 0 and hence β̂ < 0. Combining (5.14) and (5.15), we also have β̂ ≤ 0. From (5.14), β̂ = (β̂λ̂)² - 0.7978846² ≥ -0.7978846², since (β̂λ̂)² ≥ 0. Hence -0.7978846² ≤ β̂ < 0.

Observation 5. We have

\[
\hat\beta_0=\bar y-\hat\beta_1\bar x+\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}. \qquad (5.18)
\]


Proof: From (5.5) we write

\[
\sum_{i=1}^{n}\hat w_i=\sum_{i=1}^{n} y_i-n\hat\beta_0-\hat\beta_1\sum_{i=1}^{n} x_i. \qquad (5.19)
\]

Dividing both sides of (5.19) by n, we have

\[
\bar{\hat w}=\bar y-\hat\beta_0-\hat\beta_1\bar x. \qquad (5.20)
\]

Now from (5.10) and (5.20) we get (5.18).

Observation 6. We observe that (i) λ̂ > 0 if and only if \(\bar{\hat w}>0\); (ii) λ̂ < 0 if and only if \(\bar{\hat w}<0\).

Proof: Since β̂ < 0 by (5.17) and σ̂ > 0, the results in (i) and (ii) are clear from (5.10).

Observation 7. When λ = 0, we denote σ̂ by σ̂0, and it is given by

\[
\hat\sigma_0^{2}=\frac{1}{n}\sum_{i=1}^{n}\left\{y_i-\bar y-\hat\beta_1(x_i-\bar x)\right\}^{2}. \qquad (5.21)
\]

Proof: When λ = 0, we know that y_i ~ N(β0 + β1x_i, σ0), i = 1, 2, ..., n.


From (1.5)-(1.7), we obtain the maximum likelihood estimates of β0, β1 and σ0:

\[
\hat\beta_1=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^{2}},
\qquad
\hat\beta_0=\bar y-\hat\beta_1\bar x,
\qquad
\hat\sigma_0^{2}=\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat\beta_0-\hat\beta_1 x_i\right)^{2}.
\]

Hence, (5.21) follows.
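These normal-model estimates are the building blocks of the procedure developed in Section 5.3; a minimal illustrative sketch (the helper name is ours):

    import numpy as np

    def normal_mles(x, y):
        # MLEs under N(beta0 + beta1*x, sigma0): the beta1-hat of (5.12),
        # beta0-hat = ybar - beta1-hat*xbar, and sigma0-hat^2 of (5.21)
        # (note the divisor n, not n - 2).
        xbar, ybar = x.mean(), y.mean()
        b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
        b0 = ybar - b1 * xbar
        s0sq = float(np.mean((y - b0 - b1 * x) ** 2))
        return b0, b1, s0sq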

Observation 8. We have

\[
\frac{\hat\sigma_0^{2}}{\hat\sigma^{2}}=1-\left(\frac{\hat\beta\hat\lambda}{0.7978846}\right)^{2}. \qquad (5.22)
\]

Proof: From (5.9), (5.10) and (5.21) we get

\[
n\hat\sigma_0^{2}=n\hat\sigma^{2}-n\left(\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}\right)^{2}. \qquad (5.23)
\]

This implies

\[
\hat\sigma^{2}=\frac{0.7978846^{2}\,\hat\sigma_0^{2}}{0.7978846^{2}-(\hat\beta\hat\lambda)^{2}}. \qquad (5.24)
\]

Rearranging (5.23), we also obtain

\[
(\hat\beta\hat\lambda\hat\sigma)^{2}=0.7978846^{2}\left(\hat\sigma^{2}-\hat\sigma_0^{2}\right). \qquad (5.25)
\]

Hence the equality in (5.22) is true.


Observation 9. We have

\[
\hat\sigma^{2}=\hat\sigma_0^{2}+\bar{\hat w}^{2}. \qquad (5.26)
\]

Proof: We write, for i = 1, ..., n,

\[
y_i-\hat\beta_0-\hat\beta_1 x_i=\left\{y_i-\bar y-\hat\beta_1(x_i-\bar x)\right\}+\left(\bar y-\hat\beta_0-\hat\beta_1\bar x\right).
\]

Squaring both sides and summing over i = 1, ..., n (the cross term vanishes, since the first bracket sums to zero), we get

\[
\sum_{i=1}^{n}\left(y_i-\hat\beta_0-\hat\beta_1 x_i\right)^{2}=\sum_{i=1}^{n}\left\{y_i-\bar y-\hat\beta_1(x_i-\bar x)\right\}^{2}+n\left(\bar y-\hat\beta_0-\hat\beta_1\bar x\right)^{2}. \qquad (5.27)
\]

Hence (5.26) is true.

Observation 10. We have

\[
\hat\sigma^{2}=\hat\sigma_0^{2}+\bar{\hat w}^{2}=\hat\sigma_0^{2}+\left(\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}\right)^{2}. \qquad (5.28)
\]

Proof: The proof is clear from (5.10) and (5.23).


Observation 11. We have

\[
\hat\sigma^{2}-\left(\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}\right)^{2}
=\frac{\sigma^{2}}{n}\sum_{i=1}^{n}\left\{(z_i-\bar z)-(x_i-\bar x)\,
\frac{\sum_{j=1}^{n}(x_j-\bar x)(z_j-\bar z)}{\sum_{j=1}^{n}(x_j-\bar x)^{2}}\right\}^{2}.
\]

Proof: Let y_1, y_2, ..., y_n be a random sample from SN(β0 + β1X, σ, λ); that is, y_i = β0 + β1x_i + σz_i, where z_1, z_2, ..., z_n is a random sample from SN(0, 1, λ). Hence

\[
\bar y=\beta_0+\beta_1\bar x+\sigma\bar z.
\]

Now,

\[
\hat\beta_1=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^{2}}
=\frac{\sum_{i=1}^{n}(x_i-\bar x)\left\{\beta_1(x_i-\bar x)+\sigma(z_i-\bar z)\right\}}{\sum_{i=1}^{n}(x_i-\bar x)^{2}}
=\beta_1+\sigma\,\frac{\sum_{i=1}^{n}(x_i-\bar x)(z_i-\bar z)}{\sum_{i=1}^{n}(x_i-\bar x)^{2}}.
\]

Again,

\[
\hat\sigma_0^{2}=\frac{1}{n}\sum_{i=1}^{n}\left\{y_i-\bar y-\hat\beta_1(x_i-\bar x)\right\}^{2}
=\frac{\sigma^{2}}{n}\sum_{i=1}^{n}\left\{(z_i-\bar z)-(x_i-\bar x)\,
\frac{\sum_{j=1}^{n}(x_j-\bar x)(z_j-\bar z)}{\sum_{j=1}^{n}(x_j-\bar x)^{2}}\right\}^{2}.
\]

From (5.23) we have

\[
\hat\sigma^{2}-\left(\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}\right)^{2}=\hat\sigma_0^{2},
\]

and substituting the above expression for σ̂0² completes the proof.


5.3 The Estimation Procedure Using A_λ(z)

Here we recall that we have five unknown parameters, β0, β1, σ, λ and β, but only four estimating equations, given in (5.7)-(5.10). Hence we do not have enough estimating equations to solve for the estimates uniquely. To address this issue we introduce a new variable b and express the estimates of the parameters in terms of b.

Let us write

\[
b=\frac{(\hat\beta\hat\lambda)^{2}}{0.7978846^{2}}. \qquad (5.26)
\]

From Observation 2 and Observation 4 we notice that

\[
0\le b<1. \qquad (5.27)
\]

From Observation 2 and (5.26), we have

\[
\hat\beta=-0.7978846^{2}\,(1-b). \qquad (5.28)
\]

From (5.22) and (5.26), we have

\[
\hat\sigma^{2}=\frac{\hat\sigma_0^{2}}{1-b}, \qquad (5.29)
\]

where σ̂0² = (1/n) Σ_{i=1}^{n} {y_i - ȳ - β̂1(x_i - x̄)}².


From (5.26) and (5.28), we have

\[
\hat\lambda^{2}=\frac{b}{(1-b)^{2}\,0.7978846^{2}}. \qquad (5.30)
\]

Thus we see that, for a particular value of b, we have two possible values of the estimated λ:

\[
\hat\lambda=\pm\,\frac{\sqrt b}{0.7978846\,(1-b)}. \qquad (5.31)
\]

We write λ̂⁺ when we assume λ̂ > 0 and λ̂⁻ when we assume λ̂ < 0. That is,

\[
\hat\lambda^{+}=\frac{\sqrt b}{0.7978846\,(1-b)}, \qquad (5.31a)
\]

and

\[
\hat\lambda^{-}=-\,\frac{\sqrt b}{0.7978846\,(1-b)}. \qquad (5.31b)
\]

Recalling (5.18), we have

\[
\hat\beta_0=\bar y-\hat\beta_1\bar x+\frac{\hat\beta\hat\lambda\hat\sigma}{0.7978846}.
\]

Now in (5.18), if λ̂ = λ̂⁺ we denote β̂0 = β̂0⁺, and if λ̂ = λ̂⁻ we denote β̂0 = β̂0⁻. That is, from (5.29) and (5.31a) we have

\[
\hat\beta_0^{+}=\bar y-\hat\beta_1\bar x-\hat\sigma_0\sqrt{\frac{b}{1-b}}. \qquad (5.32a)
\]


From (5.29) and (5.31b) we have

\[
\hat\beta_0^{-}=\bar y-\hat\beta_1\bar x+\hat\sigma_0\sqrt{\frac{b}{1-b}}. \qquad (5.32b)
\]

We now have two issues. From the n pairs of observations (x_i, y_i), i = 1, ..., n:

1. How do we determine the sign of λ̂?

2. How do we determine the optimal value of b, so that we get the best estimates of the parameters β0, β1, σ, λ and β?

To answer the first question, we define l̂⁺ and l̂⁻, where l̂⁺ is the estimated log-likelihood function with β̂0 = β̂0⁺ and λ̂ = λ̂⁺, and l̂⁻ is the estimated log-likelihood function with β̂0 = β̂0⁻ and λ̂ = λ̂⁻.

The l̂⁺ and l̂⁻ are given by:

\[
\hat l^{+}=n\log 2-n\log\hat\sigma+\sum_{i=1}^{n}\log\phi\!\left(\frac{y_i-\hat\beta_0^{+}-\hat\beta_1 x_i}{\hat\sigma}\right)+\sum_{i=1}^{n}\log\Phi\!\left(\hat\lambda^{+}\,\frac{y_i-\hat\beta_0^{+}-\hat\beta_1 x_i}{\hat\sigma}\right), \qquad (5.33a)
\]

\[
\hat l^{-}=n\log 2-n\log\hat\sigma+\sum_{i=1}^{n}\log\phi\!\left(\frac{y_i-\hat\beta_0^{-}-\hat\beta_1 x_i}{\hat\sigma}\right)+\sum_{i=1}^{n}\log\Phi\!\left(\hat\lambda^{-}\,\frac{y_i-\hat\beta_0^{-}-\hat\beta_1 x_i}{\hat\sigma}\right). \qquad (5.33b)
\]


Let us define

\[
C_i=\frac{y_i-\bar y-\hat\beta_1(x_i-\bar x)}{\hat\sigma},
\qquad
p=\sqrt{\frac{2}{\pi}},
\qquad
d=\frac{\sqrt b}{(1-b)\,p}.
\]

Result 1. We have

\[
\hat l^{+}>\hat l^{-}\quad\text{if and only if}\quad
\prod_{i=1}^{n}\frac{\Phi\{d\,(C_i+\sqrt b)\}}{\Phi\{d\,(\sqrt b-C_i)\}}>1, \qquad (5.34)
\]

and

\[
\hat l^{+}<\hat l^{-}\quad\text{if and only if}\quad
\prod_{i=1}^{n}\frac{\Phi\{d\,(C_i+\sqrt b)\}}{\Phi\{d\,(\sqrt b-C_i)\}}<1. \qquad (5.35)
\]

Proof: From (5.31a), (5.32a) and (5.33a) we get

\[
\hat l^{+}=n\log 2-n\log\hat\sigma+\sum_{i=1}^{n}\log\phi\!\left(\frac{y_i-\bar y-\hat\beta_1(x_i-\bar x)}{\hat\sigma}+\sqrt b\right)+\sum_{i=1}^{n}\log\Phi\!\left(d\left\{\frac{y_i-\bar y-\hat\beta_1(x_i-\bar x)}{\hat\sigma}+\sqrt b\right\}\right)
\]


\[
=n\log 2-n\log\hat\sigma+\sum_{i=1}^{n}\log\phi(C_i+\sqrt b)+\sum_{i=1}^{n}\log\Phi\{d\,(C_i+\sqrt b)\}.
\]

Similarly, from (5.31b), (5.32b) and (5.33b) we get

\[
\hat l^{-}=n\log 2-n\log\hat\sigma+\sum_{i=1}^{n}\log\phi(C_i-\sqrt b)+\sum_{i=1}^{n}\log\Phi\{d\,(\sqrt b-C_i)\}.
\]

Now,

\[
\hat l^{+}-\hat l^{-}=\sum_{i=1}^{n}\log\frac{\phi(C_i+\sqrt b)}{\phi(C_i-\sqrt b)}+\sum_{i=1}^{n}\log\frac{\Phi\{d\,(C_i+\sqrt b)\}}{\Phi\{d\,(\sqrt b-C_i)\}}
=\log\left[\prod_{i=1}^{n}\frac{\phi(C_i+\sqrt b)}{\phi(C_i-\sqrt b)}\cdot\frac{\Phi\{d\,(C_i+\sqrt b)\}}{\Phi\{d\,(\sqrt b-C_i)\}}\right]. \qquad (5.36)
\]


We see that

\[
\prod_{i=1}^{n}\frac{\phi(C_i+\sqrt b)}{\phi(C_i-\sqrt b)}
=\prod_{i=1}^{n}\exp\left[-\tfrac12\left\{(C_i+\sqrt b)^{2}-(C_i-\sqrt b)^{2}\right\}\right]
=\exp\left[-\tfrac12\sum_{i=1}^{n}\left\{(C_i^{2}+2C_i\sqrt b+b)-(C_i^{2}-2C_i\sqrt b+b)\right\}\right]
=\exp\left(-2\sqrt b\sum_{i=1}^{n}C_i\right)=1,
\]

since

\[
\sum_{i=1}^{n}C_i=\sum_{i=1}^{n}\frac{y_i-\bar y-\hat\beta_1(x_i-\bar x)}{\hat\sigma}=0.
\]

Hence, from (5.36), we have

\[
\hat l^{+}-\hat l^{-}=\log\left[\prod_{i=1}^{n}\frac{\Phi\{d\,(C_i+\sqrt b)\}}{\Phi\{d\,(\sqrt b-C_i)\}}\right].
\]

Thus we see that (5.34) and (5.35) hold.


Sign of ˆ :

We consider one hundred values of b, i.e., (0.00,0.01,0.02,...,0.99). Corresponding to

each value of and from n pairs of observations (xi , yi ) ,i 1,...,n, we calculate

n dC  b  i . i1  dCi  b

n dC  b If  i  1, for majority of the b values, we conclude ˆ  0 , i1  dCi  b and

n dC  b if  i  1, for majority of the b values, we conclude ˆ  0. i1  dCi  b

Optimal value of b:

We consider one hundred values of b, i.e., b = 0.00, 0.01, 0.02, ..., 0.99. First we determine the sign of λ̂. If we find that λ̂ > 0, then we consider only l̂⁺. If we find that λ̂ < 0, then we consider only l̂⁻.

For l̂⁺, corresponding to each value of b we calculate β̂0⁺, β̂1, σ̂, β̂ and λ̂⁺, and then find the value of l̂⁺. We find the optimal value b = b* which maximizes l̂⁺. The estimates of the parameters are then those obtained at b*. We denote the estimated parameter values by β̂0*, β̂1*, σ̂*, β̂* and λ̂*.

For l̂⁻, corresponding to each value of b we calculate β̂0⁻, β̂1, σ̂, β̂ and λ̂⁻, and then find the value of l̂⁻. We find the optimal value b = b* which maximizes l̂⁻, and again the estimates of the parameters are those obtained at b*, denoted β̂0*, β̂1*, σ̂*, β̂* and λ̂*.

Thus we obtain the estimated parameter values.
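Putting the pieces together, the whole procedure is a one-dimensional grid search over b. The following sketch is one reading of it, assuming the illustrative helpers normal_mles, sign_of_lambda and estimates_given_b given earlier (all names ours):

    import numpy as np
    from scipy.stats import norm

    def sn_loglik(x, y, beta0, beta1, sigma, lam):
        # Log-likelihood for Y ~ SN(beta0 + beta1*x, sigma, lam), as in (5.33a)/(5.33b).
        z = (y - beta0 - beta1 * x) / sigma
        return (len(y) * np.log(2.0) - len(y) * np.log(sigma)
                + np.sum(norm.logpdf(z)) + np.sum(norm.logcdf(lam * z)))

    def estimate_sn(x, y, bs=np.arange(0.0, 1.0, 0.01)):
        _, b1, s0sq = normal_mles(x, y)
        e = (y - y.mean() - b1 * (x - x.mean())) / np.sqrt(s0sq)
        sign = sign_of_lambda(e)
        ll = lambda b: sn_loglik(x, y, *estimates_given_b(
            b, x.mean(), y.mean(), b1, s0sq, sign)[:4])
        b_star = max(bs, key=ll)
        return b_star, estimates_given_b(b_star, x.mean(), y.mean(), b1, s0sq, sign)

The grid of one hundred b values mirrors the choice above; a finer grid near the maximizer would be a straightforward refinement.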


5.4 A Simulated Data Set

In order to evaluate the performance of the estimation procedure given in Section 5.3, we present the estimated values of the unknown parameters β0, β1, σ and λ, in the presence of a covariate X, using simulated data.

We generate z_i, i = 1, ..., n, from SN(0, 1, λ), where n = 20 and λ = 1, keeping nine decimal places. The values rounded to the second decimal place are

1.40, 1.96, -0.12, 0.41, 0.75, 0.98, 0.74, 0.24, 1.59, 1.09, -0.13, 0.64, 1.98, 0.17, -0.14, 1.40, 0.72, 1.36, 0.99, 0.66. (Dataset 5.1)

The xi , i 1,...,20, (Exercise 12.25, Page 522, Mendenhall, Beaver and Beaver

(2009)) values are

100, 96, 88, 100, 100, 96, 80, 68, 92, 96, 88, 92, 68, 84, 84, 88, 72, 88, 72, 88.

(Dataset 5.2)

We consider  0  30 ,  1  5 and  0.05.

We generate the dependent variable Y from Dataset 5.1 and Dataset 5.2, such that y_i = β0 + β1x_i + σz_i, i = 1, ..., n. We can then write Y ~ SN(β0 + β1x = -30 + 5x, σ = 0.05, λ = 1).


The yi values , i 1,...,20, are

470.0698, 450.0979, 409.9942, 470.0207, 470.0375, 450.0491, 370.0371, 310.0119,

430.0793, 450.0547, 409.9933, 430.0319, 310.0990, 390.0083, 389.9928, 410.0700,

330.0358, 410.0679, 330.0494, 410.0329. (Dataset 5.3)

Now we treat the values of β0, β1, σ and λ as unknown. We apply the estimation procedure proposed in Section 5.3 to estimate these parameters, and also β in (3.2).

We first determine the sign of λ̂.

For Dataset 5.3, we calculate

\[
\prod_{i=1}^{n}\frac{\Phi\{d\,(C_i+\sqrt b)\}}{\Phi\{d\,(\sqrt b-C_i)\}},
\qquad
C_i=\frac{y_i-\bar y-\hat\beta_1(x_i-\bar x)}{\hat\sigma},
\quad p=\sqrt{2/\pi},
\quad d=\frac{\sqrt b}{(1-b)\,p},
\]

for each value of b = 0.00, 0.01, 0.02, ..., 0.99.

We obtain β̂1 = 5.000267. We see that the product exceeds 1 for all values of b; hence we conclude λ̂ > 0.


ˆ   Hence, we consider only l . Now corresponding to each value of b, we obtain ˆ0 ,

ˆ ˆ1 , ˆ , ˆ and  . Thus we have 100 sets of , , , and estimates.

Corresponding to each set of estimated values of the parameters, we obtain . Then we find the maximum from that set of 100 values, and note the corresponding value of b , b* , as our optimum value of .

Here, maxlˆ  26.90821 and the corresponding , b*  0.43 .

The estimates obtained corresponding to b* = 0.43 are our estimated values of the unknown parameters. They are:

β̂0* = -30.00863,

β̂1* = 5.000263,

σ̂* = 0.04184073,

β̂* = -0.3628733,

and

λ̂* = 1.441847.

We note that our procedure estimates the location and the scale parameters very accurately. For the dataset considered here, the estimate of the shape parameter is also quite satisfactory, though slightly over-estimated.

Here, the estimated linear function Â_λ(ẑ) is

\[
\hat A_\lambda(\hat z)=0.7978846-0.5232079\,\hat z.
\]


In Figure 5.1 we plot l̂⁺ against all values of b = 0.00, 0.01, 0.02, ..., 0.99. We notice that the curve appears almost flat up to about b = 0.7 and then falls off. If we magnify the plot around the region where b takes values between 0.35 and 0.50, we observe that the curve reaches its maximum at b = 0.43. We show this in Figure 5.2.

Figure 5.1 Plot of the estimated log-likelihood l̂⁺ against the values of b in [0, 1).

Figure 5.2 Plot of the estimated log-likelihood l̂⁺ against the values of b in [0.35, 0.50].


5.5 Estimating Bias and Accuracy in Approximation Using Simulations

We now obtain 10,000 datasets by repeating the simulation method described in the earlier sections 10,000 times. We consider different values of the parameters β0, β1, σ and λ. We also consider different sample sizes, n = 20, 50, 100 and 500. However, we present here the outcomes only for a few sets of parameter values.

First, for the fixed parameter values β0 = -30, β1 = 5, σ = 0.05, and different values of the parameter λ and sample size n, we generate 10,000 datasets and observe how many times we correctly determine the sign of λ. In Table 5.1 we give the true value of the parameter λ, the sample size n, and the proportion of times the sign of λ was correctly determined out of the 10,000 simulations.

From Table 5.1, we can make the following observations:

i. When |λ| = 0.5, the sign of λ is correctly determined a little more than 50% of the time when the sample size is n = 20, and about 60% of the time when n = 500.


ii. When |λ| = 1.0, the sign of λ is correctly determined about 60% of the time when n = 20, about 70% of the time when n = 100, and almost 90% of the time when n = 500.

iii. When |λ| = 2.0, the sign of λ is correctly determined about 75% of the time when n = 20, about 95% of the time when n = 100, and almost always when n = 500.

iv. When |λ| = 3.0, the sign of λ is correctly determined almost always, for all sample sizes.
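The proportions in Table 5.1 can be reproduced in outline with a short simulation. The sketch below assumes the illustrative helpers rskewnormal, normal_mles and sign_of_lambda from earlier sections (all names ours), with the design x recycling Dataset 5.2 to length n.

    import numpy as np

    def prop_correct_sign(lam, n, reps=10_000, seed=1):
        # Monte Carlo estimate of the proportion of samples in which the
        # majority-vote rule recovers the sign of lambda.
        rng = np.random.default_rng(seed)
        x20 = np.array([100, 96, 88, 100, 100, 96, 80, 68, 92, 96,
                        88, 92, 68, 84, 84, 88, 72, 88, 72, 88], dtype=float)
        x = np.resize(x20, n)
        hits = 0
        for _ in range(reps):
            z = rskewnormal(n, lam, rng)
            y = -30.0 + 5.0 * x + 0.05 * z
            _, b1, s0sq = normal_mles(x, y)
            e = (y - y.mean() - b1 * (x - x.mean())) / np.sqrt(s0sq)
            hits += (sign_of_lambda(e) == np.sign(lam))
        return hits / reps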

True value of λ   Sample size n   Proportion of times the sign of λ is correctly determined
 0.5               20              0.5101
 0.5               50              0.5312
 0.5              100              0.5346
 0.5              500              0.5891
 1                 20              0.5790
 1                 50              0.6369
 1                100              0.7005
 1                500              0.8832
 2                 20              0.7460
 2                 50              0.8842
 2                100              0.9564
 2                500              0.9999
 3                 20              0.8445
 3                 50              0.9653
 3                100              0.9964
 3                500              1.0000
-0.5               20              0.5121
-0.5               50              0.5191
-0.5              100              0.5362
-0.5              500              0.5859
-1                 20              0.5682
-1                 50              0.6334
-1                100              0.7031
-1                500              0.8874
-2                 20              0.7369
-2                 50              0.8760
-2                100              0.9567
-2                500              1.0000
-3                 20              0.8318
-3                 50              0.9650
-3                100              0.9966
-3                500              1.0000

Table 5.1 The true values of λ, the sample size n, and the proportion of times the sign of λ is correctly determined.


Next, for a set of fixed values of the parameters and a given sample size, we generate 10,000 datasets, and from each dataset we obtain the optimum value of b, b*, and the corresponding estimated values β̂0*, β̂1*, σ̂*, λ̂* and β̂*. Thus we have a set of 10,000 estimates of the parameters. In Tables 5.2-5.33 we present the first quartile Q1, median, mean, third quartile Q3, and standard deviation (SD) of the 10,000 estimated values of β̂0*, β̂1*, σ̂*, λ̂* and β̂*.

We observe that the median estimates of the parameters perform better than the mean estimates. The numerical values of the difference between the parameters and their median estimates can be considered as estimates of bias. A few other things can be noted from the tables:

i. The estimated bias in λ̂ is smaller when the true value of |λ| is 0.5 or 1 than when it is 2 or 3. This is because, as discussed in Chapter 3, the linear function A_λ(z) given in (3.2) approximates R_λ(z) in (3.1) more accurately for smaller values of |λ|.

ii. The estimated bias in λ̂ is smaller for larger sample sizes.

iii. For larger values of |λ|, our method slightly under-estimates the true value of λ.

iv. The estimated biases in β̂0, β̂1 and σ̂ are quite small in all cases.


        Q1        Median    Mean      Q3        SD
β̂0*   -30.06    -29.98    -29.98    -29.91    0.1066
β̂1*    4.999     5         5         5.001    0.0010
σ̂*     0.051     0.0596    0.0601    0.0690   0.0127
λ̂*    -1.827     0.1266    0.0434    1.883    1.7606
β̂*    -0.3947   -0.3119   -0.3582   -0.2801   0.2355

Table 5.2. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.03    -29.98    -29.98    -29.94    0.0724
β̂1*    5         5         5         5        0.0007
σ̂*     0.0516    0.0585    0.0586    0.0652   0.0093
λ̂*    -1.321     0.2611    0.0956    1.485    1.4273
β̂*    -0.4775   -0.3692   -0.4028   -0.3183   0.1060

Table 5.3. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.02    -29.99    -29.99    -29.95    0.0550
β̂1*    5         5         5         5        0.0005
σ̂*     0.0509    0.0565    0.0567    0.0621   0.0075
λ̂*    -1.011     0.295     0.1025    1.21     1.1998
β̂*    -0.5157   -0.4202   -0.4371   -0.3565   0.0983

Table 5.4. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.01    -29.99    -29.99    -29.96    0.0310
β̂1*    5         5         5         5        0.0002
σ̂*     0.0493    0.0522    0.0527    0.0556   0.0043
λ̂*    -0.5711    0.3853    0.1631    0.8079   0.7849
β̂*    -0.5666   -0.5029   -0.5051   -0.4456   0.0727

Table 5.5. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -29.98    -29.98    -29.91    0.0939
β̂1*    4.999     5         5         5.001    0.0009
σ̂*     0.0450    0.0529    0.0532    0.0610   0.0115
λ̂*    -1.621     0.7006    0.2957    2.002    1.7377
β̂*    -0.3947   -0.3119   -0.3575   -0.2801   0.1118

Table 5.6. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.03    -29.98    -29.98    -29.94    0.0634
β̂1*    5         5         5         5        0.0005
σ̂*     0.0458    0.0522    0.0521    0.0581   0.0085
λ̂*    -0.8355    0.9211    0.4703    1.72     1.3815
β̂*    -0.4711   -0.3629   -0.3984   -0.3119   0.1062

Table 5.7. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.02    -29.99    -29.99    -29.95    0.0476
β̂1*    5         5         5         5        0.0004
σ̂*     0.0457    0.0509    0.0510    0.0560   0.0069
λ̂*    -0.4404    0.9506    0.5883    1.529    1.1345
β̂*    -0.4902   -0.4011   -0.4227   -0.3438   0.0971

Table 5.8. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
b*      0.19      0.31      0.283     0.38     0.1230
β̂0*   -30.01    -30.00    -29.99    -29.98    0.0234
β̂1*    5         5         5         5        0.0001
σ̂*     0.04563   0.04931   0.04918   0.0525   0.0046
λ̂*     0.6226    0.9807    0.8322    1.246    0.6137
β̂*    -0.5157   -0.4393   -0.4565   -0.3947   0.0784

Table 5.9. The values of Q1, Median, Mean, Q3, and SD for b*, β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -29.99    -29.98    -29.93    0.0787
β̂1*    4.999     5         5         5.001    0.0008
σ̂*     0.0383    0.0456    0.0459    0.0528   0.0104
λ̂*    -0.1266    1.827     0.9958    2.201    1.5543
β̂*    -0.3629   -0.2992   -0.3436   -0.2737   0.1074

Table 5.10. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.03    -29.99    -29.99    -29.96    0.0502
β̂1*    5         5         5         5        0.0005
σ̂*     0.0413    0.0472    0.0468    0.0524   0.0080
λ̂*     1.175     1.772     1.39      2.066    0.9952
β̂*    -0.382    -0.3183   -0.3552   -0.2865   0.0972

Table 5.11. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.02    -30       -30       -29.97    0.0346
β̂1*    5         5         5         5        0.0003
σ̂*     0.0438    0.0482    0.0476    0.0520   0.0062
λ̂*     1.442     1.772     1.573     1.941    0.6514
β̂*    -0.3629   -0.3183   -0.3462   -0.2992   0.0782

Table 5.12. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.01    -30       -30       -29.99    0.0150
β̂1*    5         5         5         5        0.0001
σ̂*     0.0476    0.0493    0.0492    0.05097  0.0026
λ̂*     1.67      1.772     1.749     1.883    0.1757
β̂*    -0.331    -0.3183   -0.3225   -0.3056   0.0234

Table 5.13. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -29.99    -29.99    -29.94    0.0716
β̂1*    5         5         5         5        0.0007
σ̂*     0.0367    0.0438    0.0440    0.0511   0.0104
λ̂*     1.283     2.066     1.427     2.273    1.3394
β̂*    -0.331    -0.2865   -0.326    -0.2674   0.1009

Table 5.14. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.03    -30       -29.99    -29.96    0.0452
β̂1*    5         5         5         5        0.0005
σ̂*     0.0416    0.0464    0.0461    0.0509   0.0072
λ̂*     1.772     2.066     1.835     2.201    0.6619
β̂*    -0.3183   -0.2865   -0.3136   -0.2737   0.0723

Table 5.15. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.02    -30       -30       -29.98    0.0310
β̂1*    5         5         5         5        0.0003
σ̂*     0.0443    0.0474    0.0473    0.0507   0.0050
λ̂*     1.883     2.066     1.972     2.132    0.3172
β̂*    -0.3056   -0.2865   -0.2987   -0.2801   0.0400

Table 5.16. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.01    -30       -30       -29.99    0.0137
β̂1*    5         5         5         5        0.0001
σ̂*     0.0467    0.0483    0.0483    0.0497   0.0021
λ̂*     2.002     2.066     2.035     2.066    0.0883
β̂*    -0.2928   -0.2865   -0.2898   -0.2865   0.0091

Table 5.17. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.09    -30.02    -30.02    -29.94    0.1060
β̂1*    4.999     5         5         5.001    0.0010
σ̂*     0.0503    0.0593    0.0597    0.0686   0.0128
λ̂*    -1.883    -0.1266   -0.0471    1.772    1.7458
β̂*    -0.4011   -0.3119   -0.361    -0.2801   0.1137

Table 5.18. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = -0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.06    -30.02    -30.02    -29.97    0.0716
β̂1*    5         5         5         5        0.0007
σ̂*     0.0516    0.0583    0.0585    0.0650   0.0094
λ̂*    -1.442    -0.1809   -0.0715    1.321    1.4253
β̂*    -0.4775   -0.3692   -0.404    -0.3183   0.1066

Table 5.19. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = -0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.05    -30.01    -30.02    -29.98    0.0552
β̂1*    5         5         5         5        0.0005
σ̂*     0.0510    0.0566    0.0567    0.0621   0.0074
λ̂*    -1.21     -0.295    -0.0984    1.043    1.2089
β̂*    -0.5157   -0.4138   -0.4359   -0.3501   0.0990

Table 5.20. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = -0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -30.01    -30.01    -29.99    0.0316
β̂1*    5         5         5         5        0.0002
σ̂*     0.0492    0.0522    0.0527    0.0558   0.0043
λ̂*    -0.8355   -0.3853   -0.1646    0.5711   0.7831
β̂*    -0.5666   -0.5093   -0.5056   -0.4456   0.0730

Table 5.21. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = -0.5, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.08    -30.02    -30.02    -29.96    0.0941
β̂1*    4.999     5         5         5.001    0.0009
σ̂*     0.0446    0.0525    0.053     0.0609   0.0115
λ̂*    -2.002    -0.5968   -0.2794    1.621    1.7341
β̂*    -0.3947   -0.3119   -0.3593   -0.2801   0.1130

Table 5.22. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = -1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.06    -30.02    -30.02    -29.97    0.0637
β̂1*    5         5         5         5        0.0006
σ̂*     0.0457    0.0519    0.0521    0.0580   0.0085
λ̂*    -1.67     -0.9211   -0.4475    0.8921   1.3911
β̂*    -0.4647   -0.3629   -0.3972   -0.3119   0.1050

Table 5.23. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = -1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.05    -30.01    -30.01    -29.98    0.0474
β̂1*    5         5         5         5        0.0004
σ̂*     0.0457    0.0509    0.0510    0.0560   0.0069
λ̂*    -1.529    -0.9807   -0.5904    0.4132   1.1344
β̂*    -0.4966   -0.4011   -0.4228   -0.3438   0.0978

Table 5.24. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = -1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.02    -30       -30.01    -29.99    0.0233
β̂1*    5         5         5         5        0.0002
σ̂*     0.0457    0.0494    0.0492    0.0527   0.0045
λ̂*    -1.246    -1.011    -0.8469   -0.6226   0.6098
β̂*    -0.5093   -0.4393   -0.4546   -0.3883   0.0784

Table 5.25. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = -1, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.07    -30.02    -30.02    -29.96    0.0784
β̂1*    4.999     5         5         5.001    0.0008
σ̂*     0.0382    0.0458    0.0460    0.0532   0.0106
λ̂*    -2.201    -1.772    -0.9625    0.1266   1.5753
β̂*    -0.3629   -0.2992   -0.3443   -0.2737   0.1086

Table 5.26. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = -2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -30.01    -30.01    -29.97    0.0503
β̂1*    5         5         5         5        0.0005
σ̂*     0.0411    0.0472    0.0469    0.0525   0.0080
λ̂*    -2.066    -1.772    -1.372    -1.175    1.0277
β̂*    -0.3756   -0.3183   -0.3536   -0.2865   0.0953

Table 5.27. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = -2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.03    -30       -30       -29.98    0.0347
β̂1*    5         5         5         5        0.0004
σ̂*     0.0440    0.0482    0.0477    0.0520   0.0063
λ̂*    -2.002    -1.772    -1.585    -1.442    0.6487
β̂*    -0.3629   -0.3183   -0.3447   -0.2928   0.0782

Table 5.28. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = -2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.01    -30       -30       -29.99    0.0148
β̂1*    5         5         5         5        0.0002
σ̂*     0.0476    0.0493    0.0492    0.0510   0.0026
λ̂*    -1.883    -1.772    -1.75     -1.67     0.1731
β̂*    -0.331    -0.3183   -0.3224   -0.3056   0.0230

Table 5.29. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = -2, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -30       -30.06    -29.96    0.0717
β̂1*    5         5         5         5        0.0007
σ̂*     0.0365    0.0435    0.0438    0.0507   0.0102
λ̂*    -2.273    -2.066    -1.383    -1.175    1.3730
β̂*    -0.331    -0.2865   -0.3274   -0.2674   0.1015

Table 5.30. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 20, β0 = -30, β1 = 5, σ = 0.05, and λ = -3.0, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.04    -30       -30.01    -29.98    0.0445
β̂1*    5         5         5         5        0.0005
σ̂*     0.0416    0.0464    0.0460    0.0509   0.0072
λ̂*    -2.201    -2.066    -1.839    -1.772    0.6550
β̂*    -0.3183   -0.2865   -0.3132   -0.2737   0.0720

Table 5.31. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 50, β0 = -30, β1 = 5, σ = 0.05, and λ = -3.0, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.02    -30       -30       -29.98    0.0309
β̂1*    5         5         5         5        0.0003
σ̂*     0.0443    0.0476    0.0473    0.0507   0.0050
λ̂*    -2.132    -2.066    -1.976    -1.883    0.3083
β̂*    -0.3056   -0.2865   -0.2986   -0.2801   0.0409

Table 5.32. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 100, β0 = -30, β1 = 5, σ = 0.05, and λ = -3.0, in approximating (3.1) by (3.2).

        Q1        Median    Mean      Q3        SD
β̂0*   -30.01    -30       -30       -29.99    0.0135
β̂1*    5         5         5         5        0.0001
σ̂*     0.0469    0.0483    0.0483    0.0497   0.0021
λ̂*    -2.066    -2.066    -2.035    -2.002    0.0887
β̂*    -0.2928   -0.2865   -0.2897   -0.2865   0.0091

Table 5.33. The values of Q1, Median, Mean, Q3, and SD for β̂0*, β̂1*, σ̂*, λ̂*, and β̂* when n = 500, β0 = -30, β1 = 5, σ = 0.05, and λ = -3.0, in approximating (3.1) by (3.2).

Chapter 6

Conclusion

In this dissertation we have seen the challenges in finding the maximum likelihood estimates of the parameters of the skew normal distribution. We have argued that the complex function formed by the ratio of the normal density and distribution functions, appearing in the likelihood equations in the presence of the shape, location and scale parameters, makes it very difficult to estimate the parameters. We proposed simple linear and non-linear approximations to this complex function, and we have seen that the linear function approximates it quite satisfactorily. Thus, using the linear approximating function in the likelihood equations, we estimate the parameters of interest. We also present our estimation procedure in a regression setup, assuming a covariate X to be present.

In this dissertation we have considered two cases. First, since our main parameter of interest is the shape parameter, we assumed the location and the scale parameters to be fixed and considered estimating only the shape parameter. We presented an estimation procedure for this situation using both the linear and the non-linear approximating functions for the complex ratio of the normal density and distribution functions. Second, we assumed all the parameters to be unknown and presented a numerical method to estimate them, using only the linear approximating function. We performed simulation studies to evaluate the performance of the estimation procedures for the parameters.

The skew normal distribution has great potential in applications. The absence of a simple, efficient estimation procedure for its parameters has been a major hindrance to its wider use. Our research provides an efficient and simple estimation procedure, as well as a new approach to resolving the complexity in solving the maximum likelihood estimating equations.


In this dissertation we approach the challenges of maximum likelihood estimation in a novel way, and there is ample scope for future research in this direction. Better approximating functions are one way of improving the performance of the estimation procedure. We have presented a numerical procedure for estimation; future work may investigate whether, based on our approach, an analytical procedure can be developed. The properties of the skew normal distribution in the light of these approximations can also be revisited.


Bibliography

[1] Arellano-Valle, R.B., Bolfarine, H., & Lachos, V.H. (2005a). Skew-Normal Linear Mixed Models. Journal of Data Science 3, 415-438.

[2] Arellano-Valle, R.B., Del Pino, G., & San Martin, E. (2002). Definition and Probabilistic Properties of Skew Normal Distributions. Statist. Probab. Lett. 58, 111-121.

[3] Arnold, B.C. & Beaver, R.J. (2000a). Hidden Truncation Models. Sankhya, Ser. A 62, 22-35.

[4] Arnold, B.C. & Lin, G.D. (2004). Characterizations of the Skew Normal and Generalized Chi Distributions. Sankhya 66, 593-606.

[5] Arnold, B.C., Castillo, E., & Sarabia, J.M. (2007). Distributions with Generalized Skewed Conditionals and Mixtures of such Distributions. Commun. Statist. - Theory & Methods 36, 1493-1503.

[6] Azzalini, A. (1985). A Class of Distributions Which Includes the Normal Ones. Scand. J. Statist. 12, 171-178.

[7] Azzalini, A. (2001). A Note on Regions of Given Probability of the Skew Normal Distribution. Metron LIX, 27-34.

[8] Azzalini, A. (1986). Further Results on a Class of Distributions Which Includes the Normal Ones. Statistica 46, 199-208.

[9] Azzalini, A. & Dalla Valle, A. (1996). The Multivariate Skew-Normal Distribution. Biometrika 83, 715-726.

[10] Bansal, N.K., Maadooliat, M. & Wang, X. (2008). Empirical Bayes and Hierarchical Bayes Estimation of Skew Normal Populations. 37, 1024-1037.

[11] Capitanio, A., Azzalini, A., & Stanghellini, E. (2003). Graphical Models for Skew-Normal Variates. Scand. J. Statist. 30, 129-144.

[12] Catchpole, E.A. & Morgan, B.J.T. (1997). Detecting Parameter Redundancy. Biometrika 84, 187-196.

[13] Chen, J.T., Gupta, A.K. & Nguyen, T.T. (2004). The Density of the Skew Normal Sample Mean and Application. J. Statist. Comput. Simul. 74, 487-494.

[14] Chiogna, M. (1998). Some Results on the Scalar Skew-Normal Distribution. J. Ital. Statist. Soc. 7, 1-13.

[15] Chiogna, M. (2005). A Note on the Asymptotic Distribution of the Maximum Likelihood Estimator for the Skew Normal Distribution. Stat. Meth. & Appl. 14, 331-341.

[16] Dalla Valle, A. (2004). The Skew-Normal Distribution. In: M.G. Genton (Ed.), Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality, Chapter 1. Boca Raton, FL: Chapman and Hall/CRC, 3-24.

[17] Dalla Valle, A. (2007). A Test for the Hypothesis of Skew-Normality in a Population. J. Statist. Comput. Simul. 77, 63-77.

[18] D'Agostino, R.B. and Stephens, M.A. (1986). Handbook of Goodness-of-Fit Techniques. New York: Marcel Dekker.

[19] Genton, M.G. (2005). Discussion of "The Skew Normal Distribution and Related Multivariate Families" by A. Azzalini. Scand. J. Statist. 32, 189-198.

[20] Genton, M.G., He, L. and Liu, X. (2001). Moments of Skew-Normal Random Vectors and Their Quadratic Forms. Statistics & Probability Letters 51, 319-325.

[21] Gupta, A.K. & Chen, T. (2001). Goodness-of-Fit Tests for the Skew Normal Distribution. Commun. Statist. - Simulation & Computation 30, 907-930.

[22] Gupta, A.K. & Huang, W.J. (2002). Quadratic Forms in the Skew Normal Variates. J. Math. Anal. Appl. 273, 558-564.

[23] Monti, A.C. (2003). A Note on the Estimation of the Skew Normal and the Skew Exponential Power Distributions. Metron LXI, 205-219.

[24] Pewsey, A. (2000). Problems of Inference for Azzalini's Skew Normal Distribution. Journal of Applied Statistics 27, 859-870.

[25] Rohatgi, V.K. & Ehsanes Saleh, A.K.Md. (2003). An Introduction to Probability and Statistics, Second Edition. Wiley Series in Probability and Statistics.

[26] Sartori, N. (2006). Bias Prevention of Maximum Likelihood Estimates for Scalar Skew Normal and Skew t Distributions. J. Statist. Planning and Inference 136, 4259-4275.