UNIVERSITY OF CALIFORNIA RIVERSIDE
Estimation of the Parameters of Skew Normal Distribution by Approximating the Ratio of the Normal Density and Distribution Functions
A Dissertation submitted in partial satisfaction of the requirements for the degree of
Doctor of Philosophy
in
Applied Statistics
by
Debarshi Dey
August, 2010
Dissertation Committee: Dr. Subir Ghosh, Chairperson Dr. Barry Arnold Dr. Aman Ullah
Copyright by Debarshi Dey 2010
The Dissertation of Debarshi Dey is approved:
______
______
______
Committee Chairperson
University of California, Riverside
ACKNOWLEDGEMENTS
I would like to take this opportunity to express my sincere gratitude and thanks to my advisor,
Professor Subir Ghosh, for his continuous and untiring guidance over the course of the last five years. His sincere interest not only in my academic progress but also in my personal well-being has been a source of sustained motivation for me.
I would also like to extend my sincere thanks to Professor Barry Arnold, of the
Department of Statistics, UC Riverside, and Professor Aman Ullah, of the Department of
Economics, UC Riverside, for graciously accepting to serve on my PhD Committee and for their valuable time and advice.
I wish to thank the entire faculty of the Department of Statistics for enriching me with their vast knowledge in various fields of Statistics.
I would like to thank the entire staff of the Department of Statistics who were always ready with their help.
My friends here at UCR, have also been a source of major strength and support for the last five years. Your friendship has made my experience at Riverside all the more memorable and fulfilling.
A very special thanks to my wife, Trupti, for being with me and for being my pillar of strength. Though she joined me only a few months ago, her boundless affection, and immense support are invaluable to me in this accomplishment.
I would like to thank my parents, because it is for them and their hard work and sacrifice that made me achieve whatever little I have achieved. Without their constant support, and their immense faith in me, I might not have pursued and persevered. Together with them, my younger brother, Tukan, and my Grandmother, Danni, are equally committed to my success, and have taken immense pride in my accomplishments.
Finally I would like to express my most humble gratitude to my Divine Master,
Bhagawan Sri Sathya Sai Baba, without whose Grace and Benevolence I could not have achieved anything.
ABSTRACT OF THE DISSERTATION
Estimation of the Parameters of Skew Normal Distribution Using Linear Approximations of the Ratio of the Normal Density and Distribution Functions
by
Debarshi Dey
Doctor of Philosophy, Graduate Program in Applied Statistics
University of California, Riverside, August 2010
Dr. Subir Ghosh, Chairperson
The normal distribution is symmetric and enjoys many important properties, which is why it is so widely used in practice. Asymmetry in the data, however, is a situation where the normality assumption is not valid. Azzalini (1985) introduced the skew normal distribution to reflect varying degrees of skewness. The skew normal distribution is mathematically tractable and includes the normal distribution as a special case. It has three parameters: location, scale, and shape. In this thesis we respond to the complexity and challenges in the maximum likelihood estimation of the three parameters of the skew normal distribution. The complexity is traced to the ratio of the normal density and distribution functions appearing in the likelihood equations in the presence of the skewness parameter. We address this problem by approximating the ratio by linear and non-linear functions, and we observe that the linear approximation performs quite satisfactorily. We present a method of estimation of the parameters of the skew normal distribution based on this linear approximation, define a performance measure to evaluate the approximation and the estimation method based on it, and present simulation studies to illustrate the methods and evaluate their performance.
Contents
1. Introduction 1
1.1 Motivation and Historical Development...... 2
1.2 Normal Distribution and Simple Linear Regression...... 5
1.3 Skew Normal Distribution and Regression...... 8
1.4 Thesis Description...... 10
2. The Univariate Skew Normal Distribution 11
2.1 The Univariate Skew Normal Distribution...... 12
2.2 Moments of the Univariate Skew Normal Distribution...... 15
2.3 Likelihood Function and Maximum Likelihood Estimates...... 20
2.4 Challenges of the Maximum Likelihood Estimates of the Skew Normal Distribution...... 30
2.5 Literature Review on Challenges...... 33
3. Approximations of the ratio of the Standard Normal Density
and Distribution Functions 35
3.1 Introduction...... 36
3.2 Motivation...... 38
3.3 Fitting the Linear Approximation to the Ratio...... 41
3.4 Fitting the Non-linear Approximation to the Ratio...... 46
4. Estimation of the Shape Parameter of the Standard
Skew Normal Distribution 51
4.1 Introduction...... 52
4.2 The Estimation Procedure using Aλ(z)...... 53
4.2.1 Case I : Covariate X Present...... 54
4.2.2 Case II : Covariate X Absent...... 56
4.2.3 Measure of Goodness of Fit...... 57
4.3 The Estimation Procedure using Bλ(z)...... 58
4.3.1 Case I : Covariate X Present...... 59
4.3.2 Case II : Covariate X Absent...... 61
4.3.3 Measure of Goodness of Fit...... 63
4.4 A Simulated Data Set...... 64
4.4.1 Case I : Covariate X Present...... 65
4.4.2 Case II : Covariate X Absent...... 70
4.5 Estimating Bias and Accuracy in Approximations using Simulations...... 75
5. Estimation of Location, Scale and Shape Parameter of a Skew Normal Distribution 80
5.1 Introduction...... 81
5.2 Relations Among Estimated Parameters...... 82
5.3 Estimation Procedure using Aλ(z)...... 91
5.4 A Simulated Data Set...... 99
5.5 Estimating Bias and Accuracy in Approximation using Simulations...... 104
6. Conclusion 124
Bibliography 127
List of Figures
2.1 The pdf of Z ~ SN(0,1,λ) for λ = 1, 2 and 10...... 14
2.2 Probability that Z < 0 for values of λ ranging from 0 to 30...... 31
3.1 Plots of Rλ(z) against z for λ = 0.5, 1 and 2...... 37
3.2 Plots of Aλ(z) against z for λ = 0.5, 1 and 2...... 42
3.3 Plots of Rλ(z) against z for λ = 0.5 (continuous lines) and Aλ(z) against z for λ = 0.5 (dotted lines)...... 43
3.4 Plots of Rλ(z) against z for λ = 1 (continuous lines) and Aλ(z) against z for λ = 1 (dotted lines)...... 44
3.5 Plots of Rλ(z) against z for λ = 2 (continuous lines) and Aλ(z) against z for λ = 2 (dotted lines)...... 45
3.6 Plots of Bλ(z) against z for λ = 0.5, 1 and 2...... 47
3.7 Plots of Rλ(z) against z for λ = 0.5 (continuous lines) and Bλ(z) against z for λ = 0.5 (dotted lines)...... 48
3.8 Plots of Rλ(z) against z for λ = 1 (continuous lines) and Bλ(z) against z for λ = 1 (dotted lines)...... 49
3.9 Plots of Rλ(z) against z for λ = 2 (continuous lines) and Bλ(z) against z for λ = 2 (dotted lines)...... 50
4.1 Plots of both Rλ(z) and Âλ(z) against z when covariate X is present...... 67
4.2 Plots of both Rλ(z) and B̂λ(z) against z when covariate X is present...... 69
4.3 Plots of both Rλ(z) and Âλ(z) against z when covariate X is absent...... 72
4.4 Plots of both Rλ(z) and B̂λ(z) against z when covariate X is absent...... 74
5.1 Plot of the estimated log-likelihood l̂ against the values of b ∈ [0, 1)..... 102
5.2 Plot of the estimated log-likelihood l̂ against the values of b ∈ [0.35, 0.50]...... 103
List of Tables
4.1 The values of Q1, Median, Mean, Q3, and SD for the estimates α̂ and λ̂, and Ave A, when β0 = 30, β1 = 5, σ = 0.5, and λ = 0.5 in approximating (3.1) by (3.2)...... 77
4.2 The values of Q1, Median, Mean, Q3, and SD for the estimates α̂ and λ̂, and Ave A, when β0 = 30, β1 = 5, σ = 0.5, and λ = 1.5 in approximating (3.1) by (3.2)...... 77
4.3 The values of Q1, Median, Mean, Q3, and SD for the estimates α̂ and λ̂, and Ave A, when β0 = 30, β1 = 5, σ = 1.0, and λ = 0.5 in approximating (3.1) by (3.2)...... 77
4.4 The values of Q1, Median, Mean, Q3, and SD for the estimates α̂ and λ̂, and Ave A, when β0 = 30, β1 = 5, σ = 1.0, and λ = 1.5 in approximating (3.1) by (3.2)...... 78
4.5 The values of Q1, Median, Mean, Q3, and SD for the estimates β̂ and λ̂, and Ave B, when β0 = 30, β1 = 5, σ = 0.5, and λ = 0.5 in approximating (3.1) by (3.3)...... 78
4.6 The values of Q1, Median, Mean, Q3, and SD for the estimates β̂ and λ̂, and Ave B, when β0 = 30, β1 = 5, σ = 0.5, and λ = 1.5 in approximating (3.1) by (3.3)...... 78
4.7 The values of Q1, Median, Mean, Q3, and SD for the estimates β̂ and λ̂, and Ave B, when β0 = 30, β1 = 5, σ = 1.0, and λ = 0.5 in approximating (3.1) by (3.3)...... 79
4.8 The values of Q1, Median, Mean, Q3, and SD for the estimates β̂ and λ̂, and Ave B, when β0 = 30, β1 = 5, σ = 1.0, and λ = 1.5 in approximating (3.1) by (3.3)...... 79
5.1 The true values of λ, the sample size n, and the proportion of times the sign of λ is correctly determined...... 105
5.2 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 108
5.3 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 108
5.4 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 109
5.5 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 109
5.6 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 110
5.7 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 110
5.8 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 111
5.9 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 111
5.10 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 112
5.11 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 112
5.12 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 113
5.13 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 113
5.14 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 114
5.15 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 114
5.16 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 115
5.17 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 115
5.18 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 116
5.19 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 116
5.20 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 117
5.21 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 0.5 in approximating (3.1) by (3.2)...... 117
5.22 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 118
5.23 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 118
5.24 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 119
5.25 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 1.0 in approximating (3.1) by (3.2)...... 119
5.26 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 120
5.27 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 120
5.28 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 121
5.29 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 2.0 in approximating (3.1) by (3.2)...... 121
5.30 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 20, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 122
5.31 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 50, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 122
5.32 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 100, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 123
5.33 The values of Q1, Median, Mean, Q3, and SD for b̂, β̂0, β̂1, σ̂, and λ̂ when n = 500, β0 = 30, β1 = 5, σ = 0.05, and λ = 3.0 in approximating (3.1) by (3.2)...... 123
Chapter 1
Introduction
1.1 Motivation and Historical Development
In drawing statistical inferences under the parametric framework, we assume a distribution that describes the data in the best possible way. The celebrated Gaussian distribution is the most popular distribution for describing data. Its popularity has been driven by its analytical simplicity, the associated Central Limit Theorem, its multivariate extension (both the marginals and the conditionals being normal), additivity, and other properties. However, there are numerous situations where the Gaussian assumption may not be valid. As alternatives, many near-normal distributions have been proposed (Mudholkar and Hutson (2000), Turner (1960), Prentice (1975), and Azzalini (1985)). These families describe the variations from normality, share some desirable properties of the normal distribution to some extent, and also include the normal distribution as a special case. In many situations where the data cannot be satisfactorily modeled by a normal distribution, these parametric distributions provide alternatives for drawing inferences.
Some of these families deal with deviations from symmetry in the normal distribution. They are analytically tractable, accommodate reasonable degrees of skewness and kurtosis, and include the normal distribution as a special case. One such distribution is the skew normal distribution, proposed by Azzalini (1985). While the normal distribution, with its symmetry, has only location and scale parameters, the skew normal distribution has an additional shape parameter describing the skewness. From a practical standpoint this is a very desirable property, since in many real-life situations some skewness is present in the data. In addition, the skew normal distribution shares many important properties of the normal distribution: for example, the skew normal densities are unimodal, their support is the real line, and the square of a standard skew normal random variable has the Chi-square distribution with one degree of freedom.
Arnold et al. (1993) provided the following motivation for the skew normal model.
Let X be the GPA and Y be the SAT score of students applying for admission to colleges. We assume (X,Y) to follow a bivariate normal distribution, which implies that the marginal distributions of both X and Y are normal. Suppose we consider a college where the admitted students have an above-average SAT score. If we consider the X values of the admitted students in that college, the distribution of these X values follows a skew normal distribution.
Although the skew normal distribution has some attractive properties, there are complexities and challenges present in drawing inferences on its parameters. Detailed discussions of these issues can be found in Sartori (2006), Dalla Valle (2004), Pewsey (2000), Monti (2003), and others. Different methods for overcoming these issues are presented by Azzalini and Capitanio (1999), Liseo and Loperfido (2002), Monti (2003), and Sartori (2005).
This thesis identifies a source of the complexity in the maximum likelihood estimation of the skew normal parameters. The source is the ratio of the normal density and distribution functions, in the presence of the location, scale, and shape parameters of the skew normal distribution. We propose approximations of this ratio by linear and non-linear functions. We observe that the linear approximation performs quite satisfactorily in approximating the complex function; hence we use the linear approximation for our estimation procedure. We consider a linear regression setup in which the location parameter of the skew normal distribution is a linear function of a covariate X.
1.2 Normal Distribution and Simple Linear
Regression
The normal distribution is widely used in describing data in many applications. The distribution is symmetric with mean μ and standard deviation σ, and approximately 99.7% of the distribution lies in the range μ ± 3σ. The probability density function of a normal random variable X is

$$ f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\left\{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right\}, \qquad -\infty < x < \infty. \quad (1.1) $$

We denote X ~ N(μ, σ).
We consider a simple linear regression setup, with Y as the response variable and X as the predictor variable.
A simple linear regression model with the normality assumption is

$$ Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad (1.2) $$

where Y_i is the response variable, x_i is the predictor variable, β0 is the intercept parameter, β1 is the slope parameter, and the ε_i are i.i.d. N(0, σ), for i = 1, ..., n. The unknown parameters are β0, β1, and σ.

Thus, for fixed values of x_i, the observed Y_i's are independent N(β0 + β1 x_i, σ), i = 1, ..., n.
The joint probability density function of Y1, Y2, ..., Yn is then given by

$$ L(\beta_0,\beta_1,\sigma \mid y_1,\ldots,y_n) = f(y_1,\ldots,y_n;\beta_0,\beta_1,\sigma) = \prod_{i=1}^{n} f(y_i;\beta_0,\beta_1,\sigma) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i-\beta_0-\beta_1 x_i}{\sigma}\right)^{2}\right\}. \quad (1.3) $$

The log-likelihood function is given by

$$ \log L(\beta_0,\beta_1,\sigma \mid y_1,\ldots,y_n) = -n\log\sigma - \frac{n}{2}\log 2\pi - \frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i-\beta_0-\beta_1 x_i}{\sigma}\right)^{2}. \quad (1.4) $$
The likelihood equations for β0, β1, and σ are obtained by differentiating the log-likelihood function given in (1.4) with respect to β0, β1, and σ respectively and setting the derivatives equal to zero.
The likelihood equations are given by:

$$ \sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right) = 0, \quad (1.5) $$

$$ \sum_{i=1}^{n} x_i \left(y_i - \beta_0 - \beta_1 x_i\right) = 0, \quad (1.6) $$

and

$$ \sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)^{2} = n\sigma^{2}. \quad (1.7) $$
Solving (1.5)-(1.7), we get the maximum likelihood estimators of β0, β1, and σ as:

$$ \hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^{2}}, \quad (1.8) $$

$$ \hat\beta_0 = \bar y - \hat\beta_1 \bar x, \quad (1.9) $$

and

$$ \hat\sigma^{2} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right)^{2}. \quad (1.10) $$

The fitted values of the y_i's, i = 1, ..., n, are then

$$ \hat y_i = \hat\beta_0 + \hat\beta_1 x_i. \quad (1.11) $$
The residuals are given by y_i − ŷ_i, i = 1, ..., n.
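Since the estimators (1.8)-(1.10) are in closed form, they are easy to verify numerically. The following sketch (variable names and simulated values are ours, chosen purely for illustration) computes them for data generated from model (1.2):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 10.0, n)
beta0, beta1, sigma = 30.0, 5.0, 2.0          # illustrative true values
y = beta0 + beta1 * x + rng.normal(0.0, sigma, n)

# Maximum likelihood estimators (1.8)-(1.10)
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
resid = y - (b0_hat + b1_hat * x)             # residuals y_i - yhat_i
sigma2_hat = np.mean(resid ** 2)              # note the ML divisor n, not n - 2
```

The residuals satisfy the likelihood equations (1.5)-(1.6) exactly: they sum to zero, as does their cross product with x.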
1.3 Skew Normal Distribution and Regression
We present the skew normal distribution and its properties in detail in Chapter 2.
Here we introduce the skew normal random variable and an outline of our research.
A random variable Y is said to have a skew normal distribution with location parameter μ, scale parameter σ, and skewness parameter λ if its probability density function is given by

$$ f(y;\mu,\sigma,\lambda) = \frac{2}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right), \qquad -\infty < y < \infty, \quad (1.12) $$

where φ(·) is the standard normal probability density function and Φ(·) is the standard normal cumulative distribution function.

We denote Y ~ SN(μ, σ, λ).
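For readers who wish to work with this density numerically, the sketch below implements (1.12) directly and checks it against `scipy.stats.skewnorm`, which, as far as we are aware, uses the same parameterization (shape a = λ, loc = μ, scale = σ); the helper name `sn_pdf` is ours:

```python
import numpy as np
from scipy.stats import norm, skewnorm

def sn_pdf(y, mu, sigma, lam):
    """Skew normal density (1.12): (2/sigma) * phi(z) * Phi(lam*z), z = (y-mu)/sigma."""
    z = (y - mu) / sigma
    return (2.0 / sigma) * norm.pdf(z) * norm.cdf(lam * z)

y = np.linspace(-5.0, 5.0, 11)
assert np.allclose(sn_pdf(y, mu=1.0, sigma=2.0, lam=3.0),
                   skewnorm.pdf(y, a=3.0, loc=1.0, scale=2.0))
```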
We consider a simple linear regression setup, with Y as the response variable and X as the predictor variable.
A simple linear regression model with the skew-normality assumption is

$$ Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad (1.13) $$

where Y_i is the response variable, x_i is the predictor variable, β0 is the intercept parameter, β1 is the slope parameter, and the ε_i are i.i.d. SN(0, σ, λ), for i = 1, ..., n. The unknown parameters are β0, β1, σ, and λ.

Thus, for fixed values of x_i, the observed Y_i's are independent SN(β0 + β1 x_i, σ, λ), i = 1, ..., n.
We thus have four unknown parameters of interest, namely β0, β1, σ, and λ. In this thesis, we resolve the challenges and complexities involved in the maximum likelihood estimation of the parameters β0, β1, σ, and λ, and present a new method for solving the maximum likelihood estimating equations.
1.4 Thesis Description
In Chapter 2, we present the skew normal distribution, its properties and moments, its likelihood function and maximum likelihood estimates, and the associated information matrix. We also discuss the challenges of maximum likelihood estimation and present a review of the literature on these challenges.
In Chapter 3, we identify the ratio of the normal density and distribution functions in the presence of the shape parameter as the source of the complexity in estimating the parameters, and propose linear and non-linear approximations of the ratio.
In Chapter 4, we present a procedure for the estimation of the shape parameter of the skew normal distribution, assuming that the location and the scale parameter are known. We also evaluate the performance of the estimation procedure by a simulation study.
In Chapter 5, we present a procedure for the estimation of the location, scale and the shape parameter of the skew normal distribution. Here we use only the linear approximation of the ratio for our estimation procedure. We evaluate the performance of the estimators by a simulation study.
Chapter 2
The Univariate Skew Normal
Distribution.
2.1 The Univariate Skew Normal
Distribution
In this Section we present the univariate skew normal distribution, introduced by
Azzalini (1985). We then discuss several properties and moments of this distribution.
Definition 1. A random variable Z is said to have a standard normal distribution if its probability density function (pdf) is given by

$$ \phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \qquad -\infty < z < \infty. \quad (2.1) $$

We describe a standard normal random variable by Z ~ N(0,1).

The cumulative distribution function (cdf) of a standard normal random variable will be denoted by Φ(·), where

$$ \Phi(z) = \int_{-\infty}^{z} \phi(u)\, du. \quad (2.2) $$
Definition 2. [Azzalini (1985)] A random variable Z is said to have a skew normal distribution if its pdf is given by

$$ f(z) = 2\,\phi(z)\,\Phi(\lambda z), \qquad -\infty < z < \infty, \quad (2.3) $$

where λ, a real number, is the skewness parameter, and φ(·) and Φ(·) are the standard normal pdf and cdf respectively, as given in (2.1) and (2.2).

For brevity we write Z ~ SN(0,1,λ).
The cdf of the skew normal distribution is denoted by Φ(z; λ), where

$$ \Phi(z;\lambda) = \int_{-\infty}^{z} f(u)\, du. \quad (2.4) $$

Figure 2.1 shows the shape of the pdf (2.3) for three values of the skewness parameter λ, namely 1, 2, and 10. We observe that the shape of the pdf becomes increasingly skewed to the right as the value of λ increases. When λ = 1, the shape is slightly skewed to the right, and when λ = 10, the shape is close to the pdf of a half normal random variable. The right tails of these distributions for the above three values of λ become virtually indistinguishable for values of z greater than 2.
Figure 2.1 The pdf of Z ~ SN(0,1,λ) for λ = 1, 2 and 10.
We now state the properties P1-P6 of Z ~ SN(0,1,λ):

P1. When λ = 0, Z ~ N(0,1).

P2. As λ → ∞, f(z) tends to 2φ(z)I(z ≥ 0), which is the half normal pdf.

P3. −Z ~ SN(0,1,−λ).

P4. Φ(−z; λ) = 1 − Φ(z; −λ).

P5. log f(z) is a concave function of z, and hence the pdf f(z) is a unimodal function of z.

P6. Z² ~ χ²₁.
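These properties lend themselves to a quick simulation check. The sketch below uses the stochastic representation Z = δ|U| + √(1−δ²)V with δ = λ/√(1+λ²) and U, V independent N(0,1) — a standard construction for the skew normal, though not one derived in this chapter — to sample from SN(0,1,λ) and verify the mean formula of Section 2.2 together with property P6:

```python
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(1)
lam = 2.0
delta = lam / np.sqrt(1 + lam**2)

# Stochastic representation of SN(0,1,lam):
# Z = delta*|U| + sqrt(1 - delta^2)*V, with U, V independent N(0,1).
u = rng.standard_normal(100_000)
v = rng.standard_normal(100_000)
z = delta * np.abs(u) + np.sqrt(1 - delta**2) * v

# E[Z] = sqrt(2/pi)*delta (Section 2.2), and P6: Z^2 ~ chi-square(1)
assert abs(z.mean() - np.sqrt(2 / np.pi) * delta) < 0.01
assert kstest(z**2, chi2(1).cdf).pvalue > 1e-4
```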
2.2 Moments of the Univariate Skew
Normal Distribution.
In this Section we derive the moment generating function and the moments of a skew normal random variable.
Lemma 1. Let Z be a standard normal random variable, and let h and k be real numbers. Then

$$ E\,\Phi(hZ+k) = \Phi\!\left(\frac{k}{\sqrt{1+h^{2}}}\right) \text{ for all } h \text{ and } k, \quad (2.5) $$

where Φ(·) is the standard normal distribution function.
Proof. Let Z be a standard normal random variable. For any real h and k, we write E Φ(hZ+k) as

$$ \Psi(h,k) = E\,\Phi(hZ+k) = \int_{-\infty}^{\infty} \Phi(hz+k)\,\phi(z)\, dz. \quad (2.6) $$

Differentiating (2.6) with respect to k, we get

$$ \frac{\partial \Psi(h,k)}{\partial k} = \int_{-\infty}^{\infty} \phi(hz+k)\,\phi(z)\, dz = \int_{-\infty}^{\infty} \frac{1}{2\pi} \exp\left\{-\frac{1}{2}\left[(hz+k)^{2} + z^{2}\right]\right\} dz $$

$$ = \frac{1}{2\pi} \exp\left\{-\frac{k^{2}}{2(1+h^{2})}\right\} \int_{-\infty}^{\infty} \exp\left\{-\frac{1+h^{2}}{2}\left(z + \frac{hk}{1+h^{2}}\right)^{2}\right\} dz. $$

Letting u = √(1+h²) (z + hk/(1+h²)), we have

$$ \frac{\partial \Psi(h,k)}{\partial k} = \frac{1}{2\pi} \exp\left\{-\frac{k^{2}}{2(1+h^{2})}\right\} \frac{\sqrt{2\pi}}{\sqrt{1+h^{2}}} = \frac{1}{\sqrt{1+h^{2}}}\,\phi\!\left(\frac{k}{\sqrt{1+h^{2}}}\right). $$

Now integrating with respect to k, and noting that Ψ(h,k) → 0 as k → −∞, we have

$$ \Psi(h,k) = \Phi\!\left(\frac{k}{\sqrt{1+h^{2}}}\right). $$

This proves the lemma.
Theorem 1. When Z ~ SN(0,1,λ), the moment generating function of Z is

$$ M_Z(t) = 2\exp\left(\frac{t^{2}}{2}\right)\Phi\!\left(\frac{\lambda t}{\sqrt{\lambda^{2}+1}}\right). \quad (2.7) $$

Proof.

$$ M_Z(t) = E\,e^{tZ} = 2\int_{-\infty}^{\infty} e^{tz}\,\phi(z)\,\Phi(\lambda z)\, dz = 2\exp\left(\frac{t^{2}}{2}\right) \int_{-\infty}^{\infty} \phi(z-t)\,\Phi(\lambda z)\, dz $$

$$ = 2\exp\left(\frac{t^{2}}{2}\right) \int_{-\infty}^{\infty} \phi(u)\,\Phi(\lambda u + \lambda t)\, du = 2\exp\left(\frac{t^{2}}{2}\right) E\,\Phi(\lambda U + \lambda t), \qquad U \sim N(0,1). $$

From Lemma 1, with h = λ and k = λt, we get

$$ M_Z(t) = 2\exp\left(\frac{t^{2}}{2}\right)\Phi\!\left(\frac{\lambda t}{\sqrt{\lambda^{2}+1}}\right). $$
The first moment of a skew normal random variable Z is given by

$$ E\,Z = M_Z'(0) = \left[\, 2t\exp\left(\frac{t^{2}}{2}\right)\Phi\!\left(\frac{\lambda t}{\sqrt{\lambda^{2}+1}}\right) + 2\exp\left(\frac{t^{2}}{2}\right)\frac{\lambda}{\sqrt{\lambda^{2}+1}}\,\phi\!\left(\frac{\lambda t}{\sqrt{\lambda^{2}+1}}\right) \right]_{t=0} = \sqrt{\frac{2}{\pi}}\,\frac{\lambda}{\sqrt{1+\lambda^{2}}}. $$

The second moment of Z is obtained by differentiating once more and evaluating at t = 0, which gives E Z² = 1 (consistent with property P6). We therefore get

$$ E\,Z = \sqrt{\frac{2}{\pi}}\,\frac{\lambda}{\sqrt{1+\lambda^{2}}}, \quad (2.8) $$

and

$$ \operatorname{Var} Z = 1 - \frac{2}{\pi}\,\frac{\lambda^{2}}{1+\lambda^{2}}. \quad (2.9) $$
In practice, it is common to work with the location and scale transformation Y = μ + σZ, where μ is a real number and σ > 0. Hence the density of the random variable Y distributed as SN(μ,σ,λ) is

$$ f(y;\mu,\sigma,\lambda) = \frac{2}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right), \qquad -\infty < y < \infty. \quad (2.10) $$

The expectation and variance of Y are given by

$$ E\,Y = \mu + \sigma\sqrt{\frac{2}{\pi}}\,\frac{\lambda}{\sqrt{1+\lambda^{2}}}, \quad (2.11) $$

and

$$ \operatorname{Var} Y = \sigma^{2}\left(1 - \frac{2}{\pi}\,\frac{\lambda^{2}}{1+\lambda^{2}}\right). \quad (2.12) $$
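Formulas (2.11) and (2.12) can be cross-checked against `scipy.stats.skewnorm`, assuming its shape/loc/scale parameterization matches (2.10) (shape a = λ, loc = μ, scale = σ); the variable names below are ours:

```python
import numpy as np
from scipy.stats import skewnorm

mu, sigma, lam = 1.5, 2.0, 3.0
delta = lam / np.sqrt(1 + lam**2)

mean_formula = mu + sigma * np.sqrt(2 / np.pi) * delta      # (2.11)
var_formula = sigma**2 * (1 - (2 / np.pi) * delta**2)       # (2.12)

assert np.isclose(mean_formula, skewnorm.mean(lam, loc=mu, scale=sigma))
assert np.isclose(var_formula, skewnorm.var(lam, loc=mu, scale=sigma))
```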
2.3 Likelihood Function and Maximum Likelihood Estimates.
Let y1, y2, ..., yn be an independent and identically distributed sample from SN(μ,σ,λ), where μ, σ, and λ are unknown, with μ and λ real numbers and σ > 0.

Then the likelihood function is

$$ L(\mu,\sigma,\lambda) = \frac{2^{n}}{\sigma^{n}} \prod_{i=1}^{n} \phi\!\left(\frac{y_i-\mu}{\sigma}\right)\Phi\!\left(\lambda\,\frac{y_i-\mu}{\sigma}\right). \quad (2.13) $$

The log-likelihood function is given by

$$ l(\mu,\sigma,\lambda) = \log L(\mu,\sigma,\lambda) = n\log 2 - n\log\sigma + \sum_{i=1}^{n} \log\phi\!\left(\frac{y_i-\mu}{\sigma}\right) + \sum_{i=1}^{n} \log\Phi\!\left(\lambda\,\frac{y_i-\mu}{\sigma}\right). \quad (2.14) $$

We define

$$ W(y_i) = \frac{\phi\!\left(\lambda\,\frac{y_i-\mu}{\sigma}\right)}{\Phi\!\left(\lambda\,\frac{y_i-\mu}{\sigma}\right)}. \quad (2.15) $$
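A practical note on computing (2.15): when the argument λ(y−μ)/σ is far in the left tail, Φ(·) underflows to zero and the ratio cannot be evaluated by direct division. A numerically stable sketch (ours, not part of the thesis) works on the log scale:

```python
import numpy as np
from scipy.stats import norm

def W(y, mu, sigma, lam):
    """Ratio (2.15): phi(lam*z)/Phi(lam*z), z = (y-mu)/sigma, on the log scale
    so it stays finite even when Phi(lam*z) underflows to zero."""
    t = lam * (y - mu) / sigma
    return np.exp(norm.logpdf(t) - norm.logcdf(t))

assert np.isfinite(W(-40.0, 0.0, 1.0, 1.0))   # naive phi/Phi gives 0/0 = nan here
assert np.isclose(W(0.0, 0.0, 1.0, 2.0), np.sqrt(2 / np.pi))
```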
The likelihood equations for μ, σ, and λ are obtained by taking the partial derivatives of the log-likelihood function given in (2.14) with respect to μ, σ, and λ and setting them equal to zero. Writing z_i = (y_i − μ)/σ, the likelihood equations are:

$$ \frac{\partial l(\mu,\sigma,\lambda)}{\partial\mu} = 0: \qquad \sum_{i=1}^{n} z_i - \lambda \sum_{i=1}^{n} W(y_i) = 0, \quad (2.16) $$

$$ \frac{\partial l(\mu,\sigma,\lambda)}{\partial\sigma} = 0: \qquad \sum_{i=1}^{n} z_i^{2} - n - \lambda \sum_{i=1}^{n} z_i\, W(y_i) = 0, \quad (2.17) $$

$$ \frac{\partial l(\mu,\sigma,\lambda)}{\partial\lambda} = 0: \qquad \sum_{i=1}^{n} z_i\, W(y_i) = 0. \quad (2.18) $$
Let μ̂, σ̂, and λ̂ be the solutions for μ, σ, and λ of the equations (2.16)-(2.18). We write

$$ \hat W(y_i) = \frac{\phi\!\left(\hat\lambda\,\frac{y_i-\hat\mu}{\hat\sigma}\right)}{\Phi\!\left(\hat\lambda\,\frac{y_i-\hat\mu}{\hat\sigma}\right)}. \quad (2.19) $$

Then from (2.16)-(2.18), we get

$$ \sum_{i=1}^{n} \frac{y_i-\hat\mu}{\hat\sigma} = \hat\lambda \sum_{i=1}^{n} \hat W(y_i), \quad (2.20) $$

$$ \sum_{i=1}^{n} \frac{y_i-\hat\mu}{\hat\sigma}\, \hat W(y_i) = 0, \quad (2.21) $$

$$ \sum_{i=1}^{n} \left(\frac{y_i-\hat\mu}{\hat\sigma}\right)^{2} = n. \quad (2.22) $$
The Fisher information matrix is given by

$$ I(\mu,\sigma,\lambda) = -\begin{pmatrix} E\,\dfrac{\partial^{2}\ln L}{\partial\mu^{2}} & E\,\dfrac{\partial^{2}\ln L}{\partial\mu\,\partial\sigma} & E\,\dfrac{\partial^{2}\ln L}{\partial\mu\,\partial\lambda} \\ E\,\dfrac{\partial^{2}\ln L}{\partial\sigma\,\partial\mu} & E\,\dfrac{\partial^{2}\ln L}{\partial\sigma^{2}} & E\,\dfrac{\partial^{2}\ln L}{\partial\sigma\,\partial\lambda} \\ E\,\dfrac{\partial^{2}\ln L}{\partial\lambda\,\partial\mu} & E\,\dfrac{\partial^{2}\ln L}{\partial\lambda\,\partial\sigma} & E\,\dfrac{\partial^{2}\ln L}{\partial\lambda^{2}} \end{pmatrix}. $$

We let

$$ p = \sqrt{\frac{2}{\pi}}, \qquad Z = \frac{Y-\mu}{\sigma}, \qquad a_k = E\left[ Z^{k}\left(\frac{\phi(\lambda Z)}{\Phi(\lambda Z)}\right)^{2}\right], \quad k = 0, 1, 2. $$
Now we derive the elements of the Fisher information matrix. First, differentiating (2.16) once more with respect to μ, taking expectations, using the substitution u = √(1+λ²) z, and using the fact that the first moment of a standard normal random variable is zero, we obtain

$$ -E\,\frac{\partial^{2}\ln L(\mu,\sigma,\lambda)}{\partial\mu^{2}} = \frac{n}{\sigma^{2}} + \frac{n\lambda^{2} a_0}{\sigma^{2}}. $$
Next, a similar computation, using the fact that the second moment of a standard normal random variable is one, gives

$$ -E\,\frac{\partial^{2}\ln L(\mu,\sigma,\lambda)}{\partial\mu\,\partial\sigma} = \frac{n p\,\lambda(1+2\lambda^{2})}{\sigma^{2}(1+\lambda^{2})^{3/2}} + \frac{n\lambda^{2} a_1}{\sigma^{2}}. $$
Next we have

$$ -E\,\frac{\partial^{2}\ln L(\mu,\sigma,\lambda)}{\partial\mu\,\partial\lambda} = \frac{n}{\sigma}\left(\frac{p}{(1+\lambda^{2})^{3/2}} - \lambda a_1\right). $$
Next, using the fact that the odd moments of a standard normal random variable are zero and that its second moment is one, we have

$$ -E\,\frac{\partial^{2}\ln L(\mu,\sigma,\lambda)}{\partial\sigma^{2}} = \frac{n}{\sigma^{2}}\left(2 + \lambda^{2} a_2\right). $$
Next, again using the fact that the odd moments of a standard normal random variable are zero and that its second moment is one, we have

$$ -E\,\frac{\partial^{2}\ln L(\mu,\sigma,\lambda)}{\partial\sigma\,\partial\lambda} = -\frac{n\lambda a_2}{\sigma}. $$
Finally we have

$$ -E\,\frac{\partial^{2}\ln L(\mu,\sigma,\lambda)}{\partial\lambda^{2}} = n a_2. $$
Hence the Fisher information matrix, which is symmetric, is given by

$$ I(\mu,\sigma,\lambda) = \begin{pmatrix} \dfrac{n(1+\lambda^{2} a_0)}{\sigma^{2}} & \dfrac{n p\,\lambda(1+2\lambda^{2})}{\sigma^{2}(1+\lambda^{2})^{3/2}} + \dfrac{n\lambda^{2} a_1}{\sigma^{2}} & \dfrac{n}{\sigma}\left(\dfrac{p}{(1+\lambda^{2})^{3/2}} - \lambda a_1\right) \\ \cdot & \dfrac{n(2+\lambda^{2} a_2)}{\sigma^{2}} & -\dfrac{n\lambda a_2}{\sigma} \\ \cdot & \cdot & n a_2 \end{pmatrix}. \quad (2.23) $$
2.4 Challenges of the Maximum
Likelihood Estimates of the
Univariate Skew Normal Distribution.
We now discuss the two main problems that arise in the maximum likelihood estimation of the parameters of the skew normal distribution.
First, the likelihood function, regarded as a function of λ, may be unbounded. Consequently, the estimate of λ becomes infinite, even though in reality λ is finite. When the sample size n is small, this situation arises more frequently. We explain this situation for SN(0,1,λ). Let z1, z2, ..., zn be a random sample from Z ~ SN(0,1,λ). From (2.14), we can write the log-likelihood as

$$ l(0,1,\lambda) = \ln L(0,1,\lambda) = n\log 2 - \frac{n}{2}\log 2\pi - \frac{1}{2}\sum_{i=1}^{n} z_i^{2} + \sum_{i=1}^{n} \log\Phi(\lambda z_i). \quad (2.24) $$
When z1, z2, ..., zn are all positive, l(0,1,λ) is an increasing function of λ; hence the estimate of λ is unbounded. The frequency of unbounded estimates of λ decreases as the sample size n increases. For instance, if λ = 5 and n = 20, the probability of having all positive observations is 0.273; this probability decreases to 0.002 for n = 100. But for large values of λ, the probability of getting an unbounded estimate of λ in a small sample is still quite high.
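The probabilities 0.273 and 0.002 quoted above can be reproduced from the closed form P(Z ≤ 0) = 1/2 − arctan(λ)/π for Z ~ SN(0,1,λ), a known result stated here only as a computational aside (the function name is ours):

```python
import numpy as np

def prob_all_positive(lam, n):
    """P(all n observations from SN(0,1,lam) are positive).
    Uses the closed form P(Z <= 0) = 1/2 - arctan(lam)/pi."""
    p_neg = 0.5 - np.arctan(lam) / np.pi
    return (1 - p_neg) ** n

print(round(prob_all_positive(5, 20), 3))   # 0.273, matching the text
print(round(prob_all_positive(5, 100), 3))  # 0.002
```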
Figure 2.2 Probability that Z < 0 for values of λ ranging from 0 to 30.
For Z ~ SN(0,1,λ), we calculate P(Z < 0) for λ ranging from 0 to 30 and present the values in Figure 2.2. As λ increases, the probability that all the observations in a small sample are positive becomes very high. The reverse is true when λ is negative, which results in samples with all negative values.
Second, the information matrix becomes singular when λ = 0. This singularity can be traced to the parameter redundancy of the parameterization in the normal case, a fact identified using the results of Catchpole and Morgan (1997). They identify an exponential family model as parameter redundant if the mean can be expressed using a reduced number of parameters. From equation (2.11), E(Y) is a function of all three parameters μ, σ, and λ, whereas for λ = 0 it is a function of the single parameter μ.
2.5 Literature Review on Challenges.
In this section, we review the different procedures described in the literature for situations with unbounded estimates of λ.
In order to deal with the problem of unbounded estimates of λ in the SN model, Azzalini and Capitanio (1999) propose stopping the maximization procedure when the log-likelihood value is not significantly lower than the maximum.
Sartori (2005) proposes reducing the asymptotic bias of the maximum likelihood estimate by means of a penalization of the likelihood function. He proposes a two-step procedure for estimating the parameters of the SN distribution. First, the estimators μ̂ and σ̂ of μ and σ are computed. In the second step, μ̂ and σ̂ are held fixed, and the bias-preventive method proposed by Firth (1993) is applied to the score function of the skewness parameter in order to give a finite estimate.
Liseo and Loperfido (2002) perform a default Bayesian analysis for the skew normal distribution and show that the Jeffreys prior for λ is proper.
Monti (2003) uses the minimum chi-square method proposed by Neyman (1949), which estimates parameters from the distribution of discrete or grouped data.
In the SN model, the singularity of the information matrix at λ = 0 can be removed by a suitable reparameterization (Azzalini, 1985; Azzalini and Capitanio, 1999; Chiogna, 1997; Pewsey, 2000).
Chapter 3
Approximations of the Ratio of the
Standard Normal Density and
Distribution Functions
3.1 Introduction.
In this chapter we introduce a linear and a non-linear approximation of the ratio of the standard normal density and distribution functions in the presence of an unknown constant representing the shape of the skew normal distribution. The purpose of these approximations is to estimate the skew normal shape parameter.
In (2.15), we have defined the ratio of the standard normal density and distribution functions as

$$ W(y) = \frac{\phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right)}{\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right)}. $$

We write z = (y − μ)/σ and W(y) as Rλ(z), where

$$ R_\lambda(z) = \frac{\phi(\lambda z)}{\Phi(\lambda z)}. \quad (3.1) $$
The numerical value of Rλ(0) is √(2/π), for every λ. Figure 3.1 shows the graphs of Rλ(z) against z for λ = 0.5, 1, and 2. The graphs in Figure 3.1 intersect at z = 0. It is seen that the slope of Rλ(z) is positive when λ takes negative values and negative when λ takes positive values. It is also noticed that the magnitude of the slope increases for larger values of |λ|.
Figure 3.1. Plots of Rλ(z) against z for λ = 0.5, 1 and 2.
For a given λ, we want to approximate Rλ(z) by the following linear and non-linear functions for −3 ≤ z ≤ 3:

$$ A_\lambda(z) = \sqrt{\frac{2}{\pi}} + \alpha z, \quad (3.2) $$

$$ B_\lambda(z) = \sqrt{\frac{2}{\pi}}\,\exp(s\,\beta z), \quad (3.3) $$

where α and β are unknown constants and

s = −1 if λ > 0; s = 1 if λ < 0.

These approximations become weaker for the values of z satisfying |z| > 3.
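One simple way to obtain a slope α for a given λ — offered here purely as an illustration, not as the fitting procedure used later in this chapter — is a least-squares fit of the linear form (3.2) to the ratio (3.1) on a grid over [−3, 3], with the intercept fixed at √(2/π):

```python
import numpy as np
from scipy.stats import norm

def R(z, lam):
    """The ratio (3.1), phi(lam*z)/Phi(lam*z), computed on the log scale."""
    t = lam * np.asarray(z, dtype=float)
    return np.exp(norm.logpdf(t) - norm.logcdf(t))

z = np.linspace(-3.0, 3.0, 601)
for lam in (0.5, 1.0, 2.0):
    resid = R(z, lam) - np.sqrt(2 / np.pi)
    alpha = np.sum(z * resid) / np.sum(z * z)   # LS slope, intercept fixed
    print(f"lambda = {lam}: alpha = {alpha:.3f}")
```

For positive λ the fitted slope comes out negative, in line with the behavior of Rλ(z) described above.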
3.2 Motivation.
We consider a random variable $Y$ satisfying

Y = \mu + \sigma Z,  (3.4)

where $\mu$ is a real number, $\sigma > 0$, and $Z \sim SN(0, 1, \lambda)$. The random variable $Y$ is distributed as $SN(\mu, \sigma, \lambda)$.
Now we consider two cases:
i. $\mu$ changes with a covariate $X$, and we write

\mu = \beta_0 + \beta_1 x,  (3.5)

where $x$ is a given value of the covariate $X$.
ii. No such covariate $X$ is available.
Case I. Covariate X Present
The random variable $Y$ in this case is distributed skew normal with density

f(y; \beta_0, \beta_1, \sigma, \lambda) = \frac{2}{\sigma}\, \phi\!\left( \frac{y - \beta_0 - \beta_1 x}{\sigma} \right) \Phi\!\left( \lambda\, \frac{y - \beta_0 - \beta_1 x}{\sigma} \right), \quad -\infty < y < \infty.  (3.6)

We denote the distribution with the density in (3.6) by $SN(\beta_0 + \beta_1 x, \sigma, \lambda)$.
We now consider $n$ independent observations $(y_i, x_i)$ from the skew normal distribution with density (3.6). We write

z_i = \frac{y_i - \beta_0 - \beta_1 x_i}{\sigma}, \quad i = 1, \ldots, n, \qquad R_\lambda(z_i) = \frac{\phi(\lambda z_i)}{\Phi(\lambda z_i)}.
The maximum likelihood equations can then be expressed as

\sum_{i=1}^{n} z_i = \lambda \sum_{i=1}^{n} R_\lambda(z_i),  (3.7)

\sum_{i=1}^{n} x_i z_i = \lambda \sum_{i=1}^{n} x_i R_\lambda(z_i),  (3.8)

\sum_{i=1}^{n} z_i R_\lambda(z_i) = 0,  (3.9)

\sum_{i=1}^{n} z_i^2 = n.  (3.10)

Case II. Covariate X Absent
This situation is discussed in Section 2.2, with $z_i = (y_i - \beta_0)/\sigma$. The maximum likelihood equations can be expressed as

\sum_{i=1}^{n} z_i = \lambda \sum_{i=1}^{n} R_\lambda(z_i),  (3.11)

\sum_{i=1}^{n} z_i R_\lambda(z_i) = 0,  (3.12)

\sum_{i=1}^{n} z_i^2 = n.  (3.13)
We observe that the complexity in the equations (3.7)-(3.10) and in (3.11)-(3.13) is due to the presence of the function $R_\lambda(z)$. We propose to deal with this complexity by approximating $R_\lambda(z)$ by $A_\alpha(z)$ and $B_\gamma(z)$ as given in (3.2) and (3.3). We use equations (3.7)-(3.10) and (3.11)-(3.13) to estimate the parameters $\lambda$ in $R_\lambda(z)$, $\alpha$ in $A_\alpha(z)$, and $\gamma$ in $B_\gamma(z)$, based on the values of $z_i$, $i = 1, \ldots, n$.
3.3 Fitting the Linear Approximation to the Ratio.
In Section 3.1 we saw that, for a given $\lambda$, $R_\lambda(z)$ is approximated by $A_\alpha(z)$ as given in (3.2). In this section we examine how well the linear function $A_\alpha(z)$ approximates $R_\lambda(z)$.
In Figure 3.2 we plot $A_\alpha(z)$ against $z$ for $\lambda = 0.5$, $1$ and $2$, taking $\alpha = 0.284$, $0.450$ and $0.569$ for $\lambda = 0.5$, $1$ and $2$ respectively. As in Figure 3.1, the graphs in Figure 3.2 intersect at $z = 0$, the slope of $A_\alpha(z)$ is positive when $\lambda$ takes negative values and negative when $\lambda$ takes positive values, and the magnitude of the slope increases for larger values of $\lambda$.
Figure 3.2. Plots of $A_\alpha(z)$ against $z$ for $\lambda = 0.5$, $1$ and $2$.
To compare $R_\lambda(z)$ and $A_\alpha(z)$, we plot $R_\lambda(z)$ (continuous line) and $A_\alpha(z)$ (dotted line) against $z$ for $\lambda = 0.5$ in the same graph; the plots are presented in Figure 3.3. We draw similar graphs for $\lambda = 1$ and $\lambda = 2$ in Figures 3.4 and 3.5, again with $\alpha = 0.284$, $0.450$ and $0.569$ for $\lambda = 0.5$, $1$ and $2$ respectively. We observe that, in all three graphs, the linear function $A_\alpha(z)$ approximates $R_\lambda(z)$ quite accurately for the $z$ values considered.
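The closeness suggested by Figures 3.3-3.5 can also be checked numerically for $\lambda = 1$ and $\alpha = 0.450$; the grid over $[-1.5, 1.5]$ and the summary statistic below are our illustrative choices, not the dissertation's:

```python
import math

P = math.sqrt(2.0 / math.pi)

def R(lam, v):
    """True ratio phi(lam*v)/Phi(lam*v)."""
    u = lam * v
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return phi / Phi

def A(alpha, lam, v):
    # linear approximation with slope -alpha*lam (lam > 0 here)
    return P - alpha * lam * v

grid = [-1.5 + 0.05 * k for k in range(61)]
gaps = [abs(R(1.0, v) - A(0.450, 1.0, v)) for v in grid]
avg_gap = sum(gaps) / len(gaps)
# The gap vanishes at z = 0 and stays modest over the central range.
print(avg_gap < 0.25)
# → True
```

The discrepancy grows in the left tail, where the true ratio is convex, which is why the approximation is stated only for a bounded range of $z$.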
Figure 3.3. Plot of $R_\lambda(z)$ against $z$ for $\lambda = 0.5$ (continuous lines) and $A_\alpha(z)$ against $z$ for $\lambda = 0.5$ (dotted lines).
Figure 3.4. Plot of $R_\lambda(z)$ against $z$ for $\lambda = 1$ (continuous lines) and $A_\alpha(z)$ against $z$ for $\lambda = 1$ (dotted lines).
Figure 3.5. Plot of $R_\lambda(z)$ against $z$ for $\lambda = 2$ (continuous lines) and $A_\alpha(z)$ against $z$ for $\lambda = 2$ (dotted lines).
3.4 Fitting the Non-Linear Approximation to the Ratio.
In Section 3.1 we saw that, for a given $\lambda$, $R_\lambda(z)$ is approximated by $B_\gamma(z)$ as given in (3.3). In this section we examine how well the non-linear function $B_\gamma(z)$ approximates $R_\lambda(z)$.
In Figure 3.6 we plot $B_\gamma(z)$ against $z$ for $\lambda = 0.5$, $1$ and $2$, taking $\gamma = 0.3620$, $0.7768$ and $1.10810$ for $\lambda = 0.5$, $1$ and $2$ respectively. As in Figure 3.1, the graphs in Figure 3.6 intersect at $z = 0$, the slope of $B_\gamma(z)$ is positive when $\lambda$ takes negative values and negative when $\lambda$ takes positive values, and the magnitude of the slope increases for larger values of $\lambda$.
Figure 3.6. Plots of $B_\gamma(z)$ against $z$ for $\lambda = 0.5$, $1$ and $2$.
To compare $R_\lambda(z)$ and $B_\gamma(z)$, we plot $R_\lambda(z)$ (continuous line) and $B_\gamma(z)$ (dotted line) against $z$ for $\lambda = 0.5$ in the same graph; the plots are presented in Figure 3.7. We draw similar graphs for $\lambda = 1$ and $\lambda = 2$ in Figures 3.8 and 3.9, with $\gamma = 0.3620$, $0.7768$ and $1.10810$ for $\lambda = 0.5$, $1$ and $2$ respectively. We observe that, in all three graphs, the non-linear function $B_\gamma(z)$ approximates $R_\lambda(z)$ quite accurately for the $z$ values considered.
Figure 3.7. Plot of $R_\lambda(z)$ against $z$ for $\lambda = 0.5$ (continuous lines) and $B_\gamma(z)$ against $z$ for $\lambda = 0.5$ (dotted lines).
Figure 3.8. Plot of $R_\lambda(z)$ against $z$ for $\lambda = 1$ (continuous lines) and $B_\gamma(z)$ against $z$ for $\lambda = 1$ (dotted lines).
Figure 3.9. Plot of $R_\lambda(z)$ against $z$ for $\lambda = 2$ (continuous lines) and $B_\gamma(z)$ against $z$ for $\lambda = 2$ (dotted lines).
Chapter 4
Estimation of the Shape parameter of the Standard Skew Normal
Distribution
4.1 Introduction.
In Chapter 3 we discussed how the complexity in solving the maximum likelihood equations arises from the presence of the ratio of the normal density and distribution functions $R_\lambda(z)$ in the likelihood equations. We propose to resolve this complexity by approximating $R_\lambda(z)$ first by the linear function $A_\alpha(z)$ given in (3.2) and then by the non-linear function $B_\gamma(z)$ given in (3.3). As seen in Chapter 3, when we consider a covariate $X$ to be present we have four parameters of interest, namely $\beta_0$, $\beta_1$, $\sigma$ and $\lambda$, and when we consider no covariate to be present we have three parameters of interest, namely $\beta_0$, $\sigma$ and $\lambda$. In this chapter we assume $\beta_0$, $\beta_1$ and $\sigma$ to be known. The unknown parameters are then $\lambda$ and $\alpha$ when we consider the linear approximation for $R_\lambda(z)$, and $\lambda$ and $\gamma$ when we consider the non-linear approximation for $R_\lambda(z)$.
When we consider a covariate $X$ to be present, we have a data set of $n$ pairs of observations $(x_i, y_i)$, $i = 1, \ldots, n$, where the $x_i$ are observations of the covariate $X$ and the $y_i$ are the observed values of the dependent variable $Y$, where $Y \sim SN(\beta_0 + \beta_1 x, \sigma, \lambda)$. With the known values of $\beta_0$, $\beta_1$ and $\sigma$ we obtain the observed values of

z_i = \frac{y_i - \beta_0 - \beta_1 x_i}{\sigma}, \quad i = 1, \ldots, n,

from the observed values of $y_i$. We then have $n$ pairs of observations $(x_i, z_i)$. When we consider no covariate to be present, we have $n$ observations $y_i$, $i = 1, \ldots, n$, from the random variable $Y$, where $Y \sim SN(\beta_0, \sigma, \lambda)$. Assuming $\beta_0$ and $\sigma$ to be known, we transform the $y_i$ to $z_i = (y_i - \beta_0)/\sigma$.
Based on these observations, we discuss a procedure for estimating the parameters $\lambda$ in $R_\lambda(z)$, $\alpha$ in $A_\alpha(z)$, and $\gamma$ in $B_\gamma(z)$, both when a covariate $X$ is present and when it is absent. From the estimates of $\lambda$, $\alpha$ and $\gamma$ we obtain the estimated approximating functions. We also present the performance of our estimation procedure with simulation results.
4.2 The Estimation Procedure Using $A_\alpha(z)$
In this chapter we assume that the parameters $\beta_0$, $\beta_1$ and $\sigma$ are known. In this section we present the procedure for estimating the parameters $\lambda$ and $\alpha$ using the linear approximation $A_\alpha(z)$ given in (3.2) for $R_\lambda(z)$ given in (3.1), in the presence and absence of a covariate $X$. In Section 4.2.1 we consider the estimation procedure in the presence of a covariate $X$ and in Section 4.2.2 in its absence. In Section 4.2.3 we discuss a measure of goodness of fit to assess how well the function $A_\alpha(z)$ performs in approximating $R_\lambda(z)$.
4.2.1 Case I. Covariate X Present
We have $n$ pairs of observations $(x_i, y_i)$, $i = 1, \ldots, n$, where the $x_i$ are observations of the covariate $X$ and the $y_i$ are the observed values of the dependent variable $Y$, where $Y \sim SN(\beta_0 + \beta_1 x, \sigma, \lambda)$. From the $y_i$ we calculate $z_i = (y_i - \beta_0 - \beta_1 x_i)/\sigma$, $i = 1, \ldots, n$. Here we note that, though we assume the parameters $\beta_0$, $\beta_1$ and $\sigma$ to be known, we present the estimating equations with respect to $\beta_0$, $\beta_1$, $\sigma$ and $\lambda$, so that we can use them to estimate the unknown parameters $\lambda$ and $\alpha$.
We substitute $R_\lambda(z) = A_\alpha(z)$ from (3.1) and (3.2). The estimating equations can then be written as

\sum_{i=1}^{n} z_i = \lambda \sum_{i=1}^{n} \left[ \sqrt{2/\pi} + \delta\alpha|\lambda| z_i \right],  (4.1)

\sum_{i=1}^{n} x_i z_i = \lambda \sum_{i=1}^{n} x_i \left[ \sqrt{2/\pi} + \delta\alpha|\lambda| z_i \right],  (4.2)

\sum_{i=1}^{n} z_i \left[ \sqrt{2/\pi} + \delta\alpha|\lambda| z_i \right] = 0,  (4.3)

\sum_{i=1}^{n} z_i^2 = n.  (4.4)

Simplifying (4.1)-(4.3) and using (4.4), we can write them in the form

\begin{pmatrix} n & \sum z_i \\ \sum x_i & \sum x_i z_i \\ \sum z_i & n \end{pmatrix} \begin{pmatrix} \sqrt{2/\pi}\,\lambda \\ \delta\alpha\lambda|\lambda| \end{pmatrix} = \begin{pmatrix} \sum z_i \\ \sum x_i z_i \\ 0 \end{pmatrix},  (4.5)

where the third row is the equation (4.3) multiplied through by $\lambda$. The equation (4.5) can be expressed in the general form

W\theta = \eta,  (4.6)

where $W$ is the $3 \times 2$ matrix on the left of (4.5), $\theta = (\sqrt{2/\pi}\,\lambda,\ \delta\alpha\lambda|\lambda|)'$, and $\eta = (\sum z_i, \sum x_i z_i, 0)'$.
We assume $\mathrm{Rank}(W) = 2$, and the least squares estimate of $\theta$ can be expressed in the general form (Rao (1973))

\hat{\theta} = (W'W)^{-1} W'\eta.  (4.7)

From (4.7) we get the estimates of $\sqrt{2/\pi}\,\lambda$ and $\delta\alpha\lambda|\lambda|$, from which we can calculate $\hat{\lambda}$ and $\hat{\alpha}$. We denote the estimated linear function by $\hat{A}(z) = \sqrt{2/\pi} - \hat{\alpha}\hat{\lambda} z$.
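The least-squares step in (4.7) is a routine two-parameter normal-equations solve; a sketch of the Case I computation (the function name and return convention are ours, not the dissertation's):

```python
import math

def theta_hat(x, z):
    """Solve W theta = eta of (4.6) in the least squares sense, where
    W has rows [n, Sz], [Sx, Sxz], [Sz, n] and eta = (Sz, Sxz, 0).
    Returns (lambda_hat, theta2_hat)."""
    n = len(z)
    Sz = sum(z)
    Sx = sum(x)
    Sxz = sum(xi * zi for xi, zi in zip(x, z))
    W = [[n, Sz], [Sx, Sxz], [Sz, n]]
    eta = [Sz, Sxz, 0.0]
    # normal equations (W'W) theta = W'eta, a symmetric 2x2 system
    a = sum(r[0] * r[0] for r in W)
    b = sum(r[0] * r[1] for r in W)
    c = sum(r[1] * r[1] for r in W)
    d = sum(W[i][0] * eta[i] for i in range(3))
    e = sum(W[i][1] * eta[i] for i in range(3))
    det = a * c - b * b
    t1 = (c * d - b * e) / det
    t2 = (a * e - b * d) / det
    # the first component of theta is sqrt(2/pi)*lambda
    return t1 * math.sqrt(math.pi / 2.0), t2

lam_demo, t2_demo = theta_hat([1.0, 2.0, 3.0, 4.0], [0.5, -0.2, 0.3, 0.1])
print(round(lam_demo, 6), round(t2_demo, 6))
```

The second returned component estimates $\delta\alpha\lambda|\lambda|$, from which $\hat{\alpha}$ follows as described in the text.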
4.2.2 Case II. Covariate X Absent
Here we have $n$ observations $y_i$, $i = 1, \ldots, n$, from the random variable $Y$, where $Y \sim SN(\beta_0, \sigma, \lambda)$. From the $y_i$ we calculate $z_i = (y_i - \beta_0)/\sigma$, $i = 1, \ldots, n$. Though we assume the parameters $\beta_0$ and $\sigma$ to be known, we present the estimating equations with respect to $\beta_0$, $\sigma$ and $\lambda$, so that we can use them to estimate the unknown parameters $\lambda$ and $\alpha$.
We substitute $R_\lambda(z) = A_\alpha(z)$ from (3.1) and (3.2). The estimating equations can then be written as

\sum_{i=1}^{n} z_i = \lambda \sum_{i=1}^{n} \left[ \sqrt{2/\pi} + \delta\alpha|\lambda| z_i \right],  (4.8)

\sum_{i=1}^{n} z_i \left[ \sqrt{2/\pi} + \delta\alpha|\lambda| z_i \right] = 0,  (4.9)

\sum_{i=1}^{n} z_i^2 = n.  (4.10)

Simplifying (4.8)-(4.9) and using (4.10), we can write them in the form

\begin{pmatrix} n & \sum z_i \\ \sum z_i & n \end{pmatrix} \begin{pmatrix} \sqrt{2/\pi}\,\lambda \\ \delta\alpha\lambda|\lambda| \end{pmatrix} = \begin{pmatrix} \sum z_i \\ 0 \end{pmatrix}.  (4.11)

The equation (4.11) can be expressed in the general form

W\theta = \eta,  (4.12)

where $W$ is the $2 \times 2$ matrix on the left of (4.11) and $\eta = (\sum z_i, 0)'$.
We assume $\mathrm{Rank}(W) = 2$, and the estimate of $\theta$ can be expressed in the general form (Rao (1973))

\hat{\theta} = (W'W)^{-1} W'\eta.  (4.13)

From (4.13) we get the estimates of $\sqrt{2/\pi}\,\lambda$ and $\delta\alpha\lambda|\lambda|$, from which we can calculate $\hat{\lambda}$ and $\hat{\alpha}$. We denote the estimated linear function by $\hat{A}(z) = \sqrt{2/\pi} - \hat{\alpha}\hat{\lambda} z$.
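Since $W$ in (4.11) is square, (4.13) reduces to solving $W\theta = \eta$ exactly whenever $(\sum z_i)^2 \neq n^2$, which gives a closed form; a sketch (checked against the worked numbers that appear later in Section 4.4.2(A), $n = 20$ and $\sum z_i = 9.51$):

```python
import math

def case2_estimates(n, S):
    """Closed-form solution of W theta = eta for W = [[n, S], [S, n]]
    and eta = (S, 0), where S = sum of the z values:
    theta1 = nS/(n^2 - S^2), theta2 = -S^2/(n^2 - S^2).
    Returns (lambda_hat, alpha_hat)."""
    theta1 = n * S / (n * n - S * S)
    theta2 = -S * S / (n * n - S * S)
    lam = theta1 * math.sqrt(math.pi / 2.0)   # theta1 = sqrt(2/pi)*lambda
    alpha = abs(theta2) / (lam * lam)          # theta2 = -alpha*lambda^2
    return lam, alpha

lam, alpha = case2_estimates(20, 9.51)
print(round(lam, 6), round(alpha, 6))
```

These values agree with the estimates $\hat{\lambda} = 0.770062$ and $\hat{\alpha} = 0.4926799$ reported in Section 4.4.2(A).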
4.2.3 Measure of Goodness of Fit.
For evaluating the goodness of fit in approximating $R_\lambda(z)$ given in (3.1) by $\hat{A}(z)$ based on (3.2), for each value of $z_i$, $i = 1, \ldots, n$, we define

\Delta_A(z_i) = \left| R_\lambda(z_i) - \hat{A}(z_i) \right|.  (4.14)

For a particular value of $z_i$, (4.14) gives us a measure of the absolute difference between the true value of the ratio $R_\lambda(z_i)$ and the estimated value of the linear approximation $\hat{A}(z_i)$.
To get a measure of how close the estimated linear approximation is to the true ratio, we calculate $\mathrm{Ave}\,\Delta_A$, the average of $\Delta_A(z_i)$ over all the values of $z_i$, $i = 1, \ldots, n$:

\mathrm{Ave}\,\Delta_A = \frac{1}{n} \sum_{i=1}^{n} \Delta_A(z_i).  (4.15)

A small value of $\mathrm{Ave}\,\Delta_A$ indicates a satisfactory approximation of $R_\lambda(z)$ by $A_\alpha(z)$.
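The measure in (4.14)-(4.15) is a plain average of absolute deviations; a minimal helper (the function name is ours):

```python
def ave_delta(ratio, approx, zs):
    """Ave Delta of (4.15): the mean of |ratio(z) - approx(z)| over
    the observed z values, with each term as in (4.14)."""
    return sum(abs(ratio(z) - approx(z)) for z in zs) / len(zs)

# A perfect approximation yields 0; a constant offset yields that offset.
print(ave_delta(lambda z: z, lambda z: z, [0.1, 0.2]),
      ave_delta(lambda z: 1.0, lambda z: 0.5, [0.0, 1.0]))
# → 0.0 0.5
```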
4.3 The Estimation Procedure Using $B_\gamma(z)$
In this section we present the procedure for estimating the parameters $\lambda$ and $\gamma$ using the non-linear approximation $B_\gamma(z)$ defined in (3.3) for $R_\lambda(z)$ defined in (3.1), in the presence and absence of a covariate $X$. In Section 4.3.1 we consider the estimation procedure in the presence of a covariate $X$ and in Section 4.3.2 in its absence. In Section 4.3.3 we discuss a measure of goodness of fit to assess how well the function $B_\gamma(z)$ performs in approximating $R_\lambda(z)$. Here we assume $\lambda > 0$, hence $\delta = -1$, and we can write (3.3) as

B_\gamma(z) = \sqrt{2/\pi} + \gamma\lambda \left( e^{-z} - 1 \right).  (4.16)
4.3.1 Case I. Covariate X Present
We have $n$ pairs of observations $(x_i, y_i)$, $i = 1, \ldots, n$, where the $x_i$ are observations of the covariate $X$ and the $y_i$ are the observed values of the dependent variable $Y$, where $Y \sim SN(\beta_0 + \beta_1 x, \sigma, \lambda)$. From the $y_i$ we calculate $z_i = (y_i - \beta_0 - \beta_1 x_i)/\sigma$, $i = 1, \ldots, n$. As before, though we assume the parameters $\beta_0$, $\beta_1$ and $\sigma$ to be known, we present the estimating equations with respect to $\beta_0$, $\beta_1$, $\sigma$ and $\lambda$, so that we can use them to estimate the unknown parameters $\lambda$ and $\gamma$.
We substitute $R_\lambda(z) = B_\gamma(z)$ from (3.1) and (4.16). The estimating equations can then be written as

\sum_{i=1}^{n} z_i = \lambda \sum_{i=1}^{n} \left[ \sqrt{2/\pi} + \gamma\lambda \left( e^{-z_i} - 1 \right) \right],  (4.17)

\sum_{i=1}^{n} x_i z_i = \lambda \sum_{i=1}^{n} x_i \left[ \sqrt{2/\pi} + \gamma\lambda \left( e^{-z_i} - 1 \right) \right],  (4.18)

\sum_{i=1}^{n} z_i \left[ \sqrt{2/\pi} + \gamma\lambda \left( e^{-z_i} - 1 \right) \right] = 0,  (4.19)

\sum_{i=1}^{n} z_i^2 = n.  (4.20)

Simplifying (4.17)-(4.19) and using (4.20), we can write them in the form

\begin{pmatrix} n & \sum (e^{-z_i} - 1) \\ \sum x_i & \sum x_i (e^{-z_i} - 1) \\ \sum z_i & \sum z_i (e^{-z_i} - 1) \end{pmatrix} \begin{pmatrix} \sqrt{2/\pi}\,\lambda \\ \gamma\lambda^2 \end{pmatrix} = \begin{pmatrix} \sum z_i \\ \sum x_i z_i \\ 0 \end{pmatrix},  (4.21)

where the third row is the equation (4.19) multiplied through by $\lambda$. The equation (4.21) can be expressed in the general form

W\theta = \eta,  (4.22)

where $W$ is the $3 \times 2$ matrix on the left of (4.21), $\theta = (\sqrt{2/\pi}\,\lambda,\ \gamma\lambda^2)'$, and $\eta = (\sum z_i, \sum x_i z_i, 0)'$.
We assume $\mathrm{Rank}(W) = 2$, and the least squares estimate of $\theta$ can be expressed in the general form (Rao (1973))

\hat{\theta} = (W'W)^{-1} W'\eta.  (4.23)

From (4.23) we get the estimates of $\sqrt{2/\pi}\,\lambda$ and $\gamma\lambda^2$, from which we can calculate $\hat{\lambda}$ and $\hat{\gamma}$. We denote the estimated non-linear function by $\hat{B}(z) = \sqrt{2/\pi} + \hat{\gamma}\hat{\lambda}(e^{-z} - 1)$.
4.3.2 Case II. Covariate X Absent
Here we have $n$ observations $y_i$, $i = 1, \ldots, n$, from the random variable $Y$, where $Y \sim SN(\beta_0, \sigma, \lambda)$. From the $y_i$ we calculate $z_i = (y_i - \beta_0)/\sigma$, $i = 1, \ldots, n$. Though we assume the parameters $\beta_0$ and $\sigma$ to be known, we present the estimating equations with respect to $\beta_0$, $\sigma$ and $\lambda$, so that we can use them to estimate the unknown parameters $\lambda$ and $\gamma$.
We substitute $R_\lambda(z) = B_\gamma(z)$ from (3.1) and (4.16). The estimating equations can then be written as

\sum_{i=1}^{n} z_i = \lambda \sum_{i=1}^{n} \left[ \sqrt{2/\pi} + \gamma\lambda \left( e^{-z_i} - 1 \right) \right],  (4.24)

\sum_{i=1}^{n} z_i \left[ \sqrt{2/\pi} + \gamma\lambda \left( e^{-z_i} - 1 \right) \right] = 0,  (4.25)

\sum_{i=1}^{n} z_i^2 = n.  (4.26)

Simplifying (4.24)-(4.25) and using (4.26), we can write them in the form

\begin{pmatrix} n & \sum (e^{-z_i} - 1) \\ \sum z_i & \sum z_i (e^{-z_i} - 1) \end{pmatrix} \begin{pmatrix} \sqrt{2/\pi}\,\lambda \\ \gamma\lambda^2 \end{pmatrix} = \begin{pmatrix} \sum z_i \\ 0 \end{pmatrix}.  (4.27)

The equation (4.27) can be expressed in the general form

W\theta = \eta,  (4.28)

where $W$ is the $2 \times 2$ matrix on the left of (4.27) and $\eta = (\sum z_i, 0)'$.
We assume $\mathrm{Rank}(W) = 2$, and the estimate of $\theta$ can be expressed in the general form (Rao (1973))

\hat{\theta} = (W'W)^{-1} W'\eta.  (4.29)

From (4.29) we get the estimates of $\sqrt{2/\pi}\,\lambda$ and $\gamma\lambda^2$, from which we can calculate $\hat{\lambda}$ and $\hat{\gamma}$. We denote the estimated non-linear function by $\hat{B}(z) = \sqrt{2/\pi} + \hat{\gamma}\hat{\lambda}(e^{-z} - 1)$.
4.3.3 Measure of Goodness of Fit.
For evaluating the goodness of fit in approximating $R_\lambda(z)$ given in (3.1) by $\hat{B}(z)$ based on (4.16), for each value of $z_i$, $i = 1, \ldots, n$, we define

\Delta_B(z_i) = \left| R_\lambda(z_i) - \hat{B}(z_i) \right|.  (4.30)

For a particular value of $z_i$, (4.30) gives us a measure of the absolute difference between the true value of the ratio $R_\lambda(z_i)$ and the estimated value of the non-linear approximation $\hat{B}(z_i)$.
To get a measure of how close the estimated non-linear approximation is to the true ratio, we calculate $\mathrm{Ave}\,\Delta_B$, the average of $\Delta_B(z_i)$ over all the values of $z_i$, $i = 1, \ldots, n$:

\mathrm{Ave}\,\Delta_B = \frac{1}{n} \sum_{i=1}^{n} \Delta_B(z_i).  (4.31)

A small value of $\mathrm{Ave}\,\Delta_B$ indicates a satisfactory approximation of $R_\lambda(z)$ by $B_\gamma(z)$.
4.4 A Simulated Data Set
In order to evaluate the performance of the estimation procedures given in Sections 4.2 and 4.3, we present the estimated values of $\lambda$, $\alpha$ and $\gamma$ using simulated data. We consider both the case when a covariate $X$ is present and the case when it is absent. Here we assume the parameters $\beta_0$, $\beta_1$ and $\sigma$ to be known.
We generate $z_i$, $i = 1, \ldots, n$, from $SN(0, 1, \lambda)$, where $n = 20$ and $\lambda = 1$, keeping nine decimal places. The values rounded at the second decimal place are
1.11, 0.45, -0.26, -0.18, 0.92, 1.48, 1.03, 0.32, 2.03, 1.93, 0.07, 0.67, -0.15, 0.31, -0.11, -1.23, 0.38, 0.01, 1.25, -0.52. (Dataset 4.1)
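A sample like Dataset 4.1 can be regenerated in spirit (not the exact draws) with the standard stochastic representation of a skew normal variable, $Z = d|U_1| + \sqrt{1-d^2}\,U_2$ with $d = \lambda/\sqrt{1+\lambda^2}$ and $U_1, U_2$ independent $N(0,1)$; the seed and sample size below are our choices:

```python
import math
import random

def rsn(n, lam, rng):
    """Draw n variates from SN(0, 1, lam) via the representation
    Z = d|U1| + sqrt(1 - d^2) U2, with d = lam / sqrt(1 + lam^2)."""
    d = lam / math.sqrt(1.0 + lam * lam)
    return [d * abs(rng.gauss(0.0, 1.0)) +
            math.sqrt(1.0 - d * d) * rng.gauss(0.0, 1.0) for _ in range(n)]

rng = random.Random(2010)
sample = rsn(10000, 1.0, rng)
mean = sum(sample) / len(sample)
# E[Z] = sqrt(2/pi) * d = 0.5642 for lam = 1; a large sample mean is close.
print(0.5 < mean < 0.63)
# → True
```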
We now treat the value of $\lambda$ as unknown. The unknown parameters are then $\lambda$ and $\alpha$ when we consider $A_\alpha(z)$, and $\lambda$ and $\gamma$ when we consider $B_\gamma(z)$.
4.4.1 Case I. Covariate X Present
The $x_i$, $i = 1, \ldots, 20$ (Exercise 12.25, Page 522, Mendenhall, Beaver and Beaver (2009)), values are
100, 96, 88, 100, 100, 96, 80, 68, 92, 96, 88, 92, 68, 84, 84, 88, 72, 88, 72, 88. (Dataset 4.2)
We consider $\beta_0 = -30$, $\beta_1 = 5$ and $\sigma = 0.05$.
We generate the random variable $Y$ from Datasets 4.1 and 4.2 such that

y_i = \beta_0 + \beta_1 x_i + \sigma z_i, \quad i = 1, \ldots, n.

We can then write $Y \sim SN(\beta_0 + \beta_1 X = -30 + 5X,\ \sigma = 0.05,\ \lambda = 1)$.
The $y_i$ values, $i = 1, \ldots, 20$, obtained are
470.0555, 450.0225, 409.9870, 469.9910, 470.0460, 450.0740, 370.0515, 310.0160, 430.1015, 450.0965, 410.0035, 430.0335, 309.9925, 390.0155, 389.9945, 409.9385, 330.0190, 410.0005, 330.0625, 409.9740. (Dataset 4.3)
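Dataset 4.3 is a deterministic transformation of Datasets 4.1 and 4.2; recomputing $y_i = \beta_0 + \beta_1 x_i + \sigma z_i$ with $\beta_0 = -30$, $\beta_1 = 5$ and $\sigma = 0.05$ reproduces the listed values (using the two-decimal $z$ values shown above):

```python
z = [1.11, 0.45, -0.26, -0.18, 0.92, 1.48, 1.03, 0.32, 2.03, 1.93,
     0.07, 0.67, -0.15, 0.31, -0.11, -1.23, 0.38, 0.01, 1.25, -0.52]
x = [100, 96, 88, 100, 100, 96, 80, 68, 92, 96,
     88, 92, 68, 84, 84, 88, 72, 88, 72, 88]

b0, b1, sigma = -30.0, 5.0, 0.05
y = [round(b0 + b1 * xi + sigma * zi, 4) for xi, zi in zip(x, z)]
# First entries: 470.0555, 450.0225, 409.9870, 469.9910, as in Dataset 4.3.
print(y[0], y[1], y[2])
```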
We recall here that the values of $\beta_0$, $\beta_1$ and $\sigma$ are known, while $\lambda$ is unknown. Hence the $z_i$ values are known, where $z_i = (y_i - \beta_0 - \beta_1 x_i)/\sigma$, $i = 1, \ldots, n$. We now work with the dataset $(x_i, y_i)$ transformed into $(x_i, z_i)$, $i = 1, \ldots, n$.
4.4.1 (A) Estimation Using the Linear Approximation $A_\alpha(z)$.
We now follow the procedure discussed in Section 4.2.1, where we approximated $R_\lambda(z)$ in (3.1) by $A_\alpha(z)$ given in (3.2), to estimate the parameters $\lambda$ and $\alpha$ in the presence of a covariate $X$.
In the equation (4.7) we have

W = \begin{pmatrix} 20 & 9.51 \\ 1740 & 862.24 \\ 9.51 & 20 \end{pmatrix}, \qquad \eta = \begin{pmatrix} 9.51 \\ 862.24 \\ 0 \end{pmatrix}.

We obtain from (4.7) the estimates of $\lambda$ and $\alpha$ as

\hat{\lambda} = 0.8119577, \qquad \hat{\alpha} = 0.4662167.

Hence the estimated linear function $\hat{A}(z)$ is

\hat{A}(z) = 0.7978846 - 0.3785406\, z.

The measure of goodness of fit is calculated to be

\mathrm{Ave}\,\Delta_A = 0.09446111.

We observe that the value of $\mathrm{Ave}\,\Delta_A$ is quite small, and hence we can conclude that our estimated linear function $\hat{A}(z)$ has satisfactorily approximated $R_\lambda(z)$.
Figure 4.1. Plots of both $R_1(z)$ and $\hat{A}(z)$ against $z$ when the covariate $X$ is present.
In Figure 4.1 we plot both $R_1(z)$ and $\hat{A}(z)$ against $z$. From this figure we observe that, for the data considered, the approximation $\hat{A}(z)$ for $R_1(z)$ is quite strong within the range of $z$ values $[-0.5, 2]$ and moderately strong in the range $[-2, -0.5]$.
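The whole of this subsection can be replayed from Datasets 4.1 and 4.2. The sketch below rebuilds $W$ and $\eta$, solves the normal equations, and recomputes the goodness-of-fit average; the reference values 0.8119577, 0.4662167 and 0.09446111 are those reported above (tiny discrepancies can arise because only the two-decimal $z$ values are used here):

```python
import math

z = [1.11, 0.45, -0.26, -0.18, 0.92, 1.48, 1.03, 0.32, 2.03, 1.93,
     0.07, 0.67, -0.15, 0.31, -0.11, -1.23, 0.38, 0.01, 1.25, -0.52]
x = [100, 96, 88, 100, 100, 96, 80, 68, 92, 96,
     88, 92, 68, 84, 84, 88, 72, 88, 72, 88]

n = len(z)
Sz, Sx = sum(z), sum(x)
Sxz = sum(u * v for u, v in zip(x, z))
W = [[n, Sz], [Sx, Sxz], [Sz, n]]   # equals [[20, 9.51], [1740, 862.24], [9.51, 20]]
eta = [Sz, Sxz, 0.0]

a = sum(r[0] * r[0] for r in W)
b = sum(r[0] * r[1] for r in W)
c = sum(r[1] * r[1] for r in W)
d = sum(W[i][0] * eta[i] for i in range(3))
e = sum(W[i][1] * eta[i] for i in range(3))
det = a * c - b * b
t1 = (c * d - b * e) / det
t2 = (a * e - b * d) / det

lam = t1 * math.sqrt(math.pi / 2.0)   # reported as 0.8119577
alpha = abs(t2) / (lam * lam)         # reported as 0.4662167

P = math.sqrt(2.0 / math.pi)

def R1(u):
    """True ratio phi(u)/Phi(u) for lambda = 1."""
    return (math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)) / \
           (0.5 * (1.0 + math.erf(u / math.sqrt(2.0))))

# A-hat(z) = 0.7978846 - 0.3785406 z, with slope alpha*lam
ave = sum(abs(R1(u) - (P - alpha * lam * u)) for u in z) / n  # reported as 0.09446111
print(round(lam, 4), round(alpha, 4))
```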
4.4.1 (B) Estimation Using the Non-Linear Approximation $B_\gamma(z)$.
We now follow the procedure discussed in Section 4.3.1, where we approximated $R_\lambda(z)$ in (3.1) by $B_\gamma(z)$ given in (4.16), to estimate the parameters $\lambda$ and $\gamma$ in the presence of the covariate $X$.
In the equation (4.23) we have

W = \begin{pmatrix} 20 & -3.034 \\ 1740 & -276.732 \\ 9.51 & -11.676 \end{pmatrix}, \qquad \eta = \begin{pmatrix} 9.51 \\ 862.24 \\ 0 \end{pmatrix}.

We obtain from (4.23) the estimates of $\lambda$ and $\gamma$ as

\hat{\lambda} = 0.7133538, \qquad \hat{\gamma} = 0.90986.

Hence the estimated non-linear function $\hat{B}(z)$ is

\hat{B}(z) = 0.7978846 + 0.649052 \left( e^{-z} - 1 \right).

The measure of goodness of fit is calculated to be

\mathrm{Ave}\,\Delta_B = 0.09331804.

We observe that the value of $\mathrm{Ave}\,\Delta_B$ is quite small, and hence we can conclude that our estimated non-linear function $\hat{B}(z)$ has satisfactorily approximated $R_\lambda(z)$.
In Figure 4.2 we plot both $R_1(z)$ and $\hat{B}(z)$ against $z$. From this figure and the data considered, we find that the approximation $\hat{B}(z)$ for $R_1(z)$ is very strong within the range of $z$ values $[-1, 1]$ and moderately strong in the ranges $[-2, -1]$ and $[1, 2]$.
Figure 4.2. Plots of both $R_1(z)$ and $\hat{B}(z)$ against $z$ when the covariate $X$ is present.
4.4.2 Case II. Covariate X Absent
We consider $\beta_0 = -30$ and $\sigma = 0.05$.
We generate the random variable $Y$ from Dataset 4.1 such that

y_i = \beta_0 + \sigma z_i, \quad i = 1, \ldots, n.

We can then write $Y \sim SN(\beta_0 = -30,\ \sigma = 0.05,\ \lambda = 1)$.
The $y_i$ values, $i = 1, \ldots, 20$, obtained are
-29.9445, -29.9775, -30.0130, -30.0090, -29.9540, -29.9260, -29.9485, -29.9840, -29.8985, -29.9035, -29.9965, -29.9665, -30.0075, -29.9845, -30.0055, -30.0615, -29.9810, -29.9995, -29.9375, -30.0260. (Dataset 4.4)
We recall here that the values of $\beta_0$ and $\sigma$ are known, while $\lambda$ is unknown. Hence the $z_i$ values are known, where $z_i = (y_i - \beta_0)/\sigma$, $i = 1, \ldots, n$. We now work with the dataset $y_i$ transformed into $z_i$.
4.4.2 (A) Estimation Using the Linear Approximation $A_\alpha(z)$.
We now follow the procedure discussed in Section 4.2.2, where we approximated $R_\lambda(z)$ in (3.1) by $A_\alpha(z)$ given in (3.2), to estimate the parameters $\lambda$ and $\alpha$ in the absence of any covariate.
In the equation (4.13) we have

W = \begin{pmatrix} 20 & 9.51 \\ 9.51 & 20 \end{pmatrix}, \qquad \eta = \begin{pmatrix} 9.51 \\ 0 \end{pmatrix}.

We obtain from (4.13) the estimates of $\lambda$ and $\alpha$ as

\hat{\lambda} = 0.770062, \qquad \hat{\alpha} = 0.4926799.

Hence the estimated linear function $\hat{A}(z)$ is

\hat{A}(z) = 0.7978846 - 0.3793941\, z.

The measure of goodness of fit is calculated to be

\mathrm{Ave}\,\Delta_A = 0.09402337.

We observe that the value of $\mathrm{Ave}\,\Delta_A$ is quite small, and hence we can conclude that our estimated linear function $\hat{A}(z)$ has satisfactorily approximated $R_\lambda(z)$.
Figure 4.3. Plots of both $R_1(z)$ and $\hat{A}(z)$ against $z$ when the covariate $X$ is absent.
In Figure 4.3 we plot both $R_1(z)$ and $\hat{A}(z)$ against $z$. This figure is very similar to Figure 4.1, and we observe that, for the data considered, the approximation $\hat{A}(z)$ for $R_1(z)$ is quite strong within the range of $z$ values $[-0.5, 2]$ and moderately strong in the range $[-2, -0.5]$.
4.4.2 (B) Estimation Using the Non-Linear Approximation $B_\gamma(z)$.
We now follow the procedure discussed in Section 4.3.2, where we approximated $R_\lambda(z)$ in (3.1) by $B_\gamma(z)$ given in (4.16), to estimate the parameters $\lambda$ and $\gamma$ in the absence of any covariate.
In the equation (4.29) we have

W = \begin{pmatrix} 20 & -3.034 \\ 9.51 & -11.676 \end{pmatrix}, \qquad \eta = \begin{pmatrix} 9.51 \\ 0 \end{pmatrix}.

We obtain from (4.29) the estimates of $\lambda$ and $\gamma$ as

\hat{\lambda} = 0.6799513, \qquad \hat{\gamma} = 0.9557566.

Hence the estimated non-linear function $\hat{B}(z)$ is

\hat{B}(z) = 0.7978846 + 0.649868 \left( e^{-z} - 1 \right).

The measure of goodness of fit is calculated to be

\mathrm{Ave}\,\Delta_B = 0.0931951.

We notice that the value of $\mathrm{Ave}\,\Delta_B$ is quite small, and hence we can conclude that our estimated non-linear function $\hat{B}(z)$ has satisfactorily approximated $R_\lambda(z)$.
In Figure 4.4 we plot both $R_1(z)$ and $\hat{B}(z)$ against $z$. This figure is very similar to Figure 4.2, and we notice that the approximation $\hat{B}(z)$ for $R_1(z)$ is very strong within the range of $z$ values $[-1, 1]$ and moderately strong in the ranges $[-2, -1]$ and $[1, 2]$.
Figure 4.4. Plots of both $R_1(z)$ and $\hat{B}(z)$ against $z$ when the covariate $X$ is absent.
4.5 Estimating Bias and Accuracy in the Approximations Using Simulations
We now obtain 100,000 datasets by repeating the simulation method described in the earlier sections 100,000 times. We consider different values of the parameters $\beta_0$, $\beta_1$, $\sigma$ and $\lambda$; however, we present here the outcomes only for a few sets of the parameter values. For a set of fixed values of the parameters, we generate 100,000 datasets, and from each dataset we obtain the numerical values of $\hat{\lambda}$, of $\delta\hat{\alpha}\hat{\lambda}^2$ or $\hat{\gamma}\hat{\lambda}^2$, and of $\hat{\alpha}\hat{\lambda}$ or $\hat{\gamma}\hat{\lambda}$ for the approximation in (3.2) or (3.3), using the equation (4.5) or (4.11). We also calculate the numerical values $\Delta_A(z_i)$ or $\Delta_B(z_i)$ for the twenty $z$ values in each of the 100,000 datasets, and then their average, $\mathrm{Ave}\,\Delta_A$ or $\mathrm{Ave}\,\Delta_B$, over the twenty calculated values. Finally, from the 100,000 datasets we get 100,000 values of each estimate and 100,000 values of each average. We present the first quartile Q1, the Median, the Mean, the third quartile Q3, and the standard deviation (SD) of the 100,000 values for all the cases. In Tables 4.1-4.8 we present only four out of the many sets of parameter values for which we have done our calculations.
In Tables 4.1-4.8, the numerical value of the difference between $\lambda$ and Median $\hat{\lambda}$ can be considered an estimate of the bias in $\hat{\lambda}$ as an estimate of $\lambda$ in (3.1). The numerical values of ($\lambda$ - Median $\hat{\lambda}$) are -0.0189 in Table 4.1, -0.0440 in Table 4.2, -0.0195 in Table 4.3, and -0.0500 in Table 4.4 when we estimate $\lambda$ approximating (3.1) by (3.2). The numerical values of ($\lambda$ - Median $\hat{\lambda}$) are 0.0560 in Table 4.5, 0.0360 in Table 4.6, 0.0556 in Table 4.7, and -0.0370 in Table 4.8 when we estimate $\lambda$ approximating (3.1) by (3.3). The sufficiently small numerical values of the estimated biases thus calculated from Tables 4.1-4.8 indicate the absence of an alarming bias in the estimates of $\lambda$. We observe that the approximation in (3.2) provides a small overestimate of $\lambda$, while the approximation in (3.3) provides a small underestimate of $\lambda$. We get a similar picture when we use ($\lambda$ - Mean $\hat{\lambda}$) for estimating the bias in $\hat{\lambda}$ as an estimate of $\lambda$ in (3.1). The median values of $\mathrm{Ave}\,\Delta_A$ are 0.0857 in Table 4.1, 0.1334 in Table 4.2, 0.0853 in Table 4.3, and 0.1336 in Table 4.4. The median values of $\mathrm{Ave}\,\Delta_B$ are 0.1347 in Table 4.5, 0.0589 in Table 4.6, 0.1347 in Table 4.7, and 0.0590 in Table 4.8. The goodness of the two approximating functions in (3.2) and (3.3), indicated by the numerical values of $\mathrm{Ave}\,\Delta_A$ and $\mathrm{Ave}\,\Delta_B$, is comparable, and both are sufficiently strong approximating functions of (3.1).
We observe that the non-linear approximation given in (3.3) does not perform significantly better than the linear approximation given in (3.2). Thus, in the next chapter, we restrict ourselves to the linear approximation, which is much simpler and quite satisfactory, to develop a new estimation procedure.
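The repeated-simulation study can be sketched for the no-covariate case with $\beta_0$ and $\sigma$ known, combining the skew normal generator with the closed-form Case II estimator; the replicate count of 1,000 (versus 100,000 in the text), the seed, and the summaries below are our choices:

```python
import math
import random

def rsn(n, lam, rng):
    """SN(0,1,lam) via Z = d|U1| + sqrt(1-d^2) U2, d = lam/sqrt(1+lam^2)."""
    d = lam / math.sqrt(1.0 + lam * lam)
    return [d * abs(rng.gauss(0.0, 1.0)) +
            math.sqrt(1.0 - d * d) * rng.gauss(0.0, 1.0) for _ in range(n)]

def lam_hat(z):
    """Closed-form Case II estimate: sqrt(pi/2)*nS/(n^2 - S^2), S = sum z_i."""
    n, S = len(z), sum(z)
    return math.sqrt(math.pi / 2.0) * n * S / (n * n - S * S)

rng = random.Random(1)
reps = sorted(lam_hat(rsn(20, 1.0, rng)) for _ in range(1000))
q1, med, q3 = reps[249], reps[499], reps[749]
# With true lambda = 1, the median of lambda-hat should sit near 1,
# in line with the small estimated biases reported in Tables 4.1-4.8.
print(0.5 < med < 2.0)
# → True
```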
Table 4.1. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\delta\hat{\alpha}\hat{\lambda}^2$, $\hat{\alpha}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_A$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 0.5$, and $\lambda = 0.5$, in approximating (3.1) by (3.2).
Table 4.2. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\delta\hat{\alpha}\hat{\lambda}^2$, $\hat{\alpha}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_A$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 0.5$, and $\lambda = 1.5$, in approximating (3.1) by (3.2).
Table 4.3. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\delta\hat{\alpha}\hat{\lambda}^2$, $\hat{\alpha}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_A$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 1.0$, and $\lambda = 0.5$, in approximating (3.1) by (3.2).
Table 4.4. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\delta\hat{\alpha}\hat{\lambda}^2$, $\hat{\alpha}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_A$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 1.0$, and $\lambda = 1.5$, in approximating (3.1) by (3.2).
Table 4.5. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\hat{\gamma}\hat{\lambda}^2$, $\hat{\gamma}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_B$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 0.5$, and $\lambda = 0.5$, in approximating (3.1) by (3.3).
Table 4.6. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\hat{\gamma}\hat{\lambda}^2$, $\hat{\gamma}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_B$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 0.5$, and $\lambda = 1.5$, in approximating (3.1) by (3.3).
Table 4.7. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\hat{\gamma}\hat{\lambda}^2$, $\hat{\gamma}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_B$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 1.0$, and $\lambda = 0.5$, in approximating (3.1) by (3.3).
Table 4.8. The values of Q1, Median, Mean, Q3, and SD for $\hat{\lambda}$, $\hat{\gamma}\hat{\lambda}^2$, $\hat{\gamma}\hat{\lambda}$, and $\mathrm{Ave}\,\Delta_B$ when $\beta_0 = -30$, $\beta_1 = 5$, $\sigma = 1.0$, and $\lambda = 1.5$, in approximating (3.1) by (3.3).
Chapter 5
Estimation of Location, Scale and Shape parameters of a Skew
Normal Distribution
5.1 Introduction.
In Chapter 4 we discussed the procedure for estimating $\lambda$ in $R_\lambda(z)$, $\alpha$ in $A_\alpha(z)$, and $\gamma$ in $B_\gamma(z)$, assuming the parameters $\beta_0$, $\beta_1$ and $\sigma$ to be known and considering a covariate $X$ to be present. In this chapter we assume all the parameters, i.e., $\beta_0$, $\beta_1$, $\sigma$ and $\lambda$, to be unknown. To estimate the unknown parameters, we approximate $R_\lambda(z)$ in (3.1) by the linear function $A_\alpha(z)$ in (3.2). Thus we have one additional unknown parameter, $\alpha$.
We consider a data set of $n$ pairs of observations $(x_i, y_i)$, $i = 1, \ldots, n$, where the $x_i$ are observations of the covariate $X$ and the $y_i$ are the observed values of the dependent variable $Y$, where $Y \sim SN(\beta_0 + \beta_1 x, \sigma, \lambda)$. We rewrite the estimating equations in (3.7)-(3.10) using $A_\alpha(z)$ in place of $R_\lambda(z)$. Based on the observations $(x_i, y_i)$ and the estimating equations, we discuss some important relationships that exist between the estimated parameter values. Using these relationships, we present an estimation procedure for the unknown parameters. We also discuss the performance of our estimation procedure with simulation results.
5.2 Relations Among the Estimated Parameters
In this section we obtain some interesting relations among the estimated parameter values. Using these relations, we present an estimation procedure for the unknown parameters.
By substituting $R_\lambda(z) = A_\alpha(z)$ in the maximum likelihood equations given in (3.7)-(3.10), with $z_i = (y_i - \beta_0 - \beta_1 x_i)/\sigma$ and using the slope $-\alpha\lambda$ of $A_\alpha(z)$, we get

\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0.7978846\, n \lambda \sigma - \alpha\lambda^2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i),  (5.1)

\sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0.7978846\, n \bar{x} \lambda \sigma - \alpha\lambda^2 \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i),  (5.2)

0.7978846 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) - \frac{\alpha\lambda}{\sigma} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = 0,  (5.3)

and

\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = n \sigma^2.  (5.4)
We note that $0.7978846 = \sqrt{2/\pi}$.
Let $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\sigma}$, $\hat{\lambda}$ and $\hat{\alpha}$ be the solutions for $\beta_0$, $\beta_1$, $\sigma$, $\lambda$ and $\alpha$ in the equations (5.1)-(5.4).
We define

\hat{w}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i.  (5.5)

From (5.1)-(5.4), using (5.5), we write

\left( 1 + \hat{\alpha}\hat{\lambda}^2 \right) \sum_{i=1}^{n} \hat{w}_i = 0.7978846\, n \hat{\lambda} \hat{\sigma},  (5.6)

\left( 1 + \hat{\alpha}\hat{\lambda}^2 \right) \sum_{i=1}^{n} x_i \hat{w}_i = 0.7978846\, n \bar{x} \hat{\lambda} \hat{\sigma},  (5.7)

0.7978846 \sum_{i=1}^{n} \hat{w}_i = \frac{\hat{\alpha}\hat{\lambda}}{\hat{\sigma}} \sum_{i=1}^{n} \hat{w}_i^2,  (5.8)

\sum_{i=1}^{n} \hat{w}_i^2 = n \hat{\sigma}^2.  (5.9)

From (5.8) and (5.9) we get

\sum_{i=1}^{n} \hat{w}_i = \frac{n \hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846}, \quad \text{i.e.,} \quad \bar{\hat{w}} = \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846}.  (5.10)

Multiplying both sides of (5.6) by $1/n$ and using (5.10), we have

\left( 1 + \hat{\alpha}\hat{\lambda}^2 \right) \bar{\hat{w}} = 0.7978846\, \hat{\lambda}\hat{\sigma}.  (5.11)

From (5.6)-(5.11) we obtain Observations 1-11 relating the estimated parameter values.
Observation 1. If $1 + \hat{\alpha}\hat{\lambda}^2 \neq 0$, then the estimate of $\beta_1$ is given by

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.  (5.12)

Proof: When $1 + \hat{\alpha}\hat{\lambda}^2 \neq 0$, multiplying both sides of (5.6) by $\bar{x}$ and subtracting the result from (5.7), we obtain

\sum_{i=1}^{n} (x_i - \bar{x}) y_i = \hat{\beta}_1 \sum_{i=1}^{n} (x_i - \bar{x}) x_i.  (5.13)

Then (5.12) follows immediately from (5.13).
Note: We note that the estimate of $\beta_1$ given in (5.12) for $SN(\beta_0 + \beta_1 x, \sigma, \lambda)$ is exactly the same as the estimate of $\beta_1$ for $N(\beta_0 + \beta_1 x, \sigma)$.
Observation 2: We have

\hat{\alpha} + (\hat{\alpha}\hat{\lambda})^2 = 0.7978846^2.  (5.14)

Proof: Combining (5.10) and (5.11), we obtain (5.14).
Observation 3: We have

(\hat{\alpha}\hat{\lambda})^2 \le 0.7978846^2.  (5.15)

Proof: We know that

n \bar{\hat{w}}^2 \le \sum_{i=1}^{n} \hat{w}_i^2.  (5.16)

Then from (5.9), (5.10) and (5.16), the inequality (5.15) follows.
Observation 4: We have

0 < \hat{\alpha} \le 0.7978846^2.  (5.17)

Proof: In (3.2), $\alpha > 0$ and hence $\hat{\alpha} > 0$. From (5.14), $\hat{\alpha} = 0.7978846^2 - (\hat{\alpha}\hat{\lambda})^2 \le 0.7978846^2$. Hence $0 < \hat{\alpha} \le 0.7978846^2$.
Observation 5. We have

\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} - \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846}.  (5.18)
Proof: From (5.5) we write

\sum_{i=1}^{n} \hat{w}_i = \sum_{i=1}^{n} y_i - n\hat{\beta}_0 - \hat{\beta}_1 \sum_{i=1}^{n} x_i.  (5.19)

Dividing both sides of (5.19) by $n$, we have

\bar{\hat{w}} = \bar{y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{x}.  (5.20)

Now from (5.10) and (5.20) we get (5.18).
Observation 6: We observe that (i) $\hat{\lambda} > 0$ if and only if $\bar{\hat{w}} > 0$; (ii) $\hat{\lambda} < 0$ if and only if $\bar{\hat{w}} < 0$.
Proof: Since $\hat{\alpha} > 0$ by (5.17) and $\hat{\sigma} > 0$, the results in (i) and (ii) are clear from (5.10).
Observation 7: When $\lambda = 0$ we denote $\hat{\sigma}$ by $\hat{\sigma}_0$, and it is given by

\hat{\sigma}_0^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right]^2.  (5.21)

Proof: When $\lambda = 0$, we know that $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma_0)$, $i = 1, 2, \ldots, n$. From (1.5)-(1.7), we obtain the maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma_0$:

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \quad \hat{\sigma}_0^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2.

Hence (5.21) follows.
Observation 8. We have

\frac{\hat{\sigma}_0^2}{\hat{\sigma}^2} = \frac{0.7978846^2 - (\hat{\alpha}\hat{\lambda})^2}{0.7978846^2}.  (5.22)

Proof: Noting that $(y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) = \hat{w}_i - \bar{\hat{w}}$ by (5.5) and (5.20), from (5.9), (5.10) and (5.21) we get

n \hat{\sigma}_0^2 = \sum_{i=1}^{n} \hat{w}_i^2 - n \bar{\hat{w}}^2 = n \hat{\sigma}^2 - n \left( \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846} \right)^2.  (5.23)

This implies

\hat{\sigma}_0^2 = \hat{\sigma}^2\, \frac{0.7978846^2 - (\hat{\alpha}\hat{\lambda})^2}{0.7978846^2},  (5.24)

or, equivalently,

(\hat{\alpha}\hat{\lambda}\hat{\sigma})^2 = 0.7978846^2 \left( \hat{\sigma}^2 - \hat{\sigma}_0^2 \right).  (5.25)

Hence the equality in (5.22) is true.
Observation 9. We have

\hat{\sigma}^2 = \hat{\sigma}_0^2 + \bar{\hat{w}}^2.  (5.26)

Proof: We write, for $i = 1, \ldots, n$,

y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i = (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) + \left( \bar{y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{x} \right).

Squaring both sides and summing over $i = 1, \ldots, n$, the cross term vanishes and we get

\sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2 = \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right]^2 + n \left( \bar{y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{x} \right)^2.  (5.27)

Hence, using (5.9), (5.20) and (5.21), (5.26) is true.
Observation 10. We have

\left( \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846} \right)^2 = \hat{\sigma}^2 - \hat{\sigma}_0^2 = \bar{\hat{w}}^2.  (5.28)

Proof: The proof is clear from (5.10) and (5.23).
Observation 11. We have

\left( \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846} \right)^2 = \hat{\sigma}^2 - \frac{\sigma^2}{n} \left[ \sum_{i=1}^{n} (z_i - \bar{z})^2 - \frac{\left( \sum_{i=1}^{n} (x_i - \bar{x})(z_i - \bar{z}) \right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right].

Proof: Let $y_1, y_2, \ldots, y_n$ be a random sample from $SN(\beta_0 + \beta_1 X, \sigma, \lambda)$; that is, $y_i = \beta_0 + \beta_1 x_i + \sigma z_i$, where $z_1, z_2, \ldots, z_n$ is a random sample from $SN(0, 1, \lambda)$. Hence $\bar{y} = \beta_0 + \beta_1 \bar{x} + \sigma \bar{z}$.
Now,

\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} = \frac{\sum_i (x_i - \bar{x}) \left[ \beta_1 (x_i - \bar{x}) + \sigma (z_i - \bar{z}) \right]}{\sum_i (x_i - \bar{x})^2} = \beta_1 + \sigma\, \frac{\sum_i (x_i - \bar{x})(z_i - \bar{z})}{\sum_i (x_i - \bar{x})^2}.

Again,

\hat{\sigma}_0^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right]^2 = \frac{\sigma^2}{n} \sum_{i=1}^{n} \left[ (z_i - \bar{z}) - \frac{\sum_j (x_j - \bar{x})(z_j - \bar{z})}{\sum_j (x_j - \bar{x})^2} (x_i - \bar{x}) \right]^2 = \frac{\sigma^2}{n} \left[ \sum_{i=1}^{n} (z_i - \bar{z})^2 - \frac{\left( \sum_i (x_i - \bar{x})(z_i - \bar{z}) \right)^2}{\sum_i (x_i - \bar{x})^2} \right].

From (5.28) we then have

\left( \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846} \right)^2 = \hat{\sigma}^2 - \hat{\sigma}_0^2 = \hat{\sigma}^2 - \frac{\sigma^2}{n} \left[ \sum_{i=1}^{n} (z_i - \bar{z})^2 - \frac{\left( \sum_i (x_i - \bar{x})(z_i - \bar{z}) \right)^2}{\sum_i (x_i - \bar{x})^2} \right].
5.3 The Estimation Procedure Using $A_\alpha(z)$
Here we recall that we have five unknown parameters $\beta_0$, $\beta_1$, $\sigma$, $\lambda$ and $\alpha$, and we have only four estimating equations, given in (5.6)-(5.9). Hence we do not have enough estimating equations to solve for the estimates uniquely. To address this issue we introduce a new variable $b$ and express the estimates of the parameters in terms of $b$.
Let us write

b = \frac{(\hat{\alpha}\hat{\lambda})^2}{0.7978846^2}.  (5.26)

From Observation 2 and Observation 4 we notice that

0 \le b < 1.  (5.27)

From Observation 2 and (5.26), we have

\hat{\alpha} = 0.7978846^2\, (1 - b).  (5.28)

From (5.22) and (5.26), we have

\hat{\sigma}^2 = \frac{\hat{\sigma}_0^2}{1 - b},  (5.29)

where $\hat{\sigma}_0^2 = \frac{1}{n} \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right]^2$.
From (5.26) and (5.28), we have

\hat{\lambda}^2 = \frac{b}{(1 - b)^2\, 0.7978846^2}.  (5.30)

Thus we see that, for a particular value of $b$, we have two values of the estimate $\hat{\lambda}$:

\hat{\lambda} = \pm \frac{\sqrt{b}}{0.7978846\, (1 - b)}.  (5.31)

We denote the estimate by $\hat{\lambda}_+$ when we take $\hat{\lambda} \ge 0$ and by $\hat{\lambda}_-$ when we take $\hat{\lambda} \le 0$. That is,

\hat{\lambda}_+ = \frac{\sqrt{b}}{0.7978846\, (1 - b)},  (5.31a)

and

\hat{\lambda}_- = -\frac{\sqrt{b}}{0.7978846\, (1 - b)}.  (5.31b)

Recalling (5.18), we have

\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} - \frac{\hat{\alpha}\hat{\lambda}\hat{\sigma}}{0.7978846}.

In (5.18), if $\hat{\lambda} = \hat{\lambda}_+$ we denote $\hat{\beta}_0$ by $\hat{\beta}_{0+}$, and if $\hat{\lambda} = \hat{\lambda}_-$ we denote $\hat{\beta}_0$ by $\hat{\beta}_{0-}$. That is, from (5.28), (5.29) and (5.31a) we have

\hat{\beta}_{0+} = \bar{y} - \hat{\beta}_1 \bar{x} - \hat{\sigma}_0 \sqrt{\frac{b}{1 - b}}.  (5.32a)
From (5.28), (5.29) and (5.31b) we have

\hat{\beta}_{0-} = \bar{y} - \hat{\beta}_1 \bar{x} + \hat{\sigma}_0 \sqrt{\frac{b}{1 - b}}.  (5.32b)

We now have two issues. From the $n$ pairs of observations $(x_i, y_i)$, $i = 1, \ldots, n$:
1. How do we determine the sign of $\hat{\lambda}$?
2. How do we determine the optimal value of $b$, so that we get the best estimates of the parameters $\beta_0$, $\beta_1$, $\sigma$, $\lambda$ and $\alpha$?
To answer the first question, we define $\hat{l}_+$ and $\hat{l}_-$, where $\hat{l}_+$ is the estimated log-likelihood function with $\hat{\beta}_0 = \hat{\beta}_{0+}$ and $\hat{\lambda} = \hat{\lambda}_+$, and $\hat{l}_-$ is the estimated log-likelihood function with $\hat{\beta}_0 = \hat{\beta}_{0-}$ and $\hat{\lambda} = \hat{\lambda}_-$.
The $\hat{l}_+$ and $\hat{l}_-$ are given by

\hat{l}_+ = n \log 2 - n \log \hat{\sigma} + \sum_{i=1}^{n} \log \phi\!\left( \frac{y_i - \hat{\beta}_{0+} - \hat{\beta}_1 x_i}{\hat{\sigma}} \right) + \sum_{i=1}^{n} \log \Phi\!\left( \hat{\lambda}_+\, \frac{y_i - \hat{\beta}_{0+} - \hat{\beta}_1 x_i}{\hat{\sigma}} \right),  (5.33a)

\hat{l}_- = n \log 2 - n \log \hat{\sigma} + \sum_{i=1}^{n} \log \phi\!\left( \frac{y_i - \hat{\beta}_{0-} - \hat{\beta}_1 x_i}{\hat{\sigma}} \right) + \sum_{i=1}^{n} \log \Phi\!\left( \hat{\lambda}_-\, \frac{y_i - \hat{\beta}_{0-} - \hat{\beta}_1 x_i}{\hat{\sigma}} \right).  (5.33b)
Let us define

C_i = \frac{(y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x})}{\hat{\sigma}}, \qquad p = \sqrt{2/\pi}, \qquad d = \frac{\sqrt{b}}{p\, (1 - b)}.

Result 1. We have

\hat{l}_+ \ge \hat{l}_- \quad \text{if and only if} \quad \prod_{i=1}^{n} \frac{\Phi\!\left( d (C_i + \sqrt{b}) \right)}{\Phi\!\left( -d (C_i - \sqrt{b}) \right)} \ge 1,  (5.34)

and

\hat{l}_+ \le \hat{l}_- \quad \text{if and only if} \quad \prod_{i=1}^{n} \frac{\Phi\!\left( d (C_i + \sqrt{b}) \right)}{\Phi\!\left( -d (C_i - \sqrt{b}) \right)} \le 1.  (5.35)

Proof: From (5.31a), (5.32a) and (5.33a), we note that

\frac{y_i - \hat{\beta}_{0+} - \hat{\beta}_1 x_i}{\hat{\sigma}} = C_i + \sqrt{b}, \qquad \hat{\lambda}_+ = d,

so that

\hat{l}_+ = n \log 2 - n \log \hat{\sigma} + \sum_{i=1}^{n} \log \phi\!\left( C_i + \sqrt{b} \right) + \sum_{i=1}^{n} \log \Phi\!\left( d (C_i + \sqrt{b}) \right).
Similarly, from (5.31b), (5.32b) and (5.33b),

\frac{y_i - \hat{\beta}_{0-} - \hat{\beta}_1 x_i}{\hat{\sigma}} = C_i - \sqrt{b}, \qquad \hat{\lambda}_- = -d,

and

\hat{l}_- = n \log 2 - n \log \hat{\sigma} + \sum_{i=1}^{n} \log \phi\!\left( C_i - \sqrt{b} \right) + \sum_{i=1}^{n} \log \Phi\!\left( -d (C_i - \sqrt{b}) \right).

Now,

\hat{l}_+ - \hat{l}_- = \sum_{i=1}^{n} \log \frac{\phi(C_i + \sqrt{b})}{\phi(C_i - \sqrt{b})} + \sum_{i=1}^{n} \log \frac{\Phi\!\left( d (C_i + \sqrt{b}) \right)}{\Phi\!\left( -d (C_i - \sqrt{b}) \right)}.  (5.36)
We see that

\prod_{i=1}^{n} \frac{\phi(C_i + \sqrt{b})}{\phi(C_i - \sqrt{b})} = \exp\left\{ -\frac{1}{2} \sum_{i=1}^{n} \left[ (C_i + \sqrt{b})^2 - (C_i - \sqrt{b})^2 \right] \right\} = \exp\left\{ -2\sqrt{b} \sum_{i=1}^{n} C_i \right\} = 1,

since $\sum_{i=1}^{n} C_i = \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right] / \hat{\sigma} = 0$.
Hence, from (5.36), we have

\hat{l}_+ - \hat{l}_- = \sum_{i=1}^{n} \log \frac{\Phi\!\left( d (C_i + \sqrt{b}) \right)}{\Phi\!\left( -d (C_i - \sqrt{b}) \right)},

and thus (5.34) and (5.35) hold.
Sign of λ̂:
We consider one hundred values of b, namely 0.00, 0.01, 0.02, …, 0.99. Corresponding to each value of b, and from the n pairs of observations (x_i, y_i), i = 1, …, n, we calculate
∏_{i=1}^n Φ(d(C_i + b))/Φ(−d(C_i − b)).
If ∏_{i=1}^n Φ(d(C_i + b))/Φ(−d(C_i − b)) > 1 for the majority of the b values, we conclude λ̂ > 0, and
if ∏_{i=1}^n Φ(d(C_i + b))/Φ(−d(C_i − b)) < 1 for the majority of the b values, we conclude λ̂ < 0.
Optimal value of b:
We consider the same one hundred values of b, namely 0.00, 0.01, 0.02, …, 0.99. First we determine the sign of λ̂. If we find that λ̂ > 0, then we consider only l̂₊. If we find that λ̂ < 0, then we consider only l̂₋.
For l̂₊, corresponding to each value of b we calculate β̂₀, β̂₁, σ̂, λ̂, and the slope ψ̂ of the linear approximating function in (3.2), and then find the value of l̂₊. We find the optimal value of b, b = b*, which maximizes l̂₊. The estimates of the parameters are then those obtained at b*. We denote the estimated parameter values by β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*.
For l̂₋, corresponding to each value of b we calculate β̂₀, β̂₁, σ̂, λ̂, and ψ̂, and then find the value of l̂₋. We find the optimal value of b, b = b*, which maximizes l̂₋. The estimates of the parameters are then those obtained at b*. We again denote the estimated parameter values by β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*.
Thus we obtain the estimated parameter values.
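The choice of b* is a one-dimensional grid search. A structural sketch follows, in which estimate_and_loglik is an assumed helper (not part of the dissertation's notation) that, for a given b, returns the parameter estimates of Section 5.3 together with the corresponding value of the estimated log-likelihood:

```python
def find_b_star(estimate_and_loglik):
    """Grid search over b = 0.00, 0.01, ..., 0.99.

    estimate_and_loglik(b) -> (estimates, loglik) is an assumed helper
    implementing the estimating equations of Section 5.3.  Returns the
    triple (b_star, estimates at b_star, maximum log-likelihood)."""
    best = None
    for k in range(100):
        b = k / 100
        estimates, loglik = estimate_and_loglik(b)
        if best is None or loglik > best[2]:
            best = (b, estimates, loglik)
    return best
```

Because the grid is fixed and small, the search is exhaustive; no iterative optimization of the likelihood is needed.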
5.4 A Simulated Dataset
In order to evaluate the performance of the estimation procedure given in Section 5.3, we present the estimated values of the unknown parameters β₀, β₁, σ, and λ, in the presence of a covariate X, using simulated data.
We generate z_i, i = 1, …, n, from SN(0, 1, λ), where n = 20 and λ = 1, keeping nine decimal places. The values rounded to the second decimal place are
1.40, 1.96, -0.12, 0.41, 0.75, 0.98, 0.74, 0.24, 1.59, 1.09, -0.13, 0.64, 1.98, 0.17, -0.14, 1.40, 0.72, 1.36, 0.99, 0.66. (Dataset 5.1)
The x_i values, i = 1, …, 20 (Exercise 12.25, Page 522, Mendenhall, Beaver and Beaver (2009)), are
100, 96, 88, 100, 100, 96, 80, 68, 92, 96, 88, 92, 68, 84, 84, 88, 72, 88, 72, 88. (Dataset 5.2)
We consider β₀ = −30, β₁ = 5 and σ = 0.05.
We generate a random variable Y from (Dataset 5.1) and (Dataset 5.2), such that
y_i = β₀ + β₁x_i + σz_i, i = 1, …, 20.
We can then write Y ~ SN(β₀ + β₁x = −30 + 5x, σ = 0.05, λ = 1).
The y_i values, i = 1, …, 20, are
470.0698, 450.0979, 409.9942, 470.0207, 470.0375, 450.0491, 370.0371, 310.0119, 430.0793, 450.0547, 409.9933, 430.0319, 310.0990, 390.0083, 389.9928, 410.0700, 330.0358, 410.0679, 330.0494, 410.0329. (Dataset 5.3)
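A dataset of this structure (though not these exact draws, which depend on the random seed) can be reproduced with the standard stochastic representation of the skew normal distribution, z = δ|u₁| + √(1 − δ²)u₂, with δ = λ/√(1 + λ²) and u₁, u₂ independent N(0, 1). A Python sketch:

```python
import math
import random

def skew_normal_sample(n, lmbda, seed=2010):
    """Draw n observations from SN(0, 1, lmbda) via
    z = delta*|u1| + sqrt(1 - delta**2)*u2, u1, u2 iid N(0, 1)."""
    rng = random.Random(seed)
    delta = lmbda / math.sqrt(1 + lmbda * lmbda)
    tail = math.sqrt(1 - delta * delta)
    return [delta * abs(rng.gauss(0, 1)) + tail * rng.gauss(0, 1)
            for _ in range(n)]

def regression_sample(x, beta0, beta1, sigma, lmbda, seed=2010):
    """y_i = beta0 + beta1*x_i + sigma*z_i with z_i ~ SN(0, 1, lmbda)."""
    z = skew_normal_sample(len(x), lmbda, seed)
    return [beta0 + beta1 * xi + sigma * zi for xi, zi in zip(x, z)]
```

With x taken from Dataset 5.2, regression_sample(x, -30, 5, 0.05, 1) produces data of the same form as Dataset 5.3.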
Now we treat the values of β₀, β₁, σ, and λ as unknown. We apply the estimation procedure proposed in Section 5.3 to estimate these parameters, and also the slope ψ of the linear approximating function in (3.2).
We first determine the sign of λ̂.
For Dataset 5.3, we calculate
∏_{i=1}^n Φ(d(C_i + b))/Φ(−d(C_i − b)),
where
C_i = (y_i − ȳ − β̂₁(x_i − x̄))/σ̂,  p = √(2/π),  and d = b/(1 − bp),
for each value of b, where b = 0.00, 0.01, 0.02, …, 0.99.
We obtain
β̂₁ = 5.000267.
We see that ∏_{i=1}^n Φ(d(C_i + b))/Φ(−d(C_i − b)) > 1 for all the values of b.
Hence we conclude λ̂ > 0.
Hence, we consider only l̂₊. Now, corresponding to each value of b, we obtain β̂₀, β̂₁, σ̂, λ̂ and ψ̂. Thus we have 100 sets of estimates of β₀, β₁, σ, λ, and ψ.
Corresponding to each set of estimated values of the parameters, we obtain l̂₊. Then we find the maximum among that set of 100 values, and note the corresponding value of b, b*, as our optimum value of b.
Here, max l̂₊ = 26.90821 and the corresponding value is b* = 0.43.
The estimates obtained corresponding to b* = 0.43 are our estimated values of the unknown parameters. They are:
β̂₀* = −30.00863,
β̂₁* = 5.000263,
σ̂* = 0.04184073,
ψ̂* = −0.3628733,
and
λ̂* = 1.441847.
We note that our procedure estimates the location and the scale parameters very accurately. For the dataset considered here, the estimate of the shape parameter is also quite satisfactory, though slightly over-estimated.
Here, the estimated linear function Â_λ̂(ẑ) is
Â_λ̂(ẑ) = 0.7978846 − 0.5232079 ẑ.
In Figure 5.1 we plot l̂₊ against all values of b = 0.00, 0.01, 0.02, …, 0.99. We notice that the curve appears almost flat until b = 0.7 and then falls off. But if we magnify the plot around the region where b takes values between 0.35 and 0.50, we observe that the curve reaches its maximum at b = 0.43. We plot this in Figure 5.2.
Figure 5.1 Plot of the estimated log-likelihood l̂₊ against the values of b ∈ [0, 1).
Figure 5.2 Plot of the estimated log-likelihood l̂₊ against the values of b ∈ [0.35, 0.50].
5.5 Estimating Bias and Accuracy in Approximation Using Simulations
We now obtain 10,000 datasets by repeating the simulation method described in the earlier sections 10,000 times. We consider different values of the parameters β₀, β₁, σ, and λ, and different sample sizes, n = 20, 50, 100, and 500. However, we present here the outcomes only for a few sets of parameter values.
First, for the fixed parameter values β₀ = −30, β₁ = 5, σ = 0.05, and different values of the parameter λ and sample size n, we generate 10,000 datasets and observe how many times we correctly determine the sign of λ. In Table 5.1 we give the true value of the parameter λ, the sample size, and the proportion of times the sign of λ was correctly determined out of 10,000 simulations.
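The proportions reported in Table 5.1 come from a simulation loop of the following shape. This is a structural sketch only: decide_sign is a placeholder for the product criterion (5.34)-(5.35), and the data generator uses the same skew normal representation as in Section 5.4:

```python
import math
import random

def proportion_sign_correct(lmbda, n, n_rep, decide_sign, seed=1):
    """Fraction of n_rep simulated samples of size n from SN(0, 1, lmbda)
    for which decide_sign recovers the sign of lmbda.

    decide_sign is an assumed callback standing in for (5.34)-(5.35)."""
    rng = random.Random(seed)
    delta = lmbda / math.sqrt(1 + lmbda * lmbda)
    tail = math.sqrt(1 - delta * delta)
    true_sign = 1 if lmbda > 0 else -1
    hits = 0
    for _ in range(n_rep):
        z = [delta * abs(rng.gauss(0, 1)) + tail * rng.gauss(0, 1)
             for _ in range(n)]
        hits += (decide_sign(z) == true_sign)
    return hits / n_rep
```

Any sign rule can be plugged in; the loop itself only counts how often the decision agrees with the sign of the true λ.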
From Table 5.1, we can make the following observations:
i. When λ = 0.5, the sign of λ is correctly determined a little more than 50% of the time when the sample size is n = 20, and about 60% of the time when n = 500.
ii. When λ = 1.0, the sign of λ is correctly determined about 60% of the time when n = 20, about 70% of the time when n = 100, and almost 90% of the time when n = 500.
iii. When λ = 2.0, the sign of λ is correctly determined about 75% of the time when n = 20, about 95% of the time when n = 100, and almost always when n = 500.
iv. When λ = 3.0, the sign of λ is correctly determined almost always, for all sample sizes.
True value of λ   Sample size n   Proportion of times the sign of λ is correctly determined
 0.5               20              0.5101
 0.5               50              0.5312
 0.5              100              0.5346
 0.5              500              0.5891
 1                 20              0.5790
 1                 50              0.6369
 1                100              0.7005
 1                500              0.8832
 2                 20              0.7460
 2                 50              0.8842
 2                100              0.9564
 2                500              0.9999
 3                 20              0.8445
 3                 50              0.9653
 3                100              0.9964
 3                500              1.0000
-0.5               20              0.5121
-0.5               50              0.5191
-0.5              100              0.5362
-0.5              500              0.5859
-1                 20              0.5682
-1                 50              0.6334
-1                100              0.7031
-1                500              0.8874
-2                 20              0.7369
-2                 50              0.8760
-2                100              0.9567
-2                500              1.0000
-3                 20              0.8318
-3                 50              0.9650
-3                100              0.9966
-3                500              1.0000
Table 5.1 The true values of λ, the sample size n, and the proportion of times the sign of λ is correctly determined.
Next, for a set of fixed parameter values and a given sample size, we generate 10,000 datasets, and from each dataset we obtain the optimum value of b, b*, and the corresponding estimated values β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*. Thus we have a set of 10,000 estimates of the parameters. In Tables 5.2-5.33 we present the first quartile Q1, median, mean, third quartile Q3, and standard deviation (SD) of the 10,000 values of β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*.
We observe that the median estimates of the parameters perform better than the mean estimates. The numerical values of the differences between the parameters and their median estimates could be considered as estimates of bias. A few other things can be noted from the tables:
i. The estimated bias in λ̂ is smaller when the true value of λ is 0.5 or 1 compared to when it is 2 or 3. This is because, as discussed in Chapter 3, the linear function A_λ(z) given in (3.2) approximates R_λ(z) in (3.1) more accurately for smaller values of λ.
ii. The estimated bias in λ̂ is smaller for larger sample sizes.
iii. For larger values of λ, our method slightly under-estimates the true value of λ.
iv. The estimated biases in β̂₀, β̂₁, and σ̂ are quite small in all cases.
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.06    -29.98    -29.98    -29.91    0.1066
β̂₁*     4.999     5         5         5.001    0.0010
σ̂*      0.051     0.0596    0.0601    0.0690   0.0127
λ̂*     -1.827     0.1266    0.0434    1.883    1.7606
ψ̂*    -0.3947   -0.3119   -0.3582   -0.2801   0.2355
Table 5.2. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.03    -29.98    -29.98    -29.94    0.0724
β̂₁*     5         5         5         5        0.0007
σ̂*      0.0516    0.0585    0.0586    0.0652   0.0093
λ̂*     -1.321     0.2611    0.0956    1.485    1.4273
ψ̂*    -0.4775   -0.3692   -0.4028   -0.3183   0.1060
Table 5.3. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.02    -29.99    -29.99    -29.95    0.0550
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0509    0.0565    0.0567    0.0621   0.0075
λ̂*     -1.011     0.295     0.1025    1.21     1.1998
ψ̂*    -0.5157   -0.4202   -0.4371   -0.3565   0.0983
Table 5.4. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.01    -29.99    -29.99    -29.96    0.0310
β̂₁*     5         5         5         5        0.0002
σ̂*      0.0493    0.0522    0.0527    0.0556   0.0043
λ̂*     -0.5711    0.3853    0.1631    0.8079   0.7849
ψ̂*    -0.5666   -0.5029   -0.5051   -0.4456   0.0727
Table 5.5. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -29.98    -29.98    -29.91    0.0939
β̂₁*     4.999     5         5         5.001    0.0009
σ̂*      0.0450    0.0529    0.0532    0.0610   0.0115
λ̂*     -1.621     0.7006    0.2957    2.002    1.7377
ψ̂*    -0.3947   -0.3119   -0.3575   -0.2801   0.1118
Table 5.6. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.03    -29.98    -29.98    -29.94    0.0634
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0458    0.0522    0.0521    0.0581   0.0085
λ̂*     -0.8355    0.9211    0.4703    1.72     1.3815
ψ̂*    -0.4711   -0.3629   -0.3984   -0.3119   0.1062
Table 5.7. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.02    -29.99    -29.99    -29.95    0.0476
β̂₁*     5         5         5         5        0.0004
σ̂*      0.0457    0.0509    0.0510    0.0560   0.0069
λ̂*     -0.4404    0.9506    0.5883    1.529    1.1345
ψ̂*    -0.4902   -0.4011   -0.4227   -0.3438   0.0971
Table 5.8. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
b*      0.19      0.31      0.283     0.38     0.1230
β̂₀*   -30.01    -30.00    -29.99    -29.98    0.0234
β̂₁*     5         5         5         5        0.0001
σ̂*      0.04563   0.04931   0.04918   0.0525   0.0046
λ̂*      0.6226    0.9807    0.8322    1.246    0.6137
ψ̂*    -0.5157   -0.4393   -0.4565   -0.3947   0.0784
Table 5.9. The values of Q1, Median, Mean, Q3, and SD for b*, β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -29.99    -29.98    -29.93    0.0787
β̂₁*     4.999     5         5         5.001    0.0008
σ̂*      0.0383    0.0456    0.0459    0.0528   0.0104
λ̂*     -0.1266    1.827     0.9958    2.201    1.5543
ψ̂*    -0.3629   -0.2992   -0.3436   -0.2737   0.1074
Table 5.10. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.03    -29.99    -29.99    -29.96    0.0502
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0413    0.0472    0.0468    0.0524   0.0080
λ̂*      1.175     1.772     1.39      2.066    0.9952
ψ̂*    -0.382    -0.3183   -0.3552   -0.2865   0.0972
Table 5.11. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.02    -30       -30       -29.97    0.0346
β̂₁*     5         5         5         5        0.0003
σ̂*      0.0438    0.0482    0.0476    0.0520   0.0062
λ̂*      1.442     1.772     1.573     1.941    0.6514
ψ̂*    -0.3629   -0.3183   -0.3462   -0.2992   0.0782
Table 5.12. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.01    -30       -30       -29.99    0.0150
β̂₁*     5         5         5         5        0.0001
σ̂*      0.0476    0.0493    0.0492    0.05097  0.0026
λ̂*      1.67      1.772     1.749     1.883    0.1757
ψ̂*    -0.331    -0.3183   -0.3225   -0.3056   0.0234
Table 5.13. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -29.99    -29.99    -29.94    0.0716
β̂₁*     5         5         5         5        0.0007
σ̂*      0.0367    0.0438    0.0440    0.0511   0.0104
λ̂*      1.283     2.066     1.427     2.273    1.3394
ψ̂*    -0.331    -0.2865   -0.326    -0.2674   0.1009
Table 5.14. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.03    -30       -29.99    -29.96    0.0452
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0416    0.0464    0.0461    0.0509   0.0072
λ̂*      1.772     2.066     1.835     2.201    0.6619
ψ̂*    -0.3183   -0.2865   -0.3136   -0.2737   0.0723
Table 5.15. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.02    -30       -30       -29.98    0.0310
β̂₁*     5         5         5         5        0.0003
σ̂*      0.0443    0.0474    0.0473    0.0507   0.0050
λ̂*      1.883     2.066     1.972     2.132    0.3172
ψ̂*    -0.3056   -0.2865   -0.2987   -0.2801   0.0400
Table 5.16. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.01    -30       -30       -29.99    0.0137
β̂₁*     5         5         5         5        0.0001
σ̂*      0.0467    0.0483    0.0483    0.0497   0.0021
λ̂*      2.002     2.066     2.035     2.066    0.0883
ψ̂*    -0.2928   -0.2865   -0.2898   -0.2865   0.0091
Table 5.17. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = 3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.09    -30.02    -30.02    -29.94    0.1060
β̂₁*     4.999     5         5         5.001    0.0010
σ̂*      0.0503    0.0593    0.0597    0.0686   0.0128
λ̂*     -1.883    -0.1266   -0.0471    1.772    1.7458
ψ̂*    -0.4011   -0.3119   -0.361    -0.2801   0.1137
Table 5.18. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.06    -30.02    -30.02    -29.97    0.0716
β̂₁*     5         5         5         5        0.0007
σ̂*      0.0516    0.0583    0.0585    0.0650   0.0094
λ̂*     -1.442    -0.1809   -0.0715    1.321    1.4253
ψ̂*    -0.4775   -0.3692   -0.404    -0.3183   0.1066
Table 5.19. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.05    -30.01    -30.02    -29.98    0.0552
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0510    0.0566    0.0567    0.0621   0.0074
λ̂*     -1.21     -0.295    -0.0984    1.043    1.2089
ψ̂*    -0.5157   -0.4138   -0.4359   -0.3501   0.0990
Table 5.20. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -30.01    -30.01    -29.99    0.0316
β̂₁*     5         5         5         5        0.0002
σ̂*      0.0492    0.0522    0.0527    0.0558   0.0043
λ̂*     -0.8355   -0.3853   -0.1646    0.5711   0.7831
ψ̂*    -0.5666   -0.5093   -0.5056   -0.4456   0.0730
Table 5.21. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −0.5, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.08    -30.02    -30.02    -29.96    0.0941
β̂₁*     4.999     5         5         5.001    0.0009
σ̂*      0.0446    0.0525    0.053     0.0609   0.0115
λ̂*     -2.002    -0.5968   -0.2794    1.621    1.7341
ψ̂*    -0.3947   -0.3119   -0.3593   -0.2801   0.1130
Table 5.22. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.06    -30.02    -30.02    -29.97    0.0637
β̂₁*     5         5         5         5        0.0006
σ̂*      0.0457    0.0519    0.0521    0.0580   0.0085
λ̂*     -1.67     -0.9211   -0.4475    0.8921   1.3911
ψ̂*    -0.4647   -0.3629   -0.3972   -0.3119   0.1050
Table 5.23. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.05    -30.01    -30.01    -29.98    0.0474
β̂₁*     5         5         5         5        0.0004
σ̂*      0.0457    0.0509    0.0510    0.0560   0.0069
λ̂*     -1.529    -0.9807   -0.5904    0.4132   1.1344
ψ̂*    -0.4966   -0.4011   -0.4228   -0.3438   0.0978
Table 5.24. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.02    -30       -30.01    -29.99    0.0233
β̂₁*     5         5         5         5        0.0002
σ̂*      0.0457    0.0494    0.0492    0.0527   0.0045
λ̂*     -1.246    -1.011    -0.8469   -0.6226   0.6098
ψ̂*    -0.5093   -0.4393   -0.4546   -0.3883   0.0784
Table 5.25. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −1, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.07    -30.02    -30.02    -29.96    0.0784
β̂₁*     4.999     5         5         5.001    0.0008
σ̂*      0.0382    0.0458    0.0460    0.0532   0.0106
λ̂*     -2.201    -1.772    -0.9625    0.1266   1.5753
ψ̂*    -0.3629   -0.2992   -0.3443   -0.2737   0.1086
Table 5.26. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -30.01    -30.01    -29.97    0.0503
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0411    0.0472    0.0469    0.0525   0.0080
λ̂*     -2.066    -1.772    -1.372    -1.175    1.0277
ψ̂*    -0.3756   -0.3183   -0.3536   -0.2865   0.0953
Table 5.27. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.03    -30       -30       -29.98    0.0347
β̂₁*     5         5         5         5        0.0004
σ̂*      0.0440    0.0482    0.0477    0.0520   0.0063
λ̂*     -2.002    -1.772    -1.585    -1.442    0.6487
ψ̂*    -0.3629   -0.3183   -0.3447   -0.2928   0.0782
Table 5.28. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.01    -30       -30       -29.99    0.0148
β̂₁*     5         5         5         5        0.0002
σ̂*      0.0476    0.0493    0.0492    0.0510   0.0026
λ̂*     -1.883    -1.772    -1.75     -1.67     0.1731
ψ̂*    -0.331    -0.3183   -0.3224   -0.3056   0.0230
Table 5.29. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −2, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -30       -30.06    -29.96    0.0717
β̂₁*     5         5         5         5        0.0007
σ̂*      0.0365    0.0435    0.0438    0.0507   0.0102
λ̂*     -2.273    -2.066    -1.383    -1.175    1.3730
ψ̂*    -0.331    -0.2865   -0.3274   -0.2674   0.1015
Table 5.30. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 20, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.04    -30       -30.01    -29.98    0.0445
β̂₁*     5         5         5         5        0.0005
σ̂*      0.0416    0.0464    0.0460    0.0509   0.0072
λ̂*     -2.201    -2.066    -1.839    -1.772    0.6550
ψ̂*    -0.3183   -0.2865   -0.3132   -0.2737   0.0720
Table 5.31. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 50, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.02    -30       -30       -29.98    0.0309
β̂₁*     5         5         5         5        0.0003
σ̂*      0.0443    0.0476    0.0473    0.0507   0.0050
λ̂*     -2.132    -2.066    -1.976    -1.883    0.3083
ψ̂*    -0.3056   -0.2865   -0.2986   -0.2801   0.0409
Table 5.32. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 100, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −3, in approximating (3.1) by (3.2).
        Q1        Median    Mean      Q3        SD
β̂₀*   -30.01    -30       -30       -29.99    0.0135
β̂₁*     5         5         5         5        0.0001
σ̂*      0.0469    0.0483    0.0483    0.0497   0.0021
λ̂*     -2.066    -2.066    -2.035    -2.002    0.0887
ψ̂*    -0.2928   -0.2865   -0.2897   -0.2865   0.0091
Table 5.33. The values of Q1, Median, Mean, Q3, and SD for β̂₀*, β̂₁*, σ̂*, λ̂*, and ψ̂*, when n = 500, β₀ = −30, β₁ = 5, σ = 0.05, and λ = −3, in approximating (3.1) by (3.2).
Chapter 6
Conclusion
In this dissertation we have seen the challenges in finding the maximum likelihood estimates of the parameters of the skew normal distribution. We have argued that the complex function of the ratio of the normal density and distribution functions, appearing in the likelihood equations together with the shape, location and scale parameters, makes it very difficult to estimate the parameters. In this dissertation we proposed simple linear and non-linear approximations to this complex function. We have seen that the linear function approximates the complex function quite satisfactorily. Thus, using the linear approximating function in the likelihood equations, we estimate the parameters of interest. We also present our estimation procedure in a regression setup, assuming a covariate X to be present.
In our dissertation we have considered two cases. First, as our main parameter of interest is the shape parameter, we assumed the location and scale parameters to be fixed and considered estimating only the shape parameter. We presented an estimation procedure for this situation using both the linear and the non-linear approximating functions for the complex function of the ratio of the normal density and distribution functions. Second, we assumed all the parameters to be unknown and presented a numerical method to estimate them; here we considered only the linear approximating function. We performed simulation studies to evaluate the performance of the estimation procedure for the parameters.
The skew normal distribution has great potential. The absence of an efficient estimation procedure for its parameters has been a main hindrance to its wider usage. Our research provides an efficient and simple estimation procedure, as well as a new approach to resolving the complexity in solving the maximum likelihood estimating equations.
In our dissertation we approach the challenges of maximum likelihood estimation in a novel way, and there is ample scope for future research in this direction. Better approximating functions could be one way of improving the performance of the estimation procedure. We have presented a numerical procedure for estimation; in future work, it can be examined whether, based on our approach, an analytical procedure can be developed. The properties of the skew normal distribution in the light of these approximations can also be revisited.
Bibliography
[1] Arellano-Valle, R.B., Bolfarine, H., & Lachos, V.H. (2005a). Skew-Normal Linear
Mixed Models. Journal of Data Science 3, 415-438.
[2] Arellano-Valle, R.B., Del Pino, G., & San Martin, E. (2002). Definition and Probabilistic
Properties of Skew Normal Distributions. Statist. Probab. Lett. 58, 111-121.
[3] Arnold, B.C. & Beaver, R. J. (2000a). Hidden Truncation Models. Sankhya, ser.A 62,
22-35.
[4] Arnold, B. C. & Lin, G.D. (2004). Characterizations of the Skew Normal and Generalized Chi Distributions. Sankhya 66, 593-606.
[5] Arnold, B.C., Castillo, E., & Sarabia, J.M. (2007). Distributions with Generalized
Skewed Conditionals and Mixtures of such Distributions. Commun. Statist. – Theory &
Methods 36, 1493-1503.
[6] Azzalini, A. (1985). A Class of Distributions Which Includes the Normal Ones. Scand. J. Statist. 12, 171-178.
[7] Azzalini, A. (2001). A Note on Regions of Given Probability of the Skew Normal
Distribution. Metron LIX, 27-34.
[8] Azzalini, A. (1986). Further Results on a Class of Distributions Which Includes the Normal Ones. Statistica 46, 199-208.
[9] Azzalini, A. & Dalla Valle, A. (1996). The multivariate skew-normal distribution.
Biometrika, 83, 715–726.
[10] Bansal, N. K., Maadooliat, M. & Wang, X. (2008). Empirical Bayes and Hierarchical Bayes Estimation of Skew Normal Populations. Commun. Statist. - Theory & Methods 37, 1024-1037.
[11] Capitanio, A., Azzalini, A., & Stanghellini, E. (2003). Graphical Models for Skew-
Normal Variates. Scand. J. Statist 30, 129-144.
[12] Catchpole, E.A. & Morgan, B.J.T. (1997). Detecting parameter redundancy,
Biometrika, 84, 187-196.
[13] Chen, J.T., Gupta, A.K. & Nguyen, T.T. (2004). The Density of the Skew Normal
Sample Mean and Application. J. Statist. Comput. Simul. 74, 487-494.
[14] Chiogna, M. (1998). Some Results on the Scalar Skew-Normal Distribution. J. Ital.
Statist. Soc 7, 1-13.
[15] Chiogna, M. (2005). A Note on the Asymptotic Distribution of the Maximum
Likelihood Estimator for the Skew Normal Distribution. Stat. Meth. & Appl. 14,
331-341.
[16] Dalla Valle, A., (2004). The skew-normal distribution. In: M.G. Genton (Ed.) Skew-
Elliptical Distributions and Their Applications: a Journey Beyond Normality,
Chapter 1 (Boca Raton, FL: Chapman and Hall/CRC), 3–24.
[17] Dalla Valle, A. (2007). A Test for the Hypothesis of Skew-normality in a Population. J.
Statist. Comput. Simul 77, 63-77.
[18] D’Agostino, R.B. and Stephens, M.A., (1986). Handbook of Goodness-of-fit
techniques (New York: Marcel Dekker).
[19] Genton, M. G. (2005). Discussion of "The Skew Normal Distribution and Related Multivariate Families" by A. Azzalini. Scand. J. Statist. 32, 189-198.
[20] Genton, M. G., He, L. and Liu, X. (2001). Moments of skew-normal random vectors and their quadratic forms. Statistics & Probability Letters, 51, 319-325.
[21] Gupta, A. K. & Chen, T. (2001). Goodness-of-fit Tests for the Skew Normal
Distribution. Commun. Statist. - Simulation & Computation 30, 907-930.
[22] Gupta, A.K. & Huang, W.J. (2002). Quadratic Forms in the Skew Normal Variates. J.
Math. Anal. Appl. 273, 558-564.
[23] Monti, A. C. (2003). A Note on the Estimation of the Skew Normal and the Skew Exponential Power Distributions. Metron LXI, 205-219.
[24] Pewsey, A. (2000). Problems of Inference for Azzalini’s Skew Normal Distribution.
Journal of Applied Statistics 27, 859-870.
[25] Rohatgi, V.K. & Ehsanes Saleh, A.K. Md. (2003). An Introduction to Probability and
Statistics. Second Edition. Wiley Series in Probability and Statistics.
[26] Sartori, N. (2006). Bias Prevention of Maximum Likelihood Scalar Skew Normal and
Skew t Distributions. J. Statist. Planning and Inference. 136, 4259-4275.