Statistical Analysis of Skew Normal Distribution and Its Applications

STATISTICAL ANALYSIS OF SKEW NORMAL DISTRIBUTION AND ITS APPLICATIONS

Grace Ngunkeng

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

August 2013

Committee: Wei Ning, Advisor; Jane Y. Chang, Graduate Faculty Representative; Arjun K. Gupta; John T. Chen

Copyright © August 2013 Grace Ngunkeng. All rights reserved.

ABSTRACT

Wei Ning, Advisor

In many practical applications, real data sets are observed to be asymmetric. They exhibit some skewness and therefore do not conform to the normal distribution, which is popular and easy to handle. Azzalini (1985) introduced a new class of distributions named the skew normal distribution, which is mathematically tractable and includes the normal distribution as the special case in which the skewness parameter is zero. The skew normal distribution family is well known for modeling and analyzing skewed data. It extends the normal distribution family by adding a shape parameter that regulates the skewness, which gives it greater flexibility in fitting real data where some skewness is present. In this dissertation, we explore statistical analysis related to this distribution family.

In the first part of the dissertation, we develop a nonparametric goodness-of-fit test based on the empirical likelihood method for the skew normal distribution. The empirical likelihood was proposed by Owen (1988). It is a method that combines the reliability of the canonical nonparametric method with the flexibility and effectiveness of the likelihood approach. The statistical inference of the test statistic is derived. Simulations indicate that the proposed test can control the type I error within a given nominal level, and that it has competitive power compared to the other available tests.
The test is applied to the IQ scores data set and the Australian Institute of Sport data set to illustrate the testing procedure.

In the second part, we focus on the change point problem for the skew normal distribution. The world is filled with changes, which can lead to unnecessary losses if people are not aware of them. Thus, in many practical applications, statisticians face the problem of detecting the number of change points, or jumps, and their locations. In this part, we address this problem for the standard skew normal family. We focus on a test based on the Schwarz information criterion (SIC) to detect the position and the number of change points for the shape parameter. The likelihood ratio test and the Bayesian method are introduced briefly as two alternative approaches. The asymptotic null distribution of the SIC test statistic is derived, and the critical values for different sample sizes and nominal levels are computed for the adjusted SIC test statistic. A simulation study indicates the performance of the proposed test.

In the third part of the dissertation, we extend the methods of the second part by studying different types of change point problems for the general skew normal distribution: the simultaneous change of the location and scale parameters, and the simultaneous change of the location, scale and shape parameters. We derive the test statistic based on the SIC to detect and estimate the number of possible change points. First, we consider the change point problem for simultaneous changes of the location and scale parameters, assuming that the shape parameter is unknown and has to be estimated. Second, we explore the change point problem for simultaneous changes of the location, scale and shape parameters. The asymptotic null distribution and the corresponding adjustment for the test statistic are established. Simulations for each proposed test are conducted to indicate the performance of the test.
Power comparisons with the available tests are investigated to indicate the advantage of the proposed test. Applications to real data are provided to illustrate the test procedure.

This work is dedicated to my beloved grandmother Ngunkeng Mariana and my parents Ashu Alexander and Monica Fuabe Ashu, for their constant love and support.

ACKNOWLEDGMENTS

To God be the honor and glory.

I wish to express my sincere gratitude to my advisor, Dr. Wei Ning, for his continuous support, guidance and patience throughout this research, and from whom I have acquired a great many skills. I also want to extend my gratitude to my committee members, Dr. Arjun K. Gupta, Dr. John T. Chen and Dr. Jane Chang, for taking the time to serve on my committee and for their constructive comments.

I would like to thank the Mathematics and Statistics Department and the Graduate College for providing me with financial support during my studies at BGSU. I would like to thank all the professors in the Mathematics and Statistics Department for their vast knowledge, which has impacted me. I would also like to thank all my fellow graduate students for their friendship. I would like to especially thank Marcia Lynn Seubert, Mary Jane Busdeker and Barbara J Berta for all their assistance.

I would like to thank Professor Reinaldo B. Arellano-Valle, Professor Luis M. Castro and Professor Rosangela H. Loschi for providing us with the Latin American stock market data used in Chapters 3 and 4.

I owe special thanks to Dr. Lisa Chyvonne Chavers, Mr. Sidney Robert Childs, Dr. Nkem Khumbah and Mrs. Prudence Nojang for making it possible for me to continue my studies at BGSU and for their continuous moral and financial support.

Finally, my deepest gratitude goes to my parents, family and friends for their constant love and spiritual support throughout my studies.

Grace Ngunkeng
Bowling Green, Ohio, USA
August 2013

Table of Contents

CHAPTER 1: SKEW NORMAL DISTRIBUTION
  1.1 Introduction
    1.1.1 Properties of skew normal distribution (SN)
  1.2 Literature Review
    1.2.1 Thesis Structure
CHAPTER 2: EMPIRICAL LIKELIHOOD RATIO BASED GOODNESS-OF-FIT TEST FOR SKEW NORMALITY
  2.1 Introduction
  2.2 Empirical Likelihood Based Test
    2.2.1 Empirical Likelihood Method
    2.2.2 Test Statistic
  2.3 Asymptotic Results
  2.4 Calculations of Critical Values and P-values
    2.4.1 Critical Values
    2.4.2 Approximations to the p-value of SNn
  2.5 Simulations
  2.6 Application
    2.6.1 Otis IQ Scores for Non-whites
    2.6.2 Australian Institute of Sport Data
  2.7 Conclusion
CHAPTER 3: CHANGE POINT PROBLEM FOR STANDARD SKEW NORMAL DISTRIBUTION
  3.1 Introduction
    3.1.1 Literature Review
  3.2 Change of the Shape Parameter λ
    3.2.1 Information Approach
    3.2.2 Likelihood Ratio Based Test
    3.2.3 Bayesian Approach
  3.3 Simulation
  3.4 Application
  3.5 Conclusion
CHAPTER 4: CHANGE POINT PROBLEM FOR GENERAL SKEW NORMAL DISTRIBUTION
  4.1 Location and Scale Change
    4.1.1 Information Approach (SIC)
    4.1.2 Power Simulation
  4.2 Application to Biomedical Data
  4.3 The Change of Location, Scale and Shape
    4.3.1 Test Statistics
    4.3.2 Power Simulation
  4.4 Applications to Latin American Emerging Market Stock Returns
    4.4.1 Argentina Weekly Stock Market
    4.4.2 Brazilian Stock Return
    4.4.3 Chile Stock Return Market
    4.4.4 Mexico Stock Return Market
  4.5 Conclusion
BIBLIOGRAPHY

List of Figures

2.1 Histogram of IQ scores with a skew normal fit and normal fit.
2.2 The histogram with a skew normal fit and normal fit for the body mass index (BMI) of 50 females.
3.1 The Graph of the time series data for the weekly stock returns and return rate for Brazil with the corresponding change points respectively.
4.1 Left: The SIC values for every locus on chromosome 4 of the fibroblast cell line GM13330; Right: Chromosome 4 of the fibroblast cell line GM13330.
4.2 The graphs of the time series data for the weekly stock returns and return rate Rt for Argentina market with the corresponding change points.
4.3 Left: The graph of the acf values of the transformed data Rt; Right: Test for normality.
4.4 The graphs of the time series data for the weekly return rate Rt and stock returns for Brazil market with the corresponding change points.
4.5 Left: The acf plot of Brazil Rt series data; Right: Test for Normality.
4.6 The graphs of the time series data for the weekly return rate Rt and stock returns for Chile market with the corresponding change points.
4.7 Left: Graph of the acf of the Chile Rt series; Right: Test for Normality.
4.8 The graphs of the time series data for the weekly return rate Rt and stock returns for Mexico market with the corresponding change point.
4.9 The ACF of Mexico stock return rate Rt and Q-Q plot to test for normality assumption.

List of Tables

2.1 Type I error with SN(0, 1, λ), α = 0.05
2.2 Power comparison with n = 20, 25, 50 and 100
2.3 Power comparison with n = 20, 25, 50 and 100
2.4 Power of Test with Alternative Distribution N(0, 1)
2.5 Empirical Power Evaluation of the statistic (2.2.11) with different δ at α = 0.05
2.6 Otis IQ Scores for Non-whites
2.7 Estimated values for N(µ, σ) and SN(µ, σ; λ)
