STUDY OF GENERALIZED AND CHANGE POINT PROBLEM

Amani Alghamdi

A Dissertation

Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

August 2018

Committee:

Arjun K. Gupta, Committee Co-Chair

Wei Ning, Committee Co-Chair

Jane Chang, Graduate Faculty Representative

John Chen

Copyright © 2018 Amani Alghamdi
All rights reserved

ABSTRACT

Arjun K. Gupta and Wei Ning, Committee Co-Chairs

Generalizations of univariate distributions are often of interest for modeling real life phenomena. These generalized distributions are very useful in many fields such as medicine, physics, engineering and biology. The Lomax distribution (Pareto Type II) is one of the well known univariate distributions that is considered as an alternative to the exponential, gamma, and Weibull distributions for heavy tailed data. However, this distribution does not grant great flexibility in modeling data. In this dissertation, we introduce a generalization of the Lomax distribution called the Rayleigh Lomax (RL) distribution, using the form obtained by El-Bassiouny et al. (2015). This distribution provides a great fit in modeling wide ranges of real data sets. It is a very flexible distribution that is related to some of the useful univariate distributions such as the exponential, Weibull and Rayleigh distributions. Moreover, this new distribution can also be transformed to a lifetime distribution which is applicable in many situations. For example, we obtain the inverse estimation and confidence intervals in the case of progressively Type-II right censored data. We also apply the Schwarz information criterion (SIC) and the modified information criterion (MIC) to detect changes in the parameters of the RL distribution. The performance of these approaches is studied through simulations and applications to real data sets. According to Aryal and Tsokos (2009), most of the real world phenomena that we need to study are asymmetrical, and the normal model is not a good model for studying this type of dataset. Thus, skewed models are necessary for modeling and fitting asymmetrical datasets. Azzalini (1985) introduced the univariate skew normal distribution, and his approach can be applied to any symmetrical model. However, if the underlying (base) distribution is not symmetric, we cannot apply Azzalini's approach. This motivated the study of more flexible alternatives.
Shaw and Buckley (2007) introduced a quadratic rank transmutation map (QRTM) which can be applied to any (symmetric or asymmetric) distribution. Recently, many distributions have been suggested using the QRTM to derive the transmuted class (TC) of distributions. This provides great flexibility in fitting real datasets. We extend our work on the RL distribution to derive the transmuted Rayleigh Lomax (TR-RL) distribution using the QRTM. Mathematical and statistical properties, such as the moment generating function, L-moments and probability weighted moments, are derived and studied. We also establish the relationships between the TR-RL distribution, the RL distribution, and other useful distributions to show that our proposed distribution includes them as special cases. The TR-RL distribution is fitted to a well known dataset, and the goodness of fit test and the likelihood ratio test are presented to show how well the TR-RL distribution fits the data.

To the memory of my parents who paved the way for me during their lifetime. To my husband Khalid Alghamdi and my children Lujain, Loai and Wael for their unconditional love and support, I dedicate this work.

ACKNOWLEDGMENTS

First of all, I would like to express my deepest thanks to my advisors Dr. Arjun Gupta and Dr. Wei Ning for their guidance, support, detailed comments, patience and the invaluable encouragement they offered throughout this research. Under their guidance, I successfully overcame many difficulties and learned a lot. I would especially like to thank Dr. John Chen for his unforgettable support and for accepting to be a member of my dissertation committee. I am also thankful to Dr. Jane Chang for the time she spent on reviewing this dissertation. Deepest sense of gratitude to Dr. Craig Zirbel for his careful and precious guidance throughout my study. My thanks and appreciation also go to the staff of the Department of Mathematics and Statistics at BGSU for their help and support. I must thank all of my friends in the US for helping and understanding me during this journey. Special thanks go to my brothers and my sisters for encouraging and inspiring me to follow my dreams. I am also grateful to my parents in law, who supported me emotionally and believed in me. I owe thanks to a very special person, my husband, Khalid Alghamdi, for all his love, continuing support and encouragement. This entire journey would not have been possible without his support. Last but not least, I wish to express my love to my children Lujain, Loai and Wael, who have unavoidably missed my presence during this period.

TABLE OF CONTENTS

CHAPTER 1 LITERATURE REVIEW
1.1 Introduction
1.2 Methodology
1.3 Dissertation Structure

CHAPTER 2 RAYLEIGH LOMAX DISTRIBUTION
2.1 Introduction
2.2 The Rayleigh Lomax Distribution
2.3 Distributional Properties
2.3.1 Shapes of pdf
2.3.2 Moments
2.3.3 L-moments
2.3.4 Order statistics
2.3.5 Quantile function
2.3.6 Probability weighted moments
2.3.7 Moment generating function
2.4 Estimation
2.4.1 MLEs of parameters
2.4.2 Asymptotic distribution
2.4.3 Simulation
2.5 Application
2.6 Hazard rate function
2.7 Inference under progressively Type-II right-censored data for the transformed Rayleigh Lomax distribution
2.7.1 Interval estimation of parameter α
2.7.2 Interval estimation of parameter σ
2.7.3 Inverse estimation of parameters α and σ
2.7.4 Simulation study
2.7.5 An Illustrative Example
2.8 Discussion

CHAPTER 3 TRANSMUTED RAYLEIGH LOMAX DISTRIBUTION
3.1 Introduction
3.2 The Transmuted Rayleigh Lomax distribution
3.2.1 Rank transmutation
3.2.2 The Transmuted Rayleigh Lomax distribution
3.3 Distributional Properties
3.3.1 Shape of pdf
3.3.2 Moments
3.3.3 L-moments
3.3.4 Order statistics
3.3.5 Quantile function
3.3.6 Probability weighted moments
3.3.7 Moment generating function
3.4 Estimation
3.4.1 MLEs of parameters
3.4.2 Asymptotic distribution
3.4.3 Simulation
3.5 Application
3.6 Discussion and Conclusions

CHAPTER 4 AN INFORMATION APPROACH FOR THE CHANGE POINT PROBLEM OF THE RAYLEIGH LOMAX DISTRIBUTION
4.1 Introduction
4.2 Literature Review of the Change Point Problem
4.3 Methodology
4.4 Simulation Study
4.5 Change Point Analysis for British Coal Mining Disaster
4.6 Change Point Analysis for IBM Stock Price
4.7 Change Point Analysis for the Radius of Circular Indentations
4.8 Conclusions

BIBLIOGRAPHY

APPENDIX A SELECTED R PROGRAMS

LIST OF FIGURES

1.1 Probability density function of Lomax distribution for different values of shape parameter α and λ = 1.
1.2 Cumulative density function of Lomax distribution for different values of shape parameter α and λ = 1.
1.3 Probability density function of Pareto (IV) distribution for different values of the shape parameter α and inequality parameter γ = 1, 2.

2.1 Probability density function of Rayleigh Lomax as α increases and decreases.
2.2 Probability density function of Rayleigh Lomax as λ increases and decreases.
2.3 Probability density function of Rayleigh Lomax as σ increases and decreases.
2.4 Plot of the estimated densities for the Aircraft Windshield data.
2.5 Model of Probability density function of Rayleigh Lomax for the Aircraft Windshield data using the L-moments, the MLE and the method of moments.

2.6 Probability density function of the transformed Rayleigh Lomax for σ² = 1/2.
2.7 Hazard rate function of the transformed Rayleigh Lomax for a random variable T when σ² = 1/2.

3.1 Probability density function of the TR-RL as α increases and decreases.
3.2 Probability density function of the TR-RL as λ increases and decreases.
3.3 Probability density function of the TR-RL as σ increases and decreases.
3.4 Probability density functions of TR-RL and RL (dot curve).
3.5 The relationship between α and the median of the TR-RL distribution.
3.6 The relationship between λ and the median of the TR-RL distribution.
3.7 Fitted density curves to the remission time real data.

4.1 χ²₃ Q-Q plot of Sn as n = 100, p-value = 0.4376.

4.2 χ²₃ Q-Q plot of Sn as n = 200, p-value = 0.6342.

4.3 χ²₃ Q-Q plot of Sn as n = 400, p-value = 0.8649.
4.4 The auto-correlation plot of the British Coal Mining Disaster data.
4.5 for the British Coal Mining Disaster data.
4.6 SIC(k) values in the British Coal Mining Disaster data.
4.7 The auto-correlation plot of the transformed IBM data.
4.8 The IBM stock daily closing prices from May 17 of 1961 to November 2 of 1962.
4.9 The IBM stock daily closing prices rate from May 17 of 1961 to November 2 of 1962.
4.10 The auto-correlation plot of the radius of circular indentations data.
4.11 The Radius of Circular Indentations data.
4.12 SIC(k) values for the radius of circular indentations data.
4.13 MIC(k) values for the radius of circular indentations data.

LIST OF TABLES

2.1 Mean and variance of Rayleigh Lomax distribution for different values of α, λ and σ.
2.2 Skewness and Kurtosis values of Rayleigh Lomax Distribution for different values of α, λ and σ.
2.3 The Quartile values (the first quartile, the median and the third quartile) for RL distribution.
2.4 MLE, SD, Bias, MSE and 95% confidence limits from RL distribution.
2.5 Failure times of 84 Aircraft Windshield.
2.6 The statistics log-likelihood, AIC, SIC and AD for failure times of 84 Aircraft Windshield data.
2.7 MLEs for failure times of 84 Aircraft Windshield data.
2.8 MLEs, L-moments, Method of moments, AIC and SIC for failure times of 84 Aircraft Windshield data.
2.9 The coverage of the confidence intervals in the transformed RL distribution for α = 1 and σ = 1.
2.10 The average bias and average MSE of the inverse estimators of the parameters of the transformed RL distribution for α = 1 and σ = 1.
2.11 The inverse estimates of the parameters of the transformed RL distribution for α = 1 and σ = 1.
2.12 The average bias of the L-moments and MLEs of the parameters of the transformed RL distribution.
2.13 The average MSE of the L-moments and MLEs of the parameters of the transformed RL distribution.
2.14 The maximum likelihood and L-moment estimates of the parameters of the transformed RL distribution for α = 1 and σ = 1.
2.15 The maximum likelihood and inverse estimates of α and σ.
2.16 The 0.90 confidence intervals for the parameters α and σ.
2.17 The 0.95 confidence intervals for the parameters α and σ.

3.1 Mean and variance of the TR-Lomax distribution as α increases for different values of α, λ and σ.
3.2 Mean and variance of the TR-Lomax distribution as λ increases and σ decreases for different values of α and β.
3.3 MLE, SE, Bias and 95% confidence limits from TR-RL distribution.
3.4 Summary description of the remission time data.
3.5 The Loglikelihood, MLEs, AIC and SIC for the remission time real data.
3.6 The likelihood ratio tests for the remission time real data.

4.1 Power comparison between SIC and MIC as n = 100.
4.2 Power comparison between SIC and MIC as n = 200.
4.3 Power comparison between SIC and MIC as n = 400.
4.4 Probability distribution of |k̂ − k| ≤ η when n = 100 and different k's.
4.5 Probability distribution of |k̂ − k| ≤ η when n = 150 and different k's.
4.6 Probability distribution of |k̂ − k| ≤ η when n = 200 and different k's.
4.7 Time intervals between explosions in mines.

CHAPTER 1 LITERATURE REVIEW

1.1 Introduction

The Pareto distribution is one of the heavy tailed distributions which usually models nonnegative data. It was propounded by Pareto (1897) as a model for the distribution of incomes. Several different forms of the Pareto distribution have been studied by many authors including Lomax (1954), Davis and Feldstein (1979), Grimshaw (1993) and Nadarajah and Gupta (2008). The main common types of Pareto distribution are known as the Pareto Type I, II, III, IV, and Feller Pareto distributions. One of the popular members of the Pareto hierarchy is Pareto Type II, which has been named the Lomax distribution. The Lomax distribution has been applied in a variety of fields such as engineering, reliability and life testing. Harris (1968) and Atkinson and Harrison (1978) applied the Lomax distribution to model data obtained from income and wealth. The Lomax distribution has been used as an alternative to the exponential, gamma and Weibull distributions for heavy tailed data by Bryson (1974). Golaup et al. (2005) modeled the size distribution of computer files on servers using the Lomax distribution. The Lomax distribution is considered an important lifetime model since it belongs to the family of decreasing failure rate distributions (Chahkandi and Ganjali (2009)). Hassan and Al-Ghamdi (2009) presented an optimum step-stress life test for the Lomax distribution. Nagar et al. (2012) generalized the Lomax distribution to the matrix case. The cumulative distribution and probability density functions of the Lomax distribution are defined as follows.

Definition. 1.1.1. A random variable X has the Lomax distribution with two parameters α and λ if its cumulative distribution function (cdf) is given by

G(x; α, λ) = 1 − (λ/(x + λ))^α, x ≥ 0, α, λ > 0, (1.1.1)

where α and λ are the shape and scale parameters respectively. The probability density function (pdf) corresponding to (1.1.1) is

g(x; α, λ) = (α/λ) (1 + x/λ)^(−α−1), x ≥ 0, α, λ > 0. (1.1.2)
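The cdf (1.1.1) and pdf (1.1.2) are simple enough to code directly. The following is an illustrative sketch (in Python here, while the dissertation's own programs are in R; the function names are our own) that also confirms the pdf is the derivative of the cdf:

```python
def lomax_cdf(x, alpha, lam):
    # G(x; alpha, lambda) = 1 - (lambda / (x + lambda))^alpha, x >= 0
    return 1.0 - (lam / (x + lam)) ** alpha

def lomax_pdf(x, alpha, lam):
    # g(x; alpha, lambda) = (alpha / lambda) * (1 + x/lambda)^(-alpha - 1)
    return (alpha / lam) * (1.0 + x / lam) ** (-alpha - 1.0)

# Sanity check: the pdf should match the numerical derivative of the cdf.
x, a, l, h = 1.3, 2.0, 1.5, 1e-6
numeric = (lomax_cdf(x + h, a, l) - lomax_cdf(x - h, a, l)) / (2 * h)
assert abs(numeric - lomax_pdf(x, a, l)) < 1e-6
```

The same pair of functions is reused below when checking that other families reduce to the Lomax distribution as special cases.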

Figures 1.1 and 1.2 present the behavior of the Lomax probability and cumulative density functions for various values of the shape parameter α and λ = 1. We notice that as the shape parameter α → ∞, the Lomax distribution approaches the Dirac-delta function δ(x), which is defined as follows.

Definition. 1.1.2. The Dirac-delta function, which was first introduced by physicist Dirac (1958) is a function that is equal to zero everywhere except for zero with an integral of one over the entire domain. It is defined as follows.

δ(x) = +∞ for x = 0, and δ(x) = 0 for x ≠ 0, (1.1.3)

which is constrained by ∫_{−∞}^{∞} δ(x) dx = 1.

Figure 1.1: Probability density function of Lomax distribution for different values of shape parameter α and λ = 1.

Figure 1.2: Cumulative density function of Lomax distribution for different values of shape parameter α and λ = 1.

The following theorem reveals the relationship between Lomax and other well known distribu- tions.

Theorem 1.1.1. (a) The Lomax distribution is a Pareto Type I distribution shifted so that its support begins at zero.

(b) The Lomax distribution is a Pareto Type II distribution with µ = 0.
(c) The Lomax distribution is a Beta distribution of the second kind with one of its shape parameters equal to one.

(d) The Lomax distribution is the Feller-Pareto distribution with location parameter µ = 0, inequality parameter γ = 1, shape parameter γ1 = α and shape parameter γ2 = 1.
(e) The Lomax distribution is the Pareto Type IV with location parameter µ = 0 and inequality parameter γ = 1.

1.2 Methodology

In 1897, the Italian economist and sociologist Vilfredo Pareto made the well-known observation that 20% of the input or causes produce 80% of the output or consequences in many cases. For example, 80% of the revenue comes from 20% of the customers. The Pareto distribution has been observed as being adequate for modeling income and wealth distributions. According to Arnold (2015), a hierarchy of Pareto models is created by starting with the classical Pareto distribution (Pareto Type I) and then introducing additional parameters such as location, scale, shape and inequality, which form the Pareto Type IV family with the pdf given by

g(x) = (α/(λγ)) ((x − µ)/λ)^(1/γ − 1) [1 + ((x − µ)/λ)^(1/γ)]^(−(α+1)), x > µ, (1.2.1)

where −∞ < µ < ∞ is the location parameter, λ > 0 is the scale parameter, γ > 0 is the inequality parameter and α > 0 is the shape parameter.
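The reduction of the Pareto Type IV pdf (1.2.1) to the Lomax pdf (1.1.2) at µ = 0 and γ = 1, stated in Theorem 1.1.1(e), can be checked numerically. A small sketch (Python for illustration; function names are our own):

```python
def pareto4_pdf(x, mu, lam, gamma, alpha):
    # Pareto Type IV pdf (1.2.1):
    # g(x) = (alpha/(lam*gamma)) * z^(1/gamma - 1) / (1 + z^(1/gamma))^(alpha + 1),
    # with z = (x - mu)/lam, for x > mu.
    z = (x - mu) / lam
    return (alpha / (lam * gamma)) * z ** (1.0 / gamma - 1.0) \
        / (1.0 + z ** (1.0 / gamma)) ** (alpha + 1.0)

def lomax_pdf(x, alpha, lam):
    # Lomax pdf (1.1.2)
    return (alpha / lam) * (1.0 + x / lam) ** (-alpha - 1.0)

# With mu = 0 and inequality parameter gamma = 1, Pareto IV is exactly Lomax.
for x in [0.1, 0.5, 1.0, 3.0]:
    assert abs(pareto4_pdf(x, 0.0, 2.0, 1.0, 1.5) - lomax_pdf(x, 1.5, 2.0)) < 1e-12
```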

Figure 1.3: Probability density function of Pareto (IV) distribution for different values of the shape parameter α and inequality parameter γ = 1, 2.

Pareto (IV) is the general family that contains the Pareto (I), Pareto (II), Pareto (III) and Burr distributions. Figure 1.3 shows the pdf of the Pareto Type IV distribution with various values of the shape parameter α and two values of the inequality parameter, γ = 1 and γ = 2. We observe that when the inequality parameter γ = 1, we have the Lomax distribution. Although the Lomax distribution has wide applications in real life, it does not provide large flexibility in modeling data. Recently, many applications have shown a clear need for generalizations of the Lomax distribution that add one or more parameters to improve the ability to fit diverse data sets. These generalized distributions contain some of the well-known distributions as special cases. According to Alizadeh et al. (2017), the basic motivations for generating a new distribution in practice are the following:

• To create skewness for symmetrical models.

• To define special models with all types of hazard rate functions.

• To construct heavy-tailed distributions for modeling diverse real data sets.

• To generate distributions with different types of skewness.

• To provide consistently better fits than other generated distributions with the same underlying model.

In the literature, several extensions of the Lomax distribution are available such as the exponentiated Lomax (Abdul-Moniem and Abdel-Hameed, 2012), Beta-Lomax (Rajab et al., 2013), Poisson-Lomax (Al-Zahrani and Sagor, 2014), Gamma-Lomax (Cordeiro et al., 2015), exponential Lomax (El-Bassiouny et al., 2015) and Gumbel-Lomax (Tahir et al., 2016) distributions. Much research has continued in this area of deriving new distributions from the classic ones. However, we need more flexible models that can be applied to any symmetrical or asymmetrical data. Shaw and Buckley (2007) investigated a new technique for introducing skewness or kurtosis into a symmetric or other distribution. They presented a rank transmutation map, which is the functional composition of the cumulative distribution function of one distribution with the inverse cumulative distribution (quantile) function of another, D(u) = F[G⁻¹(u)], where F and G are cumulative distribution functions with a common sample space. This map is used to modulate the moments of a given base distribution, such as skewness and kurtosis. By looking at the rank transmutation map, we notice that it has the same structure as a copula. Shaw and Buckley (2007) stated some positive features of the transmutation map, which are:

• Transmutation maps are applied to any base distribution, whether symmetric or asymmetric.

• Transmutation maps are easily generalized to introduce some kurtosis.

• Monte Carlo simulation can be conducted by using the quantile function of the base distribution.

• The raw moments of the transmuted distribution can be obtained as simple linear functions of the transmutation parameters.

One of the asymptotic analogues of the rank transmutation map is the Edgeworth or Gram-Charlier (EGC) expansion. It approximates a probability distribution in terms of its cumulants. However, including higher moments through additional terms in the expansion can lead to negative values for the pdf. Thus, the rank transmutation map is a good alternative to this expansion. The simplest example of the rank transmutation is the quadratic rank transmutation map (QRTM), which is defined as follows.

Definition. 1.2.1. (Shaw and Buckley, 2007) A natural rank transmutation map, or QRTM, has the following simple quadratic form, for |β| ≤ 1:

D(u) = u + βu(1 − u), (1.2.2) from which it follows that the CDFs obey the relationship

F(x) = G(x) + βG(x)(1 − G(x)), (1.2.3)

where G(x) is the cdf of the base distribution and F(x) is the cdf of the transmuted distribution.
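The map (1.2.3) can be illustrated with a short sketch. The standard exponential base cdf below is an arbitrary choice of ours for demonstration; the point is that for any base G and |β| ≤ 1 the transmuted F is again a valid cdf (Python for illustration):

```python
import math

def transmuted_cdf(G, beta):
    # Build the transmuted cdf F(x) = G(x) + beta * G(x) * (1 - G(x)), |beta| <= 1.
    assert abs(beta) <= 1.0
    return lambda x: G(x) + beta * G(x) * (1.0 - G(x))

# Base distribution: standard exponential cdf (illustrative choice).
G = lambda x: 1.0 - math.exp(-x) if x > 0 else 0.0
F = transmuted_cdf(G, beta=-0.5)

# F should be nondecreasing and run from 0 to 1, like any cdf.
xs = [i * 0.01 for i in range(1001)]
vals = [F(x) for x in xs]
assert all(v2 >= v1 for v1, v2 in zip(vals, vals[1:]))
assert vals[0] == 0.0 and abs(F(50.0) - 1.0) < 1e-12
```

Monotonicity holds because dF/dG = 1 + β(1 − 2G) stays nonnegative on [0, 1] whenever |β| ≤ 1, which is exactly the constraint in Definition 1.2.1.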

Several authors studied the transmuted distributions and their statistical properties. For example, Aryal and Tsokos (2009) applied the QRTM to the extreme value distributions and compared it with Azzalini's approach (Azzalini, 1985). Aryal and Tsokos (2011) introduced a new generalization of the Weibull distribution called the transmuted Weibull distribution. Merovci (2014) applied the transmuted generalized Rayleigh distribution to the nicotine measurements made in several brands of cigarettes in 1998. Elbatal et al. (2014) proposed a new generalization of the Exponentiated Frechet distribution called the transmuted Exponentiated Frechet distribution. Acik Kemaloglu and Yilmaz (2017) derived the transmuted two-parameter Lindley distribution and applied it to three different data sets. Chhetri et al. (2017) studied a new five-parameter Kumaraswamy transmuted Pareto distribution and discussed various mathematical and statistical properties of the distribution.

1.3 Dissertation Structure

In Chapter 2, we develop a new generalization of the Lomax distribution called the Rayleigh Lomax (RL) distribution according to El-Bassiouny et al. (2015). We present its related properties such as mathematical properties, moments, and order statistics. Moreover, we study the maximum likelihood estimation and compare it with the method of moments and L-moments using a real data set. Finally, following Wang et al. (2010), the proposed distribution is transformed to a lifetime distribution, and the inverse and maximum likelihood estimators, L-moments and confidence intervals are obtained in the case of progressively Type-II right censored data. In Chapter 3, we present a new generalization of the RL distribution called the transmuted Rayleigh Lomax (TR-RL) distribution. This distribution is derived using the QRTM. We discuss the related properties of the TR-RL distribution such as the shapes of the density function, moments, order statistics, probability weighted moments and L-moments. A simulation study is conducted to illustrate the performance of the maximum likelihood estimation method. Further, we perform an application by fitting the TR-RL and some of its sub-models to a real data set. We show that the proposed distribution is superior compared with the other models using the likelihood ratio test as well as the Akaike information criterion (AIC) and the Schwarz information criterion (SIC). In Chapter 4, we study the change point problem of the RL distribution. We detect the locations of the change points of the RL distribution using the SIC and the modified information criterion (MIC). Simulations are conducted under the RL distribution with different values of the shape and scale parameters to calculate the power of the SIC and MIC. Moreover, we study the convergence of the change point estimator through simulations. Finally, the performance of these approaches is studied through three different real data sets.

CHAPTER 2 RAYLEIGH LOMAX DISTRIBUTION

2.1 Introduction

The Lomax distribution is one of the well known distributions that is very useful in many fields such as engineering, reliability and life testing. However, this distribution does not provide great flexibility in modeling data. Thus, the Lomax distribution can be generalized by introducing additional parameters such as shape, scale or location and then observing the characteristics of the new distribution. Several generalized classes of distributions are available such as the exponentiated Lomax (EL) (Abdul-Moniem and Abdel-Hameed, 2012), Beta-Lomax (BL) (Rajab et al., 2013), exponential Lomax (ELomax) (El-Bassiouny et al., 2015), Gamma-Lomax (GL) (Cordeiro et al., 2015) and Gumbel-Lomax (GuLx) (Tahir et al., 2016). This chapter provides another extension of the Lomax distribution called the Rayleigh Lomax (RL) distribution. The Rayleigh Lomax distribution is an asymmetric distribution which provides a great fit in modeling wide ranges of real data sets. It is a very flexible distribution which allows us to obtain some useful distributions by changing its parameters. This new distribution can be used as a lifetime distribution by adding one of its scale parameters to its random variable. In this chapter, the Rayleigh Lomax distribution and several of its shape and distributional properties are discussed. Maximum likelihood estimators and the Fisher information matrix are obtained. We also compare the maximum likelihood estimates with the method of moments and L-moments using the failure times of 84 aircraft windshields, and goodness of fit tests are presented. Finally, we transform our proposed distribution to a lifetime distribution to obtain the inverse and the maximum likelihood estimators, L-moments and confidence intervals in the case of progressively Type-II right censored data.

2.2 The Rayleigh Lomax Distribution

The Rayleigh distribution is a lifetime distribution that was developed by Rayleigh (1880). Many researchers used this distribution to model the observed short term distributions of the heights of sea waves and many other related phenomena (Hoffman and Karst, 1975). The cdf of a Rayleigh distribution with scale parameter σ is

F(t; σ) = 1 − e^(−t²/(2σ²)), t ≥ 0, σ > 0, (2.2.1)

and the corresponding pdf is

f(t; σ) = (t/σ²) e^(−t²/(2σ²)), t ≥ 0, σ > 0. (2.2.2)

According to El-Bassiouny et al. (2015), the Rayleigh Lomax distribution (RL) can be obtained using the following expression,

F(x) = ∫₀^(1/(1−G(x;α,λ))) f(t; σ) dt, (2.2.3)

where G(x; α, λ) is the cdf of the Lomax distribution given in (1.1.1) and f(t; σ) is the pdf given in (2.2.2). Then, the cdf of the Rayleigh Lomax is given by

F(x) = ∫₀^(((x+λ)/λ)^α) (t/σ²) e^(−t²/(2σ²)) dt = 1 − e^(−(1/(2σ²)) ((x+λ)/λ)^(2α)), (2.2.4)

and the pdf corresponding to (2.2.4) is given by

f(x) = (α/(λσ²)) ((x+λ)/λ)^(2α−1) e^(−(1/(2σ²)) ((x+λ)/λ)^(2α)), x > −λ, α, λ, σ > 0, (2.2.5)

where α is the shape parameter and λ and σ are the scale parameters of the Rayleigh Lomax distribution. The following results show that the Rayleigh Lomax distribution is a very flexible model, using the transformation technique.
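The cdf (2.2.4) and pdf (2.2.5) can be coded for a quick consistency check (a Python sketch with our own function names; the dissertation's programs are in R):

```python
import math

def rl_cdf(x, alpha, lam, sigma):
    # F(x) = 1 - exp(-(1/(2*sigma^2)) * ((x + lam)/lam)^(2*alpha)), x > -lam
    w = (x + lam) / lam
    return 1.0 - math.exp(-w ** (2.0 * alpha) / (2.0 * sigma ** 2))

def rl_pdf(x, alpha, lam, sigma):
    # f(x) = (alpha/(lam*sigma^2)) * ((x+lam)/lam)^(2*alpha - 1)
    #        * exp(-(1/(2*sigma^2)) * ((x+lam)/lam)^(2*alpha))
    w = (x + lam) / lam
    return (alpha / (lam * sigma ** 2)) * w ** (2.0 * alpha - 1.0) \
        * math.exp(-w ** (2.0 * alpha) / (2.0 * sigma ** 2))

# The pdf should equal the derivative of the cdf.
x, a, l, s, h = 0.8, 2.0, 1.0, 1.5, 1e-6
numeric = (rl_cdf(x + h, a, l, s) - rl_cdf(x - h, a, l, s)) / (2 * h)
assert abs(numeric - rl_pdf(x, a, l, s)) < 1e-6
```

Note that F(−λ) = 0 and F(x) → 1 as x → ∞, so (2.2.4) is a proper cdf on the support (−λ, ∞).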

Theorem 2.2.1. a) Let X ∼ RL(1/2, λ, σ); then Y = (X + λ)/λ ∼ exp(2σ²).
b) Let X ∼ RL(α, λ, 1/√2); then Y = (X + λ)/λ ∼ Weibull(2α, 1).
c) Let X ∼ RL(1, λ, σ); then Y = (X + λ)/λ ∼ Rayleigh(σ).

Proof. By using the transformation of random variable technique, let f(x) be the Rayleigh Lomax pdf defined in (2.2.5).
a) Let y = (x + λ)/λ; then x = g⁻¹(y) = λ(y − 1) and |dx/dy| = λ. Setting α = 1/2, the pdf of the random variable Y is f(y) = (1/(2σ²)) e^(−y/(2σ²)), so Y ∼ exp(2σ²).
b) Let y = (x + λ)/λ; then x = g⁻¹(y) = λ(y − 1) and |dx/dy| = λ. Hence, the pdf of the random variable Y is f(y) = (α/σ²) y^(2α−1) e^(−y^(2α)/(2σ²)). Setting σ² = 1/2 gives f(y) = 2α y^(2α−1) e^(−y^(2α)), so Y ∼ Weibull(2α, 1).
c) Let y = (x + λ)/λ; then x = g⁻¹(y) = λ(y − 1) and |dx/dy| = λ. Hence, we have the same pdf of the random variable Y as in part (b), and setting α = 1 gives f(y) = (y/σ²) e^(−y²/(2σ²)), so Y ∼ Rayleigh(σ).
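Theorem 2.2.1(a) can be spot-checked by simulation. Inverting (2.2.4) gives X = λ(−2σ² ln(1 − U))^(1/(2α)) − λ for U ∼ Uniform(0, 1), and with α = 1/2 the transformed variable Y = (X + λ)/λ should be exponential with mean 2σ². A sketch under these assumptions (Python for illustration; the seed and parameter values are arbitrary):

```python
import math
import random

def rl_sample(alpha, lam, sigma, rng):
    # Inverse-cdf sampling from (2.2.4):
    # X = lam * (-2*sigma^2 * ln(1 - U))^(1/(2*alpha)) - lam
    u = rng.random()
    return lam * (-2.0 * sigma ** 2 * math.log(1.0 - u)) ** (1.0 / (2.0 * alpha)) - lam

rng = random.Random(7)
alpha, lam, sigma = 0.5, 2.0, 1.5          # alpha = 1/2 as in part (a)
ys = [(rl_sample(alpha, lam, sigma, rng) + lam) / lam for _ in range(200_000)]
mean_y = sum(ys) / len(ys)

# Part (a): Y = (X + lam)/lam should be exponential with mean 2*sigma^2 = 4.5.
assert abs(mean_y - 2.0 * sigma ** 2) < 0.1
```

The same inverse-cdf sampler works for any admissible (α, λ, σ) and is reused in later checks.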

2.3 Distributional Properties

In this section, the distributional properties such as the shapes of the density function, moments, L-moments, rth order statistics, quantile function, probability weighted moments and moment generating function of the RL distribution are presented.

2.3.1 Shapes of pdf

The critical points of the Rayleigh Lomax density, which are the roots of its first derivative, can be found numerically in order to obtain the local maxima and minima. The first derivative of the Rayleigh Lomax density function is

f′(x) = (α(2α − 1)/(λ²σ²)) ((x+λ)/λ)^(2α−2) e^(−(1/(2σ²))((x+λ)/λ)^(2α)) − (α²/(λ²σ⁴)) ((x+λ)/λ)^(4α−2) e^(−(1/(2σ²))((x+λ)/λ)^(2α)). (2.3.1)

Figures 2.1, 2.2 and 2.3 show various shapes of the density function for different parameter values. Figure 2.1 illustrates the effect of different values of the shape parameter α on the shape of the Rayleigh Lomax pdf. It shows that the pdf of the Rayleigh Lomax becomes right skewed when α equals one. Also, when α = 0.5, the Rayleigh Lomax reduces to the exponential density function. In addition, when α goes to infinity or zero, the Rayleigh Lomax density approaches zero. The behavior of the skewness for various choices of parameters is shown in Table 2.2. Figures 2.2 and 2.3 show the effect of the scale parameters λ and σ on the Rayleigh Lomax pdf. Larger values of λ and σ stretch out the density function.
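The first derivative used to locate these critical points follows from differentiating (2.2.5) directly, and it can be validated against a finite difference. A sketch (Python for illustration; our own function names, with the derivative written out term by term):

```python
import math

def rl_pdf(x, a, lam, s):
    # Rayleigh Lomax pdf (2.2.5), with w = (x + lam)/lam
    w = (x + lam) / lam
    return (a / (lam * s ** 2)) * w ** (2 * a - 1) * math.exp(-w ** (2 * a) / (2 * s ** 2))

def rl_pdf_deriv(x, a, lam, s):
    # Analytic first derivative of the pdf:
    # f'(x) = (a*(2a-1)/(lam^2 s^2)) * w^(2a-2) * e^{-w^(2a)/(2 s^2)}
    #       - (a^2/(lam^2 s^4))      * w^(4a-2) * e^{-w^(2a)/(2 s^2)}
    w = (x + lam) / lam
    e = math.exp(-w ** (2 * a) / (2 * s ** 2))
    return (a * (2 * a - 1) / (lam ** 2 * s ** 2)) * w ** (2 * a - 2) * e \
        - (a ** 2 / (lam ** 2 * s ** 4)) * w ** (4 * a - 2) * e

# Compare the analytic derivative with a central difference.
a, lam, s = 2.0, 1.0, 1.5
x, h = 0.5, 1e-6
numeric = (rl_pdf(x + h, a, lam, s) - rl_pdf(x - h, a, lam, s)) / (2 * h)
assert abs(numeric - rl_pdf_deriv(x, a, lam, s)) < 1e-5
```

Scanning `rl_pdf_deriv` for a sign change then brackets the mode, which can be refined by bisection.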

Figure 2.1: Probability density function of Rayleigh Lomax as α increases and decreases.

Figure 2.2: Probability density function of Rayleigh Lomax as λ increases and decreases.

2.3.2 Moments

The method of moments is one of the most popular methods of estimation. Most of the substantial aspects of a distribution, like skewness, kurtosis, dispersion and central tendency, can be studied through moments. The estimates of the parameters of any distribution using the method of moments are obtained by equating the theoretical moments of the distribution to the corresponding sample

Figure 2.3: Probability density function of Rayleigh Lomax as σ increases and decreases.

moments (i.e. µ′_r = m′_r), for r = 1, 2, ..., where m′_r = (1/n) Σ_{i=1}^{n} X_i^r. The rth theoretical moment of the Rayleigh Lomax distribution is given by the following theorem.

Theorem 2.3.1. Let X₁, X₂, ..., Xₙ be i.i.d. random variables from a Rayleigh Lomax distribution with the density function given in (2.2.5). The rth moment about the mean of the Rayleigh Lomax distribution is

E(X − µ)^r = Σ_{j=0}^{r} C(r, j) (−1)^{r−j} λ^j σ^{j/α} (µ + λ)^{r−j} Γ(j/(2α) + 1) 2^{j/(2α)}, (2.3.2)

where Γ(a) = ∫₀^∞ t^(a−1) e^(−t) dt is the gamma function and µ is the expected value of the Rayleigh Lomax distribution.

Proof. The rth moment about the mean of a random variable X can be derived using integration as follows:

E(X − µ)^r = ∫_{−λ}^{∞} (x − µ)^r f(x) dx = ∫_{−λ}^{∞} (x − µ)^r (α/(λσ²)) ((x+λ)/λ)^(2α−1) e^(−(1/(2σ²))((x+λ)/λ)^(2α)) dx.

Let u = (1/σ²) ((x+λ)/λ)^(2α); then x = λ u^(1/(2α)) σ^(1/α) − λ and dx = (λ/(2α)) σ^(1/α) u^(1/(2α)−1) du. Thus

E(X − µ)^r = (1/2) ∫₀^∞ (λ u^(1/(2α)) σ^(1/α) − λ − µ)^r e^(−u/2) du
= (1/2) Σ_{j=0}^{r} C(r, j) λ^j σ^(j/α) (−µ − λ)^(r−j) ∫₀^∞ u^(j/(2α)) e^(−u/2) du
= Σ_{j=0}^{r} C(r, j) λ^j σ^(j/α) (−µ − λ)^(r−j) Γ(j/(2α) + 1) 2^(j/(2α)). (2.3.3)

If µ = 0, then the rth moment about the origin is:

E(X^r) = Σ_{j=0}^{r} C(r, j) (−1)^(r−j) λ^r σ^(j/α) Γ(j/(2α) + 1) 2^(j/(2α)). (2.3.4)

The first three theoretical moments about the origin are

E(X) = −λ + λ σ^(1/α) Γ(1/(2α) + 1) 2^(1/(2α)), (2.3.5)

E(X²) = λ² − 2λ² σ^(1/α) Γ(1/(2α) + 1) 2^(1/(2α)) + λ² σ^(2/α) Γ(1/α + 1) 2^(1/α), (2.3.6)

E(X³) = −λ³ + 3λ³ σ^(1/α) Γ(1/(2α) + 1) 2^(1/(2α)) − 3λ³ σ^(2/α) Γ(1/α + 1) 2^(1/α) + λ³ σ^(3/α) Γ(3/(2α) + 1) 2^(3/(2α)). (2.3.7)
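The closed form (2.3.4), from which (2.3.5) through (2.3.7) follow, can be cross-checked against direct numerical integration of x^r f(x). A sketch (Python for illustration; Simpson's rule, with parameter values chosen arbitrarily so the tail beyond the truncation point is negligible):

```python
import math

def rl_pdf(x, a, lam, s):
    # Rayleigh Lomax pdf (2.2.5)
    w = (x + lam) / lam
    return (a / (lam * s ** 2)) * w ** (2 * a - 1) * math.exp(-w ** (2 * a) / (2 * s ** 2))

def moment_closed_form(r, a, lam, s):
    # E(X^r) from (2.3.4):
    # sum_{j=0}^{r} C(r,j) (-1)^(r-j) lam^r s^(j/a) Gamma(j/(2a)+1) 2^(j/(2a))
    total = 0.0
    for j in range(r + 1):
        total += (math.comb(r, j) * (-1.0) ** (r - j) * lam ** r * s ** (j / a)
                  * math.gamma(j / (2 * a) + 1.0) * 2.0 ** (j / (2 * a)))
    return total

def moment_numeric(r, a, lam, s, upper=20.0, n=20000):
    # Simpson's rule for int_{-lam}^{upper} x^r f(x) dx (n must be even)
    h = (upper + lam) / n
    total = 0.0
    for i in range(n + 1):
        x = -lam + i * h
        w = 1 if i in (0, n) else (4 if i % 2 == 1 else 2)
        total += w * x ** r * rl_pdf(x, a, lam, s)
    return total * h / 3.0

for r in (1, 2, 3):
    assert abs(moment_closed_form(r, 1.0, 1.0, 1.0) - moment_numeric(r, 1.0, 1.0, 1.0)) < 1e-6
```

With α = λ = σ = 1, part (c) of Theorem 2.2.1 says X + 1 ∼ Rayleigh(1), so E(X) = √(π/2) − 1, which both routines reproduce.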

The estimators of the Rayleigh Lomax parameters are obtained by equating (2.3.5), (2.3.6) and (2.3.7) to the first three sample moments and solving the resulting equations numerically.

Corollary. 2.3.1. The mean and the variance of the Rayleigh Lomax distribution are given as follows:

E(X) = −λ (1 − σ^(1/α) Γ(1/(2α) + 1) 2^(1/(2α))),

Var(X) = λ² σ^(2/α) 2^(1/α) (Γ(1/α + 1) − Γ²(1/(2α) + 1)),

where α, λ, σ > 0. The skewness and kurtosis measures can also be calculated from the first four moments using well-known relationships. It can be observed from Table 2.1 that both the mean and the variance decrease as the shape parameter α increases.

Table 2.1: Mean and variance of Rayleigh Lomax distribution for different values of α, λ and σ.

        λ = 0.5, σ = 2      λ = 1, σ = 2        λ = 1.5, σ = 2
α       Mean    Variance    Mean    Variance    Mean    Variance
1       0.7533  0.2257      1.5066  0.9102      2.2599  2.0479
2       0.2622  0.0586      0.5243  0.2346      0.7866  0.5279
3       0.1559  0.0323      0.3119  0.1291      0.4679  0.2904
4       0.1106  0.0222      0.2213  0.0888      0.3319  0.1998
        λ = 0.5, σ = 3      λ = 1, σ = 3        λ = 1.5, σ = 3
α       Mean    Variance    Mean    Variance    Mean    Variance
1       1.3799  0.5119      2.7599  2.0479      4.1399  4.6078
2       0.4335  0.0879      0.8669  0.3519      1.3004  0.7918
3       0.2509  0.0423      0.5018  0.1692      0.7528  0.3806
4       0.1758  0.0272      0.3516  0.1088      0.5274  0.2447
        λ = 0.5, σ = 4      λ = 1, σ = 4        λ = 1.5, σ = 4
α       Mean    Variance    Mean    Variance    Mean    Variance
1       2.0067  0.9102      4.0132  3.6407      6.0199  8.1917
2       0.5779  0.1173      1.1558  0.4692      1.7337  1.0558
3       0.3265  0.0512      0.6530  0.2049      0.9795  0.4611
4       0.2262  0.0314      0.4524  0.1256      0.6785  0.2826
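The mean and variance expressions of Corollary 2.3.1 can be spot-checked by simulation, drawing from the inverse cdf of (2.2.4). A sketch (Python for illustration; the seed and the parameter choice α = 2, λ = 1, σ = 2 are arbitrary):

```python
import math
import random

def rl_sample(a, lam, s, rng):
    # Inverse-cdf draw from (2.2.4): X = lam*(-2*s^2*ln(1-U))^(1/(2a)) - lam
    u = rng.random()
    return lam * (-2.0 * s ** 2 * math.log(1.0 - u)) ** (1.0 / (2.0 * a)) - lam

def rl_mean(a, lam, s):
    # E(X) = -lam * (1 - s^(1/a) * Gamma(1/(2a)+1) * 2^(1/(2a)))
    return -lam * (1.0 - s ** (1.0 / a) * math.gamma(1.0 / (2.0 * a) + 1.0)
                   * 2.0 ** (1.0 / (2.0 * a)))

def rl_var(a, lam, s):
    # Var(X) = lam^2 * s^(2/a) * 2^(1/a) * (Gamma(1/a+1) - Gamma(1/(2a)+1)^2)
    return lam ** 2 * s ** (2.0 / a) * 2.0 ** (1.0 / a) \
        * (math.gamma(1.0 / a + 1.0) - math.gamma(1.0 / (2.0 * a) + 1.0) ** 2)

rng = random.Random(42)
a, lam, s = 2.0, 1.0, 2.0
xs = [rl_sample(a, lam, s, rng) for _ in range(200_000)]
m = sum(xs) / len(xs)
v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
assert abs(m - rl_mean(a, lam, s)) < 0.02
assert abs(v - rl_var(a, lam, s)) < 0.02
```

For α = 2, λ = 1, σ = 2 the closed-form mean is about 0.5243, which agrees with the corresponding entry of Table 2.1.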

2.3.3 L-moments

L-moments are analogous to the conventional moments but can be estimated by linear combinations of order statistics. They have the theoretical advantage over conventional moments of being able to characterize a wider range of distributions, and they are more robust than conventional moments to the presence of outliers in the data (Hosking (1990)).

Table 2.2: Skewness and Kurtosis values of Rayleigh Lomax Distribution for different values of α, λ and σ.

        λ = 0.5, σ = 2       λ = 1, σ = 2         λ = 1.5, σ = 2
α       Skewness  Kurtosis   Skewness  Kurtosis   Skewness  Kurtosis
1       5.8405    11.5504    5.8336    11.5453    5.8339    11.5459
2       -0.7741   1.6728     -0.7748   1.6699     -0.7764   1.6697
3       -1.4306   0.7593     -1.4317   0.7606     -1.4320   0.7611
4       -1.5243   0.4548     -1.5242   0.4549     -1.5237   0.4549
        λ = 0.5, σ = 3       λ = 1, σ = 3         λ = 1.5, σ = 3
α       Skewness  Kurtosis   Skewness  Kurtosis   Skewness  Kurtosis
1       6.7663    11.5498    6.7630    11.5459    6.7626    11.5456
2       -1.0243   1.6728     -1.0252   1.6699     -1.0258   1.6699
3       -1.9592   0.7604     -1.9592   0.7604     -1.9602   0.7609
4       -2.1361   0.4547     -2.1361   0.4546     -2.1361   0.4549
        λ = 0.5, σ = 4       λ = 1, σ = 4         λ = 1.5, σ = 4
α       Skewness  Kurtosis   Skewness  Kurtosis   Skewness  Kurtosis
1       7.2254    11.5447    7.2276    11.5459    7.2268    11.5454
2       -1.1758   1.6699     -1.1758   1.6699     -1.1761   1.6697
3       -2.2924   0.7618     -2.2929   0.7610     -2.2930   0.7608
4       -2.5343   0.4549     -2.5343   0.4549     -2.5328   0.4548

Let X₁, X₂, ..., X_n be a random sample of size n from the Rayleigh Lomax distribution with parameters α, λ, σ > 0. The rth population L-moment is built from the order statistic expectation

E(X_{r:n}) = ∫_{−λ}^∞ x f_{r:n}(x) dx
           = (n!/((r−1)!(n−r)!)) (α/(λσ²)) ∫_{−λ}^∞ x ((x+λ)/λ)^{2α−1} e^{−((n−r+1)/(2σ²)) ((x+λ)/λ)^{2α}} [1 − e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}}]^{r−1} dx,

where the last bracketed factor is denoted by D.

By expanding the quantity D in a power series as

D = Σ_{j=0}^{r−1} (−1)^j C(r−1, j) e^{−(j/(2σ²)) ((x+λ)/λ)^{2α}},

we obtain

E(X_{r:n}) = Σ_{j=0}^{r−1} (−1)^j C(r−1, j) (n!/((r−1)!(n−r)!)) (α/(λσ²))
             ∫_{−λ}^∞ x ((x+λ)/λ)^{2α−1} e^{−((n−r+1)/(2σ²)) ((x+λ)/λ)^{2α}} e^{−(j/(2σ²)) ((x+λ)/λ)^{2α}} dx.

Let u = (1/σ²) ((x+λ)/λ)^{2α}; then x = λ u^{1/(2α)} σ^{1/α} − λ and dx = (λ/(2α)) σ^{1/α} u^{1/(2α)−1} du.

Then,

E(X_{r:n}) = Σ_{j=0}^{r−1} (−1)^j C(r−1, j) (n!/((r−1)!(n−r)!)) (λ/2) ∫_0^∞ (u^{1/(2α)} σ^{1/α} − 1) e^{−u(n−r+j+1)/2} du
           = Σ_{j=0}^{r−1} (−1)^j C(r−1, j) (λ/(2β(r, n−r+1))) [σ^{1/α} Γ(1/(2α) + 1) (2/(n−r+j+1))^{1/(2α)+1} − 2/(n−r+j+1)],   (2.3.8)

where β(r, n−r+1) = Γ(r)Γ(n−r+1)/Γ(n+1) = (r−1)!(n−r)!/n!.

The expected value of the first order statistic is

E(X_{1:n}) = λ [σ^{1/α} Γ(1/(2α) + 1) (2/n)^{1/(2α)} − 1],

and the expected value of the nth order statistic is

E(X_{n:n}) = Σ_{j=0}^{n−1} (−1)^j C(n−1, j) (nλ/2) [σ^{1/α} Γ(1/(2α) + 1) (2/(j+1))^{1/(2α)+1} − 2/(j+1)].

If r = n = 1, we have

E(X) = λ [σ^{1/α} Γ(1/(2α) + 1) 2^{1/(2α)} − 1],

which is the mean of the Rayleigh Lomax distribution. Similar to the method of moments, the method of L-moments attains parameter estimates by equating the first p theoretical L-moments of the distribution (λ_r and τ_r) to the corresponding sample L-moments (λ̂_r and τ̂_r), where p is the number of parameters of the probability distribution. According to Asquith (2011), the first four theoretical L-moments in terms of the order statistic expectations can be computed as follows:

λ₁ = E(X_{1:1}),
λ₂ = (1/2) (E(X_{2:2}) − E(X_{1:2})),
λ₃ = (1/3) (E(X_{3:3}) − 2E(X_{2:3}) + E(X_{1:3})),
λ₄ = (1/4) (E(X_{4:4}) − 3E(X_{3:4}) + 3E(X_{2:4}) − E(X_{1:4})),

and the theoretical L-moments ratios are the quantities

τ2 = λ2/λ1 = coefficient of L-variation, (2.3.9)

τ3 = λ3/λ2 = L-skew, (2.3.10)

τ₄ = λ₄/λ₂ = L-kurtosis,   (2.3.11)

and the ratios for r ≥ 5, which are unnamed, are

τr = λr/λ2. (2.3.12)

Therefore, the L-moments of the Rayleigh Lomax distribution are

λ₁ = λ [σ^{1/α} Γ(1/(2α) + 1) 2^{1/(2α)} − 1],   (2.3.13)

λ₂ = λ σ^{1/α} Γ(1/(2α) + 1) [2^{1/(2α)} − 1],   (2.3.14)

λ₃ = λ σ^{1/α} Γ(1/(2α) + 1) [2^{1/(2α)} − 3 + 2 (2/3)^{1/(2α)}],   (2.3.15)

and the L-moment ratios are

τ₂ = λ₂/λ₁ = σ^{1/α} Γ(1/(2α) + 1) [2^{1/(2α)} − 1] / (σ^{1/α} Γ(1/(2α) + 1) 2^{1/(2α)} − 1),   (2.3.16)

τ₃ = λ₃/λ₂ = [2^{1/(2α)} − 3 + 2 (2/3)^{1/(2α)}] / (2^{1/(2α)} − 1).   (2.3.17)

The L-moment estimate of α, say α̂, can be obtained as the solution of the nonlinear equation (2.3.17). Once α̂ is obtained, the L-moment estimates of λ and σ, say λ̂ and σ̂ respectively, can be computed from (2.3.13) and (2.3.14).
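This estimation route can be sketched numerically in pure Python: invert (2.3.17) for α̂ by bisection (the L-skewness depends on α alone and is monotone on the bracket used here), then back out λ̂ and σ̂ from the first two L-moments. Function names are illustrative, not from the text:

```python
import math

def rl_tau3(alpha):
    """Theoretical L-skewness of the RL distribution, eq. (2.3.17)."""
    a = 1.0 / (2.0 * alpha)
    return (2.0**a - 3.0 + 2.0*(2.0/3.0)**a) / (2.0**a - 1.0)

def rl_lmoment_fit(l1, l2, t3, lo=0.05, hi=50.0, tol=1e-12):
    """Solve tau3(alpha) = t3 by bisection, then recover lambda and sigma
    from the first two L-moments, eqs. (2.3.13) and (2.3.14)."""
    f = lambda a: rl_tau3(a) - t3
    assert f(lo) * f(hi) < 0, "sample L-skewness outside attainable range"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    a = 1.0 / (2.0 * alpha)
    g = math.gamma(a + 1.0)
    c = l2 / (g * (2.0**a - 1.0))   # c = lambda * sigma^{1/alpha}, from (2.3.14)
    lam = c * g * 2.0**a - l1       # from (2.3.13)
    sigma = (c / lam) ** alpha
    return alpha, lam, sigma
```

Feeding the theoretical L-moments of a known parameter triple back into the solver recovers that triple, which is a useful self-check before applying the routine to sample L-moments.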

2.3.4 Order statistics

The probability density function of the rth order statistic of the RL distribution is derived in explicit form as follows.

Theorem 2.3.2. Let F (x) and f(x) be the cdf and the pdf of the Rayleigh Lomax distribution for a random variable X obtained from (2.2.4) and (2.2.5). The density of the rth order statistic is given by

f_{(r)}(x) = (α n!/((r−1)!(n−r)! λσ²)) Σ_{j=0}^{r−1} (−1)^j C(r−1, j) ((x+λ)/λ)^{2α−1} e^{−((n−r+j+1)/(2σ²)) ((x+λ)/λ)^{2α}}.   (2.3.18)

Proof. Substitute the cdf F(x) and the pdf f(x) from equations (2.2.4) and (2.2.5) into the probability density function of order statistics defined by David and Nagaraja (1970):

f_{(r)}(x) = (n!/((r−1)!(n−r)!)) (α/(λσ²)) ((x+λ)/λ)^{2α−1} [e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}}]^{n−r+1} [1 − e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}}]^{r−1},   (2.3.19)

where the last bracketed factor is denoted by A. By expanding A in a power series,

A = Σ_{j=0}^{r−1} (−1)^j C(r−1, j) [e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}}]^j,

we obtain,

f_{(r)}(x) = (α n!/((r−1)!(n−r)! λσ²)) Σ_{j=0}^{r−1} (−1)^j C(r−1, j) ((x+λ)/λ)^{2α−1} e^{−((n−r+j+1)/(2σ²)) ((x+λ)/λ)^{2α}}.   (2.3.20)

Hence, the densities of the minimum and the maximum order statistics are given by

f_{(1)}(x) = (nα/(λσ²)) ((x+λ)/λ)^{2α−1} e^{−(n/(2σ²)) ((x+λ)/λ)^{2α}},   (2.3.21)

f_{(n)}(x) = (αn/(λσ²)) Σ_{j=0}^{n−1} (−1)^j C(n−1, j) ((x+λ)/λ)^{2α−1} e^{−((j+1)/(2σ²)) ((x+λ)/λ)^{2α}}.   (2.3.22)
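As a quick consistency check, the closed-form minimum density (2.3.21) agrees with the generic order-statistic formula n f(x) [1 − F(x)]^{n−1}, and it is itself an RL density with σ replaced by σ/√n. A small Python sketch (helper names are mine):

```python
import math

def rl_pdf(x, alpha, lam, sigma):
    """RL density as used throughout this chapter."""
    z = (x + lam) / lam
    return alpha/(lam*sigma**2) * z**(2*alpha - 1) * math.exp(-z**(2*alpha)/(2*sigma**2))

def rl_sf(x, alpha, lam, sigma):
    """Survival function 1 - F(x)."""
    z = (x + lam) / lam
    return math.exp(-z**(2*alpha)/(2*sigma**2))

def rl_min_pdf(x, n, alpha, lam, sigma):
    """Density of the sample minimum, eq. (2.3.21)."""
    z = (x + lam) / lam
    return n*alpha/(lam*sigma**2) * z**(2*alpha - 1) * math.exp(-n*z**(2*alpha)/(2*sigma**2))
```

For any x in the support, `rl_min_pdf(x, n, α, λ, σ)` equals both `n * rl_pdf(x, ...) * rl_sf(x, ...)**(n-1)` and `rl_pdf(x, α, λ, σ/math.sqrt(n))`, which is the r = 1 case of the mixture representation given below.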

The density function of the rth order statistic of the Rayleigh Lomax distribution can be expressed as a mixture of Rayleigh Lomax densities:

f_{(r)}(x) = Σ_{j=0}^{r−1} η_j f_{α, λ, σ/√(n−r+j+1)}(x),   (2.3.23)

where η_j = (n!/((n−r+j+1)(r−1)!(n−r)!)) (−1)^j C(r−1, j). Some mathematical properties of these order statistics, such as L-moments, mean deviations, incomplete moments and the moment generating function, can be obtained from (2.3.23).

2.3.5 Quantile function

Let X be a random variable with the distribution function in (2.2.4), and p ∈ (0, 1). The quantile function of the Rayleigh Lomax distribution is obtained by inverting (2.2.4) as follows

F(x) = 1 − e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}} = p

⇒ e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}} = 1 − p
⇒ −(1/(2σ²)) ((x+λ)/λ)^{2α} = ln(1 − p)
⇒ x = Q(p) = λ [−2σ² ln(1 − p)]^{1/(2α)} − λ.   (2.3.24)

The median (second quartile), the first quartile and the third quartile can be obtained by setting p = 1/2, p = 1/4 and p = 3/4, respectively. Table 2.3 shows that the quartile values are decreasing in α, while they are increasing in λ and σ.

Table 2.3: The Quartile values (The first quartile, the median and the third quartile) for RL distribution.

        λ = 0.5, σ = 2            λ = 1, σ = 2              λ = 1.5, σ = 2
α       Q1      Q2      Q3        Q1      Q2      Q3        Q1      Q2      Q3
1       0.2585  0.6774  1.1651    0.5171  1.3548  2.3302    0.7756  2.0322  3.4953
2       0.1158  0.2673  0.4124    0.2317  0.5345  0.8249    0.3475  0.8018  1.2373
3       0.0745  0.1625  0.2467    0.1490  0.3304  0.4933    0.2236  0.4956  0.7400
4       0.0549  0.1194  0.1754    0.1098  0.2388  0.3509    0.1647  0.3581  0.5263
5       0.0435  0.0934  0.1360    0.0869  0.1868  0.2720    0.1304  0.2803  0.4080
        λ = 0.5, σ = 3            λ = 1, σ = 3              λ = 1.5, σ = 3
α       Q1      Q2      Q3        Q1      Q2      Q3        Q1      Q2      Q3
1       0.6378  1.2661  1.9977    1.2756  2.322   3.9953    1.9134  3.7983  5.9929
2       0.2543  0.4397  0.6175    0.5085  0.8794  1.2350    0.7628  1.3191  1.8525
3       0.1576  0.2615  0.3547    0.3153  0.5229  0.7094    0.4729  0.7844  1.0642
4       0.1141  0.1855  0.2475    0.2282  0.3709  0.4949    0.3423  0.5564  0.7425
5       0.0894  0.1435  0.1897    0.1787  0.2871  0.3795    0.2681  0.4306  0.5692
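Equation (2.3.24) also gives an inverse-transform sampler for the RL distribution: draw U ~ Uniform(0,1) and return Q(U). A minimal sketch (function names are illustrative):

```python
import math
import random

def rl_quantile(p, alpha, lam, sigma):
    """Quantile function Q(p) of the Rayleigh Lomax distribution, eq. (2.3.24)."""
    return lam * (-2.0 * sigma**2 * math.log(1.0 - p)) ** (1.0/(2.0*alpha)) - lam

def rl_sample(n, alpha, lam, sigma, seed=0):
    """Inverse-transform sampling: X = Q(U) with U ~ Uniform(0,1)."""
    rng = random.Random(seed)
    return [rl_quantile(rng.random(), alpha, lam, sigma) for _ in range(n)]
```

For example, `rl_quantile(0.5, 1, 0.5, 2)` reproduces the median 0.6774 in the first row of Table 2.3.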

2.3.6 Probability weighted moments

Probability weighted moments (PWM) are considered an alternative approach to other types of moments. They are used in estimating parameters of a probability distribution, mostly when maximum likelihood estimates are difficult to calculate. According to Asquith (2011), the probability weighted moments and L-moments are linear combinations of each other; computation of one implies the other, hence inferences based on each are identical. The (r, s)th probability weighted moment of X (r ≥ 1, s ≥ 0) is

E[X^r F(X)^s] = ∫_{−λ}^∞ x^r F(x)^s f(x) dx,

where F(x)^s = [1 − e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}}]^s. Expanding F(x)^s in a power series, we have

F(x)^s = Σ_{j=0}^{s} (−1)^j C(s, j) e^{−(j/(2σ²)) ((x+λ)/λ)^{2α}}.

Then,

E[X^r F(X)^s] = (α/(λσ²)) Σ_{j=0}^{s} (−1)^j C(s, j) ∫_{−λ}^∞ x^r ((x+λ)/λ)^{2α−1} e^{−((1+j)/(2σ²)) ((x+λ)/λ)^{2α}} dx.

Let u = (1/σ²) ((x+λ)/λ)^{2α}; then x = λ u^{1/(2α)} σ^{1/α} − λ and dx = (λ/(2α)) σ^{1/α} u^{1/(2α)−1} du. So we have

E[X^r F(X)^s] = (1/2) Σ_{j=0}^{s} (−1)^j C(s, j) ∫_0^∞ (λ u^{1/(2α)} σ^{1/α} − λ)^r e^{−u(j+1)/2} du
             = (1/2) Σ_{j=0}^{s} Σ_{k=0}^{r} (−1)^{j+r−k} C(s, j) C(r, k) λ^r σ^{k/α} ∫_0^∞ u^{k/(2α)} e^{−u(j+1)/2} du
             = (1/2) Σ_{j=0}^{s} Σ_{k=0}^{r} (−1)^{j+r−k} C(s, j) C(r, k) λ^r σ^{k/α} Γ(k/(2α) + 1) (2/(j+1))^{k/(2α)+1}.   (2.3.25)

Setting s = 0 yields the rth moment of the Rayleigh Lomax distribution.

2.3.7 Moment generating function

Let X be a random variable with the probability density function of the Rayleigh Lomax distribution. The moment generating function (mgf) of X is given by

E(e^{tX}) = (α/(λσ²)) ∫_{−λ}^∞ e^{tx} ((x+λ)/λ)^{2α−1} e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}} dx.

Let A = e^{tx} = Σ_{n=0}^∞ (tx)^n/n!; then

E(e^{tX}) = Σ_{n=0}^∞ (t^n/n!) (α/(λσ²)) ∫_{−λ}^∞ x^n ((x+λ)/λ)^{2α−1} e^{−(1/(2σ²)) ((x+λ)/λ)^{2α}} dx.

Let u = (1/σ²) ((x+λ)/λ)^{2α}; then x = λ u^{1/(2α)} σ^{1/α} − λ and dx = (λ/(2α)) σ^{1/α} u^{1/(2α)−1} du, with u(−λ) = 0 and u(∞) = ∞.

Hence,

E(e^{tX}) = Σ_{n=0}^∞ (t^n/(2 n!)) ∫_0^∞ (λ u^{1/(2α)} σ^{1/α} − λ)^n e^{−u/2} du
          = Σ_{n=0}^∞ (t^n/(2 n!)) Σ_{k=0}^n C(n, k) (−1)^{n−k} λ^n σ^{k/α} ∫_0^∞ u^{k/(2α)} e^{−u/2} du
          = Σ_{n=0}^∞ Σ_{k=0}^n (−1)^{n−k} C(n, k) (λ^n t^n/(2 n!)) σ^{k/α} Γ(k/(2α) + 1) 2^{k/(2α)+1}.   (2.3.26)

By evaluating the first and second derivatives of the moment generating function at t = 0, we obtain the mean and the second moment (and hence the variance) of the Rayleigh Lomax distribution.

2.4 Estimation

In this section, the maximum likelihood estimates are derived for the parameters α, λ and σ. The Fisher information matrix is calculated to obtain the asymptotic distribution and the approximate confidence intervals of the MLEs of α, λ and σ.

2.4.1 MLEs of parameters

Let x₁, x₂, ..., x_n be a sample of size n from the Rayleigh Lomax distribution given by (2.2.5). The likelihood function for the vector of parameters Θ = (α, λ, σ)ᵀ is given by

L(x; α, λ, σ) = (α/(λσ²))^n Π_{i=1}^n ((x_i+λ)/λ)^{2α−1} e^{−(1/(2σ²)) ((x_i+λ)/λ)^{2α}}.   (2.4.1)

Then the log-likelihood function can be expressed as

ℓ(x; α, λ, σ) = n[log(α) − log(λ) − 2 log(σ)] − n(2α−1) log(λ) + (2α−1) Σ_{i=1}^n log(x_i+λ) − (1/(2σ²)) Σ_{i=1}^n ((x_i+λ)/λ)^{2α}.   (2.4.2)

The maximum likelihood estimates αˆ, λˆ and σˆ for the parameters α, λ and σ are the values which maximize the likelihood function in equation (2.4.1). The first partial derivatives of the logarithm of the likelihood function (2.4.2) with respect to α, λ and σ are

∂logL/∂α = n/α − 2n log(λ) + 2 Σ_{i=1}^n log(x_i+λ) − (1/σ²) Σ_{i=1}^n ((x_i+λ)/λ)^{2α} log((x_i+λ)/λ),   (2.4.3)

∂logL/∂λ = −n/λ − (2α−1)n/λ + (2α−1) Σ_{i=1}^n 1/(x_i+λ) + (α/(σ²λ²)) Σ_{i=1}^n x_i ((x_i+λ)/λ)^{2α−1},   (2.4.4)

∂logL/∂σ = −2n/σ + (1/σ³) Σ_{i=1}^n ((x_i+λ)/λ)^{2α}.   (2.4.5)
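The log-likelihood (2.4.2) and the score equations (2.4.3)-(2.4.5) can be implemented directly, and a central finite-difference check guards against sign mistakes in the derivatives. A pure-Python sketch (function names are mine):

```python
import math

def rl_loglik(xs, alpha, lam, sigma):
    """Log-likelihood, eq. (2.4.2)."""
    n = len(xs)
    s_log = sum(math.log(x + lam) for x in xs)
    s_pow = sum(((x + lam)/lam)**(2*alpha) for x in xs)
    return (n*(math.log(alpha) - math.log(lam) - 2*math.log(sigma))
            - n*(2*alpha - 1)*math.log(lam)
            + (2*alpha - 1)*s_log - s_pow/(2*sigma**2))

def rl_score(xs, alpha, lam, sigma):
    """Partial derivatives (2.4.3)-(2.4.5) of the log-likelihood."""
    n = len(xs)
    z = [(x + lam)/lam for x in xs]
    d_a = (n/alpha - 2*n*math.log(lam)
           + 2*sum(math.log(x + lam) for x in xs)
           - sum(zi**(2*alpha) * math.log(zi) for zi in z)/sigma**2)
    d_l = (-n/lam - (2*alpha - 1)*n/lam
           + (2*alpha - 1)*sum(1.0/(x + lam) for x in xs)
           + alpha/(sigma**2*lam**2)*sum(x*zi**(2*alpha - 1) for x, zi in zip(xs, z)))
    d_s = -2*n/sigma + sum(zi**(2*alpha) for zi in z)/sigma**3
    return d_a, d_l, d_s
```

Central finite differences of `rl_loglik` agree with `rl_score` at any interior parameter point, confirming the three score equations.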

The MLEs of the three parameters of the Rayleigh Lomax distribution are obtained by setting the above equations to zero and solving them simultaneously. Closed-form solutions of equations (2.4.3), (2.4.4) and (2.4.5) are not available, so an iterative procedure is applied to solve them numerically.

2.4.2 Asymptotic distribution

The Fisher information matrix is important in parameter estimation: it is a measure of the information content of the data relative to the parameters being estimated. It is obtained by taking the expected values of the negative second and mixed partial derivatives of ℓ(x; α, λ, σ) with respect to α, λ and σ. Unfortunately, the exact mathematical expressions are difficult to find. Therefore, the matrix is approximated by the observed information matrix F = (F_ij), composed of the negative second and mixed derivatives of the natural logarithm of the likelihood function evaluated at the MLE. The asymptotic Fisher information matrix can be written as follows:

F = ( −∂²logL/∂α²    −∂²logL/∂α∂λ   −∂²logL/∂α∂σ
      −∂²logL/∂α∂λ   −∂²logL/∂λ²    −∂²logL/∂λ∂σ
      −∂²logL/∂α∂σ   −∂²logL/∂λ∂σ   −∂²logL/∂σ²  ).

The second and mixed partial derivatives of the log-likelihood function are obtained as follows:

∂²logL/∂α² = −n/α² − (2/σ²) Σ_{i=1}^n ((x_i+λ)/λ)^{2α} log²((x_i+λ)/λ),   (2.4.6)

∂²logL/∂α∂λ = −2n/λ + 2 Σ_{i=1}^n 1/(x_i+λ) + (1/σ²) Σ_{i=1}^n [ ((x_i+λ)/λ)^{2α} x_i/(λ(x_i+λ)) + (2αx_i/λ²) ((x_i+λ)/λ)^{2α−1} log((x_i+λ)/λ) ],   (2.4.7)

∂²logL/∂α∂σ = (2/σ³) Σ_{i=1}^n ((x_i+λ)/λ)^{2α} log((x_i+λ)/λ),   (2.4.8)

∂²logL/∂λ² = 2αn/λ² − (2α−1) Σ_{i=1}^n 1/(x_i+λ)² − (α(2α−1)/(λ⁴σ²)) Σ_{i=1}^n x_i² ((x_i+λ)/λ)^{2(α−1)} − (2α/(σ²λ³)) Σ_{i=1}^n x_i ((x_i+λ)/λ)^{2α−1},   (2.4.9)

∂²logL/∂λ∂σ = −(2α/(λ²σ³)) Σ_{i=1}^n x_i ((x_i+λ)/λ)^{2α−1},   (2.4.10)

∂²logL/∂σ² = 2n/σ² − (3/σ⁴) Σ_{i=1}^n ((x_i+λ)/λ)^{2α}.   (2.4.11)

The variance-covariance matrix is approximated by V = (V_ij) = F⁻¹. The asymptotic distribution of the MLEs of α, λ and σ can be written as [(α̂ − α), (λ̂ − λ), (σ̂ − σ)] ∼ N₃(0, F(θ̂)⁻¹), where θ̂ = (α̂, λ̂, σ̂). Then the approximate 100(1 − γ)% confidence intervals for α, λ and σ are given by α̂ ± z_{γ/2}√var(α̂), λ̂ ± z_{γ/2}√var(λ̂) and σ̂ ± z_{γ/2}√var(σ̂), where z_γ is the upper 100γ-th percentile of the standard normal distribution.

2.4.3 Simulation

The MLEs of the Rayleigh Lomax parameters are obtained numerically through simulations by considering n = 100, 200 and 500. The process is repeated 2000 times to obtain biases and mean square errors (MSEs), which measure the performance of the estimators. Also, approximate two-sided confidence intervals with confidence level 95% are constructed. It is observed that when the sample size n increases, the values of MSE and bias decrease. Furthermore, the MLEs approach their true values as the sample size increases.

2.5 Application

In this section, the Rayleigh Lomax distribution is fitted to a real data set using the MLE, the method of moments and the method of L-moments. Several goodness-of-fit statistics, including AIC, SIC and the Anderson-Darling (AD) statistic, are used to compare the Rayleigh Lomax distribution with

Table 2.4: MLE, SD, Bias, MSE and 95% confidence limits from RL distribution.

                                                95% confidence limits
Parameter  n    MLE     SD      Bias    MSE     LCL     UCL     Length
α = 1      100  0.9729  0.1534  0.0271  0.0243  0.6997  1.3009  0.6013
           200  0.9856  0.1094  0.0144  0.0122  0.9184  1.3473  0.4289
           500  0.9913  0.0462  0.0087  0.0022  0.9151  1.0964  0.1813
λ = 1      100  0.9583  0.1083  0.0417  0.0135  0.6911  1.1156  0.4245
           200  0.9775  0.0824  0.0225  0.0073  0.8643  1.1873  0.3230
           500  0.9875  0.0303  0.0125  0.0011  0.9519  1.0706  0.1186
σ = 1      100  0.9978  0.0755  0.0022  0.0057  0.8779  1.1740  0.2961
           200  0.9993  0.0432  0.0006  0.0019  0.9487  1.1587  0.2099
           500  0.9996  0.0326  0.0004  0.0011  0.9801  1.1080  0.1279

                                                95% confidence limits
Parameter  n    MLE     SD      Bias    MSE     LCL     UCL     Length
α = 2      100  2.0809  0.3612  0.0809  0.1370  0.8611  2.2771  1.4159
           200  2.0145  0.3507  0.0145  0.1232  1.4284  2.8030  1.3746
           500  1.9977  0.2479  0.0023  0.0615  1.6241  2.5962  0.9721
λ = 1      100  1.0336  0.2615  0.0336  0.0695  0.3010  1.3261  1.0250
           200  0.9966  0.2421  0.0034  0.0586  0.6190  1.5681  0.9490
           500  0.9971  0.1753  0.0029  0.0307  0.7449  1.4322  0.6873
σ = 2      100  2.0700  0.1907  0.0700  0.0413  1.4481  2.1958  0.7476
           200  2.0378  0.1412  0.0378  0.0214  1.6056  2.1590  0.5534
           500  2.0126  0.0954  0.0126  0.0093  1.7600  2.1339  0.3739

other generalizations of the Lomax distribution.

AIC = −2ℓ(x; α̂, λ̂, σ̂) + 2k,

SIC = −2ℓ(x; α̂, λ̂, σ̂) + k log(n),

AD = −n − (1/n) Σ_{i=1}^n (2i − 1) [log F(x_i; α̂, λ̂, σ̂) + log(1 − F(x_{n+1−i}; α̂, λ̂, σ̂))],

where ℓ(x; α̂, λ̂, σ̂) denotes the log-likelihood function evaluated at the maximum likelihood estimates, k is the number of parameters, and n is the sample size. A real data set corresponding to the failure times of 84 aircraft windshields is considered. The windshield on a large aircraft is a complex piece of equipment, comprised basically of several layers of material, including a very strong outer skin with a heated layer just beneath it, all laminated under high temperature and pressure. Failures of these items are not structural failures. Instead, they typically involve damage or delamination of the nonstructural outer ply or failure of the heating system. These failures do not result in damage to the aircraft but do result in replacement of the windshield (El-Bassiouny et al. (2015)). The failure times of the 84 aircraft windshields are given in Table 2.5.

Table 2.5: Failure times of 84 Aircraft Windshield.

0.040 1.866 2.385 3.443
0.301 1.876 2.481 3.467
0.309 1.899 2.610 3.478
0.557 1.911 2.625 3.578
0.943 1.912 2.632 3.595
1.070 1.914 2.646 3.699
1.124 1.981 2.661 3.779
1.248 2.010 2.688 3.924
1.281 2.038 2.823 4.035
1.281 2.085 2.890 4.121
1.303 2.089 2.902 4.167
1.432 2.097 2.934 4.240
1.480 2.135 2.962 4.255
1.505 2.154 2.964 4.278
1.506 2.190 3.000 4.305
1.568 2.194 3.103 4.376
1.615 2.223 3.114 4.449
1.619 2.224 3.117 4.485
1.652 2.229 3.166 4.570
1.652 2.300 3.344 4.602
1.757 2.324 3.376 4.663

The estimates α̂ = 1.590, λ̂ = 0.628 and σ̂ = 11.185 are obtained using an iterative procedure, and the log-likelihood function given by equation (2.4.2) is evaluated at these values. By looking at Table 2.6 and Figure 2.4, we observe that the RL model has the smallest AIC and SIC among all fitted models. Moreover, the AD test statistic has the smallest value for this dataset in comparison with the other fitted models. Hence, the RL distribution provides the best fit for the failure times of the 84 aircraft windshields.
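The three criteria can be computed from any fitted cdf. A sketch in Python (the RL cdf follows the form used throughout this chapter; the function names are mine, and the AD statistic is written in its standard order-statistic form):

```python
import math

def rl_cdf(x, alpha, lam, sigma):
    """RL cdf, eq. (2.2.4)."""
    z = (x + lam) / lam
    return 1.0 - math.exp(-z**(2*alpha)/(2*sigma**2))

def aic(loglik, k):
    """Akaike information criterion."""
    return -2.0*loglik + 2.0*k

def sic(loglik, k, n):
    """Schwarz information criterion."""
    return -2.0*loglik + k*math.log(n)

def anderson_darling(xs, cdf):
    """AD statistic for the sample xs under the fitted cdf."""
    xs = sorted(xs)
    n = len(xs)
    s = sum((2*i - 1)*(math.log(cdf(xs[i-1])) + math.log(1.0 - cdf(xs[n-i])))
            for i in range(1, n + 1))
    return -n - s/n
```

With the windshield MLEs one would pass `cdf = lambda x: rl_cdf(x, 1.590, 0.628, 11.185)` and k = 3, n = 84.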

Table 2.6: The statistics log-likelihood, AIC, SIC and AD for failure times of 84 Aircraft Windshield data.

Distribution  Log-likelihood  AIC      SIC      AD
RL            -128.610        263.226  270.554  0.266
GL            -138.404        282.808  290.136  1.367
BL            -138.718        285.435  295.206  1.408
EL            -141.399        288.799  296.127  1.743
Lomax         -164.988        333.977  338.862  1.398

Table 2.7: MLEs for failure times of 84 Aircraft Windshield data.

Distribution  a      b       α          λ           σ
RL            -      -       1.590      0.628       11.185
GL            3.588  -       52001.499  37029.658   -
BL            3.604  33.639  4.831      118.837     -
EL            3.626  -       20074.509  26257.681   -
Lomax         -      -       51425.350  131789.780  -

Figure 2.4: Plot of the estimated densities for the Aircraft Windshield data.

Figure 2.5 and Table 2.8 show that the RL density fits the histogram of the Aircraft Windshield data well using the method of moments, the method of L-moments and the MLE.

Table 2.8: MLEs, L-moments, method of moments, AIC and SIC for failure times of 84 Aircraft Windshield data.

Method             α      λ      σ       log-likelihood  AIC      SIC
MLE                1.590  0.628  11.185  -128.610        263.226  270.554
L-moments          1.399  0.363  16.594  -129.881        265.762  273.090
Method of moments  1.666  0.810  9.092   -128.663        263.327  270.655

Figure 2.5: Model of Probability density function of Rayleigh Lomax for the Aircraft Windshield data using the L-moments, the MLE and the method of moments.

The Rayleigh Lomax distribution behaves as a lifetime distribution through the scale parameter λ. In particular, let Y = (X + λ)/λ; then X = λY − λ and dx = λ dy. Hence, the pdf of the transformed RL random variable Y is

f_Y(y; α, σ) = (α/σ²) y^{2α−1} e^{−y^{2α}/(2σ²)},  y, α, σ > 0,   (2.5.1)

and its cdf is given by

F_Y(y; α, σ) = 1 − e^{−y^{2α}/(2σ²)}.   (2.5.2)

It can be used for reliability and life data analysis since the range of Y consists only of positive values. The shape of the density of the transformed RL model is characterized by the following theorem.

Theorem 2.5.1. Let Y be the random variable of the transformed RL distribution. Then, the 30 transformed Rayleigh Lomax probability density function has the following properties:

(i) The density is decreasing for 0 < α ≤ 1/2.

(ii) For 1/2 < α ≤ 1 and for α > 1, the density has a local maximum.

Proof. Let Y be the random variable of the transformed Rayleigh Lomax density given by (2.5.1). The first and second derivatives of (2.5.1) with respect to y are

(d/dy) f_Y(y; α, σ) = (α/σ²) y^{2α−2} [(2α − 1) − (α/σ²) y^{2α}] e^{−y^{2α}/(2σ²)},   (2.5.3)

(d²/dy²) f_Y(y; α, σ) = (α/σ²) y^{2α−3} [(2α − 1)(2α − 2) − (3α(2α − 1)/σ²) y^{2α} + (α²/σ⁴) y^{4α}] e^{−y^{2α}/(2σ²)}.   (2.5.4)

The extreme values of f_Y(y; α, σ) are obtained at the root y* of (d/dy) f_Y(y; α, σ) = 0, where y* = (σ²(2α − 1)/α)^{1/(2α)}, and the density there is

f_Y(y*; α, σ) = (α/σ²) (σ²(2α − 1)/α)^{(2α−1)/(2α)} e^{−(2α−1)/(2α)}.   (2.5.5)

The inflection points are obtained from (d²/dy²) f_Y(y; α, σ) = 0, with roots

y** = [ σ² (3(2α − 1) ± √((10α − 1)(2α − 1))) / (2α) ]^{1/(2α)}.

To show the behavior of the transformed Rayleigh Lomax density function, we have

(i) For 0 < α < 1/2, the first derivative (2.5.3) satisfies (d/dy) f_Y(y; α, σ) < 0 for y > 0, which indicates that the transformed RL density is a decreasing function. Moreover, lim_{y→∞} f_Y(y; α, σ) = 0 and lim_{y→0} f_Y(y; α, σ) = ∞ (Rinne, 2008). By setting 2α = 1, the density becomes the exponential density f_Y(y; α, σ) = (1/(2σ²)) e^{−y/(2σ²)}. Taking limits, we have lim_{y→0} f_Y(y; α, σ) = 1/(2σ²), lim_{y→∞} f_Y(y; α, σ) = 0 and (d/dy) f_Y(y; α, σ) = −(1/(2σ²)²) e^{−y/(2σ²)} < 0, which indicates that the density function is again decreasing.

(ii) For 1/2 < α ≤ 1, the critical point y* lies in (0, σ], where (d²/dy²) f_Y(y*; α, σ) is negative. Hence, the density has a local maximum. Moreover, lim_{y→0} f_Y(y; α, σ) = 0 and lim_{y→∞} f_Y(y; α, σ) = 0, which implies that the density rises to a mode and then falls. For α > 1, the density also has a local maximum, since (d²/dy²) f_Y(y*; α, σ) < 0 at the critical point y*.

Figure 2.6 shows the behavior of the transformed Rayleigh Lomax density function for different values of the shape parameter α when σ² = 1/2.

Figure 2.6: Probability density function of the transformed Rayleigh Lomax for σ² = 1/2.

2.6 Hazard rate function

Let the random variable T be the time to failure of the transformed Rayleigh Lomax distribution. The survival function S_RL(t), which is the probability of a unit not failing before time t, is defined by S_RL(t) = 1 − F_RL(t). The survival function of the transformed Rayleigh Lomax distribution is given by

S_RL(t) = e^{−t^{2α}/(2σ²)};  t > 0, α, σ > 0.   (2.6.1)

One of the life characteristics of a random variable is the hazard rate function (hf), also known as the instantaneous rate of occurrence of the event, denoted by h_RL(t), which is defined by taking the derivative of −log(S_RL(t)):

h(t) = (d/dt)(−log S_RL(t)) = f_RL(t)/S_RL(t) = (α/σ²) t^{2α−1};  t > 0, α, σ > 0.   (2.6.2)

The following theorem shows the behavior of the hazard rate function of the transformed Rayleigh Lomax distribution as a lifetime distribution for the random variable T = (X + λ)/λ > 0.

Theorem 2.6.1. Let T be the random variable of the transformed Rayleigh Lomax distribution. Then the hazard rate function has the following properties:

(i) If 0 < α < 1/2, the failure rate is decreasing and convex.
(ii) If 1/2 < α < 1, the failure rate is increasing and concave.
(iii) If α = 1/2, the failure rate is constant.
(iv) If α = 1, the failure rate is an increasing linear failure rate.
(v) If α > 1, the failure rate is increasing and convex.

Proof. Let T be the time to failure of the transformed RL density. The transformed RL hazard rate function is given by

h_T(t; α) = (α/σ²) t^{2α−1},  t > 0, α > 0.

The first and second derivatives of the previous function with respect to t are

(d/dt) h_T(t; α) = (α/σ²)(2α − 1) t^{2α−2},

(d²/dt²) h_T(t; α) = (α/σ²)(2α − 1)(2α − 2) t^{2α−3},  t > 0.

The values of α at which the first derivative (d/dt) h_T(t; α) vanishes are α = 0 and α = 1/2, and the possible inflection cases are α = 0, α = 1/2 and α = 1. To study the behavior of the transformed Rayleigh Lomax hazard rate function, we consider the following cases.

(i) If 0 < α < 1/2, then (d/dt) h_T(t; α) < 0 and (d²/dt²) h_T(t; α) > 0 for t > 0. Also, h_T(t; α) → ∞ as t → 0 and h_T(t; α) → 0 as t → ∞. Hence, the hazard rate is decreasing and convex.

(ii) If 1/2 < α < 1, then (d/dt) h_T(t; α) > 0 and (d²/dt²) h_T(t; α) < 0 for t > 0. Also, h_T(t; α) → 0 as t → 0 and h_T(t; α) → ∞ as t → ∞. Hence, the hazard rate is increasing and concave.

(iii) If α = 1/2, then the hazard rate function is h_T(t; α) = 1/(2σ²), which is constant.

(iv) If α = 1, then the hazard rate function is h_T(t; α) = (1/σ²) t, which is an increasing linear hazard rate.

(v) If α > 1, then (d/dt) h_T(t; α) > 0 and (d²/dt²) h_T(t; α) > 0 for t > 0. Hence, the hazard rate is increasing and convex.

Figure 2.7 illustrates the behavior of the transformed Rayleigh Lomax hazard rate for a random variable T = (X + λ)/λ when the shape parameter varies from 0.25 to 2.

Figure 2.7: Hazard rate function of the transformed Rayleigh Lomax for a random variable T when σ² = 1/2.
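The five cases of Theorem 2.6.1 can be checked numerically on a grid of time points; a small sketch with σ² = 1/2 as in Figure 2.7 (helper names are mine):

```python
def hazard(t, alpha, sigma2=0.5):
    """Transformed RL hazard rate, eq. (2.6.2), with sigma^2 = 1/2 by default."""
    return alpha / sigma2 * t**(2*alpha - 1)

# grid of strictly positive time points
ts = [0.2 + 0.1*i for i in range(30)]

def diffs(alpha):
    """Successive differences of the hazard over the grid: their sign
    reveals whether the hazard is decreasing, constant, or increasing."""
    h = [hazard(t, alpha) for t in ts]
    return [b - a for a, b in zip(h, h[1:])]
```

For α = 0.25 all successive differences are negative (decreasing hazard), for α = 0.5 they vanish (constant hazard), and for α = 1 and α = 2 they are positive (increasing hazard), exactly as cases (i), (iii), (iv) and (v) predict.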

In the following section, we derive the confidence intervals and the inverse estimators of the parameters of the transformed Rayleigh Lomax distribution. Moreover, a simulation study is con- ducted to report the coverage probabilities, the average relative biases and average relative mean square errors for the MLE, the method of L-moments and inverse estimators. Finally, an illustrative example is provided to demonstrate the proposed methods.

2.7 Inference under progressively type-II right-censored sampling for transformed Rayleigh Lo- max distribution

According to Wang et al. (2010), let F (y; α, σ) be a lifetime distribution with parameters α and σ. Consider parameter estimation for the family with

F(y; α, σ) = 1 − [1 − G(y; α)]^σ,   (2.7.1)

where G(y; α) is a distribution function containing only the shape parameter α. The family (2.7.1) is called the "proportional hazards family" or "frailty parameter family"; σ is the G-parameter or scale parameter, and α is the power or shape parameter. Then, for the transformed RL distribution, (2.7.1) can be written as

F(y; α, σ) = 1 − [1 − G(y; α)]^{1/(2σ²)},  y, α, σ > 0,

where G(y; α) = 1 − e^{−y^{2α}} is a distribution function depending only on α. In this section, the purpose is to derive confidence intervals for the transformed RL distribution in the presence of progressively Type-II right censored observations, and to study inverse estimators of both the shape and scale parameters. Suppose that n units are placed on a life test at time zero. Before the test begins, a number m (< n) is fixed and the censoring scheme R = (R₁, ..., R_m), with R_j ≥ 0 and Σ_{j=1}^m R_j + m = n,

is specified. Immediately following the first observed failure, R₁ surviving units are removed from the test at random. Then, immediately following the second observed failure, R₂ surviving units are removed from the test at random. This process continues until, at the time of the m-th observed failure, the remaining R_m = n − R₁ − R₂ − ... − R_{m−1} − m units are removed from the test and censored. This scheme includes as a special case the complete sample (when m = n and R₁ = ... = R_m = 0). To derive the confidence intervals for α and σ, the following results, which are also stated in Wang et al. (2010), are needed.

Theorem 2.7.1. (I) If V_{i:m:n} = −log(1 − F(Y_{i:m:n}; α, σ)), i = 1, ..., m, then V_{1:m:n}, ..., V_{m:m:n} is a progressively Type-II right censored sample from the standard exponential distribution with sample size n and censoring scheme R = (R₁, ..., R_m). In the transformed RL distribution, V_{i:m:n} = (1/(2σ²)) Y_{i:m:n}^{2α}.

(II) If W₁ = nV_{1:m:n} and W_i = [n − Σ_{j=1}^{i−1} (R_j + 1)] (V_{i:m:n} − V_{i−1:m:n}), i = 2, ..., m, then W₁, ..., W_m are independent standard exponential random variates.

(III) If S_i = Σ_{j=1}^{i} W_j, i = 1, ..., m, and U_{(i)} = S_i/S_m, i = 1, ..., m − 1, then U_{(1)} < ... < U_{(m−1)} are order statistics from the uniform(0,1) distribution with sample size m − 1.

(III) If S = Pi W , i = 1, ..., m and U = Si , i = 1, ..., m − 1, then U < ... < U are i j=1 j (i) Sm (1) (m−1) order statistics from the uniform(0,1) distribution with sample size m-1. 35 Proof. (I) Balakrishnan and Aggarwala (2000) provided the joint pdf of all m progressively Type-II right censored order statistics as follows

f_{Y_{1:m:n},...,Y_{m:m:n}}(y₁, ..., y_m) = c Π_{i=1}^{m} f(y_i) [1 − F(y_i)]^{R_i},  y₁ < ... < y_m,   (2.7.2)

where c = n(n−R1 −1)...(n−R1 −R2 −...−Rm−1 −m+1). For the transformed RL distribution, the joint pdf of all m progressively Type-II right censored order statistics is

f_{Y_{1:m:n},...,Y_{m:m:n}}(y₁, ..., y_m) = c Π_{i=1}^{m} (α/σ²) y_i^{2α−1} [e^{−y_i^{2α}/(2σ²)}]^{R_i+1}.   (2.7.3)

Since V_{i:m:n} = (1/(2σ²)) Y_{i:m:n}^{2α}, we have Y_{i:m:n} = (2σ² V_{i:m:n})^{1/(2α)}, and

Π_{i=1}^{m} dY_{i:m:n}/dV_{i:m:n} = Π_{i=1}^{m} (1/(2α)) (2σ²)^{1/(2α)} V_i^{1/(2α)−1}.

Hence,

f_{V_{1:m:n},...,V_{m:m:n}}(v₁, ..., v_m) = c e^{−Σ_{i=1}^{m} (R_i+1) v_i},  0 < v₁ < ... < v_m < ∞,   (2.7.4)

which implies that V1:m:n, ..., Vm:m:n is a progressively Type-II right censored sample from the standard exponential distribution. (II) If

W₁ = nV_{1:m:n},
W₂ = (n − R₁ − 1)(V_{2:m:n} − V_{1:m:n}),
⋮
W_m = (n − R₁ − ... − R_{m−1} − m + 1)(V_{m:m:n} − V_{m−1:m:n}).

V_{1:m:n} = W₁/n,
V_{2:m:n} = W₁/n + W₂/(n − R₁ − 1),
⋮
V_{m:m:n} = W₁/n + ... + W_m/(n − R₁ − ... − R_{m−1} − m + 1).

Hence, from (2.7.4), we have

f_{W₁,...,W_m}(w₁, ..., w_m) = e^{−(R₁+1) w₁/n} · e^{−(R₂+1)(w₁/n + w₂/(n−R₁−1))} ⋯ e^{−(R_m+1)(w₁/n + ... + w_m/(n−R₁−...−R_{m−1}−(m−1)))}

= e^{−Σ_{i=1}^{m} (R_i+1) w₁/n} · e^{−Σ_{i=2}^{m} (R_i+1) w₂/(n−R₁−1)} ⋯ e^{−(R_m+1) w_m/(n−R₁−...−R_{m−1}−(m−1))}.   (2.7.5)

Since,

Σ_{i=1}^{m} (R_i + 1) = Σ_{i=1}^{m} R_i + m = n,
Σ_{i=2}^{m} (R_i + 1) = Σ_{i=2}^{m} R_i + (m − 1) = Σ_{i=1}^{m} R_i + m − R₁ − 1 = n − R₁ − 1,
⋮
R_m + 1 = n − Σ_{i=1}^{m−1} R_i − m + 1,

we have

f_{W₁,...,W_m}(w₁, ..., w_m) = e^{−Σ_{i=1}^{m} w_i},  w_i ≥ 0.   (2.7.6)

Therefore, W₁, ..., W_m are independent standard exponential random variates.

(III) The probability density function of the order statistic U_{(i)}, i = 1, ..., m − 1, from the uniform(0,1) distribution is given by

f_{U_{(i)}}(u) = ((m−1)!/((i−1)!(m−i−1)!)) u^{i−1} (1−u)^{m−i−1}
              = (Γ(m)/(Γ(i)Γ(m−i))) u^{i−1} (1−u)^{m−i−1},  0 < u < 1,   (2.7.7)

which implies that U(i) ∼ Beta(i, m − i).

Given S_i = Σ_{j=1}^{i} W_j, i = 1, ..., m, we need to show that U_i = S_i/S_m, i = 1, ..., m − 1, are distributed as order statistics from the uniform(0,1) distribution with sample size m − 1. Since W₁, ..., W_m are independent standard exponential random variates, S_i = Σ_{j=1}^{i} W_j is a random variable from the gamma distribution with shape parameter i > 0 and rate parameter equal to 1. Let S_{i+1} = W_{i+1} + W_{i+2} + ... + W_m and S_m = S_i + S_{i+1}, where S_{i+1} follows the gamma distribution with shape parameter m − i > 0 and rate parameter equal to 1.

Since Si and Si+1 are independent random variables, the joint pdf of of Si and Si+1 is given as

f(s_i, s_{i+1}) = (s_i^{i−1}/Γ(i)) e^{−s_i} · (s_{i+1}^{m−i−1}/Γ(m−i)) e^{−s_{i+1}},  s_i, s_{i+1} ≥ 0, i, m > 0.   (2.7.8)

Let K = S_i/(S_i + S_{i+1}) and S_m = S_i + S_{i+1}; then S_i = K S_m and S_{i+1} = S_m(1 − K), with Jacobian

|∂(S_i, S_{i+1})/∂(S_m, K)| = |det( K, S_m; 1 − K, −S_m )| = S_m.

Then, the joint pdf of K and Sm is given as

f(k, s_m) = (Γ(m)/(Γ(i)Γ(m−i))) k^{i−1} (1−k)^{m−i−1} · (s_m^{m−1}/Γ(m)) e^{−s_m}.   (2.7.9)

By the factorization theorem, K = S_i/(S_i + S_{i+1}) = S_i/S_m ∼ Beta(i, m − i). Therefore, U_{(i)} = S_i/S_m, i = 1, ..., m − 1, where U_{(1)} < ... < U_{(m−1)} are order statistics from the uniform(0,1) distribution with sample size m − 1.

2.7.1 Interval estimation of parameter α

In order to construct the confidence interval of the parameter α, we consider the following quantity:

W(2α) = Σ_{i=1}^{m−1} (−2 log U_{(i)}) = 2 Σ_{i=1}^{m−1} log(S_m/S_i)
      = 2 Σ_{i=1}^{m−1} log[ Σ_{j=1}^{m} (R_j+1) V_{j:m:n} / ( Σ_{j=1}^{i} (R_j+1) V_{j:m:n} + (n − Σ_{j=1}^{i} (R_j+1)) V_{i:m:n} ) ]
      = 2 Σ_{i=1}^{m−1} log[ Σ_{j=1}^{m} (R_j+1) Y_{j:m:n}^{2α} / ( Σ_{j=1}^{i} (R_j+1) Y_{j:m:n}^{2α} + (n − Σ_{j=1}^{i} (R_j+1)) Y_{i:m:n}^{2α} ) ].   (2.7.10)

We notice that W(2α) is a function of α alone and does not depend on σ. Moreover, W(2α) = Σ_{i=1}^{m−1} (−2 log U_{(i)}) = Σ_{i=1}^{m−1} (−2 log U_i), and U₁, ..., U_{m−1} = S₁/S_m, ..., S_{m−1}/S_m is a random sample from the uniform(0,1) distribution, which implies that W(2α) has the χ² distribution with 2(m − 1) degrees of freedom. To show that W(2α) is a strictly monotonic function of α, we write equation (2.7.10) as follows:

W(2α) = 2 Σ_{i=1}^{m−1} log( 1 + (S_m − S_i)/S_i )
      = 2 Σ_{i=1}^{m−1} log[ 1 + ( Σ_{j=i+1}^{m} (R_j+1) V_{j:m:n} − (n − Σ_{j=1}^{i} (R_j+1)) V_{i:m:n} ) / ( Σ_{j=1}^{i} (R_j+1) V_{j:m:n} + (n − Σ_{j=1}^{i} (R_j+1)) V_{i:m:n} ) ]
      = 2 Σ_{i=1}^{m−1} log[ 1 + ( Σ_{j=i+1}^{m} (R_j+1) P_{(j,i)} − (n − Σ_{j=1}^{i} (R_j+1)) ) / ( Σ_{j=1}^{i} (R_j+1) P_{(j,i)} + n − Σ_{j=1}^{i} (R_j+1) ) ],   (2.7.11)

where P_{(j,i)} = V_{j:m:n}/V_{i:m:n} = (Y_{j:m:n}/Y_{i:m:n})^{2α} is strictly increasing in α for j > i, so W(2α) is a strictly increasing function of α. Therefore, W⁻¹ exists, and the confidence interval of the parameter α is stated in the following theorem.

Theorem 2.7.2. Suppose X = (X_{1:m:n}, ..., X_{m:m:n}) is a progressively Type-II right censored sample from the transformed RL distribution with sample size n and censoring scheme R = (R₁, ..., R_m). Then, for any 0 < γ < 1,

h 1 −1 2 1 −1 2 i 2 W [χ1−γ/2(2(m − 1))], 2 W [χγ/2(2(m − 1))] 2 is a 100(1 − γ)% confidence interval for the shape parameter α, where χ1−γ/2(2(m − 1)) and

2 2 χγ/2(2(m − 1)) are the lower and upper γ respectively of the χ distribution with 2(m − 1) degrees of freedom.
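Since $W(2\alpha)$ is strictly increasing with $W \to 0$ as $\alpha \to 0$, the inversion $W^{-1}$ in Theorem 2.7.2 can be carried out by simple bisection. The sketch below is a minimal illustration on a hypothetical complete sample (all $R_j = 0$, so $n = m$); the data values are made up, and the chi-square quantiles come from the Wilson-Hilferty normal approximation rather than exact tables.

```python
import math
from statistics import NormalDist

def chi2_quantile(p, k):
    """Wilson-Hilferty approximation to the chi-square p-quantile with k df."""
    z = NormalDist().inv_cdf(p)
    return k * (1.0 - 2.0 / (9.0 * k) + z * math.sqrt(2.0 / (9.0 * k))) ** 3

def W(two_alpha, y, R, n):
    """Pivot quantity (2.7.10); y holds the ordered censored observations."""
    m = len(y)
    v = [(R[j] + 1) * y[j] ** two_alpha for j in range(m)]
    total = sum(v)
    out = 0.0
    for i in range(m - 1):
        head = sum(v[: i + 1])
        tail = (n - sum(R[j] + 1 for j in range(i + 1))) * y[i] ** two_alpha
        out += 2.0 * math.log(total / (head + tail))
    return out

def invert_W(target, y, R, n, lo=1e-6, hi=20.0):
    """W is strictly increasing in 2*alpha, so bisection recovers W^{-1}(target)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if W(mid, y, R, n) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# hypothetical complete sample (all R_j = 0, so n = m); values are made up
y = [0.3, 0.7, 1.1, 1.6, 2.2, 3.0]
R = [0] * len(y)
n = len(y)
gamma = 0.10
df = 2 * (len(y) - 1)
lower = 0.5 * invert_W(chi2_quantile(gamma / 2, df), y, R, n)
upper = 0.5 * invert_W(chi2_quantile(1 - gamma / 2, df), y, R, n)
print(lower, upper)
```

With real progressively censored data, y, R and n would come from the experiment; here they are only placeholders for the mechanics of the interval.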

2.7.2 Interval estimation of parameter σ

To obtain the confidence interval of $\sigma$, we consider the quantity $V = 2S_m$. Note that from part (III), $S_m = \sum_{j=1}^{m} W_j$. Hence,

$$V = 2\sum_{j=1}^{m} W_j = 2\sum_{j=1}^{m}(R_j+1)V_{j:m:n}. \tag{2.7.12}$$

For the transformed RL distribution, the quantity $V$ can be written as

$$V = \frac{1}{\sigma^2}\sum_{j=1}^{m}(R_j+1)Y_{j:m:n}^{2g(W,Y)}, \tag{2.7.13}$$

where $g(W,Y) = \alpha = \frac{1}{2}W^{-1}(t)$ is obtained from (2.7.10) numerically, and $V$ has the $\chi^2$ distribution with $2m$ degrees of freedom. Hence, the $100(1-\gamma)\%$ confidence interval of $\sigma$ is

$$\left[\sqrt{\frac{\sum_{j=1}^{m}(R_j+1)Y_{j:m:n}^{2g(W,Y)}}{\chi^2_{\gamma/2}(2m)}},\ \sqrt{\frac{\sum_{j=1}^{m}(R_j+1)Y_{j:m:n}^{2g(W,Y)}}{\chi^2_{1-\gamma/2}(2m)}}\right].$$

2.7.3 Inverse estimation of parameters α and σ

Since $W(2\alpha)$ has the $\chi^2$ distribution with $2(m-1)$ degrees of freedom and $E(W(2\alpha)) = 2(m-1) < \infty$, then by the strong law of large numbers, $W(2\hat{\alpha}) \xrightarrow{a.s.} 2(m-2)$ (that is, $W(2\hat{\alpha})$ converges with probability one to $2(m-2)$). Therefore, we can obtain the point estimator $\hat{\alpha}$ of $\alpha$ from the following equation:

$$W(2\hat{\alpha}) = 2(m-2). \tag{2.7.14}$$

The inverse estimate of $\alpha$ is obtained by solving equation (2.7.14) numerically. From the previous subsection, we know that $V = 2S_m$ has the $\chi^2$ distribution with $2m$ degrees of freedom. Hence, the inverse estimate of the parameter $\sigma$ is

$$\hat{\sigma} = \sqrt{\frac{\sum_{j=1}^{m}(R_j+1)Y_{j:m:n}^{2\hat{\alpha}}}{2(m-1)}}. \tag{2.7.15}$$
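A numerical sketch of the inverse estimation: solve (2.7.14) for $\hat{\alpha}$ by bisection (recall that $W$ is strictly increasing), then plug $\hat{\alpha}$ into (2.7.15). The check below assumes that, for a complete sample, the transformed RL lifetime satisfies $Y^{2\alpha}/(2\sigma^2) \sim \mathrm{Exp}(1)$, which is consistent with (2.7.12)-(2.7.13); the sample size and seed are arbitrary choices.

```python
import math
import random

random.seed(7)

def W(two_alpha, y, R, n):
    """Pivot quantity (2.7.10)."""
    m = len(y)
    v = [(R[j] + 1) * y[j] ** two_alpha for j in range(m)]
    total = sum(v)
    out = 0.0
    for i in range(m - 1):
        head = sum(v[: i + 1])
        tail = (n - sum(R[j] + 1 for j in range(i + 1))) * y[i] ** two_alpha
        out += 2.0 * math.log(total / (head + tail))
    return out

def inverse_estimates(y, R, n):
    """Solve W(2*alpha_hat) = 2(m-2) by bisection, then sigma_hat via (2.7.15)."""
    m = len(y)
    target, lo, hi = 2.0 * (m - 2), 1e-6, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if W(mid, y, R, n) < target:
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.25 * (lo + hi)      # (lo + hi)/2 approximates 2*alpha_hat
    s = sum((R[j] + 1) * y[j] ** (2 * alpha_hat) for j in range(m))
    return alpha_hat, math.sqrt(s / (2.0 * (m - 1)))

# assumed sampling model: Y^(2*alpha)/(2*sigma^2) behaves as standard exponential
alpha, sigma, n = 1.0, 1.0, 200
y = sorted((2 * sigma**2 * random.expovariate(1.0)) ** (1 / (2 * alpha)) for _ in range(n))
a_hat, s_hat = inverse_estimates(y, [0] * n, n)
print(a_hat, s_hat)
```

For a complete sample of size 200 from the assumed model, both estimates land near the true values $\alpha = \sigma = 1$.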

2.7.4 Simulation study

We conduct a simulation study for the transformed RL distribution under a variety of progressively Type-II right censored sampling schemes with 10000 replications. Using the algorithm presented in Balakrishnan and Sandhu (1995), we generate progressively Type-II censored samples from the transformed RL distribution for different choices of sample sizes and censoring schemes provided by Wang et al. (2010). Table 2.9 shows the coverage probabilities of the confidence intervals of α and σ at the 0.90 and 0.95 confidence levels for the transformed RL distribution. It illustrates that the simulated coverage probabilities are very close to the nominal 0.90 and 0.95 confidence levels. We also obtain the inverse estimates of α and σ in Tables 2.10 and 2.11 and compare their performance with the MLEs and L-moment estimates, which are presented in Tables 2.12-2.14. We observe that the inverse estimation provides a good alternative to the method of L-moments and MLE in terms of bias and MSE. Moreover, in almost all cases, the estimators of the parameters α and σ become less biased as the number of censored units increases for a fixed sample size.

Table 2.9: The coverage probabilities of the confidence intervals in the transformed RL distribution for α = 1 and σ = 1.

α σ

(n,m) (r1,...,rm) 0.90 0.95 0.90 0.95

(10,5) (0,...,0,5) 0.8500 0.9149 0.8416 0.9100

(10,5) (5,0,...,0) 0.8771 0.9368 0.8591 0.9223

(10,5) (1,1,...,1) 0.8676 0.9280 0.8712 0.9298

(10,8) (0,0,...,2) 0.8288 0.8975 0.8781 0.9346

(10,8) (2,0,...,0) 0.8447 0.9102 0.8722 0.9334

(20,10) (0,0,...,10) 0.8401 0.9050 0.8755 0.9348

(20,10) (10,0,...,0) 0.8557 0.9190 0.8776 0.9364

(20,15) (0,0,...,5) 0.8385 0.9025 0.8877 0.9415

(20,15) (5,0,...,0) 0.8469 0.9125 0.8838 0.9403

(30,10) (0,0,...,20) 0.8520 0.9146 0.8790 0.9370

(30,10) (20,0,...,0) 0.8626 0.9234 0.8836 0.9393

(30,10) (2,2,...,2) 0.8657 0.9252 0.8787 0.9370

(50,12) (0,...,0,38) 0.8694 0.9298 0.8881 0.9406

(50,12) (38,0,...,0) 0.8725 0.9293 0.8820 0.9366

(50,25) (0,0,...,25) 0.8457 0.9102 0.8872 0.9396

(50,25) (25,0,...,0) 0.8410 0.9056 0.8948 0.9446

(50,25) (1,1,...,1) 0.8615 0.9216 0.8964 0.9455

Table 2.10: The average bias and average MSE of the inverse estimators of the parameters of the transformed RL distribution for α = 1 and σ = 1.

Bias MSE

(n,m) (r1,...,rm) αˆ σˆ αˆ σˆ

(10,5) (0,...,0,5) -0.04258 0.27993 0.32956 0.75423

(10,5) (5,0,...,0) -0.03892 0.19442 0.162947 0.59069

(10,5) (1,1,...,1) -0.00416 0.25841 0.10647 0.36904

(10,8) (0,...,0,2) -0.00334 0.20340 0.10729 0.31404

(10,8) (2,0,...,0) -0.02781 0.14724 0.10363 0.50432

(20,10) (0,0,...,10) -0.01636 0.10813 0.11379 0.18029

(20,10) (10,0,...,0) -0.03189 0.07033 0.05969 0.19855

(20,10) (1,1,...,1) 0.00142 0.10278 0.09369 0.17434

(20,15) (0,0,...,5) 0.00031 0.09648 0.07304 0.22366

(20,15) (5,0,...,0) -0.01603 0.08205 0.04257 0.15048

(30,10) (0,...,0,20) -0.00912 0.13493 0.12528 0.16423

(30,10) (20,0,...,0) -0.01380 0.06646 0.05292 0.16163

(30,10) (2,2,...,2) -0.01118 0.09256 0.09188 0.14443

(50,12) (0,...,0,38) 0.00532 0.11155 0.10696 0.14701

(50,12) (38,0,...,0) -0.00705 0.06631 0.04168 0.14275

(50,25) (0,...,0,25) -0.01335 0.04005 0.03567 0.04766

(50,25) (25,0,...,0) -0.00398 0.02737 0.02273 0.07204

(50,25) (1,1,...,1) 0.00192 0.02141 0.02882 0.05354

Table 2.11: The inverse estimates of the parameters of the transformed RL distribution for α = 1 and σ = 1.

(n,m) (r1,...,rm) αˆ σˆ

(10,5) (0,0...,5) 0.95742 1.27993

(10,5) (5,0,...,0) 0.96108 1.19442

(10,5) (1,1,...,1) 0.99584 1.25841

(10,8) (0,...,0,2) 0.99666 1.20340

(10,8) (2,0,...,0) 0.97219 1.14724

(20,10) (0,0,...,10) 0.98364 1.10813

(20,10) (10,0,...,0) 0.96811 1.07033

(20,10) (1,1,...,1) 1.00142 1.10278

(20,15) (0,...,0,5) 1.00031 1.09648

(20,15) (5,0,...,0) 0.98397 1.08205

(30,10) (0,...,0,20) 0.99088 1.13493

(30,10) (20,0,...,0) 0.98620 1.06646

(30,10) (2,2,...,2) 0.98882 1.09256

(50,12) (0,...,0,38) 1.00532 1.11155

(50,12) (38,0,...,0) 0.99295 1.06631

(50,25) (0,...,0,25) 0.98665 1.04005

(50,25) (25,0,...,0) 0.99602 1.04342

(50,25) (1,1,...,1) 1.00192 1.02141

Table 2.12: The average bias of the L-moments and MLEs of the parameters of the transformed RL distribution.

Bias αˆ σˆ

(n,m) (r1,...,rm) L-mom MLE L-mom MLE

(10,5) (0,..,0,5) 0.55567 0.92258 -0.38858 -0.79442

(10,5) (5,0,...,0) -0.07781 0.15797 -0.07739 -0.10608

(10,5) (1,1,...,1) 0.15453 0.42711 -0.26749 -0.44135

(10,8) (0,...,0,2) 0.27196 0.45911 -0.11571 -0.3555

(10,8) (2,0,...,0) 0.01472 0.16016 0.00327 0.01753

(20,10) (0,...,0,10) 0.42648 0.61213 -0.42983 -0.61913

(20,10) (10,0,...,0) -0.05016 0.05026 -0.05732 -0.08407

(20,10) (1,1,...,1) 0.03664 0.14912 -0.28479 -0.48656

(20,15) (0,...,0,5) 0.28703 0.39965 -0.18612 -0.25265

(20,15) (5,0,...,0) 0.00450 0.07531 -0.00134 0.01254

(30,10) (0,...,0,20) 0.50733 0.70672 -0.60172 -0.64089

(30,10) (20,0,...,0) -0.10060 -0.01042 -0.08483 -0.11747

(30,10) (2,2,...,2) 0.07229 0.18551 -0.43015 -0.51704

(30,20) (0,...,0,10) 0.31000 0.41319 -0.28147 -0.31776

(50,12) (38,0,...,0) -0.11779 -0.05355 -0.09259 -0.10947

(50,25) (25,0,...,0) -0.03584 -0.00130 -0.02766 -0.01838

(50,25) (1,1,...,1) 0.02812 0.06811 -0.28931 -0.30948

Table 2.13: The average MSE of the L-moments and MLEs of the parameters of the transformed RL distribution.

MSE αˆ σˆ

(n,m) (r1,...,rm) L-mom MLE L-mom MLE

(10,5) (0,..,0,5) 1.50534 2.34460 0.31145 1.13771

(10,5) (5,0,...,0) 0.16243 0.25583 0.07078 0.20065

(10,5) (1,1,...,1) 0.53145 0.87607 0.21046 0.68921

(10,8) (0,...,0,2) 0.29683 0.51212 0.10391 0.70489

(10,8) (2,0,...,0) 0.12723 0.19352 0.08648 0.23226

(20,10) (0,...,0,10) 0.42118 0.68416 0.20333 0.56756

(20,10) (10,0,...,0) 0.06158 0.07473 0.04174 0.16764

(20,10) (1,1,...,1) 0.09250 0.12553 0.09828 0.51031

(20,15) (0,...,0,5) 0.17883 0.28198 0.06654 0.22357

(20,15) (5,0,...,0) 0.04760 0.05868 0.02974 0.06064

(30,10) (0,...,0,20) 0.50528 0.83076 0.37277 0.42960

(30,10) (20,0,...,0) 0.05666 0.05525 0.03644 0.15508

(30,10) (2,2,...,2) 0.10852 0.15792 0.19509 0.35923

(30,20) (0,...,0,10) 0.17262 0.26647 0.09526 0.16951

(50,12) (38,0,...,0) 0.04869 0.04024 0.03231 0.09948

(50,25) (25,0,...,0) 0.02286 0.02195 0.02030 0.01638

(50,25) (1,1,...,1) 0.03492 0.03903 0.09031 0.13263

Table 2.14: The maximum likelihood and L-moment estimates of the parameters of the transformed RL distribution for α = 1 and σ = 1.

Estimate αˆ σˆ

(n,m) (r1,...,rm) L-mom MLE L-mom MLE

(10,5) (0,...,0,5) 1.47614 1.97562 0.61142 0.20558

(10,5) (5,0,...,0) 0.92219 1.15797 0.92261 0.89392

(10,5) (1,1,...,1) 1.15454 1.42711 0.73251 0.55867

(10,8) (0,0,...,2) 1.27196 1.45911 0.88428 0.64447

(10,8) (2,0,...,0) 1.01472 1.16015 1.00327 1.01753

(20,10) (0,0,...,10) 1.42648 1.61213 0.57017 0.38087

(20,10) (10,0,...,0) 0.94983 1.05026 0.94267 0.91593

(20,10) (1,1,...,1) 1.03664 1.14912 0.71520 0.51344

(20,15) (0,0,...,5) 1.28703 1.39965 0.81388 0.74734

(20,15) (5,0,...,0) 1.00450 1.07531 0.99866 1.01254

(30,10) (0,0,...,20) 1.50733 1.70672 0.39828 0.35910

(30,10) (20,0,...,0) 0.89939 0.98958 0.91517 0.88253

(30,10) (2,2,...,2) 1.07229 1.18551 0.56985 0.48296

(30,20) (0,0,...,10) 1.31000 1.41319 0.71853 0.68223

(50,12) (38,0,...,0) 0.88221 0.94645 0.90741 0.89053

(50,25) (25,0,...,0) 0.96416 0.99869 0.97234 0.98162

(50,25) (1,1,...,1) 1.02812 1.06811 0.71069 0.69052

2.7.5 An Illustrative Example

We consider the following general progressively Type-II censored data, which represent the times (in minutes) to breakdown of an insulating fluid between electrodes at a voltage of 30 kV. These data are given in Table 6.1 of Nelson (1982). The complete data set consists of n = 11 times to breakdown. The progressively censored data are given as follows.

ri 0 0 0 0 3 0 0 0

Yi 2.0464 2.8361 3.0184 3.0454 3.1206 4.9706 5.1698 5.2724

The experimenter removed three surviving units from the test at the failure (breakdown) of the insulating fluid that occurred at 3.1206 minutes, so that $\sum_{i=1}^{8} r_i + m = 3 + 8 = 11$. The maximum likelihood and inverse estimates of α and σ are computed and the results are shown in Table 2.15.

Table 2.15: The maximum likelihood and inverse estimates of α and σ.

α σ

MLE Inverse MLE Inverse

Complete data 1.84201 1.63638 9.73321 7.45820

Progressive data 1.74957 1.67150 8.38262 8.53374
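The inverse estimates for the progressively censored data can be reproduced with a few lines of code. The sketch below solves $W(2\hat{\alpha}) = 2(m-2) = 12$ by bisection and then applies (2.7.15); the resulting values should land near the progressive-data entries reported in Table 2.15.

```python
import math

# Nelson (1982) insulating-fluid breakdown times with censoring scheme R (n = 11, m = 8)
y = [2.0464, 2.8361, 3.0184, 3.0454, 3.1206, 4.9706, 5.1698, 5.2724]
R = [0, 0, 0, 0, 3, 0, 0, 0]
n, m = 11, len(y)

def W(two_alpha):
    """Pivot quantity (2.7.10) for this data set."""
    v = [(R[j] + 1) * y[j] ** two_alpha for j in range(m)]
    total = sum(v)
    out = 0.0
    for i in range(m - 1):
        head = sum(v[: i + 1])
        tail = (n - sum(R[j] + 1 for j in range(i + 1))) * y[i] ** two_alpha
        out += 2.0 * math.log(total / (head + tail))
    return out

# solve W(2*alpha_hat) = 2(m-2) by bisection; W is strictly increasing
target, lo, hi = 2.0 * (m - 2), 1e-6, 20.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if W(mid) < target:
        lo = mid
    else:
        hi = mid
a_hat = 0.25 * (lo + hi)              # (lo + hi)/2 approximates 2*alpha_hat

# sigma_hat from (2.7.15)
s_hat = math.sqrt(sum((R[j] + 1) * y[j] ** (2 * a_hat) for j in range(m)) / (2 * (m - 1)))
print(a_hat, s_hat)
```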

From Table 2.15, we observe that the inverse estimates of the transformed RL parameters α and σ based on the progressively censored sample are closer to those based on the complete sample than the MLEs are. Hence, inverse estimation is preferable and can be considered a good alternative even for small sample sizes. Moreover, the confidence intervals at the 0.90 and 0.95 confidence levels for each of α and σ are calculated and shown in the following tables.

Table 2.16: The 0.90 confidence intervals for the parameters α and σ.

90% confidence limits α σ

Lower Upper Lower Upper
Complete data (Subsections 2.7.1 and 2.7.2) 1.07606 2.59416 6.27758 10.68066
Progressive data (Subsections 2.7.1 and 2.7.2) 1.05046 2.77078 6.56098 12.45661
Complete data (Method of maximum likelihood) 1.07657 2.60781 -1.97060 21.44285
Progressive data (Method of maximum likelihood) 1.02828 2.47307 -1.18758 17.98259

Table 2.17: The 0.95 confidence intervals for the parameters α and σ.

95% confidence limits α σ

Lower Upper Lower Upper
Complete data (Subsections 2.7.1 and 2.7.2) 0.97022 2.78400 6.01879 11.36063
Progressive data (Subsections 2.7.1 and 2.7.2) 0.92780 2.98112 6.24778 13.45855
Complete data (Method of maximum likelihood) 1.06161 2.62278 -2.19938 21.67163
Progressive data (Method of maximum likelihood) 1.04213 2.45922 -1.00412 17.79886

Tables 2.16 and 2.17 show that the confidence intervals for both α and σ derived in Subsections 2.7.1 and 2.7.2 are shorter than the confidence intervals based on maximum likelihood estimation in almost all cases, which indicates that the confidence intervals obtained using the Wang et al. (2010) method outperform those of the maximum likelihood method.

2.8 Discussion

In this chapter, a new distribution called the Rayleigh Lomax distribution has been proposed. Some of its mathematical properties, including the explicit formula for the density and explicit expressions for the moments, L-moments, order statistics, quantile function, probability weighted moments and moment generating function, have been discussed. The MLE method is used for estimating the model parameters and the variance is determined. The new model is fitted to a real data set and the result shows the usefulness of the Rayleigh Lomax distribution in practice. It illustrates that the Rayleigh Lomax model provides better fits than other generalized classes of distributions for this data set. The performance of the MLE is compared with the method of moments and L-moments through the same data set. It is evident that the Rayleigh Lomax distribution has the ability to fit this real data set using all three methods of estimation. Finally, we transform the RL distribution to a lifetime distribution and obtain the inverse estimation and confidence intervals in the case of the progressively Type-II right censored situation.

CHAPTER 3

TRANSMUTED RAYLEIGH LOMAX DISTRIBUTION

3.1 Introduction

The accuracy of the procedures used in statistical analysis depends dramatically on the assumed probability distributions. For this reason, much work has been devoted to developing new generalized models to serve real life applications. For example, Eugene et al. (2002) presented the shape properties of the unimodal beta-normal distribution. Gupta and Nadarajah (2005) derived a more general expression for the nth moment of the beta normal distribution. Kundu and Raqab (2005) discussed various estimators and showed the behavior of the estimators of the unknown parameters for different sample sizes. Cordeiro et al. (2011) applied the exponentiated generalized gamma distribution to lifetime data. A new generalized model for the Lomax distribution, called the Rayleigh Lomax (RL) distribution, was derived using the expression given by El-Bassiouny et al. (2015) as follows. A random variable X has the RL distribution with three parameters α, λ and σ if its cumulative distribution function (cdf) is given by

$$G(x) = 1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}, \quad x > -\lambda,\ \alpha, \lambda, \sigma > 0, \tag{3.1.1}$$

where $\alpha$ is the shape parameter, and $\lambda$ and $\sigma$ are scale parameters. The corresponding probability density function (pdf) is given by

$$g(x) = \frac{\alpha}{\lambda\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}, \quad x > -\lambda,\ \alpha, \lambda, \sigma > 0. \tag{3.1.2}$$

According to Aryal and Tsokos (2009), not all phenomena can be studied by symmetrical models such as the normal distribution, since some phenomena are asymmetric. Hence, skewed models are necessary complements for studying real life events. Edgeworth (1886) investigated the problem of fitting asymmetrical distributions to asymmetrical data. Azzalini (1985) introduced the univariate skew normal family for modeling asymmetric data. Gupta (2003) defined the multivariate skew t-distribution, which has some of the properties of the multivariate t-distribution and has a shape parameter to represent skewness. Ning and Gupta (2012) generalized the univariate extended skew normal distribution family to the matrix variate case.

3.2 The Transmuted Rayleigh Lomax distribution

3.2.1 Rank transmutation

Let F and G be the cumulative distribution functions of two distributions with a common sample space. The general rank transmutation is defined as

$$D(u) = F(G^{-1}(u)), \tag{3.2.1}$$

which is the functional composition of the cumulative distribution function $F$ with the inverse cumulative distribution function (or quantile function) $G^{-1}$. It is obvious that $D(0) = 0$ and $D(1) = 1$ for $u \in (0,1)$. According to Shaw and Buckley (2007), a quadratic rank transmutation map is defined as

$$D(u) = u + \beta u(1-u), \quad |\beta| \le 1, \tag{3.2.2}$$

from which it follows that the cdfs satisfy the relationship

$$F(x) = G(x) + \beta G(x)(1 - G(x)), \tag{3.2.3}$$

where $G(x)$ is the cdf of the base distribution and $F(x)$ is the cdf of the transmuted distribution. The corresponding pdf is given by taking the derivative of (3.2.3),

$$f(x) = g(x)[1 + \beta - 2\beta G(x)]. \tag{3.2.4}$$

Equation (3.2.3) is called the transmuted class (TC) of distributions. Note that $F(x) = G(x)$ when $\beta$ equals zero. According to Bourguignon et al. (2016), a random variable $X$ is said to have an exponentiated-G (Exp-G) distribution with power parameter $a > 0$ if its cdf and pdf are given by

$$\Pi(x; a) = G(x)^a, \tag{3.2.5}$$

and

$$\pi(x; a) = aG(x)^{a-1}g(x), \tag{3.2.6}$$

where $G$ is the cdf of the base distribution. Therefore, the density function in (3.2.4) can be expressed as the linear mixture

$$f(x; \beta) = (1 + \beta)g(x) - \beta\pi(x; 2). \tag{3.2.7}$$

From (3.2.7), we observe that $f(x; \beta) = \pi(x; 2)$ for $\beta = -1$. Recently, several distributions have been suggested using the TC of distributions. Aryal and Tsokos (2011) introduced a new generalization of the Weibull distribution called the transmuted Weibull distribution. Merovci (2014) applied the transmuted generalized Rayleigh distribution to the nicotine measurements made in several brands of cigarettes in 1998. Elbatal et al. (2014) proposed a new generalization of the exponentiated Fréchet distribution called the transmuted exponentiated Fréchet distribution. Acik Kemaloglu and Yilmaz (2017) derived the transmuted two-parameter Lindley distribution and applied it to three different data sets. This chapter aims to introduce a new generalization of the RL distribution called the transmuted Rayleigh Lomax (TR-RL) distribution, which is derived using the QRTM (Shaw and Buckley (2007)). The TR-RL distribution can be applied to symmetric or asymmetric models to provide great flexibility in modeling data. In this chapter, we study the TR-RL distribution and its mathematical and distributional properties, such as the method of moments estimators, L-moments, probability weighted moments and moment generating function. We obtain the maximum likelihood estimator and the Fisher information matrix. The proposed distribution is fitted to a well known data set and a goodness of fit test is presented to determine how well the transmuted Rayleigh Lomax distribution fits the data.
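The TC identities (3.2.3), (3.2.4) and (3.2.7) are straightforward to verify numerically. The sketch below uses a standard exponential base distribution purely as an illustrative choice (the base and the value β = 0.4 are assumptions, not from the text) and checks that the mixture representation and the derivative of the transmuted cdf both reproduce the transmuted pdf.

```python
import math

beta = 0.4                      # transmuting parameter, an arbitrary choice

def G(x):                       # base cdf: standard exponential (assumed base)
    return 1 - math.exp(-x)

def g(x):                       # base pdf
    return math.exp(-x)

def F(x):                       # transmuted cdf, eq. (3.2.3)
    return G(x) + beta * G(x) * (1 - G(x))

def f(x):                       # transmuted pdf, eq. (3.2.4)
    return g(x) * (1 + beta - 2 * beta * G(x))

def pi2(x):                     # Exp-G density with power a = 2, eq. (3.2.6)
    return 2 * G(x) * g(x)

for x in (0.1, 0.5, 1.0, 2.5):
    mixture = (1 + beta) * g(x) - beta * pi2(x)     # eq. (3.2.7)
    deriv = (F(x + 1e-6) - F(x - 1e-6)) / 2e-6      # numerical dF/dx
    print(x, f(x), mixture, deriv)
```

The three columns agree to numerical precision, confirming that (3.2.4) and (3.2.7) describe the same density.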

3.2.2 The Transmuted Rayleigh Lomax distribution

If we choose $G(x)$ in (3.2.3) to be the cdf of the RL distribution defined in (3.1.1), we obtain the cdf of the TR-RL distribution:

$$\begin{aligned} F(x) &= (1+\beta)\left(1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right) - \beta\left[1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right]^2 \\ &= \left(1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)\left[1 + \beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right], \end{aligned} \tag{3.2.8}$$

where $\alpha, \lambda, \sigma > 0$ and $|\beta| \le 1$. The corresponding pdf of the TR-RL distribution with parameters $\alpha, \lambda, \sigma$ and $\beta$ is given as

$$f(x) = \frac{\alpha}{\lambda\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right]. \tag{3.2.9}$$

A random variable $X$ with the pdf (3.2.9) is denoted by $X \sim TR\text{-}RL(\alpha, \lambda, \sigma, \beta)$. When the transmuting parameter $\beta = 0$, it reduces to the RL distribution. The TR-RL distribution is a very flexible model in that different shapes of the distribution can be obtained by changing its parameters. The following lemma shows the relationship between the TR-RL distribution and other well known distributions.

Lemma 3.2.1. a) Let $X \sim TR\text{-}RL(\frac{1}{2}, \lambda, \sigma, \beta)$. Then $Y = \frac{X+\lambda}{\lambda} \sim TR\text{-}exponential(2\sigma^2, \beta)$.
b) Let $X \sim TR\text{-}RL(\alpha, \lambda, \frac{1}{\sqrt{2}}, \beta)$. Then $Y = \frac{X+\lambda}{\lambda} \sim TR\text{-}Weibull(2\alpha, 1, \beta)$.
c) Let $X \sim TR\text{-}RL(1, \lambda, \sigma, \beta)$. Then $Y = \frac{X+\lambda}{\lambda} \sim TR\text{-}Rayleigh(\sigma, \beta)$.

Proof. We use the transformation-of-random-variable technique, with $f(x)$ the TR-RL pdf defined in (3.2.9). In each part, let $y = \frac{x+\lambda}{\lambda}$, so that $\left|\frac{dx}{dy}\right| = \lambda$ and the pdf of the random variable $Y$ is

$$f(y) = \frac{\alpha}{\sigma^2}\, y^{2\alpha-1}\, e^{-\frac{1}{2\sigma^2}y^{2\alpha}}\left[1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}y^{2\alpha}}\right].$$

a) Setting $\alpha = \frac{1}{2}$ gives $f(y) = \frac{1}{2\sigma^2} e^{-\frac{y}{2\sigma^2}}\left[1-\beta+2\beta e^{-\frac{y}{2\sigma^2}}\right]$. Thus, $Y \sim TR\text{-}exponential(2\sigma^2, \beta)$.

b) Setting $\sigma^2 = \frac{1}{2}$ gives $f(y) = 2\alpha y^{2\alpha-1} e^{-y^{2\alpha}}\left[1-\beta+2\beta e^{-y^{2\alpha}}\right]$. Thus, $Y \sim TR\text{-}Weibull(2\alpha, 1, \beta)$.

c) Setting $\alpha = 1$ gives $f(y) = \frac{y}{\sigma^2} e^{-\frac{y^2}{2\sigma^2}}\left[1-\beta+2\beta e^{-\frac{y^2}{2\sigma^2}}\right]$. Hence, $Y \sim TR\text{-}Rayleigh(\sigma, \beta)$.
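Part (c) of the lemma can be confirmed numerically by comparing cdfs: if $X \sim TR\text{-}RL(1, \lambda, \sigma, \beta)$, then $P(Y \le y) = F_X(\lambda y - \lambda)$ should coincide with the transmuted Rayleigh cdf built from the base $G(y) = 1 - e^{-y^2/(2\sigma^2)}$. The parameter values below are illustrative assumptions.

```python
import math

alpha, lam, sigma, beta = 1.0, 2.0, 1.5, -0.3   # illustrative values (assumed)

def F_trrl(x):
    """TR-RL cdf (3.2.8)."""
    t = math.exp(-((x + lam) / lam) ** (2 * alpha) / (2 * sigma**2))
    return (1 - t) * (1 + beta * t)

def F_tr_rayleigh(y):
    """Transmuted Rayleigh cdf from the base G(y) = 1 - exp(-y^2/(2 sigma^2))."""
    t = math.exp(-(y**2) / (2 * sigma**2))
    return (1 - t) * (1 + beta * t)

# Y = (X + lam)/lam, so P(Y <= y) = F_X(lam*y - lam); the two cdfs should coincide
for yv in (0.2, 0.8, 1.5, 3.0):
    print(yv, F_trrl(lam * yv - lam), F_tr_rayleigh(yv))
```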

3.3 Distributional Properties

This section studies the distributional properties of the TR-RL distribution, such as the shape of the density function, moments, L-moments, the rth order statistic, the quantile function, probability weighted moments and the moment generating function.

3.3.1 Shape of pdf

Figures 3.1-3.4 show several shapes of the density function for different parameter values. Figure 3.1 shows the effect of the shape parameter α on the shape of the TR-RL density while the scale parameters λ and σ are held constant. It is observed that as α increases, the TR-RL density curve degenerates to zero. In addition, Figures 3.2 and 3.3 illustrate that the density function of the TR-RL gets stretched out as λ or σ is increased. Figure 3.4 shows the convergence of the TR-RL density to the RL density as the transmuting parameter β approaches zero.

Figure 3.1: Probability density function of the TR-RL as α increases and decreases.

Figure 3.2: Probability density function of the TR-RL as λ increases and decreases.

Figure 3.3: Probability density function of the TR-RL as σ increases and decreases.

Figure 3.4: Probability density functions of TR-RL and RL (dotted curve).

3.3.2 Moments

The rth central moment of the TR-RL distribution is given as follows.

Theorem 3.3.1. The rth central moment of the TR-RL distribution is given as

$$E(X - \mu)^r = \sum_{j=0}^{r}(-1)^{r-j}\binom{r}{j}\lambda^j \sigma^{\frac{j}{\alpha}}(\mu+\lambda)^{r-j}\,\Gamma\!\left(\frac{j}{2\alpha}+1\right)\left[(1-\beta)2^{\frac{j}{2\alpha}} + \beta\right], \tag{3.3.1}$$

where $\Gamma(a) = \int_0^\infty t^{a-1}e^{-t}\,dt$ is the gamma function.

Proof. The rth central moment of a random variable $X$ can be derived by integration as follows:

$$E(X-\mu)^r = \int_{-\lambda}^{\infty}(x-\mu)^r f(x)\,dx = \int_{-\lambda}^{\infty}(x-\mu)^r \frac{\alpha}{\lambda\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] dx.$$

Let $u = \frac{1}{\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}$. Then

$$\begin{aligned} E(X-\mu)^r &= \frac{1}{2}\int_0^{\infty}\left(\lambda u^{\frac{1}{2\alpha}}\sigma^{\frac{1}{\alpha}} - \lambda - \mu\right)^r e^{-\frac{u}{2}}\left[1-\beta+2\beta e^{-\frac{u}{2}}\right] du \\ &= \frac{1}{2}\sum_{j=0}^{r}(-1)^{r-j}\binom{r}{j}\lambda^j\sigma^{\frac{j}{\alpha}}(\mu+\lambda)^{r-j}\int_0^{\infty} u^{\frac{j}{2\alpha}} e^{-\frac{u}{2}}\left[1-\beta+2\beta e^{-\frac{u}{2}}\right] du \\ &= \frac{1}{2}\sum_{j=0}^{r}(-1)^{r-j}\binom{r}{j}\lambda^j\sigma^{\frac{j}{\alpha}}(\mu+\lambda)^{r-j}\left[(1-\beta)\,\Gamma\!\left(\frac{j}{2\alpha}+1\right)2^{\frac{j}{2\alpha}+1} + 2\beta\,\Gamma\!\left(\frac{j}{2\alpha}+1\right)\right] \\ &= \sum_{j=0}^{r}(-1)^{r-j}\binom{r}{j}\lambda^j\sigma^{\frac{j}{\alpha}}(\mu+\lambda)^{r-j}\,\Gamma\!\left(\frac{j}{2\alpha}+1\right)\left[(1-\beta)2^{\frac{j}{2\alpha}} + \beta\right], \end{aligned} \tag{3.3.2}$$

where µ = E(X) is the mean of the TR-RL distribution.

The following corollary gives the rth moment of the transmuted distribution when the rth moment of the baseline distribution exists.

Corollary 3.3.1. (Elbatal et al. (2014)). Let $F(x)$ and $G(x)$ be the cumulative distribution functions of the transmuted and baseline distributions, respectively. Then

$$\mu_r(F) = (1+\beta)\mu_r(G) - 2\beta E_G\!\left(X^r G(X)\right), \tag{3.3.3}$$

where $\mu_r(G) = \int x^r g(x)\,dx$ and $g(x)$ is the pdf of the baseline distribution.

From Corollary 3.3.1, we can immediately obtain the mean and variance of the TR-RL distribution as follows.

$$E(X) = -\lambda + \lambda\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)\left[(1-\beta)2^{\frac{1}{2\alpha}} + \beta\right],$$

$$Var(X) = \lambda^2\sigma^{\frac{2}{\alpha}}\,\Gamma\!\left(\frac{1}{\alpha}+1\right)\left[(1-\beta)2^{\frac{1}{\alpha}} + \beta\right] - \lambda^2\sigma^{\frac{2}{\alpha}}\left[\Gamma\!\left(\frac{1}{2\alpha}+1\right)\right]^2\left[(1-\beta)2^{\frac{1}{2\alpha}} + \beta\right]^2,$$

where $X$ is a random variable from the TR-RL distribution with $\alpha, \lambda, \sigma > 0$ and $|\beta| \le 1$.
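These closed forms are easy to spot-check against Table 3.1; for instance, at α = 1, λ = 1, σ = 1, β = 0.5 they should reproduce the tabulated mean 0.0698 and variance 0.3555.

```python
import math

def trrl_mean(alpha, lam, sigma, beta):
    """Mean of the TR-RL distribution from Corollary 3.3.1."""
    g = math.gamma(1 / (2 * alpha) + 1)
    return -lam + lam * sigma ** (1 / alpha) * g * ((1 - beta) * 2 ** (1 / (2 * alpha)) + beta)

def trrl_var(alpha, lam, sigma, beta):
    """Variance of the TR-RL distribution: second moment about -lam minus squared shift."""
    s2a = lam**2 * sigma ** (2 / alpha)
    m2 = s2a * math.gamma(1 / alpha + 1) * ((1 - beta) * 2 ** (1 / alpha) + beta)
    m1 = math.gamma(1 / (2 * alpha) + 1) * ((1 - beta) * 2 ** (1 / (2 * alpha)) + beta)
    return m2 - s2a * m1**2

# spot-check against Table 3.1 entries (lambda = 1, sigma = 1, beta = 0.5)
print(trrl_mean(1, 1, 1, 0.5), trrl_var(1, 1, 1, 0.5))   # ~0.0698, ~0.3555
print(trrl_mean(2, 1, 1, 0.5), trrl_var(2, 1, 1, 0.5))   # ~-0.0078, ~0.0854
```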

Corollary 3.3.2. (I) The limits of the mean and variance of the TR-RL distribution as $\alpha \to \infty$, while $\lambda, \sigma$ and $\beta$ are fixed and finite, are $\lim_{\alpha\to\infty} E(X) = 0$ and $\lim_{\alpha\to\infty} Var(X) = 0$.
(II) The limits of the mean and variance of the TR-RL distribution as $\beta \to 0$, while $\alpha, \lambda$ and $\sigma$ are fixed and finite, are $\lim_{\beta\to 0} E(X) = -\lambda + \lambda\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)2^{\frac{1}{2\alpha}}$ and $\lim_{\beta\to 0} Var(X) = \lambda^2\sigma^{\frac{2}{\alpha}}\,2^{\frac{1}{\alpha}}\left[\Gamma\!\left(\frac{1}{\alpha}+1\right) - \left(\Gamma\!\left(\frac{1}{2\alpha}+1\right)\right)^2\right]$, which are the mean and variance of the RL distribution.
(III) The limits of the mean and variance of the TR-RL distribution as $\sigma \to 0$, while $\alpha, \lambda$ and $\beta$ are fixed and finite, are $\lim_{\sigma\to 0} E(X) = -\lambda$ and $\lim_{\sigma\to 0} Var(X) = 0$.
(IV) The limits of the mean and variance of the TR-RL distribution as $\sigma \to \infty$, while $\alpha, \lambda$ and $\beta$ are fixed and finite, are $\lim_{\sigma\to\infty} E(X) = \infty$ and $\lim_{\sigma\to\infty} Var(X) = \infty$.
(V) The limits of the mean and variance of the TR-RL distribution as $\lambda \to \infty$ and $\sigma \to 0$, while $\alpha$ and $\beta$ are fixed and finite, are $\lim_{\lambda\to\infty,\,\sigma\to 0} E(X) = -\infty$ and $\lim_{\lambda\to\infty,\,\sigma\to 0} Var(X) = 0$.

Numerical values of the mean and variance of the TR-RL distribution, obtained using Corollary 3.3.1 and Theorem 3.3.1 for different values of the TR-RL parameters, are given in Tables 3.1 and 3.2; they confirm the results in Corollary 3.3.2 (I and V). It can be observed from Table 3.1 that both the mean and the variance approach zero as the shape parameter α increases. Moreover, as λ increases and σ decreases, the mean approaches −∞ and the variance converges to zero, as shown in Table 3.2.

3.3.3 L-moments

The L-moments were introduced by Hosking (1990) as linear combinations of the order statistics. Compared with conventional moments, L-moments lead to less bias in estimates. They also approximate the normal distribution more closely in finite samples (Hosking, 1990). Parameter estimates obtained from L-moments are sometimes more precise in small samples than the maximum likelihood estimates.

Table 3.1: Mean and variance of the TR-RL distribution as α increases, for different values of α, λ and σ.

λ = 1, σ = 1, β = 0.5    λ = 1.5, σ = 2, β = 0.5    λ = 2, σ = 3, β = 0.5
α Mean Variance Mean Variance Mean Variance
1 0.0698 0.3555 1.7093 3.2003 4.4186 12.8013
2 -0.0078 0.0854 0.6047 0.3843 1.4369 1.0248
3 -0.0155 0.0397 0.3606 0.1419 0.8399 0.3307
4 -0.0156 0.0232 0.2559 0.0738 0.5909 0.1606
100 -0.0011 4.3729e-05 0.0087 9.9766e-05 0.0198 0.0002
λ = 1, σ = 1, β = 1    λ = 1.5, σ = 2, β = 1    λ = 2, σ = 3, β = 1
α Mean Variance Mean Variance Mean Variance
1 -0.1138 0.2146 1.1587 1.9314 3.3174 7.7257
2 -0.0936 0.0647 0.4228 0.2909 1.1399 0.7759
3 -0.0723 0.0323 0.2533 0.1154 0.6760 0.2689
4 -0.0583 0.0195 0.1799 0.0621 0.4788 0.1353
100 -0.0029 4.0593e-05 0.0061 9.2608e-05 0.0163 0.0002
λ = 1, σ = 1, β = −1    λ = 1.5, σ = 2, β = −1    λ = 2, σ = 3, β = −1
α Mean Variance Mean Variance Mean Variance
1 0.6204 0.3743 3.3612 3.3687 7.7224 13.4748
2 0.2494 0.0594 1.1504 0.2673 2.3280 0.7129
3 0.1549 0.0233 0.6827 0.0832 1.3314 0.1939
4 0.1122 0.0124 0.4839 0.0394 0.9275 0.0858
100 0.0040 1.7189e-05 0.0166 3.9215e-05 0.0303 7.0284e-05

Theorem 3.3.2. Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from the TR-RL distribution with parameters $\alpha, \lambda, \sigma > 0$ and $|\beta| \le 1$. The rth population L-moment is built on

$$E(X_{r:n}) = \zeta_{j,k,m}\left\{(1-\beta)\,\frac{2}{k+m+1}\left[\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)\left(\frac{2}{k+m+1}\right)^{\frac{1}{2\alpha}} - 1\right] + 2\beta\,\frac{2}{k+m+2}\left[\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)\left(\frac{2}{k+m+2}\right)^{\frac{1}{2\alpha}} - 1\right]\right\}, \tag{3.3.4}$$

where $\zeta_{j,k,m} = \frac{\lambda}{2}\sum_{j=0}^{n-r}\sum_{k,m=0}^{j+r-1}\binom{n-r}{j}\binom{j+r-1}{k}\binom{j+r-1}{m}(-1)^{j+k}\beta^m$.

Table 3.2: Mean and variance of the TR-RL distribution as λ increases and σ decreases, for different values of α and β.

α = 1, β = 1    α = 0.5, β = 0.5
λ σ Mean Variance Mean Variance
10 1 -1.1377 21.4602 5 275
20 0.5 -11.1377 21.4602 -12.5 68.75
40 0.05 -38.2275 0.8584 -39.85 0.0275
100 0.002 -99.8227 0.0086 -99.9994 4.4e-07
800 0.0001 -4999.5570 0.0013 -800 1.76e-10
α = 2, β = −1    α = 0.8, β = −0.5
λ σ Mean Variance Mean Variance
10 1 2.4939 12.1333 6.2578 83.2488
20 0.5 -2.3308 19.2605 -6.3289 58.8658
40 0.05 -28.8250 3.5759 -38.4624 0.7446
100 0.002 -94.4125 0.3057 -99.9312 0.0015
800 0.0001 -4937.53 0.3604 -799.987 5.3279e-05

Proof. The rth population L-moment of a random variable $X$ can be derived using the following integration.

$$\begin{aligned} E(X_{r:n}) = \int_{-\lambda}^{\infty} x\, &\frac{\alpha}{\lambda\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] \\ &\cdot \left(1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{r-1}\left(1 + \beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{r-1} \\ &\cdot \left[1 - \left(1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)\left(1 + \beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)\right]^{n-r} dx. \end{aligned}$$

Let $u = \frac{1}{\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}$. We obtain

$$E(X_{r:n}) = \frac{\lambda}{2}\int_0^{\infty}\left(u^{\frac{1}{2\alpha}}\sigma^{\frac{1}{\alpha}} - 1\right)\left(1-\beta+2\beta e^{-\frac{u}{2}}\right)\left(1-e^{-\frac{u}{2}}\right)^{r-1}\left(1+\beta e^{-\frac{u}{2}}\right)^{r-1}\underbrace{\left[1-\left(1-e^{-\frac{u}{2}}\right)\left(1+\beta e^{-\frac{u}{2}}\right)\right]^{n-r}}_{A}\, e^{-\frac{u}{2}}\,du.$$

By expanding the quantity $A$ in a power series,

$$A = \sum_{j=0}^{n-r}\binom{n-r}{j}(-1)^j\left(1-e^{-\frac{u}{2}}\right)^j\left(1+\beta e^{-\frac{u}{2}}\right)^j,$$

we obtain

$$E(X_{r:n}) = \frac{\lambda}{2}\sum_{j=0}^{n-r}\binom{n-r}{j}(-1)^j\int_0^{\infty}\left(u^{\frac{1}{2\alpha}}\sigma^{\frac{1}{\alpha}} - 1\right)\left(1-\beta+2\beta e^{-\frac{u}{2}}\right)\left(1-e^{-\frac{u}{2}}\right)^{j+r-1}\left(1+\beta e^{-\frac{u}{2}}\right)^{j+r-1} e^{-\frac{u}{2}}\,du.$$

Considering

$$\left(1-e^{-\frac{u}{2}}\right)^{j+r-1} = \sum_{k=0}^{j+r-1}\binom{j+r-1}{k}(-1)^k e^{-\frac{ku}{2}}$$

and

$$\left(1+\beta e^{-\frac{u}{2}}\right)^{j+r-1} = \sum_{m=0}^{j+r-1}\binom{j+r-1}{m}\beta^m e^{-\frac{mu}{2}},$$

we obtain

$$\begin{aligned} E(X_{r:n}) &= \zeta_{j,k,m}\int_0^{\infty}\left(u^{\frac{1}{2\alpha}}\sigma^{\frac{1}{\alpha}} - 1\right) e^{-\frac{(k+m+1)u}{2}}\left(1-\beta+2\beta e^{-\frac{u}{2}}\right) du \\ &= \zeta_{j,k,m}\left\{(1-\beta)\,\frac{2}{k+m+1}\left[\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)\left(\frac{2}{k+m+1}\right)^{\frac{1}{2\alpha}} - 1\right] + 2\beta\,\frac{2}{k+m+2}\left[\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)\left(\frac{2}{k+m+2}\right)^{\frac{1}{2\alpha}} - 1\right]\right\}, \end{aligned}$$

where $\zeta_{j,k,m} = \frac{\lambda}{2}\sum_{j=0}^{n-r}\sum_{k,m=0}^{j+r-1}\binom{n-r}{j}\binom{j+r-1}{k}\binom{j+r-1}{m}(-1)^{j+k}\beta^m$.

If $r = n = 1$, we have

$$E(X) = -\lambda + \lambda\sigma^{\frac{1}{\alpha}}\,\Gamma\!\left(\frac{1}{2\alpha}+1\right)\left[(1-\beta)2^{\frac{1}{2\alpha}} + \beta\right],$$

which is the mean of the TR-RL distribution.

3.3.4 Order statistics

In this subsection, we derive an explicit form of the probability density function of the TR-RL order statistics as follows.

Theorem 3.3.3. Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ with $F(x)$ and $f(x)$ being the cdf and the pdf of the TR-RL distribution given in (3.2.8) and (3.2.9). The density of the rth order statistic is given by

$$\begin{aligned} f_{(r)}(x) = \frac{n!\,\alpha}{(r-1)!(n-r)!\,\lambda\sigma^2}\sum_{j=0}^{n-r}\binom{n-r}{j}(-1)^j &\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] \\ &\cdot\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{j+r-1}\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{j+r-1}. \end{aligned}$$

Proof. Substituting the cdf $F(x)$ and the pdf $f(x)$ from equations (3.2.8) and (3.2.9) into the probability density function of order statistics defined by David and Nagaraja (1970), we have

$$\begin{aligned} f_{(r)}(x) = \frac{n!\,\alpha}{(r-1)!(n-r)!\,\lambda\sigma^2} &\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] \\ &\cdot\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{r-1}\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{r-1} \\ &\cdot\underbrace{\left[1-\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)\right]^{n-r}}_{D}. \end{aligned}$$

By expanding the quantity $D$ in a power series,

$$D = \sum_{j=0}^{n-r}\binom{n-r}{j}(-1)^j\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^j\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^j,$$

we have

$$\begin{aligned} f_{(r)}(x) = \frac{n!\,\alpha}{(r-1)!(n-r)!\,\lambda\sigma^2}\sum_{j=0}^{n-r}\binom{n-r}{j}(-1)^j &\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] \\ &\cdot\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{j+r-1}\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{j+r-1}. \end{aligned} \tag{3.3.5}$$

Note that when $\beta = 0$, we obtain the rth order statistic of the RL distribution. The density functions of the smallest and the largest order statistics are given by

$$f_{(1)}(x) = \frac{n\alpha}{\lambda\sigma^2}\sum_{j=0}^{n-1}\binom{n-1}{j}(-1)^j\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right]\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{j}\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{j}, \tag{3.3.6}$$

$$f_{(n)}(x) = \frac{n\alpha}{\lambda\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\left[1-\beta+2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right]\left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{n-1}\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)^{n-1}. \tag{3.3.7}$$

3.3.5 Quantile function

Let X be a random variable with the distribution function in (3.2.8), and p ∈ (0, 1). The quantile function of the TR-RL distribution is obtained by inverting the distribution function F in (3.2.8) as follows.

$$F(x) = p = \left(1-e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right)\left(1+\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right), \quad \alpha, \lambda, \sigma > 0,\ x > -\lambda,\ |\beta| \le 1. \tag{3.3.8}$$

Let $t = e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}$. Then equation (3.3.8) can be written as

$$\beta t^2 - t(\beta-1) - (1-p) = 0. \tag{3.3.9}$$

Solving equation (3.3.9) and simplifying, we obtain

$$t = \frac{(\beta-1) \pm \sqrt{(\beta-1)^2 + 4\beta(1-p)}}{2\beta} = \frac{(\beta-1) \pm \sqrt{(\beta+1)^2 - 4\beta p}}{2\beta}. \tag{3.3.10}$$

The roots in (3.3.10) exist since the discriminant $\Delta = (\beta+1)^2 - 4\beta p$ is always positive for any value of $\beta \in [-1,1]$. To show this, we consider the following cases:

p 2 −1 ( x+λ )2α (β − 1) ± (β + 1) − 4βp e 2σ2 λ = . 2β

Simplifying we get

1 " # 2α x + λ (β − 1) ± p(β + 1)2 − 4βp = −2σ2ln[ ] . λ 2β

From the previous cases and since the argument inside the logarithm ln must be positive, the quantile function of the TR-RL distribution is

1 " " ## 2α (β − 1) + p(β + 1)2 − 4βp x = λ −2σ2ln − λ. (3.3.11) p 2β

In particular, the median is

1 " " ## 2α (β − 1) + p(β + 1)2 − 2β x = λ −2σ2ln − λ. (3.3.12) 0.5 2β

Median is used as a robust measure for location since it is not too sensitive to outliers. Figures 3.5 and 3.6 illustrate the relationship between the shape parameter α, the scale parameter λ and the median for different values of the transmuting parameter β, when σ = 1. It is obvious that as α increases to infinity and λ moves toward zero, the median approaches zero which is certified by the next theorem. 64

Figure 3.5: The relationship between α and the median of the TR-RL distribution.

Theorem 3.3.4. Let X be a random variable having the TR-RL distribution. Consider the following three cases.

Case 1: If $\beta = -1$, then $x_p = \lambda\left[-2\sigma^2 \ln(1 - \sqrt{p})\right]^{\frac{1}{2\alpha}} - \lambda$. Therefore,
\[\lim_{\alpha \to \infty} x_p = 0 \quad \text{and} \quad \lim_{\alpha \to 0} x_p = \begin{cases} -\lambda & \text{for } p < \left(1 - e^{-\frac{1}{2\sigma^2}}\right)^2 \\ \infty & \text{for } p > \left(1 - e^{-\frac{1}{2\sigma^2}}\right)^2 \end{cases},\]
where $\lambda$, $\sigma$ and $\beta$ are fixed and finite. Furthermore, $\lim_{\lambda \to 0} x_p = 0$ and
\[\lim_{\lambda \to \infty} x_p = \begin{cases} -\infty & \text{for } p < \left(1 - e^{-\frac{1}{2\sigma^2}}\right)^2 \\ \infty & \text{for } p > \left(1 - e^{-\frac{1}{2\sigma^2}}\right)^2 \end{cases},\]
where $\alpha$, $\sigma$ and $\beta$ are fixed and finite.

Case 2: If $\beta = 1$, then $x_p = \lambda\left[-2\sigma^2 \ln \sqrt{1 - p}\right]^{\frac{1}{2\alpha}} - \lambda$. Therefore,
\[\lim_{\alpha \to \infty} x_p = 0 \quad \text{and} \quad \lim_{\alpha \to 0} x_p = \begin{cases} -\lambda & \text{for } p < 1 - e^{-\frac{1}{\sigma^2}} \\ \infty & \text{for } p > 1 - e^{-\frac{1}{\sigma^2}} \end{cases},\]
where $\lambda$, $\sigma$ and $\beta$ are fixed and finite. Moreover, $\lim_{\lambda \to 0} x_p = 0$ and
\[\lim_{\lambda \to \infty} x_p = \begin{cases} -\infty & \text{for } p < 1 - e^{-\frac{1}{\sigma^2}} \\ \infty & \text{for } p > 1 - e^{-\frac{1}{\sigma^2}} \end{cases},\]
where $\alpha$, $\sigma$ and $\beta$ are fixed and finite.

Case 3: If $-1 < \beta < 1$, then $x_p = \lambda\left[-2\sigma^2 \ln\!\left(\frac{(\beta - 1) + \sqrt{(\beta + 1)^2 - 4\beta p}}{2\beta}\right)\right]^{\frac{1}{2\alpha}} - \lambda$. Therefore,
\[\lim_{\alpha \to \infty} x_p = 0 \quad \text{and} \quad \lim_{\alpha \to 0} x_p = \begin{cases} -\lambda & \text{for } p < \frac{-1}{4\beta}\left[\left(2\beta e^{-\frac{1}{2\sigma^2}} - (\beta - 1)\right)^2 - (\beta + 1)^2\right] \\ \infty & \text{for } p > \frac{-1}{4\beta}\left[\left(2\beta e^{-\frac{1}{2\sigma^2}} - (\beta - 1)\right)^2 - (\beta + 1)^2\right] \end{cases},\]
where $\lambda$, $\sigma$ and $\beta$ are fixed and finite. On the other hand, $\lim_{\lambda \to 0} x_p = 0$ and
\[\lim_{\lambda \to \infty} x_p = \begin{cases} -\infty & \text{for } p < \frac{-1}{4\beta}\left[\left(2\beta e^{-\frac{1}{2\sigma^2}} - (\beta - 1)\right)^2 - (\beta + 1)^2\right] \\ \infty & \text{for } p > \frac{-1}{4\beta}\left[\left(2\beta e^{-\frac{1}{2\sigma^2}} - (\beta - 1)\right)^2 - (\beta + 1)^2\right] \end{cases},\]
where $\alpha$, $\sigma$ and $\beta$ are fixed and finite.

Figure 3.6: The relationship between λ and the median of the TR-RL distribution.

Based on the quantile function, another measure of skewness, the Bowley skewness (Bowley (1920)), also known as the quartile skewness coefficient, can be obtained. This measure is considered a robust alternative to the conventional skewness, since the conventional measures are extremely sensitive to a single outlier or to small groups of outliers (Kim and White (2004)). The Bowley skewness is given by
\[SK = \frac{Q(3/4) + Q(1/4) - 2Q(2/4)}{Q(3/4) - Q(1/4)},\]

where $Q(1/4)$, $Q(2/4)$ and $Q(3/4)$ are the first, second and third quartiles, respectively. Applying the QRTM defined in (3.2.2) to any base distribution under the assumption $|\beta| \le 1$ provides flexible distributions that account for skewness in the data.

3.3.6 Probability weighted moments

The probability weighted moments (PWMs) emerged in the late 1970s for the purpose of parameter estimation. These moments can be preferable to other methods such as maximum likelihood estimation (MLE) and the method of moments, especially when the latter are difficult to obtain. The PWMs were later related to the L-moments by Hosking (1990). The $(r, s)$th probability weighted moment of $X$ ($r \ge 1$, $s \ge 0$) is

\[E[X^r F(X)^s] = \int_{-\lambda}^{\infty} x^r F(x)^s f(x)\, dx.\]

From (3.2.8), $F(x)^s = \left[1 - e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right]^s \left[1 + \beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right]^s$. We expand $F(x)^s$ as follows

\[F(x)^s = \sum_{j=0}^{s} (-1)^j \binom{s}{j} e^{-\frac{j}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}} \sum_{k=0}^{s} \binom{s}{k} \beta^{s-k} e^{-\frac{s-k}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}.\]

Then,

\begin{align}
E[X^r F(X)^s] &= \frac{\alpha}{\lambda\sigma^2} \sum_{j,k=0}^{s} (-1)^j \binom{s}{j}\binom{s}{k} \beta^{s-k} \nonumber\\
&\quad \times \int_{-\lambda}^{\infty} x^r \left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{j+s-k}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}} \left(1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right) dx. \tag{3.3.13}
\end{align}

Let $u = \frac{1}{\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}$, so that $x = \lambda u^{\frac{1}{2\alpha}} \sigma^{\frac{1}{\alpha}} - \lambda$ and $\frac{\alpha}{\lambda\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} dx = \frac{1}{2}\, du$. We obtain

\begin{align}
E[X^r F(X)^s] &= \frac{1}{2} \sum_{j,k=0}^{s} (-1)^j \binom{s}{j}\binom{s}{k} \beta^{s-k} \int_{0}^{\infty} \left(\lambda u^{\frac{1}{2\alpha}} \sigma^{\frac{1}{\alpha}} - \lambda\right)^r e^{-(j+s-k+1)\frac{u}{2}} \left(1 - \beta + 2\beta e^{-\frac{u}{2}}\right) du \tag{3.3.14}\\
&= \frac{1}{2} \sum_{j,k=0}^{s} (-1)^j \binom{s}{j}\binom{s}{k} \beta^{s-k} \sum_{i=0}^{r} \binom{r}{i} \int_{0}^{\infty} \left(\lambda u^{\frac{1}{2\alpha}} \sigma^{\frac{1}{\alpha}}\right)^i (-\lambda)^{r-i} e^{-(j+s-k+1)\frac{u}{2}} \left(1 - \beta + 2\beta e^{-\frac{u}{2}}\right) du \tag{3.3.15}\\
&= \frac{1}{2} \sum_{j,k=0}^{s} \sum_{i=0}^{r} (-1)^{r+j-i} \binom{s}{j}\binom{s}{k}\binom{r}{i} \beta^{s-k} \sigma^{\frac{i}{\alpha}} \lambda^r \int_{0}^{\infty} u^{\frac{i}{2\alpha}} e^{-(j+s-k+1)\frac{u}{2}} \left(1 - \beta + 2\beta e^{-\frac{u}{2}}\right) du \tag{3.3.16}\\
&= \frac{1}{2} \sum_{j,k=0}^{s} \sum_{i=0}^{r} (-1)^{r+j-i} \binom{s}{j}\binom{s}{k}\binom{r}{i} \beta^{s-k} \sigma^{\frac{i}{\alpha}} \lambda^r \left[(1-\beta) \int_{0}^{\infty} u^{\frac{i}{2\alpha}} e^{-(j+s-k+1)\frac{u}{2}}\, du + 2\beta \int_{0}^{\infty} u^{\frac{i}{2\alpha}} e^{-(j+s-k+2)\frac{u}{2}}\, du\right] \nonumber\\
&= \frac{1}{2} \sum_{j,k=0}^{s} \sum_{i=0}^{r} (-1)^{r+j-i} \binom{s}{j}\binom{s}{k}\binom{r}{i} \beta^{s-k} \sigma^{\frac{i}{\alpha}} \lambda^r\, \Gamma\!\left(\frac{i}{2\alpha}+1\right) \left[(1-\beta)\left(\frac{2}{j+s-k+1}\right)^{\frac{i}{2\alpha}+1} + 2\beta\left(\frac{2}{j+s-k+2}\right)^{\frac{i}{2\alpha}+1}\right]. \tag{3.3.17}
\end{align}

In particular, we obtain the $r$th raw moments $\mu'_r$ by setting $s = 0$.
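As a numerical sanity check (our own illustrative sketch, not part of the dissertation), the closed form (3.3.17) with $s = 0$ can be compared against direct numerical integration of $x^r f(x)$ over the support; the function names and parameter values below are arbitrary choices.

```python
import math
from scipy.integrate import quad
from scipy.special import comb, gamma

def trrl_pdf(x, a, lam, sig, b):
    # TR-RL density corresponding to the distribution function (3.2.8)
    t = ((x + lam) / lam) ** (2 * a)
    w = math.exp(-t / (2 * sig ** 2))
    return (a / (lam * sig ** 2)) * ((x + lam) / lam) ** (2 * a - 1) * w * (1 - b + 2 * b * w)

def raw_moment_closed(r, a, lam, sig, b):
    # r-th raw moment: (3.3.17) with s = 0, so j = k = 0 and the bracket
    # collapses to (1 - b) * 2^(i/(2a)) + b
    return sum((-1) ** (r - i) * comb(r, i) * sig ** (i / a) * lam ** r
               * gamma(i / (2 * a) + 1) * ((1 - b) * 2 ** (i / (2 * a)) + b)
               for i in range(r + 1))

a, lam, sig, b = 0.8, 1.5, 1.2, 0.4
numeric, _ = quad(lambda x: x * trrl_pdf(x, a, lam, sig, b), -lam, math.inf)
print(numeric, raw_moment_closed(1, a, lam, sig, b))
```

The two printed values agree; for $\alpha = 1/2$, $\lambda = \sigma = 1$, $\beta = 0$ the distribution reduces to a shifted exponential, giving a second closed-form check.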

3.3.7 Moment generating function

Let X be a random variable following the TR-RL distribution. The moment generating function (mgf) of X is given by

\[E(e^{tX}) = \frac{\alpha}{\lambda\sigma^2} \int_{-\lambda}^{\infty} e^{tx} \left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}} \left[1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] dx.\]

Let $e^{tX} = \sum_{r=0}^{\infty} \frac{(tX)^r}{r!}$; then

\[E(e^{tX}) = \sum_{r=0}^{\infty} \frac{t^r}{r!} \cdot \frac{\alpha}{\lambda\sigma^2} \int_{-\lambda}^{\infty} x^r \left(\frac{x+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}} \left[1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x+\lambda}{\lambda}\right)^{2\alpha}}\right] dx. \tag{3.3.18}\]

The mgf can be calculated easily using the $r$th moment $\mu'_r$, since the quantity inside the integral in (3.3.18) is the $r$th moment of the TR-RL distribution. Hence,

\begin{align}
E(e^{tX}) &= \sum_{r=0}^{\infty} \frac{t^r}{r!}\, \mu'_r \nonumber\\
&= \sum_{r=0}^{\infty} \sum_{k=0}^{r} (-1)^{r-k} \binom{r}{k} \frac{\lambda^r t^r}{r!}\, \sigma^{\frac{k}{\alpha}}\, \Gamma\!\left(\frac{k}{2\alpha}+1\right) \left[(1-\beta)\, 2^{\frac{k}{2\alpha}} + \beta\right].
\end{align}

We can also calculate the mgf using the method of integration.

3.4 Estimation

In this section, the maximum likelihood estimators are derived for parameters α, λ, σ and β. The Fisher information matrix is calculated to obtain the asymptotic distribution and the approximate confidence intervals of MLEs for α, λ, σ and β.

3.4.1 MLEs of parameters

Let x = (x1, x2, ..., xn) be a sample of size n from the TR-RL distribution given by (3.2.9). The likelihood function for the vector of parameters Θ = (α, λ, σ, β)T is given by

\[L(x; \alpha, \lambda, \sigma, \beta) = \prod_{i=1}^{n} \frac{\alpha}{\lambda\sigma^2} \left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}} \left[1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}\right]. \tag{3.4.1}\]

Then the log likelihood function can be expressed as

\[\ell(x; \alpha, \lambda, \sigma, \beta) = \sum_{i=1}^{n} \left[\log\alpha - \log\lambda - 2\log\sigma + (2\alpha-1)\log\!\left(\frac{x_i+\lambda}{\lambda}\right) - \frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha} + \log\!\left(1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}\right)\right]. \tag{3.4.2}\]

The maximum likelihood estimates $\hat\alpha$, $\hat\lambda$, $\hat\sigma$ and $\hat\beta$ of the parameters $\alpha$, $\lambda$, $\sigma$ and $\beta$ are the values that maximize the likelihood function in equation (3.4.1). They are obtained by setting the first partial derivatives of the log-likelihood (3.4.2) with respect to $\alpha$, $\lambda$, $\sigma$ and $\beta$ equal to zero:

\[\frac{\partial \ell}{\partial \alpha} = \sum_{i=1}^{n} \left[\frac{1}{\alpha} + 2\log\!\left(\frac{x_i+\lambda}{\lambda}\right) - \frac{1}{\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}\log\!\left(\frac{x_i+\lambda}{\lambda}\right) - \frac{2\beta \left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}\log\!\left(\frac{x_i+\lambda}{\lambda}\right) e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}}{\sigma^2\left(1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}\right)}\right] = 0, \tag{3.4.3}\]

\[\frac{\partial \ell}{\partial \lambda} = \sum_{i=1}^{n} \left[-\frac{1}{\lambda} - \frac{(2\alpha-1)x_i}{\lambda(x_i+\lambda)} + \frac{\alpha x_i}{\lambda^2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha-1} + \frac{2\alpha\beta x_i \left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha-1} e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}}{\lambda^2\sigma^2\left(1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}\right)}\right] = 0, \tag{3.4.4}\]

\[\frac{\partial \ell}{\partial \sigma} = \sum_{i=1}^{n} \left[-\frac{2}{\sigma} + \frac{1}{\sigma^3}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha} + \frac{2\beta \left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha} e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}}{\sigma^3\left(1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}\right)}\right] = 0, \tag{3.4.5}\]

\[\frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{n} \frac{-1 + 2e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}}{1 - \beta + 2\beta e^{-\frac{1}{2\sigma^2}\left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}}} = 0. \tag{3.4.6}\]

Explicit forms of the solutions of equations (3.4.3), (3.4.4), (3.4.5) and (3.4.6) are not available, so numerical solutions are used in practice.

3.4.2 Asymptotic distribution

The asymptotic distribution of the MLEs is used to make inferences about the parameters of the distribution. Therefore, the Fisher information matrix is derived. However, its exact mathematical expression is difficult to obtain in closed form; hence, the Fisher information matrix is inverted numerically using statistical software. The asymptotic Fisher information matrix of the TR-RL distribution can be written as follows.

\[F = \begin{pmatrix}
-\frac{\partial^2 \log L}{\partial \alpha^2} & -\frac{\partial^2 \log L}{\partial \alpha \partial \lambda} & -\frac{\partial^2 \log L}{\partial \alpha \partial \sigma} & -\frac{\partial^2 \log L}{\partial \alpha \partial \beta} \\[4pt]
-\frac{\partial^2 \log L}{\partial \alpha \partial \lambda} & -\frac{\partial^2 \log L}{\partial \lambda^2} & -\frac{\partial^2 \log L}{\partial \lambda \partial \sigma} & -\frac{\partial^2 \log L}{\partial \lambda \partial \beta} \\[4pt]
-\frac{\partial^2 \log L}{\partial \alpha \partial \sigma} & -\frac{\partial^2 \log L}{\partial \lambda \partial \sigma} & -\frac{\partial^2 \log L}{\partial \sigma^2} & -\frac{\partial^2 \log L}{\partial \sigma \partial \beta} \\[4pt]
-\frac{\partial^2 \log L}{\partial \alpha \partial \beta} & -\frac{\partial^2 \log L}{\partial \lambda \partial \beta} & -\frac{\partial^2 \log L}{\partial \sigma \partial \beta} & -\frac{\partial^2 \log L}{\partial \beta^2}
\end{pmatrix}.\]

Mathematically, the second and mixed partial derivatives of the logarithm of the likelihood function can be derived as follows. To keep the expressions compact, write, for each observation,
\[t_i = \left(\frac{x_i+\lambda}{\lambda}\right)^{2\alpha}, \qquad w_i = e^{-\frac{t_i}{2\sigma^2}}, \qquad D_i = 1 - \beta + 2\beta w_i,\]
so that the log-likelihood (3.4.2) is $\ell = \sum_{i=1}^{n} (h_i + \log D_i)$ with
\[h_i = \log\alpha - \log\lambda - 2\log\sigma + (2\alpha-1)\log\!\left(\frac{x_i+\lambda}{\lambda}\right) - \frac{t_i}{2\sigma^2}.\]
For any pair of parameters $\theta, \phi \in \{\alpha, \lambda, \sigma, \beta\}$, the chain rule gives
\[\frac{\partial^2 \ell}{\partial\theta\,\partial\phi} = \sum_{i=1}^{n} \left[\frac{\partial^2 h_i}{\partial\theta\,\partial\phi} + \frac{1}{D_i}\frac{\partial^2 D_i}{\partial\theta\,\partial\phi} - \frac{1}{D_i^2}\frac{\partial D_i}{\partial\theta}\frac{\partial D_i}{\partial\phi}\right]. \tag{3.4.7}\]
The required derivatives of $t_i$, $w_i$ and $D_i$ are
\[\frac{\partial t_i}{\partial\alpha} = 2t_i\log\!\left(\frac{x_i+\lambda}{\lambda}\right), \qquad \frac{\partial t_i}{\partial\lambda} = -\frac{2\alpha x_i t_i}{\lambda(x_i+\lambda)}, \qquad \frac{\partial t_i}{\partial\sigma} = 0,\]
\[\frac{\partial w_i}{\partial\theta} = -\frac{w_i}{2\sigma^2}\frac{\partial t_i}{\partial\theta} \;\;(\theta = \alpha, \lambda), \qquad \frac{\partial w_i}{\partial\sigma} = \frac{t_i}{\sigma^3}\,w_i,\]
\[\frac{\partial D_i}{\partial\theta} = 2\beta\,\frac{\partial w_i}{\partial\theta} \;\;(\theta = \alpha, \lambda, \sigma), \qquad \frac{\partial D_i}{\partial\beta} = 2w_i - 1, \qquad \frac{\partial^2 D_i}{\partial\theta\,\partial\beta} = 2\,\frac{\partial w_i}{\partial\theta}, \qquad \frac{\partial^2 D_i}{\partial\beta^2} = 0,\]
with $\frac{\partial^2 D_i}{\partial\theta\,\partial\phi} = 2\beta\,\frac{\partial^2 w_i}{\partial\theta\,\partial\phi}$ for $\theta, \phi \in \{\alpha, \lambda, \sigma\}$, where the second derivatives of $w_i$ follow by differentiating the expressions above once more. The second derivatives of $h_i$ are
\[\frac{\partial^2 h_i}{\partial\alpha^2} = -\frac{1}{\alpha^2} - \frac{2t_i}{\sigma^2}\log^2\!\left(\frac{x_i+\lambda}{\lambda}\right), \qquad \frac{\partial^2 h_i}{\partial\sigma^2} = \frac{2}{\sigma^2} - \frac{3t_i}{\sigma^4},\]
\[\frac{\partial^2 h_i}{\partial\lambda^2} = \frac{1}{\lambda^2} + \frac{(2\alpha-1)\,x_i(x_i+2\lambda)}{\lambda^2(x_i+\lambda)^2} - \frac{\alpha x_i t_i\,(2\alpha x_i + x_i + 2\lambda)}{\sigma^2\lambda^2(x_i+\lambda)^2},\]
\[\frac{\partial^2 h_i}{\partial\alpha\,\partial\lambda} = -\frac{2x_i}{\lambda(x_i+\lambda)} + \frac{x_i t_i}{\sigma^2\lambda(x_i+\lambda)}\left[1 + 2\alpha\log\!\left(\frac{x_i+\lambda}{\lambda}\right)\right],\]
\[\frac{\partial^2 h_i}{\partial\alpha\,\partial\sigma} = \frac{2t_i}{\sigma^3}\log\!\left(\frac{x_i+\lambda}{\lambda}\right), \qquad \frac{\partial^2 h_i}{\partial\lambda\,\partial\sigma} = -\frac{2\alpha x_i t_i}{\sigma^3\lambda(x_i+\lambda)}, \qquad \frac{\partial^2 h_i}{\partial\theta\,\partial\beta} = 0.\]
In particular,
\[\frac{\partial^2 \ell}{\partial\beta^2} = -\sum_{i=1}^{n} \frac{(2w_i - 1)^2}{D_i^2}. \tag{3.4.8}\]

The variance-covariance matrix is approximated by $V = F^{-1} = (V_{ij})$. The asymptotic distribution of the MLEs of $\alpha$, $\lambda$, $\sigma$ and $\beta$ can be written as
\[\left[(\hat\alpha - \alpha), (\hat\lambda - \lambda), (\hat\sigma - \sigma), (\hat\beta - \beta)\right] \sim N_4\!\left(0, F(\hat\theta)^{-1}\right),\]
where $\hat\theta = (\hat\alpha, \hat\lambda, \hat\sigma, \hat\beta)$. Then the $100(1-\gamma)\%$ confidence intervals for $\alpha$, $\lambda$, $\sigma$ and $\beta$ can be approximated by $\hat\alpha \pm z_{\gamma/2}\sqrt{\operatorname{var}(\hat\alpha)}$, $\hat\lambda \pm z_{\gamma/2}\sqrt{\operatorname{var}(\hat\lambda)}$, $\hat\sigma \pm z_{\gamma/2}\sqrt{\operatorname{var}(\hat\sigma)}$ and $\hat\beta \pm z_{\gamma/2}\sqrt{\operatorname{var}(\hat\beta)}$, where $z_\gamma$ is the upper $100\gamma$th percentile of the standard normal distribution.

3.4.3 Simulation

We conduct simulations with sample sizes n = 200, 300 and 400 to calculate the MLEs of the TR-RL parameters. The process is repeated 2000 times to obtain the standard errors (SE) and biases, which measure the performance of the estimators. Moreover, approximate two-sided confidence intervals at the 95% confidence level are calculated. It is observed that as the sample size n increases, the MLEs approach their true values. The results are listed in Table 3.3.
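One replication of this simulation can be sketched in Python (an illustrative sketch only: samples are drawn by inverting the distribution function through the quantile (3.3.11), and the log-likelihood (3.4.2) is maximized numerically; the function names, starting values and optimizer choice are ours, not the dissertation's).

```python
import numpy as np
from scipy.optimize import minimize

def trrl_quantile(p, a, lam, sig, b):
    # Quantile function (3.3.11); used for inverse-CDF sampling
    arg = ((b - 1) + np.sqrt((b + 1) ** 2 - 4 * b * p)) / (2 * b)
    return lam * (-2 * sig ** 2 * np.log(arg)) ** (1 / (2 * a)) - lam

def trrl_negloglik(theta, x):
    # Negative of the log-likelihood (3.4.2), with parameter-range guards
    a, lam, sig, b = theta
    if a <= 0 or lam <= 0 or sig <= 0 or abs(b) > 1:
        return np.inf
    y = (x + lam) / lam
    if np.any(y <= 0):
        return np.inf
    t = y ** (2 * a)
    w = np.exp(-t / (2 * sig ** 2))
    d = 1 - b + 2 * b * w
    if np.any(d <= 0):
        return np.inf
    return -np.sum(np.log(a) - np.log(lam) - 2 * np.log(sig)
                   + (2 * a - 1) * np.log(y) - t / (2 * sig ** 2) + np.log(d))

rng = np.random.default_rng(1)
true = (1.0, 1.0, 1.0, -0.5)
x = trrl_quantile(rng.uniform(size=500), *true)   # one simulated sample
start = np.array([0.8, 1.5, 1.2, 0.0])
res = minimize(trrl_negloglik, start, args=(x,), method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
print(res.x)
```

Repeating this over many replications and averaging the estimates, their biases and their standard errors reproduces the structure of Table 3.3.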

3.5 Application

In this section, an application to a real data set is provided to verify the flexibility of the TR-RL distribution. The TR-RL distribution is fitted to the data set and the MLEs of the parameters are obtained. AIC and SIC are used to compare the TR-RL distribution with some of its related models.

Table 3.3: MLE, SE, Bias and 95% confidence limits from the TR-RL distribution.

parameter   n     MLE       SE       Bias      LCL       UCL
α = 1       200   0.8972    0.0110   -0.1028   0.5913    1.2031
            300   0.8526    0.0050   -0.1474   0.6809    1.0243
            400   0.9513    0.0056   -0.0487   0.7325    1.1700
λ = 1       200   0.8373    0.0044   -0.1627   0.7149    0.9597
            300   0.9943    0.0031   -0.0057   0.8901    1.0984
            400   0.9099    0.0021   -0.0900   0.8256    0.9943
σ = 1       200   1.0952    0.0148   0.0952    0.6834    1.5070
            300   0.9006    0.0050   -0.0994   0.7299    1.0713
            400   0.9889    0.0061   -0.0110   0.7509    1.2269
β = -0.5    200   -0.3820   0.0299   0.1179    -1.2124   0.4484
            300   -0.6822   0.0134   -0.1822   -1.1381   -0.2262
            400   -0.4907   0.0144   0.0092    -1.0549   0.0734
(LCL and UCL are the lower and upper 95% confidence limits.)

The AIC and SIC are given by
\[AIC = -2\ell(x; \hat\alpha, \hat\lambda, \hat\sigma, \hat\beta) + 2d,\]
\[SIC = -2\ell(x; \hat\alpha, \hat\lambda, \hat\sigma, \hat\beta) + d\log(n),\]

where $\ell(x; \hat\alpha, \hat\lambda, \hat\sigma, \hat\beta)$ denotes the log-likelihood function evaluated at the maximum likelihood estimates, $d$ is the number of parameters, and $n$ is the sample size. The distribution with the smaller AIC and SIC is considered the better fit.
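For instance, both criteria can be computed directly from the value of $-2\log L$; the short sketch below (our own) reproduces the TR-RL row of Table 3.5, where $-2\log L = 824.3903$, $d = 4$ and $n = 128$.

```python
import math

def aic_sic(loglik, d, n):
    # AIC and SIC as defined above; smaller values indicate a better fit
    return -2 * loglik + 2 * d, -2 * loglik + d * math.log(n)

aic, sic = aic_sic(-824.3903 / 2, d=4, n=128)
print(round(aic, 4), round(sic, 4))  # 832.3903 843.7984, matching Table 3.5
```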

Further, we apply the likelihood ratio (LR) test, comparing the TR-RL distribution with each of its sub-models by computing the maximized log-likelihoods of the TR-RL distribution and of its related models. The LR test statistic is
\[w = 2\left(\log L(\hat\theta; x) - \log L(\hat\theta_0; x)\right), \tag{3.5.1}\]
where $\hat\theta = (\hat\alpha, \hat\lambda, \hat\sigma, \hat\beta)$ is the vector of maximum likelihood estimates of the TR-RL parameters, and $\hat\theta_0$ is the vector of maximum likelihood estimates of the parameters of a TR-RL sub-model. In this section, we choose the RL, transmuted Rayleigh and Rayleigh distributions as the sub-models of the TR-RL distribution. Consider the hypotheses
\[H_0: f = f_0 \quad \text{versus} \quad H_1: f = f_1, \tag{3.5.2}\]
where $f_1$ is the pdf of the TR-RL distribution and $f_0$ is the pdf of a TR-RL sub-model. As $n \to \infty$, the test statistic $w$ is asymptotically distributed as $\chi^2_k$, where $k$ is the number of parameters restricted under $H_0$. We reject $H_0$ if $w$ is greater than $\chi^2_\alpha(k)$, the upper $100\alpha$th percentile of the chi-square distribution with $k$ degrees of freedom.

A data set of the remission times (in months) of a random sample of 128 bladder cancer patients, given in Lee and Wang (2003), is considered in this chapter. Table 3.4 gives a summary description of the data set, and the fitting results are illustrated in Table 3.5. Table 3.5 shows that the TR-RL distribution is the better model compared with its related models, and Table 3.6 shows that for this data set the TR-RL distribution performs better than the RL, transmuted Rayleigh and Rayleigh distributions. Figure 3.7 shows the TR-RL distribution and its related models fitted to the histogram of the remission time data; the TR-RL provides a better fit to the histogram than its related models.
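The LR comparison can be carried out directly from the maximized log-likelihoods; the sketch below (our own) uses the $-2\log L$ values of the TR-RL and RL fits from Table 3.5 with $k = 1$ restricted parameter.

```python
from scipy.stats import chi2

def lr_test(loglik_full, loglik_sub, k, alpha=0.05):
    # LR statistic (3.5.1) and the upper 100*alpha-th chi-square percentile
    w = 2 * (loglik_full - loglik_sub)
    crit = chi2.ppf(1 - alpha, df=k)
    return w, crit, w > crit

w, crit, reject = lr_test(-824.3903 / 2, -828.6451 / 2, k=1)
print(w, crit, reject)
```

Here $w = 828.6451 - 824.3903 \approx 4.25$ exceeds $\chi^2_{0.05}(1) = 3.841$, so the RL sub-model is rejected in favor of the TR-RL model (Table 3.6 reports 4.2158 from the unrounded log-likelihoods).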

Table 3.4: Summary description of the remission time data.

Min.   Max.    Mean     Median   sd        Skewness   Kurtosis
0.08   79.05   9.3656   6.395    10.5083   3.2866     18.4831

Table 3.5: The Loglikelihood, MLEs, AIC and SIC for the remission time real data.

Distribution   α        λ        σ         β        -2logL     AIC        SIC
TR-RL          0.5672   0.0272   25.0067   0.7418   824.3903   832.3903   843.7984
RL             0.5247   0.0209   17.6095   -        828.6451   834.6451   843.2012
TR-Rayleigh    -        -        11.3948   0.7913   933.2392   941.2392   946.9433
Rayleigh       -        -        9.9317    -        976.5312   984.5312   995.9393

Figure 3.7: Fitted density curves to the remission time real data.

Table 3.6: The likelihood ratio tests for the remission time real data

Hypotheses                                  LR statistic   Chi-square percentile
Rayleigh(σ) vs TR-RL(α, λ, σ, β)            158.1018       χ²_{0.05}(3) = 7.815
RL(α, λ, σ) vs TR-RL(α, λ, σ, β)            4.2158         χ²_{0.05}(1) = 3.841
TR-Rayleigh(σ, β) vs TR-RL(α, λ, σ, β)      112.8098       χ²_{0.05}(2) = 5.991

In Table 3.6, the LR test statistic w illustrates the superiority of the TR-RL distribution compared to the RL, the transmuted Rayleigh (TR-Rayleigh) and the Rayleigh distributions. The test statistics and the chi-square percentiles are presented in the second and third columns, respectively, at the 0.05 significance level. In all three cases the LR test statistic exceeds the chi-square percentile; hence, the TR-RL distribution is statistically superior to the RL, transmuted Rayleigh and Rayleigh distributions.

3.6 Discussion and Conclusions

In this chapter, we propose the TR-RL distribution by applying the QRTM. This distribution generalizes the RL distribution by adding a transmuting parameter |β| ≤ 1, which provides flexibility in modeling skewed data. Three important transmuted models are special cases of the TR-RL: the transmuted exponential, the transmuted Weibull and the transmuted Rayleigh distributions. Also, when the transmuting parameter of the TR-RL distribution is zero, the distribution reduces to the RL distribution. We derive some mathematical properties such as the moments, order statistics, quantile function and probability weighted moments. The maximum likelihood method is used for estimating the model parameters, and the variance-covariance matrix is determined. Simulations are conducted to demonstrate the performance of the maximum likelihood estimation method. We apply the proposed distribution to a real data set to illustrate its advantage in modeling skewed data, using the Akaike information criterion, the Schwarz information criterion and the likelihood ratio test statistic.

CHAPTER 4 AN INFORMATION APPROACH FOR THE CHANGE POINT PROBLEM OF THE RAYLEIGH LOMAX DISTRIBUTION

4.1 Introduction

Change point analysis is often of interest in modeling real life phenomena. In statistics, a change point can be viewed as an unknown location or time point such that the observations follow different distributions before and after that point. There are various approaches to change-point analysis, such as the likelihood ratio test, the Bayesian method and the information approach. According to Chen and Gupta (2011), the change point problem can be defined as follows.

Let $X_1, X_2, \ldots, X_n$ be a sequence of independent random variables with distribution functions $F_1, F_2, \ldots, F_n$, respectively. Then, in general, the change point problem is to test the null hypothesis
\[H_0: F_1 = F_2 = \cdots = F_n\]
versus the alternative
\[H_1: F_1 = \cdots = F_{k_1} \neq F_{k_1+1} = \cdots = F_{k_2} \neq \cdots \neq F_{k_q+1} = \cdots = F_n,\]

where $1 < k_1 < k_2 < \cdots < k_q < n$, $q$ is the unknown number of change points, and $k_1, k_2, \ldots, k_q$ are the respective unknown positions that have to be estimated. If the distributions $F_1, F_2, \ldots, F_n$ belong to a common parametric family $F(\theta)$, where $\theta \in \mathbb{R}^p$, then the change point problem is to test the null hypothesis about the population parameters $\theta_i$, $i = 1, \ldots, n$:

\[H_0: \theta_1 = \theta_2 = \cdots = \theta_n = \theta \;(\text{unknown}), \tag{4.1.1}\]
versus the alternative:

\[H_1: \theta_1 = \cdots = \theta_{k_1} \neq \theta_{k_1+1} = \cdots = \theta_{k_2} \neq \cdots \neq \theta_{k_q+1} = \cdots = \theta_n, \tag{4.1.2}\]

where $q$ and $k_1, k_2, \ldots, k_q$ have to be estimated. These hypotheses reveal the two aspects of change point inference: determining whether any change point exists in the process, and estimating the number and position(s) of the change point(s).

4.2 Literature Review of the Change Point Problem

Many of the changes that occur in the world can cause unnecessary losses if people are not aware of them. Thus, many researchers have devoted their efforts to detecting the locations of change points in practical applications. Most of the studies assume a single change point in the data. Chernoff and Zacks (1964) derived a Bayes estimator for the current mean of a normal distribution subjected to changes in time, under a uniform prior on the real line and a quadratic loss function. Sen and Srivastava (1975) derived the exact and asymptotic distribution functions of some Bayesian test statistics for detecting a change in mean in a sequence of normal random variables. Hawkins (1977) derived the null distributions of the likelihood ratio test for a change in mean with known and unknown variance; however, the test statistic for the case of unknown variance was incorrect. Worsley (1979) derived the null distribution for a single change point with known and unknown variance. Kim and White (2004) derived a likelihood ratio test to detect a single change point in a simple model. Detecting change points in the variance has also been considered in the literature. Hsu (1977) used maximum likelihood to estimate the time of change and the variances in the different periods of the process. Davis and Feldstein (1979) derived a Bayesian approach, based on posterior odds, to detect multiple change points in the variance of observations. Horváth et al. (2004) considered the change point problem for linear models. Gurevich and Vexler (2005) studied the change point problem for logistic regression. Vexler et al. (2009) studied classification problems in the context of the change point problem. Ning and Gupta (2009) investigated the change point problem for the generalized lambda distribution. Chen and Gupta (2011) provided an exhaustive literature review of the change point problem. Ning (2012) proposed nonparametric methods to detect different types of changes in the mean.

The multiple change point problem has also been widely considered in the literature. Vostrikova (1981) proposed the binary segmentation procedure to detect the number of change points and their locations in a multidimensional random process. This method has the advantages of detecting the number of change points and the corresponding locations simultaneously and of saving considerable computational time, and it has been used widely for change point analysis. The binary segmentation procedure can be summarized in the following steps. Consider testing (4.1.1) versus (4.1.2).

Step 1: Test the null hypothesis given by (4.1.1) versus the following alternative.

\[H_1: \theta_1 = \cdots = \theta_k \neq \theta_{k+1} = \cdots = \theta_n, \tag{4.2.1}\]

where $k$ is the location of the single change point at this stage. If we do not reject $H_0$, we stop and conclude that there is no change point. If we reject $H_0$, there is a change point and we proceed to Step 2.

Step 2: Test the two subsequences before and after the change point obtained in Step 1 separately for a possible change.

Step 3: Repeat the process until no further subsequences contain change points.

Step 4: Collect all the change point positions found in Steps 1 to 3, denoted $\{\hat k_1, \hat k_2, \ldots, \hat k_q\}$, where $q$ is the estimated number of change points.
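The four steps can be sketched generically as follows (our own illustrative Python; a simple SIC-based test for a mean change in a normal sequence stands in for whatever single-change-point test is used at each stage).

```python
import math

def sic_mean_change(seg):
    # Single-change-point test for a mean shift (normal model, SIC-based):
    # returns (reject H0, estimated change position within the segment).
    n = len(seg)
    def m2ll(a):  # -2 * maximized log-likelihood of a segment, up to constants
        mu = sum(a) / len(a)
        s2 = sum((v - mu) ** 2 for v in a) / len(a) + 1e-12
        return len(a) * math.log(s2)
    sic_n = m2ll(seg) + 2 * math.log(n)                       # no-change model
    cand = [(m2ll(seg[:k]) + m2ll(seg[k:]) + 4 * math.log(n), k)
            for k in range(3, n - 3)]                         # one-change models
    best, k_hat = min(cand)
    return best < sic_n, k_hat

def binary_segmentation(x, test=sic_mean_change, min_len=8):
    # Steps 1-4: test the whole sequence, split at each detected change
    # point, and repeat on both subsequences until no change is found.
    found = []
    def recurse(lo, hi):
        if hi - lo < 2 * min_len:
            return
        reject, k_rel = test(x[lo:hi])
        if reject:
            found.append(lo + k_rel)
            recurse(lo, lo + k_rel)   # subsequence before the change
            recurse(lo + k_rel, hi)   # subsequence after the change
    recurse(0, len(x))
    return sorted(found)
```

For example, `binary_segmentation([0.0] * 30 + [5.0] * 30)` locates the single change at position 30 and then stops, since neither half contains a further change.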

4.3 Methodology

Given a collection of models for the data, Akaike (1973) proposed the Akaike information criterion
\[AIC(k) = -2\log L(\hat\theta_k) + 2\dim(\hat\theta_k),\]

where $L(\hat\theta_k)$ is the maximum value of the likelihood function for the model. The most appropriate model is then the one with the minimum AIC value (MAIC). However, the MAIC is not an asymptotically consistent estimator of the model order. Schwarz (1978) modified the information criterion without changing Akaike's original principle. This new criterion is called the Schwarz information criterion (SIC). The SIC procedure is to choose the model which minimizes

\[SIC(k) = -2\log L(\hat\theta_k) + \dim(\hat\theta_k)\log(n), \quad k = 1, 2, \ldots, K, \tag{4.3.1}\]

where $n$ is the sample size. Note that the only difference lies in the penalty term, which is $\dim(\hat\theta_k)\log(n)$ in the SIC instead of $2\dim(\hat\theta_k)$. Schwarz (1978) proved that the SIC is an asymptotically consistent estimator of the true model order. Hence, this criterion can be used to detect a change point in the data and to estimate its location $k$. To estimate the parameters we need sufficient observations, since the methods used for the change point problem may not detect changes located at the very beginning or the very end of the observations. Therefore, we compute $SIC(k)$ for $k_0 \le k \le n - k_0$, where $k_0$ is chosen large enough that the MLEs can be calculated reliably. Following Ning and Gupta (2009), Chen and Gupta (2011) and the principle of information criteria, $SIC(n)$ is defined under the null hypothesis $H_0$ as follows

\[SIC(n) = -2\log L(\hat\theta) + \dim(\hat\theta)\log(n). \tag{4.3.2}\]

Hence, we do not reject $H_0$ if
\[SIC(n) \le \min_{k_0 \le k \le n-k_0} SIC(k), \tag{4.3.3}\]
and reject $H_0$ if
\[SIC(n) > SIC(k) \tag{4.3.4}\]
for some $k$. When $H_0$ is rejected, the position of the change point is estimated by $\hat k$ such that
\[SIC(\hat k) = \min_{k_0 \le k \le n-k_0} SIC(k). \tag{4.3.5}\]
We note that the penalty of the SIC depends only on the sample size and the number of parameters to be estimated. Zhang and Siegmund (2007) indicated that the SIC detects change points more efficiently when they lie in the middle of the data. However, when the change point is at the beginning or the end of the data, some parameters of the studied distribution become redundant. Hence, a penalty term that is robust in most situations is needed. Chen et al. (2006) suggested studying the complexity of the penalty term so that it can be related to the location of the changes. When the model complexity is the focus, equation (4.3.1) can be written as

\[SIC(k) = -2\log L(\hat\theta_{1k}, \hat\theta_{2k}, k) + \operatorname{complexity}(\hat\theta_{1k}, \hat\theta_{2k}, k)\log(n), \tag{4.3.6}\]

where $\hat\theta_{1k}$ and $\hat\theta_{2k}$ are the parameters before and after the change point. Chen et al. (2006) re-examined the complexity in terms of the change point problem. When the change point location $k$ is in the middle of the data, that is, close to $n/2$, the parameters before and after the change point are both effective. However, when $k$ is near the beginning (1) or the end ($n$) of the data set, one of these parameters becomes redundant, which makes $k$ an undesirable parameter. Thus, a modified information criterion (MIC) has been proposed, defined as follows.

\[MIC(k) = -2\log L(\hat\theta_k) + \left[\dim(\hat\theta_k) + \left(\frac{2k}{n} - 1\right)^2\right]\log(n), \quad 1 \le k < n. \tag{4.3.7}\]

Under the null hypothesis H0, MIC is defined as

\[MIC(n) = -2\log L(\hat\theta) + \dim(\hat\theta)\log(n), \tag{4.3.8}\]

where $\hat\theta$ maximizes $\log L(\theta)$. If $MIC(n) > \min_{1 \le k < n} MIC(k)$, then a change point exists, and its position is estimated by $\hat k$ such that
\[MIC(\hat k) = \min_{1 \le k < n} MIC(k). \tag{4.3.9}\]
The test statistics based on the SIC and the MIC are defined as

\[T_n = SIC(n) - \min_{1 \le k < n} SIC(k) + \dim(\theta)\log(n), \tag{4.3.10}\]
\[S_n = MIC(n) - \min_{1 \le k < n} MIC(k) + \dim(\theta)\log(n), \tag{4.3.11}\]

where $SIC(k)$, $SIC(n)$, $MIC(k)$ and $MIC(n)$ are defined in (4.3.1), (4.3.2), (4.3.7) and (4.3.8), respectively.

This chapter focuses on the change point problem using the SIC and MIC approaches to detect changes in the parameters of the RL distribution defined in Chapter 2. Let $X_1, X_2, \ldots, X_n$ be a sequence of independent random variables from the RL distribution with shape parameter $\alpha$ and scale parameters $\lambda$ and $\sigma$. The change point problem is to test the null hypothesis:

\[H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_n = \alpha,\quad \lambda_1 = \lambda_2 = \cdots = \lambda_n = \lambda,\quad \sigma_1 = \sigma_2 = \cdots = \sigma_n = \sigma \;(\text{unknown}),\]

versus the alternative

\[H_1: \alpha_1 = \cdots = \alpha_k = \alpha^* \neq \alpha_{k+1} = \cdots = \alpha_n = \alpha^{**},\]
\[\lambda_1 = \cdots = \lambda_k = \lambda^* \neq \lambda_{k+1} = \cdots = \lambda_n = \lambda^{**},\]
\[\sigma_1 = \cdots = \sigma_k = \sigma^* \neq \sigma_{k+1} = \cdots = \sigma_n = \sigma^{**},\]

where $1 < k < n$ is the unknown position that has to be estimated. Under $H_0$, SIC and MIC are defined as
\[SIC(n) = MIC(n) = -2\sum_{i=1}^{n} \log f(x_i; \hat\alpha, \hat\lambda, \hat\sigma) + 3\log(n), \tag{4.3.12}\]

where $\hat\alpha$, $\hat\lambda$ and $\hat\sigma$ are the MLEs of the shape parameter $\alpha$ and the scale parameters $\lambda$ and $\sigma$, respectively, fitted to the whole data set. Under $H_1$, SIC and MIC are defined as follows

\[SIC(k) = -2\sum_{i=1}^{k} \log f(x_i; \hat\alpha^*, \hat\lambda^*, \hat\sigma^*) - 2\sum_{i=k+1}^{n} \log f(x_i; \hat\alpha^{**}, \hat\lambda^{**}, \hat\sigma^{**}) + 6\log(n), \tag{4.3.13}\]

\[MIC(k) = -2\sum_{i=1}^{k} \log f(x_i; \hat\alpha^*, \hat\lambda^*, \hat\sigma^*) - 2\sum_{i=k+1}^{n} \log f(x_i; \hat\alpha^{**}, \hat\lambda^{**}, \hat\sigma^{**}) + \left[6 + \left(\frac{2k}{n} - 1\right)^2\right]\log(n), \tag{4.3.14}\]

where $\hat\alpha^*$, $\hat\lambda^*$ and $\hat\sigma^*$ are the MLEs of $\alpha$, $\lambda$ and $\sigma$ fitted to the first segment of the data set, and $\hat\alpha^{**}$, $\hat\lambda^{**}$ and $\hat\sigma^{**}$ are the MLEs fitted to the second segment. Multiple change points can be detected using the binary segmentation method explained in Section 4.2.
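The SIC/MIC scan has the same shape for any parametric model; the sketch below (our own, using a normal model with closed-form MLEs in place of the RL fits, which the dissertation performs numerically with the R package bbmle) computes SIC(n) = MIC(n) under H0 and scans SIC(k) and MIC(k) under H1.

```python
import math, random

def m2loglik_norm(seg):
    # -2 x maximized normal log-likelihood of a segment (mu, sigma estimated)
    n, mu = len(seg), sum(seg) / len(seg)
    s2 = sum((v - mu) ** 2 for v in seg) / n
    return n * (math.log(2 * math.pi * s2) + 1)

def sic_mic_change_point(x, k0=5, d=2):
    # SIC(n)/MIC(n) under H0 and the scans SIC(k), MIC(k) under H1,
    # following (4.3.1)-(4.3.2) and (4.3.7)-(4.3.9); d = dim(theta).
    n = len(x)
    null = m2loglik_norm(x) + d * math.log(n)        # SIC(n) = MIC(n)
    sic, mic = {}, {}
    for k in range(k0, n - k0):
        split = m2loglik_norm(x[:k]) + m2loglik_norm(x[k:])
        sic[k] = split + 2 * d * math.log(n)
        mic[k] = split + (2 * d + (2 * k / n - 1) ** 2) * math.log(n)
    k_sic = min(sic, key=sic.get)
    k_mic = min(mic, key=mic.get)
    return (null > sic[k_sic], k_sic), (null > mic[k_mic], k_mic)

random.seed(4)
x = [random.gauss(0, 1) for _ in range(80)] + [random.gauss(5, 1) for _ in range(80)]
(sic_rej, k1), (mic_rej, k2) = sic_mic_change_point(x)
print(sic_rej, k1, mic_rej, k2)
```

With a pronounced shift at position 80, both criteria reject the no-change model and estimate the change location near 80, mirroring the behavior reported in the simulation tables below.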

Theorem 4.3.1. Under the Wald conditions and the regularity conditions (Chen et al. (2006)), as $n \to \infty$,
\[S_n \to \chi^2_3 \tag{4.3.15}\]
in distribution under the null hypothesis, where $S_n$ is defined in (4.3.11) and $\chi^2_3$ is the chi-square distribution with degrees of freedom equal to the number of parameters of the RL distribution.

In addition, if there is a change point at $k$ such that, as $n \to \infty$, $k/n$ has a limit in $(0, 1)$, then
\[S_n \to \infty \tag{4.3.16}\]
in probability.

Theorem 4.3.1 indicates that the MIC is consistent, which means that when there is a fixed amount of change in the RL parameters at $k$, such that $k/n$ has a limit in $(0, 1)$, the model with a change point is chosen with probability approaching one. The proof of Theorem 4.3.1 is similar to that of Chen et al. (2006).

Theorem 4.3.2. Assume that the Wald conditions and the regularity conditions are satisfied by the RL distribution, and that as $n \to \infty$ the change point $k$ satisfies $0 < k/n < 1$. Then the change point estimator satisfies
\[\hat k - k = O_p(1). \tag{4.3.17}\]

Theorem 4.3.2 implies that the estimator $\hat k$ of the change point attains the best possible convergence rate. The proof is similar to that of Chen et al. (2006). For the RL distribution, Theorems 4.3.1 and 4.3.2 are also verified numerically, as described in the following section.

4.4 Simulation Study

In this section, we study the change point problem for the shape and scale parameters of the RL distribution. We consider at most one change point, since multiple change points can be detected with the binary segmentation method proposed by Vostrikova (1981); this method is applied in our data analysis. To calculate $SIC(n)$, $SIC(k)$, $MIC(n)$ and $MIC(k)$, we use the R package bbmle (Bolker and Team (2010)) to fit the RL distribution, since the maximum likelihood estimates in $\log f(x_i; \hat\alpha, \hat\lambda, \hat\sigma)$, $\log f(x_i; \hat\alpha^*, \hat\lambda^*, \hat\sigma^*)$ and $\log f(x_i; \hat\alpha^{**}, \hat\lambda^{**}, \hat\sigma^{**})$ are not available in closed form. We conduct 1000 simulation replications under RL$(\alpha, \lambda, \sigma)$ with different values of the shape parameter $\alpha$ and the scale parameters $\lambda$ and $\sigma$. The test statistics $T_n$ and $S_n$ are calculated and compared with the critical values corresponding to significance level 0.05. After rejecting the null hypothesis, we calculate the powers of the SIC and the MIC for sample sizes n = 100, 200, 400 and different change locations, as reported in Tables 4.1, 4.2 and 4.3.

Table 4.1: Power comparison between SIC and MIC as n=100.

RL changing parameters     Criteria   k=50    k=60    k=70
α* = 1, α** = 1.5          SIC        0.410   0.363   0.269
λ* = 2, λ** = 2.5          MIC        0.600   0.563   0.421
σ* = 2, σ** = 2.5
α* = 1, α** = 1.8          SIC        0.630   0.579   0.462
λ* = 2, λ** = 2.8          MIC        0.797   0.766   0.623
σ* = 2, σ** = 2.8
α* = 1, α** = 2            SIC        0.790   0.730   0.619
λ* = 2, λ** = 3            MIC        0.911   0.865   0.757
σ* = 2, σ** = 3

Table 4.2: Power comparison between SIC and MIC as n=200.

RL changing parameters     Criteria   k=100   k=130   k=160
α* = 1, α** = 1.5          SIC        0.770   0.734   0.546
λ* = 2, λ** = 2.5          MIC        0.915   0.898   0.706
σ* = 2, σ** = 2.5
α* = 1, α** = 1.8          SIC        0.945   0.935   0.775
λ* = 2, λ** = 2.8          MIC        0.985   0.988   0.887
σ* = 2, σ** = 2.8
α* = 1, α** = 2            SIC        0.990   0.983   0.910
λ* = 2, λ** = 3            MIC        0.996   0.997   0.963
σ* = 2, σ** = 3

We observe that the MIC has higher power to detect the change point positions than the SIC. For both the MIC and the SIC, the power increases as the difference between the parameters before and after the change and the sample size increase. When the sample sizes are large enough, the power approaches 1, which indicates that both criteria are consistent.

Table 4.3: Power comparison between SIC and MIC as n=400.

RL changing parameters     Criteria   k=200   k=250   k=280
α* = 1, α** = 1.5          SIC        0.946   0.920   0.851
λ* = 2, λ** = 2.5          MIC        0.996   0.986   0.941
σ* = 2, σ** = 2.5
α* = 1, α** = 1.8          SIC        1       1       0.993
λ* = 2, λ** = 2.8          MIC        1       1       0.999
σ* = 2, σ** = 2.8
α* = 1, α** = 2            SIC        1       1       1
λ* = 2, λ** = 3            MIC        1       1       1
σ* = 2, σ** = 3

We also verify numerically the behavior of $S_n$ and its convergence to $\chi^2_3$ in distribution as $n \to \infty$, as stated in Theorem 4.3.1, using the Kolmogorov-Smirnov test. Figures 4.1-4.3 show the $\chi^2_3$ quantile-quantile (Q-Q) plots for different sample sizes. They indicate that as the sample size increases, $S_n$ is better approximated by $\chi^2_3$. The p-values also support this approximation for large sample sizes.

Figure 4.1: $\chi^2_3$ Q-Q plot of $S_n$ as n=100, p-value=0.4376.

Figure 4.2: $\chi^2_3$ Q-Q plot of $S_n$ as n=200, p-value=0.6342.

Figure 4.3: $\chi^2_3$ Q-Q plot of $S_n$ as n=400, p-value=0.8649.

We also simulate data 1000 times to evaluate the convergence of the estimator $\hat k$ numerically for sample sizes n = 100, 150, 200 and different change point locations. The results are listed in Tables 4.4, 4.5 and 4.6. We observe that, for both the SIC and the MIC criteria, the estimator $\hat k$ of the change point location converges in probability to its assumed value $k$ as the sample size $n$ gets larger, for $\eta = 1, 2, 3$.

4.5 Change Point Analysis for British Coal Mining Disaster

This section presents the change point analysis of the coal mining disaster data, which span 1875 to 1951, a total of 26,263 days. The data consist of the 109 time intervals between explosions in mines in Great Britain involving the loss of ten or more lives, given in Table 4.7. They are taken from Wu (2007) and were originally analyzed by Maguire et al. (1952).

Table 4.4: Probability distribution of |k̂ − k| ≤ η when n = 100 and different k's.

P(|k̂ − k| ≤ η)     Criteria   k=50    k=60    k=70
P(|k̂ − k| ≤ 1)     SIC        0.593   0.493   0.287
                   MIC        0.688   0.616   0.477
P(|k̂ − k| ≤ 2)     SIC        0.664   0.550   0.354
                   MIC        0.785   0.726   0.564
P(|k̂ − k| ≤ 3)     SIC        0.738   0.542   0.353
                   MIC        0.858   0.751   0.575

Table 4.5: Probability distribution of |k̂ − k| ≤ η when n = 150 and different k's.

P(|k̂ − k| ≤ η)     Criteria   k=75    k=90    k=110
P(|k̂ − k| ≤ 1)     SIC        0.769   0.754   0.615
                   MIC        0.777   0.755   0.705
P(|k̂ − k| ≤ 2)     SIC        0.874   0.866   0.710
                   MIC        0.893   0.884   0.806
P(|k̂ − k| ≤ 3)     SIC        0.934   0.935   0.720
                   MIC        0.945   0.949   0.817

Before we consider the RL change point model for these data, we need to check whether the data are independent. Following Ngunkeng (2013), we check the independence of the data set using the portmanteau test, with the test statistic given by

Qₖ = n Σᵢ₌₁ᵏ rᵢ²,    (4.5.1)

where rᵢ is the autocorrelation coefficient (acf) at lag i, and k is the number of lags up to which the autocorrelation function is considered. Under H0 of independence, Qₖ ∼ χ²ₖ. By applying the test (4.5.1) to the British Coal Mining Disaster data, the result is as follows:

Q₁₈ = 109 Σᵢ₌₁¹⁸ rᵢ² = 19.927 < χ²0.95(18) = 28.869.

Table 4.6: Probability distribution of |k̂ − k| ≤ η when n = 200 and different k's.

P(|k̂ − k| ≤ η)     Criteria   k=100   k=130   k=160
P(|k̂ − k| ≤ 1)     SIC        0.786   0.780   0.679
                   MIC        0.781   0.788   0.714
P(|k̂ − k| ≤ 2)     SIC        0.885   0.890   0.785
                   MIC        0.912   0.902   0.839
P(|k̂ − k| ≤ 3)     SIC        0.946   0.925   0.824
                   MIC        0.950   0.944   0.902

Table 4.7: Time intervals between explosions in mines.

378 36 15 31 215 11 137 4 15 72 96 124 50 120 203 176 55 93 59 315 59 61 1 13 189 345 20 81 286 114 108 188 233 28 22 61 78 99 326 275 54 217 113 32 23 151 361 312 354 58 275 78 17 1205 644 467 871 48 123 457 498 49 131 182 255 195 224 566 390 72 228 271 208 517 1613 54 326 1312 348 745 217 120 275 20 66 291 4 369 338 336 19 329 330 312 171 145 75 364 37 19 156 47 129 1630 29 217 7 18 1357

Therefore, we fail to reject the null hypothesis H0 and we conclude that the data are independent. Figure 4.4 shows the graph of the autocorrelation function, which also indicates that the data are uncorrelated. Maguire et al. (1952) showed that the intervals are independent and exponentially distributed. We use the RL distribution on the same data set to detect the change point, since it is related to the exponential distribution as discussed in Chapter 2. To detect changes in the British Coal Mining Disaster data set, we apply the test statistics defined in (4.3.10) and (4.3.11). After calculation, we obtain SIC(n) = 1423.592,

min_{5≤k≤104} SIC(k) = SIC(46) = 1414.529 and Tn = 23.137 with p-value 9.459 × 10⁻⁶. Regarding the modified information criterion, we have MIC(n) = 1423.592, min_{5≤k≤104} MIC(k) = MIC(46) = 1414.643 and Sn = 23.023 with p-value 3.994 × 10⁻⁵. Hence, there is a change point at position 46, which corresponds to the year 1890. According to Hall and Snelling (1907), the use of explosives or inflammable materials was the principal cause of explosions in mines before the generalization of safety explosives, which came after 1890. By the binary segmentation method, we find that there is no further change point. Our result matches those obtained by Worsley (1979) and Wu (2007). Figure 4.6 displays the SIC(k) values for the data set.

Figure 4.4: The autocorrelation plot of the British Coal Mining Disaster data.

Figure 4.5: Scatter plot of the British Coal Mining Disaster data.

Figure 4.6: SIC(k) values in the British Coal Mining Disaster data.

4.6 Change Point Analysis for IBM Stock Price

This data set consists of the IBM stock daily closing prices from May 17, 1961 to November 2, 1962 (Box et al., 1994), with 369 observations in total. We note that the IBM stock price data may not be independent. Following Hsu (1979), we transform the data into an independent series Rₜ as follows

Rₜ = (Pₜ₊₁ − Pₜ)/Pₜ,    for t = 1, 2, ..., 368.    (4.6.1)
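In R, the transformation (4.6.1) is a one-liner; the short price vector here is hypothetical:

```r
# One-step returns R_t = (P_{t+1} - P_t)/P_t from a closing-price series.
P <- c(100, 102, 101, 104)       # hypothetical closing prices
Rt <- diff(P) / P[-length(P)]    # one observation fewer than the prices
round(Rt, 4)                     # 0.0200 -0.0098 0.0297
```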

To check independence of the transformed data, we apply the test statistic (4.5.1) as follows

Q₁₂ = 368 Σᵢ₌₁¹² rᵢ² = 368 × 0.05102 = 18.776 < χ²0.95(12) = 21.026.

Hence, we fail to reject the null hypothesis H0 and we conclude that the data are independent.
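The portmanteau statistic (4.5.1) used for these independence checks can be computed directly from the sample autocorrelations; the helper name and the white-noise input below are ours, not the dissertation's data:

```r
# Q_k = n * sum of squared autocorrelations up to lag k, compared with the
# 0.95 quantile of chi-square(k); a large Q_k rejects independence.
portmanteau <- function(series, k) {
  n <- length(series)
  r <- acf(series, lag.max = k, plot = FALSE)$acf[-1]  # drop the lag-0 value
  c(Q = n * sum(r^2), crit = qchisq(0.95, df = k))
}
set.seed(2)
out <- portmanteau(rnorm(368), k = 12)  # white noise: Q should fall below crit
out
```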

Figure 4.7: The auto-correlation plot of the transformed IBM data.

To detect the change point in the IBM stock price data set, we apply the SIC and MIC test criteria. After computation, we have SIC(n) = −1870.608 > min_{6≤k≤362} SIC(k) = SIC(235) = −2056.913. Also, Tn = 204.029 with p-value ≈ 0. Furthermore, we compute the test statistic for MIC, which is Sn = 204.459, and its p-value ≈ 0. Hence, there is a significant change at k̂ = 235. Using the binary segmentation method, another change point is detected by testing the subsequences before and after the change point k̂ = 235. After calculation, we obtain SIC(n) = −572.304 > min_{5≤k≤128} SIC(k) = SIC(280) = −590.410 and Tn = 32.777 with p-value 7.631 × 10⁻⁸. Also, MIC(n) = −572.304 > min_{5≤k≤128} MIC(k) = MIC(280) = −589.850 and Sn = 32.217 with p-value 4.710 × 10⁻⁷. Therefore, there is a change in the parameters of the RL distribution and the location of the change point is k̂ = 280. Our results are similar to those obtained by Baufays and Rasson (1985). The two change point locations, k̂ = 235 and k̂ = 280, occurred at the beginning of 1962. Smith (2001) indicated that there does seem to be substantial evidence that stock prices may have been “too high” before the 1962 market break. In December 1961, the price-earnings ratio was significantly higher than the ratio in the years before, making the market susceptible to a serious drop. The graphs of the IBM stock daily closing prices and the transformed data Rₜ with the corresponding change points are given in Figure 4.8 and Figure 4.9.
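The binary segmentation used above to search for additional change points can be sketched generically in R; `detect` here is a hypothetical stand-in for the SIC/MIC change point test applied to a segment:

```r
# Recursively test each segment and collect change positions until no test rejects.
binseg <- function(x, detect, offset = 0) {
  k <- detect(x)
  if (is.na(k)) return(integer(0))
  c(offset + k,
    binseg(x[1:k], detect, offset),
    binseg(x[(k + 1):length(x)], detect, offset + k))
}
# Toy detector: split at the largest mean difference if it exceeds a threshold.
toy_detect <- function(x) {
  if (length(x) < 8) return(NA)
  s <- 4:(length(x) - 4)
  d <- sapply(s, function(k) abs(mean(x[1:k]) - mean(x[(k + 1):length(x)])))
  if (max(d) > 2) s[which.max(d)] else NA
}
x <- c(rep(0, 30), rep(5, 30))   # one obvious change at position 30
binseg(x, toy_detect)            # 30
```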

Figure 4.8: The IBM stock daily closing prices from May 17, 1961 to November 2, 1962.

Figure 4.9: The IBM stock daily closing price rates from May 17, 1961 to November 2, 1962.

4.7 Change Point Analysis for the Radius of Circular Indentations

Lombard (1987) considered a sequence of data consisting of the radii of 100 circular indentations cut by a milling machine. These were obtained in an experiment to compare the effects of two servicing and resetting routines on the variability of the output of such a machine. Lombard suggested that “there may have been two increases in mean, or even a smooth increase, between observations 20 and 40 followed by a decrease at observation 76”. Lombard also mentioned that if the change is in the form of a smooth regression, a wider separation of the estimates could be expected. We detect the change points using the SIC and MIC methods. Before detection, we test the independence of the data set using the same method as in the previous sections. The result is as follows:

Q₁₂ = 100 Σᵢ₌₁¹² rᵢ² = 100 × 0.194 = 19.357 < χ²0.95(12) = 21.026.

Hence, we fail to reject the null hypothesis H0 and we conclude that the data are independent. Figure 4.10 illustrates the autocorrelation at each lag, and we observe that the observations are independent. After computation, we have SIC(n) = −122.509 > min_{5≤k≤95} SIC(k) = SIC(37) = −133.047. Also, Tn = 23.530 with p-value 7.772 × 10⁻⁶. Regarding MIC, we obtain MIC(n) = −122.509 > min_{5≤k≤95} MIC(k) = MIC(37) = −133.044 and Sn = 23.527 with p-value 3.135 × 10⁻⁵. Hence, there is a significant change point k̂ = 37, which is located between observations 20 and 40. We apply the binary segmentation method and find no further change point. Figure 4.12 and Figure 4.13 show the SIC and MIC values, with red lines marking the detected change point locations.

Figure 4.10: The auto-correlation plot of the radius of circular indentations data.

Figure 4.11: The Radius of Circular Indentations data.

Figure 4.12: SIC(k) values for the radius of circular indentations data.

Figure 4.13: MIC(k) values for the radius of circular indentations data.

4.8 Conclusions

In this chapter, we consider the change point problem for the RL distribution. Testing procedures based on the Schwarz information criterion and the modified information criterion are proposed to detect changes in the parameters of the RL distribution. Asymptotic null distributions of the test statistics are established. Consistency of the tests and of the change point estimators is also derived. Simulations are carried out to investigate the powers of the proposed test statistics, and comparisons between the two procedures are made. The simulation study shows that the power of the MIC is higher than that of the SIC after rejecting the null hypothesis. We apply our testing procedures to detect the change points in the British coal mining, IBM stock price, and radius of circular indentations data sets. Various changes in the data have been located successfully.

BIBLIOGRAPHY

Abdul-Moniem, I. and Abdel-Hameed, H. (2012). On exponentiated Lomax distribution. International Journal of Mathematical Archive (IJMA) 3(5), ISSN 2229–5046.

Acik Kemaloglu, S. and Yilmaz, M. (2017). Transmuted two-parameter Lindley distribution. Communications in Statistics-Theory and Methods (accepted).

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. 2nd International Symposium on Information Theory, Akadémiai Kiadó, Budapest, 267–281.

Al-Zahrani, B. and Sagor, H. (2014). The Poisson-Lomax distribution. Revista Colombiana de Estadística 37, 225–245.

Alizadeh, M., Ghosh, I., Yousof, H. M., Rasekhi, M., and Hamedani, G. (2017). The generalized odd generalized exponential family of distributions: Properties, characterizations and application. Journal of Data Science 15(3), 443–465.

Arnold, B. C. (2015). Pareto distribution. Wiley Online Library.

Aryal, G. R. and Tsokos, C. P. (2009). On the transmuted extreme value distribution with application. Nonlinear Analysis: Theory, Methods & Applications 71(12), e1401–e1407.

Aryal, G. R. and Tsokos, C. P. (2011). Transmuted Weibull distribution: A generalization of the Weibull probability distribution. European Journal of Pure and Applied Mathematics 4(2), 89–102.

Asquith, W. H. (2011). Distributional Analysis With L-moment Statistics Using the R Environment for Statistical Computing. CreateSpace Independent Publishing Platform, 2nd printing, ISBN 978-1463508418.

Atkinson, A. B. and Harrison, A. J. (1978). Distribution of Personal Wealth in Britain. Cambridge University Press.

Azzalini, A. (1985). A class of distributions which includes the normal ones. Scandinavian Journal of Statistics 12, 171–178.

Balakrishnan, N. and Aggarwala, R. (2000). Progressive Censoring: Theory, Methods, and Appli- cations. Springer Science and Business Media.

Balakrishnan, N. and Sandhu, R. (1995). A simple simulational algorithm for generating progressive Type-II censored samples. The American Statistician 49(2), 229–230.

Baufays, P. and Rasson, J. (1985). Variance changes in autoregressive models. Time Series Analysis: Theory and Practice 7, 119–127.

Bolker, B. and R Development Core Team (2010). bbmle: Tools for general maximum likelihood estimation. R package version 0.9.5.

Bourguignon, M., Ghosh, I., and Cordeiro, G. (2016). General results for the transmuted family of distributions and new models. Journal of Probability and Statistics, 1–12.

Bowley, A. L. (1920). Elements of Statistics, Volume 2. PS King.

Box, G., Jenkins, G., and Reinsel, G. (1994). Time Series Analysis: Forecasting and Control. Prentice Hall, Englewood Cliffs, NJ.

Bryson, M. C. (1974). Heavy-tailed distributions: properties and tests. Technometrics 16(1), 61– 68.

Chahkandi, M. and Ganjali, M. (2009). On some lifetime distributions with decreasing failure rate. Computational Statistics and Data Analysis 53(12), 4433–4440.

Chen, J., Gupta, A. K., and Pan, J. (2006). Information criterion and change point problem for regular models. Sankhyā: The Indian Journal of Statistics, 252–282.

Chernoff, H. and Zacks, S. (1964). Estimating the current mean of a normal distribution which is subjected to changes in time. The Annals of Mathematical Statistics 35(3), 999–1018.

Chhetri, S. B., Akinsete, A. A., Aryal, G., and Long, H. (2017). The Kumaraswamy transmuted Pareto distribution. Journal of Statistical Distributions and Applications 4(1), 1–24.

Cordeiro, G. M., Ortega, E. M., and Popović, B. V. (2015). The Gamma-Lomax distribution. Journal of Statistical Computation and Simulation 85(2), 305–319.

Cordeiro, G. M., Ortega, E. M., and Silva, G. O. (2011). The exponentiated generalized gamma distribution with application to lifetime data. Journal of Statistical Computation and Simulation 81(7), 827–842.

Csörgő, M. and Horváth, L. (1997). Limit Theorems in Change-Point Analysis, Volume 18. John Wiley & Sons Inc.

David, H. A. and Nagaraja, H. N. (1970). Order Statistics. Wiley Online Library.

Davis, H. T. and Feldstein, M. L. (1979). The generalized Pareto law as a model for progressively censored survival data. Biometrika 66(2), 299–306.

Dirac, P. (1958). The principles of quantum mechanics. Oxford: Clarendon 180, 287–313.

Edgeworth, F. Y. (1886). The law of error and the elimination of chance. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 21(131), 308–324.

El-Bassiouny, A., Abdo, N., and Shahen, H. (2015). Exponential Lomax distribution. International Journal of Computer Applications 121(13), 24–29.

Elbatal, I., Asha, G., and Raja, A. V. (2014). Transmuted exponentiated Fréchet distribution: properties and applications. Journal of Statistics Applications & Probability 3(3), 379–394.

Eugene, N., Lee, C., and Famoye, F. (2002). Beta-normal distribution and its applications. Communications in Statistics-Theory and Methods 31(4), 497–512.

Golaup, A., Holland, O., and Aghvami, A. H. (2005). Concept and optimization of an effective packet scheduling algorithm for multimedia traffic over HSDPA. In Personal, Indoor and Mobile Radio Communications, 2005. PIMRC 2005. IEEE 16th International Symposium on, Volume 3, pp. 1693–1697.

Grimshaw, S. D. (1993). Computing maximum likelihood estimates for the generalized Pareto distribution. Technometrics 35(2), 185–191.

Gupta, A. K. (2003). Multivariate skew t-distribution. Statistics 37(4), 359–363.

Gupta, A. K. and Nadarajah, S. (2005). On the moments of the beta normal distribution. Communications in Statistics-Theory and Methods 33(1), 1–13.

Gurevich, G. and Vexler, A. (2005). Change point problems in the model of logistic regression. Journal of Statistical Planning and Inference 131(2), 313–331.

Hall, C. and Snelling, W. O. (1907). Coal-mine accidents: their causes and prevention. Technical report, Washington: Government Printing Office.

Harris, C. M. (1968). The Pareto distribution as a queue service discipline. Operations Research 16(2), 307–313.

Hassan, A. S. and Al-Ghamdi, A. S. (2009). Optimum step-stress accelerated life testing for Lomax distribution. Journal of Applied Sciences Research 5(12), 2153–2164.

Hawkins, D. M. (1977). Testing a sequence of observations for a shift in location. Journal of the American Statistical Association 72(357), 180–186.

Hoffman, D. and Karst, O. (1975). The theory of the Rayleigh distribution and some of its appli- cations. Journal of Ship Research 19(3), 172–191.

Horváth, L., Hušková, M., Kokoszka, P., and Steinebach, J. (2004). Monitoring changes in linear models. Journal of Statistical Planning and Inference 126(1), 225–251.

Hosking, J. R. (1990). L-moments: analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society. Series B (Methodological) 52(1), 105–124.

Hsu, D. (1977). Tests for variance shift at an unknown time point. Applied Statistics, 279–284.

Hsu, D. (1979). Detecting shifts of parameter in gamma sequences with applications to stock price and air traffic flow analysis. Journal of the American Statistical Association 74(365), 31–40.

Kim, T.-H. and White, H. (2004). On more robust estimation of skewness and kurtosis. Finance Research Letters 1(1), 56–73.

Kundu, D. and Raqab, M. Z. (2005). Generalized Rayleigh distribution: different methods of estimations. Computational Statistics and Data Analysis 49(1), 187–200.

Lee, E. T. and Wang, J. (2003). Statistical Methods for Survival Data Analysis, Volume 476. John Wiley & Sons.

Lomax, K. (1954). Business failures: Another example of the analysis of failure data. Journal of the American Statistical Association 49(268), 847–852.

Lombard, F. (1987). Rank tests for changepoint problems. Biometrika 74(3), 615–624.

Maguire, B. A., Pearson, E., and Wynn, A. (1952). The time intervals between industrial accidents. Biometrika 39(1/2), 168–180.

Merovci, F. (2014). Transmuted generalized Rayleigh distribution. Journal of Statistics Applications & Probability 3(1), 9–20.

Nadarajah, S. and Gupta, A. K. (2008). A product Pareto distribution. Metrika 68(2), 199–208.

Nagar, D. K., Joshi, L., and Gupta, A. K. (2012). Matrix variate Pareto distribution of the second kind. ISRN Probability and Statistics 2012, 1–20.

Nelson, W. B. (1982). Applied Life Data Analysis, Volume 577. John Wiley & Sons.

Ngunkeng, G. (2013). Statistical analysis of skew normal distribution and its applications. Ph.D. dissertation, Bowling Green State University.

Ning, W. (2012). Empirical likelihood ratio test for a mean change point model with a linear trend followed by an abrupt change. Journal of Applied Statistics 39(5), 947–961.

Ning, W. and Gupta, A. K. (2009). Change point analysis for generalized lambda distribution. Communications in Statistics-Simulation and Computation 38(9), 1789–1802.

Ning, W. and Gupta, A. K. (2012). Matrix variate extended skew normal distributions. Random Operators and Stochastic Equations 20(4), 299–310.

Pareto, V. (1897). The new theories of economics. Journal of Political Economy 5(4), 485–502.

Rajab, M., Aleem, M., Nawaz, T., and Daniyal, M. (2013). On five parameter beta Lomax distri- bution. Journal of Statistics 20(1), 102–118.

Rayleigh, L. (1880). On the resultant of a large number of vibrations of the same pitch and of arbitrary phase. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 10(60), 73–78.

Rinne, H. (2008). The Weibull Distribution: a handbook. CRC Press.

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics 6(2), 461–464.

Sen, A. and Srivastava, N. S. (1975). On tests for detecting change in mean when variance is unknown. Annals of the Institute of Statistical Mathematics 27(1), 479–486.

Shams, T. M. (2013). The Kumaraswamy-generalized Lomax distribution. Middle-East Journal of Scientific Research 17, 641–646.

Shaw, W. T. and Buckley, I. (2007). The alchemy of probability distributions: Beyond Gram-Charlier & Cornish-Fisher expansions, and skew-normal or kurtotic-normal distributions. http://www.mth.kcl.ac.uk/ shaww/web page/papers/alchemy.pdf 7, 1–28.

Smith, B. M. (2001). Toward Rational Exuberance: The Evolution of the Modern Stock Market. Macmillan.

Tahir, M., Hussain, M. A., Cordeiro, G. M., Hamedani, G., Mansoor, M., and Zubair, M. (2016). The Gumbel-Lomax distribution: Properties and applications. Journal of Statistical Theory and Applications 15(1), 61–79.

Vexler, A., Wu, C., Liu, A., Whitcomb, B. W., and Schisterman, E. F. (2009). An extension of a change-point problem. Statistics 43(3), 213–225.

Vostrikova, L. (1981). Detection of the disorder in multidimensional random-processes. Doklady Akademii Nauk SSSR 259(2), 270–274.

Wang, B. X., Yu, K., and Jones, M. (2010). Inference under progressively Type II right-censored sampling for certain lifetime distributions. Technometrics 52(4), 453–460.

Worsley, K. (1979). On the likelihood ratio test for a shift in location of normal populations. Journal of the American Statistical Association 74(366a), 365–367.

Wu, Y. (2007). Inference for Change Point and Post Change Means After a CUSUM Test, Volume 180. Springer Science & Business Media.

Zhang, N. R. and Siegmund, D. O. (2007). A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics 63(1), 22–32.

APPENDIX A

SELECTED R PROGRAMS

• Progressively Type II right censored samples for the transformed RL distribution.

## Generating progressively Type-II right censored samples from
## the transformed RL distribution.
add = function(x) {
  len = length(x)
  vector <- 0
  sum <- 0
  for (i in len:1) {
    sum = sum + x[i]
    vector[len - i + 1] = sum
  }
  return(vector)
}

mul = function(x) {
  len = length(x)
  vector <- 0
  sum <- 1
  for (i in len:1) {
    sum = sum * x[i]
    vector[len - i + 1] = sum
  }
  return(vector)
}

W <- runif(5, 0, 1)
V = 0
U = 0
x = 0
R <- c(0, 0, 1, 1, 1)  ## one of the censoring schemes
R1 <- add(R)
for (i in 1:length(R)) {
  V[i] <- W[i]^(1/(i + R1[i]))
}
V
V.product <- mul(V)
for (j in 1:length(R)) {
  U[j] <- 1 - V.product[j]
  x[j] <- (-2*log(1 - U[j]))^(1/2)
}
x

## The inverse estimation of the transformed RL distribution for the
## parameter alpha.
library(rootSolve)
fun <- function(alpha) {
  y <- numeric(1)
  m = 5
  W <- runif(m, 0, 1)
  V = 0
  U = 0
  M = 0
  x = 0
  R <- c(5, 0, 0, 0, 0)  ## the censoring scheme
  R1 <- add(R)           ## uses add() and mul() defined above
  for (i in 1:length(R)) {
    V[i] <- W[i]^(1/(i + R1[i]))
  }
  V
  V.product <- mul(V)
  for (j in 1:length(R)) {
    U[j] <- 1 - V.product[j]
    x[j] <- (-2*log(1 - U[j]))^(1/2)
  }
  x
  numerator = 0
  num = 0
  for (j in 1:m) {
    num[j] = (R[j] + 1)*(x[j]^(2*alpha))
  }
  numerator <- sum(num)
  n = sum(R) + m
  # To find the first part of the denominator
  temp = 0
  den1 = 0
  for (i in 1:(m - 1)) {
    for (j in 1:i) {
      temp = temp + ((R[j] + 1)*(x[j]^(2*alpha)))
    }
    den1[i] = temp
    temp = 0
  }
  # To find the second part of the denominator
  den2 = 0
  temp = 0
  for (i in 1:(m - 1)) {
    for (j in 1:i) {
      temp = temp + (R[j] + 1)
    }
    den2[i] = temp
    temp = 0
  }
  # To find the sum of the log ratios
  ratio = 0
  for (i in 1:(m - 1)) {
    ratio = ratio + log(numerator/(den1[i] + (n - den2[i])*x[i]^2))
  }
  ci = (2*ratio) - (2*(m - 2))
}
uni <- uniroot(fun, c(0, 8))$root

## The coverage probability of the parameter alpha.
alph = 1
conf <- function(alph = 1, correct = T) {
  m = 5
  W <- runif(m, 0, 1)
  # print(W)
  V = 0
  U = 0
  x = 0
  R <- c(1, 1, 1, 1, 1)  ## the censoring scheme
  # R1 <- c(R[3], R[2] + R[3], R[2] + R[3] + R[1])
  R1 <- add(R)           ## uses add() and mul() defined above
  for (i in 1:length(R)) {
    V[i] <- W[i]^(1/(i + R1[i]))
  }
  V
  # V.product <- c(V[3], V[3]*V[2], V[3]*V[2]*V[1])
  V.product <- mul(V)
  for (j in 1:length(R)) {
    U[j] <- 1 - V.product[j]
    x[j] <- (-2*log(1 - U[j]))^(1/2)
  }
  x
  numerator = 0
  num = 0
  for (j in 1:m) {
    num[j] = (R[j] + 1)*(x[j]^(2*alph))
  }
  numerator <- sum(num)
  n = sum(R) + m
  # To find the first part of the denominator
  temp = 0
  den1 = 0
  for (i in 1:(m - 1)) {
    for (j in 1:i) {
      temp = temp + ((R[j] + 1)*(x[j]^(2*alph)))
    }
    den1[i] = temp
    temp = 0
  }
  # To find the second part of the denominator
  den2 = 0
  temp = 0
  for (i in 1:(m - 1)) {
    for (j in 1:i) {
      temp = temp + (R[j] + 1)
    }
    den2[i] = temp
    temp = 0
  }
  # To find the sum of the log ratios
  ratio = 0
  for (i in 1:(m - 1)) {
    ratio = ratio + log(numerator/(den1[i] + (n - den2[i])*x[i]^2))
  }
  ci = 2*ratio
  ub <- qchisq(0.975, df = (2*(length(R) - 1)))
  lb <- qchisq(0.025, df = (2*(length(R) - 1)))
  if (ci < ub & ci > lb) {
    return(1)
  } else {
    return(0)
  }
}
set.seed(1200)
times = 10000
results <- replicate(times, conf(alph = 1, correct = T))
sum(results)/times

• The RL L-moments parameter estimation for the Aircraft Windshield data.

library(lmomco)  # for lmoms(); x1 holds the Aircraft Windshield data
data <- x1
lmom <- lmoms(x1)
tau3 <- lmom$ratios[3]
fun <- function(x) 2^(1/(2*x)) + 2*(2/3)^(1/(2*x)) - 3 + tau3 - tau3*2^(1/(2*x))
curve(fun(x), 0, 15)
abline(h = 0, lty = 3)
alpha.est <- uniroot(fun, c(0.001, 15))$root
tau2 <- lmom$ratios[2]
lambda2 <- lmom$lambdas[2]  # second sample L-moment
sigma.est <- (gamma((1/(2*alpha.est)) + 1)*(2^(1/(2*alpha.est))) -
  (1/tau2)*gamma((1/(2*alpha.est)) + 1)*(2^(1/(2*alpha.est))) +
  (1/tau2)*gamma((1/(2*alpha.est)) + 1))^(-1*alpha.est)
lambda.est <- lambda2/(sigma.est^(1/alpha.est)*gamma((1/(2*alpha.est)) + 1)*
  2^(1/(2*alpha.est)) - sigma.est^(1/alpha.est)*gamma((1/(2*alpha.est)) + 1))
## To compute AIC and SIC values
R.l <- function(alpha, lambda, sigma) {
  n <- length(x1)
  A = (alpha/(lambda*sigma^2))*((x1 + lambda)/lambda)^(2*alpha - 1)*
    exp((-1/(2*sigma^2))*((x1 + lambda)/lambda)^(2*alpha))
  return(-sum(log(A)))
}
AIC.RL <- 2*R.l(alpha.est, lambda.est, sigma.est) + 6
SIC.RL <- 2*R.l(alpha.est, lambda.est, sigma.est) + 3*log(length(x1))

• The RL method of moments parameter estimation for the Aircraft Windshield data.

library(nleqslv)  # data holds the Aircraft Windshield observations (x1)
dslnex <- function(x) {  # x = (alpha, lambda, sigma)
  y <- numeric(3)
  y[1] <- (sum(data)/length(data)) - (-x[2] + x[2]*x[3]^(1/x[1])*
    gamma((1/(2*x[1])) + 1)*2^(1/(2*x[1])))
  y[2] = (sum(data^2)/length(data)) - (x[2]^2 - 2*x[2]^2*x[3]^(1/x[1])*
    gamma((1/(2*x[1])) + 1)*2^(1/(2*x[1])) + (x[2]^2)*x[3]^(2/x[1])*
    gamma((2/(2*x[1])) + 1)*2^(2/(2*x[1])))
  y[3] = (sum(data^3)/length(data)) - (-x[2]^3 + 3*x[2]^3*x[3]^(1/x[1])*
    gamma((1/(2*x[1])) + 1)*2^(1/(2*x[1])) - 3*x[2]^3*x[3]^(2/x[1])*
    gamma((2/(2*x[1])) + 1)*2^(2/(2*x[1])) + x[2]^3*x[3]^(3/x[1])*
    gamma((3/(2*x[1])) + 1)*2^(3/(2*x[1])))
  y
}
xstart <- c(1, 1, 1)
fstart <- dslnex(xstart)
xstart
fstart
r <- nleqslv(xstart, dslnex, control = list(btol = .01))
# To compute AIC and SIC values
R.l <- function(alpha, lambda, sigma) {
  n <- length(x1)
  A = (alpha/(lambda*sigma^2))*((x1 + lambda)/lambda)^(2*alpha - 1)*
    exp((-1/(2*sigma^2))*((x1 + lambda)/lambda)^(2*alpha))
  return(-sum(log(A)))
}
AIC.RL <- 2*R.l(r$x[1], r$x[2], r$x[3]) + 6
SIC.RL <- 2*R.l(r$x[1], r$x[2], r$x[3]) + 3*log(length(x1))

• The power comparisons of SIC and MIC of the change point when all RL parameters change.

RL <- function(par, data) {  # RL negative log-likelihood, to compute the MLE
  alpha = par[1]
  lambda = par[2]
  sigma = par[3]
  B = (alpha/(sigma^2*lambda))*(((data + lambda)/lambda)^(2*alpha - 1))*
    exp((-1/(2*sigma^2))*((data + lambda)/lambda)^(2*alpha))
  -sum(log(B))
}
P <- function(x1, x2, n, s) {  # function to calculate the power
  x = c(x1, x2)
  mle0 <- optim(c(0.7, 3, 3), RL, data = x, method = "Nelder-Mead",
                hessian = FALSE)
  mle1 <- optim(c(0.6, 2, 4), RL, data = x1, method = "Nelder-Mead",
                hessian = FALSE)
  mle2 <- optim(c(0.8, 3, 4), RL, data = x2, method = "Nelder-Mead",
                hessian = FALSE)
  for (k in 4:(n - 2)) {
    SIC_O <- 2*(RL(c(mle0$par[1], mle0$par[2], mle0$par[3]), x)) + 3*log(n)
    SIC_1 <- 2*(RL(c(mle1$par[1], mle1$par[2], mle1$par[3]), x1)) +
      2*(RL(c(mle2$par[1], mle2$par[2], mle2$par[3]), x2)) + 7*log(n)
    MIC_O <- 2*(RL(c(mle0$par[1], mle0$par[2], mle0$par[3]), x)) + 3*log(n)
    MIC_1 <- 2*(RL(c(mle1$par[1], mle1$par[2], mle1$par[3]), x1)) +
      2*(RL(c(mle2$par[1], mle2$par[2], mle2$par[3]), x2)) +
      (6 + (((2*s)/n) - 1)^2)*log(n)
  }
  Tn <- SIC_O - min(SIC_1) + 3*log(n)  # SIC test statistic
  p1 = ifelse(Tn > qgumbel(0.95, 0, 2), 1, 0)  # qgumbel from the 'evd' package
  Sn <- MIC_O - min(MIC_1) + 3*log(n)  # MIC test statistic
  p2 = ifelse(Sn > qchisq(0.95, 3), 1, 0)
  res = c(p1, p2); res
}
n = 100
i = 0
p1 = 0
p2 = 0
K = 50  # change point location
while (i < 1001) {
  x1 <- 2*((-2*(2)^2*log(1 - runif(K, 0, 1)))^(1/(2*1))) - 2
  x2 <- 2.5*((-2*(2.5)^2*log(1 - runif(n - K, 0, 1)))^(1/(2*1.5))) - 2.5
  result = P(x1, x2, n, K)
  p1 = p1 + result[1]
  p2 = p2 + result[2]
  i = i + 1
}
p1  # the power of the SIC after rejecting the null hypothesis
p2  # the power of the MIC after rejecting the null hypothesis

• To verify numerically that the estimator of the change point converges in probability to its assumed value under both the SIC and MIC criteria.

RL <- function(par, data) {
  alpha = par[1]
  lambda = par[2]
  sigma = par[3]
  B = (alpha/(sigma^2*lambda))*(((data + lambda)/lambda)^(2*alpha - 1))*
    exp((-1/(2*sigma^2))*((data + lambda)/lambda)^(2*alpha))
  -sum(log(B))
}
k1 = 75  # the assumed change point location
n = 150
CALC = function(data) {
  SICn.Est <- optim(c(1, 1, 1), RL, data = data, method = "Nelder-Mead",
                    hessian = FALSE)
  SIC_0 <- 2*(RL(c(SICn.Est$par[1], SICn.Est$par[2], SICn.Est$par[3]),
    data)) + 3*log(n)
  # print(SIC_0)
  k0 <- 0
  SIC_k <- 10000000
  MIC_k <- 10000000
  for (k in 2:(n - 2)) {
    data_1 = data[1:k]
    data_2 = data[(k + 1):n]
    SICk1.Est <- optim(c(1, 1, 1), RL, data = data_1, method = "Nelder-Mead",
                       hessian = FALSE)
    SICk2.Est <- optim(c(1, 1, 1), RL, data = data_2, method = "Nelder-Mead",
                       hessian = FALSE)
    SIC_1 <<- 2*(RL(c(SICk1.Est$par[1], SICk1.Est$par[2], SICk1.Est$par[3]),
      data_1)) + 2*(RL(c(SICk2.Est$par[1], SICk2.Est$par[2],
      SICk2.Est$par[3]), data_2)) + 6*log(n)
    MIC_1 <<- 2*(RL(c(SICk1.Est$par[1], SICk1.Est$par[2], SICk1.Est$par[3]),
      data_1)) + 2*(RL(c(SICk2.Est$par[1], SICk2.Est$par[2],
      SICk2.Est$par[3]), data_2)) + (6 + ((2*k/n) - 1)^2)*log(n)
    # print(SIC_1)
    if (SIC_1 < SIC_k) {
      SIC_k <- SIC_1
      ksic <- k
      c1 <- abs(ksic - k1)  # difference between the SIC estimate and k1 (=75)
    }
    if (MIC_1 < MIC_k) {
      MIC_k <- MIC_1
      kmic <- k
      c2 <- abs(kmic - k1)  # difference between the MIC estimate and k1 (=75)
    }
  }
  return(c(SIC_k, MIC_k, ksic, kmic, c1, c2))
}
c1 <- 0
c2 <- 0
ksic <- 0
kmic <- 0
SIC_k <- 0
MIC_k <- 0
for (i in 1:100) {
  x_1 <- 1*((-2*(2)^2*log(1 - runif(k1, 0, 1)))^(1/(2*1))) - 1
  x_2 <- 0.9*((-2*(1)^2*log(1 - runif(n - k1, 0, 1)))^(1/(2*0.8))) - 0.9
  data <- c(x_1, x_2)
  res <- CALC(data)
  # print(res)
  SIC_k[i] <- res[1]
  MIC_k[i] <- res[2]
  ksic[i] <- res[3]
  kmic[i] <- res[4]
  c1[i] <- res[5]
  c2[i] <- res[6]
}
ratio_counter1 <- 0
ratio_counter2 <- 0
# Probability that the SIC estimate is within eta = 1 of k1 (=75).
for (j in 1:length(c1)) {
  if (c1[j] <= 1)
    ratio_counter1 = ratio_counter1 + 1
}
# Probability that the MIC estimate is within eta = 1 of k1 (=75).
for (d in 1:length(c2)) {
  if (c2[d] <= 1)
    ratio_counter2 = ratio_counter2 + 1
}