ISSN: 2332-2071 Volume 8 Number 2A 2020

Special Edition on Synergy Through Diversity in Science and Mathematics

Mathematics and Statistics

http://www.hrpub.org

Horizon Research Publishing, USA


Mathematics and Statistics

Mathematics and Statistics is an international peer-reviewed journal that publishes original, high-quality research papers in all areas of mathematics and statistics. As an important academic exchange platform, it allows scientists and researchers to follow the most up-to-date academic trends and find valuable primary sources for reference. The subject areas include, but are not limited to, the following fields: algebra, analysis, applied mathematics, approximation theory, combinatorics, computational statistics, computing in mathematics, design of experiments, discrete mathematics, dynamical systems, geometry and topology, logic and foundations of mathematics, number theory, numerical analysis, probability theory, quantity, recreational mathematics, sample surveys, statistical modelling, and statistical theory.

General Inquiries
Publish with HRPUB and learn about our policies, submission guidelines, etc.
Email: [email protected]
Tel: +1-626-626-7940

Subscriptions
Journal Title: Mathematics and Statistics
Journal's Homepage: http://www.hrpub.org/journals/jour_info.php?id=34
Publisher: Horizon Research Publishing Co., Ltd.
Address: 2880 Zanker Rd, Ste 203, San Jose, CA 95134, USA
Publication Frequency: bimonthly
Electronic Version: freely available online at http://www.hrpub.org/journals/jour_info.php?id=34

Online Submission
Manuscripts should be submitted through the Online Manuscript Tracking System (http://www.hrpub.org/submission.php). If you experience difficulties during the submission process, please contact the editor at [email protected].

Copyright
The authors retain all copyright interest, or it is retained by another copyright holder, as appropriate, and the authors agree that the manuscript remains permanently open access on HRPUB's site under the terms of the Creative Commons Attribution International License (CC BY).
HRPUB shall have the right to use and archive the content for the purpose of creating a record, and may reformat or paraphrase it to benefit the display of the record.

Creative Commons Attribution License (CC BY)
All articles published by HRPUB are distributed under the terms and conditions of the Creative Commons Attribution License (CC BY), so anyone may copy, distribute, and transmit an article on the condition that the original article and source are correctly cited.

Open Access
Open access is the practice of providing unrestricted access to peer-reviewed academic journal articles via the internet. It is also increasingly being provided to scholarly monographs and book chapters. All original research papers published by HRPUB are freely and permanently accessible online immediately after publication. Readers are free to copy and distribute the contribution under the Creative Commons Attribution license. Authors benefit from the open access publication model in the following ways:
• High availability and high visibility: free and unlimited accessibility of the publication over the internet without any restrictions;
• Rigorous peer review of research papers: fast, high-quality double-blind peer review;
• Faster publication with less cost: papers published on the internet without any subscription charge;
• Higher citation: open access publications are more frequently cited.

Editor-in-Chief

Prof. Dshalalow Jewgeni Florida Institute of Technology, USA

Members of Editorial Board

Jiafeng Lu Zhejiang Normal University, China

Nadeem-ur Rehman Aligarh Muslim University, India

Debaraj Sen Concordia University, Canada

Mauro Spreafico University of São Paulo, Brazil

Veli Shakhmurov Okan University, Turkey

Antonio Maria Scarfone Institute of Complex Systems - National Research Council, Italy

Liang-yun Zhang Nanjing Agricultural University, China

Ilgar Jabbarov Ganja State University, Azerbaijan

Mohammad Syed Pukhta Sher-e-Kashmir University of Agricultural Sciences and Technology, India

Vadim Kryakvin Southern Federal University, Russia

Rakhshanda Dzhabarzadeh National Academy of Science of Azerbaijan, Azerbaijan

Sergey Sudoplatov Sobolev Institute of Mathematics, Russia

Birol Altın Gazi University, Turkey

Araz Aliev Baku State University, Azerbaijan

Francisco Gallego Lupianez Universidad Complutense de Madrid, Spain

Hui Zhang St. Jude Children's Research Hospital, USA

Yusif Abilov Odlar Yurdu University, Azerbaijan

Evgeny Maleko Magnitogorsk State Technical University, Russia

İmdat İşcan Giresun University, Turkey

Emanuele Galligani University of Modena and Reggio Emillia, Italy

Mahammad Nurmammadov Baku State University, Azerbaijan

Table of Contents

Mathematics and Statistics

Volume 8 Number 2A 2020

The Performance of Different Correlation Coefficient under Contaminated Bivariate Data (https://www.doi.org/10.13189/ms.2020.081301) Bahtiar Jamili Zaini, Shamshuritawati Sharif ...... 1

Approximate Analytical Solutions of Nonlinear Korteweg-de Vries Equations Using Multistep Modified Reduced Differential Transform Method (https://www.doi.org/10.13189/ms.2020.081302) Che Haziqah Che Hussin, Ahmad Izani Md Ismail, Adem Kilicman, Amirah Azmi ...... 9

Bayesian Estimation in Piecewise Constant Model with Gamma Noise by Using Reversible Jump MCMC (https://www.doi.org/10.13189/ms.2020.081303) Suparman ...... 17

Weakly Special Classes of Modules (https://www.doi.org/10.13189/ms.2020.081304) Puguh Wahyu Prasetyo, Indah Emilia Wijayanti, Halina France-Jackson, Joe Repka ...... 23

Markov Chain: First Step towards Heat Wave Analysis in (https://www.doi.org/10.13189/ms.2020.081305) Nur Hanim Mohd Salleh, Husna Hasan, Fariza Yunus ...... 28

Robust Method in Multiple Linear Regression Model on Diabetes Patients (https://www.doi.org/10.13189/ms.2020.081306) Mohd Saifullah Rusiman, Siti Nasuha Md Nor, Suparman, Siti Noor Asyikin Mohd Razali ...... 36

An Alternative Approach for Finding Newton's Direction in Solving Large-Scale Unconstrained Optimization for Problems with an Arrowhead Hessian Matrix (https://www.doi.org/10.13189/ms.2020.081307) Khadizah Ghazali, Jumat Sulaiman, Yosza Dasril, Darmesah Gabda ...... 40

Parameter Estimations of the Generalized Extreme Value Distributions for Small Sample Size (https://www.doi.org/10.13189/ms.2020.081308) Razira Aniza Roslan, Chin Su Na, Darmesah Gabda ...... 47

Fourth-order Compact Iterative Scheme for the Two-dimensional Time Fractional Sub-diffusion Equations (https://www.doi.org/10.13189/ms.2020.081309) Muhammad Asim Khan, Norhashidah Hj. Mohd Ali ...... 52

Hybrid Flow-Shop Scheduling (HFS) Problem Solving with Migrating Birds Optimization (MBO) Algorithm (https://www.doi.org/10.13189/ms.2020.081310) Yona Eka Pratiwi, Kusbudiono, Abduh Riski, Alfian Futuhul Hadi ...... 58

Mathematics and Statistics 8(2A): 1-8, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081301

The Performance of Different Correlation Coefficient under Contaminated Bivariate Data

Bahtiar Jamili Zaini*, Shamshuritawati Sharif

School of Quantitative Sciences, Universiti Utara Malaysia, Malaysia

Received July 31, 2019; Revised September 28, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  Bivariate data consists of 2 random variables that are obtained from the same population. The relationship between 2 bivariate variables can be measured by a correlation coefficient. A correlation coefficient computed from the sample data is used to measure the strength and direction of a linear relationship between 2 variables. However, the classical correlation coefficient results are inadequate in the presence of outliers. Therefore, this study focuses on the performance of different correlation coefficients under contaminated bivariate data to determine the strength of their relationships. We compared the performance of 5 types of correlation: classical correlations, namely the Pearson correlation, Spearman correlation and Kendall's Tau correlation, and robust correlations, namely the median correlation and the median absolute deviation correlation. Results show that when there is no contamination in the data, all 5 correlation methods show a strong relationship between the 2 random variables. However, under data contamination, the median absolute deviation correlation denotes a strong relationship compared to the other methods.

Keywords  Bivariate, Contaminated Data, Correlation, Robust Correlation

1. Introduction

Analysis of bivariate data is a statistical method to investigate the relationship between 2 variables. 2 variables, for example X and Y, are said to be related when the value assumed by one variable affects the distribution of the other variable. For a bivariate datum, the input or independent variable, xi, is plotted on the horizontal axis, and the output or dependent variable, yi, is plotted on the vertical axis. A scatter plot can be used to plot all the ordered pairs (xi, yi) of bivariate data on a coordinate axis system and to display the relationship between these 2 variables. Besides the scatter plot, the correlation coefficient can be used to measure the relationship between 2 variables [1]. There are several types of correlation coefficients, such as the Pearson correlation coefficient, Spearman rank correlation coefficient, Kendall's Tau correlation coefficient, and other robust correlation methods. The correlation coefficient is a simple statistical measure of the relationship between 2 random variables. The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between 2 variables [2]. If there is a strong linear relationship between the variables, the value of the correlation coefficient will be close to +1 or -1, depending on the direction of that relationship. However, when there is no linear relationship between the variables, or only a weak relationship, the value of the correlation coefficient will be close to 0. The correlation coefficient can also serve as an indicator of the statistical fit of a regression through its squared value, frequently known as the coefficient of determination, which is used as a measure of goodness-of-fit [2]–[4].

Furthermore, testing the correlation has become an important subject in multivariate analysis. The importance of testing the correlation has been shown in many areas such as economics, financial markets, medicine, and social science [5]–[9]. However, there are issues in testing correlation matrices using classical statistics when outliers exist in the data set [10], [11]. In the presence of outliers, the Pearson correlation coefficient results will be inadequate: outliers reduce the capability to measure the strength of the relationship, damage statistical analyses, increase the variance of the error and reduce the power of statistical tests [12], [13].

To overcome the presence of outliers, robust methods can be used as alternatives for reducing the influence of outliers [10], [14]–[16]. The aim of a robust statistical procedure is to address the data without being affected by outliers and to ensure the stability of statistical inference under deviations from the assumed


distribution model [17], [18]. In the multivariate setting, the work on robust methods started by Huber has been successfully applied to regression, repeated measures, principal components, canonical correlation, discriminant analysis, dimension reduction and correspondence analysis [15].

2. Methodology

In this study, we focus on the performance of robust correlation coefficients under contaminated data, to ensure that hypothesis testing of the correlation is conducted effectively. Therefore, the ultimate goal of this research is to identify the best correlation coefficient for contaminated data using a simulation study.

Figure 1. No contamination data

2.1. Simulation Study

A simulation study is used to compare the performance of all correlation coefficients under 2 situations: with and without contaminated data. R programming version 3.4.2 is used to run the simulations. This study used correlation values, bias and standard errors to conduct the performance analysis among the correlation methods [12], [19]. We generated 200 datasets based on the linear relation

y_i = 6.0 + 2.0 x_i + z_i

where the x_i are pseudo-random numbers from the uniform distribution U(0,1) and the z_i are drawn from the normal distribution N(0,1). The 200 datasets are generated for each of the sample sizes 20, 30, 50 and 100. For each dataset, we generated contaminated data amounting to 10%, 20%, 30%, 40% and 50% of the sample size, respectively. Then, we calculated the correlation coefficient value, bias and standard deviation.

Figure 2. 10% contamination data
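The data-generation scheme above can be sketched as follows. The paper ran its simulations in R 3.4.2; this is a Python/NumPy sketch, and the contamination mechanism (shown here as an additive shift applied to a random subset of points) is an assumption, since the excerpt does not specify how the outliers were produced.

```python
import numpy as np

def generate_dataset(n, contamination=0.0, shift=10.0, rng=None):
    """Generate y_i = 6.0 + 2.0*x_i + z_i with x ~ U(0,1) and z ~ N(0,1),
    then contaminate a fraction of the points.

    The contamination step (an additive shift on randomly chosen points)
    is a hypothetical stand-in for the paper's unspecified mechanism."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(0.0, 1.0, size=n)
    z = rng.normal(0.0, 1.0, size=n)
    y = 6.0 + 2.0 * x + z
    k = int(round(contamination * n))
    if k > 0:
        idx = rng.choice(n, size=k, replace=False)
        y[idx] += shift  # assumed contamination: shift selected points upward
    return x, y

# One of the 200 replicates: n = 20 with 10% contamination (2 outliers)
x, y = generate_dataset(20, contamination=0.10, rng=np.random.default_rng(1))
```

Note that with the scales as printed (x on U(0,1), noise N(0,1)) the population correlation of x and y is 0.5; reproducing correlation values near the paper's 0.99 would require a wider x range, so the generator scales are a parameter one may need to vary.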

2.2. Nature of Data

Figures 1-6 show examples, from 1 of the 200 datasets, of the nature of the data from no contamination up to 50% contamination. Figure 1 shows that there is a strong positive linear relationship between the 2 variables and no contaminated data in the dataset. The fitted simple regression for this dataset is y = 6.3552 + 1.9414x. We simulated the dataset in Figure 1 and then added 10% contaminated data to it. Some points appear somewhat distant from the other data, as illustrated in Figure 2. We continued to simulate the dataset, increasing the contamination from 20% up to 50%. Figures 3-6 show more data moving away from the original dataset as we increased the percentage of contaminated data.

Figure 3. 20% contamination data

Figure 4. 30% contamination data


Figure 5. 40% contamination data

Figure 6. 50% contamination data

2.3. Method of Data Analysis

Generally, we have 6 conditions of contaminated data (0%, 10%, 20%, 30%, 40%, and 50%) and 4 different sample sizes (n = 20, 30, 50, and 100). Based on these conditions, we therefore have 24 sets of 200 generated datasets. The performance evaluation is based on 3 measurements: the value of the correlation, the bias and the standard error. 5 correlation coefficients are applied: classical correlations, namely the Pearson correlation, Spearman correlation and Kendall's Tau correlation, and robust correlations, namely the median correlation and the median absolute deviation (MAD) correlation.

2.3.1. Pearson Correlation

There are several types of correlation methods to measure relationships among variables, such as the Pearson correlation. The Pearson correlation coefficient is a classical statistical approach for measuring the relationship between 2 variables, especially for data that are normally distributed with a linear relationship between the 2 variables. Let X and Y be drawn from 2 random variables with sample size n; then the sample correlation introduced by Pearson is defined in (1):

r_p = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / [Σ_{i=1}^{n} (x_i − x̄)² · Σ_{i=1}^{n} (y_i − ȳ)²]^(1/2)   (1)

The Pearson correlation coefficient can be measured when the data are on at least an interval scale. However, the Pearson correlation is excessively influenced by outliers, unequal variances, non-normality, and nonlinearity.

2.3.2. Spearman Correlation

A non-parametric alternative to Pearson's linear correlation coefficient is the Spearman correlation. Usually, the Spearman correlation coefficient is used when the data are non-normally distributed, measured on an ordinal scale, or contain outliers. Technically, the Spearman correlation involves ranking each set of data; it is therefore sometimes called Spearman's rank-order correlation. The Spearman rank-order correlation coefficient measures the strength and direction of association between 2 ranked variables. This correlation coefficient is defined in (2):

r_s = 1 − 6 Σ_{i=1}^{n} d_i² / [n(n² − 1)]   (2)

where d_i is the difference in paired rankings, R(X) − R(Y), and n is the number of pairs of data. The value of r_s ranges from -1 to +1 and is used in the same manner as Pearson's linear correlation coefficient r.

2.3.3. Kendall's Tau Correlation

Similar to the Spearman rank correlation coefficient, Kendall's Tau correlation, τ, is based on the ranks of the observations. Commonly, Kendall's Tau correlation is used to measure the degree of correspondence between sets of rankings where the measures are not equidistant. Because it is used with ordinal association data, this correlation is also a non-parametric alternative to the Pearson correlation coefficient. The Kendall's Tau correlation coefficient is given in (3):

τ = (n_c − n_d) / [n(n − 1)/2]   (3)

where n_c is the number of concordant pairs, n_d is the number of discordant pairs, and n is the number of pairs of data.

2.3.4. Median Correlation

Another alternative for calculating a correlation between 2 variables is a robust correlation approach. Robust correlation methods can ensure high stability of statistical inference under deviations from the assumed distribution model [18], [20]. The median correlation is one type of robust correlation coefficient, known as the median-product correlation coefficient. As the Pearson correlation refers to the product-moment correlation, an alternative approach to calculating the correlation between 2 variables can use the median as an

4 The Performance of Different Correlation Coefficient under Contaminated Bivariate Data

estimator of the mean. This method uses a 2-dimensional sample position determined by the use of the median as an estimator of the mean.

From a bivariate data pair (X, Y), we determine the median of the X variable, denoted Med(x), and the median of the Y variable, denoted Med(y). Then, the median absolute deviation for variable X, denoted MAD(x), is given by MAD(x) = Med(|x_i − Med(x)|), while the median absolute deviation for variable Y, denoted MAD(y), is given by MAD(y) = Med(|y_i − Med(y)|). The median covariance for variables X and Y, denoted Covmed(X, Y), is the median of the products: Covmed(X, Y) = Med{(x − Med(x))(y − Med(y))}.

In (1), the denominator is replaced by MAD(x) and MAD(y), and the numerator is replaced by Covmed(X, Y). Then, the median correlation, denoted r_m, is given by (4) and (5):

r_m = Covmed(X, Y) / [MAD(x) · MAD(y)]   (4)

or

r_m = Med{(x − Med(x))(y − Med(y))} / [MAD(x) · MAD(y)]   (5)

2.3.5. MAD Correlation

Shevlyakov et al. [21] utilize the MAD scale estimator to obtain the median correlation coefficient and the MAD correlation coefficient. The MAD correlation is defined in (6) to (8):

r_MAD = (MAD²(u) − MAD²(v)) / (MAD²(u) + MAD²(v))   (6)

where

u = (x − Med(x)) / (√2 · MAD(x)) + (y − Med(y)) / (√2 · MAD(y))   (7)

v = (x − Med(x)) / (√2 · MAD(x)) − (y − Med(y)) / (√2 · MAD(y))   (8)

3. Results and Discussion

Table 1 shows the performance of the Pearson correlation against the robust correlation coefficients. 2 variables have a strong relationship if the value of the sample correlation is close to 1 or -1. For uncontaminated data, which has no outliers, all the correlation methods demonstrate a coefficient value close to the original, ρ = +1. The Pearson correlation values are the highest compared to the other correlations when the sample size is small: for n = 20, r = 0.986 and for n = 30, r = 0.985. For sample sizes 50 and 100, the MAD correlation gave the highest value, r = 0.99, for both sample sizes. On the other hand, the Kendall Tau correlation recorded the worst values for all sample sizes.

The performance of all 5 correlation methods can also be measured using the average bias and the standard error. Based on the 200 generated samples, we calculated the average and standard error of the sample correlation; these biases and standard errors are displayed in Table 1. Bias is defined as the difference between the average of the sample correlation coefficient and the true correlation. Therefore, the smaller the values of the bias and standard error, the better the performance of the method. Referring to Table 1, the Pearson correlation coefficient shows the smallest bias and standard error for small sample sizes (n = 20 and 30), while the MAD correlation coefficient does so for large sample sizes (n = 50 and 100).

The performance of all 5 correlation methods is then compared with and without contaminated data. Tables 2-5 show the correlation coefficient values for all 5 methods for different percentages of contaminated data for sample sizes n = 20, 30, 50 and 100, respectively. Based on Table 2, for a small sample size (n = 20), when contaminated data exist in the dataset, the correlation value for all methods drops compared to the case without contamination. This means that the correlation value is highly sensitive to the presence of outliers, which can cause invalid results. With 10% contaminated data in the dataset, the MAD correlation gave the highest correlation value (r = 0.968) compared to the others, showing that although outliers exist in the data, the MAD correlation value is not greatly affected. The Pearson correlation value (r = 0.554) drops more than the other correlation methods; therefore, the Pearson correlation coefficient suffers in the presence of outliers although only 10% outliers exist in the dataset.
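The five estimators defined in Section 2.3 can be sketched in Python as follows (the paper used R). The three classical coefficients come from `scipy.stats`; `median_corr` and `mad_corr` are direct transcriptions of equations (4)–(8); the function names are mine.

```python
import numpy as np
from scipy import stats

def mad(v):
    """Median absolute deviation: Med(|v_i - Med(v)|)."""
    return np.median(np.abs(v - np.median(v)))

def median_corr(x, y):
    """Median correlation, eq. (4): Covmed(X, Y) / (MAD(x) * MAD(y))."""
    cov_med = np.median((x - np.median(x)) * (y - np.median(y)))
    return cov_med / (mad(x) * mad(y))

def mad_corr(x, y):
    """MAD correlation, eqs. (6)-(8), from the robust scales of u and v."""
    u = (x - np.median(x)) / (np.sqrt(2) * mad(x)) \
        + (y - np.median(y)) / (np.sqrt(2) * mad(y))
    v = (x - np.median(x)) / (np.sqrt(2) * mad(x)) \
        - (y - np.median(y)) / (np.sqrt(2) * mad(y))
    return (mad(u) ** 2 - mad(v) ** 2) / (mad(u) ** 2 + mad(v) ** 2)

def all_five(x, y):
    """All 5 coefficients compared in the study, keyed by method name."""
    return {
        "pearson": stats.pearsonr(x, y)[0],
        "spearman": stats.spearmanr(x, y)[0],
        "kendall": stats.kendalltau(x, y)[0],
        "median": median_corr(x, y),
        "mad": mad_corr(x, y),
    }
```

For an exactly linear sample, e.g. y = 2x + 1, all five estimators return 1, which is a quick sanity check that the robust formulas are transcribed consistently with (1)–(3).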


Table 1. The Correlation Coefficient without Contaminated Data

Sample Size | Statistic | Pearson | Spearman | Kendall | Median | MAD
n = 20      | Mean      | 0.986   | 0.971    | 0.888   | 0.977  | 0.981
n = 20      | Bias      | 0.014   | 0.029    | 0.112   | 0.023  | 0.019
n = 20      | S.E.      | 0.006   | 0.015    | 0.036   | 0.026  | 0.022
n = 30      | Mean      | 0.985   | 0.978    | 0.894   | 0.980  | 0.984
n = 30      | Bias      | 0.015   | 0.022    | 0.106   | 0.020  | 0.016
n = 30      | S.E.      | 0.005   | 0.009    | 0.026   | 0.016  | 0.013
n = 50      | Mean      | 0.985   | 0.984    | 0.893   | 0.988  | 0.990
n = 50      | Bias      | 0.015   | 0.016    | 0.107   | 0.012  | 0.010
n = 50      | S.E.      | 0.003   | 0.003    | 0.011   | 0.004  | 0.004
n = 100     | Mean      | 0.985   | 0.984    | 0.893   | 0.988  | 0.990
n = 100     | Bias      | 0.015   | 0.016    | 0.107   | 0.012  | 0.010
n = 100     | S.E.      | 0.003   | 0.003    | 0.011   | 0.004  | 0.004

Table 2. The Correlation Coefficient Value for n = 20 with Different Percentages of Contaminated Data

Contaminated Data | Pearson | Spearman | Kendall | Median | MAD
0%                | 0.986   | 0.971    | 0.888   | 0.977  | 0.981
10%               | 0.554   | 0.791    | 0.723   | 0.940  | 0.968
20%               | 0.409   | 0.629    | 0.572   | 0.847  | 0.926
30%               | 0.315   | 0.481    | 0.442   | 0.625  | 0.651
40%               | 0.257   | 0.354    | 0.322   | 0.396  | 0.342
50%               | 0.208   | 0.234    | 0.215   | 0.060  | 0.171

We added another 10% outliers, for a total of 20% outliers in the dataset; the MAD correlation still presents a good correlation value (r = 0.926) compared to the others. The MAD correlation shows the best performance among the correlation methods even though the sample size is small. When the data consist of more than 40% contaminated data, the correlation values for all the methods indicate weak relationships.

The performance of the correlation methods as the sample size increases is presented next. Tables 3-5 display the coefficient values of the different correlation methods for the remaining sample sizes. The behaviour for large sample sizes, n = 50 and n = 100, is the same as for the small sample sizes, n = 20 and n = 30. As we can see in Tables 3-5, the correlation values for all methods drop compared to the case without outliers in the dataset. The MAD correlation still produces the highest correlation coefficient values for the different percentages of contaminated data, except for 40% contamination.

The performance of all 5 correlation methods with contaminated data can also be measured using their average bias and standard error. Based on the 200 generated samples containing 10% contaminated data, the bias and standard error of the correlation methods were calculated, as displayed in Table 6. The results show that the MAD correlation has the smallest bias and standard error compared to the other methods for all sample sizes. The median correlation has the second lowest bias and standard error. Meanwhile, the Pearson correlation has the largest bias and standard error for all sample sizes.
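The bias and standard-error computation described above can be sketched as a small Monte Carlo harness. This is a Python sketch: `corr_fn` stands in for any of the five coefficients, ρ_true = 1 follows the paper's convention of measuring bias against +1, and since the excerpt calls the spread both "standard deviation" and "standard error", the sketch returns both.

```python
import numpy as np

def simulate_bias_se(corr_fn, n, n_rep=200, rho_true=1.0, rng=None):
    """Monte Carlo evaluation over n_rep replicate datasets:
    mean coefficient, bias = rho_true - mean, replicate standard
    deviation, and standard error of the mean."""
    rng = np.random.default_rng() if rng is None else rng
    vals = np.empty(n_rep)
    for i in range(n_rep):
        x = rng.uniform(0.0, 1.0, size=n)
        y = 6.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)
        vals[i] = corr_fn(x, y)
    mean = vals.mean()
    sd = vals.std(ddof=1)
    return {"mean": mean, "bias": rho_true - mean,
            "sd": sd, "se": sd / np.sqrt(n_rep)}

# Example: Pearson correlation at n = 50 over 200 clean replicates
res = simulate_bias_se(lambda x, y: np.corrcoef(x, y)[0, 1], n=50,
                       rng=np.random.default_rng(0))
```

As with the earlier generator sketch, the U(0,1)/N(0,1) scales as printed give a population correlation of about 0.5, so the harness will not reproduce the paper's near-1 means unless the x range is widened; the generator inside the loop is the piece to adjust.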


Table 3. The Correlation Coefficient Value for n = 30 with Different Percentages of Contaminated Data

Contaminated Data | Pearson | Spearman | Kendall | Median | MAD
0%                | 0.985   | 0.978    | 0.894   | 0.980  | 0.984
10%               | 0.538   | 0.783    | 0.714   | 0.942  | 0.975
20%               | 0.398   | 0.619    | 0.562   | 0.843  | 0.927
30%               | 0.308   | 0.470    | 0.430   | 0.637  | 0.657
40%               | 0.259   | 0.350    | 0.318   | 0.409  | 0.346
50%               | 0.211   | 0.246    | 0.223   | 0.096  | 0.209

Table 4. The Correlation Coefficient Value for n = 50 with Different Percentages of Contaminated Data

Contaminated Data | Pearson | Spearman | Kendall | Median | MAD
0%                | 0.985   | 0.984    | 0.893   | 0.988  | 0.990
10%               | 0.533   | 0.785    | 0.714   | 0.957  | 0.978
20%               | 0.409   | 0.630    | 0.573   | 0.867  | 0.949
30%               | 0.305   | 0.467    | 0.426   | 0.629  | 0.655
40%               | 0.244   | 0.339    | 0.311   | 0.429  | 0.365
50%               | 0.224   | 0.252    | 0.227   | 0.079  | 0.212

Table 5. The Correlation Coefficient Value for n = 100 with Different Percentages of Contaminated Data

Contaminated Data | Pearson | Spearman | Kendall | Median | MAD
0%                | 0.985   | 0.984    | 0.893   | 0.988  | 0.990
10%               | 0.545   | 0.792    | 0.719   | 0.963  | 0.984
20%               | 0.399   | 0.623    | 0.566   | 0.873  | 0.953
30%               | 0.320   | 0.480    | 0.435   | 0.667  | 0.688
40%               | 0.259   | 0.351    | 0.318   | 0.468  | 0.395
50%               | 0.203   | 0.233    | 0.216   | 0.044  | 0.197

Table 6. The Value of Bias and Standard Error of Correlation Methods with 10% Contaminated Data

Sample Size | Statistic | Pearson | Spearman | Kendall | Median | MAD
n = 20      | Bias      | 0.446   | 0.209    | 0.277   | 0.060  | 0.032
n = 20      | S.E.      | 0.191   | 0.121    | 0.087   | 0.059  | 0.039
n = 30      | Bias      | 0.462   | 0.217    | 0.286   | 0.058  | 0.025
n = 30      | S.E.      | 0.147   | 0.097    | 0.071   | 0.053  | 0.021
n = 50      | Bias      | 0.467   | 0.215    | 0.286   | 0.043  | 0.022
n = 50      | S.E.      | 0.122   | 0.074    | 0.051   | 0.034  | 0.017
n = 100     | Bias      | 0.455   | 0.208    | 0.281   | 0.037  | 0.016
n = 100     | S.E.      | 0.089   | 0.057    | 0.039   | 0.022  | 0.008

We continued the performance comparison of the correlation methods as the amount of contaminated data increased. Tables 7-10 display the bias and standard error of the 5 correlation methods for different percentages of contaminated data. Based on Table 7, we conclude that the same pattern occurs: the MAD correlation has the smallest bias and standard error for all sample sizes, followed by the median correlation with the second lowest bias and standard error. The Pearson correlation still has the largest bias and standard error for all sample sizes.


Table 7. The Value of Bias and Standard Error of Correlation Methods with 20% Contaminated Data

Sample Size | Statistic | Pearson | Spearman | Kendall | Median | MAD
n = 20      | Bias      | 0.591   | 0.371    | 0.428   | 0.153  | 0.074
n = 20      | S.E.      | 0.213   | 0.170    | 0.121   | 0.118  | 0.074
n = 30      | Bias      | 0.602   | 0.381    | 0.438   | 0.157  | 0.073
n = 30      | S.E.      | 0.161   | 0.131    | 0.090   | 0.105  | 0.069
n = 50      | Bias      | 0.591   | 0.370    | 0.427   | 0.133  | 0.051
n = 50      | S.E.      | 0.123   | 0.099    | 0.069   | 0.096  | 0.037
n = 100     | Bias      | 0.601   | 0.377    | 0.434   | 0.127  | 0.047
n = 100     | S.E.      | 0.093   | 0.074    | 0.050   | 0.065  | 0.023

Table 8. The Value of Bias and Standard Error of Correlation Methods with 30% Contaminated Data

Sample Size | Statistic | Pearson | Spearman | Kendall | Median | MAD
n = 20      | Bias      | 0.685   | 0.519    | 0.558   | 0.375  | 0.349
n = 20      | S.E.      | 0.205   | 0.186    | 0.130   | 0.212  | 0.263
n = 30      | Bias      | 0.692   | 0.530    | 0.570   | 0.363  | 0.343
n = 30      | S.E.      | 0.174   | 0.154    | 0.108   | 0.184  | 0.217
n = 50      | Bias      | 0.695   | 0.533    | 0.574   | 0.371  | 0.345
n = 50      | S.E.      | 0.128   | 0.117    | 0.081   | 0.153  | 0.174
n = 100     | Bias      | 0.680   | 0.520    | 0.565   | 0.333  | 0.312
n = 100     | S.E.      | 0.091   | 0.081    | 0.055   | 0.104  | 0.110

Table 9. The Value of Bias and Standard Error of Correlation Methods with 40% Contaminated Data

Sample Size | Statistic | Pearson | Spearman | Kendall | Median | MAD
n = 20      | Bias      | 0.743   | 0.646    | 0.678   | 0.604  | 0.658
n = 20      | S.E.      | 0.232   | 0.225    | 0.159   | 0.313  | 0.360
n = 30      | Bias      | 0.741   | 0.650    | 0.682   | 0.591  | 0.654
n = 30      | S.E.      | 0.157   | 0.151    | 0.103   | 0.263  | 0.305
n = 50      | Bias      | 0.756   | 0.661    | 0.689   | 0.571  | 0.635
n = 50      | S.E.      | 0.123   | 0.122    | 0.084   | 0.205  | 0.241
n = 100     | Bias      | 0.741   | 0.649    | 0.682   | 0.532  | 0.605
n = 100     | S.E.      | 0.079   | 0.077    | 0.053   | 0.150  | 0.176

Table 10. The Value of Bias and Standard Error of Correlation Methods with 50% Contaminated Data

Sample Size | Statistic | Pearson | Spearman | Kendall | Median | MAD
n = 20      | Bias      | 0.792   | 0.766    | 0.785   | 0.940  | 0.829
n = 20      | S.E.      | 0.232   | 0.230    | 0.164   | 0.401  | 0.445
n = 30      | Bias      | 0.789   | 0.754    | 0.777   | 0.904  | 0.791
n = 30      | S.E.      | 0.178   | 0.176    | 0.120   | 0.328  | 0.353
n = 50      | Bias      | 0.776   | 0.748    | 0.773   | 0.921  | 0.788
n = 50      | S.E.      | 0.137   | 0.133    | 0.091   | 0.272  | 0.308
n = 100     | Bias      | 0.797   | 0.767    | 0.784   | 0.956  | 0.803
n = 100     | S.E.      | 0.092   | 0.091    | 0.061   | 0.208  | 0.233


With 50% contaminated data, all the correlation methods gave poor results, as displayed in Table 10. All correlation methods produced high values of bias and standard error; the bias is around 0.748 to 0.973. This means that with 50% contaminated data, the strength is reduced from a strong relationship to no relationship among the variables.

4. Conclusions

For perfect data with no outliers in the dataset, all 5 correlation methods demonstrated a strong relationship, close to +1. However, when outliers exist in the dataset, the MAD correlation still gave a strong relationship compared to the other methods. This shows that although outliers exist in the data, the value of the MAD correlation is not greatly affected; the relationship became weak for the MAD correlation only when there was 40% contaminated data. The value of the Pearson correlation drops more than the other correlation methods; therefore, the Pearson correlation coefficient suffers in the presence of outliers. This means that the Pearson correlation value is highly sensitive to the presence of outliers, which can cause invalid results.

Acknowledgements

This research is part of a PhD study under the Awang Had Salleh Graduate School, Universiti Utara Malaysia.

REFERENCES

[1] Larson, R. and Faber, B., Elementary Statistics: Picturing the World, Seventh Edition, Pearson Education, 2019.

[2] Asuero, A. G., Sayago, A. and González, A. G., The Correlation Coefficient: An Overview, Critical Reviews in Analytical Chemistry, vol. 36, no. 1, pp. 41–59, 2006.

[3] Mendenhall, W. and Sincich, T., Regression Analysis: A Second Course in Statistics, Eighth Edition, Pearson Education, 2019.

[4] Montgomery, D. C. and Runger, G. C., Applied Statistics and Probability for Engineers, Sixth Edition, John Wiley & Sons, 2013.

[5] Gupta, A., Johnson, B. and Nagar, D., Testing Equality of Several Correlation Matrices, Revista Colombiana de Estadística, vol. 36, no. 2, pp. 237–258, 2013.

[6] Djauhari, M. A. and Herdiani, E. T., Monitoring the Stability of Correlation Structure in Drive Rib Production Process: An MSPC Approach, Open Industrial Manufacturing Engineering Journal, vol. 1, no. 1, pp. 8–18, 2008.

[7] Yusoff, N. S. and Djauhari, M. A., Understanding the Shift of Correlation Structure in Major Currency, in Regional Annual Fundamental Science Symposium, 2010.

[8] Djauhari, M. A. and Gan, S. L., Dynamics of Correlation Structure in Stock Market, Entropy, vol. 16, no. 1, pp. 455–470, 2014.

[9] Li, F., Expression and Correlation of miR-124 and miR-126 in Breast Cancer, Oncology Letters, vol. 17, no. 6, pp. 5115–5119, 2019.

[10] Wilcox, R. R., Granger, D. A. and Clark, F., Modern Robust Statistical Methods: Basics with Illustrations Using Psychobiological Data, Universal Journal of Psychology, vol. 1, no. 2, pp. 21–31, 2013.

[11] Sharif, S. and Atiany, T. A. M., Testing Several Correlation Matrices Using Robust Approach, Asian Journal of Scientific Research, vol. 11, no. 1, pp. 84–95, 2017.

[12] Ahad, N. A., Zakaria, N. A., Abdullah, S., Yahaya, S. S. S. and Yusof, N., Robust Correlation Procedure via Sn Estimator, Journal of Telecommunication and Computer Engineering, vol. 10, no. 1–10, pp. 115–118, 2018.

[13] Pernet, C. R., Wilcox, R. and Rousselet, G. A., Robust Correlation Analyses: False Positive and Power Validation Using a New Open Source Matlab Toolbox, Frontiers in Psychology, vol. 3, 2013.

[14] Shafiullah, A. Z. M. and Khan, J. A., A New Robust Correlation Estimator for Bivariate Data, Bangladesh Journal of Scientific Research, vol. 24, no. 2, pp. 97–106, 2011.

[15] Yuan, K.-H. and Bentler, P. M., Effect of Outliers on Estimators and Tests in Covariance Structure, British Journal of Mathematical and Statistical Psychology, vol. 54, no. 1, pp. 161–175, 2001.

[16] Yusof, Z. M., Abdullah, S., Yahaya, S. S. S. and Othman, A. R., A Robust Alternative to the t-Test, Modern Applied Science, vol. 6, no. 5, 2012.

[17] Wilcox, R. R., Introduction to Robust Estimation and Hypothesis Testing, 3rd Edition, Oxford: Academic Press, 2012.

[18] Shevlyakov, G. L. and Smirnov, P., Robust Estimation of the Correlation Coefficient: An Attempt of Survey, Austrian Journal of Statistics, vol. 40, no. 1, pp. 147–156, 2011.

[19] Abdullah, M., On a Robust Correlation Coefficient, Journal of the Royal Statistical Society, Series D (The Statistician), vol. 39, no. 4, pp. 455–460, 1990.

[20] Huber, P. J., John W. Tukey's Contributions to Robust Statistics, The Annals of Statistics, vol. 30, no. 6, pp. 1640–1648, 2002.

[21] Shevlyakov, G. L., Smirnov, P. O., Shin, V. I. and Kim, K., Asymptotically Minimax Bias Estimation of the Correlation Coefficient for Bivariate Independent Component Distributions, Journal of Multivariate Analysis, vol. 111, pp. 59–65, 2012.

Mathematics and Statistics 8(2A): 9-16, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081302

Approximate Analytical Solutions of Nonlinear Korteweg-de Vries Equations Using Multistep Modified Reduced Differential Transform Method

Che Haziqah Che Hussin1,2,*, Ahmad Izani Md Ismail1, Adem Kilicman3, Amirah Azmi1

1School of Mathematical Sciences, Universiti Sains Malaysia, Malaysia
2Preparatory Centre for Science and Technology, Universiti Malaysia Sabah, Malaysia
3Department of Mathematics, Faculty of Science, Universiti Putra Malaysia, Malaysia

Received August 5, 2019; Revised December 22, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  This paper aims to propose and investigate the application of the Multistep Modified Reduced Differential Transform Method (MMRDTM) for solving the nonlinear Korteweg-de Vries (KdV) equation. The proposed technique has the advantage of producing an analytical approximation in a fast-converging sequence with a reduced number of calculated terms. The MMRDTM is presented with some modification of the reduced differential transform method (RDTM): the nonlinear term is replaced by the related Adomian polynomials, and a multistep approach is then adopted. Consequently, the obtained approximations not only involve a smaller number of calculated terms for the nonlinear KdV equation, but also converge rapidly over a broad time frame. We provide three examples to illustrate the advantages of the proposed method in obtaining approximate solutions of the KdV equation. To depict the solutions and show the validity and precision of the MMRDTM, graphical outputs are included.

Keywords  Adomian Polynomials, Multistep Approach, Nonlinear Korteweg-de Vries Equation, Reduced Differential Transform Method

1. Introduction

Partial differential equations (PDEs) have broad applications in various branches of science and engineering, such as fluid mechanics, thermodynamics and heat transfer, as well as many other areas of physics [1]. For many nonlinear PDEs, it is rather challenging to manage the nonlinear terms of these equations. Although most researchers have utilized numerical methods to obtain approximate solutions of these equations, being able to solve such equations analytically is significant, because manipulation is easier if the approximation is analytical in nature. Many PDEs and ordinary differential equations (ODEs) have been tackled using approximate analytical methods such as the Differential Transform Method (DTM) and the Reduced Differential Transform Method (RDTM) [2-6]. There remains scope to seek improvements to the method. Sahoo and Ray [7] obtained analytical solutions of the time-fractional modified Korteweg-de Vries (KdV) equation by using the (G'/G)-expansion method and the improved (G'/G)-expansion method. Besides that, Islam et al. [8] considered a recent extension of the (G'/G)-expansion approach for determining solitary wave solutions of the modified KdV equation. Apart from that, Ray [9] proposed a modification of the fractional RDTM and implemented it to find solutions of fractional KdV equations. In this approach, the adjustment consisted of substituting the nonlinear term by the related Adomian polynomials [9]. Therefore, the solutions of the nonlinear problem can be obtained in a simpler way with fewer calculated terms. Furthermore, El-Zahar [10] introduced an adaptive multistep DTM to obtain solutions of singularly perturbed initial-value problems; it produces the solution as a rapidly convergent series, which results in the solution converging over a wide time region. Recently, Hussin et al. [11] proposed and implemented the Multistep Modified Reduced Differential Transform Method (MMRDTM) for solving nonlinear Schrodinger equations (NLSE); the outcome showed that approximate solutions of the NLSE were obtained with high accuracy. Hussin et al. [12] also solved Klein-Gordon equations using the MMRDTM, and the results showed that the MMRDTM is a valid and efficient method for finding analytic approximate solutions of the Klein-Gordon equations. Besides that,


Hussin et al. [13] obtained solutions of fractional nonlinear Schrodinger equations (FNLSEs) by using the MMRDTM.

In this study, we combine the modification made in [9] and the multistep approach in [10] to construct a new technique called the Multistep Modified Reduced Differential Transform Method (MMRDTM). The proposed technique has the advantage of yielding an analytical approximation in a fast-convergent sequence with a reduced number of computed terms.

2. The Development of the Multistep Modified Reduced Differential Transform Method

For notational purposes, original functions are denoted by lowercase letters, such as the u in u(x,t), while transformed functions are denoted by uppercase letters, such as the U in U_k(x). Basically, the differential transformation of the function u(x,t) = f(x)g(t) is given by Keskin and Oturanç [14] as

$u(x,t) = \sum_{i=0}^{\infty}F(i)x^{i}\sum_{j=0}^{\infty}G(j)t^{j} = \sum_{k=0}^{\infty}U_{k}(x)t^{k},$

where U_k(x) is known as the transformed function of u(x,t). Some fundamental properties of the RDTM are given as follows.

Definition 1. For a function u(x,t) that is analytic and continuously differentiable with respect to the time t and the space variable x, the differential transformation of u(x,t) is defined by

$U_{k}(x) = \frac{1}{k!}\left[\frac{\partial^{k}}{\partial t^{k}}u(x,t)\right]_{t=0},$   (1)

where U_k(x) is the transformed function.

Definition 2. The inverse transform of U_k(x) is given by

$u(x,t) = \sum_{k=0}^{\infty}U_{k}(x)t^{k}.$   (2)

Then, combining (1) and (2), we can write

$u(x,t) = \sum_{k=0}^{\infty}\frac{1}{k!}\left[\frac{\partial^{k}}{\partial t^{k}}u(x,t)\right]_{t=0}t^{k}.$   (3)

In order to present the core properties of the RDTM, consider the following nonlinear PDE:

$Du(x,t) + Pu(x,t) + Nu(x,t) = h(x,t),$   (4)

where u(x,0) = f(x) is the initial condition. Here D = ∂/∂t, P is the remaining part of the linear operator, and the nonlinear and inhomogeneous terms are represented by Nu(x,t) and h(x,t), respectively. Based on the MMRDTM, the iteration formula can be formed as follows:

$(k+1)U_{k+1}(x) = H_{k}(x) - PU_{k}(x) - NU_{k}(x),$   (5)

where the functions Du(x,t), Pu(x,t), Nu(x,t) and h(x,t) are transformed and represented as (k+1)U_{k+1}(x), PU_k(x), NU_k(x) and H_k(x), respectively. From the initial condition we have

$U_{0}(x) = f(x).$   (6)

Referring to Ray [9], the nonlinear term is denoted as

$Nu(x,t) = \sum_{n=0}^{\infty}A_{n}\big(U_{0}(x), U_{1}(x), \dots, U_{n}(x)\big),$

where the A_n are the Adomian polynomials. By substituting (6) into (5) and iterating, the values U_k(x) can be obtained. Furthermore, the inverse transformation of the set of values {U_k(x)}_{k=0}^{K} yields the following approximate solution:

$u(x,t) = \sum_{k=0}^{K}U_{k}(x)t^{k}, \quad t\in[0,T].$

The interval [0,T] is divided into M subintervals [t_{i-1}, t_i], i = 1, 2, ..., M, of equal step size h = T/M, with nodes t_i = ih. The MMRDTM is then computed by the following steps. First, the modified RDTM is applied to the initial value problem on the interval [0, t_1]: using the initial conditions

$u(x,0) = f_{0}(x), \qquad u_{t}(x,0) = f_{1}(x),$

the approximate result

$u_{1}(x,t) = \sum_{k=0}^{K}U_{k,1}(x)t^{k}, \quad t\in[0,t_{1}]$

is obtained. At each subinterval [t_{i-1}, t_i], the initial conditions

$u_{i}(x,t_{i-1}) = u_{i-1}(x,t_{i-1}), \qquad (\partial/\partial t)u_{i}(x,t_{i-1}) = (\partial/\partial t)u_{i-1}(x,t_{i-1})$

are used for i ≥ 2, and the multistep RDTM is applied to the initial value problem on [t_{i-1}, t_i], with t_0 replaced by t_{i-1}. For i = 1, 2, ..., M, the process is continued and carried out repeatedly to produce a sequence of approximate solutions u_i(x,t),

$u_{i}(x,t) = \sum_{k=0}^{K}U_{k,i}(x)(t-t_{i-1})^{k}, \quad t\in[t_{i-1},t_{i}].$

Finally, the MMRDTM proposes the following solution:

$u(x,t) = \begin{cases} u_{1}(x,t), & t\in[0,t_{1}] \\ u_{2}(x,t), & t\in[t_{1},t_{2}] \\ \vdots & \\ u_{M}(x,t), & t\in[t_{M-1},t_{M}]. \end{cases}$

With better computational performance, the new MMRDTM algorithm is straightforward for all values of h. It can be effortlessly observed that if the step size h = T, then the MMRDTM reduces to the modified RDTM.
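To make the multistep mechanism concrete, it can be sketched on the scalar test problem u' = u, for which the differential transform reduces to the Taylor recursion U_{k+1} = U_k/(k+1). The Python sketch below is our illustration only (the paper's own computations were carried out in Maple 13); it compares one K-term expansion with step size h = T against M restarted expansions, which is exactly the trade-off the multistep approach exploits:

```python
import math

def multistep_taylor(f0, recursion, T=1.0, M=10, K=4):
    """Multistep variant of the differential-transform idea:
    on each subinterval [t_{i-1}, t_i], rebuild the transform
    coefficients from the value carried over from the previous
    subinterval, then evaluate at the subinterval's right end."""
    h = T / M
    u = f0
    for _ in range(M):
        # transform recursion: U_{k+1} = recursion(U_k, k)
        U = [u]
        for k in range(K):
            U.append(recursion(U[k], k))
        u = sum(U[k] * h**k for k in range(K + 1))
    return u

# For u'(t) = u(t): (k+1) U_{k+1} = U_k, i.e. U_{k+1} = U_k / (k + 1)
approx = multistep_taylor(1.0, lambda Uk, k: Uk / (k + 1))
# single-step version: one expansion with h = T (the plain modified method)
single = sum(1.0 / math.factorial(k) for k in range(5))
# the multistep error at t = 1 is orders of magnitude below the single-step error
print(abs(approx - math.e), abs(single - math.e))
```

Subdividing [0, T] keeps each local truncation error of order h^{K+1} with a small h, which is why the multistep solutions remain accurate over a wide time frame.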


3. Application of the MMRDTM for Solving the Nonlinear Korteweg-de Vries Equation

The generalized Korteweg-de Vries (KdV) equation was introduced as follows [15]:

$u_{t} + (p+1)(p+2)u^{p}u_{x} + u_{xxx} = g(x,t),$   (7)

where g(x,t) is a given function, p = 1, 2, ..., and u, u_x, u_{xxx} → 0 as |x| → ∞. For p = 0, 1, 2, equation (7) becomes the linearized KdV, the nonlinear KdV, and the modified KdV equation, respectively [16,17]. Applying the MMRDTM to (7) and using the essential properties of the MMRDTM, we can obtain

$U_{k+1,i}(x) = \frac{1}{k+1}\left(-(p+1)(p+2)\sum_{n=0}^{k}A_{n,i}(x) - \frac{\partial^{3}}{\partial x^{3}}U_{k,i}(x) + G_{k}(x)\right),$   (8)

where the A_{n,i}(x) are the Adomian polynomials of the nonlinear term u^p u_x and G_k(x) is the transformed function of g(x,t). From the initial condition, we compose

$U_{0}(x) = f(x).$   (9)

Now, the nonlinear term can be written as

$Nu(x,t) = \sum_{n=0}^{\infty}A_{n}\big(U_{0}(x), U_{1}(x), \dots, U_{n}(x)\big).$

Substituting (9) into (8) and computing by direct iteration, we obtain the U_k(x) values. Then the K-term approximate solution is obtained from the inverse transformation of the set of values {U_k(x)}_{k=0}^{K}:

$u(x,t) = \sum_{k=0}^{K}U_{k}(x)t^{k}, \quad t\in[0,T].$

The interval [0,T] is divided into M subintervals [t_{i-1}, t_i], i = 1, 2, ..., M, of equal step size h = T/M with nodes t_i = ih, and the multistep procedure described in Section 2 is applied on each subinterval; if the step size h = T, the MMRDTM again reduces to the modified RDTM.

4. Numerical Results and Discussion

To demonstrate the effectiveness of the proposed method, we consider three test examples in this section. For accuracy evaluation purposes, we compare the obtained solutions with the exact solution for each example.

Example 1. Consider the KdV equation which takes the form [18,19]

$u_{t} + 6uu_{x} + u_{xxx} = 0, \quad x\in\mathbb{R},$   (10)

subject to the initial condition

$u(x,0) = \frac{1}{2}\operatorname{sech}^{2}\left(\frac{x}{2}\right).$

The exact solution of this equation is $\frac{1}{2}\operatorname{sech}^{2}\left(\frac{x-t}{2}\right)$. Using the basic properties of the MMRDTM and then applying the MMRDTM to (10), we can obtain

$U_{k+1,i}(x) = \frac{1}{k+1}\left(-6\sum_{r=0}^{k}U_{k-r,i}(x)\frac{\partial}{\partial x}U_{r,i}(x) - \frac{\partial^{3}}{\partial x^{3}}U_{k,i}(x)\right).$   (11)

From the initial condition, we express

$U_{0}(x) = \frac{1}{2}\operatorname{sech}^{2}\left(\frac{x}{2}\right).$   (12)

The U_k(x) values were obtained by substituting (12) into (11) by straightforward iterative calculation. The interval [0,1] is divided into 10 subintervals [t_{i-1}, t_i], i = 1, 2, ..., 10, of equal step size h = 0.1 with nodes t_i = ih. First, the modified RDTM is applied to the initial value problem on the interval [0, t_1]. By using the initial conditions

$u(x,0) = f_{0}(x), \qquad u_{t}(x,0) = f_{1}(x),$

the approximate result

$u_{1}(x,t) = \sum_{k=0}^{K}U_{k,1}(x)t^{k}, \quad t\in[0,t_{1}]$

is obtained. At each subinterval [t_{i-1}, t_i], the initial conditions

$u_{i}(x,t_{i-1}) = u_{i-1}(x,t_{i-1}),$


$(\partial/\partial t)u_{i}(x,t_{i-1}) = (\partial/\partial t)u_{i-1}(x,t_{i-1})$

are used for i ≥ 2, and the multistep RDTM is applied to the initial value problem on [t_{i-1}, t_i], with t_0 replaced by t_{i-1}. For i = 1, 2, ..., 10, the process is continued and carried out repeatedly to produce a sequence of approximate solutions u_i(x,t),

$u_{i}(x,t) = \sum_{k=0}^{K}U_{k,i}(x)(t-t_{i-1})^{k}, \quad t\in[t_{i-1},t_{i}].$

The exact solution, the approximate solution yielded by the MMRDTM for t ∈ [-2,2], x ∈ [0,1], and the approximate solution given by the RDTM for t ∈ [-2,2], x ∈ [0,1] for Example 1 are graphically represented in Fig. 1A, Fig. 1B and Fig. 1C, respectively. Observe that the multistep modified approximate solutions for this type of nonlinear KdV equation reproduce the exact solution, while the approximate solution of the RDTM converges to the exact solution as in [19-21]. The error analyses obtained by the MMRDTM and the RDTM are summarized in Table 1.

Figure 1B. MMRDTM
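As a cross-check of the recursion (11), the U_k(x) can be generated symbolically and compared against the Taylor coefficients of the exact solution: since (11) is exactly the Taylor recursion induced by (10), the two must agree. The sketch below uses Python with SymPy for illustration (the paper's own computations were done in Maple 13):

```python
import sympy as sp

x, t = sp.symbols('x t')

# Transcription of Eq. (11) for u_t + 6*u*u_x + u_xxx = 0:
#   (k+1) U_{k+1}(x) = -6 * sum_{r=0}^{k} U_{k-r}(x) * d/dx U_r(x) - d^3/dx^3 U_k(x)
K = 3
U = [sp.sech(x / 2) ** 2 / 2]          # U_0(x) from the initial condition (12)
for k in range(K):
    conv = sum(U[k - r] * sp.diff(U[r], x) for r in range(k + 1))
    U.append(sp.expand((-6 * conv - sp.diff(U[k], x, 3)) / (k + 1)))

# The exact soliton solution; its Taylor coefficients in t should equal U_k.
exact = sp.sech((x - t) / 2) ** 2 / 2
x0 = sp.Rational(37, 100)              # arbitrary sample point for a numeric check
errors = []
for k in range(K + 1):
    taylor_k = sp.diff(exact, t, k).subs(t, 0) / sp.factorial(k)
    errors.append(abs(float((U[k] - taylor_k).subs(x, x0))))
print(max(errors))  # deviation should be at machine-precision level
```

The agreement confirms that the per-subinterval coefficients U_{k,i} of Example 1 are the local Taylor coefficients of the soliton, restarted at each node t_{i-1}.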

Table 1. Comparison of the error results of the MMRDTM and RDTM approximate solutions for Example 1

t     Exact solution    MMRDTM        Absolute error (RDTM)
0.1   0.5000000000      0.500000000   0.00000000
0.2   0.5000000000      0.500000000   6.5000000 x 10^-9
0.3   0.5000000000      0.500000000   1.5420000 x 10^-7
0.4   0.5000000000      0.500000000   1.3855000 x 10^-6
0.5   0.5000000000      0.500000000   7.1852000 x 10^-6
0.6   0.5000000000      0.500000000   2.5953000 x 10^-5
0.7   0.5000000000      0.500000000   7.1985000 x 10^-5
0.8   0.5000000000      0.500000000   1.6189510 x 10^-4
0.9   0.5000000000      0.500000000   3.0383430 x 10^-4
1.0   0.5000000000      0.500000000   4.7937780 x 10^-4

Figure 1C. RDTM

Example 2. Consider the KdV equation which has a two-soliton solution [9]:

$u_{t} + 6uu_{x} + u_{xxx} = 0, \quad x\in\mathbb{R},$   (13)

subject to the initial condition

$u(x,0) = 6\operatorname{sech}^{2}x.$

The exact solution of this equation is

$u(x,t) = \frac{24\left(4\cosh^{2}(x-4t) + \sinh^{2}(2x-32t)\right)}{\left(\cosh(3x-36t) + 3\cosh(x-28t)\right)^{2}}.$

Using the basic properties of the MMRDTM and then applying the MMRDTM to (13), we can obtain

$U_{k+1,i}(x) = \frac{1}{k+1}\left(-6\sum_{r=0}^{k}U_{k-r,i}(x)\frac{\partial}{\partial x}U_{r,i}(x) - \frac{\partial^{3}}{\partial x^{3}}U_{k,i}(x)\right).$   (14)

From the initial condition, we write

$U_{0}(x) = 6\operatorname{sech}^{2}x.$   (15)

The interval [0,1] is divided into 10 subintervals [t_{i-1}, t_i], i = 1, 2, ..., 10, of equal step size h = 0.1 with nodes t_i = ih. First, the modified RDTM is applied to the initial value problem on the interval [0, t_1].

Figure 1A. Exact solution

By using the initial conditions

$u(x,0) = f_{0}(x), \qquad u_{t}(x,0) = f_{1}(x),$

the approximate result

$u_{1}(x,t) = \sum_{k=0}^{K}U_{k,1}(x)t^{k}, \quad t\in[0,t_{1}]$

is obtained. At each subinterval [t_{i-1}, t_i], the initial conditions

$u_{i}(x,t_{i-1}) = u_{i-1}(x,t_{i-1}), \qquad (\partial/\partial t)u_{i}(x,t_{i-1}) = (\partial/\partial t)u_{i-1}(x,t_{i-1})$

are used for i ≥ 2, and the multistep RDTM is applied to the initial value problem on [t_{i-1}, t_i], with t_0 replaced by t_{i-1}. For i = 1, 2, ..., 10, the process is continued and carried out repeatedly to produce a sequence of approximate solutions u_i(x,t),

$u_{i}(x,t) = \sum_{k=0}^{K}U_{k,i}(x)(t-t_{i-1})^{k}, \quad t\in[t_{i-1},t_{i}].$

Fig. 2A shows the exact solution, Fig. 2B shows the graph of the approximate solution of the MMRDTM for t ∈ [-2,2] and x ∈ [-6,6], while Fig. 2C shows the graph of the approximate solution of the RDTM for t ∈ [-2,2] and x ∈ [-6,6]. Clearly, the multistep modified approximate solutions for this type of nonlinear KdV equation converge to the exact solution as in [9].

Figure 2C. RDTM
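The quoted two-soliton formula can be sanity-checked at t = 0, where it must collapse to the initial condition 6 sech^2 x (indeed, cosh 3x + 3 cosh x = 4 cosh^3 x makes the identity exact). A quick numerical check in Python (our illustration, not the paper's code):

```python
import math

def exact(x, t):
    # Two-soliton solution quoted for Example 2
    num = 24 * (4 * math.cosh(x - 4 * t) ** 2 + math.sinh(2 * x - 32 * t) ** 2)
    den = (math.cosh(3 * x - 36 * t) + 3 * math.cosh(x - 28 * t)) ** 2
    return num / den

# At t = 0 this must reduce to the initial condition u(x,0) = 6 sech^2 x
errs = [abs(exact(x, 0.0) - 6 / math.cosh(x) ** 2)
        for x in (-2.0, -0.5, 0.0, 1.3, 3.0)]
print(max(errs))
```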

Figure 2D. 2-Dimensional graph for Exact Solution

Figure 2A. Exact solution

Figure 2E. 2-Dimensional graph for MMRDTM

Figure 2B. MMRDTM


Figure 2F. 2-Dimensional graph for RDTM

Example 3. Consider the nonlinear KdV equation of the form [19]

$u_{t} + 6u^{2}u_{x} + u_{xxx} = 0, \quad x\in\mathbb{R},$   (16)

subject to the initial condition u(x,0) = sech(x), where the exact solution of this equation is sech(x - t). Using the basic properties of the MMRDTM and then applying the MMRDTM to (16), we obtain

$U_{k+1,i}(x) = \frac{1}{k+1}\left(-6\sum_{s=0}^{k}\sum_{l=0}^{s}U_{s-l,i}(x)\,U_{k-s,i}(x)\frac{\partial}{\partial x}U_{l,i}(x) - \frac{\partial^{3}}{\partial x^{3}}U_{k,i}(x)\right).$   (17)

From the initial condition, we write

$U_{0}(x) = \operatorname{sech}(x).$   (18)

The interval [0,1] is divided into 10 subintervals [t_{i-1}, t_i], i = 1, 2, ..., 10, of equal step size h = 0.1 with nodes t_i = ih. First, the modified RDTM is applied to the initial value problem on the interval [0, t_1]. By using the initial conditions

$u(x,0) = f_{0}(x), \qquad u_{t}(x,0) = f_{1}(x),$

the approximate result

$u_{1}(x,t) = \sum_{k=0}^{K}U_{k,1}(x)t^{k}, \quad t\in[0,t_{1}]$

is obtained. At each subinterval [t_{i-1}, t_i], the initial conditions

$u_{i}(x,t_{i-1}) = u_{i-1}(x,t_{i-1}), \qquad (\partial/\partial t)u_{i}(x,t_{i-1}) = (\partial/\partial t)u_{i-1}(x,t_{i-1})$

are used for i ≥ 2, and the multistep RDTM is applied to the initial value problem on [t_{i-1}, t_i], with t_0 replaced by t_{i-1}. For i = 1, 2, ..., 10, the process is continued and carried out repeatedly to produce a sequence of approximate solutions u_i(x,t),

$u_{i}(x,t) = \sum_{k=0}^{K}U_{k,i}(x)(t-t_{i-1})^{k}, \quad t\in[t_{i-1},t_{i}].$

Fig. 3A shows the exact solution, Fig. 3B shows the graph of the approximate solution of the MMRDTM for t ∈ [-1,1] and x ∈ [0,1], while Fig. 3C shows the graph of the approximate solution of the RDTM for t ∈ [-1,1] and x ∈ [0,1]. It can be observed that the multistep modified approximate solutions for this type of nonlinear KdV equation reproduce the exact solution, while the approximate solution of the RDTM converges to the exact solution as in [19-21]. The error analyses obtained by the MMRDTM and the RDTM are summarized in Table 2.

Figure 3A. Exact solution

Figure 3B. MMRDTM

Figure 3C. RDTM


Table 2. Comparison of the error results of the MMRDTM and RDTM approximate solutions for Example 3

t     Exact solution    MMRDTM       Absolute error (RDTM)
0.1   1.0000000000      1.00000000   3.00000000 x 10^-9
0.2   1.0000000000      1.00000000   5.06000000 x 10^-7
0.3   1.0000000000      1.00000000   1.00870000 x 10^-5
0.4   1.0000000000      1.00000000   6.98690000 x 10^-5
0.5   1.0000000000      1.00000000   2.50442000 x 10^-4
0.6   1.0000000000      1.00000000   5.19650000 x 10^-4
0.7   1.0000000000      1.00000000   4.84115000 x 10^-4
0.8   1.0000000000      1.00000000   7.84796400 x 10^-4
0.9   1.0000000000      1.00000000   4.58862280 x 10^-3
1.0   1.0000000000      1.00000000   1.20748174 x 10^-2

5. Conclusions

We proposed an approximate analytical method called the MMRDTM in this paper. In the proposed method, the modification involves the substitution of the nonlinear term by its Adomian polynomials together with the use of a multistep approach. The significance of the modification is that, for any analytical nonlinearity, the nonlinear KdV term can be handled with less computational work, owing to the properties of the Adomian polynomials and the algorithms available for them. To illustrate the effectiveness of the proposed method, we executed the method to obtain solutions of the one-dimensional nonlinear KdV equations. Moreover, we also implemented the multistep RDTM, which offers accurate approximate solutions over longer time frames. As a result, the approximate solutions of the nonlinear KdV equation were obtained with high precision and good agreement with the exact solutions. Moreover, the proposed method is more accurate and easier to apply than the RDTM. We conclude that the MMRDTM is valid and efficient for obtaining analytic approximate solutions of these types of equations. The computations in this paper were carried out using Maple 13.

Acknowledgements

The authors express their appreciation to the Malaysian Ministry of Higher Education and Universiti Malaysia Sabah for supporting this research. We also acknowledge the financial support of Universiti Sains Malaysia.

REFERENCES

[1] L. Debnath. Nonlinear Partial Differential Equations for Scientists and Engineers, Birkhauser, Boston, 1997.

[2] A. F. Jameel, N. R. Anakira, M. M. Rashidi, A. K. Alomari, A. Saaban, M. A. Shakhatreh. Differential Transformation Method for Solving High Order Fuzzy Initial Value Problems, Italian Journal of Pure and Applied Mathematics, Vol. 39, 194-208, 2018.

[3] T. R. R. Rao. Numerical Solution of Sine Gordon Equations Through Reduced Differential Transform Method, Global Journal of Pure and Applied Mathematics, Vol. 13, No. 7, 3879-3888, 2017.

[4] O. Acan, Y. Keskin. Reduced Differential Transform Method for (2+1) Dimensional Type of the Zakharov-Kuznetsov ZK(n,n) Equations, AIP Conference Proceedings, Vol. 1648, No. 1, 2015.

[5] H. R. Marasi, N. Sharifi, H. Piri. Modified Differential Transform Method for Singular Lane-Emden Equations in Integer and Fractional Order, Journal of Applied and Engineering Mathematics, Vol. 5, No. 1, 124-131, 2015.

[6] B. Benhammouda, H. Vazquez-Leal. New Multi-Step Technique with Differential Transform Method for Analytical Solution of Some Nonlinear Variable Delay Differential Equations, SpringerPlus, Vol. 5, No. 1, 1723, 2016.

[7] S. Sahoo, S. S. Ray. Solitary wave solutions for time fractional third order modified KdV equation using two reliable techniques (G'/G)-expansion method and improved (G'/G)-expansion method, Physica A, Vol. 448, 265-282, 2015.

[8] M. S. Islam, K. Khan, M. A. Akbar. An analytical method for finding exact solutions of modified Korteweg-de Vries equation, Results in Physics, Vol. 5, 131-135, 2015.

[9] S. S. Ray. Numerical Solutions and Solitary Wave Solutions of Fractional KdV Equations using Modified Fractional Reduced Differential Transform Method, Computational Mathematics and Mathematical Physics, Vol. 53, No. 12, 1870-1881, 2013.

[10] E. R. El-Zahar. Applications of Adaptive Multi Step Differential Transform Method to Singular Perturbation Problems Arising in Science and Engineering, Applied Mathematics and Information Sciences, Vol. 9, No. 1, 223-232, 2015.

[11] C. H. C. Hussin, A. Kilicman, A. Azmi. Analytical Solutions of Nonlinear Schrodinger Equations using Multi-step Modified Reduced Differential Transform Method, International Journal of Advanced Computer Technology, Vol. 7, No. 11, 2939-2944, 2018.

[12] C. H. C. Hussin, A. I. M. Ismail, A. Kilicman, A. Azmi. Analytical Solutions of Non-Linear Klein-Gordon Equations Using Multistep Modified Reduced Differential Transform Method, Thermal Science, Vol. 23, No. 1, S317-S326, 2019.

[13] C. H. C. Hussin, A. I. M. Ismail, A. Kilicman, A. Azmi. Approximate Analytical Solutions of Fractional Nonlinear Schrodinger Equations using Multistep Modified Reduced Differential Transform Method, Proceedings of the International Conference on Mathematical Sciences and Technology 2018 (MathTech2018), AIP Conference Proceedings, Vol. 2184, 060003, 2019.


[14] Y. Keskin, G. Oturanc. Reduced Differential Transform Method for Partial Differential Equations, International Journal of Nonlinear Sciences and Numerical Simulation, Vol. 10, No. 6, 741-749, 2009.

[15] P. G. Drazin, R. S. Johnson. Solitons: An Introduction, Cambridge University Press, Cambridge, 1989.

[16] D. Kaya, M. Aassila. An application for a generalized KdV equation by the decomposition method, Physics Letters A, Vol. 299, 201-206, 2002.

[17] S. Momani, Z. Odibat, A. Alawneh. Variational Iteration Method for Solving the Space- and Time-Fractional KdV Equation, Numerical Methods for Partial Differential Equations, Vol. 24, No. 1, 262-271, 2007.

[18] P. Saucez, A. V. Wouwer, W. E. Schiesser. An adaptive method of lines solution of the Korteweg-de Vries equation, Computers and Mathematics with Applications, Vol. 35, No. 12, 13-25, 1998.

[19] Y. Keskin, G. Oturanc. Reduced Differential Transform Method for Generalized KdV Equations, Mathematical and Computational Applications, Vol. 15, No. 3, 382-393, 2010.

[20] D. Kaya, M. Aassila. An application for a generalized KdV equation by the decomposition method, Physics Letters A, Vol. 299, 201-206, 2002.

[21] T. A. Abassy, M. A. El-Tawil, H. El-Zoheiry. Solving nonlinear partial differential equations using the modified variational iteration Padé technique, Journal of Computational and Applied Mathematics, Vol. 207, No. 1, 73-91, 2007.

Mathematics and Statistics 8(2A): 17-22, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081303

Bayesian Estimation in Piecewise Constant Model with Gamma Noise by Using Reversible Jump MCMC

Suparman

Department of Mathematics Education, University of Ahmad Dahlan, Indonesia

Received August 3, 2019; Revised September 25, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  A piecewise constant model is often applied to model data in many fields. Several types of noise can be added to the piecewise constant model. This paper proposes a piecewise constant model with gamma multiplicative noise and a method to estimate the parameters of the model. The estimation is done in a Bayesian framework. A prior distribution for the model parameters is chosen. The prior distribution for the model parameters is multiplied by the likelihood function of the data to build a posterior distribution for the parameters. Because the number of models is itself a parameter, the form of the posterior distribution is too complex, and a Bayes estimator cannot be calculated easily. A reversible jump Markov chain Monte Carlo (MCMC) method is used to find the Bayes estimator of the model parameters. The result of this paper is the development of the piecewise constant model and the method to estimate the model parameters. An advantage of this method is that it can estimate all parameters of the piecewise constant model simultaneously.

Keywords  Bayesian, Gamma Noise, Piecewise Constant, Reversible Jump MCMC

1. Introduction

A piecewise constant model is a model used to model data in many fields, for example [1-3]. The piecewise constant model is used for smoothing images of flowers [1] and for population size modeling [2,3]. The piecewise constant model can contain an additive noise or a multiplicative noise. The additive noise is considered by various authors, for example [4-6]: it is added to a spatial regression model [4]; it is used in a partially linear functional model, applied in part to tecator data [5]; and it is used in a log regression model [6]. On the other hand, a multiplicative noise is also used by several authors, for example [7-10]: the multiplicative noise is used as a measurement error in line transect sampling [7]; an asymptotic Cramer-Rao bound is discussed for frequency estimation in multiplicative noise [8]; adsorption of ligands on DNA is considered for an arbitrary filling in the presence of multiplicative noise [9]; and a multiplicative noise is used in a segmentation [10].

Noise in a mathematical model is assumed to follow a certain distribution, for example [11-14]: Gaussian additive noise is used in the piecewise constant model [11], and exponential additive noise is used in autoregressive models [12-14]. However, in some applications, the data are often modeled by the piecewise constant model with a gamma multiplicative noise. The gamma distribution is more general than the exponential distribution; the exponential distribution is a particular case of the gamma distribution. If the piecewise constant model with gamma multiplicative noise is used to model the data, the model parameters are unknown. The model parameters include the number of constant models, the locations of the constant model changes, the constant model heights, and the noise variance. This study proposes an estimation method for the piecewise constant model with a gamma multiplicative noise where the number of constant models is unknown.

2. Method

A Bayesian framework is adopted to estimate the parameters [15]. A prior distribution for the number of constant models, the locations of the changes in the constant model, the constant heights of the model, and the noise variance is selected. Then this prior distribution is combined with the likelihood function of the data to get a posterior distribution. Based on this posterior distribution, Bayes estimators for the number of constant models, the locations of the changes in the constant model, the constant heights of the model, and the noise variance are estimated.


A reversible jump Markov chain Monte Carlo (MCMC) method [16] was proposed to determine the Bayes estimator. The basic idea of the reversible jump MCMC method is the construction of a Markov chain that is recurrent and irreducible, such that the limiting distribution of the Markov chain equals the posterior distribution. Furthermore, the resulting Markov chain is used to calculate estimators of the parameters.

3. Results and Discussion

Suppose that n represents the number of data and y_1, ..., y_n represent the data set. These data follow the piecewise constant model if they satisfy the following mathematical equation:

$y_{t} = m_{t}z_{t}, \quad t = 1, \dots, n,$   (1)

where

$m_{t} = \begin{cases} h_{1}, & \tau_{1} < t \le \tau_{2} \\ h_{2}, & \tau_{2} < t \le \tau_{3} \\ \vdots & \\ h_{k+1}, & \tau_{k+1} < t \le \tau_{k+2} \end{cases}$   (2)

with τ_1 = 0 and τ_{k+2} = n. The value of k denotes the number of constant models, the values of τ = (τ_1, ..., τ_k) state the locations of the changes in the constant model, and the values of h = (h_1, ..., h_{k+1}) express the heights of the constant model. Here, z_t is assumed to have the gamma distribution with parameters α > 0 and β > 0.

3.1. Likelihood Function

The random variable z_t is gamma distributed, so that the probability density function of z_t can be written as

$g(z_{t}\mid\alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}z_{t}^{\alpha-1}\exp(-\beta z_{t}).$   (3)

Suppose that y = (y_1, ..., y_n). By using a variable transformation, the likelihood function for the data y is

$f(y\mid k,\tau,h,\alpha,\beta) = \prod_{i=1}^{k+1}\prod_{t=\tau_{i}+1}^{\tau_{i+1}}\frac{\beta^{\alpha}}{\Gamma(\alpha)}\frac{1}{h_{i}}\left(\frac{y_{t}}{h_{i}}\right)^{\alpha-1}\exp\!\left(-\beta\frac{y_{t}}{h_{i}}\right) = \left(\frac{\beta^{\alpha}}{\Gamma(\alpha)}\right)^{n}\left(\prod_{t=1}^{n}y_{t}^{\alpha-1}\right)\prod_{i=1}^{k+1}h_{i}^{-\alpha n_{i}}\exp\!\left(-\beta\frac{s_{i}}{h_{i}}\right),$   (4)

where $s_{i} = \sum_{t=\tau_{i}+1}^{\tau_{i+1}}y_{t}$ and $n_{i} = \tau_{i+1}-\tau_{i}$ for i = 1, ..., k+1.

3.2. Prior Distribution

To obtain a posterior distribution, a prior distribution must be determined. As in [10], the prior distribution for k is chosen to be a binomial distribution with a parameter 0 < λ < 1: for k = 0, 1, ..., k_max,

$\pi(k\mid\lambda) = \binom{k_{max}}{k}\lambda^{k}(1-\lambda)^{k_{max}-k},$   (5)

where k_max states the maximum value of k. A hyperprior distribution for λ is chosen as a uniform distribution. A prior distribution for τ_1, ..., τ_k follows ordered statistics:

$\pi(\tau_{1},\dots,\tau_{k}\mid k) = \frac{1}{C_{n-2}^{2k+1}}\prod_{i=1}^{k+1}(n_{i}-1).$   (6)

A prior distribution for h_1, ..., h_{k+1} is chosen to be inverse gamma with parameters u > 0 and ν > 0:

$\pi(h_{1},\dots,h_{k+1}\mid k,u,\nu) = \prod_{i=1}^{k+1}\frac{\nu^{u}}{\Gamma(u)}h_{i}^{-u-1}\exp\!\left(-\frac{\nu}{h_{i}}\right) = \left(\frac{\nu^{u}}{\Gamma(u)}\right)^{k+1}\left(\prod_{i=1}^{k+1}h_{i}\right)^{-u-1}\exp\!\left(-\nu\sum_{i=1}^{k+1}\frac{1}{h_{i}}\right).$   (7)

Here, u = 1, and the Jeffreys prior distribution is chosen as a hyperprior distribution for ν, i.e. π(ν) ∝ ν^{-1}. Similarly, the Jeffreys prior distribution is also selected as a hyperprior distribution for β, i.e. π(β) ∝ β^{-1}. So the prior distribution for the parameters (k, τ, h, λ, ν, β) can be written as

$\pi(k,\tau,h,\lambda,\nu,\beta) = \binom{k_{max}}{k}\lambda^{k}(1-\lambda)^{k_{max}-k}\frac{1}{C_{n-2}^{2k+1}}\prod_{i=1}^{k+1}(n_{i}-1)\left(\frac{\nu^{u}}{\Gamma(u)}\right)^{k+1}\left(\prod_{i=1}^{k+1}h_{i}\right)^{-u-1}\exp\!\left(-\nu\sum_{i=1}^{k+1}\frac{1}{h_{i}}\right)\frac{1}{\beta}\frac{1}{\nu}.$   (8)

3.3. Posterior Distribution

Let H_1 = (k, τ, h) and H_2 = (λ, ν, β). The posterior distribution can be written as

$\pi(H_{1},H_{2}\mid y) \propto \frac{\beta^{\alpha n-1}}{(\Gamma(\alpha))^{n}}\left(\prod_{t=1}^{n}y_{t}^{\alpha-1}\right)\prod_{i=1}^{k+1}h_{i}^{-\alpha n_{i}}\exp\!\left(-\beta\frac{s_{i}}{h_{i}}\right)\binom{k_{max}}{k}\lambda^{k}(1-\lambda)^{k_{max}-k}\frac{1}{C_{n-2}^{2k+1}}\prod_{i=1}^{k+1}(n_{i}-1)\frac{\nu^{u(k+1)-1}}{(\Gamma(u))^{k+1}}\left(\prod_{i=1}^{k+1}h_{i}\right)^{-u-1}\exp\!\left(-\nu\sum_{i=1}^{k+1}\frac{1}{h_{i}}\right).$   (9)

3.4. Reversible Jump MCMC

Parameter estimation of (H_1, H_2) is carried out using the Gibbs algorithm, which consists of two stages, namely the simulation of the distribution π(H_2 | H_1, y) and the simulation of the distribution π(H_1 | H_2, y). The distribution simulation π(H_2 | H_1, y) can be done using the following distributions,
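For reference, the log form of the likelihood (4) is straightforward to transcribe. The sketch below (Python, our illustration, not the paper's code) evaluates it for given change points τ and heights h, treating y_t in segment i as Gamma(α, β/h_i):

```python
import math

def loglik(y, tau, h, alpha=1.0, beta=1.0):
    """Log of the likelihood (4): y_t = m_t * z_t with z_t ~ Gamma(alpha, beta),
    so y_t in segment i is Gamma(alpha) with rate beta / h_i.
    tau: interior change points (0 < tau_1 < ... < tau_k < n); h: k+1 heights."""
    bounds = [0] + list(tau) + [len(y)]
    ll = 0.0
    for i, hi in enumerate(h):
        seg = y[bounds[i]:bounds[i + 1]]
        n_i, s_i = len(seg), sum(seg)
        ll += n_i * (alpha * math.log(beta) - math.lgamma(alpha))
        ll += (alpha - 1) * sum(math.log(v) for v in seg)
        ll += -alpha * n_i * math.log(hi) - beta * s_i / hi
    return ll
```

As a quick sanity check, with α = β = 1 and a single segment of height 1, the log-likelihood of y = (1, 2) is simply -(1 + 2) = -3, and for constant data the likelihood is maximized when the height matches the sample mean.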


The full conditional distributions of the hyper-parameters are

  β ∼ G(αn, Σ_{i=1}^{k+1} s_i/h_i),  λ ∼ B(k + 1, k_max − k + 1),  ν ∼ G(u(k + 1), Σ_{i=1}^{k+1} 1/h_i).

The simulation of the distribution π(H1|H2, y) is done by using the reversible jump MCMC algorithm. This algorithm uses three transformations, namely: a change in the location of the constant model, the birth of a constant model, and the death of a constant model.

3.4.1. Change in the Location

The change in the location of the constant model is as follows. Take a location randomly from τ_1, …, τ_k. If τ_j is selected, the location τ_j is deleted and replaced by a location τ_j^*. Take u randomly according to U(τ_{j−1}, τ_{j+1}), so that τ_j^* = u. Suppose that x = (τ_1, …, τ_j, …, τ_k, h_1, …, h_{k+1}) and x^* = (τ_1, …, τ_j^*, …, τ_k, h_1, …, h_{k+1}). The point x^* will replace x with probability

  ρ(x, x^*) = min{1, [f(y|x^*)/f(y|x)] [π(x^*|k)/π(x|k)] [q(x^*, x)/q(x, x^*)]}   (10)

If τ_j^* = τ_j, then the ratio of the likelihood functions is

  f(y|x^*)/f(y|x) = 1   (11)

If τ_j^* < τ_j, then the ratio of the likelihood functions is

  f(y|x^*)/f(y|x) = (h_{j−1}/h_j)^{τ_j − τ_j^*} exp[−β Σ_{t=τ_j^*+1}^{τ_j} y_t (1/h_j − 1/h_{j−1})]   (12)

If τ_j^* > τ_j, then the ratio of the likelihood functions is

  f(y|x^*)/f(y|x) = (h_j/h_{j−1})^{τ_j^* − τ_j} exp[−β Σ_{t=τ_j+1}^{τ_j^*} y_t (1/h_{j−1} − 1/h_j)]   (13)

The ratio of the posterior distributions is

  π(x^*|k)/π(x|k) = (n_{j−1}^* n_j^*)/(n_{j−1} n_j)   (14)

The ratio of the instrumental distributions is

  q(x^*, x)/q(x, x^*) = 1   (15)

3.4.2. Birth of the Constant Model

The birth of the constant model is as follows. Take a location τ^* randomly from 2, …, n − 1. If τ^* ∈ (τ_j, τ_{j+1}), then the height h_j is deleted and replaced by the heights h_j^* and h_{j+1}^* such that

  (τ^* − τ_j) log(h_j^*) + (τ_{j+1} − τ^*) log(h_{j+1}^*) = (τ_{j+1} − τ_j) log(h_j)   (16)

Suppose that x = (τ_1, …, τ_j, τ_{j+1}, …, τ_k, h_1, …, h_j, …, h_{k+1}) and x^* = (τ_1, …, τ_j, τ^*, τ_{j+1}, …, τ_k, h_1, …, h_{j−1}, h_j^*, h_{j+1}^*, h_{j+1}, …, h_{k+1}). The point x^* will replace x with probability

  ρ(x, x^*) = min{1, [f(y|x^*)/f(y|x)] [π(x^*|k)/π(x|k)] [q(x^*, x)/q(x, x^*)]}   (17)

In this case, the ratio of the likelihood functions is

  f(y|x^*)/f(y|x) = exp[−β (s_j^*/h_j^* + s_{j+1}^*/h_{j+1}^* − s_j/h_j)]   (18)

The ratio of the posterior distributions is

  π(x^*|k)/π(x|k) = [(k_max − k)/(k + 1)] [λ/(1 − λ)] [(2k + 3)(2k + 2)/((n − 2k − 2)(n − 2k − 3))] [(n_j^* n_{j+1}^*)/n_j] [ν^u/Γ(u)] [(h_j^* h_{j+1}^*)^{−u−1}/h_j^{−u−1}] exp[−ν (1/h_j^* + 1/h_{j+1}^* − 1/h_j)]   (19)

The ratio of the instrumental distributions is

  q(x^*, x)/q(x, x^*) = n (h_j^* + h_{j+1}^*)^2 / ((k + 1) h_j)   (20)

3.4.3. Death of the Constant Model

The death of the constant model is as follows. Take a location randomly from τ_1, …, τ_k. If the location τ_{j+1} is selected, then the location τ_{j+1} is deleted. The heights h_j and h_{j+1} are also removed and replaced by the height h_j^* such that

  (τ_{j+1} − τ_j) log(h_j) + (τ_{j+2} − τ_{j+1}) log(h_{j+1}) = (τ_{j+2} − τ_j) log(h_j^*)   (21)

Suppose that x = (τ_1, …, τ_j, τ_{j+1}, τ_{j+2}, …, τ_k, h_1, …, h_j, h_{j+1}, …, h_{k+1}) and x^* = (τ_1, …, τ_j, τ_{j+2}, …, τ_k, h_1, …, h_{j−1}, h_j^*, h_{j+2}, …, h_{k+1}). The point x^* will replace x with probability

  ρ(x, x^*) = min{1, [f(y|x^*)/f(y|x)] [π(x^*|k)/π(x|k)] [q(x^*, x)/q(x, x^*)]}   (22)

In this case, the ratio of the likelihood functions is

  f(y|x^*)/f(y|x) = exp[−β (s_j^*/h_j^* − s_j/h_j − s_{j+1}/h_{j+1})]   (23)

The ratio of the posterior distributions is

  π(x^*|k)/π(x|k) = [(k + 1)/(k_max − k)] [(1 − λ)/λ] [((n − 2k − 2)(n − 2k − 3))/((2k + 3)(2k + 2))] [n_j^*/(n_j n_{j+1})] [Γ(u)/ν^u] [(h_j^*)^{−u−1}/(h_j h_{j+1})^{−u−1}] exp[−ν (1/h_j^* − 1/h_j − 1/h_{j+1})]   (24)

The ratio of the instrumental distributions is

20 Bayesian Estimation in Piecewise Constant Model with Gamma Noise by Using Reversible Jump MCMC

  q(x^*, x)/q(x, x^*) = n (h_j + h_{j+1})^2 / ((k + 1) h_j^*)   (25)
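The birth move's height proposal must satisfy the weighted log constraint of equation (16), and every move is accepted with probability min{1, ratio} as in equations (10), (17) and (22). The sketch below shows one way to generate such a split and the generic accept/reject step; the perturbation draw u and the function names are illustrative, not taken from the paper:

```python
import math
import random

def split_heights(h_j, tau_j, tau_j1, tau_star, u=None):
    """Propose new heights (h_j*, h_{j+1}*) for a birth at tau_star in (tau_j, tau_j1),
    preserving (tau*-tau_j)log h_j* + (tau_j1-tau*)log h_{j+1}* = (tau_j1-tau_j)log h_j."""
    if u is None:
        u = random.uniform(-1.0, 1.0)  # illustrative perturbation draw
    w1 = tau_star - tau_j   # length of the new left segment
    w2 = tau_j1 - tau_star  # length of the new right segment
    log_h1 = math.log(h_j) + u * w2 / (w1 + w2)
    log_h2 = math.log(h_j) - u * w1 / (w1 + w2)
    # w1*log_h1 + w2*log_h2 = (w1+w2)*log(h_j), i.e. constraint (16) holds exactly
    return math.exp(log_h1), math.exp(log_h2)

def accept_move(ratio):
    """rho(x, x*) = min(1, ratio): accept the proposed point x* with this probability."""
    return random.random() < min(1.0, ratio)
```

Here `ratio` would be the product of the likelihood, posterior and instrumental ratios, e.g. equations (18)-(20) for a birth move.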

3.5. Simulation

The performance of the algorithm is tested using a simulation study. Synthetic data are generated using the piecewise constant model in equation (1). The values of the piecewise constant model parameters are presented in Table 1, while the noise is assumed to have a Gamma distribution with parameter values α = 5 and β = 5. The maximum value of k is set to 10.

Table 1. Value of model parameter

Value of k: 5
Value of τ: 40, 80, 120, 170, 200
Value of h: 1.5, 1.1, 1.6, 0.8, 0.4, (0.7)

This synthetic data is presented in Figure 1.

Figure 1. Synthetic data

This synthetic data is used as input to the reversible jump MCMC algorithm. The output is the number of constant models, the locations of the model changes, and the heights of the constant models. The algorithm runs for 100,000 iterations with a burn-in period of 20,000. A histogram of the number of models is presented in Figure 2.

Figure 2. Histogram of the number of models

Figure 2 shows that the maximum is reached at k = 5, so the maximum probability estimator for the number of models k is k̂ = 5. For k̂ = 5, the estimators for the locations of the model changes are presented in Table 2. The synthetic data and the estimated model change locations are shown in Figure 3.

Figure 3. Segmentation of synthetic data

Figure 3 shows that this synthetic data has 6 different models. Finally, for k̂ = 5, the estimators for the heights of the constant models are presented in Table 2.
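Assuming the multiplicative noise form described in the conclusions — y_t = h_j · ε_t with ε_t ∼ Gamma(α, β) — the synthetic series can be reproduced along these lines. The series length n is not stated in this excerpt, so the value 220 below is a placeholder; the change points and heights are those of Table 1:

```python
import random

def piecewise_heights(n, taus, heights):
    """Expand (taus, heights) into one height per time step:
    heights[j] applies on the interval (taus[j-1], taus[j]]."""
    bounds = [0] + list(taus) + [n]
    h = []
    for j, height in enumerate(heights):
        h.extend([height] * (bounds[j + 1] - bounds[j]))
    return h

def synthetic_data(n=220, taus=(40, 80, 120, 170, 200),
                   heights=(1.5, 1.1, 1.6, 0.8, 0.4, 0.7),
                   alpha=5.0, beta=5.0, seed=0):
    """y_t = h_t * eps_t with eps_t ~ Gamma(shape=alpha, rate=beta), so E[eps_t] = 1."""
    rng = random.Random(seed)
    h = piecewise_heights(n, taus, heights)
    # random.gammavariate takes shape and SCALE, hence scale = 1/beta
    return [h_t * rng.gammavariate(alpha, 1.0 / beta) for h_t in h]
```

With α = β = 5 the noise has mean 1, so the level of each segment of y_t is centred on the corresponding height in Table 1.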

Mathematics and Statistics 8(2A): 17-22, 2020 21
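The maximum probability estimator k̂ described above is simply the most frequent value of k among the post-burn-in MCMC draws (the tallest bar of the histogram in Figure 2). A minimal sketch, with illustrative names:

```python
from collections import Counter

def posterior_mode(k_samples, burn_in=20000):
    """Return the most frequent sampled value of k after discarding burn-in draws."""
    kept = k_samples[burn_in:]
    return Counter(kept).most_common(1)[0][0]
```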

Table 2. Value of parameter estimation

Estimation of k: 5
Estimation of τ: 39.40, 79.02, 120.16, 168.76, 200.33
Estimation of h: 1.61, 0.93, 1.32, 0.89, 0.42, (0.65)

Comparing Table 2 with Table 1, the estimated parameter values approach the true parameter values. The distance between τ and τ̂ is |τ − τ̂| = 1.73, while the distance between h and ĥ is |h − ĥ| = 0.36. The synthetic data and the estimators for the heights are presented in Figure 4.

Figure 4. Synthetic data and reconstructed data

Figure 4 shows that the reversible jump MCMC algorithm can be used to estimate the number of constant models, the locations of the model changes, and the heights of the constant models. The simulation study shows that the reversible jump MCMC algorithm can determine the number of models and the parameters of a piecewise constant model well.

4. Conclusions

This paper develops a piecewise constant model and its parameter estimation procedure. The piecewise constant model parameters include the number of constant models, the locations of the changes in the constant model, the constant model heights, and the noise variance. The Bayes estimator cannot be formulated explicitly because the number of constant models is itself a parameter. The reversible jump MCMC method is proposed to estimate these parameters. According to the simulation study, the reversible jump MCMC method estimated the piecewise constant model parameters well. The reversible jump MCMC algorithm has several advantages. It can be used to estimate the parameters of a piecewise constant model with Gamma multiplicative noise simultaneously. It can also be used to determine the hyper-parameters that appear in the prior distribution.

Acknowledgements

The author would like to thank the University of Ahmad Dahlan for providing the grant.

REFERENCES

X. Pang, S. Zhang, J. Gu, L. Li, B. Liu, and H. Wang, Improved L0 Gradient Minimization with L1 Fidelity for Image Smoothing, PLOS ONE, 1-10, 2015.

D. Zivkovic, M. Steinrucken, Y.S. Song, and W. Stephan, Transition Densities and Sample Frequency Spectra of Diffusion Processes with Selection and Variable Population Size, Genetics, 601-617, 2015.

J.A. Kamm, J.P. Spence, J. Chan, and Y.S. Song, Two-Locus Likelihoods Under Variable Population Size and Fine-Scale Recombination Rate Estimation, Genetics, 1381-1399, 2016.

S. Nandy, C.Y. Lim, and T. Maiti, Additive model building for spatial regression, J. R. Statist. Soc. B, 779-800, 2017.

Y. Hu, S. Feng, and L. Xue, Automatic Variable Selection for Partially Linear Functional Additive Model and Its Application to the Tecator Data Set, Mathematical Problems in Engineering, 1-9, 2018.

R. Richardson, H.D. Tolley, W.E. Evenson, and B.M. Lunt, Accounting for measurement error in log regression models with applications to accelerated testing, PLOS ONE, 1-13, 2018.

T.A. Marques, Predicting and Correcting Bias Caused by Measurement Error in Line Transect Sampling Using Multiplicative Error Models, Biometrics, 757-763, 2004.

Z. Wang and S.S. Abeysekera, Asymptotic Bounds for Frequency Estimation in the Presence of Multiplicative Noise, Journal on Advances in Signal Processing, 7, 1-9, 2007.

V.K. Andriasyan, Adsorption of Ligands on DNA for Arbitrary Filling in the Presence of Multiplicative Noise, Journal of Contemporary Physics, 48, 243-246, 2013.

Suparman and M. Doisy, Bayesian Segmentation in Signal with Multiplicative Noise Using Reversible Jump MCMC, Telkomnika, 673-680, 2018.

E. Punskaya, C. Andrieu, A. Doucet, and W.J. Fitzgerald, Bayesian Curve Fitting Using MCMC With Applications to Signal Segmentation, IEEE Transactions on Signal Processing, 50, 747-758, 2002.

Z. Shi and H. Aoyama, Estimation of the Exponential Autoregressive time series model by using the genetic algorithm, Journal of Sound and Vibration, 309-321, 1997.

N. Sad, On Exponential Autoregressive Time Series Models, J. Math, 97-101, 1999.

L. Larbi and H. Fellag, Robust Bayesian Analysis of an Autoregressive Model with Exponential Innovations, Afr. Stat., 955-964, 2016.

G.A. Triantafyllidis, D. Tzovaras, and M.G. Strintzis, A Bayesian Approach for Segmentation in Stereo Image Sequences, Journal on Applied Signal Processing, 10, 1116-1126, 2002.

P.J. Green, Reversible Jump MCMC Computation and Bayesian Model Determination, Biometrika, 4, 711-732, 1995.

Mathematics and Statistics 8(2A): 23-27, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081304

Weakly Special Classes of Modules

Puguh Wahyu Prasetyo1,*, Indah Emilia Wijayanti2, Halina France-Jackson3, Joe Repka4

1Department of Mathematics Education, Ahmad Dahlan University, Indonesia 2Department of Mathematics, Gadjah Mada University, Indonesia 3Department of Mathematics, Nelson Mandela Metropolitan University, South Africa 4Department of Mathematics, University of Toronto, Canada

Received August 10, 2019; Revised February 20, 2020; Accepted February 25, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  In the development of the Radical Theory of Rings, there are two kinds of radical constructions. The first radical construction is the lower radical construction and the second one is the upper radical construction. In fact, the class π of all prime rings forms a special class, and the upper radical class U(π) of π forms a radical class which is called the prime radical. An upper radical class which is generated by a special class of rings is called a special radical class. On the other hand, we also have the class ρ of all semiprime rings, which is a weakly special class of rings. Moreover, we can construct a special class of modules by using a given special class of rings. This condition motivates the question of how to construct weakly special classes of modules by using a given weakly special class of rings. This research is qualitative research. The results of this research are derived from fundamental axioms and properties of radical classes of rings, especially of special and weakly special radical classes. In this paper, we introduce the notion of a weakly special class of modules, a generalization of the notion of a special class of modules, based on the definition of semiprime modules. Furthermore, some properties and examples of weakly special classes of modules are given. The main results of this work are the definition of a weakly special class of modules and its properties.

Keywords  Prime Module, Semiprime Module, Special Class of Rings, Special Class of Modules, Weakly Special Class of Rings, Weakly Special Class of Modules

1. Introduction

In this paper, we consider only associative rings, but do not require them to be commutative or to have an identity. Let A be a ring. An ideal P of A is called a prime ideal of A if for any two ideals I and J of A, IJ ⊆ P implies that either I ⊆ P or J ⊆ P. Moreover, an ideal K of A is called a semiprime ideal of A if for any ideal I of A, I² ⊆ K implies I ⊆ K. The ring A is a semiprime ring if the zero ideal 0 is a semiprime ideal. An ideal L of A such that L ≠ A is called a maximal ideal of A if A does not contain any proper ideal I of A satisfying L ⊂ I ⊂ A. A nonzero ideal I of A is called a minimal ideal of A if I ≠ 0 and A does not contain any ideal J of A such that 0 ⊂ J ⊂ I. A nonzero ideal K of A is called an essential ideal if K ∩ I ≠ 0 for every nonzero ideal I of A, and this is denoted by K ◁· A. A ring A is called a prime essential ring if A is a semiprime ring and every prime ideal of A is an essential ideal [4]. The important consequences of the existence of prime essential rings were given in [3, 4]. An A-module M is said to be a prime module if there exist elements n ∈ M and a ∈ A such that an ≠ 0, and if m ∈ M and J ◁ A are such that Jm = 0, then m = 0 or JM = 0. Furthermore, an A-module M is said to be a faithful module if (0 : M)_A = {a ∈ A | aM = 0} = 0. Moreover, an A-module M is called a simple module if there exist elements r ∈ A, m ∈ M satisfying rm ≠ 0 and the module M has only the trivial submodules, 0 and M itself [5]. A submodule N of M is said to be a semiprime submodule of M if for any L ◁ A and every submodule P of M, L²P ⊆ N implies LP ⊆ N. Furthermore, the A-module M is called a semiprime module over a ring A if 0 is a semiprime submodule of M [1]. In fact, every prime module is semiprime. However, the converse is not generally true.

We shall follow the Amitsur-Kurosh definition of a radical class. Let γ be any collection of rings. The class γ of rings forms a radical class if γ has the following three properties [5]:
1. The class γ is closed under homomorphisms, that is, if A ∈ γ, then A/I ∈ γ for every I ◁ A. In other words, the class γ contains all images of every ring in γ under any ring homomorphisms,
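To make the prime/semiprime distinction concrete, here is a standard example in the ring of integers (an illustration added here, not taken from the paper):

```latex
\text{In } A=\mathbb{Z} \text{ every ideal has the form } n\mathbb{Z}.\\
n\mathbb{Z} \text{ is a prime ideal} \iff n=0 \text{ or } n \text{ is a prime number};\\
n\mathbb{Z} \text{ is a semiprime ideal} \iff n=0 \text{ or } n \text{ is squarefree}.\\
\text{E.g. } 6\mathbb{Z} \text{ is semiprime but not prime: }
2\mathbb{Z}\cdot 3\mathbb{Z}=6\mathbb{Z}\subseteq 6\mathbb{Z}
\text{ with } 2\mathbb{Z}\not\subseteq 6\mathbb{Z} \text{ and } 3\mathbb{Z}\not\subseteq 6\mathbb{Z};\\
4\mathbb{Z} \text{ is not semiprime: }
(2\mathbb{Z})^{2}=4\mathbb{Z}\subseteq 4\mathbb{Z} \text{ yet } 2\mathbb{Z}\not\subseteq 4\mathbb{Z}.
```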


2. Let A be any ring. If we define γ(A) = {I ◁ A | I ∈ γ}, then γ(A) ∈ γ,
3. For any ring A, the factor ring A/γ(A) has no nonzero ideal in γ.

Furthermore, a radical class γ is hereditary if γ contains all ideals I of all the rings in γ. A hereditary radical class γ containing the class β₀ = {R | Rⁿ = 0 for some n ∈ ℤ⁺} of all nilpotent rings is called a supernilpotent radical. On the other hand, let μ be a class of rings consisting of prime rings (respectively, semiprime rings). The class μ is called a special (respectively, weakly special) class if μ is hereditary and if R contains an essential ideal I ∈ μ, then R ∈ μ. The definition of a weakly special class is the main contribution of this work. In 1993, [3] showed that the class of all prime essential rings can be applied to determine whether a radical is special. Some properties of special classes of rings were also described in [9].

A prime ring A is said to be a *-ring if A is a prime ring and there is no nonzero proper ideal I of A such that A/I is prime. The definition of a *-ring was introduced by Korolczuk (France-Jackson) in her paper [6]. A significant contribution of the existence of *-rings to the development of the Radical Theory of Rings can be found in Theorem 1 in [6]. Let μ be any special class of rings. Then the class U(μ) = {A | there is no proper ideal I of A satisfying A/I belongs to μ} of rings forms a radical class of rings, and the upper radical class U(μ) is called a special radical class. The upper radical class U(π) of the class π of all prime rings will be called the prime radical class, and it is denoted by β. It follows from [6] that for a nonzero *-ring A, the smallest special radical L̂_A containing A forms a special atom, that is, the smallest special radical which properly contains the prime radical β. A hereditary radical class is called a supernilpotent radical if it contains the prime radical β. Therefore, every special radical is supernilpotent. A supernilpotent radical γ is called a supernilpotent atom if γ is the smallest supernilpotent radical which properly contains the prime radical β. In fact, every nonzero *-ring generates a supernilpotent atom, since the smallest supernilpotent radical L_A containing a nonzero *-ring A is a supernilpotent atom [2]. Moreover, [11] showed that there exists a prime essential ring which generates a special atom.

Let μ be a special class of rings. A ring A is called a *μ-ring if A ∈ μ and A does not contain a nonzero proper ideal which belongs to μ. The class of all *μ-rings will be denoted by *μ. Special classes μ which generate radical classes coinciding with the radical generated by *μ were given in [9]. Moreover, a ring A is called a subdirectly irreducible ring if ∩{I ≠ 0 | I ◁ A} ≠ 0. In other words, the intersection of all nonzero ideals of A is nonzero; otherwise, if this intersection is zero, the ring A is called a subdirectly reducible ring. It follows from Theorem 2 in [8] that every prime essential ring is a subdirectly reducible ring. Some properties of generalizations of prime essential rings were also given in [8].

Let R and S be rings, and let V = _R V_S and W = _S W_R be an R-S-bimodule and an S-R-bimodule, respectively. The 4-tuple (R, V, W, S) is said to be a Morita context if the set of all 2 × 2 matrices [a₁₁ a₁₂; a₂₁ a₂₂] in which the entries satisfy a₁₁ ∈ R, a₁₂ ∈ V, a₂₁ ∈ W, and a₂₂ ∈ S forms a ring under matrix addition and matrix multiplication. This definition can be considered if the maps V × W → R and W × V → S exist [5]. Let γ be any radical class of rings. The radical class γ of rings is said to be a normal radical if Vγ(S)W is contained in the largest ideal γ(R) of R contained in γ, for every 4-tuple (R, V, W, S) which is a Morita context.

Furthermore, as in [5], for any ring A, let κ_A denote the class of modules M defined over the ring A satisfying AM ≠ 0, and let κ = ∪ κ_A. Now, let ker(κ_A) = {(0 : M)_A | M ∈ κ_A}, and consider the following conditions which the class κ might satisfy:

(M1) If I is an ideal of A and M belongs to κ_{A/I}, then M belongs to κ_A.
(M2) Let A be a ring and let I be an ideal of A such that (0 : M)_A contains I; then M belongs to κ_A if and only if M belongs to κ_{A/I}.

Moreover, it follows from [7] that for every ring A, the class κ forms a special class of modules if it has the properties (M1) and (M2) and satisfies the following conditions:

(SM3) If M ∈ κ_A, B ◁ A and there exist an element r ∈ B and m ∈ M such that rm ≠ 0, then M ∈ κ_B.
(SM4) Let A be a ring and let B be any ideal of A. If M ∈ κ_B, then BM ∈ κ_A.

In the general case, a class κ of modules which consists of prime modules is said to be special if the conditions (M1), (M2), (SM3), and (SM4) hold for κ. We already know about the normal class of modules introduced in [7] as the generalization of a normal class of rings. We shall follow [7] for the definition of a normal class of modules. Let κ(A) be the class of prime A-modules and κ = ∪ κ(A), the union extending over all rings A. The class κ is called a normal class of modules if κ satisfies condition (M2) and, for every 4-tuple (R, V, W, S) which is a ring of a Morita context and the


M  and let M be a semiprime module over the ring A such that context module E =   is such that if n N satisfies 2  N  (0: M) A = I. Suppose J  A satisfies J  I. Since Vn = 0, then n = 0 and N =WM and SN  0, M  (R) 2 (0 : M) A = I, J P = 0 for every submodule P of M. implies N  (R). By the semiprimeness of M , JP = 0 for every submodule Moreover, let be any special classes of modules. It  P of M. Therefore, J  I. This means that I is a follows from [7] that the class  ={A| A has a faithful semiprime ideal of the ring A. Conversely, let I be a module in (A)} of rings, where (A) is the class of semiprime ideal of A such that (0: M) A = I. Let J  A prime A − modules, forms a special class of rings. 2 Conversely, for every special class , the class such that J P = 0 for every submodule P of M. 2  = (A) of all prime A − modules, where Therefore, J  I. Since I is a semiprime ideal of (A) ={M | M is a prime A − module and the factor A, J  I = (0: M) A. Hence JP = 0.In other words, M is a semiprime A − module. ring A/(0 : M) A belongs to } forms a special class of As consequences of Theorem 2.2, we therefore have the modules [7]. following results. The primeness and the semiprimeness of a ring Theorem 2.3. Let R be any ring. The following (respectively, a module) are important in the development conditions are equivalent of modern algebra especially in the development of the i. The ring is a semiprime ring Radical Theory of Rings and Modules. In the ring theory, ii. The ring R has a faithful semiprime module. we have the class of all prime rings which is the largest special class of rings and the class of all semiprime rings Proof. which is the largest weakly special class of rings. Some (i  ii) subsets of these classes generate special classes of rings and Assume R is a semiprime ring. Clearly, the singleton {0} weakly special classes of rings, respectively. 
The class of of is a semiprime ideal of R.It follows from Theorem all prime essential rings is contained in the class of all 2.2 that there is an R − module M such that M is a semiprime rings and it can be used as a tool in determining semiprime module with (0: M)R = 0. This means that the speciality of any radical class of rings. Hence, the M is a faithful semiprime R − module. existence of prime essential rings is very important. We (ii  i) will give an alternative definition of prime essential rings Conversely, suppose R is a ring such that there exists a in view of the module theory so we can work on the module semiprime R − module M such that (0 : M) = 0. It theory. R In fact, some properties of semiprime modules are also follows from Theorem 2.2 that 0 is a semiprime ideal of R. valid for prime modules. Furthermore, some properties of Therefore, R is a semiprime ring. the class of all prime modules extending over any rings Theorem 2.4. Let A be a ring. The following have been found in [7], the class of all prime modules forms statements are equivalent a special class of modules. In general, it is very important i. The ring A is prime essential ring to know how to construct weakly special classes of ii. The ring A has a nonzero faithful semiprime modules and investigate their properties so that we can module and for every nonzero ideal I of A, I has clarify some properties which are valid for special classes no faithful prime module. of modules and which also hold for weakly special class of Proof. Obvious. modules. The existence of special classes of rings motivated the study of special classes of modules. In this notion, we 2. Results and Discussion would like to introduce the weakly special class of modules. Definition 2.5. Let  (A) be a class consistings of We already have a necessary and sufficient condition for semiprime A − modules and  = (A), the union any ring A to have a prime module over itself. This extending over all rings A. 
Then  is called a weakly property can be found in [5]. Furthermore, in the general case, we show that this property is also valid for the special class of modules if  satisfies the conditions (M1), existence of semiprime modules. A necessary and (M2), (SM3), and (SM4). sufficient condition for a ring A to have a semiprime The followings theorems show that weakly special module over itself is given below. classes of modules exist. Theorem 2.2. Given a ring A and let I  A. Then there Theorem 2.6. Given any ring A, define  A to be the exists a semiprime A − module M such that the annihilator class of all semiprime A − modules. Then the class (0: M) A = I  I is a semiprime ideal of A.  =   A is a weakly special class of modules. Proof. Let A be a ring such that there is an ideal I  A, Proof. Let I = (0: M) A = 0 and let M be a


semiprime A/I-module. We would like to show that M is a semiprime A-module. Let J ◁ A and let N be any submodule of M such that J²N = 0. Since J ◁ A, (J + I)/I ◁ A/I and J²N = 0, the following property holds: ((J + I)/I)²N = 0. Moreover, by the semiprimeness of M over A/I, ((J + I)/I)N = 0. This implies JN = 0. Therefore, M is a semiprime A-module.

Let M be a semiprime A-module and let I ◁ A be such that I ⊆ (0 : M)_A. Let J/I ◁ A/I and let N be a submodule of M satisfying (J/I)²N = 0; consequently, (J/I)²N = 0 if and only if J²N = 0. By the semiprimeness of M, JN = 0. This gives (J/I)N = 0. Therefore, M is a semiprime A/I-module.

Let M ∈ κ_A and let B ◁ A. Suppose J ◁ B and let N be any submodule of M such that J²N = 0. It follows from Andrunakievich's Lemma that J̄³ ⊆ J, where J̄ is the ideal of A generated by J. Therefore, J̄⁶ = (J̄³)² ⊆ J², and consequently J̄⁶N = 0. By the semiprimeness of M over the ring A, J̄³N = 0. This condition implies J̄⁴N = 0, and moreover J̄²N = 0. Hence, J̄N = 0. It follows that JN ⊆ J̄N = 0, so JN = 0. Thus M ∈ κ_B.

Let M ∈ κ_B, B ◁ A. We would like to show that BM ∈ κ_A, in other words, that (0 : BM)_A is a semiprime ideal of A. Let J ◁ A be such that JB ≠ 0 and J² ⊆ (0 : BM)_A. Moreover, JB ⊆ J and JB ⊆ B. Hence (JB)² ⊆ J² and (JB)² ⊆ B². Thus,

  (JB)² ⊆ J² ∩ B² ⊆ J² ∩ B ⊆ (0 : BM)_A ∩ B = (0 : M)_B.

This implies that (JB)² ⊆ (0 : M)_B. Since (0 : M)_B is a semiprime ideal, JB ⊆ (0 : M)_B. This means that JBM = 0, and consequently J ⊆ (0 : BM)_A. So, we may deduce that BM is a semiprime A-module.

In fact, we have already seen some examples of special classes of modules, which can be accessed in [5]. A different example of a weakly special class of modules is described below.

Theorem 2.7. Every special class of modules is a weakly special class of modules.

Proof. Let κ be a special class of modules. Since every prime module is semiprime, the class κ of prime modules consists of semiprime modules. It follows from the speciality of κ that κ satisfies the conditions (M1), (M2), (SM3), and (SM4). Hence, the class κ of semiprime modules is a weakly special class.

Moreover, Theorem 2.7 clearly shows that every special class of modules is naturally a weakly special class of modules. This condition is equivalent to the property stating that every special class of rings is a weakly special class of rings.

In fact, the essential closure *_k of the class * of all *-rings forms a special class of rings. Hence, the class κ* = ∪ κ*(A), where κ*(A) = {M | M is a prime A-module and A/(0 : M)_A ∈ *_k}, forms a special class of modules [10]. Let A be any ring and let M be an A-module. The module M is called a *p-module if M is a prime A-module and M has no nonzero proper prime submodule. Some properties of *p-modules were given in [10]. Moreover, the following theorem shows that every nonzero *p-module is contained in κ*(A).

Theorem 2.8. Every *p-module is contained in κ*(A).

Proof. Let A be a ring and let M be a *p-module over the ring A. Then it follows from the definition of a *p-module that M is a prime A-module and M has no nonzero proper prime A-submodule. Since M is a prime A-module, Proposition 3.14.16 in [5] shows that the annihilator (0 : M)_A of M is a prime ideal of A. Therefore, the factor ring A/(0 : M)_A ∈ π, where π is the class of all prime rings. Suppose A/(0 : M)_A is not in the class * of all *-rings. Then A/(0 : M)_A has a nonzero proper prime homomorphic image, say A/B ≅ (A/(0 : M)_A)/(B/(0 : M)_A). Then B is a prime ideal of A such that (0 : M)_A ⊂ B. Moreover, BM is a nonzero proper prime A-submodule of M, contrary to the assumption that M is a *p-module. Hence, A/(0 : M)_A ∈ * ⊆ *_k. This means that M is contained in the special class of modules generated by the essential closure *_k of the class of all *-rings.

Another natural example of a weakly special class of modules is described in the following result.

Theorem 2.9. Every normal class of modules is a weakly special class of modules.

Proof. It follows from Proposition 2 in [7] that every normal class of modules forms a special class of modules. Moreover, it follows from Theorem 2.7 that every normal class of modules is weakly special.

On the other hand, some other properties of *p-modules can be found in [10].

Theorem 2.10. If κ is a weakly special class of


modules, then the class μ = {R | R has a faithful module in κ(R)} of rings is a weakly special class of rings.

Proof. We will show that μ consists of semiprime rings. Let R ∈ μ. Then R has a faithful module in the class κ(R) of all semiprime R-modules. It follows from Theorem 2.3 that R is semiprime. Therefore, μ consists of semiprime rings. Now let 0 ≠ I ◁ R ∈ μ. Since R is a semiprime ring, so is I. It means that I has a faithful module in the class κ(I) of all semiprime I-modules, so I ∈ μ. Now suppose R is a ring such that there exists an essential ideal I ◁· R with I ∈ μ. This means that I has a faithful module in κ(I), say M. Then (0 : M)_R = 0 and M ∈ κ(I). Since M ∈ κ(I), IM ∈ κ(R). Let x ∈ (0 : IM)_R. Then xIM = 0. Since (0 : M)_R = 0 and IM ≠ 0, xIM = 0 implies x = 0. This means that IM is a faithful R-module. Hence, R has a faithful module in κ(R). So, we may deduce that μ is a weakly special class of rings.

As a direct consequence of Theorem 2.10, we therefore have the following result.

Theorem 2.11. Let κ be a weakly special class of modules and let μ = {R | there exists an R-module M such that (0 : M)_R = 0 and M belongs to κ(R)} be the class of rings such that every member of μ has a faithful module in the class κ(R) of all semiprime R-modules. Then the upper radical U(μ) generated by the class μ of rings is a supernilpotent radical class of rings.

Proof. It follows from Theorem 2.10 that μ is a weakly special class of rings. Therefore, U(μ) is a supernilpotent radical class.

3. Conclusions

Based on the results, we can construct a semiprime module over a ring R if the ring R has a semiprime ideal. This condition is described in Theorem 2.2. As a consequence of this property, a necessary and sufficient condition for a ring to be semiprime is having a faithful semiprime module. Moreover, the construction of a weakly special class of modules is also successfully determined by following the construction of a special class of modules, which was introduced by Nicholson and Watters in [7]. A further property of *p-modules described in this paper is that every *p-module is contained in κ*(A), the class of all A-modules M such that A/(0 : M)_A ∈ *_k. We recommend that the reader investigate further properties of *p-modules and determine whether any properties of special classes of modules are also valid for weakly special classes of modules.

Acknowledgements

The first author wishes to thank all members and faculty of the Mathematics Department, University of Toronto, for their kindness. Some parts of this paper were started when the first author visited the Mathematics Department, University of Toronto, Ontario, Canada in Winter 2016.

REFERENCES

M. Behboodi, A Generalization of Baer's Lower Nilradical for Modules, J. Algebra Appl., Vol.6, 337-353.

H. France-Jackson, On Atoms of the Lattice of Supernilpotent Radicals, Quaestiones Mathematicae, Vol.10, 251-255.

H. France-Jackson, On Prime Essential Rings, Bull. Austral. Math. Soc., Vol.47, 287-290.

B. J. Gardner, P. M. Stewart, Prime Essential Rings, Proceedings of The Edinburgh Mathematical Society, Vol.34, 241-250.

B. J. Gardner, R. Wiegandt, Radical Theory of Rings, Marcel Dekker Inc., New York, 2004.

H. Korolczuk, A Note on the Lattice of Special Radicals, Bulletin De L'Academie Polonaise Des Sciences, Vol.29, 3-4.

W. Nicholson, J. Watters, Normal Radicals and Normal Classes of Modules, Glasgow Math. J., Vol.30, 97-100.

P. W. Prasetyo, I. E. Wijayanti, H. France-Jackson, Some properties of prime essential rings and their generalization, Proceedings of the 1st International Conference on Science and Technology, (120003) 1-6, 2016.

P. W. Prasetyo, I. E. Wijayanti, H. France-Jackson, Some special classes μ whose upper radical U(μ) determined by μ coincides with the upper radical U((*μ)k) determined by (*μ)k, Proceedings of The 7th SEAMS UGM Int. Conference on Mathematics and Its Applications, (020013) 1-4, 2016.

P. W. Prasetyo, I. E. Wijayanti, H. France-Jackson, *p-Modules and A Special Class of Modules Determined by The Essential Closure of the Class of All *-Rings, JP Journal of Algebra, Number Theory and Its Applications, Vol.39, 11-20.

S. Wahyuni, I. E. Wijayanti, H. France-Jackson, A Prime Essential Ring That Generates a Special Atom, Bull. Aust. Math. Soc., Vol.95, 214-218.

Mathematics and Statistics 8(2A): 28-35, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081305

Markov Chain: First Step towards Heat Wave Analysis in Malaysia

Nur Hanim Mohd Salleh1,*, Husna Hasan1, Fariza Yunus2

1School of Mathematical Sciences, Universiti Sains Malaysia, Malaysia 2Malaysian Meteorological Department (Jabatan Meteorologi Malaysia), Malaysia

Received August 8, 2019; Revised October 8, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  Studies of extreme temperature have been carried out around the world to provide awareness and a proper opportunity for societies to prepare the necessary arrangements. In this present paper, the first order Markov chain model was applied to estimate the probability of extreme temperature based on the heat wave scales provided by the Malaysian Meteorological Department. In this study, the 24-year period (1994-2017) of daily maximum temperature data for 17 meteorological stations in Malaysia was assigned to the four heat wave scales, which are monitoring, alert level, heat wave and emergency. The analysis result indicated that most of the stations had three categories of heat wave scales. Only Chuping station had four categories, while Bayan Lepas, Kuala Terengganu, Kota Bharu and several other stations had two categories. The limiting probabilities obtained at each station showed a similar trend in which the highest proportion of daily maximum temperature occurred in the scale of monitoring, followed by the alert level. This trend is apparent when the daily maximum temperature data revealed that Malaysia is experiencing two consecutive days of temperature below 35˚C.

Keywords  Markov Chain, Maximum Temperature, Heat Wave, Transition Probability Matrix, Limiting Probabilities

1. Introduction

One of the most important climate parameters which affect natural and social phenomena is the temperature. The temperature that has exceeded the average temperature for a given area is considered as a heat wave. Recently, in February 2019, the Malaysian Meteorological Department (MMD) issued a level 1 alert for ten areas in the country amidst the ongoing nationwide heat wave. The ten areas are located in the western region of Peninsular Malaysia, including Perlis, Kedah, Perak, Kuala Lumpur and Johor [1]. The level 1 alert was issued as the daily maximum temperature exceeded 35˚C for three consecutive days.

2. Background of Study

Stochastic processes are progressively being used for modeling and predicting climate data. The arising need for this approach in climate and weather models is caused by the inability to resolve all necessary processes and scales in comprehensive numerical weather and climate prediction models [2]. One of the famous stochastic models applied to evaluate climate data is the Markov chain. The Markov chain model is a commonly used tool for simulating time series of discrete random variables [3]. This model is considered a memoryless process because the probability of transitioning from one state to another depends only on the current state and not on the past. The detailed theory and structure of Markov chains can be referred to in some textbooks on stochastic processes [4, 5]. Many researchers around the world have applied the Markov chain model to analyze climate data. In Malaysia, several similar analyses have also been attempted to model the daily observation of rainfall [6], wind speed [7] and maximum temperature [8]. In the year 2015, Hasan [8] proposed the application of a Markov chain to daily maximum temperature data. The study covers only the northern region of Peninsular Malaysia. A scale for the state of transition was determined by using the physiological equivalent temperature (PET). The same scale was applied by Hassan and Hasan [9] in the year 2017 to determine the steady-state probability for the daily maximum temperature across Peninsular Malaysia. This study is the extension of these two works. The main purpose of the current work is to describe the heat wave condition in Malaysia by applying the Markov chain model to the daily maximum temperature data.
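The memoryless property described above can be made concrete with a tiny simulation: the next state of a first-order chain is drawn using only the current state's row of the transition matrix, never the earlier history. The two states and probabilities below are invented for illustration and are not taken from this study:

```python
import random

# Hypothetical two-state chain (0 = "cool day", 1 = "hot day");
# these transition probabilities are made up for illustration.
P = [[0.9, 0.1],
     [0.6, 0.4]]

def step(state, rng):
    # First-order (memoryless) transition: the next state depends only
    # on the current state's row of P, not on any earlier state.
    return 0 if rng.random() < P[state][0] else 1

rng = random.Random(42)
state, path = 0, [0]
for _ in range(10):
    state = step(state, rng)
    path.append(state)
```

Higher-order chains would instead condition on the last two or more states; the first-order model used in this paper keeps the transition rule a single row lookup.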

Mathematics and Statistics 8(2A): 28-35, 2020 29

Figure 1. Location of 17 selected stations in Malaysia

3. Data Characteristics

The data used in this study are the daily maximum temperatures, measured in degrees Celsius (˚C), from 17 meteorological stations across Malaysia. These data were provided by the Malaysian Meteorological Department (MMD) with less than 2% missing data. Each station has at least 24 years of daily data except KLIA station (19 years). The location of each station is illustrated in Figure 1. Fourteen stations are located in the western part of Malaysia (Peninsular Malaysia) while the remaining three stations are located in the eastern region of Malaysia (Sabah and Sarawak).

4. Methodology

4.1. Heat Wave Scales

The daily maximum temperature model based on a Markov chain describes the temperature changes by following the heat wave scales. Four levels of heat wave scale provided by the Malaysian Meteorological Department are used in this study, and the daily temperature data are assigned to these heat wave scales. The descriptive terms for each scale, together with the range of daily maximum temperature, are listed in Table 1.

Table 1. The details of heat wave scales

Scale   Descriptive term   Range of daily maximum temperature (˚C)
1       Monitoring         below 35
2       Alert level        [35, 37]
3       Heat wave          (37, 40]
4       Emergency          above 40

4.2. A Markov Chain Model for Daily Maximum Temperature

There are two properties that need to be distinguished when applying a Markov chain model to climate data. The first property is the "state", defined as the number of different values that the variable can have, and the second property is the "order", described as the number of previous values used to determine the state-to-state transition probabilities [3]. In this study, the Markov chain model applied to the daily maximum temperature is the first order model with four states, which can be described as follows. Let X_t be defined as the state of temperature at the t-th day, where

X_t = i,  i = 1, 2, 3, 4

1. if day t is at the state of monitoring,
2. if day t is at the state of alert level,
3. if day t is at the state of heat wave,
4. if day t is at the state of emergency.

Assuming that the state of temperature is dependent on the condition of the previous day, X_t follows a first-order Markov chain. The equation below expresses the conditional probability of the temperature being at state j on day t given the temperature at state i on day t−1. The transition probabilities, estimated from the historic measurements, signify the probabilities of the temperature moving from state i to state j. In general, the transition probability from state i into state j can be written as

p_ij = P[X_t = j | X_{t−1} = i],  i, j = 1, 2, 3, 4

4.3. Development of Transition Probability Matrix

To obtain the transition probability matrix P, the data with previous state i and current state j are first counted. Then, the counted values m_ij are sorted and placed into a transition count matrix M. Since there are

30 Markov Chain: First Step towards Heat Wave Analysis in Malaysia

four states of transition used in this study, the transition of daily maximum temperature can be divided into 16 cases, which constitute the following count matrix M. Define

M = | m_11  m_12  m_13  m_14 |
    | m_21  m_22  m_23  m_24 |
    | m_31  m_32  m_33  m_34 |
    | m_41  m_42  m_43  m_44 |

which is then transformed into the transition probability matrix P, that is

P = | p_11  p_12  p_13  p_14 |
    | p_21  p_22  p_23  p_24 |
    | p_31  p_32  p_33  p_34 |
    | p_41  p_42  p_43  p_44 |

where p_ij = m_ij / Σ_{j=1}^{4} m_ij and Σ_{j=1}^{4} p_ij = 1.

4.4. Development of Limiting State Probabilities

The n-step transition probabilities can be defined as follows:

P^n = | p_11  p_12  p_13  p_14 |^n
      | p_21  p_22  p_23  p_24 |
      | p_31  p_32  p_33  p_34 |
      | p_41  p_42  p_43  p_44 |

As n increases, the n-step transition probabilities converge to certain probabilities:

lim_{n→∞} P^n = | π_1  π_2  π_3  π_4 |   (every row identical),  with Σ_{j=1}^{4} π_j = 1.

The probabilities π_1, π_2, π_3, π_4 represent the mean occurrence probabilities that the daily maximum temperature will be in each state, namely the monitoring, alert level, heat wave and emergency states respectively, after a sufficiently long time. These probabilities are also called the limiting or stationary state probabilities. The limiting state probabilities can be obtained by first multiplying the transition probability matrix by itself; the resulting matrix is then iterated until the transition probability matrix reaches equilibrium [8].

4.5. Analyzing Markov Chain Using R Software

R is a free software environment for statistical computing and graphics. R comes with a standard set of packages, and other packages can be downloaded and installed. These packages contain R functions, data and compiled code in a well-defined format. In this study, the package named 'markovchain' has been used to manage discrete-time Markov chains more easily. It was developed and maintained by Giorgio Alfredo Spedicato and published on 21st January 2019. An individual transition count matrix for each station can be created using the createSequenceMatrix function, while the transition probability matrices can be computed using the markovchainFit function. Using a plot function, a better understanding of the Markov chain model can be gained through a transition diagram, a graphical representation of a Markov chain which is equivalent to its transition probability matrix [10]. Furthermore, the steady-state distribution, or limiting state probabilities, can also be computed using the steadyStates function.

5. Result and Discussion

5.1. Descriptive Statistics

The descriptive statistics, consisting of the mean, maximum and minimum values (˚C) of the daily maximum temperature from the selected meteorological stations, are presented in Table 2. The mean values of the observed data range from 31.30˚C to 32.94˚C. The highest value of maximum temperature is recorded at Chuping (40.10˚C), followed by the Alor Setar station (39.10˚C). It is interesting to note that both stations are located in the northern region of Peninsular Malaysia.

Table 2. Descriptive statistics of daily maximum temperature

Station name        Mean   Maximum  Minimum
Chuping             32.87  40.10    23.70
Alor Setar          32.70  39.10    24.10
Bayan Lepas         31.69  36.00    25.10
Sitiawan            32.28  36.40    23.90
Subang              32.94  37.90    24.60
KLIA                32.11  37.40    24.20
Seremban            32.24  38.30    23.30
Melaka              32.13  38.00    24.40
Mersing             31.30  38.20    23.60
Senai               31.96  37.20    23.40
Muadzam Shah        32.50  37.70    23.30
Kuantan             31.99  37.80    23.20
Kuala Terengganu    31.39  35.80    23.80
Kota Bharu          31.30  36.40    23.80
Kuching             31.75  37.30    23.60
Labuan              31.31  36.60    25.00
Kota Kinabalu       32.17  36.50    24.30
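The paper carries out the fitting of Sections 4.3 and 4.5 in R with the 'markovchain' package (createSequenceMatrix, markovchainFit). As a language-neutral sketch of the same computation, the Python fragment below classifies a temperature series into the Table 1 states and builds the count matrix M and the row-normalised probability matrix P; the temperature values here are hypothetical, not station records:

```python
import numpy as np

def heat_wave_state(t):
    """Assign a daily maximum temperature (degC) to the heat wave scales
    of Table 1: 1 monitoring, 2 alert level, 3 heat wave, 4 emergency."""
    if t < 35.0:
        return 1
    if t <= 37.0:
        return 2
    if t <= 40.0:
        return 3
    return 4

def transition_matrices(temps):
    """Count matrix M (m_ij = transitions from state i to state j on
    consecutive days) and the row-normalised probability matrix P."""
    states = [heat_wave_state(t) for t in temps]
    M = np.zeros((4, 4))
    for i, j in zip(states, states[1:]):
        M[i - 1, j - 1] += 1
    rows = M.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as zeros.
    P = np.divide(M, rows, out=np.zeros_like(M), where=rows > 0)
    return M, P

# Made-up ten-day temperature series, for illustration only.
temps = [33.1, 34.0, 35.2, 36.1, 34.8, 33.9, 37.5, 36.2, 34.4, 33.0]
M, P = transition_matrices(temps)
```

Each occupied row of P sums to one, matching the constraint Σ_j p_ij = 1 above; on the real station data this is exactly what markovchainFit estimates.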


5.2. Count and Transition Probability Matrices

The number of categories indicates the dimension of the respective count and probability matrices. To obtain the count matrix, the transitions between two consecutive days are tallied and totaled. We notice that a large number of days accumulate in cell m_11 compared to the other cells. This common trend shows that all of the stations tend to experience two consecutive days at the state of monitoring. The second highest counted values are observed at cell m_12 (the transition of temperature from the monitoring state to the alert level) for most of the stations, and at cell m_22 (the temperature stays at the alert level state for two consecutive days) for the Chuping and Alor Setar stations. As explained in the methodology section, the transition probability matrix, presented in Figure 2, can be computed from the count matrix. The result evaluated from the transition probability matrix shows that the daily maximum temperature has the highest probability (p > 0.900) of remaining at the state of monitoring for two consecutive days.

Furthermore, to obtain a better understanding of the transition probability for each station, the transition diagram, as shown in Figure 3, is plotted. Having obtained both the transition probabilities and the transition diagrams, the behavior of the daily maximum temperature can be assessed and the existence of heat waves at the studied locations can be determined. Referring to the transition diagrams, all states communicate with each other; thus, this Markov chain model is irreducible. The transition of temperature from the monitoring or alert level to the heat wave state can happen at the Chuping, Alor Setar, Subang, KLIA, Seremban, Melaka, Mersing, Senai, Muadzam Shah, Kuantan and Kuching stations, as their transition probability values p_13 > 0.000 or p_23 > 0.000.

To be considered a heat wave, the temperatures have to be outside the historical averages for a given area for at least two or more days [11]. As mentioned above, the daily maximum temperature in Malaysia falls into the heat wave category (Scale = 3) when the temperature is above 37˚C. Based on the transition probabilities and transition diagrams, several stations located in the western region of Peninsular Malaysia experience two consecutive days at the state of heat wave, with transition probability values p_33 > 0.000. The stations are Chuping, Alor Setar, Seremban, Melaka and Mersing. This result means that the western part of Peninsular Malaysia is more likely to experience heat waves compared to the other regions in Malaysia.
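The limiting state probabilities of Section 4.4 can be reproduced numerically by repeatedly multiplying the transition probability matrix by itself until every row agrees, exactly the iteration described in [8]. The sketch below uses a hypothetical 3-state matrix (monitoring, alert level, heat wave); its values are invented, not taken from Figure 2:

```python
import numpy as np

def limiting_probabilities(P, tol=1e-12, max_iter=10_000):
    """Iterate P <- P @ P until the matrix stops changing, i.e. until
    P^n has reached the stationary distribution of Section 4.4."""
    Pn = np.asarray(P, dtype=float)
    for _ in range(max_iter):
        nxt = Pn @ Pn
        if np.allclose(nxt, Pn, atol=tol):
            break
        Pn = nxt
    return Pn[0]  # at equilibrium every row equals (pi_1, ..., pi_k)

# Hypothetical irreducible 3-state matrix for illustration.
P = np.array([[0.90, 0.08, 0.02],
              [0.70, 0.25, 0.05],
              [0.50, 0.30, 0.20]])
pi = limiting_probabilities(P)
```

Because the chain is irreducible (all states communicate, as the transition diagrams show for the station data), the iteration converges to a single stationary row regardless of the starting state; this is the same quantity R's steadyStates function returns.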


Figure 2. Transition probability matrix for 17 stations


Figure 3. Transition diagram for each station (Chuping, Alor Setar, Bayan Lepas, Sitiawan, Subang, KLIA, Seremban, Melaka, Mersing, Senai, Muadzam Shah, Kuantan, Kuala Terengganu, Kota Bharu, Kuching, Labuan and Kota Kinabalu)


5.3. Limiting State Probabilities

The limiting state probabilities can be interpreted as the probabilities that remain unchanged in the Markov chain as time progresses. These probabilities are computed by multiplication of the transition probability matrix with itself until the stationary distribution is attained. After a sufficiently long time, the limiting state probabilities are achieved when each row of the transition probability matrix has the same probability values.

Table 3. Summary of stationary distribution

                      State of transition
Station Name         1      2      3       4
Chuping              0.878  0.104  0.018   <0.001
Alor Setar           0.900  0.090  0.010   -
Bayan Lepas          0.998  0.002  -       -
Sitiawan             0.990  0.010  -       -
Subang               0.983  0.016  <0.001  -
KLIA                 0.979  0.021  <0.001  -
Seremban             0.959  0.040  0.001   -
Melaka               0.976  0.023  0.001   -
Mersing              0.986  0.013  0.001   -
Senai                0.984  0.016  <0.001  -
Muadzam Shah         0.922  0.077  0.001   -
Kuantan              0.975  0.025  <0.001  -
Kuala Terengganu     0.995  0.005  -       -
Kota Bharu           0.996  0.004  -       -
Kuching              0.983  0.017  <0.001  -
Labuan               0.997  0.003  -       -
Kota Kinabalu        0.989  0.011  -       -

It can be understood from Table 3 that the highest mean occurrence probability of daily maximum temperature for all stations will be at the monitoring state after a sufficiently long time. The proportion of occurrence at the state of monitoring is quite high (π_1 > 0.800) compared to the other states. The Chuping, Alor Setar, Subang, KLIA, Seremban, Melaka, Mersing, Senai, Muadzam Shah, Kuantan and Kuching stations show that there are chances of daily temperature occurring at the state of heat wave. In contrast to the other stations, Chuping shows that there exists a small probability that the daily maximum temperature will be at the emergency state in the future.

6. Conclusions

In this paper, the behavior of daily maximum temperature data from 17 meteorological stations across Malaysia is assessed through Markov chain modeling. Previous studies have used the physiological equivalent temperature (PET) scale to determine the state of transition. For the data analyzed here, the heat wave indications (monitoring, alert level, heat wave and emergency) are used to categorize the data. The data classification result shows that the data can be categorized into four states of transition: monitoring, alert level, heat wave and emergency. The majority of the stations have three states of transition (monitoring, alert level and heat wave), except for the Bayan Lepas, Sitiawan, Kota Bharu, Kuala Terengganu, Labuan and Kota Kinabalu stations, which have two states, and Chuping, which possesses all four states.

It is not surprising to note that the highest transition probability value for each station is to remain at the state of monitoring for two consecutive days. Only five stations, located in the western region of Peninsular Malaysia, experience the state of heat wave for two consecutive days. Investigation of the limiting state probabilities shows a decreasing trend from the state of monitoring towards the emergency state. After a sufficiently long time, all the stations have the highest proportion of time in the monitoring state (temperature below 35˚C). This paper is limited to investigating only daily maximum temperature data. A more complete study of heat waves might consider other variables, including daily minimum temperature or a measure of heat stress.

Acknowledgements

We would like to thank the Malaysian Meteorological Department for providing the data.

REFERENCES

[1] H. K. Kannan, "Heat wave: MetMalaysia issues 'Level 1' alert for 10 areas nationwide", New Straits Times, online available from https://www.nst.com.my/news/nation/2019/02/463630/heatwave-metmalaysia-issues-level-1-alert-10-areas-nationwide, last visit: 25.04.2019.

[2] C. L. E. Franzke, T. J. O'Kane, J. Berner, P. D. Williams & V. Lucarini, Stochastic climate theory and modeling, WIREs Climate Change, Vol.6, 63–78, 2014.

[3] J. T. Schoof & S. C. Pryor, On the proper order of Markov chain model for daily precipitation occurrence in the contiguous United States, Journal of Applied Meteorology and Climatology, Vol.47, No.9, 2477–2486, 2008.

[4] D. R. Cox & H. D. Miller, The Theory of Stochastic Processes, Methuen, London, 1965.

[5] S. M. Ross, Introduction to Probability Models, Academic Press, New York, NY, 2010.

[6] S. Mohd Deni, Fitting optimum order of Markov chain models for daily rainfall occurrences in Peninsular Malaysia, Theoretical and Applied Climatology, Vol.97, No.1, 109-121, 2007.

[7] H. Hasan, A. Mohamad & N. H. Mohd Salleh, Application of Markov chain to wind speed in Northern Peninsular Malaysia, Journal of Applied and Physical Sciences, Vol.3, No.2, 52-57, 2017.

[8] H. Hasan, M. A. Che Nordin & N. H. Mohd Salleh, Modeling Daily Maximum Temperature for Thermal Comfort in Northern Malaysia, Advances in Environmental Biology, Vol.9, No.926, 12-18, 2015.

[9] S. Hassan & H. Hasan, Determining the Steady-State Probability for the Daily Maximum Temperature in Peninsular Malaysia, ESTEEM Academic Journal, Vol.13 (Special Issue), 129-138, 2017.

[10] F. Kachapova, Representing Markov Chains with Transition Diagrams, Journal of Mathematics and Statistics, Vol.9, No.3, 149-154, 2013.

[11] NOAA SciJinks, What Is a Heat Wave?, NOAA SciJinks, online available from https://scijinks.gov/heat/, last visit: 25.04.2019.

Mathematics and Statistics 8(2A): 36-39, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081306

Robust Method in Multiple Linear Regression Model on Diabetes Patients

Mohd Saifullah Rusiman1,*, Siti Nasuha Md Nor1, Suparman2, Siti Noor Asyikin Mohd Razali1

1Faculty of Applied Sciences and Technology, Universiti Tun Hussein Onn Malaysia, Malaysia 2Department of Mathematics Education, Universitas Ahmad Dahlan, Indonesia *Corresponding Author: [email protected]

Received August 3, 2019; Revised October 5, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  This paper focuses on the application of robust methods in the multiple linear regression (MLR) model for diabetes data. The objectives of this study are to identify the significant variables that affect diabetes by using the MLR model and the MLR model with robust methods, and to measure the performance of the MLR model with and without robust methods. Robust methods are used in order to overcome the outlier problem in the data. There are three robust methods used in this study, which are the least quartile difference (LQD), median absolute deviation (MAD) and least-trimmed squares (LTS) estimators. The result shows that multiple linear regression with the LTS estimator is the best model since it has the lowest values of mean square error (MSE) and mean absolute error (MAE). In conclusion, plasma glucose concentration in an oral glucose tolerance test is positively affected by body mass index, diastolic blood pressure, triceps skin fold thickness, diabetes pedigree function, age and yes/no for diabetes according to WHO criteria, while negatively affected by the number of pregnancies. This finding can be used as a guideline for medical doctors for the early prevention of stage 2 diabetes.

Keywords  Multiple Linear Regression, Least Quartile Difference (LQD), Median Absolute Deviation (MAD), Least-Trimmed Squares (LTS), Mean Square Error

1. Introduction

Diabetes is defined as a disease in which the body's capability to produce or respond to the hormone insulin is reduced, resulting in unusual metabolism of carbohydrates and raised levels of glucose in the blood. Nowadays, diabetes is a common disease and is becoming more common. Age-adjusted prevalence is set to increase from 5.9% to 7.1% (246-380 million) worldwide in the 20-79 year age group [1]. 20% of cases of diabetes among adults were in the urban population and 10% were in the rural population [2]. There are several factors that contribute to diabetes, such as age, body mass index (BMI) and central adiposity, measured either as waist circumference (WC) [2]. Nowadays, people no longer practice physical activities but eat additional food with a high consumption of sugar, causing a high tendency for people in India to develop insulin resistance [3].

The multiple linear regression (MLR) model can be described as a statistical approach to describe the association between two or more quantitative variables so that the dependent variable can be predicted from the others. MLR is widely used in business, the social and behavioural sciences and many other areas [4]. MLR needs the assumption of normally distributed variables, and measurement errors necessarily cause underestimation of simple regression coefficients [5]. Robust regression is used to help in the detection and deletion of outliers, and it is an approach that provides estimation, inference and testing that are not influenced by outlying observations but describe correctly the structure of the data [6]. The goal of using robust regression is to produce linear models that are not biased by a few outliers [7]. There are other quite considerable studies carried out in statistical modelling [8, 9]. The objectives of this study are to identify the significant variables that affect diabetes by using the multiple linear regression model, to apply the robust regression methods on the diabetes data, and to measure the performance of the robust methods by comparing the MLR model alone and the MLR model with robust methods.

2. Materials and Methods

The data were collected from the US National Institute of Diabetes and Digestive and Kidney Disease webpage. It involved 332 women who were at least 21 years old, of Pima India heritage, and living near Phoenix, Arizona. There are 8 variables involved in this study, which are 1 dependent variable and 7 independent variables. The dependent


variable is denoted by y, that is, plasma glucose concentration in an oral glucose tolerance test. This is a common test used in general hospitals in Malaysia and other countries. The test checks a standard dose of glucose ingested by mouth, and blood levels are checked two hours later, where the normal reading should be below 6.1 mmol/L. The other 7 independent variables are denoted by x1 (body mass index), x2 (number of pregnancies), x3 (diastolic blood pressure), x4 (triceps skin fold thickness), x5 (diabetes pedigree function), x6 (age) and x7 (yes or no for diabetes according to WHO criteria).

2.1. Multiple Linear Regression

Multiple linear regression (MLR) is one of the most commonly used of all statistical methods. MLR is known as a predictive analysis that is used to explain the relationship between one dependent variable and two or more independent variables. Leona (2012) described the multiple regression model as a linear regression model with two or more predictors and one response [10]. The model equation expresses the value of the dependent variable as a linear model of two or more independent variables and an error term, as in (1) [4]:

y_i = b_0 + b_1 x_{i,1} + b_2 x_{i,2} + ... + b_k x_{i,k} + e_i    (1)

where
y_i = dependent variable or real data,
b_0 = constant of MLR,
b_k = coefficient of the k-th independent variable,
x_{i,k} = independent variables,
e_i = y_i − ŷ_i = residual,
ŷ_i = estimated data from the MLR model.

Before MLR can be done, two assumptions need to be fulfilled. Firstly, a normality test for y vs all x_{i,k} should be done by using a P-P plot or by numerical calculation. In a P-P plot, data lying on a straight line indicate a normal distribution. Next, a multicollinearity test among the x_{i,k} variables should be done to identify the existence of multicollinearity in the model, which can affect the accuracy of the least squares estimates of the model. A VIF value less than 10 indicates no multicollinearity among the x_{i,k} variables [4].

2.2. Robust Method

2.2.1. Least Quartile Difference (LQD) Method

LQD is a regression estimator which is highly robust, since LQD can resist up to 50% largely deviant data values without becoming too biased. Since LQD has almost a 50% breakdown point, LQD is expected to deal with unusual observations and should give good performance when the data are not contaminated. The LQD formula minimizes the first quartile of the |residual_i − residual_k| as in (2) [11]. The LQD estimator of β is defined by

β̂_LQD = argmin Q_n(e_1(β), ..., e_n(β))    (2)

where
e = residual,
Q_n(e_1, ..., e_n) = the first quartile of {|e_i − e_k|; 1 ≤ i < k ≤ n}.

In this method, the residuals need to be sorted first. Then, the upper 25% and the lower 25% of the data are discarded, and the remaining 50% of the data are analysed using the MLR model. This model is then applied to all the data in order to find new MSE and MAE values.

2.2.2. Median Absolute Deviation (MAD) Method

Median absolute deviation (MAD) is one of the most familiar robust measures. MAD is defined as the median of the absolute deviations from the overall median of the data set. MAD is also known as a robust measure of the variability of a univariate sample of quantitative data, and it also refers to the population parameter that is estimated by the MAD calculated from a sample, as in (3) [12]:

MAD = median(|e_i − median(e)|)    (3)

where
e = residual,
i = 1, 2, ..., n.

MAD is an estimator of the spread in the data and has an approximately 50% breakdown point, like the median.

2.2.3. Least-Trimmed Squares (LTS) Method

Least-trimmed squares (LTS) is a robust estimator with a 50% breakdown point. This estimator is unaffected by outliers, provided the outliers form no more than 50% of the data, and it can be represented as in (4) and (5) [13]. Ordering the squared residuals from smallest to largest,

(e^2)_(1), (e^2)_(2), ..., (e^2)_(n)    (4)

the LTS estimator chooses the regression coefficients b to minimize the sum of the smallest m of the squared residuals,

LTS(b) = Σ_{i=1}^{m} (e^2)_(i)    (5)

where
m = [n/2] + [(k+2)/2], a little more than half of the observations,
(e^2)_(i) = the i-th smallest squared residual.

2.3. Cross Validation Technique

Cross validation is a technique for evaluating how well the results of a predictive model will predict the real data set. It is used when someone wants to evaluate how precise a predictive model is when it is applied to real data [14]. In this study, only two cross validation measures are used, as in (6) and (7).

2.3.1. Mean Square Error (MSE)

MSE is represented as in (6),

MSE = Σ_{i=1}^{N} (y_i − ŷ_i)^2 / N    (6)

where y_i is the real data, ŷ_i is the predicted data and N is the number of observations.

2.3.2. Mean Absolute Error (MAE)

MAE is represented as in (7),

MAE = Σ_{i=1}^{N} |y_i − ŷ_i| / N    (7)

where y_i is the real data, ŷ_i is the predicted data and N is the number of observations.

3. Result and Discussion

3.1. Multiple Linear Regression

Referring to Figure 1, the P-P plot shows that the data lie in nearly straight lines, which indicates that the distribution of y vs all x_{i,k} is normal. Since the VIF value for all x_{i,k} is less than 10, multicollinearity among the x_{i,k} variables does not exist. So, the two early assumptions of MLR have been satisfied. The MLR model equation, in which all variables are included without exception, is given as in (8):

ŷ = 72.736 + 0.261x1 − 1.298x2 + 0.138x3 + 0.149x4 + 7.849x5 + 0.465x6 + 28.66x7    (8)

Figure 1. P-P Plot for y vs all x_i

The MSE value of the MLR model is 650.042, whereas the MAE of the MLR model is 20.5234. Using the studentized residual test, 12 points are identified as outliers. This is the reason why the usage of robust methods is needed in this study.

3.2. Least Quartile Difference (LQD) Method

Firstly, the residual values are sorted from smallest to highest; then the upper 25% and lower 25% quartiles of the data are removed. The remaining data are used to build the new model. The new model is stated as in (9):

ŷ = 56.990 + 0.311x1 − 1.041x2 + 0.318x3 + 0.160x4 + 9.711x5 + 0.407x6 + 28.604x7    (9)

The new model equation in (9) is applied to the original data, and then the MSE and MAE are calculated. After using the LQD method, the MSE value is 641.9068 and the MAE value is 20.4288.

3.3. Median Absolute Deviation (MAD) Method

The analysis of the MADe method is as shown below:

1. MADe method: Median ± 1 MADe = (−20.368775, 16.243715)
2. MADe method: Median ± 2 MADe = (−38.67502, 34.54996)
3. MADe method: Median ± 3 MADe = (−56.981265, 52.856205)

The MAD value is 18.30625. Based on the Median ± 3 MADe interval obtained, 326 data points were included in building the new model and 6 data points were removed. The new model equation is shown in (10),

ŷ = 68.604 + 0.2x1 − 1.472x2 + 0.176x3 + 0.129x4 + 5.512x5 + 0.537x6 + 28.147x7    (10)

The new model equation in (10) is applied to the original data and the values of MSE and MAE are calculated. The values of MSE and MAE obtained are 652.9255 and 20.53357 respectively.

3.4. Least-Trimmed Squares (LTS) Method

In the LTS estimator method, the squared residuals are sorted in ascending order in Microsoft Excel. Then 161 of the data were removed and the remaining 171 data are used to build the new model by using equation (5). The new model equation in (11) is as follows,

ŷ = 60.6 + 0.261x1 − 0.966x2 + 0.290x3 + 0.172x4 + 10.9x5 + 0.378x6 + 28.8x7    (11)

The new model in (11) is then applied to the original data, and the MSE and MAE values calculated are 640.2429 and 20.4255 respectively.
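The LTS procedure of Sections 2.2.3 and 3.4 (fit, order the squared residuals, keep the m smallest, refit, then score with the MSE and MAE of (6) and (7)) can be sketched as follows. The fragment uses synthetic contaminated data rather than the Pima diabetes records, and the trimming constant follows eq. (5):

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients (X includes an intercept column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lts_fit(X, y):
    """Least-trimmed squares sketch: fit OLS, order the squared
    residuals, keep the m smallest, and refit on those points."""
    n, k = X.shape[0], X.shape[1] - 1   # k predictors plus intercept column
    m = n // 2 + (k + 2) // 2           # trimming constant from eq. (5)
    b = ols(X, y)
    sq_res = (y - X @ b) ** 2
    keep = np.argsort(sq_res)[:m]       # indices of the m smallest squares
    return ols(X[keep], y[keep])

def mse_mae(y, y_hat):
    """Cross validation measures of eqs. (6) and (7)."""
    e = y - y_hat
    return np.mean(e ** 2), np.mean(np.abs(e))

# Synthetic data with gross outliers (not the diabetes data): the true
# relationship is y = 2 + 3x, with 5 of 100 observations shifted by +50.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = 2.0 + 3.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
y[:5] += 50.0
b_lts = lts_fit(X, y)
mse, mae = mse_mae(y, X @ b_lts)
```

On this toy data the trimmed refit recovers the clean coefficients because the contaminated points have the largest squared residuals and fall outside the m kept observations; production LTS implementations (e.g. subsample-based algorithms) are more elaborate, so this is an illustration of the principle only.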


3.5. Comparison of All Methods

Table 1 shows the comparison among the 4 models with different methods, using the MSE and MAE values. It indicates that the MLR model with the LTS method applied tends to be the best model, with the smallest values of MSE and MAE. This means y is positively affected by x1, x3, x4, x5, x6 and x7. Besides that, y is negatively affected by x2.

Table 1. MSE and MAE comparison of all methods

Model and method                      MSE value  MAE value
MLR model                             650.042    20.523
MLR model with applied LQD method     641.907    20.429
MLR model with applied MAD method     652.926    20.534
MLR model with applied LTS method     640.243    20.426

4. Conclusions

In order to measure the effectiveness of the model, a comparison of methods is done. The values of MSE and MAE on the original data for MLR, MLR with applied LQD, MLR with applied MAD and MLR with applied LTS estimator were compared. Based on the values of MSE and MAE, it is concluded that the MLR model with the applied LTS estimator is the best model, since it has the lowest values of MSE and MAE. In conclusion, plasma glucose concentration in an oral glucose tolerance test is positively affected by increasing body mass index, diastolic blood pressure, triceps skin fold thickness, diabetes pedigree function, age and yes or no for diabetes according to WHO criteria. In fact, diabetes pedigree function and yes/no for diabetes according to WHO criteria have the highest impact on plasma glucose concentration in an oral glucose tolerance test. Besides that, plasma glucose concentration in an oral glucose tolerance test is negatively affected by the number of pregnancies. This result can be used as a guideline for medical doctors for the early prevention of stage 2 diabetes.

Acknowledgements

This research is supported by the Universiti Tun Hussein Onn Malaysia under the TIER 1 grant scheme vot number H232.

REFERENCES

[1] R. Bilous & R. Donnelly. Handbook of Diabetes: Fourth Edition, John Wiley & Sons Ltd., 2010.

[2] A. Ramachandran & C. Snehalatha. Current scenario of diabetes in India, Journal of Diabetes, Vol. 1, No. 1, 18–28, 2009.

[3] S. Gulati & A. Misra. Sugar intake, obesity, and diabetes in India, Nutrients, Vol. 6, No. 12, 5955–5974, 2014.

[4] M. H. Kutner, C. J. Nachtsheim, J. Neter & W. Li. Applied Linear Statistical Models (Fifth Edition), McGraw-Hill, 2005.

[5] M. Williams, C. A. G. Grajales & D. Kurkiewicz. Assumptions of multiple regression: Correcting two misconceptions, Practical Assessment, Research & Evaluation, Vol. 18, No. 11, 1–14, 2013.

[6] D. S. Courvoisier & O. Renaud. Robust analysis of the central tendency, simple and multiple regression and ANOVA, Journal of American Statistician, Vol. 3, No. 1, 78–87, 2011.

[7] S. Morasca. Building Statistically Significant Robust Regression Models in Empirical Software Engineering, PROMISE '09: Proceedings of the 5th International Conference on Predictor Models in Software Engineering, Vol. 17, No. 1-2, 23-33, 2009.

[8] N. Che-Him, M. G. Kamardan, M. S. Rusiman, S. Sufahani, M. Mohamad & N. K. Kamaruddin. Spatio-temporal modelling of dengue fever incidence in Malaysia, Journal of Physics: Conference Series, Vol. 995, No. 1, 012003, 2018.

[9] N. Che-Him, R. Roslan, M. S. Rusiman, K. Khalid, M. G. Kamardan, F. A. Arobi & N. Mohamad. Factor Affecting Road Traffic Accident in Batu Pahat, Johor, Malaysia, Journal of Physics: Conference Series, Vol. 995, No. 1, 012033, 2018.

[10] S. A. Leona, G. W. Stephen, C. P. Steven, N. B. Amanda & C. W. Ingrid. Multiple Linear Regression, Research Methods in Psychology, Wiley Online Library, 2012.

[11] C. Croux, P. J. Rousseeuw & O. Hossjer. Generalized S-Estimators, Journal of the American Statistical Association, Vol. 89, No. 428, 1271–1281, 1994.

[12] M. Satyaki & S. Robert. Bahadur Representations for the Median Absolute Deviation and Its Modifications, Journal of the American Statistical Association, Vol. 88, No. 424, 1273-1283, 2009.

[13] M. M. David, S. N. Nathan, D. P. Christine, S. Ruth & Y. W. Angela. On the Least Trimmed Squares Estimator, Algorithmica, Vol. 69, No. 1, 148-183, 2014.

[14] A. Khamis. Application of Statistical and Neural Network Model for Oil Palm Yield Study, Ph.D. Thesis, Universiti Teknologi Malaysia, Malaysia, 2005.

Mathematics and Statistics 8(2A): 40-46, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081307

An Alternative Approach for Finding Newton's Direction in Solving Large-Scale Unconstrained Optimization for Problems with an Arrowhead Hessian Matrix

Khadizah Ghazali1,*, Jumat Sulaiman1, Yosza Dasril2, Darmesah Gabda1

1Faculty of Science and Natural Resources, Universiti Malaysia Sabah, Malaysia 2Faculty of Electronic and Computer Engineering, Universiti Teknikal Malaysia Melaka, Malaysia

Received August 10, 2019; Revised February 1, 2020; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  In this paper, we proposed an alternative way to find the Newton direction in solving large-scale unconstrained optimization problems where the Hessian of the Newton direction is an arrowhead matrix. The alternative approach is a two-point Explicit Group Gauss-Seidel (2EGGS) block iterative method. To check the validity of our proposed Newton's direction, we combined the Newton method with the 2EGGS iteration for solving unconstrained optimization problems and compared it with a combination of the Newton method with Gauss-Seidel (GS) point iteration and the Newton method with Jacobi point iteration. The numerical experiments are carried out using three different artificial test problems with their Hessians in the form of an arrowhead matrix. In conclusion, the numerical results showed that our proposed method is superior to the reference methods in terms of the number of inner iterations and the execution time.

Keywords  Newton Method, Explicit Group Iteration, Unconstrained Optimization Problems, Large-Scale Optimization, Arrowhead Matrix

1. Introduction

We consider the large-scale unconstrained optimization problem:

    \min_{x \in \mathbb{R}^n} f(x)    (1)

where f : \mathbb{R}^n \to \mathbb{R} is twice continuously differentiable and n \ge 1000. Theoretically, the classical Newton's method is one of the most powerful gradient descent techniques, possessing a fast quadratic rate of convergence for solving problem (1). In practice, however, Newton's method may not be very favorable when it involves large-scale problems, because it uses direct methods to find Newton's direction, which requires calculating not only first derivatives but also second derivatives [1].

Therefore, for large-scale problems, many researchers avoid this method by using other gradient descent methods, as in [2-5], or by modifying the classical Newton's method [6-8]. Esmaeili and Kimiaei [2] developed an improved adaptive trust-region method for solving the unconstrained optimization problem and tested it on 93 standard unconstrained optimization test problems with dimensions up to 20000. Also, a new generalized nonmonotone line search for solving large-scale unconstrained optimization was proposed by Shuai et al. [3] to integrate the advantages of existing line searches. Other than that, Esmaeili et al. [4] presented a new spectral conjugate gradient method based on the Dai-Yuan strategy for solving large-scale unconstrained optimization, while Moyi and Leong [5] proposed a three-term conjugate gradient method via the symmetric rank-one update.

In addition to the methods described in [2-5], there are also researchers who retain the use of the Newton method but without finding the second derivatives of the objective function. Dembo and Steihaug [6] proposed a combination of the Newton method with the conjugate gradient method for solving problem (1), named truncated-Newton algorithms, while Grapsa [7] modified the Newton direction through a proper gradient vector modification and then introduced the Componentwise Approximation Gradient method. Recently, Boggs and


Byrd [8] studied the L-BFGS method, then suggested two computationally efficient ways to measure the effectiveness of various memory sizes, and showed that their approach improves the performance of the L-BFGS method for solving problem (1).

Just as the researchers discussed in [6-8], we have also modified this Newton direction, but we use a different approach compared to them. We noticed that the Newton direction is obtained by solving a large linear system. Thus we used a family of Explicit Group (EG) block iterative methods to find the Newton direction. The EG block iterative method was introduced by Evans [9] by grouping the linear system into several points for generating a new system of linear equations; it is famous as one of the efficient block iterative methods and was later further established by Yousif and Evans [10,11], Othman and Abdullah [14] and Sulaiman et al. [12,13].

Thus, in this paper, inspired by the advantages of the Newton method, we propose the 2-point Explicit Group Gauss-Seidel (2EGGS) block iterative method (as inner iteration) to find the Newton direction and then solve problem (1) using Newton's method (as outer iteration). We name this method the Newton-2EGGS method. To evaluate the performance of the Newton-2EGGS method, we consider a combination of the Newton method with Gauss-Seidel iteration and a combination of the Newton method with Jacobi iteration as reference methods; they are called the Newton-GS method and the Newton-Jacobi method, respectively.

This paper is organized as follows. In the next section, we describe the Newton scheme with its Hessian type, while in Section 3 we formulate the 2EGGS iteration for computing Newton's direction. Numerical experiments and results are analyzed in Section 4. Finally, the conclusion is stated in Section 5.

2. Approaches to Newton Scheme with an Arrowhead Hessian Matrix

In this section, we obtain the Newton iteration based on a quadratic model of f(x), using the first three terms of the Taylor series expansion about x^{(k)}:

    f(x) \approx f(x^{(k)}) + \nabla f(x^{(k)})^T \Delta x + \frac{1}{2} \Delta x^T H(x^{(k)}) \Delta x    (2)

where \Delta x = x - x^{(k)}, \nabla f(x^{(k)}) is the gradient of f and H(x^{(k)}) = \nabla^2 f(x^{(k)}) is the Hessian of f. Next, we set the partial derivatives of equation (2) to zero (following the optimality conditions [1]) to obtain the minimum value of f:

    \nabla f(x^{(k)}) = 0.    (3)

From equations (2) and (3),

    \nabla f(x^{(k)}) + H(x^{(k)}) \Delta x = 0,    (4)

so we obtain the new iteration scheme as

    x^{(k+1)} = x^{(k)} - H(x^{(k)})^{-1} \nabla f(x^{(k)})    (5)

where H(x^{(k)})^{-1} is the inverse of the Hessian. The iteration scheme defined by equation (5) is known as Newton's iteration. Note that from equation (5), we properly define the Newton direction as

    d^{(k)} = -H(x^{(k)})^{-1} \nabla f(x^{(k)}).    (6)

It seems that each iteration of this classical Newton's method requires computation of the inverse of the Hessian. In order to prevent this procedure from occurring (since our problem is a large-scale problem), we rewrite equation (6) as the Newton equation

    H(x^{(k)}) d^{(k)} = -\nabla f(x^{(k)}).    (7)

We now focus our attention on seeking the Newton direction through equation (7) for the large-scale problem with an n by n real arrowhead matrix. Therefore, we only considered problems whose Hessian is an arrowhead matrix of order n, with the general form given by [15]:

    H(x^{(k)}) =
    \begin{bmatrix}
    \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \frac{\partial^2 f}{\partial x_1 \partial x_3} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
    \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & 0 & \cdots & 0 \\
    \frac{\partial^2 f}{\partial x_3 \partial x_1} & 0 & \frac{\partial^2 f}{\partial x_3^2} & \cdots & 0 \\
    \vdots & \vdots & & \ddots & \vdots \\
    \frac{\partial^2 f}{\partial x_n \partial x_1} & 0 & \cdots & 0 & \frac{\partial^2 f}{\partial x_n^2}
    \end{bmatrix}    (8)

This Hessian is a square matrix of second-order partial derivatives of order n by n.

3. The Proposed Approach for Finding the Newton Direction

As noted in the previous section, we start searching for the Newton direction from equation (7), and this equation is nothing but a linear system. For simplicity, we rewrite equation (7) in the general form

    A d = f    (9)

where

    A =
    \begin{bmatrix}
    b_1 & c_1 & c_2 & \cdots & c_{n-1} \\
    a_2 & b_2 & 0 & \cdots & 0 \\
    a_3 & 0 & b_3 & & \vdots \\
    \vdots & \vdots & & \ddots & 0 \\
    a_n & 0 & \cdots & 0 & b_n
    \end{bmatrix}, \quad
    d = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \\ d_n \end{bmatrix}, \quad
    f = \begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_n \end{bmatrix}.

To formulate the Jacobi iteration, we decomposed matrix


A in equation (9) as a summation of three matrices as follows [16,17]:

    A = D - L - U    (10)

where D is the nonzero diagonal matrix of A, and L and U are the strictly lower and upper triangular matrices of A, respectively. By applying the decomposition (10) to the linear system (9), the iterative formulation of the Jacobi method is stated in vector form as [16]

    d^{(k+1)} = D^{-1}(L + U) d^{(k)} + D^{-1} f.    (11)

In the same way, the Gauss-Seidel iteration can also be stated in vector form as [16,18]

    d^{(k+1)} = (D - L)^{-1} U d^{(k)} + (D - L)^{-1} f.    (12)

The following subsection discusses the formation of the 2-point explicit group Gauss-Seidel iterative method.

3.1. Formation of 2EGGS Iteration

The 2EGGS iteration involves complete groups of two points for generating 2 by 2 systems of linear equations. Thus, the formulation of the proposed iterative method can be expressed as groups of two points from the linear system (9) as [9,12,13]

    \begin{bmatrix} b_i & c_i \\ a_{i+1} & b_{i+1} \end{bmatrix}
    \begin{bmatrix} d_i \\ d_{i+1} \end{bmatrix}
    = \begin{bmatrix} s_1 \\ s_2 \end{bmatrix}, \quad i = 1;
    \qquad
    \begin{bmatrix} b_i & 0 \\ 0 & b_{i+1} \end{bmatrix}
    \begin{bmatrix} d_i \\ d_{i+1} \end{bmatrix}
    = \begin{bmatrix} s_3 \\ s_4 \end{bmatrix}, \quad i = 3, 5, \ldots, n-1    (13)

where

    s_1 = f_1 - \sum_{j=2}^{n-1} c_j d_{j+1}^{(k)}, \quad s_2 = f_2, \quad
    s_3 = f_i - a_i d_1^{(k+1)}, \quad s_4 = f_{i+1} - a_{i+1} d_1^{(k+1)}.

Hence, solving and simplifying equation (13) gives the explicit form of the 2EGGS iteration as

    i = 1: \quad d_i^{(k+1)} = \frac{1}{\Delta}\left(b_{i+1} s_1 - c_i s_2\right), \qquad
    d_{i+1}^{(k+1)} = \frac{1}{\Delta}\left(-a_{i+1} s_1 + b_i s_2\right);

    i = 3, 5, \ldots, n-1: \quad d_i^{(k+1)} = \frac{s_3}{b_i}, \qquad
    d_{i+1}^{(k+1)} = \frac{s_4}{b_{i+1}}    (14)

where \Delta = b_i b_{i+1} - a_{i+1} c_i. This equation (14) is the alternative approach for finding the Newton direction. Thus, we propose the following algorithm for solving problem (1).

Algorithm: Newton-2EGGS Scheme with an Arrowhead Hessian Matrix
i.   Initialize: set up the objective function f(x), x^{(0)} \in \mathbb{R}^n, \varepsilon_1 = 10^{-6}, \varepsilon_2 = 10^{-8} and n.
ii.  For j = 1, 2, \ldots, n, implement:
     a. Set d^{(0)} \leftarrow 0.
     b. Calculate \nabla f(x^{(k)}).
     c. Calculate the approximate value of d_i^{(k+1)} using equation (14).
     d. Check the convergence test \lVert d^{(k+1)} - d^{(k)} \rVert \le \varepsilon_2. If yes, go to step (e); otherwise, go back to step (b).
     e. For i = 1, 2, \ldots, n, calculate the estimated value x^{(k+1)} \leftarrow x^{(k)} + d^{(k)}.
     f. Check the convergence test \lVert \nabla f(x^{(k)}) \rVert \le \varepsilon_1. If yes, go to (iii); otherwise, go back to step (a).
iii. Display the approximate solutions.

4. Numerical Experiments and Computational Results

Table 1. Description of the test problems: name, algebraic expression, local optimal value f*, optimal point x* and initial points (a - standard; b - nonstandard 1; c - nonstandard 2)

Test 1: LIARWHD
    f(x) = \sum_{i=1}^{n} 4(x_i^2 - x_1)^2 + \sum_{i=1}^{n} (x_i - 1)^2
    f^* = 0, x^* = (1.0, 1.0, \ldots, 1.0)
    (a) (4.0, 4.0, \ldots, 4.0)
    (b) (1.5, 1.5, \ldots, 1.5)
    (c) (3.3, 3.5, \ldots, 3.3, 3.5)

Test 2: NONDIA
    f(x) = (x_1 - 1)^2 + \sum_{i=2}^{n} 100(x_1 - x_{i-1}^2)^2
    f^* = 0, x^* = (1.0, 1.0, \ldots, 1.0)
    (a) (-1.0, -1.0, \ldots, -1.0)
    (b) (2.0, 2.0, \ldots, 2.0)
    (c) (2.0, 1.5, \ldots, 2.0, 1.5)

Test 3: DIAG-AUP1
    f(x) = \sum_{i=1}^{n} 4(x_i^2 - x_1)^2 + \sum_{i=1}^{n} (x_i^2 - 1)^2
    (a)
    (b)
    (c)
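As an illustration of the 2EGGS computation in equation (14), the following is a minimal Python sketch (the paper's actual implementation is in C; the function name, convergence test and the small test system below are ours, not the authors'):

```python
# One possible realisation of the 2EGGS inner iteration (13)-(14) for an
# arrowhead linear system A d = f.  Row 1 of A is (b1, c1, ..., c_{n-1});
# row i (i >= 2) reads a_i d_1 + b_i d_i = f_i.  Lists are 0-based, so
# b[0] stands for b_1, c[0] for c_1 and a[1] for a_2 (a[0] is unused).
def newton_direction_2eggs(b, c, a, f, tol=1e-10, max_sweeps=10_000):
    n = len(b)                              # n is assumed even here
    d = [0.0] * n
    for _ in range(max_sweeps):
        d_old = d[:]
        # group i = 1: coupled 2x2 system for (d_1, d_2), equation (14)
        s1 = f[0] - sum(c[j - 1] * d[j] for j in range(2, n))
        s2 = f[1]
        delta = b[0] * b[1] - a[1] * c[0]
        d[0] = (b[1] * s1 - c[0] * s2) / delta
        d[1] = (-a[1] * s1 + b[0] * s2) / delta
        # groups i = 3, 5, ..., n-1: rows decouple once the new d_1 is known
        for i in range(2, n, 2):
            d[i] = (f[i] - a[i] * d[0]) / b[i]
            d[i + 1] = (f[i + 1] - a[i + 1] * d[0]) / b[i + 1]
        if max(abs(x - y) for x, y in zip(d, d_old)) < tol:
            break
    return d

# Small diagonally dominant test system (ours, not one of the paper's):
d = newton_direction_2eggs(b=[4.0, 3.0, 5.0, 6.0],
                           c=[1.0, 0.5, 0.25],
                           a=[0.0, 1.0, 0.5, 0.75],
                           f=[1.0, 2.0, 3.0, 4.0])
```

In the Newton setting, f would be the negative gradient -\nabla f(x^{(k)}) and (b, c, a) the entries of the arrowhead Hessian (8), so that d approximates the Newton direction of equation (7).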


In this section, we applied three different artificial unconstrained optimization test problems, selected from the collection compiled by Andrei in [19,20], to see the behavior and evaluate the performance of our proposed algorithm. The selection of these test problems is based on their Hessian type, which is an arrowhead Hessian matrix, and the description of all these test problems is stated in Table 1. For each test problem, we used three initial points; one of them (denoted as (a)) is the standard point suggested by Andrei in [19,20], while the other two nonstandard initial points (denoted as (b) and (c)) are chosen randomly from a range surrounding the standard initial point. Moreover, all these test problems were run five times using five different orders of the Hessian matrix, n = 1000, 5000, 10000, 20000, 30000. Thus, the total number of test problems is 75.

The proposed algorithm was coded in the C language with double precision arithmetic. From the algorithm, we can see that it involves two loops with two different convergence tolerances. The first loop finds the Newton direction using our proposed iterative method as inner iteration, with \varepsilon_2 = 10^{-8} as the convergence tolerance, and the second loop estimates the solution of problem (1) using the Newton iteration (5) as outer iteration, with \varepsilon_1 = 10^{-6} as the convergence tolerance.

In order to see the behavior of our proposed algorithm, five parameters are observed: the number of inner iterations, the number of outer iterations, the execution time (measured in seconds), the function value at the iterate where execution terminated, and the maximum absolute error. Computational results corresponding to the algorithm are compared with the reference methods and tabulated in Table 2. We use the symbols NJ, NGS, and N2EGGS to represent the Newton-Jacobi method, the Newton-GS method, and the Newton-2EGGS method, respectively.

All values shown in Table 2 are rounded to two decimal places. All maximum absolute error values are smaller than the convergence tolerance. We note that our proposed method achieved about the same final objective value as both reference methods (see the function value columns of Table 2), and all of these values are close to the optimal value.

It is noticeable that the number of inner iterations and the execution time (in seconds) for our proposed method show a reduction (or are at least the same) compared to the reference methods. Therefore, to better understand the efficiency of these reductions, as well as to evaluate the performance of our proposed algorithm, we present Table 3 and Table 4. Table 3 specifies the decrement percentage of the number of inner iterations for the Newton-2EGGS method and the Newton-GS method compared to the Newton-Jacobi method, while Table 4 states the comparison of speed ratios for the Newton-2EGGS method with both reference methods. Furthermore, in Table 4, we used the total execution time in seconds for every test problem.


Table 2. Computational results of the Newton-Jacobi method, the Newton-GS method, and the Newton-2EGGS method

Test problem, n | Inner iterations (NJ NGS N2EGGS) | Outer iterations (NJ NGS N2EGGS) | Execution time in s (NJ NGS N2EGGS) | Function value at termination (NJ NGS N2EGGS) | Maximum absolute error (NJ NGS N2EGGS)
1(a) 1000  | 3343 1567 1566 | 129 38 38 | 0.07 0.04 0.03 | 2.60E-15 2.40E-13 2.37E-13 | 9.58E-07 9.90E-07 9.84E-07
1(a) 5000  | 4771 2194 2194 | 203 52 52 | 0.44 0.38 0.33 | 4.51E-17 2.35E-13 2.35E-13 | 9.67E-07 9.72E-07 9.71E-07
1(a) 10000 | 5479 2509 2509 | 227 58 58 | 0.99 0.63 0.62 | 2.23E-17 2.23E-13 2.23E-13 | 9.64E-07 9.46E-07 9.46E-07
1(a) 20000 | 6149 2821 2821 | 240 63 63 | 2.23 1.46 1.30 | 1.00E-16 2.40E-13 2.40E-13 | 9.98E-07 9.80E-07 9.80E-07
1(a) 30000 | 6456 2899 2899 | 264 65 65 | 3.58 2.07 1.96 | 2.35E-18 2.45E-13 2.45E-13 | 9.82E-07 9.91E-07 9.91E-07
1(b) 1000  | 2289 1013 1012 | 130 32 32 | 0.06 0.03 0.02 | 2.10E-15 2.31E-13 2.27E-13 | 9.59E-07 9.70E-07 9.64E-07
1(b) 5000  | 2274 1039 1039 | 178 46 46 | 0.25 0.21 0.17 | 8.00E-16 2.21E-13 2.21E-13 | 9.75E-07 9.41E-07 9.41E-07
1(b) 10000 | 2281 1022 1022 | 219 36 36 | 0.53 0.35 0.30 | 3.66E-17 2.48E-13 2.47E-13 | 9.48E-07 9.97E-07 9.96E-07
1(b) 20000 | 2287 1038 1038 | 240 52 52 | 1.11 0.65 0.63 | 5.94E-18 2.48E-13 2.48E-13 | 9.48E-07 9.97E-07 9.96E-07
1(b) 30000 | 2303 1042 1042 | 260 58 58 | 1.68 0.97 0.88 | 2.53E-18 2.31E-13 2.31E-13 | 9.85E-07 9.61E-07 9.61E-07
1(c) 1000  | 3378 1556 1556 | 148 37 37 | 0.07 0.04 0.03 | 2.00E-16 2.42E-13 2.42E-13 | 9.62E-07 9.93E-07 9.93E-07
1(c) 5000  | 4670 2131 2131 | 202 50 50 | 0.43 0.27 0.26 | 4.14E-17 2.47E-13 2.47E-13 | 9.98E-07 9.95E-07 9.95E-07
1(c) 10000 | 5242 2384 2384 | 227 57 57 | 1.07 0.56 0.55 | 1.27E-17 2.25E-13 2.25E-13 | 9.73E-07 9.50E-07 9.50E-07
1(c) 20000 | 5487 2553 2553 | 214 61 61 | 2.02 1.19 1.18 | 6.00E-16 2.49E-13 2.49E-13 | 9.78E-07 9.99E-07 9.99E-07
1(c) 30000 | 5513 2593 2593 | 254 33 33 | 3.12 1.64 1.63 | 3.61E-17 2.39E-13 2.39E-13 | 9.90E-07 9.78E-07 9.78E-07
2(a) 1000  | 56822 22986 20391 | 14805 7528 4790 | 2.09 1.34 1.11 | 1.00E-16 2.47E-15 6.18E-13 | 1.00E-06 1.00E-06 1.00E-06
2(a) 5000  | 196378 52084 41128 | 91368 28374 16286 | 53.87 18.95 13.79 | 8.24E-19 2.47E-15 3.09E-12 | 1.00E-06 1.00E-06 1.00E-06
2(a) 10000 | 359514 128743 92343 | 198451 96881 58220 | 227.37 123.63 105.12 | 9.28E-19 2.47E-15 6.19E-12 | 1.00E-06 1.00E-06 1.00E-06
2(a) 20000 | 593232 243256 163452 | 416445 209648 125329 | 880.42 518.18 318.13 | 1.00E-16 2.47E-15 1.24E-11 | 1.00E-06 1.00E-06 1.00E-06
2(a) 30000 | 708901 351492 225732 | 645745 326806 194282 | 2011.75 1182.51 730.05 | 1.23E-14 2.47E-15 1.86E-11 | 1.00E-06 9.99E-07 1.00E-06
2(b) 1000  | 101544 38789 35915 | 14976 7536 4799 | 2.71 1.55 1.03 | 1.07E-17 2.50E-15 6.17E-13 | 9.99E-07 9.99E-07 9.99E-07
2(b) 5000  | 461271 168711 150420 | 89585 45606 27952 | 72.64 37.86 23.21 | 4.70E-17 2.50E-15 3.09E-12 | 1.00E-06 1.00E-06 1.00E-06
2(b) 10000 | 763029 302950 262245 | 172737 98059 59324 | 256.17 152.13 90.57 | 4.00E-16 2.50E-15 6.19E-12 | 9.99E-07 1.00E-06 1.00E-06
2(b) 20000 | 1369207 557276 468968 | 357916 209806 125492 | 1004.41 621.94 354.63 | 2.19E-17 2.50E-15 1.24E-11 | 1.00E-06 1.00E-06 1.00E-06
2(b) 30000 | 1899369 799141 660672 | 517316 326742 194237 | 2181.71 1194.97 802.18 | 2.61E-17 2.50E-15 1.86E-11 | 9.99E-07 9.99E-07 1.00E-06
2(c) 1000  | 101288 38695 35821 | 14985 7535 4798 | 3.68 1.18 0.96 | 1.24E-17 2.50E-15 6.17E-13 | 1.00E-06 1.00E-06 9.99E-07
2(c) 5000  | 495794 173806 155654 | 89583 45608 27953 | 89.09 33.15 24.14 | 1.17E-18 2.50E-15 3.09E-12 | 1.00E-06 1.00E-06 1.00E-06
2(c) 10000 | 792833 302014 261383 | 192949 98058 59325 | 333.82 129.24 90.22 | 1.00E-16 2.50E-15 6.19E-12 | 9.99E-07 1.00E-06 1.00E-06
2(c) 20000 | 1328778 554369 465902 | 351904 209808 125492 | 1186.07 522.01 355.52 | 1.40E-15 2.50E-15 1.24E-11 | 1.00E-06 1.00E-06 1.00E-06
2(c) 30000 | 1938690 796433 659580 | 587757 326742 194237 | 2465.87 1193.60 805.08 | 1.00E-16 2.50E-15 1.86E-11 | 9.99E-07 9.99E-07 1.00E-06
3(a) 1000  | 827 412 408 | 44 17 17 | 0.06 0.03 0.01 | 1.30E-15 4.69E-14 4.55E-14 | 9.09E-07 8.70E-07 8.57E-07
3(a) 5000  | 927 452 450 | 56 20 20 | 0.22 0.05 0.05 | 2.00E-16 4.95E-14 5.23E-14 | 9.61E-07 8.90E-07 9.16E-07
3(a) 10000 | 950 463 462 | 64 21 22 | 0.30 0.11 0.10 | 1.00E-16 5.79E-14 4.71E-14 | 8.96E-07 9.63E-07 8.68E-07
3(a) 20000 | 971 470 470 | 71 23 23 | 0.56 0.20 0.19 | 1.79E-17 5.32E-14 4.84E-14 | 8.48E-07 9.23E-07 8.80E-07
3(a) 30000 | 977 473 473 | 74 24 24 | 0.87 0.27 0.27 | 1.58E-17 4.54E-14 4.26E-14 | 9.71E-07 8.53E-07 8.25E-07
3(b) 1000  | 595 279 279 | 42 13 14 | 0.02 0.01 0.01 | 4.00E-16 4.66E-14 4.30E-14 | 8.21E-07 8.67E-07 8.33E-07
3(b) 5000  | 615 284 284 | 56 17 17 | 0.08 0.07 0.03 | 1.00E-16 5.70E-14 5.61E-14 | 8.35E-07 9.56E-07 9.48E-07
3(b) 10000 | 623 286 286 | 62 19 19 | 0.15 0.07 0.06 | 3.91E-17 4.63E-14 4.59E-14 | 8.53E-07 8.61E-07 8.57E-07
3(b) 20000 | 629 287 287 | 68 20 20 | 0.33 0.14 0.13 | 2.05E-17 5.90E-14 5.88E-14 | 8.72E-07 9.72E-07 9.70E-07
3(b) 30000 | 633 289 289 | 72 21 21 | 0.47 0.19 0.19 | 1.26E-17 5.66E-14 5.64E-14 | 8.36E-07 9.52E-07 9.50E-07
3(c) 1000  | 820 404 401 | 45 16 16 | 0.02 0.01 0.01 | 3.00E-16 5.94E-14 5.00E-14 | 8.32E-07 9.79E-07 8.98E-07
3(c) 5000  | 892 434 432 | 58 20 20 | 0.10 0.05 0.04 | 1.00E-16 4.65E-14 5.66E-14 | 9.83E-07 8.64E-07 9.52E-07
3(c) 10000 | 913 441 439 | 66 22 21 | 0.21 0.09 0.09 | 4.77E-17 4.09E-14 5.60E-14 | 8.10E-07 8.09E-07 9.47E-07
3(c) 20000 | 922 444 443 | 71 23 23 | 0.40 0.18 0.17 | 2.11E-17 4.11E-14 5.99E-14 | 8.79E-07 8.11E-07 9.79E-07
3(c) 30000 | 928 446 445 | 75 24 24 | 0.64 0.28 0.27 | 1.89E-17 5.64E-14 5.38E-14 | 8.02E-07 9.50E-07 9.28E-07
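The decrement percentages reported in Table 3 are derived from the inner-iteration counts of Table 2. As a brief sketch of that computation (the sample counts used here are the NJ and NGS entries of the 1(a) rows of Table 2, for n = 1000 and n = 30000; the function name is ours):

```python
# Percentage reduction of the number of inner iterations relative to the
# Newton-Jacobi (NJ) count, as reported in Table 3.
def decrement_pct(nj, other):
    return 100.0 * (nj - other) / nj

print(f"{decrement_pct(3343, 1567):.2f}")  # 1(a), n = 1000  -> 53.13
print(f"{decrement_pct(6456, 2899):.2f}")  # 1(a), n = 30000 -> 55.10
```

These two values reproduce the endpoints of the 53.13-55.10 range shown for the Newton-GS method in row 1(a) of Table 3.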

Table 3. Decrement Percentage of the Number of Inner Iterations for the Newton-GS and the Newton-2EGGS compared to the Newton-Jacobi

Test problem (initial point) | Decrement of inner iterations, NGS (%) | Decrement of inner iterations, N2EGGS (%)
1(a) | 53.13-55.10 | 53.16-55.10
1(b) | 54.31-55.74 | 54.31-55.79
1(c) | 52.97-54.52 | 52.29-54.52
2(a) | 50.42-73.48 | 64.11-79.06
2(b) | 57.93-63.42 | 64.63-67.39
2(c) | 58.28-64.94 | 64.63-68.61
3(a) | 50.18-51.60 | 50.67-51.60
3(b) | 53.11-54.37 | 53.11-54.37
3(c) | 50.73-51.94 | 51.10-52.05


Table 4. Comparison of speed ratios for the Newton-2EGGS method with the Newton-GS and the Newton-Jacobi methods

Test problem (initial point) | Total execution time in s (NJ NGS N2EGGS) | Speed ratio: (I) NJ/NGS, (II) NJ/N2EGGS, (III) NGS/N2EGGS
1(a) | 7.31 4.58 4.24 | 1.60 1.72 1.08
1(b) | 3.63 2.21 2.00 | 1.64 1.82 1.11
1(c) | 6.71 3.70 3.65 | 1.81 1.84 1.01
2(a) | 3175.50 1844.61 1168.20 | 1.72 2.72 1.58
2(b) | 3517.64 2008.45 1271.62 | 1.75 2.77 1.58
2(c) | 4078.53 1879.18 1275.92 | 2.17 3.20 1.47
3(a) | 2.01 0.66 0.62 | 3.05 3.24 1.06
3(b) | 1.05 0.48 0.42 | 2.19 2.50 1.14
3(c) | 1.37 0.61 0.58 | 2.25 2.36 1.05
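The three ratio columns of Table 4 are consistent with the total-time ratios NJ/NGS, NJ/N2EGGS and NGS/N2EGGS; this labelling of (I)-(III) is inferred from the tabulated values themselves. A sketch using the 1(a) row:

```python
# Speed ratios recomputed from the total execution times of Table 4,
# row 1(a): NJ = 7.31 s, NGS = 4.58 s, N2EGGS = 4.24 s.
t_nj, t_ngs, t_2eggs = 7.31, 4.58, 4.24
print(f"{t_nj / t_ngs:.2f}")     # (I)   -> 1.60
print(f"{t_nj / t_2eggs:.2f}")   # (II)  -> 1.72
print(f"{t_ngs / t_2eggs:.2f}")  # (III) -> 1.08
```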

5. Conclusions

In this paper, we have proposed an alternative algorithm for solving large-scale unconstrained optimization problems with an arrowhead Hessian matrix. Through the implementation of a block iterative method for finding the Newton direction, we can point out that the approach we proposed is superior to the reference methods. This point is clarified in Table 3 and Table 4. As described in Table 3, the decrement percentage of the number of inner iterations, compared to the Newton-Jacobi method, is in the range 50.18%-73.48% for the Newton-GS method and 50.67%-79.06% for the Newton-2EGGS method. Furthermore, from the speed ratios shown in Table 4, we observe that our method is up to 1.58 times faster than the Newton-GS method and up to 3.20 times faster than the Newton-Jacobi method. Thus, it can be concluded that our proposed alternative method shows significant improvement in the number of inner iterations and the execution time compared to the Newton-GS and Newton-Jacobi iterative methods. In addition, we expect that the proposed Newton's direction presented in this paper can be extended to a 4-point block iterative method, as in Ghazali et al. [21].

Acknowledgements

The authors are grateful for the fund received from Universiti Malaysia Sabah upon publication of this paper (GUG0160-2/2017).

REFERENCES

[1] Sun W & Yuan Y, Optimization Theory and Methods: Nonlinear Programming. United States: Springer, (2006), pp. 119-302

[2] Esmaeili H & Kimiaei M, "An Improved Adaptive Trust-Region Method for Unconstrained Optimization", Mathematical Modeling and Analysis, Vol. 19, No. 4, (2014), pp. 469-490

[3] Shuai Z, Zong W & Jing Z, "An Extended Nonmonotone Line Search Technique for Large-scale Unconstrained Optimization", Journal of Computational and Applied Mathematics, Vol. 330, (2018), pp. 586-604

[4] Esmaeili H, Rostami M & Kimiaei M, "Extended Dai-Yuan Conjugate Gradient Strategy for Large-scale Unconstrained Optimization with Applications to Compressive Sensing", Filomat, Vol. 32, No. 6, (2018), pp. 2173-2191, https://doi.org/10.2298/FIL1806173E

[5] Moyi AU & Leong WJ, "A Sufficient Descent Three-term Conjugate Gradient Method via Symmetric Rank-one Update for Large-scale Optimization", Optimization, Vol. 64, No. 1, (2016), pp. 121-143

[6] Dembo RS & Steihaug T, "Truncated-Newton Algorithms for Large-scale Unconstrained Optimization", Mathematical Programming, Vol. 26, (1983), pp. 190-212

[7] Grapsa TN, "A Modified Newton Direction for Unconstrained Optimization", Optimization, Vol. 63, No. 7, (2014), pp. 983-1004

[8] Boggs PT & Byrd RH, "Adaptive, Limited-Memory BFGS Algorithms for Unconstrained Optimization", SIAM Journal on Optimization, Vol. 29, No. 2, (2019), pp. 1282-1299

[9] Evans DJ, "Group Explicit Iterative Methods for Solving Large Linear Systems", International Journal of Computer Mathematics, Vol. 11, (1985), pp. 81-108

[10] Yousif WS & Evans DJ, "Explicit Group Over-relaxation Methods for Solving Elliptic Partial Differential Equations", Mathematics and Computers in Simulation, Vol. 28, (1986), pp. 453-466

[11] Yousif WS & Evans DJ, "Explicit De-coupled Group Iterative Methods and Their Implementations", Parallel Algorithms and Applications, Vol. 7, (1995), pp. 53-71

[12] Sulaiman J, Hasan MK, Othman M & Karim SAA, "Newton-EGMSOR Methods for Solution of Second Order Two-Point", Journal of Mathematics and System Science, Vol. 2, (2012), pp. 185-190

[13] Sulaiman J, Hasan MK, Othman M & Karim SAA, "Application of Block Iterative Methods with Newton Scheme for Fisher's Equation by Using Implicit Finite Difference", Jurnal Kalam, Vol. 8, No. 1, (2015), pp. 39-46

[14] Othman M & Abdullah AR, "An Efficient Four Points Modified Explicit Group Poisson Solver", International Journal of Computer Mathematics, Vol. 76, (2000), pp. 203-217

[15] Stanimirovic PS, Katsikis VN & Kolundzija D, "Inversion and Pseudoinversion of Block Arrowhead Matrices", Applied Mathematics and Computation, Vol. 341, (2019), pp. 379-340

[16] Varga RS, Matrix Iterative Analysis. Berlin: Springer, (2000), pp. 63-101

[17] Saad Y, Iterative Methods for Sparse Linear Systems. United States: PWS Publishing Company, (1996), pp. 103-128

[18] Ghazali K, Sulaiman J, Dasril Y & Gabda D, "Newton-MSOR Method for Solving Large-Scale Unconstrained Optimization Problems with an Arrowhead Hessian Matrices", Transactions on Science and Technology, Vol. 6, No. 2-2, (2019), pp. 228-234

[19] Andrei N, Test Functions for Unconstrained Optimization. Research Institute for Informatics, Center for Advanced Modeling and Optimization, 8-10 Averescu Avenue, Sector 1, Bucharest, Romania, (2004), pp. 1-15

[20] Andrei N, "An Unconstrained Optimization Test Function Collection", Advanced Modeling and Optimization: An Electronic International Journal, Vol. 10, No. 1, (2008), pp. 147-161

[21] Ghazali K, Sulaiman J, Dasril Y & Gabda D, "Application of Newton-4EGSOR Iteration for Solving Large Scale Unconstrained Optimization Problems with a Tridiagonal Hessian Matrix", In: Alfred R, Lim Y, Ibrahim A, Anthony P (eds), Computational Science and Technology, Lecture Notes in Electrical Engineering, Vol. 481, Springer, Singapore, (2019), pp. 401-411

Mathematics and Statistics 8(2A): 47-51, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081308

Parameter Estimations of the Generalized Extreme Value Distributions for Small Sample Size

Razira Aniza Roslan*, Chin Su Na, Darmesah Gabda

Faculty of Science and Natural Resources, Universiti Malaysia Sabah, Malaysia

Received July 28, 2019; Revised September 27, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  The standard method of maximum likelihood has poor performance in GEV parameter estimates for small sample data. This study aims to explore Generalized Extreme Value (GEV) parameter estimation using several methods, focusing on small sample sizes of an extreme event. We conducted a simulation study to illustrate the performance of different methods, namely the Maximum Likelihood (MLE), probability weighted moment (PWM) and the penalized likelihood method (PMLE), in estimating the GEV parameters. Based on the simulation results, we then applied the superior method in modelling the annual maximum stream flow in Sabah. The result of the simulation study shows that the PMLE gives better estimates compared to MLE and PWM, as it has small bias and root mean square error, RMSE. For an application, we can then compute the estimated return level of river flow in Sabah.

Keywords  Extreme Value Theory (EVT), Generalized Extreme Value (GEV), Maximum Likelihood Estimation (MLE), Probability Weighted Moments (PWM), Penalized Maximum Likelihood (PMLE), L-Moment

1. Introduction

The Extreme Value Theory (EVT) is a statistics field that concentrates on any possible event that can be more extreme than what normally happens. Usually, EVT is used to measure safety during catastrophic events; sometimes, if we do not pay attention to the risk of an event because it has a low occurrence, it will cause huge losses. Therefore we can use EVT in a specific location to estimate the frequency and cost of such events over a period of time. EVT has been widely used in various fields such as geophysical variables, insurance, risk management and hydrology [19]. There are two approaches used when it comes to analyzing extreme values, namely block maxima (BM) and peak over the threshold (POT). In BM, the period is divided into equal sections and the maximum of each is selected; this approach is usually paired with the generalized extreme value (GEV) distribution. POT, meanwhile, selects every value that exceeds a certain threshold, an approach which leads to the generalized Pareto distribution (GPD) [1].

The GEV distribution was introduced by Jenkinson [3] and has been used in many research areas, such as civil engineering design [4], hydrology [2], air quality estimation [15] and finance [14]. The GEV distribution consists of three parameters: shape (ξ), scale (σ) and location (µ). The parameter estimation of the GEV distribution can be obtained using several statistical methods such as the Maximum Likelihood Estimator (MLE), Probability Weighted Moments (PWM), the Penalized Maximum Likelihood Estimator (PMLE) and L-moments. The aim of this study is to model the annual maximum stream flow using the GEV distribution, focusing on small sample size data. We apply several methods to estimate the GEV parameters.

Each method of parameter estimation has its advantages and disadvantages, but an ideal parameter estimation can be explained in terms of unbiasedness, efficiency and consistency. The parameter estimation must be unbiased, where the estimated parameter is close to the true parameter, and efficient; the method with the smallest root mean square error (RMSE) is the most efficient estimator. Other than that, the parameter estimation must be consistent, where the estimation function converges well [12].

MLE is the method mostly used to estimate the GEV parameters because MLE has good asymptotic properties such as consistency and efficiency. MLE is easy to adapt to model changes [12]. Besides that, MLE can be used in complex models involving non-stationarity, temporal dependence and covariate effects. However, this parameter estimation can only be used on large sample data, and the result will become uncertain if the data is less than


50 values (minimum) [7]. This is confirmed by Hosking et distribution, uncertainty will be ignored [12]. al. (1985) that MLE shows a poor performance due to the The GEV distribution having the non-degenerate small sample size. Considering MLE cannot perform well distribution function fulfills where 푎푛and 푏푛 are constant in small sample size, Coles & Dixon [18] show an with 푎푛> 0 investigation about how to improve MLE by proposing an 푥−푏푛 ∗ Pr (Mn≤푥)≈ 퐺 ( ) = 퐺 (푥) (1) alternative method called PMLE. Their study stated that 푎푛 PMLE will not only maintain model flexibility and large The cumulative distribution function (CDF) of GEV sample optimality of MLE, also help to improve it on small distribution is denoted as follows [13]: properties. This may be concluded that PMLE is given an 1 improved smoother estimation along with better accuracy 푥−µ exp [− ⌈1 + 휉 ( )⌉휉] , 휉 ≠ 0 thandirect estimate without penalties [20]. G(x) ={ 휎 (2) PWM was probably advantageous for small set data 푥−µ 푒푥푝 {−푒푥푝 (− )} , 휉 = 0 because it has smaller uncertainty than ordinary moment 휎 [19] and has lower variance than others [12]. But when the x- µ Where푥: 1+ ( ) > 0, -∞< µ <∞, 휎 > 0 and -∞<휉<∞, shape parameter is large, this parameter estimation σ performs poorly [18] and upper quantile will show PWM is in this model 휉,휎 and µ are the parameters for shape, scale, biased. But PWM is still preferable than MLE for small and location. By equation (2) GEV distribution for sample size data. On the other hand, MLE is more flexible Frẻchetξ>0and Weibullξ<0, while for Gumbel distribution than PWM because covariate can be easily added in ξ=0 taken as ξ→0. parameterization [11]. PWM is equivalent to L-moment and it performs better 2.2. 
than MLE in terms of bias and RMSE [17,10]. The L-moment is a summary statistic: it provides measures of location, skewness, kurtosis or other aspects of shape that describe a probability distribution or a data sample. Although the L-moment produces bias, it is still preferable because it has a smaller variance than MLE [13], as MLE produces a very large variance and error in estimation [9]. However, L-moment parameter estimation can only be used to estimate a stationary process [8]. Therefore the L-moment and MLE can be "mixed" to produce a better result for the GEV parameter estimator; this combination helps to reduce variance and bias [13]. In this study, we will illustrate the GEV parameter estimations using a simulation study. The superior method will then be applied to model the annual maximum stream flow in Sabah.

2. Methodology

The previous study has shown that PMLE is superior to the other methods.

2.2. Maximum Likelihood Estimation (MLE)

Generally, MLE is the most popular estimation method in EVT because it has good asymptotic properties such as consistency, efficiency, and normality. MLE can be applied to complex modeling situations such as temporal dependence, non-stationarity and covariate effects [12]. The likelihood function can be written as

    L(θ|x) = ∏_{i=1}^{n} g(x_i)    (3)

where g is the probability density function of the GEV distribution, so that

    L(θ|x) = ∏_{i=1}^{n} (1/σ) exp{ −[1 + ξ((x_i − µ)/σ)]^(−1/ξ) } [1 + ξ((x_i − µ)/σ)]^(−1/ξ − 1),   ξ ≠ 0
    L(θ|x) = ∏_{i=1}^{n} (1/σ) exp(−(x_i − µ)/σ) exp{ −exp(−(x_i − µ)/σ) },                           ξ = 0    (4)

When the sample size rises to infinity, the MLE is a consistent estimator and its variance goes to zero. Asymptotic theory allows the MLE to be normally distributed as the sample size rises. MLE was chosen due to its stable performance on large samples (n > 50) [5].
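In practice the likelihood in equation (4) is maximized numerically on the log scale. The authors' computations were done in R with their own code; purely as an illustrative sketch (not the authors' implementation), the corresponding GEV negative log-likelihood, which a general-purpose optimizer such as Nelder-Mead would minimize, can be written as:

```python
import math

def gev_nll(params, data):
    """Negative log-likelihood built from the GEV density in equation (4)."""
    mu, sigma, xi = params
    if sigma <= 0:
        return float("inf")                  # scale must be positive
    nll = len(data) * math.log(sigma)
    for x in data:
        z = (x - mu) / sigma
        if abs(xi) < 1e-12:                  # Gumbel limit, xi -> 0
            nll += z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0:                       # observation outside the support
                return float("inf")
            nll += (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return nll
```

The support check mirrors the constraint 1 + ξ(x − µ)/σ > 0 stated after equation (2); returning infinity outside the support keeps the optimizer inside the feasible region.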
The GEV parameter estimates are obtained by maximizing the log-likelihood function with respect to the parameters. In this study, we illustrate the GEV parameter estimation using three estimators: MLE, PWM, and PMLE. We conducted a simulation study to compare the methods using R software with our own written code. From the result, we then apply the superior method to model the annual maximum river flow in Sabah.

2.3. Penalized Maximum Likelihood Estimation (PMLE)

2.1. GEV Distribution

The GEV distribution is a family of distributions consisting of three members, called the Gumbel, Fréchet, and Weibull distributions. These distributions can fit extreme data sets with high accuracy. Choosing only one member of the GEV family may cause bias in the data and, in terms of the distribution, uncertainty will be ignored [12].

The following penalized maximum likelihood was introduced by Coles & Dixon [18]. The penalized likelihood function can be written as in equation (5) [18]:

    L_pen(µ, σ, ξ) = L(µ, σ, ξ) · P(ξ)    (5)

where L(µ, σ, ξ) is the standard likelihood function of

Mathematics and Statistics 8(2A): 47-51, 2020 49

MLE from equation (4), and the penalty function P(ξ) is shown in equation (6):

    P(ξ) = 1,                              ξ ≤ 0
    P(ξ) = exp{ −λ [ξ/(1 − ξ)]^α },        0 < ξ < 1
    P(ξ) = 0,                              ξ ≥ 1    (6)

where the appropriate values for λ and α are non-negative. The PMLE helps to overcome the poor results of MLE on small sample sizes. This is supported by Coles & Dixon [18], who conducted a study to explain the behavior of the penalized likelihood. The PMLE is almost identical to the MLE when ξ is negative, but when ξ is positive the PMLE is almost the same as PWM; hence the PMLE inherits a smaller variance at the expense of a negative bias [12]. Overall, PMLE has properties that suit all sample sizes and helps to improve on MLE and PWM.

2.4. Probability Weighted Moments (PWM) & L-Moment

The L-moment is a method based on a combination of PWMs [7]; hence PWM is equivalent to the L-moment for the GEV distribution [6], and this method was introduced by Hosking [10].

4. Simulation Study

We illustrate the comparison of the GEV parameter estimation methods using a simulation study. For this purpose, we simulate extreme events from the GEV distribution, X ~ GEV(0, 1, 0.15), with a sample size of n = 30. We repeat this simulation 1000 times. For each case, we estimate the parameters using MLE, PWM, and PMLE, and then compute the bias and RMSE for method comparison. Table 1 shows the GEV parameter estimates by PWM, MLE and PMLE; the estimates are close to the actual values, θ̂ ≈ θ.

Table 1. GEV parameter estimation of PWM, MLE and PMLE

    Method | µ̂         | σ̂        | ξ̂
    PWM    | 0.029251  | 0.959233 | 0.138762
    MLE    | 0.011395  | 0.964506 | 0.145367
    PMLE   | 0.0167963 | 0.971953 | 0.150994

Table 2 shows that the bias is close to zero for all parameter estimation methods. As we can see from Table 3, PMLE produces a smaller RMSE of ξ̂ compared to the other methods. Hence we can conclude that PMLE is superior to MLE and PWM, as shown in the previous studies of Coles & Dixon [18] and Musakkal et al. [2].
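A direct Python transcription of the penalty in equation (6) is given below as an illustration only (the paper's computations were done in R); λ and α are the non-negative tuning constants:

```python
import math

def penalty(xi, lam=1.0, alpha=1.0):
    """Coles-Dixon penalty P(xi) of equation (6); lam and alpha are non-negative."""
    if xi <= 0:
        return 1.0          # no penalty: PMLE coincides with MLE
    if xi >= 1:
        return 0.0          # shape values >= 1 are ruled out entirely
    return math.exp(-lam * (xi / (1.0 - xi)) ** alpha)

# Penalizing the likelihood multiplicatively, as in equation (5), is equivalent
# to adding -log P(xi) to the negative log-likelihood before minimizing.
```

For ξ ≤ 0 the penalty is inactive, matching the observation above that the PMLE is almost identical to the MLE for negative shape values.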
A random variable X for PWM and the L-moment can be defined as

    β_r = M_{1,r,0} = E[X {F(X)}^r],  r = 0, 1, 2, …    (7)

where F is the distribution function of X, and β̂_r is estimated from the empirical distribution by

    β̂_r = (1/n) ∑_{i=1}^{n} x_(i) (F̂_i)^r,  where F̂_i = (i − 0.35)/n    (8)

Therefore the parameters of the GEV can be estimated using the following equations:

    ξ̂ = 7.8590c + 2.9554c²    (9)
    σ̂ = (2b₁ − b₀) ξ̂ / [Γ(1 − ξ̂)(2^ξ̂ − 1)]    (10)
    µ̂ = b₀ − (σ̂/ξ̂) {Γ(1 − ξ̂) − 1}    (11)

where b_r denotes the sample estimate β̂_r and

    c = (2b₁ − b₀)/(3b₂ − b₀) − log 2 / log 3    (12)

Table 2. Bias of GEV parameter estimation of PWM, MLE, and PMLE

    Method | µ̂          | σ̂        | ξ̂
    PWM    | -0.000253  | 0.000681 | -0.000123
    MLE    | -0.000208  | 0.000644 | -0.000095
    PMLE   | -0.000242  | 0.000699 | -0.000125

Table 3. Root mean square error (RMSE) of GEV parameter estimation of PWM, MLE, and PMLE

    Method | µ̂          | σ̂        | ξ̂
    PWM    | 0.0197239  | 0.027801 | 0.017808
    MLE    | 0.019521   | 0.028044 | 0.018097
    PMLE   | 0.019464   | 0.027903 | 0.017759
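The closed-form estimates of equations (8)-(12) can be sketched in a few lines. This is an illustration only, not the paper's R code; b[r] below denotes the sample estimate β̂_r with the plotting position F̂_i = (i − 0.35)/n:

```python
import math

def gev_pwm(data):
    """PWM/L-moment estimates of (mu, sigma, xi) via equations (8)-(12)."""
    x = sorted(data)
    n = len(x)
    # sample PWMs b_0, b_1, b_2 of equation (8)
    b = [sum(((i - 0.35) / n) ** r * x[i - 1] for i in range(1, n + 1)) / n
         for r in range(3)]
    c = (2 * b[1] - b[0]) / (3 * b[2] - b[0]) - math.log(2) / math.log(3)
    xi = 7.8590 * c + 2.9554 * c ** 2                       # equation (9)
    sigma = (2 * b[1] - b[0]) * xi / (math.gamma(1 - xi) * (2 ** xi - 1))
    mu = b[0] - sigma / xi * (math.gamma(1 - xi) - 1)
    return mu, sigma, xi
```

For a near-Gumbel sample the fitted shape comes out close to zero, and the ratio in equation (10) remains numerically stable because 2^ξ̂ − 1 ≈ ξ̂ log 2 for small ξ̂.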

3. Return Level

The return level is frequently used to convey information about the likelihood of extreme events such as earthquakes, floods, hurricanes, etc. [19]. For the three methods above, we can estimate the return level using equation (13):

    z_p = µ − (σ/ξ) [1 − {−log(1 − p)}^(−ξ)],  ξ ≠ 0
    z_p = µ − σ log[−log(1 − p)],              ξ = 0    (13)

5. Result and Discussion

5.1. Application to Stream Flow Data in Sabah

This study uses annual maximum streamflow data (m³s⁻¹) from several stations in Sabah, obtained from the Hydrology Department of Sabah. Table 4 shows the number of observations for each station.
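Equation (13) transcribes directly into code. The following is an illustrative Python sketch (the paper's computations were done in R), with the Gumbel branch taken at ξ = 0:

```python
import math

def return_level(mu, sigma, xi, p):
    """Return level z_p of equation (13) for exceedance probability p."""
    y = -math.log(1.0 - p)          # y_p = -log(1 - p)
    if xi == 0.0:                   # Gumbel case
        return mu - sigma * math.log(y)
    return mu - sigma / xi * (1.0 - y ** (-xi))

# 100-year level (p = 0.01) with the Sg Padas At Kemabong estimates of Table 5
z_padas = return_level(634.7662, 218.7411, -0.0003, 0.01)
```

With the Table 5 estimates this reproduces the corresponding Table 6 value of about 1640 m³s⁻¹ for Sg Padas.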


Table 4. Number of observations for each station

    No | Station Name                        | Years | Duration
    1  | Sg Segama At Limkabong Sabah        | 23    | 1994-2019
    2  | Sg Balung At Balung Bridge Sabah    | 25    | 1992-2016
    3  | Sg Sapulut At Sapulut Sabah         | 29    | 1990-2018
    4  | Sg Kalabakan At Kalabakan Sabah     | 32    | 1986-2017
    5  | Sg Tawau At Kuhara Sabah            | 34    | 1983-2016
    6  | Sg Mengalong At Sindumin            | 37    | 1983-2019
    7  | Sg Kalumpang At Mostyn Bridge Sabah | 38    | 1979-2016
    8  | Sg Lakutan At Mesapol Sabah         | 39    | 1978-2016
    9  | Sg Padas At Kemabong Sabah          | 50    | 1969-2018
    10 | Sg Kuamut At UluKuamut Sabah        | 50    | 1969-2018

We applied the result of the simulation study to model the annual maximum river flow in Sabah. Table 5 shows the GEV parameter estimates obtained with the PMLE method.

Table 5. Parameter estimation for PMLE

    No | Station                             | µ         | σ         | ξ
    1  | Sg Segama At Limkabong Sabah        | 490.5986  | 365.4626  | -0.0606
    2  | Sg Balung At Balung Bridge Sabah    | 16.2600   | 10.6001   | -0.0578
    3  | Sg Sapulut At Sapulut Sabah         | 464.0687  | 147.8391  | -0.0144
    4  | Sg Kalabakan At Kalabakan Sabah     | 177.0876  | 196.1891  | -0.0000
    5  | Sg Tawau At Kuhara Sabah            | 10.2473   | 3.6354    | -0.1465
    6  | Sg Mengalong At Sindumin            | 205.9045  | 100.0220  | -0.2425
    7  | Sg Kalumpang At Mostyn Bridge Sabah | 173.4530  | 117.6743  | -0.0000
    8  | Sg Lakutan At Mesapol Sabah         | 69.8482   | 35.49840  | -0.0953
    9  | Sg Padas At Kemabong Sabah          | 634.7662  | 218.7411  | -0.0003
    10 | Sg Kuamut At UluKuamut Sabah        | 887.36199 | 376.63772 | 0.00004

We evaluate the goodness of fit of the GEV using the Q-Q plot with a 95% tolerance interval. The Q-Q plot is a useful tool to check whether the empirical distribution is close to the theoretical distribution. As a result, the GEV fits the annual maximum river flow well for all stations. Figure 1 shows an example of the Q-Q plot with a 95% tolerance interval for the GEV fit of the annual maximum river flow at station Segama. It can be seen that all points are scattered along a straight line with slope equal to 1 and lie within the 95% tolerance interval.

Figure 1. Q-Q plot with 95% tolerance interval at station Segama (PMLE method); estimated data (horizontal axis) against observed data (vertical axis).

We then calculated the return value of the annual maximum for each site with p = 0.01. The corresponding return value estimates for all stations are shown in Table 6.

Table 6. Return value estimates

    No | Station                             | PMLE
    1  | Sg Segama At Limkabong Sabah        | 1957.537
    2  | Sg Balung At Balung Bridge Sabah    | 59.07734
    3  | Sg Sapulut At Sapulut Sabah         | 1122.095
    4  | Sg Kalabakan At Kalabakan Sabah     | 1079.415
    5  | Sg Tawau At Kuhara Sabah            | 22.4142
    6  | Sg Mengalong At Sindumin            | 483.1688
    7  | Sg Kalumpang At Mostyn Bridge Sabah | 714.7055
    8  | Sg Lakutan At Mesapol Sabah         | 202.0502
    9  | Sg Padas At Kemabong Sabah          | 1640.258
    10 | Sg Kuamut At UluKuamut Sabah        | 2620.125

6. Conclusions

The simulation study shows that the PMLE gives a better estimate compared to MLE and PWM because it has small bias and RMSE. We then used this result to model the annual maximum river flow in Sabah. The GEV distribution is the appropriate model for these extreme data. For the application, we used the 100-year return level for each station, and the theoretical distribution is similar to the empirical distribution. For


future study, we will consider the effect of covariates in the model [14].

Acknowledgements

I am greatly thankful to the Hydrology Department of Sabah for providing the stream flow data. I also want to thank Universiti Malaysia Sabah for the research grant UMSGreat (GUG0355-1/2019) to conduct this study.

REFERENCES

A. Ferreira & L. de Haan (2015). On the block maxima method in extreme value theory: PWM estimators. The Annals of Statistics, 43, 1, 276–298.

A. Z. Ismail, Z. Yusop & Z. Yusof (2015). Comparison of flood distribution models for Johor River Basin. Jurnal Teknologi, 74(11), 1123-1128.

A. F. Jenkinson (1955). The frequency distribution of the annual maximum or minimum value of meteorological elements. Quarterly Journal of the Royal Meteorological Society, 87, 145-158.

E. Castillo, A. S. Hadi, N. Balakrishnan & J. M. Sarabia (2005). Extreme Value and Related Models with Applications in Engineering and Science. New Jersey: Wiley.

E. Gilleland & R. W. Katz (2006). Analyzing seasonal to interannual extreme weather and climate variability with the extremes toolkit. Research Application Laboratory, National Center for Atmospheric Research, 2(15), 1-9.

E. S. Martins & J. R. Stedinger (2000). Generalized maximum likelihood generalized extreme value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737-744.

G. Lazoglou & C. Anagnostopoulou (2017). An overview of statistical methods for studying the extreme rainfalls in Mediterranean. Proceedings, 1(5), 681.

H. B. Hasan, N. F. B. Radi & S. B. Kassim (2012). Modeling of extreme temperature using generalized extreme value (GEV) distribution: a case study of Penang. Proceedings of the World Congress on Engineering, 1, WCE 2012.

J. E. Morrison & J. A. Smith (2002). Stochastic modeling of flood peaks using the generalized extreme value distribution. Water Resources Research, 38, 12, 1305.

J. R. M. Hosking, J. R. Wallis & E. F. Wood (1985). Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics, 27, 3, 251-261.

K. Engeland, H. Hisdal & A. Frigessi (2005). Practical extreme value modelling of hydrological floods and droughts: a case study. Extremes, 7, 5–30.

N. F. K. Musakkal, S. N. Chin, K. Ghazali & D. Gabda (2018). A penalized likelihood approach to model the annual maximum flow with small sample sizes. Malaysian Journal of Fundamental and Applied Sciences, 13, 4, 563-566.

P. Ailliot, C. Thompson & P. Thomson (2011). Mixed methods for fitting the GEV distribution. Water Resources Research, 47(5), 1–14.

P. Embrechts, C. Klüppelberg & T. Mikosch (1997). Modelling Extremal Events for Insurance and Finance. Berlin: Springer Verlag.

P. Sharma, A. Chandra, S. C. Kaushik, P. Sharma & S. Jain (2012). Predicting violations of national ambient air quality standards using extreme value theory for Delhi City. Atmospheric Pollution Research, 3, 2, 170-179.

R. L. Smith (1985). Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–92.

R. Wada, T. Waseda & P. Jonathan (2016). Extreme value estimation using the likelihood-weighted method. Ocean Engineering, 124, 241–251.

S. Coles & M. Dixon (1999). Likelihood based inference for extreme value models. Extremes, 2(1), 5-23.

V. P. Singh & K. P. Das (2016). Characterization of the tail of river flow data by generalized Pareto distribution. Journal of Statistical Research, 48-50, 2, 55-70.

Z. Zhou, S. Liu, H. Hua, C. S. Chen, G. Zhong, H. Lin & C. W. Huang (2014). Frequency analysis for predicting extreme precipitation in Changxing Station of Taihu Basin, China. Journal of Coastal Research, 68, 144–151.

Mathematics and Statistics 8(2A): 52-57, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081309

Fourth-order Compact Iterative Scheme for the Two-dimensional Time Fractional Sub-diffusion Equations

Muhammad Asim Khan∗, Norhashidah Hj. Mohd Ali

School of Mathematical Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia

Received July 28, 2019; Revised September 27, 2019; Accepted February 20, 2020

Copyright ©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  The fractional diffusion equation is an important mathematical model for describing phenomena of anomalous diffusion in transport processes. A high-order compact iterative scheme is formulated for solving the two-dimensional time fractional sub-diffusion equation. The spatial derivative is evaluated using the Crank-Nicolson scheme with a fourth-order compact approximation, and the Caputo derivative is used for the time fractional derivative, to obtain a discrete implicit scheme. The order of convergence of the proposed method will be shown to be $O(\tau^{3-\alpha} + h^4)$. Numerical examples are provided to verify the high-order accuracy of the solutions of the proposed scheme.

Keywords  High-order Compact Scheme, Crank-Nicolson, Finite Difference, Two-dimensional Time Fractional Sub-diffusion

1  Introduction

Fractional differential equations have gained more attention over the last few years because of their applications in various fields of science and technology, such as physics, engineering, and biology [1–4]. Most fractional differential equations, however, cannot be solved analytically; therefore researchers turn to numerical methods for alternative solvers. Numerous numerical methods have been formulated for solving various types of fractional differential equations [5–15]. In particular, two-dimensional time fractional diffusion equations have been solved by Zhuang and Liu [12], Balasim and Ali [13], Cui [14], and Abbaszadeh and Mohebbi [15] with promising results. However, the formulation of high-order accurate solvers is still in its infancy, particularly for two-dimensional fractional differential equations. In this paper, we present a high-order method using the Caputo time derivative in hybrid with the Crank-Nicolson scheme for the spatial derivatives to solve the two-dimensional time fractional diffusion equations. The method will be shown to be fourth-order accurate in space. The formulation of the scheme is discussed in the next section; numerical results are presented in Section 3 and concluding remarks are given in Section 4.

2  Formulation of the proposed scheme

The two-dimensional time fractional diffusion equation is described as

$$ {}^{C}_{0}D^{\alpha}_{t}u(x,y,t) = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + f(x,y,t), \qquad (x,y) \in (L_1,L_2)\times(L_3,L_4),\ 0 < t < T. \tag{1} $$

Here ${}^{C}_{0}D^{\alpha}_{t}u$ $(0<\alpha<1)$ represents the Caputo fractional derivative of $u$ defined by [12], i.e.

$$ {}^{C}_{0}D^{\alpha}_{t}u = \frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{\partial u(x,y,\tau)/\partial\tau}{(t-\tau)^{\alpha}}\, d\tau, \qquad (0<\alpha<1), \tag{2} $$

where $\Gamma(\cdot)$ is the Euler Gamma function.

To apply finite difference approximations to the time and space derivatives of (1), let $h>0$ be the space step and $\tau>0$ the time step, with the step size taken equal in both the $x$ and $y$ directions. Define $x_i = ih$, $y_j = jh$ for $i,j = 0,1,2,\dots,n$, $t_k = k\tau$ for $k = 0,1,2,\dots,l$, and spatial step sizes $h_x = h_y = h = 1/n$, where $n$ is an arbitrary positive integer. Many approximation formulas could be obtained for (1) at the point $(x_i,y_j,t_k)$. Consider Taylor series expansions of the function values of $u$ about the point $(x_i,y_j,t_k)$:

$$ u(x_i+h, y_j, t_k) = u^k_{i+1,j} = u^k_{i,j} + h\,\frac{\partial u}{\partial x}\Big|^k_{i,j} + \frac{h^2}{2}\,\frac{\partial^2 u}{\partial x^2}\Big|^k_{i,j} + \frac{h^3}{6}\,\frac{\partial^3 u}{\partial x^3}\Big|^k_{i,j} + \cdots, $$

$$ u(x_i, y_j+h, t_k) = u^k_{i,j+1} = u^k_{i,j} + h\,\frac{\partial u}{\partial y}\Big|^k_{i,j} + \frac{h^2}{2}\,\frac{\partial^2 u}{\partial y^2}\Big|^k_{i,j} + \frac{h^3}{6}\,\frac{\partial^3 u}{\partial y^3}\Big|^k_{i,j} + \cdots. $$

If the central difference operator is introduced as

$$ \delta_x^2 u^k_{i,j} = u^k_{i+1,j} - 2u^k_{i,j} + u^k_{i-1,j}, $$

and similarly for $\delta_y^2$, then using the above Taylor series expansions at the points $u^k_{i\pm1,j}$ and $u^k_{i,j\pm1}$ we get

$$ \frac{\delta_x^2 u^k_{i,j}}{h^2} = \frac{\partial^2 u}{\partial x^2}\Big|^k_{i,j} + \frac{h^2}{12}\,\frac{\partial^4 u}{\partial x^4}\Big|^k_{i,j} + \frac{h^4}{360}\,\frac{\partial^6 u}{\partial x^6}\Big|^k_{i,j} + O(h^6), \tag{3} $$

$$ \frac{\delta_y^2 u^k_{i,j}}{h^2} = \frac{\partial^2 u}{\partial y^2}\Big|^k_{i,j} + \frac{h^2}{12}\,\frac{\partial^4 u}{\partial y^4}\Big|^k_{i,j} + \frac{h^4}{360}\,\frac{\partial^6 u}{\partial y^6}\Big|^k_{i,j} + O(h^6). \tag{4} $$

After rearranging (3) and (4), the following is obtained:

$$ \frac{\partial^2 u}{\partial x^2}\Big|^k_{i,j} = \Big(1+\tfrac{1}{12}\delta_x^2\Big)^{-1}\frac{\delta_x^2}{h^2}\,u^k_{i,j} + O(h^4), \tag{5} $$

$$ \frac{\partial^2 u}{\partial y^2}\Big|^k_{i,j} = \Big(1+\tfrac{1}{12}\delta_y^2\Big)^{-1}\frac{\delta_y^2}{h^2}\,u^k_{i,j} + O(h^4). \tag{6} $$

For the time fractional derivative, we use the Crank-Nicolson approximation for the Caputo derivative [13]:

$$ \frac{\partial^{\alpha} u(x_i,y_j,t_{k+\frac12})}{\partial t^{\alpha}} = \frac{\tau^{-\alpha}}{\Gamma(2-\alpha)} \sum_{s=0}^{k} b_s\big(u^{k+1-s}_{i,j} - u^{k-s}_{i,j}\big) + O(\tau^{3-\alpha}), \tag{7} $$

where $b_s = (s+1)^{1-\alpha} - s^{1-\alpha}$, $s = 0,1,2,\dots,k$. Since the Crank-Nicolson scheme is the average of the implicit and explicit schemes, replacing $k$ by $k+\frac12$ in (5), (6) and substituting (5), (6) and (7) into (1), we have

$$ \frac{\tau^{-\alpha}}{\Gamma(2-\alpha)} \sum_{s=0}^{k} b_s\big(u^{k+1-s}_{i,j} - u^{k-s}_{i,j}\big) = \Big(1+\tfrac{1}{12}\delta_x^2\Big)^{-1}\frac{\delta_x^2}{h^2}\,u^{k+\frac12}_{i,j} + \Big(1+\tfrac{1}{12}\delta_y^2\Big)^{-1}\frac{\delta_y^2}{h^2}\,u^{k+\frac12}_{i,j} + f^{k+\frac12}_{i,j} + O(\tau^{3-\alpha} + h^4). \tag{8} $$

Multiplying both sides by $\tau^{\alpha}\Gamma(2-\alpha)\big(1+\tfrac{1}{12}\delta_x^2\big)\big(1+\tfrac{1}{12}\delta_y^2\big)$ and rearranging, we get

$$ \Big(1+\tfrac{1}{12}\delta_x^2\Big)\Big(1+\tfrac{1}{12}\delta_y^2\Big) \sum_{s=0}^{k} b_s\big(u^{k+1-s}_{i,j} - u^{k-s}_{i,j}\big) = \frac{\tau^{\alpha}\Gamma(2-\alpha)}{h^2}\Big(\delta_x^2 + \delta_y^2 + \tfrac{1}{6}\delta_x^2\delta_y^2\Big)u^{k+\frac12}_{i,j} + \tau^{\alpha}\Gamma(2-\alpha)\Big(1+\tfrac{1}{12}\delta_x^2\Big)\Big(1+\tfrac{1}{12}\delta_y^2\Big) f^{k+\frac12}_{i,j}. $$

Since $u^{k+\frac12}_{i,j} = \tfrac12\big(u^{k+1}_{i,j} + u^{k}_{i,j}\big)$, simplifying for the point $u^{k+1}_{i,j}$ gives

$$ (4A-4B+2)\,u^{k+1}_{i,j} = (A-2B)\big[u^{k+1}_{i+1,j} + u^{k+1}_{i-1,j} + u^{k+1}_{i,j+1} + u^{k+1}_{i,j-1}\big] + B\big[u^{k+1}_{i+1,j+1} + u^{k+1}_{i-1,j+1} + u^{k+1}_{i+1,j-1} + u^{k+1}_{i-1,j-1}\big] $$
$$ \quad + (C-2D)\big[u^{k}_{i+1,j} + u^{k}_{i-1,j} + u^{k}_{i,j+1} + u^{k}_{i,j-1}\big] + D\big[u^{k}_{i+1,j+1} + u^{k}_{i-1,j+1} + u^{k}_{i+1,j-1} + u^{k}_{i-1,j-1}\big] + (4D-4C+2)\,u^{k}_{i,j} + F^{k+\frac12}_{i,j} $$
$$ \quad - \sum_{s=1}^{k} b_s\bigg[\frac{25H}{18}u^{k+1-s}_{i,j} + \frac{5H}{36}\big(u^{k+1-s}_{i+1,j} + u^{k+1-s}_{i-1,j} + u^{k+1-s}_{i,j+1} + u^{k+1-s}_{i,j-1}\big) + \frac{H}{72}\big(u^{k+1-s}_{i+1,j+1} + u^{k+1-s}_{i-1,j+1} + u^{k+1-s}_{i+1,j-1} + u^{k+1-s}_{i-1,j-1}\big) $$
$$ \qquad - \Big(\frac{25H}{18}u^{k-s}_{i,j} + \frac{5H}{36}\big(u^{k-s}_{i+1,j} + u^{k-s}_{i-1,j} + u^{k-s}_{i,j+1} + u^{k-s}_{i,j-1}\big) + \frac{H}{72}\big(u^{k-s}_{i+1,j+1} + u^{k-s}_{i-1,j+1} + u^{k-s}_{i+1,j-1} + u^{k-s}_{i-1,j-1}\big)\Big)\bigg] + O(\tau^{3} + \tau^{\alpha}h^{4}), \tag{9} $$

where

$$ F^{k+\frac12}_{i,j} = \frac{25H}{18} f^{k+\frac12}_{i,j} + \frac{5H}{36}\big(f^{k+\frac12}_{i+1,j} + f^{k+\frac12}_{i-1,j} + f^{k+\frac12}_{i,j+1} + f^{k+\frac12}_{i,j-1}\big) + \frac{H}{72}\big(f^{k+\frac12}_{i+1,j+1} + f^{k+\frac12}_{i-1,j+1} + f^{k+\frac12}_{i+1,j-1} + f^{k+\frac12}_{i-1,j-1}\big), $$

$$ H = \tau^{\alpha}\Gamma(2-\alpha), \quad G = \frac{H}{h^2}, \quad A = G - \frac{1}{6}, \quad B = \frac{G}{6} - \frac{1}{72}, \quad C = G + \frac{1}{6}, \quad D = \frac{G}{6} + \frac{1}{72}. $$

Figure 1 shows the nine grid points involved in the updates using (9). A compact scheme can be constructed by iterating on each point in the solution domain using equation (9) until a certain convergence is achieved. Figure 2 shows all the points involved at the different time levels in updating with equation (9), where $G_1 = b_{k-1}\frac{25H}{18}$, $M = b_{k-1}\frac{5H}{36}$, $L = b_{k-1}\frac{H}{72}$, $N = b_0\frac{25H}{18}$, $P = b_0\frac{5H}{36}$ and $Q = b_0\frac{H}{72}$.

3  Numerical experiments

Two examples are used to verify the effectiveness of the high-order compact scheme in solving the two-dimensional time fractional sub-diffusion equations. The experiments were conducted using the method formulated in Section 2.

We calculated the computational orders of the proposed method in the space variables with [15]

$$ C2\text{-}order = \log_2\left(\frac{\|L_\infty(16\tau, 2h)\|}{\|L_\infty(\tau, h)\|}\right) \tag{10} $$

where $L_\infty$ is the maximum norm.

Example 1  Consider the model problem [13]

$$ {}^{C}_{0}D^{\alpha}_{t}u(x,y,t) = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \big(\Gamma(2+\alpha)\,t - 2t^{1+\alpha}\big)e^{x+y}, $$

with initial and boundary conditions

$$ u(0,y,t) = e^{y}t^{1+\alpha}, \quad u(1,y,t) = e^{1+y}t^{1+\alpha}, \quad u(x,0,t) = e^{x}t^{1+\alpha}, \quad u(x,1,t) = e^{1+x}t^{1+\alpha}, $$
$$ u(x,y,0) = 0, \qquad 0 \le x,y \le 1,\ 0 \le t \le 1. $$

Figure 1. The nine grid points involved in the update (9).

The analytical solution is

$$ u(x,y,t) = e^{x+y}t^{1+\alpha}. $$

Example 2 Consider the problem [13]

$$ {}^{C}_{0}D^{\alpha}_{t}u(x,y,t) = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \left(\frac{2t^{2-\alpha}}{\Gamma(3-\alpha)} + 2t^{2}\right)\sin(x)\sin(y), $$

with initial and boundary conditions

$$ u(0,y,t) = 0, \quad u(1,y,t) = t^{2}\sin(1)\sin(y), \quad u(x,0,t) = 0, \quad u(x,1,t) = t^{2}\sin(x)\sin(1), $$
$$ u(x,y,0) = 0, \qquad 0 \le x,y \le 1,\ 0 \le t \le 1, $$

and the analytical solution is

$$ u(x,y,t) = t^{2}\sin(x)\sin(y). $$
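As a quick sanity check of Example 2 (our own illustration, not part of the paper), the exact solution u = t² sin(x) sin(y) satisfies equation (1) with the stated f, since the Caputo derivative of t² is 2t^(2−α)/Γ(3−α):

```python
import math

def residual(x, y, t, alpha):
    """Residual of equation (1) for Example 2; should vanish identically."""
    s = math.sin(x) * math.sin(y)
    caputo_u = 2.0 * t ** (2 - alpha) / math.gamma(3 - alpha) * s
    laplacian_u = -2.0 * t ** 2 * s
    f = (2.0 * t ** (2 - alpha) / math.gamma(3 - alpha) + 2.0 * t ** 2) * s
    return caputo_u - laplacian_u - f
```

The three terms cancel exactly, so the residual is zero up to floating-point rounding at any interior point and any 0 < α < 1.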

From Table 3 and Table 4, it is observed that with decreasing mesh size the maximum and average errors are also reduced, and the effectiveness of the scheme becomes more pronounced. In Table 1 and Table 2, the C2-order of convergence is checked for different values of α in Example 2 and Example 1 respectively, which shows that the computational spatial accuracy of the scheme is in agreement with the theoretical spatial accuracy.

Figure 2. The points involved at the different time levels in the update (9).

The experiments were run in hybrid with a Successive Over Relaxation (SOR) technique for different mesh sizes (n = 8, 16, 24, 30, 36) and different time steps (τ = 0.12, 0.1, 0.05, 0.062, 0.041, 0.033, 0.025, 0.02, 0.015). Results were obtained using a PC with a Core i7 at 3.40 GHz and 4 GB of RAM running Windows 7 Professional and Mathematica software. The maximum absolute error (L∞) with tolerance ε = 10⁻⁵ was used for the convergence criterion.

Table 1. C2-order of convergence for Example 2

    α   | h, τ              | Max error    | C2-order
    0.1 | h = τ = 1/2       | 8.0430×10⁻³  | —
    0.1 | h = 1/4, τ = 1/32 | 5.4351×10⁻⁴  | 3.88
    0.1 | h = τ = 1/4       | 4.2014×10⁻³  | —
    0.1 | h = 1/8, τ = 1/64 | 2.9450×10⁻⁴  | 3.83
    0.2 | h = τ = 1/2       | 8.1460×10⁻³  | —
    0.2 | h = 1/4, τ = 1/32 | 5.4105×10⁻⁴  | 3.91
    0.2 | h = τ = 1/4       | 4.2081×10⁻³  | —
    0.2 | h = 1/8, τ = 1/64 | 3.0412×10⁻⁴  | 3.79
    0.3 | h = τ = 1/2       | 7.9092×10⁻³  | —
    0.3 | h = 1/4, τ = 1/32 | 5.4456×10⁻⁴  | 3.86
    0.3 | h = τ = 1/4       | 4.2425×10⁻³  | —
    0.3 | h = 1/8, τ = 1/64 | 3.3294×10⁻⁴  | 3.69
    0.4 | h = τ = 1/2       | 5.7375×10⁻³  | —
    0.4 | h = 1/4, τ = 1/32 | 5.6374×10⁻⁴  | 3.74
    0.4 | h = τ = 1/4       | 4.1264×10⁻³  | —
    0.4 | h = 1/8, τ = 1/64 | 3.1652×10⁻⁴  | 3.73
    0.5 | h = τ = 1/2       | 8.0431×10⁻³  | —
    0.5 | h = 1/4, τ = 1/32 | 5.4355×10⁻⁴  | 3.89
    0.5 | h = τ = 1/4       | 3.9754×10⁻³  | —
    0.5 | h = 1/8, τ = 1/64 | 2.8664×10⁻⁴  | 3.80
    0.6 | h = τ = 1/2       | 8.1463×10⁻³  | —
    0.6 | h = 1/4, τ = 1/32 | 5.4103×10⁻⁴  | 3.91
    0.6 | h = τ = 1/4       | 4.2081×10⁻³  | —
    0.6 | h = 1/8, τ = 1/64 | 3.0412×10⁻⁴  | 3.79
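Equation (10) can be checked directly against the tabulated errors. As an illustration (not part of the paper's code), the α = 0.1 pair from Table 1 gives:

```python
import math

def c2_order(err_coarse, err_fine):
    """C2-order = log2( L_inf(16*tau, 2h) / L_inf(tau, h) ), equation (10)."""
    return math.log2(err_coarse / err_fine)

# alpha = 0.1 entries of Table 1: h = tau = 1/2 versus h = 1/4, tau = 1/32
order = c2_order(8.0430e-3, 5.4351e-4)
```

This reproduces the tabulated value of about 3.88, close to the theoretical spatial order of 4.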

Table 2. C2-order of convergence for Example 1

    α   | h, τ              | Max error    | C2-order
    0.3 | h = τ = 1/2       | 3.4333×10⁻²  | —
    0.3 | h = 1/4, τ = 1/32 | 3.6048×10⁻³  | 3.25
    0.3 | h = τ = 1/4       | 2.1545×10⁻²  | —
    0.3 | h = 1/8, τ = 1/64 | 1.8470×10⁻³  | 3.54
    0.4 | h = τ = 1/2       | 3.9189×10⁻²  | —
    0.4 | h = 1/4, τ = 1/32 | 3.7938×10⁻³  | 3.36
    0.4 | h = τ = 1/4       | 2.3970×10⁻²  | —
    0.4 | h = 1/8, τ = 1/64 | 1.9514×10⁻³  | 3.61
    0.5 | h = τ = 1/2       | 4.1847×10⁻²  | —
    0.5 | h = 1/4, τ = 1/32 | 3.9598×10⁻³  | 3.40
    0.5 | h = τ = 1/4       | 2.5281×10⁻²  | —
    0.5 | h = 1/8, τ = 1/64 | 2.0628×10⁻³  | 3.61
    0.6 | h = τ = 1/2       | 4.1556×10⁻²  | —
    0.6 | h = 1/4, τ = 1/32 | 4.0642×10⁻³  | 3.35
    0.6 | h = τ = 1/4       | 2.5052×10⁻²  | —
    0.6 | h = 1/8, τ = 1/64 | 2.1386×10⁻³  | 3.55

Table 3. The number of iterations, maximum error and average error for α = 0.5 for Example 1

    τ    | h    | Iterations | Maximum error | Average error
    1/8  | 1/8  | 44         | 2.3103×10⁻³   | 1.2109×10⁻³
    1/16 | 1/16 | 45         | 1.1632×10⁻³   | 5.4651×10⁻⁴
    1/24 | 1/24 | 58         | 7.7792×10⁻⁴   | 3.5271×10⁻⁴
    1/30 | 1/30 | 56         | 6.2549×10⁻⁴   | 2.7843×10⁻⁴
    1/36 | 1/36 | 53         | 5.2568×10⁻⁴   | 2.3054×10⁻⁴

Table 4. The number of iterations, maximum error and average error for α = 0.5 for Example 2

    τ    | h    | Iterations | Maximum error | Average error
    1/10 | 1/10 | 53         | 1.2428×10⁻²   | 8.8849×10⁻³
    1/18 | 1/18 | 52         | 7.1213×10⁻³   | 3.6917×10⁻³
    1/50 | 1/22 | 55         | 2.6595×10⁻³   | 1.3580×10⁻³
    1/65 | 1/35 | 58         | 2.0605×10⁻³   | 1.0285×10⁻³

REFERENCES

[1] Du, R. Cao, W.R. and Sun, Z.Z., A compact difference scheme for the fractional diffusion-wave equation, Applied Mathematical Modelling, 34, 10., 2998-3007, 2010.

[2] Miller, K.S. and Ross, B., An introduction to the fractional calculus and fractional differential equations, Wiley- Interscience, 1993.

[3] Oldham, K.B. and Spanier, J., The fractional calculus, Academic Press, New York, 1974.

[4] Khan, M.A., Ullah, S., Ali, M. and Hj, N., Application of Optimal Homotopy Asymptotic Method to Some Well-Known Linear and Nonlinear Two-Point Boundary Value Problems, International Journal of Differential Equations, 1-11, 2018.

Figure 3. Absolute error = |exact − approximate| for Example 1, where h = τ = 1/25 and α = 0.5.

[5] Lin, Y. and Xu, C., Finite difference/spectral approximations for the time-fractional diffusion equation, Journal of Compu- tational Physics, 225 2, 1533-1552, 2007.

[6] Mainardi, F., Luchko, Y. and Pagnini, G., The fundamental solution of the space-time fractional diffusion equation, arXiv preprint cond-mat/0702419, 2007.

[7] Jiang, Y. and Ma, J., High-order finite element methods for time-fractional partial differential equations, Journal of Com- putational and Applied Mathematics, 235, 11., 3285-3290, 2011.

[8] Ford, N.J., Xiao, J. and Yan, Y., A finite element method for time fractional partial differential equations, Fractional Calculus and Applied Analysis, 14 3, 454-474, 2011.

Figure 4. Absolute error = |exact − approximate| for Example 2, where h = τ = 1/25 and α = 0.5.

[9] Liu, F., Zhuang, P., Turner, I., Burrage, K. and Anh, V., A new fractional finite volume method for solving the fractional diffusion equation, Applied Mathematical Modelling, 38, 15-16, 3871-3878, 2014.

4  Conclusions

We have solved the two-dimensional time fractional sub-diffusion equation with a higher-order compact Crank-Nicolson scheme using the Caputo derivative. It is observed that, by decreasing the grid size, the maximum norm (L∞) error reduces quite significantly, which shows that our proposed scheme is accurate and reliable. The theoretical order of convergence for the scheme was proven to be O(τ^(3−α) + h⁴). The C2 computational orders of the spatial accuracy of the numerical results have also been shown to be in agreement with the theoretical spatial accuracy.

[10] Gresho, P.M., Chan, S.T., Lee, R.L. and Upson, C.D., A modified finite element method for solving the time-dependent, incompressible Navier-Stokes equations, Part 1: Theory, International Journal for Numerical Methods in Fluids, 4, 6, 557-598, 1984.

[11] Orlande, H.R., Özışık, M.N., Colaço, M.J. and Cotta, R.M., Finite difference methods in heat transfer, CRC Press, 2017.

[12] Zhuang, P. and Liu, F., Finite difference approximation for two-dimensional time fractional diffusion equation, Journal of Algorithms and Computational Technology, 1, 1, 1-16, 2007.

[13] Balasim, A.T. and Ali, N.H.M., A rotated Crank-Nicolson iterative method for the solution of two-dimensional time-fractional diffusion equation, Indian Journal of Science and Technology, 8, 32, 2015.

[14] Cui, M., Convergence analysis of high-order compact alternating direction implicit schemes for the two-dimensional time fractional diffusion equation, Numerical Algorithms, 62, 3, 383-409, 2013.

[15] Abbaszadeh, M. and Mohebbi, A., A fourth-order compact solution of the two-dimensional modified anomalous fractional sub-diffusion equation with a nonlinear source term, Computers and Mathematics with Applications, 66, 8, 1345-1359, 2013.

Mathematics and Statistics 8(2A): 58-62, 2020 http://www.hrpub.org DOI: 10.13189/ms.2020.081310

Hybrid Flow-Shop Scheduling (HFS) Problem Solving with Migrating Birds Optimization (MBO) Algorithm

Yona Eka Pratiwi, Kusbudiono*, Abduh Riski, Alfian Futuhul Hadi

Department of Mathematics, The University of Jember, Indonesia

Received July 21, 2019; Revised September 25, 2019; Accepted February 20, 2020

Copyright©2020 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License

Abstract  The increasingly rapid industrial development has resulted in increasingly intense competition between industries. Companies are required to maximize performance in various fields, especially by meeting customer demand with the agreed timeliness. Scheduling is the allocation of resources over time to produce a collection of jobs. PT. Bella Agung Citra Mandiri is a manufacturing company engaged in making spring beds. The work stations in the company consist of five stages: ram per with three machines, klem per with one machine, mattress firing with two machines, mattress sewing with three machines, and packing with one machine. The model problem solved in this study is Hybrid Flow-Shop Scheduling, and the optimization method used to solve it is the metaheuristic Migrating Birds Optimization. To avoid the problems faced by the company, scheduling is needed that minimizes the makespan while paying attention to the number of parallel machines.
The results of this study are schedules for 16 jobs and for 46 jobs. The decrease in makespan for 16 jobs saves 26 minutes 39 seconds, while for 46 jobs it saves 3 hours 31 minutes 39 seconds.

Keywords  MBO, HFS, Metaheuristic, Deviation Percentage Algorithm, Makespan

1. Introduction

Mathematics is a part of science that plays an important role in the world of technology and companies. Industry is growing rapidly along with increasingly modern technological sophistication, causing the level of competition between industries to tighten, so companies must maximize performance in various fields. One of them is to fulfill customer demand within the agreed timelines. This aspect is related to the production scheduling system.

Scheduling is a process of allocating resources, especially machines that are limited, to complete several different jobs. Scheduling serves to optimize the completion time of jobs on a limited number of machines, so that the production target can be met and the company can determine the right scheduling system in line with the competitive situation and the strategy implemented. One of the production scheduling systems in industry is the flow shop scheduling system.

The simplest flow shop scheduling is flow shop scheduling on a single production line, with only one machine for each stage of operation. The problem that often arises in some companies, however, is flow shop scheduling on parallel production lines, namely hybrid flow shop scheduling, with one or more stages that have more than one machine. Hybrid flow shop scheduling is a sophisticated kind of flow shop consisting of various stages of production processes, with materials processed in the same direction. One of the essential goals in scheduling is to increase the efficiency and effectiveness of production by minimizing the makespan so as to meet consumer demand on time; therefore, a solution is needed by creating an effective scheduling system.

In the development of mathematics, optimization methods for solving hybrid flow shop scheduling keep growing. In the search for efficient and comprehensive solutions, metaheuristic methods use mechanisms that mimic social behavior or strategies that exist in nature. Although there is no guarantee that the answer found is the optimal solution, a well-built metaheuristic method can provide a solution that approaches the optimal one [1].

For the HFS problem, one of the optimization methods is the Migrating Birds Optimization (MBO) algorithm. It is a metaheuristic algorithm capable of solving QAP (Quadratic Assignment Problem) problems [2]. We are therefore interested in implementing the Migrating Birds Optimization algorithm on another problem, namely Hybrid

Mathematics and Statistics 8(2A): 58-62, 2020 59

Flow Shop Scheduling, in order to assess the effectiveness and efficiency of the Migrating Birds Optimization algorithm in terms of the smallest makespan. Based on the description above, the interesting question we examine is how to implement the Migrating Birds Optimization algorithm for hybrid flow shop scheduling and what solutions it provides. We then compare the results, to apply and understand the solutions provided by the Migrating Birds Optimization algorithm for hybrid flow shop scheduling. The benefits of this work are thus an optimal solution, namely a schedule with a minimum makespan value, and information about the Migrating Birds Optimization algorithm and hybrid flow shop scheduling for readers.

Scheduling is a plan for arranging the order of work and allocating resources, both time and facilities, to each process required to complete an operation. Scheduling is the first step in planning the entire production process, because it is a very important activity there. Scheduling problems arise when, at a certain stage, several jobs must be completed at the same time but the number of machines or other production facilities is limited. The way to overcome this problem is to schedule the work so as to obtain the most optimal work order.

In the production process, there are three elements of scheduling: job, operation, and machine. A job is a task to be completed; by completing it we obtain a product. Operations are the parts of the process of a job; completing a job requires work operations. Machines are the resources needed to complete jobs. Given these elements, the purposes of scheduling are to increase the productivity of the machines, reduce the inventory of semi-finished goods, reduce work delays, and reduce the total time required.

Flow shop scheduling is production process scheduling in which each of the n jobs has the same sequence of production processes through the same m machines [3]. Each job passes through the machines in the same order, each job is processed on at most one machine at a time, and each machine processes at most one job at a time. The flow shop scheduling model is described by a set of machines M = {1, 2, ..., m} used to process a set of jobs N = {1, 2, ..., n} [4]. Each machine can only process one job at a time, and each job visits each stage once.

Hybrid flow shop scheduling is a development of the flow shop scheduling problem in which there are parallel machines at each stage. A hybrid flow shop is a system consisting of various stages of production and material processes. The hybrid flow shop processes these stages in a directional flow, where at least one stage of the production process has identical machines arranged in parallel [5]. In the HFS scheduling system, each stage can consist of one or more identical machines [6].

A Lower Bound (LB) can compensate for the absence of information about the value of the optimal solution. Makespan is a standard criterion for HFS problems. The Lower Bound value can be calculated using equation (1):

LB = \max_{1 \le i \le k} \Big\{ \min_{j \in J} \sum_{l=1}^{i-1} t_{lj} + \max\{M_1(i), M_2(i)\} + \min_{j \in J} \sum_{l=i+1}^{k} t_{lj} \Big\}    (1)

where

M_1(i) = \Big\lceil \frac{1}{m_i} \sum_{j \in J} t_{ij} \cdot size_{ij} \Big\rceil,

M_2(i) = \sum_{j \in A_i} t_{ij} + \Big\lceil \frac{1}{m_i} \sum_{j \in B_i} t_{ij} \cdot size_{ij} \Big\rceil,

A_i = \{ j \mid size_{ij} > m_i/2 \}, \quad B_i = \{ j \mid size_{ij} = m_i/2 \}.

To compare the makespan with the Lower Bound, we calculate the percentage deviation, or Deviation Percentage Algorithm (PDA), as follows:

PD_A(l) = 100\% \times \frac{C_{\max}(l) - LB}{LB}

The Migrating Birds Optimization (MBO) algorithm is a nature-inspired metaheuristic based on the "V"-shaped formation of migrating birds, which has proven to be an effective formation for saving energy [2]. Solving optimization problems with the MBO algorithm [7] proceeds as follows:

1. Initialization of the initial population. The initial positioning of the population in the MBO algorithm is random; each individual is a permutation solution in the search space for the optimal value.

2. Neighborhood and solution sharing. The flocks of birds on the right and left sides of the leading bird represent neighboring solutions. The neighboring solutions are evaluated over the iterations to produce the best values; to improve the current solution, we use the best neighboring solution obtained in the previous step. The number of neighboring solutions generated and the number of shared neighboring solutions are given as parameters, determined as follows:

k ∈ {3, 5, 7, ...}
x ∈ {1, 2, 3, ..., (k − 1)/2}
n = k − x

where
n = the number of neighboring solutions excluding sharing,
k = the number of neighboring solutions that must be considered,
x = the number of neighboring solutions to share with the

60 Hybrid Flow-Shop Scheduling (HFS) Problem Solving with Migrating Birds Optimization (MBO) Algorithm

next solution.

To generate a neighboring solution for the hybrid flow shop problem, we use the SWAP method, often known as a swapping permutation. We swap by generating two random numbers i and j, which indicate position i and position j; the swap exchanges the job in position i with the job in position j.

3. Termination criterion. The iteration process stops when the stopping criterion is reached. The stopping criterion of the Migrating Birds Optimization (MBO) algorithm is that the number of iterations carried out has reached the maximum iteration.

2. Materials and Methods

The data used in this research are secondary data taken from PT Bella Agung Citra Mandiri. The research steps are as follows: collect literature related to the MBO algorithm in order to know and understand it, then collect the data needed for the research from the company. The company has five stages for each size and type of mattress, with several different machines. Next, we apply the MBO algorithm to hybrid flow shop scheduling. After that come the writing of programs and program simulations using two data sets, namely the 16-job data and the 46-job data, followed by analyzing the results of the program and comparing the makespan of the program with the company's original makespan to find the value of the PDA, from which we draw conclusions. The flowchart of the research steps is shown in Figure 1.

[Figure 1. Flowchart of the research steps]

In this study, we conducted experiments on the data by comparing makespans on two data sets, 5 stages with 16 jobs and 5 stages with 46 jobs, under different maximum iterations. The parameter values used are the population parameter and the number of neighboring solutions, combined over 5, 15, and 25 in each experiment; the sharing parameter is varied. We test each parameter combination with a maximum iteration of 500 and of 1000, and for each combination the program runs ten times.

The results we obtained from the MBO algorithm on hybrid flow shop scheduling problems indicate that the parameter x greatly influences how close the solution comes to optimal. When the sharing parameter equals 1, the solutions we obtain come closer to optimal and convergence is relatively faster, but the computation time is relatively longer. This is because the smaller the value of the sharing parameter, the more non-shared neighboring solutions we generate, so there are more candidates for new solutions than before, and thus a higher chance of reaching an optimal solution. Because more non-shared neighboring solutions are generated, the associated calculations also increase, so the computation time gets longer.

In this hybrid flow shop scheduling problem, the MBO algorithm helps solve problems that exist in a company. Based on the tests on the small data set and the large company data set, convergence and running time are affected by the parameters used. In general, the parameters used in the MBO algorithm influence the resulting makespan for each job set. Looking at the average results, however, the population used strongly influences convergence and running time: we obtain the optimal solution quickly if the generated population (m) is large, the number of neighboring solutions considered (k) is relatively large, and the number shared is smallest.

After obtaining the most optimal parameter values, we then run the final simulation on all the data (16 jobs and 46 jobs) with these parameter values. The parameters are


population number (m) = 75, number of neighboring solutions (k) = 25, and sharing (x) = 1, with a maximum iteration of 1000. The results of applying the hybrid flow shop scheduling solution to the data from PT Bella Agung Citra Mandiri are shown in Table 1 and Table 2.

With the company data of 16 jobs for mattress production, we get a makespan of 15655, which requires a production time of 5 hours 21 minutes 5 seconds with the job sequence 1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16, while the optimal makespan produced by the program is 14066. Therefore, we can see that Hybrid Flow Shop scheduling with the Migrating Birds Optimization algorithm reduces the time by 26 minutes 29 seconds.
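Since the makespan values are reported in seconds, the stated saving can be checked with a few lines of arithmetic (a small helper of our own, not part of the paper's program):

```python
def hms(seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return h, m, s

# Company makespan minus the MBO makespan for the 16-job data:
saving = 15655 - 14066            # 1589 seconds
print(hms(saving))                # (0, 26, 29), i.e. 26 minutes 29 seconds
```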

Table 1. Final simulation data for 5 stages, 16 jobs, with 1000 iterations

Experiment   Makespan   Convergent Iteration   Computation Time (s)   PDA (%)
    1         14066              5                  606.6423          21.3109
    2         14066              5                  608.0631          21.3109
    3         14066             11                  620.0266          21.3109
    4         14066              6                  611.3195          21.3109
    5         14066              6                  612.4782          21.3109
    6         14066              7                  611.3553          21.3109
    7         14066             13                  613.6233          21.3109
    8         14066              4                  612.9036          21.3109
    9         14066              9                  611.6245          21.3109
   10         14066             10                  614.0748          21.3109
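The PDA column in Table 1 is the percentage deviation of each makespan from the lower bound of equation (1). A minimal Python sketch of both quantities follows; the variable names and the default size_ij = 1 (each job occupying a single machine per stage) are our assumptions for illustration, not data from the paper:

```python
import math

def lower_bound(t, m, size=None):
    """Equation (1): a makespan lower bound for a k-stage hybrid flow shop.

    t[i][j]   : processing time of job j at stage i
    m[i]      : number of identical parallel machines at stage i
    size[i][j]: machines required by job j at stage i (assumed 1 if omitted)
    """
    k, n = len(t), len(t[0])
    if size is None:
        size = [[1] * n for _ in range(k)]
    best = 0
    for i in range(k):
        # shortest way any job can reach stage i, and leave it afterwards
        head = min(sum(t[l][j] for l in range(i)) for j in range(n))
        tail = min(sum(t[l][j] for l in range(i + 1, k)) for j in range(n))
        # workload bounds M1(i) and M2(i) for the parallel machines at stage i
        m1 = math.ceil(sum(t[i][j] * size[i][j] for j in range(n)) / m[i])
        a = [j for j in range(n) if size[i][j] > m[i] / 2]
        b = [j for j in range(n) if size[i][j] == m[i] / 2]
        m2 = (sum(t[i][j] for j in a)
              + math.ceil(sum(t[i][j] * size[i][j] for j in b) / m[i]))
        best = max(best, head + max(m1, m2) + tail)
    return best

def pda(c_max, lb):
    """Deviation Percentage: how far a makespan sits above the lower bound."""
    return 100.0 * (c_max - lb) / lb
```

For a two-stage, two-job toy instance, `lower_bound([[3, 2], [4, 5]], [1, 1])` returns 11; `pda(c_max, lb)` then reproduces entries like the PDA column of the tables once the authors' lower bound is known.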

Table 2. Final simulation data for 5 stages, 46 jobs, with 1000 iterations

Experiment   Makespan   Convergent Iteration   Computation Time (s)   PDA (%)
    1         35430            209                 1726.3200           3.2042
    2         35529            133                 1743.2942           3.4926
    3         35302            148                 1724.8602           2.8313
    4         35430            209                 1799.0247           3.2042
    5         35482            136                 1718.3556           3.3557
    6         35330            158                 1767.2683           2.9129
    7         35297            165                 1740.2812           2.8168
    8         35430            209                 1794.7720           3.2042
    9         35529            133                 1730.5848           3.4926
   10         35313            249                 1729.5146           2.8634
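Each makespan in these tables comes from simulating a job sequence through the five stages with their parallel machines. The paper does not print its evaluation routine, so the following is only a generic list-scheduling sketch under our own assumptions: jobs keep the sequence order at every stage (a permutation schedule), and each job takes the earliest-free one of the identical machines:

```python
import heapq

def hfs_makespan(sequence, t, m):
    """Makespan of processing `sequence` through a hybrid flow shop.

    sequence : job indices in processing order
    t[i][j]  : processing time of job j at stage i
    m[i]     : number of identical parallel machines at stage i
    """
    k = len(t)
    ready = {j: 0 for j in sequence}      # when each job can enter the next stage
    for i in range(k):
        free = [0] * m[i]                 # next-free times of the stage's machines
        heapq.heapify(free)
        for j in sequence:                # jobs keep the same order at every stage
            start = max(heapq.heappop(free), ready[j])
            ready[j] = start + t[i][j]
            heapq.heappush(free, ready[j])
    return max(ready.values())
```

For example, `hfs_makespan([0, 1], [[3, 2], [4, 5]], [1, 1])` evaluates a two-stage, single-machine flow shop and returns 12.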

Table 3. Final simulation data for 5 stages, 46 jobs, with 500 iterations

Experiment   Makespan   Convergent Iteration   Computation Time (s)   PDA (%)
    1         35395             25                 4459.7560           3.1022
    2         35278             82                 4378.4267           2.7614
    3         35299            105                 4297.5180           2.8226
    4         35258            132                 4517.6693           2.7032
    5         35302            182                 4472.4839           2.8313
    6         35419             82                 4352.4709           3.1722
    7         35507             79                 4527.8675           3.4285
    8         35506            119                 4328.3257           3.4560
    9         35258            132                 4545.4647           2.7032
   10         35365             63                 4383.7481           3.0149
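The MBO loop described in the introduction (random permutation population, k SWAP neighbors, x shared neighbors, stop at the maximum iteration) can be sketched roughly as below. This is not the authors' program: the V formation is collapsed into a single benefit-sharing line for brevity, the leader-rotation period is our assumption following the general MBO scheme of Duman et al. [2], and the cost function is a placeholder for the makespan routine:

```python
import random

def swap_neighbor(seq):
    """SWAP move: exchange the jobs at two random positions."""
    s = list(seq)
    i, j = random.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def mbo(cost, n_jobs, m_pop=25, k=5, x=1, max_iter=1000, leader_period=10):
    """Minimise `cost` over job permutations with a basic MBO loop."""
    flock = [random.sample(range(n_jobs), n_jobs) for _ in range(m_pop)]
    best = min(flock, key=cost)
    for it in range(1, max_iter + 1):
        shared = []                         # neighbors passed down the line
        for b in range(m_pop):
            n_own = k if b == 0 else k - x  # the leader makes all k itself
            cand = [swap_neighbor(flock[b]) for _ in range(n_own)] + shared
            cand.sort(key=cost)
            if cost(cand[0]) < cost(flock[b]):
                flock[b] = cand[0]
            shared = cand[1:1 + x]          # hand the next-best x onward
        best = min([best] + flock, key=cost)
        if it % leader_period == 0:         # rotate the tired leader back
            flock.append(flock.pop(0))
    return best, cost(best)
```

With a toy cost such as `lambda p: sum(abs(v - i) for i, v in enumerate(p))`, the loop drives the permutations toward the identity; for the scheduling problem, the cost would be a makespan evaluation of the job sequence.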


From the results of the final simulation above, for the 5-stage, 16-job data we can call the solution optimal, because all ten runs produced the same makespan value. For the 5-stage, 46-job data, on the other hand, there is still the possibility of finding a better solution. To find the best makespan for these data, we enlarge the population parameter and the number of neighboring solutions: the parameters we use are population (m) = 125, number of neighboring solutions (k) = 75, and sharing (x) = 1, with an iteration maximum of 500. The results of this study are shown in Table 3.

For the HFS scheduling problem on the 5-stage, 46-job data, the MBO algorithm gives the best makespan of 35258, converging at iteration 132; the computing time is 4517.6693 seconds and the PDA is 2.7032%. By comparison, the production time of the company's mattresses is 1 day 1 hour 47 minutes 38 seconds, with the order of jobs 10-20-46-45-19-40-44-37-39-41-35-18-27-43-5-23-14-2-36-38-31-21-29-13-34-42-3-26-16-22-9-8-1-30-11-24-28-33-4-12-6-7-32-25-15-17.

3. Conclusions

Based on the results and our discussion, we conclude the following:

1. The Migrating Birds Optimization algorithm can help us solve HFS scheduling problems. We obtain the makespan by entering the time, size, and machine data for the stages, with several parameter combinations, including population (m) = 5, 15, 25; neighboring solutions (k) = 5, 15, 25; sharing value (x) = 1, 2, 4, 7, 8, 12; and maximum iterations = 500, 1000.

2. The Migrating Birds Optimization algorithm is effective, because it produces a makespan value that approaches the optimal value based on the average percentage deviation. For the 16-job, 5-stage data the best makespan is 14066 seconds, and for the 46-job, 5-stage data it is 35258 seconds. The parameters for the 16-job, 5-stage data are population (m) = 75, neighboring solutions (k) = 25, sharing value (x) = 1, imax = 1000; for the 46-job, 5-stage data they are population (m) = 125, neighboring solutions (k) = 75, sharing value (x) = 1, imax = 500.

Acknowledgements

The University of Jember supported this research. For the preparation of this paper, we thank all members of the Pemodelan Matematika Research Group and all members of the Mathematics Modelling and Computation Laboratory, Department of Mathematics, UNEJ.

REFERENCES

[1] F. S. Hillier, G. J. Lieberman. Introduction to Operations Research, Ninth edition, Stanford University, 2010.

[2] E. Duman, M. Uysal, A. F. Alkaya. Migrating Birds Optimization: a new metaheuristic approach and its performance on quadratic assignment problem, Information Sciences, Vol.217, 65-77, 2012.

[3] K. Baker. Introduction to Sequencing and Scheduling, John Wiley and Sons Inc, New York, 1997.

[4] S. R. Hejazi, S. Saghafian. Flowshop-Scheduling Problems with Makespan Criterion: A Review, International Journal of Production Research, Vol.43, No.14, 2895-2929, 2005.

[5] T. Uetake, H. Tsubone, M. Ohba. A Production Scheduling System in a Hybrid Flow Shop, International Journal of Production Economics, Vol.41, 395-398, 1995.

[6] F. S. Serifoglu, G. Ulusoy. Multiprocessor Task Scheduling in Multistage Hybrid Flow Shops: A Genetic Algorithm Approach, Journal of the Operational Research Society, Vol.55, No.5, 504-512, 2004.

[7] E. Ulker, V. Tongur. Migrating birds optimization (MBO) algorithm to solve knapsack problem, Procedia Computer Science, Vol.111, 71-76, 2017.


Editorial Board

Dshalalow Jewgeni, Florida Inst. of Technology, USA
Jiafeng Lu, Zhejiang Normal University, China
Nadeem-ur Rehman, Aligarh Muslim University, India
Debaraj Sen, Concordia University, Canada
Mauro Spreafico, University of São Paulo, Brazil
Veli Shakhmurov, Okan University, Turkey
Antonio Maria Scarfone, National Research Council, Italy
Liang-yun Zhang, Nanjing Agricultural University, China
Ilgar Jabbarov, Ganja State University, Azerbaijan
Mohammad Syed Pukhta, Sher-e-Kashmir University, India
Vadim Kryakvin, Southern Federal University, Russia
Rakhshanda Dzhabarzadeh, National Academy of Science of Azerbaijan, Azerbaijan
Sergey Sudoplatov, Sobolev Institute of Mathematics, Russia
Birol Altin, Gazi University, Turkey
Araz Aliev, Baku State University, Azerbaijan
Francisco Gallego Lupianez, Universidad Complutense de Madrid, Spain
Hui Zhang, St. Jude Children's Research Hospital, USA
Yusif Abilov, Odlar Yurdu University, Azerbaijan
Evgeny Maleko, Magnitogorsk State Technical University, Russia
İmdat İşcan, Giresun University, Turkey
Emanuele Galligani, University of Modena and Reggio Emilia, Italy
Mahammad Nurmammadov, Baku State University, Azerbaijan

Contact Us
Horizon Research Publishing
2880 ZANKER RD STE 203
SAN JOSE, CA 95134
USA
Email: [email protected]

Table of Contents

Volume 8 Number 2A 2020

The Performance of Different Correlation Coefficient under Contaminated Bivariate Data Bahtiar Jamili Zaini, Shamshuritawati Sharif ...... 1

Approximate Analytical Solutions of Nonlinear Korteweg-de Vries Equations Using Multistep Modified Reduced Differential Transform Method Che Haziqah Che Hussin, Ahmad Izani Md Ismail, Adem Kilicman, Amirah Azmi ...... 9

Bayesian Estimation in Piecewise Constant Model with Gamma Noise by Using Reversible Jump MCMC Suparman ...... 17

Weakly Special Classes of Modules Puguh Wahyu Prasetyo, Indah Emilia Wijayanti, Halina France-Jackson, Joe Repka ...... 23

Markov Chain: First Step towards Heat Wave Analysis in Malaysia Nur Hanim Mohd Salleh, Husna Hasan, Fariza Yunus ...... 28

Robust Method in Multiple Linear Regression Model on Diabetes Patients Mohd Saifullah Rusiman, Siti Nasuha Md Nor, Suparman, Siti Noor Asyikin Mohd Razali ...... 36

An Alternative Approach for Finding Newton's Direction in Solving Large-Scale Unconstrained Optimization for Problems with an Arrowhead Hessian Matrix Khadizah Ghazali, Jumat Sulaiman, Yosza Dasril, Darmesah Gabda ...... 40

Parameter Estimations of the Generalized Extreme Value Distributions for Small Sample Size Razira Aniza Roslan, Chin Su Na, Darmesah Gabda ...... 47

Fourth-order Compact Iterative Scheme for the Two-dimensional Time Fractional Sub-diffusion Equations Muhammad Asim Khan, Norhashidah Hj. Mohd Ali ...... 52

Hybrid Flow-Shop Scheduling (HFS) Problem Solving with Migrating Birds Optimization (MBO) Algorithm Yona Eka Pratiwi, Kusbudiono, Abduh Riski, Alfian Futuhul Hadi ...... 58
