Software Reliability Prediction Via Relevance Vector Regression
Neurocomputing 186 (2016) 66–73
journal homepage: www.elsevier.com/locate/neucom
http://dx.doi.org/10.1016/j.neucom.2015.12.077

Jungang Lou a,b, Yunliang Jiang b,*, Qing Shen b, Zhangguo Shen b, Zhen Wang c, Ruiqin Wang b

a Institute of Cyber-Systems and Control, Zhejiang University, 310027 Hangzhou, China
b School of Information Engineering, Huzhou University, 313000 Huzhou, China
c College of Computer Science and Technology, Shanghai University of Electric Power, 200090 Shanghai, China

* Corresponding author. E-mail addresses: [email protected] (J. Lou), [email protected] (Y. Jiang), [email protected] (Q. Shen), [email protected] (Z. Shen), [email protected] (Z. Wang), [email protected] (R. Wang).

Article history: Received 21 September 2015; received in revised form 27 November 2015; accepted 9 December 2015; available online 6 January 2016. Communicated by Liang Wang.

Abstract

The aim of software reliability prediction is to estimate future occurrences of software failures to aid in maintenance and replacement. Relevance vector machines (RVMs) are kernel-based learning methods that have been successfully adopted for regression problems; however, they have not been widely explored for use in reliability applications. This study employs an RVM-based model for software reliability prediction so as to capture the inner correlation between a software failure time and the nearest m preceding failure times. We present a comparative analysis in order to evaluate the RVM's effectiveness in forecasting time-to-failure for software products. In addition, we use the Mann–Kendall test to explore the trend of predictive accuracy as m varies. A reasonable value range for m is obtained through paired T-tests on 10 frequently used failure datasets from real software projects.

Keywords: Software reliability model; Relevance vector machine; Mann–Kendall test; Paired T-test

© 2016 Elsevier B.V. All rights reserved.

1. Introduction

In the modern world, computers are used for many different applications, and research on software reliability has become increasingly essential. Software reliability describes the probability that software will operate without failure under given environmental conditions during a specified period of time [1]. To date, software reliability models are among the most important tools in software reliability assessment [2]. Most existing software reliability models, known as parametric models, depend on a priori assumptions about software development environments, the nature of software failures, and the probability of individual failures occurring. Parametric models may exhibit different predictive capabilities across different software projects [3–8], and researchers have found it almost impossible to develop a parametric model that provides accurate predictions under all circumstances.

To address this problem, several alternative solutions have been introduced over the last decade. One possible solution is to employ artificial neural networks (ANNs) [9–17]. Karunanithi et al., Dohi et al., Cai et al., Ho et al., Tian and Noore, and Hu et al. used both classical and recurrent multi-layer perceptron neural networks to forecast software reliability. ANNs have proven to be universal approximators for any nonlinear continuous function with arbitrary accuracy. Consequently, they represent an alternative method for software reliability modeling and prediction. Unlike traditional statistical models, ANNs are data-driven, nonparametric weak models [9,11,13,14]. ANN-based software reliability models require only failure history as input, and they can predict future failures more accurately than some commonly used parametric models. However, ANNs suffer from a number of weaknesses, including the need for numerous controlling parameters, difficulty in obtaining a stable solution, and a tendency to overfit.

A novel type of learning machine, the kernel machine (KM), is emerging as a powerful modeling tool, and it has received increasing attention in the domain of software reliability prediction. Kernel-based models can achieve better predictive accuracy and generalization performance, thus arousing the interest of many researchers [18–22]. Generally speaking, KMs have been successfully applied to regression, with remarkable training results even given a relatively small dataset D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\} \subset R^d \times R, where the x_t are input vectors, the y_t are outputs, t = 1, 2, \ldots, l, d is the dimension of x_t, and l is the number of observed input/output pairs [18].

Examples of KMs include support vector machines (SVMs) and relevance vector machines (RVMs). Vapnik [18] developed SVMs with the goal of minimizing the upper bound of the generalization error, consisting of the sum of the training error and a confidence interval, which appears to be less computationally demanding. Tian and Noore [19] proposed an SVM-based model for software reliability prediction that embraces some remarkable characteristics of SVMs, including good generalization performance, absence of local minima, and sparse solution representation. Pai and Hong [20] and Yang and Li [21] also made efforts to develop SVM-based reliability models, showing that these models can achieve good prediction accuracy.
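To make the regression setting above concrete, a failure-time series can be cast into such a dataset D by taking the m most recent failure times as each input vector and the next failure time as the target. The following is a minimal sketch; the helper name, window length m, and toy data values are illustrative assumptions, not part of the paper:

```python
def make_windowed_dataset(failure_times, m):
    """Build (x_t, y_t) pairs: x_t holds the m most recent failure
    times, y_t is the failure time that follows them."""
    X, y = [], []
    for i in range(m, len(failure_times)):
        X.append(failure_times[i - m:i])  # input vector in R^m
        y.append(failure_times[i])        # target: next failure time
    return X, y

# Toy cumulative failure times (hypothetical units)
times = [2.5, 5.1, 8.0, 12.6, 18.3, 25.9]
X, y = make_windowed_dataset(times, m=3)
```

Each pair (X[i], y[i]) then plays the role of (x_t, y_t) in the dataset D fed to a kernel regression model.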
However, SVMs are sensitive to uncertainties because of their lack of probabilistic outputs, as well as the need to determine a regularization parameter and to select appropriate kernel functions to obtain optimal prediction accuracy [22–25].

This paper proposes a new data-driven approach that predicts software reliability using an RVM [23–27] to capture the uncertainty in software failure data and in predictions of present and future failure behavior. The RVM adopts kernel functions to project the input variables into a high-dimensional feature space in order to extract the latent information. Compared to the SVM, it uses fewer kernel functions and avoids the use of free parameters [28–30]. The kernel-based software reliability modeling process also requires choosing the number of past observations related to the future value. Some researchers suggest that failure behavior earlier in the testing process has less impact on later failures, and that therefore not all available failure data should be used in model training. However, to the best of our knowledge, such claims lack either theoretical support or experimental evidence. This study uses the Mann–Kendall test and paired T-test [31–33] to investigate the appropriate number of past observations related to the future value for RVM-based software reliability modeling.

The paper is organized as follows. After explaining the background of the research, Part 2 outlines the principle of RVM regression. Part 3 introduces the framework for software reliability prediction based on the RVM and describes how RVM regression can be used to predict software failure times. Part 4 discusses the process for building RVM-based software reliability models and presents the experimental datasets and the measures for evaluating predictability. Following that, Part 5 explains the Mann–Kendall test and paired T-test, demonstrates the detailed experimentation process, and analyzes the experimental results on the 10 datasets. Finally, Part 6 concludes the paper.

2. RVM for regression

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. The goal of SVM classification is to separate an n-dimensional data space (transformed using nonlinear kernels) by an (n-1)-dimensional hyperplane that creates the maximum separation (margin) between two classes. This technique can be extended to regression problems in the form of support vector regression. Regression is essentially an inverse classification problem where, instead of searching for a maximum-margin classifier, a minimum-margin fit needs to be found. However, SVMs are not well suited to software reliability prediction due to their lack of probabilistic outputs. Tipping [24–27] introduced the RVM, which makes probabilistic predictions and yet retains the excellent predictive performance of the SVM. The RVM approximation is:

t = y(x; w) = \sum_{i=1}^{M} w_i K(x, x_i) + w_0,  (1)

where the \{w_i\} are the parameters of the model, generally called weights, and K(\cdot, \cdot) is the kernel function. Assuming that each example from the data set has been generated independently (an often realistic assumption, although not always true), the likelihood of all the data is given by the product:

p(t \mid w, \sigma^2) = \prod_{i=1}^{N} N(t_i \mid y(x_i; w), \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\| t - \Phi w \|^2}{2\sigma^2} \right),

where w = [w_0, w_1, \ldots, w_N]^T, \Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^T, and \phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]^T.

Next we introduce a prior distribution over the parameter vector w. The key difference in the RVM is that we introduce a separate hyperparameter \alpha_i for each weight w_i instead of a single shared hyperparameter. Thus, the weight prior takes the form:

p(w \mid \alpha) = \prod_{i=0}^{N} \sqrt{\frac{\alpha_i}{2\pi}} \exp\left( -\frac{\alpha_i w_i^2}{2} \right), \quad \alpha = [\alpha_0, \alpha_1, \ldots, \alpha_N]^T.

Having defined the prior, Bayesian inference proceeds by computing, from Bayes' rule, the posterior over all unknowns given the data:

p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)}.  (2)

Then, given a new test point x_*, predictions are made for the corresponding target t_* in terms of the predictive distribution:

p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2.  (3)

We cannot compute the posterior p(w, \alpha, \sigma^2 \mid t) in (2) directly. Instead, we decompose it as:

p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t).

The posterior distribution over the weights is thus given by:

p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{\int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left( -\frac{(w - \mu)^T \Sigma^{-1} (w - \mu)}{2} \right),

where the posterior mean and covariance are, respectively:

\mu = \sigma^{-2} \Sigma \Phi^T t,
\Sigma = (A + \sigma^{-2} \Phi^T \Phi)^{-1},
A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N).

By integrating out the weights, we obtain the marginal likelihood for the
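For fixed hyperparameters, the posterior statistics \mu and \Sigma above can be computed in closed form. The following is a minimal sketch of that computation; the RBF kernel choice, toy data, and hyperparameter values are illustrative assumptions, and the marginal-likelihood re-estimation of \alpha and \sigma^2 is omitted:

```python
import numpy as np

def rvm_posterior(Phi, t, alpha, sigma2):
    """Posterior over RVM weights for fixed hyperparameters:
    Sigma = (A + sigma^-2 Phi^T Phi)^-1,  mu = sigma^-2 Sigma Phi^T t."""
    A = np.diag(alpha)
    Sigma = np.linalg.inv(A + Phi.T @ Phi / sigma2)
    mu = Sigma @ Phi.T @ t / sigma2
    return mu, Sigma

def rbf(a, b, gamma=0.5):
    """Illustrative RBF kernel K(a, b) = exp(-gamma * (a - b)^2)."""
    return np.exp(-gamma * (a - b) ** 2)

# Toy regression data (hypothetical, not from the paper's datasets)
x = np.array([0.0, 1.0, 2.0, 3.0])
t = np.sin(x)
N = len(x)

# Design matrix: bias column plus kernel columns, matching phi(x_n)
Phi = np.ones((N, N + 1))
for n in range(N):
    for i in range(N):
        Phi[n, i + 1] = rbf(x[n], x[i])

alpha = np.ones(N + 1)   # one hyperparameter per weight
mu, Sigma = rvm_posterior(Phi, t, alpha, sigma2=0.01)
pred = Phi @ mu          # posterior-mean fit at the training inputs
```

In the full algorithm, \alpha and \sigma^2 are then updated by maximizing the marginal likelihood, which drives many \alpha_i to infinity and prunes the corresponding basis functions, yielding the sparse "relevance vector" solution.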