Huang et al. BMC Bioinformatics 2019, 20(Suppl 8):290 https://doi.org/10.1186/s12859-019-2771-z
RESEARCH Open Access A new fruit fly optimization algorithm enhanced support vector machine for diagnosis of breast cancer based on high-level features Hui Huang1,Xi’an Feng1, Suying Zhou2, Jionghui Jiang3, Huiling Chen4*, Yuping Li5 and Chengye Li5*
From International Conference on Data Science, Medicine and Bioinformatics Wenzhou, China. 22- 24 June 2018
Abstract Background: It is of great clinical significance to develop an accurate computer aided system to accurately diagnose the breast cancer. In this study, an enhanced machine learning framework is established to diagnose the breast cancer. The core of this framework is to adopt fruit fly optimization algorithm (FOA) enhanced by Levy flight (LF) strategy (LFOA) to optimize two key parameters of support vector machine (SVM) and build LFOA-based SVM (LFOA-SVM) for diagnosing the breast cancer. The high-level features abstracted from the volunteers are utilized to diagnose the breast cancer for the first time. Results: In order to verify the effectiveness of the proposed method, 10-fold cross-validation method is used to make comparison among the proposed method, FOA-SVM (model based on original FOA), PSO-SVM (model based on original particle swarm optimization), GA-SVM (model based on genetic algorithm), random forest, back propagation neural network and SVM. The main novelty of LFOA-SVM lies in the combination of FOA with LF strategy that enhances the quality for FOA, thus improving the convergence rate of the FOA optimization process as well as the probability of escaping from local optimal solution. Conclusions: The experimental results demonstrate that the proposed LFOA-SVM method can beat other counterparts in terms of various performance metrics. It can very well distinguish malignant breast cancer from benign ones and assist the doctor with clinical diagnosis. Keywords: Support vector machine, Parameter optimization, Fruit fly optimization, Levy flight, Breast cancer diagnosis
Background all kinds of methods. Traditional diagnostic methods Breast cancer is the most common cancer and the mostly rely on clinicians’ personal experience and the leading cause of cancer death among females [1]. Early diagnostic results may be subjectivism with certain detection and diagnosis is the key to controlling the probability. In recent years, computational diagnostic disease and to improving the survival rate, and patho- tools and artificial intelligence techniques provide logical diagnosis is the most reliable gold standard of automated procedures for objective judgments by making use of quantitative measures and machine – * Correspondence: [email protected]; [email protected] learning techniques for medical diagnosis [2 11]. 4Department of Computer Science, Wenzhou University, Wenzhou 325035, Similarly, the methods based on artificial intelligence China technology for diagnosis of breast cancer have been 5Department of Pulmonary and Critical Care Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China proposed. Maglogiannis et al. [12]presentedusing Full list of author information is available at the end of the article support vector machine (SVM) for diagnosing the
© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Huang et al. BMC Bioinformatics 2019, 20(Suppl 8):290 Page 2 of 14
breast cancer both on Wisconsin Diagnostic Breast Based on these features, the pathological data of 470 cases Cancer and the Wisconsin Prognostic Breast Cancer were analyzed by two pathological experts. Then, we datasets. Kaya et al. [13] proposed a novel approach proposed a novel learning framework based on SVM based on rough set and extreme learning machine for for distinguishing malignant breast cancer from the distinguishing the benign or malignant breast cancer. healthy ones. As we all know, the two key parameters Akay et al. [14] proposed a novel SVM combined with in classic SVM are penalty factor and width of kernel feature selection for breast cancer diagnosis. The ex- function, which traditionally treated by means of grid perimental results indicate that the proposed method search and gradient descent. However, these methods can perform well in terms of accuracy, sensitivity and are easy to get into local optimal solutions. Recently, specificity. Given recent advances on digitized histo- some bio-inspired metaheuristic search algorithms logical studies, it is now possible to use histological (such as genetic algorithms (GA) [20–23], particle tissue patterns with artificial intelligence techniques- swarm optimization algorithms (PSO) [24–27], the fruit aided image analysis to facilitate disease classification fly optimization (FOA) [28], moth-flame optimization [15]. In general, accurate pathological diagnosis of (MFO) [29]) have made it easier to find the global breast cancer depends on features, which are extracted optimal solution. As a new member of the swarm- from histopathology images. There are a lot of works intelligence algorithms, FOA [30]isinspiredbythe for diagnosis of breast cancer based on histopathology foraging behavior of real fruit flies. The FOA has cer- images’ features. tain outstanding merits, such as a simple computational Kuse et al. [16] extracted texture features from the process, simple implementation, and easy understand- cells to train a SVM classier that is used to classify lym- ingwithonlyafewparametersfortuning.Duetoits phocytes and non-lymphocytes. Dundar et al. [17]pro- good properties, FOA has become a useful tool for posed to segment cell regions by clustering the pixel many real-world problems [10, 28, 31–33]. data and to identify individual cells by a watershed- Compared with gradient descent method and grid based segmentation algorithm, and a proposed MIL search method, like other swarm intelligence methods approach was used to identify the stage of breast le- [34, 35], FOA is a global optimization method, which sion. Sparks et al. [18]presentedaCBIRsystemthat can find the global optimal solution or approximate leveraged a novel set of explicit shape features which optimal solution more easily. However, the traditional accurately described the similarity between the FOA algorithm has the possibility of falling into the morphology of objects of interest. Basavanhally et al. local optimal solution for complex optimization prob- [19] presented a novel framework that classifies entire lems, and the convergence rate is not very ideal. images based on quantitative features extracted from Therefore, this paper introduces the Levy flight (LF) fields of view of varying sizes. In each FOV, cancer strategy to update the positions of fruit flies to further nuclei were automatically detected and used to con- improve its convergence speed, while reducing the struct graphs (Voronoi Diagram, Delaunay Triangu- probability of FOA falling into the local optimal. LF lation, Minimum Spanning Tree). Features describing strategy has been used widely to enhance the lots of spatial arrangement of the nuclei were extracted and metaheuristic algorithms [36–41]. The principle of LF used to train a boosted classifier that predicts image strategy can ensure the diversity of algorithms in the class for each FOV size. process of optimization [42–44] and improve the con- In all aforementioned works, an objective phenomenon vergence rate. In this study, the improved FOA can be found that these studies were usually con- method, LFOA, was utilized to optimize the two key ducted on the low-level features on image pixels and parameters pair including penalty factor and width of the high-level ones were discard, which means that kernel function in SVM method and obtain the opti- these studies may not express prior medical know- mal model (LFOA-SVM). Furthermore, this model will ledge. Therefore, in this paper, we proposed to diag- be investigated to diagnose the breast cancer on nose the breast cancer using the high-level features high-level features dataset. As far as we know, this which were defined based on the prior medical know- paper is the first to solve the parameter optimization ledge. This definition relies on two very experienced problemofSVMwithLFOA.Intheexperiment,a pathologists. Because these features include the ex- 10-fold cross-validation method was used on data to perience of doctors, doctors with clinical experience make detailed comparison between LFOA--SVM, have a high ability to differentiate between breast FOA-SVM (model based on the primitive fruit fly tumors and breast cancer in general, and have better optimization model), GA-SVM (model based on ge- comprehensibility. We extracted a set of high-level netic algorithms), PSO-SVM (model based on particle features, including 13 key features, which were the swarm optimization algorithms), random forest (RF), basis for the classification and grading of breast pathology. back propagation neural network (BPNN) and SVM. Huang et al. BMC Bioinformatics 2019, 20(Suppl 8):290 Page 3 of 14
The experimental results demonstrated that the pro- posed LFOA-SVM was superior to other methods in terms of classification accuracy, Mathews correlation coefficient (MCC), sensitivity and specificity. The rest of this paper is organized as follows. In Preliminaries Section background information used in the study was introduced. In Methods Section the detailed implementation of the proposed method was presented. In Results and discussion Section, experimental designs, results and discussion were delivered. Finally, in Con- clusion Section the conclusions and recommendations for future work were summarized.
Preliminaries Support vector machine Support Vector Machine (SVM) [45] is a supervised learn- ing model and related learning algorithm for analyzing data in classification and regression analysis. Given a set of training instances, each training instance is marked as one or the other of two classes, the SVM training algorithm creates a model that assigns a new instance to one of two classes, making it a non-probabilistic Levy flight binary linear classifier. Levy flight (LF) mechanism is often used to improve The SVM model is to represent instances as points meta-heuristics because its characteristics are similar in space, so that the mapping allows instances of se- to the movement of many animals in nature. The parate categories to be separated by as wide and distinct phenomena is called Levy statistics [46]. The LF is es- intervals as possible. Then, new instances are mapped to sentially stochastic non-Gaussian walks. Its step value the same space and the category is predicted based is dispersed relative to Levy stable distribution. Levy on which side they fall in the interval. In addition to distribution can be represented as the following linear classification, SVM can also use the so-called equation: kernel technique to effectively perform nonlinear clas- −1−β sification, mapping its input implicitly into the high- LevyðÞ s jjs ; 0 < β≤2 ð1Þ dimensional feature space. β represents an important Levy index to adjust the sta- More formally, support vector machines construct hy- bility, s is the step length. perplanes in high-dimensional or infinite-dimensional spaces. Which can be used for classification, regression Methods or other tasks. Intuitively, the farther away the nearest Levy flight enhanced FOA (LFOA) training data point is, the better, because this can reduce Levy’s flight is characterized by short steps and ran- the generalization error of the classifier. dom directions. This feature can effectively avoid the whole population falling into local optimum, thus en- Fruit-Fly optimization algorithms hancing the global detection ability of the algorithm. The fruit fly optimization algorithm (FOA) [30]was In this paper, we have introduced the LF strategy into a meta-heuristic algorithm which is inspired by the to FOA to explore the search space more efficiently. foraging behavior of fruit fly. Fruit fly relies on vi- The new position is updated according to the follow- sion and smell to position food during foraging. ing rule. FOA searches for solution space by mimicking the way of fruit fly flight when solving optimization levy Xi ¼ Xi þ Xi⊕levyðÞ s ð2Þ problems.InFOA,first,thefruitflypopulation levy (candidate solution) is randomly generated in the so- where Xi is the new position of the ith search agent Xi lution space, and then each fruit fly will update its after updating. position according to the flight mode of the fruit fly. Fruit fly population continuously improves the fit- Proposed LFOA-SVM model ness of the population (quality of solution) during This study proposes a novel evolutionary SVM that em- the iterative process. ploys the LFOA strategy, and the resultant LFOA-SVM Huang et al. BMC Bioinformatics 2019, 20(Suppl 8):290 Page 4 of 14
model can adaptively determine the two key hyper-pa- performance evaluation. During the inner parameter rameters for SVM. The general framework of the pro- optimization procedure, the SVM parameters are dyna- posed method is demonstrated in Fig. 1. The proposed mically adjusted by the LFOA technique via the 5-fold model is primarily comprised of two procedures: the inner cross validation (CV) analysis. Then, the obtained optimal parameter optimization and the outer classification parameters are fed to the SVM prediction model to
Fig. 1 Flowchart of LFOA-SVM Huang et al. BMC Bioinformatics 2019, 20(Suppl 8):290 Page 5 of 14
perform the classification task for breast cancer diagnosis Table 2 The parameter settings for the relevant methods in the outer loop using the 10-fold CV analysis. The classi- Algorithm Parameter Value fication accuracy was used as the fitness function. LFOA ax, ay 20,20 XK bx, by 10,10 fitness ¼ ACCi =k ð3Þ i¼1 FOA ax, ay 20,20 bx, by 10,10 where ACCi represents the average accuracy achieved by GA crossover 0.4 the SVM classifier via 5-fold CV. mutation 0.01 The main steps conducted by the LFOA-SVM are described in detail as follows: PSO c1,c2 2 w 1