https://doi.org/10.20965/jaciii.2019.p0928

Paper: MR-AntMiner: A Novel MapReduce Classification Rule Discovery with Ant Colony Intelligence

Yun Kong∗1,∗2, Junsan Zhao∗1,†, Na Dong∗1, Yilin Lin∗1, Lei Yuan∗3, and Guoping Chen∗4

∗1Faculty of Land Resource Engineering, Kunming University of Science and Technology
No. 68 Wenchang Road, 121 Avenue, Wuhua District, Kunming, Yunnan 650093, China
E-mail: [email protected]
∗2Library of Kunming University of Science and Technology
No. 727 Jingming Nan Road, Kunming, Yunnan 650504, China
∗3School of Information Science and Technology, Yunnan Normal University
No. 1 Yuhua District, Chenggong New District, Kunming, Yunnan 650500, China
∗4Geomatics Engineering Faculty, Kunming Metallurgy College
No. 388 Xuefu Road, Wuhua District, Kunming, Yunnan 650028, China
†Corresponding author
[Received September 30, 2018; accepted May 1, 2019]

Ant colony optimization (ACO) algorithms have been successfully applied to data classification problems that aim at discovering a list of classification rules. However, on the one hand, the ACO algorithm has defects including long search times and convergence issues with non-optimal solutions. On the other hand, given bottlenecks such as memory restrictions, time complexity, or data complexity, it is too hard to solve a problem when its scale becomes too large. One solution for this issue is to design a highly parallelized learning algorithm. The MapReduce programming model has quickly emerged as the most common model for executing simple algorithmic tasks over huge volumes of data, since it is simple, highly abstract, and efficient. Therefore, MapReduce-based ACO has been researched extensively. However, due to its unidirectional communication model and the inherent lack of support for iterative execution, ACO algorithms cannot easily be implemented on MapReduce. In this paper, a novel classification rule discovery algorithm is proposed, namely MR-AntMiner, which can capitalize on the benefits of the MapReduce model. In order to construct quality rules with fewer iterations as well as less communication between different nodes to share the parameters used by each ant, our algorithm splits the training data into subsets that are randomly mapped to different mappers; the traditional ACO algorithm is then run on each mapper to gain the local best rule set, and the global best rule list is produced in the reducer phase according to a voting mechanism. The performance of our algorithm was studied experimentally on 14 publicly available data sets and further compared to several state-of-the-art classification approaches in terms of accuracy. The experimental results show that the predictive accuracy obtained by our algorithm is statistically higher than that of the compared targets. Furthermore, experimental studies show the feasibility and the good performance of the proposed parallelized MR-AntMiner algorithm.

Keywords: ant colony optimization (ACO), MapReduce model, classification rule

1. Introduction

The ant colony algorithm is a heuristic intelligent optimization algorithm originally proposed by Colorni et al. [1], which has been successfully applied to solve many NP combinatorial optimization problems. Classification rule mining based on ant colony optimization (ACO) was first proposed by Parpinelli et al. [2]. The basic problem can be depicted as follows: an ant search path is defined as a connection of attribute nodes and class nodes, in which each attribute node appears at most once and every path must end in a class node. Each path corresponds to a classification rule, and the mining of the rules can be regarded as the search for the optimal path. Rule mining consists of three stages: rule construction, rule pruning, and pheromone path updating. The form of a rule is shown in Eq. (1), where $term_1$ is a conditional item and the rule conclusion (THEN) defines the prediction category of the sample (class):

$$\mathrm{IF}\ term_1\ \mathrm{AND}\ term_2\ \mathrm{AND}\ \cdots\ \mathrm{THEN}\ class. \qquad (1)$$
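As a concrete illustration of the rule form in Eq. (1), a rule can be held as a conjunction of attribute tests plus a predicted class label. The following Java sketch is purely illustrative; the class and method names are our own and are not taken from the paper.

```java
import java.util.List;
import java.util.Map;

// Illustrative representation of a rule of the form in Eq. (1):
// IF term_1 AND term_2 AND ... THEN class.
public class Rule {
    // One term is a single attribute test, e.g. "outlook = sunny".
    public record Term(String attribute, String value) {}

    private final List<Term> antecedent; // the conjunction of IF-terms
    private final String consequent;     // the predicted class label

    public Rule(List<Term> antecedent, String consequent) {
        this.antecedent = antecedent;
        this.consequent = consequent;
    }

    // A rule covers an example when every term matches the example's
    // value for that attribute.
    public boolean covers(Map<String, String> example) {
        return antecedent.stream()
                .allMatch(t -> t.value().equals(example.get(t.attribute())));
    }

    public String predictedClass() {
        return consequent;
    }
}
```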


With the advent of the era of big data, the scale of data is increasing exponentially. Traditional data mining algorithms are mainly suitable for small and medium-sized data sets, but are difficult to apply to the analysis of large-scale data sets: they are challenged by memory constraints, high time complexity, and data-intensive as well as complex structures [3]. The ACO algorithm faces the same problems. When the data set grows beyond a certain extent, the space and time costs of the traditional single-machine solution become huge, so it is difficult to meet current computing requirements. The emerging cloud computing model [4], with utilities such as Hadoop [5], as a new parallel processing technology, has excellent performance in dealing with large data sets and massive storage, so using optimization algorithms on a cloud computing platform has become a feasible and reliable solution.

The main goal of this paper is to combine the ACO algorithm and the MapReduce model to realize an ant colony classification rule mining algorithm in a large-scale environment. Our main work is as follows. Firstly, in order to solve the problem of one training data set being insufficient to achieve high classification accuracy, this study employed data segmentation and sampling techniques to divide the training data set into N subsets in a uniform distribution mode. Secondly, considering that the time complexity and search space of ACO are unacceptable when applied to large-scale data sets, in this work we adopted a strategy that randomly casts the N subsets to N Mappers in a MapReduce cluster. Thirdly, in view of the time overhead and lack of iterative execution in the MapReduce framework, our method takes K ants into a Map to produce the local best rule list on a certain subset, which can effectively reduce the time overhead of the framework and solve the problem of sharing the global pheromones when running the ACO algorithm in a mapper.
Finally, a voting selection mechanism is applied to generate the global best rules as the final classification rule sets in the reducer phase of the MapReduce model.

The remainder of this paper is organized as follows. Section 2 presents the background of ACO and gives an overview of existing ACO algorithms for classification rule discovery. Section 3 reviews the ACO algorithms based on MapReduce and analyzes the issues in iterative execution of the MapReduce framework. Our novel algorithm is described in Section 4, which presents a new model of MapReduce with ACO. Section 5 gives the experimental results of our algorithm on some publicly available data sets in comparison with other classification algorithms. At last, Section 6 summarizes our conclusions and future work.

2. ACO with Classification Rule Discovery

The principle of ACO classification rule mining is to imitate ants finding the shortest path from food to nest. Much research on classification tasks has been conducted to apply ACO to the discovery of classification rules. AntMiner [6] is the earliest ant colony classification rule mining algorithm. The rules are found by a heuristic search and a sequential covering strategy. The examples covered by a discovered rule are removed from the training data set to ensure that the rules found in the example data will not be repeated. This process continues iteratively until the training data set is empty or the termination condition is met. The heuristic value is calculated from the entropy of terms and their normalized information gain. Thereafter, a majority voting mechanism is employed in AntMiner to prune the irrelevant terms in order to raise the accuracy. Unlike AntMiner, a new heuristic function calculation method based on density estimation was adopted in AntMiner2 [7] and AntMiner3 [8]. Besides that, the distinct feature of AntMiner3 is a new pheromone update method, in which the pheromones are updated and evaporated only for those predefined conditions occurring in the rule. In this way, exploration behavior is encouraged. AntMiner+ [9], which is an enhanced version of AntMiner, designs a class-specific heuristic function, which enables the ants to know the class of an extracted rule: the class label is chosen in AntMiner+ before the ants construct their rules. AntMiner-CC [10] adopts a new heuristic function calculation method based on the correlation of data attributes, which takes full account of the relationships between the selected nodes and the candidate nodes and uses a disordered search space instead of a determined search space. Generally, continuous attributes are preprocessed by means of discretization before the ACO algorithm is applied, so cAnt-Miner [11] utilizes information entropy to discretize the continuous attributes. AntMiner_mbc [12] proposes a novel classification rule discovery algorithm based on ACO, in which a new model of multiple rule sets is presented to produce multiple lists of rules. Multiple base classifiers are built into AntMiner_mbc, and each base classifier is expected to remedy the weaknesses of the other base classifiers, which can improve the predictive accuracy by exploiting the useful information from various base classifiers. Nevertheless, as it constructs multiple base classifiers instead of one, it takes more execution time to build a solution from each ant as well as to complete tenfold cross-validation in a serial computing environment.

To sum up, since the early 1990s, many ACO algorithms have been reported, most of which were designed by using different probability calculation and pheromone update methods. In our method, each ant builds a classification rule following the traditional flow [2], in which the probability transfer formula plays a significant role in the ant selecting a node, as shown in Eq. (2):

$$P_{ij}(t) = \frac{\tau_{ij}^{\alpha}(t)\,\eta_{j}^{\beta}(t)}{\sum_{k=1}^{\text{total next values}} \tau_{ik}^{\alpha}(t)\,\eta_{k}^{\beta}(t)}, \qquad (2)$$

where $\tau_{ij}(t)$ is the concentration of pheromone between $node_i$ and $node_j$ for the $t$-th ant, $\eta_j(t)$ is the value of the heuristic information in node $j$, $\tau_{ik}(t)$ is the amount of pheromone concentration between $term_i$ and $term_k$, where $k$ runs from 1 to the total number of next attribute values, and $\eta_k(t)$ is the current value of the heuristic function. The candidate nodes are those whose attributes have not become prohibited. The parameters $\alpha$ and $\beta$ are two weight parameters that adjust the relative importance of the pheromone and heuristic information to control the next movement of the ant.
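To make Eq. (2) concrete, the following Java sketch (our own, with hypothetical names; not the authors' code) implements the usual roulette-wheel reading of the transition rule: the next node is drawn with probability proportional to $\tau^{\alpha}\eta^{\beta}$ over the non-prohibited candidates.

```java
import java.util.Random;

// Roulette-wheel reading of the transition rule in Eq. (2): from node i, the
// next node j is drawn with probability proportional to tau_ij^alpha * eta_j^beta,
// restricted to nodes whose attributes are not yet prohibited.
public class NodeSelection {
    private final Random rng = new Random();

    public int selectNextNode(int i, double[][] tau, double[] eta,
                              boolean[] prohibited, double alpha, double beta) {
        int n = eta.length;
        double[] weight = new double[n];
        double total = 0.0;
        for (int k = 0; k < n; k++) {
            if (prohibited[k]) continue; // attribute already used in this rule
            weight[k] = Math.pow(tau[i][k], alpha) * Math.pow(eta[k], beta);
            total += weight[k];
        }
        if (total == 0.0) return -1; // no admissible candidate remains
        // Draw r in [0, total) and walk the cumulative weights: this selects
        // node j with exactly the normalized probability P_ij(t) of Eq. (2).
        double r = rng.nextDouble() * total;
        double cumulative = 0.0;
        for (int k = 0; k < n; k++) {
            cumulative += weight[k];
            if (!prohibited[k] && r < cumulative) return k;
        }
        return -1; // unreachable for well-formed inputs
    }
}
```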


Our scheme applies a heuristic function that considers both the correlation and the coverage to avoid deceptively high accuracy [12], as shown in Eq. (3), where $k$ is the total number of classes:

$$\eta_i(t) = \frac{|term_i \cap CLASS = class_{ant}| + 1}{|term_i| + k}. \qquad (3)$$

The decrease of the pheromone concentration is accomplished by pheromone evaporation; that is, the amount of pheromone on all the paths is reduced through the evaporation factor $\rho$, while the global best rule reinforces its pheromone concentration based on its quality. The quality measure method is shown in Eq. (4):

$$Q = \frac{TP + FP}{P + N}\left(\frac{TP}{TP + FP} - \frac{P}{P + N}\right), \qquad (4)$$

where $TP$ and $FP$ respectively refer to the numbers of correct and incorrect examples covered by rules that have the same class label, $P$ is the total number of examples whose class labels are the selected class, and $N$ is the total number of examples belonging to other classes. Then, the pheromone update formula is given by Eq. (5):

$$\tau_i(t+1) = \rho\,\tau_i(t) + \frac{Q_{best}}{c}, \qquad (5)$$

where $Q_{best}$ is the quality value of the global best rule. The parameter $c$ is included to ensure that pheromone values are bounded in the range $[0,1]$. With the steady accumulation of pheromone, subsequent ants are guided to construct better solutions.
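The quality measure and the update rule can be sketched directly from Eqs. (4) and (5). The snippet below is an assumed reading rather than the paper's implementation; in particular, it applies evaporation to every entry and adds the $Q_{best}/c$ reinforcement only to the terms of the best rule, which is one plausible interpretation of the text.

```java
// Sketch of the rule-quality measure of Eq. (4) and the pheromone update of
// Eq. (5). Evaporation by rho is applied to every term; the reinforcement
// Q_best / c is added only for terms occurring in the global best rule.
public class PheromoneUpdate {

    // Q = ((TP + FP) / (P + N)) * (TP / (TP + FP) - P / (P + N)).
    public static double quality(int tp, int fp, int p, int n) {
        if (tp + fp == 0 || p + n == 0) return 0.0;
        double coverage = (double) (tp + fp) / (p + n);
        double accuracyGain = (double) tp / (tp + fp) - (double) p / (p + n);
        return coverage * accuracyGain;
    }

    // tau_i(t+1) = rho * tau_i(t) + Q_best / c for the reinforced terms.
    public static void update(double[] tau, boolean[] inBestRule,
                              double rho, double qBest, double c) {
        for (int i = 0; i < tau.length; i++) {
            tau[i] *= rho;               // evaporation on all paths
            if (inBestRule[i]) {
                tau[i] += qBest / c;     // reinforcement, Eq. (5)
            }
        }
    }
}
```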
3. ACO Algorithm Based on MapReduce Model

3.1. ACO Algorithm with MapReduce Review

The MapReduce model is a distributed programming model for the cloud computing environment proposed by the Google laboratory, which adopts a divide-and-conquer strategy. The complex parallel computing process is highly abstracted into two functions, namely Mapper and Reducer. In the map phase, a single <key, value> pair as input and a list of intermediate <key, value> pairs as output are treated by the Map function, as in Eq. (6). Then, all output intermediate <key, value> pairs are grouped by key between Map and Reduce. Finally, a new output <key, value> pair is produced by the Reduce function, as in Eq. (7). In this process, the data passes through the shuffle, sort, and combine functions in order to reduce the amount of data written to disk and the data transmission across the network [4, 5].

$$\mathrm{Map}\colon (k_1, v_1) \to list(k_2, v_2), \qquad (6)$$

$$\mathrm{Reduce}\colon \bigl(k_2, list(v_2)\bigr) \to list(k_3, v_3). \qquad (7)$$

There have been natural connections between cloud computing and swarm intelligence algorithms, because the MapReduce cloud computing model comes from the Lisp language, which belongs to the field of artificial intelligence. Many nature-inspired algorithms, such as the ant colony algorithm, have a high degree of parallel computation because they employ a number of Monte Carlo methods. Several descriptions can be found in the literature of the parallelization of ACO based on the MapReduce model, considering the drawbacks of serial ACO, including long search times and premature convergence to a non-optimal solution as well as low efficiency on large-scale data sets. Meena et al. [13] employed Hadoop MapReduce ACO to study feature selection in text categorization. Hao et al. [14] proposed a MapReduce-based ACO to parallelize ACO, where divide and conquer and a simulated annealing algorithm were merged into ACO to improve its defects and enable the solution of large-scale TSP problems. Elanthiraiyan and Arumugam [15] parallelized an ACO algorithm for regression testing prioritization in the Hadoop framework. Wang et al. [16] discussed several parallel ways of implementing ACO and their application scenarios as well as the feasibility of combination with the MapReduce model, in which ACO with local search features is abstracted into several components, with several interfaces based on MapReduce being built to implement each component. Jin and Ran [17] proposed a fair-rank ACO method in a distributed mass storage system. Siemiński and Kopel [18] presented a general description of the parallelized ACO concept and the details of two ways of implementation, in both an inhomogeneous environment of traditional computers and a homogeneous Hadoop environment. Jayasena et al. [19] proposed an ACO algorithm with MapReduce for efficient resource allocation in multimedia big data analysis and data distribution.

3.2. Iterative Calculation of MapReduce on ACO

According to the current reports, the procedure of the ACO algorithm based on MapReduce can be mainly abstracted into four steps: initialization, Map, Reduce, and output. The steps are as follows:

Step 1: Initialization. Read the initial training data set from the input HDFS files, including the number of iterations, the number of ants, the information heuristic factor α, the expectation factor β, and the information volatilization coefficient, as well as the primitive pheromone file, and so on.

Step 2: Map. Each ant builds its respective classification rule. An ant starts to construct its rule by selecting a node randomly and transfers to the next node based on a probability formula. This action continues until the ant has created a whole path including the label node. Finally, the intermediate result is stored to the local disk in the form of <key, value>, where key represents the index and value the path being built by each ant.


Fig. 1. Classical Procedure of MR-ACO.

Step 3: Reduce. First, the Reduce function calculates the best classification rule according to the values provided by the Map function as input data. Then, the best rule is filtered and pruned by a certain quality metric method. Third, the best evaluated rule is assigned to the global best rule set and, meanwhile, the pheromone matrix is updated in light of the best rule. Finally, the training examples covered by the best rule are removed from the primitive training set.

Step 4: Output. The global best rule set and the modified pheromone information are written to the HDFS file in the format of <key, value>, and then the procedure restarts with another MapReduce iteration.

The time complexity of the basic ACO is $O(NC \cdot m \cdot n^2)$ [14] in the serial programming model, where the computation is mainly concentrated on building the full path by each ant, which is composed of constructing a classification rule, rule pruning, and pheromone updating. The ACO classification algorithm based on MapReduce applies the Map function to parallelize the most time-consuming part of the algorithm, that is, each ant constructing the complete path independently, and employs the Reduce function to obtain the global best rule and update the pheromone information. The parallel flow is shown in Fig. 1.

It can be seen from the figure that the MR-ACO algorithm cannot converge to the optimal solution until multiple iterations have been done, because the pheromone update has an important influence on the algorithm. To shorten the time of a MapReduce iteration, Hao et al. [14] adopted the pipeline technique of cloud computing; that is, the output of the Reduce function is used as the input of the next iteration's Map function, and multiple pairs of Map and Reduce tasks are serialized in the form M1 to R1, M2 to R2, ..., Mn to Rn. However, the MapReduce programming framework does not support global variables, so there is no dependence between Mapper tasks or Reducer tasks, and there is no sharing of information between different data slices. In addition, as MapReduce opens the Mapper and Reducer tasks, a number of initialization operations, including job distribution, input partition, and task partition, are generated, which account for about 7% to 10% of the execution cost of the whole job [20]. At the same time, there is a global public variable named pheromone in the ant colony algorithm that requires preemptive access, so the pheromone update cannot be processed in a parallel way. In a word, the MapReduce computing model cannot explicitly support iterative execution.
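To make this iteration overhead concrete, the following hedged Hadoop driver sketch chains one MapReduce job per ACO iteration, feeding each job's output back as the next job's input in the spirit of the M1 to R1, ..., Mn to Rn pipeline of [14]. The HDFS paths and the iteration limit are hypothetical, and the identity Mapper/Reducer classes merely stand in for an ant-path mapper and a best-rule reducer; the point is that every iteration pays the full job start-up cost noted above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Naive MR-ACO chaining: one full MapReduce job per ACO iteration. Every pass
// pays the job start-up cost (job distribution, input and task partition), and
// the pheromone state can only travel between iterations through HDFS files.
public class MrAcoDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path("hdfs:///mr-aco/iter0"); // hypothetical: data + initial pheromone
        int maxIterations = 200;                       // hypothetical iteration limit

        for (int t = 1; t <= maxIterations; t++) {
            Job job = Job.getInstance(conf, "mr-aco-iteration-" + t);
            job.setJarByClass(MrAcoDriver.class);
            // Identity classes as stand-ins: a real implementation would plug
            // in an ant-path-building mapper and a best-rule/pheromone reducer.
            job.setMapperClass(Mapper.class);
            job.setReducerClass(Reducer.class);
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);
            Path output = new Path("hdfs:///mr-aco/iter" + t);
            FileInputFormat.addInputPath(job, input);
            FileOutputFormat.setOutputPath(job, output);
            if (!job.waitForCompletion(true)) break;
            input = output; // Reduce output becomes the next iteration's Map input
        }
    }
}
```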


Fig. 2. The flow of the MR-AntMiner algorithm.

4. MR-AntMiner: A Novel MapReduce Rule Mining Based on ACO

In this section, a novel MapReduce ant colony classification rule mining algorithm based on the multi-base classifier is put forward, in which the basic procedure of the traditional ACO algorithm is referenced; it is called MR-AntMiner in this paper for short. First, a data classification model is designed to generate a multi-base classifier, and the training data set is divided into N example subsets. Secondly, the N training subsets are mapped to the N Mappers in the MapReduce model, and then the ant colony optimization algorithm of K ants is run on each Mapper to produce a local optimal rule set based on each subset. After that, a voting selection mechanism is used to generate the global optimal classification rule set in the reduce phase. The framework for MR-AntMiner is shown in Fig. 2.

4.1. Data Partition Method Based on Bootstrap

Because of the uncertainty of the search space in the ACO algorithm, the rules constructed by the same ACO algorithm may be obviously different, which indicates that the solution of a specific ACO algorithm is unstable in a certain sense [12]. In view of this issue, in this paper we propose a strategy that splits the original training set into several subsets in a particular approach and then runs the same ACO algorithm on each subset, so that each subset compensates for deficiencies in the others. Suppose the data size of the primitive training set and of each subset is N, and each instance is selected from a uniform distribution; then the admission probability of each instance is calculated by Eq. (8):

$$P_{ij} = 1 - \left(1 - \frac{1}{N}\right)^{N}, \qquad (8)$$

where $P_{ij}$ is the probability that the $i$-th sample is included in the $j$-th data subset. When N is set to be large enough, the probability converges to a stable value of 0.632, which means that each data subset contains about 63% of the original training data. This result demonstrates that using different classifiers can remedy the instability of the algorithm as well as raise the classification accuracy. In addition, this method also reduces the training data error caused by random fluctuations, and the algorithmic robustness is enhanced.
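The 0.632 value follows because $(1 - 1/N)^N \to e^{-1} \approx 0.368$ as N grows. A minimal sketch of this uniform sampling-with-replacement partition (our own illustrative code, with hypothetical names) is:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Bootstrap-style partition in the spirit of Section 4.1: each subset is built
// by drawing N instances uniformly with replacement from the N training
// instances, so each instance is admitted with probability 1 - (1 - 1/N)^N.
public class BootstrapPartition {
    public static <T> List<List<T>> partition(List<T> training, int numSubsets, long seed) {
        Random rng = new Random(seed);
        int n = training.size();
        List<List<T>> subsets = new ArrayList<>();
        for (int s = 0; s < numSubsets; s++) {
            List<T> subset = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                subset.add(training.get(rng.nextInt(n))); // uniform draw with replacement
            }
            subsets.add(subset);
        }
        return subsets;
    }

    public static void main(String[] args) {
        // (1 - 1/N)^N converges to 1/e, so the admission probability of
        // Eq. (8) converges to 1 - 1/e ≈ 0.632 for large N.
        int n = 100_000;
        System.out.println(1.0 - Math.pow(1.0 - 1.0 / n, n)); // prints ≈ 0.632
    }
}
```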


Fig. 4. Ranked global best rule by voting mechanism.

Fig. 3. The procedure of Mapper_n().

4.2. Constructing Best Local Rule Sets on Multiple Data Subsets

In this section, the local best rule set is generated by the ACO algorithm based on multiple classifiers, in which each data subset is mapped to the Maps randomly. The parallel strategy mentioned in the references assumes that K ants will run the algorithm on the same data across different Maps, in which the inherent defects of the MapReduce model and the ACO algorithms discussed above are not adequately considered. Furthermore, considering the overhead of the framework in the cloud computing environment, if only one ant is running in a Map, the computing time of the algorithm is relatively short while the extra framework cost becomes proportionally larger. Therefore, a different parallel approach is proposed in our paper, in which K ants run the algorithm on the same Map based on the same data subset to produce a local best rule set.

The flow is depicted in Fig. 3. Each ant builds rules on the same Map sharing the same sample; then, the rules are evaluated and pruned by Eq. (4). Only the best rule is kept after pruning in order to reduce the computing cost. At last, the pheromone is updated to guide the next ant, as in the algorithm Mapper_n() shown in Fig. 3. The algorithm starts at lines 1 and 2 with a subset of training data and an empty global best rule list. Lines 3 to 21 are the outer loop of the procedure, in which a local best rule is added into the global best rule list after a whole iteration, until the termination condition is met. Lines 7 to 18 are the inner loop of the procedure, in which each rule is constructed by the ACO algorithm. Each ant produces a rule according to the pheromone update mechanism and a heuristic function with a high accuracy while avoiding deception. The inner loop keeps running until the number of rules is equal to the number of ants. Duplicate rules are then filtered, and after that the best rule is selected by an evaluation mechanism in a certain iteration. Then, the local pheromone matrix is modified by the pruned best rule. Meanwhile, the examples covered by the rules in the best rule list are removed from the training data subset. At last, the local best rule list is output in the form of <key, value>, where key is the classified label and value is the corresponding rule.

4.3. Generation of Global Best Rule List by a Voting Mechanism

In the Reduce phase, a ranked global best rule list is generated by a voting selection mechanism according to the multiple local best rule lists produced by the Map phase. The voting selection mechanism is proposed in order to obtain the sorted global best rule list, as shown in Fig. 4. Suppose the algorithm gets three Reduce operations, namely Reduce1, Reduce2, and Reduce3, which contain 3, 2, and 4 local best rules, respectively, with rules corresponding to the classified labels A, B, and C. Because the total ballots gained by A, B, and C are 6, 2, and 1, respectively, the order of the global best rule list is A, B, and C, spontaneously produced by the algorithm Reducer_m() as shown in Fig. 5.
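The ballot counting itself is straightforward; the following self-contained Java sketch (hypothetical names, standing in for the Reducer_m() pseudocode of Fig. 5) reproduces the worked example above, where labels A, B, and C collect 6, 2, and 1 ballots.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Ballot counting behind the voting selection mechanism of Section 4.3: every
// local best rule emitted by a mapper is one ballot for its class label, and
// the global best rule list is ordered by descending ballot count.
public class VotingRanker {
    public static List<String> rankLabels(List<String> ballots) {
        Map<String, Long> counts = ballots.stream()
                .collect(Collectors.groupingBy(label -> label, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The worked example above: labels A, B, and C gain 6, 2, and 1
        // ballots in total, so the ranked global list is [A, B, C].
        List<String> ballots = List.of("A", "A", "A", "A", "A", "A", "B", "B", "C");
        System.out.println(rankLabels(ballots)); // prints [A, B, C]
    }
}
```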
Then, to ing cost. At last, the pheromone is updated to guide the be able to verify the feasibility and the performance of next ant, the algorithm Mappern() as shown in Fig. 3. the proposed algorithm, some comparisons between the The algorithm starts at line 1, 2 with a subset of training MR-AntMiner and other algorithms are presented in Sec-


Table 2. The detailed construction of the cluster.

No.  Nodes   CPU  RAM  OS            Bit
1    Master  8v   8GB  Ubuntu 16.04  64
2    Slave1  8v   8GB  Ubuntu 16.04  64
3    Slave2  8v   8GB  Ubuntu 16.04  64
4    Slave3  8v   8GB  Ubuntu 16.04  64

Fig. 5. The procedure of Reducer_m().

Table 1. The details of the data sets used in the experiments.

No.  Data Set     Nominal Attr.  Numeric Attr.  Class  Size
1    Bal          4              0              3      625
2    breast-l     9              0              2      286
3    breast-w     0              30             2      569
4    cmc          7              2              3      1,473
5    car          6              0              4      1,728
6    credit-a     8              6              2      690
7    derma        33             1              6      366
8    ecoli        0              7              8      336
9    glass        0              10             7      214
10   hill-valley  0              100            2      606
11   pima         0              8              2      768
12   ttt          9              0              2      958
13   Adult        6              8              2      32,561
14   Skin Seg.    0              3              2      245,057

5.1. Data Sets

In this section, 14 public data sets were collected in order to evaluate the performance of the proposed MR-AntMiner algorithm; they can be found in the publicly available UCI machine learning repository [21]. The main characteristics of the data sets are summarized in Table 1, and can also be found in [22]. These data sets include 4 data sets with only nominal attributes, 6 data sets with only numerical attributes, and 4 data sets with mixed attribute types. Our proposed algorithm MR-AntMiner is compared with other well-known traditional serial-calculation machine learning algorithms (AntMiner_mbc [12], C4.5 [22], and JRip [23]) and parallel computation methods (RuleMR [24] and MR-C4.5 [25]).

5.2. Computing Environment and Parameter Setting

5.2.1. Computing Environment

In our experiments, the proposed MR-AntMiner algorithm was implemented in the Java language. For convenience of coding, the MapReduce framework of Hadoop [5] and the data structures of WEKA [26] were used in the MR-AntMiner classification system. We evaluated the performance of the MR-AntMiner classification system in a small cluster operating environment containing 1 host computer and 3 servant computers using the VMware 5.5 platform. The detailed construction of the cluster is given in Table 2.

5.2.2. Metric and Parameters of ACO

There are many performance metrics for evaluating classification algorithms. Our main consideration is the classification accuracy, which is obtained from the numbers of correct and incorrect class label predictions by the classifier. It is expressed as the percentage of testing examples correctly classified by the classifier. Generally, more correct class label predictions indicate better classification accuracy, and this metric is used in many ACO-based classification algorithms as a significant performance measure for comparison. The generated classifiers classify the testing examples through a tenfold cross-validation procedure.

MR-AntMiner has six parameters: the maximal number of iterations, the number of ants, the evaporation factor, the pheromone control parameter (α), the heuristic control parameter (β), and the partition number of the training data. Continually increasing the number of iterations and the number of ants may result in greater execution time with no significant increase in accuracy. This study adopted an F-Race racing procedure to find a better configuration of these two parameters [27]. Experiments show that the algorithm is able to converge to a better solution when the maximal number of iterations is set to 200; meanwhile, this saves a lot of computational resources. Besides, the number of ants was set to 1000, which ensures that enough ants are employed to find a better solution. For a given data set, increasing the evaporation factor ρ can result in a slower convergence process. Numerous experiments have validated that setting the evaporation factor to 0.85 can obtain higher accuracy while maintaining a reasonable execution time [28]. The control parameters α and β are initialized in advance and then adaptively adjusted during the ACO search process [28].


Table 3. Parameters of MR-AntMiner.

Parameter                              Value
Limit iteration                        200
Number of ants                         1000
Evaporation rate                       0.85
Control pheromone value (α)            Self-adaption
Control heuristic value (β)            Self-adaption
Partition number of training data      10

Table 4. Accuracy comparisons between MR-AntMiner and other algorithms (in %).

Data Set     MR-AntMiner  AntMiner_mbc  C4.5   JRip   RuleMR  MR-C4.5
bal          73.76        73.84         63.71  72.95  72.84   64.24
breast-l     74.36        74.02         72.93  69.26  70.53   73.02
breast-w     94.88        94.73         94.15  93.66  94.76   94.66
cmc          47.11        47.07         46.62  52.41  48.22   47.22
car          77.24        77.26         92.36  86.17  84.23   92.06
credit-a     86.52        86.37         85.80  85.80  85.67   85.96
derma        85.29        85.31         93.52  88.01  89.53   93.88
ecoli        78.98        79.14         84.23  78.87  85.56   84.76
glass        96.56        96.36         96.73  95.33  96.21   97.08
hill-valley  52.08        52.12         50.33  48.35  51.12   50.68
pima         74.92        74.81         72.11  73.55  73.66   73.24
ttt          98.26        98.19         94.03  97.61  96.27   94.78
Adult        84.8         83.7          83.4   82.9   83.6    84.2
Skin Seg.    94.6         92.8          93.2   93.6   91.8    94.2

All the parameter settings for the MR-AntMiner algorithm are shown in Table 3, while the parameters of C4.5 and JRip were set as recommended in their respective papers [22, 23]. The number of base classifiers has a significant impact on the performance of the algorithm; it was found that MR-AntMiner using 10 base classifiers is able to obtain the highest predictive accuracy [12]. Therefore, 10 base classifiers are suggested for MR-AntMiner, as this results in the best predictive accuracy with reasonable execution time.

5.3. Comparison of MR-AntMiner with Other Algorithms

In this section, the main goal is to measure the accuracy of the resulting rule set, its size, and the average length of the constructed rules. Thus, we make comparisons between MR-AntMiner and other algorithms, including both traditional serial calculation algorithms and parallel ones. The proposed MR-AntMiner, RuleMR [24], and MR-C4.5 [25] algorithms were run on a small cluster on the data sets in Table 1. The Weka machine learning tool [26] was adopted to run C4.5 and JRip, while AntMiner_mbc [12] was implemented in MATLAB R2013b. The predictive accuracies of all the algorithms are given in Table 4, where all the results were obtained by using tenfold cross-validation and the best accuracy for each data set is identified with boldface. The results in Table 4 represent the average accuracy achieved by the cross-validation procedure for all the algorithms on the corresponding data sets.

We mainly focus on comparisons of the accuracy on testing data. The results marked in bold show where the corresponding algorithm is significantly better than the others. From these comparisons, as shown in Table 4, the results of MR-AntMiner closely approximate those of AntMiner_mbc, and the proposed algorithm even displays better performance on some data sets. The proposed algorithm showed the best performance for 7 data sets out of 14, while AntMiner_mbc and MR-C4.5 each performed best for 2 data sets out of 14, and C4.5, JRip, and RuleMR each for 1 data set out of 14. As can be seen from Table 4, it is reasonable to conclude that our algorithm performed best in terms of predictive accuracy, as it gave better performance on most of the data sets when compared with the other algorithms.

5.4. Parallel Performance of the MR-AntMiner

In this section, the main method of MR-AntMiner is used to present the parallel performance. In the experiments, we randomly selected the Adult data set from Table 1 to evaluate the parallel performance of the MR-AntMiner algorithm by computing the speedup, scaleup, and sizeup, whose definitions can be found in the literature [29].
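For orientation, the commonly used forms of these three metrics are sketched below; the paper defers the formal definitions to [29], and the notation $T(p, D)$ (running time on $p$ processors over data set $D$, with $m \cdot D$ the data set scaled $m$ times) is our own:

$$\mathrm{speedup}(p) = \frac{T(1, D)}{T(p, D)}, \qquad \mathrm{scaleup}(p) = \frac{T(1, D)}{T(p,\, p \cdot D)}, \qquad \mathrm{sizeup}(m) = \frac{T(p,\, m \cdot D)}{T(p,\, D)}.$$

Under these conventions, an ideal system shows linear speedup, a scaleup that stays at 1, and a sizeup that grows no faster than $m$.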


Fig. 6. Speedup of MR-AntMiner.

Fig. 7. Running time vs. data size.

Fig. 8. Scaleup of MR-AntMiner.

Fig. 9. Sizeup of MR-AntMiner.

The MR-AntMiner algorithm was evaluated on a small cluster with the number of processors varying from 1 to 6, and with data set sizes of 1-time, 20-times, 40-times, and 60-times the original data set. The parallel system does not demonstrate a linear speedup because, in practice, the communication cost gradually increases with the increasing number of processors. As shown in Fig. 6, the speedup tends to be approximately linear when the data size increases, especially for the much bigger data sets. In addition, as presented in Fig. 7, the slopes of the running-time curves decrease with an increasing number of processors for a given data set. This is consistent with the actual situation: the bigger the samples are, the higher the speedup that can be achieved, and the more processors there are, the shorter the running time. According to the definition of scaleup, the data sets that are 1-time to 6-times the size of the original data set were analyzed on 1 to 6 processors, respectively. The scaleup would equal 1 if the parallel system were ideal. In reality, the scaleup exhibits a downtrend as the size of the data set and the number of processors are gradually increased, as shown in Fig. 8; however, as the data set becomes larger, the scaleup degrades only slowly, reflecting the good scalability of the proposed algorithm. To evaluate the sizeup of the proposed algorithm, a series of experiments was performed in which the number of processors was fixed and the data sets were 1-time, 20-times, 40-times, and 60-times the size of the original data set. As denoted in Fig. 9, the proposed algorithm displayed good sizeup performance.

6. Conclusion

The classical AntMiner algorithm is one of the most broadly used approaches for small and medium data sets in real-world applications. However, the AntMiner algorithm has several drawbacks when applied to large data sets. In this paper, we presented a parallelized AntMiner algorithm called MR-AntMiner to address these challenges by using the MapReduce programming framework. To construct the MR-AntMiner algorithm, several novel methods were employed, as described at length in Section 4. By comparing the results achieved with MR-AntMiner against those achieved by the classical AntMiner algorithm and some other classification algorithms, it was found that the proposed MR-AntMiner algorithm is feasible and effective. Furthermore, the experimental studies of speedup, scaleup, and sizeup confirmed the good parallel performance of the proposed MR-AntMiner algorithm.

In future work, the data skew problem should be taken more into consideration when partitioning the training data. In addition, parallel hybrid nature-inspired heuristic algorithms are a very promising class of artificial intelligence algorithms, as they can work effectively by combining their strengths. These algorithms may lead to better solutions for NP combinatorial optimization problems.

Acknowledgements
The authors would like to express their appreciation to the colleagues in our laboratory for their valuable comments and other help, as well as the support of the National Natural Science Foundation of China (No. 41761081).

References:
[1] A. Colorni, M. Dorigo, and V. Maniezzo, "Distributed Optimization by Ant Colonies," Proc. of the European Conf. on Artificial Life (ECAL'91), pp. 134-142, 1991.
[2] R. S. Parpinelli, H. S. Lopes, and A. A. Freitas, "Data mining with an ant colony optimization algorithm," IEEE Trans. on Evolutionary Computation, Vol.6, No.4, pp. 321-332, 2002.
[3] S. R. Pakize and A. Gandomi, "Comparative Study of Classification Algorithms Based on MapReduce Model," Int. J. of Innovative Research in Advanced Engineering, Vol.1, Issue 7, pp. 251-254, 2014.
[4] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, Vol.51, Issue 1, pp. 107-113, 2008.
[5] T. White, "Hadoop: The Definitive Guide – Storage and Analysis at Internet Scale," 4th Edition, O'Reilly Media, Inc., 2015.
[6] M. Pedemonte, S. Nesmachnow, and H. Cancela, "A survey on parallel ant colony optimization," Applied Soft Computing, Vol.11, Issue 8, pp. 5181-5197, 2011.
[7] B. Liu, H. A. Abbass, and B. McKay, "Density-based heuristic for rule discovery with ant-miner," Proc. of the 6th Australasia-Japan Joint Workshop on Intelligent and Evolutionary Systems, pp. 180-184, 2002.
[8] B. Liu, H. A. Abbass, and B. McKay, "Classification Rule Discovery with Ant Colony Optimization," IEEE Intelligent Informatics Bulletin, Vol.3, No.1, pp. 31-35, 2004.
[9] D. Martens, M. De Backer, R. Haesen, J. Vanthienen, M. Snoeck, and B. Baesens, "Classification with Ant Colony Optimization," IEEE Trans. on Evolutionary Computation, Vol.11, No.5, pp. 651-665, 2007.
[10] A. R. Baig, W. Shahzad, and S. Khan, "Correlation as a Heuristic for Accurate and Comprehensible Ant Colony Optimization Based Classifiers," IEEE Trans. on Evolutionary Computation, Vol.17, No.5, pp. 686-704, 2013.
[11] F. E. B. Otero, A. A. Freitas, and C. G. Johnson, "cAnt-Miner: An Ant Colony Classification Algorithm to Cope with Continuous Attributes," Proc. of the 6th Int. Conf. on Ant Colony Optimization and Swarm Intelligence (ANTS 2008), pp. 48-59, 2008.
[12] Z. Liang, J. Sun, Q. Lin, Z. Du, J. Chen, and Z. Ming, "A novel multiple rule sets data classification algorithm based on ant colony algorithm," Applied Soft Computing, Vol.38, pp. 1000-1011, 2016.
[13] M. J. Meena, K. R. Chandran, A. Karthik, and A. V. Samuel, "An enhanced ACO algorithm to select features for text categorization and its parallelization," Expert Systems with Applications, Vol.39, Issue 5, pp. 5861-5871, 2012.
[14] W. Hao, N. Zhiwei, and H. Wang, "MapReduce-based ant colony optimization," Computer Integrated Manufacturing Systems, Vol.18, No.7, pp. 1503-1509, 2012 (in Chinese).
[15] N. Elanthiraiyan and C. Arumugam, "Parallelized ACO Algorithm for Regression Testing Prioritization in Hadoop Framework," Proc. of the 2014 Int. Conf. on Advanced Communications, Control and Computing Technologies, pp. 1568-1571, 2014.
[16] Z. Wang, T. Li, and X. Yi, "Approach for Development of Ant Colony Optimization Based on MapReduce," Computer Science, Vol.41, Issue 7, pp. 261-265, 2014 (in Chinese).
[17] H. Jin and L. Ran, "A Fair-Rank Ant Colony Algorithm in Distributed Mass Storage System," Canadian J. of Electrical and Computer Engineering, Vol.38, No.4, pp. 338-345, 2015.
[18] A. Siemiński and M. Kopel, "Comparing efficiency of ACO parallel implementations," J. of Intelligent & Fuzzy Systems, Vol.32, No.2, pp. 1377-1388, 2017.
[19] K. P. N. Jayasena, L. Li, and Q. Xie, "Multi-Modal Multimedia Big Data Analyzing Architecture and Resource Allocation on Cloud Platform," Neurocomputing, Vol.253, pp. 135-143, 2017.
[20] W. Pan, Z. Li, S. Wu, and Q. Chen, "Evaluation Large Graph Processing in MapReduce Based on Message Passing," Chinese J. of Computers, Vol.34, No.10, pp. 1768-1784, 2011 (in Chinese).
[21] UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/ [accessed March 1, 2018]
[22] J. R. Quinlan, "C4.5: Programs for Machine Learning," Morgan Kaufmann Publishers Inc., 1993.
[23] V. N. Vapnik, "The Nature of Statistical Learning Theory," Springer-Verlag, 1995.
[24] V. Kolias, C. Kolias, I. Anagnostopoulos, and E. Kayafas, "RuleMR: Classification Rule Discovery with MapReduce," Proc. of the 2014 IEEE Int. Conf. on Big Data, pp. 20-28, 2014.
[25] Y. Mu, X. Liu, Z. Yang, and X. Liu, "A parallel C4.5 decision tree algorithm based on MapReduce," Concurrency and Computation: Practice and Experience, Vol.29, Issue 8, Article No.e4015, 2017.
[26] Weka 3: Machine Learning Software in Java, http://www.cs.waikato.ac.nz/ml/weka/ [accessed March 1, 2018]
[27] M. Birattari, T. Stützle, L. Paquete, and K. Varrentrapp, "A racing algorithm for configuring metaheuristics," Proc. of the 4th Annual Conf. on Genetic and Evolutionary Computation (GECCO'02), pp. 11-18, 2002.
[28] D. Martens, M. De Backer, R. Haesen, J. Vanthienen, M. Snoeck, and B. Baesens, "Classification with Ant Colony Optimization," IEEE Trans. on Evolutionary Computation, Vol.11, No.5, pp. 651-665, 2007.
[29] Q. He, T. Shang, F. Zhuang, and Z. Shi, "Parallel extreme learning machine for regression based on MapReduce," Neurocomputing, Vol.102, pp. 52-58, 2013.

Name: Yun Kong

Affiliation: Faculty of Land Resource Engineering, Kunming University of Science and Technology (KUST)

Address: No. 68 Wenchang Road, 121 Avenue, Wuhua District, Kunming, Yunnan 650093, China

Brief Biographical History:
2009 M.S. in Computer Science
2014- Pursuing Ph.D. in Earth Exploration and Information Technology, KUST


Name: Junsan Zhao

Affiliation: Professor, Faculty of Land Resource Engineering, Kunming University of Science and Technology (KUST)

Address: No. 68 Wenchang Road, 121 Avenue, Wuhua District, Kunming, Yunnan 650093, China

Brief Biographical History:
1985 B.S. in Engineering, Central South University
1988 M.S. in Engineering, Central South University
2001- Professor, KUST
2006 Doctoral degree in Engineering, Wuhan University
2013- Doctoral Tutor, KUST

Main Works:
• A. Tian, J. Zhao, H. Xiong, and C. Fu, "Quantitative Inversion Model of Total Potassium in Desert Soils Based on Multiple Regression Combined with Fractional Differential," Sensors and Materials, Vol.30, No.11, pp. 2479-2488, 2018.
• J. Zhao, L. Yuan, and M. Zhang, "A study of the system dynamics coupling model of the driving factors for multi-scale land use change," Environmental Earth Sciences, Vol.75, Issue 6, Article No.529, 2016.

Membership in Academic Societies:
• International Association of Chinese Professionals in Geographic Information Sciences (CPGIS)
• China Resource and Environment Remote Sensing Society, Director
• The Chinese Land Institute

Name: Na Dong

Affiliation: Faculty of Land Resource Engineering, Kunming University of Science and Technology (KUST)

Address: No. 68 Wenchang Road, 121 Avenue, Wuhua District, Kunming, Yunnan 650093, China

Brief Biographical History:
2008 M.S. in Geological Engineering, KUST
2015- Pursuing Ph.D. in Earth Exploration and Information Technology, KUST

Name: Yilin Lin

Affiliation: Faculty of Land Resource Engineering, Kunming University of Science and Technology (KUST)

Address: No. 68 Wenchang Road, 121 Avenue, Wuhua District, Kunming, Yunnan 650093, China

Brief Biographical History:
2014 B.S., KUST
2014- MBA-DBA Student, KUST

Name: Lei Yuan

Affiliation: School of Information Science and Technology, Yunnan Normal University

Address: No.1, Yuhua District, Chenggong New District, Kunming, Yunnan 650500, China

Brief Biographical History:
2014 Doctoral degree in Geographic Information Engineering, Kunming University of Science and Technology
2014- School of Information Science and Technology, Yunnan Normal University

Name: Guoping Chen

Affiliation: Associate Professor, Assistant Dean, Geomatics Engineering Faculty, Kunming Metallurgy College

Address: No. 388 Xuefu Road, Wuhua District, Kunming, Yunnan 650028, China

Brief Biographical History:
2006- Kunming Yunjindi Technology Co., Ltd.
2008- Kunming Metallurgy College
2014- Visiting Scholar, Wuhan University
2019- Captain, Lancang County Poverty Alleviation Task Force
