Neural Comput & Applic (2018) 30:735–757 https://doi.org/10.1007/s00521-016-2710-0

ISCMI15

Exploring differential evolution and particle swarm optimization to develop some symmetry-based automatic clustering techniques: application to gene clustering

Sriparna Saha¹ • Ranjita Das²

Received: 17 April 2016 / Accepted: 8 November 2016 / Published online: 1 February 2017
© The Natural Computing Applications Forum 2017

Abstract In the current paper, we have developed two bio-inspired fuzzy clustering algorithms by incorporating two optimization techniques, namely differential evolution and particle swarm optimization. Both clustering techniques can detect symmetrical-shaped clusters utilizing the established point symmetry-based distance measure. Both the proposed approaches are automatic in nature and can detect the number of clusters automatically from a given dataset. A symmetry-based cluster validity measure, F-Sym-index, is used as the objective function to be optimized in order to automatically determine the correct partitioning by both the approaches. The effectiveness of the proposed approaches is shown for automatically clustering some artificial and real-life datasets as well as for clustering some real-life gene expression datasets. The current paper presents a comparative analysis of some meta-heuristic-based clustering approaches, namely the two newly proposed techniques and the already existing automatic genetic clustering techniques VGAPS, GCUK and HNGA. The obtained results are compared with respect to some external cluster validity indices. Moreover, some statistical significance tests, as well as biological significance tests, are also conducted. Finally, results on gene expression datasets have been visualized by using some visualization tools, namely Eisen plot and cluster profile plot.

Keywords Unsupervised classification · Particle swarm optimization (PSO) · Differential evolution (DE) · Symmetry · Point symmetry-based distance · Gene expression data

Sriparna Saha: [email protected]
Ranjita Das: [email protected]; [email protected]

1 Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
2 Department of Computer Science and Engineering, National Institute of Technology Mizoram, Aizawl, India

1 Introduction

In the field of data mining, clustering [22] has innumerable applications for solving different real-life problems [15, 23]. In the literature, many invariant clustering techniques have been proposed [4] to cluster the dataset. To identify clusters from a dataset, some proximity or similarity measurements need to be defined among data points to establish rules which can be used to assign points to the domain of a particular cluster centroid. For recognition and identification of most objects, "symmetry" is useful, as it is an important characteristic of real-life objects. As symmetry is a natural phenomenon, we can assume that some kind of symmetry exists in the cluster structure also. Symmetry measurements can be of two types, point symmetry (PS) and line symmetry (LS). Point symmetry-based measurements are more applicable for clusters which are symmetric about their central point. In Fig. 1, some objects having point symmetry and line symmetry properties are shown.

Fig. 1 Point symmetric and line symmetric objects

Inspired by these observations, some point symmetry-based measurements are developed in [7, 39]. These distance functions are then utilized in [7] to develop some clustering techniques which can determine any kind of point symmetric clusters from different datasets. Symmetry in clustering is discussed in many existing works on clustering, for example, in the analysis of invariant clustering [4]. In [7], techniques are developed for solving the clustering problem using the properties of symmetry. The clustering problem is modeled as an optimization problem, and a genetic algorithm [19] is used to optimize the total symmetrical compactness of the obtained clustering to get the optimal partitioning. This algorithm overcomes some drawbacks associated with the SBKM and Mod-SBKM clustering techniques [43].

1.1 Some automatic clustering techniques

In the literature, many genetic algorithm-based clustering techniques are available which are capable of detecting the number of clusters and the appropriate partitioning automatically from any given dataset. Some examples are the variable string length genetic K-means algorithm (GCUK) [6] and the hybrid niching genetic algorithm (HNGA) [41], where Euclidean distance has been used for assigning data points to different clusters. A variable string length genetic clustering technique (VGAPS) [38] is also proposed, where point symmetry-based distance has been used. In [6], a genetic algorithm-based K-means clustering technique has been developed which is able to detect clusters having equi-sized hyper-spherical shapes. GCUK uses this genetic algorithm-based K-means clustering technique for automatic identification of clusters. In HNGA [41], to prevent premature convergence, a niching method is developed along with a weighted sum validity function for optimization. Liu et al. [28] developed an automatic clustering technique based on a genetic algorithm and presented a noising selection and a division absorption-based mutation technique to maintain the diversity of the population and the selection pressure. Horta et al. [21] developed an evolutionary technique based on fuzzy clustering for automatically identifying the clusters present in relational data. In [1], the authors introduced a grouping-based evolutionary approach which uses the idea of grouping encoding and an adaptive exploration and exploitation operator. Moreover, an elitist scheme is also applied to ensure that the best solution is preserved by the algorithm. He et al. [20] adopted a variable length coding representation for the initialization of individuals and used a two-stage selection and mutation operator. But when the dimension of the dataset increases, the search ability gets reduced. Kao et al. [24] presented a hybrid particle swarm optimization algorithm for automatically evolving the cluster centers and applied it to the problem of generalized machine cell formation.

In recent years, some new optimization techniques like the cuckoo search technique [47], differential evolution (DE) [34, 46], particle swarm optimization (PSO) [33] and ant colony optimization [13] have been proposed in the literature. Recent studies have also revealed that these optimization techniques converge much faster than genetic algorithms [34, 46]. Based on these observations, some differential evolution-based and particle swarm optimization-based clustering techniques are also developed in the literature [29, 31, 35, 40]. In [31], a modified differential evolution-based clustering technique is developed for satellite image segmentation. In [40], a modified fitness-based adaptive differential evolution algorithm is developed for clustering of image pixels. Here the control parameters of the traditional DE-based approach are calculated adaptively using fitness-based statistics. In [36], two variants of DE-based clustering techniques are proposed. These are then applied for solving the clustering problem on some real-life datasets. Zhang et al. [48] used DE to optimize the coordinates of samples distributed randomly on a plane. Kernel-based approaches are utilized there to map the data of the original space into a high-dimensional feature space in which a fuzzy dissimilarity matrix is constructed. Cai et al. [11] combined traditional DE and one-step K-means clustering for the problem of unconstrained optimization. Tvrdík et al. [44] developed a hybrid method by combining DE and the K-means algorithm and applied it to non-hierarchical clustering. In [26], the authors incorporated a local improvement phase into classical DE to get faster convergence and better performance, and further applied it in wireless sensor networks to increase the lifetime of the network. Liu et al. [27] combined two multi-parent crossover operators with differential evolution to solve the problem of global optimization. A good survey covering the existing particle swarm optimization-based clustering techniques can be found in [2].

1.2 Motivation

The existing DE- and PSO-based clustering techniques are found to perform better than the corresponding genetic algorithm-based versions. But in the earlier attempts, these algorithms were used along with the popular Euclidean distance for assignment of points to different clusters. As mentioned earlier, symmetry-based measurements [7] are found to perform better than the popular Euclidean distance-based versions in detecting clusters having different shapes and sizes. Thus, the incorporation of these symmetry-based measurements in the frameworks of differential evolution and particle swarm optimization-based clustering techniques can help to increase the quality of the partitions further.

In the current paper, we have made an attempt in this direction. Two algorithms based on the search capabilities of differential evolution and particle swarm optimization are developed. Both algorithms are able to detect the number of clusters and the appropriate partitioning automatically without having prior information about these. Moreover, both algorithms utilize the variable center-based encoding to represent the partitions. The symmetry-based similarity measurement [7] is utilized for the assignment of points to different clusters. A fuzzy symmetry-based cluster validity index, namely F-Sym-index [38], is used as the objective function in both the proposed clustering techniques. In a part of the paper, another cluster validity index, XB-index [45], is also used as the objective function for the purpose of comparison. Incorporation of the point symmetry distance in the evaluation of the F-Sym measure makes it capable of detecting all categories of clusters irrespective of their shapes and sizes, as long as those contain some symmetrical properties. F-Sym values of the obtained partitionings are optimized using the search capabilities of DE and PSO.

1.3 Experimental results

The effectiveness of both clustering techniques is illustrated on several artificial and real-life datasets. The performances are compared with a variable length genetic algorithm with point symmetry-based clustering technique, VGAPS [38], and two other genetic algorithm with Euclidean distance-based clustering techniques, GCUK [6] and HNGA [41], in terms of an external cluster validity index, the Minkowski Score [42]. We have also made a comparative study of the number of clusters obtained by all these algorithms. In a part of the paper, we have also conducted some statistical significance tests. In order to show some real-life applications of the proposed clustering algorithms, we have shown results for gene expression data clustering. We have used some gene expression datasets to show results and evaluated the goodness of the obtained partitions using a cluster validity index, the Silhouette index [37]. Finally, biological and statistical significance tests have been conducted on the gene expression datasets.

1.4 Major contributions

The following are the key contributions of the current paper:

• This is the first attempt where some differential evolution or particle swarm optimization-based fuzzy clustering techniques are developed using the properties of symmetry.
• The first fuzzy clustering technique is based on the search capabilities of differential evolution, and the second one is based on the search capabilities of particle swarm optimization.
• Both the proposed clustering techniques use point symmetry-based distance for allocating points to different clusters.
• Both DE- and PSO-based clustering techniques are able to detect the number of clusters and the appropriate partitioning automatically.
• Goodness of the partitioning, measured in terms of a point symmetry-based cluster validity index, FSym-index, is used as the optimization objective.
• Results on several artificial and real-life datasets show that the performance of the DE-based clustering technique is the best compared to other symmetry-based algorithms.
• Results on gene expression datasets show the superior performance of DE in terms of cluster accuracy.
• Finally, some biological and statistical significance tests have been performed to evaluate the biological and statistical significance of the obtained results.

2 Existing point symmetry-based distance measure

In this section, the point symmetry (PS)-based distance developed in [7] is described.

2.1 Point symmetry-based distance

The PS distance or point symmetry-based distance [7] $d_{ps}(x, c)$ associated with a point $x$ with respect to the center $c$ of a cluster $c_j$, $j = 1, 2, \ldots, C$, is described in this section. Let the dataset contain all distinct points, and let $x$ be a point. The reflected or symmetrical point of $x$ with respect to a particular cluster center $c$ is $2 \times c - x$, and this is denoted by $x^*$. If the knear unique nearest neighbors of $x^*$ (calculated using Euclidean distances) are at distances $d_k$, $k = 1, 2, \ldots, knear$, then

$$d_{ps}(x, c) = d_{sym}(x, c) \times d_e(x, c) \tag{1}$$

where

$$d_{sym}(x, c) = \frac{\sum_{k=1}^{knear} d_k}{knear} \tag{2}$$

and $d_e(x, c)$ is the Euclidean distance between $x$ and $c$. In Eq. 2, knear should not be chosen equal to 1, because if $x^*$ exists in the dataset, then $d_{ps}(x, c) = 0$ and the Euclidean distance has no impact. Conversely, a large value of knear is also unsuitable: $d_{sym}$ then averages over more and farther neighbors, so it may underestimate the amount of symmetry of a point with respect to a particular cluster center. So here we have kept knear = 2. Note that $d_{ps}(x, c)$ is a non-metric distance measure which calculates the point symmetry distance between a data point and a cluster center, unlike the popular Minkowski distances. The computational complexity of $d_{ps}(x, c)$ is $O(n)$. Hence, for $n$ points and $C$ clusters, the complexity of assigning the points to different clusters is $O(n^2 C)$. In order to decrease the computational complexity, a Kd-tree-based approximate nearest neighbor search is also proposed in [7].

2.2 Symbols used

The following symbols are used in describing the proposed clustering techniques:

• C: number of clusters present in a particular string.
• Cmax: maximum value of the number of clusters.
• D: dimension of the dataset.
• NP: population size.
• Pbest: best particle position in case of the PSO-based approach.
• Gbest: best global particle position in case of the PSO-based approach.
• par: current particle.
• G: current generation.
• Kgbest: best vector till the current generation in case of the DE-based approach.
• Klbest: best vector of the current population.
• CR: crossover probability in case of the DE-based approach.

3 Proposed fuzzy symmetry-based automatic clustering technique using the search capability of DE

In this section, the variable vector length differential evolution algorithm using a newly developed point symmetry-based distance is described for automatic determination of the optimal clustering solution (Fuzzy-VMODEPS scheme). A flowchart showing the different steps of the proposed approach is shown in Fig. 2.

Fig. 2 Flowchart of Fuzzy-VMODEPS approach

Differential evolution (DE) is a meta-heuristic technique developed by Storn and Price [34] to optimize real-life functions. The idea behind the DE-based clustering technique is as follows: initial cluster centers are some randomly selected data points from the dataset, and those are encoded as cluster centers in the vector. Similarly, all the vectors in the population are initialized. After the initialization phase, the centers are extracted to compute the fitness of a particular vector. Once the fitness has been calculated, all the vectors in the population go through the mutation and crossover phases to generate the mutant and crossover vectors.

3.1 Vector initialization and representation

In the proposed Fuzzy-VMODEPS scheme, the population consists of a collection of vectors. Each vector $V_l$ contains a collection of real numbers distinctly chosen from the given dataset, where $l = 1, 2, \ldots, NP$ and NP is the maximum size of the population. Each vector $V_l$ encodes $C_j$ clusters, where the minimum value of $C_j$ is 2 and the maximum value is $C_{max}$. $C_j$ is calculated by using the following equation:

$$C_j = (\mathrm{rand}() \bmod (C_{max} - 1)) + 2$$

where rand() is a random function returning an integer and $C_{max}$ denotes the maximum value of the number of clusters. Therefore, the number of clusters present in the vector is confined between 2 and $C_{max}$. $C_j$ distinct points are randomly selected from the given dataset. Let $V_l$ be the vector, containing $C_j$ cluster centers. If the dimension of each data point in the dataset is $D$, then the length of the vector will be $D \times C_j$. This is explained by an example: let a vector be represented by <1.2, 21.2, 13.2, 14.2, 5.3, 6.2, 4.2, 5.3, 6.3, 2.5, 2.3, 1.6>. If the vector contains $C_j = 3$ clusters and each center has $D = 4$ dimensions, then the centers of the clusters will be <1.2, 21.2, 13.2, 14.2>, <5.3, 6.2, 4.2, 5.3> and <6.3, 2.5, 2.3, 1.6>.

After that, five iterations of the fuzzy C-means algorithm [10] are executed on the whole dataset with the set of cluster centers encoded in each vector. This replaces the centers in the corresponding vector by the resultant centers, so that the centers get separated initially.

3.1.1 Fitness computation

Fitness computation is a two-step process. In the first step, using the point symmetry-based distance measure [7], membership values of the data points $x_i$, $i = 1, \ldots, n$ ($n$ is the total number of data points), with respect to the $C$ different cluster centers are computed, where $C$ is the number of centers encoded in a particular vector. Once the membership values have been calculated, in the second step the fitness measure is evaluated using the membership matrix.

3.1.2 Computation of membership values

Let a particular vector contain $C$ cluster centers encoded in it. The centers are denoted by $c_j$, for $j = 1, \ldots, C$. The cluster center $c_{min}$ among all the cluster centers $c_j$, $j = 1, \ldots, C$, nearest to data point $x_i$ in terms of symmetry is determined to compute the membership values. The expression for determining $c_{min}$ is given below:

$$c_{min} = \operatorname{argmin}_{j = 1, \ldots, C} \; d_{ps}(x_i, c_j)$$

Here $d_{ps}(x_i, c_j)$, i.e., the point symmetry distance between data point $x_i$ and cluster center $c_j$, is calculated by using Eq. 1, and $c_j$ denotes the center of the $j$th cluster. In this context, if $d_{sym}(x_i, c_{min}) \le \theta$, i.e., $d_{sym}(x_i, c_{min})$ does not exceed $\theta$, then the membership values are calculated as follows:

$$u_{ij} = \begin{cases} 1, & \text{if } j = c_{min} \\ 0, & \text{if } j \ne c_{min} \end{cases}$$

Otherwise, the membership values $u_{ij}$ are updated using the procedure of the fuzzy C-means [10] algorithm. Here $m \in (1, \infty)$ is a weighting exponent called the fuzzifier, whose value is taken as $m = 2$, and $\theta$ is taken as the maximum nearest neighbor distance among all the data points. More details about the calculation of $\theta$ can be obtained from [7].

3.1.3 Objective function used

In order to determine the average symmetry present in a partitioning, an internal cluster validity measure, FSym-index [38], is utilized as the fitness function. For each vector, first the membership values are calculated using the procedure discussed above. Then the FSym-index value is calculated using this membership matrix. FSym-index is computed using the following equation:

$$FSym(C) = \frac{1}{C} \times \frac{1}{E_C} \times D_C \tag{3}$$

Here $C$ is the number of clusters encoded in the vector,

$$E_C = \sum_{j=1}^{C} E_j \tag{4}$$

such that

$$E_j = \sum_{i=1}^{n} u_{ij} \times d_{ps}(x_i, c_j) \tag{5}$$

and

$$D_C = \max_{i,j=1}^{C} \lVert c_i - c_j \rVert \tag{6}$$

In order to obtain the actual number of clusters and to achieve a proper clustering, the FSym-index value needs to be maximized. Thus, the objective function for a particular vector is FSym. This is maximized using the search capability of DE.

3.1.4 Updating the centers

After computing the membership values, the cluster centers are updated. In order to update the cluster centers, the following equation is used, which is similar to the equation used in fuzzy C-means [10]:

$$c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} x_i}{\sum_{i=1}^{n} u_{ij}^{m}} \tag{7}$$

for $j = 1, \ldots, C$.
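The point symmetry-based distance of Eqs. 1 and 2 and the FSym-index of Eqs. 3 to 6 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy data and the crisp membership matrix are hypothetical, the nearest neighbors are found by a brute-force sort rather than a Kd-tree, and the guard for $E_C = 0$ is an addition made only to keep the sketch safe to run.

```python
import math


def euclidean(a, b):
    """Euclidean distance d_e between two points given as sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def ps_distance(x, c, data, knear=2):
    """Point symmetry-based distance d_ps(x, c) of Eqs. 1 and 2."""
    # Reflect x about the center c: x* = 2c - x.
    reflected = [2 * cj - xj for xj, cj in zip(x, c)]
    # Brute-force search for the knear nearest neighbors of x* (assumption:
    # a full sort is used here only for brevity, not a Kd-tree).
    nearest = sorted(euclidean(reflected, p) for p in data)[:knear]
    d_sym = sum(nearest) / knear
    return d_sym * euclidean(x, c)


def fsym_index(data, centers, u):
    """FSym-index of Eqs. 3 to 6 for a membership matrix u[i][j]."""
    n_clusters = len(centers)
    # E_C: membership-weighted total symmetrical compactness (Eqs. 4 and 5).
    e_c = sum(
        u[i][j] * ps_distance(x, centers[j], data)
        for i, x in enumerate(data)
        for j in range(n_clusters)
        if u[i][j] > 0
    )
    # D_C: largest Euclidean separation between two centers (Eq. 6).
    d_c = max(euclidean(a, b) for a in centers for b in centers)
    # Guard (assumption): E_C = 0 would otherwise divide by zero.
    return math.inf if e_c == 0 else d_c / (n_clusters * e_c)


# Toy data (hypothetical): two squares centered at (0, 0) and (10, 0).
data = [(1, 1), (-1, -1), (1, -1), (-1, 1), (11, 1), (9, -1), (11, -1), (9, 1)]
centers = [(0, 0), (10, 0)]
u = [[1, 0]] * 4 + [[0, 1]] * 4  # crisp memberships, for illustration only
print(ps_distance(data[0], centers[0], data))
print(fsym_index(data, centers, u))
```

With knear = 1 the reflected point of (1, 1) coincides with the data point (-1, -1), so the distance collapses to zero, which is the degenerate case that motivates knear = 2 in Sect. 2.1.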

3.1.5 Mutation

The population of DE is composed of NP $D$-dimensional individuals $p_l(G)$, $l = 1, 2, \ldots, NP$, used to attain an optimal solution, where $G$ denotes the $G$th generation. Here $p_l(G)$, $l = 1, 2, \ldots, NP$, are the target vectors. Each individual vector $p_l(G)$ in the population of target vectors at the $G$th generation goes through the mutation phase. This leads to the generation of trial offspring or mutant vectors. A mutant vector is produced by the following equation:

$$M_l(G+1) = p_k(G) + F\,(p_m(G) - p_n(G)) \tag{8}$$

For each individual in the population, $p_k(G)$, $p_m(G)$ and $p_n(G)$ are three vectors chosen randomly from the population of target vectors at the $G$th generation. Here $l \ne k \ne m \ne n$, and $m, k, l, n \in \{1, 2, \ldots, NP\}$ are mutually distinct integers taken randomly. The mutant vector is obtained by taking the difference of two target vectors, $p_m(G)$ and $p_n(G)$, multiplying it by the scalar factor $F \in [0, 1]$, and adding the result to the third target vector $p_k(G)$. This gives the mutant vector $M_l(G+1)$ for the $(G+1)$th generation.

The above classical mutation scheme is modified in [31], and that modified mutation operator is used in the proposed approach (Fuzzy-VMODEPS). The modified mutation scheme is described as follows:

$$M_l(G+1) = K_{gbest}(G) + \alpha\,(K_{lbest}(G) - K_r(G)) \tag{9}$$

In the above equation, $M_l(G+1)$ represents the $l$th mutant vector generated at the $(G+1)$th generation. It is generated by adding the weighted difference of $K_{lbest}(G)$ and $K_r(G)$ to the third vector $K_{gbest}(G)$. Here $K_{lbest}(G)$ denotes the best vector of the current population at the $G$th generation, $K_{gbest}(G)$ denotes the best vector generated till the $G$th generation, and $K_r(G)$ represents the $r$th vector, chosen randomly from the current population at the $G$th generation. The difference of $K_{lbest}$ and $K_r$ at the $G$th generation is multiplied by the scalar factor $\alpha$, which is calculated as follows:

$$\alpha = \frac{1}{1 + \exp\left(-\left(\frac{1}{G}\right)\right)} \tag{10}$$

Subsequently, based on the calculated $\alpha$ value, the classical or the modified mutation scheme is adopted in each generation:

$$M_l(G+1) = \begin{cases} K_{gbest}(G) + \alpha\,(K_{lbest}(G) - K_r(G)) & \text{if } \mathrm{rand}(0, 1) \le \alpha \\ p_k(G) + F\,(p_m(G) - p_n(G)) & \text{otherwise} \end{cases}$$

The modified mutation scheme is used in [31] to accelerate convergence, so that the trial vector can reach the global optimum in a minimum number of generations. This is not the case for the classical mutation scheme. As the generation number increases, the value of $\alpha$ decreases. Whenever $\alpha$ is high, the probability of adopting the modified mutation scheme is high too. So, when the modified mutation function is used, the lbest vector, i.e., the best vector in the current population, has a greater influence on the mutant vector.

3.1.6 Crossover

The crossover function increases the diversity in the offspring vectors. The crossover operation is performed on the individual vector or target vector and its corresponding mutant vector. After the crossover operation, the trial vector is generated in the following way:

$$C_{jl}(G+1) = \begin{cases} M_{jl}(G+1) & \text{if } \mathrm{rand}(0, 1) \le CR \text{ or } j = \mathrm{rand}(l) \\ p_{jl}(G) & \text{otherwise} \end{cases}$$

Here $j = 1, 2, \ldots, d$ and rand(l) is a randomly selected index from $1, 2, \ldots, d$, where $d = D \times C$ and $C$ is the number of clusters encoded in the $l$th chromosome at the $G$th generation. CR is the crossover rate and $C_l(G+1)$ is the trial vector for the $(G+1)$th generation. After that, the fitness value is computed for each of the trial vectors.

3.1.7 Selection

In this phase, the trial vector is compared with the target vector, and the vector which has the maximum fitness value survives to the next generation. The procedure is as follows:

$$p_l = \begin{cases} C_l & \text{if } FSym(p_l) \le FSym(C_l) \\ p_l & \text{otherwise} \end{cases}$$

If the fitness value of $C_l$ (in this case, the FSym-index value corresponding to the partitioning encoded in the $C_l$ vector) is better than the fitness value of $p_l$ (the FSym-index value corresponding to the partitioning encoded in the $p_l$ vector), then $p_l$ is updated by $C_l$. Otherwise the previous value of $p_l$ is preserved.
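The mutation, crossover and selection steps of Sects. 3.1.5 to 3.1.7 can be sketched as below. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the defaults F = 0.8 and CR = 0.9 are placeholder values that the text does not specify, all vectors are assumed to have the same length (the variable-length handling of Fuzzy-VMODEPS is omitted), and the fitness passed to `select` is any function to be maximized, standing in for the FSym-index.

```python
import math
import random


def alpha(generation):
    """Scaling factor of Eq. 10; decreases toward 0.5 as generations grow."""
    return 1.0 / (1.0 + math.exp(-1.0 / generation))


def mutate(pop, l, gbest, lbest, generation, F=0.8, rng=random):
    """Mutant vector for target l: modified scheme (Eq. 9) with
    probability alpha, classical scheme (Eq. 8) otherwise."""
    a = alpha(generation)
    if rng.random() <= a:
        r = rng.randrange(len(pop))  # random vector K_r of the population
        return [g + a * (b - x) for g, b, x in zip(gbest, lbest, pop[r])]
    # Classical scheme: needs three mutually distinct indices other than l,
    # so the population must hold at least four vectors.
    k, m, n = rng.sample([i for i in range(len(pop)) if i != l], 3)
    return [pk + F * (pm - pn) for pk, pm, pn in zip(pop[k], pop[m], pop[n])]


def crossover(target, mutant, CR=0.9, rng=random):
    """Trial vector: take the mutant component if rand <= CR or if the
    component index equals one randomly forced index (Sect. 3.1.6)."""
    forced = rng.randrange(len(target))
    return [
        mutant[j] if (rng.random() <= CR or j == forced) else target[j]
        for j in range(len(target))
    ]


def select(target, trial, fitness):
    """Keep the trial vector if it is at least as fit (Sect. 3.1.7)."""
    return trial if fitness(target) <= fitness(trial) else target


# Hypothetical usage with a stand-in fitness (sum of components).
rng = random.Random(0)
pop = [[float(i), float(i)] for i in range(5)]
mutant = mutate(pop, 0, gbest=pop[4], lbest=pop[3], generation=1, rng=rng)
trial = crossover(pop[0], mutant, rng=rng)
pop[0] = select(pop[0], trial, fitness=sum)
```

The forced index in `crossover` guarantees that the trial vector inherits at least one component from the mutant even when CR is very small.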

3.1.8 Termination criteria

In this approach, the process of mutant vector generation in the mutation phase, trial vector generation in the crossover phase and the selection operation is performed for a constant number of generations. At the final generation, a population containing multiple solutions is generated. The FSym-index values are calculated for the individual vectors. The best vector, having the highest value of FSym, is considered as the final solution. The corresponding set of cluster centers is used to partition the given data, and the obtained results are reported.

The basic steps of the proposed algorithm are enumerated below:

• Generate the initial population NP randomly.
• Execute the steps of the FCM algorithm five times.
• Evaluate the fitness of each individual or vector in NP using Eq. 3.
• Set generation = 1, Maxgen: maximum number of generations.
• Initialize gbest and lbest.
• While the halting criterion is not satisfied (generation++ ≤ Maxgen)
• do
  • if rand(0, 1) < α then (here α = 1 / (1 + exp(−(1/generation))))
    • Apply the mutation operation using the equation $M_l(G+1) = K_{gbest}(G) + \alpha\,(K_{lbest}(G) - K_r(G))$.
    • Apply the crossover operation.
  • else
    • Apply the mutation operation using the equation $M_l(G+1) = p_k(G) + F\,(p_m(G) - p_n(G))$.
    • Apply the crossover operation.
  • Evaluate the fitness of the trial vector or offspring using Eq. 3.
  • Update lbest by the best vector in the current population.
  • if (lbest > gbest)
    • Replace gbest with lbest.
  • If $C_l$ (the vector generated after application of the genetic operators) is better than $p_l$ (the original vector)
    • $p_l = C_l$
  • otherwise the previous value of $p_l$ is retained.
• End while
• Report the best vector.

3.2 Time complexity

The time complexity of the proposed algorithm is analyzed below:

• Initialization of Fuzzy-VMODEPS requires $O(Popsize \times vectorlength)$ time, where Popsize and vectorlength indicate the population size and the length of each vector in Fuzzy-VMODEPS, respectively. Note that vectorlength is $O(C_{max} \times D)$, where $D$ is the dimension of the dataset and $C_{max}$ is the maximum possible number of clusters encoded in a string.
• Fitness computation is composed of three steps.
  • In order to find the membership values of each point with respect to the different cluster centers, the minimum symmetrical distance of that point with respect to all clusters has to be calculated. For this purpose, the Kd-tree [9]-based nearest neighbor search is used. If the points are roughly uniformly distributed, then the expected case complexity is $O(m \times D + \log n)$, where $m$ is a constant depending on the dimension and the point distribution. This is $O(\log n)$ if the dimension $D$ is a constant [9]. Friedman et al. also reported $O(\log n)$ expected time for finding the nearest neighbor [16]. So in order to find the minimal symmetrical distance of a particular point, $O(C_{max} \times \log n)$ time is needed. Thus, the total complexity of computing the membership values of $n$ points to $C_{max}$ clusters is $O(C_{max} \times n \times \log n)$.
  • For updating the centers, the total complexity is $O(C_{max})$.
  • The total complexity of computing the fitness values is $O(n \times C_{max})$.
  So the fitness evaluation has total complexity $O(Popsize \times C_{max} \times n \times \log n)$.
• Mutation and crossover require $O(Popsize \times vectorlength)$ time each.
• The selection step of Fuzzy-VMODEPS requires $O(Popsize \times vectorlength)$ time.

Thus, summing up the above complexities, the total time complexity becomes $O(C_{max} \times n \times \log(n) \times Popsize)$ per generation. For a maximum of Maxgen generations, the total complexity becomes $O(C_{max} \times n \times \log(n) \times Popsize \times Maxgen)$.
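The summation in the last step can be written out explicitly. The sketch below assumes, as the Kd-tree bound above does, that $D$ is treated as a constant, so that the $O(Popsize \times vectorlength)$ terms are dominated by the fitness term; the symbols $T_{gen}$ and $T_{total}$ are introduced here only for illustration.

```latex
\begin{aligned}
T_{gen} &= \underbrace{O(Popsize \times C_{max} \times n \log n)}_{\text{fitness}}
  + \underbrace{2\,O(Popsize \times C_{max} \times D)}_{\text{mutation, crossover}}
  + \underbrace{O(Popsize \times C_{max} \times D)}_{\text{selection}} \\
        &= O(Popsize \times C_{max} \times n \log n), \\
T_{total} &= Maxgen \times T_{gen}
        = O(C_{max} \times n \log n \times Popsize \times Maxgen).
\end{aligned}
```

The one-time initialization cost $O(Popsize \times C_{max} \times D)$ is absorbed in the same way.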

4 Particle swarm optimization-based variable length clustering technique using point symmetry-based distance (Fuzzy-VPSOPS)

Particle swarm optimization (PSO) is a population-based stochastic search algorithm developed by Kennedy and Eberhart [33]. The algorithm is inspired by the swarm behavior of birds, bees and fish as they search for food or communicate with each other. It was mainly designed to solve optimization problems. The PSO approach is highly decentralized and is based upon interaction among agents called particles [25]. Particles are the agents which represent individual solutions, and the collection of particles is the swarm, which represents the solution space. Initially, the swarm is initialized by some random solutions, and each particle starts flying through the solution space by maintaining a velocity value and keeping track of its best previous position obtained so far, which is known as the personal best position. The global best is another best solution, which corresponds to the best fitness value obtained by any of the particles. In the current work, we have proposed a fuzzy symmetry-based variable length clustering technique using the search capabilities of particle swarm optimization (PSO). The algorithm is named Fuzzy-VPSOPS (fuzzy point symmetry-based variable length clustering technique using particle swarm optimization). A flowchart showing the different steps of the proposed approach is shown in Fig. 3.

Fig. 3 Flowchart of Fuzzy-VPSOPS approach

The algorithmic flow of Fuzzy-VPSOPS is given below. The parameters of the search space are encoded in the form of particles, and a collection of such particles is called a swarm. Initially, the process starts with a population of particles whose positions represent the potential solutions for the studied problem; velocities are randomly initialized in the search space, and the population or swarm represents different points in the search space. An objective function is associated with each particle and is evaluated at the particle's position. In each iteration, the search for the optimal position is performed by updating the velocities and positions of the particles. The velocity of each particle is updated using the Pbest and Gbest positions. The personal best position, Pbest, is the best position the particle has visited, and Gbest is the best position the swarm has visited since the first time step. The process of fitness calculation, Pbest and Gbest calculation, and velocity and position update continues for a fixed number of generations or till a termination condition is satisfied.

For the purpose of clustering, each particle encodes a possible variable number of cluster centers. The goodness of each partition is measured using a point symmetry-based cluster validity index. Here we have used the point symmetry-based distance for cluster assignment and FSym-index as the objective function. The details of this scheme are described below.

4.1 Particle representation and population initialization

In the proposed Fuzzy-VPSOPS scheme, the population is a collection of potential solutions for clustering the data, which are termed particles. Each particle $Par_l$ contains a collection of real numbers distinctly chosen from the given dataset, where $l = 1, 2, \ldots, NP$ and NP is the number of solutions present in a population. Each particle $Par_l$ encodes $C_j$ clusters, where the possible range of $C_j$ is $[2, C_{max}]$. $C_j$ is determined by using the equation $C_j = (\mathrm{rand}() \bmod (C_{max} - 1)) + 2$, where rand() is a random function returning an integer and $C_{max}$ denotes the maximum number of clusters. For initialization, the $C_j$ centers of a particular particle are distinct points randomly selected from the given gene dataset. Let $Par_l$ be the particle, containing $C_j$ clusters. If the dimension of each data point in the dataset is $D$, then the length of the particle will be $D \times C_j$. This is explained by an example: let a particle be <11.2, 22.2, 23.2, 14.2, 5.3, 7.2, 4.5, 4.3, 5.5, 7.5, 4.3, 7.6>. If the particle contains $C_j = 3$ clusters and each center has $D = 4$ dimensions, then the cluster centers will be <11.2, 22.2, 23.2, 14.2>, <5.3, 7.2, 4.5, 4.3> and <5.5, 7.5, 4.3, 7.6>.

4.2 Fitness computation

Here again, first the cluster centers encoded in a particle are extracted. Thereafter, the steps mentioned in Sect. 3 are executed.

4.3 Calculation of Pbest and Gbest vectors

At the beginning of the execution, Pbest and Gbest are initialized by some small values. In order to calculate Pbest, we need to compare the particle's fitness value with that obtained by the Pbest solution. If the current particle $Par_l$'s fitness value is better than that of Pbest, Pbest is replaced by the current particle. In a similar manner, if the fitness

obtained so far and the global best position which has been explored by the swarm.

4.5 Termination criteria

In the Fuzzy-VPSOPS method, the process of fitness computation, Pbest and Gbest calculation, and update of the velocity and position of the particles is executed for a constant number of iterations. The best particle generated by the clustering algorithm up to the last iteration gives the solution of the clustering problem.

The basic steps of the proposed algorithm are enumerated below:

1. Initialize the parameters, including the population size NP, $co_1$, $co_2$, $w$, and the maximum iteration count.
2. Initialize a swarm with NP particles, i.e., for each particle arbitrarily select $C_j$ clusters from the $n$ data points as the centroids.
3.
Initialize position and velocity matrix, Pbest for each value of current particle is better than Gbest, then Gbest is particle and Gbest for the swarm. updated by the current particle’s fitness value and position. 4. Run the FCM algorithm for five iterations. 5. Calculate the fitness value of each particle using 4.4 Updation of velocity and position of particles point symmetry-based distance measure as men- tioned in Sect. 3. In order to search for optimal position in each generation, 6. Calculate Pbest for each particle. velocities and positions of particles have been updated. A 7. Calculate Gbest for the swarm. velocity vector is assigned to each particle to regulate the 8. Update the velocity matrix for each particle. next transit of the particle. Each particle basically updates 9. Update the position matrix for each particle. it velocity on the basis of current velocity, personal best 10. Go to step 5 until the termination criteria are not position it has obtained so far and the global best position satisfied. which has explored by the swarm. The velocity and position of the particle are updated as: 4.6 Time complexity Vellðt þ 1Þ¼w  VellðtÞþco1  r1ðparlPbestðtÞÀposlðtÞÞ þ co2  r2ðParGbest À poslðtÞÞ The time complexity of the Fuzzy-VPSOPS clustering technique is analyzed below: poslðt þ 1Þ¼poslðtÞþVellðt þ 1Þð11Þ • Initialization of Fuzzy-VPSOPS needs Oðswarmsize  Here, w is the inertia weight, VellðtÞ is the previous particlelengthÞ time where swarmsize and particlelength velocity in iteration t of lth particle, co1 and co2 are indicate the population size and the length of each particle coefficients and r1 and r2 are random values in the range of in Fuzzy-VPSOPS, respectively. Note that particlelength 0 and 1. ðparlPbestðtÞÀposlðtÞÞ is the difference between is OðCmax  DÞ where D is the dimension of the dataset the local best parlPbest of the lth particle and the previous and Cmax is the maximum number of clusters. position poslðtÞ. 
Similarly, ðparGbest À poslðtÞÞ is the dif- • Fitness computation is composed of three steps. ference between the global best parGbest and the previous position poslðtÞ. • In order to find membership values of each point to In order to search for optimal position in each genera- all cluster centers, minimum symmetrical distance tion, velocity and position of the particles have been of that point with respect to all clusters has to be updated. Each particle basically updates its velocity on the calculated. For this purpose, the Kd-tree-based basis of current velocity, personal best position it has nearest neighbor search is used. If the points are


    roughly uniformly distributed, then the expected case complexity is $O(mD + \log(n))$, where $m$ is a constant depending on the dimension and the point distribution. This is $O(\log n)$ if the dimension $D$ is a constant. Friedman et al. [16] also reported $O(\log n)$ expected time for finding the nearest neighbor. So in order to find the minimal symmetrical distance of a particular point, $O(C_{max} \times \log(n))$ time is needed. Thus, the total complexity of computing the membership values of $n$ points to $C_{max}$ clusters is $O(C_{max} \times n \times \log(n))$.
  • For updating the centers, the total complexity is $O(C_{max})$.
  • The total complexity for computing the fitness values is $O(n \times C_{max})$.
  So the fitness evaluation has total complexity $O(swarmsize \times C_{max} \times n \times \log(n))$.
• The complexity of calculating Pbest for each particle is $O(swarmsize \times C_{max} \times n)$.
• The complexity of calculating Gbest is $O(swarmsize)$.
• The complexity of updating the velocity matrix for each particle is $O(C_{max} \times n \times constant)$, i.e., $O(C_{max} \times n)$.
• The complexity of updating the position matrix for each particle is $O(n \times constant)$, i.e., $O(n)$.

Thus, summing up the above complexities, the total time complexity becomes $O(C_{max} \times n \times \log(n) \times swarmsize)$ per iteration. For a maximum of $t_{max}$ iterations, the total complexity becomes $O(C_{max} \times n \times \log(n) \times swarmsize \times t_{max})$.

5 Datasets chosen

For our experiments, we have chosen three artificial datasets: Sym_3_2, Sph_4_3 and Mixed_3_2, and six real-life datasets obtained from the UCI repository [5]: Cancer, Glass, Iris, NewThyroid, Wine and LiverDisorder. The total numbers of data points to be clustered are 350, 400, 600, 150, 214, 683, 215, 178 and 345 for Sym_3_2, Sph_4_3, Mixed_3_2, Iris, Glass, Cancer, NewThyroid, Wine and LiverDisorder, respectively. The dimensions of the data points for the nine datasets are 2, 3, 2, 4, 9, 9, 5, 13 and 32, respectively.

Sym_3_2: The actual distribution of clusters is shown in Fig. 4. Here there are a total of 350 points distributed over three differently shaped clusters: ring-shaped, compact and linear clusters.

Fig. 4 Sym_3_2

Sph_4_3: This dataset consists of 400 data points in 3-dimensional space distributed over four hyperspherical disjoint clusters, where each cluster contains 100 data points. This dataset is shown in Fig. 6a.

Mixed_3_2: The distribution of clusters is shown in Fig. 7. Here there are 600 points spread over three equal sized clusters.

Cancer: Here we use the Wisconsin breast cancer dataset obtained from [5]. Each pattern has nine features corresponding to clump thickness, cell size uniformity, cell shape uniformity, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli and mitoses. There are two categories in the data: malignant and benign. The two classes are known to be linearly separable.

Iris: This dataset, obtained from [5], represents different categories of irises characterized by four feature values [5]. It has three classes: Setosa, Versicolor and Virginica. It is known that two of the classes (Versicolor and Virginica) have a large amount of overlap, while the class Setosa is linearly separable from the other two.

Glass: This is the glass identification data [5] consisting of 214 instances having nine features (an Id feature has been removed). The study of the classification of the types of glass was motivated by criminological investigation. At the scene of a crime, the glass left behind can be used as evidence if it is correctly identified. There are six categories present in this dataset.

Newthyroid: The original database from which this dataset has been collected is titled Thyroid gland data [5]. Five laboratory tests are used to predict whether a patient's thyroid belongs to the class euthyroidism, hypothyroidism or hyperthyroidism. There are a total of 215 instances and the number of attributes is five.

Wine: This is the wine recognition data consisting of 178 instances having 13 features resulting from a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.


LiverDisorder: This is the LiverDisorder data consisting of 345 instances having six features each. The dataset has two categories.

6 Results and discussion

The obtained experimental results are provided in Table 3. We have executed the following clustering algorithms on all three artificial and six real-life datasets: Fuzzy-VMODEPS: Fuzzy variable length modified differential evolution with point symmetry-based clustering technique; VGAPS: variable length genetic algorithm with point symmetry distance-based clustering technique; Fuzzy-VPSOPS: Fuzzy variable length modified particle swarm optimization with point symmetry-based clustering technique; GCUK: genetic algorithm-based K-means clustering technique; HNGA: hybrid niching genetic algorithm-based clustering technique. For Fuzzy-VMODEPS, the following parameter combinations are used: population size = 100, number of generations = 30, F = 0.8, CR = 0.5. For Fuzzy-VPSOPS, the following parameter combinations are kept: maximum number of iterations = 30, swarm size = 100, $co_1 = co_2 = 2$; $w$: 0.9 to 0.4. For the other clustering algorithms, the parameters mentioned in the corresponding papers are used.

6.1 Parameter study

In order to select the optimal values of the parameters for the two proposed clustering algorithms, we have performed a thorough sensitivity study of the parameters. We have varied the parameters of these two clustering techniques over a range. The results obtained with different parameter values for Fuzzy-VMODEPS are shown in Table 1. Different values of F, CR, maximum number of generations and population size are used. Here we have first performed the sensitivity studies of the parameters on the Iris and Cancer datasets. The optimal values are then used in Fuzzy-VMODEPS when it is applied to the other datasets. Table 1 clearly illustrates that the optimal parameter values for Fuzzy-VMODEPS are population size = 100, number of generations = 30, F = 0.8, CR = 0.5, because with this parameter setting the best results by Fuzzy-VMODEPS are obtained for both datasets. Similarly, the sensitivity studies of the parameters are also done for the Fuzzy-VPSOPS algorithm. We have varied the values of $co_1$, $co_2$, the number of iterations and the swarm size over a range. The results obtained by the Fuzzy-VPSOPS algorithm with different parameter settings for the Iris and Cancer datasets are shown in Table 2. This table illustrates that the best parameter values for Fuzzy-VPSOPS are the following: maximum number of iterations = 30, swarm size = 100, $co_1 = co_2 = 2$.

6.2 Discussion of results

For the artificial datasets, we have shown the clustering results visually. The artificial datasets have either 2 or 3 dimensions; thus, such visualization is possible. The final results obtained after application of GCUK, VGAPS/Fuzzy-VMODEPS/Fuzzy-VPSOPS and HNGA to the Sym_3_2 dataset are shown in Fig. 5a–c, respectively, where the genetic algorithm-based K-means clustering technique (GCUK) and HNGA are found to fail in providing the proper clusters. The

Table 1 Minkowski Score (MS) values corresponding to different parameter values obtained by Fuzzy-VMODEPS: Fuzzy variable length modified differential evolution with point symmetry-based clustering technique

Dataset  F    CR   Max_gen  Population size  # Obtained clusters  MS
Iris     0.5  0.4  40       40               3                    0.633745
Iris     0.8  0.5  20       50               3                    0.800680
Iris     0.8  0.5  30       100              3                    0.61
Cancer   0.8  0.5  20       50               3                    0.362014
Cancer   0.5  0.4  40       80               3                    0.38640
Cancer   0.8  0.5  30       100              2                    0.346018

Table 2 Minkowski Score (MS) values corresponding to different parameter values obtained by Fuzzy-VPSOPS: Fuzzy variable length modified particle swarm optimization with point symmetry-based clustering technique

Dataset  C1   C2   Max_iteration  Swarm size  # Obtained clusters  MS
Iris     1.5  1.5  30             100         2                    0.824786
Iris     2    2    40             80          3                    0.657143
Iris     2    2    30             100         3                    0.61
Cancer   2    2    20             50          4                    0.639396
Cancer   2    2    40             80          2                    0.370785
Cancer   2    2    30             100         2                    0.3670


Fig. 5 Clustering of Sym_3_2 for C = 3 after application of (a) GCUK (b) VGAPS/Fuzzy-VPSOPS/Fuzzy-VMODEPS (c) HNGA clustering technique

proposed Fuzzy-VMODEPS/Fuzzy-VPSOPS clustering techniques are capable of detecting the correct partitioning from this dataset. These algorithms behave similarly to VGAPS. As the clusters have symmetrical structures, the proposed point symmetry-based clustering techniques are capable of detecting the partitionings correctly. Note that a close investigation reveals that some points in the ellipsoidal cluster are erroneously allocated to the ring cluster in the partitionings identified by the VGAPS/Fuzzy-VMODEPS/Fuzzy-VPSOPS approaches. This is because of the overlapping nature of these clusters. These few points are very near to the ring-shaped cluster. Thus, it is difficult to identify those points properly as belonging to the ellipsoidal-shaped cluster.

Figure 6 shows the clusters obtained by the GCUK, VGAPS/Fuzzy-VMODEPS/Fuzzy-VPSOPS and HNGA clustering techniques for the Sph_4_3 dataset. As is evident, genetic algorithm-based K-means (GCUK), HNGA and all the point symmetry-based clustering techniques, VGAPS, Fuzzy-VMODEPS and Fuzzy-VPSOPS, are successful in providing the proper clusters. This is because this dataset possesses some hyperspherical-shaped clusters. As all the clusters in this dataset are well-separated, all the approaches used in the current paper are able to identify them properly (Fig. 7).

Fig. 6 a Sph_4_3. b Clustering of Sph_4_3 obtained after application of GCUK/VGAPS/Fuzzy-VPSOPS/Fuzzy-VMODEPS/HNGA clustering technique

Fig. 7 Mixed_3_2

Figure 8a–c shows the clusters obtained by the GCUK, VGAPS/Fuzzy-VMODEPS/Fuzzy-VPSOPS and HNGA clustering techniques, respectively, for the Mixed_3_2 dataset. Here again all the point symmetry-based clustering techniques are capable of detecting the proper partitioning, but the GCUK and HNGA clustering techniques fail to detect the same. This is because of the presence of some overlapping clusters having symmetrical shapes and structures. The use of this dataset shows the utility of the point symmetry-based distance for properly detecting symmetrical structures. Again a close observation reveals that some points of the big hyperspherical cluster are erroneously allocated to the ellipsoidal-shaped cluster, as there is a large overlap between these two clusters. This is because those points have lower symmetrical distances with respect to the hyperspherical cluster than the ellipsoidal cluster.

For real-life datasets, visualization is not possible as these are high-dimensional datasets. For these datasets, in order to quantify the partitioning results obtained by different clustering techniques, we have used an external cluster validity index, the Minkowski Score [8]. This is a cluster validity measure responsible for checking the similarity between the obtained partitioning and the available true cluster information. Here, for the real-life datasets, the actual class information is known to us. This is used to measure the goodness of the different obtained partitionings. The Minkowski Score values obtained by these clustering techniques for all the real-life datasets are shown in Table 3.

Results show that for the Iris dataset, the Fuzzy-VMODEPS clustering technique performs the best. It attains the minimum Minkowski Score. The Fuzzy-VPSOPS clustering technique also performs better than the VGAPS clustering technique. The GCUK and HNGA clustering techniques perform slightly poorly for this dataset. This is because, even though there exist a total of three clusters in this dataset, there is a big overlap between two of the clusters. Thus, most of the algorithms detect only two clusters from this dataset. But our proposed techniques are capable of determining the appropriate partitioning with three clusters from this dataset.

For the Cancer dataset, again Fuzzy-VMODEPS performs well, whereas VGAPS and Fuzzy-VPSOPS perform similarly but poorly as compared to Fuzzy-VMODEPS. So for the Cancer dataset all the point symmetry distance-based clustering techniques perform well. Fuzzy-VPSOPS and VGAPS attain the same Minkowski Score. But the performance of the GCUK and HNGA clustering techniques is comparatively poor.

For the Newthyroid dataset, again Fuzzy-VMODEPS and Fuzzy-VPSOPS perform well compared to VGAPS, GCUK and HNGA. Fuzzy-VMODEPS attains the lowest Minkowski Score among all the clustering algorithms. But the GCUK and HNGA clustering techniques again perform poorly for this dataset.

For the Wine dataset, the performance of Fuzzy-VMODEPS and Fuzzy-VPSOPS is better than that of VGAPS. The Minkowski Score values attained by the Fuzzy-VMODEPS and Fuzzy-VPSOPS clustering techniques are lower compared to the VGAPS clustering technique. But again the GCUK and HNGA clustering techniques perform poorly for this dataset. They attain higher Minkowski Score values compared to the other techniques.

For the Glass dataset, again the newly developed point symmetry-based clustering techniques perform better than VGAPS. Both the Fuzzy-VPSOPS and Fuzzy-VMODEPS clustering techniques attain the same value of Minkowski Score. But again the GCUK and HNGA clustering techniques fail to identify the proper partitioning. These approaches attain higher values of Minkowski Score.

For the LiverDisorder dataset, again the Fuzzy-VMODEPS clustering technique performs the best in terms of Minkowski Score. The Fuzzy-VPSOPS clustering technique performs the second best in terms of the obtained Minkowski Score. VGAPS performs poorly for this dataset. It attains the maximum value of Minkowski Score.


Fig. 8 Clustering of Mixed_3_2 for C = 3 after application of (a) GCUK (b) VGAPS/Fuzzy-VPSOPS/Fuzzy-VMODEPS (c) HNGA clustering technique

We have also reported the number of clusters obtained by the different clustering techniques (refer to Table 4). It can be seen from this table that the proposed Fuzzy-VPSOPS and Fuzzy-VMODEPS clustering techniques are capable of determining the appropriate number of clusters automatically from different real-life datasets, where most of the existing techniques fail to do so.

In order to prove the efficacy of the objective function used, the FSym-index, we have also performed experiments with another cluster validity index as the objective function. Here, in both the DE- and PSO-based frameworks, the XB-index [45], another well-known cluster validity index, is used as the objective function. Results obtained using the XB-index as the objective function are shown in Tables 3 and 4. The obtained results clearly show that the FSym-index is indeed a better cluster quality measure than the XB-index. The use of the FSym-index helps the proposed approaches to automatically determine the appropriate partitioning from all the datasets. The poor performance of the XB-index is because of the use of Euclidean distance in its computation. In the proposed approaches, the point symmetry-based distance is used for allocating points to different clusters. Thus, there is a mismatch between the distance measure used for allocation and the one used for computing the objective function.

Thus, results on a wide variety of datasets reveal that the proposed DE- and PSO-based clustering techniques using the point symmetry-based distance are more capable of automatically identifying partitionings from the given datasets than an existing genetic algorithm-based automatic clustering technique using point symmetry, VGAPS. The proposed algorithms are more robust in detecting the optimal partitionings. They are capable of identifying clusters having point symmetry properties without having knowledge about the total number of clusters present in the dataset. These DE- and PSO-based approaches obtain better partitioning results than VGAPS in terms of Minkowski Score. The comparison results with some genetic automatic clustering techniques using Euclidean distance for cluster assignment like GCUK and HNGA


Table 3 Minkowski Score values corresponding to different clustering techniques for different datasets; FVMODEPS: Fuzzy variable length modified differential evolution with point symmetry-based clustering technique; FVMODEPSXB: Fuzzy-VMODEPS clustering technique using XB-index as the objective function; VGAPS: variable length genetic algorithm with point symmetry distance-based clustering technique; FVPSOPS: Fuzzy variable length modified particle swarm optimization with point symmetry-based clustering technique; FVPSOPSXB: Fuzzy-VPSOPS clustering technique using XB-index as the objective function; GCUK: variable string length genetic K-means algorithm; HNGA: hybrid niching genetic algorithm

Dataset         FVMODEPS  FVMODEPSXB  VGAPS     FVPSOPS   FVPSOPSXB  GCUK      HNGA
Iris            0.61      0.840473    0.62      0.61      0.762738   0.847726  0.854081
Cancer          0.346018  0.4286      0.367056  0.367056  0.8534     0.386768  0.380332
Newthyroid      0.553785  0.8658      0.58      0.563478  0.894016   0.828616  0.838885
Wine            0.90095   1.6783      1.0854    0.943561  1.0027     1.2       0.97
Glass           0.8023    0.9262      1.106217  1.098560  0.9283     1.324295  1.117940
Liver disorder  0.968227  0.981873    0.987329  0.981923  0.982613   0.982611  0.981873

Table 4 Number of clusters obtained by different clustering techniques for different datasets: AC = actual number of clusters present in the dataset

Dataset         AC  FVMODEPSXB  FVMODEPS  VGAPS  FVPSOPSXB  FVPSOPS  GCUK  HNGA
Iris            3   2           3         3      4          3        2     2
Cancer          2   2           2         2      3          2        2     2
Newthyroid      3   2           3         3      5          3        8     5
Wine            3   6           3         2      9          3        4     3
Glass           6   3           6         6      8          6        3     6
Liver disorder  2   2           3         3      2          2        2     2

Table 5 Ranking computations for the five algorithms over six datasets based on the Minkowski Score values obtained

Dataset         Fuzzy-VMODEPS  VGAPS         Fuzzy-VPSOPS  GCUK          HNGA
Iris            0.61 (1)       0.62 (2)      0.61 (1)      0.847726 (3)  0.854081 (4)
Cancer          0.346018 (1)   0.367056 (2)  0.367056 (2)  0.386768 (4)  0.380332 (3)
Newthyroid      0.553785 (1)   0.58 (3)      0.563478 (2)  0.828616 (4)  0.838885 (5)
Wine            0.90095 (1)    1.0854 (4)    0.943561 (2)  1.2 (5)       0.97 (3)
Glass           0.8023 (1)     1.106217 (3)  1.098560 (2)  1.324295 (5)  1.117940 (4)
Liver disorder  0.968227 (1)   0.987329 (5)  0.981923 (3)  0.982611 (4)  0.981873 (2)
Average rank    1              3.17          2             4.17          3.5

prove that the point symmetry-based clustering techniques are more capable of handling symmetrical-shaped clusters. The proposed algorithms also perform better than the traditional clustering techniques in partitioning some real-life datasets. The real-life datasets used in the current paper are higher-dimensional datasets, and the obtained results show that the proposed techniques are capable of handling these real-life datasets as well, whereas most of the existing traditional techniques fail to do so. The developed symmetry-based clustering techniques can partition higher-dimensional datasets and are able to detect clusters automatically.

6.3 Statistical test

In order to prove the effectiveness of the proposed point symmetry-based clustering techniques statistically, we have also conducted some statistical significance tests guided by [12, 18]. The Friedman statistical test [17] is performed to establish whether the five clustering techniques used here for the experimental work, Fuzzy-VMODEPS, Fuzzy-VPSOPS, VGAPS, GCUK and HNGA, are similar or not. Each algorithm is assigned a rank in the course of this statistical test. There is a check to test whether the difference between the calculated average ranks and the


Fig. 9 Cluster profile plots for Serum dataset obtained by Fuzzy-VMODEPS approach

mean rank is significant or not. The Friedman test shows that for the proposed algorithms, the measured average ranks and the mean rank are different with a p value of 0.0106. The ranks are reported in Table 5. At the end, Nemenyi's test [32] is also performed for the pair-wise comparison of the clustering approaches. Here we have used $\alpha = 0.05$. Results reveal that for all the datasets we can reject the null hypotheses which state that the paired algorithms work in a similar way (as the corresponding p values are smaller than $\alpha$). Results reported in Table 5 also reveal that Fuzzy-VMODEPS is the rank 1 algorithm among all the algorithms used here for the purpose of experiments. The second best algorithm is Fuzzy-VPSOPS. Sometimes its behavior is similar to Fuzzy-VMODEPS, but sometimes it performs poorly compared to Fuzzy-VMODEPS.

7 Experimental results for gene expression data classification

This section discusses the results obtained by the proposed clustering techniques for various publicly available gene expression datasets which are used for the experimental analysis. The different performance metrics are also described in this section. Thereafter, the obtained experimental results are demonstrated quantitatively as well as by using visualization tools. Here five gene expression datasets are used. A short description of those datasets is provided below.

Yeast Sporulation: This dataset [30] contains the expression levels of 6118 genes measured at seven time points (0, 0.5, 2, 5, 7, 9 and 11.5 h). Out of the total 6118 genes, the genes whose expression levels did not change during harvesting are ignored. Finally, 474 genes have been used for analysis. This dataset is available from http://cmgm.stanford.edu/pbrown/sporulation.

Yeast Cell cycle: In this dataset [30], approximately 6000 genes over 17 time points have been considered. The genes whose expression levels show no substantial changes have been rejected. Finally, 384 of the 6000 genes have been chosen and the others are ignored. This dataset is downloadable from http://faculty.washington.edu/kayee/cluster.

Rat CNS: This dataset [30] contains the expression levels of 112 genes measured at nine time points. This dataset is available from http://faculty.washington.edu/kayee/cluster.

Arabidopsis Thaliana: This dataset [30] contains the expression levels of 138 genes measured at eight time points. This dataset is available from http://homes.esat.kuleuven.be/thijs/Work/Clustering.html.

Serum: This dataset consists of 8613 genes, where each gene has a total of 13 dimensions corresponding to 13 time points. Out of the total 8613 genes, 517 genes have been considered for experimental analysis and other genes have

Fig. 10 Cluster profile plots for Yeast Sporulation dataset obtained by Fuzzy-VMODEPS approach

been ignored because of no change in expression level over 12 time points. This dataset is available from http://www.sciencemag.org/feature/data/984559.shl.

7.1 Chosen validity measure

To measure the quality of the obtained gene clusters, we have chosen the following cluster quality measurement index.

Silhouette Index S(I) [37], an internal cluster validity measure, has been utilized to quantify the effectiveness of the clustering solution obtained by the proposed approach. Let $p$ be one parameter of the S(I) index, calculated as the average distance of a point from the other points of the same cluster. Likewise, $q$ is another parameter of the S(I) index, calculated as the minimum average distance of a point from the points of the other clusters. The S(I) index value is then calculated from the parameters $p$ and $q$ as defined below:

\[ S(I) = \frac{q - p}{\max(p, q)} \qquad (12) \]

Here the Silhouette index S(I) is taken as the average of the silhouette values over all the points. The Silhouette index measures the separability and compactness of clusters. Its value varies from −1 to +1, so the best partition yields a higher positive value of the S(I) index.

7.2 Cluster profile plot

The cluster profile plot [3] (for example, Figs. 9, 10, 11, 12) is used to represent the expression values of genes over different

Fig. 11 Cluster profile plots for Yeast Cell Cycle dataset obtained by Fuzzy-VMODEPS

time points. The expression value of a gene is denoted by light blue color. Red color is used to represent the average expression value for each cluster of genes.

7.3 Eisen plot

In the Eisen plot [14] (for example, Figs. 13, 14, 15, 16), a particular cell of the gene data matrix is colored using colors similar to those of the spots on the microarray, and in this way the expression value of a gene at a specific time point is specified. In the figure, different expression levels of genes are specified by shades of different colors. For example, shades of red, green and black represent higher expression levels, lower expression levels and an absence of expression values, respectively. In this paper, before plotting the genes in the Eisen plot, all the genes have been ordered in such a way that genes belonging to the same cluster are put one after another. Here white color is used as a cluster boundary separator.

7.4 Discussion of results

The proposed Fuzzy-VMODEPS and Fuzzy-VPSOPS clustering techniques use the search capabilities of differential evolution and particle swarm-based optimization, respectively. The parameter combinations used for the different clustering techniques are provided below. For Fuzzy-VMODEPS, the following parameter values are used: population size: 100, maximum number of generations: 30, CR = 0.04 and F = 0.8. For Fuzzy-VPSOPS, the following parameter combinations are used: maximum number of iterations = 30, swarm size = 100, $co_1 = co_2 = 2$, $w$: 0.9–0.4. The experimental results of Fuzzy-VMODEPS and Fuzzy-VPSOPS are shown over five real-life gene datasets like


Fig. 12 Cluster profile plots for RatCNS dataset obtained by Fuzzy-VMODEPS

Fig. 14 Eisen plot for Yeast Sporulation dataset obtained by Fuzzy-VMODEPS

Fig. 13 Eisen plot for Serum dataset obtained by Fuzzy-VMODEPS

Yeast cell cycle, Yeast sporulation, Serum, Thaliana and RatCNS. Table 6 reports the Silhouette index values obtained by the proposed approaches over the five gene datasets described in the current paper. Table 6 also reports the Silhouette index values obtained by VGAPS [38], GCUK [6] and HNGA [41], some well-known automatic clustering approaches. Table 6 reveals that the proposed Fuzzy-VMODEPS clustering technique attains a higher S(I) value for all the datasets compared to the other clustering techniques,


namely Fuzzy-VPSOPS, VGAPS, GCUK and HNGA. In Table 7, the number of clusters obtained by each of these algorithms is reported. It can be seen from this table that the Fuzzy-VMODEPS approach identifies the correct number of clusters for all the datasets used here. From Table 6, it is observed that the Fuzzy-VMODEPS approach performs the best among all the algorithms for clustering the five gene expression datasets.

In order to visually inspect the results obtained by the proposed Fuzzy-VMODEPS clustering technique, the Eisen plots (see Figs. 13, 14, 15, 16 for example) and cluster profile plots (see Figs. 9, 10, 11, 12) have also been drawn. Cluster profile plots show the distributions of expression values, over different time points, of the genes which belong to a single cluster obtained by the proposed approach. The compactness, that is, the similarity of the variations in the expression values of different genes belonging to the same cluster, indicates that these genes are indeed similar in functionality. The greater the compactness, the better the cluster in terms of the given expression values. In the case of the Eisen plot, the presence of similar colors in the same position represents the goodness of a cluster. From the corresponding Eisen plots, we can see that genes having similar expression profiles (denoted by similar colors) are grouped together and placed in the same cluster by the proposed Fuzzy-VMODEPS clustering technique. The genes with different expression values are placed in different clusters, which is also represented by the cluster profile plots. The obtained plots clearly show the effectiveness of the proposed technique, Fuzzy-VMODEPS, for clustering the five gene expression datasets, as compared to the other existing clustering techniques, Fuzzy-VPSOPS, VGAPS, GCUK and HNGA.

Fig. 15 Eisen plots for Yeast Cell Cycle dataset obtained by Fuzzy-VMODEPS

Fig. 16 Eisen plots for RatCNS dataset obtained by Fuzzy-VMODEPS

7.5 Biological significance test

In this paper, a biological significance test has been conducted at the 1% significance level for the Yeast Sporulation dataset. In order to establish the biological relevance of the clusters, the Gene Ontology annotation database (http://db.yeastgenome.org/cgi-bin/GO/goTermFinder) is used. All six clusters obtained by Fuzzy-VMODEPS are biologically significant, whereas for Fuzzy-VPSOPS, VGAPS, GCUK and HNGA, the numbers of biologically significant clusters are 4, 4, 2 and 2, respectively. The most significant GO terms and the corresponding p values for each of the six clusters (obtained by Fuzzy-VMODEPS) of the Yeast Sporulation dataset are reported in Table 8. From Table 8, we can observe that all the clusters obtained by the proposed Fuzzy-VMODEPS clustering technique are


Table 6 Silhouette index values obtained by five clustering algorithms for five gene expression datasets

| Dataset | Fuzzy-VMODEPS | VGAPS | Fuzzy-VPSOPS | GCUK | HNGA |
|---|---|---|---|---|---|
| Yeast Sporulation | 0.7060 | 0.6391 | 0.6520 | 0.5781 | 0.6263 |
| Yeast Cell Cycle | 0.4531 | 0.3595 | 0.4329 | 0.1741 | 0.2379 |
| Arabidopsis | 0.3557 | 0.3453 | 0.3524 | 0.3194 | 0.2748 |
| RatCNS | 0.4386 | 0.411 | 0.4123 | 0.3125 | 0.2805 |
| Human Fibroblasts Serum | 0.3807 | 0.3506 | 0.3691 | 0.2681 | 0.2455 |
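The Silhouette index reported in Table 6 can be illustrated with a short sketch. This is not the authors' implementation (the paper states that the index was computed in MATLAB); it is a minimal Python version that assumes Euclidean distance between expression profiles and a crisp assignment of every point to exactly one cluster. The function names `euclidean` and `silhouette_index` and the toy data are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_index(points, labels):
    """Mean silhouette width over all points.

    For point i, a is the mean distance to the other members of its own
    cluster, b is the smallest mean distance to the members of any other
    cluster, and s(i) = (b - a) / max(a, b).
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster: s(i) is taken as 0 by convention
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0.0:
            total += (b - a) / denom
    return total / len(points)


if __name__ == "__main__":
    # Two well-separated toy clusters (hypothetical data, not from the paper).
    toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(silhouette_index(toy_points, [0, 0, 1, 1]))
```

Values close to 1 indicate compact, well-separated clusters, which is why a higher S(I) in Table 6 is read as a better partitioning.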

Table 7 Number of clusters obtained by five clustering algorithms for different gene expression datasets (C denotes the actual number of clusters)

| Dataset | Fuzzy-VMODEPS | VGAPS | Fuzzy-VPSOPS | GCUK | HNGA |
|---|---|---|---|---|---|
| Yeast Sporulation (C = 6) | 6 | 4 | 4 | 2 | 2 |
| Yeast Cell Cycle (C = 5) | 5 | 2 | 5 | 10 | 9 |
| Arabidopsis (C = 4) | 4 | 3 | 2 | 5 | 8 |
| RatCNS (C = 6) | 6 | 3 | 6 | 9 | 12 |
| Human Fibroblasts Serum (C = 6) | 6 | 2 | 6 | 15 | 14 |
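To make the roles of the control parameters listed in Section 7.4 concrete, the following sketch shows how F and CR enter a differential evolution step and how an inertia weight w can be scheduled over the PSO iterations. It is only an illustration under stated assumptions: the trial-vector construction is the classic DE/rand/1/bin scheme, not the modified DE used inside Fuzzy-VMODEPS, and the linear decrease of w from 0.9 to 0.4 is assumed because the paper gives only the range. The function names are hypothetical.

```python
import random


def de_trial_vector(target, a, b, c, F=0.8, CR=0.04, rng=random):
    """Build one trial vector with classic DE/rand/1/bin (assumed variant).

    a, b and c are three distinct population members other than target.
    Each component is taken from the mutant a + F * (b - c) with
    probability CR, otherwise it is copied from the target vector.
    """
    n = len(target)
    j_rand = rng.randrange(n)  # guarantees at least one mutated component
    trial = []
    for j in range(n):
        if j == j_rand or rng.random() < CR:
            trial.append(a[j] + F * (b[j] - c[j]))
        else:
            trial.append(target[j])
    return trial


def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Inertia weight at iteration t, decreasing linearly (assumed schedule)."""
    return w_start - (w_start - w_end) * t / t_max


if __name__ == "__main__":
    rng = random.Random(0)
    print(de_trial_vector([0.0] * 5, [1.0] * 5, [2.0] * 5, [1.0] * 5, rng=rng))
    print([round(inertia_weight(t, 30), 3) for t in (0, 15, 30)])
```

With CR as small as 0.04, most components of the trial vector are inherited unchanged from the target, so each generation perturbs a candidate solution only slightly.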

Table 8 Some of the most significant GO terms obtained by the Fuzzy-VMODEPS clustering technique and the corresponding p values for each of the six clusters of the Yeast Sporulation dataset

| Clusters | Significant GO term | p value |
|---|---|---|
| Cluster1 | Anatomical structure formation involved in morphogenesis GO:0048646 | 2.60e-40 |
| | Sporulation GO:0043934 | 8.37e-40 |
| | Sporulation resulting in formation of a cellular spore GO:0030435 | 6.19e-39 |
| Cluster2 | Ribosome biogenesis GO:0042254 | 2.85e-15 |
| | Ribonucleoprotein complex biogenesis GO:0022613 | 2.14e-13 |
| | rRNA processing GO:0006364 | 2.32e-11 |
| Cluster3 | Meiotic nuclear division GO:0051327 | 1.52e-31 |
| | Meiosis GO:0007126 | 2.29e-29 |
| | Meiotic cell cycle GO:0051321 | 1.79e-26 |
| Cluster4 | Monocarboxylic acid metabolic process GO:0032787 | 8.80e-09 |
| | Oxoacid metabolic process GO:0043436 | 4.33e-07 |
| | Single-organism metabolic process GO:0044710 | 0.00222 |
| Cluster5 | Cytoplasmic translation GO:0002181 | 2.14e-56 |
| | Translation GO:0006412 | 1.86e-27 |
| | Peptide biosynthetic process GO:0043043 | 2.52e-27 |
| Cluster6 | Nicotinamide nucleotide metabolic process GO:0046496 | 2.89e-13 |
| | Pyridine nucleotide metabolic process GO:0019362 | 3.62e-13 |
| | Pyruvate metabolic process GO:0006090 | 5.35e-13 |

biologically relevant because the p values corresponding to the GO categories are less than 0.01.

7.6 Execution time

Both the proposed algorithms have been implemented in standard C++, and the experiments are performed on an Intel Core i5 processor running at 2.4 GHz with 4.0 GB RAM under the Linux platform. Moreover, MATLAB version 7.5 is used to draw the cluster profile plots and Eisen plots and also to compute the Silhouette index.

In Table 9, we have summarized the execution times taken by the different clustering algorithms used in the current study for partitioning the different datasets. This table shows that the proposed Fuzzy-VMODEPS approach is the fastest technique on 9 of the 11 datasets; the exceptions are Cancer, where GCUK is the fastest, and Yeast Cell Cycle, where VGAPS is the fastest.
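Returning to the 1% significance criterion of Section 7.5, the p value attached to a GO term measures how unlikely it is to find so many genes with that annotation in a cluster by chance. A common way to compute such a value is the upper tail of the hypergeometric distribution, sketched below. This is an illustration only: the function name `enrichment_p_value` and all gene counts are hypothetical, and the sketch omits any multiple-testing correction that an annotation tool may apply.

```python
from math import comb


def enrichment_p_value(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: number of genes in the background set
    K: number of background genes annotated with the GO term
    n: number of genes in the cluster
    k: number of cluster genes annotated with the GO term
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the 1% threshold.
    p = enrichment_p_value(N=6000, K=100, n=50, k=10)
    print(p, p < 0.01)
```

A cluster is then regarded as biologically significant for a term when the resulting p value falls below the chosen level, 0.01 in this paper.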


Table 9 Execution time (in seconds) taken by different clustering algorithms for different real-life and gene expression datasets

| Dataset | Fuzzy-VMODEPS | VGAPS | Fuzzy-VPSOPS | GCUK | HNGA |
|---|---|---|---|---|---|
| Iris | 16.250 | 16.324 | 18.610 | 20.797 | 19.253 |
| Cancer | 318.278 | 201.653 | 329.968 | 38.023 | 923.769 |
| Newthyroid | 20.989 | 27.674 | 50.932 | 62.946 | 84.636 |
| Wine | 24.672 | 24.798 | 25.536 | 39.837 | 69.549 |
| Glass | 45.307 | 56.875 | 90.074 | 102.739 | 82.147 |
| LiverDisorder | 77.836 | 83.562 | 84.565 | 99.888 | 120.413 |
| Yeast Sporulation | 233.598 | 256.335 | 280.008 | 273.404 | 269.5365 |
| Yeast Cell Cycle | 327.656 | 234.675 | 556.636 | 368.749 | 405.900 |
| Arabidopsis | 42.859 | 46.853 | 85.787 | 75.366 | 44.155 |
| RatCNS | 27.776 | 30.654 | 34.730 | 66.789 | 27.873 |
| Human Fibroblasts Serum | 356.212 | 365.542 | 557.811 | 495.934 | 550.949 |
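The timings in Table 9 can be summarized by counting, for each algorithm, the datasets on which it is the fastest. The short sketch below does this tally; the numbers are transcribed from Table 9, while the names `ALGORITHMS`, `TIMES` and `fastest_counts` are introduced here only for illustration.

```python
from collections import Counter

ALGORITHMS = ("Fuzzy-VMODEPS", "VGAPS", "Fuzzy-VPSOPS", "GCUK", "HNGA")

# Execution times in seconds, transcribed from Table 9
# (values in the same column order as ALGORITHMS).
TIMES = {
    "Iris": (16.250, 16.324, 18.610, 20.797, 19.253),
    "Cancer": (318.278, 201.653, 329.968, 38.023, 923.769),
    "Newthyroid": (20.989, 27.674, 50.932, 62.946, 84.636),
    "Wine": (24.672, 24.798, 25.536, 39.837, 69.549),
    "Glass": (45.307, 56.875, 90.074, 102.739, 82.147),
    "LiverDisorder": (77.836, 83.562, 84.565, 99.888, 120.413),
    "Yeast Sporulation": (233.598, 256.335, 280.008, 273.404, 269.5365),
    "Yeast Cell Cycle": (327.656, 234.675, 556.636, 368.749, 405.900),
    "Arabidopsis": (42.859, 46.853, 85.787, 75.366, 44.155),
    "RatCNS": (27.776, 30.654, 34.730, 66.789, 27.873),
    "Human Fibroblasts Serum": (356.212, 365.542, 557.811, 495.934, 550.949),
}


def fastest_counts(times):
    """Map each algorithm to the number of datasets on which it is fastest."""
    winners = (ALGORITHMS[row.index(min(row))] for row in times.values())
    return dict(Counter(winners))


if __name__ == "__main__":
    print(fastest_counts(TIMES))
```

The tally gives a per-dataset view of the comparison, which is more informative than a single overall statement about speed.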

8 Discussion and conclusions

In this paper, we have proposed two fuzzy bio-inspired automatic clustering techniques which utilize DE and PSO, respectively, as the underlying stochastic optimization tool. Both techniques utilize the established point symmetry-based distance for the allocation of points to different clusters and a symmetry-based cluster validity index, F-Sym-index, as the objective function. Moreover, the proposed clustering techniques are able to detect the total number of clusters present in the dataset automatically. Recent studies show that PSO- and DE-based approaches converge much faster than GA-based approaches. Motivated by this fact, some automatic clustering techniques based on DE and PSO have been proposed in the current paper. Results on various synthetic and real-life datasets indicate the superiority of the Fuzzy-VMODEPS technique over the Fuzzy-VPSOPS, VGAPS, GCUK and HNGA clustering techniques. In this context, five gene expression datasets, namely Yeast Sporulation, Yeast Cell Cycle, Arabidopsis, RatCNS and Serum, have also been clustered by the proposed point symmetry-based DE and PSO clustering techniques, and the obtained results are compared with those of other techniques, namely VGAPS, GCUK and HNGA. The proposed algorithms often perform better than the existing GA-based techniques in terms of cluster quality, even for gene expression datasets. The obtained results support the utility of PSO- and DE-based algorithms as the underlying optimization strategies. The results also indicate that the DE-based approach performs better than both the PSO- and GA-based approaches. The results on gene expression datasets further demonstrate the applicability of the proposed clustering techniques to real-life problems.

As future work, the proposed clustering techniques can be applied to the classification of remote sensing images and MRI brain images. In recent years, some multi-objective versions of DE and PSO have also become available. In general, multi-objective algorithms perform better than their single-objective versions. In future, we would like to develop some multi-objective clustering techniques using the search capabilities of DE and PSO.

References

1. Agustín-Blas LE, Salcedo-Sanz S, Jiménez-Fernández S, Carro-Calvo L, Del Ser J, Portilla-Figueras JA (2012) A new grouping genetic algorithm for clustering problems. Expert Syst Appl 39(10):9695–9703
2. Alam S, Dobbie G, Koh YS, Riddle P, Rehman SU (2014) Research on particle swarm optimization based clustering: a systematic review of literature and techniques. Swarm Evolut Comput 17:1–13
3. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson JJ, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403(6769):503–511
4. Ankerbrandt CA, Buckles BP, Petry FE (1990) Scene recognition using genetic algorithms with semantic nets. Pattern Recognit Lett 11:285–293
5. Asuncion A, Newman DJ (2007) UCI machine learning repository. http://www.ics.uci.edu/~mlearn/MLRepository.html. Accessed 20 Dec 2016
6. Bandyopadhyay S, Maulik U (2002) Genetic clustering for automatic evolution of clusters and application to image classification. Pattern Recognit 35(6):1197–1208
7. Bandyopadhyay S, Saha S (2007) GAPS: a clustering method using a new point symmetry based distance measure. Pattern Recognit 40(12):3430–3451
8. Ben-Hur A, Guyon I (2003) Detecting stable clusters using principal component analysis. Humana Press, Totowa
9. Bentley JL, Weide BW, Yao AC (1980) Optimal expected-time algorithms for closest point problems. ACM Trans Math Softw 6(4):563–580
10. Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum, New York


11. Cai Z, Gong W, Ling CX, Zhang H (2011) A clustering-based differential evolution for global optimization. Appl Soft Comput 11(1):1363–1379
12. Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
13. Dorigo M, Stützle T (2004) Ant colony optimization. MIT Press, Cambridge
14. Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95(25):14863–14868
15. Everitt BS (1993) Cluster analysis, third edn. Halsted Press, New York
16. Friedman JH, Bentley JL, Finkel RA (1977) An algorithm for finding best matches in logarithmic expected time. ACM Trans Math Softw 3(3):209–226
17. Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701
18. García S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9:2677–2694
19. Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, New York
20. He H, Tan Y (2012) A two-stage genetic algorithm for automatic clustering. Neurocomputing 81:49–59
21. Horta D, de Andrade IC, Campello RJGB (2011) Evolutionary fuzzy clustering of relational data. Theor Comput Sci 412(42):5854–5870
22. Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice-Hall, Englewood Cliffs
23. Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31(3):264–323
24. Kao Y, Chen C-C (2014) Automatic clustering for generalised cell formation using a hybrid particle swarm optimisation. Int J Prod Res 52(12):3466–3484
25. Kennedy J, Eberhart RC (2001) Swarm intelligence. Morgan Kaufmann Publishers Inc., San Francisco
26. Kuila P, Jana PK (2014) A novel differential evolution based clustering algorithm for wireless sensor networks. Appl Soft Comput 25:414–425
27. Liu G, Li Y, Nie X, Zheng H (2012) A novel clustering-based differential evolution with 2 multi-parent crossovers for global optimization. Appl Soft Comput 12(2):663–681
28. Liu Y, Xindong W, Shen Y (2011) Automatic clustering using genetic algorithms. Appl Math Comput 218(4):1267–1279
29. Mangat V (2012) Survey on particle swarm optimization based clustering analysis. In: Swarm and evolutionary computation, international symposia, SIDE 2012 and EC 2012, held in conjunction with ICAISC 2012, Zakopane, Poland, April 29–May 3, 2012. Proceedings, pp 301–309
30. Maulik U, Mukhopadhyay A, Bandyopadhyay S (2009) Combining pareto-optimal clusters using supervised learning for identifying co-expressed genes. BMC Bioinform 10(27)
31. Maulik U, Saha I (2009) Modified differential evolution based fuzzy clustering for pixel classification in remote sensing imagery. Pattern Recognit 42(9):2135–2149
32. Nemenyi P (1963) Distribution-free multiple comparisons. PhD thesis, Princeton University
33. Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1(1):33–57
34. Price K, Storn RM, Lampinen JA (2005) Differential evolution: a practical approach to global optimization (natural computing series). Springer, Secaucus
35. Rana S, Jasola S, Kumar R (2011) A review on particle swarm optimization algorithms and their applications to data clustering. Artif Intell Rev 35(3):211–222
36. Ravi V, Aggarwal N, Chauhan N (2010) Differential evolution based fuzzy clustering. In: Swarm, evolutionary, and memetic computing, first international conference on swarm, evolutionary, and memetic computing, SEMCCO 2010, Chennai, India, December 16–18, 2010. Proceedings, pp 38–45
37. Rousseeuw P (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20(1):53–65
38. Saha S, Bandyopadhyay S (2009) A new point symmetry based fuzzy genetic clustering technique for automatic evolution of clusters. Inf Sci 179(19):3230–3246
39. Saha S, Bandyopadhyay S (2008) Application of a new symmetry based cluster validity index for satellite image segmentation. IEEE Geosci Remote Sens Lett 5(2):166–170
40. Sarkar S, Patra GR, Das S, Chaudhuri SS (2013) Fuzzy clustering of image pixels with a fitness-based adaptive differential evolution. In: Swarm, evolutionary, and memetic computing, 4th international conference, SEMCCO 2013, Chennai, India, December 19–21, 2013, Proceedings, Part I, pp 179–188
41. Sheng W, Swift S, Zhang L, Liu X (2005) A weighted sum validity function for clustering with a hybrid niching genetic algorithm. IEEE Trans Syst Man Cybern Part B Cybern 35(6):56–67
42. Sibson R (1971) Mathematical taxonomy. Wiley, London
43. Su M-C, Chou C-H (2001) A modified version of the K-means algorithm with a distance based on cluster symmetry. IEEE Trans Pattern Anal Mach Intell 23(6):674–680
44. Tvrdík J, Křivý I (2015) Hybrid differential evolution algorithm for optimal clustering. Appl Soft Comput 35:502–512
45. Xie XL, Beni G (1991) A validity measure for fuzzy clustering. IEEE Trans Pattern Anal Mach Intell 13(8):841–847
46. Yang X-S, Deb S (2012) Two-stage eagle strategy with differential evolution. IJBIC 4(1):1–5
47. Yang X-S, Deb S (2014) Cuckoo search: recent advances and applications. CoRR, abs/1408.5316
48. Zhang L, Ma M, Liu X, Sun C, Liu M, Zhou C (2006) Differential evolution fuzzy clustering algorithm based on kernel methods. In: Rough sets and knowledge technology, first international conference, RSKT 2006, Chongqing, China, July 24–26, 2006, Proceedings, pp 430–435
