An Augmented Scheme of Degranulation Through a Modified Partition Matrix

Kaijie Xu, Witold Pedrycz, Fellow, IEEE, Zhiwu Li, Fellow, IEEE, and Mengdao Xing, Fellow, IEEE

Abstract—As an important technology in artificial intelligence, Granular Computing (GrC) has emerged as a new multi-disciplinary paradigm and received much attention in recent years. Information granules, forming an abstract and efficient characterization of large volumes of numeric data, have been considered as the fundamental constructs of GrC. By generating centroids (prototypes) and a partition matrix, fuzzy clustering is a commonly encountered way of information granulation. As a reverse process of granulation, degranulation involves data reconstruction completed on the basis of the granular representatives (decoding information granules into numeric data). Previous studies have shown that there is a relationship between the reconstruction error and the performance of the granulation process. Typically, the lower the degranulation error is, the better the performance of the granulation process becomes. However, the existing methods of degranulation usually cannot restore the original numeric data, which is one of the important reasons behind the occurrence of the reconstruction error. To enhance the quality of reconstruction (degranulation), in this study we develop an augmented scheme through modifying the partition matrix. By proposing the augmented scheme, we dwell on a novel collection of granulation-degranulation mechanisms. In the constructed approach, the prototypes can be expressed as the product of the dataset matrix and the partition matrix. Then, in the degranulation process, the reconstructed numeric data can be decomposed into the product of the partition matrix and the matrix of prototypes. Both the granulation and the degranulation are regarded as generalized rotations between the data subspace and the prototype subspace realized by the partition matrix and the fuzzification factor. The new partition matrix is constructed through a series of matrix operations. We offer a thorough analysis of the developed scheme. The experimental results are in agreement with the underlying conceptual framework. Results obtained on both synthetic and publicly available datasets are reported to show the enhancement of the data reconstruction performance thanks to the proposed method. It is pointed out that in some cases the reconstruction errors can be reduced to zero by using the proposed approach.

Index Terms—Granular Computing (GrC), partition matrix, information granularity, prototypes, granulation-degranulation mechanisms.

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61672400 and 61971349. (Corresponding author: Zhiwu Li.)
K. Xu is with the School of Electro-Mechanical Engineering, Xidian University, Xi'an 710071, China, and also with the Department of Remote Sensing Science and Technology, School of Electronic Engineering, Xidian University, Xi'an 710071, China (e-mail: [email protected]).
W. Pedrycz is with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6R 2V4, Canada, the School of Electro-Mechanical Engineering, Xidian University, Xi'an 710071, China, the Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland, and also with the Faculty of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia (e-mail: [email protected]).
Z. Li is with the School of Electro-Mechanical Engineering, Xidian University, Xi'an 710071, China, and also with the Institute of Systems Engineering, Macau University of Science and Technology, Macau 999078, China (e-mail: [email protected]).
M. Xing is with the National Laboratory of Radar Signal Processing, Xidian University, Xi'an 710071, China, the Collaborative Innovation Center of Information Sensing and Understanding, Xidian University, Xi'an 710071, China, and also with the National Key Laboratory of Microwave Imaging Technology, Institute of Electronics, Chinese Academy of Sciences, 100190, China (e-mail: [email protected]).

I. INTRODUCTION

Information granules [1] (fuzzy sets [2] and fuzzy relations, in particular) are the building blocks of fuzzy models, classifiers, and rule-based systems [3–7]. Information granules have also been considered to be the fundamental constructs of Granular Computing (GrC) [8–11]. Fuzzy clustering is one of the most commonly encountered approaches to construct information granules on the basis of experimental data. Fuzzy clustering methods basically focus on the abstraction of the original numeric data. Fuzzy C-Means (FCM), a fuzzy clustering method based on a cost function, has become a popular technique predominantly because of its simplicity and efficiency, and also because it rests on a complete theoretical framework and mathematical theory [12–15]. Admittedly, information granules are formed on the basis of the existing numeric evidence, which gives rise to the ever-growing importance of various fuzzy clustering mechanisms and to the relevance of the FCM [7], [16], in particular.

In the FCM algorithm, the structure in the dataset is expressed in terms of the constructed prototypes (clusters) and partition matrix [17]. Subsequently, data are encoded into information granules with the aid of the constructed prototypes and partition matrix. The reconstruction of information granules, usually referred to as a degranulation or decoding process, returns a numeric result [11]. The concept of granulation-degranulation plays an important role in GrC, just as fuzzification-defuzzification does in fuzzy control systems, and analog-to-digital (A/D) as well as digital-to-analog (D/A) conversion does in digital signal processing [7], [11], [18], [19]. So far, the topic of granulation-degranulation mechanisms has not been intensively studied. The lack of a well-established body of knowledge opens up new opportunities but also calls for more investigations in this area.

In [20], [21], the degranulation (reconstruction) error is used as an evaluation index of the performance of fuzzy clustering. Studies have shown that the reconstruction error depends on the granulation scheme and the selection of its parameters. In the FCM, these parameters refer to the fuzzification coefficient and the number of clusters (information granules), which determine the prototypes (clusters) and partition matrix. Typically, the lower the reconstruction error is, the better the performance of the clustering becomes. To reduce the reconstruction error, Hu et al. [11] make use of a linear transformation of the partition matrix to obtain the high-level associations among the data being clustered, and then set up an adjustment mechanism modifying the localization of the prototypes; thus, the partition matrix and the prototypes can be modified, and the reconstruction error is finally reduced with the use of population-based optimization. In [24], a modified clonal mutation scheme is proposed to enhance the reconstruction performance. By adjusting the location of the prototypes, Galaviz et al. [21] also develop a cluster optimization algorithm. Izakian et al. [25] make use of the reconstruction error as a vehicle to select the partition matrix so as to determine the results of fuzzy clustering. In [26], the reconstruction error is also used as a criterion to assign an anomaly score to each subsequence in anomaly detection in time series data. Similarly, in [27], a DFuzzy method with deep-learning-based autoencoders for graph clustering is designed by minimizing the reconstruction error. By considering the reconstruction error as an indicator of the quality of the constructed clusters, Casalino et al. [28] propose a dynamic incremental semi-supervised version of the standard FCM to adapt the number of clusters dynamically so as to capture adequately the evolving structure of streaming data. Recently, a new design method for a fuzzy radial basis function neural network classifier was proposed in [29]; based on the conditional Fuzzy C-Means clustering algorithm and realized with the aid of auxiliary information extracted by the locally linear reconstruction algorithm, it is shown to be an effective classification approach. In summary, the reconstruction (degranulation) error is an important evaluation index in fuzzy clustering: it exhibits a close relation with the partition matrix and the prototypes, and in turn one can optimize the partition matrix and the prototypes by minimizing the reconstruction error so as to enhance the performance of fuzzy clustering. Thus, optimizing the degranulation process is an important and well-motivated issue.

All of the aforementioned studies expose some similarities: they mainly focus on using optimization methods in the granulation process to enhance the granulation results. With these optimization methods, they can all achieve a reduction in the reconstruction error. However, they usually cannot make the information granules regenerate the original data, which is one of the important reasons behind the occurrence of the reconstruction error. Thus, we improve the performance of the degranulation (reduce the reconstruction error) from the perspective of restoring the original data.

In this paper, a novel reconstruction (degranulation) scheme is proposed. In the developed scheme, we build up a novel collection of models of granulation-degranulation mechanisms and complete a thorough analysis of the relationship among the prototypes (clusters), the partition matrix, and the information granules. In the granulation process, the matrix of prototypes is expressed as the product of the dataset matrix and the partition matrix, while in the degranulation process, the reconstructed numeric data can be decomposed into the product of the partition matrix and the matrix of prototypes. By building up a supervised learning mode of the granulation-degranulation based on the developed models, the partition matrix is optimized through a series of matrix operations. With the modified partition matrix, the original numeric data can be restored from the information granules, which reduces the reconstruction error significantly. The experimental studies demonstrate that the proposed approach achieves better performance than the FCM-based degranulation. To the best of our knowledge, the idea of the proposed approach has not been considered in previous studies.

This paper is organized as follows. The granulation-degranulation process is briefly reviewed in Section II. An augmented scheme of degranulation is dwelled upon in detail in Section III. Section IV includes an experimental setup and an analysis of the completed experiments. Section V covers some conclusions and identifies future research directions.

II. GRANULATION-DEGRANULATION MECHANISMS

Let us consider that a numeric dataset X ⊂ R^n comprising N objects with n attributes (measurements) is clustered into C groups. From a general point of view, fuzzy clustering aims to form information granules and to reveal a structure in the data. In the FCM clustering, we realize the minimization of the following objective function (performance index) [30]:

  J_FCM = Σ_{i=1}^{N} Σ_{j=1}^{C} μ_ij^m d_ij^2 = Σ_{i=1}^{N} Σ_{j=1}^{C} μ_ij^m ‖x_i − v_j‖^2,
  x_i = (x_i1, x_i2, …, x_in) ∈ R^{1×n}, X = [x_1; x_2; …; x_N] ∈ R^{N×n},
  i = 1, 2, …, N, k = 1, 2, …, n, j = 1, 2, …, C   (1)

  s.t. Σ_{j=1}^{C} μ_ij = 1, 0 < Σ_{i=1}^{N} μ_ij < N

where x_i is the i-th datum, v_j is the j-th center (prototype) of the cluster, μ_ij is the degree of membership of the individual x_i to the cluster j, m (m > 1) is a fuzziness exponent (fuzzification coefficient), and ‖·‖ stands for some distance. While there is a substantial diversity as far as distance functions are concerned, here we adhere to a weighted Euclidean distance taking on the following form [7]:

  ‖x_i − v_j‖^2 = Σ_{k=1}^{n} (x_ik − v_jk)^2 / σ_k^2   (2)

where σ_k stands for the standard deviation of the k-th variable. While not being computationally demanding, this type of distance is still quite flexible. The objective function shown above is minimized by iteratively updating the partition matrix U = [μ_1^T, μ_2^T, …, μ_N^T] ∈ R^{C×N}, μ_i = (μ_i1, μ_i2, …, μ_iC), and the prototypes V [31]:

  μ_ij = 1 / Σ_{t=1}^{C} (‖x_i − v_j‖ / ‖x_i − v_t‖)^{2/(m−1)}   (3)

  v_jk = (Σ_{i=1}^{N} μ_ij^m x_ik) / (Σ_{i=1}^{N} μ_ij^m),
  v_j = (v_j1, v_j2, …, v_jn), V = [v_1; v_2; …; v_C] ∈ R^{C×n}   (4)

To facilitate the analysis, we build a diagonal matrix Φ to decompose the matrix of prototypes:

  Φ = diag(1/Σ_{i=1}^{N} μ_i1^m, 1/Σ_{i=1}^{N} μ_i2^m, …, 1/Σ_{i=1}^{N} μ_iC^m) ∈ R^{C×C}   (5)

Then we build a new model of the granulation; that is, the prototypes can be decomposed into

  V = ΓX = ΦU^m X   (6)

where

  U^m = [ μ_11^m  μ_21^m  …  μ_N1^m
          μ_12^m  μ_22^m  …  μ_N2^m
          ⋮       ⋮           ⋮
          μ_1C^m  μ_2C^m  …  μ_NC^m ] ∈ R^{C×N}, i.e., (U^m)_{ji} = μ_ij^m   (7)

It can be seen from (6) that the granulation is essentially a subspace rotation problem. With the generalized rotation matrix Γ = ΦU^m, the data subspace spanned by the columns of X can be rotated into the prototype subspace spanned by the columns of V.

A certain granulation-degranulation mechanism is inherently associated with fuzzy clustering [7]. The granulation-degranulation mechanisms can be organized in two phases [11], as displayed in Fig. 1:

[Fig. 1 schematic: the input x is granulated with respect to the prototypes ν_1, ν_2, …, ν_C into the membership grades μ(x_1), μ(x_2), …, μ(x_C); degranulation maps these grades, together with the same prototypes, back to the reconstruction x̂.]

Fig. 1. Representation of the granulation-degranulation mechanisms.

Granulation of the data x is made in terms of the membership grades of the constructed information granules. In other words, the granulation mechanism returns a representation of any input data (pattern) x expressed in terms of the membership degrees μ(x_1), μ(x_2), …, μ(x_C); the membership grades are computed by (3).
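As an illustrative aside (not part of the original paper), the update equations (3) and (4) and the matrix model (6) can be sketched in a few lines of NumPy. All names here are ours; the memberships are stored as an N x C array (the transpose of the paper's C x N partition matrix), and the plain Euclidean distance stands in for the weighted distance (2), which coincides with it once the data are normalized to unit variance as in Section IV.

```python
import numpy as np

def fcm(X, C, m=2.0, n_iter=100, seed=0):
    """Minimal FCM sketch: alternate the membership update (3) and the
    prototype update (4).  U is stored as N x C (its rows sum to 1)."""
    rng = np.random.default_rng(seed)
    N, _ = X.shape
    V = X[rng.choice(N, size=C, replace=False)].copy()   # initial prototypes
    for _ in range(n_iter):
        # squared distances between every datum and every prototype (N x C)
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # eq (3): mu_ij = 1 / sum_t (||x_i - v_j|| / ||x_i - v_t||)^(2/(m-1))
        U = 1.0 / ((d2[:, :, None] / d2[:, None, :]) ** (1.0 / (m - 1.0))).sum(axis=2)
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]          # eq (4)
    return U, V

m = 2.0
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.1, size=(50, 2)) for c in (0.0, 1.0, 2.0)])
U, V = fcm(X, C=3, m=m)

# eq (6): the prototypes are exactly the product Phi U^m X of the scaled
# partition matrix and the dataset matrix (here the paper's U^m is Um.T)
Um = U ** m
Phi = np.diag(1.0 / Um.sum(axis=0))                       # eq (5)
V_from_product = Phi @ Um.T @ X
print(np.allclose(V, V_from_product))                     # True
```

The matrix identity holds exactly at any iterate, since (4) is precisely the row-wise form of (6).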

Degranulation provides a reconstruction of the original data in terms of the prototypes and membership grades computed by (4) and (3), respectively. Formally, the reconstruction of the datum x_i is determined by solving the following optimization problem, where the minimized performance index is defined as

  min over x̂_i of Σ_{j=1}^{C} μ_ij^m ‖x̂_i − v_j‖^2   (8)

As a result, the minimized reconstruction error is obtained for the result of reconstruction expressed as follows:

  x̂_i = (Σ_{j=1}^{C} μ_ij^m ν_j) / (Σ_{j=1}^{C} μ_ij^m)   (9)

The degranulation (reconstruction) error thus becomes a function of the prototypes of the clusters and the partition matrix; (9) is very similar to (4) used for prototype computing. Figs. 2 and 3 show original data (the Glass Identification dataset from the repository http://archive.ics.uci.edu/ml) and the data reconstructed with the FCM-based degranulation. Obviously, there is a vast difference between the original data and the data reconstructed with the FCM algorithm, which is also an important reason leading to the reconstruction error. As we may observe from (9), the reconstructed dataset X̂ is determined by U^m and V. In order to reduce the reconstruction error, we consider optimizing U^m and V.

Fig. 2. Glass Identification dataset.

Fig. 3. Reconstructed data with the FCM-based degranulation.

III. AN AUGMENTED SCHEME OF DEGRANULATION

Although data can be reconstructed, there exists a difference between the reconstructed dataset and the original dataset, and the reconstruction error becomes nonzero. To enhance the performance of the degranulation process, in this paper we develop a novel approach to optimize the degranulation which can bring the reconstructed dataset closer to the original dataset (viz. restore the original dataset).

A. Problem formulation

To further explore the mathematical relationships between the partition matrix, the prototypes, and the dataset, we build a novel model of the degranulation, that is, a matrix decomposition form of the reconstructed dataset X̂:

  X̂ = ΩV = ΘU^{mT} V   (10)

where

  Ω = ΘU^{mT} ∈ R^{N×C}, (Ω)_{ij} = μ_ij^m / Σ_{t=1}^{C} μ_it^m,
  i = 1, 2, …, N, j = 1, 2, …, C   (11)

  Θ = diag(1/Σ_{j=1}^{C} μ_1j^m, 1/Σ_{j=1}^{C} μ_2j^m, …, 1/Σ_{j=1}^{C} μ_Nj^m) ∈ R^{N×N}   (12)

It can be seen from (10) that the degranulation is also a subspace rotation problem. With the generalized rotation matrix Ω, the prototype subspace spanned by the columns of V can be rotated into the data subspace spanned by the columns of X.

Obviously, from (3), the entries of U^m can be expressed through the distances to the prototypes:

  μ_ij^m = (Σ_{t=1}^{C} ‖x_i − ν_t‖^{−2/(m−1)})^{−m} ‖x_i − ν_j‖^{−2m/(m−1)},
  i = 1, 2, …, N, j = 1, 2, …, C   (13)

As a result, X̂ can be written in the following way:

  X̂ = ΛΞV   (14)

where

  Λ = diag((Σ_{t=1}^{C} ‖x_1 − ν_t‖^{−2/(m−1)})^{−m} / Σ_{j=1}^{C} μ_1j^m, …,
           (Σ_{t=1}^{C} ‖x_N − ν_t‖^{−2/(m−1)})^{−m} / Σ_{j=1}^{C} μ_Nj^m) = ΨΘ   (15)

  Ψ = diag((Σ_{t=1}^{C} ‖x_1 − ν_t‖^{−2/(m−1)})^{−m}, …, (Σ_{t=1}^{C} ‖x_N − ν_t‖^{−2/(m−1)})^{−m})   (16)

  Ξ = [β(ν_1) β(ν_2) … β(ν_C)] ∈ R^{N×C},
  β(ν_j) = [‖x_1 − ν_j‖^{−2m/(m−1)}, ‖x_2 − ν_j‖^{−2m/(m−1)}, …, ‖x_N − ν_j‖^{−2m/(m−1)}]^T   (17)

Furthermore, Ω = ΛΞ. We define Ξ as a fuzzy curvature matrix of the dataset X, while β(ν_j) is the curvature vector of the j-th prototype.

B. Modification of the partition matrix

Ideally, we wish to have X̂ = X. At this point, X can be considered to be formed by V. Suppose that there exists a matrix Ω̂ between V and the dataset that satisfies the following expression:

  Ω̂V = X   (18)

The matrix Ω̂ can then be determined in the least-squares sense:

  Ω̂VV^T = XV^T, so Ω̂ = XV^T (VV^T)^{−1}   (19)

Since Ω = ΛΞ with Λ a diagonal matrix, in order to find a matrix Λ̂ satisfying Ω̂ ≈ Λ̂Ξ, we construct a cost function

  F = Σ_{i=1}^{N} ‖λ_i Ξ_(i) − Ω̂_(i)‖^2   (20)

where λ_i is the i-th diagonal element of Λ̂, and Ξ_(i) and Ω̂_(i) are the i-th rows of Ξ and Ω̂, respectively. What we should do is find the parameters λ_i that minimize the cost function F. The expanded form of F reads as follows:

  F = Σ_{i=1}^{N} (λ_i Ξ_(i) − Ω̂_(i))(λ_i Ξ_(i) − Ω̂_(i))^H
    = Σ_{i=1}^{N} (λ_i λ_i* Ξ_(i) Ξ_(i)^H − λ_i Ξ_(i) Ω̂_(i)^H − λ_i* Ω̂_(i) Ξ_(i)^H + Ω̂_(i) Ω̂_(i)^H)   (21)

where λ_i* is the conjugate of λ_i and H represents the conjugate transpose. Computing the partial derivative of F with respect to λ_i and setting it equal to 0, we have

  ∂F/∂λ_i = λ_i Ξ_(i) Ξ_(i)^H − Ω̂_(i) Ξ_(i)^H = 0   (22)

As a result, λ̂_i can be expressed as

  λ̂_i = Ω̂_(i) Ξ_(i)^H (Ξ_(i) Ξ_(i)^H)^{−1}, i = 1, 2, …, N   (23)

Thus, Λ̂ can be written as

  Λ̂ = diag(λ̂_1, λ̂_2, …, λ̂_N)   (24)

Another diagonal matrix Θ̂ is therefore obtained by

  Θ̂ = Ψ^{−1} Λ̂   (25)

Finally, we use the idea of the total least squares (TLS) approximation [32]. To find Û^m, we consider

  (Θ̂ + ΔΘ̂) Û^{mT} = Ω̂ + ΔΩ̂   (26)

Specifically, we find a matrix Û^m that minimizes the error matrices ΔΘ̂ and ΔΩ̂ for Θ̂ and Ω̂, respectively. That is,

  min over Û^{mT}, ΔΘ̂, ΔΩ̂ of ‖[ΔΘ̂ ΔΩ̂]‖_F^2 s.t. (Θ̂ + ΔΘ̂) Û^{mT} = Ω̂ + ΔΩ̂   (27)

where [ΔΘ̂ ΔΩ̂] is the augmented matrix with ΔΘ̂ and ΔΩ̂ side by side and ‖·‖_F is the Frobenius norm. (27) can also be rewritten in the form

  [(Θ̂ + ΔΘ̂) (Ω̂ + ΔΩ̂)] [Û^{mT}; −I_C] = 0, min over Û^{mT}, ΔΘ̂, ΔΩ̂ of ‖[ΔΘ̂ ΔΩ̂]‖_F^2   (28)

where I_C is the C×C identity matrix, and [Û^{mT}; −I_C] must be a matrix of rank C. Moreover, it is seen from (28) that [Û^{mT}; −I_C] must lie in the right null space of the augmented matrix [(Θ̂ + ΔΘ̂) (Ω̂ + ΔΩ̂)]. Thus, [(Θ̂ + ΔΘ̂) (Ω̂ + ΔΩ̂)] has to be a rank-deficient matrix that loses rank of at least C. If the minimization of ‖[ΔΘ̂ ΔΩ̂]‖_F^2 is further considered, it must lose rank of exactly C, since a rank defect over C would necessarily increase ‖[ΔΘ̂ ΔΩ̂]‖_F^2. Our first goal is then to find [ΔΘ̂ ΔΩ̂] such that the rank of [(Θ̂ + ΔΘ̂) (Ω̂ + ΔΩ̂)] decreases from N + C to N.

Define Π Σ G^T as the singular value decomposition of the augmented matrix [Θ̂ Ω̂]:

  [Θ̂ Ω̂] = Π [Σ_Θ̂ 0; 0 Σ_Ω̂] [G_Θ̂ G_Ω̂]^T   (29)

where Π and G are orthonormal matrices, all the diagonal entries of the diagonal matrix Σ are the singular values arranged in decreasing order, and G is partitioned into two sub-parts corresponding to Θ̂ and Ω̂. The rank is reduced by setting the smallest singular values to zero. Thus, considering the minimization of ‖[ΔΘ̂ ΔΩ̂]‖_F^2, we should take

  [ΔΘ̂ ΔΩ̂] = −Π Σ_Ω̂ G_Ω̂^T   (30)

so that we have

  [(Θ̂ + ΔΘ̂) (Ω̂ + ΔΩ̂)] = Π Σ_Θ̂ G_Θ̂^T   (31)

Hence, the right null space of [(Θ̂ + ΔΘ̂) (Ω̂ + ΔΩ̂)] is spanned by G_Ω̂, and the solution [Û^{mT}; −I_C] must be spanned by G_Ω̂. Let

  [Û^{mT}; −I_C] = G_Ω̂ T   (32)

where T ∈ R^{C×C} is an appropriate matrix. Partitioning G_Ω̂ = [G_Ω̂1; G_Ω̂2], with G_Ω̂1 ∈ R^{N×C} and G_Ω̂2 ∈ R^{C×C}, (32) yields the following relations:

  Û^{mT} = G_Ω̂1 T   (33)

  −I_C = G_Ω̂2 T   (34)

From (33) and (34) it directly follows that

  Û^{mT} = −G_Ω̂1 G_Ω̂2^{−1}   (35)

With the new Û^m, a new reconstructed dataset can be obtained. With the proposed approach (modifying the partition matrix), the reconstructed dataset X̂ is closer to the original dataset X. Fig. 4 shows the principle of the proposed scheme.

[Fig. 4 schematic: X is granulated into U and V and degranulated into X̂; the criterion min ‖X − X̂‖ drives the modification of the partition matrix.]

Fig. 4. An overall model: main functional processing phases.

Fig. 5 visualizes the reconstructed data of the Glass Identification dataset obtained with the proposed scheme. It can be seen that the structure of the data reconstructed with the proposed scheme is much closer to the original dataset than the reconstruction produced by the FCM-based degranulation; the shape of the reconstructed data follows the original one far more faithfully. As will be shown in Section IV, the best experimental results for this dataset are reported in Table I, and the results obtained for the 6-D synthetic dataset show that the proposed method is the clear winner in terms of the minimal reconstruction error.

Fig. 5. Reconstructed data realized with the aid of the proposed method.

Fig. 6. Synthetic dataset.

IV. EXPERIMENTAL STUDIES

The following experiments are designed to evaluate the performance of the proposed approach and to explore several different scenarios. The experiments are conducted for a variety of datasets using the FCM-based degranulation and the proposed method. A six-dimensional synthetic dataset is used, as seen in Fig. 6 in Section IV-A. A number of publicly available datasets coming from the machine learning repository (http://archive.ics.uci.edu/ml) are also used; data coming from the UCI machine learning repository [33] are commonly used as benchmarks [34]. All datasets are normalized to have zero mean and unit standard deviation.

The reconstruction error is taken as the evaluation index:

  R_error = (1/N) Σ_{i=1}^{N} ‖x_i − x̂_i‖^2   (36)

It should be noted that the normalized Euclidean distance is used in (36) to determine the discrepancy (distance) between the original n-dimensional entities (patterns) and the reconstructed ones.

We consider different values of the number of clusters c and the fuzzification coefficient m. We run the algorithms on each dataset with the number of clusters c ranging from C (here C stands for the number of classes of each dataset) to C + 5 [35]. The values of the fuzzification coefficient m are taken from 1.2 to 3, with a step size of 0.2. The algorithms are terminated once the following stopping condition is satisfied:

  max_{i,j} |μ_ij^(k+1) − μ_ij^(k)| ≤ 10^{−5}   (37)

To estimate the effectiveness of the proposed method, we use 10-fold cross validation [22], [36], which is commonly used to estimate (and validate) the performance of granulation-degranulation models [21].

A. Synthetic data

First, we report the results of reconstruction performance for an illustrative 6-D synthetic dataset with 300 individuals and three categories. To visualize and contrast the performance of the proposed method and the FCM method, Figs. 6 to 8 show the original synthetic data and the data reconstructed with the FCM method and with the proposed method, respectively.

Fig. 7. Reconstructed data with the FCM-based degranulation.

Fig. 8. Reconstructed data with the proposed method.

B. Publicly available data

In this section, we use 11 publicly available datasets: vertebral column, glass identification, urban land cover, connectionist bench, climate model simulation crashes, breast cancer, qsar biodegradation, statlog (heart), adult, banknote authentication, and waveform database generator version 2. A detailed introduction of these datasets can be found in the UCI machine learning repository (http://archive.ics.uci.edu/ml). The reconstruction errors and the standard deviations of these indices obtained for each publicly available dataset are summarized in Tables II-VII and Fig. 9.

It is noticeable that the reconstruction errors (on both the training and the testing sets) of all the datasets are reduced by using the proposed approach. The improvement is about 25% on average and varies between a minimal improvement of 5% and 90% in the case of the most visible improvement. It is especially pointed out that in some cases the error can be reduced to zero. The most notable result is the one obtained with the vertebral column dataset, where the reconstruction errors of the training set and the test set are reduced from 0.083 and 0.189 to 0.0002, respectively. The results obtained with the glass identification and the urban land cover datasets are also good: in the former the reconstruction error is reduced by 90%, while in the latter it is reduced by about 30%. Although the reconstruction error of some datasets, such as waveform database generator version 2, is only slightly reduced, it is still improved by 5%.

The shortcoming of the proposed approach is an additional computational burden, due to the complex eigenvalue decomposition (EVD) and singular value decomposition (SVD) involved, especially for highly dimensional datasets.
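The evaluation index (36) is easy to state in code. Below is a minimal sketch (ours), assuming the per-feature standard deviations of the original data serve as the weights σ_k of the normalized distance (2):

```python
import numpy as np

def reconstruction_error(X, X_hat, sigma=None):
    """Reconstruction error (36): mean squared normalized Euclidean distance
    between original and reconstructed patterns; each coordinate difference
    is divided by the per-feature standard deviation, as in (2)."""
    if sigma is None:
        sigma = X.std(axis=0)
    return (((X - X_hat) / sigma) ** 2).sum(axis=1).mean()

X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]])
X_hat = X + 0.1                  # toy reconstruction, offset by 0.1 everywhere
print(round(reconstruction_error(X, X_hat), 6))   # 0.004
```

With both features having standard deviation sqrt(5), each coordinate contributes 0.01/5 = 0.002, hence 0.004 per pattern.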

Table I. Results of reconstruction error of 10-fold cross validation with all the protocols - synthetic dataset.

Dataset: synthetic dataset

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                8      8      8      3      8      8      8      8      8      8
m                1.4    1.2    1.4    1.4    1.2    1.4    1.2    1.2    1.4    1.2
Train, FCM       0.076  0.083  0.083  0.084  0.084  0.083  0.083  0.084  0.084  0.084
Train, Proposed  0.001  0.000  0.000  0.000  0.000  0.001  0.000  0.000  0.000  0.000
Test, FCM        0.173  0.186  0.173  0.193  0.201  0.205  0.200  0.173  0.207  0.180
Test, Proposed   0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.000
Total, FCM       0.086  0.093  0.092  0.095  0.095  0.095  0.094  0.093  0.097  0.093
Total, Proposed  0.001  0.000  0.000  0.000  0.000  0.001  0.000  0.000  0.000  0.000

Mean (standard deviation): Train: FCM 0.083 (0.002), Proposed 0.0002 (0.0002); Test: FCM 0.189 (0.014), Proposed 0.0004 (0.0001).

Table II. Results of reconstruction error of 10-fold cross validation with all the protocols - vertebral column dataset.

Dataset: Vertebral Column

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                8      8      8      8      8      8      8      8      8      8
m                1.2    1.2    1.2    1.2    1.2    1.2    1.4    1.2    1.2    1.4
Train, FCM       0.083  0.083  0.082  0.084  0.084  0.084  0.082  0.084  0.084  0.078
Train, Proposed  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
Test, FCM        0.197  0.201  0.210  0.203  0.189  0.176  0.195  0.176  0.154  0.186
Test, Proposed   0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.001
Total, FCM       0.095  0.095  0.095  0.096  0.094  0.093  0.094  0.093  0.091  0.089
Total, Proposed  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000

Mean (standard deviation): Train: FCM 0.083 (0.002), Proposed 0.0002 (0.0001); Test: FCM 0.189 (0.016), Proposed 0.0003 (0.0001).

Table III. Results of reconstruction error of 10-fold cross validation with all the protocols - glass identification dataset.

Dataset: Glass Identification

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                12     12     12     12     12     12     12     12     12     12
m                1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2
Train, FCM       0.115  0.114  0.110  0.108  0.111  0.113  0.112  0.107  0.108  0.111
Train, Proposed  0.006  0.006  0.006  0.005  0.006  0.006  0.006  0.005  0.005  0.005
Test, FCM        0.154  0.082  0.155  0.146  0.136  0.123  0.145  0.151  0.163  0.129
Test, Proposed   0.011  0.010  0.010  0.014  0.011  0.011  0.007  0.016  0.029  0.018
Total, FCM       0.119  0.111  0.114  0.112  0.114  0.114  0.115  0.111  0.114  0.113
Total, Proposed  0.006  0.006  0.006  0.006  0.006  0.006  0.006  0.007  0.008  0.006

Mean (standard deviation): Train: FCM 0.111 (0.003), Proposed 0.006 (0.000); Test: FCM 0.138 (0.023), Proposed 0.014 (0.006).

Table IV. Results of reconstruction error of 10-fold cross validation with all the protocols - urban land cover dataset.

Dataset: Urban Land Cover

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                7      7      7      7      7      7      7      7      7      7
m                1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2
Train, FCM       0.329  0.329  0.328  0.329  0.329  0.328  0.328  0.329  0.330  0.331
Train, Proposed  0.232  0.232  0.231  0.233  0.233  0.231  0.232  0.228  0.233  0.236
Test, FCM        0.882  0.841  0.876  0.859  0.869  0.857  0.872  0.875  0.877  0.837
Test, Proposed   0.603  0.630  0.630  0.622  0.624  0.615  0.618  0.624  0.618  0.604
Total, FCM       0.384  0.380  0.383  0.382  0.382  0.380  0.382  0.383  0.384  0.382
Total, Proposed  0.269  0.271  0.271  0.271  0.271  0.269  0.270  0.268  0.271  0.272

Mean (standard deviation): Train: FCM 0.329 (0.001), Proposed 0.232 (0.002); Test: FCM 0.864 (0.015), Proposed 0.619 (0.009).

Table V. Results of reconstruction error of 10-fold cross validation with all the protocols - connectionist bench dataset.

Dataset: Connectionist Bench

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                7      7      7      7      7      7      7      7      7      7
m                1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2
Train, FCM       0.434  0.432  0.433  0.433  0.430  0.437  0.435  0.427  0.437  0.429
Train, Proposed  0.394  0.386  0.398  0.412  0.391  0.407  0.398  0.390  0.399  0.393
Test, FCM        0.989  1.105  1.014  1.085  1.084  0.968  1.002  1.094  1.019  1.109
Test, Proposed   0.969  1.063  1.015  0.983  1.005  0.830  0.889  1.029  0.941  1.023
Total, FCM       0.490  0.500  0.492  0.499  0.496  0.490  0.493  0.494  0.496  0.498
Total, Proposed  0.452  0.454  0.460  0.469  0.453  0.450  0.447  0.455  0.454  0.456

Mean (standard deviation): Train: FCM 0.433 (0.003), Proposed 0.397 (0.008); Test: FCM 1.047 (0.054), Proposed 0.975 (0.071).

Table VI. Results of reconstruction error of 10-fold cross validation with all the protocols - climate model simulation crashes dataset.

Dataset: Climate Model Simulation Crashes

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                7      7      7      7      6      7      7      7      7      4
m                1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2    1.2
Train, FCM       0.192  0.192  0.192  0.192  0.192  0.192  0.192  0.192  0.192  0.192
Train, Proposed  0.158  0.174  0.160  0.158  0.165  0.160  0.157  0.157  0.161  0.176
Test, FCM        0.465  0.469  0.474  0.468  0.477  0.471  0.467  0.463  0.467  0.511
Test, Proposed   0.433  0.408  0.433  0.435  0.432  0.421  0.430  0.412  0.415  0.451
Total, FCM       0.220  0.220  0.220  0.220  0.221  0.220  0.220  0.219  0.220  0.224
Total, Proposed  0.185  0.197  0.187  0.186  0.192  0.186  0.185  0.183  0.186  0.204

Mean (standard deviation): Train: FCM 0.192 (0.000), Proposed 0.163 (0.007); Test: FCM 0.473 (0.014), Proposed 0.427 (0.013).

Table VII. Results of reconstruction error of 10-fold cross validation with all the protocols - breast cancer dataset.

Dataset: Breast Cancer

10-Fold          P1     P2     P3     P4     P5     P6     P7     P8     P9     P10
C                7      7      7      7      7      7      7      7      7      7
m                1.2    1.4    1.4    1.4    1.4    1.4    1.4    1.4    1.4    1.4
Train, FCM       0.140  0.139  0.139  0.140  0.138  0.139  0.140  0.140  0.140  0.139
Train, Proposed  0.109  0.101  0.151  0.156  0.102  0.149  0.104  0.102  0.099  0.108
Test, FCM        0.353  0.346  0.346  0.344  0.351  0.369  0.352  0.343  0.341  0.340
Test, Proposed   0.229  0.300  0.235  0.286  0.338  0.245  0.319  0.292  0.272  0.248
Total, FCM       0.161  0.160  0.160  0.161  0.160  0.163  0.161  0.160  0.160  0.159
Total, Proposed  0.140  0.139  0.139  0.140  0.138  0.139  0.140  0.140  0.140  0.139

Mean (standard deviation): Train: FCM 0.139 (0.001), Proposed 0.118 (0.024); Test: FCM 0.349 (0.009), Proposed 0.276 (0.037).

Fig. 9. Results of reconstruction error of 10-fold cross validation with all the protocols.
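For completeness, the parameter protocol described in this section (cluster counts from C to C + 5, fuzzification coefficients from 1.2 to 3 with step 0.2, and the stopping rule (37)) can be enumerated as follows. This is our own sketch of the experimental loop, not code from the paper; the function names are illustrative.

```python
import numpy as np

def parameter_grid(n_classes):
    """Enumerate the search grid used in the experiments: cluster counts
    c = C, ..., C+5 and fuzzification coefficients m = 1.2, 1.4, ..., 3.0."""
    cs = range(n_classes, n_classes + 6)
    ms = [round(1.2 + 0.2 * k, 1) for k in range(10)]
    return [(c, m) for c in cs for m in ms]

def converged(U_new, U_old, tol=1e-5):
    """Stopping rule (37): largest absolute change of any membership."""
    return np.max(np.abs(U_new - U_old)) <= tol

grid = parameter_grid(3)
print(len(grid))   # 6 cluster counts x 10 values of m = 60
```

Each (c, m) pair would then be evaluated under 10-fold cross validation, reporting the reconstruction error (36) on training and test folds as in Tables I-VII.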

V. CONCLUSIONS

In this research, we propose a novel method to enhance the performance of the degranulation. During the design process, we build up a novel set of mathematical models of granulation-degranulation mechanisms so as to clearly present the relationship between the reconstructed dataset and the matrix of prototypes. The partition matrix is modified by building up a supervised learning mode of the granulation-degranulation based on the developed models. After modifying the partition matrix, the reconstructed dataset is much closer to the original dataset, which significantly reduces the reconstruction error. We conduct a theoretical analysis and offer a comprehensive suite of experiments; both the theoretical and the experimental results validate the proposed method. To the best of our knowledge, this is the first scheme of this kind that steadily improves the performance of degranulation. Unfortunately, the proposed method involves eigenvalue decomposition (EVD) and singular value decomposition (SVD), which bring some extra computing overhead.

The paper presents a way to modify the partition matrix to reduce the degranulation (reconstruction) error based on the proposed models of granulation-degranulation mechanisms; however, it is not the only way. Thus, the proposed models open a specific direction of research on restoring the original dataset (reducing the reconstruction error) and pose a more general problem concerning the reduction of computational complexity. In addition, future work includes the study of the model constructed in this paper combined with Petri nets [37] to reduce the degranulation error.

REFERENCES

[1] W. Pedrycz, "Granular computing for data analytics: a manifesto of human-centric computing," IEEE/CAA Journal of Automatica Sinica, vol. 5, no. 6, pp. 1025–1034, Nov. 2018.
[2] L. A. Zadeh, "Fuzzy sets," Information & Control, vol. 8, no. 3, pp. 338–353, Jan. 1965.
[3] F. Rogai, C. Manfredi, and L. Bocchi, "Metaheuristics for specialization of a segmentation algorithm for ultrasound images," IEEE Transactions on Evolutionary Computation, vol. 20, no. 5, pp. 730–741, Jan. 2016.
[4] J. Kerr-Wilson and W. Pedrycz, "Design of rule-based models through information granulation," Expert Systems with Applications, vol. 46, no. 2, pp. 274–285, Mar. 2016.
[5] X. C. Hu, W. Pedrycz, and O. Castillo, "Fuzzy rule-based models with interactive rules and their granular generalization," Fuzzy Sets and Systems, vol. 307, pp. 1–28, Jan. 2017.
[6] K. Neshatian, M. J. Zhang, and P.
[9] …, vol. 1, no. 3, pp. 181–197, Sep. 2016.
[10] M. A. Sanchez, O. Castillo, and J. R. Castro, "Information granule formation via the concept of uncertainty-based information with interval type-2 fuzzy sets representation and Takagi–Sugeno–Kang consequents optimized with Cuckoo search," Applied Soft Computing, vol. 27, pp. 602–609, Feb. 2015.
[11] X. C. Hu, W. Pedrycz, G. H. Wu, and X. M. Wang, "Data reconstruction with information granules: an augmented method of fuzzy clustering," Applied Soft Computing, vol. 55, pp. 523–532, Jun. 2017.
[12] J. C. Bezdek, "Pattern recognition with fuzzy objective function algorithms," Kluwer Academic Publishers, pp. 203–239, Jan. 1981.
[13] C. Lian, R. Su, and T. Denœux, "Dissimilarity metric learning in the belief function framework," IEEE Transactions on Fuzzy Systems, vol. 24, no. 6, pp. 1555–1564, Mar. 2016.
[14] N. Ghadiri, M. Ghaffari, and M. A. Nikbakht, "Big FCM: fast, precise and scalable FCM on Hadoop," Future Generation Computer Systems, vol. 77, pp. 29–39, Dec. 2017.
[15] O. Castillo, L. Cervantes, and J. Soria, "A generalized type-2 fuzzy granular approach with applications to aerospace," Information Sciences, vol. 354, pp. 165–177, Aug. 2016.
[16] E. Rubio, O. Castillo, F. Valdez, P. Melin, C. I. Gonzalez, and G. Martinez, "An extension of the fuzzy possibilistic clustering algorithm using type-2 fuzzy logic techniques," Advances in Fuzzy Systems, doi: 10.1155/2017/7094046, 2017.
[17] D. Kumar, J. C. Bezdek, and M. Palaniswami, "A hybrid approach to clustering in big data," IEEE Transactions on Cybernetics, vol. 46, no. 10, pp. 2372–2385, Sep. 2015.
[18] K. J. Xu, W. K. Nie, D. Z. Feng, X. J. Chen, and D. Y. Fang, "A multi-direction virtual array transformation algorithm for 2D DOA estimation," Signal Processing, vol. 125, pp. 122–133, Aug. 2016.
[19] K. J. Xu, W. Pedrycz, Z. W. Li, and W. K. Nie, "High-accuracy signal subspace separation algorithm based on Gaussian kernel," IEEE Transactions on Industrial Electronics, vol. 66, no. 1, pp. 491–499, Jan. 2019.
[20] X. B. Zhu, W. Pedrycz, and Z. W. Li, "Fuzzy clustering with nonlinearly transformed data," Applied Soft Computing, vol. 61, pp. 364–376, Jan. 2017.
[21] O. F. R. Galaviz and W. Pedrycz, "Enhancement of the classification and reconstruction performance of fuzzy c-means with refinements of prototypes," Fuzzy Sets and Systems, vol. 318, pp. 80–99, Jul. 2017.
[22] D. Graves and W. Pedrycz, "Kernel-based fuzzy clustering and fuzzy clustering: a comparative experimental study," Fuzzy Sets and Systems, vol. 161, no. 4, pp. 522–543, Feb. 2010.
[23] H. Izakian and W. Pedrycz, "Anomaly detection and characterization in spatial time series data: a cluster-centric approach," IEEE Transactions on Fuzzy Systems, vol. 22, no. 6, pp. 1612–1624, Dec. 2014.
[24] R. Liu, B. Zhu, R. Bian, Y. Ma, and L. Jiao, "Dynamic local search based immune automatic clustering algorithm and its applications," Applied Soft Computing, vol. 27, pp. 250–268, Feb. 2015.
[25] H. Izakian, W. Pedrycz, and I. Jamal, "Clustering spatiotemporal data: an augmented Fuzzy C-Means," IEEE Transactions on Fuzzy Systems, vol. 21, no. 5, pp. 855–868, Oct. 2013.
[26] I. Hesam and W. Pedrycz, "Anomaly detection in time series data using a fuzzy c-means clustering," 2013 Joint IFSA World Congress and
Andreae, “A filter approach to NAFIPS Annual Meeting (IFSA/NAFIPS), Edmonton, AB, Canada, Jun. multiple feature construction for symbolic learning classifiers using 2013. genetic programming,” IEEE Transactions on Evolutionary [27] V. Bhatia and R. Rinkle, “Dfuzzy: a deep learning-based fuzzy Computation, vol. 16, no. 5, pp. 645–661. Oct. 2012. clustering model for large graphs,” Knowledge and Information Systems, [7] W. Pedrycz and J. V. D. Oliveira, “A development of fuzzy encoding and vol. 57, no. 1, pp. 159–181, Oct. 2018. decoding through fuzzy clustering,” IEEE Transactions on [28] G. Casalino, C. Giovanna, and M. Corrado, “Incremental adaptive Instrumentation and Measurement, vol. 57, no. 4, pp. 829–837, Mar. semi-supervised fuzzy clustering for data stream classification,” 2018 2008. IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), [8] W. Pedrycz, “From fuzzy models to granular fuzzy models,” Rhodes, Greece, Jun. 2018. International Journal of Computational Intelligence Systems, vol. 9, no. [29] S. B. Roh, S. K. Oh, W. Pedrycz, “Design methodology for radial basis 1, pp. 35–42, Apr. 2016. function neural networks classifier based on locally linear reconstruction [9] G. Wilke and E. Portmann, “Granular computing as a basis of and conditional Fuzzy C-Means clustering,” International Journal of human-data interaction: a cognitive city use case,” Granular Computing, Approximate Reasoning, vol. 106, pp. 228–243, Mar. 2019. [30] Q. Wen, L. Yu, Y. Wang, and W. Wang, “Improved FCM algorithm based on the initial clustering center selection,” in Proc. International Conference on Consumer Electronics, Communications and Networks IEEE, pp. 351–354, Xianning, China, Jan. 2014. [31] J. M. L and H. W. Lewis, “Fuzzy clustering algorithms-review of the applications,” IEEE International Conference on Smart Cloud IEEE, pp. 282-288, New York, USA, Dec. 2016. [32] G. H. Gene and C. F. 
Van Loan, “Matrix computations (3rd ed.),” Johns Hopkins University Press Baltimore, MD. USA, 1996. [33] J. S. Aguilar-Ruiz, R. Giraldez, J. C. Riquelme, Natural encoding for evolutionary supervised learning,” IEEE Transactions on Evolutionary Computation, vol. 11, no. 4, pp. 466–479, Jul. 2007. [34] E. Hullermeier, M. Rifqi, S. Henzgen, and R. Senge, “Comparing fuzzy partitions: a generalization of the rand index and related measures,” IEEE Transactions on Fuzzy Systems, vol. 20, no. 3, pp. 546–556, Jun. 2012. [35] K. J. Xu, W. Pedrycz, Z. W. Li, and W. K. Nie, “Constructing a virtual space for enhancing the classification performance of fuzzy clustering,” IEEE Transactions on Fuzzy Systems, vol. 27, no. 9, pp.1779–1792, Sep. 2019. [36] E. Adeli, K. H. Thung, and L. An, “Semi-supervised discriminative classification robust to sample-outliers and feature-noises,” IEEE transactions on pattern analysis and machine intelligence, vol. 41, no. 2, p. 515–522, Feb. 2019. [37] Z. Y. Ma, Z. W. Li, and A. Giua, “Marking estimation in a class of time labelled Petri nets,” IEEE Transactions on Automatic Control, doi: 10.1109/tac.2019.2907413, 2019.