International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 06 | June-2016 www.irjet.net p-ISSN: 2395-0072

DENTAL BASED ON THE COMBINATION BETWEEN FUZZY CLUSTERING AND SEMI-SUPERVISED FUZZY CLUSTERING ALGORITHMS USING SPATIAL INFORMATION

Tran Manh Tuan1*, Tran Thi Ngan1, Nguyen Thuy Hoa1, Nguyen Ba Loi2 1 School of Information and Communication Technology, Thainguyen University, Vietnam 2 Medical Vocational School Ba Ria – Vung Tau, Vietnam

------***------Abstract - Dental image segmentation has an important problems related to the quality of clusters. The fuzzy role in practical dentistry. This progress supports the clustering algorithms always have some additional dentists in diagnosing effectively some dental diseases information provided by users. This kind of such as cavities, resorption bone around teeth, fractured information gives the orientation for the clustering stump, periodontal disease, etc. Some recent researches, the scientists used various techniques consisting of step. These semi-supervised fuzzy clustering common image processing techniques, hard clustering algorithms are very effective in many different fields algorithms, fuzzy clustering algorithms and semi- such as image processing [4], , face supervised fuzzy clustering algorithms. These techniques recognition [1], risk assessment [3], bankrupt were applied in dental image segmentation problem. In prediction [9], especially in image segmentation with which, semi-supervised fuzzy clustering algorithms were quite effective in the meaning of high quality in color images and medical images. clustering. In this paper, we present an improvement by Therefore, the main contribution of this paper is to using fuzzy C-means clustering algorithm (FCM) propose a new semi-supervised fuzzy clustering combined with a semi-supervised fuzzy clustering algorithm in which there is a combination between algorithm and its application in image segmentation problem. The additional information used in this semi- fuzzy clustering and semi-supervised fuzzy clustering supervised fuzzy clustering algorithm is spatial algorithm. Spatial information of dental image is information of dental image. considered as the additional information of image dental segmentation problem. A part from that, the Key Words: Fuzzy clustering, Semi-supervised fuzzy clustering, spatial information, dental image, image evaluation of new algorithm’s performance is segmentation. implemented based on real data sets including 56 dental X-ray images of the patient during the period 2014-2015 at Hanoi Medical University Hospital. The 1.INTRODUCTION paper also addresses some conclusions regarding to the use of this new algorithm. Image segmentation is the first step in image processing progress. It has an important role and The significance of this study is to find out a new effects to the accuracy of following steps in image effective algorithm in dental X-ray image segmentation. processing [8]. Dental image segmentation also is the To carry out this, we propose a mathematical model for pivotal part in practical dentistry, especially in order to the objective function of semi-supervised fuzzy support the dentists in dental disease diagnosing [11]. clustering algorithm in the form of optimization In recently researches, there are two groups of image problem and use the additional information to improve the quality of clustering. Based on a sample set of real segmentation techniques including the techniques dental X-ray images, the model is evaluated and based on threshold [6] and based on clustering compared with other related algorithms. The algorithms [12]. These methods meet the challenges in application of this algorithm in the segmentation determining values of threshold parameter and problem plays a significant role in following image defining the common boundaries of teeth samples [13]. processing progress. Thus, fuzzy clustering is an effective tool to solve the

© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 18

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 06 | June-2016 www.irjet.net p-ISSN: 2395-0072

The rest of the paper is organized as follows. In Section 1 2, we give the preliminary about the fuzzy clustering u kj  1 (4) algorithms (fuzzy clustering-FCM and semi-supervised C  X  v  m1 standard fuzzy clustering-SSSFC). Section 3 proposes a  k j    X  v  new semi-supervised fuzzy algorithm based on the i1  k i  combination between FCM and SSSFC in which additional information is the spatial information. Some The FCM algorithm is presented as in Table 1 experimental results are shown in Section 4 to validate the performance of new algorithm on a real dataset. Table 1. Fuzzy C-means algorithm Some conclusions and future works are given in Section 5. Input Dataset X including N elements in r-dimension space; The number of clusters C; fuzzier m; 2. Preliminary about the fuzzy clustering threshold; the largest number of iterations algorithms MaxStep In this Section, we present briefly a fuzzy clustering Output Matrix U and centers of clusters V algorithm (FCM) in Subsection 2.1 and semi-supervised standard fuzzy clustering algorithm SSSFC in FCM Subsection 22.2. 1 t = 0 2.1 Fuzzy C-means algorithm 2 t  Fuzzy C-Means (FCM), proposed by Bezdek, Ehrlich & ukj  random; k  1, N; j  1,C satisfy Full [2], is mainly based on the iteration process to the condition (2) optimize the membership matrix and the cluster 3 Repeat centers. The objective function of FCM is defined as follows: 4 t = t + 1 N C 2 m 5 t  J  ukj X k  v j  min (1) Compute V j ;j  1,C by formula (3) k1 j1 t  Where m is fuzzier; C is the number of clusters; N is the 6 Compute u ;j  1,C by formula (4) number of data elements; r is the dimension of the kj data; u is the membership degree of data elements X kj k 7 t  t 1 r th Until U U   or t > MaxStep to cluster j; X k  R is the k element of X  X , X ,..., X ; Vj is the center of cluster j. 1 2 N 2.2 Semi-supervised fuzzy clustering algorithm Then the constraints of the objective function in (1) Semi-supervised fuzzy clustering algorithms are are: constructed from fuzzy clustering with the additional

C information provided by users. Additional information makes the clustering progress have a higher quality. ukj  1; ukj 0,1;k  1, N (2) j1 There are three kinds of additional information consisting of Must-link and Cannot-link constraints, The optimal solutions of problem (1)-(2) obtained from labeled a part of data, pre-defined membership degree. Lagrange multiplier method are centers of clusters in Yasunori et al. [15] proposed a semi-supervised fuzzy equation (3) and membership degree in equation (4). clustering with the objective function as follow C N C m u m X 2 (5)  kj k J   ukj  u kj X k  v j  min k1 (3) k1 j1 v j  C m satisficing the constraints in (2). Then the additional ukj k1 membership degree of data point X k belonging to

clustering C j is denoted by u kj. This additional

© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 19

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 06 | June-2016 www.irjet.net p-ISSN: 2395-0072 membership matrix satisfies below condition 3. The novel combination of fuzzy clustering and semi-supervised clustering U  u kj | u kj 0,1,k 1, N, j 1,C, 3.1 The main mechanism of proposed model Based on the constraints in (2) and the objective function (5) we have C m  ukj  u kj X k k1 (6) v j  ; j 1,C C m  ukj  u kj k1

Where u are defined differently in two instances as kj follow

- m 1 : 2  1  m1 , , . (7)   k  1, N j 1,C C  X  v     k j  ukj  u kj  1 u kj  2  i1  C   m1  1    i1  X k  vi  - m=1: C  2 u kj 1 u kj ,khi k  arg min X k Vi , k  1, N ,   i j1 ukj    2 u kj ,khi k  arg min X k Vi  i

. (8) j 1,C Semi-Supervised Standard Fuzzy Clustering algorithm (SSSFC) is presented as in Table 2 Fig. 1. The diagram of new algorithm (SSFC-DF) Table 2. Semi-Supervised Standard Fuzzy Clustering algorithm In Figure 1, a novel model combining fuzzy clustering algorithm and semi-supervised fuzzy algorithm is Input Datasets X includes N elements; the number of presented. In which, additional information of semi- clusters C; additional membership matrix U , supervised fuzzy clustering algorithm is spatial Threshold  ; the maximum number of information. Firstly, from input image, we use FCM to iterations maxStep> 0 get the output membership matrix as the additional Output Matrix U and cluster centers V information of semi-supervised fuzzy clustering SSSFC algorithm. After that, the SSFC-SC is applied into the 1: t = 0 original image. The clustering results are evaluated by (t) 2: Initialize randomly clustering centers V various criteria together with the comparison among this proposed model and other related methods. 3: Repeat 4: (t) 3.2. The proposed combination model (SSFC-DF) Calculate U by defined values ukj using Based on the objective function of fuzzy clustering equation (7) if m 1 or using equation (8) in the case of m=1. algorithm (FCM) and semi-supervised fuzzy algorithm 5: t = t + 1 (SSSFC) shown in Section 2, we propose a new objective function. This combination objective function 6: Calculate V(t) by v using (6) j has three main components: fuzzy clustering objective (t) (t1) 7: Until V V   or t > maxStep function, spatial information and semi-supervised fuzzy clustering function. It can be written as in equation (9) below.

© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 20

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 06 | June-2016 www.irjet.net p-ISSN: 2395-0072

N C N C 2 2 2 2 The partial derivation of L(U) by U is presented by J  ukj  xk  v j  ukj  Rkj following equation k1 j1 k1 j1 (9) N C 2 2 L 2 2  u  u kj x  v  min  2u x  v  2u R  kj k j u kj k j kj kj k1 j1 kj (15) 2 This function must satisfy both constraints of fuzzy  2ukj  u kj xk  v j  k clustering algorithm and semi-supervised fuzzy algorithm: The membership degree is defined by using (16):

C 2    2u kj x  v ukj 1; ukj 0,1; k 1, N (10) k k j j1 ukj  (16) 2 2 2 * 2 xk  v j  Rkj  C ukj 1; u kj 0,1; k 1, N (11) From the constraint in (10), we have j1 2  C   C   ukj xk  v j   1  (17) Where C is the number of clusters, N is the number of k  1  2 2   2 2   j1   j1  2 xk  v j  Rkj  2 2 xk  v j  Rkj  objects in dataset, r is the dimension of each object, ukj       is the membership degree of kth data point belonging to The optimal solutions obtained from Lagrange jth cluster; u is the additional information (defined kj multiplier method are centers of clusters in (13) and from output of FCM). membership degree in (16)-(17). The detail steps of novel model are presented in Table Denote that r is kth point of dataset X k  R 3 below th Table 3. The proposed method (SSFC-DF) in dental image X  X1, X 2 ,..., X N ; vj is the center of j cluster, Rkj is a segmentation function used to calculate space distance [10] from X k Input Datasets X includes N elements; the number of to V . j clusters C; additional membership matrix U , Threshold  ; the maximum number of For solving the optimal problem (9-11), the Lagrange iterations maxStep> 0 multiplier method is used. Taking the derivative of (9) Output Matrix U and cluster centers V by V, we obtain: SSFC-DF 1: t = 0 J N N 2 2 (12) 2: (t)  2ukj (xk  v j )  2ukj ukj  ukj (xk  v j ) Initialize randomly V v j k 1 k 1 3: Repeat 4: (t) Setting up the value of this derivation to zero, we find Calculate U based on ukj using (16), (17). the cluster centers V as in (13). 5: t = t + 1 6: (t) N Calculate V by ukj using (13) 2 ukj   ukj  u kj xk 7: Until V (t) V (t1)   or t >maxStep k1 v j  N (13) 2 4. Experiment results ukj   ukj  u kj  k1 4.1. Preparing for illustration In this Section, we show the numerical results by Then Lagrange function is defined as applying new algorithm into dental image segmentation. The dataset consists of 56 dental X-ray N  C  L(U )  J    u 1 images obtained from Hanoi Medical University  k  kj  (14) k 1  j1  Hospital in the period of 2014-2015.

© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 21

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 06 | June-2016 www.irjet.net p-ISSN: 2395-0072

Based on this real dataset, fuzzy C-means clustering in to dental image segmentation problem. This (FCM) algorithm, semi-supervised standard fuzzy algorithm is based on the combination between fuzzy clustering (SSSFC) algorithm and our proposed clustering (FCM) algorithm and semi-supervised combination model (SSFC-DF) have been implemented fuzzy clustering with the spatial information used as in Matlab 2014 and executed on a PC VAIO laptop with additional information. The main contributions of Core i5 processor. this paper are i) to propose a new general diagram of The validity indices used to evaluate the performance dental image segmentation problem (presented in of new model in this paper are DB [14], SSWC [14], Section 3.1); ii) to propose a novel semi-supervised fuzzy clustering algorithm (given in Section 3.2); iii) PBM [14], IFV[7]. These indices are also used to the experimental results on the real dataset show the compare the effectiveness of our model with other evaluation and comparison of performance of the related algorithm (FCM, SSSFC) proposed algorithm with other related algorithms 4.2. Experiment results in detail (presented in Section 4). All three algorithms are installed in the same values From this study, there are some ideas for the future of parameters including the number of clusters C=10, works: i) to apply these results into typical applications  = 0.005 and maxStep=150 with 20 runs. The average such as image diagnosis support system, aesthetic of values obtained after 20 runs in each index are dentistry support system; ii) to improve this model presented as in Table 4. based on other features of dental image as additional Table 4. Mean of all algorithms on the given dataset (bold information in semi-supervised fuzzy clustering. values is the best in a row) FCM SSSFC SSFC-SC REFERENCES PBM 34590.6 87374.5 101982.5 [1]. Agarwal, M., Agrawal, H., Jain, N., & Kumar, M. (2010). Face recognition using principle component analysis, SSWC 0.629 0.763 0.874 eigenface and neural network. In Signal Acquisition and DB 0.658 0.562 0.546 Processing, 2010. ICSAP'10. International Conference on (pp. IFV 30.34 70.76 82.78 310-314). IEEE. From table 4, the performance of new model SSFC-DF [2]. Bezdek, J. C. (1981). Pattern recognition with fuzzy has the better value than others in all of four indices objective function algorithms. Kluwer Academic Publishers. [3]. Chen, J., Zhao, S., & Wang, H. (2011). Risk analysis of PBM, SSWWC, DB, IFV. The results of dental X-ray flood disaster based on fuzzy clustering method. Energy image segment are shown in Figure 2. Procedia, 5, 1915-1919. [4]. Chuang, K. S., Tzeng, H. L., Chen, S., Wu, J., & Chen, T. J. (2006). Fuzzy c-means clustering with spatial information for image segmentation. Computerized medical imaging and graphics, 30(1), 9-15. [5]. Jain, A. K., & Chen, H. (2004). Matching of dental X-ray images for human identification. Pattern recognition, 37(7), 1519-1532. [6]. Houhou, N., Bresson, X., Szlam, A., Chan, T. F., & Thiran, J. (a) Original image (b) Using FCM P. (2009). Semi-supervised segmentation based on non-local continuous min-cut. In Scale Space and Variational Methods in Computer Vision (pp. 112-123). Springer Berlin Heidelberg. [7]. Hu, C., Meng, L., & Shi, W. (2008). Fuzzy clustering validity for spatial data. Geo-spatial information science, 11(3), 191-196. [8]. Lim, Y. W., & Lee, S. U. (1990). On the color image

segmentation algorithm based on the thresholding and the (c) Using SSSFC (d) Using SSFC-DF fuzzy c-means techniques. Pattern recognition, 23(9), 935- Fig. 2. The illustration of segmentation results on Image 1 952. 5. Conclusions [9]. Martin, A., Gayathri, V., Saranya, G., Gayathri, P., & Venkatesan, P. (2011). A hybrid model for bankruptcy In this paper, we concentrate to propose a novel prediction using genetic algorithm, fuzzy c-means and semi-supervised fuzzy clustering algorithm applied MARS. arXiv preprint arXiv:1103.2110. © 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 22

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 03 Issue: 06 | June-2016 www.irjet.net p-ISSN: 2395-0072

[10]. Ngo, L. T., Mai, D. S., & Pedrycz, W. (2015). Semi- degree on Computer Science supervising Interval Type-2 Fuzzy C-Means clustering with at Thai Nguyen University. spatial information for multi-spectral satellite image She received PhD degree on classification and change detection. Computers & applied Mathematics – Informatics at Geosciences, 83, 1-16. Hanoi University [11]. Oliveira, J., & Proença, H. (2011). Caries detection in panoramic dental X-ray images. In Computational Vision and of Science and Technology. From Medical Image Processing (pp. 175-190). Springer 2003, she worked as a Netherlands. lecture in Faculty of Information [12]. Paiva, A. R., & Tasdizen, T. (2010). Fast semi-supervised Technology, School of Information image segmentation by novelty selection. In Acoustics Speech and Communication Technology, and Signal Processing (ICASSP), 2010 IEEE International Thai Nguyen University. Her major Conference on (pp. 1054-1057). IEEE. interests are discrete mathematic, [13]. Sujji, G. E., Lakshmi, Y. V. S., & Jiji, G. W. (2013). MRI Monte Carlo method, optimization, Brain Image Segmentation based on probability theory and statistics, Thresholding. International Journal of Advanced Computer . Research (ISSN (print): 2249-7277 ISSN (online): 2277-7970) Volume-3 Number-1 Issue, 97-101. Nguyen Thuy Hoa received the [14]. Vendramin, L., Campello, R. J., & Hruschka, E. R. (2010). Bachelor on Special English at Viet Relative clustering validity criteria: A comparative Bac Teacher Training College, Thai overview. Statistical Analysis and : The ASA Data Nguyen Unversity in 2000 and Science Journal, 3(4), 209-235. Master degree on English at [15]. Yasunori, E., Yukihiro, H., Makito, Y., & Sadaaki, M. Vietnam National University, Ha Noi (2009, August). On semi-supervised fuzzy c-means in 2005. Now, she is a lecturer at clustering. In Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE Faculty of Basic Science of School of International Conference on (pp. 1119-1124). IEEE. Information and Communication BIOGRAPHIES Technology, Thai Nguyen University. Msc. Tran Manh Tuan received the Her major interests are English for Bachelor on Applied Mathematics soft computing and medical and Informatics at Hanoi University informatics of Science and Technology in 2003 and Master degree on Computer Mr. Nguyen Ba Loi is teacher of Science at Thainguyen University in Medical Vocational School Ba Ria – 2007. Now, he is a reseacher at Vung Tau. Now, he is a master

Institute of Information Technology, student at Lac Hong University. His Academic Institute of Science and major interests are image processing Technology, Vietnam. He worked as and intelligent computing. a lecturer at Faculty of Information Technology, School of Information and Communication Technology, Thainguyen University from 2003. Office address: University of Information and Communication Technology, Thai Nguyen University, Quyet Thang, Thai Nguyen city, Thai Nguyen, Vietnam. Email: [email protected]. Tran Thi Ngan obtained the Bachelor degrees on Mathematics – Informatics at VNU University of Science, Vietnam National University (VNU). She got Master

© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 23