Performance Enhancement of Image Clustering Using Singular Value Decomposition in Color Histogram Content-Based Image Retrieval

International Journal of Computer and Communication Engineering, Vol. 1, No. 4, November 2012 Performance Enhancement of Image Clustering Using Singular Value Decomposition in Color Histogram Content-Based Image Retrieval Catur Supriyanto, Guruh Fajar Shidik, Ricardus Anggi Pramunendar, and Pulung Nurtantio Andono This research applied K-Means as clustering technique, Abstract—This paper presents an enhancement of the where this method commonly used for partitioning [3], [4]. performance of image clustering. K-Means has been chosen as Each cluster in K-Means method is represented by its our clustering technique. We applied Hue Saturation Value centroids or the mean value of all data in the cluster. (HSV) color histogram as features to retrieve image We applied color histogram technique that usually used to information. Singular Value Decomposition (SVD) technique is employed to enhance the performance of image clustering by retrieve information image. The weakness applicable of this reducing features that not useful to be proceed in clustering features extraction for information retrieval has been state by process. We evaluated the image clustering using Recall, [5] where the implementation of color histogram does not Precision, F-Measure and computational time. The obtained give relevant image as seen by an algorithm with human result indicates that SVD in HSV color histogram visual. The aim of this paper is to adapt Singular Value Content-Based Image Retrieval (CBIR) is promising. Decomposition (SVD) as feature transformation in CBIR. Index Terms—Content based image retrieval, HSV color SVD is a Latent Semantic Analysis (LSA) approach. The histogram , k-means algorithm, singular value decomposition reason of using SVD is provide useful information of the color. CBIR system can be improved by using image clustering I. INTRODUCTION [6]. The problem of image clustering is high number of image Currently the increasing of image data on internet is giving in clustering leads to more computational time [7]. To opportunity for researcher to working in searching and overcome this problem, SVD also can be proposed to reduce classifying the content of image. The example of image the computational time of image clustering. management tool that capable to search and classifying The outline of this paper is as follows: section 2 describes image is Google image as image searching application the related work. Section 3 describes the methodology of usually used for image mining and image search in web research. Section 4 describes the dataset and shows the context. The importance of searching image information in performance analysis of proposed approach. Section 5 web context given many research communities has produced presents the conclusion and future work. many method algorithms with tools for image retrieval and clustering, where this technique also include in technique Content-Based Image Retrieval (CBIR). II. RELATED WORK There are many applications of CBIR such as biomedicine, There are several research has been conduct to improve the military, commerce, education, web image classification and performance of CBIR in image retrieval clustering area. The searching [1]. CBIR is a technique to searching by analyzes previous work in [8]-[10] have been analyzed the the actual content (feature) of image. In CBIR there are two performance of clustering algorithm for image retrieval. techniques that should strong to achieve accurate results; Kucuktunc and Zamalieva [11] proposed fuzzy color there are a technique to retrieve information and a technique histogram for CBIR. Mamdani fuzzy inference system was to cluster for classifying image. used as fuzzy technique to link L*a*b* to fuzzy color space. Image retrieval is a process to retrieve image features that Their work show that fuzzy color histogram performed better include in each image. In [2] has stated two types of images than conventional methods. features there are low level feature and high level feature, Other study in [12] was compare the using of Conventional where high level feature is difficult to extract like emotion or Color Histogram (CCH), Invariant Color Histogram (ICH) other human behavior activities. There are several image and Fuzzy Color Histogram (FCH) of an images in CBIR features at low level feature usually used to retrieve image system. ICH and FCH has been used to address the problem information such as feature color, shape and texture. In this of rotation, translation and spatial relationship of ICH. study we only used color feature to retrieve the information Tonge [13] proposed a K-means clustering algorithm to of image based on Hue Saturation Value (HSV) space format. grouping the collection of images. Image is clustered based on the query image. Color was used to be features in their CBIR system. Manuscript received July 24, 2012; revised September 2, 2012. Sakthivel et al. [14] proposed modified K-Means The authors are with the Dept. of Computer Science Dian Nuswantoro University, Semarang, Indonesia (e-mail: [email protected], clustering to group similar pixel in CBIR. Their purpose is to [email protected], [email protected], improve retrieval performance by capturing the regions and [email protected]). 317 International Journal of Computer and Communication Engineering, Vol. 1, No. 4, November 2012 also to provide a better similarity distance computation. have similarity. Clustering algorithm is classified into five types there are partitional, hierarchical, density-based, grid-based, and model-based clustering. The most popular III. METHODOLOGY clustering technique that widely used is partition and Fig.1 depicts the general overview of our proposed image hierarchical clustering [17]. This research applied K-Means clustering. First is the collection of images. Then, extract the as clustering technique that used for classifying, where this feature by using HSV color histogram. Next, reduce the technique quite popular for this purpose. K-Means is dimensionality of feature by using Singular Value unsupervised learning which means there is no data training Decomposition. Finally, cluster the image collection using in the process of clustering. K-Means clustering is carried out K-Means algorithm. Detailed description of each phase is in four steps [18]: explained in the next subsection. 1) Choose objects to be k initial seeds (basically is random) Image Collection HSV Color 2) Calculate the distance of each seed to the each object Histogram using distance or similarity metric; assign each object to the cluster with the nearest seed point. 3) Compute the new seed point. 4) Return to the step 2 if the current seeds are different to K-Means Singular Value the previous seeds. Algorithm Decomposition For distance metric, we used city block metric is defined as (5). Since, city block metric gave the best precision for Fig. 1. Proposed image clustering system content-based image retrieval [19]. A. HSV Color Histogram n (5) Hue Saturation Value (HSV) has been chosen as color d|| xii y histogram feature, since HSV color space gives better result i0 for CBIR [2]. HSV is often used because of its accordance where d is the distance value, x and y are vector of image with human visual feature [15]. The conversion of Red Green i i x and image y , respectively. Blue (RGB) space into HSV space can be seen in formulae (1), (2), and (3). 1 [(RGRB ) ( )] IV. EXPERIMENT 1 2 H cos (1) (RGRBGB )2 ( )( ) A. Dataset This paper used 150 images from corel-princeton dataset 3 SRGB1 [min( , , )] (2) which divided into five classes: column, flower, horse , model RGB and sea. The dimension of images is 128 85 pixels. The 1 sample image of each class can be seen in Fig 2. VRGB() (3) 3 B. Singular Value Decomposition Singular Value Decomposition (SVD) is a feature transformation technique which reduces the high dimensional matrix into small dimensional matrix. Let A is the features-images matrix of size mn where m is the number of features and n is the number of images. The singular value decomposition of features-images matrix A can be defined as (4). Fig. 2. Sample images of each class T AUVm n m k k k k n (4) B. Evaluation Measure where U is features vector, is the diagonal matrix of In order to evaluate the quality of images clustering, this singular value and V T is images vector. Next, matrix V T is paper employed F-measure as the standard evaluation measurement widely used in clustering. F-measure is the used to cluster the images collection. The value of rank k is k min( m , n ) . Small k of SVD was enough to generate high combination between recall and precision. The Precision, F-Measure value [16]. Recall, and F-Measure are defined as (6), (7) and (8) respectively. C. K-Means Algorithm Total number of retrieved relevant images The clustering technique in CBIR is required to classify Precision (6) number of images into several group based on its cluster that Total number of retrieved images 318 International Journal of Computer and Communication Engineering, Vol. 1, No. 4, November 2012 Total number of retrieved relevant images In order to the time computational, SVD can minimize the Recall= (7) Total number of relevant images time taken of image clustering. From the Fig. 4, SVD gave 2 Precision Recall the best computational time with the rank of SVD k=4 which F-Measure= (8) Precision + Recall the computational time can decrease into 0.002 second. This achievement was compared to the computational time of C. Experiment Result K-Means (0.3031 second). This paper used K-Means algorithm as the clustering algorithm. We randomly selected 5 initial seed of each class and averaged 30 times as the final performance of images V. CONCLUSION clustering. MATLAB was used to perform the experiment. The experiment was executed on Intel Core i7-2600 CPU 3.4 The objective of this paper is improves the performance of GHz with 16 GB of memory.

Load more