Fast Color Quantization Using MacQueen's K-Means Algorithm
Journal of Real-Time Image Processing (2020) 17:1609–1624
https://doi.org/10.1007/s11554-019-00914-6

ORIGINAL RESEARCH PAPER

Skyler Thompson · M. Emre Celebi · Krizia H. Buck
Department of Computer Science, University of Central Arkansas, Conway, AR, USA
Correspondence: M. Emre Celebi, [email protected]

Received: 22 May 2019 / Accepted: 6 September 2019 / Published online: 16 October 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract

Color quantization (CQ) is an important operation with many applications in computer graphics and image processing and analysis. Clustering algorithms have been extensively applied to this problem. However, despite its popularity as a general-purpose clustering algorithm, k-means has not received much attention in the CQ literature because of its high computational requirements and sensitivity to initialization. In this paper, we propose a novel CQ method based on an online k-means formulation due to MacQueen. The proposed method utilizes adaptive and efficient cluster center initialization and quasirandom sampling to attain deterministic, high-speed, and high-quality quantization. Experiments on a diverse set of publicly available images demonstrate that the proposed method is significantly faster than the more common batch k-means formulation due to Lloyd while delivering nearly identical results.

Keywords: Color quantization · Clustering · MacQueen k-means · Lloyd k-means

1 Introduction

Given an input image, CQ involves the reduction of the number of distinct colors (N) in the image to K (K ≪ N) with minimum possible distortion. CQ is a challenging problem as most real-world images contain tens of thousands of colors. Recent applications of CQ include compression, segmentation, text localization/detection, color analysis, watermarking, non-photorealistic rendering, and content-based retrieval (for specific references, the reader is referred to our earlier work [8]).

A large number of CQ methods have been developed over the past four decades. These methods can be categorized into two groups: preclustering (hierarchical clustering) methods and postclustering (partitional clustering) methods [7]. The former methods recursively find nested clusters either in a top-down (divisive) or bottom-up (agglomerative) fashion. In contrast, the latter ones find all the clusters simultaneously as a partition of the data and do not impose a hierarchical structure [32]. Compared to preclustering methods, postclustering methods generally produce better results, that is, lower distortion, but are computationally more demanding.

Popular preclustering methods include median-cut [27], octree [22], the variance-based method [59], binary splitting [40], greedy orthogonal bipartitioning [61], center-cut [33], RWM-cut [66], and the more recent methods proposed by Celebi et al. [8] and Ueda et al. [56]. On the other hand, postclustering algorithms adapted to CQ include maximin [63], k-means [9, 10, 29, 30, 57], k-harmonic means [21], competitive learning [11, 12], fuzzy c-means [47, 60], rough c-means [50], and self-organizing maps [16, 17, 64].

Recent CQ methods are typically based on metaheuristics or a hybrid of metaheuristics and preclustering/postclustering methods. These methods cast CQ as a global optimization problem, which they then solve by means of a variety of physics- or nature-inspired metaheuristics. Metaheuristics applied to CQ to date include physics-inspired single-solution-based metaheuristics such as simulated annealing [42] as well as nature-inspired population-based metaheuristics such as evolutionary algorithms (genetic algorithms [51], evolution strategies [24], differential evolution [31, 48, 54], etc.) and swarm intelligence algorithms (particle swarm optimization [39], ant colony optimization [43, 44], artificial bee colony optimization [41], artificial fish swarm optimization [20], etc.). These methods are more powerful than preclustering/postclustering methods in that they can optimize nonsmooth, nonconvex objective functions. Unfortunately, these "black-box" methods have several major disadvantages. First, they are generally randomized. Second, they have several parameters that are often difficult to fine-tune (initial/final temperature, cooling schedule, population size, crossover/mutation probability, etc.). Third, due to the vast search space in CQ applications, they often require a large number of iterations, which renders them computationally demanding (they can be orders of magnitude slower than k-means). We should, however, mention that some of the recent studies on metaheuristics-based CQ report significantly reduced computational requirements [44].

In this paper, we present an effective and efficient clustering method for CQ. The rest of the paper is organized as follows. Section 2 describes two variants of the k-means clustering algorithm and the proposed k-means-based CQ method. Section 3 presents the experimental setup and compares the proposed method to other CQ methods. Finally, Section 4 gives the conclusions.

2 k-means clustering for CQ

In this section, we first describe two common variants of the k-means clustering algorithm, one due to Lloyd and the other due to MacQueen. We then elaborate on the proposed CQ method based on MacQueen's k-means algorithm.

2.1 Lloyd's k-means algorithm

Lloyd's algorithm [35] is perhaps the most common clustering algorithm in scientific and engineering applications [14]. Commonly referred to as (batch) k-means, Lloyd's algorithm starts with a data set X = \{x_1, \ldots, x_N\} \subseteq \mathbb{R}^D (footnote 1) and a positive integer value K, which denotes the desired number of clusters. In the context of CQ, the parameters N, D, and K correspond, respectively, to the number of pixels in the input image, the number of color channels (which is typically three), and the number of colors desired by the user. The algorithm then assigns each data point x ∈ X to the closest cluster, thereby minimizing the sum of error (SE) given by

    SE = \sum_{x \in X} d(x, \{c_1, \ldots, c_K\}),    (1)

where d(x, \{c_1, \ldots, c_K\}) denotes the Bregman divergence of x to the nearest center in \{c_1, \ldots, c_K\}, that is,

    d(x, \{c_1, \ldots, c_K\}) = \min_{i \in \{1, \ldots, K\}} d(x, c_i).

Footnote 1: Strictly speaking, in practice, data sets are not implemented as sets, but as sequences, where elements are ordered and duplicates are allowed. This is especially true for image data sets, which are often stored and accessed in raster order.

Bregman divergences are a family of nonmetric dissimilarity functions that include the squared Euclidean distance (ℓ₂²), the squared Mahalanobis distance, the Kullback–Leibler divergence, and the Itakura–Saito divergence. In practice, the most popular Bregman divergence is the squared Euclidean distance, in which case the SE is called the sum of squared error (SSE). For any Bregman divergence, it can be shown that the optimal center c_i for cluster C_i is given by the centroid (or center of mass) of the cluster [1], that is,

    c_i = \frac{1}{n_i} \sum_{x \in C_i} x,    (2)

where n_i denotes the cardinality of cluster C_i, that is, the number of points that belong to C_i.
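To make Eqs. (1) and (2) concrete, the following minimal NumPy sketch (ours, not the paper's) evaluates the SSE objective and the centroid update for a given assignment. The names `points` (an N×D array) and `centers` (a K×D array) are illustrative.

```python
import numpy as np

def sse(points, centers):
    """Sum of squared error, Eq. (1), with d taken as the squared
    Euclidean distance (the most common Bregman divergence)."""
    # (N, K) matrix of squared distances from every point to every center.
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Each point contributes its divergence to its *nearest* center.
    return d2.min(axis=1).sum()

def centroids(points, labels, K):
    """Optimal centers, Eq. (2): the mean of the points in each cluster.
    This sketch assumes every cluster is nonempty; empty clusters (a
    known failure mode of poor initialization) would need handling."""
    return np.stack([points[labels == i].mean(axis=0) for i in range(K)])
```

For CQ, `points` is simply the input image reshaped to an N×3 array of pixel colors.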
The pseudocode for the k-means algorithm is given below. Starting with a set of K centers, the algorithm alternates between two steps. First, each point is assigned to the nearest cluster. Each cluster center is then recomputed to be the centroid of all points that belong to that cluster. Together, these two steps are referred to as a "Lloyd iteration". These iterations continue until the cluster memberships of the points no longer change. A worked sketch follows the listing.

1. Let \{c_1, \ldots, c_K\} be the initial set of centers.
2. For each i \in \{1, \ldots, K\}, set cluster C_i to be the set of points in X that are closer in terms of d to c_i than they are to any other center, that is, C_i = \{x \in X : d(x, c_i) \le d(x, c_{\hat{i}}) \text{ for all } \hat{i} \ne i\}.
3. For each i \in \{1, \ldots, K\}, set the center c_i of cluster C_i to be the centroid of all points in C_i using Eq. (2).
4. Repeat Lloyd iterations, that is, steps (2) and (3), until convergence.
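Here is a compact NumPy sketch of the four steps above, assuming the squared Euclidean distance and the random initialization discussed next; the function name and `seed` parameter are our own illustrative choices, not the paper's.

```python
import numpy as np

def lloyd_kmeans(points, K, seed=0):
    """Batch (Lloyd) k-means: alternate steps 2 and 3 until the
    cluster memberships of the points no longer change (step 4)."""
    rng = np.random.default_rng(seed)
    # Step 1: choose K distinct data points as the initial centers.
    centers = points[rng.choice(len(points), size=K, replace=False)].astype(float)
    labels = None
    while True:
        # Step 2: assign every point to its nearest center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        # Step 4: stop when memberships are unchanged.
        if labels is not None and np.array_equal(labels, new_labels):
            break
        labels = new_labels
        # Step 3: recompute each center as its cluster's centroid, Eq. (2).
        for i in range(K):
            members = points[labels == i]
            if len(members) > 0:  # guard against empty clusters
                centers[i] = members.mean(axis=0)
    return centers, labels
```

For CQ, the returned `centers` form the K-color palette and `labels` maps each pixel to a palette entry. Note that step 2 dominates the cost of each iteration at O(NDK); a production implementation would also compute the distances in chunks rather than materializing the full N×K×D intermediate.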
The most crucial aspect of k-means is step 1, that is, initialization. A common way to determine the initial cluster centers is to select K points uniformly at random from X and take these as the initial cluster centers. Unfortunately, k-means is relatively sensitive to initialization [13]. Some of the negative effects of improper initialization include empty clusters, slower convergence, and a higher probability of getting stuck at a poor local minimum.

2.2 MacQueen's k-means algorithm

MacQueen [36] proposed an online formulation of the batch k-means algorithm. The two k-means algorithms are similar in the sense that each point is assigned to the cluster represented by the nearest center to that point. The algorithms, however, differ in the way the cluster centers are recomputed. The online algorithm updates the nearest center after the presentation of each point, whereas the batch algorithm updates all centers after the presentation of the entire set of points.

The pseudocode for the online k-means algorithm is given below. It can be seen that, unlike the batch algorithm, this algorithm does not generate the centroidal Voronoi diagram itself, but approximate positions of its generators [19]. This allows the algorithm to achieve O(DK) per-iteration time complexity, compared to the O(NDK) time complexity of a Lloyd iteration.

Let x^{(t)} be the input point presented at time t (t = 1, 2, \ldots) and c^{(t)} be the corresponding nearest (winner) unit according to the ℓ₂ distance. The adaptation equation for c^{(t)} is given by

    c^{(t+1)} = c^{(t)} + \eta(t) \left( x^{(t)} - c^{(t)} \right),    (3)

where \eta(t) \in [0, 1] is the learning rate, which is typically a monotonically decreasing function of time. The larger the \eta value, the more emphasis is given to the new input (this is more obvious in Eq. 5) and hence the faster the learning.
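As a minimal sketch of the update in Eq. (3), the following pass uses the classical MacQueen schedule η(t) = 1/(n_i + 1), where n_i counts the points the winner has absorbed so far; this is one common decreasing schedule and an assumption on our part, since the schedule the proposed method actually uses is specified later in the paper.

```python
import numpy as np

def online_kmeans_pass(points, centers):
    """One MacQueen-style online pass: present each point in turn and
    nudge only the winning center toward it, per Eq. (3)."""
    centers = centers.astype(float)
    counts = np.zeros(len(centers), dtype=int)  # points absorbed per center
    for x in points:
        # Winner: nearest unit under the Euclidean distance, O(DK) work.
        i = ((centers - x) ** 2).sum(axis=1).argmin()
        counts[i] += 1
        # The +1 treats the initial center itself as one observation,
        # so each center stays the running mean of the points it has won.
        eta = 1.0 / (counts[i] + 1)
        centers[i] += eta * (x - centers[i])  # adaptation equation (3)
    return centers
```

With this schedule, every update touches only the winning center at O(DK) cost, consistent with the per-iteration complexity noted above.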