Binary Image Compression Using Neighborhood Coding
Total Page:16
File Type:pdf, Size:1020Kb
BINARY IMAGE COMPRESSION USING NEIGHBORHOOD CODING Tiago B. A. de Carvalho1, Tsang Ing Ren1, George D.C. Cavalcanti1, Tsang Ing Jyh2 1Center of Informatics, Federal University of Pernambuco, Brazil. 2Alcatel-Lucent, Bell Labs, Belgium E-mail: 1{tbac,tir,gdcc}@cin.ufpe.br, [email protected] ABSTRACT Bitmap image format requires a reasonable great amount of computer memory, since for each pixel it is necessary a Compression plays an important role in the storage and set of bits to represent it, and a relatively small image can transmission of digital image. Binary image compression is contain millions of pixels. Even a binary image that uses just of essential value for document imaging. Here, we propose a a bit per pixel can demand a large amount of disk space. novel compression technique based on the concept of There are dozens of bitmap image formats that use neighborhood coding, which codes each pixel of an image compression techniques, e.g., TIFF, GIF, PNG, JBIG and according to the number of neighbor pixels in different JPEG. The TIFF format can also use different types of directions. The proposed technique is a lossless compression compression methods such as CCITT Group 3 or Group 4. scheme, which also uses run-length encoding (RLE) and We propose a novel lossless binary image compression Huffman coding. We evaluated and compared this method to technique based on neighborhood coding scheme described several image file format, using test images taken from the in [2]. The codification starts by transforming each pixel of MPEG-7 core experiment CE-shape and the binary image the image into a vector. The elements of the vector represent compression challenge. The results are promising and a the number of neighbors in the four direction (north, east, good compression rate is achieved, even though the JBIG south, west). We extend this coding idea for two other standards showed the best performance for all the images directions and compared different coding scheme. The first tested. However, the proposed method obtained close results scheme uses the diagonal direction, and the second is based and better compression ratios if compared to several other on the combined four nearest neighbor and the diagonal well-established image file formats. direction. Once each pixels is coded, some redundant information occurs since a neighbor code can already Index Terms — Binary image coding, image compression, contain information of the pixel coded in the neighborhood graphic file format, neighborhood coding. vector. We introduce a method to reduce the total number of codes to just some few representative codes that are 1. INTRODUCTION sufficient to recover the whole image. The final reduced The objective of image compression is to generate an image number of codes are compressed using RLE and Huffman using less symbols than the original image, that is equivalent coding. of having the same image file with less bytes. In general, The structure of this paper is as follows: Section 2 we compression is an important task for the storage and describe in details the neighborhood coding, using three transmission of data, since it allows a reduction of computer different directions for the neighborhood coding. Section 3 resources. Image compression can be divided in two types: the code reduction algorithm is described. The compression lossy and lossless, the method proposed here generates procedure is explained in Section 4. Section 5 shows some lossless binary image compression. In general, there are experiments and results of the proposed method, comparing three types of image format [1]: bitmap, vector and metafile. the results to other image file format. Finally, in Section 6 Bitmap is composed of a matrix of pixels that represents a we present some discussion and conclusions. color value at a specific position. The vector format codes the image in mathematical representations of lines, polygons 2. NEIGHBORHOOD CODING and curves. The metafile format contains both bitmap and In a binary image the neighborhood coding procedure vector information in the same file format. Compression is transforms the image into a set of vectors that represent each seldom used in vector image format, since this format pixel of the image. Each vector is composed of two parts: already needs a long time to reconstruct the image and a the center position and the segment. The center position is compression scheme would cause the process to be even the location (xi, yi) of a pixel and the segment is defined by slower, moreover the mathematical description used by this the distance of the center pixels to the border of the file format is already very compact. connected component in a specific direction. We used three types of neighborhood coding: (1) segment in the horizontal in another vector. This redundancy can be eliminated and vertical direction, (2) segment in the diagonal direction, consequently creating a more compact representation of the and (3) the combination of segments (1) and (2). As an image. The problem is how to obtain the subset of codes that analogy, we can compare these neighborhood types to the are sufficient to reconstruct the binary image without loss of moving direction of the chess pieces: (1) tower, (2) bishop information. For this task, we proposed an algorithm to and (3) queen, respectively. select the codes that have the most relevant information: The tower coding contains four segments, in the horizontal and vertical direction: (north, east, south, west), 1. Create C a variable that measures if all pixels have been or (n, e, s, w), this generates the following set of positions, encoded. C starts as C = m, where m is the number of pixels respectively: Ni, Ei ,Si and Wi, where Ni = { (xi, y) | y = yi - k, to be encoded. k = 0, 1, 2, ..., ni}, Ei = { (x, yi) | x = xi + k, k = 0, 1, 2, ..., 2. Create = { i | i = 1, ... , m}, where i is the ei}, Si = { (xi, y) | y = yi + k, k = 0, 1, 2, ..., si}, Wi = { (x, yi) | neighborhood code (xi, yi, ni, ei, si, wi) that codes the pixels x = xi - k, k = 0, 1, 2, ..., wi}, and (xi, yi) is defined as the in Pi = { (x, y) | (x, y) ∈ Ni ∪ Ei ∪ Si ∪ Wi}. center position of the pixel in consideration. 3. Create A = {ai | i = 1, ... , m}, where ai is a flag, ai = 0 if In the same way, we can define the bishop coding with the center (xi, yi) of i is not yet encoded and ai = 1 the diagonal segments: (northeast, southeast, southwest, otherwise. A starts as A = 0 (no center has been encoded). northwest) or (ne, se, sw, nw) generating the set NEi, SEi, 4. Create the output array O = {oi | i = 1, … , m}, where oi SWi and, NWi, where NEi = { (x,y) | x = xi + k, y = yi - k, k = is a flag, oi = 1 if the code i has been selected and oi = 0 0, 1, 2, ..., nei}, SEi = { (x,y) | x = xi + k, y = yi + k, k = 0, otherwise. O starts as O = 0 (no code has been selected yet). 1, 2, ..., sei}, SWi = { (x,y) | x = xi - k, y = yi + k, k = 0, 1, 2, 5. create T = {ti | i = 1, … , m}, where ti starts as ti = ni + ei + ..., swi}, NWi = { (x,y) | x = xi - k, y = yi - k, k = 0, 1, 2, ..., si + wi. nwi}. 6. while C > 0 do // have all pixels been encoded? The third type of coding, the queen coding is defined as choose l | tl ≥ tv, ∀ u, v ∈ // code with maximum t the combination of the tower and the bishop coding, and C = C − tl contains the set of pixels: N, E, S, W, NE, SE, SW, NW. Ol = 1 Therefore, tower and bishop coding forms a 6-tuple: (x, y, n, for each k ∈ Vl ⊆ with ak = 0, where Vl is the set e, s, w) and (x, y, ne, se, sw, nw) and the queen coding forms of all neighborhood codes with center (x, y) ∈ Pl do a 10-tuple (x, y, n, e, s, w, ne, se, sw, nw). ak = 1 for each w ∈ Vk ⊆ do tw − − // update T 7. return O The value t indicates the coding capacity of the neighborhood vector, which means the total number of pixels that the code is able to represent. Initially t is defined as the summation of all the coding segments, for the tower coding t = n + e + s + w, for the bishop coding t = ne + se Figure 1. Example of three different type of coding: tower + so + no and for the queen coding t = n + l + s + o + ne + (left), bishop (center) and queen (right). The dark pixels se + so + no. Even though, the update of the coding capacity represents the neighbor regions . t is an exhaustive task, it runs as an interactive procedure that becomes fast at each interaction. Since each time a code Figure1 shows an example of each one of these coding, is selected the value t of the codes that also represent the they all have the same central position. On the left the tower pixels that the chosen code represents is decremented. In coding which can be written as a vector (4, 4, 2, 3, 5, 3). The other words, the coding capacity of a set of codes is center image shows the bishop coding that can be described diminished. At the end of the process just the relevant codes as a vector (4, 4, 2, 2, 2, 2). On the right, the queen coding, remains. represented as a vector (4, 4, 2, 3, 5, 3, 2, 2, 2, 2). The reduction code algorithm starts with the initialization of the variables (step 1-5).