BINARY USING NEIGHBORHOOD CODING

Tiago B. A. de Carvalho1, Tsang Ing Ren1, George D.C. Cavalcanti1, Tsang Ing Jyh2

1Center of Informatics, Federal University of Pernambuco, Brazil. 2Alcatel-Lucent, Bell Labs, Belgium E-mail: 1{tbac,tir,gdcc}@cin.ufpe.br, [email protected]

ABSTRACT image format requires a reasonable great amount of computer memory, since for each pixel it is necessary a Compression plays an important role in the storage and set of bits to represent it, and a relatively small image can transmission of digital image. Binary image compression is contain millions of pixels. Even a binary image that uses just of essential value for document imaging. Here, we propose a a bit per pixel can demand a large amount of disk space. novel compression technique based on the concept of There are dozens of bitmap image formats that use neighborhood coding, which codes each pixel of an image compression techniques, e.g., TIFF, GIF, PNG, JBIG and according to the number of neighbor pixels in different JPEG. The TIFF format can also use different types of directions. The proposed technique is a compression methods such as CCITT Group 3 or Group 4. scheme, which also uses run-length encoding (RLE) and We propose a novel lossless binary image compression . We evaluated and compared this method to technique based on neighborhood coding scheme described several image file format, using test images taken from the in [2]. The codification starts by transforming each pixel of MPEG-7 core experiment CE-shape and the binary image the image into a vector. The elements of the vector represent compression challenge. The results are promising and a the number of neighbors in the four direction (north, east, good compression rate is achieved, even though the JBIG south, west). We extend this coding idea for two other standards showed the best performance for all the images directions and compared different coding scheme. The first tested. However, the proposed method obtained close results scheme uses the diagonal direction, and the second is based and better compression ratios if compared to several other on the combined four nearest neighbor and the diagonal well-established . direction. Once each pixels is coded, some redundant information occurs since a neighbor code can already Index Terms — Binary image coding, image compression, contain information of the pixel coded in the neighborhood graphic file format, neighborhood coding. vector. We introduce a method to reduce the total number

of codes to just some few representative codes that are 1. INTRODUCTION sufficient to recover the whole image. The final reduced The objective of image compression is to generate an image number of codes are compressed using RLE and Huffman using less symbols than the original image, that is equivalent coding. of having the same image file with less bytes. In general, The structure of this paper is as follows: Section 2 we compression is an important task for the storage and describe in details the neighborhood coding, using three transmission of data, since it allows a reduction of computer different directions for the neighborhood coding. Section 3 resources. Image compression can be divided in two types: the code reduction algorithm is described. The compression lossy and lossless, the method proposed here generates procedure is explained in Section 4. Section 5 shows some lossless binary image compression. In general, there are experiments and results of the proposed method, comparing three types of image format [1]: bitmap, vector and metafile. the results to other image file format. Finally, in Section 6 Bitmap is composed of a matrix of pixels that represents a we present some discussion and conclusions. color value at a specific position. The vector format codes the image in mathematical representations of lines, polygons 2. NEIGHBORHOOD CODING and curves. The metafile format contains both bitmap and In a binary image the neighborhood coding procedure vector information in the same file format. Compression is transforms the image into a set of vectors that represent each seldom used in vector image format, since this format pixel of the image. Each vector is composed of two parts: already needs a long time to reconstruct the image and a the center position and the segment. The center position is compression scheme would cause the process to be even the location (xi, yi) of a pixel and the segment is defined by slower, moreover the mathematical description used by this the distance of the center pixels to the border of the file format is already very compact. connected component in a specific direction. We used three types of neighborhood coding: (1) segment in the horizontal in another vector. This redundancy can be eliminated and vertical direction, (2) segment in the diagonal direction, consequently creating a more compact representation of the and (3) the combination of segments (1) and (2). As an image. The problem is how to obtain the subset of codes that analogy, we can compare these neighborhood types to the are sufficient to reconstruct the binary image without loss of moving direction of the chess pieces: (1) tower, (2) bishop information. For this task, we proposed an algorithm to and (3) queen, respectively. select the codes that have the most relevant information: The tower coding contains four segments, in the horizontal and vertical direction: (north, east, south, west), 1. Create C a variable that measures if all pixels have been or (n, e, s, w), this generates the following set of positions, encoded. C starts as C = m, where m is the number of pixels respectively: Ni, Ei ,Si and Wi, where Ni = { (xi, y) | y = yi - k, to be encoded.    k = 0, 1, 2, ..., ni}, Ei = { (x, yi) | x = xi + k, k = 0, 1, 2, ..., 2. Create = { i | i = 1, ... , m}, where i is the ei}, Si = { (xi, y) | y = yi + k, k = 0, 1, 2, ..., si}, Wi = { (x, yi) | neighborhood code (xi, yi, ni, ei, si, wi) that codes the pixels x = xi - k, k = 0, 1, 2, ..., wi}, and (xi, yi) is defined as the in Pi = { (x, y) | (x, y) ∈ Ni ∪ Ei ∪ Si ∪ Wi}. center position of the pixel in consideration. 3. Create A = {ai | i = 1, ... , m}, where ai is a flag, ai = 0 if In the same way, we can define the bishop coding with the center (xi, yi) of i is not yet encoded and ai = 1 the diagonal segments: (northeast, southeast, southwest, otherwise. A starts as A = 0 (no center has been encoded). northwest) or (ne, se, sw, nw) generating the set NEi, SEi, 4. Create the output array O = {oi | i = 1, … , m}, where oi SWi and, NWi, where NEi = { (x,y) | x = xi + k, y = yi - k, k = is a flag, oi = 1 if the code i has been selected and oi = 0 0, 1, 2, ..., nei}, SEi = { (x,y) | x = xi + k, y = yi + k, k = 0, otherwise. O starts as O = 0 (no code has been selected yet). 1, 2, ..., sei}, SWi = { (x,y) | x = xi - k, y = yi + k, k = 0, 1, 2, 5. create T = {ti | i = 1, … , m}, where ti starts as ti = ni + ei + ..., swi}, NWi = { (x,y) | x = xi - k, y = yi - k, k = 0, 1, 2, ..., si + wi. nwi}. 6. while C > 0 do // have all pixels been encoded? The third type of coding, the queen coding is defined as choose l | tl ≥ tv, ∀ u, v ∈  // code with maximum t the combination of the tower and the bishop coding, and C = C − tl contains the set of pixels: N, E, S, W, NE, SE, SW, NW. Ol = 1 Therefore, tower and bishop coding forms a 6-tuple: (x, y, n, for each k ∈ Vl ⊆  with ak = 0, where Vl is the set e, s, w) and (x, y, ne, se, sw, nw) and the queen coding forms of all neighborhood codes with center (x, y) ∈ Pl do a 10-tuple (x, y, n, e, s, w, ne, se, sw, nw). ak = 1

for each w ∈ Vk ⊆  do tw − − // update T 7. return O

The value t indicates the coding capacity of the neighborhood vector, which means the total number of pixels that the code is able to represent. Initially t is defined as the summation of all the coding segments, for the tower

coding t = n + e + s + w, for the bishop coding t = ne + se Figure 1. Example of three different type of coding: tower + so + no and for the queen coding t = n + l + s + o + ne + (left), bishop (center) and queen (right). The dark pixels se + so + no. Even though, the update of the coding capacity represents the neighbor regions . t is an exhaustive task, it runs as an interactive procedure that becomes fast at each interaction. Since each time a code Figure1 shows an example of each one of these coding, is selected the value t of the codes that also represent the they all have the same central position. On the left the tower pixels that the chosen code represents is decremented. In coding which can be written as a vector (4, 4, 2, 3, 5, 3). The other words, the coding capacity of a set of codes is center image shows the bishop coding that can be described diminished. At the end of the process just the relevant codes as a vector (4, 4, 2, 2, 2, 2). On the right, the queen coding, remains. represented as a vector (4, 4, 2, 3, 5, 3, 2, 2, 2, 2). The reduction code algorithm starts with the initialization of the variables (step 1-5). Following, all the 3. CODE REDUCTION pixels are transformed into the neighborhood coding format

(step 2). The code with biggest t value is chosen, and each of Given a binary image, the neighborhood vector coding can the affect code, that is codes that represent pixels that were be used to represent the whole image. However, the associated with the selected code, is identified. The value of generated set of codes can be extremely redundant, since the t for each affected code is updated by subtracting the information contained in one code can be already be coded number of pixels that it does not code anymore (step 6). results are stored in a file using Huffman coding. The Finally the procedure returns the output (step 7). decompression procedure is also straightforward, the Figure 2, shows the results of the code reduction original image can be obtained simply by applying the process, the center of each selected code is shown in black. reverse procedure. The complete image is shown in gray, the left image shows the tower coding result, the center image shows the bishop 5. EXPERIMENTS AND RESULTS coding and the right image the queen coding. After the coding reduction, for the same image it is necessary 6 codes We applied the proposed compression scheme in different to represent the whole image using the tower coding, 9 types of binary image and compared the results to well- codes for the bishop coding and 4 for the queen coding. known image file format. Figure 3 shows the images used in the experiments. The images 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 were obtained using an Arial font with centered window size of 128x128 pixels. The bat-12 and bell-2 images are shape descriptors obtained from the MPEG-7 Core Experiment Shape 1 part B [4, 5]. The images courier12, oldeng16, ouster, times12i and cat were obtained from the binary image compression challenge [6]. ouster and cat are halftone images and courier12, oldeng16 and times12i are short texts Figura 2. The selected code centers are shown in black. The written in Courier, Old English and Times New Roman images show the results using tower (left), bishop (center) having size 12, 16 and 12, respectively. The images in and queen (right) coding. Figure 3 are not in the original size and the text images are

The code reduction method aims to obtain the least partially shown. The original dimension of this images are number of neighborhood vector codes necessary to represent described in Table 1. the whole image without loss of information. In this way, it is possible to obtain a very compact representation of the image.

4. COMPRESSION METHOD

Once the reduced codes are obtained the rest of the compression method is straightforward. The process can be divided in 5 stages: (1) coding reduction; (2) grouping of the codes that have the same vector segments; (3) applying run- length encoding (RLE) [1] for the vector segments; (4) Figura 3. Test images compilation. The text images are applying RLE for the run counts of the first RLE and (5) partially shown and all the images are reduced from its Huffman coding [3]. The code reduction consist in original size. generating the neighborhood coding as described in the last Table 1 shows the final compressed files using all three section. The next step is to group the codes according to the different neighborhood coding. The best compression results segments. The vector segment of a code is defined as the are obtained using the tower coding in most of the cases. code vector without the central position value, x and y. For However, the halftone images (cat and ouster) show a better the tower coding the code segment is represented by (n ,e, s, result using the bishop coding. And in some cases the queen w), for the bishop coding (ne, se, sw, nw), and for the queen coding also obtain the best result, however if compared to coding (n ,e, s, w, ne, se, sw, nw). This grouping is important the second best compression, it show less than 10% to obtain a maximum information reduction using the RLE improvement. Even though the queen coding generates less procedure. The list of codes is divided in two parts: vector number of neighborhood vector, it takes a lot more space for segment and center position, both listed in an equivalent its storage since each code is associated with a longer vector order. The next step of the process is to apply RLE to the list segment. of vector segment, resulting in a list of run counts and run values, the run counts indicates the number of times each run value repeats. In this specific case, the run counts represent Image Dim Tower Bishop Queen 0 128x128 428 364 the number of times each vector segment repeats in the total 277 1 128x129 143 268 141 coding procedure. Since there are many repeated run counts, 2 128x130 337 401 326 RLE is applied to the run count of the first RLE. Finally, the 3 128x131 384 433 428 Tabela 2. The table shows the image file size in bytes for several 4 128x132 251 415 262 image file format. For the proposed method, the table shows the 5 128x133 328 501 392 best result obtained from three type of coding. In bold the best 6 128x134 377 493 453 results, the second best is shown in italic and underline. 7 128x135 254 344 246 8 128x136 389 473 479 The problem of the proposed method for the halftone 9 128x137 360 499 437 image is that it is necessary a large number of codes to bat-12 474x216 2,521 4,483 3,406 represent the image, because of the various discontinuity bell-2 59x64 232 571 295 which is typical for this type of image. Even through we cat 380x469 115,158 41,350 43,056 were not able to reach JBIG compression rates and in some courier12 374x46 977 1,712 920 case NCC performance lost for some other image format, the oldeng16 476x55 2,207 3,157 2,013 results showed to be promising and there are still several ouster 108x144 5,565 4,865 4,920 improvements that can be made in the NCC scheme. times12i 278x46 1,317 1,573 1,179 6. CONCLUSIONS Table 1. Results of the Neighborhood Coding Compression (NCC) applied in several images, the results are shown in bytes and the We extended the usage of the neighborhood coding concept bes results in bold. introduced in [2] and proposed a new binary image compression method. Even through, the technique did not To evaluate the proposed method, we compared the obtain compression rates better than the JBIG format, the results of the best neighborhood coding with other image file method showed good compression rates compared to other format. Table 2 shows the size of the compressed image files well-established image format. Clearly, for images not so using JPEG, TIFF, PNG, GIF, JBIG and the proposed well-structured like halftone and image with larger sizes, coding NCC (Neighborhood Coding Compression). The NCC still needs improvements. And to try to solve some of files were obtained using ImageMagick [7], optimizing the these limitations multi-resolution expansion and image parameters for best lossless compression result. For TIFF pyramids should be investigated. Another type of coding, image, CCITT Group 4 compression was used. The best such as [8], that is able to produce better results were obtain using JBIG format. The JPEG format compression rates, can be tested instead of Huffman coding, shows the worst results, however JPEG is not focused in There are still other aspect of the proposed method that can compression of binary image, for that the JPEG committee be improved, however at the present format the method created JBIG and JBIG2, which is the state of the art in already shows result compatible to CCITT Group 4, PNG binary image compression. We did not compare our results and GIF. In some cases the results are very close to JBIG. to JBIG2 since in our experiments JBIG already showed the best results. REFERENCES

Imagem JPEG TIFF PNG GIF JBIG NCC [1] J.D. Murray, W. Van Ryper. “Encyclopedia of Graphics File 0 2,699 439 528 425 183 277 Formats”, 2nd Edition. O'Reilly & Associates, 1996 1 1,095 405 440 345 135 141 [2] I. J. Tsang, I.R. Tsang, D. Van Dyck. “Image coding using 2 2,576 433 556 362 174 326 neighbourhood relations”. Pattern Recognition Letters 20. 1999, 3 2,928 451 596 386 200 384 pages 1279-1286. 4 1,835 419 500 386 150 251 [3] D. A. Huffman, “A Method for the Construction of Minimum- Redundancy Codes”, Proceedings of the I.R.E (Institute of Radio 5 2,772 439 539 377 187 328 Engineers), September 1952, pp. 1098-1101. 6 3,260 455 608 427 209 377 [4] Shape data for the MPEG-7 core experiment CE-Shape-1. 7 1,682 407 460 327 142 246 http://www.cis.temple.edu/~latecki/TestData/mpeg7shapeB.tar.gz 8 3,115 455 555 417 194 389 [5] L.J. Latecki, R. Lakamper, T. Eckhardt. “Shape descriptors for 9 3,162 457 603 427 199 360 non-rigid shapes with a single closed contour”. IEEE Conference bat-12 12,669 817 1,250 1,433 404 2,521 on Computer Vision and Pattern Recognition, 2000. Proceedings, bell-2 1,910 433 392 254 148 232 pages: 424-429 vol.1 cat 190,245 69,781 8,129 10,540 6,310 41,350 [6] Binary image compression challenge (http://wiki.tcl.tk/12314). Updated 2008-07-28 19:18:27 by lars_h courier12 10,546 979 760 823 412 920 [7] ImageMagick. http://www.imagemagick.org/ oldeng16 14,770 1,599 1,354 1,600 797 2,013 [8] I. H. Witten, R. M. Neal, J. G. Cleary. “Arithmetic coding for ouster 16,742 4,335 1,934 1,889 4,865 1,231 ”. Communications of the ACM. Volume 30 , times12i 9,314 1,083 833 794 426 1,179 Issue 6 (June 1987). Pages: 520 - 540