Image Compression Using the Haar Wavelet Transform
Total Page:16
File Type:pdf, Size:1020Kb
Image compression using the Haar wavelet transform Colm Mulcahy, Ph.D. ABSTRACT The wavelet transform is a relatively new arrival on the mathematical scene. Wegive a brief intro duction to the sub ject byshowing how the Haar wavelet transform allows information to b e enco ded according to \levels of detail." In one sense, this parallels the way in whichwe often pro cess information in our everydaylives. Our investigations enable us to givetwo interesting applications of wavelet metho ds to digital images: compression and progressive transmission. The mathematical prerequesites will b e kept to a minimum; indeed, the main concepts can b e understo o d in terms of addition, subtraction and division bytwo. We also present a linear algebra implementation of the Haar wavelet transform, and mention imp ortantrecent generalizations. This material is suitable for undergraduate researchby b oth mathematics and computer science ma jors, and has b een used successfully in this way at Sp elman College since 1995. 1 So much information In the real world, we are constantly faced with having to decide howmuch information to absorb or impart. Example 1: An election is imp ending, and you and your b est friend haveyet to decide for whom you will cast your votes. Yourfriendiscontent to base her decision solely on the p olitical party aliations of the available candidates, whereas you take the trouble to nd out how the candidates stand on a variety of issues, and to investigate their records for integrity, e ectiveness, and vision. Example 2: Your younger brother is across the ro om, working on a pro ject for Black History Month. From where you are sitting, you can see that he is putting a caption on a photograph. \Who is in that picture?" you ask. \A p erson sitting in a bus," he says evasively. \I can see that," you reply impatiently, \But who is it?" \A woman," he says, grinning impishly. \Is that all you are going to tell me?" you say in exasp eration, getting up. \A young woman," he volunteers. Noticing the lo ok of displeasure on your face, he quickly adds, \It's Rosa Parks." What these examples all haveincommonis information transmission at various levels of detail. In the rst example, each p erson received information up to, but not b eyond, a certain level of detail. The second example illustrates progressive information transmission: We start with a crude approximation, and over time more and more details are added in, until nally a \full picture" emerges. Figure 1 illustrates this pictorially: lo oking from left to right we see progressively more recognizable images of Rosa Parks. Figure 1: Person in bus Woman in bus Young woman in bus Rosa Parks in bus Dr. Colm Mulcahy is Associate Professor of Mathematics at Spelman Col lege, where he has taught since 1988. In recent years, his mathe- matical interests have broadened to include computational and visualization issues, such as computer aidedgeometric design (CAGD), computer graphics, image processing, and wavelets, and he has directed undergraduate student research in al l of these topics. 22 /Spelman Science and Math Journal There are two observations worth making here. Firstly, there are circumstances under whichany one of these approxi- mations would b e adequate for our immediate purp oses. For instance, viewed from a sucient distance, they all lo ok the same. Thus,ifoneofthesewas to form a small part of a much larger picture, say a photo on a mantlepiece, or a eeting image in a video, there would b e no need to display a high qualityversion. Secondly, progressive transmission of a sequence of ever-improving approximations to the \real picture" is natural: it's the waymany of us relay information, and the waywe learn new sub jects. It's also the way the p opular Netscap e World Wide Web browser delivers images to users of the web: when we call up a URL (WWW address) which contains an image, that image app ears in installments, starting with an approximation and working up to the nal complete image. All forms of progressive information transmission have one ma jor advantage: the receiver can halt the pro cess and moveonto something else if she decides, based on early information, that she do es not want \the whole picture." This applies equally to learning ab out a candidate for election, listening to p eople recounttheirvacation exp eriences, or grabbing an image on the World Wide Web using Netscap e. Wavelets provide a mathematical way of enco ding numerical information (data) in suchawaythatitislayered according to level of detail. This layering not only facilitates the progressive data transmission mentioned ab ove, but also approxima- tionsatvarious intermediate stages. The p oint is that these approximations can b e stored using a lot less space than the original data, and in situations where space is tight, this data compression is well worthwhile. 2 The wavelet transform In this section, weintro duce the simplest wavelet transform, the so-called Haar wavelet transform, and explain how it can be used to pro duce images likethe rst three in Figure 1, given the last, complete image of Rosa Parks (this image was extracted from a .gif image le downloaded from the World Wide Web.) Matlab numerical and visualization software was used to p erform all of the calculations and generate and display all of the pictures in this manuscript. Each of the digital images in Figure 1 is represented mathematically bya128 128 matrix (array) of numb ers, ranging 5 from 0 (representing black) to some p ositive whole number (representing white). The last image uses 32 = 2 di erent shades of gray, and as such is said to b e a 5-bit image. The numb ers in the sp eci c matrix we used to represent this image range from 0 to 1984, in 31 increments of 64 (these exact numb ers are not imp ortant; they were chosen to avoid fractions in certain calculations later on). Each matrix entry gives rise to a small square which is shaded a constantgraylevel according to its numerical value. We refer to these little squares as pixels; they are more noticable as individual squares when we view the images on a larger scale, suchasin Figure 2(a). Given enough of them on a given region of pap er, as in the 256 256 pixel 8-bit image of Nelson Mandela in Figure 2(b), we exp erience the illusion of a continuously shaded photograph. Figure 2: Rosa Parks (1955) and Nelson Mandela (1990) 2 2 The matrices which sp ecify these images have128 =16; 384 and 256 =65; 536 elements, resp ectively. This fact p oses immediate storage problems. Color images are even bigger, though in one sense they can b e handled by decomp osing into three \grayscale-like" arrays, one for each of the colors red, green and blue. A standard 1.44MB high density oppy disc can only accommo date a handful of large (say, 1024 1024 pixel) high qualitycolorimages; furthermore, as manyof us Review Articles / 23 can attest from p ersonal exp erience, such images are very time consuming to download on the World Wide Web. We only consider grayscale images here. We describ e a scheme for transforming such large arrays of numb ers into arrays that can b e stored and transmitted more eciently; the original images (or go o d approximations of them) can then b e reconstructed by a computer with relatively little e ort. For simplicity,we rst consider the 8 8 = 64 pixel image in Figure 3(b), which is a blow up of a region around the nose in Figure 2. The region extracted is blacked out in Figure 3(a). Figure 3: Rosa Parks - extracting a sub-region for insp ection This image is represented byrows 60 to 67 and columns 105 to 112 of the matrix which de nes Figure 2. Wenow display and name this submatrix: 1 0 576 704 1152 1280 1344 1472 1536 1536 C B 704 640 1156 1088 1344 1408 1536 1600 C B C B C B 768 832 1216 1472 1472 1536 1600 1600 C B C B 832 832 960 1344 1536 1536 1600 1536 C B : P = C B C B 832 832 960 1216 1536 1600 1536 1536 C B C B 960 896 896 1088 1600 1600 1600 1536 C B C B 768 768 832 832 1280 1472 1600 1600 A @ 448 768 704 640 1280 1408 1600 1600 (Wehave contrast stretched in Figure 3(b) to highlight the often subtle variations among the various shades of gray: the smallest and largest matrix entries, 448 and 1600, are depicted as black and white, resp ectively,whichisnottheway they app eared in Figure 2(a).) To demonsrate how to wavelet transform such a matrix, we rst describ e a metho d for transforming strings of data, called averaging and di erencing. Later, we'll use this technique to transform an entire matrix as follows: Treat eachrowas a string, and p erform the averaging and di erencing on each one to obtain a new matrix, and then apply exactly the same steps on each column of this new matrix, nally obtaining a row and column transformed matrix. To understand what averaging and di erencing do es to a data string, for instance the rst row in the matrix P ab ove, consider the table b elow. Successiverows of the table show the starting, intermediate, and nal results. 576 704 1152 1280 1344 1472 1536 1536 640 1216 1408 1536 -64 -64 -64 0 928 1472 -288 -64 -64 -64 -64 0 1200 -272 -288 -64 -64 -64 -64 0 3 There are 3 steps in the transform pro cess b ecause the data string has length 8 = 2 .