Chapter 2 Image Processing Preliminary – 24 Oct 2013

A picture is a fact. – Ludwig Wittgenstein
The soul never thinks without a picture. – Aristotle

We are visual organisms who depend heavily on our ability to see the world around us. Our brains have evolved to be superb image processing engines. From two simultaneous two-dimensional images, one from each eye, we automatically extract three-dimensional spatial information. We can do complex pattern analysis effortlessly, such as counting the number of cars on the street in front of a house or identifying the quarters in a handful of change. In modern science, data is often in the form of pictures or images. From these, the researcher must extract quantitative information. The tasks which we do almost effortlessly can be very challenging for software. However, once the software is able to accomplish a challenge, it can repeat the task many thousands of times when a human might lose interest after only a few. There are several image processing packages available for Python. We will use the Python Image Library (PIL) and scipy's multidimensional image processing package (ndimage).

I. Image Fundamentals

An image is a two dimensional array where every pixel corresponds to a single (x, y) or (column, row) point in the picture. A pixel may take several forms. Some images have only two possible values for each pixel, on or off. Such an image is binary or black and white. Some images may have a range of gray values ranging from all black to all white and denoted by an integer. Cameras often generate either 8-bit, 10-bit, or 12-bit images. In an 8-bit image, pixel values range from 0 (all black) to 255 (all white) with a smooth range of gray in between. Color cameras produce color images. The most common color format is known as RGB. It uses 24 bits per pixel. Of these 24 bits, each color (red, green, blue) is represented by an 8-bit grayscale value: 8 bits corresponding to reds, 8 bits corresponding to greens, and 8 bits corresponding to blues. The result is more than 16 million different colors.1

The display of an image within a computer requires some explanation. Modern monitors use combinations of red, green, and blue to determine the color of each pixel, and most monitors are designed to use 24-bit RGB representations. Consequently, an RGB image contains all the information required to specify the color of every pixel. However, a grayscale image contains only a single integer number (usually 8–16 bits) for every pixel. The computer must map each number onto a specific RGB value for the monitor to display. This is done using a colormap, and many colormaps are possible. The author most commonly uses gray() and spectral(), which are a grayscale colormap ranging from black to white and a color spectrum ranging from violet to red.

1 You can use python to determine the actual range, which is power(2,24) different colors.

Figure 1: Available matplotlib colormaps (from matplotlib.com/gallery).
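Footnote 1 suggests checking the size of the 24-bit RGB color space in python; a quick illustrative session:

```python
# Number of distinct colors in a 24-bit RGB image:
# 8 bits per channel gives 256 levels for each of red, green, and blue.
levels_per_channel = 2 ** 8              # 256
total_colors = levels_per_channel ** 3   # same as 2**24

print(total_colors)  # 16777216, i.e. "more than 16 million"
```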

Figure 2: (a) RGB color. (b) Grayscale. (c) Grayscale image thresholded to make all pixels less than 150 become 0 (black) and all greater than 150 become 255 (white), resulting in a black and white or binary image.

II. Reading and Writing Images2

In [1]: from scipy import misc, ndimage
In [2]: import Image
In [3]: image=Image.open("4.1.08.")
In [4]: image1=misc.fromimage(image,flatten=0)
In [5]: imshow(image1)

The first line loads scipy's miscellaneous routines (misc) and the multidimensional image processing package, ndimage. The second loads the Python Image Library (PIL). PIL is used to read and write specific graphics file formats. Once in python, the images are treated as arrays because ndimage and numpy are exceptionally good at manipulating arrays. Line 3 opens the image using the PIL. Note that PIL interprets the filename and determines the type of file. It is able to work with a wide variety of file formats including GIF, JPEG, PNG, TIFF, and BMP. Line 4 reads the PIL image into a numpy array (called image1). If flatten is set to true (1), then the image is converted to grayscale. Finally, line 5 displays the image1 array in a figure window. If the image is full-color, then it appears with the correct colors. If it is grayscale, then it is rendered using the current colormap. Note that row 0 and column 0 is at the top left of the image. Scipy provides a convenient routine for writing images to files.

In [6]: misc.imsave("new.png",image1)
In [7]: misc.imsave("new.jpg",image1)

2 An excellent source of test images is the University of Southern California Signal and Image Processing Institute Image Database (USC-SIPI). The images are all in the TIFF format in either 8 bit grayscale or 24 bit color. They are broadly separated into four types: textures, high altitude aerial photos, miscellaneous, and image sequences.

Line 6 writes image1 as a png file while line 7 writes it as a jpg file. The image formats understood by imsave depend on the backend being used, but most backends support png, ps, eps, and svg.
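PIL itself can also write a numpy array back out, independent of any plotting backend. The sketch below is illustrative, not from the chapter: it builds a synthetic 8-bit gradient, round-trips it through the PNG format using an in-memory buffer instead of a file on disk, and uses the modern Pillow import form of the Image module.

```python
import io
import numpy as np
from PIL import Image  # the chapter's "import Image" is the classic PIL form

# A small synthetic 8-bit grayscale gradient (stand-in for a real image array).
image1 = np.tile(np.arange(0, 256, 8, dtype=np.uint8), (32, 1))

# Write the array out as PNG, here to an in-memory buffer rather than a file.
buf = io.BytesIO()
Image.fromarray(image1).save(buf, format="PNG")

# Read it back; PNG is lossless, so the pixel data survives the round trip.
buf.seek(0)
restored = np.asarray(Image.open(buf))
print((restored == image1).all())  # True
```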

III. Simple Image Processing

III.A. Region of Interest

Very often, only a portion of an image is of interest. We can use the array properties of numpy to select or clip a subregion of a larger image on which to operate. This subregion is referred to as the region of interest or ROI.

In [8]: image1_ROI=image1[TOP:BOTTOM,LEFT:RIGHT].copy()

where LEFT, RIGHT, TOP, and BOTTOM are the integer values of the positions in the array that are the edges of the ROI.
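A minimal sketch of an ROI crop on a synthetic array (the bounds here are arbitrary illustrative values):

```python
import numpy as np

# Synthetic 100x120 "image" whose pixel values encode their row number,
# so we can easily check which rows ended up in the crop.
image1 = np.tile(np.arange(100).reshape(100, 1), (1, 120))

TOP, BOTTOM, LEFT, RIGHT = 20, 40, 10, 70
image1_ROI = image1[TOP:BOTTOM, LEFT:RIGHT].copy()

print(image1_ROI.shape)   # (20, 60): BOTTOM-TOP rows by RIGHT-LEFT columns
print(image1_ROI[0, 0])   # 20: the first row kept is row TOP

# Because of .copy(), editing the ROI leaves the original image untouched.
image1_ROI[:] = 0
print(image1[TOP, LEFT])  # still 20
```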

III.B. Addition and Multiplication of Images

Probably the single most powerful technique in image processing is background subtraction. In background subtraction, a background image is subtracted from a data image. The background image has exactly the same view but without whatever you are taking data on. Properly done, this zeros all of the image except exactly what you are interested in. Because of small amounts of noise, it is generally a good idea to take the absolute value of the difference to keep any pixels from going negative.

NewImage = abs(Image − Background)    (1)

One way to quantify an image is to produce a histogram of the image's pixel values using histogram(input image, min, max, bins).
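Equation (1) hides one practical pitfall not mentioned above: with 8-bit unsigned arrays, the subtraction wraps around before abs() is ever applied. The sketch below (synthetic arrays, illustrative values) casts to a signed type first:

```python
import numpy as np

# Synthetic 8-bit data: a flat background plus one bright "feature" pixel.
background = np.full((5, 5), 100, dtype=np.uint8)
image = background.copy()
image[2, 2] = 250              # the feature we are taking data on
image[0, 0] = 98               # background pixel with a little noise

# Naive uint8 subtraction wraps around: 98 - 100 becomes 254, not -2.
naive = image - background
print(naive[0, 0])             # 254 -- wraparound artifact

# Cast to a signed type first, then apply equation (1).
new_image = np.abs(image.astype(int) - background.astype(int))
print(new_image[2, 2])         # 150: the feature stands out
print(new_image[0, 0])         # 2: the noise stays small instead of wrapping
```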

In [9]: ndimage.measurements.histogram(image1, 0, 255, 50)

Here, image1 is the image array and the histogram will be 50 bins between pixel values min and max. The routine returns an array of bin counts. Since the image is just an array, one can perform arithmetic on it.
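The same 50-bin histogram can be cross-checked with numpy's own np.histogram; note that it takes a (min, max) range tuple rather than separate arguments. A sketch on a synthetic array:

```python
import numpy as np

# Synthetic 8-bit image: half dark (value 10), half bright (value 240).
image1 = np.zeros((20, 20), dtype=np.uint8)
image1[:, :10] = 10
image1[:, 10:] = 240

# 50 bins spanning pixel values 0..255, analogous to
# ndimage.measurements.histogram(image1, 0, 255, 50).
counts, edges = np.histogram(image1, bins=50, range=(0, 255))

print(counts.sum())   # 400: every pixel lands in some bin
print(counts[1])      # 200: value 10 falls in the second 5.1-wide bin
print(len(edges))     # 51 bin edges bound 50 bins
```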

In [10]: low=image1.min()
In [11]: high=image1.max()
In [12]: image2=image1.copy()-low
In [13]: image2=image2.astype(int)*(255-0)/(high-low)

If the histogram of image1 covers the range of pixel values from low to high, then the above will stretch the histogram of the image to fully use the entire range 0 to 255. Warning: it is important that no pixel value be changed to exceed the allowed pixel range.

III.C. Thresholding

Often a simple grayscale is the easiest image type to process. 8 bit grayscale corresponds to 8 bits per pixel where 0 is black and 255 is white3. We can threshold the image by setting all pixels above a certain value to a single value. Similarly, we could threshold the image by setting all pixels below a certain value to a particular value. This is easy to do using the array operations of numpy.

In [14]: image2=image1.copy()
In [15]: image2[image1<120]=0
In [16]: image2[image1>=120]=255
In [17]: image2.sum()/image2.size
In [18]: image2.mean()

Here line 14 makes a copy of the image array. (Remember that numpy tends to operate on a single copy of an array to save time and memory unless you explicitly tell it otherwise.) Line 15 sets every pixel with a value less than 120 in the new image array to 0 (black), and line 16 sets every pixel with a value greater than or equal to 120 equal to 255 (white). The new image is now binary or black and white. Finally, lines 17 and 18 print the mean pixel value in the array image2. These are standard numpy array functions.
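Because the thresholded image contains only 0s and 255s, the mean immediately gives the fraction of white pixels. A sketch on a synthetic array (the bright square is an illustrative stand-in for a real feature):

```python
import numpy as np

# Synthetic grayscale image: dark background with a brighter 5x5 square.
image1 = np.full((10, 10), 50, dtype=np.uint8)
image1[2:7, 2:7] = 200                    # 25 bright pixels out of 100

# Threshold at 120 exactly as in the text.
image2 = image1.copy()
image2[image1 < 120] = 0
image2[image1 >= 120] = 255

print(set(image2.ravel().tolist()))       # {0, 255}: a binary image
print(image2.mean())                      # 63.75 = 255 * 25/100
print(image2.mean() / 255)                # 0.25: the fraction of white pixels
```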

IV. Filtering by Convolution

A convolution is a method of multiplying a large matrix (the image) with a small one (the kernel). As shown in the following figure, if the kernel is a 3x3 matrix and the image is much larger (MxN), the kernel is aligned above a pixel or cell in the original image matrix, the cell values are multiplied together cell by cell, and the sum of the products becomes the value of the convolution at the central cell or pixel. Then the kernel is moved one place farther along and the multiplication is repeated cell by cell until this has been done at every cell where it can be done. (Edges are awkward and edge effects are to be expected very near them.)

I_convolution(i, j) = Σ_{m=−(M−1)/2}^{+(M−1)/2} Σ_{n=−(M−1)/2}^{+(M−1)/2} h(m, n) I(i−m, j−n)    (2)

where we have assumed the most common form of the kernel, namely a square kernel of MxM where M is odd. Ndimage contains a multidimensional convolution function convolve(image, kernel, output=None, mode='reflect'), where kernel is the name of the kernel matrix and output allows the user to pass in an array to store the filter output. Mode determines how the convolution handles edges. Possible modes include 'constant', 'reflect', 'mirror', and 'wrap'. (See the ndimage documentation for details on each.) When it comes to image processing, the magic is in the choice of kernel. The appropriate kernel can highlight edges or smooth (blur) an image. In our discussion of kernels, we will focus on 3x3 kernels. 5x5, 7x7, etc. kernels allow calculations over a much larger area of the image. All of the 3x3 kernels that we discuss can be extrapolated to larger kernels. Simple blurring or smoothing can be done using several kernels.

3 Remember that 8 bit grayscale has 2^8 = 256 different levels of gray.

Figure 3: Schematic showing a kernel sliding across an image. The value of the convolution at a pixel is the sum of the products of the kernel with the image below it.

Blurring:                Gaussian Blurring:       High-pass Filter         High-pass Filter
                                                  (to sharpen image):      (to sharpen image):
      | 1 1 1 |                | 1 2 1 |          |  0 -1  0 |             |  1 -2  1 |
1/9 * | 1 1 1 |         1/16 * | 2 4 2 |          | -1  5 -1 |             | -2  5 -2 |
      | 1 1 1 |                | 1 2 1 |          |  0 -1  0 |             |  1 -2  1 |

Prewitt Gradient         Prewitt Gradient         Laplacian                Horizontal Sobel:
operator dx:             operator dy:             Edge Detection:
|  1  1  1 |             | 1 0 -1 |               |  0 -1  0 |             | -1 -2 -1 |
|  0  0  0 |             | 1 0 -1 |               | -1  4 -1 |             |  0  0  0 |
| -1 -1 -1 |             | 1 0 -1 |               |  0 -1  0 |             |  1  2  1 |

Horizontal lines:        Vertical lines:          45 degree lines:         Vertical Sobel:
| -1 -1 -1 |             | -1 2 -1 |              | -1 -1  2 |             | -1 0 1 |
|  2  2  2 |             | -1 2 -1 |              | -1  2 -1 |             | -2 0 2 |
| -1 -1 -1 |             | -1 2 -1 |              |  2 -1 -1 |             | -1 0 1 |

Figure 4: Important kernels for blurring, high frequency image enhancement, and edge and line detection.

The Gaussian kernel is often preferred for image processing. In either case, note that the sum of the matrix elements is 1.0. If this were not so, then the filtered image would be slightly brighter or darker than the original. A good kernel for edge detection is

| -1 -1 -1 |
| -1  8 -1 |
| -1 -1 -1 |
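A minimal convolution sketch with ndimage.convolve, using the 1/9 blurring kernel from Figure 4 on a small constant array. With mode='constant' the padding is zero, so values near the edge drop exactly as equation (2) predicts:

```python
import numpy as np
from scipy import ndimage

# The 3x3 blurring kernel from Figure 4: average of the 3x3 neighborhood.
kernel = np.ones((3, 3)) / 9.0

# A small constant "image" so the expected output is easy to reason about.
image = np.ones((5, 5))

blurred = ndimage.convolve(image, kernel, mode='constant')

# Interior pixels average nine 1s, so they stay exactly 1.
print(blurred[2, 2])             # 1.0
# At a corner only four image pixels overlap the kernel; the other five
# positions see the zero padding, so the value falls to 4/9.
print(round(blurred[0, 0], 4))   # 0.4444
```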

The Sobel edge detectors are somewhat less prone to noise than the simple line detectors. One could find both horizontal and vertical lines and then add the two images together to have a single image with all lines.
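One way to sketch that combination uses ndimage's built-in Sobel filter rather than an explicit kernel; here np.hypot merges the two directional images into a gradient magnitude, a common alternative to simple addition (the synthetic one-edge image is illustrative):

```python
import numpy as np
from scipy import ndimage

# Synthetic image with one vertical edge: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 255.0

# Sobel derivative across columns (axis=1) responds to the vertical edge;
# the derivative across rows (axis=0) sees nothing in this image.
sx = ndimage.sobel(image, axis=1)
sy = ndimage.sobel(image, axis=0)

# Combine the two directional responses into a single edge image.
edges = np.hypot(sx, sy)

print(np.abs(sy).max())   # 0.0: no horizontal edges present
print(edges[2, 2] > 0)    # True: the vertical edge is detected
print(edges[2, 0])        # 0.0: flat region away from the edge
```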

V. Review of Commands

Image.open — Opens an image using the PIL.
misc.fromimage — Loads a PIL image as a numpy array.
imshow — Displays an array as an image in a figure window.
misc.imsave — Writes a numpy array to an image file.
gray() — Use the standard grayscale colormap.

VI. Going Farther

Image Processing is a very rich field and this chapter has touched upon only a few topics. Books for further study include: Seul, O'Gorman, and Sammon, Practical Algorithms for Image Analysis: Description, Examples, and Code, Cambridge University Press, 2000.

VII. Problems


Figure 5: Effect of various edge detection kernels on an image. (a) Original image. (b) Effect of horizontal Sobel kernel filter. (c) Effect of vertical Sobel kernel filter. (d) Effect of the good edge detection kernel given above.