VISVESVARAYA TECHNOLOGICAL UNIVERSITY Belgaum-590014, Karnataka

PROJECT REPORT ON
DESIGN AND DEVELOPMENT OF A REAL TIME VIDEO PROCESSING BASED PEST DETECTION AND MONITORING SYSTEM ON FPGA
Submitted in partial fulfilment for the award of the degree of Bachelor of Engineering in Electronics and Communication Engg.
By
SHOBITHA N (1NH11EC099)
SPURTHI N (1NH11EC107)

UNDER THE GUIDANCE OF Dr. Sanjay Jain HOD, ECE Dept., NHCE

Department of Electronics & Communication Engineering New Horizon College of Engineering, Bengaluru - 560103, Karnataka. 2014-15

NEW HORIZON

COLLEGE OF ENGINEERING

(Accredited by NBA, Permanently affiliated to VTU) Kadubisanahalli, Panathur Post, Outer Ring Road, Near Marathalli, Bengaluru-560103, Karnataka

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGG.

CERTIFICATE

This is to certify that the project work entitled DESIGN AND DEVELOPMENT OF A REAL TIME VIDEO PROCESSING BASED PEST DETECTION AND MONITORING SYSTEM ON FPGA is a bonafide work carried out by SHOBITHA N, SPURTHI N bearing USN 1NH11EC099, 1NH11EC107 in partial fulfilment for the award of the degree of Bachelor of Engineering in Electronics & Communication Engg. of the Visvesvaraya Technological University, Belgaum during the academic year 2014-15. It is certified that all corrections/suggestions indicated for internal assessment have been incorporated in the project report deposited in the departmental library and in the main library. This project report has been approved as it satisfies the academic requirements in respect of project work prescribed for the Bachelor of Engineering Degree in ECE.

…………………….. ……………………… ……………………... Guide HoD Principal Dr. Sanjay Jain Dr. Sanjay Jain Dr. Manjunatha Names of the Students: (i) SHOBITHA N (ii) SPURTHI N University Seat Numbers: (i) 1NH11EC099 (ii) 1NH11EC107

External Viva / Orals

Name of the internal / external examiner Signature with date

1…………………………… ………………. 2…………………………… .. ……………..

ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of any task would be incomplete without due reverence given to those who made it possible, whose constant guidance and encouragement crowned our efforts with success. We convey our sincere gratitude to Dr. Manjunath, Principal, NHCE, Bangalore for the facilities provided in college and for the support in numerous ways. We remain indebted to Dr. Sanjay Jain, HOD, Department of E&C, NHCE for providing us permission to take up the project work. We would like to express our profound gratitude to our internal guide Dr. Sanjay Jain, Department of E&C, NHCE. We would also like to express our heartfelt gratitude and indebtedness to our external advisor Mr. Sudhir Rao, Project Manager, World Serve Education for his inspiration and support at each and every stage of the project. We could not have reaped such good results and overwhelming appraisals without his knowledge and unbiased help.

Shobitha. N Spurthi. N


ABSTRACT

Agriculture is an important part of India's economy, and at present India is among the top two farm producers in the world. This sector provides approximately 52 percent of the total number of jobs available in India and contributes around 18.1 percent of the GDP. Agriculture is the only means of living for almost two-thirds of the employed class in India. As stated by the economic data of the financial year 2006-07, agriculture accounted for about 18 percent of India's GDP. The agriculture sector occupies almost 43 percent of India's geographical area. Therefore agriculture is very important for a country like India. … [1]
There are many problems faced by farmers. Some of them are as follows:
• Irrigation: the artificial application of water to the land or soil. It is used to assist in the growing of agricultural crops, maintenance of landscapes, and re-vegetation of disturbed soils in dry areas and during periods of inadequate rainfall.
• Storage: the action or method of storing something for future use.
• Attack of insects/pests: any organism that damages crops, injures or irritates livestock or man, or reduces the fertility of land. … [2]
In our project we concentrate on the problem of insect/pest attack. Nowadays pest monitoring is done by manual inspection and by spraying pesticides. The major problem with manual inspection is that the crops cannot be monitored 24/7, and the problem with spraying pesticides is that, when sprayed in large quantities, they affect the skin of the farmers as well as the health of the consumers. Therefore we plan to automate the process of pest monitoring, detection and pesticide spraying, so that pesticides are sprayed only when required, using real-time video processing techniques.


Acknowledgement i
Abstract ii
List of contents iii
List of figures vi

LIST OF CONTENTS

CHAPTER I: INTRODUCTION 1-2
1.1 Introduction 1
1.2 Literature Survey 1
1.3 Hardware Components Used 1
1.4 Software Components Used 2
1.5 Methodology 2

CHAPTER II: WHAT IS IMAGE PROCESSING 3-15
2.1 Purpose of Image Processing 3
2.2 Types 3
2.3 Pixels, Image sizes and Aspect ratio 4-7
2.3.1 Pixels 4
2.3.2 Aspect ratios 6
2.3.3 Bits per pixel 7
2.3.4 Sub pixels 7
2.4 How does a camera work? 8-14
2.4.1 Camera: Focus 10
2.4.2 Understanding the basics 11
2.4.3 A web camera 11
2.4.4 Resolution 11
2.4.5 How big are the sensors? 12
2.4.6 Capturing color: Beam Splitter 12
2.4.7 Compression 12
2.4.8 Controlling light 13
2.4.9 Aperture 13
2.4.10 Shutter speed 13
2.4.11 Exposing the sensors 13
2.4.12 Lens and focal length 14
2.4.13 Cameras work 14
2.4.14 Optical zoom vs. Digital zoom 14

CHAPTER III: COLOR IMAGE PROCESSING 16-26
3.1 Color fundamentals 16

3.2 Perception of colors by the Human eye 16
3.3 Characteristics of colors 17
3.4 Color models 19-26
3.4.1 The RGB color model 19
3.4.2 XYZ (CIE) 21
3.4.3 The CMY and CMYK color model 21
3.4.4 The YCbCr color model 22
3.4.5 The HSI color model 24
3.4.6 Color planes, perpendicular to intensity axis 24
3.4.7 Converting colors from RGB to HSI 25
3.4.8 The HSI color models 25
3.4.9 Converting colors from HSI to RGB 25
3.4.10 Manipulation of HSI images 26
3.4.11 Color complements 26

CHAPTER IV: SEGMENTATION 27-28
4.1 Thresholding 27
4.2 Edge Based Segmentation 27
4.3 Region Based Segmentation 28

CHAPTER V: MORPHOLOGICAL OPERATION 29-31
5.1 Morphological dilation of a Binary image 29
5.2 Morphological dilation of a Grayscale image 30
5.3 Dilating an image 30
5.4 Eroding an image 31
5.5 Combining Dilation and Erosion 31

CHAPTER VI: IMPLEMENTATION OF PROJECT 32-38
6.1 Webcamera 32
6.2 FPGA 32
6.3 Block diagram 32
6.4 MATLAB 33
6.4.1 Flow chart 33
6.5 Simulink 34-36
6.5.1 From a video device 34
6.5.2 MATLAB function: Segmentation 35

6.5.3 Thresholding segmentation 35
6.5.4 Erosion 35
6.5.5 Dilation 36
6.6 FPGA 36-37
6.6.1 FPGA design and programming 37
6.6.2 FPGA applications 37

CHAPTER VII: RESULTS 39-40
7.1 Output after every block 39

Advantages and Disadvantages 41
Applications 41
Future scope 41
Conclusion 41

APPENDIX
FPGA – Introduction 42
Webcamera 43
• iBall Face2Face K20
• Features
• Specifications
Introduction to MATLAB 44
• What is MATLAB?
Introduction to Simulink 44

Bibliography 45


LIST OF FIGURES
Fig 2.1 Block diagram of Image Processing.
Fig 2.2 Pixels.
Fig 2.3 A photograph of sub-pixel display elements on a laptop's LCD screen.
Fig 2.4 Pixelated images.
Fig 2.5 Geometry of color elements of various CRT and LCD displays – phosphor dots in a color CRT display (top row) bear no relation to pixels or sub pixels.
Fig 2.6 Formation of image using convex lens.
Fig 2.7 Formation of image using concave lens.
Fig 2.8 Different stages of an Image.
Fig 2.9 How the original (left) image is split in a beam splitter.
Fig 3.1 Color spectrum seen by passing white light through a Prism.
Fig 3.2 Wavelengths comparing the visible range of the electromagnetic spectrum.
Fig 3.3 Absorption of light by the Red, Green and Blue cones in the human eye as a function of wavelength.
Fig 3.4 Primary and secondary colors of light and pigments.
Fig 3.5 Chromaticity Diagram.
Fig 3.6 Typical color gamut of color monitors (triangle) and color printing devices (irregular region).
Fig 3.7 The RGB color model.
Fig 3.8 Schematic of the RGB Color Cube. Points along the main diagonal have gray values, from black at the origin to white at point (1,1,1).
Fig 3.9 RGB 24-bit color cube.
Fig 3.10 Bayer filter Mosaic of Bayer RGB pattern.
Fig 3.11 Demosaicking process.
Fig 3.12 The CMYK color model.
Fig 3.13 A visualization of YCbCr.
Fig 3.14 A color image and its Y, CB and CR components. The Y image is essentially a greyscale copy of the main image.
Fig 3.15 RGB to YCbCr Conversion.

Fig 3.16 Conceptual Relationship between the RGB and HSI color models.
Fig 3.17 Hue and saturation in the HSI color model.
Fig 3.18 The HSI color model based on (a) triangular color planes and (b) circular color planes.
Fig 3.19 Complements on the color circle.
Fig 4.1(a) Thresholding.
Fig 4.1(b) Edge-based segmentation.
Fig 4.1(c) Region-based segmentation.
Fig 5.1 Processing for a Grayscale Image.
Fig 5.2 Understanding Structural Element.
Fig 5.3 Original and Dilated image.
Fig 5.4 Original and Eroded image.
Fig 6.1 Webcamera.
Fig 6.2 Block Diagram.
Fig 6.3 Simulink Block Diagram.
Fig 7.1 Output of each image.


Chapter I

DESIGN AND DEVELOPMENT OF A REAL TIME VIDEO PROCESSING BASED PEST DETECTION AND MONITORING SYSTEM ON FPGA

1.1 INTRODUCTION
Agriculture is the only means of living for almost two-thirds of the employed class in India. The agriculture sector occupies almost 43 percent of India's geographical area. But there are many problems faced by farmers, such as irrigation, storage, attack of pests etc. The major problem with manual inspection is that the crops cannot be monitored 24/7, and spraying huge amounts of pesticides can cause skin problems for consumers as well as farmers. Therefore we automate the usage of pesticides, spraying only when required, using video processing.
1.2 LITERATURE SURVEY
Inside a greenhouse, attacks from insects or fungi are fast and frequent. These pests can be detected by performing 'in situ' early pest detection in the greenhouse based on video analysis. The main target is to detect pests on plant organs such as green leaves. It is an autonomous system based on video analysis for 'in situ' insect detection. The process begins by setting up video cameras uniformly in a horizontal plane. It utilizes a yellow colour sticky trap on which insects come and get stuck. The cameras observe the sticky traps to detect flying insects. After that, a picture is taken to count the number of pests on the yellow card. The detection technique is based on segmentation techniques such as background modelling or subtraction, in order to detect only moving objects.
Advantages – This process detects low infestation stages because it allows continuous surveillance during daylight, which favours rapid protection decisions.
Disadvantages – The disadvantage of this approach is that it detects pests on the cards and not on the plant itself.
1.3 HARDWARE COMPONENTS USED
1. Web camera
2. Xilinx FPGA (Spartan 3E)
3. Motors (BO motors)
4. Motor drivers (L293D)
5. Display (LCD)

1.4 SOFTWARE COMPONENTS USED
1. MATLAB (R2011b)
2. Simulink
3. Xilinx System Generator 12.3

1.5 METHODOLOGY
As we have seen, in agriculture there are many problems faced by the farmers, such as irrigation, storage, pest control etc. In our project we mainly concentrate on the attack of pests, as this is one major problem faced by the farmers. We build a robot car which monitors the farm 24/7. A camera attached to the robot car captures real-time video of the plants. Once the robot detects pests it stops and sprays the pesticide depending on the count of the pests, and it displays the count of pests and the number of times the pesticide is to be sprayed on the LCD screen. All these implementations are done with the help of the FPGA.
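The frame-by-frame loop described above can be sketched in MATLAB, the project's prototyping environment. This is only an illustrative sketch: the camera adaptor name, the fixed threshold and the one-spray-per-five-pests rule are assumptions, and the spray/LCD calls are placeholders for the FPGA-side interfaces.

% Illustrative sketch of the pest-monitoring loop (assumptions noted above).
vid = videoinput('winvideo', 1);            % web camera (adaptor name assumed)
set(vid, 'ReturnedColorSpace', 'rgb');

while true
    frame = getsnapshot(vid);               % capture one frame
    gray  = rgb2gray(frame);
    bw    = gray < 100;                     % assumed threshold; segmentation is discussed in Chapter IV
    bw    = imdilate(imerode(bw, strel('disk', 2)), strel('disk', 2));  % erosion then dilation (Chapter V)
    cc    = bwconncomp(bw);                 % connected components = pest candidates
    pestCount = cc.NumObjects;

    if pestCount > 0
        sprays = ceil(pestCount / 5);       % assumed rule: one spray per 5 pests
        fprintf('Pests: %d, sprays: %d\n', pestCount, sprays);
        % triggerSprayer(sprays); displayOnLCD(pestCount, sprays);  % hypothetical FPGA-side calls
    end
end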


Chapter II

WHAT IS IMAGE PROCESSING?
Image processing is a method to convert an image into digital form and perform some operations on it, in order to get an enhanced image or to extract some useful information from it. It is a type of signal processing in which the input is an image, such as a video frame or photograph, and the output may be an image or characteristics associated with that image. Usually an image processing system treats images as two-dimensional signals and applies standard signal processing methods to them. It is among the rapidly growing technologies today, with applications in various aspects of business. Image processing also forms a core research area within the engineering and computer science disciplines. Image processing basically includes the following three steps.

• Importing the image with an optical scanner or by digital photography.
• Analysing and manipulating the image, which includes image enhancement and spotting patterns that are not visible to human eyes, as in satellite photographs.
• Output, the last stage, in which the result can be an altered image or a report based on image analysis.
2.1 PURPOSE OF IMAGE PROCESSING
The purpose of image processing is divided into 5 groups. They are:
1. Visualization – Observe the objects that are not visible.
2. Image sharpening and restoration – To create a better image.
3. Image retrieval – Seek the image of interest.
4. Measurement of pattern – Measure various objects in an image.
5. Image recognition – Distinguish the objects in an image.
2.2 TYPES
The two types of methods used for image processing are analog and digital image processing. Analog or visual techniques of image processing can be used for hard copies like printouts and photographs. Image analysts use various fundamentals of interpretation while using these visual techniques. The image processing is not just confined to the area that has to be studied but also depends on the knowledge of the analyst. Association is another important tool in image processing through visual techniques. So analysts apply a combination of personal knowledge and collateral data to image processing. Digital processing techniques help in the manipulation of digital images by using computers. As raw data from imaging sensors on a satellite platform contains deficiencies, to get over such flaws and to retain the originality of the information, it has to undergo various phases of processing. The three general phases that all types of data have


to undergo while using the digital technique are pre-processing, enhancement and display, and information extraction.
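The three steps listed above map directly onto a few MATLAB calls. A minimal sketch, with a placeholder file name:

% The three basic steps of image processing, sketched in MATLAB.
img      = imread('leaf.jpg');              % 1. importing the image (placeholder file)
gray     = rgb2gray(img);                   % 2. analysing/manipulating:
enhanced = imadjust(gray);                  %    simple contrast enhancement
imwrite(enhanced, 'leaf_enhanced.jpg');     % 3. output: the altered image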

The stages of digital image processing shown in the block diagram, arranged around a knowledge base, are: 1. image acquisition; 2. image enhancement; 3. image restoration; 4. color image processing; 5. wavelets and multiresolution processing; 6. compression; 7. morphological processing; 8. segmentation; 9. representation and description; 10. object recognition.

Fig 2.1 Block Diagram of Image Processing

2.3 PIXELS, IMAGE SIZES AND ASPECT RATIOS
Let's start with one surprising fact: a pixel has no size or shape. At the time it's born, it's simply an electrical charge, much like the static electricity that builds up on your body as you shuffle across a carpet on a dry day. A pixel is only given size and shape by the device you use to display or print it. Understanding how pixels and image sizes relate to one another takes a little effort, but you need to bring nothing more to the process than your curiosity and elementary school arithmetic skills.
2.3.1 Pixels
A pixel begins its life on the camera's sensor during that flickering moment when the shutter is open. The size of each photo site on the image sensor can be measured, but the pixels themselves are just photons, soon to be converted into electrical charges, and then into zeros and ones. These numbers, just like any other numbers that run through your head, have no physical size.

Fig 2.2 Pixels
This example shows an image with a portion greatly enlarged, in which the individual pixels are rendered as small squares and can easily be seen.

Although the captured pixels have no physical dimensions, a sensor's size is specified just like a digital photo's, except the count is the number of photosites that it has on its surface instead of pixels. In most cases the number of photosites and the number of pixels are roughly the same, since each photo site captures one pixel. A pixel is generally thought of as the smallest single component of a digital image. However, the definition is highly context-sensitive. For example, there can be "printed pixels" in a page, or pixels carried by electronic signals, or represented by digital values, or pixels on a display device, or pixels in a digital camera (photo sensor elements). This list is not exhaustive, and depending on context, there are several terms that are synonymous in particular contexts, such as pel, sample, byte, bit, dot, spot, etc. The term "pixels" can be used in the abstract, or as a unit of measure, in particular when using pixels as a measure of resolution, such as: 2400 pixels per inch, 640 pixels per line, or spaced 10 pixels apart.

Fig 2.3 A photograph of sub-pixel display elements on a laptop's LCD screen.
Since pixels stored in an image file have no physical size or shape, it's not surprising that the number of pixels doesn't by itself indicate a captured image's sharpness or size. This is because the size of each captured pixel, and the image of which it's a part, is determined by the output device. The device can spread the available pixels over a small or large area on the screen or printout. If the pixels in an image are squeezed into a smaller area, the image gets smaller and the perceived sharpness increases (from the same viewing distance). Images on high-resolution screens and printouts look sharper only because the available pixels are smaller and grouped into a small area, not because there are more pixels. As pixels are enlarged, an image is spread over a larger area, and its perceived sharpness falls (from the same viewing distance). When enlarged past a certain point, the individual pixels begin to show and the image becomes pixelated.

Fig 2.4 Pixelated images
A pixel does not need to be rendered as a small square. This image shows alternative ways of reconstructing an image from a set of pixel values, using dots, lines, or smooth filtering. To visualize the concept, imagine two tile mosaics, one with small tiles and one with large.

• If both mosaics cover an area of the same size, the one created using small tiles has more tiles, so it has sharper curves and more detail.
• If there is the same number of large and small tiles, the area covered by the small tiles is smaller. When viewing both mosaics from the same distance, the smaller one looks sharper. However, if you view the small mosaic from close up, its sharpness and detail appear almost identical to the larger one viewed from farther away.
To make an image larger or smaller for a given output device, you must add or subtract pixels. This process, called re-sampling, can be done with a photo editing program, or by an application you're using to print an image.
• When an image is re-sampled to make it larger, extra pixels are added and the color of each new pixel is determined by the colors of its neighbours.
• When an image is re-sampled to make it smaller, some pixels are deleted.
2.3.2 Aspect Ratios
If you have ever tried to center a photo on a sheet of paper so there are even borders all around the image, you have been dealing with the concept called "aspect ratio". This is the ratio between the width and height of an image, screen display, paper or any other two-dimensional rectangle. To calculate an aspect ratio, divide the largest number in a rectangle's size by the smallest number. The numbers can be in mm, inches, pixels or any other unit of measurement. For example, a 35mm slide or negative is 1.5 inches wide by 1 inch tall, so its aspect ratio is 1.5 to 1. A square has an aspect ratio of 1:1. If a camera captures an image 3000 x 2000 pixels in size, 3000 divided by 2000 gives an aspect ratio of 1.5, the same as 35mm film. Aspect ratios are usually expressed in one of three ways:
• When expressed as 1.5 to 1 or 1.5:1, the actual numbers calculated in the division process are used, even though one has a decimal place.
• To remove the decimal, the numbers are raised to a new ratio so both numbers are whole. In our example, 1.5 to 1 would be raised to 3 to 2. That's what is done with TV screen aspect ratios. The aspect ratio for normal TV is referred to as 4:3 and HDTV as 16:9.
• In a few cases, where one part of the ratio is assumed to be 1, just the other part is given. For example, a 1.5:1 ratio is expressed as 1.5.
Aspect ratios present a problem when printing or displaying images. Most cameras don't capture images with the same aspect ratio as the 11 x 8.5 paper we print on – which has an aspect ratio of 1.29 (11 divided by 8.5). Few have the same aspect ratio as the screens we display them on. Even some software that prints contact sheets crops images – greatly lowering their usefulness when you try to evaluate images for printing. When the aspect ratios don't match, here are your options.
• Crop the image to the desired aspect ratio. Programs such as Photoshop let you crop or select areas of an image using any aspect ratio that you specify. To do it manually:
1. Determine the aspect ratio you want to use.
2. Determine how high the image needs to be in pixels.
3. Multiply the height by the aspect ratio to determine how wide the image should be in pixels.


When you know the width and want to find the height, divide the aspect ratio's largest number into the smallest and multiply the width by that number. For example, if the aspect ratio is 1.5:1, divide 1 by 1.5 to get 0.667. If the image is 3 inches wide, 3 x 0.667 tells you the height is 2 inches.
• Size the image so it fills the available space in one direction even though some of it extends past the edges in the other dimension. In effect, you are cropping the image.
• Size the image leaving unequal borders around it. You can then trim it or fix the problem while matting it for framing.
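The width/height arithmetic above can be checked, and an image cropped to a chosen aspect ratio, with a few MATLAB lines. A sketch assuming a 1.5:1 (3:2) target ratio and a placeholder file name:

% Centre-crop an image to a 1.5:1 (3:2) aspect ratio.
img = imread('photo.jpg');          % placeholder file name
[h, w, ~] = size(img);
target = 1.5;                       % desired width/height ratio
if w / h > target                   % too wide: trim the width
    newW = round(h * target);
    x0   = floor((w - newW) / 2);
    img  = img(:, x0+1 : x0+newW, :);
else                                % too tall: trim the height
    newH = round(w / target);
    y0   = floor((h - newH) / 2);
    img  = img(y0+1 : y0+newH, :, :);
end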

2.3.3 Bits per pixel
The number of distinct colors that can be represented by a pixel depends on the number of bits per pixel (bpp). A 1 bpp image uses 1 bit for each pixel, so each pixel can be either on or off. Each additional bit doubles the number of colors available, so a 2 bpp image can have 4 colors, and a 3 bpp image can have 8 colors.
• 1 bpp = 2 colors (monochrome)
• 2 bpp = 4 colors
• 3 bpp = 8 colors
• 8 bpp = 256 colors
• 16 bpp = 65,536 colors ("High color")
• 24 bpp = 16,777,216 colors ("True color")
For color depths of 15 or more bits per pixel, the depth is normally the sum of the bits allocated to each of the red, green, and blue components. High color, usually meaning 16 bpp, normally has five bits for red and blue, and six bits for green, as the human eye is more sensitive to errors in green than in the other two primary colors. For applications involving transparency, the 16 bits may be divided into five bits each of red, green and blue, with one bit left for transparency. A 24-bit depth allows 8 bits per component. On some systems, 32-bit depth is available: this means that each 24-bit pixel has an extra 8 bits to describe its opacity (for purposes of combining with another image).
2.3.4 Sub pixels

Fig 2.5 Geometry of color elements of various CRT and LCD displays – phosphor dots in a color CRT display (top row) bear no relation to pixels or sub pixels.
Many display and image-acquisition systems are, for various reasons, not capable of displaying or sensing the different color channels at the same site. Therefore, the pixel grid is divided into single-color regions that contribute to the displayed or sensed color

when viewed at a distance. In some displays, such as LCD, LED and plasma displays, these single-color regions are separately addressable elements, which have come to be known as sub pixels. When the square pixel is divided into three sub pixels, each sub pixel is necessarily rectangular. In display industry terminology, sub pixels are often referred to as pixels, as they are the basic addressable elements from the viewpoint of the hardware, and so they are called pixel circuits rather than sub-pixel circuits. Most digital camera image sensors use single-color sensor regions, for example using the Bayer filter pattern, and in the camera industry these are known as pixels just like in the display industry, not sub pixels. For systems with sub pixels, two different approaches can be taken:
• The sub pixels can be ignored, with full-color pixels being treated as the smallest addressable imaging element.
• The sub pixels can be included in rendering calculations, which requires more analysis and processing time, but can produce apparently superior images in some cases.
This latter approach, referred to as sub-pixel rendering, uses knowledge of pixel geometry to manipulate the three colored sub pixels separately, producing an increase in the apparent resolution of color displays. While CRT displays use red-green-blue-masked phosphor areas, dictated by a mesh grid called the shadow mask, it would require a difficult calibration step to be aligned with the displayed pixel raster, and so CRTs do not currently use sub-pixel rendering. The concept of sub pixels is related to samples.
2.4 HOW DOES A CAMERA WORK?
Photography is undoubtedly one of the most important inventions in history – it has truly transformed how people conceive the world. Now we can "see" all sorts of things that are actually many miles and years away from us. Photography lets us capture moments in time and preserve them for years to come. The basic technology that makes all of this possible is fairly simple. A still film camera is made of three basic elements: an optical element (the lens), a chemical element (the film) and a mechanical element (the camera body itself). As we'll see, the only trick to photography is calibrating and combining these elements in such a way that they record a crisp, recognizable image. There are many different ways of bringing everything together. In this section, we'll look at a manual single-lens-reflex (SLR) camera. This is a camera where the photographer sees exactly the same image that is exposed to the film and can adjust everything by turning dials and clicking buttons. Since it doesn't need any electricity to take a picture, a manual SLR camera provides an excellent illustration of the fundamental processes of photography. The optical component of the camera is the lens. At its simplest, a lens is just a curved piece of glass or plastic. Its job is to take the beams of light bouncing off the object and redirect them so they come together to form a real image, an image that looks just like the scene in front of the lens.


But how can a piece of glass do this? The process is actually very simple. As light travels from one medium to another, it changes speed. Light travels more quickly through air than it does through glass, so a lens slows it down. When light waves enter a piece of glass at an angle, one part of the wave will reach the glass before another and so will start slowing down first. This is something like pushing a shopping cart from pavement to grass, at an angle. The right wheel hits the grass and slows down while the left wheel is still on the pavement. Because the left wheel is briefly moving more quickly than the right wheel, the shopping cart turns to the right as it moves onto the grass. The effect on light is the same: as it enters the glass at an angle, it bends in one direction. It bends again when it exits the glass, because parts of the light wave enter the air and speed up before other parts of the wave. In a standard converging, or convex lens, one or both sides of the glass curve out. This means rays of light passing through will bend toward the center of the lens on entry. In a double convex lens, such as a magnifying glass, the light will bend when it exits as well as when it enters.

Fig 2.6 Formation of image using convex lens.
This effectively reverses the path of light from an object. A light source, say a candle, emits light in all directions. The rays of light all start at the same point – the candle's flame – and then are constantly diverging. A converging lens takes those rays and redirects them so they are all converging back to one point. At the point where the rays converge, you get a real image of the candle. In the next couple of sections, we'll look at some of the variables that determine how this real image is formed.


Fig 2.7 Formation of image using concave lens
2.4.1 Camera: Focus
We've seen that a real image is formed by light moving through a convex lens. The nature of this real image varies depending on how the light travels through the lens. This light path depends on two major factors:
• The angle of the light beam's entry into the lens
• The structure of the lens
The angle of light entry changes when you move the object closer to or farther away from the lens. You can see this in the diagram below. The light beams from the pencil point enter the lens at a sharper angle when the pencil is closer to the lens and a more obtuse angle when the pencil is farther away. But overall, the lens only bends the light beam to a certain total degree, no matter how it enters. Consequently, light beams that enter at a sharper angle will exit at a more obtuse angle, and vice versa. The total "bending angle" at any particular point on the lens remains constant. As you can see, light beams from a closer point converge farther away from the lens than light beams from a point that's farther away. In other words, the real image of a closer object forms farther away from the lens than the real image of a more distant object. You can observe this phenomenon with a simple experiment. Light a candle in the dark, and hold a magnifying glass between it and the wall. You will see an upside-down image of the candle on the wall. If the real image of the candle does not fall directly on the wall, it will appear somewhat blurry: the light beams from a particular point don't quite converge at this point. To focus the image, move the magnifying glass closer to or farther away from the candle.
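The standard thin-lens relation summarizes this behaviour: 1/f = 1/d_o + 1/d_i, where f is the focal length, d_o the distance from the lens to the object and d_i the distance from the lens to the real image. Solving for d_i gives d_i = f*d_o / (d_o - f); as the object moves closer (d_o shrinks toward f), d_i grows, which is why the lens must be moved farther from the film to keep a nearby object in focus.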


Fig 2.8 Different stages of an Image
That is what you are doing when you turn the lens of the camera to focus it – you are moving it closer to or farther away from the film surface. As you move the lens, you can line up the focused real image of an object so it falls directly on the film surface. You now know that at any one point, a lens bends light to a certain total degree, no matter the light beam's angle of entry. This total "bending angle" is determined by the structure of the lens.
2.4.2 Understanding the basics
Let's say you want to take a picture and e-mail it to a friend. To do this, you need the image to be represented in the language that computers recognize – bits and bytes. Essentially, a digital image is just a long string of 1's and 0's that represent all the tiny coloured dots, or pixels, that collectively make up the image. (Digitizing light waves works in much the same way as the digitization of sound waves.) If you want to get a picture into this form, you have two options:
• You can take a photograph using a conventional film camera, process the film chemically, print it onto photographic paper and then use a digital scanner to sample the print (record the pattern of light as a series of pixel values).
• You can directly sample the original light that bounces off your subject, immediately breaking that light pattern down into a series of pixel values – in other words, you can use a digital camera.
2.4.3 A Web Camera
At its most basic level, this is all there is to a digital camera. Just like a conventional camera, it has a series of lenses that focus light to create an image of a scene. But instead of focusing the light onto a piece of film, it focuses it onto a semiconductor device that records light electronically. A computer then breaks this electronic information down into digital data. All the fun and interesting features of digital cameras come as a direct result of this process.
2.4.4 Resolution
The amount of detail that the camera can capture is called resolution, and it is measured in pixels. The more pixels your camera has, the more detail it can capture. The more detail you have, the more you can blow up a picture before it becomes "grainy" and starts to look out-of-focus. Some typical resolutions that you find in digital cameras today include:
• 256 x 256 pixels – You find this resolution on very cheap cameras. This resolution is so low that the picture quality is almost always unacceptable. This is about 65,000 total pixels.
• 640 x 480 pixels – This is the low end on most "real" cameras. This resolution is great if you plan to e-mail most of your pictures to friends or post them on a website. This is about 307,000 total pixels.


• 1216 x 912 pixels – If you are planning to print your images, this is a good resolution. This is a "megapixel" image size – about 1,109,000 total pixels.

2.4.5 How Big Are The Sensors?
The current generation of digital sensors is smaller than film. Typical film emulsions that are exposed in a film-based camera measure 24mm x 36mm. If you look at the specifications of a typical 1.3-megapixel camera, you'll find that it has a CCD sensor that measures 4.4mm x 6.6mm. A smaller sensor means smaller lenses.
2.4.6 Capturing Color: Beam Splitter
There are several ways of recording the three colors in a digital camera. The highest-quality cameras use three separate sensors, each with a different filter over it. Light is directed to the different sensors by placing a beam splitter in the camera. Think of the light entering the camera as water flowing through a pipe. Using a beam splitter would be like dividing an identical amount of water into three different pipes. Each sensor gets an identical look at the image; but because of the filters, each sensor only responds to one of the primary colors.

Fig 2.9 How the original (left) image is split in a beam splitter.
The advantage of this method is that the camera records each of the three colors at each pixel location. Unfortunately, cameras that use this method tend to be bulky and expensive.
2.4.7 Compression
It takes a lot of memory to store a picture with over 1.2 million pixels. Almost all digital cameras use some sort of data compression to make the files smaller. There are two features of digital images that make compression possible. One is repetition and the other is irrelevancy. As you can imagine, throughout a given photo certain patterns develop in the colors. For example, if a blue sky takes up 30 percent of the photograph, you can be certain that some shades of blue are going to be repeated over and over again. When a compression routine takes advantage of patterns that repeat, there is no loss of information and the image can be reconstructed exactly as it was recorded. Unfortunately, this doesn't reduce files by more than about 50 percent, and sometimes it doesn't even come close to that level. Irrelevancy is the trickier issue. A digital camera records more information than is easily detected by the human eye. Some compression routines take advantage of this fact to throw away some of the more meaningless data. If you need smaller files, you need to be willing to throw away more data. Most cameras offer several different levels of compression, although they may not call it that; more likely they will offer you different levels of resolution. Lower resolution means more compression.
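The repetition idea can be illustrated with a toy run-length encoder in MATLAB. This is an illustrative sketch only; real cameras use far more sophisticated schemes such as JPEG.

% Toy run-length encoding of one image row: repeated values ("a blue sky")
% compress without any loss of information.
row        = uint8([50 50 50 50 200 200 7 7 7]);   % example pixel values
changes    = find(diff(double(row)) ~= 0);         % positions where the value changes
runEnds    = [changes, numel(row)];
runStarts  = [1, changes + 1];
runLengths = runEnds - runStarts + 1;              % [4 2 3]
runValues  = row(runStarts);                       % [50 200 7]
decoded    = repelem(runValues, runLengths);       % exact reconstruction from the runs
isequal(decoded, row)                              % true: the encoding is lossless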


2.4.8 Controlling Light
It is important to control the amount of light that reaches the sensor. Thinking of each photo site as a bucket of water: if too much light hits the sensor, the bucket fills up and can't hold any more. If this happens, information about the intensity of the light is lost. Even though one photo site may be exposed to a higher intensity of light than another, if both buckets are full, the camera will not register the difference between them. The word camera comes from the term camera obscura. Camera means room (or chamber) and obscura means dark. In other words, a camera is a dark room. This dark room keeps out all unwanted light. At the click of a button, it allows a controlled amount of light to enter through an opening and focuses the light onto a sensor (either film or digital). In the next couple of sections, you will learn how the aperture and shutter work together to control the amount of light that enters the camera.
2.4.9 Aperture
The aperture is the size of the opening in the camera. It is located behind the lens. On a bright sunny day, the light reflected off your subject may be very intense, and it doesn't take very much of it to create a good picture. In this situation, you want a small aperture. But on a cloudy day, or in twilight, the light is not so intense and the camera will need more light to create an image. In order to allow more light, the aperture must be enlarged. Your eye works the same way. When you are in the dark, the iris of your eye dilates your pupil (i.e. it makes it very large). When you go out into bright sunlight, your iris contracts and makes your pupil very small. If you find a willing partner and a small flashlight, this is easy to demonstrate (if you do this, please use a small flashlight, like the ones used in a doctor's office). Look at your partner's eyes, then shine the flashlight in and watch the pupils contract. Move the flashlight away, and the pupils will dilate.
2.4.10 Shutter Speed
Traditionally, the shutter speed is the amount of time that light is allowed to pass through the aperture. Think of a mechanical shutter as a window shade. It is placed across the back of the aperture to block out the light. Then, for a fixed amount of time, it opens and closes. The amount of time it is open is the shutter speed. One way of getting more light into the camera is to decrease the shutter speed – in other words, leave the shutter open for a longer period of time. Film-based cameras must have a mechanical shutter. Once you expose film to light, it can't be wiped clean to start again; therefore, it must be protected from unwanted light. But the sensor in a digital camera can be reset electronically and used over and over again. This is called a digital shutter. Some digital cameras employ a combination of electronic and mechanical shutters.


2.4.11 Exposing the Sensor
These two aspects of a camera, aperture and shutter speed, work together to capture the proper amount of light needed to make a good image. In photographic terms, they set the exposure of the sensor. Most digital cameras automatically set aperture and shutter speed for optimal exposure, which gives them the appeal of a point-and-shoot camera. Some digital cameras also offer the ability to adjust the aperture settings by using menu options on the LCD panel. More advanced hobbyists and professionals like to have control over the aperture and shutter speed selections because it gives them more creative control over the final image. As you climb into the upper levels of consumer cameras and the realm of professional cameras, you will be rewarded with controls that have the look, feel and functions common to film-based cameras.
2.4.12 Lens and Focal Length
A camera lens collects the available light and focuses it onto the sensor. Most digital cameras use automatic focusing techniques.
2.4.13 Cameras Work
The important difference between the lens of a digital camera and the lens of a 35mm camera is the focal length. Focal length is also the critical information in determining how much magnification you get when you look through your camera. In 35mm cameras, a 50mm lens gives a natural view of the subject. As you increase the focal length, you get greater magnification, and objects appear to get closer. As you decrease the focal length, things appear to get farther away, but you can capture a wider field of view in the camera. You will find four different types of lenses on digital cameras:
• Fixed-focus, fixed-zoom lenses – These are the kinds of lenses you find on disposable and inexpensive film cameras – inexpensive and great for snapshots, but fairly limited.
• Optical-zoom lenses with automatic focus – Similar to the lens on a video camcorder, you have "wide" and "telephoto" options and automatic focus. The camera may or may not let you switch to manual focus.
• Digital-zoom lenses – With digital zoom, the camera takes pixels from the centre of the image sensor and "interpolates" them to make a full-size image. Depending on the resolution of the image and the sensor, this approach may create a grainy or fuzzy image. It turns out that you can manually do the same thing a digital zoom is doing: simply snap a picture and then cut out the centre of the image using your image processing software.
• Replaceable lens systems – If you are familiar with high-end 35mm cameras, then you are familiar with the concept of replaceable lenses. High-end digital cameras can use this same system, and in fact can use lenses from 35mm cameras in some cases.
Focal length – 35mm equivalents: since many photographers who use film-based cameras are familiar with focal lengths that project an image onto 35mm film, digital cameras advertise their focal lengths with "35mm equivalents". This is extremely helpful information to have. In the chart below, you can compare the actual focal lengths of a typical 1.3-megapixel camera and their equivalents in a 35mm camera.

Focal length | 35mm equivalent | View                                              | Typical uses
5.4mm        | 35mm            | Things look smaller and farther away              | Wide-angle shots, landscapes, large buildings, groups of people
7.7mm        | 50mm            | Things look about the same as what your eye sees  | "Normal" shots of people and objects
16.2mm       | 105mm           | Things are magnified and appear closer            | Telephoto shots, close-ups


2.4.14 Optical Zoom vs. Digital Zoom
In general terms, a zoom lens is any lens that has an adjustable focal length. Zoom doesn't always mean a close-up. As you can see in the chart above, the "normal" view of the world for this particular camera is 7.7mm. You can zoom out for a wide-angle view of the world, or you can zoom in for a closer view of the world. Digital cameras may have an optical zoom, a digital zoom, or both. An optical zoom actually changes the focal length of your lens. As a result, the image is magnified by the lens (sometimes called the optics, hence "optical" zoom). With greater magnification, the light is spread across the entire CCD sensor and all of the pixels can be used. You can think of an optical zoom as a true zoom that will improve the quality of your pictures. A digital zoom is a computer trick that magnifies a portion of the information that hits the sensor. Let's say you are shooting a picture with a 2X digital zoom. The camera will use half of the pixels at the centre of the CCD sensor and ignore all the other pixels. Then it will use interpolation techniques to add detail to the photo. Although it may look like you are shooting a picture with twice the magnification, you can get the same results by shooting a photo without a zoom and blowing up the picture using your computer software.


Chapter III
COLOR IMAGE PROCESSING
Motivation for using color:
• A powerful descriptor that often simplifies object identification and extraction from a scene.
• Humans can discern thousands of color shades and intensities, compared to only about two dozen shades of grey.
Color image processing is divided into two major areas:
• Full-color processing: e.g. images acquired by a color TV camera or color scanner.
• Pseudo-color processing: assigning a color to a particular monochrome intensity or range of intensities.
3.1 COLOR FUNDAMENTALS

Fig 3.1 Color spectrum seen by passing white light through a Prism

Fig 3.2 Wavelengths comparing the visible range of the electromagnetic spectrum.

3.2 PERCEPTION OF COLORS BY THE HUMAN EYE
Cones can be divided into 3 principal sensing categories, corresponding (roughly) to red, green and blue: about 65% are sensitive to red light, about 33% to green light and only about 2% to blue (but the blue cones are the most sensitive). Colors are seen as variable combinations of the primary colors Red, Green and Blue, defined by the CIE (1931) at the wavelengths: blue = 435.8nm, green = 546.1nm, red = 700nm.


Fig 3.3 Absorption of light by the Red, Green and Blue cones in the human eye as a function of wavelength.
The primary colors of light can be added to produce the secondary colors of light:
• Magenta (red + blue)
• Cyan (green + blue)
• Yellow (red + green)
Mixing the three primaries in the right intensities produces white light. The primary colors of pigments absorb a primary color of light and reflect or transmit the other two; they are magenta, cyan and yellow.

Fig 3.4 Primary and secondary colors of light and pigments.
3.3 CHARACTERISTICS OF A COLOR
• Brightness: embodies the achromatic notion of intensity.
• Hue: the attribute associated with the dominant wavelength in a mixture of light waves.
• Saturation: refers to the relative purity or the amount of white light mixed with a hue. (The pure spectrum colors are fully saturated; e.g. pink (red and white) is less saturated, the degree of saturation being inversely proportional to the amount of white light added.)
Hue and saturation together = chromaticity. A color may therefore be characterized by its brightness and chromaticity. Tristimulus values = the amounts of red (X), green (Y), and blue (Z) needed to form a particular color. A color can be specified by its trichromatic coefficients.


a = X / (X + Y + Z) …….. (1.1)

b = Y / (X + Y + Z) …….. (1.2)

c = Z / (X + Y + Z) …….. (1.3)

NB: a + b + c = 1. Another approach for specifying colors is the CIE chromaticity diagram, which shows color composition as a function of a (red) and b (green); for any value of a and b, c (blue) is obtained by:
c = 1 - (a + b) ……. (1.4)
Pure colors of the spectrum (fully saturated) lie on the boundary of the diagram.

Fig 3.5 Chromaticity Diagram.

Fig 3.6 Typical color gamut of color monitors (triangle) and color printing devices (irregular region).

3.4 COLOR MODELS
• Also called color spaces or color systems.


• Purpose: facilitate the specification of colors in some "standard" way.
• Color model: specification of a coordinate system and a subspace within it where each color is represented by a single point.
The most commonly used hardware-oriented models are:
• RGB (Red, Green, Blue), for color monitors and video cameras.
• CMY (Cyan, Magenta and Yellow) and CMYK (CMY + Black) for color printing.
• Grayscale for CCTV cameras.
• YCbCr (Luma, Chroma Blue, Chroma Red).
• HSI (Hue, Saturation, Intensity).
3.4.1 The RGB Color Model
The computer creates colors based on the RGB model. It reproduces the spectrum of visible light. A monitor can create millions of colors by combining different percentages of the three primaries red, green and blue. While using image processing software like Photoshop, you can see that these RGB colors are adjusted with the help of a numerical value, which is between 0 and 255. With RGB, mixing red and green gives yellow, mixing green and blue creates cyan, and mixing red and blue creates magenta. When all the three colors red, green and blue are mixed equally, they produce white light. Hence it is called an "additive color model". Other examples based on the RGB model are the human eye itself and scanners.

Fig 3.7 The RGB color model.
The basic advantage of the RGB model is that it is useful for color editing because it has a wide range of colors. But at the same time this model is said to be device dependent. This means that the way colors are displayed on the screen depends on the hardware used to display them.
• Each color appears in its primary spectral components of Red, Green and Blue.
• The model is based on a Cartesian coordinate system.
• Color subspace = cube.
• RGB primary values at three corners (+ secondary values at three others).
• Black at its origin, white at its opposite corner.
Convention: all color values are normalized, i.e. a unit cube with all values of R, G, B in [0, 1].
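In MATLAB an RGB image is simply an M-by-N-by-3 array, one plane per primary; a quick look, using peppers.png, a sample image shipped with MATLAB:

% An RGB image is an M-by-N-by-3 array: one 8-bit plane per primary color.
img = imread('peppers.png');       % sample image shipped with MATLAB
size(img)                          % rows x columns x 3
class(img)                         % uint8: each channel holds values 0..255
R = img(:,:,1);                    % red plane
G = img(:,:,2);                    % green plane
B = img(:,:,3);                    % blue plane
gray = rgb2gray(img);              % weighted combination of the three planes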


Fig 3.8 Schematic of the RGB Color Cube. Points along the main diagonal have gray values, from black at the origin to white at point (1,1,1).
The number of bits used to represent each pixel in the RGB space is called the pixel depth. Example: an RGB image in which each of the red, green and blue images is an 8-bit image. Each RGB color pixel (i.e. a triplet of values (R,G,B)) is then said to have a depth of 24 bits (full-color image). The total number of colors in a 24-bit RGB image is (2^8)^3 = 16,777,216.

Fig 3.9 RGB 24-bit color cube.
NB: acquiring an image is the reversed process, using 3 filters sensitive to red, green and blue respectively (e.g. a tri-CCD sensor). Color image acquisition: a single (mono) CCD with a Bayer filter mosaic.

Fig 3.10 Bayer filter Mosaic of Bayer RGB pattern


“Demosaicking”: Interpolating the values of missing pixels in the component images.
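The Image Processing Toolbox exposes this interpolation step directly. A sketch, assuming a raw single-channel Bayer capture and an 'rggb' alignment (the file name and the alignment are assumptions; the alignment depends on the sensor):

% Demosaicking a raw Bayer-pattern frame.
raw = imread('bayer_frame.tif');   % single-channel raw capture, uint8 or uint16
rgb = demosaic(raw, 'rggb');       % interpolate the missing color samples
imshow(rgb);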

Fig 3.11 Demosaicking process.
3.4.2 XYZ (CIE)
One of the first color spaces defined is CIE XYZ, also referred to via the X, Y and Z tristimulus functions, and as the CIE colorspace. XYZ was created by the International Commission on Illumination (CIE). The model was constructed with the help of human judgement – the ability to visualize and match appearances – and the chosen colorimetry is based on this matching procedure. The shapes of the sensitivity curves of the parameters X, Y and Z are measured with reasonable accuracy. Mathematically speaking, the model can be described as a luminance component Y along with two chromaticity coordinates X and Z. Sometimes, however, the XYZ color model is represented by its luminance parameter Y together with normalized tristimulus (chromaticity) coordinates as representatives of the model.
• The official definition of the CIE XYZ standard (normalized matrix) is:

• Commonly used form: without the leading fraction => R,G,B = (1,1,1), Y = 1.
3.4.3 THE CMY AND CMYK COLOR MODELS
The opposite model to RGB is CMY. Printing inks are based on this model. With the full presence of cyan, magenta and yellow we get black. But practically, in the printing industry it is impossible to create black with these three colors: the result of the mixture of CMY is a muddy brown due to impurities in the printing inks. Hence black ink is added to get a solid black. The outcome of this process is the CMYK model, where "K" stands for the black color, which is also recognised as the "key" color. Since black is a full presence of color, you will have to subtract the levels of cyan, magenta and yellow to produce the lighter colors. This can be explained in a different way: when light falls on a green surface or green ink, it absorbs (subtracts) all the colors from the light except green. Hence the model is called a subtractive model. Print production is based on this model.


Fig 3.12 The CMYK color model.
It is useful to have a proper understanding of the color model. Monitors as well as scanners work on the RGB principle. While scanning, we can adjust the software to produce the desired result. CMYK is for the print industry. It cannot reproduce the color range of RGB; hence, after finishing the work on the computer in RGB mode, when you convert it into CMYK for printing some tonal changes can occur. In spite of its limitations, the CMYK model is considered the best model available for printing because it can produce a properly finished output. Cyan, Magenta and Yellow are the secondary colors of light, or the primary colors of pigments.
• CMY data input is needed by most devices that deposit colored pigments on paper, such as color printers and copiers.

• RGB to CMY conversion (with R, G, B normalized to [0, 1]): C = 1 - R, M = 1 - G, Y = 1 - B.

Equal amounts of cyan, magenta and yellow => black, but muddy-looking in practice => to produce true black (the predominant color in printing) a 4th color, black, is added => CMYK model (CMY + Black).
3.4.4 THE YCbCr COLOR MODEL

Fig 3.13 A visualization of the YCbCr color space
YCbCr, Y′CbCr, or Y Pb/Cb Pr/Cr, also written as YCBCR or Y′CBCR, is a family of color spaces used as a part of the color image pipeline in video and digital photography


systems. Y′ is the luma component and CB and CR are the blue-difference and red-difference chroma components. Y′ (with prime) is distinguished from Y, which is luminance, meaning that light intensity is nonlinearly encoded based on gamma-corrected RGB primaries. Y′CbCr is not an absolute color space; rather, it is a way of encoding RGB information. The actual color displayed depends on the actual RGB primaries used to display the signal. Therefore a value expressed as Y′CbCr is predictable only if standard RGB primary chromaticities are used.

Fig 3.14 A color image and its Y, CB and CR components. The Y image is essentially a greyscale copy of the main image.
YCbCr is sometimes abbreviated to YCC. Y′CbCr is often called YPbPr when used for analog component video, although the term Y′CbCr is commonly used for both systems, with or without the prime. Y′CbCr is often confused with the YUV color space, and typically the terms YCbCr and YUV are used interchangeably, leading to some confusion; when referring to signals in video or digital form, the term "YUV" mostly means "Y′CbCr".
RGB to YCbCr conversion


Fig 3.15 RGB to YCbCr Conversion
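In MATLAB the conversion is available directly through rgb2ycbcr, which follows the BT.601 convention; the luma weights noted in the comments are the standard ones.

% RGB to YCbCr and back with the Image Processing Toolbox (BT.601 convention).
rgb   = imread('peppers.png');
ycbcr = rgb2ycbcr(rgb);
Y  = ycbcr(:,:,1);        % luma: roughly 0.299R + 0.587G + 0.114B, mapped to 16..235 for uint8
Cb = ycbcr(:,:,2);        % blue-difference chroma, centred on 128
Cr = ycbcr(:,:,3);        % red-difference chroma, centred on 128
back = ycbcr2rgb(ycbcr);  % inverse conversion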

3.4.5 THE HSI COLOR MODEL
The RGB and CMY models are straightforward and ideally suited for hardware implementations, and the RGB system matches nicely the perceptive abilities of the human eye. But RGB and CMY are not well suited for describing colors in terms practical for human interpretation: the human view of a color object is described by hue, saturation and brightness (or intensity).
• Hue: describes a pure color (pure yellow, orange or red).
• Saturation: gives a measure of the degree to which a pure color is diluted by white light.
• Brightness: a subjective descriptor, practically impossible to measure. It embodies the achromatic notion of intensity, which leads to the measurable quantity intensity (gray level) and hence to the HSI (Hue, Saturation, Intensity) color model.
In the HSI color model, intensity lies along the line joining white and black in the RGB cube. To determine the intensity component of any color point, pass a plane perpendicular to the intensity axis and containing the color point: the intersection of the plane with the axis is the normalized intensity value. The saturation (purity) of a color increases as a function of distance from the intensity axis (on the axis, saturation = 0: gray points).

Fig 3.16 Conceptual Relationship between the RGB and HSI color models.


3.4.6 Color planes, perpendicular to the intensity axis.

Fig 3.17 Hue and saturation in the HSI color model. The dot is an arbitrary color point. The angle from the red axis gives the hue, and the length of the vector is the saturation. The intensity of all colors in any of these planes is given by the position of the plane on the vertical intensity axis.
3.4.7 Converting colors from RGB to HSI
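The conversion equations themselves appeared here as figures; the widely used Gonzalez & Woods formulation can be sketched in MATLAB as follows (RGB normalized to [0, 1]; this is an illustrative sketch, not the report's own code):

% RGB to HSI conversion (standard Gonzalez & Woods formulation).
function [H, S, I] = rgb2hsi(img)
    rgb = im2double(img);                  % normalize to [0,1]
    R = rgb(:,:,1);  G = rgb(:,:,2);  B = rgb(:,:,3);

    num   = 0.5 * ((R - G) + (R - B));
    den   = sqrt((R - G).^2 + (R - B).*(G - B)) + eps;   % eps avoids division by zero
    theta = acos(max(min(num ./ den, 1), -1));           % angle from the red axis, in radians

    H = theta;
    H(B > G) = 2*pi - H(B > G);            % hue = 360 degrees - theta when B > G
    H = H / (2*pi);                        % normalize hue to [0,1]

    I = (R + G + B) / 3;                                  % intensity
    S = 1 - 3 .* min(min(R, G), B) ./ (R + G + B + eps);  % saturation
end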

3.4.8 The HSI color model based on triangular and circular color planes:

Fig 3.18 The HSI color model based on (a) triangular color planes and (b) circular color planes. The triangles and circles are perpendicular to the vertical intensity axis.


3.4.9 Converting colors from HSI to RGB
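Again the equations appear only as a figure; a minimal MATLAB sketch of the usual sector-based HSI-to-RGB conversion (sample values are illustrative) is:

H = 40; S = 0.7; I = 0.4;                     % hue in degrees, S and I in [0, 1]
if H < 120                                    % RG sector
    B = I*(1 - S);
    R = I*(1 + S*cosd(H)/cosd(60 - H));
    G = 3*I - (R + B);
elseif H < 240                                % GB sector
    H = H - 120;
    R = I*(1 - S);
    G = I*(1 + S*cosd(H)/cosd(60 - H));
    B = 3*I - (R + G);
else                                          % BR sector
    H = H - 240;
    G = I*(1 - S);
    B = I*(1 + S*cosd(H)/cosd(60 - H));
    R = 3*I - (G + B);
end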

3.4.10 Manipulation of HSI images

Basics of full-color image processing: per-color-component and vector-based processing are equivalent if:
1. The process is applicable to both vectors and scalars.
2. The operation on each component of a vector is independent of the other components.

3.4.11 Color Complements
 Hues directly opposite one another on the color circle are called complements.
 Complements are analogous to gray-scale negatives.
 Useful for enhancing detail embedded in dark regions.

Fig 3.19 Complements on the color circle


Chapter IV
SEGMENTATION

Image segmentation is the division of an image into regions or categories, which correspond to different objects or parts of objects. Every pixel in an image is allocated to one of a number of these categories. A good segmentation is typically one in which:
 Pixels in the same category have similar grayscale or multivariate values and form a connected region.
 Neighbouring pixels which are in different categories have dissimilar values.
Segmentation is often the critical step in image analysis: the point at which we move from considering each pixel as a unit of observation to working with objects (or parts of objects) in the image, composed of many pixels. If segmentation is done well then all other stages in image analysis are made simpler. But, as we shall see, success is often only partial when automatic segmentation algorithms are used. However, manual intervention can usually overcome these problems, and by this stage the computer should already have done most of the work. There are three general approaches to segmentation, termed thresholding, edge-based methods and region-based methods.

4.1 THRESHOLDING

In thresholding, pixels are allocated to categories according to the range of values in which a pixel lies. Fig 4.1(a) shows boundaries which were obtained by thresholding the image. Pixels with values less than 128 have been placed in one category and the rest have been placed in the other category. The boundaries between adjacent pixels in different categories have been superimposed in white on the original image. It can be seen that the threshold has successfully segmented the image into its two predominant region types.

Fig 4.1(a) Thresholding
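A minimal MATLAB sketch of this thresholding step (the image name is an example; the threshold of 128 follows the text):

grayImg = imread('coins.png');     % example grayscale image shipped with the toolbox
binMask = grayImg >= 128;          % pixels with values less than 128 form one category
imshow(binMask);                   % the above-threshold category shown in white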

4.2 EDGE-BASED SEGMENTATION

In edge-based segmentation, an edge filter is applied to the image, pixels are classified as edge or non-edge depending on the filter output, and pixels which are not separated by an edge are allocated to the same category. Fig 4.1(b) shows the boundaries of connected regions after applying Prewitt's filter and eliminating all non-border segments containing fewer than 500 pixels.
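A minimal MATLAB sketch of this edge-based approach (edge and bwareaopen are standard Image Processing Toolbox functions; the image name is an example and the 500-pixel limit follows the text):

grayImg = imread('coins.png');            % example grayscale image
edges   = edge(grayImg, 'prewitt');       % binary edge map from the Prewitt filter
regions = ~edges;                         % pixels not separated by an edge
regions = bwareaopen(regions, 500);       % discard segments with fewer than 500 pixels
imshow(regions);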


Fig 4.1(b) Edge-based segmentation

4.3 REGION-BASED SEGMENTATION

Finally, region-based segmentation algorithms operate iteratively by grouping together pixels which are neighbours and have similar values, and splitting groups of pixels which are dissimilar in value. Fig 4.1(c) shows the boundaries produced by one such algorithm, based on the concept of watersheds.

Fig 4.1(c) Region-based segmentation
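As an illustration only, a watershed-based segmentation of this kind can be sketched in MATLAB as follows (the image name is an example; in practice the gradient is usually smoothed or marker-controlled first to avoid over-segmentation):

grayImg = imread('coins.png');            % example grayscale image
gmag    = imgradient(grayImg);            % gradient magnitude acts as the 'landscape'
labels  = watershed(gmag);                % label the catchment basins
ridges  = (labels == 0);                  % watershed ridge lines mark the boundaries
overlay = grayImg;
overlay(ridges) = 255;                    % superimpose the boundaries in white
imshow(overlay);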


Chapter V
MORPHOLOGICAL OPERATION

Morphology is a broad set of image processing operations that process images based on shapes. Morphological operations apply a structuring element to an input image, creating an output image of the same size. In a morphological operation, the value of each pixel in the output image is based on a comparison of the corresponding pixel in the input image with its neighbours. By choosing the size and shape of the neighbourhood, you can construct a morphological operation that is sensitive to specific shapes in the input image. The most basic morphological operations are dilation and erosion. Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels on the object boundaries. The number of pixels added or removed from the objects in an image depends on the size and shape of the structuring element used to process the image. In the morphological dilation and erosion operations, the state of any given pixel in the output image is determined by applying a rule to the corresponding pixel and its neighbours in the input image. The rule used to process the pixels defines the operation as dilation or erosion:

Dilation: The value of the output pixel is the maximum value of all the pixels in the input pixel's neighbourhood. In a binary image, if any of the pixels is set to the value 1, the output pixel is set to 1.

Erosion: The value of the output pixel is the minimum value of all the pixels in the input pixel's neighbourhood. In a binary image, if any of the pixels is set to the value 0, the output pixel is set to 0.

The following figure illustrates the dilation of a binary image. Note how the structuring element defines the neighbourhood of the pixel of interest, which is circled. The dilation function applies the appropriate rule to the pixels in the neighbourhood and assigns a value to the corresponding pixel in the output image. In the figure, the morphological dilation function sets the value of the output pixel to 1 because one of the elements in the neighbourhood defined by the structuring element is on.

5.1 Morphological Dilation of a Binary Image

Input image                    Output image

Fig 5.1 Processing for a Grayscale Image

The figure shows the processing of a particular pixel in the input image. Note how the function applies the rule to the input pixel's neighbourhood and uses the highest value of all the pixels in the neighbourhood as the value of the corresponding pixel in the output image.

5.2 Morphological Dilation of a Grayscale Image

Fig 5.2 Understanding the structuring element

An essential part of the dilation and erosion operations is the structuring element used to probe the input image. A structuring element is a matrix consisting of 0's and 1's that can have any arbitrary shape and size. The pixels with values of 1 define the neighbourhood. Two-dimensional, or flat, structuring elements are typically much smaller than the image being processed. The centre pixel of the structuring element, called the origin, identifies the pixel of interest -- the pixel being processed. The pixels in the structuring element containing 1's define the neighbourhood of the structuring element. These pixels are also considered in dilation or erosion processing. Three-dimensional, or non-flat, structuring elements use 0's and 1's to define the extent of the structuring element in the x- and y-planes and add height values to define the third dimension.

5.3 DILATING AN IMAGE

To dilate an image, use the imdilate function. The imdilate function accepts two primary arguments:
 The input image to be processed (grayscale, binary or packed binary image)
 A structuring element object, returned by the strel function, or a binary matrix defining the neighbourhood of a structuring element
imdilate also accepts two optional arguments: SHAPE and PACKOPT. The SHAPE argument affects the size of the output image. The PACKOPT argument identifies the input image as packed binary. (Packing is a method of compressing binary images that can speed up the processing of the image.)

Fig 5.3 Original and Dilated image
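A minimal usage sketch of imdilate (the image name, structuring-element shape and radius are illustrative, not taken from the report):

bw  = imread('text.png');                 % binary example image shipped with the toolbox
se  = strel('disk', 2);                   % disk-shaped structuring element, radius 2
bw2 = imdilate(bw, se);                   % dilation thickens the foreground objects
imshowpair(bw, bw2, 'montage');           % original and dilated image side by side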


5.4 ERODING AN IMAGE

To erode an image, use the imerode function. The imerode function accepts two primary arguments:
 The input image to be processed (grayscale, binary or packed binary image)
 A structuring element object, returned by the strel function, or a binary matrix defining the neighbourhood of a structuring element
imerode also accepts three optional arguments: SHAPE, PACKOPT and M. The SHAPE argument affects the size of the output image. The PACKOPT argument identifies the input image as packed binary. If the image is packed binary, M identifies the number of rows in the original image.

Fig 5.4 Original and Eroded image

5.5 COMBINING DILATION AND EROSION

Dilation and erosion are often used in combination to implement image processing operations. For example, the definition of a morphological opening of an image is erosion followed by dilation, using the same structuring element for both operations. The related operation, morphological closing of an image, is the reverse: it consists of dilation followed by erosion with the same structuring element. The following section uses imdilate and imerode to illustrate how to implement a morphological opening. Note, however, that the toolbox already includes the imopen function, which performs this processing. The toolbox includes functions that perform many morphological operations.
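A minimal sketch of a morphological opening built from imerode and imdilate, compared with the built-in imopen (the image name and structuring element are assumptions):

bw     = imread('text.png');              % binary example image
se     = strel('square', 3);              % 3-by-3 structuring element
opened = imdilate(imerode(bw, se), se);   % opening = erosion followed by dilation
same   = isequal(opened, imopen(bw, se)); % should be true: matches the toolbox imopen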


Chapter VI
IMPLEMENTATION OF PROJECT

6.1 Web camera: A digital web camera equipped with an 8-megapixel digital image sensor that provides an active imaging array of 2,592H x 1,944V. It features low noise and high …

Fig 6.1 Web camera

A webcam is a video camera that feeds or streams its image in real time to or through a computer or computer network. When "captured" by the computer, the video stream may be saved, viewed or sent on to other networks via systems such as the internet, or by email as an attachment. When sent to a remote location, the video stream may be saved, viewed or sent on from there. Unlike an IP camera (which uses a direct connection over Ethernet or Wi-Fi), a webcam is generally connected by a USB cable, FireWire cable, or similar cable, or built into computer hardware, such as a laptop.

6.2 FPGA [Field Programmable Gate Array]: It is a semiconductor device that can be programmed after manufacturing. An FPGA allows you to program product features and functions, adapt to new standards, and reconfigure the hardware for specific applications even after the product has been installed in the field, hence the name "field-programmable".

6.3 BLOCK DIAGRAM

Fig 6.2 Block Diagram


6.4 MATLAB

MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces and interfacing with programs written in other languages, including C, C++, Java and Fortran. Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing capabilities. An additional package, Simulink, adds graphical multi-domain simulation and Model-Based Design for dynamic and embedded systems.

6.4.1 FLOWCHART


6.5 SIMULINK

Simulink, developed by MathWorks, is a data-flow graphical programming language tool for modelling, simulating and analysing multi-domain dynamic systems. Its primary interface is a graphical block-diagramming tool and a customizable set of block libraries. It offers tight integration with the rest of the MATLAB environment and can either drive MATLAB or be scripted from it. Simulink is widely used in control theory and digital signal processing for multi-domain simulation and Model-Based Design.

Fig 6.3 Simulink block diagram

6.5.1 From a video device

As described in Section 6.1, a webcam is a video camera that feeds or streams its image in real time to or through a computer, generally over a USB connection. The most popular use of webcams is the establishment of video links, permitting computers to act as videophones or video-conference stations. Other popular uses include security surveillance, computer vision, video broadcasting and recording social videos. Webcams are known for their low manufacturing cost and flexibility, making them the lowest-cost form of video telephony. They have also become a source of security and privacy issues, as some built-in webcams can be remotely activated via spyware. In this project we use the iBall Face2Face web camera.
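Outside Simulink, a single frame can also be grabbed from such a webcam with the Image Acquisition Toolbox; a minimal sketch (the adaptor name and device index are device-specific assumptions):

vid = videoinput('winvideo', 1);          % first 'winvideo' device; adaptor name is an assumption
vid.ReturnedColorSpace = 'rgb';           % request RGB frames
frame = getsnapshot(vid);                 % grab a single frame
imshow(frame);
delete(vid);                              % release the device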


6.5.2 MATLAB Function: Segmentation

In this MATLAB function we perform segmentation using the thresholding method. Image segmentation is the division of an image into regions or categories which correspond to different objects or parts of objects; the general approaches (thresholding, edge-based and region-based methods) were described in Chapter IV. We use thresholding because of its advantages: the image becomes a black background with the object shown clearly in white, so that we can recognize exactly where the object is present.

6.5.3 THRESHOLDING SEGMENTATION

In thresholding, pixels are allocated to categories according to the range of values in which a pixel lies. Fig 6.4(a) shows boundaries which were obtained by thresholding the image. Pixels with values less than 128 have been placed in one category and the rest have been placed in the other category. The boundaries between adjacent pixels in different categories have been superimposed in white on the original image. It can be seen that the threshold has successfully segmented the image into its two predominant region types.
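The code inside the MATLAB Function block is not listed in the report; a minimal sketch of what such a thresholding function could look like (the function name, input port and threshold value are assumptions):

function bw = fcn(s)
%#codegen
% Thresholding segmentation inside a Simulink MATLAB Function block (sketch).
% 's' is assumed to be the saturation plane from the upstream RGB-to-HSV block;
% the threshold value is illustrative, not taken from the report.
thresh = 0.35;
bw = s > thresh;        % white object on a black background
end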

Fig 6.4(a) Thresholding

6.5.4 Erosion

Erosion is a morphological operation in which we use a structuring element (see Chapter V). Morphological operations apply a structuring element to an input image, creating an output image of the same size; the value of each pixel in the output image is based on a comparison of the corresponding pixel in the input image with its neighbours. By choosing the size and shape of the neighbourhood, you can construct a morphological operation that is sensitive to specific shapes in the input image.


Perform morphological erosion on an intensity or binary image. Use the Neighbourhood or structuring element parameter to define the neighbourhood or structuring element that the block applies to the image. Specify a neighbourhood by entering a matrix or vector of ones and zeros, or specify a structuring element using the strel function. Alternatively, you can specify neighbourhood values using the Nhood port. Depending on the shape, strel can take additional parameters. Morphological operations run much faster when the structuring element uses approximation (N > 0) than when it does not (N = 0).

6.5.5 Dilation

Dilation is a morphological operation in which we use a structuring element. Perform morphological dilation on an intensity or binary image. Use the Neighbourhood or structuring element parameter to define the neighbourhood or structuring element that the block applies to the image. Specify a neighbourhood by entering a matrix or vector of ones and zeros, or specify a structuring element using the strel function. Alternatively, you can specify neighbourhood values using the Nhood port. Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels on the object boundaries; the number of pixels added or removed depends on the size and shape of the structuring element used to process the image (see Chapter V).

6.6 FPGA

Field Programmable Gate Arrays (FPGAs) are semiconductor devices that are based around a matrix of configurable logic blocks (CLBs) connected via programmable interconnects. FPGAs can be reprogrammed to meet desired application or functionality requirements after manufacturing. This feature distinguishes FPGAs from Application Specific Integrated Circuits (ASICs), which are custom manufactured for specific design tasks. Although one-time programmable (OTP) FPGAs are available, the dominant types are SRAM-based, which can be reprogrammed as the design evolves.

Security considerations

The flexibility of FPGAs makes malicious modification during fabrication a lower risk. Previously, for many FPGAs, the design bit stream was exposed while the FPGA loaded it from external memory (typically on every power-on). All major FPGA vendors now offer a spectrum of security solutions to designers, such as bit stream encryption and authentication. For example, Altera and Xilinx offer AES (up to 256-bit) encryption for bit streams stored in external flash memory. FPGAs that store their configuration internally in non-volatile flash memory, such as Microsemi's ProASIC3 or Lattice's XP2 programmable devices, do not expose the bit stream and do not need encryption. In addition, flash memory for the LUTs provides SEU protection for space applications.

6.6.1 FPGA design and programming

To define the behaviour of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form is more suited to working with large structures because it is possible to specify them numerically rather than having to draw every piece by hand. However, schematic entry can allow for easier visualization of a design. The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs (which have been compared to the equivalent of assembly languages), there are moves to raise the abstraction level through the introduction of alternative languages. The National Instruments LabVIEW graphical programming language (sometimes referred to as G) has an FPGA add-in module available to target and program FPGA hardware.

6.6.2 FPGA Applications

Due to their programmable nature, FPGAs are an ideal fit for many different markets. As the industry leader, Xilinx provides comprehensive solutions consisting of FPGA devices, advanced software, and configurable, ready-to-use IP cores for markets and applications such as:

 Aerospace & Defence - Radiation-tolerant FPGAs along with intellectual property for image processing, waveform generation, and partial reconfiguration for SDRs.
 ASIC Prototyping - ASIC prototyping with FPGAs enables fast and accurate SoC system modelling and verification of embedded software.
 Audio - Xilinx FPGAs and targeted design platforms enable higher degrees of flexibility, faster time-to-market, and lower overall non-recurring engineering costs (NRE) for a wide range of audio, communications, and multimedia applications.
 Automotive - Automotive silicon and IP solutions for gateway and driver assistance systems, comfort, convenience, and in-vehicle infotainment.
 Broadcast - Adapt to changing requirements faster and lengthen product lifecycles with Broadcast Targeted Design Platforms and solutions for high-end professional broadcast systems.
 Consumer Electronics - Cost-effective solutions enabling next-generation, full-featured consumer applications, such as converged handsets, digital flat panel displays, information appliances, home networking, and residential set top boxes.
 Data Centre - Designed for high-bandwidth, low-latency servers, networking, and storage applications to bring higher value into cloud deployments.
 High Performance Computing and Data Storage - Solutions for Network Attached Storage (NAS), Storage Area Network (SAN), servers, and storage appliances.
 Industrial - Xilinx FPGAs and targeted design platforms for Industrial, Scientific and Medical (ISM) enable higher degrees of flexibility, faster time-to-market, and lower overall non-recurring engineering costs (NRE) for a wide range of applications such as industrial imaging and surveillance, industrial automation, and medical imaging equipment.
 Medical - For diagnostic, monitoring, and therapy applications, the Virtex FPGA and Spartan FPGA families can be used to meet a range of processing, display, and I/O interface requirements.
 Security - Xilinx offers solutions that meet the evolving needs of security applications, from access control to surveillance and safety systems.
 Video & Image Processing - Xilinx FPGAs and targeted design platforms enable higher degrees of flexibility, faster time-to-market, and lower overall non-recurring engineering costs (NRE) for a wide range of video and imaging applications.
 Wired Communications - End-to-end solutions for the Reprogrammable Networking Line card, Packet Processing, Framer/MAC, serial backplanes, and more.
 Wireless Communications - RF, baseband, connectivity, transport and networking solutions for wireless equipment, addressing standards such as WCDMA, HSDPA, WiMAX and others.


Chapter VII
RESULTS

7.1 OUTPUT AFTER EVERY BLOCK

The output after each block of the processing chain is shown in Fig 7.1; a minimal end-to-end MATLAB sketch of the same chain is given after the figure.

 iBall Face2Face web camera: The picture of pests captured by the camera.

 RGB to HSV conversion: Extracting the "S" component.

 Segmentation: Obtaining the particular object.


 Erosion: Removal of noise.

 Dilation: Retrieving the diminished object.

Fig 7.1 Output of each block
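The report implements this chain as Simulink blocks; purely as an illustration, an equivalent frame-level MATLAB sketch (function names are standard toolbox calls; the file name, threshold and structuring element are assumptions, not values from the report) is:

% End-to-end sketch: capture -> extract S plane -> threshold -> erode -> dilate -> count
rgbFrame = imread('pests.jpg');            % stand-in for one captured webcam frame
hsv      = rgb2hsv(rgbFrame);              % convert RGB to HSV
s        = hsv(:,:,2);                     % extract the S (saturation) component
bw       = s > 0.35;                       % segmentation by thresholding (assumed threshold)
se       = strel('disk', 2);               % assumed structuring element
bw       = imerode(bw, se);                % erosion removes small noise
bw       = imdilate(bw, se);               % dilation restores the diminished objects
cc       = bwconncomp(bw);                 % connected components = detected pests
numPests = cc.NumObjects;                  % pest count reported by the system
fprintf('Detected %d pest(s)\n', numPests);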


ADVANTAGES AND DISADVANTAGES

ADVANTAGES:
1. Farmers need not monitor the crops 24/7.
2. It automates the use of pesticides.
3. It can also be programmed to detect yellow and green flies.
4. It saves the farmer's time and energy.

DISADVANTAGES:
1. Pests that are hidden behind the leaves cannot be captured by the robo cam.
2. The system is sensitive to brightness conditions.
3. In case of rain, the camera may be damaged.

APPLICATIONS
1. To sow seeds.
2. To identify the fruit.
3. To check if the crops are affected by pests.
4. To check the water content.
5. To collect the fruits.

FUTURE SCOPE

As future scope, instead of using an FPGA we can implement the project on an ASIC (application-specific integrated circuit), which is dedicated to performing a single operation. We can also attach an arm to the robot that can spray pesticides. The system can also be used for detecting pests on fruits and vegetables.

CONCLUSION

We have successfully detected pests on plants and counted the number of pests using video processing implemented on an FPGA.


APPENDIX

FPGA Introduction

The Basys2 board is a circuit design and implementation platform that anyone can use to gain experience building real digital circuits. Built around a Xilinx Spartan-3E Field Programmable Gate Array and an Atmel AT90USB2 USB controller, the Basys2 board provides complete, ready-to-use hardware suitable for hosting circuits ranging from basic logic devices to complex controllers. A large collection of on-board I/O devices and all required FPGA support circuits are included, so countless designs can be created without the need for any other components.

Four standard expansion connectors allow designs to grow beyond the Basys2 board using breadboards, user-designed circuit boards, or Pmods (Pmods are inexpensive analog and digital I/O modules that offer A/D and D/A conversion, motor drivers, sensor inputs, and many other features). Signals on the 6-pin connectors are protected against ESD damage and short circuits, ensuring a long operating life in any environment.

The Basys2 board works seamlessly with all versions of the Xilinx ISE tools, including the free WebPACK. It ships with a USB cable that provides power and a programming interface, so no other power supplies or programming cables are required. The Basys2 board can draw power and be programmed via its on-board USB2 port. Digilent's freely available PC-based Adept software automatically detects the Basys2 board, provides a programming interface for the FPGA and Platform Flash ROM, and allows user data transfers.

The Basys2 board is designed to work with the free ISE WebPACK CAD software from Xilinx. WebPACK can be used to define circuits using schematics or HDLs, to simulate and synthesize circuits, and to create programming files. The Basys2 board ships with a built-in self-test/demo stored in its ROM that can be used to test all board features. To run the test, set the Mode Jumper to ROM and apply board power. If the test is erased from the ROM, it can be downloaded and reinstalled at any time.


WEB CAMERA

iBall Face2Face K20 Features
1. High quality CMOS sensor.
2. 300K pixels (interpolated 20M pixels still image resolution & 2M pixels video resolution).
3. High quality 5G wide range lens for sharp and clear picture.
4. 4 LEDs for night vision, with brightness controller.
5. Built-in highly sensitive USB microphone.
6. Single button to take a picture and change the effects.
7. 4x digital zoom and auto face tracking.
8. 9 special effects and 7 photo frames for more fun.
9. Built-in JPEG compression.
10. Lens focus from 5 cm to infinity.
11. Multi-utility camera base (can be used on desktop, laptop, LCD & tripod).

Specifications
1. Image sensor: High quality 1/6" CMOS sensor.
2. Sensor resolution: 300K pixels.
3. Video format: 24-bit true color.
4. Lens: High quality 5G wide angle lens.
5. LEDs: 4 LEDs for night vision, with brightness controller.
6. USB interface: USB 2.0, backward compatible with USB 1.1.
7. Microphone: Built-in highly sensitive USB microphone.
8. Maximum video resolution: up to 1600 x 1200 pixels.
9. Maximum image resolution: up to 5500 x 3640 pixels.
10. Frame rate: 18 frames per second.
11. Adjustable focus: 5 cm to infinity.
12. Automatic white balance: Yes.
13. Automatic exposure: Yes.
14. Automatic compensation: Yes.
15. Power supply: USB bus powered.
16. Operating systems supported: Windows XP, Vista, 7, 8 (Mac and Linux with limited functions).


INTRODUCTION TO MATLAB

What is MATLAB?

MATLAB ("MATrix LABoratory") is a tool for numerical computation and visualization. The basic data element is a matrix, so if you need a program that manipulates array-based data it is generally fast to write and run in MATLAB. The version of MATLAB used in this project is version 7.1.

INTRODUCTION TO SIMULINK

Simulink is a software package for modelling, simulating and analysing dynamical systems. It supports linear and nonlinear systems, modelled in continuous time, sampled time, or a hybrid of the two. Systems can also be multi-rate, i.e. have different parts that are sampled or updated at different rates. For modelling, Simulink provides a graphical user interface (GUI) for building models as block diagrams, using click-and-drag mouse operations. With this interface, you can draw models just as you would with pencil and paper. Simulink includes a comprehensive block library of sinks, sources, linear and nonlinear components and connectors. You can also customize and create your own blocks. This approach provides insight into how a model is organised and how its parts interact. After you define a model, you can simulate it, using a choice of integration methods, either from the Simulink menus or by entering commands in MATLAB's command window. The simulation results can be put in the MATLAB workspace for post-processing and visualization. And because MATLAB and Simulink are integrated, you can simulate, analyse and revise your models in either environment at any point. The Xilinx (ISE) version used in this project is 12.3.


BIBLIOGRAPHY

1. Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", Second Edition, 2010.
2. Websites
 Wikipedia
 Encyclopedia
 Imageprocessing.com
 Introduction to Simulink
 Introduction to MATLAB
 Datasheet.com
3. J. G. Proakis and V. K. Ingle, "Digital Signal Processing using MATLAB", MGH, 2000.
