
Digital and Image Forensics
Katie Bouman

In the past twenty years, much emphasis has been placed on technological breakthroughs in consumer electronics. Whether it be the creation of CDs, DVDs, HDTVs, or MP3s, all were created on the same basic principle: converting conventional analog information into digital information. Although these electronic systems are so common today that they may seem simple to understand, the conversion of a fluctuating wave into ones and zeros is much more difficult than it may appear. One example of this major shift is the development of the digital camera. Inspired by the conventional analog camera, which uses chemical and mechanical processes to create an image, the digital camera instead achieves this through the use of a digital sensor and a computer (Wilson, et al, 2006). In just seconds a digital camera captures a sample of light that has bounced off a subject and traveled through a series of lenses, focuses it on a sensor that records the light electronically by breaking the light pattern down into a series of values, and performs a full-color rendering that includes color filter array interpolation, anti-aliasing, infrared rejection, and point correction. A lot goes into doing just that (Adams, et al, 1998).

Andreas Vesalius (1514-1564) was one of the first to dissect and examine the human body. As a result of Vesalius's extensive work, many corrections were made in medicine, and scientists were able to use the enhanced knowledge to extend their theories (Szaflarski, September 22, 2004). Although cameras, and even more so digital cameras, were not invented until centuries later, many of the characteristics of cameras and how they work are very similar to the eye, and this understanding was essential to their development. In addition, understanding of the eye helped to propel curiosity about light and its properties (Watson, 2006).

The color that is detected by the cone cells in the back of one's eye, and by the sensor in the back of a camera, is part of the electromagnetic spectrum. Refer to Figure 1. The area of the spectrum that humans are able to see is called the visible spectrum (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 b). The electromagnetic spectrum is the entire range of wavelengths and frequencies that extends from gamma rays, the shortest waves, to the longest waves, radio waves (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a). Although colored light is part of the spectrum, there is more to light than meets the eye. Only about 300 nm of the spectrum is visible (400 nm to 700 nm). The rest of the light that is unseen by the naked eye is referred to as the "invisible spectrum" (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 b). There are seven types of electromagnetic radiation, including visible light. Light travels in waves (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a). In order of wavelength from largest to smallest, they are: radio waves, microwaves, infrared, the visible spectrum, ultraviolet light, X-rays, and gamma rays (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a). Refer to figure 1. All electromagnetic radiation moves at the same speed through a vacuum, 3.0 × 10^8 meters/second; however, while moving through matter instead of a vacuum the speed is slowed slightly. In fact, the denser the material that the light wave is moving through, the slower it tends to travel (Davis, et al, 2002). A longstanding debate has persisted as to whether light travels in waves or is composed of a stream of particles.
Many noteworthy physicists have argued both sides of this question. In fact, light exhibits behaviors of both a stream of particles and a wave. Light travels in a wave composed of photons (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a; Davis, et al 2002; The Franklin Institute. September 20, 2004). A photon, an uncharged particle that has no mass, is the smallest unit of electromagnetic energy. Each photon contains a certain amount of energy defined by Einstein as E = h × f (where E is the energy of the photon, h is Planck's constant (6.63 × 10^-34 J·s), and f is the frequency of the given light source) (Kudenov, 2003a). The further the distance that the waves are

from each other, the less energy each photon contains (Campbell, et al, 1997; Davis, et al 2002; The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a). For example, microwaves' distance between waves (wavelength) ranges from 10^6 nm to 10^9 nm; therefore, each microwave photon contains less energy than a gamma-ray photon, whose wavelength ranges from 10^-3 nm to 10^-5 nm. However, a microwave contains more energy per photon than does a radio wave, which ranges from 10^9 nm to 10^3 meters (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a; The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 b). Objects and events with more energy create higher-energy radiation than cooler objects or events with less energy. Thus, the more heat emitted, the shorter the waves are in the electromagnetic spectrum (Watson, 2006).
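To make the relationship between wavelength, frequency, and photon energy concrete, the following short Python sketch (not taken from the cited sources; the 500 nm input is simply an arbitrary visible-light wavelength) computes E = h × f using f = c / wavelength.

# Worked example: photon energy E = h * f, with f = c / wavelength.
# h and c are standard physical constants; 500 nm is an arbitrary visible wavelength.
h = 6.63e-34   # Planck's constant, J*s
c = 3.0e8      # speed of light in a vacuum, m/s

def photon_energy(wavelength_nm):
    """Return the energy (in joules) of a single photon of the given wavelength."""
    wavelength_m = wavelength_nm * 1e-9   # convert nanometers to meters
    frequency = c / wavelength_m          # f = c / lambda
    return h * frequency                  # E = h * f

print(photon_energy(500))     # green light: about 4.0e-19 J per photon
print(photon_energy(0.001))   # a 10^-3 nm gamma ray: far more energy per photon

As the sketch shows, shortening the wavelength by a factor of a million raises the energy per photon by the same factor.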

While wavelength (λ) is the measure of one period of a wave as it moves through space, frequency (v) is defined as the number of waves that pass a certain point during a certain period of time (usually a second), v = 1/T where T is the period, and is measured in cycles per second, or hertz (Hz). Refer to figure 2. Therefore a wave, such as a radio wave, that has a longer wavelength than another wave, such as a gamma ray, will have a lower frequency than the shorter wave (Davis, et al, 2002). The visible spectrum ranges from approximately 400 nm to around 700 nm. White light, or light that comes from the sun and many other sources, is composed of all colors, which result from different

wavelengths of light. When white light hits a prism, the prism bends each wavelength by a different amount, which causes all the colors to separate (Levine, 2000; Sekuler, et al. 2002; Szaflarski, September 22, 2004; Davis, et al 2002). These colors are often memorized, in the order of their appearance after going through a prism with the longest wavelength first, as ROY G. BIV: red, orange, yellow, green, blue, indigo, and violet (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 b).

Light waves can be transmitted, absorbed, or reflected. The object that the white light wave hits determines which wavelengths will be reflected, absorbed (the energy of the light converted to heat), or transmitted (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 c; The Franklin Institute September 20, 2004). In a white object, all the colors are being reflected. In a black object, on the other hand, all the colors are being absorbed by the substance and no light waves are emitted (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 c; The Franklin Institute. September 20, 2004.). Every atom contains electrons, which can be pictured as attached to the nucleus by springs. These springs vibrate at specific frequencies. When a light wave of that specific frequency hits the electrons, they start to vibrate and the energy from the light turns into vibrational motion. The vibrating electrons then react with the electrons of the surrounding atoms, creating thermal energy as the light wave is absorbed. This absorbed light wave is not seen; reflected and transmitted light is seen. Color occurs when the light causes electrons to vibrate in small spurts, and these small spurts of vibration re-emit the energy as a light wave (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 c). The color of a green leaf is not actually contained within it (Sekuler, Robert, et al. 2002; Levine, 2000; The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 c; The Franklin Institute. September 20, 2004). The green color seen is the light that is reflected because the frequency of the wave doesn't match the frequency of the electrons in the leaf (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 c).

The primary colors of light are red, green, and blue, referred to as RGB. When mixed together these colors combine to create light's secondary colors: magenta (from blue and red), cyan (from blue and green), and yellow (from green and red). When colors of light are mixed together, the color produced becomes closer to white than the beginning colors were (The Franklin Institute. September 20, 2004). Color addition is a method used to produce different colors of light. Computer monitors and televisions both use color addition. Color addition is when the three primary colors, red, blue, and green, are mixed together at varying intensities to create a wide range of colors (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 d; The

Franklin Institute. September 20, 2004). Monitors and televisions produce all their colors by mixing the light of phosphors (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 d; Brain, 2006.). Phosphors are coated on the inside of screens and emit visible light when struck by an electron beam. In a black and white screen there is only one type of phosphor, which glows white when struck and stays black when not. However, in a color screen, three different types of phosphors, arranged as dots or stripes, use color addition to formulate a final color (Brain, 2006). In color addition the object starts out black, and as more colors are placed together, it becomes closer to white (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 d).

Unlike in color addition, in color subtraction the object starts out white and colors are taken away to create the desired color (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 e; Franklin Institute. September 20, 2004). For example, if a shirt reflects all wavelengths of light, it appears white, but if the shirt is made to absorb blue, and therefore reflect red and green, the shirt appears yellow, because when red and green light are mixed the color yellow is formed. However, if magenta (red and blue) light is projected at the same shirt, the shirt appears red, since the shirt absorbs blue and therefore reflects only the red (The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 e).

In both analog and digital cameras, light travels through a camera's converging lens and is focused on a plane. A lens is simply a curved piece of glass or plastic that bends light beams in a way that creates a focused real image (Kudenov, 2003; Brain, 2006). Light travels slower the denser the material. When light waves enter a piece of glass at an angle, one part of the wave reaches the glass before another and therefore begins to slow down. Since the other light rays have yet to reach the lens, they are still traveling at the same speed, causing the entire light wave to bend toward the normal (90 degrees to the line separating the two materials of different densities) (Zobel, 2006). This is similar to a shopping cart being pushed at an angle from pavement onto denser grass. Refer to figure 3. The right wheel of the cart hits the grass first and, like the light wave, slows down while the left wheel is still moving at the same velocity on the pavement. Because the left wheel is briefly moving faster than the right wheel, the shopping cart turns to the right, toward the normal, as it enters the grass (Brain, 2006). Each lens has a focal length. All light that enters the lens parallel to the axis is refracted inside the lens and passes through the focal point on the axis (focal point (+)), while all light that passes through the other focal point on the axis (focal point (-)) is refracted parallel to the axis. Refer to figure 4. For a particular point on an object, a ray that leaves that point parallel to the axis is refracted by the lens through focal point (+), while a ray that leaves the same point and passes through focal point (-) is refracted by the lens so that it travels parallel to the axis; the intersection of these two refracted rays defines a point on the focal plane (Watson, 2006; Henderson, 2004). Refer to figure 5.
The focal plane is made up of an infinite number of points and defines where an image is in focus (Kudenov, 2003c).
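The cited sources describe this construction geometrically; the standard quantitative form of the same idea is the thin-lens equation, 1/f = 1/d_o + 1/d_i, which is not stated in the text but is sketched below (the 50 mm focal length and 2000 mm object distance are arbitrary example values).

# Thin-lens relation (standard optics, offered here as a supplement to the ray construction above).
# Given the focal length f and the object distance d_o, solve 1/f = 1/d_o + 1/d_i for d_i,
# the distance behind the lens at which that object comes into focus.
def image_distance(f, d_o):
    return 1.0 / (1.0 / f - 1.0 / d_o)

# Example: a 50 mm lens focused on an object 2000 mm away forms a sharp image
# about 51.3 mm behind the lens, which is where the film or sensor must sit.
print(image_distance(50.0, 2000.0))

This is why, as the next paragraph notes, the lens must be moved back and forth to focus on objects at different distances.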

The lens in a camera can be moved back and forth in order to focus an object (Watson, 2006). The focal plane is represented in cameras by the area where light is focused by the lens. In an analog camera this plane is film; in digital cameras it is a sensor (Kudenov, 2003c). Sensors and transistors, the heart of any electronic device, are made of silicon, a group fourteen element. The most frequently used sensors are charge coupled devices (CCDs) (Wilson, et al, 2006). Refer to figure 6. Complementary metal oxide semiconductors (CMOSs) are also quickly growing in use as an alternative to CCDs. Although the two collect data in very different ways, both have the same purpose: to convert light into an electrical signal (DALSA, 2005).

CCDs are composed of many different cells, or photo-sites, that each contain a sensor, and often a color filter and micro lens (Kudenov, 2003b). Refer to figure 7. The silicon gives off photo-electrons when struck by photons: the more photons that strike the sensor, the more photo-electrons are released, and therefore the more intense the light is (Hogan, 2004). These photon counts are converted to voltages, which are sent to an analog-to-digital converter (ADC) by performing an interline transfer, shifting one line at a time to the edge of the sensor. The ADC reads the array of data and turns each pixel's intensity value into a digital value represented in binary form (Hogan, 2004; Kudenov, 2003b). Most CCDs currently are built with micro lenses. Since light waves do not change direction unless acted upon by something, without a lens the light would hit the photo-sites at severe angles, causing adjacent cells to obtain an inaccurate reading of how much light is actually supposed to be hitting them. This lens, like the lens in the front of the camera, redirects the light and causes it to hit the sensor at a more perpendicular angle (Kudenov, 2003b).

CMOSs are similar to CCDs in that both collect data about the intensity of light by accumulating electrons produced by the sensor. Refer to figure 8. However, unlike CCDs, CMOSs use multiple transistors placed next to individual pixels to amplify and move the charge created by the electrons over wires (Hogan, 2004; Wilson, et al, 2006). CCDs are the more widely used sensor of the two (Alken M.R.S, 2006; DALSA, 2005). Since each pixel on a CMOS sensor has several transistors located next to it, the light sensitivity of a CMOS chip is lower, because many of the photons hit the transistors instead of the photo-site. In addition, CMOS sensors usually are more susceptible to noise, defects in the image, whereas a CCD sensor creates high quality images

with only a small amount of noise. Although CMOS sensors use much less power (about 100 times less) than a CCD sensor, CCDs have been used longer and have therefore become more developed in the quality and number of pixels per sensor (Adams, et al, 1998; Hogan, 2004; Wilson, et al, 2006; DALSA, 2005). On both chips, many light sensitive areas called photo-sites (or electrodes) are laid out in an array to capture the charge of the accumulated electrons (the more photo-sites, the higher the resolution of an image). However, since these photo-sites are just capturing how many electrons are released, or the intensity of light, they are colorblind to any differences in wavelength (Adams, et al, 1998; Kudenov, 2003b; Hogan, 2004; Alken M.R.S, 2006). Therefore, all sensors, except for the Foveon X3, need a method to obtain the value of multiple colors' intensity at each pixel. The Foveon X3 sensor is the only sensor that is able to capture the three different color intensities needed at every pixel location, by using three separate pixel layers to create a color rendered image (Hogan, 2004; Digital Photography Review, 2006). Rather than being sensitive to all the different wavelengths of light, a sensor only needs to be monochromatic and sensitive to each of the primary colors of light, red, green, and blue, or sometimes green plus the three secondary colors, cyan, magenta, and yellow. With these colors, it is possible to create any color by simply overlapping the three or four values (Adams, et al, 1998; Kudenov, 2003b; Hogan, 2004).

There are several methods of achieving a colored image by finding the intensity values of the different colors. One of the best ways to obtain the values is by using a beam splitter. This works by directing the light that is focused on the sensor to three different types of sensors (red, green, and blue (RGB)). These sensors have an identical view of the image, but only in the color they are responsive to. In effect, each sensor has an intensity of light for its particular color, and these values can be overlaid to create one final image. This method creates very accurate pictures because each pixel has an RGB value. However, since the sensor is the camera's most expensive component, accounting for around ten to twenty-five percent of the total cost, using three sensors makes the camera very expensive and often very large (Adams, et al, 1998; Wilson, et al, 2006). Another method, similar to splitting the image, is rotating a series of red, green, and blue filters in rapid succession in front of a sensor that records the data for each of the different filters. This, like splitting the image, acquires an intensity value of each color for each pixel; however, also like splitting the image, it is not practical for commercial cameras. In order for a good picture to result, the subject of the image must remain still and a tripod must be used, or the image will become blurred (Wilson, et al, 2006). One of the most practical and common methods used for creating colored images in digital cameras is using a color filter array (CFA). In this type of sensor, each photo-site is covered by a red, blue, or green filter, or by a cyan, magenta, yellow, or green filter (Adams, et al, 1998; Kudenov, 2003b; Hogan, 2004). After the array of data is read by the sensor, the data is run through algorithms in the camera's software to merge the intensity values from each pixel and its neighbors into one color value.
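As a rough illustration of this kind of merging, the sketch below (illustrative only, not any particular camera's algorithm) assumes an RGGB Bayer layout, the pattern described in the next paragraphs, and estimates each pixel's two missing colors by averaging the nearest photo-sites that did record them.

# Minimal demosaicking sketch (illustrative; real cameras use more sophisticated algorithms).
# Assumes a Bayer color filter array tiled with a 2x2 RGGB pattern:
#   even rows: R G R G ...
#   odd rows:  G B G B ...
# Each photo-site records one intensity; the two missing colors at every pixel are
# estimated by averaging the measured samples in the surrounding 3x3 neighborhood.
import numpy as np

def demosaic_bilinear(raw):
    h, w = raw.shape
    rgb = np.zeros((h, w, 3))
    r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True
    b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True
    g_mask = ~(r_mask | b_mask)
    for channel, mask in enumerate([r_mask, g_mask, b_mask]):
        measured = np.where(mask, raw.astype(float), 0.0)
        counts = mask.astype(float)
        neighbor_sum = np.zeros((h, w)); neighbor_cnt = np.zeros((h, w))
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                neighbor_sum += np.roll(np.roll(measured, dy, 0), dx, 1)
                neighbor_cnt += np.roll(np.roll(counts, dy, 0), dx, 1)
        # Average only the photo-sites that actually measured this color
        # (edges wrap around here purely for simplicity).
        rgb[..., channel] = neighbor_sum / np.maximum(neighbor_cnt, 1)
    return rgb

Real cameras refine this simple averaging, for example by interpolating only along detected edges, as described later in this section.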
Intermediate colors are also assigned to "invented" pixels based on the surrounding pixels' colors; this is called interpolation. A grayscale image is simply created by averaging the values, which describe the amount of light emitted from a particular area, to generate one luminance (V) value: V = (R + G + B) / 3, or V = (C + M + Y + G) / 4 (Adams, et al, 1998). The most common type of color filter is the Bayer pattern (Adams, et al, 1998; Kudenov, 2003b; Hogan, 2004). Refer to figure 9. This pattern uses the primary colors of light, red (R), green (G), and blue (B), as filters for each photo-site to additively mix intensity values after the chrominance values

(C), the hue and saturation, have been found by taking CR = R − G and CB = B − G (Adams, et al, 1998). The Bayer filter alternates rows of red and green filters with rows of blue and green filters (Hogan, 2004). There are twice as many green pixels as red or blue, since human eyes are more sensitive to small changes in green wavelengths than in red or blue. These extra green filters help the camera estimate luminance data more like the human eye does (Kudenov, 2003b; Adams, et al, 1998).

A slightly more complex filter uses light's three secondary colors, cyan (C), magenta (M), and yellow (Y), plus green (G). However, instead of adding these colors, they are mixed using the subtractive process. In this method the equations CR = ((M + Y) − (G + C)) / 2 and CB = ((M + C) − (G + Y)) / 2 are used to obtain the chrominance values for red and blue. Since magenta is made up of blue and red, and yellow is made up of green and red, the blue and green values cancel and only red is left over. The same is true with magenta and cyan, except that blue is the dominant color. In a Bayer filter, the red, green, and blue are made by mixing together different combinations of dyes (red: magenta and yellow; green: cyan and yellow; blue: cyan and magenta) (Adams, et al, 1998). Since the colors cyan, magenta, and yellow are lighter than red, green, and blue, CYMG is more sensitive to light, since the light only needs to go through one layer of dye instead of two before reaching the sensor. As a result, CYMG is typically used on sensors that have high noise at low light levels, since the light does not need to travel through as many layers of dye in order to be detected, which reduces noise (Adams, et al, 1998; Hogan, 2004). To find the value for each pixel, the values obtained from the four photo-sites surrounding each intersection are averaged. However, averaging causes sharp edges to be blurred, producing errors in the luminance value for that area. One common solution to this problem is to perform edge detection at every pixel: if an "edge" is detected, pixels are only averaged along edges rather than across them (Adams, et al, 1998).

In taking a picture of an intricate cloth pattern or a tweed suit, the image one expects to see can instead be replaced by low-frequency moiré patterns (Adams, et al, 1998). Refer to figure 11. This occurs because the camera is trying to capture an image that repeats at a characteristic spatial frequency whose spacing is finer than the spacing between pixels on the sensor (the closer the lines or dots, the higher the spatial frequency) (Bass, et al, 2005; Adams, et al, 1998). In this case the high spatial frequencies must be eliminated. One possible method is through the use of polarization. Refer to figure 10. In this approach beams of light are passed through two pieces of birefringent material (quartz or calcite), each of which splits an oncoming beam into two beams, so that the original beam is eventually split into four separate beams of light. This, in effect, directs the beams to four adjacent pixels and causes a controlled blurring. High frequencies are eliminated without blurring the original image much. Refer to figure 12. Another approach is to etch a pattern onto the surface of an optical material so that when light passes through the material, some light suffers a phase delay and interferes with light taking another path.
This causes the higher spatial frequencies to disappear before the lower spatial frequencies (Adams, et al, 1998). If the frequency of a signal is known beforehand, anti-aliasing techniques can be disregarded. The frequency at

which a signal is sampled must be at least two times the maximum frequency of the original signal (f_max = f_sampling / 2). If the frequency of the signal is more than this maximum frequency, the signals will overlap, causing aliasing (Bass, et al, 2005). Refer to figures 13, 14, and 15.

Figure 13 (top): A signal that has been limited to W (the frequency of the signal) Hz (Romberg, et al, 2005)

Figure 14 (middle): A signal where the sampling frequency (Ts) is less than twice the signal’s frequency (W) causing overlap of the signals and aliasing (Romberg, et al, 2005)

Figure 15 (bottom): A signal where the sampling frequency (Ts) is greater than twice the signal's frequency (W) (Romberg, et al, 2005)
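A small numerical illustration of this sampling rule (a sketch, not taken from the cited sources; the 10 Hz tone and the two sampling rates are arbitrary choices):

# Sampling a 10 Hz sine wave at two different rates.
# Sampling above twice the signal frequency preserves the oscillation;
# sampling below it produces samples indistinguishable from a slower (aliased) tone.
import math

signal_freq = 10.0   # Hz

def sample(fs, n=8):
    """Return n samples of sin(2*pi*signal_freq*t), taken fs times per second."""
    return [round(math.sin(2 * math.pi * signal_freq * k / fs), 3) for k in range(n)]

print(sample(50.0))   # 50 Hz sampling rate  > 2 x 10 Hz: the 10 Hz tone is captured
print(sample(12.0))   # 12 Hz sampling rate  < 2 x 10 Hz: samples trace out a 2 Hz alias

The same effect, in two dimensions, is what produces the moiré patterns described earlier when fine cloth textures are photographed.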

Since humans can only see light from around 400 to 700 nm, all electromagnetic wavelengths past that are "invisible" to the human eye. However, color filter arrays transmit infrared light that lies beyond this 700 nm mark, and sensors are sensitive to infrared rays. Since cameras' images should represent what humans see, they must disregard the photo-electron count gained from these longer wavelengths. Solutions to this problem fall into two classes: reflection or absorption (Adams, et al, 1998). One way to resolve the dilemma is to use a hot mirror: a dielectric material (a substance that is a poor conductor of electricity but readily supports an electrostatic field) coated on an optical surface (Adams, et al, 1998; Wigmore, 2001). This hot mirror lets all light pass through the optical surface except wavelengths that exceed a value in the range of 600 to 700 nm, which are reflected. The same results can be achieved by using absorption, where a filter absorbs the infrared radiation instead. However, since infrared radiation is heat energy (this is why the reflection process is called a hot mirror), if these filters do not have proper cooling they can easily crack or shatter. In addition, color correction, white balance adjustment, high pass filtering, and gamma correction are all used to "fix up" the image to look as accurate as possible before it is displayed (Adams, et al, 1998).

Before the information is saved on a memory card, it is usually compressed. This compression excludes excess data and reduces the file's size, usually saving it as a Joint Photographic Experts Group (JPEG) or FlashPix file. However, cameras are also able to save uncompressed images, in the form of TIFFs or just raw data. Although these files are much larger, they hold much more data. With raw data the image can be processed with software that performs the same tasks as the camera, but with much greater control (Alken M.R.S, 2006).

A picture file contains more than just the image; it also contains information about how the image was taken, in the form of an Exchangeable Image File Format (EXIF) header. Refer to figure 16. This method of saving data about an image was developed by the Japan Electronic Industry Development Association. The header contains important details such as the date and time a picture was taken and the source camera's make and model. This information can prove useful when details about an image are needed for law enforcement (Fung, 2006).

Digital image quality has become so high in the last few years that not only has the general public been rapidly replacing classical analog cameras with digital cameras, but law enforcement agencies are doing so as well. However, with the availability of powerful editing programs, it is very easy for an amateur to modify digital media and create realistic looking forgeries (Khanna, 2006). In film photography, methods for camera identification have been perfected; small scratches on negatives are one of the many ways to identify an image captured on an analog camera. However, because of the growing use of digital cameras, methods to identify an image's origin quickly, reliably, and inexpensively are needed. Forensic tools that help establish the origin and authenticity of digital images are essential to a forensic examiner. These tools can prove to be vital whenever questions of digital image integrity are raised (Lukas, et al, 2005). Although an image is originally accompanied by a large amount of data in the EXIF header, this header may not be available if the image was saved in a different format or recompressed (Alken M.R.S, 2006).

One possible way to identify the source of a digital image is by using the pattern of hot or dead pixels, which give luminance values of white or black respectively, regardless of the image's content. However, if the camera does not contain any defective pixels, or if they were eliminated by processing after the image was taken, the source cannot be identified this way. A camera's noise pattern, however, can be used as a watermark for its images. As with analog cameras, defects in the image are used to determine which camera produced a certain image. Each camera has non-uniformities ranging from dust specks on the optics to dark current. This noise is relatively stable over the camera's life span and can be used to determine the source of an image. Although hot and dead pixels alone are not reliable for identifying images, they can be used in conjunction with other defects. Hot and dead pixels fall into the range of noise caused by array defects, which include point defects, hot point defects, pixel traps, column defects, and cluster defects, all of which alter individual pixel values dramatically. Pattern noise can also be mapped for cameras: any spatial pattern that does not vary from frame to frame can be used as a "watermark" for an image (Lukas, et al, 2005). Such patterns include dark current (dark meaning the current is formed despite no exposure to light) and photo response non-uniformity noise (PRNU). Dark current arises from an excess of electrons that are captured by the sensor for various reasons and counted as part of the total signal. The resulting brighter areas (since more electrons are captured at those particular points), defined as the fixed pattern noise (FPN), are due to variations in detector size and foreign matter trapped during fabrication of the sensor.
The FPN can be detected when the sensor is not illuminated. Dark current increases with temperature, since electrons become more active with heat (Hogan, 2004; Lukas, et al, 2005). PRNU, on the other hand, is the variation in pixel responsivity that can be detected when the sensor is illuminated (Lukas, et al, 2005). This noise is related to detector size, spectral response, thickness of coatings, and other small imperfections introduced when the camera was manufactured (Khanna, et al, 2006; Lukas, et al, 2005). One method of identifying a camera's noise pattern is to subtract a denoised version of an image, produced with a wavelet-based denoising filter, from the original image. The

reference pattern is then found by averaging the noise patterns extracted from multiple pictures obtained from the same camera. This intrinsic signature of the camera can be correlated with the noise pattern of an image of unknown origin. If the patterns are similar and the correlation is above a certain threshold, the camera containing that particular reference pattern is the source camera. This method, however, can cause problems: since some cameras use similar or even the same image processing algorithms, the reference patterns of different cameras are often slightly correlated. Since the preferred method of capturing colored images uses a single sensor in conjunction with a color filter array, only a third of the final image is actually captured by the camera; the other two thirds is interpolated. Thus, an image's origin can also be determined through estimation of the color interpolation parameters used by the source camera. However, this method cannot be used on images that have been compressed, such as JPEGs, since compression artifacts suppress the correlation between the pixels created by the camera's interpolation (Khanna, et al, 2006).

Modern digital computers store all their information as binary information, meaning everything done on a computer, or any other device containing a computer, is made up of only two commands, "on" and "off". There are no "gray areas" in a computer; nothing is ambiguous, everything is either "black" or "white". There are many different ways of representing the two conditions: "true" or "false", "white" or "black", or "on" and "off"; however, a computer represents these conditions as 1 and 0. Each one or zero is called a bit (b). Bits (also sometimes called flags or digits) can be placed together in different ways to create immense amounts of data. The word byte (B) (also called an octet or character), meaning eight bits, is commonly used today (Kozierok, 2005). When first learning about numbers, almost every child holds up his or her hands to count. Having ten fingers allows a child to count ten numbers, zero to nine. Because of this, the foundation of the numeric system is based on the number ten, base ten (decimal numbers). Computers, on the other hand, only deal with binary numbers, or base two. Instead of counting to nine before moving to a new place, one only counts up to one. For example, the number fourteen is made up of a four in the ones place and a one in the tens place, meaning 4 + 10 = 14. The same concept is used in binary: in the number 1110 there is a one in the two's place, a one in the four's place, and a one in the eight's place, meaning 2 + 4 + 8 = 14 (Kozierok, 2005). Refer to figure 17.
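Returning to the sensor-noise identification method described above, the following sketch shows the general shape of the idea (illustrative only, not the actual algorithm of Lukas et al.: a simple 3x3 blur stands in for their wavelet-based denoising filter, and images are treated as plain 2-D grayscale arrays).

# Sensor-noise (PRNU-style) camera identification, in outline.
# NOT the method of Lukas et al.: a box blur replaces their wavelet denoising filter.
import numpy as np

def noise_residual(image):
    """Estimate an image's noise pattern: the original minus a denoised copy."""
    img = image.astype(float)
    blurred = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            blurred += np.roll(np.roll(img, dy, 0), dx, 1)
    return img - blurred / 9.0

def reference_pattern(images):
    """Average the noise residuals of several images taken by the same camera."""
    return sum(noise_residual(img) for img in images) / len(images)

def correlation(a, b):
    """Normalized correlation between two noise patterns."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# If correlation(reference, noise_residual(questioned_image)) exceeds a chosen
# threshold, the questioned image is attributed to the camera that produced the reference.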

Since a sensor only measures intensity, pixels are stored in a computer as a value which describes how bright the pixel is. In binary images, the pixel value is a 1-bit number (0 or 1). Since there are only two possible one-bit numbers, these images are often displayed as black and white. A grayscale image, on the other hand, contains eight bits (one byte) of information per pixel. Since every bit can contain one of two values, the number of possible values in a byte is 2^8, or 256. Since a grayscale image only needs one value to represent the brightness of the pixel, each pixel contains a number from 0 to 255 (256 possible values), where 0 represents black, 255 represents white, and all values in the middle are shades of gray. A color image, on the other hand, is made up of three different values, red, green, and blue, each having a range from 0 to 255. When these three values are placed together in an image they make a colored image (Fisher, 1994). Filters are used to attack pixel values, usually by assigning them new values based on adjacent pixels. When attacking a colored image, these three values have to be determined separately. Five different examples of filters are described below (EE637: Digital Signal Processing I, 2005).
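Before turning to the filters, here is a small sketch of these pixel representations (the particular values are arbitrary examples):

# Pixel representations (arbitrary example values).
binary_pixel = 1              # binary image: 1 bit per pixel, black (0) or white (1)
gray_pixel = 200              # grayscale image: one byte per pixel, 0 (black) to 255 (white)
color_pixel = (255, 200, 0)   # color image: one byte each for red, green, and blue

# Collapsing a color pixel to a single brightness value by simple averaging,
# as in the V = (R + G + B) / 3 luminance formula given earlier.
r, g, b = color_pixel
luminance = (r + g + b) // 3
print(luminance)              # 151, a fairly bright gray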

A blurring filter is a three by three array of weighted coefficients where each coefficient is assigned a weight of one. Refer to figure 18. The weight for each pixel is multiplied by its intensity value (in this case, since all the coefficients are one, the values stay the same), the results are added together, and the sum is then divided by nine to find the average of the pixels covered by the filter. This new average value is then assigned to the middle pixel (shown here as bold). This is done for every pixel in the image. In effect, the resulting image is fuzzy when compared to the input image. A weighted blurring filter is very similar to a blurring filter except that the coefficients are different: more emphasis is placed on the pixel value that is being changed and the adjacent pixel values than on the pixel values diagonal from the pixel being changed. An example of possible weighting coefficients can be seen in figure 19. Instead of dividing by nine, after the pixel values are multiplied by their coefficients and added together, the sum is divided by the sum of the coefficients, in this example usually sixteen (when the pixel is on an image's edge the value changes). A weighted blurring filter, like a blurring filter, makes an image fuzzier; however, more detail is kept. Refer to figure 19. A sharpening filter makes use of a weighted blurring filter. The sharpened value is found by the equation I(m,n) + λ(I(m,n) − h(m,n) × I(m,n)), where m identifies the row the pixel is in, n identifies the column, h(m,n) × I(m,n) is the corresponding pixel in the blurred image, and λ is any value the programmer decides on (usually around 1) (EE637: Digital Signal Processing I, 2005). Refer to figure 21.


A median filter is used on images that contain salt and pepper noise. Refer to figure 22. A median filter finds the median value out of the set of pixel values that surround the value being changed. Although it slightly blurs the image, it discards extreme pixel values of 0 and 255 where they do not belong. A weighted median filter works in the same way as a median filter except that it helps to prevent blurring. An example weighted median filter is a five by five filter where the outer pixels are assigned weighting factors of one and all the values enclosed by those outer pixels receive factors of two. Refer to figure 23. Each pixel value is placed in an array, and the coefficients of the pixels are placed into a separate array. The array of pixel values is sorted in descending order, {X(1), X(2), ..., X(p)}, where X is the pixel value and the subscript identifies its position in the array, and the corresponding pixel weights are placed in the same order as the sorted pixels, {a(1), a(2), ..., a(p)}, where a is the weight value and the subscripts identify which value each weight corresponds to in the intensity value array. A variable is incremented to step through the array containing the weighting factors, and all the weights to the left and to the right of the incremented position are added together separately. When the values on the left add up to more than or equal to the values on the right, the position in the weighting factors array is saved, and the value at that saved position in the pixel values' array is considered the median value and assigned to the pixel in the middle of the filter (shown in bold) (EE637: Digital Signal Processing I, 2005).

A histogram of an image graphs pixel intensity (0 to 255) against the number of pixels at each value. A histogram equalization filter redistributes intensities so that each intensity value contains closer to the same number of pixels, causing the histogram to look more even across the top. This is done by first finding the number of pixels in the image. The number of pixels at each intensity value is then divided by the total number of pixels in the image; this is done for every intensity value and the results are placed into an array containing 256 values. A new array of 256 values is created by adding up the fractions in the original array up to and including the intensity value being recorded in the new array. For example, for the new array's 5th position, the values in the 0th, 1st, 2nd, 3rd, 4th, and 5th positions of the original array are added together and assigned to the new array's 5th position. This is done for every pixel value (0 to 255) until the array reaches the 256th term, which is assigned 1. These new values are multiplied by 255, and the new value for each pixel intensity is assigned to the pixels containing that intensity (Mai, 2000; EE637: Digital Signal Processing I, 2005). Refer to figures 24, 25, 26, and 27.

Fig 24 (top left): Original Image (EE637: Digital Signal Processing I, 2005)
Fig 25 (bottom left): Histogram of Original Image
Fig 26 (bottom left): Image 4.1 after histogram equalization
Fig 27 (bottom right): Histogram of Image 4.3
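The following sketch (illustrative only, based on the filter descriptions above; it assumes 8-bit grayscale images stored as 2-D numpy arrays) shows a plain 3-by-3 blurring filter, a 3-by-3 median filter, and histogram equalization.

# Illustrative implementations of three of the filters described above,
# for 8-bit grayscale images stored as 2-D numpy arrays (values 0..255).
# Edges wrap around in the blur and median filters purely for simplicity.
import numpy as np

def blur(image):
    """3x3 blurring filter: each pixel becomes the average of its 3x3 neighborhood."""
    img = image.astype(float)
    total = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            total += np.roll(np.roll(img, dy, 0), dx, 1)
    return (total / 9.0).astype(np.uint8)

def median_filter(image):
    """3x3 median filter: each pixel becomes the median of its 3x3 neighborhood."""
    shifted = [np.roll(np.roll(image, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return np.median(np.stack(shifted), axis=0).astype(np.uint8)

def equalize(image):
    """Histogram equalization: map each intensity through the cumulative histogram."""
    counts = np.bincount(image.ravel(), minlength=256)
    cumulative = np.cumsum(counts) / image.size   # running fraction of pixels, last entry = 1
    return (cumulative[image] * 255).astype(np.uint8)

Applied to an image whose pixels are bunched into a narrow band of intensities, equalize spreads them across the full 0 to 255 range, flattening the histogram much as figures 26 and 27 illustrate.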

Watermarks first appeared in Italy around the year 1282, in papermaking. They were added by means of thin wire patterns placed inside the paper molds. Although the purpose of those early watermarks is uncertain, watermarking today is used all the time as a security measure. If one holds a twenty-dollar bill up to the light, one can see a portrait of President Andrew Jackson on one side of the paper. Like the watermarking of over 700 years ago, this anti-counterfeiting watermark is embedded directly into the paper during its production. The word watermark is thought to have been coined during the eighteenth century and is believed to have been derived from the German word wassermarke. Although water does not play a part in the creation of a watermark, the name is thought to come from the effect of water on paper (Cox, et al, 2002).

Steganography and watermarking are very closely related. Steganography is derived from the Greek words steganos, "covered", and graphia, "writing." Covered writing describes the art of steganography perfectly; it is concealed communication, hiding the very existence of a secret message. Although the purpose of steganography is similar to watermarking, there are fundamental philosophical differences. A watermark is a message that is related to the cover work, whereas in steganography the message can be anything, so long as its existence is concealed. Although watermarks can be invisible, this may be done only for cosmetic purposes, and the watermark is not necessarily secret (Cox, et al, 2002).

With the growing use of digital materials, concern for copyright issues and protection of digital works has increased. Techniques are needed to deter or prevent illegal copying and forgery, or simply to identify the digital cover work. Although digital video, audio, and images can all be watermarked, the remainder of this paper will focus on images. There are two different categories of watermarks, visible and invisible. As the names suggest, visible watermarks are designed to be perceived by the viewer, whereas invisible watermarks, under normal viewing conditions, are imperceptible (Cox, et al, 2002).

There are many properties of a watermark. Possibly the most important and the most obvious is the embedding effectiveness. This is the probability that the output of the embedder will be watermarked, that is, that the watermark will be detectable immediately after embedding. The embedding effectiveness can be determined by simply embedding a watermark into many different images and calculating the percentage of output images that result in a positive detection. Although 100% effectiveness is desirable, many times this property has to be traded off against other properties such as fidelity, which refers to the perceptual similarity between the original work and the watermarked work (Cox, et al, 2002). The data payload defines the number of bits that are encoded within the image. The number of bits that are encoded (N) determines how many messages can be embedded: an N-bit watermark is able to embed any one of 2^N messages (since there are two possible values for each of those encoded bits, 1 or 0). For example, if a watermark message encodes 5 bits of data (a 5-bit watermark), it can contain 2^5, or 32, different messages.
However, during detection many applications require a detector not only to identify which of the messages has been encoded but also to decide whether the watermark is present at all, allowing 2^N + 1 possible output values (Cox, et al, 2002). There are two methods by which a watermark can be detected: the watermark's message can be recovered through either informed or blind detection. In some applications, the original unwatermarked image is available during the watermark's detection. In this case the original image's data can be "subtracted" from the watermarked image's data to obtain the watermark. A detector that requires both the original work and the watermarked work is called an informed detector, and the corresponding scheme is a private watermarking system (Cox, et al, 2002). However, many times the original work cannot be accessed, and the detector has to obtain the message without it, through other methods; this is called blind detection (Cox, et al, 2002; Delp, 2006). The term public watermarking system refers to a system where the watermark must be identified through blind detection (Cox, et al, 2002).

A watermark may be fragile, semi-fragile, or robust. A fragile watermark is designed to "break" under the smallest modification to the original image. This can be helpful in designing watermarks for authentication: since any signal processing operation will cause the watermark to be lost, one can tell if the original watermarked image has been modified. A semi-fragile watermark contains a user-specified threshold that determines when the watermark will be broken. The larger the threshold, the more

modification the watermark is able to withstand (a fragile watermark has a threshold of zero) (Delp, 2006). A problem many people face today is that often they do not want the watermark to be lost: after inserting a watermark into an image, it is often desirable to be able to access the information even if the image has been attacked. However, with easy access to image-modifying programs, such as Photoshop and the GIMP, it is very easy to alter the original image (Khanna, et al, 2006). Robustness refers to the ability of a detector to correctly identify the watermark after common signal processing attacks such as the ones described above. Robustness can be achieved in multiple ways, although no method is perfect. By repeating the watermark in several locations over the entire image, losing the message to attacks can be partially avoided (Cox, et al, 2002).
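As a toy illustration of embedding and blind detection, the sketch below hides message bits in the least significant bit of each pixel. This is a deliberately fragile scheme chosen only for simplicity; it is not one of the systems discussed by Cox et al., and almost any processing of the image destroys the message.

# Toy fragile watermark: hide message bits in the least significant bit (LSB) of each pixel.
# Not a scheme from the cited sources; even mild image processing destroys the payload.
import numpy as np

def embed(image, bits):
    """Write one message bit into the LSB of each of the first len(bits) pixels."""
    out = image.copy().ravel()
    for i, bit in enumerate(bits):
        out[i] = (int(out[i]) & 0xFE) | bit   # clear the LSB, then set it to the message bit
    return out.reshape(image.shape)

def extract(image, n_bits):
    """Blind detection: read the LSBs back without needing the original image."""
    return [int(p) & 1 for p in image.ravel()[:n_bits]]

cover = np.full((4, 4), 200, dtype=np.uint8)   # a flat gray 4x4 stand-in for an image
message = [1, 0, 1, 1, 0]                      # a 5-bit payload (one of 2^5 = 32 possible messages)
marked = embed(cover, message)
print(extract(marked, 5))                      # [1, 0, 1, 1, 0]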

Works Cited

[1] Adams, Jim, Parulski, Ken, Spaulding, Kevin. 1998. “Color Processing in Digital Cameras”. IEEE.

[2] Alken M.R.S. 2006. "How Digital Cameras Work". http://www.alkenmrs.com/digital-photography/how-digital-cameras-work.html

[3] Bass, James, Finnigan, James, Rodriguez, Edward, Mcpheeters, Claiborne. 2005. “Spatial Frequency”. http://cnx.org/content/m12564/latest/.

[4] Bjork, Gail. 2003. “What is Exif Data?” http://www.digicamhelp.com/what-is-exif/.

[5] Brain, Marshall. 2006. “How Television Works”. HowStuffWorks, Inc. http://electronics.howstuffworks.com/tv6.htm.

[6] Campbell, Mitchell, and Reece, 1997, Biology: Concepts and Connections, Benjamin Cummings Publishing Company: Menlo Park.

[7] Cox, Ingemar, Miller, Matthew, Bloom, Jeffrey. 2002. Digital Watermarking. Academic Press. London.

[8] DALSA. 2005. “CCD vs. CMOS.” http://www.dalsa.com/markets/ccd_vs_cmos.asp.

[9] Davis, Raymond, Metcalfe, Clark, Williams, John, Castka, Joseph. 2002. Modern Chemistry. Holt, Rinehart and Winston. Austin.

[10] Delp, Edward. 2006. “Multimedia Security Research at Purdue University”. http://cobweb.ecn.purdue.edu/~ace/water2/digwmk.html.

[11] Department of Biology. 2001. “Capturing Solar Energy: Photosynthesis” . http://www.uta.edu/biology/westmoreland/classnotes/1441/Chapter%2010%20Supplement.htm.

[12] Digital Photography Review. 2006. “Foveon's Revolutionary X3 Sensor”. http://www.dpreview.com/news/0202/02021101foveonx3.asp.

[13] EE637: Digital Signal Processing I. 2005. “ Laboratories.” http://dynamo.ecn.purdue.edu/~bouman/grad-labs/.

[14] Fisher, Bob, Perkins, Simon, Walker, Ashley, Wolfart, Erik. 1994. “Pixel Values”. http://www.cee.hw.ac.uk/hipr/html/value.html.

[15] The Franklin Institute. September 20, 2004. “Light and Color”. http://sln.fi.edu/color/color.html.

[16] Fung, Anthony. 2006. "Probing into Digital Image Tampering". http://64.233.187.104/search?q=cache:lia5DDuC9gMJ:www.isfs.org.hk/newsletter/release04/digital_evidence.pdf+exif+header&hl=en&gl=us&ct=clnk&cd=8.

[17] Harris, Tom. 2006. “How Cameras Work”. HowStuffWorks, Inc. http://science.howstuffworks.com/camera.htm.

[18] Henderson, Tom. 2004. "The Anatomy of a Lens". http://www.glenbrook.k12.il.us/gbssci/phys/CLass/refrn/u14l5a.html.

[19] Hogan, Thom. 2004, “How Digital Cameras Work”. http://www.bythom.com/ccds.htm.

[20] Khanna, Nitin. 2006.“A Survey of Forensic Characterization Methods for Physical Devices”

[21] Kozierok, Charles. 2005. The TCP/IP Guide: a Comprehensive, Illustrated Internet Protocols Reference. San Francisco: William Pollock.

[22] Kudenov, Mike. 2003a. "Characteristics of Light". http://ffden-2.phys.uaf.edu/212_fall2003.web.dir/Mike_Kudenov%20/light.htm.

[23] Kudenov, Mike. 2003b. "Charged Coupled Devices (CCD's)". http://ffden-2.phys.uaf.edu/212_fall2003.web.dir/Mike_Kudenov%20/CCD.htm.

[24] Kudenov, Mike. 2003c. "The Focal Plane". http://ffden-2.phys.uaf.edu/212_fall2003.web.dir/Mike_Kudenov%20/plane.htm.

[25] Lukas, Jan, Fridrich, Jessica, Goljan, Miroslav, 2005. “Determining Digital Image Origin Using Sensor Imperfections”

[26] Lukas, Jan, Fridrich, Jessica, Goljan, Miroslav, 2005b. “Detecting Digital Image Forgeries Using Sensor Pattern Noise”

[27] Levine, Michael. 2000. Fundamentals of Sensation and Perception. Third edition. Oxford University Press. Great Britain.

[28] Mai, Luong Chi. 2000. "Histogram Equalization". http://www.netnam.vn/unescocourse/computervision/22.htm.

[29] The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 a. “Wavelike Behaviors of Light”. http://www.physicsclassroom.com/Class/light/U12L1a.html.

[30] The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 b. “The Electromagnetic and Visible Spectra”. http://www.physicsclassroom.com/Class/light/U12L2a.html.

[31] The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 c. “Light Absorption, Reflection, and Transmission”. http://www.physicsclassroom.com/Class/light/U12L2c.html.

[32] The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 d. “Color Addition”. http://www.physicsclassroom.com/Class/light/U12L2d.html.

[33] The Physics Classroom and Mathsoft Engineering & Education, Inc. 2004 e. “Color Subtraction”. http://www.physicsclassroom.com/Class/light/U12L2e.html.

[34] Romberg, Justin, Johnson, Don. 2005. "Aliasing". http://cnx.org/content/m10793/latest/.

[35] Sekuler, Robert, Blake, Randolph. 2002. Perception. Fourth edition. Mc-Graw Hill. Boston.

[36] Szaflarski, Diane. September 22, 2002. "How We See: The First Steps of Human Vision". The National Health Museum. http://www.accessexcellence.org/AE/AEC/CC/vision_background.html.


[37] Watson, Curtis. 2006. “Optical Instruments”. http://home.insightbb.com/~phys1/notes.html.

[38] Wigmore, Ivy. 2001. “Dielectric Material”. TechTarget. http://whatis.techtarget.com/definition/0,,sid9_gci211945,00.html.

[39] Wikipedia. 2006. “Anti-Aliasing”. http://en.wikipedia.org/wiki/Anti-aliasing.

[40] Wilson, Tracy, Nice, K., Gurevich, G. 2006. “How Digital Cameras Work”. HowStuffWorks, Inc. http://electronics.howstuffworks.com/digital-camera.htm.

[41] Zobel, Edward. 2006. “Ray Optics, Light Refraction”. http://id.mind.net/~zona/mstm/physics/light/rayOptics/refraction/refraction1.html.
