Efficient Digital Image Demosaicing Directly to YCbCr 4:2:0

Thesis

Submitted to

The School of Engineering of the

UNIVERSITY OF DAYTON

In Partial Fulfillment of the Requirements for

The Degree of

Master of Science in Electrical Engineering

By

Daniel Christopher Whitehead

UNIVERSITY OF DAYTON

Dayton, Ohio

December, 2013

Efficient Digital Demosaicing Directly to YCbCr 4:2:0

Name: Whitehead, Daniel Christopher

APPROVED BY:

Eric J. Balster, Ph.D.
Advisor, Committee Chairman
Assistant Professor, Department of Electrical and Computer Engineering

Keigo Hirakawa, Ph.D.
Committee Member
Assistant Professor, Department of Electrical and Computer Engineering

Frank A. Scarpino, Ph.D.
Committee Member
Professor Emeritus, Department of Electrical and Computer Engineering

John G. Weber, Ph.D.
Associate Dean
School of Engineering

Tony E. Saliba, Ph.D.
Dean, School of Engineering & Wilke Distinguished Professor

ABSTRACT

Efficient Digital Color Image Demosaicing Directly to YCbCr 4:2:0

Name: Whitehead, Daniel Christopher
University of Dayton

Advisor: Dr. Eric J. Balster

As digital cameras replace their film based predecessors, new techniques are required to convert raw sensor data into a more usable format. Color image demosaicing is used to interpolate sparse color channel information received from the sensor to produce a full color image. The ideal demosaicing would minimize complexity while maximizing quality; in reality, however, trade-offs must be made between complexity and quality.

Typically an image is demosaiced into a Red-Green-Blue (RGB) color-space and then transformed into an alternate color-space such as YCbCr. The YCbCr color-space separates the image into a luminance channel, Y, and two chrominance channels, Cb and Cr, which is useful for image processing tasks such as compression. The chrominance channels of the YCbCr image are often subsampled to reduce the amount of data processed without significantly impacting the perceived quality.

This is possible because the human visual system has a lower sensitivity to high frequency chrominance information compared to high frequency luminance information [1]. A common form of the YCbCr format with subsampled chrominance is YCbCr 4:2:0, which consists of one Cr and one Cb sample for every four luminance samples.

This thesis presents an efficient method of demosaicing directly into the YCbCr 4:2:0 format, bypassing the intermediate RGB image produced by most existing demosaicing methods. The proposed color image demosaicing algorithm is first implemented with floating point mathematics and then further simplified to operate as a fixed point algorithm. The floating point implementation of the proposed algorithm is shown to have a significantly reduced average execution time when compared to methods capable of producing similar quality images. Hardware is developed using fixed point multiplications, which has a throughput of approximately 24 bits/clock.

ACKNOWLEDGMENTS

I would like to thank all who have helped and supported me, especially:

• My family: For always supporting and encouraging me.

• Dr. Eric Balster: For giving me this opportunity, motivating me, and advising me throughout my graduate education.

• Dr. Keigo Hirakawa and Dr. Frank Scarpino: For serving on my thesis committee.

• My Co-Workers: For helping me along the way.

• Kerry Hill, Al Scarpelli, and the Air Force Research Laboratory at Wright-Patterson Air Force Base: For enabling this experience.

TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGMENTS

LIST OF FIGURES

LIST OF TABLES

I. Introduction

II. Background

  2.1 Imaging systems
  2.2 YCbCr Color-Space
  2.3 Existing Demosaicing Algorithms

III. Proposed Algorithm

  3.1 Proposed Demosaicing Algorithm
    3.1.1 Green Channel
    3.1.2 Red-Blue Interpolation
    3.1.3 Calculating Chrominance
    3.1.4 Calculating Luminance
  3.2 Results

IV. Transition from Floating Point to Fixed Point Multiplication

  4.1 Floating Point Overview
  4.2 Conversion to Integer Multiplications
  4.3 Results

V. VHDL Hardware Implementation

  5.1 Hardware Design
    5.1.1 Demosaicing Top Level
    5.1.2 Demosaicing State-Machine
    5.1.3 Green Interpolation
    5.1.4 Red/Blue Interpolation
    5.1.5 Chrominance, Y1 and Y2 Interpolation
    5.1.6 Y3 and Y4 Interpolation
  5.2 Results

VI. Conclusion

  6.1 Future Research

BIBLIOGRAPHY

LIST OF FIGURES

2.1 Digital color photography using a prism to split light between multiple sensors

2.2 Color wheel [2]

2.3 Color artifacts common in demosaiced images

2.4 Raw image to demosaiced image

2.5 Absorption of light by the red, green, and blue cones in the human eye as a function of wavelength [3]

2.6 Spectral sensitivity characteristics of a typical camera sensor [1]

2.7 Gamma transformations

3.1 Green-red starting pattern

3.2 Region of interest

3.3 Green channel interpolation filter developed in [4]

3.4 Kodak image set [5]

3.5 Average PSNR in dB

3.6 Average SSIM

3.7 Sample outputs from left to right, top to bottom: Original, Proposed, Method in [6], Method in [7], Method in [8], Method in [9], Method in [4], Bilinear

4.1 PSNR vs. scaling factor

4.2 PSNR comparison of fixed point and floating point results

4.3 SSIM comparison of fixed point and floating point results

5.1 GiDEL ProceIV Block Diagram from [10]

5.2 State-machine in demosaic control module

5.3 Green interpolation module when processing a blue-green row

5.4 Green interpolation module when processing a green-red row

5.5 Red and blue interpolation module

5.6 Chrominance interpolation segment

5.7 Y1 and Y2 interpolation segment

5.8 Y3 and Y4 interpolation module

LIST OF TABLES

3.1 Operations per Pixel

3.2 Average Execution Time

5.1 Hardware design characteristics

5.2 Hardware execution time characteristics at 200MHz

CHAPTER I

Introduction

Digital cameras have largely replaced film based cameras in the consumer market over the past several years. The sensors used in digital cameras convert light intensity into discrete picture element (pixel) values. Although the sensor captures light intensity, it gives no specific information about color, since without filters the light intensity is taken over a wide range of wavelengths.

Despite this limitation, color digital cameras are made possible through the strategic inclusion of filters that only allow limited ranges of wavelengths through to the sensor. The filtering of light is a critical step in allowing the possibility of color information; however, filtering alone will not produce a color image.

The most common technique for producing color images in consumer grade digital cameras is to place a color filter array (CFA) in front of the sensor. The CFA is a known pattern of filters where a different filter is applied to each pixel. An image captured through a CFA contains partial color channel information, typically red, blue, and green color channels. This pseudo-color image captured by the camera’s sensor is not acceptable as an output image because each pixel only has one color component. A process of interpolating the image produced by a CFA, known as demosaicing, is performed to produce a full color image.

Typical demosaicing algorithms produce a Red-Green-Blue (RGB) image, which is useful for display purposes, but for many other image processing tasks, such as compression, an RGB image is not ideal. More useful color-spaces than RGB exist, such as YCbCr, which splits the image into luminance and chrominance information; HSL, which converts the image into hue, saturation, and luminance channels; and many more.

This paper proposes a new method of demosaicing raw images directly into the YCbCr color-space, bypassing the need for a full RGB image as an intermediate step. As a low complexity method of demosaicing, the proposed algorithm is ideal for applications that are able to sacrifice quality slightly for a significant increase in speed. The proposed algorithm is first introduced with the inclusion of floating point multiplications. The algorithm is then simplified to require only integer based operations.

Chapter II covers various camera designs for capturing color, the YCbCr color-space, and summarizes some existing demosaicing algorithms. The proposed demosaicing algorithm and its results are discussed in Chapter III. In Chapter IV the proposed algorithm is converted from relying on floating point multiplications to being purely integer based. A hardware implementation of the proposed integer based algorithm is presented in Chapter V. Conclusions about the proposed algorithm are drawn in Chapter VI.

CHAPTER II

Background

This chapter discusses the systems for capturing color images with a digital camera, the YCbCr color-space transform, and some existing demosaicing algorithms. Section 2.1 discusses three types of color imaging systems and their advantages and disadvantages. Section 2.2 introduces the YCbCr color-space and why it is used. Section 2.3 summarizes some existing demosaicing algorithms, which are used for comparison later in this paper.

2.1 Imaging systems

The simplest model of a digital camera consists of a sensor array, which converts photons into an electrical charge; a lens, which focuses the scene being photographed onto the sensor array; an aperture, which limits the amount of light entering the lens; and a shutter to limit exposure time. The sensor arrays in digital cameras record light intensity based on the number of photons that reach the sensor. The intensities of specific wavelengths would be completely lost, leaving a gray-scale image, if additional hardware were not included in this model.

In order to produce color images, specific wavelengths of light must be strategically filtered.

One method of doing this is to use a prism to split light between three separate sensors, each with a different filter placed in front of it, resulting in three complete color channels as seen in Figure 2.1.

Figure 2.1: Digital color photography using a prism to split light between multiple sensors

Since the light is split between three sensor arrays, fewer photons reach each array and thus more noise is present in the final image. This method tends to be more expensive to implement and results in larger cameras due to the additional sensors required.

The second method is to have a single sensor with a rotating wheel of filters as seen in Figure 2.2, where three separate images are captured to produce the final color image. Unfortunately this method only works with a still scene and camera because any movement would result in misaligned color channels. Although this type of system is impractical for a point-and-shoot digital camera, it does preserve color along object edges well.

The final and most commonly used method of digital color photography uses a color filter array (CFA) in front of the sensor to preserve some color information at each recorded pixel. A color filter array typically consists of a 2x2 pattern of filters that is repeated and placed over the entire sensor array so that each filter covers only one pixel. Each filter in the pattern allows through a limited range of wavelengths of light from the scene. The resulting image can be processed as a color image where each pixel only contains information about one color channel. Once the pseudo-color image is obtained, the unpopulated color channels for the individual pixels are filled in through an interpolation process known as demosaicing.

Figure 2.2: Color wheel [2]

Although the CFA method of producing color images is widely used in consumer grade and professional cameras, it does have drawbacks, including chromatic aliasing as seen in Figure 2.3a and zippering as seen in Figure 2.3b.

Several different color filter array patterns have been developed since the invention of the color filter array, but the most common pattern by far is the Bayer pattern, named after its inventor B. E. Bayer [11]. Figure 2.4 illustrates, from left to right, the simulated output from a camera with a Bayer CFA, the mosaiced image with colors applied at appropriate locations, and the demosaiced image.

The human eye absorbs light according to the plot shown in Figure 2.5 [3]. The standard RGB definition approximately matches the characteristics of the human eye. A digital camera sensor produces an RGB image that differs from the standard RGB definition due to the absorption characteristics of the sensor. Figure 2.6 shows an absorption plot from a Sony CCD [1]. In order to rectify the discrepancy between the RGB values recorded by a digital camera sensor and the standard RGB definition, a color correction matrix (CCM) must be applied. The scene illumination also affects the image captured by the camera sensor. White-balancing is done to correct for the scene illumination so that the displayed output image appears more similar to what a person would see instead of what the camera sees.

Figure 2.3: Color artifacts common in demosaiced images

The last step before the image can be displayed on a monitor is performing gamma correction, which is a non-linear operation applied to each of the color channels individually. Gamma correction is calculated by

\[ s = c r^{\gamma}, \tag{2.1} \]

where s is the output intensity, c and γ are constants, and r is the input intensity [3]. Figure 2.7 shows several gamma transform curves for c = 1 with the input intensity normalized to one.

Figure 2.4: Raw image to demosaiced image

A display will typically apply a transform with a gamma greater than one, so in order to appear correctly on the screen the intensity must first be corrected through a gamma transform with a gamma less than one. For example, if the monitor displaying the image performs a transform with γ = 2.5, the intensities must first be corrected with a transform using γ = 0.40 to properly reproduce the correct intensities.

2.2 YCbCr Color-Space

Figure 2.5: Absorption of light by the red, green, and blue cones in the human eye as a function of wavelength [3]

Figure 2.6: Spectral sensitivity characteristics of a typical camera sensor [1]

Figure 2.7: Gamma transformations (output intensity level s vs. input intensity level r, for γ ranging from 0.04 to 25)

Whether multiple channels are stitched together from three sensors, combined from multiple frames of a single sensor, or generated by demosaicing, the end result is typically expressed in the Red-Green-Blue (RGB) color-space. In the RGB color-space each pixel is represented as a combination of red, green, and blue light intensity channels. Although this format is fairly intuitive, it is not ideal for most types of processing because changes to individual channels will alter the color ratio, meaning all channels would need to be modified equally to preserve color information. Instead of working with three color channels, the RGB image can be transformed into the YCbCr format, which is split up into a luminance channel and two chrominance channels. The luminance channel, Y, contains light intensity information and can be used as a grayscale representation of the image. The chrominance channels, Cb and Cr, are the blue difference and red difference components respectively. The standard conversion for gamma corrected RGB to YCbCr for high definition television as defined by [12] is

\[
\begin{aligned}
Y_i &= 0.2126 R_i + 0.7152 G_i + 0.0722 B_i \\
Cb_i &= -0.1172 R_i - 0.3942 G_i + 0.5114 B_i \\
Cr_i &= 0.5114 R_i - 0.4645 G_i - 0.04689 B_i.
\end{aligned}
\tag{2.2}
\]

The chrominance components of a YCbCr image are often sub-sampled to reduce the amount of data being processed. Commonly used forms of YCbCr sampling include YCbCr 4:4:4, which performs no sub-sampling; YCbCr 4:2:2, which is sub-sampled in chrominance to contain only one half of the resolution in the horizontal direction and full resolution in the vertical direction; YCbCr 4:2:0, which sub-samples the chrominance channels to contain only one half the resolution in both the horizontal and vertical directions; and YCbCr 4:0:0, which contains only the luminance channel. Although chrominance sub-sampling significantly reduces the color resolution, the human visual system has a lower sensitivity to high frequency color information compared to high frequency intensity information, and therefore the different resolutions are acceptable [1].

2.3 Existing Demosaicing Algorithms

The simplest method of demosaicing is bilinear interpolation, which operates on the red, green, and blue color channels independently. The red (blue) channel is interpolated using two separate 2D convolution operations. At green pixel locations the red (blue) pixels are interpolated by convolving the red (blue) channel with

\[
\begin{bmatrix}
0 & \tfrac{1}{4} & 0 \\
\tfrac{1}{4} & 0 & \tfrac{1}{4} \\
0 & \tfrac{1}{4} & 0
\end{bmatrix}.
\tag{2.3}
\]

The green channel is also calculated by a 2D convolution of the green channel with the kernel in Equation 2.3. The red (blue) samples at blue (red) locations are calculated via yet another 2D convolution with

\[
\begin{bmatrix}
\tfrac{1}{4} & 0 & \tfrac{1}{4} \\
0 & 0 & 0 \\
\tfrac{1}{4} & 0 & \tfrac{1}{4}
\end{bmatrix}.
\tag{2.4}
\]

Malvar et al. [4] propose a method of high quality linear interpolation (HQLI), in which a larger region of influence is used in determining pixel values. This method uses four 5x5 kernels to filter the input Bayer pattern image. The results of the filtering are combined into a single RGB image.
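Returning to the bilinear method above, it can be realized compactly by convolving zero-filled sample planes. The sketch below folds the neighbour averages of Equations 2.3 and 2.4 together with the identity at known sites into single combined kernels; weights at sites with only two contributing neighbours are doubled so that each average runs over the samples actually present. The GRBG mask layout assumes the green-red starting pattern of Figure 3.1, and all names are illustrative.

```python
import numpy as np
from scipy.signal import convolve2d

# Combined kernels: identity at known sites plus the neighbour averages.
KG = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
KRB = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

def bilinear_demosaic(bayer):
    """Bilinear demosaicing sketch for a green-red (GRBG) Bayer image."""
    h, w = bayer.shape
    rows, cols = np.ogrid[:h, :w]
    g_mask = (rows + cols) % 2 == 0             # green sites
    r_mask = (rows % 2 == 0) & (cols % 2 == 1)  # red sites
    b_mask = (rows % 2 == 1) & (cols % 2 == 0)  # blue sites
    # Convolve each zero-filled plane; known samples pass through unchanged.
    g = convolve2d(np.where(g_mask, bayer, 0.0), KG, mode="same")
    r = convolve2d(np.where(r_mask, bayer, 0.0), KRB, mode="same")
    b = convolve2d(np.where(b_mask, bayer, 0.0), KRB, mode="same")
    return r, g, b
```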

Hamilton and Adams have two methods, [9] and [8], the first of which is more commonly used due to its lower complexity. The two methods vary slightly in the green channel interpolation and use the same process for red and blue interpolation. The green channel interpolation in both cases uses the horizontal and vertical gradients to determine how the current pixel is interpolated. The interpolated green pixels and the known red (blue) pixels are used to interpolate the red (blue) channel.

A non-traditional demosaicing algorithm is presented in [6], which outputs a YCbCr 4:2:0 image instead of an RGB image. This is beneficial for certain applications since YCbCr 4:2:0 can be processed directly without additional color-space transforms and the bits per pixel are reduced compared to the YCbCr 4:4:4 and RGB formats. The method in [6] starts by producing a green channel using [9] with the addition of a threshold term. The interpolated green values are used in the calculation of the Red-Green and Blue-Green difference channels. The difference channels are low-pass filtered to produce interpolated red and blue values. After the red/blue interpolation, chrominance and luminance are determined using the color-space conversion equations.

The Adaptive Homogeneity-Directed Demosaicing (AHDD) algorithm [7] is designed with the human visual system in mind. The input image is interpolated in both the vertical and horizontal directions to produce two new images. These interpolated images are transformed into CIELAB space, where luminance and color neighborhoods are calculated. The neighborhoods are then used to calculate horizontal and vertical homogeneity maps. A single image is produced by selecting the vertically (horizontally) interpolated value when the average vertical (horizontal) homogeneity in a region is greater than the average horizontal (vertical) homogeneity in the same region. A median filter is then iteratively applied to the image to reduce artifacts and produce the final image.

CHAPTER III

Proposed Algorithm

Generally, demosaicing algorithms designed for the Bayer pattern output an image in RGB format. The RGB image is then often converted into a different color-space, such as YCbCr, because the decorrelation of the channels improves compression performance. For many applications, once the image is transformed into the YCbCr color-space the chrominance channels (Cb and Cr) are sub-sampled while the luminance channel (Y) is left unchanged. By performing the subsampling, the image being compressed is only 1.5 times the size of the original mosaiced image instead of 3 times the size. The lost data in the chrominance channels does little to affect the perceived image quality since the human visual system is less sensitive to high chromatic frequencies than high luminance frequencies [1]. The proposed algorithm achieves a YCbCr 4:2:0 output without first calculating a full RGB image.

3.1 Proposed Demosaicing Algorithm

Figure 3.1: Green-red starting pattern

The proposed method for demosaicing operates on a Bayer pattern image, where each 2x2 block of pixels consists of two green pixels, one red pixel, and one blue pixel. The algorithm is only defined for a Green-Red Bayer starting pattern as shown in Figure 3.1. Additional starting patterns can easily be used by adding a column of padding on the left and right for a Red-Green pattern, adding a row of padding on the top and bottom for a Blue-Green pattern, or adding padding on all sides for a Green-Blue pattern. The padding used to offset the starting pattern is cropped off of the final output in order to maintain an image size consistent with the original image.
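A sketch of the starting-pattern offsets just described; the pattern names (GRBG for Green-Red, RGGB for Red-Green, BGGR for Blue-Green, GBRG for Green-Blue) and the use of NumPy's reflective padding are illustrative assumptions. Reflection without edge duplication is used because it keeps the Bayer phase consistent across the added border.

```python
import numpy as np

# Padding offsets (rows, cols) that shift each Bayer phase to the
# Green-Red start assumed by the algorithm; the added border is
# cropped from the final output to preserve the image size.
OFFSETS = {"GRBG": (0, 0), "RGGB": (0, 1), "BGGR": (1, 0), "GBRG": (1, 1)}

def to_green_red_start(bayer, pattern):
    r, c = OFFSETS[pattern]
    return np.pad(bayer, ((r, r), (c, c)), mode="reflect")
```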

The demosaicing method described in [6] is used as the framework for generating a YCbCr 4:2:0 output directly from a Bayer pattern image. The proposed method achieves an increase in speed and reduced complexity compared to the method in [6] by using simplified calculations for red, green, and blue interpolation. Figure 3.2 shows pixel locations for the current region of interest (ROI) in a Bayer pattern image. The ROI moves along the image in order during the demosaicing process. In order to get the highest quality output from the demosaicing algorithm, the input image is first padded with four pixels of symmetric extension on all sides [13].

The proposed method incorporates the color-space transforms given by

\[
\begin{aligned}
Y_i &= 0.2126 R_i + 0.7152 G_i + 0.0722 B_i \\
Cb_i &= -0.1172 R_i - 0.3942 G_i + 0.5114 B_i \\
Cr_i &= 0.5114 R_i - 0.4645 G_i - 0.04689 B_i,
\end{aligned}
\tag{3.1}
\]

The red (R1) and blue (B1) pixels at location G1 are given by

\[ R_1 = \frac{R_8 + R_2}{2} + \frac{-G_8 + 2G_1 - G_2}{2} \tag{3.3} \]

and

\[ B_1 = \frac{B_6 + B_3}{2} + \frac{-G_6 + 2G_1 - G_3}{2}, \tag{3.4} \]

where G2, G3, G6, and G8 come from the interpolated green channel. The red and blue interpolation equations follow the calculations done in [9] for red and blue interpolation. Since the red and blue interpolation only relies on immediately neighboring pixels, a reduced region of interest is required for the calculation compared to the corresponding step in [6]. This allows one row less of delay for the red and blue interpolation and reduces the amount of padding required for the image.
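Written out as code, Equations 3.3 and 3.4 are a two-sample average plus a green-gradient correction. This is a direct transcription; the function name is illustrative.

```python
def interp_red_blue(R8, R2, B6, B3, G1, G2, G3, G6, G8):
    """Red and blue estimates at the G1 site (Equations 3.3 and 3.4).

    G2, G3, G6, and G8 come from the interpolated green channel.
    """
    R1 = (R8 + R2) / 2.0 + (-G8 + 2.0 * G1 - G2) / 2.0
    B1 = (B6 + B3) / 2.0 + (-G6 + 2.0 * G1 - G3) / 2.0
    return R1, B1
```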

3.1.3 Calculating Chrominance

The chrominance samples are directly calculated by plugging R1, G1, and B1 into Equation 3.1, resulting in

\[
\begin{aligned}
Cb_1 &= -0.1172 R_1 - 0.3942 G_1 + 0.5114 B_1 \\
Cr_1 &= 0.5114 R_1 - 0.4645 G_1 - 0.0469 B_1.
\end{aligned}
\tag{3.5}
\]

3.1.4 Calculating Luminance

The luminance calculation changes depending on which location out of a 2x2 cell of pixels is being calculated. The most straightforward of the luminance calculations is the Y1 calculation, which, like the chrominance calculation, simply plugs the R1, G1, and B1 values into Equation 3.1, resulting in

\[ Y_1 = 0.2126 R_1 + 0.7152 G_1 + 0.0722 B_1. \tag{3.6} \]

The calculation for Y2 cannot be done as easily since B2 is not available. Instead, the reverse transform equation for blue from Equation 3.2 is used to obtain

\[ Y_2 = 0.2126 R_2 + 0.7152 G_2 + 0.0722 (Y_2 + 1.8142\, Cb_2). \tag{3.7} \]

Although this solves the problem of the missing B2 term, it is only a partial solution since Y2 now exists on both sides of the equation and a Cb2 term has been added. The 0.0722 constant is multiplied through so that the Y2 terms can be collected on the left hand side of the equation, and Cb2 is replaced with the linear interpolation of Cb1 and Cb9 to obtain

\[ (1 - 0.0722)\, Y_2 = 0.2126 R_2 + 0.7152 G_2 + 0.1310 \left( \frac{Cb_1 + Cb_9}{2} \right), \tag{3.8} \]

where Cb9 comes from the chrominance calculation in the neighboring region of interest. Dividing both sides of the equation by (1 − 0.0722) and multiplying out the coefficients yields the final equation for Y2,

\[ Y_2 = 0.2291 R_2 + 0.7709 G_2 + 0.0706 (Cb_1 + Cb_9). \tag{3.9} \]

The luminance sample Y3 is determined using the same process as the calculation for Y2, where R3 is replaced by the red calculation in Equation 3.2 and Cr3 is the linear interpolation of Cr1 and Cr14. Following the described process, the equation for Y3 is found to be

\[ Y_3 = 0.2079 (Cr_1 + Cr_{14}) + 0.9083 G_3 + 0.0917 B_3. \tag{3.10} \]

The last luminance sample in the 2x2 block, Y4, follows a similar approach to those taken for the Y2 and Y3 calculations. Since neither R4 nor B4 is available, the Ri and Bi from Equation 3.2 must be utilized instead to give

\[ Y_4 = 0.2126 (Y_4 + 1.5396\, Cr_4) + 0.7152 G_4 + 0.0722 (Y_4 + 1.8142\, Cb_4). \tag{3.11} \]

The final equation for Y4 is produced by substituting in bilinearly interpolated values of Cr4 and Cb4. After performing the substitution and collecting the Y4 terms,

\[ Y_4 = 0.114 (Cr_1 + Cr_9 + Cr_{14} + Cr_{16}) + G_4 + 0.046 (Cb_1 + Cb_9 + Cb_{14} + Cb_{16}) \tag{3.12} \]

is obtained, where Cb9, Cb14, Cb16, Cr9, Cr14, and Cr16 come from neighboring regions of interest.
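Collecting Equations 3.5, 3.6, 3.9, 3.10, and 3.12, one 2x2 output cell can be sketched as a single function. The neighbouring chrominance samples (subscripts 9, 14, 16) are assumed to be available from adjacent regions of interest, as in the text; the function name is illustrative.

```python
def cell_ycbcr(R1, G1, B1, R2, G2, G3, B3, G4,
               Cb9, Cb14, Cb16, Cr9, Cr14, Cr16):
    """One 2x2 output cell of the proposed algorithm."""
    Cb1 = -0.1172 * R1 - 0.3942 * G1 + 0.5114 * B1     # Eq. 3.5
    Cr1 = 0.5114 * R1 - 0.4645 * G1 - 0.0469 * B1      # Eq. 3.5
    Y1 = 0.2126 * R1 + 0.7152 * G1 + 0.0722 * B1       # Eq. 3.6
    Y2 = 0.2291 * R2 + 0.7709 * G2 + 0.0706 * (Cb1 + Cb9)        # Eq. 3.9
    Y3 = 0.2079 * (Cr1 + Cr14) + 0.9083 * G3 + 0.0917 * B3       # Eq. 3.10
    Y4 = (0.114 * (Cr1 + Cr9 + Cr14 + Cr16) + G4
          + 0.046 * (Cb1 + Cb9 + Cb14 + Cb16))                   # Eq. 3.12
    return Y1, Y2, Y3, Y4, Cb1, Cr1
```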

3.2 Results

The proposed algorithm is compared to several other algorithms in terms of image quality, algorithm complexity, and total execution time. Figure 3.4 shows the Kodak image set [5], which is used as the test image set because it is a high quality digitization of film photography. A reference YCbCr 4:4:4 image is created directly from each image in the test set, then the original images are sampled with a simulated Bayer CFA to produce images for demosaicing. The mosaiced images are processed with the demosaicing methods in [6], [8], [9], [7], [4], bilinear demosaicing, and the proposed method. Since the proposed algorithm and the method in [6] produce a YCbCr 4:2:0 output, they are converted to YCbCr 4:4:4 via bilinear interpolation of the chrominance channels. The remaining algorithms produce a full RGB output and are simply converted directly to YCbCr 4:4:4 using Equation 3.1 in Section 3.1.

Figure 3.4: Kodak image set [5]

Once all of the demosaiced images are in the same format as the reference images, it is possible to perform a full reference image quality comparison. Peak Signal-to-Noise Ratio (PSNR) is used as one of the quality metrics for this comparison. The calculation for PSNR is given by

\[ \mathrm{PSNR} = 20 \log_{10}\!\left( \frac{2^n - 1}{\sqrt{\mathrm{MSE}}} \right), \tag{3.13} \]

where the Mean Squared Error (MSE) is

\[ \mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ f(i,j) - F(i,j) \right]^2, \tag{3.14} \]

f is a channel of the reference image, F is the corresponding channel in the demosaiced image, M is the image width, N is the image height in pixels, and n is the bit-depth. The calculation is performed independently for each of the three YCbCr channels.

Figure 3.5 shows that [6] actually produces the highest PSNR on average for luminance. Both of the chrominance components have the highest peak signal-to-noise ratio when demosaicing is done using the method in [7], which is expected since the algorithm is optimized for color consistency. The proposed method averages a luminance PSNR within 0.50 dB of the leading method for luminance. The chrominance is slightly further behind the leader, at 1.65 dB lower for the Cb component and 1.06 dB lower for the Cr component. Although the proposed algorithm seems to suffer in chrominance, some degradation of quality is expected since the calculated chrominance channels only contain one fourth of the chrominance data of other methods prior to up-sampling to YCbCr 4:4:4 format.

Figure 3.5: Average PSNR in dB

An alternate quality metric, the structural similarity index (SSIM) [14], is also used because, although PSNR is a simple method of comparison, a high decibel PSNR does not necessarily correspond to high visual quality. SSIM attempts to provide a metric that more closely matches perceived image quality. The structural similarity is calculated by

\[ \mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}, \tag{3.15} \]

where C1 and C2 are constants, μ_x and μ_y are the average pixel values over an 8x8 window for the reference and interpolated images respectively, σ_x² and σ_y² are the variances of the reference and interpolated images within the window, and σ_xy is the covariance in the window. The SSIM calculation is performed on the Y, Cb, and Cr channels independently for each image. A weighted average of 0.7Y, 0.15Cb, and 0.15Cr is taken as in [15] to get the final SSIM index number.

Figure 3.6: Average SSIM

The weighted average SSIM seen in Figure 3.6 shows that the top three methods are [6], [7], and the proposed method, respectively. The proposed method is only 0.004 SSIM points from the leading method.
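A sketch of the per-channel SSIM of Equation 3.15 over non-overlapping 8x8 windows, together with the weighted combination described above. The constants C1 and C2 are not stated in the thesis; the values below follow a common choice for 8-bit images and are an assumption.

```python
import numpy as np

def ssim_channel(x, y, C1=(0.01 * 255) ** 2, C2=(0.03 * 255) ** 2, win=8):
    """Mean SSIM over non-overlapping win x win windows (Equation 3.15)."""
    h, w = (d - d % win for d in x.shape)   # trim to whole windows
    vals = []
    for i in range(0, h, win):
        for j in range(0, w, win):
            a = x[i:i + win, j:j + win].astype(np.float64)
            b = y[i:i + win, j:j + win].astype(np.float64)
            mu_a, mu_b = a.mean(), b.mean()
            var_a, var_b = a.var(), b.var()
            cov = ((a - mu_a) * (b - mu_b)).mean()
            vals.append(((2 * mu_a * mu_b + C1) * (2 * cov + C2)) /
                        ((mu_a**2 + mu_b**2 + C1) * (var_a + var_b + C2)))
    return np.mean(vals)

def weighted_ssim(ssim_y, ssim_cb, ssim_cr):
    """Final index: 0.7*Y + 0.15*Cb + 0.15*Cr, as in [15]."""
    return 0.7 * ssim_y + 0.15 * ssim_cb + 0.15 * ssim_cr
```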

Figure 3.7: Sample outputs from left to right, top to bottom: Original, Proposed, Method in [6], Method in [7], Method in [8], Method in [9], Method in [4], Bilinear

Figure 3.7 shows a subsection of the Lighthouse image demosaiced using several different methods. This image is selected as an illustration to show the various algorithms’ responses to high spatial frequency content. The adaptive homogeneity-directed demosaicing (AHDD) method in [7] has almost no visible aliasing along the fence, while the lower complexity algorithms have visible color artifacts.

High visual quality is an important feature of any demosaicing algorithm, but in many applications complexity must also be taken into consideration. Algorithm complexity for the various demosaicing algorithms tested is determined by the number of operations per pixel. The operations are broken down into eight different categories: additions, shifts, multiplies, compares, absolute values, cube roots, eight element medians, and four element medians. The last three categories were not broken down into simpler operations because that would involve a specific method of doing the calculation when there are many possibilities. The complexity is based on the operations required to go from a Bayer image to a YCbCr output, where algorithms that natively produce an RGB output are converted to YCbCr 4:4:4 and those that produce a YCbCr 4:2:0 output natively are left as YCbCr 4:2:0.

Table 3.1: Operations per Pixel

Algorithm     | Add  | Shift | Multiply | Compare | Absolute Value | Cube Root | Median 8 | Median 4
Proposed      | 11.5 | 4.5   | 4.25     | 0       | 0              | 0         | 0        | 0
Method in [6] | 23.5 | 6.5   | 4.25     | 1       | 2              | 0         | 0        | 0
Method in [7] | 97   | 7     | 51       | 29      | 4              | 6         | 4        | 4
Method in [8] | 40.5 | 14    | 9.5      | 3.5     | 4              | 0         | 0        | 0
Method in [9] | 30   | 10    | 9.5      | 1.5     | 3              | 0         | 0        | 0
Method in [4] | 26   | 10.5  | 12.5     | 0       | 0              | 0         | 0        | 0
Bilinear      | 12   | 2     | 9        | 0       | 0              | 0         | 0        | 0

Table 3.2: Average Execution Time

Method   | Seconds
Proposed | 0.60
[6]      | 1.20
[7]      | 27.44
[8]      | 0.97
[9]      | 1.43
[4]      | 1.33
Bilinear | 0.41

Table 3.1 gives an overall breakdown of algorithm complexity. From the table it is apparent that the proposed method of demosaicing is one of the least complex algorithms in terms of required operations. While some of the reduced complexity seen in the proposed method is due to converting to YCbCr 4:2:0 rather than YCbCr 4:4:4, if the desired output is YCbCr 4:2:0 then the relative complexity is decreased further since additional sub-sampling would need to be done with the other methods.

Instead of simply relying on the number of operations to determine complexity, a real-world benchmark is also performed by implementing the algorithms in MATLAB and recording the average execution time. While the results of this type of test are very implementation specific, they do contribute a rough estimate of relative speed. Table 3.2 indicates the proposed method is significantly faster than all but the bilinear interpolation method.

CHAPTER IV

Transition from Floating Point to Fixed Point Multiplication

The algorithm proposed in Chapter III performs multiple floating point multiplications per pixel as part of the embedded color-space transform. This chapter investigates the possibility of replacing the floating point calculations with shifts, additions, and integer multiplications, because floating point operations are more complex and difficult to implement in hardware. Section 4.1 briefly describes how floating point numbers work and the issues associated with using floating point precision. Section 4.2 covers the conversion to integer based multiplications and the decomposition of integer multiplies into shifts and additions. Section 4.3 selects the optimal scaling factor for converting floating point multipliers to integer multipliers and compares the algorithm in Chapter III using floating point calculations with the same algorithm done as fixed point operations.

4.1 Floating Point Overview

Each floating point number is made up of three separate parts: a sign bit, an exponent, and a mantissa. A single precision floating-point number following the IEEE 754 standard has a sign bit, an 8-bit exponent, and a 23-bit mantissa, giving a total of 32 bits per number [16]. The mantissa is only the fractional part of the normalized number being represented, with the 1 before the decimal point being implied. A positive exponent corresponds to a left shift of the mantissa and a negative exponent corresponds to a right shift of the mantissa to get the base 10 representation of the floating point number.

Typically for demosaicing applications, the desired result is an integer based output image. To achieve this, the fractional portion of any output is either rounded to the nearest integer or converted to an integer through a floor or ceiling function. This means that a precise fractional representation of the output is unnecessary as long as the intermediate calculations retain enough precision to correctly calculate the integer part of the number. Therefore, the added complexity and bit-depth required for floating point calculations can likely be avoided with little impact on the final result.
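The three fields can be inspected directly with Python's struct module; this small sketch is illustrative and not part of the thesis.

```python
import struct

def float32_fields(x):
    """Unpack a float into the IEEE 754 single-precision fields [16]."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF       # 23-bit fraction, leading 1 implied
    return sign, exponent - 127, mantissa

print(float32_fields(0.75))  # (0, -1, 4194304): 0.75 = +1.5 * 2**-1
```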

4.2 Conversion to Integer Multiplications

In order to convert the floating point multiplies described in Chapter III to fixed point integer multiplies, the floating point multipliers are scaled by 2^n and floored to create new integer multipliers. The new integer multiplier is used in the calculation with the original multiplicand, and the product is divided by 2^n. The equation for this process can be seen in Equation 4.1, where P is the product, M is the original floating point multiplier, S is the scaling constant, and N is the integer multiplicand.

\[ P = \left\lfloor \frac{\lfloor M \cdot S \rfloor \cdot N}{S} \right\rfloor \tag{4.1} \]

Since the multiplier is a constant, the calculation ⌊M · S⌋ is done only once for each multiplier and the result is hard-coded into the calculation. By using an S that is a power of two, the division by S can be simplified to a binary shift. Equation 4.6 shows an example of the above process applied to the calculation for Y1, where R1 = 137, G1 = 140, B1 = 120, and the scaling factor S = 64.

\[
\begin{aligned}
Y_1 &= 0.2126 R_1 + 0.7152 G_1 + 0.0722 B_1 && (4.2) \\
Y_1 &= \left\lfloor \frac{\lfloor 0.2126 \cdot 64 \rfloor \cdot 137}{64} \right\rfloor + \left\lfloor \frac{\lfloor 0.7152 \cdot 64 \rfloor \cdot 140}{64} \right\rfloor + \left\lfloor \frac{\lfloor 0.0722 \cdot 64 \rfloor \cdot 120}{64} \right\rfloor && (4.3) \\
Y_1 &= \left\lfloor \frac{14 \cdot 137}{64} \right\rfloor + \left\lfloor \frac{46 \cdot 140}{64} \right\rfloor + \left\lfloor \frac{5 \cdot 120}{64} \right\rfloor && (4.4) \\
Y_1 &= 29 + 100 + 9 && (4.5) \\
Y_1 &= 138 && (4.6)
\end{aligned}
\]

The same example problem produces a result of Y1 = 137.9182 when calculated using floating point calculations. Integer multiplication with a constant multiplier and variable multiplicand can easily be broken down into shifts and additions. Equation 4.7 illustrates the process of multiplying 46 and 140, or in binary 101110b and 10001100b respectively.

\[
\begin{array}{r}
101110 \\
\times\ 10001100 \\
\hline
10001100\phantom{0} \\
10001100\phantom{00} \\
10001100\phantom{000} \\
+\ 10001100\phantom{00000} \\
\hline
1100100101000
\end{array}
\tag{4.7}
\]

The calculation in Equation 4.7 can be equivalently written as (N << 5) + (N << 3) + (N << 2) + (N << 1), where N is the multiplicand of 140. The required shifts come from the locations of the 1s in the binary representation of the multiplier. The result of Equation 4.7 must be divided by the scaling factor of 64 used in the example to complete the fixed point conversion of the floating point calculation.

4.3 Results

MatLab code for the algorithm is written using fixed point multiplies to determine the best scaling factor for the multiplier. The scaling factors tested are 32, 64, 128, 256, 512, and 1024. 28 The range is limited to these values because anything below 32 results in a multiplier of zero and anything above 1024 produces unnecessary precision that is lost when dividing the product by the scaling factor.

Figure 4.1: PSNR vs. scaling factor

Figure 4.1a and Figure 4.1b show that the best scaling factor is 64, while Figure 4.1c shows a scaling factor of 128 as the best. Based on this information it is determined that scaling the floating point multipliers by 64 and then flooring them produces the best results. By using a scaling factor of 64 for the conversion to fixed-point multiplication, the algorithm suffers a PSNR loss of approximately 0.5 dB in luminance and 1 dB in chrominance compared to the floating-point version of the algorithm. Figure 4.2 shows a PSNR comparison of the floating-point results and the fixed-point results for the Y, Cb, and Cr channels over the Kodak image set.

Figure 4.2: PSNR comparison of fixed point and floating point results

Figure 4.3 gives the structural similarity index using the same image set, where the luminance SSIM is given a weight of 0.7 and each of the chrominance channels is given a weight of 0.15.

The quality metrics show that demosaicing using fixed-point calculations in the proposed algorithm from Chapter III is a viable option despite the loss in quality. Even with the loss in quality due to fixed point calculations, the proposed algorithm maintains a higher chrominance PSNR than the floating-point calculated methods in [4], [9], [8], and bilinear demosaicing. The luminance PSNR of the algorithm using fixed-point multiplies drops slightly, falling two spots in the ranking when compared to the floating-point algorithms. The structural similarity also drops slightly but is still competitive with the algorithms using floating-point calculations.

Figure 4.3: SSIM comparison of fixed point and floating point results

CHAPTER V

VHDL Hardware Implementation

The software implementations of the proposed algorithm have shown the merits of the algorithm as a low complexity demosaicing solution when the desired output is YCbCr 4:2:0. This chapter covers the hardware design implemented based on the fixed point multiplication version of the proposed demosaicing algorithm. The modules that make up the design are discussed in Section 5.1. Section 5.2 provides results from the design in terms of maximum operating frequency, logic utilization, and throughput.

5.1 Hardware Design

The hardware is designed to read in raw Bayer image data from a First-In-First-Out memory (FIFO) and output the Y, Cb, and Cr data to three separate FIFOs. The necessary padding is performed by the hardware, so the dimensions of the input raw image are the same as the dimensions of the output image. The hardware is written in VHDL and has been implemented on a GiDEL ProceIV board. Figure 5.1 illustrates the connections between the Stratix IV EP4SE530H35C2 FPGA and the supporting hardware, such as the memory banks and the PCIe bus.

Figure 5.1: GiDEL ProceIV Block Diagram from [10]

GiDEL intellectual property (IP) is used for the memory controller and PCIe bus communication, allowing the design to focus on just the demosaicing aspect of the hardware design. The proposed demosaicing algorithm is broken down into six modules. The first module, described in Section 5.1.1, is a top level which handles padding the left and right sides of the image as well as the interface between the GiDEL IP and the demosaicing hardware. The next module, which is presented in Section 5.1.2, controls the data-flow between interpolation modules and pads the top and bottom of the image. Section 5.1.3 covers the module designed to interpolate green data. The red and blue interpolation is described in Section 5.1.4. The chrominance calculation is combined with the calculation for the luminance in a Green-Red row to form the module in Section 5.1.5. The remaining luminance samples, located at Blue-Green rows, are calculated by the module in Section 5.1.6. Logic usage, maximum operating frequency (FMAX), and power requirements are discussed in Section 5.2.

5.1.1 Demosaicing Top Level

The top level of the demosaicing hardware is the interface between the GiDEL IP and the rest of the demosaicing hardware. The top level module is also responsible for mirroring data on the left and right sides of the image, outputting the demosaiced data to the correct FIFOs, and swapping the endianness of the data. The state-machine in this module cycles through five states in the following order: init, first row, steady state, last row, and frame loaded. The init state resets all signals to a known state; first row and last row handle special cases for the first and last rows of raw data respectively; steady state produces a continuous stream of left and right mirrored Bayer data; and frame loaded waits until the demosaicing process is done, then sets a done flag.

5.1.2 Demosaicing State-Machine

The interactions between the various interpolation modules of the hardware design are governed by a state machine in the demosaic control module. This module performs padding on the top and bottom of the image, where the top is padded by three rows of mirrored Bayer data and the bottom is padded by four rows of mirrored Bayer data. State transitions occur at the end of rows, on reset n going low, or when the input fifo becomes empty. Although none of the interpolation modules require more than five rows of Bayer data as input, this module contains six FIFOs for rows of Bayer data in order to perform the mirroring at the top and bottom of the image.

Figure 5.2 provides a diagram of the state-machine used in this module. The init state resets all counters and flags to a known state so that processing may begin. States load 0 through load 3 strategically load five of the raw Bayer FIFOs with three rows of mirrored data and the first two rows of the image. States load 4 and load 5 enable green interpolation and feed the mirrored data back into the lower FIFOs rather than reading new data. State load 6 begins reading new Bayer data from the demosaic top level module. The load 7 state begins the red/blue interpolation along with the chrominance and Y1 and Y2 interpolation. The next state, load 8, stops red and blue interpolation because a blue-green row is currently being processed. The final state before reaching steady state is load 9, which enables Y3 and Y4 interpolation along with re-enabling the red and blue interpolation. The state machine then alternates between steady state GB and steady state RG, which enable and disable interpolation modules based on whether the row has a blue-green pattern or a green-red pattern. Once the number of rows in the image minus six have been processed, states out 0 through out 5 perform padding for the bottom of the image. After the last row of padded data has been loaded into a FIFO, the state-machine goes into the done state, which waits until the interpolation is finished and then sets a done flag. All of the states that require reading new raw Bayer data transition into a fifo empty state if data is not available and return when data becomes available.

Figure 5.2: State-machine in demosaic control module

5.1.3 Green Interpolation

The green interpolation is handled by a module named interp green. This module performs the equivalent of a 2D convolution of the green kernel with the raw image, where only the interpolated green at red and blue pixel locations is retained. There is a 6 clock delay between the arrival of the first two bytes of Bayer data and the first interpolated green value output. Assuming a continuous stream of Bayer data, a continuous stream of interpolated green values will be output for each row.

At the end of a Green-Red row there is a 1 clock delay before the outputs for the next row start, and after a Blue-Green row there is a 3 clock delay before the next row of output data starts. The differing delays at the end of rows are due to the horizontal offset between the red and blue pixel samples in the Bayer pattern.

The calculation of the green channel is essentially the same whether the current row being processed is a green-red row or a blue-green row, with the only difference being how the raw Bayer data is split up before the calculation begins. Figure 5.3 shows the processing flow when a blue-green row is being processed.

Figure 5.3: Green interpolation module when processing a blue-green row

Figure 5.4 illustrates the same calculation adjusted to work on a green-red row. While the blue-green rows and green-red rows are shown as two separate calculations, in reality the process is combined and only the raw Bayer data is manipulated to produce the proper offset.

Figure 5.4: Green interpolation module when processing a green-red row

5.1.4 Red/Blue Interpolation

Red and blue interpolation is performed by the interp rb module using the output data from the green interpolation module and the raw Bayer data. The blue interpolation cannot start until three rows of green data have been interpolated, while red interpolation only needs the green interpolation results from the current row. To simplify the data flow, the red interpolation also waits for three rows of interpolated green data before it begins processing the data. Figure 5.5 shows a block diagram of the internal operations performed for both red and blue interpolation. The red and blue values are interpolated at every G1 location in the unpadded region of the image, along with the first column of padding on the right and the first row of padding at the bottom. The additional red and blue values interpolated at the bottom and to the right of the image are required for later calculations of the chrominance and ultimately the last column and row of luminance in the image.

Figure 5.5: Red and blue interpolation module

5.1.5 Chrominance, Y1 and Y2 Interpolation

The chrominance can be calculated directly with G1 from the Bayer data and the interpolated red and blue data. This calculation is combined with the calculation for Y1 and Y2 because all of these calculations are performed on the same row of data, with a fixed amount of delay between when the chrominance can be calculated and when the luminance can be calculated. Figure 5.6 depicts the first segment of the module, which calculates the chrominance output values. The calculated chrominance is used in the second segment of the module shown in Figure 5.7, which calculates the luminance at locations Y1 and Y2. The two segments of the module operate simultaneously, with the second segment being designed in such a way that it does not require the chrominance samples until their calculation has completed in the first segment.

5.1.6 Y3 and Y4 Interpolation

The Y3 and Y4 interpolation module is the last module in the processing chain and directly relies on data from the previous modules. The interpolation of Y3 requires chrominance samples from above and below the Y3 location, and the Y4 calculation requires chrominance samples for the four corner pixels surrounding it. These values are obtained by delaying the start of the Y3 and Y4 interpolation by a full row and using chrominance samples from temporary FIFOs and samples coming directly out of the chrominance interpolation module. Figure 5.8 depicts the signal flow of the module.

5.2 Results

The hardware design is tested on a GiDEL ProceIV PCIe board with an Altera Stratix IV EP4SE530H35C2 field-programmable gate array (FPGA). For testing purposes, the demosaicing hardware was run using a 50MHz clock instead of pushing the limits of the maximum operating frequency (FMAX).

The hardware output is an exact match to the MATLAB software implementation of the fixed point proposed algorithm. Table 5.1 shows the various characteristics of the current hardware design in terms of logic requirements, FMAX, and memory requirements.

The processing time statistics are given in Table 5.2 based on ModelSim results running at a 200MHz clock. The test images used to generate this table are from the Kodak image set and they are 768x512 pixels.

Figure 5.6: Chrominance interpolation segment

Figure 5.7: Y1 and Y2 interpolation segment

Table 5.1: Hardware design characteristics

FMAX                      | 264.55 MHz
Combinational ALUTs       | 2,470 / 424,960
Memory ALUTs              | 0 / 212,480
Dedicated logic registers | 3,004 / 424,960
Total block memory bits   | 156,672 / 21,233,664

Table 5.2: Hardware execution time characteristics at 200MHz

First Input  | 55 ns
Last Input   | 997,115 ns
First Output | 16,210 ns
Last Output  | 1,009,470 ns
Throughput   | 24 bits/clock

The power requirement, estimated using Altera’s PowerPlay Power Analyzer Tool, is 1638.01 mW on a Stratix IV EP4SE530H35C2 with no heat sink and still air. This estimate is calculated using an input pin toggle rate of 12.5%.

Figure 5.8: Y3 and Y4 interpolation module

CHAPTER VI

Conclusion

Demosaicing is a critical step in any color filter array based digital camera processing chain.

The proposed demosaicing algorithm is shown to be a fast, low complexity option for demosaicing when a minor reduction in quality is acceptable. A 2× reduction in processing time is achieved by the proposed demosaicing method compared to the method presented in [6], which also directly demosaics to YCbCr 4:2:0.

In addition to the floating point software implementation of the proposed demosaicing algorithm, an integer based version is created. The fixed point version of the algorithm further sacrifices quality; however, it greatly reduces the complexity and size from a hardware design perspective. Based on the fixed point version of the proposed algorithm, a VHDL implementation is created which has a throughput of approximately 24 bits per clock.

6.1 Future Research

The proposed method of demosaicing has proven to be a fast, low complexity demosaicing option capable of producing reasonable quality results when applied to artificially mosaiced data which has been gamma corrected, white-balanced, and color corrected. In a typical processing chain these corrections are performed between the demosaicing step and the color-space transform, which presents a problem in relation to the proposed demosaicing method since there is no intermediate RGB stage.

The white-balancing and color correction could conceivably be embedded into the color-space conversion equations because they are generally global and linear corrections [1]. The gamma correction, as a non-linear process, cannot be addressed as simply and further research into the possibility of performing gamma correction prior to demosaicing is required.

BIBLIOGRAPHY

[1] R. Lukac, Ed., Single-Sensor Imaging: Methods and Applications for Digital Cameras, 1st ed. CRC Press, Inc., 2009.

[2] “Concepts in digital imaging - sequential color imaging systems,” http://www.olympusmicro.com/primer/digitalimaging/concepts/threepass.html, accessed: 2013-9-16.

[3] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Prentice Hall, August 2007, ch. 6.

[4] H. Malvar, L.-W. He, and R. Cutler, “High-quality linear interpolation for demosaicing of Bayer-patterned color images,” in Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP ’04). IEEE International Conference on, vol. 3, 2004, pp. iii–485–8 vol.3.

[5] “True-color kodak test images,” http://r0k.us/graphics/kodak/, accessed: 2013-3-26.

[6] C. Doutre, P. Nasiopoulos, and K. Plataniotis, “A fast demosaicking method directly producing 4:2:0 output,” Consumer Electronics, IEEE Transactions on, vol. 53, no. 2, pp. 499–505, May 2007.

[7] K. Hirakawa, “Adaptive homogeneity-directed demosaicing algorithm,” Image Processing, IEEE Transactions on, vol. 14, no. 3, pp. 360–369, March 2005.

[8] J. Adams and J. Hamilton, “Adaptive color plane interpolation in single sensor color electronic camera,” U.S. Patent 5,652,621, July 1997.

[9] ——, “Adaptive color plane interpolation in single sensor color electronic camera,” U.S. Patent 5,629,734, May 1997.

[10] ProceIV Data Book, GiDEL Ltd., May 2011.

[11] B. Bayer, “Color imaging array,” U.S. Patent 3,971,065, July 1976.

[12] ITU-R Rec. BT.709-5, Parameter values for the HDTV standards for production and international programme exchange, April 2002.

[13] S. Li and W. Li, “Shape-adaptive discrete wavelet transforms for arbitrarily shaped visual object coding,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 10, no. 5, pp. 725–743, 2000.

[14] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” Image Processing, IEEE Transactions on, vol. 13, no. 4, pp. 600–612, 2004.

[15] Z. Wang and A. Bovik, “Mean squared error: love it or leave it? - a new look at signal fidelity measures,” IEEE Signal Processing Magazine, vol. 26, no. 1, pp. 98–117, Jan. 2009.

[16] D. Harris and S. Harris, Digital Design and Computer Architecture. Elsevier, 2007, ch. 5, pp. 250–253.
