Digital Images and Video
CHAPTER 2 Digital Images and Video

Advances in ultra-high-definition and 3D-video technologies as well as high-speed Internet and mobile computing have led to the introduction of new video services. Digital images and video refer to 2D or 3D still and moving (time-varying) visual information, respectively. A still image is a 2D/3D spatial distribution of intensity that is constant with respect to time. A video is a 3D/4D spatio-temporal intensity pattern, i.e., a spatial-intensity pattern that varies with time. Another term commonly used for video is image sequence, since a video is represented by a time sequence of still images (pictures). The spatio-temporal intensity pattern of this time sequence of images is ordered into a 1D analog or digital video signal as a function of time only, according to a progressive or interlaced scanning convention.

We begin with a short introduction to human visual perception and color models in Section 2.1. Next, we present 2D digital video representations and a brief summary of current standards in Section 2.2. We introduce 3D digital video display, representations, and standards in Section 2.3. Section 2.4 provides an overview of popular digital video applications, including digital TV, digital cinema, and video streaming. Finally, Section 2.5 discusses factors affecting video quality and quantitative and subjective video-quality assessment.

Tekalp_book_COLOR.indb 53 5/21/15 7:47 PM

2.1 Human Visual System and Color

Video is mainly consumed by the human eye. Hence, many imaging system design choices and parameters, including spatial and temporal resolution as well as color representation, have been inspired by or selected to imitate the properties of human vision. Furthermore, digital image/video-processing operations, including filtering and compression, are generally designed and optimized according to the specifications of the human eye.
In most cases, details that cannot be perceived by the human eye are regarded as irrelevant and referred to as perceptual redundancy.

2.1.1 Color Vision and Models

The human eye is sensitive to the range of wavelengths between 380 nm (blue end of the visible spectrum) and 780 nm (red end of the visible spectrum). The cornea, iris, and lens comprise an optical system that forms images on the retinal surface. There are about 100-120 million rods and 7-8 million cones in the retina [Wan 95, Fer 01]. They are receptor nerve cells that emit electrical signals when light hits them. The region of the retina with the highest density of photoreceptors is called the fovea. Rods are sensitive to low-light (scotopic) levels but only sense the intensity of the light; they enable night vision. Cones enable color perception and are best in bright (photopic) light; they have bandpass spectral responses. There are three types of cones that are most sensitive to short (S), medium (M), and long (L) wavelengths, respectively. The spectral responses of S-cones peak at 420 nm, M-cones at 534 nm, and L-cones at 564 nm, with significant overlap in their spectral response ranges; their varying degrees of sensitivity over this range of wavelengths are specified by the functions m_k(\lambda), k = r, g, b, as depicted in Figure 2.1(a).

The perceived color of light f(x_1, x_2, \lambda) at spatial location (x_1, x_2) depends on the distribution of energy in the wavelength dimension \lambda. Hence, color sensation can be achieved by sampling \lambda into three levels to emulate the color sensation of each type of cone as

f_k(x_1, x_2) = \int f(x_1, x_2, \lambda) \, m_k(\lambda) \, d\lambda, \quad k = r, g, b,    (2.1)

where m_k(\lambda) is the wavelength sensitivity function (also known as the color-matching function) of the kth cone type or color sensor. This implies that the perceived color at any location (x_1, x_2) depends only on three values f_r, f_g, and f_b, which are called the tristimulus values.
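Equation (2.1) can be approximated numerically: a spectral power distribution sampled at discrete wavelengths is weighted by the three color-matching functions and summed. The Gaussian-shaped matching functions below are illustrative stand-ins for m_k(\lambda), not the actual CIE curves, and the cone peak wavelengths are taken from the text above; this is only a sketch of the integral, not a colorimetric computation.

```python
# Numerical sketch of Eq. (2.1): tristimulus values as discrete integrals
# of a spectral power distribution against three color-matching functions.
# The Gaussian curves are illustrative stand-ins for m_k(lambda),
# NOT the real CIE color-matching functions.
import math

def gaussian(lam, peak, width):
    """Illustrative bell-shaped sensitivity curve centered at `peak` nm."""
    return math.exp(-0.5 * ((lam - peak) / width) ** 2)

def tristimulus(spectrum, d_lambda=5.0):
    """spectrum: list of (wavelength_nm, energy) samples, spaced d_lambda apart.
    Returns (f_r, f_g, f_b) via a Riemann-sum approximation of Eq. (2.1)."""
    peaks = {"r": 564.0, "g": 534.0, "b": 420.0}  # cone response peaks (nm)
    f = {}
    for k, peak in peaks.items():
        f[k] = sum(e * gaussian(lam, peak, 40.0) for lam, e in spectrum) * d_lambda
    return f["r"], f["g"], f["b"]

# Example: flat (equal-energy) spectrum over the visible range 380-780 nm
spd = [(lam, 1.0) for lam in range(380, 781, 5)]
fr, fg, fb = tristimulus(spd)
print(fr, fg, fb)
```

Note that even for an equal-energy spectrum the three values differ, because each matching function weights the spectrum differently and the blue curve is partially clipped by the 380 nm limit of the visible range.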
It is also known that the human eye has a secondary processing stage whereby the R, G, and B values sensed by the cones are converted into a luminance and two color-difference (chrominance) values [Fer 01]. The luminance Y is related to the perceived brightness of the light and is given by

Y(x_1, x_2) = \int f(x_1, x_2, \lambda) \, l(\lambda) \, d\lambda,    (2.2)

where l(\lambda) is the International Commission on Illumination (CIE) luminous efficiency function, depicted in Figure 2.1(b), which shows the contribution of energy at each wavelength to a standard human observer's perception of brightness. Two chrominance values describe the perceived color of the light. Color representations for color image processing are further discussed in Section 2.2.3.

Figure 2.1 Spectral sensitivity: (a) CIE 1931 color-matching functions for a standard observer with a 2-degree field of view, where the curves x̄, ȳ, and z̄ may represent m_r(\lambda), m_g(\lambda), and m_b(\lambda), respectively, and (b) the CIE luminous efficiency function l(\lambda) as a function of wavelength \lambda.

Now that we have established that the human eye perceives color in terms of three component values, the next question is whether all colors can be reproduced by mixing three primary colors. The answer is yes in the sense that most colors can be realized by mixing three properly chosen primary colors. Hence, inspired by human color perception, digital representation of color is based on the tri-stimulus theory, which states that all colors can be approximated by mixing three additive primaries, which are described by their color-matching functions.
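Once a scene has been reduced to R, G, B tristimulus values, the analog of Eq. (2.2) is a weighted sum of the three components, with weights reflecting the luminous efficiency of each primary. As a concrete illustration, the sketch below uses the Rec. 709/sRGB luminance coefficients; these belong to one particular color space (an assumption here, since the chapter has not yet fixed a standard).

```python
# Luminance as a weighted sum of linear R, G, B components.
# The weights are the Rec. 709 / sRGB luminance coefficients, used here
# only to illustrate how Eq. (2.2) reduces color to a single brightness
# value; other color spaces use different weights.
def luminance(r, g, b):
    """r, g, b: linear-light components in [0, 1]; returns Y in [0, 1]."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

print(luminance(1.0, 1.0, 1.0))  # white -> 1.0
print(luminance(0.0, 1.0, 0.0))  # green contributes most to brightness
print(luminance(0.0, 0.0, 1.0))  # blue contributes least
```

The dominance of the green weight mirrors the shape of l(\lambda) in Figure 2.1(b), which peaks in the middle (green) portion of the visible spectrum.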
As a result, colors are represented by triplets of numbers, which describe the weights used in mixing the three primaries. All colors that can be reproduced by a combination of three primary colors define the color gamut of a specific device. There are different choices for selecting primaries based on additive and subtractive color models. We discuss the additive RGB and subtractive CMYK color spaces and color management in the following. However, an in-depth discussion of color science is beyond the scope of this book, and interested readers are referred to [Tru 93, Sha 98, Dub 10].

RGB and CMYK Color Spaces

The RGB model, inspired by human vision, is an additive color model in which red, green, and blue light are added together to reproduce a variety of colors. The RGB model applies to devices that capture and emit colored light, such as digital cameras, video projectors, LCD/LED TV and computer monitors, and mobile phone displays. Alternatively, devices that produce materials that reflect light, such as color printers, are governed by the subtractive CMYK (Cyan, Magenta, Yellow, Black) color model. Additive and subtractive color spaces are depicted in Figure 2.2. RGB and CMYK are device-dependent color models; i.e., different devices detect or reproduce a given RGB value differently, since the response of color elements (such as filters or dyes) to individual R, G, and B levels may vary among manufacturers. Therefore, the RGB color model itself does not define absolute red, green, and blue (hence, the result of mixing them) colorimetrically. When the exact chromaticities of the red, green, and blue primaries are defined, we have a color space. There are several color spaces, such as CIERGB, CIEXYZ, or sRGB. CIERGB and CIEXYZ are the first formal color spaces, defined by the CIE in 1931.
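The difference between additive and subtractive mixing can be illustrated with 8-bit component arithmetic. This is only a toy sketch: the clamping convention and the ink model are simplifying assumptions, not part of any color standard.

```python
# Additive (light-emitting) vs. subtractive (ink-on-paper) mixing,
# illustrated with 8-bit RGB triplets. Clamping at 255 / 0 is a
# simplifying assumption for this sketch, not a standardized model.
def add_mix(c1, c2):
    """Additive mixing: superimposing two light sources."""
    return tuple(min(a + b, 255) for a, b in zip(c1, c2))

def sub_mix(c1, c2):
    """Subtractive mixing: each ink absorbs light the other reflected."""
    return tuple(max(a + b - 255, 0) for a, b in zip(c1, c2))

RED, GREEN = (255, 0, 0), (0, 255, 0)
YELLOW, CYAN = (255, 255, 0), (0, 255, 255)

print(add_mix(RED, GREEN))    # red + green light -> yellow (255, 255, 0)
print(sub_mix(YELLOW, CYAN))  # yellow + cyan ink -> green (0, 255, 0)
```

The two outputs reproduce the overlaps shown in Figure 2.2: light primaries add toward white, while inks subtract toward black.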
Since display devices can only generate non-negative primaries, and an adequate amount of luminance is required, there is, in practice, a limitation on the gamut of colors that can be reproduced on a given device. The color characteristics of a device can be specified by its International Color Consortium (ICC) profile.

Figure 2.2 Color spaces: (a) additive color space and (b) subtractive color space.

Color Management

Color management must be employed to generate the exact same color on different devices. The device-dependent color values of the input device, given its ICC profile, are first mapped to a standard device-independent color space, sometimes called the Profile Connection Space (PCS), such as CIEXYZ. They are then mapped to the device-dependent color values of the output device, given the ICC profile of the output device. Hence, an ICC profile is essentially a mapping from a device color space to the PCS and from the PCS to a device color space. Suppose we have particular RGB and CMYK devices and want to convert the RGB values to CMYK. The first step is to obtain the ICC profiles of the devices concerned. To perform the conversion, each (R, G, B) triplet is first converted to the PCS using the ICC profile of the RGB device. Then, the PCS values are converted to the C, M, Y, and K values using the profile of the second device. Color management may be side-stepped by calibrating all devices to a common standard color space, such as sRGB, which was developed by HP and Microsoft in 1996.
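For contrast with the profile-based pipeline just described, the sketch below shows the common device-naive RGB-to-CMYK formula (one-minus-RGB with black extraction). It ignores ICC profiles and the PCS entirely, so it cannot reproduce exact colors across devices; it only illustrates why C, M, Y act as complements of R, G, B and why a separate black (K) component is extracted.

```python
# Device-naive RGB -> CMYK conversion: the one-minus-RGB formula with
# black (K) extraction. Real color management maps through ICC profiles
# and the PCS instead; this sketch ignores device characteristics.
def rgb_to_cmyk(r, g, b):
    """r, g, b in [0, 1]; returns (c, m, y, k), each in [0, 1]."""
    k = 1.0 - max(r, g, b)      # black ink replaces the common gray part
    if k == 1.0:                # pure black: avoid division by zero
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)

print(rgb_to_cmyk(1.0, 0.0, 0.0))  # red -> (0.0, 1.0, 1.0, 0.0)
print(rgb_to_cmyk(0.0, 0.0, 0.0))  # black -> (0.0, 0.0, 0.0, 1.0)
```

Red maps to magenta plus yellow, matching the subtractive overlap in Figure 2.2(b), and pure black is rendered entirely with the K channel.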