ECE 634 – Digital Systems Spring 2021

Fengqing Maggie Zhu, Assistant Professor of ECE, MSEE 334

Video Basics

ECE634 – Spring 2021, Jan 19, 2021

Outline

• Color Perception
• Video Capture and Display
• Analog and Digital Video

Color Perception

Light Sources

• Illuminating sources:
  – Emit light (e.g., the sun, light bulbs, TV monitors)
  – Perceived color depends on the emitted frequencies
  – Follow the additive rule
    • R+G+B=White
• Reflecting sources:
  – Reflect incoming light (e.g., color dyes, matte surfaces, cloth)
  – Perceived color depends on the reflected frequencies (= emitted freq. - absorbed freq.)
  – Follow the subtractive rule
    • R+G+B=Black

Eyes – Human Visual System

• Most intelligent and high-end camera

Lens: Cornea, Lens

Lens Control: Zonula (muscle group)

Aperture Control: Iris, Pupil

Photo Sensor: Retina, Fovea

Human Perception of Color

• Photoreceptors in the retina (the surface at the rear of the eyeball)
  – Cones: function under bright light, can perceive color tone
    • Red (~570 nm), green (~535 nm), blue (~445 nm) cones
    • Signals are passed via optic nerve fibers to the visual cortex for processing
  – Rods: work under low light, can only perceive luminance information

• Human color sensation:
  – Luminance: brightness
  – Chrominance: hue (color tone), saturation (color purity)

Spectral Sensitivity Curves

Trichromatic Color Mixing

• Trichromatic color mixing theory
  – Any color can be obtained by mixing three properly chosen primary colors in the right proportions

    C = \sum_{k=1}^{3} T_k C_k, where T_k are the tristimulus values

• Primary colors for illuminating sources
  – Red, Green, Blue (RGB)
  – A color monitor works by exciting red, green, and blue phosphors using separate electron guns
• Primary colors for reflecting sources
  – Cyan, Magenta, Yellow (CMY)
  – A color printer works by using cyan, magenta, yellow, and black (CMYK) dyes

Color Representation Models

• Specify the tristimulus values associated with the three primary colors
  – RGB
  – CMY
• Specify the luminance and chrominance
  – HSI (hue, saturation, intensity)
  – YIQ (used in NTSC color TV)
  – YCbCr (used in digital color TV)
• Amplitude specification:
  – 8 bits for each color component, or 24 bits total for each pixel
  – Total of 16 million colors
  – A true RGB color display of size 1Kx1K requires a display buffer memory size of 3 MB
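
The amplitude figures on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes "1K" means 1024 and "MB" means 2^20 bytes; the variable names are illustrative only.

```python
BITS_PER_COMPONENT = 8
COMPONENTS = 3  # R, G, B

bits_per_pixel = BITS_PER_COMPONENT * COMPONENTS  # 24 bits per pixel
num_colors = 2 ** bits_per_pixel                  # about 16 million colors

# Assumption: "1K x 1K" means 1024 x 1024 pixels.
width = height = 1024
buffer_bytes = width * height * bits_per_pixel // 8
buffer_mib = buffer_bytes / 2 ** 20               # size in units of 2^20 bytes

print(bits_per_pixel, num_colors, buffer_mib)
```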

ECE634 – Spring 2021 Jan 19, 2021 9 Conversion

• Conversion between different primary sets is linear (3x3 matrix)

• Conversion between primary and XYZ/YIQ/YUV are also linear – XYZ colors are not realizable by actual stimuli – YIQ and YUV are derived from the XYZ coordinate

• Conversion to LSI/Lab are nonlinear – Coordinate Euclidean distance proportional to actual color difference
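
As an illustration of a linear (3x3 matrix) conversion, the sketch below maps RGB to YIQ. The coefficients are the commonly quoted NTSC values rounded to three decimals; they are not given on the slides, so treat them as an assumption. The function name `rgb_to_yiq` is hypothetical.

```python
# Assumed coefficients: commonly quoted NTSC RGB -> YIQ matrix (rounded).
RGB_TO_YIQ = [
    [0.299, 0.587, 0.114],    # Y (luminance)
    [0.596, -0.274, -0.322],  # I
    [0.211, -0.523, 0.312],   # Q
]


def rgb_to_yiq(rgb):
    """Multiply the 3x3 matrix by an (R, G, B) triple with values in [0, 1]."""
    return tuple(sum(c * v for c, v in zip(row, rgb)) for row in RGB_TO_YIQ)


# White has full luminance and (approximately) zero chrominance.
y, i, q = rgb_to_yiq((1.0, 1.0, 1.0))
print(y, i, q)
```

Because each chrominance row sums to zero, any gray input (R = G = B) produces I = Q = 0 up to rounding.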

ECE634 – Spring 2021 Jan 19, 2021 10 Choosing Color Coordinates

• For display or printing: RGB or CMY, to produce more colors

• For analyzing color differences: HSI, for linear relationship

• For processing perceptually meaningful color: L*a*b*

• For transmission or storage: YIQ or YUV, for a less redundant representation

ECE634 – Spring 2021 Jan 19, 2021 11 Color in Images and

• Images are commonly RGB, and each pixel location has 3 colors – (this is ignoring Bayer color sampling) – BE CAREFUL!!! OpenCV loads images as BGR

• Videos are commonly YUV or YCbCr, and there are fewer color pixels than luminance pixels – OpenCV will automatically convert videos in YUV into consecutive images of RGB, upsampling the color information
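
The BGR warning amounts to a channel reversal. The sketch below shows the idea on a plain nested list so that it runs without OpenCV; the helper `bgr_to_rgb` is hypothetical. With real OpenCV arrays the usual call is `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of an image stored as rows of (B, G, R) tuples."""
    return [[(r, g, b) for (b, g, r) in row] for row in image]


# One row with a pure blue pixel and a pure red pixel, in BGR order.
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)
```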

ECE634 – Spring 2021 Jan 19, 2021 12 Video Capture and Display

ECE634 – Spring 2021 Jan 19, 2021 13 Plenoptic Function (Light Field)

• Measures the intensity of light that passes through a particular point in space • Every possible viewing position, with any viewing angle, at every moment in time – 3 location coordinates – 2 angular directions – Time – Wavelength • Light field (plenoptic) camera

Adelson and Bergen ’91

ECE634 – Spring 2021 Jan 19, 2021 14 Image Formation (Pinhole Camera)

A video records the emitted and/or reflected light intensity from the objects in the scene that is observed by a viewing system (a human eye or a camera)
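
For the pinhole model on this slide, a 3-D point (X, Y, Z) maps to image coordinates x = F X/Z and y = F Y/Z. The sketch below applies that relation; the function name `project` and the sample points are illustrative assumptions, and the sign flip caused by image reversal is ignored.

```python
def project(point_3d, focal_length):
    """Perspective projection of a 3-D point onto the image plane: x = F*X/Z, y = F*Y/Z."""
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("the point must lie in front of the camera (Z > 0)")
    return focal_length * X / Z, focal_length * Y / Z


# The same (X, Y) offset at twice the depth projects to half the size.
near = project((1.0, 2.0, 4.0), focal_length=2.0)
far = project((1.0, 2.0, 8.0), focal_length=2.0)
print(near, far)
```

Doubling Z halves both image coordinates, which is why an object appears smaller when it is farther away.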

[Figure: pinhole camera geometry, showing a 3-D point, the camera center, and the 2-D image on the image plane]

x = F \frac{X}{Z}, \quad y = F \frac{Y}{Z}

The image of an object is reversed from its 3-D position. The object appears smaller when it is farther away.

Video Signal

• A real-world scene is a continuous 3-D signal (temporal, horizontal, vertical)
• Film samples in time but is continuous in space (typically 24 frames/sec)
• Analog video samples in time and vertically; it is continuous horizontally (about 30 frames/sec or higher)
  – The number of lines controls the maximum vertical frequency that can be displayed for a given viewing distance
  – Video raster = 1-D signal consisting of scan lines from successive frames
• Digital video samples in time, vertically, and horizontally

ECE634 – Spring 2021 Jan 19, 2021 16 Progressive Scanning

Scan lines Horizontal retrace Vertical retrace

• Progressive scan: – Captures consecutive lines – Captures a complete frame every D t sec – Also referred to as sequential or non-interlaced – Used by TVs, monitors, video projectors

ECE634 – Spring 2021 Jan 19, 2021 17 Interlaced Scanning

E A C B Even field Odd field (Horizontal & vertical retrace not shown)

• Interlaced scan: D F – Captures alternate lines (each frame split into two fields) • Odd lines are captured (odd field), then even lines (even field) – Captures a complete frame every D t sec – Used in analog television
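
Splitting a frame into its two fields, and weaving them back together, can be sketched as follows. It assumes scan lines are numbered from 1, so the odd field holds lines 1, 3, 5, and so on; the helper names are hypothetical.

```python
def split_fields(frame):
    """Split a list of scan lines into (odd_field, even_field), numbering lines from 1."""
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for index, line in enumerate(odd_field):
        frame.append(line)
        if index < len(even_field):
            frame.append(even_field[index])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5"]
odd, even = split_fields(frame)
print(odd, even, weave(odd, even) == frame)
```

Weaving is only exact when nothing moves between the two field capture times; motion between fields is what produces interlace artifacts.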

ECE634 – Spring 2021 Jan 19, 2021 18 Why Interlace?

• To provide a trade-off between temporal and vertical resolution, for a given, fixed data rate (number of line/sec) • Interlace Artifact field 0 field 1 field 2 field 3

frame 0 frame 1

ECE634 – Spring 2021 Jan 19, 2021 19 Capture Color

• Sensors – CCD: Charge-Coupled Devices – CMOS: complementary Metal-Oxide-Semiconductor • Bayer Grid • Demosaicing • Dynamic Range

ECE634 – Spring 2021 Jan 19, 2021 20 Video Display

• CRT (cathode ray tube) vs LCD (liquid crystal display) vs LED (light emitting diode)

• Gamma: non-linear relation between camera output signal and actual color values

ECE634 – Spring 2021 Jan 19, 2021 21 Analog Video

ECE634 – Spring 2021 Jan 19, 2021 22 History of TV in US

• 1941: First NTSC broadcast, monochrome – 4:3 aspect ratio; Interlacing – 60 Hz (60 fields per second) – 525 lines but only 480 active lines • 1953: Color NTSC – Backwards compatible with black and white TVs • 1993: Grand Alliance forms to design HDTV • 1996: First public broadcast of HDTV • 2000: First HDTV Superbowl transmission • 2009: Last analog transmission

ECE634 – Spring 2021 Jan 19, 2021 23 TV at Purdue

• Roscoe George develops the first electronic television receiver in 1929

https://www.earlytelevision.org/pdf/roscoe_george_and_television.pdf

ECE634 – Spring 2021 Jan 19, 2021 24 Video Terminology

• Component video – Three color components stored/transmitted separately – Use either RGB or YIQ (YUV) or YCrCb coordinate – Betacam (professional tape recorder) use this format • Composite video – Convert RGB to YIQ (YUV) – Multiplexing YIQ into a single signal – Used in most consumer analog video devices • S-video – Y and C (I and Q) are stored separately – Used in consumer video devices

ECE634 – Spring 2021 Jan 19, 2021 25 25 TV Broadcasting and Receiving

Lu m i n a n c e , RG B Chrominance, ---> Aud io Modulation YC 1 C 2 Multiplexing

YC 1 C 2 De- De- ---> Multiplexing Modulation RG B

ECE634 – Spring 2021 Jan 19, 2021 26 Why not using RGB directly?

• R,G,B components are correlated – Transmitting R,G,B components separately is redundant – More efficient use of bandwidth is desired • RGB->YC1C2 transformation – Decorrelating: Y,C1,C2 are uncorrelated – C1 and C2 require lower bandwidth – Y (luminance) component can be received by B/W TV sets • YIQ in NTSC – I: orange-to-cyan – Q: green-to-purple (human eye is less sensitive) • Q can be further bandlimited than I – Phase=Arctan(Q/I) = hue, Magnitude=sqrt (I^2+Q^2) = saturation – Hue is better retained than saturation
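
The phase and magnitude relations for the I and Q components can be sketched as below. One assumption to note: `math.atan2(q, i)` is used instead of a plain arctan(Q/I) so that the quadrant is resolved and I = 0 does not divide by zero. The function name `chroma_polar` is hypothetical.

```python
import math


def chroma_polar(i, q):
    """Return (hue in degrees, saturation) from the I and Q chrominance components."""
    hue_deg = math.degrees(math.atan2(q, i))  # phase of the (I, Q) vector
    saturation = math.hypot(i, q)             # sqrt(I^2 + Q^2)
    return hue_deg, saturation


hue, sat = chroma_polar(0.3, 0.4)
print(hue, sat)
```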

ECE634 – Spring 2021 Jan 19, 2021 27 Different Color TV Systems

Parameters NTSC PAL SECAM

Field Rate (Hz) 59.95 (60) 50 50

Line Number/Frame 525 625 625

Line Rate (Line/s) 15,750 15,625 15,625

Color Coordinate YIQ YUV YDbDr

Luminance Bandwidth (MHz) 4.2 5.0/5.5 6.0

Chrominance Bandwidth (MHz) 1.5(I)/0.5(Q) 1.3(U,V) 1.0 (U,V)

Color (MHz) 3.58 4.43 4.25(Db),4.41(Dr)

Color Modulation QAM QAM FM

Audio Subcarrier 4.5 5.5/6.0 6.5

Total Bandwidth (MHz) 6.0 7.0/8.0 8.0
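
The line rates in the table follow from the number of lines per frame and the field rate, since an interlaced frame consists of two fields. The sketch below uses the nominal 60 Hz field rate for NTSC, which is an assumption made to match the 15,750 entry; the function name is hypothetical.

```python
def line_rate(lines_per_frame, field_rate_hz):
    """Lines per second for an interlaced system with two fields per frame."""
    frame_rate = field_rate_hz / 2
    return lines_per_frame * frame_rate


ntsc_line_rate = line_rate(525, 60)  # nominal NTSC field rate
pal_line_rate = line_rate(625, 50)   # PAL and SECAM
print(ntsc_line_rate, pal_line_rate)
```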

ECE634 – Spring 2021 Jan 19, 2021 28 Digital Video

ECE634 – Spring 2021 Jan 19, 2021 29 ITU-R BT.601 Video Format

Digital encoding of analog video

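[Figure: BT.601 raster layouts for the two systems]

525/60 system (NTSC): 60 fields/s
  – 858 pels per line in total: 720 active pels, with 122 pels and 16 pels outside the active area
  – 525 lines per frame, 480 active lines

625/50 system (PAL/SECAM): 50 fields/s
  – 864 pels per line in total: 720 active pels, with 132 pels and 12 pels outside the active area
  – 625 lines per frame, 576 active lines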
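
The pel and line counts above imply a sampling rate, which can be checked with a short calculation. The exact NTSC frame rate of 30000/1001 frames/s is an assumption not stated on the slide, and `sampling_rate_hz` is a hypothetical helper. Under that assumption both systems come out at the same 13.5 MHz.

```python
from fractions import Fraction


def sampling_rate_hz(pels_per_line, lines_per_frame, frames_per_sec):
    """Total luminance samples per second, including the non-active area."""
    return pels_per_line * lines_per_frame * frames_per_sec


# Assumption: exact NTSC frame rate of 30000/1001 (about 29.97) frames/s.
rate_525 = sampling_rate_hz(858, 525, Fraction(30000, 1001))
rate_625 = sampling_rate_hz(864, 625, 25)

# The per-line pel counts from the figure add up to the totals.
total_525 = 122 + 720 + 16
total_625 = 132 + 720 + 12
print(rate_525, rate_625, total_525, total_625)
```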

ECE634 – Spring 2021 Jan 19, 2021 30 Color Coordinate - YCbCr

• Scaled and shifted versions of analog YUV so the values are in the range of (0, 255)
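
One common form of the scaling and shifting is sketched below, for gamma-corrected R, G, B in [0, 1]. The coefficients are the widely used 8-bit BT.601 values and are an assumption here, since the slide does not list them. In this convention Y occupies 16 to 235 and Cb, Cr occupy 16 to 240 within the 0 to 255 range.

```python
def rgb_to_ycbcr(r, g, b):
    """Map R, G, B in [0, 1] to 8-bit Y, Cb, Cr (assumed BT.601 coefficients)."""
    y = 16 + 65.481 * r + 128.553 * g + 24.966 * b
    cb = 128 - 37.797 * r - 74.203 * g + 112.0 * b
    cr = 128 + 112.0 * r - 93.786 * g - 18.214 * b
    return tuple(int(round(v)) for v in (y, cb, cr))


black = rgb_to_ycbcr(0.0, 0.0, 0.0)
white = rgb_to_ycbcr(1.0, 1.0, 1.0)
red = rgb_to_ycbcr(1.0, 0.0, 0.0)
print(black, white, red)
```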

ECE634 – Spring 2021 Jan 19, 2021 31 Chrominance Subsampling Formats

4:4:4 4:2:2 4:1:1 4:2:0 For every 2x2 Y Pixels For every 2x2 Y Pixels For every 4x1 Y Pixels For every 2x2 Y Pixels 4 Cb & 4 Cr Pixel 2 Cb & 2 Cr Pixel 1 Cb & 1 Cr Pixel 1 Cb & 1 Cr Pixel (No subsampling) (Subsampling by 2:1 (Subsampling by 4:1 (Subsampling by 2:1 both horizontally only) horizontally only) horizontally and vertically)

Y Pixel Cb and Cr Pixel
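
The sample counts above translate into an average number of bits per pixel, and 4:2:0 subsampling can be sketched as a 2x2 reduction of a chrominance plane. Averaging each 2x2 block is only one possible choice of filter and is an assumption here, as are the helper names and the 8 bits per sample.

```python
# Cb (or Cr) samples per block of 4 luminance samples, taken from the table.
CHROMA_PER_4_LUMA = {"4:4:4": 4, "4:2:2": 2, "4:1:1": 1, "4:2:0": 1}


def average_bits_per_pixel(fmt, bits_per_sample=8):
    """Average bits per pixel, counting Y plus both chrominance components."""
    samples_per_4_pixels = 4 + 2 * CHROMA_PER_4_LUMA[fmt]
    return samples_per_4_pixels * bits_per_sample / 4


def subsample_420(plane):
    """Average each 2x2 block of a chrominance plane (even dimensions assumed)."""
    return [
        [
            (plane[r][c] + plane[r][c + 1] + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, len(plane[0]), 2)
        ]
        for r in range(0, len(plane), 2)
    ]


bpp = {fmt: average_bits_per_pixel(fmt) for fmt in CHROMA_PER_4_LUMA}
small = subsample_420([[1, 3], [5, 7]])
print(bpp, small)
```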

ECE634 – Spring 2021 Jan 19, 2021 32 Digital Formats

– Pixel === “picture element”, a point sample • Digital video is a sequence of frames (x,y,t) • Often denoted {lines}{i,p} or {lines}{i,p}{fps} – 1080i, 720p, 1080p60 • Temporal resolutions – Video • 25, 30, 60 frames per second (fps) • 50, 60, 120 fields per second – Film: 24, 48 fps – Animation: often lower • Why use more FPS?

ECE634 – Spring 2021 Jan 19, 2021 33 Spatial Resolutions

• 2K, 4K, 8K, etc. • High-definition TV (HDTV) – 1920 x 1080 (1080p or 1080i) – 1280 x 720 (720p) • Standard-definition TV (SDTV or TV) – 720 x 480; 480 x 480 – D1: 720 x 486, 720 x 576 • Common Intermediate Format (CIF) – 352 x 288, 30 frames per second – Required for H.261 compression

ECE634 – Spring 2021 Jan 19, 2021 34 Spatial Resolutions (cont.)

• Source Image Format (SIF) – 352 x 240; 352 x 288 (various frame rates!) • Quarter CIF (QCIF) – 176 x 120; 176 x 144 • 4CIF – 4xCIF: 704 × 576

ECE634 – Spring 2021 Jan 19, 2021 35 Aspect Ratio

• Picture width relative to picture height • Display aspect ratio – NTSC 4:3 – HDTV 16:9 • Pixel aspect ratio – A ratio describes how the width of a pixel in a digital image compares to the height of that pixel – Not always 1:1
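
A rough way to see why the pixel aspect ratio is not always 1:1 is to divide the display aspect ratio by the ratio of the pixel counts. The sketch below makes the simplifying assumption that the full stored width maps onto the full display width; the function name is hypothetical.

```python
from fractions import Fraction


def pixel_aspect_ratio(display_aspect, width_px, height_px):
    """Pixel width : pixel height, assuming the full raster fills the display."""
    return display_aspect / Fraction(width_px, height_px)


par_sd = pixel_aspect_ratio(Fraction(4, 3), 720, 480)     # 720 x 480 shown at 4:3
par_hd = pixel_aspect_ratio(Fraction(16, 9), 1920, 1080)  # 1920 x 1080 shown at 16:9
print(par_sd, par_hd)
```

Under this assumption the 720 x 480 pixels are narrower than they are tall, while the 1920 x 1080 pixels are square.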

ECE634 – Spring 2021 Jan 19, 2021 36 Aspect Ratio Accommodations: Fitting HD into SD, or SD into HD

• Squeeze video to fit – Tall skinny people; short wide people • Letterboxing (Pillarboxing) – Fill top and bottom (left and right) with black • Pan and scan – Show only a subset of the full content – Change viewing window over time if desired
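
The size of the black bars for letterboxing or pillarboxing can be sketched as below, assuming square pixels on both source and destination. The function name `fit_with_bars` and the example resolutions are illustrative assumptions.

```python
def fit_with_bars(src_w, src_h, dst_w, dst_h):
    """Scale the source to fit inside the destination without distortion.

    Returns (scaled_width, scaled_height, side_bar_width, top_bar_height).
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    out_w, out_h = round(src_w * scale), round(src_h * scale)
    bar_x = (dst_w - out_w) // 2  # left and right bars (pillarboxing)
    bar_y = (dst_h - out_h) // 2  # top and bottom bars (letterboxing)
    return out_w, out_h, bar_x, bar_y


letterbox = fit_with_bars(1920, 1080, 640, 480)  # 16:9 content on a 4:3 display
pillarbox = fit_with_bars(640, 480, 1920, 1080)  # 4:3 content on a 16:9 display
print(letterbox, pillarbox)
```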

ECE634 – Spring 2021 Jan 19, 2021 37 How Many Bits Per Pixel?

• Quantization transforms the continuous value at each pixel location into a digital number that can be represented by a fixed number of bits – color depth • Most video today is 8 bits per pixel (for luminance) • Emerging High Dynamic Range (HDR) images and video are 10 or 12 or 16 bits per pixel
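
A uniform quantizer is one simple way to picture this step. The sketch below maps a continuous value in [0, 1] to an integer code; uniform step sizes and the [0, 1] input range are assumptions, and the function name is hypothetical.

```python
def quantize(value, bits):
    """Uniformly quantize a value in [0.0, 1.0] to an integer code of `bits` bits."""
    levels = 2 ** bits
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * levels), levels - 1)


# The same mid-gray value at 8-, 10-, and 12-bit depth.
codes = {bits: quantize(0.5, bits) for bits in (8, 10, 12)}
print(codes, quantize(1.0, 8))
```

Each extra bit doubles the number of levels, which halves the quantization step size.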

ECE634 – Spring 2021 Jan 19, 2021 38 Digital Video Formats https://commons.wikimedia.org/wiki/File:Vector_Video_Standards8.svg

Video Format Y Size Color Frame Rate Raw Data Rate Sampling (Hz) (Mbps)

HDTV Over air. cable, satellite, MPEG2 video, 20-45 Mbps SMPTE296M 1280x720 4:2:0 24P/30P/60P 265/332/664 SMPTE295M 1920x1080 4:2:0 24P/30P/60I 597/746/746

Video production, MPEG2, 15-50 Mbps BT.601 720x480/576 4:4:4 60I/50I 249 BT.601 720x480/576 4:2:2 60I/50I 166

High quality video distribution (DVD, SDTV), MPEG2, 4-10 Mbps BT.601 720x480/576 4:2:0 60I/50I 124

Intermediate quality video distribution (VCD, WWW), MPEG1, 1.5 Mbps SIF 352x240/288 4:2:0 30P/25P 30

Video conferencing over ISDN/Internet, H.261/H.263, 128-384 Kbps CIF 352x288 4:2:0 30P 37

Video telephony over wired/wireless modem, H.263, 20-64 Kbps QCIF 176x144 4:2:0 30P 9.1
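
The raw data rates in the table can be reproduced approximately as width x height x average bits per pixel x frames per second. The sketch assumes 8 bits per sample, 1 Mbps = 10^6 bits/s, and that 60I corresponds to 30 full frames per second; the function name is hypothetical.

```python
# Average bits per pixel at 8 bits per sample for each sampling format.
BITS_PER_PIXEL = {"4:4:4": 24, "4:2:2": 16, "4:2:0": 12}


def raw_rate_mbps(width, height, sampling, frames_per_sec):
    """Uncompressed data rate in Mbps (10^6 bits per second)."""
    return width * height * BITS_PER_PIXEL[sampling] * frames_per_sec / 1e6


rate_720p24 = raw_rate_mbps(1280, 720, "4:2:0", 24)
rate_bt601_422 = raw_rate_mbps(720, 480, "4:2:2", 30)  # 60I, taken as 30 frames/s
rate_qcif = raw_rate_mbps(176, 144, "4:2:0", 30)
print(rate_720p24, rate_bt601_422, rate_qcif)
```

Comparing these figures with the compressed rates listed for each application gives a feel for how much compression each use case needs.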

ECE634 – Spring 2021 Jan 19, 2021 39 Practice Problem (1)

What is the perceived color if you have a light that has approximately the same energy at frequencies corresponding to red, green, and blue, and zero energy at other frequencies? What about red and green frequencies only?

If red, green and blue lights have equal energies, then the perceived color would cover the gray scale, i.e., the perceived color would be black for (R=G=B=0) and would be white for (R=G=B=255).

If the light has only red and green wavelengths at equal energies, then the perceived color will range from black for (R=G=0, B=0) to yellow for (R=G=255, B=0).

ECE634 – Spring 2021 Jan 19, 2021 40 Practice Problem (2)

What are the pros and cons of using component versus composite formats?

In component video formats, a color video is specified by three signals. In composite video formats, three color signals are multiplexed into a single signal.

Composite video format is compatible with a grey-scale signal, and it eliminates the need for synchronizing different color components when processing a color video. A composite signal also has a bandwidth that is significantly lower than the sum of the bandwidths of three component signals, and therefore can be transmitted or stored more efficiently. These benefits are achieved, however, at the expense of image quality: there often exist noticeable artifacts caused by cross talk between color and luminance components.
