ECE 634 – Digital Video Systems Spring 2021
Fengqing Maggie Zhu Assistant Professor of ECE MSEE 334 [email protected]
Video Basics
ECE634 – Spring 2021, Jan 19, 2021
Outline
• Color Perception
• Video Capture and Display
• Analog and Digital Video
Color Perception
Light Sources
• Illuminating sources:
  – Emit light (e.g., the sun, light bulbs, TV monitors)
  – Perceived color depends on the emitted frequencies
  – Follow the additive rule
    • R+G+B = White
• Reflecting sources:
  – Reflect incoming light (e.g., color dyes, matte surfaces, cloth)
  – Perceived color depends on the reflected frequencies (= incoming frequencies minus absorbed frequencies)
  – Follow the subtractive rule
    • R+G+B = Black
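The two mixing rules can be illustrated with a small sketch. It assumes an idealized model that is not on the slide: emitted light adds per channel (clipped at 255), while each pigment reflects only a per-channel fraction of the light that reaches it. The function names are hypothetical.

```python
def additive_mix(*lights):
    """Add emitted light channel by channel, clipping at full intensity (255)."""
    return tuple(min(255, sum(channel)) for channel in zip(*lights))


def subtractive_mix(*pigments):
    """Idealized reflecting sources: each pigment keeps only a fraction of the
    incoming light per channel, so stacking pigments multiplies the fractions."""
    result = [255, 255, 255]  # start from white incoming light
    for pigment in pigments:
        result = [r * p // 255 for r, p in zip(result, pigment)]
    return tuple(result)


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
print(additive_mix(RED, GREEN, BLUE))     # (255, 255, 255), i.e. white
print(subtractive_mix(RED, GREEN, BLUE))  # (0, 0, 0), i.e. black
```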
Eyes – Human Visual System
• Most intelligent and high-end camera
Lens: Cornea, Lens
Lens Control: Zonula (muscle group)
Aperture Control: Iris, Pupil
Photo Sensor: Retina, Fovea
Human Perception of Color
• Photoreceptors in the retina (the surface at the rear of the eyeball)
  – Cones: function under bright light, can perceive color tone
    • Red (~570 nm), green (~535 nm), blue (~445 nm) cones
    • Signals are passed via optic nerve fibers to the visual cortex for processing
  – Rods: work under low light, can only perceive luminance information
• Color sensation in humans:
  – Luminance: brightness
  – Chrominance: hue (color tone), saturation (color purity)
Spectral Sensitivity Curves
Trichromatic Color Mixing
• Trichromatic color mixing theory
  – Any color can be obtained by mixing three properly chosen primary colors in the right proportions
    $C = \sum_{k=1}^{3} T_k C_k$, where $T_k$ are the tristimulus values and $C_k$ are the primary colors
• Primary colors for illuminating sources
  – Red, Green, Blue (RGB)
  – A color monitor works by exciting red, green, and blue phosphors using separate electron guns
• Primary colors for reflecting sources
  – Cyan, Magenta, Yellow (CMY)
  – A color printer works by using cyan, magenta, yellow, and black (CMYK) dyes
Color Representation Models
• Specify the tristimulus values associated with the three primary colors
  – RGB
  – CMY
• Specify the luminance and chrominance
  – HSI (hue, saturation, intensity)
  – YIQ (used in NTSC color TV)
  – YCbCr (used in digital color TV)
• Amplitude specification:
  – 8 bits for each color component, or 24 bits total for each pixel
  – About 16.7 million colors in total (2^24)
  – A true RGB color display of size 1Kx1K requires a display buffer memory of 3 MB
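The amplitude figures above follow from simple arithmetic. The sketch below reproduces them, assuming that "1K" means 1024 and that "MB" means 2^20 bytes; the variable names are illustrative only.

```python
BITS_PER_COMPONENT = 8
COMPONENTS = 3  # R, G, B

bits_per_pixel = BITS_PER_COMPONENT * COMPONENTS  # 24 bits per pixel
num_colors = 2 ** bits_per_pixel                  # number of distinct colors

width = height = 1024                             # "1K x 1K" display (assumed 1024)
buffer_bytes = width * height * bits_per_pixel // 8
buffer_mib = buffer_bytes / 2 ** 20               # size in units of 2^20 bytes

print(bits_per_pixel)  # 24
print(num_colors)      # 16777216
print(buffer_mib)      # 3.0
```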
Color Space Conversion
• Conversion between different primary sets is linear (3x3 matrix)
• Conversions between a primary set and XYZ/YIQ/YUV are also linear
  – XYZ colors are not realizable by actual stimuli
  – YIQ and YUV are derived from the XYZ coordinates
• Conversions to HSI/Lab are nonlinear
  – Euclidean distance between coordinates is proportional to the actual color difference
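As an illustration of a linear (3x3 matrix) conversion, here is a minimal sketch of RGB to YIQ. The coefficients are the commonly quoted rounded NTSC values and are an assumption, since the slides do not list them; `rgb_to_yiq` is a hypothetical helper name.

```python
# Commonly quoted (rounded) NTSC RGB -> YIQ coefficients; assumed, not from the slides.
RGB_TO_YIQ = (
    (0.299, 0.587, 0.114),    # Y: luminance
    (0.596, -0.274, -0.322),  # I: orange-to-cyan axis
    (0.211, -0.523, 0.312),   # Q: green-to-purple axis
)


def rgb_to_yiq(rgb):
    """Apply the 3x3 matrix to an (R, G, B) triple with components in [0, 1]."""
    return tuple(sum(c * v for c, v in zip(row, rgb)) for row in RGB_TO_YIQ)


# Any gray (R = G = B) should have zero chrominance, because the I and Q rows sum to zero.
y, i, q = rgb_to_yiq((1.0, 1.0, 1.0))
print(round(y, 6), round(abs(i), 6), round(abs(q), 6))  # 1.0 0.0 0.0
```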
Choosing Color Coordinates
• For display or printing: RGB or CMY, to produce more colors
• For analyzing color differences: HSI, for linear relationship
• For processing perceptually meaningful color: L*a*b*
• For transmission or storage: YIQ or YUV, for a less redundant representation
Color in Images and Videos
• Images are commonly RGB, and each pixel location has 3 colors
  – (this ignores Bayer color sampling)
  – BE CAREFUL!!! OpenCV loads images as BGR
• Videos are commonly YUV or YCbCr, and there are fewer color pixels than luminance pixels
  – OpenCV will automatically convert YUV videos into consecutive RGB images, upsampling the color information
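The BGR warning is easy to demonstrate without any library. The sketch below reverses the channel order of a toy image stored as nested lists; `bgr_to_rgb` is a hypothetical helper written for this example. With OpenCV itself, the usual equivalent is `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a rows-of-pixels image."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 toy image in BGR order: a pure red pixel, then a pure blue pixel.
bgr_image = [[(0, 0, 255), (255, 0, 0)]]
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)  # [[(255, 0, 0), (0, 0, 255)]]
```

Displaying `bgr_image` with a tool that expects RGB would swap red and blue, which is the typical symptom of forgetting this conversion.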
Video Capture and Display
Plenoptic Function (Light Field)
• Measures the intensity of light that passes through a particular point in space
• Every possible viewing position, with any viewing angle, at every moment in time
  – 3 location coordinates
  – 2 angular directions
  – Time
  – Wavelength
• Light field (plenoptic) camera
Adelson and Bergen ’91
Image Formation (Pinhole Camera)
A video records the emitted and/or reflected light intensity from the objects in the scene that is observed by a viewing system (a human eye or a camera)
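The pinhole model on this slide maps a 3-D point (X, Y, Z) to image coordinates x = F X/Z and y = F Y/Z. A minimal sketch of that projection follows, assuming F is the focal length and that the point lies in front of the camera; `project` is a hypothetical helper name.

```python
def project(point, focal_length):
    """Perspective projection of a 3-D point (X, Y, Z) onto the image plane."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


near = project((1.0, 2.0, 4.0), focal_length=1.0)
far = project((1.0, 2.0, 8.0), focal_length=1.0)
print(near)  # (0.25, 0.5)
print(far)   # (0.125, 0.25)
```

Doubling the depth Z halves both image coordinates, which is why an object appears smaller when it is farther away.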
[Figure: pinhole camera geometry showing a 3-D point, the camera center, and the 2-D image on the image plane]
$x = F \frac{X}{Z}, \quad y = F \frac{Y}{Z}$
The image of an object is reversed from its 3-D position. The object appears smaller when it is farther away.
Video Signal
• A real-world scene is a continuous 3-D signal (temporal, horizontal, vertical)
• Film records samples in time but is continuous in space (typically 24 frames/sec)
• Analog video samples in time and samples vertically; it is continuous horizontally (about 30 frames/sec or higher)
  – The number of lines controls the maximum vertical frequency that can be displayed for a given viewing distance
  – Video raster = 1-D signal consisting of scan lines from successive frames
• Digital video: samples in time, vertically, and horizontally
Progressive Scanning
[Figure: progressive raster showing scan lines, horizontal retrace, and vertical retrace]
• Progressive scan:
  – Captures consecutive lines
  – Captures a complete frame every Δt sec
  – Also referred to as sequential or non-interlaced
  – Used by TVs, monitors, video projectors
Interlaced Scanning
[Figure: interlaced raster with points A to F marking the scan order across the odd field and the even field (horizontal and vertical retrace not shown)]
• Interlaced scan:
  – Captures alternate lines (each frame is split into two fields)
    • Odd lines are captured (odd field), then even lines (even field)
  – Captures a complete frame every Δt sec
  – Used in analog television
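The split of a frame into two fields can be sketched in a few lines. This is an illustration only: a frame is modeled as a list of scan lines, lines are numbered from 1 (so list index 0 is an odd line), and `split_fields` and `weave` are hypothetical names.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its odd and even fields."""
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    # With an odd total line count, the odd field holds one extra line.
    frame.extend(odd_field[len(even_field):])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
odd, even = split_fields(frame)
print(odd)   # ['line1', 'line3', 'line5']
print(even)  # ['line2', 'line4', 'line6']
print(weave(odd, even) == frame)  # True
```

Weaving is only exact when both fields come from the same instant; for a moving scene the two fields are captured at different times.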
Why Interlace?
• To provide a trade-off between temporal and vertical resolution, for a given, fixed data rate (number of lines/sec)
• Interlace artifact
[Figure labels: field 0, field 1, field 2, field 3; frame 0, frame 1]
Capture Color
• Sensors
  – CCD: Charge-Coupled Device
  – CMOS: Complementary Metal-Oxide-Semiconductor
• Bayer grid
• Demosaicing
• Dynamic range
Video Display
• CRT (cathode ray tube) vs LCD (liquid crystal display) vs LED (light emitting diode)
• Gamma correction: non-linear relation between camera output signal and actual color values
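A common way to model this non-linearity is a power law. The sketch below is an assumption rather than the slide's definition: it uses a display gamma of 2.2 and normalized values in [0, 1], with hypothetical function names.

```python
GAMMA = 2.2  # assumed display gamma; real systems use piecewise curves


def gamma_encode(linear):
    """Map a linear light value in [0, 1] to a gamma-corrected signal value."""
    return linear ** (1.0 / GAMMA)


def gamma_decode(encoded):
    """Invert the encoding: recover linear light from the signal value."""
    return encoded ** GAMMA


mid_gray = gamma_encode(0.5)
print(round(mid_gray, 2))                # 0.73
print(round(gamma_decode(mid_gray), 6))  # 0.5
```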
Analog Video
History of TV in US
• 1941: First NTSC broadcast, monochrome
  – 4:3 aspect ratio; interlacing
  – 60 Hz (60 fields per second)
  – 525 lines but only 480 active lines
• 1953: Color NTSC
  – Backwards compatible with black and white TVs
• 1993: Grand Alliance forms to design HDTV
• 1996: First public broadcast of HDTV
• 2000: First HDTV Super Bowl transmission
• 2009: Last analog transmission
TV at Purdue
• Roscoe George develops the first electronic television receiver in 1929
https://www.earlytelevision.org/pdf/roscoe_george_and_television.pdf
Video Terminology
• Component video
  – Three color components are stored/transmitted separately
  – Uses RGB, YIQ (YUV), or YCrCb coordinates
  – Betacam (professional tape recorder) uses this format
• Composite video
  – Converts RGB to YIQ (YUV)
  – Multiplexes YIQ into a single signal
  – Used in most consumer analog video devices
• S-video
  – Y and C (I and Q) are stored separately
  – Used in consumer video devices
TV Broadcasting and Receiving
[Figure: block diagram. Transmitter: RGB ---> YC1C2 ---> luminance, chrominance, audio multiplexing ---> modulation. Receiver: demodulation ---> demultiplexing ---> YC1C2 ---> RGB]
Why Not Use RGB Directly?
• R, G, B components are correlated
  – Transmitting R, G, B components separately is redundant
  – More efficient use of bandwidth is desired
• RGB->YC1C2 transformation
  – Decorrelating: Y, C1, C2 are uncorrelated
  – C1 and C2 require lower bandwidth
  – The Y (luminance) component can be received by B/W TV sets
• YIQ in NTSC
  – I: orange-to-cyan
  – Q: green-to-purple (human eye is less sensitive)
    • Q can be bandlimited more than I
  – Phase $= \arctan(Q/I)$ = hue, Magnitude $= \sqrt{I^2 + Q^2}$ = saturation
  – Hue is better retained than saturation
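The phase and magnitude relations for I and Q can be evaluated directly. This sketch uses `math.atan2` rather than a plain arctangent of Q/I so that the quadrant is resolved and I = 0 does not divide by zero; that choice, the degree units, and the function name are assumptions made for the example.

```python
import math


def hue_saturation(i, q):
    """Return (hue in degrees, saturation) for chrominance components I and Q."""
    hue_deg = math.degrees(math.atan2(q, i))  # phase of the (I, Q) vector
    saturation = math.hypot(i, q)             # sqrt(I^2 + Q^2)
    return hue_deg, saturation


hue, sat = hue_saturation(0.3, 0.4)
print(round(hue, 2), round(sat, 6))  # 53.13 0.5
```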
Different Color TV Systems
Parameters                     NTSC               PAL          SECAM
Field Rate (Hz)                59.94 (60)         50           50
Line Number/Frame              525                625          625
Line Rate (Line/s)             15,750             15,625       15,625
Color Coordinate               YIQ                YUV          YDbDr
Luminance Bandwidth (MHz)      4.2                5.0/5.5      6.0
Chrominance Bandwidth (MHz)    1.5 (I)/0.5 (Q)    1.3 (U,V)    1.0 (U,V)
Color Subcarrier (MHz)         3.58               4.43         4.25 (Db), 4.41 (Dr)
Color Modulation               QAM                QAM          FM
Audio Subcarrier (MHz)         4.5                5.5/6.0      6.5
Total Bandwidth (MHz)          6.0                7.0/8.0      8.0
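The line rates in the table follow from the line count and the frame rate. The sketch below assumes the nominal field rates (60 and 50 fields/s) and two interlaced fields per frame; the dictionary layout and function name are illustrative.

```python
SYSTEMS = {
    "NTSC": {"lines_per_frame": 525, "nominal_field_rate": 60},
    "PAL": {"lines_per_frame": 625, "nominal_field_rate": 50},
    "SECAM": {"lines_per_frame": 625, "nominal_field_rate": 50},
}


def line_rate(lines_per_frame, field_rate):
    """Lines per second = lines per frame x frames per second (2 fields per frame)."""
    frame_rate = field_rate / 2
    return lines_per_frame * frame_rate


for name, params in SYSTEMS.items():
    rate = line_rate(params["lines_per_frame"], params["nominal_field_rate"])
    print(name, rate)
# NTSC 15750.0
# PAL 15625.0
# SECAM 15625.0
```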
Digital Video
ITU-R BT.601 Video Format
Digital encoding of analog video
[Figure: active area within the full raster for each system]
525/60 (NTSC), 60 field/s: 858 pels per line, 720 active pels (122 + 16 pels of horizontal blanking); 525 lines, 480 active lines
625/50 (PAL/SECAM), 50 field/s: 864 pels per line, 720 active pels (132 + 12 pels of horizontal blanking); 625 lines, 576 active lines
Color Coordinate - YCbCr
• Scaled and shifted versions of analog YUV so the values are in the range of (0, 255)
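To make the "scaled and shifted" idea concrete, here is a minimal sketch of an RGB to 8-bit YCbCr mapping. It assumes the BT.601 luma weights and the studio-range scaling (Y nominally in 16 to 235, Cb and Cr in 16 to 240, centered on 128), which is narrower than the full (0, 255) range mentioned on the slide; the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Map gamma-corrected R, G, B in [0, 1] to 8-bit Y, Cb, Cr (assumed BT.601 scaling)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma in [0, 1]
    pb = (b - y) / 1.772                   # scaled B - Y, in [-0.5, 0.5]
    pr = (r - y) / 1.402                   # scaled R - Y, in [-0.5, 0.5]
    return (round(16 + 219 * y), round(128 + 224 * pb), round(128 + 224 * pr))


print(rgb_to_ycbcr(0.0, 0.0, 0.0))  # (16, 128, 128)
print(rgb_to_ycbcr(1.0, 1.0, 1.0))  # (235, 128, 128)
print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # (81, 90, 240)
```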
Chrominance Subsampling Formats
• 4:4:4: for every 2x2 Y pixels, 4 Cb & 4 Cr pixels (no subsampling)
• 4:2:2: for every 2x2 Y pixels, 2 Cb & 2 Cr pixels (subsampling by 2:1 horizontally only)
• 4:1:1: for every 4x1 Y pixels, 1 Cb & 1 Cr pixel (subsampling by 4:1 horizontally only)
• 4:2:0: for every 2x2 Y pixels, 1 Cb & 1 Cr pixel (subsampling by 2:1 both horizontally and vertically)
[Figure legend: Y pixel; Cb and Cr pixel]
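A 4:2:0 chroma plane keeps one sample per 2x2 block of luminance pixels. The sketch below produces that sample by averaging each block, which is one possible choice among several (plain decimation or longer filters are also used); it assumes even plane dimensions, and the function name is hypothetical.

```python
def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    rows, cols = len(chroma), len(chroma[0])
    return [
        [
            (chroma[r][c] + chroma[r][c + 1]
             + chroma[r + 1][c] + chroma[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# A 2x4 Cb plane becomes 1x2 after 2:1 subsampling in both directions.
cb = [
    [100, 102, 50, 54],
    [104, 106, 52, 56],
]
print(subsample_420(cb))  # [[103.0, 53.0]]
```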
Digital Formats
• Pixel = “picture element”, a point sample
• Digital video is a sequence of frames (x,y,t)
• Often denoted {lines}{i,p} or {lines}{i,p}{fps}
  – 1080i, 720p, 1080p60
• Temporal resolutions
  – Video
    • 25, 30, 60 frames per second (fps)
    • 50, 60, 120 fields per second
  – Film: 24, 48 fps
  – Animation: often lower
• Why use more FPS?
Spatial Resolutions
• 2K, 4K, 8K, etc.
• High-definition TV (HDTV)
  – 1920 x 1080 (1080p or 1080i)
  – 1280 x 720 (720p)
• Standard-definition TV (SDTV or TV)
  – 720 x 480; 480 x 480
  – D1: 720 x 486, 720 x 576
• Common Intermediate Format (CIF)
  – 352 x 288, 30 frames per second
  – Required for H.261 compression
Spatial Resolutions (cont.)
• Source Image Format (SIF)
  – 352 x 240; 352 x 288 (various frame rates!)
• Quarter CIF (QCIF)
  – 176 x 120; 176 x 144
• 4CIF
  – 4xCIF: 704 x 576
Aspect Ratio
• Picture width relative to picture height
• Display aspect ratio
  – NTSC 4:3
  – HDTV 16:9
• Pixel aspect ratio
  – The ratio of a pixel's width to its height in a digital image
  – Not always 1:1
Aspect Ratio Accommodations: Fitting HD into SD, or SD into HD
• Squeeze video to fit
  – Tall skinny people; short wide people
• Letterboxing (Pillarboxing)
  – Fill top and bottom (left and right) with black
• Pan and scan
  – Show only a subset of the full content
  – Change viewing window over time if desired
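The size of the black bars for letterboxing or pillarboxing follows from the two aspect ratios. The sketch below assumes square pixels and integer-truncated sizes; the function name and the example display sizes (640 x 480 and 1920 x 1080) are chosen for illustration.

```python
from fractions import Fraction


def fit_with_bars(content_aspect, display_w, display_h):
    """Scale content to fit the display while preserving its aspect ratio.

    Returns (mode, scaled_width, scaled_height, bar_size_per_side).
    """
    display_aspect = Fraction(display_w, display_h)
    if content_aspect > display_aspect:
        # Content is wider than the display: use full width, bars on top and bottom.
        w = display_w
        h = int(display_w / content_aspect)
        return "letterbox", w, h, (display_h - h) // 2
    # Content is narrower (or equal): use full height, bars on left and right.
    h = display_h
    w = int(display_h * content_aspect)
    return "pillarbox", w, h, (display_w - w) // 2


print(fit_with_bars(Fraction(16, 9), 640, 480))   # ('letterbox', 640, 360, 60)
print(fit_with_bars(Fraction(4, 3), 1920, 1080))  # ('pillarbox', 1440, 1080, 240)
```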
How Many Bits Per Pixel?
• Quantization transforms the continuous value at each pixel location into a digital number that can be represented by a fixed number of bits
  – Color depth
• Most video today is 8 bits per pixel (for luminance)
• Emerging High Dynamic Range (HDR) images and video are 10, 12, or 16 bits per pixel
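A uniform quantizer is the simplest instance of this step. The sketch below assumes input values normalized to [0, 1] and rounding to the nearest of 2^bits levels; real cameras apply additional non-linear processing, and `quantize` is a hypothetical name.

```python
def quantize(value, bits):
    """Uniformly quantize a value in [0.0, 1.0] to an integer code of `bits` bits."""
    max_code = 2 ** bits - 1
    clamped = min(max(value, 0.0), 1.0)  # out-of-range inputs saturate
    return round(clamped * max_code)


print(quantize(0.25, 8))   # 64
print(quantize(1.0, 8))    # 255
print(quantize(1.0, 10))   # 1023
```

Moving from 8 to 10 bits multiplies the number of levels by four, which is what gives HDR formats finer gradations.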
Digital Video Formats
https://commons.wikimedia.org/wiki/File:Vector_Video_Standards8.svg
Video Format    Y Size         Color Sampling    Frame Rate (Hz)    Raw Data Rate (Mbps)
HDTV over air, cable, satellite, MPEG2 video, 20-45 Mbps
SMPTE296M       1280x720       4:2:0             24P/30P/60P        265/332/664
SMPTE295M       1920x1080      4:2:0             24P/30P/60I        597/746/746
Video production, MPEG2, 15-50 Mbps
BT.601          720x480/576    4:4:4             60I/50I            249
BT.601          720x480/576    4:2:2             60I/50I            166
High quality video distribution (DVD, SDTV), MPEG2, 4-10 Mbps
BT.601          720x480/576    4:2:0             60I/50I            124
Intermediate quality video distribution (VCD, WWW), MPEG1, 1.5 Mbps
SIF             352x240/288    4:2:0             30P/25P            30
Video conferencing over ISDN/Internet, H.261/H.263, 128-384 Kbps
CIF             352x288        4:2:0             30P                37
Video telephony over wired/wireless modem, H.263, 20-64 Kbps
QCIF            176x144        4:2:0             30P                9.1
Practice Problem (1)
What is the perceived color if you have a light that has approximately the same energy at frequencies corresponding to red, green, and blue, and zero energy at other frequencies? What about red and green frequencies only?
If red, green and blue lights have equal energies, then the perceived color would cover the gray scale, i.e., the perceived color would be black for (R=G=B=0) and would be white for (R=G=B=255).
If the light has only red and green wavelengths at equal energies, then the perceived color will range from black for (R=G=0, B=0) to yellow for (R=G=255, B=0).
Practice Problem (2)
What are the pros and cons of using component versus composite formats?
In component video formats, a color video is specified by three signals. In composite video formats, three color signals are multiplexed into a single signal.
Composite video format is compatible with a grey-scale signal, and it eliminates the need for synchronizing different color components when processing a color video. A composite signal also has a bandwidth that is significantly lower than the sum of the bandwidths of three component signals, and therefore can be transmitted or stored more efficiently. These benefits are achieved, however, at the expense of image quality: there often exist noticeable artifacts caused by cross talk between color and luminance components.