
Digital Video

• Video comes from a camera, which records what it sees as a sequence of images
• Image frames comprise the video
  – Video = presentation of successive frames
  – Image content changes only minimally between successive frames
  – Frame frequency is measured in frames per second (fps)

• A sequence of still images creates the illusion of movement
  – More than 16 fps appears “smooth”
  – Standards:
    • 29.97 fps for NTSC
    • 24 fps for movies
    • 25 fps for PAL
    • 60 fps for HDTV

Formats

• Most video signals are color signals: YCbCr, RGB, etc.
• The YCbCr color representation is used by most video coding standards, in compliance with the CCIR 601, CIF, and SIF formats
• CCIR 601 (CCIR: International Radio Consultative Committee)
  – Three components: Y, Cb, Cr
  – The CCIR format has two options:
    • One for the NTSC TV system
      – 525 lines/frame @ 30 fps
      – Y = 720x480 active pixels
      – Cb, Cr = 360x480 active pixels (4:2:2)
    • Another for the PAL system
      – 625 lines/frame @ 25 fps
      – Y = 720x576 active pixels
      – Cb, Cr = 360x576 active pixels (4:2:2)
• SIF: Source Input Format
  – Y = 360x240 pixels/frame @ 30 fps; Y = 360x288 pixels/frame @ 25 fps
  – Cb, Cr = Y/2 in each dimension = 180x120 pixels/frame (180x144 @ 25 fps)
• CIF: Common Intermediate Format

Note on Digital Video Sub-sampling

• The human eye responds more precisely to brightness than to color; chroma subsampling (decimation) takes advantage of this.
  o In a 4:4:4 scheme, each 8×8 matrix of RGB pixels converts to three YCrCb 8×8 matrices: one for luminance (Y) and one for each of the two chrominance bands (Cr and Cb).
  o A 4:2:2 scheme also creates one 8×8 luminance matrix but decimates every two horizontal pixels to create each chrominance-matrix entry, reducing the amount of data to 2/3 of a 4:4:4 scheme.
  o A 4:2:0 scheme decimates chrominance both horizontally and vertically, resulting in four Y, one Cr, and one Cb 8×8 matrix for every four 8×8 pixel-matrix sources. This conversion creates half the data required in a 4:4:4 scheme.

[Figure: Color coordinate conversion and sub-sampling of chrominance components]
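The 2/3 and 1/2 data ratios quoted for the 4:2:2 and 4:2:0 schemes can be checked by counting samples. The sketch below is only an illustration: the function names and the decimation table are assumptions introduced here, and the CIF frame size (352x288) is used purely as an example input.

```python
# Chroma decimation factors (horizontal, vertical) for each scheme.
# These names are illustrative, not taken from any standard library.
SUBSAMPLING = {
    "4:4:4": (1, 1),  # no chroma decimation
    "4:2:2": (2, 1),  # chroma halved horizontally
    "4:2:0": (2, 2),  # chroma halved horizontally and vertically
}


def samples_per_frame(width, height, scheme):
    """Total number of Y + Cb + Cr samples in one frame."""
    h_factor, v_factor = SUBSAMPLING[scheme]
    luma = width * height
    chroma = (width // h_factor) * (height // v_factor)  # one chroma plane
    return luma + 2 * chroma


def ratio_to_444(width, height, scheme):
    """Data volume of `scheme` relative to 4:4:4 at the same frame size."""
    return samples_per_frame(width, height, scheme) / samples_per_frame(
        width, height, "4:4:4"
    )


if __name__ == "__main__":
    for scheme in SUBSAMPLING:
        total = samples_per_frame(352, 288, scheme)
        print(scheme, total, round(ratio_to_444(352, 288, scheme), 3))
```

Because luminance is never decimated, the saving comes entirely from the two chrominance planes, which is why 4:2:0 cannot go below half of the 4:4:4 volume.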

[Figure: Cb and Cr sub-sampling, labeled by Y : Cb : Cr ratio]

The Video Data Firehose

• Playing one second of uncompressed 16-bit color, 640x480 digital video (at 30 fps) requires approximately 18 MB of storage
• One minute would require about 1 GB
• A CD-ROM can hold only about 600 MB, and a single-speed (1x) player can transfer only 150 kB per second
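The figures above follow from simple arithmetic. The sketch below reproduces them under two stated assumptions: a frame rate of 30 fps (which is what yields roughly 18 MB per second) and decimal units (1 MB = 10^6 bytes). The function names are hypothetical helpers introduced for illustration.

```python
WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 2   # 16-bit color
FPS = 30              # assumed frame rate


def bytes_per_second(width=WIDTH, height=HEIGHT,
                     bytes_per_pixel=BYTES_PER_PIXEL, fps=FPS):
    """Raw data rate of uncompressed video."""
    return width * height * bytes_per_pixel * fps


def seconds_that_fit(capacity_bytes, rate=None):
    """How many seconds of raw video fit into a given capacity."""
    rate = bytes_per_second() if rate is None else rate
    return capacity_bytes / rate


def compression_needed(channel_bytes_per_second, rate=None):
    """Compression ratio required to play back in real time over a channel."""
    rate = bytes_per_second() if rate is None else rate
    return rate / channel_bytes_per_second


if __name__ == "__main__":
    rate = bytes_per_second()
    print("bytes per second:", rate)           # 18,432,000 (about 18 MB)
    print("bytes per minute:", rate * 60)      # 1,105,920,000 (about 1 GB)
    print("seconds on a 600 MB CD-ROM:", seconds_that_fit(600 * 10**6))
    print("ratio for a 150 kB/s drive:", compression_needed(150 * 10**3))
```

Under these assumptions a CD-ROM holds only about half a minute of raw video, and real-time playback from a single-speed drive needs a compression ratio above 100:1.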

• Video compression
  – Many video coding standards:
    • H.261, H.262, H.263+, H.263++, H.26L
    • MPEG-1, MPEG-2, MPEG-4

Video Compression 1: H.261, H.263

H.261

• The ITU-T Study Group on Visual Telephony started work on the standard in 1984
• Target: audiovisual services at Nx384 kbit/s, N = 1,…,5
• 1984-1989: video coding algorithm development
• Outcome: audiovisual services at px64 kbit/s, p = 1,…,30
• Dec. 1990: H.261 video coding standard approved
• H.261 supports only low-resolution formats because of bandwidth constraints and therefore cannot deliver broadcast-quality video. Resolutions:
  o Common Intermediate Format (CIF) at 352x288 pixels
  o Quarter CIF (QCIF) at 176x144 pixels
• The maximum frame rate is 30 frames per second, but it can be reduced depending on the application and bandwidth availability

H.261 Examples

Here are some H.261 video sequences. They were H.261 coded and then recoded with MPEG at high quality to make them viewable.

Description                       Average PSNR (dB)   Compression Ratio
Original                          n/a                 1:1
High Quality                      38                  25:1
Low Quality                       34.7                104:1
Low Quality with motion vectors   35                  113:1

H.261 Coding Basis

• A frame of video is a single still picture
• The H.261 compression algorithm takes advantage of both the spatial and the temporal redundancy of video sequences to achieve high compression ratios.
• Frames are coded as one of two types:
  – I-frames, or intra-coded frames: coded by exploiting redundancy within the frame
    • You can think of these as being just the JPEG coding of the frame
    • These are reference points in the video sequence
  – P-frames, or inter-coded frames: coded by exploiting their similarity with previously coded frames (also called predicted or pseudo frames)
• An example H.261 frame sequence:

[Figure: example H.261 frame sequence]

Picture Format and Data Structure

• Luminance (Y) and chrominance (Cb, Cr) sampled as 4:2:0, non-interlaced
• CIF 352x288 is organized into Group of Blocks (GOB), Macroblock (MB), and Block layers

GOB layout of a CIF picture:

GOB 1    GOB 2
GOB 3    GOB 4
GOB 5    GOB 6
GOB 7    GOB 8
GOB 9    GOB 10
GOB 11   GOB 12

Macroblock layout within a GOB:

MB1    MB2    MB3    MB4    MB5    MB6    MB7    MB8    MB9    MB10   MB11
MB12   MB13   MB14   MB15   MB16   MB17   MB18   MB19   MB20   MB21   MB22
MB23   MB24   MB25   MB26   MB27   MB28   MB29   MB30   MB31   MB32   MB33
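The GOB and macroblock grids above are consistent with the CIF frame size: 352x288 luminance pixels divide into 22x18 macroblocks of 16x16 pixels, which group into 12 GOBs of 11x3 macroblocks. The sketch below checks that arithmetic and maps a (GOB, MB) pair to pixel coordinates. The function names are hypothetical, and `macroblock_origin` assumes the row-major CIF numbering shown in the grids.

```python
MB_SIZE = 16                      # a macroblock covers 16x16 luminance pixels
GOB_COLS_MB, GOB_ROWS_MB = 11, 3  # 11x3 = 33 macroblocks per GOB


def picture_structure(width, height):
    """Count macroblocks and GOBs for a given luminance frame size."""
    mb_cols, mb_rows = width // MB_SIZE, height // MB_SIZE
    gob_cols, gob_rows = mb_cols // GOB_COLS_MB, mb_rows // GOB_ROWS_MB
    return {
        "macroblocks": mb_cols * mb_rows,
        "gobs": gob_cols * gob_rows,
        "gob_grid": (gob_cols, gob_rows),
    }


def macroblock_origin(gob, mb, width=352):
    """Top-left luminance pixel (x, y) of macroblock `mb` (1..33)
    inside GOB `gob` (1..12) of a CIF picture."""
    gobs_per_row = (width // MB_SIZE) // GOB_COLS_MB
    gob_row, gob_col = divmod(gob - 1, gobs_per_row)
    mb_row, mb_col = divmod(mb - 1, GOB_COLS_MB)
    x = (gob_col * GOB_COLS_MB + mb_col) * MB_SIZE
    y = (gob_row * GOB_ROWS_MB + mb_row) * MB_SIZE
    return x, y


if __name__ == "__main__":
    print(picture_structure(352, 288))  # CIF
    print(picture_structure(176, 144))  # QCIF
    print(macroblock_origin(12, 33))    # last macroblock of the last GOB
```

For CIF this gives 396 macroblocks in a 2x6 grid of GOBs; the same arithmetic for QCIF gives 99 macroblocks in 3 GOBs.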

Block layout within a macroblock (YCbCr):

Y:  B1  B2     Cb: B5     Cr: B6
    B3  B4

Intra-Frame Coding

• Macroblocks are used for coding
  – A macroblock spans a 16x16 pixel area, with four Y blocks, one Cr block, and one Cb block.
  – Quantization is uniform, as opposed to using a quantization table.

Temporal Redundancy Reduction

[Figure: Frame 1 and Frame 2]

• First frame: INTRA coding (spatial)
• How to predict the next frames (INTER coding)?
  o Motion estimation/compensation

• Ideally: motion info for each pixel
  – Too expensive
• Semantically: motion info for each homogeneous region or object
  – Second-generation coding standard
• Simplified: motion info for each 16x16 macroblock

Motion Estimation

• Optimal motion vector?
  o Investigate all the positions in the search window
  o Keep the one with the minimum mean square error
  o Motion vector = corresponding translation
• Motion estimation/compensation on 16x16 luminance blocks
• Motion vectors for luminance blocks are differentially coded

Inter-Frame Coding
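Inter-frame coding builds on the exhaustive search just described: every candidate position in the search window is tried, the one with the minimum mean square error is kept, and the difference block is what gets encoded. The sketch below is a minimal illustration under stated assumptions, not an H.261 encoder. Frames are plain lists of rows of luminance values, and the function names and the `search_range` parameter are hypothetical.

```python
def block_mse(target, reference, tx, ty, rx, ry, size):
    """Mean square error between the size x size block of `target` at
    (tx, ty) and the block of `reference` at (rx, ry)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            diff = target[ty + dy][tx + dx] - reference[ry + dy][rx + dx]
            total += diff * diff
    return total / (size * size)


def full_search(target, reference, tx, ty, size=16, search_range=7):
    """Try every displacement in the search window and return
    ((mvx, mvy), mse) for the best match. The motion vector is the
    translation from the block position to its match in `reference`."""
    height, width = len(reference), len(reference[0])
    best_mv = (0, 0)
    best_err = block_mse(target, reference, tx, ty, tx, ty, size)
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = tx + mvx, ty + mvy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            err = block_mse(target, reference, tx, ty, rx, ry, size)
            if err < best_err:
                best_mv, best_err = (mvx, mvy), err
    return best_mv, best_err


def residual_block(target, reference, tx, ty, mv, size=16):
    """Difference between the target block and its motion-compensated
    prediction; this is what inter-frame coding goes on to encode."""
    mvx, mvy = mv
    return [
        [target[ty + dy][tx + dx] - reference[ty + mvy + dy][tx + mvx + dx]
         for dx in range(size)]
        for dy in range(size)
    ]
```

The cost grows with the square of the search range, which is one reason motion is estimated per 16x16 macroblock rather than per pixel.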

Basic Idea:
• Encode the difference of a motion-compensated part of the target frame (the frame to code) with respect to a decoded reference frame.
  o The reference frame is the previously decoded frame (an I-frame or a P-frame).
• Motion vector: the “offset” from this macroblock to its best-matching macroblock in the reference frame

H.261 Bitstream Format

H.263

• H.263 is an improved standard for very low bit-rate applications such as video telecommunication (< 64 kbps)
  – It uses transform coding for intra-frames and predictive coding for inter-frames.
• Advanced options:
  – Half-pixel precision in motion compensation
  – Unrestricted motion vectors
  – Syntax-based arithmetic coding
  – Advanced prediction and PB-frames

• In addition to CIF and QCIF, H.263 also supports SQCIF, 4CIF, and 16CIF

CIF Video