Overview: Video Coding Standards
Video coding standards: the applications and the common structure Relevant standards organizations ITU-T Rec. H.261 ITU-T Rec. H.263 ISO/IEC MPEG-1 ISO/IEC MPEG-2 ISO/IEC MPEG-4 Recent progress: H.264/JVT
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 1 Major Applications of Video Compression
Digital television 2 . . . 6 Mbps MPEG-2 broadcasting (10…20 Mbps for HD)
DVD video 6 . . . 8 Mbps MPEG-2
Internet video 20 . . . 200 kbps Proprietary, similar to streaming H.263, MPEG-4, or H.26L/JVT Videoconferencing, 20 . . . 320 kbps H.261, H.263 videotelephony
Video over 3G wireless 20 . . . 100 kbps H.263, MPEG-4
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 2 Motion-compensated Hybrid Coding H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.26L/JVT
Coder Control Control Data Transform/ Quant. Quantizer - Transf. coeffs Decoder Deq./Inv. Transform
0 Entropy Motion- Coding Compensated Intra/Inter Predictor
Motion Data Motion Estimator
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 3 Video Standards: Hierarchical Syntax I
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 4 Video Standards: Hierarchical Syntax II
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 5 International Telecommunication Union (ITU)
Formed in 1934 by merger of the International Telegraph Convention of 1865 and the International Radiotelegraph Convention of 1906 Several “committees,” among them z CCITT (International Telephone and Telegraph Consultative Committee) 1956-1992 z CCIR (International Radio Consultative Committee) 1927-1992 Reform in 1992 z CCITT -> ITU-T z CCIR -> ITU-R Any recommendation must be agreed upon unanimously by all its member states
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 6 ITU organization with subgroups relevant for video
ITU
ITU-R ITU-T ITU-D
SG1 SG2 … Study Group 16 - Multimedia
WP1 – Modems WP2 –Systems WP3 – Coding and interface H.320 – ISDN G.7xx – Audio V.34, V.25ter H.323 – LAN H.261 – Video H.324 – PSTN H.263 - Video T.120 - DATA
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 7 IEC and ISO
IEC – International Electrotechnical Commission z founded in 1906 to establish international standards for all electrical technologies z private, non-profit company under Swiss law ISO – International Organization for Standardization z Established in 1947 “to facilitate the international coordination and unification of industrial standards” z Private, non-profit company under Swiss law z Agency of the United Nations Joint ISO/IEC Technical Committee 1 (JTC 1) z Jointly addresses all computer-related activities z About 30% of total ISO and IEC standards
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 8 ISO/IEC organization with subgroups relevant for video
IEC ISO Requirements
Systems JTC1 Description SC29 Video
AG WG Audio
SNHC AGM RA WG1 WG12 WG11
Tests SG SG SG Implementation JBIG MHEG-5 Studies
JPEG MHEG-6 Liaisons
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 9 Requirements for a successful video coding standard
Interoperability: should assure that encoders and decoders from different manufacturers work together seamlessly. Innovation: should perform significantly better than previous standard. Competition: should be flexible enough to allow competition between manufacturers based on technical merit. Only standardize bit-stream syntax and reference decoder. Independence from transmission and storage media: should be flexible enough to be used for a range of applications. Forward compatibility: should decode bit-streams from prior standard Backward compatibility: prior generation decoders should be able to partially decode new bit-streams
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 10 Standard Development Process
subject 1985 1986 1987 1988 1989 1990 n x 384 competition convergence verification (n= 1-5) optimization RM 1 2 3 4
p x 64 convergence verification optimization REC. (p= 1-30) H.261 RM 6 7 8 m x 64 competition (m= 1,2) RM 5
Overview of H.261 Standardization Process
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 11 ITU-T Rec. H.261
International standard for ISDN picture phones and for video conferencing systems (1990) Image format: CIF (352 x 288 Y samples) or QCIF (176 * 144 Y samples), frame rate 7.5 ... 30 fps Bit-rate: multiple of 64 kbps (= ISDN-channel), typically 128 kbps including audio. Picture quality: for 128 kbps acceptable with limited motion in the scene Stand-alone videoconferencing system or desk-top videoconferencing system, integrated with PC
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 12 Image formats
sub QCIF 176 x 144
QCIF
352 x 288 ITU-R 601 CIF
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 13 H.261 macroblocks
Macroblock (MB) of 16x16 pixels Sampling format: 4:2:0 An MB consists of 4 luminance and 2 chrominance blocks
16x16 luminance 8x8 Cb- 8x8 Cr- samples samples samples
01 45
23
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 14 H.261 motion compensated prediction
Integer-pel accuracy One displacement vector per macroblock Maximum displacement vector range +/-16 horizontally and vertically Adaptive loop filter, separable filters in 1-D horizontal and vertical impulse response: [¼, ½, ¼] Differential encoding of motion vectors
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 15 H.261 residual coding
8x8 DCT Quantization z A uniform quantizer (∆=8) for intra-mode DC coefficients z A uniform threshold quantizer (∆=2,4,…,62) for AC coefficients in intra-mode and all coefficients in inter-mode Zigzag scan Run-level coding for entropy coding z (zero-run, value) symbols z zero-run: the number of coefficients quantized to zero since the last nonzero coefficient z value: the amplitude of the current nonzero coefficient
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 16 H.261 Macroblock Types (VLC Table)
Prediction MQUANT MVD CBP TCOEFF VLC Intra X 0001 Intra X X 0000 001 Inter X X 1 Inter X X X 0000 1 Inter+MC X 0000 0000 1 Inter+MC X X X 0000 0001 Inter+MC X X X X 0000 0000 01 Inter+MC+FIL X 001 Inter+MC+FIL X X X 01 Inter+MC+FIL X X X X 0000 01
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 17 ITU-T Rec. H.263
International standard for picture phones over analog subscriber lines (1995) Image format usually CIF, QCIF or Sub-QCIF, frame rate usually below 10 fps Bit-rate: arbitrary, typically 20 kbps for PSTN Picture quality: with new options as good as H.261 (at half rate) Software-only PC video phone
or TV set-top box Example: 8x8 ViaTV phone VC 105 Widely used as compression engine for Internet video streaming H.263 is also the compression core of the MPEG-4 standard
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 18 H.261 vs. H.263
Improved motion compensation z H.261 (1990): integer-pel accuracy, loop filter, 1 motion vector per MB z H.263 (1995): half-pel accuracy, no loop filter, 1 motion vector per MB Improved 3-D VLC for DCT coefficients, (last, run, level) Reduced overhead Support more picture formats Optional features defined in annexes z Unrestricted motion vectors (Annex D) z Syntax-based arithmetic coding (Annex E) z Advanced prediction mode (Annex F) • Overlapped block motion compensation (OBMC), • Switch between 1 or 4 motion vectors per MB z PB pictures (Annex G) More optional features in H.263++. (H.263 as of 2001)
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 19 Performance of H.263 and H.261
1) 2) 3) 34 4) 5) 32
30
28 1) H.263 2) H.263 w/o options PSNR [dB] 26 3) H.261 4) H.263 w/o options, integer-pel ME 24 5) H.261 w/o loop filter
32 64 128 rate [kbps]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 20 Performance of H.263 SAC mode
36 1) 35 2) 34 33 32 31 30 29 28 1) SAC-mode 27 2) w/o options 26 0 32 64 128 rate [kbps]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 21 H.263: overlapped block motion compensation (OBMC)
remote luminance Current block luminance block (8x8)
M A C
remote luminance remote luminance R block block O B L O remote luminance C block K
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 22 H.263: OBMC weights
4 5 5 5 5 5 5 4 for MV of current luminance block 5 5 5 5 5 5 5 5 5 5 6 6 6 6 5 5 5 5 6 6 6 6 5 5 5 5 6 6 6 6 5 5 5 5 6 6 6 6 5 5 2 2 2 2 2 2 2 2 for remote MV 5 5 5 5 5 5 5 5 1 1 2 2 2 2 1 1 4 5 5 5 5 5 5 4 1 1 1 1 1 1 1 1 of top/bottom 1 1 1 1 1 1 1 1 luminance block 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 2 1 1 1 1 2 2 1 1 2 2 2 2 1 1 2 2 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 1 1 1 1 2 2 2 2 1 1 1 1 2 2 for remote MV of left/right 2 2 1 1 1 1 2 2 2 1 1 1 1 1 1 2 luminance block
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 23 Performance of H.263 AP mode
36 1) 35 2) 34 33 32 31 30 29 1) AP-mode 28 2) w/o options 27 26 0 32 64 128 rate [kbps]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 24 H.263: PB pictures (Annex G)
forward prediction
P B P
bidirectional bidirectional prediction prediction
PB picture
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 25 Performance of H.263 PB-mode
36 1) 35 3a) 2) 34 33 3b) 32 31 1) w/o options (6.25 fps) 30 2) w/o options (12.5 fps) PSNR [dB] 29 3) PB-mode (12.5 fps) 28 a) P-frames b) B-frames 27 26 0 32 64 128 rate [kbps]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 26 Visual Communication Systems: H.320/H.323/H.324
Network System Video Audio Mux Control
PSTN H.324 H.261/3 G.723.1 H.223 H.245
N-ISDN H.320 H.261 G.7xx H.221 H.242
B-ISDN /ATM H.321 H.261 G.7xx H.221 Q.2931
H.310 H.261/2 G.7xx/MPEG H.222.0/1 H.245
QoS LAN H.322 H.261/3 G.7xx H.221 H.242
Non-Qos LAN H.323 H.261 G.7xx H.225.0 H.245
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 27 H.324 Multimedia Terminals
Video I/O Video codec equipment H.261/H.263
Audio I/O Audio codec Receive PSTN equipment G.723 path delay Mux/ Modem network demux V.34/V.8 H.223 User data Data protocols applications V.14, LAPM etc
Control protocol SRP/LAPM Modem control System H.245 procedures v.25ter control
Scope of H.324
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 28 Overview: Video Coding Standards
Video coding standards: the applications and the common structure Relevant standards organizations ITU-T Rec. H.261 ITU-T Rec. H.263 ISO/IEC MPEG-1 ISO/IEC MPEG-2 ISO/IEC MPEG-4 Recent progress: H.26L/JVT
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 29 ISO/IEC MPEG
MPEG-1 Standard (1991) (ISO/IEC 11172) z Target bit-rate about 1.5 Mbit/s z Typical image format CIF, no interlace z Frame rate 24 ... 30 fps z Main application: video storage for multimedia (e.g., on CD-ROM) MPEG-2 Standard (1994) (ISO/IEC 13818) z Extension for interlace, optimized for TV resolution (NTSC: 704 x 480 Pixel) z Image quality similar to NTSC, PAL, SECAM at 4 - 8 Mbit/s z HDTV at 20 Mbit/s MPEG-4 Standard (1999) (ISO/IEC 14496) z Object based coding z Wide-range of applications, with choices of interactivity, scalability, error resilience, etc.
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 30 MPEG-1/2: GOP Structure
"Group of Pictures" = “GOP“, GOP structure is very flexible
I-Picture P-Picture P-Picture
B-Pictures
time 1 3 4 2 6 7 8 5
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 31 MPEG-1 Encoder
Pre- Picture Video DCT Weighting Quantization VLC Buffer processing reordering - multiplex
Inverse quantization Motion vectors, Inverse macroblock info, Video in weighting start codes
Inverse DCT Bitstream
+ zero Picture store 1 Motion 1/2 + compensation
Picture store 2
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 32 MPEG-1: coding of I-pictures
I-pictures: intraframe coded 8x8 DCT Arbitrary weighting matrix for coefficients Differential coding of DC-coefficients Uniform quantization Zig-zag-scan, run-level-coding Entropy coding Unfortunately, not quite JPEG
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 33 MPEG-1: coding of P-pictures
Motion-compensated prediction from an encoded I-picture or P-picture (DPCM) Half-pel accuracy of motion compensation, bilinear interpolation One displacement vector per macroblock Differential coding of displacement vectors Coding of prediction error with 8x8-DCT, uniform threshold quantization, zig-zag-scan as in I-pictures
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 34 MPEG-1: coding of B-pictures
Motion-compensated prediction from two consecutive P- or I-pictures z either • only forward prediction (1 vector/macroblock) z or • only backward prediction (1 vector/macroblock) z or • Average of forward and backward prediction = interpolation (2 vectors/macroblock) Half-pel accuracy of motion compensation, bilinear interpolation Coding of prediction error with 8x8-DCT, uniform quantization, zig-zag-scan as in I-pictures
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 35 MPEG-2 vs. MPEG-1
Efficiently compress interlaced digital video at broadcast quality z Field/frame pictures z Chroma sampling z New prediction modes z Field/frame DCT z Additional scan patterns for DCT coefficients z Motion compensation with blocks of size 16x8 pels Improved coding efficiency by different quantization, VLC tables Various scalability modes
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 36 Coding of Interlaced Video (1)
Frame and field picture structures
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 37 Coding of Interlaced Video (2)
Field prediction for field pictures
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 38 Coding of Interlaced Video (3)
Field prediction for frame pictures
16 16
8
16
8
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 39 Coding of Interlaced Video (4)
Dual prime for P pictures
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 40 Coding of Interlaced Video (5)
Field/frame DCT Alternate Scan
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 41 MPEG-4
Support highly interactive multimedia applications as well as traditional applications Advanced functionalities: interactivity, scalability, error resilience… Coding of natural and synthetic audio and video, as well as graphics Enable the multiplexing of audiovisual objects and composition in a scene
¾ Video on LANs, Internet video ‘TV/film’ ¾ Wireless video AV-data ¾ Video database ¾ Interactive home shopping ¾ Video e-mail, home movies ¾ Virtual reality games, flight ‘Computer’ ‘Telecom’ simulation, multi-viewpoint Interactivity Wireless training
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 42 MPEG-4: Scene with audiovisual objects
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 43 MPEG-4: Video coding
Basic video coding z Definition of Video Object (VO), Vide Object Layer (VOL), Video Object Plane (VOP) z Improved coding efficiency vs. MPEG-1/2 • Based on H.263 baseline • Global motion compensation •Sprite • Quarter pixel motion compensation
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 44 MPEG-4: Video coding
Object-based video coding z Binary shape coding z Greyscale shape coding z Padding for block-based DCT of texture z Shape-adaptive DCT DWT for still texture coding Mesh animation, face and body animation
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 45 Shape Adaptive DCT
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 46 Video compression progress
• Intraframe coding: only spatial correlation exploited Complexity Î DCT [Ahmed, Natarajan, Rao 1974], JPEG [1992] increases • Conditional replenishment, DPCM, scalar quantization Î H.120 [1984] • Frame difference coding Î H.120 Version 2 [1988] • Motion compensation: integer-pel accurate displacements Î H.261 [1991] • Half-pel accurate motion compensation Î MPEG-1 [1993], MPEG-2/H.262 [1994] • Variable block-size motion compensation Î H.263 [1996], MPEG-4 [1999]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 47 Video compression progress
Half-pel Frame Variable Integer-pel motion difference block size motion compensation coding motion compensation 40 (MPEG-1 1993) (H.120 1988) compensation (H.261 1991) 38 (H.263 1996) Foreman 36 Conditional 10 Hz, QCIF PSNR 34 ~40 % Replenishment 100 frames encoded [dB] (H.120) 32 Intraframe DCT coding 30 (JPEG) 28 ~20 %
26 ~30 % 24 Bit Rate [kbps] 0 100 200 300 400 500 600
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 48 Video compression progress
Variable block size 42 motion compensation Conditional (H.263 1996) 40 Replenishment (H.120) Mother & Daughter PSNR 38 10 Hz, QCIF [dB] 100 frames encoded 36
Intraframe 34 ~60 % DCT coding ~70 % (JPEG) 32
30 Bit Rate [kbps] 28 0 50 100 150 200 250 300 350
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 49 Video compression progress
Variable Integer-pel block size 40 motion motion compensation compensation 38 (H.261 1991) (H.263 1996) 36 Mobile &Calendar PSNR 34 10 Hz, QCIF ~ 35 % 100 frames encoded [dB] 32 ~ 40 % 30 Intraframe DCT coding 28 (JPEG) 26 24 22 Bit Rate [kbps] 0 200 400 600 800 1000 1200 1400
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 50 H.26L/JVT:H.26L/JVT: MotionMotion CompensationCompensation AccuracyAccuracy
Coder Control Control Data Transform/ Quant. Quantizer - Transf. coeffs Decoder Deq./Inv. Transform Entropy Coding 0 Motion- Compensated Mode 1 Mode 2 Mode 3 Mode 4 Intra/Inter Predictor 0 0 1 0 0 1 1 2 3 Motion Mode 5 DataMode 6 Mode 7 Motion 0 1 2 3 0 1 0 1 2 3 2 3 4 5 6 7 Estimator 4 5 6 7 4 5 8 9 10 11 6 7 12 13 14 15 1/4 (QCIF) or 1/8 (CIF) pel
[courtesy T. Wiegand]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 51 H.26L/JVT:H.26L/JVT: MultipleMultiple ReferenceReference FramesFrames
Coder Control Control Data Transform/ Quant. Quantizer - Transf. coeffs Decoder Deq./Inv. Transform Entropy Coding 0 Motion- Compensated Intra/Inter Predictor
Multiple ReferenceMotion Frames for Data Motion Motion Compensation Estimator
[courtesy T. Wiegand]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 52 H.26L/JVT:H.26L/JVT: ResidualResidual CodingCoding
Transform z 4x4 Integer transform approximating a DCT z Expanded to 8x8 for chroma by 2x2 DC transform Intra Coding Structure z Directional spatial prediction for intra mode z Expanded to 16x16 for luma intra by 4x4 DC transform Quantization z Two inverse scan patterns z Logarithmic step size control z Smaller step size for chroma (per H.263 Annex T) Deblocking Filter (in loop)
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 53 H.26L/JVT:H.26L/JVT: EntropyEntropy CodingCoding (1)(1)
Universal Variable Length Code (UVLC)
Coder Control Control Data Transform/ Quant. Quantizer - Transf. coeffs Decoder Deq./Inv. Transform Entropy Coding 0 Motion- Compensated Intra/Inter Predictor
Motion Data Motion Estimator [courtesy T. Wiegand]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 54 H.26L/JVT:H.26L/JVT: EntropyEntropy CodingCoding (2)(2)
Context-based adaptive binary arithmetic codes (CABAC)
update probability estimation
Context Binarization Probability Coding modeling estimation engine
Adaptive binary arithmetic coder
Chooses a model Maps non-binary Uses the provided model conditioned on symbols to a for the actual encoding past observations binary sequence and updates the model
[courtesy T. Wiegand]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 55 ComparisonComparison ofof H.26LH.26L toto MPEGMPEG--44
MPEG-4: Advanced Simple Profile (ASP) z Motion Compensation: 1/4 pel z Global Motion Compensation H.26L: z Motion Compensation: 1/4 pel (QCIF), 1/8 pel (CIF) z Using CABAC entropy coding z 5 reference frames in 7 of 8 cases (News: 17 / 25) Both z Sequence structure IBBPBBP...
z QPB=QPP+2 (step size: +25%) z Search range: 32x32 around 16x16 predictor z Well-known D+λR optimization techniques
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 56 RDRD Curves:Curves: FForemanoreman (QCIF,(QCIF, 10Hz)10Hz)
39 38 37 36 35 34 33 >30% 32 e PSNR(Y) [dB] 31 30 Averag 29 28 MPEG-4 27 H.26L 26 0 163248648096112128 Bit-rate [kbit/s] [source: ITU-T VCEG]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 57 RDRD Curves:Curves: FFlowergardenlowergarden (CIF,(CIF, 30Hz)30Hz)
38 37 36 35 34 33 >30% 32 31 30
e PSNR(Y) [dB] 29 28 27 Averag 26 25 24 MPEG-4 23 H.26L 22 0 256 512 768 1024 1280 1536 1792 2048 2304 Bit-rate [kbit/s] [source: ITU-T VCEG]
Bernd Girod: EE398B Image Communication II Video Coding Standards no. 58 Bernd Girod: EE398B Image Communication II Video Coding Standards no. 59 Bernd Girod: EE398B Image Communication II Video Coding Standards no. 60 Bernd Girod: EE398B Image Communication II Video Coding Standards no. 61