Overview: Coding Standards

• Video coding standards: applications and common structure
• Relevant standards organizations
• ITU-T Rec. H.261
• ITU-T Rec. H.263
• ISO/IEC MPEG-1
• ISO/IEC MPEG-2
• ISO/IEC MPEG-4
• Recent progress: H.264/AVC

Bernd Girod: EE398B Image Communication II Video Coding Standards: H.264/AVC

The JVT Project

• ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group) formed for ITU-T standardization activity for video compression since 1997
• August 1999: 1st test model (TML-1) of H.26L
• December 2001: Formation of the Joint Video Team (JVT) between VCEG and ISO/IEC JTC 1/SC 29/WG 11 (MPEG) to establish a joint standard project - H.264 / MPEG4-AVC
• ITU-T Approval: May 2003
• ISO/IEC Approval: October 2003

[source: G. Sullivan, VCEG]

JVT Goals

• Improved coding efficiency
  - Average bit-rate reduction of 50% given fixed fidelity compared to any other standard
  - Trade-off complexity vs. coding efficiency
• Improved network friendliness
  - Anticipate error-prone transport over mobile networks and the wired and wireless Internet
  - Further improve robustness techniques in H.263 and MPEG-4
• Simple syntax specification
  - Avoid excessive quantity of optional features
  - Minimize number of “profiles” for distinct application areas

[source: G. Sullivan, VCEG]

H.264/JVT Applications

• Entertainment Video
  - Broadcast: Terrestrial / Satellite / Cable ...
  - Storage: DVD / HD-DVD / PVR ...
• Conversational Services
  - H.320 Conversational
  - 3GPP Conversational H.324/M
  - H.323 Conversational Internet/best effort IP/RTP
  - 3GPP Conversational IP/RTP/SIP
• Video Streaming
  - 3GPP Streaming IP/RTP/RTSP
  - Streaming IP/RTP/RTSP (without TCP fallback)
• Other Applications
  - 3GPP Multimedia Messaging Services
  - Digital camcorder

[source: G. Sullivan, VCEG]

Relationship to Other Standards

• Identical specifications have been approved in both ITU-T / VCEG and ISO/IEC / MPEG
• In ITU-T / VCEG this is a new & separate standard
  - ITU-T Recommendation H.264
  - ITU-T Systems (H.32x) will be modified to support it
• In ISO/IEC / MPEG this is a new “part” in the MPEG-4 suite
  - Separate design from prior MPEG-4 visual
  - New Part 10 called “Advanced Video Coding” (AVC; similar to “AAC” in MPEG-2 as a separate codec)
  - MPEG-4 Systems / File Format has been modified to support it
  - H.222.0 | MPEG-2 Systems also modified to support it
• IETF: RTP payload packetization

[source: G. Sullivan, VCEG]

H.264/AVC Profiles

• Baseline: core compression capabilities, plus error resilience, e.g., for videoconferencing, mobile video
• Main: high compression and quality, e.g., for broadcasting
• Extended: added features for efficient streaming

[source: G. Sullivan, VCEG]

H.264/AVC Coder

[Figure: H.264/AVC coder block diagram. Blocks: Coder Control, Split into Macroblocks (16x16), Transform/Scal./Quant., Scaling & Inv. Transform, Entropy Coding, Deblocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, with the decoder embedded in the coder. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Motion Data, Output Video Signal.]

[source: G. Sullivan, VCEG]

Input Video Signal

[Figure: a progressive frame, and an interlaced frame (top field first) made up of a top field and a bottom field separated in time by Δt]

• Progressive and interlaced frames can be coded as one unit
• Progressive vs. interlace frame is signaled but has no impact on decoding
• Each field can be coded separately
• Dangling fields

[source: G. Sullivan, VCEG]

Partitioning of the Picture

• Slices:
  - A picture is split into 1 or several slices
  - Slices are self-contained
  - Slices are a sequence of macroblocks
• Macroblocks:
  - Basic syntax & processing unit
  - Contains 16x16 luma samples and 2 x 8x8 chroma samples
  - Macroblocks within a slice depend on each other
  - Macroblocks can be further partitioned

[Figure: a picture divided into Slice #0, Slice #1, and Slice #2, with macroblocks numbered 0, 1, 2, … and Macroblock #40 marked]
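As a rough illustration of the macroblock grid described above, the following sketch counts the macroblocks in a picture and maps a macroblock number to the position of its top-left luma sample. It assumes macroblocks are numbered in raster-scan order; the function names and the QCIF picture size (176x144) are chosen here for illustration and do not come from the slide.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luma samples


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks for a picture.

    This sketch assumes the picture dimensions are multiples of 16.
    """
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("picture size must be a multiple of 16 in this sketch")
    return width // MB_SIZE, height // MB_SIZE


def macroblock_origin(mb_addr, width):
    """Top-left luma sample (x, y) of macroblock mb_addr in raster-scan order."""
    cols = width // MB_SIZE
    return (mb_addr % cols) * MB_SIZE, (mb_addr // cols) * MB_SIZE


cols, rows = macroblock_grid(176, 144)  # QCIF: 11 columns x 9 rows
origin_40 = macroblock_origin(40, 176)  # position of macroblock #40
```

With 4:2:0 sampling, each of these macroblocks additionally carries the two 8x8 chroma blocks mentioned above.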

[source: G. Sullivan, VCEG]

Flexible Macroblock Ordering (FMO)

• Slice Group:
  - Pattern of macroblocks defined by a macroblock allocation map
  - A slice group may contain 1 to several slices
• Macroblock allocation map types:
  - Interleaved slices
  - Dispersed macroblock allocation
  - Explicitly assign a slice group to each macroblock location in raster scan order
  - One or more “foreground” slice groups and a “leftover” slice group

[Figure: example pictures showing macroblocks assigned to Slice Group #0, Slice Group #1, and Slice Group #2 under different allocation map types]
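To make the idea of a macroblock allocation map concrete, here is a minimal sketch that builds two such maps as flat lists in raster-scan order, one entry (a slice group number) per macroblock. The assignment rules are simplified stand-ins chosen for illustration and are not the normative formulas of the standard; the function names and parameters are hypothetical.

```python
def interleaved_map(cols, rows, run_lengths):
    """Assign macroblocks in raster-scan order to slice groups in repeating runs.

    run_lengths[g] is the number of consecutive macroblocks given to group g.
    """
    if not run_lengths or min(run_lengths) < 1:
        raise ValueError("every run length must be at least 1")
    total = cols * rows
    mb_map = []
    while len(mb_map) < total:
        for group, run in enumerate(run_lengths):
            mb_map.extend([group] * run)
    return mb_map[:total]


def dispersed_map(cols, rows, num_groups):
    """Scatter slice groups so neighbouring macroblocks fall in different groups.

    Uses a simple checkerboard-like rule (illustrative, not normative).
    """
    return [(x + y) % num_groups for y in range(rows) for x in range(cols)]


rows_of_four = interleaved_map(4, 2, [4, 4])  # one slice group per macroblock row
checkerboard = dispersed_map(4, 2, 2)
```

With a dispersed map, losing all slices of one group still leaves decoded neighbours around each missing macroblock, which is what makes concealment easier.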

[source: G. Sullivan, VCEG]

Interlaced Processing

• Field coding: each field is coded as a separate picture using fields for motion compensation

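• Frame coding:
  - Type 1: the complete frame is coded as a separate picture
  - Type 2: the frame is scanned as macroblock pairs; for each macroblock pair, switch between frame and field coding

[Figure: a frame scanned as macroblock pairs, with macroblocks numbered 0, 2, 4, … over 1, 3, 5, … and macroblocks 36 and 37 marked as a macroblock pair]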

[source: G. Sullivan, VCEG]

Scanning of a Macroblock

• Intra_16x16 macroblock type only: luma 4x4 DC block (index -1)
• Chroma 2x2 DC blocks: Cb (index 16) and Cr (index 17)
• Coded Block Pattern for luma in 8x8 block order: signals which of the 8x8 blocks contains at least one 4x4 block with non-zero transform coefficients
• Luma 4x4 block order for 4x4 intra prediction and 4x4 residual coding:

     0  1  4  5
     2  3  6  7
     8  9 12 13
    10 11 14 15

• Chroma 4x4 block order for 4x4 residual coding, shown as 16-25, and intra 4x4 prediction, shown as 18-21 and 22-25:

    18 19    22 23
    20 21    24 25
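The luma 4x4 block order above follows a nested pattern: the four 8x8 blocks are visited in raster order, and the four 4x4 blocks inside each are again visited in raster order. A minimal sketch of that mapping, together with a coded-block-pattern helper in the spirit of the description above, might look as follows. The function names and the bit layout of the returned pattern are assumptions made for illustration.

```python
def luma_block_position(blk_idx):
    """Map a luma 4x4 block index (0..15) to (column, row) in units of 4x4 blocks."""
    x = ((blk_idx >> 2) & 1) * 2 + (blk_idx & 1)
    y = ((blk_idx >> 3) & 1) * 2 + ((blk_idx >> 1) & 1)
    return x, y


def luma_block_grid():
    """Rebuild the 4x4 grid of block indices from the position mapping."""
    grid = [[None] * 4 for _ in range(4)]
    for idx in range(16):
        x, y = luma_block_position(idx)
        grid[y][x] = idx
    return grid


def luma_coded_block_pattern(has_nonzero):
    """Set bit b if any 4x4 block inside 8x8 block b has non-zero coefficients.

    has_nonzero is a list of 16 booleans indexed by luma 4x4 block index.
    """
    cbp = 0
    for idx, flag in enumerate(has_nonzero):
        if flag:
            cbp |= 1 << (idx >> 2)  # blocks 4b..4b+3 belong to 8x8 block b
    return cbp
```

Because four consecutive indices always share one 8x8 block, the 8x8 block number is simply the index shifted right by two bits.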

[source: G. Sullivan, VCEG]

H.264/AVC Coder

[Figure: H.264/AVC coder block diagram. Blocks: Coder Control, Split into Macroblocks (16x16 pixels), Transform/Scal./Quant., Scaling & Inv. Transform, Entropy Coding, Intra-frame Estimation, Intra-frame Prediction, Deblocking Filter, Motion Compensation, Motion Estimation, Intra/Inter MB select. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Intra Prediction Data, Motion Data, Output Video Signal.]

[source: G. Sullivan, VCEG]

Common Elements with other Standards

• Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
• Input: Association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
• Block-wise motion compensation
• Motion vectors over picture boundaries
• Variable block-size motion
• Block transforms
• Scalar quantization
• I, P, and B coding types

[source: G. Sullivan, VCEG]

H.264 Motion Compensation Accuracy
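The quarter-sample motion vector accuracy with a 6-tap filter, noted on this slide, can be sketched in one dimension as follows. The tap values (1, -5, 20, 20, -5, 1)/32 and the rounding are those commonly cited for H.264 luma interpolation; they are not stated on the slide and should be read as an assumption. The function names are hypothetical.

```python
def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1].

    Applies the taps (1, -5, 20, 20, -5, 1) / 32 with rounding. Needs two
    samples of context to the left of i and three to the right.
    """
    if i < 2 or i + 3 >= len(row):
        raise IndexError("not enough neighbouring samples for the 6-tap filter")
    e, f, g, h, k, m = row[i - 2:i + 4]
    return clip8((e - 5 * f + 20 * g + 20 * h - 5 * k + m + 16) >> 5)


def quarter_sample(row, i):
    """Quarter-sample value between row[i] and the half-sample position.

    Obtained by averaging the two with upward rounding.
    """
    return (row[i] + half_sample(row, i) + 1) >> 1


edge = [0, 0, 0, 0, 255, 255, 255, 255]
half = half_sample(edge, 3)        # halfway across the step edge
quarter = quarter_sample(edge, 3)  # a quarter of the way across
```

A real decoder applies the same filter in two dimensions and extends samples beyond the picture boundary so that motion vectors may point outside the picture.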

[Figure: H.264/AVC coder block diagram, overlaid with the partition shapes listed below]

• Macroblock (MB) partition types: 16x16, 16x8, 8x16, 8x8
• 8x8 sub-partition types: 8x8, 8x4, 4x8, 4x4
• Motion vector accuracy 1/4 (6-tap filter)

[source: G. Sullivan, VCEG]

H.264 Multiple Reference Frames

[Figure: H.264/AVC coder block diagram]

• Multiple Reference Frames
• Generalized B Frames
• Weighted Prediction

[source: G. Sullivan, VCEG]

H.264 Intra Prediction
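The directional prediction formula given on this slide for the diagonal samples a, f, k, p, namely (A + 2Q + I + 2) >> 2, can be sketched as below. The second function extends the same (1, 2, 1) filter to the whole 4x4 block by sliding it along the line of neighbours; that extension is an assumption about how the mode behaves, not something the slide spells out. Note also that the published standard is understood to number diagonal down-right prediction as mode 4, so the slide's "Mode 3" may reflect a draft numbering.

```python
def diagonal_center_prediction(a_above, q_corner, i_left):
    """Value shared by samples a, f, k, p: (A + 2Q + I + 2) >> 2."""
    return (a_above + 2 * q_corner + i_left + 2) >> 2


def diagonal_down_right_4x4(top, left, corner):
    """Predict a 4x4 block along the down-right diagonal.

    top = [A, B, C, D], left = [I, J, K, L], corner = Q.
    """
    # Neighbours laid out along one line: L K J I Q A B C D
    line = list(left)[::-1] + [corner] + list(top)
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            c = 4 + x - y  # index of the neighbour this diagonal points at
            pred[y][x] = (line[c - 1] + 2 * line[c] + line[c + 1] + 2) >> 2
    return pred


block = diagonal_down_right_4x4([10, 0, 0, 0], [30, 0, 0, 0], 20)
```

All samples on one diagonal receive the same predicted value, which is why a, f, k, and p share a single formula.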

[Figure: H.264/AVC coder block diagram]

• Directional spatial prediction (9 types for luma, 1 for chroma)

Sample layout of a 4x4 block (a to p) and its neighbouring samples (A to H above, I to L to the left, Q at the above-left corner):

    Q A B C D E F G H
    I a b c d
    J e f g h
    K i j k l
    L m n o p

[Figure: prediction directions labelled with mode numbers 0 to 8]

• e.g., Mode 3: diagonal down/right prediction; a, f, k, p are predicted by (A + 2Q + I + 2) >> 2

[source: G. Sullivan, VCEG]

H.264 4x4 Transform

[Figure: H.264/AVC coder block diagram]

• 4x4 Block Integer Transform

$$
H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}
$$

• Repeated transform of DC coeffs for 8x8 chroma and some 16x16 Intra luma blocks
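As a small worked example of the 4x4 integer transform, the sketch below applies the matrix H to a block X as Y = H X H^T using only integer arithmetic. The two-sided form and the helper names are assumptions made for illustration; the scaling that the codec folds into quantization is left out.

```python
H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Compute Y = H * X * H^T (unscaled core transform)."""
    return matmul(matmul(H, block), transpose(H))


flat_block = [[10] * 4 for _ in range(4)]
coeffs = forward_transform(flat_block)  # only the DC coefficient is non-zero
gram = matmul(H, transpose(H))          # diagonal: the rows of H are orthogonal
```

The rows of H are orthogonal but not of equal norm (4 and 10), which is why a normalization step is needed somewhere in the chain.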

[source: G. Sullivan, VCEG]

Quantization of Transform Coefficients

• Scalar quantization
• Logarithmic step size control
• Smaller step size for chroma (per H.263 Annex T)
• Extended range of step sizes
• Can change to any step size at macroblock level
• Quantization reconstruction is one multiply, one add, one shift
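The logarithmic step size control mentioned above can be illustrated with a short sketch in which the step size doubles for every increase of 6 in the quantization parameter (QP). The base table and the QP range of 0 to 51 are the values commonly quoted for the final standard; they are not given on the slide and are an assumption here, as is the function name.

```python
# Step sizes for QP 0..5 (assumed values); later QPs repeat these, doubled.
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Quantizer step size for a given QP under the doubling-every-6 rule."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51 in this sketch")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


step_low = qstep(4)
step_high = qstep(51)
ratio_plus_2 = qstep(28) / qstep(26)  # effect of raising QP by 2
```

Under this rule, each unit increase in QP enlarges the step size by roughly 12% on average, so an increase of 2 gives about a quarter more.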

[source: G. Sullivan, VCEG]

• Improves subjective quality and PSNR of the decoded picture
• Significantly superior to post filtering
• Filtering affects the edges of the 4x4 block structure
• Adaptive filtering removes blocking artifacts, but does not unnecessarily blur the visual content
  - On slice level, the global filtering strength can be adjusted to the individual characteristics of the video sequence
  - On edge level, filtering strength is made dependent on inter/intra, motion, and coded residuals
  - On sample level, quantizer-dependent thresholds can turn off filtering for every individual sample
  - Especially strong filter for macroblocks with very flat characteristics almost removes “tiling artifacts”

[source: G. Sullivan, VCEG]

Deblocking Filter

[Figure: one-dimensional visualization of an edge position, with samples p2, p1, p0 on one side of a 4x4 block edge and q0, q1, q2 on the other]

Filtering of p0 and q0 only takes place if:

1. |p0 - q0| < α(QP)
2. |p1 - p0| < β(QP)
3. |q1 - q0| < β(QP)

where β(QP) is considerably smaller than α(QP).

Filtering of p1 or q1 takes place if additionally:

1. |p2 - p0| < β(QP) or |q2 - q0| < β(QP)

(QP = quantization parameter)
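The conditions above translate almost directly into code. The sketch below takes the thresholds α and β as plain arguments, since the QP-dependent tables are not reproduced here. It also reads the final "or" condition as applying to each side separately (p1 is filtered when the p-side test holds, q1 when the q-side test holds), which is an interpretation rather than something the slide states. The function name and return format are hypothetical.

```python
def deblock_decisions(p, q, alpha, beta):
    """Decide which samples across a 4x4 block edge get filtered.

    p = (p0, p1, p2) and q = (q0, q1, q2) are the samples on either side of
    the edge, nearest to the edge first.
    """
    p0, p1, p2 = p
    q0, q1, q2 = q
    filter_edge = (abs(p0 - q0) < alpha
                   and abs(p1 - p0) < beta
                   and abs(q1 - q0) < beta)
    return {
        "p0_q0": filter_edge,
        "p1": filter_edge and abs(p2 - p0) < beta,
        "q1": filter_edge and abs(q2 - q0) < beta,
    }


# A small step between two flat regions looks like a blocking artifact.
artifact = deblock_decisions((100, 100, 100), (104, 104, 104), alpha=10, beta=3)
# A large step is treated as real picture content and left alone.
real_edge = deblock_decisions((100, 100, 100), (200, 200, 200), alpha=10, beta=3)
```

Because the thresholds grow with QP, coarse quantization filters more edges while fine quantization leaves nearly everything untouched.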

[source: G. Sullivan, VCEG]

Deblocking: Subjective Result for Intra

Highly compressed first decoded intra picture at 0.28 bit/sample

[Figure: the decoded picture shown side by side, without filter and with H.264/AVC deblocking]

[source: G. Sullivan, VCEG]

Deblocking: Subjective Result for Inter

Highly compressed decoded inter picture

[Figure: the decoded picture shown side by side, without filter and with H.264/AVC deblocking]

[source: G. Sullivan, VCEG]

Entropy coding

[Figure: H.264/AVC coder block diagram. Blocks: Coder Control, Split into Macroblocks (16x16 pixels), Transform/Scal./Quant., Inv. Scal. & Transform, Entropy Coding, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Motion Estimation, Intra/Inter selection, with the decoder embedded in the coder. Signals: Input Video Signal, Control Data, Quant. Transf. coeffs, Motion Data, Output Video Signal.]

[source: G. Sullivan, VCEG]

Variable length coding

• Exp-Golomb code for almost all symbols except for transform coefficients
• Context adaptive VLCs for coding of transform coefficients
  - Number of coefficients is decoded
  - Special treatment of values +1 and -1
  - Contexts are built dependent on transform coefficients
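For readers unfamiliar with Exp-Golomb codes, here is a minimal sketch of the order-0 code for unsigned values, the usual signed mapping on top of it, and a matching decoder. Bit strings are represented as Python strings of '0' and '1' for readability; the function names are chosen here and are not taken from the slide.

```python
def ue(code_num):
    """Unsigned Exp-Golomb codeword: N leading zeros, then N+1 bits of code_num + 1."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def se(value):
    """Signed Exp-Golomb: map 0, 1, -1, 2, -2, ... to code numbers 0, 1, 2, 3, 4, ..."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)


def decode_ue(bitstring):
    """Decode one unsigned codeword from the start of bitstring.

    Returns (code_num, number_of_bits_consumed).
    """
    zeros = len(bitstring) - len(bitstring.lstrip("0"))
    length = 2 * zeros + 1
    if len(bitstring) < length:
        raise ValueError("truncated codeword")
    return int(bitstring[zeros:length], 2) - 1, length


codewords = [ue(n) for n in range(5)]
```

Small values get short codewords, so the code suits syntax elements whose most likely values are near zero.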

[source: G. Sullivan, VCEG]

Context-Adaptive Binary Arithmetic Coding (CABAC)

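[Figure: CABAC block diagram. Context modeling, then binarization, then an adaptive binary arithmetic coder made up of probability estimation and a coding engine, with a feedback path that updates the probability estimation]

• Context modeling: chooses a model conditioned on past observations
• Binarization: maps non-binary symbols to a binary sequence
• Adaptive binary arithmetic coder (probability estimation and coding engine): uses the provided model for the actual encoding and updates the model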
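The three stages above can be mimicked with a toy model. The sketch below binarizes symbols with a unary code, keeps one adaptive probability estimate per bin position as a stand-in for context modeling, and adds up the ideal arithmetic-coding cost of each bin instead of running a real coding engine. None of this is the normative CABAC design: the unary binarization, the exponential update rule, its rate of 0.05, and all names are assumptions made for illustration.

```python
import math


class AdaptiveBitModel:
    """Toy probability estimator for one context (not the CABAC state machine)."""

    def __init__(self, p_one=0.5, rate=0.05):
        self.p_one = p_one  # current estimate of P(bit == 1)
        self.rate = rate    # adaptation speed

    def cost_bits(self, bit):
        """Ideal code length, in bits, of coding this bit with the current estimate."""
        p = self.p_one if bit else 1.0 - self.p_one
        return -math.log2(p)

    def update(self, bit):
        """Move the estimate a small step towards the observed bit."""
        self.p_one += self.rate * ((1.0 if bit else 0.0) - self.p_one)


def unary_binarize(value):
    """Map a non-negative integer to a bin string: value ones, then a zero."""
    return [1] * value + [0]


def ideal_code_length(symbols):
    """Total ideal cost in bits, using one adaptive model per bin position."""
    contexts = {}
    total = 0.0
    for symbol in symbols:
        for position, bit in enumerate(unary_binarize(symbol)):
            model = contexts.setdefault(position, AdaptiveBitModel())
            total += model.cost_bits(bit)
            model.update(bit)
    return total


adaptive_cost = ideal_code_length([0] * 100)  # a highly predictable source
```

For the predictable input in the example, the adaptive cost falls well below the 100 bits a fixed one-bit-per-symbol code would spend, which is the benefit the feedback path provides.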

[source: G. Sullivan, VCEG]

S Pictures

• General description
  - Allows identical reconstruction of frames even when different reference frames are being used
  - SP pictures make use of motion-compensated prediction
  - SI pictures can identically reconstruct the corresponding SP pictures

• Applications
  - Bitstream switching or splicing
  - Random access
  - Fast-forward, fast-backward
  - Error recovery and/or resiliency
  - Resynchronization such as in Video Redundancy Coding

[source: G. Sullivan, VCEG]

SP and SI Pictures

[Figure: SP/SI picture decoding block diagram. Blocks: Entropy Decoding, Scaling, Transform, Quantization, Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Motion Estimation, Intra/Inter selection. Signals: Quant. Transf. coeffs, l_rec, l_pred, Control Data, Motion Data, Output Video Signal.]

[source: G. Sullivan, VCEG]

Comparison of H.264 to MPEG-4

• MPEG-4: Advanced Simple Profile (ASP)
  - Motion Compensation: 1/4 pel
  - Global Motion Compensation
• H.264:
  - Motion Compensation: 1/4 pel
  - Using CABAC entropy coding
  - 5 reference frames (News: 17)
• Both
  - Sequence structure IBBPBBP...
  - QP_B = QP_P + 2 (step size: +25%)
  - Search range: 32x32 around 16x16 predictor
  - Lagrangian D + λR coder control
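Lagrangian coder control means that each coding decision minimizes the cost D + λR, where D is the distortion and R the rate of a candidate. A minimal sketch of such a mode decision is given below; the candidate modes and their distortion and rate figures are invented purely for illustration, and the function name is hypothetical.

```python
def best_mode(candidates, lam):
    """Pick the candidate with the lowest Lagrangian cost D + lam * R.

    candidates is an iterable of (name, distortion, rate_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


# Hypothetical candidates for one macroblock: (mode, distortion, rate in bits).
candidates = [
    ("skip", 900, 1),
    ("inter16x16", 400, 40),
    ("intra4x4", 250, 120),
]

choice_low_lambda = best_mode(candidates, lam=1)     # bits are cheap
choice_mid_lambda = best_mode(candidates, lam=10)
choice_high_lambda = best_mode(candidates, lam=100)  # bits are expensive
```

As λ grows, the decision shifts from the most accurate but expensive mode towards the cheapest one, which is how a single parameter traces out the rate-distortion trade-off.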

[source: ITU-T VCEG]

RD Curves: Foreman (QCIF, 10 Hz)

[Figure: rate-distortion curves, average PSNR(Y) [dB] (26 to 39) vs. bit-rate [kbit/s] (0 to 128), for MPEG-4 and H.26L; the gap between the curves is annotated ">30%"]

[source: ITU-T VCEG]

Performance Streaming Application

Average bit-rate savings relative to:

Coder           MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC MP    37.44%       47.58%      63.57%
MPEG-4 ASP      -            16.65%      42.95%
H.263 HLP       -            -           30.61%

[Wiegand, et al. 2003]

Example Streaming Test Result

Tempete CIF 15Hz

[Figure: rate-distortion curves, Y-PSNR [dB] (24 to 38) vs. bit-rate [kbit/s] (0 to 1792), for MPEG-2, H.263 HLP, MPEG-4 ASP, and H.264/AVC MP, with test points marked]

[Wiegand, et al. 2003]

Example Streaming Test Result

Tempete CIF 15Hz

[Figure: rate saving relative to MPEG-2 (0% to 80%) vs. Y-PSNR [dB] (26 to 38), for H.264/AVC MP, MPEG-4 ASP, and H.263 HLP]

[Wiegand, et al. 2003]

Test Results for Real-Time Conversation

Average bit-rate savings relative to:

Coder           H.263 CHC   MPEG-4 SP   H.263 Base
H.264/AVC BP    27.69%      29.37%      40.59%
H.263 CHC       -           2.04%       17.63%
MPEG-4 SP       -           -           15.69%

[Wiegand, et al. 2003]

Example Real-Time Conversation Result

Paris CIF 15Hz

[Figure: rate-distortion curves, Y-PSNR [dB] (24 to 39) vs. bit-rate [kbit/s] (0 to 768), for H.263-Base, H.263 CHC, MPEG-4 SP, and H.264/AVC BP, with test points marked]

[Wiegand, et al. 2003]

Example Real-Time Test Result

Paris CIF 15Hz

[Figure: rate saving relative to H.263-Baseline (0% to 50%) vs. Y-PSNR [dB] (24 to 38), for H.264/AVC BP, H.263 CHC, and MPEG-4 SP]

[Wiegand, et al. 2003]

Test Results Entertainment-Quality Applications

Average bit-rate savings relative to:

Coder MPEG-2

H.264/AVC MP 45%

[Wiegand, et al. 2003]

Example Entertainment-Quality Applications Result

Entertainment SD (720x576i) 25Hz

[Figure: rate-distortion curves, Y-PSNR [dB] (24 to 39) vs. bit-rate [Mbit/s] (0 to 10), for MPEG-2 and H.264/AVC MP]

[Wiegand, et al. 2003]

Example Entertainment-Quality Applications Result

Entertainment SD (720x576i) 25Hz

[Figure: rate saving relative to MPEG-2 (0% to 60%) vs. Y-PSNR [dB] (26 to 38), for H.264/AVC MP]

[Wiegand, et al. 2003]

Further reading

IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on the H.264/AVC Video Coding Standard, July 2003.
