Overview: Video Coding Standards
Video coding standards: applications and common structure
Relevant standards organizations
ITU-T Rec. H.261
ITU-T Rec. H.263
ISO/IEC MPEG-1
ISO/IEC MPEG-2
ISO/IEC MPEG-4
Recent progress: H.264/AVC
Bernd Girod: EE398B Image Communication II Video Coding Standards: H.264/AVC no. 1
The JVT Project
ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group) formed for ITU-T standardization activity for video compression since 1997
August 1999: 1st test model (TML-1) of H.26L
December 2001: Formation of the Joint Video Team (JVT) between VCEG and ISO/IEC JTC 1/SC 29/WG 11 (MPEG) to establish a joint standard project - H.264 / MPEG4-AVC
ITU-T Approval: May 2003
ISO/IEC Approval: October 2003
[source: G. Sullivan, VCEG]
JVT Goals
Improved coding efficiency
• Average bit rate reduction of 50% given fixed fidelity compared to any other standard
• Trade-off complexity vs. coding efficiency
Improved network friendliness
• Anticipate error-prone transport over mobile networks and the wired and wireless Internet
• Further improve robustness techniques in H.263 and MPEG-4
Simple syntax specification
• Avoid excessive quantity of optional features
• Minimize number of “profiles” for distinct application areas
[source: G. Sullivan, VCEG]
H.264/JVT Applications
Entertainment Video
• Broadcast: Terrestrial / Satellite / Cable . . .
• Storage: DVD / HD-DVD / PVR . . .
Conversational Services
• H.320 Conversational
• 3GPP Conversational H.324/M
• H.323 Conversational Internet/best effort IP/RTP
• 3GPP Conversational IP/RTP/SIP
Video Streaming
• 3GPP Streaming IP/RTP/RTSP
• Streaming IP/RTP/RTSP (without TCP fallback)
Other Applications
• 3GPP Multimedia Messaging Services
• Digital camcorder
[source: G. Sullivan, VCEG]
Relationship to Other Standards
Identical specifications have been approved in both ITU-T / VCEG and ISO/IEC / MPEG
In ITU-T / VCEG this is a new & separate standard
• ITU-T Recommendation H.264
• ITU-T Systems (H.32x) will be modified to support it
In ISO/IEC / MPEG this is a new “part” in the MPEG-4 suite
• Separate codec design from prior MPEG-4 visual
• New Part 10 called “Advanced Video Coding” (AVC – similar to “AAC” in MPEG-2 as separate audio codec)
• MPEG-4 Systems / File Format has been modified to support it
• H.222.0 | MPEG-2 Systems also modified to support it
IETF: RTP payload packetization
[source: G. Sullivan, VCEG]
H.264/AVC Profiles
Baseline: core compression capabilities, plus error resilience, e.g., for videoconferencing, mobile video
Main: high compression and quality, e.g., for broadcasting
Extended: added features for efficient streaming
[source: G. Sullivan, VCEG]
H.264/AVC Coder
[Figure: H.264/AVC coder block diagram. The input video signal is split into macroblocks of 16x16 pixels. Blocks shown: Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform (decoder loop), Deblocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, Motion Estimation, Entropy Coding. Signals shown: control data, quantized transform coefficients, motion data, output video signal.]
[source: G. Sullivan, VCEG]
Input Video Signal
• Progressive and interlaced frames can be coded as one unit
• Progressive vs. interlace frame is signaled but has no impact on decoding
• Each field can be coded separately
• Dangling fields
[Figure: a progressive frame, and an interlaced frame (top field first) made up of a top field and a bottom field separated in time by Δt]
[source: G. Sullivan, VCEG]
Partitioning of the Picture
Slices:
• A picture is split into 1 or several slices
• Slices are self-contained
• Slices are a sequence of macroblocks
Macroblocks:
• Basic syntax & processing unit
• Contains 16x16 luma samples and 2 x 8x8 chroma samples
• Macroblocks within a slice depend on each other
• Macroblocks can be further partitioned
[Figure: a picture divided into Slice #0, Slice #1, and Slice #2, with macroblocks numbered 0, 1, 2, … and Macroblock #40 marked]
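The macroblock bookkeeping above can be sketched in a few lines. This is only an illustration: the function names are made up for this example, the QCIF picture size (176x144) is an assumed input, and macroblocks are assumed to be numbered in raster-scan order as the figure suggests.

```python
# Minimal sketch of macroblock bookkeeping for a 4:2:0 picture.
# All function names are hypothetical; they are not taken from the standard.

MB_SIZE = 16  # a macroblock covers 16x16 luma samples


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks, rounding up to whole macroblocks."""
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows


def macroblock_position(mb_index, width):
    """Map a raster-scan macroblock index to the luma (x, y) of its top-left sample."""
    cols = (width + MB_SIZE - 1) // MB_SIZE
    return (mb_index % cols) * MB_SIZE, (mb_index // cols) * MB_SIZE


def samples_per_macroblock():
    """16x16 luma samples plus two 8x8 chroma blocks (4:2:0 sub-sampling)."""
    return 16 * 16 + 2 * 8 * 8


cols, rows = macroblock_grid(176, 144)   # QCIF, assumed example size
print(cols, rows, cols * rows)           # 11 9 99
print(macroblock_position(40, 176))      # (112, 48)
print(samples_per_macroblock())          # 384
```

For a QCIF picture this gives 99 macroblocks, so a slice boundary can fall at any of 98 positions in raster order.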
[source: G. Sullivan, VCEG]
Flexible Macroblock Ordering (FMO)
Slice Group:
• Pattern of macroblocks defined by a macroblock allocation map
• A slice group may contain 1 to several slices
Macroblock allocation map types:
• Interleaved slices
• Dispersed macroblock allocation
• Explicitly assign a slice group to each macroblock location in raster scan order
• One or more “foreground” slice groups and a “leftover” slice group
[Figure: example pictures partitioned into Slice Group #0, Slice Group #1, and Slice Group #2]
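To make the idea of a macroblock allocation map concrete, here is a small sketch of two map types. It is a simplification, not the mapping formulas of the standard: `checkerboard_map` stands in for dispersed allocation, `interleaved_map` assigns slice groups in repeating runs along raster order, and all names and parameters are assumptions made for this example.

```python
# Hedged sketch of macroblock-to-slice-group maps (simplified, not the
# exact formulas from the H.264/AVC specification).


def checkerboard_map(cols, rows, num_groups=2):
    """Dispersed-style map: neighbouring macroblocks land in different groups."""
    return [(x + y) % num_groups for y in range(rows) for x in range(cols)]


def interleaved_map(cols, rows, run_lengths):
    """Interleaved-style map: run_lengths[g] consecutive macroblocks go to group g."""
    total = cols * rows
    mapping = []
    group = 0
    while len(mapping) < total:
        mapping.extend([group] * run_lengths[group])
        group = (group + 1) % len(run_lengths)
    return mapping[:total]


def neighbour_groups(mapping, cols, rows, mb_index):
    """Slice groups of the left/right/up/down neighbours of a macroblock."""
    x, y = mb_index % cols, mb_index // cols
    groups = []
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < cols and 0 <= ny < rows:
            groups.append(mapping[ny * cols + nx])
    return groups


m = checkerboard_map(4, 3)
for row in range(3):
    print(m[row * 4:(row + 1) * 4])
```

With the checkerboard pattern, every macroblock of one slice group is surrounded by macroblocks of the other group, which is why losing a single slice group still leaves neighbours available for concealment.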
[source: G. Sullivan, VCEG]
Interlaced Processing
Field coding: each field is coded as a separate picture using fields for motion compensation
Frame coding:
• Type 1: the complete frame is coded as a separate picture
• Type 2: the frame is scanned as macroblock pairs; for each macroblock pair: switch between frame and field coding
[Figure: macroblock pair scan order, with pairs numbered 0/1, 2/3, 4/5, … and a further pair numbered 36/37]
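The two ideas above, splitting a frame into fields and scanning a frame as macroblock pairs, can be sketched as follows. The helper names are hypothetical, the picture width of 18 macroblocks is an assumed example, and the address mapping reflects one reading of the pair numbering in the figure (even address on top, odd address below) for pairs coded in frame mode.

```python
# Hedged sketch: field splitting and macroblock-pair addressing.
# Names and the example width are assumptions, not taken from the slides.


def split_fields(frame_rows):
    """Split a frame (list of sample rows) into (top_field, bottom_field).

    The top field takes the even-numbered rows, the bottom field the odd ones.
    """
    return frame_rows[0::2], frame_rows[1::2]


def pair_scan_position(mb_addr, width_in_mbs):
    """Map a macroblock address in pair-scan order to (column, row) in macroblocks.

    Addresses 2k and 2k+1 form pair k: the top and bottom macroblock of a
    16x32 region. Pairs themselves advance in raster order.
    """
    pair = mb_addr // 2
    col = pair % width_in_mbs
    row = 2 * (pair // width_in_mbs) + (mb_addr % 2)
    return col, row


top, bottom = split_fields(["r0", "r1", "r2", "r3"])
print(top, bottom)

for addr in (0, 1, 2, 3, 36, 37):
    print(addr, pair_scan_position(addr, 18))
```

In field mode the same pair would instead hold the top-field rows in one macroblock and the bottom-field rows in the other; the sketch does not model that case.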
[source: G. Sullivan, VCEG]
Scanning of a Macroblock
Intra_16x16 macroblock type only: Luma 4x4 DC (block -1)
Coded Block Pattern for Luma in 8x8 block order (blocks 0, 1, 2, 3): signals which of the 8x8 blocks contains at least one 4x4 block with non-zero transform coefficients
Luma 4x4 block order for 4x4 intra prediction and 4x4 residual coding:
   0  1  4  5
   2  3  6  7
   8  9 12 13
  10 11 14 15
Chroma 4x4 block order for 4x4 residual coding, shown as 16-25, and intra 4x4 prediction, shown as 18-21 and 22-25:
  2x2 DC:  Cb 16      Cr 17
  AC:      Cb 18 19   Cr 22 23
              20 21      24 25
[source: G. Sullivan, VCEG]
H.264/AVC Coder
[Figure: H.264/AVC coder block diagram. Blocks shown: Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, Intra-frame Estimation, Intra-frame Prediction, Deblocking Filter, Motion Compensation, Intra/Inter MB select, Motion Estimation, Entropy Coding. Signals shown: control data, quantized transform coefficients, intra prediction data, motion data, output video signal.]
[source: G. Sullivan, VCEG]
Common Elements with other Standards
Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
Input: Association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
Block-wise motion compensation
Motion vectors over picture boundaries
Variable block-size motion
Block transforms
Scalar quantization
I, P, and B coding types
[source: G. Sullivan, VCEG]
H.264 Motion Compensation Accuracy
[Figure: H.264/AVC coder block diagram, annotated at the motion compensation stage]
MB types (partitions of a 16x16 macroblock): 16x16, 16x8, 8x16, 8x8
8x8 types (sub-partitions of an 8x8 block): 8x8, 8x4, 4x8, 4x4
Motion vector accuracy 1/4 (6-tap filter)
[source: G. Sullivan, VCEG]
H.264 Multiple Reference Frames
[Figure: H.264/AVC coder block diagram, annotated at the motion compensation stage]
Multiple Reference Frames
Generalized B Frames
Weighted Prediction
[source: G. Sullivan, VCEG]
H.264 Intra Prediction
[Figure: H.264/AVC coder block diagram, annotated at the intra-frame prediction stage]
Directional spatial prediction (9 types for luma, 1 chroma)
Neighboring samples (upper case) and the 4x4 block being predicted (lower case):
  Q A B C D E F G H
  I a b c d
  J e f g h
  K i j k l
  L m n o p
[Figure: prediction directions labeled 0 through 8]
• e.g., Mode 3: diagonal down/right prediction: a, f, k, p are predicted by (A + 2Q + I + 2) >> 2
[source: G. Sullivan, VCEG]
H.264 4x4 Transform
[Figure: H.264/AVC coder block diagram, annotated at the transform stage]
4x4 Block Integer Transform

$$
H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}
$$

Repeated transform of DC coeffs for 8x8 chroma and some 16x16 Intra luma blocks
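As a rough illustration of the 4x4 integer transform, the sketch below applies the matrix H to a block as Y = H X Hᵀ using integer arithmetic only. The helper names are made up for this example, and the normalization is deliberately left out: the rows of H have different norms, and the sketch assumes that the corresponding scaling is handled together with quantization, as the "Transform/Scal./Quant." block in the coder diagram suggests.

```python
# Hedged sketch of the 4x4 integer core transform (no scaling applied).
# Helper names are hypothetical.

H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain integer matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Unscaled forward transform Y = H * X * H^T of a 4x4 integer block."""
    return matmul(matmul(H, block), transpose(H))


flat = [[10] * 4 for _ in range(4)]
print(forward_transform(flat))       # only the DC coefficient is non-zero
print(matmul(H, transpose(H)))       # diagonal: rows are orthogonal, norms differ
```

Because H contains only the values ±1 and ±2, the transform needs just additions, subtractions, and shifts, and encoder and decoder compute bit-identical results.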
[source: G. Sullivan, VCEG]
Quantization of Transform Coefficients
Scalar quantization
Logarithmic step size control
Smaller step size for chroma (per H.263 Annex T)
Extended range of step sizes
Can change to any step size at macroblock level
Quantization reconstruction is one multiply, one add, one shift
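A small worked example may help with "logarithmic step size control". The sketch below assumes the commonly tabulated H.264 relationship in which the quantizer step size doubles for every increase of 6 in the quantization parameter QP (range 0 to 51); the six base values and the function name `qstep` are not from the slides and should be treated as assumptions.

```python
# Hedged sketch of logarithmic step size control.
# BASE_QSTEP holds the step sizes assumed for QP = 0..5; every further
# increase of QP by 6 doubles the step size.

BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Quantizer step size for a quantization parameter in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


for qp in (0, 6, 12, 28, 30, 51):
    print(qp, qstep(qp))

# An offset of 2 in QP changes the step size by roughly a quarter:
print(qstep(30) / qstep(28))   # 1.25
```

Under these assumed values, one QP step changes the step size by about 12%, which gives fine-grained rate control at low QP and a wide dynamic range at high QP.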
[source: G. Sullivan, VCEG]
Deblocking Filter
Improves subjective quality and PSNR of the decoded picture
Significantly superior to post filtering
Filtering affects the edges of the 4x4 block structure
Adaptive filtering removes blocking artifacts, but does not unnecessarily blur the visual content
• On slice level, the global filtering strength can be adjusted to the individual characteristics of the video sequence
• On edge level, filtering strength is made dependent on inter/intra, motion, and coded residuals
• On sample level, quantizer dependent thresholds can turn off filtering for every individual sample
• Specially strong filter for macroblocks with very flat characteristics almost removes “tiling artifacts”
[source: G. Sullivan, VCEG]
Deblocking Filter
[Figure: one-dimensional visualization of an edge position, showing samples p2, p1, p0 on one side of a 4x4 block edge and q0, q1, q2 on the other]
Filtering of p0 and q0 only takes place if:
1. |p0 - q0| < α(QP)
2. |p1 - p0| < β(QP)
3. |q1 - q0| < β(QP)
where β(QP) is considerably smaller than α(QP)
Filtering of p1 or q1 takes place if additionally:
1. |p2 - p0| < β(QP) or |q2 - q0| < β(QP)
(QP = quantization parameter)
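The sample-level conditions above can be written as a short decision function. This sketch covers only the on/off decisions, not the filter taps themselves. The thresholds `alpha` and `beta` are passed in directly because the QP-dependent tables α(QP) and β(QP) are not given in the slides, and the example threshold values are arbitrary. It also assumes that the first additional condition governs p1 and the second governs q1.

```python
# Hedged sketch of the deblocking on/off decisions across one block edge.
# Function and parameter names are hypothetical.


def filter_decisions(p, q, alpha, beta):
    """Decide which samples across an edge get filtered.

    p = (p0, p1, p2) and q = (q0, q1, q2) are the samples on either side of
    the edge, nearest first. Returns (filter_p0_q0, filter_p1, filter_q1).
    """
    p0, p1, p2 = p
    q0, q1, q2 = q
    filter_edge = (
        abs(p0 - q0) < alpha
        and abs(p1 - p0) < beta
        and abs(q1 - q0) < beta
    )
    filter_p1 = filter_edge and abs(p2 - p0) < beta
    filter_q1 = filter_edge and abs(q2 - q0) < beta
    return filter_edge, filter_p1, filter_q1


# Small step between two smooth regions: likely a blocking artifact.
print(filter_decisions((60, 61, 62), (66, 65, 64), alpha=10, beta=4))
# Large step: likely a real image edge, so it is left alone.
print(filter_decisions((60, 60, 60), (120, 120, 120), alpha=10, beta=4))
```

The design intent is visible in the thresholds: a small discontinuity between flat neighbourhoods is treated as a coding artifact, while a large one is assumed to be genuine picture content and is preserved.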
[source: G. Sullivan, VCEG]
Deblocking: Subjective Result for Intra
Highly compressed first decoded intra picture at 0.28 bit/sample
[Figure: two panels, "Without Filter" and "With H.264/AVC Deblocking"]
[source: G. Sullivan, VCEG]
Deblocking: Subjective Result for Inter
Highly compressed decoded inter picture
[Figure: two panels, "Without Filter" and "With H.264/AVC Deblocking"]
[source: G. Sullivan, VCEG]
Entropy coding
[Figure: H.264/AVC coder block diagram, annotated at the Entropy Coding stage]
[source: G. Sullivan, VCEG]
Variable length coding
Exp-Golomb code for almost all symbols except for transform coefficients
Context adaptive VLCs for coding of transform coefficients
• Number of coefficients is decoded
• Special treatment of values +1 and -1
• Contexts are built dependent on transform coefficients
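For readers unfamiliar with Exp-Golomb codes, here is a minimal sketch of the order-0 code used for unsigned and signed syntax elements. Bit strings are represented as Python strings of "0" and "1" for readability; the function names are chosen for this example only.

```python
# Hedged sketch of order-0 Exp-Golomb coding on strings of "0"/"1".
# Function names are hypothetical.


def ue_encode(code_num):
    """Unsigned Exp-Golomb: M leading zeros, then the (M+1)-bit binary of code_num + 1."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def ue_decode(bitstring, pos=0):
    """Decode one unsigned codeword starting at pos; return (code_num, next_pos)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bitstring[pos + zeros:end], 2) - 1, end


def se_encode(value):
    """Signed Exp-Golomb: positive v maps to 2v - 1, zero or negative v maps to -2v."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue_encode(code_num)


for n in range(5):
    print(n, ue_encode(n))

stream = ue_encode(0) + ue_encode(1) + ue_encode(3)
pos, decoded = 0, []
while pos < len(stream):
    value, pos = ue_decode(stream, pos)
    decoded.append(value)
print(decoded)   # [0, 1, 3]
```

The code is prefix-free and needs no tables, which is one reason a single code family can serve nearly all non-coefficient syntax elements.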
[source: G. Sullivan, VCEG]
Context-Adaptive Arithmetic Coding (CABAC)
Context modeling: chooses a model conditioned on past observations
Binarization: maps non-binary symbols to a binary sequence
Adaptive binary arithmetic coder (probability estimation and coding engine): uses the provided model for the actual encoding and updates the model (feedback path: update probability estimation)
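The three stages can be imitated with a toy model. The sketch below is not CABAC itself: it uses unary binarization, picks a context from the bin position, estimates probabilities from simple counts rather than a table-driven state machine, and reports the ideal code length (-log2 p) in place of running an arithmetic coding engine. All class and function names are assumptions for illustration.

```python
# Toy illustration of context modeling, binarization, and adaptive
# probability estimation. This is NOT the CABAC algorithm of the standard.
import math


def unary_binarize(value):
    """Map a non-negative integer to a bin string: value ones, then a zero."""
    return [1] * value + [0]


class AdaptiveBitModel:
    """Count-based probability estimate for one context."""

    def __init__(self):
        self.counts = [1, 1]  # start with equal belief in 0 and 1

    def probability(self, bit):
        return self.counts[bit] / sum(self.counts)

    def cost_and_update(self, bit):
        """Return the ideal code length of bit in bits, then adapt the model."""
        cost = -math.log2(self.probability(bit))
        self.counts[bit] += 1
        return cost


def ideal_cost(symbols, max_context=2):
    """Total ideal code length when each bin position has its own model."""
    models = {}
    total = 0.0
    for symbol in symbols:
        for index, bit in enumerate(unary_binarize(symbol)):
            context = min(index, max_context)   # context chosen from bin position
            model = models.setdefault(context, AdaptiveBitModel())
            total += model.cost_and_update(bit)
    return total


print(unary_binarize(3))
print(ideal_cost([0] * 100))        # far below 100 bits once the model adapts
print(ideal_cost([0, 3, 1, 0, 2]))
```

Even this crude estimator shows the benefit of adaptation: a highly skewed source costs far less than one bit per bin once the model has learned the skew.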
[source: G. Sullivan, VCEG]
S Pictures
General description
• Allows identical reconstruction of frames even when different reference frames are being used
• SP pictures make use of motion-compensated prediction
• SI pictures can exactly reproduce SP pictures
Applications
• Bitstream switching or splicing
• Random access
• Fast-forward, fast-backward
• Error recovery and/or resiliency
• Resynchronization such as in Video Redundancy Coding
[source: G. Sullivan, VCEG]
SP and SI Pictures
[Figure: block diagram for SP and SI pictures. Blocks shown: Entropy Decoding, Transform, Quantization, Scaling & Inv. Transform, De-blocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, Motion Estimation. Signals shown: control data, quantized transform coefficients, motion data, l_rec, l_pred, output video signal.]
[source: G. Sullivan, VCEG]
Comparison of H.264 to MPEG-4
MPEG-4: Advanced Simple Profile (ASP)
• Motion Compensation: 1/4 pel
• Global Motion Compensation
H.264:
• Motion Compensation: 1/4 pel
• Using CABAC entropy coding
• 5 reference frames (News: 17)
Both
• Sequence structure IBBPBBP...
• QPB = QPP + 2 (step size: +25%)
• Search range: 32x32 around 16x16 predictor
• Lagrangian D+λR coder control
[source: ITU-T VCEG]
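The "Lagrangian D+λR coder control" used for both codecs in this comparison amounts to choosing, among the candidate coding options, the one that minimizes J = D + λR. The sketch below shows that selection rule only; the candidate modes and their distortion and rate figures are invented numbers for illustration, not measurements.

```python
# Hedged sketch of Lagrangian mode decision: minimize J = D + lambda * R.
# Candidate names and numbers are made up for this example.


def lagrangian_cost(distortion, rate_bits, lam):
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """Pick the (name, distortion, rate_bits) candidate with the lowest cost."""
    return min(candidates, key=lambda c: lagrangian_cost(c[1], c[2], lam))


candidates = [
    ("16x16", 400.0, 20),   # cheap to signal, coarse prediction
    ("8x8", 250.0, 60),
    ("4x4", 200.0, 150),    # accurate prediction, many motion vectors
]

for lam in (0.1, 1.0, 10.0):
    print(lam, choose_mode(candidates, lam)[0])
```

A small λ favours low distortion (fine partitions), while a large λ favours low rate (coarse partitions), so λ is typically tied to the quantizer step size.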
RD Curves: Foreman (QCIF, 10Hz)
[Figure: rate-distortion curves for MPEG-4 and H.26L; x-axis: Bit-rate [kbit/s], 0 to 128; y-axis: Average PSNR(Y) [dB], 26 to 39; annotation: >30%]
[source: ITU-T VCEG]
Performance Streaming Application
Average bit-rate savings relative to:

Coder          MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC MP   37.44%       47.58%      63.57%
MPEG-4 ASP     -            16.65%      42.95%
H.263 HLP      -            -           30.61%
[Wiegand, et al. 2003]
Example Streaming Test Result
Tempete CIF 15Hz
[Figure: rate-distortion curves with test points for MPEG-2, H.263 HLP, MPEG-4 ASP, and H.264/AVC MP; x-axis: Bit-rate [kbit/s], 0 to 1792; y-axis: Y-PSNR [dB], 24 to 38]
[Wiegand, et al. 2003]
Example Streaming Test Result
Tempete CIF 15Hz
[Figure: rate saving relative to MPEG-2 (0% to 80%) versus Y-PSNR [dB] (26 to 38) for H.264/AVC MP, MPEG-4 ASP, and H.263 HLP]
[Wiegand, et al. 2003]
Test Results for Real-Time Conversation
Average bit-rate savings relative to:

Coder          H.263 CHC   MPEG-4 SP   H.263 Base
H.264/AVC BP   27.69%      29.37%      40.59%
H.263 CHC      -           2.04%       17.63%
MPEG-4 SP      -           -           15.69%
[Wiegand, et al. 2003]
Example Real-Time Conversation Result
Paris CIF 15Hz
[Figure: rate-distortion curves with test points for H.263-Base, H.263 CHC, MPEG-4 SP, and H.264/AVC BP; x-axis: Bit-rate [kbit/s], 0 to 768; y-axis: Y-PSNR [dB], 24 to 39]
[Wiegand, et al. 2003]
Example Real-Time Test Result
Paris CIF 15Hz
[Figure: rate saving relative to H.263-Baseline (0% to 50%) versus Y-PSNR [dB] (24 to 38) for H.264/AVC BP, H.263 CHC, and MPEG-4 SP]
[Wiegand, et al. 2003]
Test Results Entertainment-Quality Applications
Average bit-rate savings relative to:
Coder MPEG-2
H.264/AVC MP 45%
[Wiegand, et al. 2003]
Example Entertainment-Quality Applications Result
Entertainment SD (720x576i) 25Hz
[Figure: rate-distortion curves for MPEG-2 and H.264/AVC MP; x-axis: Bit-rate [Mbit/s], 0 to 10; y-axis: Y-PSNR [dB], 24 to 39]
[Wiegand, et al. 2003]
Example Entertainment-Quality Applications Result
Entertainment SD (720x576i) 25Hz
[Figure: rate saving relative to MPEG-2 (0% to 60%) versus Y-PSNR [dB] (26 to 38) for H.264/AVC MP]
[Wiegand, et al. 2003]
Further reading
IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on the H.264/AVC Video Coding Standard, July 2003.