Video Coding (esp. ITU & ISO/IEC Standards)

Gary J. Sullivan, Ph.D.
ITU-T VCEG Rapporteur | Chair
ISO/IEC MPEG Video Rapporteur | Co-Chair
ITU/ISO/IEC JVT Rapporteur | Co-Chair
IEEE Fellow (2006)
Microsoft Corporation Video Architect
March 2006

Video Coding Standardization Organizations
§ Two organizations have historically dominated general-purpose video compression standardization:
• ITU-T Video Coding Experts Group (VCEG): International Telecommunication Union – Telecommunication Standardization Sector (ITU-T, a United Nations Organization, formerly CCITT), Study Group 16, Question 6
• ISO/IEC Moving Picture Experts Group (MPEG): International Organization for Standardization and International Electrotechnical Commission, Joint Technical Committee Number 1, Subcommittee 29, Working Group 11
§ Recently, the Society of Motion Picture and Television Engineers (SMPTE) has also entered with “VC-1”, based on Microsoft’s WMV 9, but this talk covers only the ITU and ISO/IEC work.

Video Standards Overview March ‘06 Gary J. Sullivan

Chronology of International Video Coding Standards
[Figure: timeline from 1990 to 2004]
ITU-T standards: H.120 (1984-1988), H.261 (1990+), H.263 (1995-2000+)
Joint ITU-T / ISO/IEC standards: H.262 / MPEG-2 (1994/95-1998+), H.264 / MPEG-4 AVC (2003-2006)
ISO/IEC standards: MPEG-1 (1993), MPEG-4 Visual (1998-2001+)

The Scope of Picture and Video Coding Standardization
§ Only the Syntax and Decoder are standardized:
• Permits optimization beyond the obvious
• Permits complexity reduction for implementability
• Provides no guarantees of Quality
[Figure: processing chain. Source, Pre-Processing, Encoding on the sending side; Decoding, Post-Processing & Error Recovery, Destination on the receiving side. Only Decoding is marked "Scope of Standard".]


Predictive Coding and DPCM
§ Separate quantization of each sample is known as pulse-code modulation (PCM)
§ Predictive Coding or Differential PCM: Generate an estimate for the value of the input data, and encode only the remaining difference.
[Figure: DPCM block diagram. The prediction is subtracted from the input, the difference goes through the Quantizer and the Entropy Coder to the (Channel), then through the Entropy Decoder; the decoded difference is added to the prediction to give the Decoded Samples. The Predictor (e.g. weighted sum) is fed from a Delay (Memory). Decoding process shown shaded.]

H.120: The First Digital Video Coding Standard
§ ITU-T (ex-CCITT) Rec. H.120: The first digital video coding standard (1984)
• v1 (1984) had conditional replenishment, DPCM, scalar quantization, variable-length coding, switch for quincunx sampling
• v2 (1988) added motion compensation and background prediction
• Operated at 1544 (NTSC) and 2048 (PAL) kbps
• Few units made, essentially not in use today
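
The DPCM loop described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predictor is simply the previous reconstructed sample, the quantizer is uniform with step size `step`, and entropy coding is left out. The function names are hypothetical and do not come from H.120 or any other standard.

```python
def dpcm_encode(samples, step):
    """Previous-sample DPCM: quantize the prediction error of each sample."""
    indices = []
    prediction = 0  # assumed initial predictor state, shared with the decoder
    for sample in samples:
        residual = sample - prediction
        index = round(residual / step)  # uniform scalar quantizer
        indices.append(index)
        # Track the decoder's reconstruction so both sides predict identically.
        prediction = prediction + index * step
    return indices


def dpcm_decode(indices, step):
    """Rebuild samples by adding each dequantized residual to the prediction."""
    decoded = []
    prediction = 0
    for index in indices:
        value = prediction + index * step
        decoded.append(value)
        prediction = value
    return decoded
```

Because the encoder predicts from its own reconstruction rather than from the original input, the quantization error does not accumulate: each decoded sample stays within half a step of the original.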

“Intra” Picture Coding by DCT
Basic “intra” image representation: Discrete Cosine Transform (DCT) (early ‘70s, ITU+ISO JPEG approved ‘92):
• Analyze 8x8 blocks of image according to DCT frequency content (images tend to be smooth)
• Find magnitude of each discrete frequency within the block
• Round off (“quantize”) the amounts to scaled integer values (‘50s, ‘60s, ...)
• Send integer approximations to decoder using “Huffman” variable-length codes (VLC, early ‘50s)

The Discrete Cosine Transform
§ The DCT (unitary type II DCT):
$$F_{m,n}(u,v) = \left(c_u\sqrt{\tfrac{2}{M}}\right)\left(c_v\sqrt{\tfrac{2}{N}}\right)\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(mM+x,\,nN+y)\,\cos\left[\frac{(2x+1)u\pi}{2M}\right]\cos\left[\frac{(2y+1)v\pi}{2N}\right]$$
§ The Inverse DCT (unitary type III DCT):
$$\tilde{f}(mM+x,\,nN+y) = \sum_{u=0}^{M-1}\sum_{v=0}^{N-1}\left(c_u\sqrt{\tfrac{2}{M}}\right)\left(c_v\sqrt{\tfrac{2}{N}}\right)\tilde{F}_{m,n}(u,v)\,\cos\left[\frac{(2x+1)u\pi}{2M}\right]\cos\left[\frac{(2y+1)v\pi}{2N}\right]$$
§ Definition of Constants
$$c_u = 1/\sqrt{2} \text{ for } u = 0, \text{ otherwise } 1; \qquad M = 8 \text{ in current visual standards}$$
$$c_v = 1/\sqrt{2} \text{ for } v = 0, \text{ otherwise } 1; \qquad N = 8 \text{ in current visual standards}$$
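
As a hedged illustration of the unitary 8x8 DCT pair on this slide, the following Python sketch evaluates the double sums directly for a single block (m = n = 0, M = N = 8). It is a slow reference-style sketch, not how a real codec computes the transform; the function names are hypothetical.

```python
import math

N = 8  # block size; M = N = 8 as on the slide


def c(k):
    """Normalization constant: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def basis(x, u):
    """One-dimensional cosine basis value for sample x and frequency u."""
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct_8x8(block):
    """Forward unitary type II DCT of an 8x8 block (list of 8 rows)."""
    scale = math.sqrt(2 / N)
    return [[c(u) * scale * c(v) * scale
             * sum(block[x][y] * basis(x, u) * basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_8x8(coeffs):
    """Inverse (type III) DCT: weighted sum of basis functions."""
    scale = math.sqrt(2 / N)
    return [[sum(c(u) * scale * c(v) * scale * coeffs[u][v]
                 * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]
```

For a flat block, all of the energy lands in the DC coefficient F(0,0), which equals 8 times the sample value; this is the "images tend to be smooth" argument for why the DCT compacts energy.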


Coefficient Scan Order: The Zig-Zag Scan
[Figure: 8x8 block of coefficient positions, numbered in raster order, with the zig-zag scan path drawn over it]
 0  1  2  3  4  5  6  7
 8  9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63

Interframe Motion Prediction
§ Large areas of images stay the same from frame to frame, changing mostly due to motion
§ Conditional Replenishment: Can signal to leave a block area of the image unchanged, or replace it with new data
§ Interframe Difference Coding: Could encode a refinement to the value of an area
§ Displaced Frame Difference Coding: Can predict an image area by copying some nearby part of the previous image (motion compensation) and optionally adding some refinement
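
The zig-zag path over the raster-numbered 8x8 grid can be generated by walking the anti-diagonals and alternating direction. The sketch below is illustrative; `zigzag_order` is a hypothetical helper name, and it assumes the conventional scan that starts at position 0, moves right to position 1, then down-left to position 8.

```python
def zigzag_order(n=8):
    """Return raster indices of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):          # s = row + column of the anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)       # even diagonals run bottom-left to top-right
        for r in rows:
            order.append(r * n + (s - r))
    return order


def zigzag_scan(block):
    """Flatten a square block (list of rows) into zig-zag order."""
    n = len(block)
    return [block[i // n][i % n] for i in zigzag_order(n)]
```

The scan visits low-frequency positions first, so after quantization the nonzero coefficients tend to cluster at the start of the list and the zeros at the end.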


P-Picture Predictive Coding
[Figure: picture sequence I P P P P, each P picture predicted from the preceding picture]

H.261: The Basis of Modern Video Compression
§ ITU-T (ex-CCITT) Rec. H.261: The first widespread practical success
• First design (late ‘90) embodying typical structure dominating today:
– 16x16 macroblock motion compensation,
– 8x8 DCT,
– scalar quantization,
– zig-zag scan, and
– run-length – variable-length coding
• Key aspects later dropped by other standards: loop filter, integer motion comp., 2-D VLC, header overhead
• v2 (early ‘93) added a backward-compatible high-resolution graphics trick mode
• Operated at 64-2048 kbps
• Still in use, although mostly as a backward-compatibility feature – overtaken by H.263
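
The scalar quantization and run-length steps in the list above can be sketched as follows. This is a simplified illustration, not the H.261 quantizer or VLC tables: the quantizer here simply truncates toward zero, the input is assumed to be already zig-zag scanned, and the function names are hypothetical.

```python
def quantize(coeffs, step):
    """Uniform scalar quantizer that truncates toward zero (a dead-zone quantizer)."""
    return [int(value / step) for value in coeffs]


def run_level_pairs(scanned):
    """Convert scanned levels into (zero_run, level) pairs.

    Trailing zeros produce no pair; a real bitstream would signal them
    with an end-of-block code.
    """
    pairs = []
    run = 0
    for level in scanned:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs
```

Each (run, level) pair is what a 2-D variable-length code table would then map to a codeword, with short codewords for the most common pairs.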

Blocks and Macroblocks
The luma and chroma planes are divided into blocks. Luma blocks are associated with Cb and Cr blocks to create a macroblock.
[Figure: macroblock made of a Y (luma) area and the Cb and Cr chroma areas (two chroma fields), each divided into 8x8 sample blocks]

H.261&3 Macroblock Structure
[Figure: sampling grid with legend "luma sample" and "chroma sample"]
Intra/Inter Decisions: 16x16 macroblock
DCT of 8x8 blocks
H.261: 16x16 1-pel motion
H.263: 16x16 1/2-pel motion or (AP mode) 8x8 1/2-pel motion with overlapping

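Basic Hybrid Structure of H.261, etc. (late ’90)
[Figure: hybrid coder block diagram. The Input Video Signal is split into macroblocks of 16x16 pixels; the prediction is subtracted and the difference passes through Transform/Quantizer to give the quantized transform coefficients, which go to Entropy Coding. An embedded Decoder (Deq./Inv. Transform, Motion-Compensated Predictor, Intra/Inter switch with 0 for intra) rebuilds the reference. The Motion Estimator supplies Motion Data and the Coder Control supplies Control Data.]

Predictive Coding with (old-fashioned) B Pictures
[Figure: picture sequence I B P B P]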


MPEG-1: Practical Video at Higher Rates than H.261
§ Formally ISO/IEC 11172-2 (‘93), developed by ISO/IEC JTC1 SC29 WG11 (MPEG) – use is fairly widespread (esp. Video CD in Asia), but mostly overtaken by MPEG-2
• Superior quality to H.261 when operated at higher bit rates (≥ 1 Mbps for CIF 352x288 resolution)
• Can provide approximately VHS quality between 1-2 Mbps using SIF 352x240/288 resolution
• Technical features inherited from H.261
– 16x16 macroblocks
– 16x16 motion compensation,
– 8x8 DCT,
– scalar quantization,
– zig-zag scan, and
– run-length – variable-length coding
• Technical features added:
– Bi-directional motion prediction
– Half-pixel motion
– Slice-structured coding
– DC-only “D” pictures
– Quantization weighting matrices

Interlaced Video (Welcome to the 1940 Analog World)
[Figure: interlaced sampling shown on Vertical/Horizontal and Vertical/Temporal axes]

MPEG-2/H.262: Even Higher Bit Rates and Interlace
§ Formally ISO/IEC 13818-2 & ITU-T H.262, developed (‘94) jointly by ITU-T and ISO/IEC SC29 WG11 (MPEG) – Now in very wide use for DVD and standard and high-definition DTV (the most commonly used video coding standard)
• Primary new technical features:
– Support for interlaced-scan pictures
– Increased DC quantization precision
• Also
– Various forms of scalability (SNR, Spatial, breakpoint)
– I-picture concealment motion vectors
• Essentially the same as MPEG-1 for progressive-scan pictures, and MPEG-1 forward compatibility required
• Not especially useful below 2-3 Mbps (range of use normally 2-5 Mbps SDTV broadcast, 6-8 DVD, 20 HDTV)
• Essentially fixed frame rate

H.263: The Next Generation
§ ITU-T Rec. H.263 (v1: 1995): The next generation of video coding performance, developed by ITU-T – the current premier ITU-T video standard (has overtaken H.261 as dominant videoconferencing codec)
• Superior quality to prior standards at all bit rates (except perhaps for interlaced video)
• Better by a factor of two at very low rates
• Versions 2 (late 1997/early 1998) & v3 (2000) later developed with a large number of new features
• Profiles defined early 2001
• A somewhat tangled relationship with MPEG-4


What Was in H.263 Version 1?
§ “Baseline” Algorithm Features beat H.261
• Half-pel motion compensation (also in MPEG-1)
• 3-D variable length coding of DCT coefficients
• Median motion vector prediction
• More efficient coding pattern signaling (?)
• Deletable GOB header overhead (also in MPEG-1, but not 2?)
§ Optional Enhanced Modes
• Increased motion vector range with picture extrapolation
• Variable-size, overlapped motion with picture extrapolation
• PB-frames (bi-directional prediction)
• Arithmetic entropy coding
• Continuous-presence multipoint / video mux

H.263+ Feature Categories
§ Error resilience
§ Improved compression efficiency (e.g., 15-25% overall improvement over H.263v1)
§ Custom and Flexible Video Formats
§ Scalability for resilience and multipoint
§ Supplemental enhancement information
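
Median motion vector prediction, listed among the baseline features above, can be sketched as a component-wise median of three neighboring motion vectors, with only the difference from that prediction being coded. The choice of neighbors (left, above, above-right) and the omission of picture-border special cases are assumptions of this sketch, and the function names are hypothetical.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(left, above, above_right):
    """Component-wise median of three candidate motion vectors (x, y)."""
    return (median3(left[0], above[0], above_right[0]),
            median3(left[1], above[1], above_right[1]))


def mv_difference(mv, predictor):
    """Difference that would be entropy coded instead of the raw vector."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])
```

The median discards a single outlying neighbor, so one block with unusual motion does not spoil the prediction for the blocks around it.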


MPEG-4 “Visual”: Baseline H.263 and Many Creative Extras
§ MPEG-4 part 2 (v1: early 1999), formally ISO/IEC 14496-2
§ Contains the H.263 baseline design
• coding efficiency enhancements (esp. at low rates)
§ Adds many creative new extras:
• more coding efficiency enhancements
• error resilience / packet loss enhancements
• segmented coding of shapes
• zero-tree wavelet coding of still textures
• coding of synthetic and semi-synthetic content,
• 10 & 12-bit sampling,
• more
• v2 (early 2000) & v3 (early 2001) & …

H.263++ Version 3 Features
§ Annex U: Fidelity enhancement by macroblock and block-level reference picture selection – a significant improvement in picture quality
§ Annex V: Packet Loss & Error Resilience using data partitioning with reversible VLCs (roughly similar to MPEG-4 data partitioning, but improved by using reversible coding of motion vectors rather than coefficients)
§ Annex W: Additional Supplemental Enhancement Information
• IDCT Mismatch Elimination (specific fixed-point fast IDCT)
• Arbitrary binary user data
• Text messages (arbitrary, copyright, caption, video description, and URI)
• Error Resilience:
– Picture header repetition (current, previous, next+TR, next-TR)
– Spare reference pictures for error concealment
• Interlaced field indications (top & bottom)

MPEG-4 Visual Focus: Simple Profile
§ The most basic video coding profile of MPEG-4
§ No shape coding
§ Progressive-scan video only
§ Most popular in low cost / low rate / low resolution apps (e.g., mobile) – top bit rate & resolution limited
§ Basic contents
• H.263 baseline
• Motion vectors over picture boundaries
• Variable block-size motion compensation
• Intra DCT coefficient prediction
• Handling of four streams in most levels
• Error / packet-loss features – data partitioning, RVLC

MPEG-4 Visual Focus: Advanced Simple Profile
§ Target goal: General rectangular video with improved coding efficiency
§ Progressive-scan and interlaced video support
§ Up to SDTV resolution
§ Basic contents
• All of Simple profile
• B pictures
• Global motion compensation
• Quarter-sample motion compensation
• Interlace handling

MPEG-4 Visual Focus: Studio Profile
§ Target goal: studio & professional use
§ Progressive-scan and interlaced video support
§ Up to very high resolution and bit rate
§ Basic contents
• Enhanced-accuracy IDCT
• B pictures
• 10 & 12 bit sample accuracy
• 4:2:2 & 4:4:4 chroma sampling structures

The H.264/MPEG-4 (AVC) Standard
Gary J. Sullivan, Ph.D.
ITU-T VCEG Rapporteur | Chair
ISO/IEC MPEG Video Rapporteur | Co-Chair
ITU/ISO/IEC JVT Rapporteur | Co-Chair
Microsoft Corporation Video Architect
March 2006

The Advanced Video Coding Project
AVC = ITU-T H.264 / MPEG-4 part 10
§ History: ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group) “H.26L” standardization activity (where the “L” stood for “long-term”)
§ Aug 1999: 1st test model (TML-1)
§ July 2001: MPEG open call for technology: H.26L demo’ed
§ Dec 2001: Formation of the Joint Video Team (JVT) between VCEG and MPEG to finalize H.26L as a new joint project (similar to MPEG-2/H.262)
§ July 2002: Final Committee Draft status in MPEG
§ Dec ‘02 technical freeze, FCD ballot approved
§ May ’03 completed in both orgs
§ July ’04 Fidelity Range Extensions (FRExt) completed
§ Jan ’05 launched

H.264/MPEG-4 AVC Objectives
§ Primary technical objectives:
• Significant improvement in coding efficiency
• High loss/error robustness
• “Network Friendliness” (carry it well on MPEG-2 or RTP or H.32x or in MPEG-4 file format or MPEG-4 systems or …)
• Low latency capability (better quality for higher latency)
• Exact match decoding
§ Additional version 2 objectives (in FRExt):
• Professional applications (more than 8 bits per sample, 4:4:4 color sampling, etc.)
• Higher-quality high-resolution video
• Alpha plane support (a degree of “object” functionality)

Relating to Other ITU & MPEG Standards
§ Same design to be approved in both ITU-T VCEG and ISO/IEC MPEG
§ In ITU-T VCEG this is a new & separate standard
• ITU-T Recommendation H.264
• ITU-T Systems (H.32x) support it
§ In ISO/IEC MPEG this is a new “part” in the MPEG-4 suite
• Separate codec design from prior MPEG-4 visual
• New part 10 called “Advanced Video Coding” (AVC – similar to “AAC” position in MPEG-2 as separate codec)
• Not backward or forward compatible with prior standards (incl. the prior MPEG-4 visual spec. – core technology is different)
• MPEG-4 Systems / File Format supports it
§ H.222.0 | MPEG-2 Systems also supports it

A Comparison of Performance
§ Test of different standards (ICIP 2002 study)
§ Using same rate-distortion optimization techniques for all codecs
§ Streaming test: High-latency (included B frames)
• Four QCIF sequences coded at 10 Hz and 15 Hz (Foreman, Container, News, Tempete) and
• Four CIF sequences coded at 15 Hz and 30 Hz (Bus, Flower Garden, Mobile and Calendar, and Tempete)
§ Real-time conversation test: No B frames
• Four QCIF sequences encoded at 10Hz and 15Hz (Akiyo, Foreman, Mother and Daughter, and Silent Voice)
• Four CIF sequences encoded at 15Hz and 30Hz (Carphone, Foreman, Paris, and Sean)
§ Compare four codecs using PSNR measure:
• MPEG-2 (in high-latency/streaming test only)
• H.263 (high-latency profile, conversational high-compression profile, baseline profile)
• MPEG-4 Visual (simple and advanced simple profiles with & without B pictures)
• H.264/AVC (with & without B pictures)
§ Note: These test results are from a private study and are not an endorsed report of the JVT, VCEG or MPEG organizations.
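
The PSNR measure used for this comparison is the usual peak signal-to-noise ratio, 10·log10(peak² / MSE). A minimal sketch for 8-bit luma samples follows; it assumes both pictures are given as flat sequences of equal length, and the function name is hypothetical rather than taken from the study.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(test):
        raise ValueError("pictures must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical pictures
    return 10 * math.log10(peak * peak / mse)
```

For example, a uniform error of one gray level on every sample gives an MSE of 1 and a PSNR of about 48.13 dB, well above the 25 to 39 dB range that is typical of the rate-distortion curves in studies like this one.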


Comparison to MPEG-2, H.263, MPEG-4p2
[Figure: Foreman QCIF 10Hz. Quality, Y-PSNR [dB], versus Bit-rate [kbit/s] from 0 to 250, with curves for JVT/H.264/AVC, MPEG-4 Visual, MPEG-2, and H.263]

Comparison to MPEG-2, H.263, MPEG-4p2
[Figure: Tempete CIF 30Hz. Quality, Y-PSNR [dB], versus Bit-rate [kbit/s] from 0 to 3500, with curves for JVT/H.264/AVC, MPEG-4 Visual, MPEG-2, and H.263]


Caution: Your Mileage Will Vary
§ This encoding software may not represent implementation quality
§ These tests only up to CIF (quarter-standard-definition) resolution
§ Interlace, SDTV, and HDTV not tested in this test
§ Test sequences may not be representative of the variety of content encountered by applications
§ These tests so far not aligned with profile designs
§ This study reports PSNR, but perceptual quality is what matters

Computing Resources for the New Design
§ New design includes relaxation of traditional bounds on computing resources – rough guess 2-3x the MIPS, ROM & RAM requirements of MPEG-2 for decoding, 3-4x for encoding
§ Particularly an issue for low-power (e.g., mobile) devices
§ Problem areas:
• Smaller block sizes for motion compensation (cache access issues)
• Longer filters for motion compensation (more memory access)
• Multi-frame motion compensation (more memory for reference frame storage)
• In-loop deblocking filter (more processing & memory access)
• More segmentations of macroblock to choose from (more searching in the encoder)
• More methods of predicting intra data (more searching)
• Arithmetic coding (adaptivity, computation on output bits)

H.264/MPEG-4 AVC Structure
[Figure: H.264/AVC coder block diagram. The Input Video Signal is split into macroblocks of 16x16 pixels; Coder Control produces Control Data; Transform/Scal./Quant. produces the quantized transform coefficients; the embedded Decoder contains Scaling & Inv. Transform, Deblocking Filter, Intra-frame Prediction, Motion-Compensation, the Intra/Inter switch, and the Output Video Signal; Motion Estimation produces Motion Data; everything is passed to Entropy Coding.]

Motion Compensation Accuracy
[Figure: the same block diagram with motion compensation highlighted, plus the partition shapes listed below]
• MB Types: 16x16, 16x8, 8x16, 8x8
• 8x8 Types: 8x8, 8x4, 4x8, 4x4
• Motion vector accuracy 1/4 sample (6-tap filter)
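
The quarter-sample accuracy with a 6-tap filter noted on this slide can be illustrated in one dimension. The sketch below uses the H.264 luma half-sample taps (1, -5, 20, 20, -5, 1) with division by 32, and derives a quarter-sample value by averaging a full sample with the neighboring half sample. Repeating edge samples at the row ends and the function names are assumptions of this sketch; the two-dimensional cases are omitted.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # half-sample interpolation filter, sum = 32


def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1]."""
    total = 0
    for k, tap in enumerate(TAPS):
        j = min(max(i - 2 + k, 0), len(row) - 1)  # repeat edge samples
        total += tap * row[j]
    return clip8((total + 16) >> 5)  # round and divide by 32


def quarter_sample(row, i):
    """Quarter-sample value between row[i] and the half sample to its right."""
    return (row[i] + half_sample(row, i) + 1) >> 1
```

Each interpolated sample reads six neighbors instead of the two needed for bilinear half-pel interpolation, which is the "longer filters (more memory access)" cost mentioned among the computing-resource problem areas.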


Multiple Reference Frames
[Figure: H.264/AVC coder block diagram with motion compensation highlighted]
§ Multiple Reference Frames
§ Generalized B Frames

Intra Prediction
[Figure: H.264/AVC coder block diagram with intra-frame prediction highlighted]
§ Directional spatial prediction (9 types for 4x4 luma pred, 4 types for 16x16 luma pred, 4 types for 8x8 chroma pred)
[Figure: 4x4 block of samples a to p with its neighboring samples A to M, and the eight prediction directions numbered 0, 1, 3, 4, 5, 6, 7, 8]
M A B C D E F G H
I a b c d
J e f g h
K i j k l
L m n o p
e.g., Mode 4: diagonal down/right prediction
a, f, k, p are predicted by (A + 2M + I + 2) >> 2
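
The Mode 4 example generalizes to the whole 4x4 block: every sample on the same down-right diagonal shares one value, obtained by a (1, 2, 1) filter over three consecutive neighbors. The sketch below follows that reading, using the neighbor names from the figure (A to D above, I to L to the left, M at the corner); the function name and argument layout are hypothetical.

```python
def intra4x4_diag_down_right(top, left, corner):
    """Predict a 4x4 block with diagonal down/right prediction (Mode 4).

    top = [A, B, C, D], left = [I, J, K, L], corner = M.
    Returns pred[y][x] with y as the row and x as the column.
    """
    # Lay the neighbors out along one line: L K J I M A B C D
    edge = list(reversed(left)) + [corner] + list(top)
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            center = 4 + x - y  # M sits at index 4; the main diagonal centers on M
            pred[y][x] = (edge[center - 1] + 2 * edge[center]
                          + edge[center + 1] + 2) >> 2
    return pred
```

On the main diagonal (samples a, f, k, p) the three taps are I, M, and A, which reproduces the slide's formula (A + 2M + I + 2) >> 2.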


Transform Coding
[Figure: H.264/AVC coder block diagram with the transform highlighted]
§ 4x4 Block Integer Transform
$$H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$$
§ Repeated transform of DC coeffs for 8x8 chroma and 16x16 Intra luma blocks

Deblocking Filter
[Figure: H.264/AVC coder block diagram with the deblocking filter highlighted]
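
As a sketch of how the 4x4 integer transform matrix H is applied, the forward core transform of a residual block X can be written as Y = H · X · Hᵀ using only integer arithmetic. The normalization that a unitary transform would need is assumed here to be folded into the quantization stage, so it does not appear in the code; the helper names are hypothetical.

```python
H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def forward_4x4(block):
    """Core transform Y = H * X * H^T (unnormalized, integers only)."""
    return matmul(matmul(H, block), transpose(H))
```

The rows of H are mutually orthogonal but have squared norms 4, 10, 4, 10 rather than 1, which is why a separate scaling step is needed. Because every operation is an exact integer operation, encoder and decoder can match bit for bit, which supports the "exact match decoding" objective.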

Entropy Coding
[Figure: H.264/AVC coder block diagram with entropy coding highlighted]
• UVLC/Exp-Golomb +
• CABAC or
• CAVLC

AVC Version 1 Profiles
§ Three profiles in version 1: Baseline, Main, and Extended
§ Baseline (esp. Videoconferencing & Wireless)
• I and P progressive-scan picture coding (not B)
• In-loop deblocking filter
• 1/4-sample motion compensation
• Tree-structured motion segmentation down to 4x4 block size
• VLC-based entropy coding
• Some enhanced error resilience features
– Flexible macroblock ordering/arbitrary slice ordering
– Redundant slices
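
The UVLC/Exp-Golomb code mentioned on the entropy coding slide has a simple regular structure: code number k is written as the binary form of k + 1, preceded by as many zeros as that binary form has bits after the first. The sketch below works on strings of '0' and '1' for readability; the function names are hypothetical, although `ue` and `se` follow the descriptor names used for unsigned and signed Exp-Golomb values in H.264.

```python
def ue(code_num):
    """Unsigned Exp-Golomb codeword for a non-negative integer."""
    bits = bin(code_num + 1)[2:]           # binary form of k + 1
    return "0" * (len(bits) - 1) + bits    # zero prefix, then the value


def se(value):
    """Signed Exp-Golomb: map 0, 1, -1, 2, -2, ... to code numbers 0, 1, 2, 3, 4, ..."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)


def decode_ue(bitstring, pos=0):
    """Decode one unsigned codeword; return (code_num, next_position)."""
    zeros = 0
    while bitstring[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bitstring[pos + zeros:end], 2) - 1, end
```

Small values get short codewords (`ue(0)` is the single bit `1`), and the decoder needs no table: it counts leading zeros and then reads that many bits after the first `1`.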


Non-Baseline AVC Version 1 Profiles
§ Main Profile (esp. Broadcast)
• All Baseline features except enhanced error resilience features
• Interlaced video handling
• Generalized B pictures
• Adaptive weighting for B and P picture prediction
• CABAC (arithmetic entropy coding)
§ Extended Profile (esp. Streaming)
• All Baseline features
• Interlaced video handling
• Generalized B pictures
• Adaptive weighting for B and P picture prediction
• More error resilience: Data partitioning
• SP/SI switching pictures

Amendment 1: Fidelity-Range Extensions
§ AVC standard finished 2003
• ITU-T/H.264 finalized May, 2003
• MPEG-4 AVC finalized July, 2003 (same thing)
• Only corrigenda (bug fixes) since then
§ Fidelity-Range Extensions (FRExt)
• New work item initiated in July 2003
• More than 8 bits, color other than 4:2:0
• Alpha coding
• More coding efficiency capability
• Also new supplemental information


FRExt Finished July 04
§ Project initiated July 2003
• Motivations
– Higher quality, higher rates
– 4:4:4, 4:2:2, and also 4:2:0
– 8, 10, or 12 bits (14 bits considered and not included)
– Lossless
– Stereo
§ Finished in one year! (July 04)

New Things in FRExt – Part 1
§ Larger transforms
• 8x8 transform (again!)
• Drop 4x8, 8x4, or larger, 16-point…
§ Filtered intra prediction modes for 8x8 block size
§ Quantization matrix
• 4x4, 8x8, intra, inter trans. coefficients weighted differently
• Old idea, dating to JPEG and before (circa 1986?)
• Full capabilities not yet explored (visual weighting)
§ Coding in various color spaces
• 4:4:4, 4:2:2, 4:2:0, Monochrome, with/without Alpha
• New integer color transform (a VUI-message item)

New Things in FRExt – Part 2
§ Efficient lossless interframe coding
§ Film grain characterization for analysis/synthesis representation
§ Stereo-view video support
§ Deblocking filter display preference

8x8 16-Bit (Bossen) Transform
$$\begin{bmatrix}
8 & 8 & 8 & 8 & 8 & 8 & 8 & 8 \\
12 & 10 & 6 & 3 & -3 & -6 & -10 & -12 \\
8 & 4 & -4 & -8 & -8 & -4 & 4 & 8 \\
10 & -3 & -12 & -6 & 6 & 12 & 3 & -10 \\
8 & -8 & -8 & 8 & 8 & -8 & -8 & 8 \\
6 & -12 & 3 & 10 & -10 & -3 & 12 & -6 \\
4 & -8 & 8 & -4 & -4 & 8 & -8 & 4 \\
3 & -6 & 10 & -12 & 12 & -10 & 6 & -3
\end{bmatrix}$$
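
A quick way to see what kind of transform the 8x8 matrix defines is to check its rows against each other. The sketch below computes the Gram matrix of the rows: the off-diagonal entries are all zero (the rows are mutually orthogonal), while the diagonal entries differ from row to row, so, as with the 4x4 transform, per-coefficient scaling is presumably handled outside the core transform. The helper names are hypothetical.

```python
T8 = [
    [8, 8, 8, 8, 8, 8, 8, 8],
    [12, 10, 6, 3, -3, -6, -10, -12],
    [8, 4, -4, -8, -8, -4, 4, 8],
    [10, -3, -12, -6, 6, 12, 3, -10],
    [8, -8, -8, 8, 8, -8, -8, 8],
    [6, -12, 3, 10, -10, -3, 12, -6],
    [4, -8, 8, -4, -4, 8, -8, 4],
    [3, -6, 10, -12, 12, -10, 6, -3],
]


def gram(matrix):
    """Dot product of every pair of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]


def is_orthogonal(matrix):
    """True if all distinct rows have zero dot product."""
    g = gram(matrix)
    n = len(matrix)
    return all(g[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def row_norms_squared(matrix):
    """Squared Euclidean norm of each row."""
    return [sum(a * a for a in row) for row in matrix]
```

The squared row norms come out as 512, 578, 320, 578, 512, 578, 320, 578, so only three distinct scale factors are involved.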


8x8 Transform Advantage (JVT-K028, IBBP coding, prog. scan)

Sequence    % BD bit-rate reduction
Movie 1     11.59
Movie 2     12.71
Movie 3     12.01
Movie 4     11.06
Movie 5     13.46
Crawford    10.93
Riverbed    15.65
Average     12.48

Quantization Matrix
§ Similar concept to MPEG-2 design
§ Vary step size based on frequency
§ Adapted to modified transform structure
§ More efficient representation of weights
§ Eight downloadable matrices (at least 4:2:0)
• Intra 4x4 Y, Cb, Cr
• Intra 8x8 Y
• Inter 4x4 Y, Cb, Cr
• Inter 8x8 Y


New Profiles Created by FRExt
§ 4:2:0, 8-bit: “High” (HP)
§ 4:2:0, 10-bit: “High 10” (Hi10)
§ 4:2:2, 10-bit: “High 4:2:2” (Hi422)
§ 4:4:4, 12-bit: “High 4:4:4” (Hi444)
§ Effectively the same tools, but acting on different input data (with a couple of exceptions in the 4:4:4 profile)

Some Notes on Quality Testing
§ Use appropriate “High” profile (incl. adaptive transform)
§ If testing for PSNR, use “flat” quant matrices
§ Otherwise, use “non-flat” quant matrices
§ Use more than 1 or 2 reference pictures
§ Use hierarchical reference frames coding structure
§ Use CABAC entropy coding
§ If testing high-quality PSNR, use adaptive quantization*
§ Use rate-distortion optimization in encoder
§ Use large-range good-quality motion search
§ Use bi-predictive search optimization (see JVT-N014)
* = See G. Sullivan & S. Sun, “On Dead-Zone…”, VCIP 2005/JVT-N011

A Performance Test for High Profile (from JVT-L033 - Panasonic)
§ Subjective tests by Blu-Ray Disk Founders of FRExt HP
• 4:2:0/8 (HP) 1920x1080x24p (1080p), 3 clips.
• Nominal 3:1 advantage to MPEG-2
– 8 Mbps HP scored better than 24 Mbps MPEG-2
• Apparent transparency at 16 Mbps
Figure 1: Results of subjective test with studio participants (Blu-Ray Disk Founders)
[Figure: bar chart of Mean Opinion Score (5: Perfect, 4: Good, 3: Fair (OK for DVD), 2: Poor, 1: Very Poor) for H.264/AVC FRExt at 8 Mbps, 12 Mbps, 16 Mbps, and 20 Mbps, the Original, and MPEG2 24 Mbps DVHS emulation. Bar values recovered from the chart, in extraction order: 4.03, 4.00, 3.90, 3.65, 3.71, 3.59.]

For Further Information
§ JVT, MPEG, and VCEG management team members:
• Gary Sullivan ([email protected])
• Jens Ohm ([email protected])
• Ajay Luthra ([email protected])
• ([email protected])
§ IEEE Transactions on Circuits and Systems for Video Technology Special Issue on H.264/AVC (July 2003)
§ Paper in Proceedings of IEEE January 2005 (Sullivan & Wiegand)
§ I. Richardson, H.264 and MPEG-4 Video Compression, 2003
§ Overview incl. FRExt: SPIE Aug 2004 (Sullivan, Topiwala, & Luthra)
§ Paper at VCIP 2005: Meta-overview and deployment
§ Wikipedia H.264 page