The H.264/MPEG4 Advanced Video Coding Standard and Its Applications

SULLIVAN LAYOUT 7/19/06 10:38 AM Page 134 STANDARDS REPORT The H.264/MPEG4 Advanced Video Coding Standard and its Applications Detlev Marpe and Thomas Wiegand, Heinrich Hertz Institute (HHI), Gary J. Sullivan, Microsoft Corporation ABSTRACT Regarding these challenges, H.264/MPEG4 Advanced Video Coding (AVC) [4], as the latest H.264/MPEG4-AVC is the latest video cod- entry of international video coding standards, ing standard of the ITU-T Video Coding Experts has demonstrated significantly improved coding Group (VCEG) and the ISO/IEC Moving Pic- efficiency, substantially enhanced error robust- ture Experts Group (MPEG). H.264/MPEG4- ness, and increased flexibility and scope of appli- AVC has recently become the most widely cability relative to its predecessors [5]. A recently accepted video coding standard since the deploy- added amendment to H.264/MPEG4-AVC, the ment of MPEG2 at the dawn of digital televi- so-called fidelity range extensions (FRExt) [6], sion, and it may soon overtake MPEG2 in further broaden the application domain of the common use. It covers all common video appli- new standard toward areas like professional con- cations ranging from mobile services and video- tribution, distribution, or studio/post production. conferencing to IPTV, HDTV, and HD video Another set of extensions for scalable video cod- storage. This article discusses the technology ing (SVC) is currently being designed [7, 8], aim- behind the new H.264/MPEG4-AVC standard, ing at a functionality that allows the focusing on the main distinct features of its core reconstruction of video signals with lower spatio- coding technology and its first set of extensions, temporal resolution or lower quality from parts known as the fidelity range extensions (FRExt). of the coded video representation (i.e., from par- In addition, this article also discusses the current tial bitstreams). The SVC project is planned to status of adoption and deployment of the new be finalized in January 2007. Also, multi-view standard in various application areas. video coding (MVC) capability has been success- fully demonstrated using H.264/MPEG4-AVC INTRODUCTION AND [9], requiring almost no change to the technical ISTORICAL ERSPECTIVE content of the standard. H P Rather than providing a comprehensive overview that covers all technical aspects of the Digital video technology is enabling and generat- H.264/MPEG4-AVC design, this article focuses ing ever new applications with a broadening on a few representative features of its core cod- range of requirements regarding basic video ing technology. After presenting some informa- characteristics such as spatiotemporal resolution, tion about target application areas and the chroma format, and sample accuracy. Applica- current status of deployment of the new stan- tion areas today range from videoconferencing dard into those areas, this article provides a over mobile TV and broadcasting of standard-/ high-level overview of the so-called video cod- high-definition TV content up to very-high-qual- ing layer (VCL) of H.264/MPEG4-AVC. Being ity applications such as professional digital video designed for efficiently representing video con- recording or digital cinema/large-screen digital tent, the VCL is complemented by the network imagery. Prior video coding standards such as abstraction layer (NAL), which formats the MPEG2/H.262 [1], H.263 [2], and MPEG4 Part VCL representation and provides header infor- 2 [3] are already established in parts of those mation in a manner appropriate for conveyance application domains. But with the proliferation by a variety of transport layers or storage of digital video into new application spaces such media. A representative selection of innovative as mobile TV or high-definition TV broadcast- features of the video coding layer in ing, the requirements for efficient representation H.264/MPEG4-AVC is described in more detail of video have increased up to operation points by putting emphasis on some selected FRExt- where previously standardized video coding tech- specific coding tools. Profile and level defini- nology can hardly keep pace. Furthermore, more tions of H.264/MPEG4-AVC are briefly cost-efficient solutions in terms of bit rate vs. discussed and finally a rate-distortion (R-D) end-to-end reproduction quality are increasingly performance comparison between sought in traditional application areas of digital H.264/MPEG4-AVC and MPEG2 video coding video as well. technology is presented. 134 0163-6804/06/$20.00 © 2006 IEEE IEEE Communications Magazine • August 2006 SULLIVAN LAYOUT 7/19/06 10:38 AM Page 135 The video coding Input Coder video control layer of signal Control data H.264/MPEG4-AVC Transform/ scal./quant. Quant. is similar in spirit to - transf. coeffs that of other video Decoder Split into Scaling and coding standards Macroblocks inv. transform 16x16 pixels such as MPEG2 Entropy coding Video. In fact, it uses Deblocking a fairly traditional filter Intra-frame approach consisting prediction of a hybrid of block- Output Motion-Motion video based temporal and compensation signal Intra/inter spatial prediction in conjunction with Motion data block-based Motion estimation transform coding. I Figure 1. Typical structure of an H.264/MPEG4-AVC video encoder. APPLICATIONS AND CURRENT below) in important application standards or TATUS OF EPLOYMENTS industry consortia specifications such as: S D • The revised implementation guideline TS As a generic, all-purpose video coding standard 101 154 of the Digital Video Broadcasting that is able to cover a broad spectrum of require- (DVB) organization ments from mobile phone to digital cinema • The HD-DVD specification of the DVD applications within a single specification, Forum H.264/MPEG4-AVC has received a great deal of • The BD specification of the Blu-Ray Disc recent attention from industry. Besides the clas- Association (BDA) sical application areas of videoconferencing and • The International Telecommunication broadcasting of TV content (satellite, cable, and Union — Radiocommunication Standard- terrestrial), the improved compression capability ization Sector (ITU-R) standards BT.1737 of H.264/MPEG4-AVC enables new services and for HDTV contribution, distribution, satel- thus opens new markets and opportunities for lite news gathering, and transmission, and the industry. As an illustration of this develop- BT.1687 for large-screen digital imagery for ment, consider the case of “mobile TV” for the presentation in a theatrical environment reception of audio-visual content on cell phones In addition, a number of providers of satellite or portable devices, presently on the verge of television services (including DirecTV, BSkyB, commercial deployment. Several such systems Dish Network, Euro1080, Premiere, and for mobile broadcasting are currently under con- ProSiebenSat.1) have recently announced or sideration, e.g., begun near-term deployments of H.264/MPEG4- • Digital Multimedia Broadcasting (DMB) in AVC in so-called second generation HDTV South Korea delivery systems (often coupled with the new • Digital Video Broadcasting — Handheld DVB-S2 satellite specification). At the time of (DVB-H), mainly in Europe and the Unit- writing this article, at least four single-chip solu- ed States tions for HD decoding of H.264/MPEG4-AVC • Multimedia Broadcast/Multicast Service for set-top boxes are on the market by the semi- (MBMS), as specified in Release 6 of 3GPP conductor industry. For such mobile TV services, improved video compression performance, in conjunction with appropriate mechanisms for error robustness, is HIGH-LEVEL OVERVIEW OF THE key — a fact that is well reflected by the use of IDEO ODING AYER H.264/MPEG4-AVC (using the version 1 Base- V C L line profile described below) together with for- The video coding layer of H.264/MPEG4-AVC ward error correction schemes in all of those is similar in spirit to that of other video coding mobile-broadcasting systems. standards such as MPEG2 Video [1]. In fact, it Another area that has attracted a lot of near- uses a fairly traditional approach consisting of a term industry implementation interest is the hybrid of block-based temporal and spatial pre- transmission and storage of HD content. Some diction in conjunction with block-based trans- indications of that trend are shown by the recent form coding. Figure 1 shows an encoder block inclusion of H.264/MPEG4-AVC (using version diagram for such a design. 3, i.e., FRExt-related “High profile” described A coded video sequence in H.264/MPEG4- IEEE Communications Magazine • August 2006 135 SULLIVAN LAYOUT 7/19/06 10:38 AM Page 136 The typical encoding Z A B C D E F G H I J K L M N O P operation for a Q picture begins with R 8 splitting the picture S T into blocks of 8X8 block 1 U samples. The first V picture of a W 6 sequence or a X 3 4 random access point 7 5 is typically coded in A - X, Z: Constructed samples of neighboring blocks 0 Intra mode. For all I Figure 2. Samples used for 8 × 8 spatial luma intra prediction (left), and directions of 4 × 4 and 8 × 8 remaining pictures of spatial luma intra prediction modes 0, 1, and 3–8 (right). a sequence or between random AVC consists of a sequence of coded pictures [4, tures. The decoder inverts the entropy coding access points, 5]. A coded picture can represent either an processes, performs the prediction process as typically Inter coding entire frame or a single field, as was also the case indicated by the encoder using the prediction for MPEG2 video. Generally, a frame of video type information and motion data. It also is utilized. can be considered to contain two interleaved inverse-scales and inverse-transforms the quan- fields: a top field and a bottom field. If the two tized transform coefficients to form the approxi- fields of a frame were captured at different time mated residual and adds this to the prediction.

The H.264/MPEG4 Advanced Video Coding Standard and Its Applications

ARCHIVE ONLY: DCI Digital Cinema System Specification, Version 1.2

ACM Multimedia 2007

CP750 Digital Cinema Processor the Latest in Sound Processing—From Dolby Digital Cinema

DV-983H 1080P Up-Converting Universal DVD Player with VRS by Anchor Bay Video Processing and 7.1CH Audio

Dolby Surround 7.1 for Theatres

Versatile Video Coding – the Next-Generation Video Standard of the Joint Video Experts Team

Paramount Theatre Sherry Lansing Theatre Screening Room #5 Marathon Theatre Gower Theatre

Installation Manual, Document Number 200-800-0002 Or Later Approved Revision, Is Followed

Digital Cinema System Specification Version 1.3

Security Solutions Y in MPEG Family of MPEG Family of Standards

The H.264 Advanced Video Coding (AVC) Standard

Introduction to DVD Carol Cini, U.S