The H.264/AVC Video Coding Standard
[STANDARDS in a nutshell]
Thomas Wiegand and Gary J. Sullivan
IEEE SIGNAL PROCESSING MAGAZINE [148] MARCH 2007 1053-5888/07/$25.00©2007IEEE

The H.264/MPEG-4 Advanced Video Coding standard (H.264/AVC) is the newest video coding standard jointly developed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). H.264/AVC has achieved a significant improvement in compression performance compared to prior standards, and it provides a network-friendly representation of the video that addresses both conversational (video telephony) and nonconversational (storage, broadcast, or streaming) applications. This article provides a description of the structure, technology, performance, and resources of H.264/AVC, which is referred to formally as ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 Part 10).

BACKGROUND
Since the early 1990s, when video coding technology was in its infancy, international standards such as H.261, MPEG-1, H.262/MPEG-2 Video, H.263, and MPEG-4 Part 2 have been powerful engines behind the commercial success of digital video. They have played a pivotal role in establishing the technology by ensuring interoperability among products developed by different manufacturers. At the same time, these standards have allowed flexibility for optimizing and molding the technology to fit various applications, and for making cost-performance trade-offs for particular product requirements. They have provided much-needed assurance to the content creators that their content will play everywhere, making it unnecessary to create and manage multiple copies of the same content to match the products of different manufacturers. Moreover, these standards have permitted economies of scale to allow steep reductions in cost for mass-market affordability. The latest addition to the lineup of these well-known standards is H.264/AVC.

MOTIVATION
As in the case of other international video compression standards, the driving force behind the creation of H.264/AVC was the need to enable interoperability between encoder and decoder products made by different manufacturers while minimizing the quantity of encoded data necessary to achieve a given level of output video quality, a concept known as the coding efficiency of the design. In particular, the increasing demand for video services and the growing popularity of higher-definition video are constantly creating greater demand for improved compression capabilities and this, in turn, motivated the H.264/AVC standardization effort. As a result of advances in technology since the development of the prior standards, a substantial improvement in coding efficiency had become possible.

OBJECTIVES
The main objectives of the H.264/AVC standard are focused on coding efficiency, architecture, and functionalities. More specifically, an important objective was the achievement of a substantial increase (roughly a doubling) of coding efficiency over MPEG-2 Video for high-delay applications and over H.263 version 2 for low-delay applications, while keeping implementation costs within an acceptable range. Doubling coding efficiency corresponds to halving the bit rate necessary to represent video content with a given level of perceptual picture quality. It also corresponds to doubling the number of channels of video content of a given quality within a given limited bit-rate delivery system such as a broadcast network. The architecture-related objective was to give the design a “network-friendly” structure, including enhanced error/loss robustness capabilities, in particular, which could address applications requiring transmission over various networks under various delay and loss conditions. The functionalities-related objectives included, as with prior video coding standards, providing support for random access (i.e., the ability to start decoding at points other than the beginning of the entire stream of encoded data) and “trick mode” operation (i.e., fast-forward, fast and slow reverse play, scene and chapter skipping, switching between coded bitstreams, etc.), and other features.

ISSUING BODIES AND SCHEDULE
H.264/AVC was developed by the ITU/ISO/IEC Joint Video Team (JVT), consisting of experts from ITU-T’s VCEG and ISO/IEC’s MPEG organizations. VCEG is officially referred to as ITU-T SG16 Q.6, and it is a part of the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T, which is a United Nations organization for telecom-related standardization). MPEG is officially referred to as ISO/IEC JTC 1/SC 29/WG 11, and it falls jointly under the International Organization for Standardization and the International Electrotechnical Commission (ISO and IEC, which are major privately organized international standardization bodies).

In early 1998, VCEG issued a call for proposals on a project then called H.26L. The first draft design for that new standard was adopted in August 1999. In December 2001, VCEG and MPEG formed the JVT with the charter to finalize the draft new video coding standard, which was formally approved as H.264/AVC in May 2003. A first extension was issued in September 2004 with version 3. The second major extension is expected to be finalized in early 2007 with version 7.

TARGET APPLICATIONS
The applications foreseen for the H.264/AVC standard include broadcast over cable, satellite, cable modem, x (of any type) digital subscriber line (xDSL), and terrestrial channels; interactive or serial storage on optical and magnetic devices such as DVDs; storage and distribution of professional film and video material for content contribution, content distribution, studio editing, and post processing; video-on-demand or multimedia streaming services over cable modem, xDSL, local area network (LAN), integrated services digital network (ISDN), and wireless networks; conversational services over Ethernet, LAN, xDSL, ISDN, wireless and mobile networks, and modems; and multimedia messaging services over xDSL, Ethernet, LAN, ISDN, wireless, and mobile networks. With such broad application coverage, H.264/AVC quickly received a great deal of recent attention from industry and found widespread standard system adoption as well as deployment in products.

STRUCTURE OF THE STANDARD
As has been the case for all ITU-T and ISO/IEC video coding standards, only the bit-stream format (i.e., encoded data format) and the central decoding process have been standardized in the H.264/AVC specification. The standard defines the syntax, certain constraints on allowed combinations of syntax values, and the decoding process of the syntax elements to convert the encoded bitstream data [...] image acquisition, pre- and postprocessing operations, error/loss concealment and recovery, and all aspects of decoded video display, have been deliberately kept outside the scope of the standard. This limitation of the scope of the standard permits maximal freedom to optimize product designs in ways that do not interfere with interoperability (balancing compression quality, implementation cost, time to market, etc.). However, it provides no guarantees of video encoding quality or decoded video display quality. It allows encoders to produce any bitstream that is in the correct format. In fact, such a standard does not even require encoders to produce bitstreams that decode into video bearing any resemblance to their video input, or even require encoders to accept video input data at all. It also does not specify any particular relationship between the output of the specified decoding process and the output of the subsequent display process; it does not even specify that a decoder needs to have the ability to display its output.

TECHNOLOGY

FUNCTIONALITIES
The H.264/AVC technology design supports the coding of video for a wide variety of applications. In addition to enabling efficient compression of digital video, it supports error/loss resilience, random-access operation, “trick-mode” operation (mentioned earlier), region-of-interest preferential coding, stereo-view indicators, film-grain analysis/synthesis processing, and a variety of additional capabilities. Further work is underway to add enhanced application capabilities for scalable and multiview/three-dimensional video coding.

[...] of which contains an integer number of bytes. The NAL unit structure provides a generic form for use in both packet-oriented and bitstream-based systems. The format of NAL units is identical in both environments, except that each NAL unit is preceded by a unique start code prefix for resynchronization in bitstream-oriented transport systems.

The VCL is specified to efficiently represent the content of the video data and fulfill the design objective of enhanced coding efficiency. It is similar in spirit to designs found in other standards in the sense that it consists of a hybrid of block-based temporal and spatial prediction in conjunction with scalar-quantized block transform coding. A simplified block diagram of typical encoder processing elements for the VCL is provided in Figure 1. Decoding processes are conceptually a subset of these encoding processes, and are shown in the shaded region of the figure.

Although only the decoding process is actually specified in the standard, we focus on typical encoder technology to explain it, since the design is much easier to understand from that perspective. As shown in Figure 1, the picture is split into blocks. The first picture at the start of a sequence or a random access point (a point within a coded video sequence at which effective decoding can begin) is typically coded in “intra” (intrapicture) mode, which means that only information from the picture itself is being used (no prediction references to other preceding pictures in the bitstream). Each sample of a block in such a picture is predicted using spatially neighboring samples of previously coded blocks in the same picture. For the remaining pictures of a sequence or