MPEG STANDARDS EVOLUTION AND IMPACT ON CABLE

Dr. Paul Moroney, Dr. Ajay Luthra
Motorola Broadband Communications Sector

Abstract

The MPEG standards process has evolved from its beginnings in 1988 through today to cover a wide range of multimedia delivery technology. The entire phenomenon of digital television and the standards involved have affected our culture through increased content choices for the consumer, increased competition, and new types of products. Cable delivery has seen a huge impact, in particular from MPEG-2 based digital television. If we can extrapolate from the evolution of the MPEG process, more change is coming!

INTRODUCTION

No one would dispute that the cable industry has been dramatically impacted by the transition to digital television and related services, whether one considers the advent of effective DBS services, the distribution of digital television to cable headends, or the distribution of digital services through the cable plant to consumers. Although many factors have played a role in the process, certainly the development of true standards was and continues to be a key driver. The ISO/IEC sponsored Moving Picture Experts Group (MPEG) has been the primary organization for the formation of the basic multimedia source coding standards. This paper examines where MPEG has been, and where it is going, to build a vision of the future of digital services over cable. Particular attention is paid to video compression.

The MPEG process began in October of 1988, under the very effective leadership of its convener, Leonardo Chiariglione, who continues in that role to this day. MPEG-1 [1] targeted stored content up to a 1.5 Mbps rate, matching the parameters necessary to store on CDs. MPEG-1 addressed audio compression, video compression, and a storage-oriented systems layer to combine them with other higher-level information, and the resulting standard earned an Emmy award. MPEG-2 [2] shifted the focus to entertainment television, added more video compression tools, multi-channel audio, and a new systems layer better tuned to the need for a transport definition for broadcast. This standard has been deployed very successfully worldwide, and earned an unprecedented second Emmy for MPEG. The designation "MPEG-3" had been set aside for HDTV extensions, but MPEG-2 worked so well for this application that MPEG-3 was never needed.

MPEG-4 [3] began slowly, while MPEG-2 based systems were deployed, and has developed into a true next generation multimedia system. MPEG-4 covers a much broader scope than its predecessors, and achieves dramatic improvements in the underlying source coding technology as well.

MPEG has also begun efforts in two related areas, not focused per se on multimedia compression. MPEG-7 [4] targets multimedia search and retrieval, enabling the creation of search engines for such content. MPEG-21 [5] (the 21 suffix representing the 21st century) addresses the goal of true universal multimedia access. This group focuses on defining a multimedia framework so that broad classes of content can be created, delivered, and consumed in an interoperable manner.

MPEG-1 AND MPEG-2

MPEG-1 and MPEG-2 are defined in three main parts, addressing video source coding, audio source coding, and a systems layer.

Video

MPEG video source coding is designed to remove redundancy in video content both frame to frame and within a frame. Algorithms that achieve these goals without degradation in the reconstructed video images are "lossless," that is, reversible. Algorithms that sacrifice quality in the interests of bit rate reduction are termed "lossy." Lossy algorithms attempt to hide errors, or artifacts, based upon models of human video perception. For example, it is more difficult to detect artifacts under high motion. It is also true that human vision chroma resolution is lower than human vision luminance resolution. Compression systems, such as MPEG, that seek to address a wide range of applications typically employ both lossless and lossy algorithms.

MPEG video compression at its base employs transformation, quantization, and variable length coding (VLC), in that order. Pixel domain representations of video frames are transformed through an 8 by 8 Discrete Cosine Transform (DCT) to a frequency domain representation. The resulting spatial frequency coefficients are scanned in a zigzag manner and quantized. The resulting sequence of amplitudes and runs of zero values is then Huffman coded. The bits generated are grouped under a syntax organized roughly at the sequence level (a number of consecutive frames), the picture (typically frame) level, the slice level (rows of blocks), the macroblock level (4 luminance and 2 chrominance blocks), and the DCT block level, with header information appropriate to each.

The last key aspect is the inclusion of motion estimation and compensation, where the above approach applies only to the difference between a current frame (macroblock) region and its best possible match in the prior frame. The offsets representing the best match, to the nearest x and y half-pixel, become motion vectors for the region, and are included as header information.
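As a rough illustration of how these two stages fit together, the following Python/NumPy sketch performs an exhaustive integer-pel block-matching search and then transforms, quantizes, and zigzag-scans the resulting 8 by 8 residual. It is a toy, not an MPEG-conformant encoder: half-pixel refinement, the standard's quantization matrices, and the run/level Huffman coding stage are all omitted, and the block size, search range, and quantizer step are arbitrary choices for the example.

```python
import numpy as np

N = 8  # transform block size used throughout this sketch

def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix (n x n)."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix()

def dct2(block):
    """2-D DCT of an 8x8 block: C @ block @ C.T (separable transform)."""
    return C @ block @ C.T

def motion_search(cur_block, ref_frame, y, x, search=8):
    """Exhaustive integer-pel search for the (dy, dx) minimizing SAD.
    MPEG allows half-pixel refinement; that step is omitted here."""
    h, w = ref_frame.shape
    best = (0, 0, np.inf)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + N > h or xx + N > w:
                continue
            sad = np.abs(cur_block - ref_frame[yy:yy + N, xx:xx + N]).sum()
            if sad < best[2]:
                best = (dy, dx, sad)
    return best[0], best[1]

# Zigzag scan order: traverse the 8x8 block along its anti-diagonals.
ZIGZAG = sorted(((r, c) for r in range(N) for c in range(N)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def code_block(cur_block, ref_frame, y, x, qstep=16):
    """Motion-compensate, transform, quantize and zigzag-scan one block."""
    dy, dx = motion_search(cur_block, ref_frame, y, x)
    pred = ref_frame[y + dy:y + dy + N, x + dx:x + dx + N]
    residual = cur_block - pred                    # prediction error
    coeffs = dct2(residual)                        # to the frequency domain
    quant = np.round(coeffs / qstep).astype(int)   # crude uniform quantizer
    scanned = [quant[r, c] for r, c in ZIGZAG]     # input to run/level coding
    return (dy, dx), scanned

# Toy usage: a reference frame and a current block that is a shifted copy of it.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (64, 64)).astype(float)
cur = ref.copy()
cur[8:16, 8:16] = ref[10:18, 6:14]                 # region shifted by (+2, -2)
mv, levels = code_block(cur[8:16, 8:16], ref, 8, 8)
print(mv)                                          # (2, -2): the search recovers the shift
```

Everything downstream of the zigzag scan would feed the run/level variable length coder described above.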
Figure 1 shows the most basic block diagram of this compression process. Note the presence of the decode (decompress) loop, so that the compressor can form the pixel differences required for motion compensation from the same decoded frames the decoder will use.

[Figure 1: Basic MPEG Compressor (DCT, Quantize, and VLC stages, with a Decode loop)]

Although not shown in the figure above, MPEG also allows motion estimation from a future frame, or from both the previous and future frames in combination (see Figure 2). In MPEG parlance, a frame without motion compensation is an I frame, one with compensation from only the previous frame is a P frame, and one with bi-directional motion is a B frame. Note that in order to decode a B frame, coded frames must be sent out of (display) order. A decoder must have decoded the future frame before that frame can be used to "anchor" any frame decoded with bi-directional motion.

[Figure 2: B Frame Prediction (I B B P)]

The MPEG-2 standard introduced various enhancements, but two major changes were the addition of frame/field mode and direct support for film content. In order to compress interlaced video signals intended to be ultimately displayed on interlaced televisions (see Figure 3), efficiency is enhanced by allowing the 8 by 8 pixel blocks to be assembled from alternate (interlaced) fields. In fact, throughout interlaced content, some regions compress better in this fashion (frame mode) and some compress better by taking the 8 by 8 pixel block from each field separately! Thus the standard allows either approach, signaled by header bits.

[Figure 3: Interlaced Block Pair (Field 1 / Field 2)]

For film content carried in NTSC television signals, common practice has been to convert the 24 frames per second progressive film to the NTSC 29.97 frames per second interlace (59.94 fields per second) through a telecine process known as 3:2 pull down. The first film frame is divided into two fields, and the first field is repeated in the NTSC signal as "field 3". Frame 2 of the film becomes "field 4" and "field 5," without any repeat. Film frame 3 becomes fields 6 and 7, and field 6 is repeated as field 8. To complete the cycle, film frame 4 becomes NTSC fields 9 and 10. MPEG allows compressors to drop the duplicate fields and code the film frames directly. An overhead bit called "repeat field" informs the decompressor, or decoder, how to restore NTSC for display.
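The 3:2 cadence is simple enough to sketch directly. The fragment below (same illustrative Python style as above) just expands film frames into the field counts described in the text; in the actual MPEG-2 syntax this is signaled per coded picture with the repeat_first_field and top_field_first bits rather than computed this way.

```python
def pulldown_32(num_film_frames):
    """Expand 24 fps film frames into the 3:2 field cadence described above.
    Odd-numbered film frames contribute three fields (their first field is
    repeated), even-numbered frames contribute two.  Illustrative only; the
    actual MPEG-2 syntax signals this per coded picture with the
    repeat_first_field and top_field_first bits."""
    cadence, fields = [], []
    next_field = 1
    for frame in range(1, num_film_frames + 1):
        count = 3 if frame % 2 == 1 else 2      # 3, 2, 3, 2, ...
        cadence.append((frame, count))
        for _ in range(count):
            fields.append((frame, next_field))  # (film frame, NTSC field number)
            next_field += 1
    return cadence, fields

cadence, fields = pulldown_32(4)
print(cadence)      # [(1, 3), (2, 2), (3, 3), (4, 2)]
print(len(fields))  # 10 fields, i.e. 5 NTSC frames for every 4 film frames
```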
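Returning to the bi-directional prediction discussion above, the reordering from display order to coded order can also be sketched in a few lines. This is a simplified model, not the standard's actual rules: it assumes each run of B frames is anchored by the I or P frames immediately surrounding it, and simply holds B frames back until their future anchor has been emitted.

```python
def coded_order(display_order):
    """Reorder a display-order sequence of frame types ('I', 'P', 'B') so that
    every B frame follows both of its anchors.  Simplified sketch: each run of
    B frames is assumed to be anchored by the frames immediately around it."""
    out, held_b = [], []
    for ftype in display_order:
        if ftype == 'B':
            held_b.append(ftype)     # wait until the future anchor is sent
        else:
            out.append(ftype)        # I or P: the future anchor for held B frames
            out.extend(held_b)
            held_b = []
    out.extend(held_b)               # any trailing B frames
    return out

print(coded_order(['I', 'B', 'B', 'P', 'B', 'B', 'P']))
# ['I', 'P', 'B', 'B', 'P', 'B', 'B']: anchors are transmitted ahead of their B frames
```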
Given the descriptions above, one can deduce the philosophy adopted by the MPEG planners. The maximal benefit to equipment suppliers and consumers, taken together, was determined to be achieved through a standard that defines decoders to the extent required to be able to receive all signals that are "MPEG compliant" according to the standard syntax. Thus all aspects of the decoder are defined, with the exception of error recovery and the specifics of display generation.

The compression process, however, provides substantial room for creativity and invention within the context of the standard. Any compressor can be built, so long as an MPEG compliant decoder can decode the signals produced. Compression is thus far more of an art form than decompression. Consider the various degrees of freedom. Which frame regions are to be predicted, and how? For predicted regions, what are the best motion vectors? Which frames do you select as I frames, or which slices are not predicted at all, so that decompression equipment can acquire a signal after a channel change or after an error? How much should any given block or region's coefficients be quantized to minimize perceptual loss yet achieve some desired net output bit rate (rate control)? How does the equipment estimate such loss? Which regions are best coded as fields, rather than frames? What is the best way to scan the coefficients (there is more than one option provided in the standard)?

MPEG-2 introduced the concept of profiles and levels, to allow the broad range of compression tools and bit rates to be targeted at classes of applications. This allowed products to be compliant at only a specific profile and level; thus low cost consumer devices were possible. No one device was burdened with implementing all the tools of MPEG, at all bit rates.

Audio

MPEG audio compression bears several similarities to the video compression described above.

MPEG-2 also defined a new non-backward compatible algorithm called Advanced Audio Coding (AAC). AAC provides the best quality overall audio coding approach, at added complexity, both for multi-channel surround and for regular stereo audio. With AAC, 64 kbps stereo audio is generally perceived as excellent.

Systems

MPEG defined a systems layer to combine the various audio and video streams into a