The Essentials of Video Transcoding
Enabling Acquisition, Interoperability, and Distribution in Digital Media
Telestream Whitepaper

The intent of this paper is to provide an overview of the transcoding basics you'll need to know in order to make informed decisions about implementing transcoding in your workflow. That involves explaining the elements of the various types of digital media files used for different purposes, how transcoding works upon those elements, what challenges might arise in moving from one format to another, and what workflows might be most effective for transcoding in various common situations.

Contents
Introduction
Digital media formats
The essence of compression
The right tool for the job
Bridging the gaps (media exchange between diverse systems)
Transcoding workflows
Conclusion

Introduction

It's been more than three decades now since digital media emerged from the lab into real-world applications, igniting a communications revolution that continues to play out today. First up was the Compact Disc, featuring compression-free LPCM digital audio that was designed to fully reconstruct the source at playback. With digital video, on the other hand, it was clear from the outset that most recording, storage, and playback situations wouldn't have the data capacity required to describe a series of images with complete accuracy. And so the race was on to develop compression/decompression algorithms (codecs) that could adequately capture a video source while requiring as little data bandwidth as possible.

The good news is that astonishing progress has been made in the ability of codecs to allocate bandwidth effectively to the parts of a signal that are most critical to our perception of the content. Take, for example, the Advanced Audio Coding (AAC) codec popularized by Apple via iTunes. Its fidelity may be debatable for purists, but the fact is that 128 kbps AAC delivers audio quality that is acceptable to a great many people in a great many situations, and it uses only one-eleventh of the bitrate of CD-quality LPCM (16-bit/44.1 kHz). Even more impressive are the capabilities of the H.264 codec, which supports encoding of 1080p HD material at bit rates of 6 Mb/s or less, a compression ratio of 250:1 or more.

With success, of course, comes competition and complication. Codecs make possible the extension of digital content into new realms (the desktop, the smartphone, and so on), each of which has constraints that are best addressed by optimizing a codec's performance for a particular platform in a particular situation. The result is an ever-expanding array of specialized codecs. Some codecs are used for content acquisition, others for editing and post-production, still others for storage and archiving, and still more for delivery to viewers on everything from big-screen TVs to mobile devices.

Because content is increasingly used in a multitude of settings, it's become increasingly important to be able to move it efficiently from one digital format to another without degrading its quality. Simply put, that is the purpose of transcoding.

Perhaps the easiest way to realize the importance of transcoding is to imagine a world in which it doesn't exist. Suppose, for example, that you manage production for a local TV news team that has nabbed a bombshell interview. You've got an exclusive on the story only until the following day, and you need to get it on the evening news. Unfortunately, the digital video format of your ENG camera uses a different codec than that of your non-linear editor (NLE). So before you can even begin to edit you'll have to play the raw footage from the camera, decompressing it in real time while simultaneously redigitizing it in a format that your NLE can handle. Now suppose that you have to go through the same painfully slow routine to get the edited version to your broadcast server, and again to make a backup copy for archiving, and again to make a set of files at different bitrates for different devices to stream from your station's website. The inefficiency multiplies as you move through the process.

Transcoding bypasses this tedious, inefficient scenario. It allows programs to be converted from codec to codec faster than real time using standard IT hardware rather than more costly video routers and switches. And it facilitates automation because it avoids manual steps such as routing signal or transporting storage media. Intelligently implemented and managed, transcoding vastly improves throughput while preserving quality.

Digital media formats

The first rule of digital media file formats and codecs is that their number is always growing. A glance back in time will reveal some older formats that have fallen by the wayside, but also shows that there are far more formats in regular use today than there were fifteen or even five years ago. Different formats arise to fill different needs (more on this later), but they all share some basic characteristics.

At the simplest level a video-capable file format is a container or wrapper for component parts of at least three types. One type is video stream data and another is audio stream data; together the elementary streams of these two content types are referred to as the file's "essence." The third component is metadata, which is data that is not itself essence but is instead information about the essence. Metadata can be anything from a simple file ID to a set of information about the content, such as the owner (rights holder), the artist, the year of creation, etc. In addition to these three, container formats may also support additional elementary content streams, such as the overlay graphics used in DVD-Video menus and subtitles.

[Figure: Component parts of a media container file]

The following are examples of container file types that are in common use today:
• QuickTime MOV, an Apple-developed cross-platform multimedia format
• MPEG-2 Transport Stream and Program Stream, both developed by the Moving Picture Experts Group, with applications in optical media (DVD-Video, Blu-ray Disc) and digital broadcast (e.g. ATSC)
• WMV and MP4 for online distribution
• GXF (General eXchange Format), developed by Grass Valley and standardized by SMPTE
• LXF, a broadcast server format developed by Harris (now named "Imagine Communications")
• MXF (Material eXchange Format), standardized by SMPTE

Container formats such as those listed above differ from one another in a multitude of ways. At the overall file level, these include the structure of the header (identifying information at the start of a file that tells a hardware device or software algorithm the type of data contained in the file), the organization of the component parts within the container, the way elementary streams are packetized and interleaved, the amount and organization of the file-level metadata, and the support for simultaneous inclusion of more than one audio and/or video stream, such as audio streams for multiple languages or video streams at different bitrates (e.g. for adaptive streaming formats such as Apple HLS or MPEG-DASH).

At the essence level, format differences relate to the technical attributes of the elementary streams. For video, these include the frame size (resolution) and shape (aspect ratio), frame rate, color depth, bit rate, etc. For audio, such attributes include the sample rate, the bit depth, and the number of channels.

Container file formats also differ greatly in the type and organization of the metadata they support, both at the file level and the stream level. In this context, metadata generally falls into one of two categories:
• Playback metadata is time-synchronized data intended for use by the playback device to correctly render the content to the viewer, such as:
  • captions, including VBI (Line 21) or teletext in SD and VANC in HD;
• Content metadata describes the content in ways that are relevant to a particular application, workflow, or usage, such as:
  • for Web streaming: title, description, artist, etc.;
  • for television commercials: Ad-ID, Advertiser, Brand, Agency, Title, etc.;
  • for sporting events: event name, team names, athlete names, etc.

File formats are designed with various purposes in mind, so naturally the metadata schemes used by those formats are not all the same. A metadata scheme designed for one purpose may not include the same attributes as a scheme designed for a different purpose, which can greatly complicate the task of preserving metadata in the transcoding process.

The essence of compression

In many cases the most important difference between file formats, and the reason transcoding is required between them, is the codecs they support and the amount of compression applied to the video and audio streams that comprise each format's essence. Those factors are determined by the purpose for which each format was defined. The common video file formats fall into a half-dozen different usage categories, including acquisition, editing, archiving, broadcast servers, distribution, and proxy formats.

To understand why different compression schemes are used in these different categories we need a quick refresher about how compression works. In video, compression schemes are generally based on one or both of two approaches to reducing the overall amount of data needed to represent motion pictures:
• Intra-frame compression reduces the data used to store each individual frame.
• Inter-frame compression reduces the data used to store a series of frames.

Both compression approaches start with the assumption that video contains a fair amount of redundancy. Within a given frame, for example, there may be many areas of adjacent pixels whose color and brightness values are identical. "Lossless" intra-frame compression simply expresses those areas more concisely. Rather than repeat the values for each individual pixel, the values may be stored once along with the number of sequential pixels to which they apply.
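The run-length idea described above can be sketched in a few lines of code. This is an illustrative sketch only, not any particular codec's implementation; real intra-frame coders work on blocks and add transforms and entropy coding, but the principle of replacing a run of identical values with a single (value, count) pair is the same.

```python
def rle_encode(pixels):
    """Run-length encode a sequence of pixel values.

    Each run of identical values is stored once as a
    (value, run_length) pair instead of being repeated.
    """
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Reverse the encoding: expand each (value, count) pair."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# A scanline with large flat areas compresses well:
line = [255] * 6 + [0] * 3 + [255]
encoded = rle_encode(line)
print(encoded)                        # [(255, 6), (0, 3), (255, 1)]
assert rle_decode(encoded) == line    # lossless round trip
```

Ten pixel values collapse to three pairs here, and decoding reproduces the input exactly, which is what makes this kind of intra-frame compression lossless.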
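As a side note, the compression figures quoted in the introduction can be checked with back-of-the-envelope arithmetic. The 30 fps frame rate and 24-bit color depth below are illustrative assumptions chosen to show how a ratio on the order of 250:1 arises; actual ratios depend on the source's frame rate, chroma sampling, and bit depth.

```python
# CD-quality LPCM: 16 bits x 44.1 kHz sample rate x 2 channels
cd_bps = 16 * 44_100 * 2            # 1,411,200 bits per second
aac_bps = 128_000                   # 128 kbps AAC
print(round(cd_bps / aac_bps))      # prints 11 (one-eleventh the bitrate)

# Uncompressed 1080p video, assuming 24 bits/pixel and 30 frames/s
# (both parameters are illustrative assumptions)
raw_bps = 1920 * 1080 * 24 * 30     # about 1.49 Gb/s
h264_bps = 6_000_000                # a 6 Mb/s H.264 encode
print(round(raw_bps / h264_bps))    # prints 249, on the order of 250:1
```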