
A Technology Office White Paper

Considerations for Interoperable Master Formats

September 9, 2009, Edition 3

Contents

EXECUTIVE SUMMARY

INTRODUCTION

COMPRESSION AND VISUAL QUALITY / FILE SIZE

CODEC PERFORMANCE - SPEED VS. FILE SIZE

INDUSTRY COMPATIBILITY

MECHANISM FOR PICTURE-DEPENDENT METADATA

INDUSTRY ACCESSIBILITY

INDUSTRY EXTENSIBILITY

SUPPORT FOR SINGLE VS. MULTIPLE CODECS

CONCLUSION

APPENDIX A - VIDEO COMPRESSION TECHNOLOGY OVERVIEW

APPENDIX B - QUALITY VS. COMPLEXITY

CONTACT INFORMATION


Executive Summary

Traditionally, film provided a standard medium for producing, editing, storing, and replicating content. As the film and television industries evolve more and more into the digital domain, the lack of standard mechanisms for content interchange has meant significant inefficiencies, costs, and limitations in the content production workflow. Specifying a well-considered, standard “Interoperable Master Format” provides a tremendous opportunity to alleviate many of the issues that exist today, extending the robustness and consistency of film to digital production.

In order to achieve this goal, the specification of an Interoperable Master Format needs to take into account not only quality, but longevity, flexibility, security, and workflow efficiency. The choice of codec or codecs has a significant bearing on these attributes and, as such, this paper focuses on codec choice in relation to the following key areas:

• Compression and Visual Quality / File Size

• Codec Performance - Speed vs. File Size

• Industry Compatibility

• Mechanism for Picture-Dependent Metadata

• Industry Accessibility

• Industry Extensibility

• Support for Single vs. Multiple Codecs

There are a number of codecs considered possible options for an IMF, including JPEG 2K, CineForm, H.264/AVC-Intra-Pro, JPEG XR, and VC2. There are pros and cons for each of these and ultimately, the codec choice (or choices) should be determined by the importance of the aforementioned characteristics and how the codecs perform against these criteria.


Introduction

In a content production workflow, the need for efficient mechanisms to encode and decode content into and out of an Interoperable Master Format (IMF), while maintaining high visual fidelity, is generally acknowledged. Equally important, however, are workflow considerations including codec implementation requirements, price, industry compatibility, format extensibility, and other characteristics. The choice of an IMF codec (or codecs) and the software infrastructure that surrounds the format have a significant bearing on these characteristics; this paper reviews the state of the art of different compression schemes and approaches and provides commentary on their relative effect on the content production workflow.

Compression and Visual Quality / File Size

The choice of codec for an Interoperable Master Format impacts the workflow in several measurable ways. Firstly, the higher the ratio of compression that is applied, the easier and quicker it is to transfer files between equipment, over networks, or between studios and post production facilities. However, the added convenience of smaller file sizes is generally offset by a corresponding reduction in visual quality when higher compression is applied. There is an important balance between the demand to maintain very high quality and the convenience offered by small file sizes. Also, in instances where video may be re-processed several times (not uncommon in a post production environment), the initial format of the video and the processes surrounding re-compression are critical to the visual quality of the final result.

Effect of Compression on File Transfer and Storage

As stated, one of the effects of choosing a particular compression technology (or ratio of compression) is the impact on the resulting file size, which affects the storage requirement and the corresponding time it takes to transfer. In order to understand this and establish a reference point, let’s consider the parameters of a “typical” film. The table below provides some basic statistics for an uncompressed 10-bit 1080p24 video file.


Table 1 - Uncompressed 10bit 1080p24 Video File Statistics

Film Duration (minutes) 90

Film Duration (seconds) 5400

Horizontal Resolution 1920

Vertical Resolution 1080

BitDepth / Chroma 10 / 4:4:4

FPS 24

Total Film Storage (MB) 961,084

Sustained real-time Data Rate (MB/s) 187

Available HDD Transfer Rate (MB/s) 25

Table 2 below illustrates data storage and transfer requirements for different compression ratios.

Table 2 - File Sizes and Data Transfer Times for Various Compression Algorithm Types

Compression Type:                 Uncompressed   Lossless   Visually Lossless   Lossy
Ratio:                            1:1            2:1        6:1                 25:1
Total MBytes to Store/Transfer:   961,084        480,542    160,181             38,443
Transfer Time (Minutes):          641            320        107                 26
Transfer Time (Hours):            10.7           5.3        1.8                 0.4
Transfer Ratio (vs. real time):   7.1x           3.6x       1.2x                0.3x
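The arithmetic behind Tables 1 and 2 can be reproduced directly. The short Python sketch below is illustrative only; it assumes the same parameters used in the tables (a 90-minute 1080p24 film at 10-bit 4:4:4, a 25 MB/s drive, and 1 MB = 2^20 bytes for storage) and derives the file size, transfer time, and real-time ratio for each compression ratio.

# Illustrative sketch: reproduce the storage/transfer figures of Tables 1 and 2.
# Assumptions match the text: 90-minute 1080p24 film, 10-bit 4:4:4 RGB,
# 25 MB/s sustained drive throughput, 1 MB = 2^20 bytes for storage.

WIDTH, HEIGHT, FPS = 1920, 1080, 24
BITS_PER_PIXEL = 10 * 3            # 10 bits per component, 3 components (4:4:4)
DURATION_S = 90 * 60               # 90-minute film
DRIVE_MB_PER_S = 25                # external USB/FireWire drive assumption

frame_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL / 8
total_mb = frame_bytes * FPS * DURATION_S / 2**20   # ~961,084 MB uncompressed

for label, ratio in [("Uncompressed", 1), ("Lossless", 2),
                     ("Visually Lossless", 6), ("Lossy", 25)]:
    size_mb = total_mb / ratio
    transfer_min = size_mb / DRIVE_MB_PER_S / 60
    realtime_ratio = transfer_min / 90               # vs. the 90-minute running time
    print(f"{label:18s} {size_mb:10,.0f} MB  {transfer_min:5.0f} min  {realtime_ratio:.1f}x")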

Uncompressed: For a 90-minute film stored in an uncompressed (1:1) format, total storage is approximately 1 TB. Assuming a data transfer rate of 25 MBytes/sec for an external USB or FireWire hard drive, the total transfer (copy) time would be about 11 hours, or roughly seven times longer than a “real-time” transfer of 90 minutes. This lengthy data transfer process and the corresponding large storage requirement are among the motivations for considering compression as a format option within IMF.


Lossless Compression: An alternative to uncompressed media is to store the video data using an arithmetically lossless compression format. That is, the entropy-coding stage of a codec may be applied to the data, resulting in a 2-3x improvement in storage requirements and file transfer time. The corresponding data parameters using a lossless compression ratio of 2:1 on our “typical” 90-minute film are shown in the table above; e.g., the storage requirement and the transfer time are both halved.

From a technical standpoint, for lossless compression, the data that emerges after decompression is arithmetically identical to the original source. The cost of doing so is merely the compression and decompression time and the requirement to support the codec. A number of lossless video codecs are currently available off the shelf (e.g. Yuls, FFV1, and others; a Zip-style general data compressor could also be used). These are all capable of delivering a compression ratio between 2:1 and 3:1 at 8 bits of pixel depth.

However, most lossless codecs have serious limitations in compressibility at bit depths above 8 bits. As bit depths increase towards 10 or 12 bits, compressibility decreases – typically below 2:1. It is worth mentioning that JPEG 2K, CineForm, some professional profiles of AVC/H.264, and two emerging standards (JPEG XR and VC2) are capable of reversible coding at higher bit depths with less impact on compressibility.

Visually Lossless Compression: Another alternative is to apply “light” video compression techniques that result in extremely high-quality compressed images in the range of 4:1 to 12:1 compression. Such techniques are often known as “visually lossless” because the human visual system is generally unable to discern visual differences between such lightly compressed files and the original sources. The main codecs available that offer visually lossless compression include JPEG 2K, CineForm, JPEG XR, VC2, and H.264/AVC-Intra-Pro. Note from the table above that at 6:1 compression, transferring the compressed IMF from a USB/FireWire drive to other media takes just longer than real time (1.2x). On a Gigabit Ethernet connection that sustains 90 MBytes per second, our 90-minute film from Table 1 can be transferred in about 30 minutes.

It is interesting to note that many digital acquisition source formats use visually lossless compression at the acquisition stage. The RED camera, for example, uses JPEG 2K with up to 12:1 compression internally, though the result is called “RAW” and treated as if it were uncompressed. Silicon Imaging cameras use the CineForm RAW codec at 4:1 compression. Similarly, HDCAM SR in its 440 Mbps mode uses 4:1 visually lossless compression.


There are differences in the design goals of visually lossless codecs that are worth considering, as they yield different performance and slight variations in file size. These will be explored in more detail in the next section.

Lossy Compression: For completeness, it is also worth considering the consequences of the more aggressive, lossy compression that is usually used for purposes of distribution (Blu-ray, DVD, cable/satellite delivery, Internet streaming, etc.). In some instances, for example a mobile video stream, the source material may not need to be of equivalent quality to the original source. At a compression ratio of 25:1 (illustrated in the table above), a distribution codec such as H.264/AVC would still deliver quality superior to Blu-ray Disc.

It is not suggested that lossy compression in any way represents a viable option for coding in an Interoperable Master file, although it may be useful in some instances.

Codec Performance - Speed vs. File Size

Observing existing post production workflows reinforces an important point: in some instances there may be a need not only to perform multiple different conversions from the source file, but also to re-create new source files of a different format from the original source. Performing these conversions efficiently in software, on affordable workstations that deliver faster-than-real-time performance, is recommended in order to minimize costs and labor/time.

A second emerging trend is towards software-based render farms, including the ability to process multiple sources simultaneously on commodity hardware. Render farm efficiency is influenced by the bandwidth required to move files and also by the speed of codec performance. A reduction in bandwidth of 4x to 6x (visually lossless compression) is beneficial for data throughput. Similarly, codec computation requirements should be minimized for highest render-farm throughput.

Given the background above, it is important to note that codec performance – the time it takes to encode into or out of a particular codec format – varies substantially based on underlying algorithm design choices. For highest workflow efficiency and lowest cost, care should be taken in selecting the appropriate codec to minimize encoding and decoding times. (Specific algorithm variations and their performance are discussed in more detail in the Appendices.)

Regarding design considerations for compression algorithms, and assuming that differing design choices yield equivalent visual fidelity, the next high-level design decision is a choice between smaller file sizes and higher performance. Small file size and high codec performance are often inversely related. A non-IMF example where this is evident is the choice of high-compression AVCHD encoding (small files) for consumer HD camcorders. In this case a design decision was made to create small file sizes at the expense of encoder complexity in order to fit more content on flash media.

JPEG 2K: One of the overall design goals of JPEG 2K was to reduce the resulting file sizes of compressed images. This goal was achieved, although at the expense of computational complexity. While JPEG 2K’s arithmetic entropy coding creates files roughly 15%-20% smaller than Variable Length Coding, the overall compression/decompression process is roughly 10 times slower as a result, assuming no hardware acceleration such as that available in the DVS Clipster and Venice products. (See “Appendix B - Quality vs. Complexity” for more details.)

CineForm: The overall design goal for CineForm compression was fast codec performance at the expense of slightly larger file sizes. Although the compression building blocks differ between compression formats, the dominant influence on performance and file size is the implementation choice for entropy coding. Arithmetic coding was chosen for JPEG 2K while CineForm chose Huffman coding (Variable Length Coding), which accounts for the majority of the performance difference between the codecs. CineForm compression can encode or decode 12-bit 1920x1080 4:4:4 images at about 100 frames per second (4x faster than real time) on a single-chip quad-core PC. These entropy coding techniques are discussed further in the Appendices.

H.264/AVC: H.264/AVC offers two different, mutually exclusive algorithms for entropy encoding: CABAC and CaVLC. CABAC is based on arithmetic coding (similar to JPEG 2K) while CaVLC is variable length (closer to Huffman coding). CABAC has demonstrated an average 15% better efficiency compared to CaVLC, but is far more complex to implement. For intra-only professional HD TV production profiles the following design tradeoff is frequently made: for “high” bit rates (above 100 Mb/s) CaVLC is used instead of CABAC, and the difference in coding efficiency is counterbalanced by the cost efficiency and ease of implementation. At bit rates in the range of 200 Mb/s, the additional 15% gain in efficiency has been judged less important than these implementation aspects.

In practical terms, CineForm, JPEG XR, and VC2 have been designed with the same kind of trade-off in mind, favoring reduced CPU computational load at the cost of modestly larger file sizes.


Industry Compatibility

The topic of industry compatibility explores workflow considerations the industry must address through software development tasks, in support of the format chosen by IMF. Three considerations are offered:

• Support all Major Wrappers: For broadest compatibility, it is preferable that the chosen codec support existing wrapper formats including AVI, MOV, and MXF. This is not to say the chosen IMF should be in one of these formats, but rather that IMF packages will commonly be rewrapped as part of the post production workflow. These existing wrapper formats serve as an interface buffer between the underlying compression layer and end-user applications that never need to be made IMF compliant. Adopting standard wrappers will accelerate integration of IMF into standard industry tools, which usually support one or more of the standard wrapper types. This basically means the underlying compression needs to be accessible through existing APIs such as Video for Windows, DirectShow, QuickTime, and GStreamer. As a corollary to the above point, it is common to use numerous tools within a post production workflow, including large-vendor proprietary tools (Avid, Autodesk, Apple, etc.) combined with internally developed tools that may leverage public domain software, an example of which is AVISynth. It will be beneficial to choose a codec that offers known compatibility (or standardized interfaces) to ease integration within custom tools.

• Cross-Platform Support: All major OS platforms – Windows, Mac OS, and Linux – are used in post production, and supporting IMF compatibility across all operating systems is important. To accelerate the speed of adoption, the IMF codec and its surrounding wrapper layer(s) should be available across major platforms. It is acknowledged that not all platforms support all wrapper types in a seamless manner, so it is important that tools be available to rewrap between formats without disturbing the underlying compression. For instance, although both Windows and Mac OS support AVI and MOV files, Windows applications generally prefer AVI files while Mac applications generally prefer MOV wrappers.

• Performance (Again): It is acknowledged that whatever compression format is chosen by IMF, it will not initially be supported on all platforms and by all wrapper types. In this case it is likely (at least in the short term) that conversions will be made between IMF formats and existing formats (such as DPX) that are already supported within existing tools. In such cases the issue of performance (getting into and out of IMF), as discussed in the last section, remains an important consideration.

JPEG 2K: JPEG 2K has made a significant impact in a number of areas relating to Digital Cinema, in particular where it is used in a single-codec environment with hardware compressors/decompressors. While it appears to be gaining some industry support, it does not yet appear to have been widely adopted for post production by major industry tools, presumably for reasons of performance. Apple includes a QuickTime wrapper around a JPEG 2K compressor that ships within Final Cut Studio, but its bit depth is limited to 8 bits. In addition, 3rd-party DirectShow and QuickTime wrappers are also available for JPEG 2K.

CineForm: CineForm developed its compression format in 2001, and includes both QuickTime and DirectShow wrappers for compatibility and simple integration with a broad range of industry tools. CineForm’s codec is engineered to perform well without hardware acceleration (at the expense of larger file sizes) and is available on both Windows and Mac, and Linux development is underway. CineForm provides SDKs for Windows, Mac OS, and (soon) Linux.

H.264/AVC: The H.264/AVC standard was ratified in 2003, though the professional profiles likely to be viable for IMF (such as CaVLC 4:4:4 Intra) were only ratified in April 2007. While there is broad support for H.264/AVC on Mac, Linux, and Windows, this does not necessarily include the professional profiles.

Mechanism for Picture-Dependent Metadata

The concept of Picture-Dependent Metadata provides a mechanism to store data as part of the media file that could significantly affect the rendering or format of that video, without affecting the encoded video. This can be tremendously beneficial to digital workflows as it negates the need to re-encode any video when creating a new master file.

If the IMF were to adopt the concept of “Picture-Dependent Metadata” it could provide the ability to perform post-decode operations on images prior to presenting them to a calling application. Examples of such post-processing could include:

• Image exposure and color correction adjustments (Lift / Gamma / Gain / Saturation), including CDL (Color Decision List) compliance

• Pan and Scan tracks

• Zoom / Crop / Rotate

• Spatial Re-sampling


• Aspect Ratio modification including letterbox/pillar box (overlaps with zoom/crop/rotate)

• 3D convergence adjustments: horizontal, vertical, rotation including zoom adjust

• 3D floating windows

By way of example, the idea of applying CDL adjustments as a “layer” to an underlying image sequence is generally understood by most. If we consider CDL information as Picture-Dependent Metadata, then CDL post-processing is performed in real time after the decode operation and prior to presenting the image to a calling application. This concept defers the need to create intermediate render formats because they are created dynamically in real time, thus saving time as well as intermediate storage.
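To make the CDL example concrete, the following sketch shows how a decoder-side metadata layer might apply an ASC CDL-style grade to a decoded frame before handing it to the calling application. The function and parameter names are hypothetical and are not drawn from any IMF or CineForm interface; the slope/offset/power and saturation math follows the published ASC CDL formulation, and pixel values are assumed to be normalized floats.

# Minimal, hypothetical sketch: applying ASC CDL-style Picture-Dependent Metadata
# after decode, before returning frames to the calling application.
# Pixel values are assumed to be normalized floats in [0, 1].

def apply_cdl(rgb, slope, offset, power, saturation):
    """Apply ASC CDL slope/offset/power per channel, then saturation (Rec. 709 luma)."""
    graded = []
    for value, s, o, p in zip(rgb, slope, offset, power):
        v = max(value * s + o, 0.0)       # slope and offset, clamped at black
        graded.append(v ** p)             # power (gamma-like) adjustment
    r, g, b = graded
    luma = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return tuple(luma + saturation * (c - luma) for c in graded)

def decode_with_metadata(decoded_frame, metadata):
    """Hypothetical decoder hook: grade every pixel using CDL values carried as metadata."""
    cdl = metadata["cdl"]
    return [[apply_cdl(px, cdl["slope"], cdl["offset"], cdl["power"], cdl["sat"])
             for px in row] for row in decoded_frame]

# Example: a 1x2 "frame" and a mild warm-up grade carried as metadata.
frame = [[(0.18, 0.18, 0.18), (0.70, 0.60, 0.50)]]
meta = {"cdl": {"slope": (1.1, 1.0, 0.95), "offset": (0.0, 0.0, 0.01),
                "power": (1.0, 1.0, 1.05), "sat": 0.9}}
print(decode_with_metadata(frame, meta))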

In a more sophisticated example, a full-aperture 2K 4:3 data scan in IMF format may include CDL color metadata plus embedded pan/scan tracks intended for both 16:9 HD and 4:3 SD. In this case only one such source master would need to exist, and a calling application could request either an HD 16:9 version or a 4:3 SD 60i version. The compression layer interprets the assigned Picture-Dependent Metadata, performs the CDL processing, applies the pan/scan data, and re-samples the images. For the SD version, output pulldown is added, again as a real-time process. The results are passed to the distribution encoder for creation of Blu-ray or DVD (for example) outputs, again without the need to render or create intermediate files.

It is acknowledged that all of the operations identified can be applied as independent post production processes to decoded images – that is mostly how these operations are performed today. However, each operation usually requires rendering into a new intermediate or master file which takes time and disk storage. The ability to dynamically create these images in real time is an enhancement to workflow efficiency and is worth considering.

While it is expected that an IMF-compliant decoding application could perform all these image manipulations on the decoded frame, the workflow breaks when rewrapped IMF packages are used in non-compliant tools. We feel this may happen with high regularity, particularly during the early days of IMF adoption. To preserve the correct processing of the image and formatting, this metadata processing should be moved into the compression layer, so that rewrapped media can still be used with these extended image formatting features. CineForm provides this capability today, and theoretically the standard metadata mechanisms of JPEG 2K, JPEG XR, etc., could also be used to implement Picture-Dependent Metadata style image manipulation at the decoder layer.


Industry Accessibility

Accessibility concerns access to, and use of, the underlying IMF technology within the industry, from both a business and a technology standpoint.

With one exception, the formats most likely to be considered for adoption by IMF are either already ratified through various standards efforts or are currently being worked on by standards bodies. Three of the formats considered most likely for adoption are JPEG 2K (ISO/IEC), H.264 / MPEG-4 AVC (ITU-T / ISO/IEC), and CineForm (privately developed; not in the public domain).

About Standards: Standardization offers the benefit of industry experts who test an algorithm through a rigorous technical evaluation process, ensure its interfaces and syntax, and monitor its extensibility. Standards are then managed by national or international bodies that ensure they may be used by qualified parties who request access. A standard is thus separated from the financial success of any individual company that chooses to implement it; i.e., if an individual company with a JPEG 2K implementation goes out of business, it has no impact on the standard itself or on access to the standard by others.

Although standards are available for implementation by all who request access, it must be pointed out that standards are not necessarily royalty free unless this is addressed by the standards body. MPEG and H.264 are examples of standards that are available for implementation but also require royalty payments to patent holders. For purposes of clarity, JPEG 2K Part 1 may be implemented by all interested parties without license royalties to patent holders, though the JPEG 2K extensions are not necessarily clear of patent holders. Further, unless end users develop their own JPEG 2K implementation, they must also purchase a license from the developer of the implementation they use, separate from any patent royalties that may be required.

Non-Standards Consideration: As pointed out, standards provide a primary benefit of ensuring that any given format is disassociated from the success of any given company. Adoption by IMF of a format that is not ratified by a standards body means direct negotiation must happen between participants in the industry and the company whose technology is under consideration. CineForm’s compression is perhaps the only proprietary technology that may be considered by the IMF body. If considered, it may also be necessary to consider some form of standardization, industry license, or a mechanism to make a standard implementation’s source code available as part of the media files it generates.


Industry Extensibility

Evolution of workflow techniques in the professional creative industry is occurring at a rapid pace, as is the migration of source and distribution formats. 8K film scans, 4K digital camera sources, and 3D content are but a few examples of evolving technologies that impact post production. With digital acquisition, the opportunity or requirement to move post processes that were previously performed at a post production facility forward onto the shooting location is commonplace. Also, given the rapid migration to digital processing and storage, it is possible that a defined IMF may be used as an archival format in addition to a media exchange format.

In light of this rapid change, it is important to consider the impact of workflow evolution on the choice of IMF format. If such care is not taken, and if a chosen format is not easily extended, we believe the relevance of IMF may suffer, with the industry adopting other available techniques that support evolving workflows.

Support for Single vs. Multiple Codecs

The main strength of limiting compression to a single codec is the perception that workflow is simplified by a single codec choice. This is certainly true to a degree, though there are a number of reasons why support for multiple codecs may be necessary. Support for multiple codecs in the format provides the following advantages:

• Compression algorithms are evolving rapidly, offering higher picture quality with lower computing complexity. Supporting multiple codecs would make it possible to optimize codecs for the compression ratio operating points required for given IMF applications. It is also difficult for a single compression format to suit all current and future applications of the IMF.

• Future-proofs the solution to support likely evolutions of picture resolution (such as 8K), color space (e.g. extending recommendation 709 to address wider gamut) and increases in bit-depth (such as 16 bit floating point).

• Allows for better support of compression of 3D content as soon as specific codecs become available.

• Depending on the container that is selected, being able to support multiple codecs in the same container will add flexibility: “no compression”, optimal “lossless compression”, and optimal “visually lossless” compression could all be handled from the same container. It will also help adapt the compression to the content, allowing the most optimal available compression option to be used for any given application.

• Having the ability to support several different codecs would allow use of a visually lossless codec as well as a purely lossless one when possible.

Supporting multiple codecs in a single container could easily rely on adding tags in the container to specify the codec used and the pixel representation. Such a method is used in common containers like AVI and MOV: a 32-bit “FourCC” code is used to store 4 ASCII characters indicating the codec used. The MXF container also provides an equivalent mechanism.
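As a simple illustration of this tagging approach, the sketch below packs and unpacks a FourCC codec identifier in the way AVI/MOV-style containers do. The codec mapping shown is an example only, not a proposed IMF codec registry.

import struct

# Illustrative sketch of FourCC codec tagging as used by AVI/MOV-style containers.
# The mapping below is an example only, not a proposed IMF codec registry.
EXAMPLE_FOURCCS = {
    b"mjp2": "JPEG 2000",
    b"CFHD": "CineForm HD",
    b"avc1": "H.264/AVC",
}

def pack_fourcc(tag: str) -> bytes:
    """Store a 4-character ASCII codec tag as a 32-bit field."""
    return struct.pack("4s", tag.encode("ascii"))

def unpack_fourcc(field: bytes) -> str:
    """Read the 32-bit field back into its 4-character tag."""
    return struct.unpack("4s", field)[0].decode("ascii")

field = pack_fourcc("CFHD")
tag = unpack_fourcc(field)
print(tag, "->", EXAMPLE_FOURCCS.get(tag.encode("ascii"), "unknown codec"))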

Given the right container, it may also be possible to provide reference source code or even binaries of the decoder within the file itself, allowing the content to be played in future generations.

As a final comment for consideration, rather than IMF specifying particular compression codecs, one suggestion is to intentionally not define any specific compression. Instead, define video interfaces and pixel formats that interface to the compression system; such interfaces can be rigorously defined and are unlikely to change substantially as post production workflows evolve. Such a solution allows underlying codec technology to evolve just as workflows themselves evolve. An added benefit is that a specific project could choose a specific codec based on the features most beneficial to that project.
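As a rough sketch of what such a codec-agnostic interface might look like (the class and method names here are hypothetical, not taken from any IMF proposal), a decoder would be required only to declare its pixel format and deliver frames in that format:

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sketch: IMF defines the pixel format and interface, not the codec.

@dataclass(frozen=True)
class PixelFormat:
    width: int
    height: int
    bit_depth: int          # e.g. 10 or 12
    chroma: str             # e.g. "4:4:4"
    color_space: str        # e.g. "Rec. 709"

class FrameDecoder(ABC):
    """Any codec implementation plugs in behind this interface."""

    @abstractmethod
    def pixel_format(self) -> PixelFormat: ...

    @abstractmethod
    def decode_frame(self, index: int) -> List[List[Tuple[int, int, int]]]:
        """Return frame `index` as rows of (R, G, B) samples in the declared format."""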


Conclusion

Summarizing the considerations for Interoperable Master Formats described in this document, it should be remembered that one of the main purposes for developing the IMF format is to allow more efficient and interoperable workflows in post production. With that objective in mind, the following points should be considered to meet the overall goal:

• A plethora of workflow applications have differing requirements as pertains to video formats, resolutions, and compression schemes.

• Some reasonable level of compression is required in an intermediate format, though this should not come at the expense of video quality, nor should it compromise the ability of the originator or receiver of the content to efficiently create or use the content.

• Workflows are always evolving, and as such, the IMF format should provide the means to also support future workflows (e.g. greater resolution, wider gamut and bit depth, codecs for 3D, etc.).

• Providing flexibility to the content owner to choose an appropriate codec for their content (assuming it can be broadly supported) always allows the best use of compression, and will help establish and ensure longer-term support for the format.

As far as codec choice is concerned, there are many pros and cons for each of the available options, and these should be weighed in light of the considerations above. There is no perfect solution, though it is our belief that by accommodating some of the key workflow issues and requirements raised in this paper, the IMF can become not only an accepted but an integral and valued component of the content production process, and remain so for many years to come.


Appendix A - Video Compression Technology Overview

In order to understand the implications of selecting a video compression technology, it is helpful to understand the core methodologies and steps behind video compression. For visually lossless compression, the target compression ratio is in the range of 4:1 to 12:1 or more, and codecs generally encode video frame by frame (temporal prediction is not used, to limit propagation of encoding artifacts and save computing power). The main codecs available in this arena are the visually lossless versions of JPEG 2K, JPEG XRi and VC2ii, complemented by CineForm and AVC-Intra-Proiii. Avid’s DNxHD (aka VC3) could also be considered.

Both JPEG 2K and CineForm are already used for the visually lossless DI process, while DCI has made the choice of JPEG 2K as its distribution format for Digital Cinema. Generally AVC-Intra-Pro, VC2, and JPEG 2K deliver better encoding performance, but at the expense of higher codec complexity, while CineForm and JPEG XR have been designed to minimize computing power.

To compress video, several toolsiv are generally applied sequentially (a toy sketch combining these stages follows this list):

• Pixel representation transformation: this first step deals with color representation (e.g. RGB vs. YUV vs. XYZ), color component sub-sampling (e.g. 444 vs. 422 vs. 420) and pixel color component bit-depth (e.g. 12/10/8 bits). Obviously most of those transformations are irreversible and thus not applicable for lossless, but in any case an appropriate pixel representation will help save both bit-rate and computing power.

• Picture redundancy extraction: removing spatial or temporal picture redundancy is an efficient tool that may be used to compute “residuals”, i.e. the difference between original pixels and a causal prediction of them. Two kinds of predictions are available: spatial (or intra) and temporal (or inter). Spatial predictions are computed from surrounding pixels (e.g. the pixel above or to the left) while temporal predictions make use of past and/or future pictures. Temporal prediction is very efficient at removing redundancy in a sequence of pictures, but it requires significant computing resources to compute motion information and it makes frame-accurate editing of a compressed sequence more difficult (decoding and re-encoding is required). For those reasons temporal predictions are mainly reserved for final program distribution codecs and high compression ratios (such as MPEG-2, VC1 and H.264). Some codecs, like JPEG 2K, JPEG XR and VC2, don’t even use spatial predictions. Pixel prediction, when used, is a fully reversible process.

• Decomposition in the frequency domain: residuals (or original pixels when prediction is not used) are then transformed into the frequency domain. This is called the “wavelet transform” for JPEG 2K, the “DCT” for MPEG-2, or the “integer transform” for H.264. This frequency transformation helps in increasing the correlation within a given frequency range and is generally reversible (depending on computation accuracy).

• Quantization: this technique is used for non-lossless compression only; it reduces the precision of the coefficients resulting from the frequency transform and thereby saves bits at the expense of picture quality. Modifying the quantization level is useful to control the resulting bit rate, for example when encoding for Blu-ray or broadcast applications. Generally higher frequencies are quantized more heavily than lower ones, following the properties of human vision, where the eye’s sensitivity is lower for high frequencies. After quantization, JPEG XR proposes to save some more bits by predicting quantized coefficients from their causal environment.

• Entropy encoding: this is the final step, where statistical information is used to further compress the bits resulting from previous processes (e.g. the most probable message fragment is translated into the smallest bit-field). Several different techniques are used: mainly variable length encoding (as in MPEG-2, H.264/AVC, CineForm, VC2 and JPEG XR) and arithmetic coding (as in JPEG 2K and H.264/AVC). This process may be applied on blocks, on bit-planes, or on frequency ranges. It may be contextual and adaptive or not, and is generally a sequential process. Entropy encoding is in any case fully reversible.
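The toy sketch below strings these stages together on an 8-sample block: a reversible Haar-style transform, uniform quantization, and a Shannon-entropy estimate of the coded size. It is purely illustrative of how the stages compose and does not represent any of the codecs discussed here.

import math
from collections import Counter

# Toy illustration of the stage ordering described above:
# transform -> quantization -> entropy estimate. Not any real codec.

def haar_step(samples):
    """One level of a Haar-style transform: pairwise averages and differences."""
    avg = [(a + b) / 2 for a, b in zip(samples[0::2], samples[1::2])]
    diff = [(a - b) / 2 for a, b in zip(samples[0::2], samples[1::2])]
    return avg + diff

def quantize(coeffs, step):
    """Uniform quantization: the only lossy stage in this toy pipeline."""
    return [round(c / step) for c in coeffs]

def entropy_bits(symbols):
    """Shannon entropy estimate of the coded size, which arithmetic coding approaches."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n * math.log2(n / total) for n in counts.values())

block = [52, 54, 53, 55, 80, 82, 81, 83]      # 8 neighboring pixel samples
coeffs = haar_step(block)                      # most of the energy lands in the averages
quantized = quantize(coeffs, step=2)
print(coeffs, quantized, round(entropy_bits(quantized), 1), "bits (estimate)")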

Based on the above “coding tools”, codecs are carefully tuned by experts for a given compression-ratio operating point: the “cooking recipe” is not the same if you want to encode for a film camera (125 to 357 Mb/s), for DCI (250 Mb/s), for Blu-ray (20 to 40 Mb/s), or for HD broadcast applications (6 Mb/s on average). Such a coding toolbox allows codecs to be customized for a specific application field, thus optimizing either fidelity, complexity, scalability, latency, etc.

Codec technology is in constant evolution. This is highlighted by the emergence of new standards like JPEG XR and VC2. This evolution is driven by new requirements or new applications such as higher picture resolution, higher bit depth for higher dynamic range, new color spaces, scalability for consumer applications, multiple-view encoding for 3D, etc.


Progress in encoding algorithms is also key and is paired with the availability of ever more powerful computing architectures.


Appendix B - Quality vs. Complexity

As discussed previously, some encoding tools are extremely compute intensive (like temporal prediction) and, when used, will have a strong impact on the cost of a visually lossless encoding platform (though there are other reasons preventing the use of temporal prediction for visually lossless coding). The entropy coding methodology can also have a strong effect on the compute resources required for encode and decode.

On an optimized port of JPEG 2K on a PC (with SSE code), we estimate that “entropy encoding” will account for 75% of the CPU while the remainder, i.e. all other compression operations, accounts for only 25%. In addition, it is estimated that the wavelet and quantize operations together only account for around 6% of the overall CPU usage.

The reason for this is that the arithmetic entropy coding used by JPEG 2K is very efficient but also very computationally intensive; it cannot be optimized using SSE and runs only on the CPU. When compared to Huffman variable-length entropy coding, it is of the order of 15%-20% more efficient (i.e. identical data can be stored with 15%-20% fewer bits than with Huffman), though encode and decode are of the order of 10 times slower on equivalent hardware.
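The relationship between variable-length coding and the entropy bound that arithmetic coding approaches can be illustrated with a small, self-contained experiment: build a Huffman code for a skewed symbol distribution and compare its average code length against the Shannon entropy. The probabilities below are invented for illustration, and the size of the gap depends entirely on the symbol statistics, so this toy figure is not meant to reproduce the 15%-20% quoted above for real codecs.

import heapq
import math

# Illustrative only: Huffman average code length vs. the entropy bound
# that arithmetic coding can approach. The probabilities are invented.

def huffman_lengths(probs):
    """Return the code length per symbol for a Huffman code over `probs`."""
    heap = [(p, i, {sym: 0}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**c1, **c2}.items()}   # one level deeper
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.70, "b": 0.15, "c": 0.10, "d": 0.05}   # skewed, invented distribution
lengths = huffman_lengths(probs)
huffman_avg = sum(probs[s] * lengths[s] for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(f"Huffman: {huffman_avg:.3f} bits/symbol, entropy bound: {entropy:.3f} bits/symbol")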

H.264/AVC offers two different, mutually exclusive tools for entropy encoding: CABAC and CaVLC. CABAC is based on arithmetic coding while CaVLC is variable length. CABAC has also demonstrated an average 15% better efficiency compared to CaVLC, but is far more complex to implement. For intra-only professional HD TV production profiles the following design tradeoff is frequently made: for “high” bit rates (above 100 Mb/s) CaVLC is used instead of CABAC, and the difference in coding efficiency is counterbalanced by the cost efficiency and ease of implementation. At bit rates in the range of 200 Mb/s, the additional 15% gain in efficiency has been judged less important than these implementation aspects.

CineForm, JPEG XR and VC2 have been designed with the same kind of trade-off in mind: picture quality versus algorithm complexity. Their design teams started by acknowledging the complexity of JPEG 2K and sacrificed a little file-size efficiency to allow optimized implementations. CineForm is able, on high-end PCs, to run faster than real time on 1080p 4:4:4 12-bit 24fps material.


Latency and Robustness

Latency may be of less importance for a frame-based visually lossless codec than for a lossy “long GOP” codec like H.264/AVC. Latency is the duration between the time you start to process content and the time you deliver the first resulting output; it is mainly a concern in a diskless, real-time environment, which may not be the case for digital intermediate work. It should be noted that VC2 has been designed to minimize latency.

Regarding robustness to multiple generations (encode and decode with a given codec, then process, re-encode and re-decode with the same codec, etc.), based on our experience JPEG 2K is more robust than AVC-Intra-Pro, which is in turn more robust than VC2. CineForm allows many changes to the video to be represented through changes in metadata rather than by re-encoding the video itself; in this instance, some of the multiple-generation effect may be negated.
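Multi-generation robustness can be assessed empirically by running content through repeated encode/decode cycles and tracking a fidelity metric such as PSNR against the original. The sketch below outlines such a test harness; the roundtrip function is a hypothetical placeholder (a mild blur plus quantization) and would be replaced by a real encode/decode pass through the codec under test.

import math

# Sketch of a multi-generation robustness test. `roundtrip` is a stand-in for a
# real encode/decode cycle through the codec under test; here it applies a mild
# blur plus quantization so that losses accumulate across generations.

def roundtrip(frame, step=0.004):
    """Placeholder lossy encode/decode: 3-tap blur followed by quantization."""
    padded = [frame[0]] + list(frame) + [frame[-1]]
    blurred = [(padded[i - 1] + 2 * padded[i] + padded[i + 1]) / 4
               for i in range(1, len(padded) - 1)]
    return [round(v / step) * step for v in blurred]

def psnr(original, degraded, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, degraded)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

original = [0.5 + 0.5 * math.sin(i / 5) for i in range(256)]   # synthetic test signal
frame = original
for generation in range(1, 6):
    frame = roundtrip(frame)
    print(f"generation {generation}: PSNR = {psnr(original, frame):.1f} dB")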

i JPEG XR, derived from HD Photo (aka Windows Media Photo), was originally developed by Microsoft and its specification is now at FDIS (Final Draft International Standard) status in the JPEG committee. Given its recent introduction, it is too early to have a clear view of its real performance. Extension to sequences of images is planned for the future.

ii Also known as Dirac Pro, and originally developed by the BBC. SMPTE is currently working on VC2 standardization.

iii We are referring here to specific professional profiles of H.264 (aka AVC) in which predictions are limited to spatial modes (aka “Intra”). The following 10-bit profiles are considered: High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra and CaVLC 4:4:4 Intra. 12 bits is not supported.

iv The compression principles listed in Appendix A are given to highlight where the main tradeoffs are and where irreversible processing occurs. This list is not exhaustive.


About Technicolor

Headquartered in Burbank, California, Technicolor is part of Thomson (Euronext Paris: 18453; NYSE: TMS). Along with being the world’s largest film processor, Technicolor has evolved to become the world’s largest manufacturer and distributor of DVDs and CDs (including Blu-ray Disc), and is a leading provider of production, postproduction, and visual effects services to film studios and cable and television networks. On an annual basis, Technicolor processes more than five billion feet of motion picture film, and has the capacity to produce over 2.1 billion DVDs/CDs. The company is also a principal developer and supplier of services for content owners and distributors.

Technicolor serves an international base of entertainment, software, game manufacturing, promotional, direct marketing, OEM, and corporate customers with its manufacturing and postproduction facilities worldwide. Major Hollywood clients include The Walt Disney Co., DreamWorks SKG, New Line Cinema, Universal, Paramount, The Weinstein Company, and Lionsgate. Software publishing clients include Microsoft, Vivendi Universal Games, Electronic Arts, and Atari.

Technical Contact Information

Published by the Office of the CTO.

Any technical questions may be addressed to the following individuals and principal contributors.

Tim Barrett – [email protected]

Marie-Jean Colaitis – [email protected]

Michael Zink – [email protected]

Sunset Gower Studios 6040 Sunset Blvd. Hollywood, CA 90028

www.Technicolor.com

Copyright © 2009 Technicolor. All rights reserved. Technicolor and Thomson are trademarks or registered trademarks of Thomson SA or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of others.
