Manipulating lossless video in the compressed domain
The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters.
Citation Thies, William, Steven Hall, and Saman Amarasinghe. “Manipulating Lossless Video in the Compressed Domain.” Proceedings of the 17th ACM International Conference on Multimedia. Beijing, China: ACM, 2009. 331-340.
As Published http://dx.doi.org/10.1145/1631272.1631319
Publisher Association for Computing Machinery
Version Author's final manuscript
Citable link http://hdl.handle.net/1721.1/63126
Terms of Use Creative Commons Attribution-Noncommercial-Share Alike 3.0
Detailed Terms http://creativecommons.org/licenses/by-nc-sa/3.0/ Manipulating Lossless Video in the Compressed Domain
William Thies Steven Hall Saman Amarasinghe Microsoft Research India Massachusetts Institute of Massachusetts Institute of [email protected] Technology Technology [email protected] [email protected]
ABSTRACT total volume of data processed, thereby offering large savings in A compressed-domain transformation is one that operates directly both execution time and memory footprint. However, existing tech- on the compressed format, rather than requiring conversion to an niques for operating directly on compressed data are largely limited uncompressed format prior to processing. Performing operations in to lossy compression formats such as JPEG [6, 8, 14, 19, 20, 23] the compressed domain offers large speedups, as it reduces the vol- and MPEG [1, 5, 15, 27, 28]. While these formats are used perva- ume of data processed and avoids the overhead of re-compression. sively in the distribution of image and video content, they are rarely While previous researchers have focused on compressed-domain used during the production of such content. Instead, professional techniques for lossy data formats, there are few techniques that ap- artists and filmmakers rely on lossless compression formats (BMP, ply to lossless formats. In this paper, we present a general tech- PNG, Apple Animation) to avoid accumulating artifacts during the nique for transforming lossless data as compressed with the sliding- editing process. Given the computational intensity of professional window Lempel Ziv algorithm (LZ77). We focus on applications video editing, there is a large demand for new techniques that could in video editing, where our technique supports color adjustment, accelerate operations on lossless formats. video compositing, and other operations directly on the Apple An- In this paper, we present a technique for translating a specific imation format (a variant of LZ77). class of computations to operate directly on losslessly-compressed We implemented a subset of our technique as an automatic pro- data. We consider compression formats that are based on LZ77, a gram transformation. Using the StreamIt language, users write a compression algorithm that is utilized by ZIP and fully encapsu- program to operate on uncompressed data, and our compiler trans- lates common formats such as Apple Animation, Microsoft RLE, forms the program to operate on compressed data. Experiments and Targa. Our transformation applies to a restricted class of pro- show that the technique offers speedups roughly proportional to grams, termed stream programs [26], that operate on continuous the compression factor. For our benchmark suite of 12 videos in streams of data. The transformation is most efficient when each el- Apple Animation format, speedups range from 1.1x to 471x, with ement of the stream is transformed in a uniform way (e.g., adjusting a median of 15x. the brightness of each pixel). However, it also applies to cases in which multiple items are processed at once (e.g., averaging pixels) or in which multiple streams are split or combined (e.g., composit- Categories and Subject Descriptors ing frames). The precise coverage of our transformation is defined D.3.2 [Programming Languages]: Language Classifications–Data- in Section 2. flow languages; D.3.4 [Programming Languages]: ProcessorsÐ The key idea behind our technique can be understood in simple Compilers, Optimization; I.4.2 [Compression]; H.5.1 [Multimedia terms. In LZ77, compression is achieved by indicating that a given Information Systems] part of the data stream is a repeat of a previous part of the stream. If a program is transforming each element of the stream in the same way, then any repetitions in the input will necessarily be present General Terms in the output as well. Thus, while new data sequences need to be Algorithms, Design, Experimentation, Languages, Performance processed as usual, any repeats of those sequences do not need to be transformed again. Rather, the repetitions in the input stream 1. INTRODUCTION can be directly copied to the output stream, thereby referencing the previously-computed values. This preserves the compression in the In order to accelerate the process of editing compressed data, re- stream while avoiding the cost of decompression, re-compression, searchers have identified specific transformations that can be mapped and computing on the uncompressed data. into the compressed domain—that is, they can operate directly on In this paper, we extend this simple idea to encompass a broad the compressed data format rather than on the uncompressed for- class of programs that can be expressed in the StreamIt program- mat [4, 12, 22, 28]. In addition to avoiding the cost of the decom- ming language [26]. We have implemented a subset of our gen- pression and re-compression, such techniques greatly reduce the eral technique in the StreamIt compiler. The end result is a fully- automatic system in which the user writes programs that operate on uncompressed data, and our compiler emits an optimized pro- Permission to make digital or hard copies of all or part of this work for gram that operates directly on compressed data. Our compiler gen- personal or classroom use is granted without fee provided that copies are erates plugins for two popular video editing tools (MEncoder and not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to Blender), allowing the optimized transformations to be used as part republish, to post on servers or to redistribute to lists, requires prior specific of a standard video editing process. permission and/or a fee. Using a suite of 12 videos (screencasts, animations, and stock MM’09, October 19Ð24, 2009, Beijing, China. footage) in Apple Animation format, our transformation offers a Copyright 2009 ACM 978-1-60558-608-3/09/10 ...$10.00. speedup roughly proportional to the compression factor. For trans- input stream decompress output stream struct rgb { byte r, g, b; distance } 132 489576 10 rgb->rgb pipeline HalfSize { ¢1,3² OLA¢2,4² add splitjoin { ReadRGB split roundrobin(WIDTH,WIDTH); add Identity