LOCO-I: A Low Complexity, Context-Based, Lossless Image Compression Algorithm

Marcelo J. Weinberger, Gadiel Seroussi, and Guillermo Sapiro
Hewlett-Packard Laboratories, Palo Alto, CA 94304

Copyright 1996 Institute of Electrical and Electronics Engineers. Reprinted, with permission, from Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, March-April 1996.

Abstract. LOCO-I (LOw COmplexity LOssless COmpression for Images) is a novel lossless compression algorithm for continuous-tone images which combines the simplicity of Huffman coding with the compression potential of context models, thus "enjoying the best of both worlds." The algorithm is based on a simple fixed context model, which approaches the capability of the more complex universal context modeling techniques for capturing high-order dependencies. The model is tuned for efficient performance in conjunction with a collection of (context-conditioned) Huffman codes, which is realized with an adaptive, symbol-wise, Golomb-Rice code. LOCO-I attains, in one pass, and without recourse to the higher-complexity arithmetic coders, compression ratios similar or superior to those obtained with state-of-the-art schemes based on arithmetic coding. In fact, LOCO-I is being considered by the ISO committee as a replacement for the current lossless standard in low-complexity applications.

1 Introduction

Lossless image compression schemes often consist of two distinct and independent components: modeling and coding. The modeling part can be formulated as an inductive inference problem, in which an image is observed pixel by pixel in some pre-defined order (e.g., raster scan). At each time instant $i$, after having scanned the past data $x^i = x_1 x_2 \cdots x_i$, one wishes to make inferences on the next pixel value $x_{i+1}$ by assigning a conditional probability distribution $p(\cdot \mid x^i)$ to it.¹ Ideally, the code length contributed by $x_{i+1}$ is $-\log p(x_{i+1} \mid x^i)$ bits (hereafter, logarithms are taken to base 2), which averages to the entropy of the probabilistic model. Thus, a skewed (low-entropy) probability distribution, which assigns a high probability value to the next pixel, is desirable. In a sequential formulation, the distribution $p(\cdot \mid x^i)$ is learned from the past and is available to the decoder as it decodes the past string sequentially. Alternatively, in a two-pass scheme the conditional distribution can be learned from the whole image in a first pass and must be sent to the decoder as header information; in this case, the total code length includes the length of the header. Yet, both the second encoding pass and the (single-pass) decoding are subject to the same sequential formulation.

¹ Notice that pixel values are indexed with only one subscript, despite corresponding to a two-dimensional array. This subscript denotes the "time" index in the pre-defined order.
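As a quick numeric check of the relation between the ideal code length $-\log_2 p$ and the entropy of the model, the following sketch compares a skewed distribution with a uniform one; the distributions are made up for illustration and do not come from the paper.

```python
import math

def ideal_code_length(p):
    """Ideal code length, in bits, of a symbol assigned conditional probability p."""
    return -math.log2(p)

def entropy(dist):
    """Average ideal code length (entropy, in bits/symbol) of a distribution."""
    return sum(p * ideal_code_length(p) for p in dist if p > 0)

# A skewed distribution over 4 symbols versus a uniform one:
skewed  = [0.85, 0.05, 0.05, 0.05]
uniform = [0.25, 0.25, 0.25, 0.25]
print(entropy(skewed))   # ~0.85 bits/symbol: the most likely symbol costs only ~0.23 bits
print(entropy(uniform))  # 2.0 bits/symbol: every symbol costs exactly 2 bits
```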
In principle, skewed distributions can be obtained through larger conditioning regions or "contexts." However, this implies a larger number of parameters in the statistical model, with an associated model cost [1] which could offset the entropy savings. In a sequential scheme, this cost can be interpreted as capturing the penalties of "context dilution," which occurs when count statistics must be spread over too many contexts, thus affecting the accuracy of the corresponding estimates. In non-sequential (two-pass) schemes, the model cost represents the code length required to encode the model parameters estimated in the first pass, which must be transmitted to the decoder. In both cases the model cost is proportional to the number of free parameters in the statistical model [2].

In state-of-the-art lossless image compression schemes, the above probability assignment is generally broken into the following components:

a. A prediction step, in which a deterministic value $\hat{x}_{i+1}$ is guessed for the next pixel $x_{i+1}$ based on a subset of the available past sequence $x^i$ (a causal template).

b. The determination of a context in which $x_{i+1}$ occurs (again, a function of a past subsequence).

c. A probabilistic model for the prediction residual (or error signal) $e_{i+1} \stackrel{\Delta}{=} x_{i+1} - \hat{x}_{i+1}$, conditioned on the context of $x_{i+1}$.

The optimization of these steps, inspired by the ideas of universal modeling, is analyzed in [1], where a high-complexity scheme is proposed. In this scheme, the prediction step is accomplished with an adaptively optimized, context-dependent linear predictor, and the statistical modeling is performed with an optimized number of parameters (variable-size quantized contexts). The modeled prediction residuals are arithmetic encoded in order to attain the ideal code length [3]. In [4], some of the optimizations performed in [1] are avoided, with no deterioration in the compression ratios. Yet, the modeling remains in the high-complexity range. In various versions of the Sunset algorithm [5, 6], a further reduction in modeling complexity, via a fixed predictor and a non-optimized context model, causes some deterioration in the compression ratios. Moreover, the models used in all these algorithms provide best results with arithmetic coding of prediction residuals, an operation that is considered too complex in many applications, especially for software implementations. Other alternatives have been designed with simplicity in mind and propose minor variations of traditional DPCM techniques [7], which include Huffman coding of prediction residuals obtained with some fixed predictor. Thus, these simpler techniques are fundamentally limited in their compression performance by the first-order entropy of the prediction residuals.

LOCO-I combines the simplicity of Huffman (as opposed to arithmetic) coding with the compression potential of context models, thus "enjoying the best of both worlds." The algorithm uses a non-linear predictor with rudimentary edge-detecting capability, and is based on a very simple context model, determined by quantized gradients.
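To make the three-step decomposition (a)-(c) above concrete, the following is a minimal, purely illustrative sketch of a sequential predictor-modeler pipeline. The previous-pixel predictor, the gradient-sign context, and the count-based residual model are placeholder choices made for brevity; they are not the components used in LOCO-I or in [1].

```python
import math
from collections import defaultdict

class ResidualModel:
    """Toy per-context model: add-one (Laplace) counts over residual values."""
    def __init__(self):
        self.counts = defaultdict(lambda: 1)
        self.total = 511  # residuals of 8-bit pixels lie in [-255, 255]

    def code_length(self, e):
        return -math.log2(self.counts[e] / self.total)  # ideal length -log2 p(e | context)

    def update(self, e):
        self.counts[e] += 1
        self.total += 1

def ideal_bits(pixels, predict, context_of):
    """Ideal code length of a pixel sequence under the (a)-(c) decomposition."""
    models = defaultdict(ResidualModel)
    bits = 0.0
    for i, x in enumerate(pixels):
        x_hat = predict(pixels, i)    # (a) deterministic prediction from the causal past
        ctx = context_of(pixels, i)   # (b) conditioning context, also from the causal past
        e = x - x_hat                 # (c) residual, modeled conditionally on the context
        bits += models[ctx].code_length(e)
        models[ctx].update(e)         # sequential (one-pass) adaptation
    return bits

# Example: previous-pixel predictor, context = sign of the last local gradient.
prev = lambda p, i: p[i - 1] if i > 0 else 0
grad_sign = lambda p, i: 0 if i < 2 else (p[i - 1] > p[i - 2]) - (p[i - 1] < p[i - 2])
print(ideal_bits([10, 11, 12, 13, 40, 41, 42], prev, grad_sign))
```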
A small number of free statistical parameters is used to approach the capability of the more complex universal context modeling techniques for capturing high-order dependencies [1], without the drawbacks of "context dilution." This is achieved through a single-parameter probability distribution per context, efficiently implemented by an adaptive, symbol-wise, Golomb-Rice code [8, 9]. In addition, reduced redundancy is achieved by embedding an alphabet extension in "flat" regions.

A low-complexity scheme based on Golomb-Rice coding, FELICS, was previously described in [10]. LOCO-I differs significantly from FELICS in that it follows a more traditional predictor-modeler-coder structure along the paradigm of [1], in the use of a novel and extremely simple explicit formula for Golomb-Rice parameter estimation (described in Section 2.3), as opposed to the search procedure described in [10], and in the use of an embedded alphabet extension to reduce code redundancy. As shown by the results in Section 3, LOCO-I attains compression ratios that are significantly better than those of FELICS, while maintaining the same low complexity level. In fact, the results show that LOCO-I compresses similarly to or better than state-of-the-art schemes based on arithmetic coding, at a fraction of the complexity. LOCO-I is being considered by the ISO/IEC JTC1/SC29/WG1 committee as a replacement for the current lossless standard in low-complexity applications.

2 A description of LOCO-I

The prediction and modeling units in LOCO-I are based on the causal template depicted in Figure 1, where $x$ denotes the current pixel, and $a$, $b$, $c$, $d$, and $e$ are neighboring pixels in the relative positions shown in the figure. The dependence of $a$, $b$, $c$, $d$, and $e$ on the time index $i$ has been omitted from the notation for simplicity.

[Figure 1: A causal template for LOCO-I]

2.1 Prediction

Ideally, guessing the value of the current pixel $x_{i+1}$ based on $a$, $b$, $c$, $d$, and $e$ should be done by adaptively learning a model conditioned on the local edge direction. However, our complexity constraints rule out this possibility. Yet, a primitive edge detector is still desirable in order to approach the best possible predictor. The approach in LOCO-I consists of performing a primitive test to detect vertical or horizontal edges. If an edge is not detected, then the guessed value is $a + b - c$, as this would be the value of $x_{i+1}$ if the current pixel belonged to the "plane" defined by the three neighboring pixels with "heights" $a$, $b$, and $c$. This expresses the expected smoothness of the image in the absence of edges. Specifically, the LOCO-I predictor guesses:

$$
\hat{x}_{i+1} \;\stackrel{\Delta}{=}\;
\begin{cases}
\min(a,b) & \text{if } c \ge \max(a,b), \\
\max(a,b) & \text{if } c \le \min(a,b), \\
a + b - c & \text{otherwise}
\end{cases}
\qquad (2)
$$

Assuming, without loss of generality, that $a \le b$, the predictor of (2) can be interpreted as picking $a$ in many cases where a vertical edge exists left of the current location, $b$ in many cases of a horizontal edge above the current location, or the plane predictor $a + b - c$ if no edge has been detected.

The above predictor has been employed in image compression applications [11], although under a different interpretation.
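The predictor of (2) maps directly onto a few lines of code. The sketch below is a straightforward transcription of the formula; the function name and the example values are ours, not from the paper.

```python
def loco_predict(a, b, c):
    """Guess the current pixel from its left (a), upper (b), and upper-left (c)
    neighbors, following the LOCO-I predictor of eq. (2)."""
    if c >= max(a, b):
        return min(a, b)   # c at or above both neighbors: predict the smaller one
    if c <= min(a, b):
        return max(a, b)   # c at or below both neighbors: predict the larger one
    return a + b - c       # no edge detected: "plane" predictor

# Examples:
print(loco_predict(100, 100, 20))  # 100: c is below both neighbors, so max(a, b) is picked
print(loco_predict(50, 60, 55))    # 55:  no edge, plane predictor a + b - c
```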