Practical Implementations of Arithmetic Coding


Paul G. Howard² and Jeffrey Scott Vitter³
Department of Computer Science, Brown University, Providence, R.I. 02912-1910

Technical Report No. 92-18. Revised version, April 1992. Formerly Technical Report No. CS-91-45.

Appears in Image and Text Compression, James A. Storer, ed., Kluwer Academic Publishers, Norwell, MA, 1992, pages 85–112. A shortened version appears in the proceedings of the International Conference on Advances in Communication and Control (COMCON 3), Victoria, British Columbia, Canada, October 16–18, 1991.

Abstract

We provide a tutorial on arithmetic coding, showing how it provides nearly optimal data compression and how it can be matched with almost any probabilistic model. We indicate the main disadvantage of arithmetic coding, its slowness, and give the basis of a fast, space-efficient, approximate arithmetic coder with only minimal loss of compression efficiency. Our coder is based on the replacement of arithmetic by table lookups coupled with a new deterministic probability estimation scheme.

Index terms: Data compression, arithmetic coding, adaptive modeling, analysis of algorithms, data structures, low precision arithmetic.

² Support was provided in part by NASA Graduate Student Researchers Program grant NGT-50420 and by a National Science Foundation Presidential Young Investigators Award grant with matching funds from IBM.
Additional support was provided by a Universities Space Research Association/CESDIS associate membership.

³ Support was provided in part by National Science Foundation Presidential Young Investigator Award CCR-9047466 with matching funds from IBM, by NSF research grant CCR-9007851, by Army Research Office grant DAAL03-91-G-0035, and by the Office of Naval Research and the Defense Advanced Research Projects Agency under contract N00014-91-J-4052 (ARPA Order No. 8225). Additional support was provided by a Universities Space Research Association/CESDIS associate membership.

1 Data Compression and Arithmetic Coding

Data can be compressed whenever some data symbols are more likely than others. Shannon [54] showed that for the best possible compression code (in the sense of minimum average code length), the output length contains a contribution of -lg p bits from the encoding of each symbol whose probability of occurrence is p. If we can provide an accurate model for the probability of occurrence of each possible symbol at every point in a file, we can use arithmetic coding to encode the symbols that actually occur; the number of bits used by arithmetic coding to encode a symbol with probability p is very nearly -lg p, so the encoding is very nearly optimal for the given probability estimates.

In this paper we show by theorems and examples how arithmetic coding achieves its performance. We also point out some of the drawbacks of arithmetic coding in practice, and propose a unified compression system for overcoming them. We begin by attempting to clear up some of the false impressions commonly held about arithmetic coding; it offers some genuine benefits, but it is not the solution to all data compression problems.

The most important advantage of arithmetic coding is its flexibility: it can be used in conjunction with any model that can provide a sequence of event probabilities.
This advantage is significant because large compression gains can be obtained only through the use of sophisticated models of the input data. Models used for arithmetic coding may be adaptive, and in fact a number of independent models may be used in succession in coding a single file. This great flexibility results from the sharp separation of the coder from the modeling process [47]. There is a cost associated with this flexibility: the interface between the model and the coder, while simple, places considerable time and space demands on the model's data structures, especially in the case of a multi-symbol input alphabet.

The other important advantage of arithmetic coding is its optimality. Arithmetic coding is optimal in theory and very nearly optimal in practice, in the sense of encoding using minimal average code length. This optimality is often less important than it might seem, since Huffman coding [25] is also very nearly optimal in most cases [8, 9, 18, 39]. When the probability of some single symbol is close to 1, however, arithmetic coding does give considerably better compression than other methods. The case of highly unbalanced probabilities occurs naturally in bilevel (black and white) image coding, and it can also arise in the decomposition of a multi-symbol alphabet into a sequence of binary choices.

The main disadvantage of arithmetic coding is that it tends to be slow. We shall see that the full precision form of arithmetic coding requires at least one multiplication per event and in some implementations up to two multiplications and two divisions per event. In addition, the model lookup and update operations are slow because of the input requirements of the coder. Both Huffman coding and Ziv-Lempel [59, 60] coding are faster because the model is represented directly in the data structures used for coding.
This reduces the coding efficiency of those methods by narrowing the range of possible models. Much of the current research in arithmetic coding concerns finding approximations that increase coding speed without compromising compression efficiency. The most common method is to use an approximation to the multiplication operation [10, 27, 29, 43]; in this paper we present an alternative approach using table lookups and approximate probability estimation.

Another disadvantage of arithmetic coding is that it does not in general produce a prefix code. This precludes parallel coding with multiple processors. In addition, the potentially unbounded output delay makes real-time coding problematical in critical applications, but in practice the delay seldom exceeds a few symbols, so this is not a major problem. A minor disadvantage is the need to indicate the end of the file.

One final minor problem is that arithmetic codes have poor error resistance, especially when used with adaptive models [5]. A single bit error in the encoded file causes the decoder's internal state to be in error, making the remainder of the decoded file wrong. In fact this is a drawback of all adaptive codes, including Ziv-Lempel codes and adaptive Huffman codes [12, 15, 18, 26, 55, 56]. In practice, the poor error resistance of adaptive coding is unimportant, since we can simply apply appropriate error correction coding to the encoded file. More complicated solutions appear in [5, 20], in which errors are made easy to detect, and upon detection of an error, bits are changed until no errors are detected.

Overview of this paper. In Section 2 we give a tutorial on arithmetic coding. We include an introduction to modeling for text compression. We also restate several important theorems from [22] relating to the optimality of arithmetic coding in theory and in practice.
In Section 3 we present some of our current research into practical ways of improving the speed of arithmetic coding without sacrificing much compression efficiency. The center of this research is a reduced-precision arithmetic coder, supported by efficient data structures for text modeling.

2 Tutorial on Arithmetic Coding

In this section we explain how arithmetic coding works and give implementation details; our treatment is based on that of Witten, Neal, and Cleary [58]. We point out the usefulness of binary arithmetic coding (that is, coding with a 2-symbol alphabet), and discuss the modeling issue, particularly high-order Markov modeling for text compression. Our focus is on encoding, but the decoding process is similar.

2.1 Arithmetic coding and its implementation

Basic algorithm. The algorithm for encoding a file using arithmetic coding works conceptually as follows:

1. We begin with a "current interval" [L, H) initialized to [0, 1).
2.

[Figure 1: Subdivision of the current interval based on the probability of the input symbol a_i that occurs next. Panels: old interval, decomposition, new interval.]
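The subdivision described in the caption of Figure 1 can be illustrated with a short Python sketch that narrows the current interval [L, H) once per input symbol. This is not the authors' implementation: the function names `narrow_interval` and `ideal_code_length`, the fixed (non-adaptive) probability table, and the use of exact rational arithmetic are all assumptions made to keep the example small. The sketch assumes each symbol's subinterval has width proportional to its probability, so the final width is the product of the symbol probabilities and -lg(width) equals the sum of the -lg p contributions discussed in Section 1.

```python
from fractions import Fraction
from math import log2


def narrow_interval(message, probs):
    """Return the final interval [low, high) after processing `message`.

    `probs` maps each symbol to a fixed probability (an assumption of this
    sketch; a real model may be adaptive). Subintervals are laid out in the
    dict's insertion order.
    """
    low, high = Fraction(0), Fraction(1)  # current interval [L, H) = [0, 1)
    for symbol in message:
        if symbol not in probs:
            raise ValueError(f"symbol {symbol!r} has no probability")
        width = high - low
        cumulative = Fraction(0)  # total probability of symbols laid out earlier
        for candidate, p in probs.items():
            if candidate == symbol:
                # Keep only the subinterval assigned to the symbol that occurred.
                high = low + width * (cumulative + p)
                low = low + width * cumulative
                break
            cumulative += p
    return low, high


def ideal_code_length(low, high):
    """Return -lg(width) of the interval, in bits."""
    return -log2(float(high - low))


# Hypothetical three-symbol model, chosen so the arithmetic is easy to check.
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = narrow_interval("abc", probs)
print(low, high)                     # 11/32 3/8
print(ideal_code_length(low, high))  # 5.0
```

Here the final width is 1/32 = (1/2)(1/4)(1/4), so the ideal length is 1 + 2 + 2 = 5 bits. Exact fractions are used only to keep the sketch short; they sidestep the finite-precision issues that a practical coder has to handle.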