Introduction to Compression
CM3106 Multimedia
Dr Kirill Sidorov ([email protected], www.facebook.com/kirill.sidorov)
Prof David Marshall ([email protected])
School of Computer Science and Informatics, Cardiff University, UK

Modelling and compression

• We are interested in modelling multimedia data.
• To model means to replace something complex with a simpler (= shorter) analog.
• Some models help us understand the original phenomenon/data better.

Example: Laws of physics
Huge arrays of astronomical observations (e.g. Tycho Brahe's logbooks) are summarised in a few characters (e.g. Kepler, Newton):

    |F| = G \frac{M_1 M_2}{r^2}

• This model helps us understand gravity better.
• It is also an example of tremendous compression of data.
• We will look at models whose purpose is primarily compression of multimedia data.

The need for compression

Raw video, image, and audio files can be very large.

Example: One minute of uncompressed audio

    Audio type       44.1 kHz    22.05 kHz    11.025 kHz
    16-bit stereo    10.1 MB     5.05 MB      2.52 MB
    16-bit mono      5.05 MB     2.52 MB      1.26 MB
    8-bit mono       2.52 MB     1.26 MB      630 KB

Example: Uncompressed images

    Image type                                  File size
    640×480 (VGA), 8-bit gray scale             307 KB
    1920×1080 (Full HD), 16-bit YUYV 4:2:2      4.15 MB
    2560×1600, 24-bit RGB colour                11.7 MB

The need for compression

Example: Videos (a stream of audio plus video imagery)

• Raw video (uncompressed image frames): 512×512 True Colour at 25 FPS = 1125 MB/min.
• HDTV (1920×1080), uncompressed: True Colour at 25 FPS = 8.7 GB/min.
• Relying on higher bandwidths is not a good option (the "M25 syndrome": traffic will always increase to fill the current bandwidth limit, whatever it is).
• Compression has to be part of the representation of audio, image, and video formats.

Basics of information theory

Suppose we have an information source (random variable) S which emits symbols {s_1, s_2, ..., s_n} with probabilities p_1, p_2, ..., p_n. According to Shannon, the entropy of S is defined as

    H(S) = \sum_i p_i \log_2(1/p_i),

where p_i is the probability that symbol s_i will occur.

• When a symbol with probability p_i is transmitted, it reduces the amount of uncertainty in the receiver by a factor of 1/p_i.
• \log_2(1/p_i) = -\log_2 p_i indicates the amount of information conveyed by s_i, i.e. the number of binary digits needed to code s_i (Shannon's coding theorem).
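The entropy definition above is easy to check numerically. Below is a minimal Python sketch (the function name entropy is ours, not from the course material); it reproduces the fair-coin and uniform-image figures used in the examples that follow.

    import math

    def entropy(probs):
        """Shannon entropy H(S) = sum_i p_i * log2(1/p_i), in bits."""
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    # Fair coin: two equally likely symbols -> 1 bit.
    print(entropy([0.5, 0.5]))            # 1.0

    # Uniform 8-bit gray-level image (256 equally likely levels) -> 8 bits/pixel.
    print(entropy([1.0 / 256] * 256))     # 8.0

    # Three equally likely choices -> log2(3) bits.
    print(entropy([1.0 / 3] * 3))         # ~1.585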
Entropy example

Example: Entropy of a fair coin
The coin emits symbols s_1 = heads and s_2 = tails with p_1 = p_2 = 1/2. Therefore, the entropy of this source is:

    H(coin) = -(1/2 × \log_2(1/2) + 1/2 × \log_2(1/2))
            = -(1/2 × (-1) + 1/2 × (-1)) = -(-1/2 - 1/2) = 1 bit.

Example: Grayscale "image"
• In an image with a uniform distribution of gray-level intensities (and all pixels independent), i.e. p_i = 1/256:
• the number of bits needed to code each gray level is 8 bits;
• the entropy of this image is 8 bits per pixel.
• We will shortly see that real images are not like that!

Entropy example

Example: Breakfast order #1.
Alice: "What do you want for breakfast: pancakes or eggs? I am unsure, because you like them equally (p_1 = p_2 = 1/2)."
Bob: "I want pancakes."
Question: How much information has Bob communicated to Alice?
Answer: He has reduced the uncertainty by a factor of 2, therefore 1 bit.

Entropy example

Example: Breakfast order #2.
Alice: "What do you want for breakfast: pancakes, eggs, or salad? I am unsure, because you like them equally (p_1 = p_2 = p_3 = 1/3)."
Bob: "Eggs."
Question: What is Bob's entropy, assuming he behaves like a random variable (i.e. how much information has Bob communicated to Alice)?
Answer:

    H(Bob) = \sum_{i=1}^{3} (1/3) \log_2 3 = \log_2 3 ≈ 1.585 bits.

Entropy example

Example: Breakfast order #3.
Alice: "What do you want for breakfast: pancakes, eggs, or salad? I am unsure, because you like them equally (p_1 = p_2 = p_3 = 1/3)."
Bob: "Hmm, I do not know. I definitely do not want salad."
Question: How much information has Bob communicated to Alice?
Answer: He has reduced her uncertainty by a factor of 3/2 (leaving 2 out of 3 equal options), therefore he has transmitted \log_2(3/2) ≈ 0.58 bits.

English letter frequencies

[Figure: relative frequencies of English letters, from D. MacKay, "Information Theory, Inference, and Learning Algorithms", 2003.]

Shannon's experiment (1951)

Estimated entropy of English text: H_English ≈ 0.6 to 1.3 bits/letter. (If all letters and the space were equally probable, it would be H_0 = \log_2 27 ≈ 4.755 bits/letter.)

External link: Shannon's original 1951 paper.
External link: Java applet recreating Shannon's experiment.

Shannon's experiment (1951): my attempt

Estimated entropy for my attempt: 2.03 bits/letter. Why?

Shannon's coding theorem

Shannon (1948). Basically: the ideal code length for an event with probability p is L(p) = -\log_2 p ones and zeros (or, more generally, -\log_b p if we instead use b possible values for codes).

External link: Shannon's original 1948 paper.
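To make the coding theorem concrete, the ideal code length -\log_2 p can be tabulated for a few probabilities and compared against the entropy, which is simply the average ideal code length over the source distribution. A minimal Python sketch (the probabilities below are illustrative choices of ours):

    import math

    def ideal_code_length(p, base=2):
        """Ideal code length L(p) = -log_b(p) for an event of probability p."""
        return -math.log(p, base)

    # Rarer events carry more information and deserve longer codes.
    for p in (0.5, 0.25, 0.125, 1.0 / 3):
        print(f"p = {p:.3f}  ->  L(p) = {ideal_code_length(p):.3f} bits")

    # Entropy = expected ideal code length, e.g. three equally likely options:
    probs = [1.0 / 3] * 3
    print(sum(p * ideal_code_length(p) for p in probs))   # ~1.585 bits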
Shannon vs Kolmogorov

What if we have a finite string?

Shannon's entropy is a statistical measure of information. We can "cheat" and regard a string as an infinitely long sequence of i.i.d. random variables; Shannon's theorem then approximately applies.

Kolmogorov complexity: basically, the length of the shortest program that outputs a given string. An algorithmic measure of information.
• K(S) is not computable!
• Practical algorithmic compression is hard.

Compression in multimedia data

Compression basically exploits redundancy in the data:

Temporal: in 1D data, 1D signals, audio, between video frames, etc.
Spatial: correlation between neighbouring pixels or data items.
Spectral: e.g. correlation between colour or luminance components; this uses the frequency domain to exploit relationships between the frequency of change in data.
Psycho-visual: exploits perceptual properties of the human visual system.

Lossless vs lossy compression

Compression methods can also be categorised in two broad ways:

Lossless compression: after decompression, gives an exact copy of the original data.
Example: entropy encoding schemes (Shannon-Fano, Huffman coding), arithmetic coding, the LZ/LZW algorithm (used in the GIF image file format).

Lossy compression: after decompression, gives ideally a "close" approximation of the original data, ideally perceptually lossless.
Example: transform coding (FFT/DCT-based quantisation used in JPEG/MPEG), differential encoding, vector quantisation.

Why lossy compression?

• Lossy methods are typically applied to high-resolution audio and image compression.
• They have to be employed in video compression (apart from special cases).

Basic reason:
• The compression ratio of lossless methods (e.g. Huffman coding, arithmetic coding, LZW) is not high enough for audio/video.
• By cleverly making a small sacrifice in terms of the fidelity of the data, we can often achieve very high compression ratios.
• Cleverly = sacrifice information that is perceptually unimportant.

Lossless compression algorithms

• Entropy encoding:
  • Shannon-Fano algorithm.
  • Huffman coding.
  • Arithmetic coding.
• Repetitive sequence suppression.
• Run-Length Encoding (RLE).
• Pattern substitution.
• Lempel-Ziv-Welch (LZW) algorithm.

Simple repetition suppression

If a series of n successive identical tokens appears:
• Replace the series with one token and a count of the number of occurrences.
• Usually we need a special flag to denote when the repeated token appears.

Example:
    89400000000000000000000000000000000
can be replaced with
    894f32
where f is the flag for zero.

Simple repetition suppression

• Fairly straightforward to understand and implement.
• Simplicity is its downfall: poor compression ratios.

Compression savings depend on the content of the data. Applications of this simple compression technique include:
• Suppression of zeros in a file (zero-length suppression).
• Silence in audio data, pauses in conversation, etc.
• Sparse matrices.
• A component of JPEG.
• Bitmaps, e.g. backgrounds in simple images.
• Blanks in text or program source files.
• Other regular image or data tokens.

Run-length encoding (RLE)

This encoding method is frequently applied to graphics-type images (or pixels in a scan line) and is a simple compression algorithm in its own right. It is also a component of the JPEG compression pipeline.

Basic RLE approach (e.g. for images):
• Take sequences of image elements X_1, X_2, ..., X_n (row by row).
• Map them to pairs (c_1, L_1), (c_2, L_2), ..., (c_n, L_n), where c_i represents an image intensity or colour and L_i the length of the i-th run of pixels.
• (Not dissimilar to zero-length suppression above. A runnable sketch of this mapping follows the example below.)

Run-length encoding example

Original sequence:
    111122233333311112222
can be encoded as:
    (1,4),(2,3),(3,6),(1,4),(2,4)

How much compression? The savings depend on the data. In the worst case (random noise), the encoding is larger than the original file: 2 × integer rather than 1 × integer per element, if the original data is an integer vector/array.
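As an illustration of the basic RLE mapping described above, here is a minimal Python sketch (the function names rle_encode and rle_decode are ours); it reproduces the worked example and shows the worst-case blow-up on non-repeating data.

    from itertools import groupby

    def rle_encode(seq):
        """Map a sequence to (value, run_length) pairs, run by run."""
        return [(value, len(list(run))) for value, run in groupby(seq)]

    def rle_decode(pairs):
        """Inverse mapping: expand (value, run_length) pairs back into a list."""
        return [value for value, length in pairs for _ in range(length)]

    data = [1,1,1,1, 2,2,2, 3,3,3,3,3,3, 1,1,1,1, 2,2,2,2]
    encoded = rle_encode(data)
    print(encoded)                      # [(1, 4), (2, 3), (3, 6), (1, 4), (2, 4)]
    assert rle_decode(encoded) == data  # lossless: exact reconstruction

    # Worst case (no repetition): every element becomes a pair, doubling the size.
    print(rle_encode([1, 2, 3, 4]))     # [(1, 1), (2, 1), (3, 1), (4, 1)]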