Internet Engineering Dr. Marek Woda Multimedia and Computer Visualisation Part 4

JPEG compression

Joint Photographic Experts Group - 1986

• ISO - International Organization for Standardization

• CCITT - Comité Consultatif International de Téléphonie et Télégraphie

ISO standard - 1991

Application of the algorithm – compression of photorealistic images

Assumptions:

An image is an array:

$$f = f(x, y), \quad x = 0, 1, 2, \ldots, N-1; \quad y = 0, 1, 2, \ldots, M-1$$

where f(x, y) is an element of the image (pixel) and N, M are the image width and height.

The element f(x, y) can have different meanings, e.g.

- gray level: $f(x, y) \in \{0, 1, \ldots, S\}$

- color: $f(x, y) = [r(x, y)\ g(x, y)\ b(x, y)]$, with $r, g, b \in \{0, 1, \ldots, S\}$

Phases of the JPEG algorithm

1. Conversion to luminance-chrominance color model (only for color images)
2. Division into blocks
3. Calculation of the Discrete Cosine Transform (DCT)
4. Quantisation of DCT coefficients
5. Conversion of DCT coefficients array to a vector
6. Entropy coding

1. Conversion to luminance-chrominance color model (YUV, YCbCr)

Source image representation - RGB model:

R = [rij ], G = [gij ], B = [bij ]

Conversion formula (from RGB to YUV color model):

$$\begin{bmatrix} y_{ij} \\ u_{ij} \\ v_{ij} \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.146 & -0.288 & 0.434 \\ 0.617 & -0.517 & -0.100 \end{bmatrix} \begin{bmatrix} r_{ij} \\ g_{ij} \\ b_{ij} \end{bmatrix}$$

After conversion:

Y = [y_ij], U = [u_ij], V = [v_ij]
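A minimal Python sketch of this conversion for a single pixel. The names `RGB_TO_YUV` and `rgb_to_yuv` are illustrative and not part of the lecture. The coefficients assume the usual form of the matrix, in which the luminance row sums to 1 and each chrominance row sums to 0 (that is, 0.299 in the first row, and +0.434 and -0.100 in the last column).

```python
# Sketch: RGB -> YUV conversion of one pixel (illustrative names).
# Assumed coefficients: luminance row sums to 1, chrominance rows sum to 0.
RGB_TO_YUV = [
    [0.299, 0.587, 0.114],
    [-0.146, -0.288, 0.434],
    [0.617, -0.517, -0.100],
]


def rgb_to_yuv(r, g, b):
    """Multiply the conversion matrix by the column vector [r, g, b]."""
    return tuple(row[0] * r + row[1] * g + row[2] * b for row in RGB_TO_YUV)


if __name__ == "__main__":
    print(rgb_to_yuv(255, 255, 255))  # white: full luminance, chrominance near 0
    print(rgb_to_yuv(255, 0, 0))      # pure red
```

For a gray pixel (r = g = b) both chrominance values come out as approximately zero, which is why the chrominance planes of typical images compress well.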

Here Y is the luminance and U and V are the chrominance components.

2. Division into blocks

Division of the image into 8 x 8 pixel blocks, where each block is an array:

$$f(x, y), \quad x = 0, 1, \ldots, 7; \quad y = 0, 1, \ldots, 7$$

3. Calculation of the Discrete Cosine Transform (DCT)

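Forward transform, f(x, y) → F(u, v):

$$F(u, v) = \frac{C(u)\,C(v)}{4} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos\left(\frac{(2x+1)\,u\pi}{16}\right) \cos\left(\frac{(2y+1)\,v\pi}{16}\right)$$

Inverse transform, F(u, v) → f(x, y):

$$f(x, y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\,C(v)\,F(u, v) \cos\left(\frac{(2x+1)\,u\pi}{16}\right) \cos\left(\frac{(2y+1)\,v\pi}{16}\right)$$

where

$$C(u) = \begin{cases} 1/\sqrt{2} & \text{for } u = 0 \\ 1 & \text{for } u \neq 0 \end{cases} \quad \text{and} \quad C(v) = \begin{cases} 1/\sqrt{2} & \text{for } v = 0 \\ 1 & \text{for } v \neq 0 \end{cases}$$

Image 1 "plane"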

[Figure: block of the input image, the block as a function f(x,y), and its DCT transform F(u,v)]

Image 2 "chessboard"

[Figure: block of the input image, the block as a function f(x,y), and its DCT transform F(u,v)]

Image 3 "photorealistic image"

[Figure: block of the input image, the block as a function f(x,y), and its DCT transform F(u,v)]

Function and DCT transform for the "photorealistic image"

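$$f(x,y) = \begin{bmatrix}
186 & 198 & 199 & 190 & 182 & 177 & 182 & 197 \\
179 & 184 & 183 & 176 & 173 & 172 & 175 & 184 \\
188 & 182 & 180 & 178 & 174 & 172 & 171 & 166 \\
132 & 130 & 139 & 146 & 151 & 169 & 191 & 201 \\
131 & 134 & 137 & 140 & 139 & 139 & 139 & 138 \\
153 & 157 & 161 & 172 & 177 & 145 & 89 & 49 \\
190 & 178 & 192 & 196 & 120 & 43 & 39 & 47 \\
176 & 184 & 187 & 112 & 41 & 39 & 43 & 44
\end{bmatrix}$$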
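The transform of this block can be checked numerically. The sketch below evaluates the forward DCT formula directly (four nested loops, chosen for clarity rather than speed); the names `BLOCK`, `c` and `dct2` are illustrative, and the first index is taken to be the row of the printed matrix.

```python
import math

# Example block f(x, y) from the lecture (first index = row of the printed matrix).
BLOCK = [
    [186, 198, 199, 190, 182, 177, 182, 197],
    [179, 184, 183, 176, 173, 172, 175, 184],
    [188, 182, 180, 178, 174, 172, 171, 166],
    [132, 130, 139, 146, 151, 169, 191, 201],
    [131, 134, 137, 140, 139, 139, 139, 138],
    [153, 157, 161, 172, 177, 145, 89, 49],
    [190, 178, 192, 196, 120, 43, 39, 47],
    [176, 184, 187, 112, 41, 39, 43, 44],
]


def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct2(f):
    """Direct evaluation of the 8x8 forward DCT formula (slow, for clarity)."""
    n = 8
    F = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (f[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            F[u][v] = c(u) * c(v) * s / 4
    return F


if __name__ == "__main__":
    F = dct2(BLOCK)
    # DC coefficient and the two lowest-frequency AC coefficients
    print(round(F[0][0], 1), round(F[0][1], 1), round(F[1][0], 1))
```

The DC coefficient F(0,0) equals the sum of all 64 pixels divided by 8, so it carries the average brightness of the block.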

$$F(u,v) = 10^{3} \cdot \begin{bmatrix}
1.2047 & 0.1372 & -0.0212 & -0.0364 & 0.0023 & 0.0088 & 0.0023 & 0.0002 \\
0.2165 & -0.1758 & 0.0319 & 0.0240 & -0.0012 & -0.0143 & -0.0025 & -0.0002 \\
-0.0087 & 0.1324 & 0.0194 & -0.0460 & -0.0065 & 0.0029 & 0.0046 & 0.0001 \\
0.0169 & -0.0018 & -0.0613 & 0.0242 & 0.0146 & -0.0103 & -0.0063 & -0.0006 \\
-0.0315 & -0.0626 & 0.0572 & -0.0192 & -0.0225 & 0.0000 & 0.0069 & -0.0004 \\
0.0287 & 0.0069 & -0.0122 & -0.0150 & 0.0260 & 0.0086 & -0.0065 & 0.0001 \\
0.0123 & 0.0115 & -0.0166 & 0.0300 & -0.0216 & -0.0075 & 0.0049 & 0.0004 \\
-0.0005 & 0.0352 & 0.0060 & -0.0166 & 0.0128 & 0.0052 & -0.0039 & -0.0005
\end{bmatrix}$$

4. Quantisation of DCT coefficients

F(u, v) → FQ(u, v):

$$F^{Q}(u, v) = \text{Integer Round}\left(\frac{F(u, v)}{Q(u, v)}\right)$$

Quantisation table for luminance Y:

$$Q(u,v) = \begin{bmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 56 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix}$$

Quantisation table for chrominance U and V:

$$Q(u,v) = \begin{bmatrix}
17 & 18 & 24 & 47 & 24 & 40 & 51 & 61 \\
18 & 21 & 26 & 66 & 26 & 58 & 60 & 56 \\
24 & 26 & 56 & 99 & 99 & 99 & 99 & 99 \\
47 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99
\end{bmatrix}$$

DCT transform coefficients after quantisation for the "photorealistic image"
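As a quick numerical check of the quantisation rule, the sketch below divides the top-left 4 x 4 corner of F(u,v) for the photorealistic block by the matching corner of the luminance table and rounds to the nearest integer. The names `F_CORNER`, `Q_CORNER` and `quantise` are illustrative. Python's `round` rounds exact halves to the nearest even integer, which is assumed not to matter here because no quotient falls exactly on a half.

```python
# Top-left 4x4 corner of F(u,v) for the "photorealistic image" block.
F_CORNER = [
    [1204.7, 137.2, -21.2, -36.4],
    [216.5, -175.8, 31.9, 24.0],
    [-8.7, 132.4, 19.4, -46.0],
    [16.9, -1.8, -61.3, 24.2],
]

# Matching corner of the luminance quantisation table Q(u,v).
Q_CORNER = [
    [16, 11, 10, 16],
    [12, 12, 14, 19],
    [14, 13, 16, 24],
    [14, 17, 22, 29],
]


def quantise(F, Q):
    """FQ(u,v) = round(F(u,v) / Q(u,v)), element by element."""
    return [[round(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]


if __name__ == "__main__":
    for row in quantise(F_CORNER, Q_CORNER):
        print(row)
```

Small high-frequency coefficients divided by large table entries round to zero, which is where most of the compression (and the loss) comes from.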


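$$F^{Q}(u,v) = \begin{bmatrix}
75 & 12 & -2 & -2 & 0 & 0 & 0 & 0 \\
18 & -15 & 2 & 1 & 0 & 0 & 0 & 0 \\
-1 & 10 & 1 & -2 & 0 & 0 & 0 & 0 \\
1 & 0 & -3 & 1 & 0 & 0 & 0 & 0 \\
-2 & -3 & 2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

Inverse operation (decoding)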

FQ(u, v) → f*(x, y):

$$f^{*}(x, y) = \text{Integer Round}\left(\text{DCT}^{-1}\left(F^{Q}(u, v) \cdot Q(u, v)\right)\right)$$
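The decoding step can be sketched in the same direct style: multiply each quantised coefficient by its table entry, apply the inverse DCT formula and round. The names `FQ`, `Q`, `dequantise`, `idct2` and `decode_block` are illustrative. The coefficient values assume FQ(1,1) = -15 and FQ(2,1) = 10, which is what rounding F/Q gives for the lecture's F and Q; individual pixels may differ by one gray level from a reference result depending on how the final rounding is done.

```python
import math

# Quantised coefficients FQ(u,v) of the example block (assumed values, see text).
FQ = [
    [75, 12, -2, -2, 0, 0, 0, 0],
    [18, -15, 2, 1, 0, 0, 0, 0],
    [-1, 10, 1, -2, 0, 0, 0, 0],
    [1, 0, -3, 1, 0, 0, 0, 0],
    [-2, -3, 2, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

# Luminance quantisation table as printed in the lecture.
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 56],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dequantise(fq, q):
    """Element-wise product FQ(u,v) * Q(u,v)."""
    return [[a * b for a, b in zip(fq_row, q_row)] for fq_row, q_row in zip(fq, q)]


def idct2(F):
    """Direct evaluation of the 8x8 inverse DCT formula (slow, for clarity)."""
    n = 8
    f = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    s += (c(u) * c(v) * F[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            f[x][y] = s / 4
    return f


def decode_block(fq, q):
    """Dequantise, inverse-transform and round to integer pixel values."""
    return [[round(p) for p in row] for row in idct2(dequantise(fq, q))]


if __name__ == "__main__":
    for row in decode_block(FQ, Q):
        print(row)
```

Because the DC coefficient was rounded from 1204.7 / 16 to 75, the reconstructed block has an average of about 150 instead of the original 150.6; the remaining differences come from the rounded and discarded AC coefficients.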


$$f^{*}(x,y) = \begin{bmatrix}
183 & 186 & 187 & 182 & 176 & 178 & 188 & 198 \\
178 & 188 & 196 & 192 & 180 & 169 & 168 & 171 \\
169 & 174 & 178 & 175 & 170 & 170 & 176 & 183 \\
147 & 140 & 133 & 135 & 148 & 168 & 186 & 197 \\
131 & 126 & 126 & 135 & 149 & 153 & 146 & 136 \\
150 & 160 & 173 & 178 & 163 & 127 & 82 & 51 \\
176 & 190 & 195 & 172 & 125 & 75 & 44 & 31 \\
181 & 185 & 168 & 114 & 50 & 19 & 32 & 58
\end{bmatrix}$$

Coding and Decoding (examples)

• "photorealistic image"

[Figure: before compression f(x,y); after compression and decompression f*(x,y)]

• "chessboard"

[Figure: before compression f(x,y); after compression and decompression f*(x,y)]

5. Conversion of DCT coefficients array to a vector

FQ(u, v) → [DC, AC1, AC2, ..., AC63]

zig-zag algorithm (A. G. Tescher 1978)
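The scan order can be generated rather than tabulated: walk the anti-diagonals of the block and alternate the direction on each one. The sketch below does this in Python; the helper name `zigzag` is illustrative, and the test matrix is the quantised example block with FQ(1,1) = -15 and FQ(2,1) = 10 assumed (the values obtained by rounding F/Q).

```python
def zigzag(block):
    """Return the elements of a square block in zig-zag order."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):                 # s = row + column of a diagonal
        idx = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            idx.reverse()                      # even diagonals run bottom-left to top-right
        out.extend(block[i][j] for i, j in idx)
    return out


# Quantised example block (assumed values, see text).
FQ = [
    [75, 12, -2, -2, 0, 0, 0, 0],
    [18, -15, 2, 1, 0, 0, 0, 0],
    [-1, 10, 1, -2, 0, 0, 0, 0],
    [1, 0, -3, 1, 0, 0, 0, 0],
    [-2, -3, 2, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    vector = zigzag(FQ)
    print("DC =", vector[0])
    print("AC =", vector[1:])
```

The scan visits low frequencies first, so the non-zero coefficients cluster at the start of the vector and the long run of zeros ends up at the tail.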

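$$F^{Q}(u,v) = \begin{bmatrix}
75 & 12 & -2 & -2 & 0 & 0 & 0 & 0 \\
18 & -15 & 2 & 1 & 0 & 0 & 0 & 0 \\
-1 & 10 & 1 & -2 & 0 & 0 & 0 & 0 \\
1 & 0 & -3 & 1 & 0 & 0 & 0 & 0 \\
-2 & -3 & 2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

[DC, AC1,...,AC63] = [75, 12, 18, -1, -15, -2, -2, 2, 10, 1, -2, 0, 1, 1, 0, 0, 0, -2, -3, -3, 1, 0, 0, 2, 1, 0, ..., 0]

6. Entropy Coding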

vector - [DC, AC1, AC2 ,..., AC63]

• coding of DC for blocks (array of the blocks)

• coding AC1, AC2, ... , AC63 for each block

The entropy coder compresses data by replacing each fixed-length input symbol with the corresponding variable-length prefix codeword. The length of each codeword is approximately proportional to the negative logarithm of the probability of the symbol it encodes, so frequent symbols get short codewords.

6.1. DC coding

The image has been divided into blocks (8x8 pixels).

[Figure: array of blocks numbered row by row: block 0, block 1, block 2, ...; block k, block k+1, ...; block 2k, ...; each block i with its DC value DCi]

DCi - DC value for block i, i = 0, 1, ..., m

m – number of blocks

Coding of the DC (DPCM algorithm)

1. Construction of the vector

DC = [DC0, DC1, DC2,..., DCk, DCk+1,..., DCm].

2. Calculation Δ = [Δ0, Δ1, ..., Δi , ..., Δm] where

$$\Delta_0 = DC_0, \qquad \Delta_i = DC_i - DC_{i-1}, \quad i = 1, 2, \ldots, m$$

3. Coding Δ = [Δ0, Δ1, ..., Δi, ..., Δm] using the Huffman code table (Table 1)

Table 1.

Δi value | Size | Huffman code for Size | Additional bits
0 | 0 | 00 | -
-1, 1 | 1 | 010 | 0, 1
-3, -2, 2, 3 | 2 | 011 | 00, 01, 10, 11
-7,…,-4, 4,…,7 | 3 | 100 | 000,…,011, 100,…,111
-15,…,-8, 8,…,15 | 4 | 101 | 0000,…,0111, 1000,…,1111
… | … | … | …
-2047,…,-1024, 1024,…,2047 | 11 | 1 1111 1110 | 000 0000 0000,…,111 1111 1111

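Coding procedure:

• For the next Δi calculate Size using the formula

Size = floor[log2(abs(Δi))] + 1 for Δi ≠ 0, and Size = 0 for Δi = 0

(the integer part of the logarithm is taken, so that e.g. Δi = 3 gives Size = 2, in agreement with Table 1)

• For the calculated Size read the Huffman code and the additional bits from Table 1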
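The procedure above can be sketched in a few lines of Python. All names (`DC_SIZE_CODES`, `dpcm`, `size_of`, `additional_bits`, `encode_delta`) are illustrative. The code dictionary contains only the Size categories that Table 1 lists explicitly; the categories hidden behind the ellipsis are deliberately left out, so differences falling into them raise a `KeyError`. Size is computed with `int.bit_length`, which for non-zero values equals the integer part of log2(abs(Δi)) plus 1 and matches the Size column of Table 1.

```python
# Huffman codes for the Size categories listed explicitly in Table 1.
# Sizes 5-10 are elided in the table and therefore not included here.
DC_SIZE_CODES = {0: "00", 1: "010", 2: "011", 3: "100", 4: "101", 11: "111111110"}


def dpcm(dc_values):
    """Differences: first element unchanged, then DC[i] - DC[i-1]."""
    if not dc_values:
        return []
    return [dc_values[0]] + [b - a for a, b in zip(dc_values, dc_values[1:])]


def size_of(value):
    """Number of bits needed for |value|; 0 for value == 0."""
    return abs(value).bit_length()


def additional_bits(value):
    """Bit string identifying the value within its Size category."""
    size = size_of(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1   # negative values occupy the lower half of the codes
    return format(value, "0{}b".format(size))


def encode_delta(delta):
    """Huffman code of the Size category followed by the additional bits."""
    return DC_SIZE_CODES[size_of(delta)] + additional_bits(delta)


if __name__ == "__main__":
    deltas = dpcm([71, 75, 74])
    print(deltas)
    print([encode_delta(d) for d in deltas[1:]])
```

Coding differences instead of absolute DC values pays off because neighbouring blocks usually have similar average brightness, so most differences fall into the small Size categories with short codes.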

Example: For the next block the sequence of coefficients is:

[DC, AC1,...,AC63] = [75, 12, 18, -1, -15, …]

For the previous block the DC value is 71.

Solution:

• Calculate Δi:

Δi = DCi - DCi-1 = 75 – 71 = 4

• For Δi calculate Size using the formula:

Size = floor[log2(abs(4))] + 1 = 3

• From Table 1, for Size = 3 the Huffman code is 100, and the additional bits for Δi = 4 have the value 100

• As a result, for DC = 75 (Δi = 4) the calculated code is

100 100

6.2. Coding of the [AC1, AC2, ..., AC63]

Example of AC vector:

[AC1,...,AC63] = [12, 18, -1, -15, -2, -2, 2, 10, 1, -2, 0, 1, 1, 0, 0, 0, -2, -3, -3, 1, 0, 0, 2, 1, 0, ..., 0]

The vector contains non-zero elements and sequences of zeros.

[ ..., ACi-1, 0,..., 0, ACi, 0,..., 0, ACi+1, 0,... ]

The following coding structure has been proposed. Each non-zero ACi is represented by a pair of symbols:

ACi = (Symbol-1, Symbol-2)

so the vector is coded as the sequence

... Symbol-1 Symbol-2 | Symbol-1 Symbol-2 | ... | Symbol-1 Symbol-2 ...

Symbol-1 = (Runlength, Size)

Symbol-2 = (Amplitude)

Runlength – number of zeros between ACi-1 and ACi

Size – ACi range

Amplitude – ACi value
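The symbol structure above can be sketched as follows. The function name `ac_symbols` and the tuple layout `((Runlength, Size), Amplitude)` are illustrative. Two markers are assumed from baseline JPEG: EOB, written here as `((0, 0), None)`, when only zeros remain in the block, and ZRL, written as `((15, 0), None)`, for a run of 16 zeros that cannot be expressed in a single Runlength value.

```python
def ac_symbols(ac):
    """Turn an AC vector into ((runlength, size), amplitude) pairs."""
    symbols = []
    run = 0
    # Index of the last non-zero coefficient (-1 if the vector is all zeros).
    last_nz = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    for v in ac[:last_nz + 1]:
        if v == 0:
            run += 1
            continue
        while run > 15:                      # split long runs into ZRL markers
            symbols.append(((15, 0), None))
            run -= 16
        symbols.append(((run, abs(v).bit_length()), v))
        run = 0
    if last_nz < len(ac) - 1:                # trailing zeros collapse into EOB
        symbols.append(((0, 0), None))
    return symbols


if __name__ == "__main__":
    example = [1, 0, 0, -2, -3] + [0] * 58   # 63 AC coefficients
    for symbol in ac_symbols(example):
        print(symbol)
```

In this sketch Symbol-1 is the `(runlength, size)` pair, which is later Huffman-coded, and Symbol-2 is the amplitude, which is written with `size` raw bits.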

For ACi coding a Huffman code is used.

Coding of the ACi:

1. Calculate Runlength, that is, the number of zeros between the previous non-zero ACi-1 and ACi

2. Calculate Size using the formula:

Size = floor[log2(abs(ACi))] + 1

3. Code the pair (Runlength, Size) using the Huffman code table (Table 2).

Table 2.

(Runlength, Size) | Huffman code
(0,1) | 00
(0,2) | 01
(0,3) | 100
EOB | 1010
(0,4) | 1011
(1,1) | 1100
(0,5) | 11010
(1,2) | 11011
(2,1) | 11100
(3,1) | 111010
(4,1) | 111011
(0,6) | 1111000
(1,3) | 1111001
(5,1) | 1111010
(6,1) | 1111011
(0,7) | 11111000
(2,2) | 11111001
(7,1) | 11111010
(1,4) | 111110110
… | …
ZRL | 11111111001
… | …

EOB – zeros to the end of the block

ZRL – 16 zeros (15 zeros and ACi = 0)

4. Code Amplitude (for the Size value calculated in step 2) using Table 3.

Table 3.

Size | ACi value | Amplitude code
0 | 0 | ---
1 | -1, 1 | 0, 1
2 | -3, -2, 2, 3 | 00, 01, 10, 11
3 | -7,…,-4, 4,…,7 | 000,…,011, 100,…,111
4 | -15,…,-8, 8,…,15 | 0000,…,0111, 1000,…,1111
… | … | …
11 | -2047,…,-1024, 1024,…,2047 | 000 0000 0000,…,111 1111 1111

Example:

Code the following part of the vector.

[AC1, ..., AC63] = [ …, 1, 0, 0, -2, -3, … ]

The sequence to be coded is …, 0, 0, -2, …

• for ACi = -2 the Size value is equal to 2,

• from Table 2, for the pair (Runlength, Size) = (2, 2) the appropriate Huffman code is 11111001,

• from Table 3, for Size = 2 and ACi = -2 the code is 01.

As a result, the sequence …, 0, 0, -2, … has been coded as: 11111001 01

7. Conclusions

JPEG coder (simplified)

Source data → blocks 8x8 → DCT → Quantiser → Bit coder → Compressed data

The quantiser uses the quantisation tables; the bit coder uses the Huffman code tables.

JPEG decoder (simplified)

Compressed data → Bit decoder → Dequantiser → DCT-1 → Data blocks 8x8

The bit decoder uses the Huffman code tables; the dequantiser uses the quantisation tables.

8. An example

Test image 256 x 256 x 8 = 524,288 bits = 64 kB

BMP format (66,616 B)

JPEG compression results

High compression (24,295 B)

Medium compression (31,526 B)

Low compression (56,956 B)

JPEG compression - details

[Figure: source image (66,616 B) compared with high compression (24,295 B)]

JPEG XR (eXtended Range)

• developed by: Microsoft, ITU-T, ISO/IEC (2009)

• several key improvements over JPEG
– Better compression
– Tile structure support
– Support for more color accuracy
– Transparency map support
– Compressed-domain image modification
– support

WebP

• Open format, developed by: Google (2010)
– lossless images are 26% smaller in size compared to PNGs
– lossy images are 25-34% smaller in size compared to JPEG images at equivalent SSIM index
– supports lossless transparency (also known as alpha channel) with just 22% additional bytes
– uses predictive coding to encode an image, the same methodology used by VP8 to compress keyframes in videos

BPG - Better Portable Graphics

• based on the video encoding standard HEVC, otherwise known as H.265

• a format designed by the French programmer Fabrice Bellard

• Its purpose is to replace the JPEG image format when quality or file size is an issue

• Features
– High compression ratio. Files are much smaller than JPEG for similar quality
– Supported by most Web browsers with a small JavaScript decoder (gzipped size: 56 KB)
– Based on a subset of the HEVC open video compression standard
– Supports the same chroma formats as JPEG (grayscale, YCbCr 4:2:0, 4:2:2, 4:4:4) to reduce the losses during the conversion. An alpha channel is supported. The RGB, YCgCo and CMYK color spaces are also supported.
– Native support of 8 to 14 bits per channel for a higher dynamic range
– Lossless compression is supported
– Various metadata (such as ICC profile, XMP) can be included
– Animation support

High Efficiency Image File Format (HEIF)

• developed by the Moving Picture Experts Group (MPEG)
• defined by MPEG-H Part 12 (ISO/IEC 23008-12)
• a container format (for images and image sequences that are coded in different formats)
• support for animation and for storing other media, such as audio and timed text
• already implemented in iOS / macOS

FLIF - Free Lossless Image Format

• Open format (September 2015)
• A novel lossless image format
– 14% smaller than lossless WebP,
– 22% smaller than lossless BPG,
– 33% smaller than brute-force crushed PNG files (using ZopfliPNG),
– 43% smaller than typical PNG files,
– 46% smaller than optimized Adam7-interlaced PNG files,
– 53% smaller than lossless JPEG 2000 compression,
– 74% smaller than lossless JPEG XR compression.
• based on MANIAC compression; MANIAC (Meta-Adaptive Near-zero Integer Arithmetic Coding) is an algorithm for entropy coding developed by Jon Sneyers and Pieter Wuille.

• Features
– Lossless compression
– Greyscale, RGB, RGBA
– Color depth: up to 16 bits per channel (high dynamic range)
– Interlaced (default) or non-interlaced
– Interlaced files can be decoded quickly at lower quality/resolution ("Responsive By Design")
– Progressive decoding of partially downloaded files
– Animation support
– Encoding and decoding speeds are not very fast

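• FLIF does not yet support:
– Tiles (to store huge images with fast cropped viewing)
– Other color spaces (CMYK, YCbCr, ...)
– Web browser support
– Support in popular image tools and viewers
– A highly optimized implementation

Feature | .HEIC | JPEG/Exif | PNG | GIF (89a) | WebP | JPEG-XR / TIFF | JPEG-XR / JPX | BPG

Formats and extensibility
Base container file format | ISOBMFF | TIFF | - | - | RIFF | TIFF | - [4] | -
Lossy compression | Yes (HEVC) | Yes (JPEG) | No | No | Yes (VP8) | Yes | Yes | Yes (HEVC10)
Lossless compression | Yes (HEVC) | Yes (TIFF Rev 6.0) | Yes (PNG) [1] | Yes (GIF) [1] | Yes (VP8L) | Yes | Yes | Yes (HEVC10)
Extensible to other coding formats | Yes | Yes [8] | No | No | No | Yes [8] | Yes [5] | No
Metadata format (on top of internal) | Exif, XMP, MPEG-7 | Exif | - | - | Exif, XMP | Exif, XMP | JPX, (XMP) [6] | Exif, XMP
Extensible to other metadata formats | Yes | No | No | No | No | No | Yes (XML-based) | Yes
Other media types (audio, text, etc.) | Yes | Audio [2] | No | No | No | No | Yes [7] | No

Multi-picture features
Multiple images in the same file | Yes | No [11] | No | Yes [3] | Yes [3] | No | Yes | Yes [9]
Image sequences / animations | Yes | No | No | Yes | Yes | No | Yes | Yes
Image coding | Yes | No | No | No | No | No | No | Yes

Derived images
Multiple-of-90-degree rotations | Yes | Yes | No | No | No | Yes | Yes | No
Cropping | Yes | No | No | No | No | No | Yes | No
Tiling/overlaying | Yes | No | No | No | Yes | No | Yes | No
Extensible to other editing operations | Yes | No | No | No | No | No | No | No

Auxiliary picture information
Transparency (alpha plane) | Yes | No | Yes | No [12] | Yes | Yes | Yes | Yes
Thumbnail image | Yes | Yes | No | No | No | Yes | Yes | Yes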

Metadata format (on top of internal) Exif, XMP, MPEG-7 Exif - - Exif, XMP Exif, XMP JPX, (XMP)6 Exif, XMP Yes (XML- Extensible to other metadata formats Yes No No No No No Yes based) Other media types (audio, text, etc.) Yes Audio2 No No No No Yes7 No Multi-picture features Multiple images in the same file Yes No11 No Yes3 Yes3 No Yes Yes9 Image sequences / animations Yes No No Yes Yes No Yes Yes Image coding Yes No No No No No No Yes Derived images Multiple-of-90-degree rotations Yes Yes No No No Yes Yes No Cropping Yes No No No No No Yes No Tiling/overlaying Yes No No No Yes No Yes No Extensible to other editing operations Yes No No No No No No No Auxiliary picture information Transparency (alpha plane) Yes No Yes No12 Yes Yes Yes Yes Thumbnail image Yes Yes No No No Yes Yes Yes