
X-Media Audio, Image and Video Coding

Laurent Duval (IFP Energies nouvelles), Eric Debes (Thalès), Philippe Morosini (Supélec)

Course materials (Moodle/NTNoe): http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html

General information

Contents
• Introduction to X-Media coding → generic principles
• Audio → data recording/sampling, physiology
• Data, audio coding → LZ*, MPEG Layer 3
• Image coding → JPEG vs. JPEG 2000
• Video coding → MPEG formats, H.264
• Bonuses, exercises

Initial motivation
• Coding/compression as a DSP engineering discipline
  → information reduction, standards & adaptation, complexity, integrity & security issues, interaction with the SP pipeline
• Composite domain, yet ubiquitous
  → sampling (Nyquist-Shannon, filter banks), statistical SP (KLT, decorrelation), transforms (Fourier and others), classification (quantization, K-means), functional spaces (bases & frames), information theory (entropy), error measurements, modelling

Digital Image Processing (overview diagram)
• Digital image characteristics
  → Spatial: gray-level histogram
  → Spectral: DFT, DCT
• Pre-processing
  → Enhancement: point processing, masking, filtering
  → Restoration: degradation models, inverse filtering, Wiener filtering
• Compression
  → Information theory
  → Lossless: LZW
  → Lossy: transform-based
• Segmentation
  → Edge detection
• Description
  → Shape descriptors, texture, morphology

• Exemplar
  → underline the importance of specific steps (related to other lectures) to make things work
  → which algorithm for which task?
     ° process text? sound? images? video?
  → "the toolbox quote" (Juran)
• Evolving
  → steady evolution of standards and tools, overview of future directions (and tools)
• Central to SP tasks
  → what is really important in my data?
  → (stored) info. overflow + degradations (decon.)

Principles

• What is data (image) compression?
  → The art and science of representing information in a compact form.
  → Data is a sequence of symbols taken from a discrete alphabet.
     ° Text: sequence of characters/bytes (0.5 D)
     ° Sound: collection of arrays of values representing intensities (1 D to 1.5 D)
     ° Still image: collection of arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel) (2 D or 2.5 D)
     ° Video: sequence of still images (3 D)
     ° Next: 3D-TV (audio, video, ambiance, smell?)

Flurry of formats? Examples of data compression extensions

• Some (standard) extensions: GIF, RAR, ZIP, BZ2, MP3, MPEG4, AVC, PNG, J2K, BH, AVI, R(A)M, LHA, AAC, HE-AAC, MPC, OGM, APE, TIFF, JP2, JPEG-XR (WMP/HD Photo), WAV, PAK, FLV, FLAC, PDF, MAT, BMP, Z, GZ, MJPEG, 7z, TTA, DjVu, WebP
  http://en.wikipedia.org/wiki/List_of_archive_formats

• Targeted for what kind of data?

Why?

Why do we need compression?

Still Image
• One page of A4 format at 600 dpi is > 100 MB.
• One color image from a digital camera is 10-30 MB.
• A scanned 3”×7” photograph at 300 dpi is 30 MB.

HDTV Video
• 720 x 1280 pixels/frame x 60 frames/s x 24 bits/pixel = 1.3 Gb/s
• HDTV bandwidth: 20 Mb/s
• Objective: 70x reduction
• Equivalent: 0.35 bits/pixel
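The HDTV figures above can be rechecked with a few lines of Python. This is only a sketch: the 24 bits per pixel (three 8-bit color components) is an assumption that makes the quoted 1.3 Gb/s come out, and the variable names are illustrative.

```python
# Raw HDTV bit rate versus the available channel, using the figures quoted above.
width, height = 1280, 720      # pixels per frame
fps = 60                       # frames per second
bits_per_pixel = 24            # assumption: 3 color components x 8 bits

pixels_per_second = width * height * fps
raw_rate = pixels_per_second * bits_per_pixel   # bits per second
channel_rate = 20e6                             # HDTV bandwidth: 20 Mb/s

reduction = raw_rate / channel_rate             # required compression ratio
budget = channel_rate / pixels_per_second       # bits available per pixel

print(f"raw rate : {raw_rate / 1e9:.2f} Gb/s")
print(f"reduction: {reduction:.0f}x")
print(f"budget   : {budget:.2f} bits/pixel")
```

The exact values (about 66x and 0.36 bits/pixel) are what the slide rounds to 70x and 0.35 bits/pixel.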

• InfoTrends (2008/01): total number of digital pictures taken since the beginning ~ 180 billion. Should grow to 347 billion in 2012.

Why do we need Image Compression?

1) Storage
2) Transmission (cable, satellite, wifi)
3) Data access

1990-2000:
• Disk capacities: 100 MB -> 20 GB (200 times!) (3-4 TB)
• but seek time: 15 ms -> 10 ms (3-4 ms)
• and transfer rate: 1 MB/s -> 2 MB/s (much better for SSD)

Compression improves overall response time in some applications.

Source of images

• Digital camera
• Video camera
• Ultrasound (US), tomography (CT), magnetic resonance imaging (MRI), digital X-ray (XR), infrared
• Remote sensing, seismics, satellite, radar, SAR

Data types

[Diagram]
• Image compression: binary images, gray-scale images, colour palette images, true colour images, video images
• Universal compression: textual data

Why do we need specific algorithms?

[Figure] Binary image: 1 bit/pel
[Figure] Grayscale image: 8 bits/pel, intensity = 0-255

Parameters of digital images
[Figure: the same image at 6 bits (64 gray levels), 4 bits (16 gray levels) and 2 bits (4 gray levels), and at resolutions 384×256, 192×128, 96×64 and 48×32]

[Figure] True color image: 3*8 bits/pel

Goals of compression
• Balance redundancy and irrelevancy
  → sources of redundancy
     ° temporal
     ° spatial
     ° color
     ° other?
  → sources of irrelevancy
     ° perceptually unimportant information
  → issues: redundancy/irrelevancy
     ° examples?
     ° lossless/lossy choices

Lossy vs. lossless

Lossless compression: reversible, information preserving. Used for text compression algorithms, binary images, palette images.

Lossy compression: irreversible. Used for grayscale, color, video.

Near-lossless compression: medical imaging, remote sensing.

1) Why do we need lossy compression?
2) When can we use lossy compression?

Rate measures

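Bitrate: $C / N$ bits/pel, where $C$ is the size of the compressed file (in bits) and $N$ is the number of pixels in the image.

Compression ratio: $(N \cdot k) / C$, where $N \cdot k$ is the size of the original file and $C$ the size of the compressed file.

Distortion measures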

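Mean absolute error (MAE): $\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - x_i|$

Mean square error (MSE): $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - x_i)^2$

Signal-to-noise ratio (SNR): $\mathrm{SNR} = 10 \cdot \log_{10}\left[\sigma^2 / \mathrm{MSE}\right]$ (decibels)

Peak signal-to-noise ratio (PSNR): $\mathrm{PSNR} = 10 \cdot \log_{10}\left[A^2 / \mathrm{MSE}\right]$ (decibels)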
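The rate and distortion measures above translate directly into NumPy. The sketch below assumes 8-bit images (peak amplitude 255) and takes the variance of the original image as the signal power in the SNR; the function names and the tiny 2x2 test image are illustrative only.

```python
import numpy as np

def bitrate(compressed_bits, n_pixels):
    """Bits per pixel: C / N."""
    return compressed_bits / n_pixels

def compression_ratio(n_pixels, bits_per_pixel, compressed_bits):
    """Original size (N * k bits) over compressed size (C bits)."""
    return n_pixels * bits_per_pixel / compressed_bits

def mae(x, y):
    # Cast to float first: subtracting uint8 arrays would wrap around.
    return np.mean(np.abs(y.astype(float) - x.astype(float)))

def mse(x, y):
    return np.mean((y.astype(float) - x.astype(float)) ** 2)

def snr_db(x, y):
    return 10 * np.log10(np.var(x.astype(float)) / mse(x, y))

def psnr_db(x, y, peak=255.0):
    return 10 * np.log10(peak ** 2 / mse(x, y))

x = np.array([[52, 55], [61, 59]], dtype=np.uint8)   # original
y = np.array([[53, 54], [63, 59]], dtype=np.uint8)   # reconstruction

print(mae(x, y), mse(x, y), round(psnr_db(x, y), 2))
print(bitrate(262144, 512 * 512), compression_ratio(512 * 512, 8, 262144))
```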

$A$ is the amplitude of the signal: $A = 2^8 - 1 = 255$ for an 8-bit signal.
Other measures: l_p norms, SSIM, MOS

Other issues
• Coder and decoder computational complexity
• Memory requirements
• Fixed rate or variable rate
• Error resilience
• Symmetric or asymmetric algorithms
• Decompress at multiple resolutions
• Decompress at various bit rates
• Standard or proprietary
Reduce redundancy/irrelevancy at "each" step considering performance and quality

What is an image?
[Three figure slides]

Ultimate storage
• Image compression: why?
  • storage, transmission, database indexing
  • processing (denoising, scaling, rotation)
• How come?
  • 512 x 512 pix 8-bit image → 2,097,152 bits
  • Far fewer than 10^100 atoms in the entire Universe
  • 10^15 directions, magn., depth and exposure params
  • Tautavel Man takes 1000 pix/s (since 450,000 BC)
  • A collection of 1.42·10^176 pix.
  • typical compressed image size?
  • # of bits needed: 17, 586, 10,253, 12,087,300, more?

What is a compression system?

Compression Model
[Diagram: f(x,y) → Transform → Quantize → Encode]
• Source
• Channel

[Block diagram of an adaptive compression system, data in and stream out: Pre-processing, Blocking, Transform, Processing, Classifier, Bit reduction, Ordering, Entropy coding, Rate allocator, Quality, Embedded coding]

Compression scheme (1)

[Figure: modeling; data = parameters + model error]

Compression scheme (2)
• transform
• reduction
• coding

xO = 1 3 7 2 -5 -1 0 0 0
xR = 0 2 6 2 -6 0 0 0 0
xC = 0 1 3 1 -3 4

Preprocessing

[Block diagram, Pre-processing stage highlighted: image analysis, filtering/enhancement, RGB to YUV, extension]

• Image analysis
• Filtering/enhancement
• RGB to YUV
• Image extension

RGB color space

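[Figure: red, green and blue components]

RGB → YUV

$$\begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix} = \begin{pmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix}$$

Simplified form: $Y = 0.3 \cdot R + 0.6 \cdot G + 0.1 \cdot B$, $U = B - Y$, $V = R - Y$

$$\begin{pmatrix} R \\ G \\ B \end{pmatrix} = \begin{pmatrix} 1.0 & 0 & 1.402 \\ 1.0 & -0.34413 & -0.71414 \\ 1.0 & 1.772 & 0 \end{pmatrix} \begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix}$$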

R, G, B -- red, green, blue Y -- the luminance U,V -- the chrominance components
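A minimal NumPy sketch of this color conversion, using the forward and inverse coefficients quoted on this slide (no offset is added to the chrominance components, which is an assumption of this sketch; the names FWD, INV and the helper functions are illustrative):

```python
import numpy as np

# Forward RGB -> YCbCr matrix and its (approximate) inverse, as on the slide.
FWD = np.array([[ 0.299,    0.587,    0.114  ],
                [-0.16875, -0.33126,  0.5    ],
                [ 0.5,     -0.41869, -0.08131]])
INV = np.array([[1.0,  0.0,      1.402  ],
                [1.0, -0.34413, -0.71414],
                [1.0,  1.772,    0.0    ]])

def rgb_to_ycbcr(rgb):
    """rgb: array of shape (..., 3); returns Y, Cb, Cr in the last axis."""
    return np.asarray(rgb, dtype=float) @ FWD.T

def ycbcr_to_rgb(ycc):
    return np.asarray(ycc, dtype=float) @ INV.T

pixels = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255], [128, 128, 128]], dtype=float)
ycc = rgb_to_ycbcr(pixels)
back = ycbcr_to_rgb(ycc)
print(np.round(ycc, 2))
print(np.round(back, 2))
```

For the gray pixel the chrominance components are (almost exactly) zero: all the information sits in Y. The round trip is accurate only up to the rounding of the printed coefficients.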

Most of the information is concentrated in the Y component; the U and V components carry less information.

YUV color space
[Figure: Y, U and V components]

Blocking

[Block diagram, Blocking stage highlighted: JPEG blocking, irregular tiling, segmentation]

• Goals:
  • To exploit image nonstationarities
  • To reduce the computational cost
  • To exploit inter-block dependencies (2D-3D)
  • To select objects of interest (moving)

1D Signal Blocking
• How do we perform decomposition?
  • block by block
  • with overlap

2D Image Blocking
[Figures: JPEG blocking, irregular tiling, segmentation]
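The two 1D decompositions (block by block, with overlap) and the regular JPEG-style 2D tiling can be sketched as follows. This is an illustration only: the helper names blocks_1d and blocks_2d are hypothetical, and the 2D version assumes image dimensions that are multiples of the block size (no image extension).

```python
import numpy as np

def blocks_1d(signal, size, hop):
    """Split a 1D signal into blocks of `size` samples, advancing by `hop`.
    hop == size gives block-by-block processing; hop < size gives overlap."""
    signal = np.asarray(signal)
    starts = range(0, len(signal) - size + 1, hop)
    return np.stack([signal[s:s + size] for s in starts])

def blocks_2d(image, size=8):
    """Tile an image into non-overlapping size x size blocks (JPEG-style)."""
    h, w = image.shape
    return (image.reshape(h // size, size, w // size, size)
                 .swapaxes(1, 2)
                 .reshape(-1, size, size))

x = np.arange(16)
plain = blocks_1d(x, size=8, hop=8)     # 2 disjoint blocks
lapped = blocks_1d(x, size=8, hop=4)    # 3 blocks with 50% overlap
tiles = blocks_2d(np.arange(16 * 24).reshape(16, 24))   # 2 x 3 = 6 tiles
print(plain.shape, lapped.shape, tiles.shape)
```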

[Block diagram, Transform stage highlighted: Karhunen-Loève, Fourier, DCT, Walsh, Hartley, (packets)]

Transform Coding
• Goals
  • Efficient representation I(u,v) of an image i(x,y)
  • Data decorrelation (KLT optimality?)
• Properties
  • Linear transforms (matrix op.)
  • Orthogonality
  • Fast algorithms for compression/decompression
  • Laplace-Gauss distribution (var. length coding)

Karhunen-Loève (Hotelling) Transform

$x$: $N \times 1$ vector

$m_x = E\{x\}$: mean vector
$C_x = E\{(x - m_x)(x - m_x)^T\}$: covariance matrix
$\lambda_i$: eigenvalues of $C_x$, $i = 1, 2, \ldots, N$
$e_i$: eigenvectors corresponding to $\lambda_i$
$A$: matrix with rows $e_i$
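A small NumPy sketch of these definitions, ending with the projection $y = A(x - m_x)$ applied to every sample. The data set (1000 correlated 2-D vectors) is hypothetical, and the mean and covariance are sample estimates rather than true expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 1000 strongly correlated 2-D vectors, one per row.
base = rng.normal(size=1000)
X = np.column_stack([base + 0.1 * rng.normal(size=1000),
                     base + 0.1 * rng.normal(size=1000)])

m_x = X.mean(axis=0)                     # mean vector (sample estimate)
C_x = np.cov(X, rowvar=False)            # covariance matrix (sample estimate)
eigvals, eigvecs = np.linalg.eigh(C_x)   # eigh: ascending eigenvalues, eigenvectors in columns
order = np.argsort(eigvals)[::-1]        # sort by decreasing eigenvalue
A = eigvecs[:, order].T                  # rows of A are the eigenvectors e_i

Y = (X - m_x) @ A.T                      # y = A (x - m_x) for every row x
C_y = np.cov(Y, rowvar=False)            # covariance after the transform
print(np.round(C_x, 3))
print(np.round(C_y, 3))
```

The transformed covariance is diagonal, with the eigenvalues on the diagonal: the components of y are decorrelated and most of the variance is packed into the first one.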

Hotelling transform of x

$y = A(x - m_x)$

Singular Value Decomposition
[Figure slide]

Transform Coding
• An optimality result:
  • Simplest stationary source model AR(1): DCT ~ KLT
• Troubles:

• KLT calculations + overhead

• After a transformation, correlation can be made very small, but coefficients are far from being independent!
• Transform affects coding

Transform Coding
• Choices
  • Good decorrelation properties
  • Low complexity, HW implementation (DCT, Fourier)
  • Side effects (extension, zero or linear-phase)
• A great deal of nice transforms
  • Walsh-Hadamard system (with fast transforms)
  • Wavelet (Haar-1910, Mallat, Daubechies 1988)
  • Lapped transforms (Malvar, Meyer)

[Figure slides: Transform Coding; Transform Coding: performance; Performance measure: coding gain; DCT and size; Transform coding: optimization; 2D Hadamard-16; 2D DCT-16]

Standard transforms
• Limitations
  → fixed vector size
  → constrained shape (orthogonality)
  → data-driven adaptation
  → shifts and rotations robustness
  → higher dimension generalizations
  → analogy with vision aspects
     ° needed for compression: inverses/redundancy/sparsity
  → pre- and post-processing
  → coder complexity

Transform coding
• A common waveform

• A common representation
• A less common waveform
• A less common representation
• A not so common waveform
• A not so common representation

Novel transforms
[Figure: high frequencies, low frequencies; Wavelet FB-II]

[Figure slides: Transform Coding]

Processing

[Block diagram, Processing stage highlighted: time/frequency filtering, image analysis, texture analysis]

Classifier
[Block diagram, Classifier stage highlighted: spectrum allocation, texture extraction, segmentation]

Texture Synthesis
[Figure slide]

Bit Reduction
[Block diagram, Bit reduction stage highlighted: thresholding, subsampling, scalar quantization, vector quantization, iterations (fractals)]

• The lossy stage
  • Human eyes see a limited range of tones/frequencies
  • Real-world pictures are already imperfect
• Methods
  • Scalar quantization: uniform, logarithmic, optimal (Lloyd-Max algorithm)
  • Vector quantization: Voronoï diagrams, nearest neighbour
• Adaptivity
  • Pre-stored tables
  • Training-set-based tables
  • On-the-fly quantization estimation

Quantization

[Figures: 8 bits, last 5 bits, last 4 bits, last 3 bits, last 2 bits, last bit]
[Figure: 8, 4, 3, 2 and 1 bits]

[Figure slides: Quantization; Quantization (non-uniform); Quantization (adaptive)]

Quantization (vector)

• Outline for images

[Diagram: image vectors $X_j$ are compared with a codebook of codevectors $V_i$, $1 \le i \le L$ ($V_1, V_2, \ldots, V_L$); each image vector is represented by its closest matching codevector $V_k$]

Mean square error (MSE, Euclidean distance): $d(X_j, V_k) = \frac{1}{n} \sum_{i=1}^{n} (X_j(i) - V_k(i))^2$

Weighted MSE: $d(X_j, V_k) = \frac{1}{n} \sum_{i=1}^{n} W_i \, (X_j(i) - V_k(i))^2$

• LBG algorithm
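The nearest-codevector search with the MSE distance, and the codebook update at the heart of LBG-style training, can be sketched as below. This is a simplified illustration: the full LBG algorithm also includes a codebook initialization by splitting and a stopping criterion, both omitted here, and the names encode and lloyd_step are hypothetical.

```python
import numpy as np

def encode(vectors, codebook):
    """For each vector X_j, return the index k of the closest codevector V_k (MSE distance)."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).mean(axis=2)
    return d.argmin(axis=1)

def lloyd_step(vectors, codebook):
    """One training iteration: assign vectors to cells, then move each codevector to its cell centroid."""
    idx = encode(vectors, codebook)
    new = codebook.copy()
    for k in range(len(codebook)):
        members = vectors[idx == k]
        if len(members):            # an empty cell keeps its previous codevector
            new[k] = members.mean(axis=0)
    return new

vectors = np.array([[0., 0.], [0., 2.], [10., 10.], [10., 12.]])
codebook = np.array([[1., 1.], [8., 8.]])

indices = encode(vectors, codebook)        # only these indices need to be transmitted
codebook = lloyd_step(vectors, codebook)   # refined codebook
print(indices, codebook)
```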

• Simple codebooks → (parrots)
• Codebook for a specific feature → e.g. edges, smooth areas, etc.
• Codebooks could be of different sizes

Ordering
[Block diagram, Ordering stage highlighted: raster scan, zig-zag, Hilbert scan, zerotrees, object-based coding]

[Figure slides: Ordering; Tree-coding ordering; Ordering; Wavelet/Blocking equivalence]

Entropy Coding

[Block diagram, Entropy coding stage highlighted: RLE, Huffman, arithmetic, LZ*, form dictionaries]

After reduction:
• Lower coefficient variance
• A lot zeroed out
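Among the entropy coders used at this stage, arithmetic coding represents a whole message by one sub-interval of [0, 1). The sketch below narrows that interval for the message ANTECEDENT, assuming symbol intervals proportional to the symbol counts in that message (A, C, D: 1/10 each; E: 3/10; N, T: 2/10 each). The names INTERVALS and narrow are illustrative, and exact fractions stand in for the finite-precision arithmetic of a real coder.

```python
from fractions import Fraction
from math import log2

# Cumulative interval [low, up) assigned to each symbol.
INTERVALS = {
    "A": (Fraction(0), Fraction(1, 10)),
    "C": (Fraction(1, 10), Fraction(2, 10)),
    "D": (Fraction(2, 10), Fraction(3, 10)),
    "E": (Fraction(3, 10), Fraction(6, 10)),
    "N": (Fraction(6, 10), Fraction(8, 10)),
    "T": (Fraction(8, 10), Fraction(1)),
}

def narrow(message, intervals=INTERVALS):
    """Return the final [low, up) interval after processing every symbol."""
    low, up = Fraction(0), Fraction(1)
    for symbol in message:
        width = up - low
        s_low, s_up = intervals[symbol]
        low, up = low + width * s_low, low + width * s_up
    return low, up

low, up = narrow("ANTECEDENT")
bits = -log2(float(up - low))      # information content of the final interval
print(float(low), float(up), round(bits, 2))
```

The final interval has width 4.32·10^-8, about 24.46 bits; any number inside it, for instance 0.07736511, identifies the message once the model and the message length are known.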

Arithmetic coding example: ANTECEDENT

Symbol   #   Huff.   Low   Up
A        1   4       0     0.1
C        1   5       0.1   0.2
D        1   5       0.2   0.3
E        3   1       0.3   0.6
N        2   2       0.6   0.8
T        2   3       0.8   1

Incoming   Low            Up
Start      0              1
A          0              0.1
N          0.06           0.08
T          0.076          0.08
E          0.0772         0.0784
C          0.07732        0.07744
E          0.077356       0.077392
D          0.0773632      0.0773668
E          0.07736428     0.07736536
N          0.077364928    0.077365144
T          0.0773651008   0.077365144

Final interval width ≈ 4.32·10^-8: 24.46 bits against 27 bits
Codeword: 0.07736511

AAAAAAAAA END: 7 against 11 bits

Rate allocation

[Block diagram, Rate allocator stage highlighted: exact bit-rate, progressive coding, distortion matching]

Quality measure
[Block diagram, Quality stage highlighted: objective measures (SNR), subjective measures, HVS model]

Embedded coding
[Block diagram, Embedded coding stage highlighted]

Embedded quantization
[Figure: bit-plane representation of the coefficients, from the sign bits and the most significant bit-plane (Msb, plane 4) down to the least significant bit-plane (Lsb, plane 0); bits not yet coded are marked x]

[Figure slides: Ordering; Wavelet/Blocking equivalence]

JPEG

JPEG Principles
[Encoder diagram: 8x8 DCT → Quantizer → Coefficients-to-Symbols Map → Entropy Coder]

[Figure: input image, size = 512 x 512 x 8 bits]

$$X[u,v] = \frac{C[u]\,C[v]}{4} \sum_{m=0}^{7} \sum_{n=0}^{7} x[m,n] \cos\frac{(2m+1)u\pi}{16} \cos\frac{(2n+1)v\pi}{16}$$

$$C[u] = \begin{cases} 1/\sqrt{2}, & u = 0 \\ 1, & 1 \le u \le 7 \end{cases}$$

Pixel block x[m,n]:
 69  71  74  76  89 106 111 122
 59  70  61  61  68  76  88  94
 82  70  77  67  65  63  57  70
 97  99  87  83  72  72  68  63
 91 105  90  95  85  84  79  75
 92 110 101 106 100  94  87  93
 89 113 115 124 113 105 100 110
104 110 124 125 107  95 117 116

DCT coefficients X[u,v]:
717.6   0.2   0.4 -19.8  -2.1  -6.2  -5.7  -7.6
-99.0 -35.8  27.4  19.4  -2.6  -3.8   9.0   2.7
 51.8 -60.8   3.9 -11.8   1.9   4.1   1.0   6.4
 30.0 -25.1  -6.7   6.2  -4.4 -10.7  -4.2  -8.0
 22.6   2.7   4.9   3.4  -3.6   8.7  -2.7   0.9
 15.6   4.9  -7.0   1.1   2.3  -2.2   6.6  -1.7
  0.0   5.9   2.3   0.5   5.8   3.1   8.0   4.8
 -0.7  -2.3  -5.2  -1.0   3.6  -0.5   5.1  -0.1

Step 1: DCT

[Figure slide: JPEG Principles (Transform: DCT8)]

The DCT coefficients of Step 1 are divided element-wise by the quantization matrix Q and rounded.

Quantization matrix Q:
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

Quantized coefficients:
45   0   0  -1   0   0   0   0
-8  -3   2   1   0   0   0   0
 4  -5   0   0   0   0   0   0
 2  -1   0   0   0   0   0   0
 1   0   0   0   0   0   0   0
 1   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
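Steps 1 and 2 can be reproduced with a few lines of NumPy. This sketch assumes the pixel block and the quantization matrix Q of this example, no level shift before the DCT (consistent with the DC value 717.6), and rounding to the nearest integer; the variable names are illustrative.

```python
import numpy as np

x = np.array([[ 69,  71,  74,  76,  89, 106, 111, 122],
              [ 59,  70,  61,  61,  68,  76,  88,  94],
              [ 82,  70,  77,  67,  65,  63,  57,  70],
              [ 97,  99,  87,  83,  72,  72,  68,  63],
              [ 91, 105,  90,  95,  85,  84,  79,  75],
              [ 92, 110, 101, 106, 100,  94,  87,  93],
              [ 89, 113, 115, 124, 113, 105, 100, 110],
              [104, 110, 124, 125, 107,  95, 117, 116]], dtype=float)

Q = np.array([[16, 11, 10, 16,  24,  40,  51,  61],
              [12, 12, 14, 19,  26,  58,  60,  55],
              [14, 13, 16, 24,  40,  57,  69,  56],
              [14, 17, 22, 29,  51,  87,  80,  62],
              [18, 22, 37, 56,  68, 109, 103,  77],
              [24, 35, 55, 64,  81, 104, 113,  92],
              [49, 64, 78, 87, 103, 121, 120, 101],
              [72, 92, 95, 98, 112, 100, 103,  99]], dtype=float)

# 1D DCT matrix: D[u, m] = C[u]/2 * cos((2m+1) u pi / 16), with C[0] = 1/sqrt(2).
u = np.arange(8)[:, None]
m = np.arange(8)[None, :]
D = 0.5 * np.cos((2 * m + 1) * u * np.pi / 16)
D[0, :] /= np.sqrt(2)

X = D @ x @ D.T                      # Step 1: separable 2D DCT of the block
Xq = np.round(X / Q).astype(int)     # Step 2: quantization
x_rec = D.T @ (Xq * Q) @ D           # decoder side: dequantize, then inverse DCT

print(np.round(X, 1))
print(Xq)
```

The printed arrays can be compared with the coefficient tables of this example; x_rec shows what the decoder recovers after the lossy quantization step.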

Step 2: Quantization

JPEG Principles
Step 3: Coefficient-to-Symbol Mapping

Input (quantized coefficients):
45   0   0  -1   0   0   0   0
-8  -3   2   1   0   0   0   0
 4  -5   0   0   0   0   0   0
 2  -1   0   0   0   0   0   0
 1   0   0   0   0   0   0   0
 1   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0

Zigzag scan procedure

Result = 45, 0, -8, 4, -3, 0, -1, 2, -5, 2, 1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 1, followed by 43 zeros

Symbols defined as [run of zeros, nonzero terminating value]
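The zigzag scan and the [run of zeros, nonzero value] symbols can be sketched in plain Python on the quantized block of this example. Two simplifications are assumptions of this sketch: the DC coefficient is treated like any other coefficient, and the trailing run of zeros is replaced by a single "EOB" (end of block) marker. The function names are illustrative.

```python
Xq = [[45,  0, 0, -1, 0, 0, 0, 0],
      [-8, -3, 2,  1, 0, 0, 0, 0],
      [ 4, -5, 0,  0, 0, 0, 0, 0],
      [ 2, -1, 0,  0, 0, 0, 0, 0],
      [ 1,  0, 0,  0, 0, 0, 0, 0],
      [ 1,  0, 0,  0, 0, 0, 0, 0],
      [ 0,  0, 0,  0, 0, 0, 0, 0],
      [ 0,  0, 0,  0, 0, 0, 0, 0]]

def zigzag_order(n=8):
    """(row, col) pairs along anti-diagonals, alternating direction on each diagonal."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def run_length_symbols(seq):
    """Map a scan to (run of zeros, nonzero value) pairs; trailing zeros become 'EOB'."""
    symbols, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run:
        symbols.append("EOB")
    return symbols

scan = [Xq[i][j] for i, j in zigzag_order()]
symbols = run_length_symbols(scan)
print(scan[:21], len(scan) - 21)
print(symbols)
```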

Step 4: Entropy Coding
• Symbols are encoded using mostly Huffman coding.
• Huffman coding is a method of variable-length coding in which shorter codewords are assigned to the more frequently occurring symbols.

Classical orthogonal transforms

$[f]$: an $N \times N$ image

$[F] = V [f] H^*$

$V^{-1} = V^*, \quad H^{-1} = H^*$

$[f] = V^* [F] H$

$\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |F(u,v)|^2 = \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} |f(m,n)|^2$

[Figure slides: One JPEG block; JPEG compression on several blocks]

Blocking Artifacts

Blocking artifact (DCT). Examples using different quality factors to scale matrix Q:

[Figures]
• Original image: 263224 bytes
• 5728 bytes: 46x compression
• 11956 bytes: 22x compression
• 15159 bytes: 17x compression
• 18032 bytes: 15x compression
• 20922 bytes: 13x compression

JPEG and DCT: motion

Another DCT application: video compression
[Block diagram: the prediction is subtracted from the input frame; the prediction error goes through 8x8 DCT → Quantizer → Entropy Coder. A feedback loop (Inverse Quantizer → 8x8 IDCT → Delayed Frame Memory → Motion Compensation) builds the prediction; the switch is open for intraframe and closed for interframe coding. Motion Estimation produces motion vectors, which go to an Entropy Coder.]

[Figure slides: JPEG and DCT: motion; Classical orthogonal transforms]

Principles of overlapping

[Figure: original image (32 x 32 superblock, 8 x 8 pixels) and transformed image (8 x 8 coefficients)]

Lapped transforms
[Figure slide]

Blocking and Ringing Artifacts (Nagai 2001)
Ringing artifact (Wavelet, GenLOT)

How to Reduce the Ringing?
[Figure: input, quantization error and output for the basis of GenLOT and the basis of GULLOT, towards high frequency]

1-D base example
[Figure slide]

GULLOT Results
Coded Yogi image at 0.3bps: GULB (26.29 dB), GenLOT (26.06 dB)

[Figure slide: Wavelet/Blocking equivalence]

Principles of wavelet image coding: SPIHT/JPEG-2000

SPIHT/JPEG 2000 principles
1-Level Wavelet Decomposition (2D DWT)

[Diagram: the input image is filtered row-wise by H1 (low pass) and H2 (high pass), each followed by decimation by 2; each output is then filtered column-wise by H1 and H2, again followed by decimation by 2]
• LL component: H1 (low pass), then H1 (low pass)
• HL component: H1 (low pass), then H2 (high pass)
• LH component: H2 (high pass), then H1 (low pass)
• HH component: H2 (high pass), then H2 (high pass)

Filter: $y[n] = \sum_{k=0}^{L} x[n-k] \, h_i[k]$
Decimator: keep one out of two pixels

Multi-Level Wavelet Decomposition
[Diagram: a first 2D-DWT splits the image into LL, HL1, LH1, HH1; a second 2D-DWT applied to LL gives LL, HL2, LH2, HH2]

Bitplanes and Self-Similarity Across Scales
[Figure slide]

Spatial Orientation Trees

Some Definitions: • O (i,j): set of coordinates of all offspring of node (i,j).

• D (i,j): set of coordinates of all descendants of node (i,j).

• H : set of coordinates of all spatial orientation tree roots.

• L (i,j): D (i,j) - O (i,j).

Coding Algorithm (SPIHT)
• Key ideas:
  • Ordered bit-plane transmission.
  • Multi-pass zero-tree coding.
  • Exploitation of self-similarity across scales of the 2-D DWT.
• Three lists are defined:
  1. LIS: List of Insignificant Sets
     • Type A: entries are elements D (i,j)
     • Type B: entries are elements L (i,j)
  2. LIP: List of Insignificant Pixels
  3. LSP: List of Significant Pixels
• Significance test:
  $$S_n(T) = \begin{cases} 1, & \max_{(i,j) \in T} |c_{i,j}| \ge 2^n \\ 0, & \text{otherwise} \end{cases}$$

SPIHT: Example

Wavelet coefficients:
 63 -34  49  10   7  13 -12   7
-31  23  14 -13   3   4   6  -1
 15  14   3 -12   5  -7   3   9
 -9  -7 -14   8   4  -2   3   2
 -5   9  -1  47   4   6  -2   2
  3   0  -3   2   3  -2   0   4
  2  -3   6  -4   3   6   3   6
  5  11   5   6   0   3  -4   4

Coefficient coordinates   Coefficient value   Binary symbols   Reconstruction value
(0,0)                      63                 1 0               48
(1,0)                     -34                 1 1              -48
(0,1)                     -31                 0                  0
(1,1)                      23                 0                  0
(1,0)                     -34                 1
(2,0)                      49                 1 0               48
(3,0)                      10                 0                  0
(2,1)                      14                 0                  0
(3,1)                     -13                 0                  0
(0,1)                     -31                 1
(0,2)                      15                 0                  0
(1,2)                      14                 0                  0
(0,3)                      -9                 0                  0
(1,3)                      -7                 0                  0
(1,1)                      23                 0
(1,0)                     -34                 0
(0,1)                     -31                 1
(0,2)                      15                 0
(1,2)                      14                 1
(2,4)                      -1                 0                  0
(3,4)                      47                 1 0               48
(2,5)                      -3                 0                  0
(3,5)                       2                 0                  0
(0,3)                      -9                 0
(1,3)                      -7                 0

Some Examples
[Figures: original image and reconstructions at 96x, 48x, 22x and 16x compression]

JPEG-2000

Introduction
• Image compression has to:
  → Reduce storage and bandwidth requirements
  → Allow different extraction modes
• JPEG-2000 provides:
  → Low bit-rate compression performance
  → Progressive transmission by quality, resolution, component, or spatial locality
  → Lossy and lossless compression
  → Random access to the bitstream
  → Region of Interest coding
  → Robustness to bit errors

Codec Structure
[Figure slide]

Source Image Model
• One or several components in the image
• Components can be at different resolutions (different sizes)

Intercomponent Transform
• Reduces the correlation between components
• Maps image data from RGB to YCrCb
• Advantages:
  → Improves coding efficiency
  → Allows visually relevant quantization
• Two transforms:
  → Irreversible color transform (ICT)
  → Reversible color transform (RCT)

Reversible Component Transformation
• Used for lossless or lossy coding
• Advantages:
  → Reasonable color space
  → Ability to have lossless compression

$$\begin{pmatrix} Y_r \\ V_r \\ U_r \end{pmatrix} = \begin{pmatrix} \left\lfloor \frac{R + 2G + B}{4} \right\rfloor \\ R - G \\ B - G \end{pmatrix} \qquad \begin{pmatrix} G \\ R \\ B \end{pmatrix} = \begin{pmatrix} Y_r - \left\lfloor \frac{U_r + V_r}{4} \right\rfloor \\ V_r + G \\ U_r + G \end{pmatrix}$$
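The reversibility of this transform is easy to check in integer arithmetic. The sketch below assumes that the divisions by 4 are floor divisions (Python's `//`, which also floors negative values); the function names are illustrative.

```python
def rct_forward(r, g, b):
    """Reversible color transform on integer samples."""
    y = (r + 2 * g + b) // 4      # floor division
    v = r - g
    u = b - g
    return y, u, v

def rct_inverse(y, u, v):
    """Exact inverse: y equals g + floor((u + v) / 4), so g can be recovered first."""
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b

# Round trip over a coarse grid of 8-bit RGB values.
lossless = all(rct_inverse(*rct_forward(r, g, b)) == (r, g, b)
               for r in range(0, 256, 15)
               for g in range(0, 256, 15)
               for b in range(0, 256, 15))
print(rct_forward(255, 0, 0), lossless)
```

The round trip is exact because R + 2G + B = 4G + U_r + V_r, so the floor term removed in the inverse is the same one that was added in the forward transform.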

[Figure: an example]

Wavelet Transform
• 5/3 transform: reversible
  → Integer-to-integer transform
  → Can be used for both lossless and lossy coding
• 9/7 transform: nonreversible
  → Real-to-real transform
  → Can only be used for lossy coding

Quantization

• A uniform scalar quantization with dead-zone about the origin
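A minimal sketch of such a quantizer, assuming the usual index rule q = sign(x)·floor(|x|/Δ) and a decoder that reconstructs at a fraction r of the way into each nonzero bin (r = 0.5 is an arbitrary choice here); the function names are illustrative.

```python
import numpy as np

def deadzone_quantize(x, step):
    """Index q = sign(x) * floor(|x| / step); every input with |x| < step maps to 0."""
    x = np.asarray(x, dtype=float)
    return (np.sign(x) * np.floor(np.abs(x) / step)).astype(int)

def deadzone_dequantize(q, step, r=0.5):
    """Reconstruct nonzero indices at (|q| + r) * step; index 0 stays at 0."""
    q = np.asarray(q)
    return np.sign(q) * (np.abs(q) + r) * step

x = np.array([-2.5, -0.9, -0.2, 0.0, 0.7, 1.0, 3.9])
q = deadzone_quantize(x, step=1.0)
x_hat = deadzone_dequantize(q, step=1.0)
print(q, x_hat)
```

The zero bin covers (-Δ, Δ), twice the width of the other bins, so small (often noisy) coefficients are all mapped to zero.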

A zero output may be produced for larger values on the input, to avoid recording noise.

Progression
• Different ordering of the packets in the code stream
• 4 types of progression:
  → Resolution
  → Quality
  → Spatial location
  → Component
• Progression type can be changed during coding

Progression (2)
• Progression by resolution

Progression (3)
• Progression by quality

Region of Interest
• Coding different regions of the image with different quality
• Used when certain parts of the image are of higher importance
• ROI coding:
  → General scaling-based method
  → MAXSHIFT method

General Scaling-Based
• Idea: scale (shift) coefficients such that the bits associated with the ROI are in higher bit-planes
• Some bits of the ROI might be encoded together with non-ROI bits

General Scaling-Based (2)
• Steps:
  • Wavelet transform
  • ROI mask is derived
  • Quantization
  • Non-ROI coefficients are downscaled
  • Entropy coding
  • Scaling value and ROI coordinates are included

MAXSHIFT Method
• Scaling value S is chosen such that the minimum ROI coefficient is larger than the maximum non-ROI coefficient
• Advantages:
  → Allows arbitrarily shaped ROIs
  → No ROI mask is needed

Scalability
• The ability to achieve coding of more than one quality and/or resolution simultaneously
• Two important types:
  → SNR scalability
  → Spatial or resolution scalability
• Advantages:
  → No need to know the target bit rate/resolution
  → No need for multiple compressions
  → Resilience to transmission errors

SNR Scalability
• The bit stream can be decompressed at different quality levels (SNR)

[Figure: decompressed image “bike” at (a) 0.125 b/p, (b) 0.25 b/p, (c) 0.5 b/p]

Spatial Scalability
• The bit stream can be decompressed at different resolution levels

Scalability (2)
• Combination of spatial and SNR scalability
• Changing the progression type

JPEG-2000 vs. JPEG
[Figure: compression at 0.25 b/p by means of (a) JPEG, (b) JPEG-2000]
[Figure: compression at 0.2 b/p by means of (a) JPEG, (b) JPEG-2000]

Comparison of JPEG 2000 with JPEG
• Much smaller files
• Much better quality
[Figure: 0.08 bpp J2K image (8 KB); 0.1563 bpp JPEG image (16 KB)]

Illustration
• Region of Interest (ROI) encoding
[Figure: raw image; 0.07 bpp J2K image with ROI; 0.07 bpp J2K image without ROI]

References

• M.D. Adams, “The JPEG-2000 Still Image Compression Standard”, ISO/IEC JTC1/SC29/WG1 (ITU-T SG8), 2001

• D. Taubman, E. Ordentlich, I. Ueno, “Embedded Block Coding”, Proc. Int. Conf. on Image Processing (ICIP 2000), Vol. II, pp. 33-36, 2000

• M.W. Marcellin, M.J. Gormish, A. Bilgin, M.P. Boliek, “An Overview of JPEG-2000”, Proc. of IEEE Data Compression Conference, pp. 523-541, 2000

• A. Skodras, C. Christopoulos, T. Ebrahimi, “The JPEG 2000 Still Image Compression Standard”, IEEE Signal Processing Magazine, pp. 36-60, September 2001

• D.S. Taubman, M.W. Marcellin, “JPEG2000: Image Compression Fundamentals, Standards, and Practice”, Kluwer Academic Publishers, 2001

• G. Cena, P. Montuschi, L. Ciminiera, A. Sanna, “A Q-Coder Algorithm with Carry Free Addition”, Proc. 13th IEEE Symposium on Computer Arithmetic, pp. 282-290, July 1997

• S.Y. Choo, G. Chew, “JPEG 2000 and Wavelet Compression”, http://www-ise.stanford.edu/class/psych221/00/shuoyen/

[Figure slide: JPEG-2000 Parts]