1
X-Media Audio, Image and Video Coding
Laurent Duval (IFP Energies nouvelles) Eric Debes (Thalès) Philippe Morosini (Supélec)
Course materials: Moodle/NTNoe, http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html 2
General information
Contents 3
• Introduction to X-Media coding → generic principles
• Audio → data recording/sampling, physiology
• Data, audio coding → LZ*, MPEG Layer 3
• Image coding → JPEG vs. JPEG 2000
• Video coding → MPEG formats, H.264
• Bonuses, exercises
Initial motivation 4
Initial motivation 5
• Coding/compression as a DSP engineer discipline → information reduction, standards & adaptation, complexity, integrity & security issues, interaction with the SP pipeline
• Composite domain, yet ubiquitous → sampling (Nyquist-Shannon, filter banks), statistical SP (KLT, decorrelation), transforms (Fourier, wavelets), classification (quantization, K-means), functional spaces (bases & frames), information theory (entropy), error measurements, modelling
Initial motivation 6
Digital Image Processing (overview chart)
• Digital Image Characteristics: Spatial (gray-level histogram), Spectral (DFT, DCT)
• Pre-Processing: Enhancement (point processing, masking, filtering), Restoration (degradation models, inverse filtering, Wiener filtering)
• Compression: Information theory, Lossless (LZW, as in GIF), Lossy (transform-based, as in JPEG)
• Segmentation: Edge detection
• Description: Shape descriptors, Texture, Morphology
Initial motivation 7
• Exemplar → underline the importance of specific steps (related to other lectures) to make stuff work → which algorithm for which task?
  ° process text? sound? images? video?
  → "the toolbox quote" (Juran)
• Evolving → steady evolution of standards and tools, overview of future directions (and tools)
• Central to the SP task → what is really important in my data? → (stored) info. overflow + degradations (decon.)
8
Principles
Principles 9
• What is data (image) compression?
  → Data compression is the art and science of representing information in a compact form.
  → Data is a sequence of symbols taken from a discrete alphabet.
    ° Text: sequence of characters/bytes (0.5 D)
    ° Sound: collection of arrays of values representing intensities (1 D to 1.5 D)
    ° Still image data: collection of arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel) (2 D or 2.5 D)
    ° Video: sequence of still images (3 D)
    ° Next: 3D-TV (audio, video, ambiance, smell?)
10
Flurry of formats ? Examples of data compression extensions 11
• Some (standard) extensions: GIF, RAR, ZIP, BZ2, MP3, MPEG4, AVC, PNG, J2K, BH, AVI, R(A)M, LHA, OGG, AAC, HE-AAC, MPC, OGM, APE, TIFF, JP2, JPEG-XR (WMP/HD Photo), WAV, PAK, FLV, FLAC, PDF, MAT, BMP, Z, GZ, MJPEG, 7z, TTA, DjVu, WebP http://en.wikipedia.org/wiki/List_of_archive_formats
• Targeted for what kind of data? 12
Why? Why do we need Image Compression? 13
Still Image • One page of A4 format at 600 dpi is > 100 MB. • One color image in digital camera generates 10-30 MB. • Scanned 3”×7” photograph at 300 dpi is 30 MB.
HDTV Video
• 720 × 1280 pixels/frame × 24 bits/pixel × 60 frames/s ≈ 1.3 Gb/s
• HDTV bandwidth: 20 Mb/s
• Objective: about 70× reduction
• Equivalent: about 0.35 bits/pixel
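The orders of magnitude quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 24 bits/pixel colour depth and the A4 dimensions in inches (8.27 × 11.69) are assumptions that the slide does not state explicitly.

```python
# Back-of-the-envelope sizes; 24 bits/pixel (3 x 8-bit colour planes) is an assumption.
BITS_PER_PIXEL = 24


def raw_bits(width, height, bits_per_pixel=BITS_PER_PIXEL):
    """Uncompressed size, in bits, of one width x height picture."""
    return width * height * bits_per_pixel


# HDTV: 1280 x 720 pixels per frame, 60 frames per second.
pixels_per_second = 1280 * 720 * 60
hdtv_rate = raw_bits(1280, 720) * 60            # bits per second
channel_rate = 20e6                             # 20 Mb/s channel
reduction = hdtv_rate / channel_rate            # required compression factor
bits_per_pixel = channel_rate / pixels_per_second

# A4 page (assumed 8.27 in x 11.69 in) scanned at 600 dpi, 3 bytes per pixel.
a4_pixels = round(8.27 * 600) * round(11.69 * 600)
a4_bytes = a4_pixels * BITS_PER_PIXEL // 8

print(f"HDTV raw rate : {hdtv_rate / 1e9:.2f} Gb/s")
print(f"Reduction     : {reduction:.0f}x")
print(f"Budget        : {bits_per_pixel:.2f} bits/pixel")
print(f"A4 at 600 dpi : {a4_bytes / 1e6:.0f} MB")
```

With these assumptions the exact reduction factor comes out slightly below the rounded figure of 70 quoted on the slide.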
• Infotrends (2008/01): total number of digital pictures taken since the beginning ~ 180 billion. Should grow to 347 billion in 2012
Why do we need Image Compression? 14
1) Storage 2) Transmission (cable, satellite, wifi) 3) Data access
1990-2000: disk capacities 100 MB -> 20 GB (200 times!) (3-4 TB), but seek time 15 ms -> 10 ms (3-4 ms) and transfer rate 1 MB/s -> 2 MB/s (much better for SSD)
Compression improves overall response time in some applications Source of images 15
• Image scanner
• Digital camera
• Video camera
• Ultra-sound (US), Computed Tomography (CT), Magnetic resonance imaging (MRI), digital X-ray (XR), Infrared
• Remote sensing, Seismics, Satellite, Radar, SAR
Data types 16
Image compression: binary images, gray-scale images, colour palette images, true colour images, video images
Universal compression: textual data
Why do we need specific algorithms? Binary image: 1 bit/pel 17 Grayscale image: 8 bits/pel 18
Intensity = 0-255 Parameters of digital images 19
Gray-level quantization: 6 bits (64 gray levels), 4 bits (16 gray levels), 2 bits (4 gray levels)
Spatial resolution: 384 × 256, 192 × 128, 96 × 64, 48 × 32
True color image: 3*8 bits/pel 20
Goals of compression 21
• Balance redundancy and irrelevancy
  → sources of redundancy
    ° temporal
    ° spatial
    ° color
    ° other?
  → sources of irrelevancy
    ° perceptually unimportant information
  → issues: redundancy/irrelevancy
    ° examples?
    ° lossless/lossy choices
Lossy vs. Lossless compression 22
Lossless compression: reversible, information preserving text compression algorithms, binary images, palette images
Lossy compression: irreversible grayscale, color, video
Near-lossless compression: medical imaging, remote sensing.
1) Why do we need lossy compression? 2) When can we use lossy compression? Rate measures 23
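Bitrate: $\dfrac{\text{size of the compressed file}}{\text{pixels in the image}} = \dfrac{C}{N}$ bits/pel
Compression ratio: $\dfrac{\text{size of the original file}}{\text{size of the compressed file}} = \dfrac{N \cdot k}{C}$
(C: compressed size in bits, N: number of pixels, k: bits per pixel of the original)
Distortion measures 24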
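Mean absolute error (MAE): $\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - x_i|$
Mean square error (MSE): $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - x_i)^2$
Signal-to-noise ratio (SNR): $\mathrm{SNR} = 10 \cdot \log_{10}\left[\sigma^2 / \mathrm{MSE}\right]$ (decibels)
Peak signal-to-noise ratio (PSNR): $\mathrm{PSNR} = 10 \cdot \log_{10}\left[A^2 / \mathrm{MSE}\right]$ (decibels)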
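The four distortion measures can be sketched with NumPy as follows. The function names are illustrative, and taking the variance of the original signal x as the signal power in the SNR is an assumption about what the slide intends.

```python
import numpy as np


def mae(x, y):
    """Mean absolute error between original x and reconstruction y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean(np.abs(y - x)))


def mse(x, y):
    """Mean square error between original x and reconstruction y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean((y - x) ** 2))


def snr_db(x, y):
    """SNR in decibels; signal power taken as the variance of x (assumption)."""
    return float(10.0 * np.log10(np.var(np.asarray(x, dtype=float)) / mse(x, y)))


def psnr_db(x, y, peak=255.0):
    """PSNR in decibels; peak = 2**8 - 1 for 8-bit data."""
    return float(10.0 * np.log10(peak ** 2 / mse(x, y)))


# Tiny example: every sample is off by exactly one gray level.
x = np.array([10, 20, 30, 40])
y = x + np.array([1, -1, 1, -1])
print(mae(x, y), mse(x, y), round(snr_db(x, y), 2), round(psnr_db(x, y), 2))
```

Converting to float before subtracting matters in practice: 8-bit unsigned images would otherwise wrap around on negative differences.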
A is the peak amplitude of the signal: A = 2^8 − 1 = 255 for an 8-bit signal.
Other measures: l_p norms, SSIM, MOS
Other issues 25
• Coder and decoder computational complexity
• Memory requirements
• Fixed rate or variable rate
• Error resilience
• Symmetric or asymmetric algorithms
• Decompress at multiple resolutions
• Decompress at various bit rates
• Standard or proprietary
Reduce redundancy/irrelevancy at "each" step considering performance and quality
What is an image? 26
What is an image? 27
What is an image? 28
Ultimate storage 29
• Image compression: why?
  • storage, transmission, database indexing
  • processing (denoising, scaling, rotation)
• How come?
  • 512 x 512 pix 8-bit image → 2,097,152 bits
  • Far less than 10^100 atoms in the entire Universe
  • 10^15 directions, magn., depth and exposure params
  • Tautavel Man takes 1000 pix/s (since 450,000 BC)
  • A collection of 1,42.10 176 pix.
  • typical compressed image size?
  • # of bits needed: 17, 586, 10.253, 12.087.300, more?
What is a compression system? 30
Compression Model: f(x,y) → Transform → Quantize → Encode (• Source • Channel)
What is a compression system? 31
[Compression system block diagram: Pre-Processing, Blocking, Data Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator, Quality and Embedded Coding blocks]
Compression scheme (1) 32
modeling
=
parameters
+ model error Compression scheme (2) 33
transform → reduction → coding
xO = 1 3 7 2 -5 -1 0 0 0
xR = 0 2 6 2 -6 0 0 0 0
xC = 0 1 3 1 -3 4
Preprocessing 34
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Pre-Processing (image analysis, filtering/enhancement, RGB to YUV, extension)]
Preprocessing 35
• Image Analysis
• Filtering/enhancement
• RGB to YUV
• Image Extension RGB color space 36
Red Green Blue RGB → YUV 37
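$$\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
$Y = 0.3 \cdot R + 0.6 \cdot G + 0.1 \cdot B$, $U = B - Y$, $V = R - Y$
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.0 & 0 & 1.402 \\ 1.0 & -0.34413 & -0.71414 \\ 1.0 & 1.772 & 0 \end{bmatrix} \begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix}$$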
R, G, B -- red, green, blue Y -- the luminance U,V -- the chrominance components
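As a minimal sketch, the forward and inverse colour transforms of this slide can be applied as 3 × 3 matrix products. The coefficients are copied from the slide; the function names and the pixels-as-rows layout are illustrative choices.

```python
import numpy as np

# Forward matrix (rows: Y, Cb, Cr) and inverse matrix, coefficients as on the slide.
RGB_TO_YCBCR = np.array([
    [0.299, 0.587, 0.114],
    [-0.16875, -0.33126, 0.5],
    [0.5, -0.41869, -0.08131],
])
YCBCR_TO_RGB = np.array([
    [1.0, 0.0, 1.402],
    [1.0, -0.34413, -0.71414],
    [1.0, 1.772, 0.0],
])


def rgb_to_ycbcr(rgb):
    """rgb: array of shape (..., 3); returns (Y, Cb, Cr) with the same shape."""
    return np.asarray(rgb, dtype=float) @ RGB_TO_YCBCR.T


def ycbcr_to_rgb(ycc):
    """Inverse mapping, using the rounded inverse matrix."""
    return np.asarray(ycc, dtype=float) @ YCBCR_TO_RGB.T


pixels = np.array([[255, 0, 0], [0, 255, 0], [12, 200, 77], [128, 128, 128]])
ycc = rgb_to_ycbcr(pixels)
print(np.round(ycc, 2))                  # the gray pixel has (almost) zero chrominance
print(np.round(ycbcr_to_rgb(ycc), 2))    # round trip, up to coefficient rounding
```

Because the printed coefficients are rounded, the round trip is only approximate; the error stays far below one gray level for 8-bit data.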
Most of the information is concentrated in the Y component, while the U and V components carry less information. YUV color space 38
Y U V Blocking 39
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Blocking (JPEG blocking, irregular tiling, segmentation)]
Blocking 40
• Goals:
• To exploit image non-stationarities
• To reduce the computational cost
• To exploit inter-block dependencies (2D-3D)
• To select objects of interest (moving) 1D Signal Blocking 41
• How do we perform decomposition? • block by block • with overlap 2D Image Blocking 42
JPEG blocking Irregular tiling Segmentation Transform Coding 43
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Transform (Karhunen-Loève; Fourier, DCT; Walsh, Hartley; wavelet (packets))]
Transform Coding 44
• Goals
  • Efficient representation I(u,v) of an image i(x,y)
  • Data decorrelation (KLT optimality?)
• Properties
  • Linear transforms (matrix op.)
  • Orthogonality
  • Fast algorithms for compression/decompression
  • Laplace-Gauss distribution (var. length coding)
Karhunen-Loeve (Hotelling) Transform 45
$x$: $N \times 1$ vector
$m_x = E\{x\}$: mean vector
$C_x = E\{(x - m_x)(x - m_x)^T\}$: covariance matrix
$\lambda_i$, $i = 1, 2, \ldots, N$: eigenvalues of $C_x$
$e_i$: eigenvectors corresponding to $\lambda_i$
$A$: matrix with rows $e_i$
Hotelling transform of $x$: $y = A(x - m_x)$
Singular Value Decomposition 46
Transform Coding 47
• An optimality result:
  • Simplest stationary source model AR(1): DCT ~ KLT
• Troubles:
• KLT calculations + overhead
• After a transformation, correlation can be made very small, but coefficients are far from being independent!
• Transform affects coding
Transform Coding 48
• Choices
  • Good decorrelation properties
  • Low complexity, HW implementation (DCT, Fourier)
  • Side effects (extension, zero or linear-phase)
• A great deal of nice transforms
  • Walsh-Hadamard system (with fast transforms)
  • Wavelet (Haar 1910, Mallat, Daubechies 1988)
  • Lapped transforms (Malvar, Meyer)
Transform Coding 49
Transform Coding: performance 50
Performance measure: coding gain 51
DCT and size 52
Transform coding: optimization 53
2D Hadamard-16 54
2D – DCT 16 55
Standard transforms 56
• Limitations
  → fixed vector size
  → constrained shape (orthogonality)
  → data-driven adaptation
  → shifts and rotations robustness
  → higher dimension generalizations
  → analogy with vision aspects
    ° needed for compression: inverses/redundancy/sparsity
  → pre- and post-processing
  → coder complexity
Transform coding 57
• A common waveform
• A common representation Transform coding 58 • A less common waveform
• A less common representation Transform coding 59 • A not so common waveform
• A not so common representation Novel transforms 60
high frequencies
low frequencies Wavelet FB-II Transform Coding 61 Transform Coding 62 Processing 63
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Processing (time/freq. filtering, image analysis, texture analysis)]
Classifier 64
[Same block diagram. Highlighted stage: Classifier (spectrum allocation, texture extraction, segmentation)]
Texture Synthesis 65
Bit Reduction 66
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Bit Reduction (thresholding, subsampling, scalar quantization, vector quantization, iterations (fractals))]
Bit Reduction 67
• The lossy stage
  • Human eyes see a limited range of tones/freq
  • Real-world pictures are already imperfect
• Methods
  • Scalar quant.: uniform, log., optimal (Lloyd-Max algo)
  • Vector quant.: Voronoï diagrams, nearest neighbour
• Adaptivity
  • Pre-stored tables
  • Training set based tables
  • On-the-fly quantization estimation
Quantization 68
8-bits last 5 bits last 4 bits Quantization 69
last 3 bits last 2 bits last bit Quantization 70
8 4 3
2 1 Quantization 71 Quantization (non-uniform) 72 Quantization (adaptive) 73 Quantization (vector) 74
• Outline for images Quantization (vector) 75
[Figure: each image vector X_j is matched to the closest code vector V_k of the codebook V_1, V_2, ..., V_L]
Image vectors: $X_j$
Codebook: codevectors $V_i$, $1 \le i \le L$
Mean square error (MSE, Euclidean distance): $d(X_j, V_k) = \frac{1}{n} \sum_{i=1}^{n} (X_j(i) - V_k(i))^2$
Weighted MSE: $d(X_j, V_k) = \frac{1}{n} \sum_{i=1}^{n} W_i \, (X_j(i) - V_k(i))^2$
Quantization (vector) 76
• LBG algorithm
Quantization (vector) 77
• Simple codebooks Ë (parrots)
• Codebook for a specific feature Ë ex. edges, smooth areas, etc.
• Codebooks could be of different sizes Ordering 78
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Ordering (raster scan, zig-zag, Hilbert scan, zerotrees, object based coding)]
Ordering 79
Tree-coding ordering 80
Ordering 81
Wavelet/Blocking equivalence 82
Entropy Coding 83
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Entropy Coding (RLE, Huffman, arithmetic, LZ*, form dictionaries)]
Entropy Coding 84
After reduction:
• Lower coefficient variance
• Many coefficients zeroed out
Arithmetic Coding 85
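The interval narrowing behind the ANTECEDENT example of this slide can be reproduced with exact fractions. This is a sketch of the interval computation only, not a full arithmetic coder (no bit output, no renormalization); the symbol intervals are those of the slide's table, and the function name is illustrative.

```python
from fractions import Fraction
from math import log2

# Cumulative intervals [low, up) per symbol, as in the slide's table.
INTERVALS = {
    "A": (Fraction(0), Fraction(1, 10)),
    "C": (Fraction(1, 10), Fraction(2, 10)),
    "D": (Fraction(2, 10), Fraction(3, 10)),
    "E": (Fraction(3, 10), Fraction(6, 10)),
    "N": (Fraction(6, 10), Fraction(8, 10)),
    "T": (Fraction(8, 10), Fraction(1)),
}


def narrow(message, intervals=INTERVALS):
    """Return the final [low, high) interval after reading every symbol."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_up = intervals[symbol]
        # Keep only the sub-interval assigned to this symbol.
        low, high = low + width * s_low, low + width * s_up
    return low, high


low, high = narrow("ANTECEDENT")
bits = -log2(float(high - low))     # ideal code length for this message
print(float(low), float(high), round(bits, 2))
```

Any number inside the final interval, such as the codeword 0.07736511, identifies the message; the interval width is the product of the symbol probabilities, which is where the 24.46 bits come from.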
Message: ANTECEDENT

Symbol  #  Huff.  Low  Up
A       1  4      0    0.1
C       1  5      0.1  0.2
D       1  5      0.2  0.3
E       3  1      0.3  0.6
N       2  2      0.6  0.8
T       2  3      0.8  1

Incoming  Low           Up
Start     0             1
A         0             0.1
N         0.06          0.08
T         0.076         0.08
E         0.0772        0.0784
C         0.07732       0.07744
E         0.077356      0.077392
D         0.0773632     0.0773668
E         0.07736428    0.07736536
N         0.077364928   0.077365144
T         0.0773651008  0.077365144

Final interval width ≈ 4.32 × 10^-8: 24.46 bits against 27 bits
Codeword 0.07736511
AAAAAAAAA END: 7 against 11 bits
Rate allocation 86
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Rate Allocator (exact bit-rate, progressive coding, distortion matching)]
Quality measure 87
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Quality (objective measures (SNR), subjective measures, HVS model)]
Embedded coding 88
[Compression system block diagram: Pre-Processing, Blocking, Image Transform, Processing, Classifier, Bit Reduction, Ordering, Entropy Coding, Stream, with Adaptive Compression, Rate Allocator and Quality blocks. Highlighted stage: Embedded coding]
Embedded quantization 89
Sign ssssssss Msb 41110000 3 x x x 1 1 0 0 2 x x x x x 0 0 1 x x x x x 1 1 Lsb 0xxxxxxx Ordering 90 Wavelet/Blocking equivalence 91 92
JPEG JPEG Principles 93
Encoder: 8×8 DCT → Quantizer → Coefficients-to-Symbols Map → Entropy Coder
JPEG Principles 94
Input Image, Size = 512 x 512 x 8 bits
JPEG Principles 95
JPEG Principles 96
$$X[u,v] = \frac{C[u]\,C[v]}{4} \sum_{m=0}^{7} \sum_{n=0}^{7} x[m,n] \cos\frac{(2m+1)u\pi}{16} \cos\frac{(2n+1)v\pi}{16}$$
$$C[u] = \begin{cases} 1/\sqrt{2}, & u = 0 \\ 1, & 1 \le u \le 7 \end{cases}$$
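The 8 × 8 DCT defined above is separable, so it can be sketched as two matrix products with an orthonormal basis matrix. The pixel block is the example block of the next slide; the function names are illustrative, and no level shift is applied, following the formula as written.

```python
import numpy as np


def dct_matrix(n=8):
    """T[u, m] = sqrt(2/n) * C[u] * cos((2m+1) u pi / (2n)); rows are basis vectors."""
    u = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    t = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * u * np.pi / (2 * n))
    t[0, :] /= np.sqrt(2.0)          # C[0] = 1/sqrt(2)
    return t


def dct2(block):
    """2-D DCT of a square block: X = T x T^T."""
    t = dct_matrix(block.shape[0])
    return t @ block @ t.T


def idct2(coeffs):
    """Inverse 2-D DCT: x = T^T X T (T is orthonormal)."""
    t = dct_matrix(coeffs.shape[0])
    return t.T @ coeffs @ t


block = np.array([
    [69, 71, 74, 76, 89, 106, 111, 122],
    [59, 70, 61, 61, 68, 76, 88, 94],
    [82, 70, 77, 67, 65, 63, 57, 70],
    [97, 99, 87, 83, 72, 72, 68, 63],
    [91, 105, 90, 95, 85, 84, 79, 75],
    [92, 110, 101, 106, 100, 94, 87, 93],
    [89, 113, 115, 124, 113, 105, 100, 110],
    [104, 110, 124, 125, 107, 95, 117, 116],
], dtype=float)

X = dct2(block)
print(np.round(X, 1))                # X[0, 0] is the DC term, sum(block) / 8
```

For n = 8 the factor sqrt(2/n) equals 1/2 per dimension, which gives the 1/4 of the formula; orthonormality also means the transform preserves the block energy.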
Input block x[m,n]:
 69  71  74  76  89 106 111 122
 59  70  61  61  68  76  88  94
 82  70  77  67  65  63  57  70
 97  99  87  83  72  72  68  63
 91 105  90  95  85  84  79  75
 92 110 101 106 100  94  87  93
 89 113 115 124 113 105 100 110
104 110 124 125 107  95 117 116

DCT coefficients X[u,v]:
717.6   0.2   0.4 -19.8  -2.1  -6.2  -5.7  -7.6
-99.0 -35.8  27.4  19.4  -2.6  -3.8   9.0   2.7
 51.8 -60.8   3.9 -11.8   1.9   4.1   1.0   6.4
 30.0 -25.1  -6.7   6.2  -4.4 -10.7  -4.2  -8.0
 22.6   2.7   4.9   3.4  -3.6   8.7  -2.7   0.9
 15.6   4.9  -7.0   1.1   2.3  -2.2   6.6  -1.7
  0.0   5.9   2.3   0.5   5.8   3.1   8.0   4.8
 -0.7  -2.3  -5.2  -1.0   3.6  -0.5   5.1  -0.1

Step 1: DCT
JPEG Principles (Transform: DCT8) 97
JPEG Principles 98
DCT coefficients:
717.6   0.2   0.4 -19.8  -2.1  -6.2  -5.7  -7.6
-99.0 -35.8  27.4  19.4  -2.6  -3.8   9.0   2.7
 51.8 -60.8   3.9 -11.8   1.9   4.1   1.0   6.4
 30.0 -25.1  -6.7   6.2  -4.4 -10.7  -4.2  -8.0
 22.6   2.7   4.9   3.4  -3.6   8.7  -2.7   0.9
 15.6   4.9  -7.0   1.1   2.3  -2.2   6.6  -1.7
  0.0   5.9   2.3   0.5   5.8   3.1   8.0   4.8
 -0.7  -2.3  -5.2  -1.0   3.6  -0.5   5.1  -0.1

divided element-wise (÷) by the quantization matrix Q:
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

gives, after rounding, the quantized coefficients:
45  0  0 -1  0  0  0  0
-8 -3  2  1  0  0  0  0
 4 -5  0  0  0  0  0  0
 2 -1  0  0  0  0  0  0
 1  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
Step 2: Quantization JPEG Principles 99 Step 3: Coefficient-to-Symbol Mapping
Input (quantized block):
45  0  0 -1  0  0  0  0
-8 -3  2  1  0  0  0  0
 4 -5  0  0  0  0  0  0
 2 -1  0  0  0  0  0  0
 1  0  0  0  0  0  0  0
 1  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

Zigzag scan procedure

Result = 45, 0, -8, 4, -3, 0, -1, 2, -5, 2, 1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 1, then 0 × 43 times (64 coefficients in total)
Symbols defined as [run of zeros, nonzero terminating value]
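The zigzag scan and the [run of zeros, nonzero value] symbols can be sketched as below. The "EOB" end-of-block marker and the function names are illustrative assumptions, and the DC coefficient is treated like any other value, as in the slide's simplified definition.

```python
def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in zigzag order."""
    order = []
    for d in range(2 * n - 1):                     # d = row + col: one anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:                             # even diagonals run bottom-left to top-right
            rows = reversed(rows)
        order.extend((r, d - r) for r in rows)
    return order


def run_length_symbols(sequence):
    """Turn a scanned sequence into (zero_run, value) pairs plus an end marker."""
    symbols, run = [], 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run:                                        # trailing zeros collapse into one marker
        symbols.append("EOB")
    return symbols


quantized = [
    [45, 0, 0, -1, 0, 0, 0, 0],
    [-8, -3, 2, 1, 0, 0, 0, 0],
    [4, -5, 0, 0, 0, 0, 0, 0],
    [2, -1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

scanned = [quantized[r][c] for r, c in zigzag_indices(8)]
symbols = run_length_symbols(scanned)
print(scanned[:21])
print(symbols)
```

The scan groups the low-frequency coefficients first, so the long tail of zeros collapses into a single end marker before entropy coding.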
Step 4: Entropy Coding • Symbols are encoded using mostly Huffman coding. • Huffman coding is a method of variable length coding in which shorter codewords are assigned to the more frequently occurring symbols. Classical orthogonal transforms 100
$[f]$: an $N \times N$ image
$[F] = V [f] H^{*}$
$V^{-1} = V^{*}$, $H^{-1} = H^{*}$
$[f] = V^{*} [F] H$
$\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |F(u,v)|^2 = \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} |f(m,n)|^2$
One JPEG block 101
JPEG compression on several blocks 102
Blocking Artifacts 103
Blocking artifact (DCT)
Examples using different quality factors to scale matrix Q
Original Image 104
Original image size: 263224 bytes
Size (bytes)  Compression
5728          46x
11956         22x
15159         17x
18032         15x
20922         13x
JPEG and DCT: motion 105
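Another DCT Application: Video Compression
[Block diagram: the Prediction is subtracted from the Input Frame to give the Prediction Error, which goes through the 8×8 DCT, the Quantizer and the Entropy Coder. A feedback loop (Inverse Quantizer, 8×8 IDCT, Delayed Frame Memory, Motion Compensation) builds the Prediction; the switch is open for intraframe and closed for interframe coding. Motion Estimation produces Motion Vectors, which go to an Entropy Coder.]
JPEG and DCT: motion 106
Classical orthogonal transforms 107
Principles of overlapping 108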
32 x 32 superblock, 8 x 8 pixels, 8 x 8 coefficients
Original image, Transformed image
Lapped transforms 109
Blocking and Ringing Artifacts (Nagai 2001) 110
Ringing artifact (Wavelet, GenLOT)
How to Reduce the Ringing? 111
[Figure: input, quantization error and output for the basis of GenLOT and the basis of GULLOT, up to high frequency]
1-D base example 112
GULLOT Results 113
Coded Yogi image at 0.3bps
GULB (26.29 dB), GenLOT (26.06 dB)
Wavelet/Blocking equivalence 114
115
Principles of wavelet image coding SPIHT/JPEG-2000 SPIHT/JPEG 2000 principles 116 1-Level Wavelet Decomposition (2D DWT)
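[Filter-bank diagram: each filter is followed by a decimation by 2]
Input Image
• Row-wise operations: H1 (low pass), ↓2
  • Column-wise operations: H1 (low pass), ↓2 → LL Component
  • Column-wise operations: H2 (high pass), ↓2 → HL Component
• Row-wise operations: H2 (high pass), ↓2
  • Column-wise operations: H1 (low pass), ↓2 → LH Component
  • Column-wise operations: H2 (high pass), ↓2 → HH Component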
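A one-level 2D decomposition of this kind can be sketched with the Haar filter pair, which is an assumption here since the slide leaves H1 and H2 unspecified. The subband labels follow one reading of the diagram (row-wise filtering first, then column-wise) and may differ from the naming used in other texts; even image dimensions are assumed.

```python
import numpy as np


def analysis_1d(x, axis):
    """Haar low-pass and high-pass filtering plus decimation by 2 along one axis."""
    x = np.moveaxis(np.asarray(x, dtype=float), axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)      # H1: low pass, one output per sample pair
    high = (even - odd) / np.sqrt(2.0)     # H2: high pass, one output per sample pair
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)


def dwt2_one_level(image):
    """Row-wise filtering first, then column-wise filtering of both results."""
    row_low, row_high = analysis_1d(image, axis=1)    # filter along each row
    ll, hl = analysis_1d(row_low, axis=0)             # filter along each column
    lh, hh = analysis_1d(row_high, axis=0)
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}


flat = np.ones((4, 6))
bands = dwt2_one_level(flat)
print({name: band.shape for name, band in bands.items()})
print(bands["LL"])          # a constant image keeps all its energy in LL
```

Each subband has half the rows and half the columns of the input, so the four subbands together hold exactly as many samples as the original image.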
Filter $H_i$ followed by a decimator (↓2: keep one out of two pixels): $y[n] = \sum_{k=0}^{L} x[n-k]\, h_i[k]$
SPIHT/JPEG 2000 principles 117
Multi-Level Wavelet Decomposition
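[Figure: a first 2D-DWT splits the image into LL, HL1, LH1 and HH1; a second 2D-DWT applied to LL gives LL, HL2, LH2 and HH2, while HL1, LH1 and HH1 are kept]
SPIHT/JPEG 2000 principles 118
Bitplanes and Self-Similarity Across Scales
SPIHT/JPEG 2000 principles 119
Spatial Orientation Trees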
Some Definitions: • O (i,j): set of coordinates of all offspring of node (i,j).
• D (i,j): set of coordinates of all descendants of node (i,j).
• H : set of coordinates of all spatial orientation tree roots.
• L (i,j): D (i,j) - O (i,j).
SPIHT/JPEG 2000 principles 120
Coding Algorithm (SPIHT)
• Key Ideas:
  • Ordered bit plane transmission.
  • Multi-pass zero-tree coding.
  • Exploitation of self-similarity across scales of the 2-D DWT.
• Three lists are defined:
  1. LIS: List of Insignificant Sets
     • Type A: Entries are elements D (i,j)
     • Type B: Entries are elements L (i,j)
  2. LIP: List of Insignificant Pixels
  3. LSP: List of Significant Pixels
• Significance Test:
$$S_n(T) = \begin{cases} 1, & \max_{(i,j) \in T} |c_{i,j}| \ge 2^n \\ 0, & \text{otherwise} \end{cases}$$
SPIHT – Example 121

Wavelet coefficients:
 63 -34  49  10   7  13 -12   7
-31  23  14 -13   3   4   6  -1
 15  14   3 -12   5  -7   3   9
 -9  -7 -14   8   4  -2   3   2
 -5   9  -1  47   4   6  -2   2
  3   0  -3   2   3  -2   0   4
  2  -3   6  -4   3   6   3   6
  5  11   5   6   0   3  -4   4

Coefficient   Coefficient  Binary   Reconstruction
Coordinates   Value        Symbols  Value
(0,0)          63          1 0       48
(1,0)         -34          1 1      -48
(0,1)         -31          0          0
(1,1)          23          0          0
(1,0)         -34          1          -
(2,0)          49          1 0       48
(3,0)          10          0          0
(2,1)          14          0          0
(3,1)         -13          0          0
(0,1)         -31          1          -
(0,2)          15          0          0
(1,2)          14          0          0
(0,3)          -9          0          0
(1,3)          -7          0          0
(1,1)          23          0          -
(1,0)         -34          0          -
(0,1)         -31          1          -
(0,2)          15          0          -
(1,2)          14          1          -
(2,4)          -1          0          0
(3,4)          47          1 0       48
(2,5)          -3          0          0
(3,5)           2          0          0
(0,3)          -9          0          -
(1,3)          -7          0          -

Some Examples
Original Image 122
96x Compression 48x Compression
22x Compression 16x Compression 123
JPEG-2000
Introduction 124
• Image compression has to:
  → Reduce storage and bandwidth requirements
  → Allow different extraction modes
• JPEG-2000 provides:
  → Low bit-rate compression performance
  → Progressive transmission by quality, resolution, component, or spatial locality
  → Lossy and lossless compression
  → Random access to the bitstream
  → Region of Interest coding
  → Robustness to bit errors
Codec Structure 125
Source Image Model 126
• One or several components in the image
• Components can be at different resolutions (different sizes)
Intercomponent Transform 127
• Reduces the correlation between components
• Maps image data from RGB to YCrCb
• Advantages:
  → Improve coding efficiency
  → Allow visually relevant quantization
• Two transforms:
  → Irreversible color transform (ICT)
  → Reversible color transform (RCT)
Reversible Component Transformation 128
• Used for lossless or lossy coding
• Advantages:
  → Reasonable color space
  → Ability of having lossless compression
Forward: $Y_r = \left\lfloor \dfrac{R + 2G + B}{4} \right\rfloor$, $V_r = R - G$, $U_r = B - G$
Inverse: $G = Y_r - \left\lfloor \dfrac{U_r + V_r}{4} \right\rfloor$, $R = V_r + G$, $B = U_r + G$
(6)
Wavelet Transform 129
An example: (7)
Wavelet Transform 130
• 5/3 Transform: reversible
  → Integer to integer transform
  → Can be used both for lossless or lossy coding
• 9/7 Transform: nonreversible
  → Real to real transform
  → Can only be used for lossy coding
Quantization 131
• A uniform scalar quantization with dead-zone about the origin
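A uniform scalar quantizer with a dead-zone can be sketched as sign(x) · floor(|x| / Δ). The reconstruction offset r = 0.5 (mid-point of each bin) is an assumption, since the decoder is free to choose it, and the function names are illustrative.

```python
import numpy as np


def deadzone_quantize(x, step):
    """q = sign(x) * floor(|x| / step); every |x| < step maps to 0."""
    x = np.asarray(x, dtype=float)
    return (np.sign(x) * np.floor(np.abs(x) / step)).astype(int)


def deadzone_dequantize(q, step, r=0.5):
    """Reconstruct at (|q| + r) * step for nonzero indices, 0 otherwise."""
    q = np.asarray(q)
    return np.sign(q) * (np.abs(q) + r * (q != 0)) * step


samples = [-1.9, -0.5, 0.0, 0.9, 1.0, 2.7]
indices = deadzone_quantize(samples, step=1.0)
print(indices.tolist())
print(deadzone_dequantize(indices, step=1.0).tolist())
```

The zero bin spans (−Δ, Δ), twice the width of the other bins, which is why small coefficients are discarded.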
With the dead-zone, a zero output is produced for a wider range of input values around the origin, to avoid recording noise
Progression 132
• Different ordering of the packets in the code stream
• 4 types of progression:
  → Resolution
  → Quality
  → Spatial location
  → Component
• Progression type can be changed during coding
Progression (2) 133
• Progression by Resolution
Progression (3) 134
• Progression by Quality
Region of Interest 135
• Coding different regions of the image with different quality
• Used when certain parts of the image are of higher importance
• ROI coding:
  → General Scaling-Based method
  → MAXSHIFT method
General Scaling-Based 136
• Idea: scale (shift) coefficients s.t. the bits associated with the ROI are in higher bit-planes
• Some bits of the ROI might be encoded together with nonROI bits
General Scaling-Based (2) 137
• Steps:
  • Wavelet Transform
  • ROI Mask is derived
  • Quantization
  • nonROI coefficients are downscaled
  • Entropy coding
  • Scaling Value and ROI coordinates are included
MAXSHIFT Method 138
• Scaling value S is chosen such that the minimum ROI coefficient is larger than the maximum nonROI coefficient
• Advantages:
  → Allows arbitrary shaped ROIs
  → No ROI mask is needed
Scalability 139
• The ability to achieve coding of more than one quality and/or resolution simultaneously
• Two important types:
  → SNR Scalability
  → Spatial or Resolution Scalability
• Advantages:
  → No need to know target bit rate/resolution
  → No need for multiple compressions
  → Resilience to transmission errors
SNR Scalability 140
• The bit stream can be decompressed at different quality levels (SNR)
Decompressed image “bike” at (a) 0.125 b/p, (b) 0.25 b/p, (c) 0.5 b/p
Spatial Scalability 141
• The bit stream can be decompressed at different resolution levels
Scalability (2) 142
• Combination of Spatial and SNR
• Changing the progression type
JPEG-2000 vs. JPEG 143
(a) (b)
Compression at 0.25 b/p by means of (a) JPEG (b) JPEG-2000
JPEG-2000 vs. JPEG 144
(a) (b)
Compression at 0.2 b/p by means of (a) JPEG (b) JPEG-2000
Comparison of JPEG 2000 with JPEG 145
• Much smaller files
• Much better quality
Figure: 0.08bpp J2K Image (8KB); 0.1563bpp JPEG Image (16KB); Illustration 146 • Region of Interest (ROI) Encoding
Figure: Raw Image; 0.07bpp J2K Image with ROI; 0.07bpp J2K Image without ROI References 147
• M.D. Adams, “The JPEG-2000 Still Image Compression Standard”, ISO/IEC JTC1/SC29/WG1 (ITU-T SG8), 2001
• D. Taubman, E. Ordentlich, I. Ueno, “Embedded Block Coding”, Proc. Int. Conf. on Image Processing (ICIP 2000), Vol. II, pp. 33-36, 2000
• M.W. Marcellin, M.J. Gormish, A. Bilgin, M.P. Boliek, “An Overview of JPEG-2000”, Proc. of IEEE Data Compression Conference, pp. 523-541, 2000
• A. Skodras, C. Christopoulos, T. Ebrahimi, “The JPEG 2000 Still Image Compression Standard”, IEEE Signal Processing Magazine, pp. 36-60, September 2001
• D.S. Taubman and M.W. Marcellin, “JPEG2000: Image Compression Fundamentals, Standards, and Practice”, Kluwer Academic Publishers, 2001
• G. Cena, P. Montuschi, L. Ciminiera, A. Sanna, “A Q-Coder Algorithm with Carry Free Addition”, Proc. 13th IEEE Symposium on Computer Arithmetic, pp. 282-290, July 1997
• S.Y. Choo, G. Chew, “JPEG 2000 and Wavelet Compression”, http://www-ise.stanford.edu/class/psych221/00/shuoyen/
JPEG-2000 Parts 148