X-Media: Audio, Image and Video Coding
Laurent Duval (IFP Energies nouvelles), Eric Debes (Thalès), Philippe Morosini (Supélec)
Course materials: Moodle/NTNoe, http://www.laurent-duval.eu/lcd-lecture-supelec-xmedia.html

General information

Contents
• Introduction to X-Media coding: generic principles
• Audio data: recording/sampling, physiology
• Data, audio coding: LZ*, MPEG Layer 3
• Image coding: JPEG vs. JPEG 2000
• Video coding: MPEG formats, H.264
• Bonuses, exercises

Initial motivation
• Coding/compression as a DSP engineer discipline
  ° information reduction, standards & adaptation, complexity, integrity & security issues, interaction with the SP pipeline
• Composite domain, yet ubiquitous
  ° sampling (Nyquist-Shannon, filter banks), statistical SP (KLT, decorrelation), transforms (Fourier, wavelets), classification (quantization, K-means), functional spaces (bases & frames), information theory (entropy), error measurements, modelling

Initial motivation
[Diagram: Digital Image Processing topic map]
• Digital Image Characteristics: Spatial, Spectral, Gray-level Histogram, DFT, DCT
• Pre-Processing: Enhancement, Restoration, Point Processing, Masking, Filtering, Degradation Models, Inverse Filtering, Wiener Filtering
• Compression: Information Theory, Lossless, Lossy, LZW (gif), Transform-based (jpeg)
• Segmentation: Edge Detection
• Description: Shape Descriptors, Texture, Morphology

Initial motivation
• Exemplar
  ° underlines the importance of specific steps (related to other lectures) to make stuff work
  ° which algorithm for which task? process text? sound? images? video?
  ° "the toolbox quote" (Juran)
• Evolving
  ° steady evolution of standards, tools
  ° overview of future directions (and tools)
• Central to the SP task
  ° what is really important in my data?
  ° (stored) info. overflow + degradations (decon.)

Principles
• What is data (image) compression?
  Data compression is the art and science of representing information in a compact form.
  Data is a sequence of symbols taken from a discrete alphabet.
  ° Text: sequence of characters/bytes (0.5 D)
  ° Sound: collection of arrays of values representing intensities (1 D to 1.5 D)
  ° Still image data: collection of arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel) (2 D or 2.5 D)
  ° Video: sequence of still images (3 D)
  ° Next: 3D-TV (audio, video, ambiance, smell?)

Flurry of formats?

Examples of data compression extensions
• Some (standard) extensions
  GIF, RAR, ZIP, BZ2, MP3, MPEG4, AVC, PNG, J2K, BH, AVI, R(A)M, LHA, OGG, AAC, HE-AAC, MPC, OGM, APE, TIFF, JP2, JPEG-XR (WMP/HD Photo), WAV, PAK, FLV, FLAC, PDF, MAT, BMP, Z, GZ, MJPEG, 7z, TTA, DjVu, WebP
  http://en.wikipedia.org/wiki/List_of_archive_formats
• Targeted at what kind of data?

Why?

Why do we need image compression?
Still image
• One A4 page at 600 dpi is > 100 MB.
• One color image from a digital camera generates 10-30 MB.
• A scanned 3” × 7” photograph at 300 dpi is 30 MB.
HDTV video
• 720 × 1280 pixels/frame × 60 frames/s × 24 bits/pixel = 1.3 Gb/s
• HDTV bandwidth: 20 Mb/s
• Objective: about 70× reduction
• Equivalent: 0.35 bits/pixel
• Infotrends (2008/01): total number of digital pictures taken since the beginning ~ 180 billion. Should grow to 347 billion in 2012.

Why do we need image compression?
1) Storage
2) Transmission (cable, satellite, wifi)
3) Data access
• 1990-2000 disk capacities: 100 MB → 20 GB (200 times!) (3-4 TB)
• but seek time: 15 ms → 10 ms (3-4 ms)
• and transfer rate: 1 MB/s → 2 MB/s (much better for SSD)
• Compression improves overall response time in some applications

Source of images
• Image scanner
• Digital camera
• Video camera
• Ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), digital X-ray (XR), infrared
• Remote sensing, seismics, satellite, radar, SAR

Data types
[Diagram: image compression vs. universal compression; data types shown: binary images, gray-scale images, colour palette images, true colour images, video images, textual data]
Why do we need specific algorithms?

Binary image: 1 bit/pel

Grayscale image: 8 bits/pel
Intensity = 0-255

Parameters of digital images
[Figure: effect of gray-level quantization at 6 bits (64 gray levels), 4 bits (16 gray levels) and 2 bits (4 gray levels), and of spatial resolution at 384 × 256, 192 × 128, 96 × 64 and 48 × 32]

True color image: 3*8 bits/pel

Goals of compression
• Balance redundancy and irrelevancy
  ° sources of redundancy: temporal, spatial, color, other?
  ° sources of irrelevancy: perceptually unimportant information
  ° issues: redundancy/irrelevancy examples? lossless/lossy choices

Lossy vs. lossless compression
• Lossless compression: reversible, information preserving
  ° text compression algorithms, binary images, palette images
• Lossy compression: irreversible
  ° grayscale, color, video
• Near-lossless compression: medical imaging, remote sensing
1) Why do we need lossy compression?
2) When can we use lossy compression?

Rate measures
With C the size of the compressed file (in bits), N the number of pixels in the image, and N·k the size of the original file (k bits per pel):
Bitrate:
\[ \text{bitrate} = \frac{C}{N} \quad \text{(bits/pel)} \]
Compression ratio:
\[ \text{ratio} = \frac{N \cdot k}{C} \]

Distortion measures
Here x_i and y_i denote corresponding samples of the original and reconstructed signals, and \sigma^2 is the signal variance.
Mean absolute error (MAE):
\[ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - x_i| \]
Mean square error (MSE):
\[ \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - x_i)^2 \]
Signal-to-noise ratio (SNR), in decibels:
\[ \mathrm{SNR} = 10 \cdot \log_{10} \left[ \frac{\sigma^2}{\mathrm{MSE}} \right] \]
Peak signal-to-noise ratio (PSNR), in decibels:
\[ \mathrm{PSNR} = 10 \cdot \log_{10} \left[ \frac{A^2}{\mathrm{MSE}} \right] \]
A is the peak amplitude of the signal: A = 2^8 - 1 = 255 for an 8-bit signal.
Other measures: l_p norms, SSIM, MOS

Other issues
• Coder and decoder computation complexity
• Memory requirements
• Fixed rate or variable rate
• Error resilience
• Symmetric or asymmetric algorithms
• Decompress at multiple resolutions
• Decompress at various bit rates
• Standard or proprietary
Reduce redundancy/irrelevancy at "each" step, considering performance and quality
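
The rate and distortion measures from the "Rate measures" and "Distortion measures" slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy sample values are hypothetical, the SNR uses the variance of the original signal for σ², and the HDTV check assumes 24 bits/pixel for the raw video (the slide gives only the resulting 1.3 Gb/s).

```python
import math


def bitrate(compressed_bits, num_pixels):
    """Bitrate in bits/pel: C / N."""
    return compressed_bits / num_pixels


def compression_ratio(num_pixels, bits_per_pixel, compressed_bits):
    """Original size N*k divided by compressed size C."""
    return num_pixels * bits_per_pixel / compressed_bits


def mae(x, y):
    """Mean absolute error between two equal-length sample lists."""
    return sum(abs(b - a) for a, b in zip(x, y)) / len(x)


def mse(x, y):
    """Mean square error between two equal-length sample lists."""
    return sum((b - a) ** 2 for a, b in zip(x, y)) / len(x)


def snr_db(x, y):
    """SNR in dB; sigma^2 is taken as the variance of x (an assumption)."""
    mean = sum(x) / len(x)
    variance = sum((a - mean) ** 2 for a in x) / len(x)
    return 10 * math.log10(variance / mse(x, y))


def psnr_db(x, y, peak=255):
    """PSNR in dB; peak = 2**8 - 1 = 255 for 8-bit samples."""
    return 10 * math.log10(peak ** 2 / mse(x, y))


# Hypothetical 8-bit samples: an original row and its lossy reconstruction.
original = [52, 55, 61, 66, 70, 61, 64, 73]
decoded = [54, 55, 60, 66, 68, 62, 64, 72]
print("MAE :", mae(original, decoded))
print("MSE :", mse(original, decoded))
print("SNR :", round(snr_db(original, decoded), 2), "dB")
print("PSNR:", round(psnr_db(original, decoded), 2), "dB")

# HDTV figures from the slides, per second of video (24 bits/pixel assumed).
pixels_per_second = 720 * 1280 * 60
channel_bits_per_second = 20_000_000
hdtv_ratio = compression_ratio(pixels_per_second, 24, channel_bits_per_second)
hdtv_bpp = bitrate(channel_bits_per_second, pixels_per_second)
print("HDTV compression ratio:", round(hdtv_ratio, 1))  # about 66
print("HDTV bits/pixel       :", round(hdtv_bpp, 2))    # about 0.36
```

The HDTV numbers come out close to the rounded "70×" and "0.35 bits/pixel" quoted on the slide, which suggests those figures were rounded from the same calculation.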

What is an image?

Ultimate storage
• Image compression: why?
  ° storage, transmission, database indexing
  ° processing (denoising, scaling, rotation)
• How come?
  ° 512 × 512 pix 8-bit image → 2,097,152 bits
  ° Far less than 10^100 atoms in the entire Universe
  ° 10^15 directions, magn., depth and exposure params
  ° Tautavel Man takes 1000 pix/s (since 450,000 BC)
  ° A collection of 1.42 × 10^176 pix.
• typical compressed image size?
• # of bits needed: 17; 586; 10,253; 12,087,300; more?

What is a compression system?
[Diagram: compression model with blocks f(x,y), Transform, Quantize, Encode]
• Source
• Channel

What is a compression system?
[Block diagram: Data, Pre-Processing, Blocking, Transform Processing, Classifier, Adaptive Compression Rate Allocator, Quality Reduction, Bit Ordering, Entropy Coding, Embedded Coding, Stream]

Compression scheme (1)
modeling = parameters + model error

Compression scheme (2)
Stages: transform, reduction, coding
xO = 1 3 7 2 -5 -1 0 0 0
xR = 0 2 6 2 -6 0 0 0 0
xC = 0 1 3 1 -3 4

Preprocessing
[Compression system block diagram, with the Pre-Processing stage detailed]
• Image analysis
• Filtering/enhancement
• RGB to YUV
• Image extension

RGB color space
[Figure: Red, Green and Blue planes]

RGB → YUV
Forward transform:
\[
\begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix} =
\begin{pmatrix}
0.299 & 0.587 & 0.114 \\
-0.16875 & -0.33126 & 0.5 \\
0.5 & -0.41869 & -0.08131
\end{pmatrix}
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
\]
Inverse transform:
\[
\begin{pmatrix} R \\ G \\ B \end{pmatrix} =
\begin{pmatrix}
1.0 & 0 & 1.402 \\
1.0 & -0.34413 & -0.71414 \\
1.0 & 1.772 & 0
\end{pmatrix}
\begin{pmatrix} Y \\ C_b \\ C_r \end{pmatrix}
\]
Approximation:
\[ Y = 0.3 \cdot R + 0.6 \cdot G + 0.1 \cdot B, \quad U = B - Y, \quad V = R - Y \]
R, G, B: red, green, blue
Y: the luminance
U, V: the chrominance components
Most of the information is concentrated in the Y component; the U and V components carry less information.
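
The RGB → YUV conversion can be sketched directly from the two matrices given on the slide. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the chrominance values are left centred on zero (no +128 offset, which the slide does not mention), and no rounding or clipping to 8 bits is applied.

```python
# Forward (RGB -> Y, Cb, Cr) and inverse matrices, copied from the slide.
FORWARD = (
    (0.299, 0.587, 0.114),
    (-0.16875, -0.33126, 0.5),
    (0.5, -0.41869, -0.08131),
)
INVERSE = (
    (1.0, 0.0, 1.402),
    (1.0, -0.34413, -0.71414),
    (1.0, 1.772, 0.0),
)


def apply_matrix(matrix, triple):
    """Multiply a 3x3 matrix by a 3-component colour vector."""
    return tuple(sum(c * v for c, v in zip(row, triple)) for row in matrix)


def rgb_to_ycbcr(rgb):
    """Map (R, G, B) to (Y, Cb, Cr) with zero-centred chrominance."""
    return apply_matrix(FORWARD, rgb)


def ycbcr_to_rgb(ycc):
    """Map (Y, Cb, Cr) back to (R, G, B)."""
    return apply_matrix(INVERSE, ycc)


# A neutral gray pixel has (almost) no chrominance: all energy goes to Y.
gray = rgb_to_ycbcr((128, 128, 128))
print("gray  ->", [round(v, 3) for v in gray])

# A saturated colour round-trips up to the rounding of the coefficients.
pixel = (200, 100, 50)
restored = ycbcr_to_rgb(rgb_to_ycbcr(pixel))
print("pixel ->", [round(v, 2) for v in restored])
```

The gray pixel shows why the luminance channel carries most of the information: for achromatic content the two chrominance outputs are essentially zero.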

YUV color space
[Figure: Y, U and V planes]

Blocking
[Compression system block diagram, with the Blocking stage detailed: JPEG blocking, irregular tiling, segmentation]

Blocking
• Goals:
  ° To exploit image nonstationarities
  ° To reduce the computational cost
  ° To exploit inter-block dependencies (2D-3D)
  ° To select objects of interest (moving)

1D signal blocking
• How do we perform decomposition?
  ° block by block
  ° with overlap

2D image blocking
[Figure: JPEG blocking, irregular tiling, segmentation]

Transform Coding
[Compression system block diagram, with the Transform stage detailed: Karhunen-Loève; Fourier, DCT; Walsh, Hartley; wavelet (packets)]

Transform Coding
• Goals
  ° Efficient representation I(u,v) of an image i(x,y)
  ° Data decorrelation (KLT optimality?)
• Properties
  ° Linear transforms (matrix op.)
  ° Orthogonality
  ° Fast algorithms for compression/decompression
  ° Laplace-Gauss distribution (var. length coding)

Karhunen-Loève (Hotelling) Transform
\[ x : N \times 1 \ \text{vector} \]
\[ m_x = E\{x\} \quad \text{(mean vector)} \]
\[ C_x = E\{(x - m_x)(x - m_x)^T\} \quad \text{(covariance matrix)} \]
\[ \lambda_i, \ i = 1, 2, \ldots, N \quad \text{(eigenvalues of } C_x \text{)} \]
\[ e_i \quad \text{(eigenvectors corresponding to } \lambda_i \text{)} \]
\[ A \quad \text{(matrix whose rows are the } e_i \text{)} \]
Hotelling transform of x:
\[ y = A (x - m_x) \]

Singular Value Decomposition

Transform Coding
• An optimality result:
  ° Simplest stationary source model AR(1): DCT ~ KLT
• Troubles:
  ° KLT calculations + overhead
  ° After a transformation, correlation can be made very small, but coefficients are far from being independent!
• Transform affects coding

Transform Coding
• Choices
  ° Good decorrelation properties
  ° Low complexity, HW implementation (DCT, Fourier)
  ° Side effects (extension, zero or linear phase)
• A great deal of nice transforms
  ° Walsh-Hadamard system (with fast transforms)
  ° Wavelet (Haar 1910, Mallat, Daubechies 1988)
  ° Lapped transforms (Malvar, Meyer)

Transform Coding:
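
The Hotelling transform y = A(x − m_x) from the Karhunen-Loève slide can be illustrated for the smallest interesting case, N = 2. The sketch below is an assumption-laden toy: the sample pairs are invented (think of two neighbouring pixel values), the function names are hypothetical, and the eigenvectors of the 2 × 2 covariance matrix are obtained in closed form from a rotation angle rather than with a general eigen-solver.

```python
import math


def mean_vector(samples):
    """Sample estimate of m_x = E{x} for 2-component vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(2)]


def covariance(samples):
    """Sample estimate of C_x = E{(x - m_x)(x - m_x)^T}."""
    m = mean_vector(samples)
    n = len(samples)
    c = [[0.0, 0.0], [0.0, 0.0]]
    for s in samples:
        d = [s[0] - m[0], s[1] - m[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / n
    return c


def klt_matrix(cov):
    """Rows are the unit eigenvectors of a symmetric 2x2 matrix,
    with the eigenvector of the largest eigenvalue first."""
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    theta = 0.5 * math.atan2(2 * b, a - c)
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


def hotelling(samples):
    """Apply y = A (x - m_x) to every sample."""
    m = mean_vector(samples)
    A = klt_matrix(covariance(samples))
    return [[A[i][0] * (s[0] - m[0]) + A[i][1] * (s[1] - m[1])
             for i in range(2)] for s in samples]


# Hypothetical correlated pairs, e.g. values of two neighbouring pixels.
samples = [(2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 6.0), (6.0, 7.0), (7.0, 8.0)]
transformed = hotelling(samples)

print("covariance before:", covariance(samples))
print("covariance after :", covariance(transformed))
```

After the transform the off-diagonal covariance term vanishes (up to floating-point error) and most of the variance is packed into the first coefficient, which is the decorrelation property the slides refer to. The transform matrix depends on the data, which is the "KLT calculations + overhead" trouble listed above.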