Transform Coding


Transform Coding

• Principle of block-wise transform coding
• Properties of orthonormal transforms
• Discrete cosine transform (DCT)
• Bit allocation for transform coefficients
• Entropy coding of transform coefficients
• Typical coding artifacts
• Fast implementation of the DCT

Thomas Wiegand: Digital Image Communication

Transform Coding Principle

• Structure: transform coder (T) and decoder (T^-1), with a quantizer operating on the transform coefficients
• Entropy coding and a transmission channel are inserted between the quantizer and the inverse transform
[Figure: block diagrams of the coder/decoder chain u -> T -> U -> quantizer -> V -> T^-1 -> v, with entropy coder and channel between quantizer and inverse transform]

Transform Coding and Quantization

• Transformation of vector u = (u1, u2, ..., uN) into U = (U1, U2, ..., UN)
• Quantizer Q may be applied to the coefficients Ui
  • separately (scalar quantization: low complexity)
  • jointly (vector quantization: may require high complexity; exploits redundancy between the Ui)
• Inverse transformation of vector V = (V1, V2, ..., VN) into v = (v1, v2, ..., vN)

Why should we use a transform?
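The coder/decoder chain above can be sketched in a few lines of NumPy. The 2-point orthonormal transform and the uniform quantizer step size below are my own illustrative choices, not values from the slides:

```python
import numpy as np

# Sketch of the chain u -> T -> U -> Q -> V -> T^-1 -> v.
# T is a 2-point orthonormal transform (assumed example): T @ T.T = I.
T = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)

def quantize(U, step):
    """Scalar (separate) quantization of each coefficient U_i."""
    return step * np.round(U / step)

u = np.array([100.0, 104.0])   # strongly correlated sample pair
U = T @ u                      # forward transform
V = quantize(U, step=4.0)      # quantizer Q, applied per coefficient
v = T.T @ V                    # inverse transform (T^-1 = T^T)

print(U)   # most of the energy lands in the first coefficient
print(v)   # reconstruction close to u despite coarse quantization
```

Because the transform concentrates the energy of the correlated pair into one coefficient, the quantization error in the small coefficient costs little.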
Geometrical Interpretation

• A linear transform can decorrelate random variables
• An orthonormal transform is a rotation of the signal vector around the origin
• Parseval's theorem holds for orthonormal transforms
• Example for N = 2 with DC basis function (1, 1)/sqrt(2) and AC basis function (1, -1)/sqrt(2):
  u = (1, 2)^T,  U = Tu = 1/sqrt(2) [1 1; 1 -1] (1, 2)^T = 1/sqrt(2) (3, -1)^T

Properties of Orthonormal Transforms

• Forward transform: U = Tu, with input signal block u of size N, transform matrix T of size NxN, and N transform coefficients U
• Orthonormal transform property: the inverse transform is u = T^-1 U = T^T U
• Linearity: u is represented as a linear combination of "basis functions"

Transform Coding of Images

• Exploit horizontal and vertical dependencies by processing blocks
[Figure: original and reconstructed images; image block -> transform T -> quantization -> entropy coding -> transmission -> inverse transform T^-1 -> reconstructed image block]  (from: Girod)

Separable Orthonormal Transforms, I

• Problem: the size of the vectors is N*N (typical value of N: 8)
• An orthonormal transform is separable if the transform of an N*N signal block can be expressed as U = T u T^T, with N*N block of transform coefficients U and orthonormal transform matrix T of size N*N
• The inverse transform is u = T^T U T
• Great practical importance: the transform requires two matrix multiplications of size N*N instead of one multiplication of a vector of size 1*N^2 with a matrix of size N^2*N^2
• Reduction of the complexity from O(N^4) to O(N^3)  (from: Girod)
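The rotation example and the separable form can be checked directly; the 2x2 block below is an assumed toy input, not from the slides:

```python
import numpy as np

# The 2-point orthonormal transform from the slide's example.
T2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

# 1-D example: u = (1, 2)^T is rotated, not scaled.
u = np.array([1.0, 2.0])
U = T2 @ u
print(U)                          # (3, -1)/sqrt(2)
assert np.isclose(u @ u, U @ U)   # Parseval: ||u||^2 == ||U||^2

# Separable 2-D transform of an N x N block: U = T u T^T, u = T^T U T.
block = np.array([[1.0, 2.0], [3.0, 4.0]])
U2 = T2 @ block @ T2.T            # forward 2-D transform
back = T2.T @ U2 @ T2             # inverse 2-D transform
assert np.allclose(back, block)   # perfect reconstruction
```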
Separable Orthonormal Transforms, II

• A separable 2-D transform is realized by two 1-D transforms, along the rows and along the columns of the signal block:
  u -> (column-wise N-transform) -> Tu -> (row-wise N-transform) -> T u T^T,
  turning an N*N block of pixels into an N*N block of transform coefficients  (from: Girod)

Criteria for the Selection of a Particular Transform

• Decorrelation, energy concentration (KLT, DCT, ...): the transform should provide energy compaction
• Visually pleasant basis functions (pseudo-random noise, m-sequences, lapped transforms, ...): quantization errors make the basis functions visible
• Low complexity of computation: separability in 2-D
• Simple quantization of transform coefficients  (from: Girod)

Karhunen-Loève Transform (KLT)

• Decorrelates the elements of vector u:
  R_u = E{u u^T},  U = Tu,  R_U = E{U U^T} = T E{u u^T} T^T = T R_u T^T = diag{lambda_i}
• The basis functions are the eigenvectors of the covariance matrix of the input signal
• The KLT achieves optimum energy concentration
• Disadvantages:
  - the KLT depends on the signal statistics
  - the KLT is not separable for image blocks
  - the transform matrix cannot be factored into sparse matrices
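The KLT construction above can be sketched numerically. The AR(1)-style covariance model below is an assumed toy signal model, not from the slides:

```python
import numpy as np

# KLT sketch: transform rows = eigenvectors of the signal covariance R_u,
# so R_U = T R_u T^T becomes diagonal (decorrelated coefficients).
rho, N = 0.9, 4
# Assumed AR(1) covariance: R_u[i, j] = rho^|i - j|
R_u = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

eigvals, eigvecs = np.linalg.eigh(R_u)   # ascending eigenvalues
T = eigvecs.T[::-1]                      # eigenvectors as rows, largest first

R_U = T @ R_u @ T.T                      # covariance of the coefficients
print(np.diag(R_U))                      # decreasing variances: energy compaction
off_diag = R_U - np.diag(np.diag(R_U))
assert np.max(np.abs(off_diag)) < 1e-10  # coefficients are decorrelated
```

Note that T is computed from this particular covariance: change the signal statistics and the optimal transform changes too, which is exactly the first disadvantage listed above.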
(from: Girod)

Comparison of Various Transforms, I

Comparison of 1-D basis functions for block size N = 8:
(1) Karhunen-Loève transform (1948/1960)
(2) Haar transform (1910)
(3) Walsh-Hadamard transform (1923)
(4) Slant transform (Enomoto, Shibata, 1971)
(5) Discrete Cosine Transform (DCT) (Ahmed, Natarajan, Rao, 1974)
[Figure: basis functions of transforms (1)-(5)]

Comparison of Various Transforms, II

• Energy concentration measured for typical natural images, block size 1x32 (Lohscheller)
• The KLT is optimum
• The DCT performs only slightly worse than the KLT

DCT

• The Type-II DCT of block size M x M is defined by the transform matrix A containing the elements
  a_ik = alpha_i * cos((2k + 1) * i * pi / (2M)),  i, k = 0 ... M-1,
  with alpha_0 = sqrt(1/M) and alpha_i = sqrt(2/M) for i != 0
[Figure: 2-D basis functions of the DCT]

Discrete Cosine Transform and Discrete Fourier Transform

• Transform coding of images using the Discrete Fourier Transform (DFT): for stationary image statistics, the energy concentration properties of the DFT converge to those of the KLT for large block sizes
• Problem of block-wise DFT coding: blocking effects, due to the circular topology of the DFT and the Gibbs phenomenon
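The DCT matrix defined above can be written out directly from the formula; the 8x8 ramp block used to exercise it is an assumed example input:

```python
import numpy as np

def dct_matrix(M):
    """Type-II DCT matrix: a_ik = alpha_i * cos((2k+1) * i * pi / (2M))."""
    A = np.zeros((M, M))
    for i in range(M):
        alpha = np.sqrt(1.0 / M) if i == 0 else np.sqrt(2.0 / M)
        for k in range(M):
            A[i, k] = alpha * np.cos((2 * k + 1) * i * np.pi / (2 * M))
    return A

A = dct_matrix(8)
assert np.allclose(A @ A.T, np.eye(8))   # the DCT is orthonormal

# 2-D DCT of an 8x8 block via the separable form U = A u A^T
block = np.arange(64.0).reshape(8, 8)
U = A @ block @ A.T
print(U[0, 0])   # DC coefficient = 8 * mean of the block
```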
• Remedy: reflect the image at the block boundaries; the DFT of the larger symmetric block yields the "DCT"  (from: Girod)

Histograms of DCT Coefficients

• Image: Lena, 256x256 pixels; DCT: 8x8 pixels
• The DCT coefficients are approximately distributed like a Laplacian pdf

Distribution of the DCT Coefficients

• The Central Limit Theorem suggests that the DCT coefficients are Gaussian distributed
• Model the variance of the Gaussian DCT coefficient distribution as a random variable (Lam & Goodman, 2000):
  p(u | sigma^2) = 1/sqrt(2 pi sigma^2) * exp(-u^2 / (2 sigma^2))
• Using conditional probability:
  p(u) = Integral_0^inf p(u | sigma^2) p(sigma^2) d(sigma^2)

Distribution of the Variance

• Exponential distribution:  p(sigma^2) = mu * exp(-mu * sigma^2)

Distribution of the DCT Coefficients

  p(u) = Integral_0^inf p(u | sigma^2) p(sigma^2) d(sigma^2)
       = Integral_0^inf 1/sqrt(2 pi sigma^2) * exp(-u^2 / (2 sigma^2)) * mu * exp(-mu sigma^2) d(sigma^2)
       = sqrt(2 mu)/2 * exp(-sqrt(2 mu) |u|)
  i.e., a Laplacian distribution

Bit Allocation for Transform Coefficients, I

• Problem: divide the bit-rate R among the N transform coefficients such that the resulting distortion D is minimized:
  minimize the average distortion D(R) = (1/N) * sum_{i=1}^N Di(Ri)  subject to  (1/N) * sum_{i=1}^N Ri <= R
  (Di: distortion contributed by coefficient i; Ri: rate for coefficient i)
• Approach: minimize the Lagrangian cost function
  d/dRi [ sum_{i=1}^N Di(Ri) + lambda * sum_{i=1}^N Ri ] = dDi(Ri)/dRi + lambda = 0
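The Gaussian-mixture result above (exponential variance + conditional Gaussian = Laplacian) can be checked by simulation. This Monte Carlo check is my own illustration; the rate mu and sample count are arbitrary:

```python
import numpy as np

# If sigma^2 ~ Exponential(rate mu) and u | sigma^2 ~ N(0, sigma^2), the
# marginal of u is Laplacian with parameter lambda = sqrt(2*mu), hence
# variance 2/lambda^2 = 1/mu and mean absolute value 1/lambda.
rng = np.random.default_rng(0)
mu, n = 2.0, 200_000

var = rng.exponential(scale=1.0 / mu, size=n)   # p(sigma^2) = mu*exp(-mu*sigma^2)
u = rng.normal(0.0, np.sqrt(var))               # conditional Gaussian samples

print(np.var(u))               # should be close to 1/mu = 0.5
print(np.mean(np.abs(u)))      # should be close to 1/sqrt(2*mu) ≈ 0.5
```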
• Solution: Pareto condition
  dDi(Ri)/dRi = -lambda
• Move bits from a coefficient with small distortion reduction per bit to a coefficient with larger distortion reduction per bit

Bit Allocation for Transform Coefficients, II

• Assumption: the high-rate approximations are valid:
  Di(Ri) ~ a * sigma_i^2 * 2^(-2 Ri),   dDi(Ri)/dRi ~ -2a ln2 * sigma_i^2 * 2^(-2 Ri) = -lambda
• Solving for the rates under the constraint R = (1/N) * sum_i Ri yields
  Ri = R + (1/2) * log2( sigma_i^2 / sigma~^2 ),
  with the geometric mean of the variances  sigma~^2 = ( Prod_{i=1}^N sigma_i^2 )^(1/N)
• Operational distortion-rate function for transform coding:
  D(R) = (1/N) * sum_{i=1}^N Di(Ri) ~ (1/N) * sum_{i=1}^N a * sigma_i^2 * 2^(-2 Ri) = a * sigma~^2 * 2^(-2R)

Entropy Coding of Transform Coefficients

• The previous derivation assumes:  Ri = max( (1/2) * log2(sigma_i^2 / D), 0 ) bit
• The AC coefficients are very likely to be zero
• Ordering of the transform coefficients by a zig-zag scan, starting at the DC coefficient
• Design a Huffman code for the event: number of zeros and the coefficient value
• An arithmetic code may be simpler
[Figure: probability that a coefficient is not zero when quantizing with Vi = 100*round(Ui/100), over the 8x8 coefficient positions]

Entropy Coding of Transform Coefficients, II

[Figure: encoder example showing an 8x8 image block, its DCT coefficients, and the quantized coefficients]
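The high-rate bit-allocation rule derived above can be sketched as follows. The coefficient variances are assumed toy values, and the constant a is set to 1 for simplicity:

```python
import numpy as np

def allocate_bits(variances, R):
    """R_i = R + 0.5 * log2(sigma_i^2 / geometric mean of the variances)."""
    variances = np.asarray(variances, dtype=float)
    geo_mean = np.exp(np.mean(np.log(variances)))
    return R + 0.5 * np.log2(variances / geo_mean), geo_mean

var = [100.0, 25.0, 4.0, 1.0]          # assumed coefficient variances
R = 3.0                                # average rate in bits per coefficient
R_i, geo = allocate_bits(var, R)

print(R_i)                             # more bits for high-variance coefficients
assert np.isclose(np.mean(R_i), R)     # the rate constraint is met exactly

D_i = np.asarray(var) * 2.0 ** (-2 * R_i)   # per-coefficient distortion (a = 1)
assert np.allclose(D_i, D_i[0])        # Pareto condition: equal distortion
print(geo * 2.0 ** (-2 * R))           # D(R) = geo_mean * 2^(-2R), same value
```

Note that the rule can produce negative R_i for very small variances at low R; the max(., 0) clamp on the entropy-coding slide handles exactly that case.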
Recommended publications
  • The Discrete Cosine Transform (DCT): Theory and Application
    The Discrete Cosine Transform (DCT): Theory and Application. Syed Ali Khayam, Department of Electrical & Computer Engineering, Michigan State University, March 10th, 2003. This document is intended to be tutorial in nature. No prior knowledge of image processing concepts is assumed. Interested readers should follow the references for advanced material on DCT. ECE 802 – 602: Information Theory and Coding, Seminar 1 – The Discrete Cosine Transform: Theory and Application. 1. Introduction. Transform coding constitutes an integral component of contemporary image/video processing applications. Transform coding relies on the premise that pixels in an image exhibit a certain level of correlation with their neighboring pixels. Similarly in a video transmission system, adjacent pixels in consecutive frames show very high correlation. Consequently, these correlations can be exploited to predict the value of a pixel from its respective neighbors. A transformation is, therefore, defined to map this spatial (correlated) data into transformed (uncorrelated) coefficients. Clearly, the transformation should utilize the fact that the information content of an individual pixel is relatively small, i.e., to a large extent the visual contribution of a pixel can be predicted using its neighbors. A typical image/video transmission system is outlined in Figure 1. The objective of the source encoder is to exploit the redundancies in image data to provide compression. In other words, the source encoder reduces the entropy, which in our case means a decrease in the average number of bits required to represent the image. On the contrary, the channel encoder adds redundancy to the output of the source encoder in order to enhance the reliability of the transmission.
  • Block Transforms in Progressive Image Coding
    Block Transforms in Progressive Image Coding. Trac D. Tran and Truong Q. Nguyen. 1. Introduction. Block transform coding and subband coding have been two dominant techniques in existing image compression standards and implementations. Both methods actually exhibit many similarities: relying on a certain transform to convert the input image to a more decorrelated representation, then utilizing the same basic building blocks such as bit allocator, quantizer, and entropy coder to achieve compression. Block transform coders enjoyed success first due to their low complexity in implementation and their reasonable performance. The most popular block transform coder leads to the current image compression standard JPEG [1], which utilizes the 8x8 Discrete Cosine Transform (DCT) at its transformation stage. At high bit rates (1 bpp and up), JPEG offers almost visually lossless reconstruction image quality. However, when more compression is needed, i.e., at lower bit rates, annoying blocking artifacts show up because of two reasons: (i) the DCT bases are short, non-overlapped, and have discontinuities at the ends; (ii) JPEG processes each image block independently, so inter-block correlation has been completely abandoned. The development of the lapped orthogonal transform [2] and its generalized version GenLOT [3, 4] helps solve the blocking problem to a certain extent by borrowing pixels from the adjacent blocks to produce the transform coefficients of the current block.
  • Video/Image Compression Technologies an Overview
    Video/Image Compression Technologies: An Overview. Ahmad Ansari, Ph.D., Principal Member of Technical Staff, SBC Technology Resources, Inc., 9505 Arboretum Blvd., Austin, TX 78759, (512) 372-5653, [email protected]. May 15, 2001 - A. C. Ansari. Introduction: impact of digitization, sampling and quantization on compression. Lossless compression: bit plane coding; predictive coding. Lossy compression: transform coding (MPEG-X); vector quantization (VQ); subband coding (wavelets); fractals; model-based coding. Digitization impact: generating a large number of bits impacts storage and transmission; image/video is correlated; the human visual system has limitations. Types of redundancies: spatial (correlation between neighboring pixel values); spectral (correlation between different color planes or spectral bands); temporal (correlation between different frames in a video sequence). Known facts: a higher sampling rate results in higher pixel-to-pixel correlation; increasing the number of quantization levels reduces pixel-to-pixel correlation. Lossless compression is numerically identical to the original content on a pixel-by-pixel basis; motion compensation is not used. Applications: medical imaging; contribution video applications. Techniques: bit plane coding; lossless predictive coding (DPCM, Huffman coding of differential frames, arithmetic coding of differential frames). Bit plane coding: a video frame with NxN pixels, each pixel encoded by K bits, is converted into K binary frames of size NxN, and each binary frame is encoded independently (run-length encoding, Gray coding, arithmetic coding).
  • Predictive and Transform Coding
    Lecture 14: Predictive and Transform Coding. Thinh Nguyen, Oregon State University. Outline: PCM (pulse code modulation); DPCM (differential pulse code modulation); transform coding; JPEG. Digital signal representation: loss from A/D conversion; aliasing (worse than quantization loss) due to sampling; loss due to quantization. [Figure: analog signal -> A/D converter (sampling, quantization) -> digital signal processing -> D/A converter -> analog signal] For perfect reconstruction, the sampling rate (Nyquist frequency) needs to be twice the maximum frequency of the signal. However, in practice, loss still occurs due to quantization. Finer quantization leads to less error at the expense of an increased number of bits to represent signals. Audio sampling: human hearing frequency range: 20 Hz to 20 kHz; voice: 50 Hz to 2 kHz. What is the sampling rate to avoid aliasing (worse than losing information)? Audio CD: 44100 Hz. Audio quantization: sample precision is the resolution of the signal; quantization depends on the number of bits used. Voice quality: 8-bit quantization, 8000 Hz sampling rate (64 kbps). CD quality: 16-bit quantization, 44100 Hz (705.6 kbps for mono, 1.411 Mbps for stereo). Pulse Code Modulation (PCM): the 2-step process of sampling and quantization is known as pulse code modulation; used in speech and CD recording; no compression, unlike MP3. DPCM (Differential Pulse Code Modulation), simple example: code the following value sequence: 1.4, 1.75, 2.05, 2.5, 2.4. Quantization step: 0.2. Predictor: current value = previous quantized value + quantized error.
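The DPCM example in the excerpt above can be worked through in code. This is my own sketch of it: the initial prediction of 0 and the rounding convention are assumptions the excerpt does not spell out:

```python
# DPCM example from the excerpt: sequence 1.4 1.75 2.05 2.5 2.4,
# quantization step 0.2, predictor = previously reconstructed value.
step = 0.2
values = [1.4, 1.75, 2.05, 2.5, 2.4]

recon = []
prev = 0.0                             # assumed initial prediction
for x in values:
    err = x - prev                     # prediction error
    q_err = step * round(err / step)   # quantized error (nearest multiple of step)
    prev = prev + q_err                # reconstruction = prediction + quantized error
    recon.append(round(prev, 2))

print(recon)   # each reconstruction stays within step/2 of its input
```

Only the quantized errors need to be transmitted; because consecutive values are close, the errors are small and cheap to code.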
  • Speech Compression
    information Review Speech Compression Jerry D. Gibson Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93118, USA; [email protected]; Tel.: +1-805-893-6187 Academic Editor: Khalid Sayood Received: 22 April 2016; Accepted: 30 May 2016; Published: 3 June 2016 Abstract: Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed. Keywords: speech coding; voice coding; speech coding standards; speech coding performance; linear prediction of speech 1. Introduction Speech coding is a critical technology for digital cellular communications, voice over Internet protocol (VoIP), voice response applications, and videoconferencing systems. In this paper, we present an abridged history of speech compression, a development of the dominant speech compression techniques, and a discussion of selected speech coding standards and their performance. We also discuss the future evolution of speech compression and speech compression research. We specifically develop the connection between rate distortion theory and speech compression, including rate distortion bounds for speech codecs. We use the terms speech compression, speech coding, and voice coding interchangeably in this paper. The voice signal contains not only what is said but also the vocal and aural characteristics of the speaker. As a consequence, it is usually desired to reproduce the voice signal, since we are interested in not only knowing what was said, but also in being able to identify the speaker.
  • Explainable Machine Learning Based Transform Coding for High
    Explainable Machine Learning based Transform Coding for High Efficiency Intra Prediction. Na Li, Yun Zhang, Senior Member, IEEE, C.-C. Jay Kuo, Fellow, IEEE. Abstract—Machine learning techniques provide a chance to explore the coding performance potential of transform. In this work, we propose an explainable transform based intra video coding to improve the coding efficiency. Firstly, we model machine learning based transform design as an optimization problem of maximizing the energy compaction or decorrelation capability. The explainable machine learning based transform, i.e., Subspace Approximation with Adjusted Bias (Saab) transform, is analyzed and compared with the mainstream Discrete Cosine Transform (DCT) on their energy compaction and decorrelation capabilities. Secondly, we propose a Saab transform based intra video coding framework with off-line Saab transform learning. Meanwhile, intra mode dependent Saab transform is developed. Then, Rate-Distortion (RD) gain of Saab transform based intra video coding is theoretically and experimentally analyzed in detail. Finally, (VVC) [1], the video compression ratio is doubled almost every ten years. Although researchers keep on developing video coding techniques in the past decades, there is still a big gap between the improvement on compression ratio and the volume increase on global video data. Higher coding efficiency techniques are highly desired. In the latest three generations of video coding standards, a hybrid video coding framework has been adopted, which is composed of predictive coding, transform coding and entropy coding. Firstly, predictive coding is to remove the spatial and temporal redundancies of video content on the basis of exploiting correlation among spatial neighboring blocks and temporal successive frames.
  • Signal Processing, IEEE Transactions On
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 50, NO. 11, NOVEMBER 2002, p. 2843. An Improvement to Multiple Description Transform Coding. Yao Wang, Senior Member, IEEE, Amy R. Reibman, Senior Member, IEEE, Michael T. Orchard, Fellow, IEEE, and Hamid Jafarkhani, Senior Member, IEEE. Abstract—A multiple description transform coding (MDTC) method has been reported previously. The redundancy rate distortion (RRD) performance of this coding scheme for the independent and identically distributed (i.i.d.) two-dimensional (2-D) Gaussian source has been analyzed using mean squared error (MSE) as the distortion measure. At the small redundancy region, the MDTC scheme can achieve excellent RRD performance because a small increase in redundancy can reduce the single description distortion at a rate faster than exponential, but the performance of MDTC becomes increasingly poor at larger redundancies. This paper describes a generalization of the MDTC (GMDTC) scheme, which introduces redundancy both by transform and through correcting the error resulting from a single description. Its RRD performance is closer to the theoretical bound in the entire range. For a comprehensive review of the literature in both theoretical and algorithmic development, see the comprehensive review paper by Goyal [1]. The performance of an MD coder can be evaluated by the redundancy rate distortion (RRD) function, which measures how fast the side distortion decreases with increasing redundancy when the central distortion is fixed. As background material, we first present a bound on the RRD curve for an i.i.d. Gaussian source with the MSE as the distortion measure. This bound was derived by Goyal and Kovacevic [2] and was translated from the achievable region for multiple descriptions, which was previously derived by Ozarow [3].
  • Nonlinear Transform Coding Johannes Ballé, Philip A
    Nonlinear Transform Coding. Johannes Ballé, Philip A. Chou, David Minnen, Saurabh Singh, Nick Johnston, Eirikur Agustsson, Sung Jin Hwang, George Toderici. Google Research, Mountain View, CA 94043, USA. {jballe, philchou, dminnen, saurabhsingh, nickj, eirikur, sjhwang, gtoderici}@google.com. Abstract—We review a class of methods that can be collected under the name nonlinear transform coding (NTC), which over the past few years have become competitive with the best linear transform codecs for images, and have superseded them in terms of rate–distortion performance under established perceptual quality metrics such as MS-SSIM. We assess the empirical rate–distortion performance of NTC with the help of simple example sources, for which the optimal performance of a vector quantizer is easier to estimate than with natural data sources. To this end, we introduce a novel variant of entropy-constrained vector quantization. We provide an analysis of various forms of stochastic optimization techniques for NTC models; review architectures of transforms based on artificial neural networks, as well as learned entropy models; and provide a direct comparison. Transform coding separates the task of decorrelating a source from coding it. Any source can be optimally compressed in theory using vector quantization (VQ) [2]. However, in general, VQ quickly becomes computationally infeasible for sources of more than a handful of dimensions, mainly because the codebook of reproduction vectors, as well as the computational complexity of the search for the best reproduction of the source vector, grow exponentially with the number of dimensions. TC simplifies quantization and coding by first mapping the source vector into a latent space via a decorrelating invertible transform, such as the Karhunen–Loève Transform (KLT), and then separately quantizing and coding each of the latent dimensions.
  • Introduction to Data Compression
    Introduction to Data Compression. Guy E. Blelloch, Computer Science Department, Carnegie Mellon University, [email protected], October 16, 2001. Contents: 1 Introduction; 2 Information Theory (2.1 Entropy; 2.2 The Entropy of the English Language; 2.3 Conditional Entropy and Markov Chains); 3 Probability Coding (3.1 Prefix Codes; 3.1.1 Relationship to Entropy; 3.2 Huffman Codes; 3.2.1 Combining Messages; 3.2.2 Minimum Variance Huffman Codes; 3.3 Arithmetic Coding; 3.3.1 Integer Implementation); 4 Applications of Probability Coding (4.1 Run-length Coding; 4.2 Move-To-Front Coding; 4.3 Residual Coding: JPEG-LS; 4.4 Context Coding: JBIG; 4.5 Context Coding: PPM); 5 The Lempel-Ziv Algorithms (5.1 Lempel-Ziv 77 (Sliding Windows); 5.2 Lempel-Ziv-Welch); 6 Other Lossless Compression (6.1 Burrows Wheeler); 7 Lossy Compression Techniques (7.1 Scalar Quantization ...). This is an early draft of a chapter of a book I'm starting to write on "algorithms in the real world". There are surely many mistakes, and please feel free to point them out. In general the lossless compression part is more polished than the lossy compression part. Some of the text and figures in the Lossy Compression sections are from scribe notes taken by Ben Liblit at UC Berkeley. Thanks for many comments from students that helped improve the presentation. © 2000, 2001 Guy Blelloch.
  • Beyond Traditional Transform Coding
    Beyond Traditional Transform Coding, by Vivek K Goyal. B.S. (University of Iowa) 1993; B.S.E. (University of Iowa) 1993; M.S. (University of California, Berkeley) 1995. A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Engineering–Electrical Engineering and Computer Sciences in the Graduate Division of the University of California, Berkeley. Committee in charge: Professor Martin Vetterli, Chair; Professor Venkat Anantharam; Professor Bin Yu. Fall 1998. Copyright 1998 by Vivek K Goyal; printed with minor corrections 1999. Abstract: Since its inception in 1956, transform coding has become the most successful and pervasive technique for lossy compression of audio, images, and video. In conventional transform coding, the original signal is mapped to an intermediary by a linear transform; the final compressed form is produced by scalar quantization of the intermediary and entropy coding. The transform is much more than a conceptual aid; it makes computations with large amounts of data feasible. This thesis extends the traditional theory toward several goals: improved compression of nonstationary or non-Gaussian signals with unknown parameters; robust joint source–channel coding for erasure channels; and computational complexity reduction or optimization. The first contribution of the thesis is an exploration of the use of frames, which are overcomplete sets of vectors in Hilbert spaces, to form efficient signal representations. Linear transforms based on frames give representations with robustness to random additive noise and quantization, but with poor rate–distortion characteristics.
  • Waveform-Based Coding: Transform and Predictive Coding Outline
    Waveform-Based Coding: Transform and Predictive Coding. Yao Wang, Polytechnic University, Brooklyn, NY 11201, http://eeweb.poly.edu/~yao. Based on: Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communications, Prentice Hall, 2002. Outline: overview of video coding systems; transform coding; predictive coding. (Yao Wang, 2003) Components in a coding system: focus of this lecture. Encoder block diagram of a typical block-based video coder (assuming no intra prediction): Lectures 3&4: motion estimation; Lecture 5: variable length coding; last lecture: scalar and vector quantization; this lecture: DCT, wavelet and predictive coding. A review of vector quantization. Motivation: quantize a group of samples (a vector) together, to exploit the correlation between samples. Each sample vector is replaced by one of the representative vectors (or patterns) that often occur in the signal, typically a block of 4x4 pixels. Design is limited by the ability to obtain training samples; implementation is limited by the large number of nearest-neighbor comparisons, which is exponential in the block size. Transform coding. Motivation: represent a vector (e.g. a block of image samples) as the superposition of some typical vectors (block patterns); quantize and code the coefficients. Can be thought of as a constrained vector quantizer.
  • A Hybrid Coder for Code-Excited Linear Predictive Coding of Images
    International Symposium on Signal Processing and its Applications, ISSPA, Gold Coast, Australia, 25-30 August 1996. Organized by the Signal Processing Research Centre, QUT, Brisbane, Australia. A Hybrid Coder for Code-Excited Linear Predictive Coding of Images. Farshid Golchin and Kuldip K. Paliwal, School of Microelectronic Engineering, Griffith University, Brisbane, QLD 4111, Australia. ABSTRACT: In this paper, we propose a hybrid image coding system for encoding images at bit-rates below 0.5 bpp. The hybrid coder is a combination of conventional discrete cosine transform (DCT) based coding and code-excited linear predictive coding (CELP). The DCT-based component allows the efficient coding of edge areas and strong textures in the image, which are the areas where CELP-based coders fail. On the other hand, CELP coding is used to code smooth areas, which constitute the majority of a typical image, efficiently and relatively free of artifacts. Edges are well preserved, even at low bit-rates. This led us to investigate combining CELP coding with another technique, such as the DCT, which is better suited to encoding edge areas. The resultant hybrid coder is able to use CELP for efficiently encoding the smoother regions while the DCT-based coder is used to encode the edge areas. In this paper, we describe the hybrid coder and examine its results when applied to a typical monochrome image. 2. CELP Coding of Images. In CELP coders, source signals are approximated by the output of a synthesis filter (IIR) when an excitation (or code) vector is