Learning, Selection and Coding of New Block Transforms in and for the Optimization Loop of Video Coders Saurabh Puri


To cite this version: Saurabh Puri. Learning, selection and coding of new block transforms in and for the optimization loop of video coders. Computer Science [cs]. Université Bretagne Loire; Université de Nantes; LS2N, Université de Nantes, 2017. English. tel-01779566.

HAL Id: tel-01779566, https://tel.archives-ouvertes.fr/tel-01779566. Submitted on 26 Apr 2018.

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Doctoral thesis (Thèse de Doctorat) by Saurabh PURI, presented for the degree of Docteur de l'Université de Nantes, under the seal of the Université Bretagne Loire. Doctoral school: Sciences et technologies de l'information, et mathématiques. Discipline: Signal and image processing, CNU section 27. Research unit: Laboratoire des Sciences du Numérique de Nantes (LS2N). Defence scheduled for 9 November 2017.

JURY
President: Mme Christine GUILLEMOT, Directrice de Recherche, INRIA Rennes
Reviewers: M. Olivier DEFORGES, Professor, INSA Rennes; M. Marco CAGNAZZO, Associate Professor, TELECOM-ParisTech
Examiner: M. André KAUP, Professor, FAU Erlangen-Nürnberg, Germany
Thesis supervisor: M. Patrick LE CALLET, Polytech Nantes, Université de Nantes
Thesis co-supervisor: M. Sébastien LASSERRE, Principal Scientist, Technicolor France

Contents

1 General Introduction
  1.1 Context
  1.2 Objective and Scope of the Research
  1.3 Outline
  1.4 Summary of The Contributions

I Prior art on transform-based video compression

2 Video Compression, HEVC and Transform-based Coding
  2.1 Video Compression: Basic Building Blocks
  2.2 High Efficiency Video Compression Standard
    2.2.1 Sketch of the codec
    2.2.2 Quad-tree structure in HEVC
    2.2.3 Intra and Inter Prediction
    2.2.4 Transforms in HEVC
    2.2.5 Quantization in HEVC
    2.2.6 Adaptive Coefficient Scanning and Coefficient Encoding
    2.2.7 CABAC
    2.2.8 Special Modes in HEVC
    2.2.9 Encoder Control
    2.2.10 Bjøntegaard Delta (BD) Rates
  2.3 Block Transforms in general
    2.3.1 Properties of block transforms
    2.3.2 The Most Popular Transforms
  2.4 Motivation to improve the transforms used in HEVC
  2.5 Conclusion

3 Advanced Transforms
  3.1 Introduction
  3.2 Future foreseen standard transforms in JVET
    3.2.1 Enhanced Multiple Transforms (EMT)
    3.2.2 Secondary Transforms
  3.3 Other systematic Transforms
    3.3.1 Systematic Directional Transforms
  3.4 Offline learned Transforms
    3.4.1 Adaptive Karhunen-Loève Transform (KLT)
    3.4.2 Mode Dependent Transforms
    3.4.3 Rate Distortion Optimized Transforms (RDOT)
  3.5 Online learned transforms
    3.5.1 Content Adaptive Transforms (CAT)
    3.5.2 Signal Dependent Transforms (SDT)
    3.5.3 Other online transform learning schemes
  3.6 Discussion
  3.7 Conclusion

II The machinery on transform learning and coding

4 Data-Driven Transform Learning
  4.1 Analysis of the training process of data-driven adaptive transforms
    4.1.1 Elements of the training process
    4.1.2 Classification of residual blocks
    4.1.3 Computing optimized transform for each class
    4.1.4 Re-classification
    4.1.5 Finding a consistent value for λ
  4.2 Separable versus non-separable transform learning
  4.3 Discussion and Conclusion
    4.3.1 Comparison of state-of-the-art offline learning schemes

5 Online and Offline Learning
  5.1 Introduction
  5.2 Online Adaptive Transform Learning Scheme
    5.2.1 Principles of the Algorithm
    5.2.2 Proposed Adaptive Directional Transforms in HEVC
    5.2.3 Simulation Results
    5.2.4 Discussion and Conclusions
  5.3 Offline Adaptive Transform Learning scheme
    5.3.1 Learning scheme
    5.3.2 Implementation in HEVC
    5.3.3 Simulation Results
  5.4 Comparison between offline and online learning
    5.4.1 Short-comings of the online framework
  5.5 Conclusions and Perspectives
    5.5.1 Perspectives

6 Advanced signaling of transform and transform index
  6.1 Goal of this chapter
  6.2 Proposed methods for signaling the overhead
  6.3 Scheme for coding of transform basis vectors
    6.3.1 Prior art on basis vector coding
    6.3.2 Proposed methods
    6.3.3 Simulation Results
    6.3.4 Conclusion
  6.4 Proposed transform index prediction
    6.4.1 Related Works
    6.4.2 Proposed trained model-based transform index prediction
    6.4.3 Architecture and training of CNN-based model
    6.4.4 Simulation Results
    6.4.5 Conclusions
  6.5 Discussion and Conclusions

III Improving offline adaptive transform learning

7 Offline scheme improvement
  7.1 Introduction
  7.2 Observations on training sets used for the design of MDTC
  7.3 Proposed improvements to MDTC scheme to get proposed IMDTC
    7.3.1 Proposed iterative learning on training set
    7.3.2 Learning KLTs on the chroma components
    7.3.3 Extension to larger residual blocks
    7.3.4 Low-complexity IMDTC using partial transforms
  7.4 Experimental results
    7.4.1 Effect of iterating over the training set
    7.4.2 R-D performance over HEVC due to improved chroma residual coding
    7.4.3 Extension of transforms to 16×16 residual blocks
    7.4.4 Comparison of low-complexity IMDTC and full IMDTC scheme
    7.4.5 Comparison of R-D performance, encoding complexity and memory requirement with MDTC scheme
  7.5 Conclusion

8 Content adaptability improvement: playing with datasets
  8.1 Introduction
  8.2 Related works
  8.3 Generation of a pool of multiple transform sets
  8.4 Proposed pool-based transform coding scheme
  8.5 Simulation Results
  8.6 Conclusions and Perspectives

IV Conclusion and Future Perspectives

9 Conclusions and Future work
  9.1 Summary
  9.2 What we have learnt
    9.2.1 H.264 vs. HEVC
    9.2.2 Online vs. Offline
    9.2.3 Penalization and rate models
    9.2.4 Transform genericity versus specificity
  9.3 Future Work

A Smart re-ordering of coefficients during learning
B Statistical model to determine precision drop value b
C Author's publications

List of Tables
3.1 Transform basis functions for the DCT and the DST-type transforms [1]
3.2 Transform set with corresponding candidates [1]
3.3 Transform set implicitly (no signaling) chosen for each IP Mode [1]
4.1 Summary of different offline transform learning schemes
Recommended publications
  • Space Application Development Board
    The Space Application Development Board (SADB-C6727B) enables developers to implement systems using standard and custom algorithms on processor hardware similar to what would be deployed for space in a fully radiation hardened (rad-hard) design. The SADB-C6727B is devised to be similar to a space qualified rad-hardened system design using the TI SMV320C6727B-SP DSP and high-rel memories (SRAM, SDRAM, NOR FLASH) from Cobham. The VOCAL SADB-C6727B development board does not use these high reliability space qualified components but rather implements various common bus widths and memory types/speeds for algorithm development and performance evaluation.

    Audio inputs are handled by a TI ADS1278 high speed multichannel analog-to-digital converter (ADC) which has a variant qualified for space operation. Audio outputs may be generated from digital signal Pulse Density Modulation (PDM) or Pulse Width Modulation (PWM) signals to avoid the need for a space qualified digital-to-analog converter (DAC) hardware or a traditional audio codec circuit. A commercial grade AIC3106 audio codec is provided for algorithm development and for comparison of audio performance to PDM/PWM generated audio signals. Digital microphones (PDM or I2S formats) can be sampled by the processor directly or by the AIC3106 audio codec (PDM format only). Digital audio inputs/outputs can also be connected to other processors (using the McASP signals directly). Both SRAM and SDRAM memories are supported via configuration resistors to allow for developing and benchmarking algorithms using either 16-bit wide or 32-bit wide busses.
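    The PDM output path described above is typically realized with a delta-sigma style modulator. The sketch below is a minimal first-order modulator in Python, illustrative only; it is not taken from the SADB-C6727B documentation or any VOCAL software:

```python
import math

def pcm_to_pdm(samples):
    """First-order delta-sigma modulation: turn PCM samples normalized
    to [-1.0, 1.0] into a 1-bit PDM stream. Sketch only."""
    acc = 0.0            # integrator state
    bits = []
    for x in samples:
        acc += x
        if acc > 0:
            bits.append(1)
            acc -= 1.0   # feed back the quantized output (+1)
        else:
            bits.append(0)
            acc += 1.0   # feed back the quantized output (-1)
    return bits

# Example: one heavily oversampled sine cycle; the density of ones
# in the output tracks the instantaneous amplitude.
sine = [math.sin(2 * math.pi * n / 64) for n in range(64)]
print(pcm_to_pdm(sine)[:16])
```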
  • Concatenation of Compression Codecs – the Need for Objective Evaluations
    Concatenation of compression codecs – The need for objective evaluations. C.J. Dalton (UKIB).

    In this article the Author considers, firstly, a hypothetical broadcast network in which compression equipments have replaced several existing functions – resulting in multiple-cascading. Secondly, he describes a similar network that has been optimized for compression technology. Picture-quality assessment methods – both conventional and new, subjective and objective – are discussed with the aim of providing background information. Some proposals are put forward for objective evaluation together with initial observations when concatenating (cascading) codecs of similar and different types.

    1. Introduction. Bit-rate reduction (BRR) techniques – commonly referred to as compression – are now firmly established in the latest generation of broadcast television equipments. […] …fied for still-picture applications in the printing industry and was adapted to motion video applications, by encoding on a picture-by-picture basis. Various proprietary versions of JPEG, with higher compression ratios, were introduced to reduce the storage requirements, and alternative systems based on Wavelets and Fractals have been implemented. Due to the availability of integrated hardware, motion JPEG is now widely used for off- and on-line editing, disc servers, slow-motion systems, etc., but there is little standardization and few interfaces at the compressed level. The MPEG-1 compression standard was aimed at progressively-scanned moving images and has been widely adopted for CD-ROM and similar applications; it has also found limited application for broadcast video. The MPEG-2 system is specifically targeted at the complete gamut of video systems; for standard-definition television, the MP@ML codec – with compression ratios of 20:1 […]

    Original language: English. Manuscript received 15/1/97.
  • The Discrete Cosine Transform (DCT): Theory and Application
    The Discrete Cosine Transform (DCT): Theory and Application. Syed Ali Khayam, Department of Electrical & Computer Engineering, Michigan State University, March 10th 2003. (This document is intended to be tutorial in nature. No prior knowledge of image processing concepts is assumed. Interested readers should follow the references for advanced material on DCT.)

    1. Introduction. Transform coding constitutes an integral component of contemporary image/video processing applications. Transform coding relies on the premise that pixels in an image exhibit a certain level of correlation with their neighboring pixels. Similarly, in a video transmission system, adjacent pixels in consecutive frames show very high correlation. Consequently, these correlations can be exploited to predict the value of a pixel from its respective neighbors. A transformation is, therefore, defined to map this spatial (correlated) data into transformed (uncorrelated) coefficients. Clearly, the transformation should utilize the fact that the information content of an individual pixel is relatively small, i.e., to a large extent the visual contribution of a pixel can be predicted using its neighbors. A typical image/video transmission system is outlined in Figure 1. The objective of the source encoder is to exploit the redundancies in image data to provide compression. In other words, the source encoder reduces the entropy, which in our case means a decrease in the average number of bits required to represent the image. On the contrary, the channel encoder adds redundancy to the output of the source encoder in order to enhance the reliability of the transmission.
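    To make the decorrelation argument concrete, here is a small sketch (not code from the tutorial) that computes the 2D DCT-II of an N×N block directly from its definition; a flat, perfectly correlated block compacts all of its energy into the single DC coefficient:

```python
import math

def dct2d(block):
    """Direct 2D DCT-II of an N x N block (O(N^4); fine for N = 8)."""
    n = len(block)
    def c(k):  # orthonormal scaling factors
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat (maximally correlated) 8x8 block: all energy lands in the
# DC coefficient, every other coefficient is ~0.
flat = [[100.0] * 8 for _ in range(8)]
print(round(dct2d(flat)[0][0], 1))  # 800.0
```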
  • ffmpeg Documentation – Table of Contents
    ffmpeg Documentation – Table of Contents

    1 Synopsis
    2 Description
    3 Detailed description: 3.1 Filtering (3.1.1 Simple filtergraphs; 3.1.2 Complex filtergraphs); 3.2 Stream copy
    4 Stream selection
    5 Options: 5.1 Stream specifiers; 5.2 Generic options; 5.3 AVOptions; 5.4 Main options; 5.5 Video Options; 5.6 Advanced Video options; 5.7 Audio Options; 5.8 Advanced Audio options; 5.9 Subtitle options; 5.10 Advanced Subtitle options; 5.11 Advanced options; 5.12 Preset files
    6 Tips
    7 Examples: 7.1 Preset files; 7.2 Video and Audio grabbing; 7.3 X11 grabbing; 7.4 Video and Audio file format conversion
    8 Syntax: 8.1 Quoting and escaping (8.1.1 Examples); 8.2 Date; 8.3 Time duration (8.3.1 Examples); 8.4 Video size; 8.5 Video rate; 8.6 Ratio; 8.7 Color; 8.8 Channel Layout
    9 Expression Evaluation
    10 OpenCL Options
    11 Codec Options
    12 Decoders
    13 Video Decoders: 13.1 rawvideo (13.1.1 Options)
    14 Audio Decoders: 14.1 ac3 (14.1.1 AC-3 Decoder Options); 14.2 ffwavesynth; 14.3 libcelt; 14.4 libgsm; 14.5 libilbc (14.5.1 Options); 14.6 libopencore-amrnb; 14.7 libopencore-amrwb; 14.8 libopus
    15 Subtitles Decoders: 15.1 dvdsub (15.1.1 Options); 15.2 libzvbi-teletext (15.2.1 Options)
    16 Encoders
    17 Audio Encoders: 17.1 aac (17.1.1 Options); 17.2 ac3 and ac3_fixed (17.2.1 AC-3 Metadata: 17.2.1.1 Metadata Control Options, 17.2.1.2 Downmix Levels, 17.2.1.3 Audio Production Information, 17.2.1.4 Other Metadata Options; 17.2.2 Extended Bitstream Information: 17.2.2.1 Part 1, 17.2.2.2 Part 2; 17.2.3 Other AC-3 Encoding Options; 17.2.4 Floating-Point-Only AC-3 Encoding)
  • A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding
    A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding. Yibo Fan, Leilei Huang, Zheng Xie, and Xiaoyang Zeng. IEICE Trans. Electron., Vol. E100-C, No. 6, June 2017, p. 643.

    SUMMARY: In the newly finalized video coding standard, namely high efficiency video coding (HEVC), new notations like coding unit (CU), prediction unit (PU) and transformation unit (TU) are introduced to improve the coding performance. As a result, the reconstruction loop in intra encoding is heavily burdened to choose the best partitions or modes for them. In order to solve the bottleneck problems in cycle and hardware cost, this paper proposed a high-throughput and compact implementation for such a reconstruction loop. By "high-throughput", it refers to that it has a fixed throughput of 32 pixel/cycle independent of the TU/PU size (except for 4×4 TUs). By "compact", it refers to that it fully explores the reusability between discrete cosine transform (DCT) and inverse discrete cosine transform.

    From the introduction: […] 4×4, 8×8, 16×16, 32×32 and 64×64, with 35 possible prediction modes in intra prediction. Although several fast mode decision designs have been proposed, still a considerable amount of candidate PU modes, PU partitions or TU partitions are needed to be traversed by the reconstruction loop. It can be inferred that the reconstruction loop in intra prediction has become a bottleneck in cycle and hardware cost.
  • Error Correction Capacity of Unary Coding
    Error Correction Capacity of Unary Coding. Pushpa Sree Potluri.

    Abstract: Unary coding has found applications in data compression, neural network training, and in explaining the production mechanism of birdsong. Unary coding is redundant; therefore it should have inherent error correction capacity. An expression for the error correction capability of unary coding for the correction of single errors has been derived in this paper.

    1. Introduction. The unary number system is the base-1 system. It is the simplest number system to represent natural numbers. The unary code of a number n is represented by n ones followed by a zero or by n zero bits followed by a 1 bit [1]. Unary codes have found applications in data compression [2],[3], neural network training [4]-[11], and biology in the study of avian birdsong production [12]-[14]. One can also claim that the additivity of physics is somewhat like the tallying of unary coding [15],[16]. Unary coding has also been seen as the precursor to the development of number systems [17]. Some representations of the unary number system use n-1 ones followed by a zero, or the corresponding number of zeroes followed by a one. Here we use the mapping of the left column of Table 1.

    Table 1. An example of the unary code

    N    Unary code     Alternative code
    0    0              0
    1    10             01
    2    110            001
    3    1110           0001
    4    11110          00001
    5    111110         000001
    6    1111110        0000001
    7    11111110       00000001
    8    111111110      000000001
    9    1111111110     0000000001
    10   11111111110    00000000001

    The unary number system may also be seen as a space coding of numerical information where the location determines the value of the number.
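    A minimal encoder/decoder pair for the left-column mapping of Table 1 (a sketch of my own, not code from the paper):

```python
def unary_encode(n):
    """Unary code of n: n ones followed by a terminating zero."""
    return "1" * n + "0"

def unary_decode(bits):
    """Decode one unary codeword from the front of a bit string;
    returns (value, remaining bits)."""
    n = bits.index("0")      # count the ones up to the first zero
    return n, bits[n + 1:]

assert unary_encode(3) == "1110"
# A concatenated stream decodes prefix-free: "11100" -> 3, then "0" -> 0.
assert unary_decode("11100") == (3, "0")
```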
  • Dynamic Resource Management of Network-On-Chip Platforms for Multi-Stream Video Processing
    Dynamic Resource Management of Network-on-Chip Platforms for Multi-Stream Video Processing. Hashan Roshantha Mendis. Doctor of Engineering, University of York, Computer Science, March 2017.

    Abstract: This thesis considers resource management in the context of parallel multiple video stream decoding, on multicore/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements which are connected via a Network-on-Chip (NoC). Inefficient task allocation configurations can negatively affect the communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem; especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these types of applications, runtime heuristic-based task mapping techniques are required. As the application and platform size increase, decentralised resource management techniques are more desirable to overcome the reliability and performance bottlenecks in centralised management. In this work, several heuristic-based runtime resource management techniques, targeting real-time video decoding workloads, are proposed. Firstly, two admission control approaches are proposed: one fully deterministic and highly predictable; the other is heuristic-based, which balances predictability and performance. Secondly, a pair of runtime task mapping schemes are presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission controller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution in this thesis is a distributed, bio-inspired, low-overhead, task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams.
  • Influence of Speech Codecs Selection on Transcoding Steganography
    Influence of Speech Codecs Selection on Transcoding Steganography. Artur Janicki, Wojciech Mazurczyk, Krzysztof Szczypiorski. Warsaw University of Technology, Institute of Telecommunications, Warsaw, Poland, 00-665, Nowowiejska 15/19.

    Abstract: The typical approach to steganography is to compress the covert data in order to limit its size, which is reasonable in the context of a limited steganographic bandwidth. TranSteg (Transcoding Steganography) is a new IP telephony steganographic method that was recently proposed that offers high steganographic bandwidth while retaining good voice quality. In TranSteg, compression of the overt data is used to make space for the steganogram. In this paper we focus on analyzing the influence of the selection of speech codecs on hidden transmission performance, that is, which codecs would be the most advantageous ones for TranSteg. Therefore, by considering the codecs which are currently most popular for IP telephony, we aim to find out which codecs should be chosen for transcoding to minimize the negative influence on voice quality while maximizing the obtained steganographic bandwidth.

    Key words: IP telephony, network steganography, TranSteg, information hiding, speech coding.

    1. Introduction. Steganography is an ancient art that encompasses various information hiding techniques, whose aim is to embed a secret message (steganogram) into a carrier of this message. Steganographic methods are aimed at hiding the very existence of the communication, and therefore any third-party observers should remain unaware of the presence of the steganographic exchange. Steganographic carriers have evolved throughout the ages and are related to the evolution of the methods of communication between people. Thus, it is not surprising that currently telecommunication networks are a natural target for steganography.
  • Parameter Optimization in H.265 Rate-Distortion by Single-Frame Semantic Scene Analysis
    Parameter optimization in H.265 Rate-Distortion by single-frame semantic scene analysis. Ahmed M. Hamza, University of Portsmouth; Abdelrahman Abdelazim, Blackpool and the Fylde College; Djamel Ait-Boudaoud, University of Portsmouth. https://doi.org/10.2352/ISSN.2470-1173.2019.11.IPAS-262 © 2019, Society for Imaging Science and Technology.

    Abstract: The H.265/HEVC (High Efficiency Video Coding) codec and its 3D extensions have crucial rate-distortion mechanisms that help determine coding efficiency. We have introduced in this work a new system of Lagrangian parameterization in RDO cost functions, based on semantic cues in an image, starting with the current HEVC formulation of the Lagrangian hyper-parameter heuristics. Two semantic scenery flag algorithms are presented and tested within the Lagrangian formulation as weighted factors. The investigation of whether the semantic gap between the coder and the image content is holding back the block-coding mechanisms as a whole from achieving greater efficiency has yielded a positive answer.

    Introduction: […] with Q being the quantization value for the source. The relation is derived based on several assumptions about the source probability distribution within the quantization intervals, and the nature of the rate-distortion relations themselves (constantly differentiable throughout, etc.). The value initially used for c in the literature was 0.85. This was modified and made adaptive in subsequent standards including HEVC/H.265. Our work here investigates further adaptations to the rate-distortion Lagrangian by semantic algorithms, made possible by recent frameworks in computer vision.

    Adaptive Lambda Hyper-parameters in HEVC: Early versions of the rate-distortion parameters in H.263 were replaced with more sophisticated models in subsequent standards.
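    For context, the "c = 0.85" relation the excerpt alludes to is the classic H.263-era mode-decision heuristic; the HEVC-era form below is sketched from the reference-software convention and is stated here as background, not quoted from the paper:

```latex
% Classic Lagrangian heuristic for mode decision (the c = 0.85 relation):
\lambda_{\mathrm{MODE}} = c \cdot Q^{2}, \qquad c \approx 0.85
% HEVC/H.265 derives the multiplier from the quantization parameter QP
% instead (approximate form used by the HM reference software):
\lambda \approx c \cdot 2^{(QP - 12)/3}
```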
  • Image Data Compression Introduction to Coding
    Image Data Compression: Introduction to Coding. © 2018-19 Alexey Pak, Lehrstuhl für Interaktive Echtzeitsysteme, Fakultät für Informatik, KIT.

    Review: data reduction steps (discretization / digitization)

    [Slide figure: the chain from a continuous 2D signal g(x, y, t) (light intensity on the sensor) through spatial discretization g(xa, yb, t) (pixel-averaged intensity), temporal discretization g(xa, yb, ti) (pixel voltage readings), and digitization to the discrete-value, fully digital signal gq(xa, yb, ti) (e.g., number of electrons at each pixel of the CCD matrix).]

    Discretization of 1D continuous-time signals (sampling):
    • Important signal transformations: up- and down-sampling
    • Information-preserving down-sampling: rate determined based on signal bandwidth
    • Fourier space allows simple interpretation of the effects due to decimation and interpolation (techniques of up-/down-sampling)

    Scalar (one-dimensional) signal quantization of continuous-value signals:
    • Quantizer types: uniform, simple non-uniform (with a dead zone, with a limited amplitude)
    • Advanced quantizers: PDF-optimized (Max-Lloyd algorithm), perception-optimized, SNR-optimized
    • Implementation: pre-processing with a compander function + simple quantization

    Vector (multi-dimensional) signal quantization:
    • Terminology: quantization, reconstruction, codebook, distance metric, Voronoi regions, space partitioning
    • Relation to the general classification problem (from Machine Learning)
    • Linde-Buzo-Gray algorithm of constructing (sub-optimal) codebooks (aka k-means)

    LBG vector quantization – 2D example [Linde, Buzo, Gray '80]
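    The Linde-Buzo-Gray construction named above is, in essence, k-means clustering of training vectors. A bare-bones sketch for the 2D example (an illustration, not the lecture's code):

```python
import random

def lbg(points, k, iters=20):
    """Linde-Buzo-Gray / k-means: returns a codebook of k centroids
    that (sub-optimally) minimizes mean squared quantization error."""
    codebook = random.sample(points, k)          # initial codewords
    for _ in range(iters):
        # Assignment step: each point goes to its nearest codeword;
        # this induces the Voronoi partition of the input space.
        cells = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, codebook[j])))
            cells[i].append(p)
        # Update step: move each codeword to the centroid of its cell.
        for i, cell in enumerate(cells):
            if cell:
                codebook[i] = tuple(sum(c) / len(cell) for c in zip(*cell))
    return codebook

# 2D example: quantize random points with a 4-word codebook.
pts = [(random.random(), random.random()) for _ in range(200)]
print(lbg(pts, 4))
```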
  • Block Transforms in Progressive Image Coding
    Block Transforms in Progressive Image Coding. Trac D. Tran and Truong Q. Nguyen.

    1. Introduction. Block transform coding and subband coding have been two dominant techniques in existing image compression standards and implementations. Both methods actually exhibit many similarities: relying on a certain transform to convert the input image to a more decorrelated representation, then utilizing the same basic building blocks such as bit allocator, quantizer, and entropy coder to achieve compression. Block transform coders enjoyed success first due to their low complexity in implementation and their reasonable performance. The most popular block transform coder leads to the current image compression standard JPEG [1] which utilizes the 8×8 Discrete Cosine Transform (DCT) at its transformation stage. At high bit rates (1 bpp and up), JPEG offers almost visually lossless reconstruction image quality. However, when more compression is needed (i.e., at lower bit rates), annoying blocking artifacts show up because of two reasons: (i) the DCT bases are short, non-overlapped, and have discontinuities at the ends; (ii) JPEG processes each image block independently. So, inter-block correlation has been completely abandoned. The development of the lapped orthogonal transform [2] and its generalized version GenLOT [3, 4] helps solve the blocking problem to a certain extent by borrowing pixels from the adjacent blocks to produce the transform coefficients of the current block.
  • Adaptive Learning Methods and Their Use in Flaw Classification (Sriram Chavali, Iowa State University)
    Adaptive learning methods and their use in flaw classification, by Sriram Chavali. Iowa State University, Retrospective Theses and Dissertations, 1996. Follow this and additional works at: https://lib.dr.iastate.edu/rtd (Part of the Structures and Materials Commons).

    Recommended Citation: Chavali, Sriram, "Adaptive learning methods and their use in flaw classification" (1996). Retrospective Theses and Dissertations. 111. https://lib.dr.iastate.edu/rtd/111. This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

    A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE. Department: Aerospace Engineering and Engineering Mechanics. Major: Aerospace Engineering. Major Professor: Dr. Lester W. Schmerr. Iowa State University, Ames, Iowa, 1996. Copyright © Sriram Chavali, 1996. All rights reserved. This is to certify that the Master's thesis of Sriram Chavali has met the thesis requirements of Iowa State University.