Huffman Based Code Generation Algorithms: Data Compression Perspectives

Total Page:16

File Type:pdf, Size:1020Kb

Huffman Based Code Generation Algorithms: Data Compression Perspectives Journal of Computer Science Original Research Paper Huffman Based Code Generation Algorithms: Data Compression Perspectives Ahsan Habib, M. Jahirul Islam and M. Shahidur Rahman Department of Computer Science and Engineering, Shahjalal University of Science and Technology, Sylhet, Bangladesh Article history Abstract: This article proposes two dynamic Huffman based code Received: 18-07-2018 generation algorithms, namely Octanary and Hexanary algorithm, for data Revised: 07-09-2018 compression. Faster encoding and decoding process is very important in Accepted: 06-12-2018 data compression area. We propose tribit-based (Octanary) and quadbit- based (Hexanary) algorithm and compare the performance with the existing Corresponding Author: Ahsan Habib widely used single bit (Binary) and recently introduced dibit (Quaternary) Department of Computer algorithms. The decoding algorithms for the proposed techniques have also Science and Engineering, been described. After assessing all the results, it is found that the Octanary Shahjalal University of Science and the Hexanary techniques perform better than the existing techniques in and Technology, Sylhet, terms of encoding and decoding speed. Bangladesh Email: [email protected] Keywords: Binary Tree, Quaternary Tree, Octanary Tree, Hexanary Tree, Huffman Principle, Decoding Technique, Encoding Technique, Tree Data Structure, Data Compression Introduction decompression speed of the proposed techniques are better than the others. To summarize, the proposed Huffman coding (Huffman, 1952) is very popular in techniques may be suitable for offline data compression data compression area. Nowadays it is used in data applications where encoding and decoding speed is more compression for wireless and sensor networks important with less constraint on space. (Săcăleanu et al ., 2011; Renugadevi and Darisini, 2013), data mining (Oswald and Sivaselvan, 2018; Oswald et al ., Related Works 2015). It is also found efficient for data compression in low resource systems (Radhakrishnan et al ., 2016; Matai In 1952, David Huffman introduced an algorithm et al ., 2014; Wang and Lin, 2016). The use of Huffman (Huffman, 1952) which produced optimal code for data code in word-based text compression is also very compression system. Huffman code is produced using common (Sinaga, 2015). Huffman principle produces Binary tree technique, where more frequent symbols optimal code using a Binary tree where the most frequent produce shorter codeword length and less frequent codewords are smaller in length. However, Huffman symbols produce longer codeword. Later on, so many principle does not produce a balanced tree (Rajput, popular algorithms and applications have been developed 2018). For this reason, it requires more memory to store based on Binary Huffman coding technique. Saradashri et longer codeword, and thus it also requires more time to al . (2015) explained in his book that Huffman code could decode those codewords from the memory. In this paper, also be static or dynamic. Chen et al . (Chen et al ., 1999) we first review traditional Huffman algorithm and newly introduced a method to speed up the process and reduced introduced Quaternary Huffman algorithm. Then we memory of Huffman tree. A tree clustering algorithm is introduce Octanary and Hexanary tree for construction of introduced in (Hashemian, 1995) to avoid high sparsity Huffman codes. Octanary and Hexanary structure makes of the tree. In this research, the author reduced the the underlying tree more balanced. The tree construction header size dramatically. Vitter (1987), the author and decoding algorithms for both techniques have been introduced a new method where it required less memory developed. The codeword efficiency of Binary, than the conventional Huffman technique. Chung (1997) Quaternary, Octanary and Hexanary structure have been also introduced a memory-efficient array structure to compared. The compression ratio and speed are also represent the Huffman tree. In some other researches compared for different methods using these coding codeword length of Huffman code also investigated. systems. It is found that the compression and Katona and Nemetz (1978) investigated the connection © 2018 Ahsan Habib, M. Jahirul Islam and M. Shahidur Rahman. This open access article is distributed under a Creative Commons Attribution (CC-BY) 3.0 license. Ahsan Habib et al. / Journal of Computer Science 2018, 14 (12): 1599.1610 DOI: 10.3844/jcssp.2018.1599.1610 between self-information of a source letter and its ministry (Luke 5, 2018). The frequency distribution of codeword length. A recursive Huffman algorithm is Luke 5 is shown in Fig. 3. introduced in (Lin et al ., 2012), where a tree is transformed into a recursive Huffman tree and it decoded Octanary Tree more than one symbol at a time. The decoding process Octanary tree or 8-ary tree is a tree in which each starts by reading a file bit by bit in all of the above node has 0 to 8 children (labeled as LEFT1 child, LEFT2 techniques. Recently, we introduced a code generation child, LEFT3 child, LEFT4 child, RIGHT1 child, technique based on Quaternary (dibit) Huffman tree RIGHT2 child, RIGHT3 child, RIGHT4 child). Here for (Habib and Rahman, 2017) to produce Huffman codes. constructing codes for Octanary Huffman tree, we use In this research, better encoding and decoding speed is 000 for a LEFT1 child, 001 for a LEFT2 child, 010 for a achieved by sacrificing an insignificant amount of space, where it is also found that searching two bits at a time LEFT3 child, 011 for a LEFT4 child, 100 for a RIGHT1 speed up the overall processing speed than searching a child, 101 for a RIGHT2 child, 110 for a RIGHT3 child single bit. This motivated us to search three or four bits and 111 for a RIGHT4 child. at a time. In this connection, the Octanary algorithm is The process of the construction of an Octanary tree is introduced to produce three bit based Huffman code, described below: whereas the Hexanary algorithm is introduced to produce four bit based Huffman code. The proposed • List all possible symbols with their probabilities; algorithms improved the Huffman decoding time • Find the eight symbols with the smallest probabilities compared with the existing Huffman algorithms. • Replace these by a single set containing all eight We organize the paper as follows. In section “Tree symbols, whose probability is the sum of the Structure”, traditional Binary, Quaternary, Octanary and individual probabilities Hexanary tree structures in data management system are • Repeat until the list contains single member presented. In section “Implementation”, the proposed • The octanry tree structure for Luke 5 data is shown in encoding and decoding algorithm of Octanary and Fig. 4. Hexanary techniques have been presented. Section “Result and Discussion” discusses the experimental results. Hexanary Tree Finally, Section “Conclusion” concludes the paper. Hexanary tree or 16-ary tree is a tree in which each Tree Structure node has 0 to 16 children (labeled as LEFT1 child, LEFT2 child, LEFT3 child, LEFT4 child, LEFT5 child, LEFT6 Binary and Quaternary Tree child, LEFT7 child, LEFT8 child, RIGHT1 child, RIGHT2 child, RIGHT3 child, RIGHT4 child, RIGHT5 child, A rooted tree T is called an m-ary tree if every RIGHT6 child, RIGHT7 child, RIGHT8 child). Here for internal vertex has no more than m children. The tree is called a full m-ary tree if every internal vertex has constructing codes for Hexanary Huffman tree we use exactly m children. An m-ary tree with m = 2 is called 0000 for LEFT1 child, 0001 for LEFT2 child, 0010 for a Binary tree. In a Binary tree, if an internal vertex has LEFT3 child, 0011 for LEFT4 child, 0100 for LEFT5 two children, the first child is called the LEFT child child, 0101 for LEFT6 child, 0110 for LEFT7 child, 0111 and the second child is called the RIGHT child for LEFT8 child, 1000 for RIGHT1 child, 1001 for (Adamchik, 2009). The Binary tree structure is RIGHT2 child, 1010 for RIGHT3 child, 1011 for RIGHT4 thoroughly discussed in (Huffman, 1952). A tree with child, 1100 for RIGHT5 child, 1101 for RIGHT6 child, m = 4 is called a Quaternary tree, which has at most 1110 for RIGHT7 child and 1111 for RIGHT8 child. four children, the first child is called LEFT child, the The process of the construction of a Hexanary tree is second child is called LEFT-MID child, the third child described below: is called RIGHT-MID child and the fourth child is • List all possible symbols with their probabilities called RIGHT child. The detail of Quaternary tree • Find the sixteen symbols with the smallest structures is explained in (Habib and Rahman, 2017). probabilities The Binary and Quaternary tree structures for luke 5 • Replace these by a single set containing all sixteen (Luke 5, 2018) are shown in Fig. 1 and 2, respectively. symbols, whose probability is the sum of the Luke 5 is the fifth chapter of the Gospel of Luke in the individual probabilities New Testament of the Christian Bible. The chapter • Repeat until the list contains single member relates the recruitment of Jesus' first disciples and continues to describe Jesus' teaching and healing The Hexanary tree structure for Luke 5 data is shown in Fig. 5 1600 Ahsan Habib et al. / Journal of Computer Science 2018, 14 (12): 1599.1610 DOI: 10.3844/jcssp.2018.1599.1610 186201 108273 77928 50592 57681 33609 44319 24174 26418 27948 29733 20349 e 11781 16779 t h 13209 s n a o 15249 2 1 1 9282 i 5661 w 1 8517 ⌣ 3 3 l 1 1 1 r 6 2 b ⌣ 9 2 3 4 4 7140 d 8 1 7 2652 6 3 0 6 7 4 4 4 8 3 y 4284 u m 1 0 1 9 9 6 8 8 2 0 1 0 8 f 0 3 4 4 6 3 2 7 3 4 1 4 4 6 j 1530 c g 0 0 1 8 0 2 4 3 8 7 0 9 2 k p 8 9 1 3 3 459 9 3 6 6 1 v 4 7 3 1 2 1 2 2 225 6 6 2 x 1 7 3 8 0 0 3 1 2 q z 7 0 1 4 1 1 0 5 2 3 Fig.
Recommended publications
  • Data Compression for Remote Laboratories
    Paper—Data Compression for Remote Laboratories Data Compression for Remote Laboratories https://doi.org/10.3991/ijim.v11i4.6743 Olawale B. Akinwale Obafemi Awolowo University, Ile-Ife, Nigeria [email protected] Lawrence O. Kehinde Obafemi Awolowo University, Ile-Ife, Nigeria [email protected] Abstract—Remote laboratories on mobile phones have been around for a few years now. This has greatly improved accessibility of these remote labs to students who cannot afford computers but have mobile phones. When money is a factor however (as is often the case with those who can’t afford a computer), the cost of use of these remote laboratories should be minimized. This work ad- dressed this issue of minimizing the cost of use of the remote lab by making use of data compression for data sent between the user and remote lab. Keywords—Data compression; Lempel-Ziv-Welch; remote laboratory; Web- Sockets 1 Introduction The mobile phone is now the de facto computational device of choice. They are quite powerful, connected devices which are almost always with us. This is so much so that about every important online service has a mobile version or at least a version compatible with mobile phones and tablets (for this paper we would adopt the phrase mobile device to represent mobile phones and tablets). This has caused online labora- tory developers to develop online laboratory solutions which are mobile device- friendly [1, 2, 3, 4, 5, 6, 7, 8]. One of the main benefits of online laboratories is the possibility of using fewer re- sources to serve more people i.e.
    [Show full text]
  • Incompressibility and Lossless Data Compression: an Approach By
    Incompressibility and Lossless Data Compression: An Approach by Pattern Discovery Incompresibilidad y compresión de datos sin pérdidas: Un acercamiento con descubrimiento de patrones Oscar Herrera Alcántara and Francisco Javier Zaragoza Martínez Universidad Autónoma Metropolitana Unidad Azcapotzalco Departamento de Sistemas Av. San Pablo No. 180, Col. Reynosa Tamaulipas Del. Azcapotzalco, 02200, Mexico City, Mexico Tel. 53 18 95 32, Fax 53 94 45 34 [email protected], [email protected] Article received on July 14, 2008; accepted on April 03, 2009 Abstract We present a novel method for lossless data compression that aims to get a different performance to those proposed in the last decades to tackle the underlying volume of data of the Information and Multimedia Ages. These latter methods are called entropic or classic because they are based on the Classic Information Theory of Claude E. Shannon and include Huffman [8], Arithmetic [14], Lempel-Ziv [15], Burrows Wheeler (BWT) [4], Move To Front (MTF) [3] and Prediction by Partial Matching (PPM) [5] techniques. We review the Incompressibility Theorem and its relation with classic methods and our method based on discovering symbol patterns called metasymbols. Experimental results allow us to propose metasymbolic compression as a tool for multimedia compression, sequence analysis and unsupervised clustering. Keywords: Incompressibility, Data Compression, Information Theory, Pattern Discovery, Clustering. Resumen Presentamos un método novedoso para compresión de datos sin pérdidas que tiene por objetivo principal lograr un desempeño distinto a los propuestos en las últimas décadas para tratar con los volúmenes de datos propios de la Era de la Información y la Era Multimedia.
    [Show full text]
  • A Machine Learning Perspective on Predictive Coding with PAQ
    A Machine Learning Perspective on Predictive Coding with PAQ Byron Knoll & Nando de Freitas University of British Columbia Vancouver, Canada fknoll,[email protected] August 17, 2011 Abstract PAQ8 is an open source lossless data compression algorithm that currently achieves the best compression rates on many benchmarks. This report presents a detailed description of PAQ8 from a statistical machine learning perspective. It shows that it is possible to understand some of the modules of PAQ8 and use this understanding to improve the method. However, intuitive statistical explanations of the behavior of other modules remain elusive. We hope the description in this report will be a starting point for discussions that will increase our understanding, lead to improvements to PAQ8, and facilitate a transfer of knowledge from PAQ8 to other machine learning methods, such a recurrent neural networks and stochastic memoizers. Finally, the report presents a broad range of new applications of PAQ to machine learning tasks including language modeling and adaptive text prediction, adaptive game playing, classification, and compression using features from the field of deep learning. 1 Introduction Detecting temporal patterns and predicting into the future is a fundamental problem in machine learning. It has gained great interest recently in the areas of nonparametric Bayesian statistics (Wood et al., 2009) and deep learning (Sutskever et al., 2011), with applications to several domains including language modeling and unsupervised learning of audio and video sequences. Some re- searchers have argued that sequence prediction is key to understanding human intelligence (Hawkins and Blakeslee, 2005). The close connections between sequence prediction and data compression are perhaps under- arXiv:1108.3298v1 [cs.LG] 16 Aug 2011 appreciated within the machine learning community.
    [Show full text]
  • Adaptive On-The-Fly Compression
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 17, NO. 1, JANUARY 2006 15 Adaptive On-the-Fly Compression Chandra Krintz and Sezgin Sucu Abstract—We present a system called the Adaptive Compression Environment (ACE) that automatically and transparently applies compression (on-the-fly) to a communication stream to improve network transfer performance. ACE uses a series of estimation techniques to make short-term forecasts of compressed and uncompressed transfer time at the 32KB block level. ACE considers underlying networking technology, available resource performance, and data characteristics as part of its estimations to determine which compression algorithm to apply (if any). Our empirical evaluation shows that, on average, ACE improves transfer performance given changing network types and performance characteristics by 8 to 93 percent over using the popular compression techniques that we studied (Bzip, Zlib, LZO, and no compression) alone. Index Terms—Adaptive compression, dynamic, performance prediction, mobile systems. æ 1 INTRODUCTION UE to recent advances in Internet technology, the networking technology available changes regularly, e.g., a Ddemand for network bandwidth has grown rapidly. user connects her laptop to an ISDN link at home, takes her New distributed computing technologies, such as mobile laptop to work and connects via a 100Mb/s Ethernet link, and computing, P2P systems [10], Grid computing [11], Web attends a conference or visits a coffee shop where she uses the Services [7], and multimedia applications (that transmit wireless communication infrastructure that is available. The video, music, data, graphics files), cause bandwidth compression technique that performs best for this user will demand to double every year [21].
    [Show full text]
  • Benefits of Hardware Data Compression in Storage Networks Gerry Simmons, Comtech AHA Tony Summers, Comtech AHA SNIA Legal Notice
    Benefits of Hardware Data Compression in Storage Networks Gerry Simmons, Comtech AHA Tony Summers, Comtech AHA SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions: Any slide or slides used must be reproduced without modification The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations. This presentation is a project of the SNIA Education Committee. Oct 17, 2007 Benefits of Hardware Data Compression in Storage Networks 2 © 2007 Storage Networking Industry Association. All Rights Reserved. Abstract Benefits of Hardware Data Compression in Storage Networks This tutorial explains the benefits and algorithmic details of lossless data compression in Storage Networks, and focuses especially on data de-duplication. The material presents a brief history and background of Data Compression - a primer on the different data compression algorithms in use today. This primer includes performance data on the specific compression algorithms, as well as performance on different data types. Participants will come away with a good understanding of where to place compression in the data path, and the benefits to be gained by such placement. The tutorial will discuss technological advances in compression and how they affect system level solutions. Oct 17, 2007 Benefits of Hardware Data Compression in Storage Networks 3 © 2007 Storage Networking Industry Association. All Rights Reserved. Agenda Introduction and Background Lossless Compression Algorithms System Implementations Technology Advances and Compression Hardware Power Conservation and Efficiency Conclusion Oct 17, 2007 Benefits of Hardware Data Compression in Storage Networks 4 © 2007 Storage Networking Industry Association.
    [Show full text]
  • The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on Iot Nodes in Smart Cities
    sensors Article The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on IoT Nodes in Smart Cities Ammar Nasif *, Zulaiha Ali Othman and Nor Samsiah Sani Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science & Technology, University Kebangsaan Malaysia, Bangi 43600, Malaysia; [email protected] (Z.A.O.); [email protected] (N.S.S.) * Correspondence: [email protected] Abstract: Networking is crucial for smart city projects nowadays, as it offers an environment where people and things are connected. This paper presents a chronology of factors on the development of smart cities, including IoT technologies as network infrastructure. Increasing IoT nodes leads to increasing data flow, which is a potential source of failure for IoT networks. The biggest challenge of IoT networks is that the IoT may have insufficient memory to handle all transaction data within the IoT network. We aim in this paper to propose a potential compression method for reducing IoT network data traffic. Therefore, we investigate various lossless compression algorithms, such as entropy or dictionary-based algorithms, and general compression methods to determine which algorithm or method adheres to the IoT specifications. Furthermore, this study conducts compression experiments using entropy (Huffman, Adaptive Huffman) and Dictionary (LZ77, LZ78) as well as five different types of datasets of the IoT data traffic. Though the above algorithms can alleviate the IoT data traffic, adaptive Huffman gave the best compression algorithm. Therefore, in this paper, Citation: Nasif, A.; Othman, Z.A.; we aim to propose a conceptual compression method for IoT data traffic by improving an adaptive Sani, N.S.
    [Show full text]
  • On Prediction Using Variable Order Markov Models
    Journal of Arti¯cial Intelligence Research 22 (2004) 385-421 Submitted 05/04; published 12/04 On Prediction Using Variable Order Markov Models Ron Begleiter [email protected] Ran El-Yaniv [email protected] Department of Computer Science Technion - Israel Institute of Technology Haifa 32000, Israel Golan Yona [email protected] Department of Computer Science Cornell University Ithaca, NY 14853, USA Abstract This paper is concerned with algorithms for prediction of discrete sequences over a ¯nite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Context Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic Su±x Trees (PSTs). We discuss the properties of these algorithms and compare their performance using real life sequences from three domains: proteins, English text and music pieces. The comparison is made with respect to prediction quality as measured by the average log-loss. We also compare classi¯cation algorithms based on these predictors with respect to a number of large protein classi¯cation tasks. Our results indicate that a \decomposed" CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in sequence prediction tasks. Somewhat surprisingly, a di®erent algorithm, which is a modi¯cation of the Lempel-Ziv compression algorithm, signi¯cantly outperforms all algorithms on the protein classi¯cation problems. 1. Introduction Learning of sequential data continues to be a fundamental task and a challenge in pattern recognition and machine learning. Applications involving sequential data may require pre- diction of new events, generation of new sequences, or decision making such as classi¯cation of sequences or sub-sequences.
    [Show full text]
  • Data Compression for Remote Laboratories
    CORE Metadata, citation and similar papers at core.ac.uk Provided by Online-Journals.org (International Association of Online Engineering) Paper—Data Compression for Remote Laboratories Data Compression for Remote Laboratories https://doi.org/10.3991/ijim.v11i4.6743 Olawale B. Akinwale Obafemi Awolowo University, Ile-Ife, Nigeria [email protected] Lawrence O. Kehinde Obafemi Awolowo University, Ile-Ife, Nigeria [email protected] Abstract—Remote laboratories on mobile phones have been around for a few years now. This has greatly improved accessibility of these remote labs to students who cannot afford computers but have mobile phones. When money is a factor however (as is often the case with those who can’t afford a computer), the cost of use of these remote laboratories should be minimized. This work ad- dressed this issue of minimizing the cost of use of the remote lab by making use of data compression for data sent between the user and remote lab. Keywords—Data compression; Lempel-Ziv-Welch; remote laboratory; Web- Sockets 1 Introduction The mobile phone is now the de facto computational device of choice. They are quite powerful, connected devices which are almost always with us. This is so much so that about every important online service has a mobile version or at least a version compatible with mobile phones and tablets (for this paper we would adopt the phrase mobile device to represent mobile phones and tablets). This has caused online labora- tory developers to develop online laboratory solutions which are mobile device- friendly [1, 2, 3, 4, 5, 6, 7, 8].
    [Show full text]
  • Data Compression Algorithms for Energy-Constrained Devices in Delay Tolerant Networks
    Data Compression Algorithms for Energy-Constrained Devices in Delay Tolerant Networks Christopher M. Sadler and Margaret Martonosi Department of Electrical Engineering Princeton University {csadler, mrm}@princeton.edu Abstract 1 Introduction Sensor networks are fundamentally constrained by the In both mobile and stationary sensor networks, collect- difficulty and energy expense of delivering information ing and interpreting useful data is the fundamental goal, and from sensors to sink. Our work has focused on garner- this goal is typically achieved by percolating data back to a ing additional significant energy improvements by devising base station where larger-scale calculations or coordination computationally-efficient lossless compression algorithms are most feasible. The typical sensor node design includes on the source node. These reduce the amount of data that sensors as needed for the application, a small processor, and must be passed through the network and to the sink, and thus a radio whose range and power characteristics match the con- have energy benefits that are multiplicative with the number nectivity needs and energy constraints of the particular de- of hops the data travels through the network. sign space targeted. Currently, if sensor system designers want to compress Energy is a primary constraint in the design of sensor net- acquired data, they must either develop application-specific works. This fundamental energy constraint further limits ev- compression algorithms or use off-the-shelf algorithms not erything from data sensing rates and link bandwidth, to node designed for resource-constrained sensor nodes. This pa- size and weight. In most cases, the radio is the main energy per discusses the design issues involved with implementing, consumer of the system.
    [Show full text]
  • Data Compression in Solid State Storage
    Data Compression in Solid State Storage John Fryar [email protected] Santa Clara, CA August 2013 1 Acknowledgements This presentation would not have been possible without the counsel, hard work and graciousness of the following individuals and/or organizations: Raymond Savarda Sandgate Technologies Santa Clara, CA August 2013 2 Disclaimers The opinions expressed herein are those of the author and do not necessarily represent those of any other organization or individual unless specifically cited. A thorough attempt to acknowledge all sources has been made. That said, we’re all human… Santa Clara, CA August 2013 3 Learning Objectives At the conclusion of this tutorial the audience will have been exposed to: • The different types of Data Compression • Common Data Compression Algorithms • The Deflate/Inflate (GZIP/GUNZIP) algorithms in detail • Implementation Options (Software/Hardware) • Impacts of design parameters in Performance • SSD benefits and challenges • Resources for Further Study Santa Clara, CA August 2013 4 Agenda • Background, Definitions, & Context • Data Compression Overview • Data Compression Algorithm Survey • Deflate/Inflate (GZIP/GUNZIP) in depth • Software Implementations • HW Implementations • Tradeoffs & Advanced Topics • SSD Benefits and Challenges • Conclusions Santa Clara, CA August 2013 5 Definitions Item Description Comments Open A system which will compress Must strictly adhere to standards System data for use by other entities. on compress / decompress I.E. the compressed data will algorithms exit the system Interoperability among vendors mandated for Open Systems Closed A system which utilizes Can support a limited, optimized System compressed data internally but subset of standard. does not expose compressed Also allows custom algorithms data to the outside world No Interoperability req’d.
    [Show full text]
  • Algorithms for Compression on Gpus
    Algorithms for Compression on GPUs Anders L. V. Nicolaisen, s112346 Tecnical University of Denmark DTU Compute Supervisors: Inge Li Gørtz & Philip Bille Kongens Lyngby , August 2013 CSE-M.Sc.-2013-93 Technical University of Denmark DTU Compute Building 321, DK-2800 Kongens Lyngby, Denmark Phone +45 45253351, Fax +45 45882673 [email protected] www.compute.dtu.dk CSE-M.Sc.-2013 Abstract This project seeks to produce an algorithm for fast lossless compression of data. This is attempted by utilisation of the highly parallel graphic processor units (GPU), which has been made easier to use in the last decade through simpler access. Especially nVidia has accomplished to provide simpler programming of GPUs with their CUDA architecture. I present 4 techniques, each of which can be used to improve on existing algorithms for compression. I select the best of these through testing, and combine them into one final solution, that utilises CUDA to highly reduce the time needed to compress a file or stream of data. Lastly I compare the final solution to a simpler sequential version of the same algorithm for CPU along with another solution for the GPU. Results show an 60 time increase of throughput for some files in comparison with the sequential algorithm, and as much as a 7 time increase compared to the other GPU solution. i ii Resumé Dette projekt søger en algoritme for hurtig komprimering af data uden tab af information. Dette forsøges gjort ved hjælp af de kraftigt parallelisérbare grafikkort (GPU), som inden for det sidste årti har åbnet op for deres udnyt- telse gennem simplere adgang.
    [Show full text]
  • Lossless Decompression Accelerator for Embedded Processor with GUI
    micromachines Article Lossless Decompression Accelerator for Embedded Processor with GUI Gwan Beom Hwang, Kwon Neung Cho, Chang Yeop Han, Hyun Woo Oh, Young Hyun Yoon and Seung Eun Lee * Department of Electronic Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea; [email protected] (G.B.H.); [email protected] (K.N.C.); [email protected] (C.Y.H.); [email protected] (H.W.O.); [email protected] (Y.H.Y.) * Correspondence: [email protected]; Tel.: +82-2-970-9021 Abstract: The development of the mobile industry brings about the demand for high-performance embedded systems in order to meet the requirement of user-centered application. Because of the limitation of memory resource, employing compressed data is efficient for an embedded system. However, the workload for data decompression causes an extreme bottleneck to the embedded processor. One of the ways to alleviate the bottleneck is to integrate a hardware accelerator along with the processor, constructing a system-on-chip (SoC) for the embedded system. In this paper, we propose a lossless decompression accelerator for an embedded processor, which supports LZ77 decompression and static Huffman decoding for an inflate algorithm. The accelerator is implemented on a field programmable gate array (FPGA) to verify the functional suitability and fabricated in a Samsung 65 nm complementary metal oxide semiconductor (CMOS) process. The performance of the accelerator is evaluated by the Canterbury corpus benchmark and achieved throughput up to 20.7 MB/s at 50 MHz system clock frequency. Keywords: lossless compression; inflate algorithm; hardware accelerator; graphical user interface; Citation: Hwang, G.B.; Cho, K.N.; embedded processor; system-on-chip Han, C.Y.; Oh, H.W.; Yoon, Y.H.; Lee, S.E.
    [Show full text]