Comparison and Model of Compression Techniques for Smart Cloud Log File Handling


Copyright IEEE. The final publication is available at IEEE Xplore via https://doi.org/10.1109/CCCI49893.2020.9256609.

Josef Spillner
Zurich University of Applied Sciences
Winterthur, Switzerland
[email protected]

Abstract—Compression as a data coding technique has seen approximately 70 years of research and practical innovation. Nowadays, powerful compression tools with good trade-offs exist for a range of file formats from plain text to rich multimedia. Yet in the dilemma of cloud providers to reduce log data sizes as much as possible while having to keep as much as possible around for regulatory reasons and compliance processes, many companies are looking for smarter solutions beyond brute compression. In this paper, a comprehensive applied research setting around network and system logs is introduced by comparing text compression ratios and performance. The benchmark encompasses 13 tools and 30 tool-configuration-search combinations. The tool and algorithm relationships as well as benchmark results are modelled in a graph. After discussing the results, the paper reasons about limitations of individual approaches and suitable combinations of compression with smart adaptive log file handling. The adaptivity is based on the exploitation of knowledge on format-specific compression characteristics expressed in the graph, for which a proof-of-concept advisor service is provided.

Index Terms—log file management, compression algorithms, text compression, benchmark, adaptivity, smart systems

I. INTRODUCTION

Cloud computing has become a mature backbone for millions of delivered applications and services. Besides global-scale/hyper-scale infrastructure providers with dozens of data centres, many smaller managed network and platform providers are successfully covering market needs for specialised services [1]. One key issue for these providers is the handling of dynamically generated data from their services and hosted applications. Increasingly automated operations demand more insights into the provisioning and delivery situations, and therefore access to larger amounts of historic data [2]. Additionally, regulations may demand the storage of such data for longer periods of time, and occasional search for suspicious occurrences of terms. One of the most important information sources are log files, and therefore complex log management systems are set up to collect, transform and unify log messages. At the end of such pipelines, logs are compressed and stored for as long as necessary, while still being available for occasional information retrieval [3].

Consequently, providers aim at finding compression tools which squeeze the logs into the smallest possible files, while tolerating slow compression, as long as content search, in most cases preceded by decompression, should be fast. The additional cost of log management, along with monitoring and other operations, should be kept to a minimum to allow for tight pricing of offered cloud services. Increasing diversity and progress in generic compression tools and log-specific algorithms [4], [5], [6] leaves many operators without a systematic framework to choose suitable and economic log compression tools and respective configurations. This prevents a systematic solution to exploit cost tradeoffs, such as increasing investment into better compression levels while saving long-term storage cost. In this paper, such a framework is constructed by giving a comprehensive overview with benchmark results of 30 total combinations of compression tools, decompression/search tools and associated configurations.

The four concrete technical contributions of the paper are:

1) A rich graph model of compression algorithms, formats, tools, settings and runtime characteristics (compressgraph).
2) A robust test bench aiming at reproducible model creation with integration of relevant tools for accurate ratio and performance benchmarking (compressbench).
3) A reference input and results dataset of text compression and search tools applied to representative log files (compressrefdata).
4) A programmable advisor service that exploits the graph to recommend suitable compression for a given situation (compressadvisor).

All four contributions are publicly available¹. The relation between them is summarised in Fig. 1.

Fig. 1. Contributions of this paper

¹ Contributions records: https://doi.org/10.5281/zenodo.4053735

In the next sections, related works are summarised and log file scenarios defined. Afterwards, the tool comparison is introduced with the compression graph model, a testbed with curated sample data and the plan of the experiments. The results are then presented and discussed, and the advisor service presented, before proceeding to an outlook on potential future compression tools that favour smart handling over the quest for raw compression ratios.

II. RELATED WORK

In recent years, the use of online services has seen significant growth, leading to an increase in log messages to preserve (spatial growth). For multiple reasons, including legal requirements, log files are also stored longer (temporal growth). The product of both growth factors leads to a superlinear increase in the resources required to store logs. Hence, some researchers have focused specifically on new compression algorithms for log files, while others have looked into comparison approaches.

Logs can be produced by applications, by system components, or by network or user activities on a system. They are typically semi-structured, combining regular entry types (dates, times, hosts) with irregular user-defined messages. For a primer on application log structures and their semantic interpretation, which is also exploited by more recent compression algorithms, the work by Nimbalkar et al. [7] explains the problem domain and offers an RDF-based solution that links to domain vocabularies.

Logzip [4] has been proposed to exploit log-specific redundancy in contrast to that found in generic text. Specifically, Logzip extracts hidden structures by first sampling log lines and then clustering them by tokens and other features. One limitation of Logzip is its reliance on spaces as token separators, which excludes other widespread formats. Vehicle traffic logs can be compressed semantically with high efficiency, as shown in a recent study [8]. Multi-level Log Compression (MLC) [9] is another proposal aimed at compressing log files in a cloud backup workflow. It promises ratio improvements of around 16% over state-of-the-art compression tools. Text compression beyond ASCII, applicable to the human-readable log messages, has been explored through modifications to existing byte-level compressors such as bzip2, with significant effectiveness improvements reported [10], and semantic compression for text has been investigated as well [11].

While these research prototypes are promising, a baseline comparison of widely deployed compression tools would be of immediate usefulness to operators and is in the focus of this paper. There are many benchmarks and measurements, but few recent and comprehensive comparisons of log file compression and smart selection of the best tools for this task.

A general observation can be made about the apparent business necessity of industrial compression research and tool development. This is evidenced not only by Logzip (Huawei), but also by the generic tools Brotli (Google) and Zstandard (Facebook). A second observation concerns the optimisation dimension. Most recent research works aim at a decreased compression ratio, typically at the cost of increased compression time. In contrast, another class of compression algorithms aims primarily at searchable compression, with ratios being a secondary concern. Our work combines them in a common model.

III. ADAPTIVELY COMPRESSED LOG FILE SCENARIOS

Software adaptivity is controlled by goals and constraints. For compression processes, typical goals are fast compression or decompression times, fast search (often in conjunction with decompression), low-memory or low-energy (de)compression, or optimal compression ratio. The constraints are manifold, ranging from not having the appropriate tool installed to inherent file size limitations in the tools. This knowledge needs to be captured in a knowledge base so that it can be exploited at runtime. In contrast to purely mechanical abstraction layers such as Squash, the knowledge can then lead to dynamic decisions about which codec and which parameterisation to use in any context. The novel proposal in this paper is to model the relationships in a graph, so that for instance format-equivalent compression tool alternatives can be queried dynamically based on situational context defined by goals and constraints. Through autonomous or intelligent decision-making between the possible candidates, based on an advisor service, smart adaptive log file handling is achieved.

This handling shall be illustrated by a scenario: A provider wants to store and rotate logs, asks the advisor, and gets a command line ready to execute on the files to achieve the highest possible compression. Afterwards, the provider notices that CPU usage is high and negatively affects the business application. The constraint for less CPU involvement is brought to the advisor, leading to updated advice on a command line that achieves still high compression with tolerable CPU load. As the higher-level choice is remembered, new tools that are added in later years are taken into account.
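To make the graph-and-advisor idea more concrete, the following sketch shows one possible way such a knowledge base could be encoded and queried: tool/configuration entries carry format and runtime attributes, constraints filter the candidates, and a goal ranks the remainder. All names, attributes and numbers are illustrative placeholders, not the actual compressgraph schema, advisor API or any benchmark result from this paper.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch only: a tiny compression knowledge base with
# placeholder attributes; not the paper's compressgraph model or its data.
@dataclass
class ToolConfig:
    tool: str           # compression tool name
    args: str           # example invocation flags
    fmt: str            # produced archive format
    ratio: float        # hypothetical ratio (compressed size / original size)
    cpu_cost: float     # hypothetical relative CPU cost (gzip -6 = 1.0)
    searchable: bool    # placeholder capability flag for compressed search

KNOWLEDGE = [
    ToolConfig("gzip", "-6",  "gz",  0.12, 1.0, True),
    ToolConfig("xz",   "-9",  "xz",  0.07, 9.0, False),
    ToolConfig("zstd", "-19", "zst", 0.08, 3.5, False),
    ToolConfig("zstd", "-3",  "zst", 0.11, 0.6, False),
]

def advise(goal: str, max_cpu_cost: Optional[float] = None,
           need_search: bool = False) -> ToolConfig:
    """Filter format-equivalent alternatives by constraints, rank by goal."""
    candidates = [t for t in KNOWLEDGE
                  if (max_cpu_cost is None or t.cpu_cost <= max_cpu_cost)
                  and (not need_search or t.searchable)]
    if not candidates:
        raise LookupError("no tool satisfies the given constraints")
    if goal == "best_ratio":
        return min(candidates, key=lambda t: t.ratio)
    if goal == "lowest_cpu":
        return min(candidates, key=lambda t: t.cpu_cost)
    raise ValueError(f"unknown goal: {goal}")

# Scenario from Section III: first optimise for ratio only,
# then repeat the query with an added CPU constraint.
print(advise("best_ratio"))                    # -> xz -9 in this toy data
print(advise("best_ratio", max_cpu_cost=4.0))  # -> zstd -19 in this toy data
```

In the advisor service described in the paper, such attributes would be populated from the benchmark results stored in the graph rather than hard-coded constants.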
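Likewise, the ratio and timing data that such a model depends on can be gathered with widely available command-line compressors. The snippet below is a minimal measurement loop under the assumption that gzip, xz and zstd are installed and that a log file named sample.log exists; it is a hypothetical stand-in for the kind of measurement a test bench performs, not the compressbench implementation.

```python
import os
import shutil
import subprocess
import time

LOG = "sample.log"  # hypothetical input log file

# (command, expected output file); only commonly available CLI tools,
# each invoked with -k so the original file is kept for the next run.
CANDIDATES = [
    (["gzip", "-k", "-9", LOG],  LOG + ".gz"),
    (["xz",   "-k", "-9", LOG],  LOG + ".xz"),
    (["zstd", "-k", "-19", LOG], LOG + ".zst"),
]

original = os.path.getsize(LOG)
for cmd, out in CANDIDATES:
    if shutil.which(cmd[0]) is None:  # constraint: tool must be installed
        continue
    if os.path.exists(out):
        os.remove(out)                # avoid "file exists" errors on rerun
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    ratio = os.path.getsize(out) / original
    print(f"{cmd[0]} {cmd[2]}: ratio={ratio:.3f}, time={elapsed:.2f}s")
```

Repeating such runs across more tools, compression levels and representative log files is what yields the data behind a graph model of this kind.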