Mobile App Acceleration Via Fine-Grain Offloading to the Cloud
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Lecture 24 1 Kolmogorov Complexity
6.441 Spring 2006 24-1 6.441 Transmission of Information May 18, 2006 Lecture 24 Lecturer: Madhu Sudan Scribe: Chung Chan 1 Kolmogorov complexity Shannon’s notion of compressibility is closely tied to a probability distribution. However, the probability distribution of the source is often unknown to the encoder. Sometimes, we interested in compressibility of specific sequence. e.g. How compressible is the Bible? We have the Lempel-Ziv universal compression algorithm that can compress any string without knowledge of the underlying probability except the assumption that the strings comes from a stochastic source. But is it the best compression possible for each deterministic bit string? This notion of compressibility/complexity of a deterministic bit string has been studied by Solomonoff in 1964, Kolmogorov in 1966, Chaitin in 1967 and Levin. Consider the following n-sequence 0100011011000001010011100101110111 ··· Although it may appear random, it is the enumeration of all binary strings. A natural way to compress it is to represent it by the procedure that generates it: enumerate all strings in binary, and stop at the n-th bit. The compression achieved in bits is |compression length| ≤ 2 log n + O(1) More generally, with the universal Turing machine, we can encode data to a computer program that generates it. The length of the smallest program that produces the bit string x is called the Kolmogorov complexity of x. Definition 1 (Kolmogorov complexity) For every language L, the Kolmogorov complexity of the bit string x with respect to L is KL (x) = min l(p) p:L(p)=x where p is a program represented as a bit string, L(p) is the output of the program with respect to the language L, and l(p) is the length of the program, or more precisely, the point at which the execution halts. -
Compressed Transitive Delta Encoding 1. Introduction
Compressed Transitive Delta Encoding Dana Shapira Department of Computer Science Ashkelon Academic College Ashkelon 78211, Israel [email protected] Abstract Given a source file S and two differencing files ∆(S; T ) and ∆(T;R), where ∆(X; Y ) is used to denote the delta file of the target file Y with respect to the source file X, the objective is to be able to construct R. This is intended for the scenario of upgrading soft- ware where intermediate releases are missing, or for the case of file system backups, where non consecutive versions must be recovered. The traditional way is to decompress ∆(S; T ) in order to construct T and then apply ∆(T;R) on T and obtain R. The Compressed Transitive Delta Encoding (CTDE) paradigm, introduced in this paper, is to construct a delta file ∆(S; R) working directly on the two given delta files, ∆(S; T ) and ∆(T;R), without any decompression or the use of the base file S. A new algorithm for solving CTDE is proposed and its compression performance is compared against the traditional \double delta decompression". Not only does it use constant additional space, as opposed to the traditional method which uses linear additional memory storage, but experiments show that the size of the delta files involved is reduced by 15% on average. 1. Introduction Differential file compression represents a target file T with respect to a source file S. That is, both the encoder and decoder have available identical copies of S. A new file T is encoded and subsequently decoded by making use of S. -
Implementing Compression on Distributed Time Series Database
Implementing compression on distributed time series database Michael Burman School of Science Thesis submitted for examination for the degree of Master of Science in Technology. Espoo 05.11.2017 Supervisor Prof. Kari Smolander Advisor Mgr. Jiri Kremser Aalto University, P.O. BOX 11000, 00076 AALTO www.aalto.fi Abstract of the master’s thesis Author Michael Burman Title Implementing compression on distributed time series database Degree programme Major Computer Science Code of major SCI3042 Supervisor Prof. Kari Smolander Advisor Mgr. Jiri Kremser Date 05.11.2017 Number of pages 70+4 Language English Abstract Rise of microservices and distributed applications in containerized deployments are putting increasing amount of burden to the monitoring systems. They push the storage requirements to provide suitable performance for large queries. In this paper we present the changes we made to our distributed time series database, Hawkular-Metrics, and how it stores data more effectively in the Cassandra. We show that using our methods provides significant space savings ranging from 50 to 95% reduction in storage usage, while reducing the query times by over 90% compared to the nominal approach when using Cassandra. We also provide our unique algorithm modified from Gorilla compression algorithm that we use in our solution, which provides almost three times the throughput in compression with equal compression ratio. Keywords timeseries compression performance storage Aalto-yliopisto, PL 11000, 00076 AALTO www.aalto.fi Diplomityön tiivistelmä Tekijä Michael Burman Työn nimi Pakkausmenetelmät hajautetussa aikasarjatietokannassa Koulutusohjelma Pääaine Computer Science Pääaineen koodi SCI3042 Työn valvoja ja ohjaaja Prof. Kari Smolander Päivämäärä 05.11.2017 Sivumäärä 70+4 Kieli Englanti Tiivistelmä Hajautettujen järjestelmien yleistyminen on aiheuttanut valvontajärjestelmissä tiedon määrän kasvua, sillä aikasarjojen määrä on kasvanut ja niihin talletetaan useammin tietoa. -
In-Place Reconstruction of Delta Compressed Files
In-Place Reconstruction of Delta Compressed Files Randal C. Burns Darrell D. E. Long’ IBM Almaden ResearchCenter Departmentof Computer Science 650 Harry Rd., San Jose,CA 95 120 University of California, SantaCruz, CA 95064 [email protected] [email protected] Abstract results in high latency and low bandwidth to web-enabled clients and prevents the timely delivery of software. We present an algorithm for modifying delta compressed Differential or delta compression [5, 11, compactly en- files so that the compressedversions may be reconstructed coding a new version of a file using only the changedbytes without scratchspace. This allows network clients with lim- from a previous version, can be usedto reducethe size of the ited resources to efficiently update software by retrieving file to be transmitted and consequently the time to perform delta compressedversions over a network. software update. Currently, decompressingdelta encoded Delta compressionfor binary files, compactly encoding a files requires scratch space,additional disk or memory stor- version of data with only the changedbytes from a previous age, used to hold a required second copy of the file. Two version, may be used to efficiently distribute software over copiesof the compressedfile must be concurrently available, low bandwidth channels, such as the Internet. Traditional as the delta file contains directives to read data from the old methods for rebuilding these delta files require memory or file version while the new file version is being materialized storagespace on the target machinefor both the old and new in another region of storage. This presentsa problem. Net- version of the file to be reconstructed. -
Shannon Entropy and Kolmogorov Complexity
Information and Computation: Shannon Entropy and Kolmogorov Complexity Satyadev Nandakumar Department of Computer Science. IIT Kanpur October 19, 2016 This measures the average uncertainty of X in terms of the number of bits. Shannon Entropy Definition Let X be a random variable taking finitely many values, and P be its probability distribution. The Shannon Entropy of X is X 1 H(X ) = p(i) log : 2 p(i) i2X Shannon Entropy Definition Let X be a random variable taking finitely many values, and P be its probability distribution. The Shannon Entropy of X is X 1 H(X ) = p(i) log : 2 p(i) i2X This measures the average uncertainty of X in terms of the number of bits. The Triad Figure: Claude Shannon Figure: A. N. Kolmogorov Figure: Alan Turing Just Electrical Engineering \Shannon's contribution to pure mathematics was denied immediate recognition. I can recall now that even at the International Mathematical Congress, Amsterdam, 1954, my American colleagues in probability seemed rather doubtful about my allegedly exaggerated interest in Shannon's work, as they believed it consisted more of techniques than of mathematics itself. However, Shannon did not provide rigorous mathematical justification of the complicated cases and left it all to his followers. Still his mathematical intuition is amazingly correct." A. N. Kolmogorov, as quoted in [Shi89]. Kolmogorov and Entropy Kolmogorov's later work was fundamentally influenced by Shannon's. 1 Foundations: Kolmogorov Complexity - using the theory of algorithms to give a combinatorial interpretation of Shannon Entropy. 2 Analogy: Kolmogorov-Sinai Entropy, the only finitely-observable isomorphism-invariant property of dynamical systems. -
Delta Compression Techniques
D but the concept can also be applied to multimedia Delta Compression and structured data. Techniques Delta compression should not be confused with Elias delta codes, a technique for encod- Torsten Suel ing integer values, or with the idea of coding Department of Computer Science and sorted sequences of integers by first taking the Engineering, Tandon School of Engineering, difference (or delta) between consecutive values. New York University, Brooklyn, NY, USA Also, delta compression requires the encoder to have complete knowledge of the reference files and thus differs from more general techniques for Synonyms redundancy elimination in networks and storage systems where the encoder has limited or even Data differencing; Delta encoding; Differential no knowledge of the reference files, though the compression boundaries with that line of work are not clearly defined. Definition Delta compression techniques encode a target Overview file with respect to one or more reference files, such that a decoder who has access to the same Many applications of big data technologies in- reference files can recreate the target file from the volve very large data sets that need to be stored on compressed data. Delta compression is usually disk or transmitted over networks. Consequently, applied in cases where there is a high degree of data compression techniques are widely used to redundancy between target and references files, reduce data sizes. However, there are many sce- leading to a much smaller compressed size than narios where there are significant redundancies could be achieved by just compressing the tar- between different data files that cannot be ex- get file by itself. -
Information Theory
Information Theory Professor John Daugman University of Cambridge Computer Science Tripos, Part II Michaelmas Term 2016/17 H(X,Y) I(X;Y) H(X|Y) H(Y|X) H(X) H(Y) 1 / 149 Outline of Lectures 1. Foundations: probability, uncertainty, information. 2. Entropies defined, and why they are measures of information. 3. Source coding theorem; prefix, variable-, and fixed-length codes. 4. Discrete channel properties, noise, and channel capacity. 5. Spectral properties of continuous-time signals and channels. 6. Continuous information; density; noisy channel coding theorem. 7. Signal coding and transmission schemes using Fourier theorems. 8. The quantised degrees-of-freedom in a continuous signal. 9. Gabor-Heisenberg-Weyl uncertainty relation. Optimal \Logons". 10. Data compression codes and protocols. 11. Kolmogorov complexity. Minimal description length. 12. Applications of information theory in other sciences. Reference book (*) Cover, T. & Thomas, J. Elements of Information Theory (second edition). Wiley-Interscience, 2006 2 / 149 Overview: what is information theory? Key idea: The movements and transformations of information, just like those of a fluid, are constrained by mathematical and physical laws. These laws have deep connections with: I probability theory, statistics, and combinatorics I thermodynamics (statistical physics) I spectral analysis, Fourier (and other) transforms I sampling theory, prediction, estimation theory I electrical engineering (bandwidth; signal-to-noise ratio) I complexity theory (minimal description length) I signal processing, representation, compressibility As such, information theory addresses and answers the two fundamental questions which limit all data encoding and communication systems: 1. What is the ultimate data compression? (answer: the entropy of the data, H, is its compression limit.) 2. -
Paper 1 Jian Li Kolmold
KolmoLD: Data Modelling for the Modern Internet* Dmitry Borzov, Huawei Canada Tim Tingqiu Yuan, Huawei Mikhail Ignatovich, Huawei Canada Jian Li, Futurewei *work performed before May 2019 Challenges: Peak Traffic Composition 73% 26% Content: sizable, faned-out, static Streaming Services Everything else (Netflix, Hulu, Software File storage (Instant Messaging, YouTube, Spotify) distribution services VoIP, Social Media) [1] Source: Sandvine Global Internet Phenomena reports for 2009, 2010, 2011, 2012, 2013, 2015, 2016, October 2018 [1] https://qz.com/1001569/the-cdn-heavy-internet-in-rich-countries-will-be-unrecognizable-from-the-rest-of-the-worlds-in-five-years/ Technologies to define the revolution of the internet ChunkStream Founded in 2016 Video codec Founded in 2014 Content-addressable Based on a 2014 MIT Browser-targeted Runtime Content-addressable network protocol based research paper network protocol based on cryptohash naming on cryptohash naming scheme Implemented and supported by all major scheme Based on the Open source project cryptohash naming browsers, an IETF standard Founding company is a P2P project scheme YCombinator graduate YCombinator graduate with backing of high profile SV investors Our Proposal: A data model for interoperable protocols KolmoLD Content addressing through hashes has become a widely-used means of Addressable: connecting layer, inspired by connecting data in distributed the principles of Kolmogorov complexity systems, from the blockchains that theory run your favorite cryptocurrencies, to the commits that back your code, to Compossible: sending data as code, where the web’s content at large. code efficiency is theoretically bounded by Kolmogorov complexity Yet, whilst all of these tools rely on Computable: sandboxed computability by some common primitives, their treating data as code specific underlying data structures are not interoperable. -
Maximizing T-Complexity 1. Introduction
Fundamenta Informaticae XX (2015) 1–19 1 DOI 10.3233/FI-2012-0000 IOS Press Maximizing T-complexity Gregory Clark U.S. Department of Defense gsclark@ tycho. ncsc. mil Jason Teutsch Penn State University teutsch@ cse. psu. edu Abstract. We investigate Mark Titchener’s T-complexity, an algorithm which measures the infor- mation content of finite strings. After introducing the T-complexity algorithm, we turn our attention to a particular class of “simple” finite strings. By exploiting special properties of simple strings, we obtain a fast algorithm to compute the maximum T-complexity among strings of a given length, and our estimates of these maxima show that T-complexity differs asymptotically from Kolmogorov complexity. Finally, we examine how closely de Bruijn sequences resemble strings with high T- complexity. 1. Introduction How efficiently can we generate random-looking strings? Random strings play an important role in data compression since such strings, characterized by their lack of patterns, do not compress well. For prac- tical compression applications, compression must take place in a reasonable amount of time (typically linear or nearly linear time). The degree of success achieved with the classical compression algorithms of Lempel and Ziv [9, 21, 22] and Welch [19] depend on the complexity of the input string; the length of the compressed string gives a measure of the original string’s complexity. In the same monograph in which Lempel and Ziv introduced the first of these algorithms [9], they gave an example of a succinctly describable class of strings which yield optimally poor compression up to a logarithmic factor, namely the class of lexicographically least de Bruijn sequences for each length. -
The Pillars of Lossless Compression Algorithms a Road Map and Genealogy Tree
International Journal of Applied Engineering Research ISSN 0973-4562 Volume 13, Number 6 (2018) pp. 3296-3414 © Research India Publications. http://www.ripublication.com The Pillars of Lossless Compression Algorithms a Road Map and Genealogy Tree Evon Abu-Taieh, PhD Information System Technology Faculty, The University of Jordan, Aqaba, Jordan. Abstract tree is presented in the last section of the paper after presenting the 12 main compression algorithms each with a practical This paper presents the pillars of lossless compression example. algorithms, methods and techniques. The paper counted more than 40 compression algorithms. Although each algorithm is The paper first introduces Shannon–Fano code showing its an independent in its own right, still; these algorithms relation to Shannon (1948), Huffman coding (1952), FANO interrelate genealogically and chronologically. The paper then (1949), Run Length Encoding (1967), Peter's Version (1963), presents the genealogy tree suggested by researcher. The tree Enumerative Coding (1973), LIFO (1976), FiFO Pasco (1976), shows the interrelationships between the 40 algorithms. Also, Stream (1979), P-Based FIFO (1981). Two examples are to be the tree showed the chronological order the algorithms came to presented one for Shannon-Fano Code and the other is for life. The time relation shows the cooperation among the Arithmetic Coding. Next, Huffman code is to be presented scientific society and how the amended each other's work. The with simulation example and algorithm. The third is Lempel- paper presents the 12 pillars researched in this paper, and a Ziv-Welch (LZW) Algorithm which hatched more than 24 comparison table is to be developed. -
The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on Iot Nodes in Smart Cities
sensors Article The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on IoT Nodes in Smart Cities Ammar Nasif *, Zulaiha Ali Othman and Nor Samsiah Sani Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science & Technology, University Kebangsaan Malaysia, Bangi 43600, Malaysia; [email protected] (Z.A.O.); [email protected] (N.S.S.) * Correspondence: [email protected] Abstract: Networking is crucial for smart city projects nowadays, as it offers an environment where people and things are connected. This paper presents a chronology of factors on the development of smart cities, including IoT technologies as network infrastructure. Increasing IoT nodes leads to increasing data flow, which is a potential source of failure for IoT networks. The biggest challenge of IoT networks is that the IoT may have insufficient memory to handle all transaction data within the IoT network. We aim in this paper to propose a potential compression method for reducing IoT network data traffic. Therefore, we investigate various lossless compression algorithms, such as entropy or dictionary-based algorithms, and general compression methods to determine which algorithm or method adheres to the IoT specifications. Furthermore, this study conducts compression experiments using entropy (Huffman, Adaptive Huffman) and Dictionary (LZ77, LZ78) as well as five different types of datasets of the IoT data traffic. Though the above algorithms can alleviate the IoT data traffic, adaptive Huffman gave the best compression algorithm. Therefore, in this paper, Citation: Nasif, A.; Othman, Z.A.; we aim to propose a conceptual compression method for IoT data traffic by improving an adaptive Sani, N.S. -
1952 Washington UFO Sightings • Psychic Pets and Pet Psychics • the Skeptical Environmentalist Skeptical Inquirer the MAGAZINE for SCIENCE and REASON Volume 26,.No
1952 Washington UFO Sightings • Psychic Pets and Pet Psychics • The Skeptical Environmentalist Skeptical Inquirer THE MAGAZINE FOR SCIENCE AND REASON Volume 26,.No. 6 • November/December 2002 ppfjlffl-f]^;, rj-r ci-s'.n.: -/: •:.'.% hstisnorm-i nor mm . o THE COMMITTEE FOR THE SCIENTIFIC INVESTIGATION OF CLAIMS OF THE PARANORMAL AT THE CENTER FOR INQUIRY-INTERNATIONAL (ADJACENT TO THE STATE UNIVERSITY OF NEW YORK AT BUFFALO) • AN INTERNATIONAL ORGANIZATION Paul Kurtz, Chairman; professor emeritus of philosophy. State University of New York at Buffalo Barry Karr, Executive Director Joe Nickell, Senior Research Fellow Massimo Polidoro, Research Fellow Richard Wiseman, Research Fellow Lee Nisbet Special Projects Director FELLOWS James E. Alcock,* psychologist. York Univ., Consultants, New York. NY Irmgard Oepen, professor of medicine Toronto Susan Haack. Cooper Senior Scholar in Arts (retired), Marburg, Germany Jerry Andrus, magician and inventor, Albany, and Sciences, prof, of philosophy, University Loren Pankratz, psychologist, Oregon Health Oregon of Miami Sciences Univ. Marcia Angell, M.D., former editor-in-chief. C. E. M. Hansel, psychologist. Univ. of Wales John Paulos, mathematician, Temple Univ. New England Journal of Medicine Al Hibbs, scientist, Jet Propulsion Laboratory Steven Pinker, cognitive scientist, MIT Robert A. Baker, psychologist. Univ. of Douglas Hofstadter, professor of human Massimo Polidoro, science writer, author, Kentucky understanding and cognitive science, executive director CICAP, Italy Stephen Barrett, M.D., psychiatrist, author. Indiana Univ. Milton Rosenberg, psychologist, Univ. of consumer advocate, Allentown, Pa. Gerald Holton, Mallinckrodt Professor of Chicago Barry Beyerstein,* biopsychologist, Simon Physics and professor of history of science, Wallace Sampson, M.D., clinical professor of Harvard Univ. Fraser Univ., Vancouver, B.C., Canada medicine, Stanford Univ., editor, Scientific Ray Hyman.* psychologist, Univ.