Source Coding: Part I of Fundamentals of Source and Video Coding
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Entropy Rate of Stochastic Processes
Entropy Rate of Stochastic Processes Timo Mulder Jorn Peters [email protected] [email protected] February 8, 2015 The entropy rate of independent and identically distributed events can on average be encoded by H(X) bits per source symbol. However, in reality, series of events (or processes) are often randomly distributed and there can be arbitrary dependence between each event. Such processes with arbitrary dependence between variables are called stochastic processes. This report shows how to calculate the entropy rate for such processes, accompanied with some denitions, proofs and brief examples. 1 Introduction One may be interested in the uncertainty of an event or a series of events. Shannon entropy is a method to compute the uncertainty in an event. This can be used for single events or for several independent and identically distributed events. However, often events or series of events are not independent and identically distributed. In that case the events together form a stochastic process i.e. a process in which multiple outcomes are possible. These processes are omnipresent in all sorts of elds of study. As it turns out, Shannon entropy cannot directly be used to compute the uncertainty in a stochastic process. However, it is easy to extend such that it can be used. Also when a stochastic process satises certain properties, for example if the stochastic process is a Markov process, it is straightforward to compute the entropy of the process. The entropy rate of a stochastic process can be used in various ways. An example of this is given in Section 4.1. -
Entropy Rate Estimation for Markov Chains with Large State Space
Entropy Rate Estimation for Markov Chains with Large State Space Yanjun Han Department of Electrical Engineering Stanford University Stanford, CA 94305 [email protected] Jiantao Jiao Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720 [email protected] Chuan-Zheng Lee, Tsachy Weissman Yihong Wu Department of Electrical Engineering Department of Statistics and Data Science Stanford University Yale University Stanford, CA 94305 New Haven, CT 06511 {czlee, tsachy}@stanford.edu [email protected] Tiancheng Yu Department of Electronic Engineering Tsinghua University Haidian, Beijing 100084 [email protected] Abstract Entropy estimation is one of the prototypicalproblems in distribution property test- ing. To consistently estimate the Shannon entropy of a distribution on S elements with independent samples, the optimal sample complexity scales sublinearly with arXiv:1802.07889v3 [cs.LG] 24 Sep 2018 S S as Θ( log S ) as shown by Valiant and Valiant [41]. Extendingthe theory and algo- rithms for entropy estimation to dependent data, this paper considers the problem of estimating the entropy rate of a stationary reversible Markov chain with S states from a sample path of n observations. We show that Provided the Markov chain mixes not too slowly, i.e., the relaxation time is • S S2 at most O( 3 ), consistent estimation is achievable when n . ln S ≫ log S Provided the Markov chain has some slight dependency, i.e., the relaxation • 2 time is at least 1+Ω( ln S ), consistent estimation is impossible when n . √S S2 log S . Under both assumptions, the optimal estimation accuracy is shown to be S2 2 Θ( n log S ). -
Entropy Power, Autoregressive Models, and Mutual Information
entropy Article Entropy Power, Autoregressive Models, and Mutual Information Jerry Gibson Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; [email protected] Received: 7 August 2018; Accepted: 17 September 2018; Published: 30 September 2018 Abstract: Autoregressive processes play a major role in speech processing (linear prediction), seismic signal processing, biological signal processing, and many other applications. We consider the quantity defined by Shannon in 1948, the entropy rate power, and show that the log ratio of entropy powers equals the difference in the differential entropy of the two processes. Furthermore, we use the log ratio of entropy powers to analyze the change in mutual information as the model order is increased for autoregressive processes. We examine when we can substitute the minimum mean squared prediction error for the entropy power in the log ratio of entropy powers, thus greatly simplifying the calculations to obtain the differential entropy and the change in mutual information and therefore increasing the utility of the approach. Applications to speech processing and coding are given and potential applications to seismic signal processing, EEG classification, and ECG classification are described. Keywords: autoregressive models; entropy rate power; mutual information 1. Introduction In time series analysis, the autoregressive (AR) model, also called the linear prediction model, has received considerable attention, with a host of results on fitting AR models, AR model prediction performance, and decision-making using time series analysis based on AR models [1–3]. The linear prediction or AR model also plays a major role in many digital signal processing applications, such as linear prediction modeling of speech signals [4–6], geophysical exploration [7,8], electrocardiogram (ECG) classification [9], and electroencephalogram (EEG) classification [10,11]. -
Relative Entropy Rate Between a Markov Chain and Its Corresponding Hidden Markov Chain
⃝c Statistical Research and Training Center دورهی ٨، ﺷﻤﺎرهی ١، ﺑﻬﺎر و ﺗﺎﺑﺴﺘﺎن ١٣٩٠، ﺻﺺ ٩٧–١٠٩ J. Statist. Res. Iran 8 (2011): 97–109 Relative Entropy Rate between a Markov Chain and Its Corresponding Hidden Markov Chain GH. Yariy and Z. Nikooraveshz;∗ y Iran University of Science and Technology z Birjand University of Technology Abstract. In this paper we study the relative entropy rate between a homo- geneous Markov chain and a hidden Markov chain defined by observing the output of a discrete stochastic channel whose input is the finite state space homogeneous stationary Markov chain. For this purpose, we obtain the rela- tive entropy between two finite subsequences of above mentioned chains with the help of the definition of relative entropy between two random variables then we define the relative entropy rate between these stochastic processes and study the convergence of it. Keywords. Relative entropy rate; stochastic channel; Markov chain; hid- den Markov chain. MSC 2010: 60J10, 94A17. 1 Introduction Suppose fXngn2N is a homogeneous stationary Markov chain with finite state space S = f0; 1; 2; :::; N − 1g and fYngn2N is a hidden Markov chain (HMC) which is observed through a discrete stochastic channel where the input of channel is the Markov chain. The output state space of channel is characterized by channel’s statistical properties. From now on we study ∗ Corresponding author 98 Relative Entropy Rate between a Markov Chain and Its ::: the channels state spaces which have been equal to the state spaces of input chains. Let P = fpabg be the one-step transition probability matrix of the Markov chain such that pab = P rfXn = bjXn−1 = ag for a; b 2 S and Q = fqabg be the noisy matrix of channel where qab = P rfYn = bjXn = ag for a; b 2 S. -
Entropy Rates & Markov Chains Information Theory Duke University, Fall 2020
ECE 587 / STA 563: Lecture 4 { Entropy Rates & Markov Chains Information Theory Duke University, Fall 2020 Author: Galen Reeves Last Modified: September 1, 2020 Outline of lecture: 4.1 Entropy Rate........................................1 4.2 Markov Chains.......................................3 4.3 Card Shuffling and MCMC................................4 4.4 Entropy of the English Language [Optional].......................6 4.1 Entropy Rate • A stochastic process fXig is an indexed sequence of random variables. They need not be in- n dependent nor identically distributed. For each n the distribution on X = (X1;X2; ··· ;Xn) is characterized by the joint pmf n pXn (x ) = p(x1; x2; ··· ; xn) • A stochastic process provides a useful probabilistic model. Examples include: ◦ the state of a deck of cards after each shuffle ◦ location of a molecule each millisecond. ◦ temperature each day ◦ sequence of letters in a book ◦ the closing price of the stock market each day. • With dependent sequences, hows does the entropy H(Xn) grow with n? (1) Definition 1: Average entropy per symbol H(X ;X ; ··· ;X ) H(X ) = lim 1 2 n n!1 n (2) Definition 2: Rate of information innovation 0 H (X ) = lim H(XnjX1;X2; ··· ;Xn−1) n!1 • If fXig are iid ∼ p(x), then both limits exists and are equal to H(X). nH(X) H(X ) = lim = H(X) n!1 n 0 H (X ) = lim H(XnjX1;X2; ··· ;Xn−1) = H(X) n!1 But what if the symbols are not iid? 2 ECE 587 / STA 563: Lecture 4 • A stochastic process is stationary if the joint distribution of subsets is invariant to shifts in the time index, i.e. -
Information Theory 1 Entropy 2 Mutual Information
CS769 Spring 2010 Advanced Natural Language Processing Information Theory Lecturer: Xiaojin Zhu [email protected] In this lecture we will learn about entropy, mutual information, KL-divergence, etc., which are useful concepts for information processing systems. 1 Entropy Entropy of a discrete distribution p(x) over the event space X is X H(p) = − p(x) log p(x). (1) x∈X When the log has base 2, entropy has unit bits. Properties: H(p) ≥ 0, with equality only if p is deterministic (use the fact 0 log 0 = 0). Entropy is the average number of 0/1 questions needed to describe an outcome from p(x) (the Twenty Questions game). Entropy is a concave function of p. 1 1 1 1 7 For example, let X = {x1, x2, x3, x4} and p(x1) = 2 , p(x2) = 4 , p(x3) = 8 , p(x4) = 8 . H(p) = 4 bits. This definition naturally extends to joint distributions. Assuming (x, y) ∼ p(x, y), X X H(p) = − p(x, y) log p(x, y). (2) x∈X y∈Y We sometimes write H(X) instead of H(p) with the understanding that p is the underlying distribution. The conditional entropy H(Y |X) is the amount of information needed to determine Y , if the other party knows X. X X X H(Y |X) = p(x)H(Y |X = x) = − p(x, y) log p(y|x). (3) x∈X x∈X y∈Y From above, we can derive the chain rule for entropy: H(X1:n) = H(X1) + H(X2|X1) + .. -
Arxiv:1711.03962V1 [Stat.ME] 10 Nov 2017 the Entropy Rate of an Individual’S Behavior
Estimating the Entropy Rate of Finite Markov Chains with Application to Behavior Studies Brian Vegetabile1;∗ and Jenny Molet2 and Tallie Z. Baram2;3;4 and Hal Stern1 1Department of Statistics, University of California, Irvine, CA, U.S.A. 2Department of Anatomy and Neurobiology, University of California, Irvine, CA, U.S.A. 3Department of Pediatrics, University of California, Irvine, CA, U.S.A. 4Department of Neurology, University of California, Irvine, CA, U.S.A. *email: [email protected] Summary: Predictability of behavior has emerged an an important characteristic in many fields including biology, medicine, and marketing. Behavior can be recorded as a sequence of actions performed by an individual over a given time period. This sequence of actions can often be modeled as a stationary time-homogeneous Markov chain and the predictability of the individual's behavior can be quantified by the entropy rate of the process. This paper provides a comprehensive investigation of three estimators of the entropy rate of finite Markov processes and a bootstrap procedure for providing standard errors. The first two methods directly estimate the entropy rate through estimates of the transition matrix and stationary distribution of the process; the methods differ in the technique used to estimate the stationary distribution. The third method is related to the sliding-window Lempel-Ziv (SWLZ) compression algorithm. The first two methods achieve consistent estimates of the true entropy rate for reasonably short observed sequences, but are limited by requiring a priori specification of the order of the process. The method based on the SWLZ algorithm does not require specifying the order of the process and is optimal in the limit of an infinite sequence, but is biased for short sequences. -
Efficient Variable-To-Fixed Length Coding Algorithms for Text
Efficient Variable-to-Fixed Length Coding Algorithms for Text Compression (テキスト圧縮に対する効率よい可変長-固定長符号化アルゴリズム) Satoshi Yoshida February, 2014 Abstract Data compression is a technique for reducing the storage space and the cost of trans- ferring a large amount of data, using redundancy hidden in the data. We focus on lossless compression for text data, that is, text compression, in this thesis. To reuse a huge amount of data stored in secondary storage, I/O speeds are bottlenecks. Such a communication-speed problem can be relieved if we transfer only compressed data through the communication channel and furthermore can perform every necessary pro- cesses, such as string search, on the compressed data itself without decompression. Therefore, a new criterion \ease of processing the compressed data" is required in the field of data compression. Development of compression algorithms is currently in the mainstream of data compression field but many of them are not adequate for that criterion. The algorithms employing variable length codewords succeeded to achieve an extremely good compression ratio, but the boundaries between codewords are not obvious without a special processing. Such an \unclear boundary problem" prevents us from direct accessing to the compressed data. On the contrary, Variable-to-Fixed-length coding, which is referred to as VF coding, is promising for our demand. VF coding is a coding scheme that segments an input text into a consecutive sequence of substrings (called phrases) and then assigns a fixed length codeword to each substring. Boundaries between codewords of VF coding are obvious because all of them have the same length. Therefore, we can realize \accessible data compression" by VF coding. -
The Pillars of Lossless Compression Algorithms a Road Map and Genealogy Tree
International Journal of Applied Engineering Research ISSN 0973-4562 Volume 13, Number 6 (2018) pp. 3296-3414 © Research India Publications. http://www.ripublication.com The Pillars of Lossless Compression Algorithms a Road Map and Genealogy Tree Evon Abu-Taieh, PhD Information System Technology Faculty, The University of Jordan, Aqaba, Jordan. Abstract tree is presented in the last section of the paper after presenting the 12 main compression algorithms each with a practical This paper presents the pillars of lossless compression example. algorithms, methods and techniques. The paper counted more than 40 compression algorithms. Although each algorithm is The paper first introduces Shannon–Fano code showing its an independent in its own right, still; these algorithms relation to Shannon (1948), Huffman coding (1952), FANO interrelate genealogically and chronologically. The paper then (1949), Run Length Encoding (1967), Peter's Version (1963), presents the genealogy tree suggested by researcher. The tree Enumerative Coding (1973), LIFO (1976), FiFO Pasco (1976), shows the interrelationships between the 40 algorithms. Also, Stream (1979), P-Based FIFO (1981). Two examples are to be the tree showed the chronological order the algorithms came to presented one for Shannon-Fano Code and the other is for life. The time relation shows the cooperation among the Arithmetic Coding. Next, Huffman code is to be presented scientific society and how the amended each other's work. The with simulation example and algorithm. The third is Lempel- paper presents the 12 pillars researched in this paper, and a Ziv-Welch (LZW) Algorithm which hatched more than 24 comparison table is to be developed. -
Download Special Issue
Wireless Communications and Mobile Computing Error Control Codes for Next-Generation Communication Systems: Opportunities and Challenges Lead Guest Editor: Zesong Fei Guest Editors: Jinhong Yuan and Qin Huang Error Control Codes for Next-Generation Communication Systems: Opportunities and Challenges Wireless Communications and Mobile Computing Error Control Codes for Next-Generation Communication Systems: Opportunities and Challenges Lead Guest Editor: Zesong Fei Guest Editors: Jinhong Yuan and Qin Huang Copyright © 2018 Hindawi. All rights reserved. This is a special issue published in “Wireless Communications and Mobile Computing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, pro- vided the original work is properly cited. Editorial Board Javier Aguiar, Spain Oscar Esparza, Spain Maode Ma, Singapore Ghufran Ahmed, Pakistan Maria Fazio, Italy Imadeldin Mahgoub, USA Wessam Ajib, Canada Mauro Femminella, Italy Pietro Manzoni, Spain Muhammad Alam, China Manuel Fernandez-Veiga, Spain Álvaro Marco, Spain Eva Antonino-Daviu, Spain Gianluigi Ferrari, Italy Gustavo Marfia, Italy Shlomi Arnon, Israel Ilario Filippini, Italy Francisco J. Martinez, Spain Leyre Azpilicueta, Mexico Jesus Fontecha, Spain Davide Mattera, Italy Paolo Barsocchi, Italy Luca Foschini, Italy Michael McGuire, Canada Alessandro Bazzi, Italy A. G. Fragkiadakis, Greece Nathalie Mitton, France Zdenek Becvar, Czech Republic Sabrina Gaito, Italy Klaus Moessner, UK Francesco Benedetto, Italy Óscar García, Spain Antonella Molinaro, Italy Olivier Berder, France Manuel García Sánchez, Spain Simone Morosi, Italy Ana M. Bernardos, Spain L. J. García Villalba, Spain Kumudu S. Munasinghe, Australia Mauro Biagi, Italy José A. García-Naya, Spain Enrico Natalizio, France Dario Bruneo, Italy Miguel Garcia-Pineda, Spain Keivan Navaie, UK Jun Cai, Canada A.-J. -
The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on Iot Nodes in Smart Cities
sensors Article The Deep Learning Solutions on Lossless Compression Methods for Alleviating Data Load on IoT Nodes in Smart Cities Ammar Nasif *, Zulaiha Ali Othman and Nor Samsiah Sani Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science & Technology, University Kebangsaan Malaysia, Bangi 43600, Malaysia; [email protected] (Z.A.O.); [email protected] (N.S.S.) * Correspondence: [email protected] Abstract: Networking is crucial for smart city projects nowadays, as it offers an environment where people and things are connected. This paper presents a chronology of factors on the development of smart cities, including IoT technologies as network infrastructure. Increasing IoT nodes leads to increasing data flow, which is a potential source of failure for IoT networks. The biggest challenge of IoT networks is that the IoT may have insufficient memory to handle all transaction data within the IoT network. We aim in this paper to propose a potential compression method for reducing IoT network data traffic. Therefore, we investigate various lossless compression algorithms, such as entropy or dictionary-based algorithms, and general compression methods to determine which algorithm or method adheres to the IoT specifications. Furthermore, this study conducts compression experiments using entropy (Huffman, Adaptive Huffman) and Dictionary (LZ77, LZ78) as well as five different types of datasets of the IoT data traffic. Though the above algorithms can alleviate the IoT data traffic, adaptive Huffman gave the best compression algorithm. Therefore, in this paper, Citation: Nasif, A.; Othman, Z.A.; we aim to propose a conceptual compression method for IoT data traffic by improving an adaptive Sani, N.S. -
Error-Resilient Coding Tools in MPEG-4
I Error-Resilient Coding Tools In MPEG-4 t By CHENG Shu Ling A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF THE MASTER OF PHILOSOPHY DIVISION OF INFORMATION ENGnvfEERING THE CHINESE UNIVERSITY OF HONG KONG July 1997 • A^^s^ p(:^S31^^^ 4c—^w -v^ UNIVERSITY JgJ ' %^^y^^ s�,s^^ %^r::^v€X '^<"nii^Ji.^i:^'5^^ Acknowledgement I would like to thank my supervisor Prof. Victor K. W. Wei for his invaluable guidance and support throughout my M. Phil, study. He has given to me many fruitful ideas and inspiration on my research. I have been the teaching assistant of his channel coding and multimedia coding class for three terms, during which I have gained a lot of knowledge and experience. Working with Prof. Wei is a precious experience for me; I learnt many ideas on work and research from him. I would also like to thank my fellow labmates, C.K. Lai, C.W. Lam, S.W. Ng and C.W. Yuen. They have assisted me in many ways on my duties and given a lot of insightful discussions on my studies. Special thanks to Lai and Lam for their technical support on my research. Last but not least, I want to say thank you to my dear friends: Bird, William, Johnny, Mok, Abak, Samson, Siu-man and Nelson. Thanks to all of you for the support and patience with me over the years. ii Abstract Error resiliency is becoming increasingly important with the rapid growth of the mobile systems. The channels in mobile communications are subject to fading, shadowing and interference and thus have a high error rate.