On Thermodynamic Interpretation of Transfer Entropy
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Entropy Rate of Stochastic Processes
Entropy Rate of Stochastic Processes Timo Mulder Jorn Peters [email protected] [email protected] February 8, 2015 The entropy rate of independent and identically distributed events can on average be encoded by H(X) bits per source symbol. However, in reality, series of events (or processes) are often randomly distributed and there can be arbitrary dependence between each event. Such processes with arbitrary dependence between variables are called stochastic processes. This report shows how to calculate the entropy rate for such processes, accompanied with some denitions, proofs and brief examples. 1 Introduction One may be interested in the uncertainty of an event or a series of events. Shannon entropy is a method to compute the uncertainty in an event. This can be used for single events or for several independent and identically distributed events. However, often events or series of events are not independent and identically distributed. In that case the events together form a stochastic process i.e. a process in which multiple outcomes are possible. These processes are omnipresent in all sorts of elds of study. As it turns out, Shannon entropy cannot directly be used to compute the uncertainty in a stochastic process. However, it is easy to extend such that it can be used. Also when a stochastic process satises certain properties, for example if the stochastic process is a Markov process, it is straightforward to compute the entropy of the process. The entropy rate of a stochastic process can be used in various ways. An example of this is given in Section 4.1. -
Entropy Rate Estimation for Markov Chains with Large State Space
Entropy Rate Estimation for Markov Chains with Large State Space Yanjun Han Department of Electrical Engineering Stanford University Stanford, CA 94305 [email protected] Jiantao Jiao Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720 [email protected] Chuan-Zheng Lee, Tsachy Weissman Yihong Wu Department of Electrical Engineering Department of Statistics and Data Science Stanford University Yale University Stanford, CA 94305 New Haven, CT 06511 {czlee, tsachy}@stanford.edu [email protected] Tiancheng Yu Department of Electronic Engineering Tsinghua University Haidian, Beijing 100084 [email protected] Abstract Entropy estimation is one of the prototypicalproblems in distribution property test- ing. To consistently estimate the Shannon entropy of a distribution on S elements with independent samples, the optimal sample complexity scales sublinearly with arXiv:1802.07889v3 [cs.LG] 24 Sep 2018 S S as Θ( log S ) as shown by Valiant and Valiant [41]. Extendingthe theory and algo- rithms for entropy estimation to dependent data, this paper considers the problem of estimating the entropy rate of a stationary reversible Markov chain with S states from a sample path of n observations. We show that Provided the Markov chain mixes not too slowly, i.e., the relaxation time is • S S2 at most O( 3 ), consistent estimation is achievable when n . ln S ≫ log S Provided the Markov chain has some slight dependency, i.e., the relaxation • 2 time is at least 1+Ω( ln S ), consistent estimation is impossible when n . √S S2 log S . Under both assumptions, the optimal estimation accuracy is shown to be S2 2 Θ( n log S ). -
JIDT: an Information-Theoretic Toolkit for Studying the Dynamics of Complex Systems
JIDT: An information-theoretic toolkit for studying the dynamics of complex systems Joseph T. Lizier1, 2, ∗ 1Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany 2CSIRO Digital Productivity Flagship, Marsfield, NSW 2122, Australia (Dated: December 4, 2014) Complex systems are increasingly being viewed as distributed information processing systems, particularly in the domains of computational neuroscience, bioinformatics and Artificial Life. This trend has resulted in a strong uptake in the use of (Shannon) information-theoretic measures to analyse the dynamics of complex systems in these fields. We introduce the Java Information Dy- namics Toolkit (JIDT): a Google code project which provides a standalone, (GNU GPL v3 licensed) open-source code implementation for empirical estimation of information-theoretic measures from time-series data. While the toolkit provides classic information-theoretic measures (e.g. entropy, mutual information, conditional mutual information), it ultimately focusses on implementing higher- level measures for information dynamics. That is, JIDT focusses on quantifying information storage, transfer and modification, and the dynamics of these operations in space and time. For this purpose, it includes implementations of the transfer entropy and active information storage, their multivariate extensions and local or pointwise variants. JIDT provides implementations for both discrete and continuous-valued data for each measure, including various types of estimator for continuous data (e.g. Gaussian, box-kernel and Kraskov-St¨ogbauer-Grassberger) which can be swapped at run-time due to Java's object-oriented polymorphism. Furthermore, while written in Java, the toolkit can be used directly in MATLAB, GNU Octave, Python and other environments. We present the principles behind the code design, and provide several examples to guide users. -
Entropy Power, Autoregressive Models, and Mutual Information
entropy Article Entropy Power, Autoregressive Models, and Mutual Information Jerry Gibson Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; [email protected] Received: 7 August 2018; Accepted: 17 September 2018; Published: 30 September 2018 Abstract: Autoregressive processes play a major role in speech processing (linear prediction), seismic signal processing, biological signal processing, and many other applications. We consider the quantity defined by Shannon in 1948, the entropy rate power, and show that the log ratio of entropy powers equals the difference in the differential entropy of the two processes. Furthermore, we use the log ratio of entropy powers to analyze the change in mutual information as the model order is increased for autoregressive processes. We examine when we can substitute the minimum mean squared prediction error for the entropy power in the log ratio of entropy powers, thus greatly simplifying the calculations to obtain the differential entropy and the change in mutual information and therefore increasing the utility of the approach. Applications to speech processing and coding are given and potential applications to seismic signal processing, EEG classification, and ECG classification are described. Keywords: autoregressive models; entropy rate power; mutual information 1. Introduction In time series analysis, the autoregressive (AR) model, also called the linear prediction model, has received considerable attention, with a host of results on fitting AR models, AR model prediction performance, and decision-making using time series analysis based on AR models [1–3]. The linear prediction or AR model also plays a major role in many digital signal processing applications, such as linear prediction modeling of speech signals [4–6], geophysical exploration [7,8], electrocardiogram (ECG) classification [9], and electroencephalogram (EEG) classification [10,11]. -
Source Coding: Part I of Fundamentals of Source and Video Coding
Foundations and Trends R in sample Vol. 1, No 1 (2011) 1–217 c 2011 Thomas Wiegand and Heiko Schwarz DOI: xxxxxx Source Coding: Part I of Fundamentals of Source and Video Coding Thomas Wiegand1 and Heiko Schwarz2 1 Berlin Institute of Technology and Fraunhofer Institute for Telecommunica- tions — Heinrich Hertz Institute, Germany, [email protected] 2 Fraunhofer Institute for Telecommunications — Heinrich Hertz Institute, Germany, [email protected] Abstract Digital media technologies have become an integral part of the way we create, communicate, and consume information. At the core of these technologies are source coding methods that are described in this text. Based on the fundamentals of information and rate distortion theory, the most relevant techniques used in source coding algorithms are de- scribed: entropy coding, quantization as well as predictive and trans- form coding. The emphasis is put onto algorithms that are also used in video coding, which will be described in the other text of this two-part monograph. To our families Contents 1 Introduction 1 1.1 The Communication Problem 3 1.2 Scope and Overview of the Text 4 1.3 The Source Coding Principle 5 2 Random Processes 7 2.1 Probability 8 2.2 Random Variables 9 2.2.1 Continuous Random Variables 10 2.2.2 Discrete Random Variables 11 2.2.3 Expectation 13 2.3 Random Processes 14 2.3.1 Markov Processes 16 2.3.2 Gaussian Processes 18 2.3.3 Gauss-Markov Processes 18 2.4 Summary of Random Processes 19 i ii Contents 3 Lossless Source Coding 20 3.1 Classification -
Relative Entropy Rate Between a Markov Chain and Its Corresponding Hidden Markov Chain
⃝c Statistical Research and Training Center دورهی ٨، ﺷﻤﺎرهی ١، ﺑﻬﺎر و ﺗﺎﺑﺴﺘﺎن ١٣٩٠، ﺻﺺ ٩٧–١٠٩ J. Statist. Res. Iran 8 (2011): 97–109 Relative Entropy Rate between a Markov Chain and Its Corresponding Hidden Markov Chain GH. Yariy and Z. Nikooraveshz;∗ y Iran University of Science and Technology z Birjand University of Technology Abstract. In this paper we study the relative entropy rate between a homo- geneous Markov chain and a hidden Markov chain defined by observing the output of a discrete stochastic channel whose input is the finite state space homogeneous stationary Markov chain. For this purpose, we obtain the rela- tive entropy between two finite subsequences of above mentioned chains with the help of the definition of relative entropy between two random variables then we define the relative entropy rate between these stochastic processes and study the convergence of it. Keywords. Relative entropy rate; stochastic channel; Markov chain; hid- den Markov chain. MSC 2010: 60J10, 94A17. 1 Introduction Suppose fXngn2N is a homogeneous stationary Markov chain with finite state space S = f0; 1; 2; :::; N − 1g and fYngn2N is a hidden Markov chain (HMC) which is observed through a discrete stochastic channel where the input of channel is the Markov chain. The output state space of channel is characterized by channel’s statistical properties. From now on we study ∗ Corresponding author 98 Relative Entropy Rate between a Markov Chain and Its ::: the channels state spaces which have been equal to the state spaces of input chains. Let P = fpabg be the one-step transition probability matrix of the Markov chain such that pab = P rfXn = bjXn−1 = ag for a; b 2 S and Q = fqabg be the noisy matrix of channel where qab = P rfYn = bjXn = ag for a; b 2 S. -
Exploratory Causal Analysis in Bivariate Time Series Data
Exploratory Causal Analysis in Bivariate Time Series Data Abstract Many scientific disciplines rely on observational data of systems for which it is difficult (or impossible) to implement controlled experiments and data analysis techniques are required for identifying causal information and relationships directly from observational data. This need has lead to the development of many different time series causality approaches and tools including transfer entropy, convergent cross-mapping (CCM), and Granger causality statistics. A practicing analyst can explore the literature to find many proposals for identifying drivers and causal connections in times series data sets, but little research exists of how these tools compare to each other in practice. This work introduces and defines exploratory causal analysis (ECA) to address this issue. The motivation is to provide a framework for exploring potential causal structures in time series data sets. J. M. McCracken Defense talk for PhD in Physics, Department of Physics and Astronomy 10:00 AM November 20, 2015; Exploratory Hall, 3301 Advisor: Dr. Robert Weigel; Committee: Dr. Paul So, Dr. Tim Sauer J. M. McCracken (GMU) ECA w/ time series causality November 20, 2015 1 / 50 Exploratory Causal Analysis in Bivariate Time Series Data J. M. McCracken Department of Physics and Astronomy George Mason University, Fairfax, VA November 20, 2015 J. M. McCracken (GMU) ECA w/ time series causality November 20, 2015 2 / 50 Outline 1. Motivation 2. Causality studies 3. Data causality 4. Exploratory causal analysis 5. Making an ECA summary Transfer entropy difference Granger causality statistic Pairwise asymmetric inference Weighed mean observed leaning Lagged cross-correlation difference 6. -
Entropy Rates & Markov Chains Information Theory Duke University, Fall 2020
ECE 587 / STA 563: Lecture 4 { Entropy Rates & Markov Chains Information Theory Duke University, Fall 2020 Author: Galen Reeves Last Modified: September 1, 2020 Outline of lecture: 4.1 Entropy Rate........................................1 4.2 Markov Chains.......................................3 4.3 Card Shuffling and MCMC................................4 4.4 Entropy of the English Language [Optional].......................6 4.1 Entropy Rate • A stochastic process fXig is an indexed sequence of random variables. They need not be in- n dependent nor identically distributed. For each n the distribution on X = (X1;X2; ··· ;Xn) is characterized by the joint pmf n pXn (x ) = p(x1; x2; ··· ; xn) • A stochastic process provides a useful probabilistic model. Examples include: ◦ the state of a deck of cards after each shuffle ◦ location of a molecule each millisecond. ◦ temperature each day ◦ sequence of letters in a book ◦ the closing price of the stock market each day. • With dependent sequences, hows does the entropy H(Xn) grow with n? (1) Definition 1: Average entropy per symbol H(X ;X ; ··· ;X ) H(X ) = lim 1 2 n n!1 n (2) Definition 2: Rate of information innovation 0 H (X ) = lim H(XnjX1;X2; ··· ;Xn−1) n!1 • If fXig are iid ∼ p(x), then both limits exists and are equal to H(X). nH(X) H(X ) = lim = H(X) n!1 n 0 H (X ) = lim H(XnjX1;X2; ··· ;Xn−1) = H(X) n!1 But what if the symbols are not iid? 2 ECE 587 / STA 563: Lecture 4 • A stochastic process is stationary if the joint distribution of subsets is invariant to shifts in the time index, i.e. -
Empirical Analysis Using Symbolic Transfer Entropy
University of South Florida Scholar Commons Graduate Theses and Dissertations Graduate School June 2019 Measuring Influence Across Social Media Platforms: Empirical Analysis Using Symbolic Transfer Entropy Abhishek Bhattacharjee University of South Florida, [email protected] Follow this and additional works at: https://scholarcommons.usf.edu/etd Part of the Computer Sciences Commons Scholar Commons Citation Bhattacharjee, Abhishek, "Measuring Influence Across Social Media Platforms: Empirical Analysis Using Symbolic Transfer Entropy" (2019). Graduate Theses and Dissertations. https://scholarcommons.usf.edu/etd/7745 This Thesis is brought to you for free and open access by the Graduate School at Scholar Commons. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Scholar Commons. For more information, please contact [email protected]. Measuring Influence Across Social Media Platforms: Empirical Analysis Using Symbolic Transfer Entropy by Abhishek Bhattacharjee A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science Department of Computer Science and Engineering College of Engineering University of South Florida Major Professor: Adriana Iamnitchi, Ph.D. John Skvoretz, Ph.D. Giovanni Luca Ciampaglia, Ph.D. Date of Approval: April 29, 2019 Keywords: Social Networks, Cross-platform influence Copyright © 2019, Abhishek Bhattacharjee DEDICATION Dedicated to my parents, brother, girlfriend and those five humans. ACKNOWLEDGMENTS I would like to pay my thanks to my advisor Dr Adriana Iamnitchi for her constant support and feedback throughout this project. The work wouldn’t have been possible without Leidos who provided us the data, and DARPA for funding this research work. I would also like to acknowledge the contributions made by Dr. -
Entropic Measures of Connectivity with an Application to Intracerebral Epileptic Signals Jie Zhu
Entropic measures of connectivity with an application to intracerebral epileptic signals Jie Zhu To cite this version: Jie Zhu. Entropic measures of connectivity with an application to intracerebral epileptic signals. Signal and Image processing. Université Rennes 1, 2016. English. NNT : 2016REN1S006. tel-01359072 HAL Id: tel-01359072 https://tel.archives-ouvertes.fr/tel-01359072 Submitted on 1 Sep 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. ANNÉE 2016 THÈSE / UNIVERSITÉ DE RENNES 1 sous le sceau de l’Université Bretagne Loire pour le grade de DOCTEUR DE L’UNIVERSITÉ DE RENNES 1 Mention : Traitement du Signal et Télécommunications Ecole doctorale MATISSE Ecole Doctorale Mathématiques, Télécommunications, Informatique, Signal, Systèmes, Electronique Jie ZHU Préparée à l’unité de recherche LTSI - INSERM UMR 1099 Laboratoire Traitement du Signal et de l’Image ISTIC UFR Informatique et Électronique Thèse soutenue à Rennes Entropic Measures of le 22 juin 2016 Connectivity with an devant le jury composé de : MICHEL Olivier Application to Professeur -
Information Theory 1 Entropy 2 Mutual Information
CS769 Spring 2010 Advanced Natural Language Processing Information Theory Lecturer: Xiaojin Zhu [email protected] In this lecture we will learn about entropy, mutual information, KL-divergence, etc., which are useful concepts for information processing systems. 1 Entropy Entropy of a discrete distribution p(x) over the event space X is X H(p) = − p(x) log p(x). (1) x∈X When the log has base 2, entropy has unit bits. Properties: H(p) ≥ 0, with equality only if p is deterministic (use the fact 0 log 0 = 0). Entropy is the average number of 0/1 questions needed to describe an outcome from p(x) (the Twenty Questions game). Entropy is a concave function of p. 1 1 1 1 7 For example, let X = {x1, x2, x3, x4} and p(x1) = 2 , p(x2) = 4 , p(x3) = 8 , p(x4) = 8 . H(p) = 4 bits. This definition naturally extends to joint distributions. Assuming (x, y) ∼ p(x, y), X X H(p) = − p(x, y) log p(x, y). (2) x∈X y∈Y We sometimes write H(X) instead of H(p) with the understanding that p is the underlying distribution. The conditional entropy H(Y |X) is the amount of information needed to determine Y , if the other party knows X. X X X H(Y |X) = p(x)H(Y |X = x) = − p(x, y) log p(y|x). (3) x∈X x∈X y∈Y From above, we can derive the chain rule for entropy: H(X1:n) = H(X1) + H(X2|X1) + .. -
Arxiv:1711.03962V1 [Stat.ME] 10 Nov 2017 the Entropy Rate of an Individual’S Behavior
Estimating the Entropy Rate of Finite Markov Chains with Application to Behavior Studies Brian Vegetabile1;∗ and Jenny Molet2 and Tallie Z. Baram2;3;4 and Hal Stern1 1Department of Statistics, University of California, Irvine, CA, U.S.A. 2Department of Anatomy and Neurobiology, University of California, Irvine, CA, U.S.A. 3Department of Pediatrics, University of California, Irvine, CA, U.S.A. 4Department of Neurology, University of California, Irvine, CA, U.S.A. *email: [email protected] Summary: Predictability of behavior has emerged an an important characteristic in many fields including biology, medicine, and marketing. Behavior can be recorded as a sequence of actions performed by an individual over a given time period. This sequence of actions can often be modeled as a stationary time-homogeneous Markov chain and the predictability of the individual's behavior can be quantified by the entropy rate of the process. This paper provides a comprehensive investigation of three estimators of the entropy rate of finite Markov processes and a bootstrap procedure for providing standard errors. The first two methods directly estimate the entropy rate through estimates of the transition matrix and stationary distribution of the process; the methods differ in the technique used to estimate the stationary distribution. The third method is related to the sliding-window Lempel-Ziv (SWLZ) compression algorithm. The first two methods achieve consistent estimates of the true entropy rate for reasonably short observed sequences, but are limited by requiring a priori specification of the order of the process. The method based on the SWLZ algorithm does not require specifying the order of the process and is optimal in the limit of an infinite sequence, but is biased for short sequences.