A Short Introduction to JIDT: an Information-Theoretic Toolkit for Studying the Dynamics of Complex Systems

A short introduction to JIDT: An information-theoretic toolkit for studying the dynamics of complex systems
Joseph T. Lizier, The University of Sydney
Max Planck Institute for Mathematics in the Sciences, Leipzig, July 17, 2015

Java Information Dynamics Toolkit (JIDT)

Hosted originally on Google Code, now on GitHub (https://lizier.me/jidt/). JIDT provides a standalone, open-source (GPL v3 licensed) implementation of information-theoretic measures of information processing in complex systems, i.e. information storage, transfer and modification.

JIDT includes implementations:
  - principally for transfer entropy, mutual information, their conditional variants, active information storage, etc.;
  - for both discrete and continuous-valued data;
  - using various types of estimators (e.g. Kraskov-Stögbauer-Grassberger, linear-Gaussian, etc.).

JIDT is written in Java but is directly usable from Matlab/Octave, Python, R, Julia, Clojure, etc., and requires almost zero installation. JIDT is distributed with:
  - a paper describing its design and usage: J. T. Lizier, Frontiers in Robotics and AI 1:11, 2014 (arXiv:1408.3270);
  - a full tutorial and exercises (this is an excerpt!);
  - full Javadocs;
  - a suite of demonstrations, including in each of the languages listed above.

Outline
  1. Information dynamics: information theory, the information dynamics framework, estimation techniques
  2. Overview of JIDT
  3. Demos
  4. Wrap-up

Entropy

(Shannon) entropy is a measure of uncertainty. Intuitively: if we want to know the value of a variable, there is uncertainty in what that value is before we inspect it (measured in bits).

Conditional entropy

Uncertainty in one variable X in the context of a known measurement of another variable Y. Intuitively: how much uncertainty remains in X after we know the value of Y? E.g. how uncertain are we about whether the web server is running if we know the IT guy is at lunch?

  H(X|Y) = H(X,Y) - H(Y) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log_2 p(x|y)

Mutual information

Information is a measure of uncertainty reduction. Intuitively: common information is the amount that knowing the value of one variable tells us about another.

Information-theoretic quantities

  Shannon entropy:      H(X) = -\sum_x p(x) \log_2 p(x) = \langle -\log_2 p(x) \rangle
  Conditional entropy:  H(X|Y) = -\sum_{x,y} p(x,y) \log_2 p(x|y)
  Mutual information:   I(X;Y) = H(X) + H(Y) - H(X,Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x|y)}{p(x)} = \left\langle \log_2 \frac{p(x|y)}{p(x)} \right\rangle
  Conditional MI:       I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z) = \left\langle \log_2 \frac{p(x|y,z)}{p(x|z)} \right\rangle

Local measures

We can write local (or pointwise) information-theoretic measures for specific observations/configurations {x, y, z}:

  h(x) = -\log_2 p(x),        i(x;y) = \log_2 \frac{p(x|y)}{p(x)}
  h(x|y) = -\log_2 p(x|y),    i(x;y|z) = \log_2 \frac{p(x|y,z)}{p(x|z)}

We have H(X) = \langle h(x) \rangle and I(X;Y) = \langle i(x;y) \rangle, etc. If X, Y, Z are time series, local values measure dynamics over time.
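To make these definitions concrete, here is a minimal, self-contained Java sketch that estimates H(X), H(X|Y) and I(X;Y) in bits from paired discrete samples, using simple relative-frequency estimates of the probabilities. Class, method and variable names are illustrative only; this is not the JIDT API.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal relative-frequency estimator for H(X), H(X,Y), H(X|Y) and I(X;Y)
 * in bits, from paired discrete observations. Illustrative sketch only.
 */
public class PlugInInfoDemo {

    static double log2(double p) { return Math.log(p) / Math.log(2.0); }

    /** Entropy (bits) of a single discrete series. */
    static double entropy(int[] x) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int v : x) counts.merge(v, 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / x.length;   // p(x) = count / N
            h -= p * log2(p);                   // H(X) = -sum_x p(x) log2 p(x)
        }
        return h;
    }

    /** Joint entropy (bits) of two aligned discrete series. */
    static double jointEntropy(int[] x, int[] y) {
        Map<String, Integer> counts = new HashMap<>();
        for (int n = 0; n < x.length; n++) counts.merge(x[n] + "," + y[n], 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / x.length;
            h -= p * log2(p);
        }
        return h;
    }

    public static void main(String[] args) {
        int[] x = {0, 1, 1, 0, 1, 0, 0, 1, 1, 0};
        int[] y = {0, 1, 1, 0, 1, 1, 0, 1, 1, 0};   // y mostly copies x
        double hX = entropy(x), hY = entropy(y), hXY = jointEntropy(x, y);
        double hXgivenY = hXY - hY;                  // H(X|Y) = H(X,Y) - H(Y)
        double miXY = hX + hY - hXY;                 // I(X;Y) = H(X) + H(Y) - H(X,Y)
        System.out.printf("H(X)=%.3f  H(X|Y)=%.3f  I(X;Y)=%.3f bits%n", hX, hXgivenY, miXY);
    }
}
```

The identities H(X|Y) = H(X,Y) - H(Y) and I(X;Y) = H(X) + H(Y) - H(X,Y) keep the sketch short; for serious use the bias-aware estimators shipped with JIDT are preferable.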
What can we do with these measures?

  - Measure the diversity in agent strategies (Miramontes, 1995; Prokopenko et al., 2005).
  - Measure long-range correlations as we approach a phase transition (Ribeiro et al., 2008).
  - Feature selection for machine learning (Wang et al., 2014).
  - Quantify the information held in a response about a stimulus, and indeed about specific stimuli (DeWeese and Meister, 1999).
  - Measure the common information in the behaviour of two agents (Sperati et al., 2008).
  - Guide self-organisation (Prokopenko et al., 2006).
  - ...

Information theory is useful for answering specific questions about information content, shared information, and where and to what extent information about some variable is mirrored.

Information dynamics

We talk about computation as memory, signalling and processing. The idea is to quantify computation via the corresponding information-theoretic quantities:

  Memory      -> information storage
  Signalling  -> information transfer
  Processing  -> information modification

Distributed computation is any process involving these features:
  - time evolution of cellular automata;
  - information processing in the brain;
  - gene regulatory networks computing cell behaviours;
  - flocks computing their collective heading;
  - ant colonies computing the most efficient routes to food;
  - the universe computing its own future!

General idea: by quantifying intrinsic computation in the language it is normally described in, we can understand how nature computes and why it is complex.

Key question: how is the next state of a variable in a complex system computed? Treating the complex system as a multivariate time series of states, we ask:

  Q: Where does the information in x_{n+1} come from, and how can we measure it?
  Q: How much was stored, how much was transferred, and can we partition them or do they overlap?

Active information storage (Lizier et al., 2012)

How much information about the next observation X_{n+1} of process X can be found in its past state X_n^{(k)} = \{X_{n-k+1}, \ldots, X_{n-1}, X_n\}?

Active information storage:

  A_X = I(X_{n+1}; X_n^{(k)}) = \left\langle \log_2 \frac{p(x_{n+1} | x_n^{(k)})}{p(x_{n+1})} \right\rangle

This is the average information from the past state that is in use in predicting the next value.
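As a concrete illustration of this definition, the following Java sketch computes a plug-in (relative-frequency) estimate of A_X for a discrete time series with history length k. It is illustrative only, with hypothetical names, and is not the JIDT API; JIDT's own estimators additionally handle embedding selection, bias correction and continuous data.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Plug-in estimate of active information storage A_X = I(X_{n+1}; X_n^{(k)})
 * for a discrete time series, in bits. Illustrative sketch only.
 */
public class ActiveInfoStorageDemo {

    static double log2(double v) { return Math.log(v) / Math.log(2.0); }

    /** A_X for series x with history length k. */
    static double activeInfoStorage(int[] x, int k) {
        Map<String, Integer> pastCounts = new HashMap<>();
        Map<Integer, Integer> nextCounts = new HashMap<>();
        Map<String, Integer> jointCounts = new HashMap<>();
        int numObs = 0;
        for (int n = k; n < x.length; n++) {
            StringBuilder past = new StringBuilder();
            for (int j = n - k; j < n; j++) past.append(x[j]).append(',');
            String pastKey = past.toString();
            pastCounts.merge(pastKey, 1, Integer::sum);
            nextCounts.merge(x[n], 1, Integer::sum);
            jointCounts.merge(pastKey + "->" + x[n], 1, Integer::sum);
            numObs++;
        }
        // A_X = sum p(x_{n+1}, x_n^(k)) log2 [ p(x_{n+1} | x_n^(k)) / p(x_{n+1}) ]
        double ais = 0.0;
        for (Map.Entry<String, Integer> e : jointCounts.entrySet()) {
            String[] parts = e.getKey().split("->");
            double pJoint = (double) e.getValue() / numObs;
            double pNextGivenPast = (double) e.getValue() / pastCounts.get(parts[0]);
            double pNext = (double) nextCounts.get(Integer.parseInt(parts[1])) / numObs;
            ais += pJoint * log2(pNextGivenPast / pNext);
        }
        return ais;
    }

    public static void main(String[] args) {
        // A strictly periodic series stores its dynamics: A_X = 1 bit for k = 1
        int[] periodic = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1};
        System.out.printf("A_X (k=1) = %.3f bits%n", activeInfoStorage(periodic, 1));
    }
}
```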
Information transfer

How much information about the state transition X_n^{(k)} \to X_{n+1} of X can be found in the past state Y_n^{(l)} of a source process Y?

Transfer entropy (Schreiber, 2000):

  T_{Y \to X} = I(Y_n^{(l)}; X_{n+1} | X_n^{(k)}) = \left\langle \log_2 \frac{p(x_{n+1} | x_n^{(k)}, y_n^{(l)})}{p(x_{n+1} | x_n^{(k)})} \right\rangle

This is the average information from the source that helps predict the next value in the context of the past. Storage and transfer are complementary:

  H_X = A_X + T_{Y \to X} + higher-order terms

Transfer entropy measures directed coupling between time series. Intuitively: the amount of information that a source variable tells us about a destination, in the context of the destination's current state. E.g. how much does knowing the IT guy is at lunch tell us about whether the web server is running, given its previous state?

Information dynamics in CAs

[Figure: raw cellular automaton evolution alongside its local active information storage (LAIS) and local transfer entropy (LTE right, LTE left) profiles.]

Domains and blinkers are the dominant information storage entities. Gliders are the dominant information transfer entities.

Estimation techniques

Discrete variables: plug-in estimator. For discrete variables x and y, to compute H(X,Y):
  1. estimate p(x,y) = count(X = x, Y = y) / N, where N is our sample size;
  2. plug each estimated PDF into H(X,Y) to get \hat{H}(X,Y).

Continuous variables: differential entropy. The main estimators are:
  1. Gaussian model: covariances -> entropies (Cover and Thomas, 1991), H(X) = \frac{1}{2} \ln\left[ (2\pi e)^d |\Omega_X| \right];
  2. (box-)kernel estimation: PDFs from counts within a radius r;
  3. Kraskov et al. (2004) (KSG): PDFs from counts within a radius determined by k nearest neighbours, plus bias correction. Best of breed.
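Combining the transfer entropy definition with the discrete plug-in estimator described above, the following Java sketch estimates T_{Y->X} with history lengths k = l = 1. Names are illustrative and this is not the JIDT API, which also provides Gaussian, kernel and KSG estimators for continuous data.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Plug-in estimate of transfer entropy T_{Y->X} = I(Y_n; X_{n+1} | X_n)
 * for discrete series with history lengths k = l = 1, in bits. Sketch only.
 */
public class TransferEntropyDemo {

    static double log2(double v) { return Math.log(v) / Math.log(2.0); }

    static double transferEntropy(int[] source, int[] dest) {
        Map<String, Integer> cPast = new HashMap<>(), cPastNext = new HashMap<>(),
                             cPastSource = new HashMap<>(), cAll = new HashMap<>();
        int numObs = dest.length - 1;
        for (int n = 1; n < dest.length; n++) {
            String past = "" + dest[n - 1];
            String src = "" + source[n - 1];
            cPast.merge(past, 1, Integer::sum);
            cPastNext.merge(past + "|" + dest[n], 1, Integer::sum);
            cPastSource.merge(past + "|" + src, 1, Integer::sum);
            cAll.merge(past + "|" + src + "|" + dest[n], 1, Integer::sum);
        }
        // T = sum p(x_{n+1}, x_n, y_n) log2 [ p(x_{n+1}|x_n,y_n) / p(x_{n+1}|x_n) ]
        double te = 0.0;
        for (Map.Entry<String, Integer> e : cAll.entrySet()) {
            String[] parts = e.getKey().split("\\|");      // [x_n, y_n, x_{n+1}]
            double pJoint = (double) e.getValue() / numObs;
            double pNextGivenPastSrc = (double) e.getValue()
                    / cPastSource.get(parts[0] + "|" + parts[1]);
            double pNextGivenPast = (double) cPastNext.get(parts[0] + "|" + parts[2])
                    / cPast.get(parts[0]);
            te += pJoint * log2(pNextGivenPastSrc / pNextGivenPast);
        }
        return te;
    }

    public static void main(String[] args) {
        // dest copies source with a one-step delay
        int[] source = {0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0};
        int[] dest   = {0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1};
        System.out.printf("T_{Y->X} = %.3f bits%n", transferEntropy(source, dest));
    }
}
```

Because the destination in main simply repeats the source one step later, the estimate should approach 1 bit for a long, unbiased binary source; JIDT's estimators also provide surrogate-based significance testing for such results.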
