Hidden Gibbs Models: Theory and Applications

– DRAFT –

Evgeny Verbitskiy
Mathematical Institute, Leiden University, The Netherlands
Johann Bernoulli Institute, Groningen University, The Netherlands

November 21, 2015

Contents

1 Introduction
  1.1 Hidden Models / Hidden Processes
    1.1.1 Equivalence of “random” and “deterministic” settings
    1.1.2 Deterministic chaos
  1.2 Hidden Gibbs Models
    1.2.1 How realistic is the Gibbs assumption?

I Theory

2 Gibbs states
  2.1 Gibbs-Boltzmann Ansatz
  2.2 Gibbs states for Lattice Systems
    2.2.1 Translation Invariant Gibbs states
    2.2.2 Regularity of Gibbs states
  2.3 Gibbs and other regular stochastic processes
    2.3.1 Markov Chains
    2.3.2 k-step Markov chains
    2.3.3 Variable Length Markov Chains
    2.3.4 Countable Mixtures of Markov Chains (CMMC’s)
    2.3.5 g-measures (chains with complete connections)
  2.4 Gibbs measures in Dynamical Systems
    2.4.1 Equilibrium states

3 Functions of Markov processes
  3.1 Markov Chains (recap)
    3.1.1 Regular measures on Subshifts of Finite Type
  3.2 Functions of Markov Processes
    3.2.1 Functions of Markov Chains in Dynamical Systems
  3.3 Hidden Markov Models
  3.4 Functions of Markov Chains from the Thermodynamic point of view

4 Hidden Gibbs Processes
  4.1 Functions of Gibbs Processes
  4.2 Yayama
  4.3 Review: renormalization of g-measures
  4.4 Cluster Expansions
  4.5 Cluster expansion

5 Hidden Gibbs Fields
  5.1 Renormalization Transformations in Statistical Mechanics
  5.2 Examples of pathologies under renormalization
  5.3 General Properties of Renormalised Gibbs Fields
  5.4 Dobrushin’s reconstruction program
    5.4.1 Generalized Gibbs states / “Classification” of singularities
    5.4.2 Variational Principles
  5.5 Criteria for Preservation of the Gibbs property under renormalization

II Applications

6 Hidden Markov Models
  6.1 Three basic problems for HMM’s
    6.1.1 The Evaluation Problem
    6.1.2 The Decoding or State Estimation Problem
  6.2 Gibbs property of Hidden Markov Chains

7 Denoising

A Prerequisites
  A.1 Notation
  A.2 Measure theory
  A.3 Stochastic processes
  A.4 Ergodic theory
  A.5 Entropy
    A.5.1 Shannon’s entropy rate per symbol
    A.5.2 Kolmogorov–Sinai entropy of measure-preserving systems

Chapter 1

Introduction

Last modified on April 3, 2015

Very often the true dynamics of a physical system is hidden from us: we are able to observe only a certain measurement of the state of the underlying system. For example, air temperature and the speed of the wind are easily observable functions of the climate dynamics, while an electroencephalogram provides insight into the functioning of the brain.

[Figure: Partial Observability – Noise – Coarse Graining]

Observing only partial information naturally limits our ability to describe or model the underlying system, detect changes in dynamics, make predictions, etc. Nevertheless, these problems must be addressed in practical situations, and they are extremely challenging from the mathematical point of view. Mathematics has proved to be extremely useful in dealing with imperfect data.
There are a number of methods developed within various mathematical disciplines, such as

• Dynamical Systems
• Control Theory
• Decision Theory
• Information Theory
• Machine Learning
• Statistics

to address particular instances of this general problem – dealing with imperfect knowledge.

Remarkably, many of these problems have a common nature. For example, correcting signals corrupted by noisy channels during transmission, analyzing neuronal spikes, analyzing genomic sequences, and validating the so-called renormalization group methods of theoretical physics all have the same underlying mathematical structure: a hidden Gibbs model. In this model, the observable process is a function (either deterministic or random) of a process whose underlying probabilistic law is Gibbs. Gibbs models were introduced in Statistical Mechanics and have been highly successful in modeling physical systems (ferromagnets, dilute gases, polymer chains). Nowadays Gibbs models are ubiquitous and are used by researchers from different fields, often implicitly, i.e., without knowledge that a given model belongs to a much wider class. The well-established Gibbs theory provides an excellent starting point for developing a Hidden Gibbs theory.

A subclass of Hidden Gibbs models is formed by the well-known Hidden Markov Models. Several examples have been considered in Statistical Mechanics and the theory of Dynamical Systems. However, many basic questions are still unanswered. Moreover, the relevance of Hidden Gibbs models to other areas, such as Information Theory, Bioinformatics, and Neuronal Dynamics, has never been made explicit and exploited.

Let us now formalise the problem, first by describing the mathematical paradigm used to model partial observability, the effects of noise, and coarse-graining, and then by introducing the class of Hidden Gibbs Models.
1.1 Hidden Models / Hidden Processes

Let us start with the following basic model, which we will specify further in the subsequent chapters.

The observable process $\{Y_t\}$ is a function of the hidden time-dependent process $\{X_t\}$.

We allow the process $\{X_t\}$ to be either

• a stationary random (stochastic) process, or
• a deterministic dynamical process $X_{t+1} = f(X_t)$.

For simplicity, we will only consider processes with discrete time, i.e., $t \in \mathbb{Z}$ or $\mathbb{Z}_+$. However, much of what we discuss applies to continuous-time processes as well.

The process $\{Y_t\}$ is a function of $\{X_t\}$. The function can be

• deterministic: $Y_t = \varphi(X_t)$ for all $t$,
• or random: $Y_t \sim P_{X_t}(\cdot)$, i.e., $Y_t$ is chosen according to some probability distribution which depends on $X_t$.

Remark 1.1. If not stated otherwise, we will implicitly assume that in case $Y_t$ is a random function of the underlying hidden process $X_t$, then $Y_t$ is chosen independently for every $t$. For example,
$$Y_t = X_t + Z_t,$$
where the “noise” $\{Z_t\}$ is a sequence of independent identically distributed random variables. In many practical situations discussed below one can easily allow for “dependence”, e.g., $\{Z_t\}$ being a Markov process. Moreover, as we will see later, the case of a random function can be reduced to the deterministic case; hence, any stationary process $\{Z_t\}$ can be used to model noise. However, despite the fact that the models are equivalent (see the next subsection), it is often convenient to keep the implicit dependence on the noise parameters.

1.1.1 Equivalence of “random” and “deterministic” settings

There is a one-to-one correspondence between stationary processes and measure-preserving dynamical systems.

Proposition 1.2. Let $(\Omega, \mathcal{A}, \mu, T)$ be a measure-preserving dynamical system. Then for any measurable $\varphi : \Omega \to \mathbb{R}$,
$$Y_t^{(\omega)} = \varphi(T^t \omega), \quad \omega \in \Omega, \quad t \in \mathbb{Z}_+ \text{ or } \mathbb{Z}, \tag{1.1.1}$$
is a stationary process.
In the opposite direction, any real-valued stationary process $\{Y_t\}$ gives rise to a measure-preserving dynamical system $(\Omega, \mathcal{A}, \mu, T)$ and a measurable function $\varphi : \Omega \to \mathbb{R}$ such that (1.1.1) holds.

Exercise 1.3. Prove Proposition 1.2.

1.1.2 Deterministic chaos

Chaotic systems are typically defined as systems where trajectories depend sensitively on the initial conditions; that is, a small difference in initial conditions results, over time, in substantial differences in the states: small causes can produce large effects. Simple observables (functions) $\{Y_t\}$ of trajectories of chaotic dynamical systems $\{X_t\}$, $X_{t+1} = f(X_t)$, can be indistinguishable from “random” processes.

Example 1.4. Consider the piecewise expanding map $f : [0,1] \to [0,1]$, $f(x) = 2x \bmod 1$, and the following observable $\varphi$:
$$\varphi(x) = \begin{cases} 0, & x \in [0, \tfrac12] =: I_0, \\ 1, & x \in (\tfrac12, 1] =: I_1. \end{cases}$$
Since the map $f$ is expanding (for $x, x'$ close, $d(f(x), f(x')) = 2\,d(x, x')$), the dynamical system is chaotic. Furthermore, one can easily show that the Lebesgue measure on $[0,1]$ is $f$-invariant, and the process $Y_n = \varphi(f^n(x))$ is actually a Bernoulli process.

Theorem 1.5 (Sinai’s Theorem on Bernoulli factors).

At the same time, often a time series looks “completely random” while it has a very simple underlying dynamical structure. In the figure below, Gaussian white noise is compared visually with the so-called deterministic Gaussian white noise. The reconstruction techniques (cf. Chapter ??) easily allow one to distinguish the time series, as well as to identify the underlying dynamics.

Figure 1.1: (a) Time series of the Gaussian and deterministic Gaussian white noise. (b) Corresponding probability densities of single observations. (c) Reconstruction plots $X_{n+1}$ vs $X_n$.

Many vivid examples of close resemblance between “random” and “deterministic” processes can be found in the literature.
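Example 1.4 lends itself to a direct numerical experiment. The sketch below (Python, not part of the text; the names `f`, `phi`, and `observable_process` are our own) generates the observable process $Y_n = \varphi(f^n(x))$ for a “typical” starting point. Exact rational arithmetic is used deliberately: in binary floating point, iterating $2x \bmod 1$ shifts mantissa bits out, and the orbit collapses to $0$ after roughly 53 steps.

```python
from fractions import Fraction
import random

def f(x):
    """The doubling map f(x) = 2x mod 1 on [0, 1]."""
    y = 2 * x
    return y - 1 if y >= 1 else y

def phi(x):
    """Observable: 0 on I_0 = [0, 1/2], 1 on I_1 = (1/2, 1]."""
    return 0 if x <= Fraction(1, 2) else 1

def observable_process(x0, n):
    """The hidden-process observations Y_k = phi(f^k(x0)), k = 0..n-1."""
    ys, x = [], x0
    for _ in range(n):
        ys.append(phi(x))
        x = f(x)
    return ys

# A Lebesgue-"typical" starting point, represented exactly as a rational
# with a random 200-bit numerator; Fraction avoids the float collapse.
random.seed(1)
x0 = Fraction(random.getrandbits(200), 2**200)

ys = observable_process(x0, 200)
# Up to the measure-zero boundary of I_0, the Y_k are the binary digits
# of x0, so for a typical x0 they look like fair coin flips: the
# empirical frequency of 1s should be near 1/2.
print(sum(ys) / len(ys))
```

For a typical $x_0$ the printed frequency is close to $1/2$, illustrating why this fully deterministic system produces an observable process indistinguishable from a Bernoulli process.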
