EE365: Markov Chains


Outline
- Markov chains
- Transition Matrices
- Distribution Propagation
- Other Models

Markov chains
- a model for dynamical systems with possibly uncertain transitions
- very widely used, in many application areas
- one of a handful of core effective mathematical and computational tools
- often used to model systems that are not random; e.g., language

Stochastic dynamical system representation
    x_{t+1} = f(x_t, w_t),   t = 0, 1, ...
- x_0, w_0, w_1, ... are independent random variables
- state transitions are nondeterministic, uncertain
- obtained by combining a system x_{t+1} = f(x_t, u_t, w_t) with a policy u_t = μ_t(x_t)

Example: Inventory model with ordering policy
    x_{t+1} = [x_t − d_t + u_t]_{[0,C]},   u_t = μ(x_t)
- x_t ∈ X = {0, ..., C} is the inventory level at time t
- C > 0 is the capacity
- d_t ∈ {0, 1, ...} is the demand that arrives just after time t
- u_t ∈ {0, 1, ...} is the new stock added to the inventory
- μ : X → {0, 1, ..., C} is the ordering policy
- [z]_{[0,C]} = min(max(z, 0), C) (z clipped to the interval [0, C])
- assumption: d_0, d_1, ... are independent

Example: Inventory model with ordering policy (concrete instance)
- capacity C = 6
- demand distribution p_i = Prob(d_t = i), with p = (0.7, 0.2, 0.1)
- policy: refill if x_t ≤ 1, i.e., μ(x) = C − x if x ≤ 1, and μ(x) = 0 otherwise
- start with full inventory, x_0 = C

Simulation
(figure: a sample trajectory of the inventory level x_t)
- given the initial state x_0
- for t = 0, ..., T−1: simulate the process noise w_t := sample(p), then update the state x_{t+1} = f(x_t, w_t)

Simulation
- called a particle simulation
- gives a sample of the joint distribution of (x_0, ..., x_T)
- useful to approximately evaluate probabilities and expectations via Monte Carlo simulation, and to get a feel for what trajectories look like (e.g., for model validation)
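A minimal sketch of this particle simulation for the inventory example, written in Python (not part of the original notes; the function names, the use of numpy, and the fixed random seed are our illustrative choices):

```python
import numpy as np

C = 6                           # capacity
p = np.array([0.7, 0.2, 0.1])   # demand distribution: p[i] = Prob(d_t = i)

def policy(x):
    # refill to capacity when the inventory is low (x <= 1)
    return C - x if x <= 1 else 0

def f(x, d):
    # inventory dynamics, clipped to the interval [0, C]
    return min(max(x - d + policy(x), 0), C)

def simulate(T, x0=C, rng=np.random.default_rng(0)):
    # one particle simulation: a sample of (x_0, ..., x_T)
    x = [x0]
    for t in range(T):
        d = rng.choice(len(p), p=p)   # simulate process noise d_t ~ p
        x.append(f(x[-1], d))
    return x

print(simulate(20))
```

Running `simulate` many times and averaging indicator functions of the trajectories gives the Monte Carlo estimates of probabilities and expectations mentioned above.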
Graph representation
(figure: transition graph of the inventory chain, with states 0, ..., 6 and edges labeled p_0, p_1, p_2)
- called the transition graph
- each vertex (or node) corresponds to a state
- edge i → j is labeled with the transition probability Prob(x_{t+1} = j | x_t = i)

The tree
(figure: the tree of possible trajectories starting from x_0 = 2, with edges labeled by transition probabilities)
- each vertex of the tree corresponds to a path
- e.g., Prob(x_3 = 1 | x_0 = 2) = 0.5 × 0.5 × 0.4 + 0.5 × 0.6 × 0.4

The birth-death chain
(figure: a birth-death chain on states 1, ..., 6, with labeled up, down, and self-loop probabilities)
- self-loops can be omitted, since they can be figured out from the outgoing probabilities

Example: Metastability
(figure: an 8-state chain whose transition probabilities create two weakly connected groups of states, and a sample trajectory that stays in one group for a long time before switching to the other)

Transition Matrices

Transition matrix representation
We define the transition matrix P ∈ R^{n×n} by
    P_{ij} = Prob(x_{t+1} = j | x_t = i)
- P1 = 1 and P ≥ 0 elementwise
- a matrix with these two properties is called a stochastic matrix
- if P and Q are stochastic, then so is PQ

Transition matrix representation
For the inventory example, the transition matrix is

        [ 0    0    0    0    0.1  0.2  0.7 ]
        [ 0    0    0    0    0.1  0.2  0.7 ]
        [ 0.1  0.2  0.7  0    0    0    0   ]
    P = [ 0    0.1  0.2  0.7  0    0    0   ]
        [ 0    0    0.1  0.2  0.7  0    0   ]
        [ 0    0    0    0.1  0.2  0.7  0   ]
        [ 0    0    0    0    0.1  0.2  0.7 ]

Particle simulation given the transition matrix
Given
- the distribution of initial states d ∈ R^{1×n}
- the transition matrix P ∈ R^{n×n}, with rows p_1, ..., p_n
Algorithm:
- choose the initial state x_0 := sample(d)
- for t = 0, ..., T−1: find the distribution of the next state, d_next := p_{x_t}, and sample the next state, x_{t+1} := sample(d_next)

Definition of a Markov chain
A sequence of random variables x_t : Ω → X is a Markov chain if, for all s_0, s_1, ... and all t,
    Prob(x_{t+1} = s_{t+1} | x_t = s_t, ..., x_0 = s_0) = Prob(x_{t+1} = s_{t+1} | x_t = s_t)
- called the Markov property
- means that the system is memoryless
- x_t is called the state at time t; X is called the state space
- if you know the current state, then knowing past states doesn't give additional knowledge about the next state (or any future state)

The joint distribution
The joint distribution of x_0, x_1, ..., x_t factorizes according to
    Prob(x_0 = a, x_1 = b, x_2 = c, ..., x_t = q) = d_a P_{ab} P_{bc} ··· P_{pq}
because the chain rule gives
    Prob(x_t = s_t, x_{t−1} = s_{t−1}, ..., x_0 = s_0)
      = Prob(x_t = s_t | x_{t−1} = s_{t−1}, ..., x_0 = s_0) Prob(x_{t−1} = s_{t−1}, ..., x_0 = s_0)
Abbreviating x_t = s_t to just x_t, we can apply this repeatedly to give
    Prob(x_t, x_{t−1}, ..., x_0)
      = Prob(x_t | x_{t−1}, ..., x_0) Prob(x_{t−1} | x_{t−2}, ..., x_0) ··· Prob(x_1 | x_0) Prob(x_0)
      = Prob(x_t | x_{t−1}) Prob(x_{t−1} | x_{t−2}) ··· Prob(x_1 | x_0) Prob(x_0)

The joint distribution
- the joint distribution completely specifies the process; for example,
      E f(x_0, x_1, x_2, x_3) = Σ_{a,b,c,d ∈ X} f(a, b, c, d) d_a P_{ab} P_{bc} P_{cd}
- in principle we can compute the probability of any event and the expected value of any function, but this requires a sum over n^T terms
- we can compute some expected values far more efficiently (more later)

Distribution Propagation

Distribution propagation
    π_{t+1} = π_t P
- here π_t denotes the distribution of x_t, so (π_t)_i = Prob(x_t = i)
- we can compute the marginals π_1, ..., π_T in T n^2 operations
- a useful type of simulation
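A short sketch of distribution propagation for the inventory example (illustrative Python; the construction of P from the dynamics and policy is our code, and it reproduces the matrix shown above):

```python
import numpy as np

C = 6
p = np.array([0.7, 0.2, 0.1])   # demand distribution

# build the transition matrix from the clipped dynamics and the refill policy
P = np.zeros((C + 1, C + 1))
for x in range(C + 1):
    u = C - x if x <= 1 else 0                 # ordering policy mu(x)
    for d, pd in enumerate(p):
        x_next = min(max(x - d + u, 0), C)     # clipped inventory update
        P[x, x_next] += pd

assert np.allclose(P.sum(axis=1), 1.0)         # P is a stochastic matrix

# propagate the distribution: pi_{t+1} = pi_t P, starting from full inventory
pi = np.zeros(C + 1)
pi[C] = 1.0
for t in range(20):
    pi = pi @ P
print(np.round(pi, 3))                          # marginal distribution of x_20
```

Each propagation step is a vector-matrix product, costing on the order of n^2 operations, which matches the T n^2 operation count above.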
Example: Distribution propagation
(figure: the marginal distributions π_t of the inventory chain at t = 0, 1, 2, 3, 4, 5, 9, 10, 20, computed by distribution propagation)

Example: Reordering
- what is the probability that we reorder at time t?
- compute π_t by distribution propagation, and use
      Prob(x_t ∈ {0, 1}) = π_t (1, 1, 0, 0, 0, 0, 0)^T
(figure: the reorder probability as a function of t, for t = 0, ..., 50)

Evaluating the expectation of a separable function
- suppose f : X^{T+1} → R is separable, i.e.,
      f(x_0, ..., x_T) = f_0(x_0) + ··· + f_T(x_T),   where f_t : X → R
- then we have E f(x) = π_0 f_0 + ··· + π_T f_T
- using distribution propagation, we can compute this exactly with T n^2 operations

Powers of the transition matrix
As an example of the use of the joint distribution, let's show that
    (P^k)_{ij} = Prob(x_{t+k} = j | x_t = i)
This holds because
    Prob(x_{t+k} = s_{t+k} | x_t = s_t)
      = Prob(x_{t+k} = s_{t+k} and x_t = s_t) / Prob(x_t = s_t)
      = ( Σ_{s_0, ..., s_{t−1}, s_{t+1}, ..., s_{t+k−1}} d_{s_0} P_{s_0 s_1} P_{s_1 s_2} ··· P_{s_{t+k−1} s_{t+k}} ) / Prob(x_t = s_t)
      = ( Σ_{s_0, ..., s_{t−1}} d_{s_0} P_{s_0 s_1} ··· P_{s_{t−1} s_t} ) ( Σ_{s_{t+1}, ..., s_{t+k−1}} P_{s_t s_{t+1}} ··· P_{s_{t+k−1} s_{t+k}} ) / Prob(x_t = s_t)
The first factor is exactly Prob(x_t = s_t), which cancels, and the remaining sum over s_{t+1}, ..., s_{t+k−1} is (P^k)_{s_t s_{t+k}}.

Example: k-step transition probabilities
Two ways to compute Prob(x_t = j | x_0 = i):
- direct method: sum products of probabilities over n^t state sequences
- using matrix powers: Prob(x_t = j | x_0 = i) = (P^t)_{ij}
We can compute this in
- O(n^3 t) arithmetic operations (by multiplying matrices to get P^t), or
- O(n^3) arithmetic operations (by first diagonalizing P)
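As a rough numerical check of the matrix-power formula (again illustrative Python, assuming the inventory transition matrix built as in the earlier sketch; the particular states i, j, horizon t, and sample size are arbitrary choices):

```python
import numpy as np

C, p = 6, np.array([0.7, 0.2, 0.1])
P = np.zeros((C + 1, C + 1))
for x in range(C + 1):
    u = C - x if x <= 1 else 0
    for d, pd in enumerate(p):
        P[x, min(max(x - d + u, 0), C)] += pd

# exact k-step transition probability via matrix powers: (P^t)_{ij}
t, i, j = 10, 6, 3
exact = np.linalg.matrix_power(P, t)[i, j]

# Monte Carlo estimate from repeated particle simulation
rng = np.random.default_rng(1)
N = 10_000
hits = 0
for _ in range(N):
    x = i
    for _ in range(t):
        x = rng.choice(C + 1, p=P[x])   # one step of the chain
    hits += (x == j)

print(exact, hits / N)   # the estimate should be close to (P^t)_{ij}
```

The Monte Carlo estimate converges slowly (error of order 1/sqrt(N)), while the matrix-power computation is exact, which is the point of the O(n^3 t) comparison above.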
Other Models

Time-varying Markov chains
- we may have a time-varying Markov chain, with one transition matrix for each time:
      (P_t)_{ij} = Prob(x_{t+1} = j | x_t = i)
- suppose Prob(x_t = a) ≠ 0 for all a ∈ X and all t; then the Markov property is equivalent to the factorization property: there exist stochastic matrices P_0, P_1, ... and a distribution d such that
      Prob(x_0 = a, x_1 = b, x_2 = c, ..., x_t = q) = d_a (P_0)_{ab} (P_1)_{bc} ··· (P_{t−1})_{pq}
- the theory of factorization properties of distributions is called Bayesian networks or graphical models (with Markov chains the simplest case)

Dynamical system representation revisited
    x_{t+1} = f(x_t, w_t)
If x_0, w_0, w_1, ... are independent, then x_0, x_1, ... is Markov.

Dynamical system representation
To see that x_0, x_1, ... is Markov, notice that
    Prob(x_{t+1} = s_{t+1} | x_0 = s_0, ..., x_t = s_t)
      = Prob(f(s_t, w_t) = s_{t+1} | x_0 = s_0, ..., x_t = s_t)
      = Prob(f(s_t, w_t) = s_{t+1} | φ_t(x_0, w_0, ..., w_{t−1}) = (s_0, ..., s_t))
      = Prob(f(s_t, w_t) = s_{t+1})
- (x_0, ..., x_t) = φ_t(x_0, w_0, ..., w_{t−1}) follows from x_{t+1} = f(x_t, w_t)
- the last line holds since w_t ⊥⊥ (x_0, w_0, ..., w_{t−1}), and x ⊥⊥ y implies f(x) ⊥⊥ g(y)
x_0, x_1, ... is Markov because the same argument also gives
    Prob(x_{t+1} = s_{t+1} | x_t = s_t) = Prob(f(s_t, w_t) = s_{t+1})

State augmentation
We often have models x_{t+1} = f(x_t, w_t) where w_0, w_1, ... is Markov.
- w_0, w_1, ... satisfy w_{t+1} = g(w_t, v_t), where v_0, v_1, ... are independent
- construct an equivalent Markov chain, with state z_t = (x_t, w_t)
- dynamics z_{t+1} = h(z_t, v_t), where h(z, v) = (f(z_1, z_2), g(z_2, v))

Markov-k
    x_{t+1} = f(x_t, x_{t−1}, ..., x_{t−k+1}, w_t)
- called a Markov-k model
- assume (x_0, x_1, ..., x_{k−1}), w_0, w_1, w_2, ... are independent
- initial distribution on (x_0, x_1, ..., x_{k−1})
- Markov-k property: for all s_0, s_1, ... ,
      Prob(x_{t+1} = s_{t+1} | x_t = s_t, ..., x_0 = s_0) = Prob(x_{t+1} = s_{t+1} | x_t = s_t, x_{t−1} = s_{t−1}, ..., x_{t−k+1} = s_{t−k+1})

Markov-k
    x_{t+1} = f(x_t, x_{t−1}, ..., x_{t−k+1}, w_t)
- equivalent Markov system, with state z_t ∈ Z = X^k; notice that |Z| = |X|^k
      z_t = (x_t, x_{t−1}, ..., x_{t−k+1})
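A small sketch of this state augmentation for a Markov-2 model (k = 2). The dynamics function below is an arbitrary illustration of our own, not from the notes; the point is only that stacking the current and previous states makes the process Markov:

```python
import numpy as np

n = 3                                 # hypothetical state space X = {0, 1, 2}
rng = np.random.default_rng(0)

def f(x, x_prev, w):
    # illustrative Markov-2 dynamics: next state depends on the last two states
    return (x + x_prev + w) % n

def step_augmented(z, w):
    # z = (x_t, x_{t-1}); the augmented state lives in X^2 and evolves as a Markov chain
    x, x_prev = z
    return (f(x, x_prev, w), x)

z = (0, 0)                            # initial augmented state (x_1, x_0)
traj = [z]
for t in range(10):
    w = rng.integers(0, n)            # independent process noise w_t
    z = step_augmented(z, w)
    traj.append(z)

print(traj)                           # trajectory of the augmented chain on X^2
```

The augmented chain has |X|^2 = 9 states here, illustrating the |Z| = |X|^k growth noted above.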