Solving Hidden Non-Markovian Models: How to Compute Conditional State Change Probabilities

Claudia Krull(a), Graham Horton(b)

(a)(b)University of Magdeburg, Department of Simulation and Graphics

(a)[email protected], (b)[email protected]

ABSTRACT

Current production systems produce a wealth of information in the form of protocols, which is hardly exploited, often due to a lack of appropriate analysis methods. Hidden non-Markovian Models attempt to relieve this problem. They enable the analysis of unobservable systems based only on recorded output. The paper implements and tests solution algorithms based on the original Hidden Markov Model algorithms. The problem of computing the conditional state change probabilities at specific points in time is solved. The implemented algorithms compute the probability of a given output sequence and the most probable generating path for Markov regenerative models. This enables the analysis of protocols of unobservable systems and the solution of problems that cannot be solved using existing methods. This opens up new application areas, including a different approach to mining the wealth of data already collected today with huge monetary effort.

Keywords: hidden non-Markovian models, Hidden Markov Models, discrete stochastic models.

1. INTRODUCTION

Today machines and electronic components are getting more and more complex. As a result of that growing complexity they can produce a wealth of information on the system behavior in the form of protocols. These protocols are most of the time discarded or simply archived and not analyzed properly, often due to a lack of appropriate tools. Another problem is that the protocol entries are often ambiguous. An error, for example, could have been produced by several internal causes, and each cause could also result in different errors. If that is the case, finding the exact or even a probable system behavior that produced the given protocol is almost impossible using existing methods.

The recently developed paradigm of Hidden non-Markovian Models (HnMM) strives to relieve this problem. HnMM are an extension of Hidden Markov Models to more general discrete stochastic hidden models. HMM can analyze hidden behavior based only on recorded system output. However, the hidden model is a discrete-time Markov chain and therefore severely limited. The extension to discrete stochastic hidden models enables more realistic models and the analysis of real life systems. These HnMM can in theory analyze the recorded protocols of unobserved systems and determine the protocol probability or the most likely generating behavior.

Solving HnMM is still an issue of current research. HnMM can be analyzed using Proxel-based simulation (Lazarova-Molnar 2005, Wickborn et al. 2006, Krull and Horton 2008), which is however quite expensive, because it computes the complete reachable model state space and consequently all possible generating paths. An alternative solution method was proposed in (Krull and Horton 2009), where the theory for HnMM was described as well as a theoretical transfer of the original HMM solution algorithms to the new paradigm.

This paper shows an implementation of the adapted HMM algorithms. It also solves the problem of how to compute the conditional state change probabilities at a given point in time in continuous time. This is not trivial, since in theory the probability of a state change via a continuous distribution function at a specific point in time is 0. Furthermore, the performance of the implemented algorithms is tested under growing model size and growing protocol length. A comparison with the current Proxel-based solution algorithm will show the differences between the two approaches. The adapted HMM algorithms are much faster and enable the analysis of longer sequences and larger models. However, the algorithms only compute the cumulative state probability or the single most likely generating path. Therefore both solution methods have different application areas and should be developed further to be applicable to real world problems.

1.1. Possible Application Areas of HnMM

Application areas encompass all engineering problems where the true system or machine state is not detectable or has not been stored and only protocols containing ambiguous records are available. A possible application area outside of engineering could be the analysis of the symptoms of a patient, which can be interpreted as the output of the unobservable true state of his or her disease.

Today in some parts of the industry a huge effort is invested in collecting data on existing production lines by installing internal sensors and logging the observed processes. The resulting amount of data is often huge and too often not consistent, since the calibration and coordination of different sensors can be very difficult. Consequently the analysis of the data can be tedious. Often the effort that would need to be invested to gain reliable sensor equipment and data analysis is larger than the expected benefit of such a step.

A concrete industry application of HnMM could therefore be the following. A small production line is being observed over a limited period of time through external sensors such as cameras or photoelectric barriers. These sensors can only detect the visible part of the production process and might not be able to observe the individual machines’ or products’ status or a hidden buffer content. The analysis of these logs can help determine the working status of the machines and their current performance indices, helping the owner of the production line in his decisions regarding the modernization or replacement of machines. Hidden models are necessary in this context, since only a portion of the production process can be observed. However, HMM are not sufficient, since a production process is most likely time dependent and includes several non-Markovian transitions. Only the extension to Hidden non-Markovian Models can provide a time dependent model for the hidden process.

The application of the proposed method and HnMM analysis in this example could shift and considerably reduce the necessary investment for sensor equipment and the resulting effort for data collection, enabling smaller firms to conduct such studies as well.

2. STATE OF THE ART

2.1. Hidden Markov Models

Hidden Markov Models (HMM) (Rabiner 1989, Fink 2008) are a well-known paradigm in the realm of speech and pattern recognition. They are used to determine the hidden meaning of an observed output in the form of recorded speech or scanned pictures or letters. HMM consist of a discrete-time Markov chain (DTMC) (Bolch et al. 2006) as hidden model and a second stochastic process emitting symbols with given probabilities based on the current system state. Given a sequence of output symbols, one can solve the following three tasks:

• Evaluation: Determine the probability of that sequence for a given model.
• Decoding: Determine the most likely generating state sequence for a given model.
• Training: Train an initialized model so that it is most likely to produce that sequence.

Algorithms exist to solve all of these problems. The evaluation and decoding problems are solved through iterative algorithms. Training is by far the most complex task of the three and is usually solved through an optimization-like method.

The applicability of HMM to our area of interest is limited due to DTMCs as hidden model. These can only accurately represent memoryless processes and are therefore not very realistic for most applications.

Current extensions of HMM focus on increasing the speech recognition capability (Fink 2008). Most of them change the output process from discrete to continuous. The few attempts to generalize the hidden process of an HMM were abandoned due to the minimal increase in speech recognition capability, compared to the large increase in computational complexity of the solution algorithms.

2.2. Hidden non-Markovian Models

The extension of HMM to more general models leads to so-called Hidden non-Markovian Models (HnMM). These have been formalized in (Krull and Horton 2009) and the difference to HMM is twofold. HnMM use the state space of a discrete stochastic model (DSM) as hidden model, enabling non-Markovian state transitions. Furthermore, they associate the symbol emissions with the state changes, since these are most often the objects of interest in a DSM.

The Proxel-based simulation algorithm was proposed as HnMM solution method even before formalization (Wickborn et al. 2006, Krull and Horton 2008). The Proxel algorithm is a state space-based simulation method that discovers all possible system developments in discrete time steps (Horton 2002, Lazarova-Molnar 2005). These possible development paths can correspond to generating paths of HnMM output sequences. The solution of HnMM via Proxels is however quite expensive. The method explores all possible generating paths and therefore has a large memory and runtime requirement and is limited to very small models.

The goal of this paper is to implement the adapted HMM algorithms presented by Krull and Horton (2009) that promise to be more efficient. One major challenge is the computation of conditional state change probabilities at given points in time in continuous time. This would enable the analysis of larger and therefore more realistic models in order to increase the practical applicability of HnMM.

3. ADAPTED HMM ALGORITHMS

3.1. Existing HnMM Theory

This section gives a brief definition of HnMM and their components. A more detailed formalization can be found in Krull and Horton (2009). A Hidden non-Markovian Model contains the state space of a discrete stochastic system as hidden model. The symbol outputs are associated with the transitions.

(S, C, V, A, …, …)    (1)

An HnMM is represented by a 6-tuple as seen in Equation (1). The first three elements are the set of discrete system states S, the set of state changes C, and the set of output symbols V. The transition behavior is described by matrix A, which contains the instantaneous rate functions (IRF) (Horton 2002) of the state transitions.
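
To make the instantaneous rate functions stored in matrix A more concrete, the following Python sketch assumes the common hazard-rate form of an IRF, f(age) / (1 - F(age)), where f is the density and F the cumulative distribution function of a transition's firing time, and age is the time the transition has already been active. The excerpt does not state this formula, so the form itself, the exponential and uniform example distributions, and all function names are assumptions chosen for illustration; this is not the authors' implementation. The sketch also hints at why a state change at one exact point in continuous time has probability 0: the probability of firing within a window of width dt vanishes as dt shrinks, while its ratio to dt stays close to the rate.

```python
import math


def irf(pdf, cdf, age):
    """Assumed hazard-rate form of an IRF: f(age) / (1 - F(age))."""
    survival = 1.0 - cdf(age)
    if survival <= 0.0:
        return math.inf  # the transition must already have fired
    return pdf(age) / survival


def fire_probability(cdf, age, dt):
    """P(transition fires in (age, age + dt] | it has not fired by age)."""
    survival = 1.0 - cdf(age)
    if survival <= 0.0:
        return 1.0
    return (cdf(age + dt) - cdf(age)) / survival


# Hypothetical example 1: exponential firing time (memoryless, Markovian).
RATE = 2.0


def exp_pdf(t):
    return RATE * math.exp(-RATE * t) if t >= 0.0 else 0.0


def exp_cdf(t):
    return 1.0 - math.exp(-RATE * t) if t >= 0.0 else 0.0


# Hypothetical example 2: uniform firing time on [0, UPPER] (non-Markovian).
UPPER = 4.0


def uni_pdf(t):
    return 1.0 / UPPER if 0.0 <= t <= UPPER else 0.0


def uni_cdf(t):
    return min(max(t / UPPER, 0.0), 1.0)


if __name__ == "__main__":
    # The exponential rate does not depend on the age of the transition.
    print("exponential IRF:", irf(exp_pdf, exp_cdf, 0.5), irf(exp_pdf, exp_cdf, 3.0))
    # The uniform rate grows with age, so the hidden process needs memory.
    print("uniform IRF:    ", irf(uni_pdf, uni_cdf, 1.0), irf(uni_pdf, uni_cdf, 3.0))
    # Shrinking the window drives the firing probability towards 0.
    for dt in (1.0, 0.1, 0.001):
        p = fire_probability(uni_cdf, 1.0, dt)
        print(f"dt={dt}: probability={p:.6f}, probability/dt={p / dt:.6f}")
```

The exponential case returns the same rate at every age, which is the memoryless behavior a DTMC or Markov chain can represent. The uniform case returns a rate that depends on the age, which is the kind of transition that requires a non-Markovian hidden model.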