Intro to Stochastic Systems (Spring 16), Lecture 3

3 Random signals and probabilistic systems

We have defined stochastic signals (also known as stochastic processes; we will use both terms in this course) in an earlier lecture. A stochastic signal $X$ is a collection $(X_t)_{t \in T}$ of random variables indexed by a time variable $t \in T$. Depending on the structure of the set $T$, we speak of discrete- or continuous-time signals:

• If $T$ is a subset of the integers $\mathbb{Z}$, then $X$ is a discrete-time stochastic signal.

• If $T$ is a finite or an infinite interval of the real line $\mathbb{R}$, then $X$ is a continuous-time stochastic signal.

The set $\mathsf{X}$ of possible values each $X_t$ can take is called the state space of the process. If the elements of $\mathsf{X}$ can be put in a one-to-one correspondence with a subset of $\mathbb{Z}$, then we say that $X$ is a discrete-state process; if $\mathsf{X}$ is a finite or an infinite interval of $\mathbb{R}$, we say that $X$ is a continuous-state process. It is important to point out that, even though we are using the word "state," this does not mean that $X_t$ is a state variable in the sense that it contains all relevant information about the signal up to time $t$.

Another way to view stochastic signals is as follows. We have a probability space $(\Omega, \mathcal{F}, P)$ such that each $X_t$ is a random variable with values in $\mathsf{X}$, i.e., a function $X_t : \Omega \to \mathsf{X}$. This specifies a random variable for each $t \in T$. On the other hand, fixing $\omega \in \Omega$ specifies a function from $T$ into $\mathsf{X}$: this function simply assigns the value $X_t(\omega)$ to each $t \in T$. Thus, one can think of $X$ as a random function of $t$, where all the randomness is packed into $\omega$. This is best appreciated through examples, so let's take a look at a few.

Deterministic signals. Deterministic signals are a special case of stochastic signals. That is, if $x : T \to \mathsf{X}$ is a deterministic signal with state space $\mathsf{X}$, we can just take $X_t(\omega) = x(t)$ for all $t \in T$ and all $\omega \in \Omega$. Thus, the value of $X_t$ at each time $t$ is known and equal to $x(t)$; that is, $X$ is a deterministic function of $t$. Another way of stating that $X$ is a deterministic signal is to note that, for any subset $A$ of the state space $\mathsf{X}$, $P[X_t \in A]$ takes only two values: $0$ if $x(t) \notin A$ and $1$ if $x(t) \in A$.

i.i.d. processes. Take $T = \mathbb{Z}_+$ and let $X_0, X_1, \ldots$ be a collection of independent and identically distributed (i.i.d.) random variables. Then $X = (X_t)_{t \in \mathbb{Z}_+}$ is called an i.i.d. stochastic process, and the common probability distribution of the $X_t$'s is called the (one-time) marginal distribution of $X$. For example, if each $X_t$ is a Bernoulli($p$) random variable, then $X = (X_t)$ records the outcomes of repeated tosses of a coin with bias $p$. If we view it as a random function of $t$, then the value of this function at any time $t$ can only be $0$ or $1$. Moreover, because of independence, for any collection of times $t_1, t_2, \ldots, t_n \in T$ and for any binary values $x_1, \ldots, x_n \in \{0,1\}$,

$$P[X_{t_1} = x_1, \ldots, X_{t_n} = x_n] = \prod_{i=1}^n P[X_{t_i} = x_i] = \prod_{i=1}^n p^{x_i}(1-p)^{1-x_i}. \tag{3.1}$$

Markov processes. We have already encountered Markov processes before. A Markov process $X$ with state space $\mathsf{X}$ is a discrete-time process $X = (X_t)_{t \in \mathbb{Z}_+}$ that can be realized in the form

$$X_{t+1} = f(X_t, U_t), \qquad t = 0, 1, 2, \ldots,$$

where $U = (U_t)_{t \in \mathbb{Z}_+}$ is an i.i.d. process and the initial condition $X_0$ is independent of all the $U_t$'s. We have mostly focused on discrete-state Markov processes, but $X$ can take real values as well.

Deterministic signals with stochastic parameters. Let $A$ and $\Theta$ be two independent random variables, where $A \sim \mathrm{Uniform}(0,1)$ and $\Theta \sim \mathrm{Uniform}(0,2\pi)$. Let $X_t = A\cos(2\pi f t + \Theta)$, $-\infty < t < \infty$. This is a sinusoidal signal with random amplitude and random phase, and is an example of a predictable stochastic signal: if we can observe $X_t$ over a full period (i.e., any interval of length $1/f$), then we can reconstruct the entire signal.

3.1 Describing stochastic signals

Given a deterministic signal $x : T \to \mathbb{R}$, we can compute its value at any time $t \in T$, at least in principle. With a stochastic signal, though, we can only deal in probabilities. Thus, if we have a discrete-state process, we can specify the probabilities $P[X_t = x]$ for all $t \in T$ and all $x \in \mathsf{X}$; if we have a continuous-state process, we can specify the probabilities $P[a \le X_t \le b]$ for any finite or infinite subinterval $[a,b]$ of the state space $\mathsf{X}$. It is useful to think about the latter probability as the probability that the graph of the (random) function $t \mapsto X_t$ will "pass" through the "gate" $[a,b]$ at time $t$; see Figure 1.

[Figure 1: The event $\{a \le X_t \le b\}$.]

If we can do this for every $t \in T$, does this give us a complete description of the underlying stochastic signal? The answer is no. To see why, consider the following two discrete-time stochastic processes:

• $X = (X_t)_{t \in \mathbb{Z}_+}$ is an i.i.d. Bernoulli(1/2) process. That is, each $X_t$ encodes an independent toss of a fair coin.

• $Y = (Y_t)_{t \in \mathbb{Z}_+}$ is a Markov process with binary state space $\mathsf{X} = \{0,1\}$, with initial state $Y_0 \sim \mathrm{Bernoulli}(1/2)$ and with one-step transition probabilities $M(0,0) = M(1,1) = 0.7$ and $M(1,0) = M(0,1) = 0.3$.

We claim that these two processes have the same one-time marginal distributions, i.e., $P[X_t = 0] = P[Y_t = 0]$ for all $t$ (since the state space of both processes has only two elements, this is all we need to prove). That is, we claim that $P[Y_t = 0] = \frac{1}{2}$ for all $t$. From our earlier example of a two-state Markov chain, we know that $\pi = (1/2, 1/2)$ is an equilibrium distribution of our Markov chain. Since $Y_0 \sim \mathrm{Bernoulli}(1/2)$, we know that $Y_t \sim \mathrm{Bernoulli}(1/2)$ for all $t$. This proves the claim. However, the two processes are very different. If we knew that $X_{1000} = 0$ and asked for the (conditional) probability that $X_{1001} = 0$, we would get the answer $1/2$, because the $X_t$'s correspond to independent coin tosses: the outcome of the coin toss at $t = 1000$ has no effect on the outcome of the coin toss at $t = 1001$. On the other hand, for the $Y$ process we know that $P[Y_{1001} = 0 \mid Y_{1000} = 0] = M(0,0) = 0.7$. And, of course, the $Y_t$'s are not independent: Markov processes have memory, albeit a very short one.
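Both processes are easy to simulate, so we can check all of this numerically. Here is a minimal Python sketch (ours, not from the notes; the function names, sample size, and seed are arbitrary choices) that realizes the chain in the driving-noise form $Y_{t+1} = f(Y_t, U_t)$ with i.i.d. $U_t \sim \mathrm{Uniform}(0,1)$, and estimates the marginal and conditional probabilities discussed above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, T = 50_000, 1002          # sample paths with time indices 0..1001

def simulate_iid(n_paths, T):
    # Each row is an independent path of an i.i.d. Bernoulli(1/2) process.
    return rng.integers(0, 2, size=(n_paths, T), dtype=np.int8)

def simulate_markov(n_paths, T, stay=0.7):
    # Binary Markov chain: Y_0 ~ Bernoulli(1/2), and Y_{t+1} = f(Y_t, U_t)
    # with U_t i.i.d. Uniform(0,1): keep the state if U_t < stay, else flip.
    Y = np.empty((n_paths, T), dtype=np.int8)
    Y[:, 0] = rng.integers(0, 2, size=n_paths, dtype=np.int8)
    for t in range(T - 1):
        U = rng.random(n_paths)                      # i.i.d. driving noise
        Y[:, t + 1] = np.where(U < stay, Y[:, t], 1 - Y[:, t])
    return Y

X = simulate_iid(n_paths, T)
Y = simulate_markov(n_paths, T)

# Identical one-time marginals: both estimates are close to 1/2.
print("P[X_1000 = 0] ~", (X[:, 1000] == 0).mean())
print("P[Y_1000 = 0] ~", (Y[:, 1000] == 0).mean())

# Very different one-step conditionals: ~0.5 for X, ~0.7 for Y.
x0 = X[:, 1000] == 0
y0 = Y[:, 1000] == 0
print("P[X_1001 = 0 | X_1000 = 0] ~", (X[x0, 1001] == 0).mean())
print("P[Y_1001 = 0 | Y_1000 = 0] ~", (Y[y0, 1001] == 0).mean())
```

The two marginal estimates both come out near $1/2$, while the conditional estimates separate cleanly (near $0.5$ for $X$, near $0.7$ for $Y$): exactly the distinction that the one-time marginals fail to capture.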
It turns out that, in order to arrive at a full description of a stochastic signal $X$, we must be able to specify all probabilities of the form

$$P[X_{t_1} = x_1, X_{t_2} = x_2, \ldots, X_{t_n} = x_n]$$

for all $n$, all $t_1 < t_2 < \ldots < t_n$, and all $x_1, \ldots, x_n \in \mathsf{X}$ (for discrete-state stochastic signals), and

$$P[a_1 \le X_{t_1} \le b_1,\ a_2 \le X_{t_2} \le b_2,\ \ldots,\ a_n \le X_{t_n} \le b_n]$$

for all $n$, all $t_1 < t_2 < \ldots < t_n$, and all finite or infinite intervals $[a_1,b_1], \ldots, [a_n,b_n]$ of $\mathsf{X}$ (for continuous-state signals). Of course, it may not be easy to obtain closed-form expressions for these probabilities even for $n = 2$; but, as long as we can do this in principle, we are done.

In many situations, however, we do not need such a fine-grained description of $X$. For example, if $X$ has a continuous state space, we may want to know what value $X_t$ will take "on average" and how much we can expect it to fluctuate around this average value. This is encoded in the first two moments of $X_t$: the mean $m_X(t) \triangleq E[X_t]$ and the variance $\sigma_X^2(t) \triangleq \mathrm{Var}[X_t]$. We can also talk about correlations between the values of $X$ at different times. This information is encoded in the autocorrelation function (the Greek prefix "auto" means "self," so this is fancyspeak for "self-correlation")

$$R_X(s,t) \triangleq E[X_s X_t], \qquad \forall\, s,t \in T,$$

and in the autocovariance function

$$C_X(s,t) \triangleq E[(X_s - m_X(s))(X_t - m_X(t))] = E[X_s X_t] - m_X(s)\,m_X(t) = R_X(s,t) - m_X(s)\,m_X(t).$$

Later on, we will also be looking at two signals at a time, say $X = (X_t)$ and $Y = (Y_t)$, where $X$ is the input and $Y$ is the output of some system. Then we may want to look at the crosscorrelation

$$R_{XY}(s,t) \triangleq E[X_s Y_t]$$

and the crosscovariance

$$C_{XY}(s,t) \triangleq E[(X_s - m_X(s))(Y_t - m_Y(t))] = R_{XY}(s,t) - m_X(s)\,m_Y(t).$$
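These definitions are easy to exercise on the random-amplitude, random-phase sinusoid from earlier. Since $\Theta$ is uniform on $[0,2\pi)$ and independent of $A$, a short calculation (ours, not carried out in the notes) gives $m_X(t) = E[A]\,E[\cos(2\pi f t + \Theta)] = 0$ and $R_X(s,t) = \tfrac{1}{2}E[A^2]\cos(2\pi f(s-t)) = \tfrac{1}{6}\cos(2\pi f(s-t))$, using $E[A^2] = 1/3$ for $A \sim \mathrm{Uniform}(0,1)$. The sketch below (frequency, times, and sample size are arbitrary choices) checks Monte Carlo estimates against this closed form.

```python
import numpy as np

rng = np.random.default_rng(1)
f = 2.0                    # signal frequency (arbitrary for the demo)
n = 500_000                # number of Monte Carlo draws of (A, Theta)

A = rng.uniform(0.0, 1.0, n)             # random amplitude
Theta = rng.uniform(0.0, 2 * np.pi, n)   # random phase

def X(t):
    # The random sinusoid X_t = A cos(2 pi f t + Theta), evaluated at a
    # fixed time t on every Monte Carlo draw of (A, Theta).
    return A * np.cos(2 * np.pi * f * t + Theta)

s, t = 0.30, 0.55
print("m_X(t)    ~", X(t).mean())            # close to 0
print("R_X(s, t) ~", (X(s) * X(t)).mean())   # Monte Carlo estimate
print("closed form:", np.cos(2 * np.pi * f * (s - t)) / 6)
```

Note that $R_X(s,t)$ depends on $s$ and $t$ only through the difference $s - t$, and, since $m_X(t) = 0$, the autocovariance $C_X(s,t)$ coincides with $R_X(s,t)$ for this signal.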
3.2 Random walk

We can spend all the time in the world on definitions and generalities, but the best way of appreciating key concepts is in the context of examples.
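As a preview, here is a minimal simulation sketch of the simplest such example, assuming the standard construction of the simple symmetric random walk: $S_0 = 0$ and $S_t = Z_1 + \cdots + Z_t$, where the steps $Z_i$ are i.i.d. and equal to $\pm 1$ with probability $1/2$ each. Note that this fits the Markov realization from above, with $S_{t+1} = f(S_t, Z_{t+1})$ and $f(s,z) = s + z$.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_walk(T, n_paths):
    # Simple symmetric random walk: S_0 = 0, S_t = Z_1 + ... + Z_t, with
    # i.i.d. steps Z_i = 2*X_i - 1 = +/-1 built from fair coin tosses X_i.
    steps = 2 * rng.integers(0, 2, size=(n_paths, T)) - 1
    S = np.cumsum(steps, axis=1)
    return np.hstack([np.zeros((n_paths, 1), dtype=steps.dtype), S])

S = random_walk(T=1000, n_paths=10_000)

# Since E[Z_i] = 0 and Var[Z_i] = 1, we expect E[S_t] = 0 and Var[S_t] = t.
t = 1000
print("E[S_t]   ~", S[:, t].mean())   # close to 0
print("Var[S_t] ~", S[:, t].var())    # close to t = 1000
```

The printed estimates reflect the zero mean and the linearly growing variance $\mathrm{Var}[S_t] = t$, two properties we will return to when analyzing this example.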