
Definition A random variable X on a sample space Ω is a real-valued function on Ω; that is, X : Ω → R. A discrete random variable is a random variable that takes on only a finite or countably infinite number of values.

For a discrete random variable X and a real value a, the event “X = a” denotes the set {s ∈ Ω : X(s) = a}, and

$\Pr(X = a) = \sum_{s \in \Omega : X(s) = a} \Pr(s).$

Independence

Definition Two random variables X and Y are independent if and only if

Pr((X = x) ∩ (Y = y)) = Pr(X = x) · Pr(Y = y) for all values x and y. Similarly, random variables X1, X2, ..., Xk are mutually independent if and only if for any subset I ⊆ [1, k] and any values xi, i ∈ I,

$\Pr\left(\bigcap_{i \in I} X_i = x_i\right) = \prod_{i \in I} \Pr(X_i = x_i).$

Expectation

Definition The expectation of a discrete random variable X, denoted by E[X], is given by

$E[X] = \sum_{i} i \Pr(X = i),$

where the summation is over all values in the range of X. The expectation is finite if $\sum_{i} |i| \Pr(X = i)$ converges; otherwise, the expectation is unbounded.

The expectation (or mean or average) is a weighted sum over all possible values of the random variable.

Median

Definition The median of a random variable X is a value m such that

Pr(X < m) ≤ 1/2 and Pr(X > m) < 1/2.

Linearity of Expectation

Theorem For any two random variables X and Y,

E[X + Y ] = E[X ] + E[Y ].

Lemma For any constant c and discrete random variable X,

E[cX] = cE[X].

Bernoulli Random Variable

A Bernoulli or an indicator random variable:

$Y = \begin{cases} 1 & \text{if the experiment succeeds,} \\ 0 & \text{otherwise.} \end{cases}$

Let Pr(Y = 1) = p.

E[Y] = p · 1 + (1 − p) · 0 = p = Pr(Y = 1).

Binomial Random Variable

Definition A binomial random variable X with parameters n and p, denoted by B(n, p), is defined by the following distribution on j = 0, 1, 2,..., n:

$\Pr(X = j) = \binom{n}{j} p^j (1 - p)^{n-j}.$

Expectation of a Binomial Random Variable

$$
\begin{aligned}
E[X] &= \sum_{j=0}^{n} j \binom{n}{j} p^j (1-p)^{n-j} \\
&= \sum_{j=0}^{n} j \frac{n!}{j!(n-j)!} p^j (1-p)^{n-j} \\
&= \sum_{j=1}^{n} \frac{n!}{(j-1)!(n-j)!} p^j (1-p)^{n-j} \\
&= np \sum_{j=1}^{n} \frac{(n-1)!}{(j-1)!((n-1)-(j-1))!} p^{j-1} (1-p)^{(n-1)-(j-1)} \\
&= np \sum_{k=0}^{n-1} \frac{(n-1)!}{k!((n-1)-k)!} p^k (1-p)^{(n-1)-k} \\
&= np \sum_{k=0}^{n-1} \binom{n-1}{k} p^k (1-p)^{(n-1)-k} = np.
\end{aligned}
$$

X represents the number of successes in n independent experiments, each with success probability p. Let

$X_i = \begin{cases} 1 & \text{if the } i\text{-th experiment succeeds,} \\ 0 & \text{otherwise,} \end{cases}$

so Xi is the indicator random variable for the i-th experiment. Using linearity of expectations,

" n # n X X E[X ] = E Xi = E[Xi ] = np. i=1 i=1 Example: Finding the k-Smallest Element


Example: Finding the k-Smallest Element

Procedure Order(S, k);
Input: A set S, an integer k ≤ |S| = n.
Output: The k-smallest element in the set S.

1 If |S| = k = 1 return S.
2 Choose a random element y uniformly from S.

3 Compare all elements of S to y. Let S1 = {x ∈ S | x ≤ y} and S2 = {x ∈ S | x > y}.

4 If k ≤ |S1| return Order(S1, k) else return Order(S2, k − |S1|).
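A direct Python transcription of the procedure may help; this is a sketch that assumes the elements of S are distinct and that k is 1-based, as in the pseudocode above:

```python
import random

def order(S, k):
    """Return the k-th smallest element of S (1 <= k <= len(S), elements distinct)."""
    if len(S) == 1:                      # then necessarily k == 1
        return S[0]
    y = random.choice(S)                 # step 2: uniform random element of S
    S1 = [x for x in S if x <= y]        # step 3: elements <= y (y included)
    S2 = [x for x in S if x > y]         #         elements  > y
    if k <= len(S1):                     # step 4
        return order(S1, k)
    return order(S2, k - len(S1))

if __name__ == "__main__":
    data = random.sample(range(1000), 25)
    assert order(data, 7) == sorted(data)[6]   # 7th smallest
```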

Theorem The algorithm returns the k-smallest element in S performing O(n) comparisons in expectation.

Proof
• We say that a call to Order(S, k) was successful if the random element y was in the middle 1/3 of the set S. A call is successful with probability 1/3.
• After the i-th successful call, the size of the set S is bounded by n(2/3)^i. Hence there are at most log_{3/2} n successful calls.
• Let X be the total number of comparisons. Let Xi be the number of comparisons between the i-th successful call (included) and the (i+1)-th (excluded): $E[X] \le \sum_{i=0}^{\log_{3/2} n} E[X_i]$.
• Let Yi (i > 0) be the number of calls between the i-th successful call and the (i+1)-th successful call. The number of calls from one successful call to the next (included) is geometric with parameter 1/3, so E[Yi] = 3 for all i.
• Therefore E[Xi] ≤ n(2/3)^i E[Yi] = 3n(2/3)^i.
• Expected number of comparisons:

$E[X] \le 3n \sum_{j=0}^{\log_{3/2} n} \left(\frac{2}{3}\right)^j \le 9n.$

Conditional expectation

Definition Let Y and Z be random variables. The conditional expectation of Y given that Z = z is defined as

$E[Y \mid Z = z] = \sum_{y} y \Pr(Y = y \mid Z = z),$

where the summation is over all y in the range of Y.

We can also consider the expectation of Y conditioned on the event E:

$E[Y \mid E] = \sum_{y} y \Pr(Y = y \mid E).$

Lemma For any random variables X and Y,

$E[X] = \sum_{y} \Pr(Y = y) \, E[X \mid Y = y],$

where the sum is over the range of Y and all of the expectations exist.

Linearity of expectations also holds for conditional expectations: Lemma

For any finite collection of discrete random variables X1, X2, ..., Xn with finite expectations and for any random variable Y,

" n # n X X E Xi |Y = y = E[Xi |Y = y]. i=1 i=1 Conditional expectation

Note that for given random variables Y, Z the expression E[Y|Z] is itself a random variable f(Z) that takes the value E[Y|Z = z] when Z = z.

Theorem For given random variables Y, Z we have E[Y] = E[E[Y|Z]].

Proof. By definition,

$E[E[Y \mid Z]] = \sum_{z} E[Y \mid Z = z] \Pr(Z = z) = E[Y],$

where the last equality is the lemma above.
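As a toy illustration (the example is mine, not from the notes): let Z be the value of a fair die and, given Z = z, let Y be binomial with parameters z and 1/2. Then E[Y | Z] = Z/2, so E[E[Y | Z]] = E[Z]/2 = 1.75, and a short simulation confirms this also estimates E[Y]:

```python
import random

def sample_y_and_z():
    """Draw Z uniformly from {1, ..., 6}, then Y ~ Binomial(Z, 1/2) given Z."""
    z = random.randint(1, 6)
    y = sum(1 for _ in range(z) if random.random() < 0.5)
    return y, z

def main(trials: int = 200_000) -> None:
    sum_y, sum_cond = 0.0, 0.0
    for _ in range(trials):
        y, z = sample_y_and_z()
        sum_y += y
        sum_cond += z / 2                # E[Y | Z = z] = z / 2
    print("E[Y]        ~", sum_y / trials)     # ~ 1.75
    print("E[E[Y | Z]] ~", sum_cond / trials)  # ~ 1.75

if __name__ == "__main__":
    main()
```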

Application of conditional expectation

Consider a process S that, each time it is started, will recursively spawn new copies of the process S, where the number of new copies is a binomial random variable with parameters n and p. We also assume that these random variables are independent for each call to S. Now assume that we call the process S from a program once. What is the expected number of copies of S generated by this call?

To analyse this we introduce generations:
• The initial process S is (the only element) in generation 0.
• A (copy of) process S is in generation i if it was spawned by another process S in generation i − 1.

• Denote by Yi the number of processes in generation i.

• Then Y0 = 1.

As the number of processes spawned by the first process S is binomially distributed with parameters n, p, we have E[Y1] = np. Let us assume that there are y_{i−1} copies of process S in generation i − 1, and let Zk denote the number of processes spawned by the k-th process in generation i − 1, for 1 ≤ k ≤ y_{i−1}. Each Zk is binomially distributed with parameters n, p and hence has expectation E[Zk] = np.

"yi−1 # X E[Yi |Yi−1 = yi−1] = E Zk |Yi−1 = yi−1 k=1 yi−1 X = E[Zk |Yi−1 = yi−1] k=1 yi−1 n X X = jPr(Zk = j|Yi−1 = yi−1) k=1 j=0 yi−1 n X X = jPr(Zk = j) k=1 j=0

$$
\begin{aligned}
&= \sum_{k=1}^{y_{i-1}} E[Z_k] \\
&= y_{i-1} \, np.
\end{aligned}
$$

Now we can calculate the expected size of the ith generation inductively. We have

$E[Y_i] = E[E[Y_i \mid Y_{i-1}]] = E[Y_{i-1} \, np] = np \, E[Y_{i-1}],$

so by induction and Y0 = 1 we get E[Yi] = (np)^i.

The expected number of copies of S generated by the first call is given by

$E\left[\sum_{i \ge 0} Y_i\right] = \sum_{i \ge 0} E[Y_i] = \sum_{i \ge 0} (np)^i.$

This is unbounded if np ≥ 1 and equals 1/(1 − np) when np < 1. Hence the expected number of copies spawned by the first call is finite if and only if the expected number of copies spawned by each copy is less than 1.
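A small simulation sketch of this branching process (the parameter values and the truncation at 60 generations are my choices) estimates $\sum_{i \ge 0} Y_i$ and compares it with 1/(1 − np):

```python
import random

def binomial(n: int, p: float) -> int:
    """Number of new copies spawned by a single process: Binomial(n, p)."""
    return sum(1 for _ in range(n) if random.random() < p)

def total_copies(n: int, p: float, max_generations: int = 60) -> int:
    """Sum of Y_i over generations i >= 0, truncated after max_generations."""
    total, generation = 1, 1             # Y_0 = 1: the initial process
    for _ in range(max_generations):
        if generation == 0:
            break
        generation = sum(binomial(n, p) for _ in range(generation))
        total += generation
    return total

if __name__ == "__main__":
    n, p, trials = 4, 0.2, 20_000        # np = 0.8 < 1, so the expectation is finite
    est = sum(total_copies(n, p) for _ in range(trials)) / trials
    print("empirical sum of Y_i ~", est)               # ~ 1 / (1 - np) = 5.0
    print("1 / (1 - np)         =", 1 / (1 - n * p))
```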

The Geometric Distribution

Definition A geometric random variable X with parameter p is given by the following probability distribution on n = 1, 2, ...:

$\Pr(X = n) = (1 - p)^{n-1} p.$

Example: repeatedly draw independent Bernoulli random variables with parameter p > 0 until we get a 1. Let X be the number of trials up to and including the first 1. Then X is a geometric random variable with parameter p.

Memoryless Property

Lemma For a geometric random variable with parameter p and n > 0,

Pr(X = n + k | X > k) = Pr(X = n).

Proof.

$$
\begin{aligned}
\Pr(X = n + k \mid X > k) &= \frac{\Pr((X = n + k) \cap (X > k))}{\Pr(X > k)} \\
&= \frac{\Pr(X = n + k)}{\Pr(X > k)} = \frac{(1-p)^{n+k-1} p}{\sum_{i=k}^{\infty} (1-p)^i p} \\
&= \frac{(1-p)^{n+k-1} p}{(1-p)^k} \\
&= (1-p)^{n-1} p = \Pr(X = n).
\end{aligned}
$$
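A short simulation (the specific values p = 0.3, n = 4, k = 2 are arbitrary choices of mine) estimates both sides of the memoryless identity:

```python
import random

def geometric(p: float) -> int:
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

def main(p: float = 0.3, n: int = 4, k: int = 2, trials: int = 500_000) -> None:
    samples = [geometric(p) for _ in range(trials)]
    beyond_k = [x for x in samples if x > k]
    lhs = sum(1 for x in beyond_k if x == n + k) / len(beyond_k)  # Pr(X = n+k | X > k)
    rhs = sum(1 for x in samples if x == n) / trials              # Pr(X = n)
    print("Pr(X = n + k | X > k) ~", lhs)   # both ~ (1 - p)^(n-1) * p ≈ 0.103
    print("Pr(X = n)             ~", rhs)

if __name__ == "__main__":
    main()
```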

Memoryless Property – Intuition

I have a coin and flip it a number of times, without telling you either the outcomes or the number of times I flipped it.

I then give the coin to you and ask what the probability is that the first head will come out after you have flipped it n times.

Intuitively, this probability should not depend on the number of times I flip the coin and/or on the corresponding outcomes.

The memoryless property tells us it is indeed so.

Lemma Let X be a discrete random variable that takes on only non-negative integer values. Then

$E[X] = \sum_{i=1}^{\infty} \Pr(X \ge i).$

Proof.

$$
\begin{aligned}
\sum_{i=1}^{\infty} \Pr(X \ge i) &= \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} \Pr(X = j) \\
&= \sum_{j=1}^{\infty} \sum_{i=1}^{j} \Pr(X = j) \\
&= \sum_{j=1}^{\infty} j \Pr(X = j) = E[X].
\end{aligned}
$$

The interchange of (possibly) infinite summations is justified because the terms being summed are all non-negative.

For a geometric random variable X with parameter p,

$\Pr(X \ge i) = \sum_{n=i}^{\infty} (1-p)^{n-1} p = p\left(\frac{1}{p} - \frac{1 - (1-p)^{i-1}}{p}\right) = (1-p)^{i-1}.$

$E[X] = \sum_{i=1}^{\infty} \Pr(X \ge i) = \sum_{i=1}^{\infty} (1-p)^{i-1} = \frac{1}{1 - (1-p)} = \frac{1}{p}.$
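Both expressions for E[X] can be checked numerically; the sketch below truncates the infinite sums at an arbitrary cutoff:

```python
def geometric_pmf(p: float, n: int) -> float:
    """Pr(X = n) = (1 - p)^(n-1) * p for a geometric random variable."""
    return (1 - p) ** (n - 1) * p

def expectation_direct(p: float, cutoff: int = 10_000) -> float:
    """E[X] = sum over n of n * Pr(X = n), truncated at `cutoff`."""
    return sum(n * geometric_pmf(p, n) for n in range(1, cutoff))

def expectation_via_tails(p: float, cutoff: int = 10_000) -> float:
    """E[X] = sum over i of Pr(X >= i) = sum of (1 - p)^(i-1), truncated at `cutoff`."""
    return sum((1 - p) ** (i - 1) for i in range(1, cutoff))

if __name__ == "__main__":
    p = 0.25
    print(expectation_direct(p))     # ~ 4.0
    print(expectation_via_tails(p))  # ~ 4.0 = 1 / p
```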

Example: Coupon Collector's Problem

Suppose that each box of cereal contains a random coupon from a set of n different coupons. How many boxes of cereal do you need to buy before you obtain at least one of every type of coupon? Let X be the number of boxes bought until at least one of every type of coupon is obtained. Let Xi be the number of boxes bought while you had exactly i − 1 different coupons.

$X = \sum_{i=1}^{n} X_i.$

Xi is a geometric random variable with parameter

$p_i = 1 - \frac{i-1}{n}, \qquad E[X_i] = \frac{1}{p_i} = \frac{n}{n-i+1}.$

" n # X E[X ] = E Xi i=1 n X = E[Xi ] i=1 n X n = n − i + 1 i=1 n X 1 = n = n ln n + Θ(n). i i=1 Randomized Algorithms: • Analysis is true for any input. • The is the space of random choices made by the algorithm. • Repeated runs are independent. Probabilistic Analysis: • The sample space is the space of all possible inputs. • If the algorithm is deterministic repeated runs give the same output. Algorithm classification

Randomized Algorithms:
• The analysis is true for any input.
• The sample space is the space of random choices made by the algorithm.
• Repeated runs are independent.

Probabilistic Analysis:
• The sample space is the space of all possible inputs.
• If the algorithm is deterministic, repeated runs give the same output.

Algorithm classification

A Monte Carlo Algorithm is a randomized algorithm that may produce an incorrect solution. For decision problems: a one-sided error Monte Carlo algorithm errs on only one of the possible outputs; otherwise it is a two-sided error algorithm. A Las Vegas algorithm is a randomized algorithm that always produces the correct output. In both types of algorithms the run-time is a random variable.

Example of probabilistic analysis: Quicksort

Procedure Quicksort(S);
Input: A list S = {x1, ..., xn} of n distinct elements (from a totally ordered universe).
Output: The elements of S in sorted order.

1 If |S| ≤ 1 return S.
2 Choose the first element of S as the pivot, called x.

3 Compare all elements of S to x. Let S1 = {y ∈ S | y < x} and S2 = {y ∈ S | y > x}.

4 return Quicksort(S1), x, Quicksort(S2).

(S1 and S2 are lists. If the comparisons are performed “left to right”, the elements in S1 and S2 have the same order as in S.)
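A Python sketch of this Quicksort that also counts element-to-pivot comparisons, so the expectation derived below can be checked empirically on random permutations (the values n = 200 and 500 trials are arbitrary):

```python
import math
import random

def quicksort(S):
    """Sort a list of distinct elements; return (sorted list, number of comparisons)."""
    if len(S) <= 1:
        return list(S), 0
    x = S[0]                                 # pivot: the first element of S
    S1 = [y for y in S[1:] if y < x]         # preserves the input order, as noted above
    S2 = [y for y in S[1:] if y > x]
    left, c1 = quicksort(S1)
    right, c2 = quicksort(S2)
    return left + [x] + right, len(S) - 1 + c1 + c2   # |S| - 1 comparisons against x

if __name__ == "__main__":
    n, trials = 200, 500
    total = 0
    for _ in range(trials):
        perm = random.sample(range(n), n)    # uniformly random permutation
        result, comps = quicksort(perm)
        assert result == sorted(perm)
        total += comps
    harmonic = sum(1 / k for k in range(1, n + 1))
    print("average comparisons     ~", total / trials)
    print("(2n + 2) H_n - 4n       =", (2 * n + 2) * harmonic - 4 * n)  # exact expectation
    print("2 n ln n (leading term) =", 2 * n * math.log(n))
```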

Analysis

Worst-Case: if the input S satisfies x1 > x2 > ··· > xn, Quicksort(S) performs Θ(n^2) comparisons.


Probabilistic Analysis: what happens if S is chosen uniformly at random from all possible permutations of the values?

Theorem If S is chosen uniformly at random from all possible permutations of the values, the expected number of comparisons made by Quicksort(S) is 2n ln n + O(n).

Proof

• Let y1, y2, ..., yn be the same as x1, x2, ..., xn but sorted (in increasing order).
• For i < j: $X_{ij} = \begin{cases} 1 & \text{if } y_i \text{ and } y_j \text{ are compared,} \\ 0 & \text{otherwise.} \end{cases}$
• Let X be the total number of comparisons:

$X = \sum_{i=1}^{n} \sum_{j=i+1}^{n} X_{ij}.$

• By linearity of expectation:

$E[X] = \sum_{i=1}^{n} \sum_{j=i+1}^{n} E[X_{ij}].$

• Xij is an indicator random variable: E[Xij] = Pr(Xij = 1).

• Pr(Xij = 1) = ?
• Let Y^{ij} = {yi, yi+1, ..., yj−1, yj}. yi and yj are compared if and only if either yi or yj is the first pivot selected from Y^{ij}.

• Since the elements in S1 and S2 are in the same order as in S, the first pivot selected from Y^{ij} is the first element of Y^{ij} in the input list S.
• Since S is chosen at random from all possible permutations of the values, every element in Y^{ij} is equally likely to be first in S: the probability that yi is first in Y^{ij} is 1/(j − i + 1); the same holds for yj.
• Therefore Pr(Xij = 1) = 2/(j − i + 1).

$$
\begin{aligned}
E[X] &= \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} \\
&= \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k} \\
&= \sum_{k=2}^{n} \sum_{i=1}^{n+1-k} \frac{2}{k} \\
&= \sum_{k=2}^{n} (n+1-k) \frac{2}{k} \\
&= (n+1) \sum_{k=2}^{n} \frac{2}{k} - 2(n-1) \\
&= (2n+2) \sum_{k=1}^{n} \frac{1}{k} - 4n = 2n \ln n + \Theta(n).
\end{aligned}
$$