Basic Probability Theory

TU Eindhoven, Advanced Algorithms (2IL45), Course Notes

Events. Probability theory is about experiments that can have different outcomes. The possible outcomes are called the elementary events, and the sample space is the set of all elementary events. A subset of the sample space is an event.[1] (Note that if the subset is a singleton, then the event is an elementary event.) We call the set of all events defined by a sample space S the event space defined by S, and we denote it by Events(S).

As an example, consider the experiment where we flip a coin two times. One possible outcome of the experiment is HT: we first get heads (H) and then tails (T). The sample space is the set {HH, HT, TH, TT}. The subset {HT, TH, TT} is one of the 16 events defined by this sample space (a set of 4 elements has 2^4 = 16 subsets), namely “the event that we get at least one T”.

Probability distributions. A sample space S comes with a probability distribution, which is a mapping Pr : Events(S) → R such that

1. Pr[A] ≥ 0 for all events A ∈ Events(S).
2. Pr[S] = 1.
3. Pr[A ∪ B] = Pr[A] + Pr[B] for any two events A, B ∈ Events(S) that are mutually exclusive (that is, events A, B such that A ∩ B = ∅).

Another way of looking at this is that we assign non-negative probabilities to the elementary events (which sum to 1), and that the probability of an event is the sum of the probabilities of the elementary events it is composed of.

Note that property 3 only talks about mutually exclusive events. However, for any set of n events $A_1, \dots, A_n$ (no matter whether the events are mutually exclusive) we have

\[ \Pr[A_1 \cup \dots \cup A_n] \le \Pr[A_1] + \dots + \Pr[A_n]. \]

It is not necessarily the case that all elementary events have the same probability. For example, in the example above it could be that Pr[HH] = 1/16, Pr[HT] = Pr[TH] = 3/16, and Pr[TT] = 9/16. Note, however, that the probabilities of all the elementary events always sum to 1; this follows from the conditions 1, 2, 3 above.
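The definitions so far can be checked mechanically on the two-flip example. The following is a minimal Python sketch, not part of the original notes: it enumerates the event space, uses the non-uniform probabilities quoted in the text, and compares both sides of the union bound. The names `ELEMENTARY`, `events` and `pr` are chosen here purely for illustration.

```python
from fractions import Fraction
from itertools import chain, combinations

# Elementary events for two coin flips, with the non-uniform
# probabilities used as an example in the text.
ELEMENTARY = {
    "HH": Fraction(1, 16),
    "HT": Fraction(3, 16),
    "TH": Fraction(3, 16),
    "TT": Fraction(9, 16),
}


def events(sample_space):
    """Return every subset of the sample space, i.e. the event space."""
    items = list(sample_space)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def pr(event):
    """Probability of an event: sum over the elementary events it contains."""
    return sum((ELEMENTARY[e] for e in event), Fraction(0))


ALL_EVENTS = events(ELEMENTARY)
at_least_one_tail = frozenset({"HT", "TH", "TT"})
first_is_tail = frozenset({"TH", "TT"})
second_is_tail = frozenset({"HT", "TT"})

print(len(ALL_EVENTS))        # size of the event space
print(pr(at_least_one_tail))  # Pr["at least one T"]
# Union bound: these two events overlap in TT, so they are not mutually
# exclusive and the left-hand side is strictly smaller than the sum.
print(pr(first_is_tail | second_is_tail), "<=", pr(first_is_tail) + pr(second_is_tail))
```

Exact fractions are used instead of floating-point numbers so that conditions such as Pr[S] = 1 can be compared with equality.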
If all elementary events have the same probability (so for finite S we have Pr[A] = 1/|S| for all elementary events A ∈ S), then Pr is called a uniform distribution.

Conditional probabilities. The probability that an event happens may change if you already know that some other event has taken place. This leads to the following concept: the conditional probability of an event A given that another event B occurs is defined to be

\[ \Pr[A \mid B] := \frac{\Pr[A \cap B]}{\Pr[B]}, \]

where we require Pr[B] ≠ 0. Two events A and B are called independent if Pr[A ∩ B] = Pr[A] · Pr[B]. If Pr[B] ≠ 0 this implies Pr[A|B] = Pr[A], which explains the name: the probability that A has happened is independent of whether B has happened or not.

[1] In general not all subsets need to be events, but we shall only deal with situations where every subset is an event.

Random variables and their expected value. A random variable is a function that assigns a real number to each elementary event. (Note that it does not assign a number to a non-elementary event.) We will be working with discrete random variables: random variables whose possible values come from a discrete set (usually N or Z). In the coin-flip example, for instance, we could define a random variable X that denotes the number of heads, so X(HH) = 2, X(TH) = X(HT) = 1, and X(TT) = 0.

For a random variable X and a number x we can now define the event X = x to be {A ∈ S : X(A) = x}. In other words, X = x is the collection of elementary events A such that X(A) = x. Thus we have

\[ \Pr[X = x] = \sum_{A \in S \,:\, X(A) = x} \Pr[A]. \]

The expected value of a random variable X is defined as

\[ E[X] := \sum_{x} x \cdot \Pr[X = x], \]

where the sum is taken over all possible values x that the random variable X can take.

Lemma 1 (Markov inequality) Let X be a random variable taking only non-negative values, and let µ = E[X] be its expectation. Then for any t > 0 we have Pr[X > t · µ] ≤ 1/t.
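As an illustration of conditional probability, independence, expectation and the Markov inequality, here is a small Python sketch on the two-flip sample space. It assumes the uniform distribution (a fair coin), and the helper names `pr`, `conditional`, `heads` and `expectation` are illustrative choices, not notation from the notes.

```python
from fractions import Fraction

# Assumption for this example: two flips of a fair coin, uniform distribution.
PR = {outcome: Fraction(1, 4) for outcome in ("HH", "HT", "TH", "TT")}


def pr(event):
    """Probability of an event given as a set of elementary events."""
    return sum((PR[e] for e in event), Fraction(0))


def conditional(a, b):
    """Pr[A | B] = Pr[A ∩ B] / Pr[B]; requires Pr[B] != 0."""
    if pr(b) == 0:
        raise ValueError("conditioning event has probability 0")
    return pr(a & b) / pr(b)


def heads(outcome):
    """Random variable X: the number of heads in an elementary event."""
    return outcome.count("H")


def expectation(x):
    """E[X] = sum over all values v of v * Pr[X = v]."""
    values = {x(o) for o in PR}
    return sum((v * pr({o for o in PR if x(o) == v}) for v in values), Fraction(0))


first_heads = {"HH", "HT"}
second_heads = {"HH", "TH"}
at_least_one_head = {"HH", "HT", "TH"}

# Independent events: conditioning on the second flip changes nothing.
print(conditional(first_heads, second_heads), pr(first_heads))
# Dependent events: knowing that some flip was heads changes the probability.
print(conditional(first_heads, at_least_one_head), pr(first_heads))

mu = expectation(heads)
t = Fraction(3, 2)
tail = pr({o for o in PR if heads(o) > t * mu})
print(mu, tail, "<=", 1 / t)  # Markov: Pr[X > t * mu] <= 1/t
```

On such a small sample space the Markov bound is far from tight; the sketch only confirms that the inequality holds.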
Often several random variables are defined over the same sample space S. We have the following important property.

Lemma 2 (Linearity of expectation) E[X + Y] = E[X] + E[Y] for any two random variables X, Y. Moreover, for any constant c we have E[cX] = c E[X].

For multiplication a similar statement holds only if the random variables are independent. (X and Y are independent if for all x and y the events X = x and Y = y are independent.) So we have: E[XY] = E[X] · E[Y] for any two independent random variables.

Bernoulli and Poisson trials. A Bernoulli trial is an experiment with two possible outcomes: 0 (= fail) or 1 (= success). If the probability of success is p and the trial is repeated independently, then the expected number of trials up to and including the first success is 1/p. For example, if we have a fair die then the probability of throwing, say, 5 is 1/6, so the expected number of throws until 5 comes up is 6. With a fair coin, the expected number of coin flips to get heads is 2.

If we have a sequence of experiments with two possible outcomes, 0 or 1, but the success probability is different for each experiment, then these are called Poisson trials. The following result is often useful to obtain high-probability bounds for randomized algorithms:

Lemma 3 (Tail estimates for Poisson trials) Suppose we do n independent Poisson trials. Let $X_i$ denote the outcome of the i-th trial and let $p_i = \Pr[X_i = 1]$, where $0 < p_i < 1$. Let $X = \sum_{i=1}^{n} X_i$ be a random variable that indicates the total number of successes in the n trials, and let $\mu = E[X] = \sum_{i=1}^{n} p_i$ be the expected value of X (that is, the expected number of successful experiments). Then for any δ > 0 we have

\[ \Pr[X > (1+\delta)\mu] \le \left( \frac{e^{\delta}}{(1+\delta)^{1+\delta}} \right)^{\mu}. \]

Thus the probability of exceeding the expected value by more than some constant factor is exponentially small in µ. For example, for δ = 2 we have $e^{\delta}/(1+\delta)^{1+\delta} = e^{2}/27 < 1/2$, so we get

\[ \Pr[X > 3\mu] \le \left( \frac{1}{2} \right)^{\mu}. \]
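Two claims from this section can be checked numerically: the 1/p expectation for repeated Bernoulli trials and the tail bound of Lemma 3. The Python sketch below is an illustration under stated assumptions rather than part of the notes: the trials are taken to be independent, the success probabilities in `ps` are made up for the example, and all function names are hypothetical.

```python
import math


def expected_trials_until_success(p, terms=2000):
    """Partial sum of sum_k k * p * (1 - p)^(k - 1), which converges to 1/p."""
    return sum(k * p * (1.0 - p) ** (k - 1) for k in range(1, terms + 1))


def success_count_distribution(ps):
    """Exact distribution of X = X_1 + ... + X_n for independent trials.

    dist[k] is Pr[X = k]; the list is built by adding one trial at a time.
    """
    dist = [1.0]
    for p in ps:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)  # trial fails: count stays at k
            nxt[k + 1] += q * p      # trial succeeds: count becomes k + 1
        dist = nxt
    return dist


def tail_probability(ps, delta):
    """Pr[X > (1 + delta) * mu], summed from the exact distribution."""
    threshold = (1.0 + delta) * sum(ps)
    dist = success_count_distribution(ps)
    return sum(q for k, q in enumerate(dist) if k > threshold)


def tail_bound(mu, delta):
    """Right-hand side of Lemma 3: (e^delta / (1 + delta)^(1 + delta))^mu."""
    return (math.exp(delta) / (1.0 + delta) ** (1.0 + delta)) ** mu


print(expected_trials_until_success(1 / 6))  # close to 6 for a fair die

ps = [0.1, 0.2, 0.3, 0.4, 0.5] * 4  # made-up success probabilities, mu = 6
mu = sum(ps)
for delta in (0.5, 1.0, 2.0):
    # exact tail probability next to the bound from Lemma 3
    print(delta, tail_probability(ps, delta), tail_bound(mu, delta))
```

Computing the exact distribution by adding one trial at a time takes O(n^2) operations, which is fine for small n; for larger n one would typically rely on the bound itself or on simulation.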