1.3 Probability Theory

1.3.1 Basics

Probability theory starts with a primitive (non-definable) notion of an experiment, which has possible outcomes. The set of all possible outcomes is called the sample space and is usually denoted by $\Omega$. In any problem dealing with probabilities, the first step is to identify the sample space.

Example 1.5. An experiment consists in tossing a coin three times. The sample space is

$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\},$$

and an outcome is $\omega_i = (a_1, a_2, a_3)$, $a_i \in \{H, T\}$. Obviously, $|\Omega| = 2^3 = 8$.

Example 1.6. Consider an experiment of choosing a graph randomly from the set of all graphs on 4 vertices with 3 edges. The sample space here is $G(n, m)$, where $n = 4$ and $m = 3$. All the elementary events are presented in the figure below. Note that we consider the vertices of the graph to be distinguishable (labeled). How many non-isomorphic graphs are here?

[Figure 1.6: The sample space for $G(4, 3)$; here $|G| = \binom{\binom{4}{2}}{3} = 20$.]

Example 1.7. If you are asked to specify the probability space for the experiment that consists in picking $n$ balls out of an urn containing $M$ balls, you should ask the follow-up questions: do we care about the order of the balls, and which sampling procedure is used? The answer crucially depends on these details.

Start with an urn that contains $M$ distinguishable balls, and perform sampling with replacement (i.e., after each drawing we return the ball to the urn). If an outcome of our experiment is a sample of $n$ balls, then what is $|\Omega|$? To answer this question we need to distinguish between ordered and unordered samples, that is, whether or not we care about the exact order in which the balls appear. For an ordered sample we have $\omega_i = (a_1, \ldots, a_n)$, where each $a_j$ can take any of the $M$ values. Hence here $|\Omega| = M^n$.
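The counts for ordered samples can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original notes: the helper name `ordered_samples` and the values $M = 4$, $n = 3$ are illustrative assumptions.

```python
from itertools import product


def ordered_samples(items, n):
    """All ordered samples of size n drawn with replacement from items."""
    return list(product(items, repeat=n))


# Example 1.5: three coin tosses, outcomes (a1, a2, a3) with ai in {H, T}.
coin = ordered_samples("HT", 3)
print(len(coin))                    # 8
print(["".join(w) for w in coin])   # starts with 'HHH', ends with 'TTT'

# Example 1.7, ordered sampling with replacement.
# M = 4 balls and n = 3 draws are arbitrary illustrative values.
M, n = 4, 3
urn = ordered_samples(range(1, M + 1), n)
print(len(urn) == M ** n)           # True
```

Enumeration is only practical for small $M$ and $n$, but it makes explicit what an outcome is before any formula is used.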
If, however, we consider unordered samples,

$$\Omega = \{\omega : \omega = \{a_1, \ldots, a_n\},\ a_i = 1, \ldots, M\},$$

then the answer is not straightforward to come up with (except that it should be smaller than $M^n$). Let us prove that

$$N(M, n) := |\Omega| = \binom{M+n-1}{n}$$

in this case. I use induction. First, note that for $k \le M$

$$N(k, 1) = k = \binom{k}{1}.$$

Now assume that $N(k, n) = \binom{k+n-1}{n}$ for $k \le M$; I need to show that this formula continues to hold when $n$ is replaced with $n + 1$. For an unordered sample we can always assume that it is arranged as

$$a_1 \le a_2 \le \ldots \le a_n \le a_{n+1}.$$

The number of unordered samples with $a_1 = 1$ is $N(M, n)$, with $a_1 = 2$ is $N(M-1, n)$, etc., and with $a_1 = M$ is $N(1, n) = 1$. Hence,

$$
\begin{aligned}
N(M, n+1) &= N(M, n) + N(M-1, n) + \ldots + N(1, n) \\
&= \binom{M+n-1}{n} + \binom{M+n-2}{n} + \ldots + \binom{n}{n} \\
&= \left[\binom{M+n}{n+1} - \binom{M+n-1}{n+1}\right] + \left[\binom{M+n-1}{n+1} - \binom{M+n-2}{n+1}\right] + \ldots + \left[\binom{n+1}{n+1} - \binom{n}{n+1}\right] \\
&= \binom{M+n}{n+1}.
\end{aligned}
$$

Here we used the fact that

$$\binom{k+1}{l} = \binom{k}{l} + \binom{k}{l-1},$$

so the sum telescopes, and $\binom{n}{n+1} = 0$.

If we need to perform the sampling without replacement, we need $n \le M$. For ordered samples one has

$$(M)_n := |\Omega| = M(M-1)\cdots(M-n+1).$$

Note that if $n = M$, we obtain the permutations of the set of balls, the total number of which is $M! := 1 \cdot 2 \cdot \ldots \cdot M$ (and of course $0! := 1$). For unordered samples we do not care about the order of the balls in our sample, hence here

$$|\Omega| = \frac{(M)_n}{n!} = \frac{M(M-1)\cdots(M-n+1)}{n!} = \frac{M!}{n!\,(M-n)!} = \binom{M}{n}.$$

Example 1.8. Distribution of $n$ objects in $M$ cells (think about the distribution of $n$ particles among $M$ energy states). Assume that we assign the numbers $1, 2, \ldots, M$ to the cells and $1, 2, \ldots, n$ to the balls. If all the balls are distinguishable, then putting $n$ balls into $M$ cells amounts to having an ordered sample $(a_1, \ldots, a_n)$, where $a_i$ is the number of the cell into which the $i$th ball was put.
However, if we do not distinguish the balls, then an outcome is an unordered sample $\{a_1, \ldots, a_n\}$, where $a_i$ is the number of the cell into which an object is put at step $i$. Hence we get a bijection

$$
\begin{aligned}
\text{ordered samples} &\longleftrightarrow \text{distinguishable objects} \\
\text{unordered samples} &\longleftrightarrow \text{indistinguishable objects}
\end{aligned}
$$

In an analogous way we get

$$
\begin{aligned}
\text{sampling with replacement} &\longleftrightarrow \text{a cell may get any number of balls} \\
\text{sampling without replacement} &\longleftrightarrow \text{a cell can get at most one ball}
\end{aligned}
$$

Hence we have calculated the sizes of the sample spaces in the four cases for our example!

Problem 1.16. Give a combinatorial proof for the number of outcomes in the case of putting $n$ indistinguishable balls into $M$ cells such that any cell may contain any number of balls.

The next important thing to specify is the set of events $\mathcal{F}$. An event $A \in \mathcal{F}$ is a subset of $\Omega$; in our case of finite $\Omega$, $\mathcal{F}$ is usually taken to be the power set $2^\Omega$, i.e., the set of all subsets of $\Omega$. Events are sets, and we can perform the usual set operations on them (taking the complement, union, intersection, difference). In the jargon of probability theory, if $A, B \in \mathcal{F}$, the event $A \cap B$ reads "both $A$ and $B$ occurred", the event $A \cup B$ reads "either $A$ or $B$ or both occurred", the event $A \setminus B$ reads "$A$ occurred and $B$ did not", and the event $\bar{A} := \Omega \setminus A$ means "$A$ did not occur". For example, the event $A$ that the graph is not connected in the case $G(4, 3)$ consists of four outcomes (see the figure above).

Given the sample space and the set of events, to specify the probability space one finally needs the probability measure $P \colon \mathcal{F} \to \mathbb{R}$, for which the following axioms hold:

1. $P\{A\} \in [0, 1]$ for any $A \in \mathcal{F}$, and $P\{\Omega\} = 1$.

2. If $\{A_i \in \mathcal{F} : i \in I\}$ is a countable set of pairwise disjoint events, then

$$P\Big\{\bigcup_{i \in I} A_i\Big\} = \sum_{i \in I} P\{A_i\}.$$

The triple $(\Omega, \mathcal{F}, P)$ is called the probability space. When we dealt with the Ramsey numbers, we used the classical probability model that assigns to each event $A$ the probability

$$P\{A\} = \frac{|A|}{|\Omega|},$$

and each outcome $\omega_i \in \Omega$ has the probability

$$P\{\omega_i\} = \frac{1}{|\Omega|}.$$

The classical probability model is also called the uniform probability space on $\Omega$.

Now we can return to Example 1.8 and ask which probability space one should pick to solve one or another problem. This question is not as obvious as it might seem at first sight. For example, a simple question would be which is more probable: to observe 11 or 12 points when two dice are tossed. To answer this question we must first decide whether the outcomes $(5, 6)$ and $(6, 5)$ are considered different. If they are different (and we talk about ordered samples), then

$$P\{\text{11 points are observed}\} = \frac{2}{36},$$

where $|\Omega| = 36$, and

$$P\{\text{12 points are observed}\} = \frac{1}{36}.$$

However, if we think that $(5, 6)$ and $(6, 5)$ are the same outcome, then

$$P\{\text{11 points are observed}\} = \frac{1}{\binom{6+2-1}{2}} = \frac{1}{21} = P\{\text{12 points are observed}\}.$$

Those who have played dice probably know that 11 points are observed somewhat more frequently than 12, hence the first approach is the one we need to use so as not to contradict nature. However, if we move, e.g., to the realm of particle physics, then other sample spaces have to be chosen. Consider the statistical physics problem of describing the (random) distribution of particles in some region subdivided into smaller ones. It would be natural to assume that any configuration of particles has probability $M^{-n}$, where $n$ is the number of particles and $M$ is the number of subregions; in physics this is called Maxwell-Boltzmann statistics. Yet we know from experiments that this statistics does not apply to any known type of particles! Actually, photons, for example, satisfy Bose-Einstein statistics, where the probability of any configuration is $\binom{M+n-1}{n}^{-1}$ (i.e., the particles are indistinguishable and any subregion can accommodate more than one particle), and protons obey Fermi-Dirac statistics, where the probability of any configuration is $\binom{M}{n}^{-1}$ (the particles are indistinguishable and any subregion can accommodate only one particle).
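The effect of the choice of sample space can be seen by enumerating the configurations directly. The sketch below is an illustration under stated assumptions, not part of the original notes: the function names `configurations` and `prob_of_sum` and the model labels `"ordered"`, `"unordered"`, `"exclusive"` are made up here, standing for the sample spaces of sizes $M^n$, $\binom{M+n-1}{n}$ and $\binom{M}{n}$ respectively. The two-dice question corresponds to $M = 6$, $n = 2$.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product
from math import comb


def configurations(M, n, model):
    """Enumerate the sample space for n objects in M cells under a given model."""
    cells = range(1, M + 1)
    if model == "ordered":      # distinguishable objects, cells may repeat
        return list(product(cells, repeat=n))
    if model == "unordered":    # indistinguishable objects, cells may repeat
        return list(combinations_with_replacement(cells, n))
    if model == "exclusive":    # indistinguishable objects, at most one per cell
        return list(combinations(cells, n))
    raise ValueError(f"unknown model: {model}")


def prob_of_sum(outcomes, target):
    """Uniform-model probability |A| / |Omega| that the entries sum to target."""
    hits = sum(1 for w in outcomes if sum(w) == target)
    return Fraction(hits, len(outcomes))


M, n = 6, 2  # two dice: six faces, two throws
ordered = configurations(M, n, "ordered")
unordered = configurations(M, n, "unordered")
exclusive = configurations(M, n, "exclusive")

# Sizes agree with the formulas M**n, C(M+n-1, n) and C(M, n).
print(len(ordered), len(unordered), len(exclusive))             # 36 21 15
print(comb(M + n - 1, n), comb(M, n))                           # 21 15

print(prob_of_sum(ordered, 11), prob_of_sum(ordered, 12))       # 1/18 1/36
print(prob_of_sum(unordered, 11), prob_of_sum(unordered, 12))   # 1/21 1/21
```

Only the ordered model separates 11 points from 12; under the unordered model each of the 21 configurations is equally likely, so the two sums tie.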
Note that the general problem of probability theory is not to figure out how to assign probabilities to the outcomes, but, given the probabilities of the outcomes, to present a framework for inferring the probabilities of more complex events. In general, for our finite sample space $\Omega$ we can define the probability of an event $A$ as

$$P\{A\} = \sum_{\omega_i \in A} P\{\omega_i\},$$

and the axioms above follow from this definition.
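As an illustration of this definition, the probability of the event "the graph is not connected" in $G(4, 3)$ can be computed by summing the probabilities of its outcomes. This is a minimal sketch assuming the uniform model on the 20 labeled graphs; the names `is_connected` and `probability` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import combinations

VERTICES = (1, 2, 3, 4)
ALL_EDGES = list(combinations(VERTICES, 2))   # the 6 possible edges
OMEGA = list(combinations(ALL_EDGES, 3))      # all labeled graphs with 3 edges


def is_connected(edges):
    """Grow the component containing vertex 1 until no edge extends it."""
    reached = {VERTICES[0]}
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            if (u in reached) != (v in reached):  # edge leaves the component
                reached |= {u, v}
                changed = True
    return len(reached) == len(VERTICES)


def probability(event, weight):
    """P{A} as the sum of P{w} over the outcomes w in A."""
    return sum(weight[w] for w in event)


# Assumption: the uniform probability space, P{w} = 1 / |Omega|.
uniform = {w: Fraction(1, len(OMEGA)) for w in OMEGA}
disconnected = [w for w in OMEGA if not is_connected(w)]

print(len(OMEGA), len(disconnected))          # 20 4
print(probability(disconnected, uniform))     # 1/5
print(probability(OMEGA, uniform))            # 1
```

Replacing `uniform` with any other non-negative weights that sum to 1 gives a different probability measure on the same sample space, with `probability` unchanged.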