Propensities and Probabilities
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Bayes and the Law
Bayes and the Law Norman Fenton, Martin Neil and Daniel Berger [email protected] January 2016 This is a pre-publication version of the following article: Fenton N.E, Neil M, Berger D, “Bayes and the Law”, Annual Review of Statistics and Its Application, Volume 3, 2016, doi: 10.1146/annurev-statistics-041715-033428 Posted with permission from the Annual Review of Statistics and Its Application, Volume 3 (c) 2016 by Annual Reviews, http://www.annualreviews.org. Abstract Although the last forty years has seen considerable growth in the use of statistics in legal proceedings, it is primarily classical statistical methods rather than Bayesian methods that have been used. Yet the Bayesian approach avoids many of the problems of classical statistics and is also well suited to a broader range of problems. This paper reviews the potential and actual use of Bayes in the law and explains the main reasons for its lack of impact on legal practice. These include misconceptions by the legal community about Bayes’ theorem, over-reliance on the use of the likelihood ratio and the lack of adoption of modern computational methods. We argue that Bayesian Networks (BNs), which automatically produce the necessary Bayesian calculations, provide an opportunity to address most concerns about using Bayes in the law. Keywords: Bayes, Bayesian networks, statistics in court, legal arguments 1 1 Introduction The use of statistics in legal proceedings (both criminal and civil) has a long, but not terribly well distinguished, history that has been very well documented in (Finkelstein, 2009; Gastwirth, 2000; Kadane, 2008; Koehler, 1992; Vosk and Emery, 2014). -
Probability and Statistics Lecture Notes
Probability and Statistics Lecture Notes Antonio Jiménez-Martínez Chapter 1 Probability spaces In this chapter we introduce the theoretical structures that will allow us to assign proba- bilities in a wide range of probability problems. 1.1. Examples of random phenomena Science attempts to formulate general laws on the basis of observation and experiment. The simplest and most used scheme of such laws is: if a set of conditions B is satisfied =) event A occurs. Examples of such laws are the law of gravity, the law of conservation of mass, and many other instances in chemistry, physics, biology... If event A occurs inevitably whenever the set of conditions B is satisfied, we say that A is certain or sure (under the set of conditions B). If A can never occur whenever B is satisfied, we say that A is impossible (under the set of conditions B). If A may or may not occur whenever B is satisfied, then A is said to be a random phenomenon. Random phenomena is our subject matter. Unlike certain and impossible events, the presence of randomness implies that the set of conditions B do not reflect all the necessary and sufficient conditions for the event A to occur. It might seem them impossible to make any worthwhile statements about random phenomena. However, experience has shown that many random phenomena exhibit a statistical regularity that makes them subject to study. For such random phenomena it is possible to estimate the chance of occurrence of the random event. This estimate can be obtained from laws, called probabilistic or stochastic, with the form: if a set of conditions B is satisfied event A occurs m times =) repeatedly n times out of the n repetitions. -
Estimating the Accuracy of Jury Verdicts
Institute for Policy Research Northwestern University Working Paper Series WP-06-05 Estimating the Accuracy of Jury Verdicts Bruce D. Spencer Faculty Fellow, Institute for Policy Research Professor of Statistics Northwestern University Version date: April 17, 2006; rev. May 4, 2007 Forthcoming in Journal of Empirical Legal Studies 2040 Sheridan Rd. ! Evanston, IL 60208-4100 ! Tel: 847-491-3395 Fax: 847-491-9916 www.northwestern.edu/ipr, ! [email protected] Abstract Average accuracy of jury verdicts for a set of cases can be studied empirically and systematically even when the correct verdict cannot be known. The key is to obtain a second rating of the verdict, for example the judge’s, as in the recent study of criminal cases in the U.S. by the National Center for State Courts (NCSC). That study, like the famous Kalven-Zeisel study, showed only modest judge-jury agreement. Simple estimates of jury accuracy can be developed from the judge-jury agreement rate; the judge’s verdict is not taken as the gold standard. Although the estimates of accuracy are subject to error, under plausible conditions they tend to overestimate the average accuracy of jury verdicts. The jury verdict was estimated to be accurate in no more than 87% of the NCSC cases (which, however, should not be regarded as a representative sample with respect to jury accuracy). More refined estimates, including false conviction and false acquittal rates, are developed with models using stronger assumptions. For example, the conditional probability that the jury incorrectly convicts given that the defendant truly was not guilty (a “type I error”) was estimated at 0.25, with an estimated standard error (s.e.) of 0.07, the conditional probability that a jury incorrectly acquits given that the defendant truly was guilty (“type II error”) was estimated at 0.14 (s.e. -
General Probability, II: Independence and Conditional Proba- Bility
Math 408, Actuarial Statistics I A.J. Hildebrand General Probability, II: Independence and conditional proba- bility Definitions and properties 1. Independence: A and B are called independent if they satisfy the product formula P (A ∩ B) = P (A)P (B). 2. Conditional probability: The conditional probability of A given B is denoted by P (A|B) and defined by the formula P (A ∩ B) P (A|B) = , P (B) provided P (B) > 0. (If P (B) = 0, the conditional probability is not defined.) 3. Independence of complements: If A and B are independent, then so are A and B0, A0 and B, and A0 and B0. 4. Connection between independence and conditional probability: If the con- ditional probability P (A|B) is equal to the ordinary (“unconditional”) probability P (A), then A and B are independent. Conversely, if A and B are independent, then P (A|B) = P (A) (assuming P (B) > 0). 5. Complement rule for conditional probabilities: P (A0|B) = 1 − P (A|B). That is, with respect to the first argument, A, the conditional probability P (A|B) satisfies the ordinary complement rule. 6. Multiplication rule: P (A ∩ B) = P (A|B)P (B) Some special cases • If P (A) = 0 or P (B) = 0 then A and B are independent. The same holds when P (A) = 1 or P (B) = 1. • If B = A or B = A0, A and B are not independent except in the above trivial case when P (A) or P (B) is 0 or 1. In other words, an event A which has probability strictly between 0 and 1 is not independent of itself or of its complement. -
The Probability Set-Up.Pdf
CHAPTER 2 The probability set-up 2.1. Basic theory of probability We will have a sample space, denoted by S (sometimes Ω) that consists of all possible outcomes. For example, if we roll two dice, the sample space would be all possible pairs made up of the numbers one through six. An event is a subset of S. Another example is to toss a coin 2 times, and let S = fHH;HT;TH;TT g; or to let S be the possible orders in which 5 horses nish in a horse race; or S the possible prices of some stock at closing time today; or S = [0; 1); the age at which someone dies; or S the points in a circle, the possible places a dart can hit. We should also keep in mind that the same setting can be described using dierent sample set. For example, in two solutions in Example 1.30 we used two dierent sample sets. 2.1.1. Sets. We start by describing elementary operations on sets. By a set we mean a collection of distinct objects called elements of the set, and we consider a set as an object in its own right. Set operations Suppose S is a set. We say that A ⊂ S, that is, A is a subset of S if every element in A is contained in S; A [ B is the union of sets A ⊂ S and B ⊂ S and denotes the points of S that are in A or B or both; A \ B is the intersection of sets A ⊂ S and B ⊂ S and is the set of points that are in both A and B; ; denotes the empty set; Ac is the complement of A, that is, the points in S that are not in A. -
1 Probabilities
1 Probabilities 1.1 Experiments with randomness We will use the term experiment in a very general way to refer to some process that produces a random outcome. Examples: (Ask class for some first) Here are some discrete examples: • roll a die • flip a coin • flip a coin until we get heads And here are some continuous examples: • height of a U of A student • random number in [0, 1] • the time it takes until a radioactive substance undergoes a decay These examples share the following common features: There is a proce- dure or natural phenomena called the experiment. It has a set of possible outcomes. There is a way to assign probabilities to sets of possible outcomes. We will call this a probability measure. 1.2 Outcomes and events Definition 1. An experiment is a well defined procedure or sequence of procedures that produces an outcome. The set of possible outcomes is called the sample space. We will typically denote an individual outcome by ω and the sample space by Ω. Definition 2. An event is a subset of the sample space. This definition will be changed when we come to the definition ofa σ-field. The next thing to define is a probability measure. Before we can do this properly we need some more structure, so for now we just make an informal definition. A probability measure is a function on the collection of events 1 that assign a number between 0 and 1 to each event and satisfies certain properties. NB: A probability measure is not a function on Ω. -
Conditional Probability and Bayes Theorem A
Conditional probability And Bayes theorem A. Zaikin 2.1 Conditional probability 1 Conditional probablity Given events E and F ,oftenweareinterestedinstatementslike if even E has occurred, then the probability of F is ... Some examples: • Roll two dice: what is the probability that the sum of faces is 6 given that the first face is 4? • Gene expressions: What is the probability that gene A is switched off (e.g. down-regulated) given that gene B is also switched off? A. Zaikin 2.2 Conditional probability 2 This conditional probability can be derived following a similar construction: • Repeat the experiment N times. • Count the number of times event E occurs, N(E),andthenumberoftimesboth E and F occur jointly, N(E ∩ F ).HenceN(E) ≤ N • The proportion of times that F occurs in this reduced space is N(E ∩ F ) N(E) since E occurs at each one of them. • Now note that the ratio above can be re-written as the ratio between two (unconditional) probabilities N(E ∩ F ) N(E ∩ F )/N = N(E) N(E)/N • Then the probability of F ,giventhatE has occurred should be defined as P (E ∩ F ) P (E) A. Zaikin 2.3 Conditional probability: definition The definition of Conditional Probability The conditional probability of an event F ,giventhataneventE has occurred, is defined as P (E ∩ F ) P (F |E)= P (E) and is defined only if P (E) > 0. Note that, if E has occurred, then • F |E is a point in the set P (E ∩ F ) • E is the new sample space it can be proved that the function P (·|·) defyning a conditional probability also satisfies the three probability axioms. -
CONDITIONAL EXPECTATION Definition 1. Let (Ω,F,P)
CONDITIONAL EXPECTATION 1. CONDITIONAL EXPECTATION: L2 THEORY ¡ Definition 1. Let (,F ,P) be a probability space and let G be a σ algebra contained in F . For ¡ any real random variable X L2(,F ,P), define E(X G ) to be the orthogonal projection of X 2 j onto the closed subspace L2(,G ,P). This definition may seem a bit strange at first, as it seems not to have any connection with the naive definition of conditional probability that you may have learned in elementary prob- ability. However, there is a compelling rationale for Definition 1: the orthogonal projection E(X G ) minimizes the expected squared difference E(X Y )2 among all random variables Y j ¡ 2 L2(,G ,P), so in a sense it is the best predictor of X based on the information in G . It may be helpful to consider the special case where the σ algebra G is generated by a single random ¡ variable Y , i.e., G σ(Y ). In this case, every G measurable random variable is a Borel function Æ ¡ of Y (exercise!), so E(X G ) is the unique Borel function h(Y ) (up to sets of probability zero) that j minimizes E(X h(Y ))2. The following exercise indicates that the special case where G σ(Y ) ¡ Æ for some real-valued random variable Y is in fact very general. Exercise 1. Show that if G is countably generated (that is, there is some countable collection of set B G such that G is the smallest σ algebra containing all of the sets B ) then there is a j 2 ¡ j G measurable real random variable Y such that G σ(Y ). -
Topic 1: Basic Probability Definition of Sets
Topic 1: Basic probability ² Review of sets ² Sample space and probability measure ² Probability axioms ² Basic probability laws ² Conditional probability ² Bayes' rules ² Independence ² Counting ES150 { Harvard SEAS 1 De¯nition of Sets ² A set S is a collection of objects, which are the elements of the set. { The number of elements in a set S can be ¯nite S = fx1; x2; : : : ; xng or in¯nite but countable S = fx1; x2; : : :g or uncountably in¯nite. { S can also contain elements with a certain property S = fx j x satis¯es P g ² S is a subset of T if every element of S also belongs to T S ½ T or T S If S ½ T and T ½ S then S = T . ² The universal set is the set of all objects within a context. We then consider all sets S ½ . ES150 { Harvard SEAS 2 Set Operations and Properties ² Set operations { Complement Ac: set of all elements not in A { Union A \ B: set of all elements in A or B or both { Intersection A [ B: set of all elements common in both A and B { Di®erence A ¡ B: set containing all elements in A but not in B. ² Properties of set operations { Commutative: A \ B = B \ A and A [ B = B [ A. (But A ¡ B 6= B ¡ A). { Associative: (A \ B) \ C = A \ (B \ C) = A \ B \ C. (also for [) { Distributive: A \ (B [ C) = (A \ B) [ (A \ C) A [ (B \ C) = (A [ B) \ (A [ C) { DeMorgan's laws: (A \ B)c = Ac [ Bc (A [ B)c = Ac \ Bc ES150 { Harvard SEAS 3 Elements of probability theory A probabilistic model includes ² The sample space of an experiment { set of all possible outcomes { ¯nite or in¯nite { discrete or continuous { possibly multi-dimensional ² An event A is a set of outcomes { a subset of the sample space, A ½ . -
ST 371 (VIII): Theory of Joint Distributions
ST 371 (VIII): Theory of Joint Distributions So far we have focused on probability distributions for single random vari- ables. However, we are often interested in probability statements concerning two or more random variables. The following examples are illustrative: • In ecological studies, counts, modeled as random variables, of several species are often made. One species is often the prey of another; clearly, the number of predators will be related to the number of prey. • The joint probability distribution of the x, y and z components of wind velocity can be experimentally measured in studies of atmospheric turbulence. • The joint distribution of the values of various physiological variables in a population of patients is often of interest in medical studies. • A model for the joint distribution of age and length in a population of ¯sh can be used to estimate the age distribution from the length dis- tribution. The age distribution is relevant to the setting of reasonable harvesting policies. 1 Joint Distribution The joint behavior of two random variables X and Y is determined by the joint cumulative distribution function (cdf): (1.1) FXY (x; y) = P (X · x; Y · y); where X and Y are continuous or discrete. For example, the probability that (X; Y ) belongs to a given rectangle is P (x1 · X · x2; y1 · Y · y2) = F (x2; y2) ¡ F (x2; y1) ¡ F (x1; y2) + F (x1; y1): 1 In general, if X1; ¢ ¢ ¢ ;Xn are jointly distributed random variables, the joint cdf is FX1;¢¢¢ ;Xn (x1; ¢ ¢ ¢ ; xn) = P (X1 · x1;X2 · x2; ¢ ¢ ¢ ;Xn · xn): Two- and higher-dimensional versions of probability distribution functions and probability mass functions exist. -
Probability Theory Review 1 Basic Notions: Sample Space, Events
Fall 2018 Probability Theory Review Aleksandar Nikolov 1 Basic Notions: Sample Space, Events 1 A probability space (Ω; P) consists of a finite or countable set Ω called the sample space, and the P probability function P :Ω ! R such that for all ! 2 Ω, P(!) ≥ 0 and !2Ω P(!) = 1. We call an element ! 2 Ω a sample point, or outcome, or simple event. You should think of a sample space as modeling some random \experiment": Ω contains all possible outcomes of the experiment, and P(!) gives the probability that we are going to get outcome !. Note that we never speak of probabilities except in relation to a sample space. At this point we give a few examples: 1. Consider a random experiment in which we toss a single fair coin. The two possible outcomes are that the coin comes up heads (H) or tails (T), and each of these outcomes is equally likely. 1 Then the probability space is (Ω; P), where Ω = fH; T g and P(H) = P(T ) = 2 . 2. Consider a random experiment in which we toss a single coin, but the coin lands heads with 2 probability 3 . Then, once again the sample space is Ω = fH; T g but the probability function 2 1 is different: P(H) = 3 , P(T ) = 3 . 3. Consider a random experiment in which we toss a fair coin three times, and each toss is independent of the others. The coin can come up heads all three times, or come up heads twice and then tails, etc. -
Recent Progress on Conditional Randomness (Probability Symposium)
数理解析研究所講究録 39 第2030巻 2017年 39-46 Recent progress on conditional randomness * Hayato Takahashi Random Data Lab. Abstract We review the recent progress on the definition of randomness with respect to conditional probabilities and a generalization of van Lambal‐ gen theorem (Takahashi 2006, 2008, 2009, 2011). In addition we show a new result on the random sequences when the conditional probabili‐ tie \mathrm{s}^{\backslash }\mathrm{a}\mathrm{r}\mathrm{e} mutually singular, which is a generalization of Kjos Hanssens theorem (2010). Finally we propose a definition of random sequences with respect to conditional probability and argue the validity of the definition from the Bayesian statistical point of view. Keywords: Martin‐Löf random sequences, Lambalgen theo‐ rem, conditional probability, Bayesian statistics 1 Introduction The notion of conditional probability is one of the main idea in probability theory. In order to define conditional probability rigorously, Kolmogorov introduced measure theory into probability theory [4]. The notion of ran‐ domness is another important subject in probability and statistics. In statistics, the set of random points is defined as the compliment of give statistical tests. In practice, data is finite and statistical test is a set of small probability, say 3%, with respect to null hypothesis. In order to dis‐ cuss whether a point is random or not rigorously, we study the randomness of sequences (infinite data) and null sets as statistical tests. The random set depends on the class of statistical tests. Kolmogorov brought an idea of recursion theory into statistics and proposed the random set as the compli‐ ment of the effective null sets [5, 6, 8].