Bayesian Cognition: Probabilistic Models of Action, Perception, Inference, Decision and Learning


Bayesian Cognition: probabilistic models of action, perception, inference, decision and learning

Course 2: Bayesian Programming

Julien Diard
http://diard.wordpress.com
[email protected]
CNRS - Laboratoire de Psychologie et NeuroCognition, Grenoble

Pierre Bessière
CNRS - Institut des Systèmes Intelligents et de Robotique, Paris

Contents / Schedule

• c1: theoretical foundations, definition of the Bayesian programming formalism
• c2: Bayesian programming of robots
• c3: Bayesian cognitive modeling
• c4: Bayesian model comparison

• c1: Wednesday, October 11
• c2: Wednesday, October 18
• c3: Wednesday, October 25
• *no class the week of October 30*
• c4: Wednesday, November 8
• c5: Wednesday, November 15
• *no class the week of November 20*
• c6: Wednesday, November 29
• Exam: ?/?/? (for M2 students)

Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & conditional independence hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Probability Theory As Extended Logic

• "Frequentist" probabilities
  – A probability is a physical property of an object
  – Kolmogorov's axiomatics, set theory
  – P(A) = f_A = lim_{N→∞} N_A / N_Ω
  – Classical statistics (parent population, etc.)
• "Subjective" probabilities (E.T. Jaynes, 1922-1998)
  – Reference to a subject's state of knowledge
    • P("it rains" | Jean), P("it rains" | Pierre)
  – No reference to the limit frequency of occurrence of an event
  – Conditional probabilities: P(A | π), never a bare P(A)
  – Bayesian statistics

Principle

Incompleteness → Uncertainty: preliminary knowledge and experimental data are turned, by Bayesian learning, into a probabilistic representation, which is then exploited by Bayesian inference and decision.

P(a) + P(¬a) = 1
P(a ∧ b) = P(a) P(b | a) = P(b) P(a | b)

Computation rules

• Product rule
  P(A B | C) = P(A | C) P(B | A C) = P(B | C) P(A | B C)
  ⇒ Bayes' theorem (Reverend Thomas Bayes, ~1702-1761):
  P(B | A C) = P(B | C) P(A | B C) / P(A | C), if P(A | C) ≠ 0
• Sum rule
  P(A | C) + P(Ā | C) = 1, that is, Σ_{a ∈ A} P([A = a] | C) = 1
  ⇒ Marginalization rule:
  Σ_A P(A B | C) = P(B | C)

Bayesian Program

Program {
  Description {
    Specification {
      • Variables
      • Decomposition
      • Parametric forms
    }
    Identification
  }
  Question
}

My first Bayesian Program

• Variables
  A = {it rains, it does not rain}
  B = {Jean has his umbrella, Jean does not have his umbrella}
• Decomposition
  P(A ∧ B) = P(A) P(B | A)
• Parametric forms: conditional probability tables
• Identification:

  P(A):  A = it rains: 0.4   A = it does not rain: 0.6

  P(B | A):
                                        A = it rains   A = it does not rain
  B = Jean does not have his umbrella       0.05              0.9
  B = Jean has his umbrella                 0.95              0.1

• Question: P(A | B) = P(A) P(B | A) / P(B)
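The question P(A | B) can be answered mechanically from the two tables. Below is a minimal sketch in Python, assuming the tables above; the names (P_A, P_B_given_A, posterior_rain) are illustrative, not taken from the course material.

```python
# Minimal sketch of the umbrella program, assuming the tables above.
P_A = {"rain": 0.4, "no_rain": 0.6}
P_B_given_A = {
    "rain":    {"umbrella": 0.95, "no_umbrella": 0.05},
    "no_rain": {"umbrella": 0.10, "no_umbrella": 0.90},
}

def posterior_rain(b):
    """P(A | B = b) = P(A) P(b | A) / P(b), where P(b) is obtained by
    the marginalization rule: summing the joint P(A ∧ b) over A."""
    joint = {a: P_A[a] * P_B_given_A[a][b] for a in P_A}
    evidence = sum(joint.values())  # P([B = b])
    return {a: p / evidence for a, p in joint.items()}

print(posterior_rain("umbrella"))
# {'rain': 0.8636..., 'no_rain': 0.1363...}: seeing the umbrella raises P(rain) from 0.4
```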
Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & conditional independence hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Bayesian Spam Detection

• Classify texts into 2 categories, "spam" or "nonspam"
  – Only available information: a set of words
• Adapt to the user and learn from experience

Variables

• Spam
  – Boolean variable: {True, False}
• W0, W1, ..., WN−1
  – Wi is the presence or absence of word i in a text
  – Boolean variables: {True, False}

Decomposition

P(Spam ∧ W0 ∧ ... ∧ WN−1)
= P(Spam) × P(W0 | Spam) × P(W1 | W0 ∧ Spam) × ... × P(WN−1 | WN−2 ∧ ... ∧ W0 ∧ Spam)
≈ P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

Importance of the conditional independence hypotheses

• Number of probabilities
  – Careful: the number of probabilities ≠ the number of free parameters
  – Before the conditional independence hypotheses, P(Spam ∧ W0 ∧ ... ∧ WN−1) requires 2^{N+1} probabilities
  – After the hypotheses, P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam) requires 2 + 4N probabilities (2 for P(Spam), and a table of 4 for each of the N terms P(Wn | Spam))

Graphical representation: Bayesian Network

[figure: the exact decomposition corresponds to a fully connected network over Spam, W0, W1, W2, ..., WN−1; the approximate decomposition P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam) corresponds to a network in which Spam is the single parent of each word variable W0, W1, W2, ..., WN−1]

Parametric forms and identification

P([Spam = false]) = 0.25
P([Spam = true]) = 0.75

These could also be computed from a learning database:
P([Spam = false]) = θ
P([Spam = true]) = 1 − θ

With n_af and n_at the numbers of nonspam and spam mails containing word Wn, and Nbre_f and Nbre_t the total numbers of nonspam and spam mails:

P([Wn = true] | [Spam = false]) = n_af / Nbre_f
P([Wn = false] | [Spam = false]) = 1 − P([Wn = true] | [Spam = false])
P([Wn = true] | [Spam = true]) = n_at / Nbre_t
P([Wn = false] | [Spam = true]) = 1 − P([Wn = true] | [Spam = true])

Careful: if a word Wn is never seen in a spam, then P([Wn = true] | [Spam = true]) = 0, so that any mail m in which this word appears can never be classified as a spam.

• Laplace's rule of succession

P([Wn = true] | [Spam = false]) = (1 + n_af) / (2 + Nbre_f)
P([Wn = false] | [Spam = false]) = 1 − P([Wn = true] | [Spam = false])
P([Wn = true] | [Spam = true]) = (1 + n_at) / (2 + Nbre_t)
P([Wn = false] | [Spam = true]) = 1 − P([Wn = true] | [Spam = true])

Identification

These formulas illustrate the notions of free parameter, learning database, and parameter identification algorithm.
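Before turning to the inference questions, here is a minimal Python sketch of this identification step. It follows the slides' notation (n_af, n_at, Nbre_f, Nbre_t) and borrows, for concreteness, the counts from the numerical example further below; the dictionary layout is an illustrative assumption, not the course's code.

```python
def laplace(count, total):
    """Laplace's rule of succession: (1 + count) / (2 + total)."""
    return (1 + count) / (2 + total)

Nbre_f, Nbre_t = 250, 750  # nonspam / spam mails in the learning database

# Per-word counts (n_af, n_at): in how many nonspam / spam mails word n appears.
counts = {"fortune": (0, 375), "next": (125, 0), "programming": (250, 0),
          "money": (0, 750), "you": (125, 375)}

# theta[w] = (P([W = true] | Spam = false), P([W = true] | Spam = true))
theta = {w: (laplace(n_af, Nbre_f), laplace(n_at, Nbre_t))
         for w, (n_af, n_at) in counts.items()}

print(theta["fortune"])  # (0.00396825..., 0.5), matching the table below
```

Note how smoothing handles the zero-count problem flagged above: a word never seen in a spam gets probability 1 / (2 + Nbre_t), small but nonzero.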
Joint distribution and questions (1)

P(Spam) = Σ_{W0 ∧ ... ∧ WN−1} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)
        = Σ_{W0 ∧ ... ∧ WN−1} P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)
        = P(Spam)

(each term P(Wn | Spam) sums to 1 over the two values of Wn)

Joint distribution and questions (2)

P(Wn) = Σ_{Spam ∧ W_{i≠n}} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)
      = Σ_{Spam ∧ W_{i≠n}} P(Spam) × ∏_{i=0}^{N−1} P(Wi | Spam)
      = Σ_{Spam} P(Spam) × P(Wn | Spam)

P([Wn = true]) = 0.25 × (1 + n_af) / (2 + Nbre_f) + 0.75 × (1 + n_at) / (2 + Nbre_t)

Joint distribution and questions (3)

P(Wn | [Spam = true]) = Σ_{W_{i≠n}} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1) / Σ_{W0 ∧ ... ∧ WN−1} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)

P([Wn = true] | [Spam = true]) = (1 + n_at) / (2 + Nbre_t)

P(Spam | [Wn = true]) = P(Spam) × P([Wn = true] | Spam) / Σ_{Spam} P(Spam) × P([Wn = true] | Spam)

Joint distribution and questions (4)

P(Spam | W0 ∧ ... ∧ WN−1) = P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1) / Σ_{Spam} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)
                          = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam) / Σ_{Spam} [ P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam) ]

Bayesian Program

[figure: the complete spam detection model, written in the Bayesian Program format]

Result: numerical example

• 5 words considered: fortune, next, programming, money, you
• 1,000 mails in the learning database (250 nonspams, 750 spams)

n  Word         n_af  n_at  P(Wn=false|Spam=false)  P(Wn=true|Spam=false)  P(Wn=false|Spam=true)  P(Wn=true|Spam=true)
0  fortune        0   375   0.996032                0.00396825             0.5                    0.5
1  next         125     0   0.5                     0.5                    0.99867                0.00132979
2  programming  250     0   0.00396825              0.996032               0.99867                0.00132979
3  money          0   750   0.996032                0.00396825             0.00132979             0.99867
4  you          125   375   0.5                     0.5                    0.5                    0.5

Result: numerical example

• 2^5 = 32 possible mails

Subset number  Words present               P([Spam = false] | w0 ... w4)  P([Spam = true] | w0 ... w4)
 3             {money}                     5.24907e-06                    0.999995
11             {next, money}               0.00392659                     0.996073
12             {next, money, you}          0.00392659                     0.996073
15             {next, programming, money}  0.998656                       0.00134393
27             {fortune, next, money}      1.57052e-05                    0.999984

Results

SpamSieve: http://c-command.com/spamsieve/

Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & conditional independence hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Bayesian Program

Program {
  Description {
    Specification {
      • Variables
      • Decomposition
      • Parametric forms
    }
    Identification
  }
  Question
}

Logical Propositions

Logical propositions are denoted by lowercase names: a
Usual logical operators: a ∧ b, a ∨ b, ¬a

Probability of a Logical Proposition

To assign a probability to a given proposition a, it is necessary to have at least some preliminary knowledge, summed up by a proposition π.
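To wrap up the spam example, here is a minimal end-to-end sketch in Python of question (4), P(Spam | w0 ∧ ... ∧ w4), on the numerical example above. The function names and data layout are illustrative assumptions, not the course's code; the printed posteriors can be checked against the table of 32 possible mails.

```python
# Rebuild theta as in the identification sketch above (Laplace smoothing).
def laplace(count, total):
    return (1 + count) / (2 + total)

counts = {"fortune": (0, 375), "next": (125, 0), "programming": (250, 0),
          "money": (0, 750), "you": (125, 375)}
theta = {w: (laplace(af, 250), laplace(at, 750)) for w, (af, at) in counts.items()}

def p_spam_given_words(present, theta, prior=(0.25, 0.75)):
    """Question (4): normalize P(Spam) × Π_n P(wn | Spam) over Spam.
    Returns (P([Spam = false] | words), P([Spam = true] | words))."""
    scores = list(prior)                       # [Spam = false, Spam = true]
    for w, (p_true_f, p_true_t) in theta.items():
        if w in present:                       # word observed in the mail
            scores[0] *= p_true_f
            scores[1] *= p_true_t
        else:                                  # word absent from the mail
            scores[0] *= 1 - p_true_f
            scores[1] *= 1 - p_true_t
    z = sum(scores)                            # marginalization over Spam
    return scores[0] / z, scores[1] / z

print(p_spam_given_words({"money"}, theta))
# ≈ (5.25e-06, 0.999995): subset 3 of the table above
print(p_spam_given_words({"next", "programming", "money"}, theta))
# ≈ (0.998656, 0.001344): subset 15, where "programming" rescues the mail
```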