Bayesian Cognition Probabilistic models of action, perception, inference, decision and learning

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #1 Bayesian Cognition

Lecture 2: Bayesian Programming

Julien Diard http://diard.wordpress.com [email protected] CNRS - Laboratoire de Psychologie et NeuroCognition, Grenoble Pierre Bessière CNRS - Institut des Systèmes Intelligents et de Robotique, Paris

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #2 Contents / Schedule

Content:
• c1: theoretical foundations, definition of the Bayesian Programming formalism
• c2: Bayesian robot programming
• c3: Bayesian cognitive modeling
• c4: Bayesian model comparison

Schedule:
• c1: Wednesday, October 11
• c2: Wednesday, October 18
• c3: Wednesday, October 25
• no class the week of October 30
• c4: Wednesday, November 8
• c5: Wednesday, November 15
• no class the week of November 20
• c6: Wednesday, November 29
• Exam: ?/?/? (for M2 students)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #3 Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #4 Probability Theory As Extended Logic

• "Frequentist" probabilities
  – A probability is a physical property of an object
  – Kolmogorov's axioms, set theory
  – P(A) = f_A = lim_{N→∞} n_A / N
  – Classical statistics (parent population, etc.)

• "Subjective" probabilities (E.T. Jaynes, 1922-1998)
  – Reference to a state of knowledge of a subject
    • P("it rains" | Jean), P("it rains" | Pierre)
    • No reference to the limiting frequency of occurrence of an event
  – Conditional probabilities: always P(A | π), never P(A)
  – Bayesian statistics

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #5 Principle

Incompleteness → Preliminary Knowledge + Experimental Data = (Bayesian Learning) → Probabilistic Representation → Uncertainty

P(a) + P(¬a) = 1
P(a ∧ b) = P(a) P(b | a) = P(b) P(a | b)

→ Decision

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #6 Computation rules

• Product rule:
  P(A ∧ B | C) = P(A | C) P(B | A ∧ C) = P(B | C) P(A | B ∧ C)

  → Bayes' theorem (Reverend Thomas Bayes, ~1702-1761):
  P(B | A ∧ C) = P(B | C) P(A | B ∧ C) / P(A | C), if P(A | C) ≠ 0

• Sum rule:
  P(A | C) + P(Ā | C) = 1 and, for a variable A, Σ_{a ∈ A} P([A = a] | C) = 1

  → Marginalization rule:
  Σ_A P(A ∧ B | C) = P(B | C)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #7 Bayesian Program

Program (PB)
  Description
    Specification
      • Variables
      • Decomposition
      • Parametric forms
    Identification
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #8 My first Bayesian Program

• Variables: A = {it rains, it does not rain}, B = {Jean has his umbrella, Jean does not have his umbrella}

• Decomposition: P(A ∧ B) = P(A) P(B | A)
• Parametric forms: conditional probability tables

• Identification:

  P(A):    A=it rains: 0.4    A=it does not rain: 0.6

  P(B | A)                              A=it rains    A=it does not rain
  B=Jean does not have his umbrella        0.05              0.9
  B=Jean has his umbrella                  0.95              0.1

• Question: P(A | B) = P(A) P(B | A) / P(B) (see the sketch below)
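A minimal sketch of this first Bayesian program in Python (the dictionary representation and names are my own, not from the slides): it stores the two tables above and answers the question P(A | B) by computing P(A) P(B | A) and normalizing.

```python
# Minimal sketch of the umbrella Bayesian program (names are illustrative).
P_A = {"rain": 0.4, "no_rain": 0.6}                      # P(A)
P_B_given_A = {                                          # P(B | A)
    "rain":    {"umbrella": 0.95, "no_umbrella": 0.05},
    "no_rain": {"umbrella": 0.10, "no_umbrella": 0.90},
}

def posterior_A_given_B(b):
    """Answer the question P(A | B=b) = P(A) P(B=b | A) / P(B=b)."""
    joint = {a: P_A[a] * P_B_given_A[a][b] for a in P_A}  # P(A) P(B=b | A)
    Z = sum(joint.values())                               # P(B=b), by marginalization
    return {a: p / Z for a, p in joint.items()}

print(posterior_A_given_B("umbrella"))
# {'rain': 0.863..., 'no_rain': 0.136...}
```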

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #9 Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & conditional independence hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #10 Bayesian Spam Detection

• Classify texts into 2 categories, "spam" or "nonspam"
  – Only available information: a set of words
• Adapt to the user and learn from experience

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #11 Variables

• Spam – Boolean variable: {True, False}

• W0, W1, ..., WN-1

– Wi is the presence or absence of word i in a text – Boolean variables: {True, False}

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #12 Decomposition

P(Spam ∧W0 ∧ ... ∧WN−1)

= P(Spam)× P(W0 | Spam)× P(W1 |W0 ∧Spam)

× ... × P(WN−1 |WN−2 ∧ ... ∧W0 ∧Spam)

P(Spam ∧ W0 ∧ ... ∧ WN−1) ≈ P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #13 Importance of the conditional independence hypotheses

• Number of probability values
  – Caution: number of probability values ≠ number of free parameters
  – Before the conditional independence hypotheses: P(Spam ∧ W0 ∧ ... ∧ WN−1) requires 2^(N+1) probability values
  – After the hypotheses: P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam) requires 2 + 4N probability values

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #14 Graphical representation

Exact decomposition:
P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × P(W0 | Spam) × P(W1 | W0 ∧ Spam) × ... × P(WN−1 | WN−2 ∧ ... ∧ W0 ∧ Spam)

Decomposition after the conditional independence hypotheses:
P(Spam ∧ W0 ∧ ... ∧ WN−1) ≈ P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

[Graphs: the two corresponding graphical models over Spam and W0, W1, W2, …, WN-1; the right one is the structure Spam → W0, W1, W2, …, WN-1]

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #15 Parametric forms and identification

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

P([Spam = false]) = 0.25
P([Spam = true]) = 0.75

These values could also be computed from a learning database:
P([Spam = false]) = θ
P([Spam = true]) = 1 − θ

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #16 Parametric forms and identification

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

P([Wn = true] | [Spam = false]) = n_f^n / Nbre_f
P([Wn = false] | [Spam = false]) = 1 − P([Wn = true] | [Spam = false])
P([Wn = true] | [Spam = true]) = n_t^n / Nbre_t
P([Wn = false] | [Spam = true]) = 1 − P([Wn = true] | [Spam = true])

where n_f^n and n_t^n are the numbers of nonspam and spam mails containing word n, and Nbre_f and Nbre_t are the total numbers of nonspam and spam mails in the learning database.

Caution: if a word Wn has never been seen in any spam of the database, then any mail m in which it appears can never be classified as spam.

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #17 Parametric forms and identification

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

• Laplace's rule of succession (a sketch of this identification step follows below):

P([Wn = true] | [Spam = false]) = (1 + n_f^n) / (2 + Nbre_f)
P([Wn = false] | [Spam = false]) = 1 − P([Wn = true] | [Spam = false])
P([Wn = true] | [Spam = true]) = (1 + n_t^n) / (2 + Nbre_t)
P([Wn = false] | [Spam = true]) = 1 − P([Wn = true] | [Spam = true])
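A short sketch of this identification step, assuming a toy corpus represented as (set of words, is_spam) pairs (that representation is my own choice, not the slides'):

```python
def learn_laplace(corpus, vocabulary):
    """Identify P([Wn=true] | Spam) with Laplace's rule of succession.

    corpus: list of (set_of_words, is_spam) pairs; vocabulary: list of the N words.
    Returns (p_true_nonspam, p_true_spam) as lists indexed by n.
    """
    nb_spam = sum(1 for _, is_spam in corpus if is_spam)
    nb_nonspam = len(corpus) - nb_spam
    p_true_f, p_true_t = [], []
    for word in vocabulary:
        n_f = sum(1 for words, is_spam in corpus if not is_spam and word in words)
        n_t = sum(1 for words, is_spam in corpus if is_spam and word in words)
        p_true_f.append((1 + n_f) / (2 + nb_nonspam))   # P([Wn=true] | [Spam=false])
        p_true_t.append((1 + n_t) / (2 + nb_spam))      # P([Wn=true] | [Spam=true])
    return p_true_f, p_true_t

# Example with a tiny toy corpus of two mails:
corpus = [({"money", "you"}, True), ({"next", "programming"}, False)]
vocab = ["fortune", "next", "programming", "money", "you"]
print(learn_laplace(corpus, vocab))
```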

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #18 Identification

P([Wn = true] | [Spam = false]) = (1 + n_f^n) / (2 + Nbre_f)
P([Wn = true] | [Spam = true]) = (1 + n_t^n) / (2 + Nbre_t)

Notions of free parameter, learning database, and parameter identification algorithm.

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #19 Joint distribution and questions (1)

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

P(Spam) = Σ_{W0 ∧ ... ∧ WN-1} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)
        = Σ_{W0 ∧ ... ∧ WN-1} P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)
        = P(Spam)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #20 Joint distribution and questions (2)

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

P(Wn) = Σ_{Spam ∧ Wi≠n} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)
      = Σ_{Spam ∧ Wi≠n} P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)
      = Σ_{Spam} P(Spam) × P(Wn | Spam)

P([Wn = true]) = 0.25 × (1 + n_f^n) / (2 + Nbre_f) + 0.75 × (1 + n_t^n) / (2 + Nbre_t)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #21 Joint distribution and questions (3)

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

P(Wn | [Spam = true]) = Σ_{Wi≠n} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1) / Σ_{W0 ∧ ... ∧ WN-1} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)
P([Wn = true] | [Spam = true]) = (1 + n_t^n) / (2 + Nbre_t)

P(Spam | [Wn = true]) = P(Spam) × P([Wn = true] | Spam) / Σ_{Spam} P(Spam) × P([Wn = true] | Spam)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #22 Joint distribution and questions (4)

Recall: P(Spam ∧ W0 ∧ ... ∧ WN−1) = P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)

P(Spam | W0 ∧ ... ∧ WN-1) = P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1) / Σ_{Spam} P(Spam ∧ W0 ∧ ... ∧ Wn ∧ ... ∧ WN−1)

P(Spam | W0 ∧ ... ∧ WN−1) = [P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)] / [Σ_{Spam} P(Spam) × ∏_{n=0}^{N−1} P(Wn | Spam)]

(A sketch of this computation follows below.)
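A sketch of how this last question can be answered numerically, reusing the (hypothetical) tables produced by the learn_laplace sketch above:

```python
def p_spam_given_words(present_words, vocabulary, p_true_f, p_true_t,
                       prior_spam=0.75):
    """P([Spam=true] | W0 ... WN-1): normalize the two joint-probability terms."""
    joint_t, joint_f = prior_spam, 1.0 - prior_spam
    for n, word in enumerate(vocabulary):
        if word in present_words:
            joint_t *= p_true_t[n]        # P([Wn=true]  | [Spam=true])
            joint_f *= p_true_f[n]        # P([Wn=true]  | [Spam=false])
        else:
            joint_t *= 1.0 - p_true_t[n]  # P([Wn=false] | [Spam=true])
            joint_f *= 1.0 - p_true_f[n]  # P([Wn=false] | [Spam=false])
    return joint_t / (joint_t + joint_f)
```

In practice the products are usually computed in log space to avoid numerical underflow when N is large.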

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #23 Bayesian Program

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #24 Result: numerical example

• 5 words considered: fortune, next, programming, money, you
• 1000 mails in the learning database (250 nonspam, 750 spam)

n   Word         n_f^n   n_t^n   P(Wn=false | Spam=false)   P(Wn=true | Spam=false)   P(Wn=false | Spam=true)   P(Wn=true | Spam=true)
0   fortune        0      375        0.996032                   0.00396825                0.5                       0.5
1   next         125        0        0.5                        0.5                       0.99867                   0.00132979
2   programming  250        0        0.00396825                 0.996032                  0.99867                   0.00132979
3   money          0      750        0.996032                   0.00396825                0.00132979                0.99867
4   you          125      375        0.5                        0.5                       0.5                       0.5

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #25 Result: numerical example

• 2^5 = 32 possible mails

Subset number   Words present                P([Spam=false] | w0 … w4)   P([Spam=true] | w0 … w4)
3               {money}                      5.24907e-06                 0.999995
11              {next, money}                0.00392659                  0.996073
12              {next, money, you}           0.00392659                  0.996073
15              {next, programming, money}   0.998656                    0.00134393
27              {fortune, next, money}       1.57052e-05                 0.999984

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #26 Results

SpamSieve http://c-command.com/spamsieve/

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #27 Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & conditional independence hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #28 Bayesian Program

Program (PB)
  Description
    Specification
      • Variables
      • Decomposition
      • Parametric forms
    Identification
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #29 Logical Proposition

Logical Propositions are denoted by lowercase names: a

Usual logical operators: a ∧ b, a ∨ b, ¬a

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #30 Probability of Logical Proposition To assign a probability to a given proposition a, it is necessary to have at least some preliminary knowledge, summed up by a proposition π. P(a | π) ∈ [0,1]

Probabilities of the conjunctions, disjunctions and negations of propositions:
P(a ∧ b | π), P(a ∨ b | π), P(¬a | π)

Probability of proposition a conditioned by both the preliminary knowledge π and some other proposition b:
P(a | b ∧ π)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #31 Normalization and Product rule

P(a | π) + P(¬a | π) =1

P(a ∧ b | π) = P(a | π) × P(b | a ∧ π) = P(b | π) × P(a | b ∧ π)

P(a ∨ b | π) = P(a | π) + P(b | π) − P(a ∧ b | π)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #32 Discrete Variable

• Variables are denoted by names starting with one uppercase letter: X
• Definition: a discrete variable X is a set of propositions xi
  – Mutually exclusive: i ≠ j ⇒ [xi ∧ xj] = false
  – Exhaustive: at least one xi is true
• The cardinal of X (its number of values) is denoted card(X)
• Continuous variable: limit case when card(X) → ∞

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #33 Variable combination

• Variable conjunction

X ∧ Y = {xi ∧ yj} = Z

• Variable disjunction

X ∨ Y = {xi ∨ yj}

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #34 Bayesian Program

Program (PB)
  Description
    Specification
      • Variables
      • Decomposition
      • Parametric forms
    Identification
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #35 Description

The purpose of a description is to specify an effective method to compute a joint distribution on a set of variables {X1, X2, ..., Xn}, given some preliminary knowledge π and a set of experimental data δ. This joint distribution is denoted as:

P(X1 ∧ X 2 ∧...∧ X n |δ ∧π)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #36 Decomposition

Partition the variables into K subsets: Li ≠ ∅, Li = Xi1 ∧ Xi2 ∧ ...

Conjunction rule:
P(X1 ∧ X2 ∧ ... ∧ Xn | δ ∧ π) = P(L1 | δ ∧ π) × P(L2 | L1 ∧ δ ∧ π) × ... × P(LK | LK−1 ∧ ... ∧ L1 ∧ δ ∧ π)

Conditional independence hypotheses: Ri ⊂ Li−1 ∧ ... ∧ L1 such that
P(Li | Li−1 ∧ ... ∧ L1 ∧ δ ∧ π) = P(Li | Ri ∧ δ ∧ π)

Decomposition:
P(X1 ∧ X2 ∧ ... ∧ Xn | δ ∧ π) = P(L1 | δ ∧ π) × P(L2 | R2 ∧ δ ∧ π) × ... × P(LK | RK ∧ δ ∧ π)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #37 Independence and conditional independence

• Independence
  – P(X ∧ Y) = P(X) P(Y)
  – P(X | Y) = P(X)
• Conditional independence
  – P(X ∧ Y | Z) = P(X | Z) P(Y | Z)
  – P(X | Y ∧ Z) = P(X | Z)
  – P(Y | X ∧ Z) = P(Y | Z)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #38 Independence vs Conditional Independence

Independence but not conditional independence:

I0 is independent of F0: P(F0 | I0) = P(F0)
I0 is not independent of F0 conditionally on S0: P(F0 | I0 ∧ S0) ≠ P(F0 | S0)

(A self-contained numeric sketch of this phenomenon follows below.)
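The I0/F0/S0 example above refers to a network shown as a figure in the original slides; the following self-contained toy sketch (my own example, not the slides') shows the same phenomenon with two independent binary causes A and B and their common effect C = A OR B:

```python
import itertools

# Joint P(A, B, C) where A and B are independent fair coins and C = (A or B).
joint = {}
for a, b in itertools.product([0, 1], repeat=2):
    for c in (0, 1):
        joint[(a, b, c)] = 0.25 if c == int(a or b) else 0.0

def p(query, given=None):
    """P(query | given), by brute-force summation over the joint.
    query/given map a variable index (0=A, 1=B, 2=C) to a value."""
    given = given or {}
    match = lambda k, cond: all(k[i] == v for i, v in cond.items())
    num = sum(v for k, v in joint.items() if match(k, {**query, **given}))
    den = sum(v for k, v in joint.items() if match(k, given))
    return num / den

# A and B are independent:
print(p({1: 1}), p({1: 1}, {0: 1}))                 # both 0.5
# ... but not independent conditionally on C ("explaining away"):
print(p({1: 1}, {2: 1}), p({1: 1}, {0: 1, 2: 1}))   # 2/3 vs 0.5
```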

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #39 Independence vs Conditional Independence

Conditional independence but not independence:

S2 is independent of C0 conditionally on O0: P(S2 | C0 ∧ O0) = P(S2 | O0)
S2 is not independent of C0: P(S2 | C0) ≠ P(S2)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #40 Bayesian Program

Program (PB)
  Description
    Specification
      • Variables
      • Decomposition
      • Parametric forms
    Identification
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #41 Parametrical Forms or Recursive Questions

Parametric form:

P(Li | Ri ∧ δ ∧ π) = f_{μ(Ri, δ)}(Li)

Recursive question:

P(Li | Ri ∧ δ ∧ π) = P(Li | Ri ∧ δ' ∧ π')

– modular and hierarchical programs
– (or coherence variables)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #42 "Probabilistic vocabulary"

• Mathematica demo

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #43 Bayesian Program

Program (PB)
  Description
    Specification
      • Variables
      • Decomposition
      • Parametric forms
    Identification
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #44 Learning

• "Algorithmically"
  – P(X) is Gaussian, data {X0, …, XT}: estimate μ(X0, …, XT), σ(X0, …, XT)
• "Learning as Bayesian inference"
  P(X0 ∧ ... ∧ XT ∧ XT+1 ∧ μ) = P(XT+1 | μ) P(μ | X0 ∧ ... ∧ XT) P(X0 ∧ ... ∧ XT)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #45 Learning as inference

P(X0 ∧ ... ∧ XT ∧ XT+1 ∧ μ) = P(XT+1 | μ) P(μ | X0 ∧ ... ∧ XT) P(X0 ∧ ... ∧ XT)

• Inference "without" learning (of values):
  P(XT+1 | X0 ∧ ... ∧ XT) ∝ Σ_μ P(X0 ∧ ... ∧ XT ∧ XT+1 ∧ μ)
• More natural model:
  P(X0 ∧ ... ∧ XT ∧ XT+1 ∧ μ) = P(μ) × ∏_t P(Xt | μ)
• Prior on the parameter space
  – "bet", convergence speed, etc.
• and likewise for σ, etc.

(A small sketch of this scheme follows below.)
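A minimal sketch of "learning as inference" with a discretized parameter μ and a known σ (the grid, the uniform prior and the data are my own illustrative choices): the posterior over μ is obtained by Bayesian inference, and the prediction of X_{T+1} sums over μ.

```python
import math

def gauss(x, mu, sigma=1.0):
    """P(X = x | mu), a Gaussian density with known sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

mu_grid = [i / 10 for i in range(-50, 51)]          # candidate values of mu
prior = {mu: 1.0 / len(mu_grid) for mu in mu_grid}  # uniform P(mu): the "bet"
data = [0.8, 1.1, 0.9, 1.3]                         # X0 ... XT

# P(mu | X0 ... XT) proportional to P(mu) * prod_t P(Xt | mu)
post = {mu: prior[mu] * math.prod(gauss(x, mu) for x in data) for mu in mu_grid}
Z = sum(post.values())
post = {mu: p / Z for mu, p in post.items()}

# Prediction without committing to a point estimate:
# P(X_{T+1} | X0 ... XT) = sum_mu P(X_{T+1} | mu) P(mu | X0 ... XT)
def predictive(x_next):
    return sum(gauss(x_next, mu) * post[mu] for mu in mu_grid)

print(max(post, key=post.get), predictive(1.0))
```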

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #46 Bayesian Program

Program (PB)
  Description of P(X1 … Xn | π δ)
    Specification of the preliminary knowledge π
      • Variables
      • Decomposition
      • Parametric forms
    Identification from the data δ
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #47 Translating knowledge into the model?

• No unique manner
  – Limiting the domain to [a,b] = setting a parametric form with probability 0 outside of [a,b]
  – Conditional independence hypothesis
    • Explicit in the decomposition
    • Implicit in the parametric form

[Illustration: an example discrete probability table, (0.1, 0.2, 0.1, 0.4, 0.2), repeated for several variables]

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #48 Bayesian Program

Program (PB)
  Description
    Specification
      • Variables
      • Decomposition
      • Parametric forms
    Identification
  Question

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #49 Question

Given a description, a question is obtained by partitioning the set of variables into 3 subsets: the searched variables (not empty), the known variables and the free variables.

We define the Search, Known and Free as the conjunctions of the variables belonging to these three sets.

We define the corresponding question as the distribution:

P(Search | Known∧δ ∧π)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #50 How to choose the question to solve?

• Non-systematic choice…
• Compute P(Search | Known δ π)
  – Draw at random: s ← draw_S P(Search | Known δ π)
  – Maximize probability: s ← argmax_S P(Search | Known δ π)
  – Multiply with a cost function and minimize the expected loss
    • = multiply with a reward function and maximize the expected gain
    • (Bayesian decision theory)
• Compute
  – P(X1 X2 | X4 δ π)
  – or first P(X1 | X4 δ π), then P(X2 | X4 δ π)?

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #51 Probabilistic inference

• Theorem
  – If the joint distribution P(X1 X2 … Xn) is known,
  – then any "question" can be computed.
• A question: any term P(S | K), with S and K subsets of {X1, X2, …, Xn}
• Examples:
  – P(X1 | [Xn = xn]), P(X2 X4 | [X3 = x3])

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #52 Proof

Searched variables: S = Xi ... Xj
Known variables: K = Xk ... Xl
Free variables: F = {X1, ..., Xn} \ (S ∪ K)

P(S | K) = Σ_F P(S F | K)
         = Σ_F P(S F K) / P(K),   with P(K) ≠ 0
         = Σ_F P(S F K) / Σ_{S,F} P(S F K)

P(S | K) = (1/Z) Σ_F P(S F K)

(A brute-force implementation of this computation is sketched below.)
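A brute-force sketch of this result: given any joint distribution (here a small hand-written one over three Boolean variables, my own example), any question P(Search | Known) is answered by summing over the free variables and normalizing.

```python
from itertools import product

def ask(joint_p, domains, search, known):
    """P(Search | Known) = (1/Z) * sum over the Free variables of the joint.

    joint_p: function mapping a full assignment {variable: value} to its probability.
    domains: {variable: list of values}; search: list of variable names;
    known: {variable: value}. The free variables are all the others.
    """
    free = [v for v in domains if v not in search and v not in known]
    unnorm = {}
    for s_vals in product(*(domains[v] for v in search)):
        s = dict(zip(search, s_vals))
        unnorm[s_vals] = sum(
            joint_p({**known, **s, **dict(zip(free, f_vals))})
            for f_vals in product(*(domains[v] for v in free))
        )
    Z = sum(unnorm.values())            # Z = P(Known), assumed non-zero
    return {k: v / Z for k, v in unnorm.items()}

# Illustrative joint P(X1, X2, X3) = P(X1) P(X2 | X1) P(X3 | X2) over Booleans.
p1 = {True: 0.3, False: 0.7}
p2 = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}   # P(X2 | X1)
p3 = {True: {True: 0.5, False: 0.5}, False: {True: 0.1, False: 0.9}}   # P(X3 | X2)
joint = lambda a: p1[a["X1"]] * p2[a["X1"]][a["X2"]] * p3[a["X2"]][a["X3"]]
dom = {v: [True, False] for v in ("X1", "X2", "X3")}

print(ask(joint, dom, search=["X1"], known={"X3": True}))   # P(X1 | [X3 = true])
```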

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #53 Probabilistic inference: in practice

• Inference in the general case is NP-hard (Cooper, 1990)
• Two optimization problems:
  – Draw(P(Search | Known ∧ δ ∧ π))
  – P(Search | Known ∧ δ ∧ π) = (1/Z) × Σ_Free P(Search ∧ Known ∧ Free | δ ∧ π)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #54 Probabilistic inference

• General optimization methods
  – Monte Carlo methods
    • MCMC (Markov chain Monte Carlo)
    • Metropolis, Metropolis-Hastings
    • Gibbs sampling
• Efficient problem-dependent solutions
  – Particle filter
• Analytical problem-dependent solutions
  – Kalman filter

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #55 Probabilistic inference: inference engines

http://probabilistic-programming.org/wiki/Home

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #56 Probabilistic inference: importance of the model

• Conditional independence hypotheses
  – Reduce the model space
  – Also reduce inference time
    • Nested summations
    • Small summation spaces
• Modular models, hierarchical models
  – Structured Bayesian Programming

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #57 What can we do with a p(X | Y)?

• Only 2 main operators:

1. Draw
   – Draw a sample x according to P(X | Y)
2. Compute
   – Compute the probability value P([X=x] | Y)
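A tiny sketch of these two operators for a discrete distribution represented as a Python dict (the representation is my own choice):

```python
import random

dist = {"a": 0.2, "b": 0.5, "c": 0.3}        # an example P(X | Y), for a fixed Y

def draw(p):
    """Draw a sample x according to P(X | Y)."""
    return random.choices(list(p), weights=list(p.values()), k=1)[0]

def compute(p, x):
    """Compute the probability value P([X=x] | Y)."""
    return p[x]

print(draw(dist), compute(dist, "b"))
```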

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #58 What can we do with a probability distribution p(X | Y)?

• Many “summary” operators

• Maximize
  – Find the point x where P([X=x] | Y) is maximal
• Compute moments
  – Mean μ, variance σ², etc., precision (inverse of the variance, 1/σ²)
• Compute an expectation
  – E[X] = ⟨X⟩ = Σ_x x P([X=x])
• Compute its entropy
  – H(P(X | Y)) = −Σ_X P(X | Y) log P(X | Y)
  – Surprise of a value x: S([X=x]) = log(1 / P([X=x])) = −log P([X=x])
  – Entropy is the expectation of surprise: a uniform distribution maximizes surprise and entropy, a Dirac minimizes them
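A sketch of the entropy and surprise computations for the same dict representation, illustrating the uniform-versus-Dirac remark (natural logarithms are an arbitrary choice here):

```python
import math

def entropy(p):
    """H(P) = -sum_x P(x) log P(x), in nats."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)

def surprise(p, x):
    """Surprise of a value x: S(x) = log(1 / P(x)) = -log P(x)."""
    return -math.log(p[x])

uniform = {x: 0.25 for x in "abcd"}
dirac = {"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0}
print(entropy(uniform), entropy(dirac))   # maximal (log 4) vs 0
print(surprise(uniform, "a"))             # log 4
```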

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #59 Plan

• Summary & questions!
• Basic concepts: minimal example and spam detection example
• Bayesian Programming methodology
  – Variables
  – Decomposition & conditional independence hypotheses
  – Parametric forms (demo)
  – Learning
  – Inference
• Taxonomy of Bayesian models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #60 Synonymy: one thing, several names

• Kalman Filters
  – Linear-Gaussian state-space model
  – Linear dynamical system
• Dynamic Bayesian Network
  – Dynamic belief network
  – Temporal belief network
• Markov Localization
  – Bayesian filter
  – Input-output HMM
• State-space model
  – Continuous HMM
• Auto-regressive HMM
  – Correlation HMM
  – Conditionally Gaussian HMM / switching regression model
  – Switching Markov model
  – Switching regression model

• Bayesian networks
  – Belief networks
  – Generative models
  – Recursive graphical models
  – DPIN (directed probabilistic inference networks)
  – Causal (belief) networks
  – Probabilistic (causal) networks
• Markov Random Fields
  – UPIN (undirected probabilistic inference networks)
  – Undirected graphical models
  – Markov networks
  – Log-linear models
• Gibbs distribution
• Maxent models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #61 Polysemy: one name, several things

• Bayesian filters
  – with a control variable Ut
  – without a control variable Ut
• Kalman filters
  – with a control variable Ut
  – without a control variable Ut

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #62 First-order Probabilistic Logic vs propositional models

[Figure: a taxonomy of first-order probabilistic languages, organized by Outcome Space (relational structures; proofs, tuples of ground terms; nested data structures), Specificity (full distribution; constraints), Parameterization (CPDs; weights), Decomposition (independent probabilistic choices; dependencies) and Set of Objects (known; unknown), with example languages including IBAL, SLPs, Halpern's logic, PLPs, knowledge-based model construction, object-oriented Bayes nets, probabilistic relational models, RMNs, Markov logic, PHA, ICL, PRISM, LPADs, BUGS, RBNs, BLPs, PRMs, BLOG, MEBN and DAPER models.]

Fig. 1. A taxonomy of first-order probabilistic languages.

Milch, B. and Russell, S. (2007). First-order probabilistic languages: Into the unknown. In Inductive Logic Programming, pages 10–24. Springer.

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #63 First-order Probabilistic Logic vs propositional models

K. Murphy. Dynamic Bayesian Networks: Representation, Inference and Learning. Ph.D. thesis, University of California, Berkeley, CA, July 2002.

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #64 S. Roweis and Z. Ghahramani. A unifying review of linear gaussian models. Neural Computation, 11(2):305–345, February 1999.

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #65 First-order Probabilistic Logic vs propositional models

Factor Graphs ⊃ Probabilistic Factor Graphs ⊃
• Bayesian networks
• Dynamic Bayesian networks
• Bayesian filters
• Hidden Markov models
• Kalman filters
• (Partially observable) Markov decision processes
• …

J. Diard, P. Bessière, and E. Mazer. A survey of probabilistic models, using the Bayesian programming methodology as a unifying framework. In The Second International Conference on Computational Intelligence, Robotics and Autonomous Systems (CIRAS 2003), Singapore, December 2003.

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #66 Factor Graphs

• Bipartite graph representing the factorization of a function:

g(X1, X2, ..., Xn) = ∏_{j=1}^{m} f_j(S_j), with S_j ⊆ {X1, X2, ..., Xn}

• Example (see the sketch below):

g(X1, X2, X3) = f1(X1) f2(X1, X2) f3(X1, X2) f4(X2, X3)
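A sketch of the example above: the global function g is the product of the four factors, and an (unnormalized) marginal is obtained by brute-force summation; the factor values are illustrative, not from the slides.

```python
from itertools import product

# Illustrative binary factors for g(X1, X2, X3) = f1(X1) f2(X1,X2) f3(X1,X2) f4(X2,X3).
f1 = lambda x1: 0.6 if x1 else 0.4
f2 = lambda x1, x2: 0.9 if x1 == x2 else 0.1
f3 = lambda x1, x2: 0.7 if x2 else 0.3
f4 = lambda x2, x3: 0.8 if x2 == x3 else 0.2

def g(x1, x2, x3):
    """The global function: product of the four factors."""
    return f1(x1) * f2(x1, x2) * f3(x1, x2) * f4(x2, x3)

# Unnormalized marginal of X3: sum g over X1 and X2.
marg_x3 = {x3: sum(g(x1, x2, x3) for x1, x2 in product([0, 1], repeat=2))
           for x3 in (0, 1)}
print(marg_x3)
```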

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #67 Probabilistic Factor Graphs

• Each f_i is a probability distribution P(·)
• No constraint relative to Bayes' theorem:
  P(X1 ∧ X2) = P(X1 | X2) P(X2 | X1)
  – is a valid probabilistic FG
  – but is not a valid probabilistic model
→ Probabilistic FGs are a strict superset of the set of probabilistic models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #68 Bayesian Networks

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #69 Bayesian Networks

Exercise: are there valid decompositions that cannot be represented by a DAG?

• Directed Acyclic Graph (DAG)
  – Corresponds to a valid decomposition
• Structure
  – Analyze the local (in)dependencies (d-separation, etc.)

[Figure: example Bayesian network with nodes E, RA, GL, AC, IR, FL]

Judea Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (1988) Steffen Lauritzen, Graphical Models (1996) Michael Jordan, Learning in Graphical Models (1998) Brendan Frey, Graphical Models for Machine Learning and Digital Communication (1998)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #70 Bayesian Networks: inference

[Figure: example network with nodes E, RA, GL, AC, IR, FL; query node E, evidence nodes IR and FL]

• Exact inference: P(E | [IR=i] ∧ [FL=f])
  – Message passing
    • Local propagation of distributions
    • Clique tree, junction tree, etc.
• Approximate inference
  – Importance sampling, loopy belief propagation, variational methods

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #71 Bayesian Networks: learning

[Figure: example network with nodes E, RA, GL, AC, IR, FL]

• Parameter learning
  – Everything observed (easy)
  – Missing variables: Expectation-Maximization (EM)
    • Expectation over the missing variables given the observed variables
    • Maximum-probability model assuming the missing variables take their expected values
• Structure learning
  – Recovery algorithm (distinguishability of local structure)
  – Iterative methods:
    • Start from a given structure (e.g., fully connected)
    • Generate a variation, evaluate it, accept it if it is better
    • Iterate until convergence

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #72 Dynamic Bayesian Networks

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #73 Dynamic Bayesian Network

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #74 (recursive) Bayesian Filters

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #75 (recursive) Bayesian Filters

Global model

Local (stationary) model; first-order Markov hypothesis

prior → "learned"

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #76 Recursive (incremental) computation for filtering

Prediction: P(St | O0:t−1) = Σ_{St−1} [P(St | St−1) × P(St−1 | O0:t−1)]

Update: P(St | O0:t) = P(Ot | St) × P(St | O0:t−1) (up to normalization; see the sketch below)
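A sketch of this prediction/update recursion for a discrete state space (the two-state transition and observation tables are illustrative; the normalization of the update step is made explicit):

```python
def predict(belief, transition):
    """P(S^t | O^{0:t-1}) = sum_{s'} P(S^t | S^{t-1}=s') P(S^{t-1}=s' | O^{0:t-1})."""
    states = belief.keys()
    return {s: sum(transition[sp][s] * belief[sp] for sp in states) for s in states}

def update(belief_pred, observation_model, o):
    """P(S^t | O^{0:t}) proportional to P(O^t=o | S^t) P(S^t | O^{0:t-1})."""
    unnorm = {s: observation_model[s][o] * belief_pred[s] for s in belief_pred}
    Z = sum(unnorm.values())
    return {s: v / Z for s, v in unnorm.items()}

# Illustrative two-state example.
T = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.2, "B": 0.8}}            # P(S^t | S^{t-1})
O = {"A": {"hot": 0.7, "cold": 0.3}, "B": {"hot": 0.1, "cold": 0.9}}  # P(O^t | S^t)

belief = update({"A": 0.5, "B": 0.5}, O, "hot")   # P(S^0 | O^0), from the prior P(S^0)
for obs in ["hot", "cold"]:                        # then alternate prediction and update
    belief = update(predict(belief, T), O, obs)
print(belief)
```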

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #77

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #78 P(S0 | O0)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #79 P(S1 | O0) = Σ_{S0} [P(S1 | S0) × P(S0 | O0)]

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #80 P(S1 | O0:1) = P(O1 | S1) × P(S1 | O0)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #81 P(S2 | O0:1) = Σ_{S1} [P(S2 | S1) × P(S1 | O0:1)]

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #82 P(S2 | O0:2) = P(O2 | S2) × P(S2 | O0:1)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #83 P(S3 | O0:2) = Σ_{S2} [P(S3 | S2) × P(S2 | O0:2)]

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #84 P(S3 | O0:3) = P(O3 | S3) × P(S3 | O0:2)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #85 Hidden Markov Models

All discrete variables

Forward-backward, EM

max_{S0:T−1} P(S0:T−1 ∧ ST | O0:T)

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #86 Kalman Filters

Inference → matrix products! (a 1-D sketch follows below)
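A minimal one-dimensional sketch of a Kalman filter step, showing how the prediction and update reduce to a few products and sums of means and variances (all model parameters here are illustrative):

```python
# 1-D Kalman filter sketch: state x_t = a*x_{t-1} + noise(q), observation o_t = h*x_t + noise(r).
def kalman_step(mu, var, o, a=1.0, q=0.1, h=1.0, r=0.5):
    # Prediction: P(S^t | O^{0:t-1}) is Gaussian with
    mu_pred = a * mu
    var_pred = a * a * var + q
    # Update: multiply by the Gaussian likelihood P(O^t | S^t) and renormalize.
    k = var_pred * h / (h * h * var_pred + r)      # Kalman gain
    mu_new = mu_pred + k * (o - h * mu_pred)
    var_new = (1 - k * h) * var_pred
    return mu_new, var_new

mu, var = 0.0, 1.0                                  # prior P(S^0)
for o in [0.9, 1.1, 1.0]:
    mu, var = kalman_step(mu, var, o)
print(mu, var)
```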

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #87 Markov Localization models

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #88 (Partially Observable) Markov Decision Process

• Markov localisation

Bayesian Decision Theory

i i • Reward function R : S A R ⇥ – Cost, loss function

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #89 (Partially Observable) Markov Decision Process

• Maximize the expected gain
  – Infinite horizon: max E[Σ_{t=0}^{∞} γ^t Rt], with γ ∈ [0, 1[
• Approximate algorithms
  – Policy iteration
  – Value iteration

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #90 Thanks for your attention!

Questions ?

Diard – LPNC/CNRS Cognition Bayésienne – 2017-2018 #91