
Bayesian Estimation & Information Theory
Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314)
Spring 2016, lecture 18

Bayesian Estimation

Three basic ingredients:
1. Likelihood $p(m \mid \theta)$ — together with the prior, jointly determines the posterior
2. Prior $p(\theta)$
3. Loss function $L(\hat\theta, \theta)$ — the "cost" of reporting estimate $\hat\theta$ if the true value is $\theta$

• Together these fully specify how to generate an estimate from the data.

The Bayesian estimator is defined as the minimizer of the expected loss under the posterior (the "Bayes' risk"):
$$\hat\theta(m) = \arg\min_{\hat\theta} \int L(\hat\theta, \theta)\, p(\theta \mid m)\, d\theta.$$
(A numerical sketch of this definition appears at the end of these notes.)

Typical loss functions and Bayesian estimators

1. Squared-error loss: $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$.
We need to find the $\hat\theta$ minimizing the expected loss
$$E[L \mid m] = \int (\hat\theta - \theta)^2\, p(\theta \mid m)\, d\theta.$$
Differentiating with respect to $\hat\theta$ and setting to zero:
$$2 \int (\hat\theta - \theta)\, p(\theta \mid m)\, d\theta = 0 \;\;\Rightarrow\;\; \hat\theta = \int \theta\, p(\theta \mid m)\, d\theta.$$
The estimator is the "posterior mean", also known as the Bayes' Least Squares (BLS) estimator.

2. "Zero-one" loss: $L(\hat\theta, \theta) = 1 - \delta(\hat\theta - \theta)$ (equal to 1 unless $\hat\theta = \theta$).
Expected loss:
$$\int \big(1 - \delta(\hat\theta - \theta)\big)\, p(\theta \mid m)\, d\theta = 1 - p(\hat\theta \mid m),$$
which is minimized by $\hat\theta = \arg\max_\theta\, p(\theta \mid m)$:
• the posterior maximum (or "mode")
• known as the maximum a posteriori (MAP) estimate.

MAP vs. posterior mean:
[Figure: a gamma pdf plotted over 0–10, with the mode and the mean falling at different locations.]
Note: the posterior maximum and the posterior mean are not always the same!

3. "L1" loss: $L(\hat\theta, \theta) = |\hat\theta - \theta|$.
Expected loss: $\int |\hat\theta - \theta|\, p(\theta \mid m)\, d\theta$.
HW problem: what is the Bayesian estimator for this loss function?

Simple example: Gaussian noise & prior

1. Likelihood: additive Gaussian noise, $m = \theta + n$ with $n \sim \mathcal{N}(0, \sigma_n^2)$
2. Prior: zero-mean Gaussian, $\theta \sim \mathcal{N}(0, \sigma_p^2)$
3. Loss function: doesn't matter — the posterior is Gaussian, so its mean, mode, and median coincide and all of the estimators above agree.

The posterior is Gaussian,
$$p(\theta \mid m) = \mathcal{N}\!\left(\frac{\sigma_p^2}{\sigma_p^2 + \sigma_n^2}\, m,\;\; \frac{\sigma_p^2 \sigma_n^2}{\sigma_p^2 + \sigma_n^2}\right),$$
so the MAP (= BLS) estimate is $\hat\theta(m) = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_n^2}\, m$, with the posterior variance given above. (A worked derivation appears at the end of these notes.)

[Figures: the likelihood $p(m \mid \theta)$ and the prior $p(\theta)$ plotted in the $(\theta, m)$ plane over $-8$ to $8$.]

Computing the posterior:
[Figure: likelihood × prior ∝ posterior, plotted as functions of $\theta$.]

Making a Bayesian estimate:
[Figure: for a measurement $m^*$, the posterior sits between the likelihood and the prior; the gap between the estimate and $m^*$ is the bias.]

High measurement noise ⇒ broad likelihood ⇒ large bias toward the prior.
Low measurement noise ⇒ narrow likelihood ⇒ small bias toward the prior.
(A numerical sketch of this shrinkage appears at the end of these notes.)

Bayesian estimation:
• The likelihood and prior combine to form the posterior.
• The Bayesian estimate is biased toward the prior, relative to the ML estimate.

Application #1: Biases in motion perception

[Figure: two drifting gratings, one high-contrast and one low-contrast. Which grating appears to move faster?]

Explanation from Weiss, Simoncelli & Adelson (2002):
• The low-contrast grating yields noisier measurements, so its likelihood is broader ⇒ the posterior has a larger shift toward 0 (the prior favors no motion).
• In the limit of a zero-contrast grating, the likelihood becomes infinitely broad ⇒ the percept goes to zero motion.
• Claim: this explains why people actually speed up when driving in fog!

Summary
• 3 ingredients for Bayesian estimation (prior, likelihood, loss)
• Bayes' least squares (BLS) estimator (posterior mean)
• maximum a posteriori (MAP) estimator (posterior mode)
• accounts for the stimulus-quality-dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)
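The posterior mean and variance quoted in the Gaussian example follow from a standard completing-the-square step; a sketch of that derivation, using the $\sigma_n$, $\sigma_p$ notation defined above:

```latex
\begin{align*}
p(\theta \mid m) &\propto p(m \mid \theta)\, p(\theta)
  \propto \exp\!\left(-\frac{(m-\theta)^2}{2\sigma_n^2} - \frac{\theta^2}{2\sigma_p^2}\right) \\
&= \exp\!\left(-\frac{1}{2}\left[\theta^2\!\left(\frac{1}{\sigma_n^2} + \frac{1}{\sigma_p^2}\right)
   - \frac{2 m \theta}{\sigma_n^2}\right] - \frac{m^2}{2\sigma_n^2}\right)
  \propto \exp\!\left(-\frac{(\theta - \mu)^2}{2\sigma^2}\right), \\
&\text{where }
  \sigma^2 = \left(\frac{1}{\sigma_n^2} + \frac{1}{\sigma_p^2}\right)^{-1}
           = \frac{\sigma_n^2 \sigma_p^2}{\sigma_n^2 + \sigma_p^2},
  \qquad
  \mu = \frac{\sigma^2}{\sigma_n^2}\, m = \frac{\sigma_p^2}{\sigma_n^2 + \sigma_p^2}\, m.
\end{align*}
```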
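A minimal numerical sketch (not from the lecture) of the estimator definition $\hat\theta(m) = \arg\min_{\hat\theta} \int L(\hat\theta, \theta)\, p(\theta \mid m)\, d\theta$, using a gamma posterior like the one in the "MAP vs. posterior mean" figure; the shape/scale values (3, 1) are arbitrary illustrative choices:

```python
import numpy as np
from scipy import stats

theta = np.linspace(1e-6, 15, 4001)            # grid over theta
post = stats.gamma(3.0, scale=1.0).pdf(theta)  # assumed gamma posterior density
post /= post.sum()                             # normalize on the grid

# Bayes' risk of each candidate estimate under squared-error loss:
risk_sq = np.array([np.sum((th - theta) ** 2 * post) for th in theta])
bls = theta[np.argmin(risk_sq)]    # minimizer should match the posterior mean

map_est = theta[np.argmax(post)]   # zero-one loss: minimizer is the mode

print(f"BLS via risk minimization: {bls:.3f}")            # ~ 3.0 (= shape * scale)
print(f"posterior mean directly:   {np.sum(theta * post):.3f}")
print(f"MAP (posterior mode):      {map_est:.3f}")        # ~ 2.0 (= (shape-1) * scale)
```

Minimizing the squared-error risk on the grid recovers the posterior mean, while the zero-one loss picks out the mode, so the two estimates differ, just as the gamma-pdf figure illustrates.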
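A minimal sketch of the Gaussian example: the BLS/MAP estimate shrinks the measurement $m$ toward the prior mean (0), and the bias grows with measurement noise. The values of m, sig_p, and the noise levels below are arbitrary illustrative choices:

```python
def posterior_mean(m, sig_n, sig_p):
    """Posterior mean for m = theta + noise, noise ~ N(0, sig_n^2),
    prior theta ~ N(0, sig_p^2): shrink m toward 0."""
    return (sig_p ** 2 / (sig_p ** 2 + sig_n ** 2)) * m

m, sig_p = 4.0, 2.0                  # measurement and prior std (assumed)
for sig_n in (0.5, 2.0, 8.0):        # low -> high measurement noise
    est = posterior_mean(m, sig_n, sig_p)
    print(f"sig_n = {sig_n:4.1f}   estimate = {est:5.2f}   bias = {est - m:+5.2f}")
```

With low noise the estimate stays near $m$ (small bias); with high noise it collapses toward 0 (large bias). This is the same shrinkage the Weiss, Simoncelli & Adelson account invokes for low-contrast (or foggy) stimuli.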