
Bayesian Estimation & Information Theory
Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314)
Spring 2016, lecture 18

Bayesian Estimation

Three basic ingredients:
1. Likelihood $p(m \mid \theta)$ — together with the prior, jointly determines the posterior
2. Prior $p(\theta)$
3. Loss function $L(\hat\theta, \theta)$ — the "cost" of reporting estimate $\hat\theta$ if the true value is $\theta$

• Together these fully specify how to generate an estimate from the data.

The Bayesian estimator is defined as the minimizer of the expected loss under the posterior (the "Bayes' risk"):
$$\hat\theta(m) = \arg\min_{\hat\theta} \int L(\hat\theta, \theta)\, p(\theta \mid m)\, d\theta.$$
(A numerical sketch of this definition appears at the end of these notes.)

Typical loss functions and Bayesian estimators

1. Squared-error loss: $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$.
We need to find the $\hat\theta$ minimizing the expected loss
$$E[L \mid m] = \int (\hat\theta - \theta)^2\, p(\theta \mid m)\, d\theta.$$
Differentiating with respect to $\hat\theta$ and setting to zero:
$$2 \int (\hat\theta - \theta)\, p(\theta \mid m)\, d\theta = 0 \;\;\Rightarrow\;\; \hat\theta = \int \theta\, p(\theta \mid m)\, d\theta.$$
The estimator is the "posterior mean", also known as the Bayes' Least Squares (BLS) estimator.

2. "Zero-one" loss: $L(\hat\theta, \theta) = 1 - \delta(\hat\theta - \theta)$ (equal to 1 unless $\hat\theta = \theta$).
Expected loss:
$$\int \big(1 - \delta(\hat\theta - \theta)\big)\, p(\theta \mid m)\, d\theta = 1 - p(\hat\theta \mid m),$$
which is minimized by $\hat\theta = \arg\max_\theta\, p(\theta \mid m)$:
• the posterior maximum (or "mode")
• known as the maximum a posteriori (MAP) estimate.

MAP vs. posterior mean:
[Figure: a gamma pdf plotted over 0–10, with the mode and the mean falling at different locations.]
Note: the posterior maximum and the posterior mean are not always the same!

3. "L1" loss: $L(\hat\theta, \theta) = |\hat\theta - \theta|$.
Expected loss: $\int |\hat\theta - \theta|\, p(\theta \mid m)\, d\theta$.
HW problem: what is the Bayesian estimator for this loss function?

Simple example: Gaussian noise & prior

1. Likelihood: additive Gaussian noise, $m = \theta + n$ with $n \sim \mathcal{N}(0, \sigma_n^2)$
2. Prior: zero-mean Gaussian, $\theta \sim \mathcal{N}(0, \sigma_p^2)$
3. Loss function: doesn't matter — the posterior is Gaussian, so its mean, mode, and median coincide and all of the estimators above agree.

The posterior is Gaussian,
$$p(\theta \mid m) = \mathcal{N}\!\left(\frac{\sigma_p^2}{\sigma_p^2 + \sigma_n^2}\, m,\;\; \frac{\sigma_p^2 \sigma_n^2}{\sigma_p^2 + \sigma_n^2}\right),$$
so the MAP (= BLS) estimate is $\hat\theta(m) = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_n^2}\, m$, with the posterior variance given above. (A worked derivation appears at the end of these notes.)

[Figures: the likelihood $p(m \mid \theta)$ and the prior $p(\theta)$ plotted in the $(\theta, m)$ plane over $-8$ to $8$.]

Computing the posterior:
[Figure: likelihood × prior ∝ posterior, plotted as functions of $\theta$.]

Making a Bayesian estimate:
[Figure: for a measurement $m^*$, the posterior sits between the likelihood and the prior; the gap between the estimate and $m^*$ is the bias.]

High measurement noise ⇒ broad likelihood ⇒ large bias toward the prior.
Low measurement noise ⇒ narrow likelihood ⇒ small bias toward the prior.
(A numerical sketch of this shrinkage appears at the end of these notes.)

Bayesian estimation:
• The likelihood and prior combine to form the posterior.
• The Bayesian estimate is biased toward the prior, relative to the ML estimate.

Application #1: Biases in motion perception

[Figure: two drifting gratings, one high-contrast and one low-contrast. Which grating appears to move faster?]

Explanation from Weiss, Simoncelli & Adelson (2002):
• The low-contrast grating yields noisier measurements, so its likelihood is broader ⇒ the posterior has a larger shift toward 0 (the prior favors no motion).
• In the limit of a zero-contrast grating, the likelihood becomes infinitely broad ⇒ the percept goes to zero motion.
• Claim: this explains why people actually speed up when driving in fog!

Summary
• 3 ingredients for Bayesian estimation (prior, likelihood, loss)
• Bayes' least squares (BLS) estimator (posterior mean)
• maximum a posteriori (MAP) estimator (posterior mode)
• accounts for the stimulus-quality-dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)
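The posterior mean and variance quoted in the Gaussian example follow from a standard completing-the-square step; a sketch of that derivation, using the $\sigma_n$, $\sigma_p$ notation defined above:

```latex
\begin{align*}
p(\theta \mid m) &\propto p(m \mid \theta)\, p(\theta)
  \propto \exp\!\left(-\frac{(m-\theta)^2}{2\sigma_n^2} - \frac{\theta^2}{2\sigma_p^2}\right) \\
&= \exp\!\left(-\frac{1}{2}\left[\theta^2\!\left(\frac{1}{\sigma_n^2} + \frac{1}{\sigma_p^2}\right)
   - \frac{2 m \theta}{\sigma_n^2}\right] - \frac{m^2}{2\sigma_n^2}\right)
  \propto \exp\!\left(-\frac{(\theta - \mu)^2}{2\sigma^2}\right), \\
&\text{where }
  \sigma^2 = \left(\frac{1}{\sigma_n^2} + \frac{1}{\sigma_p^2}\right)^{-1}
           = \frac{\sigma_n^2 \sigma_p^2}{\sigma_n^2 + \sigma_p^2},
  \qquad
  \mu = \frac{\sigma^2}{\sigma_n^2}\, m = \frac{\sigma_p^2}{\sigma_n^2 + \sigma_p^2}\, m.
\end{align*}
```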
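A minimal numerical sketch (not from the lecture) of the estimator definition $\hat\theta(m) = \arg\min_{\hat\theta} \int L(\hat\theta, \theta)\, p(\theta \mid m)\, d\theta$, using a gamma posterior like the one in the "MAP vs. posterior mean" figure; the shape/scale values (3, 1) are arbitrary illustrative choices:

```python
import numpy as np
from scipy import stats

theta = np.linspace(1e-6, 15, 4001)            # grid over theta
post = stats.gamma(3.0, scale=1.0).pdf(theta)  # assumed gamma posterior density
post /= post.sum()                             # normalize on the grid

# Bayes' risk of each candidate estimate under squared-error loss:
risk_sq = np.array([np.sum((th - theta) ** 2 * post) for th in theta])
bls = theta[np.argmin(risk_sq)]    # minimizer should match the posterior mean

map_est = theta[np.argmax(post)]   # zero-one loss: minimizer is the mode

print(f"BLS via risk minimization: {bls:.3f}")            # ~ 3.0 (= shape * scale)
print(f"posterior mean directly:   {np.sum(theta * post):.3f}")
print(f"MAP (posterior mode):      {map_est:.3f}")        # ~ 2.0 (= (shape-1) * scale)
```

Minimizing the squared-error risk on the grid recovers the posterior mean, while the zero-one loss picks out the mode, so the two estimates differ, just as the gamma-pdf figure illustrates.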
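A minimal sketch of the Gaussian example: the BLS/MAP estimate shrinks the measurement $m$ toward the prior mean (0), and the bias grows with measurement noise. The values of m, sig_p, and the noise levels below are arbitrary illustrative choices:

```python
def posterior_mean(m, sig_n, sig_p):
    """Posterior mean for m = theta + noise, noise ~ N(0, sig_n^2),
    prior theta ~ N(0, sig_p^2): shrink m toward 0."""
    return (sig_p ** 2 / (sig_p ** 2 + sig_n ** 2)) * m

m, sig_p = 4.0, 2.0                  # measurement and prior std (assumed)
for sig_n in (0.5, 2.0, 8.0):        # low -> high measurement noise
    est = posterior_mean(m, sig_n, sig_p)
    print(f"sig_n = {sig_n:4.1f}   estimate = {est:5.2f}   bias = {est - m:+5.2f}")
```

With low noise the estimate stays near $m$ (small bias); with high noise it collapses toward 0 (large bias). This is the same shrinkage the Weiss, Simoncelli & Adelson account invokes for low-contrast (or foggy) stimuli.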