MEDICAL IMAGING INFORMATICS: Lecture # 1

Estimation Theory

Norbert Schuff Professor of Radiology VA Medical Center and UCSF [email protected]

UC Medical Imaging Informatics 2012 Nschuff SF VA Course # 170.03 Department of Slide 1/31 Radiology & Biomedical Imaging Objectives Of Today’ s Lecture

• Understand basic conceppgts of data modeling – Deterministic – Probabilistic – Bayesian • Learn the form of the most common estimators – Least squares estimator – Maximum likelihood estimator – Maximum a-posteriori estimator • Learn how estimators are used in image processing

UCSF Department of VA Radiology & Biomedical Imaging What Is Medical Imaging Informatics?

• Signal Processing – Digital Image Acquisition – Image Processing and Enhancement • – Computational anatomy – Statistics – – Data-mining – Workflow and Process Modeling and Simulation • Data Management – Picture Archiving and System (PACS) – Imaging Informatics for the Enterprise – Image-Enabled Electronic Medical Records – Radiology Information Systems (RIS) and Hospital Information Systems (HIS) – Quality Assurance – Archive Integrity and Security • Data – Image – 3D, Visualization and Multi -media – DICOM, HL7 and other Standards • – Imaging Vocabularies and – Transforming the Radiological Interpretation Process (TRIP)[2] – Computer pute -Aided D etectio n a nd Diag n osi s (CAD ). – Radiology Informatics •Etc. UCSF Department of VA Radiology & Biomedical Imaging What Is The Focus Of This Course? Learn using computational tools to maximize information for knowledge gain:

Pro-active

Drive the Let model Data Measurements knowledge Image Model

Collect data Compare Reactive with model

UCSF Department of VA Radiology & Biomedical Imaging Challenge: Maximize Information Gain

1. Q: How to estimate quantities from a given set of uncertitain ( noi se ) measuremen t?ts? A: Apply estimation theory (1st lecture today)

2. Q: How to quantify information? A: Apply (2nd lecture next week)

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 5/31 Radiology & Biomedical Imaging Motivation Example I: Tissue Classification

Gray/White Matter Segmentation Hypothetical Histogram

1.0

0.8

0.6

0.4

0.2

0.0

Intensity

GM/WM overlap 50:50; Can we do better than flipping a coin?

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 6/31 Radiology & Biomedical Imaging Motivation Example II: Dynam ic Signa l Trac king Goal: Capture a dynamic signal on a Goal: Capture a periodic signal corrupted by noise static background

5

3

1

-1

-3

-5

0 1000 2000 3000 4000 5000 6000 Time

D. Feinberg Advanced MRI Technologies, Sebastopol, CA

UCSF Department of VA Radiology & Biomedical Imaging Motivation Example III: Signa l Decompos ition

Diffusion Tensor Imaging (DTI) Goal: •Sensitive to random motion of water Capture directions of fiber bundles •Probes structures on a microscopic scale

Microscopic tissue sample

Dr. Van Wedeen, MGH

Quantitative Diffusion Maps UCSF Department of VA Radiology & Biomedical Imaging Basic Concepts of Modeling

Θ: unknown state of interest

ρ: measurement Θ: Estimator - a good guess of Θ based on measurements

Cartoon adapted from: Rajesh P . N . Rao, Bruno A. Olshausen Probabilistic Models of the Brain. MIT Press 2002. UCSF Department of VA Radiology & Biomedical Imaging Deterministic Model

N = number of measurements M = number of states, M=1 is possible Usually N > M and ||noise||2 > 0

⎛⎞⎛ϕ11111ϕθh hm ⎞⎛⎞⎛θ noise 1 ⎞ ⎜⎟⎜ ⎟⎜⎟⎜ ⎟ ⎜⎟⎜...=+ ⎟⎜⎟⎜i ⎟ ⎜⎟⎜ ⎟⎜⎟⎜ ⎟ ⎝⎠⎝nnh1 h nmm ⎠⎝⎠⎝ noise n ⎠

φNN=+H NxMθ M noise Unknowns

The model is deterministic, because discrete values of Θ are solutions.

Note: 1) we make no assumption about the distribution of Θ 1) Each value is as likely as any another value

What is the best estimator under these circumstances? UCSF Department of VA Radiology & Biomedical Imaging Least-Squares Estimator (LSE)

The best what we can do is minimizing noise:

φ−Hθ = noise ˆ φ−=HθLSE 0 TTˆ H φ−(HH)θLSE = 0 TT−1 θLSE = (HHH ) ϕ

•LSE is popular choice for model fitting •Useful for obtaining a descriptive measure

But •LSE makes no assumptions about distributions of data or parameters •Has no basis for statistics Î “deterministic model” UCSF Department of VA Radiology & Biomedical Imaging Prominent Examples of LSE

1 N 150 ˆ Mean Value: θϕmean = ∑ ()j 100 N j=1

2 50 1 N Variance ˆˆ Intensities (Y) θϕθ=−()j 0 var iance∑() mean N −1 j=1

-50

100 300 500 700 900 Measurements (x)

200 Amplitude: ˆ θ1

100 Frequency: ˆ ity θ2

Intens 0 Phase: ˆ θ3 Decay: ˆ -100 θ4

100 300 500 700 900 Measurements UCSF Department of VA Radiology & Biomedical Imaging Likelihood Model

Likelihood Lpϕ (Φ)=Φ(ϕ | ) We believe θ is governed by chance, but we don’t know the governing probability.

We perform measurements for all possible values of θ.

We obtain the likelihood function of θ given our measurements ρ

Note: θ is random ϕ Is measurable The likelihood function is the joint probability of unknown θ and known ϕ

UCSF Department of VA Radiology & Biomedical Imaging Likelihood Model (cont’d)

Likelihood functionL

LN(θϕθ)= Pr( N | , )

New Goal: Find an estimator which gives the most likely ppyrobability of θ underlying L ( θ ) .

UCSF Department of VA Radiology & Biomedical Imaging Maximum Likelihood Estimator (MLE)

Goal: Find estimator which gives the most likely probability of θ underlying L θ () Higgphest probabilit y ΘMLE = maxp(ϕN |θ )

ΘMLE can be found by taking the derivative of the likelihood function d lnp(ϕθ |) = 0 dθ N θθ= MLE

UCSF Department of VA Radiology & Biomedical Imaging Example I: MLE Of Normal Distribution

Normal distribution Normal Distribution 1.0 N ⎡⎤1 2 2 0.8 Pr(ϕθσN | , ϕ) θ=− exp ⎢⎥( ( j) ) 2σ 2 ∑ ⎣⎦j=1 0.6 a2

0.4 log of the normal distribution (normD) 0.2 N 1 2 0.0 ln Pr()ϕ |θσ ϕ , θ2 =−()()j N 2 ∑ 100 300 500 700 900 2 j=1 a1 Log Normal Distribution σ st MLE of the mean (1 derivative): 0 d 1 N ln Pri =−=ϕ j θ 0 -5 ( ) 2 ∑( ( ) ) dNθ 4σˆ j=1

-10 1 N Θ = ϕ ( j) MLE ∑ -15 N j=1 100 300 500 700 900 a1 UCSF Department of VA Radiology & Biomedical Imaging Example II: MLE Of Binominal Distribution (()Coin Tosses) n= # of tosses y= # of heads in n tosses θ = probability of obtaining a head in a toss (unknown) Pr(y| n, θ) = probability of heads from n tosses given θ

Pr(y|n=10, θ =7) 070.7 0.2

010.1

Pr(y|n=10, θ =3) y 0.3 0.0

0.2

0.1

0.0 12345678910 UCSF VA Ny Department of Radiology & Biomedical Imaging MLE Of Coin Toss (cont’d)

Goal: Given Pr(y| n = 10, θ), estimate the most likely probability distribution Θ MLE that produced the data. #heads Intuitively: Θ = MLE #tosses

For a fair coin Θ≈MLE 0.5

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 18/31 Radiology & Biomedical Imaging MLE Of Coin Toss (cont’d)

Coin tosses follow a binominal distribution

n! y ny− Pr(Yyn== | ,θ ) ⋅−θθ( 1 ) 0.25 (y!!)(n − y) 0.20 Likelihood function of coin tosses 0.15 0.10 Likelihood n! ny− 0.05 Lyn()θθθ|,=⋅−y () 1 yny!!()(− ) 0.00 0.1 0.3 0.5 0.7 0.9 Wθ What is the likelihood of observing 7 heads given that we tossed a coin 10 times?

10! 10− 7 Ln()θ |=== 10, y 7 ⋅ 0.57 () 1 − 0.5 = 0.12 For a fair coin θ =05:0.5: (7!)( 10− 7) !

10! 10− 7 For an unfair coin θ =0.6 Ln()θ |=== 10, y 7 ⋅ 0.67 () 1 − 0.6 = 0.21 7! 10()(− 7 ! )

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 19/31 Radiology & Biomedical Imaging MLE Of Coin Toss (cont’d)

Likelihood function of coin tosses 0.25

ny− 0.20 n! y d Ly(θθθ|1)= ⋅−( ) oo (yny!!)( − ) 0.15 0.10 Likeliho 0.05

0.00

0.1 0.3 0.5 0.7 0.9 W log likelihood function θ

lnLy()θ | =

-1 n! ln++yny lnθθ( −−) ln( 1 ) ()(yny!!− ) -2

-3

0.1 0.3 0.5 0.7 0.9 θΦ UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 20/31 Radiology & Biomedical Imaging MLE Of Coin Toss

Evaluate MLE equation (1st derivative)

dLln (i) y ( ny− ) =− =0 dθ θθ1−

y− nθ y# heads ====⇒0 Θ θθ(1− )MLE n # tosses

According to the MLE principle, the distribution Pr(y/n,ΘMLE) for a given n is the most likely distribution to have generated the observed data of y.

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 21/31 Radiology & Biomedical Imaging Relationship between MLE and LSE

Assume: • θ is independent of noise • Probability of measurements and noise have the same distribution

Prθ ()ϕNN |θϕθθ=− Prnoise ( H | )

For gaussian noise, i.e. zero mean

p(ρ|θ) is maximized when LSE is minimized

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 22/31 Radiology & Biomedical Imaging Bayesian Model Prior knowledge Now, the daemon comes into play, but we know the daemon’s preferences for θ (prior knowledge).

prior (θ )= Pr (θ )

New Goal: Find the estimator which gives the most likely probability distribution of θ given everything we know.

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 23/31 Radiology & Biomedical Imaging Bayesian Model

posteriorϕϕ(θϕ)= C⋅⋅ L (ϕN |θθ) Pr( )

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 24/31 Radiology & Biomedical Imaging Maximum A-Posteriori (()MAP) Estimator

Goal:

Find the most likely ΘMAP (max. posterior density of ) given ϕ. Maximize joint density Θ=MAPmaxLp(ϕ N |θϕ)( N )

ΘMAP can be found by taken the partial derivative

d ∂∂ ln LL(ϕϕ| )Pr( ) = ln ( | )+=ln Pr( ) 0 dθθθNNθθ θ∂∂ θ

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 25/31 Radiology & Biomedical Imaging Example III: MAP Of Normal Distribution

The sample mean of MAP is:

σ 2 N μ j ΦMAP = 22∑ϕ( ) σϕμ+Tσ j=1

If we do not have prior information on μ, σμÆ inf or TÆ inf

μμμˆˆˆMAP⇒ ML , LSE

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 26/31 Radiology & Biomedical Imaging Posterior Distribution and Decision Rules

(Θ|ρ)

Θ MSEX ΘMAP Θ UCSF Department of VA Radiology & Biomedical Imaging Decision Rules

Measurements Æ Likelihood Posterior function Distribution Æ Result

Prior Gain Distribution Function

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 28/31 Radiology & Biomedical Imaging Some Desirable Properties of Estimators I:

Unbiased: Mean value of the error should be zero E Φ - Φ = 0

Consistent: Error estimator should decrease asymptotically as number of measurements increase. (Mean Square Error (MSE))

2 MSE=Φ E Φ - Φ → 0forlargeN0 for large N

What happens to MSE when estimator is biased?

2 MSE=Φ+ EΦ - - b E b 2

variance bias UCSF Department of VA Radiology & Biomedical Imaging Some Desirable Properties of Estimators II:

Efficient: Co-variance matrix of error should decrease asymptotically to its minimal value for large N

T Cik=ΦΦ≤E (ΦΦi --i )( k k ) some ... very small value

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 30/31 Radiology & Biomedical Imaging Example: PtiOfEtitProperties Of Estimators Mean and ViVariance

11N Mean: EEjNμˆ = ∑ ϕμμ( ) =⋅ = NNj=1 The sample mean is an unbiased estimator of the true mean

N 2 2 112 σ EEjNμμˆ −= ϕ μ σ −=⋅=2 Variance: ( ) 22∑ ( ( ) ) NNNj=1

The variance is a consistent estimator because It approaches zero for large number of measurements.

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 31/31 Radiology & Biomedical Imaging Properties Of MLE

• is consistent: the MLE recovers asyypmptoticall y the true parameter values that generated the data for N Æ inf;

• Is efficient: The MLE achieves asymptotically the minimum error (= max. information)

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 32/31 Radiology & Biomedical Imaging Summary

• LSE is a descrippytive method to accurately fit data to a model. • MLE is a method to seek the probability distribution that makes th e ob serve d da ta mos t like ly. • MAP is a method to seek the most probably parameter value given prior information about the parameters and the observed data.

• If the influence of prior information decreases, MAP approaches MLE.

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 33/31 Radiology & Biomedical Imaging Some Priors in Imaging

• Smoothness of the brain • Anatomical boundaries • Intensity distributions • Anatomical shapes • Physical models – PitPoint sprea dfd functi on – Bandwidth limits • Etc.

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 34/31 Radiology & Biomedical Imaging Estimation Theory: Motivation Example I

Gray/White Matter Segmentation Hypothetical Histogram

1.0

0.8

0.6

0.4

0.2

0.0

Intensity

What works better than flipping a coin?

Design likelihood functions based on anatomy co-occurance of signal intensities others Determine prior distribution population based atlas of regional intensities model based distributions of intensities UCSF Department of VA others Radiology & Biomedical Imaging Estimation Theory: Motivation Example II

Goal: Capture dynamic signal on a Poor signal to noise static background 5

3

1

-1

-3

-5

0 1000 2000 3000 4000 5000 6000 Time Improvements to identify the dynamic signal:

Design likelihood functions based on auto-correlations anatomical information D. Feinberg Advanced MRI Technologies, Sebastopol, CA Determine prior distributions from serilial measuremen ts multiple subjects anatomy UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 36/31 Radiology & Biomedical Imaging Estimation Theory: Motivation Example III

Diffusion Spectrum Imaging – Human Cingulum Bundle Goal: Capture directions of fiber bundles

Impr ov em en ts to i den tif y tr acts:

Design likelihood functions based on similarity measures of adjacent voxels, e.g. correlations fiber anatomy

Determine prior distributions from anatomy fiber skeletons from a population others

Dr. Van Wedeen, MGH

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 37/31 Radiology & Biomedical Imaging MAP Estimation in Image Reconstructions with Edge-PiPiPreserving Priors

Dr. Ashish Raj, Cornell U UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 38/31 Radiology & Biomedical Imaging MAP in Image Reconstructions with Edge- Preserving Priors

For DTI, use the fact that coregistered DTI images have common edge features:

Dr. Justin Haldar Urbana-Champaign UCSF Department of VA Radiology & Biomedical Imaging MAP Estimation In Image Reconstruction

Human brain MRI. (a) The original LR data. (b) Zero-padding interpp()olation. (c) SR with box-PSF. (()d) SR with Gaussian-PSF.

From: A. Greenspan in The Computer Journal Advance Access published February 19, 2008

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 40/31 Radiology & Biomedical Imaging Improved ASL Perfusion Results

zDFT = zero-filled DFT

By Dr. John Kornak, UCSF UCSF VA Bayesian Automated Image Segmentation

Bruce Fischl, MGH

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 42/31 Radiology & Biomedical Imaging Population Atlases As Priors

Dr. Sarang Joshi, U Utah, Salt Lake City UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 43/31 Radiology & Biomedical Imaging Population Shape Regressions Based Age- SlSelec tive PiPriors

Age = 29 33 37 41 45 49 Dr. Sarang Joshi, U Utah, Salt Lake City UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 44/31 Radiology & Biomedical Imaging Imaging Software Using MLE And MAP

Packages Applications Languages VoxBo fMRI C/C++/IDL MEDx sMRI, fMRI C/C++/Tcl/Tk SPM fMRI, sMRI matlab/C iBrain IDL FSL fMRI, sMRI, DTI C/C++ fmristat fMRI matlab BrainVoyager sMRI C/C++ BrainTools C/C++ AFNI fMRI, DTI C/C++ Freesurfer sMRI C/C++ NiPy Python

UC Medical Imaging Informatics 2009, Nschuff SF VA Course # 170.03 Department of Slide 45/31 Radiology & Biomedical Imaging