
Inverse Problems for Dummies

Scott Ziegler (a very stable genius)

Colorado State University

September 13, 2018

What is an inverse problem?

There are many physical situations in which we acquire data from some phenomenon or object which we cannot see: geology, medicine, radar, astronomy, etc. The task of using the mathematical model of the situation to simulate data is called a forward problem. The task of using the gathered data to recreate the mathematical model of the situation is called an inverse problem. In practice, we typically look for a specific parameter of the mathematical model.

Some simple examples

Deblurring

[Figure: a deblurring example, from Mueller and Siltanen (2012).]

Some simple examples

X-ray imaging

[Figure: an X-ray imaging example, from Mueller and Siltanen (2012).]

Classifying inverse problems

Inverse problems can be broadly summarized by the following equation:

b = Ax + ε

where b represents the observation (or result of the forward problem), x represents the model parameters, A represents the operator governing the model, and ε is a random noise vector (typically normally distributed). When the operator A is a linear operator, we call the problem linear. The problem could still be infinite dimensional, but we will typically discretize to get b ∈ R^m, x ∈ R^n, and A ∈ R^{m×n}. If the operator A is nonlinear, we cleverly call the inverse problem nonlinear. In this case we are typically dealing with A : H → H where H is a Hilbert space.
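
As a tiny illustration (my own sketch, not from the original slides), simulating the forward problem under this model is just a matrix-vector product plus Gaussian noise; the sizes and noise level below are arbitrary choices:

    import numpy as np

    rng = np.random.default_rng(0)

    m, n = 20, 10
    A = rng.standard_normal((m, n))   # stand-in for the operator governing the model
    x = rng.standard_normal(n)        # "true" model parameters
    sigma = 0.05                      # noise level (arbitrary)

    b = A @ x + sigma * rng.standard_normal(m)   # forward problem: b = Ax + noise
    print(np.linalg.norm(b - A @ x))             # roughly sigma * sqrt(m)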

Linear inverse problems

We’ll begin (and essentially end) by studying linear inverse problems since they are much easier to work with. We would like for our inverse problem to be well-posed, which means it satisfies the following three conditions:

Existence: there should be a solution.
Uniqueness: there should be at most one solution.
Stability: the solution must continuously depend on the data.

These conditions are equivalent to saying that our map A (which in the finite dimensional case is just a matrix) should have a continuous inverse. If an inverse problem is not well-posed, it is ill-posed.
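
As a quick numerical aside (mine, not from the talk), the stability condition can be probed through the condition number of the matrix A: when it is huge, tiny perturbations of the data produce enormous changes in the recovered solution, so the problem behaves as ill-posed in practice. The two matrices below are arbitrary toy examples.

    import numpy as np

    rng = np.random.default_rng(0)

    A_good = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # well-conditioned
    A_bad = np.array([[1.0, 1.0],
                      [1.0, 1.0 + 1e-10]])                   # nearly rank-deficient

    for A in (A_good, A_bad):
        x = np.ones(A.shape[1])
        b = A @ x
        db = 1e-6 * rng.standard_normal(b.shape)             # tiny data perturbation
        dx = np.linalg.solve(A, b + db) - x                  # resulting change in x
        print("cond(A) = %.2e, relative change in x = %.2e"
              % (np.linalg.cond(A), np.linalg.norm(dx) / np.linalg.norm(x)))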

Linear inverse problems

If our inverse problem is well-posed, then our work is essentially done. We simply need to solve a least squares problem.

x_LS = arg min_x ||Ax − b||.

In practice every inverse problem is ill-posed, and it turns out that solving a least squares problem for ill-posed inverse problems goes very badly.

Ill-posedness in convolution

As an example of this, consider the inverse problem of deconvolution. Given the (possibly noisy) convolution of a function f, reconstruct the original function. We can describe this through the continuous model

b = (ψ ∗ f)(x) = ∫_{−a}^{a} ψ(x_0) f(x − x_0) dx_0.

We can then discretize this problem using some type of quadrature rule and end up with an equation of the form b = Af and attempt to solve the inverse problem using least squares.
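
As a concrete (and entirely hypothetical) illustration of how badly this can go, here is a NumPy sketch: it discretizes a Gaussian blurring kernel with a midpoint rule, blurs a piecewise constant signal, and then inverts with plain least squares, once with exact data and once with 1% white noise. The grid size and kernel width are my own arbitrary choices, not taken from the talk.

    import numpy as np

    # Midpoint-rule discretization of the convolution on [0, 1]
    n = 100
    t = (np.arange(n) + 0.5) / n
    h = 1.0 / n

    # Gaussian blurring kernel (width 0.02 chosen arbitrarily)
    A = h * np.exp(-(t[:, None] - t[None, :])**2 / (2 * 0.02**2))

    # Piecewise constant "true" signal f
    f = np.where((t > 0.3) & (t < 0.6), 1.0, 0.0)

    b_clean = A @ f                         # noise-free blurred data
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(n)
    noise *= 0.01 * np.linalg.norm(b_clean) / np.linalg.norm(noise)
    b_noisy = b_clean + noise               # data corrupted by 1% white noise

    # Naive least squares reconstructions
    f_clean = np.linalg.lstsq(A, b_clean, rcond=None)[0]
    f_noisy = np.linalg.lstsq(A, b_noisy, rcond=None)[0]

    print("relative error, no noise:", np.linalg.norm(f_clean - f) / np.linalg.norm(f))
    print("relative error, 1% noise:", np.linalg.norm(f_noisy - f) / np.linalg.norm(f))

The noise-free reconstruction comes out essentially exact, while the 1% noise reconstruction is typically off by several orders of magnitude, which is exactly what the next two figures show.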

Ill-posedness in convolution

[Figure: Left: the piecewise continuous function f(x). Right: the function (ψ ∗ f)(x).]

Ill-posedness in convolution

[Figure: Left: result of a least squares inversion with no additive noise. Right: result of a least squares inversion with data corrupted by 1% white noise.]

SVD and ill-posedness

We’ll show what went wrong here by analyzing the singular value decomposition of a matrix describing an arbitrary ill-posed linear inverse problem and take a look at the statistical properties of x_LS. Given a matrix A of rank r, the singular value decomposition (SVD) of A is A = UΣV^T, where U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal and Σ = diag(σ_1, σ_2, ..., σ_r, 0, ..., 0) ∈ R^{m×n}, with σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0 the singular values of A. The columns of U = [u_1, ..., u_m] and V = [v_1, ..., v_n] are the left and right singular vectors of A, respectively. The outer product form of the SVD is

A = ∑_{i=1}^{r} u_i σ_i v_i^T.

SVD and ill-posedness

We can then define the pseudo-inverse of A by

A† = V Σ† U^T,

where Σ† = diag(σ_1^{-1}, σ_2^{-1}, ..., σ_r^{-1}, 0, ..., 0) ∈ R^{n×m}. The outer product form of this is

A† = ∑_{i=1}^{r} v_i σ_i^{-1} u_i^T.

SVD and ill-posedness

It turns out the least squares solution xLS can be written as

x_LS = A† b
     = ∑_{i=1}^{r} (u_i^T b / σ_i) v_i
     = ∑_{i=1}^{r} (v_i^T x) v_i + ∑_{i=1}^{r} (u_i^T ε / σ_i) v_i.

The first sum here is the projection of the true solution x onto the span of {v_i}_{i=1}^{r}, while the second sum represents the corruption of x_LS that occurs due to the presence of the noise ε. If the matrix describing our model has certain properties (i.e. rapidly decaying singular values and highly oscillatory right singular vectors as i increases), then our least squares solution is corrupted.
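
Here is a small NumPy check (mine, not from the talk) of these formulas on the same arbitrary blurring matrix used in the deconvolution sketch: it assembles the pseudo-inverse solution from the SVD, confirms it matches NumPy's pinv, and splits it into the projected-truth term and the amplified-noise term from the display above.

    import numpy as np

    # The blurring matrix and signal from the deconvolution sketch above
    n = 100
    t = (np.arange(n) + 0.5) / n
    A = (1.0 / n) * np.exp(-(t[:, None] - t[None, :])**2 / (2 * 0.02**2))
    f = np.where((t > 0.3) & (t < 0.6), 1.0, 0.0)

    rng = np.random.default_rng(0)
    eps = 0.001 * rng.standard_normal(n)     # one realization of the noise
    b = A @ f + eps

    U, s, Vt = np.linalg.svd(A)
    print("largest / smallest singular value:", s[0], s[-1])   # rapid decay

    # x_LS = A^dagger b = sum_i (u_i^T b / sigma_i) v_i
    x_ls = Vt.T @ ((U.T @ b) / s)
    print(np.allclose(x_ls, np.linalg.pinv(A) @ b))

    # Split into sum_i (v_i^T f) v_i  +  sum_i (u_i^T eps / sigma_i) v_i
    signal_part = Vt.T @ (Vt @ f)
    noise_part = Vt.T @ ((U.T @ eps) / s)
    print(np.allclose(x_ls, signal_part + noise_part))
    print("norm of the noise term:", np.linalg.norm(noise_part))   # huge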

SVD and ill-posedness

[Figure: plots of the singular values and singular vectors of the matrix A describing one-dimensional deblurring.]

SVD and ill-posedness

x_LS is a random vector in the span of {v_i}_{i=1}^{r}; specifically, we have

v_j^T x_LS = v_j^T x + u_j^T ε / σ_j ∼ N(v_j^T x, σ^2 / σ_j^2),

where σ^2 is the variance of the random vector ε. This shows that the variance of x_LS in the direction v_j is σ^2/σ_j^2, which will be large for large values of j.

Example

Let A be defined via A = v_1 v_1^T + 10^{-2} v_2 v_2^T with v_1 = [1/√2, 1/√2]^T, v_2 = [−1/√2, 1/√2]^T. If x = [1, 1]^T, then clearly b = Ax = [1, 1]^T and A^{-1}b = [1, 1]^T. However, if we add one realization of ε given by ε = [0.026, 0.075]^T, then A^{-1}b = A^{-1}(Ax + ε) = [−1.400, 3.501]^T.

Regularization

Now that we know the problem with least squares, we want to alter the method in order to minimize the damage caused by the highly oscillatory right singular vectors. This is known as regularization, or spectral filtering. The easiest solution to this problem is to simply remove the bothersome singular vectors. This is known as truncated singular value decomposition (TSVD).

Example

Returning to our example above, our poor reconstruction was due to high variance in the direction of v_2 (which has singular value 10^{-2}). If we simply remove this singular vector to get A_filt = σ_1 v_1 v_1^T, then we see that x_LS,filtered = A_filt† b = σ_1^{-1} (v_1^T b) v_1 = [1.0505, 1.0505]^T.

Regularization

We can generalize this idea by saying that our regularized least squares solution is

x_ν = V Φ_ν Σ† U^T b

with Φ_ν to be chosen depending on the choice of regularization. As an example, for TSVD we have

Φ_ν = diag(φ_1^{(ν)}, ..., φ_r^{(ν)}, 0, ..., 0) ∈ R^{n×n}

with φ_i^{(ν)} = 1 for i = 1, ..., k and φ_i^{(ν)} = 0 for i = k + 1, ..., r.

Regularization

Other common types of regularization are:

Tikhonov regularization. Here we reframe the least squares problem as

x_ν = arg min_x ( (1/2)||Ax − b||^2 + (ν/2)||x||^2 ),

which ends up giving Φ_ν defined by φ_i^{(ν)} = σ_i^2 / (σ_i^2 + ν), i = 1, ..., r.

Total variation regularization. Here we reframe the least squares problem as

x_ν = arg min_x ( (1/2)||Ax − b||^2 + (ν/2)||Lx|| ),

where L is a finite difference matrix.

Regularization

Regularization lets us “fix” an ill-behaved matrix by truncating or altering the right singular vectors which are troublesome. The only real issue is choosing the regularization parameter ν. We can reframe this idea of “fixing” a matrix with some knowledge of the solution in terms of the stochastic approach to inverse problems.

Statistical properties of regularized solutions

We saw that the least squares solution x_LS has a high variance in the directions corresponding to the later (smaller) singular values. The outer product form of our regularized solution is given by

x_ν = ∑_{i=1}^{r} φ_i^{(ν)} (v_i^T x) v_i + ∑_{i=1}^{r} φ_i^{(ν)} (u_i^T ε / σ_i) v_i,

which produces

v_i^T x_ν = φ_i^{(ν)} (v_i^T x + u_i^T ε / σ_i) ∼ N( φ_i^{(ν)} v_i^T x, (φ_i^{(ν)})^2 σ^2 / σ_i^2 ).

Thus we decrease the variance of the solution in the direction of vi for large i, but we have introduced bias.
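
A short NumPy experiment (my own, not from the slides) ties the last few slides together on the 2×2 example: it forms the filtered solution x_ν = V Φ_ν Σ† U^T b with TSVD and Tikhonov filter factors, reproduces the numbers quoted earlier, and then uses repeated noise draws to show the reduced variance and the bias introduced in the direction v_2. The noise level, the Tikhonov parameter ν, and the second "true" x are arbitrary choices.

    import numpy as np

    def filtered_solution(A, b, phi):
        """Spectral filtering: x_nu = V diag(phi) Sigma^dagger U^T b."""
        U, s, Vt = np.linalg.svd(A)
        return Vt.T @ (phi * (U.T @ b) / s)

    # The 2x2 example from the talk: singular values 1 and 1e-2
    v1 = np.array([1.0, 1.0]) / np.sqrt(2)
    v2 = np.array([-1.0, 1.0]) / np.sqrt(2)
    A = np.outer(v1, v1) + 1e-2 * np.outer(v2, v2)
    s = np.linalg.svd(A, compute_uv=False)

    phi_ls = np.ones(2)                  # no filtering (plain least squares)
    phi_tsvd = np.array([1.0, 0.0])      # TSVD, keep k = 1 component
    nu = 1e-3                            # Tikhonov parameter (arbitrary)
    phi_tikh = s**2 / (s**2 + nu)        # Tikhonov filter factors

    # Reproduce the numbers quoted earlier (x = [1, 1], one noise realization)
    b = A @ np.array([1.0, 1.0]) + np.array([0.026, 0.075])
    print(filtered_solution(A, b, phi_ls))    # ~ [-1.400, 3.501]
    print(filtered_solution(A, b, phi_tsvd))  # ~ [ 1.0505, 1.0505]

    # Monte Carlo with a true x that has a v2 component, to expose bias vs variance
    x_true = np.array([1.0, 2.0])
    rng = np.random.default_rng(0)
    sigma = 0.05
    ls_proj, tikh_proj = [], []
    for _ in range(20000):
        b_k = A @ x_true + sigma * rng.standard_normal(2)
        ls_proj.append(v2 @ filtered_solution(A, b_k, phi_ls))
        tikh_proj.append(v2 @ filtered_solution(A, b_k, phi_tikh))
    ls_proj, tikh_proj = np.array(ls_proj), np.array(tikh_proj)

    print("LS variance      :", ls_proj.var(),   "predicted:", sigma**2 / s[1]**2)
    print("Tikhonov variance:", tikh_proj.var(), "predicted:", (phi_tikh[1] * sigma / s[1])**2)
    print("Tikhonov bias    :", tikh_proj.mean() - v2 @ x_true)   # nonzero: the price of filtering

The sample variances land close to σ^2/σ_2^2 for plain least squares and (φ_2^{(ν)})^2 σ^2/σ_2^2 for Tikhonov, while the Tikhonov mean is pulled away from v_2^T x, which is exactly the bias just described.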


The stochastic approach to inverse problems

Through regularization we are really “encoding” some information we have about the solution x by minimizing either x or ∇x in whatever norm we choose. We can reframe this idea by immediately assuming x is a random vector with its own prior distribution: x is described by the prior probability density function (or just prior) p(x|δ), where δ is some positive scaling parameter. Once a prior is chosen, we obtain the posterior density function through Bayes’ law.

p(x|b, λ, δ) ∝ p(b|x, λ)p(x|δ)
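
To connect this with the earlier material, here is a hedged sketch (mine, not from the slides) of the linear-Gaussian case: if p(b|x, λ) is Gaussian with noise precision λ and the prior p(x|δ) is a zero-mean Gaussian with precision matrix δL, then the posterior is again Gaussian and its mean is a Tikhonov-type regularized solution. The operator, noise level, and prior precision below are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(3)

    # A small ill-conditioned forward operator and noisy data (arbitrary test setup)
    n = 50
    t = (np.arange(n) + 0.5) / n
    A = (1.0 / n) * np.exp(-(t[:, None] - t[None, :])**2 / (2 * 0.02**2))
    x_true = np.where((t > 0.3) & (t < 0.6), 1.0, 0.0)
    lam = 1.0 / 0.001**2                        # noise precision (noise std 0.001)
    b = A @ x_true + 0.001 * rng.standard_normal(n)

    # Gaussian prior p(x | delta) with precision delta * L, L a second-difference matrix
    delta = 1e2
    L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

    # Posterior p(x | b, lam, delta) is Gaussian with
    #   precision  P = lam * A^T A + delta * L
    #   mean       m = P^{-1} (lam * A^T b)
    P = lam * A.T @ A + delta * L
    m = np.linalg.solve(P, lam * A.T @ b)
    cov = np.linalg.inv(P)                       # posterior covariance: uncertainty info

    print("relative error of posterior mean:", np.linalg.norm(m - x_true) / np.linalg.norm(x_true))
    print("posterior std of x[25]          :", np.sqrt(cov[25, 25]))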


The stochastic approach to inverse problems

Choosing the prior is the most important step here, and we typically let x be a Gaussian Markov random field whose precision matrix is specified by our knowledge of the solution. If the posterior density function is not of a well-known form, we must sample from it using methods such as Markov Chain Monte Carlo. Using the stochastic approach, we obtain more information (uncertainties of the random variable), but could possibly increase the expense of our algorithms.
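
When the posterior is awkward to work with directly, a simple random-walk Metropolis sampler already illustrates the idea. The sketch below is my own (with an arbitrary 2×2 test problem, step size, and priors); it targets the unnormalized posterior p(b|x, λ) p(x|δ) and reports posterior means and standard deviations from the samples.

    import numpy as np

    rng = np.random.default_rng(4)

    # Small test problem: the 2x2 matrix from earlier, with noisy data
    v1 = np.array([1.0, 1.0]) / np.sqrt(2)
    v2 = np.array([-1.0, 1.0]) / np.sqrt(2)
    A = np.outer(v1, v1) + 1e-2 * np.outer(v2, v2)
    lam, delta = 1.0 / 0.05**2, 1.0            # noise precision, prior precision
    b = A @ np.array([1.0, 1.0]) + 0.05 * rng.standard_normal(2)

    def log_post(x):
        """Unnormalized log posterior: log p(b|x, lam) + log p(x|delta)."""
        r = A @ x - b
        return -0.5 * lam * (r @ r) - 0.5 * delta * (x @ x)

    # Random-walk Metropolis
    x = np.zeros(2)
    samples = []
    step = 0.2                                  # proposal standard deviation
    for _ in range(50000):
        prop = x + step * rng.standard_normal(2)
        if np.log(rng.uniform()) < log_post(prop) - log_post(x):
            x = prop
        samples.append(x)
    samples = np.array(samples[10000:])         # discard burn-in

    print("posterior mean:", samples.mean(axis=0))
    print("posterior std :", samples.std(axis=0))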


Nonlinear inverse problems

Linear inverse problems are all pretty much the same; this is absolutely not true for nonlinear inverse problems. We need very specially tailored techniques and algorithms in order to solve nonlinear inverse problems (unless we do least squares, which is very costly). What you’ll need to know to study nonlinear inverse problems:

PDEs
Analysis (specifically functional analysis)
Numerical analysis
Probability and statistics
Knowledge of the physics underlying the situation

Thanks for listening!

References

Jennifer Mueller and Samuli Siltanen, Linear and Nonlinear Inverse Problems with Practical Applications. SIAM, 2012.
Johnathan Bardsley, Computational Uncertainty Quantification for Inverse Problems. SIAM, 2018.
