
5 Linear Algebra and Inverse Problems

5.1 Introduction

The direct problem (forward problem) is to find field quantities satisfying the governing equations, boundary conditions, and initial conditions. The direct problem can be formulated as a well-posed problem. The inverse problem (backward problem) is to find parameters related to the governing equations, boundary conditions, initial conditions, or geometry from observation data. In general, inverse problems are nonlinear and ill-posed, so they are solved after linearization and regularization.

Examples of inverse problems

• Example 1 — influence line

[Figure: applied forces f_j (j = 1, ..., n) and measured responses u_i (i = 1, ..., m)]

Let A_ij, u_i, and f_j be the influence functions, the measured data, and the parameters to be determined, respectively. Then we have

\[ u_i = \sum_{j=1}^{n} A_{ij} f_j \quad (i = 1, \ldots, m), \qquad \text{or} \qquad \{u\} = [A]\,\{f\}. \tag{211} \]

• Example 2 — Green’s function in solid mechanics

[Figure: traction t(ζ) distributed on the surface S, its resultant force T, and the resulting displacement u(ξ)]

Using the Green's function G(ξ, ζ), the displacement u due to the surface traction t is expressed by
\[ u(\xi) = \int_S G(\xi, \zeta)\, t(\zeta)\, dS. \tag{212} \]
By Saint-Venant's principle, the distributed forces t(ζ), applied far from the observation point ξ, can be approximated by the resultant force T without any appreciable change in u. This means that it is difficult to find a unique solution t, i.e., the inverse problem is an ill-posed problem.

• Example 3 — geological prospecting

[Figure: surface line s with observation points, a line mass x(t) distributed along the t-axis for 0 ≤ t ≤ 1 at unit depth, a small element Δt, and the angle θ between the vertical and the line from the mass element to the observation point]

The problem is to determine the location, shape and constitution of subterranean bodies from measurements at the earth’s surface.

As shown in the figure above, consider a one-dimensional problem in which the mass density x(t), distributed along the t-axis for 0 ≤ t ≤ 1, is determined from the vertical component of the force, y(s), measured on the surface line s. The vertical force Δy(s) due to a small mass element x(t)Δt is written as
\[ \Delta y(s) = g\, \frac{x(t)\Delta t}{(s-t)^2 + 1}\, \cos\theta = g\, \frac{x(t)\Delta t}{\left((s-t)^2 + 1\right)^{3/2}}, \tag{213} \]
where g is the gravitational constant. It then follows that
\[ y(s) = g \int_0^1 \frac{x(t)\, dt}{\left((s-t)^2 + 1\right)^{3/2}}. \tag{214} \]
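A minimal numerical sketch of eq. (214): discretizing the integral with the midpoint rule on a uniform grid (the grid size n = 50, g = 1, and the choice of observation points are arbitrary assumptions made here) produces a matrix whose condition number is enormous, which already signals the ill-posedness of the discretized problem.

import numpy as np

# Midpoint-rule discretization of eq. (214):
# y(s_i) ~ g * sum_j x(t_j) * dt / ((s_i - t_j)^2 + 1)^(3/2)
g = 1.0
n = 50                                  # number of density cells along the t-axis
t = (np.arange(n) + 0.5) / n            # cell midpoints in [0, 1]
dt = 1.0 / n
s = t.copy()                            # observation points on the surface line
A = g * dt / ((s[:, None] - t[None, :])**2 + 1.0)**1.5

print(f"condition number of A: {np.linalg.cond(A):.3e}")   # very large: the discretized problem is ill-conditioned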

• Example 4 — simplified tomography

[Figure: radiation of intensity I_0 from an emitter passes in the y-direction through a circular object of radius R with absorption coefficient f(x, y) and reaches a detector with intensity I_x]

Consider the simplified tomography in which the radiation absorption coefficient f(x, y) of the object is determined from the measured intensity I_x of the radiation, as shown in the figure above. The radiation intensity I_x satisfies the following differential equation:
\[ \frac{dI_x}{dy} = -f I_x. \tag{215} \]

The solution I_x is obtained in the form of the function p(x):
\[ p(x) \equiv \ln(I_0/I_x) = \int_{-y(x)}^{y(x)} f(x, y)\, dy. \tag{216} \]

Suppose that the object occupies a circular region of radius R and the absorption coefficient has the form f(x, y) = f(r), where r = \sqrt{x^2 + y^2}. Changing the integration variable from y to r along the ray (y = \sqrt{r^2 - x^2}, so dy = r\, dr/\sqrt{r^2 - x^2}) then gives
\[ p(x) = \int_x^R \frac{2r}{\sqrt{r^2 - x^2}}\, f(r)\, dr. \tag{217} \]
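As a quick consistency check (this worked example is not in the original notes), take a constant absorption coefficient f(r) = f_0; then eq. (217) can be integrated in closed form:
\[ p(x) = f_0 \int_x^R \frac{2r}{\sqrt{r^2 - x^2}}\, dr = 2 f_0 \sqrt{R^2 - x^2}, \]
which is simply f_0 times the chord length of the ray through the circle, in agreement with eq. (216).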

• Example 5 — structural dynamics

[Figure: single-degree-of-freedom system with mass m, spring stiffness k, and damping coefficient c]

As shown in the figure above, we consider here structural dynamics with a single degree of freedom, governed by the following equation of motion:

\[ m\ddot{x} + c\dot{x} + kx = 0, \qquad \text{or} \qquad \ddot{x} + a\dot{x} + bx = 0 \quad (a = c/m,\; b = k/m), \tag{218} \]

subjected to the initial conditions

\[ x(0) = x_0, \qquad \dot{x}(0) = \dot{x}_0. \tag{219} \]
The inverse problem is to find the constant values a and b from the time history x(t) of the mass. Integrating eq. (218) twice and using the initial conditions leads to
\[ x(t) - x_0 - \dot{x}_0 t + a \int_0^t \bigl(x(s) - x_0\bigr)\, ds + b \int_0^t x(s)\,(t - s)\, ds = 0, \qquad t > 0. \tag{220} \]

Suppose that the displacements x_1, x_2, ..., x_n at the time steps t = h, 2h, ..., nh are measured. If the integrals in eq. (220) are evaluated by the trapezoidal rule, eq. (220) can be discretized as follows:

\[ E_k(a, b) = x_k - x_0 - \dot{x}_0 k h + a\Bigl(\sum_{j=1}^{k} x_j h - \frac{x_k h}{2} - x_0 k h\Bigr) + b\Bigl(\sum_{j=1}^{k-1} x_j (k - j) h^2 + \frac{x_0 k h^2}{2}\Bigr), \qquad k = 1, 2, \ldots, n. \tag{221} \]

Since eq. (221) is only an approximation to eq. (220), E_k(a, b) is generally nonzero. For n > 2, the two parameters a and b are determined so that E(a, b) = \sum_{k=1}^{n} \bigl(E_k(a, b)\bigr)^2 becomes minimum. To this end, the least squares method is generally used to find the optimal values of a and b; the resulting normal equations are sketched below.
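As a sketch of this step (the shorthand e_k, p_k, q_k is introduced here and does not appear in the original notes), note that each residual in eq. (221) is linear in the unknowns, E_k(a, b) = e_k + a p_k + b q_k, where e_k, p_k, and q_k denote the three groups of terms in eq. (221). Setting \partial E/\partial a = \partial E/\partial b = 0 yields the 2 × 2 normal equations
\[ \begin{bmatrix} \sum_{k=1}^{n} p_k^2 & \sum_{k=1}^{n} p_k q_k \\ \sum_{k=1}^{n} p_k q_k & \sum_{k=1}^{n} q_k^2 \end{bmatrix} \begin{Bmatrix} a \\ b \end{Bmatrix} = - \begin{Bmatrix} \sum_{k=1}^{n} p_k e_k \\ \sum_{k=1}^{n} q_k e_k \end{Bmatrix}, \]
whose solution gives the least squares estimates of a and b.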

Problem 5.1

1. Solve eq. (218) with the initial conditions x(0) = 1, \dot{x}(0) = -1 with m = 1, c = 2, k = 5, and calculate the displacements x(t_k) at the time steps t = t_k (k = 1, ..., n), where t_n ≤ 1.

2. Then generate noisy data by adding noise ε_k to the calculated displacement x(t_k), namely, x_k = x(t_k) + ε_k, where ε_k is a uniformly distributed random number in [-ε, ε].

3. Write a program to estimate a and b by means of the least squares method to minimize E(a, b).

4. Compare the results calculated from data with various noise levels and discuss the stability of the method with respect to ε.
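A minimal Python sketch of Problem 5.1 (the step size h = 0.05, the number of steps n = 20, the noise level ε = 10⁻³, and the random seed are arbitrary choices made here, not prescribed by the problem):

import numpy as np

# 1. "Exact" data: x'' + a*x' + b*x = 0 with a = c/m = 2, b = k/m = 5,
#    x(0) = 1, x'(0) = -1.  The characteristic roots are -1 +/- 2i,
#    so the exact solution is x(t) = exp(-t) * cos(2t).
a_true, b_true = 2.0, 5.0
x0, xdot0 = 1.0, -1.0
n, h = 20, 0.05                          # n*h = 1.0, i.e. t_n <= 1
t = h * np.arange(1, n + 1)
x_exact = np.exp(-t) * np.cos(2.0 * t)

# 2. Noisy data x_k = x(t_k) + eps_k, with eps_k uniform in [-eps, eps]
eps = 1e-3
rng = np.random.default_rng(0)
x = x_exact + rng.uniform(-eps, eps, size=n)

# 3. Least squares estimate of (a, b): each residual of eq. (221) is
#    E_k = e_k + a*p_k + b*q_k, minimized over k = 1..n.
e = np.empty(n); p = np.empty(n); q = np.empty(n)
for k in range(1, n + 1):
    xk = x[k - 1]
    e[k - 1] = xk - x0 - xdot0 * k * h
    p[k - 1] = np.sum(x[:k]) * h - xk * h / 2.0 - x0 * k * h
    q[k - 1] = np.sum(x[:k - 1] * (k - np.arange(1, k))) * h**2 + x0 * k * h**2 / 2.0
A = np.column_stack([p, q])
a_est, b_est = np.linalg.lstsq(A, -e, rcond=None)[0]

# 4. Compare with the true values for this noise level.
print(f"estimated a = {a_est:.3f} (true {a_true}), b = {b_est:.3f} (true {b_true})")

Repeating the last step for several values of ε shows how quickly the estimates of a and b degrade as the noise grows.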

5.2 Linearized inverse problems

As seen in the last section, many inverse problems are formulated in integral form, which can be reduced to the following system of equations after discretization.

\[ \{b\}_{m \times 1} = [A]_{m \times n}\, \{x\}_{n \times 1}. \tag{222} \]

Note that, in general, [A] is a non-square matrix of size m × n. The system of equations (222) is classified into the following three categories according to the number of equations m and the number of unknowns n.

(a) m > n. Consider the case of m = 3 and n = 2. Then we have three equations for the two unknowns x_1 and x_2:
\[ \begin{Bmatrix} y_1 \\ y_2 \\ y_3 \end{Bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \\ A_{31} & A_{32} \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}. \tag{223} \]
The above equation is an overdetermined set of linear equations. As illustrated in Fig. 8, in general there exists no exact solution, but a least squares solution may be found in this case (see the numerical sketch after the figures below).

[Two figures: the equations plotted as lines in the (x_1, x_2) plane]

Figure 8:                         Figure 9:
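A minimal numerical sketch of case (a) (the 3 × 2 matrix and right-hand side below are arbitrary illustration values, not taken from the notes):

import numpy as np

# Overdetermined system, m = 3 > n = 2: in general no exact solution exists,
# but numpy's lstsq returns the least squares solution.
A = np.array([[1.0,  2.0],
              [1.0, -1.0],
              [2.0,  1.0]])
y = np.array([1.0, 0.5, 2.0])

x_ls, res, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print("least squares solution x =", x_ls)
print("sum of squared residuals =", res)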

(b) m = n

(b)-1 If A is regular, a unique solution is obtained as
\[ x = A^{-1} b. \tag{224} \]
For m = n = 2, the behavior of the equations is shown in Fig. 9.

(b)-2 If A is singular, there exists no solution, as shown in Fig. 10.

[Figure: the equations plotted as lines in the (x_1, x_2) plane]

Figure 10:

(b)-3 When A is nearly singular, the solution is very sensitive to errors involved in the observed data. For example, consider the system of equations given by
\[ \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{11} & A_{12} + \varepsilon \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix}. \tag{225} \]
Then the solution x is obtained by
\[ \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \frac{1}{A_{11}\varepsilon} \begin{bmatrix} A_{12} + \varepsilon & -A_{12} \\ -A_{11} & A_{11} \end{bmatrix} \begin{Bmatrix} y_1 \\ y_2 \end{Bmatrix}. \tag{226} \]
As seen in Fig. 11, the solution is very sensitive to the error ε involved in the components of the matrix; a numerical sketch is given below.

[Two figures: the equations plotted as lines in the (x_1, x_2) plane]

Figure 11:                         Figure 12:
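A minimal numerical sketch of this sensitivity (the values A_11 = A_12 = 1 and the right-hand side below are arbitrary choices made here):

import numpy as np

# Nearly singular 2x2 system from eq. (225): small changes in eps change x drastically.
A11, A12 = 1.0, 1.0
y = np.array([1.0, 1.0 + 1e-6])          # almost-consistent right-hand side

for eps in (1e-2, 1e-4, 1e-6):
    A = np.array([[A11, A12],
                  [A11, A12 + eps]])
    x = np.linalg.solve(A, y)
    print(f"eps = {eps:.0e}:  cond(A) = {np.linalg.cond(A):.1e},  x = {x}")

As ε decreases, the condition number of the matrix grows and the computed solution changes wildly, even though the right-hand side is essentially fixed.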

The stability of the solution will be discussed later.

(c) m < n. In this case, the number of equations is less than the number of unknowns. As seen in Fig. 12 for m = 1 and n = 2, we cannot obtain a unique solution.

Problem 5.2.1
Discuss the existence, uniqueness and stability of solutions for the following systems of equations.
\[ \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \bigl\{ 2 \bigr\} \tag{227} \]
\[ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 2 \\ 3 \end{Bmatrix} \tag{228} \]
\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} = \begin{Bmatrix} 3 \\ 2 \end{Bmatrix} \tag{229} \]
\[ \begin{bmatrix} 2 & 1 \\ 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 4 \\ 3 \\ 1 \end{Bmatrix} \tag{230} \]
\[ \begin{bmatrix} 2 & 1 \\ 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{Bmatrix} x_1 \\ x_2 \end{Bmatrix} = \begin{Bmatrix} 4 \\ 3 \\ 2 \end{Bmatrix} \tag{231} \]

5.3 Singular Value Decomposition (SVD)

5.3.1 SVD of a square regular matrix

If A is a real symmetric m × m matrix, then a useful decomposition is based on its eigenvalues and eigenvectors. The statement that u_i is an eigenvector associated with the eigenvalue λ_i can be written as

\[ A u_i = u_i \lambda_i. \tag{232} \]

If we now write the column vectors u_1, ..., u_m next to each other to form the square matrix
\[ U = \begin{bmatrix} \vdots & \vdots & & \vdots \\ u_1 & u_2 & \cdots & u_m \\ \vdots & \vdots & & \vdots \end{bmatrix}, \tag{233} \]
the relations (232) for i = 1, ..., m may be written as

\[ AU = U\Lambda, \tag{234} \]
where Λ is the diagonal matrix of eigenvalues. The eigenvectors have the convenient mathematical property of orthogonality, U^T U = I, where I is the identity matrix, and they span the entire space of A as a basis or minimum spanning set. The set of eigenvalues is called the spectrum of A. If two or more eigenvalues of A are identical, the spectrum of the matrix is called degenerate. The "spectrum" nomenclature is an exact analogy with the idea of the spectrum of light as depicted in a rainbow. The brightness of each color of the spectrum tells us "how much" light of that wavelength exists in the undispersed white light. For this reason, this decomposition is often referred to as a spectral decomposition. Since U^T U = U U^T = I, multiplying eq. (234) by U^T from the right-hand side yields
\[ A U U^T = A = U \Lambda U^T = \sum_{k=1}^{m} \lambda_k u_k u_k^T. \tag{235} \]
For a real symmetric matrix A, therefore, it is easy to compute any power of A:

\[ A^n = (U \Lambda U^T)^n = U \Lambda^n U^T, \tag{236} \]
because U^T U = U U^T = I. Since Λ is diagonal, raising Λ to the n-th power simply raises each of its diagonal elements to the n-th power. The decomposition (235) means that the action of the real symmetric matrix A on an input vector x ∈ R^m may be understood in terms of three steps:

1. It resolves the input vector along each of the eigenvectors u_k, the component of the input vector along the k-th eigenvector being given by u_k^T x,

2. The amount along the kth eigenvector is multiplied by the eigenvalue λk,

3. The product tells us how much of the k-th eigenvector u_k is present in the product Ax.

The most useful result of eq. (236) is the inverse of the matrix A:
\[ A^{-1} = U \Lambda^{-1} U^T = \sum_{k=1}^{m} \frac{1}{\lambda_k} u_k u_k^T. \tag{237} \]
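A minimal Python sketch of eqs. (235)-(237) (the symmetric 3 × 3 matrix below is an arbitrary example):

import numpy as np

# Spectral decomposition of a real symmetric matrix.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

lam, U = np.linalg.eigh(A)               # eigenvalues lam, orthonormal eigenvectors as columns of U

# A = U diag(lam) U^T, eq. (235)
print(np.allclose(A, U @ np.diag(lam) @ U.T))

# A^{-1} = U diag(1/lam) U^T, eq. (237)
A_inv = U @ np.diag(1.0 / lam) @ U.T
print(np.allclose(A_inv, np.linalg.inv(A)))

# A^n = U diag(lam^n) U^T, eq. (236), e.g. n = 3
print(np.allclose(np.linalg.matrix_power(A, 3), U @ np.diag(lam**3) @ U.T))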

5.3.2 SVD of a non-square matrix

The SVD for a square matrix can be extended to a decomposition for a non-square matrix A of size m × n. We may consider two square symmetric matrices, A^T A and A A^T, of size n × n and m × m, respectively. Since A^T A and A A^T are square and symmetric, their eigenvectors and eigenvalues are available, and the eigenvectors can be chosen to form orthonormal bases of the respective spaces. Moreover, all their eigenvalues are non-negative, since both matrices are positive semidefinite. It is easy to see that if v is an eigenvector of A^T A corresponding to the eigenvalue σ, then

\[ A^T A v = \sigma v. \tag{238} \]

Multiplying on the left by v^T and grouping the terms,

\[ (v^T A^T)(A v) = \sigma\, (v^T v). \tag{239} \]

On the left-hand side we have a non-negative quantity, the square of the norm of Av. On the right, v^T v is positive, and so σ must be non-negative.
Let v_i be the eigenvector associated with the i-th eigenvalue σ_i of the matrix A^T A, and assume that the σ_i are sorted so that
\[ \sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_n \ge 0. \tag{240} \]
Similarly, let u_i and μ_i be the i-th eigenvector and eigenvalue of A A^T, and assume that

\[ \mu_1 \ge \mu_2 \ge \ldots \ge \mu_m \ge 0. \tag{241} \]

Assume that σ_1 is not equal to zero. Then the vector A v_1 is non-zero. In the following, it will be proved that A v_1 is in fact an eigenvector of A A^T. To check this, we notice that

\[ (A A^T)(A v_1) = A (A^T A) v_1 = \sigma_1 (A v_1). \tag{242} \]

This shows that A v_1 is indeed an eigenvector of A A^T, the eigenvalue being σ_1. If we normalize A v_1 to have unit length, the resulting vector,
\[ \frac{A v_1}{\lVert A v_1 \rVert}, \tag{243} \]
is an eigenvector u_1 of A A^T, unless A A^T is degenerate.
Continuing the above argument, we see that each non-zero eigenvalue σ_i of A^T A is also an eigenvalue of A A^T. A similar argument starting with an eigenvector u_i of A A^T with a non-zero eigenvalue μ_i shows that the vector A^T u_i / \lVert A^T u_i \rVert is a normalized eigenvector of A^T A with the same eigenvalue. From the above discussion, it is concluded that the non-zero eigenvalues of A^T A are the same as the non-zero eigenvalues of A A^T and vice versa. If there are r non-zero eigenvalues, this means that σ_1 = μ_1, ..., σ_r = μ_r, and that all subsequent eigenvalues must be zero, i.e., σ_{r+1} = ... = σ_n = 0 and μ_{r+1} = ... = μ_m = 0. Therefore, we have

\[ u_k = \frac{A v_k}{\lVert A v_k \rVert} \qquad \text{and} \qquad v_k = \frac{A^T u_k}{\lVert A^T u_k \rVert}, \qquad k = 1, \ldots, r. \tag{244} \]

This happens automatically if the non-zero eigenvalues of A^T A and A A^T are non-degenerate. Even if there are degeneracies, it is possible to choose appropriate linear combinations in the degenerate eigenspaces so that these relations hold. The value of r is known as the rank of the matrix A (or of the matrix A^T). It is clear that r ≤ m and r ≤ n. The norms in (244) may be evaluated as
\[ \lVert A v_k \rVert^2 = (A v_k)^T (A v_k) = v_k^T (A^T A) v_k = v_k^T \sigma_k v_k = \sigma_k, \tag{245} \]
where the last equality holds because v_k is an eigenvector of A^T A belonging to σ_k, and because v_k is normalized so that v_k^T v_k = 1. Similarly, \lVert A^T u_k \rVert^2 = \mu_k. Since σ_k = μ_k > 0, we may define λ_k to be the square root of the eigenvalue and write
\[ \lVert A v_k \rVert = \lVert A^T u_k \rVert = \lambda_k = \sqrt{\sigma_k} = \sqrt{\mu_k}, \tag{246} \]
for k = 1, 2, ..., r. Equation (244) then takes the simple form

\[ A v_k = \lambda_k u_k, \tag{247} \]
\[ A^T u_k = \lambda_k v_k. \tag{248} \]

The effect of the linear transformation A on the unit vector v_k ∈ R^n is to take it to the vector λ_k u_k ∈ R^m of length λ_k in the direction of the unit vector u_k ∈ R^m. The effect of the linear transformation A^T on the unit vector u_k ∈ R^m is to take it to the vector λ_k v_k ∈ R^n of length λ_k in the direction of the unit vector v_k ∈ R^n.
On the other hand, for k > r, the eigenvalue of A^T A associated with v_k is zero, and so A^T A v_k = 0. Premultiplying this by v_k^T shows that \lVert A v_k \rVert = 0 and hence that A v_k = 0. We thus have that

\[ A v_k = 0 \quad \text{for } k = r + 1, \ldots, n, \tag{249} \]

and similarly,
\[ A^T u_k = 0 \quad \text{for } k = r + 1, \ldots, m. \tag{250} \]

Equations (247) and (249) together describe how A acts on the vectors in the basis {v_k} for k = 1, ..., n. Thus the matrix A can be written as
\[ A = \sum_{k=1}^{r} \lambda_k u_k v_k^T. \tag{251} \]

It is easy to check (using the orthonormality of the basis {v_k}) that the right-hand side of eq. (251) does have the same action as A. Taking the transpose of (251) gives

\[ A^T = \sum_{k=1}^{r} \lambda_k v_k u_k^T, \tag{252} \]
and it is again easy to check that this is consistent with (248) and (250). The orthonormal vectors {v_k} are known as the right singular vectors, the vectors {u_k} are known as the left singular vectors, and the scalars {λ_k} are called the singular values of the matrix A. We may write the column vectors u_k next to each other to form an orthogonal m × m matrix U, and stack the row vectors v_k^T on top of each other to form the orthogonal n × n matrix V^T. Equation (251) may then be written in matrix form as

\[ A = U \Lambda V^T, \tag{253} \]
where Λ is an m × n matrix whose only non-zero elements are the first r entries on its diagonal.
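A minimal Python sketch of eqs. (246) and (253) (the 4 × 3 matrix below is just a random example):

import numpy as np

# SVD of a non-square matrix and its relation to the eigenvalues of A^T A.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=True)   # A = U @ Lam @ Vt, with Lam of size 4x3

Lam = np.zeros(A.shape)
Lam[:len(s), :len(s)] = np.diag(s)                # singular values on the leading diagonal, eq. (253)
print(np.allclose(A, U @ Lam @ Vt))

# The singular values are the square roots of the eigenvalues of A^T A, eq. (246).
sigma = np.linalg.eigvalsh(A.T @ A)[::-1]         # eigenvalues sorted in descending order
print(np.allclose(s, np.sqrt(np.clip(sigma, 0.0, None))))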
