L. Vandenberghe ECE133B (Spring 2020) 4. Singular value decomposition

• singular value decomposition

• related eigendecompositions

• matrix properties from singular value decomposition

• min–max and max–min characterizations

• low-rank approximation

• sensitivity of linear equations

Singular value decomposition (SVD)

every m × n matrix A can be factored as

A = UΣVᵀ

• U is m × m and orthogonal
• V is n × n and orthogonal
• Σ is m × n and “diagonal”: Σ = diag(σ1, ..., σn) if m = n; if m > n, Σ consists of diag(σ1, ..., σn) on top of an (m − n) × n zero block; if m < n, Σ consists of diag(σ1, ..., σm) followed on the right by an m × (n − m) zero block

• the diagonal entries of Σ are nonnegative and sorted:

σ1 ≥ σ2 ≥ · · · ≥ σmin{m,n} ≥ 0
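as a quick numerical sketch (assuming NumPy; note that numpy.linalg.svd returns the singular values as a vector and Vᵀ rather than V):

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 5, 3
    A = rng.standard_normal((m, n))

    # full_matrices=True gives the full SVD: U is m x m, Vt is n x n
    U, s, Vt = np.linalg.svd(A, full_matrices=True)

    # assemble the m x n "diagonal" matrix Sigma from the vector s
    Sigma = np.zeros((m, n))
    Sigma[:n, :n] = np.diag(s)

    assert np.allclose(U @ Sigma @ Vt, A)               # A = U Sigma V^T
    assert np.allclose(U.T @ U, np.eye(m))              # U orthogonal
    assert np.allclose(Vt @ Vt.T, np.eye(n))            # V orthogonal
    assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)   # sorted, nonnegative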

Singular values and singular vectors

A = UΣVᵀ

• the numbers σ1,..., σmin{m,n} are the singular values of A

• the m columns ui of U are the left singular vectors

• the n columns vi of V are the right singular vectors

if we write the factorization A = UΣVᵀ as

AV = UΣ,   AᵀU = VΣᵀ

and compare the ith columns on the left- and right-hand sides, we see that

Avi = σiui   and   Aᵀui = σivi   for i = 1, ..., min{m, n}

• if m > n, the additional m − n vectors ui satisfy Aᵀui = 0 for i = n + 1, ..., m

• if n > m, the additional n − m vectors vi satisfy Avi = 0 for i = m + 1, ..., n
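the column relations are easy to check numerically; a small sketch with NumPy:

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 6, 4                       # tall example, m > n
    A = rng.standard_normal((m, n))
    U, s, Vt = np.linalg.svd(A)       # full SVD by default
    V = Vt.T

    for i in range(min(m, n)):
        assert np.allclose(A @ V[:, i], s[i] * U[:, i])    # A v_i = sigma_i u_i
        assert np.allclose(A.T @ U[:, i], s[i] * V[:, i])  # A^T u_i = sigma_i v_i

    # the extra columns of U are mapped to zero by A^T
    for i in range(n, m):
        assert np.allclose(A.T @ U[:, i], np.zeros(n))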

Reduced SVD

often m ≫ n or n ≫ m, which makes one of the orthogonal matrices very large

Tall matrix: if m > n, the last m − n columns of U can be omitted to define

A = UΣVᵀ

• U is m × n with orthonormal columns
• V is n × n and orthogonal

• Σ is n × n and diagonal with diagonal entries σ1 ≥ σ2 ≥ · · · ≥ σn ≥ 0

Wide matrix: if m < n, the last n − m columns of V can be omitted to define

A = UΣVᵀ

• U is m × m and orthogonal
• V is m × n with orthonormal columns
• Σ is m × m and diagonal with diagonal entries σ1 ≥ σ2 ≥ · · · ≥ σm ≥ 0

we refer to these as reduced SVDs (and to the factorization defined earlier as a full SVD)
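in NumPy, a reduced SVD is obtained with the full_matrices=False option; a sketch for a tall matrix:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((100, 3))            # tall: m >> n

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    print(U.shape, s.shape, Vt.shape)            # (100, 3) (3,) (3, 3)

    assert np.allclose(U @ np.diag(s) @ Vt, A)   # A = U Sigma V^T
    assert np.allclose(U.T @ U, np.eye(3))       # orthonormal columns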


Eigendecomposition of Gram matrix

suppose A is an m × n matrix with full SVD

A = UΣVᵀ

the SVD is related to the eigendecomposition of the Gram matrix AᵀA:

AᵀA = VΣᵀΣVᵀ

• V is an orthogonal n × n matrix

• ΣᵀΣ is a diagonal n × n matrix with (non-increasing) diagonal elements

σ1², σ2², ..., σmin{m,n}², followed by n − min{m, n} zeros

• the n diagonal elements of ΣᵀΣ are the eigenvalues of AᵀA
• the right singular vectors (columns of V) are corresponding eigenvectors

Gram matrix of transpose

the SVD also gives the eigendecomposition of AAᵀ:

AAᵀ = UΣΣᵀUᵀ

• U is an orthogonal m × m matrix

• ΣΣᵀ is a diagonal m × m matrix with (non-increasing) diagonal elements

σ1², σ2², ..., σmin{m,n}², followed by m − min{m, n} zeros

• the m diagonal elements of ΣΣᵀ are the eigenvalues of AAᵀ
• the left singular vectors (columns of U) are corresponding eigenvectors

in particular, the first min{m, n} eigenvalues of AᵀA and AAᵀ are the same:

σ1², σ2², ..., σmin{m,n}²
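a numerical sketch of both connections (numpy.linalg.eigh returns eigenvalues in increasing order, so they are reversed before comparing):

    import numpy as np

    rng = np.random.default_rng(3)
    m, n = 5, 3
    A = rng.standard_normal((m, n))
    s = np.linalg.svd(A, compute_uv=False)

    # eigenvalues of A^T A: sigma_1^2, ..., sigma_n^2 (here min{m,n} = n)
    evals_gram = np.linalg.eigh(A.T @ A)[0][::-1]
    assert np.allclose(evals_gram, s**2)

    # eigenvalues of A A^T: the same squares, padded with m - n zeros
    evals_gram_t = np.linalg.eigh(A @ A.T)[0][::-1]
    assert np.allclose(evals_gram_t, np.concatenate([s**2, np.zeros(m - n)]))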

Example

scatter plot (omitted) shows m = 500 points from the normal distribution on page 3.34, with

µ = (5, 4),   Σex = (1/4) [ 7  √3; √3  5 ]

• we define an m × 2 data matrix X with the m vectors as its rows
• the centered data matrix is Xc = X − (1/m)11ᵀX

sample estimate of mean is

µ̂ = (1/m) Xᵀ1 = (5.01, 3.93)

sample estimate of covariance is

Σ̂ = (1/m) XcᵀXc = [ 1.67  0.48; 0.48  1.35 ]

Example (continued): define

A = (1/√m) Xc

• eigenvectors of Σ̂ are right singular vectors v1, v2 of A (and of Xc)
• eigenvalues of Σ̂ are squares of the singular values of A

(scatter plot with the directions v1 and v2 marked is omitted)
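the example can be reproduced with synthetic data; a sketch, where the random seed is arbitrary and the sample estimates will differ slightly from the values above:

    import numpy as np

    rng = np.random.default_rng(4)
    m = 500
    mu = np.array([5.0, 4.0])
    Sigma_ex = np.array([[7.0, np.sqrt(3)], [np.sqrt(3), 5.0]]) / 4

    X = rng.multivariate_normal(mu, Sigma_ex, size=m)   # m x 2 data matrix
    Xc = X - X.mean(axis=0)                             # centered data matrix
    A = Xc / np.sqrt(m)

    # eigenvectors of the sample covariance are right singular vectors of A
    Sigma_hat = Xc.T @ Xc / m
    evals, evecs = np.linalg.eigh(Sigma_hat)
    _, s, Vt = np.linalg.svd(A, full_matrices=False)

    assert np.allclose(evals[::-1], s**2)     # eigenvalues = squared singular values
    # eigenvectors match the right singular vectors up to sign
    assert np.allclose(np.abs(evecs[:, ::-1].T @ Vt.T), np.eye(2), atol=1e-6)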

Existence of singular value decomposition

the Gram matrix connection gives a proof that every matrix has an SVD

• assume A is m × n with m ≥ n and rank r
• the n × n matrix AᵀA has rank r (page 2.5) and an eigendecomposition

AᵀA = VΛVᵀ    (1)

Λ is diagonal with diagonal elements λ1 ≥ · · · ≥ λr > 0 = λr+1 = · · · = λn

• define σi = √λi for i = 1, ..., n, and an m × m matrix

U = [ u1 · · · um ] = [ (1/σ1)Av1  (1/σ2)Av2  · · ·  (1/σr)Avr  ur+1  · · ·  um ]

where ur+1, ..., um form any orthonormal basis for null(Aᵀ)

• (1) and the choice of ur+1, ..., um imply that U is orthogonal

• (1) also implies that Avi = 0 for i = r + 1, ..., n

• together with the definition of u1, ..., ur this shows that AV = UΣ
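the construction can be carried out numerically; a sketch for a full-rank tall matrix (r = n, so no basis completion is needed):

    import numpy as np

    rng = np.random.default_rng(5)
    m, n = 6, 3
    A = rng.standard_normal((m, n))          # full rank with probability 1 (r = n)

    lam, V = np.linalg.eigh(A.T @ A)         # ascending eigenvalues
    lam, V = lam[::-1], V[:, ::-1]           # reorder to lambda_1 >= ... >= lambda_n
    sigma = np.sqrt(lam)

    U1 = (A @ V) / sigma                     # columns u_i = (1/sigma_i) A v_i
    assert np.allclose(U1.T @ U1, np.eye(n))           # orthonormal, as implied by (1)
    assert np.allclose(A @ V, U1 @ np.diag(sigma))     # A V = U Sigma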

Non-uniqueness of singular value decomposition

the derivation from the eigendecomposition

AᵀA = VΛVᵀ

shows that the singular value decomposition of A is almost unique

Singular values
• the singular values of A are uniquely defined
• we have also shown that A and Aᵀ have the same singular values

Singular vectors (assuming m ≥ n): see the discussion on page 3.14

• right singular vectors vi with the same positive singular value span a subspace
• in this subspace, any other orthonormal basis can be chosen
• the first r = rank(A) left singular vectors then follow from σiui = Avi
• the remaining vectors vr+1, ..., vn can be any orthonormal basis for null(A)
• the remaining vectors ur+1, ..., um can be any orthonormal basis for null(Aᵀ)

Exercise

suppose A is an m × n matrix with m ≥ n, and define

B = [ 0  A; Aᵀ  0 ]

1. suppose A = UΣVᵀ is a full SVD of A; verify that

B = [ U  0; 0  V ] [ 0  Σ; Σᵀ  0 ] [ U  0; 0  V ]ᵀ

2. derive from this an eigendecomposition of B

Hint: if Σ1 is square, then

[ 0  Σ1; Σ1  0 ] = (1/2) [ I  I; I  −I ] [ Σ1  0; 0  −Σ1 ] [ I  I; I  −I ]

3. what are the m + n eigenvalues of B?
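the eigenvalues asked for in part 3 can be checked numerically; a sketch:

    import numpy as np

    rng = np.random.default_rng(6)
    m, n = 5, 3
    A = rng.standard_normal((m, n))
    s = np.linalg.svd(A, compute_uv=False)

    B = np.block([[np.zeros((m, m)), A], [A.T, np.zeros((n, n))]])
    evals = np.sort(np.linalg.eigvalsh(B))

    # eigenvalues of B: +/- sigma_i, plus m - n zeros
    expected = np.sort(np.concatenate([s, -s, np.zeros(m - n)]))
    assert np.allclose(evals, expected)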


Rank

the number of positive singular values is the rank of a matrix

• suppose there are r positive singular values:

σ1 ≥ · · · ≥ σr > 0 = σr+1 = ··· = σmin{m,n}

• partition the matrices in a full SVD of A as

A = [ U1  U2 ] [ Σ1  0; 0  0 ] [ V1  V2 ]ᵀ = U1Σ1V1ᵀ    (2)

Σ1 is r × r with the positive singular values σ1,..., σr on the diagonal

• since U1 and V1 have orthonormal columns, the factorization (2) proves that

rank(A) = r

(see page 1.12)
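in floating-point arithmetic the rank is estimated by counting singular values above a small tolerance; a sketch (the tolerance mirrors what numpy.linalg.matrix_rank uses by default):

    import numpy as np

    rng = np.random.default_rng(7)
    # build a 6 x 4 matrix of rank 2 as a product of random factors
    A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

    s = np.linalg.svd(A, compute_uv=False)
    tol = max(A.shape) * np.finfo(A.dtype).eps * s[0]   # a common tolerance choice
    r = int(np.sum(s > tol))

    assert r == 2
    assert r == np.linalg.matrix_rank(A)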

Inverse and pseudo-inverse

we use the same notation as on the previous page:

A = [ U1  U2 ] [ Σ1  0; 0  0 ] [ V1  V2 ]ᵀ = U1Σ1V1ᵀ

the diagonal entries of Σ1 are the positive singular values of A

• pseudo-inverse follows from page 1.36:

A† = V1Σ1⁻¹U1ᵀ = [ V1  V2 ] [ Σ1⁻¹  0; 0  0 ] [ U1  U2 ]ᵀ = VΣ†Uᵀ

• if A is square and nonsingular, this reduces to the inverse

A⁻¹ = (UΣVᵀ)⁻¹ = VΣ⁻¹Uᵀ
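a numerical sketch building A† from the reduced factors and comparing with numpy.linalg.pinv:

    import numpy as np

    rng = np.random.default_rng(8)
    A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank 2

    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > 1e-10))             # number of positive singular values
    U1, V1, S1 = U[:, :r], Vt[:r, :].T, np.diag(s[:r])

    A_pinv = V1 @ np.linalg.inv(S1) @ U1.T
    assert np.allclose(A_pinv, np.linalg.pinv(A))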

Four subspaces

we continue with the same notation for the SVD of an m × n matrix A with rank r:

A = [ U1  U2 ] [ Σ1  0; 0  0 ] [ V1  V2 ]ᵀ

the diagonal entries of Σ1 are the positive singular values of A

the SVD provides orthonormal bases for the four subspaces associated with A

• the columns of the m × r matrix U1 are a basis of range(A)
• the columns of the m × (m − r) matrix U2 are a basis of range(A)⊥ = null(Aᵀ)
• the columns of the n × r matrix V1 are a basis of range(Aᵀ)

• the columns of the n × (n − r) matrix V2 are a basis of null(A)
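a sketch extracting the four orthonormal bases from a full SVD of a rank-2 test matrix:

    import numpy as np

    rng = np.random.default_rng(9)
    m, n, r = 6, 4, 2
    A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank r

    U, s, Vt = np.linalg.svd(A)
    U1, U2 = U[:, :r], U[:, r:]         # bases of range(A) and null(A^T)
    V1, V2 = Vt[:r, :].T, Vt[r:, :].T   # bases of range(A^T) and null(A)

    assert np.allclose(A.T @ U2, 0, atol=1e-10)   # U2 spans null(A^T)
    assert np.allclose(A @ V2, 0, atol=1e-10)     # V2 spans null(A)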

Image of unit ball

define Ey as the image of the unit ball under the linear mapping y = Ax:

Ey = {Ax | ‖x‖ ≤ 1} = {UΣVᵀx | ‖x‖ ≤ 1}

(figure: the unit ball is mapped in three steps, x̃ = Vᵀx (rotation), ỹ = Σx̃ (scaling by σ1, σ2 along the coordinate axes), y = Uỹ (rotation); the image Ey is an ellipsoid with semi-axes σ1u1 and σ2u2)

Control interpretation

system A maps input x to output y = Ax

x ──▶ [ A ] ──▶ y = Ax

• if ‖x‖² represents input energy, the set of outputs realizable with unit energy is

Ey = {Ax | ‖x‖ ≤ 1}

• assume rank(A) = m: every desired y can be realized by at least one input
• the most energy-efficient input that generates output y is

xeff = A†y = Aᵀ(AAᵀ)⁻¹y,   ‖xeff‖² = yᵀ(AAᵀ)⁻¹y = ∑_{i=1}^{m} (uiᵀy)²/σi²

• (if rank(A) = m) the set Ey is an ellipsoid: Ey = {y | yᵀ(AAᵀ)⁻¹y ≤ 1}
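a numerical sketch of the minimum-energy input and the energy formula, for a randomly generated wide A (rank m with probability one):

    import numpy as np

    rng = np.random.default_rng(10)
    m, n = 3, 6                          # wide, full row rank
    A = rng.standard_normal((m, n))
    y = rng.standard_normal(m)

    x_eff = A.T @ np.linalg.solve(A @ A.T, y)      # A^T (A A^T)^{-1} y
    assert np.allclose(A @ x_eff, y)               # x_eff realizes the output y
    assert np.allclose(x_eff, np.linalg.pinv(A) @ y)

    U, s, _ = np.linalg.svd(A, full_matrices=False)
    energy = np.sum((U.T @ y)**2 / s**2)           # sum_i (u_i^T y)^2 / sigma_i^2
    assert np.allclose(np.dot(x_eff, x_eff), energy)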

Inverse image of unit ball

define Ex as the inverse image of the unit ball under the linear mapping y = Ax:

Ex = {x | ‖Ax‖ ≤ 1} = {x | ‖UΣVᵀx‖ ≤ 1}

(figure: Ex is an ellipsoid with semi-axes (1/σ1)v1 and (1/σ2)v2; the three steps x̃ = Vᵀx, ỹ = Σx̃, y = Uỹ map it to the unit ball)

Estimation interpretation

measurement A maps unknown quantity xtrue to observation

yobs = A(xtrue + v)

where v is unknown but bounded by ‖Av‖ ≤ 1

• if rank(A) = n, there is a unique estimate x̂ that satisfies Ax̂ = yobs

• uncertainty in y causes uncertainty in the estimate: the true value xtrue must satisfy

‖A(xtrue − x̂)‖ ≤ 1

• the set Ex = {x | ‖A(x − x̂)‖ ≤ 1} is the uncertainty region around the estimate x̂

Frobenius norm and 2-norm

for an m × n matrix A with singular values σi:

‖A‖F = (∑_{i=1}^{min{m,n}} σi²)^{1/2},   ‖A‖2 = σ1

this readily follows from the unitary invariance of the two norms:

‖A‖F = ‖UΣVᵀ‖F = ‖Σ‖F = (∑_{i=1}^{min{m,n}} σi²)^{1/2}

and

‖A‖2 = ‖UΣVᵀ‖2 = ‖Σ‖2 = σ1
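a numerical sketch of the two identities:

    import numpy as np

    rng = np.random.default_rng(11)
    A = rng.standard_normal((5, 3))
    s = np.linalg.svd(A, compute_uv=False)

    assert np.allclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))
    assert np.allclose(np.linalg.norm(A, 2), s[0])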

Exercise

Exercise 1: express ‖A†‖2 and ‖A†‖F in terms of the singular values of A

Exercise 2: the condition number of a square nonsingular matrix A is defined as

κ(A) = ‖A‖2 ‖A⁻¹‖2

express κ(A) in terms of the singular values of A

Exercise 3: give an SVD and the 2-norm of the matrix

A = abᵀ

where a is an n-vector and b is an m-vector


First singular value

the first singular value is the maximal value of several functions:

σ1 = max_{‖x‖=1} ‖Ax‖ = max_{‖x‖=‖y‖=1} yᵀAx = max_{‖y‖=1} ‖Aᵀy‖    (3)

• the first and last expressions follow from page 3.24 and

σ1² = λmax(AᵀA) = max_{‖x‖=1} xᵀAᵀAx,   σ1² = λmax(AAᵀ) = max_{‖y‖=1} yᵀAAᵀy

• the second expression in (3) follows from the Cauchy–Schwarz inequality:

‖Ax‖ = max_{‖y‖=1} yᵀ(Ax),   ‖Aᵀy‖ = max_{‖x‖=1} xᵀ(Aᵀy)

First singular value

alternatively, we can use an SVD of A to solve the maximization problems in

σ1 = max_{‖x‖=1} ‖Ax‖ = max_{‖x‖=‖y‖=1} yᵀAx = max_{‖y‖=1} ‖Aᵀy‖    (4)

• suppose A = UΣVᵀ is a full SVD of A

• if we define x̃ = Vᵀx, ỹ = Uᵀy, then (4) can be written as

σ1 = max_{‖x̃‖=1} ‖Σx̃‖ = max_{‖x̃‖=‖ỹ‖=1} ỹᵀΣx̃ = max_{‖ỹ‖=1} ‖Σᵀỹ‖

• an optimal choice for x̃ and ỹ is x̃ = (1, 0, ..., 0) and ỹ = (1, 0, ..., 0)

• therefore x = v1 and y = u1 are optimal for each of the maximizations in (4)
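a sketch verifying that x = v1 and y = u1 attain the maxima, and that random unit vectors do no better:

    import numpy as np

    rng = np.random.default_rng(12)
    A = rng.standard_normal((5, 3))
    U, s, Vt = np.linalg.svd(A)
    u1, v1, sigma1 = U[:, 0], Vt[0, :], s[0]

    assert np.allclose(np.linalg.norm(A @ v1), sigma1)   # x = v1 attains sigma_1
    assert np.allclose(u1 @ A @ v1, sigma1)              # (x, y) = (v1, u1) attains sigma_1

    # no random unit vector does better
    for _ in range(1000):
        x = rng.standard_normal(3)
        x /= np.linalg.norm(x)
        assert np.linalg.norm(A @ x) <= sigma1 + 1e-12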

Last singular value

two of the three expressions in (3) have a counterpart for the last singular value

• for an m × n matrix A, the last singular value σmin{m,n} can be written as follows:

if m ≥ n:  σn = min_{‖x‖=1} ‖Ax‖,    if n ≥ m:  σm = min_{‖y‖=1} ‖Aᵀy‖    (5)

• if m ≠ n, we need to distinguish the two cases because

min_{‖x‖=1} ‖Ax‖ = 0 if n > m,   min_{‖y‖=1} ‖Aᵀy‖ = 0 if m > n

to prove (5), we substitute the full SVD A = UΣVᵀ and define x̃ = Vᵀx, ỹ = Uᵀy:

if m ≥ n:  min_{‖x̃‖=1} ‖Σx̃‖ = min_{‖x̃‖=1} (σ1²x̃1² + · · · + σn²x̃n²)^{1/2} = σn

if n ≥ m:  min_{‖ỹ‖=1} ‖Σᵀỹ‖ = min_{‖ỹ‖=1} (σ1²ỹ1² + · · · + σm²ỹm²)^{1/2} = σm

optimal choices for x and y in (5) are x = vn, y = um

Max–min characterization

we extend (3) to a max–min characterization of the other singular values:

σk = max_{XᵀX=Ik} σmin(AX)    (6a)
   = max_{XᵀX=YᵀY=Ik} σmin(YᵀAX)    (6b)
   = max_{YᵀY=Ik} σmin(AᵀY)    (6c)

• σk for k = 1, ..., min{m, n} are the singular values of the m × n matrix A
• X is n × k with orthonormal columns, Y is m × k with orthonormal columns

• σmin(B) denotes the smallest singular value of the matrix B

• in the three expressions in (6), the matrices inside σmin(·) have k columns, so σmin(·) is their kth singular value

• for k = 1, we obtain the three expressions for σ1 in (3)

• these can be derived from the min–max theorems for eigenvalues (p. 3.29)
• or we can find an optimal choice for X, Y from an SVD of A (we skip the details)

Min–max characterization

we extend (5) to a min–max characterization of all singular values

Tall or square matrix: if A is m × n with m ≥ n

σn−k+1 = min_{XᵀX=Ik} ‖AX‖2,   k = 1, ..., n    (7)

• we minimize over n × k matrices X with orthonormal columns

• ‖AX‖2 is the maximum singular value of an m × k matrix
• for k = 1, this is the first expression in (5)

Wide or square matrix (A is m × n with m ≤ n)

σm−k+1 = min_{YᵀY=Ik} ‖AᵀY‖2,   k = 1, ..., m

• we minimize over m × k matrices Y with orthonormal columns
• ‖AᵀY‖2 is the maximum singular value of an n × k matrix
• for k = 1, this is the second expression in (5)

Proof of min–max characterization (for m ≥ n)

we use a full SVD A = UΣVᵀ to solve the optimization problem

minimize ‖AX‖2 subject to XᵀX = I

changing variables to X̃ = VᵀX gives the equivalent problem

minimize ‖ΣX̃‖2 subject to X̃ᵀX̃ = I

• we show that the optimal value is σn−k+1
• an optimal solution for X̃ is formed from the last k columns of the n × n identity matrix:

X̃opt = [ en−k+1 · · · en−1 en ]

• an optimal solution for X is formed from the last k columns of V:

Xopt = [ vn−k+1 · · · vn−1 vn ]

Proof of min–max characterization (for m ≥ n)

• we first note that ΣX̃opt consists of the last k columns of Σ, so ‖ΣX̃opt‖2 = σn−k+1

• to show that this is optimal, consider any other n × k matrix X̃ with X̃ᵀX̃ = I

• find a nonzero k-vector u for which y = X̃u has its last k − 1 components zero:

X̃u = (y1, ..., yn−k+1, 0, ..., 0)

this is possible because a (k − 1) × k matrix has linearly dependent columns

• normalize u so that ‖u‖ = ‖y‖ = 1, and use it to lower bound ‖ΣX̃‖2:

‖ΣX̃‖2 ≥ ‖ΣX̃u‖2
       = ‖Σy‖2
       = (σ1²y1² + · · · + σn−k+1²yn−k+1²)^{1/2}
       ≥ σn−k+1 (y1² + · · · + yn−k+1²)^{1/2}
       = σn−k+1


Rank-r approximation

let A be an m × n matrix with rank(A) > r and full SVD

A = UΣVᵀ = ∑_{i=1}^{min{m,n}} σiuiviᵀ,   σ1 ≥ · · · ≥ σmin{m,n} ≥ 0,   σr+1 > 0

the best rank-r approximation of A is the sum of the first r terms in the SVD:

B = ∑_{i=1}^{r} σiuiviᵀ

• B is the best approximation for the Frobenius norm: for every C with rank r,

‖A − C‖F ≥ ‖A − B‖F = (∑_{i=r+1}^{min{m,n}} σi²)^{1/2}

• B is also the best approximation for the 2-norm: for every C with rank r,

‖A − C‖2 ≥ ‖A − B‖2 = σr+1
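a sketch of the truncated-SVD approximation and both error formulas:

    import numpy as np

    rng = np.random.default_rng(13)
    A = rng.standard_normal((8, 6))
    r = 2

    U, s, Vt = np.linalg.svd(A)
    B = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]       # sum of first r terms

    assert np.linalg.matrix_rank(B) == r
    assert np.allclose(np.linalg.norm(A - B, 2), s[r])              # sigma_{r+1}
    assert np.allclose(np.linalg.norm(A - B, 'fro'),
                       np.sqrt(np.sum(s[r:]**2)))

    # a random rank-r matrix C does no better
    C = rng.standard_normal((8, r)) @ rng.standard_normal((r, 6))
    assert np.linalg.norm(A - C, 2) >= s[r] - 1e-12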

Rank-r approximation in Frobenius norm

the approximation problem in ‖ · ‖F is a nonlinear least squares problem:

minimize ‖A − YXᵀ‖F² = ∑_{i=1}^{m} ∑_{j=1}^{n} (Aij − ∑_{k=1}^{r} YikXjk)²    (8)

• matrix B is written as B = YXᵀ with Y of size m × r and X of size n × r
• we optimize over X and Y

Outline of the solution

• the first order (necessary but not sufficient) optimality conditions are

AX = Y(XᵀX),   AᵀY = X(YᵀY)

• can assume XᵀX = YᵀY = D with D diagonal (e.g., get X, Y from SVD of B)
• then columns of X, Y are formed from r pairs of right/left singular vectors of A
• optimal choice for (8) is to take the first r singular vector pairs

Rank-r approximation in 2-norm

to show that B is the best approximation in 2-norm, we prove that

‖A − C‖2 ≥ ‖A − B‖2 for all C with rank(C) = r

on the right-hand side,

A − B = ∑_{i=1}^{min{m,n}} σiuiviᵀ − ∑_{i=1}^{r} σiuiviᵀ = ∑_{i=r+1}^{min{m,n}} σiuiviᵀ

      = [ ur+1 · · · umin{m,n} ] diag(σr+1, ..., σmin{m,n}) [ vr+1 · · · vmin{m,n} ]ᵀ

and the 2-norm of the difference is ‖A − B‖2 = σr+1

below we show that ‖A − C‖2 ≥ σr+1 if C has rank r

Proof: we prove that ‖A − C‖2 ≥ σr+1 for all m × n matrices C of rank r

• we assume that m ≥ n (otherwise, first take the transpose of A and C)

• if rank(C) = r, the nullspace of C has dimension n − r

• define an n × (n − r) matrix X̂ with orthonormal columns that span null(C)

• use the min–max characterization (7) to bound ‖A − C‖2:

‖A − C‖2 = max_{‖x‖=1} ‖(A − C)x‖
         ≥ max_{‖w‖=1} ‖(A − C)X̂w‖    (‖X̂w‖ = 1 if ‖w‖ = 1)
         = ‖(A − C)X̂‖2
         = ‖AX̂‖2    (CX̂ = 0)
         ≥ min_{XᵀX=In−r} ‖AX‖2
         = σr+1    (apply (7) with k = n − r)


SVD of square matrix

for the rest of the lecture we assume that A is n × n and nonsingular with SVD

A = UΣVᵀ = ∑_{i=1}^{n} σiuiviᵀ

• 2-norm of A is ‖A‖2 = σ1

• A is nonsingular if and only if σn > 0
• inverse of A and 2-norm of the inverse are

A⁻¹ = VΣ⁻¹Uᵀ = ∑_{i=1}^{n} (1/σi) viuiᵀ,   ‖A⁻¹‖2 = 1/σn

• condition number of A is

κ(A) = ‖A‖2 ‖A⁻¹‖2 = σ1/σn ≥ 1

A is called ill-conditioned if the condition number is very high
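a sketch computing κ(A) from the singular values; numpy.linalg.cond with p=2 uses the same ratio:

    import numpy as np

    rng = np.random.default_rng(14)
    A = rng.standard_normal((4, 4))
    s = np.linalg.svd(A, compute_uv=False)

    kappa = s[0] / s[-1]
    assert np.allclose(kappa, np.linalg.cond(A, 2))
    assert np.allclose(np.linalg.norm(np.linalg.inv(A), 2), 1 / s[-1])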

Sensitivity to right-hand side perturbations

linear equation with right-hand side b ≠ 0 and perturbed right-hand side b + e:

Ax = b, Ay = b + e

• bound on distance between the solutions:

‖y − x‖ = ‖A⁻¹e‖ ≤ ‖A⁻¹‖2 ‖e‖

recall that ‖Bx‖ ≤ ‖B‖2 ‖x‖ for the matrix 2-norm and Euclidean vector norm

• bound on relative change in the solution, in terms of δb = ‖e‖/‖b‖:

‖y − x‖/‖x‖ ≤ ‖A‖2 ‖A⁻¹‖2 ‖e‖/‖b‖ = κ(A) δb

in the first step we use ‖b‖ = ‖Ax‖ ≤ ‖A‖2 ‖x‖

large κ(A) indicates that the solution can be very sensitive to changes in b

Worst-case perturbation of right-hand side

‖y − x‖/‖x‖ ≤ κ(A) δb   where δb = ‖e‖/‖b‖

• the upper bound is often very conservative
• however, for every A one can find b, e for which the bound holds with equality

• choose b = u1 (first left singular vector of A): solution of Ax = b is

x = A⁻¹b = VΣ⁻¹Uᵀu1 = (1/σ1) v1

• choose e = δb un (δb times left singular vector un): solution of Ay = b + e is

y = A⁻¹(b + e) = x + (δb/σn) vn

• relative change is

‖y − x‖/‖x‖ = σ1δb/σn = κ(A) δb
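a numerical sketch of this worst-case construction, with an arbitrary value for δb:

    import numpy as np

    rng = np.random.default_rng(15)
    A = rng.standard_normal((4, 4))
    U, s, Vt = np.linalg.svd(A)
    kappa = s[0] / s[-1]
    delta_b = 0.01                          # assumed relative perturbation size

    b = U[:, 0]                             # b = u1
    e = delta_b * U[:, -1]                  # e = delta_b * u_n
    x = np.linalg.solve(A, b)
    y = np.linalg.solve(A, b + e)

    rel_change = np.linalg.norm(y - x) / np.linalg.norm(x)
    assert np.allclose(rel_change, kappa * delta_b)   # bound holds with equality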

Nearest singular matrix

the singular matrix closest to A is

∑_{i=1}^{n−1} σiuiviᵀ = A + E   where E = −σnunvnᵀ

• this gives another interpretation of the condition number:

‖E‖2 = σn = 1/‖A⁻¹‖2,   ‖E‖2/‖A‖2 = σn/σ1 = 1/κ(A)

1/κ(A) is the relative distance of A to the nearest singular matrix

• this also implies that a perturbation A + E of A is certainly nonsingular if

‖E‖2 < 1/‖A⁻¹‖2 = σn
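a numerical sketch of the nearest singular matrix:

    import numpy as np

    rng = np.random.default_rng(16)
    A = rng.standard_normal((4, 4))
    U, s, Vt = np.linalg.svd(A)

    E = -s[-1] * np.outer(U[:, -1], Vt[-1, :])    # E = -sigma_n u_n v_n^T
    s_pert = np.linalg.svd(A + E, compute_uv=False)

    assert np.allclose(s_pert[-1], 0)             # A + E is singular
    assert np.allclose(np.linalg.norm(E, 2), s[-1])
    assert np.allclose(np.linalg.norm(E, 2) / np.linalg.norm(A, 2),
                       s[-1] / s[0])              # relative distance 1/kappa(A)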

Bound on inverse

below we prove the following inequality:

‖(A + E)⁻¹‖2 ≤ ‖A⁻¹‖2 / (1 − ‖A⁻¹‖2 ‖E‖2)   if ‖E‖2 < 1/‖A⁻¹‖2    (9)

using ‖A⁻¹‖2 = 1/σn, the bound can also be written as

‖(A + E)⁻¹‖2 ≤ 1/(σn − ‖E‖2)   if ‖E‖2 < σn

(figure: plot of the bound 1/(σn − ‖E‖2) as a function of ‖E‖2, increasing from 1/σn at ‖E‖2 = 0 and unbounded as ‖E‖2 → σn)

Proof:

• the matrix Y = (A + E)⁻¹ satisfies

(I + A⁻¹E)Y = A⁻¹(A + E)Y = A⁻¹

• therefore

‖Y‖2 = ‖A⁻¹ − A⁻¹EY‖2
     ≤ ‖A⁻¹‖2 + ‖A⁻¹EY‖2    (triangle inequality)
     ≤ ‖A⁻¹‖2 + ‖A⁻¹E‖2 ‖Y‖2

in the last step we use the property ‖CD‖2 ≤ ‖C‖2 ‖D‖2 of the matrix 2-norm

• rearranging the last inequality for ‖Y‖2 gives

‖Y‖2 ≤ ‖A⁻¹‖2 / (1 − ‖A⁻¹E‖2) ≤ ‖A⁻¹‖2 / (1 − ‖A⁻¹‖2 ‖E‖2)

in the second step we again use the property ‖A⁻¹E‖2 ≤ ‖A⁻¹‖2 ‖E‖2

Sensitivity to perturbations of coefficient matrix

linear equation with matrix A and perturbed matrix A + E:

Ax = b, (A + E)y = b

• we assume ‖E‖2 < 1/‖A⁻¹‖2, which guarantees that A + E is nonsingular
• bound on distance between the solutions:

‖y − x‖ = ‖(A + E)⁻¹(b − (A + E)x)‖
        = ‖(A + E)⁻¹Ex‖
        ≤ ‖(A + E)⁻¹‖2 ‖E‖2 ‖x‖
        ≤ (‖A⁻¹‖2 ‖E‖2 / (1 − ‖A⁻¹‖2 ‖E‖2)) ‖x‖    (applying (9))

• bound on relative change in solution, in terms of δA = ‖E‖2/‖A‖2:

‖y − x‖/‖x‖ ≤ κ(A) δA / (1 − κ(A) δA)    (10)

Worst-case perturbation of coefficient matrix

an example where the upper bound (10) is sharp (from the SVD A = ∑_{i=1}^{n} σiuiviᵀ)

• choose b = un: the solution of Ax = b is

x = A⁻¹b = (1/σn) vn

• choose E = −δAσ1unvnᵀ with δA < σn/σ1 = 1/κ(A):

A + E = ∑_{i=1}^{n−1} σiuiviᵀ + (σn − δAσ1) unvnᵀ

• solution of (A + E)y = b is

y = (A + E)⁻¹b = (1/(σn − δAσ1)) vn

• relative change in solution is

‖y − x‖/‖x‖ = σn (1/(σn − δAσ1) − 1/σn) = δAκ(A)/(1 − δAκ(A))
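a numerical sketch of this construction, with δA chosen below 1/κ(A):

    import numpy as np

    rng = np.random.default_rng(17)
    A = rng.standard_normal((4, 4))
    U, s, Vt = np.linalg.svd(A)
    kappa = s[0] / s[-1]
    delta_A = 0.1 / kappa                   # assumed size, < 1/kappa(A)

    b = U[:, -1]                            # b = u_n
    E = -delta_A * s[0] * np.outer(U[:, -1], Vt[-1, :])
    x = np.linalg.solve(A, b)
    y = np.linalg.solve(A + E, b)

    rel_change = np.linalg.norm(y - x) / np.linalg.norm(x)
    assert np.allclose(rel_change, delta_A * kappa / (1 - delta_A * kappa))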

Exercises

Exercise 1: to evaluate the sensitivity to changes in A, we can also look at the residual

‖(A + E)x − b‖

where x = A⁻¹b is the solution of Ax = b

1. show that

‖(A + E)x − b‖/‖b‖ ≤ κ(A) δA   where δA = ‖E‖2/‖A‖2

2. show that for every A there exist b, E for which the inequality is sharp

Exercises

Exercise 2: consider perturbations in A and b

Ax = b, (A + E)y = b + e

assuming ‖E‖2 < 1/‖A⁻¹‖2, show that

‖y − x‖/‖x‖ ≤ (δA + δb) κ(A) / (1 − δAκ(A))

where δb = ‖e‖/‖b‖ and δA = ‖E‖2/‖A‖2
