Solutions to Selected Problems Chapter 1

1.7. Solution

Clearly

[2]^{-1} = [1/2],   [3 1; 1 3]^{-1} = (1/8)[3 −1; −1 3],   [4 1 1; 1 4 1; 1 1 4]^{-1} = (1/18)[5 −1 −1; −1 5 −1; −1 −1 5],

and the sum of the elements in the inverse is 1/2 in each case. We show that this is true in general. The matrix A_n = nI + J, where J is the matrix every element of which is 1, and in the special cases above the inverses are linear combinations of I and J. Let us see if this is true in general. Assume A_n^{-1} = αI + βJ. Then (αI + βJ)(nI + J) = I, which gives

nαI + (α + βn)J + βJ² = I,

which can be satisfied, since J² = nJ, by taking

α = 1/n,  β = −1/(2n²).

Since the inverse is unique we have

A_n^{-1} = (1/n)I − (1/(2n²))J

and the sum of its elements is

n × (1/n) + n² × (−1/(2n²)) = 1 − 1/2 = 1/2.
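This can be confirmed with exact rational arithmetic (a sketch; the size n = 4 is an arbitrary choice):

```python
from fractions import Fraction

n = 4
# A_n = n I + J, where J is the all-ones matrix
A = [[Fraction(n + 1) if i == j else Fraction(1) for j in range(n)] for i in range(n)]
# claimed inverse: (1/n) I - (1/(2 n^2)) J
B = [[(Fraction(1, n) if i == j else Fraction(0)) - Fraction(1, 2 * n * n)
      for j in range(n)] for i in range(n)]
prod = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
total = sum(sum(row) for row in B)
print(prod == [[1 if i == j else 0 for j in range(n)] for i in range(n)], total)
```

The product is exactly the identity and the sum of the elements of the inverse is exactly 1/2.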

The answer in the case of the Hilbert matrix is n². See e.g. D. E. Knuth, The Art of Computer Programming, I (1968), pp. 36-37, 473-474.

1.9. Solution

Since R(x) = R(rx) for any r ≠ 0 we may replace the condition x ≠ 0 by x'x = 1. We know that we can choose an orthonormal system of vectors

c₁, c₂, ..., c_n which span the whole space R_n and which are characteristic vectors of A, say Ac_i = α_i c_i, i = 1, 2, ..., n. Hence we can express any x with x'x = 1 as

x = Σ ξ_i c_i

where Σ ξ_i² = 1. Since

R(x) = x'Ax = Σ_{i,j} α_j ξ_i ξ_j c_i'c_j

= Σ_i α_i ξ_i²  (by orthonormality)

we have

α_n = α_n Σ ξ_i² ≤ R(x) = Σ α_i ξ_i² ≤ α_1 Σ ξ_i² = α_1.

Also, clearly, for any i, R(c_i) = α_i,

and so the bounds are attained.
In view of the importance of the Rayleigh quotient in numerical mathematics we add three remarks, the first two dealing with the two-dimensional case.
(1) We show how the Rayleigh quotient varies when A = [a h; h b]. By homogeneity we can restrict ourselves to vectors of unit length, say x = [cos θ, sin θ]'. Then

Q(θ) = a cos²θ + 2h cos θ sin θ + b sin²θ
     = ½[(a−b) cos 2θ + 2h sin 2θ + (a+b)].
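Numerically, Q(θ) ranges exactly over the interval between the two characteristic values of A (a sketch; the values a = 2, h = 1, b = 3 are arbitrary):

```python
import math

a, h, b = 2.0, 1.0, 3.0

def Q(t):
    # Rayleigh quotient of A = [a h; h b] at x = [cos t, sin t]'
    return a * math.cos(t)**2 + 2 * h * math.cos(t) * math.sin(t) + b * math.sin(t)**2

vals = [Q(math.pi * k / 20000) for k in range(20000)]
# characteristic values of [[a, h], [h, b]]
disc = math.sqrt((a - b)**2 + 4 * h * h)
lam_min, lam_max = (a + b - disc) / 2, (a + b + disc) / 2
print(min(vals), max(vals), lam_min, lam_max)
```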

To study the variation of Q(θ) with θ observe that

q(φ) = α cos φ + β sin φ
     = √(α²+β²)[(α/√(α²+β²)) cos φ + (β/√(α²+β²)) sin φ]
     = √(α²+β²) sin(φ + ψ)

for a suitable phase ψ. With α = a−b, β = 2h, φ = 2θ it follows that Q(θ) oscillates between ½[(a+b) − √((a−b)²+4h²)] and ½[(a+b) + √((a−b)²+4h²)], i.e., between the two characteristic values of A.

(2) The fact that the characteristic vectors are involved can be seen by use of the Lagrange multipliers. To find extrema of ax² + 2hxy + by² subject to x² + y² = 1, say, we compute E_x, E_y where

E = ax² + 2hxy + by² − λ(x² + y²).

Then E_x = 2(a−λ)x + 2hy, E_y = 2hx + 2(b−λ)y and at an extremum

(a−λ)x + hy = 0,
hx + (b−λ)y = 0.

For a non-trivial solution we must have

det [a−λ h; h b−λ] = 0,

i.e., λ must be a characteristic value of [a h; h b].
(3) A very important general principle should be pointed out here. At an extremum x₀ of y = f(x) at which f(x) is smooth, it is true that x "near" x₀ implies f(x) "very near" f(x₀). In the simplest case, f(x) = x² and x₀ = 0, we have f(x) = x² of the "second order" in x; this is not true if we do not insist on smoothness, as is shown by the case g(x) = |x|, x₀ = 0, in which g(x) is of the same order as x. We are just using the Taylor expansion about x₀:

f(x) − f(x₀) = (x−x₀)²[½ f''(x₀) + ...]

in the case where f'(x₀) = 0. This idea can be generalized to the case where y = f(x) is a scalar function of a vector variable x, in particular y = R(x). It means that from a "good" guess at a characteristic vector of A, the Rayleigh quotient gives a "very good" estimate of the corresponding characteristic value.

1.10. Solution

(I − 2mm')(I − 2mm')' = (I − 2mm')(I − 2mm')
= I − 4mm' + 4mm'mm'
= I − 4mm' + 4m(m'm)m'
= I − 4mm' + 4mm' = I,

since m'm = 1.
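The computation can be repeated exactly for a concrete unit vector m (a sketch; m = [3/5, 4/5]' is an arbitrary unit vector):

```python
from fractions import Fraction

m = [Fraction(3, 5), Fraction(4, 5)]           # m'm = 1
H = [[(1 if i == j else 0) - 2 * m[i] * m[j] for j in range(2)] for i in range(2)]
# H is symmetric, so H'H = H^2; check H^2 = I exactly
H2 = [[sum(H[i][k] * H[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(H, H2)
```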

Matrices of the form I − 2mm' were introduced by Householder and are of great use in numerical algebra. (See e.g. Chapter 8.)

Chapter 2

2.4. Solution

Assume p > 1, 1/p + 1/q = 1, α > 0, β > 0, and consider the curve y = x^{p−1}. Then

area OA'A = ∫₀^α x^{p−1} dx = α^p/p  and  area OBB' = ∫₀^β y^{1/(p−1)} dy = β^q/q.

Clearly the area of the rectangle OA'CB' is not greater than the sum of the areas of the curvilinear triangles OA'A and OBB', and equal to it only if A, B and C coalesce. Hence

αβ ≤ α^p/p + β^q/q

with strict inequality unless β^q = α^p. This inequality, when written in the form

A^{1/p} B^{1/q} ≤ (A/p) + (B/q),

can be recognized as a generalization of the Arithmetic-Geometric Mean inequality, from which it can be deduced, first when the weights p, q are rational and then by a limiting process for general p, q. If we write α = |x_i|/||x||_p, β = |y_i|/||y||_q in this inequality we find

(1)  (|x_i|/||x||_p)(|y_i|/||y||_q) ≤ |x_i|^p/(p||x||_p^p) + |y_i|^q/(q||y||_q^q).

Adding the last inequalities for i = 1, 2, ..., n we find

Σ|x_i||y_i|/(||x||_p||y||_q) ≤ 1/p + 1/q = 1,

so that

(H)  Σ|x_i y_i| ≤ ||x||_p ||y||_q.

This is the Hölder inequality. There is equality in the last inequality if and only if there is equality in all the inequalities (1), which means that the |x_i|^p are proportional to the |y_i|^q. Observe that when p = q = 2 the inequality (H) reduces to the Schwarz inequality

(S)  Σ|x_i y_i| ≤ ||x||₂ ||y||₂.

Observe also that the limiting case of (H), when p = 1, q = ∞, is also valid. In order to establish the Minkowski inequality

(M)  ||x+y||_p ≤ ||x||_p + ||y||_p

we write

(|x_i|+|y_i|)^p = |x_i|(|x_i|+|y_i|)^{p−1} + |y_i|(|x_i|+|y_i|)^{p−1}

and sum, applying (H) twice on the right to get

Σ(|x_i|+|y_i|)^p ≤ ||x||_p [Σ(|x_i|+|y_i|)^{(p−1)q}]^{1/q} + ||y||_p [Σ(|x_i|+|y_i|)^{(p−1)q}]^{1/q}.

Observe that (p−1)q = p, so that the terms in [ ] on the right are identical with that on the left. Hence, dividing through,

[Σ(|x_i|+|y_i|)^p]^{1−(1/q)} ≤ ||x||_p + ||y||_p,

i.e., since 1−(1/q) = 1/p, (M) follows. The equality cases can easily be distinguished.
We have therefore shown that the p-norm satisfies Axiom 3, the triangle inequality. The proofs that Axioms 1, 2 are satisfied are trivial. To complete the solution we observe that

max|x_i|^p ≤ Σ|x_i|^p ≤ n max|x_i|^p,

which we can write as ||x||_∞^p ≤ ||x||_p^p ≤ n||x||_∞^p. Taking p-th roots we get

||x||_∞ ≤ ||x||_p ≤ n^{1/p}||x||_∞

and, since n^{1/p} → 1 as p → ∞, we have ||x||_∞ ≤ lim_{p→∞} ||x||_p ≤ ||x||_∞.
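These inequalities, and the limit, are easy to observe numerically (a sketch; the vector is an arbitrary example):

```python
x = [3.0, -1.0, 2.0]
n = len(x)
cheb = max(abs(t) for t in x)                  # ||x||_inf

def p_norm(p):
    return sum(abs(t)**p for t in x)**(1.0 / p)

# ||x||_inf <= ||x||_p <= n^(1/p) ||x||_inf for each p
for p in (1, 2, 5, 20, 100):
    assert cheb <= p_norm(p) <= n**(1.0 / p) * cheb + 1e-12
print(p_norm(1), p_norm(2), p_norm(100), cheb)
```

Already at p = 100 the p-norm is very close to the Chebyshev norm.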

2.5. Solution

See sketch. For simplicity we have only drawn the part in the first quadrant. Each set is bounded, closed, convex and symmetrical about the origin ("equilibrated"), and has a non-empty interior.

2.6. Solution

See sketch. For simplicity we have only drawn the part in the first quadrant. This set ||x|| ≤ 1 is not convex, but it has the other properties of those in Problem 2.5. The triangle inequality is not satisfied: e.g., x = [0, 1]', y = [1, 0]', x + y = [1, 1]' and

||x+y|| = 2^{3/2} > 2 = ||x|| + ||y||.

2.7. Solution

See sketch. For simplicity we have only drawn the part in the first quadrant. The set ||x|| ≤ 1 has the properties listed in Problem 2.5 and the axioms are satisfied.


2.8. Solution

If p₁ and p₂ are equivalent and if p₂ and p₃ are equivalent then p₁ and p₃ are equivalent, for from

0 < c₁ ≤ p₂(x)/p₁(x) ≤ C₁  and  0 < c₂ ≤ p₃(x)/p₂(x) ≤ C₂

we get 0 < c₁c₂ ≤ p₃(x)/p₁(x) ≤ C₁C₂. It is therefore enough to compare a norm p with a fixed one; on the unit sphere S we have

m ≤ p(x) ≤ M, x ∈ S.

Now, by continuity there are vectors x_m, x_M in S such that p(x_m) = m, p(x_M) = M and, since ||x_m|| = 1, m > 0 and we have

0 < m ≤ p(x)/||x|| ≤ M.

By homogeneity, the extreme values of the ratio ||x||₂/||x||₁ can be read off the sketch: since the "diamond" ||x||₁ = 1 lies inside the circle ||x||₂ = 1 we have ||x||₂ ≤ ||x||₁, with equality at the vertices, e.g. x₁ = [1, 0]'; and since the circle ||x||₂ = 1 lies inside the diamond ||x||₁ = √2 we have ||x||₁ ≤ √2 ||x||₂, with equality e.g. at x₂ = [1/√2, 1/√2]'.


We deal with the general case analytically. We have

max|x_i| ≤ Σ|x_i| ≤ n max|x_i|

which gives ||x||_∞ ≤ ||x||₁ ≤ n||x||_∞. Again

||x||₂ = √(Σ|x_i|²) ≤ √((Σ|x_i|)²) = ||x||₁

and, by the Schwarz inequality,

||x||₁ = Σ 1·|x_i| ≤ √((Σ1²)(Σ|x_i|²)) = √n ||x||₂.

Finally

||x||₂ = √(Σ|x_i|²) ≥ √((max|x_i|)²) = ||x||_∞  and  ||x||₂ = √(Σ|x_i|²) ≤ √(n(max|x_i|)²) = √n ||x||_∞.

The equality cases in each inequality can be easily distinguished. All the results can be obtained from the fact that if ∞ ≥ q ≥ p ≥ 1 we have

||x||_q ≤ ||x||_p ≤ n^{(1/p)−(1/q)} ||x||_q,

with strict inequality on the left except when x is a multiple of a unit vector e_i, and on the right except when x is a multiple of e = [1, 1, ..., 1]'. This result can be obtained from the Hölder inequality. (See, e.g., G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, or N. Gastinel, Analyse numérique linéaire, p. 29.)

2.9. Solution

We shall only discuss the Frobenius norm. It is trivial to check that the first two axioms are satisfied and we deal only with the last two.
Subadditivity: in this paragraph all sums are double sums with respect to i, j.

||A+B||² = Σ(|a_ij + b_ij|)² ≤ Σ(|a_ij|² + |b_ij|² + 2|a_ij||b_ij|) = Σ|a_ij|² + Σ|b_ij|² + 2Σ|a_ij||b_ij|.

The Schwarz inequality gives

Σ|a_ij||b_ij| ≤ ||A|| ||B||,

which can be used in our last inequality to get ||A+B||² ≤ ||A||² + ||B||² + 2||A|| ||B|| = (||A|| + ||B||)², so that ||A+B|| ≤ ||A|| + ||B|| as required.
Submultiplicativity:

||AB||² = Σ_i Σ_j |Σ_k a_ik b_kj|²
        ≤ Σ_i Σ_j (Σ_k |a_ik||b_kj|)²
        ≤ Σ_i Σ_j (Σ_k |a_ik|²)(Σ_l |b_lj|²)  (by the Schwarz inequality)
        = ||A||² ||B||²

as required.

2.10. Solution

We note that tr A = Σ a_ii = Σ α_i and so tr A = tr S⁻¹AS for any non-singular S, since S⁻¹AS and A have the same characteristic values. It is clear that ||A||_F² = ΣΣ a_ij² = tr AA'. The required result is established as follows:

||O'AO||_F² = tr(O'AO(O'AO)') = tr(O'AOO'A'O) = tr(O'AA'O) = tr AA'  (since AA' and O'AA'O are similar)
= ||A||_F².

More indeed is true: if O₁, O₂ are orthogonal

||O₁AO₂||_F² = tr(O₁AO₂O₂'A'O₁') = tr(O₁AA'O₁') = ||A||_F².

2.12. Solution

This terminology appears to be due to F. L. Bauer (in his lectures at Stanford University in 1967). Manhattan is the central borough of the city of New York and its streets and avenues form a rectangular grid. So the "taxi cab" distance between two points is the sum of the distances along the streets and along the avenues.

Chapter 3

3.2. Solution

The characteristic polynomial of A is λ² − 5λ − 2 and the characteristic values are ½(5 ± √33). Hence ρ(A) = ½(5 + √33).

Chebyshev case. Clearly ||A||_∞ = max(3, 7) = 7. We assume ||x||_∞ = 1 so that x' = [x₁, x₂] where max(|x₁|, |x₂|) = 1. Then

||Ax||_∞ = max(|x₁ + 2x₂|, |3x₁ + 4x₂|) ≤ 7,

and we can have equality if and only if |x₁| = |x₂| = 1.

Manhattan case. Clearly ||A||₁ = max(4, 6) = 6. We assume |x₁| + |x₂| = 1. Then

||Ax||₁ = |x₁ + 2x₂| + |3x₁ + 4x₂| ≤ 4|x₁| + 6|x₂| ≤ 6,

and we can have equality if and only if x₁ = 0, |x₂| = 1.
Euclidean case. We have A'A = [10 14; 14 20], and this matrix has for its dominant characteristic value 15 + √221, so that ||A||₂ = √(15 + √221). We assume x₁² + x₂² = 1. Then

||Ax||₂² = x'A'Ax = 10x₁² + 28x₁x₂ + 20x₂².

In order that this should be ||A||₂², x must be the dominant characteristic vector of A'A, so that

10x₁ + 14x₂ = (15 + √221)x₁,
14x₁ + 20x₂ = (15 + √221)x₂.

These equations are necessarily consistent and we find x₂/x₁ = (5 + √221)/14.
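The three norms can be verified directly, assuming A = [1 2; 3 4] (which is consistent with every quantity computed above):

```python
import math

A = [[1.0, 2.0], [3.0, 4.0]]                                   # assumed matrix
norm_inf = max(sum(abs(t) for t in row) for row in A)          # max row sum
norm_one = max(abs(A[0][j]) + abs(A[1][j]) for j in range(2))  # max column sum
norm_two = math.sqrt(15 + math.sqrt(221))                      # from A'A above
# brute-force sup of ||Ax||_2 over the unit circle
best = 0.0
for k in range(20000):
    t = math.pi * k / 20000
    x1, x2 = math.cos(t), math.sin(t)
    best = max(best, math.hypot(A[0][0]*x1 + A[0][1]*x2, A[1][0]*x1 + A[1][1]*x2))
print(norm_inf, norm_one, norm_two, best)
```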

3.3. Solution

The fact that the matrix norm induced by a vector norm is submultiplicative depends essentially on the remark that

n(ABx)/n(x) = n(A(Bx))/n(Bx) · n(Bx)/n(x),

and this does not apply in the mixed case. To see that a mixed norm is not necessarily submultiplicative take, as we may, n₂(x) = k n₁(x) where k is arbitrary. Then

||A||₁₂ = sup n₁(Ax)/n₂(x) = sup n₁(Ax)/(k n₁(x)) = (1/k)||A||₁₁,

where ||A||₁₁ is the matrix norm induced by the vector norm n₁(x). Taking A = B = I we have ||AB||₁₂ = 1/k while ||A||₁₂ ||B||₁₂ = 1/k², and we can certainly choose k (any k > 1) so that submultiplicativity is false.
It is easy to show that if m(A) is not submultiplicative, then we can choose a multiplier μ so that μm(A) is a submultiplicative norm. In fact, the arguments of Problem 2.8 show also that any two matrix norms (finite-dimensional) are equivalent. Hence there are positive constants a, b such that

a||A|| ≤ m(A) ≤ b||A||.

Hence

m(AB) ≤ b||AB|| ≤ b||A|| ||B|| ≤ (b/a²) m(A)m(B).

Thus if we take Jl=b/a2 then

Jlm(AB)~Jlm(A) X/lm(B). Corresponding to any not submultiplicative matrix norm meA) there is a least multiplier /l~1 such that JlM(A) is submultiplicative. In the case when m(A)=n;ta,x lad it can be shown that the least mul- l,} tiplier is n. (See J. F. Maitre, Numer. Math., 10, 132-161 (1967).) The last part of the problem, when put in standard notation, requires us to find IIAxll~ m~,l = sup II II . ",,",0 x 1 Clearly

||Ax||_∞ = max_i |Σ_j a_ij x_j| ≤ (max_{i,j}|a_ij|) Σ_j |x_j| = (max_{i,j}|a_ij|) ||x||₁.

Hence m'_{∞,1} ≤ max_{i,j}|a_ij|. We prove m'_{∞,1} ≥ max_{i,j}|a_ij| and so m'_{∞,1} = max_{i,j}|a_ij|. In fact if |a_IJ| = max_{i,j}|a_ij| then taking x = e_J we have Ax = [a_{1J}, a_{2J}, ..., a_{nJ}]' and ||Ax||_∞ = |a_IJ|, ||x||₁ = 1.
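A numerical probe of this result (a sketch; the matrix and the random trial vectors are arbitrary):

```python
import random

A = [[3, -7, 2], [0, 5, -1], [4, 1, -6]]
n = 3
max_entry = max(abs(t) for row in A for t in row)   # here 7, at position (1, 2)

def ratio(x):
    num = max(abs(sum(A[i][j] * x[j] for j in range(n))) for i in range(n))
    den = sum(abs(t) for t in x)
    return num / den

random.seed(1)
best = max(ratio([random.uniform(-1, 1) for _ in range(n)]) for _ in range(2000))
attained = ratio([0.0, 1.0, 0.0])                   # x = e_2 picks out column 2
print(best, attained, max_entry)
```

Random trials never exceed max|a_ij|, and the unit vector e_J attains it.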

The values of m'_{∞,2}, m'_{1,2} do not seem to be known.

3.5. Solution

To prove ||Ax||₂ ≤ ||A||_F ||x||₂ observe that, applying the Schwarz inequality to each component,

||Ax||₂² = Σ_i (Σ_j a_ij x_j)² ≤ Σ_i (Σ_j a_ij²)(Σ_j x_j²) = ||A||_F² ||x||₂²

and take square roots. It is clear that ||A||₂ = sup_{x≠0} ||Ax||₂/||x||₂ ≤ ||A||_F and so, a fortiori, ρ(A) ≤ ||A||_F.

3.6. Solution

The fact that n(x) is a norm is easy. Clearly m ≥ 0 implies n ≥ 0, and xv' = O if and only if x = 0, so n(x) = m(xv') = 0 if and only if x = 0 (here v is the fixed non-zero vector of the problem). Homogeneity of m implies homogeneity of n. For the triangle inequality

n(x+y) = m((x+y)v') = m(xv' + yv') ≤ m(xv') + m(yv') = n(x) + n(y).

To prove compatibility we want to show

n(Ax) ≤ m(A) n(x), i.e., m(Axv') ≤ m(A) m(xv'),

but this is just the assertion of submultiplicativity of m.

3.7. Solution

Let N = {x | n(x) = 1}; this is regarded as a subset of the euclidean plane R², and "closed" and "bounded" are relative to the geometry of R². A set S ⊂ R² is bounded if there is a constant M such that if [x, y]' ∈ S, then x² + y² ≤ M. A set S ⊂ R² is closed if whenever a sequence of points s_n ∈ S has a limit s, i.e., if lim[(x_n−x)² + (y_n−y)²] = 0, then s ∈ S. The set of points U on the circumference of the unit circle x² + y² = 1 is manifestly bounded (take M = 1) and can be proved to be closed as follows. Suppose lim s_n = s - this means lim[(x_n−x)² + (y_n−y)²] = 0, which implies lim x_n = x, lim y_n = y. Since x_n² + y_n² = 1 we have lim x_n² + lim y_n² = 1, both limits existing; but this is just x² + y² = 1.

We now note that any norm is continuous. From the triangle inequality, if ζ = ξe₁ + ηe₂,

| ||z+ζ|| − ||z|| | ≤ ||ζ|| ≤ |ξ| ||e₁|| + |η| ||e₂||
                  ≤ c{|ξ| + |η|}  if c = max(||e₁||, ||e₂||)
                  ≤ c√2 · √(ξ²+η²),  by Schwarz.

Thus if ξ² + η² is small so is | ||z+ζ|| − ||z|| |.
(1) N is bounded. If not, there is a sequence of points (or vectors) z_n ∈ N such that x_n² + y_n² → ∞. Consider the sequence of points ζ⁽ⁿ⁾ = z_n/√(x_n² + y_n²). These lie on U and by the Bolzano-Weierstrass theorem have a limit point ζ ∈ U (since U is closed). Suppose ζ⁽ⁿ_r⁾ → ζ. Then ||ζ⁽ⁿ_r⁾|| = n(z_{n_r})/√(x_{n_r}² + y_{n_r}²) = 1/√(x_{n_r}² + y_{n_r}²) → 0, while by the continuity just established ||ζ⁽ⁿ_r⁾|| → ||ζ|| ≠ 0, a contradiction.

The matrix A is symmetric; hence A'A = A². If α, a is a characteristic pair for A, so that Aa = αa, we have A²a = A(Aa) = A(αa) = α(Aa) = α²a, i.e., the characteristic values of A² are the squares of those of A. Hence ||A||₂ = ρ(A). Now from Problem 5.13 (iv), or by direct verification, the characteristic values of A are α_r = −2 + 2 cos(rπ/5), r = 1, 2, 3, 4. The spectral radius of A corresponds to r = 4 and ρ(A) = −α₄ = 2 + 2 cos(π/5) = ½(5 + √5). The dominant characteristic vector of A is a₄ = [1, −2c, 2c, −1]' where c = cos(π/5), and this gives equality since Aa₄ = α₄a₄, so that ||Aa₄|| = |α₄| ||a₄|| = ρ(A) ||a₄||.
The absolute column sums of A₁₆ are 4, 9, 9, 4. Hence ||A₁₆||₁ = 9. The extremal vector is x₀ = [0, 1, 0, 0]', for which Ax₀ = [0, 2, 3, 4]' and ||x₀||₁ = 1, ||Ax₀||₁ = 2 + 3 + 4 = 9.

Chapter 4

4.1. Solution

We approach the problem in the following way. To invert A we have to solve the systems of equations Ac_i = e_i, where c_i is the i-th column of A⁻¹ and e_i the i-th unit vector. Assume, for simplicity, that the triangularization can be carried out without rearrangements. This requires, in the first stage, the determination of the factors −a₂₁/a₁₁, −a₃₁/a₁₁, ..., −a_n1/a₁₁ and then the multiplication of the first row by −a_r1/a₁₁ and its addition to the r-th row, so as to kill the (r, 1) term. We need about n multiplications and then a further (n−1)×(n−1), in all about n². For the whole process we need about Σ r² ≈ n³/3.
Let us now consider these operations carried out on the right hand sides. Take the case of e_i. No action is needed until the i-th stage, when we add multiples of 1 to the zeros in the i+1, ..., n-th positions. At the next stage we have to add multiples of the (i+1)-st component to those in the i+2, ..., n-th positions. In all we need about

Σ_{r=i}^{n} (n−r) ≈ (n−i)²/2

multiplications. To deal with all the right hand sides therefore requires about

Σ_{i=1}^{n} (n−i)²/2 ≈ Σ_{j=1}^{n} j²/2 ≈ n³/6

multiplications. Finally we have to solve n triangular systems: each involves about n²/2 multiplications and so, in all, about n³/2 multiplications.
The grand total is n³/3 + n³/6 + n³/2 = n³.
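The n³ total can be confirmed by instrumenting the computation (a sketch: LU without pivoting, then the n forward substitutions, exploiting the zeros of the e_i, and the n back substitutions, counting multiplications and divisions; the test matrix is arbitrary but safely dominant):

```python
def inversion_ops(n):
    A = [[1.0 / (i + j + 1) + (n if i == j else 0.0) for j in range(n)] for i in range(n)]
    ops = 0
    # triangularization, multipliers stored in place: about n^3/3 operations
    for k in range(n - 1):
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]; ops += 1
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]; ops += 1
    for i in range(n):
        # forward substitution on e_i, using its leading zeros: about n^3/6 in all
        y = [0.0] * n
        y[i] = 1.0
        for r in range(i + 1, n):
            y[r] = -sum(A[r][k] * y[k] for k in range(i, r)); ops += r - i
        # back substitution, giving column i of the inverse: about n^3/2 in all
        x = [0.0] * n
        for r in range(n - 1, -1, -1):
            x[r] = (y[r] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
            ops += n - r
    return ops

n = 30
ops = inversion_ops(n)
print(ops, n**3, ops / n**3)
```

For n = 30 the count is already within a couple of percent of n³.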

4.2. Solution

The factorization of an (m+1)×(m+1) matrix, which we can regard as a band matrix of width 2m+1, gives (in general) full triangular factors, which we can regard as band matrices. Suppose we have established, for n ≥ m+1, that an n×n band matrix of width 2m+1 can be factorized into matrices of the same type. Consider now the problem for an (n+1)×(n+1) matrix, using the notation of p. 33 and the assumption that all the leading submatrices of A are non-singular.

(Diagram: the bordered (n+1)×(n+1) factorization, with final column u of the upper factor and final row x' of the lower factor.)

Our induction hypothesis validates the diagram above. We have only to show that the vectors x', y will each have n−m initial zeros. These are obtained as solutions of triangular systems, x' = v'U⁻¹ and y = L⁻¹u, and since v, u have n−m initial zeros so have x, y.

4.3. Solution

If a₁₁ ≠ 0 we obtain the first Gauss transform of A by killing the remaining elements in the first column by premultiplication by

L₁ = [1; −a₂₁/a₁₁ 1; ... ; −a_n1/a₁₁ . 1]  (unit lower triangular; blanks are zeros).

We write L₁A = [a_ij⁽²⁾] and for convenience A = [a_ij⁽¹⁾]. We denote by [A]_r the leading r×r submatrix of A:

[A]_r = [a₁₁ ... a₁_r; ... ; a_r1 ... a_rr]

so that det[A]₁ = a₁₁, det[A]₂ = a₁₁a₂₂ − a₁₂a₂₁, .... It is clear that

a₂₂⁽²⁾ = a₂₂ − a₁₂a₂₁/a₁₁ = det[A]₂/det[A]₁.

If a₂₂⁽²⁾ is not zero we can get the second Gauss transform of A by premultiplying L₁A by

L₂ = [1; . 1; . −a₃₂⁽²⁾/a₂₂⁽²⁾ 1; ... ; . −a_n2⁽²⁾/a₂₂⁽²⁾ . 1].

We write L₂L₁A = [a_ij⁽³⁾].

We carry on in this way and at the end of the s-th stage in the Gaussian triangularization we have obtained

L_s ... L₂L₁A = [T_s *; 0 *],

where T_s is upper triangular with diagonal a₁₁⁽¹⁾, a₂₂⁽²⁾, ..., a_ss⁽ˢ⁾ and the elements in the right hand blocks are irrelevant. Here the L's are certain lower triangular matrices with unit diagonals. Restricting our attention to the leading s×s block, and taking determinants on both sides, we find

det[A]_s = a₁₁⁽¹⁾ a₂₂⁽²⁾ ... a_ss⁽ˢ⁾,

since the determinants of the blocks in the L-matrices are all 1. This result being true for each s we have

a_ss⁽ˢ⁾ = det[A]_s / det[A]_{s−1}.

This result checks with

det A = Π_{s=1}^{n} a_ss⁽ˢ⁾

and makes very clear the role of the condition that all the leading submatrices of A should be non-singular.

4.4. Solution

All that it is necessary to establish is that all the leading submatrices of A are non-singular.
(a) If A has a strictly dominant diagonal, i.e.,

(1)  |a_ii| > Σ_{j=1, j≠i}^{n} |a_ij|  for i = 1, 2, ..., n,

so has every leading submatrix [A]_r. For (1) surely implies that

|a_ii| > Σ_{j=1, j≠i}^{r} |a_ij|  for i = 1, 2, ..., r.

We now appeal to the Dominant Diagonal Theorem (see p. 54) to conclude that every leading submatrix is non-singular.
(b) One of the properties of a positive definite matrix is that all its leading principal minors, i.e., det[A]_r, are positive, so that [A]_r is certainly non-singular (see p. 58).

4.5. Solution

We discuss the "square case", leaving the rectangular one to the reader. We consider the representation (1) and denote by x₁, ..., x_n the columns of a matrix X. Writing out (1) in terms of the columns we get

f₁ = r₁₁φ₁,
f₂ = r₁₂φ₁ + r₂₂φ₂,
...
f_n = r₁_nφ₁ + r₂_nφ₂ + ... + r_nnφ_n.

We also have φ_i'φ_j = δ_ij, i, j = 1, 2, ..., n. The unknown quantities can be obtained by a recurrence process. We first find r₁₁ = ||f₁||₂ and take φ₁ = f₁/r₁₁. Suppose we have the first s vectors φ_i and the corresponding r_ij (i ≤ j ≤ s) determined. Then the condition that φ_{s+1} is orthogonal to φ₁, ..., φ_s gives the relations

r_{j,s+1} = φ_j'f_{s+1},  j = 1, 2, ..., s,

while the normality of φ_{s+1} determines r_{s+1,s+1} by

r_{s+1,s+1} = ±||f_{s+1} − r_{1,s+1}φ₁ − ... − r_{s,s+1}φ_s||₂.

Each of these calculations involves about ns multiplications and the latter also requires a square root. The total cost is therefore about Σ_s 2ns ≈ n³ operations.
A second solution to this problem can be obtained by using the LL' scheme and using results given elsewhere in the text or problems and solutions. The main steps, in the "square case", are: compute F'F, factorize F'F = LL', and then R = L'.
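In code, the Gram-Schmidt recurrence of the first solution might be sketched as follows (applied here to the two vectors of the numerical example that follows):

```python
import math

def gram_schmidt(cols):
    # classical Gram-Schmidt: returns orthonormal phi_j and the coefficients r
    phi, R = [], {}
    for s, f in enumerate(cols):
        v = list(f)
        for j in range(s):
            R[j, s] = sum(p * t for p, t in zip(phi[j], f))   # r_{j,s+1} = phi_j' f_{s+1}
            v = [vi - R[j, s] * p for vi, p in zip(v, phi[j])]
        R[s, s] = math.sqrt(sum(t * t for t in v))            # the + sign chosen
        phi.append([t / R[s, s] for t in v])
    return phi, R

phi, R = gram_schmidt([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(R[0, 0], R[0, 1], R[1, 1])
print(phi)
```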

The first stage requires about n³/2 multiplications and the second about n³/6 (in each case we take account of symmetry). To compute Φ = FR⁻¹ requires additionally the inversion of R, costing about n³/6 multiplications, and then the multiplication FR⁻¹, which costs about n³/2 multiplications. We illustrate this process in the rectangular case by a numerical example.

Suppose we want to orthonormalize the two vectors

f₁ = [1, 2, 3]',  f₂ = [4, 5, 6]'.

We write

F = [1 4; 2 5; 3 6]

and we seek an L such that FL' = Φ is orthogonal, i.e., so that Φ'Φ = I. This means that (LF')(FL') = Φ'Φ = I and so F'F = L⁻¹L⁻¹'. We compute

F'F = [14 32; 32 77].

We then apply the LDU theorem to find a, b, c so that

[a 0; b c][a b; 0 c] = [14 32; 32 77].

We find a = √14, b = 32/√14 and c = √(27/7). Thus we may take

L⁻¹ = [√14 0; 32/√14 √(27/7)].

This gives

L = (1/√54)[√(27/7) 0; −32/√14 √14] = [1/√14 0; −16/√189 √(7/27)]

and finally

Φ = FL' = [1 4; 2 5; 3 6][1/√14 −16/√189; 0 √(7/27)] = [1/√14 4/√21; 2/√14 1/√21; 3/√14 −2/√21],

and orthonormality is easily checked.

4.6. Solution

We compute

A'A = [8 −8 −8; −8 16 8; −8 8 20]

and then factorize A'A = LL' where

L = [2√2 0 0; −2√2 2√2 0; −2√2 0 2√3].

If A = ΦR then A' = R'Φ' and

A'A = R'Φ'ΦR = R'R,

so that R = L' and Φ = A(L')⁻¹. We find

(L')⁻¹ = [1/(2√2) 1/(2√2) 1/(2√3); 0 1/(2√2) 0; 0 0 1/(2√3)]

and then Φ = A(L')⁻¹.
Orthogonality of Φ is easily checked.
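The arithmetic of the route via F'F can be replayed in floating point (a sketch, using the numbers of Problem 4.5's example):

```python
import math

F = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
G = [[sum(F[k][i] * F[k][j] for k in range(3)) for j in range(2)] for i in range(2)]  # F'F
a = math.sqrt(G[0][0])              # sqrt(14)
b = G[1][0] / a                     # 32/sqrt(14)
c = math.sqrt(G[1][1] - b * b)      # sqrt(27/7)
# Phi = F (L')^{-1}: forward-solve each row of F against L' = [a b; 0 c]
Phi = [[r[0] / a, (r[1] - b * (r[0] / a)) / c] for r in F]
dots = [sum(Phi[k][i] * Phi[k][j] for k in range(3)) for i in range(2) for j in range(2)]
print(G, Phi, dots)
```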

(2) Similarly we find

B = (1/9)[−7 −4 −4; 4 −8 1; 4 1 −8] [1 2 3; 0 4 5; 0 0 6].

4.7. Solution

This is a matter of elementary algebra. The result shows that two 2×2 matrices (even block matrices) can be multiplied at the expense of 7 multiplications and 18 additions as compared to the 8 multiplications and 4 additions required in the obvious way. Building on this result V. Strassen (Numer. Math., 13, 354-356 (1969)) has shown that two n×n matrices can be multiplied in O(n^{log₂ 7}) operations; the same is true for our basic problems: A⁻¹, det A, Ax = b. These schemes are practical and have been implemented by R. P. Brent (Numer. Math., 16, 145-156 (1970)). Another fast technique has been developed by S. Winograd (Comm. Pure Appl. Math., 23, 165-179 (1970)).

4.8. Solution

This is a matter of elementary algebra. The results show, for instance, that the inversion of a 2n×2n matrix can be accomplished by operations on n×n matrices, a fact which can extend the range of n for which all the work can be done internally. It is easy to verify that A₁₁⁻¹ = X − YW⁻¹Z.

4.9. Solution

This is a matter of elementary algebra. The result shows that the inversion of complex matrices can be accomplished by real operations. This form of the solution is due to K. Holladay.

4.10. Solution

This can be regarded as the result of an LU (or LDU) decomposition of a block matrix:

Φ = [A B; C D] = [A 0; C D−CA⁻¹B][I A⁻¹B; 0 I],

or

Φ = [I 0; CA⁻¹ I][A 0; 0 D−CA⁻¹B][I A⁻¹B; 0 I],

which can be verified by multiplication. Taking determinants

det Φ = det[A B; C D] = det A det(D − CA⁻¹B) = det(AD − ACA⁻¹B).

The matrix D − CA⁻¹B is called the Schur complement of A in Φ. If D is non-singular we obtain in the same way

Φ = [A B; C D] = [A−BD⁻¹C BD⁻¹; 0 I][I 0; C D]

and hence det Φ = det D det(A − BD⁻¹C).

4.11. Solution

We factorize A = LL' and find that all elements are real. Hence A is positive definite. Actually

L = [9 0 0 0; −4 10 0 0; 3 −5 8 0; −2 6 −1 7].

To solve the system Ax = b, i.e., LL'x = b, we find first

L⁻¹ = [1/9 0 0 0; 2/45 1/10 0 0; −1/72 1/16 1/8 0; −1/120 −43/560 1/56 1/7].

We have x = (LL')⁻¹b = (L')⁻¹L⁻¹b = (L⁻¹)'·L⁻¹b. We find (L⁻¹b)' = [28, 26, 15, 7] and then

x = (L⁻¹)'[28, 26, 15, 7]' = [4, 3, 2, 1]'.

4.12. Solution

No. If we proceed to find the LL' decomposition we get: l₁₁² = 1 so that l₁₁ = 1, say; l₁₁l₂₁ = 2 so that l₂₁ = 2; l₂₁² + l₂₂² = 3 so that l₂₂² = −1, which demonstrates that the matrix is not positive definite. Carrying on we find

L = [1 . . .; 2 i . .; 3 2i −i .; 4 3i −i 1],  L⁻¹ = [1 . . .; 2i −i . .; i −2i i .; 1 −1 −1 1],

A⁻¹ = [−3 3 −2 1; 3 −4 3 −1; −2 3 0 −1; 1 −1 −1 1].

If Ax = [30, 40, 43, 57]' we find x = [1, 2, 3, 4]'.

4.13. Solution

Premultiplication by a diagonal matrix means multiplication of the rows by the corresponding diagonal element:

[D⁻¹A]_ij = d_i⁻¹ a_ij,

and postmultiplication means multiplication of the columns by the corresponding diagonal element:

[AD]_ij = a_ij d_j.

In order that D⁻¹AD should be symmetric, we must have, for all i, j with i ≠ j,

d_i⁻¹ a_ij d_j = d_j⁻¹ a_ji d_i.

We can take d₁ = 1 and then

d_{j+1} = d_j √(a_{j+1,j}/a_{j,j+1}),  j = 1, 2, ..., n−1.

We assume that a_ij ≠ 0 for |i−j| = 1. The value of the elements in the (i, i+1) and (i+1, i) places is clearly √(a_{i,i+1} a_{i+1,i}).

4.14. Solution

For the results see Problem 5.13 (iv), (v).

4.15. Solution

In order to check your results use Problem 5.13 (iv), (v).

4.16. Solution

Consider the determination of the r-th row of L⁻¹ = X from XL = I. The system of equations to be satisfied is

Σ_{k=j}^{r} x_rk l_kj = δ_rj,  j = 1, 2, ..., r.

This is a triangular system. We have seen that it can be solved by back-substitution at the cost of about r²/2 multiplications. In all we need

Σ_{r=1}^{n} r²/2 ≈ n³/6.

See diagram.
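The row-by-row back-substitution can be carried out exactly (a sketch; L is an arbitrary lower triangular example):

```python
from fractions import Fraction

L = [[2, 0, 0, 0], [1, 3, 0, 0], [4, -1, 5, 0], [2, 2, 1, 6]]
n = len(L)
X = [[Fraction(0)] * n for _ in range(n)]
for r in range(n):                      # r-th row of X = L^{-1} from X L = I
    for j in range(r, -1, -1):          # back-substitution, about r^2/2 products
        rhs = Fraction(1 if r == j else 0)
        rhs -= sum(X[r][k] * L[k][j] for k in range(j + 1, r + 1))
        X[r][j] = rhs / L[j][j]
check = [[sum(X[i][k] * L[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
print(check)
```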

4.17. Solution

This is left to the reader. Apparently this problem does not occur frequently in computing practice, although the solution of Xx = b and the evaluation of det X in this case are of great importance.

4.18. Solution

The Dekker Matrix. This is a matrix M derived from H_n which has the advantage, as a test matrix, that both it and its inverse have integral elements. In fact M is a positive matrix and M⁻¹ is got by changing the sign of alternate elements in M; specifically, [M⁻¹]_ij = (−1)^{i+j}[M]_ij. We have recorded the inverse of the Hilbert matrix H_n (Problem 5.13 (viii)):

[H_n⁻¹]_ij = (−1)^{i+j} f_i f_j/(i+j−1),  where f_i = (n+i−1)!/(((i−1)!)²(n−i)!).

It is easy to check that f_i is an integer: the quotient (n+i−1)!/((n−i)!) is a product of 2i−1 consecutive integers, and we use twice the well-known fact that the product of any r consecutive integers is divisible by r! (the quotient is, indeed, a binomial coefficient). If we write

F = diag[f₁, f₂, ..., f_n],  E = diag[−1, 1, −1, ..., (−1)ⁿ],

we have H⁻¹ = FEHEF.
We next define M = FG⁻¹HG, where G = diag[g₁, g₂, ..., g_n] is any non-singular diagonal matrix, and observe that

M⁻¹ = G⁻¹H⁻¹GF⁻¹
    = G⁻¹(FEHEF)GF⁻¹
    = G⁻¹FEHE(FGF⁻¹)
    = (G⁻¹FE)H(EG)
    = E(FG⁻¹HG)E
    = EME,

where we use repeatedly the fact that diagonal matrices commute. This shows that the inverse of M is got by multiplying its i, j element by (−1)^{i+j} for each i, j.
We shall now show that if we choose G properly we can ensure that the elements of M are integers. If an integer N is expressed in prime factors N = p₁^{n₁}p₂^{n₂}...p_r^{n_r} we define Ñ = p₁^{m₁}p₂^{m₂}...p_r^{m_r}, where m_s = [n_s/2], the integral part of n_s/2, so that n_s is either 2m_s or 2m_s + 1. We now define g_s = f̃_s.

We begin by showing that (i+j−1)² is a factor of f_i f_j. It is clear that f_i f_j is the product of two sets of i+j−1 consecutive numbers, n−j+1, ..., n+i−1 and n−i+1, ..., n+j−1, divided by [(i−1)!(j−1)!]². Such a product of i+j−1 consecutive numbers is divisible by (i+j−1)!, and therefore by (i+j−1), and also by (i−1)! and by the product of the j−1 remaining consecutive numbers, so, a fortiori, by (j−1)!. This establishes our assertion, and, incidentally, the fact that H_n⁻¹ has integral elements.
The i, j element of M is f_i g_i⁻¹ g_j/(i+j−1). Let p be any prime and let q_i, q_j, q be the powers of p which occur in f_i, f_j, i+j−1. Since (i+j−1)² divides f_i f_j it follows that q_i + q_j ≥ 2q. The power of p which occurs in f_i g_j is q_i + [q_j/2], while that which occurs in g_i(i+j−1) is [q_i/2] + q, which is not greater than q_i + [q_j/2]. To see this write E = q_i + [q_j/2] − [q_i/2] − q. Then if q_i and q_j are both even or both odd

E = ½q_i + ½q_j − q = ½(q_i + q_j − 2q) ≥ 0.

If q_i is odd and q_j is even

E = ½(q_i + q_j − 2q) + ½ > 0,

while if q_i is even and q_j is odd, then q_i + q_j is odd, so q_i + q_j ≥ 2q + 1 and

E = ½(q_i + q_j − 2q) − ½ ≥ 0.

It follows that [M]_ij is integral as asserted.

4.19. Solution

The method for the general case is clear from the following discussion of the example:

Subtract 2 × col 2 from col 1 to get a zero in the (1, 1) place. Add 7 × col 3 to col 1 to get a zero in the (2, 1) place. Subtract 12 × col 4 from col 1 to get a zero in the (3, 1) place. Add 16 × col 5 to col 1 to get a zero in the (4, 1) place. We have therefore

det A = det [0 1 0 0 0; 0 2 1 0 0; 0 3 2 1 0; 0 4 3 2 1; 20 5 4 3 2]

and so det A = 20 × 1 × 1 × 1 × 1 = 20.

4.20. Solution

This is a rather difficult problem. For accounts of it see, e.g., W. Trench, J. SIAM, 12, 515-522 (1964) and 13, 1102-1107 (1965); E. H. Bareiss, Numer. Math., 13, 404-426 (1969); J. L. Phillips, Math. Comp., 25, 599-602 (1971); J. Rissanen, Math. Comp., 25, 147-154 (1971), and Numer. Math., 22, 361-366 (1974); S. Zohar, J. ACM, 21, 272-276 (1974).

4.21. Solution

Suppose

A = U'U,  U = [a b c d; . e f g; . . h i; . . . j].

Then we have a² = 1, a = 1 say; ab = 1/2, b = 1/2; c = 1/3; d = 1/4; b² + e² = 1, e² = 3/4, e = √3/2 say; bc + ef = 2/3, f = 1/√3; g = √3/4; c² + f² + h² = 1, h² = 1 − (1/3) − (1/9) = 5/9, h = √5/3 say; i = √5/4; d² + g² + i² + j² = 1, j² = 1 − (5/16) − (3/16) − (1/16) = 7/16, j = √7/4 say. We find easily that

U⁻¹ = [1 −1/√3 0 0; 0 2/√3 −2/√5 0; 0 0 3/√5 −3/√7; 0 0 0 4/√7]

and

A⁻¹ = U⁻¹(U⁻¹)' = [4/3 −2/3 0 0; −2/3 32/15 −6/5 0; 0 −6/5 108/35 −12/7; 0 0 −12/7 16/7].

See Problem 5.13 (iii) for the general case.
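These entries can be reproduced numerically, assuming (consistently with the values a = 1, b = 1/2, c = 1/3, ... found above) that A is the 4×4 matrix with a_ij = min(i, j)/max(i, j):

```python
import math

n = 4
A = [[min(i, j) / max(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
U = [[0.0] * n for _ in range(n)]
for i in range(n):                      # A = U'U, built row by row
    for j in range(i, n):
        s = A[i][j] - sum(U[k][i] * U[k][j] for k in range(i))
        U[i][j] = math.sqrt(s) if i == j else s / U[i][i]
A_inv = [[4/3, -2/3, 0, 0], [-2/3, 32/15, -6/5, 0],
         [0, -6/5, 108/35, -12/7], [0, 0, -12/7, 16/7]]   # as stated above
resid = max(abs(sum(A[i][k] * A_inv[k][j] for k in range(n)) - (1 if i == j else 0))
            for i in range(n) for j in range(n))
print([U[i][i] for i in range(n)], resid)
```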

4.22. Solution

Consider the equations

(1)  a₁ᵢ = l₁₁u₁ᵢ,  i = 1, 2, ..., n.

Choosing u₁₁ in the first determines l₁₁ and then u₁₂, ..., u₁ₙ are determined. So we get the first row of U.

Consider the equations

(2)  aᵢ₁ = lᵢ₁u₁₁,  i = 2, 3, ..., n,

and we see that the first column of L is determined by the value of u₁₁. Now suppose we have determined the first r−1 rows of U and the first r−1 columns of L. We write down the equations corresponding to (1), (2) above:

(3ᵢ)  a_ri = Σ_{k=1}^{r} l_rk u_ki = Σ_{k=1}^{r−1} l_rk u_ki + l_rr u_ri,  i = r, r+1, ..., n,

(4ᵢ)  a_ir = Σ_{k=1}^{r−1} l_ik u_kr + l_ir u_rr,  i = r+1, ..., n.

From (3_r), by choice of u_rr, we determine l_rr and then from (3_{r+1}), (3_{r+2}), ..., (3_n) we determine u_{r,r+1}, ..., u_{r,n}, using the available l's and u's. In the same way, from (4_{r+1}), (4_{r+2}), ..., (4_n) we determine l_{r+1,r}, ..., l_{n,r}. Thus the u's and the l's can be computed by rows and columns. Let us count the multiplications and divisions involved in the r-th stage. We require to handle (n−r+1) equations each involving (r−1) products and one division, i.e. (n−r+1)r operations, and (n−r) equations each involving (r−1) products and one division, i.e. (n−r)r operations. That is, altogether, about

Σ_r 2(n−r)r ≈ n³/3.

Consider next the formation of the product of an upper triangular matrix U and a lower triangular matrix L (this is what is needed to obtain A⁻¹ = U⁻¹L⁻¹). Let us look at the computation of the r-th row of the product. Because of the zeros at the beginning, the computation of each of the first r elements of the row will involve n−r multiplications, in all r(n−r). The computation of the remaining elements requires

(n−r−1) + (n−r−2) + ... + 1 = (n−r)(n−r−1)/2

multiplications. Together we require

Σ_r {r(n−r) + (n−r)(n−r−1)/2} ≈ n³/3.

From Problem 4.16, the determination of L⁻¹, U⁻¹ each involves about n³/6 multiplications.
The grand total is therefore n³/3 + n³/3 + 2(n³/6) = n³. See diagrams.


u L A 4.23. Solution Use Problem 4.10 above. See SIAM Review 14, 499-500 (1972). 4.24. Solution Since I=x*x=xi+x*x we have x*x=I-xi. It is clear that U is hermitian and we compute U*U=U2=

[x₁² + x̃*x̃   x₁x̃* − x̃* + (1+x₁)⁻¹(x̃*x̃)x̃*;  x₁x̃ − x̃ + (1+x₁)⁻¹x̃(x̃*x̃)   x̃x̃* + I − 2(1+x₁)⁻¹x̃x̃* + (1+x₁)⁻²x̃(x̃*x̃)x̃*].

The (1,1) element is clearly 1. In the case of the (1,2) element we can replace the third term by

(1+x₁)⁻¹(x̃*x̃)x̃* = (1+x₁)⁻¹(1−x₁²)x̃* = (1−x₁)x̃*

and this cancels out the first two terms; similarly for the (2,1) element. The (2,2) element can be written as

I + x̃x̃*(1 − 2(1+x₁)⁻¹) + (1+x₁)⁻²x̃(x̃*x̃)x̃*
= I + x̃x̃*(x₁−1)(1+x₁)⁻¹ + (1+x₁)⁻²(1−x₁²)x̃x̃*
= I + x̃x̃*[(x₁−1)(1+x₁)⁻¹ + (1+x₁)⁻¹(1−x₁)]
= I.

Hence U*U = I.
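A numerical spot-check of U*U = I (a sketch; x̃ is an arbitrary complex vector, with x₁ chosen real and positive so that x*x = 1):

```python
import math

xt = [0.3 + 0.4j, 0.2 - 0.1j]                      # x-tilde
x1 = math.sqrt(1 - sum(abs(t)**2 for t in xt))     # real; x1^2 + xt*xt = 1
m = len(xt) + 1
c = 1 / (1 + x1)
U = [[0j] * m for _ in range(m)]
U[0][0] = x1 + 0j
for i, t in enumerate(xt):
    U[0][i + 1] = t.conjugate()                    # first row: [x1, xt*]
    U[i + 1][0] = t                                # first column: [x1, xt]
    for j, s in enumerate(xt):                     # block: -I + (1+x1)^{-1} xt xt*
        U[i + 1][j + 1] = -(1 if i == j else 0) + c * t * s.conjugate()
U2 = [[sum(U[i][k] * U[k][j] for k in range(m)) for j in range(m)] for i in range(m)]
err = max(abs(U2[i][j] - (1 if i == j else 0)) for i in range(m) for j in range(m))
print(err)
```

Since U is hermitian, U*U = U², and the residual is at rounding level.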

4.25. Solution

The standard reduction processes, which are variants of the Gaussian elimination process, are effective in the context of theoretical arithmetic. However, on a decimal machine we clearly cannot handle exactly even very simple cases: as soon as a multiplier such as 1/3 arises it has no finite decimal representation. It is, however, possible to handle this problem exactly when the coefficients are rational, provided we have a "rational arithmetic package" incorporated in our machine. This must include subroutines for addition and multiplication of rationals (regarded as ordered pairs of integers):

(n₁, d₁) + (n₂, d₂) = (n₁d₂ + n₂d₁, d₁d₂),  (n₁, d₁) × (n₂, d₂) = (n₁n₂, d₁d₂).

In addition, there should be a subroutine for "reduction to lowest terms", so as to keep the integer components as small as possible: i.e., we want a Euclidean algorithm to obtain d = gcd(n₁, d₁) and then replace (n₁, d₁) by (n₁/d, d₁/d).

4.26. Solution

Similar remarks apply here as in Problem 4.25. Indeed all questions about rank and linear dependence and independence, while appropriate in theoretical arithmetic (and, in practice, when we have rational arithmetic available), are largely meaningless in the context of practical computation. For instance, arbitrarily small perturbations of the zero matrix can manifestly produce matrices of any rank.

4.27. Solution

[I A 0; 0 I B; 0 0 I]⁻¹ = [I −A AB; 0 I −B; 0 0 I].

This shows that the product AB can be obtained by the inversion of the 3n×3n matrix Φ.

4.28. Solution

It is easy to verify that

det (A-AI)= (1 +A)(l + 9A-A2) Chapter 5 147 so that -1 is a characteristic value and so A cannot be positive definite. Alternatively, the quadratic form x2+ 3y2+4z2+4xy+ 6xz+ 8yz has the value -5 for x=2, y=l, z=-2. Although det[l] = 1, det A = 1, we have det [~ ~] = - 1.
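Both indefiniteness checks can be confirmed directly; the sketch below (our own, with hypothetical helper names) evaluates the quadratic form at (2, 1, -2) and verifies that det(A + I) = 0, i.e. that -1 is a characteristic value:

```python
# A is the matrix of the quadratic form x^2+3y^2+4z^2+4xy+6xz+8yz
A = [[1, 2, 3], [2, 3, 4], [3, 4, 4]]

def quad_form(M, v):
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

val = quad_form(A, [2, 1, -2])   # negative, so A is not positive definite

def det3(M):
    # determinant of a 3x3 matrix by cofactor expansion along the first row
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

B = [[A[i][j] + (1 if i == j else 0) for j in range(3)] for i in range(3)]
# det(A + I) = 0 confirms that -1 is a characteristic value of A
```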

Chapter 5

5.1. Solution

det A = 1, det B = -118.94, det C_4 = 5, det(C_4 + f) = 4.96.

A^{-1} = [[-265, 108, 366], [-2920, 1190, 4033], [8684, -3539, -11994]],
C_4^{-1} = -(1/5)[[4, 3, 2, 1], [3, 6, 4, 2], [2, 4, 6, 3], [1, 2, 3, 4]].

The first result can be simply obtained from the fact that, since det A = 1, [A^{-1}]_{ij} = cofactor of the ji element in A. The second result can be derived by obtaining (as in Problem 5.4 below) the factorization -C_4 = (LD^{1/2})(LD^{1/2})' where

LD^{1/2} = [[√2, 0, 0, 0], [-√(1/2), √(3/2), 0, 0], [0, -√(2/3), √(4/3), 0], [0, 0, -√(3/4), √(5/4)]].

Then we have C_4^{-1} = -(L^{-1})' D^{-1} L^{-1} where

L^{-1} = [[1, 0, 0, 0], [1/2, 1, 0, 0], [1/3, 2/3, 1, 0], [1/4, 2/4, 3/4, 1]].

||A||_∞ = 183, ||A^{-1}||_∞ = 24217. ||A||_1 = 245, ||A^{-1}||_1 = 16393.

The characteristic values of

AA' = [[11989, -968, 8966], [-968, 13445, -4668], [8966, -4668, 7869]]

are approximately

2.10×10^4, 1.23×10^4, 3.87×10^{-9},

so that, approximately, ||A||_2 = 145, ||A^{-1}||_2 = 16069.

||C_4||_∞ = ||C_4||_1 = 4, ||C_4^{-1}||_∞ = ||C_4^{-1}||_1 = 3.

Since C_4 is symmetric we need only compute the characteristic values of C_4. These are -2 + 2cos(kπ/5) = -4 sin^2(kπ/10), k = 1, 2, 3, 4. Hence

||C_4||_2 = 4 sin^2(4π/10) = 4 cos^2(π/10), ||C_4^{-1}||_2 = (4 sin^2(π/10))^{-1}.
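The characteristic pairs just quoted can be verified numerically; the sketch below (our own check) forms C_4 and confirms that the claimed value/vector pairs satisfy C c_k = α_k c_k:

```python
import math

n = 4
# the second-difference matrix: -2 on the diagonal, 1 on the off-diagonals
C = [[-2 if i == j else (1 if abs(i - j) == 1 else 0) for j in range(n)]
     for i in range(n)]
phi = math.pi / (n + 1)

def residual(k):
    # residual of C c_k = alpha_k c_k for the claimed pair
    lam = -4 * math.sin(k * math.pi / (2 * (n + 1))) ** 2
    v = [math.sin((i + 1) * k * phi) for i in range(n)]
    Cv = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
    return max(abs(Cv[i] - lam * v[i]) for i in range(n))

worst = max(residual(k) for k in range(1, n + 1))
norm_inf = max(sum(abs(e) for e in row) for row in C)   # the row-sum norm, 4
```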

The characteristic polynomial of A is - A3 + 3 A2 + 11 069A + 1 and this has zeros 106.7, -0.00009034, -103.7 approximately. 5.2. Solution (W. Kahan, Canadian Math. Bull., 9, 757-801 (1966).)

(a) x = [2, -2]',

r = [-10^{-8}, 10^{-8}]',

A^{-1} = -10^8 [[.8648, -.1441], [-1.2969, .2161]],

||A||_∞ = 2.1617, ||A^{-1}||_∞ = 1.513×10^8,

||A||_1 = 1.513, ||A^{-1}||_1 = 2.1617×10^8.

5.3. Solution
It is obvious that the solution of Wx = b in the first case is x = e. We deal with a general version of the second case. If Wx = b + β then x = W^{-1}(b + β) = W^{-1}b + W^{-1}β. Using the result of the first part we have, if b' = [32, 23, 33, 31],

x = e + W^{-1}β,

so that when β' = [ε, -ε, ε, -ε], using Problem 5.4,

x_1 = 1 + 82ε, x_2 = 1 - 136ε, x_3 = 1 + 35ε, x_4 = 1 - 21ε.

5.4. Solution
If we write

L = [[a, 0, 0, 0], [b, c, 0, 0], [d, e, f, 0], [g, h, i, j]],
L^{-1} = [[A, 0, 0, 0], [B, C, 0, 0], [D, E, F, 0], [G, H, I, J]],

we find, equating elements in LL' = W:

a^2 = 10, a = √10; ab = 7, b = 7/√10; ad = 8, d = 8/√10; ag = 7, g = 7/√10;

b^2 + c^2 = 5, c^2 = 5 - (49/10), c = 1/√10; bd + ce = 6, ce = 4/10, e = 4/√10; bg + ch = 5, ch = 1/10, h = 1/√10;

d^2 + e^2 + f^2 = 10, f^2 = 10 - (64/10) - (16/10), f = √2; dg + eh + fi = 9, fi = 3, i = 3/√2;

g^2 + h^2 + i^2 + j^2 = 10, j^2 = 10 - (49/10) - (1/10) - (9/2), j = 1/√2.
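The element-by-element computation just carried out is the Cholesky factorization; a generic version (our own sketch, with hypothetical function names) reproduces it mechanically:

```python
import math

def cholesky(A):
    # Standard Cholesky factorization A = L L' for a symmetric positive
    # definite A, mirroring the hand computation above.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (math.sqrt(A[i][i] - s) if i == j
                       else (A[i][j] - s) / L[j][j])
    return L

W = [[10, 7, 8, 7], [7, 5, 6, 5], [8, 6, 10, 9], [7, 5, 9, 10]]
L = cholesky(W)
# reconstruction error of L L' against W
recon_err = max(abs(sum(L[i][k] * L[j][k] for k in range(4)) - W[i][j])
                for i in range(4) for j in range(4))
```

One can check, for instance, that L[0][0]^2 = 10, L[2][2]^2 = 2 and L[3][3]^2 = 1/2, agreeing with a, f and j above.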

In the same way, equating elements in LL^{-1} = I, we find

aA = 1; cC = 1, bA + cB = 0; fF = 1, eC + fE = 0, dA + eB + fD = 0; jJ = 1, iF + jI = 0, hC + iE + jH = 0, gA + hB + iD + jG = 0,

which gives

A = 1/√10, C = √10, B = -7/√10, F = 1/√2, E = -2√2,
D = √2, J = √2, I = -3/√2, H = 5√2, G = -3√2.

Forming (L^{-1})' L^{-1} gives the result required. Since W, W^{-1} are symmetric we have

W^{-1} = (L^{-1})' L^{-1} = [[25, -41, 10, -6], [-41, 68, -17, 10], [10, -17, 5, -3], [-6, 10, -3, 2]].
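The integer inverse just obtained can be verified exactly, and it also confirms the perturbation coefficients quoted in Problem 5.3; the following check is our own sketch:

```python
W = [[10, 7, 8, 7], [7, 5, 6, 5], [8, 6, 10, 9], [7, 5, 9, 10]]
Winv = [[25, -41, 10, -6], [-41, 68, -17, 10],
        [10, -17, 5, -3], [-6, 10, -3, 2]]

# W * Winv should be the identity (all arithmetic here is exact, in integers)
prod = [[sum(W[i][k] * Winv[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)]

# the perturbation of Problem 5.3: beta = eps*(1, -1, 1, -1) changes the
# solution e = (1,1,1,1) by Winv beta = eps*(82, -136, 35, -21)
beta = [1, -1, 1, -1]
delta = [sum(Winv[i][j] * beta[j] for j in range(4)) for i in range(4)]
```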

5.5. Solution
H_4 = LDL' where

L = [[1, 0, 0, 0], [1/2, 1, 0, 0], [1/3, 1, 1, 0], [1/4, 9/10, 3/2, 1]],
D = diag[1, 1/12, 1/180, 1/2800],

and H_4 = 𝒫𝒫' with

𝒫 = LD^{1/2} = [[1, 0, 0, 0], [1/2, 1/(2√3), 0, 0], [1/3, 1/(2√3), 1/(6√5), 0], [1/4, 3√3/20, 1/(4√5), 1/(20√7)]].

H_4^{-1} = [[16, -120, 240, -140], [-120, 1200, -2700, 1680], [240, -2700, 6480, -4200], [-140, 1680, -4200, 2800]],

||H_4||_1 = ||H_4||_∞ = 25/12 (row 1),

||H_4^{-1}||_1 = ||H_4^{-1}||_∞ = 13620 (row 3).

We note that

det H_4 = det D = 1/(12×180×2800) = 1/(6.048×10^6) ≈ 1.6534×10^{-7}.

5.6. Solution

A^{-1} = [[1-n^2, n^2], [n^2, -n^2]],

||A||_∞ = 2, ||A^{-1}||_∞ = 2n^2, ||B||_∞ = 2n^2 - 1, ||AB^{-1}||_∞ = n^{-1}, ||BA^{-1}||_∞ = 2n.

5.7. Solution
See F. R. Moulton, Amer. Math. Monthly, 20, 242-249 (1913).

-1.8000000×10^{-7}   1.8490000×10^{-5}   -1.8150000×10^{-5}

 2.1000000×10^{-7}   1.7260000×10^{-5}   -1.8140000×10^{-5}

-4.1000000×10^{-8}  -1.8749000×10^{-5}    1.8134000×10^{-5}.

These were obtained on an IBM 360/50 which gave as solutions

x = 1.0270566, y = 2.0919171, z = -0.38048000.

5.8. Solution
See E. H. Neville, Phil. Mag. (7), 39, 35-48 (1948).

X= 5.38625 24221 14004 89671 5, y= -2.81334 69056 56987 06591 5, z = - 11.59232 35480 19317 71940 9, u= 6.36482 51116 16317 76363 4, v= 7.99287 21174 39987 42297 9, w= -4.20355 33598 11286 99413 8. 5.9. Solution This matrix was introduced by H. Rutishauser (On test matrices, pp. 349-365, in: Programmation en mathematiques numerique, CNRS, Paris 1968). It is easily shown that det R = 1 and that

105 167 - 304 255] R-1= [ 167 266 -484 406 -304 -484 881 -739 . 255 406 -739 620

The characteristic values of R are approximately

19.1225, 10.88282, 8.99417, .0005343.

The condition numbers of R corresponding to the 2- and 00- vector norms are respectively

κ_2(R) ≈ 19.1225/.0005343 ≈ 35790 and κ_∞(R) = 26×2408 = 62608.

5.10. Solution
In this case it is simplest to find A^{-1}, since we need this for the second part. We have det A = 330+42+42-99-20-294 = 1. Hence, since

A^{-1} = (det A)^{-1}[A_{ji}], where A_{ji} is the ji cofactor in A, we have

A^{-1} = [[62, -36, -19], [-36, 21, 11], [-19, 11, 6]]

and so x = [-36, 21, 11]'.

Clearly ||A||_∞ = 20, ||A^{-1}||_∞ = 117, κ_∞(A) = 2340. The characteristic values of A are 16.662, 5.326 and 0.0112, so that κ_2(A) ≈ 1487. We have

C_3^{-1} = -(1/4)[[3, 2, 1], [2, 4, 2], [1, 2, 3]],

so that ||C_3||_∞ = 4, ||C_3^{-1}||_∞ = 2, κ_∞(C_3) = 8.

5.11. Solution
We find that the determinant of the system is 1 and that the solution is x = 1, y = -3, z = -2. The inverse of the matrix has last row [-1, 7, 5]. The condition number with respect to the Chebyshev vector norm is 105×22 = 2310, which is large for a 3×3 matrix.

5.12. Solution
See F. L. Bauer, ZAMM, 46, 409-421 (1966).

A^{-1} = (1/8910)[[10000, -1100, 10], [-1100, 1111, -11], [10, -11, 1]],

κ_∞(A) = 10101×(11110/8910) ≈ 12596.

For the scaled matrix DA we find

κ_∞(DA) = 1×(8437/297) ≈ 28.

This shows how we can improve the condition of a matrix by "scaling", i.e. multiplying by diagonal matrices. In this example the diagonal D is optimal for one-sided scaling in the Chebyshev norm.

5.13. Solution
(i) A_1 is an orthogonal symmetric matrix and so A_1^{-1} = A_1' = A_1. This can be established by elementary trigonometry using the relation

sin α + sin 2α + ... + sin nα = sin(½(n+1)α) sin(½nα) / sin(½α).

We note that A_1 is the modal matrix of A_8, i.e., the columns of A_1 are the characteristic vectors of A_8, so that

A_1 A_8 A_1 = diag[α_1, ..., α_n],

where α_1, ..., α_n are the characteristic values of A_8. See (iv) of this problem and also Problem 6.11. The characteristic values of A_1 are necessarily all real, A_1 being symmetric. Since they have absolute value 1, A_1 being orthogonal, they must be ±1. It can be shown that they are 1; ±1; ±1, 1; ±1, ±1; ... for n = 1, 2, 3, 4, .... This matrix is not positive definite.

(ii) See Problem 1.7. More generally, the inverse of αI + βJ is of the form γI + δJ because J^2 = nJ and we can solve (αI + βJ)(γI + δJ) = I by putting αγ = 1, αδ + βγ + nβδ = 0; i.e., if nβ + α ≠ 0,

γ = α^{-1}, δ = -β/{α(nβ+α)}.

In the special case α = n, β = 1 we have

γ = n^{-1}, δ = -1/(2n^2).

If we consider det(A_3 - λI) we can add all the columns to the first and take out a common factor 2n-λ, leaving a matrix with its first row and first column all 1's. If we subtract the first row from each of the others we get a matrix whose determinant is clearly (n-λ)^{n-1}. Thus the characteristic polynomial is (2n-λ)(n-λ)^{n-1}, so that the characteristic roots are 2n and n (with multiplicity n-1). The matrix A_3 is positive definite and its determinant is 2n^n.

(iii) For the case n = 4 see Problem 4.21. The more general matrix A, which is symmetric and whose i, j element, for i ≦ j, is a_i/a_j, where we assume the a's are distinct and non-zero, is discussed in Amer.

Math. Monthly, 53, 534-535 (1946). Specializing to a_i = i we get A_7. One way of obtaining A^{-1} is the observation that

(1) BAB' = D = diag[1-b_1^2, 1-b_2^2, ..., 1-b_{n-1}^2, 1], where

B is the unit lower bidiagonal matrix with 1's on the diagonal and -b_1, ..., -b_{n-1} on the subdiagonal, b_i = a_i/a_{i+1}. To verify (1), we observe that

(2) BA= BAB'=

where the r's are rows of A and the λ's are the columns of BA. It is now easy to check that BA is a lower triangular matrix whose i, j element, i ≧ j, is:

a_j/a_i - a_i a_j/(a_{i+1})^2,

where we interpret a_{n+1} = ∞, b_n = 0, so that the last row is not changed. Similarly we find that BAB' is the diagonal matrix given. From (1) we conclude that A^{-1} = B' D^{-1} B,

and using the interpretations (2) we find that A^{-1} is the symmetric triple diagonal matrix given by

[A^{-1}]_{ii} = d_i^{-1} + (a_{i-1}/a_i)^2 d_{i-1}^{-1}   (a_0 = 0),

[A^{-1}]_{i,i+1} = -a_i/(a_{i+1} d_i).

Specializing to a_i = i shows that A_7^{-1} is a triple diagonal matrix with elements as indicated:

(i, i): (i+1)^2/(2i+1) + (i-1)^2/(2i-1), i < n

(n, n): n^2/(2n-1)

(i, i+1), (i+1, i): -i(i+1)/(2i+1).

Since det B = 1 we have det A = det(BAB') = ∏_{i=1}^{n-1}(1 - b_i^2). In the special case we get

det A_7 = (-1)^{n-1} (2n)!/(2^n (n!)^3) ∼ (1/(√2·πn))(2e/n)^n.

(iv) The matrix A_8 is one of the most important in numerical mathematics; it can be called the "second difference matrix" since when we postmultiply A_8 by a vector we get a new vector whose elements are the second differences of the first. We have already noted (Problem 5.1) that in the case n = 4 we have

A_8^{-1} = -(1/5)[[4, 3, 2, 1], [3, 6, 4, 2], [2, 4, 6, 3], [1, 2, 3, 4]].

It can be verified that A_8^{-1} in the general case has

a_{ij} = -i(n-j+1)/(n+1), i ≦ j,

together with symmetry. We return to a proof of this later. We easily prove by induction that det A_8 = (-1)^n (n+1). (Compare Problem 4.14.) We shall now show that the characteristic values of A_8 are, for k = 1, 2, ..., n,

α_k = -2 + 2cos(kπ/(n+1)) = -4 sin^2(kπ/(2(n+1)))

and the corresponding characteristic vector is

c_k = [sin kφ, sin 2kφ, sin 3kφ, ..., sin nkφ]',

where φ = π/(n+1). This establishes a remark made in (i) above. It is not much more difficult to handle the case of a general triple diagonal matrix with constant diagonals: a on the diagonal, b above and c below. Clearly we have

Δ_n(λ) = det(A - λI) = (a-λ)Δ_{n-1}(λ) - bc·Δ_{n-2}(λ),

where Δ_0(λ) = 1, Δ_1(λ) = a-λ.

Introduce the notation D_1 = (a-λ)/√(bc) = 2α = 2cos θ and D_2 = (a-λ)^2/(bc) - 1 = 4α^2 - 1 = 4cos^2 θ - 1. Generally, write D_n = Δ_n(λ)/(√(bc))^n, so that

D_n = 2α D_{n-1} - D_{n-2}, where α = (a-λ)/(2√(bc)).

We solve this difference equation by trying D_n = Aμ^n and see that μ must satisfy the characteristic equation μ^2 - 2αμ + 1 = 0, so that

μ = α ± √(α^2 - 1) = c ± is,

where we suppose -1 ≦ α ≦ 1, and have written c = cos θ = α, s = sin θ = √(1-α^2). Thus

D_n = A_1(c+is)^n + A_2(c-is)^n = B_1 cos nθ + B_2 sin nθ.

We solve for B_1, B_2 using the initial conditions, which give

B_1 c + B_2 s = 2c, B_1(2c^2-1) + B_2(2sc) = 4c^2 - 1.

Hence B_1 = 1, B_2 = c/s, so that

D_n = cos nθ + (cos θ/sin θ)·sin nθ = sin(n+1)θ / sin θ.

Now Δ_n(λ) = 0 if and only if D_n = 0, and so if and only if θ = kπ/(n+1), k = 1, 2, ..., n. Hence

(a-λ)/(2√(bc)) = cos(kπ/(n+1)), λ = a - 2√(bc)·cos(kπ/(n+1)).

Since cos(kπ/(n+1)) = -cos((n+1-k)π/(n+1)) we may take instead

λ_k = a + 2√(bc)·cos(kπ/(n+1)), k = 1, 2, ..., n.

Let us take the case a = 0, b = c = 1 for simplicity and determine the characteristic vector corresponding to λ_k.

The equations read

x_2 = λx_1,
x_1 + x_3 = λx_2,
. . .
x_{n-1} = λx_n.

If x_1 = 0 then all the other x's are zero and this is not allowed in the case of a characteristic vector. We choose x_1 = sin θ and find

x_2 = 2cos θ·x_1 = sin 2θ,
x_3 = sin 3θ,

and so on.
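The eigenvalue formula λ_k = a + 2√(bc)cos(kπ/(n+1)) can be confirmed numerically via the recurrence for Δ_n; the sketch below (our own, with arbitrarily chosen a, b, c) evaluates the characteristic polynomial at each claimed root:

```python
import math

def char_poly(a, b, c, n, lam):
    # Delta_m(lam) = (a - lam) Delta_{m-1} - b c Delta_{m-2},
    # with Delta_0 = 1 and Delta_1 = a - lam, as derived above.
    d_prev, d = 1.0, a - lam
    for _ in range(2, n + 1):
        d_prev, d = d, (a - lam) * d - b * c * d_prev
    return d

a, b, c, n = 2.0, 1.0, 3.0, 5      # sample constant diagonals, bc > 0
lams = [a + 2 * math.sqrt(b * c) * math.cos(k * math.pi / (n + 1))
        for k in range(1, n + 1)]
worst = max(abs(char_poly(a, b, c, n, lam)) for lam in lams)
```

Each residual is at rounding level, so the n claimed values are indeed the roots of Δ_n.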

One way of finding A_8^{-1} is to use the result of Problem 4.8, in the case where A_22 is 1×1, to show how to obtain the inverse of a bordered matrix. We find

[[A, c], [r', a]]^{-1} = [[A^{-1} + αA^{-1}c r'A^{-1}, -αA^{-1}c], [-αr'A^{-1}, α]],

where α = (a - r'A^{-1}c)^{-1}, on the assumption that a ≠ r'A^{-1}c. We shall apply this to establish inductively that the inverse of the n×n matrix with -2 on the diagonal and 1 on the off-diagonals

has (i, j) element -i(n-j+1)/(n+1) for i ≦ j.

Assuming this for a particular n we consider bordering by c = r = [0, 0, ..., 0, 1]', a = -2, to get to the next case. Then

α^{-1} = -2 + n/(n+1) = (-2n-2+n)/(n+1) = -(n+2)/(n+1),

so that α = -(n+1)/(n+2) has the appropriate form. Observe that r'A^{-1}c is just the value of the quadratic form x'A^{-1}x for x = [0, 0, ..., 0, 1]', i.e. (A^{-1})_{nn} x_n^2 = (A^{-1})_{nn} = -n/(n+1). In view of symmetry it is enough to consider the row vector -αr'A^{-1},

which is clearly -α times the last row of A^{-1}, i.e.,

+((n+1)/(n+2))·(-1/(n+1))·[1·1, 2·1, ..., n·1] = -(1/(n+2))[1, 2, ..., n],

which is again of the appropriate form. For reasons of symmetry it will be sufficient to deal with the i, j element when i ≦ j. First note that

[A^{-1}]_{ij} = -i(n-j+1)/(n+1)

and then compute [αA^{-1}c r'A^{-1}]_{ij} as α[A^{-1}c]_i [r'A^{-1}]_j, giving

-((n+1)/(n+2))·(-i/(n+1))·(-j/(n+1)) = -ij/((n+1)(n+2)).

Adding, we find the new i, j element is

-(i/(n+1))[n-j+1 + j/(n+2)] = -i(n-j+2)/(n+2),

which is of the appropriate form. This completes the proof since the result for n = 1 is clearly in the appropriate form.
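The inductive result can be checked exactly in rational arithmetic; the sketch below (our own) builds the claimed inverse for n = 6 and verifies A·A^{-1} = I with no rounding:

```python
from fractions import Fraction as F

n = 6
# the second-difference matrix: -2 on the diagonal, 1 on the off-diagonals
A = [[-2 if i == j else (1 if abs(i - j) == 1 else 0) for j in range(n)]
     for i in range(n)]

def inv_entry(i, j):
    # the claimed inverse entry (1-based): -i(n-j+1)/(n+1) for i <= j,
    # extended by symmetry
    i, j = min(i, j), max(i, j)
    return F(-i * (n - j + 1), n + 1)

Ainv = [[inv_entry(i + 1, j + 1) for j in range(n)] for i in range(n)]
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)]
```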

(v) This matrix was introduced to us by J. W. Givens. It is easy to verify that A_9^{-1} is a triple diagonal matrix such that the diagonal elements are 3/2, 1, ..., 1, 1/2 and the off-diagonal elements are all -1/2. Using results of D. E. Rutherford (Proc. Royal Soc. Edinburgh 62A, 229-236 (1947); 63A, 232-241 (1952)) we find that the characteristic roots of A_9 are

(1/2) sec^2((2r-1)π/(4n)), r = 1, 2, ..., n.

Using the method of p. 30 we find, if f_n = det 2A_9^{-1}, that

f_0 = 1, f_1 = 3, f_r = 2f_{r-1} - f_{r-2}, r = 2, ..., n-1, f_n = f_{n-1} - f_{n-2}.

This gives f_n = 2. Hence det A_9^{-1} = 2×2^{-n} = 2^{1-n} and

det A_9 = 2^{n-1}.

(vi) M. Fiedler has shown that

1 __1_ -1 1 n-l n~11 1 I -1 -~ -~ -1 Aii=- "2 I -1 2 -I 1 1 -1 1 n-l n-I and det Au = (_I)"-12n-2(n -I). Fiedler has handled the case of the matrix where c/s are given constants. He has shown that _2C-l= -1 -I

o

-I -I

and that det C = (-1)^{n-1} 2^{n-2} (c_2-c_1)(c_3-c_2)···(c_n-c_{n-1}). It is known that A_11 has a dominant positive characteristic value and that all the other characteristic values are negative (see e.g. G. Szegő, Amer. Math. Monthly, 43, 246-259 (1936)).

(vii) This is a particular Vandermonde matrix; the general form has (λ, μ) element x_λ^{μ-1}, where the x_λ are distinct. The inverse is given by

[A_15^{-1}]_{λμ} = n^{-1} exp(-2πi(λ-1)(μ-1)/n).

(See W. Gautschi, Numer. Math., 4, 117-123 (1962), and 5, 425-430 (1963).) The characteristic values of A_15 are

if n ≡ 0 (4): n^{1/2}, n^{1/2}, -n^{1/2}, in^{1/2}; ±n^{1/2}, ±in^{1/2} each (n/4)-1 times

if n ≡ 1 (4): n^{1/2}; ±n^{1/2}, ±in^{1/2} each (n-1)/4 times

if n ≡ 2 (4): ±n^{1/2}; ±n^{1/2}, ±in^{1/2} each (n-2)/4 times

if n ≡ 3 (4): ±n^{1/2}, in^{1/2}; ±n^{1/2}, ±in^{1/2} each (n-3)/4 times.

(See L. Carlitz, Acta Arith., 5, 293-308 (1959).) It can be shown that det A_15 = n^{n/2}.

(viii) This is the Hilbert matrix. The case when n = 4 has been dealt with in Problem 5.5. There are several ways available to obtain the inverse explicitly. We give an account due to N. Gastinel (Chiffres 3, 169-152 (1960)). As in most treatments, a more general problem can be handled without complication. Take therefore

U = [(b_k - β_i)^{-1}]

and assume the b's and β's distinct and that no b_i coincides with any β_j. Let

b(t) = ∏_i (t - b_i), β(t) = ∏_i (t - β_i)

and write F(t) = b(t) + h(t), where h(t) is a polynomial of degree at most n-1. From the elementary theory of partial fractions

(1) F(t)/β(t) = 1 + Σ_i A_i/(t - β_i), where A_i = F(β_i)/β'(β_i).

Put t = b_k in (1) to get, since F(b_k) = h(b_k),

(2) h(b_k)/β(b_k) = 1 + Σ_i A_i/(b_k - β_i), k = 1, 2, ..., n.

We can present the equations (2) in matrix-vector form, where U is the matrix with

U_{ki} = (b_k - β_i)^{-1}.

We now observe that in order to solve the system

Ux = y

for a given y, we should determine a polynomial h(t), of degree at most n-1, such that

(3) h(β_i) = β(b_i)[1 - y_i], i = 1, 2, ..., n,

and then the solution is

x_i = A_i, where A_i = [b(β_i) + h(β_i)]/β'(β_i), i = 1, 2, ..., n.

In the case considered the data (3) determine the polynomial h(t) by Lagrangian interpolation. In order to find the inverse V of U we have to solve Ux = y in the case when y_i = δ_{ij}, i = 1, 2, ..., n, j fixed. Thus the polynomial h_j(t) is determined by

and is given explicitly by

[ [3 (bk) 1 [3 (b j ) 1] =b(t) Z b'(bk )' t-bk - b'(b) . t-bj . Using in this the fact that

we find 162 Solutions to Selected Problems

If we put b_i = i, β_j = -(j-1), i, j = 1, 2, ..., n, so that

β(t) = ∏_{i=0}^{n-1}(t+i), b(t) = ∏_{i=1}^{n}(t-i),

we find

b'(b_i) = (-1)^{n-i}(i-1)!(n-i)!,

β'(β_i) = (-1)^{i-1}(i-1)!(n-i)!,

and, finally,

[A^{-1}]_{ij} = ((-1)^{i+j}/(i+j-1)) · (n+i-1)!(n+j-1)! / ([(i-1)!(j-1)!]^2 (n-i)!(n-j)!).

For another treatment see S. Schechter, MTAC, 13, 73-77 (1959). We also find (cf. A. C. Aitken, Determinants and Matrices (1951 ed.), p. 136)

det A = [1!2!···(n-1)!]^4 / (1!2!3!···(2n-1)!).
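The explicit formula for the Hilbert inverse can be tested exactly; the sketch below (our own) evaluates it for n = 4 and multiplies against H_4 in rational arithmetic, recovering the integer inverse of Problem 5.5:

```python
from fractions import Fraction as F
from math import factorial as fact

def hilbert_inv_entry(i, j, n):
    # the explicit formula derived above (1-based indices i, j)
    num = fact(n + i - 1) * fact(n + j - 1)
    den = (i + j - 1) * (fact(i - 1) * fact(j - 1)) ** 2 * fact(n - i) * fact(n - j)
    return (-1) ** (i + j) * F(num, den)

n = 4
H = [[F(1, i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)]
Hinv = [[hilbert_inv_entry(i, j, n) for j in range(1, n + 1)]
        for i in range(1, n + 1)]
prod = [[sum(H[i][k] * Hinv[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)]
```

For instance the (1,1) entry is 16 and the (3,3) entry is 6480, as in Problem 5.5.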

The dominant characteristic values and vectors of Hn have been studied at length (cf. J. Todd, J. Research Nat. Bureau Stand., 65B, 19-22 (1961»). For the cases when n=2, 4,6,8, 10 see H. H. Denman and R. C. W. Ettinger, Math. Comp., 16, 370-371 (1962). We give for n=4 the maximum and minimum characteristic values and the corresponding characteristic vectors:

1:57017 20837] 1.50021 42801, [ .40677 89880 .31814 09689

.03688 76826] lO-4 X .96702 30402, [ ~:41524 92878 .

-.65017 12197 Chapter 5 163

(ix) The inverse is the lower triangular matrix given by

[A^{-1}]_{ij} = 0 if j > i, [A^{-1}]_{ij} = (1/i)·C(i, j)·B_{i-j} if j ≦ i,

where the B's are the Bernoulli numbers defined by

B_0 = 1, B_1 = -1/2, B_2 = 1/6, B_3 = 0, B_4 = -1/30, B_5 = 0, B_6 = 1/42, ....

Thus in the case n = 6

A^{-1} = [[1, 0, 0, 0, 0, 0],
          [-1/2, 1/2, 0, 0, 0, 0],
          [1/6, -1/2, 1/3, 0, 0, 0],
          [0, 1/4, -1/2, 1/4, 0, 0],
          [-1/30, 0, 1/3, -1/2, 1/5, 0],
          [0, -1/12, 0, 5/12, -1/2, 1/6]].

We give two derivations of the expression for the inverse.

(1) Write A^{-1} = B = [b_1, b_2, ..., b_n], where the b's are column vectors. Equate the elements in the first column of AB = I and we get, generally,

Σ_{r=1}^{m} C(m, r-1) b_{r1} = δ(m, 1), m = 1, 2, ..., n,

which are practically the standard recurrence relations for the Bernoulli numbers

(1) B_0 = 1, Σ_{r=0}^{n-1} C(n, r) B_r = 0, n = 2, 3, ....

Hence b_{j1} = B_{j-1}, j = 1, 2, ..., n. If we equate elements in the k-th column of AB = I we get, generally,

Σ_{j=k}^{l} C(l, j-1) b_{jk} = δ(k, l), l = k, k+1, ..., n.

If we write b_{jk} = j^{-1} C(j, k) p_{jk}, j = k, k+1, ..., n, the above equations reduce, after removal of the non-vanishing factor m!/((m-k+1)!k!), to relations of the form (1) for the p's; comparison with (1) gives p_{jk} = B_{j-k} and so the result required.

(2) (M. Newman.) Let us denote the k-th column of B by [x_0, x_1, ..., x_{n-1}]', where x_0 = x_1 = ... = x_{k-2} = 0, x_{k-1} = k^{-1}. The (r, k) element of AB = I is

Σ_{i=0}^{r-1} C(r, i) x_i = δ(r, k), so that

(2) Σ_{i=0}^{r-1} x_i/(i!(r-i)!) = δ(r, k)/r!.

Consider the generating function for the x_r:

(3) ω = Σ_{n=0}^{∞} (x_n/n!) z^n.

Clearly

ω e^z = Σ_{l=0}^{∞} (x_l/l!) z^l · Σ_{m=0}^{∞} z^m/m! = Σ_{n=0}^{∞} {Σ_{l=0}^{n} x_l/(l!(n-l)!)} z^n

= Σ_{n=0}^{∞} {x_n/n! + δ(n, k)/n!} z^n,

where we have used (2) at the last step. We have thus obtained

ω e^z = ω + z^k/k!,

giving

(4) ω = (z^k/k!)·1/(e^z - 1) = (z^{k-1}/k!)·z/(e^z - 1) = (z^{k-1}/k!) Σ_{n=0}^{∞} (B_n/n!) z^n,

in virtue of a standard generating relation for the Bernoulli numbers

z/(e^z - 1) = Σ_{n=0}^{∞} (B_n/n!) z^n.

Comparison of (3) and (4) gives the result required.

5.15. Solution
We begin by examining the inequality

(1) ||δ||/||x|| ≦ κ(W)·||β||/||b||

in the case of the Chebyshev norm and the data of Problem 5.3 (and their solutions). We find

||δ||_∞/||x||_∞ = 1.36/1, ||β||_∞/||b||_∞ = .01/33, κ_∞(W) = 33×136 = 4488,

and so the inequality (1) is a best possible one in the Chebyshev case.

If we use instead the Manhattan norm we find

||δ||_1/||x||_1 = 2.74/4, ||β||_1/||b||_1 = .04/119,

so that there is an overestimate in (1) by a factor of about 2, since the error amplification is actually (2.74×119)/(4×.04) ≈ 2038. Referring to the proof of (1) it is clear that in order to obtain equality in (1) we must have equality in both the relations

||b|| = ||A||·||x||, ||δ|| = ||A^{-1}||·||β||.

Since we have an example where equality holds in (1) in the Chebyshev case we proceed to the Manhattan case. From Chapter 3 we have ||Ax||_1 = ||A||_1 ||x||_1 if x = [0, 0, ..., 1, ..., 0]', where the 1 is in the j-th place and the j-th column sum is maximum. In the case of W we have j = 3, and in the case of W^{-1} we have j = 2. Thus if we consider the system

Wx = [8, 6, 10, 9]', with solution [0, 0, 1, 0]',

and perturb it to

Wx = [8, 6+ε, 10, 9]', with solution [0, 0, 1, 0]' + ε[-41, 68, -17, 10]',

we have equality in the relation (1).

The corresponding problem for the case of the euclidean vector norm requires some calculations. Let us first observe what happens in the case of the first perturbation [.01, -.01, .01, -.01]'. We find that

||β|| = .02, ||b|| = (32^2 + 23^2 + 33^2 + 31^2)^{1/2} ≈ 60,

||δ|| = (82^2 + 136^2 + 35^2 + 21^2)^{1/2} × .01 ≈ 1.64, ||x|| = 2,

and, since the characteristic values of W are approximately 30.29, 3.858, .8431, .01015, we have κ_2(W) ≈ 30.29/.01015 ≈ 2984.

Thus there is a rather small overestimate in (1), since the error amplification is actually (1.64×60)/(2×.02) = 2460. Let us now find a b and a β for which there is equality in the euclidean case. We recall that equality in

||Ax|| ≦ ||A||·||x||

is attained if x is the dominant characteristic vector of A'A. In the symmetric case ||A||_2 is just the (absolute value of the) dominant characteristic root of A and equality is obtained for the dominant characteristic vector of A. Suppose W has dominant characteristic root λ* with vector v*, and that its (absolutely) least characteristic root is λ_b with vector v_b, and that v*, v_b have euclidean norm 1. Consider Wx = v* perturbed to

W(x+δ) = v* + εv_b.

Then

x = (λ*)^{-1}v* and δ = ε(λ_b)^{-1}v_b,

so that

[||δ||/||x||] / [||εv_b||/||v*||] = λ*/λ_b.

This means that the amplification of the relative error is λ*/λ_b = κ_2(W).

We shall only examine C_4 in the euclidean vector norm case. We have seen (Problem 5.1) that ||C_4||_2 = 4cos^2(π/10) and ||C_4^{-1}||_2 = (4sin^2(π/10))^{-1}, so that the largest amplification of relative error in this case, κ_2(C_4) = cot^2(π/10) ≈ 9.5, is about 1/300 of that in the W case. Repetition of the argument in the W case shows that this worst case is attained when we consider the system C_4 x = v* perturbed to

C_4 x = v* + εv_b,

where v*, v_b are the characteristic vectors of C_4 corresponding to the characteristic values

-4cos^2(π/10) = -2(c+1), -4sin^2(π/10) = -2(1-c),

and these can be taken as [1, -2c, 2c, -1]' and [1, 2c, 2c, 1]', where c = cos(π/5).

We give the result for the H_4 case with the Manhattan norm. The system Hx = [1, 1/2, 1/3, 1/4]' with solution x = [1, 0, 0, 0]', when perturbed to Hx = [1, 1/2, 1/3+ε, 1/4]', gives as solution

x = [1, 0, 0, 0]' + ε[240, -2700, 6480, -4200]',

and the extreme amplification κ_1(H) = (25/12)×13620 = 28375 is attained.

5.16. Solution
Multiply the right-hand side by A + xy' to get

(A + xy')[A^{-1} - (1 + y'A^{-1}x)^{-1} A^{-1}xy'A^{-1}]

= I + (1 + y'A^{-1}x)^{-1}{xy'A^{-1} - xy'A^{-1} + (y'A^{-1}x)(xy'A^{-1}) - x(y'A^{-1}x)y'A^{-1}}.

The first two terms in the braces cancel and so do the second two, for y'A^{-1}x is a scalar and these two terms are

(y' A-I x)(x y' A-I) and -x (y' A -1 x) y' A-I = -(y' A-I x) (x y' A-I). This establishes the relation. The change in A-I caused by the change xy' in A is

-(1 + y'A^{-1}x)^{-1}[A^{-1}xy'A^{-1}].

The first factor here is a scalar and can be computed in 2n^2 multiplications, A^{-1} being available, by first taking A^{-1}x and then y'(A^{-1}x). Now using the

The first factor here is a scalar and can be computed in 2n2 multiplications, A-I being available, by first taking A-Ix and then y'(A-lx). Now using the 168 Solutions to Selected Problems column vector A-IX just computed and computing the row vector x' A-I at the expense of another n2 multiplications, we obtain the antiscalar product (A-Ix)(y' A-I) at the cost of n2 more multiplications. Thus in all we need about 4n2 multiplications. The scalar factor can be incorporated in the first factor A-IX at the cost of n divisions.

A+xy,=[H ~ ij, A-l=~[H : ~], 1 103 123 4

y'A-lx=10, A_IX=[~]' (I+Y'A-IX)_I(A-Ix)=[~i::j, 2 2/11 y' A-l=[2 3 3 2]. The perturbation is 1 r6 69 6 9 4] 6 Tf6996' 4 6 6 4 giving 0 1 3 -8 3 0 1 1 3 21 -1 -9j-8 0 3 o = 55 -84 -1 21 3 . [! 1 0 3 -9 -8 3 24 5.17. Solution r r (a) 1O- 2 x+y=1 } x+y=2 99 1 Subtracting 100 x = 1, x = 1 + 99

y = 1 - 1/99.

(b) (1) .10×10^{-1} x + .10×10^1 y = .10×10^1

(2) .10×10^1 x + .10×10^1 y = .20×10^1.

Multiply (1) by 10^2 to get

(3) .10×10^1 x + .10×10^3 y = .10×10^3.

Subtract (3) from (2) to get

.10×10^3 y = .10×10^3,

so that y = .10×10^1 and if we substitute this in (1) we get x = 0.

(c) Multiply (2) by 10^{-2} to get

(4) .10×10^{-1} x + .10×10^{-1} y = .20×10^{-1}.

Subtract (4) from (1) to get y = .10×10^1 and if we substitute this in (2) we get

x = .10×10^1.

5.18. Solution
Our first pivot is the (1,1) element and we get

10x_1 + 7x_2 + 8x_3 + 7x_4 = 32
.1x_2 + .4x_3 + .1x_4 = .6
.4x_2 + 3.6x_3 + 3.4x_4 = 7.4
.1x_2 + 3.4x_3 + 5.1x_4 = 8.6

The next pivot is the (4,4) element and the next stage of reduction is to

.10x_2 + .33x_3 = .42
.33x_2 + 1.32x_3 = 1.64
.1x_2 + 3.4x_3 + 5.1x_4 = 8.6

The next pivot is the (3,3) element and the next stage of reduction is to

.02x_2 = .01
.33x_2 + 1.32x_3 = 1.64

Thus we get

x_2 = .50, x_3 = 1.12, x_4 = -.93, x_1 = 1.30.

5.19. Solution
Choose F so that A+F is singular. Then choose any x_0 ≠ 0 such that (A+F)x_0 = 0. Then

||F|| ≧ ||Fx_0||/||x_0|| = ||Ax_0||/||x_0|| = ||Ax_0||/||A^{-1}(Ax_0)|| ≧ 1/||A^{-1}||.

We show that there is an F_0 such that A+F_0 is singular and ||F_0|| = 1/||A^{-1}||. We begin by choosing a y ≠ 0 such that

(1) ||A^{-1}y||_p = ||A^{-1}||_p ||y||_p.

We then choose a vector ω such that there is equality in the Hölder inequality:

(2) |ω'(A^{-1}y)| ≦ ||ω||_q ||A^{-1}y||_p

and normalize it so that

(3) ω'A^{-1}y = 1.

We then take F_0 = -yω'. We have (by (3))

(A+F_0)A^{-1}y = y - yω'A^{-1}y = 0,

so that A+F_0 is singular. We now have

||F_0||_p = sup_{x≠0} ||yω'x||_p/||x||_p   (by definition of F_0 and of an induced norm)
= sup_{x≠0} ||y||_p |ω'x|/||x||_p   (by homogeneity of a vector norm)
= ||y||_p sup_{x≠0} |ω'x|/||x||_p   (since ||y||_p does not depend on x)
= ||y||_p ||ω||_q   (by Hölder's inequality)
= ||y||_p/||A^{-1}y||_p   (by (2))
= 1/||A^{-1}||_p   (by (1)).

The results just established can be stated as follows: "min ||F||_p = 1/||A^{-1}||_p, where the minimum is over all F for which A+F is singular". We can interpret this result as follows: "If a matrix A has a large condition number it is 'near' to a singular matrix". This result is true for a general norm and the proof is unchanged except that the q-norms must be replaced by the norm dual to that given. We carry out the detailed calculations of this problem in the case of the matrix A of Problem 5.6 in the case of the Chebyshev norm, i.e. p = ∞.

With A = [[1, 1], [1, 1-n^{-2}]] we have A^{-1} = [[1-n^2, n^2], [n^2, -n^2]] and

||A||_∞ = 2, ||A^{-1}||_∞ = 2n^2.

We therefore ought to be able to find an F with ||F||_∞ = (2n^2)^{-1} such that A+F is singular. If we take

F = [[0, 0], [0, n^{-2}]],

A+F is clearly singular but ||F||_∞ = n^{-2}. However, if we take

F_0 = [[0, -(2n^2)^{-1}], [0, (2n^2)^{-1}]],

then A+F_0 is singular and ||F_0||_∞ = (2n^2)^{-1}. To get F_0 by the method sketched above we begin by choosing y = [1, -1]', which gives

A^{-1}y = [1-2n^2, 2n^2]', ||A^{-1}y||_∞ = 2n^2.

We write down the appropriate Hölder inequality

|ω'(A^{-1}y)| ≦ ||ω||_1 ||A^{-1}y||_∞

in which there is equality if ω is concentrated on the maximal component of A^{-1}y, and for this we must have ω_1 = 0, ω_2 = 1. In order to satisfy the normalizing condition (3) we must take ω_1 = 0, ω_2 = (2n^2)^{-1}. Hence

F_0 = -yω' = [[0, -(2n^2)^{-1}], [0, (2n^2)^{-1}]].

5.20. Solution
(a) From Ax = b and (A+F)(x+δ) = b it follows that

(A+F)δ = -Fx, or δ = -(I + A^{-1}F)^{-1}A^{-1}Fx, and this gives

||δ|| ≦ ||(I + A^{-1}F)^{-1}|| ||A^{-1}F|| ||x||.

If we use the norm axioms we find

||A^{-1}F|| ≦ ||A^{-1}|| ||F||, ||(I + A^{-1}F)^{-1}|| ≦ (1 - ||A^{-1}F||)^{-1},
||I|| = ||I + A^{-1}F - A^{-1}F|| ≦ ||I + A^{-1}F|| + ||A^{-1}F||,

so that

||I + A^{-1}F|| ≧ 1 - ||A^{-1}F|| ≧ 1 - ||A^{-1}|| ||F||.

Hence

||δ|| ≦ ||A^{-1}|| ||F|| ||x|| / (1 - ||A^{-1}|| ||F||)

and this gives the result stated.

(b) From the identity

A(A^{-1} - B^{-1})B = B - A

(i-I)° ,-(i-I)1 ' (i-I) 2 , ... ,(_1),-1. (i-I)i-I ,0, ... ,0, and the j-th column of 8~ is

(j-l)° ,-(j-l)1 ' (j-I) 2 , ... , (-1) j-l (j-I)j-I' 0, ... ,0, so that the ij element of An is i-I) (j-l) (i-I) (j-l) (° ° + 1 1 + ... = (i-I) (j-I) (i-I) (j-l) ° j-I + 1 j-2 + ... which is the coefficient of x j - 1 in the product (1 +4-1 (1 +X)j-l which is (i+j-lj -2) . This method can be applied to prove B~=I. Thus

A_4 = [[1, 1, 1, 1], [1, 2, 3, 4], [1, 3, 6, 10], [1, 4, 10, 20]].
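The binomial identities used here are easy to confirm; the sketch below (our own, using 0-based indices) builds the signed binomial matrix B, checks B·B = I, and checks that B·B' is the Pascal matrix with entries C(i+j, i):

```python
from math import comb

n = 5
# B has (i, j) element (-1)^j C(i, j), indices from 0
B = [[(-1) ** j * comb(i, j) for j in range(n)] for i in range(n)]

B2 = [[sum(B[i][k] * B[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]                 # should be the identity
A = [[sum(B[i][k] * B[j][k] for k in range(n)) for j in range(n)]
     for i in range(n)]                  # A = B B', the Pascal matrix
```

Since B^2 = I, it follows at once that A^{-1} = B'B, as stated next.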

Observe that since B^2 = I we have A^{-1} = B'B and so A = BA^{-1}B^{-1}, i.e. A, A^{-1} are similar, so that their characteristic roots are the same.

5.22. Solution
(See G. J. Tee, SIGNUM Newsletter, 7, 19-20 (1972), and G. Zielke, ibid., 9, 11-12 (1974).)

Note that

A^{-1} = [[-557, 842, -284], [610, -922, 311], [2, -3, 1]],

so that ||A||_∞ = 93, ||A^{-1}||_∞ = 1843. Tee does not discuss the worst case in the context of the effect of the change in x caused by a change in b. We discuss this in the case of the Chebyshev norm following Problem 5.15: choosing b, x and the perturbation δb appropriately we obtain

[||δx||_∞/||x||_∞] / [||δb||_∞/||b||_∞] = 1843×93 = 171399.

The Zielke matrix Z is got by a rank-one perturbation of A, since Z = A + αJ where J = ee' with e' = [1, 1, 1]. We can therefore use the result of Problem 5.16 to find

Z^{-1} = A^{-1} - (1 + αe'A^{-1}e)^{-1} αA^{-1}e e'A^{-1}.

Here e'A^{-1} = [55, -83, 28], A^{-1}e = [1, -1, 0]' and e'A^{-1}e = 0. Hence

Z^{-1} = A^{-1} - α[1, -1, 0]'[55, -83, 28]

= [[-55α-557, 83α+842, -28α-284], [55α+610, -83α-922, 28α+311], [2, -3, 1]].

When α is large and positive we have ||Z^{-1}||_∞ = 166α + 1843.

If we take

b = [3α+35, 3α+10, 3α-39]'

and perturb it by δb = [1, -1, 1]', we find

δx = [-166α-1683, 166α+1843, 6]'.

Now we have

[||δx||_∞/||x||_∞] / [||δb||_∞/||b||_∞] = (166α+1843)(3α+35) ∼ 498α².

5.23. Solution
See O. Taussky, MTAC, 4, 111-112 (1950). Generally the condition of AA' is worse than that of A, so that symmetrizing a system of equations, apart from its expense, makes the solution more awkward.

Chapter 6

6.1. Solution If r'A=(XJ" and Ac={3c then r'Ac=(r'A)c=ar'c and r' Ac=r'(Ac)={3r' c. Hence (a -{3) r' c =0 and so r' c=O since (f. -F {3. 6.2. Solution xy' has rank 1 if X-FO, y-FO. Its characteristic values are x'y and 0 with multiplicity n -1. We illustrate this in a 3 X 3 case. The characteristic polynomial in the case x=[I, kl' k 21', y=[a, b, c]' is

det[[a-λ, b, c], [k_1 a, k_1 b-λ, k_1 c], [k_2 a, k_2 b, k_2 c-λ]]

= det[[a+k_1 b+k_2 c-λ, b, c], [k_1 a+k_1^2 b+k_1 k_2 c-k_1 λ, k_1 b-λ, k_1 c], [k_2 a+k_1 k_2 b+k_2^2 c-k_2 λ, k_2 b, k_2 c-λ]]   (by adding k_1·col_2 + k_2·col_3 to col_1)

= (a+k_1 b+k_2 c-λ) det[[1, b, c], [k_1, k_1 b-λ, k_1 c], [k_2, k_2 b, k_2 c-λ]]

= (a+k_1 b+k_2 c-λ) det[[1, b, c], [0, -λ, 0], [0, 0, -λ]]   (by row_2 - k_1·row_1 and row_3 - k_2·row_1).

We can apply this result, for instance with x = y = [1, 1, ..., 1]', when x'y = 1+1+...+1 = n.

Alternatively, the matrix can be represented as a multiple of D^{-1}JD,

where D = diag[k_1, k_2, ..., k_n], and we can use the result of Problem 5.13 (ii).

6.3. Solution
A = [[0, 1], [0, 0]] and B = [[0, 0], [1, 0]] each have characteristic values 0, 0, but

A + B = [[0, 1], [1, 0]] has characteristic values ±1, and

AB = [[1, 0], [0, 0]] has characteristic values 1, 0.

A simple condition is that A is a polynomial in B (see Problem 6.4 below).

6.4. Solution
Clearly if Aa = αa then A^2 a = A(Aa) = A(αa) = α(Aa) = α^2 a and, generally, p(A)a = p(α)a. Also Aa = αa implies α^{-1}a = A^{-1}a if A is non-singular. The same result, r(A)a = r(α)a, is true for a rational function r, with the proviso that r(A) is defined (its denominator is not singular).

6.5. Solution
It is clear that the characteristic polynomial of P is λ^4 - 1. Accordingly its characteristic values are i^r = exp(πir/2) for r = 1, 2, 3, 4. Corresponding to the characteristic value i^r we have a characteristic vector v_r = [i^r, i^{2r}, i^{3r}, i^{4r}]'. The characteristic values of Q are (compare Problem 6.4)

a + b·i^r + c·i^{2r} + d·i^{3r}, r = 1, 2, 3, 4,

and the characteristic vectors include the v_r. (Note that there can be more, e.g. if a = 1, b = c = d = 0, any vector is a characteristic vector of Q = I.) The extension to the general case is obvious. Applying these results to the matrix R we see that its characteristic values are, for r = 1, 2, 3, 4,

-2 + 1·exp(rπi/2) + 0·exp(rπi) + 1·exp(3rπi/2)
= -2 + [cos(rπ/2) + i sin(rπ/2)] + [cos(3rπ/2) + i sin(3rπ/2)]
= -2 + 2cos(rπ/2) = -4 sin^2(rπ/4),

i.e., -2, -4, -2, 0.
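The circulant eigenstructure just used is easy to verify numerically; the sketch below (our own) builds R = -2I + P + P^3, checks each claimed value/vector pair, and recovers the list -2, -4, -2, 0:

```python
import cmath

n = 4
coeffs = [-2, 1, 0, 1]          # R = -2I + 1*P + 0*P^2 + 1*P^3
# a circulant: row i, column j holds coeffs[(j - i) mod n]
R = [[coeffs[(j - i) % n] for j in range(n)] for i in range(n)]

def residual(r):
    w = cmath.exp(2j * cmath.pi * r / n)          # a 4th root of unity
    lam = sum(coeffs[k] * w ** k for k in range(n))
    v = [w ** j for j in range(n)]                # characteristic vector
    Rv = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
    return max(abs(Rv[i] - lam * v[i]) for i in range(n))

worst = max(residual(r) for r in range(n))
lams = sorted(round(sum(coeffs[k] * cmath.exp(2j * cmath.pi * r / n) ** k
                        for k in range(n)).real) for r in range(n))
```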

6.6. Solution
Suppose

𝒜[x_1; x_2] = λ[x_1; x_2], 𝒜 = [[A, B], [B, A]].

Then

Ax_1 + Bx_2 = λx_1 and Bx_1 + Ax_2 = λx_2,

so that

(A+B)(x_1+x_2) = λ(x_1+x_2), (A-B)(x_1-x_2) = λ(x_1-x_2).

Thus λ is a characteristic value of A±B with characteristic vector x_1±x_2. What happens if x_1 = ±x_2? This proof does not work. We can, however, proceed as follows: Since

(1/2)·[[I, I], [I, -I]] [[A, B], [B, A]] [[I, I], [I, -I]] = [[A+B, 0], [0, A-B]],

the matrices

𝒜 = [[A, B], [B, A]] and ℬ = [[A+B, 0], [0, A-B]]

are similar and have the same characteristic values. Thus the characteristic values of 𝒜 are those of A+B and of A-B. The characteristic vectors of ℬ are [y, 0]', [0, z]', where y, z are the characteristic vectors of A+B, A-B, and those of 𝒜 are [y, y]' and [z, -z]'.

6.7. Solution

Characteristic values: 1, 2.

Characteristic values: ½(3 ± √13).

B = [[0, (3-√5)/2], [(3+√5)/2, 3]].

REMARKS (1) There is a considerable body of results known about the Gerschgorin circles. The above simple examples illustrate some of it. For instance, in "general" a characteristic value can lie on the circumference of a Gerschgorin circle only if it lies on the circumferences of all the circles. (The exceptional cases are when the matrix is "reducible", e.g., in the case of the matrix [~ ~] the characteristic value 3 is on the circumference of the point circle center 3 but not on the other circumference.) (2) If the Gerschgorin region Q breaks up into separate domains, then there are the "natural" number of characteristic values in each domain. An extreme case is that of a diagonal matrix with distinct elements - here there is exactly one characteristic value in each point-circle. (3) If we apply a diagonal similarity to a matrix A, i.e. if we consider

A1 = D A D^(-1), with D diagonal and non-singular, we note that the diagonal elements of A1 are the same as those of A, so that the centers of the Gerschgorin circles for A1 are the same as those for A. However, the radii may change, and judicious choice of D may result in more information about the location of particular characteristic values, especially isolated ones. This is illustrated in B above, where we have shrunk the circle center 0 as much as possible, letting the circle center 3 expand. This diagram, combined with remark (2), assures us that there is a characteristic value within the smaller circle.
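Remark (3) is easy to experiment with numerically. A minimal sketch (the 2×2 matrix and the scaling factor are illustrative assumptions, not taken from the book):

```python
def gerschgorin(A):
    """Return the list of Gerschgorin (center, radius) pairs of a square matrix."""
    return [(row[i], sum(abs(a) for j, a in enumerate(row) if j != i))
            for i, row in enumerate(A)]

A = [[0.0, 0.5],
     [0.5, 3.0]]
print(gerschgorin(A))      # [(0.0, 0.5), (3.0, 0.5)]

# Diagonal similarity A1 = D A D^{-1} with D = diag(1/d, 1): the centers are
# unchanged, the radius about 0 shrinks while the radius about 3 expands.
d = 4.0
A1 = [[A[0][0], A[0][1] / d],
      [A[1][0] * d, A[1][1]]]
print(gerschgorin(A1))     # [(0.0, 0.125), (3.0, 2.0)]
```

Since the two disks of A1 are disjoint, remark (2) guarantees exactly one characteristic value in the small disk about 0.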

6.8. Solution We have

(1) det A = det A1, where A1 is the matrix got from A by the first major step in the Gauss transformation, i.e.

row1 A1 = row1 A, rowi A1 = rowi A - (a_i1/a_11) row1 A, i = 2, 3, ..., n.

Clearly

(2) det A1 = a_11 det A1^(1),

where A1^(1) is the matrix of order n-1 indicated:

A1 = [a_11, A_12; 0, A1^(1)].

We shall show that A1^(1) has a dominant diagonal. Looking at the i-th row of A1^(1) we find

Σ_{k=2, k≠i} |[A1^(1)]_{ik}| ≤ Σ_{k=2, k≠i} |a_{ik}| + |a_{i1}| Σ_{k=2} |a_{1k}/a_{11}| - |a_{i1} a_{1i}|/|a_{11}|

≤ Σ_{k=2, k≠i} |a_{ik}| + |a_{i1}| - |a_{i1} a_{1i}|/|a_{11}|

(since the first row of A is strictly diagonally dominant; note that we cannot put a strict inequality here, for a_{i1} might be zero)

= Σ_{k=1, k≠i} |a_{ik}| - |a_{i1} a_{1i}/a_{11}|

< |a_{ii}| - |a_{i1} a_{1i}/a_{11}| ≤ |[A1^(1)]_{ii}|,

since the i-th row of A is strictly diagonally dominant. The proof is completed by induction. It is clear that a strictly diagonally dominant matrix of order 1 is non-singular. Assuming that strictly diagonally dominant matrices of order n-1 are non-singular, it follows that A1^(1) is non-singular and, by (1) and (2), so is A.
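The induction step of 6.8 can be checked numerically. A sketch (the strictly diagonally dominant test matrix is an arbitrary example):

```python
def is_strictly_dominant(A):
    return all(abs(A[i][i]) > sum(abs(A[i][j]) for j in range(len(A)) if j != i)
               for i in range(len(A)))

def eliminate_first(A):
    """One major Gauss step: return the trailing (n-1)x(n-1) matrix A1^(1)."""
    n = len(A)
    return [[A[i][j] - A[i][0] / A[0][0] * A[0][j] for j in range(1, n)]
            for i in range(1, n)]

A = [[4.0, -1.0, 2.0],
     [1.0, 5.0, -3.0],
     [-2.0, 1.0, 6.0]]
assert is_strictly_dominant(A)
A1 = eliminate_first(A)
assert is_strictly_dominant(A1)   # dominance survives the elimination step
```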

6.9. Solution From the Rayleigh Quotient Theorem (Problem 1.9) we have ρ(A) = max_{x'x=1} x'Ax. This implies, taking x = e_i, that ρ(A) ≥ a_ii. Here we use the fact that corresponding to the characteristic value ρ there is a positive characteristic vector x = [x1, x2, ..., xn]'. Hence from Ax = ρx it follows that (a_11 - ρ)x1 + a_12 x2 + ... + a_1n xn = 0 and, unless n = 1, a_11 - ρ must be negative; similarly for i = 2, 3, ..., n. 6.10. Solution The characteristic vectors are

6.11. Solution The whole solution to the characteristic value problem can be written as (1) AM = MΛ, where Λ = diag[α1, ..., αn]. Multiplying (1) by A we get, using (1), (2) A²M = AMΛ = MΛ². Transposing (1) we find (3) M'A = ΛM'. Multiplying (1) and (3) gives M'A²M = ΛM'MΛ, which, using (2), can be written as M'MΛ² = ΛM'MΛ, and since Λ is non-singular, M'M = ΛM'MΛ^(-1). We now recall the effect of a diagonal similarity (Problem 4.14), which shows that [M'M]_ij = [M'M]_ij α_i/α_j. Since the α's are distinct this implies

[M'M]_ij = 0 if i ≠ j, i.e. M'M is diagonal. 6.12. Solution The characteristic values of AB consist of those of BA, together with n-m zeros. (See, e.g., Faddeev and Sominskii.) The characteristic polynomial of AB is -λ³ + 18λ² - 81λ = -λ(λ-9)². It follows from the preceding remark that the characteristic polynomial of BA must be (λ-9)². One way to establish the given expression for BA is to observe that (AB)² = 9AB and hence (BA)³ = B(AB)²A = B(9AB)A = 9(BA)². Since BA is non-singular it follows that BA = 9I. The following proof of the result quoted is due to Householder (1972). We use the Schur complement (Problem 4.10). Let A be an m×n matrix and B an n×m matrix. Suppose m ≤ n. Consider the matrix

Φ = [λI_m, A; B, λI_n]. Then the results of the problem mentioned give

det Φ = det λI_m · det(λI_n - B λ^(-1) A) = λ^(m-n) det(λ²I - BA) and det Φ = det λI_n · det(λI_m - A λ^(-1) B) = λ^(n-m) det(λ²I - AB).

Hence, writing λ² = μ and equating the two expressions for det Φ, we get det(μI - BA) = μ^(n-m) det(μI - AB), which we can interpret as follows: the characteristic roots of BA are those of AB, together with n-m zeros. 6.13. Solution No, for if it were, Q = x'Ax would be positive for x' = [0, 0, 1], and it vanishes for this vector. Alternatively, we can easily calculate det[A - λI] = (2 - λ)(λ + 1)², so that the characteristic values of A, which are 2, -1, -1, are not all positive, again showing that A is not positive definite.

The standard Lagrangian method of reduction of a quadratic form to a sum of squares by "eliminating" one variable at a time does not apply here since the diagonal terms all vanish. We therefore use the special device of transforming from x1, x2, x3 to ξ1, ξ2, ξ3 by putting x1 = ξ1, x2 = ξ1 + ξ2, x3 = ξ3, so that ξ1 = x1, ξ2 = x2 - x1, ξ3 = x3. We find

Q(x) = q(ξ) = (ξ1 + ξ3)(ξ1 + ξ2) + ξ1ξ3 = ξ1² + ξ1ξ2 + 2ξ1ξ3 + ξ2ξ3, and we can apply the standard method to this. Indeed

q(ξ) = ξ1² + 2ξ1(½ξ2 + ξ3) + (½ξ2 + ξ3)² - (½ξ2 + ξ3)² + ξ2ξ3

= (ξ1 + ½ξ2 + ξ3)² - ¼ξ2² - ξ3², and we can now change back to get

Q(x) = (½x1 + ½x2 + x3)² - (½x2 - ½x1)² - x3², which we can check by expanding. 6.14. Solution See pp. 101-103. 6.15. Solution Consider 𝔄(λ) = [-λI, A*; A, -λI]. By "block Gaussian elimination"

[I, 0; λ^(-1)A, I][-λI, A*; A, -λI] = [-λI, A*; 0, λ^(-1)AA* - λI]. Hence det 𝔄(λ) = det(-λI) det(λ^(-1)AA* - λI) = (-λ)^n λ^(-n) det(AA* - λ²I) = det(λ²I - AA*). Thus the characteristic values are the positive and negative square roots of the characteristic values of AA*, i.e. the singular values of A and their negatives. 6.16. Solution See J. Williamson, Bull. Amer. Math. Soc. 37, 585-590 (1931). We require the following result of Schur:

If A is any (complex) matrix, there is a unitary U such that U*AU is a triangular matrix.

Proof. The result is trivial when A is 1×1. Assume that the result has been established for (n-1)×(n-1) matrices. Let a be a characteristic vector of A, of unit length, corresponding to the characteristic value α. Choose (as in Problem 4.24) other vectors x2, ..., xn so that V = [a, x2, ..., xn] is unitary. Consider

V*AV = [a*; x2*; ...; xn*] A [a, x2, ..., xn] = [a*; x2*; ...; xn*][αa, Ax2, ..., Axn] = [α, *; 0, A1].

The unitary similarity does not change the characteristic roots, so that those of A1 are the remaining characteristic roots of A. By our inductive hypothesis there is an (n-1)×(n-1) unitary matrix Û such that

Û*A1Û = R, an upper triangular matrix. Hence

[1, 0; 0, Û]* V*AV [1, 0; 0, Û] = [α, *; 0, R],

a triangular matrix. Since V[1, 0; 0, Û] is unitary the proof is complete. An easy consequence of this result is the fact that any normal A (i.e. satisfying AA* = A*A) is unitarily similar to a diagonal matrix. We outline the proof of Williamson's Theorem in the case of 2×2 block matrices of 2×2 matrices. It will be convenient to change the notation so that

Φ = [A, B; C, D], where A = a(M), B = b(M), ..., with a, b, ... polynomials. By the result of Schur just established there is a unitary matrix U which triangularizes M, i.e. U*MU = R; it is easy to see that U also triangularizes A, B, ..., so that

U*AU = a(R), U*BU = b(R), .... Hence, if M has characteristic roots μ, ν:

[U*, 0; 0, U*] Φ [U, 0; 0, U] = [a(R), b(R); c(R), d(R)] = [a(μ), *, b(μ), *; 0, a(ν), 0, b(ν); c(μ), *, d(μ), *; 0, c(ν), 0, d(ν)],

where the elements marked * are irrelevant in the present context. Since we have performed a similarity on Φ, the characteristic values of Φ are those of the matrix on the right and therefore of the matrix

Φ1 = [a(μ), b(μ), *, *; c(μ), d(μ), *, *; 0, 0, a(ν), b(ν); 0, 0, c(ν), d(ν)],

obtained by submitting that matrix to a similarity which interchanges the second and third rows and the second and third columns. But the characteristic values of Φ1 are those of

[a(μ), b(μ); c(μ), d(μ)] together with those of [a(ν), b(ν); c(ν), d(ν)], which is Williamson's Theorem in the special case. 6.17. Solution Compare Problem 6.2. A is the sum of two matrices [c_i c_j] and [r_i r_j], each of rank ≤ 1. Hence rank A ≤ 2. We use the following general theorem:

det(A - λI) ≡ (-1)^n [λ^n - γ1 λ^(n-1) + γ2 λ^(n-2) - ... + (-1)^n γn],

where, for r = 1, 2, ..., n, γr is the sum of all the principal minors of order r of A. Clearly

γ1 = trace A = Σ c_i² + Σ r_i²

and

γ2 = Σ_{i<j} (a_ii a_jj - a_ij²) = Σ_{i<j} [(c_i² + r_i²)(c_j² + r_j²) - (c_i c_j + r_i r_j)²].

6.18. Solution A is not symmetric and so cannot be orthogonally similar to a diagonal matrix. Since

the characteristic values are α1 = 1 and α2,3 = (-1 ± 2√2 i)/3, all of absolute value 1. Corresponding characteristic vectors are

V3= ~ [ ~ ]. - y'2i

Clearly if U = [v1, v2, v3] then U is unitary and U*AU = diag[1, α2, α3]. Generally, if A is normal, i.e. if AA* = A*A, then A is unitarily similar to a diagonal matrix. (Cf. Problem 6.16.) 6.19. Solution Since xx* = x*x the first relation is trivial. To deal with the second use the fact that there is a unitary S such that

S*AS = diag[λ1, λ2, ..., λn]. The characteristic vector of A corresponding to λi is u_i = S e_i, i = 1, 2, ..., n. Let μi = √λi, i = 1, 2, ..., n, the positive square root being taken in all cases. Define B = S diag[μ1, μ2, ..., μn] S*.

Then it is clear that u_i is also a characteristic vector of B, corresponding to the characteristic value μi. We put α = max μi and β^(-1) = min μi. Consider the matrix

(1) R = (βB - β^(-1)B^(-1))(αB^(-1) - α^(-1)B)

= (αβ + α^(-1)β^(-1)) I - αβ^(-1) A^(-1) - α^(-1)β A.

The characteristic vectors of R are the u_i and the corresponding characteristic values are αβ(1 - α^(-2)μi²)(1 - β^(-2)μi^(-2)) ≥ 0. Hence R, being hermitian, is positive semi-definite. Take any x ≠ 0. Express it as Σ y_i u_i. Then, assuming the u_i normalized,

x*Rx = Σ ρ_i |y_i|² ≥ 0, where the ρ_i are the (non-negative) characteristic values of R. Using the relation (1) we conclude that

(αβ + α^(-1)β^(-1)) x*x ≥ αβ^(-1) x*A^(-1)x + α^(-1)β x*Ax, and the arithmetic-geometric mean theorem tells us that the right-hand side is not less than 2√((x*Ax)(x*A^(-1)x)). Rearranging and squaring we have

(αβ + α^(-1)β^(-1))²/4 ≥ (x*Ax)(x*A^(-1)x)/(x*x)², as required. If we use the euclidean vector norm ‖·‖2 then the induced matrix norms of A, A^(-1) are clearly ‖A‖ = α², ‖A^(-1)‖ = β², and the result can be expressed in the form

(x*Ax)(x*A^(-1)x)/(x*x)² ≤ (κ^(1/2)(A) + κ^(-1/2)(A))²/4.

In the special case when A is diagonal, if we write m = min_i λi, M = max_i λi, the inequality becomes

(Σ λi |xi|²)(Σ λi^(-1) |xi|²)/(Σ |xi|²)² ≤ (m + M)²/(4mM),

or, replacing x by a unit vector and writing q_i for |xi|², so that Σ q_i = 1, (Σ λi q_i)(Σ λi^(-1) q_i) ≤ (m + M)²/(4mM).

Chapter 7 7.1. Solution

μ(3) = 16.9850, v(3) = [1, 9.9801, 5.0017]'

μ(4) = 15.7843, v(4) = [1, 9.9807, 4.9928]'

μ(5) = 15.7731, v(5) = [1, 9.9947, 4.9979]'

μ(6) = 15.7925, v(6) = [1, 9.9987, 4.9995]'. The exact results are

λ1 = 15.8, v1 = [1, 10, 5]'. The other characteristic pairs are λ2 = 3.16, v2 = [3, 4, 5]'

λ3 = 1.58, v3 = [2, 1.6, 1]'.
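The iterates μ(k), v(k) above are produced by the power method with the first component normalized to 1. A minimal sketch of that iteration (the 2×2 matrix is an illustrative example; the matrix of Problem 7.1 is not reproduced in this excerpt):

```python
import math

def power_method(A, v, steps):
    """Power method, normalizing each iterate so that its first component is 1."""
    mu = None
    for _ in range(steps):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]
        mu = w[0]                    # current estimate mu(k) of the dominant root
        v = [x / mu for x in w]      # normalized iterate v(k)
    return mu, v

A = [[4.0, 1.0],
     [1.0, 3.0]]
mu, v = power_method(A, [1.0, 1.0], 50)
assert abs(mu - (7 + math.sqrt(5)) / 2) < 1e-9   # dominant root is (7 + sqrt 5)/2
```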

7.2. Solution The characteristic pairs are

45, [-3, -1, 2]'; 2, [-3, -2, 3]'; 1, [-2, -1, 2]'. 7.3. Solution For details see E. Bodewig, Math. Tables Aids Comput. 8, 237-239 (1954). Convergence is very slow. After 1200 iterations the dominant characteristic value is only good to about 4D. There are two reasons: (a) the characteristic values are not well separated

-8.02857835, 7.93290472, 5.66886436, -1.57319074 and (b) the dominant characteristic vector

[1, 2.50146030, -.75773064, -2.56421169]' is very nearly orthogonal to the chosen initial guess

[1, 1, 1, 1]' - in fact their inner product is .17951797, so that the cosine of the angle between them is

which is about .0310 ≈ cos 88°.
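This near-orthogonality can be recomputed from the vectors quoted above (a sketch; only the quoted inner product is asserted exactly, and the cosine is merely checked to be small):

```python
import math

v = [1, 2.50146030, -0.75773064, -2.56421169]   # dominant vector (Bodewig)
x = [1, 1, 1, 1]                                # initial guess

dot = sum(a * b for a, b in zip(v, x))
cos = dot / (math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in x)))

assert abs(dot - 0.17951797) < 1e-7   # the inner product quoted in the text
assert abs(cos) < 0.05                # nearly orthogonal: angle close to 90 degrees
```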

7.4. Solution With the notation of the text we find

H4 v(0) = [1.508333, .860000, .613333, .479524]', so that μ(1) = 1.508333, v(1) = [1, .570166, .406630, .317917]'.

Compare this with the 10D results given on p. 162.
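The H4 iteration itself is easy to reproduce (a sketch, assuming the start v(0) = [1, 1, 1, 1]', which may differ from the book's; only the known dominant characteristic value of the 4×4 Hilbert matrix, ≈ 1.50021, is asserted):

```python
H4 = [[1.0 / (i + j + 1) for j in range(4)] for i in range(4)]

v = [1.0, 1.0, 1.0, 1.0]
mu = 0.0
for _ in range(30):
    w = [sum(a * x for a, x in zip(row, v)) for row in H4]
    mu = w[0]                 # estimate of the dominant characteristic value
    v = [x / mu for x in w]   # renormalize so that v[0] = 1

assert abs(mu - 1.50021) < 1e-3
```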

7.5. Solution We have to assume that the moduli of the roots are different, so that A necessarily has distinct characteristic roots, λ, μ say, where |λ| > |μ|. Hence A is diagonalizable. We denote by [a, c]' and [b, d]' the characteristic vectors of A. Then, since these are distinct, we may assume ad - bc = 1. This means that if T = [a, b; c, d]

then T^(-1) = [d, -b; -c, a] and T^(-1)AT = [λ, 0; 0, μ] = Λ, say.

To compute A^n we observe that since A = TΛT^(-1) we have A^n = TΛT^(-1) · TΛT^(-1) · ... · TΛT^(-1), and this product collapses so that A^n = TΛ^n T^(-1).

Hence

(1) A^n [x1, x2]' = TΛ^n T^(-1) [x1, x2]' = λ^n (d x1 - b x2)[a, c]' + μ^n (a x2 - c x1)[b, d]'.

Generally, therefore, the direction of this vector will tend to that of the dominant vector [a, c]', since λ^n ≫ μ^n as n → ∞. Difficulties can arise if d x1 - b x2 = 0, which can be interpreted as meaning that our initial vector [x1, x2]' is parallel to the other characteristic vector [b, d]' of A, containing no component in the direction of the dominant vector [a, c]'. It is also clear from (1) that the ratio of the lengths of successive vectors A^n [x1, x2]' approaches λ.

7.6. Solution

If we take α = ½(20 + 5) then the roots of the new matrix A - αI are

of absolute values 23/2, 15/2, 5/2, and there is a separation factor of 15/23, which makes the power method quite attractive. The smallest characteristic values of the matrices in Problems 7.1, 7.2 have been given above. That for H4 has been given in Problem 5.13 (viii).

7.7. Solution In the special case we have 1 0 0] [ 1 o 5= [0 1 0, 5-1 = 0 1 0,0] 5-1 A5= [10 1 0 1 -1 o 1 0 We check that [1, 1]' is a characteristic vector of AI: [~ ~][~]=4g]. Then ~ 4-1 [1] [-1] y= [1, -4]g] 1 = -1 so that a characteristic vector of 5-I A 5 is [1, -1, -1]' and that for A is

The characteristic vector of A corresponding to the characteristic value 2 is [1, 1, 1]'.

REMARK. The "exceptional case" y'ŷ = 0 can arise, for instance, in the case of the matrix B = [0, 1, -1; 2, 2, -2; -4, 1, 5]

(a permutation similarity of the given matrix). Take the characteristic pair

1, [1, 0, 1]'. We deflate B to get 1 0 T-'BT~[~ 2 where T~[~ 1 0 -!] 0 ~] Choose the characteristic pair 4, [-1, 1]' and we have y' y = [1, 1] [- ~] = 0 Chapter 7 189 so that the characteristic pair of T-l B Tis

4, [0, -1, 1] and the characteristic vector of B is

7.8. Solution Let r3' be the third row of A. We have Av = λ3 v and, in particular, r3'v = λ3, since the third component of v is 1. Suppose λ ≠ λ3 is a characteristic value of A and let the corresponding characteristic vector c also be normalized to have its third component unity so that, as before, r3'c = λ. Consider the matrix Ā = A - v r3'.

This matrix has as its third row the zero vector. Thus one of its characteristic values is zero and all the characteristic vectors have their third component zero. We note that Ā(v - c) = (A - v r3')(v - c)

= λ3 v - λc - λ3 v + λv = λ(v - c), so that v - c is a characteristic vector of Ā with characteristic value λ. In view of what has been said we can get λ as a characteristic value of the principal 2×2 submatrix of Ā, which is the "deflation" of A. If b̂ is the corresponding 2-dimensional characteristic vector and b = [b̂', 0]' then we can choose θ ≠ 0 so that c = v - θb

is a characteristic vector of A with characteristic value A., i.e.,

(1) A(v - θb) = λ(v - θb).

In view of our constructions we have

[A - v r3']b = λb, i.e. Ab = v r3'b + λb, and (1) will be satisfied if

λ3 v - θ(r3'b)v - θλb = λv - θλb, which is the case if

θ = (λ3 - λ)/(r3'b).

(Discuss the situation when r3'b = 0.) We illustrate this with a slightly modified version of Problem 7.2.

A ~ FE ~: -~] has as a ch",,",'

The characteristic pairs of the deflated matrix

[1, -3; 0, 2] are 1, [1, 0]' and 2, [3, -1]'.

Let us see how to build up the characteristic vector of A corresponding to the characteristic value 2. We take b=[3, -1,0]' to give

θ = (45 - 2)/((44×3) + (46×(-1))) = 43/86 = 1/2, and then c = v - ½b,

so that we find the characteristic pair 2, [3, -3, 2]'. 7.9. Solution The characteristic values of the matrix are α_r = 2 - 2cos(rθ), r = 1, 2, 3, 4, with characteristic vectors

c_r = [sin rθ, sin 2rθ, sin 3rθ, sin 4rθ]', r = 1, 2, 3, 4, where θ = π/5. We shall go after the characteristic vector c1 where

c1' = [sin(π/5), sin(2π/5), sin(3π/5), sin(4π/5)], which is parallel to

[1, 2cos(π/5), 2cos(π/5), 1] = [1, ½(√5+1), ½(√5+1), 1]

≈ [1, 1.6180, 1.6180, 1]. The corresponding characteristic value is

α1 = 2 - 2cos(π/5) = 4 sin²(π/10) = (3 - √5)/2 ≈ 0.3820.

We shall take v(0) = [1, 1, 1, 1]', α = .4.

We find, solving the system (7.2) for v(1):

(v(1))' = [-40, -65, -65, -40] = -40[1, 13/8, 13/8, 1]. (13/8 = 1.625.)

We repeat, solving the system (7.2) for v(2), taking the normalized v(1) on the right, and find

(v(2))' = [-445/8, -90, -90, -445/8] = -(445/8)[1, 144/89, 144/89, 1].

(144/89 ≈ 1.61798.)

It is interesting to see the relative sizes of the components a_i. We find these by taking the scalar product of v(0) with c_i to get, since the c_i's are orthogonal, a_i = (c_i'v(0))/(c_i'c_i).

The scalar product on the right can be evaluated by elementary trigonometry and we find a1 = (10 + 2√5)/10, a2 = 0, a3 = (10 - 2√5)/10, a4 = 0.

Question. What is the dominant characteristic value of the matrix? What would happen if you tried to find it using the power method, starting with the vector v(0) above? Observe that α4 = ½(5 + √5) and that c4 is orthogonal to v(0), as c4 ∝ [1, -(√5+1)/2, (√5+1)/2, -1].
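The eigen-structure used in 7.9 can be confirmed directly (a sketch, assuming the matrix of the problem is the 4×4 second-difference matrix tridiag(-1, 2, -1)):

```python
import math

n, theta = 4, math.pi / 5
A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]

for r in range(1, n + 1):
    alpha = 2 - 2 * math.cos(r * theta)                    # characteristic value
    c = [math.sin((j + 1) * r * theta) for j in range(n)]  # characteristic vector
    Ac = [sum(a * x for a, x in zip(row, c)) for row in A]
    assert all(abs(y - alpha * x) < 1e-9 for x, y in zip(c, Ac))
```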

7.10. Solution G. E. Forsythe (MTAC 6, 9-17, esp. 15 (1952)). The characteristic pairs are 30.29, [.380, .526, .552, .521]'; 3.858, [.396, .614, -.271, -.625]'; .8431, [.094, -.302, .761, -.568]'; .01015, [-.830, .501, .208, -.124]'.

Chapter 8

8.1. Solution The basic equation is

[c, s; -s, c][A, H; H, B][c, -s; s, c] = [a, 0; 0, b],

where c, s are the cosine and sine of the angle θ defined by tan 2θ = 2H/(A - B), and the values of a, b are given by a = Ac² + Bs² + 2Hsc, b = As² + Bc² - 2Hsc. (Observe that a + b = A + B and that a² + b² = A² + B² + 2H².) If we write n = 2H, d = A - B it follows from elementary trigonometry that

(1) 2c² = {(n² + d²)^(1/2) + d}(n² + d²)^(-1/2), 2s² = {(n² + d²)^(1/2) - d}(n² + d²)^(-1/2), 2sc = n(n² + d²)^(-1/2).

If we take the case A = 2, B = 5, H = -3 we find tan 2θ = 2/1; n = 2, d = 1, say. Hence c, s are determined, which gives, in particular, a = (7 - 3√5)/2 ≈ 0.146, b = (7 + 3√5)/2 ≈ 6.854, results which check with the fact that a, b are the characteristic values of the matrix [A, H; H, B].

Care must be taken with the ambiguities in c, s, e.g., by taking -π/2 ≤ 2θ ≤ π/2 and then taking c positive and s to have the sign of tan 2θ. Further, the formulas (1) are numerically unsatisfactory if n ≪ d and special tricks must be used. Since arctan 2 = 1.1071 the required angle of rotation of the axes is θ = .5536 ≈ 31.72°. 8.3. Solution See Solution 8.7 below. 8.6. Solution The first polynomial has all its roots real and located as follows: (-11, -10), (-4, -3), (-2, -1), two in (1, 2), (4, 5). The second polynomial has exactly two real zeros, one in the interval (-2, -1), the other in (6, 7). 8.7. Solution We write

f0(λ) = 1, f1(λ) = a1 - λ, and then use f_r(λ) = (a_r - λ)f_{r-1}(λ) - b_{r-1} c_r f_{r-2}(λ) for r = 2, 3, ..., n. The corresponding formulas for the derivatives are: f0'(λ) = 0, f1'(λ) = -1,

f_r'(λ) = (a_r - λ)f'_{r-1}(λ) - b_{r-1} c_r f'_{r-2}(λ) - f_{r-1}(λ), and these can be computed at the same time as the f_r(λ). From Problem 5.13 (iv), or otherwise, the characteristic values of A are 0 + 20cos(rπ/6), r = 1, 2, 3, 4, 5, i.e. ±10√3, ±10, 0. The characteristic polynomial of B is -λ⁵ + 12λ³ + 12λ² - 17λ - 18

= -(λ + 1)(λ + 2)(λ³ - 3λ² - 5λ + 9), and the characteristic values are -2, -1.9459970, -1, 1.2520004, 3.6939948. Since tr B = 0, the term in λ⁴ being absent, some idea of the errors in our solution is given by the actual sum of the computed characteristic values, which is 4×10⁻⁷.
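The recurrences of 8.7 can be coded directly and tested on a matrix with known characteristic values (a sketch; the test matrix tridiag(-1, 2, -1) of order 4, with characteristic values 2 - 2cos(rπ/5), is an assumption for checking, not the matrix of the problem):

```python
import math

def char_poly(a, b, c, lam):
    """Evaluate f_n(lam) = det(A - lam*I) and its derivative for a triple-diagonal
    matrix with diagonal a, subdiagonal b, superdiagonal c, via the recurrences."""
    f_prev, f = 1.0, a[0] - lam            # f_0, f_1
    d_prev, d = 0.0, -1.0                  # f_0', f_1'
    for r in range(1, len(a)):
        f_new = (a[r] - lam) * f - b[r - 1] * c[r - 1] * f_prev
        d_new = (a[r] - lam) * d - b[r - 1] * c[r - 1] * d_prev - f
        f_prev, f = f, f_new
        d_prev, d = d, d_new
    return f, d

a, b, c = [2.0] * 4, [-1.0] * 3, [-1.0] * 3
f0, _ = char_poly(a, b, c, 0.0)
assert abs(f0 - 5.0) < 1e-12               # det of tridiag(-1, 2, -1), order 4
for r in range(1, 5):
    f, _ = char_poly(a, b, c, 2 - 2 * math.cos(r * math.pi / 5))
    assert abs(f) < 1e-9                   # f_4 vanishes at each characteristic value
```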

8.9. Solution The successive reductions are

[5, 9.2195446, 0, 5; 9.2195446, 17.9058830, 1.2235291, 11.1719188; 0, 1.2235291, 2.0941177, 2.2777696; 5, 11.1719188, 2.2777696, 10],

[5, 10.4880889, 0, 0; 10.4880889, 25.4727293, 2.1614262, 2.7806542; 0, 2.1614262, 2.0941177, 1.4189767; 0, 2.7806542, 1.4189767, 2.4331552],

and that given. The rotations used are given by c = .7592566, s = .6507914; c = .8790491, s = .4767313; c = .6137097, s = .7895317. As a check we compute the determinant of W1 using the recurrence method (p. 30) and we get det W1 = 1.0000053. The characteristic roots of W are approximately .0105, .8431, 3.858, 30.29. The Householder vectors are [0, .9131, .3133, .2611]' and [0, 0, .8533, .5215]'. 8.11. Solution See, e.g., H. Rutishauser, On Jacobi rotation patterns, pp. 219-239, in Proc. Symposia in Applied Mathematics, vol. 15 (American Mathematical Society, 1963). 8.12. Solution The characteristic polynomial D6(λ) = λ⁶ - 15λ⁴ + 45λ² - 15 of this matrix is one of the normalizations of the Hermite polynomial H6(x). Specifically, it is clear that

D_n(λ) = -λ D_{n-1}(λ) - (n-1) D_{n-2}(λ).

Comparing this with the standard recurrence relation

H_n(x) = 2x H_{n-1}(x) - 2(n-1) H_{n-2}(x), we see that D_n(λ) = 2^(-n/2) H_n(-λ/√2).

The zeros of H6(x) are (National Bureau of Standards, Handbook of Mathematical Functions, 1964, p. 924)

±.43608, ±1.33585, ±2.35060, and those of D6(λ) are got by multiplying these by √2: ±.61671, ±1.88918, ±3.32426. 8.13. Solution The characteristic values are approximately

22.406872, 7.513724, 4.848950, 1.327045, -1.096595.

The dominant characteristic vector is

[.024588, .302396, .453215, .577177, .556385]' and that corresponding to the characteristic value near 5 is

[-.547173, .312570, -.618112, .115607, .455494],.

This example is due to J. H. Wilkinson (Numer. Math. 4, 354-376 (1962)); see also John Todd (Error in digital computations, vol. 1, ed. L. B. Rall, 1965, pp. 3-41). 8.14. Solution We want to have A = QR, i.e. Q'A = R. It will be sufficient to show that ω = [x1, w']', where ω'ω = 1, can be chosen so that

(1) [1 - 2x1², -2x1 w'; -2x1 w, I - 2ww'][a11, ...; ...] = [b11, ...; 0, ...],

for repetition of this process on successive principal submatrices will complete the triangulation of A. (Note that this is an alternative to the Gaussian triangulation of A.) Since I - 2ωω' is orthogonal we have

(2) b11² = a11² + a21² + ... + an1².

We can clearly assume a11 ≠ 0; there is a choice of sign in b11. Multiplying (1)

out we find

(3) (1 - 2x1²)a11 - 2x1 w'a^(1) = b11,

(4) -2x1 a11 w + (I - 2ww')a^(1) = 0, where a^(1) = [a21, ..., an1]'.

Substituting in (4) from (3) we find 2(x1 a11 + w'a^(1)) = (a11 - b11)/x1, which gives w = x1 a^(1)/(a11 - b11), so that w is determined when x1 is, since b11 is given by (2). The fact that x1² + w'w = 1 gives

x1² [1 + a^(1)'a^(1)/(a11 - b11)²] = 1,

and this determines x1 up to a sign. 8.15. Solution The characteristic values are approximately 12.109309, -.535898, -.679823, -1.000000, -2.429488, -7.464102.

and this determines Xl up to a sign. 8.15. Solution The characteristic values are approximately 12.109309, - .535898, - .679823, -1.000000, -2.429488, - 7 .464102. The "even" values are exactly - 2 (2 - v3), - 1, - 2 (2 + V3).

Chapter 9

9.1. Solution (J.) The solution to this system is obvious. We make any guess x(0) and write δ(0) = x(0) - x. From the text, in the Jacobi case,

δ(r+1) = ½[0, 1; 1, 0] δ(r);

clearly [0, 1; 1, 0]^(2n) = [1, 0; 0, 1], [0, 1; 1, 0]^(2n+1) = [0, 1; 1, 0], and we see that the components of δ(r) are obtained from those of δ(0) by dividing by 2^r and interchanging them if r is odd.
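The halving-and-interchanging behaviour can be watched directly (a sketch, assuming the system is [[2, -1], [-1, 2]]x = [1, 1]' with solution [1, 1]'; the book's system is not reproduced in this excerpt):

```python
b = [1.0, 1.0]                  # for A = [[2, -1], [-1, 2]] the solution is [1, 1]
x = [5.0, 0.0]                  # any starting guess; error delta0 = (4, -1)
delta0 = (x[0] - 1.0, x[1] - 1.0)

for r in range(1, 7):
    x = [(b[0] + x[1]) / 2.0, (b[1] + x[0]) / 2.0]   # one Jacobi step
    # predicted error: components of delta0 divided by 2**r, interchanged if r odd
    pred = (delta0[1], delta0[0]) if r % 2 else delta0
    pred = (pred[0] / 2 ** r, pred[1] / 2 ** r)
    assert abs(x[0] - 1.0 - pred[0]) < 1e-12
    assert abs(x[1] - 1.0 - pred[1]) < 1e-12
```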

(G.-S.) From the text, in this case,

δ(r+1) = ¼[2, 0; 1, 2][0, 1; 0, 0] δ(r) = [0, ½; 0, ¼] δ(r).

9.2. Solution The inverse of [1, 0, 0; -a, 1, 0; -b, -c, 1] is [1, 0, 0; a, 1, 0; ac + b, c, 1]. In the first case we have to find the spectral radii of the Jacobi iteration matrix and of the Gauss-Seidel iteration matrix (D - L)^(-1)U. The first has characteristic polynomial -λ³ and the second -λ(λ² + 4λ - 4), so that the spectral radii are 0 and 2(1 + √2) respectively. Thus the Jacobi process converges and the Gauss-Seidel process does not. In the second case the Jacobi matrix has characteristic polynomial -λ(λ² + 5/4) and the Gauss-Seidel matrix -λ(λ + ½)², so that the spectral radii are √5/2 and ½ respectively. Thus the Gauss-Seidel process converges and the Jacobi process does not. 9.3. Solution

[Figure: graph of the spectral radius ρ(L_ω) of the SOR operator against ω, 0 < ω < 2, with its minimum at ω_b.]

It is clear from the graph, and can be rigorously established, that the change in ρ(L_ω) in the neighborhood of the optimal ω is much larger for negative changes than for positive. This suggests that when we do not know the optimal ω, it is preferable to take an overestimate. 9.4. Solution We find, for instance,

X1 = [-4.26, 4.14, -0.86; 4.14, -5.06, 1.94; -0.86, 1.94, -1.06], X2 = [-3.9976, 3.9864, -1.0136; 3.9864, -4.9896, 2.0104; -1.0136, 2.0104, -0.9896], so that

A^(-1) - X1 = [0.26, -0.14, -0.14; -0.14, 0.06, 0.06; -0.14, 0.06, 0.06],

A^(-1) - X2 = [-0.0024, 0.0136, 0.0136; 0.0136, -0.0104, -0.0104; 0.0136, -0.0104, -0.0104].

0.0024 0.0136 0.0136] A-I_X2 = [0.0136 -0.0104 ~0.0104 . 0.0136 -0.0104 -0.0104 9.6. Solution ~e1w]anttofindtherootsofdet(A-A/)=0 where A=[TOJ 2~J] and J= [ 1 l' T=(1/4)ro, 0'= l-ro. Chapter 9 199

(a) From Problem 4.10 we have, since P=2},

det (A -A I) = det (-2.1 ,,2) + .12 1-2 (1 ,,2j), i.e. - 2 A,,2 - 2 (1 ,,2 + .12 - 2 A,,2 - 2 (1 ,,2 ] det(A-Al)=det [ 21 2 2 2 21 2 2 2 12 - 11." - (1" - 11." - (1" +11.

Hence the characteristic roots are

(b) Alternatively, by Williamson's Theorem (cf. Problem 6.16), the characteristic roots of A are those of the 2×2 matrices obtained from the blocks by replacing J by its characteristic roots 2 and 0. 9.7. Solution (J.) We may assume the matrix A normalized to have units on the diagonal. The condition for convergence of the Jacobi process is that ρ(I - A) < 1. Since A is strictly diagonally dominant we have Λ_i = Σ_{j≠i} |a_ij| < 1 for each i. This means that the Gerschgorin circles of I - A, which are all centered at the origin, have radii < 1. Hence ρ(I - A) < 1. (G.-S.) We have now to show that ρ((I - L)^(-1)U) < 1, again assuming normalization. If λ, x is a characteristic pair for (I - L)^(-1)U then (I - L)^(-1)Ux = λx, which gives (1) (U + λL)x = λx.

Let |x_M| = max_i |x_i|. The M-th equation in (1) gives

λx_M = -Σ_{i>M} a_Mi x_i - λ Σ_{i<M} a_Mi x_i, i.e.

(2) |λ||x_M| ≤ Σ_{i>M} |a_Mi||x_i| + |λ| Σ_{i<M} |a_Mi||x_i|.

If |λ| ≥ 1 the relation (2) is impossible, since the right hand side

≤ |x_M| [Σ_{i>M} |a_Mi| + |λ| Σ_{i<M} |a_Mi|], by choice of M,

≤ |λ||x_M| Σ_{i≠M} |a_Mi| < |λ||x_M|,

since |λ| ≥ 1 and since A (with unit diagonal) is strictly diagonally dominant. Hence |λ| < 1.

9.8. Solution We have

(1) x*(F + G)x = (1 + λ)x*Fx.

Since F + G is positive definite hermitian, it follows that λ ≠ -1. Since F + G is hermitian, the right-hand side of (1) is equal to its conjugate transpose and we have

(1 + λ̄)x*F*x = (1 + λ)x*Fx = (1 + λ)[x*(F - G*)x + x*G*x] = (1 + λ)[x*(F - G*)x + λ̄ x*F*x],

i.e. (1 - |λ|²)x*F*x = (1 + λ)x*(F - G*)x. Multiply across by 1 + λ̄ and we get

(1 - |λ|²)(1 + λ̄)x*F*x = |1 + λ|² x*(F - G*)x.

The factor (1 + λ̄)x*F*x on the left can be replaced by x*(F + G)x, because F + G is positive definite hermitian and we can star (1). Hence

(2) (1 - |λ|²)x*(F + G)x = |1 + λ|² x*(F - G*)x.

The hermitian forms in (2) are positive if x ≠ 0. Hence 1 - |λ|² > 0, which is the result required. The problem is now easily solved. Since A is hermitian, A = L + D + L* with D real. If F = L + D, G = L*, then A = F + G is positive definite hermitian and so clearly is its diagonal D = F - G*. The result just established gives the conclusion wanted.

9.9. Solution 7O 1.55 [0]o . [3.20].12 . .18 . .21 . .22] . .21 0' .67' r·~]1.06' [l.n]1.28' r·1.39 ' 1.44 ' ... o .20 .35 .46 .55 .61

9.10. Solution

p. 298: .43×10⁷, .15×10⁻¹, .58×10¹, .99×10⁻², .10×10⁻¹
p. 20: .41×10⁴, .19×10⁻⁴, .57×10⁻⁴, .86×10⁻⁵, .86×10⁻⁵
p. 21: .16×10⁸, .13, .18×10³, .30×10⁻¹, .30×10⁻¹
p. 300: .13×10¹⁰, .18×10⁻¹, .65×10⁻¹, .96×10⁻², .96×10⁻²
p. 301: .14×10¹³, .20×10¹, .26×10², .10×10³, .35×10³
p. 301: .16×10¹⁶, .39×10², .16×10⁴, .59×10³, .85×10³

(1) The condition number quoted is the one related to the euclidean vector norm; it would be more appropriate to take that associated with the norm used here. (2) The ‖I - AX‖_F are systematically smaller than the ‖I - XA‖_F. (3) When the inverses are reasonable (i.e. in the first four cases) refinement takes place; in the last two, there is a deterioration. (4) Compare with the results of Problem 5.14.

Chapter 10

10.1. Solution We shall show that the Rayleigh-Ritz process leads, in this case, to the same solution as the Galerkin process. We discuss the problem, initially, in a somewhat more general case than is necessary. We begin by showing the relation of the variational problem

min_{c ≠ 0} (c'Ac)/(c'Bc),

where A is symmetric and B is positive definite, to the generalized characteristic value problem (of which a special case was handled in Problem 6.14):

(1) Ax = λBx.

We can write this in the form of a simple characteristic value problem

B^(-1)Ax = λx, but the matrix B^(-1)A is no longer necessarily symmetric. We can, however, get a symmetric problem in the following way. From the LDU Theorem we can express B in the form B = TT', where T is a non-singular triangular matrix, and then (1) can be written as (T^(-1)A(T')^(-1))(T'x) = λ(T'x).

Methods for the solution of the symmetric case apply, but we have the additional computation of x = (T')^(-1)y. Note that the characteristic vectors x are no longer orthogonal in the ordinary sense, but that we have x1'Bx2 = 0 if x1, x2 correspond to distinct λ1, λ2. We can now essentially repeat the argument of Problem 1.9. We expand any c in the form c = Σ γ_i x_i and then note that

R(c) = (Σ λ_i γ_i² x_i'Bx_i)/(Σ γ_i² x_i'Bx_i), which shows that λn ≤ R(c) ≤ λ1 if λ1 and λn are the extreme generalized characteristic values. We note, as in the simple case, that R(c) has an extreme point for c = x_i, with value λ_i. This can be proved by the use of Lagrange multipliers. We now return to the problem proper. In the case of the system (1), when we use the basis functions b_r(x) suggested, we are led to the problem

Ay = λBy, where A, B are the same triple diagonal matrices. In view of what has just been proved we come again to the same generalized characteristic value problem and to the same results as before. Several questions are apparent and form the beginning of a deeper study of the Rayleigh-Ritz method: What are good ways to choose the b's? How does the convergence to the characteristic values and vectors depend on n? Can lower bounds on the characteristic values be obtained?
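The reduction via B = TT' can be illustrated on a small example (a sketch; the matrices A, B below are arbitrary assumptions, with B positive definite and diagonal so that the Cholesky factor is immediate):

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]
B = [[2.0, 0.0], [0.0, 1.0]]          # positive definite, B = T T'

# C = T^{-1} A T'^{-1} is symmetric; its ordinary characteristic values are
# the generalized ones of A x = lambda B x.  Here T = diag(sqrt(2), 1).
C = [[A[0][0] / 2.0, A[0][1] / math.sqrt(2.0)],
     [A[1][0] / math.sqrt(2.0), A[1][1]]]

tr = C[0][0] + C[1][1]
det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
disc = math.sqrt(tr * tr - 4 * det)
lam = [(tr + disc) / 2, (tr - disc) / 2]

# direct check against det(A - lambda B) = 2 lam^2 - 6 lam + 3 = 0
for l in lam:
    assert abs(2 * l * l - 6 * l + 3) < 1e-12
```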

Chapter 11

11.1. Solution (1) In the notation of the text we have

We find

Q'f = [5, 10]', c = (1/6)[14, -6; -6, 3][5, 10]' = [5/3, 0]',

so that y = 5/3 is the best approximation. (2) Alternatively we take Q'Q = LL', where

L = [√3, 0; 2√3, √2], and we solve first Lz = [5, 10]', getting z = [5/√3, 0]',

and then L'c = z, getting c = [5/3, 0]' as before. (3) An analytic solution to the problem is as follows: Assume y = ax + b to be the linear fit. Then

E = Σ(ax_i + b - f_i)², which gives ∂E/∂a = 28a + 12b - 20, ∂E/∂b = 12a + 6b - 10. Solving 7a + 3b = 5 and 6a + 3b = 5

we find a = 0, b = 5/3, as before. (4) Consider the over-determined system

x + y = 1, x + 2y = 3, x + 3y = 1.

The factorization Q = ΦU is:

[1, 1; 1, 2; 1, 3] = [1/√3, -1/√2; 1/√3, 0; 1/√3, 1/√2][√3, 2√3; 0, √2],

and we find first

z = Φ'f = [1/√3, 1/√3, 1/√3; -1/√2, 0, 1/√2][1, 3, 1]' = [5/√3, 0]',

and then solve Uc = z, i.e., [√3, 2√3; 0, √2]c = [5/√3, 0]', giving c' = [5/3, 0], as before. 11.2, 11.3. Solution In connection with these problems see papers by R. H. Wampler, in particular An evaluation of linear computer programs, J. Research National Bureau Stand. 73B, 59-90 (1969). This paper gives some idea of the variable quality of library subroutines. 11.4. Solution It is geometrically obvious and easily proved that the line in question is such that the residuals r1, r2, r3, where

[Y3", 2 Y3"] c = [5N3] 0, fi 0 giving y' = [5/3,0], as before. 11.2,11.3. Solution In connection with these problems see papers by R. H. Wampler, in particular An evaluation of linear computer programs, J. Research National Bureau Stand. 73B, 59-90 (1969). This paper gives some idea of the variable quality of library subroutines. 11.4. Solution It is geometrically obvious and easily proved that the line in question is such that the residuals r}, r2 , r3 where

ri=aXi+b-fi are equal in magnitude but alternate in sign. Thus the best fit is given by

y = 2. For methods of handling this problem when there are more than three points x_i see F. Scheid, The under-over-under theorem, Amer. Math. Monthly 68, 862-871 (1961). 11.5. Solution E = (x + y - 1)² + (2x + 2y - 0)² + (-x - y - 2)².

½·∂E/∂x = 6(x + y) + 1; ½·∂E/∂y = 6(x + y) + 1.

Hence x + y = -1/6 for a minimum, which is 174/36 = 29/6.

11.6. Solution E = (x + y - 1)² + (2x - 0)² + (-x + 3y - 2)².

½·∂E/∂x = 6x - 2y + 1; ½·∂E/∂y = -2x + 10y - 7.

Hence x = 1/14, y = 5/7 for a minimum which is 1/14.
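The arithmetic of 11.6 can be confirmed exactly in rational arithmetic (a sketch of the normal-equations computation for the system of the problem):

```python
from fractions import Fraction as F

# rows of the residuals: (x + y - 1), (2x - 0), (-x + 3y - 2)
A = [[F(1), F(1)], [F(2), F(0)], [F(-1), F(3)]]
b = [F(1), F(0), F(2)]

# normal equations A'A z = A'b
AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(2)] for i in range(2)]
Atb = [sum(A[k][i] * b[k] for k in range(3)) for i in range(2)]
assert AtA == [[F(6), F(-2)], [F(-2), F(10)]] and Atb == [F(-1), F(7)]

det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]          # = 56
x = (AtA[1][1] * Atb[0] - AtA[0][1] * Atb[1]) / det          # Cramer's rule
y = (AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0]) / det
assert (x, y) == (F(1, 14), F(5, 7))

E = sum((A[k][0] * x + A[k][1] * y - b[k]) ** 2 for k in range(3))
assert E == F(1, 14)                                          # the minimum
```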

Chapter 12

12.1. Solution We may assume that the leading r×r submatrix A11 of A is non-singular. [For if not, there are permutation matrices P, Q such that PAQ has this property, and if PAQ = BC then A = (P^(-1)B)(CQ^(-1)).] We observe that

A = [A11, A12; A21, A22] = [A11; A21] A11^(-1) [A11, A12].

We only have to justify the equality A21 A11^(-1) A12 = A22. Since A has rank r and since its first r rows are independent, the remaining m - r rows can be represented as linear combinations of them. That is, there is an (m-r)×r matrix X such that

[A21, A22] = X[A11, A12], i.e. XA11 = A21 and XA12 = A22; the required equality follows. Finally, note that the rank of each of the two factors of A is exactly r: the first, which has r columns, includes I_r, and the second, which has r rows, includes A11. Clearly, if B, C are possible factors, so are BM and M^(-1)C for any non-singular r×r matrix M. On the other hand, if BC = B̂Ĉ then B'BC = B'B̂Ĉ and so C = [(B'B)^(-1)B'B̂]Ĉ = MĈ, say. The matrix B'B is non-singular since it has rank r with B. Since C, Ĉ each have rank r, so has M, and we can therefore write Ĉ = M^(-1)C as required. Continuing, from C = MĈ follows

B̂ĈĈ' = BMĈĈ', which gives B̂ = BM as required, since we can postmultiply across by (ĈĈ')^(-1), for ĈĈ' is of full rank with Ĉ.

REMARK. We have here used two standard results about rank: (1) rank AB ≤ min[rank A, rank B]; (2) rank AA* = rank A*A = rank A = rank A* for matrices with complex elements. We include a proof of (2). Clearly A*Ax = 0 implies x*A*Ax = 0 and so Ax = 0. On the other hand, Ax = 0 implies A*Ax = 0. Thus the null spaces N(A), N(A*A) are the same. Now it is a standard result that (3) dim N(M) = number of columns of M - rank M. Now A*A and A have the same number of columns and so, necessarily, the same rank. We outline a proof of (3). Suppose an n×n matrix A has rank r. Without loss of generality we may assume that the leading r×r submatrix A11 is non-singular.

A = [A11  A12]
    [A21  A22].

Consider the n - r vectors in V_n:

z_j = [-A11^(-1)A12 e_j],   j = 1, 2, …, n - r,
      [       e_j       ]

where e_j is the j-th unit vector in V_{n-r}. These vectors are clearly linearly independent and hence M = span(z_1, …, z_{n-r}) has dimension n - r in V_n. We prove that the null space N(A) of A is exactly M. Since

A z_j = [A11(-A11^(-1)A12 e_j) + A12 e_j        ] = [0],
        [linear combinations of the above r rows]   [0]

it follows that each z_j ∈ N(A), and so M ⊆ N(A) since N(A) is a subspace. To prove the opposite inclusion N(A) ⊆ M, take x = [x1, x2]' ∈ N(A). Let x2 = Σ c_j e_j. Then Ax = 0 gives A11 x1 + A12 x2 = 0, i.e. x1 = -A11^(-1)A12 x2, and so, by definition of the z_j, we have x = Σ c_j z_j, i.e. x ∈ M.

12.2. Solution

(Problem 11.6.) The matrix

A = [ 1   1]
    [ 2   2]
    [-1  -1]

has rank 1 and can be factorized as

A = BC,   B = [ 1]    C = [1  1].
              [ 2],
              [-1]

Here B'B = [6], CC' = [2] and

x = C'(CC')^(-1)(B'B)^(-1)B' b = (1/12) [1] [1, 2, -1] b,
                                        [1]

which, for the data of Problem 11.6, gives

x = -[1/12]
     [1/12].

Observe that x1 + x2 = -1/6 and that the solution obtained is the one of minimum length. (Problem 11.6, second solution.)

We find, if

A* = [1  2  -1]
     [1  2  -1],

that

AA* = [ 2   4  -2]               [6  6]
      [ 4   8  -4]   and  A*A =  [6  6].
      [-2  -4   2]

We see that the characteristic values of AA* are 12, 0, 0 and those of A*A are 12, 0 (cf. Problem 6.12). The orthogonal similarities which diagonalize these matrices are easily found to be

U*(AA*)U = diag [12, 0, 0]  where  U = [ 1/√6  -1/√3  1/√2]
                                       [ 2/√6   1/√3    0 ]
                                       [-1/√6   1/√3  1/√2]

and

V*(A*A)V = diag [12, 0]  where  V = [1/√2   1/√2]
                                    [1/√2  -1/√2].

Thus the singular value decomposition of A is

A = UΣV*,   Σ = [√12  0]
                [ 0   0]
                [ 0   0],

and the pseudo-inverse of A is A^I = V Σ^I U*, where

Σ^I = [1/√12  0  0],   i.e.
      [  0    0  0]

A^I = (1/12) [1  2  -1]
             [1  2  -1],

giving, for the same b,

x = A^I b = -(1/12) [1]
                    [1]

as before.
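Both the pseudo-inverse and the minimum-length property can be checked with numpy (not part of the original text); since the right-hand side b is not restated in this solution, the b below is my own choice:

```python
import numpy as np

A = np.array([[1., 1.], [2., 2.], [-1., -1.]])
Ai = np.linalg.pinv(A)
assert np.allclose(Ai, np.array([[1., 2., -1.], [1., 2., -1.]]) / 12)

# minimum-length property: x = Ai b beats every other least-squares
# solution x + t*(1, -1)' in length (b is an assumed example vector)
b = np.array([1., 0., 2.])
x = Ai @ b
for t in (-1.0, 0.5, 2.0):
    y = x + t * np.array([1., -1.])   # (1, -1)' spans the null space of A
    assert np.allclose(A @ y, A @ x)  # same residual
    assert np.linalg.norm(x) < np.linalg.norm(y)
```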

(Problem 11.7.) The matrix

A = [ 1  1]
    [ 2  0]
    [-1  3]

has rank 2 and we may take B = A,

C = I. Then

x = (A'A)^(-1)A'b

  = [ 6  -2]^(-1) [-1]
    [-2  10]      [ 7]

  = (1/56) [10  2] [-1]
           [ 2  6] [ 7]

  = (1/14) [ 1]
           [10].

12.3. Solution

225 AA* = [ 425  -250   350]      [ 17  -10   14]
          [-250   200  -100] = 25 [-10    8   -4].
          [ 350  -100   500]      [ 14   -4   20]

The characteristic polynomial of 9AA* is

det [λ - 17    10     -14]
    [  10    λ - 8      4] = λ(λ - 9)(λ - 36).
    [ -14      4    λ - 20]

Hence the singular values of A are 0, 1, 2. The singular value decomposition is

A = V [1  0] U*,
      [0  2]
      [0  0]

where

V = (1/3) [ 1  -2   2]                U* = (1/5) [3  -4]
          [-2   1   2]  and                      [4   3].
          [-2  -2  -1]

Here V is the orthogonal matrix which diagonalizes AA*, V*AA*V = diag [1, 4, 0], and U is the orthogonal matrix which diagonalizes A*A:

A*A = (1/25) [73  36]
             [36  52]

U*(A*A)U = (1/5) [3  -4] (1/25) [73  36] (1/5) [ 3  4] = [1  0].
                 [4   3]        [36  52]       [-4  3]   [0  4]

Hence

A^I = (1/15) [ 3  4] [1   0   0] [ 1  -2  -2]
             [-4  3] [0  1/2  0] [-2   1  -2]
                                 [ 2   2  -1]

   = (1/15) [-1   -4   -10]
            [-7  19/2    5].

In a similar way we find

B^I = (1/5) [ 3  4] [1   0   0] · (1/11) [2   6   9]
            [-4  3] [0  1/2  0]          [6   7  -6]
                                         [9  -6   2]

so that

B^I = (1/55) [18    32    15]
             [ 1  -27/2  -45].

12.4. Solution

(1) Clearly

AA* has characteristic values 5, 3, 0 and if

U* = [1/√2    0     1/√2]
     [1/√6  -2/√6  -1/√6]
     [1/√3   1/√3  -1/√3]

then

U*AA*U = [5  0  0]
         [0  3  0].
         [0  0  0]

Hence

U*A = [√2   1/√2   1/√2   √2]
      [ 0  -3/√6   3/√6    0]
      [ 0     0      0     0]

and, dividing the non-zero rows by √5 and √3,

V1* = [√(2/5)  1/√10  1/√10  √(2/5)]
      [   0    -1/√2   1/√2     0  ].

We then observe that we may take

V2* = [  1/√2     0      0    -1/√2 ]
      [-1/√10  2/√10  2/√10  -1/√10]

and then, with

V* = [ √(2/5)   1/√10   1/√10   √(2/5)]
     [    0     -1/√2    1/√2      0  ]
     [  1/√2      0        0    -1/√2 ]
     [-1/√10   2/√10    2/√10  -1/√10 ],

the singular value decomposition is

A = U [√5   0  0  0]
      [ 0  √3  0  0] V*.
      [ 0   0  0  0]
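The assembled decomposition can be verified numerically (numpy assumed, not part of the original text); here A is the matrix of Problem 12.4, the same matrix that is factorized in part (2):

```python
import numpy as np

s2, s3, s5, s6, s10 = map(np.sqrt, (2., 3., 5., 6., 10.))
A = np.array([[1., 0., 1., 1.],
              [0., 1., -1., 0.],
              [1., 1., 0., 1.]])
Ut = np.array([[1/s2, 0., 1/s2],
               [1/s6, -2/s6, -1/s6],
               [1/s3, 1/s3, -1/s3]])
Vt = np.array([[2/s10, 1/s10, 1/s10, 2/s10],
               [0., -1/s2, 1/s2, 0.],
               [1/s2, 0., 0., -1/s2],
               [-1/s10, 2/s10, 2/s10, -1/s10]])
Sig = np.zeros((3, 4)); Sig[0, 0] = s5; Sig[1, 1] = s3
U = Ut.T
assert np.allclose(Ut @ Ut.T, np.eye(3))   # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(4))   # V is orthogonal
assert np.allclose(U @ Sig @ Vt, A)        # A = U Sigma V*
```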

(2) A obviously has rank 2 and can be factorized in the form

A = [1  0] [1  0   1  1]
    [0  1] [0  1  -1  0],
    [1  1]

where each factor, again, obviously has rank 2. (Cf. Problem 12.1.) Using the formulas of the text we find:

A^I = C'(CC')^(-1)(B'B)^(-1)B' = (1/5) [2   1] · (1/3) [ 2  -1  1]
                                       [1   3]         [-1   2  1]
                                       [1  -2]
                                       [2   1]

   = (1/15) [ 3   0   3]
            [-1   5   4]
            [ 4  -5  -1]
            [ 3   0   3].
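A quick numpy check of this pseudo-inverse against the factored form, with B and C as above (not part of the original text):

```python
import numpy as np

B = np.array([[1., 0.], [0., 1.], [1., 1.]])
C = np.array([[1., 0., 1., 1.], [0., 1., -1., 0.]])
A = B @ C
Ai = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
assert np.allclose(15 * Ai, [[3., 0., 3.],
                             [-1., 5., 4.],
                             [4., -5., -1.],
                             [3., 0., 3.]])
assert np.allclose(Ai, np.linalg.pinv(A))
```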

[The matrix A^I above was discussed by M. R. Hestenes, J. SIAM, 6, 51-90 (1958).]

12.5. Solution

(a) Trivial. We have A = UΣV* = UV*·VΣV* = (UV*)(VΣV*), and UV* is unitary and VΣV* is non-negative definite.

(b) We have to show that

(1)  ‖A - UV*‖_F ≤ ‖A - W‖_F for any unitary W.

Since ‖X‖_F = ‖U1 X U2‖_F for any unitary U1, U2 (Problem 2.10), (1) is equivalent to

‖Σ - I‖_F ≤ ‖Σ - U*WV‖_F for any unitary W,

or to ‖Σ - I‖_F ≤ ‖Σ - W1‖_F for any unitary W1.

Now

‖Σ - W1‖²_F = tr (Σ - W1)(Σ - W1)* = tr (Σ² - ΣW1* - W1Σ + I) = Σ_r (σ_r² - σ_r(w_r + w̄_r) + 1),

where w_1, …, w_n are the diagonal elements of W1. Now, W1 being unitary, |w_r| ≤ 1 and so w_r + w̄_r = 2 Re w_r lies between -2 and 2, and

‖Σ - W1‖²_F ≥ Σ_r (σ_r² - 2σ_r + 1) = Σ_r (σ_r - 1)² = ‖Σ - I‖²_F.

12.6. Solution

We outline, from first principles, the construction of A^I when A ≠ 0 is a column vector a. Then AA* = aa* is an n×n matrix with characteristic values a*a, 0, 0, …, 0 (cf. Problem 6.12). Since (aa*)a = a(a*a), the characteristic vector of AA* corresponding to a*a is a. This means that the first column of the unitary matrix U is a/√(a*a); we shall see that no further information about U is required. Since A*A is the 1×1 matrix [a*a] we can take V = [1]. It is also clear that

Σ = [√(a*a), 0, 0, …, 0]*,   Σ^I = [1/√(a*a), 0, 0, …, 0].

We have

A = UΣV* = [a/√(a*a), …] [√(a*a)] [1]
                         [   0  ]
                         [   ⋮  ]
                         [   0  ]

so that

A^I = V Σ^I U* = [1] [1/√(a*a), 0, 0, …, 0] U*,

and, since the first row of U* is a*/√(a*a),

A^I = (1/(a*a)) a*.
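A numerical check on a column vector of my own choosing (numpy assumed, not part of the original text):

```python
import numpy as np

a = np.array([[1. + 2.j], [3. + 0.j], [0. - 1.j]])   # a nonzero column vector
ai = a.conj().T / (a.conj().T @ a).real              # a* / (a* a)
assert np.allclose(ai, np.linalg.pinv(a))
```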

It is easy to verify that all the axioms (12.4)-(12.7) are satisfied. The pseudo-inverse of a zero vector is its transpose. The results of this problem can also be found by applying (12.18) and taking B = A, C = [1].

Bibliographical Remarks

RECOMMENDED LITERATURE

T. M. APOSTOL, Calculus, I, II (Wiley, 1967-9).
A. OSTROWSKI, Vorlesungen über Differential- und Integralrechnung, 3 vols.; Aufgabensammlung zur Infinitesimalrechnung, 3 vols. (Birkhäuser, 1965-72).
GILBERT STRANG, Linear algebra and its applications (Academic Press, 1976).

TEXTS AND MONOGRAPHS

E. K. BLUM, Numerical analysis and computation, theory and practice (Prentice Hall, 1972).
E. BODEWIG, Matrix calculus (North Holland, 1956).
S. D. CONTE and C. DE BOOR, Elementary numerical analysis (McGraw-Hill, 1973).
D. K. FADDEEV and V. N. FADDEEVA, tr. R. C. Williams, Computational methods of linear algebra (Freeman, 1963).
D. K. FADDEEV and I. S. SOMINSKII, tr. J. L. Brenner, Problems in higher algebra (Freeman, 1965).
V. N. FADDEEVA, tr. C. D. Benster, Computational methods of linear algebra (Dover, 1959).
G. E. FORSYTHE and C. B. MOLER, Computer solution of linear algebraic systems (Prentice Hall, 1967).
L. FOX, Introduction to numerical linear algebra (Clarendon Press, 1966).
F. R. GANTMACHER, tr. K. A. Hirsch, Matrix theory, 2 vols. (Chelsea, 1959).
N. GASTINEL, Linear numerical analysis (Academic Press, 1970).
R. T. GREGORY and D. L. KARNEY, A collection of matrices for testing computational algorithms (Wiley, 1969).
R. W. HAMMING, Introduction to applied numerical analysis (McGraw-Hill, 1971).
A. S. HOUSEHOLDER, Matrices in numerical analysis (Dover, 1975).
A. S. HOUSEHOLDER, Lectures on numerical algebra (Math. Assoc. of America, 1972).
E. ISAACSON and H. B. KELLER, Analysis of numerical methods (Wiley, 1966).
P. LANCASTER, Theory of matrices (Academic Press, 1969).
M. MARCUS, Basic theorems in matrix theory (U.S. Government Printing Office, 1960).
M. MARCUS and H. MINC, A survey of matrix theory and matrix inequalities (Allyn and Bacon, 1964).
Modern Computing Methods (H. M. Stationery Office, 1961).
B. NOBLE, Applied linear algebra (Prentice Hall, 1968).
J. M. ORTEGA, Numerical analysis, a second course (Academic Press, 1972).
E. STIEFEL, tr. W. C. and C. J. Rheinboldt, An introduction to numerical mathematics (Academic Press, 1963).
J. STOER, Einführung in die numerische Mathematik, I (Springer, 1972).

J. STOER and R. BULIRSCH, Einführung in die numerische Mathematik, II (Springer, 1973).
JOHN TODD, ed., Survey of numerical analysis (McGraw-Hill, 1962).
JOHN TODD, Chapter 7, Part I of E. U. Condon and H. Odishaw, Handbook of Physics, 2nd ed. (McGraw-Hill, 1967).
R. S. VARGA, Matrix iterative analysis (Prentice Hall, 1962).
B. WENDROFF, Theoretical numerical analysis (Academic Press, 1966).
J. H. WILKINSON, The algebraic eigenvalue problem (Clarendon Press, 1965).
J. H. WILKINSON, Rounding errors in algebraic processes (Prentice Hall, 1963).
J. H. WILKINSON and C. REINSCH, Linear algebra (Springer, 1971).
D. M. YOUNG, Iterative solution of large linear systems (Academic Press, 1971).
D. M. YOUNG and R. T. GREGORY, A survey of numerical mathematics, 2 vols. (Addison-Wesley, 1972-3).

Attention is also invited to many useful expository papers in this area; some appear in Symposia Proceedings. Apart from those written by authors mentioned above, and often incorporated in their books, we mention:

G. GOLUB, Least squares, singular values and matrix approximations, Aplikace Math. 13, 44-51 (1968).
W. KAHAN, Numerical linear algebra, Canadian Math. Bull. 9, 757-801 (1966).
O. TAUSSKY, A recurring theorem, Amer. Math. Monthly 56, 672-676 (1949).
O. TAUSSKY, Bounds for eigenvalues of finite matrices, pp. 279-297, in Survey of Numerical Analysis, ed. J. Todd (1962).
O. TAUSSKY, On the variation of the characteristic roots of a finite matrix, pp. 125-138, in Recent advances in matrix theory (1966).
A. M. TURING, Rounding-off errors in matrix processes, Quart. J. Mech. Appl. Math. 1, 287-308 (1948).
J. VON NEUMANN and H. H. GOLDSTINE, Numerical inverting of matrices of high order, Bull. Amer. Math. Soc. 53, 1021-1099 (1947); Proc. Amer. Math. Soc. 2, 188-202 (1951).

A full bibliography of Numerical Algebra has been prepared by A. S. Householder.

SUPPLEMENTARY REFERENCES

C. L. LAWSON and R. J. HANSON, Solving least squares problems (Prentice Hall, 1974).
A. BEN-ISRAEL and T. N. E. GREVILLE, Generalized inverses: theory and applications (Wiley, 1974).
G. W. STEWART, Introduction to matrix computations (Academic Press, 1973).
A. KORGANOFF and M. PAVEL-PARVU, Méthodes de calcul numérique, 2 (Dunod, 1967).
A. GEWIRTZ, H. SITOMER and A. W. TUCKER, Constructive linear algebra (Prentice Hall, 1974).
F. B. HILDEBRAND, Introduction to numerical analysis (McGraw-Hill, 1974).
L. F. SHAMPINE and R. C. ALLEN, Numerical computing (Saunders, 1973).
H. R. SCHWARZ, H. RUTISHAUSER, E. STIEFEL, Numerical analysis of symmetric matrices (Prentice Hall, 1973).
R. P. BRENT, Algorithms for minimization without derivatives (Prentice Hall, 1973).

Index

Accumulation of inner product 97
ADI method 32, 87
A. C. AITKEN 162
Anti-scalar product 14, 60
Arithmetic-geometric mean inequality 122, 185
Back substitution 30, 31
Band matrix 31
Basis 42, 65, 78, 120
F. L. BAUER 127, 152
Block matrix 32
E. BODEWIG 186
R. P. BRENT 138
P. A. BUSINGER 107
L. CARLITZ 160
CAUCHY-SCHWARZ inequality 123
Change of variable 57, 71, 192
Characteristic values, vectors 53
CHEBYSHEV norm 16
CHOLESKY factorization 34, 39, 107
L. COLLATZ 97
Condition numbers 44
Continuity 21
Convergence: in norm 23; linear, geometric 95; quadratic 96
Convexity 25
Decomposition: LDU or LU or triangular 33; polar 115; singular value 110
Deflation 66, 189
T. J. DEKKER 41, 141
H. H. DENMAN 162
Determinant 36
Diagonal matrix 29
Diagonal similarity 41, 61
Diagonalization 71
Difference equation 78, 156
Differential equation 88, 99
Dominant characteristic value, vector 16
Dominant diagonal 54, 178
Echelon form 42
Eigenvalue - see characteristic value
Eigenvector - see characteristic vector
Elimination: GAUSS 35; GAUSS-JORDAN 37
Equation(s): normal 106; overdetermined 107
I. M. H. ETHERINGTON 47
R. C. W. ETTINGER 162
Expansion: in cofactors 34; in basis 65, 78, 120
Factorization (see Decomposition)
M. FIEDLER 159
Finite differences 88, 100
K. E. FITZGERALD 98
G. E. FORSYTHE 105
J. G. F. FRANCIS 76
F. G. FROBENIUS 55
P. FURTWÄNGLER 61
V. G. GALERKIN 101
N. GASTINEL 40, 51, 110, 160
GAUSS elimination method 35
GAUSS-JORDAN process 37
GAUSS-SEIDEL process 85, 197, 199
WALTER GAUTSCHI 160

Generalized characteristic value problem 63, 102, 201
Generalized inverse - see pseudo-inverse
GERSCHGORIN theorem 34, 88, 177, 199
J. W. GIBBS 110
J. W. GIVENS 71, 158
H. H. GOLDSTINE 47
G. H. GOLUB 107, 110
I. J. GOOD 110
Gradient (or steepest descent) methods 90
GRAM-SCHMIDT process 35, 76, 135
HESSENBERG matrix 41, 142
M. R. HESTENES 210
HILBERT matrix 13, 160
K. HOLLADAY 138
A. S. HOUSEHOLDER 82, 121, 172, 180, 194
Ill-condition 44
Induced norm 19
Inequality: CAUCHY-SCHWARZ 123; HÖLDER 122; KANTOROVICH 64, 95, 185; MINKOWSKI 123
Inverse iteration 67, 191
Inversion 29, 38, 44
Iterative improvement (or refinement) 96, 198, 201
Iterative methods 83
JACOBI method 83, 197, 199
JACOBI rotations 83, 192
W. KAHAN 48, 148
KANTOROVICH inequality 64, 95, 185
KATO bounds 26
D. E. KNUTH 119
V. N. KUBLANOVSKAJA 76
LAGRANGE interpolation 161
LAGRANGE multiplier 121
LAPLACE equation 88
LDU decomposition 34
Least squares 105
Left-inverse 44, 46, 49, 150
Left (or row) characteristic vector 53
LL' (or CHOLESKY) decomposition 34
Lower triangular matrix 33
LR method 76
LU factorization 33, 39, 42
J. F. MAITRE 129
MANHATTAN norm 16, 127
Matrix: band 31; block 32; DEKKER 41, 141; HESSENBERG 41, 142; HILBERT 13, 160; ill-conditioned 44; non-negative 55; partitioned 32; permutation 34, 61; positive 55; positive-definite 56; sparse 86; triangular 30; tridiagonal 30
MINKOWSKI inequality 123
E. H. MOORE 110
MOORE-PENROSE axioms 111
J. MORRIS 49
F. R. MOULTON 49, 150
Multiplication, fast: BRENT 138; STRASSEN-GASTINEL 40, 137; WINOGRAD 138
E. H. NEVILLE 49, 151
M. NEWMAN 164
NEWTON method 82
Norm: compatible 23; euclidean 16, 20; induced 19; MANHATTAN 16, 20; maximum, CHEBYSHEV or sup 16, 19; p-norm 17, 25; SCHUR or FROBENIUS 16, 18
Normal equations 106
I. OLKIN 42
Operation count 29, 37, 39, 41, 72, 87, 96
Orthogonalization 35, 76

Over-determined system 203
Over-relaxation 32, 87
B. N. PARLETT 80
Partitioned matrix 32
Permutation matrix 34, 61
O. PERRON 55
Pivoting (complete, partial) 46
Plane rotation 57, 72
Polynomial (in a matrix) 60
Positive definite matrix 56
Power method 65
Pseudo-inverse 111
QR method 76
Quadratic convergence 96
Quadratic form 14, 56
Rank 114, 146, 206
RAYLEIGH quotient 14, 69, 70, 82, 120, 179
RAYLEIGH-RITZ method 104, 201
Relative error 45
Residual 48
Right-inverse 44, 46, 49, 150
Right (or column) characteristic vector 53
Rotation methods 71
Round-off error 47
H. RUTISHAUSER 49, 76, 151, 172, 194
D. E. RUTHERFORD 158
Scaling 50, 152
S. SCHECHTER 162
F. SCHEID 204
K. W. SCHMIDT 42
I. SCHUR 182
SCHUR complement 138, 180
SCHWARZ (or CAUCHY-SCHWARZ) inequality 123
Semidefinite matrix 56
Singular values 110, 181
Singular vectors 110
Spectral radius 23
Square root method 33
Steepest descent method 90
Straight line fit 108, 109
V. STRASSEN (fast multiplication) 137
Successive over-relaxation 32, 87
G. SZEGŐ 159
O. TAUSSKY 52, 55, 174
G. J. TEE 52, 172
Triangle inequality 16
Triangular decomposition 33
Triangular matrix 30
Tridiagonal or triple diagonal matrix 30
A. M. TURING 34
Unit-circle or sphere 17
Upper triangular matrix 30
VANDERMONDE matrix 13, 105, 160
Vector norm 16
J. VON NEUMANN 47
R. H. WAMPLER 204
WIELANDT inverse iteration 60
J. H. WILKINSON 195
J. WILLIAMSON 63, 181, 199
T. S. WILSON 48
S. WINOGRAD (fast multiplication) 138
D. M. YOUNG, Jr. 32, 87
G. ZIELKE 52, 73
