
Lecture 8: Vectors and Matrices III – Applications of Vectors and Matrices (See Sections 3.2, 3.3 and 3.8 to 3.12 in Boas)

As noted in the previous lecture, the notation of vectors and matrices is useful not only for the study of linear transformations but also for the (related) problem of solving systems of simultaneous linear equations. Consider N simultaneous linear equations for N unknowns, $x_k$, $k = 1, \ldots, N$,

A111 x A 122 x   A 111NNNN x  A 1 x  X 1       . (8.1)

AN1 x 1 A N 2 x 2   A NN 1 x N 1  A NN x N  X N

Clearly we are encouraged to think in terms of an NxN matrix representation for the coefficients, $A_{kl}$, an N-D vector of unknowns, $x_k$, and an N-D vector of known inputs, $X_l$. Then, using the notation we have developed (to simplify our lives), we can write

$$A\vec{x} = \vec{X}. \qquad (8.2)$$

As long as the matrix A is nonsingular, $\det A \neq 0$, we can find its inverse and solve the simultaneous equations in matrix form ($C^A$ is the cofactor matrix for A as defined in Eq. (7.29) in the previous lecture),

T  NNNCA  11  klXC l lk x A X  xk  A kl X l   X l   l1 l  1det AA l  1 det  det A  (8.3) x k . k det A

In the last expression the matrix $A_k$ is the matrix A with the kth column replaced by the column vector $\vec{X}$,

$$A_k = \begin{pmatrix}
A_{11} & \cdots & A_{1\,k-1} & X_1 & A_{1\,k+1} & \cdots & A_{1N} \\
\vdots &        & \vdots     & \vdots & \vdots  &        & \vdots \\
A_{N1} & \cdots & A_{N\,k-1} & X_N & A_{N\,k+1} & \cdots & A_{NN}
\end{pmatrix}. \qquad (8.4)$$

We recognize that Eqs. (8.3) and (8.4) define Cramer’s rule (see Section 3.3 in Boas).
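To see Cramer’s rule in action numerically, here is a minimal sketch in Python/NumPy; the function name `cramer_solve` and the small 2x2 test system are illustrative choices, not part of the lecture.

```python
import numpy as np

def cramer_solve(A, X):
    """Solve A x = X via Cramer's rule, Eqs. (8.3)-(8.4): x_k = det(A_k)/det(A)."""
    A = np.asarray(A, dtype=float)
    X = np.asarray(X, dtype=float)
    detA = np.linalg.det(A)
    if np.isclose(detA, 0.0):
        raise ValueError("det A = 0: A is singular, Cramer's rule does not apply")
    x = np.empty(X.size)
    for k in range(X.size):
        Ak = A.copy()
        Ak[:, k] = X                      # replace the kth column by the known vector X
        x[k] = np.linalg.det(Ak) / detA   # Eq. (8.3)
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
X = np.array([3.0, 5.0])
print(cramer_solve(A, X))      # [0.8 1.4]
print(np.linalg.solve(A, X))   # same answer from a direct solve
```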

Next let us consider the more general problem where Eq. (8.2) corresponds to M equations for N unknowns. The vector $\vec{X}$ now has M components, the vector $\vec{x}$ has N components, and the matrix A is MxN (M rows and N columns). The question to ask first is: when does this system of equations have a solution? This corresponds to determining how many linearly independent equations are present in this revised version of Eq. (8.2). To answer this question we perform the process known as row reduction (i.e., add/subtract multiples of rows to/from other rows and interchange rows; see Boas 3.2) to find a new version of the matrix A with as many zero elements as possible in the lower left-hand corner (“below the pivots”). This version of A is called the row-reduced (echelon) form. Having performed this process (to the maximum extent possible) we count the number of rows in the reduced matrix (still an MxN matrix) with nonzero elements. Call this number $M'$ and label it the rank of the matrix A, where clearly $M' \le M$. The rank counts the number of linearly independent rows in A and thus the number of linearly independent equations in the original system of equations.
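In practice the row reduction and rank counting can be delegated to a computer algebra system; a minimal sketch using SymPy follows (the 3x2 matrix is an arbitrary illustration in which the third row is the sum of the first two).

```python
import sympy as sp

# An illustrative 3x2 coefficient matrix: the third row is the sum of the
# first two, so only two of the three equations are independent.
A = sp.Matrix([[1, 2],
               [0, 1],
               [1, 3]])

reduced, pivots = A.rref()   # row-reduced (echelon) form and pivot columns
print(reduced)               # Matrix([[1, 0], [0, 1], [0, 0]])
print(A.rank())              # 2 = M', the number of independent rows
```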

ASIDE: We can apply this same idea of linear independence to a general set of functions (not just linear functions). Consider the determinant of the matrix defined in terms of N functions, $f_k\left(x\right)$, $k = 1, \ldots, N$, and their derivatives (with respect to x)

f12 x f x  fN  x f x f  x  f   x Wx   12 N .     (8.5) NNN1   1   1 f12 f fN

This determinant is called the Wronskian of the functions. At values of x where $W\left(x\right) \neq 0$, the functions are linearly independent in the sense that (recall our discussion of linearly independent vectors)

N ck f k x 0  c k  0, k  1, , N . (8.6) k1

No linear combination of the functions vanishes except the trivial one (all zero coefficients) in the range of x where $W\left(x\right) \neq 0$.
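A quick way to experiment with this criterion is SymPy’s built-in `wronskian` helper; the functions below are arbitrary illustrative choices.

```python
import sympy as sp

x = sp.symbols('x')

# W(x) for a linearly independent set: 1, x, x**2.
W1 = sp.wronskian([1, x, x**2], x)
print(sp.simplify(W1))        # 2  (nonzero everywhere -> independent)

# W(x) for a linearly dependent set: sin(x)**2, cos(x)**2, 1.
W2 = sp.wronskian([sp.sin(x)**2, sp.cos(x)**2, sp.Integer(1)], x)
print(sp.simplify(W2))        # 0  (vanishes identically -> dependent)
```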

Returning to the problem of simultaneous equations we define a new Mx(N+1) matrix, the augmented version of A, which has the column vector $\vec{X}$ as its (N+1)th column,

AAX11 1N 1  A      . Aug  (8.7)  AAXM1  MN M

Next we row reduce this matrix to determine its rank. Call this last quantity $M'_{\rm Aug}$ and note that it cannot be smaller than $M'$, $M \ge M'_{\rm Aug} \ge M'$, since every row that can be reduced to zeros in $A_{\rm Aug}$ can certainly be reduced to zeros in A. Now we can list the various possibilities.

1) If MM Aug (the rank of A is smaller than the rank of ), the equations are inconsistent with each other and there are no solutions to the set of simultaneous equations.

2) If MMNMAug   , we can eliminate MM  non-independent equations (rows) from the system of equations and arrive at the NxN problem we discussed at the beginning of the lecture. The resulting NxN matrix of

coefficients we label ARed , the “reduced” matrix. Since we have eliminated all the rows that have only zero elements, we are guaranteed that the determinant -1 of the reduced matrix, , is nonzero and that an inverse matrix, ARed , exists. Hence we can use this inverse, i.e., Cramer’s rule, to find a unique solution to the system of equations, i.e., a unique value for the N-D column vector   -1 x ARed X .

3) If MMNAug , the reduced problem has fewer equations than there are unknowns, i.e., there are not enough independent equations to determine all of the unknowns. Rather the solution to this system corresponds to solving for M  of the unknowns in terms of the other NM  unknowns (and the  original coefficients and the known vector X ).

As an example consider the equations

$$\begin{aligned}
x + y - 2z &= 7 \\
2x - y - 4z &= 2 \\
5x - 4y - 10z &= -1 \\
3x - y - 6z &= 5
\end{aligned}
\qquad M = 4,\; N = 3, \qquad
\vec{X} = \begin{pmatrix} 7 \\ 2 \\ -1 \\ 5 \end{pmatrix}, \quad
A = \begin{pmatrix} 1 & 1 & -2 \\ 2 & -1 & -4 \\ 5 & -4 & -10 \\ 3 & -1 & -6 \end{pmatrix}. \qquad (8.8)$$

Proceeding to row reduce we have

1 1 2   1 1 2      0 3 0 0 3 0 A     . RRRR2 2 10 9 0  3 3 2  0 0 0  (8.9) RRRR3 5 1  4 4 3 2   RR4 3 1 0 4 0   0 0 0 

Hence we see that the rank of A is $M' = 2$. Proceeding with the augmented matrix we have

1 1 2 7   1 1 2 7      2 1  4 2 0  3 0  12 A      Aug 5 4 10 1 RR2 2 1  0 9 0 36   RR3 5 1   3 1  6 5 RR4 3 1  0  4 0  16  1 1 2 7 (8.10)  0 3 0 12 . RR3 3 2 0 0 0 0 RR4 4 3 2  0 0 0 0

Thus we have MMNAug 23   and the problem is under-constrained. We can solve for 2 unknowns in terms of the third. From the revised equations we have

x y 2 z  7 x  7  y  2 z  3  2 z  . (8.11) 3yy   12  4

The unknown z remains a free variable. The choice of z instead of x to play this role is, of course, arbitrary.
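The whole example can be reproduced in a few lines of SymPy, assuming the signs of Eq. (8.8) as reconstructed above; the ranks come out as 2 and the solver returns the same one-parameter family as Eq. (8.11).

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
A = sp.Matrix([[1,  1,  -2],
               [2, -1,  -4],
               [5, -4, -10],
               [3, -1,  -6]])
X = sp.Matrix([7, 2, -1, 5])

print(A.rank(), A.row_join(X).rank())   # 2 2  -> M' = M'_Aug = 2 < N = 3
print(sp.linsolve((A, X), x, y, z))     # {(2*z + 3, 4, z)}: x = 3 + 2z, y = 4, z free
```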

Now return to Eq. (8.2) and think about it in a slightly different way. What if the right-hand side ($\vec{X}$) is proportional to the unknown vector $\vec{x}$, i.e., can we find a vector such that after we perform the linear transformation represented by the matrix A we end up back at $\vec{x}$ times a constant? As we will see, the problem of finding vectors that are invariant (except for an overall constant) under a given transformation is extremely important/useful in physics (here we move to Section 3.11 in Boas, or Chapter 10 in the previous edition of Boas). We want to consider the following new version of Eq. (8.2),

$$A\vec{x} = \lambda\vec{x} \;\Rightarrow\; \left(A - \lambda 1\right)\vec{x} = 0. \qquad (8.12)$$

Note that this version is a homogeneous equation. If the matrix $\left(A - \lambda 1\right)$ is nonsingular and $\left(A - \lambda 1\right)^{-1}$ exists, we know that the above equation, Eq. (8.12), has only the trivial solution

$$\vec{x} = \left(A - \lambda 1\right)^{-1} 0 = 0. \qquad (8.13)$$

Thus, to obtain interesting solutions (and we want to do that), we require that the inverse not exist or that

detA1 0. (8.14)

This condition, the eigenvalue condition (also called the characteristic equation for A), is an Nth order equation for the parameter $\lambda$ generally having N distinct solutions, which may be complex. Note that the eigenvalues are a property of the transformation represented by A, and not of the vectors being transformed. On the other hand, for each specific eigenvalue $\lambda_k$ there is a corresponding vector, the eigenvector $\vec{v}_k$, which satisfies
$$A\vec{v}_k = \lambda_k\vec{v}_k. \qquad (8.15)$$

Since this is a homogeneous equation, the length of $\vec{v}_k$ is not specified (if $\vec{v}_k$ is a solution so is $c\vec{v}_k$, where c is a constant) and typically we choose to make it a unit vector (to simplify the analysis – the lazy but smart argument), $\vec{v}_k \cdot \vec{v}_k = \vec{v}_k^{\,T}\vec{v}_k = 1$. (Another way to see this point is to note from Eq. (8.14) that only N−1 of the N equations in (8.15) are linearly independent. You should see this explicitly as you solve (8.15).) For complex vectors the normalization choice is typically $\vec{v}_k^{\,\dagger}\vec{v}_k = \vec{v}_k^{\,*} \cdot \vec{v}_k = 1$, which leaves an overall phase undetermined (i.e., this constraint is invariant under $\vec{v}_k \to e^{i\phi}\vec{v}_k$).
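Numerically, the eigenvalue condition (8.14) and the eigenvector equation (8.15) are solved in one step by standard library routines; a minimal NumPy sketch (the 2x2 matrix is an arbitrary illustration):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eig(A)     # eigenvalues lambda_k and eigenvectors (columns of V)
print(lam)                    # eigenvalues 3 and 1 (order not guaranteed)
for k in range(len(lam)):
    v = V[:, k]
    print(np.allclose(A @ v, lam[k] * v),   # Eq. (8.15): A v_k = lambda_k v_k
          np.isclose(v @ v, 1.0))           # eig returns unit-normalized eigenvectors
```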

We can derive a few important properties of eigenvalues and eigenvectors by considering a pair of distinct eigenvalues, $\lambda_k, \lambda_l$, and the corresponding eigenvectors, $\vec{v}_k, \vec{v}_l$, for a general complex matrix A. By definition they satisfy
$$A\vec{v}_k = \lambda_k\vec{v}_k, \qquad A\vec{v}_l = \lambda_l\vec{v}_l,$$
$$\vec{v}_l^{\,\dagger}A\vec{v}_k = \lambda_k\vec{v}_l^{\,\dagger}\vec{v}_k, \qquad \vec{v}_k^{\,\dagger}A\vec{v}_l = \lambda_l\vec{v}_k^{\,\dagger}\vec{v}_l,$$
$$\left(\vec{v}_l^{\,\dagger}A\vec{v}_k\right)^{\dagger} = \vec{v}_k^{\,\dagger}A^{\dagger}\vec{v}_l = \lambda_k^{*}\vec{v}_k^{\,\dagger}\vec{v}_l. \qquad (8.16)$$

Of special interest is the case when the matrix A is Hermitian, $A = A^{\dagger}$ (symmetric if A is real), which is typically true in physics applications, especially in Quantum Mechanics. For Hermitian matrices the difference between the second expression on the second line and the last expression in Eq. (8.16) vanishes. Thus for Hermitian A we have

††††††        vAvvAvvAAvk l k l  k  l 0  l  k vv k l †  k l, vk v k  0  k  k  0 (8.17)    † .  l k 00 vv k l 

The first result tells us that the eigenvalues for a Hermitian matrix are real, while the second result says that the eigenvectors for distinct eigenvalues (for a Hermitian matrix) are orthogonal. These results will be very important in your studies in Quantum Mechanics. They also suggest a general way to find orthogonal vectors to use as basis vectors: find the eigenvectors of an appropriate Hermitian matrix (transformation).
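Both Hermitian-matrix results are easy to verify numerically with NumPy’s `eigh` routine; the 3x3 complex Hermitian matrix below is an arbitrary illustration.

```python
import numpy as np

# A 3x3 complex Hermitian matrix: A = A^dagger.
A = np.array([[2.0,        1.0 + 1.0j, 0.0],
              [1.0 - 1.0j, 3.0,        2.0j],
              [0.0,       -2.0j,       1.0]])
assert np.allclose(A, A.conj().T)

lam, V = np.linalg.eigh(A)                      # eigh is specialized to Hermitian matrices
print(np.allclose(lam.imag, 0.0))               # True: the eigenvalues are real
print(np.allclose(V.conj().T @ V, np.eye(3)))   # True: the eigenvectors are orthonormal
```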

It should be clear that our analysis would be simplified if we choose to have our basis vectors aligned with the eigenvectors so that the components of the eigenvectors are of the form $\vec{v}_k = \left(0, 0, \ldots, 1, \ldots, 0\right)$ (with the 1 in the kth place) and the matrix A (in this basis) is diagonal with the eigenvalues along the diagonal. We turn now to achieving this goal.

Assuming that we have found the eigenvalues and their eigenvectors in an arbitrary basis, we construct the following so-called “modal” matrix (not the cofactor matrix)

v12  v   vN  11 1 v  v   v  C  1222 N 2 ,     (8.18)  v v v  12NN   N N i.e., each column of the matrix C is composed of the components of one of the eigenvectors in the initial, arbitrary basis. Next consider the result of multiplying on the left by A,

NN AC A C  A v   v ,  kl km ml km lmk l l  mm11

1v 1  2 v 2   NN v  11 1 v  v    v  (8.19) AC  1 122 2 2 NN2 .      v  v  v 1 1NN 2 2  NN N

If we next define the (desired) diagonal matrix D with the eigenvalues along the diagonal,

1 00 00  D  2 ,     (8.20)  00 N it follows that

$$CD = \begin{pmatrix}
\lambda_1\left(v_1\right)_1 & \lambda_2\left(v_2\right)_1 & \cdots & \lambda_N\left(v_N\right)_1 \\
\lambda_1\left(v_1\right)_2 & \lambda_2\left(v_2\right)_2 & \cdots & \lambda_N\left(v_N\right)_2 \\
\vdots & \vdots & & \vdots \\
\lambda_1\left(v_1\right)_N & \lambda_2\left(v_2\right)_N & \cdots & \lambda_N\left(v_N\right)_N
\end{pmatrix} = AC. \qquad (8.21)$$

Combining Eqs. (8.19) and (8.21) while assuming that C is nonsingular (i.e., that the eigenvectors are all independent, and they will be for distinct eigenvalues, and can be made so by hand even if there are degeneracies in the eigenvalues) we can define the inverse of C and arrive at the important relation

1 CD AC  C AC  D. (8.22)

Note that the general transformation of a matrix M defined by

N C1 MC C  1 MC C  1 M C  kl  km mn nl (8.23) mn,1 is called a similarity transformation, where we can think of this transformation as operating separately on both indices of M, C 1 from the left and C from the right.    The corresponding transformation of vectors is x x C1 x . The specific similarity transformation defined by the matrix C in Eq. (8.18) takes us to the basis system where A is diagonal.

Note that in the definition of C the choice of the normalization of the eigenvectors plays no special role. For any choice of the normalization of the eigenvectors the matrix C will still provide a similarity transformation that diagonalizes the problem. However, for general normalization, this transformation will not be just a rotation.

Example 1: Consider the 2x2 matrix $A = \begin{pmatrix} 6 & 3 \\ -2 & 1 \end{pmatrix}$, which is real but not symmetric and thus not Hermitian ($A^{\dagger} = A^{T} \neq A$). Hence we do not expect the eigenvectors of this matrix to be orthogonal to each other. We can find the eigenvalues directly from

63  2  7   12  0    3   4 . (8.24) 21


For a 2-D matrix we can also use the general results that

N 11    TrA  Tr1 A  Tr CC A   Tr  C AC   Tr D  k , k1 detA det1 A  det CC11 A   det  C AC          (8.25) N detD k , k1 to solve for the eigenvalues.

ASIDE: The interested reader should prove that these results are true in any number of dimensions. It follows from simply writing out the definitions in terms of components and noting the cyclic symmetry. We have, for example,

NNN 1  1  1 Tr C AC Ckl A lm C mk  C mk C kl A lm   lm A lm k, l , m 1 k , l , m  1 l , m  1 NN (8.26) AADll Tr[ ]  Tr[ ]   l . ll11

For the 2x2 matrix A above these relations are sufficient to write

TrA  7 12  1  3   . (8.27) detA  12 12  2  4

The corresponding (normalized) eigenvectors are found from

x1 1 1 13  6  3 x 1  3 y 1  0    1  v 1   , y1 2 1 (8.28) x2 313 24  6  4 x 2  3 y 2  0     v 2   , y2 2 13 2

where we have chosen to make them unit vectors. As noted earlier, since this is a 2-D problem there is really a second equation for each eigenvector, which is NOT linearly independent from the first in (8.28). Explicitly the second row of the matrix gives us

x1 1 1 13 2 x 1  1 3 y 1  0  1 v 1  , y1 2 1 (8.29) x2 313 24 2 x 2  1 4 y 2  0  v 2  , y2 2 13 2 i.e., as expected, the same result as the first row. There is no independent information here.

Also note that $\vec{v}_1 \cdot \vec{v}_2 = \left(3+2\right)/\sqrt{26} \neq 0$. As expected the eigenvectors are not orthogonal, since A is not Hermitian. From the earlier discussion the similarity transformation defined by

13 23  2 13 1 13 13 CC, 26  (8.30) 12 11   2 13 22 will diagonalize the matrix A. Note that for a 2x2 matrix finding the inverse matrix from the general formula in terms of the cofactor matrix Ckl ,

$$\left(A^{-1}\right)_{kl} = \frac{C_{lk}}{\det A}, \qquad (8.31)$$
is particularly easy. Recall that the cofactor is given by $\left(-1\right)^{k+l}$ times the “minor”, the determinant of the matrix with the current row and column removed. For a 2x2 matrix the minor is just the element diagonally opposite. Thus we find

$$\left(A^{-1}\right)_{11} = \frac{A_{22}}{\det A}, \quad \left(A^{-1}\right)_{22} = \frac{A_{11}}{\det A}, \quad \left(A^{-1}\right)_{12} = -\frac{A_{12}}{\det A}, \quad \left(A^{-1}\right)_{21} = -\frac{A_{21}}{\det A}. \qquad (8.32)$$
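As a minimal sketch of Eqs. (8.31) and (8.32) for the 2x2 case (the helper name `inverse_2x2` and the test matrix are illustrative assumptions), compared against NumPy’s general-purpose inverse:

```python
import numpy as np

def inverse_2x2(A):
    """2x2 inverse from cofactors, Eq. (8.32): swap the diagonal, negate the off-diagonal."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    return np.array([[ d, -b],
                     [-c,  a]]) / det

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])                              # arbitrary illustrative matrix
print(inverse_2x2(A))                                   # [[ 3. -1.] [-5.  2.]]
print(np.allclose(inverse_2x2(A), np.linalg.inv(A)))    # True
```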

It is straightforward to confirm that for this definition of C we have

$C^{-1}AC = \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix}$. On the other hand, as pointed out above, the normalization and the phase of the eigenvectors are not an issue here. We could as well choose

1 3 1  2 3  CC ,,   (8.33) 1 2   1 1  or, with a different phase,

1 3 1   2  3  CC ,,    (8.34) 1 2    1  1  or even

1 1 3 1  2 3  CC ,  26   . (8.35) 26 1 2   1 1 

All 4 of the implied similarity transformations will result in the same diagonal matrix.
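This normalization independence is easy to confirm numerically; the sketch below applies both the unit-normalized C of Eq. (8.30) and the un-normalized choice of Eq. (8.33) to the Example 1 matrix, with the entries as reconstructed above.

```python
import numpy as np

A = np.array([[6.0, 3.0],
              [-2.0, 1.0]])

# Columns are the eigenvectors (1,-1) and (3,-2), normalized and un-normalized.
C_norm = np.array([[ 1/np.sqrt(2),  3/np.sqrt(13)],
                   [-1/np.sqrt(2), -2/np.sqrt(13)]])
C_raw  = np.array([[ 1.0,  3.0],
                   [-1.0, -2.0]])

for C in (C_norm, C_raw):
    print(np.round(np.linalg.inv(C) @ A @ C, 10))   # both give diag(3, 4)
```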

For later reference we note that

1 detCC  , det1 26; 26  1 detCC  1 det ; 1 (8.36) detCC   1  det  ; 1 detCC  , det 1 26. 26 

Example 2: Let us next consider a matrix that is Hermitian, $A = \begin{pmatrix} 10 & 3 \\ 3 & 2 \end{pmatrix}$. Again it is straightforward to find

TrA  12 12   1 1   . (8.37) detA  11 12  2 11

Here the eigenvectors are

1 1 11  10  1 x 1  3 y 1  0  v 1   , 10 3 (8.38) 1 3 211  10  11 x 2  3 y 2  0  v 2   . 10 1

We see, as expected for a Hermitian matrix, that the eigenvectors are orthogonal, $\vec{v}_1 \cdot \vec{v}_2 = \vec{v}_1^{\,T}\vec{v}_2 = \left(3 - 3\right)/10 = 0$. Thus a possible similarity transformation to the diagonal form is provided by

$$C = \frac{1}{\sqrt{10}}\begin{pmatrix} 1 & 3 \\ -3 & 1 \end{pmatrix}, \quad C^{-1} = \frac{1}{\sqrt{10}}\begin{pmatrix} 1 & -3 \\ 3 & 1 \end{pmatrix} = C^{T}, \qquad (8.39)$$
where, in the initial step of finding the inverse, Eq. (8.32) above was used. Another possible choice, differing by a phase, is

111 3 1  1 3  T CCC ,.       (8.40) 103 1  10  3 1 

In these two cases we have

$$\det C = +1 = \det C^{-1}; \qquad \det\tilde{C} = -1 = \det\tilde{C}^{-1}. \qquad (8.41)$$
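For Example 2 the whole chain can be checked in a few lines (using the entries as reconstructed above): the modal matrix is orthogonal and the similarity transformation produces diag(1, 11).

```python
import numpy as np

A = np.array([[10.0, 3.0],
              [3.0, 2.0]])

C = np.array([[1.0, 3.0],
              [-3.0, 1.0]]) / np.sqrt(10)   # columns: eigenvectors for lambda = 1, 11

print(np.allclose(C.T @ C, np.eye(2)))      # True: C is orthogonal, C^{-1} = C^T
print(np.round(C.T @ A @ C, 10))            # diag(1, 11)
print(np.linalg.det(C))                     # +1 (a pure rotation)
```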

For the case of a Hermitian matrix with orthogonal eigenvectors, we can interpret the similarity transformation that takes us to the diagonal basis as an orthogonal transformation. Orthogonal transformations, which (as we have discussed) include both rotations and reflections, have the defining feature that they preserve scalar products. In particular the lengths of vectors are unchanged by such transformations, and pairs of vectors that are initially orthogonal (i.e., have vanishing scalar products) remain orthogonal after the transformation. These are, of course, properties that we expect to be true of rotations (and reflections). If we define the transformation in terms of components by $r'_k = M_{kl}r_l$ (with an implied sum over l), we have

$$\vec{r}^{\,\prime T}\vec{r}^{\,\prime} = \left(M\vec{r}\right)^{T}\left(M\vec{r}\right) = \vec{r}^{\,T}M^{T}M\vec{r} = \vec{r}^{\,T}\vec{r} \;\Rightarrow\; M^{T}M = MM^{T} = 1 \;\Leftrightarrow\; M^{T} = M^{-1}. \qquad (8.42)$$

An orthogonal transformation is described by an orthogonal matrix, which in turn is defined by $M^{-1} = M^{T}$. Looking above we see that the matrices C and $\tilde{C}$ in Example 2 (the Hermitian starting matrix) have this property, $C^{-1} = C^{T}$. On the other hand, none of the various C matrices in Example 1, where the starting matrix is not Hermitian, have this property. It further follows from the definition that, for orthogonal matrices, we have

1 T detMMMM  1 det   det 

2 T detMMM  det det  (8.43) detM    1.

The +1 case corresponds to true rotations. For example, a rotation in the x-y plane by $\theta$ radians in the positive (counter-clockwise) direction corresponds to the matrix

$$M\left(\theta\right) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad
\begin{pmatrix} x' \\ y' \end{pmatrix} = M\left(\theta\right)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}, \qquad (8.44)$$
and $\det M\left(\theta\right) = +1$, independent of the specific value of $\theta$. In this case we are thinking in terms of rotating the basis vectors while the vector $\vec{r}$, with components $\left(x, y\right)$ and $\left(x', y'\right)$ in the 2 basis sets, remains fixed as suggested in the figure. (Note that, using the results of the Extra Credit problem about the Pauli matrices, we can write this 2x2 rotation in terms of the Pauli matrices as $M\left(\theta\right) = e^{-i\sigma_2\theta}$.) On the other hand, a reflection through the x-axis changes the sign of all y-components and is represented by

$$R_y = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad (8.45)$$

 with detRy  1. In general, an orthogonal transformation that includes both a rotation and a reflection, e.g., the product of M   and Ry , also has determinant equal to -1.

ASIDE: In the language of group theory (see Section 3.13 and Lecture 10), orthogonal transformations (rotations and reflections) in N dimensions correspond to the group O(N), while the special orthogonal group SO(N) includes only transformations with determinant = +1, true rotations.

Going back to Example 2, we see that a Hermitian matrix is diagonalized by an orthogonal matrix. Further, the arbitrary choice of the phase of the eigenvectors, and thus of the columns of C, determines whether this transformation is a true rotation or includes also a reflection. For example, the transformation corresponding to C is a pure rotation, $\cos\theta = 1/\sqrt{10}$, while the transformation corresponding to $\tilde{C}$ includes a reflection, $\tilde{C} = CR_y$. To determine the sign of the angle we should carefully write out the formulae. In going from the original frame to the rotated (diagonal) frame we have

$$C^{-1}AC = D, \qquad \vec{r} \to \vec{r}^{\,\prime} = C^{-1}\vec{r} \;\Leftrightarrow\; \vec{r} = C\vec{r}^{\,\prime}, \qquad M\left(\theta\right) = C^{-1}. \qquad (8.46)$$

Thus we see that the corresponding rotation has $\sin\theta = 3/\sqrt{10} \;\Rightarrow\; \theta \simeq 1.25$ radians.

ASIDE: In general we can think either in terms of rotating the unit basis vectors, as here, with the actual physics vectors fixed, or in terms of rotating the “physics” vectors with the basis vectors fixed, where the rotation is in the opposite direction. The former is often called a “passive” transformation, while the latter is an “active” transformation.

In the next lecture we will see how these matrix eigenvalue/eigenvector methods can be employed to analyze coupled harmonic oscillators. Coupled harmonic oscillators provide a remarkably accurate description of many complex mechanical systems near equilibrium.
