
Notes on Hermitian Matrices and Vector Spaces

1. Hermitian matrices

Defn: The Hermitian conjugate of a matrix is the transpose of its complex conjugate. So, for example, if

M = \begin{pmatrix} 1 & i \\ 0 & 2 \\ 1-i & 1+i \end{pmatrix} ,

then its Hermitian conjugate M† is

M† = \begin{pmatrix} 1 & 0 & 1+i \\ -i & 2 & 1-i \end{pmatrix} .

In terms of matrix elements, [M†]ij = ([M]ji)* .
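As a quick illustration (an added sketch, not part of the original notes, assuming Python with numpy is available), the Hermitian conjugate is just the complex-conjugate transpose:

```python
import numpy as np

# The 3x2 example matrix M from above.
M = np.array([[1,      1j],
              [0,      2],
              [1 - 1j, 1 + 1j]])

# Hermitian conjugate: complex conjugate of the transpose.
M_dag = M.conj().T

print(M_dag)                            # rows should be (1, 0, 1+i) and (-i, 2, 1-i)
print(np.allclose(M_dag.conj().T, M))   # (M†)† = M, so this prints True
```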

Note that for any matrix (A†)† = A. Thus, the conjugate of the conjugate is the matrix itself.

Defn: A matrix M is said to be Hermitian (or self-adjoint) if it is equal to its own Hermitian conjugate, i.e. M † = M. For example, the following matrices are Hermitian:

\begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} ,   \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{pmatrix} .

Note that a real symmetric matrix (the second example) is a special case of a Hermitian matrix.

Theorem: The Hermitian conjugate of the product of two matrices is the product of their conjugates taken in reverse order, i.e.

(AB)† = B†A† .

Proof:

[LHS]ij = ([AB]ji)* = ( Σ_k [A]jk [B]ki )* = Σ_k ([A]jk)* ([B]ki)*
        = Σ_k ([B]ki)* ([A]jk)* = Σ_k [B†]ik [A†]kj = [B†A†]ij = [RHS]ij .

Exercise: Check this result explicitly for the matrices

A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} ,   B = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} .
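A minimal numerical version of this exercise (an added sketch, assuming numpy; the matrices are the ones above):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]], dtype=complex)
B = np.array([[0, 1j],
              [1j, 0]])

def dagger(X):
    """Hermitian conjugate: complex-conjugate transpose."""
    return X.conj().T

# (AB)† = B†A†  -- note the reversed order.
print(np.allclose(dagger(A @ B), dagger(B) @ dagger(A)))   # True
# The un-reversed product A†B† is generally different.
print(np.allclose(dagger(A @ B), dagger(A) @ dagger(B)))   # False for these matrices
```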

Theorem: Hermitian Matrices have real eigenvalues. Proof

Ax = λx   so   x†Ax = λ x†x .   (1)

Take the complex conjugate of each side:

(x†Ax)† = λ∗(x†x)† .

Now use the last theorem about the product of matrices and the fact that A is Hermitian (A† = A), giving

x†A†x = x†Ax = λ* x†x .   (2)

Subtracting (2) from (1), we have

(λ − λ*) x†x = 0 .

Since x†x ≠ 0, we have λ − λ* = 0, i.e. λ = λ*. So λ is real. (QED)

Theorem: Eigenvectors of Hermitian matrices corresponding to different eigenvalues are orthogonal.

Proof: Suppose x and y are eigenvectors of the Hermitian matrix A corresponding to eigenvalues λ1 and λ2 (where λ1 ≠ λ2). So we have

Ax = λ1x, Ay = λ2y.

Hence y†Ax = λ1y†x, (3)

x†Ay = λ2 x†y .   (4)

Taking the Hermitian conjugate of (4), we have

(x†Ay)† = y†Ax = λ2*(x†y)* = λ2 y†x ,

where we have used the facts that A is Hermitian and that λ2 is real. So we have

y†Ax = λ2y†x. (5)

Subtracting (5) from (3), we have

(λ1 − λ2) y†x = 0 ,

and since we are assuming λ1 ≠ λ2, we must have y†x = 0, i.e. x and y are orthogonal. (QED)
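Both theorems are easy to verify numerically. The following is an added sketch (not from the original notes), using a randomly generated Hermitian matrix and numpy's eigh routine, which is specialised to Hermitian matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Any matrix of the form H = (C + C†)/2 is Hermitian.
C = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (C + C.conj().T) / 2

eigvals, V = np.linalg.eigh(H)

print(eigvals)                                   # real numbers, as the first theorem requires
# Columns of V are eigenvectors; different eigenvalues give orthogonal vectors,
# so V†V should be the identity (eigh also normalises them).
print(np.allclose(V.conj().T @ V, np.eye(4)))    # True
```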

Example

(i) Find the eigenvalues and corresponding eigenvectors of the matrix

A = \begin{pmatrix} -1 & 0 & -2i \\ 0 & 2 & 0 \\ 2i & 0 & -1 \end{pmatrix} .

(ii) Put the eigenvectors X1, X2 and X3 in normalised form (i.e. multiply them by a suitable scalar factor so that they have unit norm or ‘length’, Xi†Xi = 1) and check that they are orthogonal (Xi†Xj = 0, for j ≠ i).

(iii) Form a 3 × 3 matrix P whose columns are {X1, X2, X3} and verify that the matrix

P †AP

is diagonal. Do you recognise the values of the diagonal elements?

Solution

(i) Find the eigenvalues in the usual way from the characteristic equation,

\begin{vmatrix} -1-λ & 0 & -2i \\ 0 & 2-λ & 0 \\ 2i & 0 & -1-λ \end{vmatrix} = 0 .

So,

(-1 - λ)(2 - λ)(-1 - λ) - 2i(-2i(2 - λ)) = 0 ,

giving

(2 - λ)(λ² + 2λ - 3) = 0 ,

which has solutions λ = 1, 2, -3. These are the three eigenvalues of A (all real as expected). To get the eigenvectors, go back to the eigenvalue equation

(A - λI)X = 0 .

Write this out for the three eigenvalues in turn.

For λ1 = 1,

\begin{pmatrix} -2 & 0 & -2i \\ 0 & 1 & 0 \\ 2i & 0 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0 .

This matrix equation corresponds to 3 simple linear equations (of which only 2 are independent). We easily find z/x = i, y = 0. So we can take

X1 = n1 \begin{pmatrix} 1 \\ 0 \\ i \end{pmatrix} ,

where n1 is a normalisation constant.

For λ2 = 2,

\begin{pmatrix} -3 & 0 & -2i \\ 0 & 0 & 0 \\ 2i & 0 & -3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0 .

This gives 3x = -2iz and 2ix = 3z, so x = z = 0 and y is arbitrary. So we can take

X2 = n2 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} .

Finally, for λ3 = -3,

\begin{pmatrix} 2 & 0 & -2i \\ 0 & 5 & 0 \\ 2i & 0 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0 .

From this we deduce x = iz and y = 0, so we can take

X3 = n3 \begin{pmatrix} i \\ 0 \\ 1 \end{pmatrix} .

(ii) The norm or magnitude of X1 is √(X1†X1). Now

X1†X1 = n1² ( 1  0  -i ) \begin{pmatrix} 1 \\ 0 \\ i \end{pmatrix} = n1² (1 + 0 + 1) = 2 n1² ,

so we can take n1 = 1/√2 and get the normalised form of the first eigenvector

X1 = (1/√2) \begin{pmatrix} 1 \\ 0 \\ i \end{pmatrix} .

Similarly we can take

X2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} ,   X3 = (1/√2) \begin{pmatrix} i \\ 0 \\ 1 \end{pmatrix} .

(iii) Form the matrix

P = (1/√2) \begin{pmatrix} 1 & 0 & i \\ 0 & √2 & 0 \\ i & 0 & 1 \end{pmatrix}

from the normalised eigenvectors. Then

P†AP = (1/2) \begin{pmatrix} 1 & 0 & -i \\ 0 & √2 & 0 \\ -i & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & -2i \\ 0 & 2 & 0 \\ 2i & 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 & i \\ 0 & √2 & 0 \\ i & 0 & 1 \end{pmatrix}

     = (1/2) \begin{pmatrix} 1 & 0 & -i \\ 0 & √2 & 0 \\ -i & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & -3i \\ 0 & 2√2 & 0 \\ i & 0 & -3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -3 \end{pmatrix} .

Note that the corresponding eigenvalues appear as the diagonal elements. Note also that

P†P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I .

A matrix satisfying this condition is said to be unitary.
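The whole worked example can be checked numerically; here is an added sketch using the matrix A and the normalised eigenvectors found above (numpy assumed):

```python
import numpy as np

A = np.array([[-1, 0, -2j],
              [ 0, 2,  0 ],
              [2j, 0, -1 ]])

# Normalised eigenvectors from the worked example.
X1 = np.array([1, 0, 1j]) / np.sqrt(2)    # eigenvalue 1
X2 = np.array([0, 1, 0], dtype=complex)   # eigenvalue 2
X3 = np.array([1j, 0, 1]) / np.sqrt(2)    # eigenvalue -3

P = np.column_stack([X1, X2, X3])

print(np.round(P.conj().T @ P, 10))       # identity, so P is unitary
print(np.round(P.conj().T @ A @ P, 10))   # diag(1, 2, -3)
```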

2. Vector spaces

The vectors described above are actually simple examples of more general objects which live in something called a linear vector space (*). Any objects which one can add, multiply by a scalar (number) and get a result of a similar type are also referred to as vectors.

Defn: A set V of objects (vectors) x, y, . . . is said to form a linear vector space if:

(i) The set is closed under addition (usual commutative and associative properties apply), i.e. if x, y ∈ V, then the ‘sum’ x + y ∈ V.

(ii) The set is closed under multiplication by a scalar, i.e. if λ is a scalar and x ∈ V, then λx ∈ V. The usual associative and distributive properties must also apply.

(iii) There exists a null or zero vector 0 such that x + 0 = x.

(iv) Multiplication by unity (λ = 1) leaves the vector unchanged, i.e. 1 × x = x.

(v) Every vector x has a negative vector -x such that x + (-x) = 0.

[Note that, in the above, as is often the case, I have not bothered to denote vectors by bold face type. It is usually clear within the context what is a vector and what is a scalar!]

(*) An excellent Mathematics text book covering all of the tools we need is the book ‘Mathematical Methods for Physics and Engineering’ by Riley, Hobson and Bence, CUP 1997. Chapter 7 contains material on vectors, vector spaces, matrices, eigenvalues etc. Chapter 15 has more on the application to functions which we describe in what follows.

It is immediately obvious that the usual 3-dimensional vectors form a vector space, as do the vectors acted upon by matrices as described above in the examples.

In the context of Quantum Mechanics (see Mandl §1.1), we will assume that wave functions ψ(x) form a vector space in the above sense.

Multiplication of an ordinary vector by a matrix is a linear operation and results in another vector in the same vector space. Thus, if x is a column vector with n elements and A is an n × n (complex) matrix, then y = Ax is also a column vector of n elements. It is also clear that

A(x + y) = Ax + Ay   and   A(λx) = λ(Ax) ,

so that multiplication by a matrix is seen to be a linear operation. A matrix is an example of what, in the general context of vector spaces, is called a linear operator. Turning back to the space of wave functions, we note that the operation of differentiation ∂/∂x is a linear operation:

∂/∂x (ψ1(x) + ψ2(x)) = ∂ψ1/∂x + ∂ψ2/∂x .

Inner products

As with ordinary 3-D geometrical vectors, one can often define an inner or scalar or dot product. For n-dimensional complex vectors we define

⟨y|x⟩ ≡ y†x = Σ_{i=1}^{n} yi* xi .

So the (magnitude)² of x is

⟨x|x⟩ ≡ x†x = Σ_{i=1}^{n} |xi|²

and gives a measure of distance within the vector space. In the space of wave functions, an inner product can be defined by

⟨ψ1|ψ2⟩ ≡ ∫_{-∞}^{∞} ψ1(x)* ψ2(x) dx .   (1)

Thus, the (norm)² or (magnitude)² of ψ is given by

⟨ψ|ψ⟩ = ∫_{-∞}^{∞} |ψ(x)|² dx .

It follows from the definition of the inner product (1) that

⟨ψ1|ψ2⟩ = ⟨ψ2|ψ1⟩* .   (2)

A vector space of this form, with an inner product, is sometimes referred to as a Hilbert space (e.g. Mandl, Chapter 12).
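In practice the integral (1) usually has to be evaluated numerically. A minimal added sketch (the two Gaussian-type functions are just illustrative stand-ins for ψ1 and ψ2, and the grid truncation is an assumption):

```python
import numpy as np

# Truncate the infinite range where the wave functions are negligible.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

psi1 = np.exp(-x**2) * np.exp(1j * x)       # illustrative wave functions
psi2 = np.exp(-(x - 1.0)**2 / 2.0)

def inner(f, g):
    """<f|g> = integral of f*(x) g(x) dx, by a simple Riemann sum."""
    return np.sum(f.conj() * g) * dx

print(inner(psi1, psi2))                     # <psi1|psi2>
print(np.conj(inner(psi2, psi1)))            # <psi2|psi1>*  -- equal, property (2)
print(inner(psi1, psi1).real)                # (norm)^2 of psi1
```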

Properties of Hermitian linear operators

We can now generalise the above Theorems about Hermitian (or self-adjoint) matrices, which act on ordinary vectors, to corresponding statements about Hermitian (or self-adjoint) linear operators which act in a Hilbert space, e.g. the space of wave functions in Quantum Mechanics. For completeness, I rewrite the above Theorems (and Proofs) using the more general notation to describe scalar products. In this context, one often talks about eigenfunctions rather than eigenvectors if the ‘vectors’ happen to be functions or, specifically in our case, wave functions.

Defn: The Hermitian conjugate A† of a linear operator A is defined by

⟨y|Ax⟩* = ⟨x|A†y⟩ .

⟨Ax|y⟩ = ⟨x|A†y⟩ .

This follows using property (2) above of the inner product.

Thus, if A is a Hermitian operator (A† = A), we must have

⟨Ax|y⟩ = ⟨x|Ay⟩ .

Exercise: Check that this definition agrees with that given when A is a complex matrix.
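One way to carry out this check numerically (an added sketch): for a complex matrix, ⟨y|Ax⟩ = y†(Ax), and the complex-conjugate transpose indeed satisfies both forms of the definition above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)

A_dag = A.conj().T                          # matrix Hermitian conjugate

def inner(u, v):
    """<u|v> = u† v for column vectors."""
    return np.vdot(u, v)

# <y|Ax>* = <x|A†y>
print(np.isclose(np.conj(inner(y, A @ x)), inner(x, A_dag @ y)))   # True
# Equivalent form: <Ax|y> = <x|A†y>
print(np.isclose(inner(A @ x, y), inner(x, A_dag @ y)))            # True
```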

Theorem: The Hermitian conjugate of the product of two linear operators is the product of their conjugates taken in reverse order, i.e.

(AB)† = B†A† .

Proof: For any two vectors x and y,

⟨y|ABx⟩ = ⟨(AB)†y|x⟩ .

But also

⟨y|ABx⟩ = ⟨A†y|Bx⟩ = ⟨B†A†y|x⟩ .

So we must have (AB)† = B†A† .

Theorem: Hermitian (linear) operators have real eigenvalues. Proof

Ax = λx   so   ⟨x|Ax⟩ = λ ⟨x|x⟩ .   (1)

Take the complex conjugate:

⟨x|Ax⟩* = λ* ⟨x|x⟩* ,

so, using the definition of Hermitian conjugation (see above),

⟨x|A†x⟩ = λ* ⟨x|x⟩ .

Now A is hermitian, so this means

⟨x|Ax⟩ = λ* ⟨x|x⟩ .   (2)

Subtracting (2) from (1), we have

(λ − λ*) ⟨x|x⟩ = 0 .

Since ⟨x|x⟩ ≠ 0, we have λ − λ* = 0, i.e. λ = λ*. So λ is real.

Theorem: Eigenvectors (eigenfunctions) of Hermitian operators corresponding to different eigenvalues are orthogonal.

Proof: Suppose x and y are eigenvectors (eigenfunctions) of the hermitian operator A corresponding to eigenvalues λ1 and λ2 (where λ1 ≠ λ2). So we have

Ax = λ1x, Ay = λ2y.

Hence

⟨y|Ax⟩ = λ1 ⟨y|x⟩ ,   (3)

⟨x|Ay⟩ = λ2 ⟨x|y⟩ .   (4)

Taking the complex conjugate of (4), we have

⟨x|Ay⟩* = ⟨y|A†x⟩ = λ2* ⟨x|y⟩* = λ2 ⟨y|x⟩ ,

where we have used the facts that ⟨x|y⟩* = ⟨y|x⟩, and that λ2 is real. Then, since A is hermitian, it follows that

⟨y|Ax⟩ = λ2 ⟨y|x⟩ .   (5)

Subtracting (5) from (3), we have

(λ1 − λ2) ⟨y|x⟩ = 0 ,

and since we are assuming λ1 ≠ λ2, we must have ⟨y|x⟩ = 0, i.e. x and y are orthogonal.

Example

The linear operator

K = -d²/dx²

acts on the vector space of functions of the type f(x) (-L ≤ x ≤ L) such that

f(-L) = f(L) = 0 .

(i) Show that K is Hermitian.
(ii) Find its eigenvalues and eigenfunctions.

Solution

(i) We need to check if

⟨g|Kf⟩ = ⟨f|Kg⟩* ,

for any functions f and g in the given space. We have

LHS = ⟨g|Kf⟩ = -∫_{-L}^{L} g* (d²f/dx²) dx = -[ g* (df/dx) ]_{-L}^{L} + ∫_{-L}^{L} (dg*/dx)(df/dx) dx

    = ∫_{-L}^{L} (dg*/dx)(df/dx) dx ,

where we have used integration by parts (the boundary term vanishes because the functions vanish at x = ±L). Similarly

RHS* = ⟨f|Kg⟩ = -∫_{-L}^{L} f* (d²g/dx²) dx = -[ f* (dg/dx) ]_{-L}^{L} + ∫_{-L}^{L} (df*/dx)(dg/dx) dx

     = ∫_{-L}^{L} (df*/dx)(dg/dx) dx

     = LHS* .

Hence we conclude K is Hermitian. (ii) We require to solve the differential equation

Kf = λf ,   i.e.   -d²f/dx² = λf ,

subject to the boundary conditions f(-L) = f(L) = 0. The eigenvalue equation is therefore

d²f/dx² = -λf ,

which has the general solution (simple harmonic oscillator!)

f(x) = A cos(√λx) + B sin(√λx) .

The boundary conditions then imply

f(L) = 0   ⇒   A cos(√λ L) + B sin(√λ L) = 0 ,
f(-L) = 0   ⇒   A cos(√λ L) − B sin(√λ L) = 0 .

There are two types of eigenfunctions.

odd: A = 0 and sin(√λ L) = 0, so

√λL = nπ, (n = 0, 1, 2 ...)

giving eigenvalues and eigenfunctions of K as

λ = n²π²/L² ,   f_n^(o)(x) = Bn sin(nπx/L) ,   (n = 0, 1, 2, ...)

even: B = 0 and cos(√λ L) = 0, so

√λL = (n + 1/2)π, (n = 0, 1, 2 ...)

giving eigenvalues and eigenfunctions of K as

λ = (n + 1/2)²π²/L² ,   f_n^(e)(x) = An cos((n + 1/2)πx/L) ,   (n = 0, 1, 2, ...)

If you are feeling really keen, you could check that all these eigenfunctions are indeed orthogonal, i.e.

⟨f_n^(p)|f_m^(q)⟩ = 0

whenever m ≠ n, for p = o, e and q = o, e.
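Here is a quick numerical version of that check (an added sketch, taking L = 1 and setting the normalisation constants to 1, which does not affect orthogonality):

```python
import numpy as np

L = 1.0
x = np.linspace(-L, L, 2001)
dx = x[1] - x[0]

def f_odd(n):                       # f_n^(o), with B_n = 1
    return np.sin(n * np.pi * x / L)

def f_even(n):                      # f_n^(e), with A_n = 1
    return np.cos((n + 0.5) * np.pi * x / L)

def inner(f, g):
    """<f|g> on [-L, L], approximated by a simple Riemann sum (f, g real here)."""
    return np.sum(f * g) * dx

# Collect a few eigenfunctions of each type and form their Gram matrix.
funcs = [f_odd(n) for n in range(1, 4)] + [f_even(n) for n in range(0, 3)]
gram = np.array([[inner(f, g) for g in funcs] for f in funcs])
print(np.round(gram, 4))   # off-diagonal entries vanish: the eigenfunctions are orthogonal
```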

Basis vectors

In dealing with any vector space V, it is very convenient to choose a set of basis vectors in terms of which any of the vectors in V can be written. The number of these needed is called the dimension of the vector space V (the maximum number of linearly independent vectors in V). (Recall the discussion of such things in MATH102!) For example, in ordinary 3D (Euclidean) space, we are all used to adopting i, j and k as basis vectors. There are, of course, 3 of them and they are clearly linearly independent. Actually, they are orthogonal and unit (≡ orthonormal). We don’t have to use these as a basis. They don’t even have to be orthogonal! For example we could change basis to

e1 = i + j ,   e2 = i − j ,   e3 = i + j + k .

The important thing about a basis is that an arbitrary vector can always be written as a linear combination of the basis vectors. Thus if ψ ∈ V and {φi, i = 1, 2, ..., N} is a basis of V, we can write

ψ = Σ_{i=1}^{N} ci φi .

It is often convenient to choose the basis vectors to be the eigenvectors of some linear operator which acts on the space. In Quantum Mechanics, we will do this all the time!

Example

The matrix

A = \begin{pmatrix} -1 & 0 & -2i \\ 0 & 2 & 0 \\ 2i & 0 & -1 \end{pmatrix}

(see the earlier example) acts on the 3D space of column vectors of the form

\begin{pmatrix} x \\ y \\ z \end{pmatrix} .

Construct a basis for this 3D space consisting of the eigenvectors of A. How do you know the 3 vectors are linearly independent? Express the vector

v = \begin{pmatrix} a \\ b \\ c \end{pmatrix}

in terms of these basis vectors.

Solution

See the earlier example for the extraction of the eigenvectors of A:

X1 = (1/√2) \begin{pmatrix} 1 \\ 0 \\ i \end{pmatrix} ,   X2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} ,   X3 = (1/√2) \begin{pmatrix} i \\ 0 \\ 1 \end{pmatrix} .

Since there are 3 of them and they are linearly independent, they form a basis. We know they are linearly independent because they are orthogonal (correspond to different eigenvalues). Now let

v = \begin{pmatrix} a \\ b \\ c \end{pmatrix} = Σ_{i=1}^{3} ci Xi = (c1/√2) \begin{pmatrix} 1 \\ 0 \\ i \end{pmatrix} + c2 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + (c3/√2) \begin{pmatrix} i \\ 0 \\ 1 \end{pmatrix} .

There are two ways to solve this:

(a) Equate components on each side of this vector equation, so finding 3 linear simultaneous equations in c1, c2 and c3. Solve for these.

(b) Use the fact that the Xi are orthonormal to show that

cj = Xj†v .

Try (a) as an exercise. Now look at method (b) in more detail. We want

v = Σ_{i=1}^{3} ci Xi ,

where

Xj†Xi = δij = { 1 if i = j ;  0 if i ≠ j } .

Left multiply both sides of the equation for v by Xj†:

Xj†v = Σ_{i=1}^{3} ci Xj†Xi = Σ_{i=1}^{3} ci δij = cj .

Thus we can calculate, for example

c1 = X1†v = (1/√2) ( 1  0  -i ) \begin{pmatrix} a \\ b \\ c \end{pmatrix} = (1/√2)(a - ic) .

Similarly, we find

c2 = b ,   c3 = (1/√2)(c - ia) .
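Method (b) is easy to verify numerically (an added sketch; the values of a, b and c are just illustrative):

```python
import numpy as np

# Orthonormal eigenvector basis from the example.
X1 = np.array([1, 0, 1j]) / np.sqrt(2)
X2 = np.array([0, 1, 0], dtype=complex)
X3 = np.array([1j, 0, 1]) / np.sqrt(2)

a, b, c = 1.0, 2.0, 3.0                 # illustrative components of v
v = np.array([a, b, c], dtype=complex)

# c_j = Xj† v
c1, c2, c3 = np.vdot(X1, v), np.vdot(X2, v), np.vdot(X3, v)

print(np.isclose(c1, (a - 1j * c) / np.sqrt(2)))    # True
print(np.isclose(c2, b))                             # True
print(np.isclose(c3, (c - 1j * a) / np.sqrt(2)))    # True

# The expansion reproduces v.
print(np.allclose(c1 * X1 + c2 * X2 + c3 * X3, v))   # True
```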
