Lie Algebras in Particle Physics 2020/2021 (NWI-NM101B)¹

Timothy Budd Version: May 5, 2021

Contents

1 Introduction and basic definitions

2 Finite groups

3 Introduction to Lie groups and Lie algebras

4 Irreducible representations of su(2)

5 The adjoint representation

6 Root systems, simple roots and the Cartan matrix

7 Irreducible representations of su(3) and the quark model

8 Classification of compact, simple Lie algebras

¹ These lecture notes are an extended version of notes by F. Saueressig and are loosely based on the book Lie Algebras in Particle Physics by H. Georgi.

1 Introduction and basic definitions

Group theory is the mathematical theory underlying the notion of symmetry. Understanding the symmetries of a system is of great importance in physics. In classical mechanics, Noether's theorem relates symmetries to conserved quantities of the system. Invariance under time-translations implies the conservation of energy. Invariance with respect to spatial translations leads to the conservation of momentum, and rotational invariance to the conservation of angular momentum. These conserved quantities play a crucial role in solving the equations of motion of the system, since they can be used to simplify the dynamics. In quantum mechanics, symmetries allow one to classify eigenfunctions of, say, the Hamiltonian. The rotational symmetry of the hydrogen atom implies that its eigenstates can be grouped by their total angular momentum and angular momentum in the z-direction, which completely fixes the angular part of the corresponding wave-functions.

In general a symmetry is a transformation of an object that preserves some properties of the object. The two-dimensional rotational symmetries SO(2) correspond to the linear transformations of the Euclidean plane that preserve lengths. A permutation of n particles preserves local particle number. Since a symmetry is a transformation, the composition of two symmetries of an object is always another symmetry. The concept of a group is introduced exactly to capture the way in which such symmetries compose.

Definition 1.1 (Group). A group (G, ·) is a set G together with a binary operation G × G → G that associates to every ordered pair f, g of elements a third element f · g. It is required to satisfy:

(i) If f, g ∈ G then h = f · g is also an element of G, i.e. the group is closed under multiplication. ("closure")
(ii) The group multiplication is associative: for f, g, h ∈ G we have f · (g · h) = (f · g) · h. ("associativity")
(iii) There is an identity element e such that for all f ∈ G, e · f = f · e = f. ("identity")
(iv) Every element f ∈ G has an inverse, denoted by f⁻¹, such that f · f⁻¹ = f⁻¹ · f = e. ("inverse")

Let us start by looking at a few abstract examples (check that these indeed satisfy the group axioms!):

Example 1.1. G = {1, −1} with the product given by multiplication.

Example 1.2. The integers under addition (Z, +).

Example 1.3. The permutation group (S_X, ◦) of a set X: S_X is the set of bijections {ψ : X → X} and ◦ is given by composition, i.e. if ψ : X → X and ψ′ : X → X then ψ ◦ ψ′ : X → X : x ↦ ψ(ψ′(x)).

In particular, if n is a positive integer we write S_n := S_{1,2,...,n} for the permutation group on {1, 2, . . . , n}.

Example 1.4. The general linear group G = GL(N, R) consisting of all real N × N matrices with non-zero determinant, with product given by matrix multiplication.

Based on these examples we should make a few remarks:

• The product of a group is not necessarily implemented as a multiplication, as should be clear from examples 2 and 3. One is free to choose a notation different from f = g · h to reflect the product structure, e.g. f = g ◦ h, f = g + h, f = gh. Often we will write G instead of (G, ·) to indicate a group if the product is clear from the context.

• Notice that there is no requirement that g · h = h · g in the definition of a group, and indeed examples 3 and 4 contain elements g and h such that g · h ≠ h · g.

Definition 1.2 (Abelian group). A group G is abelian when

g · h = h · g for all g, h ∈ G.

Otherwise the group is called non-abelian.

• The groups in the examples are of different size:

Definition 1.3 (Order of a group). The order of a group G is the number of elements in its set. A group of finite order is called a finite group, and a group of infinite order is an infinite group.

For example, S_n is a finite group of order n!, while GL(N, R) is an infinite group.

Although one may use different operations to define a group (multiplication, addition, composition, . . . ) this does not mean that the resulting abstract groups are necessarily distinct. In fact, we will identify any two groups that are related through a relabeling of their elements, i.e. any two groups that are isomorphic.

Definition 1.4 (Isomorphic groups). Two groups G and G′ are called isomorphic, G ≅ G′, if there exists a bijective map φ : G → G′ that preserves the group structure, i.e.

φ(g · h) = φ(g) · φ(h) for all g, h ∈ G. (1.1)

Example 1.5 (Cyclic group). For positive integer n the three groups defined below are all isomorphic (see exercises). Any one of them may be taken as the definition of the cyclic group of order n.

• the set Z_n := {0, 1, 2, . . . , n − 1} equipped with addition modulo n;

• the set C_n ⊂ S_n of cyclic permutations on {1, 2, . . . , n} equipped with composition;

• the n-th roots of unity U_n := {z ∈ C : zⁿ = 1} equipped with multiplication.

Abstract group theory is a vast topic of which we will only cover a few aspects that are most relevant for physics. Central in this direction are representations.

Definition 1.5 (Representation). A representation D of a group G on a vector space V is a mapping D : G → GL(V) of the elements of G to invertible linear transformations of V, i.e. elements of the general linear group GL(V), such that the product in G agrees with composition in GL(V),

\[
D(g)D(h) = D(g \cdot h) \qquad \text{for all } g, h \in G.
\tag{1.2}
\]

Often V = Rⁿ or V = Cⁿ, meaning that D(g) is given by an n × n matrix and the product in GL(V) is simply matrix multiplication. The dimension of a representation is the dimension of V.

Similarly to the isomorphism of groups, we consider two representations D : G → GL(V) and D′ : G → GL(V′) of G to be equivalent (or isomorphic, D ≅ D′) if they are related by a similarity transform, meaning that there exists an invertible linear map S : V → V′ such that

\[
D'(g) = S\, D(g)\, S^{-1} \qquad \text{for any } g \in G.
\tag{1.3}
\]

Having introduced abstract groups and their representations, we disentangle two aspects of the symmetry of a physical system: the abstract group captures how symmetry transformations compose, while the representation describes how the symmetry transformations act on the system.² This separation is very useful in practice, thanks to the fact that there are far fewer abstract groups than conceivable physical systems with symmetries. Understanding the properties of some abstract group teaches us something about all possible systems that share that symmetry group. As we will see later in the setting of Lie groups, one can to a certain extent classify all possible abstract groups with certain properties. Furthermore, for a given abstract group one can then try to classify all its (inequivalent) representations, i.e. all the ways in which the group can be realized as a symmetry group in the system.

For instance, the isospin symmetry of pions in subatomic physics and the rotational symmetry of the hydrogen atom share (almost) the same abstract group. This means that their representations are classified by the same set of quantum numbers (the total (iso)spin and (iso)spin in the z-direction). First we will focus on representations of finite groups, while the remainder of the course deals with Lie groups, i.e. infinite groups with the structure of a finite-dimensional manifold (like U(1), SO(3), SU(2), . . . ).

² By definition the representations we are considering act as linear transformations on a vector space. So we only cover physical systems with a linear structure in which the symmetries are linear transformations. This is not much of a restriction, since all quantum mechanical systems are of this form.

2 Finite groups

When dealing with finite groups one may conveniently summarize the structure of a group G = {e, g₁, g₂, . . .} in its group multiplication table, also known as its Cayley table:

\[
\begin{array}{c|cccc}
\cdot & e & g_1 & g_2 & \cdots \\ \hline
e & e & g_1 & g_2 & \cdots \\
g_1 & g_1 & g_1\cdot g_1 & g_1\cdot g_2 & \cdots \\
g_2 & g_2 & g_2\cdot g_1 & g_2\cdot g_2 & \\
\vdots & \vdots & & & \ddots
\end{array}
\tag{2.1}
\]

Note that the closure and inverse axioms of G (axioms (i) and (iv) in Definition 1.1) precisely state that each column and each row contains all elements of the group exactly once. Associativity on the other hand is not easily visualized in the table.

2.1 Example: The group of order 3

We can easily deduce using the Cayley table that there exists a unique abstract group of order 3. Indeed, if G = {e, a, b} with e the identity, then we know that the Cayley table is of the form

\[
\begin{array}{c|ccc}
\cdot & e & a & b \\ \hline
e & e & a & b \\
a & a & ? & ? \\
b & b & ? & ?
\end{array}
\tag{2.2}
\]

There is only one way to fill in the question marks such that each column and each row contains the three elements e, a, b:

\[
\begin{array}{c|ccc}
\cdot & e & a & b \\ \hline
e & e & a & b \\
a & a & b & e \\
b & b & e & a
\end{array}
\tag{2.3}
\]

It is easily checked that this multiplication is associative and thus defines a group. In fact, we have already encountered this group as a special case of the cyclic group of Example 1.5 when n = 3, i.e. G is isomorphic to the group Z₃ of addition modulo 3 (with isomorphism given by e → 0, a → 1, b → 2 for instance).

Given the abstract group, one can construct representations. We first specify the vector space on which the representation acts to be C. In this case one has the trivial representation

\[
D(e) = 1, \quad D(a) = 1, \quad D(b) = 1,
\tag{2.4}
\]

and the non-trivial representation corresponding to rotations by 120 degrees

\[
D(e) = 1, \quad D(a) = e^{2\pi i/3}, \quad D(b) = e^{4\pi i/3}.
\tag{2.5}
\]

A straightforward computation verifies that these representations indeed obey the multiplication laws indicated in the group multiplication table. Both representations (2.4) and (2.5) are complex one-dimensional. It is then a natural question whether there are other representations of Z₃ which we can construct. This brings us to the so-called regular representation.

Definition 2.1 (Regular representation). The regular representation of a finite group G = {g₁, g₂, . . . , gₙ} is obtained by choosing the vector space V to be the one spanned by its group elements,

\[
V = \left\{ \sum_{i=1}^{n} \lambda_i\, |g_i\rangle : \lambda_i \in \mathbb{C} \right\}.
\tag{2.6}
\]

For h, g ∈ G the relation

\[
D(h)|g\rangle \equiv |h \cdot g\rangle
\tag{2.7}
\]

then defines a representation D : G → GL(V), by extending to linear combinations,

\[
D(h) \sum_{i=1}^{n} \lambda_i |g_i\rangle = \sum_{i=1}^{n} \lambda_i |h \cdot g_i\rangle.
\tag{2.8}
\]

The dimension of the regular representation thus equals the order of the group.

We use the example Z₃ to illustrate how the definition (2.7) allows one to construct the matrices D(g) of the regular representation explicitly. The technique used in the construction is general and can be applied to any finite group. As a starting point we choose a basis for the vector space, |g_i⟩, i = 1, 2, 3, where g₁ = e, g₂ = a, g₃ = b are the group elements. We introduce a scalar product, requiring that the basis is orthonormal,

\[
\langle g_i | g_j \rangle = \delta_{ij}.
\tag{2.9}
\]

The matrices D(h) implicitly defined through the relation (2.7) are constructed by computing their matrix elements

\[
[D(h)]_{ij} = \langle g_i |\, D(h)\, | g_j \rangle = \langle g_i | h \cdot g_j \rangle.
\tag{2.10}
\]

The explicit computation uses the action of the representation on the basis elements

\[
D(a)|e\rangle = |a\rangle, \quad D(a)|a\rangle = |b\rangle, \quad D(a)|b\rangle = |e\rangle, \quad \ldots,
\tag{2.11}
\]

and subsequently evaluating the scalar products. The resulting matrices are

1 0 0 0 0 1 0 1 0 D(e) = 0 1 0 ,D(a) = 1 0 0 ,D(b) = 0 0 1 . (2.12) 0 0 1 0 1 0 1 0 0

Using standard matrix multiplication, one can verify that these matrices indeed obey the group multiplication laws (2.3). Thus (2.12) is indeed a representation of Z₃.
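This verification is easy to automate. The following Python sketch (ours, not part of the notes; it assumes numpy is available) builds the regular representation of Z₃ directly from the group law and checks the homomorphism property D(g)D(h) = D(g · h):

```python
import numpy as np

# Elements of Z3 = {0, 1, 2} with group law: addition modulo 3.
elements = [0, 1, 2]

def D(g, n=3):
    """Regular representation: D(g)|h> = |g+h mod n> in the basis |0>,...,|n-1>."""
    M = np.zeros((n, n))
    for h in range(n):
        M[(g + h) % n, h] = 1.0   # column h has a single 1 in row g+h mod n
    return M

# Check the homomorphism property D(g) D(h) = D(g*h) for all pairs.
for g in elements:
    for h in elements:
        assert np.allclose(D(g) @ D(h), D((g + h) % 3))

print(D(1))  # matches D(a) in (2.12)
```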

2.2 Irreducible representations

A key role in classifying the representations of a group is played by the so-called irreducible representations.

Definition 2.2 (Invariant subspace). An invariant subspace W of a representation D : G → GL(V ) is a linear subspace W ⊂ V such that

D(g)ψ ∈ W for any ψ ∈ W and g ∈ G. (2.13)

Trivially W = V and W = {0} are always invariant, so we are usually more interested in proper invariant subspaces W, meaning that W ⊂ V is invariant, W ≠ {0} and W ≠ V.

Definition 2.3 ((Ir)reducible representation). A representation is reducible if it has a proper invariant subspace, and irreducible otherwise.

Sometimes it is convenient to formulate this in terms of projection operators.³ A representation is reducible if and only if there exists a non-trivial projection operator P : V → V satisfying

PD(g) P = D(g) P for all g ∈ G. (2.14)

Indeed, the image of V under P is then an invariant subspace.

³ Recall that a projection operator P satisfies P² = P and, as a consequence, P(1 − P) = 0.

Definition 2.4 (Completely reducible representation). A representation is called completely reducible if it is equivalent to a representation whose matrix elements have block-diagonal form,

\[
D(g) = \begin{pmatrix} D_1(g) & 0 & \cdots \\ 0 & D_2(g) & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}
\tag{2.15}
\]

where the D_j are (not necessarily different) irreducible representations. Equivalently, eq. (2.15) may be written as the direct sum of irreducible representations each acting on a subspace

D(g) = D1(g) ⊕ D2(g) ⊕ .... (2.16)

We now show that the 3-dimensional representation of Z₃ is completely reducible. The first step shows that the representation is reducible. The projector defining the invariant subspace is given by

\[
P = \frac{1}{3} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}.
\tag{2.17}
\]

An explicit computation shows that P D(g) P = D(g) P for all g ∈ {e, a, b}. Notably, the projector singles out the completely symmetric state ∝ |e⟩ + |a⟩ + |b⟩, which is manifestly invariant under permutations. To show that the representation is completely reducible, we construct the similarity transform diagonalizing the matrices (2.12). Setting ω ≡ e^{2πi/3}, this transformation is given by

1 1 1  1 1 1  1 −1 1 S = √ 1 ω2 ω  ,S = √ 1 ω ω2 . (2.18) 3 1 ω ω2 3 1 ω2 ω

Evaluating D′(g) = S D(g) S⁻¹ yields

1 0 0 1 0 0 1 0 0  0 0 0 D (e) = 0 1 0 ,D (a) = 0 ω2 0 ,D (b) = 0 ω 0  . (2.19) 0 0 1 0 0 ω 0 0 ω2

Thus the regular representation of Z₃ can be written as the direct sum of irreducible representations, D′(g) = D_t ⊕ \bar{D}_1 ⊕ D_1, where D_t is the trivial representation (2.4), D_1 the complex 1-dimensional representation (2.5), and \bar{D}_1 its complex conjugate.

One should think of the irreducible representations as the analogues of prime numbers. Just like any positive integer can be decomposed as a product of primes, any representation can be decomposed as a direct sum of irreducible representations.
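This decomposition is again easy to check by machine. The sketch below (ours, not part of the notes; numpy assumed) conjugates D(a) with the matrix S of (2.18) and recovers the diagonal form (2.19):

```python
import numpy as np

omega = np.exp(2j * np.pi / 3)

# Regular representation of Z3, as in (2.12): D(a) cyclically shifts the basis.
Da = np.array([[0, 0, 1],
               [1, 0, 0],
               [0, 1, 0]], dtype=complex)

# Similarity transform (2.18); its rows are built from powers of omega.
S = np.array([[1, 1,        1       ],
              [1, omega**2, omega   ],
              [1, omega,    omega**2]]) / np.sqrt(3)

Dprime = S @ Da @ np.linalg.inv(S)
print(np.round(Dprime, 10))   # diagonal: 1, omega^2, omega, as in (2.19)
```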

Theorem 2.1. Every representation of a finite group is completely reducible and thus similar to a representation of the form (2.15).

The proof relies on a number of ingredients.

Definition 2.5 (Unitary representation). A representation D : G → GL(V) on a complex vector space V is unitary if D(g)†D(g) = 1 for all g ∈ G.

In the exercises you will prove the following two results.

Theorem 2.2. Any representation of a finite group is equivalent to a unitary representation.

7 Theorem 2.3. If a unitary representation of a finite group has an invariant subspace W , then its orthogonal complement W ⊥ is invariant as well.

Proof of Theorem 2.1. Let D : G → GL(V) be a representation. By Theorem 2.2 we may assume without loss of generality that D is unitary. If D is irreducible we are done. Otherwise it has a proper invariant subspace W (by Definition 2.3), while its orthogonal complement W⊥ is invariant too (by Theorem 2.3). Then there exist unitary representations D₁ and D₂ such that D(g) is similar to the block-diagonal form

\[
D(g) \simeq \begin{pmatrix} D_1(g) & 0 \\ 0 & D_2(g) \end{pmatrix},
\tag{2.20}
\]

where D₁(g) corresponds to the action of G on W and D₂(g) to that on W⊥. We may iterate the procedure for D₁ and D₂ separately, until we are left with only irreducible representations. This happens after a finite number of steps since upon decomposition the dimensions of the representations strictly decrease.

Remark: Theorem 2.2 and Theorem 2.1 are not true for infinite groups. For instance, the integers Z under addition (Example 1.2) have the two-dimensional representation

\[
D(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}.
\tag{2.21}
\]

It is reducible, since it leaves the subspace spanned by the first basis vector invariant, but it is not similar to a block-diagonal form. Therefore it is not completely reducible.

2.3 Schur's lemma

The importance of irreducible representations for physics lies in the property that they allow one to separate the dynamics of the system from the structures determined by the symmetries of the problem. To illustrate this, we first discuss Schur's lemma before applying it in quantum mechanics.

Theorem 2.4 (Schur’s lemma - Part I). Let D1 : G → GL(V1) and D2 : G → GL(V2) be irreducible representations of a group G, and A : V2 → V1 a linear operator. If A intertwines D1 and D2, meaning that D1(g)A = AD2(g) for all g ∈ G, then A = 0 or D1 and D2 are equivalent.

Proof. The proof is done in two steps.

A is injective or A = 0: Let us consider the subspace W₂ ⊂ V₂ annihilated by A. For any vector |µ⟩ ∈ V₂ we have

\[
|\mu\rangle \in W_2 \;\Longrightarrow\; D_1(g) A |\mu\rangle = 0 \;\Longrightarrow\; A D_2(g) |\mu\rangle = 0 \;\Longrightarrow\; D_2(g) |\mu\rangle \in W_2
\tag{2.22}
\]

for any g ∈ G, but that means W₂ is a (not necessarily proper) invariant subspace of D₂. Since D₂ is irreducible, W₂ = {0} or W₂ = V₂, meaning that A is injective or A = 0.

A is surjective or A = 0: Let us consider the image W₁ = A(V₂) ⊂ V₁. For any vector |ν⟩ ∈ V₁ we have

\[
|\nu\rangle \in W_1 \;\Longrightarrow\; |\nu\rangle = A|\mu\rangle \text{ for some } |\mu\rangle \in V_2
\tag{2.23}
\]
\[
\Longrightarrow\; D_1(g)|\nu\rangle = D_1(g) A |\mu\rangle = A D_2(g) |\mu\rangle
\tag{2.24}
\]
\[
\Longrightarrow\; D_1(g)|\nu\rangle \in W_1
\tag{2.25}
\]

for any g ∈ G, but that means W₁ is an invariant subspace of D₁. Since D₁ is irreducible, W₁ = V₁ or W₁ = {0}, meaning that A is surjective or A = 0.

Together this implies that A = 0 or A is invertible. The latter case means that D₁ and D₂ are equivalent, since then D₁(g) = A D₂(g) A⁻¹ for all g ∈ G.

This result is useful if you want to determine whether two different-looking irreducible representations are equivalent: if you can find any non-zero operator A that intertwines them, then they are equivalent.

Theorem 2.5 (Schur's lemma - Part II). Let D : G → GL(V) be a finite-dimensional irreducible representation. If a linear operator A : V → V commutes with all elements of D, i.e. [D(g), A] = 0 for all g ∈ G, then A is proportional to the identity operator.

Proof. The proof uses that every finite-dimensional matrix A has at least one eigenvector,⁴ A|µ⟩ = λ|µ⟩. From the assumption [D(g), A] = 0 it then follows that

\[
0 = D(g)(A - \lambda 1)|\mu\rangle = (A - \lambda 1) D(g) |\mu\rangle \qquad \text{for all } g \in G.
\tag{2.26}
\]

Applying Theorem 2.4, with D₁ = D₂ = D, to the operator (A − λ1), it follows that A − λ1 is either 0 or invertible. It cannot be invertible since (A − λ1)|µ⟩ = 0, thus A = λ1.

In other words, if you can find any operator A that is not proportional to the identity and that commutes with a representation D, then D is reducible.

A direct consequence of Schur's lemma is that states transforming under an irreducible representation are essentially unique. Suppose that the matrices D(g) are fixed for all g ∈ G. Any further basis transformation would correspond to an operator A satisfying A D(g) A⁻¹ = D(g), but according to Schur's lemma A ∝ 1, corresponding only to trivial phase transformations.

We close this subsection with an application of irreducible representations in quantum mechanics. In this case the dynamics of the system is encoded in the Hamiltonian Ĥ and observables correspond to the eigenvalues of Hermitian operators Ô, i.e. satisfying Ô† = Ô. Symmetries are implemented through a unitary representation D(g) of the symmetry group G acting on the entire Hilbert space of the system. In general, the representation D(g) is reducible. If it is completely reducible, one can choose a basis in which the action of D is block-diagonal. Each set of states belonging to a given block transforms with respect to an irreducible representation D_a(g) of G. It is then natural to introduce an orthonormal set of states |a, j, x⟩ on the Hilbert space satisfying ⟨a, j, x|b, k, y⟩ = δ_ab δ_jk δ_xy. Here a, b label the subspaces on which the irreducible representations act, j, k label the basis elements within such a subspace, and x, y are collective labels for all other quantum numbers. Notably, this structure has profound consequences at the level of the matrix elements of the operators Ô,

\[
\langle a, j, x |\, \hat{O}\, | b, k, y \rangle.
\tag{2.27}
\]

First notice that D(g) is block-diagonal when acting on the basis |a, j, x⟩:

\[
\langle a, j, x |\, D(g)\, | b, k, y \rangle = \delta_{ab}\, \delta_{xy}\, [D_a(g)]_{jk}.
\tag{2.28}
\]

If D(g) generates a symmetry, it commutes with the operators Ô, [D(g), Ô] = 0, and in particular with the Hamiltonian Ĥ. This implies

\[
0 = \langle a, j, x |\, [\hat{O}, D(g)]\, | b, k, y \rangle
\tag{2.29}
\]
\[
= \sum_{a',k',x'} \Big( \langle a, j, x | \hat{O} | a', k', x' \rangle \langle a', k', x' | D(g) | b, k, y \rangle - \langle a, j, x | D(g) | a', k', x' \rangle \langle a', k', x' | \hat{O} | b, k, y \rangle \Big)
\tag{2.30}
\]
\[
= \sum_{k'} \Big( \langle a, j, x | \hat{O} | b, k', y \rangle\, [D_b(g)]_{k'k} - [D_a(g)]_{jk'}\, \langle a, k', x | \hat{O} | b, k, y \rangle \Big).
\tag{2.31}
\]

The last line is precisely the structure entering Schur's lemma with A = ⟨a, j, x| Ô |a′, k′, x′⟩. As a consequence,

Part I (Theorem 2.4): ⟨a, j, x| Ô |b, k, y⟩ = 0 if a ≠ b, (2.32)
Part II (Theorem 2.5): ⟨a, j, x| Ô |b, k, y⟩ ∝ δ_jk if a = b.

⁴ Any non-constant polynomial has at least one (complex) root, so in particular the characteristic polynomial λ ↦ det(A − λI). This root is an eigenvalue of A.

We conclude that there exists some function fa(x, y) such that

\[
\langle a, j, x |\, \hat{O}\, | b, k, y \rangle = f_a(x, y)\, \delta_{ab}\, \delta_{jk}.
\tag{2.33}
\]

Thus the dependence of the matrix elements of Ô on the quantum numbers a, j is fixed by group theory, and all physics is stored in the unknown coefficients f_a(x, y). This entails in particular that if the Hamiltonian commutes with all elements D(g) of a representation of G, one can choose the eigenstates of Ĥ to transform according to irreducible representations of G. If an irreducible representation appears only once in the Hilbert space, every state transforming within this representation is an eigenstate of Ĥ with the same eigenvalue.

2.4 Characters of a representation

The decomposition of a representation into its irreducible building blocks utilizes the characters of the representation.

Definition 2.6 (Character). The character χ_D : G → C of a representation D of G is defined through the trace of the linear operator D(g),

\[
\chi_D(g) = \operatorname{Tr} D(g) = \sum_i [D(g)]_{ii}.
\tag{2.34}
\]

The character is invariant under similarity transformations, and therefore equivalent representations D and D′ have the same character, χ_D = χ_{D′}. As we will see, the converse is also true for irreducible representations: if two irreducible representations share the same character, then they are equivalent. More precisely, characters of inequivalent irreducible representations are orthogonal, as we will now prove.

Theorem 2.6 (Orthogonality of characters). Let G be a group of order N and D, D′ irreducible representations. Then

\[
\sum_{g\in G} \overline{\chi_D(g)}\, \chi_{D'}(g) =
\begin{cases}
0 & \text{if } D, D' \text{ inequivalent} \\
N & \text{if } D, D' \text{ equivalent.}
\end{cases}
\tag{2.35}
\]

This will be a direct consequence of the following striking result. Let us consider the quite large family of functions G → C consisting of all matrix elements [D(g)]_{ij} of all irreducible representations D of G. According to the following theorem these are all orthogonal when summed over the elements of the group.

Theorem 2.7 (Orthogonality of matrix elements). Let D : G → GL(V) and D′ : G → GL(V′) be irreducible representations of G with dimensions n_D and n_{D′}. Then

\[
\sum_{g\in G} [D(g^{-1})]_{ij}\, [D'(g)]_{kl} =
\begin{cases}
0 & \text{if } D, D' \text{ inequivalent} \\
\frac{N}{n_D}\, \delta_{jk}\, \delta_{il} & \text{if } D = D'.
\end{cases}
\tag{2.36}
\]

Proof. Let X : V′ → V be any linear map. We claim that the operator A : V′ → V defined by

\[
A = \sum_{g\in G} D(g^{-1})\, X\, D'(g)
\tag{2.37}
\]

intertwines D and D′. To see this we evaluate D(h)A as follows,

\[
\begin{aligned}
D(h) A &= \sum_{g\in G} D(h)\, D(g^{-1})\, X\, D'(g) && (2.38)\\
&= \sum_{g\in G} D(h\cdot g^{-1})\, X\, D'(g) && (2.39)\\
&= \sum_{g\in G} D\big((g\cdot h^{-1})^{-1}\big)\, X\, D'(g) && (2.40)\\
&\overset{g'=g\cdot h^{-1}}{=} \sum_{g'\in G} D\big((g')^{-1}\big)\, X\, D'(g'\cdot h) && (2.41)\\
&= \sum_{g'\in G} D\big((g')^{-1}\big)\, X\, D'(g')\, D'(h) = A\, D'(h). && (2.42)
\end{aligned}
\]

By Schur's lemma A = 0 if D and D′ are inequivalent, while A ∝ 1 if D = D′. Notice also that if D = D′, then

\[
\operatorname{Tr}(A) = \sum_{g\in G} \operatorname{Tr}\big( D(g^{-1})\, X\, D(g) \big) = \sum_{g\in G} \operatorname{Tr}\big( D(g)^{-1} X D(g) \big) = \sum_{g\in G} \operatorname{Tr}(X) = N \operatorname{Tr}(X),
\tag{2.43}
\]

and therefore we must have

\[
A = \frac{N}{n_D} \operatorname{Tr}(X)\, 1.
\tag{2.44}
\]

Let {|i⟩ : i = 1, . . . , n_D} be a basis for V and {|k⟩ : k = 1, . . . , n_{D′}} a basis for V′. Then X = |j⟩⟨k| with fixed j, k is an example of a linear map V′ → V. The matrix element [A]_{il} of the associated operator is precisely the left-hand side of (2.36):

\[
[A]_{il} = \langle i | A | l \rangle = \sum_{g\in G} \langle i | D(g^{-1}) | j \rangle \langle k | D'(g) | l \rangle = \sum_{g\in G} [D(g^{-1})]_{ij}\, [D'(g)]_{kl}.
\tag{2.45}
\]

By our previous observation this vanishes if D and D′ are inequivalent. If on the other hand D = D′, we have Tr(X) = δ_jk and therefore by (2.44),

\[
[A]_{il} = \frac{N}{n_D} \operatorname{Tr}(X)\, \delta_{il} = \frac{N}{n_D}\, \delta_{jk}\, \delta_{il}.
\tag{2.46}
\]

Remark 2.1. Note that Theorem 2.7 does not apply to non-identical but equivalent representations D ≅ D′. As an aside let us see what happens in this case. By definition of equivalence there exists a linear map S : V → V′ such that D′(g) = S D(g) S⁻¹. Then we find

\[
\begin{aligned}
\sum_{g\in G} [D(g^{-1})]_{ij}\, [D'(g)]_{kl}
&= \sum_{g\in G} [D(g^{-1})]_{ij} \sum_{m,n} [S]_{km} [D(g)]_{mn} [S^{-1}]_{nl} \\
&= \sum_{m,n} [S]_{km} [S^{-1}]_{nl} \sum_{g\in G} [D(g^{-1})]_{ij} [D(g)]_{mn} \\
&\overset{\text{Thm. 2.7}}{=} \frac{N}{n_D} \sum_{m,n} [S]_{km} [S^{-1}]_{nl}\, \delta_{in}\, \delta_{jm} \\
&= \frac{N}{n_D}\, [S]_{kj}\, [S^{-1}]_{il},
\end{aligned}
\tag{2.47}
\]

so instead of the orthogonality one finds the matrix elements of the similarity transform and its inverse.

Proof of Theorem 2.6. Since the characters are invariant under similarity transformations, we may assume without loss of generality that D and D′ are unitary (Theorem 2.2) and that D = D′ if they are equivalent. Then we have

\[
\sum_{g\in G} \overline{\chi_D(g)}\, \chi_{D'}(g) = \sum_{g\in G} \chi_D(g^{-1})\, \chi_{D'}(g) = \sum_{i=1}^{n_D} \sum_{k=1}^{n_{D'}} \sum_{g\in G} [D(g^{-1})]_{ii}\, [D'(g)]_{kk}.
\tag{2.48}
\]

By the previous theorem this vanishes if D and D′ are inequivalent, while for D = D′ it equals

\[
\sum_{i=1}^{n_D} \sum_{k=1}^{n_D} \frac{N}{n_D}\, \delta_{ik} = N.
\tag{2.49}
\]

The orthogonality relation entails the following profound consequences:

1) Theorem 2.7 shows that the matrix elements of inequivalent irreducible unitary representations are orthogonal and non-vanishing (and therefore linearly independent) functions of the group elements. It can be shown (but we will not do it here) that these form a complete basis for the linear functions on G. As a consequence, if G has order N and n_a are the dimensions of its irreducible representations, then one has the remarkable formula

\[
N = \sum_a n_a^2.
\tag{2.50}
\]

2) Consider a representation D of the group G with characters χ_D(g). Its completely reduced form contains the irreducible representation D_a precisely m_a times, where

\[
m_a = \frac{1}{N} \sum_{g\in G} \overline{\chi_{D_a}(g)}\, \chi_D(g).
\tag{2.51}
\]

This can be seen as follows. Since the characters are independent of the choice of basis on the linear space, we can choose a basis where D(g) is block-diagonal,

\[
D = \underbrace{D_a \oplus \cdots \oplus D_a}_{m_a \text{ times}} \oplus\; \underbrace{\cdots}_{\text{other irreducible representations}}.
\tag{2.52}
\]

In this form it is clear that χ_D = Σ_a m_a χ_{D_a}, and then (2.51) follows from Theorem 2.6.

3) The characters can be used to construct projectors onto the subspaces transforming under the irreducible representation D_a(g):

\[
P_a = \frac{n_a}{N} \sum_{g\in G} \overline{\chi_{D_a}(g)}\, D(g).
\tag{2.53}
\]

Again, this is a consequence of the completeness relation (2.36), which traced over j = k implies

\[
\frac{n_a}{N} \sum_{g\in G} \overline{\chi_{D_a}(g)}\, [D_b(g)]_{lm} = \delta_{ab}\, \delta_{lm}.
\tag{2.54}
\]

In the block-diagonal basis for D this gives "1" on subspaces transforming with respect to D_a and zero otherwise. This concludes our survey of finite groups.
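As an illustration of formulas (2.51) and (2.53), the following sketch (ours, not from the notes; numpy assumed) computes the multiplicities of the three irreducible representations of Z₃ inside its regular representation, each of which comes out as 1, and rebuilds the projector (2.17) onto the trivial representation:

```python
import numpy as np

omega = np.exp(2j * np.pi / 3)

# Regular representation of Z3 (2.12) and its characters.
D = {g: np.roll(np.eye(3), g, axis=0) for g in range(3)}   # D[g]|h> = |g+h mod 3>
chi_D = {g: np.trace(D[g]) for g in range(3)}              # 3, 0, 0

# Characters of the three 1-dimensional irreducible representations of Z3.
chi_irr = {a: {g: omega**(a * g) for g in range(3)} for a in range(3)}

# Multiplicities (2.51): each irrep appears exactly once.
for a in range(3):
    m_a = sum(np.conj(chi_irr[a][g]) * chi_D[g] for g in range(3)) / 3
    print(f"m_{a} =", np.round(m_a, 10))

# Projector (2.53) onto the trivial representation: reproduces (2.17).
P0 = sum(D[g] for g in range(3)) / 3
print(np.round(P0, 10))
```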

3 Introduction to Lie groups and Lie algebras

If the elements of a group G depend smoothly, g = g(α), on a set of continuous parameters α_a, a = 1, . . . , n, then G is called a Lie group. A prototypical example is the rotation group SO(2), implementing rotations in the two-dimensional plane: the group elements depend on a single continuous parameter given by the rotation angle α.

A matrix representation for SO(2) acting on R² is given by

\[
D(g(\alpha)) = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}.
\tag{3.1}
\]

Since D(g)⁻¹ = D(g)† = D(g)ᵀ this representation is unitary. Formally, a Lie group is defined as follows:

Definition 3.1 (Lie group). A Lie group of dimension n is a group (G, ·) whose set G possesses the additional structure of an n-dimensional smooth manifold. The group structures, composition (g, h) → g · h and inversion g → g⁻¹, are required to be smooth maps.

Recall that, informally speaking, a set G is an n-dimensional smooth manifold if the neighbourhood of any element can be smoothly parametrized by n parameters (α₁, . . . , αₙ) ∈ Rⁿ. In terms of such a parametrization the composition and inversion correspond locally to maps Rⁿ × Rⁿ → Rⁿ and Rⁿ → Rⁿ respectively, which we require to be smooth in the usual sense.

We see that SO(2) is a 1-dimensional Lie group because the rotations of R² are smoothly parametrized by a single parameter α ∈ R. Since α and α + 2π are mapped to the same matrix D(g(α)), the manifold structure of SO(2) is that of a circle (of length 2π). Recall that a representation D of a group G on a vector space V is a mapping G → GL(V) that is compatible with the group structure of G. In the case of a Lie group G one requires that D is smooth, meaning that the matrices D(g(α)) depend smoothly on the parameters α_a.

3.1 Generators of a Lie group

When parameterizing a Lie group, we adopt the convention that α_a = 0 corresponds to the identity, i.e., at the level of the group g(α)|_{α=0} = e, or for a representation D(g(α))|_{α=0} = 1. It turns out that a lot of information on the Lie group can be recovered by studying the vicinity of the identity. The construction of the generators exploits that the representation depends smoothly on the parameters α_a. As a consequence any representation admits a Taylor expansion at the identity,

\[
D(g(\alpha)) = 1 + \alpha_a X_a + O(\alpha^2),
\tag{3.2}
\]

where the generators X_a are defined as⁵

\[
X_a \equiv \frac{\partial}{\partial \alpha_a} D(g(\alpha)) \Big|_{\alpha=0}.
\tag{3.3}
\]

In (3.2) we have employed the Einstein summation convention to sum over repeated indices: x_a y_a := Σ_{a=1}^{n} x_a y_a. Applying (3.3) to (3.1) shows that SO(2) possesses a single generator given by

\[
X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\tag{3.4}
\]

⁵ Note that in the physics literature one often uses a convention with an additional imaginary unit in the definition of the generator, i.e. D(g(α)) = 1 + i α_a X̃_a + . . . and X̃_a ≡ −i X_a. The reason is that most Lie groups encountered in physics are unitary, in which case the generators X̃_a are Hermitian while the X_a are antihermitian (see Section 3.3 below). We will stick to the mathematical convention without the i, which is more convenient when considering the algebraic structure.

There is a natural way to recover group elements from linear combinations of the generators. Notice that if the vector α_a is not particularly small, then 1 + α_a X_a does not give an element of the group, because we have neglected the contributions from the higher-order terms in (3.2). Luckily we can choose a large integer k and construct a group element in k steps: in the limit where k becomes large, 1 + (α_a/k) X_a does represent a group element close to the identity, which we can then compose k times with itself. This motivates the introduction of the exponential map.

Definition 3.2 (The exponential map). Sufficiently close to unity, the elements of a representation can be obtained as the exponential of the generators,

\[
D(g(\alpha)) = \lim_{k\to\infty} \left( 1 + \frac{\alpha_a X_a}{k} \right)^{k} = \sum_{m=0}^{\infty} \frac{1}{m!} (\alpha_a X_a)^m \equiv e^{\alpha_a X_a}.
\tag{3.5}
\]

This construction applies to any representation D of the group. In the exercises you will check that the exponential map of multiples of (3.4) reproduces the parametrization (3.1), i.e.

\[
e^{\alpha X} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix},
\tag{3.6}
\]

and therefore the exponential map of SO(2) covers the whole group. This turns out to be true for all the Lie groups that we will consider in this course:
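For readers who want to see this concretely before doing the exercise, here is a quick numerical check (ours, not part of the notes; it uses scipy.linalg.expm for the matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

# Generator (3.4) of SO(2).
X = np.array([[0.0, -1.0],
              [1.0,  0.0]])

alpha = 0.7
R = np.array([[np.cos(alpha), -np.sin(alpha)],
              [np.sin(alpha),  np.cos(alpha)]])

# exp(alpha X) reproduces the rotation matrix (3.1), as claimed in (3.6).
print(np.allclose(expm(alpha * X), R))   # True
```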

Fact 3.1. If G is a connected⁶ and compact⁷ Lie group and D : G → GL(V) a finite-dimensional representation, then for any g ∈ G one can find a linear combination α_a X_a such that D(g) = e^{α_a X_a}.

If all generators mutually commute, i.e. [Xa,Xb] = 0 for all a, b = 1, . . . n, then it is easily seen from the definition of the exponential map that A = αaXa and B = βaXa satisfy

\[
e^A \cdot e^B = e^B \cdot e^A = e^{A+B} \qquad \text{(if } [X_a, X_b] = 0\text{)}.
\tag{3.7}
\]

In other words the Lie group is necessarily Abelian and the group multiplication is equivalent to addition of the corresponding generators.

⁶ A Lie group is connected if any element in G can be reached from the identity by a continuous path. An example of a Lie group that is not connected is O(2), the set of 2 × 2 orthogonal real matrices M. Since det(M) = ±1 there is no way to reach a matrix with determinant −1 along a continuous path starting at the identity matrix, which has determinant 1.
⁷ A Lie group defined via a set of matrices is compact if the absolute value of all matrix elements is bounded by some constant.

For general non-Abelian Lie groups, however, we know that there are A = α_a X_a and B = β_a X_a that fail this relation, i.e.

\[
e^A e^B \neq e^{A+B} \neq e^B e^A.
\tag{3.8}
\]

Note that we do have

\[
e^{\lambda A} e^{\mu A} = e^{(\lambda+\mu) A} \qquad \text{for } \lambda, \mu \in \mathbb{R}.
\tag{3.9}
\]

In particular, setting λ = 1 and µ = −1 we find for the group inverse of e^A,

\[
e^A e^{-A} = e^{0 \cdot A} = 1 \quad \Longrightarrow \quad (e^A)^{-1} = e^{-A}.
\tag{3.10}
\]

3.2 Lie algebras

As we will see below, the generators X_a span a vector space g = {α_a X_a : α_a ∈ R} with a particular structure, known as a Lie algebra.

Definition 3.3 (Lie algebra). A Lie algebra g is a vector space together with a "Lie bracket" [·, ·] that satisfies

(i) closure: if X, Y ∈ g then [X, Y] ∈ g;
(ii) linearity: [aX + bY, Z] = a[X, Z] + b[Y, Z] for X, Y, Z ∈ g and a, b ∈ C;
(iii) antisymmetry: [X, Y] = −[Y, X] for X, Y ∈ g;
(iv) Jacobi identity: [X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0 for X, Y, Z ∈ g.

When working with matrices, like in our case, we can take the Lie bracket to be defined by the commutator of the matrix multiplication

[X,Y ] := XY − YX. (3.11)

It is then easy to see that this bracket satisfies linearity and antisymmetry. In the exercises you will verify that it also satisfies the Jacobi identity by expanding the commutators. Properties (ii)-(iv) thus follow directly from the elements being represented by matrices. We will now see that the closure (i) of the generators X_a follows from the closure of the group multiplication.

Let A, B ∈ g, i.e. A and B are linear combinations of the generators X_a, and let ε ∈ R; then typically

\[
e^{\varepsilon A} e^{\varepsilon B} \neq e^{\varepsilon B} e^{\varepsilon A}, \quad \text{or equivalently} \quad e^{-\varepsilon A} e^{-\varepsilon B} e^{\varepsilon A} e^{\varepsilon B} \neq 1.
\tag{3.12}
\]

However, by the closure of the group e^{−εA} e^{−εB} e^{εA} e^{εB} must be some group element. Let us determine this group element for small ε. Expanding to second order in ε,

\[
e^{-\varepsilon A} e^{-\varepsilon B} e^{\varepsilon A} e^{\varepsilon B}
= \big(1 - \varepsilon A + \tfrac{1}{2}\varepsilon^2 A^2\big)\big(1 - \varepsilon B + \tfrac{1}{2}\varepsilon^2 B^2\big)\big(1 + \varepsilon A + \tfrac{1}{2}\varepsilon^2 A^2\big)\big(1 + \varepsilon B + \tfrac{1}{2}\varepsilon^2 B^2\big) + O(\varepsilon^3).
\]

The terms proportional to εA, εB, ε²A², ε²B² all cancel, leaving

\[
e^{-\varepsilon A} e^{-\varepsilon B} e^{\varepsilon A} e^{\varepsilon B} = 1 + \varepsilon^2 (AB - BA) + O(\varepsilon^3) = \exp\!\big( \varepsilon^2 [A, B] + O(\varepsilon^3) \big).
\tag{3.13}
\]

Since this is true for ε arbitrarily small, we must have that [A, B] is a generator, thus [A, B] ∈ g, and (i) follows. We conclude that to any Lie group G we can associate a Lie algebra g spanned by the generators of G. In mathematical terms the linear space spanned by the generators is the tangent space of the manifold G at the identity e ∈ G, so we find that the tangent space of a Lie group G naturally possesses a Lie algebra structure g. By convention, we will use Fraktur font to denote Lie algebras: the Lie algebras of the Lie groups SO(N), SU(N), . . . are denoted by so(N), su(N), . . .
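The expansion (3.13) can also be tested numerically. The sketch below (ours, assuming numpy/scipy) uses random antisymmetric matrices, i.e. elements of so(3), and checks that the group commutator approaches exp(ε²[A, B]):

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)

# Two random antisymmetric matrices, i.e. elements of so(3).
def random_so3():
    M = rng.normal(size=(3, 3))
    return M - M.T

A, B = random_so3(), random_so3()
eps = 1e-3

# (3.13): e^{-eps A} e^{-eps B} e^{eps A} e^{eps B} = exp(eps^2 [A,B] + O(eps^3))
g = expm(-eps * A) @ expm(-eps * B) @ expm(eps * A) @ expm(eps * B)
comm = A @ B - B @ A

# logm(g)/eps^2 should approach [A, B] as eps -> 0 (residual error is O(eps)).
print(np.max(np.abs(logm(g) / eps**2 - comm)))   # small, of order eps
```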

Since the generators X_a form a basis of the Lie algebra, the commutator of two generators [X_a, X_b] is a real linear combination of generators again. The real coefficients are called the structure constants.

Definition 3.4 (Structure constants). Given a Lie group G with generators Xa, a = 1, . . . , n. The structure constants fabc of the Lie algebra g spanned by Xa are defined through the relation

[Xa,Xb] = fabcXc. (3.14)

The properties of the structure constants are summarized as follows:

1) The structure constants do not depend on the representation, since they express how infinitesimal symmetry transformations compose. They are thus a property of the underlying abstract Lie group. Note that they do depend on the choice of basis of generators (we will come back to this dependence on the basis in Section 5.2).

2) The anti-symmetry of the commutator implies that the structure constants are anti-symmetric in the first two indices,

\[
f_{abc} = -f_{bac}.
\tag{3.15}
\]

3) The Jacobi identity

[ Xa , [ Xb ,Xc ] ] + [ Xb , [ Xc ,Xa ] ] + [ Xc , [ Xa ,Xb ] ] = 0 (3.16)

can be equivalently stated as a relation among the structure constants:

fbcd fade + fabd fcde + fcad fbde = 0 . (3.17)

4) The structure constants are sufficient to compute the product of the group elements exactly. This follows from the Baker-Campbell-Hausdorff formula for exponentials of matrices

\[
e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}\left([X,[X,Y]] + [Y,[Y,X]]\right) + \text{more multi-commutators}}.
\tag{3.18}
\]

The multi-commutators can again be expressed in terms of the structure constants.

The importance of the Lie algebra arises from the property that it allows one to study properties and representations of a group at the level of linear algebra. In particular, the exponential map allows one to translate proofs back and forth between the two perspectives. Two families of Lie groups, the special unitary groups SU(N) and the special orthogonal groups SO(N), play an important role in physics. We also include the definition of the less familiar family of compact symplectic groups Sp(N), because together SU(N), SO(N) and Sp(N) form the only infinite families of the (yet to be introduced) simple compact Lie groups, as we will see in the last chapter of these lecture notes.
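As an illustration of the Baker-Campbell-Hausdorff formula (3.18) above, the following sketch (ours, not from the notes; numpy/scipy assumed) compares e^X e^Y with the exponential of the truncated BCH series for small X, Y ∈ so(3); the residual error comes only from the omitted higher multi-commutators:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
M1, M2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
X, Y = 0.1 * (M1 - M1.T), 0.1 * (M2 - M2.T)   # small elements of so(3)

def comm(A, B):
    return A @ B - B @ A

# Truncated Baker-Campbell-Hausdorff exponent, cf. (3.18).
Z = X + Y + comm(X, Y) / 2 + (comm(X, comm(X, Y)) + comm(Y, comm(Y, X))) / 12

print(np.max(np.abs(expm(X) @ expm(Y) - expm(Z))))   # tiny: only higher terms missing
```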

3.3 The special unitary group SU(N)

For N ≥ 2, the special unitary group in N dimensions is

\[
\mathrm{SU}(N) = \left\{ U \in \mathbb{C}^{N\times N} : U^\dagger U = 1,\ \det U = 1 \right\}.
\tag{3.19}
\]

Its importance as a symmetry group stems from the fact that it leaves invariant the N-dimensional scalar product familiar from quantum mechanics,

\[
\langle \xi | \eta \rangle \equiv \xi_a^* \eta_a, \qquad \xi, \eta \in \mathbb{C}^N,
\tag{3.20}
\]

16 where we used the summation convention. Indeed, we observe that for any U ∈ SU(N),

\[
\langle U\xi | U\eta \rangle = (U\xi)_a^* (U\eta)_a = U_{ab}^* \xi_b^* U_{ac} \eta_c = \xi_b^* (U^\dagger)_{ba} U_{ac} \eta_c \overset{U^\dagger U = 1}{=} \xi_b^* \delta_{bc} \eta_c = \langle \xi | \eta \rangle.
\tag{3.21}
\]

Note that we have not used that det U = 1. The most general linear transformations preserving the inner product thus form the larger unitary group

\[
\mathrm{U}(N) = \left\{ U \in \mathbb{C}^{N\times N} : U^\dagger U = 1 \right\}.
\tag{3.22}
\]

The determinant of a unitary matrix U ∈ U(N) necessarily satisfies | det U| = 1. Therefore U(N) and SU(N) differ only by a phase: any unitary matrix U ∈ U(N) can be written as

\[
U = e^{i\alpha} U', \qquad e^{iN\alpha} = \det U, \qquad U' \in \mathrm{SU}(N).
\tag{3.23}
\]

Since this phase commutes with all elements of the group it does not play a role in understanding the algebraic structure of unitary matrices.⁸ For this reason we concentrate on SU(N) rather than U(N).

In order to determine the corresponding Lie algebra we need to translate the conditions (3.19) for the Lie group elements into conditions on the generators. Writing an arbitrary group element in terms of the exponential map, U(α) ≡ e^{α_a X_a}, one has U† = e^{α_a X_a^†} and U⁻¹ = e^{−α_a X_a}. The condition U† = U⁻¹ then implies that the generators are antihermitian, X_a^† = −X_a. In addition, det U = 1 imposes that the generators must be traceless, tr X_a = 0. This property follows from

\[
\log \det U = 0 \overset{!}{=} \operatorname{tr} \log e^{\alpha_a X_a} = \alpha_a \operatorname{tr} X_a.
\tag{3.24}
\]

Since this must hold for any value of the αa the generators must be traceless. Thus the Lie algebra su(N) consists of all antihermitian, trace-free matrices,

\[
\mathfrak{su}(N) = \left\{ X \in \mathbb{C}^{N\times N} : X^\dagger = -X,\ \operatorname{tr} X = 0 \right\}.
\tag{3.25}
\]

The number of generators can be deduced from the following counting argument: an N × N matrix with complex entries has 2N² real parameters. The condition that the matrix is antihermitian imposes N² constraints. Requiring that the matrices are trace-free removes an additional generator (the one corresponding to global phase rotations). Hence

\[
\dim \mathfrak{su}(N) = N^2 - 1.
\tag{3.26}
\]
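A quick numerical sanity check of (3.25) (our illustration, assuming numpy/scipy): exponentiating a random antihermitian traceless matrix indeed lands in SU(N).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
N = 3

# Random element of su(N): antihermitian and traceless, cf. (3.25).
M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
X = (M - M.conj().T) / 2
X -= np.trace(X) / N * np.eye(N)

U = expm(X)
print(np.allclose(U.conj().T @ U, np.eye(N)))   # unitary
print(np.isclose(np.linalg.det(U), 1.0))        # det U = 1
```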

3.4 The special orthogonal group SO(N)

For N ≥ 2, the special orthogonal group SO(N) is the real analogue of the special unitary group,

\[
\mathrm{SO}(N) = \left\{ \Lambda \in \mathbb{R}^{N\times N} : \Lambda^T \Lambda = 1,\ \det \Lambda = 1 \right\}.
\tag{3.27}
\]

This time it is the Euclidean scalar product on Rᴺ that is preserved,

\[
\langle \xi, \eta \rangle \equiv \xi_a \eta_a,
\tag{3.28}
\]

since for any Λ ∈ SO(N) we have

\[
\langle \Lambda\xi, \Lambda\eta \rangle = (\Lambda\xi)_a (\Lambda\eta)_a = \xi_b (\Lambda^T)_{ba} \Lambda_{ac} \eta_c \overset{\Lambda^T \Lambda = 1}{=} \xi_b \delta_{bc} \eta_c = \langle \xi, \eta \rangle.
\tag{3.29}
\]

Again SO(N) is not the largest group of matrices preserving the inner product, because we have not used that det Λ = 1. The largest such group, O(N) = {Λ ∈ R^{N×N} : Λᵀ Λ = 1}, is that of all orthogonal N × N matrices including reflections. Since any orthogonal matrix has determinant ±1,

⁸ We will see later that the absence of this phase in SU(N) entails that the Lie algebra su(N) is simple (in a precise sense), while u(N) is not.

O(N) really consists of two disjoint components, one for each sign of the determinant. The component containing the identity matrix is precisely SO(N). We will focus on the latter because it is connected (and compact), and therefore any element can be obtained via the exponential map.

For any Λ ∈ SO(N) we may write Λ = e^{α_a X_a} with X_a real N × N matrices. Then Λᵀ = e^{α_a X_aᵀ} and Λ⁻¹ = e^{−α_a X_a}. Thus Λᵀ = Λ⁻¹ implies that the generators of the Lie algebra so(N) are antisymmetric matrices, X_aᵀ = −X_a. The restriction on the determinant implies that the generators are traceless, but this already follows from antisymmetry. Hence we find that

\[
\mathfrak{so}(N) = \left\{ X \in \mathbb{R}^{N\times N} : X^T = -X \right\}.
\tag{3.30}
\]

The number of generators follows from counting the independent entries in an antisymmetric N × N matrix and is given by

\[
\dim \mathfrak{so}(N) = \tfrac{1}{2} N(N-1).
\tag{3.31}
\]

3.5 The compact symplectic group Sp(N)

For N ≥ 1, the compact symplectic group Sp(N) ⊂ SU(2N) consists of the unitary 2N × 2N matrices that leave invariant the skew-symmetric scalar product

\[
(\xi, \eta) \equiv \sum_{a=1}^{2N} \xi_a J_{ab} \eta_b, \qquad J = \begin{pmatrix} 0 & 1_N \\ -1_N & 0 \end{pmatrix}.
\tag{3.32}
\]

Hence it satisfies

\[
\mathrm{Sp}(N) = \left\{ U \in \mathbb{C}^{2N\times 2N} : U^\dagger U = 1,\ U^T J U = J \right\}.
\tag{3.33}
\]

The last condition can be equivalently written as U⁻¹ = J⁻¹ Uᵀ J. In terms of the exponential map this means that

\[
e^{-\alpha_a X_a} = J^{-1} e^{\alpha_a X_a^T} J = e^{\alpha_a J^{-1} X_a^T J},
\tag{3.34}
\]

where the last equality follows easily from the power series expansion of the exponential. Since J⁻¹ = −J, it follows that the generators of the Lie algebra satisfy

\[
X_a^T = J X_a J.
\tag{3.35}
\]

Together with the fact that the generators have to be antihermitian for U to be unitary, this leads us to identify the Lie algebra as

\[
\mathfrak{sp}(N) = \left\{ X \in \mathbb{C}^{2N\times 2N} : X^\dagger = -X,\ X^T = J X J \right\}.
\tag{3.36}
\]

With some work one may deduce from these conditions that sp(N) has dimension N(2N + 1).

4 Irreducible representations of su(2)

The exponential map provides a connection between Lie groups and the underlying algebra. As a consequence we can construct representations of the Lie group from representations of the corresponding Lie algebra.

Note that we have been a bit sloppy in our notation for the Lie algebra and its representations. In the previous section we introduced the Lie algebra g of a Lie group G by looking at a representation D : G → GL(Cᴺ) and considering the expansion of D(g(α)) around the identity g(0) = e. The linear space spanned by the generators should be understood as one particular representation (which we could denote by D : g → C^{N×N}) of an abstract Lie algebra g. Even though the dimension of the representation (i.e. the size N of the matrices) depends on the representation, the dimension n = dim(G) of the Lie algebra g is fixed (i.e. the number of generators). We have argued that the structure constants f_abc can be understood as a property of the abstract Lie algebra. Therefore the problem of finding other, say, N-dimensional representations of a Lie algebra g is equivalent to finding a collection of n generators X_a ∈ C^{N×N} satisfying (3.14). The goal of this section is to construct all finite-dimensional irreducible representations of the Lie algebra su(2).

4.1 The fundamental representation of su(2)

We have already seen one representation of su(2) in Section 3.3, namely the defining or fundamental representation

\[
\mathfrak{su}(2) = \left\{ X \in \mathbb{C}^{2\times 2} : X^\dagger = -X,\ \operatorname{tr} X = 0 \right\}.
\tag{4.1}
\]

A basis of traceless Hermitian matrices, well known from quantum mechanics, is given by the Pauli matrices

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\tag{4.2}
\]

satisfying

\[
\sigma_a^\dagger = \sigma_a, \qquad \operatorname{tr}[\sigma_a] = 0, \qquad \sigma_a \sigma_b = \delta_{ab} + i \epsilon_{abc} \sigma_c.
\tag{4.3}
\]

Here ε_abc is the completely anti-symmetric ε-tensor with ε₁₂₃ = 1. To obtain a basis of antihermitian traceless matrices, we then only have to multiply the Pauli matrices by i. In particular, if we set

\[
X_a = -\tfrac{i}{2} \sigma_a
\tag{4.4}
\]

then

\[
[X_a, X_b] = \epsilon_{abc} X_c, \qquad (a, b, c = 1, 2, 3).
\tag{4.5}
\]

It follows that the structure constants of su(2) are f_abc = ε_abc. Any other set of three antihermitian matrices satisfying (4.5) constitutes another representation of su(2).
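The commutation relations (4.5) are easily verified numerically; the following sketch (ours, not from the notes; numpy assumed) does so for the generators (4.4):

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
X = [-0.5j * s for s in sigma]                    # generators (4.4)

eps = np.zeros((3, 3, 3))                         # epsilon tensor, eps[0,1,2] = 1
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[b, a, c] = 1.0, -1.0

# Verify [X_a, X_b] = eps_abc X_c, eq. (4.5).
for a in range(3):
    for b in range(3):
        lhs = X[a] @ X[b] - X[b] @ X[a]
        rhs = sum(eps[a, b, c] * X[c] for c in range(3))
        assert np.allclose(lhs, rhs)
print("su(2) commutation relations verified")
```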

4.2 Irreducible representations of su(2) from the highest-weight construction

Recall that a representation D : G → GL(V) is irreducible if it has no proper invariant subspace W ⊂ V, i.e. a subspace such that D(g)w ∈ W for all w ∈ W and g ∈ G. An invariant subspace W can be equivalently characterized in terms of the Lie algebra g, namely that X_a w ∈ W for all w ∈ W and all generators X_a. An irreducible representation of su(2) therefore corresponds to a set of N × N antihermitian matrices satisfying (4.5) that do not leave any proper subspace W ⊂ Cᴺ invariant. Such representations can be constructed systematically via the highest-weight construction.

The first step in the construction diagonalizes as many generators as possible. From quantum mechanics we recall that two (Hermitian or antihermitian) operators can be diagonalized simultaneously if

19 and only if they commute. For su(2) this implies that only one generator may be taken diagonal, which without loss of generality we take to be X3. Since X3 is antihermitian,

\[
J_3 \equiv i X_3
\tag{4.6}
\]

is a real diagonal N × N matrix. The remaining generators X₁ and X₂ are recast into raising and lowering operators

\[
J_\pm \equiv \frac{1}{\sqrt{2}} (J_1 \pm i J_2) = \frac{-i}{\sqrt{2}} (X_1 \pm i X_2), \qquad J_a \equiv i X_a.
\tag{4.7}
\]

Since X_a^† = −X_a, one has J_− = (J_+)^†. Furthermore, the commutator (4.5) implies

\[
[J_3, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = J_3.
\tag{4.8}
\]

The key property of the raising and lowering operators is that they raise and lower the eigenvalues of J₃ by one unit. To be precise, suppose |m⟩ is an eigenvector of J₃ with eigenvalue m, which is necessarily real because J₃ is Hermitian. Then J_±|m⟩ is an eigenvector of J₃ with eigenvalue m ± 1 (unless J_±|m⟩ = 0), because

\[
J_3 J_\pm |m\rangle = J_\pm J_3 |m\rangle \pm J_\pm |m\rangle = (m \pm 1)\, J_\pm |m\rangle,
\tag{4.9}
\]

where we used the commutator (4.8) in the first step. The states transforming within a given irreducible representation are found from the highest-weight construction. Since the representation is finite-dimensional and J₃ is Hermitian, J₃ has a largest eigenvalue j ∈ R. Let |j⟩ be one of the corresponding (normalized) eigenvectors, which we call a highest-weight state,

\[
J_3 |j\rangle = j\, |j\rangle, \qquad \langle j | j \rangle = 1.
\tag{4.10}
\]

The property that it is the highest-weight state implies that it is annihilated by the raising operator,

\[
J_+ |j\rangle = 0,
\tag{4.11}
\]

since by construction there cannot be any state with J₃-eigenvalue j + 1. Applying J₋ lowers the J₃-eigenvalue by one unit, and we denote the corresponding eigenvector by |j − 1⟩,

\[
J_- |j\rangle = N_j\, |j-1\rangle,
\tag{4.12}
\]

where we introduced a real normalization constant N_j > 0 to ensure ⟨j−1|j−1⟩ = 1. The value of N_j can be computed from the normalization of the highest-weight state and the commutator. Employing the definition (4.12),

\[
N_j^2\, \langle j-1 | j-1 \rangle = \langle j |\, J_+ J_-\, | j \rangle = \langle j |\, [J_+, J_-]\, | j \rangle = \langle j |\, J_3\, | j \rangle = j.
\tag{4.13}
\]

Thus we conclude that N_j = √j. Conversely, applying J₊ to |j − 1⟩ leads back to the state |j⟩:

\[
J_+ |j-1\rangle = \frac{1}{N_j} J_+ J_- |j\rangle = \frac{1}{N_j} [J_+, J_-] |j\rangle = \frac{1}{N_j} J_3 |j\rangle = \frac{j}{N_j} |j\rangle = N_j |j\rangle.
\tag{4.14}
\]

We can continue this process and define |j − k⟩ to be the properly normalized state obtained after k applications of J₋ on |j⟩, assuming it does not vanish. In particular we introduce the normalization coefficients N_{j−k} > 0 via

\[
J_- |j-k\rangle \equiv N_{j-k}\, |j-k-1\rangle.
\tag{4.15}
\]

We claim that the states |j⟩, |j − 1⟩, . . . span an invariant subspace W of the representation. It is clear from the construction that W is invariant under J₃ and J₋, so it remains to verify that J₊|j − k⟩ ∈ W for all k. This can be checked by induction: we have seen that always J₊|j − 1⟩ ∈ W. Now suppose J₊|j − k⟩ ∈ W for some k ≥ 1. Then

\[
J_+ |j-k-1\rangle = \frac{1}{N_{j-k}} J_+ J_- |j-k\rangle = \frac{1}{N_{j-k}} \big( J_- J_+ |j-k\rangle + J_3 |j-k\rangle \big),
\tag{4.16}
\]

but both terms are in W: J₋J₊|j − k⟩ ∈ W because J₊|j − k⟩ ∈ W by the induction hypothesis, and J₃|j − k⟩ = (j − k)|j − k⟩ ∈ W. Hence W is a non-trivial invariant subspace. If we require the representation to be irreducible, |j⟩, |j − 1⟩, . . . must form a basis for the full vector space. This implies in particular that our choice of highest-weight state |j⟩ was unique (up to phase). The definition (4.15) allows one to derive the recursion relation

\[
|N_{j-k}|^2 - |N_{j-k+1}|^2 = j - k.
\tag{4.17}
\]

This follows from

\[
\begin{aligned}
|N_{j-k}|^2 &= |N_{j-k}|^2\, \langle j-k-1 | j-k-1 \rangle \\
&= \langle j-k |\, J_+ J_-\, | j-k \rangle \\
&= \langle j-k |\, [J_+, J_-]\, | j-k \rangle + \langle j-k |\, J_- J_+\, | j-k \rangle \\
&= j - k + |N_{j-k+1}|^2.
\end{aligned}
\tag{4.18}
\]

Noticing that the recursion relation (4.17) is a telescopic sum, the solution for the normalization coefficients is obtained as

\[
|N_{j-k}|^2 = |N_j|^2 + \sum_{l=1}^{k} (j - l) = \tfrac{1}{2} (k+1)(2j-k).
\tag{4.19}
\]

In order for the number of states transforming within the representation to be finite, there must also be a lowest-weight state with J₃-eigenvalue j − l, l ∈ N. By definition this state is annihilated by J₋,

\[
J_- |j-l\rangle = 0,
\tag{4.20}
\]

since there is no state with J₃-eigenvalue j − l − 1. This implies that the normalization coefficient N_{j−l} introduced in (4.15) and computed in (4.19) has to vanish,

\[
N_{j-l} = \frac{1}{\sqrt{2}} \sqrt{(2j-l)(l+1)} \overset{!}{=} 0.
\tag{4.21}
\]

This happens if and only if j = 0, 1/2, 1, 3/2, . . . and l = 2j. In other words, one obtains a finite-dimensional representation if j is either integer or half-integer. The finite-dimensional irreducible representations of su(2) can then be labeled by their j-value, and are called the spin-j representations. The spin-j representation is (2j + 1)-dimensional and the J₃-eigenvalues of the states cover the range −j ≤ m ≤ j in unit steps.

We have proved that any N-dimensional irreducible representation of su(2) is equivalent to the spin-j representation with 2j + 1 = N, meaning that we can find a basis |m⟩ ≡ |j, m⟩, m = −j, −j + 1, . . . , j, such that

\[
J_3 |j,m\rangle = m\, |j,m\rangle, \qquad J_- |j,m\rangle = N_m\, |j,m-1\rangle, \qquad J_+ |j,m\rangle = N_{m+1}\, |j,m+1\rangle,
\tag{4.22}
\]

\[
N_m = \frac{1}{\sqrt{2}} \sqrt{(j+m)(j-m+1)}.
\tag{4.23}
\]

In particular we can read off explicit matrices for the generators in the spin-j representation:

\[
[J_3]_{m'm} = \langle j,m' |\, J_3\, | j,m \rangle = m\, \delta_{m',m}
\tag{4.24}
\]

and

\[
[J_+]_{m'm} = \langle j,m' |\, J_+\, | j,m \rangle = N_{m+1}\, \langle j,m' | j,m+1 \rangle = \frac{1}{\sqrt{2}} \sqrt{(j+m+1)(j-m)}\; \delta_{m',m+1}.
\tag{4.25}
\]

The matrix representation of the lowering operator is obtained via J₋ = (J₊)†. It remains to check that this spin-j representation is indeed an irreducible representation. It is easy to verify that the matrices J₃, J₊, J₋ satisfy the commutation relations (4.8). Solving (4.6) and (4.7) then gives explicit matrices X_a satisfying the commutation relations [X_a, X_b] = ε_abc X_c. To see that the representation is irreducible, let us consider an arbitrary (non-vanishing) vector

\[
|\psi\rangle = \sum_{m=-j}^{j} a_m\, |j,m\rangle.
\tag{4.26}
\]

Note that we can always find a k ≥ 0 such that

\[
J_+^k |\psi\rangle \propto |j,j\rangle,
\tag{4.27}
\]

namely k = 2j if a₋ⱼ ≠ 0, or k = 2j − 1 if a₋ⱼ = 0 and a₋ⱼ₊₁ ≠ 0, etcetera. But then we can obtain all basis vectors from |ψ⟩ by repeated application of J₋ and J₊ via

\[
J_-^l J_+^k |\psi\rangle \propto |j, j-l\rangle.
\tag{4.28}
\]

This means that if |ψ⟩ is in some invariant subspace then all basis vectors are in this subspace, so the spin-j representation has no proper invariant subspaces! This completes the proof of the following theorem.

Theorem 4.1. A complete list of inequivalent irreducible representations of su(2) is given by the (2j + 1)-dimensional spin-j representations (4.22) for j = 0, 1/2, 1, 3/2, . . .

We close this section by checking that for j = 1/2 we reproduce the two-dimensional fundamental representation of su(2). Using the basis |1/2, 1/2⟩, |1/2, −1/2⟩ we evaluate (4.24) to

\[
J_3 = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{1}{2} \sigma_3 = i X_3.
\tag{4.29}
\]

From (4.25) we obtain

\[
J_+ = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
J_- = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.
\tag{4.30}
\]

Converting back to X1 and X2 via

\[
X_1 = -\frac{i}{\sqrt{2}} (J_+ + J_-) = -\frac{i}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
X_2 = -\frac{1}{\sqrt{2}} (J_+ - J_-) = -\frac{i}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},
\tag{4.31}
\]

one recovers X_a = −(i/2) σ_a with σ_a the Pauli matrices (4.2).
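The whole construction can be summarized in a few lines of code. The sketch below (ours, not part of the notes; numpy assumed) builds J₃ and J± from (4.24)-(4.25) for arbitrary j, verifies the commutation relations (4.8), and for j = 1/2 reproduces (4.29)-(4.30):

```python
import numpy as np

def spin_matrices(j):
    """J3, J+, J- in the spin-j representation, following (4.22)-(4.25)."""
    m = np.arange(j, -j - 1, -1)                  # basis |j,j>, |j,j-1>, ..., |j,-j>
    J3 = np.diag(m)
    Jp = np.zeros((len(m), len(m)))
    for k in range(1, len(m)):                    # <m'|J+|m> with m' = m + 1
        Jp[k - 1, k] = np.sqrt((j + m[k] + 1) * (j - m[k]) / 2)
    return J3, Jp, Jp.T                           # J- = (J+)^dagger (real entries)

for j in (0.5, 1.0, 1.5, 2.0):
    J3, Jp, Jm = spin_matrices(j)
    # Commutation relations (4.8).
    assert np.allclose(J3 @ Jp - Jp @ J3, Jp)
    assert np.allclose(Jp @ Jm - Jm @ Jp, J3)

# j = 1/2 reproduces (4.29)-(4.30), i.e. the Pauli matrices up to factors.
print(spin_matrices(0.5))
```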

5 The adjoint representation

5.1 Definition

Let G be a Lie group and g the corresponding Lie algebra in some N-dimensional matrix representation. If the dimension of G, i.e. the number of generators of the Lie algebra g, is n, then G possesses a natural n-dimensional representation with underlying vector space given by the Lie algebra g itself.

Definition 5.1 (Adjoint representation of a Lie group). The adjoint representation D : G → GL(g) of G is given by

\[
D(g) X = g X g^{-1} \qquad \text{for } g \in G,\ X \in \mathfrak{g}.
\tag{5.1}
\]

This definition makes sense because the elements of G and g are both N × N matrices. To see that g X g⁻¹ ∈ g, notice that

\[
t \mapsto g\, e^{tX} g^{-1}
\tag{5.2}
\]

determines a smooth path in G that visits the identity at t = 0. Hence,

\[
\partial_t \big( g\, e^{tX} g^{-1} \big)\big|_{t=0} = g X g^{-1} \in \mathfrak{g}.
\tag{5.3}
\]

Let us look at the corresponding representation of the Lie algebra. To this end we consider a generator Y ∈ g and expand the action of D(1 + εY) to first order in ε,

\[
D(1 + \varepsilon Y) X = (1 + \varepsilon Y) X (1 - \varepsilon Y + \ldots) = X + \varepsilon [Y, X] + \ldots = (1 + \varepsilon [Y, \cdot\,] + \ldots) X.
\tag{5.4}
\]

Hence the generator Y acts on the Lie algebra g by taking the Lie bracket [Y, · ]. Viewing the elements X ∈ g as states |X⟩ we may thus summarize the representation as

\[
Y |X\rangle = |[Y, X]\rangle.
\tag{5.5}
\]

To obtain explicit matrices [T_a]_{bc} for this representation, we choose a basis of generators X_a spanning the Lie algebra g and consider an inner product such that ⟨X_a|X_b⟩ = δ_ab.⁹ Recall that the commutator of the generators is given in terms of the structure constants by

[ Xa ,Xb ] = fabc Xc . (5.6)

To find the matrix element [T_a]_{bc} corresponding to the action of X_a on g we use the inner product to find

\[
[T_a]_{bc} \overset{(2.10)}{=} \langle X_b |\, X_a\, | X_c \rangle \overset{(5.5)}{=} \langle X_b | [X_a, X_c] \rangle \overset{(5.6)}{=} f_{acd}\, \langle X_b | X_d \rangle = f_{acb}.
\tag{5.7}
\]

This constitutes the adjoint representation of g:

Definition 5.2 (Adjoint representation of a Lie algebra). The adjoint representation of a Lie algebra g with structure constants fabc is generated by the matrices

[Ta]bc ≡ facb . (5.8)

The generators Ta satisfy the commutator relation (5.6). This can be verified by substituting the definition (5.8) into (5.6) and applying the Jacobi identity (3.17) (see exercises). The dimension of the adjoint representation equals the dimension of the Lie group. This is similar to the regular representation encountered in the context of finite groups, where the elements of the group served as a basis of the linear space on which the representation acts. Now it is not the group elements but the generators that span the linear space.

⁹ We will see in a minute that the algebra has a canonical inner product given by the Cartan-Killing metric. Here the inner product is arbitrarily defined by our choice of basis.
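To make Definition 5.2 concrete, the following sketch (ours, not from the notes; numpy assumed) constructs the adjoint generators of su(2) from its structure constants f_abc = ε_abc and verifies that they satisfy the commutation relations (5.6):

```python
import numpy as np

# Structure constants of su(2): f_abc = eps_abc.
f = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    f[a, b, c], f[b, a, c] = 1.0, -1.0

# Adjoint generators (5.8): [T_a]_bc = f_acb.
T = [f[a].T for a in range(3)]       # transposing swaps the last two indices

# The T_a satisfy the same commutation relations (5.6): [T_a, T_b] = f_abc T_c.
for a in range(3):
    for b in range(3):
        lhs = T[a] @ T[b] - T[b] @ T[a]
        rhs = sum(f[a, b, c] * T[c] for c in range(3))
        assert np.allclose(lhs, rhs)
print("adjoint representation verified")
```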

5.2 Cartan-Killing metric

There is a natural metric on the space of generators:

Definition 5.3 (Cartan-Killing metric). The Cartan-Killing metric

γab := tr [Ta Tb] = [Ta]cd [Tb]dc = fadcfbcd (5.9) provides a scalar product on the space of generators of the adjoint representation. (Note that the trace tr is with respect to the matrix-indices of the generators [Ta]bc.)

Fact 5.1. As a side remark, let us mention that if g is a simple¹⁰ Lie algebra and one is not worried about the precise normalization, one may equally compute the Cartan-Killing metric by taking the trace γ_ab = c · tr[X_a X_b] in any other irreducible representation than the adjoint one, because they are all equal up to a positive real factor c.

A natural question concerns the freedom in writing down the generators T_a. Since the generators X_a constitute a basis of a linear space, there is the freedom of transforming to a new basis X′_a by performing an invertible linear transformation L:

\[
X'_a = L_{ba} X_b.
\tag{5.10}
\]

The change of basis induces a change in the structure constants,

\[
f_{abc} \mapsto f'_{abc} = L_{da} L_{eb}\, f_{def}\, [L^{-1}]_{cf}.
\tag{5.11}
\]

This is seen by applying the transformation (5.10) to the commutator

\[
[X'_a, X'_b] \overset{(5.6)}{=} f'_{abc} X'_c \overset{(5.10)}{=} f'_{abc} L_{fc} X_f \overset{!}{=} L_{da} L_{eb} [X_d, X_e] \overset{(5.6)}{=} L_{da} L_{eb}\, f_{def}\, X_f.
\tag{5.12}
\]

As a consequence the generators of the adjoint representation transform according to

\[
[T_a]_{bc} \mapsto [T'_a]_{bc} \overset{(5.8)}{=} f'_{acb} \overset{(5.11)}{=} L_{da} L_{fc}\, f_{dfe}\, [L^{-1}]_{be} \overset{(5.8)}{=} L_{da} L_{fc}\, [T_d]_{ef}\, [L^{-1}]_{be} = L_{da}\, [L^{-1} T_d L]_{bc}.
\tag{5.13}
\]

This, in turn, implies that the Cartan-Killing metric transforms as

\[
\gamma_{ab} = \operatorname{tr}[T_a T_b] \mapsto \gamma'_{ab} = \operatorname{tr}[T'_a T'_b] = L_{ca} L_{db} \operatorname{tr}\!\big[ L^{-1} T_c L\, L^{-1} T_d L \big] = L_{ca} L_{db}\, \gamma_{cd},
\tag{5.14}
\]

where we used the cyclicity of the trace. Since the Cartan-Killing metric is symmetric in the generator indices a, b, one may choose the basis transformation L such that after the transformation γ_ab is of the diagonal form

\[
\gamma_{ab} = \operatorname{tr}[T_a T_b] = \begin{pmatrix} -1_{n_-} & 0 & 0 \\ 0 & +1_{n_+} & 0 \\ 0 & 0 & 0_{n_0} \end{pmatrix}
\tag{5.15}
\]

for some integers n₋, n₊ and n₀ that add up to n. This fact about symmetric matrices may be familiar to you from general relativity (where the spacetime metric can be brought by a local basis transformation to Minkowski form, i.e. n₋ = 1, n₊ = 3) and is known under the name of Sylvester's law of inertia.

¹⁰ A Lie algebra is simple if its adjoint representation is irreducible. A more useful definition of a simple Lie algebra will be given in Chapter 8. For now it is sufficient to know that the Lie algebras su(N) for N ≥ 2 and so(N) for N ≥ 3 are all simple.

The block containing the negative eigenvalues is generated by the n_− “compact” (rotation-type) generators, the positive eigenvalues belong to the n_+ “non-compact” (boost-type) generators, and the zero entries originate from nilpotent generators, for which (X_a)^p = 0 for some p > 1. Thus the eigenvalues of the Cartan-Killing metric provide non-trivial information on the compactness of the Lie algebra.

Definition 5.4 (Compact Lie algebra). A Lie algebra g is compact if the Cartan-Killing metric is negative definite, meaning that in the diagonal basis γ_{ab} = k_a δ_{ab} one has k_a < 0 for all a.

Adopting the basis where the Cartan-Killing metric is diagonal has the consequence that the structure constants of a compact Lie algebra g become completely antisymmetric^11

f_{abc} = f_{bca} = f_{cab} = −f_{bac} = −f_{acb} = −f_{cba} . (5.16)

The proof of this property is rather instructive. It expresses the structure constants in terms of the Cartan-Killing metric. Normalizing the generators such that k_a = −1 for all T_a one has

tr([T_a, T_b] T_c) = f_{abd} tr(T_d T_c) = f_{abd} γ_{dc} = −f_{abc} , (5.17)

using (5.6), (5.9) and γ_{dc} = −δ_{dc}.

The cyclicity of the trace then shows that

f_{abc} = − tr([T_a, T_b] T_c)
        = − tr(T_a T_b T_c − T_b T_a T_c)
        = − tr(T_b T_c T_a − T_c T_b T_a)      (5.18)
        = − tr([T_b, T_c] T_a)
        = f_{bca} .

Together with the anti-symmetry of the structure constants in the first two indices, this establishes the relation (5.16).
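A numerical version of this argument (a sketch under the same su(2) assumptions as the earlier listings, with the generators rescaled so that k_a = −1):

import numpy as np

eps = lambda a, b, c: (a - b) * (b - c) * (c - a) / 2
T = np.array([[[eps(a, c, b) for c in range(3)] for b in range(3)] for a in range(3)])
T = T / np.sqrt(2.0)                      # rescale so that tr(T_a T_b) = -delta_ab

# f_abc = -tr([T_a, T_b] T_c), cf. (5.17)
f = np.array([[[-np.trace((T[a] @ T[b] - T[b] @ T[a]) @ T[c])
                for c in range(3)] for b in range(3)] for a in range(3)])

for a in range(3):
    for b in range(3):
        for c in range(3):
            assert np.isclose(f[a, b, c], -f[b, a, c])   # f_abc = -f_bac
            assert np.isclose(f[a, b, c], -f[a, c, b])   # f_abc = -f_acb
            assert np.isclose(f[a, b, c],  f[b, c, a])   # f_abc =  f_bca
print("structure constants are completely antisymmetric")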

5.3 Casimir operator

Suppose g is compact with generators Xa in any irreducible representation. Then we can introduce an important quadratic operator.

Definition 5.5 (Casimir operator). The (quadratic) Casimir operator is given by

C := γ^{ab} X_a X_b , (5.19)

where γ^{ab} = [γ^{−1}]_{ab} is the inverse of the Cartan-Killing metric γ_{ab}. Note that in general C ∉ g, since it is not built from commutators of the generators, but it makes perfect sense as a matrix acting on the same vectors as the X_a.

Theorem 5.1. The Casimir operator is independent of the choice of basis Xa and commutes with all generators, i.e. [C,Xa] = 0.

Proof. To see that it is independent of the basis let us look at a transformation X_a ↦ X'_a = L_{ba} X_b as before. Then

C ↦ C' = (γ')^{ab} X'_a X'_b = (γ')^{ab} L_{ca} X_c L_{db} X_d (5.20)
        = [L (L^T γ L)^{−1} L^T]_{cd} X_c X_d = γ^{cd} X_c X_d = C , (5.21)

where in the second line we used (5.14) in the form γ' = L^T γ L.

^11 This is the basis which is typically adopted when studying non-abelian gauge groups in quantum field theory.

Without loss of generality we may then assume that we have chosen generators such that γ_{ab} = −δ_{ab}. Since an overall constant is irrelevant for showing that C commutes, we may work with C = X_a X_a and find for any b,

[C, X_b] = [X_a X_a, X_b] = X_a [X_a, X_b] + [X_a, X_b] X_a (5.22)
         = X_a f_{abc} X_c + f_{abc} X_c X_a = 0 , (5.23)

where the last step uses (5.6) and the complete antisymmetry (5.16) of the structure constants.

Note that the Casimir operator does not just commute with all generators, but also with the group elements of G in the corresponding representation D. Indeed, using the exponential map (Definition 3.2) any element g ∈ G can be written as D(g) = exp(αaXa) and therefore satisfies

[C, D(g)] = [C, exp(α_a X_a)] = 0 (5.24)

by Theorem 5.1.

This puts us precisely in the setting of Schur’s Lemma, Theorem 2.5, which allows us to conclude that if D is irreducible the Casimir operator is proportional to the identity,

C = C_D · 1 , (5.25)

where C_D is a positive real number that only depends on the representation D. It can therefore be used to label or distinguish irreducible representations. As an example let us look at su(2) with the standard basis

[X_a, X_b] = ε_{abc} X_c . (5.26)

In the exercises you will determine that the corresponding Cartan-Killing metric is

γ_{ab} = −2 δ_{ab} . (5.27)

The Casimir operator is therefore

C = γ^{ab} X_a X_b = −(1/2)(X_1² + X_2² + X_3²) = (1/2)(J_1² + J_2² + J_3²) , with J_a = iX_a . (5.28)

Given what we know about the total angular momentum (or spin) in quantum mechanics it should not come as a surprise that it commutes with the generators J_a. To compute the value C_j in the spin-j representation it is sufficient to focus on a single state, say the highest-weight state |j⟩ = |j, m = j⟩. Using that

J^+J^− + J^−J^+ = (1/2)(J_1 + iJ_2)(J_1 − iJ_2) + (1/2)(J_1 − iJ_2)(J_1 + iJ_2) = J_1² + J_2² , (5.29)

cf. (4.7), we find

C|j⟩ = (1/2)(J_3² + J^+J^− + J^−J^+)|j⟩ = (1/2)(J_3² + [J^+, J^−] + 2J^−J^+)|j⟩ (5.30)
     = (1/2)(J_3² + J_3 + 2J^−J^+)|j⟩ = (1/2)(J_3² + J_3)|j⟩ , (5.31)

where we used (4.8) and (4.11), and therefore C_j = (1/2) j(j + 1). We see that the Casimir operator indeed distinguishes between the different irreducible representations of su(2). For larger Lie algebras, like su(N) with N ≥ 3, this is no longer the case, since irreducible representations are typically labeled by more than one parameter. Luckily, in general the quadratic Casimir operator C is not the only operator that commutes with all group elements. Further Casimir invariants that are polynomials in the X_a of order higher than two can then be constructed. It turns out that all Casimir invariants together do distinguish the irreducible representations.
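As a cross-check of C_j = (1/2) j(j + 1), here is a small numerical sketch (not part of the original notes). It builds the spin-s matrices in the standard quantum-mechanics convention, with ladder operators J_± = J_1 ± iJ_2 (these differ by a factor √2 from the J^± used above, but the combination J_1² + J_2² + J_3² is unaffected):

import numpy as np

def casimir_eigenvalue(s):
    """Eigenvalue of C = (J1^2 + J2^2 + J3^2)/2 in the spin-s representation."""
    dim = int(2 * s + 1)
    m = s - np.arange(dim)                        # m = s, s-1, ..., -s
    J3 = np.diag(m).astype(complex)
    Jp = np.zeros((dim, dim), dtype=complex)      # Jp = J1 + i J2
    for k in range(1, dim):
        Jp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    Jm = Jp.conj().T
    J1, J2 = (Jp + Jm) / 2, (Jp - Jm) / 2j
    C = (J1 @ J1 + J2 @ J2 + J3 @ J3) / 2
    assert np.allclose(C, C[0, 0] * np.eye(dim))  # C is proportional to the identity
    return C[0, 0].real

for s in (0.5, 1.0, 1.5, 2.0):
    print(s, casimir_eigenvalue(s), 0.5 * s * (s + 1))   # the last two columns agree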

6 Root systems, simple roots and the Cartan matrix

6.1 Cartan subalgebra

Central to the classification of Lie algebras and their representations is the selection of a maximal set of commuting generators. Mathematically this is described by the Cartan subalgebra of a Lie algebra. First we need to know what a subalgebra is.

Definition 6.1 (Subalgebra). A subalgebra h of a Lie algebra g is a linear subspace h ⊂ g on which the Lie bracket closes: [X,Y ] ∈ h for all X,Y ∈ h. This can also be denoted

[h, h] ⊂ h. (6.1)

If H_i ∈ g for i = 1, . . . , m form a collection of commuting Hermitian generators, [H_i, H_j] = 0 for all i and j, then the H_i are called Cartan generators. Trivially they span a subalgebra of g.

Remark 6.1. Here you may rightfully object that the Lie algebras g we have encountered so far do not contain any Hermitian generators (except the trivial 0 ∈ g). Indeed, in the case of the special unitary and orthogonal groups, the generators X_a were antihermitian, X_a† = −X_a. In order to find Hermitian generators we have to allow taking complex linear combinations β_a X_a, β_a ∈ C, instead of just real linear combinations α_a X_a ∈ g, α_a ∈ R. In mathematical terms such complex linear combinations take value in the complexified Lie algebra g_C = g + ig. Note that we have already implicitly used such generators in Section 4.2 when introducing the lowering and raising operators J_± = −(i/√2)(X_1 ± iX_2), which are elements of g_C but not of g (they are not antihermitian). When dealing with the classification of Lie algebras it is typically a lot easier to work with g_C. We will do so implicitly in these lectures even when we write just g!

Definition 6.2 (Cartan subalgebra). A subalgebra h of a Lie algebra g is a Cartan subalgebra of g if [h, h] = 0 and it is maximal, in the sense that there are no elements X ∈ g \ h such that [h,X] = 0. Although a Lie algebra g can have many Cartan subalgebras, they are all equivalent in the sense that they can be related by a basis transformation of g. Therefore we often talk about the Cartan subalgebra of g.

Definition 6.3 (Rank). The rank of g is the dimension of its Cartan subalgebra.

The Cartan generators H_i form a maximal set of generators that can be simultaneously diagonalized. Hence, we may choose a basis {|µ, x⟩} of states such that

H_i |µ, x⟩ = µ_i |µ, x⟩ , (6.2)

where x is some additional label to specify the state (in case of multiplicities).

Definition 6.4 (Weights). The vector of eigenvalues µ = (µ1, . . . , µr) of a state is called a weight or weight vector.

Example 6.1 (su(2)). The rank of su(2) is 1 since no two generators commute. The Cartan generator (spanning the Cartan subalgebra of su(2)) is typically chosen to be J3 = iX3. The weights of the spin-j representation of su(2) are simply j, j − 1,..., −j.

6.2 Roots

For the remainder of this Chapter we focus on the adjoint representation of a compact Lie algebra g, where states |X⟩ are labeled by generators X ∈ g. Suppose g has rank r. States associated to Cartan generators have weight vector (0, . . . , 0) ∈ R^r, since for all i, j = 1, . . . , r we have

H_i |H_j⟩ = |[H_i, H_j]⟩ = 0 , (6.3)

where we used (5.5) and the commutativity of the Cartan generators.

Conversely, by the maximality criterion (Definition 6.2), any state with zero weight vector is a Cartan generator. The non-Cartan states can be taken to be simultaneous eigenstates of the Cartan generators

H_i |E_α⟩ = α_i |E_α⟩ for all i = 1, . . . , r. (6.4)

Later we will see that the eigenstate |E_α⟩ is uniquely specified (up to scalar multiplication) by the weight vector α = (α_1, . . . , α_r).

Definition 6.5 (Roots). A nonvanishing weight vector α = (α_1, . . . , α_r) of the adjoint representation is called a root. At the level of g, (6.4) implies that

[H_i, E_α] = α_i E_α . (6.5)

Thus the generators Eα act similarly to the raising and lowering operators we saw in the case of su(2). Notice that, contrary to the Cartan generators, the generators Eα cannot be Hermitian. Indeed, taking the conjugate of (6.5) we find

[H_i, E_α†] = −α_i E_α† . (6.6)

Instead we can choose the normalization such that E_α† = E_{−α}, similarly to the relation J_+† = J_− between the raising and lowering operators in su(2). The Cartan-Killing metric induces a scalar product on the space of states

⟨Y|X⟩ = tr( Y† X ) , (6.7)

where we have included the conjugate transpose since we wish to compute inner products of non-Hermitian operators E_α as well. The states |H_i⟩ and |E_α⟩ can then be normalized such that

⟨H_i|H_j⟩ = δ_{ij} ,   ⟨E_α|E_β⟩ = δ_{αβ} , (6.8)

where δ_{αβ} := ∏_{i=1}^r δ_{α_i β_i}. The set {H_i} ∪ {E_α} then forms an orthonormal basis of g.

6.3 Root systems

It turns out that the collection of root vectors α satisfies very rigid conditions, which we capture in the following definition. Here we use the notation |α|² = α · α and α · β = α_i β_i.

Definition 6.6 (Root system). A root system Φ of rank r is a finite subset of non-zero vectors in R^r satisfying
(i) Φ spans R^r.
(ii) The only scalar multiples of α ∈ Φ that also belong to Φ are ±α.
(iii) For any α, β ∈ Φ the ratio (α · β)/(α · α) is a half-integer.

(iv) For every α, β ∈ Φ the Weyl reflection β − 2 (α · β)/(α · α) α of β in the hyperplane perpendicular to α is also in Φ.

We will now show that

Theorem 6.1. The set {α} of roots of a compact Lie algebra g forms a root system.

We will verify the properties one by one. Property (i) follows from the compactness of g:

Proof of property (i). Suppose the roots do not span R^r. Then there exists a nonzero vector β such that β · α = 0 for all α ∈ Φ. But that means that all states |E_α⟩ are annihilated by β_i H_i, namely β_i H_i |E_α⟩ = β · α |E_α⟩ = 0, and the same is true by definition for the states |H_j⟩. Since those states together form a basis of g, it follows that [β_i H_i, g] = 0. Since β_i H_i is represented by the zero matrix in the adjoint representation, it has vanishing norm in the Cartan-Killing metric, in contradiction with the compactness of g.

The next steps rely on the fact that to each pair ±α of roots one may associate an su(2)-subalgebra. If α is a root, then the commutator [E_α, E_{−α}] has to lie in the Cartan subalgebra h. To see this we use the Lie bracket (6.5) to compute

H_i |[E_α, E_{−α}]⟩ = H_i E_α |E_{−α}⟩ = (E_α H_i + α_i E_α)|E_{−α}⟩ = (−α_i + α_i) E_α |E_{−α}⟩ = 0 . (6.9)

On the other hand, using (6.8),

⟨H_i|[E_α, E_{−α}]⟩ = tr(H_i [E_α, E_{−α}])
                    = tr(H_i E_α E_{−α}) − tr(H_i E_{−α} E_α)
                    = tr(E_{−α} [H_i, E_α])      (6.10)
                    = α_i tr(E_{−α} E_α)
                    = α_i ⟨E_α|E_α⟩ = α_i .

Since the H_i form an orthonormal basis of the Cartan subalgebra we conclude that

[E_α, E_{−α}] = α_i H_i . (6.11)

On the basis of this relation we introduce the su(2) generators E_± and E_3 by

E_± := |α|^{−1} E_{±α} ,   E_3 := |α|^{−2} α_i H_i . (6.12)

It is straightforward to check that they indeed satisfy the su(2) algebra

[E_+, E_−] = E_3 ,   [E_3, E_±] = ±E_± . (6.13)

Let us use this structure to verify that the root vector α specifies the generator E_α uniquely. Suppose that there exists another state |E'_α⟩ with root vector α, orthogonal to |E_α⟩. Then the state E_{−α}|E'_α⟩ has zero weight vector and is thus a Cartan generator, which can be identified following a computation similar to (6.10):

⟨H_i| E_{−α} |E'_α⟩ = −α_i tr(E_{−α} E'_α) = 0 . (6.14)

Thus the projection on the Cartan subalgebra vanishes. As a consequence |E'_α⟩ must be a lowest weight state, E_−|E'_α⟩ = 0. But

E_3 |E'_α⟩ = |α|^{−2} α_i H_i |E'_α⟩ = |E'_α⟩ . (6.15)

Thus the state |E'_α⟩ has an E_3-eigenvalue m = +1. This is in contradiction with the fact shown in Chapter 4 that the lowest weight state of an su(2)-representation must have a non-positive E_3-eigenvalue (namely −j in the spin-j representation). Thus we conclude that one cannot have two generators E_α and E'_α corresponding to the same root. Similar arguments lead to the following.

Proof of property (iii). Let us consider a root β different from α. From (6.12) it follows that

E_3 |E_β⟩ = |α|^{−2} α_i H_i |E_β⟩ = (α · β)/(α · α) |E_β⟩ . (6.16)

But from su(2) representation theory the eigenvalue (α · β)/(α · α) has to be a half-integer.

Proof of property (ii). Notice that it is sufficient to show that if α is a root, then tα for t > 1 cannot be a root. To this end, suppose to the contrary that α' = tα is a root. Applying property (iii) to α and tα we find that both t and 1/t must be half-integers, so the only option is t = 2.

Let us consider the su(2)-subalgebra (6.12) associated to the root α. The state |E_{α'}⟩ has E_3-eigenvalue 2 and therefore it has a component in a spin-j representation for some integer j ≥ 2. Acting with the lowering operator E_− on this component we construct a state with root vector α. But this state is not a multiple of |E_α⟩, since |E_α⟩ lives in the spin-1 representation (because E_+|E_α⟩ = 0 and E_3|E_α⟩ = |E_α⟩). We have shown that this is impossible, finishing our proof by contradiction.

Proof of property (iv). Suppose α and β are two distinct roots. The state |E_β⟩ has E_3-eigenvalue m = (α · β)/(α · α) with respect to the su(2)-representation (6.12), with m a half-integer. Hence |E_β⟩ can be decomposed into components transforming in various spin-j representations for j ≥ |m|. If m ≥ 0 then acting 2m times with the lowering operator produces a non-vanishing state E_−^{2m}|E_β⟩ satisfying

H_i E_−^{2m}|E_β⟩ = −2m α_i E_−^{2m}|E_β⟩ + E_−^{2m} H_i |E_β⟩ = (β_i − 2m α_i) E_−^{2m}|E_β⟩ . (6.17)

It therefore has root equal to β − 2mα = β − 2(α · β)/(α · α) α. Similarly, if m < 0 one may act 2|m| times with the raising operator to produce a non-vanishing state E_+^{2|m|}|E_β⟩ with root β + 2|m|α = β − 2(α · β)/(α · α) α.

We conclude that the roots of any compact Lie algebra g form a root system Φ with rank r equal to the rank of g, i.e. the dimension of its Cartan subalgebra. Although we will not prove this in detail, any root system Φ characterizes a unique compact Lie algebra.

Fact 6.1. Compact Lie algebras g are in 1-to-1 correspondence with root systems Φ.

6.4 Angles between roots

Property (iii) has a direct consequence for the angle θ_{αβ} between any two roots α, β ∈ Φ for which α ≠ ±β. Not only (α · β)/|α|² but also (α · β)/|β|² must be a half-integer. Hence, by the cosine rule,

4 cos² θ_{αβ} = 4 (α · β)² / (|α|² |β|²) = [2(α · β)/|α|²] · [2(α · β)/|β|²] ∈ Z (6.18)

is an integer. By inspection one deduces that there are only a few possibilities for the angle θ_{αβ} and the length ratio |α|/|β|, which are summarized as follows:

4 cos² θ_{αβ}    θ_{αβ}          |α|²/|β|²
0                π/2             arbitrary
1                π/3 , 2π/3      1
2                π/4 , 3π/4      2 or 1/2        (6.19)
3                π/6 , 5π/6      3 or 1/3
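As an illustration (a sketch, not part of the original notes), one may tabulate 4 cos² θ for the root system of su(3) given in Section 7.2 below. Since all six roots there have length 1, only the value 4 cos² θ = 1 (θ = π/3 or 2π/3) occurs for non-opposite pairs:

import numpy as np
from itertools import combinations

# the six roots of su(3), cf. (7.7) below (an assumption of this sketch)
base = [np.array([1.0, 0.0]),
        np.array([0.5, np.sqrt(3) / 2]),
        np.array([-0.5, np.sqrt(3) / 2])]
roots = base + [-r for r in base]

for a, b in combinations(roots, 2):
    if np.allclose(a, -b):
        continue                     # skip the pairs alpha, -alpha
    four_cos2 = 4 * np.dot(a, b) ** 2 / (np.dot(a, a) * np.dot(b, b))
    theta = np.degrees(np.arccos(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))))
    print(f"4cos^2 = {four_cos2:.0f}, theta = {theta:.0f} deg")   # always 1, at 60 or 120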

6.5 Simple roots

In order to proceed to classify root systems it is useful to adopt a convention as to which of the two roots ±α should be interpreted as a raising operator.

Definition 6.7 (Positive root). A root vector α = (α_1, . . . , α_r) is called positive if its first non-zero component is positive, and negative otherwise.

If α is positive we call |α|^{−1} E_α a raising operator, while if α is negative |α|^{−1} E_α is a lowering operator.

Definition 6.8 (Simple root). A simple root of g is a positive root that is not a sum of two other positive roots. Simple roots will be highlighted with a hat on top of the root vector in the sequel.

Lemma 6.1. The simple roots satisfy the following important properties:
(i) A root system of rank r has precisely r simple roots, which are linearly independent. They thus form a basis of R^r.
(ii) All other roots can be obtained from successive Weyl reflections of simple roots (see property (iv) of the root system).
(iii) If α̂ and β̂ are simple roots then α̂ − β̂ is not a root.
(iv) If α̂ and β̂ are different simple roots, then their scalar product is nonpositive, α̂ · β̂ ≤ 0.

Part (i) is proved in the exercises and we skip the proof of (ii). The two remaining parts are easily deduced as follows.

Proof of Lemma 6.1(iii). Assume that α̂ and β̂ are simple roots and α̂ − β̂ is a root. Then either α̂ − β̂ or β̂ − α̂ is a positive root. But then either α̂ = (α̂ − β̂) + β̂ or β̂ = (β̂ − α̂) + α̂ can be written as the sum of two positive roots. This contradicts the assumption that α̂ and β̂ are simple.

Proof of Lemma 6.1(iv). From (iii) it follows that E_{−α̂}|E_β̂⟩ = |[E_{−α̂}, E_β̂]⟩ = 0, since β̂ − α̂ cannot be a root. Hence |E_β̂⟩ is a lowest-weight state in the su(2)-representation of α̂, but then its E_3-eigenvalue α̂ · β̂/(α̂ · α̂) cannot be positive.

The information about the r simple roots α̂^i, i = 1, . . . , r, is conveniently stored in the Cartan matrix:

Definition 6.9 (Cartan matrix). Given the r simple roots of a simple, compact Lie algebra, the r × r-dimensional Cartan matrix A is defined by the matrix elements

A_{ij} = 2 α̂^i · α̂^j / |α̂^j|² . (6.20)

From the definition it follows that
• the diagonal elements of the Cartan matrix are always 2;
• the off-diagonal elements encode the angles and relative lengths of the simple roots. Assuming A_{ij} ≤ A_{ji} the possible values are

(A_{ij}, A_{ji})   (0, 0)   (−1, −1)   (−2, −1)   (−3, −1)
angle              π/2      2π/3       3π/4       5π/6           (6.21)

Example 6.2 (Cartan matrix of su(3)). The two simple roots are

α̂¹ = (1/2, √3/2) ,   α̂² = (1/2, −√3/2) . (6.22)

Evaluating the scalar products of the roots gives

|α̂¹|² = |α̂²|² = 1 ,   α̂¹ · α̂² = −1/2 . (6.23)

Substituting these values into the definition of the Cartan matrix (6.20) gives

A = (  2  −1
      −1   2 ) . (6.24)
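The same matrix follows from a few lines of code (a sketch, not part of the original notes):

import numpy as np

simple = np.array([[0.5,  np.sqrt(3) / 2],    # simple root alpha^1, cf. (6.22)
                   [0.5, -np.sqrt(3) / 2]])   # simple root alpha^2
A = np.array([[2 * ai @ aj / (aj @ aj) for aj in simple] for ai in simple])
print(A)    # [[ 2. -1.]
            #  [-1.  2.]]   reproducing (6.24)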

7 Irreducible representations of su(3) and the quark model

After SU(2), whose representations we have already studied, SU(3) is arguably the most important (simple) compact Lie group in particle physics. The simplest application is the quark model that describes mesons and baryons in terms of their constituent quarks. The main idea is that the three lightest quarks, the up, down and strange quark, are related by SU(3) flavour symmetry, spanning a 3-dimensional vector space that transforms under the fundamental representation. Their anti-particles transform in the anti-fundamental representation (to be defined below). Any composite states of quarks and anti-quarks live in tensor products of these representations, which necessarily decompose into irreducible representations of SU(3). The similarities between the masses of certain mesons and baryons can be explained by recognizing the particles with similar masses as living in the same irreducible representation. To understand this grouping let us start by classifying the irreducible representations.

Remark 7.1. This approximate flavour-SU(3) symmetry is unrelated to the exact color-SU(3) symmetry of the Standard Model gauge group SU(3) × SU(2) × U(1) that was discovered later.

7.1 The algebra

As we have seen in Section 3.3, the fundamental representation of the Lie algebra su(3) is given by the traceless antihermitian 3 × 3 matrices,

su(3) = { X ∈ C^{3×3} : X† = −X , tr X = 0 } .

A basis of Hermitian traceless 3 × 3 matrices, analogous to the Pauli matrices of su(2), is given by the Gell-Mann matrices (rows separated by semicolons)

λ_1 = [ 0 1 0 ; 1 0 0 ; 0 0 0 ] ,   λ_2 = [ 0 −i 0 ; i 0 0 ; 0 0 0 ] ,   λ_3 = [ 1 0 0 ; 0 −1 0 ; 0 0 0 ] ,
λ_4 = [ 0 0 1 ; 0 0 0 ; 1 0 0 ] ,   λ_5 = [ 0 0 −i ; 0 0 0 ; i 0 0 ] ,   λ_6 = [ 0 0 0 ; 0 0 1 ; 0 1 0 ] ,   (7.1)
λ_7 = [ 0 0 0 ; 0 0 −i ; 0 i 0 ] ,  λ_8 = (1/√3) [ 1 0 0 ; 0 1 0 ; 0 0 −2 ] .

They are normalized such that tr(λ_a λ_b) = 2 δ_{ab}. It is customary to take the generators in the fundamental representation to be

X_a ≡ −i T_a ≡ −(i/2) λ_a ,   tr(T_a T_b) = (1/2) δ_{ab} ,   a, b = 1, . . . , 8. (7.2)

The structure constants f_{abc} are completely antisymmetric and the only nonvanishing constants (with indices in ascending order) are

f_{123} = 1 ,   f_{147} = f_{246} = f_{257} = f_{345} = −f_{156} = −f_{367} = 1/2 ,   f_{458} = f_{678} = √3/2 . (7.3)

Any other set of Hermitian matrices T_1, . . . , T_8 satisfying

[T_a, T_b] = i f_{abc} T_c (7.4)

therefore determines a representation X_a ≡ −iT_a of su(3) (and of SU(3) via the exponential map).
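Since tr(λ_c λ_d) = 2δ_{cd}, the relation [λ_a, λ_b] = 2i f_{abc} λ_c can be inverted to f_{abc} = −(i/4) tr([λ_a, λ_b] λ_c). A sketch (not part of the original notes) recovering the values (7.3) this way:

import numpy as np

l = np.zeros((8, 3, 3), dtype=complex)          # the Gell-Mann matrices (7.1)
l[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
l[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
l[2] = np.diag([1, -1, 0])
l[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
l[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
l[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
l[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
l[7] = np.diag([1, 1, -2]) / np.sqrt(3)

def f(a, b, c):
    """Structure constant f_abc with 1-based indices as in (7.3)."""
    a, b, c = a - 1, b - 1, c - 1
    comm = l[a] @ l[b] - l[b] @ l[a]
    return (-0.25j * np.trace(comm @ l[c])).real

print(f(1, 2, 3))               # 1.0
print(f(1, 4, 7), f(3, 4, 5))   # 0.5 0.5
print(f(4, 5, 8), f(6, 7, 8))   # 0.866... = sqrt(3)/2, twice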

7.2 Cartan subalgebra and roots

By examining the structure constants (7.3) one may observe that at most two generators can be found to commute, meaning that the Lie algebra su(3) is of rank 2. The Cartan subalgebra is conventionally taken to be spanned by the generators H_1 = T_3 and H_2 = T_8, which in the fundamental representation are diagonal. These generators have clear physical interpretations in the quark model, since

I_3 = H_1 = T_3 ,   Y = (2/√3) H_2 = (2/√3) T_8 ,   Q = I_3 + Y/2 (7.5)

are respectively the isospin, hypercharge and electric charge generators of the quarks.

The remaining generators can be combined into raising and lowering operators E_α,

E_{±α¹} = (1/√2)(T_1 ± iT_2) ,   E_{±α²} = (1/√2)(T_4 ± iT_5) ,   E_{±α³} = (1/√2)(T_6 ± iT_7) (7.6)

with corresponding roots

α¹ = (1, 0) ,   α² = (1/2, √3/2) ,   α³ = (−1/2, √3/2) . (7.7)

The root diagram therefore looks as follows.
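These roots can be cross-checked directly in the fundamental representation. A sketch (not part of the original notes), verifying [H_i, E_α] = α_i E_α for the three positive roots:

import numpy as np

# Gell-Mann matrices, as in the previous sketch
l = np.zeros((8, 3, 3), dtype=complex)
l[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]];    l[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
l[2] = np.diag([1, -1, 0]);                  l[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
l[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]; l[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
l[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]; l[7] = np.diag([1, 1, -2]) / np.sqrt(3)
T = l / 2                                     # T_a = lambda_a / 2
H = (T[2], T[7])                              # H_1 = T_3, H_2 = T_8

r3 = np.sqrt(3)
ladder = {(1.0, 0.0):    (T[0] + 1j * T[1]) / np.sqrt(2),   # E_{alpha^1}
          (0.5, r3 / 2): (T[3] + 1j * T[4]) / np.sqrt(2),   # E_{alpha^2}
          (-0.5, r3 / 2): (T[5] + 1j * T[6]) / np.sqrt(2)}  # E_{alpha^3}

for alpha, E in ladder.items():
    for i in (0, 1):
        comm = H[i] @ E - E @ H[i]
        assert np.allclose(comm, alpha[i] * E)   # [H_i, E_alpha] = alpha_i E_alpha
print("roots (7.7) confirmed")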

7.3 Representations

Let us assume T_1, . . . , T_8 are Hermitian and form an irreducible representation of su(3). Recall from (6.13) that for each root α^k, the operators E_{±α^k} and α^k_i H_i form an su(2)-subalgebra

[E_{α^k}, E_{−α^k}] = α^k_i H_i ,   [α^k_i H_i, E_{±α^k}] = ±E_{±α^k} , (7.8)

which may or may not be reducible (note that |α^k| = 1). In any case our knowledge about su(2)-representations implies that the eigenvalues of α^k_i H_i are half-integers. These eigenvalues are precisely the inner products α^k · µ of the roots with the weights µ = (µ_1, µ_2) ∈ R² of the representation. Hence, the weights must lie on the vertices of the triangular lattice depicted in the figure below. If µ is a weight, meaning that H_i|Ψ⟩ = µ_i|Ψ⟩ for some state |Ψ⟩, then E_{±α^k}|Ψ⟩ is a state with weight µ ± α^k provided E_{±α^k}|Ψ⟩ ≠ 0. Using a similar reasoning as for the roots in Section 6.3 it follows that if α^k · µ > 0 then all of µ, µ − α^k, µ − 2α^k, . . . , µ − 2(α^k · µ)α^k are also weights. In particular, the set of weights is invariant under the Weyl reflections µ → µ − 2(α^k · µ)α^k, i.e. reflections in the planes orthogonal to the roots α¹, α² and α³. Moreover, any two states with different weights can be obtained from each other by actions of the operators E_{±α^k}, because the collection of all states accessible from some initial state spans an invariant subspace, which has to be the full vector space because of irreducibility. Therefore the weights must differ by integer multiples of the α^k. Combining these observations we conclude that the set of weights necessarily occupies a convex polygon that is invariant under Weyl reflections (thus either a hexagon or a triangle), like in the following figure.

7.4 Highest weight states

In Chapter 4 we have seen that irreducible representations of su(2) are characterized by their highest weights (j in the spin-j representation). Something similar is true for su(3), but now the weights are two-dimensional, so we need to specify what it means for a weight to be highest. To this end we make use of the simple roots of su(3), which we have determined to be

α̂¹ = α² = (1/2, √3/2) ,   α̂² = −α³ = (1/2, −√3/2) (7.9)

in Example 6.2.

Definition 7.1 (Highest weight). A state |Ψ⟩ that is annihilated by the raising operators E_{α̂¹} and E_{α̂²} is called a highest weight state. Its weight is called the highest weight.

Any irreducible representation can be shown to have a unique highest weight with multiplicity one. In the weight diagram above it corresponds to the right-most weight (i.e. largest H_1-eigenvalue).

Definition 7.2 (Label). The label (p, q) of an irreducible representation is given in terms of the highest weight µ by

p = 2 α̂¹ · µ ,   q = 2 α̂² · µ . (7.10)

The irreducible representations of su(3) satisfy the following properties, which we do not prove here.
• Each label (p, q), p, q = 0, 1, 2, . . ., corresponds to a unique irreducible representation of su(3).
• The set of weights of the representation with label (p, q) can be determined as follows. By inverting (7.10) the highest weight is µ = ((p + q)/2, (p − q)/(2√3)). The Weyl reflections of µ determine the corners of a convex polygon P, that is either a hexagon or a triangle. The complete set of weights then corresponds to those vectors of the form µ + n_1 α̂¹ + n_2 α̂² with n_1, n_2 ∈ Z that are contained in P.
• The multiplicities of the weights (i.e. the number of linearly independent states with that weight) can be deduced from the diagram. The weights occur in nested “rings” that are either hexagonal (the outer ones) or triangular (the inner ones). The weights on the outer ring have multiplicity 1. Each time one moves inwards from a hexagonal ring to the next ring the multiplicity increases by 1. Inside triangular rings the multiplicities do not change anymore. Examples of weight diagrams with small labels are shown below. The highest weights are circled (in blue), and multiplicities are indicated on the weights when they are larger than 1.

As one may verify for these examples, the dimension of the representation is generally given by

N = (1/2)(p + 1)(q + 1)(p + q + 2) (7.11)

(a quick numerical check of this formula is given after the list below). For p ≠ q the two irreducible representations (p, q) and (q, p) are not equivalent but related by complex conjugation. To see this, let T_a be the generators of the irreducible (p, q)-representation. Since [T_a, T_b] = i f_{abc} T_c with real structure constants f_{abc} we have that

[−T_a*, −T_b*] = [T_a, T_b]* = i f_{abc} (−T_c*) , (7.12)

showing that the matrices −T_a* also generate a representation, which is automatically irreducible too. All eigenvalues of H_1 = T_3 and H_2 = T_8 have changed sign, thus changing the set of weights to those of the (q, p) representation. The irreducible representations are often denoted by their dimension (7.11) in bold face N instead of their label (p, q), where if p > q the representation N corresponds to (p, q) and N̄ to (q, p). Let us discuss the interpretation of some of these representations in the quark model.
• 3, (p, q) = (1, 0): This is the fundamental representation (7.2) of su(3), as can be easily checked by determining the eigenvalues of H_1 = λ_3/2 and H_2 = λ_8/2. The three basis states |u⟩, |d⟩, |s⟩ correspond to the up, down and strange quark respectively. Their arrangement determines the values of the isospin and charge via (7.5).
• 3̄, (p, q) = (0, 1): the complex conjugate representation or anti-fundamental representation. Since the corresponding states have opposite quantum numbers as compared to the fundamental representation, it stands to reason that they correspond to the anti-quarks |ū⟩, |d̄⟩, |s̄⟩.

• 8,(p, q) = (1, 1): this is the octet or adjoint representation, since the non-zero weights are exactly the roots of su(3). It appears in the decomposition 3 ⊗ 3¯ = 8 ⊕ 1 of the tensor product of the fundamental and anti-fundamental representation, and therefore describes bound states of a quark with an anti-quark, which are mesons. It also appears in the decomposition 3 ⊗ 3 ⊗ 3 = 10 ⊕ 8 ⊕ 8 ⊕ 1 of the tensor product of three fundamental representations, describing bound states of three quarks, which are baryons. The baryons in the adjoint representation have spin 1/2.

• 10, (p, q) = (3, 0): the decuplet representation also appears in the decomposition of 3 ⊗ 3 ⊗ 3 and describes the spin-3/2 baryons.
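A quick numerical check of the dimension formula (7.11) and of the tensor product decompositions quoted above (a sketch, not part of the original notes):

def dim_su3(p, q):
    """Dimension N of the su(3) irrep with label (p, q), cf. (7.11)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

for label in [(0, 0), (1, 0), (0, 1), (1, 1), (3, 0)]:
    print(label, dim_su3(*label))                   # 1, 3, 3, 8, 10

# dimensions match 3 x 3bar = 8 + 1 and 3 x 3 x 3 = 10 + 8 + 8 + 1
assert dim_su3(1, 0) * dim_su3(0, 1) == dim_su3(1, 1) + dim_su3(0, 0)
assert dim_su3(1, 0) ** 3 == dim_su3(3, 0) + 2 * dim_su3(1, 1) + dim_su3(0, 0)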

We close this section with several remarks concerning the quark model.
• The existence of quarks was unknown in the early sixties. It is on the basis of the nice grouping (the Eightfold Way) of the discovered mesons and baryons into octets and decuplets that Gell-Mann and Zweig (independently in 1964) postulated the existence of more fundamental particles living in the fundamental representation of su(3) (dubbed “quarks” by Gell-Mann).
• The su(3) flavor symmetry is only an approximate symmetry of the standard model (contrary to the su(3) color symmetry of QCD) because, among other reasons, the quark masses are quite different (m_u = 2.3 MeV, m_d = 4.8 MeV, m_s = 95 MeV). Since the difference in masses between the up and down quark is much smaller than between the up and strange quark, one would expect that the symmetry relating the up and down quarks is more accurate. This smaller symmetry corresponds precisely to the su(2)-subalgebra associated to the root α¹ = (1, 0), which has the isospin I_3 as Cartan generator. Indeed, the masses of the mesons and baryons in the corresponding su(2)-multiplets (horizontal lines in the diagrams) are quite close, e.g. the proton and neutron among the spin-1/2 baryons.

8 Classification of compact, simple Lie algebras

8.1 Simple Lie algebras

Recall (Definition 5.4) that a Lie algebra is called compact if its Cartan-Killing metric is negative definite. An important result, that we will not prove, is that any compact Lie algebra g decomposes as a direct sum of compact, simple Lie algebras

g = g_1 ⊕ · · · ⊕ g_k . (8.1)

Hence, in order to classify compact Lie algebras it is sufficient to classify compact, simple Lie algebras and then consider all possible direct sums. In order to introduce the concept of simple Lie algebras, we need a few definitions. Recall that a subspace h ⊂ g is called a subalgebra of g if [h, h] ⊂ h. A stronger condition is the following.

Definition 8.1 (Invariant subalgebra). A subalgebra h of a Lie algebra g is called invariant if [X,Y ] ∈ h for all X ∈ h and Y ∈ g, i.e. [h, g] ⊂ h. (8.2)

For a Lie algebra g with a basis X1,...,Xn of generators and corresponding structure constants fabc the k-dimensional subspace spanned by X1,...,Xk is a subalgebra precisely when

fijα = 0 for 1 ≤ i, j ≤ k, k < α ≤ n, (8.3) while it is invariant if in addition

fiβα = 0 for 1 ≤ i ≤ k, k < α, β ≤ n. (8.4)

Definition 8.2 (Simple Lie algebra). A Lie algebra g is simple if it has no invariant subalgebras (other than {0} and g itself). Equivalently, a Lie algebra is simple precisely when its adjoint representation is irreducible. Indeed, h is an invariant subalgebra precisely when X|Y i = |[X,Y ]i ∈ h for any X ∈ g and Y ∈ h. This is precisely the condition for h ⊂ g to be an invariant subspace for the adjoint representation.

8.2 Dynkin diagrams

In the previous chapter we have seen that compact Lie algebras can be characterized in various equivalent ways (with increasing simplicity) by
• the root system Φ = {α^i};
• the collection of simple roots {α̂^i};
• the Cartan matrix A_{ij} = 2 α̂^i · α̂^j / |α̂^j|².
The Cartan matrix is often visualized via its Dynkin diagram.

Definition 8.3 (Dynkin diagram). The Dynkin diagram describing a Lie algebra is constructed as follows:
• simple roots α̂^i are represented by nodes;
• max{|A_{ij}|, |A_{ji}|} gives the number of lines connecting the nodes representing the roots α̂^i and α̂^j;
• if the roots connected by lines have different length, one adds an arrow pointing towards the shorter root (hint: think of the arrow as a “greater than” symbol >).

In the exercises it is shown that the Dynkin diagram of a simple compact Lie algebra is connected. Otherwise the Dynkin diagram consists of a number of connected components corresponding to the simple compact Lie algebras into which it decomposes as a direct sum. From the definition it follows that there is only one Dynkin diagram of rank 1 and that there are four diagrams of rank 2:

Four of these correspond to familiar Lie algebras, while G2 is the first exceptional Lie algebra that we encounter. If we go to higher rank, then not every graph built from three or more nodes connected by (multiple) links corresponds to the Dynkin diagram of a compact Lie algebra. Indeed, there have to exist r linearly independent vectors α̂^i whose inner products give rise to the links in the diagram. Luckily, this is the only criterion that a diagram has to satisfy to qualify as a Dynkin diagram. As we will see now, it puts severe restrictions on the types of graphs that can appear.
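The translation from Cartan matrix to diagram is easily automated. A sketch (not part of the original notes; the rank-2 Cartan matrices below are the standard ones for A2, B2 and G2):

def dynkin_edges(A):
    """List (i, j, lines, shorter) for each connected pair of nodes,
    where `shorter` is the node the arrow points to (None if equal lengths)."""
    r = len(A)
    edges = []
    for i in range(r):
        for j in range(i + 1, r):
            lines = max(abs(A[i][j]), abs(A[j][i]))
            if lines == 0:
                continue
            if abs(A[i][j]) == abs(A[j][i]):
                shorter = None      # equal lengths: no arrow
            elif abs(A[i][j]) > abs(A[j][i]):
                shorter = j         # |alpha^i| > |alpha^j|: arrow points to j
            else:
                shorter = i
            edges.append((i, j, lines, shorter))
    return edges

for name, A in [("A2", [[2, -1], [-1, 2]]),
                ("B2", [[2, -2], [-1, 2]]),
                ("G2", [[2, -3], [-1, 2]])]:
    print(name, dynkin_edges(A))    # 1, 2 and 3 connecting lines respectively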

8.3 Classification of compact, simple Lie algebras In this section we focus only on connected Dynkin diagrams, since we want to classify simple Lie algebras. For the moment we also forget about the arrows on the Dynkin diagrams, since we are just going to consider what angles between the simple roots are possible.

Lemma 8.1 (Dynkin diagrams of rank 3). The only Dynkin diagrams of rank 3 are

Proof. This results from the fact that the sum of the angles between 3 linearly independent vectors must be less than 360°. Computing the angles in the first diagram, one has 120° + 120° + 90° = 330°. The analogous computation for the second diagram yields 120° + 135° + 90° = 345°. This also implies that the diagrams shown below are not valid, since the angles add up to 360°:

Adding even more lines of course increases the sum of angles further.

This has important consequences for diagrams of higher rank, because of the following.

Lemma 8.2 (Subsets of a Dynkin diagram). A connected subset of nodes from a Dynkin diagram together with the links connecting them is again a Dynkin diagram.

Proof. Taking a subset of simple roots preserves the corresponding lengths and angles, and the vectors are still linearly independent.

In particular, a triple line cannot occur in diagrams with three or more nodes, because otherwise there would be a subset of rank 3 with a triple line. Hence, we have:

Lemma 8.3 (No triple lines). G2 is the only Dynkin diagram with a triple line.

Another way to shrink a diagram is by contracting single lines.

Lemma 8.4 (Contraction of a single line). If a Dynkin diagram contains two nodes connected by a single line, then shrinking the line and merging the two nodes results again in a Dynkin diagram.

Proof. Suppose that α̂ and β̂ are connected by a single line. We claim that replacing the pair of vectors α̂ and β̂ by the single vector α̂ + β̂ preserves the necessary properties. By Lemma 8.1 and Lemma 8.2 each other node γ̂ is connected to at most one of α̂ or β̂ (otherwise there would be a triangle subdiagram). If γ̂ was connected to α̂ then γ̂ · (α̂ + β̂) = γ̂ · α̂, while if γ̂ was connected to β̂ then γ̂ · (α̂ + β̂) = γ̂ · β̂. Since also |α̂ + β̂|² = |α̂|² = |β̂|², all lines other than the contracted one are preserved. The resulting set of vectors is also still linearly independent.

As a consequence we have that

Lemma 8.5 (No loops). A Dynkin diagram cannot contain loops.

Proof. By Lemma 8.1 a loop must always contain at least one single line, which can be contracted by Lemma 8.4. Hence, every loop can be contracted to a triangle, in contradiction with Lemma 8.1.

Lemma 8.6 (At most one double line). A Dynkin diagram can contain at most one double line.

Proof. If there are two or more double lines, then one can always contract single lines until two double lines become adjacent. But this is not allowed by Lemma 8.1.

At this point we know that Dynkin diagrams always have the shape of a tree with at most one of its links being a double line. To further restrict branching of the tree the following result is quite useful.

Lemma 8.7 (Merging two single lines). If a Dynkin diagram has a node with (at least) two dangling single lines, then merging them into a double line gives another Dynkin diagram. In other words, the following is a valid operation on Dynkin diagrams (with the gray blob representing the remaining part of the diagram):

Proof. Replacing α̂ and β̂ by α̂ + β̂, linear independence of the resulting set of vectors is automatic. Hence it remains to check the inner products between γ̂ and α̂ + β̂. Using that |α̂|² = |β̂|² = |γ̂|²,

α̂ · β̂ = 0, and 2 α̂ · γ̂ / |γ̂|² = 2 β̂ · γ̂ / |γ̂|² = −1, it follows indeed that

2 γ̂ · (α̂ + β̂) / |γ̂|² = −2   and   2 γ̂ · (α̂ + β̂) / |α̂ + β̂|² = −1 . (8.5)

This corresponds precisely with the diagram on the right.

As a consequence a Dynkin diagram can only contain one particular junction.

Lemma 8.8 (At most one junction or double line). A Dynkin diagram contains one double line, or one junction, or neither. A junction is a node with more than two neighbours. Any junction must have the form depicted here, i.e. three neighbours connected by single lines.

Proof. By repeated contraction (Lemma 8.4) and merging (Lemma 8.7) any diagram with a different type of junction (with more than three neighbours or involving double lines) can be reduced to a diagram with two adjacent double lines, which is not allowed. The same is true if a diagram has two junctions, or both a junction and a double line.

The remaining step is to get rid of some remaining diagrams on a case-by-case basis, by showing that they cannot satisfy linear independence.

Lemma 8.9 (Diagrams failing linear independence). The following diagrams cannot be realized by linearly independent vectors:

Proof. The labels µ_1, . . . , µ_r indicated on the diagram are such that Σ_{j=1}^r µ_j α̂^j = 0. To see this one may straightforwardly compute |Σ_{j=1}^r µ_j α̂^j|² = Σ_{i,j=1}^r µ_i µ_j α̂^i · α̂^j = 0 using the inner products α̂^i · α̂^j as prescribed by the diagram. Such a linear relation implies that the vectors α̂^j are not linearly independent.

This is all we need to deduce the final classification theorem. In general, connected Dynkin diagrams are denoted by a capital letter denoting the family and a subscript denoting the rank, e.g. B2 for the Dynkin diagram of the rank-2 Lie algebra so(5). The same notation is also sometimes confusingly used to specify the corresponding Lie algebra and/or Lie group.

Theorem 8.1 (Classification of simple, compact Lie algebras). Any simple, compact Lie algebra has a Dynkin diagram equal to a member of one of the four infinite families A_n, B_n, C_n, D_n, n ≥ 1, or to one of the exceptional Lie algebras G2, F4, E6, E7, or E8. In other words its Dynkin diagram appears in the following list:

Proof. Let us have a look at the three cases of Lemma 8.8.

No junction or double line: Then the diagram is clearly of the form of A_n for some n ≥ 1.

One double line: The diagram has no junction. If the double line occurs at an extremity then it is of the form of B_n or C_n. If it sits in the middle, then by Lemma 8.9(d) and (e) it can have only one single edge on both sides, i.e. it is of the form F4.

One junction: Let us denote the lengths of the paths emanating from the junction by ℓ_1 ≤ ℓ_2 ≤ ℓ_3. Using Lemma 8.9: (a) implies ℓ_1 = 1; (b) implies ℓ_2 ≤ 2; and (c) implies ℓ_2 = 1 or ℓ_3 ≤ 4. It is then easy to see that D_n (ℓ_2 = 1), E6 (ℓ_2 = ℓ_3 = 2), E7 (ℓ_2 = 2, ℓ_3 = 3), and E8 (ℓ_2 = 2, ℓ_3 = 4) are the only possibilities.

As one may note by looking at the infinite families for small n, there are some overlaps between the families. Since there is only one rank-1 diagram it makes sense to label it by A1, and require n ≥ 2 for the remaining families. However, B2 = C2 and D3 = A3, while D2 consisting of two disconnected nodes does not correspond to a simple Lie algebra. Hence, to make the classification unique one may require n ≥ 1 for An, n ≥ 2 for Bn, n ≥ 3 for Cn, and n ≥ 4 for Dn. Finally, going back to the common Lie groups that we identified in Chapter 6 it is possible to obtain the following identification:

Cartan label   Lie algebra   dimension      rank
A_n            su(n + 1)     (n + 1)² − 1   n ≥ 1
B_n            so(2n + 1)    (2n + 1)n      n ≥ 2
C_n            sp(n)         (2n + 1)n      n ≥ 3
D_n            so(2n)        (2n − 1)n      n ≥ 4
G2                           14             2
F4                           52             4
E6                           78             6
E7                           133            7
E8                           248            8

From the overlaps described above it follows that we have the following isomorphisms between Lie algebras,

Cartan label     Lie algebra
A_1, B_1, C_1    so(3) ≃ su(2) ≃ sp(1)
B_2, C_2         so(5) ≃ sp(2)              (8.6)
A_3, D_3         so(6) ≃ su(4)

as well as the nonsimple so(4) = su(2) ⊕ su(2) (see exercises).

8.4 Concluding remark on unified gauge theories

The idea behind Grand Unified Theories (GUT) is to embed the gauge group of the standard model of particle physics, SU(3) × SU(2) × U(1), into a simple, compact Lie group. In other words we want to recognize su(3) ⊕ su(2) ⊕ u(1) as a subalgebra of one of the simple, compact Lie algebras of Theorem 8.1. Based on the Dynkin diagrams, one can determine which Lie algebras are suited for this purpose. The simplest possibility turns out to be A_4, i.e. su(5). Let us demonstrate more generally how one can obtain a subalgebra from a rank-r Lie algebra g by removing a node α̂⁰ from its Dynkin diagram. We label the simple roots as α̂⁰, . . . , α̂^{r−1} and choose a basis of Cartan generators H_0, . . . , H_{r−1} such that (α̂^j)_0 = 0 for j ≥ 1. Then the Cartan generators H_1, . . . , H_{r−1} together with the generators associated to the roots arising from Weyl reflections of α̂¹, . . . , α̂^{r−1} span a rank-(r − 1) subalgebra g_1 ⊂ g, whose Dynkin diagram is the original one with the node α̂⁰ removed. However, one can extend the subalgebra to g_1 ⊕ u(1) by adding the Cartan generator H_0, since by construction it commutes with all of g_1.

In particular, if we remove one of the middle nodes of the diagram A_4 of su(5) then the remaining diagram decomposes into an A_2 and an A_1, which together form the Dynkin diagram of su(3) ⊕ su(2). Hence we find the explicit subalgebra

su(3) ⊕ su(2) ⊕ u(1) ⊂ su(5), (8.7) which forms the basis of the famous Georgi-Glashow GUT based on SU(5).
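This node-removal operation can be made concrete at the level of Cartan matrices. A minimal sketch (not part of the original notes): deleting the row and column of a middle node from the A_4 Cartan matrix leaves a block-diagonal matrix consisting of an A_2 block and an A_1 block.

import numpy as np

A4 = 2 * np.eye(4) - np.diag([1, 1, 1], 1) - np.diag([1, 1, 1], -1)   # Cartan matrix of A4
keep = [0, 1, 3]                        # delete node 2 (a middle node)
print(A4[np.ix_(keep, keep)])
# [[ 2. -1.  0.]
#  [-1.  2.  0.]      A2 block, i.e. su(3)
#  [ 0.  0.  2.]]     A1 block, i.e. su(2)

The u(1) factor is invisible at the level of the Cartan matrix; it is supplied by the removed Cartan generator H_0, as explained above.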
