
Lie algebras, their representations and GLn
Minor Thesis

Greta Panova

June 20, 2008

Contents

1 Introduction

2 Lie Algebras
  2.1 Definitions and main theorems
  2.2 Representations of Lie algebras
  2.3 Representations of sl(2, F)

3 Lie Groups and their Lie algebras

4 Representations of semisimple Lie algebras
  4.1 Root space decomposition
  4.2 Roots and weights; the Weyl group

5 The special linear Lie algebra sl(n, C) and the general linear group GLn(C)
  5.1 Structure of sl(n, C)
  5.2 Representations of sl(n, C)
  5.3 Weyl's construction, tensor products and some combinatorics
  5.4 Representations of GLn(C)
  5.5 Some explicit constructions, generators of SλV

6 Conclusion

A Schur polynomials, Weyl's construction and representations of GLn(C)

1 Introduction

The goal of this minor thesis is to develop the necessary theory of Lie algebras, Lie groups and their representation theory, and to explicitly determine the structure and representations of sl(n, C) and GLn(C). My interest in the representations of GL(V) comes from their strong connection to combinatorics, as developed in Chapter 7 and its appendix in [3]. Even though the irreducible holomorphic representations of GL(V) can be derived without passing through the general theory of Lie algebras, as Schur did in his dissertation, any attempt to generalize the result or to obtain analogous results for the unitary or symplectic groups will pass through the theory of Lie algebras. Here we will develop the basic theory of Lie algebras and their representations, focusing on semisimple Lie algebras. We will develop most of the theory necessary to show facts like complete reducibility of representations of semisimple Lie algebras, decompose the Lie algebras into root spaces, use these root spaces to decompose representations into weight spaces, and list the irreducible

representations via weights. We will establish connections between Lie groups and Lie algebras, which will, for example, enable us to derive the irreducible representations of GL(V) through the ones for gl(V). In our development of the basic theory of Lie algebras we will mostly follow [2], while in studying Lie groups, roots and weights, and sl(n, C) we will follow [1]. We will encounter some combinatorial facts which will be taken for granted and whose proofs can be found in [3].

2 Lie Algebras

In this section we will follow [2]. We will develop the basic theory of Lie algebras and later we’ll establish how they arise from Lie groups and essentially motivate their existence.

2.1 Definitions and main theorems

We will, of course, start with the definition of a Lie algebra.

Definition 1. A Lie algebra is a vector space L over a field F endowed with a bracket operation L × L → L, denoted (x, y) → [xy], which satisfies the following axioms:
(L1) the bracket [xy] is bilinear,
(L2) [xx] = 0 for all x ∈ L,
(L3) the bracket satisfies the Jacobi identity [x[yz]] + [y[zx]] + [z[xy]] = 0 for all x, y, z ∈ L.

A homomorphism (isomorphism) of Lie algebras will be a vector space homomorphism (resp. isomorphism) that respects the bracket operation; a subalgebra would be a subspace closed under the bracket operation.

An important example of a Lie algebra is the general linear algebra gl(V), which coincides as a vector space with End V (or Mn, the space of n × n matrices) and has the bracket operation defined as [XY] = XY − YX (it is a straightforward check to verify the axioms). Next we can define the special linear algebra sl(V) as the subspace of End V of matrices of trace 0, with the same bracket operation; it is clearly a subalgebra of gl(V). Similarly we can define other subalgebras of gl(V) by imposing conditions on the matrices, e.g. upper triangular, strictly upper triangular, diagonal.

Another important example of a subalgebra of gl(U), where U is a vector space endowed with a bilinear operation (denoted as multiplication), is the algebra of derivations Der U. It is the space of linear maps δ : U → U such that δ(XY) = Xδ(Y) + δ(X)Y. In order to verify that this is a subalgebra we need to check that [δδ′] ∈ Der U. For the sake of exercise we will do that now. We need to check that [δδ′] = δδ′ − δ′δ satisfies the derivation condition (it is clear that it is a linear map), so for X, Y ∈ U we apply the derivation rule over and over (footnote 1).

Now if we let U = L, with the bilinear operation defined by the bracket of L, we can talk about Der L. Some of these derivations arise as ad x : L → L given by ad x(y) = [xy]. It is a derivation, since by the Jacobi identity ad x([yz]) = [x[yz]] = −[y[zx]] − [z[xy]] = [y[xz]] + [[xy]z] = [y, ad x(z)] + [ad x(y), z]. We can now make a

Definition 2. The map L → Der L given by x → ad x, where ad x(y) = [xy] for all y ∈ L, is called the adjoint representation of L.

Continuing with the definitions, we need to define ideals of a Lie algebra. The notion is the same as for ring ideals, just that multiplication is given by the bracket; in other words, I is an ideal of L if for any x ∈ L and y ∈ I we have [xy] ∈ I. Two important ideals are the center Z(L) = {z ∈ L | [xz] = 0 for all x ∈ L} and the derived algebra [LL] consisting of all linear combinations of brackets [xy]. Sums and products (given by the bracket) of ideals are clearly also ideals. A Lie algebra with no nontrivial ideals (i.e. only 0 and itself) and with [LL] ≠ 0 is called simple; in this case [LL] ≠ 0 necessarily implies [LL] = L. We define the normalizer of a subalgebra K of

1. [δδ′](XY) = δ(δ′(XY)) − δ′(δ(XY)) = δ(δ′(X)Y + Xδ′(Y)) − δ′(δ(X)Y + Xδ(Y)) = δ(δ′(X)Y) + δ(Xδ′(Y)) − δ′(δ(X)Y) − δ′(Xδ(Y)) = δ(δ′(X))Y + δ′(X)δ(Y) + δ(X)δ′(Y) + Xδ(δ′(Y)) − δ′(δ(X))Y − δ(X)δ′(Y) − δ′(X)δ(Y) − Xδ′(δ(Y)) = X(δδ′(Y) − δ′δ(Y)) + (δδ′(X) − δ′δ(X))Y = X[δδ′](Y) + [δδ′](X)Y.

L as N_L(K) = {x ∈ L | [xK] ⊂ K}, and the centralizer of a subset X of L as C_L(X) = {x ∈ L | [xX] = 0}; both of them turn out to be subalgebras of L by the Jacobi identity (footnote 2).

We need to define what we mean by a representation of a Lie algebra, since this is the ultimate goal of this paper. A representation of L is a homomorphism φ : L → gl(V). The most important example is the already mentioned adjoint representation ad : L → gl(L). Since linearity is clear, all we need to check is that it preserves the bracket, which again follows from the Jacobi identity (footnote 3).

We proceed now to some very important concepts in the theory of Lie algebras. We call a Lie algebra L solvable if the sequence of ideals L^(0) = L, L^(1) = [LL], ..., L^(n) = [L^(n−1) L^(n−1)] eventually equals 0. Note for example that simple algebras are not solvable, as L^(n) = L for all n. Here is the place to introduce the subalgebra t(n, F) of gl(n, F) of upper triangular n × n matrices and the subalgebra n(n, F) of strictly upper triangular matrices. We notice that [t(n, F) t(n, F)] ⊂ n(n, F), and that taking the bracket of two matrices A and B with entries a_{i,j}, b_{i,j} = 0 for j − i ≤ k (k > 0) will result in C = AB − BA with c_{i,j} = 0 for j − i ≤ k + 1; continuing this way we will eventually get 0 (e.g. for k > n), so these two algebras are solvable.

We note here some easy facts. If L is solvable, so are all its subalgebras and homomorphic images. Also, if I is a solvable ideal of L and L/I is solvable, then so is L. This can be shown by considering the quotient map π : L → L/I. Since (L/I)^(n) = 0, we also have 0 = (π(L))^(n) = π(L^(n)), so L^(n) ⊂ Ker π = I. Since I^(k) = 0 for some k, we get L^(n+k) = (L^(n))^(k) = 0. As a consequence of this fact, if I and J are solvable ideals of L, then since (I + J)/J ≅ I/(I ∩ J) is solvable as a quotient of I, by what we just proved I + J is also solvable. This allows us to make one of the most important definitions, that of semisimplicity. If S is a maximal (not contained in any other such) solvable ideal of L, then for any other solvable ideal I we would have S + I solvable, so by the maximality of S, S + I = S and hence I ⊂ S, showing that S must be unique. We denote S by Rad L, the radical of L, and make the

Definition 3. A Lie algebra L is called semisimple if its radical is 0, Rad L = 0.

We now define another very important notion which will play a role later. As we know from linear algebra, every matrix X can be written uniquely as X = X_s + X_n, where X_n is nilpotent, X_s is diagonalizable, and X_s and X_n commute. We would like to do something similar with Lie algebras, i.e. find their nilpotent and semisimple elements. If X ∈ gl(V) is a nilpotent matrix, X^n = 0, then for any Y ∈ gl(V) we have (ad X)^k(Y) = Σ_i a_i X^i Y X^{k−i} by induction on k, for some a_i ∈ F; so for k ≥ 2n either i or k − i is ≥ n, hence X^i = 0 or X^{k−i} = 0 and ultimately (ad X)^k(Y) = 0, so ad X is nilpotent. If x is semisimple, then there is a basis v_1, ..., v_n of V relative to which x = diag(a_1, ..., a_n). Then for the basis elements e_{ij} of gl(V) (matrices with only nonzero entry equal to 1 at row i, column j) we have ad x(e_{ij}) = x e_{ij} − e_{ij} x = (a_i − a_j) e_{ij}, so ad x is diagonal in this basis with the a_i − a_j as eigenvalues. Since the Jordan decomposition is unique and the properties of nilpotency and semisimplicity are preserved, ad x = ad x_s + ad x_n is the Jordan decomposition of ad x. We will consider now what it means for a Lie algebra to be nilpotent.

Definition 4.
A Lie algebra L is called nilpotent if the series L^0 = L, L^1 = [LL], ..., L^i = [L L^{i−1}], ... eventually becomes 0. Any nilpotent algebra is also solvable, since L^(i) ⊂ L^i. If L^n = 0, this means that for any x_0, ..., x_n ∈ L we have [x_n[x_{n−1}[...[x_1 x_0]...]]] = 0, or in operator notation ad x_n ad x_{n−1} · · · ad x_1(x_0) = 0; in particular this is true if x_n = · · · = x_1 = x, so (ad x)^n = 0 for all x ∈ L. It turns out that the converse is also true; we have

Theorem 1 (Engel's theorem). If all elements of L are ad-nilpotent (i.e. (ad x)^k = 0 for some k), then L is nilpotent.

2. Linearity is clear; we need only check that they are closed under the bracket operation. If x, y ∈ N_L(K) and k ∈ K, then 0 = [[xy]k] + [[yk]x] + [[kx]y], and since [yk] ∈ K and [kx] = −[xk] ∈ K, we have [x[yk]] = −[[yk]x] ∈ K and [y[kx]] ∈ K, so [[xy]k] ∈ K. The case of C_L follows since the last two terms are 0.
3. For x, y, z ∈ L, we have [ad x, ad y](z) = ad x(ad y(z)) − ad y(ad x(z)) = [x[yz]] − [y[xz]] = [[xy]z] = ad[xy](z).
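The derived and lower central series defined above are easy to compute mechanically. The following is a small numerical sketch (my own illustration, not from the text; it assumes numpy is available) that computes the derived series of t(3, F) and the lower central series of n(3, F) over F = R and watches the dimensions drop to 0, confirming that the first algebra is solvable and the second nilpotent.

import itertools
import numpy as np

def bracket(X, Y):
    return X @ Y - Y @ X

def span_of_brackets(A, B):
    """Basis of span{[x, y] : x in A, y in B}, via a rank-revealing SVD."""
    prods = [bracket(X, Y) for X, Y in itertools.product(A, B)]
    M = np.array([P.flatten() for P in prods])
    if not prods or np.allclose(M, 0):
        return []
    _, s, vt = np.linalg.svd(M)
    rank = int((s > 1e-10).sum())
    return [vt[i].reshape(3, 3) for i in range(rank)]

# matrix units e_ij
E = {(i, j): np.eye(3)[:, [i]] @ np.eye(3)[[j], :] for i in range(3) for j in range(3)}
t3 = [E[i, j] for i in range(3) for j in range(3) if j >= i]  # upper triangular
n3 = [E[i, j] for i in range(3) for j in range(3) if j > i]   # strictly upper

L = t3                       # derived series: L, [LL], [[LL][LL]], ...
while L:
    L = span_of_brackets(L, L)
    print("derived series step, dim =", len(L))   # 3, 1, 0: t(3) is solvable

Li = n3                      # lower central series: L^i = [L, L^{i-1}]
while Li:
    Li = span_of_brackets(n3, Li)
    print("central series step, dim =", len(Li))  # 1, 0: n(3) is nilpotent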

By this theorem, for example, n(n, F) is a nilpotent Lie algebra. We will prove Engel's theorem, because it will illustrate some important techniques. The proof requires a lemma, important on its own, which we will prove first.

Lemma 1. If L is a subalgebra of gl(V) (dim V < ∞) consisting of nilpotent endomorphisms, then L.v = 0 for some nonzero v ∈ V.

Proof of lemma. In order to prove this lemma we will proceed by induction on dim L (this is the first common trick in Lie algebras). The goal is to find a codimension 1 subalgebra of L and use the induction hypothesis on it. The dimension 0 and 1 cases are trivial. Now let K be a maximal proper subalgebra of L; it acts on L via the adjoint representation and hence on L/K. Since the elements of K are nilpotent, so are their images under ad by the preceding paragraph. We can use the induction hypothesis on the image of K in gl(L/K), as clearly dim K < dim L, so there is a vector x + K ∈ L/K such that x ∉ K and [yx] ∈ K for all y ∈ K; in particular x ∈ N_L(K) \ K. Since K ⊊ N_L(K), N_L(K) is a subalgebra of L as we showed, and K is a maximal proper subalgebra, we must have N_L(K) = L, being a strictly larger subalgebra. But that means that for all x ∈ L, [xK] ⊂ K, i.e. K is an ideal. If K is an ideal, then the vector space spanned by K and z ∈ L is a subalgebra for any z ∈ L (since [zK] ⊂ K ⊂ span{K, z}). So in particular if dim L/K > 1, then the space spanned by K and some z ∈ L \ K is a proper subalgebra strictly containing K, violating the maximality of K. Therefore dim L/K = 1 and L = span{K, z}. Now we can apply the induction hypothesis to K, so there is a nonzero subspace W = {w ∈ V | K.w = 0}. Note that z(W) ⊂ W: for w ∈ W and k ∈ K, k.(z.w) = z.(k.w) + [k, z].w = 0, since [k, z] ∈ K. The element z is nilpotent, and so is its restriction to W (z^k = 0 implies (z|_W)^k = 0), so it has an eigenvector in W for the eigenvalue 0: z.v = 0 with v ∈ W nonzero. Then L.v = span{K.v, z.v} = 0 and we are done.

This lemma in particular implies that we can find a basis of V for which the matrices of L are strictly upper triangular.

Proof of Engel's theorem. The condition that the elements of L are ad-nilpotent suggests that we look at the algebra ad L ⊂ gl(L) first: its elements are nilpotent endomorphisms of L, so we can use the lemma and find a nonzero x ∈ L such that [Lx] = 0. The existence of this x enables us to find a proper subalgebra of L of smaller dimension, and so use again induction on dim L. In particular, since x ∈ Z(L) (the center of L), Z(L) is a nonzero subalgebra of L, and it is also an ideal of L (footnote 4), so we can form the quotient algebra L/Z(L), which then has smaller dimension than L and still consists of ad-nilpotent elements. By induction it is nilpotent, so there is an n such that (L/Z(L))^n = 0, i.e. L^n ⊂ Z(L); but then L^{n+1} = [L L^n] ⊂ [L Z(L)] = 0 and so L is nilpotent.

Engel's theorem essentially implies the existence of a common eigenvector for the eigenvalue 0; we would like to extend this result to any eigenvalue. Here we should assume that the underlying field F is algebraically closed and of characteristic 0.

Theorem 2 (Lie's theorem). If L is a solvable subalgebra of gl(V), V of finite dimension, then L stabilizes a flag in V, i.e. there is a basis for V with respect to which the matrices in L are upper triangular.

Proof. We proceed by induction on dim V; we just need to show that there is a common eigenvector v for all elements of L and then apply the induction to V/Fv. In order to prove the existence of such a vector, we again apply the usual trick, induction on dim L. Again we need to find an ideal of codimension one.
Since L is solvable we know that [LL] ≠ L (otherwise L^(n) = L for all n); it is clearly an ideal, and any subspace K′ of the abelian quotient L/[LL] is an ideal of L/[LL], since [K′x] ⊂ [LL] goes to 0. So we take a codimension one subspace of L/[LL]; its inverse image under the quotient map is then a codimension one ideal in L, call it K, and L = K + Fz for some z ∈ L. As K^(n) ⊂ L^(n), K is solvable, so we apply the induction hypothesis and find a common eigenvector v for K, i.e. for any x ∈ K there is a λ(x) ∈ F such that x.v = λ(x)v. Here is the place to note that the map λ : K → F is actually linear (footnote 5); it will play an important role further in our discussion. Now, instead of fixing v, we can fix λ and consider the possibly larger subspace W = {w ∈ V | x.w = λ(x)w for all x ∈ K}.

4. If y ∈ Z(L) and z ∈ L, then for any u ∈ L, [u[zy]] = −[z[yu]] − [y[uz]] = 0.
5. Since (ax + by).v = a(x.v) + b(y.v) = aλ(x)v + bλ(y)v = (aλ(x) + bλ(y))v.

Now the goal is to show that z(W) ⊂ W, so that we can find an eigenvector for z|_W, which will then be an eigenvector for L. We have to introduce now another important trick, the fundamental equality xz.w = zx.w + [x, z].w. Now if x ∈ K, we would have x.(z.w) = λ(x)(z.w) + λ([x, z])w, and if λ([x, z]) = 0 then z.w ∈ W, which is what we need. Consider the spaces W_i = span{w, z.w, z^2.w, ..., z^i.w}; we have z.W_i ⊂ W_{i+1}. Employing our fundamental trick again, and using the fact that K is an ideal, we have for any x ∈ K that x.(z^i w) = z.(x.z^{i−1} w) + [x, z].(z^{i−1} w); so by induction, if K.W_{i−1} ⊂ W_{i−1}, then since [x, z] ∈ K we get x.(z^i w) ∈ z.W_{i−1} + W_{i−1} ⊂ W_i, so K.W_i ⊂ W_i for all i. Moreover, we get from the same induction that x.z^i w − λ(x) z^i w ∈ W_{i−1}: if x.z^{i−1} w = λ(x) z^{i−1} w + w_{i−2} for some w_{i−2} ∈ W_{i−2}, then x.z^i w = z.(x.z^{i−1} w) + [x, z].z^{i−1} w = z.(λ(x) z^{i−1} w + w_{i−2}) + w_{i−1} = λ(x) z^i w + z.w_{i−2} + w_{i−1}, and the last two terms are in W_{i−1}. Since the z^i w form a basis of W_n (where W_{n−1} ⊊ W_n = W_{n+1}), every x ∈ K acts as an upper triangular matrix on W_n in this basis, with λ(x) on the diagonal, so Tr_{W_n}(x) = (n + 1)λ(x). In particular, since [x, z] ∈ K, Tr_{W_n}([x, z]) = (n + 1)λ([x, z]). But we know from linear algebra that Tr(AB) = Tr(BA), so in particular Tr(xz − zx) = Tr(xz) − Tr(zx) = 0 on W_n (both x and z preserve W_n), so in characteristic 0 this forces λ([x, z]) = 0 and we are done.

As a corollary of this fact, if we consider the adjoint representation of L in gl(L), we get that L fixes a flag of subspaces L_0 ⊂ L_1 ⊂ · · · ⊂ L_n = L, which are necessarily ideals. If we choose a basis along this flag, then with respect to it ad L consists of upper triangular matrices, and hence the elements of [ad L, ad L] = ad([LL]) are strictly upper triangular and hence nilpotent; i.e. ad x is nilpotent for x ∈ [LL], and so by Engel's theorem [LL] is nilpotent.

By a theorem from linear algebra, the Jordan decomposition of any matrix into the sum of commuting semisimple (diagonalizable) and nilpotent parts can be achieved concretely by giving these parts as polynomials in the original matrix: given X, there exist polynomials p and q with 0 constant terms such that X_s = p(X) is the semisimple part (footnote 6) of X and X_n = q(X) the nilpotent part. This fact enables us to give an easily checkable criterion for solvability.

Theorem 3 (Cartan's criterion). Let L be a subalgebra of gl(V), V finite dimensional. If Tr(xy) = 0 for all x ∈ [LL] and y ∈ L, then L is solvable.

Proof. If we show that [LL] is nilpotent, then, since L^(i) = [LL]^(i−1), L would be solvable; so by Engel's theorem we need to show that the elements of [LL] are nilpotent, and hence we need to figure out a trace criterion for nilpotency.

Suppose first that A ⊂ B ⊂ gl(V) are subspaces and M = {x ∈ gl(V) | [x, B] ⊂ A}. Suppose there is an x ∈ M such that Tr(xy) = 0 for all y ∈ M; we will show that x is nilpotent. We can find a basis relative to which the semisimple part x_s of x is diagonal, so fix this basis and write x = diag(a_1, ..., a_n) + x_n. We want to show that all a_i = 0; consider the Q-vector space E spanned by a_1, ..., a_n. We will show that E = 0 by showing that E* = {f : E → Q | f linear} is 0.

For any f ∈ E*, let y = diag(f(a_1), ..., f(a_n)). If e_{ij} is the corresponding basis of gl(V) of matrices with only nonzero entry, equal to 1, at (i, j), then ad x_s(e_{ij}) = x_s e_{ij} − e_{ij} x_s = (a_i − a_j) e_{ij} (footnote 7) and ad y(e_{ij}) = (f(a_i) − f(a_j)) e_{ij}. By Lagrange interpolation we can find a polynomial r(T) ∈ F[T] with 0 constant term such that r(a_i − a_j) = f(a_i) − f(a_j) for all i, j (this is consistent since f is linear, so a_i − a_j = 0 forces f(a_i) − f(a_j) = 0). We then have r(ad x_s) e_{ij} = r(a_i − a_j) e_{ij} = (f(a_i) − f(a_j)) e_{ij} = ad y(e_{ij}) for all i, j, and since the e_{ij} are a basis of gl(V), we must have ad y = r(ad x_s). Since ad x_s is the semisimple part of ad x, as we remarked earlier, it is a polynomial in ad x without constant term, and so ad y = r(ad x_s) is a polynomial in ad x without constant term. Since by hypothesis ad x(B) = [x, B] ⊂ A, ad y, being a polynomial in ad x, will also map B into A, so y ∈ M. Hence Tr(xy) = 0; but Tr(xy) = Σ_i a_i f(a_i) = 0. We can apply f to this linear combination

6. One can do this explicitly: if ∏_i (T − a_i)^{m_i} is the characteristic polynomial of X, then by the Chinese remainder theorem there is a polynomial p(T) with p(T) ≡ a_i (mod (T − a_i)^{m_i}) and p(T) ≡ 0 (mod T), and so on each generalized eigenspace V_i of X we have p(X) − a_i·I = 0; so p(X) acts diagonally, and it clearly fixes the eigenspaces, being a polynomial in X, so it is the semisimple part, and X_n = X − p(X) = q(X) is the nilpotent part.
7. So in particular ad x_s is semisimple, and since ad x_n will still be nilpotent as we showed earlier, ad x_s is the semisimple part of ad x.

of the a_i's and use the fact that f(a_i) ∈ Q to get Σ_i f(a_i)^2 = 0. Since the f(a_i) are rational, they must then all be 0, so f = 0; hence E* = 0, E = 0, all a_i = 0, x_s = 0, and x = x_n is nilpotent.

Having shown that, we can apply it to A = [LL], B = L, so M = {x ∈ gl(V) | [x, L] ⊂ [LL]}. We then have L ⊂ M, but the hypothesis of the theorem is only that Tr(xy) = 0 for y ∈ L, not for y ∈ M. We will show now that Tr(xy) = 0 for x ∈ [LL], y ∈ L implies Tr(xy) = 0 for x ∈ [LL] and y ∈ M, so that using the above paragraph we will have x nilpotent. Let y ∈ M and let x = [a, b] ∈ [LL] for a, b ∈ L. Then Tr([a, b]y) = Tr(aby − bay) = Tr(aby) − Tr(bay) = Tr(bya) − Tr(bay) = Tr(b[y, a]), since Tr(AB) = Tr(BA) for any two matrices. But [y, a] ∈ [LL] by the definition of M, and b ∈ L, so by the hypothesis of the theorem Tr(xy) = Tr(b[y, a]) = 0, and we can apply the preceding paragraph to conclude that x is nilpotent.

As we already mentioned, a semisimple algebra is one for which Rad L = 0, but this is a difficult condition to check. We need other criteria, and for that we will introduce the Killing form.

Definition 5. Let L be a Lie algebra. For x, y ∈ L, we define κ(x, y) = Tr(ad x ad y). Then κ is a symmetric bilinear form on L, called the Killing form.

We can now state the criterion for semisimplicity as

Theorem 4. A Lie algebra L is semisimple if and only if its Killing form is nondegenerate.

Proof. The proof naturally involves Cartan's criterion, which we just showed. For any Lie algebra, the image ad L of its adjoint representation is a subalgebra of gl(L), and if Tr(ad x ad y) = 0 for the relevant elements, then ad L is solvable by Cartan's criterion; since ad L ≅ L/Z(L) and Z(L) is solvable, L is then solvable. So if we let S = {x ∈ L | κ(x, y) = 0 for all y ∈ L}, then S is solvable, so S ⊂ Rad L = 0, and κ is nondegenerate.

Now suppose κ is nondegenerate, i.e. S = 0. If Rad L ≠ 0, then L has a nonzero abelian ideal (the last nonzero term of the derived series of Rad L), so it suffices to show that every abelian ideal I satisfies I ⊂ S. If x ∈ I and y ∈ L, then for z ∈ L we have (ad x ad y)(z) = [x[yz]] ∈ I, and applying ad x ad y again, (ad x ad y)^2(z) ∈ [I[LI]] ⊂ [II] = 0. So ad x ad y is nilpotent and hence has trace 0 for all y ∈ L, which puts x in S. Hence every abelian ideal is 0, Rad L = 0, and L is semisimple.
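As an illustration of Theorem 4 (my own numerical sketch, not part of the text; it assumes numpy), one can compute the Killing form of sl(2, C) in the basis (x, h, y) of trace-zero matrices introduced in Section 2.3 and check that its matrix is invertible, i.e. κ is nondegenerate:

import numpy as np

x = np.array([[0., 1.], [0., 0.]])
y = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
basis = [x, h, y]

def bracket(A, B):
    return A @ B - B @ A

def coords(M):
    # coordinates of a traceless 2x2 matrix in the basis (x, h, y)
    return np.array([M[0, 1], M[0, 0], M[1, 0]])

def ad(A):
    # matrix of ad A acting on sl(2) in the basis (x, h, y)
    return np.column_stack([coords(bracket(A, B)) for B in basis])

kappa = np.array([[np.trace(ad(a) @ ad(b)) for b in basis] for a in basis])
print(kappa)                  # [[0. 0. 4.] [0. 8. 0.] [4. 0. 0.]]
print(np.linalg.det(kappa))   # -128.0, nonzero: kappa is nondegenerate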

We say that a Lie algebra is a direct sum of ideals if there are ideals I_1, ..., I_k such that L = I_1 ⊕ · · · ⊕ I_k as a vector space. Semisimple algebras have the nice property of being such direct sums; in other words,

Theorem 5. Let L be semisimple. Then there exist simple ideals (as Lie algebras) L_1, ..., L_k of L such that L = L_1 ⊕ · · · ⊕ L_k, and this decomposition is unique in the sense that any simple ideal is one of the L_i.

Proof. The Killing form has the nice property of associativity, i.e. κ([xy], z) = κ(x, [yz]), and so if we take the orthogonal subspace to an ideal I, i.e. I^⊥ = {x ∈ L | κ(x, y) = 0 for all y ∈ I}, then I^⊥ is also an ideal (for z ∈ L, x ∈ I^⊥, y ∈ I we have κ([xz], y) = κ(x, [zy]) = 0, since [zy] ∈ I). Moreover, I ∩ I^⊥ is 0, since it is solvable by Cartan's criterion (κ vanishes on it) and L is semisimple. We must then have L = I ⊕ I^⊥.

Now let L_1 be a minimal nonzero ideal of L, so L = L_1 ⊕ L_1^⊥. If I ⊂ L_1 is an ideal of L_1, then I is an ideal of L, for the following reason. Let x ∈ I; any element of L can be expressed as y + z, where y ∈ L_1 and z ∈ L_1^⊥, and [x(y + z)] = [xy] + [xz]. We already have [xy] ∈ I; we now use the fact that L semisimple implies the Killing form is nondegenerate to show that [xz] = 0 for any x ∈ L_1 and z ∈ L_1^⊥. Let u ∈ L; then by associativity κ([zx], u) = κ(z, [xu]) = 0, since [xu] ∈ L_1. As this holds for all u ∈ L, we need [zx] = 0 for κ to be nondegenerate. So [x(y + z)] = [xy] ∈ I, and I is an ideal of L. Then by the minimality of L_1 it follows that I = 0 or L_1, so L_1 has no nontrivial ideals and so is simple (or abelian, which is not the case as Rad L = 0). Similarly, any ideal of L_1^⊥ is an ideal of L, and so L_1^⊥ cannot have any solvable ideals, so it is semisimple. We then reduce to the case of L_1^⊥, which by induction on dimension is a direct sum of simple ideals.

In order to show uniqueness, suppose I is a simple ideal of L; then [IL] is an ideal of I, and since L is semisimple it cannot be 0 (otherwise I ⊂ Z(L) ⊂ Rad L = 0), so [IL] = I. But L = L_1 ⊕ · · · ⊕ L_k, so I = [IL] = [IL_1] ⊕ · · · ⊕ [IL_k] (note that [IL_i] ∩ [IL_j] ⊂ L_i ∩ L_j = 0), and therefore all but one summand must be 0: I = [IL_i] ⊂ L_i. So I is an ideal of L_i, and it must be that I = L_i, as L_i is simple.
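The κ-orthogonality of distinct simple ideals used in this proof can also be seen in coordinates. A small check (mine, same assumptions as the sketches above): realize sl(2, C) ⊕ sl(2, C) as block-diagonal 4 × 4 matrices and observe that the matrix of the Killing form is block diagonal, i.e. the two ideals are orthogonal.

import numpy as np

x = np.array([[0., 1.], [0., 0.]])
y = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
Z = np.zeros((2, 2))

def blk(A, B):
    return np.block([[A, Z], [Z, B]])

# first three vectors span the ideal L1 = sl(2) ⊕ 0, last three L2 = 0 ⊕ sl(2)
basis = [blk(x, Z), blk(h, Z), blk(y, Z), blk(Z, x), blk(Z, h), blk(Z, y)]

def bracket(A, B):
    return A @ B - B @ A

def coords(M):
    # coordinates in the basis above (each diagonal block is traceless)
    return np.array([M[0, 1], M[0, 0], M[1, 0], M[2, 3], M[2, 2], M[3, 2]])

def ad(A):
    return np.column_stack([coords(bracket(A, B)) for B in basis])

kappa = np.array([[np.trace(ad(a) @ ad(b)) for b in basis] for a in basis])
print(np.round(kappa, 6))   # block diagonal: kappa(L1, L2) = 0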

Since every semisimple L is isomorphic to the image of its adjoint representation (the kernel being the center, i.e. 0), it is isomorphic to a subalgebra of gl(L). If L is semisimple it is the direct sum of simple ideals L_1 ⊕ · · · ⊕ L_k, so L is isomorphic to a subalgebra of gl(L_1 ⊕ · · · ⊕ L_k); i.e. we have the

Corollary 1. Every semisimple Lie algebra is isomorphic to a subalgebra of gl(V) for some V.

2.2 Representations of Lie algebras

Here we will develop some of the basic representation theory of semisimple Lie algebras. We start off with a

Definition 6. An F-vector space V endowed with an operation L × V → V ((x, v) → x.v) is called an L-module if
1. (ax + by).v = a(x.v) + b(y.v),
2. x.(av + bw) = a(x.v) + b(x.w),
3. [xy].v = x.(y.v) − y.(x.v),
for all x, y ∈ L, a, b ∈ F and v, w ∈ V.

Homomorphisms are defined as usual; an L-module is irreducible if it has no nontrivial L-submodules, and completely reducible if it is a direct sum of irreducible L-submodules. Given an L-module V, its dual V* is also an L-module, as follows: if f ∈ V*, x ∈ L and v ∈ V, then (x.f)(v) = −f(x.v). The reason for the minus sign is to satisfy the third axiom in the definition, since ([xy].f)(v) = −f([xy].v) = −f(x.(y.v) − y.(x.v)) = −f(x.(y.v)) + f(y.(x.v)) = (x.f)(y.v) − (y.f)(x.v) = −(y.(x.f))(v) + (x.(y.f))(v) = (x.(y.f) − y.(x.f))(v). Another peculiarity comes with the action of L on a tensor product V ⊗ W, where we define x.(v ⊗ w) = x.v ⊗ w + v ⊗ x.w; one checks again that it respects the third axiom.

No discussion of representation theory could be complete without mentioning

Lemma 2 (Schur's lemma). If φ : L → gl(V) is irreducible, then the only endomorphisms of V commuting with all φ(x) are the scalars.

We will now exhibit a commuting endomorphism and apply Schur's lemma. We can extend our discussion about the Killing form by defining a symmetric bilinear form on a semisimple L, for any faithful representation φ : L → gl(V), as β(x, y) = Tr(φ(x)φ(y)); it has the same properties as the Killing form, namely associativity and nondegeneracy. Now if (x_1, ..., x_n) is a basis of L, let (y_1, ..., y_n) be its dual basis with respect to β, i.e. β(x_i, y_j) = δ_{ij}, and define the Casimir element of φ as c_φ(β) = Σ_{i=1}^n φ(x_i)φ(y_i). Since φ(x_i), φ(y_i) ∈ End(V), so is c_φ; our goal now is to show that it commutes with every φ(x).

We need to use the associativity of β. If z ∈ L, we should consider its bracket rather than its product with the basis, since it is the bracket that is preserved under φ. We can write [zx_i] = Σ_j a_{ij} x_j and [zy_i] = Σ_j b_{ij} y_j for some coefficients a_{ij}, b_{ij}, and use the orthogonality with respect to β: β([zx_i], y_j) = β(−[x_i z], y_j) = −β(x_i, [zy_j]) = −Σ_l b_{jl} β(x_i, y_l) = −b_{ji}, while β([zx_i], y_j) = Σ_l a_{il} β(x_l, y_j) = a_{ij}, so a_{ij} = −b_{ji}. Now we can consider

[φ(z), c_φ(β)] = Σ_j [φ(z), φ(x_j)φ(y_j)] = Σ_j (φ(z)φ(x_j)φ(y_j) − φ(x_j)φ(y_j)φ(z))
= Σ_j (φ(z)φ(x_j)φ(y_j) − φ(x_j)φ(z)φ(y_j) + φ(x_j)φ(z)φ(y_j) − φ(x_j)φ(y_j)φ(z))
= Σ_j ([φ(z), φ(x_j)]φ(y_j) + φ(x_j)[φ(z), φ(y_j)]) = Σ_j (φ([zx_j])φ(y_j) + φ(x_j)φ([zy_j]))
= Σ_j Σ_k a_{jk} φ(x_k)φ(y_j) − Σ_j Σ_l a_{lj} φ(x_j)φ(y_l) = 0.

So c_φ(β) commutes with every φ(z), and if φ is irreducible then by Schur's lemma c_φ(β) must be scalar multiplication; the scalar is (1/dim V) Tr(c_φ) = (1/dim V) Σ_{i=1}^{dim L} Tr(φ(x_i)φ(y_i)) = (1/dim V) Σ_{i=1}^{dim L} β(x_i, y_i) = dim L / dim V, so it does not depend on the bases we choose. The Casimir element is important in the sense that it provides an endomorphism commuting with the action of L and with nonzero trace (equal to dim L); it will be used in the proof of our main theorem as a projection onto an irreducible module. We now proceed to prove the main theorem of this section, namely

Theorem 6 (Weyl). Let L be semisimple and φ : L → gl(V) a finite dimensional representation of L. Then φ is completely reducible.

Proof. First of all, since L is semisimple, we can write it as a direct sum of simple ideals L = I_1 ⊕ · · · ⊕ I_k, and then [LL] = [I_1 L] ⊕ · · · ⊕ [I_k L]. Since I_j is simple we have I_j ⊃ [I_j L] ⊃ [I_j I_j] = I_j, so [LL] = I_1 ⊕ · · · ⊕ I_k = L. As a consequence, L acts trivially on any one-dimensional module: there the operators φ(x) are scalars and commute, so φ([x, y]) = [φ(x), φ(y)] = 0, and every element of L = [LL] is a sum of brackets.

This nice fact suggests that we look first at the case where W is a codimension 1 L-submodule of V. Since, as we observed, L acts trivially on V/W, we have φ(L)(V) ⊂ W. Since c_φ is an L-module endomorphism and is a sum of products of elements of φ(L), we also have c_φ(V) ⊂ φ(L)(V) ⊂ W. We have just exhibited a projection operator, playing the role of the averaging operator in the representation theory of finite groups. The goal is to show that V = W ⊕ Ker c_φ. Suppose first that W is irreducible, and consider Ker c_φ ∩ W, an L-submodule of W, which must then be either 0 or W. If it is 0 we are done, as then V = W ⊕ Ker c_φ (c_φ(V) ⊂ W forces dim Ker c_φ ≥ 1, and Ker c_φ ∩ W = 0 then gives the direct sum by dimension count). If it is W, then c_φ^2(V) ⊂ c_φ(W) = 0, so c_φ is nilpotent and in particular Tr(c_φ) = 0; but we already showed that this trace is dim L, and since char F = 0 we reach a contradiction.

We required that W be irreducible, but we can reduce to this case easily by induction on dim W. Again dim V/W = 1, and since L(V/W) = 0 we can write an exact sequence of L-module homomorphisms 0 → W → V → F → 0. If W′ is a proper submodule of W, we can quotient by it and get 0 → W/W′ → V/W′ → F → 0, and since dim W/W′ < dim W, by induction this sequence splits, i.e. V/W′ = W/W′ ⊕ W̃/W′, where dim W̃/W′ = 1. Now looking at W′ and W̃ we have again an exact sequence 0 → W′ → W̃ → F → 0, and again by induction W̃ = W′ ⊕ W″ for a one-dimensional submodule W″. Since W ∩ W̃ ⊂ W′, we have W ∩ W″ = 0, so V = W ⊕ W″.

Now consider the general case where W is any submodule of V. The space of linear maps Hom(V, W) can be viewed as an L-module, since Hom(V, W) = V* ⊗ W via (f ⊗ w)(v) = f(v)w for v ∈ V and w ∈ W. The action of L on Hom(V, W) can be described using the tensor product rule: if x ∈ L, then (x.(f ⊗ w))(v) = ((x.f) ⊗ w)(v) + (f ⊗ x.w)(v) = −f(x.v)w + f(v)(x.w), so if g ∈ Hom(V, W) is the actual homomorphism, then (x.g)(v) = −g(x.v) + x.(g(v)). Now let 𝒱 be the subspace of Hom(V, W) of maps whose restriction to W is scalar multiplication. This space is an L-submodule of Hom(V, W): if f|_W = a·1_W, then for w ∈ W, (x.f)(w) = −f(x.w) + x.(f(w)) = −a(x.w) + x.(aw) = 0, so in particular x.f ∈ 𝒱. In fact, if 𝒲 ⊂ 𝒱 is the subspace consisting of the maps with f|_W = 0, then 𝒲 is again an L-submodule and L(𝒱) ⊂ 𝒲. Moreover dim 𝒱/𝒲 = 1, since a map in 𝒱 is determined modulo 𝒲 by the scalar a. So we end up with a familiar situation: 𝒱 has a codimension one submodule 𝒲, and by the special case we get 𝒱 = 𝒲 ⊕ ⟨f⟩, where f ∈ 𝒱 spans the complementary one-dimensional module; we can assume f|_W = 1_W. As we showed, for the action of L on 𝒱 we have 0 = (x.f)(v) = −f(x.v) + x.(f(v)), i.e. f(x.v) = x.(f(v)), so f is an L-homomorphism; and since clearly W ∩ Ker f = 0 and f is a projection of V onto W, we have V = W ⊕ Ker f as L-modules.
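A quick sanity check of the Casimir element (my own sketch, assuming numpy): for the standard two-dimensional representation φ = id of sl(2, C), the dual basis of (x, h, y) with respect to β(a, b) = Tr(ab) is (y, h/2, x), and c_φ(β) acts as the scalar dim L/dim V = 3/2, as the general computation predicts.

import numpy as np

x = np.array([[0., 1.], [0., 0.]])
y = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
basis = [x, h, y]

# beta(a, b) = Tr(phi(a) phi(b)) for phi = id
beta = np.array([[np.trace(a @ b) for b in basis] for a in basis])

# dual basis y_i = sum_j (beta^{-1})_{ji} x_j, so that beta(x_i, y_j) = delta_ij
Binv = np.linalg.inv(beta)
dual = [sum(Binv[j, i] * basis[j] for j in range(3)) for i in range(3)]

casimir = sum(basis[i] @ dual[i] for i in range(3))
print(casimir)   # [[1.5 0. ] [0.  1.5]] = (dim L / dim V) * identity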

As a result, using this theorem we can state another important theorem, which will later help us analyze the action of L on a vector space.

Theorem 7. Let L ⊂ gl(V) be a semisimple Lie algebra; then L contains the nilpotent and semisimple parts of all its elements.

In particular, in any representation φ the semisimple (diagonalizable) elements of L remain diagonalizable in φ(L). This will help us later present any module V as a direct sum of eigenspaces for these elements.

Proof. We recall the result from linear algebra saying that the semisimple and nilpotent parts x_s and x_n of x can be expressed as polynomials in x with 0 constant coefficient. This in particular shows that if xA ⊂ B for subspaces B ⊂ A, then also x_s A, x_n A ⊂ B. In order to use this fact we will attempt to create subspaces A and B such that xA ⊂ B if and only if x ∈ L.

For any L-submodule W of V, let L_W = {y ∈ gl(V) | y(W) ⊂ W, Tr(y|_W) = 0}. Since we showed, as a consequence of the direct sum decomposition of L, that [LL] = L, every element of L is a sum of brackets [x, y] and hence has trace 0 on W; so L ⊂ L_W. Moreover x_s and x_n are also in L_W: since x(W) ⊂ W we have x_s(W) ⊂ W and x_n(W) ⊂ W, and Tr(x_n|_W) = 0 since x_n is nilpotent, hence Tr(x_s|_W) = Tr(x|_W) = 0 also. So L ⊂ L_W and x_s, x_n ∈ L_W for all L-submodules W. Let N ⊂ gl(V) be the space of z ∈ gl(V) such that [zL] ⊂ L. We also have x_s, x_n ∈ N: since, as we showed, ad x_n and ad x_s are the nilpotent and semisimple parts of ad x, and ad x(L) = [x, L] ⊂ L, we get ad x_n(L) ⊂ L and ad x_s(L) ⊂ L, i.e. [x_n L] ⊂ L and [x_s L] ⊂ L. So if we show that L′ = N ∩ (∩_W L_W) ⊂ L, we will be done.

Observe that L′ is an L-module under the bracket action: for every x ∈ L and y ∈ L_W we have [x, y](W) ⊂ x(y(W)) + y(x(W)) ⊂ W, as y(W), x(W) ⊂ W, and being a commutator its trace on W is 0; and N is clearly an L-module, as [LN] ⊂ L ⊂ N. Now comes the big moment of applying Weyl's theorem. Since L′ is an L-module containing L, we have L′ = L ⊕ M, where M is an L-submodule of L′ with L ∩ M = 0. But since L′ ⊂ N, [L′L] ⊂ L, so [LM] ⊂ [L′L] ⊂ L; and as M is an L-submodule we also have [LM] ⊂ M, so [LM] ⊂ L ∩ M = 0, and in particular the elements of M are endomorphisms commuting with L. So if y ∈ M and W is an irreducible L-submodule of V, then since M ⊂ L_W we have y(W) ⊂ W and Tr(y|_W) = 0. By Schur's lemma y acts as a scalar on W, and since its trace is 0 it acts as 0. Since this is true for any irreducible W, and we can write V as a direct sum of such W's by Weyl's theorem (again!), y acts as 0 on all of them, hence on V, so y = 0. Therefore M = 0, L′ = L, and x_s, x_n ∈ L.

2.3 Representations of sl(2, F)

The special linear Lie algebra sl(2, F) is not just an easy example to start with. As we have already noticed, dimension induction has been a key method in proving facts about Lie algebras, so it won't be a surprise that the general case of sl(n, F) will involve the special case of sl(2, F).

Definition 7. The Lie algebra sl(n, F) is defined as the vector space of n × n matrices over F of trace 0, with the bracket [X, Y] = XY − YX.

In particular we see that dim sl(n, F) = n² − 1, and of course sl(n, F) is well defined as a Lie algebra, since Tr(XY) = Tr(YX), so Tr([X, Y]) = 0, for any n × n matrices X, Y. We will go back to the general case later, but for now let n = 2. Clearly sl(2, F) is spanned as a vector space by

x = [ 0 1 ]    y = [ 0 0 ]    h = [ 1  0 ]
    [ 0 0 ]        [ 1 0 ]        [ 0 −1 ]

We can determine sl(2,F ) completely by listing the bracket relations among these generators. We have that

[x, y] = h, [h, x] = 2x, [y, h] = 2y, (1)

so in particular x and y together generate sl(2, F) as a Lie algebra. It would have been pointless to discuss semisimple Lie algebras if our main example weren't one, so we will show briefly that sl(2, F) is not only semisimple but in fact simple. For any element z = ax + by + ch ∈ sl(2, F) with a, b, c ∈ F we have [z, x] = a.0 + b[y, x] + c[h, x] = −bh + 2cx, and then [h, [z, x]] = −b.0 + 2c[h, x] = 4cx. Similarly [h, [z, y]] = [h, ah − 2cy] = 4cy, and [x, [z, h]] = [x, −2ax + 2by] = 2bh, [y, [z, h]] = 2ah. So we see that if char F = 0 and at least one of a, b, c is not 0, i.e. z ≠ 0, we can obtain one of h, x or y (up to a nonzero scalar) by applying brackets to z, and from it we can obtain the other two; so the ideal generated by a nonzero z is all of sl(2, F), and sl(2, F) is simple.

Now that we know the structure of sl(2, F) so explicitly, we can consider its irreducible representations. Let V be any sl(2, F)-module, φ : sl(2, F) → gl(V). Since h is semisimple (diagonal), φ(h) is diagonal

with respect to some basis of V, i.e. for every eigenvalue λ of φ(h) there is a subspace V_λ = {v ∈ V | h.v = λv}, and V = ⊕_λ V_λ. The eigenvalues λ are called weights and the corresponding spaces V_λ weight spaces. These agree with a more general definition which we will give when we consider representations of semisimple Lie algebras in general. Now we should see how L acts on these V_λ. If v ∈ V_λ, we employ our fundamental equality to get h.(x.v) = [h, x].v + x.(h.v) = 2x.v + λ(x.v) = (λ + 2)(x.v), so x.v is an eigenvector of h with eigenvalue λ + 2, i.e. x.V_λ ⊂ V_{λ+2}. Similarly h.(y.v) = (λ − 2)(y.v), so y.V_λ ⊂ V_{λ−2}. Since V is finite dimensional and V = ⊕_λ V_λ as a vector space, we have only finitely many λ's; in particular there must be a λ such that V_{λ+2} = 0 (otherwise we would get all of λ + 2Z), and then x.V_λ = 0. We call any v ∈ V_λ for which x.v = 0 a maximal vector of weight λ.

Now let V be irreducible. Choose one of these maximal vectors, v_0 ∈ V_λ. Since x, y, h generate L both as a vector space and as an algebra, it is natural to expect that the vector space generated by successive applications of x, y, h to v_0 will be stable under L. We have h.v_0 = λv_0 and x.v_0 = 0; let v_i = (1/i!) y^i.v_0. By induction we will show that

Lemma 3. 1. h.v_i = (λ − 2i)v_i,

2. y.v_i = (i + 1)v_{i+1},

3. x.v_i = (λ − i + 1)v_{i−1}.

Proof. We have y.v_0 = v_1 by definition, and in fact the second part of the lemma follows straight from the definition of v_i: v_{i+1} = (1/(i + 1)!) y^{i+1}.v_0 = (1/(i + 1)) y.((1/i!) y^i.v_0) = (1/(i + 1)) y.v_i. As for the rest, if (1) holds for v_i, then v_i ∈ V_{λ−2i}, and so, as we remarked, y.v_i ∈ V_{λ−2i−2}, so h.(y.v_i) = (λ − 2(i + 1)) y.v_i, i.e. (i + 1) h.v_{i+1} = (i + 1)(λ − 2i − 2) v_{i+1} by (2), and (1) follows by induction. We use (2) and our fundamental equality again to show (3):

x.v_{i+1} = x.((1/(i + 1)) y.v_i) = (1/(i + 1)) x.y.v_i = (1/(i + 1))([x, y].v_i + y.x.v_i)
= (1/(i + 1))(h.v_i + (λ − i + 1) y.v_{i−1}) = (1/(i + 1))((λ − 2i)v_i + (λ − i + 1) i v_i)
= (1/(i + 1))(λ(i + 1) − i − i²)v_i = (λ − i)v_i,

which is what we needed to prove, since x.v_{i+1} = (λ − (i + 1) + 1)v_i is exactly what (3) requires.

From these recurrences we can deduce the following main theorem, which describes the irreducible representations of sl(2, F) completely.

Theorem 8. If V is an irreducible sl(2, F)-module, then

1. V is a direct sum of weight spaces V_μ for μ = m, m − 2, ..., −(m − 2), −m (where m = dim V − 1), each V_μ one dimensional and spanned by an eigenvector of h of eigenvalue μ.

2. V has a unique maximal vector (up to scalar), whose weight is m, called the highest weight of V.

3. The action of L on V is determined by the recurrences in the lemma, so up to isomorphism there is a unique irreducible sl(2, F)-module of each dimension m + 1.

Proof. From part (1) of Lemma 3 we have that the v_i are linearly independent, being eigenvectors for different eigenvalues. Since dim V < ∞, the v_i must be finitely many, so there is a minimal m such that v_{m+1} = 0 and v_m ≠ 0. By the recurrence (2) we then have v_i = 0 for all i > m and v_i ≠ 0 for i = 0, ..., m. The subspace of V spanned by (v_0, v_1, ..., v_m) is an sl(2, F)-submodule of V, since it is fixed by x, y, h, which span sl(2, F) as a vector space. Since V is irreducible, we must have V = span{v_0, v_1, ..., v_m}, and then dim V = m + 1.

Moreover, since v_{m+1} = 0, relation (3) of Lemma 3 shows that 0 = x.v_{m+1} = (λ − m)v_m, and since v_m ≠ 0 we must have λ = m. From relation (1) we then have h.v_i = (m − 2i)v_i, so the eigenvalues are precisely m, m − 2, ..., m − 2m = −m, in particular symmetric around 0. The maximal vector is naturally

v_0, and it is unique up to scalar (each weight space is one dimensional, and x.v_i ≠ 0 for i > 0). The uniqueness of V follows from the fact that the choice of m determines V completely through the recurrence relations of Lemma 3, together with the fact that, again, x, y, h generate sl(2, F) (so these relations determine the action of all of sl(2, F)).

By Weyl's theorem any sl(2, F)-module W is completely reducible, so every eigenvector of h lies in a sum of irreducible modules, and by the theorem its eigenvalue is an integer. We also have from the theorem that any irreducible V has either 0 or 1 as a weight (according to whether m is even or odd). If W = ⊕_i V^i for irreducible V^i's, then W_μ = ⊕_i V^i_μ; in particular W_0 = ⊕_i V^i_0 and W_1 = ⊕_i V^i_1, so dim W_0 = #{i | V^i_0 ≠ 0} and dim W_1 = #{i | V^i_1 ≠ 0}, and since each V^i falls in exactly one of these two categories, dim W_0 + dim W_1 is the number of V^i's in W, i.e. the number of irreducible summands. Apart from existence, we have thus essentially described the irreducible representations of sl(2, F). We will later handle the general case of sl(n, F).
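The recurrences of Lemma 3 make the (m + 1)-dimensional irreducible module completely explicit. The following sketch (mine, assuming numpy) builds the matrices of x, y, h from h.v_i = (m − 2i)v_i, y.v_i = (i + 1)v_{i+1}, x.v_i = (m − i + 1)v_{i−1}, and verifies the bracket relations (1) and the weights m, m − 2, ..., −m.

import numpy as np

def irrep(m):
    """Matrices of x, y, h on the basis v_0, ..., v_m of Theorem 8."""
    n = m + 1
    H = np.diag([float(m - 2 * i) for i in range(n)])
    X = np.zeros((n, n))
    Y = np.zeros((n, n))
    for i in range(n - 1):
        Y[i + 1, i] = i + 1.0       # y.v_i = (i+1) v_{i+1}
        X[i, i + 1] = float(m - i)  # x.v_{i+1} = (m-(i+1)+1) v_i
    return X, Y, H

X, Y, H = irrep(4)              # the 5-dimensional irreducible module
comm = lambda A, B: A @ B - B @ A
print(np.allclose(comm(X, Y), H))        # True: [x, y] = h
print(np.allclose(comm(H, X), 2 * X))    # True: [h, x] = 2x
print(np.allclose(comm(H, Y), -2 * Y))   # True: [h, y] = -2y
print(np.diag(H))                        # weights 4, 2, 0, -2, -4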

3 Lie Groups and their Lie algebras

So far we have considered Lie algebras from the axiomatic point of view, but that motivated neither their existence nor our current interest in them. To motivate their existence we will introduce a much more natural structure.

Definition 8. A Lie group G is a C^∞ manifold with differentiable maps multiplication × : G × G → G and inverse ι : G → G which turn G into a group.

Naturally, a morphism between Lie groups would be a map that is both differentiable and a group homomorphism. A Lie subgroup would be a subgroup that is also a closed submanifold. The main example of a Lie group for our interest is the general linear group GLn(R) (or over C) of invertible n × n matrices. The map × : GLn × GLn → GLn is given by the usual matrix multiplication and is clearly differentiable; the inverse map can be given by Cramer's rule and so is also differentiable. Other Lie groups of interest happen to be subgroups of GLn, for example the special linear group SLn, the group Bn of upper triangular matrices, and the special orthogonal group SOn of transformations of R^n preserving some bilinear form. A representation of a Lie group G is defined as for usual representations of groups, i.e. a morphism G → GL(V).

It turns out the tangent spaces to Lie groups at the identity are Lie algebras. To see how this works we consider morphisms ρ : G → H, and we aim to show that they are completely described by their differentials at the identity, dρ_e : T_eG → T_eH. This will reduce our studies to the study of Lie algebras. So let ρ : G → H be a morphism of Lie groups. For any g ∈ G, its right action on G is respected by ρ. However it does not fix the identity, so we go further and consider conjugation by g. Define Ψ_g : G → G by Ψ_g(x) = g.x.g^{−1} for x ∈ G. Now we have Ψ_{ρ(g)}(ρ(x)) = ρ(g)ρ(x)ρ(g)^{−1} = ρ(g.x.g^{−1}) = ρ ∘ Ψ_g(x), i.e. the following diagram commutes:

          ρ
    G ---------> H
    |            |
 Ψ_g|            |Ψ_{ρ(g)}
    v            v
    G ---------> H
          ρ

In particular we have a map Ψ : G → Aut(G), and Ψ_g(e) = e. Each Ψ_g is also a Lie group morphism, in particular differentiable, so we can consider its differential at e.

Definition 9. The representation Ad : G → Aut(T_eG) given by Ad(g) = (dΨ_g)_e : T_eG → T_eG is called the adjoint representation of G.

Since ρ ∘ Ψ_g = Ψ_{ρ(g)} ∘ ρ, for any arc γ(t) in G starting at e with dγ/dt|_{t=0} = v we have ρ(Ψ_g(γ(t))) = Ψ_{ρ(g)}(ρ(γ(t))); differentiating with respect to t we get dρ(dΨ_g(dγ(t)/dt)) = dΨ_{ρ(g)}(dρ(dγ(t)/dt)), and evaluating at t = 0, dρ(dΨ_g(v)) = dΨ_{ρ(g)}(dρ(v)). In other words, the following diagram commutes:

            (dρ)_e
    T_eG -----------> T_eH
      |                 |
 Ad(g)|                 |Ad(ρ(g))
      v                 v
    T_eG -----------> T_eH
            (dρ)_e

We want to get rid of the map ρ and leave only dρ in order to consider only maps between tangent spaces, so we take again a differential. We define

ad = d(Ad)_e : T_eG → End(T_eG),

since the tangent space at the identity of Aut(T_eG) ⊂ End(T_eG) is just the space End(T_eG) of endomorphisms of T_eG. Then ad(X) ∈ End(T_eG), so for any Y ∈ T_eG we can define

[X,Y ] := ad(X)(Y ),

which is a bilinear map. Since ρ and (dρ)_e respect Ad, we have dρ_e(ad(X)(Y)) = ad(dρ_e(X))(dρ_e(Y)), i.e. the following diagram commutes:

            (dρ)_e
    T_eG -----------> T_eH
      |                 |
 ad(X)|                 |ad(dρ_e(X))
      v                 v
    T_eG -----------> T_eH
            (dρ)_e

so the differential dρ_e respects the bracket; moreover we have arrived at a situation involving only differentials and tangent spaces, and in particular no actual map ρ. The notation [X, Y] is no coincidence with the previous section: it is in fact the same bracket, and T_eG is a Lie algebra.

So that things become explicit, we will consider the case when G is a subgroup of GLn(C), in which case we know exactly what Ad(g) is. Let β(t) and γ(t) be two arcs in G with β(0) = γ(0) = e, dβ/dt|_{t=0} = X and dγ/dt|_{t=0} = Y. Then Ψ_g(γ(t)) = g γ(t) g^{−1}, so Ad(g)(Y) = d(g γ(t) g^{−1})/dt|_{t=0} = g Y g^{−1}. Then ad(X) = d(Ad(β(t)))/dt|_{t=0}, so

ad(X)(Y) = d(Ad(β(t))(Y))/dt|_{t=0} = d/dt|_{t=0} (β(t) Y β(t)^{−1})
= (dβ/dt|_{t=0}) Y β(0)^{−1} + β(0) Y (dβ(t)^{−1}/dt|_{t=0}) = XY.e + e.Y.(−X) = XY − YX,   (2)

where we used dβ(t)^{−1}/dt|_{t=0} = −X, which follows by differentiating β(t)β(t)^{−1} = e at t = 0.
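This computation is easy to confirm numerically (my own check; it assumes numpy and scipy for the matrix exponential): differentiating the conjugation β(t)Yβ(t)^{−1} along β(t) = exp(tX) at t = 0 recovers the commutator XY − YX.

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))

conj = lambda s: expm(s * X) @ Y @ expm(-s * X)   # Ad(beta(s))(Y)
t = 1e-6
derivative = (conj(t) - conj(-t)) / (2 * t)       # central finite difference

print(np.allclose(derivative, X @ Y - Y @ X, atol=1e-6))   # True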

This definition of the bracket explains why we defined the action of a Lie algebra on a tensor product the way we did. Let V and W be representations of a Lie group G; the usual definition of the representation V ⊗ W is g(v ⊗ w) = g(v) ⊗ g(w). If β(t) is an arc in G with β(0) = e and dβ/dt(0) = X, then the action of X ∈ T_eG on any space V is given by X(v) = d/dt|_{t=0} β(t).v, so the action on V ⊗ W is naturally given by differentiating by parts:

X(v ⊗ w) = d/dt|_{t=0} (β(t)v ⊗ β(t)w) = (d/dt|_{t=0} β(t)v) ⊗ w + v ⊗ (d/dt|_{t=0} β(t)w) = X(v) ⊗ w + v ⊗ X(w).

Definition 10. We define the Lie algebra associated to a Lie group G to be the tangent space to G at the identity, i.e. T_eG.
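In coordinates the tensor product becomes the Kronecker product, so the rule just derived says that the derivative of exp(tX) ⊗ exp(tX) at t = 0 is X ⊗ I + I ⊗ X. A quick check (mine, same assumptions as above):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 2))
I = np.eye(2)

g = lambda s: np.kron(expm(s * X), expm(s * X))   # action of beta(s) on V ⊗ W
t = 1e-6
derivative = (g(t) - g(-t)) / (2 * t)

print(np.allclose(derivative, np.kron(X, I) + np.kron(I, X), atol=1e-6))  # True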

To make our discussion complete and justify our study of Lie algebras (as opposed to Lie groups), we will prove the following two principles.

Theorem 9 (First Principle). Let G and H be Lie groups, G connected. A morphism ρ : G → H is uniquely determined by its differential dρ_e : T_eG → T_eH at the identity e.

Theorem 10 (Second Principle). Let G and H be Lie groups with G connected and simply connected. A linear map δ : T_eG → T_eH is the differential of a homomorphism ρ : G → H if and only if it preserves the bracket operation, in the sense that δ([X, Y]) = [δ(X), δ(Y)].

We are going to prove these theorems later, after we establish the relationship between a Lie group G and its Lie algebra. For that we will consider one-parameter subgroups. Let g ∈ G and let m_g : G → G be the map given by left multiplication by g. For X ∈ L = T_eG, we define a vector field v_X on G by

v_X(g) = (m_g)_*(X).   (5)

We can integrate this field, i.e. there is a unique differentiable map ("the integral") φ : I → G, where I is open, 0 ∈ I and φ(0) is any given point, such that φ′(t) = v_X(φ(t)). If α(t) = φ(s).φ(t) for s, t ∈ I and β(t) = φ(s + t), then β′(t) = φ′(s + t) = v_X(φ(s + t)) = v_X(β(t)), and on the other hand α′(t) = d/dt (m_{φ(s)} φ(t)) = (m_{φ(s)})_*(φ′(t)) = (m_{φ(s)})_* v_X(φ(t)) = (m_{φ(s)})_* (m_{φ(t)})_*(X) = (m_{φ(s).φ(t)})_* X = v_X(φ(s).φ(t)) = v_X(α(t)). Since α(0) = β(0), the uniqueness of such maps implies α(t) = β(t), in other words φ(s + t) = φ(s).φ(t); thus we have defined a homomorphism φ_X : R → G. This Lie group map φ_X is called the one-parameter subgroup of G with tangent vector X.

A homomorphism φ : R → G with φ′(0) = X is in fact uniquely determined (Exercise 8.31 in [1]). This follows because if we fix s and take d/dt φ(s + t) = d/dt φ(s)φ(t), we get from the left-hand side φ′(s + t) and from the right-hand side (m_{φ(s)})_*(φ′(t)); taking t = 0 we get φ′(s) = (m_{φ(s)})_*(φ′(0)) = v_X(φ(s)), which as we mentioned determines φ uniquely. If now ψ : G → H is a map of Lie groups, then ψ ∘ φ_X : R → H is a homomorphism. We have that ψ_*(X) is a tangent vector of H at

the identity, so φ_{ψ_*X} is the unique homomorphism with this tangent vector. However, the homomorphism ψ ∘ φ_X has tangent vector at 0 equal to d/dt ψ(φ_X(t))|_{t=0} = ψ_*(φ_X′(0)) = ψ_*(X), so the two must coincide, i.e.

φ_{ψ_*X} = ψ ∘ φ_X.   (6)

Let L be the Lie algebra of G, i.e. L = TeG.

Definition 11. The exponential map exp : L → G is given by exp(X) = φ_X(1).
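For matrix groups the homomorphism property φ_X(s + t) = φ_X(s)φ_X(t) of the one-parameter subgroup φ_X(t) = exp(tX) can be observed directly (my own illustration, assuming scipy):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3))
s, t = 0.7, -1.3

lhs = expm((s + t) * X)
rhs = expm(s * X) @ expm(t * X)   # exp(sX) and exp(tX) commute, same X
print(np.allclose(lhs, rhs))     # True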

We have that φ_{λX} is the unique homomorphism with φ′(0) = λX. Let ψ(t) = φ_X(λt); then ψ is still a homomorphism, and ψ′(0) = λφ_X′(0) = λX by the chain rule, so necessarily φ_{λX}(t) = φ_X(λt); thus exp restricted to the lines through the origin gives exactly the one-parameter subgroups of G. We can apply (6) to the exponential map exp : L → G and get that exp is the unique map from L to G taking 0 to the identity e of G, with differential at the origin the identity, and whose restrictions to lines through the origin are the one-parameter subgroups of G. Again thanks to (6) we have that for any map

ψ : G → H of Lie groups, φ_{ψ_*X} = ψ ∘ φ_X, i.e. exp(ψ_*(X)) = ψ(exp(X)) by evaluating at t = 1. Since the differential of exp at the origin is the identity, in particular invertible, the inverse mapping theorem tells us that exp is invertible in a neighborhood of the origin, so in particular its image contains a neighborhood U of e. Now let G be connected and let H be the subgroup of G generated by the elements of a neighborhood U of e (Exercise 8.1 in [1]); then H is an open subset of G. Suppose H ≠ G. All of its cosets aH, being translates of H, are also open, and disjoint from each other, so (∪_{aH≠H} aH) ∩ H = ∅ and H ∪ (∪_{aH≠H} aH) = G, hence G is the union of two disjoint nonempty open sets. These sets are also closed in G, being complements of open sets, so G is the disjoint union of two closed sets and hence not connected, a contradiction. So G is generated as a group by the elements of U; in particular G is generated by exp(L). This proves the first principle: for any map ψ : G → H of Lie groups we have the commutative diagram

           ψ_*
    T_eG -------> T_eH
      |             |
   exp|             |exp
      v             v
     G  ---------> H
           ψ

and since, if G is connected, it and its image in H (the respective connected component) are generated by exp(T_eG) and exp(T_eH), the map ψ is determined uniquely by the differential ψ_* = (dψ)_e.

The reason the exponential map is called that becomes apparent when we consider the case of GLn(R). For any X ∈ End(V) we can set

exp(X) = 1 + X + X²/2 + X³/6 + · · · ,

which converges because if a is the maximal absolute value of an entry of X, then the maximal entry of X^k is by induction less than (n|a|)^k (n = dim V). This map has as inverse exp(−X). Since the differential of this map at X = 0 is the identity, and the map φ_X(t) = exp(tX) satisfies φ_X′(t) = exp(tX)X =

(m_{φ_X(t)})_* X = v_X(φ_X(t)), we have that exp restricted to any line through the origin gives the one-parameter subgroups, so it is the same exp we defined above.

We now want to show something stronger than that exp(L) generates G, namely that the group structure of G is encoded in its Lie algebra. We want to express exp(X).exp(Y) as exp(Z) with Z ∈ L. The difficulty arises from the fact that exp(X) exp(Y) contains products X.Y, which don't have to belong to L. However the bracket [X, Y] ∈ L, so we aim to express exp(X).exp(Y) in terms of sums of bracket operations. We have that log(g) = (g − I) − (g − I)²/2 + (g − I)³/3 − · · · is the formal power series inverse of exp. We define then X ∗ Y = log(exp(X).exp(Y)) ∈ L, i.e. exp(X ∗ Y) = exp(X).exp(Y), and we have X ∗ Y = log(I + (X + Y) + (X²/2 + X.Y + Y²/2) + · · · ).

Theorem 11 (Campbell-Hausdorff formula). The quantity log(exp(X) exp(Y)) = X + Y + (1/2)[X, Y] + · · · can be expressed as a convergent sum of X, Y and iterated brackets. In particular it depends only on the Lie algebra.

Using this theorem, and the assumption that every Lie group may be realized as a subgroup of the general linear group (which follows from the fact that every Lie algebra L is a subalgebra of gl(n, R), shown here for semisimple Lie algebras, and that G is generated by exp(L)), we can show that exp(X).exp(Y) ∈ exp(L) for X, Y ∈ L near the origin. So if we have a Lie subalgebra L_h of L, then its image under the exponential map is locally closed, and hence the group H generated by exp(L_h) is an immersed subgroup of G with tangent space T_eH = L_h.

Now let H and G be Lie groups with G simply connected, and let L_h and L_g be their Lie algebras. Consider the product G × H. Its Lie algebra is L_g ⊕ L_h. If α : L_g → L_h is a map of Lie algebras, we can consider the graph of α, L_J = {(X, α(X)) | X ∈ L_g} ⊂ L_g ⊕ L_h. Then L_J is a Lie subalgebra of L_g × L_h, since it is clearly a vector space and [(X, α(X)), (Y, α(Y))] = ([X, Y], [α(X), α(Y)]) = ([X, Y], α([X, Y])). By what we just noticed, the group J generated by exp(L_J) is an immersed Lie subgroup of G × H with tangent space L_J. Now we have a projection map π : J → G, whose differential dπ : L_J → L_g agrees with the composition of the inclusion L_J ↪ L_g × L_h with the projection to L_g, i.e. it is the map (X, α(X)) → X; so dπ is an isomorphism, and by the inverse function theorem π must be an isogeny; since G is simply connected the image of π must be G, and π is an isomorphism. The projection onto the second factor composed with the inverse of this isomorphism, η = π_2 ∘ π^{−1} : G → H, is a map of Lie groups whose differential is dη = dπ_2 ∘ dπ^{−1} = α. This proves the second principle: a linear map between the Lie algebras is the differential of a map of the Lie groups if and only if it is a map of Lie algebras.
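The first terms of the Campbell-Hausdorff series can be verified numerically (my own sketch; it assumes scipy's expm and logm). For small X and Y, truncating after the degree-3 bracket terms leaves an error of order four in the size of the matrices:

import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(3)
eps = 1e-2
X = eps * rng.standard_normal((3, 3))
Y = eps * rng.standard_normal((3, 3))
br = lambda A, B: A @ B - B @ A

Z = logm(expm(X) @ expm(Y)).real   # X * Y = log(exp(X) exp(Y))
bch = X + Y + br(X, Y) / 2 + (br(X, br(X, Y)) + br(Y, br(Y, X))) / 12

print(np.linalg.norm(Z - bch))     # tiny, of order eps**4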

4 Representations of semisimple Lie algebras

Returning to our study of representations, let's take a look at what we did with sl(2, F). We found elements x, y, h such that x and y were eigenvectors with respect to ad(h), h was semisimple, and sl(2, F) = Fh ⊕ Fx ⊕ Fy. Thanks to these elements we could decompose an irreducible representation into eigenspaces with respect to h, with x and y sending one eigenspace to another. We would like to do something similar for any semisimple Lie algebra L. We shouldn't expect to find single elements corresponding to h, x and y, but rather subalgebras, and if we want to decompose L into a direct sum of subspaces, the most natural approach is to look at the adjoint action of certain elements of L on L.

The crucial fact about h is that it is semisimple, i.e. acts diagonally. Suppose we can find a subalgebra H ⊂ L of semisimple elements which is abelian and acts diagonally on L. Then by the action of H on L we can decompose L into eigenspaces: for α a linear functional on H, let L_α = {x ∈ L | [h, x] = α(h)x for all h ∈ H}; then L = H ⊕ (⊕_{α ∈ H*, α ≠ 0} L_α), where H = L_0. Since L is finite dimensional we will have a finite collection of α's, the weights of the adjoint representation; we call them the roots of L, and the corresponding spaces L_α are called the root spaces of L. The set of roots will be denoted R. We will show that the adjoint action of L_α carries L_β into L_{α+β}, and so the roots will generate a lattice Λ_R ⊂ H*, of rank equal to the dimension of H. Each L_α will be one dimensional and R will be symmetric about 0.

Now if V is any irreducible representation of L, then similarly to the above decomposition we can decompose it into a direct sum V = ⊕_α V_α, where V_α = {v ∈ V | h(v) = α(h)v for all h ∈ H} are the eigenspaces with respect to H. These α's in H* are called the weights of V, the V_α the weight spaces, and dim V_α the multiplicity of the weight α. We will have that for any root β, L_β sends V_α into V_{α+β}. Then the

subspace ⊕_{β ∈ Λ_R} V_{α+β} will be an invariant subspace of V, and since V is irreducible the weights must all be translates of one another by elements of Λ_R.

4.1 Root space decomposition

We will consider the case of an arbitrary representation later; for the moment we will focus on the root space decomposition. Our presentation here will follow [2]. We call a subalgebra toral if it consists of semisimple elements. A semisimple L has nonzero such subalgebras, as otherwise all its elements would be nilpotent, so L would be nilpotent by Engel's theorem, hence solvable and hence not semisimple. Any toral subalgebra is abelian, by the following reasoning. Let T be toral and x ∈ T; then ad x is semisimple (note [TT] ⊂ T, so ad x acts on T), and over an algebraically closed F it is diagonalizable. So if ad x has only 0 eigenvalues on T, then ad x = 0 there. Suppose it has an eigenvalue a ≠ 0, i.e. there is y ∈ T such that [x, y] = ay. Since y is also semisimple, so is ad y, and it has linearly independent eigenvectors y_1 = y, ..., y_n (since ad y(y) = 0) with eigenvalues 0, b_2, ..., b_n; we can write x in this basis as x = a_1 y_1 + · · · + a_n y_n. Then −ay = ad y(x) = 0·a_1 y_1 + b_2 a_2 y_2 + · · · , i.e. −ay_1 is a linear combination of the other eigenvectors, which is impossible. So a = 0 and ad x = 0 for all x ∈ T, i.e. [x, y] = 0 for all x, y ∈ T.

Let H be a maximal toral subalgebra of L, i.e. one not included in any other. For any h_1, h_2 ∈ H we have ad h_1 ∘ ad h_2(x) = [h_1, [h_2, x]] = −[h_2, [x, h_1]] − [x, [h_1, h_2]] = [h_2, [h_1, x]] = ad h_2 ∘ ad h_1(x) by the Jacobi identity, so ad_L H consists of commuting semisimple endomorphisms, and by a standard theorem in linear algebra (footnote 8) these are simultaneously diagonalizable. So we can find eigenspaces L_α = {x ∈ L | [hx] = α(h)x for all h ∈ H} for α ∈ H* whose direct sum is L, i.e. we can write the

Root space decomposition:   L = C_L(H) ⊕ (⊕_{α ∈ Φ} L_α),   (7)

where Φ is the set of α ∈ H*, α ≠ 0, for which L_α ≠ 0. Here C_L(H) is just L_0. Our goal is to prove that C_L(H) = H. We will show first that

8. Let A and B be two diagonalizable commuting matrices. For any eigenvalue α of B and v with Bv = αv we have α(Av) = ABv = B(Av), i.e. A fixes the eigenspaces B_α of B. Since A|_{B_α} is still semisimple we can diagonalize it and obtain an eigenbasis of each B_α for A, which is necessarily also an eigenbasis for B.

Lemma 4. For all α, β ∈ H* we have [L_α L_β] ⊂ L_{α+β}. For any x ∈ L_α with α ≠ 0, ad x is nilpotent, and if β ≠ −α then the spaces L_α, L_β are orthogonal with respect to the Killing form κ.

Proof. Let x ∈ L_α, y ∈ L_β and h ∈ H; then by the Jacobi identity we have ad h([x, y]) = [h, [x, y]] = [x, [h, y]] + [[h, x], y] = [x, β(h)y] + [α(h)x, y] = (α + β)(h)[x, y], so [x, y] ∈ L_{α+β}.

Next, by the first part, ad x maps each L_β into L_{β+α} (and C_L(H) into L_α); since there are only finitely many roots and α ≠ 0, a high enough power of ad x kills every summand of (7), so ad x is nilpotent.

Next, let h ∈ H be such that (α + β)(h) ≠ 0. Since the Killing form is associative, for x ∈ L_α, y ∈ L_β we have κ([x, y], h) = κ(x, [y, h]) = −β(h)κ(x, y) on one hand, and κ([x, y], h) = −κ([y, x], h) = −κ(y, [x, h]) = α(h)κ(x, y) on the other; so (α + β)(h)κ(x, y) = 0, and hence κ(x, y) = 0.

As a corollary to this lemma we have that the restriction of κ to CL(H) is nondegenerate: κ is nondegenerate on L, and L0 = CL(H) is orthogonal to Lα for all α ≠ 0, so if some z ∈ L0 had κ(z, L0) = 0, it would satisfy κ(z, L) = 0 and thus be 0. This will help us prove the next theorem.

Theorem 12. Let H be a maximal toral subalgebra of L. Then H = CL(H).

Proof. Our goal is to show that H contains all semisimple elements of CL(H) and that CL(H) has no nonzero nilpotent elements.

(1) First of all, since CL(H) maps H to 0 via ad, the nilpotent and semisimple parts of each ad x, x ∈ CL(H), being polynomials in ad x, also map H to 0. But (ad x)s = ad xs and (ad x)n = ad xn, so xs, xn ∈ CL(H) too. Now suppose x ∈ CL(H) is semisimple. Then the subalgebra generated by H and x is still toral, as x commutes with everything in H and a sum of commuting semisimple elements is semisimple (which follows, for example, from simultaneous diagonalizability). Since H is a maximal toral subalgebra, we must have x ∈ H.

(2) κ is nondegenerate on H. First, if x ∈ CL(H) is nilpotent, then ad x is nilpotent, [x, H] = 0, and for y ∈ H we have ad x ◦ ad y = ad y ◦ ad x, so (ad x ◦ ad y)^k = (ad y)^k ◦ (ad x)^k for any k; hence ad x ad y is nilpotent and κ(x, y) = 0 (a nilpotent operator has trace 0). Now let h ∈ H with κ(h, H) = 0. By (1), any x ∈ CL(H) decomposes as x = xs + xn with xs ∈ H and xn ∈ CL(H) nilpotent, so κ(h, x) = κ(h, xs) + κ(h, xn) = 0 + 0 = 0. Since κ is nondegenerate on CL(H), we get h = 0.

(3) CL(H) is nilpotent, by Engel's theorem, as follows. For any x ∈ CL(H) we have x = xs + xn, where ad xn is nilpotent and ad xs vanishes on CL(H), since xs ∈ H by (1) and everything in CL(H) commutes with H. Since moreover [xs, xn] = 0, ad x is the sum of two commuting nilpotents on CL(H) and so is nilpotent.

(4) H ∩ [CL(H)CL(H)] = 0. For h ∈ H and x, y ∈ CL(H) we have κ(h, [x, y]) = κ([h, x], y) = 0 by associativity, so κ(H, [CL(H)CL(H)]) = 0; by the nondegeneracy of κ on H from (2), any element of H ∩ [CL(H)CL(H)] must be 0.

(5) CL(H) is abelian. Suppose [CL(H)CL(H)] ≠ 0. Since CL(H) is nilpotent by (3), by lemma 1 it acts on the ideal [CL(H), CL(H)] via the adjoint representation with a common eigenvector of eigenvalue 0, i.e. [CL(H), CL(H)] ∩ Z(CL(H)) ≠ 0. Let z be such a nonzero element. By (1) and (4), z cannot be semisimple, as a semisimple z would lie in H ∩ [CL(H)CL(H)] = 0. So zn ≠ 0, and zn ∈ CL(H) by (1). Since [zs + zn, c] = 0 for all c ∈ CL(H) and [zs, c] = 0 because zs ∈ H, we must have [zn, c] = 0 for all c ∈ CL(H). Then ad zn ad c is nilpotent by commutativity, so κ(zn, CL(H)) = 0, contradicting the nondegeneracy of κ on CL(H). So [CL(H)CL(H)] = 0 and CL(H) is abelian.

(6) Finally, since CL(H) is abelian, for any nilpotent element xn of it we have κ(xn, CL(H)) = 0, as xn and c commute for all c ∈ CL(H). This contradicts the nondegeneracy of κ on CL(H), so CL(H) has no nonzero nilpotent elements. But then all its elements are semisimple by (1) (their nilpotent parts vanish), and then by (1) again they all lie in H, so H = CL(H). Together with the previous arguments this shows

Theorem 13 (Cartan decomposition). Let L be semisimple. If H is a maximal toral subalgebra of L, then there are eigenspaces Lα = {x ∈ L | [hx] = α(h)x for all h ∈ H} for α ∈ H∗ such that

L = H ⊕ (⊕α∈Φ Lα).

From the above proof we have in particular that κ is nondegenerate on H, so we can find a basis h1, . . . , hl of H orthonormal with respect to κ. Every φ ∈ H∗ is then uniquely determined by its values on the hi, say φ(hi) = φi. If tφ = φ1h1 + ··· + φlhl, then κ(tφ, hi) = φi, so by linearity κ(tφ, h) = φ(h) for every h ∈ H. We will now prove some useful facts about the roots.

Theorem 14. 1. The set Φ of roots of L relative to H spans H∗.

2. If α ∈ Φ, then −α ∈ Φ.

3. Let α ∈ Φ and let tα be the element of H as defined above, i.e. α(h) = κ(tα, h). If x ∈ Lα, y ∈ L−α, then [x, y] = κ(x, y)tα.

4. If α ∈ Φ then [LαL−α] is one dimensional with basis tα.

5. α(tα) 6= 0 for α ∈ Φ.

6. If α ∈ Φ and xα 6= 0 is in Lα, then there is a yα ∈ L−α, such that the elements xα, yα and hα = [xαyα] span a three dimensional subalgebra of L isomorphic to sl(2,F ) via xα → x, yα → y and hα → h.

7. [xαyα] = hα = 2tα/κ(tα, tα), and hα = −h−α.

Proof. (1) If Φ does not span H∗, then there is a nonzero h ∈ H such that α(h) = 0 for all α ∈ Φ, i.e. for any x ∈ Lα, ad h(x) = 0, so [h, Lα] = 0. But [hH] = 0 as well, H being abelian, so [hL] = 0, i.e. h ∈ Z(L). But Z(L) = 0 since L is semisimple, so we reach a contradiction.

(2) We have from lemma 4 that Lα and Lβ are orthogonal unless β = −α. If −α ∉ Φ, then by the same lemma κ(Lα, L) = 0, contradicting the nondegeneracy.

(3) For any h ∈ H consider κ([x, y] − κ(x, y)tα, h) = κ([x, y], h) − κ(x, y)κ(tα, h) = κ(x, [y, h]) − α(h)κ(x, y) = α(h)κ(x, y) − α(h)κ(x, y) = 0, using the associativity of κ and [y, h] = α(h)y for y ∈ L−α. Since [x, y] ∈ H by lemma 4 and κ is nondegenerate on H, we must have [x, y] − κ(x, y)tα = 0.

(4) This is a direct corollary of (3), provided [LαL−α] ≠ 0. If it were 0, then by (3) we would have κ(Lα, L−α) = 0, hence κ(Lα, L) = 0 by lemma 4; since Lα ≠ 0, this contradicts the nondegeneracy of κ on L.

(5) Suppose α(tα) = 0. Take x ∈ Lα and y ∈ L−α such that [x, y] ≠ 0 (possible by (4)). We have [tαx] = α(tα)x = 0 and similarly [tαy] = 0, so the subspace S spanned by x, y and tα is a subalgebra of L. It acts on L via the adjoint representation and S ≅ adL S ⊂ gl(L). S is solvable (indeed [SS] ⊂ Ftα is central in S), so by Lie's theorem adL S is upper triangular in a suitable basis, and adL [SS] is then strictly upper triangular, i.e. nilpotent; in particular adL tα is nilpotent. But tα ∈ H is semisimple, so ad tα is semisimple as well, forcing adL tα = 0, i.e. tα ∈ Z(L) = 0, a contradiction.

(6) Take xα ∈ Lα and let yα ∈ L−α be such that κ(xα, yα) = 2/κ(tα, tα) = 2/α(tα) (note κ(tα, tα) = α(tα) by the definition of tα), possible because of (4) and the nondegeneracy of the pairing between Lα and L−α. We set hα = 2tα/κ(tα, tα); then [xα, yα] = hα by (3), and [hαxα] = α(hα)xα = (2α(tα)/α(tα))xα = 2xα, and similarly, since [tαyα] = −α(tα)yα, we have [hαyα] = −2yα: the same relations as in equation (1) for sl(2,F). So the three dimensional algebra spanned by xα, yα, hα is isomorphic to sl(2,F).

(7) By definition of tα we have κ(tα, h) = α(h), and also κ(t−α, h) = −α(h), so by linearity κ(tα + t−α, h) = 0 for all h ∈ H, and by nondegeneracy on H we must have t−α = −tα, which gives h−α = −hα and shows (7).
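All of this can be made concrete in sl(2,C), where Φ = {α, −α} and α(h) = 2. The following small sketch (plain Python with numpy; the basis encoding and helper names are ours) computes the Killing form from its definition κ(a, b) = Tr(ad a ad b) and checks that the recipe hα = 2tα/κ(tα, tα) of part (7) recovers the familiar h.

import numpy as np

# standard basis of sl(2,C); a general element is [[c, p], [q, -c]]
x = np.array([[0., 1.], [0., 0.]])
y = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
basis = [x, y, h]

def ad(a):
    # matrix of ad a in the basis (x, y, h): coordinates of [a, e] are (p, q, c)
    cols = []
    for e in basis:
        v = a @ e - e @ a
        cols.append([v[0, 1], v[1, 0], v[0, 0]])
    return np.array(cols).T

def kappa(a, b):                        # Killing form Tr(ad a ad b)
    return np.trace(ad(a) @ ad(b))

# H = Ch is one dimensional, so t_alpha = c h, and kappa(t_alpha, h) = alpha(h) = 2
t_alpha = (2 / kappa(h, h)) * h         # kappa(h, h) = 8, so t_alpha = h/4
h_alpha = 2 * t_alpha / kappa(t_alpha, t_alpha)   # part (7); kappa(t_alpha, t_alpha) = 1/2
assert np.allclose(h_alpha, h)          # indeed alpha(h_alpha) = 2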

In view of what we proved in part (6) of this theorem, let Sα ≅ sl(2,F) be the subalgebra of L spanned by xα, yα and hα. For a fixed α ∈ Φ, let M = span{H1, Lcα | cα ∈ Φ}, where H1 ⊂ H is spanned by hα and Ker α ⊂ H (in fact H1 = H, since α(hα) = 2 ≠ 0). Lemma 4 shows that ad(Sα)(M) ⊂ M, i.e. M is an Sα-submodule of L, i.e. a representation of Sα.

By our analysis of sl(2,F), all weights of hα on M must be integers; the weights occurring here are 0 (on H1) and the 2c (as (cα)(hα) = 2c), where c runs over the values for which Lcα ≠ 0. So in particular any such c must be an integral multiple of 1/2. Now M is not actually irreducible. In fact, Sα is an Sα-submodule of M (it is a subspace already), and Ker α is also an Sα-submodule, since for every h ∈ Ker α we have ad xα(h) = −α(h)xα = 0, the same for yα, and ad hα(h) = [hα, h] = 0 since H is abelian. So M = Ker α ⊕ Sα ⊕ M1 as an Sα-module, for some complementary submodule M1. Note that the weight 0 of hα cannot occur in M1: the zero weight space of M is exactly H1 (the weights on the Lcα are 2c ≠ 0), and H1 = Ker α ⊕ Fhα is already accounted for inside Ker α ⊕ Sα. But every irreducible Sα-module with even highest weight contains the weight 0, so M1 contains no even weights at all; the even weights of M live only in Ker α and Sα, where they are respectively 0 and −2, 0, +2. In particular 2α cannot be a root, as it would contribute the even weight 4, which is not accounted for. So twice a root is never a root, and then 1/2·α cannot be a root either (otherwise 2·(1/2)α = α would not be one). So 1 is not a weight of hα on M, and since every irreducible module with odd highest weight contains the weight 1, there are no odd weights either: M1 = 0 and M = Ker α ⊕ Sα as Sα-modules. But Lα ⊂ M, so Lα = Lα ∩ (Ker α ⊕ Sα) = Lα ∩ Sα = span{xα} as a vector space; in particular dim Lα = 1. Thus we have proved the following theorem.

Theorem 15. For any α ∈ Φ we have dim Lα = 1, and the only multiples of α which are roots are ±α.

We will now show some more facts revealing the structure of the set of roots.

Theorem 16. Let α, β ∈ Φ, then

1. β(hα) ∈ Z and β − β(hα)α ∈ Φ, the numbers β(hα) are called the Cartan integers.

2. If also α + β ∈ Φ, then [LαLβ] = Lα+β.

3. If β ≠ ±α and r and q are the largest integers for which β − rα and β + qα are roots, then β + pα ∈ Φ for all p = −r, −r + 1, . . . , q − 1, q, and β(hα) = r − q.

4. L is generated as a Lie algebra by the root spaces Lα.

Proof. Consider the action of Sα on Lβ for β ≠ ±α. Let K = ⊕i∈Z Lβ+iα. Then Sα acts on K, since ad(xα)(Lβ+iα) ⊂ Lβ+(i+1)α, similarly for yα, and hα fixes each summand. Note also that β + iα ≠ 0 for all i: iα is not a root unless i = 0, ±1 by theorem 15, while β is a root different from ±α, so β ≠ −iα. Thus K is an Sα-submodule of L, so its weights must be integral. The weights are given by the action of hα on the one dimensional (by the previous theorem) spaces Lβ+iα: if these are spanned by zβ+iα respectively, then ad hα(zβ+iα) = (β + iα)(hα)zβ+iα, so the weights are β(hα) + iα(hα) = β(hα) + 2i; in particular β(hα) ∈ Z. These integers are distinct and all of the same parity, so at most one of 0 and 1 occurs among them, and at most once; hence K must be irreducible by theorem 8. By the same theorem, if β(hα) + 2q and β(hα) − 2r are the highest, resp. lowest, weights, then all β(hα) + 2p in between must occur. These are weights if and only if Lβ+pα ≠ 0, i.e. β + pα is a root, for p = −r, . . . , q: they form a string, called the α-string through β. Again by the same theorem the weights are symmetric about 0, so β(hα) − 2r = −(β(hα) + 2q), i.e. β(hα) = r − q. In particular, since −r ≤ q − r ≤ q, the weight of β + (q − r)α = β − β(hα)α also appears, i.e. β − β(hα)α ∈ Φ. Last, if α + β ∈ Φ, then ad(xα)(Lβ) ⊂ Lα+β, and since α + β is a root, Lα+β is a nonzero weight space of the irreducible module K; inside an irreducible Sα-module, xα maps each weight space onto the next one up whenever that one is nonzero, so ad xα(Lβ) ≠ 0. Since Lα+β is one dimensional, we must have [LαLβ] = Lα+β. Finally, part (4) follows since the tα = [xαyα] ∈ [LαL−α] already span H (Φ spans H∗ by theorem 14); together with the Lα themselves they generate all of L.
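For concreteness, here is the smallest example of an α-string, borrowing facts from the sl(3,C) analysis of section 5.1, where the roots are the αij = li − lj: take α = l1 − l2 and β = l2 − l3, so that hα = e11 − e22 and β(hα) = −1. Here β − α = 2l2 − l1 − l3 is not a root while β + α = l1 − l3 is, so r = 0, q = 1, and indeed β(hα) = r − q = −1; the α-string through β is β, β + α. Part (2) is also visible directly: [Lα, Lβ] = [Ce12, Ce23] = Ce13 = Lα+β.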

Finally, let α1, . . . , αl ∈ Φ be a basis of H∗ consisting of roots (possible since Φ spans H∗). For any β ∈ Φ we can write β = Σi ciαi, and we want to see what the ci look like. We can obtain an inner product on H∗ from the Killing form: for δ, γ ∈ H∗ define (γ, δ) = κ(tγ, tδ). We have (γ, γ) = κ(tγ, tγ) = γ(tγ) ≠ 0 for γ ∈ Φ, as we already proved, and since κ is nondegenerate on H, this bilinear form on H∗ is nondegenerate. This inner product, apart from arising naturally from the Killing form, is useful because by theorem 16, part (1), we have γ(hδ) ∈ Z, which by the definition of hδ amounts to γ(2tδ/δ(tδ)) = 2γ(tδ)/δ(tδ) = 2κ(tδ, tγ)/κ(tδ, tδ) = 2(δ, γ)/(δ, δ) ∈ Z. Now using this inner product we can write (β, αj) = Σi ci(αi, αj), or, dividing through by (αj, αj)/2, the system 2(β, αj)/(αj, αj) = Σi ci·2(αi, αj)/(αj, αj), whose coefficients and right-hand sides are all integers. The system is nonsingular (the Gram matrix of a basis with respect to a nondegenerate form is invertible), so solving it by simple linear algebra we see that the solutions are rational, i.e. ci ∈ Q.

Moreover, we will show that (α, β) ∈ Q for all roots α, β. The operator ad tγ ad tδ acts on each one dimensional root space Lα by the scalar α(tγ)α(tδ) and acts by 0 on H, so taking traces we get κ(tγ, tδ) = Tr(ad tγ ad tδ) = Σα∈Φ α(tγ)α(tδ) = Σα∈Φ κ(tγ, tα)κ(tδ, tα); in particular (β, β) = Σα∈Φ (α, β)². We already know 2(α, β)/(β, β) ∈ Q, so dividing both sides by (β, β)² we get 1/(β, β) = Σα ((α, β)/(β, β))² ∈ Q, so (β, β) ∈ Q, and then also (α, β) = ((2(α, β)/(β, β))·(β, β))/2 ∈ Q. The last equality also shows that (β, β) > 0, being a nonzero sum of squares of rationals; hence our inner product is positive definite on the rational span of the roots. Thus we have shown the following theorem.

Theorem 17. If α1, . . . , αl ∈ Φ form a basis of H∗, then every β ∈ Φ can be written as Σi ciαi with ci ∈ Q, and the inner product on H∗ defined as the extension of

(γ, δ) = κ(tγ, tδ)

is positive definite on the rational (hence real) span of Φ and is rational restricted to Φ, i.e. (β, α) ∈ Q for all α, β ∈ Φ.

4.2 Roots and weights; the Let ρ : L → gl(V ) be a representation of L. We know that ρ(H) would still be an abelian subalgebra of ∗ semisimple matrices, i.e. we can write a decomposition V = ⊕Vα where α ∈ H and Vα = {v ∈ V |h.v = α(h)v for all h ∈ H }. We will drop the ρ as it will be clear what we mean by h.v = ρ(h).v. These eigenvalues α are called weights, so for example the weights of the adjoint representation are the roots. If now β is a root of L we can consider the action of Lβ over Vα. Let y ∈ Lβ and v ∈ Vα. Then by our fundamental computations we have h.(y.v) = y.(h.v) + [h, y]v = y.(α(h)v) + β(h)y.v = (α + β)(h)(y.v) for any h ∈ H, so Lβ : Vα → Vα+β. In particular, if V is irreducible then the weights must be congruent to one another module the root lattice ΛR = ΛΦ as we showed in the beginning of the Root space decomposition section9. We showed that for any roots β and α that β(hα) = 0 from theorem 16. Now suppose β is a weight of V , we claim that the same is true. It follows in the same way we proved it for the roots. Again consider the subalgebra Sα ' sl(2,F ) and its action of Vβ, let K = ⊕iVβ+iα and assume β 6= mα for any m ∈ Z, then none of the β + iα would be 0. Then K is a Sα submodule of V and again by what we know about the representations of sl(2,F ), we get an α−string through β and the weights must be integral. The weights are given by the action of hα on Vβ+iα, i.e. iα(hα) + β(hα), α has still the same meaning, i.e. it is a root, so α(hα) = 2 (by definition of hα) and so β(hα) ∈ Z. So we have just shown a lemma.

Lemma 5. All weights of all representations assume integer values on the hαs.

Now let ΛW be the set of linear functionals on H which take integer values on the hα. Then ΛW is a lattice in which all weights lie, the roots in particular. We already know that the set of roots is symmetric about 0 (α ∈ Φ implies −α ∈ Φ); we now explore further symmetries that preserve the set of weights.

Definition 12. The Weyl group W is the group generated by the involutions Wα given by

Wα(β) = β − β(hα)α = β − (2(β, α)/(α, α))·α,

where the inner product is the extension of the one given by (α, β) = κ(tα, tβ) when α, β are roots.

The Wα is in fact a reflection, and the hyperplane it fixes is given by β(hα) = 0, i.e. it is Ωα = {β ∈ H∗ | β(hα) = 0}. With respect to the inner product, for every β ∈ Ωα we have (β, α) = κ(tβ, tα), which is proportional to κ(tβ, hα) = β(hα) = 0, so Ωα ⊥ α. We also have Wα(α) = α − 2α = −α, so the Wα are indeed reflections. We can say then that the Weyl group is generated by the reflections in the hyperplanes perpendicular to the roots. As a consequence of theorem 16 we have Wα(Φ) ⊂ Φ for all α ∈ Φ, and the string argument above applies verbatim to the weights of any representation, so

Lemma 6. The set of weights of a representation is invariant under the Weyl group.

9 We will denote the same object, the set of roots, by both R and Φ, as the notations in [2] and [1] differ.

Again from theorem 16, the roots β − rα, . . . , β + qα appear; if we let γ = β − rα, then the string is γ, . . . , γ + (q + r)α, and under hα the corresponding sl(2,F)-weights are γ(hα), . . . , γ(hα) + 2(q + r). These must be symmetric about 0, so γ(hα) = −(q + r): the length of the α-string equals the negative of the value of its lowest weight at hα. We also have that the multiplicities of the weights are invariant under the Weyl group: again restrict the analysis to any string of weights, i.e. to the action of Sα, and apply theorem 8.

In our study of sl(2,F) we started off by choosing the "farthest" nonzero Vλ, which was possible since the integers have a natural ordering. We would like to apply a similar analysis, and for this purpose we need to introduce an ordering on the roots. Now let l : RΦ → R be any linear functional on the real span of Φ, and let R+ = {α ∈ Φ | l(α) > 0} be the set of positive roots with respect to l and R− = {β ∈ Φ | l(β) < 0} the negative ones. Note that we can choose l so that no root lies in its kernel, so that Φ = R+ ∪ R−. Note also that positivity/negativity depends on the choice of the functional.

Definition 13. Let V be a representation of L. A nonzero vector v ∈ V which is both an eigenvector for the action of H and killed by all Lα for α ∈ R+ (for some choice of a functional l as above) is called a highest weight vector of V. If V is irreducible, we call its weight α the highest weight of the representation.

We will prove now one of the main facts.

Theorem 18. Let L be a semisimple Lie algebra. Then

1. Every finite dimensional representation possesses a highest weight vector.

2. The subspace W of V generated by the images of a highest weight vector v under successive applications of the root spaces Lβ for β ∈ R− is an irreducible subrepresentation.

3. Given the choice of l, i.e. fixed positive roots, the highest weight vector of an irreducible representation is unique up to scalar.

Proof. Let β be a weight of V which maximizes l, and let v ∈ Vβ be nonzero. If γ ∈ R+ is a positive root, then l(γ + β) = l(β) + l(γ) > l(β), so by our choice γ + β is not a weight, i.e. Vγ+β = 0. But LγVβ ⊂ Vγ+β = 0, so Lγ kills v for all positive roots γ, proving the first part.

We show part (2) by induction on the length of words in negative root spaces. Namely, let Wn be the subspace spanned by all wn.v such that wn is a word of length at most n in elements of negative root spaces; we claim that x.Wn ⊂ Wn for every x ∈ Lα with α ∈ R+. For n = 0 this holds, since x.v = 0. In general write wn = y.wn−1, where wn−1 is a word of length at most n − 1 and y ∈ Lβ for some β ∈ R−. Then x.(wn.v) = x.y.(wn−1.v) = y.x.(wn−1.v) + [x, y].(wn−1.v). By induction x.(wn−1.v) ∈ Wn−1, so y.x.(wn−1.v) ∈ Wn. For the remaining term, note [x, y] ∈ Lα+β: if α + β ∉ Φ ∪ {0}, then [x, y] = 0; if α + β ∈ R−, then [x, y].(wn−1.v) ∈ Wn by definition; if α + β ∈ R+, then [x, y].(wn−1.v) ∈ Wn−1 by the induction hypothesis; and in the case β = −α we have [x, y] = u·tα for a scalar u by theorem 14, so [x, y] ∈ H fixes the weight spaces and [x, y].(wn−1.v) ∈ Wn−1, as wn−1.v is a weight vector. Altogether x.(wn.v) ∈ Wn. This shows that the space W = Σn Wn, generated by successive applications of only negative root spaces to v, is stable under the positive root spaces. As it is also stable under the negative root spaces and under H (each wn.v is a weight vector), W is an L-submodule of V. It is irreducible: by complete reducibility W is a direct sum of irreducible submodules; the weight space of W for the weight β is one dimensional, spanned by v, since every generator wn.v with n ≥ 1 has weight with strictly smaller value of l; hence v lies in one of the irreducible summands, and since v generates W, that summand is all of W.

From here it follows in particular that if V is irreducible with highest weight α, then dim Vα = 1. If there were two vectors v, w ∈ Vα which are not scalar multiples of each other, we could not possibly obtain w from v by applying only negative root spaces, as Lβ1(Lβ2 . . . v) ⊂ Vα+β1+··· ≠ Vα, for l(α + β1 + ···) < l(α). So w would not lie in the irreducible subrepresentation generated by v as in part (2), which for irreducible V is all of V, a contradiction; so dim Vα = 1. Uniqueness then follows from the uniqueness of the highest weight α.

As a corollary to this theorem, or more precisely to the proof of part (2), we get exercise 14.15 from [1] by considering the adjoint representation. If α1, . . . , αk are roots of L, then the subalgebra L′ generated by H, Lα1, . . . , Lαk is in fact L″ = H ⊕ (⊕Lα), the sum running over all roots α ∈ N{α1, . . . , αk} ∩ Φ, i.e. all roots which are nonnegative integral combinations of the αi. The containment L′ ⊂ L″ follows similarly to the proof of theorem 18, part (2), by induction on the number of summands αi, or simply because each Lαi sends L″ into itself, so L″ is a subalgebra containing the generators. The containment L″ ⊂ L′ follows from theorem 16, since [LαLβ] = Lα+β whenever α + β is a root. In view of this result we can make the following definition.

Definition 14. We call a root simple if it cannot be expressed as a sum of two positive or of two negative roots.

We see that every negative root is a sum of simple negative roots, and in view of the preceding paragraph the subalgebra generated by H and the Lβ, for β running over the simple negative roots, is in fact H ⊕ (⊕β∈R− Lβ). So we can rephrase the last theorem by replacing negative roots with simple negative roots: any irreducible representation is generated as a vector space by the images of a highest weight vector under successive applications of the simple negative root spaces. We are now going to prove a theorem that ties together the Weyl group and the weights of irreducible representations.

Theorem 19. Let V be an irreducible representation and α its highest weight. The vertices of the convex hull of the weights of V are the conjugates of α under the Weyl group. The set of weights of V consists exactly of the points that are congruent to α modulo the root lattice ΛR and that lie in the convex hull of the images of α under the Weyl group W.

Proof. In view of our observations so far, since V is generated by successive applications of negative root spaces to the highest weight vector v, all weights appearing in V are of the form α + β1 + β2 + ··· with βi ∈ R−, so all weights lie in the cone α + C−, where C− is the cone spanned by R−. Note that by our definitions C− is entirely contained in one half-space (the one where l is negative). Since α is a weight, we showed that if α + β is also a weight, then so is the whole string α, α + β, . . . , α − α(hβ)β; since α is maximal, it is necessarily the case that β ∈ R−. It also follows that α − α(hβ)β = Wβ(α) is a weight. Any vertex of the convex hull of the weights of V lies on a line through α and α + β for some β ∈ R−, and this vertex must then be Wβ(α), as we showed earlier (it is the other end of the string of weights, geometrically a segment). As we also know, the weights of the form γ + nδ for γ a weight and δ ∈ Φ must form a string, so for any two weights of V the intersection of the line segment connecting them with α + ΛR (i.e. with all possible weights) consists entirely of weights. So the set of weights is convex in this sense, i.e. it is the intersection of α + ΛR with its convex hull.

Note that the weight α − α(hβ)β cannot be higher than α in our ordering, i.e. l(α) ≥ l(α − α(hβ)β) = l(α) − α(hβ)l(β), and since l(β) < 0 this forces α(hβ) ≤ 0, or alternatively α(−h−β) ≤ 0, i.e. α(h−β) ≥ 0, where by the symmetry of the roots −β now runs over R+.

Definition 15. The Weyl chamber (associated to the given ordering) is the locus W of points γ ∈ H∗ such that γ(hβ) ≥ 0 for all β ∈ R+. Alternatively, it is the closure of a connected component of the complement of the union of the hyperplanes Ωα, a cone whose faces lie on the Ωα.

The Weyl group acts transitively on the Weyl chambers. The importance of the Weyl chamber is revealed in the following theorem.

Theorem 20. Let W be the Weyl chamber associated to some ordering of the roots. Let ΛW be the weight lattice. Then for any α ∈ W ∩ ΛW , there exists a unique irreducible finite-dimensional representation Γα of L with highest weight α.

This theorem gives a bijection between the points of W ∩ ΛW and the irreducible representations. In view of the previous theorem, the weights of Γα will be the points of α + ΛR lying in the polytope whose vertices are the images of α under the Weyl group W. Note moreover that these are all the irreducible representations: for every irreducible representation its highest weight lies in the Weyl chamber, by the remark before definition 15.

Proof. We will show uniqueness first. If V and W are two irreducible finite-dimensional representations of L with highest weight vectors v and w of weight α, then consider the vector (v, w) ∈ V ⊕ W. Its weight is again

α, and in general the weight space decomposition is still ⊕(Vβ ⊕ Wβ), so the weights are the same and α is again highest (note that the weights are the same because they are solely determined by α, by the previous theorem). Let U ⊂ V ⊕ W be the irreducible subrepresentation generated by (v, w), and consider the projection maps π1 : U → V and π2 : U → W, nonzero since they map (v, w) to v and w respectively. Then Im πi is a nonzero submodule of the irreducible V (resp. W), and Ker πi is a proper submodule of the irreducible U, so each πi is both surjective and injective, i.e. an isomorphism, showing in particular that π2 ◦ π1^(−1) : V → W is an isomorphism; so the uniqueness is proved.

Existence is much harder and involves concepts we have not introduced; we will nevertheless try to describe it here. It involves the concept of a universal enveloping algebra, that is, an associative algebra U over F together with a linear map i : L → U such that i([x, y]) = i(x)i(y) − i(y)i(x), having the universal property that every other such algebra arises from a homomorphism out of U. Such an algebra can be explicitly constructed as U(L) = T(L)/J, where T(L) = ⊕i T^iL is the tensor algebra and J is the ideal generated by the x ⊗ y − y ⊗ x − [xy] for x, y ∈ L. Now we can construct a so-called standard cyclic module Z(α) of weight α as follows. Choose nonzero xβ ∈ Lβ for β ∈ R+ and let I(α) be the left ideal of U(L) generated by all these xβ together with all hγ − α(hγ)·1 for γ ∈ Φ = R. The choice of this ideal makes sense as its elements annihilate a highest weight vector of weight α, so we can set Z(α) = U(L)/I(α), where the coset of 1 corresponds to the line through our highest weight vector v, i.e. Z(α) ≅ U(L)v. Next, by some facts about standard cyclic modules, Z(α) has a unique maximal submodule Y, and V(α) = Z(α)/Y is our irreducible representation of highest weight α. To summarize, the point was to construct a module of highest weight α explicitly and take its irreducible quotient, which still contains the image of the highest weight vector we started with.

Just like with the definition of simple roots, we can define the fundamental weights ω1, . . . , ωn: the weights such that any weight in the Weyl chamber can be expressed uniquely as a nonnegative integral combination of them. These are in fact the first weights along the edges of the Weyl chamber, i.e. ωi(hαj) = δij, where the αj are the simple roots. Then every weight α in the Weyl chamber (also called a dominant weight) is a nonnegative integral sum α = Σ aiωi, and we can write the corresponding irreducible representation as Γa1,...,an = Γa1ω1+···+anωn.
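In the smallest case this recovers what we already know: for sl(2,F) there is a single simple root α with α(hα) = 2, hence a single fundamental weight ω = α/2 (so that ω(hα) = 1), and the closed Weyl chamber is the ray of nonnegative multiples of ω. The dominant weights are the nω for n ∈ N, and Γn = Γnω is precisely the irreducible representation of highest weight n from our analysis of sl(2,F) in section 2.3.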

5 The special linear Lie algebra sl(n, C) and the general linear group GLn(C)

It is time now to apply all the developed theory to our special case of interest: the special linear Lie algebra and the general linear group. We will first reveal the structure of L = sl(n, C), i.e. find its maximal toral subalgebra H, its roots and the root spaces Lα; then we will proceed to finding the weights and the Weyl chamber and to constructing the irreducible representations. We will also see at the end how all this helps in finding the representations of GLn(C).

5.1 Structure of sl(n, C)

We will first find (a) maximal toral subalgebra of L = sl(n, C). The obvious choice is the set H of all diagonal matrices of trace 0. It is clearly a toral subalgebra; the only question is whether it is maximal. Suppose that H is not maximal, so H ⊊ T for some other toral subalgebra T. Since T must still be abelian, by what we showed at the beginning of the section on the root space decomposition, for every t ∈ T and h ∈ H we have [th] = 0, i.e. T ⊂ CL(H). The question then is to show that, for our particular choice, the only matrices in L commuting with every diagonal matrix of trace 0 are the diagonal ones, i.e. H ⊂ T ⊂ CL(H) = H, so T = H. Let eij be the matrix whose only nonzero entry is a 1 in position (i, j). If h = diag(a1, . . . , an), then [heij] = aieij − ajeij = (ai − aj)eij. Let x = Σ xijeij, i.e. x is a matrix with entries xij. Then [hx] = Σi,j (ai − aj)xijeij, and x ∈ CL(H) iff (ai − aj)xij = 0 for all i, j and all a1, . . . , an such that a1 + ··· + an = 0. So, fixing x, for every i ≠ j we can take ai = 1, aj = −1 and ak = 0 otherwise, forcing xij = 0 and showing that x must be diagonal.
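This centralizer computation can also be cross-checked numerically. The following sketch (plain Python with numpy; the vectorization of ad h is our encoding) computes the common kernel of the maps x → [hk, x] on all of gl(3, C) for a basis h1, h2 of H and finds it three dimensional, i.e. exactly the diagonal matrices; intersecting with the trace zero condition then gives CL(H) = H.

import numpy as np

n = 3
# basis h_k = e_kk - e_(k+1)(k+1) of the traceless diagonal matrices H in sl(3,C)
hs = []
for k in range(n - 1):
    m = np.zeros((n, n)); m[k, k] = 1.0; m[k + 1, k + 1] = -1.0
    hs.append(m)

# stack the linear maps x -> [h, x] acting on the row-major vec(x):
# vec(h x) = (h kron I) vec(x), vec(x h) = (I kron h^T) vec(x)
blocks = [np.kron(h, np.eye(n)) - np.kron(np.eye(n), h.T) for h in hs]
A = np.vstack(blocks)

# the common kernel is the centralizer of H inside gl(3,C); counting zero
# singular values shows it is 3-dimensional, i.e. just the diagonal matrices
null_dim = int(np.sum(np.linalg.svd(A, compute_uv=False) < 1e-10))
assert null_dim == n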

Now that we have our maximal toral subalgebra H, we need to find the root spaces Lα, and for that we first need to find the roots. We just saw that if h = diag(a1, . . . , an) ∈ H, then [heij] = (ai − aj)eij, so the matrices eij are eigenvectors for H. If we then let αij : H → C be given by αij(diag(a1, . . . , an)) = ai − aj, we see that Lαij = {x ∈ L | [hx] = αij(h)x for all h ∈ H}, for i ≠ j, is nonzero, as it contains Ceij. In fact, counting dimensions: the complement of H in the Cartan decomposition is spanned as a vector space by the n(n − 1) vectors eij for i ≠ j, and on the other hand it contains the n(n − 1) independent nonzero spaces Lαij, so we must have

L = H ⊕ (⊕i≠j Lαij),

and clearly the αij are the roots of L = sl(n, C). Either from theorems 14 and 16, or simply from the dimension count again, since dim H = n − 1 (codimension 1 in Cn) and dim(⊕i≠j Lαij) ≤ dim L − dim H = n² − 1 − (n − 1) = n(n − 1), we must have dim Lαij = 1, so Lαij = Ceij.

For the sake of simplicity, define the functionals li on the diagonal matrices by li(diag(b1, . . . , bn)) = bi; then we see αij = li − lj, and so we express the roots as pairwise differences of the n functionals li. On the space H, since the sum of the diagonal entries is 0, these n linear functionals are not independent: we have l1 + ··· + ln = 0. This is the only relation they need to satisfy, so we can picture them in an (n − 1)-dimensional space. In order to draw a realistic picture we need to determine the inner product, i.e. the Killing form on sl(n, C). We will determine it from the definition κ(x, y) = Tr(ad x ad y). Since κ is bilinear, it suffices to find the values κ(eii, ejj) and extend over H by linearity. We have κ(eii, ejj) = Tr(ad eii ad ejj), and for any basis element epr of gln we compute ad eii ad ejj(epr) = [eii, [ejj, epr]] = [eii, (δpj − δrj)epr] = (δpj − δrj)(δpi − δri)epr. So the epr are eigenvectors, and by dimension count they account for all eigenvalues, so

Tr(ad eii ad ejj) = Σp,r (δpj − δrj)(δpi − δri) = −2 for i ≠ j, (8)
Tr(ad eii ad eii) = Σp,r (δpi − δri)² = 2(n − 1) for i = j. (9)

Now let h = diag(a1, . . . , an) and g = diag(b1, . . . , bn) with h, g ∈ H, i.e. Σ ai = 0 and Σ bi = 0. Then κ(h, g) = Σi,j aibj κ(eii, ejj) = Σi≠j (−2)aibj + Σi 2(n − 1)aibi = Σi,j (−2)aibj + Σi 2naibi = −2(Σi ai)(Σj bj) + 2n Σi aibi = 2n Σi aibi.

Having figured out κ, we need to find the inner product on H∗ given by (α, β) = κ(tα, tβ), and for that we need to find the tα. For h = diag(a1, . . . , an) ∈ H we have li(h) = ai, and if tli = diag(t1, . . . , tn), we must have by the definition of tα that ai = li(h) = κ(tli, h) = 2n Σk tkak for all a1, . . . , an with sum 0. These conditions determine tli ∈ H uniquely as tli = (1/2n)(eii − (1/n)I); since a multiple of the identity pairs to 0 with H, we may compute with the representative (1/2n)eii. Then the inner product on H∗ is determined by (li, lj) = κ(tli, tlj) = (1/2n)δij, valid modulo the relation l1 + ··· + ln = 0 (i.e. on coefficient vectors summing to 0). In particular the roots αij = li − lj have inner products (αij, αkl) = (1/2n)(δik − δil − δjk + δjl). And in general the inner product on H∗ = C{l1, . . . , ln}/(l1 + ··· + ln) = {a1l1 + ··· + anln | a1 + ··· + an = 0} is given by

(Σ aili, Σ bili) = (1/2n) Σ aibi.
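As a sanity check of this computation, the following sketch (plain Python with numpy; the choice of basis and the coords helper are ours) evaluates κ directly from the definition Tr(ad x ad y) on sl(3, C) and compares with the formula κ(h, g) = 2n Σ aibi just derived.

import numpy as np
from itertools import product

n = 3
def E(i, j):
    m = np.zeros((n, n)); m[i, j] = 1.0; return m

# a basis of sl(n): the off-diagonal e_ij plus h_k = e_kk - e_(k+1)(k+1)
basis = [E(i, j) for i, j in product(range(n), repeat=2) if i != j] \
      + [E(k, k) - E(k + 1, k + 1) for k in range(n - 1)]

def coords(m):
    # coordinates of a traceless matrix in the basis above: the off-diagonal
    # entries, and partial sums of the diagonal for the h_k components
    off = [m[i, j] for i, j in product(range(n), repeat=2) if i != j]
    return np.array(off + list(np.cumsum(np.diag(m))[:-1]))

def ad_matrix(a):
    return np.column_stack([coords(a @ e - e @ a) for e in basis])

kappa = lambda a, b: np.trace(ad_matrix(a) @ ad_matrix(b))

h = np.diag([1.0, 2.0, -3.0]); g = np.diag([2.0, -1.0, -1.0])   # both trace 0
assert np.isclose(kappa(h, g), 2 * n * np.sum(np.diag(h) * np.diag(g)))  # = 18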

We see that our inner product makes the li orthogonal. If we want to picture the root lattice, we can consider the space Cn with unit vectors given by the li (e.g. the usual coordinates ei = (0, . . . , 1, . . . , 0) with 1 in position i); then H∗ is the hyperplane given by Σ xi = 0, and the roots αij = li − lj are the vertices of a certain kind of polytope (for n = 3 a planar hexagon, for example), called, very creatively, the root polytope. The root lattice then will simply be

ΛR = {Σi aili | ai ∈ Z, Σ ai = 0}/(Σ li = 0),

where the quotient is just to emphasize that we are in H∗, i.e. the hyperplane defined by Σ li = 0.

Now we will look at the weight lattice ΛW. It is the set of linear functionals β ∈ H∗ which assume integer values on the hαij, so we need to find the hαij. We already showed that hα = 2tα/κ(tα, tα) = 2tα/α(tα), and the same computation as for the tli gives tαij = (1/2n)(eii − ejj); so αij(tαij) = (1/2n)(1 − (−1)) = 1/n, and hαij = 2tαij/αij(tαij) = n·(1/n)(eii − ejj)·... = eii − ejj. Then if β = Σ bili ∈ H∗ is a weight, we must have β(hαij) = β(eii − ejj) = bi − bj ∈ Z, so all coefficients are congruent to one another modulo Z; since the relation l1 + ··· + ln = 0 lets us shift all the bi by a common constant, we may assume they lie in Z, so

ΛW = Z{l1, . . . , ln}/(Σ li = 0).

We can now determine the Weyl group as well. It is given by the reflections Wαij(β) = β − β(hαij)αij = Σ bklk − (bi − bj)(li − lj) = Σ bklk + (bj − bi)li + (bi − bj)lj = b1l1 + ··· + bjli + ··· + bilj + ···, i.e. Wαij exchanges li with lj, fixing everything else. So we have W ≅ Sn, the symmetric group on the n elements li.

We can now pick a direction, divide the roots into positive and negative, and find a Weyl chamber. A linear functional φ : H∗ → R is determined by its values ci = φ(li) (subject to Σ ci = 0, so that φ respects the relation Σ li = 0), and then φ(Σ bili) = Σ bici. On the roots we have φ(αij) = ci − cj, which suggests the natural choice c1 > c2 > ··· > cn; then φ(αij) = ci − cj > 0 iff i < j. So R+ = {li − lj | i < j} and R− = {li − lj | i > j}. We can easily determine the simple roots as well: for any i < j we have li − lj = (li − li+1) + (li+1 − li+2) + ··· + (lj−1 − lj), a sum of positive roots if i + 1 < j, so the simple (primitive) positive roots must be among the li − li+1, and it is easy to see that these cannot be expressed as sums of other positive roots.

The Weyl chamber associated to this ordering is given by the β ∈ H∗ such that β(hγ) ≥ 0 for every γ ∈ R+: if β = Σ bili, then β(hαij) = bi − bj ≥ 0 for all i < j, i.e. the Weyl chamber is W = {Σ bili | b1 ≥ b2 ≥ ··· ≥ bn}. So we can see how choosing a different order gives a disjoint and geometrically congruent Weyl chamber, as we expect from the action of W. If we let ai = bi − bi+1 for i < n and an = bn, we see that β ∈ W is equivalent to ai ≥ 0 for i = 1, . . . , n − 1. Also bi = Σj≥i aj, and so β = Σi (Σj≥i aj)li = Σj aj(Σi≤j li); in particular the Weyl chamber is the cone between the rays spanned by l1, l1 + l2, . . . , l1 + ··· + ln−1. The faces of W always lie on the hyperplanes Ωα, in this case Ωαij = {Σ bili | bi = bj}, perpendicular to the roots αij; but given our ordering b1 ≥ ··· ≥ bn, on Ωαij ∩ W we must have bi = bi+1 = ··· = bj. This means that the codimension 1 faces of W are the Ωαi,i+1 = {Σ bili | bi = bi+1}, the ones perpendicular to the simple roots.

We see here explicitly what the fundamental weights are, namely ωi = l1 + ··· + li, the first weights along the edges of the Weyl chamber. The irreducible representations then correspond to the (n − 1)-tuples of nonnegative integers (a1, . . . , an−1), via the dominant weight α = Σ aiωi = (a1 + ··· + an−1)l1 + (a2 + ··· + an−1)l2 + ··· + an−1ln−1, giving Γa1,...,an−1.
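For n = 3 all of this is small enough to write out explicitly: the roots are ±(l1 − l2), ±(l2 − l3), ±(l1 − l3), the vertices of the hexagon mentioned above; the simple positive roots are l1 − l2 and l2 − l3; the fundamental weights are ω1 = l1 and ω2 = l1 + l2; and the Weyl group S3 permutes l1, l2, l3. As the next section shows, Γ1,0 is the standard representation C³ and Γ0,1 is Λ²C³.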

5.2 Representations of sl(n, C)

We saw that α = (a1 + ··· + an−1)l1 + ··· + an−1ln−1 is in the Weyl chamber W, and for ai ∈ N it is clearly also in ΛW, so it is a dominant weight. We want to show that the representation Γα corresponding to it does in fact exist. Let V = Cn be the standard representation of sl(n, C). What are its weights? For h = diag(b1, . . . , bn) with Σ bi = 0, the eigenvalues of h as a matrix are simply its diagonal entries b1, . . . , bn. So if the ei are the usual basis vectors of V, then h.ei = biei = li(h)ei, so ei ∈ Vli, the weight space for weight li, and clearly V = ⊕i Vli. This shows that the weights of this representation are the li. It is irreducible: if U ⊂ V is a nonzero invariant subspace and 0 ≠ v = Σ vjej ∈ U has vj ≠ 0, then eij.v = vjei ∈ U for every i ≠ j, so U contains all the basis vectors and U = V.

Now consider Λ^i V, the ith exterior power of V. It is spanned by elements v1 ∧ ··· ∧ vi, and the action of an element u ∈ L is induced by the usual action on a tensor product, u(v1 ⊗ ··· ⊗ vi) = Σj v1 ⊗ ··· ⊗ u(vj) ⊗ ··· ⊗ vi, with ⊗ replaced by ∧. Now Λ^i V has a basis induced from the tensor product basis,

indexed by increasing i-tuples of basis vectors: ej1,...,ji = ej1 ∧ ··· ∧ eji with j1 < j2 < ··· < ji. Consider again the action of h = diag(b1, . . . , bn) on a basis vector: h(ej1,...,ji) = Σk ej1 ∧ ··· ∧ (bjk ejk) ∧ ··· ∧ eji = (Σk bjk) ej1,...,ji. So we see that h(ej1,...,ji) = (lj1 + ··· + lji)(h) ej1,...,ji, and every basis vector of Λ^i V spans a one-dimensional weight space Vlj1+···+lji. The weights of this representation are given by all possible sums of i distinct lj's, ranging from l1 + ··· + li to ln−i+1 + ··· + ln. In particular, we see that every weight is in the image of the Weyl group acting on l1 + ··· + li: if lj1 + ··· + lji is a weight, take a permutation σ ∈ Sn with σ(k) = jk for k = 1, . . . , i (extended arbitrarily to the remaining indices); then, since Sn ≅ W acts by permuting the li, we have σ(l1 + ··· + li) = lj1 + ··· + lji. So, since we saw that l1 + ··· + li ∈ W, the Weyl chamber, we conclude that Λ^i V = Γl1+···+li, as follows. First of all, l1 + ··· + li is certainly the highest weight of Λ^i V, and if Λ^i V had a proper subrepresentation,

then one of the irreducible summands would have to account for the highest weight and so be Γl1+···+li. But then Γl1+···+li would also account for all the weights that are images of l1 + ··· + li under W, i.e. for all the weights of Λ^i V, counting multiplicities (these are all 1), and so it must be equal to Λ^i V.

By theorem 20 we must have an irreducible representation Γα for every α = (a1 + ··· + an−1)l1 + ··· + an−1ln−1 with ai ∈ N. If we rewrite α as α = a1l1 + a2(l1 + l2) + ··· + an−1(l1 + ··· + ln−1), this suggests where to look for the desired representation, in view of the preceding paragraph. Namely, since Λ^i V has highest weight l1 + ··· + li with a highest weight vector v, the symmetric power Sym^k(Λ^i V) should have highest weight k(l1 + ··· + li), with highest weight vector v · v ··· v (k factors). We will show this explicitly. For I = {j1, . . . , ji}, let eI = ej1 ∧ ··· ∧ eji and lI = Σp∈I lp; then Sym^k(Λ^i V) is spanned by the commutative products eI1 · ··· · eIk, and again by the action of a Lie algebra on tensor products we have, for h ∈ H, h(eI1 ··· eIk) = Σr=1,...,k eI1 ··· h(eIr) ··· eIk = (Σr lIr)(h)·eI1 ··· eIk. So each of the elements eI1 ··· eIk is a weight vector for Sym^k(Λ^i V), of weight Σr lIr = Σr Σp∈Ir lp, and we readily see from the arrangement of the weights (li > lj for i < j) that the maximal weight is k·(l1 + ··· + li), obtained by maximizing each Ir to be {1, . . . , i} (keeping in mind that each Ir consists of i distinct indices). Now since Sym^k(Λ^i V) has maximal weight k(l1 + ··· + li), it must also contain the unique irreducible representation Γk(l1+···+li) (i.e. it must contain an irreducible subrepresentation generated by a highest weight vector for this highest weight, and since there is only one irreducible representation of a given highest weight, it must be our Γ).

Now the last step is to show that the representation S = Sym^a1 V ⊗ Sym^a2(Λ² V) ⊗ ··· ⊗ Sym^an−1(Λ^(n−1) V) has highest weight α = a1l1 + a2(l1 + l2) + ··· + an−1(l1 + ··· + ln−1); by the discussion at the end of the preceding paragraph, it then necessarily contains the irreducible representation Γα. But this actually follows by similar reasoning as in the previous paragraph. Our representation S is spanned by the tensor products v1 ⊗ ··· ⊗ vn−1, where vi runs through the weight vectors of Sym^ai(Λ^i V) as in the preceding paragraph. By it we have h(vi) = mi(h)vi, where if vi = eI1 ··· eIai then mi = Σr=1,...,ai lIr. So h(v1 ⊗ ··· ⊗ vn−1) = Σp v1 ⊗ ··· ⊗ h(vp) ⊗ ··· ⊗ vn−1 = (Σp mp)(h)(v1 ⊗ ··· ⊗ vn−1), and the weights of S must be exactly these sums Σp=1,...,n−1 mp. In order to maximize this sum, since the mp are independent, we maximize each mp, which by the preceding paragraph gives exactly ap(l1 + ··· + lp); so at the end the maximal weight is Σp ap(l1 + ··· + lp) = α, which is what we needed to show. Since the α run through all possible highest weights, we have exhausted all possible irreducible representations, and so we have proven the following theorem.

Theorem 21. The irreducible representations of sl(n, C) are the Γa1,a2,...,an−1 of highest weight (a1 + ··· + an−1)l1 + ··· + an−1ln−1, and Γa1,...,an−1 appears as a subrepresentation of Sym^a1 V ⊗ Sym^a2(Λ² V) ⊗ ··· ⊗ Sym^an−1(Λ^(n−1) V).
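To illustrate the building blocks of this theorem in the smallest interesting case, here is a minimal sketch (plain Python with numpy; the realization of Λ²C³ inside C³ ⊗ C³ is our encoding) checking that the weights of Λ²C³ are the li + lj, with highest weight l1 + l2 = ω2, as derived above.

import numpy as np
from itertools import combinations

n = 3
a = np.array([1.0, 2.0, -3.0])      # h = diag(a) in H, trace 0, l_i(h) = a[i-1]
h = np.diag(a)
I = np.eye(n)
# the Lie algebra acts on V (x) V by h (x) 1 + 1 (x) h
H2 = np.kron(h, I) + np.kron(I, h)

# e_i ^ e_j realized as e_i (x) e_j - e_j (x) e_i inside V (x) V
for i, j in combinations(range(n), 2):
    v = np.zeros(n * n)
    v[i * n + j] = 1.0
    v[j * n + i] = -1.0
    assert np.allclose(H2 @ v, (a[i] + a[j]) * v)   # weight l_(i+1) + l_(j+1)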

5.3 Weyl’s construction, tensor products and some combinatorics

Our goal here is to exhibit the irreducible representations Γa1,...,an−1 as explicitly as possible, as subspaces of the dth tensor power V⊗d, where d = Σi i·ai is the degree for which Sym^a1 V ⊗ Sym^a2(Λ² V) ⊗ ··· ⊗ Sym^an−1(Λ^(n−1) V) ⊂ V⊗d.

Observe that since sl(n, C) acts on V⊗d factorwise and Sd acts on V⊗d by permuting the factors, the two actions commute. Moreover, we know from both [1] and [3] what the representations of Sd are; in particular

they are nicely indexed by partitions λ ⊢ d, λ1 ≥ λ2 ≥ ··· ≥ λn, and in fact so are the Γa1,...,an−1, if we write the partition of d as (a1 + ··· + an−1, a2 + ··· + an−1, . . . , an−1, 0), in bijection with the sequence of nonnegative integers (a1, . . . , an−1). We can then use the representations of Sd to study the representations of sl(n, C), and for that we will digress a little bit to review the representations of Sd.

A standard way of studying the representations of Sd, as shown in [1], is the use of Young symmetrizers. Consider the group algebra CSd and let T be a Young tableau of shape λ, where λ ⊢ d. Then let Pλ = {g ∈ Sd : g preserves each row of T} and Qλ = {g ∈ Sd : g preserves each column of T}. The choice of T does not matter as long as we are consistent (use the same one throughout), as the resulting sets are conjugate. Now let aλ = Σg∈Pλ eg and bλ = Σg∈Qλ sgn(g)·eg, where eg is the element of the group algebra corresponding to g. Then define the Young symmetrizer cλ = aλ·bλ ∈ CSd. The main theorem in the representation theory of Sd states that cλ² = nλcλ for some nλ ∈ N (in fact nλ = d!/dim Vλ), that the image of cλ under right multiplication on CSd is an irreducible representation Vλ, and that every irreducible representation is obtained in this way.

Going back to V⊗d, we have the action of Sd on it given by (v1 ⊗ ··· ⊗ vd).σ = vσ(1) ⊗ ··· ⊗ vσ(d) for σ ∈ Sd, and so we can consider the image of cλ on V⊗d and denote it SλV = Im(cλ|V⊗d) = V⊗d.cλ = V⊗d ⊗CSd Vλ. This will be a subrepresentation of V⊗d for GLn and hence for SLn, and taking the differential of the representation map we obtain a corresponding irreducible representation of sl(n, C).
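The Young symmetrizer is easy to check by hand in the first nontrivial case λ = (2, 1) ⊢ 3. The following sketch (plain Python; the specific tableau and the dictionary encoding of CS3 are our choices) builds aλ, bλ and cλ = aλbλ and verifies cλ² = nλcλ with nλ = 3, consistent with nλ = d!/dim Vλ for the two dimensional Vλ.

from itertools import permutations

d = 3
def compose(s, t):                      # (s*t)(i) = s(t(i))
    return tuple(s[t[i]] for i in range(d))

def sign(s):                            # sign via inversion count
    inv = sum(1 for i in range(d) for j in range(i + 1, d) if s[i] > s[j])
    return -1 if inv % 2 else 1

# tableau of shape (2,1): cells 0,1 in the first row, cells 0,2 in the first column
row_group = [s for s in permutations(range(d)) if s[2] == 2]   # permutes {0,1}
col_group = [s for s in permutations(range(d)) if s[1] == 1]   # permutes {0,2}

a = {s: 1 for s in row_group}                       # a_lambda
b = {s: sign(s) for s in col_group}                 # b_lambda

def mult(u, v):                          # product in the group algebra CS_d
    w = {}
    for s, cs in u.items():
        for t, ct in v.items():
            st = compose(s, t)
            w[st] = w.get(st, 0) + cs * ct
    return w

c = mult(a, b)                           # c_lambda = a_lambda . b_lambda
c2 = mult(c, c)
assert all(c2.get(s, 0) == 3 * c.get(s, 0) for s in set(c2) | set(c))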

Theorem 22. The representation Sλ(Cn) is the irreducible representation of sl(n, C) with highest weight λ1l1 + ··· + λnln. It is in fact the representation Γa1,...,an−1 given by ai = λi − λi+1.

Proof. We need to see how a representation of SLn relates to a corresponding representation of sl(n, C). There is no natural map from a Lie group to its Lie algebra, but there is certainly one from the Lie algebra to the Lie group: the exponential. If ρ : SL(V) → SL(W) is a representation, the induced representation is dρ : sl(V) → sl(W), and the following diagram commutes:

             dρ
  sl(V) ----------> sl(W)
    |                 |
    | exp             | exp          (10)
    v                 v
  SL(V) ----------> SL(W)
              ρ

Now if we assume for a moment some of the representation theory of GLn, which will be shown later in appendix A, we can use a theorem from the theory of Schur functors giving us a formula for the trace of 0 A when V = SλV . The theorem states that for any A ∈ GL(V ), the trace of A on SλV , which is now a representation of GL(V ), is the value of the Schur polynomials (see ??) on the eigenvalues x1, . . . , xn of A on V , i.e. T r| V (A) = sλ(x1, . . . , xn). Sλ P We know from ?? that sλ = mλ + µ<λ Kλµmµ, where mµ are the monomial symmetric functions, i.e. m = P xµ1 xµ2 . . . xµn , and K are the Kostka coefficients, whose combinatorial represen- µ i1,i2,...,in distinct i1 i2 in λµ tation is given by the number of semi-standard Young tableaux of shape λ and type µ. So for any diagonal matrix A ∈ GL(V ), which we can always write as A = diag(ey1 , . . . , eyn ), we have that

y1 yn y1 yn X y1 yn T r(A) = sλ(e , . . . , e ) = mλ(e , . . . , e ) + mµ(e , . . . , e ). (12) µ<λ

26 Now the last equation (12) can be expanded as a sum of monomials in ey1 , . . . , eyn , but so is the formula (11). As these must be equal for all y1, . . . , yn, we must have that the monomials coincide, so the weights β ∈ W

must be of the kind µi1 l1 + ··· + µin ln for some permutation i1, . . . , in of (µ1, . . . , µn) for some µ ≤ λ and must occur with multiplicity Kλµ. In particular we have that the weight λ1l1 + ··· + λnln has multiplicity exactly one and is highest among all appearing weights10. By the uniqueness of irreducible representations n with given highest weight it follows that the representation Sλ(C ) is the irreducible representation of highest weight λ1l1 + ··· + λnln. As a consequence to this theorem we are enabled thanks to combinatorics rules to determine the decom- position of any tensor power into irreducible representations. Specifically, if V1,...,Vk are representations,

then the trace of a matrix A ∈ GLn on V1 ⊗ V2 ⊗ ··· ⊗ Vk is Tr|V1⊗···⊗Vk(A) = Tr|V1(A) ··· Tr|Vk(A). We now know that the traces are polynomials (necessarily symmetric!) in the eigenvalues, and that the traces of the irreducible representations are the Schur polynomials. So if f(x1, . . . , xn) is the trace of our representation of interest and we write f(x1, . . . , xn) = Σλ cλsλ(x1, . . . , xn), then cλ is the multiplicity of the irreducible representation Sλ(Cn) in it.

An important example of this is the decomposition of Sμ(Cn) ⊗ Sν(Cn) into irreducible representations. The trace polynomial will be sμ(x1, . . . , xn)sν(x1, . . . , xn), and we know, e.g. by [3], that sμsν = Σλ c^λ_μν sλ, where the coefficients c^λ_μν are called the Littlewood-Richardson coefficients; since they are multiplicities of irreducible representations, they are necessarily nonnegative integers. They can also be computed by the Littlewood-Richardson rule, which says that c^λ_μν is the number of skew SSYT of shape λ/μ and type ν such that their reverse reading word is a lattice permutation.
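The simplest instance of this: for μ = ν = (1) we have s(1)·s(1) = s(2) + s(1,1), with both Littlewood-Richardson coefficients equal to 1, which is the familiar decomposition Cn ⊗ Cn = Sym²Cn ⊕ Λ²Cn of GLn-representations.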

5.4 Representations of GLn(C)

We certainly have that the SλCn are irreducible representations of GLn, as shown in the appendix; the question is whether these are all the irreducible representations. The commutative diagram (10), which comes from the theory we developed about maps between Lie groups and their Lie algebras, shows that when the Lie group is simply connected there is a one-to-one correspondence between the irreducible representations of the Lie group and those of its Lie algebra. One can show this by arguing as follows: if V1 ⊂ V is a subrepresentation for the group (resp. the algebra), then the same is true for the algebra (resp. the group), by the commutativity of exp and the representation maps. Since SLn(C) is simply connected, we have such a one-to-one correspondence with the irreducible represen-

tations Γa1,...,an−1 of sl(n, C). GLn, however, is not simply connected, and gln is not even semisimple, so we will try to build its irreducible representations from those of the special groups.

Denote V = Cn, and let Dk be the one-dimensional representation of GLnC given by the kth power of the determinant, so for k ≥ 0 we have Dk = (Λ^n V)⊗k. Given a representation of SLn, we can lift it to GLn by tensoring with the Dk. If for any a = (a1, . . . , an) we denote by Φa the subrepresentation of Sym^a1 V ⊗ ··· ⊗ Sym^an(Λ^n V) generated by v = (e1)^a1 · (e1 ∧ e2)^a2 ··· (e1 ∧ ··· ∧ en)^an, we see that passing to the Lie algebra it corresponds to the irrep of highest weight (a1 + ··· + an)l1 + (a2 + ··· + an)l2 + ··· + anln, and in particular restricting to SLn gives Γ(a1,...,an−1). Or we can set λ = (a1 + ··· + an, a2 + ··· + an, . . . , an) and consider SλV as a representation Ψλ of GLn. We see that in the first description Φa1,...,an+k = Φa1,...,an ⊗ Dk, and in the second Ψλ1+k,...,λn+k = Ψλ ⊗ Dk. Moreover, these two representations are actually isomorphic: their restrictions to SLn agree, and the center C∗ ⊂ GLn acts in both by the character z → z^(Σλi), so they agree everywhere. We will show finally that these are all the irreducible representations of GLn.

Theorem 23. Every irreducible complex representation of GLnC is isomorphic to Ψλ for a unique index λ = (λ1, . . . , λn) ∈ Zn with λ1 ≥ λ2 ≥ ··· ≥ λn.

Proof. Consider the corresponding Lie algebras. We have glnC = slnC × C, where C is the ideal formed by the matrices a·I with a ∈ C and I the identity matrix. In particular, C is the radical of gln and sln is the semisimple part. The following reasoning shows that every irreducible representation of gln is a tensor product of an irreducible representation of sln and a one-dimensional representation. Let U be an irreducible representation of gln. Since Rad(gln) = C is solvable and its image lies in gl(U), by Lie's theorem U contains a common eigenvector for all elements of Rad(gln), so there is a linear functional α such that W = {w ∈ U | x.w = α(x)w for every x ∈ Rad(gln)} is nonzero. But every element of gln can be written as x + y with x ∈ Rad(gln) and y ∈ sln, and for w ∈ W we have x.y.w = y.x.w + [x, y].w = α(x)(y.w), since in our case Rad(gln) commutes with every element, so [x, y] = 0. Then we must have y.w ∈ W, so gln maps W into W; hence W is a subrepresentation of U and must be all of U. Now extend α to gln (by zero on sln) and let D be the one-dimensional representation given by α, i.e. for every z ∈ D and x ∈ gln we have x.z = α(x)z; then U ⊗ D∗ is trivial on Rad(gln) and so is a (necessarily irreducible) representation of sln.

So if Wλ = Sλ(Cn) is the irreducible representation of sln(C) given by the partition λ1, . . . , λn, we can extend it to gln = sln × C by letting the second factor act trivially. By the above considerations, every irreducible representation of gln is isomorphic to some Wλ ⊗ L(w), where L(w) is the one-dimensional representation on which the second factor of slnC × C acts by multiplication by w ∈ C.

Passing back to Lie groups, the same will be true for the simply connected SLnC × C (the simple connectedness we take for granted here). We now need to pass from SLnC × C to GLnC, and for that there is a natural map ρ : SLnC × C → GLnC given by ρ(g, z) = e^z·g. The kernel of this map is cyclic, generated by (e^s I, −s) with s = 2πi/n (for e^s I ∈ SLnC we need e^(ns) = 1), so GLnC = (SLnC × C)/ ker ρ. Now an irreducible representation of GLnC can be lifted to an irreducible representation of SLn × C which is trivial on ker ρ, and this actually defines a one-to-one correspondence of irreducible representations; so we need to find which Wλ ⊗ L(w) are trivial on ker ρ. The scalar matrix e^s I acts on Sλ by multiplication by e^(sd), where d = Σ λi, since that is how it acts on V⊗d; −s acts on L(w) by multiplication by e^(−sw). So the generator of ker ρ acts by multiplication by e^(s(d−w)), which is trivial iff s(d − w) ∈ 2πiZ, and since s = 2πi/n, this is equivalent to w = Σ λi + kn for some k ∈ Z.

Then we have that the irreducible representations are the Wλ ⊗ L(Σ λi + kn); such a representation descends to GLn as Ψλ1+k,...,λn+k, and the pullback of Ψλ1+k,...,λn+k is again Wλ ⊗ L(Σ λi + kn). So we have exhausted all irreducible representations of GLnC, and we are done.

5.5 Some explicit construction, generators of SλV

For a sequence of nonnegative integers a = (a1, . . . , ak), let A^a = Sym^a1 V ⊗ ··· ⊗ Sym^ak(Λ^k V); as we saw, the irreducible representation Γa ⊂ A^a. Here we will describe A^a and exhibit a basis for Sλ. A way to describe A^a is to look at the right action of the symmetric group on V⊗d and note that on one hand we have

Sλ(V) = V⊗d·cλ ⊂ A^a ⊂ (Λ^k V)^⊗ak ⊗ ··· ⊗ V^⊗a1 = V⊗d·bλ.

These inclusions suggest that we can probably find an a′ ∈ CSd such that A^a = V⊗d·a′·bλ. Finding this a′ is not that hard at all if we note that (k, . . . , k, k − 1, . . . , k − 1, . . . , 1, . . . , 1), with each i repeated ai times, is in fact the conjugate partition of λ, denoted λ′, i.e. the list of the lengths of the columns of λ. The wedge powers Λ^l V are just V^⊗l·b(1^l), where b(1^l) = Σσ sgn(σ)σ, the sum over the σ permuting a column of length l (i.e. bμ for μ a single column of length l). On the other hand, Sym^ai(Λ^i V) amounts to taking the ai columns of length i and symmetrizing them, i.e. permuting them among each other in all possible ways: it is (Λ^i V)^⊗ai·a(ai), where a(ai) is the aμ for μ = (ai) a single row (don't confuse the two different kinds of a's!). So multiplying over the columns of the different lengths gives A^a = ((V^⊗k·b(1^k))^⊗ak·a(ak)) ⊗ ··· ⊗ ((V^⊗a1)·a(a1)) = V⊗d·a′·bλ, where a′ = Σr er, the sum over the permutations r which permute the columns of the same length among each other, i.e. over the group Sak × ··· × Sa1. Note also that a′ and bλ commute, since conjugating by a permutation of whole columns of equal length preserves the column group Qλ together with the signs. It is also clear that aλ = a′·a″, where a″ = Σp ep, the sum over coset representatives p ∈ (Sλ1 × ··· × Sλk)/(Sak × ··· × Sa1).

Set A•(V) = ⊕a A^a(V) = Sym•(V ⊕ Λ²V ⊕ ··· ⊕ Λ^nV), which makes a graded ring out of the A^a(V)'s. Then we can define a graded ring S• as the quotient of A•(V) by the two-sided ideal I• generated, for all p ≥ q ≥ r ≥ 1, by all elements of the form (the "Plucker relations")

(v1 ∧ ··· ∧ vp)·(w1 ∧ ··· ∧ wq) − Σ (v1 ∧ ··· ∧ w1 ∧ ··· ∧ wr ∧ ··· ∧ vp)·(vi1 ∧ vi2 ∧ ··· ∧ vir ∧ wr+1 ∧ ··· ∧ wq),

where the sum is over all 1 ≤ i1 < i2 < ··· < ir ≤ p, and w1, . . . , wr are inserted in the places of vi1, . . . , vir.11

We are now going to exhibit generators of S• coming from A•. Let e1, . . . , en be a basis for V and let T be a semistandard Young tableau of shape λ filled with the integers 1, . . . , n, i.e. the rows are nondecreasing and the columns strictly increasing. Let T(i, j) denote the entry of T in row i and column j, and define

eT = Πj e_T(1,j) ∧ e_T(2,j) ∧ ··· ∧ e_T(λ′j, j),

the product running over the columns j of λ; i.e. we wedge together the entries in each column and multiply the results in S•. We are going to prove the following theorem.

Theorem 24. The eT for T a semistandard tableau of shape λ form a basis for S^a(V). The projection from A^a(V) to S^a(V) maps the subspace Sλ isomorphically onto S^a(V).

Proof. We will first show that the elements eT generate S^a. To see that the eT span S^a, note that clearly S^a is spanned by the eT for all tableaux T with strictly increasing columns (not just semistandard ones), as these span A^a itself. Introduce a reverse lexicographic ordering on the T's by comparing the listings of their entries column by column, listing within each column from top to bottom, and comparing starting from the last entry and going backwards until we find two differing entries. Then, for example, the tableau T0 with T0(i, j) = i for all i, j is the smallest one. Suppose now that a tableau T is not semistandard, i.e. there are two adjacent columns, indexed j and j + 1, for which there is a row r with T(r, j) > T(r, j + 1); let r be the smallest such. Let then v1, . . . , vp be the column of index j and w1, . . . , wq the one of index j + 1 (i.e. vi = T(i, j), wi = T(i, j + 1)). For the r we just picked, the Plucker relations in I• give (v1 ∧ ··· ∧ vp)·(w1 ∧ ··· ∧ wq) = Σ (v1 ∧ ··· ∧ w1 ∧ ··· ∧ wr ∧ ··· ∧ vp)·(vi1 ∧ ··· ∧ vir ∧ wr+1 ∧ ··· ∧ wq). Multiplying by the remaining columns of T, we get eT = Σ eT′, where the T′ are obtained from T by interchanging the first r elements of column j + 1 with some r elements of column j. We see then that the listing of the entries of T′ is the same as that of T, starting from the back, until we encounter position (r, j + 1), where the entry of T′ is vir = T(ir, j) ≥ T(r, j) > T(r, j + 1); so in the reverse lexicographic ordering T′ > T. In this way, by the Plucker relations, eT = Σ eT′ with T′ > T whenever T is not semistandard. Continuing this process iteratively, we can express eT for every non-semistandard tableau T as a sum of eT′ with strictly larger T′, only ever increasing in the ordering. Since there are finitely many tableaux of shape λ with entries among 1, . . . , n, we must eventually reach an expression in which every tableau is semistandard (otherwise we could keep expanding). This shows that the eT for T semistandard generate S^a(V).

We need to show now that they are linearly independent. We will show this indirectly, by showing that Sλ(V) embeds in S^a. If the projection of Sλ into S^a is not zero, then their intersection is not 0, and since both Sλ and S^a are representations of GL(V) with Sλ irreducible, Sλ must be a submodule of S^a. For every representation its dimension is the trace (character) at the identity of the group, so in our particular case dim Sλ = Tr(I) = sλ(1, . . . , 1) (from appendix A). On the other hand, from the combinatorial interpretation of Schur functions ([3]) we have sλ(x1, . . . , xn) = Σ_{T : shape(T)=λ} x^T, so

11 • ⊗d 00 0 We have that S is in fact the graded algebra of the Sλs. A way to see this is by noticing that Sλ = (V a )a .bλ a is equivalent to averaging the elements of A which are obtained by one another via a coset representative p ∈ (Sλk × · · · × Sλ1 )/(Sak ×· · ·×Sa1 ). In particular the given relations restrict to the case of only two columns, represented by (v1 ∧· · ·∧vp) and (w1 ∧· · ·∧wq), in which we average out over all r over all possible picks of rows i1 < ··· < ir, for which we exchange the elements

in rows i1, . . . , ir to get the first column to be (v1 ∧· · ·∧wi1 ∧· · ·∧wir ∧. . . vp) and the second (w1 ∧. . . vi1 ∧· · ·∧vir ∧· · ·∧wq). In the original relations we move elements from positions (i1, . . . , ir) to (1, . . . , r) two times, i.e. conjugate by a permutation preserving columns, and since we apply it twice we don’t change the sign (think of permutations appearing in bλ). In any case • • • we have that A /I = S .

29 P sλ(1,..., 1) = T |shapte(T )=λ 1 is the number of semistandard tableaux T of shape λ filled with the numbers a 1, 2, . . . , n. But this is the same number of possible generators eT for S - the semistandard tableaux T of a shape λ and filled with 1, . . . , n, so counting dimensions we get #{eT | generating S , i.e. T -semistandard} ≥ a dim S ≥ dim Sλ = #{T semistandard }, so all inequalities are equalities showing that the eT s are linearly a independent and that S ' Sλ. • So we need to show only that the image of Sλ in S is nonempty. We first of all have that each eT is in fact in Sλ (one can show this by seeing that eT .cλ = ceT , for some scalar c). Next we show that at least one • of them has a nonzero image in S . We will check this for eT0 with T0(i, j) = i for all i, j. Now T0 is definitely the smallest in the reverse lexicographic order and moreover, applying the Plucker relations to any two of

its columns results in the same tableaux (we have that w1 = 1, . . . , wr = r and vi1 = i1, . . . , vir = ir and the wedge v1 ∧ . . . w1 ∧ · · · ∧ vp = 0 for having two equal elements unless we replaced w1, . . . , wr with v1, . . . , vr,

which leads to the trivial relation eT0 − eT0 ), in particular we cannot have it as a summand in any Plucker relation as reversing the exchange of r elements between two columns of T0 can’t result into anything but • a T0. So eT0 6∈ I , showing that its image under the projection is not 0 and so that eT0 ∈ Sλ ∩ S .
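For example, let n = 3 and take a tableau T with two columns of length 2: column j = (2,3) and column j+1 = (1,3). Then T fails to be semistandard in row r = 1, and the corresponding Plücker relation (p = q = 2, r = 1) gives in S^•

(e_2 ∧ e_3).(e_1 ∧ e_3) = (e_1 ∧ e_3).(e_2 ∧ e_3) + (e_2 ∧ e_1).(e_3 ∧ e_3) = (e_1 ∧ e_3).(e_2 ∧ e_3),

since e_3 ∧ e_3 = 0. The single surviving term is e_{T'} for the semistandard tableau T' with columns (1,3) and (2,3), and indeed T' > T in the reverse lexicographic order.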

6 Conclusion

In this paper we developed the basic theory of Lie algebras, classifying them through properties like solvability, nilpotency and simplicity, and proving criteria for establishing such properties. We studied the structure of semisimple Lie algebras via roots and the root space decomposition, and used this to study the irreducible representations of semisimple Lie algebras. In particular we proved a theorem establishing a bijection between irreducible representations and highest weights (i.e. points in the intersection of the Weyl chamber with the weight lattice), which gave us a convenient way of listing all irreducible representations. We showed the theory in practice by studying the structure and representations of sl(n, C). We established connections between Lie groups and Lie algebras, which enabled us in particular to constrain the irreducible representations of the Lie group GL(V) via the irreducible representations of its Lie algebra gl(V), thereby proving that the irreducible representations S_λ are in fact all the irreducible representations of GL(V). We also showed some explicit constructions in the intersection of combinatorics and algebra.

A Schur functors, Weyl’s construction and representations of GLn(C)

Consider the right action of S_d on V^{⊗d}, for any vector space V, given by (v_1 ⊗ ··· ⊗ v_d).σ = v_{σ(1)} ⊗ ··· ⊗ v_{σ(d)}. This action commutes with the left action of GL(V) acting coordinatewise on the tensor product. Let λ be a partition of d and let c_λ be the Young symmetrizer in the group algebra of S_d, defined as c_λ = a_λ b_λ, where a_λ = Σ_{g ∈ P} e_g and b_λ = Σ_{g ∈ Q} sgn(g) e_g; here P is the set of permutations fixing the rows of a Young tableau of shape λ filled with the elements {1, …, d} and Q is the set of permutations fixing the columns of the same tableau. Let S_λV = V^{⊗d}.c_λ be the so-called Schur functor (it is a functor on the category of vector spaces V and linear maps between them). It is clearly a representation of GL(V).
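As an informal numerical illustration of this definition (a sketch with ad hoc helper names such as schur_module_dim, not a construction from [1]; permutations are 0-indexed here), one can build c_λ for the canonical tableau of shape λ, act with it on the basis of V^{⊗d} on the right, and recover dim S_λV as the rank of the resulting matrix.

```python
# Informal sketch: build c_lambda = a_lambda b_lambda for the canonical
# tableau of shape lambda, act with it on (C^n)^{tensor d} on the right,
# and read off dim S_lambda V as the rank of that action.
from itertools import permutations, product
import numpy as np

def block_perms(blocks, d):
    """All permutations of {0,...,d-1} preserving each block setwise."""
    for choice in product(*(permutations(b) for b in blocks)):
        sigma = list(range(d))
        for block, perm in zip(blocks, choice):
            for a, b in zip(block, perm):
                sigma[a] = b
        yield tuple(sigma)

def sign(sigma):
    """Sign of a permutation, computed from its cycle lengths."""
    s, seen = 1, set()
    for i in range(len(sigma)):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = sigma[j]
            length += 1
        s *= (-1) ** (length - 1)
    return s

def schur_module_dim(shape, n):
    d = sum(shape)
    rows, k = [], 0
    for r in shape:                  # canonical tableau: rows filled in order
        rows.append(list(range(k, k + r)))
        k += r
    cols = [[rows[i][j] for i in range(len(shape)) if j < shape[i]]
            for j in range(shape[0])]
    c = {}                           # c_lambda as {permutation: coefficient}
    for p in block_perms(rows, d):
        for q in block_perms(cols, d):
            pq = tuple(p[q[i]] for i in range(d))   # product pq in S_d
            c[pq] = c.get(pq, 0) + sign(q)
    basis = list(product(range(n), repeat=d))
    index = {b: i for i, b in enumerate(basis)}
    M = np.zeros((len(basis), len(basis)))
    for col, b in enumerate(basis):  # right action permutes tensor factors
        for sigma, coeff in c.items():
            M[index[tuple(b[sigma[i]] for i in range(d))], col] += coeff
    return np.linalg.matrix_rank(M)

print(schur_module_dim((2, 1), 2))   # 2 = dim S_{(2,1)} C^2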

We are going to prove the following theorem.

Theorem 25. For any g ∈ GL(V) the trace of g on S_λ(V) is the value of the Schur polynomial s_λ at the eigenvalues x_1, …, x_n of g on V, that is, Tr|_{S_λV}(g) = s_λ(x_1, …, x_n). Each S_λV is an irreducible representation of GL(V).

In order to prove this theorem we first prove some general lemmas. Let G be any finite group (in our case its role will be played by S_d) and let A = CG be its group algebra. For U a right module over A, let B = Hom_G(U, U) = {φ : U → U | φ(v.g) = φ(v).g for all v ∈ U and g ∈ G}. If U decomposes into a direct sum of irreducible modules via U = ⊕_i U_i^{⊕n_i}, then, since by Schur's lemma a map between two irreducible modules is either 0 or scalar multiplication, we have Hom_G(U_i, U_j) = 0 for i ≠ j and Hom_G(U_i^{⊕n_i}, U_i^{⊕n_i}) = M_{n_i}(C), the n_i × n_i matrices. If now W is a left A-module, then the tensor product U ⊗_A W is a left B-module, since for φ ∈ B and a ∈ A we have φ.(u.a ⊗ w) = φ(u.a) ⊗ w = φ(u).a ⊗ w = φ(u) ⊗ a.w.

Lemma 7. Let U be a finite-dimensional A-module. Then
(i) For any c ∈ A the map U ⊗_A Ac → Uc is an isomorphism of left B-modules.
(ii) If W = Ac is an irreducible left A-module, then U ⊗_A W = Uc is an irreducible left B-module.
(iii) If W_i = Ac_i are the distinct irreducible left A-modules, with m_i = dim W_i, then U ≃ ⊕_i (Uc_i)^{⊕m_i} is the decomposition of U into irreducible B-modules.

Proof. First of all, the map U ⊗_A Ac → Uc is clearly surjective, as Uc is the image of U ⊗_A c. So we need to show that it is injective. Now Ac is a submodule of A, and A is a representation of the finite group G, so A is completely reducible; in particular A = Ac ⊕ A' for some submodule A'. So U = U ⊗_A A = U ⊗_A (Ac ⊕ A') = (U ⊗_A Ac) ⊕ (U ⊗_A A'). If the map U ⊗_A Ac → Uc were not injective, then neither would be the composition U ⊗_A Ac → Uc ↪ U, contradicting the fact that U ⊗_A Ac is a direct summand of U. This proves part (i).

For part (ii), suppose first that U is irreducible. From the representation theory of finite groups we know that if the W_i are all the irreducible representations of G, then on one hand Σ_i (dim W_i)² = |G|, and on the other hand, mapping the group algebra into End(⊕_i W_i) = ⊕_i End(W_i) = ⊕_i M_{dim W_i}(C) (the elements of G act as automorphisms of ⊕_i W_i) and counting dimensions, this map must be an isomorphism. So we can view A = ⊕_i M_{n_i}(C), where the n_i are the dimensions of its irreducible representations. Since W = Ac is a left ideal of A that is irreducible as an A-module, it must be a minimal left ideal of A, and hence a minimal left ideal of the matrix ring M_{n_i}(C) for some i. Now a matrix whose only nonzero column has index j, when multiplied on the left by any matrix, results again in a matrix whose only nonzero column has the same index j; and every matrix can be multiplied on the left by a suitable matrix (with only nonzero entry at (j,j)) to produce such a matrix with one nonzero column. Hence a minimal left ideal consists of the matrices whose only nonzero column is some fixed column j. Similarly, since U is irreducible as a right A-module, its component in some M_{dim W_k}(C) consists of matrices with only one nonzero row, say row i. Finally, every element of U ⊗_A W has its part in M_{dim W_i}(C) either equal to 0 (if one of U or W is 0 there) or of the form C ⊗_A D for matrices C, D; writing C = I.C with C ∈ A we get C ⊗_A D = I ⊗ C.D. But for every matrix C with only nonzero row i and D with only nonzero column j, the product C.D is a matrix whose only nonzero entry is at (i,j). This means that U ⊗_A W is one-dimensional and hence necessarily irreducible.

Now suppose U = ⊕_i U_i^{⊕n_i} with the U_i irreducible; then U ⊗_A W = ⊕_i (U_i ⊗_A W)^{⊕n_i}. The U_i, being irreducible, live by the above paragraph in some M_{dim W_j}(C), and for U_i ⊗_A W to be nonzero we need only consider the U_i ⊂ M_{dim W_k}(C) for the k with W ⊂ M_{dim W_k}(C). However, all irreducible modules of a matrix algebra are isomorphic, so finally U ⊗_A W = (U_k ⊗_A W)^{⊕n_k} = C^{⊕n_k}, which is clearly irreducible over B = ⊕_j M_{n_j}(C), proving part (ii).

For part (iii), since A = ⊕_i W_i^{⊕m_i} (by, for example, the beginning of the second paragraph), for U as an A-module we have U ≃ U ⊗_A A = U ⊗_A (⊕_i W_i^{⊕m_i}) = ⊕_i (U ⊗_A W_i)^{⊕m_i}. Since by (ii) U ⊗_A W_i is irreducible over B and by (i) it is isomorphic to Uc_i, we get U ≃ ⊕_i (Uc_i)^{⊕m_i}, and by uniqueness of the decomposition into irreducible submodules this must be the decomposition of U into irreducible B-modules.
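As a small example of the lemma, take G = S_2, A = CS_2 and U = V^{⊗2} as a right A-module. The two irreducible left A-modules are W_1 = Ac_1 and W_2 = Ac_2 with c_1 = e_{()} + e_{(12)} and c_2 = e_{()} − e_{(12)}, both one-dimensional, so m_1 = m_2 = 1, and part (iii) gives U ≃ Uc_1 ⊕ Uc_2 = Sym²V ⊕ Λ²V, the familiar decomposition of V ⊗ V.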

To prove Theorem 25 we apply this lemma with A = CS_d and U = V^{⊗d} viewed as a right A-module. Then, since we know (by, for example, [1] or [3]) what the irreducible representations of S_d are, namely V_λ = Ac_λ with λ ⊢ d and c_λ the Young symmetrizer, we can decompose U ≃ ⊕_λ (Uc_λ)^{⊕m_λ}, m_λ = dim V_λ, as a left B-module. The question now is to show that the algebra B is spanned by End(V).

Lemma 8. The algebra B is spanned as a linear subspace of End(V^{⊗d}) by End(V). A subspace of V^{⊗d} is a B-submodule if and only if it is invariant under GL(V).

Proof. One can show that Sym^d W is spanned by the elements w^d = w ⊗ ··· ⊗ w as w runs through W. One can show this, for example, by induction on d, starting with the fact that v ⊗ w + w ⊗ v, the generic element of Sym² W, can be written as (v + w) ⊗ (v + w) − (v ⊗ v) − (w ⊗ w), and then proceeding as if dealing with the elementary symmetric polynomials. Considering End(V) = V* ⊗ V, we have End(V^{⊗d}) = (V^{⊗d})* ⊗ V^{⊗d} = (V*)^{⊗d} ⊗ V^{⊗d} = (V* ⊗ V)^{⊗d} = (End(V))^{⊗d}. Now since B = Hom_{S_d}(V^{⊗d}, V^{⊗d}) ⊂ End(V^{⊗d}) = End(V)^{⊗d} and B is invariant under S_d, it must in fact be a subspace of Sym^d(End(V)), which is spanned by End(V) (via the elements φ^d). But GL(V) is dense in End(V), so this implies the second statement of the lemma.
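The polarization identity in the first step of this proof can be checked numerically; a minimal Python sketch:

```python
# Minimal numerical check of the polarization identity used above:
# v (x) w + w (x) v = (v+w)(x)(v+w) - v(x)v - w(x)w in Sym^2 W.
import numpy as np

rng = np.random.default_rng(0)
v, w = rng.random(4), rng.random(4)
lhs = np.outer(v, w) + np.outer(w, v)
rhs = np.outer(v + w, v + w) - np.outer(v, v) - np.outer(w, w)
assert np.allclose(lhs, rhs)
print("polarization identity verified for a random v, w")
```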

This lemma now proves that in our case a B-module decomposition is the same as a GL(V)-decomposition, showing that V^{⊗d} = ⊕_λ (V^{⊗d}c_λ)^{⊕m_λ} is the decomposition into irreducibles over GL(V), where of course S_λV = V^{⊗d}c_λ; this shows the second part of Theorem 25.

For the first part we will use the fact that both the complete homogeneous symmetric polynomials h_λ and the Schur polynomials span the space of symmetric polynomials. We do not yet know what exactly V^{⊗d}c_λ is, but we do know this for V^{⊗d}a_λ: it is in fact isomorphic to Sym^{λ_1} V ⊗ Sym^{λ_2} V ⊗ ··· ⊗ Sym^{λ_k} V ≃ V^{⊗d} ⊗_A A.a_λ, as we symmetrize along the rows of the Young tableau of shape λ. From the representation theory of S_d we have that Aa_λ = ⊕_{μ⊢d} (Ac_μ)^{⊕K_{μλ}} is the decomposition into irreducible representations of S_d, where the K_{μλ} are the Kostka numbers. From our lemma we then have, as GL(V)-modules,

Sym^{λ_1} V ⊗ Sym^{λ_2} V ⊗ ··· ⊗ Sym^{λ_k} V = V^{⊗d} a_λ = ⊕_μ (V^{⊗d} c_μ)^{⊕K_{μλ}}.    (13)
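For instance, for d = 3 and λ = (2,1) the relevant Kostka numbers are K_{(3),(2,1)} = K_{(2,1),(2,1)} = 1 and K_{(1,1,1),(2,1)} = 0, so (13) reads Sym²V ⊗ V = S_{(3)}V ⊕ S_{(2,1)}V.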

Let g ∈ GL(V) be a matrix with eigenvalues x_1, …, x_n. Then the trace of g on the left-hand side can be computed in the standard way for tensor products:

Tr|_{V^{⊗d} a_λ}(g) = ∏_i Tr|_{Sym^{λ_i} V}(g),

and we can compute the trace of g on Sym^p V by observing that if g is diagonal (or diagonalizable) with eigenvectors v_1, …, v_n, then g(v_{i_1} ··· v_{i_p}) = x_{i_1} ··· x_{i_p} (v_{i_1} ··· v_{i_p}) in Sym^p V, and by a dimension count, running over all i_1 ≤ ··· ≤ i_p, these are all the eigenvectors of g acting on Sym^p V; so Tr|_{Sym^p V}(g) = Σ_{i_1 ≤ ··· ≤ i_p} x_{i_1} ··· x_{i_p} = h_p(x_1, …, x_n), the complete homogeneous symmetric polynomial. Using the fact that the diagonalizable matrices form a dense set in GL(V), we can extend this result to any g with the given eigenvalues. We then have Tr|_{V^{⊗d} a_λ}(g) = ∏_i h_{λ_i}(x_1, …, x_n) = h_λ(x_1, …, x_n) by the definition of h_λ. Now taking Tr(g) on both sides of equation (13) gives

h_λ(x_1, …, x_n) = Σ_μ K_{μλ} Tr|_{V^{⊗d} c_μ}(g),    (14)

which coincides with the expression for h_λ in terms of the s_μ. Since the matrix with entries K_{μλ} is invertible, writing the system of equations (14) for all λ ⊢ d and multiplying by the inverse of the Kostka matrix, we get that the vector with entries Tr|_{S_μV}(g), running over μ ⊢ d, is equal to K⁻¹.h, where h is the vector with entries h_λ. Since also s = K⁻¹.h, where s is the vector with entries the Schur polynomials s_μ, we must have

Tr|_{S_λV}(g) = s_λ(x_1, …, x_n),

as desired.
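The identity (14) and the inversion argument can be tested numerically for small d; the following Python sketch (brute force, with ad hoc helpers) checks h_λ = Σ_μ K_{μλ} s_μ for all λ ⊢ 3 in three variables at a random point.

```python
# Brute-force check of h_lambda = sum_mu K_{mu,lambda} s_mu (equation (14)
# with the traces replaced by their values s_mu(x)) for d = 3, n = 3.
from itertools import combinations_with_replacement, product
from collections import Counter
from math import prod
import random

n = 3
x = [random.random() for _ in range(n)]
partitions = [(3,), (2, 1), (1, 1, 1)]

def tableaux(shape):
    """All semistandard fillings of `shape` with entries in {1,...,n}."""
    cells = [(i, j) for i, r in enumerate(shape) for j in range(r)]
    for vals in product(range(1, n + 1), repeat=len(cells)):
        T = dict(zip(cells, vals))
        if all(T[i, j] <= T[i, j + 1] for (i, j) in cells if (i, j + 1) in T) \
           and all(T[i, j] < T[i + 1, j] for (i, j) in cells if (i + 1, j) in T):
            yield T

def schur(shape):                      # s_shape(x) as a sum over SSYT
    return sum(prod(x[v - 1] for v in T.values()) for T in tableaux(shape))

def h(p):                              # complete homogeneous h_p(x)
    return sum(prod(x[i] for i in c)
               for c in combinations_with_replacement(range(n), p))

def kostka(mu, lam):                   # number of SSYT of shape mu, content lam
    target = {i + 1: lam[i] for i in range(len(lam))}
    return sum(1 for T in tableaux(mu) if Counter(T.values()) == target)

for lam in partitions:
    lhs = prod(h(p) for p in lam)
    rhs = sum(kostka(mu, lam) * schur(mu) for mu in partitions)
    assert abs(lhs - rhs) < 1e-9, lam
print("h_lambda = sum_mu K_{mu,lambda} s_mu verified for all lambda |- 3")
```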

References

[1] William Fulton and Joe Harris, Representation theory. A first course, Graduate Texts in Mathematics, vol. 129, Springer-Verlag, New York, 1991. MR1153249 (93a:20069)

[2] James E. Humphreys, Introduction to Lie algebras and representation theory, Graduate Texts in Mathematics, vol. 9, Springer-Verlag, New York, 1978. Second printing, revised. MR499562 (81b:17007)

[3] Richard P. Stanley, Enumerative combinatorics. Vol. 2, Cambridge Studies in Advanced Mathematics, vol. 62, Cambridge University Press, Cambridge, 1999. With a foreword by Gian-Carlo Rota and Appendix 1 by Sergey Fomin. MR1676282 (2000k:05026)
