
Basic Concepts in Algebra

§1. Notations and terminologies

(1.1) Some symbols

• ∀: “for all”

• ∃: “there exists”

• ↦: “maps to” (under a map/function)

• lhs := rhs (“rhs” is the definition of “lhs”)

• X ∖ S: the complement in X of a subset S ⊆ X

• 2^S, where S is a set: the set of all subsets of S

• Z: the set of all integers

• N: the set of all natural numbers (including 0)

• Q: the set of all rational numbers

• R: the set of all real numbers

• C: the set of all complex numbers

• N>0: the set of all positive integers

• M2(R): the set of all 2 × 2 matrices with entries in R

• GL3(Q): the set of all invertible 3 × 3 matrices A with entries in Q, i.e. there exists a 3 × 3 matrix B with entries in Q such that A · B = B · A = I3, where I3 is the 3 × 3 matrix whose diagonal entries are 1 and off-diagonal entries are 0.

• GL2(Z): the set of all 2 × 2 matrices A with entries in Z for which there exists a 2 × 2 matrix B with entries in Z satisfying A · B = B · A = I2; this amounts to the condition that the determinant of A is ±1.

• R[x]: the set of all polynomials with coefficients in R

• Q[x, y]: the set of all polynomials in two variables x and y with all coefficients in Q

• Z[x, y, z]: the set of all polynomials in three variables x, y, z with all coefficients in Z.

(1.2) Definition Let f : S → T be a map from a set S to a set T .

(a) The map f is said to be injective (or, f is an injection) if for any two elements s1, s2 ∈ S, f(s1) = f(s2) only if s1 = s2. Another standard terminology for the same concept: f is one-to-one.

(b) The map f is said to be surjective (or, f is a surjection) if for every element t ∈ T , there exists an element s ∈ S such that f(s) = t. In other words the image f(S) of f is equal to the target T of the map f. Another standard terminology for the same concept: f is onto.

(c) The map f is said to be bijective (or, f is a bijection) if f is both injective and surjective. Another standard terminology for the same concept: f is a one-to-one and onto correspondence.
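For maps between finite sets the three conditions in 1.2 can be checked mechanically. The following is a minimal sketch in Python, assuming that a map f : S → T is stored as a dictionary whose keys are the elements of S; the helper names is_injective, is_surjective and is_bijective and the two sample maps are illustrative assumptions, not part of these notes.

```python
def is_injective(f):
    """f is a dict representing a map S -> T; S is the set of keys."""
    values = list(f.values())
    # Injective means no value is hit twice.
    return len(values) == len(set(values))

def is_surjective(f, target):
    """The image f(S) must be the whole target T."""
    return set(f.values()) == set(target)

def is_bijective(f, target):
    return is_injective(f) and is_surjective(f, target)

# Two sample maps from {0,...,4} to itself (hypothetical examples).
square_mod_5 = {s: (s * s) % 5 for s in range(5)}   # s -> s^2 mod 5
shift_mod_5 = {s: (s + 1) % 5 for s in range(5)}    # s -> s + 1 mod 5

print(is_injective(square_mod_5), is_surjective(square_mod_5, range(5)))
print(is_bijective(shift_mod_5, range(5)))
```

The squaring map identifies 2 with 3 and misses the values 2 and 3, so it is neither injective nor surjective, while the shift map is a bijection.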

(1.3) Definition Let S1, ..., Sn be sets. The product S1 × ··· × Sn of S1, ..., Sn is the set consisting of all n-tuples (x1, ..., xn) such that xi ∈ Si for all i = 1, ..., n. Such a product is also denoted ∏_{i=1}^{n} Si, where “∏” is the symbol for a product. The above definition in words can also be expressed by the following formula

∏_{i=1}^{n} Si = S1 × ··· × Sn := {(x1, ..., xn) | xi ∈ Si ∀ i = 1, ..., n}

(1.4) Remark We will not need to use the product of an infinite family of sets. In case you wonder what such an infinite product is, suppose that I is an indexing set, and (Si), i ∈ I, is a family of sets indexed by I.

(a) The disjoint union ⊔_{i∈I} Si of the sets Si is the set of all pairs (i, x) where i is an element of the indexing set I and x is an element of the set Si indexed by i. For each j ∈ I, the set Sj is naturally identified with the subset of all elements in ⊔_{i∈I} Si of the form (j, x) such that x ∈ Sj. (Another notation for the disjoint union is “∐”, because coproducts of sets are nothing but disjoint unions, so ⊔_{i∈I} Si can also be written as ∐_{i∈I} Si. However we want to avoid such notation because the coproduct of a finite family of abelian groups is the same as their product, but the coproduct of an infinite family of non-trivial abelian groups is a proper subgroup of the product group.)

(b) The product ∏_{i∈I} Si is the set of all functions

f : I → ⊔_{i∈I} Si

from I to the disjoint union ⊔_{i∈I} Si of the sets Si such that f(i) lies in (the copy of) Si for all i ∈ I.

(1.5) Definition Let f : X → S and g : Y → S be maps of sets. The fiber product of f : X → S and g : Y → S, denoted by X ×f,S,g Y or X ×S Y for short, is the subset of X × Y consisting of all pairs (x, y) ∈ X × Y such that f(x) = g(y). Denote by π1 : X ×S Y → X and π2 : X ×S Y → Y the two “projections”, defined by

π1(x, y) = x , π2(x, y) = y ∀(x, y) ∈ X ×S Y.

Clearly f ◦ π1 = g ◦ π2 by construction. The triple (X ×S Y, π1, π2) satisfies the following universal property:

For any maps of sets u: T → X, v : T → Y such that f ◦ u = g ◦ v, there exists a unique map h: T → X ×S Y such that u = π1 ◦ h and v = π2 ◦ h.
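For finite sets the fiber product and the map h of the universal property can be written down directly. The sketch below is only an illustration under stated assumptions: the sets X, Y, T and the maps f, g, u, v are hypothetical choices made for the example, and fiber_product is an illustrative helper name.

```python
def fiber_product(X, Y, f, g):
    """Return the set of pairs (x, y) in X x Y with f(x) == g(y)."""
    return {(x, y) for x in X for y in Y if f(x) == g(y)}

# Hypothetical example: X = {0,...,5}, Y = {0,...,3}, S = {0, 1},
# with f and g both given by reduction modulo 2.
X, Y = range(6), range(4)
f = lambda x: x % 2
g = lambda y: y % 2
P = fiber_product(X, Y, f, g)

# The two projections.
pi1 = lambda pair: pair[0]
pi2 = lambda pair: pair[1]

# A test set T with maps u: T -> X and v: T -> Y such that f o u = g o v.
T = range(3)
u = lambda t: 2 * t        # values 0, 2, 4 (all even)
v = lambda t: 2 * (t % 2)  # values 0, 2, 0 (all even)

# The map forced by the universal property: h(t) = (u(t), v(t)).
h = lambda t: (u(t), v(t))

print(len(P))                      # number of pairs with equal parity
print(all(h(t) in P for t in T))   # h really lands in the fiber product
```

Uniqueness of h is visible in the formula: the conditions u = π1 ◦ h and v = π2 ◦ h leave no choice for the two coordinates of h(t).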

(1.6) Definition An equivalence relation on a set S is a subset R ⊆ S × S satisfying the following properties.

• (R is reflexive) (x, x) ∈ R for all x ∈ S. In words, every element of S is equivalent to itself under the equivalence relation R.

• (R is symmetric) If (x, y) ∈ R then (y, x) ∈ R. In words, if x is equivalent to y then y is equivalent to x.

• (R is transitive) If x, y, z are elements in S such that (x, y) ∈ R and (y, z) ∈ R, then (x, z) ∈ R. In words, if x is equivalent to y and y is equivalent to z, then x is equivalent to z.

(1.7) Definition Let R be an equivalence relation on a set S.

(a) The equivalence class under R containing an element x ∈ S is the set of all elements y ∈ S such that (x, y) ∈ R. (So an equivalence class is a subset of S. Note that any two equivalence classes are either identical or disjoint. So the equivalence relation R partitions the set S into a disjoint union of equivalence classes.)

(b) The set S/R is the set of all equivalence classes in S with respect to R. (So each element of the set S/R is a subset of S, i.e. S/R is a set of subsets of S.)

Example: Two integers a, b are said to be congruent modulo 37 (notation: a ≡ b (mod 37)) if their difference is divisible by 37. Being congruent modulo 37 is an equivalence relation on Z. The set of all equivalence classes for this equivalence relation is denoted by Z/37Z. Note that each element of Z/37Z is a subset of Z; one such element is 1 + 37Z, consisting of all integers n that leave the remainder 1 when divided by 37. Of course you can replace 37 by any positive integer N and define an equivalence relation in a similar way.
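The partition into equivalence classes can be made concrete on a finite window of integers. The following sketch assumes that the relation is given as a Python function related(a, b) which really is an equivalence relation; the helper name equivalence_classes and the window from −74 to 74 are illustrative choices.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`.

    Each new element is compared with one representative of every class
    found so far; this is only valid when `related` is an equivalence relation.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])   # x starts a new class
    return classes

N = 37
congruent = lambda a, b: (a - b) % N == 0
classes = equivalence_classes(range(-74, 75), congruent)

# The part of the class 1 + 37Z that is visible in the window.
class_of_1 = next(cls for cls in classes if 1 in cls)
print(len(classes), class_of_1)
```

The window contains more than 37 consecutive integers, so all 37 elements of Z/37Z show up, each as a finite piece of an infinite subset of Z.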

(1.8) Definition A partial ordering on a set S is a relation ≼ on S, written a ≼ b if this relation holds for the ordered pair (a, b), such that the following properties hold.

(i) a ≼ a for all a ∈ S.

(ii) If a ≼ b and b ≼ c, a, b, c ∈ S, then a ≼ c.

(iii) If a, b ∈ S, a ≼ b and b ≼ a, then a = b.

A partial ordering on S is said to be a total ordering if property (iv) below holds.

(iv) For any two elements a, b ∈ S, either a ≼ b or b ≼ a.

(1.9) Definition Let ≼ be a partial ordering on a set S.

(a) An upper bound of a subset T ⊆ S is an element b ∈ S such that t ≼ b for all t ∈ T.

(b) A maximal element of a subset T ⊆ S is an element m ∈ T such that there is no element in T which is strictly bigger than m. In other words, if t ∈ T and m ≼ t, then t = m.
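A subset can have several maximal elements, and an upper bound need not lie in the subset. The sketch below illustrates both points for the divisibility ordering on positive integers; the choice of this ordering, the subset T = {2, ..., 12} and the helper names are assumptions made for the example only.

```python
def maximal_elements(T, leq):
    """Elements m of T such that m <= t with t in T forces t == m."""
    return [m for m in T if all(t == m for t in T if leq(m, t))]

def is_upper_bound(b, T, leq):
    """b is an upper bound of T if t <= b for every t in T."""
    return all(leq(t, b) for t in T)

# Partial ordering on the positive integers: a <= b means "a divides b".
divides = lambda a, b: b % a == 0

T = range(2, 13)
print(maximal_elements(T, divides))        # [7, 8, 9, 10, 11, 12]
print(is_upper_bound(27720, T, divides))   # 27720 = 2^3 * 3^2 * 5 * 7 * 11
print(is_upper_bound(12, T, divides))      # 5 does not divide 12
```

Here T has six maximal elements, none of which is an upper bound of T, while 27720 is an upper bound lying outside T.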

(1.10) Zorn’s Lemma. Let S be a non-empty partially ordered set such that every totally ordered subset T ⊆ S has an upper bound in S. Then there exists a maximal element in S.

Zorn’s Lemma is an equivalent form of the axiom of choice in set theory, which is known to be independent of the basic axioms in standard set theory and is consistent if the standard set theory is, i.e. assuming it will not lead to a contradiction unless the standard set theory already does (an inconceivable scenario, for then almost all of known mathematics would have to be abandoned). Most mathematicians use Zorn’s lemma freely.

(1.11) Definition (a) Two sets S1 and S2 are said to have the same cardinality if there exists a bijection between S1 and S2; we write Card(S1) = Card(S2) if this is the case.

(b) We say that the cardinality of a set S1 is less than or equal to that of a set S2 if there exists an injection S1 → S2. When S1 is non-empty, this property is equivalent (under the axiom of choice) to the existence of a surjection from S2 to S1. Notation: Card(S1) ≤ Card(S2).

(1.12) Basic facts about cardinality, assuming the axiom of choice.

(i) If Card(S1) ≤ Card(S2), and Card(S2) ≤ Card(S3), then Card(S1) ≤ Card(S3).

(ii) If Card(S1) ≤ Card(S2), and Card(S2) ≤ Card(S1), then Card(S1) = Card(S2).

(iii) Either Card(S1) ≤ Card(S2) or Card(S2) ≤ Card(S1) for any two sets S1 and S2.

§2. Groups

(2.1) Definition A group is a triple (G, µ, e), where G is a set, µ: G × G → G is a binary operation and e is an element of G, such that the following properties are satisfied.

• (e is a unity element for the group law µ) µ(x, e) = µ(e, x) = x for all x ∈ G.

• (associativity) µ(x, µ(y, z)) = µ(µ(x, y), z) for all x, y, z ∈ G.

• (existence of inverse) For every element x ∈ G, there exists an element y ∈ G such that µ(x, y) = µ(y, x) = e. [It is easy to check that this element y is uniquely determined by the above property; it is called the inverse of x and denoted x^{-1}.]

When the group law µ is understood, we will suppress the symbol µ and write “x · y” for µ(x, y). Moreover we will often suppress both the group law µ and the unity element e, and simply use the underlying set G as the notation for the group, if that causes no confusion. For a positive integer n we often write x^n for the product of n copies of x; x^{-n} means (x^{-1})^n.

(2.2) Definition A group G is said to be commutative if xy = yx for all x, y ∈ G. As a synonym, an abelian group (in honor of the mathematician Abel) is the same as a commutative group. For an abelian group we often use the symbol “+” instead of “·”. When using the additive notation for an abelian group G, x^n in the multiplicative notation becomes n · x, the result of applying the group law to n copies of an element x of G.

(2.3) Definition A subgroup of a group G is a subset H of G which contains the unity element e of G such that x · y ∈ H and x^{-1} ∈ H for all x, y ∈ H.

(2.4) Definition Let S be a subset of a group G. The subgroup of G generated by S is the smallest subgroup of G which contains S; it is the subset of G consisting of the unity element e and all finite products x1 · x2 ··· xn (n ∈ N>0) where each xi is either an element of S or the inverse of an element of S.

(2.5) Definition Let G be a group and let H be a subgroup of G.

(a) The left H-coset which contains an element a ∈ G is the subset a · H = {a · h | h ∈ H} of G. Being in the same left H-coset is an equivalence relation on G: two elements x and y are equivalent if and only if x^{-1} · y ∈ H.

(b) G/H is the set of all left H-cosets. (So G/H is a set of subsets of G.)

(c) The right H-coset which contains an element a ∈ G is the subset H · a of G; H\G is the set of all right H-cosets in G.

Note that the map x ↦ x^{-1} induces a bijection between G/H and H\G. The common cardinality of G/H and H\G is called the index of H in G, denoted [G : H] (which can be either a positive integer or ∞). We have |G| = [G : H] · |H| if G is a finite group.
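The statements about cosets can be tested on a small example. The following sketch assumes that a permutation of {0, 1, 2} is stored as the tuple of its images and takes for H the subgroup of order 2 generated by one transposition; these encodings and the helper names compose and inverse are illustrative assumptions.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(i) = s(t(i)); a permutation is the tuple of images of 0..n-1."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))   # the symmetric group on 3 letters
H = [(0, 1, 2), (1, 0, 2)]         # identity and the transposition swapping 0, 1

left_cosets = {frozenset(compose(a, h) for h in H) for a in G}    # G/H
right_cosets = {frozenset(compose(h, a) for h in H) for a in G}   # H\G

# x -> x^{-1} carries each left coset a.H to the right coset H.a^{-1}.
inverted = {frozenset(inverse(x) for x in coset) for coset in left_cosets}

print(len(left_cosets), len(G) == len(left_cosets) * len(H))
print(inverted == right_cosets, left_cosets == right_cosets)
```

In this example the index is 3 and |G| = [G : H] · |H| holds, inversion exchanges left and right cosets, and the two partitions of G differ because this particular H is not a normal subgroup.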

(2.6) Definition A group homomorphism from a group (G1, µ1, e1) to a group (G2, µ2, e2) is a map h: G1 → G2 such that h(e1) = e2 and h(µ1(x, y)) = µ2(h(x), h(y)) for all x, y ∈ G1. In other words, h respects the group structures.

Suppose that h is a homomorphism from G1 to G2. Then the image under h of a subgroup of G1 is a subgroup of G2, and the inverse image under h of a subgroup of G2 is a subgroup of G1.

(2.7) Definition A group homomorphism h: G1 → G2 is an isomorphism if there exists a group homomorphism f : G2 → G1 such that f ◦ h = id_{G1} and h ◦ f = id_{G2}.

(2.8) Definition Let h be a homomorphism from a group G1 to a group G2. The kernel of h, denoted by Ker(h), is the subset of G1 consisting of all elements x ∈ G1 such that h(x) = eG2, where eG2 is the unity element of the target group G2.

(2.9) Definition A normal subgroup N of a group G is a subgroup of G such that x y x^{-1} ∈ N for all elements x ∈ G and all elements y ∈ N. Notation: N ⊴ G.

Remark A subgroup H of a group G is normal if and only if every left H-coset is a right H-coset. In particular every subgroup of index two is normal.

(2.10) Remark Let h be a homomorphism from a group G1 to a group G2.

(a) The kernel Ker(h) of h is a normal subgroup of G1. More generally the inverse image under h of any normal subgroup of the target group G2 is a normal subgroup of the source group G1.

(b) h is injective if and only if Ker(h) is the trivial subgroup {eG1} of G1.

Terminology. A group endomorphism is a homomorphism from a group to itself. A group automorphism is an isomorphism from a group to itself.

(2.11) Definition Let (G1, ·1, e1) and (G2, ·2, e2) be groups. The product G1 × G2 of the underlying sets has a natural group structure, where the group law is defined by

(x, y) · (u, v) = (x ·1 u, y ·2 v) ∀ x, u ∈ G1, ∀y, v ∈ G2 .

The resulting group structure on G1 × G2 is called the product of G1 and G2.

The injective group homomorphisms h1 : G1 −→ G1 × G2 and h2 : G2 −→ G1 × G2, defined by

h1(x) = (x, e2) ∀ x ∈ G1, h2(y) = (e1, y) ∀ y ∈ G2,

identify G1 and G2 with normal subgroups of the group G1 × G2 which intersect trivially and generate the group G1 × G2. Note also that any element of h1(G1) commutes with any element of h2(G2). Conversely, if G1 and G2 are normal subgroups of a group G such that G1 ∩ G2 = {eG} and G is the smallest subgroup of G which contains G1 and G2, then the map α: G1 × G2 −→ G defined by

α((x, y)) = x · y ∀ x ∈ G1, ∀ y ∈ G2

is a group isomorphism.

(2.12) Definition Let G be a group. Let u be an element of G.

(a) A conjugate of u in G is an element of the form x · u · x^{-1} with x ∈ G.

(b) The conjugacy class in G containing u is the set of all conjugates of u; it is a subset of G, denoted by ^G u.

One can check without difficulty that the surjective map x ↦ x · u · x^{-1} from G to the conjugacy class of u induces a bijection

G/ZG(u) ⥲ ^G u, x ZG(u) ↦ x u x^{-1}

from G/ZG(u) to the conjugacy class ^G u, where ZG(u) is the centralizer of u in G defined in 2.13.

(2.13) Definition Let H be a subgroup of a group G, and let u be an element of G.

(a) The centralizer of u in G, denoted by ZG(u), is the subset of all elements x ∈ G such that xu = ux (or equivalently x u x^{-1} = u). It is easily checked that ZG(u) is a subgroup of G.

(b) The normalizer of H in G, denoted by NG(H), is the subset of all elements x ∈ G such that x · H · x^{-1} = H. It is easily checked that NG(H) is a subgroup of G; moreover it contains H and ZG(H), the subgroup of all elements of G which commute with every element of H. It is clear that NG(H) = G if and only if H is a normal subgroup of G.

(2.14) Quotient groups Let N be a normal subgroup of a group (G, µ, e). Consider the set G/N of all (left) N-cosets in G. It is not difficult to check that the map

µ̄ : (G/N) × (G/N) −→ G/N, (x · N, y · N) ↦ xy · N ∀ x, y ∈ G

is well-defined, i.e. independent of the choice of representatives in the cosets. Let ē be the coset N. It is easy to check that (G/N, µ̄, ē) is a group, and the natural surjective map

π : G −→ G/N, π(x) = x · N ∀ x ∈ G

is a group homomorphism. Moreover the kernel Ker(π) of the homomorphism π is the normal subgroup N.

Important property: There is a natural bijection between subgroups of the quotient group G/N and subgroups of G containing N: To any subgroup H̄ of G/N, associate the subgroup π^{-1}(H̄) of G. Conversely, to every subgroup H of G which contains N, associate the subgroup π(H) = H/N of G/N. Under this correspondence, H is normal in G if and only if H̄ = H/N is normal in G/N. Moreover if H is normal, then we have a natural isomorphism between G/H and (G/N)/(H/N).

(2.15) Definition Let S be a set. A permutation on S is a bijection σ : S → S from S to itself. The set of all permutations of S forms a group, denoted by Perm(S), where the group law is given by composition (of bijections).

Remark When S = {1, 2, ..., n} for a positive integer n, we usually write Sn for the group of all permutations of these n labels. It is a finite group with n! elements.

(2.16) Definition There is a group homomorphism sgn: Sn → µ2 = {±1}, defined by

∏_{1≤i<j≤n} (x_{σ(i)} − x_{σ(j)}) = sgn(σ) · ∏_{1≤i<j≤n} (x_i − x_j).

A permutation σ ∈ Sn is said to be even (resp. odd) if sgn(σ) = 1 (resp. if sgn(σ) = −1). The subgroup of Sn consisting of all even permutations is called the alternating group on n letters, denoted An.
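In the product formula of 2.16 each pair i < j with σ(i) > σ(j) contributes one factor −1, so sgn(σ) is determined by the parity of the number of such pairs (inversions). The sketch below uses this reformulation, with permutations of {0, ..., n−1} stored as tuples of images; the encoding and the helper names are assumptions made for the example.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(i) = s(t(i)); a permutation is the tuple of images of 0..n-1."""
    return tuple(s[t[i]] for i in range(len(t)))

def sgn(sigma):
    """Sign of sigma: (-1) to the number of pairs i < j with sigma(i) > sigma(j)."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

S4 = list(permutations(range(4)))

# sgn respects composition, checked on all pairs of elements of S4.
is_homomorphism = all(sgn(compose(s, t)) == sgn(s) * sgn(t)
                      for s in S4 for t in S4)

A4 = [s for s in S4 if sgn(s) == 1]   # the even permutations
print(is_homomorphism, len(A4))
```

The even permutations make up exactly half of S4, in agreement with the fact that An is the kernel of a surjective homomorphism onto a group with two elements when n ≥ 2.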

(2.17) Definition Let p be a prime number. A p-group is a finite group whose cardinality is a power of p.

(2.18) Definition Let G be a group. The commutator subgroup of G, denoted by (G, G), is the subgroup of G generated by all elements of the form x · y · x^{-1} · y^{-1}, where x and y are elements of G.

Remark The commutator subgroup (G, G) of G is a normal subgroup of G and G/(G, G) is an abelian group. Moreover the quotient homomorphism π : G −→ G/(G, G) has the following property: if h: G → A is a group homomorphism and A is an abelian group, then there exists a unique group homomorphism g : G/(G, G) −→ A such that h = g ◦ π.

(2.19) Definition Let G be a group.

(a) The set Autgrp(G) of all group automorphisms forms a group; the group law is given by composition of automorphisms.

(b) For every element x ∈ G, denote by AdG(x) the group automorphism of G which sends every element y ∈ G to x · y · x^{-1}. The map x ↦ AdG(x) is a group homomorphism

AdG : G −→ Autgrp(G) .

Note that Ker(AdG) is the center Z(G) of G.

(2.20) Definition Let N and H be subgroups of a group G. We say that G is a semi-direct product of N and H with N normal if N is a normal subgroup of G, N ∩ H = {eG} and G is generated by H and N. Notation: G ≅ N ⋊ H.

Note that G is not determined by N and H up to isomorphism. In other words there are examples of non-isomorphic groups G and G′, such that G is a semi-direct product of a normal subgroup N1 and a subgroup H1, G′ is a semi-direct product of a normal subgroup N2 and a subgroup H2, and there exist group isomorphisms H1 ⥲ H2, N1 ⥲ N2.

(2.21) Definition Let H and N be groups, and let α: H −→ Autgrp(N) be a group homomorphism. The semi-direct product N ⋊_α H attached to (H, N, α) is the group with underlying set N × H, whose group law is defined by the following formula:

(n1, h1) · (n2, h2) := (n1 · α(h1)(n2), h1 · h2) ∀ n1, n2 ∈ N, ∀ h1, h2 ∈ H

We have natural injective group homomorphisms j : H −→ N ⋊_α H and ι: N −→ N ⋊_α H given by

j(h) = (eN, h) ∀ h ∈ H, ι(n) = (n, eH) ∀ n ∈ N.

It is easy to see that ι(N) is a normal subgroup of N ⋊_α H intersecting the subgroup j(H) trivially, and N ⋊_α H is generated by ι(N) and j(H). Moreover α is naturally identified with the conjugation action of elements of j(H) on the normal subgroup ι(N).

Conversely, suppose that H is a subgroup of a group G, N is a normal subgroup of G, and G is a semi-direct product of N and H. Denote by α the homomorphism from H to Autgrp(N) given by

α(h)(n) = h · n · h^{-1} ∀ h ∈ H, ∀ n ∈ N.

Then one can check that G is isomorphic to the semi-direct product N ⋊_α H.
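The group law of 2.21 can be tried out on the smallest interesting case. The sketch below assumes N = Z/3Z and H = Z/2Z, both written additively, with the non-trivial element of H acting on N by n ↦ −n; this choice of α and the helper names are illustrative assumptions. Replacing α by the trivial homomorphism gives the product group, which is abelian, so the comparison also illustrates the remark after 2.20.

```python
def alpha(h, n):
    """alpha(h) in Aut(Z/3Z): the identity for h = 0, negation for h = 1."""
    return n % 3 if h % 2 == 0 else (-n) % 3

def trivial(h, n):
    """The trivial homomorphism H -> Aut(N): every h acts as the identity."""
    return n % 3

def make_mul(action):
    """Group law (n1, h1).(n2, h2) = (n1 + action(h1)(n2), h1 + h2)."""
    def mul(a, b):
        (n1, h1), (n2, h2) = a, b
        return ((n1 + action(h1, n2)) % 3, (h1 + h2) % 2)
    return mul

elements = [(n, h) for n in range(3) for h in range(2)]
mul = make_mul(alpha)            # semi-direct product of Z/3Z by Z/2Z
mul_direct = make_mul(trivial)   # product group Z/3Z x Z/2Z

associative = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                  for a in elements for b in elements for c in elements)
commutative = all(mul(a, b) == mul(b, a) for a in elements for b in elements)
direct_commutative = all(mul_direct(a, b) == mul_direct(b, a)
                         for a in elements for b in elements)

print(associative, commutative, direct_commutative)
print(mul((1, 0), (0, 1)), mul((0, 1), (1, 0)))
```

Both groups have six elements and contain copies of the same N and H, but only the one built from the trivial action is commutative.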

§3. Vector spaces

(3.1) Definition Let (F, +F, ·F, 0F, 1F) be a field.¹ A vector space over F (or an F-vector space) is a quadruple (V, +V, µ, 0V), where V is a set, 0V is an element of V, +V is a binary operation on V, and µ: F × V −→ V is a map, such that the following conditions are satisfied.

• (V, +V , 0V ) is an abelian group.

• µ(a, µ(b, v)) = µ(a ·F b, v) for all a, b ∈ F and all v ∈ V.

• µ(a + b, v) = µ(a, v) + µ(b, v) for all a, b ∈ F and all v ∈ V.

• µ(a, v + w) = µ(a, v) + µ(a, w) for all a ∈ F and all v, w ∈ V .

• µ(0F , v) = 0V for all v ∈ V .

• µ(1F , v) = v for all v ∈ V .

¹ See 5.10 for the definition of fields.

We often suppress the symbol “µ” and write a · v or av for µ(a, v) if no confusion is possible. An element of V is often called a “vector” in V.

(3.2) Definition Let V1 and V2 be vector spaces over the same field F. A linear transformation over F from V1 to V2 is a map T : V1 → V2 that preserves the structure of F-vector spaces. More precisely, T is a group homomorphism for the underlying abelian groups, and T(a · v) = a · T(v) for all a ∈ F and all v ∈ V1. Synonym: F-linear maps.

(3.3) Remark (1) The kernel Ker(T) of a linear transformation T : V1 → V2 is the vector subspace of V1 consisting of all elements v ∈ V1 such that T(v) = 0 in V2. The image Im(T) of the map T forms a vector subspace of V2 over F.

(2) A more uniform terminology for “linear transformations” is “homomorphisms between F-vector spaces”.

(3) A linear transformation is injective if and only if its kernel is trivial.

(4) A (linear) endomorphism of a vector space V over a field F is a linear transformation from V to itself. A linear endomorphism of V is also called a linear operator on V .

(5) The set of all F-linear endomorphisms of V is denoted by EndF(V). The set of all F-linear transformations from an F-vector space V1 to an F-vector space V2 is denoted by HomF(V1, V2). Both EndF(V) and HomF(V1, V2) are vector spaces over F.

(6) The set of all F-linear automorphisms of V is denoted by GLF(V); it has a natural group structure, with the group law given by composition.

(7) The dual vector space to an F -vector space V is the F -vector space HomF (V,F ). An element of HomF (V,F ) is said to be a linear functional on V .

(3.4) Definition Let V be a vector space over a field F . Let v1, . . . , vm be elements of V .

(a) An F-linear combination (or linear combination for short) of v1, ..., vm is an expression of the form

∑_{i=1}^{m} ai · vi

with ai ∈ F.

(b) The F -linear span of v1, . . . , vm is the smallest F -vector subspace of V which contains v1, . . . , vm. It is easy to see that the linear span of v1, . . . , vm consists of all F -linear combinations of v1, . . . , vm.

(c) We say that the list of vectors v1, ..., vm is linearly independent over F if for every m-tuple (a1, ..., am) of elements of F, ∑_{i=1}^{m} ai · vi = 0 only if a1 = ··· = am = 0. In other words, no non-trivial linear combination of the vi’s is equal to the zero vector.

(d) We say that the list of vectors v1, ..., vm forms an F-basis of V if it is linearly independent over F and its F-linear span is equal to V. Two other equivalent properties for the list of vectors v1, ..., vm to be an F-basis:

(†) v1, ..., vm is a maximal linearly independent list of vectors in V.

(‡) v1, ..., vm is a minimal list of vectors in V which spans V.

(3.5) Proposition Let V be a vector space over a field F .

(a) (Existence of basis) V has an F -basis.

(b) Let S and T be subsets of V. If V is the F-linear span of S and T is linearly independent over F, then Card(T) ≤ Card(S).

(c) (definition of dimension) Any two F-bases of V have the same cardinality. The common cardinality of all bases of an F-vector space V is called the dimension of V, denoted dimF(V).

Remark The statement 3.5 holds for arbitrary vector spaces, including those which are infinite dimensional, i.e. cannot be spanned by a finite number of elements. The meaning of cardinality is taken in the sense of set theory as in 1.11.

(3.6) Some basic properties. (1) Any two vector spaces of the same dimension over the same field F are isomorphic as F-vector spaces.

(2) dimF(HomF(V1, V2)) = d1 · d2, where di = dimF(Vi) for i = 1, 2, if V1 and V2 are finite dimensional.

(3) For any linear transformation T : V1 → V2 between F-vector spaces, we have a natural isomorphism

V1/Ker(T) ⥲ Im(T).

In particular dimF (V1) = dimF (Ker(T )) + dimF (Im(T )).

(3.7) Let V and W be finite dimensional vector spaces over a field F, and let T : V → W be an F-linear transformation. Let v1, ..., vn be an F-basis of V, and let w1, ..., wm be an F-basis of W. The matrix representation of T for the bases v1, ..., vn and w1, ..., wm is the m × n matrix A = (a_{i,α}) whose entries a_{i,α}, 1 ≤ i ≤ m, 1 ≤ α ≤ n, are determined by

T(vα) = ∑_{i=1}^{m} a_{i,α} wi ∀ α = 1, ..., n.

(3.8) Definition Let T ∈ EndF(V) be a linear operator on a finite dimensional vector space V over a field F. Let v1, ..., vn be an F-basis of V, and let A ∈ Mn(F) be the matrix representation of T, whose entries a_{i,j} are determined by

T(vj) = ∑_{i=1}^{n} a_{i,j} vi, j = 1, ..., n.

(i) The trace of T is given by

Tr(T) = ∑_{i=1}^{n} a_{i,i}.

(ii) The determinant of T is given by

det(T) = ∑_{σ∈Sn} sgn(σ) ∏_{i=1}^{n} a_{i,σ(i)}.

(iii) The characteristic polynomial of T is the polynomial

det(x · Idn − A), where n = dimF(V).

Here x · Idn − A is an n × n matrix with coefficients in the polynomial ring F[x], and the determinant of this matrix is computed using the displayed formula for determinants in (ii) above.

The trace, determinant and characteristic polynomial are independent of the choice of the basis v1, ..., vn, hence are intrinsic invariants of the linear operator T. Up to sign the trace and determinant are coefficients of the characteristic polynomial: Write the characteristic polynomial of T as x^n + a_{n−1} x^{n−1} + ··· + a1 x + a0, then

Tr(T) = −a_{n−1}, det(T) = (−1)^n a0.
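The formulas in 3.8 can be evaluated literally for small matrices. The sketch below sums over all permutations as in (ii), which takes n! terms and is meant only as an illustration, not as a practical algorithm; the matrices A and B are hypothetical examples with integer entries, and the helper names are assumptions. It requires Python 3.8 or later for math.prod.

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    """Sign of a permutation given as the tuple of images of 0..n-1."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def det(A):
    """Sum over sigma in S_n of sgn(sigma) * prod_i A[i][sigma(i)]."""
    n = len(A)
    return sum(sgn(s) * prod(A[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

def char_poly_at(A, x):
    """Value of det(x * Id - A) at the number x."""
    n = len(A)
    M = [[(x if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return det(M)

A = [[2, 1], [1, 3]]
B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]

print(trace(A), det(A), det(B))
# For n = 2 the characteristic polynomial should be x^2 - Tr(A) x + det(A).
print(all(char_poly_at(A, x) == x * x - trace(A) * x + det(A)
          for x in range(-3, 4)))
# The constant term a0 is the value at x = 0, and det(B) = (-1)^3 * a0.
print(char_poly_at(B, 0) == -det(B))
```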

(3.9) Definition Let T ∈ EndF (V ) be a linear operator on a finite dimensional vector space V over a field F .

(i) An element λ ∈ F is an eigenvalue of T if Ker(T − λ · IdV) ≠ (0).

(ii) The eigenspace attached to an eigenvalue λ of T is the vector subspace Ker(T − λ · IdV); its non-zero elements are called eigenvectors of T for the eigenvalue λ.

(iii) The linear operator T is diagonalizable if there exists an F-basis of V consisting of eigenvectors; equivalently V is the linear span of the eigenspaces of T.

(3.10) Definition Let V be a vector space over a field F.

(a) The dual vector space of V is

^tV := HomF(V, F),

consisting of all F-linear transformations from V to F (also called the linear functionals on V).

(b) We have a natural F-linear map can: V −→ ^t(^tV) from V to the dual of the dual of V, defined by

v ↦ (α ↦ α(v)) ∀ v ∈ V, ∀ α ∈ ^tV.

In other words, the canonical map sends a vector v ∈ V to the linear functional can(v) on the dual of V, given by “evaluating at v”. The canonical map can: V −→ ^t(^tV) is an isomorphism if dimF(V) < ∞, but it is not an isomorphism if dimF(V) = ∞.

(c) Suppose that dimF(V) = n < ∞. The dual basis to an F-basis v1, ..., vn is the basis ^tv1, ..., ^tvn of ^tV, where ^tvi is given by

^tvi(vj) = 1 if j = i, and ^tvi(vj) = 0 if j ≠ i, ∀ j = 1, ..., n

for i = 1, ..., n.

(3.11) Definition The dual (or transpose) of an F-linear transformation T ∈ HomF(V, W) is the map ^tT : ^tW → ^tV such that

^tT(β)(v) = β(T(v)) ∀ β ∈ ^tW, ∀ v ∈ V.

Suppose that dimF(V) = n < ∞, dimF(W) = m < ∞, v1, ..., vn is an F-basis of V, and w1, ..., wm is an F-basis of W. Let A ∈ Mm×n(F) be the matrix representation of T with respect to the above bases. Let ^tv1, ..., ^tvn and ^tw1, ..., ^twm be the dual bases of the above bases, for ^tV and ^tW respectively. Then the matrix representation of ^tT with respect to the two dual bases is ^tA ∈ Mn×m(F), the transpose of A.

§4. Group actions

(4.1) Definition Let (G, ·, e) be a group. A left action of G on a set S is a map µ: G × S −→ S satisfying the following conditions.

• µ(e, s) = s for all s ∈ S.

• µ(x, µ(y, s)) = µ(x · y, s) for all x, y ∈ G and all s ∈ S.

(4.2) Remark (1) When there is no possible confusion we will suppress the symbol “µ” and write x · s or xs for µ(x, s).

(2) A left action µ of a group G on a set S defines a group homomorphism ρ: G −→ Perm(S) such that ρ(x)(s) = µ(x, s). Conversely every group homomorphism from G to Perm(S) gives rise to a left action of G on S, according to the same formula above.

(3) The notion of a right action of G is defined in a similar way.

(4) A left G-action µ on a set S can be turned into a right G-action ν on S, and vice versa, by µ(g, s) = ν(s, g^{-1}) ∀ g ∈ G, ∀ s ∈ S.

(4.3) Definition Let V be a vector space over a field F. Let G be a group. A left linear action of G on V is a left action µ: G × V −→ V such that

µ(g, a · u + b · v) = a · µ(g, u) + b · µ(g, v) ∀ g ∈ G, ∀ a, b ∈ F, ∀ u, v ∈ V.

In other words the action µ induces a group homomorphism ρ: G −→ GLF(V) given by

ρ(g)(v) = µ(g, v).

A group homomorphism ρ: G → GLF(V) is also called a linear representation of G on the vector space V.

(4.4) Definition Suppose we have a left action (S, G, µ) of a group G on a set S.

(1) The G-orbit of an element s ∈ S, denoted by G · s, is the subset of S consisting of all elements x ∈ S such that x = µ(g, s) for some element g ∈ G.

(2) Any two G-orbits in S are either equal or disjoint. In other words the left G-action partitions the set S into a disjoint union of G-orbits. The set of all (left) G-orbits on S is denoted G\S; it is a set of subsets of S.

(3) The stabilizer of an element s ∈ S, denoted StabG(s), is the subset of G consisting of all elements g ∈ G such that µ(g, s) = s; it is a subgroup of G. The map from G to the G-orbit containing s, which sends every element g ∈ G to the element µ(g, s) in S, defines a bijection

∼ G/StabG(s) −→ G · s .

(4) The fixer of a subset T ⊆ S, denoted FixG(T), is the subset of G consisting of all elements g ∈ G such that µ(g, t) = t for all t ∈ T; it is a subgroup of G.

(5) The stabilizer of a subset T ⊆ S, denoted StabG(T), is the subset of G consisting of all elements g ∈ G such that µ(g, t) ∈ T and µ(g^{-1}, t) ∈ T for all t ∈ T. The last condition is equivalent to µ(g, T) = T; StabG(T) is a subgroup of G containing FixG(T).

(4.5) The conjugation action. Let G be a group. The conjugation action (or the adjoint action) of G on itself is the map

µ_G^Ad : G × G −→ G, µ_G^Ad(x, y) = AdG(x)(y) = x · y · x^{-1}.

Recall that AdG is defined in 2.19. In many cases, orbits and stabilizers for the conjugation action, or for the action on 2^G induced by the conjugation action, have received our attention before, where 2^G is the set of all subsets of G.

(1) The stabilizer subgroup of an element x ∈ G under the conjugation action is ZG(x), the centralizer subgroup of x in G.

(2) Each orbit of the conjugation action of G on itself is a conjugacy class in G. For each element x ∈ G, the map from G to the conjugacy class containing x, given by sending every element y ∈ G to the element AdG(y)(x) = y · x · y^{-1}, establishes a bijection from G/ZG(x) to the conjugacy class of x in G; this is a special case of 4.4 (3). The decomposition of G into a disjoint union of conjugacy classes is a special case of the decomposition into orbits. When G is a finite group, counting the cardinality using this decomposition gives the following class equation for G:

|G| = |Z(G)| + ∑_{i∈I} |G| / |ZG(xi)|,

where I is a finite set which parametrizes the set of all conjugacy classes of G which are not contained in Z(G) (i.e. have more than one element), and xi is an element of the conjugacy class Ci parametrized by i.

(3) Let H be a subgroup of G, which can be regarded as a subset of G. The stabilizer of the subset H ⊂ G under the conjugation action is the normalizer subgroup NG(H) of H, while the fixer subgroup of this subset is the centralizer subgroup ZG(H) = FixG(H). We can also regard H as an element [H] of 2^G; the stabilizer subgroup of the element [H] ∈ 2^G under the action on 2^G induced by the conjugation action is again NG(H). We have a bijection from G/NG(H) to the set of all subgroups of G conjugate to H, again as a special case of 4.4 (3).
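The class equation can be verified by brute force for a small group. The sketch below takes G to be the symmetric group on 4 letters, with permutations stored as tuples of images; the choice of group, the encoding and the helper names are assumptions made for the illustration.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(i) = s(t(i)); a permutation is the tuple of images of 0..n-1."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(4)))

def conjugacy_class(u):
    """Orbit of u under conjugation: all x.u.x^{-1} with x in G."""
    return frozenset(compose(compose(x, u), inverse(x)) for x in G)

def centralizer(u):
    """Stabilizer of u under conjugation: all x with x.u = u.x."""
    return [x for x in G if compose(x, u) == compose(u, x)]

classes = {conjugacy_class(u) for u in G}
center = [z for z in G if len(centralizer(z)) == len(G)]

# One representative x_i for each class with more than one element.
representatives = [next(iter(c)) for c in classes if len(c) > 1]
rhs = len(center) + sum(len(G) // len(centralizer(x)) for x in representatives)

print(sorted(len(c) for c in classes))   # sizes of the conjugacy classes
print(len(G), rhs)                       # both sides of the class equation
```

Each class size agrees with |G| / |ZG(x)| for any x in the class, which is the bijection of 4.4 (3) applied to the conjugation action.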

The following is an easy consequence of the class equation and induction.

(4.6) Proposition Let p be a prime number and G be a non-trivial finite p-group, i.e. |G| is a power of p.

(i) The center Z(G) of G is non-trivial.

(ii) There exists a finite chain of subgroups {e} = G0 ⊊ G1 ⊊ G2 ⊊ ··· ⊊ Gm = G such that Gi ⊴ G for each i = 1, ..., m and x · y · x^{-1} · y^{-1} ∈ G_{i−1} for all x ∈ G, all y ∈ Gi and all i = 1, ..., m. In particular the quotient group Gi/G_{i−1} is abelian for all i = 1, ..., m.

(4.7) Definition Let p be a prime number and let G be a finite group. A p-subgroup H in G is a Sylow p-subgroup of G if the index [G : H] of H in G is relatively prime to p. In other words, |H| is the highest power of p which divides |G|.

(4.8) Theorem (Sylow) Let G be a finite group. Let p be a prime number which divides |G|.

(1) There exists a Sylow p-subgroup H in G.

(2) Any two Sylow p-subgroups H1 and H2 of G are conjugate in G; i.e. there exists an element x ∈ G such that x · H1 · x^{-1} = H2.

(3) [G : NG(H)] ≡ 1 (mod p) for every Sylow p-subgroup H of G. In other words the number of Sylow p-subgroups has the form 1 + a · p for some a ∈ N.
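The three statements of 4.8 can be checked by hand in a small case. The sketch below assumes G is the symmetric group on 4 letters and p = 3, so that |G| = 24 = 2^3 · 3 and the Sylow 3-subgroups are exactly the subgroups of order 3; since 3 is prime, each of them is cyclic and can be found as the subgroup generated by a single element. The encoding of permutations as tuples and the helper names are illustrative assumptions.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(i) = s(t(i)); a permutation is the tuple of images of 0..n-1."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def cyclic_subgroup(g):
    """The subgroup {e, g, g^2, ...} generated by g."""
    identity = tuple(range(len(g)))
    H, x = {identity}, g
    while x != identity:
        H.add(x)
        x = compose(x, g)
    return frozenset(H)

def conjugate(x, H):
    """The subgroup x.H.x^{-1}."""
    return frozenset(compose(compose(x, h), inverse(x)) for h in H)

G = list(permutations(range(4)))
p = 3

sylow = {cyclic_subgroup(g) for g in G if len(cyclic_subgroup(g)) == p}
H = next(iter(sylow))
normalizer = [x for x in G if conjugate(x, H) == H]

print(len(sylow), len(sylow) % p)                  # number of Sylow 3-subgroups
print(len(G) // len(normalizer))                   # the index [G : N_G(H)]
print({conjugate(x, H) for x in G} == sylow)       # all of them are conjugate
```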

§5. Rings, ideals and factorization

(5.1) Definition A ring is a quintuple (R, +, ·, 0, 1), where R is a set, + : R × R → R and · : R × R → R are two binary operations, and 0, 1 are elements of R, satisfying the following conditions.

• (R, +, 0) is a commutative group.

• 1 ≠ 0.

• 1 · x = x · 1 = x for all x ∈ R,

• (associativity) (x · y) · z = x · (y · z) for all x, y, z ∈ R.

• (distributive laws) (x + y) · z = (x · z) + (y · z) and x · (y + z) = (x · y) + (x · z) for all x, y, z ∈ R.

A ring R is said to be commutative if x · y = y · x for all x, y ∈ R. We often suppress the “·” in formulas when there is no possible confusion. A subset S of a ring R is a subring of R if (S, +, 0) is a subgroup of (R, +, 0), 1 ∈ S, and x · y ∈ S for all x, y ∈ S.

(5.2) Examples of rings.

• Z, Q, R, C are commutative rings.

• R[x], C[x, y], Z[x], Q[x, y, z] are commutative rings.

• The matrix rings M2(Z), M3(Q), M4(R), M5(C) with the standard definition of addition and multiplication are non-commutative rings.

• The set C(R) of all R-valued continuous functions on R, with “+” and “·” given by the sum and product of values of functions, forms a commutative ring.

• The subset N of Z is not a subring of Z.

• Let R be a commutative ring. The set R[x] of all polynomials in a variable x with all coefficients in R is a ring, where the two operations “+” and “·” are given by the standard formulas.

• Let M be a commutative group. The set Endgrp(M) of all group endomorphisms of M has a natural structure as a ring, where the multiplication is given by composition of endomorphisms. Such rings are usually non-commutative.

• Let V be a vector space over a field F. Then the set End_F(V) of all F-linear endomorphisms of V has a natural structure as a ring, where the multiplication is given by composition of endomorphisms. If dim_F(V) = n < ∞, then the ring End_F(V) is isomorphic to the matrix ring M_n(F); see 5.7 for the definition of ring isomorphisms.

(5.3) Definition Let R be a ring. An element u ∈ R is a unit of R if there exists an element v ∈ R such that u · v = v · u = 1. Note that such an element v is uniquely determined by the above condition; we say that v is the inverse of u. Denote by R^× the set of all units in R. It is easy to check that (R^×, ·, 1) forms a group; this group is commutative if R is.

Examples.

(i) Z^× = {1, −1}.

(ii) Let V be a vector space over a field F. We have End_F(V)^× = GL_F(V); this group is isomorphic to the matrix group GL_n(F) if dim_F(V) = n.

(iii) If R is an integral domain (see 5.12), then the group of units R[x]^× in the ring R[x] of all polynomials in one variable x with coefficients in R is equal to R^×. (For a general commutative ring this can fail: 1 + 2x is a unit in (Z/4Z)[x], because (1 + 2x)^2 = 1.)

(iv) (Z/8Z)^× ≅ (Z/2Z) × (Z/2Z).
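Example (iv) is easy to confirm by listing the units of Z/8Z and their orders. This is a minimal sketch; `units` and `element_orders` are hypothetical helper names, and n ≥ 2 is assumed.

```python
from math import gcd

def units(n):
    """Units of Z/nZ, represented by residues in 1..n-1 (n >= 2)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def element_orders(n):
    """Map each unit a of Z/nZ to its multiplicative order."""
    orders = {}
    for a in units(n):
        k, power = 1, a
        while power != 1:
            power = (power * a) % n
            k += 1
        orders[a] = k
    return orders

# Every unit of Z/8Z squares to 1, so the group has order 4 and exponent 2.
assert element_orders(8) == {1: 1, 3: 2, 5: 2, 7: 2}
```

By contrast, `element_orders(5)` contains elements of order 4, so (Z/5Z)^× is cyclic of order 4 and not isomorphic to (Z/8Z)^×.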

(5.4) Group rings. Let G be a finite group and let R be a commutative ring. Let R[G] be the set of all formal sums of the form
$$\sum_{x \in G} a_x [x], \qquad a_x \in R \ \ \forall x \in G.$$
Define addition by adding coefficients:
$$\Big(\sum_{x \in G} a_x [x]\Big) + \Big(\sum_{x \in G} b_x [x]\Big) = \sum_{x \in G} (a_x + b_x) [x],$$
and define multiplication by
$$\Big(\sum_{x \in G} a_x [x]\Big) \cdot \Big(\sum_{x \in G} b_x [x]\Big) = \sum_{x \in G} \Big(\sum_{u \in G} a_u \cdot b_{u^{-1}x}\Big) [x].$$

Then R[G] is a ring, and G → R[G]^×, x ↦ [x] is a group homomorphism.
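The multiplication in R[G] is a convolution of coefficient functions. A small sketch, assuming R = Z and G = S_3 with permutations as tuples, and representing an element of R[G] as a dictionary from group elements to non-zero coefficients (all names here are illustrative):

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def gr_mul(a, b):
    """Product in Z[G]: the coefficient of [x] is the sum of a_u * b_v over all u·v = x,
    which is the same as the sum over u of a_u * b_{u^{-1}x}."""
    out = {}
    for u, au in a.items():
        for v, bv in b.items():
            x = compose(u, v)
            out[x] = out.get(x, 0) + au * bv
    return {x: c for x, c in out.items() if c != 0}

e = (0, 1, 2)          # identity of S_3
s = (1, 0, 2)          # a transposition
t = (0, 2, 1)          # another transposition

# x ↦ [x] is multiplicative: [s]·[t] = [s∘t]
assert gr_mul({s: 1}, {t: 1}) == {compose(s, t): 1}
# ([e] + [s])·([e] − [s]) = [e] − [s∘s] = 0, so Z[S_3] has zero divisors
assert gr_mul({e: 1, s: 1}, {e: 1, s: -1}) == {}
```

The second assertion shows in passing that a group ring of a non-trivial finite group is never an integral domain.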

(5.5) Definition Let R be a ring.

(a) A subgroup I of (R, +, 0) is a left ideal (resp. right ideal) if x · I ⊆ I for all x ∈ R (resp. if I · x ⊆ I for all x ∈ R). A subset I of a ring R is an ideal if it is both a left and a right ideal; in other words I is stable under both the right and the left multiplication by arbitrary elements of R.

(b) Let S be a subset of a ring R. The ideal (resp. left ideal, resp. right ideal) generated by S is the smallest ideal (resp. left ideal, resp. right ideal) of R which contains S; in other words it is the intersection of all ideals (resp. left ideals, resp. right ideals) in R which contain S. Explicitly, the left ideal generated by S is the subset of R consisting of all finite sums of the form
$$\sum_{j \in J} x_j s_j,$$
where J is a finite indexing set, x_j ∈ R and s_j ∈ S for every j ∈ J. Similarly the ideal generated by S is the subset of R consisting of all finite sums of the form
$$\sum_{j \in J} x_j s_j y_j,$$
with x_j, y_j ∈ R and s_j ∈ S for every element j in the finite indexing set J.

(5.6) Definition Let R be a ring and let I and J be ideals. The ideal I·J is the ideal generated by the subset {x·y | x ∈ I, y ∈ J} ⊂ R. If I and J are generated by subsets S, T ⊂ R respectively, then the ideal I·J is the subset of R consisting of all finite sums of the form
$$\sum_{k \in K} x_k \cdot s_k \cdot y_k \cdot t_k \cdot z_k,$$
where K is a finite indexing set, x_k, y_k, z_k ∈ R, s_k ∈ S, t_k ∈ T for all k ∈ K.

(5.7) Definition Suppose (R, +R, ·R, 0R, 1R) and (S, +S, ·S, 0S, 1S) are rings. A ring homomorphism from the ring R to the ring S is a map h: R → S which respects the ring structure. In other words

• h(0R) = 0S,

• h(1R) = 1S,

• h(x +R y) = h(x) +S h(y) ∀x, y ∈ R, and

• h(x ·R y) = h(x) ·S h(y) ∀x, y ∈ R.

The kernel of a ring homomorphism h: R → S, denoted by Ker(h), is the subset of R consisting of all elements a ∈ R such that h(a) = 0S; it is an ideal of R. A ring homomorphism h: R → S as above is said to be a ring isomorphism if there exists a ring homomorphism h′ : S → R such that h′ ∘ h = id_R and h ∘ h′ = id_S.

(5.8) Remark Suppose that h: R → S is a ring homomorphism as above.

(1) The image under h of a subring of R is a subring of S.

(2) The inverse image under h of a subring of S is a subring of R.

(3) The inverse image under h of an ideal (resp. left ideal, resp. right ideal) of S is an ideal (resp. left ideal, resp. right ideal) of R. [However if we replace “subring” by “ideal” (or left/right ideal) in the statement (1) above, the resulting statement is false.]

(5.9) Quotient rings. Let I be a proper ideal of a ring R (i.e. I ≠ R). Since (R, +, 0) is a commutative group, we can consider the quotient group R/I for the addition, whose elements are subsets of R of the form a + I with a ∈ R. Consider the natural surjective map

π : R → R/I, x ↦ x + I.

It turns out that the condition that I is an ideal implies that there is a (necessarily unique) ring structure on R/I such that π is a ring homomorphism. (Check that the map

(R/I) × (R/I) → R/I, ((a + I), (b + I)) ↦ ab + I ∀a, b ∈ R

is well-defined.)

Important property: There is a natural bijection between ideals of the quotient ring R/I and ideals of R containing I: To any ideal J̄ of R/I, associate the ideal π^{−1}(J̄) of R. Conversely, to every ideal J of R which contains I, associate the ideal π(J) of R/I. The same statements hold if we replace “ideal(s)” by “left ideal(s)” in the above; similarly for right ideals.

(5.10) Definition (1) A division ring is a ring R such that for every element 0 ≠ x ∈ R, there exists an element y ∈ R such that x · y = y · x = 1; in other words every non-zero element of R is a unit of R.

(2) A field is a commutative division ring.

(3) A field F is said to be algebraically closed if and only if every non-constant polynomial f(x) ∈ F[x] with coefficients in F has at least one root in F; equivalently, every non-constant polynomial in F[x] is a product of linear polynomials (i.e. polynomials of degree one).

(5.11) Examples of fields and division rings.

• The rings Q, R, C are fields.

• Z is not a field.

• The polynomial rings R[x], Z[x, y] are not fields.

• The ring H of Hamiltonian quaternions is a 4-dimensional vector space over R with 1, i, j, k as an R-basis, where 1 is the unity element for the multiplication, and i, j, k are elements of H with the following properties:

i^2 = j^2 = k^2 = −1, i · j = k = −j · i, j · k = i = −k · j, k · i = j = −i · k.

An easy calculation shows that the inverse of a non-zero element a + bi + cj + dk in H with a, b, c, d ∈ R is (a^2 + b^2 + c^2 + d^2)^{−1} · (a − bi − cj − dk), so H is a non-commutative division ring.

• The matrix rings M2(R), M3(Z) are not division rings.

• The fields Q and R are not algebraically closed. The field C of all complex numbers is algebraically closed. The last statement is the famous Fundamental Theorem of Algebra, first proved by Gauss in the 18th century.

• The set of all complex numbers z which are roots of some non-constant polynomial f(x) ∈ Q[x] with coefficients in Q is a subfield of C, called the field of all algebraic numbers and denoted by Q^alg.

(5.12) Definition An integral domain is a commutative ring such that x · y ≠ 0 if x ≠ 0 and y ≠ 0. It is clear that every field is an integral domain.

Examples.

• Z, Q, R, C are integral domains.

• Z/6Z is not an integral domain: the product of the non-zero elements 2 + 6Z and 3 + 6Z is zero.

• Every subring of an integral domain is an integral domain. In particular every subring of a field is an integral domain.

• Let R be an integral domain. Then the polynomial ring R[x] consisting of all polynomials in a variable x with all coefficients in R is an integral domain.

(5.13) Definition Let R_1, …, R_n be rings. The product set R_1 × ··· × R_n, consisting of all n-tuples (x_1, x_2, …, x_n) such that x_i ∈ R_i for all i = 1, …, n, has a natural ring structure, with addition and multiplication given coordinate-by-coordinate. We call it the product ring of R_1, …, R_n; the identity element for addition is (0_{R_1}, …, 0_{R_n}), while the unity element for multiplication is (1_{R_1}, …, 1_{R_n}).

Remark (i) If R1,...,Rn are commutative and n ≥ 2, then the product ring R1 ×· · ·×Rn is not an integral domain. For instance (1, 0,..., 0) · (0, 1, 0,..., 0) = (0, 0,..., 0).

(ii) Each projection pr_i : R_1 × ··· × R_n → R_i is a ring homomorphism. However the “inclusion map” ι_i : R_i → R_1 × ··· × R_n, which sends every element x ∈ R_i to the n-tuple whose i-th component is x and whose other components are 0, is not a ring homomorphism if n ≥ 2: 1_{R_i} is not sent to the unity element of the product ring.

(5.14) Definition Let I be an ideal in a commutative ring R.

(i) I is a maximal ideal if I ≠ R and the only ideals of R containing I are I and R. Equivalently, the quotient ring R/I is a field.

(ii) I is a prime ideal if I ≠ R and for all elements x, y ∉ I, their product x · y is again not in I. Equivalently, the quotient ring R/I is an integral domain.

It is clear from the two alternative definitions that every maximal ideal is a prime ideal. The equivalence of the two definitions in (i) comes from the following easy fact: A commutative ring is a field if and only if it has only two ideals, (0) and the whole ring itself.

Examples.

(i) In the ring Z[x], {0}, (3), (x), (3, x) are all prime ideals; among them only (3, x) is a maximal ideal.

(ii) The finite ring Z/100Z has only two prime ideals, 2Z/100Z and 5Z/100Z, both of which are also maximal ideals.

(iii) In the ring R = ℝ[x, y, z]/(x^2 + y^2 + z^2), where ℝ is the field of real numbers, the principal ideal generated by the image x̄ of x is a prime ideal, but is not a maximal ideal. The ideal m := (x̄, ȳ − 1) is maximal and the quotient R/m is isomorphic to C.

Notation. Let R be a commutative ring. Denote by Spec(R) the set of all prime ideals of R, called the spectrum of R. The subset of Spec(R) consisting of all maximal ideals is called the maximal spectrum of R and denoted by MaxSpec(R).
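Example (ii) above, on the prime ideals of Z/100Z, can be verified directly from the definitions in 5.14. The sketch below relies on the standard fact that every ideal of Z/nZ has the form dZ/nZ for a divisor d of n (a consequence of 5.9 and of Z being a principal ideal domain); the function names are illustrative.

```python
def ideal(d, n):
    """The ideal dZ/nZ of Z/nZ, for a divisor d of n, as a set of residues."""
    return frozenset(range(0, n, d))

def is_prime_ideal(I, n):
    """I is proper, and x, y outside I implies x*y outside I."""
    if len(I) == n:
        return False
    outside = [x for x in range(n) if x not in I]
    return all((x * y) % n not in I for x in outside for y in outside)

def is_maximal_ideal(I, n, ideals):
    """I is proper, and the only ideals containing I are I and the whole ring."""
    return len(I) < n and all(J == I or len(J) == n for J in ideals if I <= J)

def prime_and_maximal_generators(n):
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    ideals = [ideal(d, n) for d in divisors]
    primes = [d for d in divisors if is_prime_ideal(ideal(d, n), n)]
    maximals = [d for d in divisors if is_maximal_ideal(ideal(d, n), n, ideals)]
    return primes, maximals

assert prime_and_maximal_generators(100) == ([2, 5], [2, 5])
```

The zero ideal of Z/100Z is not prime because 10 · 10 = 0 while 10 ≠ 0, which the check above detects.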

(5.15) Remark Suppose that R_1, …, R_n are commutative rings.

(i) There is a bijection α from the disjoint union of the spectra Spec(R_i) to the spectrum $\mathrm{Spec}(\prod_{i=1}^{n} R_i)$ of the product ring, which sends a prime ideal ℘ in Spec(R_i) to pr_i^{−1}(℘), the inverse image of ℘ in the product ring under the i-th projection homomorphism $\mathrm{pr}_i : \prod_{i=1}^{n} R_i \to R_i$.

(ii) The bijection α in (i) induces a bijection from the disjoint union of the maximal spectra MaxSpec(R_i) to the maximal spectrum $\mathrm{MaxSpec}(\prod_{i=1}^{n} R_i)$ of the product ring.

(5.16) Definition A principal ideal domain (PID) is an integral domain such that every ideal is principal, i.e. can be generated by one element.

Examples.

• Z is a principal ideal domain.

• If F is a field, then the polynomial ring F[x] is a principal ideal domain. (Exer. Use the division algorithm to prove this assertion.)

• Z[x] is not a principal ideal domain. Similarly Q[x, y] is not a principal ideal domain.

• The ring of Gaussian integers Z[√−1], consisting of all complex numbers of the form a + b√−1 with a, b ∈ Z, is a principal ideal domain.

(5.17) Definition (Euclidean domains) An integral domain R is a Euclidean domain if there is a function σ : R − {0} → N such that a “division algorithm” holds, namely: For every element a ∈ R and every non-zero element b ∈ R − {0}, there are elements q, r ∈ R such that

a = q · b + r, and σ(r) < σ(b) if r ≠ 0.

Remark (i) It is easy to see that every Euclidean domain is a PID.

(ii) Examples of Euclidean domains do not come in abundance. The better-known examples of Euclidean domains include Z, polynomial rings F[x] over a field F, and the ring of Gaussian integers Z[√−1]. These are also the better-known examples of PID’s.

(iii) There are examples of PID’s which are not Euclidean domains, but it takes effort in these examples to prove the non-existence of Euclidean algorithms.

(iv) For an algebraic number field K, the ring O_K of algebraic integers in K is a PID if and only if the class number of K is equal to one.[2] Algebraic integers and class numbers are typically explained in books on algebraic number theory.
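To make the division algorithm of 5.17 concrete for the Gaussian integers, here is a sketch that takes σ to be the norm a^2 + b^2 and rounds the exact quotient to the nearest lattice point. Representing a + b√−1 as the pair (a, b) and the function names below are assumptions made for this illustration.

```python
def norm(z):
    """sigma(a + b*sqrt(-1)) = a^2 + b^2."""
    a, b = z
    return a * a + b * b

def divmod_gauss(x, y):
    """Return (q, r) with x = q*y + r and norm(r) < norm(y), for y != 0."""
    (a, b), (c, d) = x, y
    n = c * c + d * d
    # x / y = x * conj(y) / n = ((a*c + b*d) + (b*c - a*d) * sqrt(-1)) / n
    re, im = a * c + b * d, b * c - a * d
    # round each coordinate to a nearest integer: floor(t + 1/2)
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

def gcd_gauss(x, y):
    """Euclidean algorithm in Z[sqrt(-1)]; the result is a generator of the ideal (x, y)."""
    while y != (0, 0):
        _, r = divmod_gauss(x, y)
        x, y = y, r
    return x

q, r = divmod_gauss((7, 3), (2, -1))
assert norm(r) < norm((2, -1))
```

Because each coordinate of the rounding error is at most 1/2, the remainder satisfies norm(r) ≤ norm(y)/2, which is what makes the Euclidean algorithm terminate.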

(5.18) In the ring Z of integers, the relation “a | b” (a divides b) can be better thought of as a relation between ideals: (a) ⊃ (b). (Here (a) is the general notation for the ideal (of Z in the present case) generated by a, i.e. the ideal aZ.) The same statement holds for any commutative ring. For principal ideal domains the familiar elementary concepts for the arithmetic of Z can be naturally generalized, and are usually best thought of in terms of ideals. We illustrate this point below.

[2] An algebraic number field is a subfield K of C consisting of algebraic numbers, i.e. roots of non-constant polynomials in Q[x], which is finite dimensional as a vector space over Q. The subring O_K of K, consisting of all elements of K which are roots of some monic polynomial in Z[x], is called the ring of algebraic integers in K. Two non-zero ideals I_1, I_2 in O_K are said to be equivalent if there exists an element a ∈ K^× such that a · I_1 = I_2. A theorem of Dirichlet asserts that there are only a finite number of equivalence classes of non-zero ideals in O_K; that number is called the class number of K.

Unique factorization in a PID. One formulation of this property for a PID is the following:

For every non-zero proper ideal I, there exist a positive integer m, mutually distinct maximal ideals ℘_1, …, ℘_m and positive integers e_1, …, e_m such that $I = \prod_{j=1}^{m} \wp_j^{e_j}$. Moreover the positive integer m is uniquely determined by I, and the m-tuples (℘_1, …, ℘_m) and (e_1, …, e_m) are uniquely determined up to permutation.

gcd and lcm in a PID. The familiar concepts “greatest common divisor” and “least common multiple” in Z can also be formulated in terms of ideals:

(gcd(a, b)) = (a, b) (= aZ + bZ), (lcm(a, b)) = (a) ∩ (b) (= aZ ∩ bZ).
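For the ring Z both identities can be checked by hand or by machine. The sketch below uses the extended Euclidean algorithm to exhibit gcd(a, b) as an element of aZ + bZ, and compares aZ ∩ bZ with the multiples of lcm(a, b) on a finite window; the names `extended_gcd` and `window` are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b, so g lies in aZ + bZ."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

a, b = 12, 18
g, s, t = extended_gcd(a, b)
assert g == s * a + t * b            # g is in aZ + bZ
assert a % g == 0 and b % g == 0     # aZ + bZ is contained in gZ

lcm = a * b // g
window = range(-200, 201)
common_multiples = {n for n in window if n % a == 0 and n % b == 0}
assert common_multiples == {n for n in window if n % lcm == 0}   # aZ ∩ bZ = lcm·Z on the window
```

Here g = 6 = (−1)·12 + 1·18 and the common multiples of 12 and 18 are exactly the multiples of 36.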

It is clear how to extend the concepts of gcd and lcm to the context of principal ideal domains. Here is a useful special case.

Suppose that F is a field and f_1(t), …, f_m(t) ∈ F[t] are polynomials such that there is no (non-constant) irreducible polynomial p(t) which divides all the f_i(t)’s. Then there exist polynomials g_1(t), …, g_m(t) ∈ F[t] such that $\sum_{i=1}^{m} g_i(t) f_i(t) = 1$.

Proof. The gcd of the elements f_i(t) for i = 1, …, m is equal to 1, which means that the ideal of F[t] generated by the elements f_i(t) is equal to the whole ring F[t].

(5.19) Definition Two non-zero elements a, b in a principal ideal domain R are said to be relatively prime if the ideal (a, b) = aR + bR generated by a and b is equal to R.

(5.20) Definition Let F be a field, and let F[x] be the ring of all polynomials in one variable x with coefficients in F.

(i) Let f(x) = a_d x^d + a_{d−1} x^{d−1} + ··· + a_1 x + a_0 be a polynomial in F[x]. The derivative of f, denoted by f′(x), is the polynomial

$$f'(x) := d\, a_d\, x^{d-1} + (d-1)\, a_{d-1}\, x^{d-2} + \cdots + a_1 \in F[x].$$

(ii) A polynomial f(x) ∈ F[x] is separable if it is relatively prime to its derivative f′(x); in other words the ideal of F[x] generated by f(x) and its derivative f′(x) is equal to F[x].

Suppose that E is an algebraically closed field which contains F as a subfield. Then an element f(x) ∈ F[x] is separable if and only if the polynomial f(x) does not have multiple roots in E. A simple example of an inseparable polynomial: Let F = F_p(t), the fraction field of F_p[t]; see 5.23 example (2). The polynomial x^p − t ∈ F_p(t)[x] is not separable, because its derivative p · x^{p−1} is zero.
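The separability criterion is effective: one computes a generator of the ideal (f, f′) by the Euclidean algorithm. The following sketch works over F = Q only (so it cannot reproduce the characteristic p example above), stores a polynomial as its list of coefficients [a_0, a_1, …, a_d], and uses hypothetical helper names.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f):
    return trim([k * c for k, c in enumerate(f)][1:])

def poly_rem(f, g):
    """Remainder of f on division by g != 0 in Q[x]."""
    f, g = [Fraction(c) for c in trim(f)], trim(g)
    while len(f) >= len(g):
        factor = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[i + shift] -= factor * c   # cancels the leading term exactly
        f = trim(f)
    return f

def poly_gcd(f, g):
    """A generator of the ideal (f, g) of Q[x], up to a non-zero scalar."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, poly_rem(f, g)
    return f

def is_separable(f):
    """f is separable iff (f, f') is the unit ideal, i.e. the gcd is a non-zero constant."""
    return len(poly_gcd(f, derivative(f))) == 1

assert is_separable([-2, 0, 1])        # x^2 - 2 has two distinct roots
assert not is_separable([1, -2, 1])    # (x - 1)^2 has a double root
```

Exact rational arithmetic via `Fraction` is used so that the leading term cancels exactly at every division step.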

(5.21) Proposition Let R be a commutative ring and let I be a proper ideal of R (i.e. I ≠ R). Then there exists a maximal ideal M of R which contains I.

Prop. 5.21 follows very quickly from Zorn’s Lemma; see 1.10 or Lemma 1.9 in the Appendix of Artin’s book for the statement of Zorn’s lemma. Here is a sketch of the proof of 5.21. Consider the set 𝒥 consisting of all ideals J of R which contain I but do not contain the unity element 1. The inclusion relation is a partial ordering on 𝒥. Clearly 𝒥 is non-empty, since I ∈ 𝒥. Let’s check that every non-empty totally ordered subset T ⊆ 𝒥 has an upper bound in 𝒥: The union I′ = ∪_{J∈T} J is an ideal which contains every element J ∈ T, and I′ does not contain 1, so I′ is an upper bound of T in 𝒥. By Zorn’s lemma the partially ordered set 𝒥 contains a maximal element M, which is nothing but a maximal ideal of R containing I.

(5.22) Theorem (Nullstellensatz) Let I = (f_1, …, f_r) be an ideal in the polynomial ring C[x_1, …, x_n] generated by polynomials f_i(x_1, …, x_n) ∈ C[x_1, …, x_n] for i = 1, …, r. Let V ⊆ C^n be the subset of common zeroes of the polynomials f_1, …, f_r, i.e.

V = {(z_1, …, z_n) ∈ C^n | f_i(z_1, …, z_n) = 0 ∀i = 1, …, r}.

If g(x1, . . . , xn) is an element of C[x1, . . . , xn] such that g(z1, . . . , zn) = 0 for all (z1, . . . , zn) ∈ V , then some power of g(x1, . . . , xn) is in the ideal I. This is the classical form of Hilbert’s Nullstellensatz; see Theorem 8.7 of Chapter 10 of Artin’s book. The same statement holds if the field C is replaced by an arbitrary algebraically closed field F .

(5.23) Definition Let R be an integral domain. A field of fractions for R is an injective ring homomorphism j : R ↪ F from R to a field F such that F is the smallest field containing the image j(R) of R; equivalently, for every element x ∈ F, there exists a non-zero element b ∈ R such that j(b) · x ∈ j(R).

A field of fractions of an integral domain R can be constructed as the set of all equivalence classes on R × (R − {0}), modulo the following equivalence relation:

(a, b) ∼ (a′, b′) ⇐⇒ a · b′ = b · a′, for a, a′, b, b′ ∈ R with b ≠ 0, b′ ≠ 0.

[Think of the equivalence class containing (a, b) as the element j(a) · j(b)^{−1} in the fraction field F.]

Below is a universal property for a/the field of fractions of an integral domain R: For any injective ring homomorphism ι: R → K from R to a field K, there exists a unique field homomorphism α: F → K such that ι = α ∘ j.

Examples.

(1) Q is the fraction field of Z.

(2) The fraction field of a polynomial ring F[x] over a field F, denoted by F(x) and called the field of rational functions over F in one variable x, consists of fractions of the form f(x)/g(x) with f(x), g(x) ∈ F[x], g(x) ≠ 0, modulo the usual equivalence relation

f_1(x)/g_1(x) = f_2(x)/g_2(x) if and only if f_1(x) · g_2(x) = f_2(x) · g_1(x).

(3) More generally, if R is an integral domain with fraction field F, then the fraction field of R[x] is naturally isomorphic to F(x).

(5.24) Definition Let F be a field. Let h: Z −→ F be the (only) ring homomorphism from Z to F .

(i) If Ker(h) ≠ (0), then Ker(h) = pZ for some prime p; in this case we say that F has characteristic p.

(ii) If Ker(h) = (0), then we say that F has characteristic 0.

(iii) The prime subfield of F is the smallest subfield contained in F . It is isomorphic to Fp := Z/pZ if F has characteristic p > 0, and isomorphic to Q if F is of characteristic 0.

(5.25) Definition Let R be an integral domain.

(i) A non-zero element a ∈ R is irreducible if a ∉ R^× and a cannot be factored non-trivially; i.e. if a = b · c with b, c ∈ R, then b ∈ R^× or c ∈ R^×.

(ii) Let x be a non-zero element in R which is not a unit, i.e. x ∉ R^×. We say that unique factorization holds for x if the following holds.

(a) There exist an element u ∈ R^× and irreducible elements y_1, …, y_m ∈ R, m > 0, such that $x = u \cdot \prod_{i=1}^{m} y_i$.

(b) Suppose that $x = v \cdot \prod_{j=1}^{n} z_j$, where n ∈ N_{>0}, v ∈ R^× is a unit and z_j is an irreducible element in R for j = 1, …, n. Then m = n, and there exist a permutation σ ∈ S_m and units u_1, …, u_m ∈ R^× such that z_i = u_i · y_{σ(i)} for i = 1, …, m.

(iii) We say that R is a unique factorization domain (UFD) if unique factorization holds for every non-zero element of R which is not a unit.

Remark (1) Let P be the set of all principal ideals in R which are not equal to R, partially ordered by inclusion. A non-zero element x ∈ R which is not a unit is irreducible in R if and only if the principal ideal (x) is a maximal element in P.

(2) Property (ii)(a) holds for all elements 0 ≠ x ∉ R^× in R (existence of factorization) if and only if every increasing chain in P stabilizes after a finite number of steps.

(3) Property (ii)(b) holds for all elements 0 ≠ x ∉ R^× in R (uniqueness of factorization) if and only if the principal ideal generated by any irreducible element x ∈ R is a prime ideal in R.

(4) We can define gcd and lcm in a UFD R. It is better to think of these concepts in terms of principal ideals. For instance the gcd of a finite number of elements a_1, …, a_m in R, not all equal to 0, is the smallest principal ideal (b) which contains (a_i) for all i = 1, …, m. Similarly, the lcm of a finite number of non-zero elements a_1, …, a_m in R is the largest principal ideal (c) which is contained in (a_i) for all i = 1, …, m.

(5) An equivalent statement of conditions (a), (b) above is: the principal ideal (x) can be factored as a product of principal ideals generated by irreducible elements in an essentially unique way.

(5.26) An example of an integral domain which is not a UFD. Let R = Z + √−5 Z = {a + b√−5 ∈ C | a, b ∈ Z}, a subring of C, hence an integral domain. Note that the function σ : R → N which sends any element a + b√−5 ∈ R to a^2 + 5b^2 satisfies σ(x · y) = σ(x) · σ(y). Using the function σ we see immediately that R^× = {±1}. It is also easy to check that 3 and 7 are irreducible elements in R: Suppose that 3 = (a + b√−5)(c + d√−5) with a, b, c, d ∈ Z, a + b√−5 ≠ ±1, c + d√−5 ≠ ±1. Evaluating the function σ using this decomposition, we get 9 = (a^2 + 5b^2) · (c^2 + 5d^2), which quickly leads to a contradiction: both factors would have to be equal to 3, but a^2 + 5b^2 = 3 has no integer solutions. The same argument applies to 7. We have two factorizations

21 = 3 · 7 = (4 + √−5) · (4 − √−5)

of the non-zero element 21 ∈ R. However neither 4 + √−5 nor 4 − √−5 is divisible by the irreducible element 3. Therefore R is not a UFD.

(5.27) Remark The failure of unique factorization for the ring R above turns out to be of a minor nature when one examines it from the vantage point of factorization of ideals. The truth is that every non-zero ideal of R can be factored into a product of maximal ideals in an essentially unique way. However we get into the unpleasant situation of non-uniqueness of factorization if we insist on using only principal ideals for factoring.

To see what is really happening in the above example, consider the following ideals in R:

P_1 := (3, 4 + √−5), P_2 := (3, 4 − √−5), Q_1 := (7, 4 + √−5), Q_2 := (7, 4 − √−5).

One can verify that P_1, P_2, Q_1, Q_2 are maximal ideals in R, and we have the following factorizations

(3) = P_1 · P_2, (7) = Q_1 · Q_2, (4 + √−5) = P_1 · Q_1, (4 − √−5) = P_2 · Q_2

of ideals in R, which completely “explain” the non-uniqueness of the two factorizations of the element 21.

Ideals were introduced by Kummer to salvage the unique factorization property for rings such as R above. He called them ideal numbers, which is the historical origin of the name “ideals”.

(5.28) Proposition Let R be a UFD, and let R[x] be the ring of all polynomials with coefficients in R. Recall that the degree of a non-zero element a_0 + a_1 x + ··· + a_d x^d is d if a_d ≠ 0. Recall also that (R[x])^× = R^×.

(1) R[x] is a UFD.

(2) Every irreducible element of R is an irreducible element of R[x].

(3) Every element f(x) = a_0 + a_1 x + ··· + a_d x^d of positive degree d ≥ 1 such that gcd(a_0, …, a_d) = (1) and such that f(x) is irreducible in F[x], where F is the fraction field of R, is an irreducible element in R[x].

(4) Every irreducible element in R[x] is of the form described in (2) or (3) above.

Remark It is easy to see that if R is a ring and R[x] is a UFD, then R is a UFD.

(5.29) Corollary The following rings are UFDs.

(i) Z, Z[x], Z[x_1, …, x_n].

(ii) F, F[x], F[x_1, …, x_n], where F is a field.

(iii) Z[√−1], Z[√−1][x_1, …, x_n].

§6. Modules

(6.1) Definition A left module over a ring (R, +R, ·R, 0R, 1R), or a left R-module, is a quadruple (M, +M, µ, 0M), where (M, +M, 0M) is a commutative group, and

µ: R × M −→ M

is a map satisfying the following properties.

• µ(a, x +M y) = µ(a, x) +M µ(a, y) for all a ∈ R and all x, y ∈ M,

• µ(a +R b, x) = µ(a, x) +M µ(b, x) for all a, b ∈ R and all x ∈ M,

• µ(a, µ(b, x)) = µ(a ·R b, x) for all a, b ∈ R and all x ∈ M.

• µ(1, x) = x for all x ∈ M.

• µ(0, x) = 0M for all x ∈ M.

Remark (1) We usually suppress the symbol “µ” and write a · x or ax for µ(a, x) when no confusion is possible.

(2) Right modules are defined in a similar way.

(3) There is no difference between left and right R-modules if R is a commutative ring; then we will suppress “left” or “right” and only say “R-modules”.

(4) A left R-module structure on an abelian group M is the same as a ring homomorphism R → Endgrp(M).

Examples.

1. Every abelian group can be regarded as a module over Z. In other words a Z-module is nothing other than an abelian group.

2. A module over a field F is the same as a vector space over F.

(6.2) Definition Let F be a field and let G be a finite group. Then a module over the group ring F[G] of the group G over F is the same as an F-linear action of G on an F-vector space V, or equivalently a homomorphism from G to the linear group GL_F(V). Any of the three equivalent concepts will be called a linear representation of G.

The correspondence between the three equivalent notions can be described as follows. Let V be the underlying vector space of the linear representation. Let µ: F[G] × V → V be the left F[G]-module structure on V, let ν : G × V → V be the corresponding left linear G-action on V, and let ρ: G → GL_F(V) be the corresponding group homomorphism. Recall that a typical element of F[G] is a formal sum $\sum_{x \in G} a_x \cdot [x]$ with “coefficients” a_x ∈ F. Then we have

ρ(x)(v) = ν(x, v) = µ(1 · [x], v) ∀ x ∈ G, ∀ v ∈ V

and

$$\mu\Big(\sum_{x \in G} a_x \cdot [x],\ v\Big) = \sum_{x \in G} a_x \cdot \nu(x, v) = \sum_{x \in G} a_x \cdot \rho(x)(v)$$

for all elements $\sum_{x \in G} a_x \cdot [x] \in F[G]$ and all v ∈ V.

(6.3) Another important class of examples comes from linear algebra. Suppose that V is a finite dimensional vector space over a field F and T ∈ End_F(V) is an F-linear operator on V. Then V has a structure as a module over the polynomial ring F[x], with the module structure given by

f(x) · v := f(T)(v), ∀ f(x) ∈ F[x], ∀ v ∈ V.

Here $f(T) := \sum_{i=0}^{m} a_i T^i$ if $f(x) = \sum_{i=0}^{m} a_i x^i$. Equivalently, there is a unique ring homomorphism h_T : F[x] → End_F(V) such that h_T(x) = T. The “pull-back” of the tautological End_F(V)-module structure on V by the ring homomorphism h_T gives V an F[x]-module structure.

Assume that V is finite dimensional over F. Then dim_F(End_F(V)) = dim_F(V)^2 < ∞, and the kernel Ker(h_T) of the ring homomorphism h_T : F[x] → End_F(V) defined in 6.3 is non-zero. Therefore there exists a unique non-constant monic polynomial g(x) ∈ F[x] which generates the ideal Ker(h_T).

(6.4) Definition Notation as above. The generator g(x) of the kernel Ker(hT ) of the ring homomorphism hT : F [x] → End(V ) which sends x to T is called the minimal polynomial of the linear operator T . Among monic polynomials f(x) in F [x] such that f(T )(v) = 0 for all v ∈ V , the minimal polynomial is the one with the smallest degree.
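The F[x]-module structure and the minimal polynomial can be illustrated on 2 × 2 integer matrices. The sketch below evaluates f(T) by Horner's rule, with f given by its coefficient list [a_0, …, a_m]; the matrices `T` and `S` and the helper names are choices made for this example only.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, T):
    """Evaluate f(T) = a_0*I + a_1*T + ... + a_m*T^m by Horner's rule."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, T)
        for i in range(n):
            result[i][i] += a
    return result

ZERO = [[0, 0], [0, 0]]
T = [[2, 1], [0, 2]]     # a Jordan block
S = [[2, 0], [0, 2]]     # a scalar matrix

# (x - 2)^2 = x^2 - 4x + 4 kills T, but x - 2 does not
assert poly_at_matrix([4, -4, 1], T) == ZERO
assert poly_at_matrix([-2, 1], T) != ZERO
# x - 2 already kills the scalar matrix S
assert poly_at_matrix([-2, 1], S) == ZERO
```

No monic polynomial x − c of degree one kills T, because T − c·I always has the entry 1 above the diagonal; so the minimal polynomial of T is (x − 2)^2, while that of S is x − 2.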

(6.5) Definition An R-submodule of a left R-module M is a subset M′ of M such that M′ is a subgroup of (M, +M, 0M) and a · x ∈ M′ for all a ∈ R and all x ∈ M′.

(6.6) Definition The R-submodule generated by a subset S ⊂ M of M is the smallest R-submodule of M which contains S; in other words it is the intersection of all R-submodules of M which contain S. More explicitly it is the subset consisting of all elements which can be written as a finite sum
$$\sum_{i \in I} x_i \cdot s_i, \qquad x_i \in R,\ s_i \in S \ \ \forall i \in I,$$
where I is a finite indexing set. A left R-module M is of finite type if M can be generated by a finite subset of M.

(6.7) Definition Let R be a ring and M,N be left R-modules.

(1) A module homomorphism from M to N is a map h: M → N such that h is a group homomorphism for the abelian groups underlying M and N, and h(a · x) = a · h(x) for all a ∈ R and all x ∈ M.

(2) The kernel of a module homomorphism h: M → N is the subset of M, denoted by Ker(h), consisting of all elements x ∈ M such that h(x) = 0 in N. It is a left R-submodule of M.

(3) Denote by HomR(M,N) the set of all left R-module homomorphisms; it has a natural structure as a commutative group, with group law given by addition. When R is commutative, HomR(M,N) has a natural structure as an R-module.

(4) We write EndR(M) for HomR(M,M).

(6.8) Examples.

(1) Let R be a ring. We can consider R as a left R-module using the multiplication law in R. Then left R-submodules of this left R-module R are exactly the left ideals of R.

(2) Let R be a ring, let M be a left R-module and let I be a set. Denote by M^I the set of all maps f : I → M; it has a natural left R-module structure, where the sum of two elements f and g is the map i ↦ f(i) + g(i) ∀i ∈ I, and the module structure is given by (a · f)(i) = a · f(i) ∀a ∈ R, ∀f ∈ M^I, ∀i ∈ I. This left R-module M^I is called the direct product of copies of M indexed by I. [Note that when M = R, R^I also has a natural right R-module structure, where the product (f · b) of an element f ∈ R^I by an element b ∈ R on the right is defined to be (f · b)(i) = f(i) · b for all i ∈ I. Moreover this right R-module structure is compatible with the previous left R-module structure, in the sense that (a · f) · b = a · (f · b) for all a ∈ R, all f ∈ R^I and all b ∈ R. The standard terminology is “R^I is an (R, R)-bimodule”. We will leave the formal definition of the notion of (R_1, R_2)-bimodules and their elementary properties to the reader as we will not use this notion.]

(3) Let R be a ring, M be a left R-module and let I be a set. The direct sum of copies of M indexed by I is the R-submodule of M^I consisting of all elements f ∈ M^I such that there exists a finite subset J ⊂ I (which may depend on f) with the property that f(i) = 0 for all i ∉ J. We write M^{⊕I} for the direct sum.

(4) A free left R-module is a left module isomorphic to the left R-module R^{⊕I} for some set I. When I = {1, …, n} for some n ∈ N, the free module R^{⊕I} is written R^{⊕n}, or R^n for short.

(5) Let I be an indexing set. Let R be a ring, and let {M_i | i ∈ I} be a family of left R-modules indexed by I. Let ⊔_{i∈I} M_i be the formal disjoint union of the M_i’s. The direct product ∏_{i∈I} M_i of the M_i’s is the set of all maps f : I → ⊔_{i∈I} M_i such that f(i) ∈ M_i for all i ∈ I. The direct sum ∐_{i∈I} M_i of the M_i’s is the subset of ∏_{i∈I} M_i consisting of all maps f such that there exists a finite subset J ⊂ I (which may depend on f) with the property that f(i) = 0 for all i ∉ J. The addition of elements in ∏_{i∈I} M_i (resp. in ∐_{i∈I} M_i) and left multiplication with elements of R are defined coordinate-wise.

It is clear that ∐_{i∈I} M_i = ∏_{i∈I} M_i if I is finite. When I = {1, 2, …, n}, n ∈ N, we often write M_1 ⊕ ··· ⊕ M_n or M_1 × ··· × M_n.

(6.9) Some basic properties. Let N be a left R-module, and let {M_i | i ∈ I} be a family of left R-modules.

(i) The map Hom_R(R, N) → N which sends every left R-module homomorphism h : R → N to the element h(1) ∈ N is a bijection.

(ii) The map α : Hom_R(N, ∏_{i∈I} M_i) → ∏_{i∈I} Hom_R(N, M_i) which sends an R-module homomorphism h : N → ∏_{i∈I} M_i to the element α(h) ∈ ∏_{i∈I} Hom_R(N, M_i), given by

α(h)(i) = pr_i ∘ h ∀i ∈ I,

is a bijection. Here pr_i : ∏_{j∈I} M_j → M_i is the “i-th projection”, which sends a typical element f : I → ⊔_{j∈I} M_j in ∏_{j∈I} M_j to the i-th component f(i) of f.

(iii) The map β : Hom_R(∐_{i∈I} M_i, N) → ∏_{i∈I} Hom_R(M_i, N) which sends an R-module homomorphism h : ∐_{i∈I} M_i → N to the element β(h) ∈ ∏_{i∈I} Hom_R(M_i, N) given by

β(h)(i) = h ∘ ι_i ∀i ∈ I,

is a bijection. Here ι_i : M_i → ∐_{j∈I} M_j is the natural “inclusion map” from M_i to the direct sum, such that for any element x ∈ M_i, ι_i(x) is the map from I to ⊔_{j∈I} M_j given by
$$\iota_i(x)(j) = \begin{cases} x & \text{if } j = i, \\ 0 & \text{if } j \neq i. \end{cases}$$

(6.10) Definition Let N be an R-submodule of a left R-module M. We define a left R-module structure on the quotient group M/N as follows. For any a ∈ R and any x ∈ M, the product a · (x + N) of a with the element x + N ∈ M/N is defined to be the element a · x + N ∈ M/N. It is straightforward to check that the above definition is well-defined and gives M/N a structure of a left R-module.

In many ways working with modules is the same as doing linear algebra over rings instead of fields. So far everything is formal. The next result on the structure of finitely generated modules over a PID is a main result for an introductory course on algebra.

(6.11) Theorem Let R be a principal ideal domain. Let M be a finitely generated module over R. Let $M_{\mathrm{tor}}$ be the torsion R-submodule of M, consisting of all elements x ∈ M such that there exists a non-zero element a ∈ R with a · x = 0.

(1) There exists a natural number r ∈ N and an R-module isomorphism $M \cong R^r \oplus M_{\mathrm{tor}}$. The natural number r is uniquely determined by M and is called the rank of M.

(2) Let N be a finitely generated torsion module. Then there exists a finite number of mutually distinct maximal ideals $\wp_1, \dots, \wp_m$, m ∈ N, such that the R-submodule $N[\wp_i^\infty]$ of N, given by

\[ N[\wp^\infty] := \{ x \in N \mid \wp^n \cdot x = \{0\} \text{ for some } n \in \mathbb{N}_{>0} \} , \]

is non-zero for each i = 1, . . . , m, and the natural R-homomorphism $N[\wp_1^\infty] \oplus \cdots \oplus N[\wp_m^\infty] \to N$, defined by

\[ (x_1, \dots, x_m) \mapsto x_1 + \cdots + x_m , \qquad x_i \in N[\wp_i^\infty] \ \ \forall i = 1, \dots, m, \]

is an isomorphism.

(3) Let ℘ be a maximal ideal of R and let N be a finitely generated R-module such that $\wp^n \cdot N = \{0\}$ for some $n \in \mathbb{N}_{>0}$. Then there exist a natural number a ∈ N, positive integers $e_1, \dots, e_a$, and an R-module isomorphism

\[ N \cong R/\wp^{e_1} \oplus \cdots \oplus R/\wp^{e_a} . \]

The natural number a is uniquely determined by N, and the positive integers e1, . . . , ea are uniquely determined by N up to permutation.

Remark (a) The isomorphism in statement (1) implies that the torsion submodule $M_{\mathrm{tor}}$ is of finite type over R, i.e. it is a finitely generated R-module. Similarly the $\wp_i$-primary component $N[\wp_i^\infty]$ of the torsion module N in (2) is of finite type over R.

(b) The statement (1) implies that every torsion-free finitely generated R-module is free. In particular every torsion-free finitely generated abelian group is isomorphic to $\mathbb{Z}^r$ for a unique natural number r.

(c) Thm. 6.11, applied to the principal ideal domain Z, gives the structure theorem for finitely generated abelian groups:

Let A be a finitely generated abelian group. Then there exist

– natural numbers r, m ∈ N,

– mutually distinct prime numbers $p_1, \dots, p_m$,

– a finite non-decreasing sequence $e_{i,1} \le \cdots \le e_{i,a_i}$ of positive integers attached to the prime $p_i$ for each i = 1, . . . , m, and

– a group isomorphism

\[ A \cong \mathbb{Z}^r \oplus \bigoplus_{1\le i\le m,\ 1\le j\le a_i} \mathbb{Z}/p_i^{e_{i,j}}\mathbb{Z} . \]

The integers r and m are uniquely determined by A, the prime numbers $p_1, \dots, p_m$ are uniquely determined by A up to permutation, and for each i the number $a_i$ and the non-decreasing sequence of positive integers $e_{i,1} \le \cdots \le e_{i,a_i}$ attached to the prime number $p_i$ are uniquely determined by A.
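As a small illustration of the structure theorem for finite abelian groups, the following Python sketch computes the exponents $e_{i,j}$ for a group given as a direct sum of cyclic groups $\mathbb{Z}/n_1 \oplus \cdots \oplus \mathbb{Z}/n_k$. It relies on the Chinese remainder theorem (each $\mathbb{Z}/n$ splits into its prime-power parts); the function names `prime_factorization` and `primary_invariants` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def prime_factorization(n):
    """Return {p: exponent} for a positive integer n, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def primary_invariants(cyclic_orders):
    """For Z/n_1 + ... + Z/n_k, return {p: [e_1 <= ... <= e_a]}.

    Each Z/n contributes one summand Z/p^e for every prime power p^e
    exactly dividing n (Chinese remainder theorem).
    """
    invariants = defaultdict(list)
    for n in cyclic_orders:
        for p, e in prime_factorization(n).items():
            invariants[p].append(e)
    return {p: sorted(es) for p, es in sorted(invariants.items())}


# Z/12 + Z/18 is isomorphic to Z/2 + Z/4 + Z/3 + Z/9.
print(primary_invariants([12, 18]))
```

By the uniqueness statement, two such groups are isomorphic exactly when the returned dictionaries agree; for instance `[4]` and `[2, 2]` give different invariants, while `[6]` and `[2, 3]` give the same ones.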

(6.12) Corollary Let V be a non-zero finite dimensional vector space over a field F. Let T ∈ End_F(V) be a linear operator on V. We give V the structure of an F[x]-module such that each polynomial f(x) ∈ F[x] operates on V as f(T).

(1) There exist a positive integer m and a finite number of distinct monic irreducible polynomials $f_1(x), \dots, f_m(x)$ such that V decomposes into the direct sum of the non-zero T-stable linear subspaces $V[f_i^\infty]$, where

\[ V[f_i^\infty] := \{ v \in V \mid f_i(T)^n (v) = 0 \text{ for some } n \in \mathbb{N}_{>0} \} . \]

(2) For each i = 1, . . . , m, there exist a sequence of natural numbers $0 < e_{i,1} \le \cdots \le e_{i,a_i}$ and an F[x]-module isomorphism

\[ V[f_i^\infty] \cong F[x]/\big(f_i(x)^{e_{i,1}}\big) \oplus \cdots \oplus F[x]/\big(f_i(x)^{e_{i,a_i}}\big) . \]

(3) For each i = 1, . . . , m, the positive integers $a_i$ and $0 < e_{i,1} \le \cdots \le e_{i,a_i}$ attached to the irreducible polynomial $f_i(x)$ are uniquely determined by the linear operator T on V.

(4) The minimal polynomial of the operator T is

\[ \prod_{i=1}^{m} f_i(x)^{e_{i,a_i}} . \]

(5) The characteristic polynomial of T is

\[ \prod_{i=1}^{m} f_i(x)^{\sum_{1\le j\le a_i} e_{i,j}} . \]

Remark Recall that the characteristic polynomial of T is $\det\big(x\cdot \mathrm{Id}_{\dim_F(V)} - A\big)$, where A is a matrix representation of the linear operator T. The square matrix $x\cdot \mathrm{Id}_{\dim_F(V)} - A$ is an element of $M_{\dim_F(V)}(F[x])$, and the determinant is a monic polynomial in F[x].

(6.13) Cor. 6.12 is a formulation of the theory of rational canonical forms in terms of modules over the polynomial ring F[x]. We make it more explicit by choosing a suitable basis for each direct summand corresponding to a factor of the form $F[x]/(f(x)^e) = F[x]/I$, where f(x) is a monic irreducible polynomial in F[x] and I is the principal ideal generated by $f(x)^e$. The linear operator in question is induced by the element x ∈ F[x], operating on the quotient $W = F[x]/(f(x)^e)$ via "multiplication by x". Write $f(x) = x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0$. Then $\dim_F(W) = de$. The following list of vectors in W = F[x]/I

\[
\begin{array}{llll}
v_1 := x^{d-1} f(x)^{e-1} + I, & v_2 := x^{d-2} f(x)^{e-1} + I, & \cdots & v_d := f(x)^{e-1} + I, \\
v_{d+1} := x^{d-1} f(x)^{e-2} + I, & v_{d+2} := x^{d-2} f(x)^{e-2} + I, & \cdots & v_{2d} := f(x)^{e-2} + I, \\
\quad\vdots & \quad\vdots & & \quad\vdots \\
v_{(e-1)d+1} := x^{d-1} + I, & v_{(e-1)d+2} := x^{d-2} + I, & \cdots & v_{ed} := 1 + I
\end{array}
\]

is a basis of W, such that the matrix representation of the linear operator "multiplication by x" has a simple form, called an irreducible block for the rational canonical form of the operator T, or an irreducible rational canonical block for short. Such an irreducible rational canonical block, corresponding to a factor $F[x]/(f(x)^e)$ where f(x) is a monic irreducible polynomial of degree d in F[x], is a de × de matrix in block form, with e diagonal blocks of size d × d, each occupied by the same cyclic d × d matrix associated to the irreducible monic polynomial f(x) of degree d. The entries outside these diagonal blocks are zero, except for e − 1 entries equal to 1 in the "inner upper diagonal corners". We illustrate it below in the case when $f(x) = x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0$ and e = 3; the block is (blank entries are 0)

\[
\begin{pmatrix}
-a_3 & 1 & 0 & 0 & & & & & & & & \\
-a_2 & 0 & 1 & 0 & & & & & & & & \\
-a_1 & 0 & 0 & 1 & & & & & & & & \\
-a_0 & 0 & 0 & 0 & 1 & & & & & & & \\
& & & & -a_3 & 1 & 0 & 0 & & & & \\
& & & & -a_2 & 0 & 1 & 0 & & & & \\
& & & & -a_1 & 0 & 0 & 1 & & & & \\
& & & & -a_0 & 0 & 0 & 0 & 1 & & & \\
& & & & & & & & -a_3 & 1 & 0 & 0 \\
& & & & & & & & -a_2 & 0 & 1 & 0 \\
& & & & & & & & -a_1 & 0 & 0 & 1 \\
& & & & & & & & -a_0 & 0 & 0 & 0
\end{pmatrix}
\]

Note that the irreducible polynomial f(x) ∈ F[x] and the exponent e can be immediately "read off" from such an irreducible rational canonical block. When d = 1, i.e. the irreducible polynomial f(x) is a linear polynomial x − λ, we have the familiar Jordan blocks. Below is an illustration in the case e = 5:

\[
\begin{pmatrix}
\lambda & 1 & 0 & 0 & 0 \\
0 & \lambda & 1 & 0 & 0 \\
0 & 0 & \lambda & 1 & 0 \\
0 & 0 & 0 & \lambda & 1 \\
0 & 0 & 0 & 0 & \lambda
\end{pmatrix} .
\]

A matrix representation of a linear operator T ∈ End_F(V) on a finite dimensional vector space V in block-diagonal form, in which each diagonal block is an irreducible rational canonical block, is called the rational canonical form of T.
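The shape of an irreducible rational canonical block can be made concrete with a short Python sketch. It builds the de × de block from the coefficients $a_0, \dots, a_{d-1}$ of a monic polynomial f and the exponent e, following the layout described in (6.13), and then checks on a small example that f(B)^e = 0 while f(B) ≠ 0. The helper names (`rational_block`, `mat_mul`, `poly_at`) are hypothetical, and the code does not test irreducibility of f; that is left as an assumption on the input.

```python
def rational_block(coeffs, e):
    """Block for F[x]/(f^e), f = x^d + a_{d-1} x^{d-1} + ... + a_0.

    coeffs = [a_0, ..., a_{d-1}]; f is assumed (not checked) irreducible.
    """
    d = len(coeffs)
    n = d * e
    B = [[0] * n for _ in range(n)]
    for k in range(e):
        off = k * d
        for r in range(d):
            # First column of each diagonal block: -a_{d-1}, ..., -a_0.
            B[off + r][off] = -coeffs[d - 1 - r]
            if r + 1 < d:
                B[off + r][off + r + 1] = 1  # 1's above the diagonal
        if k + 1 < e:
            B[off + d - 1][off + d] = 1  # entry linking consecutive blocks
    return B


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at(coeffs, M):
    """Evaluate the monic polynomial f at the square matrix M (Horner)."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += a
    return result


# f(x) = x^2 + 1 (irreducible over Q), e = 2.
B = rational_block([1, 0], 2)
N = poly_at([1, 0], B)
zero = [[0] * 4 for _ in range(4)]
print(N != zero, mat_mul(N, N) == zero)

# d = 1 recovers a Jordan block: f(x) = x - 2, e = 3.
print(rational_block([-2], 3))
```

In the first example f(B) is non-zero but squares to zero, which is consistent with the minimal polynomial of the block being f(x)^2.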

(6.14) Corollary Notation as in 6.12.

(1) The linear operator T is diagonalizable (i.e. V is spanned by eigenvectors of T) if and only if the minimal polynomial of T is a product of mutually distinct polynomials of degree one in F[x].

(2) (Cayley-Hamilton) The minimal polynomial of T divides the characteristic polynomial of T. In other words $f_{\mathrm{char},T}(T) = 0$, where $f_{\mathrm{char},T}(x) \in F[x]$ is the characteristic polynomial of T.

(3) Two matrices in $M_n(F)$ are conjugate if and only if they have the same rational canonical form.

(4) Every matrix in $M_n(F)$ is conjugate to its transpose.

(6.15) Remark Cor. 6.12 summarizes the theory of canonical forms for a linear operator on a finite dimensional vector space V over a field F. Below are some related definitions and their basic properties, of interest only when the base field F is not algebraically closed.

(i) We say that a linear operator T on V is reduced if the subring of End_F(V) generated by F and T is isomorphic to the product of a finite number of fields.

(ii) Because this subring is the image of the ring homomorphism

\[ h_T : F[x] \longrightarrow \mathrm{End}_F(V) \]

sending x to T, whose kernel is the principal ideal generated by the minimal polynomial of T, this subring is isomorphic to

\[ F[x] \Big/ \Big( \prod_{i=1}^{m} f_i(x)^{e_{i,a_i}} \Big) \cong \prod_{i=1}^{m} \Big( F[x] / \big(f_i(x)^{e_{i,a_i}}\big) \Big) , \]

a product of quotient rings of the form $F[x]/(f(x)^e)$ with f(x) irreducible and e ≥ 1.

(iii) It follows quickly from (ii) above that T is reduced if and only if one of the following equivalent conditions holds.

– The quotient ring $F[x]/\big(f_i(x)^{e_{i,a_i}}\big)$ is a field for each i = 1, . . . , m.

– The exponent $e_{i,a_i}$ of the irreducible polynomial $f_i(x)$ in the minimal polynomial of T is equal to 1 for all i = 1, . . . , m.

– The image $h_T(F[x])$ of the ring homomorphism $h_T$ is reduced, i.e. if y is an element of the ring $h_T(F[x])$ such that $y^N = 0$ for some positive integer N, then y = 0.

[Note. An element x ∈ R such that $x^N = 0$ for some positive integer N is called nilpotent. The set of all nilpotent elements in a commutative ring R is an ideal of R, called the radical of R. A commutative ring R is reduced if its radical is {0}; equivalently the only nilpotent element in R is 0.]

(iv) We say that the linear operator T is semi-simple if the minimal polynomial of T is separable. It is clear that if T is semi-simple then T is reduced. If the field F is of characteristic 0, then being semi-simple and being reduced are equivalent.

(v) An element λ ∈ F is an eigenvalue of T if and only if (x − λ) divides the minimal polynomial of T; equivalently (x − λ) is one of the irreducible polynomials $f_i(x)$ in the statement of 6.12. The T-invariant vector subspace $V[(x - \lambda)^\infty]$ is called the generalized λ-eigenspace of T; it contains the eigenspace Ker(T − λ Id_V) of T.

§7. Tensor product of vector spaces

We fix a field F throughout this section.

(7.1) Definition Let $V_1, \dots, V_m, W$ be F-vector spaces. A map $T : V_1 \times \cdots \times V_m \to W$ is multilinear over F if

\[ T(v_1, \dots, a v_i + b v_i', v_{i+1}, \dots, v_m) = a\, T(v_1, \dots, v_i, \dots, v_m) + b\, T(v_1, \dots, v_i', \dots, v_m) \]

for all $v_1 \in V_1$, $v_2 \in V_2$, . . . , $v_i, v_i' \in V_i$, . . . , $v_m \in V_m$, all a, b ∈ F and all i = 1, . . . , m. The map T is said to be bilinear (resp. trilinear) if m = 2 (resp. m = 3).

(7.2) Definition Let V, W be vector spaces over F. Denote by $V^{\times n}$ the product of n copies of V.

(1) A map $S : V^{\times n} \to W$ is symmetric if $S(v_{\sigma(1)}, \dots, v_{\sigma(n)}) = S(v_1, \dots, v_n)$ for all lists of vectors $v_1, \dots, v_n \in V$ and all permutations $\sigma \in S_n$.

(2) A map $A : V^{\times n} \to W$ is alternating if $A(v_1, \dots, v_n) = 0$ for all lists of vectors $v_1, \dots, v_n \in V$ such that $v_i = v_j$ for some i ≠ j with 1 ≤ i, j ≤ n. When A is multilinear, this condition implies that $A(v_1, \dots, v_n) = \mathrm{sgn}(\sigma)\, A(v_{\sigma(1)}, \dots, v_{\sigma(n)})$ for all lists of vectors $v_1, \dots, v_n \in V$ and all permutations $\sigma \in S_n$.

(7.3) Definition Let $V_1, \dots, V_m$ be vector spaces over F. A tensor product of $V_1, \dots, V_m$ is an F-multilinear map $\alpha : V_1 \times \cdots \times V_m \to U$ which satisfies the following universal property: For any multilinear map $T : V_1 \times \cdots \times V_m \to W$, there exists a unique F-linear map f : U → W such that T = f ◦ α.

The above universal property implies that the tensor product, if it exists, is unique up to unique isomorphism. In other words, if $\beta : V_1 \times \cdots \times V_m \to U'$ is another tensor product of $V_1, \dots, V_m$, then there exists a unique isomorphism $\delta : U \xrightarrow{\ \sim\ } U'$ with β = δ ◦ α. A "general nonsense construction" shows that a tensor product exists; write $\alpha : V_1 \times \cdots \times V_m \to V_1 \otimes \cdots \otimes V_m$ for a/the tensor product of $V_1, \dots, V_m$. Denote by $v_1 \otimes \cdots \otimes v_m$ the image under α of an element $(v_1, \dots, v_m) \in V_1 \times \cdots \times V_m$.

(7.4) Remark Because the double dual of a finite dimensional vector space over F is canonically isomorphic to the vector space itself, if $V_1, \dots, V_m$ are finite dimensional vector spaces over F, then $V_1 \otimes \cdots \otimes V_m$ is naturally isomorphic to the F-linear dual

\[ \mathrm{Hom}_F\big(\mathrm{ML}(V_1 \times \cdots \times V_m, F), F\big) \]

of the space $\mathrm{ML}(V_1 \times \cdots \times V_m, F)$ of all multilinear maps from $V_1 \times \cdots \times V_m$ to F.

[Exercise: Describe the natural map from $V_1 \times \cdots \times V_m$ to $\mathrm{Hom}_F(\mathrm{ML}(V_1 \times \cdots \times V_m, F), F)$.]

(7.5) Lemma Let V be a vector space over F, n ∈ N. Let $\alpha : V^{\times n} \to V^{\otimes n}$ be the tensor product of n copies of V.

(1) There exists a symmetric multilinear map $\beta : V^{\times n} \to S^n V$ with the following universal property: For any symmetric multilinear map $S : V^{\times n} \to W$, there exists a unique F-linear map $f : S^n V \to W$ such that S = f ◦ β.

(2) There exists an alternating multilinear map $\gamma : V^{\times n} \to \Lambda^n V$ with the following universal property: For any alternating multilinear map $A : V^{\times n} \to W$, there exists a unique F-linear map $g : \Lambda^n V \to W$ such that A = g ◦ γ.

(3) The F-linear map $\pi_1 : V^{\otimes n} \to S^n V$ such that $\beta = \pi_1 \circ \alpha$ is surjective. (So the symmetric product $S^n V$ is naturally a quotient of $V^{\otimes n}$.) Denote by $v_1 \cdot v_2 \cdot \ldots \cdot v_n$ the element $\beta(v_1, \dots, v_n) \in S^n V$.

(4) The F-linear map $\pi_2 : V^{\otimes n} \to \Lambda^n V$ such that $\gamma = \pi_2 \circ \alpha$ is surjective. (So the exterior product $\Lambda^n V$ is naturally a quotient of $V^{\otimes n}$.) Write $v_1 \wedge \cdots \wedge v_n$ for the element $\gamma(v_1, \dots, v_n) \in \Lambda^n V$.

By convention, $S^0 V = F = \Lambda^0 V$.

(7.6) Lemma Let U, V, W be vector spaces over F . We have natural isomorphisms.

• U ⊗ V ≅ V ⊗ U, under which an element u ⊗ v is mapped to v ⊗ u, ∀ (u, v) ∈ U × V.

• (U ⊕ V) ⊗ W ≅ (U ⊗ W) ⊕ (V ⊗ W), and U ⊗ (V ⊕ W) ≅ (U ⊗ V) ⊕ (U ⊗ W).

• (U ⊗ V) ⊗ W ≅ U ⊗ (V ⊗ W), and both are naturally isomorphic to U ⊗ V ⊗ W.

(7.7) Lemma Let V be a finite dimensional vector space, and let v1, . . . , vm be an F -basis of V .

(1) The set of vectors $v_{i_1} \otimes \cdots \otimes v_{i_n}$, where the index $(i_1, \dots, i_n)$ runs through all n-tuples in $\{1, 2, \dots, m\}^n$, is an F-basis of $V^{\otimes n}$. In particular $\dim_F(V^{\otimes n}) = m^n$.

(2) The set of vectors $v_{i_1} \wedge \cdots \wedge v_{i_n}$, where the index $(i_1, \dots, i_n)$ runs through all n-tuples in $\{1, 2, \dots, m\}^n$ with $i_1 < i_2 < \cdots < i_n$, is an F-basis of $\Lambda^n_F V$. In particular $\dim_F(\Lambda^n_F V) = \binom{m}{n}$.

(3) The set of vectors $v_{i_1} \cdot v_{i_2} \cdot \ldots \cdot v_{i_n}$, where the index $(i_1, \dots, i_n)$ runs through all n-tuples in $\{1, 2, \dots, m\}^n$ with $i_1 \le i_2 \le \cdots \le i_n$, is an F-basis of $S^n_F V$. In particular we have $\dim_F(S^n_F V) = \binom{m+n-1}{m-1} = \binom{m+n-1}{n}$.
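The three dimension formulas in 7.7 can be checked by enumerating the index tuples directly. The Python sketch below counts all n-tuples, the strictly increasing ones, and the weakly increasing ones, using the standard library `itertools`; the function name `basis_index_counts` is a hypothetical name for this illustration.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb


def basis_index_counts(m, n):
    """Count index tuples for the bases of V^{(x)n}, Lambda^n V and S^n V."""
    indices = range(1, m + 1)
    tensor = sum(1 for _ in product(indices, repeat=n))        # all n-tuples
    exterior = sum(1 for _ in combinations(indices, n))        # i_1 < ... < i_n
    symmetric = sum(1 for _ in combinations_with_replacement(indices, n))
    return tensor, exterior, symmetric


m, n = 3, 2
print(basis_index_counts(m, n))
print((m ** n, comb(m, n), comb(m + n - 1, n)))
```

For m = 3 and n = 2 both lines give the triple (9, 3, 6), and 9 = 3 + 6 matches the decomposition of V ⊗ V into its skew-symmetric and symmetric parts when 2 is invertible in F.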

(7.8) Lemma (1) For i = 1, . . . , n let $T_i : V_i \to W_i$ be an F-linear map. Then there is a unique F-linear map

\[ T_1 \otimes \cdots \otimes T_n : V_1 \otimes \cdots \otimes V_n \longrightarrow W_1 \otimes \cdots \otimes W_n \]

such that $(T_1 \otimes \cdots \otimes T_n)(v_1 \otimes \cdots \otimes v_n) = T_1(v_1) \otimes \cdots \otimes T_n(v_n)$ for all $(v_1, \dots, v_n) \in V_1 \times \cdots \times V_n$.

(2) Let T : V → W be an F-linear map. Then there are F-linear maps $S^n T : S^n V \to S^n W$ and $\Lambda^n T : \Lambda^n V \to \Lambda^n W$ characterized by

\[ S^n T(v_1 \cdot \ldots \cdot v_n) = T(v_1) \cdot \ldots \cdot T(v_n) \quad\text{and}\quad \Lambda^n T(v_1 \wedge \cdots \wedge v_n) = T(v_1) \wedge \cdots \wedge T(v_n) \]

for all $(v_1, \dots, v_n) \in V^{\times n}$.

(7.9) Lemma (1) In the situation of 7.8 (1), we have

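\[ \mathrm{Tr}(T_1 \otimes \cdots \otimes T_n) = \mathrm{Tr}(T_1) \cdot \ldots \cdot \mathrm{Tr}(T_n) . \]

(2) Notation as in 7.8 (2). Assume that the characteristic polynomial of T splits into a product of linear factors in F[x]. Let $a_1, \dots, a_m$ be the eigenvalues of T, listed with multiplicity. Then

\[ \mathrm{Tr}(S^n T) = \sum_{1\le i_1\le i_2\le\cdots\le i_n\le m}\ \prod_{j=1}^{n} a_{i_j} , \qquad \mathrm{Tr}(\Lambda^n T) = \sum_{1\le i_1< i_2<\cdots< i_n\le m}\ \prod_{j=1}^{n} a_{i_j} . \]

The symmetric group $S_n$ acts on $V^{\otimes n}$ by permuting the tensor factors:

\[ \sigma(v_1 \otimes \cdots \otimes v_n) = v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(n)} \qquad \forall \sigma \in S_n,\ \forall v_1, \dots, v_n \in V. \]

Let $\mathrm{Sym}(V^{\otimes n}) \subset V^{\otimes n}$ (resp. $\mathrm{Skew}(V^{\otimes n}) \subset V^{\otimes n}$) be the set of all symmetric (resp. skew symmetric) tensors in $V^{\otimes n}$, consisting of all elements $x \in V^{\otimes n}$ such that σ(x) = x (resp. σ(x) = sgn(σ) · x) for all $\sigma \in S_n$.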

(1) The projections $\pi_1$ and $\pi_2$ induce isomorphisms $\mathrm{Sym}(V^{\otimes n}) \xrightarrow{\ \sim\ } S^n V$ and $\mathrm{Skew}(V^{\otimes n}) \xrightarrow{\ \sim\ } \Lambda^n V$.

(2) Both $\mathrm{Sym}(V^{\otimes n})$ and $\mathrm{Skew}(V^{\otimes n})$ are stable under the action of $T^{\otimes n}$ on $V^{\otimes n}$ for all T ∈ End_F(V).

Remark When n = 2 and $2 \in F^\times$, we have $V \otimes V = \mathrm{Sym}(V^{\otimes 2}) \oplus \mathrm{Skew}(V^{\otimes 2})$. This is no longer true for n ≥ 3: When n ≥ 3, $S_n$ has irreducible representations of dimension at least two, and $V^{\otimes n}$ decomposes into a direct sum of linear subspaces corresponding to various symmetry patterns for $S_n$; each of the direct summands is stable under the action of $T^{\otimes n}$ for every linear operator T ∈ End_F(V). The number of these direct summands is the number of irreducible representations of $S_n$, which is equal to the number of partitions of the integer n (e.g. 4 = 3 + 1 = 2 + 2 = 2 + 1 + 1 = 1 + 1 + 1 + 1 gives 5 ways to partition 4 into a sum of positive integers).
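The count of partitions mentioned in the remark is easy to reproduce. The following Python sketch lists the partitions of n as non-increasing tuples of positive integers; the generator name `partitions` is a hypothetical choice for this example.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


for p in partitions(4):
    print(p)
print(len(list(partitions(4))))
```

For n = 4 this prints the five partitions (4), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1), in agreement with the number of irreducible representations of $S_4$.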

§8. Linear representation of finite groups

We recall the definition of linear representations; see 6.2.

(8.1) Definition Let G be a group and let F be a field.

(1) A linear representation of G on a vector space V over F is a group homomorphism ρ : G → GL_F(V), or equivalently an F-linear left action of G on V. If G is a finite group, the above is also equivalent to a left F[G]-module structure on V.

(2) A subrepresentation of a linear representation (V, ρ) of G on an F-vector space V is a vector subspace W of V which is invariant under G; i.e. ρ(g)(w) ∈ W for all g ∈ G and all w ∈ W.

(3) Let W be a subrepresentation of (V, ρ) as in (2) above. The quotient representation of V by W is the homomorphism $\bar\rho : G \to \mathrm{GL}_F(V/W)$ induced by ρ, i.e.

\[ \bar\rho(g)(v + W) = \rho(g)(v) + W \qquad \forall g \in G,\ \forall v \in V. \]

We often suppress the symbol ρ and abbreviate ρ(g)(v) to g·v if it does not lead to confusion.

(8.2) Examples.

(i) The trivial F-linear representation of G is the representation $(F, 1_G)$, where

\[ 1_G : G \to \mathrm{GL}_F(F) \cong F^\times \]

is the trivial group homomorphism. More generally a representation (V, ρ) is said to be trivial if the group homomorphism ρ : G → GL_F(V) is the trivial homomorphism.

(ii) Suppose that G is a finite group. The regular representation is the F-linear representation whose underlying F-vector space is the group ring F[G], with the left F[G]-module structure given by the product law in the ring F[G]. In other words

\[ \rho(g)\Big( \sum_{x\in G} a_x\,[x] \Big) = \sum_{x\in G} a_x\,[gx] = \sum_{y\in G} a_{g^{-1}y}\,[y] \]

for every g ∈ G and every element $\sum_{x\in G} a_x [x] \in F[G]$.

(8.3) Definition Let F be a field and let (V, ρV ), (W, ρW ) be F -linear representations of a group G.

(i) An F-linear map T : V → W is G-equivariant if $T(\rho_V(g)(v)) = \rho_W(g)(T(v))$ for all g ∈ G and all v ∈ V. A G-equivariant linear transformation is also called an intertwining operator; if G is finite it is the same as an F[G]-module homomorphism.

(ii) The kernel of an equivariant F-linear map T : V → W is the subrepresentation Ker(T) consisting of all elements v ∈ V such that T(v) = 0.

(iii) Two F-linear representations V, W of G are isomorphic if there exist G-equivariant F-linear maps α : V → W and β : W → V such that α ◦ β = Id_W and β ◦ α = Id_V.

(8.4) Definition Let F be a field. A non-zero F-linear representation $(V, \rho_V)$ of a group G is irreducible if V and {0} are the only subrepresentations of V.

If G is finite, then every irreducible F-linear representation V of G is finite dimensional: for any non-zero element v of V, the linear span of the finite set {x·v | x ∈ G} is a non-zero subrepresentation of V, hence equal to V.

(8.5) Lemma Let (V, ρ) be a linear representation of a finite group G over a field F, and let W be a subrepresentation of V. Suppose that Card(G)·1 ≠ 0 in F, i.e. either char(F) = 0, or char(F) = p > 0 and Card(G) ≢ 0 (mod p).

(i) There exists an equivariant F-linear map π : V → W such that π(w) = w for all w ∈ W. Note that π ◦ π = π.

(ii) V is the direct sum of the two subrepresentations W and Ker(π).

Under the assumption on F in 8.5, every finite dimensional F-linear representation of a finite group G is isomorphic to the direct sum of a finite number of irreducible representations.

Construction of a map π satisfying the requirements in (i): Pick an F-linear transformation h : V → W such that h(w) = w for all w ∈ W; there are plenty of such. Define π : V → W by

\[ \pi(v) := \mathrm{Card}(G)^{-1} \cdot \sum_{x\in G} \rho(x)\Big( h\big( \rho(x^{-1})(v) \big) \Big) . \]

[The idea is to average the linear transformation h : V → W over G to produce a G-equivariant one; this averaging process only changes the effect of the linear transformation outside W.]
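The averaging construction can be tried out on a toy case. In the Python sketch below (an illustration under stated assumptions, not part of the notes), G = Z/2 acts on V = Q^2 by swapping the two coordinates, W is the line spanned by (1, 1), and h(x, y) = (x, x) is a linear map onto W that fixes W but is not equivariant. The names `mat_mul` and `average_over_group` are hypothetical.

```python
from fractions import Fraction


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def average_over_group(h, rho, inverse):
    """Return Card(G)^{-1} * sum over x of rho(x) h rho(x^{-1}).

    rho maps each group element to its matrix; inverse maps x to x^{-1}.
    """
    n = len(h)
    total = [[0] * n for _ in range(n)]
    for x in rho:
        term = mat_mul(mat_mul(rho[x], h), rho[inverse[x]])
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return [[Fraction(entry, len(rho)) for entry in row] for row in total]


identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]
rho = {0: identity, 1: swap}   # G = Z/2 acting on Q^2 by swapping coordinates
inverse = {0: 0, 1: 1}         # every element of Z/2 is its own inverse
h = [[1, 0], [1, 0]]           # h(x, y) = (x, x): fixes W, not equivariant

pi = average_over_group(h, rho, inverse)
print(pi)
print(mat_mul(pi, pi) == pi, mat_mul(pi, swap) == mat_mul(swap, pi))
```

Here the average is π(x, y) = ((x + y)/2, (x + y)/2), which is an equivariant projection onto W; its kernel, the line spanned by (1, −1), is the complementary subrepresentation from (ii).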

(8.6) Proposition (Schur's lemma) Let F be a field, let G be a finite group and let $(V, \rho_V)$ and $(W, \rho_W)$ be irreducible F-linear representations. Let T : V → W be a G-equivariant F-linear map.

(i) If the representations V and W are not isomorphic, then T = 0.

(ii) If V = W and F is algebraically closed, then T = a · IdV for some a ∈ F .

In the rest of this section we assume that G is finite and the base field F is C.

(8.7) Definition (1) Let (V, ρ) be a finite dimensional linear representation of a group G over C. The character of ρ is the function $\chi_\rho : G \to \mathbb{C}$ on G defined by

χρ(x) = Tr(ρ(x)) ∀x ∈ G.

(2) A function f : G → C is a class function if $f(x \cdot y \cdot x^{-1}) = f(y)$ for all x, y ∈ G. Let $G^\natural$ be the set of all conjugacy classes of G and let $\pi : G \to G^\natural$ be the natural projection; then a class function on G is a function of the form ν ◦ π, where $\nu : G^\natural \to \mathbb{C}$.

Note that $\chi_\rho(x)$ is the sum of a finite number of roots of 1, namely the eigenvalues of ρ(x), and the complex conjugates of these eigenvalues are the eigenvalues of $\rho(x^{-1})$, hence the complex conjugate $\chi_\rho(x)^*$ of $\chi_\rho(x)$ is equal to $\chi_\rho(x^{-1})$.

Examples.

(a) The character of the trivial representation of G on C is the function on G with constant value 1 ∈ C.

(b) The character $\chi_{\mathrm{reg}}$ of the regular representation of G is

\[ \chi_{\mathrm{reg}}(x) = \begin{cases} \mathrm{Card}(G) & \text{if } x = e_G \\ 0 & \text{if } x \neq e_G . \end{cases} \]
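The formula for the character of the regular representation can be verified directly on a small group. The Python sketch below builds the permutation matrix of ρ(g) on the basis {[x]} of the group ring for the cyclic group Z/4 (written additively, an assumption made only to keep the example short) and takes traces; the names `regular_matrix`, `trace` and `add_mod_4` are hypothetical.

```python
def regular_matrix(elements, mul, g):
    """Matrix of rho(g) on the basis {[x]}: it sends [x] to [g x]."""
    index = {x: i for i, x in enumerate(elements)}
    n = len(elements)
    M = [[0] * n for _ in range(n)]
    for x in elements:
        M[index[mul(g, x)]][index[x]] = 1
    return M


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def add_mod_4(a, b):
    """Group law of Z/4, written additively."""
    return (a + b) % 4


elements = [0, 1, 2, 3]
chi_reg = [trace(regular_matrix(elements, add_mod_4, g)) for g in elements]
print(chi_reg)
```

The trace counts the basis vectors [x] with gx = x, which is Card(G) for the identity element and 0 otherwise.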

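(8.8) Definition For any two functions φ, ψ : G → C, put

\[ (\phi|\psi) := \frac{1}{\mathrm{Card}(G)} \sum_{x\in G} \phi(x)\,\psi(x)^* , \]

where $\psi(x)^*$ is the complex conjugate of ψ(x), and

\[ \langle \phi|\psi \rangle := \frac{1}{\mathrm{Card}(G)} \sum_{x\in G} \phi(x)\,\psi(x^{-1}) . \]

Note that $(\phi|\psi) = (\psi|\phi)^*$ and $\langle\phi|\psi\rangle = \langle\psi|\phi\rangle$. If ψ is the character of a finite dimensional representation of G, then $\psi(x)^* = \psi(x^{-1})$ for all x ∈ G and $(\phi|\psi) = \langle\phi|\psi\rangle$.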

(8.9) Proposition (i) If χ is the character of an irreducible complex representation, then (χ|χ) = 1.

(ii) If $\chi_1$ and $\chi_2$ are the characters of two non-isomorphic complex irreducible representations, then $(\chi_1|\chi_2) = 0$.

(8.10) Proposition The regular representation C[G] of G decomposes into a direct sum of irreducible complex representations of G:

\[ \mathbb{C}[G] \cong \bigoplus_{i=1}^{h} (V_i, \rho_i)^{\oplus n_i} \]

where $(V_1, \rho_1), \dots, (V_h, \rho_h)$ are mutually non-isomorphic irreducible complex representations of G, and $n_1, \dots, n_h$ are positive integers.

(1) ni = dim(Vi) for each i = 1, . . . , h.

(2) $\sum_{i=1}^{h} n_i^2 = \mathrm{Card}(G)$.

(3) Every irreducible complex representation of G is isomorphic to (Vi, ρi) for some i with 1 ≤ i ≤ h.

(4) The characters $\chi_1, \dots, \chi_h$ of the irreducible representations $(V_1, \rho_1), \dots, (V_h, \rho_h)$ form an orthonormal basis of the C-vector space H of all class functions on G, where the inner product on H is given by (φ, ψ) ↦ (φ|ψ).

(5) In particular the number h of non-isomorphic irreducible representations is equal to $\mathrm{Card}(G^\natural)$, the number of conjugacy classes in G.

(8.11) Character table. Let G be a finite group, let h be the number of conjugacy classes of G, let $(\mathbb{C}, 1_G) = (V_1, \rho_1), \dots, (V_h, \rho_h)$ be a set of representatives of isomorphism classes of irreducible complex linear representations of G, and let $\chi_1, \dots, \chi_h$ be their characters. Let $\{e_G\} = C_1, \dots, C_h$ be the conjugacy classes of G. Pick elements $x_i \in C_i$ for i = 1, . . . , h, and let $c_i = \mathrm{Card}(C_i)$.

The character table is the h × h matrix, whose rows are indexed by the irreducible characters of G and whose columns are indexed by the conjugacy classes of G, such that its (i, α)-entry is $\chi_i(x_\alpha)$. Every entry in the first row of the character table is 1 because $\chi_1$ is the trivial character. The entries in the first column of the character table are the dimensions of the irreducible representations $(V_i, \rho_i)$, also called the degrees of the irreducible characters $\chi_i$.

The orthogonality relations of characters are expressed in the following orthogonality relations of the character table.

\[ \text{(1)}\quad \sum_{\alpha=1}^{h} c_\alpha\, \chi_i(x_\alpha)\, \chi_j(x_\alpha)^* = \begin{cases} \mathrm{Card}(G) & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]

\[ \text{(2)}\quad \sum_{i=1}^{h} \chi_i(x_\alpha)\, \chi_i(x_\beta)^* = \begin{cases} \dfrac{\mathrm{Card}(G)}{c_\alpha} & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \neq \beta \end{cases} \]
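Both orthogonality relations can be checked on the character table of the symmetric group $S_3$, whose conjugacy classes (identity, transpositions, 3-cycles) have sizes 1, 3, 2. The Python sketch below is only an illustration; the variable and function names are hypothetical, and complex conjugation is omitted because all entries of this particular table are real.

```python
# Character table of S_3: rows = irreducible characters,
# columns = classes of the identity, the transpositions, the 3-cycles.
class_sizes = [1, 3, 2]
table = [
    [1, 1, 1],    # trivial character
    [1, -1, 1],   # sign character
    [2, 0, -1],   # character of the 2-dimensional irreducible representation
]
order = sum(class_sizes)  # Card(G) = 6


def row_product(i, j):
    """Left-hand side of relation (1): sum over classes of c_a chi_i chi_j."""
    return sum(c * table[i][a] * table[j][a] for a, c in enumerate(class_sizes))


def column_product(a, b):
    """Left-hand side of relation (2): sum over irreducible characters."""
    return sum(row[a] * row[b] for row in table)


print([[row_product(i, j) for j in range(3)] for i in range(3)])
print([[column_product(a, b) for b in range(3)] for a in range(3)])
```

The first matrix is 6 times the identity, and the second is diagonal with entries 6, 2, 3, that is Card(G)/c_α. The first column of the table also gives 1 + 1 + 4 = 6 = Card(G) for the sum of the squares of the degrees.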

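(8.12) Definition Let $(\rho_1, V_1)$ and $(\rho_2, V_2)$ be finite dimensional C-representations of finite groups $G_1$ and $G_2$ respectively. Define a C-representation

\[ \rho_1 \boxtimes \rho_2 : G_1 \times G_2 \longrightarrow \mathrm{GL}(V_1 \otimes V_2) \quad\text{by}\quad (\rho_1 \boxtimes \rho_2)(x_1, x_2) = \rho_1(x_1) \otimes \rho_2(x_2); \]

we call it the external tensor product of $(\rho_1, V_1)$ and $(\rho_2, V_2)$. Its character is

\[ \chi_{\rho_1 \boxtimes \rho_2}(x_1, x_2) = \chi_{\rho_1}(x_1) \cdot \chi_{\rho_2}(x_2) \qquad \forall (x_1, x_2) \in G_1 \times G_2 . \]

If $G_1 = G_2 = G$, the restriction of $\rho_1 \boxtimes \rho_2$ to the diagonal subgroup $G \cong \Delta_G \subset G \times G$ is called the internal tensor product of $\rho_1$ and $\rho_2$, denoted by $\rho_1 \otimes \rho_2$; its character is

\[ \chi_{\rho_1 \otimes \rho_2}(x) = \chi_{\rho_1}(x) \cdot \chi_{\rho_2}(x) \qquad \forall x \in G. \]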

(8.13) Proposition Let G1,G2 be finite groups.

(i) Let $(\rho_1, V_1)$ and $(\rho_2, V_2)$ be irreducible C-representations of $G_1$ and $G_2$ respectively. Then $(\rho_1 \boxtimes \rho_2, V_1 \otimes V_2)$ is an irreducible C-representation of $G_1 \times G_2$.

(ii) Every irreducible C-representation of $G_1 \times G_2$ is isomorphic to $(\rho_1 \boxtimes \rho_2, V_1 \otimes V_2)$ for an irreducible C-representation $(\rho_1, V_1)$ of $G_1$ and an irreducible C-representation $(\rho_2, V_2)$ of $G_2$.

Here is a useful fact about the dimension of irreducible complex representations of a finite group. The degree of every complex irreducible representation of G divides [G : Z(G)], where Z(G) is the center of the finite group G.

(8.14) Definition Let G be a finite group, let H be a subgroup of G, and let (W, θ) be a finite dimensional complex representation of H. Let

\[ V = \mathrm{Ind}_H^G(W) := \{ f : G \to W \mid f(hx) = \theta(h)(f(x)) \ \ \forall h \in H,\ \forall x \in G \} . \]

Let ρ : G → GL_C(V) be the linear left action of G on V defined by

\[ (\rho(y)(f))(x) = f(xy) \qquad \forall x, y \in G. \]

We say that the representation $(\mathrm{Ind}_H^G(W), \rho)$ of G is induced by the representation (W, θ) of the subgroup H.

The character $\chi_\rho$ of the representation $(\mathrm{Ind}_H^G(W), \rho)$ of G induced by the representation (W, θ) of the subgroup H is given by the following formula in terms of the character $\chi_\theta$ of (W, θ):

\[ \chi_\rho(x) = \frac{1}{\mathrm{Card}(H)} \sum_{\substack{s\in G \\ sxs^{-1}\in H}} \chi_\theta(sxs^{-1}) \qquad \forall x \in G. \]

More explicitly, let $C_x$ be the conjugacy class of x in G, and write $C_x \cap H$ as a disjoint union

\[ C_x \cap H = C_1 \sqcup \cdots \sqcup C_m , \]

where each $C_i$ is a conjugacy class in H. Then we have

\[ \chi_\rho(x) = \frac{\mathrm{Card}(G)}{\mathrm{Card}(H) \cdot \mathrm{Card}(C_x)} \sum_{i=1}^{m} \mathrm{Card}(C_i) \cdot \chi_\theta(x_i) , \]

where $x_i$ is an element of $C_i$ for each i = 1, . . . , m.
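The first formula for the induced character can be evaluated by brute force on a small group. In the Python sketch below, G is the symmetric group $S_3$ realized as permutations of {0, 1, 2} (stored as tuples of images), H is the subgroup of order 2 generated by the transposition exchanging 0 and 1, and the character of H is passed as a dictionary. All names (`compose`, `inverse`, `induced_character`) are hypothetical helpers for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def induced_character(G, chi_theta, x):
    """Card(H)^{-1} * sum of chi_theta(s x s^{-1}) over s in G with s x s^{-1} in H.

    chi_theta is a dict whose keys are exactly the elements of H.
    """
    total = 0
    for s in G:
        y = compose(compose(s, x), inverse(s))
        if y in chi_theta:
            total += chi_theta[y]
    return Fraction(total, len(chi_theta))


G = list(permutations(range(3)))
e, t, c = (0, 1, 2), (1, 0, 2), (1, 2, 0)  # identity, a transposition, a 3-cycle
trivial_on_H = {e: 1, t: 1}
sign_on_H = {e: 1, t: -1}

print([induced_character(G, trivial_on_H, x) for x in (e, t, c)])
print([induced_character(G, sign_on_H, x) for x in (e, t, c)])
```

The two induced characters take the values (3, 1, 0) and (3, −1, 0) on the identity, the transpositions and the 3-cycles. Each is the sum of a one-dimensional character of $S_3$ (trivial, resp. sign) and the character (2, 0, −1) of the 2-dimensional irreducible representation.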

(8.15) Further properties of induced representations.

(i) Frobenius reciprocity: If f is a class function on G, then

\[ (f|\chi_\rho)_G = (f_H|\chi_\theta)_H , \]

where fH is the restriction to H of the class function f on G, and the inner product is calculated on G and H respectively.

(ii) Mackey's criterion: The complex representation $(\mathrm{Ind}_H^G(W), \rho)$ of G is irreducible if and only if (W, θ) is irreducible and for every element s ∈ G ∖ H, the two representations $\theta|_{sHs^{-1}\cap H}$ and $\theta^s$ of $sHs^{-1} \cap H$ are disjoint (i.e. do not contain any irreducible representation in common). Here $\theta|_{sHs^{-1}\cap H}$ is the restriction to $sHs^{-1} \cap H$ of the representation θ of H, and $\theta^s$ is given by

\[ \theta^s(x) := \theta(s^{-1} x s) \qquad \forall x \in H \cap sHs^{-1} . \]

(iii) Artin’s theorem: Every character of a finite group G is a Q-linear combination of characters of representations induced from cyclic subgroups of G.

(iv) Brauer's theorem: Every character of a finite group G is a Z-linear combination of characters of representations induced from subgroups H ⊆ G such that H is isomorphic to the product of a cyclic group with a p-group for some prime number p.
