Recall the Definition of an Affine Transformation

Recall the definition of an affine transformation:

Definition: Let $A_1 \subset \mathbb{R}^n$ and $A_2 \subset \mathbb{R}^m$ be two affine sets. Let $f : A_1 \to A_2$ be a function.

$f$ is an affine transformation if and only if, given any subset $P_1, \dots, P_r$ of $A_1$ and any constants $a_1, \dots, a_r \in \mathbb{R}$ for which $\sum_{i=1}^{r} a_i = 1$, then if $P = \sum_{i=1}^{r} a_i P_i$ we have

$$f(P) = a_1 f(P_1) + \cdots + a_r f(P_r).$$
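To see the definition at work, here is a minimal sketch in Python that checks the defining identity for one affine combination. The map `f`, the points and the coefficients are all hypothetical choices made for illustration (they are not taken from the lecture), and exact fractions are used so that the comparison is not disturbed by rounding.

```python
from fractions import Fraction as F

def affine_combination(coeffs, points):
    """Return sum_i a_i P_i, insisting that the coefficients sum to 1."""
    assert sum(coeffs) == 1, "coefficients of an affine combination must sum to 1"
    dim = len(points[0])
    return tuple(sum(a * p[k] for a, p in zip(coeffs, points)) for k in range(dim))

def f(p):
    """Hypothetical example map R^2 -> R^2: f(x, y) = (2x + y + 1, x - y + 3)."""
    x, y = p
    return (2 * x + y + 1, x - y + 3)

# Hypothetical points P_1, P_2, P_3 and coefficients with 1/2 + 3/2 - 1 = 1.
points = [(F(0), F(0)), (F(1), F(2)), (F(-3), F(5))]
coeffs = [F(1, 2), F(3, 2), F(-1)]

P = affine_combination(coeffs, points)
lhs = f(P)                                                # f(sum a_i P_i)
rhs = affine_combination(coeffs, [f(q) for q in points])  # sum a_i f(P_i)
print(lhs == rhs)
```

A single check like this does not prove that `f` is affine, of course; it only illustrates what the identity says. Note that the coefficients are allowed to be negative, as long as they sum to 1.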

Last time we also saw some simple examples of affine transformations and showed that one could compose affine transformations to get affine transformations. Although the definition of an affine transformation is quite different from the definition of a linear transformation, they share many similar properties. The following, for example, is a very important property of affine transformations which is similar to that of linear transformations.

Proposition: Let $f : A_1 \to A_2$ be an affine transformation of affine spaces and let $X = \{P_0, P_1, \dots, P_s\}$ be a set of points for which $A_1 = \mathrm{Aff}(X)$. Then $f$ is completely determined by its values at the points $P_0, P_1, \dots, P_s$.

Proof: This is quite easy to see since every point $P$ in $A_1$ can be written as

$$P = a_0 P_0 + a_1 P_1 + \cdots + a_s P_s \quad \text{with} \quad \sum_{i=0}^{s} a_i = 1.$$

Inasmuch as $P$ is written in the "proper" way, we can then say that

$$f(P) = \sum_{i=0}^{s} a_i f(P_i)$$

and that determines $f(P)$. $\square$

Thus, like linear transformations, affine transformations on finite dimensional affine sets are completely determined by a finite amount of information.
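The proposition can be sketched concretely as follows. This is only an illustration under assumed data: the three spanning points of $\mathbb{R}^2$ and the three prescribed values are hypothetical, and `barycentric` is a helper written for this particular choice of points. Once the values at $P_0, P_1, P_2$ are fixed, the value of $f$ at any other point is forced.

```python
from fractions import Fraction as F

# Hypothetical data: P0, P1, P2 affinely span R^2 ...
P0, P1, P2 = (F(0), F(0)), (F(1), F(0)), (F(0), F(1))
# ... and these are arbitrarily chosen values f(P0), f(P1), f(P2) in R^2.
values = [(F(1), F(3)), (F(3), F(4)), (F(2), F(2))]

def barycentric(p):
    """Coefficients (a0, a1, a2), summing to 1, with p = a0*P0 + a1*P1 + a2*P2.

    For this particular choice of P0, P1, P2 they are simply (1 - x - y, x, y).
    """
    x, y = p
    return [1 - x - y, x, y]

def f(p):
    """The only affine map taking P0, P1, P2 to the prescribed values."""
    a = barycentric(p)
    return tuple(sum(ai * q[k] for ai, q in zip(a, values)) for k in range(2))

print(f(P0) == values[0], f(P1) == values[1], f(P2) == values[2])
print(f((F(2), F(5))))
```

With these values the map works out to $(x, y) \mapsto (2x + y + 1,\; x - y + 3)$, so three points of data were enough to pin down the whole transformation.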

Definition: Let $f : A_1 \to A_2$ be an affine transformation. We say that $f$ is invertible (or, is an invertible transformation) if we can find an affine transformation

$$g : A_2 \to A_1$$

such that $g \circ f = 1_{A_1}$ and $f \circ g = 1_{A_2}$.

There is a very easy way to prove that $f$ is an invertible affine transformation. That is the content of the next proposition.

Proposition: Let $f : A_1 \to A_2$ be an affine transformation.

$f$ is an invertible affine transformation ⇔ $f$ is both 1-1 and onto.

Proof: ⇒: This is a standard fact about functions. The existence of $g$ forces $f$ to be both 1-1 and onto.

⇐: The fact that $f$ is both 1-1 and onto forces there to be a function $g$ for which $g \circ f = 1_{A_1}$ and $f \circ g = 1_{A_2}$. This is a standard fact about functions. The only thing we really have to prove is that this function, which we know exists, is an affine transformation. I.e. let $g = f^{-1}$ (the inverse function of $f$); then we need to show: given $Q_1, \dots, Q_r$ in $A_2$ and $b_1, \dots, b_r$ in $\mathbb{R}$ with $\sum_{i=1}^{r} b_i = 1$, then, if $Q = \sum_{i=1}^{r} b_i Q_i$, we have

$$g(Q) = \sum_{i=1}^{r} b_i g(Q_i).$$

Now, since $f$ is 1-1 and onto, we can find $P_1, \dots, P_r$ in $A_1$ such that $f(P_i) = Q_i$ (equivalently, $g(Q_i) = P_i$). Now let $P = \sum_{i=1}^{r} b_i P_i$. Then, since $f$ is an affine transformation we have

$$f(P) = \sum_{i=1}^{r} b_i f(P_i) = \sum_{i=1}^{r} b_i Q_i = Q.$$

Rewriting this last equation, but now in terms of $g$ rather than $f$, we get

$$P = g(Q) = \sum_{i=1}^{r} b_i P_i = \sum_{i=1}^{r} b_i g(Q_i)$$

which is what we wanted to show. $\square$

We want a description of all the affine transformations between affine subsets of $\mathbb{R}^n$, for all $n$. We start this project by beginning with the affine sets we know best, the vector spaces. I.e. we begin by trying to find all the affine transformations between two vector spaces.

The wonderful thing, which gets this particular piece of the classification going so well, is the fact that there is not a great deal of difference between affine transformations between vector spaces and linear transformations between vector spaces. This is the content of the following theorem.

Theorem: Let $V$ and $W$ be subspaces, respectively, of $\mathbb{R}^n$ and $\mathbb{R}^m$ and let $f : V \to W$ be an affine transformation. If $f(0) = 0$ then $f$ is a linear transformation.

Proof: In order to show that $f$ is a linear transformation we need to show two things:

for all $v_1, v_2$ in $V$ we have $f(v_1 + v_2) = f(v_1) + f(v_2)$; and

for all $v \in V$ and $\lambda \in \mathbb{R}$ we have $f(\lambda v) = \lambda f(v)$.

But notice that $1 v_1 + 1 v_2 - 1(v_1 + v_2) = 0$, and since $1 + 1 - 1 = 1$ and also since $f$ is an affine transformation, we have

$$1 f(v_1) + 1 f(v_2) - 1 f(v_1 + v_2) = f(0) = 0 \quad \text{(by hypothesis)},$$

i.e.

$$f(v_1) + f(v_2) = f(v_1 + v_2).$$

And this is the first property of a linear transformation we wanted to prove about $f$. In a similar fashion, but this time noticing that $1(\lambda v) - \lambda(v) + \lambda(0) = 0$ and the fact that, for any real number $\lambda$, we have $1 - \lambda + \lambda = 1$, we obtain (since $f$ is an affine transformation) that

$$f(\lambda v) - \lambda f(v) + \lambda f(0) = f(0).$$

Using the hypothesis that $f(0) = 0$ we get

$$f(\lambda v) - \lambda f(v) + \lambda 0 = 0, \quad \text{i.e.} \quad f(\lambda v) = \lambda f(v),$$

as we wanted to show. $\square$

Remark: Suppose that $f$ is any affine transformation between vector spaces, as above, and suppose that $f(0) = w$. Let $T_{-w} : W \to W$ be translation by the vector $-w$ on the vector space $W$, i.e. $T_{-w}(w_1) = -w + w_1$. We already showed that $T_{-w}$ is an affine transformation from $W$ to $W$ which is easily seen to be invertible (its inverse is the translation by $w$). Now consider $F = T_{-w} \circ f : V \to W$. Since both of these are affine transformations, so is their composition, i.e. $F$ is an affine transformation. Moreover,

$$F(0) = (T_{-w} \circ f)(0) = T_{-w}(f(0)) = T_{-w}(w) = -w + w = 0.$$

I.e. $F$ is an affine transformation which takes $0$ to $0$, so by the previous theorem $F$ is a linear transformation. We have proved the following theorem which tells us what all the affine transformations between two vector spaces look like.

Theorem: Let $V$ and $W$ be subspaces of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively. Let $f : V \to W$ be an affine transformation. Then we can always find a translation of $W$ (call it $T : W \to W$) such that $T \circ f$ is a linear transformation from $V$ to $W$.
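Here is a small sketch of this theorem in Python, under assumed data: the affine map `f` is a hypothetical example on $\mathbb{R}^2$ with $f(0) \neq 0$, and `F_lin` plays the role of $T_{-w} \circ f$ with $w = f(0)$. The checks are carried out on a few sample vectors only, so they illustrate the statement rather than prove it.

```python
from fractions import Fraction as F

def f(p):
    """Hypothetical affine map R^2 -> R^2 with f(0) != 0."""
    x, y = p
    return (2 * x + y + 1, x - y + 3)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    return tuple(c * a for a in u)

zero = (F(0), F(0))
w = f(zero)                        # w = f(0)

def T_minus_w(q):
    """Translation of W by the vector -w."""
    return add(scale(-1, w), q)

def F_lin(p):
    """The composition T_{-w} o f, which should be linear."""
    return T_minus_w(f(p))

v1, v2, lam = (F(1), F(2)), (F(-3), F(5)), F(7, 3)

print(F_lin(zero) == zero)                               # F_lin takes 0 to 0
print(F_lin(add(v1, v2)) == add(F_lin(v1), F_lin(v2)))   # additivity
print(F_lin(scale(lam, v1)) == scale(lam, F_lin(v1)))    # homogeneity
print(f(add(v1, v2)) == add(f(v1), f(v2)))               # f itself is not additive
```

The last comparison fails for `f` itself, which is the point: `f` only becomes linear after the translation by $-f(0)$ has been applied.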

(The following remarks are directed at those students who have seen the notion of a group.) If $A$ is an affine subset of $\mathbb{R}^n$, it is an easy fact that

$$\mathrm{Aff}(A) = \{ f : A \to A \mid f \text{ is an invertible affine transformation} \}$$

is a group under composition of functions. The identity element of this group is the identity affine transformation and the inverse of an invertible affine transformation is again an invertible affine transformation. We have just seen something about this group in the case when $A$ is a vector subspace of $\mathbb{R}^n$. We have seen that:

i) the invertible linear transformations are in this group;
ii) the translations are in this group;
iii) everything in this group is the product of an invertible linear transformation with a translation!

Let's state the exact nature of what we know in this case.

Theorem: Let $V$ be a subspace of $\mathbb{R}^n$ and let $\mathrm{Aff}(V)$ be the group of invertible affine transformations $f : V \to V$. Then

i) the translations of $V$, call the set of them $T(V)$, are a subgroup of $\mathrm{Aff}(V)$ isomorphic to $(V, +)$ as a group;
ii) the invertible linear transformations from $V$ to $V$, denoted $Gl(V)$, are a subgroup of $\mathrm{Aff}(V)$;
iii) $T(V)$ is a normal subgroup of $\mathrm{Aff}(V)$;
iv) in $\mathrm{Aff}(V)$ we have $Gl(V) \cap T(V) = \{\mathrm{id}\}$;
v) $\mathrm{Aff}(V) = T(V)\,Gl(V)$.

(In group theoretic terms, $\mathrm{Aff}(V)$ is the semi-direct product of $T(V)$ and $Gl(V)$.)

Proof: All the parts of this proof are relatively simple. One just has to keep track of the definitions. Let's first look at the translations of $V$: these form a subgroup basically because $V$ is a subspace. Why is this the case? Let $T_{v_1}$ and $T_{v_2}$ be two translations, the first by $v_1$ and the second by $v_2$; then

$$(T_{v_1} \circ T_{v_2})(v) = T_{v_1}(T_{v_2}(v)) = T_{v_1}(v_2 + v) = v_1 + (v_2 + v) = (v_1 + v_2) + v = T_{v_1 + v_2}(v),$$

i.e.

$$T_{v_1} \circ T_{v_2} = T_{v_1 + v_2}.$$

The identity element is translation by the $0$-vector and the inverse of the translation by $v$ is the translation by $-v$, and so we are done with showing that the translations form a subgroup of the invertible affine transformations of $V$. Finally we define a function $\varphi : V \to \mathrm{Aff}(V)$ which takes $v \mapsto T_v$, and it is easy to see that this is an isomorphism from $V$ to $T(V)$, where in the first group the operation is "addition of vectors", while in the second group it is "composition of functions".

As for the second part of the theorem, you saw last semester that the invertible linear transformations on a vector space form a group under composition, so I won't say anything about that, other than to notice that, in this case, it is a subgroup of another group.

The only new thing is to show that the subgroup $T(V)$ is a normal subgroup of $\mathrm{Aff}(V)$. To do that we have to show, if $g$ is an element of $\mathrm{Aff}(V)$, then

$$g\,T(V)\,g^{-1} \subseteq T(V).$$

But we have already shown that if $g \in \mathrm{Aff}(V)$ then $T_v \circ g = f$, where $T_v$ is a translation and $f$ is a linear transformation; $f$ is invertible because $T_v$ and $g$ both are, i.e. $f \in Gl(V)$. We can rewrite this equation as $g = T_{-v} \circ f$.

By general group theoretic considerations we then have

$$g^{-1} = f^{-1} \circ (T_{-v})^{-1}.$$

But the inverse of $T_{-v}$ is $T_v$, so $g^{-1} = f^{-1} \circ T_v$. But then, if $T_u \in T(V)$, we have

$$g \circ T_u \circ g^{-1} = (T_{-v} \circ f) \circ T_u \circ (f^{-1} \circ T_v).$$

It remains to prove that this composition is a translation of $V$. Since $f$ is linear, for every $x \in V$ we have

$$(f \circ T_u \circ f^{-1})(x) = f(u + f^{-1}(x)) = f(u) + x = T_{f(u)}(x),$$

so that

$$g \circ T_u \circ g^{-1} = T_{-v} \circ T_{f(u)} \circ T_v = T_{-v + f(u) + v} = T_{f(u)}.$$

So the resulting affine transformation is back again in $T(V)$ and we are done with the proof that $T(V)$ is a normal subgroup of $\mathrm{Aff}(V)$. By contrast, $Gl(V)$ is not normal in general: for $h \in Gl(V)$ the conjugate $T_v \circ h \circ T_{-v}$ takes $0$ to $v - h(v)$, which need not be $0$. I'll leave iv) and v) as simple exercises for the reader.
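The group operations in $\mathrm{Aff}(V)$ can be sketched computationally. The following is a minimal illustration for $V = \mathbb{R}^2$, under assumptions of my own choosing: an element is stored as a pair `(M, b)` standing for $x \mapsto Mx + b$, and the helper names (`apply`, `compose`, `inverse`, `translation`, `linear`) as well as the sample matrix and vectors are hypothetical. It checks the rule for composing translations, the decomposition of an element as a translation times a linear map, the formula for the inverse, and what conjugation by $g$ does to a translation.

```python
from fractions import Fraction as F

ZERO = (F(0), F(0))
IDENTITY = ((F(1), F(0)), (F(0), F(1)))

def apply(g, x):
    """Evaluate g = (M, b), standing for x -> M x + b, at the point x."""
    M, b = g
    return tuple(sum(M[i][j] * x[j] for j in range(2)) + b[i] for i in range(2))

def compose(g, h):
    """Return g o h. If g = (M, b) and h = (N, c) then g o h = (MN, Mc + b)."""
    M, _ = g
    N, c = h
    MN = tuple(
        tuple(sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )
    return (MN, apply(g, c))

def inverse(g):
    """Return g^{-1}, i.e. x -> M^{-1}(x - b); assumes det M != 0."""
    M, b = g
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    Minv = ((M[1][1] / det, -M[0][1] / det),
            (-M[1][0] / det, M[0][0] / det))
    return (Minv, apply((Minv, ZERO), (-b[0], -b[1])))

def translation(v):
    return (IDENTITY, v)

def linear(M):
    return (M, ZERO)

M = ((F(2), F(1)), (F(1), F(-1)))    # hypothetical invertible matrix, det = -3
b = (F(1), F(3))
g = (M, b)
u = (F(4), F(-1))
v1, v2 = (F(1), F(2)), (F(-3), F(5))

# T_{v1} o T_{v2} = T_{v1 + v2}
print(compose(translation(v1), translation(v2)) == translation((F(-2), F(7))))
# g is a translation composed with a linear map
print(g == compose(translation(b), linear(M)))
# g o g^{-1} is the identity
print(compose(g, inverse(g)) == translation(ZERO))
# Conjugating the translation T_u by g gives the translation by M u
conj = compose(compose(g, translation(u)), inverse(g))
print(conj == translation(apply(linear(M), u)))
```

All four comparisons come out true for this data. In the notation of the proof, `g` here equals $T_b \circ f$ with $f$ the linear map given by `M`, so the vector $v$ of the proof is $-b$.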

Remark: I don't want to go into a description of the semidirect product of two groups here; it is something you should see in your algebra course. I just want to say that it generalizes the notion of the direct product of two subgroups to the case of two subgroups which generate the whole group and have intersection only the identity, but for which only one is normal (in the direct product, the two subgroups which make up the group are both normal subgroups).

Of course, we would like to generalize what we have just done for vector subspaces of $\mathbb{R}^n$ to general affine subsets of $\mathbb{R}^n$, but we have some problems to overcome. The first thing we notice is that for an affine subset $A$ of $\mathbb{R}^n$, we cannot get translations by adding things in $A$ to vectors in $A$, since the result won't necessarily be in $A$. But we can add things in the direction $D(A)$ of $A$ (its giacitura)! We use that fact to define translations in general.

Definition: Let $A$ be an affine subset of $\mathbb{R}^n$ and let $v \in D(A)$. The function $T_v : A \to A$ defined by $T_v(w) = v + w$ (for $w \in A$) is called the translation of $A$ determined by $v$.

Observations:

i) The first thing to notice is that $T_v$ is actually a function from $A$ to $A$.
ii) The second important thing to notice is that $T_v$ is, in fact, an affine transformation of $A$. I will leave that proof to the student. The proof is essentially the same as the proof that I gave for the special case of this theorem in which $A = D(A)$, i.e. in which $A$ was a vector subspace of $\mathbb{R}^n$.
iii) The third important thing to notice is that $T_v$ is an invertible affine transformation, i.e. $T_v \in \mathrm{Aff}(A)$. This is easy to see because $T_{-v}$ is actually the inverse of $T_v$.
iv) The fourth important thing to notice is that $T(A) = \{T_v \mid v \in D(A)\}$ is a subgroup of $\mathrm{Aff}(A)$.
v) The final thing I want to mention about $T(A)$ is that as a group it is isomorphic to $D(A)$. The proof, again, is very similar to the one we gave earlier.
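A small sketch may make the role of $D(A)$ concrete. The affine subset used here, the line $x + y = 1$ in $\mathbb{R}^2$, is a hypothetical example chosen for illustration; its direction is the line $x + y = 0$ through the origin. Adding a vector of the direction keeps us inside $A$, whereas adding a point of $A$ does not.

```python
from fractions import Fraction as F

def in_A(p):
    """Hypothetical affine subset A of R^2: the line x + y = 1."""
    return p[0] + p[1] == 1

def in_DA(v):
    """Its direction D(A): the line x + y = 0 through the origin."""
    return v[0] + v[1] == 0

def translate(v, w):
    """T_v(w) = v + w."""
    return (v[0] + w[0], v[1] + w[1])

w = (F(3), F(-2))           # a point of A
v = (F(5, 2), F(-5, 2))     # a vector of D(A)
a = (F(0), F(1))            # another point of A, which is not in D(A)
minus_v = (-v[0], -v[1])

print(in_A(w), in_DA(v), in_A(a), in_DA(a))
print(in_A(translate(v, w)))                  # adding v in D(A) stays in A
print(in_A(translate(a, w)))                  # adding a point of A leaves A
print(translate(minus_v, translate(v, w)) == w)   # T_{-v} undoes T_v
```

The third line of output is the reason translations of $A$ have to be indexed by $D(A)$ rather than by $A$ itself, and the last line is observation iii) on one sample point.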