
Lecture 2: functions between sets

Saul Glasman September 9, 2016

I’ll begin by discussing some pieces of set theory notation that I left out last time. The most important, which I’m sorry to have neglected to mention, is the empty set ∅. This is the set with no elements. It could be written {}, but for whatever reason it isn’t; the convention is ∅. Note that saying “the set with no elements” really does specify exactly what I mean. If I say “the set with one element”, well, which element? There are infinitely many things that element could be. But for the set with no elements, there are really no choices. Another way of saying this is that ∅ is defined by the following property: for any x, x ∉ ∅.

A couple of more minor things. If X is a set, we’ll use the notation

{x ∈ X | x satisfies some property}

to pick out the subset of X consisting of those elements which have some property. For example, suppose X = {1, 2, }. Then {x ∈ X | x is a number} = {1, 2}. Finally, there’s the set difference operator “\”; X \ Y is the set of elements of X which are not elements of Y. For instance,

{1, 2}\{1} = {2}.
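For anyone who likes to experiment, this notation can be mirrored with Python's built-in sets. This is only an illustrative sketch: the set X and the property "is even" below are made-up examples, not ones from the lecture.

```python
# Hypothetical example set; any finite set would do.
X = {1, 2, 3, 4}

# {x ∈ X | x is even}: a set comprehension picks out a subset by a property.
evens = {x for x in X if x % 2 == 0}

# The set difference {1, 2} \ {1} is written with a minus sign in Python.
diff = {1, 2} - {1}

# The empty set has no elements, so "x in empty" is False for every x.
empty = set()
```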

Moreover, {1, 2}\{1, 3} = {2}, since you can’t take 3 out of a set it wasn’t a member of in the first place. Watch out: Munkres actually uses the minus sign for set difference, but most people nowadays use the backslash \.

I’m sorry to keep dumping notation on you, and I realize it might be a lot to keep in your head if you haven’t encountered it before; I’ll give reminders here and there when it comes up and you’ll be used to it before you know it.

Now we’re ready to start talking about functions. You’ve all encountered the idea of a function before, at least informally, so let’s now make that mathematically rigorous. A function f from a set X to a set Y - we write

f : X → Y

- is supposed to be a rule assigning to each element of X an element of Y. One way of writing that is as follows:

Definition 1. A function from X to Y is a subset

f ⊆ X × Y

such that each x ∈ X is the first component of exactly one element of f. If (x, y) ∈ f, we say f(x) = y.

If we want to encode the values of a function as pairs like that, the condition on f is exactly the condition we need - it guarantees that f(x) has exactly one value in Y, rather than no values or more than one value, which is what we expect from a function. (There’s some overcomplicated stuff about “rules of assignment” in the book - we won’t be bothering with that.) This is a good way of formalizing the idea of a function, but usually we won’t think of functions as sets of pairs this way - we’ll just think of them in the way you might have learned in high school, as machines that take in an element of X and spit out an element of Y.

If f : X → Y is a function, X will often be called the domain of f, and Y will often be called the codomain or sometimes the range of f (although this is old-fashioned). You might also catch me calling X the source and Y the target.

At this point I should probably give some examples of functions. Suppose

S = {1, 2}, T = {a, b, c}.

Let f0 = {(1, c), (2, b)} ⊆ S × T.

Then f0 is a function with f0(1) = c, f0(2) = b.

f1 = {(1, a), (2, a)} is also a function with f1(1) = f1(2) = a. However,

f2 = {(1, c), (1, a), (2, b)} is not a function because there are two pairs with first component 1.

f3 = {(1, c)} is also not a function because there’s no pair with first component 2.

The most fundamental example of a function is the identity function. If X is a set, then the identity function on X, written

idX : X → X

is the function that just leaves everything the same:

idX(x) = x for all x.

As a subset of X × X, idX = {(x, x) | x ∈ X}.

The most important thing we can do with functions is compose them. Suppose f : X → Y and g : Y → Z are functions. Then for any x ∈ X, f(x) is an element of Y, and we can apply g to it to get an element g(f(x)) of Z.

Definition 2. In this situation, the composition of f and g is the function

g ◦ f : X → Z with (g ◦ f)(x) = g(f(x)). Sometimes g ◦ f is simply written gf.
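For finite sets, the "set of pairs" definition of a function and the definition of composition can both be checked by machine. The following is a minimal sketch, not a standard library API: the helper names `is_function`, `value_at`, `compose` and `identity`, and the function `h`, are made up for illustration.

```python
def is_function(f, X, Y):
    """Check Definition 1: f is a subset of X × Y, and each x in X is the
    first component of exactly one pair in f."""
    if not all(x in X and y in Y for (x, y) in f):
        return False
    return all(sum(1 for (a, _) in f if a == x) == 1 for x in X)

def value_at(f, x):
    """Return the unique y with (x, y) in f (assumes f is a function)."""
    (y,) = [b for (a, b) in f if a == x]
    return y

def compose(g, f):
    """g ∘ f as a set of pairs: each x is sent to g(f(x))."""
    return {(x, value_at(g, y)) for (x, y) in f}

def identity(X):
    """The identity function on X as a set of pairs."""
    return {(x, x) for x in X}

# The examples from the lecture, with a, b, c represented as strings.
S = {1, 2}
T = {"a", "b", "c"}
f0 = {(1, "c"), (2, "b")}
f2 = {(1, "c"), (1, "a"), (2, "b")}   # two pairs with first component 1
f3 = {(1, "c")}                        # no pair with first component 2

# A made-up function h : T → {0, 1}, used only to illustrate composition.
h = {("a", 0), ("b", 1), ("c", 1)}
h_after_f0 = compose(h, f0)            # sends 1 to h(c) and 2 to h(b)
```

Here `is_function(f0, S, T)` holds while `is_function(f2, S, T)` and `is_function(f3, S, T)` fail, matching the discussion of those examples.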

Composing with the identity is special:

Lemma 3 (A lemma, in case you haven’t encountered the word before, is a small auxiliary result.). If f : X → Y is a function, then

f ◦ idX = idY ◦ f = f.

Proof. (f ◦ idX)(x) = f(idX(x)) = f(x) and (idY ◦ f)(x) = idY(f(x)) = f(x).

Now let’s start talking about different types of functions. Definition 4. Suppose f : X → Y is a function. We say f is injective if, when

x0 ≠ x1 ∈ X are two distinct elements, we have

f(x0) ≠ f(x1).

In other words, f doesn’t “collide” different elements of X. Of the examples above, f0 is injective, but f1 is not, because 1 ≠ 2 but f1(1) = f1(2) = a.

Definition 5. Suppose f : X → Y is a function. We say f is surjective if for every y ∈ Y, there is some x ∈ X with f(x) = y.

In other words, everything in Y is a value of f. The image of f is the set

im f = {f(x) | x ∈ X} ⊆ Y;

it’s the subset of elements of Y which are values of f. Another way of saying f is surjective is to say im f = Y.
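When a function between finite sets is stored as a set of pairs, injectivity, surjectivity and the image can all be computed directly. Here is a small sketch under that assumption; the helper names are hypothetical, and each input is assumed to already be a function.

```python
def image(f):
    """im f = {f(x) | x in X}, read off from the second components."""
    return {y for (_, y) in f}

def is_injective(f):
    """True when no two distinct inputs share an output.
    Assumes f is a function, so each input appears in exactly one pair."""
    outputs = [y for (_, y) in f]
    return len(outputs) == len(set(outputs))

def is_surjective(f, Y):
    """True when every element of the codomain Y is a value of f."""
    return image(f) == Y

# The lecture's examples, with a, b, c represented as strings.
T = {"a", "b", "c"}
f0 = {(1, "c"), (2, "b")}
f1 = {(1, "a"), (2, "a")}
```

Note that `is_surjective` needs the codomain as a separate argument: the set of pairs alone does not record which set Y the function is supposed to land in.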

The function f0 is not surjective, because a is not in the image of f0.

It’s appropriate here to bring in an example from the book, which is probably more like the examples of functions you’re used to from calculus. Define g : R → R by g(x) = x². Then (show of hands) g is not injective, because g(−1) = g(1) = 1. g is also not surjective, because −1 is not in the image of g; there is no real number whose square is −1.

Now let R≥0 be the non-negative real numbers. We can think of g as defined on R≥0 instead of all of R. This is something we can always do; we talk about the restriction of a function to a subset of the domain. We use the notation

g|R≥0

for the new function obtained this way. Then g|R≥0 is injective, because a real number has at most one non-negative square root. We don’t have the problem that g(−1) = g(1), because −1 isn’t in the domain any more.

On the other hand, notice that the values of g are all non-negative. That means we can think of g as a function from R to R≥0. Since this function has a different codomain, it’s technically a different function, so let’s give it a different name: call it g′. Then g′ is surjective, because every non-negative real number has at least one square root. We could even do both things at once:

g′|R≥0

is both injective and surjective. We call such functions bijective. Bijective functions are special, because they’re the functions which are invertible.

Definition 6. Let f : X → Y be a function. Then an inverse for f is a function f⁻¹ : Y → X, pronounced “f inverse”, such that

f⁻¹ ◦ f = idX, f ◦ f⁻¹ = idY.

We think of f⁻¹ as “doing f backwards”.

Theorem 7. f has an inverse if and only if it’s bijective.

Proof. This is an “if and only if” statement, so there are two things to prove: if f has an inverse, then it’s bijective, and if f is bijective, then it has an inverse.

First, suppose f has an inverse. We need to show that it’s injective and that it’s surjective. For injectivity, suppose that f(a) = f(b). Then f⁻¹(f(a)) = f⁻¹(f(b)), so a = b. For surjectivity, let y ∈ Y. Then f(f⁻¹(y)) = y, so y is in the image of f.

Now we prove the converse. Suppose f is bijective; we need to build an inverse f⁻¹ to f. Let y ∈ Y. Since f is surjective, there is some x ∈ X such that f(x) = y. But since f is injective, there is only one such x. So we have a unique choice: we can simply decree f⁻¹(y) = x. Then

f(f⁻¹(y)) = f(x) = y and f⁻¹(f(x)) = f⁻¹(y) = x.
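In the set-of-pairs picture, decreeing that the inverse sends y to x whenever f sends x to y amounts to swapping the two components of every pair. A quick sketch on a made-up bijection (the sets X and Y, the function f and the helper names are all hypothetical):

```python
def inverse(f):
    """Swap each pair (x, y) to (y, x). The result is a function from Y
    to X exactly when f is bijective."""
    return {(y, x) for (x, y) in f}

def compose(g, f):
    """g ∘ f as a set of pairs (assumes both are functions and composable)."""
    g_lookup = dict(g)
    return {(x, g_lookup[y]) for (x, y) in f}

# A made-up bijection f : X → Y.
X = {1, 2, 3}
Y = {"a", "b", "c"}
f = {(1, "b"), (2, "c"), (3, "a")}
f_inv = inverse(f)

id_X = {(x, x) for x in X}
id_Y = {(y, y) for y in Y}
```

With these definitions, `compose(f_inv, f) == id_X` and `compose(f, f_inv) == id_Y`, which are the two equations required of an inverse.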

That’s the proof. If there’s a bijection f : X → Y, we’ll often say that X and Y are “in bijection”, and we’ll act kind of as if X and Y are the same, since the elements of Y are in one-to-one correspondence with the elements of X; they’re just the elements of X under different names.

Last time I promised you a result about equivalence relations. I’ll be able to give you the statement; the proof will have to wait until next week. Before I can talk about this, I have to talk about partitions. Let X be a set. We can form the set of all equivalence relations on X - equivalence relations are things, so we can put them in a set. Call this set Eq(X).

Definition 8. A partition of X is a collection (Xi)i∈I of subsets of X such that

• if i ≠ j, then Xi ∩ Xj = ∅, and
• the union ⋃_{i∈I} Xi = X.
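For a finite set, the two conditions in this definition are easy to test directly. In the sketch below the collection is given as a Python list, so the list positions play the role of the index set I; the function name `is_partition` and the example sets are made up for illustration.

```python
from itertools import combinations

def is_partition(parts, X):
    """Check Definition 8 for a finite list of sets: parts at different
    positions are disjoint, and the union of all parts is X."""
    pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(parts, 2))
    covers = set().union(*parts) == X
    return pairwise_disjoint and covers

# Hypothetical example set.
X = {1, 2, 3, 4}
good = [{1, 2}, {3}, {4}]
overlapping = [{1, 2}, {2, 3}, {4}]   # 2 lies in two different parts
incomplete = [{1}, {2}]               # 3 and 4 are not covered
```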

Don’t be scared of the notation (Xi)i∈I; the point is that we have a big list of subsets of X, and we have to give them names, so we index them with tags from some convenient index set I. It doesn’t matter exactly what I is; it just has to be the right size of set to index our collection.

Now we can form the set of all partitions of X - call it Part(X). Then we have the following theorem:

Theorem 9. The sets Eq(X) and Part(X) are in bijection.
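The proof is deferred, but the statement can at least be explored on a small example. The sketch below assumes the bijection in question is the usual one, sending an equivalence relation to its set of equivalence classes; that is an assumption about what is intended, and the code is an experiment on one made-up relation, not a proof. The helper names are hypothetical.

```python
def classes(R, X):
    """The equivalence classes of R, where R is a set of pairs on X
    (assumed to be an equivalence relation)."""
    return {frozenset(y for y in X if (x, y) in R) for x in X}

def relation(parts):
    """The relation 'x and y lie in the same part' as a set of pairs."""
    return {(x, y) for p in parts for x in p for y in p}

# A made-up equivalence relation on X in which 1 and 2 are equivalent.
X = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}
P = classes(R, X)
```

On this example, going from R to its classes and back with `relation(P)` recovers R, and going from P to a relation and back recovers P.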
