Notes on Fixed Point Theorems

4 Basic Definitions

In this section of the notes we will consider some of the basic fixed point theorems of analysis, the Brouwer and Kakutani theorems and their extensions to infinite dimensional spaces. We will finish with the remarkable result of Caristi in complete metric spaces.

A fixed point of a map F is a point x in its domain which satisfies the equation F(x) = x. It is clear that, for such a point to exist, the domain and range must have common points. We point out two things. First, the problem of existence of fixed points is equivalent to the problem of solving equations of the form f(x) = 0. Indeed, given such an equation, we can add the variable x to each side to obtain the equation F(x) = f(x) + x = x, so that x is a fixed point of F if and only if it is a zero of f. Conversely, if we wish to solve the fixed point problem for F, we set f(x) = F(x) − x; then x is a fixed point of F if and only if f(x) = 0.

Secondly, we wish to show that the existence of fixed points of continuous real-valued maps defined on a closed interval of the real line is reduced to a question of signs by the intermediate value theorem. Indeed, suppose that F is a continuous mapping of [a, b] into [a, b]. If either F(a) = a or F(b) = b then we have a fixed point. Suppose, then, that neither is true and consider the function f(x) = F(x) − x. Since F(a) and F(b) lie in [a, b], we have f(a) = F(a) − a > 0 while f(b) = F(b) − b < 0. Moreover, since F is continuous, so is f. Therefore the intermediate value theorem of calculus implies that there is a point x ∈ (a, b) such that 0 = f(x) = F(x) − x, and this x is a fixed point of F.
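As a numerical illustration of this sign argument, the following is a minimal sketch of a bisection search for a zero of f(x) = F(x) − x. The function name fixed_point_by_bisection and the choice F = cos on [0, 1] are our own assumptions, made only for the example; cos does map [0, 1] into itself.

```python
import math

def fixed_point_by_bisection(F, a, b, tol=1e-12):
    """Locate a fixed point of a continuous F: [a, b] -> [a, b] by bisecting f(x) = F(x) - x."""
    f = lambda x: F(x) - x
    if f(a) == 0:
        return a
    if f(b) == 0:
        return b
    lo, hi = a, b  # invariant from the text: f(lo) > 0 > f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        value = f(mid)
        if value == 0:
            return mid
        if value > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example (assumption): F = cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
x = fixed_point_by_bisection(math.cos, 0.0, 1.0)
```

The loop keeps the sign pattern f(lo) > 0 > f(hi) used in the argument above, so the intermediate value theorem guarantees a fixed point in every interval it visits.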

The situation is much more complicated if we move from R to higher dimensional spaces. We are going to look at these more general settings in what follows.

Perhaps the fixed point theorem best known to students in an advanced calculus class is the Banach-Caccioppoli theorem. As the reader is undoubtedly aware, this familiar theorem is intimately related to the convergence of iterative methods and, as such, has been the object of much research since its original introduction.

If f is a mapping of a set K into itself, f : K −→ K, then f is referred to as a self-mapping of K. In the case of self-mappings, we say that a point x ∈ K is a fixed point of f provided f(x) = x.

We can now state two definitions for such maps.

Definition 4.1 Let K be a subset of a normed linear space. A self-mapping f of K is called a contraction mapping provided there is a constant α with 0 < α < 1 such that, for all x, y ∈ K,

$$ \|f(x) - f(y)\| \le \alpha \, \|x - y\| . $$

Definition 4.2 A self-mapping f of a subset K of a normed linear space is called a non-expansive mapping provided, for all x, y ∈ K,

$$ \|f(x) - f(y)\| \le \|x - y\| . $$

The first result we consider is the principle of contraction mappings:

Theorem 4.3 (Banach, Caccioppoli) Let K be a closed subset of a Banach space X and T : K −→ K a contraction mapping on K. Then T has one and only one fixed point x ∈ K. Moreover, if xo ∈ K is arbitrary, then the sequence defined by xn+1 := T xn, n = 0, 1, 2, ···, converges to x and

$$ \|x - x_n\| \le \frac{\alpha^n}{1 - \alpha} \, \|x_1 - x_o\| , $$

where 0 < α < 1 is the contraction constant.

Proof: (1) (Uniqueness) Assume that x and y are both fixed points of T, i.e., T x = x and T y = y. Then

$$ \|x - y\| = \|Tx - Ty\| \le \alpha \, \|x - y\| , $$

and so (1 − α) ‖x − y‖ ≤ 0, which implies that x = y. This shows that the fixed point must be unique if there is any one at all.

(2) (Existence) We will construct the fixed point. Let xo ∈ K be arbitrary and consider the sequence {xn}, n = 0, 1, 2, ···, defined above. Then we have

$$ \|x_{n+1} - x_n\| \le \alpha \, \|x_n - x_{n-1}\| \le \cdots \le \alpha^n \, \|x_1 - x_o\| , \qquad n = 0, 1, 2, \ldots $$

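Using this inequality, we will show that the sequence {xn} is a Cauchy sequence. Indeed, for m > n,

$$
\begin{aligned}
\|x_m - x_n\| &\le \|x_m - x_{m-1}\| + \|x_{m-1} - x_{m-2}\| + \cdots + \|x_{n+1} - x_n\| \\
&\le [\alpha^{m-1} + \alpha^{m-2} + \cdots + \alpha^{n}] \, \|x_1 - x_o\| \\
&= \alpha^{n} [1 + \alpha + \cdots + \alpha^{m-n-1}] \, \|x_1 - x_o\| \\
&= \frac{\alpha^{n} [1 - \alpha^{m-n}]}{1 - \alpha} \, \|x_1 - x_o\| \le \frac{\alpha^{n}}{1 - \alpha} \, \|x_1 - x_o\| .
\end{aligned}
$$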

Thus the sequence is a Cauchy sequence and, since X is complete, there exists an x ∈ X such that lim xn = x as n → ∞. Since K is closed, x ∈ K. Since T is continuous and ‖ · ‖ is continuous (the latter because ‖x‖ − ‖xn − x‖ ≤ ‖xn‖ ≤ ‖xn − x‖ + ‖x‖), it follows that

$$ 0 = \lim_{m\to\infty} \|x_{m+1} - T x_m\| = \Big\| \lim_{m\to\infty} [x_{m+1} - T x_m] \Big\| = \|x - Tx\| , $$

which implies T x = x. Finally, letting m → ∞ in the estimate for ‖xm − xn‖ gives the stated bound on ‖x − xn‖. □
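The error estimate of the theorem gives a stopping rule that can be evaluated before the limit is known. The following is a minimal sketch for real-valued maps; the name iterate_contraction and the example T = cos on K = [0, 1] are our own assumptions. For this example the mean value theorem gives the contraction constant α = sin 1 < 1.

```python
import math

def iterate_contraction(T, x0, alpha, eps):
    """Iterate x_{n+1} = T(x_n) until alpha^n / (1 - alpha) * |x_1 - x_0| <= eps.

    Returns (x_n, n); by the a priori bound, |x - x_n| <= eps for the fixed point x.
    """
    x1 = T(x0)
    first_step = abs(x1 - x0)
    n, x = 1, x1  # x always holds x_n
    while alpha**n / (1 - alpha) * first_step > eps:
        x = T(x)
        n += 1
    return x, n

# Example (assumption): T = cos on [0, 1], where |cos'(t)| <= sin(1) < 1.
x_star, steps = iterate_contraction(math.cos, 1.0, math.sin(1.0), 1e-8)
```

The bound is usually pessimistic: the number of steps it prescribes depends only on α and on the first step ‖x1 − xo‖, not on how fast the particular sequence actually converges.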

Definition 4.4 A topological space R is said to have the fixed point property provided for every continuous mapping T : R → R, there exists a p ∈ R such that T (p) = p.

The famous Brouwer fixed point theorem states: The closed unit ball of R^n has the fixed point property. Here is an example of another such set.

Definition 4.5 The Hilbert cube in ℓ² is the set of all points of the form {ξ1, ξ2, . . .} such that 0 ≤ ξn ≤ 1/n.

We note, first, that the Hilbert cube is compact. Indeed, since ℓ² is a metric space, it suffices to show that the cube is sequentially compact. To this end, suppose {ξ^(n)}, n = 1, 2, ···, is a sequence in the Hilbert cube. We know that a closed bounded set in R^N is compact. So for any N we may select a subsequence of {ξ^(n)}, which we again denote by {ξ^(n)}, whose first N coordinates converge, i.e., ξ1^(n) → ξ1, ξ2^(n) → ξ2 and so on; by a diagonal argument, a single subsequence can be chosen along which every coordinate converges. Then, given any ε > 0, there is a positive integer M such that, if ξ(M) = (ξ1, ξ2, . . . , ξM, 0, 0, . . .), then the first M coordinates converge and there is an n1 such that

$$ \sum_{i=1}^{M} \big(\xi_i - \xi_i^{(n)}\big)^2 < \varepsilon/2 \quad \text{if } n > n_1 , $$

and an n2 such that

$$ \sum_{i=M+1}^{\infty} \big(\xi_i^{(n)}\big)^2 < \varepsilon/2 \quad \text{if } n > n_2 . $$

Hence, if n > max{n1, n2} we have

$$ \|\xi(M) - \xi^{(n)}\|^2 = \sum_{i=1}^{M} \big(\xi_i - \xi_i^{(n)}\big)^2 + \sum_{i=M+1}^{\infty} \big(\xi_i^{(n)}\big)^2 < \varepsilon . $$

We conclude that lim ξ(M) as M → ∞ equals lim ξ^(n) as n → ∞, and so the Hilbert cube is sequentially compact.

The Hilbert cube affords us another example of a set which has the fixed point property, but in order to establish that fact, we need to use the Brouwer theorem.

Proposition 4.6 The Hilbert cube has the fixed point property.

Proof: Denote the Hilbert cube in ℓ² by C and let T : C → C be continuous. Let Pn : C → C be the map defined by

$$ P_n[(\xi_1, \xi_2, \ldots, \xi_n, \xi_{n+1}, \ldots)] = (\xi_1, \ldots, \xi_n, 0, 0, \ldots) . $$

The set Cn = Pn(C) is clearly homeomorphic to the closed unit ball in R^n. Since the mapping Pn ◦ T : Cn → Cn is continuous, the Brouwer theorem guarantees that it has a fixed point yn ∈ Cn ⊂ C, and

$$ \|y_n - T(y_n)\| = \|P_n(T y_n) - T(y_n)\| \le \sqrt{ \sum_{i=n+1}^{\infty} \frac{1}{i^2} } . $$

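Since C is compact, the sequence {yn} has a convergent subsequence {y_{n_k}}. If yo is the limit of this subsequence, then, since the right-hand side of the last estimate tends to 0 as n → ∞, the continuity of the norm and of the function T gives

$$ 0 = \lim_{k\to\infty} \|y_{n_k} - T(y_{n_k})\| = \Big\| \lim_{k\to\infty} y_{n_k} - T\big( \lim_{k\to\infty} y_{n_k} \big) \Big\| = \|y_o - T(y_o)\| . $$

Hence yo is a fixed point of T. □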
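The proof rests on the fact that the tail bound for ‖yn − T(yn)‖ tends to zero. The following minimal sketch evaluates that bound numerically, using the known value π²/6 of the full series Σ 1/i²; the function name tail_bound is our own.

```python
import math

def tail_bound(n):
    """Upper bound sqrt(sum_{i > n} 1/i^2) on ||P_n(y) - y|| for y in the Hilbert cube."""
    head = sum(1.0 / i**2 for i in range(1, n + 1))
    # The full series sums to pi^2 / 6; clamp at 0 to guard against rounding.
    return math.sqrt(max(math.pi**2 / 6 - head, 0.0))

bounds = [tail_bound(n) for n in (10, 100, 1000)]
```

Since the tail of the series behaves like 1/n, the bound decays roughly like 1/√n.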

5 Combinatorial Background

In this section we give a “bare-bones” approach to the facts about simplices and simplicial subdivisions which will be necessary for our future work. It is important to understand that the combinatorial methods introduced here have inspired a number of numerical methods for the approximate computation of fixed points in various contexts.

We begin with a familiar definition.

Definition 5.1 A subset C of a vector space V is said to be convex provided x, y ∈ C, and 0 ≤ λ ≤ 1 implies that (1 − λ)x + λy ∈ C.

It is easy to check that if {Cα}α∈A is a family of convex sets, then the intersection ∩α∈A Cα is also convex.

In the following, we will assume that the vector space V is a topological vector space.

Definition 5.2 The closed convex hull of a set S ⊂ V is the intersection of all closed convex sets containing S.

We note that a set always has a closed convex hull since V is itself a closed convex set containing S. We have the following characterization of the closed convex hull of a finite set:

Theorem 5.3 If S consists of the finite set {x0, x1, . . . , xn} then the closed convex hull of S is

$$ \Big\{ y \in V \;\Big|\; y = \sum_{i=0}^{n} \lambda_i x_i , \ \lambda_i \ge 0 , \ \sum_{i=0}^{n} \lambda_i = 1 \Big\} . $$

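Proof: The set is obviously closed. To show that it is convex, let y be a point of the set, written as above, and let z = µ0 x0 + ··· + µn xn be another, with µi ≥ 0 and µ0 + ··· + µn = 1. Then any point on the line segment joining y with z has the form

$$ p = (1 - \nu)\, y + \nu\, z = \sum_{i=0}^{n} [(1 - \nu) \lambda_i + \nu \mu_i] \, x_i , \qquad 0 \le \nu \le 1 . $$

Since ν, (1 − ν), λi, and µi are all non-negative, the coefficients in this last expression are all non-negative. Moreover

$$ \sum_{i=0}^{n} [(1 - \nu) \lambda_i + \nu \mu_i] = (1 - \nu) \sum_{i=0}^{n} \lambda_i + \nu \sum_{i=0}^{n} \mu_i = (1 - \nu) + \nu = 1 . $$

Hence the set in question is convex.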

Finally, we need only show that if y = λ0 x0 + ··· + λn xn with λi ≥ 0 and λ0 + ··· + λn = 1, then y is in every convex set containing x0, x1, . . . , xn. The proof is by induction on r for the sets x0, x1, . . . , xr.

Any convex set containing x0, x1 must contain all points of the form y = λ0 x0 + λ1 x1 with λ1 = 1 − λ0. So the result holds for r = 1. Now assume that the result is true for r = n − 1. Consider the point

$$ y = \sum_{i=0}^{n} \lambda_i x_i = \sum_{i=0}^{n-1} \lambda_i x_i + \lambda_n x_n , \qquad \sum_{i=0}^{n-1} \lambda_i = 1 - \lambda_n . $$

If λn = 1 then y = xn and there is nothing to prove. Otherwise we can write

$$ y = \lambda_n x_n + (1 - \lambda_n) \left[ \sum_{i=0}^{n-1} \frac{\lambda_i}{1 - \lambda_n} \, x_i \right] . $$

Now the point z in brackets is a convex combination of the points x0, . . . , xn−1 (the case r = n − 1), so by the induction hypothesis it belongs to every convex set containing {x0, x1, . . . , xn−1}. Since y lies on the line segment joining z and xn, it belongs to every convex set containing {x0, x1, . . . , xn}. Hence the result. □
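The induction step can be read as a recipe: a convex combination of n + 1 points is obtained from two-point combinations only. The following is a minimal sketch of that recursion, using exact rational arithmetic; the function names are our own.

```python
from fractions import Fraction

def combine_two(lam, p, q):
    """Return (1 - lam) * p + lam * q for points given as tuples."""
    return tuple((1 - lam) * a + lam * b for a, b in zip(p, q))

def convex_combination(weights, points):
    """Build sum(w_i * x_i) using only two-point convex combinations, as in the induction."""
    if len(points) == 1:
        return tuple(points[0])
    lam_n = weights[-1]
    if lam_n == 1:
        return tuple(points[-1])
    # Rescaled weights lam_i / (1 - lam_n) sum to one (the point z of the proof).
    rest = [w / (1 - lam_n) for w in weights[:-1]]
    z = convex_combination(rest, points[:-1])
    return combine_two(lam_n, z, points[-1])  # y = (1 - lam_n) z + lam_n x_n

weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
y = convex_combination(weights, [(0, 0), (1, 0), (0, 1)])
```

Each recursive call only ever forms a point on a segment between two points already known to lie in the convex set, which is exactly why y must lie in it.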

Definition 5.4 The points x0, x1, . . . , xn ∈ R^N, n ≤ N, are said to be in general position if the vectors (x1 − x0), (x2 − x0), . . . , (xn − x0) are linearly independent.

To show that the point x0 does not play any particular role in the definition, we check that if the vectors (x1 − x0), . . . , (xn − x0) are linearly independent, then so are the vectors (x0 − xi), . . . , (xi−1 − xi), (xi+1 − xi), . . . , (xn − xi). Indeed,

$$ \sum_{j \ne i} \alpha_j (x_j - x_i) = \sum_{j \ne i} [\alpha_j (x_j - x_0) - \alpha_j (x_i - x_0)] , $$

and since the vectors (xk − x0), k = 1, . . . , n, are linearly independent, the equation

$$ \sum_{j \ne i} [\alpha_j (x_j - x_0) - \alpha_j (x_i - x_0)] = 0 $$

implies αj = 0 for all j ≠ i. Indeed, the coefficient of (xj − x0) is αj for each j different from 0 and i, while the coefficient of (xi − x0) is minus the sum of all the αj, which then forces α0 = 0 as well. □

We now assume that the points x0, x1, . . . , xn are in general position and consider the matrix whose columns are the n ≤ N linearly independent vectors xi − x0 , i = 1, . . . , n. We denote this matrix by ∆(xi). Since the points are in general position this matrix has maximal rank.
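The rank condition on ∆(xi) can be checked mechanically. The following is a minimal sketch, assuming points with integer or rational coordinates so that elimination is exact; the helper names rank and in_general_position are our own. The difference vectors are stored as rows rather than columns, which does not change the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def in_general_position(points):
    """True if x_1 - x_0, ..., x_n - x_0 are linearly independent (Definition 5.4)."""
    x0 = points[0]
    diffs = [[a - b for a, b in zip(p, x0)] for p in points[1:]]
    return rank(diffs) == len(diffs)
```

For example, in_general_position([(0, 0), (1, 0), (0, 1)]) holds, while three collinear points such as (0, 0), (1, 1), (2, 2) fail the test, whichever of them is listed first.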

We will find it useful to consider sets of more than N points in R^N, say {x0, x1, . . . , xs}.

Definition 5.5 The set of points {x0, x1, . . . , xs} is said to be in general position provided any N + 1 of them are in general position in the sense of the preceding definition.

Theorem 5.6 If the points of the set {x0, x1, . . . , xs} are in general position then there exists an ε > 0 such that ρ(xi, yi) < ε for all i = 0, 1, . . . , s implies that the set of points {y0, y1, . . . , ys} is in general position.

Proof: Since {x0, x1, . . . , xs} is in general position, the matrix ∆(xi) of every set of N + 1 of the points is of rank N, and hence each of these matrices has at least one nonvanishing N-th order determinant. Since the determinant is a continuous function of the components of each point, there exists an εk > 0 such that ρ(xk, yk) < εk implies that each matrix ∆(yi) still has rank N. This can be done for each k = 0, 1, . . . , s and so, by choosing ε = min{ε0, ε1, . . . , εs}, the points of the set {y0, y1, . . . , ys} will be in general position. □

Theorem 5.7 Any N + 1 points {x0, x1, . . . , xN } can be brought into general position by an arbitrarily small displacement.

Proof: Let e0 be the zero vector in R^N and let e1, . . . , eN be the standard unit vectors. These N + 1 points are in general position. Now consider the vectors

$$ y_i(t) = (1 - t)\, x_i + t\, e_i , \qquad 0 \le t \le 1 , \quad i = 0, 1, \ldots, N . $$

For t = 1, we have ∆(yi(1)) = ∆(ei) and hence ∆(yi(1)) is of rank N. The determinant |∆(yi(t))| is a polynomial in t and, since it does not vanish at t = 1, it does not vanish identically. By the Fundamental Theorem of Algebra it has only finitely many zeros, so that |∆(yi(t̂))| ≠ 0 for all sufficiently small t̂ > 0, say 0 < t̂ < ε. Therefore, the points {y0(t̂), y1(t̂), . . . , yN(t̂)} are in general position. □
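To see the polynomial argument at work, the following minimal sketch evaluates the determinant of ∆(yi(t)) exactly for three collinear points in the plane. The function names and the sample points are our own assumptions; for these points the determinant works out to 3t − 2t², which vanishes at t = 0 but at no other t in [0, 1].

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def displaced(points, t):
    """y_i(t) = (1 - t) x_i + t e_i, with e_0 = 0 and e_1..e_N the standard unit vectors."""
    N = len(points) - 1
    result = []
    for i, x in enumerate(points):
        e = [1 if k == i - 1 else 0 for k in range(N)]
        result.append([(1 - t) * a + t * b for a, b in zip(x, e)])
    return result

def delta_det(points, t):
    """Determinant of the matrix whose rows are y_i(t) - y_0(t), i = 1..N."""
    y = displaced(points, t)
    rows = [[a - b for a, b in zip(yi, y[0])] for yi in y[1:]]
    return det(rows)

collinear = [[0, 0], [1, 1], [2, 2]]  # not in general position
small = delta_det(collinear, Fraction(1, 100))
```

Here delta_det(collinear, 0) is zero, while the displacement by t = 1/100 already gives a nonzero determinant, so the displaced points are in general position.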

Theorem 5.8 Any finite system of points {x0, x1, . . . , xs} can be brought into general position by an arbitrarily small displacement.

Proof: Given ε > 0 we must bring k = C(s + 1, N + 1) sets of N + 1 points into general position, where C(s + 1, N + 1) is the number of ways of choosing N + 1 of the s + 1 points. Take any set of N + 1 points. By the previous theorem, these can be brought into general position by an arbitrarily small displacement, say by less than ε1. We know from the theorem before last that there exists an ε2 > 0 such that, if these N + 1 points are moved a distance less than ε2, they will remain in general position. Now take any other set of N + 1 points and bring them into general position by moving them a distance less than ε2. Then we have two sets of points in general position. This process may be continued until all k sets are brought into general position, where, in each step, we take ε1 < ε/k. □

We can now give a definition of what we will mean by a simplex.

Definition 5.9 A geometric rectilinear n-dimensional simplex is the closed convex hull of n + 1 points a0, a1, . . . , an that are in general position. These points are said to span the simplex and are called the vertices.

We will use the notation σ^n = [a0, a1, . . . , an] to denote the simplex. We will denote the simplex spanned by the vectors ei, i = 0, 1, . . . , n, the standard unit simplex, by σ. The number n is called the dimension of σ^n. Any simplex

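Recalling the definition of the closed convex hull of a set of points, we see that the simplex consists of all points which have a representation in the form

$$ y = \sum_{i=0}^{n} \lambda_i a_i \quad \text{where} \quad \sum_{i=0}^{n} \lambda_i = 1 , \ \lambda_i \ge 0 . $$

This representation is unique, for if also y = µ0 a0 + ··· + µn an with µ0 + ··· + µn = 1, then

$$ 0 = \sum_{i=0}^{n} (\mu_i - \lambda_i) \, a_i = \sum_{i=1}^{n} (\mu_i - \lambda_i)(a_i - a_0) + a_0 \sum_{i=0}^{n} (\mu_i - \lambda_i) . $$

But the sum of the differences µi − λi over i = 0, . . . , n is 0 and, since the ai are in general position, µi − λi = 0 for all i = 1, . . . , n, and hence also for i = 0.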

The coefficients λi are called the barycentric coordinates of the point y. We remark that the barycentric coordinates of a point are different from the coordinates of the same point as a point in R^N. However, it is relatively easy to see the relationship between the two sets of coordinates. If the vertex ak has Cartesian coordinates c1^(k), . . . , cN^(k), then the N Cartesian coordinates c1, . . . , cN of y can easily be computed as

$$ y = \lambda_0 a_0 + \lambda_1 a_1 + \cdots + \lambda_N a_N = \sum_{i=0}^{N} \lambda_i \left( \sum_{k=1}^{N} c_k^{(i)} e_k \right) = \sum_{k=1}^{N} \left( \sum_{i=0}^{N} \lambda_i c_k^{(i)} \right) e_k . $$
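The formula above is a weighted sum of vertex coordinates, and for a triangle in the plane the conversion can be inverted by Cramer's rule. The following is a minimal sketch of both directions; the function names are our own, and the inverse is written only for a nondegenerate 2-simplex in R^2.

```python
from fractions import Fraction

def barycentric_to_cartesian(lams, vertices):
    """c_k = sum_i lam_i * c_k^(i): weighted sum of the vertex coordinates."""
    dim = len(vertices[0])
    return tuple(sum(l * v[k] for l, v in zip(lams, vertices)) for k in range(dim))

def cartesian_to_barycentric_2d(p, a0, a1, a2):
    """Solve p - a0 = l1 (a1 - a0) + l2 (a2 - a0) by Cramer's rule; l0 = 1 - l1 - l2."""
    ux, uy = a1[0] - a0[0], a1[1] - a0[1]
    vx, vy = a2[0] - a0[0], a2[1] - a0[1]
    wx, wy = p[0] - a0[0], p[1] - a0[1]
    d = ux * vy - vx * uy  # nonzero exactly when a0, a1, a2 are in general position
    l1 = (wx * vy - vx * wy) / d
    l2 = (ux * wy - wx * uy) / d
    return (1 - l1 - l2, l1, l2)

tri = [(0, 0), (4, 0), (0, 4)]
lams = (Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))
p = barycentric_to_cartesian(lams, tri)
back = cartesian_to_barycentric_2d(p, *tri)
```

The round trip returns the original coordinates, which illustrates (but of course does not prove) the uniqueness asserted in the exercise that follows.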

Exercise 5.10 Show that, if the simplex is nondegenerate, i.e. σ = [a0, a1, . . . , aN ] then the Cartesian coordinates uniquely determine the barycentric coordinates of a point in the simplex.

It is clear that σ^n is closed and bounded, and hence is a compact set in R^N.

A typical simplex in R2 is a line segment and in R3 is a . Intuitively such a simplex has what we want to call edges and faces. We formalize these ideas in the next definition.

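Definition 5.11 A p-dimensional face of a simplex is a subset:

$$ \Big\{ y \;\Big|\; y = \sum_{i=0}^{p} \lambda_i x_i , \ \sum_{i=0}^{p} \lambda_i = 1 , \ \lambda_i \ge 0 \ \text{for } i \le p < n , \ \lambda_i = 0 \ \text{for } i > p \Big\} . $$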

We see from this definition that a face is itself a simplex. A zero-dimensional face is a vertex; a one-dimensional face is called an edge. A two-dimensional simplex is a triangle. Indeed, if σ² = [a0, a1, a2], assume that x ≠ a0 is a point of the simplex. Then

$$ x = \sum_{i=0}^{2} \lambda_i a_i = \lambda_0 a_0 + (1 - \lambda_0) \left[ \frac{\lambda_1}{\lambda} a_1 + \frac{\lambda_2}{\lambda} a_2 \right] , $$

where λ = 1 − λ0. The expression in brackets represents a point p of the line segment joining a1 and a2 since (λ1 + λ2)/λ = 1 and λi/λ ≥ 0 for i = 1, 2. Thus x is a point of the line segment joining a0 and p.

Conversely, any point of such a line segment is in σ², as is easily checked. It follows that σ² is the union of all line segments joining a0 to points of the line segment joining a1 and a2, and so is a triangle.

Definition 5.12 Two simplices are said to be properly situated provided their intersection is either empty or a common face.

It is immediate from the definition that two faces of a simplex are properly situated.

The faces of σ^n different from σ^n itself are called the proper faces of σ^n; their union is called the boundary of σ^n and is denoted ∂σ^n. The interior of σ^n is defined by the equation int(σ^n) = σ^n \ ∂σ^n. The set int(σ^n) is sometimes called the open simplex.

Since ∂σ^n consists of all points x of σ^n such that at least one of the barycentric coordinates λi(x) = 0, the set int(σ^n) consists of those points of σ^n for which λi(x) > 0 for all i. It follows that, given x ∈ σ^n, there is exactly one face s of σ^n such that x ∈ int(s), for s must be the face of σ^n spanned by those ai for which λi(x) is positive.
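In computational terms, the face containing x in its interior is read off from the barycentric coordinates. The following minimal sketch does this; the function name carrier is our own, and vertices are represented by arbitrary labels.

```python
from fractions import Fraction

def carrier(lams, vertices):
    """Vertices of the unique face s with x in int(s): those a_i with lam_i(x) > 0."""
    return [v for lam, v in zip(lams, vertices) if lam > 0]

def on_boundary(lams):
    """True if some barycentric coordinate vanishes, i.e. x lies on the boundary."""
    return any(lam == 0 for lam in lams)

# A point on the edge [a0, a2] of the triangle [a0, a1, a2].
edge_point = [Fraction(1, 2), Fraction(0), Fraction(1, 2)]
face = carrier(edge_point, ["a0", "a1", "a2"])
```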

We note that if σ^n ⊂ R^N, then its vertices, being in general position, span an n-dimensional hyperplane in R^N. We call that plane P. In the following result, the word “open” refers to P with the relative topology induced by R^N. The proof of the following result is left as an exercise:

Theorem 5.13 int(σ) is convex and is open in the plane P spanned by the vertices; its closure is σ. Furthermore, int(σ) is the union of all open line segments joining a0 to points of int(s), where s is the face of σ opposite a0.

We have used the notation B^n_1 to denote the open unit ball in R^n. The unit sphere S^(n−1) is just the boundary of the unit ball and so consists of all points x for which ‖x‖ = 1. Thus, for example, S^1 is the unit circle in R^2.

Theorem 5.14 Let U be a bounded, convex, open set in R^n and let w ∈ U. Then

(a) Each ray emanating from w intersects ∂(U) := Ū \ U in precisely one point, where Ū denotes the closure of U.

(b) There is a homeomorphism of Ū with the closed unit ball B̄^n carrying ∂U onto S^(n−1).

Proof: Recall that a ray emanating from w is the set of all points of the form w + λ p where p ≠ 0 is a fixed point of R^n and λ ≥ 0. Now given any ray R emanating from w, its intersection with U is convex, bounded, and open in R. Hence it consists of all points of the form w + λ p where λ ∈ [0, α) for some α > 0. Then R intersects Ū \ U in the point x = w + α p.

Suppose R intersects Ū \ U in another point, say y. Then x lies between w and y on the ray R. Indeed, since y = w + µ p for some µ > α, we have

$$ x = (1 - \lambda)\, w + \lambda\, y , $$

where λ = α/µ. We rewrite this equation in the form

$$ w = \left( \frac{1}{1 - \lambda} \right) x - \left( \frac{\lambda}{1 - \lambda} \right) y . $$

Now choose a sequence {yn} of points of U such that yn → y and define

$$ w_n = \left( \frac{1}{1 - \lambda} \right) x - \left( \frac{\lambda}{1 - \lambda} \right) y_n . $$

The sequence {wn} converges to w, so that wn ∈ U for some n. But then, since x = (1 − λ) wn + λ yn, the point x ∈ U since U is convex. But x was chosen as a boundary point of U, and this is a contradiction. Hence part (a) of the theorem is proved.

In order to establish part (b) assume, without loss of generality, that w = 0. The equation f(x) = x/‖x‖ defines a continuous map f of R^n \ {0} onto S^(n−1). By part (a) the restriction of f to the boundary of U defines a bijection of ∂U onto S^(n−1). Since ∂U is compact, this restriction is a homeomorphism. Let g : S^(n−1) → ∂(U) be its inverse. Extend g to a bijection G : B̄^n → Ū by letting G map the line segment joining 0 to the point u ∈ S^(n−1) linearly onto the line segment joining 0 to g(u). Formally, we define

$$ G(x) = \begin{cases} \| g(x/\|x\|) \| \, x , & \text{if } x \ne 0 , \\ 0 , & \text{if } x = 0 . \end{cases} $$

Continuity of G for x ≠ 0 is immediate. Continuity at 0 is easy: if M is a bound for ‖g(x)‖, then, whenever ‖x − 0‖ < δ, we have ‖G(x) − G(0)‖ < M δ. □
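For a concrete picture of the map G, take U to be the open square (−1, 1)² in R^2 with w = 0. This choice of U, and the function names below, are our own assumptions for the sketch; for the square, the inverse map g sends a unit vector u to u divided by its largest coordinate in absolute value.

```python
import math

def g_square(u):
    """Boundary point of the square (-1, 1)^2 on the ray from 0 through the unit vector u."""
    scale = max(abs(u[0]), abs(u[1]))
    return (u[0] / scale, u[1] / scale)

def G(x, g=g_square):
    """G(x) = ||g(x / ||x||)|| * x for x != 0 and G(0) = 0, as in the proof of part (b)."""
    r = math.hypot(x[0], x[1])
    if r == 0:
        return (0.0, 0.0)
    b = g((x[0] / r, x[1] / r))
    stretch = math.hypot(b[0], b[1])
    return (stretch * x[0], stretch * x[1])

corner = G((math.sqrt(0.5), math.sqrt(0.5)))  # a point of the unit circle
```

The point of the unit circle on the diagonal is carried to the corner (1, 1) of the square, while points on the coordinate axes are left fixed.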

We are now ready to define a “pasting together” of simplices to form what is called a complex.

Definition 5.15 A simplicial complex K in R^N is a collection of simplices in R^N such that

(a) Every face of a simplex of K is in K.

(b) The intersection of any two simplices of K is either empty or a face of each of them.

It is possible to replace the second of the conditions in the definition with one that is sometimes more easily checked. Thus we have the next result.

Proposition 5.16 A collection K of simplices is a simplicial complex if and only if the following hold:

(a′) Every face of a simplex of K is in K.

(b′) Every pair of distinct simplices of K have disjoint interiors.

Proof: First, assume that K is a simplicial complex in the sense of the definition. Given two simplices σ and τ, suppose that x ∈ int(σ) ∩ int(τ). Now let s = σ ∩ τ. If s were a proper face of σ then x ∈ ∂σ, which it is not. Hence s = σ. The same argument shows that s = τ, so that σ = τ.

To prove the converse, assume that (a′) and (b′) are true. We show that if the set σ ∩ τ ≠ ∅ then it is the face σ′ of σ that is spanned by those vertices b0, b1, . . . , bm of σ that lie in τ. First, σ′ is contained in σ ∩ τ because this latter set is convex and contains b0, b1, . . . , bm. To prove the reverse inclusion, suppose x ∈ σ ∩ τ. Then x ∈ int(s) ∩ int(t), for some face s of σ and some face t of τ. It follows from (b′) that s = t. Hence the vertices of s lie in τ, so that by definition they are elements of the set {b0, b1, . . . , bm}. Then s is a face of σ′, so that x ∈ σ′, as desired. □

It follows from this proposition that if σ is a simplex, then the collection consisting of σ and all its proper faces is a simplicial complex: condition (a′) is immediate, and condition (b′) holds because for each point x ∈ σ, there is exactly one face s of σ such that x ∈ int(s).

If L is a subcollection of K that contains all faces of its elements, then L is a simplicial complex in its own right. It is called a subcomplex of K. One subcomplex of K is the collection of all simplices of K of dimension at most p; it is called the p-skeleton of K and is denoted K^(p). The points of the collection K^(0) are called the vertices of K.

It will be useful to distinguish between the geometrical objects that constitute a given complex and the set of points in that complex as a subset of points in R^N. We will denote the latter set of points by |K|. This point set is simply the union of the simplices of K. Giving each simplex its natural topology as a subspace of R^N, we then topologize |K| by declaring a subset A of |K| to be closed in |K| if and only if A ∩ σ is closed in σ for each σ ∈ K.

Exercise 5.17 Show that this actually defines a topology on |K|.

This space |K| is called the underlying space of K or the polytope of K. A space that is the polytope of a simplicial complex will be called a polyhedron.

It is important to understand that the topology of |K| is in general finer (that is, it has at least as many open sets) than the topology |K| inherits as a subspace of R^N. To see this, we show that a set A closed in the relative topology is also closed in the topology defined on |K|. To this end, suppose A is closed in the relative topology on |K|. Then A = B ∩ |K| for some closed set B ⊂ R^N. Then B ∩ σ is closed in σ for each σ ∈ K, so that B ∩ |K| = A is closed in the topology of |K|, by definition.

The two are different in general as the following example illustrates.

Example 5.18 Let K be the collection of 1-simplices σ1, σ2, . . . and their vertices, where σi is the 1-simplex in R^2 having vertices 0 and the point (1, 1/i). Then K is a simplicial complex. The intersection of |K| with the open parabolic arc {(x, x²) | x > 0} is closed in |K|, because its intersection with each simplex σi is a single point. It is not closed in the topology |K| derives from R^2, however, because in that topology, it has the origin as a limit point.
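The single points in question can be written down explicitly. The points of σi are (t, t/i) with 0 ≤ t ≤ 1, and the condition t/i = t² with t > 0 forces t = 1/i. The following minimal sketch lists these points; the function name parabola_hit is our own.

```python
from fractions import Fraction

def parabola_hit(i):
    """Point where the segment from (0, 0) to (1, 1/i) meets the arc {(x, x^2) : x > 0}."""
    # Points of sigma_i are (t, t/i), 0 <= t <= 1; t/i = t^2 with t > 0 forces t = 1/i.
    t = Fraction(1, i)
    return (t, t * t)

hits = [parabola_hit(i) for i in range(1, 6)]
```

The points (1/i, 1/i²) converge to the origin in R^2, which is why the set fails to be closed in the subspace topology even though it meets each σi in a closed set.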

We leave a second example as an exercise.

Exercise 5.19 Let K be the collection of all 1-simplices in R of the form of closed intervals [m, m + 1] where m is an integer different from 0, along with all the simplices of the form [1/(n + 1), 1/n] for n a positive integer, along with all the faces of these simplices. Show that K is a complex whose underlying space (see definition above) equals R as a set but not as a topological space. HINT: Look at the set of points of the form 1/n.

You will notice in both these examples that the complexes were composed of infinitely many simplices. There is a reason for that! Indeed, if there are only finitely many simplices making up the simplicial complex, the topologies actually agree. To see this, we remark that if K is finite and A is closed in |K|, then A ∩ σ is closed in σ and hence is closed in R^N. Because A is the union of finitely many sets A ∩ σ, the set A is also closed in R^N.

The next results will be useful.

Proposition 5.20 If L is a subcomplex of K, then |L| is a closed subspace of |K|. In particular, if σ ∈ K then σ is a closed subspace of |K|.

Proof: Suppose A is closed in |L|. If σ is a simplex of K, then σ ∩ |L| is the union of those faces si of σ that belong to L. Since A is closed in |L|, the set A ∩ si is closed in si and hence is closed in σ. Since A ∩ σ is the finite union of the sets A ∩ si, it is closed in σ. We conclude that A is closed in |K|.

Conversely, if B is closed in |K|, then B ∩ σ is closed in σ for each σ ∈ K, and in particular for each σ ∈ L. Hence B ∩ |L| is closed in |L|. □

Proposition 5.21 A map f : |K| → X is continuous if and only if the restriction f|σ is continuous for each σ ∈ K.

Proof: If f is continuous, so is its restriction to σ, since σ is a subspace of |K|. Conversely, suppose each map f|σ is continuous. If C is a closed set of X, then f⁻¹(C) ∩ σ = (f|σ)⁻¹(C), which is closed in σ by continuity of f|σ. Thus f⁻¹(C) is closed in |K| by definition. □

We now introduce coordinates for points in a complex K.

Definition 5.22 If x is a point of the polyhedron |K|, then x is interior to precisely one simplex of K, whose vertices are, say, a0, a1, . . . , an. Then

$$ x = \sum_{i=0}^{n} \lambda_i a_i , $$

where λi > 0 for each i and the λi sum to 1. If v is an arbitrary vertex of K, we define the barycentric coordinate λv(x) of x with respect to v by setting λv(x) = 0 if v is not one of the vertices ai and λv(x) = λi if v = ai.

We remark that, for fixed v, the map x ↦ λv(x) is continuous when restricted to a fixed simplex σ. Indeed, either λv(x) ≡ 0 on σ, or λv(x) is just the barycentric coordinate of x with respect to the vertex v of σ in the sense of the original definition of the barycentric coordinates of a point in a simplex. The previous proposition then says that x ↦ λv(x) is continuous on |K|.

Using this idea of barycentric coordinates, we can prove that points are separated in |K|.

Proposition 5.23 The space |K| is Hausdorff.

Proof: Suppose x ≠ y. Then there is at least one vertex v such that λv(x) ≠ λv(y). Choose r between these two numbers. Then the sets {z | λv(z) < r} and {z | λv(z) > r} are the required disjoint open sets. □

Finally, we introduce two subspaces of |K|.

Definition 5.24 If v is a vertex of K, the star of v in K, denoted by St(v), is the union of the interiors of those simplices of K that have v as a vertex. Its closure, denoted by \overline{St}(v), is called the closed star of v in K. It is the union of all simplices of K having v as a vertex, and it is the polytope of a subcomplex of K.

The set St(v) is open in |K| since it consists of all points x of |K| such that λv(x) > 0. Its complement is the union of all simplices of K that do not have v as a vertex and so is, itself, the polytope of a subcomplex of K. The intersection of \overline{St}(v) with the complement of St(v) is likewise the polytope of a subcomplex of K.

We are now prepared to introduce the idea of a simplicial map of one simplicial complex into another. Recall that, given a complex K, the 0-skeleton K^(0) of K is just the set of vertices of K.

Definition 5.25 Let K and L be complexes and let f : K^(0) → L^(0). Then f is called a vertex map.

Given such a map, we can extend it to all of K.

Proposition 5.26 Let K and L be complexes and f a vertex map. Suppose that whenever the vertices v0, v1, . . . , vn of K span a simplex of K, the points f(v0), . . . , f(vn) are vertices of a simplex of L. Then f can be extended to a continuous map g : |K| → |L| such that

$$ x = \sum_{i=0}^{n} \lambda_i v_i \quad \text{implies} \quad g(x) = \sum_{i=0}^{n} \lambda_i f(v_i) . $$

Proof: Note that, although the vertices f(v0), . . . , f(vn) of L are not necessarily distinct, they still span a simplex τ of L by hypothesis. If, in the expression for g(x), we simplify by collecting terms with like factors f(v), the coefficients are still non-negative and sum to one. Hence g(x) ∈ τ. It follows that g maps the n-simplex σ spanned by the vi continuously to the simplex τ whose vertex set is {f(v0), . . . , f(vn)}.

The map g is continuous as a map of σ into τ, and hence as a map of σ into |L|. Hence, by Proposition 5.21, g is a continuous map from |K| into |L|. □
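The step of collecting terms with like factors f(v) is easy to mirror in code. The following is a minimal sketch in which a point is stored by its nonzero barycentric coordinates and the vertex map is a dictionary; this representation and the function name simplicial_image are our own assumptions.

```python
from collections import defaultdict
from fractions import Fraction

def simplicial_image(bary, f):
    """Given x = sum(lam_v * v) as {vertex: lam_v} and a vertex map f (a dict), return
    the barycentric coordinates of g(x) = sum(lam_v * f(v)), collecting like terms."""
    image = defaultdict(Fraction)  # missing keys start at 0
    for v, lam in bary.items():
        image[f[v]] += lam
    return dict(image)

# A vertex map that collapses the edge [v1, v2] of a triangle to the single vertex w1.
f = {"v0": "w0", "v1": "w1", "v2": "w1"}
x = {"v0": Fraction(1, 2), "v1": Fraction(1, 4), "v2": Fraction(1, 4)}
gx = simplicial_image(x, f)
```

The collected coefficients are still non-negative and still sum to one, as the proof observes, even though the image simplex has fewer vertices than σ.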

One more basic notion will play a role in what we need to do.

Definition 5.27 Suppose that K is a complex, and w ∈ R^N has the property that each ray emanating from w intersects |K| in at most one point. The cone on K with vertex w is the collection of all simplices of the form [w, a0, . . . , an], where a0, . . . , an span a simplex of the complex K, along with all faces of such simplices. We denote this complex by w ∗ K. We call K the base of the cone.

In order for this definition to make sense, we must check that w ∗ K is a complex that contains K as a subcomplex. The first thing to check is that the points w, a0, a1, . . . , an are in general position. Suppose that P is the plane generated by the points a0, . . . , an. If w were in this plane, then we could consider the line segment joining w to a point x ∈ int(σ), where σ is the simplex determined by the ai. Since the set int(σ) is open in P, it would contain an interval of points on this line segment. But the ray from w through x intersects |K| in only one point by hypothesis, a contradiction.

To see that w ∗ K is a complex, we note that the simplices of w ∗ K are of three types: simplices of K itself, simplices of the form [w, a0, . . . , ap], and the 0-simplex w. A pair of simplices of the first type have disjoint interiors since they are simplices of the complex K. The open simplex int([w, a0, . . . , ap]) is the union of all open line segments joining w to points of int([a0, . . . , ap]). No two such open simplices can intersect because no ray from w contains more than one point of |K|. For the same reason, simplices of the first and second types have disjoint interiors.

We now explain how a finite complex K may be subdivided into simplices that are as small as desired. This is the process of simplicial subdivision. We start with a definition.

Definition 5.28 Let K be a geometric complex in R^N. A complex K′ is said to be a subdivision of K provided

(a) Each simplex of K′ is contained in a simplex of K.

(b) Each simplex of K is the union of finitely many simplices of K′.

We make the observations that if K′ is a subdivision of K and K″ is a subdivision of K′, then K″ is a subdivision of K. Moreover, if K′ is a subdivision of K and if Ko is a subcomplex of K, then the collection of all simplices of K′ that lie in |Ko| is automatically a subdivision of Ko. We call it the subdivision of Ko induced by K′.

The proof of the next lemma will be left as an exercise.

Lemma 5.29 If K is a complex then the intersection of any collection of subcomplexes of K is a subcomplex of K. Conversely, if {Kα}α∈A is a collection of complexes in R^N and if the intersection of every pair |Kα| ∩ |Kβ| is the polytope of a complex that is a subcomplex of both Kα and Kβ, then the union ∪α∈A Kα is a complex.

The method of subdivision that we explain is called the starring method. We start with a complex K. Consider the p-skeleton of K and suppose that we have a subdivision of that p-skeleton which we call Lp. Now let σ be a (p + 1)-simplex of K. The set ∂σ is the polytope of a subcomplex of the p-skeleton of K, and hence of a subcomplex of Lp. We denote this latter subcomplex by Lσ.

Now if wσ is an interior point of σ, then the cone wσ ∗ Lσ is a complex whose underlying space is σ. We define Lp+1 to be the union of Lp and the complexes wσ ∗ Lσ as σ ranges over all the (p + 1)-simplices of K.

We claim that Lp+1 is a complex. It is said to be the subdivision of K^(p+1) obtained by starring Lp from the points wσ. In order to check that Lp+1 is a complex, first note that

$$ |w_\sigma * L_\sigma| \cap |L_p| = \partial \sigma , $$

which is the polytope of the subcomplex Lσ of both wσ ∗ Lσ and Lp. Similarly, if τ is another (p + 1)-simplex of K, then the spaces |wσ ∗ Lσ| and |wτ ∗ Lτ| intersect in the simplex σ ∩ τ of K, which is the polytope of a subcomplex of Lp and hence of both Lσ and Lτ. It follows from the preceding lemma that Lp+1 is a complex.

Now the complex Lp+1 depends on the choice of the points wσ. It is often the case that one chooses a particular interior point of σ to use for the starring procedure, namely the barycenter of the simplex σ.

Definition 5.30 If σ = [v0, . . . , vp], the barycenter of σ is defined to be the point

$$ \hat{\sigma} = \sum_{i=0}^{p} \frac{1}{p + 1} \, v_i . $$

It is the point of int(σ) all of whose barycentric coordinates with respect to the vertices of σ are equal. If σ is a 1-simplex, then σ̂ is its midpoint. If σ is a 0-simplex, then σ̂ = σ. In general, σ̂ is just the centroid of σ.
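Computing a barycenter amounts to averaging the vertices coordinate by coordinate. The following is a minimal sketch with exact rational arithmetic; the function name barycenter is our own.

```python
from fractions import Fraction

def barycenter(vertices):
    """Average of the vertices: every barycentric coordinate equals 1 / (p + 1)."""
    count = len(vertices)  # this is p + 1
    dim = len(vertices[0])
    return tuple(sum(Fraction(v[k]) for v in vertices) / count for k in range(dim))

midpoint = barycenter([(0, 0), (2, 4)])           # 1-simplex: its midpoint
centroid = barycenter([(0, 0), (3, 0), (0, 3)])   # 2-simplex: its centroid
```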

The procedure for constructing the subdivisions can be described as an iterative process. Let K be a complex. We define a sequence of subdivisions of the skeletons of K as follows: Let $L_0 = K^{(0)}$ be the 0-skeleton of K. In general, if $L_p$ is a subdivision of the p-skeleton of K, let $L_{p+1}$ be the subdivision of the (p+1)-skeleton obtained by starring $L_p$ from the barycenters of the (p+1)-simplices of K. By the preceding lemma, the union of the complexes $L_p$ is a subdivision of K. It is called the first barycentric subdivision of K, and is denoted sd K.

Having formed the first barycentric subdivision, we can now construct its barycentric subdivision sd(sd K), which we will denote by $\mathrm{sd}^2 K$. This complex is called the second barycentric subdivision of K. As we carry out more and more subdivisions, the individual simplices of the subdivisions become progressively smaller. Before proving that this is the case, we need a short lemma which formalizes the structure of a simplicial subdivision.

First, a bit of notation that will be useful. Suppose σ and τ are simplices. If τ is a proper face of σ we write σ ≻ τ.

Lemma 5.31 The complex sd(K) is the collection of all simplices of the form $[\hat\sigma_1, \hat\sigma_2, \ldots, \hat\sigma_n]$ where $\sigma_1 \succ \sigma_2 \succ \cdots \succ \sigma_n$.

Proof: The result will be proved by induction on p, the dimension of the skeleton. It is obvious that the simplices of sd(K) lying in the 0-skeleton $K^{(0)}$ of K are of this form, since each simplex in $K^{(0)}$ is a vertex a of K and $\hat a = a$ for each vertex.

Now suppose that each simplex of sd(K) lying in $|K^{(p)}|$ is of this form. Let τ ∈ sd(K) be a simplex lying in $|K^{(p+1)}|$ and not in $|K^{(p)}|$. Then τ belongs to one of the complexes $\hat\sigma * L_\sigma$, where σ is a (p+1)-simplex of K and $L_\sigma$ is the first barycentric subdivision of the complex consisting of the proper faces of σ. By the induction hypothesis, each simplex of $L_\sigma$ is of the form $[\hat\sigma_1, \ldots, \hat\sigma_n]$, where $\sigma_1 \succ \sigma_2 \succ \cdots \succ \sigma_n$ and $\sigma_1$ is a proper face of σ. Then τ must be of the form $[\hat\sigma, \hat\sigma_1, \ldots, \hat\sigma_n]$, which is of the desired form. □
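The description of sd(K) by chains of faces can be turned into a small enumeration. The following is a minimal sketch for a single simplex, assuming points are stored as tuples of floats; the function names `barycenter` and `top_simplices_of_sd` are illustrative only. It uses the observation that a maximal chain of faces of a p-simplex corresponds to an ordering of its vertices, so there are (p+1)! top-dimensional simplices.

```python
from itertools import permutations

def barycenter(face):
    """Barycenter of a face given as a sequence of points (tuples of floats)."""
    dim = len(face[0])
    return tuple(sum(v[i] for v in face) / len(face) for i in range(dim))

def top_simplices_of_sd(vertices):
    """Top-dimensional simplices of sd(sigma) for sigma = [v_0, ..., v_p].

    A maximal chain of faces sigma_1 > sigma_2 > ... > sigma_{p+1} is the same
    thing as an ordering of the vertices, so we loop over permutations.
    """
    simplices = []
    for order in permutations(range(len(vertices))):
        # Faces of the chain, from the largest (sigma itself) down to a vertex.
        chain = [[vertices[i] for i in order[:k]] for k in range(len(order), 0, -1)]
        simplices.append([barycenter(face) for face in chain])
    return simplices

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sd_triangle = top_simplices_of_sd(triangle)
print(len(sd_triangle))  # 3! = 6 triangles in sd of a triangle
```

Each returned simplex lists the barycenters in the order of the lemma, so its first vertex is always the barycenter of the whole simplex.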

6 Barycentric Subdivisions and the Brouwer Theorem

We can now prove the theorem regarding the shrinking of the sub-simplices in successive barycentric subdivisions.

Theorem 6.1 Given a finite complex K and an ε > 0 there is an integer M such that each simplex of $\mathrm{sd}^M K$ has diameter less than ε.

Proof: Because K is finite, |K| is a subspace of $\mathbb{R}^N$ and |K| is compact. Since all norms on $\mathbb{R}^N$ are equivalent, we choose the uniform metric $|x - y| := \max_{1\le i\le N} |x_i - y_i|$.

First we show that, for a simplex $\sigma = [a_0, a_1, \ldots, a_p]$, the diameter of σ is just $\ell := \max_{i,j} |a_i - a_j|$, i.e., the maximum distance between the vertices of σ. Since $a_i, a_j \in \sigma$ for all i, j, we have diam(σ) ≥ ℓ. In order to prove the reverse inequality, consider an arbitrary x ∈ σ. The ball $B_\ell(a_i) := \{ y \mid |y - a_i| \le \ell \}$ is a convex set, as is easily checked. Since ℓ is the maximum distance between the vertices, $B_\ell(a_i)$ contains all the vertices and, being convex, must contain σ itself. Since x ∈ σ we have $|x - a_i| \le \ell$ for all i = 0, 1, ..., p.

This reasoning shows that, for an arbitrary x ∈ σ, the ball $B_\ell(x)$ contains all the vertices of σ and so contains σ itself. Hence $|x - z| \le \ell$ for all x, z ∈ σ, so diam(σ) = ℓ.

Now we want to get an estimate in terms of the barycenter and the dimension of the simplex. Namely, we want to show that if σ has dimension p, and if $\hat\sigma$ is its barycenter, then

\[ |\hat\sigma - z| \le \frac{p}{p+1}\, \mathrm{diam}(\sigma) , \quad \text{for all } z \in \sigma . \]

To this end, suppose that $\sigma = [a_0, a_1, \ldots, a_p]$. Then we have the estimate

\[ |a_0 - \hat\sigma| = \Big| a_0 - \sum_{i=0}^{p} \frac{1}{p+1}\, a_i \Big| \le \Big| \sum_{i=1}^{p} \frac{1}{p+1}\,(a_0 - a_i) \Big| \le \frac{p}{p+1} \max_i |a_0 - a_i| \le \frac{p}{p+1}\, \mathrm{diam}(\sigma) . \]

The same estimate is valid when $a_j$ replaces $a_0$, and hence we may conclude that the closed ball of radius $\frac{p}{p+1}\,\mathrm{diam}(\sigma)$ centered at $\hat\sigma$ is a convex set containing all the vertices of σ. Hence σ is contained in this ball, and this establishes the required inequality.

Now suppose that σ is a p-simplex and τ is a simplex in the first barycentric subdivision of σ. We want to show that

\[ \mathrm{diam}(\tau) \le \frac{p}{p+1}\, \mathrm{diam}(\sigma) . \]

To do this, we proceed by induction on p. For p = 0 the result is trivial. Suppose it is true in every dimension less than p. By the preceding lemma and the first part of this proof, it suffices to show that if s and s′ are faces of σ with s ≻ s′, then

\[ |\hat s - \hat s'| \le \frac{p}{p+1}\, \mathrm{diam}(\sigma) . \]

If s = σ the inequality follows from the above argument. If s is a proper face of σ of dimension q, then

\[ |\hat s - \hat s'| \le \frac{q}{q+1}\, \mathrm{diam}(\sigma) \le \frac{p}{p+1}\, \mathrm{diam}(\sigma) , \]

the first inequality following from the induction hypothesis, and the second from the fact that the function f(x) = x/(x + 1) is increasing for x > 0.

Finally, suppose K has dimension N and let d be the maximum diameter of a simplex of K. The maximum diameter of a simplex in the M-th barycentric subdivision of K is at most $(N/(N+1))^M d$ which, if M is taken sufficiently large, is less than the preassigned ε. □
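The shrinking estimate of Theorem 6.1 can be checked numerically for a triangle. The sketch below is only an illustration under stated assumptions: it works with a single 2-simplex, builds each subdivision from orderings of the vertices (as in Lemma 5.31), and measures diameters in the uniform metric used in the proof. The helper names are hypothetical.

```python
from itertools import permutations

def max_dist(p, q):
    """Uniform (max-norm) distance between two points."""
    return max(abs(a - b) for a, b in zip(p, q))

def diameter(points):
    """Diameter of a simplex = largest distance between two of its vertices."""
    return max(max_dist(p, q) for p in points for q in points)

def barycenter(face):
    dim = len(face[0])
    return tuple(sum(v[i] for v in face) / len(face) for i in range(dim))

def subdivide(simplex):
    """Top-dimensional simplices of the barycentric subdivision of one simplex."""
    out = []
    for order in permutations(range(len(simplex))):
        chain = [[simplex[i] for i in order[:k]] for k in range(len(order), 0, -1)]
        out.append([barycenter(face) for face in chain])
    return out

def worst_diameters(simplex, levels):
    """Pairs (largest diameter in sd^m, bound (p/(p+1))^m * diam) for m = 1..levels."""
    p = len(simplex) - 1
    ratio = p / (p + 1)
    current, bound, report = [simplex], diameter(simplex), []
    for _ in range(levels):
        current = [t for s in current for t in subdivide(s)]
        bound *= ratio
        report.append((max(diameter(s) for s in current), bound))
    return report

report = worst_diameters([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], 3)
for m, (worst, bound) in enumerate(report, start=1):
    print(m, worst, bound)
```

In each row the first number should not exceed the second, which is the content of the estimate with N = 2.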

We intend to prove Brouwer’s theorem for a simplex in $\mathbb{R}^N$:

Theorem 6.2 Let σ be a non-degenerate simplex in $\mathbb{R}^N$ and f : σ → σ a continuous self-map of σ. Then f has a fixed point.

Before presenting the proof of this theorem we need to make some observations. First, if $x^*$ is a fixed point of f, then the barycentric coordinates of $x^*$ and $f(x^*)$ coincide. For ease of notation, we will write the barycentric coordinates of a point x as $x_k$, and those of f(x) as $f_k(x)$, so that our observation amounts to

\[ x^*_k = f_k(x^*) , \quad k = 0, 1, \ldots, N . \]

With this notation, note that it is equivalent to say, for the fixed point, that $x^*_k \ge f_k(x^*)$ for every k. Trivially, the equality of the barycentric coordinates implies that the inequalities are satisfied. On the other hand, since $\sum_{k=0}^N x^*_k = 1 = \sum_{k=0}^N f_k(x^*)$, the inequalities imply that the barycentric coordinates are equal. So, in order to find a fixed point, we want to find a point x such that, for y = f(x), the barycentric coordinates of x and y satisfy the inequalities

\[ x_i \ge y_i , \quad i = 0, 1, \ldots, N . \]

Now suppose we have an N-dimensional simplex, σ. Then its proper faces have dimensions N − 1, N − 2, ..., 0, and a face of dimension d < N is determined by d + 1 vertices, say $s^{(d)} = [a_p, a_q, \ldots, a_s]$. Then the barycentric coordinates of a point $x \in s^{(d)}$ are $x_p \ge 0, x_q \ge 0, \ldots, x_s \ge 0$, and all other barycentric coordinates are $x_k = 0$. If y = f(x) for $x \in s^{(d)}$, then the inequality $x_k > y_k \ge 0$ cannot occur for k ∉ {p, q, ..., s}. Otherwise said, if $x_k > y_k$ for $x \in s^{(d)}$, then k ∈ {p, q, ..., s}.

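We now define an index function m on the simplex. Let x ∈ σ and y = f(x) ∈ σ with y ≠ x. For each such x, let $I_x \subset Z_N := \{0, 1, \ldots, N\}$ be the set of indices j for which $x_j > y_j$. This set is non-empty, because the coordinates of x and of y both sum to 1 and y ≠ x. Let

\[ m(x) := \min I_x . \]

For each x ∈ σ, $m(x) \in Z_N$, but if x lies on a proper face the values of m are restricted, since in that case $I_x$ contains only indices of vertices of that face.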

With these ideas and notation, we are ready to prove the Brouwer theorem.

Proof: (Brouwer, 1910) Let σ be an N-simplex and let f : σ → σ be continuous. Assume that f has no fixed point. Then each point x ∈ σ has associated with it an integer $m(x) \in Z_N$. Form the n-th barycentric subdivision $\mathrm{sd}^n \sigma$. Each vertex of the subdivision is labelled by m in such a way (because of the restriction of the values of m on a proper face) that the labelling is a Sperner labelling. Hence by Sperner’s lemma, some sub-simplex carries a complete set of labels. Suppose that this sub-simplex has labels and vertices $(m, x_m(n))$, m = 0, 1, ..., N. Now by definition of m, if m(x) = j, then $x_j > y_j$, and so $x_i > y_i$ at $x_i(n)$, i = 0, 1, ..., N.

Now by the preceding theorem, the vertices of the fully labelled sub-simplex satisfy

\[ \max_{0 \le p < q \le N} |x_p(n) - x_q(n)| \le \Big(\frac{N}{N+1}\Big)^{n} \mathrm{diam}(\sigma) \to 0 \quad \text{as } n \to \infty . \]

Now look at the sequence of vertices $\{x_0(n)\}_{n=1}^\infty$. Since this sequence lies in a compact subset of $\mathbb{R}^N$ it has a convergent subsequence. For ease of notation, call this subsequence, again, $\{x_0(n)\}_{n=1}^\infty$. So there exists a point $x^*$ such that $x_0(n) \to x^*$ as n → ∞. Then, by the above estimate, this convergence is true for all of the vertices, i.e., for all i = 0, 1, ..., N we have $x_i(n) \to x^*$ as n → ∞. Hence by continuity of f,

\[ f(x_i(n)) \to f(x^*) =: y^* , \quad i = 0, 1, \ldots, N . \]

But the barycentric coordinates of a point x depend continuously on the point x. So, since $x_i > y_i$ at the vertices $x_i(n)$, in the limit we have $x^*_i \ge y^*_i$ at $x^*$ for i = 0, 1, ..., N, and hence $x^* = y^*$. Hence $f(x^*) = x^*$, contrary to the assumption that f has no fixed point; so f must have a fixed point. □
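The labelling argument in the proof suggests a simple numerical search for approximate fixed points. The following is a minimal sketch for the 2-simplex, not the construction of the proof itself: it replaces the barycentric subdivision by a regular triangulation of mesh 1/n (an assumption made to keep the code short), and the test map `shift_map` is a hypothetical example. Points are given by their barycentric coordinates.

```python
def shift_map(x):
    """Hypothetical test map: cyclic shift of the barycentric coordinates."""
    return (x[1], x[2], x[0])

def label(x, f):
    """Index function m(x) = min{j : x_j > f_j(x)}; None if no such j exists."""
    y = f(x)
    for j in range(len(x)):
        if x[j] > y[j]:
            return j
    return None

def fully_labelled_cell(f, n):
    """Vertices of a cell of mesh 1/n carrying all labels 0, 1, 2.

    If some grid vertex has no label, that vertex is returned on its own,
    since it is then (numerically) a fixed point.
    """
    def point(v):
        return tuple(c / n for c in v)

    for i in range(n):
        for j in range(n - i):
            k = n - 1 - i - j
            cells = [[(i + 1, j, k), (i, j + 1, k), (i, j, k + 1)]]  # "upward" cell
            if k >= 1:  # "downward" cell sharing two of the vertices above
                cells.append([(i + 1, j + 1, k - 1), (i + 1, j, k), (i, j + 1, k)])
            for cell in cells:
                pts = [point(v) for v in cell]
                labels = [label(p, f) for p in pts]
                for p, lab in zip(pts, labels):
                    if lab is None:
                        return [p]
                if set(labels) == {0, 1, 2}:
                    return pts
    return None

def residual(x, f):
    """Largest coordinate of |x - f(x)|."""
    return max(abs(a - b) for a, b in zip(x, f(x)))

n = 50
cell = fully_labelled_cell(shift_map, n)
print(cell, [residual(p, shift_map) for p in cell])
```

For this particular map the vertices of the cell found lie close to the barycenter (1/3, 1/3, 1/3), which is its fixed point; refining the mesh tightens the approximation, mirroring the limit n → ∞ in the proof.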

7 Some Results in Infinite Dimensional Spaces

In this section we consider the situation in a normed linear space. Here the situation is that a continuous map of the unit ball into itself may well not have a fixed point. Here is a counterexample due to Kakutani.

Example 7.1 Let $\ell^2$ be the usual Hilbert space of square-summable sequences with norm given by $\|x\|^2 = \sum_{i=1}^\infty x_i^2$. Consider the unit ball $B_1 \subset \ell^2$ and define a function f on $B_1$ by

\[ y = f(x) = \big( (1 - \|x\|^2)^{1/2}, x_1, x_2, \ldots \big) . \]

It is easy to check that this map is continuous. Moreover

\[ \|y\|^2 = (1 - \|x\|^2) + \sum_{i=1}^\infty x_i^2 = 1 . \]

Hence, in fact f is a self-map of $B_1$. However, f cannot have a fixed point, for, if y = x then

\[ y_1 = (1 - \|x\|^2)^{1/2} = x_1 , \quad y_2 = x_1 = y_1 , \quad \text{etc.,} \]

so that all the components of x must equal a common constant c. If c ≠ 0 then $x \notin \ell^2$. If c = 0 then x = 0, which is impossible because $\|x\| = \|f(x)\| = 1$.
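The norm computation in this example is easy to reproduce for finitely supported sequences. The sketch below is an illustration only: it represents an element of the unit ball by a finite Python list (all later entries being zero), and the name `kakutani_shift` is hypothetical.

```python
import math

def norm(x):
    """Euclidean norm of a finitely supported sequence."""
    return math.sqrt(sum(c * c for c in x))

def kakutani_shift(x):
    """f(x) = ((1 - ||x||^2)^(1/2), x_1, x_2, ...) for a finitely supported x."""
    norm_sq = sum(c * c for c in x)
    # Guard against tiny negative values caused by rounding when ||x|| is 1.
    first = math.sqrt(max(0.0, 1.0 - norm_sq))
    return [first] + list(x)

x = [0.3, -0.2, 0.5]
y = kakutani_shift(x)
print(norm(x), norm(y))
```

Every image lands on the unit sphere, and each application pushes the entries of x one place to the right, which is the mechanism that rules out a fixed point.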

If one examines the proof of the Brouwer Theorem, it is easy to see what has gone wrong: the proof relies heavily on the compactness of the underlying domain, and here, in $\ell^2$, the unit ball fails to be compact. In 1930, Schauder showed that if X is a normed linear space and if f : X → Y ⊂ X is continuous, where Y is relatively compact, then f has a fixed point. We will prove this theorem presently. First, we look at a simple example to see that this requirement on f is sometimes automatically satisfied.

Example 7.2 Consider the non-linear integral equation

\[ x(t) = \int_0^1 e^{-st} \cos(7\,x(s))\, ds , \quad 0 \le t \le 1 . \]

We can think of this equation in operator form as x = Nx, where N stands for the non-linear map defined by the right-hand side. Now consider the Banach space C([0, 1]) with the usual supremum norm. It is clear that, for any x ∈ C([0, 1]), Nx ∈ C([0, 1]). What is crucial is that, writing y = Nx, we have the following estimate:

\[ |y(t)| = \Big| \int_0^1 e^{-st} \cos(7\,x(s))\, ds \Big| \le \int_0^1 e^{-st}\, ds \le 1 . \]

This estimate shows that the range of N is a set of equibounded functions. Moreover,

\[ |y(t) - y(\tau)| \le \int_0^1 |e^{-st} - e^{-s\tau}|\, ds \le \int_0^1 s\,|t - \tau|\, ds = \frac{1}{2}\,|t - \tau| . \]

So the range consists of a family of equibounded, equicontinuous functions and hence, by the Arzelà-Ascoli theorem, is a relatively compact subset of C([0, 1]).
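The two estimates of this example can be observed numerically. The following sketch discretizes the integral with the composite trapezoidal rule (an assumption; any quadrature with non-negative weights would serve) and tests both bounds on randomly generated sample values of x. All function names are illustrative.

```python
import math
import random

def trapezoid_rule(m):
    """Nodes and weights of the composite trapezoidal rule on [0, 1]."""
    h = 1.0 / m
    nodes = [j * h for j in range(m + 1)]
    weights = [h] * (m + 1)
    weights[0] = weights[-1] = h / 2.0
    return nodes, weights

def apply_N(x_values, nodes, weights, t):
    """Discretized (Nx)(t) = sum_j w_j * exp(-s_j t) * cos(7 x(s_j))."""
    return sum(w * math.exp(-s * t) * math.cos(7.0 * xv)
               for s, w, xv in zip(nodes, weights, x_values))

def bounds_hold(trials=200, m=100, seed=0):
    """Check |y(t)| <= 1 and |y(t) - y(tau)| <= |t - tau| / 2 on random data."""
    rng = random.Random(seed)
    nodes, weights = trapezoid_rule(m)
    for _ in range(trials):
        x_values = [rng.uniform(-10.0, 10.0) for _ in nodes]
        t, tau = rng.random(), rng.random()
        y_t = apply_N(x_values, nodes, weights, t)
        y_tau = apply_N(x_values, nodes, weights, tau)
        if abs(y_t) > 1.0 + 1e-12:
            return False
        if abs(y_t - y_tau) > 0.5 * abs(t - tau) + 1e-12:
            return False
    return True

print(bounds_hold())
```

The sample values of x are deliberately rough: the bounds do not depend on x at all, which is exactly why the range of N is equibounded and equicontinuous.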

8 The Schauder Theorem

Definition 8.1 Let X be a normed linear space and E ⊂ X. The transformation T : E → X (not necessarily linear) is said to be compact or completely continuous provided it is continuous and T (M) is relatively compact for every bounded subset M ⊂ E.

As we will see, the usefulness of this class of mappings is in the property that they can be approximated, in the sense of the underlying norm, by maps which have finite dimensional range. The importance of Schauder’s work was that he made explicit use of the notion of compactness. The first use of fixed point theorems in the context of dynamical systems was that of Birkhoff and Kellogg. Birkhoff, working in specific function spaces, settled an outstanding problem of Poincaré; Schauder’s theorem can be viewed as a generalization of the Birkhoff-Kellogg result.

To be more specific about the approximation of this class of mappings, suppose that K ⊂ X is relatively compact and suppose ε > 0 is given. Then the closure $\overline{K}$ is compact and so there is a finite ε-net in K, say $v_1, v_2, \ldots, v_p$. We define a map $F_\varepsilon$ on K as follows:

\[ F_\varepsilon(x) = \frac{\sum_{i=1}^{p} m_i(x)\, v_i}{\sum_{i=1}^{p} m_i(x)} , \quad x \in K , \]

where

\[ m_i(x) = \begin{cases} \varepsilon - \|x - v_i\| , & \text{if } \|x - v_i\| \le \varepsilon , \\ 0 , & \text{if } \|x - v_i\| > \varepsilon . \end{cases} \]

Notice that, for each x, the sum in the denominator does not vanish since the vi form a finite ε-net hence at least one of the vi is within ε of x. Moreover, we note that the right-hand side of the definition of Fε(x) is just a convex combination of the points vi. These maps are called the Schauder projections.
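A Schauder projection is straightforward to compute once an ε-net is available. The sketch below works in the plane with the Euclidean norm; the grid used as an ε-net of the unit square and the value of ε are assumptions chosen for illustration.

```python
import math

def schauder_projection(x, net, eps):
    """F_eps(x): convex combination of net points with weights m_i(x)."""
    # m_i(x) = eps - ||x - v_i|| when this is non-negative, and 0 otherwise.
    weights = [max(0.0, eps - math.dist(x, v)) for v in net]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("x is not within eps of any point of the net")
    return tuple(sum(w * v[i] for w, v in zip(weights, net)) / total
                 for i in range(len(x)))

# A grid of spacing 0.25 is an eps-net of the unit square for eps = 0.2,
# since every point of the square is within 0.25 * sqrt(2) / 2 < 0.2 of the grid.
net = [(0.25 * i, 0.25 * j) for i in range(5) for j in range(5)]
eps = 0.2

x = (0.1, 0.9)
fx = schauder_projection(x, net, eps)
print(fx, math.dist(x, fx))
```

Only net points within ε of x receive positive weight, so the image is a convex combination of nearby net points and stays within ε of x.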

Now if T is a compact transformation with domain M ⊂ X and with range contained in the relatively compact set K, then T is approximated to within ε by $F_\varepsilon \circ T$, as can be seen from the following estimates:

\[ \|T(x) - (F_\varepsilon \circ T)(x)\| = \frac{\Big\| \sum_{i=1}^{p} m_i[T(x)]\, T(x) - \sum_{i=1}^{p} m_i[T(x)]\, v_i \Big\|}{\sum_{i=1}^{p} m_i[T(x)]} \le \frac{\sum_{i=1}^{p} m_i[T(x)]\, \|T(x) - v_i\|}{\sum_{i=1}^{p} m_i[T(x)]} < \varepsilon . \]

We can now easily prove the theorem known as the Schauder Fixed Point Theorem:

Theorem 8.2 (Schauder, 1930) Let K be a closed, bounded, convex subset of a normed linear space X and T a compact transformation defined on K such that T(K) ⊂ K. Then T has a fixed point in K.

Proof: Since T is compact and K is bounded, T(K) is relatively compact; since the set K is closed, the closure $\overline{T(K)}$ is contained in K and so is a compact subset of K. Let $\{\varepsilon_n\}_{n=1}^\infty$ be a null sequence and define the sequence of mappings $\{T_n\}_{n=1}^\infty$ by $T_n := F_{\varepsilon_n} \circ T$, where $F_{\varepsilon_n}$ is as described in the preceding discussion. Since the set K is convex, and $\{v_1, \ldots, v_{p_n}\} \subset T(K) \subset K$, we see that $T_n(K) \subset K$.

Let $X_n$ be the finite dimensional subspace of X which is spanned by the vectors $\{v_1, \ldots, v_{p_n}\}$ and set $K_n = K \cap X_n$. The transformation $T_n$ is certainly defined on $K_n$ and, moreover, $T_n(K_n) \subset K_n$ by definition. From the continuity of T and the definition of $F_\varepsilon$ it is clear that the compositions $T_n$ are continuous from $K_n$ to $K_n$. Hence each $T_n$ has a fixed point by the Brouwer Theorem, i.e., for each n there exists a point $x_n \in K_n$ such that $x_n = T_n(x_n)$.

Now look at the sequence $\{T(x_n)\}_{n=1}^\infty$. This sequence lies in the compact set $\overline{T(K)}$ and hence has a subsequence that converges to a point $x_o \in K$. Without loss of generality, we assume that $T(x_n) \to x_o$ as n → ∞. Then, for any ε > 0, there exists an integer N = N(ε) such that, for n > N(ε), we have $\|T(x_n) - x_o\| < \varepsilon$. From the approximation property we also have that

\[ \|T_n(x_n) - T(x_n)\| < \varepsilon_n , \]

and so, combining these two estimates, we have

\[ \|T_n(x_n) - x_o\| < \varepsilon_n + \varepsilon . \]

Since $T_n(x_n) = x_n$, we see that, in fact, $\|x_n - x_o\| < \varepsilon_n + \varepsilon$. Now, from the continuity of T we know that, given $\varepsilon_o > 0$ there exists a $\delta = \delta(\varepsilon_o) > 0$ such that

\[ \|T(x_n) - T(x_o)\| < \varepsilon_o , \quad \text{provided } \varepsilon + \varepsilon_n < \delta . \]

Thus the sequence $\{T(x_n)\}_{n=1}^\infty$ converges to $T(x_o)$. Hence, by uniqueness of limits,

\[ x_o = T(x_o) . \]

□

The next theorem that we present is due to Schauder, who used the theory of degree of mappings to prove the result. Later, in 1955, H. Schaefer found a proof of this important result that does not involve degree theory. We follow his proof.

Theorem 8.3 Let T be a compact transformation of a normed linear space X into itself. Let $\lambda_o \in [0, 1]$. Then either there is an x ∈ X such that

\[ x = \lambda_o\, T(x) , \]

or the set

\[ \{ x \in X \mid x = \lambda\, T(x) , \ 0 < \lambda < 1 \} \]

is unbounded.

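Proof: Consider the closed unit ball $B_1(0)$ and define $nB_1(0) := \{ y \mid y = n x , \ x \in B_1(0) \}$. Suppose that there is a value $\lambda_o \in (0, 1)$ such that the equation

\[ x = \lambda_o\, T(x) \]

does not have a solution. The strategy is to show that, given a positive integer n, there is an $x_n$ such that

\[ x_n = \mu_n \lambda_o\, T(x_n) , \]

where $\mu_n \in (0, 1)$ and $\|x_n\| = n$. Since $0 < \mu_n \lambda_o < 1$, each such $x_n$ belongs to the set in the statement of the theorem, which is therefore unbounded.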

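To this end, we define a mapping $R_n : X \to X$ as

\[ R_n(x) = \begin{cases} \lambda_o\, T(x) , & \text{if } \lambda_o T(x) \in nB_1(0) , \\[4pt] \dfrac{n}{\|\lambda_o T(x)\|}\, \lambda_o\, T(x) , & \text{if } \lambda_o T(x) \in X \setminus nB_1(0) . \end{cases} \]

Then $R_n$ is continuous. If $x \in nB_1(0)$ and $\lambda_o T(x) \in nB_1(0)$, then

\[ R_n(x) = \lambda_o\, T(x) \in nB_1(0) . \]

If $\lambda_o T(x) \in X \setminus nB_1(0)$, then

\[ \|R_n(x)\| = \Big\| \frac{n}{\|\lambda_o T(x)\|}\, \lambda_o\, T(x) \Big\| = n . \]

Hence, if $x \in nB_1(0)$ then $R_n(x) \in nB_1(0)$.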

Since the map T is compact, the map $R_n$ is also compact. Now $nB_1(0)$ is a bounded, closed, and convex set. Hence by the Schauder Fixed Point Theorem, there is an $x_n \in nB_1(0)$ such that $R_n(x_n) = x_n$.

Now suppose that $\lambda_o T(x_n) \in nB_1(0)$. Then

\[ \lambda_o\, T(x_n) = R_n(x_n) = x_n , \]

which contradicts the property of $\lambda_o$. Therefore

\[ \lambda_o\, T(x_n) \in X \setminus nB_1(0) , \]

and

\[ R_n(x_n) = \frac{n}{\|\lambda_o T(x_n)\|}\, \lambda_o\, T(x_n) = x_n , \]

with $\|x_n\| = n$. Likewise, $\|\lambda_o T(x_n)\| > n$. So

\[ \mu_n := \frac{n}{\|\lambda_o T(x_n)\|} , \quad 0 < \mu_n < 1 , \]

and $x_n = \mu_n \lambda_o\, T(x_n)$, as required. □

9 Set-valued Maps and the Kakutani Theorem

In this section we will introduce the notion of set-valued mappings or, as they are sometimes called, correspondences. Such functions arise in a variety of contexts, even in elementary calculus when we consider the inverse of a function which is not injective. But there are many more important areas, including the theory of algorithms, control theory, optimization theory, and, as we shall see, non-linear operator theory, where such maps are useful to consider. A considerable amount of research about the properties and application of such mappings has been done by mathematical economists, particularly as related to game theory and the theory of equilibrium.

We will give the basic definitions here, together with some simple examples; our object in this section is to prove the fixed point theorem of Kakutani for such mappings which, for example, has been used to prove the existence of Nash equilibria in the theory of n-person noncooperative games, perhaps one of the first applications of the theorem.

Consider two sets, X and Y . We denote the set of all subsets of Y by P(Y ), the power set of Y .

Definition 9.1 A set-valued map defined on the set X is a map Q : X → P(Y ). Thus Q(x) ⊂ Y for all x ∈ X.

Now, since ∅ ∈ P(Y), it is possible for Q(x) = ∅. In applications, it is usual to consider only those maps whose images are non-empty sets. Such maps are called proper and we have the following:

Definition 9.2 The domain of a set-valued map Q, denoted dom(Q), is the set

\[ \{ x \in X \mid Q(x) \ne \emptyset \} . \]

One further definition is particularly important.

Definition 9.3 The graph of a set valued function Q is the subset of X × Y :

{(x, y) ∈ X × Y | y ∈ Q(x)}

We pause to give some elementary examples.

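Example 9.4 Let H : R → R be defined as

\[ H(x) = \begin{cases} -1 , & \text{if } x \le 0 , \\ 1 , & \text{if } x > 0 . \end{cases} \]

It is often useful to “fill in the graph” by defining

\[ \tilde H(x) = \begin{cases} \{-1\} , & \text{if } x < 0 , \\ [-1, 1] , & \text{if } x = 0 , \\ \{1\} , & \text{if } x > 0 . \end{cases} \]

With this definition $\tilde H$ is a set-valued map.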

Example 9.5 Consider a continuous function f : (a, b) × R × R → R and consider the first order differential equation in implicit form $f(t, x, \dot x) = 0$. A solution of such an equation is a function x = x(t), defined perhaps on some subinterval of (a, b), which is absolutely continuous there and for which $f(t, x(t), \dot x(t)) = 0$ almost everywhere.

One way to study the existence of solutions of such equations is to introduce the set-valued function

\[ F(t, x) = \{ v \in \mathbb{R} \mid f(t, x, v) = 0 \} . \]

Then we can rewrite the implicit form as an explicit differential inclusion

\[ \dot x(t) \in F(t, x(t)) , \quad t \in I \subset (a, b) . \]

The next example is particularly important in nonlinear optimization and we will meet it later in the course.

Example 9.6 Let f be an extended real valued convex function defined on a convex subset $C \subset \mathbb{R}^n$. The epigraph of the function f is defined by

\[ \mathrm{epi}(f) = \{ (x, z) \in \mathbb{R}^{n+1} \mid z \ge f(x) , \ x \in C \} . \]

It is well known that the function f is convex if and only if its epigraph is a convex set. A vector $x^* \in \mathbb{R}^n$ is said to be a subgradient of f at the point x ∈ C provided

\[ f(z) \ge f(x) + \langle x^*, z - x \rangle \quad \text{for all } z \in \mathbb{R}^n . \]

This last inequality has a simple geometric interpretation, namely that the affine function $z \mapsto f(x) + \langle x^*, z - x \rangle$ describes a non-vertical supporting hyperplane to the convex set epi(f) at the point (x, f(x)) on its boundary. The set of all subgradients of f at x is called the subdifferential of f at x and is denoted by ∂f(x), while the set-valued mapping ∂f : x ↦ ∂f(x) is called the subdifferential of f. If ∂f(x) ≠ ∅ then the function f is said to be subdifferentiable at x ∈ C. The notion of the subdifferential is central to the modern theory of convex optimization.

As a concrete example, let f(x) = ‖x‖, the Euclidean norm of x. This function is differentiable in the ordinary sense at all x ≠ 0 and hence is subdifferentiable there, with subdifferential $\partial f(x) = \{ x / \|x\| \}$. In order to determine the set ∂f(0), we apply the definition of subdifferential. Thus, ∂f(0) consists of all vectors $x^*$ such that $\|z\| \ge \langle x^*, z \rangle$ for all z, or equivalently $\langle x^*, z/\|z\| \rangle \le 1$ for all z ≠ 0. It follows immediately that ∂f(0) is just the closed unit ball with center at 0 in $\mathbb{R}^n$.
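The description of ∂f(0) for the Euclidean norm can be probed by sampling. The sketch below tests the defining inequality ‖z‖ ≥ ⟨g, z⟩ on a finite set of points z in the plane; a finite sample can only refute membership, never prove it, so this is a plausibility check rather than a verification. The names and the sample vectors are illustrative.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def satisfies_subgradient_inequality(g, samples, tol=1e-12):
    """True if ||z|| >= <g, z> holds at every sample point z."""
    return all(norm(z) >= dot(g, z) - tol for z in samples)

rng = random.Random(0)
samples = [(rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)) for _ in range(1000)]
samples.append((1.0, 0.0))  # a direction that exposes vectors outside the ball

inside = (0.3, 0.4)    # norm 0.5, inside the unit ball
outside = (1.5, 0.0)   # norm 1.5, outside the unit ball

print(satisfies_subgradient_inequality(inside, samples))
print(satisfies_subgradient_inequality(outside, samples))
```

A vector of norm at most 1 passes at every sample by the Cauchy-Schwarz inequality, while a vector of larger norm fails as soon as the sample contains a point in its own direction.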

There are several notions of smoothness for set-valued mappings. We will not discuss them here, but will concentrate on one particular notion, that of upper semicontinuity for set-valued mappings.

Definition 9.7 A set-valued map, Q, is called upper semicontinuous provided its graph is closed. Otherwise said, if $\{x_n\}$ and $\{y_n\}$ are two sequences which converge to $x_o$ and $y_o$ respectively, and if $y_n \in Q(x_n)$ for all n, then $y_o \in Q(x_o)$.

Here is an example.

Example 9.8 Let Q be a set-valued map from the interval [0, 1] to the subsets of [0, 1] defined by

\[ Q(x) = \begin{cases} [4/6, 5/6] , & \text{if } 0 \le x < 1/2 , \\ [4/6, 5/6] \cup [1/6, 2/6] , & \text{if } x = 1/2 , \\ [1/6, 2/6] , & \text{if } 1/2 < x \le 1 . \end{cases} \]

Then the graph of Q is closed. Indeed if xn → 1/2 and yn → yo ∈ [4/6, 5/6] with yn ∈ Q(xn) then yo ∈ Q(1/2) by definition. The same holds if yo ∈ [1/6, 2/6]. We note in passing, however, that the value of Q(1/2) is not a convex set.

On the other hand, if the definition is modified to

\[ \tilde Q(x) = \begin{cases} [4/6, 5/6] , & \text{if } 0 \le x < 1/2 , \\ [4/6, 5/6] , & \text{if } x = 1/2 , \\ [1/6, 2/6] , & \text{if } 1/2 < x \le 1 , \end{cases} \]

then the graph of $\tilde Q$ is not closed. Indeed if we take $x_n := 1/2 + 1/n$ and $y_n = 2/6$ for all n ≥ 2, then $y_n \in \tilde Q(x_n)$ but $2/6 \notin \tilde Q(1/2)$.

The proof of the following lemma will be left as an exercise.

Lemma 9.9 Let X be a metric space and Q : X → P(X) be a set-valued function with closed graph. Then the images Q(x) are closed sets.

The notion of a fixed point of a set-valued mapping is a direct generalization of that for a single-valued map.

Definition 9.10 Let X be the domain of a set-valued map, Q. Then a point x ∈ X is said to be a fixed point of the map Q provided x ∈ Q(x).

We can already see from the preceding example that a set-valued function, even one with a closed graph, need not have a fixed point. Indeed the map Q of that example has a closed graph but fails to have a fixed point, since no point of its graph lies on the line y = x. The obstruction is that, at x = 1/2, the image Q(1/2) is not convex. The map $\tilde Q$, on the other hand, has convex values, but its graph is not closed, and it has no fixed point either. This explains the hypotheses in Kakutani’s theorem that the graph of the set-valued map be closed and that its values be convex.
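Whether a point satisfies x ∈ Q(x) for the two maps of Example 9.8 can be checked directly. The following sketch encodes each value of the maps as a list of closed intervals with exact rational endpoints and scans a grid of [0, 1]; the grid spacing of 1/600 is an arbitrary choice, and a grid scan is a sanity check, not a proof.

```python
from fractions import Fraction as Fr

UPPER = (Fr(4, 6), Fr(5, 6))
LOWER = (Fr(1, 6), Fr(2, 6))

def Q(x):
    """Values of Q from Example 9.8, as a list of closed intervals (lo, hi)."""
    if x < Fr(1, 2):
        return [UPPER]
    if x == Fr(1, 2):
        return [UPPER, LOWER]
    return [LOWER]

def Q_tilde(x):
    """Values of the modified map, which takes the upper interval at x = 1/2."""
    return [UPPER] if x <= Fr(1, 2) else [LOWER]

def is_fixed_point(F, x):
    """True if x belongs to F(x)."""
    return any(lo <= x <= hi for lo, hi in F(x))

grid = [Fr(k, 600) for k in range(601)]
fixed_Q = [x for x in grid if is_fixed_point(Q, x)]
fixed_Q_tilde = [x for x in grid if is_fixed_point(Q_tilde, x)]
print(fixed_Q, fixed_Q_tilde)
```

On this grid neither map has a point with x ∈ Q(x): for x ≤ 1/2 the values lie above 1/2 or, at x = 1/2, on either side of it, and for x > 1/2 they lie below 1/2.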

We are now ready to state and prove Kakutani’s famous fixed-point theorem.

Theorem 9.11 Let $X \subset \mathbb{R}^n$ be a closed, bounded, and convex set.¹ Let Q : X → P(X) have non-empty, convex values. Suppose, further, that the graph of Q is closed. Then Q has a fixed point in X.

Proof: We prove the theorem, as usual, for X a non-degenerate simplex in $\mathbb{R}^n$. Indeed let $X = [a_0, a_1, \ldots, a_n]$. Now, for each integer p we consider the p-th barycentric subdivision of X and define a continuous function $f^{(p)}$ as follows: if x is a vertex of any cell in the subdivision, let y be an arbitrary point of Q(x) and set $f^{(p)}(x) = y \in Q(x)$. If x is not such a vertex, then x lies in some cell of the subdivision, say $x \in [a_0^{(p)}, \ldots, a_n^{(p)}]$. Then x is a convex combination of these vertices, say

\[ x = \sum_{j=0}^{n} \lambda_j^{(p)} a_j^{(p)} , \quad \lambda_j^{(p)} \ge 0 , \quad \sum_{j=0}^{n} \lambda_j^{(p)} = 1 , \]

and then we set

\[ f^{(p)}(x) = \sum_{j=0}^{n} \lambda_j^{(p)} f^{(p)}(a_j^{(p)}) . \]

Note that, since the barycentric coordinates of points are unique, if x lies on a common face, the two definitions coincide on the common face.

Now it is clear that the various maps $f^{(p)}$ are continuous maps of the simplex X into itself. Hence the Brouwer theorem guarantees that each has a fixed point, say a point $x_*^{(p)}$ such that $f^{(p)}(x_*^{(p)}) = x_*^{(p)}$. If, by chance, any of these fixed points is a vertex, then, by construction, it is a fixed point of Q and the proof is complete.

If, on the other hand, none of these points are vertices then, for a given p, we have

\[ x_*^{(p)} = \sum_{j=0}^{n} \lambda_j^{(p)} a_j^{(p)} \]

and so, using the definition of $f^{(p)}$ and the fact that $x_*^{(p)}$ is its fixed point, we have

\[ x_*^{(p)} = \sum_{j=0}^{n} \lambda_j^{(p)} y_j^{(p)} , \]

where

\[ y_j^{(p)} = f^{(p)}(a_j^{(p)}) \in Q(a_j^{(p)}) , \quad j = 0, 1, 2, \ldots, n . \]

¹ The assumption that X is a convex body, in particular that it has interior points, is not restrictive. If the set has more than one point, then there exists an integer m such that the set has interior points relative to $\mathbb{R}^m$.

We now have 2n + 3 sequences, all of which lie in compact sets, namely the sequence of fixed points $\{x_*^{(p)}\}_{p=1}^\infty$, the n + 1 sequences of their barycentric coordinates $\{\lambda_j^{(p)}\}_{p=1}^\infty$ for j = 0, ..., n, and the n + 1 sequences $\{y_j^{(p)}\}_{p=1}^\infty$ for j = 0, ..., n. The first and last of these lie in the simplex X, which is closed and bounded, while the barycentric coordinates all lie in the interval [0, 1].

By a standard application of the Bolzano-Weierstrass Theorem, we may assume that all these sequences converge as p → ∞. Thus

\[ x_*^{(p)} \to x_* , \qquad \lambda_j^{(p)} \to \lambda_j , \qquad y_j^{(p)} \to y_j \qquad \text{as } p \to \infty , \quad j = 0, 1, \ldots, n . \]

Now, as the diameters of the subcells approach 0 as p → ∞, the convergence of the fixed points to $x_*$ implies that the vertices $a_j^{(p)} \to x_*$ as p → ∞ for all j = 0, ..., n. Moreover we must have

\[ x_* = \sum_{j=0}^{n} \lambda_j y_j , \quad \lambda_j \ge 0 , \quad \sum_{j=0}^{n} \lambda_j = 1 . \]

It remains only to apply the hypothesis that the graph of Q is closed, for we have

\[ y_j^{(p)} \in Q(a_j^{(p)}) , \quad a_j^{(p)} \to x_* , \quad y_j^{(p)} \to y_j , \]

hence we must have $y_j \in Q(x_*)$. But $Q(x_*)$ is convex and $x_*$ is a convex combination of the $y_j$. Hence $x_* \in Q(x_*)$. □

10 The Infinite Dimensional Case

The theorem of Kakutani has been generalized in several directions. In these notes, we present only the simplest of these, due to Bohnenblust and Karlin (1950). Their result extends Kakutani’s theorem to Banach spaces under suitable convexity assumptions.

Theorem 10.1 Let C be a closed, bounded, convex subset of a Banach space X, and assume that Q : C → P(C) has non-empty, convex values and closed graph. If $\cup_{x \in C} Q(x) \subset T \subset X$, where T is sequentially compact, then Q has a fixed point in C.

Proof: The intersection C ∩ T is non-empty and sequentially compact. Given an ε > 0 there exists a finite ε-net $\{v_1, \ldots, v_p\} \subset C \cap T$. Let $S_o = \mathrm{co}\,\{v_1, \ldots, v_p\}$. Then if y ∈ Q(x) the distance of y from $S_o$ is less than ε. Denote by $[Q(x)]_\varepsilon$ the set of all points of X which are at a distance at most ε from Q(x). The sets $[Q(x)]_\varepsilon$ are closed and convex, and the set $\mathrm{co}\,(S_o \cap [Q(x)]_\varepsilon)$ is non-empty. Denote this closed, convex set by $\tilde Q(x)$. Note that $\tilde Q(x) \subset S_o$.

Now, suppose that the sequences $\{x_n\}_{n=1}^\infty$ and $\{z_n\}_{n=1}^\infty$ are such that $x_n \to x_o$ and $z_n \to z_o$ as n → ∞, with $z_n \in \tilde Q(x_n)$. Then, by construction, there exists $y_n$ such that $y_n \in Q(x_n)$ and $\|y_n - z_n\| \le \varepsilon$. But the sequence $\{y_n\}_{n=1}^\infty$ lies in the sequentially compact set T and so we may assume that $y_n \to y_o$ for some $y_o$. Clearly, $\|z_o - y_o\| \le \varepsilon$ and $y_o \in Q(x_o)$. Thus $z_o \in \tilde Q(x_o)$, and Kakutani’s theorem can be applied to the set $S_o$ with the correspondence $x \mapsto \tilde Q(x)$.

Now for each ε > 0 we can construct an x ∈ C (and even in $S_o$!) and a y ∈ Q(x), such that ‖x − y‖ ≤ ε. Choose a null sequence $\varepsilon_n$ and determine, for each n, a pair $x_n, y_n$ as above. We may assume that $y_n \to y_o$. Since $\|x_n - y_n\| \le \varepsilon_n$, we have $x_n \to y_o$ as well. Hence $y_o \in Q(y_o)$ since the graph of Q is closed. Thus $y_o$ is the required fixed point. □

11 Caristi’s Fixed Point Theorem

In order to present the next fixed point result, we recall a basic property of complete metric spaces which is embodied in Cantor’s Intersection Theorem.

Theorem 11.1 Let X be a complete metric space and let $\{F_n\}_{n=1}^\infty$ be a decreasing sequence² of non-empty closed subsets of X such that diam($F_n$) → 0. Then $F = \cap_{n=1}^\infty F_n$ contains exactly one point.

Proof: Since diam($F_n$) → 0 it is clear that F cannot contain more than one point. Hence we only need to show that F ≠ ∅. For each n, let $x_n \in F_n$. It is clear from diam($F_n$) → 0 that the sequence $\{x_n\}_{n=1}^\infty$ is Cauchy. Since X is complete, this sequence converges to a limit, $x_o \in X$.

In order to see that $x_o \in F$, let $n_o$ be arbitrary. If the Cauchy sequence has only finitely many distinct points, then $x_o$ is the point that is repeated infinitely often and is therefore in $F_{n_o}$. If the sequence has infinitely many distinct points, then $x_o$ is the limit point of the set of points of the sequence. It is a limit point of the subset $\{x_n \mid n \ge n_o\}$ of the original sequence, and so it is a limit point of $F_{n_o}$, since the sequence of sets $F_n$ is a descending sequence. Thus, since $F_{n_o}$ is closed, $x_o \in F_{n_o}$. □

² Recall that this means that $F_1 \supset F_2 \supset \cdots$.

Now let ϕ : X → R be a real-valued function on a metric space {X, ρ} and let λ > 0. Following Bishop and Phelps, we define a relation $\prec_{\varphi,\lambda}$ on X by

\[ x \prec_{\varphi,\lambda} y \quad \text{if and only if} \quad \lambda\, \rho(x, y) \le \varphi(x) - \varphi(y) . \]

It is easy to check that $\prec_{\varphi,\lambda}$ is a partial order on X. Indeed, the transitivity of $\prec_{\varphi,\lambda}$ follows from the triangle inequality, as is trivial to check. Reflexivity and antisymmetry follow from properties of ρ. The ordered space will be denoted $X_{\varphi,\lambda}$. Observe that if $x, y \in X_{\varphi,\lambda}$ are comparable, i.e., if either $x \prec_{\varphi,\lambda} y$ or $y \prec_{\varphi,\lambda} x$, then ϕ(y) ≤ ϕ(x) ensures that the former relation is true. Indeed, if $x \prec_{\varphi,\lambda} y$ held with ϕ(x) < ϕ(y), then λ ρ(x, y) ≤ ϕ(x) − ϕ(y) < 0, contradicting the non-negativity of λ ρ(x, y).

We recall the following

Definition 11.2 A function ϕ : X → R is called lower semicontinuous or simply l.s.c., provided {x ∈ X | ϕ(x) ≤ α} is closed for each α ∈ R. It is called upper semicontinuous or u.s.c., if the function −ϕ is l.s.c.

Our goal is to establish a fixed point theorem due to Caristi (1976) which is of particular interest since it does not require the function whose fixed point we seek to be continuous. There are several proofs of this result; it is often proved by using the Ekeland Variational Principle. Indeed, one can show that these two theorems are equivalent in the sense that each can be derived from the other. Other proofs have been offered in the literature. Here we will prove Caristi’s result by using a result of Bishop and Phelps.

Theorem 11.3 Let {X, ρ} be a complete metric space and ϕ : X → R be a lower semicontinuous function with a finite lower bound. Then, for any $x_o \in X_{\varphi,\lambda}$ there is a maximal element $x^\star \in X_{\varphi,\lambda}$ with $x_o \prec_{\varphi,\lambda} x^\star$. Precisely: for any $x_o \in X$ there is an $x^\star \in X$ such that

\[ \varphi(x^\star) + \lambda\, \rho(x_o, x^\star) \le \varphi(x_o) , \]

and

\[ \varphi(x^\star) < \varphi(x) + \lambda\, \rho(x, x^\star) , \quad \text{for any } x \ne x^\star . \]

Proof: Without loss of generality, we may assume that λ = 1 and consider $X_\varphi := X_{\varphi,1}$. For any $z \in X_\varphi$, denote the terminal tail $\{ y \mid z \prec_{\varphi,1} y \}$ by T(z). We note that, since

\[ T(z) = \{ y \mid \varphi(y) + \rho(z, y) \le \varphi(z) \} \]

and the map y ↦ ϕ(y) + ρ(z, y) is lower semicontinuous, each set T(z) is closed in X.

Now, let $x_o \in X_\varphi$ be given. We construct an ascending sequence $x_o \prec_{\varphi,1} x_1 \prec_{\varphi,1} x_2 \prec_{\varphi,1} \cdots$ inductively, first choosing $x_1 \in T(x_o)$ so that $\varphi(x_1) \le 1 + \inf[\varphi(T(x_o))]$ and, when $x_1, x_2, \ldots, x_{n-1}$ have been selected, choosing $x_n \in T(x_{n-1})$ so that

\[ \varphi(x_n) \le \frac{1}{n} + \inf[\varphi(T(x_{n-1}))] . \]

Now the sequence $T(x_o) \supset T(x_1) \supset \cdots$ of closed sets is clearly descending. Moreover, the diameters of these sets tend to zero. In fact, for a given n ≥ 1, if $\xi \in T(x_n) \subset T(x_{n-1})$, we have $\varphi(\xi) \ge \inf[\varphi(T(x_{n-1}))] \ge \varphi(x_n) - 1/n$, so that, since $x_n \prec_{\varphi,1} \xi$, we find

\[ \rho(x_n, \xi) \le \varphi(x_n) - \varphi(\xi) \le \frac{1}{n} . \]

This latter inequality implies that diam($T(x_n)$) ≤ 2/n for each n ≥ 1 and so, by Cantor’s theorem, there is a unique $x^\star \in \cap_{n=0}^\infty T(x_n)$.

Since $x^\star \in T(x_o)$, we have $x_o \prec_{\varphi,1} x^\star$. Moreover $x^\star$ is maximal in $X_\varphi$ for, if $x^\star \prec_{\varphi,1} z$, then $x_n \prec_{\varphi,1} x^\star \prec_{\varphi,1} z$ for all n ≥ 0, so $z \in \cap_{n=0}^\infty T(x_n)$ and therefore $z = x^\star$. □

The proof of Caristi’s theorem is immediate. Here is the theorem:

Theorem 11.4 (Caristi) Let {X, ρ} be a complete metric space and ϕ : X → R be a lower semicontinuous function with a finite lower bound. Let F : X → X be any (not necessarily continuous) function such that ρ(x, F (x)) ≤ ϕ(x) − ϕ(F (x)) for each x ∈ X. Then F has a fixed point.

Proof: Consider the partially ordered set $X_\varphi$ and let $x_o$ be a maximal element, which exists by Theorem 11.3. Since $\rho(x_o, F(x_o)) \le \varphi(x_o) - \varphi(F(x_o))$, we have $x_o \prec_{\varphi,1} F(x_o)$ in $X_\varphi$, and since $x_o$ is maximal, it follows that $x_o = F(x_o)$. □

Notice that the Banach fixed point theorem is a special case of the Caristi theorem. Indeed, suppose that F is a contraction on {X, ρ}, a complete metric space. If α with 0 ≤ α < 1 is the contraction constant then $\rho(F(x), F^2(x)) \le \alpha\, \rho(x, F(x))$. Therefore

\[ \rho(x, F(x)) - \alpha\, \rho(x, F(x)) \le \rho(x, F(x)) - \rho(F(x), F^2(x)) , \]

so, with the nonnegative continuous function $\varphi(x) := (1 - \alpha)^{-1} \rho(x, F(x))$, dividing by 1 − α gives $\rho(x, F(x)) \le \varphi(x) - \varphi(F(x))$. Thus the conditions of Caristi’s theorem are satisfied and F has a fixed point.
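The reduction of the Banach theorem to Caristi’s condition can be tested on a concrete contraction. The sketch below uses the hypothetical example F(x) = α cos x on the real line with α = 1/2, which is a contraction with constant α because |sin| ≤ 1; it evaluates the gap ϕ(x) − ϕ(F(x)) − ρ(x, F(x)), which Caristi’s condition requires to be non-negative, and then iterates F to locate the fixed point.

```python
import math

ALPHA = 0.5  # contraction constant of the hypothetical example below

def F(x):
    """A contraction of the real line: |F(x) - F(y)| <= ALPHA * |x - y|."""
    return ALPHA * math.cos(x)

def rho(x, y):
    return abs(x - y)

def phi(x):
    """phi(x) = rho(x, F(x)) / (1 - alpha), the function used in the text."""
    return rho(x, F(x)) / (1.0 - ALPHA)

def caristi_gap(x):
    """phi(x) - phi(F(x)) - rho(x, F(x)); Caristi's condition says this is >= 0."""
    return phi(x) - phi(F(x)) - rho(x, F(x))

samples = [-10.0 + 0.1 * k for k in range(201)]
print(min(caristi_gap(x) for x in samples))

# Iterating F from any starting point converges to the fixed point.
x = 0.0
for _ in range(100):
    x = F(x)
print(x, rho(x, F(x)))
```

The smallest gap over the samples should be non-negative up to rounding, and the iterate should satisfy x = F(x) to machine precision.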
