Christoffel Revisited

Bas Fagginger Auer

September 1, 2009

Master’s Thesis
Mathematical Institute
Utrecht University
Supervisor: Prof. Dr. J. J. Duistermaat

This is a digitally edited photograph of Elwin Bruno Christoffel, the original of which was taken from the History of Mathematics archive of the School of Mathematics and Statistics of the University of St Andrews, Scotland, http://www-history.mcs.st-andrews.ac.uk/PictDisplay/Christoffel.html.

Acknowledgements

Before we start with the actual thesis, I would very much like to thank a few people whose input and putting up with my ramblings have made it possible to finish this thesis in its current form.

I am very much indebted to my supervisor, professor Hans Duistermaat, for introducing me to this subject and helping me overcome many mathematical difficulties (in particular showing me simpler and more elegant ways to prove a great number of results) during our fruitful and pleasant discussions.

My girlfriend, Hedwig van Driel, for painstakingly proofreading this entire document, her care, stuffing me with food, our lovely evenings, and being firm with me when I was procrastinating; thank you.

Finally I would like to thank the numerous people whom I have bothered with my questions and unasked-for exposition of my results, in particular: Mathijs Wintraecken, Matthijs van Dorp, Job Kuit, Albert-Jan Yzelman, Jan Jitse Venselaar, and Jaap Eldering, as well as my parents for their continued support and interest.

Thank you all for your kind help,

Bas Fagginger Auer.

Abstract

This thesis discusses E. B. Christoffel’s famous article from 1869 (an English translation is also included), generalises it to the more general setting of locally convex Hausdorff topological vector spaces over $\mathbb{R}$ or $\mathbb{C}$, and furthermore establishes a partial converse to the results of Christoffel in Banach spaces. Apart from this we rigorously discuss the aforementioned more general setting by introducing the concepts related to this setting from the ground up. This includes in particular a non-standard notion for the derivative and proofs of many useful statements, among which the open mapping theorem, closed graph theorem, fundamental theorem of integration, and the Taylor approximation theorem.

Contents

1 Introduction
    1.1 Notation
    1.2 Overview

2 Topology
    2.1 Topological spaces
    2.2 Separation axioms
    2.3 Sequences
    2.4 Compactness
    2.5 Metric spaces

3 Algebra
    3.1 Groups
    3.2 Rings
    3.3 Modules

4 Topology and algebra
    4.1 Topological modules
    4.2 Normed modules
    4.3 Topological vector spaces
    4.4 F-spaces
    4.5 Local convexity

5 Analysis
    5.1 Differentiation
    5.2 Multilinear families
    5.3 Integration
    5.4 Fréchet spaces
    5.5 Banach spaces

6 Revisiting Christoffel’s article
    6.1 Preliminaries
    6.2 Generalisation
    6.3 Digression
    6.4 Simple metrics
    6.5 Making metrics simple
    6.6 Digression (cont’d)

7 Conclusion

8 Translation

Bibliography

Chapter 1

Introduction

Welcome to this thesis, which is concerned with generalising E. B. Christoffel’s article, [Chr1869], and developing the underlying theory of the setting in which this generalisation should take place. The level of the material contained in this thesis should be appropriate for any master’s student of Mathematics, as we develop the theory mostly from the ground up, relying only on results established in basic set theory and analysis on $\mathbb{R}$ and $\mathbb{C}$. However, a familiarity with topology, higher-dimensional analysis, and differential geometry will be very helpful for understanding the structure of this document, the included examples, and the reasons for adopting certain definitions.

We are interested in Christoffel’s article because of its paramount importance for the development of differential geometry at the end of the 19th century. Through the introduction of the Christoffel symbols (in [Chr1869] denoted by $\left\{ {ij \atop k} \right\}$, but in differential geometry conventionally by $\Gamma$), the Riemann curvature tensor (in [Chr1869] denoted by $(ijkl)$ and now usually by $R$), and the means of covariant differentiation by using the Christoffel symbols, Christoffel provided very useful tools for the further development of differential geometry, as was undertaken by Gregorio Ricci-Curbastro and Tullio Levi-Civita (see [StAndrews]). These developments in turn permitted Albert Einstein to formulate his theory of general relativity entirely in terms of differential geometry, which was a major step in the physical modeling of the effects of gravity and electromagnetism in celestial mechanics.

As differential geometry and general relativity are both still being practised by a great number of mathematicians and physicists today, this makes Christoffel’s article highly influential and very interesting to investigate further. Even more so because of the geometrical way in which the Christoffel symbols are currently introduced in differential geometry (via an affine connection on a vector bundle, see [Ban2008]), which is not at all like the algebraic way in which they were used by Christoffel as tools to determine whether or not two given metrics could be transformed into one another via an appropriate coordinate transformation.


1.1 Notation

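To ensure a concise treatment of the discussed material, we will strive to use the same symbols to denote the same type of objects. However, this is not always possible when a large number of objects is being discussed at the same time, so the following table is only meant to give an indication.

$A, B, \ldots$    Sets.
$a, a', a_1, a_2, \ldots$    Elements of the set $A$.
$U, V, \ldots$    Open subsets of $A, B, \ldots$ respectively. See Definition (2.1.2).
$\mathcal{A}, \mathcal{B}, \ldots$    Collections of subsets of $A, B, \ldots$ respectively.
$f, g, \ldots$    Functions between sets.
$i, j, \ldots$    Indices of objects.
$k, l, \ldots$    Elements of $\mathbb{N}$.
$\sim$    An equivalence relation.
$\alpha, \beta, \ldots$    Scalars, usually values in either $\mathbb{R}$ or $\mathbb{C}$.

For collections of numbers we will employ the usual notation.

$\mathbb{N}$    The natural numbers: $1, 2, 3, \ldots$.
$\mathbb{N}_0$    The natural numbers together with zero: $0, 1, 2, 3, \ldots$.
$\hat{\mathbb{N}}$    The natural numbers extended with infinity and considered as a topological space: $1, 2, 3, \ldots, \infty$. See Example (2.3.2).
$\mathbb{Z}$    The integers: $\ldots, -2, -1, 0, 1, 2, 3, \ldots$.
$\mathbb{Q}$    The rationals.
$\mathbb{R}$    The real numbers.
$\mathbb{C}$    The complex numbers, identified with the plane $\mathbb{R}^2$.
$\mathbb{K}$    Refers to either $\mathbb{R}$ or $\mathbb{C}$.
$]\alpha, \beta[$    The open interval $\{\gamma \in \mathbb{R} \mid \alpha < \gamma < \beta\} \subseteq \mathbb{R}$.
$[\alpha, \beta]$    The closed interval $\{\gamma \in \mathbb{R} \mid \alpha \leq \gamma \leq \beta\} \subseteq \mathbb{R}$.
$]\alpha, \infty[$    The open interval $\{\gamma \in \mathbb{R} \mid \alpha < \gamma\} \subseteq \mathbb{R}$.
$]-\infty, \alpha[$    The open interval $\{\gamma \in \mathbb{R} \mid \alpha > \gamma\} \subseteq \mathbb{R}$.
$]-\infty, \infty[$    The real line $\mathbb{R}$.

As well as the following notation for set operations.

$\emptyset$    The empty set.
$A \setminus B$    The complement of the set $B$ in $A$, $\{a \in A \mid a \notin B\}$.
$\bigcup \mathcal{A}$    The union of all sets in $\mathcal{A}$, $\{a \mid \exists A \in \mathcal{A} : a \in A\}$.
$\bigcup_{i \in I} A_i$    Defined as $\bigcup \{A_i \mid i \in I\}$.
$A_1 \cup \ldots \cup A_k$    Defined as $\bigcup \{A_1, \ldots, A_k\}$.
$\bigcap \mathcal{A}$    The intersection of all sets in $\mathcal{A}$, $\{a \mid \forall A \in \mathcal{A} : a \in A\}$.
$\bigcap_{i \in I} A_i$    Defined as $\bigcap \{A_i \mid i \in I\}$.
$A_1 \cap \ldots \cap A_k$    Defined as $\bigcap \{A_1, \ldots, A_k\}$.
$\coprod \mathcal{A}$    The disjoint union of all sets in $\mathcal{A}$, the set $\{(A, a) \mid A \in \mathcal{A}, a \in A\}$.
$\coprod_{i \in I} A_i$    Defined as $\{(i, a) \mid i \in I, a \in A_i\}$.
$\prod \mathcal{A}$    The product of all sets in $\mathcal{A}$, the set $\{g : \mathcal{A} \to \bigcup \mathcal{A} \mid \forall A \in \mathcal{A} : g(A) \in A\}$.
$\prod_{i \in I} A_i$    Defined as $\{g : I \to \bigcup_{i \in I} A_i \mid \forall i \in I : g(i) \in A_i\}$.
$A_1 \times \ldots \times A_k$    Defined as $\{(a_1, \ldots, a_k) \mid a_1 \in A_1, \ldots, a_k \in A_k\}$.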


Together with their usual identifications.¹ ² ³

¹ We identify $\coprod_{i \in I} A_i$ with $\coprod \{A_i \mid i \in I\}$ via $(i, a) \leftrightarrow (A_i, a)$.
² We identify $\prod_{i \in I} A_i$ with $\prod \{A_i \mid i \in I\}$ via $(A_i \mapsto g(A_i)) \leftrightarrow (i \mapsto g(A_i))$.
³ We identify $A_1 \times \ldots \times A_k$ with $\prod_{i \in \{1, \ldots, k\}} A_i$ via $(a_1, \ldots, a_k) \leftrightarrow (i \mapsto a_i)$.

We will also use the following symbols.

$\operatorname{dom} f$    The domain of a function, for $f : A \to B$, $\operatorname{dom} f := A$.
$\operatorname{im} f$    The image of a function, for $f : A \to B$, $\operatorname{im} f := f(A) := \{b \in B \mid \exists a \in A : f(a) = b\} \subseteq B$.
$\operatorname{graph} f$    The graph of a function $f : A \to B$, $\operatorname{graph} f := \{(a, b) \in A \times B \mid f(a) = b\} \subseteq A \times B$.
$f^{-1}(\cdot)$    The pre-image of a set $C \subseteq B$ under a function $f : A \to B$, $f^{-1}(C) := \{a \in A \mid f(a) \in C\} \subseteq A$.
$\operatorname{id}_A$    The identity map, for a given set $A$, $\operatorname{id}_A : A \to A : a \mapsto a$.
$\operatorname{sgn}$    The sign of a number, $\operatorname{sgn} : \mathbb{R} \to \{-1, 0, +1\}$ where $\operatorname{sgn}(\alpha) := \begin{cases} -1 & \alpha < 0, \\ 0 & \alpha = 0, \\ +1 & \alpha > 0. \end{cases}$
$\operatorname{Re}, \operatorname{Im}$    The real and imaginary parts of a complex number, for $z = (x, y) = x + i\,y \in \mathbb{C} \simeq \mathbb{R}^2$, $\operatorname{Re}(z) := x$, $\operatorname{Im}(z) := y$.
$\mathcal{P}(A)$    The collection of subsets or powerset of a set $A$, $\mathcal{P}(A) := \{B \mid B \subseteq A\}$.
$\operatorname{int}(A)$    Interior of a set $A$, Definition (2.1.2).
$\overline{A}$    Closure of a set $A$, Definition (2.1.2).
$\mathcal{T}(\mathcal{A})$    The topology generated by $\mathcal{A}$, Definition (2.1.6).
$S_k$    Group of permutations of $\{1, \ldots, k\}$, Example (3.1.4).
Abc    Absorbent, balanced, and convex, Definition (4.3.4).
$D^k_a f$    The $k$-th derivative of a function $f$ at $a$, Definition (5.1.1) and Definition (5.1.9).
$C^k(U, B)$    The set of all $k$-times continuously differentiable functions from an open set $U \subseteq A$ to $B$, Definition (5.1.9).
$\int_\alpha^\beta f$    The integral of a function $f$ over the interval $[\alpha, \beta]$, see Definition (5.3.2).
$L(A, B)$    Space of all continuous linear maps between Banach spaces $A$ and $B$, see Definition (5.5.4).
$e^{\cdot f}$    The flow of $f$, Theorem (5.5.10).

Where our notation truly differs from what is usual is in denoting properties of objects. Because later objects, in particular topological vector spaces, can have a large number of different properties, these properties are cumbersome to fully write out in words. Therefore we will employ a shorthand in the form of coloured icons, each of which denotes an object type or property.

For algebraic objects, we furthermore employ $B \leq A$ to indicate that $B \subseteq A$ and that $B$ is an algebraic object of the same type as $A$, with regard to the restrictions of the algebraic operators (i.e. addition, multiplication, . . . ) from $A$ to $B$.
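The set-theoretic definitions of the disjoint union and the product, together with the identification of tuples with functions, can be tried out on small finite families. The following is only an illustrative sketch: the function names `disjoint_union`, `product_set`, and `tuple_as_function` are hypothetical, a family $\{A_i \mid i \in I\}$ is assumed to be given as a Python dict mapping each index to a finite set, and elements and indices are assumed to be sortable.

```python
from itertools import product


def disjoint_union(family):
    """Return {(i, a) | i in I, a in A_i} for a family given as a dict i -> A_i."""
    return {(i, a) for i, members in family.items() for a in members}


def product_set(family):
    """Return all choice functions g with g(i) in A_i, each encoded as a dict."""
    indices = sorted(family)
    factors = [sorted(family[i]) for i in indices]
    return [dict(zip(indices, choice)) for choice in product(*factors)]


def tuple_as_function(values):
    """Encode the tuple (a_1, ..., a_k) as the function i -> a_i on {1, ..., k}."""
    return {i: a for i, a in enumerate(values, start=1)}


family = {1: {"a", "b"}, 2: {"a"}}
tagged = disjoint_union(family)   # "a" occurs twice, once per index
functions = product_set(family)   # one dict per element of the product
```

Note that the disjoint union keeps both copies of `"a"`, whereas the ordinary union `family[1] | family[2]` would merge them.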


[T]    Topological space, Definition (2.1.1).
[c]    Continuity, Definition (2.1.11).
[T0]-[T6]    Separation class, Definition (2.2.1).
[1C]    First countability, Definition (2.3.3).
[Cpt]    Compactness, Definition (2.4.2).
[d(.,.)]    Metric space, Definition (2.5.1).
[G]    Group, Definition (3.1.1).
[R]    Ring, Definition (3.2.1).
[F]    Field, Definition (3.2.6).
[M]    Module, Definition (3.3.1).
[l]    Linearity, Definition (3.3.3).
[Vs]    Vector space, Definition (3.3.21).
[TG]    Topological group, Definition (4.1.1).
[TR]    Topological ring, Definition (4.1.2).
[TM]    Topological module, Definition (4.1.3).
[||.||]    Normed module, Definition (4.2.3).
[TF]    Topological field, Definition (4.3.1).
[TVs]    Topological vector space, Definition (4.3.3).
[FS]    F-space, Definition (4.4.1).
[LC]    Local convexity, Definition (4.5.1).
[UC]    Uniform completeness, Definition (4.5.9).
[d]    Differentiability, Definition (5.1.1).
[Fr]    Fréchet space, Definition (5.4.1).
[Ba]    Banach space, Definition (5.5.1).

It should be noted that all icons representing topological properties are coloured green, the icons representing algebraical properties blue, and the icons representing properties depending on both topology and algebra green-blue. Furthermore, these icons permit us to easily talk about maps preserving a certain property: instead of a homeomorphism between two topological spaces we can talk about a [T]-isomorphism, instead of a homomorphism we can talk about a [G]-morphism, etc. This ensures that we do not have to create names for different types of maps preserving different properties, but call all of them ‘morphisms’ with respect to a certain icon.⁴

⁴ Inspired by category theory.

1.2 Overview

We will start in Chapter 2 by discussing the basic concepts of topologies, continuity, and metric spaces. Of particular interest here are: chain rule for limits (Lemma (2.1.10)), initial and final topologies (Definition (2.1.18) and Definition (2.1.23)), operations preserving continuity (Theorem (2.1.28)), separation axioms (Definition (2.2.1) and Theorem (2.2.5)), graphs of continuous functions being closed (Lemma (2.2.7)), notion of compactness (Definition (2.4.2)), fixed point theorem (Theorem (2.5.21)), and uniform continuity theorem (Theorem (2.5.23)).

We then continue to discuss the basics of group theory and algebra (in the form of rings, fields, and modules) in Chapter 3. Interesting notions here are: factorisation lemma for [G]-morphisms (Lemma (3.1.9)), group action (Definition (3.1.10)), solving equations in a ring (Lemma (3.2.3)), factorisation lemma for modules (Lemma (3.3.10)), duality theorem (Theorem (3.3.19)), and the first version of the Hahn-Banach theorem (Theorem (3.3.22) and Example (3.3.23)).

After Chapter 3 we combine topology and algebra in Chapter 4 where we introduce algebraical objects of which all algebraical manipulations should be continuous (topological rings, fields, and modules). We discuss (semi)normed spaces, F-spaces, and the notion of local convexity. Of particular interest are the fact that translations and non-zero scalings of open sets are open (Lemma (4.1.5)), the comparison between topologies of topological modules (Lemma (4.1.6)), separation of topological modules (Lemma (4.1.7)), notion of a seminorm (Definition (4.2.3)), the second, more definitive form of the Hahn-Banach theorem (Theorem (4.2.6)), topological vector spaces (Definition (4.3.3)), absorbent balanced and convex subsets (Definition (4.3.4), Lemma (4.3.5), and Lemma (4.3.6)), counterexamples emphasising that we need to be careful in this general setting (Example (4.3.7), Example (4.3.8), and Example (4.3.9)), open mapping and closed graph theorems (Theorem (4.4.3), Theorem (4.4.4), Theorem (4.4.5), and Corollary (4.4.6)), notion of local convexity (Section 4.5 entirely), the final form of the Hahn-Banach theorem (Theorem (4.5.14)), and comparison of a space and the topological dual of its topological dual (Theorem (4.5.15)).

Now that we have established our definitive setting (in the form of topological vector spaces) for the generalisation of [Chr1869], we start doing analysis on topological vector spaces in Chapter 5. Here we discuss a slightly different generalised notion of differentiability (when comparing with the Fréchet and Gâteaux derivatives that are usually employed) in Definition (5.1.1). We define partial derivatives in Definition (5.1.12), higher order derivatives in Definition (5.1.9), and verify this different notion to be a proper generalisation in Theorem (5.1.5), Corollary (5.1.7), and Corollary (5.5.7). This derivative furthermore has the usually desired properties as expressed in Theorem (5.1.8) (most notably the chain rule and local constantness), Lemma (5.1.11), Lemma (5.1.13) (sum rule), and Theorem (5.1.16) (symmetry for higher order derivatives). We then investigate differentiability of families of multilinear maps, Definition (5.2.2), in Lemma (5.2.1) and establish a very useful product rule as Equation (5.4) in Theorem (5.2.4), and finally a condition for differentiability of families of linear inverses in Theorem (5.2.6). After differentiation we consider integration in Definition (5.3.2) for which we prove the usual properties in Theorem (5.3.5) and the fundamental theorem of integration in Theorem (5.3.8). These results are then used to show the Taylor approximation theorem in Theorem (5.3.11) and Corollary (5.3.12). We conclude by discussing Fréchet spaces (Definition (5.4.1)) and Banach spaces (Definition (5.5.1)) and showing that while the inverse function theorem (Theorem (5.5.8)), implicit function theorem (Theorem (5.5.9)), and existence of solutions for ordinary differential equations (Theorem (5.5.10)) are all true for Banach spaces, they do not hold for Fréchet spaces, as shown in Example (5.4.3) and Example (5.4.4).

After this we have derived all necessary theory, and in Chapter 6 we generalise [Chr1869] to Theorem (6.2.1), Theorem (6.2.2), and Theorem (6.2.3) and establish a partial converse in Theorem (6.6.1). We conclude the thesis with an English translation of the originally German article [Chr1869] in Chapter 8 and a conclusion in Chapter 7.

Chapter 2

Topology

Topology is the study of qualitative geometry: we provide a given set $A$ that has a geometrical interpretation with a notion of what it means to ‘be near’ or ‘in a neighbourhood of’ a point in $A$¹ by selecting a certain collection of subsets of $A$ that are all interpreted as ‘neighbourhoods’. These subsets are called open sets in $A$ and they give a surprising amount of information about the geometrical properties of $A$ (see for example [Mun2000]).

2.1 Topological spaces

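Definition 2.1.1: Topology ([T])
Let $A$ be a set. Then a topology on $A$ is a collection $\mathcal{A} \subseteq \mathcal{P}(A)$ of subsets of $A$ such that

• $\forall\, \mathcal{U} \subseteq \mathcal{A} : \bigcup \mathcal{U} \in \mathcal{A}$,
• $\forall\, U_1, U_2 \in \mathcal{A} : U_1 \cap U_2 \in \mathcal{A}$,
• $\emptyset, A \in \mathcal{A}$.

A topological space $A$ is a set $A$ together with a topology $\mathcal{A}$ on $A$; we will write $A$ [T] to indicate that $A$ is a topological space. Elements $a \in A$ of a topological space are commonly called points to emphasise their geometrical interpretation.

Definition 2.1.2
Let $A$ [T]. Denote $A$'s topology by $\mathcal{A}$. A subset $U \subseteq A$ is called open if $U \in \mathcal{A}$ and closed if $A \setminus U \in \mathcal{A}$. For any subset $B \subseteq A$ we define the closure and interior as

$\overline{B} := \bigcap \{ C \subseteq A \mid C \text{ closed},\ B \subseteq C \}, \qquad \operatorname{int}(B) := \bigcup \{ U \subseteq A \mid U \text{ open},\ U \subseteq B \}$

respectively. Let $a \in A$ be any point, then we call a subset $B \subseteq A$ a neighbourhood of $a$ in $A$ if there exists an open set $U \subseteq A$ for which $a \in U \subseteq B$.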
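On a finite set the axioms of Definition (2.1.1) and the closure and interior of Definition (2.1.2) can be checked mechanically. The sketch below is only an illustration under that finiteness assumption; the helper names `is_topology`, `closure`, and `interior` are hypothetical, and a topology is assumed to be passed as a list of Python sets. For a finite collection containing the empty set, closure under pairwise unions already gives closure under arbitrary unions.

```python
def is_topology(space, opens):
    """Check the axioms of a topology for a finite collection of subsets."""
    space = frozenset(space)
    opens = {frozenset(u) for u in opens}
    if not all(u <= space for u in opens):
        return False  # not a collection of subsets of the space
    if frozenset() not in opens or space not in opens:
        return False
    # For finite collections, pairwise unions and intersections suffice.
    return all((u | v) in opens and (u & v) in opens
               for u in opens for v in opens)


def closure(space, opens, b):
    """Intersection of all closed sets (complements of open sets) containing b."""
    space, b = frozenset(space), frozenset(b)
    result = space
    for u in opens:
        closed = space - frozenset(u)
        if b <= closed:
            result &= closed
    return result


def interior(opens, b):
    """Union of all open sets contained in b."""
    b = frozenset(b)
    result = frozenset()
    for u in opens:
        if frozenset(u) <= b:
            result |= frozenset(u)
    return result


space = {1, 2, 3}
chain = [set(), {1}, {1, 2}, {1, 2, 3}]      # a topology on space
broken = [set(), {1}, {2}, {1, 2, 3}]        # {1} | {2} is missing
```

Here `chain` satisfies all three axioms, while `broken` fails because the union of `{1}` and `{2}` is not in the collection.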

¹ But exactly how near is left unspecified.


Note that for any point $a \in U$ of an open set $U \subseteq A$, $U$ is an (open) neighbourhood of $a$ in $A$.

Lemma 2.1.3
Let $A$ [T]. Then for any subset $B \subseteq A$,

• $\overline{B}$ is the smallest closed set containing $B$,
• $\operatorname{int}(B)$ is the largest open set contained in $B$,
• $a \in \overline{B}$ if and only if for all open neighbourhoods $U$ of $a$ in $A$ we have $U \cap B \neq \emptyset$,
• $a \in \operatorname{int}(B)$ if and only if there exists an open neighbourhood $U$ of $a$ in $A$ such that $a \in U \subseteq B$.

Proof. The collection of closed sets containing $B$ is nonempty, since $A \supseteq B$ is closed, so $\overline{B}$ is well defined. As arbitrary unions of open sets are open and closed sets are complements of open sets, arbitrary intersections of closed sets are closed, so $\overline{B}$ is closed. From the definition $\overline{B}$ is clearly the smallest closed set containing $B$. The second item is proven in the same way.

Let $a \in A$. Suppose there exists an open neighbourhood $U$ of $a$ in $A$ with $U \cap B = \emptyset$, then $B \subseteq A \setminus U$ which is closed, so $\overline{B} \subseteq A \setminus U$. As $a \in U$, $a \notin A \setminus U \supseteq \overline{B}$, so $a \notin \overline{B}$. Suppose conversely that $a \notin \overline{B}$, then $a \in A \setminus \overline{B}$ which is open ($\overline{B}$ is closed), so $A \setminus \overline{B}$ is an open neighbourhood of $a$ in $A$ and since $B \subseteq \overline{B}$, we have $(A \setminus \overline{B}) \cap B = \emptyset$. This shows the third item.

Suppose $a \in \operatorname{int}(B)$, then $\operatorname{int}(B)$ is an open neighbourhood of $a$ in $A$ and $a \in \operatorname{int}(B) \subseteq B$. If conversely there exists an open neighbourhood $U$ of $a$ in $A$ such that $a \in U \subseteq B$, then $a \in U \subseteq \operatorname{int}(B)$ by definition since $U$ is an open set contained in $B$. This shows the fourth item.

Definition 2.1.4: Neighbourhood
Let $A$ [T] and $a \in A$. Then we call a collection $\mathcal{A} \subseteq \mathcal{P}(A)$ a basis of neighbourhoods of $a$ in $A$ if for all $U \in \mathcal{A}$, $U$ is a neighbourhood of $a$ in $A$ and for all open neighbourhoods $U_1$ of $a$ in $A$, there exists some $U \in \mathcal{A}$ such that $U \subseteq U_1$.

Definition 2.1.5: Topological basis
Let $A$ be a set. Then a topological basis on $A$ is a collection $\mathcal{A} \subseteq \mathcal{P}(A)$ such that

• $A = \bigcup \mathcal{A}$,
• $\forall\, U_1, U_2 \in \mathcal{A},\ \forall\, a \in U_1 \cap U_2 : \exists\, U_3 \in \mathcal{A} : a \in U_3 \subseteq U_1 \cap U_2$.

Note that any topology is a topological basis, but that the converse is not necessarily true.

Definition 2.1.6: Generated topology
Let $A$ be a set and $\mathcal{A} \subseteq \mathcal{P}(A)$ a collection of subsets. Then the topology generated by $\mathcal{A}$ is defined to be the intersection of all topologies on $A$ containing $\mathcal{A}$:

$\mathcal{T}(\mathcal{A}) := \bigcap \{ \mathcal{A}_1 \subseteq \mathcal{P}(A) \mid \mathcal{A} \subseteq \mathcal{A}_1 \text{ and } \mathcal{A}_1 \text{ is a topology on } A \}.$
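For a finite set, the generated topology of Definition (2.1.6) can be computed without enumerating all topologies: starting from the given collection together with $\emptyset$ and $A$, one keeps adding pairwise unions and intersections until nothing new appears. The result is the smallest collection that contains the input and is closed under both operations, which for finite sets is the smallest topology containing it. This is a sketch under the finiteness assumption; the function name `generated_topology` is hypothetical.

```python
def generated_topology(space, collection):
    """Smallest topology on a finite set `space` containing `collection`."""
    space = frozenset(space)
    opens = {frozenset(), space} | {frozenset(u) for u in collection}
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so that `opens` may grow inside the loops.
        for u in list(opens):
            for v in list(opens):
                for w in (u | v, u & v):
                    if w not in opens:
                        opens.add(w)
                        changed = True
    return opens


topology = generated_topology({1, 2, 3}, [{1, 2}, {2, 3}])
```

In this example the two generating sets force the additional open set `{2}` as their intersection; applying the function to a collection that is already a topology returns it unchanged.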


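Lemma 2.1.7: Properties of $\mathcal{T}(\cdot)$
Let $A$ be a set and $\mathcal{A} \subseteq \mathcal{P}(A)$. Then

• $\mathcal{T}(\mathcal{A})$ is the unique smallest topology on $A$ containing $\mathcal{A}$,
• if $\mathcal{A}$ is a topology, then $\mathcal{T}(\mathcal{A}) = \mathcal{A}$,
• if $\mathcal{A}$ is a topological basis, then $U \in \mathcal{T}(\mathcal{A})$ if and only if for all $a \in U$ there exists a $U_a \in \mathcal{A}$ such that $a \in U_a \subseteq U$.

Proof. The first item is direct from the definition of $\mathcal{T}(\mathcal{A})$ as the intersection of all topologies containing $\mathcal{A}$ (which is directly verified to again be a topology); by being the smallest, it is also unique. The second item follows directly from the first item, since in this case $\mathcal{A}$ itself is the smallest topology containing $\mathcal{A}$. For the third item, suppose $\mathcal{A}$ is a topological basis and let

$\mathcal{A}_1 := \{ U \subseteq A \mid \forall a \in U : \exists U_a \in \mathcal{A} : a \in U_a \subseteq U \}.$

Then $\emptyset \in \mathcal{A}_1$ vacuously and $A \in \mathcal{A}_1$ because $A = \bigcup \mathcal{A}$. Clearly for all $\mathcal{U} \subseteq \mathcal{A}_1$ we have $\bigcup \mathcal{U} \in \mathcal{A}_1$. For $U_1, U_2 \in \mathcal{A}_1$ we have $U_1 \cap U_2 \in \mathcal{A}_1$, because for all $a \in U_1 \cap U_2$ there exist $U_3, U_4 \in \mathcal{A}$ such that $a \in U_3 \subseteq U_1$ and $a \in U_4 \subseteq U_2$; now as $\mathcal{A}$ is a basis there exists a $U_5 \in \mathcal{A}$ with $a \in U_5 \subseteq U_3 \cap U_4 \subseteq U_1 \cap U_2$. Therefore $\mathcal{A}_1$ is a topology and for any $U \in \mathcal{A}$ and $a \in U$ we have $a \in U \subseteq U$, so $\mathcal{A} \subseteq \mathcal{A}_1$. Therefore $\mathcal{T}(\mathcal{A}) \subseteq \mathcal{A}_1$.

Now let $\mathcal{A}_2$ be any topology containing $\mathcal{A}$. Let $U \in \mathcal{A}_1$, then for all $a \in U$ there exists a $U_a \in \mathcal{A}$ such that $a \in U_a \subseteq U$. As all $U_a \in \mathcal{A} \subseteq \mathcal{A}_2$, we have $U = \bigcup_{a \in U} U_a \in \mathcal{A}_2$. Since this is true for all $U \in \mathcal{A}_1$, $\mathcal{A}_1 \subseteq \mathcal{A}_2$. Because this is true for all topologies $\mathcal{A}_2$ containing $\mathcal{A}$ we have $\mathcal{A}_1 \subseteq \mathcal{T}(\mathcal{A})$. Therefore $\mathcal{A}_1 = \mathcal{T}(\mathcal{A})$.

Lemma 2.1.8
Let $A$ be a set and $\mathcal{A}_1, \mathcal{A}_2$ topologies on $A$. Let $\mathcal{B}_1, \mathcal{B}_2$ be any topological bases on $A$ satisfying $\mathcal{A}_1 = \mathcal{T}(\mathcal{B}_1)$ and $\mathcal{A}_2 = \mathcal{T}(\mathcal{B}_2)$. Then $\mathcal{A}_1 \subseteq \mathcal{A}_2$ if and only if for all $U_1 \in \mathcal{B}_1$ and $a \in U_1$ there exists a $U_2 \in \mathcal{B}_2$ such that $a \in U_2 \subseteq U_1$.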

Proof. Suppose $\mathcal{A}_1 \subseteq \mathcal{A}_2$. Let $U_1 \in \mathcal{B}_1$ be arbitrary and $a \in U_1$, then because $U_1 \in \mathcal{B}_1 \subseteq \mathcal{T}(\mathcal{B}_1) = \mathcal{A}_1 \subseteq \mathcal{A}_2$ we have $a \in U_2 \subseteq U_1$ for some $U_2 \in \mathcal{B}_2$ by Lemma (2.1.7). Suppose conversely that for all $U_1 \in \mathcal{B}_1$ and $a \in U_1$ there exists a $U_2 \in \mathcal{B}_2$ with $a \in U_2 \subseteq U_1$. Let $U_1 \in \mathcal{A}_1$, then by Lemma (2.1.7) ($\mathcal{B}_1$ is a topological basis), for all $a \in U_1$ there exists a $U_a \in \mathcal{B}_1$ such that $a \in U_a \subseteq U_1$. Now by our assumption, for all such $U_a$, $a \in U_a$, there exists a $U_a' \in \mathcal{B}_2$ such that $a \in U_a' \subseteq U_a \subseteq U_1$. Therefore for all $a \in U_1$ there exists a $U_a' \in \mathcal{B}_2$ such that $a \in U_a' \subseteq U_1$, so (again by Lemma (2.1.7)) $U_1 \in \mathcal{A}_2$. Therefore $\mathcal{A}_1 \subseteq \mathcal{A}_2$.

Definition 2.1.9: Limit
Let $A, B$ [T], $f : A \to B$, $a \in A$, and $b \in B$. Then we say that $f$ has limit $b$ at $a$, denoted by

$\lim_{x \to a} f(x) = b,$

if for all neighbourhoods $V$ of $b$ in $B$, $f^{-1}(V)$ is a neighbourhood of $a$ in $A$. If it is not true that $\lim_{x \to a} f(x) = b$ we write

$\lim_{x \to a} f(x) \neq b.$

Note that $\lim_{x \to a} f(x) = b$ if and only if for all open $V \subseteq B$, $b \in V$, there exists an open $U \subseteq A$, $a \in U$, with $f(U) \subseteq V$.

Lemma 2.1.10: Chain rule for limits
Let $A, B, C$ [T] and $f : A \to B$, $g : B \to C$ maps. Suppose for $a \in A$, $b \in B$, and $c \in C$ we have

$\lim_{x \to a} f(x) = b, \qquad \lim_{y \to b} g(y) = c,$

then

$\lim_{x \to a} (g \circ f)(x) = c.$

Proof. Suppose the conditions for the lemma are met. Let $W$ be a neighbourhood of $c$ in $C$. Then because $\lim_{y \to b} g(y) = c$, $g^{-1}(W)$ is a neighbourhood of $b$ in $B$. Hence, by $\lim_{x \to a} f(x) = b$, $(g \circ f)^{-1}(W) = f^{-1}(g^{-1}(W))$ is a neighbourhood of $a$ in $A$. So for any neighbourhood $W$ of $c$ in $C$, $(g \circ f)^{-1}(W)$ is a neighbourhood of $a$ in $A$, therefore $\lim_{x \to a} (g \circ f)(x) = c$.

Definition 2.1.11: Continuous maps ([c])
Let $A, B$ [T] and $a \in A$. Then a map $f : A \to B$ is called continuous at $a$ (denoted by $f$ [c]_a) if

$\lim_{x \to a} f(x) = f(a).$

We call $f$ continuous ($f$ [c]) if $f$ is continuous at $a$ for all $a \in A$.

Definition 2.1.12: Almost continuous maps
Let $A, B$ [T], and $a \in A$. Then a map $f : A \to B$ is called almost continuous at $a$ if for each neighbourhood $V$ of $f(a)$ in $B$, $\overline{f^{-1}(V)}$ is a neighbourhood of $a$ in $A$.

Definition 2.1.13: Open and closed maps
Let $A, B$ [T]. Then a map $f : A \to B$ is called open (resp. closed) if for any $U \subseteq A$ open (resp. closed), $f(U) \subseteq B$ is open (resp. closed). A map $f : A \to B$ is called almost open if for any $U \subseteq A$ open, $f(U) \subseteq \operatorname{int}(\overline{f(U)})$.

Note that a map is almost open if and only if for each $a \in A$ and each open neighbourhood $U$ of $a$ in $A$, $\overline{f(U)}$ is a neighbourhood of $f(a)$. For if this holds and $U \subseteq A$ is open and arbitrary, there exists for each $a \in U$ an open neighbourhood $V_a$ of $f(a)$ in $B$ such that $V_a \subseteq \overline{f(U)}$ and therefore $f(U) = \bigcup_{a \in U} \{f(a)\} \subseteq \bigcup_{a \in U} V_a \subseteq \operatorname{int}(\overline{f(U)})$, as all $V_a$ are open and contained in $\overline{f(U)}$. Conversely, let $a \in A$ and $U$ be an open neighbourhood of $a$ in $A$, then $f(a) \in f(U) \subseteq \operatorname{int}(\overline{f(U)}) \subseteq \overline{f(U)}$, so $\overline{f(U)}$ is a neighbourhood of $f(a)$ in $B$. This makes both characterisations equivalent.
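Continuity at a point in the sense of Definition (2.1.9) and Definition (2.1.11) can be tested directly on finite topological spaces. The following sketch assumes that maps are given as Python dicts and topologies as lists of sets; the names `is_neighbourhood`, `preimage`, `continuous_at`, and `continuous` are hypothetical. It only checks open neighbourhoods $V$ of $f(a)$, which suffices because every neighbourhood contains an open one and taking pre-images preserves inclusions.

```python
def is_neighbourhood(opens, subset, point):
    """True if some open U satisfies point in U and U contained in subset."""
    subset = frozenset(subset)
    return any(point in u and frozenset(u) <= subset for u in opens)


def preimage(f, subset):
    """Pre-image of `subset` under the map f, given as a dict."""
    return frozenset(x for x in f if f[x] in subset)


def continuous_at(f, opens_a, opens_b, a):
    """f has limit f(a) at a: pre-images of open neighbourhoods of f(a)
    are neighbourhoods of a."""
    return all(is_neighbourhood(opens_a, preimage(f, v), a)
               for v in opens_b if f[a] in v)


def continuous(f, opens_a, opens_b):
    """f is continuous at every point of its domain."""
    return all(continuous_at(f, opens_a, opens_b, a) for a in f)


sierpinski = [set(), {1}, {0, 1}]          # topology on {0, 1}
discrete = [set(), {0}, {1}, {0, 1}]       # all subsets of {0, 1}
identity = {0: 0, 1: 1}
```

With these two topologies on $\{0, 1\}$, the identity map from the coarser to the finer space is continuous at the point 1 but not at the point 0, so continuity genuinely depends on the point.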


Lemma 2.1.14
Let $A, B$ [T] and $f : A \to B$ a map. Then $f$ [c] if and only if for all $V \subseteq B$ open we have that $f^{-1}(V)$ is open in $A$, and if and only if for all $V \subseteq B$ closed we have that $f^{-1}(V)$ is closed in $A$. If the topology of $B$ is generated by a topological basis $\mathcal{B}$, $f$ [c] if and only if for all $V \in \mathcal{B}$ we have that $f^{-1}(V)$ is open in $A$.

Proof. Suppose $f$ [c] and let $V \subseteq B$ be open. Let $a \in f^{-1}(V)$, then because $f$ [c] by assumption, $f$ [c]_a and hence $\lim_{x \to a} f(x) = f(a) \in V$. By definition of limit and the fact that $V$ is an open neighbourhood of $f(a)$ in $B$ there exists an open neighbourhood $U_a$ of $a$ in $A$ such that $f(U_a) \subseteq V$, so $U_a \subseteq f^{-1}(V)$. Because of this $f^{-1}(V) = \bigcup_{a \in f^{-1}(V)} \{a\} \subseteq \bigcup_{a \in f^{-1}(V)} U_a \subseteq f^{-1}(V)$ as $a \in U_a \subseteq f^{-1}(V)$ for all $a \in f^{-1}(V)$. Therefore $f^{-1}(V) = \bigcup_{a \in f^{-1}(V)} U_a$ is open in $A$ as all $U_a$ are open.

Suppose conversely that $f^{-1}(V)$ is open in $A$ for all $V \subseteq B$ open. Let $a \in A$ be arbitrary and $V$ an open neighbourhood of $f(a)$ in $B$. Then $V$ is open, so by assumption $f^{-1}(V)$ is open in $A$ and as $f(a) \in V$ we have that $a \in f^{-1}(V)$. This means that for all open neighbourhoods $V$ of $f(a)$ in $B$, $f^{-1}(V)$ is an open neighbourhood of $a$ in $A$ and $f(f^{-1}(V)) \subseteq V$, so $\lim_{x \to a} f(x) = f(a)$ and $f$ [c]_a. As this is true for all $a \in A$, $f$ [c].

The same is true for closed sets, since $V \subseteq B$ is closed iff $B \setminus V$ is open and $f^{-1}(B \setminus V) = A \setminus f^{-1}(V)$ is open iff $f^{-1}(V)$ is closed.

For the second point it is sufficient to note that a set $V \subseteq B$ is open if and only if for all $b \in V$ there exists a $V_b \in \mathcal{B}$ such that $b \in V_b \subseteq V$ (by Lemma (2.1.7)). Therefore $f^{-1}(V) = f^{-1}(\bigcup_{b \in V} V_b) = \bigcup_{b \in V} f^{-1}(V_b)$ which is open as a union of open sets.

Example 2.1.15: Continuity is a local property
Consider $f : \mathbb{R} \to \mathbb{R}$ given by

$f(x) := \begin{cases} x^2 & x \in \mathbb{Q}, \\ 0 & x \notin \mathbb{Q}, \end{cases}$

then $f$ [c]_0 ($\lim_{x \to 0} f(x) = 0 = f(0)$), but for all $x \in \mathbb{R} \setminus \{0\}$ we have that not $f$ [c]_x. Hence this function is continuous in just a single point.

Lemma 2.1.16
Let $A, B$ [T] and $f : A \to B$ [c]. Then for any subset $C \subseteq A$, we have

$f(\overline{C}) \subseteq \overline{f(C)}.$

Proof. Let $b \in f(\overline{C})$, then there exists an $a \in \overline{C}$ such that $f(a) = b$. Let $V$ be any open neighbourhood of $b$ in $B$. Then, as $f$ [c], $\lim_{x \to a} f(x) = f(a) = b$, there exists an open neighbourhood $U$ of $a$ in $A$ such that $f(U) \subseteq V$. As $a \in \overline{C}$, by Lemma (2.1.3) there exists an $a_1 \in U \cap C$. Hence $f(a_1) \in f(U \cap C) \subseteq V \cap f(C)$. Therefore $V \cap f(C) \neq \emptyset$ for any open neighbourhood $V$ of $b$ in $B$, so $b \in \overline{f(C)}$ by Lemma (2.1.3).

That we do not necessarily have equality of $f(\overline{C})$ and $\overline{f(C)}$ is shown in the following example.


Example 2.1.17
Consider $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) := e^{-x}$ which is [c]. Now $f(]0, \infty[) = \,]0, 1[$, but $f(\overline{]0, \infty[}) = f([0, \infty[) = \,]0, 1] \subsetneq [0, 1] = \overline{f(]0, \infty[)}$. So in this case $f(\overline{]0, \infty[}) \subsetneq \overline{f(]0, \infty[)}$.

Definition 2.1.18: Initial topology
Let $A$ be a set, $\{f_i : A \to B_i \mid i \in I\}$ a collection of maps with $B_i$ [T] for all $i \in I$. Then we define the initial topology on $A$ with respect to $\{f_i : A \to B_i \mid i \in I\}$ as

$\mathcal{T}(\{ f_i^{-1}(V) \subseteq A \mid V \subseteq B_i \text{ open},\ i \in I \}).$

Lemma 2.1.19: Properties of the initial topology
Let $A$ be a set, $\{f_i : A \to B_i \mid i \in I\}$ a collection of maps with $B_i$ [T] for all $i \in I$. Let the topology on $A$ be the initial topology with respect to this collection.

• The initial topology is the smallest topology for which $f_i$ [c] for all $i \in I$.

• Suppose that for all $i \in I$, the topology on $B_i$ is generated by the topological basis $\mathcal{B}_i$. Then the collection

$\mathcal{A} := \{ f_{i_1}^{-1}(V_1) \cap \ldots \cap f_{i_k}^{-1}(V_k) \subseteq A \mid k \in \mathbb{N},\ i_1, \ldots, i_k \in I \text{ distinct},\ V_1 \in \mathcal{B}_{i_1}, \ldots, V_k \in \mathcal{B}_{i_k} \}$

forms a topological basis for the initial topology.

• If furthermore all the $B_i$ are equal to $B$ with topology generated by the topological basis $\mathcal{B}$, then

$\mathcal{A}' := \{ f_{i_1}^{-1}(V) \cap \ldots \cap f_{i_k}^{-1}(V) \subseteq A \mid k \in \mathbb{N},\ i_1, \ldots, i_k \in I \text{ distinct},\ V \in \mathcal{B} \}$

forms a basis for the initial topology.

• For any $C$ [T] and $g : C \to A$ we have that $g$ [c] if and only if for all $i \in I$, $f_i \circ g : C \to B_i$ [c]. The initial topology is the unique topology on $A$ with this property.

Proof.

• The first item is direct from Lemma (2.1.14) together with the definition of the initial topology and Lemma (2.1.7).

• Now we consider the second item. First of all note that for any $i \in I$, $B_i = \bigcup \mathcal{B}_i$, so $A \supseteq \bigcup \mathcal{A} \supseteq \bigcup_{V \in \mathcal{B}_i} f_i^{-1}(V) = f_i^{-1}(\bigcup \mathcal{B}_i) = f_i^{-1}(B_i) = A$ and therefore $A = \bigcup \mathcal{A}$. Now let $U_1 = f_{i_1}^{-1}(V_1) \cap \ldots \cap f_{i_k}^{-1}(V_k)$, $U_2 = f_{j_1}^{-1}(W_1) \cap \ldots \cap f_{j_l}^{-1}(W_l) \in \mathcal{A}$. Let $a \in U_1 \cap U_2$ be arbitrary, we are going to construct a $U_3 \in \mathcal{A}$ such that $a \in U_3 \subseteq U_1 \cap U_2$. Suppose that any of the $i_1, \ldots, i_k$ equals any of the $j_1, \ldots, j_l$, say $i_1 = j_1$. Then $a \in f_{i_1}^{-1}(V_1) \cap f_{j_1}^{-1}(W_1) = f_{i_1}^{-1}(V_1 \cap W_1)$, so $f_{i_1}(a) \in V_1 \cap W_1$. Since $V_1, W_1 \in \mathcal{B}_{i_1}$ which is a basis, there exists a $V' \in \mathcal{B}_{i_1}$ such that $f_{i_1}(a) \in V' \subseteq V_1 \cap W_1$, so $a \in f_{i_1}^{-1}(V')$. Now replace $f_{i_1}^{-1}(V_1) \cap f_{j_1}^{-1}(W_1)$ in $U_1 \cap U_2$ by $f_{i_1}^{-1}(V')$. Continue this way until all of (the finite number of indices) $i_1, \ldots, i_k, j_1, \ldots, j_l$ are distinct to obtain $U_3$. By construction $a \in U_3$ and $U_3 \subseteq U_1 \cap U_2$ and because all indices are distinct $U_3 \in \mathcal{A}$, so $a \in U_3 \subseteq U_1 \cap U_2$ for $U_3 \in \mathcal{A}$. Since this can be done for all $U_1, U_2 \in \mathcal{A}$ and $a \in U_1 \cap U_2$, and $A = \bigcup \mathcal{A}$, $\mathcal{A}$ is a topological basis on $A$. Note that $\mathcal{A}$ is contained in the initial topology on $A$ (all $f_{i_1}^{-1}(V_1) \cap \ldots \cap f_{i_k}^{-1}(V_k)$ are open by Lemma (2.1.14): $f_{i_j}$ [c], $V_j$ open), and that every set in the collection generating the initial topology is a union of elements of $\mathcal{A}$ (Lemma (2.1.7)). Therefore $\mathcal{A}$ is a topological basis generating the initial topology.

• Now if all $B_i$ are equal to $B$ and all bases $\mathcal{B}_i$ are equal to $\mathcal{B}$, then we can for any $i_1, \ldots, i_k \in I$ and $V_{i_1}, \ldots, V_{i_k} \in \mathcal{B}$ consider $V_{i_1} \cap \ldots \cap V_{i_k}$ which is open in $B$ and therefore there exists a $V \in \mathcal{B}$ such that $V \subseteq V_{i_1} \cap \ldots \cap V_{i_k}$. This shows that $\mathcal{A}'$ is at least as large as $\mathcal{A}$. On the other hand, $\mathcal{A}$ is clearly at least as large as $\mathcal{A}'$ (pick $V_{i_1} = \ldots = V_{i_k} = V$), hence $\mathcal{A}' = \mathcal{A}$ in this case and therefore $\mathcal{A}'$ forms a basis for the initial topology.

• For the final item, let $C$ [T] and $g : C \to A$ be given. Suppose $g$ [c], then as all $f_i$ [c] in the initial topology we have (Lemma (2.1.10)) that all $f_i \circ g$ [c]. Suppose conversely that for all $i \in I$, $f_i \circ g$ [c]. Let $f_{i_1}^{-1}(V_1) \cap \ldots \cap f_{i_k}^{-1}(V_k)$ be any element from the basis generating the initial topology. Then $g^{-1}(f_{i_1}^{-1}(V_1) \cap \ldots \cap f_{i_k}^{-1}(V_k)) = (f_{i_1} \circ g)^{-1}(V_1) \cap \ldots \cap (f_{i_k} \circ g)^{-1}(V_k) \subseteq C$ which is open, because all $f_{i_1} \circ g, \ldots, f_{i_k} \circ g$ [c] and $V_1 \subseteq B_{i_1}, \ldots, V_k \subseteq B_{i_k}$ are open and finite intersections of open sets are open. Therefore $g$ [c] by Lemma (2.1.14).

Let $\mathcal{A}_1, \mathcal{A}_2$ be any two topologies on $A$ having this property. Choose $g = \operatorname{id}_A$. Now for $C = A$ and $A$ both with topology $\mathcal{A}_1$ we find that as $\operatorname{id}_A : A \to A$ [c], so must all $f_i \circ \operatorname{id}_A = f_i$ be [c] for $A$ with topology $\mathcal{A}_1$, and similarly with topology $\mathcal{A}_2$. Consider $C = A$ with topology $\mathcal{A}_1$ and $A$ with topology $\mathcal{A}_2$, then as all $f_i \circ \operatorname{id}_A = f_i$ [c], $\operatorname{id}_A$ [c]. But this implies that $\mathcal{A}_1 \supseteq \mathcal{A}_2$. Do the same with both topologies interchanged to obtain that $\mathcal{A}_1 = \mathcal{A}_2$: the topology on $A$ is uniquely determined by this property.
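For finite sets the initial topology of Definition (2.1.18) can be computed explicitly by collecting all pre-images of open sets and generating a topology from them. The sketch below assumes finite spaces, maps given as dicts, and topologies given as lists of sets; the function names are hypothetical, and the generating step repeatedly adds pairwise unions and intersections, which for finite sets yields the smallest topology containing the given collection.

```python
def generated_topology(space, collection):
    """Smallest topology on a finite set `space` containing `collection`."""
    space = frozenset(space)
    opens = {frozenset(), space} | {frozenset(u) for u in collection}
    changed = True
    while changed:
        changed = False
        for u in list(opens):
            for v in list(opens):
                for w in (u | v, u & v):
                    if w not in opens:
                        opens.add(w)
                        changed = True
    return opens


def preimage(f, subset):
    """Pre-image of `subset` under the map f, given as a dict."""
    return frozenset(x for x in f if f[x] in subset)


def initial_topology(space, maps):
    """Initial topology for `maps`, a list of pairs (f_i, topology of B_i)."""
    preimages = [preimage(f, v) for f, opens_b in maps for v in opens_b]
    return generated_topology(space, preimages)


space = {1, 2, 3, 4}
sierpinski = [set(), {1}, {0, 1}]                 # topology on {0, 1}
parity = {a: a % 2 for a in space}                # odd numbers map to 1
large = {a: int(a > 2) for a in space}            # 3 and 4 map to 1

one_map = initial_topology(space, [(parity, sierpinski)])
two_maps = initial_topology(space, [(parity, sierpinski), (large, sierpinski)])
```

With a single map the only nontrivial open set is the pre-image `{1, 3}`; adding the second map contributes `{3, 4}` and with it the intersection `{3}` and the union `{1, 3, 4}`, in line with the finite-intersection basis of the second item of the lemma.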

Definition 2.1.20: Subspace topology
Let $A$ [T], and $B \subseteq A$ any subset. Then we will, unless specified otherwise, consider $B$ having the initial topology of the inclusion map $f : B \to A : a \mapsto a$.

From Definition (2.1.18) we see that the topology on $B \subseteq A$ consists precisely of all sets $B \cap U$ where $U \subseteq A$ is open.

Definition 2.1.21: Product topology
Let $\{A_i \mid i \in I\}$ be a collection of topological spaces. Then we will, unless specified otherwise, consider the product $\prod_{i \in I} A_i$ having the initial topology of the projection maps $\{f_i : \prod_{j \in I} A_j \to A_i : g \mapsto g(i) \mid i \in I\}$.

Example 2.1.22: $\mathbb{R}^k$
Let $k \in \mathbb{N}$. Then the product topology on $\mathbb{R}^k = \underbrace{\mathbb{R} \times \ldots \times \mathbb{R}}_{k} = \prod_{i \in \{1, \ldots, k\}} \mathbb{R}$ is generated (Lemma (2.1.19)) by sets

$]x_1, y_1[ \times \ldots \times ]x_k, y_k[$

where all $x_i, y_i \in \mathbb{R}$, $x_i < y_i$ for $1 \leq i \leq k$, since the collection $\{ ]x, y[ \subseteq \mathbb{R} \mid x, y \in \mathbb{R},\ x < y \}$ forms a topological basis which generates the topology on $\mathbb{R}$.

Definition 2.1.23: Final topology
Let $A$ be a set, $\{f_i : B_i \to A \mid i \in I\}$ a collection of maps with $B_i$ [T] for all $i \in I$. Then we define the final topology on $A$ with respect to $\{f_i : B_i \to A \mid i \in I\}$ as

$\{ U \subseteq A \mid \forall i \in I : f_i^{-1}(U) \text{ open in } B_i \}.$

This is indeed a topology: $f_i^{-1}(A) = B_i \subseteq B_i$ and $f_i^{-1}(\emptyset) = \emptyset \subseteq B_i$ are both open, so $A$ and $\emptyset$ are part of the final topology. If all of $\{U_j \subseteq A \mid j \in J\}$ are part of the final topology, then $f_i^{-1}(\bigcup_{j \in J} U_j) = \bigcup_{j \in J} f_i^{-1}(U_j) \subseteq B_i$ is open, because all $f_i^{-1}(U_j) \subseteq B_i$ are open by assumption. Hence $\bigcup_{j \in J} U_j$ is part of the final topology. If $U_1$, $U_2$ are part of the final topology, then $f_i^{-1}(U_1 \cap U_2) = f_i^{-1}(U_1) \cap f_i^{-1}(U_2) \subseteq B_i$ is open because $f_i^{-1}(U_1), f_i^{-1}(U_2) \subseteq B_i$ are open by assumption. Hence $U_1 \cap U_2$ is part of the final topology. So the final topology is a topology.

Lemma 2.1.24: Properties of the final topology
Let $A$ be a set, $\{f_i : B_i \to A \mid i \in I\}$ a collection of maps with $B_i$ [T] for all $i \in I$. Let the topology on $A$ be the final topology with respect to this collection.

• The final topology is the unique largest topology on A for which fi is continuous for all i ∈ I.

• For any topological space C and g : A → C we have that g is continuous if and only if g ◦ fi : Bi → C is continuous for all i ∈ I. The final topology is the unique topology on A with this property.
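On finite sets the final topology can be computed by brute force straight from Definition 2.1.23. The following is a minimal sketch under that finiteness assumption; the four-point space, its topology `top_B`, and the map `f` are hypothetical data chosen only for illustration.

```python
from itertools import combinations


def powerset(points):
    """All subsets of a finite set, as frozensets."""
    items = list(points)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def preimage(f, subset):
    """Preimage of `subset` under a map given as a dict."""
    return frozenset(x for x, y in f.items() if y in subset)


def final_topology(target, maps):
    """All U in the power set of `target` whose preimage under every map is open.

    `maps` is a list of pairs (f, topology of the domain of f).
    """
    return {U for U in powerset(target)
            if all(preimage(f, U) in top for f, top in maps)}


def is_topology(points, opens):
    """Check the topology axioms on a finite set (pairwise checks suffice)."""
    if frozenset() not in opens or frozenset(points) not in opens:
        return False
    return all((U | V) in opens and (U & V) in opens
               for U in opens for V in opens)


fs = frozenset
# Hypothetical domain B = {0, 1, 2, 3} with a hand-picked topology.
B_points = {0, 1, 2, 3}
top_B = {fs(), fs({0}), fs({0, 1}), fs({2, 3}), fs({0, 2, 3}), fs(B_points)}

# Hypothetical surjection onto A = {"p", "q", "r"} (a quotient map).
A_points = {"p", "q", "r"}
f = {0: "p", 1: "q", 2: "q", 3: "r"}

top_A = final_topology(A_points, [(f, top_B)])
print(sorted(sorted(U) for U in top_A))
```

Only ∅, {"p"} and the whole set have open preimages here, so the quotient identifies far fewer open sets than the domain has.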

Proof. For the first item, let $\mathcal{A}$ be any topology on A for which all fi are continuous. Then for any $U \in \mathcal{A}$, fi⁻¹(U) ⊆ Bi is open for all i ∈ I, but then U is an element of the final topology. Therefore $\mathcal{A}$ is contained in the final topology. So the final topology is the largest topology for which all fi are continuous and, being the largest, it is also unique. Looking at Lemma (2.1.14), it is clear that all fi are continuous with respect to the final topology.
For the second item, let a topological space C and g : A → C be given. Suppose g is continuous; then as all fi are continuous we have (Lemma (2.1.10)) that all g ◦ fi are continuous. Suppose g is not continuous; then by Lemma (2.1.14) there exists an open W ⊆ C such that g⁻¹(W) ⊆ A is not open. By definition of the final topology, we therefore obtain an i ∈ I such that fi⁻¹(g⁻¹(W)) ⊆ Bi is not open, but then (g ◦ fi)⁻¹(W) = fi⁻¹(g⁻¹(W)) ⊆ Bi is not open, while W ⊆ C is open. So for this i, g ◦ fi is not continuous. Taking the contrapositive, if g ◦ fi is continuous for all i ∈ I, then g is continuous. Uniqueness is proven in the same way as for the initial topology.

Definition 2.1.25: (Disjoint) union topology
Let {Ai | i ∈ I} be a collection of topological spaces. Then we will, unless specified otherwise, consider the disjoint union $\coprod_{i \in I} A_i$ as having the final topology of the collection $\{f_i : A_i \to \coprod_{j \in I} A_j : a \mapsto (i, a) \mid i \in I\}$. We will consider the union $\bigcup_{i \in I} A_i$ as having the final topology of the collection $\{f_i : A_i \to \bigcup_{j \in I} A_j : a \mapsto a \mid i \in I\}$.


If we can write $A = \bigcup_{i \in I} A_i$, then we can consider for each i ∈ I the subset Ai ⊆ A with the subspace topology (Definition (2.1.20)). The union topology (Definition (2.1.25)) of $\bigcup_{i \in I} A_i$ is then the same as the topology of A if all the Ai ⊆ A are open. This is not necessarily true in general: consider for example $\bigcup_{a \in A} \{a\}$ with the union topology induced by all subspaces {a} ⊆ A; then any subset of $\bigcup_{a \in A} \{a\}$ is open.

Definition 2.1.26: Quotient topology
Let A be a topological space, B a set and f : A → B a surjective function. Then the quotient topology on B with respect to f is the final topology of what is called the quotient map f : A → B.
Note that any surjective function f : A → B gives rise to an equivalence relation ∼ on A by letting a1 ∼ a2 if and only if f(a1) = f(a2) for all a1, a2 ∈ A. Conversely any equivalence relation ∼ on A gives rise to a surjective function f : A → A/∼ : a ↦ {a1 ∈ A | a ∼ a1}. It is easily verified that both these formulations are equivalent.

Example 2.1.27: Möbius strip
Let A = [0, 1] × [0, 1] with the subspace topology from R². Then we can define an equivalence relation by letting (0, y) ∼ (1, 1 − y) for all y ∈ [0, 1] and (x, y) ∼ (x, y) for all (x, y) ∈ [0, 1] × [0, 1]. The quotient space A/∼ together with the quotient topology is what is known as the Möbius strip (we glue the x = 0 and x = 1 ends of A together after twisting them by 180 degrees, by relating (0, y) to (1, 1 − y)).

The initial and final topology can neatly be expressed as the unique topologies needed for continuity of all maps in the commutative diagrams

Initial: $C \xrightarrow{\;g\;} A \xrightarrow{\;f_i\;} B_i$, with composite $f_i \circ g : C \to B_i$,
Final: $B_i \xrightarrow{\;f_i\;} A \xrightarrow{\;g\;} C$, with composite $g \circ f_i : B_i \to C$,

for all i ∈ I.

Theorem 2.1.28: Operations preserving continuity
Let A, B be topological spaces.

Composition: let C be a topological space, a ∈ A, f : A → B, g : B → C. If f is continuous at a and g is continuous at f(a), then g ◦ f is continuous at a. In particular if f and g are continuous, then g ◦ f is continuous.

Glueing: let U1, U2 ⊆ A be both open or both closed such that A = U1 ∪ U2 and let f1 : U1 → B, f2 : U2 → B. If f1, f2 are continuous with respect to the subspace topologies on U1 and U2, and f1(a) = f2(a) for all a ∈ U1 ∩ U2, then there exists a unique continuous function f : A → B such that f1 = f|U1 and f2 = f|U2.

Restricting domain: let C ⊆ A be a subset, and f : A → B. If f is continuous, then f|C is continuous, where C has the subspace topology.


Expanding image: suppose B ⊆ C with the subspace topology and let f : A → B. If f is continuous, then f : A → C is continuous.

Restricting image: let f : A → B, and C ⊆ B a subset with f(A) ⊆ C. If f is continuous, then f : A → C is continuous, where C has the subspace topology.

Constantness: let f : A → B and suppose there exists a b ∈ B such that f(a) = b for all a ∈ A; then f is continuous.

Proof. We will frequently use Lemma (2.1.14) in this proof.

• Follows directly from Lemma (2.1.10).

• Let f1 : U1 → B, f2 : U2 → B both be given for A = U1 ∪ U2 and suppose f1(a) = f2(a) for all a ∈ U1 ∩ U2. Define f : A → B by f(a) := f1(a) if a ∈ U1 and f(a) := f2(a) otherwise. Then f|U1 = f1, f|U2 = f2 by definition. Note that for any V ⊆ B we have f⁻¹(V) = f1⁻¹(V) ∪ f2⁻¹(V).

Suppose U1 and U2 are open and let V ⊆ B be open. Then as f1, f2 are continuous, f1⁻¹(V) ⊆ U1 and f2⁻¹(V) ⊆ U2 are open. Therefore (subspace topology) f1⁻¹(V) = U3 ∩ U1, f2⁻¹(V) = U4 ∩ U2 for some open U3, U4 ⊆ A. Hence f⁻¹(V) = f1⁻¹(V) ∪ f2⁻¹(V) = (U3 ∩ U1) ∪ (U4 ∩ U2), which is open as U1 and U2 are open. If U1 and U2 are closed, we can follow the same route for V ⊆ B closed to obtain that f⁻¹(V) is closed. Therefore f is continuous.

Uniqueness follows directly from the demand that f|U1 = f1, f|U2 = f2 together with A = U1 ∪ U2.

• Let C ⊆ A and suppose f : A → B is continuous. Let V ⊆ B be open; then (f|C)⁻¹(V) = f⁻¹(V) ∩ C, which is open in the subspace topology because f⁻¹(V) is open. Therefore f|C is continuous.

• Suppose B ⊆ C and f : A → B is continuous. Then f : A → C is the composition of f with the inclusion map B → C, which is continuous by choice of the subspace topology; therefore f : A → C is continuous by the first item.

• Let f : A → B be continuous with f(A) ⊆ C ⊆ B. Let W ⊆ C be any open set. Because of the subspace topology W = C ∩ V for some open V ⊆ B. As f(A) ⊆ C we have f⁻¹(W) = f⁻¹(C ∩ V) = A ∩ f⁻¹(V) = f⁻¹(V), which is open because f : A → B is continuous. Therefore f : A → C is continuous.

• Let f : A → B satisfy f(a) = b for all a ∈ A. Then for any open V ⊆ B we have that f⁻¹(V) equals either ∅ (b ∉ V) or A (b ∈ V), which are both part of the topology of A by definition. So f is continuous.
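For finite topological spaces, continuity in the sense of Lemma (2.1.14) (preimages of open sets are open) can be checked exhaustively. The sketch below illustrates the composition and restricting-domain items of Theorem 2.1.28; all spaces and maps in it are hypothetical examples, with maps represented as dictionaries.

```python
def preimage(f, subset):
    """Preimage of `subset` under a map given as a dict."""
    return frozenset(x for x, y in f.items() if y in subset)


def is_continuous(f, top_dom, top_cod):
    """f is continuous iff the preimage of every open set is open."""
    return all(preimage(f, V) in top_dom for V in top_cod)


def subspace_topology(opens, C):
    """Subspace topology on C: all sets U ∩ C with U open."""
    C = frozenset(C)
    return {U & C for U in opens}


fs = frozenset
# Hypothetical finite spaces.
top_A = {fs(), fs({0}), fs({0, 1}), fs({0, 1, 2})}
top_B = {fs(), fs({"x"}), fs({"x", "y"})}
top_C = {fs(), fs({"u"}), fs({"u", "v"})}

f = {0: "x", 1: "x", 2: "y"}            # continuous A -> B
g = {"x": "u", "y": "v"}                # continuous B -> C
g_after_f = {a: g[f[a]] for a in f}     # composition g ∘ f
h = {0: "y", 1: "x", 2: "x"}            # not continuous: preimage of {"x"} is {1, 2}

# Restricting the domain of f to the subset {1, 2} of A.
top_sub = subspace_topology(top_A, {1, 2})
f_restricted = {a: f[a] for a in (1, 2)}

print(is_continuous(f, top_A, top_B), is_continuous(g_after_f, top_A, top_C))
print(is_continuous(h, top_A, top_B), is_continuous(f_restricted, top_sub, top_B))
```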

Definition 2.1.29: Morphisms of topological spaces
Let A, B be topological spaces. Then all continuous maps f : A → B are morphisms between A and B. The identity morphism is the continuous map

idA : A → A : a ↦ a.

Topological isomorphisms (denoted by T-isomorphisms) are commonly called homeomorphisms.


2.2 Separation axioms

This definition deals with the increasing precision with which we can distinguish subsets of a topological space using the space’s topology.

Definition 2.2.1: Separation (T0, T1, T2, T3, T3.5, T4, T6)
Let A be a topological space. The following properties should hold for any a1, a2 ∈ A, a1 ≠ a2 (separation of distinct points in A).

• We say A is T0 if ∃U ⊆ A open : a1 ∈ U ∧ a2 ∉ U.

• We say A is T1 if

∃U1, U2 ⊆ A open : a1 ∈ U1 ∧ a2 ∈ U2 ∧ a1 ∉ U2 ∧ a2 ∉ U1.

• We say A is T2 or Hausdorff if

∃U1, U2 ⊆ A open : a1 ∈ U1 ∧ a2 ∈ U2 ∧ U1 ∩ U2 = ∅.

The following properties should hold for any a ∈ A, B ⊆ A closed, a ∉ B (separation of a point and a closed set).

• We say A is regular if

∃U1, U2 ⊆ A open : a ∈ U1 ∧ B ⊆ U2 ∧ U1 ∩ U2 = ∅.

• We say A is T3 if A is regular and T1.

• We say A is completely regular if

∃f : A → R continuous : f(a) = 0 ∧ f(B) = {1}.

• We say A is T3.5 or Tychonoff if A is completely regular and T1.

The following properties should hold for any B, C ⊆ A closed, B ∩ C = ∅ (separation of disjoint closed sets).

• We say A is normal if

∃U1, U2 ⊆ A open : B ⊆ U1 ∧ C ⊆ U2 ∧ U1 ∩ U2 = ∅.

• We say A is T4 if A is normal and T1.

• We say A is perfectly normal if

∃f : A → R continuous : f⁻¹({0}) = B ∧ f⁻¹({1}) = C.

• We say A is T6 if A is perfectly normal and T1.

Lemma 2.2.2: Closed point sets
Let A be a topological space. Then A is T1 if and only if for all a ∈ A, {a} ⊆ A is closed.


Proof. Suppose A is T1 and let a ∈ A. Let a1 ∈ A \ {a}; then a1 ≠ a, so (A is T1) there exist open $U, U_{a_1} \subseteq A$ such that $a \in U$, $a_1 \in U_{a_1}$, $a \notin U_{a_1}$, $a_1 \notin U$. So for all a1 ∈ A \ {a} there exists an open $U_{a_1} \subseteq A$ such that $a_1 \in U_{a_1} \subseteq A \setminus \{a\}$. But then $A \setminus \{a\} = \bigcup_{a_1 \in A \setminus \{a\}} \{a_1\} \subseteq \bigcup_{a_1 \in A \setminus \{a\}} U_{a_1} \subseteq A \setminus \{a\}$, so $A \setminus \{a\} = \bigcup_{a_1 \in A \setminus \{a\}} U_{a_1}$, which is open as all $U_{a_1}$ are open. Therefore {a} = A \ (A \ {a}) is closed.
Suppose {a} ⊆ A is closed for all a ∈ A. Let a1, a2 ∈ A, a1 ≠ a2. By assumption {a1}, {a2} ⊆ A are closed, so U1 := A \ {a2}, U2 := A \ {a1} are open and a1 ∈ U1, a2 ∈ U2, a1 ∉ U2, a2 ∉ U1, because a1 ≠ a2. So A is T1.

Lemma 2.2.3: Shrinking
Let A be a topological space. Then A is normal if and only if for all B ⊆ U1 ⊆ A where B is closed and U1 is open there exists an open U2 ⊆ A such that $B \subseteq U_2 \subseteq \overline{U_2} \subseteq U_1$.

Proof. Suppose A is normal and let B ⊆ U1 ⊆ A be given. As U1 is open, C := A \ U1 ⊆ A is closed, so because A is normal and B, C ⊆ A are closed and disjoint there exist open sets U2, U3 ⊆ A such that B ⊆ U2 and C ⊆ U3 and U2 ∩ U3 = ∅. Now A \ U3 is a closed set containing U2, so $\overline{U_2} \subseteq A \setminus U_3 \subseteq A \setminus C = U_1$, therefore $B \subseteq U_2 \subseteq \overline{U_2} \subseteq U_1$.
Suppose the converse holds. Let B, C ⊆ A be arbitrary closed sets for which B ∩ C = ∅. Then U1 := A \ C is an open set containing B, so by assumption there exists an open U2 ⊆ A such that $B \subseteq U_2 \subseteq \overline{U_2} \subseteq U_1$, but then $U_2$ and $A \setminus \overline{U_2}$ are disjoint open sets containing B and C respectively. Since this is true for all such B and C, A is normal.

The following lemma explains why there is no analogue of T3.5 for normal spaces.

Lemma 2.2.4: Urysohn's lemma
Let A be normal. Then for all B, C ⊆ A that are closed and disjoint there exists a continuous f : A → [0, 1] such that f(B) = {0} and f(C) = {1}.

Proof. We follow [Mun2000]. As Q ∩ [0, 1] ⊆ Q is countable, there exists a bijection q : N → Q ∩ [0, 1] : k ↦ qk for which q0 = 0 and q1 = 1. Now construct open sets $U_{q_k} \subseteq A$ by induction on k, such that for all r, s ∈ Q ∩ [0, 1] we have $r < s \Rightarrow \overline{U_r} \subseteq U_s$.
First k = 0, 1. Define U1 := A \ C, which is open and contains B. By Lemma (2.2.3) there exists an open set, which we will define to be U0, such that $B \subseteq U_0 \subseteq \overline{U_0} \subseteq U_1$. Therefore the $U_{q_k}$ satisfy the induction hypothesis for 0 ≤ k ≤ 1.

Now suppose we have constructed the $U_{q_k}$ with the inclusion property for 0 ≤ k ≤ k0. As $\{q_0, \dots, q_{k_0}\}$ is finite, there exist l, m ∈ {0, . . . , k0} such that $q_l < q_{k_0+1} < q_m$ with $|q_l - q_{k_0+1}|$ and $|q_{k_0+1} - q_m|$ minimal. By the induction hypothesis $\overline{U_{q_l}} \subseteq U_{q_m}$, so using Lemma (2.2.3) we obtain $U_{q_{k_0+1}} \subseteq A$ as the open set for which $\overline{U_{q_l}} \subseteq U_{q_{k_0+1}} \subseteq \overline{U_{q_{k_0+1}}} \subseteq U_{q_m}$. Through choice of l and m, the $U_{q_k}$ satisfy the induction hypothesis for 0 ≤ k ≤ k0 + 1. By induction this permits us to construct the $U_{q_k}$ for all k ∈ N with the desired inclusion property, and therefore Ur for all r ∈ Q ∩ [0, 1] as q : N → Q ∩ [0, 1] is a bijection. Define Ur := ∅ for r < 0 and Ur := A for r > 1 to obtain open Ur ⊆ A for all r ∈ Q satisfying $\overline{U_r} \subseteq U_s$ whenever r < s. Now define f : A → [0, 1] by

f(a) := inf{r ∈ Q|a ∈ Ur}


which indeed lies within [0, 1] by choice of the Ur. As U0 ⊇ B and Ur = ∅ for r < 0 we see that f(B) = {0}; because U1 = A \ C and Ur = A for all r > 1 we have f(C) = {1}. Because of the inclusion property, if $a \in \overline{U_r}$, then a ∈ Us for all s > r, so {s ∈ Q | a ∈ Us} ⊇ Q ∩ ]r, ∞[ and f(a) ≤ r; and if a ∉ Ur, then a ∉ Us for all s < r and hence f(a) ≥ r.
Let a ∈ A be fixed and ε ∈ ]0, ∞[. Choose r, s ∈ Q such that f(a) − ε < r < f(a) < s < f(a) + ε (such r and s exist because Q is dense in R) and let $U := U_s \setminus \overline{U_r} \subseteq A$, which is open; then r < f(a) < s implies $a \notin \overline{U_r}$ and a ∈ Us, so a ∈ U. Furthermore, for all a1 ∈ U we have $a_1 \in \overline{U_s}$ and a1 ∉ Ur, so f(a1) ∈ [r, s] ⊆ ]f(a) − ε, f(a) + ε[. Because this is true for all ε ∈ ]0, ∞[, f is continuous at a. As a ∈ A was arbitrary, f is continuous.

Theorem 2.2.5: Relations between the separation axioms
Let A be a topological space. Then

• perfectly normal ⇒ normal,

• T1 and normal ⇒ completely regular, • completely regular ⇒ regular,

• T6 ⇒ T4 ⇒ T3.5 ⇒ T3 ⇒ T2 ⇒ T1 ⇒ T0.

Proof.

• Suppose A is perfectly normal and let B, C ⊆ A be closed and disjoint. As A is perfectly normal there exists a continuous f : A → R such that f⁻¹({0}) = B, f⁻¹({1}) = C. Since ]−∞, 1/2[, ]1/2, ∞[ ⊆ R are open and disjoint, so are f⁻¹(]−∞, 1/2[), f⁻¹(]1/2, ∞[) ⊆ A because f is continuous. As 0 ∈ ]−∞, 1/2[ and 1 ∈ ]1/2, ∞[ we see that B ⊆ f⁻¹(]−∞, 1/2[), C ⊆ f⁻¹(]1/2, ∞[). So we can separate closed, disjoint B and C with open sets: A is normal.

• Let A be normal and T1. Let a ∈ A and B ⊆ A closed such that a ∉ B. Because A is T1, by Lemma (2.2.2) {a} ⊆ A is closed and disjoint from B; therefore (A is normal by assumption) by Lemma (2.2.4) there exists a continuous f : A → R with f({a}) = {0} and f(B) = {1}. Therefore A is completely regular.

• Suppose A is completely regular. Use the trick from the reduction from perfectly normal to normal to obtain regularity of A.

• It is clear from Definition (2.2.1) that T2 ⇒ T1 ⇒ T0, so for the final item, simply combine all of the above, noting that for T3 and above we assume T1 (and hence have T0) by definition.
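On a finite space the point-separation axioms of Definition (2.2.1) can be tested by enumeration, which gives a quick sanity check of Lemma (2.2.2) and of the implication T2 ⇒ T1. This is only a sketch: the three-point set and the two topologies on it are hypothetical examples.

```python
from itertools import combinations


def discrete_topology(points):
    """All subsets of a finite set, as frozensets."""
    items = list(points)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_T1(points, opens):
    """Literal transcription of the T1 condition for all pairs a != b."""
    return all(
        any(a in U1 and b in U2 and a not in U2 and b not in U1
            for U1 in opens for U2 in opens)
        for a in points for b in points if a != b)


def is_T2(points, opens):
    """Hausdorff: distinct points have disjoint open neighbourhoods."""
    return all(
        any(a in U1 and b in U2 and not (U1 & U2)
            for U1 in opens for U2 in opens)
        for a in points for b in points if a != b)


def points_closed(points, opens):
    """Every singleton {a} is closed, i.e. its complement is open."""
    whole = frozenset(points)
    return all(whole - {a} in opens for a in points)


X = {0, 1, 2}
discrete = discrete_topology(X)
coarse = {frozenset(), frozenset({0}), frozenset(X)}  # hypothetical non-T1 topology

for name, top in (("discrete", discrete), ("coarse", coarse)):
    print(name, is_T2(X, top), is_T1(X, top), points_closed(X, top))
```

In both examples `is_T1` and `points_closed` agree, as Lemma (2.2.2) predicts.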

Lemma 2.2.6: Uniqueness of limits
Let A be a topological space, B a T2 space, f : A → B, a ∈ A, and b1, b2 ∈ B. If limx→a f(x) = b1 and limx→a f(x) = b2, then b1 = b2.

Proof. Suppose b1 ≠ b2; then because B is T2 there exist open neighbourhoods V1 and V2 of b1 and b2 respectively in B such that V1 ∩ V2 = ∅. Because limx→a f(x) = b1 there exists an open neighbourhood U1 of a in A such that f(U1) ⊆ V1 and because limx→a f(x) = b2 there exists an open neighbourhood


U2 of a in A such that f(U2) ⊆ V2. Now f(U1 ∩ U2) ⊆ V1 ∩ V2 = ∅, however a ∈ U1 ∩ U2 so f(a) ∈ V1 ∩ V2, leading to a contradiction. Therefore necessarily b1 = b2. This permits us in a Hausdorff space to actually talk about the limit of a certain function at a certain point.

Lemma 2.2.7: Graphs of continuous functions are closed
Let A be a topological space, B a T2 space, and f : A → B. If f is continuous, then

graph(f) := {(a, b) ∈ A × B | b = f(a)} ⊆ A × B

is closed.

Proof. Let (a, b) ∈ (A × B) \ graph(f); then b ≠ f(a), so as B is T2 there exist open neighbourhoods V1, V2 of b resp. f(a) in B such that V1 ∩ V2 = ∅. As f is continuous, there exists an open neighbourhood U2 of a in A with f(U2) ⊆ V2. Now $V_{(a,b)} := U_2 \times V_1$ is an open neighbourhood of (a, b) in A × B and for any $(a_1, b_1) \in V_{(a,b)}$ we have f(a1) ∈ f(U2) ⊆ V2, so as b1 ∈ V1 and V1 ∩ V2 = ∅, we find f(a1) ≠ b1 and hence (a1, b1) ∉ graph(f).
So for any (a, b) ∈ (A × B) \ graph(f) there exists an open neighbourhood $V_{(a,b)}$ of (a, b) in A × B such that $V_{(a,b)} \cap \operatorname{graph}(f) = \emptyset$. Therefore $(A \times B) \setminus \operatorname{graph}(f) = \bigcup_{(a,b)} \{(a,b)\} \subseteq \bigcup_{(a,b)} V_{(a,b)} \subseteq (A \times B) \setminus \operatorname{graph}(f)$, where both unions run over all (a, b) ∈ (A × B) \ graph(f); this set is open as a union of a collection of open subsets. Hence graph(f) ⊆ A × B is closed.

A partial converse to this result is given in Lemma (2.2.8) and Theorem (4.4.4).

Lemma 2.2.8
Let A, B be topological spaces and f : A → B. If A × B → A : (a, b) ↦ a is a closed map and graph(f) ⊆ A × B is closed, then f is continuous.

Proof. Define g1 : A × B → A : (a, b) ↦ a, g2 : A × B → B : (a, b) ↦ b. Then g1, g2 are continuous by definition of the initial topology on A × B. Now for any closed V ⊆ B we have

f⁻¹(V) = {a ∈ A | f(a) ∈ V} = {g1(a, f(a)) ∈ A | g2(a, f(a)) ∈ V} = {g1(a, b) ∈ A | (a, b) ∈ graph(f), g2(a, b) ∈ V} = g1(graph(f) ∩ g2⁻¹(V)).

As g2 is continuous (Lemma (2.1.14)), g2⁻¹(V) is closed; graph(f) is closed by assumption and g1 is a closed map by assumption, so f⁻¹(V) ⊆ A is closed. Since this is true for all closed V ⊆ B, f is continuous by Lemma (2.1.14).

2.3 Sequences

Definition 2.3.1: Sequence
Let A be a topological space. Then a sequence is a map x : N → A : k ↦ xk. We say that x has limit a in A, denoted by

$\lim_{k \to \infty} x_k = a$,


if for all open neighbourhoods U of a in A there exists a k ∈ N such that for all l ≥ k we have xl ∈ U. If there exists an a ∈ A such that limk→∞ xk = a we say that the sequence x is convergent in A.
A sequence x′ : N → A is called a subsequence of x : N → A if there exist k1 < k2 < . . . in N such that $x'_l = x_{k_l}$ for all l ∈ N. Note that if a sequence is convergent, then so is any subsequence, and for any fixed k ∈ N the sequence l ↦ xl is convergent if and only if l ↦ xk+l is convergent.

Example 2.3.2: N̂
N̂ is the topological space defined as a set as

N̂ := N ∪ {∞},

with topology consisting of all subsets

{k, k + 1, k + 2, . . .} ∪ {∞}

of N̂ for all k ∈ N, together with the empty set. That this is a topology follows from the fact that N is a well-ordered set: each non-empty subset of N has a least element. Let A be a topological space; then looking at Definition (2.3.1) we see that any map

x : N̂ → A : k ↦ x(k) = xk

is continuous at ∞ if and only if limk→∞ x(k) = x(∞), that is, if and only if the sequence x|N satisfies limk→∞ (x|N)k = x(∞) for the point x(∞) ∈ A.

Definition 2.3.3: First countability (1C)
Let A be a topological space. Then we say that A is first countable (denoted by 1C) if for all a ∈ A there exists a basis of open neighbourhoods of a in A that is countable (cf. Definition (2.1.4)).

Note that we may, without loss of generality, suppose this countable collection U1, U2, . . . of open neighbourhoods to be descending: U1 ⊇ U2 ⊇ . . ., by considering U1, U1 ∩ U2, U1 ∩ U2 ∩ U3, . . . , which are all open neighbourhoods of a in A.

Lemma 2.3.4
Let A be a first countable topological space, B a topological space, and f : A → B. Let a ∈ A and b ∈ B; then

$\lim_{x \to a} f(x) = b$

if and only if for all sequences x : N → A satisfying limk→∞ xk = a, we have

$\lim_{k \to \infty} f(x_k) = b.$


Proof. Suppose limx→a f(x) = b and let x : N → A be any sequence satisfying limk→∞ xk = a. Then limk→∞ f(xk) = limk→∞ (f ◦ x)(k) = b, because of the chain rule for limits and Example (2.3.2).
Suppose limx→a f(x) ≠ b. Then there exists an open neighbourhood V of b in B such that for all open neighbourhoods U of a in A there exists a point a1 ∈ U for which f(a1) ∉ V. Because A is first countable, there exists a descending countable collection U1, U2, . . . of open neighbourhoods of a in A such that for each open neighbourhood U of a in A there exists a k ∈ N such that Ul ⊆ U for all l ≥ k. Construct a sequence x : N → A by mapping k ∈ N to a point xk ∈ Uk for which f(xk) ∉ V. Then limk→∞ xk = a, because for any open neighbourhood U of a in A there exists a k ∈ N such that for all l ≥ k we have Ul ⊆ U, and xl ∈ Ul for all l ∈ N. However, by construction, for all k ∈ N we have f(xk) ∉ V and hence limk→∞ f(xk) ≠ b. So there exists a sequence x : N → A with limk→∞ xk = a and limk→∞ f(xk) ≠ b.

Lemma 2.3.5
Let A be a first countable topological space and B ⊆ A a subset. Then $a \in \overline{B}$ if and only if there exists a convergent sequence x : N → B such that a = limk→∞ xk.

Proof. Let $a \in \overline{B}$. As A is first countable, there exists a descending countable basis of neighbourhoods U1, U2, . . . of a in A. By Lemma (2.1.3), for each k ∈ N, Uk ∩ B ≠ ∅ because Uk is an open neighbourhood of $a \in \overline{B}$. Hence we can for each k ∈ N pick an xk ∈ Uk ∩ B to obtain a sequence x : N → B : k ↦ xk. Let U be any open neighbourhood of a in A; then there exists a k ∈ N such that a ∈ Ul ⊆ U for all l ≥ k (as U1 ⊇ U2 ⊇ . . .). Hence for all l ≥ k, xl ∈ U, so limk→∞ xk = a.
Suppose conversely that x : N → B has limit a = limk→∞ xk ∈ A. Let U be an arbitrary open neighbourhood of a in A; then there exists a k ∈ N such that for all l ≥ k we have xl ∈ U. In particular xk ∈ U ∩ B (as x : N → B), so U ∩ B ≠ ∅. Since this is true for all open neighbourhoods U of a in A, by Lemma (2.1.3), $a \in \overline{B}$.
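Lemma 2.3.4, combined with uniqueness of limits in the Hausdorff space R (Lemma 2.2.6), gives a practical test for the non-existence of a limit: exhibit two sequences with the same limit a along which f tends to different values. The sketch below does this numerically for the standard example f(x) = sin(1/x) at a = 0. The value f(0) := 0 and the two sequences are choices made here for illustration, and floating-point evaluation only approximates the exact values 0 and 1.

```python
import math


def f(x):
    """sin(1/x), extended by the (arbitrary) value 0 at x = 0."""
    return math.sin(1.0 / x) if x != 0 else 0.0


def x_seq(k):
    """x_k = 1 / ((k + 1) * pi) tends to 0, and f(x_k) = sin((k + 1) * pi) = 0."""
    return 1.0 / ((k + 1) * math.pi)


def y_seq(k):
    """y_k = 1 / (pi / 2 + 2 * k * pi) tends to 0, and f(y_k) = 1."""
    return 1.0 / (math.pi / 2 + 2 * k * math.pi)


for k in (1, 10, 100, 1000):
    print(k, x_seq(k), f(x_seq(k)), y_seq(k), f(y_seq(k)))
```

Both sequences converge to 0, but the values of f along them stay near 0 and near 1 respectively, so no single b can satisfy the sequence criterion of the lemma.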

2.4 Compactness

Definition 2.4.1: Collections of subsets
Let A be a topological space. A subset $\mathcal{A} \subseteq \mathcal{P}(A)$ is called a collection of subsets of A. A collection $\mathcal{A}$ is called

• open (resp. closed) if U ⊆ A is open (resp. closed) for all $U \in \mathcal{A}$,

• a cover of A if $A = \bigcup \mathcal{A}$,

• finite if $\mathcal{A}$ consists of a finite number of elements,

• locally finite if for all a ∈ A there exists an open neighbourhood U of a in A such that $\{U_1 \in \mathcal{A} \mid U_1 \cap U \neq \emptyset\}$ is finite,

• countably locally finite if $\mathcal{A} = \bigcup_{k \in \mathbb{N}} \mathcal{A}_k$ where each $\mathcal{A}_k$ is a locally finite collection.


For any two collections A1, A2 of subsets of A we furthermore say that A1 is a subcollection of A2 if A1 ⊆ A2, and that A1 is a refinement of A2 if for all U1 ∈ A1 there exists a U2 ∈ A2 such that U1 ⊆ U2.

Definition 2.4.2: Compactness (Cpt)
Let A be a topological space. Then A is compact (denoted by Cpt) if all open covers of A contain a finite subcollection covering A.

Lemma 2.4.3
Let A be a compact topological space and B a topological space. If f : A → B is continuous, then f(A) ⊆ B is compact.

Proof. Let $\mathcal{B}$ be an open cover of f(A). Then (subspace topology) we can write $\mathcal{B} = \{V_i \cap f(A) \mid i \in I\}$ where all Vi ⊆ B are open. Because f is continuous, for all i ∈ I, f⁻¹(Vi ∩ f(A)) = f⁻¹(Vi) ∩ A ⊆ A is open, and since $A = f^{-1}(f(A)) = f^{-1}(\bigcup \mathcal{B}) = \bigcup_{i \in I} f^{-1}(V_i \cap f(A))$ the collection $\mathcal{A} := \{f^{-1}(V_i \cap f(A)) \mid i \in I\}$ forms an open cover of A.
Because A is compact there exists a finite subcollection $f^{-1}(V_{i_1} \cap f(A)), \dots, f^{-1}(V_{i_k} \cap f(A))$ of this cover which covers A. Hence $V_{i_1} \cap f(A), \dots, V_{i_k} \cap f(A)$ is a finite open cover of f(A) which is furthermore a subcollection of $\mathcal{B}$. This makes f(A) compact.

Lemma 2.4.4
Let A be a compact topological space. Then any closed B ⊆ A is compact.

Proof. Let $\mathcal{A}$ be any open cover of B; then because of the subspace topology we can write $\mathcal{A} = \{U_i \cap B \mid i \in I\}$ where all Ui ⊆ A are open. Note that {Ui | i ∈ I} ∪ {A \ B} now forms an open cover of A, because B is closed. Now A is compact, so this open cover has a finite subcollection covering A, which in turn also covers B. Intersecting the sets Ui in this finite subcollection with B gives a finite subcollection of $\mathcal{A}$ covering B. This shows that B is compact.

Lemma 2.4.5
Let A be a T2 space. Then any compact B ⊆ A is closed.

Proof. Let a ∈ A \ B; then for any b ∈ B we have b ≠ a and hence (A is T2) there exist open neighbourhoods Ub of a and Vb of b in A such that Ub ∩ Vb = ∅. The collection {Vb ∩ B | b ∈ B} forms an open cover of B (as b ∈ Vb for all b ∈ B), so (B is compact) there exists a finite number of b1, . . . , bk ∈ B such that $B \subseteq V_{b_1} \cup \dots \cup V_{b_k}$.

Hence $a \in U_{b_1} \cap \dots \cap U_{b_k} \subseteq A \setminus (V_{b_1} \cup \dots \cup V_{b_k}) \subseteq A \setminus B$. So for each a ∈ A \ B there exists an open neighbourhood $U_a$ (= $U_{b_1} \cap \dots \cap U_{b_k}$) of a in A with a ∈ Ua ⊆ A \ B. Therefore $A \setminus B = \bigcup_{a \in A \setminus B} \{a\} \subseteq \bigcup_{a \in A \setminus B} U_a \subseteq A \setminus B$, so A \ B is open and hence B is closed.

Lemma 2.4.6: Sequential compactness
Let A be a compact, first countable topological space. Then for all sequences x : N → A there exists a subsequence x′ : N → A of x which is convergent.


Proof. Let x : N → A be any sequence. Suppose there exists an a ∈ A such that for all open neighbourhoods U of a in A the set {k ∈ N | xk ∈ U} is infinite. Since A is first countable, a admits a countable basis of open neighbourhoods U1 ⊇ U2 ⊇ . . .. With this basis we can construct a subsequence of x by choosing $x'_l := x_{k_l}$, with $k_l$ the least element of the infinite set {k ∈ N | xk ∈ Ul} ⊆ N that is larger than $k_{l-1}$ (possible as N is well-ordered). Now let U be an arbitrary open neighbourhood of a in A; then there exists an m ∈ N such that a ∈ Um ⊆ U. For all l ≥ m we have $x'_l = x_{k_l} \in U_l \subseteq U_m \subseteq U$ by construction. Therefore $\lim_{l \to \infty} x'_l = a$ and x′ is a convergent subsequence of x.
Now suppose that this is not the case: suppose that for all a ∈ A there exists an open neighbourhood Ua of a in A such that {k ∈ N | xk ∈ Ua} is finite, and denote the number of elements of this set by ka ∈ N. The collection $\mathcal{U} := \{U_a \subseteq A \mid a \in A\}$ is an open cover of A, therefore (A is compact) there exists a finite number of points a1, . . . , al ∈ A such that $A = U_{a_1} \cup \dots \cup U_{a_l}$. As

$\{x_1, x_2, \dots\} \subseteq A = U_{a_1} \cup \dots \cup U_{a_l}$ and $\{x_1, x_2, \dots\} \cap U_{a_m}$ has at most $k_{a_m}$ elements for all 1 ≤ m ≤ l, the set {x1, x2, . . .} has at most $k_{a_1} + \dots + k_{a_l}$ elements and is therefore finite. Because N is infinite, this means that there exists some a ∈ A such that the set {k ∈ N | xk = a} is infinite. Therefore the constant sequence $x'_l := a$ for all l ∈ N is a convergent subsequence of x.

2.5 Metric spaces

Definition 2.5.1: Metric space (d(.,.) ) Let A be a set. Then a pseudometric on A is a map d : A × A → R, satisfying for all a1, a2, a3 ∈ A that

• d(a1, a2) ≥ 0,

• d(a1, a1) = 0,

• d(a1, a2) = d(a2, a1),

• d(a1, a3) ≤ d(a1, a2) + d(a2, a3).

If in addition d(a1, a2) = 0 → a1 = a2, then d is called a metric on A. A (pseudo)metric space A is a set A together with a (pseudo)metric d. We denote the fact that A is a metric space by A d(.,.).

Definition 2.5.2: Open ball
Let A be a (pseudo)metric space. Then for any a ∈ A and δ ∈ ]0, ∞[ we define the open ball of radius δ around a in A to be BA(a, δ) := {a1 ∈ A | d(a, a1) < δ}.

Lemma 2.5.3
Let A be a (pseudo)metric space. Then the collection of all open balls in A,

$\mathcal{A} := \{B_A(a, \delta) \subseteq A \mid a \in A, \delta \in ]0, \infty[\}$,

forms a topological basis of A.


Proof. First note that d(a, a) = 0 < δ for all δ ∈ ]0, ∞[, so a ∈ BA(a, δ) for all a ∈ A and δ ∈ ]0, ∞[. Therefore $A = \bigcup \mathcal{A}$. Let a1, a2 ∈ A and δ1, δ2 ∈ ]0, ∞[. If a1 = a2 then BA(a1, δ1) ∩ BA(a2, δ2) = BA(a1, min{δ1, δ2}), which is again an element of the basis, so we may suppose a1 ≠ a2. Let a3 ∈ BA(a1, δ1) ∩ BA(a2, δ2) be arbitrary. Choose δ3 := min{δ1 − d(a1, a3), δ2 − d(a2, a3)} > 0; then for any a4 ∈ BA(a3, δ3) we have d(a1, a4) ≤ d(a1, a3) + d(a3, a4) < d(a1, a3) + δ3 ≤ d(a1, a3) + δ1 − d(a1, a3) = δ1, so a4 ∈ BA(a1, δ1). Similarly a4 ∈ BA(a2, δ2), so BA(a3, δ3) ⊆ BA(a1, δ1) ∩ BA(a2, δ2), and such a basis element exists for all a3 ∈ BA(a1, δ1) ∩ BA(a2, δ2). Therefore $\mathcal{A}$ is a topological basis.

Definition 2.5.4: Topology of a metric space
Let A be a (pseudo)metric space. Then we always consider A as a topological space with topology generated by the topological basis from Lemma (2.5.3). We call a topological space B (pseudo)metrisable if B is T-isomorphic to some (pseudo)metric space A.

Note that this in particular makes all BA(a, δ) open subsets of A.

Lemma 2.5.5
Let A, B both be (pseudo)metric spaces, a ∈ A, b ∈ B, f : A → B. Then limx→a f(x) = b if and only if

∀ε ∈ ]0, ∞[ : ∃δ ∈ ]0, ∞[ : ∀a1 ∈ A : (dA(a, a1) < δ → dB(b, f(a1)) < ε).

Proof. Note that the statement is equivalent to ∀ε ∈ ]0, ∞[ : ∃δ ∈ ]0, ∞[ : f(BA(a, δ)) ⊆ BB(b, ε). Since the open balls form topological bases for the topologies of A and B, we know that for any open neighbourhood V of b in B there exists an ε ∈ ]0, ∞[ such that b ∈ BB(b, ε) ⊆ V (recall Lemma (2.1.7)), and similarly for all open neighbourhoods U of a in A there exists a δ ∈ ]0, ∞[ such that a ∈ BA(a, δ) ⊆ U.

Lemma 2.5.6
Let A, B be topological spaces, C a metric space, a ∈ A, b ∈ B, c ∈ C, and f : A × B → C. Suppose

$\lim_{(x,y) \to (a,b)} f(x, y) = c$

and that for all x ∈ A there exists a g(x) ∈ C such that

$\lim_{y \to b} f(x, y) = g(x)$;

this gives us a function g : A → C. Then for this function g we have

$\lim_{x \to a} g(x) = c.$

Proof. We follow [Dui2003]. Let ε ∈ ]0, ∞[ be given; then because lim(x,y)→(a,b) f(x, y) = c (by Lemma (2.5.5)) there exist open neighbourhoods U and V of a and b in A and B respectively such that for all (x, y) ∈ U × V we have dC(f(x, y), c) < ε/2. Let x ∈ U and let δ ∈ ]0, ∞[ be arbitrary. Then because limy→b f(x, y) = g(x)


there exists an open neighbourhood Vδ of b in B such that for all y ∈ Vδ, dC(f(x, y), g(x)) < δ. As (x, y) ∈ U × V for all y ∈ V ∩ Vδ, and V ∩ Vδ is nonempty (it contains b), we find

dC(g(x), c) ≤ dC(g(x), f(x, y)) + dC(f(x, y), c) < δ + ε/2.

Since this is true for all δ ∈ ]0, ∞[ we find that necessarily

dC(g(x), c) ≤ 0 + ε/2.

Therefore, for all x ∈ U we have

dC(g(x), c) ≤ ε/2 < ε.

Hence (Lemma (2.5.5)), limx→a g(x) = c, as desired.

Corollary 2.5.7: Exchanging of limits
Let A, B be topological spaces, C a metric space, a ∈ A, b ∈ B, c ∈ C, and f : A × B → C. If lim(x,y)→(a,b) f(x, y) exists and there exists an open neighbourhood U × V of (a, b) in A × B such that for all (x, y) ∈ U × V the limits

$\lim_{x' \to a} f(x', y), \qquad \lim_{y' \to b} f(x, y')$

exist, then

$\lim_{x \to a} \Big( \lim_{y \to b} f(x, y) \Big) = \lim_{y \to b} \Big( \lim_{x \to a} f(x, y) \Big) = \lim_{(x,y) \to (a,b)} f(x, y).$

Proof. Apply Lemma (2.5.6) to x and y separately for the subspaces U and V, and use that the limits exist in these subspaces if and only if they exist in A and B, because U × V is an open neighbourhood of (a, b) in A × B.

Definition 2.5.8
Let A be a (pseudo)metric space. Then for any nonempty subset B ⊆ A we define for all a ∈ A the distance from a to B as d(a, B) := inf{d(a, a1) ∈ R | a1 ∈ B} ≥ 0.

Lemma 2.5.9
Let A be a (pseudo)metric space and B ⊆ A a nonempty subset. Then both the metric d : A × A → R and the distance function d(·, B) : A → R : a ↦ d(a, B) are continuous. Furthermore, $d(\cdot, B)^{-1}(\{0\}) = \overline{B}$.
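The continuity of the distance function rests on the estimate |d(a1, B) − d(a2, B)| ≤ d(a1, a2). A minimal numerical sketch of that estimate, assuming the Euclidean metric on R² and a hypothetical finite set B (which is closed, so d(·, B) vanishes exactly on B):

```python
import math
import random


def d(p, q):
    """Euclidean metric on R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dist_to_set(a, B):
    """d(a, B) = inf of d(a, b) over b in B; a minimum since B is finite."""
    return min(d(a, b) for b in B)


# Hypothetical finite subset of the plane.
B = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]

rng = random.Random(0)  # fixed seed so the sample is reproducible
samples = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(200)]

# Check the 1-Lipschitz estimate on all sampled pairs (small float tolerance).
lipschitz_ok = all(
    abs(dist_to_set(p, B) - dist_to_set(q, B)) <= d(p, q) + 1e-12
    for p in samples for q in samples)

# d(., B) is zero on B itself.
zero_on_B = all(dist_to_set(b, B) == 0.0 for b in B)

print(lipschitz_ok, zero_on_B, dist_to_set((0.0, 1.0), B))
```

A random sample cannot prove the inequality, of course; it only illustrates the statement that the lemma establishes in general.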

Proof. First the metric. Let (a1, a2) ∈ A × A and ε ∈ ]0, ∞[ be given. Choose U := BA(a1, ε/2) × BA(a2, ε/2) ⊆ A × A, which is an open neighbourhood of (a1, a2) in A × A. Let (a3, a4) ∈ U; then d(a3, a4) ≤ d(a3, a1) + d(a1, a4) ≤ d(a3, a1) + d(a1, a2) + d(a2, a4) < ε/2 + d(a1, a2) + ε/2, so d(a3, a4) − d(a1, a2) < ε. On the other hand d(a1, a2) ≤ d(a1, a3) + d(a3, a2) ≤ d(a1, a3) + d(a3, a4) + d(a4, a2) < ε/2 + d(a3, a4) + ε/2, so d(a1, a2) − d(a3, a4) < ε. So for all (a3, a4) ∈ U we have |d(a3, a4) − d(a1, a2)| < ε, hence d(U) ⊆ ]d(a1, a2) − ε, d(a1, a2) + ε[. Since this is true for all a1, a2 ∈ A and ε ∈ ]0, ∞[, d is continuous.
Fix B ⊆ A nonempty; then for any a1, a2 ∈ A and b ∈ B we have d(a1, B) ≤ d(a1, b) ≤ d(a1, a2) + d(a2, b), so d(a1, B) − d(a1, a2) ≤ d(a2, B) and hence


d(a1, B) − d(a2, B) ≤ d(a1, a2). Similarly d(a2, B) − d(a1, B) ≤ d(a1, a2), so |d(a1, B) − d(a2, B)| ≤ d(a1, a2). This gives continuity of a ↦ d(a, B).
Note that {0} ⊆ R is closed and as d(·, B) is continuous, d(·, B)⁻¹({0}) ⊆ A is closed as well by Lemma (2.1.14). For all b ∈ B we have 0 ≤ d(b, B) ≤ d(b, b) = 0, so B ⊆ d(·, B)⁻¹({0}). Therefore $\overline{B} \subseteq d(\cdot, B)^{-1}(\{0\})$.
Let a ∈ d(·, B)⁻¹({0}) and U any open neighbourhood of a in A. Then there exists a δ ∈ ]0, ∞[ such that a ∈ BA(a, δ) ⊆ U. Since d(a, B) = inf{d(a, b) | b ∈ B} = 0 < δ there exists some bδ ∈ B such that d(a, bδ) < δ. But then bδ ∈ BA(a, δ) ⊆ U and therefore B ∩ U ≠ ∅. Because this is true for all open neighbourhoods U of a in A, we see (Lemma (2.1.3)) that $a \in \overline{B}$. Therefore $d(\cdot, B)^{-1}(\{0\}) \subseteq \overline{B}$. Because of this $\overline{B} = d(\cdot, B)^{-1}(\{0\})$.

Theorem 2.5.10
Let A be a topological space. Then A is a metric space ⇒ A is T6 and first countable.

Proof. Let a ∈ A be fixed. Consider the countable collection

$\mathcal{A} := \{ B_A(a, \tfrac{1}{k}) \mid k \in \mathbb{N}, k \geq 1 \}.$

Each element of $\mathcal{A}$ is clearly an open neighbourhood of a in A and for any open neighbourhood U of a in A, we have by definition of the topology that there exists an ε ∈ ]0, ∞[ such that a ∈ BA(a, ε) ⊆ U. Therefore, for k ≥ ⌈1/ε⌉ ∈ N we see that a ∈ BA(a, 1/k) ⊆ BA(a, ε) ⊆ U. Because of this A is first countable.
Let a1, a2 ∈ A and suppose that a1 ≠ a2. Then d(a1, a2) > 0; choose ε = d(a1, a2)/2 > 0, then BA(a1, ε) and BA(a2, ε) are two disjoint open neighbourhoods of a1 resp. a2 in A. Therefore A is T2.
Let B, C ⊆ A be disjoint and closed. Choose f : A → R defined by f(a) = d(a, B)/(d(a, B) + d(a, C)). From Lemma (2.5.9) we know that $d(\cdot, B)^{-1}(\{0\}) = \overline{B} = B$, because B is closed. Therefore, for all a ∈ A we have that d(a, B) = 0 if and only if a ∈ B (and similarly for C). Because of this d(a, B) + d(a, C) ≤ 0 if and only if d(a, B) = d(a, C) = 0 if and only if a ∈ B ∩ C = ∅, which is impossible. So d(a, B) + d(a, C) > 0 for all a ∈ A. Therefore (together with continuity of d(·, B) and d(·, C) from Lemma (2.5.9)), f is continuous.
Let a ∈ A; then f(a) = 0 if and only if d(a, B) = 0 if and only if a ∈ B, so f⁻¹({0}) = B. Furthermore, f(a) = 1 if and only if d(a, B) = d(a, B) + d(a, C) if and only if d(a, C) = 0 if and only if a ∈ C, so f⁻¹({1}) = C. So f : A → R is continuous and f⁻¹({0}) = B, f⁻¹({1}) = C. Since such a function exists for all disjoint, closed B, C ⊆ A we see that A is perfectly normal and (we already saw that A is T2, hence T1) therefore A is T6.

Definition 2.5.11: Cauchy sequence
Let A be a metric space and x : N → A a sequence. Then we call x a Cauchy sequence in A if for all ε ∈ ]0, ∞[ there exists a k ∈ N such that for all l, m ≥ k we have d(xl, xm) < ε.

Lemma 2.5.12
Let A be a metric space and x : N → A a sequence. If x is convergent in A, then x is a Cauchy sequence in A.

Proof. Suppose x is convergent in A; then there exists an a ∈ A such that limk→∞ xk = a. Let ε ∈ ]0, ∞[; then because BA(a, ε/2) is an open neighbourhood of a in A we have that there exists a k ∈ N such that for all l ≥ k


we have d(a, xl) < ε/2. But then for all l, m ≥ k we have that d(xl, xm) ≤ d(xl, a) + d(a, xm) < ε/2 + ε/2 = ε. Therefore x is Cauchy.

Example 2.5.13
The converse of Lemma (2.5.12) is not true: take any sequence in Q approximating √2 with fractions (for example a decimal expansion where xk is the k-decimal approximation: 1, 1.4, 1.41, 1.414, . . . ); then this sequence is Cauchy, but it has no limit in Q (as √2 ∉ Q).

Definition 2.5.14: Completeness
Let A be a metric space. We call A complete if every Cauchy sequence in A is convergent in A.

Example 2.5.15
Completeness is not a topological property: consider ]−1, 1[ and R, both equipped with the usual absolute value metric | · |. Then R is complete by construction, while for ]−1, 1[ we have that the Cauchy sequence $x : \mathbb{N} \to ]-1, 1[ : k \mapsto 1 - \frac{1}{k+2}$ has no limit in ]−1, 1[ (the sequence converges to 1 ∉ ]−1, 1[ when viewed as a sequence in R), hence ]−1, 1[ is not complete.
However, ]−1, 1[ is T-isomorphic to R via $]-1, 1[ \to \mathbb{R} : x \mapsto \frac{x}{\sqrt{1 - x^2}}$ with inverse $\mathbb{R} \to ]-1, 1[ : x \mapsto \frac{x}{\sqrt{1 + x^2}}$.
Therefore completeness is not preserved through T-isomorphism.

Definition 2.5.16: Baire space
Let A be a topological space. Then we call A Baire if for any countable collection B1, B2, . . . of subsets of A where for all k ∈ N, Bk ⊆ A is closed and int(Bk) = ∅ we have

$\operatorname{int}\Big( \bigcup_{k \in \mathbb{N}} B_k \Big) = \emptyset.$

Theorem 2.5.17: Baire category theorem
Let A be a metric space. If A is complete, then A is Baire.

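Proof. We follow [Mun2000]. Let $B_1, B_2, \ldots$ be a given countable collection of closed subsets of $A$ with empty interiors. Let $U \subseteq A$ be any nonempty open set. As $A$ is a metric space, we have by Theorem (2.5.10) that $A$ is T3. Since $\operatorname{int}(B_1) = \emptyset$ and $U \neq \emptyset$, there exists an $a_1 \in U \setminus B_1$. Now $A$ is T3, $a_1 \in U$, $a_1 \notin B_1$, and $B_1 \subseteq A$ is closed. Hence there exists a $\delta_1 \in ]0, 1[$ such that $U_1 := B_A(a_1, \delta_1) \subseteq \overline{U_1} \subseteq U$ and $\overline{U_1} \cap B_1 = \emptyset$.

Suppose that we have constructed a sequence $(a_1, \delta_1), \ldots, (a_k, \delta_k) \in A \times ]0, \infty[$ such that for each $1 \le l \le k$ we have $\delta_l \le \frac{1}{l}$, $a_l \in U_l = B_A(a_l, \delta_l)$, $\overline{U_l} \subseteq U_{l-1}$ (where $U_0 := U$), and $\overline{U_l} \cap B_l = \emptyset$. Then as $\operatorname{int}(B_{k+1}) = \emptyset$, there exists an $a_{k+1} \in U_k \setminus B_{k+1}$ and ($A$ is T3) therefore there exists a $\delta_{k+1} \in ]0, \frac{1}{k+1}[$ with $U_{k+1} := B_A(a_{k+1}, \delta_{k+1})$, $\overline{U_{k+1}} \subseteq U_k$, and $\overline{U_{k+1}} \cap B_{k+1} = \emptyset$. Using induction this permits us to construct a sequence $\mathbb{N} \to A \times ]0, 1[ : k \mapsto (a_k, \delta_k)$ with (here $U_k := B_A(a_k, \delta_k)$)

\[ U = U_0 \supseteq \overline{U_1} \supseteq \overline{U_2} \supseteq \ldots, \qquad \forall k \in \mathbb{N} : \ \delta_k \le \frac{1}{k}, \quad \overline{U_k} \cap B_k = \emptyset. \]

By construction, for any given $\varepsilon \in ]0, \infty[$, take $k = \lceil \frac{1}{\varepsilon} \rceil \in \mathbb{N}$, then for all $l, m \in \mathbb{N}$ with $k \le l \le m$ we have $a_m \in U_m \subseteq U_l = B_A(a_l, \delta_l)$, so $d(a_l, a_m) < \delta_l \le \frac{1}{l} \le \frac{1}{k} \le \varepsilon$. But this makes the sequence $\mathbb{N} \to A : k \mapsto a_k$ Cauchy and since $A$ is complete, there exists an $a \in A$ such that $\lim_{k \to \infty} a_k = a$. As for all $k \in \mathbb{N}$, $a_l \in U_l \subseteq U_k$ for all $l \ge k$, necessarily $a \in \overline{U_k}$ (Lemma (2.3.5)). Since this is true for all $k \in \mathbb{N}$, we have $a \in \bigcap_{k \in \mathbb{N}} \overline{U_k}$. But this means that for all $k \in \mathbb{N}$, $a \notin B_k$, so $a \notin \bigcup_{k \in \mathbb{N}} B_k$. On the other hand $a \in \overline{U_1} \subseteq U$, so $a \in U \setminus \bigcup_{k \in \mathbb{N}} B_k$. Therefore $U$ cannot be a subset of $\bigcup_{k \in \mathbb{N}} B_k$.

As this is true for all nonempty open sets $U \subseteq A$, the interior of $\bigcup_{k \in \mathbb{N}} B_k$ must be empty and hence $A$ is Baire.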

Lemma 2.5.18 Let A d(.,.) and complete, and B ⊆ A any subset. Then B is closed if and only if B is complete (considered as a metric space with the restriction of A’s metric to B × B).

Proof. Suppose B ⊆ A is closed. Let x : N → B be any Cauchy sequence in B. Then x is a Cauchy sequence in A (as B has the metric from A restricted to B × B), so because A is complete, there exists an a ∈ A such that limk→∞ xk = a. Hence by Lemma (2.3.5) a = limk→∞ xk ∈ $\overline{B}$ = B, as B is closed. So x converges in B. Hence B is complete.

Suppose B ⊆ A is complete. Let b ∈ $\overline{B}$, then by Lemma (2.3.5) (A, B 1C by Theorem (2.5.10)) there is a sequence x : N → B such that limk→∞ xk = b in A. By Lemma (2.5.12), x is a Cauchy sequence in A. Since xk ∈ B for all k ∈ N and the metric on B is the restriction of the metric on A, x is a Cauchy sequence in B. As B is complete, there exists a b′ ∈ B such that limk→∞ xk = b′. By Theorem (2.5.10), A, B T2 so (Lemma (2.2.6)) necessarily b = b′ ∈ B. Therefore $\overline{B}$ ⊆ B and hence B is closed.
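Lemma 2.5.18 and Example 2.5.13 can be illustrated numerically: Q is not closed in R, and the decimal truncations of √2 form a Cauchy sequence in Q without a limit in Q. The following Python sketch is only an illustration under stated assumptions: the helper names `sqrt2_truncation` and `is_cauchy_witness` are hypothetical, only the first twelve terms are inspected, and exact rational arithmetic stands in for Q.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(k: int) -> Fraction:
    """k-decimal truncation of sqrt(2), as an exact rational number."""
    scale = 10 ** k
    # isqrt(2 * scale**2) is the integer part of sqrt(2) * 10**k.
    return Fraction(isqrt(2 * scale * scale), scale)


# x_0 = 1, x_1 = 1.4, x_2 = 1.41, x_3 = 1.414, ...
xs = [sqrt2_truncation(k) for k in range(12)]


def is_cauchy_witness(k: int) -> bool:
    """Check d(x_l, x_m) < 10**(-k) for all computed l, m >= k."""
    tail = xs[k:]
    return all(abs(a - b) < Fraction(1, 10 ** k) for a in tail for b in tail)


# None of the computed terms squares to 2, in line with sqrt(2) not lying in Q.
squares_miss_two = all(x * x != 2 for x in xs)
```

The check covers finitely many terms only, so it illustrates the Cauchy estimate rather than proving it.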

Definition 2.5.19: Lipschitz continuity Let A, B be metric spaces and δ ∈ ]0, ∞[. Then we call a map f : A → B δ-Lipschitz continuous (denoted by f δ- c ) if ∀a1, a2 ∈ A : dB(f(a1), f(a2)) ≤ δ dA(a1, a2).

Lemma 2.5.20 Let A, B be metric spaces and f : A → B. If f is δ-Lipschitz continuous, then f is continuous.

Proof. Fix any a ∈ A and use Lemma (2.5.5). Let ε ∈ ]0, ∞[ be given, choose γ = ε/δ > 0, then for any a1 ∈ A satisfying dA(a, a1) < γ we have (by δ-Lipschitz continuity) dB(f(a), f(a1)) ≤ δ dA(a, a1) < δ γ = ε. Therefore limx→a f(x) = f(a) and f is continuous at a. Since this is true for all a ∈ A, f is continuous.

Theorem 2.5.21: Fixed point theorem Let A be a complete metric space and f : A → A. If f is δ-Lipschitz continuous for some δ ∈ ]0, 1[, then there exists a unique a ∈ A such that f(a) = a.

Proof. We follow [DK2004I]. Suppose the conditions of the theorem are satisfied.

Let a1 ∈ A be arbitrary and construct the sequence x : N → A by

\[ x_k := (\underbrace{f \circ f \circ \ldots \circ f}_{k \text{ times}})(a_1), \]

such that $x_{k+1} = f(x_k)$ for all k ∈ N. Then for k ∈ N, k ≥ 2 we have

\[ d(x_k, x_{k+1}) = d(f(x_{k-1}), f(x_k)) \le \delta\, d(x_{k-1}, x_k) \le \ldots \le \delta^{k-1} d(x_1, x_2). \]

So for k, l ∈ N,

\[ \begin{aligned}
d(x_k, x_{k+l}) &\le d(x_k, x_{k+1}) + d(x_{k+1}, x_{k+2}) + \ldots + d(x_{k+l-1}, x_{k+l}) \\
&\le \delta^{k-1} d(x_1, x_2) + \delta^{k} d(x_1, x_2) + \ldots + \delta^{k+l-2} d(x_1, x_2) \\
&= \delta^{k-1} (1 + \delta + \ldots + \delta^{l-1})\, d(x_1, x_2) \\
&\le \frac{\delta^{k-1}}{1 - \delta}\, d(x_1, x_2).
\end{aligned} \]

As δ ∈ ]0, 1[, this term can be made arbitrarily small when we increase k. Therefore the sequence x is Cauchy and because A is complete, this means that there exists an a ∈ A such that limk→∞ xk = a. By the chain rule for limits we have (use Lemma (2.5.20) for continuity of f)

\[ f(a) = f\Big( \lim_{k \to \infty} x_k \Big) = \lim_{k \to \infty} f(x_k) = \lim_{k \to \infty} x_{k+1} = a. \]

Therefore f(a) = a, so a is a fixed point of f. Suppose a1 ∈ A satisfies f(a1) = a1, then d(a, a1) = d(f(a), f(a1)) ≤ δ d(a, a1), so (1 − δ) d(a, a1) ≤ 0 and since 1 − δ > 0 and d(a′, b′) ≥ 0 for all a′, b′ ∈ A this implies that d(a, a1) = 0 and hence a = a1. Therefore a is unique.

Definition 2.5.22: Uniform continuity Let A, B be metric spaces. Then we call a map f : A → B uniformly continuous if

∀ε ∈ ]0, ∞[ : ∃δ ∈ ]0, ∞[ : ∀a1, a2 ∈ A : (dA(a1, a2) < δ → dB(f(a1), f(a2)) < ε).

Compare Definition (2.5.22) with a function f : A → B being continuous:

∀ε ∈ ]0, ∞[ : ∀a1 ∈ A : ∃δ ∈ ]0, ∞[ : ∀a2 ∈ A : (dA(a1, a2) < δ → dB(f(a1), f(a2)) < ε);

with uniform continuity we can pick a single δ for all a1, while for ordinary continuity, this δ may vary wildly as we vary a1.

Theorem 2.5.23: Uniform continuity Let A, B be metric spaces and f : A → B continuous. If A is compact, then f is uniformly continuous.

Proof. We follow [DK2004I]. Suppose f is not uniformly continuous. Then there exists an ε ∈ ]0, ∞[ such that for all δ ∈ ]0, ∞[ there exist a1, a2 ∈ A such that dA(a1, a2) < δ, while dB(f(a1), f(a2)) ≥ ε. In particular for each δ = 1/k, k ∈ N there exist ak, a′k ∈ A such that dA(ak, a′k) < 1/k and dB(f(ak), f(a′k)) ≥ ε. Because A is first countable (by Theorem (2.5.10)) and compact, by Lemma (2.4.6) the sequences k ↦ ak and k ↦ a′k have convergent subsequences x and x′ with limits a, a′ ∈ A respectively. Because dA(ak, a′k) < 1/k for all k ∈ N we see that necessarily a = a′ and hence, because f is continuous,

\[ \lim_{k \to \infty} f(x_k) = \lim_{k \to \infty} f(x'_k) = f(a). \]

Hence (Lemma (2.5.9)) limk→∞ dB(f(xk), f(x′k)) = dB(f(a), f(a)) = 0, contradicting the fact that dB(f(ak), f(a′k)) ≥ ε for all k ∈ N. So we reach a contradiction: f must be uniformly continuous.
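The iteration used in the proof of Theorem 2.5.21 can be sketched in Python. This is a minimal sketch under assumptions that are not part of the text: the function name `fixed_point`, the tolerance and the iteration cap are hypothetical, the metric is the absolute value on R, and the sample map is cos on the closed (hence complete, by Lemma 2.5.18) interval [0, 1], where it is δ-Lipschitz with δ = sin(1) < 1 by the mean value theorem. The stopping rule uses the bound d(x_{k+1}, a) ≤ δ/(1 − δ) · d(x_k, x_{k+1}), which follows from the same geometric series estimate as in the proof.

```python
import math


def fixed_point(f, a1, delta, tol=1e-12, max_iter=10_000):
    """Iterate x_{k+1} = f(x_k), starting from x_1 = f(a1).

    f is assumed to be delta-Lipschitz with 0 < delta < 1 on a complete
    subset of R that it maps into itself.
    """
    x = f(a1)
    for _ in range(max_iter):
        x_next = f(x)
        # A posteriori bound: d(x_next, a) <= delta / (1 - delta) * d(x, x_next).
        if delta / (1 - delta) * abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("tolerance not reached within max_iter iterations")


# cos maps [0, 1] into itself and |cos'| = |sin| <= sin(1) < 1 there.
a = fixed_point(math.cos, 0.5, delta=math.sin(1.0))
```

Any other starting point in [0, 1] leads to the same value up to the tolerance, in line with the uniqueness statement of the theorem.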

Chapter 3

Algebra

Algebra deals with the quantitative study of counting and symmetries.

3.1 Groups

Definition 3.1.1: Group ( G ) Let A be a set. A group structure on A is an element e ∈ A, called the identity, together with two maps

\[ A \times A \to A : (a_1, a_2) \mapsto a_1 a_2, \qquad A \to A : a_1 \mapsto a_1^{-1}, \]

called multiplication and inversion respectively, that satisfy for all a1, a2, a3 ∈ A,

• e a1 = a1 e = a1,

• (a1 a2) a3 = a1 (a2 a3),

• a1 a1⁻¹ = a1⁻¹ a1 = e.

We call a group structure on A Abelian or commutative if for all a1, a2 ∈ A we have a1 a2 = a2 a1. In this case we usually denote e by 0 (called zero), a1 a2 by a1 + a2 (called addition), a1⁻¹ by −a1 (called negation), and a1 + (−a2) by a1 − a2. A group A (denoted by A G ) is a set A together with a group structure. With all objects that have algebraic properties we will use the notation B ≤ A to indicate that B ⊆ A is a subset and that the restriction of all algebraic maps (addition, multiplication, . . . ) to B makes B an algebraic object of the same type as A.

Example 3.1.2 Let A = R\{0} considered as an Abelian group with multiplication and division on R. Then B := {−1, +1} ≤ A since B is a group with respect to (the restrictions of) multiplication and division on R to B.


Lemma 3.1.3 Let A G . Then the identity and inverses of elements of A are unique, and for all a1, a2 ∈ A we have (a1 a2)⁻¹ = a2⁻¹ a1⁻¹.

Proof. Suppose there exists an a ∈ A such that for all a1 ∈ A we have a a1 = a1 a = a1. Then in particular e = a e = a, so a = e. Therefore the identity of A is unique. Let a ∈ A be arbitrary and suppose a1, a2 ∈ A satisfy a a1 = a1 a = e and a a2 = a2 a = e. Then a1 = e a1 = (a2 a) a1 = a2 (a a1) = a2 e = a2, so a1 = a2 and the inverse of a is unique. We have that (a1 a2)(a2⁻¹ a1⁻¹) = a1 (a2 a2⁻¹) a1⁻¹ = a1 e a1⁻¹ = a1 a1⁻¹ = e and similarly (a2⁻¹ a1⁻¹)(a1 a2) = e, so by uniqueness of inverses (a1 a2)⁻¹ = a2⁻¹ a1⁻¹.

Example 3.1.4: Group of permutations Let k ∈ N, then the group of permutations of {1, . . . , k} is defined to be the set Sk := {π : {1, . . . , k} → {1, . . . , k} | π bijective} together with identity id{1,...,k}, multiplication (π1, π2) ↦ π1 π2 := π1 ◦ π2, and inversion π ↦ π⁻¹ (the inverse function associated with the bijective function π).

Definition 3.1.5: Morphisms of groups Let A, B be groups. Then all maps f : A → B that satisfy for all a1, a2 ∈ A,

f(a1 a2) = f(a1) f(a2)

are group morphisms between A and B (denoted by f G -morphism). For all group morphisms f, the kernel of f is defined as

ker f := {a ∈ A | f(a) = eB}.

The identity morphism of A is the map

idA : A → A : a 7→ a.
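Definition 3.1.1, Lemma 3.1.3 and Definition 3.1.5 can be checked by brute force on the permutation group of Example 3.1.4 for k = 3. The following Python sketch makes some choices that are not in the text: permutations act on {0, . . . , k−1} instead of {1, . . . , k} and are encoded as tuples, the helper names `mul`, `inv` and `sign` are hypothetical, and the sign map into the group {−1, +1} of Example 3.1.2 is used as a sample morphism.

```python
from itertools import permutations

k = 3
# A tuple p encodes the bijection i -> p[i] of {0, ..., k-1}.
S = list(permutations(range(k)))
e = tuple(range(k))


def mul(p, q):
    """Composition (p q)(i) = p(q(i)), as in pi1 pi2 := pi1 o pi2."""
    return tuple(p[q[i]] for i in range(k))


def inv(p):
    """Inverse bijection of p."""
    out = [0] * k
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def sign(p):
    """Sign of p via its number of inversions; a map S_k -> {-1, +1}."""
    inversions = sum(1 for i in range(k) for j in range(i + 1, k) if p[i] > p[j])
    return -1 if inversions % 2 else 1


group_axioms_hold = (
    all(mul(e, p) == p == mul(p, e) for p in S)
    and all(mul(mul(p, q), r) == mul(p, mul(q, r)) for p in S for q in S for r in S)
    and all(mul(p, inv(p)) == e == mul(inv(p), p) for p in S)
)
# Lemma 3.1.3: (a1 a2)^-1 = a2^-1 a1^-1.
inverse_of_product = all(inv(mul(p, q)) == mul(inv(q), inv(p)) for p in S for q in S)
# Definition 3.1.5: sign(p q) = sign(p) sign(q), with kernel the even permutations.
sign_is_morphism = all(sign(mul(p, q)) == sign(p) * sign(q) for p in S for q in S)
kernel = [p for p in S if sign(p) == 1]
is_abelian = all(mul(p, q) == mul(q, p) for p in S for q in S)
```

For k = 3 the group has six elements and is not Abelian, so the order of the factors in (a1 a2)⁻¹ = a2⁻¹ a1⁻¹ matters.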

Lemma 3.1.6 Let A, B be groups and f : A → B a G -morphism. Then

• f(eA) = eB,

• for all a ∈ A we have f(a⁻¹) = f(a)⁻¹,

• {eA} ≤ ker f ≤ A and ker f = {eA} if and only if f is injective,

• f is a G -isomorphism if and only if f is bijective.

Proof. • Let b ∈ f(A) ⊆ B (f(A) ⊇ f({eA}) ≠ ∅) then there exists an a ∈ A such that f(a) = b. Now f(eA) b = f(eA) f(a) = f(eA a) = f(a) = b, so f(eA) = f(eA) eB = f(eA)(b b⁻¹) = (f(eA) b) b⁻¹ = b b⁻¹ = eB, which shows the first statement.

• Note that f(a⁻¹) f(a) = f(a⁻¹ a) = f(eA) = eB and similarly f(a) f(a⁻¹) = eB, so by uniqueness f(a⁻¹) = f(a)⁻¹.

• Since f(eA) = eB, eA ∈ ker f and we see immediately that {eA} ≤ A. Suppose a1, a2 ∈ ker f, then f(a1 a2) = f(a1) f(a2) = eB eB = eB, so a1 a2 ∈ ker f, hence multiplication restricts to ker f × ker f → ker f. Suppose a1 ∈ ker f, then f(a1⁻¹) = f(a1)⁻¹ = eB⁻¹ = eB, so inversion restricts to ker f → ker f. Therefore ker f ≤ A and we obtain {eA} ≤ ker f ≤ A.

Suppose f is injective. Let a1 ∈ ker f, then f(a1) = eB = f(eA), so a1 = eA. Therefore ker f ⊆ {eA} and hence ker f = {eA}.

Suppose ker f = {eA}. Let a1, a2 ∈ A with f(a1) = f(a2), then eB = f(a1)⁻¹ f(a1) = f(a2)⁻¹ f(a1) = f(a2⁻¹ a1), so a2⁻¹ a1 ∈ ker f = {eA}, hence a2⁻¹ a1 = eA, so a1 = a2 a2⁻¹ a1 = a2 eA = a2. This makes f injective.

• Suppose f is an isomorphism, then there exists a morphism of groups g : B → A such that f ◦ g = idB, g ◦ f = idA, so f is a bijection with inverse g. Suppose conversely that f is a bijection, let g : B → A denote the inverse of f, then f ◦ g = idB and g ◦ f = idA. We need to check that g is a G -morphism. Let b1, b2 ∈ B, then g(b1 b2) = g(f(g(b1)) f(g(b2))) = g(f(g(b1) g(b2))) = g(b1) g(b2) because f is a G -morphism, so g is a G -morphism and therefore f is a G -isomorphism.

Definition 3.1.7: Normal subgroup Let A be a group. Then any subset B ⊆ A is called a normal subgroup of A if B ≤ A and for all b ∈ B and a ∈ A we have that a b a⁻¹ ∈ B.

Definition 3.1.8: Quotient group Let A be a group, B ≤ A a normal subgroup. The quotient group A/B is defined as the set of equivalence classes

A/B := {[a] | a ∈ A},

with [a] := a B := {a b ∈ A | b ∈ B}, together with identity [e] = B, multiplication [a1][a2] := [a1 a2], and inversion [a1]⁻¹ := [a1⁻¹]. That this is a group can be verified using Definition (3.1.7). First note that for any a ∈ A, b ∈ B we have [a] = [a b] = [b a], since for any b1 ∈ B, b a b1 = a ((a⁻¹ b a) b1) ∈ [a]. Suppose [a1] = [a3] and [a2] = [a4], then [a1 a2] = [a3 a4]: as a3 ∈ [a1] there is a b3 ∈ B such that a3 = a1 b3 and similarly a4 = a2 b4 for some b4 ∈ B, hence [a3 a4] = [(a1 b3 a2) b4] = [a1 b3 a2] = [a1 a2 (a2⁻¹ b3 a2)] = [a1 a2]. This makes multiplication well-defined. For inversion note that [a1] = [a3] iff for some b3 ∈ B, a3 = a1 b3 iff a1⁻¹ a3 ∈ B iff a3⁻¹ (a1⁻¹)⁻¹ ∈ B iff [a1⁻¹] = [a3⁻¹], so inversion is well-defined. Now all the desired group properties follow from pulling them inside the brackets [...].

Lemma 3.1.9: Factorisation Let A, B be groups. Then for any G -morphism f : A → B,


• ker f ≤ A is a normal subgroup,

• {eB} ≤ f(A) ≤ B and f(A) = B if and only if f is surjective,

• there exists a unique, injective G -morphism g : A/ ker f → B such that f(a) = g([a]) for all a ∈ A,

• f(A) ≃ A/ ker f are G -isomorphic.

Conversely, for any normal subgroup C ≤ A there exist a group D and G -morphism f : A → D such that ker f = C.

Proof. Let f : A → B be a G -morphism.

• By Lemma (3.1.6) we know that ker f ≤ A. Let a ∈ A, b ∈ ker f, then f(a b a⁻¹) = f(a) f(b) f(a)⁻¹ = f(a) eB f(a)⁻¹ = f(a) f(a)⁻¹ = eB, so a b a⁻¹ ∈ ker f. Hence ker f is a normal subgroup.

• Let b1, b2 ∈ f(A), then b1 = f(a1), b2 = f(a2) for a1, a2 ∈ A, hence b1 b2 = f(a1) f(a2) = f(a1 a2) ∈ f(A). Furthermore by Lemma (3.1.6) b1⁻¹ = f(a1)⁻¹ = f(a1⁻¹) ∈ f(A) and eB = f(eA) ∈ f(A). Hence {eB} ≤ f(A) ≤ B. By definition f(A) = B if and only if f is surjective.

• Choose g : A/ ker f → B, g([a]) := f(a). Then g is well-defined: if [a1] = [a2], then a1 a2⁻¹ ∈ ker f, so by Lemma (3.1.6) eB = f(a1 a2⁻¹) = f(a1) f(a2)⁻¹ = g([a1]) g([a2])⁻¹, so g([a1]) = g([a2]). Also g([a1][a2]) = g([a1 a2]) = f(a1 a2) = f(a1) f(a2) = g([a1]) g([a2]), so g is a G -morphism. It is clear that g is uniquely determined by definition. Suppose that g([a1]) = eB, then f(a1) = eB, so a1 ∈ ker f and hence [a1] = [eA], so ker g = {[eA]} and by Lemma (3.1.6) g is injective.

• As g is injective and g([a]) = f(a), g(A/ ker f) = f(A), so A/ ker f → f(A) : [a] ↦ g([a]) is bijective. Hence by Lemma (3.1.6), it is a G -isomorphism.

Let C ≤ A be a normal subgroup. Choose D := A/C and f : A → D, f(a) := [a]. Then f(a) = [eA] iff a ∈ C, so ker f = C.

Definition 3.1.10: Group action Let A be a group, B a set. Then an action f of A on B is a map f : A × B → B : (a, b) ↦ a · b such that for all a1, a2 ∈ A, b ∈ B we have

a1 · (a2 · b) = (a1 a2) · b, eA · b = b. For any b ∈ B we define the orbit of b to be

A · b := {a · b ∈ B | a ∈ A} ⊆ B and the stabiliser of b to be

Ab := {a ∈ A | a · b = b} ⊆ A.

We can check directly that defining b1 ∼ b2 if and only if A · b1 = A · b2 is an equivalence relation, which shows that the orbits of a group action partition the set on which the group acts.


Theorem 3.1.11: Orbit-stabiliser theorem Let A G , B a set. Let f be an action of A on B, then for all b ∈ B there is a bijection

A/Ab ≃ A · b.

Proof. Let b ∈ B. First note that Ab ≤ A: eA ∈ Ab, and if a1, a2 ∈ Ab then (a1 a2) · b = a1 · (a2 · b) = b and a1⁻¹ · b = a1⁻¹ · (a1 · b) = (a1⁻¹ a1) · b = b. In general Ab is not a normal subgroup, so here A/Ab denotes only the set of classes [a] := a Ab, which need not carry a group structure. Consider the map g : A → A · b defined by g(a) := a · b, which is surjective by definition of A · b. For a1, a2 ∈ A we have g(a1) = g(a2) if and only if a1 · b = a2 · b if and only if (a2⁻¹ a1) · b = b if and only if a2⁻¹ a1 ∈ Ab if and only if [a1] = [a2]. Therefore the induced map A/Ab → A · b : [a] ↦ a · b is well-defined and injective, and it is surjective because g is. Hence it is a bijection.
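Theorem 3.1.11 can be verified by enumeration in a small case. The Python sketch below is an illustration under assumptions that are not in the text: the group is the permutation group of {0, 1, 2, 3} encoded as tuples, it acts on subsets by p · b := {p(i) | i ∈ b}, and the names `act`, `cosets` and `images` are hypothetical.

```python
from itertools import permutations

# The group A: all permutations of {0, 1, 2, 3}; p encodes i -> p[i].
A = list(permutations(range(4)))


def mul(p, q):
    """Composition (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(4))


def act(p, b):
    """Action of A on subsets of {0, 1, 2, 3}: p . b = {p(i) | i in b}."""
    return frozenset(p[i] for i in b)


b = frozenset({0, 1})
orbit = {act(p, b) for p in A}                 # A . b
stabiliser = [p for p in A if act(p, b) == b]  # A_b

# The classes [a] = a A_b, each stored as a frozenset of group elements.
cosets = {frozenset(mul(a, s) for s in stabiliser) for a in A}

# The map [a] -> a . b: collect the values a . b over each class.
images = {coset: {act(a, b) for a in coset} for coset in cosets}
well_defined = all(len(values) == 1 for values in images.values())
reached = {next(iter(values)) for values in images.values()}
bijective = well_defined and len(cosets) == len(orbit) and reached == orbit
```

Here the orbit consists of the six two-element subsets and the stabiliser has four elements, so the number of classes 24/4 = 6 matches the size of the orbit.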

3.2 Rings

Definition 3.2.1: Ring Let A be a set. A ring structure on A consists of an Abelian group structure on A (so 0 ∈ A, addition (a1, a2) 7→ a1+a2, negation a1 7→ −a1), together with an element 1 ∈ A (called one) and a map

A × A → A :(a1, a2) 7→ a1 a2

(called multiplication), that satisfy for all a1, a2, a3 ∈ A that

• a1 1 = 1 a1 = a1,

• (a1 a2) a3 = a1 (a2 a3),

• a1 (a2 + a3) = (a1 a2) + (a1 a3),

• (a1 + a2) a3 = (a1 a3) + (a2 a3).

We call any a ∈ A invertible with respect to this ring structure if there exists an a⁻¹ ∈ A such that a a⁻¹ = a⁻¹ a = 1. The collection of all invertible elements of A is denoted by A∗ := {a ∈ A | a invertible}.

We say that this ring structure is Abelian or commutative if for all a1, a2 ∈ A we have a1 a2 = a2 a1. A ring A (denoted by A R ) is a set A together with a ring structure.

Definition 3.2.2: Morphisms of rings Let A, B be rings. Then all maps f : A → B that are group morphisms of the Abelian group structures on A and B and satisfy for all a1, a2 ∈ A that

f(a1 a2) = f(a1) f(a2), f(1A) = 1B,

are ring morphisms between A and B (denoted by f R -morphism). The identity morphism of A is the map

idA : A → A : a 7→ a.


So a map f : A → B between two rings is a ring morphism if and only if for all a1, a2 ∈ A we have f(a1 + a2) = f(a1) + f(a2), f(a1 a2) = f(a1) f(a2), and f(1A) = 1B.

Lemma 3.2.3: Solving equations in a ring Let A R .

• For all a ∈ A, 0 a = a 0 = 0.

• For all a ∈ A we have −a = (−1) a.

• 1 ∈ A∗.

• 0 ∈ A∗ if and only if 0 = 1 if and only if A = A∗ = {0}.

• 0 is uniquely determined and for all a1, a2 ∈ A, −(a1 + a2) = −a1 − a2 and −a1 is unique.

• 1 is uniquely determined, inverses of invertible elements are unique, (a1 a2)⁻¹ = a2⁻¹ a1⁻¹.

• If a1 ∈ A∗, a1 a2 = a1 a3 if and only if a2 = a3 if and only if a2 a1 = a3 a1.

• a1 + a2 = a1 + a3 if and only if a2 = a3 if and only if a2 + a1 = a3 + a1.

• Suppose A∗ = A \ {0}. If a1, a2 ∈ A with a1 a2 = 0, then a1 = 0 or a2 = 0.

Proof. • Consider 0 a = (0 + 0) a = (0 a) + (0 a), so 0 = (0 a) − (0 a) = ((0 a) + (0 a)) − (0 a) = (0 a) + ((0 a) − (0 a)) = (0 a) + 0 = (0 a). Similarly a 0 = 0.

• By using 0 a = 0 we find a + ((−1) a) = (1 a) + ((−1) a) = (1 + (−1)) a = 0 a = 0 and similarly ((−1) a) + a = 0, so by Lemma (3.1.3), (−1) a = −a.

• As 1 1 = 1 1 = 1, A ∋ 1 = 1⁻¹ exists and therefore 1 ∈ A∗ is invertible.

• Suppose 0 ∈ A∗, then 1 = 0 0⁻¹ = 0. Suppose 0 = 1, then for any a ∈ A, a = 1 a = 0 a = 0 and 0 = 1 ∈ A∗, so A = A∗ = {0}. Suppose A = A∗ = {0}, then 0 ∈ A∗ directly.

• The fact that 0 and negations are uniquely determined follows directly from Lemma (3.1.3).

• The set A∗ together with 1 ∈ A∗, (a1, a2) ↦ a1 a2, and a1 ↦ a1⁻¹ is a group. In particular, 1 is unique, all inverses of invertible elements are unique and (a1 a2)⁻¹ = a2⁻¹ a1⁻¹ as follows from Lemma (3.1.3).

• Let a1 ∈ A∗ and a2, a3 ∈ A. Suppose a1 a2 = a1 a3, then a2 = 1 a2 = (a1⁻¹ a1) a2 = a1⁻¹ (a1 a2) = a1⁻¹ (a1 a3) = (a1⁻¹ a1) a3 = 1 a3 = a3. Similarly if a2 a1 = a3 a1 then a2 = a3. The converse is immediate from multiplying by a1 on the left or the right of a2 = a3.

• Results for addition follow in exactly the same way as for multiplication.

• Suppose A∗ = A \ {0} and a1, a2 ∈ A, a1 a2 = 0. Suppose a1 ≠ 0, then a1 ∈ A∗, so a2 = a1⁻¹ a1 a2 = a1⁻¹ 0 = 0. Similarly if a2 ≠ 0, then a1 = 0. Hence a1 = 0 or a2 = 0.

Definition 3.2.4: Ideal Let A R . Then we call a set B an ideal of A if B ⊆ A, 0 ∈ B, and for all a1 ∈ A, b1, b2 ∈ B we have a1 b1 − b2 ∈ B and b1 a1 − b2 ∈ B. For any ideal B ⊆ A we have that B ≠ A if and only if B ∩ A∗ = ∅ (if there exists a b ∈ B ∩ A∗, then 1 = b⁻¹ b − 0 ∈ B, so a = a 1 − 0 ∈ B for all a ∈ A, so A = B).

Definition 3.2.5: Quotient ring Let A be a ring, and B an ideal of A. Then the quotient ring A/B is defined as the set of equivalence classes

A/B := {[a] ⊆ A | a ∈ A},

with [a] := {a1 ∈ A | a1 − a ∈ B}, together with 0A/B := [0A], 1A/B := [1A], [a1] + [a2] := [a1 + a2], [a1][a2] := [a1 a2]. That this indeed is a ring is easily verified from the fact that B is an ideal. Suppose [a1] = [a3] and [a2] = [a4], then [a1 + a2] = [a3 + a4] (a3 ∈ [a3] = [a1], so a3 = a1 + b3 for some b3 ∈ B, similarly a4 = a2 + b4, so a5 ∈ [a1 + a2] iff B ∋ a5 − (a1 + a2) = a5 − ((a3 − b3) + (a4 − b4)) = (a5 − (a3 + a4)) + b3 + b4 iff a5 − (a3 + a4) ∈ B iff a5 ∈ [a3 + a4]), so addition is well-defined. Similarly a5 ∈ [a1 a2] iff B ∋ a5 − a1 a2 = a5 − (a3 − b3)(a4 − b4) = (a5 − a3 a4) + a3 b4 + b3 a4 − b3 b4 iff a5 − a3 a4 ∈ B (as B is an ideal: a3 b4, b3 a4, b3 b4 ∈ B since b3, b4 ∈ B) iff a5 ∈ [a3 a4], so [a1 a2] = [a3 a4] if [a1] = [a3], [a2] = [a4], which makes multiplication well-defined. All the requirements from Definition (3.2.1) are now satisfied by pulling every expression inside [...] and using the fact that they are satisfied by A.

Definition 3.2.6: Field Let A be a ring. Then we call A a field (denoted by A F ) if the ring structure on A is commutative and has A∗ = A \ {0}. By Lemma (3.2.3) we see that A∗ = A \ {0} implies that 0 ≠ 1.
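Definitions 3.2.5 and 3.2.6 can be illustrated with the quotient rings Z/nZ, where the ideal is the set of multiples of n. The Python sketch below rests on assumptions that are not in the text: classes are represented by the remainders 0, . . . , n−1, and the helper names `units`, `is_field`, `zero_divisor_pairs` and `well_defined` are hypothetical.

```python
def units(n):
    """Invertible classes of Z/nZ, represented by remainders 0..n-1."""
    # 1 % n is the representative of [1]; for n = 1 this is 0, so [0] = [1].
    return [a for a in range(n) if any((a * b) % n == 1 % n for b in range(n))]


def is_field(n):
    """Z/nZ is a field iff its units are exactly the nonzero classes."""
    return units(n) == list(range(1, n))


def zero_divisor_pairs(n):
    """Pairs of nonzero classes whose product is the zero class."""
    return [(a, b) for a in range(1, n) for b in range(1, n) if (a * b) % n == 0]


def well_defined(n, shifts=(-2, -1, 1, 3)):
    """Sum and product classes do not depend on the chosen representatives."""
    return all(
        ((a + s * n) + (b + t * n)) % n == (a + b) % n
        and ((a + s * n) * (b + t * n)) % n == (a * b) % n
        for a in range(n) for b in range(n) for s in shifts for t in shifts
    )
```

For instance, `is_field(5)` holds while `zero_divisor_pairs(6)` contains (2, 3), in line with the last statement of Lemma 3.2.3; `units(1)` returns `[0]`, the degenerate case 0 = 1 of the same lemma.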

3.3 Modules

Definition 3.3.1: Module Let A be a ring and B a set. Then an A-module structure on B consists of an Abelian group structure on B (so 0B ∈ B, (b1, b2) ↦ b1 +B b2, b1 ↦ −b1), together with a map

A × B → B :(a, b) 7→ a b.

(called scalar multiplication), satisfying for all a1, a2 ∈ A and b1, b2 ∈ B that


• (a1 +A a2) b1 = (a1 b1) +B (a2 b1),

• a1 (b1 +B b2) = (a1 b1) +B (a1 b2),

• (a1 a2) b1 = a1 (a2 b1),

• 1A b1 = b1.

An A-module B (denoted by B M /A) is a set B together with an A-module structure.

In exactly the same way as for rings (Lemma (3.2.3)) we can show that for all b ∈ B we have −b = (−1A) b and 0A b = 0B.

Example 3.3.2: R ⇒ M Let A be a ring, then A is an A-module with respect to its own addition and multiplication as module operations.

Definition 3.3.3: A-linearity ( l /A) Let A be a ring and B, C be A-modules. Then a map f : B → C is called A-linear (denoted by f l /A) if for all a1 ∈ A and b1, b2 ∈ B we have

f(a1 b1 +B b2) = a1 f(b1) +C f(b2).

For an A-linear f : B → C we define the kernel of f to be

ker f := {b ∈ B | f(b) = 0C}.

As with Lemma (3.1.6) we have (by considering f as a G -morphism between the Abelian group structures of B and C) that f is injective if and only if ker f = {0B}. From this lemma we also see that for any A-linear f : B → C, {0B} ≤ ker f ≤ B and {0C} ≤ f(B) ≤ C.

Definition 3.3.4: Morphisms of A-modules Let A be a ring and B, C be A-modules. Then all A-linear maps f : B → C are A-module morphisms between B and C. The identity morphism of B is the map

idB : B → B : b ↦ b.

Looking at Lemma (3.1.6) we also see that any A-linear f : B → C that is bijective, is in fact an M -isomorphism.

Definition 3.3.5: Multilinearity (k- l ) Let k ∈ N, A a ring, and B1, . . . , Bk, C be A-modules. Then we say that a map f : B1 × ... × Bk → C is k-multilinear over A (denoted by f k- l /A) if for all b1 ∈ B1, . . . , bk ∈ Bk and 1 ≤ l ≤ k the map

Bl → C : b ↦ f(b1, . . . , bl−1, b, bl+1, . . . , bk) is A-linear.


Definition 3.3.6: Quotient module Let $A$ be a ring, $B$ an $A$-module, $C \le B$. Then the quotient module $B/C$ is defined as the set of equivalence classes

\[ B/C := \{ [b] \subseteq B \mid b \in B \}, \]

where $[b] := \{ b_1 \in B \mid b_1 - b \in C \}$, together with $0_{B/C} := [0_B]$, $[b_1] + [b_2] := [b_1 + b_2]$, $a [b_1] := [a\, b_1]$. That $B/C$ is indeed an $A$-module is verified in the same way as for Definition (3.2.5) (gives well-definedness of addition), together with the fact that if $[b_1] = [b_2]$, then $b_2 = b_1 + c_2$ for some $c_2 \in C$, so $b_3 \in [a\, b_1]$ iff $C \ni b_3 - a\, b_1 = b_3 - a (b_2 - c_2) = (b_3 - a\, b_2) + a\, c_2$ iff $b_3 - a\, b_2 \in C$ (as $a\, c_2 \in C$ because $C \le B$), so $[a\, b_1] = [a\, b_2]$ and scalar multiplication is well-defined. Again all requirements of Definition (3.3.1) follow from pulling them inside [...].

Definition 3.3.7: Direct product Let $A$ be a ring, $\{B_i \mid i \in I\}$ with $B_i$ an $A$-module for all $i \in I$. Then the direct product of $\{B_i \mid i \in I\}$ is defined as the set

\[ \prod_{i \in I} B_i = \Big\{ g : I \to \bigcup_{i \in I} B_i \ \Big|\ \forall i \in I : g(i) \in B_i \Big\}, \]

together with zero $i \mapsto 0_{B_i}$, addition $(g_1 + g_2)(i) := g_1(i) +_{B_i} g_2(i)$, and scalar multiplication $(a\, g)(i) := a \cdot_{B_i} g(i)$. Note that $\prod_{i \in I} B_i$ is an $A$-module.

Definition 3.3.8: Direct sum Let $A$ be a ring, $\{B_i \mid i \in I\}$ with $B_i$ an $A$-module for all $i \in I$. Then the direct sum of $\{B_i \mid i \in I\}$ is defined as the set

\[ \bigoplus_{i \in I} B_i := \Big\{ g \in \prod_{i \in I} B_i \ \Big|\ \{ i \in I \mid g(i) \neq 0_{B_i} \} \text{ is finite} \Big\}, \]

together with zero $i \mapsto 0_{B_i}$, addition $(g_1 + g_2)(i) := g_1(i) +_{B_i} g_2(i)$, and scalar multiplication $(a\, g)(i) := a \cdot_{B_i} g(i)$. Note that $\bigoplus_{i \in I} B_i$ is an $A$-module.

Lemma 3.3.9: Direct product and sum properties Let $A$ be a ring, $\{B_i \mid i \in I\}$ with $B_i$ an $A$-module for all $i \in I$.

• We always have that

\[ \bigoplus_{i \in I} B_i \le \prod_{i \in I} B_i. \]

• If $I = \{i_1, \ldots, i_k\}$ is finite, then as $A$-modules

\[ \bigoplus_{i \in I} B_i \simeq \prod_{i \in I} B_i \simeq B_{i_1} \times \ldots \times B_{i_k}, \]

with zero $(0, \ldots, 0)$, addition $(b_1, \ldots, b_k) + (b'_1, \ldots, b'_k) = (b_1 + b'_1, \ldots, b_k + b'_k)$, and scalar multiplication $a (b_1, \ldots, b_k) = (a\, b_1, \ldots, a\, b_k)$. In this case we write $B_{i_1} \oplus \ldots \oplus B_{i_k} := \bigoplus_{i \in I} B_i$.


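• Let for each $i \in I$,

\[ g_i : \prod_{j \in I} B_j \to B_i : g \mapsto g(i). \tag{3.1} \]

Then for any $A$-module $B$ and $\{f_i : B \to B_i\}$, $f_i$ $A$-linear for all $i \in I$, there exists a unique $A$-linear $f : B \to \prod_{i \in I} B_i$ such that for all $i \in I$ we have that $f_i = g_i \circ f$.

• Let for each $i \in I$,

\[ g_i : B_i \to \bigoplus_{j \in I} B_j : b \mapsto \Big( j \mapsto \begin{cases} b & i = j \\ 0_{B_j} & i \neq j \end{cases} \Big). \tag{3.2} \]

Then for any $A$-module $B$ and $\{f_i : B_i \to B\}$, $f_i$ $A$-linear for all $i \in I$, there exists a unique $A$-linear $f : \bigoplus_{i \in I} B_i \to B$ such that for all $i \in I$ we have that $f_i = f \circ g_i$.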

Proof. • This is clear from the definition.

• If $I$ is finite, then the set of $i \in I$ for which $g(i) \neq 0_{B_i}$ is always finite for any $g \in \prod_{i \in I} B_i$. Hence $\prod_{i \in I} B_i = \bigoplus_{i \in I} B_i$. We can identify this set with $B_{i_1} \times \ldots \times B_{i_k}$ using the maps $(b_1, \ldots, b_k) \mapsto (I \to \bigcup_{i \in I} B_i : i \mapsto b_l$ for which $i = i_l)$ and $g \mapsto (g(i_1), \ldots, g(i_k))$.

• Suppose $f : B \to \prod_{i \in I} B_i$ is $A$-linear and satisfies $f_i = g_i \circ f$ for all $i \in I$. Then for $b \in B$ and $i \in I$ we have $f(b)(i) = g_i(f(b)) = f_i(b)$, which uniquely determines $f(b) \in \prod_{i \in I} B_i$. Now define $f$ for $b \in B$, $i \in I$ by $f(b)(i) := f_i(b)$, then $f_i(b) = f(b)(i) = g_i(f(b))$, so $f_i = g_i \circ f$. Furthermore, $f$ is $A$-linear as all $f_i$ are $A$-linear. Hence $f$ is the desired unique map.

• Let $g \in \bigoplus_{i \in I} B_i$, then there exist finitely many $i_1, \ldots, i_k \in I$ such that $g(i_l) \neq 0_{B_{i_l}}$ for $1 \le l \le k$ and $g(i) = 0_{B_i}$ for all other $i \in I$. Define $f(g) := f_{i_1}(g(i_1)) + \ldots + f_{i_k}(g(i_k))$ to obtain a map $f : \bigoplus_{i \in I} B_i \to B$. Since all $f_i$ are $A$-linear we see that $f$ is $A$-linear by definition. Furthermore, for $i \in I$, $b \in B_i$ we have by definition of $f$ and $g_i$ that $f(g_i(b)) = f_i(g_i(b)(i)) = f_i(b)$, hence $f_i = f \circ g_i$ as desired. Uniqueness of $f$ is also apparent: for a given $g \in \bigoplus_{i \in I} B_i$ we have for the $i_1, \ldots, i_k \in I$ where $g(i_l) \neq 0$ that $g = g_{i_1}(g(i_1)) + \ldots + g_{i_k}(g(i_k))$, so $f(g) = f(g_{i_1}(g(i_1))) + \ldots + f(g_{i_k}(g(i_k))) = f_{i_1}(g(i_1)) + \ldots + f_{i_k}(g(i_k))$, which uniquely determines $f(g)$.

By Lemma (3.3.9) we see that finite direct products or sums cannot be distinguished. We will often abbreviate

\[ B^k := \underbrace{B \oplus \ldots \oplus B}_{k} \simeq \underbrace{B \times \ldots \times B}_{k}. \]

Lemma 3.3.10: Factorisation Let $A$ be a ring and $B$, $C$ be $A$-modules. Then for any $A$-linear $f : B \to C$ there exists a unique $A$-linear $g : B/\ker f \to C$ such that $f(b) = g([b])$ for all $b \in B$. Furthermore, $\ker g = \{[0]\}$ and

B/ ker f ' f(B) ≤ C.


Proof. Let an A-linear f : B → C be given. Define g : B/ ker f → C by g([b]) := f(b) for all b ∈ B. Then this map is well-defined, for if [b1] = [b2] then b1 − b2 ∈ ker f, so g([b1]) − g([b2]) = f(b1) − f(b2) = f(b1 − b2) = 0, so g([b1]) = g([b2]). It is also linear because f is linear, and unique by definition. As g is A-linear, [0] ∈ ker g. Let [b] ∈ ker g be arbitrary, then g([b]) = f(b) = 0, so b ∈ ker f, but then [b] = [0], hence ker g = {[0]}. Therefore g is injective, and from f(b) = g([b]) we see that g(B/ ker f) = f(B), so B/ ker f → f(B) : [b] ↦ g([b]) is an M -isomorphism.
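Lemma 3.3.10 can be checked by enumeration on a small module. The Python sketch below is an illustration under assumptions that are not in the text: the ring is Z, the module is B = C = Z/12Z represented by remainders, the linear map is multiplication by 3, and the names `cls`, `quotient` and `g_values` are hypothetical.

```python
n, c = 12, 3
B = range(n)  # Z/12Z as a Z-module, classes represented by 0..11


def f(b):
    """The Z-linear map Z/12Z -> Z/12Z : b -> 3 b."""
    return (c * b) % n


kernel = frozenset(b for b in B if f(b) == 0)
image = {f(b) for b in B}


def cls(b):
    """The class [b] = b + ker f in B / ker f."""
    return frozenset((b + k) % n for k in kernel)


quotient = {cls(b) for b in B}

# g([b]) := f(b): collect the values that f takes on each class.
g_values = {q: {f(b) for b in q} for q in quotient}
g_well_defined = all(len(values) == 1 for values in g_values.values())
g = {q: next(iter(values)) for q, values in g_values.items()}
g_injective = len(set(g.values())) == len(quotient)
g_onto_image = set(g.values()) == image
```

Here ker f = {0, 4, 8} and f(B) = {0, 3, 6, 9}, so B/ker f has four classes and g maps them bijectively onto f(B).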

Lemma 3.3.11 Let A be a ring and B, C, D be A-modules. Then we have the following M -isomorphisms.

• (B ⊕ C) ⊕ D ≃ B ⊕ (C ⊕ D) ≃ B ⊕ C ⊕ D,

• B ⊕ C ≃ C ⊕ B,

• (B ⊕ C)/C ≃ B, if furthermore C ≤ B, then also (B/C) ⊕ C ≃ B.

Proof. • Consider the isomorphisms ((b, c), d) ↦ (b, (c, d)) and (b, (c, d)) ↦ (b, c, d).

• Consider (b, c) ↦ (c, b).

• Consider B ⊕ C → B : (b, c) ↦ b (the kernel of this map is precisely {0} × C ≃ C and it is surjective) and apply Lemma (3.3.10) to obtain that (B ⊕ C)/C ≃ B. On the other hand if C ≤ B we can take B ⊕ C → (B/C) ⊕ C : (b, c) ↦ ([b], c) (note that ([b], c) = ([0], 0) if and only if c = 0 and b ∈ C, so the kernel of this map is C × {0} ≃ C) and we then use Lemma (3.3.10) to obtain that B ≃ (B ⊕ C)/C ≃ (B/C) ⊕ C.

Definition 3.3.12: Let A be a ring, B an A-module, k ∈ N. We define the span of a collection {b1, . . . , bk} ⊆ B to be

⟨b1, . . . , bk⟩A := {a1 b1 + ... + ak bk ∈ B | a1, . . . , ak ∈ A} ≤ B.

We call a collection {b1, . . . , bk} ⊆ B linearly independent (or say that b1, . . . , bk are linearly independent) if for all a1, . . . , ak ∈ A we have that

a1 b1 + ... + ak bk = 0 → a1 = ... = ak = 0.

If {b1, . . . , bk} is not linearly independent, it is called linearly dependent. An empty collection ∅ ⊆ B is never linearly independent. We call an infinite subset {bi ∈ B | i ∈ I} linearly independent if for all k ∈ N, i1, . . . , ik ∈ I, the collection {bi1, . . . , bik} ⊆ B is linearly independent.

We say that B has dimension 0 if B = {0}. If there exist a k ∈ N and b1, . . . , bk ∈ B that are linearly independent and for which B = ⟨b1, . . . , bk⟩A, we say that B has dimension k. We say that B has finite dimension if there exists a k ∈ N0 such that B has dimension k, otherwise we say that B has infinite dimension.


Example 3.3.13 Let $k \in \mathbb{N}$, then for

\[ e_1 := (1, 0, \ldots, 0), \quad e_2 := (0, 1, \ldots, 0), \quad \ldots, \quad e_k := (0, 0, \ldots, 1) \in K^k \]

(these are called the (canonical) basis vectors of $K^k$) we have that $\{e_1, \ldots, e_k\}$ is linearly independent. For let $\alpha_1, \ldots, \alpha_k \in K$, then $0 = (0, \ldots, 0) = \alpha_1 e_1 + \ldots + \alpha_k e_k = (\alpha_1, \ldots, 0) + \ldots + (0, \ldots, \alpha_k) = (\alpha_1, \ldots, \alpha_k)$ if and only if $\alpha_1 = \ldots = \alpha_k = 0$. Furthermore, by definition $K^k = \langle e_1, \ldots, e_k \rangle_K$, so $K^k$ has dimension $k$.

Lemma 3.3.14 Let $A$ be a ring and $B$ an $A$-module. If $\langle b_1, \ldots, b_k \rangle_A \subseteq \langle b'_1, \ldots, b'_l \rangle_A$ for $b_1, \ldots, b_k$ and $b'_1, \ldots, b'_l$ both linearly independent, then $k \le l$. In particular if $B$ has dimension $k$ and $B$ has dimension $l$, then $k = l$, and any collection of linearly independent vectors of $B$ has at most $k$ elements. Furthermore, if $C$ is an $A$-module and $B \simeq C$, then $B$ has dimension $k$ if and only if $C$ has dimension $k$.

Proof. Suppose that for any $1 \le m \le l$ we have that $\{b'_m, b_2, \ldots, b_k\}$ is linearly dependent. Then for each $m$, $b'_m \in \langle b_2, \ldots, b_k \rangle_A$. However, $b_1 \in \langle b'_1, \ldots, b'_l \rangle_A$, so this would mean that $b_1 \in \langle b_2, \ldots, b_k \rangle_A$ and therefore $\{b_1, \ldots, b_k\}$ is linearly dependent, which leads to a contradiction. Hence there is some $m_1 \in \{1, \ldots, l\}$ such that $\{b'_{m_1}, b_2, \ldots, b_k\}$ is linearly independent. Following the same reasoning for $b_2, b_3, \ldots, b_k$ we find $m_2, m_3, \ldots, m_k \in \{1, \ldots, l\}$ such that $\{b'_{m_1}, b'_{m_2}, b'_{m_3}, \ldots, b'_{m_k}\}$ is linearly independent. Hence all $m_1, \ldots, m_k$ must be distinct and therefore $k \le l$.

This makes the dimension of $B$ unique: if $B$ has dimension $k$ and $B$ has dimension $l$ then by the above $k \le l$ and $l \le k$, so $k = l$.

Let $C$ be an $A$-module, $B \simeq C$ and suppose $B$ has dimension $k$. Let $f : B \to C$ be the M -isomorphism and let $b_1, \ldots, b_k \in B$ be linearly independent such that $B = \langle b_1, \ldots, b_k \rangle_A$. Now let $c_l := f(b_l)$ for $1 \le l \le k$. Then $c_1, \ldots, c_k \in C$ are linearly independent: suppose $a_1 c_1 + \ldots + a_k c_k = 0$ for $a_1, \ldots, a_k \in A$. Then ($f$ is $A$-linear) $f(a_1 b_1 + \ldots + a_k b_k) = 0$, but $f$ is injective and therefore $\ker f = \{0\}$, hence $a_1 b_1 + \ldots + a_k b_k = 0$ and (linear independence) therefore $a_1 = \ldots = a_k = 0$. Furthermore, any $c \in C = f(B)$ can be written as $c = f(b)$ for some $b \in B$, but $b \in \langle b_1, \ldots, b_k \rangle_A$, so there exist $a_1, \ldots, a_k \in A$ such that $b = a_1 b_1 + \ldots + a_k b_k$ and hence $c = f(b) = a_1 f(b_1) + \ldots + a_k f(b_k) = a_1 c_1 + \ldots + a_k c_k \in \langle c_1, \ldots, c_k \rangle_A$. Therefore $C = \langle c_1, \ldots, c_k \rangle_A$ and hence $C$ has dimension $k$. For the converse, simply exchange $B$ and $C$.

Lemma 3.3.15 Let $A$ be a ring. If an $A$-module $B$ has dimension $k$ for some $k \in \mathbb{N}$, then $B \simeq A^k$ as $A$-modules. In particular, if $B$, $C$ are $A$-modules, $B$ has dimension $k$, and $C$ has dimension $l$, then the dimension of $B \oplus C$ equals $k + l$.

Proof. Suppose B has dimension k, then there exists b1, . . . , bk ∈ B linearly k independent such that B = hb1, . . . , bkiA. Now consider the map f : A → B given by f(a1, . . . , ak) := a1 b1 +...+ak bk. Then this map is surjective because 0 0 B = hb1, . . . , bkiA = f(A). Suppose f(a1, . . . , ak) = f(a1, . . . , ak), then 0 =

- 44 - 3.3. MODULES

0 0 0 0 a1 b1+...+ak bk −a1 b1−...−ak bk = (a1−a1) b1+...+(ak −ak) bk. As b1, . . . , bk 0 0 are linearly independent we therefore have that a1 − a1 = ... = ak − ak = 0, so 0 0 (a1, . . . , ak) = (a1, . . . , ak). Hence f is injective. Therefore f is bijective, and by definition f l /A, so f is a M -isomorphism and hence B ' Ak. Suppose B has dimension k and C /A has dimension l, then B ' Ak, C ' Al and therefore (Lemma (3.3.11)) we have B ⊕ C ' (Ak) ⊕ (Al) ' Ak+l and therefore by Lemma (3.3.14) the dimension of B ⊕ C equals the dimension of Ak+l which is k + l. Lemma 3.3.16: Rank lemma Let A R , B, C /A, and f : B → C /A. If B has dimension k, then the dimension of ker f plus the dimension of f(B) equals k. Proof. Using Lemma (3.3.10) and Lemma (3.3.11) we find B ' (B/ ker f) ⊕ ker f ' f(B) ⊕ ker f. Hence we can consider f(B) ≤ B, which must have finite dimension as B has finite dimension (Lemma (3.3.14)), now the result follows from Lemma (3.3.15). Definition 3.3.17: Algebraic dual of a module Let A and B /A. Then the (algebraic) dual of B is defined to be the set

$B^* := \{f : B \to A \mid f \text{ is } A\text{-linear}\}$, together with $0 : b \mapsto 0_A$, $(f + g)(b) := f(b) + g(b)$, and $(a f)(b) := a f(b)$, which make $B^*$ an $A$-module.

Definition 3.3.18: Algebraic dual
Let $A$ be a ring and $B$, $C$ be $A$-modules. Then for any $A$-linear $f : B \to C$ we define its algebraic dual (which is again $A$-linear) as

$f^* : C^* \to B^* : g \mapsto g \circ f.$

We can directly see that if $B$ has dimension $k$, then so does $B^*$: let $b_1, \ldots, b_k \in B$ be linearly independent, then the maps $B \to A$ defined for each $1 \le l \le k$ by mapping any $b = a_1 b_1 + \ldots + a_k b_k$ to $a_l$ are linearly independent in $B^*$ (simply apply them to $b_1, \ldots, b_k$).

Theorem 3.3.19: Finite dimensional duality
Let $A$ be a ring and $B$ an $A$-module. Denote for any $C \le B$ the set

C⊥ := {f ∈ B∗|∀c ∈ C : f(c) = 0} ≤ B∗, and for any D ≤ B∗ the set

D⊥ := {b ∈ B|∀f ∈ D : f(b) = 0} ≤ B.

• Then for all C ≤ B, D ≤ B∗ we have

C ≤ (C⊥)⊥,D ≤ (D⊥)⊥.


• For all $C_1 \le C_2 \le B$, $D_1 \le D_2 \le B^*$ we have

$C_1^\perp \ge C_2^\perp, \quad D_1^\perp \ge D_2^\perp.$

• For all $C \le B$, $C^\perp$ has dimension $k$ if and only if $B/C$ has dimension $k$.

• For all $D \le B^*$, $D$ has dimension $k$ if and only if $B/D^\perp$ has dimension $k$.

• The correspondences $C \mapsto C^\perp$, $D \mapsto D^\perp$ are each other's inverse and form a bijection between the collection of all $C \le B$ for which $B/C$ has finite dimension and the collection of all $D \le B^*$ of finite dimension. In particular, if $B/C$ and $D$ have finite dimension:

B/C = B/(C⊥)⊥,D = (D⊥)⊥.

Proof. We follow [Bou1947] (Chapitre II, par. 4, no. 6, Théorème 1).

• Let $c \in C$, $f \in C^\perp$, then (definition of $C^\perp$) $f(c) = 0$, hence $\forall f \in C^\perp : f(c) = 0$ and therefore $c \in (C^\perp)^\perp$, so $C \le (C^\perp)^\perp$. Similarly $D \le (D^\perp)^\perp$.

• Suppose $C_1 \le C_2$, let $f \in C_2^\perp$, then $f(c) = 0$ for all $c \in C_2 \ge C_1$, so in particular $f(c) = 0$ for all $c \in C_1$, hence $f \in C_1^\perp$. This means $C_1^\perp \ge C_2^\perp$ and similarly $D_1^\perp \ge D_2^\perp$.

• First of all note that $(B/C)^* \simeq C^\perp$ via $(B/C)^* \to C^\perp : f \mapsto (B \to A : b \mapsto f([b]))$ and its inverse (well-defined because of Lemma (3.3.10)) $C^\perp \to (B/C)^* : g \mapsto (B/C \to A : [b] \mapsto g(b))$. Therefore, by Lemma (3.3.14), if $B/C$ has dimension $k$, $(B/C)^*$ has dimension $k$ and hence $C^\perp$ does too. Conversely, suppose $C^\perp$ has dimension $k$. If $B/C$ has finite dimension, then this dimension (by the converse) must be equal to $k$. Suppose $B/C$ has infinite dimension. Then there exists a $C_1 \le B$ with $C_1 \ge C$ and $B/C_1$ of dimension $k + 1$. However, then $C_1^\perp \le C^\perp$ also has dimension $k + 1$, which is impossible by Lemma (3.3.14) as $C^\perp$ has dimension $k$. Therefore $B/C$ must have finite dimension.

• Suppose that $D$ has dimension $k$, then let $f_1, \ldots, f_k \in D$ be linearly independent. Now define $f : B \to A^k : b \mapsto (f_1(b), \ldots, f_k(b))$, then the dimension of $f(B)$ is at most $k$. On the other hand, $\ker f = D^\perp$, so the dimension of $B/D^\perp$ equals the dimension of $f(B)$ by Lemma (3.3.10) and is therefore at most $k$. By the previous part, the dimension of $B/D^\perp$ equals the dimension of $(D^\perp)^\perp \ge D$, which is at least $k$. Hence $B/D^\perp$ has dimension $k$. Conversely if $B/D^\perp$ has dimension $k$, then $(D^\perp)^\perp \ge D$ has dimension $k$, so the dimension of $D$ is at most $k$. Hence $D$ is finite dimensional and therefore (converse) must have dimension equal to that of $B/D^\perp$.

• Suppose $D \le B^*$ has dimension $k$, then $B/D^\perp$ has dimension $k$ and hence $(D^\perp)^\perp$ as well. As $D \le (D^\perp)^\perp$ also has dimension $k$ we therefore have $D = (D^\perp)^\perp$. Suppose that for $C \le B$, $B/C$ has dimension $k$, then $C^\perp$ has dimension $k$ and therefore $B/(C^\perp)^\perp$ is $k$ dimensional. Hence $B/C = B/(C^\perp)^\perp$. This makes $C \mapsto C^\perp$ and $D \mapsto D^\perp$ inverse operations.


Lemma 3.3.20: Dual of the dual
Let $A$ be a ring with $0 \ne 1$ and $B$ an $A$-module. Then the map $f : B \to (B^*)^* : b \mapsto (g \mapsto g(b))$ is $A$-linear and injective.

Proof. Let $g, h \in B^*$, $a \in A$, then for any $b \in B$, $f(b)(g + a h) = (g + a h)(b) = g(b) + a h(b) = f(b)(g) + a f(b)(h)$, so $f(b) : B^* \to A$ is $A$-linear and hence $f(b) \in (B^*)^*$. Let $b_1, b_2 \in B$, $a \in A$, then for any $g \in B^*$, $f(b_1 + a b_2)(g) = g(b_1 + a b_2) = g(b_1) + a g(b_2) = f(b_1)(g) + a f(b_2)(g)$, so $f$ is $A$-linear.
Fix some $b \in B$, $b \ne 0$. Then the space $C := \{a b \in B \mid a \in A\} \le B$ is an $A$-module. We can define a map $g_C : C \to A$ by $g_C(a b) := a$, then $g_C \in C^*$ and $g_C(b) = g_C(1 b) = 1 \ne 0$, so $g_C \ne 0 \in C^*$. Create the collection

$\mathcal{A} := \{g_D : D \to A \mid C \le D \le B,\ g_D \in D^*,\ g_D|_C = g_C\}$

partially ordered by $g_D \le g_E$ if and only if $D \le E$ and $g_E|_D = g_D$. Then $\mathcal{A} \ne \emptyset$ (since $g_C \in \mathcal{A}$) and $\mathcal{A}$ satisfies the chain condition (let $\mathcal{A}' \subseteq \mathcal{A}$ be a totally ordered subset, then $\bigcup \mathcal{A}'$ is an upper bound, because all maps in $\mathcal{A}'$ are compatible due to the imposed ordering). By Zorn's lemma we therefore find that $\mathcal{A}$ has a maximal element $g_D \in \mathcal{A}$ with respect to the partial ordering.
Suppose $D \ne B$, then there exists a $b_1 \in B \setminus D$, but then we can construct $E := \{d + a b_1 \in B \mid d \in D, a \in A\}$ satisfying $C \le D \le E \le B$ and a map $g_E : E \to A$ by $g_E(d + a b_1) = g_D(d) + a$, clearly satisfying $g_E|_D = g_D$ and $g_E \in E^*$, which contradicts maximality of $g_D$. Therefore necessarily $B = D$ and we find a $g \in B^*$ such that $g(b) = g_C(b) = 1$.
So for any $b \in B$, $b \ne 0$ there exists a $g \in B^*$ with $g(b) = 1$ (a miniature version of Theorem (3.3.22)). In particular, $f(b)(g) = g(b) = 1 \ne 0$, so $f(b) \ne 0$ and hence $b \notin \ker f$ if $b \ne 0$. On the other hand $f(0)(g) = g(0) = 0$ for all $g \in B^*$, so $\ker f = \{0\}$ and hence $f$ is injective.

Definition 3.3.21: Vector space
Let $A$ be a ring and $B$ an $A$-module. Then we call $B$ an $A$-vector space (denoted by $B$ Vs$/A$) if $A$ is a field.

Theorem 3.3.22: Hahn-Banach ($\mathbb{R}$)
Let $A$ be a vector space over $\mathbb{R}$, and $f : A \to \mathbb{R}$, such that for all $a_1, a_2 \in A$, $f(a_1 + a_2) \le f(a_1) + f(a_2)$ and $f(\alpha a_1) = \alpha f(a_1)$ for all $\alpha \in [0, \infty[$.
Then for any $g \in B^*$ with $B \le A$, satisfying $g(b) \le f(b)$ for all $b \in B$, there exists an $h \in A^*$ such that $h|_B = g$, and $h(a) \le f(a)$ for all $a \in A$.

Proof. Consider the collection

$\mathcal{A} := \{h_C : C \to \mathbb{R} \mid B \le C \le A,\ h_C|_B = g,\ \forall c \in C : h_C(c) \le f(c)\},$

together with the partial ordering $h_C \le h_D$ if and only if $C \le D$ and $h_D|_C = h_C$. Then $\mathcal{A}$ satisfies the chain condition (let $\mathcal{A}' \subseteq \mathcal{A}$ be a totally ordered subset, then $\bigcup \mathcal{A}'$ is an upper bound because all maps in $\mathcal{A}'$ are compatible under restrictions since $\mathcal{A}'$ is totally ordered) and $\mathcal{A} \ne \emptyset$ because $g \in \mathcal{A}$. Hence by Zorn's lemma $\mathcal{A}$ has a maximal element with respect to its partial ordering; denote this element by $h_C$.
Suppose $C \ne A$, then there exists an $a \in A \setminus C$ and since $C \le A$, this implies that for

$D := \{c + \alpha a \mid c \in C, \alpha \in \mathbb{R}\}$

we have $B \le C \le D \le A$.
Now let $h_D \in D^*$ be any function satisfying $h_D|_C = h_C$ and $h_D(d) \le f(d)$ for all $d \in D$. Then for any $d = c + \alpha a \in D$ we have $h_D(d) = h_D(c + \alpha a) = h_D(c) + \alpha h_D(a) = h_C(c) + \alpha h_D(a)$. Furthermore, if $\alpha = 0$, $h_D(d) = h_C(c) + 0 \le f(c) = f(d)$ directly (as $h_C \in \mathcal{A}$).
If $\alpha < 0$, $h_D(d) = h_C(c) - |\alpha| h_D(a) \le f(d) = f(c - |\alpha| a)$, so $h_C(\frac{1}{|\alpha|} c) - h_D(a) \le f(\frac{1}{|\alpha|} c - a)$, so $h_D(a) \ge h_C(c') - f(c' - a)$ for $c' = \frac{1}{|\alpha|} c \in C$.
If $\alpha > 0$ we find in a similar way that $h_D(d) = h_C(c) + |\alpha| h_D(a) \le f(c + |\alpha| a)$, so $h_D(a) \le f(c' + a) - h_C(c')$ for $c' = \frac{1}{|\alpha|} c \in C$.
So if $h_D \in D^*$ satisfies $h_D|_C = h_C$ and $h_D(d) \le f(d)$ for all $d \in D$, we have that necessarily $h_D(c + \alpha a) = h_C(c) + \alpha h_D(a)$ with for all $c, c' \in C$

$h_C(c) - f(c - a) \le h_D(a) \le f(c' + a) - h_C(c') \qquad (3.3)$

since the correspondence $c \leftrightarrow \frac{1}{|\alpha|} c$ in $C$ is bijective for $\alpha \ne 0$.
Conversely, for a $\beta \in \mathbb{R}$ which satisfies Equation (3.3) in place of $h_D(a)$, we can define the function $h_D : D \to \mathbb{R}$ for all $d = c + \alpha a \in D$ by $h_D(c + \alpha a) := h_C(c) + \alpha \beta$. Then $h_D(a) = \beta$ and $h_D \in D^*$ as $h_C \in C^*$. Furthermore $h_D|_C = h_C$ (case $\alpha = 0$) and $h_D(d) \le f(d)$ for all $d \in D$ (reverse the reasoning leading to Equation (3.3)).
Now for any $c, c' \in C$ we have $h_C(c) + h_C(c') = h_C(c + c') \le f(c + c') = f(c - a + c' + a) \le f(c - a) + f(c' + a)$, so $h_C(c) - f(c - a) \le f(c' + a) - h_C(c')$ for all $c, c' \in C$. This means that both $\beta_- := \sup\{h_C(c) - f(c - a) \in \mathbb{R} \mid c \in C\}$ and $\beta_+ := \inf\{f(c' + a) - h_C(c') \in \mathbb{R} \mid c' \in C\}$ exist in $\mathbb{R}$, that $\beta_- \le \beta_+$, and that any $\beta$ in the nonempty interval $[\beta_-, \beta_+]$ satisfies Equation (3.3).
So there exist $\beta_-, \beta_+ \in \mathbb{R}$, $\beta_- \le \beta_+$ such that for all $\beta \in [\beta_-, \beta_+]$ the function $h_D : D \to \mathbb{R}$ defined by $h_D(c + \alpha a) = h_C(c) + \alpha \beta$ satisfies $h_D \in D^*$, $h_D|_C = h_C$ and $h_D(d) \le f(d)$ for all $d \in D$. Therefore $h_D \in \mathcal{A}$ and since $D \ge C$, $D \ne C$ we have $h_D > h_C$, contradicting the maximality of $h_C$.
Therefore the assumption $C \ne A$ leads to a contradiction: necessarily $C = A$ and hence the maximal element $h_A =: h$ is the sought-after extension of $g$.

It is tempting to try to prove the Hahn-Banach theorem for other fields besides $\mathbb{R}$. Later (Theorem (4.2.6)) we will see that we can also extend Theorem (3.3.22) to $\mathbb{C}$, but the following example shows that completeness of the considered field is very important.

Example 3.3.23: Hahn-Banach fails over $\mathbb{Q}$
Here we will investigate the particulars of Theorem (3.3.22) where instead of vector spaces over $\mathbb{R}$ we consider vector spaces over the field $\mathbb{Q}$.
Consider $A = \mathbb{R}^2$ considered as a vector space over $\mathbb{Q}$ and $B = \mathbb{Q} \le A$ (identified with $\mathbb{Q} \times \{0\}$ via $x \mapsto (x, 0)$). Choose $f : A \to \mathbb{R} : (x, y) \mapsto |x|$, then $f((x_1, y_1) + (x_2, y_2)) = |x_1 + x_2| \le |x_1| + |x_2| = f(x_1, y_1) + f(x_2, y_2)$ and for all $\alpha \in \mathbb{Q}$ we have $f(\alpha (x_1, y_1)) = |\alpha x_1| = |\alpha| f(x_1, y_1)$. Pick $g : B \to \mathbb{Q} : x \mapsto x$, then clearly $g \in B^*$ and $g(x) = x \le |x| = f(x, 0)$ for all $x \in B$.


Let $h \in A^*$ (so $h : \mathbb{R}^2 \to \mathbb{Q}$ is $\mathbb{Q}$-linear) satisfy both $h|_B = g$ and $h(x, y) \le f(x, y)$ for all $(x, y) \in A$. Consider the linear subspace

$D := \{(x, 0) + \alpha (\sqrt{2}, 1) \in A \mid (x, 0) \in B, \alpha \in \mathbb{Q}\} \le A.$

Then because $h$ is linear and restricts to $g$ we have $h((x, 0) + \alpha (\sqrt{2}, 1)) = h(x, 0) + \alpha h(\sqrt{2}, 1) = g(x) + \alpha h(\sqrt{2}, 1) = x + \alpha \beta$ for $\beta := h(\sqrt{2}, 1) \in \mathbb{Q}$.
As $h(x, y) \le f(x, y)$ for all $(x, y) \in A \ge D$ and $B \le D \le A$ we see from Equation (3.3) and the proof of Theorem (3.3.22) (with $a = (\sqrt{2}, 1) \in A \setminus B$) that necessarily for all $x \in \mathbb{Q}$ we have $\beta \le f((x, 0) + (\sqrt{2}, 1)) - g(x)$ and $\beta \ge g(x) - f((x, 0) - (\sqrt{2}, 1))$.
For $x = 2 \in \mathbb{Q}$ we therefore find $\beta \ge g(2) - f((2, 0) - (\sqrt{2}, 1)) = 2 - |2 - \sqrt{2}| = \sqrt{2}$ and for $x = 1 \in \mathbb{Q}$, $\beta \le f((1, 0) + (\sqrt{2}, 1)) - g(1) = |1 + \sqrt{2}| - 1 = \sqrt{2}$. Therefore $\beta = \sqrt{2}$ and hence $\sqrt{2} = \beta \in \mathbb{Q}$, which is a contradiction.
Therefore such a bounded linear extension $h$ of $g$ cannot exist in this case: Theorem (3.3.22) does not hold for vector spaces over $\mathbb{Q}$.
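The two bounds on $\beta$ in this example can be checked numerically. The following is only a sketch in floating point, assuming a small hypothetical sample of rationals in $[-5, 5]$; the names `lower_bound`, `upper_bound`, and `samples` are illustrative and not part of the thesis. It evaluates $g(x) - f((x,0) - (\sqrt{2},1)) = x - |x - \sqrt{2}|$ and $f((x,0) + (\sqrt{2},1)) - g(x) = |x + \sqrt{2}| - x$ and shows that the largest lower bound and the smallest upper bound both pin $\beta$ to $\sqrt{2}$.

```python
from fractions import Fraction
from math import sqrt

SQRT2 = sqrt(2.0)  # floating point stand-in for the irrational number sqrt(2)

def lower_bound(x):
    """g(x) - f((x, 0) - (sqrt(2), 1)) = x - |x - sqrt(2)|, a lower bound for beta."""
    return float(x) - abs(float(x) - SQRT2)

def upper_bound(x):
    """f((x, 0) + (sqrt(2), 1)) - g(x) = |x + sqrt(2)| - x, an upper bound for beta."""
    return abs(float(x) + SQRT2) - float(x)

# Hypothetical sample of rationals p/q in [-5, 5] with small denominators.
samples = [Fraction(p, q) for q in range(1, 8) for p in range(-5 * q, 5 * q + 1)]

beta_minus = max(lower_bound(x) for x in samples)  # attained e.g. at x = 2
beta_plus = min(upper_bound(x) for x in samples)   # attained e.g. at x = 1

print(beta_minus, beta_plus, SQRT2)
```

Both extremes agree with $\sqrt{2}$ up to rounding, so the only admissible value of $\beta$ is irrational, which is exactly the obstruction used above.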

Chapter 4

Topology and algebra

We will now provide the algebraic objects from Chapter 3 with the geometric properties from Chapter 2, which is necessary to be able to do analysis later in Chapter 5. There, for differentiability, we need to investigate exactly how fast the value of a function changes if we 'move around' a fixed point; this is not possible in the general setting of Chapter 2. Of particular use in this regard will be the notion of abc subsets in Definition (4.3.4) and Section 4.5.

4.1 Topological modules

We now make a connection between the discussed topological and algebraical concepts.

Definition 4.1.1: Topological group (TG)
Let $A$ be a set. Then we call $A$ a topological group (denoted by $A$ TG) if $A$ is a topological space and a group such that the multiplication and inversion maps of the group structure are both continuous with respect to the topology on $A$.

Definition 4.1.2: Topological ring (TR)
Let $A$ be a set. Then we call $A$ a topological ring (denoted by $A$ TR) if $A$ is a topological space and a ring such that the multiplication and addition maps given by the ring structure on $A$ are all continuous with respect to the topology on $A$.
By Lemma (3.2.3) we know that for topological rings $A$, the map $A \to A : a \mapsto -a$ is given by the composition of $A \times A \to A : (a_1, a_2) \mapsto a_1 a_2$ with $A \to A \times A : a \mapsto (-1, a)$, which are both continuous, so $a \mapsto -a$ is continuous. This makes the Abelian group structure on $A$ a topological group.

Definition 4.1.3: Topological module (TM)
Let $A$ be a topological ring and $B$ a set. Then we call $B$ a topological $A$-module (denoted by $B$ TM$/A$) if $B$ is a topological space and an $A$-module such that the scalar multiplication and addition maps given by the $A$-module structure on $B$ are all continuous with respect to the topology on $B$.


Definition 4.1.4: Morphisms of topological A-modules
Let $A$ be a topological ring and $B$, $C$ be topological $A$-modules. Then all continuous $A$-linear maps $f : B \to C$ are topological $A$-module morphisms between $B$ and $C$ (denoted by: $f$ is a TM$/A$-morphism). The identity morphism of $B$ is the map

$\mathrm{id}_B : B \to B : b \mapsto b.$

Following the same reasoning as for topological rings we see that for topological modules, the map $b \mapsto -b$ is continuous. Also note that for a topological $A$-module $B$ and a topological space $C$ we have that $\{f : C \to B \mid f \text{ continuous}\}$ is an $A$-module because addition and scalar multiplication are continuous.

Lemma 4.1.5
Let $A$ be a topological ring and $B$ a topological $A$-module. Let $V \subseteq B$ be open (resp. closed). Then for all $b \in B$,

$V + b := \{b_1 + b \in B \mid b_1 \in V\}$

is open (resp. closed) and for all $a \in A^*$,

$a V := \{a b_1 \in B \mid b_1 \in V\}$

is open (resp. closed).

Proof. Direct from the fact that for fixed $b \in B$ and $a \in A^*$, the maps $B \to B : b_1 \mapsto b_1 + b$ and $B \to B : b_1 \mapsto a b_1$ (as $B$ is a topological $A$-module they are both continuous) are homeomorphisms, because they have the continuous inverses $b_1 \mapsto b_1 - b$ and $b_1 \mapsto a^{-1} b_1$ as $a \in A^*$.

Note that we can translate any basis of neighbourhoods from $0 \in B$ to any point $b \in B$ by Lemma (4.1.5) and that the entire topology of $B$ is generated by translations of this basis. This allows us to compare topologies of topological modules more easily.

Lemma 4.1.6: Comparing topologies
Let $A$ be a topological ring and $B$ an $A$-module. Let $\mathcal{B}_1, \mathcal{B}_2 \subseteq \mathcal{P}(B)$ be topologies on $B$ such that $B$ is a topological $A$-module with each of these topologies. Let $\mathcal{B}_3$ be a basis of open neighbourhoods of $0$ in $B$ with respect to the topology $\mathcal{B}_1$.
If for each $V_1 \in \mathcal{B}_3$ there exists a $V_2 \in \mathcal{B}_2$ such that $0 \in V_2 \subseteq V_1$, then $\mathcal{B}_1 \subseteq \mathcal{B}_2$.
Furthermore $\mathcal{B}_1 = \mathcal{T}(\{V + b \mid V \in \mathcal{B}_3, b \in B\})$ (any basis of open neighbourhoods of $0$ in $B$ generates the entire topology via translation).

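Proof. Suppose that for each $V_1 \in \mathcal{B}_3$ there exists a $V_2 \in \mathcal{B}_2$ such that $0 \in V_2 \subseteq V_1$. Let $V \in \mathcal{B}_1$, then for any $b \in V$, $V - b$ is an open neighbourhood of $0$, so there exists some $V_1 \in \mathcal{B}_3$ such that $0 \in V_1 \subseteq V - b$. By assumption there exists a $V_b \in \mathcal{B}_2$ such that $0 \in V_b \subseteq V_1 \subseteq V - b$. But then $V = \bigcup_{b \in V} \{b\} \subseteq \bigcup_{b \in V} (V_b + b) \subseteq V$, so $V = \bigcup_{b \in V} (V_b + b) \in \mathcal{B}_2$, as all $V_b + b \in \mathcal{B}_2$, which is a topology. So $\mathcal{B}_1 \subseteq \mathcal{B}_2$.
Denote $\mathcal{B}_4 = \mathcal{T}(\{V + b \mid V \in \mathcal{B}_3, b \in B\})$. First of all note (with Lemma (4.1.5)) that the collection of all $V + b$ for $V \in \mathcal{B}_3$ and $b \in B$ is contained in $\mathcal{B}_1$, which is a topology. Hence $\mathcal{B}_4 \subseteq \mathcal{B}_1$. In a similar way as above (write an open set as a union of translated basis elements) we find $\mathcal{B}_1 \subseteq \mathcal{B}_4$ and hence $\mathcal{B}_1 = \mathcal{B}_4$: the topology of $B$ is generated by translations of any basis of open neighbourhoods of $0$ in $B$.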


Lemma 4.1.7
Let $A$ be a topological ring and $B$ a topological $A$-module. Then $B$ is $T_1$ if and only if $B$ is $T_2$.

Proof. Suppose $B$ is $T_2$, then by Theorem (2.2.5) $B$ is $T_1$.
Suppose $B$ is $T_1$. By Lemma (4.1.5) it is sufficient to only consider the points $0 \in B$ and $b \in B$, $b \ne 0$. As $B$ is $T_1$, we have by Lemma (2.2.2) that the set $V := B \setminus \{b\} \subseteq B$ is open. Furthermore, as $0 \ne b$, $0 \in V$, so $V$ is an open neighbourhood of $0$ in $B$. Because $B$ is a topological $A$-module, $B \times B \to B : (b_1, b_2) \mapsto b_1 - b_2$ is continuous. As $0 - 0 = 0$ we therefore have for $0 \in V$ open that there exist $V_1, V_2 \subseteq B$ open such that $0 \in V_1$, $0 \in V_2$ and for all $b_1 \in V_1$, $b_2 \in V_2$ we have $b_1 - b_2 \in V$.
Now $V_1$ is an open neighbourhood of $0$ in $B$ and by Lemma (4.1.5), $b + V_2$ is an open neighbourhood of $b$ in $B$. Suppose $V_1 \cap (b + V_2) \ne \emptyset$, then there exists a $b_1 \in V_1 \cap (b + V_2)$, and as $b_1 \in b + V_2$ there is a $b_2 \in V_2$ such that $b_1 = b + b_2$. However, then $b_1 \in V_1$, $b_2 \in V_2$, so $b = b_1 - b_2 \in V = B \setminus \{b\}$: we have reached a contradiction. Therefore necessarily $V_1 \cap (b + V_2) = \emptyset$, so $0$ and $b$ can be separated by disjoint open sets for any $b \ne 0$, hence $B$ is $T_2$.

Definition 4.1.8: Topological quotient module
Let $A$ be a topological ring, $B$ a topological $A$-module, and $C \le B$. Then the topological quotient module $B/C$ is the quotient module $B/C$ (where $B$ and $C$ are just considered as modules) together with the quotient topology (Definition (2.1.26)) of $B \to B/C : b \mapsto [b]$.

Lemma 4.1.9: Properties of the topological quotient module
Let $A$ be a topological ring, $B$ a topological $A$-module, and $C \le B$. Then $B/C$ is a topological $A$-module, and $B/C$ is $T_2$ if and only if $C \subseteq B$ is closed. Furthermore, the map $B \to B/C : b \mapsto [b]$ is open.

Proof. We already know that $B/C$ is an $A$-module as a quotient module without topological structure, so we only need to verify that addition and scalar multiplication are continuous. Denote the projection map as $f : B \to B/C : b \mapsto [b]$.
Let $E \subseteq B$. Let $b \in f^{-1}(f(E))$, then $f(b) \in f(E)$, so there exists an $e \in E$ with $[b] = f(b) = f(e) = [e]$, so $b = e + c$ for some $c \in C$, but then $b = e + c \in E + C$. Hence $f^{-1}(f(E)) \subseteq E + C$. Conversely, let $b \in E + C$, then $b = e + c$ for some $e \in E$ and $c \in C$, therefore $f(b) = f(e + c) = [e + c] = [e] + [c] = [e] + [0] = [e] \in f(E)$, so $b \in f^{-1}(f(E))$. Therefore $E + C \subseteq f^{-1}(f(E))$.
Since $f$ is also surjective, we obtain for any $D \subseteq B/C$ and $E \subseteq B$ that

$f(f^{-1}(D)) = D, \qquad f^{-1}(f(E)) = E + C = \{e + c \mid e \in E, c \in C\}.$

We are now going to show that addition is continuous. Let $[b_1], [b_2] \in B/C$ be arbitrary, and $W$ any open neighbourhood of $[b_1] + [b_2]$ in $B/C$. Then $f(b_1 + b_2) = [b_1 + b_2] = [b_1] + [b_2] \in W$, so $V := f^{-1}(W) \subseteq B$ is an open neighbourhood of $b_1 + b_2$ in $B$. As $+$ is continuous on $B$ and $B \times B$ has the product topology, there exist $V_1, V_2 \subseteq B$ open such that $b_1 \in V_1$, $b_2 \in V_2$ and for all $b_3 \in V_1$, $b_4 \in V_2$ we have $b_3 + b_4 \in V$. By Lemma (4.1.5) we have that $V_1 + C = \bigcup_{c \in C} (V_1 + c) \subseteq B$ is open and similarly $V_2 + C$ is open. Now define $W_1 := f(V_1 + C)$, $W_2 := f(V_2 + C)$, then $W_1, W_2 \subseteq B/C$ are open because of the quotient topology and the fact that $f^{-1}(W_1) = f^{-1}(f(V_1 + C)) = (V_1 + C) + C = V_1 + C$, which is open, and $f^{-1}(W_2) = V_2 + C$ is open. Let $[b_3] \in W_1$, $[b_4] \in W_2$, then $f(b_3) \in W_1$, so $b_3 \in f^{-1}(W_1) = V_1 + C$, which means that $b_3 = b_5 + c_5$ for some $b_5 \in V_1$ and $c_5 \in C$. Similarly $b_4 = b_6 + c_6$ for some $b_6 \in V_2$ and $c_6 \in C$. Therefore $b_3 + b_4 = (b_5 + c_5) + (b_6 + c_6) = (b_5 + b_6) + (c_5 + c_6) \in V + C$ since $b_5 \in V_1$ and $b_6 \in V_2$. Now $V + C = f^{-1}(f(V)) = f^{-1}(f(f^{-1}(W))) = f^{-1}(W) = V$, so $b_3 + b_4 \in V$ and hence $[b_3] + [b_4] = f(b_3 + b_4) \in f(V) \subseteq W$. So for all open neighbourhoods $W$ of $[b_1] + [b_2]$ in $B/C$ there exist open neighbourhoods $W_1$, $W_2$ of $[b_1]$ resp. $[b_2]$ in $B/C$ such that for all $[b_3] \in W_1$, $[b_4] \in W_2$ we have $[b_3] + [b_4] \in W$. This makes addition on $B/C$ continuous.
Similarly scalar multiplication is continuous and therefore $B/C$ is a topological $A$-module.
Note that $B/C$ is $T_1$ if and only if (Lemma (2.2.2)) for all $[b] \in B/C$ we have that $\{[b]\} \subseteq B/C$ is closed, which is the case if and only if (Lemma (4.1.5), $\{[b]\} = \{[0]\} + [b]$) $\{[0]\} \subseteq B/C$ is closed. By the quotient topology, $W \subseteq B/C$ is open if and only if $f^{-1}(W) \subseteq B$ is open, which implies that $\{[0]\} \subseteq B/C$ is closed if and only if $f^{-1}(\{[0]\}) = C \subseteq B$ is closed. Therefore $B/C$ is $T_1$ if and only if $C \subseteq B$ is closed and hence by Lemma (4.1.7) $B/C$ is $T_2$ if and only if $C \subseteq B$ is closed.
Now let $U \subseteq B$ be open, then $f^{-1}(f(U)) = U + C = \bigcup_{c \in C} (U + c)$, which is open (Lemma (4.1.5)) as a union of open sets, and therefore $f(U) \subseteq B/C$ is open by definition of the quotient topology. Hence $f$ is an open map.

Definition 4.1.10: Direct product
Let $A$ be a topological ring and $\{B_i \mid i \in I\}$ a collection with $B_i$ a topological $A$-module for all $i \in I$.
Then we consider $\prod_{i \in I} B_i$ as a topological $A$-module with the initial topology (Definition (2.1.18)) of the maps $\prod_{j \in I} B_j \to B_i$ defined by Equation (3.1) for all $i \in I$.

Definition 4.1.11: Direct sum
Let $A$ be a topological ring and $\{B_i \mid i \in I\}$ a collection with $B_i$ a topological $A$-module for all $i \in I$.
Then we consider $\bigoplus_{i \in I} B_i$ as a topological $A$-module with the final topology (Definition (2.1.23)) of the maps $B_i \to \bigoplus_{j \in I} B_j$ defined by Equation (3.2) for all $i \in I$.

Definition 4.1.12: Topological module dual
Let $A$ be a topological ring and $B$ a topological $A$-module. Then the (topological) dual of $B$ is defined to be

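$B' := \{f \in B^* \mid f \text{ continuous}\},$

together with the initial topology (Definition (2.1.18)) of $\{B' \to A : f \mapsto f(b) \mid b \in B\}$. This makes $B'$ a topological $A$-module.
That $B'$ is an $A$-module follows directly from the fact that $B' \le B^*$ (from the fact that $B^*$ is an $A$-module together with continuity of addition and scalar multiplication in $B$). Furthermore, for each $b \in B$ the commutative diagrams

$$
\begin{array}{ccc}
B' \times B' & \xrightarrow{\;+\;} & B' \\
{\scriptstyle (f,g) \mapsto (f(b),g(b))} \downarrow & & \downarrow {\scriptstyle f \mapsto f(b)} \\
A \times A & \xrightarrow{\;+\;} & A
\end{array}
\qquad
\begin{array}{ccc}
A \times B' & \xrightarrow{\;\cdot\;} & B' \\
{\scriptstyle (a,f) \mapsto (a,f(b))} \downarrow & & \downarrow {\scriptstyle f \mapsto f(b)} \\
A \times A & \xrightarrow{\;\cdot\;} & A
\end{array}
$$

show that addition and scalar multiplication on $B'$ are continuous because of the initial topology.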


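Example 4.1.13: Linearity does not imply continuity, $B' \subsetneq B^*$
Consider $A = \mathbb{R}$, $B = \{f : [-1, 1] \to \mathbb{R} \mid f \text{ differentiable on } [-1, 1]\}$ together with $(f + g)(x) := f(x) + g(x)$, $(\alpha f)(x) := \alpha f(x)$, $0(x) := 0$, and the norm

$\| \cdot \| : B \to \mathbb{R}, \quad \|f\| := \sup\{|f(x)| \mid x \in [-1, 1]\},$

which make $B$ a topological $\mathbb{R}$-module. Consider the map

$g : B \to \mathbb{R} : f \mapsto f'(0),$

then $g \in B^*$, while $g \notin B'$.
Certainly $g \in B^*$ because differentiation is linear. However, for the sequence

$x : \mathbb{N} \to B : k \mapsto \left( x \mapsto \frac{\sin(k^2 x)}{k} \right)$

we have for all $k \in \mathbb{N}$ that $g(x_k) = k^2 \cos(k^2 \cdot 0) / k = k$. Therefore $\lim_{k \to \infty} g(x_k) = \lim_{k \to \infty} k$, which does not exist in $\mathbb{R}$, while $\lim_{k \to \infty} \|x_k\| = \lim_{k \to \infty} \sup\{|\sin(k^2 x)| / |k| \mid x \in [-1, 1]\} \le \lim_{k \to \infty} 1/k = 0$, so $\lim_{k \to \infty} x_k = 0 \in B$ does exist. Hence $g$ is not continuous: $g \notin B'$.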
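The behaviour of the sequence $x_k$ can be observed numerically. The sketch below is an illustration only: the grid-based `sup_norm` and the central difference `derivative_at_zero` are hypothetical helpers that approximate the supremum norm on $[-1, 1]$ and the functional $g(f) = f'(0)$; they are not constructions from the thesis.

```python
import math

def x_k(k):
    """The function x -> sin(k^2 x) / k from the example."""
    return lambda x: math.sin(k * k * x) / k

def sup_norm(func, points=20001):
    """Approximate sup{|func(x)| : x in [-1, 1]} on a uniform grid."""
    grid = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(abs(func(t)) for t in grid)

def derivative_at_zero(func, h=1e-7):
    """Central difference approximation of g(func) = func'(0)."""
    return (func(h) - func(-h)) / (2.0 * h)

for k in (1, 10, 100):
    # The norm shrinks like 1/k while the derivative at 0 grows like k.
    print(k, sup_norm(x_k(k)), derivative_at_zero(x_k(k)))
```

The printed norms decrease towards $0$ while the approximated values of $g(x_k)$ grow without bound, matching the argument that $g$ is linear but not continuous.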

4.2 Normed modules

Definition 4.2.1: Normed ring
Let $A$ be a ring. Then a seminorm on $A$ is a map $| \cdot | : A \to \mathbb{R} : a \mapsto |a|$ satisfying for all $a_1, a_2 \in A$ that

• $|a_1| \ge 0$,

• $|a_1 a_2| \le |a_1| \, |a_2|$,

• $|a_1 + a_2| \le |a_1| + |a_2|$,

• $|1| = 1$, $|0| = 0$.

If in addition $|a_1| = 0 \Rightarrow a_1 = 0$, we call $| \cdot |$ a norm on $A$. A ring (resp. field) $A$ together with a (semi)norm is called a (semi)normed ring (resp. field).

Definition 4.2.2: Topology of a normed ring
Let $A$ be a (semi)normed ring. Then we consider $A$ as a (pseudo)metric space with (pseudo)metric given by

$d : A \times A \to \mathbb{R} : (a_1, a_2) \mapsto |a_2 - a_1|.$

Definition 4.2.3: Normed module ($\| \cdot \|$)
Let $A$ be a (semi)normed ring and $B$ an $A$-module. Then a seminorm on $B$ is a map $\| \cdot \| : B \to \mathbb{R} : b \mapsto \|b\|$ satisfying for all $b_1, b_2 \in B$ and $a \in A$ that

• $\|b_1\| \ge 0$,

• $\|a b_1\| = |a| \, \|b_1\|$,

• $\|b_1 + b_2\| \le \|b_1\| + \|b_2\|$.

If in addition $\|b_1\| = 0 \Rightarrow b_1 = 0$ and $A$ is a normed ring, we call $\| \cdot \|$ a norm on $B$.
An $A$-module $B$ together with a (semi)norm is called a (semi)normed $A$-module. We denote the fact that an $A$-module $B$ is a normed $A$-module by $B$ $\| \cdot \| / A$.

Definition 4.2.4: Topology of a normed module
Let $A$ be a (semi)normed ring and $B$ a (semi)normed $A$-module. Then we consider $B$ as a (pseudo)metric space with (pseudo)metric given by

$d : B \times B \to \mathbb{R} : (b_1, b_2) \mapsto \|b_2 - b_1\|.$

A topological space is called (semi)normable if it is homeomorphic to a (semi)normed module.

Lemma 4.2.5
Let $A$ be a (semi)normed ring and $B$ a (semi)normed $A$-module.
Then $A$ is a topological ring and $B$ is a topological $A$-module.

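Proof. Let $b_1, b_2 \in B$ and $b_3 \in B_B(b_1, \delta_1)$, $b_4 \in B_B(b_2, \delta_2)$. Then $d_B(b_3 + b_4, b_1 + b_2) = \|(b_3 + b_4) - (b_1 + b_2)\| = \|(b_3 - b_1) + (b_4 - b_2)\| \le \|b_3 - b_1\| + \|b_4 - b_2\| < \delta_1 + \delta_2$ for all $\delta_1, \delta_2 \in ]0, \infty[$. From this we obtain continuity of addition by Lemma (2.5.5).
Let $b_1 \in B$, $a_1 \in A$, and $b_2 \in B_B(b_1, \delta_1)$, $a_2 \in B_A(a_1, \delta_2)$. Then $d_B(a_1 b_1, a_2 b_2) = \|(a_1 b_1) - (a_2 b_2)\| = \|(a_1 b_1) - (a_2 b_1) + (a_2 b_1) - (a_2 b_2)\| \le \|(a_1 - a_2) b_1\| + \|a_2 (b_1 - b_2)\| < \delta_2 \|b_1\| + \delta_1 |a_2|$, which shows continuity of scalar multiplication.
The proof for $A$ is the same.

Theorem 4.2.6: Hahn-Banach
Let $K$ be either $\mathbb{R}$ or $\mathbb{C}$, $A$ a vector space over $K$, and $\| \cdot \| : A \to \mathbb{R}$ a seminorm.
Then for any $f \in B^*$ with $B \le A$ satisfying $|f(b)| \le \|b\|$ for all $b \in B$, there exists a $g \in A^*$ such that $g|_B = f$, and $|g(a)| \le \|a\|$ for all $a \in A$.

Proof. Suppose $K = \mathbb{R}$. Then by Theorem (3.3.22) ($f(b) \le |f(b)| \le \|b\|$ for all $b \in B$ and $\| \cdot \|$ is a seminorm) there exists a $g \in A^*$ with $g|_B = f$ and $g(a) \le \|a\|$ for all $a \in A$. Now because $g$ is linear, $-g(a) = g(-a) \le \|-a\| = \|a\|$, so $\pm g(a) \le \|a\|$, which implies that $|g(a)| \le \|a\|$ for all $a \in A$. So $g$ is the desired function.
Suppose $K = \mathbb{C}$. Then by Theorem (3.3.22) ($\operatorname{Re} f(b) \le |f(b)| \le \|b\|$ for all $b \in B$; regard $A$ and $B$ as vector spaces over $\mathbb{R}$, which is possible since $\mathbb{R} \le \mathbb{C}$) there exists an $\mathbb{R}$-linear $g : A \to \mathbb{R}$ with $g|_B = \operatorname{Re} f$ and $g(a) \le \|a\|$ for all $a \in A$. Define $h : A \to \mathbb{C}$ by $h(a) := g(a) - i \, g(i a)$. Then as $g$ is $\mathbb{R}$-linear and $h(i a) = g(i a) - i \, g(-a) = i (g(a) - i \, g(i a)) = i \, h(a)$ we find that $h$ is $\mathbb{C}$-linear. Therefore $h \in A^*$. Now $\operatorname{Re} f(b) = g(b) = \operatorname{Re} h(b)$ and $\operatorname{Im} f(b) = -\operatorname{Re}(i f(b)) = \operatorname{Re} f(-i b) = g(-i b) = -\operatorname{Re}(i h(b)) = \operatorname{Im} h(b)$ for all $b \in B$, so $h|_B = f$. Furthermore $\operatorname{Re} h(a) = g(a) \le \|a\|$ for all $a \in A$. So for any $a \in A$ with $h(a) \ne 0$,

$|h(a)| = \operatorname{Re} |h(a)| = \operatorname{Re}\left( \frac{|h(a)|}{h(a)} h(a) \right) = \operatorname{Re} h\left( \frac{|h(a)|}{h(a)} a \right) \le \left\| \frac{|h(a)|}{h(a)} a \right\| = \left| \frac{|h(a)|}{h(a)} \right| \|a\| = \|a\|.$

And for $a \in A$ with $h(a) = 0$, $|h(a)| = 0 \le \|a\|$ since $\|a\| \ge 0$ for all $a \in A$. Therefore $|h(a)| \le \|a\|$ for all $a \in A$. So $h$ is the desired function.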
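The passage from the real-linear $g$ to the complex-linear $h(a) := g(a) - i \, g(i a)$ can be illustrated on a concrete functional. The sketch below assumes a hypothetical $\mathbb{C}$-linear functional `f` on $\mathbb{C}^2$ (its coefficients are arbitrary illustrative choices), takes `g` to be its real part, and rebuilds `h` from `g` alone; it then compares `h` with `f` and checks $h(i a) = i \, h(a)$ on a few sample vectors.

```python
def f(z):
    """A sample C-linear functional on C^2 (coefficients chosen arbitrarily)."""
    z1, z2 = z
    return (2 - 1j) * z1 + (0.5 + 3j) * z2

def g(z):
    """Real part of f; this map is only R-linear."""
    return f(z).real

def times_i(z):
    """Scalar multiplication by i in C^2."""
    return (1j * z[0], 1j * z[1])

def h(z):
    """The complex functional h(a) := g(a) - i g(i a) built from g alone."""
    return complex(g(z), -g(times_i(z)))

samples = [(1 + 0j, 0j), (0j, 1 + 0j), (1 - 2j, 3 + 0.5j), (-0.25j, 4 + 4j)]

# h recovers f, and h is compatible with multiplication by i.
max_error = max(abs(h(z) - f(z)) for z in samples)
max_i_error = max(abs(h(times_i(z)) - 1j * h(z)) for z in samples)
print(max_error, max_i_error)
```

Both printed errors are zero up to rounding, in line with the identities $\operatorname{Re} h = g$ and $\operatorname{Im} h(a) = -g(i a)$ used in the proof.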


4.3 Topological vector spaces

Definition 4.3.1: Topological field (TF)
Let $A$ be a set. Then we call $A$ a topological field (denoted by $A$ TF) if $A$ is a topological space and a field such that $A$ is a topological ring and in addition $A^* \to A^* : a_1 \mapsto a_1^{-1}$ is continuous.

Example 4.3.2
Both $\mathbb{R}$ and $\mathbb{C}$ with their usual topologies are normed topological fields.

Definition 4.3.3: Topological vector space (TVs)
Let $A$ be a topological field and $B$ a set. Then we call $B$ a topological $A$-vector space (denoted by $B$ TVs$/A$) if $B$ is an $A$-vector space such that $B$ is a topological $A$-module.
The morphisms of topological vector spaces are the same as for topological modules (Definition (4.1.4)).

From now on, we will assume $K$ to be either $K = \mathbb{R}$ or $K = \mathbb{C}$, and denote the elements of $K$, called scalars because of their role in scalar multiplication, by $\alpha, \beta, \ldots$.

Definition 4.3.4: Abc subsets
Let $A$ be a topological vector space over $K$. Then we call any subset $U \subseteq A$ an abc subset of $A$ if $U$ is

absorbent: $A = \bigcup_{\alpha \in ]0, \infty[} \alpha U$,

balanced: $\forall a_1 \in U : \forall \alpha \in K : (|\alpha| \le 1 \Rightarrow \alpha a_1 \in U)$, and

convex: $\forall a_1, a_2 \in U : \forall \alpha \in [0, 1] : (\alpha a_1 + (1 - \alpha) a_2 \in U)$.

Lemma 4.3.5: Operations preserving abc
Let $A$, $B$ be topological vector spaces over $K$. The notation a/b/c is meant to indicate that the statements hold for each property (absorbent, balanced, convex) separately.

• For any a/b/c subset $C$ of $A$ and $\alpha \in K$, $\alpha \ne 0$, we have that $\alpha C$ is an a/b/c subset of $A$.

• For any balanced subset $C$ of $A$ and $\alpha, \beta \in K$, if $|\alpha| \le |\beta|$ then $\alpha C \subseteq \beta C$.

• For any balanced subset $C$ of $A$, $\operatorname{int}(C) = \bigcup_{\alpha \in B_K(0,1) \setminus \{0\}} \alpha \operatorname{int}(C)$.

• For any a/b/c subset $C$ of $A$, the closure $\overline{C}$ is an a/b/c subset of $A$.

• For any a/b/c subset $C$ of $A$ and surjective $K$-linear $f : A \to B$, $f(C)$ is an a/b/c subset of $B$.

• For any a/b/c subset $D$ of $B$ and $K$-linear $f : A \to B$, $f^{-1}(D)$ is an a/b/c subset of $A$.

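Proof. • Let $C \subseteq A$ and $\alpha \in K$, $\alpha \ne 0$. Suppose $C$ is absorbent. Let $a \in A$ be arbitrary, then there exists a $\beta \in K$ such that $a \in \beta C$, so $a \in \frac{\beta}{\alpha} (\alpha C)$, therefore $\alpha C$ is absorbent.

Suppose $C$ is balanced. Let $a \in \alpha C$ and $\beta \in K$, $|\beta| \le 1$, then $a = \alpha a_1$ for some $a_1 \in C$ and hence $\beta a = \alpha (\beta a_1) \in \alpha C$ as $C$ is balanced, so $\alpha C$ is balanced.

Suppose $C$ is convex. Let $a_1, a_2 \in \alpha C$, $\beta \in [0, 1]$, then $a_1 = \alpha a_3$, $a_2 = \alpha a_4$ for $a_3, a_4 \in C$, so $\beta a_1 + (1 - \beta) a_2 = \alpha (\beta a_3 + (1 - \beta) a_4) \in \alpha C$ as $C$ is convex, so $\alpha C$ is convex.

• Let $C \subseteq A$ be balanced, $\alpha, \beta \in K$, $|\alpha| \le |\beta|$. The case where $|\alpha| = 0$ is clear: $0 \, C = \{0\} \subseteq \beta C$, so suppose $|\alpha| > 0$. Let $a \in \alpha C$, then there exists an $a_1 \in C$ such that $a = \alpha a_1$. Hence $a = \beta \frac{\alpha}{\beta} a_1 \in \beta C$ since $|\frac{\alpha}{\beta}| = \frac{|\alpha|}{|\beta|} \le 1$ and $C$ is balanced. So $\alpha C \subseteq \beta C$.

• Let $C \subseteq A$ be balanced. Let $\alpha \in K$, $|\alpha| \le 1$, $\alpha \ne 0$. As $C$ is balanced, $\alpha C \subseteq 1 \, C = C$, so $\operatorname{int}(\alpha C) \subseteq \operatorname{int}(C)$. Since $\alpha \operatorname{int}(C) \subseteq \alpha C$ is open by Lemma (4.1.5) we therefore have $\alpha \operatorname{int}(C) \subseteq \operatorname{int}(\alpha C) \subseteq \operatorname{int}(C)$. So $\bigcup_{\alpha \in B_K(0,1) \setminus \{0\}} \alpha \operatorname{int}(C) \subseteq \operatorname{int}(C)$.

On the other hand, for $\alpha = 1 \in B_K(0, 1) \setminus \{0\}$ we have $\alpha \operatorname{int}(C) = \operatorname{int}(C)$.

• Let $C \subseteq A$. Suppose $C$ is absorbent, then as $C \subseteq \overline{C}$, $\overline{C}$ is absorbent. In the following we use that scalar multiplication and addition are continuous.

Suppose $C$ is balanced. Let $a \in \overline{C}$, $\alpha \in K$, $|\alpha| \le 1$. Note that for all $a_1 \in C$, $\alpha a_1 \in C$, so by Lemma (2.1.16) and the fact that $a \in \overline{C}$ we find that $\alpha a = \lim_{a_1 \to a} \alpha a_1 \in \overline{C}$. Hence $\overline{C}$ is balanced.

Suppose $C$ is convex. Let $a_1, a_2 \in \overline{C}$, $\alpha \in [0, 1]$. For all $a_3, a_4 \in C$ we have $\alpha a_3 + (1 - \alpha) a_4 \in C$, so by Lemma (2.1.16) we find $\alpha a_1 + (1 - \alpha) a_2 = \lim_{(a_3, a_4) \to (a_1, a_2)} (\alpha a_3 + (1 - \alpha) a_4) \in \overline{C}$, so $\overline{C}$ is convex.

• Let $C \subseteq A$ and $f : A \to B$ be linear and surjective. Suppose $C$ is absorbent, then from surjectivity and linearity, $B = f(A) = f(\bigcup_{\alpha \in ]0, \infty[} \alpha C) = \bigcup_{\alpha \in ]0, \infty[} f(\alpha C) = \bigcup_{\alpha \in ]0, \infty[} \alpha f(C)$, so $f(C)$ is absorbent.

Suppose $C$ is balanced. Let $b \in f(C)$, $\alpha \in K$, $|\alpha| \le 1$. Then there exists an $a \in C$ such that $f(a) = b$ and hence $\alpha b = \alpha f(a) = f(\alpha a) \in f(C)$ as $\alpha a \in C$, so $f(C)$ is balanced.

Suppose $C$ is convex. Let $b_1, b_2 \in f(C)$, $\alpha \in [0, 1]$, then there exist $a_1, a_2 \in C$ such that $f(a_1) = b_1$, $f(a_2) = b_2$, so $\alpha b_1 + (1 - \alpha) b_2 = \alpha f(a_1) + (1 - \alpha) f(a_2) = f(\alpha a_1 + (1 - \alpha) a_2) \in f(C)$ as $\alpha a_1 + (1 - \alpha) a_2 \in C$, so $f(C)$ is convex.

• Let $D \subseteq B$ and $f : A \to B$ be linear. Suppose $D$ is absorbent. Let $a \in A$, then $f(a) \in B$, so there exists an $\alpha \in K$ such that $f(a) \in \alpha D$, so either $\alpha = 0$ and $f(a) = 0$, which implies that $a \in f^{-1}(\{0\}) \subseteq f^{-1}(D)$, or $\alpha \ne 0$ and $f(\frac{1}{\alpha} a) \in D$, so $\frac{1}{\alpha} a \in f^{-1}(D)$, which gives $a \in \alpha f^{-1}(D)$. So $f^{-1}(D)$ is absorbent.

Suppose $D$ is balanced. Let $a \in f^{-1}(D)$, $\alpha \in K$, $|\alpha| \le 1$. Then $f(\alpha a) = \alpha f(a) \in D$, so $\alpha a \in f^{-1}(D)$ and $f^{-1}(D)$ is balanced.

Suppose $D$ is convex. Let $a_1, a_2 \in f^{-1}(D)$, $\alpha \in [0, 1]$. Then $f(\alpha a_1 + (1 - \alpha) a_2) = \alpha f(a_1) + (1 - \alpha) f(a_2) \in D$, so $\alpha a_1 + (1 - \alpha) a_2 \in f^{-1}(D)$ and $f^{-1}(D)$ is convex.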
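As a concrete instance of these properties, one can test them by sampling on the square $C = [-1, 1]^2 \subseteq \mathbb{R}^2$ (the unit ball of the maximum norm) with $K = \mathbb{R}$. The sketch below is only a sanity check on finitely many random points, not a proof; the set $C$, the helper names, and the tolerance `TOL` are illustrative assumptions.

```python
import random

TOL = 1e-12  # tolerance for floating point rounding

def in_scaled_square(p, scale=1.0):
    """Membership test for scale * C, where C = [-1, 1]^2."""
    return max(abs(p[0]), abs(p[1])) <= abs(scale) + TOL

def absorbing_scale(p):
    """Some alpha > 0 with p in alpha * C, witnessing absorbency."""
    return max(abs(p[0]), abs(p[1]), 1.0)

rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]

# Balanced: alpha * a stays in C whenever |alpha| <= 1.
balanced_ok = all(
    in_scaled_square((alpha * x, alpha * y))
    for (x, y) in points
    for alpha in (-1.0, -0.3, 0.0, 0.5, 1.0)
)

# Convex: t * p + (1 - t) * q stays in C for t in [0, 1].
convex_ok = all(
    in_scaled_square((t * p[0] + (1 - t) * q[0], t * p[1] + (1 - t) * q[1]))
    for p, q in zip(points, reversed(points))
    for t in (0.0, 0.25, 0.5, 1.0)
)

# Absorbent: every point of the plane lies in some positive multiple of C.
far_points = [(rng.uniform(-50, 50), rng.uniform(-50, 50)) for _ in range(500)]
absorbent_ok = all(in_scaled_square(p, absorbing_scale(p)) for p in far_points)

# Second bullet of the lemma: |alpha| <= |beta| gives alpha C inside beta C.
nested_ok = all(in_scaled_square((0.5 * x, 0.5 * y), 2.0) for (x, y) in points)

print(balanced_ok, convex_ok, absorbent_ok, nested_ok)
```

All four checks succeed on the sample, which is consistent with $C$ being an abc subset of $\mathbb{R}^2$ and with the inclusion $\alpha C \subseteq \beta C$ for $|\alpha| \le |\beta|$.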


Lemma 4.3.6: Absorbent and balanced topological basis
Let $A$ be a topological vector space over $K$. Then

• there exists a basis of open neighbourhoods $\mathcal{A}$ of $0$ in $A$ such that all $U \in \mathcal{A}$ are absorbent and balanced,

• for any neighbourhood $U$ of $0$ in $A$, $U$ is absorbent and there exists a $U_1 \in \mathcal{A}$ such that $U_1 + U_1 \subseteq U$.

Proof. Let $U$ be any open neighbourhood of $0$ in $A$. Let $a \in A$. As $\lim_{k \to \infty} \frac{1}{k} a = 0 \, a = 0$ (continuity of scalar multiplication), there exists a $k \in \mathbb{N}$ such that $\frac{1}{k} a \in U$, hence $a \in k U$. Therefore $A = \bigcup_{k \in \mathbb{N}} k U \subseteq \bigcup_{\alpha \in ]0, \infty[} \alpha U \subseteq A$, so $U$ is absorbent. Note that this makes any neighbourhood of $0$ in $A$ absorbent.
Since $\lim_{(\alpha, a) \to (0, 0)} \alpha a = 0$, there exists a $\delta \in ]0, \infty[$ and an open neighbourhood $U_1$ of $0$ in $A$ such that for all $\alpha \in B_K(0, \delta)$ and $a \in U_1$ we have $\alpha a \in U$. In particular for all $\alpha \in B_K(0, 1)$, $a \in U_1$, $\alpha (\frac{\delta}{2} a) \in U$. Hence $0 \in U_2 := \bigcup_{\alpha \in B_K(0,1)} \frac{\alpha \delta}{2} U_1 \subseteq U$ and by Lemma (4.1.5), $U_2$ is open as a union of open sets. Furthermore, $U_2$ is balanced by definition.
So for any neighbourhood $U$ of $0$ in $A$ there exists a balanced open neighbourhood $U_2$ of $0$ in $A$ such that $0 \in U_2 \subseteq U$.
Hence we can construct a basis of open neighbourhoods $\mathcal{A}$ that are all absorbent and balanced.
Again, let $U$ be a neighbourhood of $0$ in $A$. Then as addition is continuous and $0 + 0 = 0$, there exist open absorbent and balanced (from $\mathcal{A}$) neighbourhoods $U_1$, $U_2$ of $0$ in $A$ such that $U_1 + U_2 \subseteq U$.

In Section 4.5 we will see that also demanding convexity of the sets in $\mathcal{A}$ provides topological vector spaces with a lot more structure and will permit us to do analysis (basically, what we are doing is finding a topological basis for $A$ that more and more resembles the topological basis of a metric space, which consists of open balls that are all absorbent, balanced, and convex).
The following examples have been added to emphasise the fact that intuitive results need not be valid in the context of topological vector spaces when we do not place additional constraints on their topologies.

Example 4.3.7: Failure of an almost open mapping to be open
Let $A$ be $\mathbb{R}$ with its usual topology (which makes $A$ an F-space over $\mathbb{R}$), and let $B$ be $\mathbb{R}$ with the topology consisting of $\{\emptyset, \mathbb{R}\}$ (which makes $B$ a topological vector space over $\mathbb{R}$ and Baire).
Then the map $f := \mathrm{id}_{\mathbb{R}} : A \to B$ is continuous and $\mathbb{R}$-linear.
Let $U \subseteq A$ be open and nonempty, then there is some $a \in U$, so $B \supseteq \overline{f(U)} \supseteq \overline{\{f(a)\}} = B$ (as $B = \mathbb{R}$ is the smallest closed set containing $f(a)$ in $B$). Hence $f(U) \subseteq B = \operatorname{int}(\overline{f(U)})$, so $f$ is almost open. On the other hand, for $U = B_A(0, 1)$ we have $f(U) = B_A(0, 1) \notin \{\emptyset, \mathbb{R}\}$, so there exists an open $U \subseteq A$ for which $f(U) \subseteq B$ is not open. Hence $f$ is not open. This shows in particular that the $T_2$ demand in Theorem (4.4.3) is necessary.


Example 4.3.8: Failure of Lemma (3.3.10) for topological vector spaces
Consider the continuous $\mathbb{R}$-linear map $f := \mathrm{id}_{\mathbb{R}} : A \to B$ from Example (4.3.7). Then $\ker f = \{0\}$ and with the usual topology of $\mathbb{R}$, $A = \mathbb{R} \simeq \mathbb{R}/\{0\}$ as topological vector spaces (since $U + \{0\} = U$ for all $U \subseteq \mathbb{R}$ open, see the proof of Lemma (4.1.9)). So $A \simeq A / \ker f$. Now $B = \mathbb{R} = \mathrm{id}_{\mathbb{R}}(\mathbb{R}) = f(A)$, so by Lemma (3.3.10) we have as vector space isomorphisms $A \simeq A / \ker f \simeq f(A) = B$. However, these isomorphisms are not isomorphisms of topological vector spaces, because the topologies on $A$ and $B$ are different. Hence Lemma (3.3.10) does not hold for general topological vector spaces over $K$.
Note that Lemma (3.3.10) does hold for topological vector spaces over $K$ which satisfy the conditions of Corollary (4.4.6), because the map $A / \ker f \to f(A) : [a] \mapsto f(a)$ is continuous, $K$-linear, and bijective.

Example 4.3.9: Bijective continuous linear map with non-continuous inverse
The map from Example (4.3.7) is continuous, $\mathbb{R}$-linear, and bijective; however, as it is not open, its inverse is not continuous. This shows that not every bijective continuous linear map is an isomorphism of topological vector spaces.

In the next section, Corollary (4.4.6) shows that for topological vector spaces with a sufficient amount of structure, the problems in the above examples cannot arise.
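The phenomenon in Examples (4.3.7) to (4.3.9) already shows up on a two-point set, where it can be checked exhaustively. The sketch below is a finite analogue only and makes no claim about vector spaces: it assumes the set $\{0, 1\}$ with the discrete topology as a stand-in for the fine topology on $A$ and the topology $\{\emptyset, \{0, 1\}\}$ as a stand-in for $B$; all names are illustrative.

```python
def is_continuous(func, domain, domain_topology, codomain_topology):
    """A map is continuous if preimages of open sets are open."""
    for open_set in codomain_topology:
        preimage = frozenset(x for x in domain if func(x) in open_set)
        if preimage not in domain_topology:
            return False
    return True

def is_open_map(func, domain_topology, codomain_topology):
    """A map is open if images of open sets are open."""
    return all(
        frozenset(func(x) for x in open_set) in codomain_topology
        for open_set in domain_topology
    )

points = frozenset({0, 1})
discrete = {frozenset(), frozenset({0}), frozenset({1}), points}  # fine topology
indiscrete = {frozenset(), points}                                 # coarse topology

def identity(x):
    return x

# Identity from the fine to the coarse topology: continuous and bijective, not open.
forward_continuous = is_continuous(identity, points, discrete, indiscrete)
forward_open = is_open_map(identity, discrete, indiscrete)
# Its inverse (again the identity, with the topologies swapped) is not continuous.
inverse_continuous = is_continuous(identity, points, indiscrete, discrete)

print(forward_continuous, forward_open, inverse_continuous)
```

The identity is continuous and bijective in one direction but fails to be open, so its inverse is not continuous, mirroring the behaviour of $\mathrm{id}_{\mathbb{R}} : A \to B$ above.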

4.4 F-spaces

In this section we follow [Hus1965].

Definition 4.4.1: F-space (FS)
Let $A$ be a set. Then we call $A$ an F-space over $K$ (denoted by $A$ FS$/K$) if $A$ is a topological vector space over $K$ and a metric space with a translation invariant metric $d$ (that is, $d(a_1 + a_3, a_2 + a_3) = d(a_1, a_2)$ for all $a_1, a_2, a_3 \in A$) such that $A$ is complete with respect to this metric.

We will prove the open mapping and closed graph theorems in two stages.

Theorem 4.4.2
Let $A$, $B$ be topological vector spaces over $K$, with $B$ Baire (Definition (2.5.16)). Then

• any $K$-linear $f : A \to B$ that is surjective is almost open,

• any $K$-linear $g : B \to A$ is almost continuous.

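Proof. • Let $f : A \to B$ be linear over $\mathbb{K}$ and surjective. Let $U$ be any open neighbourhood of $0$ in $A$, then by Lemma (4.3.6) there exists an open neighbourhood $U_1$ of $0$ in $A$ that is balanced, satisfies $A = \bigcup_{k \in \mathbb{N}} k\,U_1$, and $U_1 + U_1 \subseteq U$. Using surjectivity and linearity of $f$, we therefore find

$$B = f(A) = f\Big(\bigcup_{k \in \mathbb{N}} k\,U_1\Big) = \bigcup_{k \in \mathbb{N}} f(k\,U_1) = \bigcup_{k \in \mathbb{N}} k\,f(U_1) \subseteq \bigcup_{k \in \mathbb{N}} k\,\overline{f(U_1)} \subseteq B.$$

Because $B$ is Baire and has nonempty interior itself (being $B$), there exists a $k \in \mathbb{N}$ such that $\operatorname{int}(k\,\overline{f(U_1)}) \neq \emptyset$, therefore (Lemma (4.1.5)) $\operatorname{int}(\overline{f(U_1)}) \neq \emptyset$. Hence there exists a $b \in \operatorname{int}(\overline{f(U_1)})$.

Since $U_1$ is balanced and $f$ is linear, by Lemma (4.3.5) $f(U_1)$ and $\overline{f(U_1)}$ are balanced. Therefore (again Lemma (4.3.5)) $\operatorname{int}(\overline{f(U_1)}) = -\operatorname{int}(\overline{f(U_1)})$. Hence

$$0 = b - b \in \operatorname{int}(\overline{f(U_1)}) - \operatorname{int}(\overline{f(U_1)}) = \operatorname{int}(\overline{f(U_1)}) + \operatorname{int}(\overline{f(U_1)}) \subseteq \operatorname{int}(\overline{f(U_1)} + \overline{f(U_1)}) \subseteq \operatorname{int}(\overline{f(U_1) + f(U_1)}) = \operatorname{int}(\overline{f(U_1 + U_1)}) \subseteq \operatorname{int}(\overline{f(U)})$$

(using continuity of addition for Lemma (2.1.16), and linearity of $f$). Hence $f(0) = 0 \in \operatorname{int}(\overline{f(U)})$ for any open neighbourhood $U$ of $0$ in $A$. As $f$ is linear we therefore find that $f$ is almost open.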

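• Let $g : B \to A$ be linear over $\mathbb{K}$. Let $U$ be any open neighbourhood of $g(0) = 0$ in $A$, then by Lemma (4.3.6) there exists an open neighbourhood $U_1$ of $0$ in $A$ that is balanced, satisfies $A = \bigcup_{k \in \mathbb{N}} k\,U_1$, and $U_1 + U_1 \subseteq U$. Then

$$B = g^{-1}(A) = g^{-1}\Big(\bigcup_{k \in \mathbb{N}} k\,U_1\Big) = \bigcup_{k \in \mathbb{N}} k\,g^{-1}(U_1) \subseteq \bigcup_{k \in \mathbb{N}} k\,\overline{g^{-1}(U_1)} \subseteq B.$$

So as $B$ is Baire, there exists a $b \in \operatorname{int}(\overline{g^{-1}(U_1)})$. Using Lemma (4.3.5) ($U_1$ is balanced and $g$ is linear) we see that $\operatorname{int}(\overline{g^{-1}(U_1)}) = -\operatorname{int}(\overline{g^{-1}(U_1)})$, so

$$0 = b - b \in \operatorname{int}(\overline{g^{-1}(U_1)}) - \operatorname{int}(\overline{g^{-1}(U_1)}) = \operatorname{int}(\overline{g^{-1}(U_1)}) + \operatorname{int}(\overline{g^{-1}(U_1)}) \subseteq \operatorname{int}(\overline{g^{-1}(U_1)} + \overline{g^{-1}(U_1)}) \subseteq \operatorname{int}(\overline{g^{-1}(U_1 + U_1)}) \subseteq \operatorname{int}(\overline{g^{-1}(U)}).$$

From linearity of $g$ and the fact that $0 \in \operatorname{int}(\overline{g^{-1}(U)}) \subseteq \overline{g^{-1}(U)}$ for all open neighbourhoods $U$ of $g(0)$ in $A$ we therefore find that $g$ is almost continuous.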

Theorem 4.4.3: Open mapping theorem

Let $A$ be an F-space over $\mathbb{K}$ and $B$ a Hausdorff topological vector space over $\mathbb{K}$. Then any linear map $f : A \to B$ over $\mathbb{K}$ that is almost open and continuous is in fact open.

Proof. Let $U \subseteq A$ be open, then by Lemma (4.1.6) and Definition (4.5.1) we have that for any $a \in U$ there exists a neighbourhood $U_a$ of $0$ in $A$ such that $a + U_a \subseteq U$. Using linearity of $f$ we therefore find that

$$f(U) \supseteq f\Big(\bigcup_{a \in U} (a + U_a)\Big) = \bigcup_{a \in U} (f(a) + f(U_a)) \supseteq \bigcup_{a \in U} (f(a) + \{0\}) = f(U).$$

Hence, if for all neighbourhoods $U_a$ of $0$ in $A$ we can find an open neighbourhood $V_a$ of $0$ in $B$ such that $\{0\} \subseteq V_a \subseteq f(U_a)$, we would obtain that

$$f(U) = \bigcup_{a \in U} (f(a) + V_a),$$

which is open by Lemma (4.1.5). Since $U$ was arbitrary, this would imply that $f$ is open. To prove the theorem it is therefore sufficient to show that for any neighbourhood $U$ of $0$ in $A$ there exists an open neighbourhood $V$ of $0$ in $B$ such that $V \subseteq f(U)$. Conversely, any open map has this property.


Construct a collection $U_k := \overline{B_A(0, 2^{-k})} \subseteq A$ for all $k \in \mathbb{N}$, then by definition $U_1 \supseteq U_2 \supseteq \ldots$. Let $a_1, a_2 \in U_{k+1}$, then $d(a_1 + a_2, 0) = d(a_1, -a_2) \leq d(a_1, 0) + d(0, -a_2) = d(a_1, 0) + d(a_2, 0) \leq 2^{-(k+1)} + 2^{-(k+1)} = 2^{-k}$, so $a_1 + a_2 \in U_k$. Therefore $U_k \supseteq U_{k+1} + U_{k+1}$ for all $k \in \mathbb{N}$.

Note that for any $k \in \mathbb{N}$, $0 \in B_A(0, 2^{-k}) \subseteq U_k$, so $U_k$ is a closed neighbourhood of $0$ in $A$. Let $U$ be any open neighbourhood of $0$ in $A$, then ($A$ is a metric space) there exists an $\varepsilon \in\, ]0, \infty[$ such that $B_A(0, \varepsilon) \subseteq U$. Therefore, for $k = \lceil \log_2(1/\varepsilon) \rceil + 1 \in \mathbb{N}$ we have $2^{-k} < \varepsilon$, so $0 \in U_k \subseteq B_A(0, \varepsilon) \subseteq U$. Hence $U_1, U_2, \ldots$ is a basis of closed neighbourhoods of $0$ in $A$.

Define for all $k \in \mathbb{N}$, $V_k := \operatorname{int}(\overline{f(U_k)})$. Because $f$ is almost open and linear, $0 = f(0) \in f(U_k) \subseteq \operatorname{int}(\overline{f(U_k)}) = V_k \subseteq \overline{f(U_k)}$, so all $V_k$ are open neighbourhoods of $0$ in $B$ and $0 \in V_k \subseteq \overline{f(U_k)}$ for all $k \in \mathbb{N}$. To prove the theorem it is therefore sufficient to show that $V_{k+1} \subseteq f(U_k)$ for all $k \in \mathbb{N}$, because $U_1, U_2, \ldots$ forms a basis of neighbourhoods of $0$ in $A$.

Fix $k \in \mathbb{N}$ and $b \in V_{k+1}$. We are going to inductively create a sequence in $U_k$ that approximates $b$. As $b \in V_{k+1} \subseteq \overline{f(U_{k+1})}$ and $b - V_{k+2}$ is an open neighbourhood of $b$ in $B$, by Lemma (2.1.3) $f(U_{k+1}) \cap (b - V_{k+2}) \neq \emptyset$: there is some $a_1 \in U_{k+1}$ with $b - f(a_1) \in V_{k+2}$.

Now suppose that we have constructed $a_1, \ldots, a_l$ such that $b - f\big(\sum_{m=1}^{l} a_m\big) \in V_{k+1+l}$ and $a_m \in U_{k+m}$ for all $1 \leq m \leq l$. Let $b' := b - f(a_1 + \ldots + a_l) \in V_{k+1+l} \subseteq \overline{f(U_{k+1+l})}$. Then $b' - V_{k+1+l+1}$ is an open neighbourhood of $b'$ in $B$, so $f(U_{k+1+l}) \cap (b' - V_{k+1+l+1}) \neq \emptyset$. This implies that there exists an $a_{l+1} \in U_{k+1+l}$ with $b - f(a_1 + \ldots + a_l + a_{l+1}) = (b - f(a_1 + \ldots + a_l)) - f(a_{l+1}) = b' - f(a_{l+1}) \in V_{k+1+l+1}$. Using induction this permits us to construct a sequence $\mathbb{N} \to A : l \mapsto a_l$ satisfying for all $l \in \mathbb{N}$ that

$$a_l \in U_{k+l}, \qquad b - f\Big(\sum_{m=1}^{l} a_m\Big) \in V_{k+l+1}. \tag{4.1}$$

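Let $l, m \in \mathbb{N}$, then because $U_1 \supseteq U_2 + U_2 \supseteq \ldots$ we have

\begin{align*}
\sum_{n=1}^{l} a_{n+m} &= a_{m+1} + a_{m+2} + \ldots + a_{m+l-2} + a_{m+l-1} + a_{m+l} \\
&\in U_{k+m+1} + U_{k+m+2} + \ldots + U_{k+m+l-2} + U_{k+m+l-1} + U_{k+m+l} \\
&\subseteq U_{k+m+1} + U_{k+m+2} + \ldots + U_{k+m+l-2} + U_{k+m+l-1} + U_{k+m+l-1} \\
&\subseteq U_{k+m+1} + U_{k+m+2} + \ldots + U_{k+m+l-2} + U_{k+m+l-2} \\
&\;\;\vdots \\
&\subseteq U_{k+m+1} + U_{k+m+1} \\
&\subseteq U_{k+m}.
\end{align*}

So $\sum_{n=1}^{l} a_{n+m} \in U_{k+m}$ for any $l, m \in \mathbb{N}$.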
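At the level of radii, the telescoping argument above says that the tail of a geometric series is bounded by the term preceding it. The following is a minimal numerical sketch of that fact, assuming the model case $A = \mathbb{R}$ with $d(x, y) = |x - y|$, so that $U_j = [-2^{-j}, 2^{-j}]$. The helper names `radius` and `tail_bound_holds` are illustrative and do not occur in the thesis.

```python
from fractions import Fraction

def radius(j: int) -> Fraction:
    """Radius 2**(-j) of the closed ball U_j in the model case A = R."""
    return Fraction(1, 2 ** j)

def tail_bound_holds(k: int, m: int, l: int) -> bool:
    """Check that the worst-case sum a_{m+1} + ... + a_{m+l},
    with |a_{m+n}| = radius(k + m + n), still lies in U_{k+m}."""
    worst_case = sum(radius(k + m + n) for n in range(1, l + 1))
    return worst_case <= radius(k + m)

# Exhaustive check over a small index range, in exact rational arithmetic.
all_ok = all(
    tail_bound_holds(k, m, l)
    for k in range(0, 6)
    for m in range(0, 6)
    for l in range(1, 12)
)
print(all_ok)
```

Exact fractions are used so that the comparison is not affected by floating point rounding.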


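Let $\varepsilon \in\, ]0, \infty[$ be given and pick $l = \lceil \log_2(1/\varepsilon) \rceil + 1 \in \mathbb{N}$, then $2^{-l} < \varepsilon$. Let $m, n \geq l$, $m \leq n$. Then

$$\sum_{o=1}^{n} a_o - \sum_{p=1}^{m} a_p = \sum_{o=1}^{n-m} a_{o+m} \in U_{k+m} \subseteq U_{k+l}$$

and hence $d\big(\sum_{o=1}^{n} a_o, \sum_{p=1}^{m} a_p\big) \leq 2^{-(k+l)} \leq 2^{-l} < \varepsilon$. Therefore the sequence $l \mapsto \sum_{m=1}^{l} a_m$ is Cauchy and because $A$ is complete, there exists an $a \in A$ such that (recall that $U_k$ is closed, use Lemma (2.1.16))

$$a = \lim_{l \to \infty} \sum_{m=1}^{l} a_m = \sum_{l=1}^{\infty} a_{l+0} \in U_{k+0} = U_k.$$

As $f$ is continuous we find for this $a$ that

$$b - f(a) = b - f\Big(\lim_{l \to \infty} \sum_{m=1}^{l} a_m\Big) = \lim_{l \to \infty} \Big(b - f\Big(\sum_{m=1}^{l} a_m\Big)\Big).$$

By Equation (4.1) we see that for any $l \in \mathbb{N}$ we have that for all $m \in \mathbb{N}$

$$b - f\Big(\sum_{n=1}^{l+m} a_n\Big) \in V_{k+l+1},$$

so taking the limit $m \to \infty$ we obtain that for any $l \in \mathbb{N}$, $b - f(a) \in \overline{V_{k+l+1}}$. Hence

$$b - f(a) \in \bigcap_{l \in \mathbb{N}} \overline{f(U_{k+l+1})}.$$

Let $b_1 \in \bigcap_{l \in \mathbb{N}} \overline{f(U_{k+l+1})}$ be arbitrary. Let $V$ be an arbitrary open neighbourhood of $0$ in $B$. Then for all $l \in \mathbb{N}$, $f(U_{k+l+1}) \cap (b_1 + V) \neq \emptyset$, so we obtain for each $l \in \mathbb{N}$ an $a'_l \in U_{k+l+1}$ such that $f(a'_l) \in b_1 + V$. Because the $U_{k+l+1}$ form a decreasing basis of neighbourhoods of $0$ in $A$, $\lim_{l \to \infty} a'_l = 0$. Hence ($f$ is continuous) $\lim_{l \to \infty} f(a'_l) = 0$, so $-b_1 = \lim_{l \to \infty} (f(a'_l) - b_1) \in \overline{V}$. So $-b_1 \in \overline{V}$ for any open neighbourhood $V$ of $0$ in $B$.

Suppose $-b_1 \neq 0$, then ($B$ is Hausdorff) there exist open neighbourhoods $V$ of $0$ and $V_1$ of $-b_1$ in $B$ such that $V \cap V_1 = \emptyset$. By the above $-b_1 \in \overline{V} \subseteq B \setminus V_1$ (as $B \setminus V_1$ is closed and $V \subseteq B \setminus V_1$), so $-b_1 \notin V_1$. This contradicts the fact that $V_1$ is an open neighbourhood of $-b_1$. Hence $-b_1 = 0$ and therefore $b_1 = 0$. On the other hand, for all $l \in \mathbb{N}$, $0 = f(0) \in f(U_{k+l+1}) \subseteq \overline{f(U_{k+l+1})}$, so

$$\bigcap_{l \in \mathbb{N}} \overline{f(U_{k+l+1})} = \{0\}.$$

Therefore $b - f(a) = 0$ and $b = f(a)$. Hence, for any $b \in V_{k+1}$ there exists an $a \in U_k$ such that $f(a) = b$, hence $V_{k+1} \subseteq f(U_k)$ for all $k \in \mathbb{N}$ and the theorem is proven.

Theorem 4.4.4: Closed graph theorem

Let $A$ be an F-space over $\mathbb{K}$ and $B$ a topological vector space over $\mathbb{K}$. Then any linear map $g : B \to A$ over $\mathbb{K}$ that is almost continuous and for which $\operatorname{graph}(g) \subseteq B \times A$ is closed, is in fact continuous.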


Proof. As $g$ is linear it is sufficient to show that $g$ is continuous at $0$.

We are now going to show that we may assume $g$ to be injective without loss of generality. Let $f : B \to B/\ker g : b \mapsto [b]$. Then $g$ factorises uniquely (Lemma (3.3.10)) as a map $g = h \circ f$ with $h : B/\ker g \to A : [b] \mapsto g(b)$. $g$ is almost continuous by assumption, so for any $b \in B$ and open neighbourhood $U$ of $g(b)$ in $A$ we have that $\overline{g^{-1}(U)}$ is a neighbourhood of $b$. As $g^{-1}(U) = f^{-1}(h^{-1}(U))$ and $f$ is continuous, $\overline{h^{-1}(U)} = \overline{f(g^{-1}(U))} \supseteq f(\overline{g^{-1}(U)})$. Because $f$ is open (Lemma (4.1.9)), this implies that $\overline{h^{-1}(U)}$ is a neighbourhood of $[b]$ in $B/\ker g$. This shows that $h$ is almost continuous. Now if $h$ is continuous, then $g = h \circ f$ is continuous. Therefore it is sufficient to prove the theorem for $h$, which is injective by Lemma (3.3.10), and we may assume $g$ to be injective.

Construct the same countable basis of closed neighbourhoods of $0$ in $A$ as in Theorem (4.4.3): $U_1 \supseteq U_2 \supseteq \ldots$ satisfying $U_k \supseteq U_{k+1} + U_{k+1}$ for each $k \in \mathbb{N}$. Define for all $k \in \mathbb{N}$, $V_k := \operatorname{int}(\overline{g^{-1}(U_k)})$. Because $g$ is almost continuous, $\overline{g^{-1}(U_k)}$ is a neighbourhood of $0 = g(0)$ in $B$, so $0 \in V_k \subseteq \overline{g^{-1}(U_k)}$ for each $k \in \mathbb{N}$. To prove the theorem it is therefore sufficient to show that $g(V_{k+1}) \subseteq U_k$ for each $k \in \mathbb{N}$ (as the $U_k$ form a basis of closed neighbourhoods of $0$ in $A$ and all $V_k$ are open neighbourhoods of $0$ in $B$).

Fix $k \in \mathbb{N}$ and $b \in V_{k+1}$. We are going to create a sequence in $A$ that approximates $g(b)$. As $b \in V_{k+1} \subseteq \overline{g^{-1}(U_{k+1})}$ and $b - V_{k+2}$ is an open neighbourhood of $b$ in $B$ by Lemma (4.1.5), $g^{-1}(U_{k+1}) \cap (b - V_{k+2}) \neq \emptyset$. So there exists some $b_1 \in g^{-1}(U_{k+1}) \cap (b - V_{k+2})$, hence $b - b_1 \in V_{k+2}$ and $g(b_1) \in U_{k+1}$.

Now suppose we have constructed $b_1, \ldots, b_l$ such that $g(b_m) \in U_{k+m}$ for $1 \leq m \leq l$ and $b - \sum_{m=1}^{l} b_m \in V_{k+l+1}$. Let $b' := b - (b_1 + \ldots + b_l) \in V_{k+l+1} \subseteq \overline{g^{-1}(U_{k+l+1})}$, then $b' - V_{k+1+l+1}$ is an open neighbourhood of $b'$ in $B$, so there exists some $b_{l+1} \in g^{-1}(U_{k+l+1}) \cap (b' - V_{k+1+l+1})$. Hence $b - (b_1 + \ldots + b_l + b_{l+1}) = b' - b_{l+1} \in V_{k+1+l+1}$ and $g(b_{l+1}) \in U_{k+l+1}$. Using induction this permits us to construct a sequence $\mathbb{N} \to B : l \mapsto b_l$, satisfying for all $l \in \mathbb{N}$ that

$$g(b_l) \in U_{k+l}, \qquad b - \sum_{m=1}^{l} b_m \in V_{k+l+1}.$$

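As $U_1 \supseteq U_2 + U_2 \supseteq \ldots$ we find via the same reasoning as in Theorem (4.4.3) that $l \mapsto \sum_{m=1}^{l} g(b_m)$ is a Cauchy sequence in $A$ and because $A$ is complete there exists an $a \in A$ such that

$$a = \sum_{l=1}^{\infty} g(b_l) \in \overline{U_k} = U_k.$$

Now as for any $l, m \in \mathbb{N}$

$$b - \sum_{n=1}^{l+m} b_n \in V_{k+l+m+1} \subseteq V_{k+l+1}$$

we find taking the limit of $m \to \infty$ that for all $l \in \mathbb{N}$, $b - \sum_{m=1}^{\infty} b_m \in \overline{V_{k+l+1}} = \overline{g^{-1}(U_{k+l+1})}$. Hence

$$b - \sum_{m=1}^{\infty} b_m \in \bigcap_{l \in \mathbb{N}} \overline{g^{-1}(U_{k+l+1})}.$$

Let $b_1 \in \bigcap_{l \in \mathbb{N}} \overline{g^{-1}(U_{k+l+1})}$. Let $V \times U$ be any open neighbourhood of $(0, 0)$ in $B \times A$. Then for all $l \in \mathbb{N}$, $b_1 \in \overline{g^{-1}(U_{k+l+1})} \cap (b_1 + V)$, so there exists a $b'_l \in b_1 + V$ with $g(b'_l) \in U_{k+l+1}$. As the $\{U_{k+l+1} \mid l \in \mathbb{N}\}$ form a descending basis of neighbourhoods of $0$ in $A$, there exists an $l \in \mathbb{N}$ such that $U_{k+l+1} \subseteq U$. Hence $(b'_l, g(b'_l)) \in ((b_1 + V) \times U) \cap \operatorname{graph}(g)$. So $((b_1 + V) \times (0 + U)) \cap \operatorname{graph}(g) \neq \emptyset$ for all open neighbourhoods $V \times U$ of $(0, 0)$ in $B \times A$. Hence as $\operatorname{graph}(g)$ is closed, $(b_1, 0) \in \operatorname{graph}(g)$, so $g(b_1) = 0$. On the other hand, for all $l \in \mathbb{N}$, $0 \in g^{-1}(\{0\}) \subseteq g^{-1}(U_{k+l+1})$, so

$$\{0\} \subseteq \bigcap_{l \in \mathbb{N}} \overline{g^{-1}(U_{k+l+1})} \subseteq \ker g.$$

As $g$ is injective, $\ker g = \{0\}$ and therefore $b - \sum_{m=1}^{\infty} b_m = 0$.

Now consider the sequence $\mathbb{N} \to \operatorname{graph}(g) : l \mapsto \big(b - \sum_{m=1}^{l} b_m,\; g(b - \sum_{m=1}^{l} b_m)\big) = \big(b - \sum_{m=1}^{l} b_m,\; g(b) - \sum_{m=1}^{l} g(b_m)\big)$, which as $l \to \infty$ goes to $(0, g(b) - a) \in \operatorname{graph}(g)$ as $\operatorname{graph}(g)$ is closed. Hence $g(b) - a = g(0) = 0$, so $g(b) = a \in U_k$ and therefore $g(V_{k+1}) \subseteq U_k$ for all $k \in \mathbb{N}$ and the theorem is proven.

We can now combine the above results with Theorem (4.4.2).

Theorem 4.4.5: Banach

Let $A$ be an F-space over $\mathbb{K}$ and $B$ a Hausdorff Baire topological vector space over $\mathbb{K}$. Then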

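• any continuous linear map $f : A \to B$ over $\mathbb{K}$ is surjective if and only if $f$ is open,
• any linear map $g : B \to A$ over $\mathbb{K}$ is continuous if and only if $\operatorname{graph}(g) \subseteq B \times A$ is closed.

Proof. • Let $f : A \to B$ be continuous and linear over $\mathbb{K}$. Suppose $f$ is surjective, then by Theorem (4.4.2) ($B$ Baire), $f$ is almost open. Hence by Theorem (4.4.3) ($B$ Hausdorff), $f$ is open. Conversely, suppose $f$ is open. Then for any open neighbourhood $U$ of $0$ in $A$, $f(U)$ is an open neighbourhood of $0$ in $B$. By Lemma (4.3.6), $U$ and $f(U)$ are absorbent. Since $f$ is linear we therefore obtain that

$$B = \bigcup_{\alpha \in ]0, \infty[} \alpha\, f(U) = \bigcup_{\alpha \in ]0, \infty[} f(\alpha\, U) = f\Big(\bigcup_{\alpha \in ]0, \infty[} \alpha\, U\Big) = f(A).$$

Therefore $f$ is surjective.

• Let $g : B \to A$ be linear over $\mathbb{K}$. Suppose $\operatorname{graph}(g)$ is closed, then by Theorem (4.4.2), $g$ is almost continuous. Hence by Theorem (4.4.4), $g$ is continuous. Conversely, if $g$ is continuous, by Lemma (2.2.7), $\operatorname{graph}(g)$ is closed since $A$ is Hausdorff by Theorem (2.5.10).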


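Corollary 4.4.6

Let $A$ be an F-space over $\mathbb{K}$, $B$ a Hausdorff Baire topological vector space over $\mathbb{K}$, and $f : A \to B$. If $f$ is continuous, linear over $\mathbb{K}$ and bijective, then its inverse is continuous, linear over $\mathbb{K}$ and bijective. If this is the case, then $B$ is also an F-space over $\mathbb{K}$.

Proof. Suppose $f$ is continuous, linear over $\mathbb{K}$ and bijective. Because $f$ is bijective and linear, there exists an inverse $g : B \to A$ which is bijective and linear. $B$ is Baire and $f$ is surjective because $f$ is bijective, so by Theorem (4.4.2) $f$ is almost open. $A$ is an F-space and $B$ is Hausdorff, so by Theorem (4.4.3) $f$ is open. In particular, for any open $U \subseteq A$, $g^{-1}(U) = f(U) \subseteq B$ is open ($g$ is bijective with inverse $f$), so $g$ is continuous by Lemma (2.1.14). This makes $B$ isomorphic to $A$ as a topological vector space and hence metrisable with a translation invariant ($f$ and $g$ are linear) metric and complete (use $f$ and $g$ to move Cauchy sequences between $A$ and $B$). Hence $B$ is an F-space over $\mathbb{K}$.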

4.5 Local convexity

Definition 4.5.1: Locally convex topological vector spaces (LC)

Let $A$ be a topological vector space over $\mathbb{K}$. Then we call $A$ locally convex (abbreviated LC) if there exists a basis $\mathcal{A}$ of open neighbourhoods of $0$ in $A$ such that all $U \in \mathcal{A}$ are abc subsets of $A$.

Lemma 4.5.2

Let $A$ be a topological vector space over $\mathbb{K}$. Then for any open neighbourhood $U$ of $0$ in $A$ that is an abc subset of $A$, the map $\|\cdot\|_U : A \to \mathbb{R} : a \mapsto \inf\{\alpha \in\, ]0, \infty[ \;\mid\; a \in \alpha\, U\}$ is a seminorm on $A$ and for all $\alpha \in\, ]0, \infty[$,

$$\alpha\, U = \|\cdot\|_U^{-1}([0, \alpha[).$$

Conversely, for any seminorm $\|\cdot\| : A \to \mathbb{R}$, the set $\|\cdot\|^{-1}([0, 1[)$ is an abc subset of $A$.

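Proof. By definition $\|a\|_U \geq 0$ for all $a \in A$ and since $U$ is absorbent we have that $\forall a \in A : \exists \alpha \in\, ]0, \infty[ \;: a \in \alpha\, U$, so $\|a\|_U \in [0, \alpha] \subseteq \mathbb{R}$ exists for all $a \in A$. For $\alpha = 0$ we clearly have $\|\alpha a\|_U = \inf\, ]0, \infty[ \;= 0 = 0\,\|a\|_U$. If $\alpha \neq 0$ we see that for any $\beta \in\, ]0, \infty[$, $\alpha a \in \beta\, U$ iff $\alpha a = \beta a_1$ for some $a_1 \in U$ iff $a = \frac{\beta}{|\alpha|} \frac{|\alpha|}{\alpha} a_1$ for some $a_1 \in U$ iff $a = \frac{\beta}{|\alpha|} a_2$ for some $a_2 \in U$ (because $U$ is balanced and $\big| |\alpha|/\alpha \big| = 1$) iff $a \in \frac{\beta}{|\alpha|} U$. Therefore $\|\alpha a\|_U = \inf\{\beta \in\, ]0, \infty[ \;\mid\; \alpha a \in \beta\, U\} = \inf\{\beta \in\, ]0, \infty[ \;\mid\; a \in \frac{\beta}{|\alpha|} U\} = |\alpha|\, \|a\|_U$. So $\|\alpha a\|_U = |\alpha|\, \|a\|_U$ for all $a \in A$, $\alpha \in \mathbb{K}$.

Let $a_1 \in \alpha_1 U$ and $a_2 \in \alpha_2 U$ for $\alpha_1, \alpha_2 \in\, ]0, \infty[$, then there exist $a_3, a_4 \in U$ such that $a_1 = \alpha_1 a_3$, $a_2 = \alpha_2 a_4$. Because of this $a_1 + a_2 = \alpha_1 a_3 + \alpha_2 a_4 = (\alpha_1 + \alpha_2)\big(\frac{\alpha_1}{\alpha_1 + \alpha_2} a_3 + \frac{\alpha_2}{\alpha_1 + \alpha_2} a_4\big) \in (\alpha_1 + \alpha_2)\, U$, as $U$ is convex and $\frac{\alpha_1}{\alpha_1 + \alpha_2} + \frac{\alpha_2}{\alpha_1 + \alpha_2} = 1$. Therefore $\|a_1 + a_2\|_U \leq \|a_1\|_U + \|a_2\|_U$. This makes $\|\cdot\|_U$ a seminorm on $A$.

Now let $\alpha \in\, ]0, \infty[$ be given. Let $a \in A$ and suppose $\|a\|_U < \alpha$, then there exists a $\beta \in [0, \alpha[$ such that $a \in \beta\, U \subseteq \alpha\, U$ by definition of the infimum. Therefore $\|\cdot\|_U^{-1}([0, \alpha[) \subseteq \alpha\, U$. Let $a \in \alpha\, U$, then $a = \alpha a_1$ for some $a_1 \in U$. Because the map $\mathbb{R} \to A : \beta \mapsto \beta a_1$ is continuous and $\alpha\, U$ is an open neighbourhood of $\alpha a_1$ in $A$, there exists a $\delta \in\, ]0, \infty[$ such that $\beta a_1 \in \alpha\, U$ for all $\beta \in\, ]\alpha(1 - \delta), \alpha(1 + \delta)[$. But then $(1 + \delta/2)\, a = \alpha (1 + \delta/2)\, a_1 \in \alpha\, U$, so $a \in (\alpha/(1 + \delta/2))\, U$ and $\|a\|_U \leq \alpha/(1 + \delta/2) < \alpha$. Therefore $\alpha\, U \subseteq \|\cdot\|_U^{-1}([0, \alpha[)$.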

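Let $\|\cdot\| : A \to \mathbb{R}$ be a seminorm and define $U := \|\cdot\|^{-1}([0, 1[) \subseteq A$. Let $a \in A$. Suppose $\|a\| = 0$, then $a \in U$ directly. Otherwise $\|a\| > 0$ (as $\|a\| \geq 0$), so we can define $a_1 := \frac{1}{2\|a\|}\, a$. Then $\|a_1\| = \frac{1}{2} \in [0, 1[$, so $a_1 \in U$ and therefore $a \in 2\|a\|\, U$. Hence $U$ is absorbent.

Let $a \in U$, $\alpha \in \overline{B_{\mathbb{K}}(0, 1)}$, then $0 \leq \|\alpha a\| = |\alpha|\, \|a\| \leq 1 \cdot \|a\| < 1$, so $\alpha a \in U$: $U$ is balanced. Let $a_1, a_2 \in U$, $\alpha \in [0, 1]$. Then $0 \leq \|\alpha a_1 + (1 - \alpha) a_2\| \leq |\alpha|\, \|a_1\| + |1 - \alpha|\, \|a_2\| < \alpha \cdot 1 + (1 - \alpha) \cdot 1 = 1$, so $\alpha a_1 + (1 - \alpha) a_2 \in U$. Hence $U$ is convex. So $U$ is an abc subset of $A$.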
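The seminorm $\|\cdot\|_U$ of Lemma (4.5.2) can be approximated numerically from a membership test for $U$ alone. The following is a minimal sketch, assuming the concrete example $A = \mathbb{R}^2$ with $U$ the open ellipse $\{(x, y) \mid x^2/4 + y^2 < 1\}$, which is an abc set whose seminorm is $\sqrt{x^2/4 + y^2}$. The names `in_U` and `gauge` and the choice of $U$ are illustrative assumptions, not part of the thesis.

```python
import math

def in_U(a):
    """Membership test for the open ellipse U = {(x, y) : x^2/4 + y^2 < 1}."""
    x, y = a
    return x * x / 4.0 + y * y < 1.0

def gauge(a, member=in_U, iterations=100):
    """Approximate inf{alpha > 0 : a in alpha*U} by bisection.

    Uses that U is absorbent (some multiple of U contains a) and balanced
    (a in alpha*U implies a in beta*U for every beta >= alpha)."""
    def in_scaled(alpha):
        return member((a[0] / alpha, a[1] / alpha))

    hi = 1.0
    while not in_scaled(hi):        # terminates because U is absorbent
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):     # fixed iteration count, so no stalling
        mid = 0.5 * (lo + hi)
        if in_scaled(mid):
            hi = mid
        else:
            lo = mid
    return hi

a1, a2 = (1.0, 1.0), (-3.0, 0.5)
exact = lambda a: math.sqrt(a[0] ** 2 / 4.0 + a[1] ** 2)

print(abs(gauge(a1) - exact(a1)) < 1e-9)                      # closed form
print(abs(gauge((-2.5, -2.5)) - 2.5 * gauge(a1)) < 1e-9)      # homogeneity
sum_a = (a1[0] + a2[0], a1[1] + a2[1])
print(gauge(sum_a) <= gauge(a1) + gauge(a2) + 1e-9)           # triangle inequality
```

The bisection only relies on the absorbent and balanced properties; convexity of $U$ is what makes the last check (the triangle inequality) succeed.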

Lemma 4.5.3

Let $A$ be a topological vector space over $\mathbb{K}$. Then $A$ is locally convex if and only if there exists a family of seminorms $\{\|\cdot\|_i : A \to \mathbb{R} \mid i \in I\}$ on $A$ such that the topology of $A$ is the initial topology (Definition (2.1.18)) with respect to this collection.

Proof. Suppose $A$ is locally convex and let $\mathcal{A}$ be the basis of abc neighbourhoods of $0$ in $A$. Define for each $U \in \mathcal{A}$ the seminorm $\|\cdot\|_U : A \to \mathbb{R}$ as in Lemma (4.5.2). With Lemma (4.5.2) and Lemma (4.1.6) we obtain that the initial topology of $\{\|\cdot\|_U \mid U \in \mathcal{A}\}$ must coincide with the topology of $A$ generated by the basis of neighbourhoods $\mathcal{A}$: for any $U_1, \ldots, U_k \in \mathcal{A}$ and $\varepsilon \in\, ]0, \infty[$ we have

$$\|\cdot\|_{U_1}^{-1}([0, \varepsilon[) \cap \ldots \cap \|\cdot\|_{U_k}^{-1}([0, \varepsilon[) = (\varepsilon\, U_1) \cap \ldots \cap (\varepsilon\, U_k) = \varepsilon\, (U_1 \cap \ldots \cap U_k).$$

Suppose conversely that the topology of $A$ is the initial topology of a family of seminorms $\{\|\cdot\|_i : A \to \mathbb{R} \mid i \in I\}$. Choose

\begin{align*}
\mathcal{A} &:= \big\{ \{a \in A \mid \|a\|_{i_1}, \ldots, \|a\|_{i_k} < \varepsilon\} \;\big|\; i_1, \ldots, i_k \in I,\ \varepsilon \in\, ]0, \infty[ \big\} \\
&= \big\{ \|\cdot\|_{i_1}^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_{i_k}^{-1}(]-1, \varepsilon[) \;\big|\; i_1, \ldots, i_k \in I,\ \varepsilon \in\, ]0, \infty[ \big\}.
\end{align*}

Then all these sets are open, because all sets $]-1, \varepsilon[ \;\subseteq \mathbb{R}$ are open for $\varepsilon \in\, ]0, \infty[$ and the $\|\cdot\|_i$ are continuous by choice of the initial topology. Furthermore $0$ is an element of all these sets, since $\|0\|_i = 0 < \varepsilon$ for all $\varepsilon \in\, ]0, \infty[$ and $i \in I$ because the $\|\cdot\|_i$ are seminorms. Therefore $\mathcal{A}$ is a collection of open neighbourhoods of $0$ in $A$. From the expression for the basis generating the initial topology (Lemma (2.1.19)) we see that it is even a basis of open neighbourhoods of $0$ (which by Lemma (4.1.6) generates the entire initial topology by translation). The sets are furthermore all abc subsets of $A$ by Lemma (4.5.2). Therefore $A$ is locally convex.

For the remaining part of this section we will use the notion of local convexity interchangeably with the family of seminorms $\{\|\cdot\|_i \mid i \in I\}$ from Lemma (4.5.3).

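Lemma 4.5.4

Let $A$ be a locally convex topological vector space over $\mathbb{K}$ and $B$ a topological space. Then for $f : B \to A$, $b \in B$, $a \in A$, the following are equivalent:

• $\lim_{y \to b} f(y) = a$,

• for all $i \in I$, $\lim_{y \to b} \|f(y) - a\|_i = 0$,

• $\forall i \in I : \forall \varepsilon \in\, ]0, \infty[ \;: \exists\, b \in V \subseteq B \text{ open} : \forall b_1 \in V : \|f(b_1) - a\|_i < \varepsilon$.

Proof. By considering $b \mapsto f(b) - a$ with continuity of addition we may assume $a = 0$. Suppose $\lim_{y \to b} f(y) = 0$. Since $A$ has the initial topology of all $\|\cdot\|_i$, all $\|\cdot\|_i$ are continuous and hence by Lemma (2.1.10), $\lim_{y \to b} \|f(y)\|_i = \lim_{x \to 0} \|x\|_i = \|0\|_i = 0$ for all $i \in I$.

Suppose conversely that $\lim_{y \to b} \|f(y)\|_i = 0$ for all $i \in I$. Let $U$ be an open neighbourhood of $0$ in $A$, then because of the initial topology there exist an $\varepsilon \in\, ]0, \infty[$ and $i_1, \ldots, i_k \in I$ such that $0 \in \|\cdot\|_{i_1}^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_{i_k}^{-1}(]-1, \varepsilon[) \subseteq U$. For each $i_j \in I$ with $1 \leq j \leq k$ there exists an open neighbourhood $V_j$ of $b$ in $B$ such that $\|f(V_j)\|_{i_j} \subseteq\, ]-1, \varepsilon[$, since $\lim_{y \to b} \|f(y)\|_{i_j} = 0$ by assumption and $]-1, \varepsilon[$ is an open neighbourhood of $0$ in $\mathbb{R}$. Therefore $f(V_j) \subseteq \|\cdot\|_{i_j}^{-1}(]-1, \varepsilon[)$ for each $1 \leq j \leq k$. As a finite intersection of open neighbourhoods, $V := V_1 \cap \ldots \cap V_k$ is an open neighbourhood of $b$ in $B$ and by construction $f(V) \subseteq \|\cdot\|_{i_1}^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_{i_k}^{-1}(]-1, \varepsilon[) \subseteq U$. So for each open neighbourhood $U$ of $0$ in $A$ there exists an open neighbourhood $V$ of $b$ in $B$ such that $f(V) \subseteq U$. Therefore $\lim_{y \to b} f(y) = 0$. The final statement arises from writing out the second explicitly.

Lemma 4.5.5

Let $A$, $B$ be locally convex topological vector spaces over $\mathbb{K}$ and $f : A \to B$ linear over $\mathbb{K}$. Denote the seminorms on $A$ by $\|\cdot\|_i$ for $i \in I$ and the seminorms on $B$ by $\|\cdot\|'_j$ for $j \in J$. Then $f$ is continuous if and only if for all $j \in J$ there exist $\alpha \in\, ]0, \infty[$ and $i_1, \ldots, i_k \in I$ with

$$\|f(a)\|'_j \leq \alpha \big(\|a\|_{i_1} + \ldots + \|a\|_{i_k}\big)$$

for all $a \in A$.

Proof. We follow [Bou1955]. Because $f$ is linear and Lemma (4.1.5) we know that $f$ is continuous if and only if $f$ is continuous at $0$. Suppose the estimate holds and let $\varepsilon \in\, ]0, \infty[$, $j \in J$. Then there exist $\alpha \in\, ]0, \infty[$ and $i_1, \ldots, i_k \in I$ such that the estimate holds. Pick $\delta := \frac{\varepsilon}{k \alpha} \in\, ]0, \infty[$, then for all $a \in A$ with $\|a\|_{i_1} \leq \delta, \ldots, \|a\|_{i_k} \leq \delta$ we have $\|f(a)\|'_j \leq \alpha k \frac{\varepsilon}{k \alpha} = \varepsilon$. Hence $\lim_{a \to 0} \|f(a)\|'_j = 0$ for all $j \in J$ and therefore (Lemma (4.5.4)) $\lim_{a \to 0} f(a) = 0$, which makes $f$ continuous.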
Suppose conversely that $f$ is continuous, then $\lim_{a \to 0} f(a) = 0$, so $f^{-1}(\|\cdot\|_j'^{-1}(]-1, 1[))$ is a neighbourhood of $0$ in $A$. Hence (by Lemma (2.1.19)) there exist $i_1, \ldots, i_k \in I$ and an $\varepsilon \in\, ]0, \infty[$ such that $0 \in \|\cdot\|_{i_1}^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_{i_k}^{-1}(]-1, \varepsilon[) \subseteq f^{-1}(\|\cdot\|_j'^{-1}(]-1, 1[))$. Let $a \in A$ be arbitrary and suppose we have some $\gamma \in\, ]0, \infty[$ such that $\gamma \|a\|_{i_l} < \varepsilon$ for all $1 \leq l \leq k$. Then by construction of $\varepsilon$, $\|f(\gamma a)\|'_j < 1$, so $0 \leq \|f(a)\|'_j < \frac{1}{\gamma}$. Now if $\|a\|_{i_l} = 0$ for all $1 \leq l \leq k$, we can let $\gamma \to \infty$, which shows that $\|f(a)\|'_j = 0$ and the estimate holds. Otherwise pick $\gamma = \frac{\varepsilon}{\|a\|_{i_1} + \ldots + \|a\|_{i_k}}$, from which we find

$$\|f(a)\|'_j \leq \frac{1}{\gamma} = \frac{1}{\varepsilon} \big(\|a\|_{i_1} + \ldots + \|a\|_{i_k}\big),$$

which shows that the desired estimate holds for all $a \in A$ if we pick $\alpha = \frac{1}{\varepsilon}$.

Lemma 4.5.6

Let $A$ be a locally convex topological vector space over $\mathbb{K}$. Then $A$ is Hausdorff if and only if for all $a \in A$ we have that $a = 0$ if $\|a\|_i = 0$ for all $i \in I$.

Proof. Suppose $A$ is Hausdorff. Let $a \in A$ and suppose $a \neq 0$. Since $A$ is Hausdorff and the initial topology is generated by the seminorms, there exist an $\varepsilon \in\, ]0, \infty[$ and $i_1, \ldots, i_k \in I$ such that $a \notin \|\cdot\|_{i_1}^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_{i_k}^{-1}(]-1, \varepsilon[) \ni 0$. Therefore, for some $l \in \{1, \ldots, k\}$ we have $a \notin \|\cdot\|_{i_l}^{-1}(]-1, \varepsilon[)$, so $\|a\|_{i_l} \geq \varepsilon > 0$. So for all $a \in A$, if $a \neq 0$, there exists an $i \in I$ such that $\|a\|_i \neq 0$. Now take the contrapositive.

Suppose for all $a \in A$, if for all $i \in I$ we have $\|a\|_i = 0$, then $a = 0$. Let $a_1, a_2 \in A$ be given and suppose $a_1 \neq a_2$. By translating by $-a_1$ we may suppose that $a_1 = 0$, $a_2 = a \neq 0$. Since $a \neq 0$, there exists an $i \in I$ such that $\|a\|_i > 0$ by assumption. Take for $\varepsilon := \|a\|_i/2 > 0$ the sets $U_1 := \|\cdot\|_i^{-1}(]-1, \varepsilon[)$ and $U_2 := a + \|\cdot\|_i^{-1}(]-1, \varepsilon[)$. It is clear that $U_1$ resp. $U_2$ is an open neighbourhood of $0$ resp. $a$ in $A$. Let $a_1 \in U_1 \cap U_2$, then because $a_1 \in U_1$ we have $\|a_1\|_i < \varepsilon$ and because $a_1 \in U_2$, $a_1 = a + a_2$ with $\|a_2\|_i < \varepsilon$. But then $2 \varepsilon = \|a\|_i = \|a_1 - a_2\|_i \leq \|a_1\|_i + \|a_2\|_i < 2 \varepsilon$, leading to a contradiction. Therefore $U_1 \cap U_2 = \emptyset$, so $U_1$ and $U_2$ are two disjoint open neighbourhoods. This makes $A$ Hausdorff.

Lemma 4.5.7: Comparison with normed spaces

Let $A$ be a set. Then

• $A$ is a seminormed $\mathbb{K}$-module if and only if $A$ is a locally convex topological vector space over $\mathbb{K}$ with a finite number of seminorms,

• $A$ is a normed space over $\mathbb{K}$ if and only if $A$ is a Hausdorff locally convex topological vector space over $\mathbb{K}$ with a finite number of seminorms.

Proof. Suppose $A$ is a seminormed $\mathbb{K}$-module, then there exists a single seminorm $\|\cdot\| : A \to \mathbb{R}$ defining the topology on $A$ as per Definition (4.2.4). However, this is precisely the initial topology of $\|\cdot\|$. By Lemma (4.2.5), $A$ is a topological vector space over $\mathbb{K}$ and since the topology on $A$ is the initial topology of $\|\cdot\|$, $A$ is locally convex by Lemma (4.5.3). Therefore $A$ is a locally convex topological vector space over $\mathbb{K}$. Suppose conversely that $A$ is locally convex with a finite number of seminorms $\{\|\cdot\|_1, \ldots, \|\cdot\|_k\}$. Then $\|\cdot\| : A \to \mathbb{R}$ defined by

$$\|a\| := \|a\|_1 + \ldots + \|a\|_k$$

is a seminorm on $A$, the initial topology of which coincides with that of the finite number of seminorms. Hence $A$ is a seminormed $\mathbb{K}$-module. By Lemma (4.5.6), $A$ is Hausdorff if and only if $\|\cdot\|$ is a norm if and only if $A$ is a normed space.

Lemma 4.5.8

Let $A$ be a locally convex topological vector space over $\mathbb{K}$. Then $A$ is pseudometrisable if and only if the collection of open abc neighbourhoods giving rise to local convexity is countable. Furthermore, if this is the case, then the pseudometric may be assumed to satisfy $d(a_1 + a_3, a_2 + a_3) = d(a_1, a_2)$ for all $a_1, a_2, a_3 \in A$, and $A$ is a metric space if and only if $A$ is Hausdorff.


Proof. Suppose $A$ admits a pseudometric $d$ which generates $A$'s topology. Let $\mathcal{A}$ be the basis of open neighbourhoods of $0$ arising from local convexity. Then for all $k \in \mathbb{N}$, $k \geq 1$, the set $B_A(0, \frac{1}{k})$ is an open neighbourhood of $0$, so for each $k \geq 1$ there exists a $U_k \in \mathcal{A}$ with $0 \in U_k \subseteq B_A(0, \frac{1}{k})$. The countable collection $\mathcal{A}_1 := \{U_1, U_2, \ldots\}$ is by this construction again a basis of open neighbourhoods of $0$ that gives rise to local convexity, by Lemma (4.1.6) generates $A$'s topology, and by Lemma (4.5.3) corresponds to a countable collection of seminorms.

Suppose $A$ is locally convex because of a countable collection of seminorms $\|\cdot\|_1, \|\cdot\|_2, \ldots$. Define the function $d : A \times A \to \mathbb{R}$ by

$$d(a_1, a_2) := \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{\|a_2 - a_1\|_k}{1 + \|a_2 - a_1\|_k}. \tag{4.2}$$

Then for all $a_1, a_2 \in A$, $d(a_1, a_2) \leq \sum_{k=1}^{\infty} \frac{1}{2^k} \cdot 1 = 1$, so $d(a_1, a_2) \in \mathbb{R}$. Clearly $d(a_1, a_2) \geq 0$, $d(a_1, a_2) = d(a_2, a_1)$ and $d(a_1, a_1) = \sum_{k=1}^{\infty} 0 = 0$. Because $\|a_3 - a_1\|_k = \|(a_3 - a_2) + (a_2 - a_1)\|_k \leq \|a_3 - a_2\|_k + \|a_2 - a_1\|_k$, the map $t \mapsto t/(1 + t)$ is increasing on $[0, \infty[$, and $x/(1 + x) + y/(1 + y) \geq (x + y)/(1 + x + y)$ for $x, y \in [0, \infty[$, we also find $d(a_1, a_3) \leq d(a_1, a_2) + d(a_2, a_3)$. So $d$ is a pseudometric on $A$.

Now note that for any $\varepsilon \in\, ]0, \infty[$, $a \in B_A(0, \varepsilon)$ if and only if $d(0, a) = \sum_{k=1}^{\infty} \frac{1}{2^k} \frac{\|a\|_k}{1 + \|a\|_k} < \varepsilon$. There exists an $l \geq 1$ such that $2^{-l} < \varepsilon/2$, so as $\sum_{k=l}^{\infty} \frac{1}{2^k} = \sum_{k=0}^{\infty} \frac{1}{2^{k+l}} = 2^{1-l}$ we see that if $\|a\|_1, \ldots, \|a\|_l < \varepsilon/2$, then

$$\sum_{k=1}^{\infty} \frac{1}{2^k} \frac{\|a\|_k}{1 + \|a\|_k} < \sum_{k=1}^{l} \frac{1}{2^k} \frac{\varepsilon}{2} + \sum_{k=l+1}^{\infty} \frac{1}{2^k} < \frac{\varepsilon}{2} + 2^{1-(l+1)} < \varepsilon,$$

so $\|\cdot\|_1^{-1}(]-1, \varepsilon/2[) \cap \ldots \cap \|\cdot\|_l^{-1}(]-1, \varepsilon/2[) \subseteq B_A(0, \varepsilon)$. Conversely, for any $\varepsilon \in\, ]0, \infty[$ and $l \in \mathbb{N}$, $l \geq 1$ we can choose $\delta = \frac{\min\{1, \varepsilon\}}{2^{l+2}} \in\, ]0, \infty[$ to obtain that if $d(0, a) < \delta$, then $\|a\|_k < \varepsilon$ for all $k \leq l$. For if $\|a\|_k \geq \varepsilon$ for a certain $k \leq l$, then

$$d(0, a) \geq \frac{1}{2^k} \frac{\|a\|_k}{1 + \|a\|_k} \geq \frac{1}{2^k} \frac{\varepsilon}{1 + \varepsilon} \geq \frac{1}{2^{k+1}} \min\{1, \varepsilon\} > \delta,$$

a contradiction. Therefore $B_A(0, \delta) \subseteq \|\cdot\|_1^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_l^{-1}(]-1, \varepsilon[)$. Because of this and Lemma (4.1.6), the topology generated by $d$ coincides with the initial topology of the seminorms and hence $A$ is pseudometrisable.

From Equation (4.2) it is clear that $d(a_1 + a_3, a_2 + a_3) = d(a_1, a_2)$ for all $a_1, a_2, a_3 \in A$. Furthermore, we see from Equation (4.2) that $d(a_1, a_2) = 0$ if and only if for all $k \geq 1$ we have $\|a_2 - a_1\|_k = 0$. Therefore $d$ is a metric if $\forall a \in A : (\forall k \geq 1 : \|a\|_k = 0) \rightarrow a = 0$, which by Lemma (4.5.6) is equivalent to $A$ being Hausdorff. Conversely, if $d$ is a metric, then $A$ is Hausdorff by Theorem (2.5.10).

Just as with metric spaces we can talk about completeness for locally convex topological vector spaces, where we demand completeness with respect to all seminorms.
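The pseudometric of Equation (4.2) can be evaluated exactly in a small model case. The following is a minimal sketch, assuming $A = \mathbb{Q}^n$ with the coordinate seminorms $\|a\|_k = |a_k|$ for $k \leq n$ and all further seminorms taken to be zero, so that the series is a finite sum. This model and the names `d`, `add` and `points` are illustrative assumptions, not taken from the thesis.

```python
from fractions import Fraction
from itertools import product

def d(a1, a2):
    """Metric of Equation (4.2) for the seminorms ||a||_k = |a_k| on Q^n
    (seminorms with index k > n are taken to be zero)."""
    total = Fraction(0)
    for k, (x, y) in enumerate(zip(a1, a2), start=1):
        s = abs(Fraction(y) - Fraction(x))          # ||a2 - a1||_k
        total += Fraction(1, 2 ** k) * s / (1 + s)
    return total

def add(a, b):
    """Coordinatewise sum of two points."""
    return tuple(x + y for x, y in zip(a, b))

# A small grid of sample points in Q^2.
points = [tuple(map(Fraction, p)) for p in product([-2, 0, 1, 5], repeat=2)]

translation_invariant = all(
    d(add(a1, a3), add(a2, a3)) == d(a1, a2)
    for a1 in points for a2 in points for a3 in points
)
triangle = all(
    d(a1, a3) <= d(a1, a2) + d(a2, a3)
    for a1 in points for a2 in points for a3 in points
)
bounded = all(d(a1, a2) < 1 for a1 in points for a2 in points)
print(translation_invariant, triangle, bounded)
```

Because the coordinate seminorms separate points, $d$ is a genuine metric in this model, in line with the Hausdorff criterion of Lemma (4.5.6).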

Definition 4.5.9: Uniform completeness (UC)

Let $A$ be a locally convex topological vector space over $\mathbb{K}$. Then we call $A$ uniformly complete (abbreviated UC) if $A$ is complete with respect to each seminorm, that is if for all sequences $x : \mathbb{N} \to A$ we have that $x$ is convergent in $A$ if

$$\forall i \in I : \forall \varepsilon \in\, ]0, \infty[ \;: \exists k \in \mathbb{N} : \forall l, m \geq k : \|x_l - x_m\|_i < \varepsilon.$$

This notion is compatible with our original definition (Definition (2.5.14)) of completeness as is shown in the following lemma.


Lemma 4.5.10

Let $A$ be a Hausdorff locally convex topological vector space over $\mathbb{K}$. If $A$ is uniformly complete and the collection of abc neighbourhoods giving rise to local convexity is countable, then $A$ is an F-space over $\mathbb{K}$.

Proof. Suppose $A$ is uniformly complete and has a countable collection of abc neighbourhoods. Then as $A$ is Hausdorff, $A$ is metrisable by Lemma (4.5.8). Denote the metric by $d$ and assume the metric satisfies $d(a_1 + a_3, a_2 + a_3) = d(a_1, a_2)$ for all $a_1, a_2, a_3 \in A$ (translation invariance).

Let $x : \mathbb{N} \to A$ be any Cauchy sequence with respect to the metric $d$. Let $i \in I$ and $\varepsilon \in\, ]0, \infty[$ be arbitrary. As $\|\cdot\|_i : A \to \mathbb{R}$ is continuous, there exists a $\delta \in\, ]0, \infty[$ such that $B_A(0, \delta) \subseteq \|\cdot\|_i^{-1}(]-1, \varepsilon[)$. Because $x$ is Cauchy there exists a $k \in \mathbb{N}$ such that for all $l, m \geq k$ we have $d(x_l, x_m) < \delta$. But as $d(x_l - x_m, 0) = d(x_l, x_m) < \delta$, $x_l - x_m \in B_A(0, \delta)$, so $\|x_l - x_m\|_i \in\, ]-1, \varepsilon[$ and hence $\|x_l - x_m\|_i < \varepsilon$. $A$ is uniformly complete by assumption and the above is true for all $i \in I$ and $\varepsilon \in\, ]0, \infty[$, so $x$ is convergent. Because this is true for all Cauchy sequences in $A$, $A$ is complete. Therefore $A$ is an F-space over $\mathbb{K}$.

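Example 4.5.11: $\mathbb{K}^k$

Let $k \in \mathbb{N}$. Then the set $\mathbb{K}^k$, together with the norm $\|(x_1, \ldots, x_k)\| := \sqrt{\sum_{l=1}^{k} |x_l|^2}$, is a Hausdorff locally convex topological vector space over $\mathbb{K}$ (with just a single seminorm). Because $\mathbb{R}$ and $\mathbb{C}$ are complete, $\mathbb{K}^k$ is uniformly complete. $\mathbb{Q}^k$ is also a Hausdorff locally convex topological vector space over $\mathbb{Q}$, but not uniformly complete, as $\mathbb{Q}$ is not complete.

Lemma 4.5.12

Let $A$ be a topological vector space over $\mathbb{K}$ and $B \leq A$. If $A$ is locally convex, then $A/B$ is locally convex. Furthermore, $A/B$ is Hausdorff if and only if $B \subseteq A$ is closed.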

Proof. Suppose A due to a collection of seminorms {k · ki|i ∈ I}. Define 0 k · ki : A/B → R for all i ∈ I by

0 k[a]ki := inf{ka1ki ∈ R|a1 ∈ [a]}.

0 0 Note that 0 ≤ k[a]ki ≤ kaki < ∞, so k[a]ki ∈ [0, ∞[ for all [a] ∈ A/B. Let α ∈ K and [a] ∈ A/B. For α 6= 0, a1 ∈ [a] if and only if a1 = a + b for b ∈ B if and only if αa1 = α a + α b for b ∈ B if and only if α a1 ∈ [α a]. Hence, 0 0 as kα a1ki = |α| ka1ki, we find k[α a]ki = |α| k[a]ki. Otherwise, if α = 0, then 0 0 [α a] = [0] and as k0ki = 0, 0 ∈ B = [0], we see that k[0 a]ki = k[0]ki = 0 = 0 0 0 |0|k[a]ki. Hence kα[a]ki = |α|k[a]ki for all α ∈ K,[a] ∈ A/B. Let [a1], [a2] ∈ A/B. Then for all a3 ∈ [a1], a4 ∈ [a2] there exist b3, b4 ∈ B such that a3 = a1 +b3, a4 = a2 +b4, so ka3ki +ka4ki = ka1 +b3ki +ka2 +b4ki ≥ ka1 + a2 + (b3 + b4)ki. As b3 + b4 ∈ B, a1 + a2 + (b3 + b4) ∈ [a1 + a2]. So for any a3 ∈ [a1], a4 ∈ [a2] there exists an a5 ∈ [a1 +a2] such that ka5ki ≤ ka3ki +ka4ki. 0 0 0 0 Hence k[a1] + [a2]ki = k[a1 + a2]ki ≤ k[a1]ki + k[a2]ki for all [a1], [a2] ∈ A/B. 0 Therefore k · ki is a seminorm for all i ∈ I. 0 Let i ∈ I. By definition k[a]ki = inf{ka − bki ∈ R|b ∈ B} and hence (in 0 exactly the same fashion as for Lemma (2.5.9)) the map A → R : a 7→ k[a]ki 0 . Therefore, as A/B is equipped with the final topology of a 7→ [a], k · ki . Therefore the topology of A/B is as least as large as the initial topology of all 0 the k · ki, i ∈ I.


Let V be an open neighbourhood of [0] in A/B. Then (a 7→ [a])−1(V ) ⊆ A is an open neighbourhood of 0 in A. Since A LC (use Lemma (2.1.19)), there exist i , . . . , i ∈ I and an  ∈]0, ∞[ such that 0 ∈ k·k−1(]−1, [)∩...∩k·k−1(]−1, [) ⊆ 1 k i1 ik (a 7→ [a])−1(V ). As a 7→ [a] is an open map by Lemma (4.1.9), we find that therefore

[0] ∈ V := (a 7→ [a])(k · k−1(] − 1, [) ∩ ... k · k−1(] − 1, [)) ⊆ V 1 i1 ik

−1 and V1 is an open neighbourhood of [0] in A/B. Now a ∈ k·ki (]−1, [)+B iff for 0−1 some b ∈ B, ka+bki <  iff inf{ka+bki ∈ R|b ∈ B} <  iff [a] ∈ k·ki (]−1, [), so (use (a 7→ [a])(B) = [0])

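$$[0] \in \|\cdot\|_{i_1}'^{-1}(]-1, \varepsilon[) \cap \ldots \cap \|\cdot\|_{i_k}'^{-1}(]-1, \varepsilon[) = V_1 \subseteq V.$$

Hence the initial topology generated by the $\|\cdot\|'_i$, $i \in I$, is at least as large as the topology of $A/B$. Therefore the topology of $A/B$ equals the initial topology of the $\|\cdot\|'_i$, $i \in I$, and hence $A/B$ is locally convex. By Lemma (4.1.9), $A/B$ is Hausdorff if and only if $B \subseteq A$ is closed.

Lemma 4.5.13

Let $A$ be a topological vector space over $\mathbb{K}$. Then $A'$ is a Hausdorff locally convex topological vector space over $\mathbb{K}$.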

0 0 Proof. Define k · ka : A → R for all a ∈ A by 0 kfka := |f(a)|.

0 0 Then clearly kfka = |f(a)| ≥ 0, kαfka = |(αf)(a)| = |αf(a)| = |α||f(a)| = 0 0 0 0 |α|kfka, kf + gka = |(f + g)(a)| = |f(a) + g(a)| ≤ |f(a)| + |g(a)| = kfka + kgka, 0 0 0 so all k · ka are seminorms on A . Since the evaluation f 7→ f(a): A → K c 0 0 for all a ∈ A and | · | : K → R , k · ka for all a ∈ A . Because of this the topology on A0 is at least as large as the initial topology generated by the 0 seminorms k · ka, a ∈ A. On the other hand, let U 0 be an open neighbourhood of 0 in A0. Then by 0 Lemma (2.1.19) and definition of the topology of A , there exist a1, . . . , ak ∈ A −1 and α1, . . . , αk ∈ K, β1, . . . , βk ∈]0, ∞[ such that 0 ∈ (f 7→ f(a1)) (BK(α1, β1))∩ −1 0 −1 ...∩(f 7→ f(ak)) (BK(αk, βk)) ⊆ U . Let 1 ≤ l ≤ k, as 0 ∈ (f 7→ f(al)) (BK(αl, βl)) we have that |0(al) − αl| = |αl| < βl. Choose γl := βl − |αl| ∈]0, ∞[, then if f ∈ k · k0−1(] − 1, γ [), |f(a )| < γ , so |f(a ) − α | ≤ |f(a )| + |α | < al l l l l l l l −1 βl − |αl| + |αl| = βl, hence f ∈ (f 7→ f(al)) (BK(αl, βl)). Therefore 0 ∈ k·k0−1(]−1, γ [)∩...∩k·k0−1(]−1, γ [) ⊆ (f 7→ f(a ))−1(B (α , β ))∩...∩(f 7→ a1 1 ak k 1 K 1 1 −1 0 0 f(ak)) (BK(αk, βk)) ⊆ U . Hence the topology of A is at most as large as the 0 initial topology generated by the seminorms k · ka, a ∈ A. So the seminorms 0 0 0 k · ka, a ∈ A generate the topology of A . Therefore A . 0 0 Let f ∈ A and suppose that for all a ∈ A, kfka = |f(a)| = 0. Then for all a ∈ A, f(a) = 0, so f = 0. Therefore, by Lemma (4.5.6), A0 . The seminorms from Lemma (4.5.2) permit us to rephrase Theorem (4.2.6) into a very convenient form for locally convex topological vector spaces. This form, Theorem (4.5.14), permits us to translate analysis on our topological vector space to analysis on K by applying elements of the topological dual to the points we are investigating.


Theorem 4.5.14: Hahn-Banach

Let $A$ be a Hausdorff locally convex topological vector space over $\mathbb{K}$. Then for any $a_1, a_2 \in A$, we have $a_1 = a_2$ if and only if for all $f \in A'$, $f(a_1) = f(a_2)$.

Proof. Because any $f \in A'$ is a function by definition, clearly $f(a_1) = f(a_2)$ if $a_1 = a_2$. Now suppose $a_1 \neq a_2$. As all $f \in A'$ are linear we may by translating by $-a_1$ suppose that $a_1 = 0$ and $a_2 = a$ for $a \neq 0$. Because $a \neq 0$ and $A$ is Hausdorff, by Lemma (4.5.6) there exists an $i \in I$ such that $\|a\|_i > 0$. Consider the subspace and map

$$B := \{\alpha a \mid \alpha \in \mathbb{K}\} \leq A, \qquad g : B \to \mathbb{K} : \alpha a \mapsto \alpha \|a\|_i.$$

Then $g$ is linear and for all $\alpha a \in B$ we have $|g(\alpha a)| = \|a\|_i\, |\alpha| = \|\alpha a\|_i$. By Theorem (4.2.6) ($B \leq A$, $g \in B^*$, $g \leq \|\cdot\|_i|_B$, $\|\cdot\|_i$ seminorm) there exists an $h \in A^*$ satisfying $h|_B = g$ and $|h(a_1)| \leq \|a_1\|_i$ for all $a_1 \in A$. Because of this ($\|\cdot\|_i$ is continuous by the initial topology of the seminorms on $A$),

$$0 \leq \lim_{a_1 \to 0} |h(a_1)| \leq \lim_{a_1 \to 0} \|a_1\|_i = \|0\|_i = 0,$$

so $\lim_{a_1 \to 0} h(a_1) = 0$ and therefore, as $h$ is linear, $h$ is continuous. Now $h(0) = 0$ and $h(a) = g(a) = \|a\|_i > 0$, so $h(0) \neq h(a)$. Therefore if $a_1 \neq a_2$, there exists an $h \in A'$ such that $h(a_1) \neq h(a_2)$.

Because of this theorem, we can also give a stronger, continuous variant of Lemma (3.3.20).

Theorem 4.5.15: Dual of the dual

Suppose $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, and let $A$ be a Hausdorff locally convex topological vector space over $\mathbb{K}$. Then the map $f : A \to (A')' : a \mapsto (g \mapsto g(a))$ is continuous, linear over $\mathbb{K}$ and bijective.

Proof. We follow [Bou1955]. By Lemma (3.3.20) we already know that $f$ is linear over $\mathbb{K}$. For continuity note that for any $g \in A'$ the composition $(h \mapsto h(g)) \circ f$ is given by $((A'' \to \mathbb{K} : h \mapsto h(g)) \circ f)(a) = f(a)(g) = g(a)$, so $(h \mapsto h(g)) \circ f = g$ for any $g \in A'$. By definition of the initial topology on $(A')'$ this makes $f : A \to (A')'$ continuous.

Let $a_1, a_2 \in A$ and suppose $f(a_1) = f(a_2)$. Then for any $g \in A'$ we have $g(a_1) = f(a_1)(g) = f(a_2)(g) = g(a_2)$, hence by Theorem (4.5.14) $a_1 = a_2$, so $f$ is injective.

Let $g \in (A')'$, then $g : A' \to \mathbb{K}$ is continuous and linear over $\mathbb{K}$. Hence by Lemma (4.5.5) and Lemma (4.5.13) there exist an $\alpha \in\, ]0, \infty[$ and $a_1, \ldots, a_k \in A$ such that for any $h \in A'$ we have $|g(h)| \leq \alpha\, (\|h\|_{a_1} + \ldots + \|h\|_{a_k}) = \alpha\, (|h(a_1)| + \ldots + |h(a_k)|)$. Now let $g_l : A' \to \mathbb{K} : h \mapsto h(a_l)$ for $1 \leq l \leq k$, then $g_1, \ldots, g_k \in (A')'$ and for any $h \in A'$ we have $|g(h)| \leq \alpha\, (|g_1(h)| + \ldots + |g_k(h)|)$. Suppose $h \in \ker g_1 \cap \ldots \cap \ker g_k$, then $|g(h)| \leq \alpha\, (0 + \ldots + 0) = 0$, so $g(h) = 0$ and therefore $h \in \ker g$. Hence $\ker g_1 \cap \ldots \cap \ker g_k \leq \ker g$. Now using the notation of Theorem (3.3.19) for $A^*$ and $(A^*)^*$, this implies that $\langle g \rangle_{\mathbb{K}}^{\perp} = \ker g \geq \ker g_1 \cap \ldots \cap \ker g_k = \langle g_1, \ldots, g_k \rangle_{\mathbb{K}}^{\perp}$. Since these are finite dimensional we therefore have $\langle g \rangle_{\mathbb{K}} = (\langle g \rangle_{\mathbb{K}}^{\perp})^{\perp} \leq (\langle g_1, \ldots, g_k \rangle_{\mathbb{K}}^{\perp})^{\perp} = \langle g_1, \ldots, g_k \rangle_{\mathbb{K}}$. Hence there exist $\alpha_1, \ldots, \alpha_k \in \mathbb{K}$ such that $g = \alpha_1 g_1 + \ldots + \alpha_k g_k$, so for $h \in A'$ we have $g(h) = \alpha_1 g_1(h) + \ldots + \alpha_k g_k(h) = \alpha_1 h(a_1) + \ldots + \alpha_k h(a_k) = h(\alpha_1 a_1 + \ldots + \alpha_k a_k)$. Therefore, let $a := \alpha_1 a_1 + \ldots + \alpha_k a_k \in A$, then $g(h) = h(a) = f(a)(h)$ for all $h \in A'$, so $f(a) = g$. So for any $g \in (A')'$ there exists an $a \in A$ such that $f(a) = g$ and this makes $f$ surjective.

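We also obtain a continuous variant of Theorem (3.3.19) for topological vector spaces.

Theorem 4.5.16: Duality
Let A be a locally convex Hausdorff topological vector space over K. Denote for any B ≤ A the set

$$B^\perp := \{f \in A' \mid \forall b \in B : f(b) = 0\} \le A',$$

and for any C ≤ A′ the set

$$C^\perp := \{a \in A \mid \forall f \in C : f(a) = 0\} \le A.$$

• Then for all B ≤ A, C ≤ A′ we have

$$B \le (B^\perp)^\perp, \qquad C \le (C^\perp)^\perp.$$

• For all B1 ≤ B2 ≤ A, C1 ≤ C2 ≤ A′ we have

$$B_1^\perp \ge B_2^\perp, \qquad C_1^\perp \ge C_2^\perp.$$

• For all B ≤ A, C ≤ A′ we have

$$(B^\perp)^\perp = \overline{B} \subseteq A, \qquad (C^\perp)^\perp = \overline{C} \subseteq A'.$$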

Proof.
• Show in the same way as in Theorem (3.3.19).
• Show in the same way as in Theorem (3.3.19).
• Let B ≤ A. We already know that $B \le (B^\perp)^\perp$. Suppose there exists a $b \in \overline{B}$ for which $b \notin (B^\perp)^\perp$. Then there is an $f \in B^\perp$ with f(b) ≠ 0. Pick ε := |f(b)|/2 ∈ ]0, ∞[. As f is continuous (since f ∈ A′), $U := f^{-1}(B_K(f(b), \epsilon)) \subseteq A$ is an open neighbourhood of b in A. Hence ($b \in \overline{B}$, Lemma (2.1.3)) there exists a b1 ∈ B ∩ U. Now as b1 ∈ B, f(b1) = 0 ($f \in B^\perp$), but on the other hand b1 ∈ U, so f(b1) ≠ 0. A contradiction is reached: such a b cannot exist and therefore necessarily $\overline{B} \subseteq (B^\perp)^\perp$.
Let $b \notin \overline{B}$, then $[b] \ne [0] \in A/\overline{B}$. By Lemma (4.5.12) we see that $A/\overline{B}$ is again a locally convex Hausdorff topological vector space over K, therefore by Theorem (4.5.14) there exists a $g \in (A/\overline{B})'$ such that g([b]) ≠ g([0]) = 0. Define f : A → K : a ↦ g([a]), then f ∈ A′ (as g is continuous and linear) and for any b1 ∈ B we have f(b1) = g([b1]) = g([0]) = 0. Hence $f \in B^\perp$, however f(b) = g([b]) ≠ 0, so $b \notin (B^\perp)^\perp$. Therefore $\overline{B} = (B^\perp)^\perp$.
Let C ≤ A′. Let $f \in \overline{C}$ and suppose $f \notin (C^\perp)^\perp$, then there exists an $a \in C^\perp$ such that f(a) ≠ 0. As the topology on A′ is generated by the seminorms $\|\cdot\|_{a_1} : f_1 \mapsto |f_1(a_1)|$ for all a1 ∈ A (Lemma (4.5.13)) we have for ε := |f(a)|/2 ∈ ]0, ∞[ that $V := \|\cdot\|_a^{-1}(B_K(f(a), \epsilon))$ is an open neighbourhood of f in A′. As $f \in \overline{C}$ (Lemma (2.1.3)), there exists an f1 ∈ C ∩ V. As f1 ∈ C, $a \in C^\perp$, we have f1(a) = 0, while on the other hand f1 ∈ V, so f1(a) ≠ 0: we reach a contradiction. Therefore such an f cannot exist and $\overline{C} \subseteq (C^\perp)^\perp$.
Let $f \notin \overline{C}$, then $[f] \ne [0] \in A'/\overline{C}$, so by Theorem (4.5.14) we obtain a $g \in (A'/\overline{C})'$ for which g([f]) ≠ g([0]) = 0. This g in turn gives us a map $(A' \to K : f_1 \mapsto g([f_1])) \in A''$ which by Theorem (4.5.15) corresponds to a unique a ∈ A such that f1(a) = g([f1]) for all f1 ∈ A′. For any f1 ∈ C we have that f1(a) = g([f1]) = g([0]) = 0, so $a \in C^\perp$. On the other hand for our f, f(a) = g([f]) ≠ 0, so $f \notin (C^\perp)^\perp$. Therefore $\overline{C} = (C^\perp)^\perp$.

Chapter 5

Analysis

Now we are ready to introduce the notions of differentiation (approximating a given function near a given point as well as possible by a linear map) and integration, which makes it possible (Theorem (5.3.8)) to recover a function from its derivative and to approximate functions by a sum of 1-linear, 2-linear, . . . maps¹ as a generalisation of approximating a function by its derivative (this is done in Corollary (5.3.12)). We will also prove important and very useful existence theorems for Banach spaces: Theorem (5.5.8), Theorem (5.5.9), and Theorem (5.5.10).

5.1 Differentiation

Definition 5.1.1: Differentiability ($C^1(U, B)$)
Let A, B be locally convex Hausdorff topological vector spaces over K, and U ⊆ A open. Then we call a map f : U → B differentiable at a, for a ∈ U, if there exists a continuous map $D_a f : A \to B$ that is linear over K (called the derivative of f at a), and a map $\epsilon_{f,a} : (U - a) \to B$ such that for all a1 ∈ (U − a) we have

$$f(a + a_1) = f(a) + D_a f(a_1) + \epsilon_{f,a}(a_1) \qquad (5.1)$$

and for all a1 ∈ A,

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} = 0.$$

If f is differentiable at a for all a ∈ U, we say that f is differentiable on U. If f is differentiable on U and the map

$$Df : U \times A \to B : (a, a_1) \mapsto D_a f(a_1)$$

is continuous, we say that f is continuously differentiable on U (denoted by $f \in C^1(U, B)$).

We demand local convexity to ensure that all lines a + α a1 are contained in U for small enough α ∈ K if U is an open neighbourhood of a in A, as well as to ensure uniqueness of the approximation.
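The limit condition on the rest term in Definition (5.1.1) can be illustrated numerically in the simplest setting A = B = R². The following is only a sketch under assumptions that are not part of the text: the map `f`, its hand-computed derivative `derivative`, and the chosen point and direction are all hypothetical, and the Euclidean norm stands in for the seminorms of B.

```python
import math

def f(x, y):
    # Hypothetical smooth map R^2 -> R^2, chosen only for illustration.
    return (x * x + y, math.sin(x) * y)

def derivative(a, v):
    # D_a f(v): the Jacobian of f at a = (x, y) applied to v = (v1, v2).
    x, y = a
    v1, v2 = v
    return (2 * x * v1 + v2, math.cos(x) * y * v1 + math.sin(x) * v2)

def rest_term(a, a1):
    # eps_{f,a}(a1) = f(a + a1) - f(a) - D_a f(a1), as in Equation (5.1).
    fa = f(*a)
    fs = f(a[0] + a1[0], a[1] + a1[1])
    d = derivative(a, a1)
    return (fs[0] - fa[0] - d[0], fs[1] - fa[1] - d[1])

def rest_quotient_norm(a, a2, alpha):
    # Euclidean norm of eps_{f,a}(alpha * a2) / alpha.
    e = rest_term(a, (alpha * a2[0], alpha * a2[1]))
    return math.hypot(e[0], e[1]) / abs(alpha)

a = (0.5, -1.0)
a2 = (1.0, 2.0)
for alpha in (1e-1, 1e-2, 1e-3):
    # The printed quotients should shrink roughly linearly with alpha.
    print(alpha, rest_quotient_norm(a, a2, alpha))
```

The quotient shrinking with α is the numerical counterpart of the requirement that the rest term vanishes "faster than linearly"; it is of course no substitute for the limit over (α, a2) jointly.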

¹ The function's Taylor sequence.


Note that Equation (5.1) really is an expression of the fact that the linear map $D_a f$ is the best linear approximation of our function f near the point a. The 'rest term' $\epsilon_{f,a}(a_1)$ contains the part of f that goes to 0 in a 'faster than linear way' as a1 → 0.

Lemma 5.1.2: Differentiability implies continuity
Let A, B be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, f : U → B, and a ∈ U. If f is differentiable at a, then f is continuous at a. In particular, if $f \in C^1(U, B)$, then f is continuous.

Proof. Let V be any neighbourhood of 0 in B. As

$$\lim_{(\alpha, a_1) \to (0, 0)} \frac{\epsilon_{f,a}(\alpha a_1)}{\alpha} = 0,$$

there exist a δ ∈ ]0, 1[ and an open abc neighbourhood U1 of 0 in A such that for all $\alpha \in B_K(0, \delta)$ and a1 ∈ U1 we have $\epsilon_{f,a}(\alpha a_1) \in \alpha V$.
Let $a_2 \in \frac{\delta}{2} U_1$, then $a_2 = \frac{\delta}{2} a_1$ for some a1 ∈ U1, so $\epsilon_{f,a}(a_2) = \epsilon_{f,a}(\frac{\delta}{2} a_1) \in \frac{\delta}{2} V \subseteq V$ (as δ ∈ ]0, 1[).
Hence, for any open neighbourhood V of 0 in B there exists an open neighbourhood $\frac{\delta}{2} U_1$ of 0 in A such that for all $a_2 \in \frac{\delta}{2} U_1$ we have $\epsilon_{f,a}(a_2) \in V$. Therefore $\lim_{a_1 \to 0} \epsilon_{f,a}(a_1) = 0$. Using this we see that

$$\lim_{a_1 \to 0} f(a + a_1) = \lim_{a_1 \to 0} \big( f(a) + D_a f(a_1) + \epsilon_{f,a}(a_1) \big)$$
$$= f(a) + \lim_{a_1 \to 0} D_a f(a_1) + \lim_{a_1 \to 0} \epsilon_{f,a}(a_1)$$
$$= f(a) + D_a f(0) + 0 = f(a),$$

because all limits exist, addition is continuous, and $D_a f$ is continuous and linear. Hence f is continuous at a.

Example 5.1.3: Differentiability is a local property
Consider f : R → R from Example (2.1.15). Then f is differentiable at 0 ($\lim_{\alpha \to 0} \frac{1}{\alpha}(f(0 + \alpha) - f(0)) = 0$), but for all α ∈ R \ {0} the function f is not differentiable at α. So this function is differentiable in just a single point.

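Lemma 5.1.4: Uniqueness of the derivative
Let A, B be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, a ∈ U, f : U → B. If the map g : A → B given by

$$g(a_1) := \lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) \big)$$

is defined for all a1 ∈ A and is continuous and linear over K, then f is differentiable at a and $D_a f = g$. Furthermore, if f is differentiable at a, then for all a1 ∈ A,

$$D_a f(a_1) = \lim_{\alpha \to 0} \frac{1}{\alpha} \big( f(a + \alpha a_1) - f(a) \big).$$

In particular, the derivative is unique.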


Proof. Suppose that for all a1 ∈ A we have existence of the limit

$$g(a_1) = \lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) \big)$$

in B, and that g : A → B is continuous and linear over K. From the definition of g we have for any a1 ∈ U − a that

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) - \alpha\, g(a_2) \big) = 0,$$

so choosing for all a1 ∈ U − a the function

$$\epsilon_{f,a}(a_1) := f(a + a_1) - f(a) - g(a_1)$$

we see that $f(a + a_1) = f(a) + g(a_1) + \epsilon_{f,a}(a_1)$ and

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} = 0.$$

Hence f is differentiable at a.
Let $D_a f : A \to B$ satisfy Equation (5.1) (existence of at least one such map is guaranteed by the fact that f is differentiable at a). Because A is locally convex there exists an open abc neighbourhood U1 of 0 in A such that U1 + a ⊆ U. Let a1 ∈ A, then because $\lim_{\alpha \to 0} \alpha a_1 = 0$ there exists a δ ∈ ]0, ∞[ such that for all $\alpha \in B_K(0, \delta)$ we have α a1 ∈ U1 and hence

$$f(a + \alpha a_1) = f(a) + D_a f(\alpha a_1) + \epsilon_{f,a}(\alpha a_1),$$

so for all α ≠ 0, |α| < δ (use that $D_a f$ is linear),

$$\frac{\epsilon_{f,a}(\alpha a_1)}{\alpha} = \frac{1}{\alpha} \big( f(a + \alpha a_1) - f(a) \big) - D_a f(a_1)$$

and as $\lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} = 0$ we obtain

$$D_a f(a_1) = \lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) \big).$$

Therefore (A is Hausdorff, Lemma (2.2.6)) $D_a f(a_1) = g(a_1)$ for all a1 ∈ A, so $D_a f = g$ and the derivative is unique. The last statement now follows from

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) \big) = \lim_{\alpha \to 0} \frac{1}{\alpha} \big( f(a + \alpha a_1) - f(a) \big),$$

because the limit on the left hand side exists whenever f is differentiable at a.

Because this definition of differentiability is different from the usual Fréchet or Gâteaux derivative (see for example [Ham1982]), we need to check how it compares to these types of differentiation. Clearly the demands of Definition (5.1.1) are stronger than those made of the Gâteaux derivative (which need not even be linear), but how it compares to the Fréchet derivative is not immediately clear.


Theorem 5.1.5: Compatible differentiation
Let A be a normed space over K, B a locally convex Hausdorff topological vector space over K, U ⊆ A open, f : U → B, and a ∈ U. If there exists a continuous map g : A → B, linear over K, such that

$$\lim_{a_1 \to 0} \frac{1}{\|a_1\|_A} \big( f(a + a_1) - f(a) - g(a_1) \big) = 0,$$

then f is differentiable at a and $D_a f = g$.

Proof. By Lemma (4.5.7), A is a locally convex Hausdorff topological vector space over K. Fix a1 ∈ A and let V be an open abc neighbourhood of 0 in B. Then because A is normed and because of the above limit, there exists a δ1 ∈ ]0, ∞[ such that for all $a_2 \in B_A(0, \delta_1)$ we have

$$f(a + a_2) - f(a) - g(a_2) \in \|a_2\|_A \, \frac{1}{1 + \|a_1\|_A} \, V.$$

Choose $\delta := \delta_1 / (1 + \|a_1\|_A)$ and $U := B_A(a_1, 1)$, which is an open neighbourhood of a1 in A. Then for all $\alpha \in B_K(0, \delta)$, α ≠ 0 and a2 ∈ U we have $\|a_2\|_A < 1 + \|a_1\|_A$ and therefore $\|\alpha a_2\|_A = |\alpha| \, \|a_2\|_A < \delta \, (1 + \|a_1\|_A) = \delta_1$, so (use Lemma (4.3.5) and the fact that V is abc)

$$f(a + \alpha a_2) - f(a) - g(\alpha a_2) \in \|\alpha a_2\|_A \, \frac{1}{1 + \|a_1\|_A} \, V \subseteq |\alpha| \, (1 + \|a_1\|_A) \, \frac{1}{1 + \|a_1\|_A} \, V = \alpha \, \frac{|\alpha|}{\alpha} \, V = \alpha V.$$

So for all a1 ∈ A and open abc neighbourhoods V of 0 in B, there exist a δ ∈ ]0, ∞[ and an open neighbourhood U of a1 in A such that for $\alpha \in B_K(0, \delta)$, α ≠ 0 and a2 ∈ U we have

$$\frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) - g(\alpha a_2) \big) \in V.$$

Hence for all a1 ∈ A

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) - g(\alpha a_2) \big) = 0.$$

As g is continuous and linear we therefore find that f is differentiable at a. By Lemma (5.1.4) we furthermore see that $D_a f = g$.

We immediately obtain the following consequence if B is also a normed space, which shows that the derivative defined in Definition (5.1.1) is at least as general as the notion of Fréchet differentiability (see Corollary (5.5.7)).

Corollary 5.1.6: Compatibility on normed spaces
Let A, B be normed spaces over K, U ⊆ A open, f : U → B. Let a ∈ U. If there exists a continuous map g : A → B, linear over K, such that

$$\lim_{a_1 \to 0} \frac{\|f(a + a_1) - f(a) - g(a_1)\|_B}{\|a_1\|_A} = 0,$$

then f is differentiable at a and $D_a f = g$.


For paths (functions from an interval in R to A), we can do even better.

Corollary 5.1.7: Compatibility for paths
Let A be a locally convex Hausdorff topological vector space over K and S ⊆ R an open interval, f : S → A. Let α ∈ S. The limit

$$f'(\alpha) = \lim_{\beta \to \alpha} \frac{f(\beta) - f(\alpha)}{\beta - \alpha}$$

exists if and only if f is differentiable at α, in which case

$$D_\alpha f(\beta) = \beta \, f'(\alpha)$$

for all β ∈ R. In particular, this expression shows that if the limit f′(α) exists for all α ∈ S, then S → A : α ↦ f′(α) is continuous if and only if $f \in C^1(S, A)$.

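Proof. Note that even though A may be a vector space over C while R itself is a vector space over R, this does not give any problems for linearity of the involved maps since R ≤ C, so we may consider A as a vector space over R by taking the restriction of scalar multiplication to R × A ⊆ C × A.
Let α ∈ S. Suppose that the limit for f′(α) exists, then

$$0 = f'(\alpha) - f'(\alpha) = \lim_{\beta \to \alpha} \left( \frac{f(\beta) - f(\alpha)}{\beta - \alpha} - \frac{(\beta - \alpha) f'(\alpha)}{\beta - \alpha} \right) = \lim_{\beta \to 0} \frac{1}{\beta} \big( f(\alpha + \beta) - f(\alpha) - \beta f'(\alpha) \big),$$

so using the fact that we can choose balanced neighbourhoods of 0 in A and the expression can only change sign as 1/β = ±1/|β|, we find

$$\lim_{\beta \to 0} \frac{1}{|\beta|} \big( f(\alpha + \beta) - f(\alpha) - \beta f'(\alpha) \big) = 0.$$

Hence we can apply Theorem (5.1.5) with the continuous linear map R → A : β ↦ β f′(α) to conclude that f is differentiable at α.
Suppose conversely that f is differentiable at α, then for 1 ∈ R we have

$$\lim_{(\beta, \gamma) \to (0, 1)} \frac{1}{\beta} \big( f(\alpha + \beta \gamma) - f(\alpha) - D_\alpha f(\beta \gamma) \big) = 0.$$

Let U be an open abc neighbourhood of 0 in A, then because of the above limit there exists a δ ∈ ]0, ∞[ such that |β| < δ and |γ − 1| < δ imply

$$f(\alpha + \beta \gamma) - f(\alpha) - \beta \gamma \, D_\alpha f(1) \in \beta U.$$

In particular for γ = 1 and |β| < δ, β ≠ 0,

$$\frac{1}{\beta} \big( f(\alpha + \beta) - f(\alpha) \big) - D_\alpha f(1) \in U.$$

Hence

$$\lim_{\beta \to 0} \left( \frac{1}{\beta} \big( f(\alpha + \beta) - f(\alpha) \big) - D_\alpha f(1) \right) = 0,$$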


so

$$D_\alpha f(1) = \lim_{\beta \to \alpha} \frac{f(\beta) - f(\alpha)}{\beta - \alpha}.$$

Therefore the limit f′(α) exists and is equal to $D_\alpha f(1)$.

Now that we have established that Definition (5.1.1) is reasonable, we can determine the properties of the derivative of a function in Theorem (5.1.8) (compare with Theorem (2.1.28)).

Theorem 5.1.8: Operations preserving differentiability
Let A, B be locally convex Hausdorff topological vector spaces over K.

Composition: let C be a locally convex Hausdorff topological vector space over K, U ⊆ A open, V ⊆ B open, f : U → B, g : V → C, a ∈ U, f(a) ∈ V. If f is differentiable at a and g is differentiable at f(a), then g ∘ f is differentiable at a and

$$D_a(g \circ f) = D_{f(a)} g \circ D_a f.$$

In particular if $f \in C^1(U, B)$ and $g \in C^1(V, C)$ with f(U) ⊆ V, then $g \circ f \in C^1(U, C)$.

Addition: let U ⊆ A open, a ∈ U, f : U → B, g : U → B. If f and g are differentiable at a, then g + f is differentiable at a and

$$D_a(g + f) = D_a g + D_a f.$$

In particular if $f, g \in C^1(U, B)$, then $f + g \in C^1(U, B)$.

Scaling: let U ⊆ A open, a ∈ U, f : U → B, α ∈ K. If f is differentiable at a, then α f is differentiable at a and

$$D_a(\alpha f) = \alpha \, D_a f.$$

In particular for $f \in C^1(U, B)$ and α ∈ K, we have $\alpha f \in C^1(U, B)$.

Glueing: let U1, U2 ⊆ A be both open, $f_1 \in C^1(U_1, B)$, $f_2 \in C^1(U_2, B)$. If $f_1|_{U_1 \cap U_2} = f_2|_{U_1 \cap U_2}$, then there exists a unique $f \in C^1(U, B)$, where U := U1 ∪ U2, such that $f|_{U_1} = f_1$ and $f|_{U_2} = f_2$.

Restricting domain: let U ⊆ A open, and $f \in C^1(U, B)$. Then for any U1 ⊆ U open, $f|_{U_1} \in C^1(U_1, B)$.

Constantness: let U ⊆ A abc open, f : U → B. Then there exists a b ∈ B such that f(a) = b for all a ∈ U if and only if $f \in C^1(U, B)$ and $D_a f = 0$ for all a ∈ U.

Linearity: let f : A → B be continuous and linear over K. Then $f \in C^1(A, B)$ and $D_a f = f$ for all a ∈ A. If conversely $f \in C^1(A, B)$ and there exists a continuous map g : A → B, linear over K, such that $D_a f = g$ for all a ∈ A, then there exists a b ∈ B such that f(a) = g(a) + b for all a ∈ A.
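As a sanity check of the composition rule in Theorem (5.1.8), one can compare the formula against a difference quotient in a finite-dimensional case. The sketch below assumes A = B = R² and C = R; the maps `f` and `g` and their hand-computed derivatives `Df` and `Dg` are hypothetical examples, not taken from the text.

```python
import math

def f(p):
    # Hypothetical map R^2 -> R^2.
    x, y = p
    return (x * y, x + math.exp(y))

def Df(p, v):
    # D_p f(v), computed by hand from the definition of f above.
    x, y = p
    v1, v2 = v
    return (y * v1 + x * v2, v1 + math.exp(y) * v2)

def g(q):
    # Hypothetical map R^2 -> R.
    u, w = q
    return math.sin(u) + w * w

def Dg(q, v):
    # D_q g(v), computed by hand from the definition of g above.
    u, w = q
    v1, v2 = v
    return math.cos(u) * v1 + 2 * w * v2

def chain_rule(a, v):
    # D_a(g o f)(v) = D_{f(a)} g (D_a f(v)).
    return Dg(f(a), Df(a, v))

def difference_quotient(a, v, alpha=1e-6):
    # (1/alpha) * ((g o f)(a + alpha v) - (g o f)(a)) for a small fixed alpha.
    shifted = (a[0] + alpha * v[0], a[1] + alpha * v[1])
    return (g(f(shifted)) - g(f(a))) / alpha

a, v = (0.3, 0.7), (1.0, -2.0)
print(chain_rule(a, v), difference_quotient(a, v))
```

The two printed numbers should agree up to the discretisation error of the difference quotient, which is of order α for this one-sided quotient.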


Proof. We will cover this item by item.

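Composition: let C be a locally convex Hausdorff topological vector space over K, U ⊆ A open, V ⊆ B open, f : U → B, g : V → C, a ∈ U, f(a) ∈ V. Suppose f is differentiable at a and g is differentiable at f(a), then for a1 ∈ U − a we have (use that $D_{f(a)} g$ is linear)

$$(g \circ f)(a + a_1) = g(f(a + a_1))$$
$$= g\big(f(a) + (D_a f(a_1) + \epsilon_{f,a}(a_1))\big)$$
$$= g(f(a)) + D_{f(a)} g\big(D_a f(a_1) + \epsilon_{f,a}(a_1)\big) + \epsilon_{g,f(a)}\big(D_a f(a_1) + \epsilon_{f,a}(a_1)\big)$$
$$= (g \circ f)(a) + (D_{f(a)} g \circ D_a f)(a_1) + \epsilon_{g \circ f,a}(a_1)$$

where we have defined

$$\epsilon_{g \circ f,a}(a_1) := D_{f(a)} g\big(\epsilon_{f,a}(a_1)\big) + \epsilon_{g,f(a)}\big(D_a f(a_1) + \epsilon_{f,a}(a_1)\big).$$

Now for any a2 ∈ A, α ∈ K, α ≠ 0 sufficiently small we obtain (use that $D_a f$ and $D_{f(a)} g$ are linear, and Equation (5.1))

$$\frac{\epsilon_{g \circ f,a}(\alpha a_2)}{\alpha} = D_{f(a)} g\left( \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} \right) + \frac{1}{\alpha} \, \epsilon_{g,f(a)}\left( \alpha \left( D_a f(a_2) + \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} \right) \right).$$

As $D_a f$ and $D_{f(a)} g$ are continuous and for any a1 ∈ A, b1 ∈ B

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} = \lim_{(\alpha, b_2) \to (0, b_1)} \frac{\epsilon_{g,f(a)}(\alpha b_2)}{\alpha} = 0,$$

we obtain with Lemma (2.1.10) that for any a1 ∈ A

$$\lim_{(\alpha, a_2) \to (0, a_1)} D_{f(a)} g\left( \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} \right) = \lim_{a_2 \to 0} D_{f(a)} g(a_2) = D_{f(a)} g(0) = 0,$$

$$\lim_{(\alpha, a_2) \to (0, a_1)} \left( D_a f(a_2) + \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} \right) = \lim_{a_2 \to a_1} D_a f(a_2) + \lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} = D_a f(a_1) + 0.$$

Hence for all a1 ∈ A,

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{g \circ f,a}(\alpha a_2)}{\alpha} = 0 + \lim_{(\alpha, b_2) \to (0, D_a f(a_1))} \frac{1}{\alpha} \, \epsilon_{g,f(a)}(\alpha b_2) = 0.$$

Together with the fact that $D_{f(a)} g \circ D_a f$ is continuous and linear over K, as both $D_{f(a)} g$ and $D_a f$ have these properties, we see that g ∘ f is differentiable at a. As the derivative is unique by Lemma (5.1.4), we furthermore obtain $D_a(g \circ f) = D_{f(a)} g \circ D_a f$.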


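Addition: let U ⊆ A open, a ∈ U, f : U → B, g : U → B. Suppose f and g are differentiable at a, then in the same way as for composition we find for a1 ∈ U − a that

$$(g + f)(a + a_1) = g(a + a_1) + f(a + a_1)$$
$$= g(a) + D_a g(a_1) + \epsilon_{g,a}(a_1) + f(a) + D_a f(a_1) + \epsilon_{f,a}(a_1)$$
$$= (g + f)(a) + (D_a g + D_a f)(a_1) + \epsilon_{g+f,a}(a_1)$$

where we defined

$$\epsilon_{g+f,a}(a_1) := \epsilon_{g,a}(a_1) + \epsilon_{f,a}(a_1).$$

Now for any a1 ∈ A we have

$$\lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{g+f,a}(\alpha a_2)}{\alpha} = \lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{g,a}(\alpha a_2)}{\alpha} + \lim_{(\alpha, a_2) \to (0, a_1)} \frac{\epsilon_{f,a}(\alpha a_2)}{\alpha} = 0 + 0 = 0.$$

Clearly $D_a g + D_a f$ is continuous and linear over K, so g + f is differentiable at a and by uniqueness $D_a(g + f) = D_a g + D_a f$.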

Scaling: let U ⊆ A open, a ∈ U, f : U → B, α ∈ K. Suppose f is differentiable at a, then for any a1 ∈ U − a

$$(\alpha f)(a + a_1) = \alpha \, f(a + a_1)$$
$$= \alpha \, \big(f(a) + D_a f(a_1) + \epsilon_{f,a}(a_1)\big)$$
$$= (\alpha f)(a) + (\alpha \, D_a f)(a_1) + \epsilon_{\alpha f,a}(a_1)$$

where we defined $\epsilon_{\alpha f,a}(a_1) := \alpha \, \epsilon_{f,a}(a_1)$. For any a1 ∈ A we have

$$\lim_{(\beta, a_2) \to (0, a_1)} \frac{\epsilon_{\alpha f,a}(\beta a_2)}{\beta} = \alpha \cdot 0 = 0.$$

Clearly $\alpha \, D_a f$ is continuous and linear over K, so α f is differentiable at a and by uniqueness $D_a(\alpha f) = \alpha \, D_a f$.

Glueing, restricting domain: both follow directly from the fact that the definition of differentiability is only made on an (arbitrarily small) open neighbourhood of a given point and hence is local.

Constantness: let U ⊆ A abc open, f : U → B. Suppose f(a) = b for all a ∈ U. Let a ∈ U, a1 ∈ U − a, then

$$f(a + a_1) = b = f(a) + 0(a_1) + 0.$$

Since the zero map is continuous and linear, f is differentiable at a and $D_a f = 0$. Now the map $U \times A \to B : (a, a_1) \mapsto D_a f(a_1) = 0$ is constant and hence continuous, so $f \in C^1(U, B)$.
Suppose conversely that $f \in C^1(U, B)$ and $D_a f = 0$ for all a ∈ U. Let a ∈ U and a1 ∈ A. As U is abc, we can for any g ∈ B′ define for sufficiently small δ ∈ ]0, ∞[

$$h : \,]-\delta, \delta[\, \to K : \alpha \mapsto g(f(a + \alpha a_1))$$


which is continuous as a composition of continuous maps. Since f is differentiable at a, we have by Lemma (5.1.4) (use that g is continuous and linear)

$$\lim_{\alpha \to 0} \frac{h(\alpha) - h(0)}{\alpha - 0} = \lim_{\alpha \to 0} \frac{1}{\alpha} \big( g(f(a + \alpha a_1)) - g(f(a)) \big)$$
$$= \lim_{\alpha \to 0} g\left( \frac{1}{\alpha} \big( f(a + \alpha a_1) - f(a) \big) \right)$$
$$= g\left( \lim_{\alpha \to 0} \frac{1}{\alpha} \big( f(a + \alpha a_1) - f(a) \big) \right)$$
$$= g(D_a f(a_1)).$$

So the limit $h'(0) = g(D_a f(a_1)) = g(0) = 0$ exists and is equal to zero. Therefore, for any g ∈ B′ and a1, a2 ∈ U the continuous map (well-defined because U is convex)

$$i : [0, 1] \to K : \alpha \mapsto g(f(a_1 + \alpha (a_2 - a_1)))$$

is differentiable on ]0, 1[ with derivative equal to zero (consider for each α ∈ ]0, 1[ the point a = a1 + α (a2 − a1) and direction a2 − a1, now take the derivative of β ↦ g(f(a + β (a2 − a1))) as above). In the case that K = R we have (i : [0, 1] → R continuous, $i|_{]0,1[}$ differentiable) by the mean value theorem that there exists an α ∈ ]0, 1[ such that i(1) − i(0) = i′(α)(1 − 0) = 0. Hence i(0) = i(1). In the case that K = C, apply the mean value theorem to Re i and Im i separately. This gives g(f(a1)) = i(0) = i(1) = g(f(a2)). Since this is true for all g ∈ B′ and a1, a2 ∈ U, Theorem (4.5.14) implies that f(a1) = f(a2) for all a1, a2 ∈ U: f is constant.

Linearity: let f : A → B be continuous and linear over K. Then for any a ∈ A, a1 ∈ A − a = A we have

$$f(a + a_1) = f(a) + f(a_1) = f(a) + f(a_1) + 0.$$

As f is continuous and linear over K, f is differentiable at a and by uniqueness $D_a f = f$. Furthermore, the map $A \times A \to B : (a, a_1) \mapsto D_a f(a_1) = f(a_1)$ is continuous, so $f \in C^1(A, B)$.
Suppose that $f \in C^1(A, B)$ and let g : A → B be continuous and linear over K such that $D_a f = g$ for all a ∈ A. Then the map h : A → B : a ↦ f(a) − g(a) satisfies $D_a h = D_a f - g = g - g = 0$ for all a ∈ A, so there exists a b ∈ B such that h(a) = b for all a ∈ A. Hence f(a) = h(a) + g(a) = g(a) + b for all a ∈ A.

Definition 5.1.9: Higher order derivatives ($C^k(U, B)$)
Let A, B be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, and a ∈ U. Let k ∈ N0.
Suppose k = 0, then continuity of f on U is denoted by $f \in C^0(U, B)$ and if this is the case we write $D^0 f : U \to B : a \mapsto D^0_a f := f(a)$.
Suppose k = 1, then we say f is 1 time differentiable at a if f is differentiable at a. If $f \in C^1(U, B)$, we denote $D^1 f : U \times A \to B : (a, a_1) \mapsto D^1_a f(a_1) := D_a f(a_1)$.


Suppose k ≥ 2, then we inductively define f to be k times differentiable at a if f is k − 1 times differentiable at a and there exists a continuous map $D^k_a f : A^k \to B$, k-linear over K, such that for all a1, . . . , ak−1 ∈ A the map

$$U \to B : a' \mapsto D^{k-1}_{a'} f(a_1, \ldots, a_{k-1})$$

is differentiable at a with derivative $A \to B : a_k \mapsto D^k_a f(a_1, \ldots, a_{k-1}, a_k)$.
Suppose k ≥ 2, then we inductively define f to be k times continuously differentiable on U (denoted by $f \in C^k(U, B)$) if $f \in C^{k-1}(U, B)$, and the map

$$D^k f : U \times A^k \to B : (a, a_1, \ldots, a_k) \mapsto D^k_a f(a_1, \ldots, a_k)$$

is continuous.
We say f is smooth on U (denoted by $f \in C^\infty(U, B)$) if $f \in C^k(U, B)$ for all k ∈ N.
We say f is analytic on U (denoted by $f \in C^\omega(U, B)$) if $f \in C^\infty(U, B)$ and for each a ∈ U there exists an open neighbourhood U1 of a in U such that for all a1 ∈ U1 − a we have

$$f(a + a_1) = \sum_{k=0}^{\infty} \frac{1}{k!} \, D^k_a f(\underbrace{a_1, \ldots, a_1}_{k}).$$

Note that this definition enables us to apply Theorem (5.1.8) to k times differentiable functions as well. The definition of analyticity is motivated by Theorem (5.3.11): a function is analytic if its Taylor series locally gives a complete description of the function.

Example 5.1.10: $C^k(U, B)$ is a vector space over K
Let A, B be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, k ∈ N. Since addition and scalar multiplication on A and B are continuous, we have by Theorem (2.1.28) and Theorem (5.1.8) that for any $f, g \in C^k(U, B)$, α ∈ K, $f + \alpha g \in C^k(U, B)$. This makes $C^k(U, B)$ a vector space over K. Furthermore, from Definition (5.1.9) we know that $C^0(U, B) \supseteq C^1(U, B) \supseteq \ldots$, therefore, as vector spaces over K, we have $C^0(U, B) \ge C^1(U, B) \ge \ldots \ge C^k(U, B) \ge \ldots$.

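Lemma 5.1.11
Let k ∈ N, and let A, B1, . . . , Bk be locally convex Hausdorff topological vector spaces over K, U ⊆ A open. Let $f : U \to B_1 \times \ldots \times B_k : a \mapsto (f_1(a), \ldots, f_k(a))$. Then for a ∈ U, f is differentiable at a if and only if for all 1 ≤ l ≤ k the map $f_l : U \to B_l$ is differentiable at a. If this is the case, then for all a1 ∈ A

$$D_a f(a_1) = (D_a f_1(a_1), \ldots, D_a f_k(a_1)).$$

In particular, $f \in C^l(U, B_1 \times \ldots \times B_k)$ if and only if for all 1 ≤ m ≤ k, $f_m \in C^l(U, B_m)$.

Proof. Suppose f is differentiable at a and let 1 ≤ l ≤ k. Write $g_l : B_1 \times \ldots \times B_k \to B_l : (b_1, \ldots, b_k) \mapsto b_l$, then $g_l \in C^1(B_1 \times \ldots \times B_k, B_l)$ since $g_l$ is continuous and linear over K (use Theorem (5.1.8)). Hence $g_l$ is differentiable at f(a) and therefore by Theorem (5.1.8) $f_l = g_l \circ f$ is differentiable at a.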


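Suppose conversely that for all 1 ≤ l ≤ k, $f_l$ is differentiable at a. Then for a2 ∈ A and α ∈ K small enough but nonzero we have

$$\frac{1}{\alpha} \big( f(a + \alpha a_2) - f(a) \big) = \left( \frac{1}{\alpha} \big( f_1(a + \alpha a_2) - f_1(a) \big), \ldots, \frac{1}{\alpha} \big( f_k(a + \alpha a_2) - f_k(a) \big) \right)$$

so letting (α, a2) → (0, a1) we see that, as all $f_l$ are differentiable at a, we have by Lemma (5.1.4) that f is differentiable at a with $D_a f(a_1) = (D_a f_1(a_1), \ldots, D_a f_k(a_1))$.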

Definition 5.1.12: Partial derivative
Let k ∈ N, and let A1, . . . , Ak, B be locally convex Hausdorff topological vector spaces over K, U1 ⊆ A1, . . . , Uk ⊆ Ak open. Define U := U1 × . . . × Uk and let $f \in C^1(U, B)$.
Then we define for 1 ≤ l ≤ k, (a1, . . . , ak) ∈ U and $a' \in A_l$ the l-th partial derivative of f in direction of a′ at (a1, . . . , ak) as

$$\left( \frac{\partial}{\partial a_l} f(a_1, \ldots, a_k) \right)(a') := \frac{\partial f(a_1, \ldots, a_k)}{\partial a_l}(a') := D_{a_l} g(a')$$

where $g : U_l \to B$ is given by $g(a) := f(a_1, \ldots, a_{l-1}, a, a_{l+1}, \ldots, a_k)$. In the case that for certain 1 ≤ l ≤ k, $A_l = K$ we abbreviate

$$\frac{\partial}{\partial a_l} f(a_1, \ldots, a_k) := \frac{\partial f(a_1, \ldots, a_k)}{\partial a_l} := D_{a_l} g(1).$$

For higher order derivatives we define for m > 1, $f \in C^m(U, B)$, i1, . . . , im ∈ {1, . . . , k}, and $a'_1 \in A_{i_1}, \ldots, a'_m \in A_{i_m}$ that

$$\frac{\partial^m f(a_1, \ldots, a_k)}{\partial a_{i_m} \cdots \partial a_{i_1}}(a'_1, \ldots, a'_m) := \frac{\partial}{\partial a_{i_m}} \left( \frac{\partial^{m-1} f(a_1, \ldots, a_k)}{\partial a_{i_{m-1}} \cdots \partial a_{i_1}}(a'_1, \ldots, a'_{m-1}) \right)(a'_m).$$

This expression is symmetric in permutations $i_{\pi(1)}, \ldots, i_{\pi(m)}$, $a'_{\pi(1)}, \ldots, a'_{\pi(m)}$ for $\pi \in S_m$ by Theorem (5.1.16).

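Lemma 5.1.13: Sum rule
Let k ∈ N, and let A1, . . . , Ak, B be locally convex Hausdorff topological vector spaces over K, U1 ⊆ A1, . . . , Uk ⊆ Ak open. Define U := U1 × . . . × Uk and let $f \in C^1(U, B)$.
Then for each 1 ≤ l ≤ k the map $U \times A_l \to B$ given by

$$(a_1, \ldots, a_k, a'_l) \mapsto \frac{\partial f(a_1, \ldots, a_k)}{\partial a_l}(a'_l)$$

is continuous. Furthermore, for any (a1, . . . , ak) ∈ U, $a'_1 \in A_1, \ldots, a'_k \in A_k$ we have

$$D_{(a_1, \ldots, a_k)} f(a'_1, \ldots, a'_k) = \sum_{l=1}^{k} \frac{\partial f(a_1, \ldots, a_k)}{\partial a_l}(a'_l). \qquad (5.2)$$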
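Equation (5.2) can be checked numerically in the case A1 = A2 = B = R. The sketch below is an illustration under assumptions: the function `f`, the helper names, and the use of central difference quotients in place of the exact limits are all choices made here and do not come from the text.

```python
import math

def f(x, y):
    # Hypothetical C^1 function on R x R.
    return x * x * y + math.sin(y)

def directional(fun, point, direction, alpha=1e-5):
    # Central difference quotient approximating D_point fun(direction).
    plus = [p + alpha * d for p, d in zip(point, direction)]
    minus = [p - alpha * d for p, d in zip(point, direction)]
    return (fun(*plus) - fun(*minus)) / (2 * alpha)

def total_derivative(point, direction):
    # Approximates D_{(a_1, a_2)} f(a'_1, a'_2).
    return directional(f, point, direction)

def partial(l, point, direction_l):
    # Approximates the l-th partial derivative (0-based index l) applied
    # to direction_l, by varying only the l-th argument.
    direction = [0.0] * len(point)
    direction[l] = direction_l
    return directional(f, point, direction)

point, direction = (1.5, -0.4), (2.0, 3.0)
lhs = total_derivative(point, direction)
rhs = partial(0, point, direction[0]) + partial(1, point, direction[1])
print(lhs, rhs)
```

Both sides approximate the same number, which for this particular `f` is 2xy·a′₁ + (x² + cos y)·a′₂ evaluated at the chosen point.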


Proof. Let us for each (a1, . . . , ak) ∈ U, 1 ≤ l ≤ k use the notation

$$(\hat{a}_l) := (a_1, \ldots, a_{l-1}, a_{l+1}, \ldots, a_k)$$

and define the map $g_l^{(\hat{a}_l)} : U_l \to B$ by

$$g_l^{(\hat{a}_l)}(a) := f(a_1, \ldots, a_{l-1}, a, a_{l+1}, \ldots, a_k).$$

Suppose $f \in C^1(U, B)$, let 1 ≤ l ≤ k, (a1, . . . , ak) ∈ U, and $a''_l \in A_l$. Then for α ∈ K small enough but nonzero

$$\frac{1}{\alpha} \big( g_l^{(\hat{a}_l)}(a_l + \alpha a''_l) - g_l^{(\hat{a}_l)}(a_l) \big) = \frac{1}{\alpha} \big( f(a_1, \ldots, a_l + \alpha a''_l, \ldots, a_k) - f(a_1, \ldots, a_l, \ldots, a_k) \big).$$

So taking for $a'_l \in A_l$ the limit $(\alpha, a''_l) \to (0, a'_l)$, this expression goes to (as $f \in C^1(U, B)$)

$$D_{(a_1, \ldots, a_k)} f(0, \ldots, a'_l, \ldots, 0)$$

which is continuous and linear over K as a function of $a'_l$. Hence, by Lemma (5.1.4), we have that $g_l^{(\hat{a}_l)}$ is differentiable at $a_l$ and

$$D_{a_l} g_l^{(\hat{a}_l)}(a'_l) = D_{(a_1, \ldots, a_k)} f(0, \ldots, a'_l, \ldots, 0).$$

Because $f \in C^1(U, B)$ we see that this expression is continuous as a function of $(a_1, \ldots, a_k, a'_l)$. Now Definition (5.1.12) gives

$$\frac{\partial f(a_1, \ldots, a_k)}{\partial a_l}(a'_l) = D_{a_l} g_l^{(\hat{a}_l)}(a'_l)$$

which yields the desired continuity of the partial derivatives of f.

We also obtain from the linearity of $D_{(a_1, \ldots, a_k)} f$ that

$$D_{(a_1, \ldots, a_k)} f(a'_1, \ldots, a'_k) = \sum_{l=1}^{k} D_{(a_1, \ldots, a_k)} f(0, \ldots, a'_l, \ldots, 0) = \sum_{l=1}^{k} D_{a_l} g_l^{(\hat{a}_l)}(a'_l) = \sum_{l=1}^{k} \frac{\partial f(a_1, \ldots, a_k)}{\partial a_l}(a'_l)$$

which gives Equation (5.2).

Definition 5.1.14: Diffeomorphism
Let A, B be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, V ⊆ B open, and k ∈ N. Then a map f : U → V is called a $C^k$ diffeomorphism if f is bijective, $f \in C^k(U, B)$, and $f^{-1} \in C^k(V, A)$. If there exists a $C^k$ diffeomorphism f : U → V, then we call U and V $C^k$ diffeomorphic.


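Lemma 5.1.15
Let A, B be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, and V ⊆ B open. Suppose f : U → V is a $C^k$ diffeomorphism, then for all a ∈ U we have

$$[D_a f]^{-1} = D_{f(a)} f^{-1}.$$

In particular for all a ∈ U, $D_a f : A \to B$ is a continuous linear isomorphism (with continuous linear inverse).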

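Proof. Let f : U → V be a $C^k$ diffeomorphism and a ∈ U. Then $f^{-1} \circ f = \mathrm{id}_U$, so by Theorem (5.1.8) we have for all a1 ∈ A that $a_1 = D_a \mathrm{id}_U(a_1) = D_a(f^{-1} \circ f)(a_1) = D_{f(a)} f^{-1}(D_a f(a_1))$. Similarly, by considering $f \circ f^{-1} = \mathrm{id}_V$, for all b1 ∈ B, $b_1 = D_a f(D_{f(a)} f^{-1}(b_1))$. Hence $[D_a f]^{-1} = D_{f(a)} f^{-1}$.

A converse to Lemma (5.1.15) is given in Theorem (5.5.8).

Theorem 5.1.16: Symmetry of higher order derivatives
Let A, B be locally convex Hausdorff topological vector spaces over K, k ∈ N, U ⊆ A open. Then for any $f \in C^k(U, B)$ we have for all a ∈ U and a1, . . . , ak ∈ A that for any $\pi \in S_k$,

$$D^k_a f(a_{\pi(1)}, \ldots, a_{\pi(k)}) = D^k_a f(a_1, \ldots, a_k).$$

Proof. Suppose k = 2. Let $f \in C^2(U, B)$, a ∈ U, a1, a2 ∈ A. Let g ∈ B′ be arbitrary. The function $K^2 \to A : (\alpha, \beta) \mapsto a + \alpha a_1 + \beta a_2$ is continuous and U is an open neighbourhood of a. Hence there exists a δ ∈ ]0, ∞[ such that for all $\alpha, \beta \in B_K(0, \delta)$ we have a + α a1 + β a2 ∈ U. Let $h : B_K(0, \delta) \times B_K(0, \delta) \to K$ be given by h(α, β) := g(f(a + α a1 + β a2)). Then as g is continuous and linear we have by Theorem (5.1.8) and Corollary (5.1.7) that

$$\frac{\partial^2 h(\alpha, \beta)}{\partial \alpha \, \partial \beta} = \frac{\partial}{\partial \alpha} \Big( D_{f(a + \alpha a_1 + \beta a_2)} g \big( D_{a + \alpha a_1 + \beta a_2} f(a_2) \big) \Big)$$
$$= \frac{\partial}{\partial \alpha} \Big( g \big( D_{a + \alpha a_1 + \beta a_2} f(a_2) \big) \Big)$$
$$= \ldots \text{ follow the same procedure } \ldots$$
$$= g \big( D^2_{a + \alpha a_1 + \beta a_2} f(a_2, a_1) \big)$$

and similar expressions for $\frac{\partial^2 h(\alpha, \beta)}{\partial \beta \, \partial \alpha}$, $\frac{\partial^2 h(\alpha, \beta)}{\partial \alpha \, \partial \alpha}$, and $\frac{\partial^2 h(\alpha, \beta)}{\partial \beta \, \partial \beta}$. Therefore, as $f \in C^2(U, B)$, h is a twice partially differentiable map $K^2 \to K$ with continuous partial derivatives, and hence, from analysis on K, we know that

$$g \big( D^2_{a + \alpha a_1 + \beta a_2} f(a_2, a_1) \big) = \frac{\partial^2 h(\alpha, \beta)}{\partial \alpha \, \partial \beta} = \frac{\partial^2 h(\alpha, \beta)}{\partial \beta \, \partial \alpha} = g \big( D^2_{a + \alpha a_1 + \beta a_2} f(a_1, a_2) \big).$$

As this is true for all g ∈ B′ we find at α = β = 0 with Theorem (4.5.14) that

$$D^2_a f(a_2, a_1) = D^2_a f(a_1, a_2)$$

for all a1, a2 ∈ A. For k > 2 the proof follows likewise (construct a function $h(\alpha_1, \ldots, \alpha_k) = g(f(a + \alpha_1 a_1 + \ldots + \alpha_k a_k))$ for g ∈ B′ and proceed in the same way).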
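The symmetry statement for k = 2 can be observed numerically by following Definition (5.1.9) literally: differentiate the map a′ ↦ D_{a′}f(a1) in the direction a2, and compare with the roles of a1 and a2 swapped. The sketch below assumes A = R², B = R; the function (given only through its hand-computed first derivative `Df`) and the central difference quotient are illustrative choices, not part of the text.

```python
import math

def Df(p, v):
    # Analytic first derivative of the hypothetical function
    # f(x, y) = exp(x) * sin(y) + x**3 * y, applied to v.
    x, y = p
    v1, v2 = v
    fx = math.exp(x) * math.sin(y) + 3 * x * x * y
    fy = math.exp(x) * math.cos(y) + x ** 3
    return fx * v1 + fy * v2

def D2f(p, first, second, alpha=1e-5):
    # Approximates D^2_p f(first, second): the derivative of
    # p' -> D_{p'} f(first) at p in the direction `second`,
    # using a central difference quotient.
    plus = (p[0] + alpha * second[0], p[1] + alpha * second[1])
    minus = (p[0] - alpha * second[0], p[1] - alpha * second[1])
    return (Df(plus, first) - Df(minus, first)) / (2 * alpha)

p = (0.2, 1.1)
a1, a2 = (1.0, -0.5), (0.3, 2.0)
print(D2f(p, a1, a2), D2f(p, a2, a1))
```

The two printed values should coincide up to discretisation and rounding error, in line with Theorem (5.1.16).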


5.2 Multilinear families

In the treatment of [Chr1869] we will encounter a lot of functions that are multilinear in all but one of their variables. Therefore we will investigate such functions in this section.

Lemma 5.2.1: Differentiability of families of k-linear maps
Let k ∈ N, and let A, B1, . . . , Bk, C be locally convex Hausdorff topological vector spaces over K, U ⊆ A open. Let $f : U \times B_1 \times \ldots \times B_k \to C$, such that for all a ∈ U the map

$$B_1 \times \ldots \times B_k \to C : (b_1, \ldots, b_k) \mapsto f(a, b_1, \ldots, b_k)$$

is k-linear. Then $f \in C^l(U \times B_1 \times \ldots \times B_k, C)$ for some l ∈ N if and only if f is continuous and for all b1 ∈ B1, . . . , bk ∈ Bk the map

$$U \to C : a \mapsto f(a, b_1, \ldots, b_k)$$

is an element of $C^l(U, C)$. In particular if this condition is satisfied we have for all a ∈ U, a′ ∈ A, $b_1, b'_1 \in B_1, \ldots, b_k, b'_k \in B_k$ that

$$D_{(a, b_1, \ldots, b_k)} f(a', b'_1, \ldots, b'_k) = D_{(a, b_1, \ldots, b_k)} f(a', 0, \ldots, 0) + \sum_{m=1}^{k} f(a, b_1, \ldots, b_{m-1}, b'_m, b_{m+1}, \ldots, b_k).$$

Proof. Let $f : U \times B_1 \times \ldots \times B_k \to C$ satisfy the conditions of the lemma. If $f \in C^l(U \times B_1 \times \ldots \times B_k, C)$ for l ∈ N, then the condition is satisfied directly.
Suppose conversely that f is continuous and for all b1 ∈ B1, . . . , bk ∈ Bk we have $(U \to C : a \mapsto f(a, b_1, \ldots, b_k)) \in C^l(U, C)$. Fix b1 ∈ B1, . . . , bk ∈ Bk and denote $g : U \to C : a \mapsto f(a, b_1, \ldots, b_k)$, then $g \in C^l(U, C)$ by assumption. Let a′ ∈ A, $b'_1 \in B_1, \ldots, b'_k \in B_k$, then for α ∈ K small enough but nonzero, using k-linearity (expand one argument at a time),

$$\frac{1}{\alpha} \big( f(a + \alpha a', b_1 + \alpha b'_1, \ldots, b_k + \alpha b'_k) - f(a, b_1, \ldots, b_k) \big)$$
$$= \frac{1}{\alpha} \Big( f(a + \alpha a', b_1, \ldots, b_k) + \alpha \sum_{m=1}^{k} f(a + \alpha a', b_1 + \alpha b'_1, \ldots, b_{m-1} + \alpha b'_{m-1}, b'_m, b_{m+1}, \ldots, b_k) - f(a, b_1, b_2, \ldots, b_k) \Big)$$
$$= \frac{1}{\alpha} \big( g(a + \alpha a') - g(a) \big) + \sum_{m=1}^{k} f(a + \alpha a', b_1 + \alpha b'_1, \ldots, b_{m-1} + \alpha b'_{m-1}, b'_m, b_{m+1}, \ldots, b_k).$$


Since f is continuous and g is differentiable, we can take the limit $(\alpha, a', b'_1, \ldots, b'_k) \to (0, a'', b''_1, \ldots, b''_k)$ of this expression to obtain

$$D_a g(a'') + \sum_{m=1}^{k} f(a + 0, b_1 + 0, \ldots, b_{m-1} + 0, b''_m, b_{m+1} + 0, \ldots, b_k + 0).$$

Since this expression depends continuously on $a, b_1, \ldots, b_k, a'', b''_1, \ldots, b''_k$ because $g \in C^l(U, C)$ and f is continuous, we see with Lemma (5.1.4) that $f \in C^1(U \times B_1 \times \ldots \times B_k, C)$ with derivative precisely given by this expression.
We can take further derivatives of our expression for Df and use induction to obtain that $f \in C^l(U \times B_1 \times \ldots \times B_k, C)$.

Lemma (5.2.1) shows that the derivative of a collection of k-linear maps depending on a parameter a ∈ U really only depends on the derivative with respect to a. This motivates a less cumbersome notation.

Definition 5.2.2: k-linear family
Let k ∈ N, and let A, B1, . . . , Bk, C be locally convex Hausdorff topological vector spaces over K, U ⊆ A open. Then we call a map $f : U \times B_1 \times \ldots \times B_k \to C$ a family of k-linear maps or a k-linear family if for all a ∈ U the map $f_a$ defined by

$$f_a : B_1 \times \ldots \times B_k \to C : (b_1, \ldots, b_k) \mapsto f(a, b_1, \ldots, b_k)$$

is k-linear. Furthermore, if $f \in C^l(U \times B_1 \times \ldots \times B_k, C)$ we denote for a ∈ U, a1, . . . , al ∈ A, b1 ∈ B1, . . . , bk ∈ Bk

$$D^l_a f(a_1, \ldots, a_l)(b_1, \ldots, b_k) := D^l_{(a, b_1, \ldots, b_k)} f((a_1, 0, \ldots, 0), \ldots, (a_l, 0, \ldots, 0)).$$

Note that by Lemma (5.2.1), the expression $D^l_a f(a_1, \ldots, a_l)(b_1, \ldots, b_k)$ completely determines the derivative of f, since

$$D_{(a, b_1, \ldots, b_k)} f(a', b'_1, \ldots, b'_k) = D_a f(a')(b_1, \ldots, b_k) + \sum_{l=1}^{k} f_a(b_1, \ldots, b_{l-1}, b'_l, b_{l+1}, \ldots, b_k). \qquad (5.3)$$

Example 5.2.3
Let k ∈ N and consider the inner product

$$\langle \cdot, \cdot \rangle : K^k \times K^k \to K$$

defined by

$$\langle x, y \rangle := \sum_{l=1}^{k} x_l \, y_l,$$

then ⟨·, ·⟩ (considered as a map $\{0\} \times K^k \times K^k \to K$) is a constant family of 2-linear maps. Note that in particular for k = 1 the real and complex products fall in this category.


Theorem 5.2.4: Product rule
Let k ∈ N, and let A, B1, . . . , Bk, C be locally convex Hausdorff topological vector spaces over K, U ⊆ A open. Let $f \in C^1(U \times B_1 \times \ldots \times B_k, C)$ be a family of k-linear maps. Let $g_1 : U \to B_1, \ldots, g_k : U \to B_k$ and fix a ∈ U. If for all 1 ≤ l ≤ k we have that $g_l$ is differentiable at a, then the map

$$h : U \to C : a' \mapsto f_{a'}(g_1(a'), \ldots, g_k(a'))$$

is differentiable at a and

$$D_a h(a_1) = D_a f(a_1)(g_1(a), \ldots, g_k(a)) + \sum_{l=1}^{k} f_a\big(g_1(a), \ldots, g_{l-1}(a), D_a g_l(a_1), g_{l+1}(a), \ldots, g_k(a)\big). \qquad (5.4)$$

In particular if for a certain l ∈ N, $f \in C^l(U \times B_1 \times \ldots \times B_k, C)$, $g_1 \in C^l(U, B_1), \ldots, g_k \in C^l(U, B_k)$, then $h \in C^l(U, C)$.

Proof. Consider the function $i : U \to U \times B_1 \times \ldots \times B_k : a' \mapsto (a', g_1(a'), \ldots, g_k(a'))$, then i is differentiable at a and for a″ ∈ A we have (as a′ ↦ a′ is continuous and linear, and g1, . . . , gk are differentiable at a, use Lemma (5.1.11))

$$D_a i(a'') = (a'', D_a g_1(a''), \ldots, D_a g_k(a'')).$$

Now h(a) = f(i(a)), f is differentiable at i(a), and i is differentiable at a, so by Theorem (5.1.8) h = f ∘ i is differentiable at a and

$$D_a h(a') = D_{i(a)} f(D_a i(a'))$$
$$= D_{(a, g_1(a), \ldots, g_k(a))} f(a', D_a g_1(a'), \ldots, D_a g_k(a'))$$
$$\overset{(5.3)}{=} D_a f(a')(g_1(a), \ldots, g_k(a)) + \sum_{l=1}^{k} f_a\big(g_1(a), \ldots, g_{l-1}(a), D_a g_l(a'), g_{l+1}(a), \ldots, g_k(a)\big).$$

From Equation (5.4) we see that if $g_1 \in C^1(U, B_1), \ldots, g_k \in C^1(U, B_k)$, then $D_a h(a')$ depends continuously on a and a′ and hence $h \in C^1(U, C)$. In particular if for l ∈ N, $f \in C^l(U \times B_1 \times \ldots \times B_k, C)$, $g_1 \in C^l(U, B_1), \ldots, g_k \in C^l(U, B_k)$, then taking derivatives of Equation (5.4) and using induction, we see that $h \in C^l(U, C)$.

Example 5.2.5
For the map ⟨·, ·⟩ from Example (5.2.3) we see that for any two paths $f, g \in C^1(K, K^k)$, their inner product

$$h : K \to K, \quad h(x) := \langle f(x), g(x) \rangle,$$

is $C^1$ and satisfies by Equation (5.4)

$$h'(x) = 0 + \langle f'(x), g(x) \rangle + \langle f(x), g'(x) \rangle.$$
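The formula of Example (5.2.5) is easy to test numerically for K = R and k = 2. In the sketch below the two paths `f` and `g`, their derivatives `df` and `dg`, and the finite difference step are hypothetical choices made for illustration only.

```python
import math

def f(x):
    # Hypothetical path R -> R^2.
    return (math.cos(x), x * x)

def df(x):
    # Componentwise derivative f'(x).
    return (-math.sin(x), 2 * x)

def g(x):
    # Second hypothetical path R -> R^2.
    return (math.exp(x), math.sin(x))

def dg(x):
    # Componentwise derivative g'(x).
    return (math.exp(x), math.cos(x))

def inner(u, v):
    # The bilinear map <u, v> = sum_l u_l * v_l from Example (5.2.3).
    return sum(ul * vl for ul, vl in zip(u, v))

def h(x):
    return inner(f(x), g(x))

def h_prime_product_rule(x):
    # Right-hand side of the formula in Example (5.2.5); the constant
    # family contributes the zero term.
    return 0.0 + inner(df(x), g(x)) + inner(f(x), dg(x))

def h_prime_numeric(x, step=1e-6):
    # Central difference quotient of h, used only as a cross-check.
    return (h(x + step) - h(x - step)) / (2 * step)

print(h_prime_product_rule(0.7), h_prime_numeric(0.7))
```

The first term is zero because the family is constant in the parameter, which is exactly the "0 +" appearing in the example.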


Theorem 5.2.6: Differentiability of families of inverses
Let A, B, C be locally convex Hausdorff topological vector spaces over K, U ⊆ A open, k ∈ N. Let $f \in C^k(U \times B, C)$ be a family of linear maps such that for all a ∈ U, $f_a : B \to C$ is bijective. Denote the family of inverses by $g : U \times C \to B : (a, c) \mapsto f_a^{-1}(c)$. If g is continuous, then $g \in C^k(U \times C, B)$. In particular, for all a ∈ U, a1 ∈ A, c ∈ C we have

$$D_a g(a_1)(c) = -g_a\big(D_a f(a_1)(g_a(c))\big). \qquad (5.5)$$

Proof. Let a ∈ U, a1 ∈ A, c, c1 ∈ C, and α ∈ K small enough (A is locally convex), then using linearity and invertibility

$$\frac{1}{\alpha} \big( g(a + \alpha a_1, c + \alpha c_1) - g(a, c) \big)$$
$$= \frac{1}{\alpha} \Big( g\big(a + \alpha a_1, f(a, g(a, c + \alpha c_1))\big) - g\big(a + \alpha a_1, f(a + \alpha a_1, g(a, c))\big) \Big)$$
$$= -g\Big( a + \alpha a_1, \frac{1}{\alpha} \big( f(a + \alpha a_1, g(a, c)) - f(a, g(a, c + \alpha c_1)) \big) \Big)$$
$$= -g\Big( a + \alpha a_1, \frac{1}{\alpha} \big( f(a + \alpha a_1, g(a, c)) - f(a, g(a, c)) \big) \Big) + g\Big( a + \alpha a_1, \frac{1}{\alpha} \, \alpha \, f(a, g(a, c_1)) \Big)$$
$$= -g\Big( a + \alpha a_1, \frac{1}{\alpha} \big( f(a + \alpha a_1, g(a, c)) - f(a, g(a, c)) \big) \Big) + g(a + \alpha a_1, c_1).$$

Since all involved functions depend continuously on α, a1, and c1, and f is differentiable at (a, g(a, c)), we therefore find with Lemma (5.1.4) that g is differentiable at (a, c) and

$$D_{(a,c)} g(a_1, c_1) = -g_a\big(D_{(a, g_a(c))} f(a_1, 0)\big) + g(a, c_1)$$

which becomes Equation (5.5) in the notation of Definition (5.2.2). Because g is continuous and f continuously differentiable, $D_{(a,c)} g(a_1, c_1)$ depends continuously on a, c, a1, and c1, and therefore $g \in C^1(U \times C, B)$. Now we can use induction, the composition rule from Theorem (5.1.8) and the product rule from Theorem (5.2.4) to find that $g \in C^k(U \times C, B)$ because $f \in C^k(U \times B, C)$, by taking derivatives of Equation (5.5).
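In finite dimensions Equation (5.5) is the familiar rule for differentiating a matrix inverse. The sketch below assumes A = R and B = C = R², with the family f(a, b) = M(a) b for a hypothetical 2×2 matrix-valued function `M` and its entrywise derivative `dM`; none of these names come from the text.

```python
import math

def M(a):
    # Hypothetical invertible 2x2 matrix depending on the parameter a.
    return ((2.0 + a, a * a), (math.sin(a), 3.0))

def dM(a):
    # Entrywise derivative of M with respect to a.
    return ((1.0, 2.0 * a), (math.cos(a), 0.0))

def apply(m, v):
    # Matrix-vector product for 2x2 matrices.
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def solve(m, c):
    # Solves m b = c via the explicit 2x2 inverse.
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] * c[0] - m[0][1] * c[1]) / det,
            (m[0][0] * c[1] - m[1][0] * c[0]) / det)

def g(a, c):
    # Family of inverses: g(a, c) = M(a)^{-1} c.
    return solve(M(a), c)

def formula(a, a1, c):
    # Right-hand side of (5.5): -g_a(D_a f(a1)(g_a(c))),
    # where here D_a f(a1)(b) = a1 * dM(a) b.
    b = g(a, c)
    w = apply(dM(a), b)
    r = g(a, (a1 * w[0], a1 * w[1]))
    return (-r[0], -r[1])

def difference_quotient(a, a1, c, alpha=1e-6):
    # Central difference quotient of a' -> g(a', c) in direction a1.
    p = g(a + alpha * a1, c)
    m = g(a - alpha * a1, c)
    return ((p[0] - m[0]) / (2 * alpha), (p[1] - m[1]) / (2 * alpha))

print(formula(0.4, 1.5, (1.0, -2.0)))
print(difference_quotient(0.4, 1.5, (1.0, -2.0)))
```

Both printed pairs should agree closely; the sign and the two applications of the inverse in (5.5) correspond to the matrix identity d(M⁻¹) = −M⁻¹ (dM) M⁻¹.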

5.3 Integration

Definition 5.3.1: Partitions of an interval
Let α, β ∈ R, α < β. Then a partition of the interval [α, β] is a collection γ0, . . . , γk ∈ R satisfying α = γ0 < γ1 < . . . < γk = β. For any two partitions γ0, . . . , γk and δ0, . . . , δl of [α, β] we say that δ0, . . . , δl is a refinement of γ0, . . . , γk if there exist integers i0, . . . , ik ∈ N such that $\gamma_j = \delta_{i_j}$ for all 1 ≤ j ≤ k.

We will now introduce the notion of an integral, not in terms of measuring sets (i.e. measuring their volume, as is done with Lebesgue integration), but as an inverse operation to differentiation as presented in Section 5.1. In this section we demand that A is UC to ensure that the integrals actually exist.


Definition 5.3.2: Integral
Let A T Vs /K T2 LC UC , S ⊆ R an open interval and f : S → A a map. Let α, β ∈ S, α < β. Then we say for a ∈ A that the integral of f over [α, β] equals a, denoted by

\[ \int_\alpha^\beta f = a, \]

if for all i ∈ I and ε ∈ ]0, ∞[ there exists a partition γ0, …, γk of [α, β], such that for all refinements δ0, …, δl of this partition we have that

\[ \Bigl\| \sum_{m=0}^{l-1} (\delta_{m+1} - \delta_m)\, f\Bigl(\frac{\delta_{m+1} + \delta_m}{2}\Bigr) - a \Bigr\|_i < \varepsilon. \]

If it is not the case that ∫_α^β f = a, we write ∫_α^β f ≠ a. For α > β we define

\[ \int_\alpha^\beta f := -\int_\beta^\alpha f \]

whenever the latter exists and for α = β we define

\[ \int_\alpha^\alpha f := 0. \]
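The midpoint sums appearing in Definition (5.3.2) are easy to evaluate for a concrete map. The following sketch assumes A = R^2 and uses a hypothetical integrand `f(t) = (t, t^2)`; the helper names `midpoint_sum` and `dyadic_partition` are illustrative only. It evaluates the sums on partitions into 2^k equal pieces, each of which refines the previous one.

```python
def midpoint_sum(f, partition):
    """Sum of (d[m+1] - d[m]) * f((d[m+1] + d[m]) / 2) for an R^2-valued f."""
    total = (0.0, 0.0)
    for left, right in zip(partition, partition[1:]):
        value = f((left + right) / 2)
        total = tuple(t + (right - left) * v for t, v in zip(total, value))
    return total

def dyadic_partition(alpha, beta, k):
    """Partition of [alpha, beta] into 2**k pieces of equal length."""
    n = 2 ** k
    return [alpha + (beta - alpha) * l / n for l in range(n + 1)]

def f(t):
    """Hypothetical continuous map R -> R^2 used for illustration."""
    return (t, t * t)

for k in (1, 4, 8):
    print(k, midpoint_sum(f, dyadic_partition(0.0, 1.0, k)))
```

For this integrand the sums settle near (1/2, 1/3) as k grows, which is the value one expects for the integral over [0, 1].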

Example 5.3.3
Let A /K , a ∈ A, S ⊆ R an open interval and f : S → A the constant map f(α) = a for all α ∈ S. Then for any α, β ∈ S, α < β and any partition γ0, …, γk of [α, β] we have

\[ \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, f\bigl((\gamma_{l+1} + \gamma_l)/2\bigr) = \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, a = (\beta - \alpha)\, a. \]

Therefore, for any α, β ∈ S (swap them if α > β),

\[ \int_\alpha^\beta (\gamma \mapsto a) = (\beta - \alpha)\, a. \]

Lemma 5.3.4
Let A /K , S ⊆ R an open interval, and f : S → A a map. Suppose f c , then for any α, β ∈ S there exists a unique a ∈ A such that ∫_α^β f = a.

Proof. Let α, β ∈ S, α < β. Create for each k ∈ N the partition γ_0^{(k)}, …, γ_{2^k}^{(k)} of [α, β] by dividing [α, β] into 2^k equal pieces: define γ_l^{(k)} := α + ((β − α)/2^k) l for 0 ≤ l ≤ 2^k. Note that γ_l^{(k)} = γ_{2l}^{(k+1)} by definition. Now construct the sequence x : N → A by defining for all k ∈ N

\[ x_k := \sum_{l=0}^{2^k - 1} \bigl(\gamma^{(k)}_{l+1} - \gamma^{(k)}_l\bigr)\, f\Bigl(\frac{\gamma^{(k)}_{l+1} + \gamma^{(k)}_l}{2}\Bigr). \]

Let i ∈ I, ε ∈ ]0, ∞[ be given, then ([α, β] Cpt , f c , use Theorem (2.5.23)) there exists a δ ∈ ]0, ∞[ such that ‖f(α1) − f(α2)‖_i < ε/(β − α) whenever |α1 − α2| < δ. Pick k0 ≥ 1 + ⌈log_2((β − α)/δ)⌉, then for any k ≥ k0 we have that γ_{l+1}^{(k)} − γ_l^{(k)} = (β − α)/2^k < δ. Let k ≥ k0 be arbitrary and δ0, …, δl any refinement of γ_0^{(k)}, …, γ_{2^k}^{(k)}. Then there exist i0, …, i_{2^k} ∈ N such that γ_m^{(k)} = δ_{i_m} for 0 ≤ m ≤ 2^k. So,

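\[
\begin{aligned}
x_k - \sum_{m=0}^{l-1} (\delta_{m+1} - \delta_m)\, f\bigl((\delta_{m+1} + \delta_m)/2\bigr)
&= \sum_{m=0}^{2^k-1} \bigl(\gamma^{(k)}_{m+1} - \gamma^{(k)}_m\bigr)\, f\bigl((\gamma^{(k)}_{m+1} + \gamma^{(k)}_m)/2\bigr)
 - \sum_{n=0}^{2^k-1} \sum_{o=i_n}^{i_{n+1}-1} (\delta_{o+1} - \delta_o)\, f\bigl((\delta_{o+1} + \delta_o)/2\bigr) \\
&= \sum_{m=0}^{2^k-1} \Bigl( \sum_{p=i_m}^{i_{m+1}-1} (\delta_{p+1} - \delta_p) \Bigr) f\bigl((\gamma^{(k)}_{m+1} + \gamma^{(k)}_m)/2\bigr)
 - \sum_{n=0}^{2^k-1} \sum_{o=i_n}^{i_{n+1}-1} (\delta_{o+1} - \delta_o)\, f\bigl((\delta_{o+1} + \delta_o)/2\bigr) \\
&= \sum_{m=0}^{2^k-1} \sum_{n=i_m}^{i_{m+1}-1} (\delta_{n+1} - \delta_n) \Bigl( f\bigl((\gamma^{(k)}_{m+1} + \gamma^{(k)}_m)/2\bigr) - f\bigl((\delta_{n+1} + \delta_n)/2\bigr) \Bigr).
\end{aligned}
\]

With this,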

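\[
\begin{aligned}
\Bigl\| \sum_{m=0}^{l-1} (\delta_{m+1} - \delta_m)\, f\bigl((\delta_{m+1} + \delta_m)/2\bigr) - x_k \Bigr\|_i
&= \Bigl\| \sum_{m=0}^{2^k-1} \sum_{n=i_m}^{i_{m+1}-1} (\delta_{n+1} - \delta_n) \Bigl( f\bigl((\gamma^{(k)}_{m+1} + \gamma^{(k)}_m)/2\bigr) - f\bigl((\delta_{n+1} + \delta_n)/2\bigr) \Bigr) \Bigr\|_i \\
&\le \sum_{m=0}^{2^k-1} \sum_{n=i_m}^{i_{m+1}-1} |\delta_{n+1} - \delta_n| \, \Bigl\| f\bigl((\gamma^{(k)}_{m+1} + \gamma^{(k)}_m)/2\bigr) - f\bigl((\delta_{n+1} + \delta_n)/2\bigr) \Bigr\|_i \\
&< \sum_{m=0}^{2^k-1} \sum_{n=i_m}^{i_{m+1}-1} (\delta_{n+1} - \delta_n)\, \frac{\varepsilon}{\beta - \alpha} = \varepsilon,
\end{aligned}
\]

where we used uniform continuity of f together with the fact that for each 0 ≤ m < 2^k and i_m ≤ n < i_{m+1} we have [δ_n, δ_{n+1}] ⊆ [γ_m^{(k)}, γ_{m+1}^{(k)}] and 0 < γ_{m+1}^{(k)} − γ_m^{(k)} < δ. In particular, for all k, l ≥ k0 we have ‖x_k − x_l‖_i < ε. Since this is true for all i ∈ I (A UC ), there exists a unique (A T2 ) limit a ∈ A of the sequence x.

Also, let i ∈ I, ε ∈ ]0, ∞[ be given and pick k0 at least as large as before, but also such that ‖x_k − a‖_i < ε for all k ≥ k0 (use lim_{k→∞} x_k = a). Then for any refinement δ0, …, δl of γ_0^{(k0)}, …, γ_{2^{k0}}^{(k0)} we have by the above

\[
\Bigl\| \sum_{m=0}^{l-1} (\delta_{m+1} - \delta_m)\, f\bigl((\delta_{m+1} + \delta_m)/2\bigr) - a \Bigr\|_i
\le \Bigl\| \sum_{m=0}^{l-1} (\delta_{m+1} - \delta_m)\, f\bigl((\delta_{m+1} + \delta_m)/2\bigr) - x_{k_0} \Bigr\|_i + \| x_{k_0} - a \|_i
< \varepsilon + \varepsilon = 2\varepsilon.
\]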

Therefore ∫_α^β f = a.
If α = β then we have by definition ∫_α^β f = ∫_α^α f = 0 ∈ A.
If α > β, then by the above there exists an a ∈ A such that ∫_β^α f = a. So ∫_α^β f = −∫_β^α f = −a ∈ A exists by definition.

Theorem 5.3.5: Integration
Let A T Vs /K T2 LC UC , S ⊆ R an open interval, and f : S → A c .

• For any γ ∈ K, g : S → A c , and α, β ∈ S we have
\[ \int_\alpha^\beta (f + \gamma g) = \int_\alpha^\beta f + \gamma \int_\alpha^\beta g. \]

• For any α, β, γ ∈ S we have
\[ \int_\alpha^\gamma f = \int_\alpha^\beta f + \int_\beta^\gamma f. \]

• For any seminorm ‖·‖ : A → R c and α, β ∈ S we have
\[ \Bigl\| \int_\alpha^\beta f \Bigr\| \le \Bigl| \int_\alpha^\beta \|f\| \Bigr|. \]

• For all α, β ∈ S we have for any B /K , g ∈ A → B l that
\[ g\Bigl( \int_\alpha^\beta f \Bigr) = \int_\alpha^\beta (g \circ f). \]

Proof. In this proof we repeatedly use the fact that A T2 , which implies that limits (in particular the limit of the sequence x from the proof of Lemma (5.3.4)) are unique.

• Let γ ∈ K, g : S → A c . Suppose α < β. Then f + γ g c , so there exists a unique value for ∫_α^β (f + γ g) by Lemma (5.3.4). Since for any partition γ0, …, γk of [α, β] we have
\[ \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\,(f + \gamma g)\bigl((\gamma_{l+1} + \gamma_l)/2\bigr) = \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, f\bigl((\gamma_{l+1} + \gamma_l)/2\bigr) + \gamma \sum_{m=0}^{k-1} (\gamma_{m+1} - \gamma_m)\, g\bigl((\gamma_{m+1} + \gamma_m)/2\bigr), \]
this unique value necessarily equals ∫_α^β f + γ ∫_α^β g. For the case that α = β we obtain 0 = 0 + γ 0 and when α > β we can swap α and β and add minus signs to the left and right hand sides of ∫_α^β (f + γ g) = ∫_α^β f + γ ∫_α^β g to obtain the desired result.


• Let α, β, γ ∈ S, α < β < γ. Then by Lemma (5.3.4), ∫_α^γ f, ∫_α^β f, ∫_β^γ f exist in A. Furthermore, let γ0, …, γk be any partition of [α, γ], then by possibly refining, through adding β to this partition, we may suppose that γ_l = β for some 0 < l < k (and conversely, we can concatenate any two partitions of [α, β] and [β, γ] to obtain a partition of [α, γ]). Then
\[ \sum_{m=0}^{k-1} (\gamma_{m+1} - \gamma_m)\, f\bigl((\gamma_{m+1} + \gamma_m)/2\bigr) = \sum_{m=0}^{l-1} (\gamma_{m+1} - \gamma_m)\, f\bigl((\gamma_{m+1} + \gamma_m)/2\bigr) + \sum_{n=l}^{k-1} (\gamma_{n+1} - \gamma_n)\, f\bigl((\gamma_{n+1} + \gamma_n)/2\bigr). \]
As both terms on the right hand side will tend to ∫_α^β f and ∫_β^γ f for increasingly finer partitions, necessarily ∫_α^γ f = ∫_α^β f + ∫_β^γ f. For other configurations, instead of α < β < γ, simply apply minus signs at the appropriate positions and reduce to the · ≤ · ≤ · case.

• Let ‖·‖ : A → R c be a seminorm. Then f and ‖f‖ := ‖·‖ ∘ f are c , so by Lemma (5.3.4), both ∫_α^β f and ∫_α^β ‖f‖ exist in A and R respectively and are unique. Suppose α < β, the result now follows from repeated applications of the triangle inequality and continuity of ‖·‖: for any partition γ0, …, γk of [α, β] we have
\[ \Bigl\| \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, f\bigl((\gamma_{l+1} + \gamma_l)/2\bigr) \Bigr\| \le \sum_{l=0}^{k-1} |\gamma_{l+1} - \gamma_l| \, \bigl\| f\bigl((\gamma_{l+1} + \gamma_l)/2\bigr) \bigr\| = \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, \|f\|\bigl((\gamma_{l+1} + \gamma_l)/2\bigr). \]
Hence necessarily ‖∫_α^β f‖ ≤ ∫_α^β ‖f‖ = |∫_α^β ‖f‖|, since α < β and therefore ∫_α^β ‖f‖ ≥ 0 because all terms in the sums converging to this integral are nonnegative. The case α = β is direct (0 ≤ 0) and for α > β we have by the above ‖∫_β^α f‖ ≤ ∫_β^α ‖f‖, so ‖∫_α^β f‖ = ‖−∫_β^α f‖ = ‖∫_β^α f‖ ≤ ∫_β^α ‖f‖ = −∫_α^β ‖f‖ = |∫_α^β ‖f‖| as ∫_β^α ‖f‖ ≥ 0.

• Suppose α < β. Let g ∈ A → B l , then g ∘ f : S → B c , so both ∫_α^β f and ∫_α^β (g ∘ f) exist and are unique. As g l , we have for any partition γ0, …, γk of [α, β] that
\[ g\Bigl( \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, f\bigl((\gamma_{l+1} + \gamma_l)/2\bigr) \Bigr) = \sum_{l=0}^{k-1} (\gamma_{l+1} - \gamma_l)\, (g \circ f)\bigl((\gamma_{l+1} + \gamma_l)/2\bigr), \]
so using continuity of g we find that necessarily g(∫_α^β f) = ∫_α^β (g ∘ f). The cases α = β and α > β follow directly.

Example 5.3.6: Integration on R and C
Let A T Vs /K T2 LC UC , S ⊆ R an open interval, f : S → A c , and α, β ∈ S. Then for any g ∈ A′ we have by Theorem (5.3.5) (as g : A → K c l /K) that g(∫_α^β f) = ∫_α^β (g ∘ f). However g ∘ f : S → K, so looking at Definition (5.3.2) we see that ∫_α^β (g ∘ f) and the Riemann integral ∫_α^β (g ∘ f)(x) dx must agree, because both exist ([α, β] compact, g ∘ f c ) and they satisfy the same limiting procedure. Therefore, for any g ∈ A′,

\[ g\Bigl( \int_\alpha^\beta f \Bigr) = \int_\alpha^\beta (g \circ f)(x)\, dx. \]
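The last two properties of Theorem (5.3.5) can be observed numerically in a small case. This is a sketch under assumptions: A = R^2 with the Euclidean norm, a hypothetical integrand `f(t) = (cos t, sin t)`, a hypothetical functional `g`, and the integral replaced by a midpoint sum on a fine uniform partition.

```python
import math

def integral(f, alpha, beta, n=2000):
    """Midpoint-sum approximation for f : R -> R^2 on [alpha, beta]."""
    h = (beta - alpha) / n
    sx = sy = 0.0
    for m in range(n):
        x, y = f(alpha + (m + 0.5) * h)
        sx += h * x
        sy += h * y
    return (sx, sy)

def f(t):
    """Hypothetical continuous integrand with values in R^2."""
    return (math.cos(t), math.sin(t))

def g(v):
    """Hypothetical continuous linear functional on R^2."""
    return 3.0 * v[0] - 2.0 * v[1]

def norm(v):
    """Euclidean norm, playing the role of a continuous seminorm."""
    return math.hypot(v[0], v[1])

alpha, beta = 0.0, 2.0
whole = integral(f, alpha, beta)

# g applied to the integral versus the integral of g composed with f.
lhs = g(whole)
rhs = integral(lambda t: (g(f(t)), 0.0), alpha, beta)[0]

# Norm of the integral versus the integral of the norm.
norm_of_integral = norm(whole)
integral_of_norm = integral(lambda t: (norm(f(t)), 0.0), alpha, beta)[0]

print(lhs, rhs)
print(norm_of_integral, "<=", integral_of_norm)
```

Because `g` is linear, the first comparison already holds at the level of each midpoint sum, which mirrors the argument in the proof.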

Lemma 5.3.7
Let A, B /K , S ⊆ R an open interval, U ⊆ A open, and f : S × U → B c . Then for any α, β ∈ S the map g : U → B defined by

\[ g(a) := \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a) \bigr) \]

is c . Furthermore, if for k ∈ N, f ∈ C^k(S × U, B), then g ∈ C^k(U, B) and

\[ D_a g(a_1) = \int_\alpha^\beta \bigl( \gamma \mapsto D_{(\gamma, a)} f(0, a_1) \bigr). \]

Proof. Fix a ∈ U and let i ∈ I, ε ∈ ]0, ∞[ be given. Let γ ∈ [α, β], then because f c (γ, a) there exists a δ_γ ∈ ]0, ∞[ and an open neighbourhood U_γ of a in U such that for all γ1 ∈ [α, β], a1 ∈ U we have that |γ1 − γ| < δ_γ and a1 ∈ U_γ together imply that ‖f(γ1, a1) − f(γ, a)‖_i < ε/2. The collection {]γ − δ_γ, γ + δ_γ[ ⊆ S | γ ∈ [α, β]} forms an open cover of [α, β]. Since [α, β] is compact there exists a finite number of γ1, …, γk ∈ [α, β] such that [α, β] ⊆ ⋃_{l=1}^{k} ]γ_l − δ_{γ_l}, γ_l + δ_{γ_l}[. Choose U1 := ⋂_{l=1}^{k} U_{γ_l}, which is (finite intersection of open sets) an open neighbourhood of a in U. Let γ ∈ [α, β] and a1 ∈ U1 be arbitrary. Then there exists an l ∈ {1, …, k} such that γ ∈ ]γ_l − δ_{γ_l}, γ_l + δ_{γ_l}[.

Hence |γ − γ_l| < δ_{γ_l} and a1 ∈ U1 ⊆ U_{γ_l}, so ‖f(γ, a1) − f(γ_l, a)‖_i < ε/2. Therefore

\[
\begin{aligned}
\| f(\gamma, a_1) - f(\gamma, a) \|_i
&\le \| f(\gamma, a_1) - f(\gamma_l, a) + f(\gamma_l, a) - f(\gamma, a) \|_i \\
&\le \| f(\gamma, a_1) - f(\gamma_l, a) \|_i + \| f(\gamma_l, a) - f(\gamma, a) \|_i < \varepsilon/2 + \varepsilon/2 = \varepsilon,
\end{aligned}
\]

as a ∈ U_{γ_l}. So for any a ∈ U, i ∈ I, ε ∈ ]0, ∞[ there exists an open neighbourhood U1 of a in U such that for all a1 ∈ U1 and γ ∈ [α, β] we have ‖f(γ, a1) − f(γ, a)‖_i < ε. Hence, for any a1 ∈ U1 we have (use Theorem (5.3.5))

\[
\begin{aligned}
\| g(a_1) - g(a) \|_i
&= \Bigl\| \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a_1) \bigr) - \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a) \bigr) \Bigr\|_i \\
&= \Bigl\| \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a_1) - f(\gamma, a) \bigr) \Bigr\|_i \\
&\le \int_\alpha^\beta \bigl( \gamma \mapsto \| f(\gamma, a_1) - f(\gamma, a) \|_i \bigr) \\
&\le \int_\alpha^\beta \varepsilon = \varepsilon\, (\beta - \alpha).
\end{aligned}
\]

Since this is true for all a ∈ U, i ∈ I, ε ∈ ]0, ∞[, we find that g c .

Now suppose that f ∈ C^k(S × U, B). First of all note that for δ ∈ K nonzero but small enough, a ∈ U, and a2 ∈ A we have by Theorem (5.3.5)

\[
\begin{aligned}
\frac{1}{\delta}\bigl( g(a + \delta a_2) - g(a) \bigr)
&= \frac{1}{\delta}\Bigl( \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a + \delta a_2) \bigr) - \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a) \bigr) \Bigr) \\
&= \frac{1}{\delta} \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma, a + \delta a_2) - f(\gamma, a) \bigr) \\
&= \int_\alpha^\beta \Bigl( \gamma \mapsto \frac{1}{\delta}\bigl( f(\gamma, a + \delta a_2) - f(\gamma, a) \bigr) \Bigr).
\end{aligned}
\]


As f ∈ C^1(S × U, B), the function defined for all γ ∈ S, δ ≠ 0 small, and a2 ∈ A by (γ, δ, a2) ↦ (1/δ)(f(γ, a + δ a2) − f(γ, a)) and for δ = 0, a1 ∈ A by (γ, 0, a1) ↦ D_{(γ,a)} f(0, a1) is c . Therefore, by the first part of this lemma, so is (δ, a2) ↦ (1/δ)(g(a + δ a2) − g(a)) and for (δ, a2) → (0, a1), this function goes to ∫_α^β (γ ↦ D_{(γ,a)} f(0, a1)). Hence g d a and D_a g(a1) is exactly given by this integral. Again using the first part of the lemma we see from the expression of D_a g(a1) that it depends continuously on a and a1, therefore g ∈ C^1(U, B). Use induction to obtain that g ∈ C^k(U, B).

Theorem 5.3.8: Fundamental theorem of integration
Let A T Vs /K T2 LC UC , S ⊆ R an open interval, and f : S → A c . Then for any α ∈ S, the map g_α : S → A defined by

\[ g_\alpha(\beta) := \int_\alpha^\beta f \]

for all β ∈ S, satisfies g_α ∈ C^1(S, A) and D_β g_α(1) = f(β) for all β ∈ S. On the other hand, for any g ∈ C^1(S, A) satisfying D_α g(1) = f(α) for all α ∈ S, we have that for all α, β ∈ S,

\[ g(\beta) - g(\alpha) = \int_\alpha^\beta f. \]

Proof. In this proof we repeatedly use Theorem (5.3.5). Fix α ∈ S and define g_α : S → A as above. Let β ∈ S, ε ∈ ]0, ∞[, and i ∈ I be given. As f c β there exists a δ ∈ ]0, ∞[ such that for all γ ∈ S − β satisfying |γ| < δ we have ‖f(β + γ) − f(β)‖_i < ε/2. Then for any γ ∈ (S − β) ∩ ]−δ, δ[,

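\[ g_\alpha(\beta + \gamma) - g_\alpha(\beta) = \int_\alpha^{\beta+\gamma} f - \int_\alpha^\beta f = \int_\beta^{\beta+\gamma} f. \]

So in the case that γ ≠ 0, we have (use Example (5.3.3))

\[
\begin{aligned}
\Bigl\| \frac{1}{\gamma}\bigl( g_\alpha(\beta + \gamma) - g_\alpha(\beta) \bigr) - f(\beta) \Bigr\|_i
&= \Bigl\| \frac{1}{\gamma} \int_\beta^{\beta+\gamma} f - \frac{(\beta + \gamma) - \beta}{\gamma}\, f(\beta) \Bigr\|_i \\
&\le \frac{1}{|\gamma|} \Bigl| \int_\beta^{\beta+\gamma} \| f - f(\beta) \|_i \Bigr| \\
&\le \frac{1}{|\gamma|} \Bigl| \int_\beta^{\beta+\gamma} \frac{\varepsilon}{2} \Bigr| \\
&= \frac{1}{|\gamma|}\, |\gamma|\, \frac{\varepsilon}{2} < \varepsilon.
\end{aligned}
\]

Therefore (this is true for all ε ∈ ]0, ∞[, i ∈ I)

\[ \lim_{\gamma \to 0} \frac{1}{\gamma}\bigl( g_\alpha(\beta + \gamma\, 1) - g_\alpha(\beta) \bigr) = f(\beta). \]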


Hence by Corollary (5.1.7) g_α d β with D_β g_α(γ) = γ f(β) for all β ∈ S, γ ∈ R. So g_α d S. The map S × R → A : (β, γ) ↦ D_β g_α(γ) = γ f(β) is c since f c and scalar multiplication c . Hence g_α ∈ C^1(S, A).

Now let g ∈ C^1(S, A) and suppose that for all α ∈ S, D_α g(1) = f(α). Fix some γ ∈ S, then in particular for all α ∈ S we have D_α g(1) = f(α) = D_α g_γ(1). Hence the map g − g_γ (by Theorem (5.1.8)) satisfies (g − g_γ) ∈ C^1(S, A) and D_α(g − g_γ) = D_α g − D_α g_γ = 0. By Theorem (5.1.8) g − g_γ is constant: there exists an a ∈ A such that g(α) − g_γ(α) = a for all α ∈ S. Therefore g(β) − g(α) = g_γ(β) + a − g_γ(α) − a = ∫_γ^β f − ∫_γ^α f = ∫_α^β f.

This immediately gives the following result, which is slightly counterintuitive in light of Example (5.1.10).

Corollary 5.3.9: Integration as antiderivative
Let A T Vs /K T2 LC UC , S ⊆ R an open interval. Let C := {f : S → A | ∃a ∈ A : ∀α ∈ S : f(α) = a} be the collection of constant functions, considered as Vs /K. Let k ∈ N, α ∈ S, then C ≤ C^{k+1}(S, A) and the maps

\[ C^k(S, A) \to C^{k+1}(S, A)/C : f \mapsto \Bigl[ \beta \mapsto \int_\alpha^\beta f \Bigr] \]
\[ C^{k+1}(S, A)/C \to C^k(S, A) : [g] \mapsto \bigl( \beta \mapsto D_\beta g(1) \bigr) \]

are l /K and inverses of each other. In particular as Vs /K we have for all k ∈ N, C^k(S, A) ≃ C^{k+1}(S, A)/A, while at the same time it is also true that C^k(S, A) ≥ C^{k+1}(S, A).

Proof. Let k ∈ N, α ∈ S. By Theorem (2.1.28) and Theorem (5.1.8) we know that C ≤ C^{k+1}(S, A) as Vs /K. Furthermore C ≃ A as Vs /K via A → C : a ↦ (β ↦ a) and C → A : f ↦ f(α). The map C^{k+1}(S, A)/C → C^k(S, A) is well-defined, because for any two g, h ∈ C^{k+1}(S, A) that differ by a constant a ∈ A (so g(β) = h(β) + a for all β ∈ S) we have (Theorem (5.1.8)) that D_β g(1) = D_β h(1) + 0 = D_β h(1), so for all h ∈ [g], D_β h(1) = D_β g(1).
Let f ∈ C^k(S, A) and choose g : S → A : β ↦ ∫_α^β f. Then by Theorem (5.3.8) the function h : S → A : β ↦ D_β g(1) satisfies h(β) = D_β g(1) = f(β) for all β ∈ S, so h = f.
Let [g] ∈ C^{k+1}(S, A)/C and choose f : S → A : β ↦ D_β g(1). Then by Theorem (5.3.8) the function h : S → A : β ↦ ∫_α^β f satisfies h(β) = ∫_α^β f = g(β) − g(α), so [h] = [g] since the function β ↦ −g(α) is constant and hence an element of C.
So these two maps are inverses of each other and l /K by Theorem (5.1.8) and Theorem (5.3.5), hence C^k(S, A) ≃ C^{k+1}(S, A)/C ≃ C^{k+1}(S, A)/A as Vs /K. From Example (5.1.10) we know that C^{k+1}(S, A) ≤ C^k(S, A).
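Both directions of Theorem (5.3.8) can be illustrated numerically. The sketch below assumes A = R^2 and a hypothetical integrand `f(t) = (exp t, cos t)` with known antiderivative `(exp t, sin t)`; integrals are approximated by midpoint sums, and the signed step size takes care of the orientation convention of Definition (5.3.2).

```python
import math

def integral(f, alpha, beta, n=4000):
    """Midpoint sum for f : R -> R^2; a negative step handles alpha > beta."""
    h = (beta - alpha) / n
    sx = sy = 0.0
    for m in range(n):
        x, y = f(alpha + (m + 0.5) * h)
        sx += h * x
        sy += h * y
    return (sx, sy)

def f(t):
    """Hypothetical continuous integrand."""
    return (math.exp(t), math.cos(t))

def antiderivative(t):
    """A C^1 map whose derivative in direction 1 equals f."""
    return (math.exp(t), math.sin(t))

def g_alpha(beta, alpha=0.0):
    """The map beta -> integral of f from alpha to beta."""
    return integral(f, alpha, beta)

# First direction: the difference quotient of g_alpha approximates f(beta).
beta, step = 0.7, 1e-4
quotient = tuple((p - q) / (2 * step)
                 for p, q in zip(g_alpha(beta + step), g_alpha(beta - step)))
print(quotient, f(beta))

# Second direction: g(b) - g(a) equals the integral of f from a to b.
a, b = -1.0, 2.0
difference = tuple(p - q for p, q in zip(antiderivative(b), antiderivative(a)))
print(difference, integral(f, a, b))
```

The agreement is only up to discretisation error, since the midpoint sums stand in for the limit in the definition.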


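Lemma 5.3.10: Partial integration
Let A T Vs /K T2 LC UC , S ⊆ R an open interval, f ∈ C^1(S, K), and g ∈ C^1(S, A). Then for any α, β ∈ S we have

\[ \int_\alpha^\beta \bigl( \gamma \mapsto D_\gamma f(1)\, g(\gamma) \bigr) = f(\beta)\, g(\beta) - f(\alpha)\, g(\alpha) - \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma)\, D_\gamma g(1) \bigr). \]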
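As a quick sanity check of the partial integration formula, the following sketch compares both sides numerically. It assumes K = R, A = R^2, and hypothetical maps `f(t) = t^2` and `g(t) = (sin t, exp t)`; the integrals are approximated by midpoint sums.

```python
import math

def integral(func, alpha, beta, n=4000):
    """Midpoint sum for func : R -> R^2 on [alpha, beta]."""
    h = (beta - alpha) / n
    sx = sy = 0.0
    for m in range(n):
        x, y = func(alpha + (m + 0.5) * h)
        sx += h * x
        sy += h * y
    return (sx, sy)

def f(t):            # hypothetical scalar C^1 map
    return t * t

def df(t):           # D_t f(1)
    return 2.0 * t

def g(t):            # hypothetical C^1 map into R^2
    return (math.sin(t), math.exp(t))

def dg(t):           # D_t g(1)
    return (math.cos(t), math.exp(t))

def scale(s, v):
    """Scalar multiplication in R^2."""
    return (s * v[0], s * v[1])

alpha, beta = 0.2, 1.5
left = integral(lambda t: scale(df(t), g(t)), alpha, beta)
boundary = tuple(p - q for p, q in zip(scale(f(beta), g(beta)),
                                       scale(f(alpha), g(alpha))))
correction = integral(lambda t: scale(f(t), dg(t)), alpha, beta)
right = tuple(b - c for b, c in zip(boundary, correction))

print(left)
print(right)
```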

Proof. The scalar multiplication map K × A → A is c 2- l /K and f and g are C^1, so by Theorem (5.2.4) we obtain that the function

h : S → A : γ ↦ f(γ) g(γ)

satisfies h ∈ C^1(S, A), and

\[ D_\gamma h(1) = D_\gamma f(1)\, g(\gamma) + f(\gamma)\, D_\gamma g(1). \]

By Theorem (5.3.5) and Theorem (5.3.8) we therefore obtain that

\[
\begin{aligned}
f(\beta) g(\beta) - f(\alpha) g(\alpha) &= h(\beta) - h(\alpha) \\
&= \int_\alpha^\beta \bigl( \gamma \mapsto D_\gamma h(1) \bigr) \\
&= \int_\alpha^\beta \bigl( \gamma \mapsto D_\gamma f(1)\, g(\gamma) + f(\gamma)\, D_\gamma g(1) \bigr) \\
&= \int_\alpha^\beta \bigl( \gamma \mapsto D_\gamma f(1)\, g(\gamma) \bigr) + \int_\alpha^\beta \bigl( \gamma \mapsto f(\gamma)\, D_\gamma g(1) \bigr),
\end{aligned}
\]

which shows the desired result.

We can now generalise the derivative as a local, linear approximation of a given function near a given point, to the Taylor sequence, which gives us a linear, quadratic, cubic, … approximation.

Theorem 5.3.11: Taylor
Let A, B /K , U ⊆ A open abc, k ∈ N, and f ∈ C^{k+1}(U, B). Then for any a1 ∈ U − a we have

\[
\begin{aligned}
f(a + a_1) = {} & f(a) + D_a f(a_1) + \frac{1}{2!} D_a^2 f(a_1, a_1) + \dots + \frac{1}{k!} D_a^k f(\underbrace{a_1, \dots, a_1}_{k}) \\
& + \frac{1}{k!} \int_0^1 \Bigl( \alpha \mapsto (1 - \alpha)^k\, D^{k+1}_{a + \alpha a_1} f(\underbrace{a_1, \dots, a_1, a_1}_{k+1}) \Bigr).
\end{aligned}
\tag{5.6}
\]
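Equation (5.6) can be tested numerically in the simplest setting. The sketch assumes A = B = R, so that D_a^l f(a1, …, a1) is the l-th ordinary derivative of f at a times a1^l, and takes the hypothetical choice f = exp, whose derivatives are all exp. The remainder integral is approximated by a midpoint sum.

```python
import math

def integral(func, lo, hi, n=4000):
    """Midpoint sum for a real-valued function on [lo, hi]."""
    h = (hi - lo) / n
    return sum(h * func(lo + (m + 0.5) * h) for m in range(n))

def taylor_polynomial(a, a1, k):
    """Sum over l = 0..k of D_a^l f(a1, ..., a1) / l! for f = exp."""
    return sum(math.exp(a) * a1 ** l / math.factorial(l) for l in range(k + 1))

def integral_remainder(a, a1, k):
    """(1/k!) * integral over [0, 1] of (1 - t)^k * D^{k+1}_{a + t a1} f(a1, ..., a1)."""
    integrand = lambda t: (1.0 - t) ** k * math.exp(a + t * a1) * a1 ** (k + 1)
    return integral(integrand, 0.0, 1.0) / math.factorial(k)

a, a1, k = 0.4, 0.9, 3
reconstructed = taylor_polynomial(a, a1, k) + integral_remainder(a, a1, k)
print(reconstructed, math.exp(a + a1))
```

The polynomial part alone differs visibly from exp(a + a1); adding the integral remainder closes the gap up to quadrature error.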

Proof. Let a1 ∈ U − a, then a, a + a1 ∈ U and since U is convex and open, there exists an ε ∈ ]0, ∞[ such that a + α a1 = (1 − α) a + α (a + a1) ∈ U for all α ∈ S := ]−ε, 1 + ε[ ⊆ R.

Let k ∈ N, suppose f ∈ C^k(U, B), and let 0 ≤ l < k. The map g : S → A : α ↦ a + α a1 is C^1 with derivative D_α g(β) = β a1 for all α ∈ S. Suppose l = 0, choose h0 : S → B, h0(α) := f(a + α a1) = (f ∘ g)(α). By Theorem (5.1.8) we have (as l < k) h0 ∈ C^1(S, B) with derivative

\[ D_\alpha h_0(\beta) = D_{g(\alpha)} f(D_\alpha g(\beta)) = D_{a + \alpha a_1} f(\beta a_1). \]

Suppose 0 < l < k, choose h_l : S → B, h_l(α) := D^l_{a + α a1} f(a1, …, a1) = D^l_{g(α)} f(a1, …, a1). Then by Theorem (5.1.8), h_l ∈ C^1(S, B) and has derivative

\[ D_\alpha h_l(\beta) = D^{l+1}_{g(\alpha)} f(a_1, \dots, a_1, D_\alpha g(\beta)) = D^{l+1}_{a + \alpha a_1} f(a_1, \dots, a_1, \beta a_1). \]

Suppose k = 0 and f ∈ C^{0+1}(U, B). Then by Theorem (5.3.8) we have

\[
\begin{aligned}
\frac{1}{0!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^0\, D^1_{a + \alpha a_1} f(a_1) \bigr)
&= \int_0^1 \bigl( \alpha \mapsto D_\alpha h_0(1) \bigr) \\
&= h_0(1) - h_0(0) \\
&= f(a + a_1) - f(a).
\end{aligned}
\]

Therefore

\[ f(a + a_1) = f(a) + \frac{1}{0!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^0\, D^1_{a + \alpha a_1} f(a_1) \bigr) \]

and Equation (5.6) holds for k = 0.

Now suppose Equation (5.6) is true for k ∈ N and that f ∈ C^{(k+1)+1}(U, B). Choose i : S → K, i(α) := (−1/(k+1)!) (1 − α)^{k+1}, then i ∈ C^1(S, K) with D_α i(1) = (1/k!) (1 − α)^k, so using Lemma (5.3.10) we find

\[
\begin{aligned}
\frac{1}{k!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^k\, D^{k+1}_{a + \alpha a_1} f(a_1, \dots, a_1) \bigr)
&= \int_0^1 \bigl( \alpha \mapsto D_\alpha i(1)\, h_{k+1}(\alpha) \bigr) \\
&= i(1)\, h_{k+1}(1) - i(0)\, h_{k+1}(0) - \int_0^1 \bigl( \alpha \mapsto i(\alpha)\, D_\alpha h_{k+1}(1) \bigr) \\
&= 0 - \frac{-1}{(k+1)!}\, D^{k+1}_a f(a_1, \dots, a_1) \\
&\quad - \frac{-1}{(k+1)!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^{k+1}\, D^{k+2}_{a + \alpha a_1} f(a_1, \dots, a_1, a_1) \bigr).
\end{aligned}
\]

Since Equation (5.6) was true for k, we find

\[
\begin{aligned}
f(a + a_1) = {} & f(a) + D_a f(a_1) + \frac{1}{2!} D^2_a f(a_1, a_1) + \dots + \frac{1}{k!} D^k_a f(a_1, \dots, a_1) \\
& + \frac{1}{k!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^k\, D^{k+1}_{a + \alpha a_1} f(a_1, \dots, a_1) \bigr) \\
= {} & f(a) + D_a f(a_1) + \frac{1}{2!} D^2_a f(a_1, a_1) + \dots + \frac{1}{k!} D^k_a f(a_1, \dots, a_1) \\
& + \frac{1}{(k+1)!} D^{k+1}_a f(a_1, \dots, a_1, a_1) \\
& + \frac{1}{(k+1)!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^{k+1}\, D^{k+2}_{a + \alpha a_1} f(a_1, \dots, a_1, a_1, a_1) \bigr).
\end{aligned}
\]


Hence Equation (5.6) holds for k + 1. So with induction, Equation (5.6) holds for all k ∈ N.

Corollary 5.3.12: Taylor approximation
Let A, B T Vs /K T2 LC UC , U ⊆ A open abc, k ∈ N.

• Suppose f ∈ C^{k+1}(U, B), then for any a1 ∈ U − a we have

\[ \lim_{\alpha \to 0} \frac{1}{\alpha^{k+1}} \Bigl( f(a + \alpha a_1) - \sum_{l=0}^{k} \frac{\alpha^l}{l!}\, D^l_a f(\underbrace{a_1, \dots, a_1}_{l}) \Bigr) = \frac{1}{(k+1)!}\, D^{k+1}_a f(\underbrace{a_1, \dots, a_1}_{k+1}). \]

In particular,

\[ f(a + \alpha a_1) = f(a) + \frac{\alpha^1}{1!} D^1_a f(a_1) + \dots + \frac{\alpha^k}{k!} D^k_a f(a_1, \dots, a_1) + O(\alpha^{k+1}). \]

• Suppose f ∈ C^k(U, B) and let ‖·‖ : B → R c be a seminorm. Suppose there exists an ε ∈ ]0, ∞[ such that for all a1 ∈ U − a and all α ∈ [0, 1]

\[ \bigl\| D^k_{a + \alpha a_1} f(a_1, \dots, a_1) - D^k_a f(a_1, \dots, a_1) \bigr\| \le \varepsilon^k, \]

then for any a1 ∈ U − a we have

\[ \Bigl\| f(a + a_1) - \sum_{l=0}^{k} \frac{1}{l!}\, D^l_a f(\underbrace{a_1, \dots, a_1}_{l}) \Bigr\| \le \frac{\varepsilon^k}{k!}. \]

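Proof.
• Suppose f ∈ C^{k+1}(U, B) and let a1 ∈ U − a, α ∈ B_K(0, 1) \ {0}, then from Equation (5.6) we find

\[
\begin{aligned}
\frac{1}{\alpha^{k+1}} \Bigl( f(a + \alpha a_1) - \sum_{l=0}^{k} \frac{\alpha^l}{l!}\, D^l_a f(a_1, \dots, a_1) \Bigr)
&= \frac{1}{\alpha^{k+1}} \Bigl( f(a + \alpha a_1) - f(a) - \dots - \frac{1}{k!} D^k_a f(\alpha a_1, \dots, \alpha a_1) \Bigr) \\
&= \frac{1}{\alpha^{k+1}} \Bigl( \frac{1}{k!} \int_0^1 \bigl( \beta \mapsto (1 - \beta)^k\, D^{k+1}_{a + \beta \alpha a_1} f(\alpha a_1, \dots, \alpha a_1) \bigr) \Bigr) \\
&= \frac{\alpha^{k+1}}{k!\, \alpha^{k+1}} \int_0^1 \bigl( \beta \mapsto (1 - \beta)^k\, D^{k+1}_{a + \beta \alpha a_1} f(a_1, \dots, a_1) \bigr).
\end{aligned}
\]

Hence

\[
\begin{aligned}
\lim_{\alpha \to 0} \frac{1}{\alpha^{k+1}} \Bigl( f(a + \alpha a_1) - \sum_{l=0}^{k} \frac{\alpha^l}{l!}\, D^l_a f(a_1, \dots, a_1) \Bigr)
&= \lim_{\alpha \to 0} \frac{1}{k!} \int_0^1 \bigl( \beta \mapsto (1 - \beta)^k\, D^{k+1}_{a + \beta \alpha a_1} f(a_1, \dots, a_1) \bigr) \\
&= \frac{1}{k!} \int_0^1 \bigl( \beta \mapsto (1 - \beta)^k\, D^{k+1}_{a + \beta\, 0\, a_1} f(a_1, \dots, a_1) \bigr) \\
&= \frac{1}{(k+1)!}\, D^{k+1}_a f(a_1, \dots, a_1),
\end{aligned}
\]

where in the second-to-last step we used the fact that the map K × R → B : (α, β) ↦ (1 − β)^k D^{k+1}_{a + βα a1} f(a1, …, a1) is c together with Lemma (5.3.7).

• Suppose f ∈ C^k(U, B), then we find with Equation (5.6) for any a1 ∈ U − a that

\[
\begin{aligned}
f(a + a_1) = {} & f(a) + D_a f(a_1) + \frac{1}{2!} D^2_a f(a_1, a_1) + \dots + \frac{1}{(k-1)!} D^{k-1}_a f(a_1, \dots, a_1) \\
& + \frac{1}{(k-1)!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^{k-1}\, D^k_{a + \alpha a_1} f(a_1, \dots, a_1, a_1) \bigr) \\
& + \frac{1}{k!} D^k_a f(a_1, \dots, a_1, a_1) - \int_0^1 \Bigl( \alpha \mapsto \frac{(1 - \alpha)^{k-1}}{(k-1)!}\, D^k_a f(a_1, \dots, a_1, a_1) \Bigr) \\
= {} & \sum_{l=0}^{k} \frac{1}{l!}\, D^l_a f(\underbrace{a_1, \dots, a_1}_{l}) \\
& + \frac{1}{(k-1)!} \int_0^1 \Bigl( \alpha \mapsto (1 - \alpha)^{k-1} \bigl( D^k_{a + \alpha a_1} f(a_1, \dots, a_1) - D^k_a f(a_1, \dots, a_1) \bigr) \Bigr).
\end{aligned}
\]

Now using Theorem (5.3.5) and the fact that ‖·‖ is c and a seminorm, we find

\[
\begin{aligned}
& \Bigl\| \frac{1}{(k-1)!} \int_0^1 \Bigl( \alpha \mapsto (1 - \alpha)^{k-1} \bigl( D^k_{a + \alpha a_1} f(a_1, \dots, a_1) - D^k_a f(a_1, \dots, a_1) \bigr) \Bigr) \Bigr\| \\
& \quad \le \frac{1}{(k-1)!} \int_0^1 \Bigl( \alpha \mapsto (1 - \alpha)^{k-1} \bigl\| D^k_{a + \alpha a_1} f(a_1, \dots, a_1) - D^k_a f(a_1, \dots, a_1) \bigr\| \Bigr) \\
& \quad \le \frac{1}{(k-1)!} \int_0^1 \bigl( \alpha \mapsto (1 - \alpha)^{k-1}\, \varepsilon^k \bigr) \\
& \quad = \frac{\varepsilon^k}{k!},
\end{aligned}
\]

from which the estimate follows.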
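The limit in the first part of Corollary (5.3.12) can be observed numerically. This sketch assumes A = B = R and the hypothetical choice f = sin with k = 2, so the expected limit is f'''(a) a1^3 / 3! = −cos(a) a1^3 / 6. The values of `a` and `a1` are arbitrary illustration choices.

```python
import math

a, a1, k = 0.3, 2.0, 2

# Derivatives of sin at a of order 0, 1, 2, 3.
derivatives = [math.sin(a), math.cos(a), -math.sin(a), -math.cos(a)]

def quotient(alpha):
    """(f(a + alpha a1) - Taylor polynomial of order k) / alpha^(k+1)."""
    polynomial = sum(alpha ** l / math.factorial(l) * derivatives[l] * a1 ** l
                     for l in range(k + 1))
    return (math.sin(a + alpha * a1) - polynomial) / alpha ** (k + 1)

limit = derivatives[k + 1] * a1 ** (k + 1) / math.factorial(k + 1)

for alpha in (1e-1, 1e-2, 1e-3):
    print(alpha, quotient(alpha), limit)
```

Taking alpha much smaller than shown here would eventually let floating-point cancellation dominate, so the printed range is deliberately modest.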


Finally, Theorem (5.3.8) permits us to show that the derivative introduced in Definition (5.1.1) is equivalent to the Gâteaux derivative, as introduced in [Ham1982], if the involved function is continuously differentiable.

Theorem 5.3.13: Compatibility with Gâteaux derivative
Let A, B T Vs /K T2 LC UC , U ⊆ A open abc, f : U → B. Then f ∈ C^1(U, B) if and only if there exists a map g : U × U × A → B c (compare with Lemma 3.3.1 of [Ham1982]), such that for all a1, a2 ∈ U, A → B : a3 ↦ g(a1, a2, a3) l /K and

f(a2) − f(a1) = g(a1, a2, a2 − a1).

If this is the case, then for all a ∈ U, a1 ∈ A,

Daf(a1) = g(a, a, a1).

Proof. We follow [Ham1982]. Suppose such a map g exists. Fix a ∈ U, a1 ∈ A. Let a2 ∈ A and α ∈ K small enough but nonzero. Then

\[ \frac{1}{\alpha}\bigl( f(a + \alpha a_2) - f(a) \bigr) = \frac{1}{\alpha}\, g(a, a + \alpha a_2, \alpha a_2) = g(a, a + \alpha a_2, a_2), \]

so as g c ,

\[ \lim_{(\alpha, a_2) \to (0, a_1)} \frac{1}{\alpha}\bigl( f(a + \alpha a_2) - f(a) \bigr) = g(a, a + 0\, a_1, a_1). \]

Since the map A → B : a2 ↦ g(a, a, a2) l /K we find with Lemma (5.1.4) that f d a and D_a f(a1) = g(a, a, a1). Furthermore, g c by assumption, so (a, a1) ↦ D_a f(a1) = g(a, a, a1) c , and hence f ∈ C^1(U, B).

Suppose conversely that f ∈ C^1(U, B). Then we can define g : U × U × A → B by

\[ g(a_1, a_2, a_3) := \int_0^1 \bigl( \alpha \mapsto D_{\alpha a_2 + (1 - \alpha) a_1} f(a_3) \bigr). \]

Then by the fact that (a, a1) ↦ D_a f(a1) c and Lemma (5.3.7) we find that g c . By Theorem (5.3.5) and the fact that D_a f l we also find that for any a1, a2 ∈ U, a3 ↦ g(a1, a2, a3) l . Furthermore, by Theorem (5.3.8), for any a1, a2 ∈ U

\[
\begin{aligned}
f(a_2) - f(a_1) &= f(1\, a_2 + (1 - 1)\, a_1) - f(0\, a_2 + (1 - 0)\, a_1) \\
&= \int_0^1 \bigl( \alpha \mapsto D_{\alpha a_2 + (1 - \alpha) a_1} f(a_2 - a_1) \bigr) \\
&= g(a_1, a_2, a_2 - a_1).
\end{aligned}
\]
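The map g constructed in the second half of the proof can be computed explicitly in a small case. The sketch assumes A = R^2, B = R, and a hypothetical function `f(x, y) = x^2 y + sin(y)` with hand-computed derivative `df`; the integral over [0, 1] is approximated by a midpoint sum. It checks the identity f(a2) − f(a1) = g(a1, a2, a2 − a1) and that g(a, a, a1) recovers D_a f(a1).

```python
import math

def f(p):
    """Hypothetical C^1 function R^2 -> R."""
    x, y = p
    return x * x * y + math.sin(y)

def df(p, v):
    """Derivative D_p f(v), computed by hand for the f above."""
    x, y = p
    return 2.0 * x * y * v[0] + (x * x + math.cos(y)) * v[1]

def g(a1, a2, a3, n=4000):
    """Midpoint sum for the integral over [0, 1] of D_{t a2 + (1 - t) a1} f(a3)."""
    h = 1.0 / n
    total = 0.0
    for m in range(n):
        t = (m + 0.5) * h
        point = (t * a2[0] + (1.0 - t) * a1[0], t * a2[1] + (1.0 - t) * a1[1])
        total += h * df(point, a3)
    return total

a1, a2 = (0.5, -1.0), (2.0, 0.3)
increment = (a2[0] - a1[0], a2[1] - a1[1])
print(f(a2) - f(a1), g(a1, a2, increment))

direction = (1.0, -0.5)
print(df(a1, direction), g(a1, a1, direction))
```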

We are now going to prove a result that will be necessary for the treatment of in Section 6.4. This statement originates from the study of classical mechanics, see [Dui2006].


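Theorem 5.3.14: Euler-Lagrange variational formula (La(f, g))
Let A, B T Vs /K T2 LC UC , U ⊆ A × A open. Define for f ∈ C^2(U, B) and any path g : S → A with S ⊆ R an open interval, g ∈ C^2(S, A), and (g(α), g′(α)) ∈ U for all α ∈ S, the Lagrange map of g with respect to f as La(f, g) : S × A → B, given by the family of linear maps

\[ \mathrm{La}(f, g)(\alpha, a) := \frac{\partial}{\partial \alpha} \Bigl( \frac{\partial f(g(\alpha), a_2)}{\partial a_2} \Big|_{a_2 = g'(\alpha)} \Bigr)(a) - \Bigl( \frac{\partial f(a_1, g'(\alpha))}{\partial a_1} \Big|_{a_1 = g(\alpha)} \Bigr)(a). \]

Let C /K , W ⊆ C open and a family of paths g : W × S → A : (c, α) ↦ g_c(α), g ∈ C^2(W × S, A), S ⊆ R an open interval, such that (g_c(α), g_c′(α)) ∈ U for all c ∈ W, α ∈ S. Then we have the Euler-Lagrange variational formula: for any γ, δ ∈ S, c ∈ W, and c1 ∈ C we have

\[
\begin{aligned}
\frac{\partial}{\partial c} \Bigl( \int_\gamma^\delta \bigl( \alpha \mapsto f(g_c(\alpha), g_c'(\alpha)) \bigr) \Bigr)(c_1)
= {} & -\int_\gamma^\delta \Bigl( \alpha \mapsto \mathrm{La}(f, g_c)\Bigl( \alpha, \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr) \Bigr) \\
& + \frac{\partial f(g_c(\delta), a_2)}{\partial a_2} \Big|_{a_2 = g_c'(\delta)} \Bigl( \frac{\partial g_c(\delta)}{\partial c}(c_1) \Bigr)
 - \frac{\partial f(g_c(\gamma), a_2)}{\partial a_2} \Big|_{a_2 = g_c'(\gamma)} \Bigl( \frac{\partial g_c(\gamma)}{\partial c}(c_1) \Bigr).
\end{aligned}
\tag{5.7}
\]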

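Proof. We follow [Dui2006]. Note that g_c′(α) = (∂g_c(α)/∂α)(1). By Lemma (5.3.7), Lemma (5.1.13), and Theorem (5.1.8) we have

\[
\begin{aligned}
\frac{\partial}{\partial c} \Bigl( \int_\gamma^\delta \bigl( \alpha \mapsto f(g_c(\alpha), g_c'(\alpha)) \bigr) \Bigr)(c_1)
&= \int_\gamma^\delta \Bigl( \alpha \mapsto \frac{\partial}{\partial c} \bigl( f(g_c(\alpha), g_c'(\alpha)) \bigr)(c_1) \Bigr) \\
&= \int_\gamma^\delta \Bigl( \alpha \mapsto \frac{\partial f(a_1, g_c'(\alpha))}{\partial a_1} \Big|_{a_1 = g_c(\alpha)} \Bigl( \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr)
 + \frac{\partial f(g_c(\alpha), a_2)}{\partial a_2} \Big|_{a_2 = g_c'(\alpha)} \Bigl( \frac{\partial^2 g_c(\alpha)}{\partial c\, \partial \alpha}(1, c_1) \Bigr) \Bigr).
\end{aligned}
\]

Note that with Theorem (5.1.8) and Equation (5.4)

\[
\begin{aligned}
\frac{\partial}{\partial \alpha} \Bigl( \frac{\partial f(g_c(\alpha), a_2)}{\partial a_2} \Big|_{a_2 = g_c'(\alpha)} \Bigl( \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr) \Bigr)
= {} & \mathrm{La}(f, g_c)\Bigl( \alpha, \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr)
 + \frac{\partial f(a_1, g_c'(\alpha))}{\partial a_1} \Big|_{a_1 = g_c(\alpha)} \Bigl( \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr) \\
& + \frac{\partial f(g_c(\alpha), a_2)}{\partial a_2} \Big|_{a_2 = g_c'(\alpha)} \Bigl( \frac{\partial^2 g_c(\alpha)}{\partial \alpha\, \partial c}(c_1, 1) \Bigr).
\end{aligned}
\]

So with Theorem (5.1.16) we obtain from our first expression that

\[
\frac{\partial}{\partial c} \Bigl( \int_\gamma^\delta \bigl( \alpha \mapsto f(g_c(\alpha), g_c'(\alpha)) \bigr) \Bigr)(c_1)
= \int_\gamma^\delta \Bigl( \alpha \mapsto -\mathrm{La}(f, g_c)\Bigl( \alpha, \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr)
 + \frac{\partial}{\partial \alpha} \Bigl( \frac{\partial f(g_c(\alpha), a_2)}{\partial a_2} \Big|_{a_2 = g_c'(\alpha)} \Bigl( \frac{\partial g_c(\alpha)}{\partial c}(c_1) \Bigr) \Bigr) \Bigr),
\]

which yields Equation (5.7) via Theorem (5.3.5) and Theorem (5.3.8).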
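Equation (5.7) can be checked on a one-dimensional toy problem. The sketch assumes A = B = C = R, the hypothetical Lagrangian f(q, v) = v^2/2 − q^2/2, and the hypothetical family of paths g_c(t) = c t^2 with c1 = 1. All partial derivatives are written out by hand for these choices; for this f the Lagrange map reduces to La(f, g_c)(t, a) = (g_c''(t) + g_c(t)) a.

```python
def integral(func, lo, hi, n=4000):
    """Midpoint sum for a real-valued function on [lo, hi]."""
    h = (hi - lo) / n
    return sum(h * func(lo + (m + 0.5) * h) for m in range(n))

def f(q, v):           # hypothetical Lagrangian
    return 0.5 * v * v - 0.5 * q * q

def df_dq(q, v):       # partial derivative of f in its first argument
    return -q

def df_dv(q, v):       # partial derivative of f in its second argument
    return v

def path(c, t):        # hypothetical family of paths g_c(t)
    return c * t * t

def path_dt(c, t):     # g_c'(t)
    return 2.0 * c * t

def path_dtt(c, t):    # g_c''(t)
    return 2.0 * c

def path_dc(c, t):     # derivative of g_c(t) with respect to c, direction c1 = 1
    return t * t

def lagrange_map(c, t, a):
    """d/dt of df_dv along the path, minus df_dq along the path, applied to a."""
    return (path_dtt(c, t) - df_dq(path(c, t), path_dt(c, t))) * a

def action(c, lo, hi):
    """Integral of f along the path g_c."""
    return integral(lambda t: f(path(c, t), path_dt(c, t)), lo, hi)

c, lo, hi, step = 1.5, 0.0, 1.0, 1e-4

# Left-hand side of (5.7): derivative of the action with respect to c.
lhs = (action(c + step, lo, hi) - action(c - step, lo, hi)) / (2.0 * step)

# Right-hand side of (5.7): minus the integrated Lagrange map plus boundary terms.
boundary = (df_dv(path(c, hi), path_dt(c, hi)) * path_dc(c, hi)
            - df_dv(path(c, lo), path_dt(c, lo)) * path_dc(c, lo))
rhs = -integral(lambda t: lagrange_map(c, t, path_dc(c, t)), lo, hi) + boundary

print(lhs, rhs)
```

For these choices both sides can also be computed by hand and equal 17c/15, which gives an independent check on the printed values.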

5.4 Fréchet spaces

Definition 5.4.1: Fréchet space ( Fr )
Let A be a set. Then we call A a Fréchet space (denoted by A Fr /K) if A T Vs /K T2 LC UC for which the collection of seminorms giving rise to local convexity is countable.

Using Lemma (4.5.10) we see that for any A Fr /K, A FS /K. Therefore almost all theory derived in the previous sections is valid for Fréchet spaces; we summarise these results in Theorem (5.4.2) for convenience.

Theorem 5.4.2
Let A, B Fr /K with seminorms {‖·‖_i | i ∈ N}, {‖·‖′_j | j ∈ N} respectively.

• Let f : A → B be a map, a ∈ A, b ∈ B. Then the following are equivalent:
  – lim_{x→a} f(x) = b,
  – for all j ∈ N and ε ∈ ]0, ∞[ there exist i1, …, ik ∈ N and a δ ∈ ]0, ∞[ such that for all a1 ∈ A with ‖a1 − a‖_{i1} < δ, …, ‖a1 − a‖_{ik} < δ we have ‖f(a1) − b‖′_j < ε,
  – for all sequences x : N → A with lim_{k→∞} x_k = a we have for all j ∈ N that lim_{k→∞} ‖f(x_k) − b‖′_j = 0.

• Let a ∈ A. Then the following are equivalent:
  – a = 0,
  – for all i ∈ N, ‖a‖_i = 0,
  – for all f ∈ A′, f(a) = 0.

• Suppose B ≤ A. Then for any f ∈ B′ there exists a g ∈ A′ with g|_B = f.

• We have that A ≃ (A′)′ are T Vs -isomorphic.

• Let f : A → B l /K. Then the following are equivalent:
  – f c ,
  – graph(f) ⊆ A × B is closed,
  – for all j ∈ N there exists an α ∈ ]0, ∞[ and i1, …, ik ∈ N such that for all a ∈ A we have ‖f(a)‖′_j ≤ α (‖a‖_{i1} + … + ‖a‖_{ik}).

• Let f : A → B c l /K. Then
  – f is surjective if and only if f is open,
  – f is bijective if and only if f^{-1} : B → A c l /K if and only if f is a T Vs -isomorphism.

Proof.
• Use Lemma (4.5.4) for the first equivalence, together with the fact that the topology on A and B is the initial topology of their respective seminorms, Lemma (2.1.19). By Lemma (4.5.10), A FS and hence A d(.,.) . Therefore by Theorem (2.5.10) and Lemma (2.3.4) we obtain equivalence with the third item.
• First of all if a = 0, then by definition ‖a‖_i = ‖0 a‖_i = 0 ‖a‖_i = 0 and f(a) = f(0 a) = 0 f(a) = 0 for any i ∈ I, f ∈ A′. Conversely use Theorem (4.5.14) and Lemma (4.5.6).
• This is Theorem (4.5.14).
• By Theorem (4.5.15) we have that f : A → (A′)′ : a ↦ (g ↦ g(a)) is c l /K and bijective. Hence by Corollary (4.4.6) and the fact that A Fr we see that f is a T Vs -isomorphism.
• Use Theorem (4.4.5) and Lemma (4.5.5).
• Use Theorem (4.4.5) and Corollary (4.4.6).

Despite Theorem (5.4.2), Section 5.1, and Section 5.3 there are still quite a few results which are not valid in general Fréchet spaces, among which the inverse function theorem and existence and uniqueness of solutions of ordinary differential equations. An extensive treatment of a version of the inverse function theorem that can be applied in a broader context (the Nash-Moser inverse function theorem) is given in [Ham1982]. Here we will only include two counterexamples from this article.

Example 5.4.3: Fréchet spaces and the inverse function theorem
Let A := {f : C → C | f holomorphic on C} with topology induced by the seminorms

\[ \|f\|_{k,B} := \sup\{ |f^{(l)}(x)| \in R \mid 0 \le l \le k,\ x \in B \} \]

for all k ∈ N and B ⊆ C Cpt (here f^{(l)} denotes the l-th derivative of f). As Q^2 ⊆ C is countable and dense we can make a countable selection of seminorms which makes A Fr /C.


Consider the exponential map F : A → A given by

\[ F(f)(x) := e^{f(x)} \]

for all f ∈ A, x ∈ C. Then F ∈ C^∞(A, A) with derivative given for all k ∈ N and f, f1, …, fk ∈ A by

\[ \bigl( D^k_f F(f_1, \dots, f_k) \bigr)(x) = e^{f(x)} f_1(x) \cdots f_k(x). \]

In particular D_f F is bijective for all f ∈ A with inverse given by

\[ [D_f F]^{-1} : A \to A : f_1 \mapsto \bigl( x \mapsto e^{-f(x)} f_1(x) \bigr). \]

It is clear that the family of inverses

\[ [DF]^{-1} : A \times A \to A : (f, f_1) \mapsto [D_f F]^{-1}(f_1) \]

is also C^∞ as a function A × A → A.

Now let B := {f ∈ A | ∀x ∈ C : f(x) ≠ 0} ⊆ A, then F(A) ⊆ B by definition, since |F(f)(x)| = |e^{f(x)}| > 0 for all f ∈ A, x ∈ C. Suppose that there is a nonempty open U ⊆ A for which U ⊆ B. Let f ∈ U, then because f ∈ A is holomorphic there exists a sequence of polynomials N → A : k ↦ p_k, for example

\[ p_k(x) := f(0) + \frac{f^{(1)}(0)}{1!} x^1 + \dots + \frac{f^{(k)}(0)}{k!} x^k, \]

for which lim_{k→∞} p_k = f. Since U is an open neighbourhood of f in A, there exists a k ∈ N such that p_l ∈ U for all l ≥ k. In particular p_k ∈ U ⊆ B so p_k has no zeroes; this is in contradiction with the fundamental theorem of algebra: p_k has k > 0 zeroes. Therefore B ⊇ F(A) cannot contain a nonempty open set and hence F does not have a differentiable inverse defined on any open U ⊆ A, even though F ∈ C^∞(A, A), D_f F is invertible for all f ∈ A, and the family of inverses [DF]^{-1} ∈ C^∞(A × A, A).

Example 5.4.4: Fréchet spaces and ordinary differential equations
Let A := C^∞([−1, 1], R) with topology induced by the seminorms

\[ \|f\|_k := \sup\{ |f^{(l)}(x)| \in R \mid 0 \le l \le k,\ x \in [-1, 1] \} \]

which make A Fr /R. Let F : A → A be the map

\[ F(f)(x) := f'(x). \]

Then F ∈ C^∞(A, A) because F c l /R. Now consider the ordinary differential equation for fixed f0 ∈ A

\[ \gamma'(t) = F(\gamma(t)), \qquad \gamma(0) = f_0 \]


where γ : I → A for an open interval 0 ∈ I ⊆ R is the sought solution. Writing γ(t, x) = γ(t)(x) we see that the differential equation can be written as

\[ \frac{\partial \gamma}{\partial t}(t, x) = \frac{\partial \gamma}{\partial x}(t, x), \qquad \gamma(0, x) = f_0(x) \]

for all t ∈ I, x ∈ [−1, 1]. This admits solutions γ(t, x) = f(t + x) where f : I + [−1, 1] → R is a smooth function satisfying f|_{[−1,1]} = f0. Hence the solution is by no means unique (e.g. take the function f0(x) = e^{−1/(x−1)^2} for x ∈ ]−1, 1[, f0(±1) = 0, then f0^{(k)}(±1) = 0 for all k ∈ N, so we can find a myriad of different extensions f of f0).

On the other hand, if we would take

\[ A := \{ f : R \to R \mid f|_{[-1,1]} \in C^\infty([-1, 1]),\ \forall x \le -1 : f(x) = 0 \}, \]

then the differential equation does not have any solution if f0(x) ≠ 0 for x ∈ ]−1, 1[ (e.g. again consider f0(x) = e^{−1/(x−1)^2} > 0 for x ∈ ]−1, 1[ and f0(x) = 0 for x ∉ ]−1, 1[).

5.5 Banach spaces

Definition 5.5.1: Banach ( Ba )
Let A be a set. Then we call A a Banach space (denoted by A Ba /K) if A ||.|| /K (recall Definition (4.2.3) and Definition (4.2.4)) which is complete as a metric space.

Lemma 5.5.2
Let A Vs /K. Suppose A Ba /K, then A Fr /K. Conversely, if A Fr /K and the collection of seminorms giving rise to local convexity is finite, then A Ba /K.

Proof. By Lemma (4.5.7) we see that A ||.|| if and only if A T Vs T2 LC with a finite number of seminorms. Suppose A Ba /K, then A is complete by definition and since A only has a single norm, this implies that A UC with a single seminorm and hence A Fr . Suppose conversely that A Fr with a finite number of seminorms, then A ||.|| and by Lemma (4.5.10), A is complete, so A Ba .

In particular all results in Theorem (5.4.2) hold for Ba .

Lemma 5.5.3
Let k ∈ N, A1, …, Ak, B Ba /K, f : A1 × … × Ak → B k- l /K. Then f c if and only if there exists an α ∈ ]0, ∞[ such that for all a1 ∈ A1, …, ak ∈ Ak we have

\[ \| f(a_1, \dots, a_k) \|_B \le \alpha\, \|a_1\|_{A_1} \cdots \|a_k\|_{A_k}. \]

Furthermore if A Ba /K, U ⊆ A open, and f : U × A1 × … × Ak → B a k-linear family, then for all a ∈ U there exists a δ ∈ ]0, ∞[ and α ∈ ]0, ∞[ such that for all a′ ∈ B_A(a, δ) ⊆ U, a1 ∈ A1, …, ak ∈ Ak we have

\[ \| f_{a'}(a_1, \dots, a_k) \|_B \le \alpha\, \|a_1\|_{A_1} \cdots \|a_k\|_{A_k}. \tag{5.8} \]


Proof. Suppose that the estimate holds for all a1 ∈ A1, …, ak ∈ Ak. Let ε ∈ ]0, ∞[ be given, then pick δ = (ε/α)^{1/k} ∈ ]0, ∞[ to obtain for all a1 ∈ B_{A1}(0, δ), …, ak ∈ B_{Ak}(0, δ) that ‖f(a1, …, ak)‖_B ≤ α ‖a1‖_{A1} … ‖ak‖_{Ak} < α δ^k = ε. Hence lim_{(a1,…,ak)→(0,…,0)} f(a1, …, ak) = 0 and since f k- l we have f c .

Now let A Ba , U ⊆ A open and f : U × A1 × … × Ak → B a k-linear family. Let a ∈ U. Then lim_{(a′,a1,…,ak)→(a,0,…,0)} f_{a′}(a1, …, ak) = f_a(0, …, 0) = 0, so for 1 ∈ ]0, ∞[ there exists a δ ∈ ]0, ∞[ such that B_A(a, δ) ⊆ U and for all a′ ∈ B_A(a, δ), a1 ∈ B_{A1}(0, 2δ), …, ak ∈ B_{Ak}(0, 2δ) we have f_{a′}(a1, …, ak) ∈ B_B(0, 1). Let a′ ∈ B_A(a, δ), a1 ∈ A1, …, ak ∈ Ak, then if for 1 ≤ l ≤ k some ‖a_l‖_{A_l} = 0, a_l = 0, so ‖f_{a′}(a1, …, ak)‖_B = ‖0‖_B = 0 and Equation (5.8) holds. Otherwise note that for 1 ≤ l ≤ k, ‖(δ/‖a_l‖_{A_l}) a_l‖_{A_l} = δ ‖a_l‖_{A_l}/‖a_l‖_{A_l} = δ < 2δ, so (δ/‖a_l‖_{A_l}) a_l ∈ B_{A_l}(0, 2δ) and hence

\[ \frac{\delta}{\|a_1\|_{A_1}} \cdots \frac{\delta}{\|a_k\|_{A_k}}\, \| f_{a'}(a_1, \dots, a_k) \|_B = \Bigl\| f_{a'}\Bigl( \frac{\delta}{\|a_1\|_{A_1}} a_1, \dots, \frac{\delta}{\|a_k\|_{A_k}} a_k \Bigr) \Bigr\|_B < 1, \]

so ‖f_{a′}(a1, …, ak)‖_B ≤ (1/δ^k) ‖a1‖_{A1} … ‖ak‖_{Ak}. Hence if we pick α = 1/δ^k ∈ ]0, ∞[, then we see that Equation (5.8) holds.

Definition 5.5.4: Space of all continuous linear maps (L(A, B))
Let A, B Ba /K. Define the space of all continuous linear maps between A and B (denoted by L(A, B)) by L(A, B) := {f : A → B | f c l /K} together with 0(a) := 0, (f + g)(a) := f(a) + g(a), (α f)(a) := α f(a) and the norm ‖·‖∞ : L(A, B) → R defined by

\[ \|f\|_\infty := \sup\{ \|f(a)\|_B / \|a\|_A \in R \mid a \in A \setminus \{0\} \}. \]

Note that this definition implies that for all f ∈ L(A, B) and a ∈ A we have

\[ \|f(a)\|_B \le \|f\|_\infty\, \|a\|_A \]

and that for all f ∈ L(A, B), ‖f‖∞ < ∞ exists by Lemma (5.5.3).

Lemma 5.5.5
Let A, B Ba /K, U ⊆ A open abc, f ∈ C^1(U, B). If there exists a δ ∈ ]0, ∞[ such that ‖D_a f‖∞ ≤ δ for all a ∈ U, then f δ- : for all a1, a2 ∈ U we have that

\[ \| f(a_2) - f(a_1) \|_B \le \delta\, \| a_2 - a_1 \|_A. \]

Proof. Let a1, a2 ∈ U. Since U is convex by assumption, α a2 + (1 − α) a1 ∈ U for all α ∈ [0, 1]. Hence by Theorem (5.3.5) and Theorem (5.3.8)

\[
\begin{aligned}
\| f(a_2) - f(a_1) \|_B &= \| f(1\, a_2 + (1 - 1)\, a_1) - f(0\, a_2 + (1 - 0)\, a_1) \|_B \\
&= \Bigl\| \int_0^1 \bigl( \alpha \mapsto D_{\alpha a_2 + (1 - \alpha) a_1} f(a_2 - a_1) \bigr) \Bigr\|_B \\
&\le \int_0^1 \bigl( \alpha \mapsto \| D_{\alpha a_2 + (1 - \alpha) a_1} f(a_2 - a_1) \|_B \bigr) \\
&\le \int_0^1 \bigl( \alpha \mapsto \| D_{\alpha a_2 + (1 - \alpha) a_1} f \|_\infty\, \| a_2 - a_1 \|_A \bigr) \\
&\le \int_0^1 \bigl( \alpha \mapsto \delta\, \| a_2 - a_1 \|_A \bigr) \\
&= \delta\, \| a_2 - a_1 \|_A.
\end{aligned}
\]

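Theorem 5.5.6: Properties of $L(A, B)$
Let $A$, $B$ be Banach spaces over $K$. Then

• $L(A, B)$ is a Banach space over $K$,

• for a normed space $C$ over $K$, $f \in L(A, B)$, $g \in L(B, C)$, we have
\[ \|g \circ f\|_\infty \le \|g\|_\infty \|f\|_\infty, \]

• for any $U \subseteq A$ open and continuous map $f : U \to L(B, C)$, the induced map
\[ g : U \times B \to C : (a, b) \mapsto f(a)(b) \]
is continuous,

• for any $f \in L(A, A)$ with $\|f\|_\infty < 1$, $\mathrm{id}_A - f : A \to A$ is invertible and
\[ \|(\mathrm{id}_A - f)^{-1}\|_\infty \le \frac{1}{1 - \|f\|_\infty}, \]

• the collection of invertible maps
\[ L(A, B)^* := \{ f \in L(A, B) \mid f \text{ bijective} \} \subseteq L(A, B) \]
is open and the map
\[ L(A, B)^* \to L(B, A)^* : f \mapsto f^{-1} \]
is continuous, in particular it is a T-isomorphism.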

Proof. • By Lemma (5.5.3), the supremum $\|f\|_\infty$ exists in $\mathbb{R}$ for all $f \in L(A, B)$ as $\|f(a)\|_B / \|a\|_A \le \alpha$ for all $a \in A \setminus \{0\}$. It is also clear that $\|f\|_\infty \ge 0$ for all $f \in L(A, B)$. Suppose that $\|f\|_\infty = 0$, then necessarily $\|f(a)\|_B = 0$ for all $a \in A$, so $f(a) = 0$ for all $a \in A$ and hence $f = 0$. As for all $a \in A$, $\|(f + g)(a)\|_B = \|f(a) + g(a)\|_B \le \|f(a)\|_B + \|g(a)\|_B$ we see that $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$. Furthermore for $\alpha \in K$, $a \in A$, $\|(\alpha f)(a)\|_B = \|\alpha f(a)\|_B = |\alpha| \, \|f(a)\|_B$, so $\|\alpha f\|_\infty = |\alpha| \, \|f\|_\infty$. This makes $\|\cdot\|_\infty$ a norm and therefore $L(A, B)$ a normed space over $K$.

Let $\mathbb{N} \to L(A, B) : k \mapsto f_k$ be a sequence that is Cauchy with respect to $\|\cdot\|_\infty$. Fix $a \in A$, $a \ne 0$, and let $\epsilon \in ]0, \infty[$ be given. Then because $k \mapsto f_k$ is Cauchy, there exists a $k \in \mathbb{N}$ such that for all $l, m \ge k$ we have $\|f_l - f_m\|_\infty < \epsilon / \|a\|_A$. Hence $\|f_l(a) - f_m(a)\|_B \le \frac{\epsilon}{\|a\|_A} \|a\|_A = \epsilon$. This makes the sequence $k \mapsto f_k(a)$ Cauchy and therefore ($B$ is complete) there exists an $f(a) \in B$ such that $\lim_{k \to \infty} f_k(a) = f(a)$. Use this to construct a map $f : A \to B$, define $f(0) := 0$. Since all the $f_k$ are linear and addition and scalar multiplication on $B$ are continuous, $f$ is linear.

Let $\epsilon \in ]0, \infty[$, then there exists a $k \in \mathbb{N}$ such that $\|f_l - f_m\|_\infty < \epsilon/2$ for all $l, m \ge k$. Fix $a \in A$ and $l \ge k$, then by continuity of $\|\cdot\|$ we have that $\|f_l(a) - f(a)\|_B = \lim_{m \to \infty} \|f_l(a) - f_m(a)\|_B \le \lim_{m \to \infty} \|f_l - f_m\|_\infty \|a\|_A \le (\epsilon/2) \, \|a\|_A$. Hence $\|f_l - f\|_\infty \le \epsilon/2 < \epsilon$ for all $l \ge k$ and therefore $\lim_{k \to \infty} f_k = f$.

Again let $\epsilon \in ]0, \infty[$, then there is a $k \in \mathbb{N}$ such that $\|f_l - f\|_\infty < \epsilon/2$ for all $l \ge k$. As $f_k$ is continuous, $\lim_{a \to 0} f_k(a) = 0$, so there exists a $\delta \in ]0, 1[$ such that $\|f_k(a)\|_B < \epsilon/2$ for all $a \in B_A(0, \delta)$. Now for all $a \in B_A(0, \delta)$,
\[ \|f(a)\|_B = \|f(a) - f_k(a) + f_k(a)\|_B \le \|f(a) - f_k(a)\|_B + \|f_k(a)\|_B \le \|f - f_k\|_\infty \|a\|_A + \|f_k(a)\|_B < (\epsilon/2) \cdot 1 + \epsilon/2 = \epsilon. \]
So $\lim_{a \to 0} f(a) = 0$, and $f$ is continuous. This gives us that $f \in L(A, B)$.

Therefore, for any Cauchy sequence $k \mapsto f_k$ in $L(A, B)$ there exists an $f \in L(A, B)$ such that $\lim_{k \to \infty} f_k = f$. So $L(A, B)$ is complete, which makes $L(A, B)$ a Banach space.

• Let $C$ be a normed space over $K$, $f \in L(A, B)$, $g \in L(B, C)$. Then for any $a \in A$ we have that $\|(g \circ f)(a)\|_C = \|g(f(a))\|_C \le \|g\|_\infty \|f(a)\|_B \le \|g\|_\infty \|f\|_\infty \|a\|_A$ and hence $\|g \circ f\|_\infty \le \|g\|_\infty \|f\|_\infty$.

• Let $U \subseteq A$ be open and $f : U \to L(B, C)$ continuous, define $g : U \times B \to C$ by $g(a, b) := f(a)(b)$. Then for all $a, a' \in U$, $b, b' \in B$ we have
\begin{align*}
\|g(a, b) - g(a', b')\|_C &= \|f(a)(b) - f(a')(b')\|_C \\
&= \|f(a)(b) - f(a)(b') + f(a)(b') - f(a')(b')\|_C \\
&\le \|f(a)(b) - f(a)(b')\|_C + \|f(a)(b') - f(a')(b')\|_C \\
&\le \|f(a)\|_\infty \|b - b'\|_B + \|f(a) - f(a')\|_\infty \|b'\|_B
\end{align*}
which can be made arbitrarily small by choosing $b$ near $b'$ and $a$ near $a'$ ($\lim_{a' \to a} \|f(a) - f(a')\|_\infty = 0$ as $f$ is continuous). Hence $g$ is continuous.

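• Let $f \in B_{L(A,A)}(0, 1)$, and define for $k \in \mathbb{N}$, $f^k := \underbrace{f \circ \ldots \circ f}_{k}$, and $f^0 := \mathrm{id}_A$. Let for $k \in \mathbb{N}$, $g_k := \sum_{l=0}^{k} f^l \in L(A, A)$. Then for $k, l \in \mathbb{N}$ we have
\[ \|g_{k+l} - g_k\|_\infty = \Big\| \sum_{m=1}^{l} f^{k+m} \Big\|_\infty \le \sum_{m=1}^{l} \|f^k \circ f^m\|_\infty \le \sum_{m=1}^{l} \|f^k\|_\infty \|f^m\|_\infty \le \|f\|_\infty^k \sum_{m=1}^{l} \|f\|_\infty^m \le \frac{\|f\|_\infty^k}{1 - \|f\|_\infty}, \]
since $0 \le \|f\|_\infty < 1$. Hence $\mathbb{N} \to L(A, A) : k \mapsto g_k$ is Cauchy so by completeness of $L(A, A)$ there exists a $g \in L(A, A)$ such that $g = \lim_{k \to \infty} g_k = \sum_{l=0}^{\infty} f^l$. On the other hand, for any $k \in \mathbb{N}$ we have $(\mathrm{id}_A - f) \circ g_k = g_k - f \circ g_k = \mathrm{id}_A - f^{k+1}$ so letting $k \to \infty$ we find $(\mathrm{id}_A - f) \circ g = \mathrm{id}_A - 0$ since $\|f^{k+1}\|_\infty \le \|f\|_\infty^{k+1} \to 0$. Similarly $g \circ (\mathrm{id}_A - f) = \mathrm{id}_A$, so $\mathrm{id}_A - f$ is invertible with inverse $g$. Furthermore, $\|g_k\|_\infty \le \sum_{l=0}^{k} \|f\|_\infty^l \le \frac{1}{1 - \|f\|_\infty}$ for all $k \in \mathbb{N}$, so $\|g\|_\infty \le \frac{1}{1 - \|f\|_\infty}$. Hence $(\mathrm{id}_A - f) \in L(A, A)^*$ for all $f \in B_{L(A,A)}(0, 1)$ and $\|(\mathrm{id}_A - f)^{-1}\|_\infty \le \frac{1}{1 - \|f\|_\infty}$.

• Let $I : L(A, B)^* \to L(B, A)^* : f \mapsto f^{-1}$. Let $f \in L(A, B)^*$. Note that by Corollary (4.4.6) this is equivalent to $f^{-1} \in L(B, A)^*$. Fix $\delta \in ]0, 1[$ and let $g \in B_{L(A,B)}(f, \delta / \|f^{-1}\|_\infty)$, then
\begin{align*}
\|\mathrm{id}_A - f^{-1} \circ g\|_\infty &= \|f^{-1} \circ f - f^{-1} \circ g\|_\infty \\
&= \|f^{-1} \circ (f - g)\|_\infty \\
&\le \|f^{-1}\|_\infty \|f - g\|_\infty \\
&< \|f^{-1}\|_\infty \frac{\delta}{\|f^{-1}\|_\infty} = \delta.
\end{align*}
So the map $(\mathrm{id}_A - f^{-1} \circ g) \in B_{L(A,A)}(0, \delta)$ and we find that therefore $f^{-1} \circ g = \mathrm{id}_A - (\mathrm{id}_A - f^{-1} \circ g) \in L(A, A)^*$. Hence there exists an $h \in L(A, A)^*$ such that $h \circ (f^{-1} \circ g) = (f^{-1} \circ g) \circ h = \mathrm{id}_A$. Furthermore, $\|h\|_\infty \le \frac{1}{1 - \|\mathrm{id}_A - f^{-1} \circ g\|_\infty} < \frac{1}{1 - \delta}$. Now $(h \circ f^{-1}) \circ g = \mathrm{id}_A$ and $\mathrm{id}_B = f \circ f^{-1} = (f \circ \mathrm{id}_A) \circ f^{-1} = (f \circ (f^{-1} \circ g \circ h)) \circ f^{-1} = g \circ (h \circ f^{-1})$, so $g \in L(A, B)^*$ with inverse $h \circ f^{-1}$. Now $\|I(g)\|_\infty = \|h \circ f^{-1}\|_\infty \le \|h\|_\infty \|f^{-1}\|_\infty < \|f^{-1}\|_\infty / (1 - \delta)$, so $I(g) \in B_{L(B,A)}(0, \|f^{-1}\|_\infty / (1 - \delta))$. This means that for any $\delta \in ]0, 1[$ and $f \in L(A, B)^*$
\[ f \in B_{L(A,B)}(f, \delta / \|f^{-1}\|_\infty) \subseteq L(A, B)^*, \]
as well as
\[ I(B_{L(A,B)}(f, \delta / \|f^{-1}\|_\infty)) \subseteq B_{L(B,A)}(0, \|f^{-1}\|_\infty / (1 - \delta)) \subseteq L(B, A)^*. \]
In particular we obtain the fact that $L(A, B)^*$ is open.

Let $f \in L(A, B)^*$, $\epsilon \in ]0, \infty[$. Choose $\delta := \min\{ \epsilon / (2 \|f^{-1}\|_\infty^2), 1 / (2 \|f^{-1}\|_\infty) \}$, then for $g \in B_{L(A,B)}(f, \delta)$ we have by the previous item $\|g^{-1}\|_\infty < \|f^{-1}\|_\infty / (1 - (1/2)) = 2 \|f^{-1}\|_\infty$. Now as $f^{-1} \circ (g - f) \circ g^{-1} = (f^{-1} \circ g - \mathrm{id}_A) \circ g^{-1} = f^{-1} - g^{-1}$ we have
\begin{align*}
\|I(f) - I(g)\|_\infty &\le \|f^{-1}\|_\infty \|g^{-1}\|_\infty \|g - f\|_\infty \\
&< 2 \|f^{-1}\|_\infty^2 \|g - f\|_\infty \\
&< 2 \|f^{-1}\|_\infty^2 \, \delta \le \epsilon.
\end{align*}
Therefore $I$ is continuous.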
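The series construction of $(\mathrm{id}_A - f)^{-1}$ in the fourth item can be tried out numerically. The following is only a sketch under assumptions that are not part of the text: $A = \mathbb{R}^2$ with the maximum norm, so that $\|\cdot\|_\infty$ on $L(A, A)$ becomes the maximal absolute row sum of a matrix, a hand-picked matrix F, and hypothetical helper names (mat_mul, mat_add, op_norm, neumann_inverse).

```python
def mat_mul(x, y):
    # Product of two square matrices given as lists of rows.
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    # Entrywise sum of two matrices of the same shape.
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def op_norm(x):
    # Operator norm induced by the maximum norm on R^n (assumed setting).
    return max(sum(abs(v) for v in row) for row in x)

def neumann_inverse(f, terms=60):
    # Partial sum g_k = f^0 + f^1 + ... + f^k with k = terms.
    n = len(f)
    power = identity(n)
    total = identity(n)
    for _ in range(terms):
        power = mat_mul(power, f)
        total = mat_add(total, power)
    return total

F = [[0.2, 0.3], [0.1, 0.4]]             # op_norm(F) = 0.5 < 1
G = neumann_inverse(F)
I_MINUS_F = [[0.8, -0.3], [-0.1, 0.6]]
RESIDUAL = mat_add(mat_mul(I_MINUS_F, G), [[-1.0, 0.0], [0.0, -1.0]])
```

Here RESIDUAL equals $-F^{k+1}$ up to rounding, so its norm is tiny, and op_norm(G) stays below the bound $1/(1 - \|F\|_\infty) = 2$ from the theorem up to rounding.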

Note that this implies that $L(A, A)$ together with $\|\cdot\|_\infty$ is a normed ring with multiplication defined by $(f, g) \mapsto f \circ g$.

Corollary 5.5.7: Compatibility for Banach spaces
Let $A$, $B$ be Banach spaces over $K$, $U \subseteq A$ open, and $f : U \to B$. Let $a \in U$. If there exists a $g \in L(A, B)$ such that
\[ \lim_{a_1 \to 0} \frac{\|f(a + a_1) - f(a) - g(a_1)\|_B}{\|a_1\|_A} = 0, \]
then $f$ is differentiable at $a$ and $D_a f = g$. Furthermore, if $f$ is differentiable on $U$ and the map
\[ U \to L(A, B) : a \mapsto D_a f \]
is continuous, then $f \in C^1(U, B)$. Conversely, if $f \in C^2(U, B)$, then $a \mapsto D_a f$ is continuous.

Proof. The first part of the proof follows immediately from Corollary (5.1.6). Suppose $f$ is differentiable on $U$, and $U \to L(A, B) : a \mapsto D_a f$ is continuous. Then by Theorem (5.5.6) $U \times A \to B : (a, a_1) \mapsto D_a f(a_1)$ is continuous, so $f \in C^1(U, B)$.

Suppose $f \in C^2(U, B)$. Let $a \in U$ and $\epsilon \in ]0, \infty[$. Then the 2-linear family $U \times A \times A \to B : (a', a_1, a_2) \mapsto D^2_{a'} f(a_1, a_2)$ is continuous. So by Equation (5.8) there exist a $\delta \in ]0, \infty[$ and $\alpha \in ]0, \infty[$ such that for all $a' \in B_A(a, \delta) \subseteq U$, $\beta \in [0, 1]$, and $a_1, a_2 \in A$ we have (note that $\|(\beta a + (1 - \beta) a') - a\|_A = (1 - \beta) \|a' - a\|_A < \delta$)
\[ \|D^2_{\beta a + (1-\beta) a'} f(a_1, a_2)\|_B \le \alpha \, \|a_1\|_A \|a_2\|_A. \]
Pick $\delta' := \min\{ \frac{\epsilon}{\alpha}, \delta \} \in ]0, \infty[$, then by Theorem (5.3.8) and Theorem (5.3.5) we have for any $a' \in B_A(a, \delta')$ and any $a_1 \in A$
\begin{align*}
\|D_a f(a_1) - D_{a'} f(a_1)\|_B &= \|D_{1 a + (1-1) a'} f(a_1) - D_{0 a + (1-0) a'} f(a_1)\|_B \\
&= \left\| \int_0^1 \big( \beta \mapsto D^2_{\beta a + (1-\beta) a'} f(a_1, a' - a) \big) \right\|_B \\
&\le \int_0^1 \big( \beta \mapsto \|D^2_{\beta a + (1-\beta) a'} f(a_1, a' - a)\|_B \big) \\
&\le \int_0^1 \big( \beta \mapsto \alpha \, \|a_1\|_A \|a' - a\|_A \big) \\
&= \alpha \, \|a_1\|_A \|a' - a\|_A \\
&< \alpha \, \delta' \, \|a_1\|_A \le \epsilon \, \|a_1\|_A.
\end{align*}
Therefore, for any $a_1 \in A$, $a_1 \ne 0$,
\[ \frac{\|(D_a f - D_{a'} f)(a_1)\|_B}{\|a_1\|_A} \le \epsilon \]
and hence $\|D_a f - D_{a'} f\|_\infty \le \epsilon$ for all $a' \in B_A(a, \delta')$. So $\lim_{a' \to a} D_{a'} f = D_a f$. Since this is true for all $a \in U$, $U \to L(A, B) : a \mapsto D_a f$ is continuous.

We are now going to prove the inverse function theorem for Banach spaces. This theorem states that if the derivative of a function at a certain point is invertible, then the function itself must also be invertible in an open neighbourhood of this point.

Theorem 5.5.8: Inverse function theorem
Let $A$, $B$ be Banach spaces over $K$, $U_0 \subseteq A$ open. Let $f \in C^k(U_0, B)$ such that $U_0 \to L(A, B) : a \mapsto D_a f$ is continuous (automatically true for $k \ge 2$ by Corollary (5.5.7)).

If for a certain $a_0 \in U_0$ the derivative $D_{a_0} f : A \to B$ is bijective, then there exist open neighbourhoods $U \subseteq U_0$ of $a_0$ in $A$ and $V$ of $f(a_0)$ in $B$ such that
\[ f|_U : U \to V \]
is a $C^k$ diffeomorphism.
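The proof of this theorem obtains the local inverse as the fixed point of the contraction $g_b(a) = a - f(a) + b$. A minimal numerical sketch of that idea follows, under assumptions that are not in the text: $A = B = \mathbb{R}$, the hand-picked sample map $f(a) = a + a^2/4$ (so that $f(0) = 0$ and $D_0 f = \mathrm{id}$), and the hypothetical helper name solve_by_contraction.

```python
def f(a):
    # Sample map (assumption for this sketch) with f(0) = 0 and f'(0) = 1.
    return a + a * a / 4.0

def solve_by_contraction(b, steps=80):
    # Iterate g_b(a) = a - f(a) + b; its fixed points are exactly
    # the solutions of f(a) = b.
    a = 0.0
    for _ in range(steps):
        a = a - f(a) + b
    return a

# g(a) = a - f(a) = -a^2 / 4 has |g'(a)| = |a| / 2 <= 1/2 for |a| <= 1,
# so delta = 1/2 plays the role of the radius chosen in the proof.
DELTA = 0.5
B_TARGET = 0.2          # any target with |b| < DELTA / 2 is admissible
A_SOLUTION = solve_by_contraction(B_TARGET)
```

For this sample map the solution can be checked by hand: $f(a) = 0.2$ gives $a = 2(\sqrt{1.2} - 1)$, which lies inside the ball of radius $\delta$.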

Proof. We follow [Ham1982] and [DK2004I]. Let $f : U_0 \to B$ satisfy the hypothesis.

By Corollary (4.4.6) $D_{a_0} f : A \to B$ (being continuous, linear and bijective) is an isomorphism of topological vector spaces and therefore $(D_{a_0} f)^{-1}$ is continuous. We see therefore, by considering
\[ (U_0 - a_0) \to [D_{a_0} f]^{-1}(f(U_0) - f(a_0)) : a \mapsto [D_{a_0} f]^{-1}(f(a + a_0) - f(a_0)), \]
which is $C^k$ with derivative $[D_{a_0} f]^{-1} \circ D_{a_0} f = \mathrm{id}_A$ at 0 (Theorem (5.1.8)), that we may assume $B = A$, $U_0 \ni a_0 = 0$, $f(a_0) = 0$ and $D_{a_0} f = \mathrm{id}_A$.

The idea is that $f$ now 'resembles' the identity mapping at 0, which we know to be a T-isomorphism, and we therefore consider their difference, which in turn should 'resemble' the zero mapping. Let
\[ g : U_0 \to A : a \mapsto a - f(a). \]
Note that $g \in C^1(U_0, A)$ with $g(0) = 0 - f(0) = 0$, $D_0 g = \mathrm{id}_A - \mathrm{id}_A = 0$. By our assumption on $f$, $\lim_{a \to 0} D_a g = D_0 g = 0$, so there exists a $\delta \in ]0, \infty[$ such that for all $a \in B_A(0, 2\delta) \subseteq U_0$ we have $\|D_a g\|_\infty \le \frac{1}{2}$. Hence by Lemma (5.5.5), for any $a \in \overline{B}_A(0, \delta)$ we have $\|g(a)\|_A = \|g(a) - g(0)\|_A \le \frac{1}{2} \|a\|_A \le \frac{\delta}{2}$, so $g(\overline{B}_A(0, \delta)) \subseteq \overline{B}_A(0, \delta/2)$. Now let $b \in B_A(0, \delta/2)$ and define
\[ g_b : U_0 \to A : a \mapsto g(a) + b = a - f(a) + b, \]
then $g_b(a) = a$ if and only if $f(a) = b$. Let $a \in \overline{B}_A(0, \delta)$, then $\|g_b(a)\|_A = \|g(a) + b\|_A \le \|g(a)\|_A + \|b\|_A \le \delta/2 + \delta/2 = \delta$. Hence $g_b(\overline{B}_A(0, \delta)) \subseteq \overline{B}_A(0, \delta)$. As $D_a g_b = D_a g$, $\|D_a g_b\|_\infty \le \frac{1}{2}$ for all $a \in B_A(0, 2\delta)$, so by Lemma (5.5.5), $g_b$ is $\frac{1}{2}$-Lipschitz. Now $\overline{B}_A(0, \delta) \subseteq A$ is closed and $A$ is complete, so by Lemma (2.5.18) $\overline{B}_A(0, \delta)$ is complete and by Theorem (2.5.21), there exists a unique $a \in \overline{B}_A(0, \delta)$ such that $g_b(a) = a$, that is, such that $f(a) = b$.

So for all $b \in B_A(0, \delta/2)$ there exists a unique $a \in \overline{B}_A(0, \delta)$ such that $f(a) = b$. Let $V := B_A(0, \delta/2)$ and define $h : V \to \overline{B}_A(0, \delta)$ by $h(b) := a$ whenever $f(a) = b$ (by the preceding we know that this makes $h$ well-defined). Now for any $a_1, a_2 \in \overline{B}_A(0, \delta)$ we have that
\[ \|a_1 - a_2\|_A = \|f(a_1) + g(a_1) - f(a_2) - g(a_2)\|_A \le \|f(a_1) - f(a_2)\|_A + \|g(a_1) - g(a_2)\|_A \le \|f(a_1) - f(a_2)\|_A + \tfrac{1}{2} \|a_1 - a_2\|_A, \]
so $\|a_1 - a_2\|_A \le 2 \, \|f(a_1) - f(a_2)\|_A$ and hence
\[ \|h(b_1) - h(b_2)\|_A \le 2 \, \|b_1 - b_2\|_A \]
for all $b_1, b_2 \in V$. This makes $h$ continuous. By letting $U := h(V) = f^{-1}(V) \subseteq A$, which is an open neighbourhood of 0 as $f$ is continuous and $f(0) = 0$, we see that $h : V \to U$ satisfies $h = (f|_U)^{-1}$.

Now we need to show that $h \in C^k(V, A)$. Since $D_0 f \in L(A, A)^*$ (which is open in $L(A, A)$ by Theorem (5.5.6)) there exists an $\epsilon \in ]0, \infty[$ such that $B_{L(A,A)}(D_0 f, 2\epsilon) \subseteq L(A, A)^*$. Furthermore, as $a \mapsto D_a f$ is continuous, we can choose the $\delta$ we established earlier smaller such that also $D_a f \in B_{L(A,A)}(D_0 f, \epsilon) \subseteq L(A, A)^*$ for all $a \in B_A(0, 2\delta)$. By these choices, $D_a f : A \to A$ is bijective for all $a \in B_A(0, 2\delta)$. Furthermore, since $f \in C^k(U_0, B)$, the map $B_A(0, 2\delta) \times A \to A : (a, a_1) \mapsto D_a f(a_1)$ is $C^{k-1}$. The map $B_A(0, 2\delta) \to L(A, A)^* : a \mapsto D_a f$ is continuous by assumption and since inversion $L(A, A)^* \to L(A, A)^* : h \mapsto h^{-1}$ is continuous by Theorem (5.5.6), we have that $B_A(0, 2\delta) \to L(A, A)^* : a \mapsto [D_a f]^{-1}$ is continuous. By Theorem (5.5.6) the map $B_A(0, 2\delta) \times A \to A : (a, a_1) \mapsto [D_a f]^{-1}(a_1)$ is therefore continuous. Now with Theorem (5.2.6) we find that $B_A(0, 2\delta) \times A \to A : (a, a_1) \mapsto [D_a f]^{-1}(a_1)$ is $C^{k-1}$, because $(a, a_1) \mapsto D_a f(a_1)$ is $C^{k-1}$.

Consider the map $i : B_A(0, 2\delta) \times B_A(0, 2\delta) \to L(A, A)$ given by
\[ i(a_1, a_2) := \Big( A \to A : a_3 \mapsto \int_0^1 \big( \alpha \mapsto D_{\alpha a_2 + (1-\alpha) a_1} f(a_3) \big) \Big). \]
Then with Theorem (5.3.5) and the fact that for all $\alpha \in [0, 1]$ we have $\|\alpha a_2 + (1 - \alpha) a_1\|_A \le \alpha \|a_2\|_A + (1 - \alpha) \|a_1\|_A < (\alpha + 1 - \alpha) \, 2\delta = 2\delta$, we find
\begin{align*}
\|(D_0 f - i(a_1, a_2))(a_3)\|_A &= \left\| D_0 f(a_3) - \int_0^1 \big( \alpha \mapsto D_{\alpha a_2 + (1-\alpha) a_1} f(a_3) \big) \right\|_A \\
&= \left\| \int_0^1 \big( \alpha \mapsto D_0 f(a_3) - D_{\alpha a_2 + (1-\alpha) a_1} f(a_3) \big) \right\|_A \\
&\le \int_0^1 \big( \alpha \mapsto \|D_0 f(a_3) - D_{\alpha a_2 + (1-\alpha) a_1} f(a_3)\|_A \big) \\
&\le \int_0^1 \big( \alpha \mapsto \|D_0 f - D_{\alpha a_2 + (1-\alpha) a_1} f\|_\infty \|a_3\|_A \big) \\
&\le \int_0^1 \big( \alpha \mapsto \epsilon \, \|a_3\|_A \big) \\
&= \epsilon \, \|a_3\|_A.
\end{align*}

Hence $\|D_0 f - i(a_1, a_2)\|_\infty \le \epsilon < 2\epsilon$, so $i(a_1, a_2) \in B_{L(A,A)}(D_0 f, 2\epsilon) \subseteq L(A, A)^*$ for all $a_1, a_2 \in B_A(0, 2\delta)$. As $B_A(0, 2\delta) \to L(A, A)^* : a \mapsto D_a f$ is continuous, we find with Lemma (5.3.7) that $i$ is continuous, which in turn by Theorem (5.5.6) gives us that $(a_1, a_2, a_3) \mapsto i(a_1, a_2)(a_3)$ is continuous. Let $a_1, a_2 \in U$ and $b_1 := f(a_1) \in V$, $b_2 := f(a_2) \in V$, then (use Theorem (5.3.13), $f \in C^1(U, A)$)
\begin{align*}
b_2 - b_1 &= f(a_2) - f(a_1) \\
&= i(a_1, a_2)(a_2 - a_1) \\
&= i(h(b_1), h(b_2))(h(b_2) - h(b_1)).
\end{align*}
Now as $a_1, a_2 \in U \subseteq B_A(0, 2\delta)$, we have that $i(a_1, a_2) \in L(A, A)^*$ is invertible. Let us therefore define the map $j : B_A(0, 2\delta) \times B_A(0, 2\delta) \to L(A, A)$ by
\[ j(a_1, a_2) := [i(a_1, a_2)]^{-1}, \]
then $j$ is continuous since inversion is continuous by Theorem (5.5.6). Applying $j$ on both sides we find for all $b_1, b_2 \in V$ that
\[ h(b_2) - h(b_1) = j(h(b_1), h(b_2))(b_2 - b_1). \]
As $h$ and $j$ are continuous, $(b_1, b_2, b_3) \mapsto j(h(b_1), h(b_2))(b_3)$ is continuous by Theorem (5.5.6); furthermore, this map is linear in $b_3$. Hence by Theorem (5.3.13), $h \in C^1(V, A)$ and
\[ D_b h(b_1) = j(h(b), h(b))(b_1) = [i(h(b), h(b))]^{-1}(b_1) = [D_{h(b)} f]^{-1}(b_1). \]

We already established that $U \times A \to A : (a, a_1) \mapsto [D_a f]^{-1}(a_1)$ is $C^{k-1}$, so using induction, Equation (5.4), and the fact that $h \in C^1(V, A)$ with derivative $D_b h(b_1) = [D_{h(b)} f]^{-1}(b_1)$, we find that $V \times A \to A : (b, b_1) \mapsto [D_{h(b)} f]^{-1}(b_1)$ is $C^{k-1}$ and therefore that $h \in C^k(V, A)$. Therefore $f|_U : U \to V$ is a $C^k$ diffeomorphism with inverse $h$.

Theorem 5.5.9: Implicit function theorem
Let $A$, $B$, $C$ be Banach spaces over $K$, $U_0 \subseteq A$, $V_0 \subseteq B$ open. Let $f \in C^k(U_0 \times V_0, C)$ such that $U_0 \times V_0 \to L(A \times B, C) : (a, b) \mapsto D_{(a,b)} f$ is continuous (automatically true for $k \ge 2$ by Corollary (5.5.7)). Define for $a \in U_0$ and $b \in V_0$ the maps
\[ f_a : V_0 \to C : b_1 \mapsto f(a, b_1), \qquad f_b : U_0 \to C : a_1 \mapsto f(a_1, b). \]
Suppose that for a certain $(a_0, b_0) \in U_0 \times V_0$ and $c_0 \in C$ we have $f(a_0, b_0) = c_0$ and that $D_{a_0} f_{b_0} : A \to C$ is bijective. Then there exist open neighbourhoods $U \subseteq U_0$ of $a_0$ in $A$ and $V \subseteq V_0$ of $b_0$ in $B$ and a map $h : V \to U$ such that

• $h \in C^k(V, A)$,

• for all $b \in V$, $h(b) \in U$ is the unique element in $U$ for which $f(h(b), b) = c_0$,

• for all $b \in V$,
\[ D_b h(b_1) = -[D_{h(b)} f_b]^{-1}(D_b f_{h(b)}(b_1)). \qquad (5.9) \]

Proof. We follow [DK2004I]. By Corollary (4.4.6) $D_{a_0} f_{b_0} : A \to C$ is an isomorphism of topological vector spaces. Hence we can consider the map
\[ (U_0 - a_0) \times (V_0 - b_0) \to [D_{a_0} f_{b_0}]^{-1}(f(U_0, V_0)) : (a, b) \mapsto [D_{a_0} f_{b_0}]^{-1}(f(a_0 + a, b_0 + b) - c_0) \]
which is $C^k$, instead of $f$. Therefore we may suppose that $C = A$, $U_0 \ni a_0 = 0$, $V_0 \ni b_0 = 0$, $c_0 = 0$, and $D_{a_0} f_{b_0} = \mathrm{id}_A$.

Now define the map $g : U_0 \times V_0 \to A \times B$ by $g(a, b) := (f(a, b), b)$. Then by Lemma (5.1.11) and Lemma (5.1.13) we have that $g \in C^k(U_0 \times V_0, A \times B)$ with $D_{(a,b)} g(a_1, b_1) = (D_{(a,b)} f(a_1, b_1), b_1) = (D_a f_b(a_1) + D_b f_a(b_1), b_1)$. In particular
\[ D_{(a_0, b_0)} g(a_1, b_1) = (a_1 + D_{b_0} f_{a_0}(b_1), b_1), \]
so $D_{(a_0, b_0)} g : A \times B \to A \times B$ is bijective. Also $(a, b) \mapsto D_{(a,b)} g$ is continuous as $(a, b) \mapsto D_{(a,b)} f$ is continuous, therefore by Theorem (5.5.8) there exist open neighbourhoods $U, U' \subseteq U_0$ and $V, V' \subseteq V_0$ of 0 in $A$ and $B$ respectively, and a $C^k$ diffeomorphism $i : U' \times V \to U \times V'$ which is an inverse of $g|_{U \times V'}$. Since $g(a, b) = (f(a, b), b)$ we can write $i(c, b) = (h(c, b), b)$ for $h : U' \times V \to U$, $C^k$, and take $V' = V$.

Now for $a \in U$, $b \in V$, $c \in C$ we have $f(a, b) = c$ if and only if $g(a, b) = (c, b)$ if and only if $(a, b) = i(c, b)$ if and only if $h(c, b) = a$. Hence $f(a, b) = 0$ if and only if $h(0, b) = a$, for all $(a, b) \in U \times V$. Therefore $V \to U : b \mapsto h(0, b)$ is the $C^k$ map we seek, the derivative of which follows from $g \circ i = \mathrm{id}_{U' \times V}$: $(c, b) = g(i(c, b)) = g(h(c, b), b) = (f(h(c, b), b), b)$, so using Theorem (5.1.8) and Lemma (5.1.13) we find that
\[ c_1 = D_{(h(c,b), b)} f(D_{(c,b)} h(c_1, b_1), b_1) = D_{h(c,b)} f_b(D_{(c,b)} h(c_1, b_1)) + D_b f_{h(c,b)}(b_1). \]
So in particular for $c = c_1 = 0$ we have $D_{h(0,b)} f_b(D_{(0,b)} h(0, b_1)) = -D_b f_{h(0,b)}(b_1)$ which yields Equation (5.9).

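Theorem 5.5.10: Existence and uniqueness of solutions of ordinary differential equations ($e^{\cdot f}$)
Let $A$ be a Banach space over $K$, $U \subseteq A$ open. Let $f \in C^k(U, A)$ such that $U \to L(A, A) : a \mapsto D_a f$ is continuous (automatically true for $k \ge 2$ by Corollary (5.5.7)). Then there exists an open set $V \subseteq \mathbb{R} \times A$ for which
\[ \{0\} \times U \subseteq V \subseteq \mathbb{R} \times U, \]
and there exists a map, called the flow of $f$,
\[ e^{\cdot f} : V \to U : (\alpha, a) \mapsto e^{\alpha f}(a) \]
which satisfies $e^{\cdot f} \in C^k(V, A)$. For convenience we define for all $a \in U$,
\[ S(a) := \{ \alpha \in \mathbb{R} \mid (\alpha, a) \in V \} \subseteq \mathbb{R}, \]
and for all $\alpha \in \mathbb{R}$,
\[ U(\alpha) := \{ a \in U \mid (\alpha, a) \in V \} \subseteq U. \]
The flow $e^{\cdot f}$ has the following properties.

• For all $\alpha \in \mathbb{R}$, the map
\[ U(\alpha) \to U : a \mapsto e^{\alpha f}(a) \]
is $C^k$ and satisfies for all $\alpha, \beta \in \mathbb{R}$
\[ e^{0 f} = \mathrm{id}_U, \qquad e^{(\alpha + \beta) f} = e^{\alpha f} \circ e^{\beta f} \]
whenever the right-hand side is well-defined.

• For all $a \in U$, $0 \in S(a) \subseteq \mathbb{R}$ is an open interval and the map $g_a : S(a) \to U : \alpha \mapsto e^{\alpha f}(a)$ satisfies
\[ g_a \in C^k(S(a), A), \qquad g_a(0) = a, \qquad \forall \alpha \in S(a) : g_a'(\alpha) = f(g_a(\alpha)). \qquad (5.10) \]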

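This $g_a$ is furthermore the unique and maximal solution to Equation (5.10) in the sense that if there is another map $h : S \to U$, with $S \subseteq \mathbb{R}$ an open interval, satisfying $h'(\alpha) = f(h(\alpha))$ for all $\alpha \in S$ and $h(\alpha_0) = a$, then $S \subseteq (\alpha_0 + S(a))$ and $h(\alpha) = g_a(\alpha - \alpha_0)$ for all $\alpha \in S$.

Proof. We follow [DK2000] and make extensive use of Corollary (5.1.7). First we show existence of solutions to the equations $g(0) = a$, $g'(\alpha) = f(g(\alpha))$. Fix $a_0 \in U$. Since $f$ and $a \mapsto D_a f$ are continuous there exists an $\epsilon \in ]0, \infty[$ such that $B_A(a_0, 2\epsilon) \subseteq U$, and for all $a \in B_A(a_0, 2\epsilon)$ we have $\|f(a) - f(a_0)\|_A < 1$ and $\|D_a f - D_{a_0} f\|_\infty < 1$, hence
\[ \|f(a)\|_A \le \|f(a_0)\|_A + 1 =: \alpha_0, \qquad \|D_a f\|_\infty \le \|D_{a_0} f\|_\infty + 1 =: \alpha_1 \]
for all $a \in B_A(a_0, 2\epsilon)$. Pick $\delta \in ]0, \infty[$ such that $\alpha_0 \delta < \frac{\epsilon}{2}$, and $\alpha_1 \delta < 1$. Let
\[ B := \{ g : ]-\delta, \delta[ \to B_A(a_0, \epsilon) \subseteq U \mid \|g\|_B < \infty \} \]
together with the norm
\[ \|g\|_B := \sup\{ \|g(\alpha)\|_A \in \mathbb{R} \mid \alpha \in ]-\delta, \delta[ \}. \]
Then $B$ is a Banach space over $K$ ([Mun2000], Theorem 43.6). Let $F : B_A(a_0, \frac{\epsilon}{2}) \times B \to B$ be defined by
\[ F(a, g)(\alpha) := a + \int_0^\alpha \big( \beta \mapsto f(g(\beta)) \big). \]
Note that for $g \in B$, $F(a, g) = g$ if and only if $g(0) = a$ and for all $\alpha$, $g'(\alpha) = f(g(\alpha))$ by Theorem (5.3.8).

Then for any $a, a' \in B_A(a_0, \frac{\epsilon}{2})$, $\alpha \in ]-\delta, \delta[$ we have (as $g(\alpha) \in B_A(a_0, \epsilon) \subseteq B_A(a_0, 2\epsilon)$)
\begin{align*}
\|F(a, g)(\alpha) - a'\|_A &= \left\| a + \int_0^\alpha \big( \beta \mapsto f(g(\beta)) \big) - a' \right\|_A \\
&\le \|a - a'\|_A + \int_0^\alpha \big( \beta \mapsto \|f(g(\beta))\|_A \big) \\
&\le \|a - a'\|_A + |\alpha| \, \alpha_0 \\
&< \|a - a'\|_A + \alpha_0 \delta,
\end{align*}
so in particular by our choice of $\delta$, $F(a, g)(\alpha) \in B_A(a_0, \epsilon) \subseteq U$ for all $a \in B_A(a_0, \frac{\epsilon}{2})$, $\alpha \in ]-\delta, \delta[$, which makes $F(a, g) \in B$. By Lemma (5.5.5), $\|f(a') - f(a)\|_A \le \alpha_1 \|a' - a\|_A$, so for $g, h \in B$
\begin{align*}
\|F(a, g)(\alpha) - F(a', h)(\alpha)\|_A &= \left\| a + \int_0^\alpha \big( \beta \mapsto f(g(\beta)) \big) - a' - \int_0^\alpha \big( \beta \mapsto f(h(\beta)) \big) \right\|_A \\
&\le \|a - a'\|_A + \int_0^\alpha \big( \beta \mapsto \|f(g(\beta)) - f(h(\beta))\|_A \big) \\
&\le \|a - a'\|_A + \int_0^\alpha \big( \beta \mapsto \alpha_1 \|g(\beta) - h(\beta)\|_A \big) \\
&\le \|a - a'\|_A + |\alpha| \, \alpha_1 \|g - h\|_B \\
&< \|a - a'\|_A + \alpha_1 \delta \, \|g - h\|_B.
\end{align*}
Hence
\begin{align*}
\|F(a, g) - a'\|_B &< \|a - a'\|_A + \alpha_0 \delta, \\
\|F(a, g) - F(a', h)\|_B &< \|a - a'\|_A + \alpha_1 \delta \, \|g - h\|_B \qquad (5.11)
\end{align*}
for all $a, a' \in B_A(a_0, \frac{\epsilon}{2})$ and $g, h \in B$.

Let $a \in B_A(a_0, \frac{\epsilon}{2})$, then for all $g \in \overline{B}_B(a_0, \epsilon)$ we have $\|F(a, g) - a_0\|_B \le \frac{\epsilon}{2} + \alpha_0 \delta < \epsilon$, so $F(a, g) \in \overline{B}_B(a_0, \epsilon)$. Also, for any $g, h \in \overline{B}_B(a_0, \epsilon)$ we have $\|F(a, g) - F(a, h)\|_B \le \alpha_1 \delta \, \|g - h\|_B$ where $\alpha_1 \delta < 1$. Now as $\overline{B}_B(a_0, \epsilon) \subseteq B$ is closed, it is complete by Lemma (2.5.18) and therefore by Theorem (2.5.21), there exists a unique $g_a \in \overline{B}_B(a_0, \epsilon)$ such that $F(a, g_a) = g_a$. By Theorem (5.3.8) we have that $g_a \in C^1(]-\delta, \delta[, A)$, $g_a(0) = a + 0$, and $g_a'(\alpha) = 0 + f(g_a(\alpha))$. This function $g_a$ is by Theorem (5.3.8) the solution $g_a : ]-\delta, \delta[ \to A$ in $B$ to $g_a(0) = a$ and $g_a'(\alpha) = f(g_a(\alpha))$ for all $\alpha$, and furthermore the unique solution in $\overline{B}_B(a_0, \epsilon)$ by Theorem (2.5.21).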


From the definition of F , f ∈ Ck(U, A), and Lemma (5.3.7) we find that F ∈ k   C (BA(a0, 2 ) × B,B) and furthermore for all a ∈ BA(a0, 2 ) that by Equation (5.11),   Dg h 7→ F (a, h) ≤ 0 + α1 δ < 1. ∞

Hence by Theorem (5.5.6) the map (idB −Dg(h 7→ F (a, h))) : B → B is invert-  ible. This means that the map B ×BA(a0, 2 ) → B :(g, a) 7→ g−F (a, g) satisfies  the conditions of Theorem (5.5.9) at (ga, a) ∈ B×U for all a ∈ BA(a0, 2 ). Hence k  ga depends in a C fashion on a, that is, for all a ∈ BA(a0, 2 ) there exists a 0  0 0 k δ ∈]0, 2 [ such that the map BA(a, δ ) → B : a 7→ ga0 is C .

0  So for all a0 ∈ U there exist δ, δ ,  ∈]0, ∞[ such that for all a ∈ BA(a0, 2 ) 1 there exists a unique ga ∈ BB(a0, ), ga ∈ C (] − δ, δ[,A) with ga(0) = a and 0 0 ga(α) = f(g(α)) for all α ∈] − δ, δ[. Furthermore, the map BA(a0, δ ) → B : k a 7→ ga is C .

Now we will show that such solutions ga are unique in a global sense. Suppose we have two maps g : S → UC1 and h : T → UC1, where S, T ⊆ R are open intervals. Suppose g and h satisfy g0(α) = f(g(α)) and h0(β) = f(h(β)) for all α ∈ S, β ∈ T and g(α0) = h(β0) for some α0 ∈ S, β0 ∈ T . Then first of all, by Theorem (5.1.8), the maps i :(S − α0) → U : α 7→ g(α − α0), j :(T − β0) → 0 0 U : β 7→ h(β − β0) satisfy i(0) = j(0) and i (α) = f(i(α)), j (β) = f(j(β)). Let 0 a0 := i(0) = j(0) ∈ U, then by the preceding there exist δ, δ ,  ∈]0, ∞[ and a unique g :] − δ, δ[→ U in B (a , ) with g (0) = a , g0 (α) = f(g (α)). We a0 B 0 a0 0 a0 a0 can furthermore choose δ smaller such that ]−δ, δ[⊆ [−δ, δ] ⊆ (S −α0)∩(T −β0) c since 0 ∈ (S − α0) ∩ (T − β0) is an open interval. As i, j , limα→0 i(α) = Cpt limα→0 j(α) = a0, and [−δ, δ] , so ki([−δ, δ])kA and kj([−δ, δ])kA are bounded in R. Hence we can take δ smaller, such that i|]−δ,δ[, j|]−δ,δ[ ∈ BB(a0, ). Then, as ga0 is the unique solution to F (a0, ga0 ) = ga0 and i|]−δ,δ[ = F (a0, i|]−δ,δ[), we

find that ga0 = i|]−δ,δ[ and similarly ga0 = j|]−δ,δ[. Hence there exists a δ ∈]0, ∞[ such that i(α) = j(α) for all α ∈] − δ, δ[. 0 Now consider the collection S := {α ∈ (S − α0) ∩ (T − β0) | i(α) = j(α)} ⊆ 0 (S −α0)∩(T −α0). Then by the previous we know that 0 ∈ S implies that there exists a δ ∈]0, ∞[ such that ] − δ, δ[⊆ S0. By applying the same argument again, 0 0 we find that for any α ∈ S there exists a δα ∈]0, ∞[ such that ]α−δα, α+δα[⊆ S . 0 0 Hence S ⊆ (S − α0) ∩ (T − β0) is open. On the other hand, S is the inverse image of {0} of the map ((a, a0) 7→ a − a0) ◦ (α 7→ (i(α), j(α))) which is and therefore (Lemma (2.1.14)) S0 is closed. So S0 is a subset of an open interval that is both open and closed, hence S0 is either empty or equal to the entire 0 0 open interval. Because 0 ∈ S we find that necessarily S = (S − α0) ∩ (T − β0). Therefore g(α0 + α) = h(β0 + α) for all α ∈ (S − α0) ∩ (T − β0).

So for any two paths g : S → U, h : T → U both C1 and satisfying g0(α) = f(g(α)), h0(β) = f(h(β)) for all α ∈ S, β ∈ T we have that if g(α0) = h(β0) for some α0 ∈ S, β0 ∈ T , then g(α0 + α) = h(β0 + α) for all α ∈ (S − α0) ∩ (T − β0).

This uniqueness property can be used to increase the domain ] − δ, δ[ of our solutions ga. Let for a0 ∈ U, S(a0) denote the union of all open intervals S ⊆ R


1 containing 0 for which there exists a g : S → U that is C with g(0) = a0 and g0(α) = f(g(α)) for all α ∈ S. In particular there exists a δ ∈]0, ∞[ such that ] − δ, δ[⊆ S(a0) by construction of ga0 , so S(a0) ⊆ R is an open interval containing 0. 1 With a slight abuse of notation, we will write ga0 for the maximal C curve defined on S(a0) as the union of all curves whose domain is contained in the union S(a0). This makes ga0 well-defined because of the uniqueness property stated above.

It is clear that with this definition, ga0 : S(a0) → U is the unique and max- imal solution from the second point of the theorem.

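Let $\alpha \in S(a)$ and $\beta \in S(g_a(\alpha))$. Then as $g_a(\alpha) = g_{g_a(\alpha)}(0)$ and $\beta \mapsto g_a(\alpha + \beta)$, $g_{g_a(\alpha)}$ are both maximal solutions to Equation (5.10), we have $S(a) = \alpha + S(g_a(\alpha))$ and $g_{g_a(\alpha)}(\beta) = g_a(\alpha + \beta)$. In particular $\alpha + \beta \in S(a)$ for all $\alpha \in S(a)$ and $\beta \in S(g_a(\alpha))$.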
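As a concrete instance of this composition property, one can take an example that is not in the text: $A = \mathbb{R}$, $U = \mathbb{R}$ and $f(a) = a^2$, whose maximal solutions are $g_a(\alpha) = a / (1 - \alpha a)$, so that $S(a) = ]-\infty, 1/a[$ for $a > 0$. The sketch below checks $g_{g_a(\alpha)}(\beta) = g_a(\alpha + \beta)$ in exact rational arithmetic; the function name flow is hypothetical.

```python
from fractions import Fraction

def flow(alpha, a):
    # Explicit flow of the sample vector field f(a) = a^2 (an assumption
    # for this sketch): the curve a / (1 - alpha * a).
    # It exists only while 1 - alpha * a > 0, which describes the set V.
    if 1 - alpha * a <= 0:
        raise ValueError("(alpha, a) lies outside the domain V of the flow")
    return a / (1 - alpha * a)

A_START = Fraction(1, 2)
ALPHA, BETA = Fraction(1, 3), Fraction(1, 4)

# Flowing for time ALPHA and then for time BETA ...
LEFT = flow(BETA, flow(ALPHA, A_START))
# ... agrees with flowing once for time ALPHA + BETA.
RIGHT = flow(ALPHA + BETA, A_START)
```

For the starting point $1/2$ the solution leaves every bounded set as $\alpha \to 2$, which is why flow refuses times $\alpha \ge 2$ there.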

We are now going to construct V and e·f . Choose [ V := {S(a0) × {a0} | a0 ∈ U} ⊆ R × U.

Let a0 ∈ U, α ∈ S(a0) and choose a1 := ga0 (α) ∈ U. 0  By our first assertion there exist , δ, δ ∈]0, ∞[ such that for all a ∈ BA(a0, 2 )  0 and all a ∈ BA(a1, 2 ) we have ] − δ, δ[⊆ S(a) and that BA(a0, δ ) → B : a 7→ ga k 0  is C . Choose δ ≤ δ ≤ 2 for convenience. 00 0 We have that lima→a0 ga = ga0 in B, so there exists a δ ∈]0, δ ] such that 00 0 00 for all a ∈ BA(a0, δ ) we have kga −ga0 kB < δ . In particular for a ∈ BA(a0, δ ) 0 0  we have kga(α) − a1kA ≤ kga − ga0 kB < δ , so ga(α) ∈ BA(a1, δ ) ⊆ BA(a1, 2 ) 00 and hence ] − δ, δ[⊆ S(ga(α)). So for a ∈ BA(a0, δ ), α ∈ S(a) and ] − δ, δ[⊆ S(ga(α)), hence ]α − δ, α + δ[⊆ S(a). But this means that (α, a0) ∈]α − δ, α + 00 δ[×BA(a0, δ ) ⊆ V : V is open. Since α = 0 ∈ S(a0) for all a0 ∈ U, we furthermore find that {0} × U ⊆ V . This permits us to define

\[ e^{\cdot f} : V \to U : (\alpha, a) \mapsto e^{\alpha f}(a) := g_a(\alpha) \]
in accordance with the notation for $g_a$ from the theorem, which is well-defined because of the uniqueness property of the $g_a$.

Note that $e^{0 f}(a) = g_a(0) = a = \mathrm{id}_U(a)$, so $e^{0 f} = \mathrm{id}_U$. Furthermore, whenever $(\beta, a), (\alpha, e^{\beta f}(a)) \in V$, we have $e^{\alpha f}(e^{\beta f}(a)) = e^{\alpha f}(g_a(\beta)) = g_{g_a(\beta)}(\alpha) = g_a(\alpha + \beta) = e^{(\alpha + \beta) f}(a)$.

$e^{\cdot f}$ is $C^k$ by Theorem (5.1.8), the composition rule $e^{\alpha f} \circ e^{\beta f} = e^{(\alpha + \beta) f}$, and the fact that $a \mapsto g_a$ is $C^k$.

Example 5.5.11: Notation of Theorem (5.5.10).
The notation $e^{\cdot f}$ in Theorem (5.5.10) has purposefully been introduced because of the following. Let $A$ be a Banach space over $K$ and consider $\mathrm{id}_A$ which is $C^\infty$. Hence $e^{\cdot \, \mathrm{id}_A} : V \to A$ exists and we can study its form by looking at the map $F$ from the proof. Fix any $a \in A$, then
\[ F(a, 0)(\alpha) = a + \int_0^\alpha \big( \beta \mapsto \mathrm{id}_A(0) \big) = a. \]
Iterating $F$ to find the solution (as is done in Theorem (2.5.21) which is used to find the solutions in the proof) we find
\[ F(a, F(a, 0))(\alpha) = a + \int_0^\alpha \big( \beta \mapsto \mathrm{id}_A(a) \big) = a + \alpha \, a. \]
So after $k$ iterations we find
\[ \underbrace{F(a, \ldots, F(a, 0) \ldots)}_{k}(\alpha) = a + \alpha \, a + \frac{\alpha^2}{2} a + \ldots + \frac{\alpha^k}{k!} a \]
which is exactly the $k$-th order Taylor expansion of the map $\alpha \mapsto e^\alpha a$. Indeed this map, defined for all $\alpha \in \mathbb{R}$, is the sought-after solution since $e^0 a = a$ and $\frac{d}{d\alpha} e^\alpha a = e^\alpha a = \mathrm{id}_A(e^\alpha a)$. Hence for all $a \in A$ we have
\[ e^{\alpha \, \mathrm{id}_A}(a) = e^\alpha a \]
which is indeed a $C^\infty$ map, which is furthermore defined on the entire $V = \mathbb{R} \times A$. This motivates the notation of Theorem (5.5.10).

The following lemma may be used to calculate the derivatives of the flow $e^{\cdot f}$ of a given function $f$.

Lemma 5.5.12: Derivatives of $e^{\cdot f}$
Let $A$ be a Banach space over $K$, $U \subseteq A$ open, $k \in \mathbb{N}$, and $f \in C^k(U, A)$ satisfying the conditions of Theorem (5.5.10). Let $e^{\cdot f} : V \to U$, with $V \subseteq \mathbb{R} \times U$ open, denote the flow of $f$. Then for all $0 \le l < k$, $a \in U$, and $u_1, \ldots, u_l \in A$, the curve $g : S(a) \to A$ defined by $g(\alpha) := D^l_a e^{\alpha f}(u_1, \ldots, u_l)$ satisfies for all $\alpha \in S(a)$
\[ g'(\alpha) = D^l_a(f \circ e^{\alpha f})(u_1, \ldots, u_l), \qquad g(0) = \begin{cases} a & l = 0 \\ u_1 & l = 1 \\ 0 & l > 1. \end{cases} \]
In particular, the flow of
\[ U \to A : a \mapsto D^l_a(f \circ e^{\alpha f})(u_1, \ldots, u_l) \]
gives us information about the $l$-th derivative of the flow of $f$ in the directions $u_1, \ldots, u_l$.

Proof. We will use Theorem (5.5.10) extensively in this proof, particularly the fact that $e^{\cdot f}$ is $C^k$ and that therefore all limits of quotients involving $e^{\cdot f}$ exist.

Suppose $l = 0$, then $g = g_a$, so $g(0) = g_a(0) = a$ and $g'(\alpha) = g_a'(\alpha) = f(g_a(\alpha)) = f(e^{\alpha f}(a)) = D^0_a(f \circ e^{\alpha f})$ by Equation (5.10).

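Suppose $l = 1$, then $g(0) = D_a e^{0 f}(u_1) = D_a \mathrm{id}_U(u_1) = \mathrm{id}_U(u_1) = u_1$. Furthermore with Corollary (5.1.7) and Theorem (5.1.16)
\begin{align*}
g'(\alpha) &= \lim_{\beta \to 0} \frac{1}{\beta} \Big( D_a e^{(\alpha + \beta) f}(u_1) - D_a e^{\alpha f}(u_1) \Big) \\
&= D^2_{(\alpha, a)} e^{\cdot f}((0, u_1), (1, 0)) \\
&= D^2_{(\alpha, a)} e^{\cdot f}((1, 0), (0, u_1)) \\
&= D_{(\alpha, a)} \Big( (\alpha', a') \mapsto D_{(\alpha', a')} \big( (\alpha'', a'') \mapsto e^{\alpha'' f}(a'') \big)(1, 0) \Big)(0, u_1) \\
&= D_{(\alpha, a)} \Big( (\alpha', a') \mapsto \lim_{\beta \to 0} \frac{1}{\beta} \big( g_{a'}(\alpha' + \beta) - g_{a'}(\alpha') \big) \Big)(0, u_1) \\
&= D_{(\alpha, a)} \Big( (\alpha', a') \mapsto g_{a'}'(\alpha') \Big)(0, u_1) \\
&\overset{(5.10)}{=} D_{(\alpha, a)} \Big( (\alpha', a') \mapsto f(g_{a'}(\alpha')) \Big)(0, u_1) \\
&= D_{(\alpha, a)} \Big( (\alpha', a') \mapsto f(e^{\alpha' f}(a')) \Big)(0, u_1) \\
&= D_a(f \circ e^{\alpha f})(u_1).
\end{align*}
Suppose $l > 1$, then $g(0) = D^l_a e^{0 f}(u_1, \ldots, u_l) = D^l_a \mathrm{id}_U(u_1, \ldots, u_l) = 0$. Furthermore by the above calculation, we may use induction to assume that for $l - 1$
\[ D^l_{(\alpha, a)} e^{\cdot f}((0, u_1), \ldots, (0, u_{l-1}), (1, 0)) = D^{l-1}_a(f \circ e^{\alpha f})(u_1, \ldots, u_{l-1}). \]
Hence
\begin{align*}
D^l_a(f \circ e^{\alpha f})(u_1, \ldots, u_l) &= D^{l+1}_{(\alpha, a)} e^{\cdot f}((0, u_1), \ldots, (0, u_{l-1}), (1, 0), (0, u_l)) \\
&= D^{l+1}_{(\alpha, a)} e^{\cdot f}((0, u_1), \ldots, (0, u_{l-1}), (0, u_l), (1, 0)) \\
&= \lim_{\beta \to 0} \frac{1}{\beta} \Big( D^l_{(\alpha + \beta \, 1, a)} e^{\cdot f}((0, u_1), \ldots, (0, u_l)) - D^l_{(\alpha, a)} e^{\cdot f}((0, u_1), \ldots, (0, u_l)) \Big) \\
&= \lim_{\beta \to 0} \frac{1}{\beta} \Big( D^l_a e^{(\alpha + \beta) f}(u_1, \ldots, u_l) - D^l_a e^{\alpha f}(u_1, \ldots, u_l) \Big) \\
&= g'(\alpha),
\end{align*}
which shows that $g$ has the desired property.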
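The iteration of $F$ from Example (5.5.11) can be reproduced in exact arithmetic. This is a sketch under assumptions that are not in the text: the case $f = \mathrm{id}_A$ with $A = \mathbb{R}$ and $a = 1$ (by linearity every coefficient simply multiplies $a$), curves represented by their polynomial coefficients in $\alpha$, and the hypothetical helper names picard_step and picard.

```python
from fractions import Fraction

def picard_step(poly):
    # poly[i] is the coefficient of alpha^i of the current curve g.
    # Returns the coefficients of F(1, g)(alpha) = 1 + integral_0^alpha g.
    integrated = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(poly)]
    integrated[0] += 1
    return integrated

def picard(n):
    # Apply F to the zero curve n times.
    poly = []                      # the empty list stands for the zero curve
    for _ in range(n):
        poly = picard_step(poly)
    return poly

COEFFICIENTS = picard(4)
```

With this bookkeeping, $n$ applications of $F$ to the zero curve give the Taylor polynomial of $e^\alpha$ of degree $n - 1$, so COEFFICIENTS holds $1, 1, 1/2, 1/6$.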

Chapter 6

Revisiting Christoffel’s article

In this chapter we will revisit and generalise [Chr1869]. Before we start however, we will first need to introduce a few concepts that will be helpful for discussing Christoffel’s article.

6.1 Preliminaries

Definition 6.1.1: $k$-Tensor
Let $A$ be a Hausdorff locally convex topological vector space over $K$, $U \subseteq A$ open, and $k \in \mathbb{N}$. Then a $k$-tensor is a family of $k$-linear maps (Definition (5.2.2))
\[ f : U \times \underbrace{A \times \ldots \times A}_{k} \to K : (a, u_1, \ldots, u_k) \mapsto f_a(u_1, \ldots, u_k) \]
that is continuous. We say that a $k$-tensor is symmetric if for any $\pi \in S^k$, $a \in U$, and $u_1, \ldots, u_k \in A$ we have $f_a(u_1, \ldots, u_k) = f_a(u_{\pi(1)}, \ldots, u_{\pi(k)})$.

We say for l ∈ N that a k-tensor f ∈ Cl(U) if f ∈ Cl(U × A × ... × A, K). Definition 6.1.2: Metric Let A /K , and U ⊆ A open. Then a 2-tensor f : U ×A×A → K induces a map fˆ : U ×A → A0 :(a, u) 7→ ˆ fa(u) defined by ˆ 0 fa : A → A :(u 7→ (v 7→ fa(u, v))). ˆ ˆ−1 0 If for all a ∈ U, fa is bijective and the map f : U × A → A :(a, g) 7→ ˆ−1 fa (g) , we call f non-degenerate. A 2-tensor f : U × A × A → K that is symmetric and non-degenerate is called a metric.

Note that any A /K only admits a metric if A ' A0 are -isomorphic, ˆ 0 since for any a ∈ U, fa : A → A is an -isomorphism if f is a metric. Lemma 6.1.3 Let A /K , U ⊆ A open, and f : U × A × A → K a metric. Then


• for any g, h ∈ A0 and a ∈ U we have

ˆ−1 ˆ−1 g(fa (h)) = h(fa (g)),

ˆ 0 c ˆ l ˆ−1 0 ˆ−1 • f : U × A → A , for all a ∈ U, fa /K and f : U × A → A , fa /K.

• If f ∈ Ck(U) for some k ∈ N, then fˆ ∈ Ck(U × A, A0) and fˆ−1 ∈ Ck(U × A0,A). We then have for all a ∈ U, u ∈ A, v, w ∈ A, and g ∈ A0

ˆ  Daf(u)(v) (w) = Daf(u)(v, w) ˆ−1  ˆ−1 fa Daf (u)(g), v = −Daf(u)(fa (g), v). (6.1)

Proof. • Let a ∈ U, g, h ∈ A0. Then because f is non-degenerate:

ˆ−1 ˆ ˆ−1 ˆ−1 ˆ−1 ˆ−1 g(fa (h)) = (fa(fa (g)))(fa (h)) = fa(fa (g), fa (h))

and the left hand side is symmetric if and only if the right hand side is.

ˆ 0 ˆ ˆ • The map f : U × A → A is : f(a, u)(v) = fa(u)(v) = fa(u, v) and ˆ f : U ×A×A → K . Because for each a ∈ U, fa : A×A → K is 2- , fa ˆ−1 ˆ−1 /K. f is by assumption and fa as the inverse of a linear map.

• For a ∈ U, u, a1, u1, v ∈ A and α ∈ K small enough but not equal to zero, we have 1   fˆ (u + α u )(v) − fˆ (u)(v) α a+α a1 1 a 1   = f (u + α u , v) − f (u, v) α a+α a1 1 a 1   = f (u, v) − f (u, v) + f (u , v), α a+α a1 a a+α a1 1

1 so as f ∈ C (U) we see that by taking the limit for (α, (a1, u1)) → (0, (a2, u2)), we obtain that the above goes to

Daf(a2)(u, v) + fa(u2, v)

0 and since the map (A → K : v 7→ Daf(a2)(u, v) + fa(u2, v)) ∈ A we therefore have by Lemma (5.1.4) that fˆ d (a, u) with

ˆ  D(a,u)f(a2, u2) = v 7→ Daf(a2)(u, v) + fa(u2, v)

ˆ which gives us the desired formula for Daf(a2) if we use the notation of Definition (5.2.2). Since this expression is continuous in all its variables, fˆ ∈ C1(U × A → A0). Now we can differentiate this expression using Theorem (5.1.8) and with induction conclude that f ∈ Ck(U) implies that fˆ ∈ Ck(U × A, A0). ˆ 0 ˆ k 0 Because fa : A → A is bijective for all a ∈ U, f ∈ C (U × A, A ), and the collection of inverses fˆ−1 by assumption, we can apply Theorem (5.2.6)


to conclude that fˆ−1 ∈ Ck(U × A0,A). Furthermore, Equation (5.5) gives us that for a ∈ U, u, v ∈ A, and g ∈ A0,

ˆ−1 ˆ−1 ˆ ˆ−1 fa(Daf (u)(g), v) = −fa(fa (Daf(u)(fa (g))), v) ˆ ˆ−1 ˆ ˆ−1 = fa(fa (Daf(u)(fa (g))))(v) ˆ ˆ−1 = −Daf(u)(fa (g))(v) ˆ−1 = −Daf(u)(fa (g), v).

6.2 Generalisation

We will now generalise the calculations in [Chr1869], a translation of which is included in Chapter 8. Between [Chr1869] and our calculations here, we have the following correspondences (here $e_i$ denotes the $i$-th basis vector of $\mathbb{R}^n$ for $1 \le i \le n$).

Here        [Chr1869]
$A$         $A = \mathbb{R}^n$, $x' \in A$
$B$         $B = \mathbb{R}^n$, $x \in B$
$f$         $x = x(x') = f(x')$
$Df$        $\frac{\partial x_i}{\partial x'_j} = \frac{\partial x_i(x')}{\partial x'_j} = D_{x'} f_i(e_j)$
$g$         $\omega'_{ij} = \omega'_{ij}(x') = g_{x'}(e_i, e_j)$
$h$         $\omega_{ij} = \omega_{ij}(x) = h_x(e_i, e_j)$
$B$         $\left[ {ij \atop k} \right](x) = B^h_x(e_i, e_j, e_k)$
$\Gamma$    $\sum_{k=1}^{n} \left\{ {ij \atop k} \right\}(x) \, e_k = \Gamma^h_x(e_i, e_j)$
$R$         $(ijkl)(x) = R^h_x(e_i, e_j, e_k, e_l)$

We will mimic the structure of [Chr1869] as closely as possible; to show our progress through the article the relevant section numbers have been included and equations will frequently be compared to those in [Chr1869].

1., 2. First of all, our goal will be, given two metrics g and h on open subsets U and V of spaces A and B respectively, to determine whether or not there exists a change of variables (diffeomorphism) f from U to V which transforms the met- ric g into h.

We will start by investigating the necessary conditions for the existence of such an f. T Let A, B Vs /K T2 LC , U ⊆ A, V ⊆ B open. Let g : U × A × A → K, h : V × B × B → K be metrics on A and B respectively. Suppose there exists an f : U → V , f ∈ C1(U, B) such that for all a ∈ U, u, v ∈ A we have ga(u, v) = hf(a)(Daf(u),Daf(v)), (6.2)

which corresponds to equation (1.) in [Chr1869].
Suppose $g \in C^1(U)$, $h \in C^1(V)$, and $f \in C^2(U, B)$. Let $a \in U$, $u, v, w \in A$, then from Equation (5.4) and Equation (6.2) we obtain
\[
\begin{aligned}
D_ag(u)(v, w) ={}& D_{f(a)}h(D_af(u))(D_af(v), D_af(w)) \\
&+ h_{f(a)}(D_a^2f(v, u), D_af(w)) + h_{f(a)}(D_af(v), D_a^2f(w, u)).
\end{aligned}
\tag{6.3}
\]
Following [Chr1869] we will now consider permutations of $u$, $v$ and $w$ in order to isolate a single $D_a^2f(u, v)$ term. In light of equation (4.), we define the 3-tensor $B^g \in C^0(U \times A \times A \times A, \mathbb{K})$ by
\[ B^g_a(u, v, w) := \frac12\Big( D_ag(u)(v, w) - D_ag(w)(u, v) + D_ag(v)(w, u) \Big) \tag{6.4} \]
for all $a \in U$, $u, v, w \in A$.¹ ²
Note that because $g$ is symmetric, we obtain (compare with (5.)):
\[ B^g_a(u, v, w) = B^g_a(v, u, w), \qquad D_ag(u)(v, w) = B^g_a(u, v, w) + B^g_a(u, w, v). \tag{6.5} \]
Then by Equation (6.3) and Equation (6.4) we find using symmetry of $h$ and Theorem (5.1.16)
\[
\begin{aligned}
B^g_a(u, v, w) ={}& B^h_{f(a)}(D_af(u), D_af(v), D_af(w)) \\
&+ \frac12\Big( h_{f(a)}(D_a^2f(v, u), D_af(w)) + h_{f(a)}(D_af(v), D_a^2f(w, u)) \\
&\qquad - h_{f(a)}(D_a^2f(u, w), D_af(v)) - h_{f(a)}(D_af(u), D_a^2f(v, w)) \\
&\qquad + h_{f(a)}(D_a^2f(w, v), D_af(u)) + h_{f(a)}(D_af(w), D_a^2f(u, v)) \Big) \\
={}& B^h_{f(a)}(D_af(u), D_af(v), D_af(w)) + h_{f(a)}(D_a^2f(u, v), D_af(w)).
\end{aligned}
\]
Hence (compare with (6.) from [Chr1869]) we find for all $a \in U$, $u, v, w \in A$

\[ h_{f(a)}(D_a^2f(u, v), D_af(w)) + B^h_{f(a)}(D_af(u), D_af(v), D_af(w)) = B^g_a(u, v, w). \tag{6.6} \]
Now we can use the non-degeneracy of $g$ and $h$ to solve this equation, provided³ that $D_af : A \to B$ is bijective for all $a \in U$ with a continuous inverse.⁴

¹ This is the first example of an object that we will associate with just the metric $g$ (the definition of $B^g$ depends solely on $g$ itself). We implicitly make the same definition of such an object for the metric $h$ (i.e. a 3-tensor $B^h_a$ defined by Equation (6.4) but with $g$ replaced by $h$).
² One may argue that $B^g$ is 'not a tensor', because $B^g$ does not satisfy the proper transformation relations. With regard to this it should be noted that a 'tensor' as per Definition (6.1.1) is just a convenient name for a family of multilinear maps with values in $\mathbb{K}$, and therefore this does not agree with the usual definition of a tensor as a multilinear object that behaves properly under coordinate transformations. We see however, that the metric (Equation (6.2)), the curvature tensor (Equation (6.12)), and covariant derivatives of a properly transforming covariant tensor (Theorem (6.2.2)) all transform covariantly, as should be expected.
³ In order to uniquely determine $D_a^2f$ with Equation (6.6), we need to consider $h_{f(a)}(D_a^2f(u, v), x)$ for all $x \in B$, which means that $D_af$ should be surjective. On the other hand, we then need to express $w$ in terms of $D_af(w)$: to be able to do this in a unique way, $D_af$ should also be injective and because we can only take the inverse of elements of $A'$ and $B'$ (which are continuous), the inverse of $D_af$ should be continuous.
⁴ Note that if $A$ and $B$ satisfy the conditions of Corollary (4.4.6), the inverse of $D_af$ is continuous whenever $D_af$ is bijective.


Suppose $D_af : A \to B$ is bijective for all $a \in U$ and $[D_af]^{-1} : B \to A$ is continuous. Using bijectivity of $D_af$ we find from Equation (6.6) that

\[ h_{f(a)}(D_a^2f(u, v), x) = B^g_a(u, v, [D_af]^{-1}(x)) - B^h_{f(a)}(D_af(u), D_af(v), x) \]
for all $a \in U$, $u, v \in A$, $x \in B$.
Hence, using non-degeneracy of $h$ and continuity of $[D_af]^{-1}$, $D_a^2f(u, v)$ is uniquely determined as
\[
\begin{aligned}
D_a^2f(u, v) &= \hat h_{f(a)}^{-1}\big(x \mapsto B^g_a(u, v, [D_af]^{-1}(x)) - B^h_{f(a)}(D_af(u), D_af(v), x)\big) \\
&= \hat h_{f(a)}^{-1}\big(x \mapsto B^g_a(u, v, [D_af]^{-1}(x))\big) - \hat h_{f(a)}^{-1}\big(x \mapsto B^h_{f(a)}(D_af(u), D_af(v), x)\big)
\end{aligned}
\]
for all $a \in U$, $u, v \in A$, and this equation is equivalent to Equation (6.6) by the non-degeneracy of $h$.
We therefore introduce (compare with (7.)) a map $\Gamma^g : U \times A \times A \to A$ by

\[ \Gamma^g_a(u, v) := \hat g_a^{-1}\big( w \mapsto B^g_a(u, v, w) \big). \tag{6.7} \]

By Definition (6.1.2) and Lemma (6.1.3), $\Gamma^g \in C^0(U \times A \times A, A)$ is a family of 2-linear maps. Note that if $g \in C^k(U)$, then $\Gamma^g \in C^{k-1}(U \times A \times A, A)$. Furthermore, Equation (6.5) implies that for all $a \in U$, $u, v \in A$ we have (compare with (8.))
\[ \Gamma^g_a(u, v) = \Gamma^g_a(v, u). \tag{6.8} \]
Note that by definition
\[ \hat h_{f(a)}^{-1}\big(x \mapsto B^h_{f(a)}(D_af(u), D_af(v), x)\big) = \Gamma^h_{f(a)}(D_af(u), D_af(v)). \]
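To make the definitions (6.4) and (6.7) concrete, the following minimal sketch evaluates them numerically in the finite-dimensional case $A = \mathbb{R}^2$. The polar metric $g_{(r,\theta)} = \mathrm{diag}(1, r^2)$, the finite-difference step `eps`, and all function names are assumptions chosen for illustration; they do not appear in [Chr1869] or elsewhere in this text. In finite dimensions $\hat g_a^{-1}$ is simply the inverse of the matrix of $g_a$.

```python
def metric(a):
    """Assumed example metric: g_a = diag(1, r^2) at a = (r, theta)."""
    r, _ = a
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(a, k, eps=1e-6):
    """Central difference approximating D_a g(e_k) as a 2x2 matrix."""
    ap, am = list(a), list(a)
    ap[k] += eps
    am[k] -= eps
    gp, gm = metric(ap), metric(am)
    return [[(gp[i][j] - gm[i][j]) / (2 * eps) for j in range(2)] for i in range(2)]

def B(a, i, j, k):
    """B^g_a(e_i, e_j, e_k) following Equation (6.4)."""
    dg = [d_metric(a, m) for m in range(2)]  # dg[m][p][q] = D_a g(e_m)(e_p, e_q)
    return 0.5 * (dg[i][j][k] - dg[k][i][j] + dg[j][k][i])

def gamma(a, i, j):
    """Components of Gamma^g_a(e_i, e_j) following Equation (6.7)."""
    g = metric(a)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    b = [B(a, i, j, k) for k in range(2)]  # the functional w -> B(e_i, e_j, w)
    return [sum(inv[l][k] * b[k] for k in range(2)) for l in range(2)]

if __name__ == "__main__":
    a = (2.0, 0.3)
    print(gamma(a, 1, 1))  # approximately [-r, 0]
    print(gamma(a, 0, 1))  # approximately [0, 1/r]
```

For this metric the sketch reproduces the familiar values $\Gamma(e_\theta, e_\theta) = -r\,e_r$ and $\Gamma(e_r, e_\theta) = \tfrac1r e_\theta$, and the symmetry (6.8) is visible in the output.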

For the other term in our expression for $D_a^2f(u, v)$ we first note that for any $w \in A$ we have by Equation (6.2) that $\hat g_a([D_af]^{-1}(x))(w) = g_a([D_af]^{-1}(x), w) = h_{f(a)}(x, D_af(w))$ and hence

\[ [D_af]^{-1}(x) = \hat g_a^{-1}\big(w \mapsto h_{f(a)}(x, D_af(w))\big) = \hat g_a^{-1}\big(\hat h_{f(a)}(x) \circ D_af\big). \]
This permits us to use Lemma (6.1.3) for symmetry of $g$ and after that symmetry of $h$ to obtain for $a \in U$, $u, v \in A$ that
\[
\begin{aligned}
\hat h_{f(a)}^{-1}\big(x \mapsto B^g_a(u, v, [D_af]^{-1}(x))\big)
&= \hat h_{f(a)}^{-1}\big(x \mapsto B^g_a(u, v, \hat g_a^{-1}(\hat h_{f(a)}(x) \circ D_af))\big) \\
&= \hat h_{f(a)}^{-1}\big(x \mapsto (w \mapsto B^g_a(u, v, w))(\hat g_a^{-1}(\hat h_{f(a)}(x) \circ D_af))\big) \\
&= \hat h_{f(a)}^{-1}\big(x \mapsto (\hat h_{f(a)}(x) \circ D_af)(\hat g_a^{-1}(w \mapsto B^g_a(u, v, w)))\big) \\
&= \hat h_{f(a)}^{-1}\big(x \mapsto \hat h_{f(a)}(x)(D_af(\hat g_a^{-1}(w \mapsto B^g_a(u, v, w))))\big) \\
&= D_af(\Gamma^g_a(u, v)).
\end{aligned}
\]
Hence (compare with (9.)) we obtain for all $a \in U$, $u, v \in A$ that

\[ D_a^2f(u, v) + \Gamma^h_{f(a)}(D_af(u), D_af(v)) = D_af(\Gamma^g_a(u, v)). \tag{6.9} \]


By applying $h_{f(a)}(\cdot, D_af(w))$ on both sides, we again obtain Equation (6.6). Hence Equation (6.6) and Equation (6.9) are equivalent.
So Equation (6.2) implies Equation (6.9), or more precisely: if there exists an $f : U \to V$ satisfying Equation (6.2), $f \in C^2(U, B)$, and $D_af$ is invertible with continuous inverse for all $a \in U$, then this function $f$ necessarily satisfies Equation (6.9).
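As a sanity check of Equation (6.9), the following worked example verifies it by hand for the change to polar coordinates in the plane. The setting ($A = B = \mathbb{R}^2$, the map $f$, the symbols $r$, $\theta$, $e_r$, $e_\theta$) is an assumption made here for illustration and is not taken from [Chr1869].

```latex
% Assumed setting: U = (0,\infty) \times (-\pi,\pi), f(r,\theta) = (r\cos\theta, r\sin\theta),
% h the standard inner product on R^2 (constant in b, so B^h = 0 and \Gamma^h = 0),
% g_{(r,\theta)} = diag(1, r^2), which satisfies (6.2) for this f because
% D_a f(e_r) = (\cos\theta, \sin\theta) and D_a f(e_\theta) = (-r\sin\theta, r\cos\theta).
\begin{align*}
  % Christoffel map of g, computed from (6.4) and (6.7):
  \Gamma^g_{(r,\theta)}(e_r, e_r) &= 0, &
  \Gamma^g_{(r,\theta)}(e_r, e_\theta) &= \tfrac{1}{r}\, e_\theta, &
  \Gamma^g_{(r,\theta)}(e_\theta, e_\theta) &= -r\, e_r. \\
  % Left-hand side of (6.9) (the \Gamma^h term vanishes):
  D_a^2 f(e_r, e_r) &= (0, 0), &
  D_a^2 f(e_r, e_\theta) &= (-\sin\theta, \cos\theta), &
  D_a^2 f(e_\theta, e_\theta) &= (-r\cos\theta, -r\sin\theta). \\
  % Right-hand side of (6.9):
  D_a f(0) &= (0, 0), &
  D_a f(\tfrac{1}{r} e_\theta) &= (-\sin\theta, \cos\theta), &
  D_a f(-r\, e_r) &= (-r\cos\theta, -r\sin\theta).
\end{align*}
```

The second and third rows agree entry by entry, so in this example $D_a^2f(u, v) + \Gamma^h_{f(a)}(D_af(u), D_af(v)) = D_af(\Gamma^g_a(u, v))$ holds on all pairs of basis vectors, and hence for all $u, v$ by bilinearity.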

3. Suppose conversely that there exists a $C^2$ diffeomorphism $f : U \to V$ satisfying Equation (6.9). Then we can define (which is exactly why we need $f$ to be a diffeomorphism, see Lemma (5.1.15)) the metric $i : V \times B \times B \to \mathbb{K}$ by (compare with Equation (6.2))

\[ i_b(x, y) := g_{f^{-1}(b)}(D_bf^{-1}(x), D_bf^{-1}(y)) \]
which satisfies $i \in C^1(V)$, as $g \in C^1(U)$, and $f$ is a $C^2$ diffeomorphism. Following the same reasoning as before, $i$ should satisfy Equation (6.9). As $h$ satisfies Equation (6.9) by assumption, we find for all $a \in U$, $u, v \in A$

\[ \Gamma^i_{f(a)}(D_af(u), D_af(v)) = \Gamma^h_{f(a)}(D_af(u), D_af(v)), \]
and hence ($f$ and $D_af$ are bijective)

\[ B^i_b(x, y, z) = B^h_b(x, y, z) \]
for all $b \in V$, $x, y, z \in B$. We therefore find from Equation (6.5) that

\[ D_bi(x)(y, z) = D_bh(x)(y, z) \]
for all $b \in V$, $x, y, z \in B$. But then by Theorem (5.1.8) we have that for all $x, y \in B$ the function $V \to \mathbb{K} : b \mapsto i_b(x, y) - h_b(x, y)$ must be constant. Hence if $i$ and $h$ agree in a single point (i.e. if $h$ satisfies Equation (6.2) at a single point), then they must agree at all points. This leads us to the following theorem.

Theorem 6.2.1
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $g : U \times A \times A \to \mathbb{K}$, $g \in C^1(U)$, $h : V \times B \times B \to \mathbb{K}$, $h \in C^1(V)$ be metrics on $A$ and $B$ respectively.
Let $f : U \to V$ be a $C^2$ diffeomorphism, then the following statements are equivalent,
• $f$ satisfies Equation (6.2) for all $u, v \in A$ and all $a \in U$,
• $f$ satisfies Equation (6.2) for all $u, v \in A$ and some $a \in U$, and $f$ satisfies Equation (6.9) for all $u, v \in A$ and all $a \in U$.

4. Suppose that $f \in C^3(U, B)$, $g \in C^2(U)$, and $h \in C^2(V)$ satisfy Equation (6.2) and Equation (6.9), then we can take another derivative of Equation (6.9) using Equation (5.4), to obtain

\[
\begin{aligned}
& D_a^3f(u, v, w) + D_{f(a)}\Gamma^h(D_af(w))(D_af(u), D_af(v)) \\
&\quad + \Gamma^h_{f(a)}(D_a^2f(u, w), D_af(v)) + \Gamma^h_{f(a)}(D_af(u), D_a^2f(v, w)) \\
&= D_a^2f(\Gamma^g_a(u, v), w) + D_af(D_a\Gamma^g(w)(u, v))
\end{aligned}
\]
for all $a \in U$, $u, v, w \in A$. Now swap $v$ and $w$ and subtract the resulting equation from the original one (the third order derivatives of $f$ will cancel because of Theorem (5.1.16)) to obtain

\[
\begin{aligned}
& 0 + D_{f(a)}\Gamma^h(D_af(w))(D_af(u), D_af(v)) - D_{f(a)}\Gamma^h(D_af(v))(D_af(u), D_af(w)) \\
&\quad + \Gamma^h_{f(a)}(D_a^2f(u, w), D_af(v)) - \Gamma^h_{f(a)}(D_a^2f(u, v), D_af(w)) + 0 \\
&= D_a^2f(\Gamma^g_a(u, v), w) - D_a^2f(\Gamma^g_a(u, w), v) \\
&\quad + D_af(D_a\Gamma^g(w)(u, v)) - D_af(D_a\Gamma^g(v)(u, w)).
\end{aligned}
\]

To get rid of the second order derivatives of $f$ we use Equation (6.9):

\[
\begin{aligned}
& D_{f(a)}\Gamma^h(D_af(w))(D_af(u), D_af(v)) - D_{f(a)}\Gamma^h(D_af(v))(D_af(u), D_af(w)) \\
&\quad + \Gamma^h_{f(a)}(D_af(\Gamma^g_a(u, w)), D_af(v)) - \Gamma^h_{f(a)}(\Gamma^h_{f(a)}(D_af(u), D_af(w)), D_af(v)) \\
&\quad - \Gamma^h_{f(a)}(D_af(\Gamma^g_a(u, v)), D_af(w)) + \Gamma^h_{f(a)}(\Gamma^h_{f(a)}(D_af(u), D_af(v)), D_af(w)) + 0 \\
&= D_af(\Gamma^g_a(\Gamma^g_a(u, v), w)) - \Gamma^h_{f(a)}(D_af(\Gamma^g_a(u, v)), D_af(w)) \\
&\quad - D_af(\Gamma^g_a(\Gamma^g_a(u, w), v)) + \Gamma^h_{f(a)}(D_af(\Gamma^g_a(u, w)), D_af(v)) \\
&\quad + D_af(D_a\Gamma^g(w)(u, v)) - D_af(D_a\Gamma^g(v)(u, w)).
\end{aligned}
\]

This can be rearranged to (compare with (12.))

\[
\begin{aligned}
& D_{f(a)}\Gamma^h(D_af(w))(D_af(u), D_af(v)) - D_{f(a)}\Gamma^h(D_af(v))(D_af(u), D_af(w)) \\
&\quad + \Gamma^h_{f(a)}(\Gamma^h_{f(a)}(D_af(u), D_af(v)), D_af(w)) \\
&\quad - \Gamma^h_{f(a)}(\Gamma^h_{f(a)}(D_af(u), D_af(w)), D_af(v)) + 0 \\
&= 0 + D_af\big(D_a\Gamma^g(w)(u, v) - D_a\Gamma^g(v)(u, w) + \Gamma^g_a(\Gamma^g_a(u, v), w) - \Gamma^g_a(\Gamma^g_a(u, w), v)\big).
\end{aligned}
\tag{6.10}
\]

5.

We can turn the right-hand side into a 4-tensor by applying $g_a(\cdot, x)$ inside of $D_af(\cdot)$. In anticipation of this we will rewrite the derivatives $D_a\Gamma^g$; note that with Equation (5.4)

\[
\begin{aligned}
D_aB^g(w)(u, v, x) &= D_a\big(a' \mapsto g_{a'}(\Gamma^g_{a'}(u, v), x)\big)(w) \\
&= D_ag(w)(\Gamma^g_a(u, v), x) + g_a(D_a\Gamma^g(w)(u, v), x),
\end{aligned}
\]
so using Equation (6.5) we find

\[ g_a(D_a\Gamma^g(w)(u, v), x) = D_aB^g(w)(u, v, x) - B^g_a(\Gamma^g_a(u, v), w, x) - B^g_a(x, w, \Gamma^g_a(u, v)). \]

Now (Equation (6.4))
\[ D_aB^g(w)(u, v, x) = \frac12\Big( D_a^2g(u, w)(v, x) - D_a^2g(x, w)(u, v) + D_a^2g(v, w)(x, u) \Big). \]
Therefore (in the last step use Equation (6.7))

\[
\begin{aligned}
& g_a\big(D_a\Gamma^g(w)(u, v) - D_a\Gamma^g(v)(u, w) + \Gamma^g_a(\Gamma^g_a(u, v), w) - \Gamma^g_a(\Gamma^g_a(u, w), v), x\big) \\
&= \frac12\Big( D_a^2g(u, w)(v, x) - D_a^2g(x, w)(u, v) + D_a^2g(v, w)(x, u) \\
&\qquad - D_a^2g(u, v)(w, x) + D_a^2g(x, v)(u, w) - D_a^2g(w, v)(x, u) \Big) \\
&\quad - B^g_a(\Gamma^g_a(u, v), w, x) - B^g_a(x, w, \Gamma^g_a(u, v)) \\
&\quad + B^g_a(\Gamma^g_a(u, w), v, x) + B^g_a(x, v, \Gamma^g_a(u, w)) \\
&\quad + B^g_a(\Gamma^g_a(u, v), w, x) - B^g_a(\Gamma^g_a(u, w), v, x) \\
&= \frac12\Big( D_a^2g(u, w)(v, x) - D_a^2g(x, w)(u, v) - D_a^2g(u, v)(w, x) + D_a^2g(x, v)(u, w) \Big) \\
&\quad + g_a(\Gamma^g_a(x, v), \Gamma^g_a(u, w)) - g_a(\Gamma^g_a(x, w), \Gamma^g_a(u, v)).
\end{aligned}
\]

Because of this we define the 4-tensor (compare with (14.)) $R^g : U \times A \times A \times A \times A \to \mathbb{K}$ by

\[
\begin{aligned}
R^g_a(u, v, w, x) :={}& \frac12\Big( D_a^2g(u, x)(v, w) + D_a^2g(v, w)(u, x) - D_a^2g(u, w)(v, x) - D_a^2g(v, x)(u, w) \Big) \\
&+ g_a(\Gamma^g_a(u, x), \Gamma^g_a(v, w)) - g_a(\Gamma^g_a(u, w), \Gamma^g_a(v, x))
\end{aligned}
\tag{6.11}
\]
for all $a \in U$, $u, v, w, x \in A$. By the above (use symmetry of $g$, $\Gamma$ and Theorem (5.1.16)):

\[
\begin{aligned}
R^g_a(u, v, w, x)
&= \frac12\Big( D_a^2g(u, x)(w, v) - D_a^2g(v, x)(u, w) - D_a^2g(u, w)(x, v) + D_a^2g(v, w)(u, x) \Big) \\
&\quad + g_a(\Gamma^g_a(v, w), \Gamma^g_a(u, x)) - g_a(\Gamma^g_a(v, x), \Gamma^g_a(u, w)) \\
&= g_a\big(D_a\Gamma^g(x)(u, w) - D_a\Gamma^g(w)(u, x) + \Gamma^g_a(\Gamma^g_a(u, w), x) - \Gamma^g_a(\Gamma^g_a(u, x), w), v\big).
\end{aligned}
\]


Now apply $h_{f(a)}(\cdot, D_af(x))$ on both sides of Equation (6.10) to obtain

\[
\begin{aligned}
& R^h_{f(a)}(D_af(u), D_af(x), D_af(v), D_af(w)) \\
&= h_{f(a)}\Big( D_{f(a)}\Gamma^h(D_af(w))(D_af(u), D_af(v)) - D_{f(a)}\Gamma^h(D_af(v))(D_af(u), D_af(w)) \\
&\qquad\quad + \Gamma^h_{f(a)}(\Gamma^h_{f(a)}(D_af(u), D_af(v)), D_af(w)) \\
&\qquad\quad - \Gamma^h_{f(a)}(\Gamma^h_{f(a)}(D_af(u), D_af(w)), D_af(v)),\ D_af(x) \Big) \\
&= h_{f(a)}\Big( D_af\big(D_a\Gamma^g(w)(u, v) - D_a\Gamma^g(v)(u, w) + \Gamma^g_a(\Gamma^g_a(u, v), w) - \Gamma^g_a(\Gamma^g_a(u, w), v)\big),\ D_af(x) \Big) \\
&\overset{(6.2)}{=} g_a\big( D_a\Gamma^g(w)(u, v) - D_a\Gamma^g(v)(u, w) + \Gamma^g_a(\Gamma^g_a(u, v), w) - \Gamma^g_a(\Gamma^g_a(u, w), v),\ x \big) \\
&= R^g_a(u, x, v, w).
\end{aligned}
\]

Hence (compare with (15.)) we find for all $a \in U$, $u, v, w, x \in A$ that

\[ R^g_a(u, v, w, x) = R^h_{f(a)}(D_af(u), D_af(v), D_af(w), D_af(x)). \tag{6.12} \]
Furthermore, from Equation (6.11) and symmetry of $g$, $\Gamma$, and higher order derivatives we obtain (compare with (16.)) for all $a \in U$, $u, v, w, x \in A$ that

\[
\begin{aligned}
R^g_a(u, v, w, x) &= -R^g_a(v, u, w, x), \\
R^g_a(u, v, w, x) &= -R^g_a(u, v, x, w), \\
R^g_a(u, v, w, x) &= R^g_a(w, x, u, v), \\
R^g_a(u, v, w, x) &= -R^g_a(u, w, x, v) - R^g_a(u, x, v, w).
\end{aligned}
\tag{6.13}
\]
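The definition (6.11) and the symmetries (6.13) can be tried out on a small finite-dimensional example. The sketch below is an illustration under assumptions that are not part of the source: it takes the round metric $g = \mathrm{diag}(1, \sin^2\theta)$ of the unit sphere in coordinates $(\theta, \varphi)$, supplies its first and second derivatives analytically, and evaluates $R^g$ on basis vectors. All function names are hypothetical.

```python
import math

def sphere_data(theta):
    """Assumed example: g = diag(1, sin^2 theta) in coordinates (theta, phi).
    dg[k][i][j] = D_a g(e_k)(e_i, e_j); d2g[k][l][i][j] = D_a^2 g(e_k, e_l)(e_i, e_j)."""
    s, c = math.sin(theta), math.cos(theta)
    g = [[1.0, 0.0], [0.0, s * s]]
    dg = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    d2g = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    dg[0][1][1] = 2 * s * c                  # only g_{phi phi} depends on theta
    d2g[0][0][1][1] = 2 * (c * c - s * s)
    return g, dg, d2g

def christoffel(g, dg):
    """Gam[i][j] = components of Gamma(e_i, e_j), via (6.4) and (6.7)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    Gam = [[None] * 2 for _ in range(2)]
    for i in range(2):
        for j in range(2):
            b = [0.5 * (dg[i][j][k] - dg[k][i][j] + dg[j][k][i]) for k in range(2)]
            Gam[i][j] = [sum(inv[l][k] * b[k] for k in range(2)) for l in range(2)]
    return Gam

def riemann(theta, u, v, w, x):
    """R^g_a(e_u, e_v, e_w, e_x) following Equation (6.11)."""
    g, dg, d2g = sphere_data(theta)
    Gam = christoffel(g, dg)

    def gdot(p, q):
        return sum(g[i][j] * p[i] * q[j] for i in range(2) for j in range(2))

    second = 0.5 * (d2g[u][x][v][w] + d2g[v][w][u][x]
                    - d2g[u][w][v][x] - d2g[v][x][u][w])
    return second + gdot(Gam[u][x], Gam[v][w]) - gdot(Gam[u][w], Gam[v][x])

if __name__ == "__main__":
    t = 0.7
    print(riemann(t, 0, 1, 0, 1), math.sin(t) ** 2)
```

With the sign convention of (6.11) the only independent component comes out as $R^g(e_\theta, e_\varphi, e_\theta, e_\varphi) = \sin^2\theta$, and swapping the first or the last pair of arguments flips the sign, in line with (6.13).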

6. Considering Equation (6.2) and Equation (6.12) we see that it might be rewarding to consider a general $k$-tensor that satisfies a similar transformation rule. So let $E^A : U \times A \times \ldots \times A \to \mathbb{K}$, $E^B : V \times B \times \ldots \times B \to \mathbb{K}$ be $k$-tensors for some $k \in \mathbb{N}$ on $A$ and $B$ respectively. Furthermore suppose that $E^A \in C^1(U)$, $E^B \in C^1(V)$ and that they satisfy for all $a \in U$, $u_1, \ldots, u_k \in A$ the equation

\[ E^A_a(u_1, \ldots, u_k) = E^B_{f(a)}(D_af(u_1), \ldots, D_af(u_k)). \tag{6.14} \]
Then we can in the same way as we did for $g$ and $h$ take the derivative of this equation using Equation (5.4), and use Equation (6.9) to obtain

\[
\begin{aligned}
& D_aE^A(v)(u_1, \ldots, u_k) \\
&= D_{f(a)}E^B(D_af(v))(D_af(u_1), \ldots, D_af(u_k)) \\
&\quad + E^B_{f(a)}(D_a^2f(u_1, v), \ldots, D_af(u_k)) + \ldots + E^B_{f(a)}(D_af(u_1), \ldots, D_a^2f(u_k, v)) \\
&\overset{(6.9)}{=} D_{f(a)}E^B(D_af(v))(D_af(u_1), \ldots, D_af(u_k)) \\
&\quad - \Big( E^B_{f(a)}(\Gamma^h_{f(a)}(D_af(u_1), D_af(v)), \ldots, D_af(u_k)) + \ldots + E^B_{f(a)}(D_af(u_1), \ldots, \Gamma^h_{f(a)}(D_af(u_k), D_af(v))) \Big) \\
&\quad + \Big( E^B_{f(a)}(D_af(\Gamma^g_a(u_1, v)), \ldots, D_af(u_k)) + \ldots + E^B_{f(a)}(D_af(u_1), \ldots, D_af(\Gamma^g_a(u_k, v))) \Big) \\
&\overset{(6.14)}{=} D_{f(a)}E^B(D_af(v))(D_af(u_1), \ldots, D_af(u_k)) \\
&\quad - \Big( E^B_{f(a)}(\Gamma^h_{f(a)}(D_af(u_1), D_af(v)), \ldots, D_af(u_k)) + \ldots + E^B_{f(a)}(D_af(u_1), \ldots, \Gamma^h_{f(a)}(D_af(u_k), D_af(v))) \Big) \\
&\quad + \Big( E^A_a(\Gamma^g_a(u_1, v), \ldots, u_k) + \ldots + E^A_a(u_1, \ldots, \Gamma^g_a(u_k, v)) \Big).
\end{aligned}
\]

Define the $(k + 1)$-tensor $\nabla E^A : U \times A \times A \times \ldots \times A \to \mathbb{K}$ by

\[
\begin{aligned}
\nabla E^A_a(u_0, u_1, \ldots, u_k) :={}& D_aE^A(u_0)(u_1, \ldots, u_k) \\
&- \Big( E^A_a(\Gamma^g_a(u_0, u_1), \ldots, u_k) + \ldots + E^A_a(u_1, \ldots, \Gamma^g_a(u_0, u_k)) \Big)
\end{aligned}
\tag{6.15}
\]
for all $a \in U$, $u_0, u_1, \ldots, u_k \in A$. Then by the above for $v = u_0$ this new tensor satisfies

\[ \nabla E^A_a(u_0, u_1, \ldots, u_k) = \nabla E^B_{f(a)}(D_af(u_0), D_af(u_1), \ldots, D_af(u_k)) \]
and hence transforms in exactly the same way as $E^A$ and $E^B$: it satisfies Equation (6.14) for $k + 1$. Therefore we now arrive at the following statement.

Theorem 6.2.2
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $f : U \to V$ be a $C^2$ diffeomorphism, and $g : U \times A \times A \to \mathbb{K}$, $g \in C^1(U)$, $h : V \times B \times B \to \mathbb{K}$, $h \in C^1(V)$ be metrics on $A$ and $B$ respectively.
Suppose $f$, $g$ and $h$ satisfy Equation (6.2) for all $u, v \in A$ and $a \in U$. Let $k \in \mathbb{N}$, $E^A : U \times A \times \ldots \times A \to \mathbb{K}$, $E^B : V \times B \times \ldots \times B \to \mathbb{K}$ be $k$-tensors on $A$ and $B$ respectively that are both $C^1$.
If $E^A$ and $E^B$ satisfy Equation (6.14) for all $a \in U$, then the $(k + 1)$-tensors $\nabla E^A$ and $\nabla E^B$ defined by Equation (6.15) on $A$ and $B$ respectively satisfy Equation (6.14) for $k + 1$ and all $a \in U$.

So we can 'take the derivative'⁵ of any $k$-tensor satisfying Equation (6.14) to obtain a $(k + 1)$-tensor which also satisfies Equation (6.14).

⁵ This really is the covariant derivative with respect to the Levi-Civita connection induced by the metric $g$.


Taking a look at this procedure for $k = 2$ with $E^A = g$ and $E^B = h$ we find using Equation (6.5) that

\[
\begin{aligned}
\nabla g_a(u, v, w) &= D_ag(u)(v, w) - g_a(\Gamma^g_a(u, v), w) - g_a(v, \Gamma^g_a(u, w)) \\
&= D_ag(u)(v, w) - B^g_a(u, v, w) - B^g_a(u, w, v) \\
&= D_ag(u)(v, w) - D_ag(u)(v, w) = 0.
\end{aligned}
\]

So $\nabla g = 0$, and similarly $\nabla h = 0$.⁶ Therefore this derivative does not give us any new equations directly from Equation (6.2).

7., 8. Suppose $f \in C^3(U, B)$, $g \in C^1(U)$, $h \in C^1(V)$, and that $f$, $g$ and $h$ satisfy Equation (6.2) and Equation (6.9) for all $a \in U$. Then they satisfy Equation (6.14) for $k = 2$ with $E^A = g$ and $E^B = h$. Using Theorem (6.2.2) we can consider Equation (6.14) for $k = 3$ with $\nabla g$ and $\nabla h$, but we already saw that $\nabla g = \nabla h = 0$ does not give any new equations.
Looking at Equation (6.12) however we find that Equation (6.2) implies, provided $g \in C^2(U)$ and $h \in C^2(V)$, that we also satisfy Equation (6.14) for $k = 4$ with $E^A = R^g$ and $E^B = R^h$. Now we can, provided $g \in C^3(U)$ and $h \in C^3(V)$, again use Theorem (6.2.2) and find that we satisfy Equation (6.14) for $k = 5$ with $E^A = \nabla R^g$, $E^B = \nabla R^h$. Applying Theorem (6.2.2) again we satisfy Equation (6.14) for $k = 6$ by considering $\nabla(\nabla R^g)$ and $\nabla(\nabla R^h)$, and we can continue this way indefinitely if $g \in C^\infty(U)$ and $h \in C^\infty(V)$.

Theorem 6.2.3
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $f : U \to V$ be a $C^3$ diffeomorphism, and $g : U \times A \times A \to \mathbb{K}$, $g \in C^1(U)$, $h : V \times B \times B \to \mathbb{K}$, $h \in C^1(V)$ be metrics on $A$ and $B$ respectively.
Suppose $f$, $g$ and $h$ satisfy Equation (6.2) for all $u, v \in A$ and $a \in U$ and that there is an $l \in \mathbb{N}$, $l \ge 2$ such that $g \in C^l(U)$ and $h \in C^l(V)$.
Then we obtain a chain of instances of Equation (6.14), each of which implies the next (via Equation (6.12) for $k = 2$ and via Theorem (6.2.2) for $k \ge 4$; write $\nabla^mR^g := \underbrace{\nabla \ldots \nabla}_{m} R^g$):
\[
\begin{array}{c|c|c}
k & E^A & E^B \\ \hline
2 & g & h \\
4 & R^g & R^h \\
5 & \nabla R^g & \nabla R^h \\
6 & \nabla^2R^g & \nabla^2R^h \\
\vdots & \vdots & \vdots \\
l + 2 & \nabla^{l-2}R^g & \nabla^{l-2}R^h.
\end{array}
\]

6.3 Digression

This is the point where we will diverge from [Chr1869] because of our more general (typically infinite-dimensional) setting, which is not compatible with the theory of invariants and equation counting that is used in Sections 9. to 12. of the article.⁷

⁶ Again in a more modern context: the metric is covariantly constant.

We will first introduce two very useful definitions which provide a pleasant shorthand for Equation (6.14) and permit us to push and pull $k$-tensors from one open set to another using the diffeomorphisms between these open sets.

Definition 6.3.1: Pullback
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $f : U \to V$, $f \in C^1(U, B)$, $k \in \mathbb{N}$, and $g : V \times B^k \to \mathbb{K}$ a $k$-tensor on $B$.
Then the pullback of $g$ by $f$ is defined as the $k$-tensor $f^*g : U \times A^k \to \mathbb{K}$ on $A$, for all $a \in U$, $u_1, \ldots, u_k \in A$ given by

\[ (f^*g)_a(u_1, \ldots, u_k) := g_{f(a)}(D_af(u_1), \ldots, D_af(u_k)). \tag{6.16} \]
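The pullback of Definition (6.3.1) is easy to experiment with in finite dimensions. The following sketch, which is an illustration and not part of the source, pulls the standard inner product on $\mathbb{R}^2$ back along the polar coordinate map; the map, its analytic derivative, and all function names are assumptions introduced here.

```python
import math

def polar(a):
    """Assumed example map f(r, theta) = (r cos theta, r sin theta)."""
    r, t = a
    return (r * math.cos(t), r * math.sin(t))

def d_polar(a, u):
    """D_a f(u) for f = polar, written out analytically."""
    r, t = a
    return (math.cos(t) * u[0] - r * math.sin(t) * u[1],
            math.sin(t) * u[0] + r * math.cos(t) * u[1])

def euclid(b, x, y):
    """h_b(x, y): the standard inner product, independent of the base point b."""
    return x[0] * y[0] + x[1] * y[1]

def pullback(h, f, df):
    """Return (f^*h)_a(u, v) = h_{f(a)}(D_a f(u), D_a f(v)), i.e. (6.16) with k = 2."""
    return lambda a, u, v: h(f(a), df(a, u), df(a, v))

if __name__ == "__main__":
    g = pullback(euclid, polar, d_polar)
    a = (2.0, 0.4)
    e_r, e_t = (1.0, 0.0), (0.0, 1.0)
    print(g(a, e_r, e_r), g(a, e_r, e_t), g(a, e_t, e_t))
```

At $a = (r, \theta)$ the three printed values are, up to rounding, $1$, $0$ and $r^2$, so the pullback of the flat metric is the polar metric $\mathrm{diag}(1, r^2)$.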

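Definition 6.3.2: Pushforward
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $f : U \to V$ be a $C^1$ diffeomorphism, $k \in \mathbb{N}$, and $g : U \times A^k \to \mathbb{K}$ a $k$-tensor on $A$.
Then the pushforward of $g$ by $f$ is defined as the $k$-tensor on $B$, $f_*g : V \times B^k \to \mathbb{K}$, for all $b \in V$, $v_1, \ldots, v_k \in B$ given by

\[ (f_*g)_b(v_1, \ldots, v_k) := g_{f^{-1}(b)}\big([D_{f^{-1}(b)}f]^{-1}(v_1), \ldots, [D_{f^{-1}(b)}f]^{-1}(v_k)\big). \tag{6.17} \]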

Lemma 6.3.3
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $f : U \to V$, $f \in C^1(U, B)$, $k \in \mathbb{N}$, and $g : V \times B^k \to \mathbb{K}$ a $k$-tensor on $B$.
If for $l \in \mathbb{N}$, $f \in C^{l+1}(U, B)$ and $g \in C^l(V)$, then $(f^*g) \in C^l(U)$.
Suppose $f$ is a $C^1$ diffeomorphism and let $h : U \times A^k \to \mathbb{K}$ be a $k$-tensor on $A$, then
\[ f_*h = (f^{-1})^*h. \]
In particular, if $f$ is a $C^{l+1}$ diffeomorphism and $h \in C^l(U)$, then $f_*h \in C^l(V)$.

Proof. Suppose $f \in C^{l+1}(U, B)$ and $g \in C^l(V)$. Then using induction, the expression for $f^*g$ from Definition (6.3.1) and Equation (5.4) we see that $(f^*g) \in C^l(U)$.
Suppose $f$ is a $C^1$ diffeomorphism, then using Lemma (5.1.15) we find for all $b \in V$, $v_1, \ldots, v_k \in B$ that

\[
\begin{aligned}
(f_*h)_b(v_1, \ldots, v_k) &= h_{f^{-1}(b)}\big([D_{f^{-1}(b)}f]^{-1}(v_1), \ldots, [D_{f^{-1}(b)}f]^{-1}(v_k)\big) \\
&= h_{f^{-1}(b)}\big(D_bf^{-1}(v_1), \ldots, D_bf^{-1}(v_k)\big) \\
&= ((f^{-1})^*h)_b(v_1, \ldots, v_k),
\end{aligned}
\]

so $f_*h = (f^{-1})^*h$. Applying the first part of the lemma we therefore see that if $f^{-1} \in C^{l+1}(V, A)$ and $h \in C^l(U)$, then $f_*h = (f^{-1})^*h \in C^l(V)$.

Lemma 6.3.4
Let $A$, $B$, $C$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$, $W \subseteq C$ open. Let $f : U \to V$, $f \in C^1(U, B)$, and $g : V \to W$, $g \in C^1(V, C)$.

⁷ It should be noted that Christoffel does not give any explicit means or examples of actually obtaining these invariants in [Chr1869].


Then for any $k$-tensor $h : W \times C^k \to \mathbb{K}$ on $C$ we have
\[ (g \circ f)^*h = f^*(g^*h). \tag{6.18} \]

If $f$ and $g$ are $C^1$ diffeomorphisms, then for any $k$-tensor $i : U \times A^k \to \mathbb{K}$ on $A$
\[ (g \circ f)_*i = g_*(f_*i). \tag{6.19} \]
In particular if $f$ is a $C^1$ diffeomorphism, $h$ a $k$-tensor on $B$, and $i$ a $k$-tensor on $A$
\[ f_*(f^*h) = h, \qquad f^*(f_*i) = i. \]

Proof. Let $a \in U$, $u_1, \ldots, u_k \in A$, then using Theorem (5.1.8)

\[
\begin{aligned}
((g \circ f)^*h)_a(u_1, \ldots, u_k) &= h_{(g \circ f)(a)}\big(D_a(g \circ f)(u_1), \ldots, D_a(g \circ f)(u_k)\big) \\
&= h_{g(f(a))}\big(D_{f(a)}g(D_af(u_1)), \ldots, D_{f(a)}g(D_af(u_k))\big) \\
&= (g^*h)_{f(a)}(D_af(u_1), \ldots, D_af(u_k)) \\
&= (f^*(g^*h))_a(u_1, \ldots, u_k),
\end{aligned}
\]
which shows that $(g \circ f)^*h = f^*(g^*h)$.
Using Lemma (6.3.3) we see that if $f$ and $g$ are $C^1$ diffeomorphisms, then
\[ (g \circ f)_*i = ((g \circ f)^{-1})^*i = (f^{-1} \circ g^{-1})^*i = (g^{-1})^*((f^{-1})^*i) = g_*(f_*i). \]
Using the same trick we also see that $f_*(f^*h) = (f^{-1})^*(f^*h) = (f \circ f^{-1})^*h = (\mathrm{id}_V)^*h = h$ and similarly $f^*(f_*i) = (\mathrm{id}_U)^*i = i$.

Lemma 6.3.5
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $f : U \to V$ be a $C^1$ diffeomorphism, and $h : V \times B \times B \to \mathbb{K}$ a metric on $B$.
Then the pullback $f^*h$ is a metric on $A$.

Proof. We already know that $g := f^*h : U \times A \times A \to \mathbb{K}$ is a 2-tensor. As $h$ is symmetric, it is immediate from Equation (6.16) that $g$ is symmetric.
Now for $a \in U$, $u, v \in A$ we have $(\hat g_a(u))(v) = g_a(u, v) = h_{f(a)}(D_af(u), D_af(v)) = (\hat h_{f(a)}(D_af(u)))(D_af(v)) = (\hat h_{f(a)}(D_af(u)) \circ D_af)(v)$. So for $a \in U$, $u \in A$

\[ \hat g_a(u) = \hat h_{f(a)}(D_af(u)) \circ D_af \]
from which we find that $\hat g_a$ is bijective with inverse

\[ \hat g_a^{-1}(i) = [D_af]^{-1}\big(\hat h_{f(a)}^{-1}(i \circ [D_af]^{-1})\big) \]
for all $i \in A'$. Furthermore, as $\hat h^{-1}$ is continuous and $f^{-1} \in C^1(V, A)$, we see that $\hat g^{-1}$ is continuous by Lemma (5.1.15). Therefore $g$ is a metric.

Because we will soon be using multiple metrics and diffeomorphisms simultaneously, we will from now on denote the diffeomorphisms by Greek characters $\phi, \chi, \psi, \ldots$ to avoid confusion.

Theorem 6.3.6 ⁸
Let $A$, $B$, $C$, $D$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$, $W \subseteq C$, $X \subseteq D$ open. Let

⁸ This theorem is nothing but an expression of the fact that in the setting of Riemannian geometry we can look at the metrics $g$ and $h$ in coordinate charts of our choosing.


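$g : U \times A \times A \to \mathbb{K}$, $h : V \times B \times B \to \mathbb{K}$ be metrics on $A$ and $B$ respectively. Let for $k \in \mathbb{N}$, $\phi : W \to U$, $\chi : X \to V$ be $C^k$ diffeomorphisms.
Then there exists a $C^k$ diffeomorphism $\psi : U \to V$ such that $g = \psi^*h$ if and only if there exists a $C^k$ diffeomorphism $\omega : W \to X$ such that $(\phi^*g) = \omega^*(\chi^*h)$,

\[
\begin{array}{ccccc}
\phi^*g & W & \xrightarrow{\ \omega\ } & X & \chi^*h \\
 & \downarrow{\scriptstyle\phi} & & \downarrow{\scriptstyle\chi} & \\
g & U & \xrightarrow{\ \psi\ } & V & h.
\end{array}
\]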

Proof. Suppose there exists a $C^k$ diffeomorphism $\psi : U \to V$ such that $g = \psi^*h$. Then $\omega := \chi^{-1} \circ \psi \circ \phi : W \to X$ is a $C^k$ diffeomorphism, and by Equation (6.18) $\omega^*(\chi^*h) = (\chi \circ \omega)^*h = (\chi \circ \chi^{-1} \circ \psi \circ \phi)^*h = \phi^*(\psi^*h) = \phi^*g$.
Suppose conversely that there exists a $C^k$ diffeomorphism $\omega : W \to X$ such that $\phi^*g = \omega^*(\chi^*h)$. Choose $\psi := \chi \circ \omega \circ \phi^{-1} : U \to V$ which is a $C^k$ diffeomorphism, then by Equation (6.18) $\psi^*h = (\chi \circ \omega \circ \phi^{-1})^*h = (\phi^{-1})^*(\omega^*(\chi^*h)) = (\phi^{-1})^*(\phi^*g) = (\phi \circ \phi^{-1})^*g = g$.

Although it may not immediately be apparent, this theorem permits us to simplify the question we asked at Equation (6.2). From our previous considerations (being Theorem (6.2.1), Theorem (6.2.2), and Theorem (6.2.3)) we see that if we want to compare $g$ and $h$ in a way compatible with [Chr1869], we need $U$ and $V$ to be at least $C^3$ diffeomorphic (via the map $f$).

Corollary 6.3.7
Let $A$, $B$ be locally convex Hausdorff topological vector spaces over $\mathbb{K}$, $U \subseteq A$, $V \subseteq B$ open. Let $g : U \times A \times A \to \mathbb{K}$, $h : V \times B \times B \to \mathbb{K}$ be metrics on $A$ and $B$ respectively.
Suppose for $k \in \mathbb{N}$ that $U$ and $V$ are $C^k$ diffeomorphic via some $C^k$ diffeomorphism $\chi : U \to V$.
Then there exists a $C^k$ diffeomorphism $f : U \to V$ such that $f$, $g$, and $h$ satisfy Equation (6.2) if and only if there exists a $C^k$ diffeomorphism $\omega : U \to U$ such that $g = \omega^*(\chi^*h)$.

Proof. Simply apply Theorem (6.3.6) for $C = D = A$, $W = X = U$, $\phi = \mathrm{id}_U$, and $f = \psi$.

This permits us to reduce without loss of generality to the case where $B = A$ and $V = U$ by considering the metric (Lemma (6.3.5)) $\chi^*h$ instead of $h$.
Using Lemma (6.3.4) we see that for a diffeomorphism $\phi : U \to U$, $g = \phi^*h$ is equivalent to $\phi_*g = h$, so we can use pushforwards just as well as pullbacks.
Because we can assume $V = U$, the diffeomorphisms that may take $h$ into $g$ or vice versa can be composed with each other and therefore they form a group.

Definition 6.3.8: Diffeomorphism group
Let $A$ be a locally convex Hausdorff topological vector space over $\mathbb{K}$, $U \subseteq A$ open, and $k \in \mathbb{N}$.
Then the $k$-diffeomorphism group of $U$ is defined to be

\[ G^k_U := \{\phi : U \to U \mid \phi \text{ is a } C^k \text{ diffeomorphism}\} \]
together with identity $\mathrm{id}_U : U \to U : a \mapsto a$, multiplication $(\phi, \chi) \mapsto \phi \circ \chi$, and inversion $\phi \mapsto \phi^{-1}$.


The collection of $C^k$ metrics on $U$ is defined to be

\[ M^k_U := \{g : U \times A \times A \to \mathbb{K} \mid g \text{ is a metric},\ g \in C^k(U)\}. \]
By the above, $G^k_U$ has a natural action on $M^l_U$ for all $k, l \in \mathbb{N}$, $k > l$, given by the pushforward

\[ G^k_U \times M^l_U \to M^l_U : (\phi, g) \mapsto \phi \cdot g := \phi_*g. \tag{6.20} \]
That this indeed is a well-defined action can be verified with Lemma (6.3.3), Lemma (6.3.5), and Lemma (6.3.4): $\chi \cdot (\phi \cdot g) = \chi_*(\phi_*g) = (\chi \circ \phi)_*g = (\chi \circ \phi) \cdot g$ and $\mathrm{id}_U \cdot g = (\mathrm{id}_U)_*g = g$.

We see that now the question of finding a $C^k$ diffeomorphism $f : U \to V$ such that $f$, $g$, and $h$ satisfy Equation (6.2) is equivalent to $\chi^*h$ lying in the orbit $G^k_U \cdot g$, where $\chi : U \to V$ is any $C^k$ diffeomorphism. Therefore we can answer our question if we know the orbit of $g$ under the action of $G^k_U$. Unfortunately, the group $G^k_U$ is extremely complicated.

We therefore will not concentrate on the entire orbit of $g$, but on the orbit of the Taylor expansion (recall Theorem (5.3.11)) of $g$ at a fixed point $a \in U$. Using Theorem (6.3.6) and the diffeomorphism $U \to (U - a) : a_1 \mapsto a_1 - a$ we see that we may assume that $U$ is an open neighbourhood of $0 = a$. So we from now on assume that $A$ is a locally convex Hausdorff topological vector space over $\mathbb{K}$ with property UC (necessary to use Theorem (5.3.11)) and $U$ an open neighbourhood of $0$ in $A$.
Using Theorem (5.1.8), Equation (5.4), and Equation (6.16) we see that for $g \in M^l_U$, $\phi \in G^k_U$ with $k > l$ both large enough and $\phi(0) = 0$, the Taylor sequence of $g$ at $0$ and the Taylor sequence of $\phi^*g$ at $0$ are related as in Table 6.1 (for $\phi_*g$ simply use $\phi \to \phi^{-1}$, Lemma (6.3.3)).

\[
\begin{array}{c|l|l}
\text{Order} & g & \phi^*g \\ \hline
0 & g_0(u, v) & g_0(D_0\phi(u), D_0\phi(v)) \\ \hline
1 & D_0g(w)(u, v) & D_0g(D_0\phi(w))(D_0\phi(u), D_0\phi(v)) \\
 & & {}+ g_0(D_0^2\phi(u, w), D_0\phi(v)) + g_0(D_0\phi(u), D_0^2\phi(v, w)) \\ \hline
2 & D_0^2g(w, x)(u, v) & D_0^2g(D_0\phi(w), D_0\phi(x))(D_0\phi(u), D_0\phi(v)) \\
 & & {}+ D_0g(D_0^2\phi(w, x))(D_0\phi(u), D_0\phi(v)) \\
 & & {}+ D_0g(D_0\phi(w))(D_0^2\phi(u, x), D_0\phi(v)) \\
 & & {}+ D_0g(D_0\phi(w))(D_0\phi(u), D_0^2\phi(v, x)) \\
 & & {}+ D_0g(D_0\phi(x))(D_0^2\phi(u, w), D_0\phi(v)) \\
 & & {}+ g_0(D_0^3\phi(u, w, x), D_0\phi(v)) + g_0(D_0^2\phi(u, w), D_0^2\phi(v, x)) \\
 & & {}+ D_0g(D_0\phi(x))(D_0\phi(u), D_0^2\phi(v, w)) \\
 & & {}+ g_0(D_0^2\phi(u, x), D_0^2\phi(v, w)) + g_0(D_0\phi(u), D_0^3\phi(v, w, x)) \\ \hline
\vdots & \vdots & \vdots
\end{array}
\]

Table 6.1: Relation between the Taylor sequence of a metric $g$ and of its pullback $\phi^*g$ by a diffeomorphism $\phi$. Obtained by repeatedly using Theorem (5.1.8) and Equation (5.4).

It is clear that for higher order expansions, the transformation rules become extremely convoluted: this is not a very practical approach.
From [Chr1869] we obtained, in the form of Theorem (6.2.3), that for $R^g, \nabla R^g, \ldots$ we have the results in Table 6.2. For these terms, all expressions retain their original form under pullback (apart from an introduced $D_0\phi$).

\[
\begin{array}{l|l}
g & \phi^*g \\ \hline
g_0(u, v) & g_0(D_0\phi(u), D_0\phi(v)) \\
R^g_0(u, v, w, x) & R^g_0(D_0\phi(u), D_0\phi(v), D_0\phi(w), D_0\phi(x)) \\
\nabla R^g_0(u, v, w, x, y) & \nabla R^g_0(D_0\phi(u), D_0\phi(v), D_0\phi(w), D_0\phi(x), D_0\phi(y)) \\
\vdots & \vdots
\end{array}
\]

Table 6.2: Chain of equations from Theorem (6.2.3) for a metric $g$ and its pullback $\phi^*g$ by a diffeomorphism $\phi$.

However, while we know that near $0$ the Taylor series $(g_0, D_0g, D_0^2g, \ldots)$ gives a reasonable description of $g$ (Corollary (5.3.12)), it is quite unclear whether or not we can approximate $g$ by $(g_0, R^g_0, \nabla R^g_0, \ldots)$. We can determine the sequence $(g_0, R^g_0, \nabla R^g_0, \ldots)$ directly from the Taylor series $(g_0, D_0g, D_0^2g, \ldots)$ using Equation (6.1), Equation (6.11), and Equation (6.15). However, it is not clear whether or not we can recover $(g_0, D_0g, D_0^2g, \ldots)$ again from $(g_0, R^g_0, \nabla R^g_0, \ldots)$. If we could, this would guarantee via Corollary (5.3.12) that $(g_0, R^g_0, \nabla R^g_0, \ldots)$ describes the metric near $0$; therefore we will now investigate whether or not this is possible.
By Theorem (6.3.6) we can start by considering $\phi^*g$ (which lies in the same orbit as $g$) instead of $g$, for a diffeomorphism $\phi$ which simplifies $g$ as much as possible. Then for this simplified $g$, we will try to determine to what degree we can recover $(g_0, D_0g, D_0^2g, \ldots)$ from $(g_0, R^g_0, \nabla R^g_0, \ldots)$.⁹
It turns out (Theorem (6.5.1)) that to actually obtain this 'simplifying diffeomorphism' $\phi$ we are going to need Theorem (5.5.8) and Theorem (5.5.10) for existence, which requires us to demand that $A$ is a Banach space over $\mathbb{K}$.¹⁰ However, we will first discuss what we consider to be a 'simple' metric.

⁹ This simplification of $g$ actually seems to be necessary. For general $g$ there is no apparent way in which such a recovery can be made; one can even show that $D_0^2g(w, x)(u, v)$ cannot be recovered from any linear combination of $R^g_0(u, v, w, x)$ where we permute $u$, $v$, $w$, and $x$ (this is immediately clear if $A$ is finite-dimensional: the number of independent components of $D_0^2g$ with respect to a basis $\{e_1, \ldots, e_k\}$ of $A$, $(\tfrac12 k(k + 1))^2$, is strictly greater than that of $R^g_0$, $\tfrac{1}{12} k^2(k^2 - 1)$; in the general case it can be rewritten as a conflicting system of linear equations).
¹⁰ Quite a pity, up until now we have been working in a very general context. However, Example (5.4.3) and Example (5.4.4) show that this is really necessary. Furthermore, $A$ is almost a Hilbert space (so in particular almost ) because of the existence of the metric $g$: necessarily $A \simeq A'$. In particular, if there exists a point $a$ at which $\hat g_a$ is negative definite on a finite-dimensional subspace of $A$ and positive semidefinite on the complement, then $A$ is a Hilbert space.
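The component count quoted in footnote 9 can be tabulated for small dimensions. The short sketch below only evaluates the two closed formulas from that footnote; the function names are hypothetical and introduced purely for illustration.

```python
def components_second_derivative(k):
    """Independent components of D_0^2 g in dimension k: (k(k+1)/2)^2,
    since D_0^2 g(w, x)(u, v) is symmetric in (w, x) and in (u, v)."""
    return (k * (k + 1) // 2) ** 2

def components_curvature(k):
    """Independent components of R_0^g in dimension k: k^2 (k^2 - 1) / 12."""
    return k * k * (k * k - 1) // 12

if __name__ == "__main__":
    for k in range(1, 6):
        print(k, components_second_derivative(k), components_curvature(k))
```

For $k = 2, 3, 4$ this gives $9$ against $1$, $36$ against $6$, and $100$ against $20$, which illustrates why the second derivative of a general metric cannot be read off from the curvature tensor alone.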


6.4 Simple metrics

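We will start by considering the notion of a geodesic: a generalisation of the concept of a straight line, with respect to the metric $g$.¹¹ ¹² Geodesics will be curves $\gamma$ for which the 'total kinetic energy' of the curve, given by
\[ \int \Big( \alpha \mapsto \tfrac12\, g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) \Big), \]
is extremal with respect to variations of $\gamma$ which leave the end-points of $\gamma$ fixed.
Theorem (5.3.14) shows us that such variations are completely described by the Lagrange map of the kinetic energy.
Let $A$ be a locally convex Hausdorff topological vector space over $\mathbb{K}$, and $U \subseteq A$ open. Let $g : U \times A \times A \to \mathbb{K}$ be a metric on $A$ with $g \in C^2(U)$.
Consider the function $G : U \times A \to \mathbb{K}$ given by
\[ G(a, u) := \tfrac12\, g_a(u, u) \]
which is $C^2$. Then for any curve $\gamma : S \to U$, $\gamma \in C^2(S, A)$ with $S \subseteq \mathbb{R}$ an open interval, we have for $\alpha \in S$, $u \in A$ (see Theorem (5.3.14) and use Equation (5.4))

\[
\begin{aligned}
\mathrm{La}(G, \gamma)(\alpha, u) &= \frac{\partial}{\partial\alpha}\Big( g_{\gamma(\alpha)}(\gamma'(\alpha), u) \Big) - \tfrac12\, D_{\gamma(\alpha)}g(u)(\gamma'(\alpha), \gamma'(\alpha)) \\
&= D_{\gamma(\alpha)}g(\gamma'(\alpha))(\gamma'(\alpha), u) + g_{\gamma(\alpha)}(\gamma''(\alpha), u) - \tfrac12\, D_{\gamma(\alpha)}g(u)(\gamma'(\alpha), \gamma'(\alpha)) \\
&\overset{(6.4)}{=} B^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha), u) + g_{\gamma(\alpha)}(\gamma''(\alpha), u) \\
&\overset{(6.7)}{=} g_{\gamma(\alpha)}\big( \gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)), u \big) \\
&= \Big( \hat g_{\gamma(\alpha)}\big( \gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) \big) \Big)(u).
\end{aligned}
\]

Hence $\mathrm{La}(G, \gamma)(\alpha, u) = 0$ for all $\alpha \in S$, $u \in A$ if and only if $\hat g_{\gamma(\alpha)}\big(\gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha))\big) = 0$ for all $\alpha \in S$ if and only if (Definition (6.1.2)) for all $\alpha \in S$ we have
\[ \gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) = 0. \]

Definition 6.4.1: Geodesic
Let $A$ be a locally convex Hausdorff topological vector space over $\mathbb{K}$, $U \subseteq A$ open. Let $g : U \times A \times A \to \mathbb{K}$ be a metric on $A$, $g \in C^1(U)$.
Then we call a curve $\gamma : S \to U$, $\gamma \in C^2(S, A)$, with $S \subseteq \mathbb{R}$ an open interval a geodesic with respect to $g$ if for all $\alpha \in S$, $\gamma$ satisfies

\[ \gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) = 0. \tag{6.21} \]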
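The geodesic equation (6.21) can be integrated numerically in a simple case. The sketch below is an illustration under assumptions not made in the source: it uses the polar metric $g = \mathrm{diag}(1, r^2)$ on the plane, for which $\Gamma(v, v) = (-r\,v_\theta^2,\ \tfrac{2}{r} v_r v_\theta)$, and a classical fourth-order Runge-Kutta step; the step size and all names are hypothetical. Since this metric is the flat metric in polar coordinates, the computed geodesic should be a Euclidean straight line, and $g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha))$ should stay constant along it.

```python
import math

def deriv(state):
    """First-order form of (6.21) for g = diag(1, r^2):
    state = (r, theta, r', theta') and gamma'' = -Gamma(gamma', gamma')."""
    r, t, vr, vt = state
    return (vr, vt, r * vt * vt, -2.0 * vr * vt / r)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, h / 2))
    k3 = deriv(shifted(state, k2, h / 2))
    k4 = deriv(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    """g_gamma(gamma', gamma') for the polar metric."""
    r, t, vr, vt = state
    return vr * vr + r * r * vt * vt

def integrate(state, h, steps):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    # Start at the Cartesian point (1, 0) with Cartesian velocity (0, 1).
    end = integrate((1.0, 0.0, 0.0, 1.0), 0.001, 1000)
    print(end[0], math.sqrt(2.0))   # radius after unit time
    print(end[1], math.pi / 4)      # angle after unit time
    print(speed_squared(end))       # stays close to 1
```

After unit time the curve has reached, up to discretisation error, the point with $r = \sqrt2$ and $\theta = \pi/4$, which is the Cartesian point $(1, 1)$ on the straight line $\alpha \mapsto (1, \alpha)$.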

¹¹ For metrics $g$ that are positive definite, $g_a(u, u) > 0$ for all $u \neq 0$, the geodesics are actually (see Section 10 of [Ban2008]) locally the paths of shortest length as measured by $\int_{\mathrm{dom}\,\gamma} \sqrt{g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha))}\, d\alpha$. One can compare this with straight lines $\gamma(\alpha) = a + \alpha u$ in $\mathbb{R}^k$ which are paths of shortest length with respect to $\int_{\mathrm{dom}\,\gamma} \sqrt{\langle \gamma'(\alpha), \gamma'(\alpha) \rangle}\, d\alpha = \int_{\mathrm{dom}\,\gamma} \|\gamma'(\alpha)\|\, d\alpha$, which is the usual notion of length.
¹² Another interpretation of geodesics is that of paths of free-falling particles, that is, particles not subject to external forces, in the theory of general relativity. See Chapter 4 from [Wal1984].


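Let $\gamma : S \to U$ be a geodesic with respect to $g$. Then using Equation (5.4) we find that
\[
\begin{aligned}
\frac{\partial}{\partial\alpha}\Big( g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) \Big) &= D_{\gamma(\alpha)}g(\gamma'(\alpha))(\gamma'(\alpha), \gamma'(\alpha)) + 2\, g_{\gamma(\alpha)}(\gamma''(\alpha), \gamma'(\alpha)) \\
&\overset{(6.5)}{=} 2\, B^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha), \gamma'(\alpha)) + 2\, g_{\gamma(\alpha)}(\gamma''(\alpha), \gamma'(\alpha)) \\
&\overset{(6.7)}{=} 2\, g_{\gamma(\alpha)}\big( \gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)), \gamma'(\alpha) \big) \\
&\overset{(6.21)}{=} 0.
\end{aligned}
\]

Therefore we find with Theorem (5.1.8) that for a geodesic $\gamma : S \to U$,

\[ g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) = g_{\gamma(\beta)}(\gamma'(\beta), \gamma'(\beta)) \]
for all $\alpha, \beta \in S$.¹³ So we obtain the following lemma.

Lemma 6.4.2
Let $A$ be a locally convex Hausdorff topological vector space over $\mathbb{K}$, $U \subseteq A$ open. Let $g : U \times A \times A \to \mathbb{K}$ be a metric on $A$, $g \in C^2(U)$. Let $\gamma : S \to U$, $\gamma \in C^2(S, A)$, with $S \subseteq \mathbb{R}$ an open interval.
Then $\gamma$ is a geodesic with respect to $g$ if and only if
\[ \mathrm{La}\Big( U \times A \to \mathbb{K} : (a, u) \mapsto \tfrac12\, g_a(u, u),\ \gamma \Big)(\alpha, u) = 0 \tag{6.22} \]
for all $\alpha \in S$, $u \in A$. If this is the case, we have for all $\alpha, \beta \in S$

\[ g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) = g_{\gamma(\beta)}(\gamma'(\beta), \gamma'(\beta)). \tag{6.23} \]

We now call a metric simple at a certain point if the geodesics, the straight lines with respect to $g$, emanating from this point, are straight in the usual sense (see Lemma (6.4.4)).

Definition 6.4.3: Simple metric
Let $A$ be a locally convex Hausdorff topological vector space over $\mathbb{K}$, $U \subseteq A$ open.
Then a metric $g : U \times A \times A \to \mathbb{K}$ on $A$, $g \in C^1(U)$ is called a simple metric at $a \in U$ if there exists a neighbourhood $U_1$ of $0$ in $U - a$ such that for all $u \in U_1$ we have
\[ \Gamma^g_{a+u}(u, u) = 0. \tag{6.24} \]

Suppose $g$ is a simple metric at $a \in U$, satisfying Equation (6.24) for all $u \in U_1$ where $U_1$ is an open abc neighbourhood of $0$ in $U - a$.
Let $u \in U_1$, then, as $U_1$ is balanced, for all $\alpha \in B_{\mathbb{K}}(0, 1)$ we have $\alpha u \in U_1$, so $\Gamma^g_{a+\alpha u}(\alpha u, \alpha u) = \alpha^2\, \Gamma^g_{a+\alpha u}(u, u) = 0$. Define $\gamma : \,]{-1}, 1[\, \to U$ by $\gamma(\alpha) := a + \alpha u$, then

\[ \gamma''(\alpha) + \Gamma^g_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha)) = 0 + \Gamma^g_{a+\alpha u}(u, u) = 0, \]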

¹³ So the infinitesimal length squared of geodesics with respect to $g$ is constant.

- 140 - 6.4. SIMPLE METRICS so α 7→ a + α u is a geodesic. Therefore, if g is simple at a, all straight lines emanating from a are geodesics with respect to g. Suppose conversely that there exists an open abc neighbourhood U1 of 0 in U − a such that for all u ∈ U1, the map ] − 1, 1[→ U : α 7→ a + α u is a geodesic. g Then by the previous, Γa+α u(u, u) = 0 for all α ∈] − 1, 1[. In particular for g any u1 ∈ U, by continuity of Γ and the fact that U1 3 u = limα→1 α u, g g Γa+u(u, u) = limα→1 Γa+α u(u, u) = limα→1 0 = 0, so g is simple at a. Suppose A UC . We follow [Dui2006]. Let u ∈ U1, then because scalar multi- c plication K × A → A and U1 is balanced, there exists a δ ∈]0, ∞[ such that for all α ∈ BK(0, 1 + δ) we have α u ∈ U1. In particular γu :] − 1 − δ, 1 + δ[→ U, γu(α) := a + α u is a geodesic with respect to g with image in U1. Hence, as [0, 1] ⊆] − 1 − δ, 1 + δ[,

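\[
\begin{aligned}
\tfrac{1}{2}\, g_a(u, u) &= \int_0^1 \Big(\alpha \mapsto \tfrac{1}{2}\, g_a(u, u)\Big) \\
&= \int_0^1 \Big(\alpha \mapsto \tfrac{1}{2}\, g_{\gamma_u(0)}(\gamma_u'(0), \gamma_u'(0))\Big) \\
&\overset{(6.23)}{=} \int_0^1 \Big(\alpha \mapsto \tfrac{1}{2}\, g_{\gamma_u(\alpha)}(\gamma_u'(\alpha), \gamma_u'(\alpha))\Big) \\
&= \int_0^1 \Big(\alpha \mapsto G(\gamma_u(\alpha), \gamma_u'(\alpha))\Big),
\end{aligned}
\]
so via Theorem (5.3.14) we find that for $v \in A$
\[
\begin{aligned}
g_a(u, v) &= D_u\Big(u' \mapsto \tfrac{1}{2}\, g_a(u', u')\Big)(v) \\
&= D_u\Big(u' \mapsto \int_0^1 \big(\alpha \mapsto G(\gamma_{u'}(\alpha), \gamma_{u'}'(\alpha))\big)\Big)(v) \\
&\overset{(5.7)}{=} -\int_0^1 \Big(\alpha \mapsto \mathrm{La}(G, \gamma_u)\Big(\alpha, \frac{\partial \gamma_u(\alpha)}{\partial u}(v)\Big)\Big) \\
&\qquad + g_{\gamma_u(1)}\Big(\gamma_u'(1), \frac{\partial \gamma_u(1)}{\partial u}(v)\Big) - g_{\gamma_u(0)}\Big(\gamma_u'(0), \frac{\partial \gamma_u(0)}{\partial u}(v)\Big) \\
&\overset{(6.22)}{=} -0 + g_{a+u}(u, 1\, v) - g_a(u, 0\, v) \\
&= g_{a+u}(u, v).
\end{aligned}
\]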

In particular, for all u ∈ U1 and v ∈ A we have

\[ g_a(u, v) = g_{a+u}(u, v). \]
Taking the derivative twice with respect to u of Equation (6.24) in directions $v, w \in A$ and evaluating the result at $u = 0$, we find with Equation (5.4) that
\[ 0 + 0 + 0 + \Gamma^g_{a+0}(v, w) = 0, \]
so $\Gamma^g_a(v, w) = 0$ for all $v, w \in A$. Therefore, by Equation (6.5) and Equation (6.7), we find that $D_a g(w)(u, v) = 0$ for all $u, v, w \in A$. If g is simple at a, $D_a g = 0$.

Let B Vs/K T2 LC, $V \subseteq B$ open and $\phi : V \to U$ a $C^2$ diffeomorphism satisfying $\phi(b) = a$ for some $b \in V$ and $D^2_{b_1}\phi = 0$ for all $b_1 \in V$. Shrink V to an open abc neighbourhood of b.


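Then for any $u \in B$ the map $V \to A : b_1 \mapsto D_{b_1}\phi(u)$ is $C^1$ with derivative 0. Hence by Theorem (5.1.8), it is constant, so $D_{b_1}\phi(u) = D_b\phi(u)$ for all $b_1 \in V$, $u \in B$.

Let $\chi := D_b\phi$. Then by Lemma (5.1.15), $\chi : B \to A$ is continuous linear and bijective with $\chi^{-1} : A \to B$ continuous linear. Now by Theorem (5.1.8), as $D_{b_1}\phi = \chi$ for all $b_1 \in V$, $\phi(b_1) = \chi(b_1 - b) + a$ for all $b_1 \in V$. Hence Equation (6.9) becomes, for all $b_1 \in \phi^{-1}(U) \cap V$ and $u, v \in B$
\[ D^2_{b_1}\phi(u, v) + \Gamma^g_{\phi(b_1)}(D_{b_1}\phi(u), D_{b_1}\phi(v)) = D_{b_1}\phi\big(\Gamma^{\phi^* g}_{b_1}(u, v)\big), \]
which reduces to
\[ 0 + \Gamma^g_{\chi(b_1 - b) + a}(\chi(u), \chi(v)) = \chi\big(\Gamma^{\phi^* g}_{b_1}(u, v)\big) \]
and therefore
\[ \Gamma^{\phi^* g}_{b_1}(u, v) = \chi^{-1}\big(\Gamma^g_{\chi(b_1 - b) + a}(\chi(u), \chi(v))\big). \]
In particular, for all $u \in \chi^{-1}(U_1)$, $b_1 = b + u \in \phi^{-1}(U)$, so
\[ \Gamma^{\phi^* g}_{b+u}(u, u) = \chi^{-1}\big(\Gamma^g_{a+\chi(u)}(\chi(u), \chi(u))\big) = 0 \]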

because g is simple at a and $\chi(u) \in U_1$. Hence $\phi^* g$ is simple at b. With this, Lemma (6.3.3), and Lemma (6.3.5), we arrive at the following lemma.

Lemma 6.4.4 Let A Vs/K T2 LC UC, $U \subseteq A$ open. Let $g : U \times A \times A \to K$ be a metric on A, $g \in C^2(U)$, and $a \in U$.
Then g is a simple metric at a if and only if there exists an open abc neighbourhood $U_1$ of 0 in $U - a$ such that for all $u \in U_1$ we have that
\[ ]-1, 1[\, \to U_1 : \alpha \mapsto a + \alpha u \]
is a geodesic with respect to g.¹⁴ If this is the case, then for all $u \in U_1$ and $v \in A$ we have
\[ g_a(u, v) = g_{a+u}(u, v), \tag{6.25} \]
and at a we have
\[ D_a g = 0. \tag{6.26} \]
Furthermore, let B /K, $V \subseteq B$ open. Then for any $C^2$ diffeomorphism $\phi : V \to U$ satisfying for all $b \in V$ that $D^2_b\phi = 0$, the pullback $\phi^* g$ is a $C^2$ metric that is simple at $\phi^{-1}(a)$, if g is a simple metric at a.

Note that Equation (6.26) implies that if a metric g is simple at all $a \in U$, then by Theorem (5.1.8) it must necessarily be constant on each connected component of U.
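Equations (6.25) and (6.26) can be tried out numerically on a familiar example. The sketch below is an illustration that is not taken from the thesis: it assumes the round unit sphere written in normal coordinates centred at 0, where the metric is $dr^2 + (\sin r / r)^2 (|dx|^2 - dr^2)$ with $r = |x|$, and where straight lines through 0 are geodesics. The function names are hypothetical.

```python
import math

def g(x, v, w):
    """Assumed example metric: round unit sphere in normal coordinates at 0,
    g_x = dr^2 + (sin r / r)^2 (|dx|^2 - dr^2) with r = |x| < pi."""
    dot = lambda p, q: p[0] * q[0] + p[1] * q[1]
    r2 = dot(x, x)
    if r2 == 0.0:
        return dot(v, w)              # g_0 is the Euclidean inner product
    r = math.sqrt(r2)
    radial = dot(x, v) * dot(x, w) / r2
    return radial + (math.sin(r) / r) ** 2 * (dot(v, w) - radial)

def gauss_defect(u, v):
    """g_0(u, v) - g_{0+u}(u, v); Equation (6.25) says this vanishes."""
    return g((0.0, 0.0), u, v) - g(u, u, v)

def first_derivative_at_origin(w, u, v, h=1e-4):
    """Central difference for D_0 g(w)(u, v); Equation (6.26) says this vanishes."""
    plus = g((h * w[0], h * w[1]), u, v)
    minus = g((-h * w[0], -h * w[1]), u, v)
    return (plus - minus) / (2 * h)
```

Both quantities should be zero up to rounding for any choice of vectors with $|u| < \pi$; the check says nothing about metrics that are not simple at 0.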

14This is often called Gauß’s Lemma.


Suppose $g \in C^k(U)$ for $k \geq 2$. We are now going to investigate the consequences of Equation (6.25) by taking derivatives of this equation in directions $v_2, v_3, \ldots \in A$ using Equation (5.4) and Theorem (5.1.16):
\[
\begin{aligned}
g_a(u, v_1) &= g_{a+u}(u, v_1) \\
g_a(v_2, v_1) &= D_{a+u}g(v_2)(u, v_1) + g_{a+u}(v_2, v_1) \\
0 &= D^2_{a+u}g(v_2, v_3)(u, v_1) + D_{a+u}g(v_2)(v_3, v_1) + D_{a+u}g(v_3)(v_2, v_1) \\
0 &= D^3_{a+u}g(v_2, v_3, v_4)(u, v_1) \\
&\quad + D^2_{a+u}g(v_2, v_3)(v_4, v_1) + D^2_{a+u}g(v_2, v_4)(v_3, v_1) + D^2_{a+u}g(v_3, v_4)(v_2, v_1) \\
&\;\;\vdots \\
0 &= D^k_{a+u}g(v_2, \ldots, v_{k+1})(u, v_1) + D^{k-1}_{a+u}g(v_2, v_3, \ldots, v_k)(v_{k+1}, v_1) \\
&\quad + D^{k-1}_{a+u}g(v_{k+1}, v_2, \ldots, v_{k-1})(v_k, v_1) + \ldots + D^{k-1}_{a+u}g(v_3, v_4, \ldots, v_{k+1})(v_2, v_1).
\end{aligned}
\]
Evaluating these expressions at $u = 0$ we find for all $2 \leq l \leq k - 1$ and $v_1, \ldots, v_{l+2} \in A$ that
\[
D^l_a g(v_2, v_3, \ldots, v_{l+1})(v_{l+2}, v_1) + D^l_a g(v_{l+2}, v_2, \ldots, v_l)(v_{l+1}, v_1) + \ldots + D^l_a g(v_3, v_4, \ldots, v_{l+2})(v_2, v_1) = 0. \tag{6.27}
\]

What we want is to express $D^l_a g(v_{l+2}, \ldots, v_3)(v_2, v_1)$ completely in terms of $\nabla^{l-2} R^g_a$ with appropriate combinations of $v_1, \ldots, v_{l+2}$ inserted in each term.¹⁵

To do so, it is convenient to simplify our notation using the symmetries of $D^l_a g$: by Theorem (5.1.16) and the fact that g is a metric we know that $D^l_a g(v_{l+2}, \ldots, v_3)(v_2, v_1)$ is symmetric under permutations of $v_{l+2}, \ldots, v_3$ and $v_2, v_1$. Therefore, for any $\pi \in S^{l+2}$, the expression
\[ D^l_a g(v_{\pi(l+2)}, \ldots, v_{\pi(3)})(v_{\pi(2)}, v_{\pi(1)}) \]
is completely characterised by $\pi(1), \pi(2) \in \{1, \ldots, l+2\}$: the expression remains the same for all permutations of the indices $\pi(3), \ldots, \pi(l+2)$ by Theorem (5.1.16). Hence we will for $2 \leq l \leq k - 1$, $i_1, i_2 \in \{1, \ldots, l+2\}$, $i_1 \neq i_2$ use the notation
\[ (i_1, i_2)^l := D^l_a g(v_{j_1}, \ldots, v_{j_l})(v_{i_1}, v_{i_2}) \]
where $j_1 < \ldots < j_l$ are values in $\{1, \ldots, l+2\} \setminus \{i_1, i_2\}$, that are uniquely determined (since there are $l + 2 - 2 = l$ elements in $\{1, \ldots, l+2\} \setminus \{i_1, i_2\}$). Then for any $\pi \in S^{l+2}$,
\[ D^l_a g(v_{\pi(l+2)}, \ldots, v_{\pi(3)})(v_{\pi(2)}, v_{\pi(1)}) = (\pi(2), \pi(1))^l. \]
Note that as g is symmetric,
\[ (i_1, i_2)^l = (i_2, i_1)^l, \]
and that Equation (6.27) implies
\[ (1, 2)^l + (1, 3)^l + \ldots + (1, l+2)^l = 0, \]

¹⁵ $\nabla^{l-2} R^g_a$ because $R^g_a$ is expressed in $D^2_a g$ as highest order derivative of g by Equation (6.11) and hence, by Equation (6.15), $\nabla^{l-2} R^g_a$ contains $D^l_a g$ as highest order derivative.


which in turn gives us, by permuting the vectors v·, that for any i ∈ {1, . . . , l+2}

\[ \sum_{j=1,\, j \neq i}^{l+2} (i, j)^l = 0. \tag{6.28} \]

Now by Equation (6.11) and Equation (6.26) we have that for any $i_1, i_2, i_3, i_4 \in \{1, 2, 3, 4\}$
\[ R^g_a(v_{i_1}, v_{i_2}, v_{i_3}, v_{i_4}) = \tfrac{1}{2}\Big((i_2, i_3)^2 - (i_1, i_3)^2 - (i_2, i_4)^2 + (i_1, i_4)^2\Big) + 0. \]

Hence by Equation (6.15), for $2 \leq l \leq k - 1$, $j_1, \ldots, j_{l-2}, i_1, \ldots, i_4 \in \{1, \ldots, l+2\}$
\[ \nabla^{l-2} R^g_a(v_{j_1}, \ldots, v_{j_{l-2}}, v_{i_1}, \ldots, v_{i_4}) = \tfrac{1}{2}\Big((i_1, i_4)^l + (i_2, i_3)^l - (i_1, i_3)^l - (i_2, i_4)^l\Big) + \ldots \]
where ... consists of lower (that is $< l$) order derivatives of g.

This can be shown using induction. For $l = 2$ we have with Equation (6.11) that for $R^g_a(v_1, \ldots, v_4)$, ‘...’ $= g_a(\Gamma^g_a(v_1, v_4), \Gamma^g_a(v_2, v_3)) - g_a(\Gamma^g_a(v_1, v_3), \Gamma^g_a(v_2, v_4))$ which by Equation (6.7) can be expressed entirely in $g_a$ and $D_a g$.

Now suppose that for some $l \geq 3$ we have that for $\nabla^{(l-1)-2} R^g_a(v_1, \ldots, v_{(l-1)+2})$, that ‘...’ can be expressed in $D^m_a g$ for $0 \leq m < l - 1$ and that the four remaining terms are a linear combination of $D^{l-1}_a g$ (note that this is the case for $l = 3$). We obtain $\nabla^{l-2} R^g_a(v_1, \ldots, v_{l+2})$ by taking $\nabla(\nabla^{(l-1)-2} R^g)_a$, which by Equation (6.15) involves $D_a(\nabla^{(l-1)-2} R^g)$ and substituting $\Gamma^g_a$ in $\nabla^{(l-1)-2} R^g$.

By substituting $\Gamma^g_a$ we obtain terms that can be expressed in terms of $D^m_a g$ for $0 \leq m \leq l - 1$ ($\Gamma^g_a$ is determined by $g_a$ and $D_a g$, and $\nabla^{(l-1)-2} R^g$ involves derivatives of g of at most order $l - 1$ by our induction hypothesis).

$D_a(\nabla^{(l-1)-2} R^g)$ gives us the derivative of the first four non-‘...’ terms of $\nabla^{(l-1)-2} R^g$, which is a linear combination of four $D^l_a g$ terms, and the derivative of ‘...’ from $\nabla^{(l-1)-2} R^g$. Since by induction, ‘...’ from $\nabla^{(l-1)-2} R^g$ is completely expressed in terms of $D^m_a g$ for $0 \leq m < l - 1$ we find that the derivative is completely expressed in terms of $D^m_a g$ for $0 \leq m \leq l - 1$, where we use Equation (6.1) to write derivatives of $\Gamma^g$ in terms of derivatives of g. Hence the ‘...’ part of $\nabla^{l-2} R^g_a(v_1, \ldots, v_{l+2})$ only contains terms $D^m_a g$ for $0 \leq m < l$. Now using induction we see that for $\nabla^{l-2} R^g_a$, ‘...’ is expressed entirely in terms of $D^m_a g$ for $0 \leq m < l$ for all $2 \leq l \leq k - 1$.¹⁶

Inspired by this and Equation (6.28)¹⁷ we consider for $2 \leq l \leq k - 1$,

¹⁶ Note that even though $\Gamma^g_a = 0$ by Equation (6.26), the derivatives of $\Gamma^g_a$ need not be 0 and repeated applications of Equation (5.4) and Theorem (5.1.8) yield very complicated expressions for all terms contained in ‘...’. However, all derivatives of g that occur in ‘...’ are of lower order than l.
¹⁷ Actually after painstakingly calculating $D^2_a(\phi^* g)(w, x)(u, v) = -\big(R^g_a(x, u, w, v) + R^g_a(w, u, x, v)\big)/3$ using Theorem (6.5.2), where $\phi$ is the map from Theorem (6.5.1).


v1, . . . , vl+4 ∈ A the following expression:

\[
\begin{aligned}
&\sum_{\pi \in S^l} \nabla^{l-2} R^g_a(v_{\pi(1)}, \ldots, v_{\pi(l-2)}, v_{\pi(l-1)}, v_{l+1}, v_{\pi(l)}, v_{l+2}) \\
&\quad = \tfrac{1}{2} \sum_{\pi \in S^l} \Big[(\pi(l-1), l+2)^l + (l+1, \pi(l))^l - (\pi(l-1), \pi(l))^l - (l+1, l+2)^l\Big] + \ldots \\
&\quad = \tfrac{1}{2}(l-1)! \sum_{i=1}^{l} (i, l+2)^l + \tfrac{1}{2}(l-1)! \sum_{i=1}^{l} (l+1, i)^l \\
&\qquad - \tfrac{1}{2}(l-2)! \sum_{i=1}^{l} \sum_{j=1,\, j \neq i}^{l} (i, j)^l - \tfrac{1}{2}\, l!\, (l+1, l+2)^l + \ldots \\
&\quad \overset{(6.28)}{=} -\tfrac{1}{2}(l-1)!\, (l+1, l+2)^l - \tfrac{1}{2}(l-1)!\, (l+1, l+2)^l \\
&\qquad - \tfrac{1}{2}(l-2)! \sum_{i=1}^{l} \Big(-(i, l+1)^l - (i, l+2)^l\Big) - \tfrac{1}{2}\, l!\, (l+1, l+2)^l + \ldots \\
&\quad = -\Big(\tfrac{1}{2}\, l! + (l-1)! + (l-2)!\Big)(l+1, l+2)^l + \ldots \\
&\quad = -\frac{l(l+1)}{2}\,(l-2)!\; D^l_a g(v_1, \ldots, v_l)(v_{l+1}, v_{l+2}) + \ldots
\end{aligned}
\]
where again ... denotes terms containing derivatives of g of order $< l$. This leads us to the following lemma.

Lemma 6.4.5 Let A Vs/K T2 LC UC, $U \subseteq A$ open. Let $g : U \times A \times A \to K$ be a metric on A, $g \in C^k(U)$, $k \geq 2$, and $a \in U$. Suppose g is a simple metric at a.
Then for all $2 \leq l \leq k - 1$ and $u, v, w_1, \ldots, w_l \in A$ we have
\[
\begin{aligned}
D^l_a g(w_1, \ldots, w_l)(u, v) = {} & -2\, \frac{l-1}{l+1}\, \frac{1}{l!} \sum_{\pi \in S^l} \nabla^{l-2} R^g_a(w_{\pi(1)}, \ldots, w_{\pi(l-2)}, w_{\pi(l-1)}, u, w_{\pi(l)}, v) \\
& + \ldots \text{expression in } D^m_a g \text{ for } 0 \leq m < l \ldots.
\end{aligned}
\tag{6.29}
\]

Equation (6.29) becomes for $l = 2$ (use Equation (6.26))
\[ D^2_a g(w_1, w_2)(u, v) = -\tfrac{1}{3}\Big(R^g_a(w_1, u, w_2, v) + R^g_a(w_2, u, w_1, v)\Big) + 0. \]
We therefore see by using induction and Equation (6.29), that the sequence $(g_a, R^g_a, \nabla R^g_a, \ldots, \nabla^{k-3} R^g_a)$ completely determines $(g_a, D_a g, D^2_a g, \ldots, D^{k-1}_a g)$ if $g \in C^k(U)$ is simple at a. This, together with Equation (6.11) and Equation (6.15), yields the following theorem.


Theorem 6.4.6: Taylor sequences of simple metrics Let A Vs/K T2 LC UC, $U \subseteq A$ open. Let $g, h : U \times A \times A \to K$ be metrics on A. Suppose $g, h \in C^k(U)$ for $k \geq 2$ and that g and h are both simple metrics at $a \in U$.
Then $D^l_a g = D^l_a h$ for all $0 \leq l < k$ if and only if $g_a = h_a$ and $\nabla^l R^g_a = \nabla^l R^h_a$ for all $0 \leq l < k - 2$.

A similar result, obtained in an entirely different fashion, can be found in [Car1951], Chapitre X, no 218 - 219 on page 238.
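The coefficient in Equation (6.29) rests on two counting facts used in the sum over $S^l$: a prescribed value of $\pi(l-1)$ is attained by $(l-1)!$ permutations, a prescribed pair $(\pi(l-1), \pi(l))$ by $(l-2)!$ permutations, and $\tfrac{1}{2} l! + (l-1)! + (l-2)! = \tfrac{l(l+1)}{2}(l-2)!$. The following brute-force sketch checks these for small l; the function names are hypothetical and it verifies only the combinatorics, not the tensor identities.

```python
from itertools import permutations
from math import factorial

def count_fixed_positions(l):
    """Count pi in S^l with pi(l-1) = 1, and with pi(l-1) = 1 and pi(l) = 2."""
    one_slot = two_slots = 0
    for pi in permutations(range(1, l + 1)):
        if pi[l - 2] == 1:            # pi(l-1), stored at 0-based index l-2
            one_slot += 1
            if pi[l - 1] == 2:        # pi(l), stored at 0-based index l-1
                two_slots += 1
    return one_slot, two_slots

def coefficient_identity_holds(l):
    """Check l!/2 + (l-1)! + (l-2)! == l(l+1)/2 * (l-2)!, multiplied by 2
    so that everything stays in integers."""
    lhs = factorial(l) + 2 * factorial(l - 1) + 2 * factorial(l - 2)
    rhs = l * (l + 1) * factorial(l - 2)
    return lhs == rhs
```

Dividing by $\tfrac{l(l+1)}{2}(l-2)!$ and using $l! = l(l-1)(l-2)!$ gives the prefactor $2\,\tfrac{l-1}{l+1}\,\tfrac{1}{l!}$ that appears in Equation (6.29).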

6.5 Making metrics simple

Now that we know, in the form of Theorem (6.4.6), that the Taylor sequence of a metric g which is simple at a is completely determined by $g_a$ and $\nabla^l R^g_a$, we need to ascertain whether or not a general metric can be brought into a simple form.¹⁸

Theorem 6.5.1 Let A Ba/K, $U \subseteq A$ open. Let $g : U \times A \times A \to K$ be a metric on A and suppose that $g \in C^k(U)$ for $k \geq 3$.
Then for any $a \in U$ there exists an open neighbourhood $U_1$ of a in U, an open abc neighbourhood V of 0 in A, and a $C^{k-1}$ diffeomorphism $\phi : V \to U_1$ such that $\phi(0) = a$, $D_0\phi = \mathrm{id}_A$, and $\phi^* g$ is a simple metric at 0.

Proof. Using the diffeomorphism A → A : a1 7→ a1 − a we see that we may take a = 0 without loss of generality. Define f : U × A → A × A by

\[ f(a, u) := (u, -\Gamma^g_a(u, u)). \tag{6.30} \]
Since $g \in C^k(U)$, $\Gamma^g \in C^{k-1}(U \times A \times A, A)$, so $f \in C^{k-1}(U \times A, A \times A)$. Furthermore, as $k \geq 3$, f is $C^2$ and hence by Theorem (5.5.10) the flow $e^{\cdot f}$ exists on a certain open set $W \subseteq \mathbb{R} \times U \times A$ with $\{0\} \times U \times A \subseteq W$.

For convenience we define the maps $\pi_1, \pi_2 : A \times A \to A$, $\iota : A \to A \times A$ by

π1(a1, a2) := a1, π2(a1, a2) := a2, ι(a) := (0, a). (6.31)

Note that $\pi_1$, $\pi_2$, and $\iota$ are continuous linear and hence (Theorem (5.1.8)) $C^\infty$. We define for $S(0, u) \neq \emptyset$ the curve $\gamma_u : S(0, u) \to U$ to be
\[ \gamma_u(\alpha) := \pi_1(e^{\alpha f}(0, u)). \]
Equation (5.10) and our expression for f then give us that $\gamma_u$ is the unique maximal solution to¹⁹
\[ \gamma_u(0) = 0, \quad \gamma_u'(0) = u, \quad \gamma_u''(\alpha) + \Gamma^g_{\gamma_u(\alpha)}(\gamma_u'(\alpha), \gamma_u'(\alpha)) = 0, \tag{6.32} \]

¹⁸ The answer is yes, see Theorem (6.5.1), by pulling back the metric with the exponential map (see [Ban2008], Section 9, page 33) induced by the geodesics. This yields what is referred to as the metric in ‘Riemann normal coordinates’.
¹⁹ As for any open interval $S \subseteq \mathbb{R}$ and $C^1$ map $S \to U \times A : \alpha \mapsto (\gamma(\alpha), \eta(\alpha))$ we have $(\gamma, \eta)'(\alpha) = f((\gamma, \eta)(\alpha))$ if and only if $(\gamma'(\alpha), \eta'(\alpha)) = (\eta(\alpha), -\Gamma_{\gamma(\alpha)}(\eta(\alpha), \eta(\alpha)))$ if and only if $\gamma'(\alpha) = \eta(\alpha)$ and $\gamma''(\alpha) = \eta'(\alpha) = -\Gamma_{\gamma(\alpha)}(\gamma'(\alpha), \gamma'(\alpha))$.


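in particular $\gamma_u$ is a geodesic. Note that with this definition
\[ e^{\alpha f}(0, u) = (e^{\alpha f} \circ \iota)(u) = (\gamma_u(\alpha), \gamma_u'(\alpha)) \]
and $\gamma_u$ is $C^{k-1}$.

Define for $u \in A$ with $S(0, u) \neq \emptyset$ and $\beta \in K$, $\beta \neq 0$ the map $\eta : \tfrac{1}{\beta} S(0, u) \to U$ by $\eta(\alpha) := \gamma_u(\beta \alpha)$. Then $\eta'(\alpha) = \beta\, \gamma_u'(\beta \alpha)$ and $\eta''(\alpha) = \beta^2 \gamma_u''(\beta \alpha)$, so $\eta(0) = 0$, $\eta'(0) = \beta u$ and
\[
\begin{aligned}
\eta''(\alpha) + \Gamma^g_{\eta(\alpha)}(\eta'(\alpha), \eta'(\alpha)) &= \beta^2 \gamma_u''(\beta \alpha) + \Gamma^g_{\gamma_u(\beta \alpha)}(\beta\, \gamma_u'(\beta \alpha), \beta\, \gamma_u'(\beta \alpha)) \\
&= \beta^2 \cdot 0 = 0.
\end{aligned}
\]
Hence (by Theorem (5.5.10), $\gamma_{\beta u}$ is the unique maximal solution) $\eta(\alpha) = \gamma_{\beta u}(\alpha)$ and $\tfrac{1}{\beta} S(0, u) \subseteq S(0, \beta u)$. Similarly we find $\beta\, S(0, \beta u) \subseteq S(0, u)$.

So for all $u \in A$ with $S(0, u) \neq \emptyset$, $\beta \in K$, $\beta \neq 0$, and $\alpha \in \tfrac{1}{\beta} S(0, u)$ we have
\[ S(0, u) = \beta\, S(0, \beta u), \qquad \gamma_u(\beta \alpha) = \gamma_{\beta u}(\alpha). \tag{6.33} \]
Now consider the set
\[ V := \{u \in A \mid (1, 0, u) \in W\} \subseteq A. \]
Then first of all, $0 \in V$ as $\gamma_0(\alpha) = 0$ for all $\alpha \in \mathbb{R}$ satisfies Equation (6.32), and V is open because W is open. So V is an open neighbourhood of 0 in A. This permits us to define the map $\phi : V \to U$ by
\[ \phi(u) := \pi_1(e^{1 f}(0, u)) = (\pi_1 \circ e^{1 f} \circ \iota)(u). \]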

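Since f is $C^{k-1}$, $e^{\cdot f}$ is $C^{k-1}$ by Theorem (5.5.10), $\pi_1$ is $C^\infty$ as continuous linear map, so $\phi$ is $C^{k-1}$. Furthermore, for all $u \in V$, $\alpha \in K$ with $\alpha u \in V$ we have
\[ \phi(\alpha u) = \gamma_{\alpha u}(1) = \gamma_u(\alpha) \]
by Equation (6.33). As $\phi(0) = 0$ and $\phi$ is $C^2$, for any $u \in A$ by Lemma (5.1.4)
\[
\begin{aligned}
D_0\phi(u) &= \lim_{\alpha \to 0} \frac{1}{\alpha}\big(\phi(\alpha u) - \phi(0)\big) \\
&= \lim_{\alpha \to 0} \frac{1}{\alpha}\big(\gamma_u(\alpha) - \gamma_u(0)\big) \\
&= \gamma_u'(0) = u.
\end{aligned}
\]
So $\phi$ is $C^2$ and $D_0\phi = \mathrm{id}_A$ which is invertible, therefore by Theorem (5.5.8) we can shrink V to an open abc neighbourhood of 0 in A and find an open neighbourhood $U_1$ of 0 in U such that $\phi : V \to U_1$ is a $C^{k-1}$ diffeomorphism.

Let $u \in A$ and $\alpha \in K$ such that $\alpha u \in V$. Then
\[
\begin{aligned}
D_{\alpha u}\phi(u) &= \lim_{\beta \to 0} \frac{1}{\beta}\big(\phi(\alpha u + \beta u) - \phi(\alpha u)\big) \\
&= \lim_{\beta \to 0} \frac{1}{\beta}\big(\gamma_u(\alpha + \beta) - \gamma_u(\alpha)\big) \\
&= \gamma_u'(\alpha),
\end{aligned}
\]
and likewise
\[
\begin{aligned}
D^2_{\alpha u}\phi(u, u) &= \lim_{\beta \to 0} \frac{1}{\beta}\big(D_{\alpha u + \beta u}\phi(u) - D_{\alpha u}\phi(u)\big) \\
&= \lim_{\beta \to 0} \frac{1}{\beta}\big(\gamma_u'(\alpha + \beta) - \gamma_u'(\alpha)\big) \\
&= \gamma_u''(\alpha).
\end{aligned}
\]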
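The scaling relation in Equation (6.33) and the constancy of $g(\gamma', \gamma')$ along geodesics (Equation (6.23)) can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the thesis: it takes the real case $K = \mathbb{R}$, $A = \mathbb{R}^2$, and the hypothetical example metric $g_{(x,y)} = (dx^2 + dy^2)/(1+y)^2$ on $y > -1$ (the hyperbolic half-plane shifted so that the base point is 0), for which $-\Gamma_p(v, v) = \big(2 v_x v_y/(1+y),\ (v_y^2 - v_x^2)/(1+y)\big)$. The geodesic equation (6.32) is integrated with a classical Runge-Kutta scheme.

```python
def accel(p, v):
    """-Gamma_p(v, v) for the assumed metric g = (dx^2 + dy^2) / (1 + y)^2."""
    s = 1.0 / (1.0 + p[1])
    return (2.0 * s * v[0] * v[1], s * (v[1] ** 2 - v[0] ** 2))

def geodesic(u, alpha, steps=2000):
    """RK4 for gamma'' = -Gamma(gamma', gamma'), gamma(0) = 0, gamma'(0) = u.
    Returns the pair (gamma(alpha), gamma'(alpha))."""
    h = alpha / steps
    state = (0.0, 0.0, u[0], u[1])

    def rhs(s):
        a = accel((s[0], s[1]), (s[2], s[3]))
        return (s[2], s[3], a[0], a[1])

    def shift(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))

    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(shift(state, k1, h / 2))
        k3 = rhs(shift(state, k2, h / 2))
        k4 = rhs(shift(state, k3, h))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return (state[0], state[1]), (state[2], state[3])

def speed_squared(p, v):
    """g_p(v, v) for the assumed metric."""
    return (v[0] ** 2 + v[1] ** 2) / (1.0 + p[1]) ** 2
```

For example, `geodesic((0.4, 0.3), 0.5)[0]` and `geodesic((0.2, 0.15), 1.0)[0]` should agree up to rounding, which is $\gamma_u(\beta \alpha) = \gamma_{\beta u}(\alpha)$ with $\beta = 0.5$; the initial vectors must be small enough that the curve stays in $y > -1$.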

Note that as φ is C2 and g is C1, φ∗g is a C1 metric on V by Lemma (6.3.3) and Lemma (6.3.5). By Equation (6.9) we furthermore obtain for all u ∈ V that

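\[
\begin{aligned}
D_u\phi\big(\Gamma^{\phi^* g}_u(u, u)\big) &= D^2_u\phi(u, u) + \Gamma^g_{\phi(u)}(D_u\phi(u), D_u\phi(u)) \\
&= \gamma_u''(1) + \Gamma^g_{\gamma_u(1)}(\gamma_u'(1), \gamma_u'(1)) \\
&\overset{(6.32)}{=} 0.
\end{aligned}
\]
As $D_u\phi$ is bijective (because $\phi|_V$ is a diffeomorphism), we therefore find that for all $u \in V$, $\Gamma^{\phi^* g}_u(u, u) = 0$: $\phi^* g$ is a simple metric at 0.

We will now give a means by which we can determine $\phi$'s Taylor series, using Lemma (5.5.12).

Let $u \in V$, $0 \leq l < k - 1$, $v_1, \ldots, v_l \in A$. Note that as $\phi(u) = \pi_1(e^{1 f}(0, u))$, by Theorem (5.1.8) we have ($\pi_1$, $\iota$ are continuous linear)
\[
\begin{aligned}
D^l_u\phi(v_1, \ldots, v_l) &= D^l_{(0,u)}(\pi_1 \circ e^{1 f})((0, v_1), \ldots, (0, v_l)) \\
&= \pi_1\big(D^l_u(e^{1 f} \circ \iota)(v_1, \ldots, v_l)\big).
\end{aligned}
\]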

This and Lemma (5.5.12) motivate us to choose η : S(0, u) → A × A as

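\[ \eta(\alpha) := (\eta_1(\alpha), \eta_2(\alpha)) := D^l_u(e^{\alpha f} \circ \iota)(v_1, \ldots, v_l), \]
such that
\[ D^l_u\phi(v_1, \ldots, v_l) = \eta_1(1). \]
As $\eta(\alpha) = D^l_{(0,u)} e^{\alpha f}((0, v_1), \ldots, (0, v_l))$, Lemma (5.5.12) gives us
\[
\eta(0) = \begin{cases} (0, u) & l = 0 \\ (0, v_1) & l = 1 \\ (0, 0) & l > 1 \end{cases}
\qquad
\eta'(\alpha) = D^l_{(0,u)}(f \circ e^{\alpha f})((0, v_1), \ldots, (0, v_l)).
\]
With Equation (6.30) we find that for the first term $\pi_1(f(a, u)) = u$, all derivatives of higher than first order vanish and ($\pi_1$ is continuous linear) $\pi_1(D_{(a,u)}f(v, w)) = D_{(a,u)}(\pi_1 \circ f)(v, w) = w$. Hence, if we work out $\eta_1'(\alpha)$ using Theorem (5.1.8) and Equation (5.4), we find for $l > 0$ that
\[
\begin{aligned}
\eta_1'(\alpha) &= \pi_1\big(D^l_{(0,u)}(f \circ e^{\alpha f})((0, v_1), \ldots, (0, v_l))\big) \\
&= D^{l-1}_{(0,u)}\Big((a', u') \mapsto D_{e^{\alpha f}(a', u')}(\pi_1 \circ f)\big(D_{(a', u')} e^{\alpha f}(0, v_l)\big)\Big)((0, v_1), \ldots, (0, v_{l-1})) \\
&= \ldots \text{ (use Equation (5.4) repeatedly)} \\
&= D_{e^{\alpha f}(0, u)}(\pi_1 \circ f)\big(D^l_{(0,u)} e^{\alpha f}((0, v_1), \ldots, (0, v_l))\big) \\
&\quad + (\text{derivatives of } \pi_1 \circ f \text{ of order at least 2}) \\
&= D_{e^{\alpha f}(0, u)}(\pi_1 \circ f)(\eta(\alpha)) + 0 \\
&= \eta_2(\alpha),
\end{aligned}
\]
and also for $l = 0$
\[
\begin{aligned}
\eta_1'(\alpha) &= \pi_1(f(e^{\alpha f}(0, u))) \\
&= \pi_1(f(\eta(\alpha))) \\
&= \eta_2(\alpha).
\end{aligned}
\]
So $\eta_1'(\alpha) = \eta_2(\alpha)$, therefore we find that
\[
\eta_1(0) = 0, \qquad \eta_1'(0) = \eta_2(0) = \begin{cases} u & l = 0 \\ v_1 & l = 1 \\ 0 & l > 1 \end{cases},
\]
as well as
\[
\begin{aligned}
\eta_1''(\alpha) = \eta_2'(\alpha) &= \pi_2\big(D^l_{(0,u)}(f \circ e^{\alpha f})((0, v_1), \ldots, (0, v_l))\big) \\
&= D^l_u(\pi_2 \circ f \circ e^{\alpha f} \circ \iota)(v_1, \ldots, v_l).
\end{aligned}
\]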

Note that for l = 0 we precisely obtain Equation (6.21) for γu = η1. This leads us to the following theorem. Theorem 6.5.2: Derivatives of φ Let A Ba /K, U ⊆ A open. Let g : U × A × A → K be a metric on A and suppose that g ∈ Ck(U) for k ≥ 3. Then the map φ from Theorem (6.5.1) has the following property.

For all $0 \leq l < k - 1$, $u \in V$, $v_1, \ldots, v_l \in A$ the curve $\gamma_{u, v_1, \ldots, v_l} : S(0, u) \to U$ defined by
\[ \gamma_{u, v_1, \ldots, v_l}(\alpha) := \pi_1\big(D^l_u(e^{\alpha f} \circ \iota)(v_1, \ldots, v_l)\big) \]
(where f is given by Equation (6.30) and $\pi_1$, $\pi_2$, and $\iota$ by Equation (6.31)) satisfies
\[
\gamma_{u, v_1, \ldots, v_l}(0) = 0, \qquad
\gamma_{u, v_1, \ldots, v_l}'(0) = \begin{cases} u & l = 0 \\ v_1 & l = 1 \\ 0 & l > 1 \end{cases}
\]
and for all $\alpha \in S(0, u)$
\[ \gamma_{u, v_1, \ldots, v_l}''(\alpha) = D^l_u(\pi_2 \circ f \circ e^{\alpha f} \circ \iota)(v_1, \ldots, v_l). \tag{6.34} \]
These curves completely determine $\phi$ by
\[ D^l_u\phi(v_1, \ldots, v_l) = \gamma_{u, v_1, \ldots, v_l}(1). \]

Note that for $l = 0$ we obtain geodesics $\gamma_u$ as solutions, and for $l > 0$, $(v_1, \ldots, v_l) \mapsto \gamma_{u, v_1, \ldots, v_l}$ is l-linear by linearity of $\pi_1 \circ D^l_u(e^{\alpha f} \circ \iota)$.

Noting that $\gamma_0(\alpha) = 0$ for all $\alpha \in \mathbb{R}$ is a solution, we see that Equation (6.34) simplifies considerably for $u = 0$. Using this, it is straightforward to determine that $\gamma_{0, v_1}(\alpha) = \alpha\, v_1$ (which gives $D_0\phi(v_1) = v_1$), $\gamma_{0, v_1, v_2}(\alpha) = -\alpha^2\, \Gamma^g_0(v_1, v_2)$ (which gives $D^2_0\phi(v_1, v_2) = -\Gamma^g_0(v_1, v_2)$). After a slightly longer calculation, we find
\[
\begin{aligned}
\gamma_{0, v_1, v_2, v_3}(\alpha) = \tfrac{1}{3}\, \alpha^3 \Big( & 2\, \Gamma^g_0(v_1, \Gamma^g_0(v_2, v_3)) + 2\, \Gamma^g_0(v_2, \Gamma^g_0(v_3, v_1)) + 2\, \Gamma^g_0(v_3, \Gamma^g_0(v_1, v_2)) \\
& - D_0\Gamma^g(v_1)(v_2, v_3) - D_0\Gamma^g(v_2)(v_3, v_1) - D_0\Gamma^g(v_3)(v_1, v_2) \Big), \\
D^3_0\phi(v_1, v_2, v_3) = \tfrac{1}{3} \Big( & 2\, \Gamma^g_0(v_1, \Gamma^g_0(v_2, v_3)) + 2\, \Gamma^g_0(v_2, \Gamma^g_0(v_3, v_1)) + 2\, \Gamma^g_0(v_3, \Gamma^g_0(v_1, v_2)) \\
& - D_0\Gamma^g(v_1)(v_2, v_3) - D_0\Gamma^g(v_2)(v_3, v_1) - D_0\Gamma^g(v_3)(v_1, v_2) \Big),
\end{aligned}
\]
and in principle $D^l_0\phi(v_1, \ldots, v_l)$ may be calculated in this fashion up to any desired order $l < k$.²⁰
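The first two Taylor terms, $D_0\phi = \mathrm{id}_A$ and $D^2_0\phi(u, u) = -\Gamma^g_0(u, u)$, can be compared against a numerical exponential map. The sketch below is only an illustration under assumptions that are not in the thesis: $K = \mathbb{R}$, $A = \mathbb{R}^2$, and the hypothetical example metric $g_{(x,y)} = (dx^2 + dy^2)/(1+y)^2$, for which $\Gamma_0(u, u) = (-2 u_x u_y,\ u_x^2 - u_y^2)$. The map $\phi(u) = \gamma_u(1)$ is computed with a Runge-Kutta integrator and differentiated by central differences.

```python
def exp_map(u, steps=400):
    """phi(u) = gamma_u(1) for the assumed metric (dx^2 + dy^2)/(1+y)^2, via RK4."""
    def rhs(s):
        x, y, vx, vy = s
        k = 1.0 / (1.0 + y)
        # (gamma', gamma'') with gamma'' = -Gamma_gamma(gamma', gamma')
        return (vx, vy, 2.0 * k * vx * vy, k * (vy * vy - vx * vx))

    def step(s, d, c):
        return tuple(si + c * di for si, di in zip(s, d))

    h = 1.0 / steps
    s = (0.0, 0.0, u[0], u[1])
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(step(s, k1, h / 2))
        k3 = rhs(step(s, k2, h / 2))
        k4 = rhs(step(s, k3, h))
        s = tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return (s[0], s[1])

def gamma0(u, v):
    """Gamma_0(u, v) for the assumed metric, the polarisation of
    Gamma_0(u, u) = (-2 u_x u_y, u_x^2 - u_y^2)."""
    return (-(u[0] * v[1] + u[1] * v[0]), u[0] * v[0] - u[1] * v[1])

def taylor_data(u, eps=1e-2):
    """Central-difference estimates of D_0 phi(u) and D_0^2 phi(u, u),
    using phi(0) = 0."""
    p = exp_map((eps * u[0], eps * u[1]))
    m = exp_map((-eps * u[0], -eps * u[1]))
    first = tuple((a - b) / (2 * eps) for a, b in zip(p, m))
    second = tuple((a + b) / eps ** 2 for a, b in zip(p, m))
    return first, second
```

With `first, second = taylor_data((0.4, 0.3))`, `first` should be close to `(0.4, 0.3)` and `second` close to the negative of `gamma0((0.4, 0.3), (0.4, 0.3))`, with a discrepancy of order `eps ** 2`.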

6.6 Digression (cont’d)

We can now combine Theorem (6.4.6) and Theorem (6.5.1) into one theorem.

Theorem 6.6.1 Let A, B Ba/K, $U \subseteq A$, $V \subseteq B$ open. Let $k \in \mathbb{N}$, and $g : U \times A \times A \to K$, $g \in C^{k+2}(U)$, $h : V \times B \times B \to K$, $h \in C^{k+2}(V)$ be metrics on A and B respectively. Let $a \in U$, $b \in V$, and suppose that U and V are $C^{k+3}$ diffeomorphic.
Then the following two statements are equivalent.

• There exists an open abc neighbourhood W of 0 in A and $C^{k+1}$ diffeomorphisms $\phi : W \to U_1$, $\chi : W \to V_1$, where $a \in U_1 \subseteq U$, $b \in V_1 \subseteq V$ and $U_1$, $V_1$ are open, such that for all $0 \leq l < k$ we have
\[ D^l_0(\phi^* g) = D^l_0(\chi^* h) \tag{6.35} \]

and φ∗g and χ∗h are simple metrics at 0.

²⁰ The number of terms arising from the derivative in Equation (6.34) will rapidly increase however.


• There exists a map $f : A \to B$, continuous linear and bijective, such that for all $0 \leq l < k - 2$ we have for all $u_1, \ldots, u_{l+4} \in A$ that
\[
\begin{aligned}
g_a(u_1, u_2) &= h_b(f(u_1), f(u_2)) \\
\nabla^l R^g_a(u_1, \ldots, u_{l+4}) &= \nabla^l R^h_b(f(u_1), \ldots, f(u_{l+4})).
\end{aligned}
\tag{6.36}
\]

Proof. Let $a \in U$, $b \in V$, and let $\psi : U \to V$ be a $C^{k+3}$ diffeomorphism. Then by Lemma (6.3.3) and Lemma (6.3.5), $\psi^* h : U \times A \times A \to K$ is a $C^{k+2}$ metric on A.

Suppose that for $f : A \to B$ continuous linear and bijective, f, g, and h satisfy Equation (6.36). As A Ba, g and $\psi^* h$ both satisfy the conditions of Theorem (6.5.1) at the points a and $\psi^{-1}(b)$ respectively. Hence there exist open neighbourhoods $U_1$ and $U_2$ of a and $\psi^{-1}(b)$ in U, open abc neighbourhoods $W_1$ and $W_2$ of 0 in A, and $C^{k+2-1}$ diffeomorphisms $\phi : W_1 \to U_1$, $\omega : W_2 \to U_2$ such that $\phi^* g$ and $\omega^*(\psi^* h)$ are both simple metrics at 0, $\phi(0) = a$, $\omega(0) = \psi^{-1}(b)$, and $D_0\phi = D_0\omega = \mathrm{id}_A$.

Let $V_1 := \psi(U_2) \subseteq \psi(U) = V$, then $V_1$ is open since $\psi^{-1}$ is continuous. Note that by Lemma (6.3.4), $\omega^*(\psi^* h) = (\psi \circ \omega)^* h$.

Now $(\psi \circ \omega)(0) = \psi(\psi^{-1}(b)) = b$ and by Theorem (5.1.8) $D_0(\psi \circ \omega) = D_{\omega(0)}\psi \circ D_0\omega = D_{\psi^{-1}(b)}\psi$. Hence by Equation (6.16)
\[ ((\psi \circ \omega)^* h)_0(u, v) = h_b\big(D_{\psi^{-1}(b)}\psi(u), D_{\psi^{-1}(b)}\psi(v)\big) \]
for all $u, v \in A$. Note that by Theorem (5.1.8) and Theorem (4.4.3), $f : A \to B$ is a $C^\infty$ diffeomorphism, similarly $D_{\psi^{-1}(b)}\psi : A \to B$ is a $C^\infty$ diffeomorphism with inverse (use Lemma (5.1.15)) $D_b\psi^{-1} : B \to A$. Because we would like to use Theorem (6.4.6) we consider the map
\[ \chi := \psi \circ \omega \circ D_b\psi^{-1} \circ f : (D_b\psi^{-1} \circ f)^{-1}(W_2) \to V_1, \]
which is a $C^{k+1}$ diffeomorphism. Then $\chi(0) = (\psi \circ \omega)(0) = b$ and $D_0\chi = D_{\psi^{-1}(b)}\psi \circ D_b\psi^{-1} \circ f = f$ by Theorem (5.1.8) and Lemma (5.1.15).

Restrict both $\phi$ and $\chi$ to $W := W_1 \cap (D_b\psi^{-1} \circ f)^{-1}(W_2)$ which is an open abc neighbourhood of 0 in A by Lemma (4.3.5) (as f and $D_b\psi^{-1}$ are continuous linear and bijective), and shrink $U_1$ and $V_1$ to $\phi(W)$ and $\chi(W)$ respectively. By Lemma (6.4.4), both $\phi^* g$ and $\chi^* h$ are simple metrics at 0 (since $(\psi \circ \omega)^* h$ is a simple metric at 0 and $D^2_{a_1}(D_b\psi^{-1} \circ f) = 0$ for all $a_1 \in A$).

Then by Lemma (6.3.4) and our assumption on f

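\[
\begin{aligned}
(\phi^* g)_0(u, v) &= g_{\phi(0)}(D_0\phi(u), D_0\phi(v)) \\
&= g_a(u, v) \\
&\overset{(6.36)}{=} h_b(f(u), f(v)) \\
&= h_{\chi(0)}(D_0\chi(u), D_0\chi(v)) \\
&= (\chi^* h)_0(u, v)
\end{aligned}
\]
for all $u, v \in A$.

By Theorem (6.2.3) we know that for all $0 \leq l < k - 2$, $u_1, \ldots, u_{l+4} \in A$ we have
\[
\begin{aligned}
\nabla^l R^{\phi^* g}_0(u_1, \ldots, u_{l+4}) &= \nabla^l R^g_{\phi(0)}(D_0\phi(u_1), \ldots, D_0\phi(u_{l+4})) \\
&= \nabla^l R^g_a(u_1, \ldots, u_{l+4}) \\
&\overset{(6.36)}{=} \nabla^l R^h_b(f(u_1), \ldots, f(u_{l+4})) \\
&= \nabla^l R^h_{\chi(0)}(D_0\chi(u_1), \ldots, D_0\chi(u_{l+4})) \\
&= \nabla^l R^{\chi^* h}_0(u_1, \ldots, u_{l+4}).
\end{aligned}
\]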

Now we can use Theorem (6.4.6) and the fact that $\phi^* g$ and $\chi^* h$ are $C^k$ metrics by Lemma (6.3.3) to conclude that for all $0 \leq l < k$, we have that $D^l_0(\phi^* g) = D^l_0(\chi^* h)$: the maps $\phi$ and $\chi$ satisfy the conditions of the first item.

Suppose conversely that two $C^{k+1}$ diffeomorphisms $\phi$ and $\chi$ satisfy all conditions of the first item. Then by Theorem (6.4.6) we have that $(\phi^* g)_0(u, v) = (\chi^* h)_0(u, v)$ and for all $0 \leq l < k - 2$ and $u_1, \ldots, u_{l+4} \in A$, $\nabla^l R^{\phi^* g}_0(u_1, \ldots, u_{l+4}) = \nabla^l R^{\chi^* h}_0(u_1, \ldots, u_{l+4})$. We therefore pick
\[ f := D_0\chi \circ [D_0\phi]^{-1} : A \to B \]
which is continuous linear and bijective by Lemma (5.1.15). Then by Theorem (6.2.3) and Equation (6.16), for all $0 \leq l < k - 2$, $u_1, \ldots, u_{l+4} \in A$ we have
\[
\begin{aligned}
\nabla^l R^g_a(u_1, \ldots, u_{l+4}) &= \nabla^l R^{\phi^* g}_0([D_0\phi]^{-1}(u_1), \ldots, [D_0\phi]^{-1}(u_{l+4})) \\
&= \nabla^l R^{\chi^* h}_0([D_0\phi]^{-1}(u_1), \ldots, [D_0\phi]^{-1}(u_{l+4})) \\
&= \nabla^l R^h_{\chi(0)}(D_0\chi([D_0\phi]^{-1}(u_1)), \ldots, D_0\chi([D_0\phi]^{-1}(u_{l+4}))) \\
&= \nabla^l R^h_b(f(u_1), \ldots, f(u_{l+4})).
\end{aligned}
\]
Similarly
\[
\begin{aligned}
g_a(u, v) &= (\phi^* g)_0([D_0\phi]^{-1}(u), [D_0\phi]^{-1}(v)) \\
&= (\chi^* h)_0([D_0\phi]^{-1}(u), [D_0\phi]^{-1}(v)) \\
&= h_b(f(u), f(v)).
\end{aligned}
\]

So f satisfies Equation (6.36) and the second item is satisfied. Theorem (6.6.1) is the partial converse to Christoffel’s result that was men- tioned in the introduction. It is a converse in the sense that while Christoffel showed in [Chr1869] that two metrics g and h related via a diffeomorphism21 give a chain of relations between Rg and Rh, ∇Rg and ∇Rh, . . . in the form of Equation (6.14) (Theorem (6.2.3)), we obtain in Theorem (6.6.1) from a chain of relations in the form of Equation (6.14), equality of the derivatives of simple forms of g and h. Of course Theorem (6.2.3) immediately gives the relations for all points in U and V (as all these points can be related by the diffeomorphism which relates g and h), while Theorem (6.6.1) only gives the ‘converse’ at two selected points; we have no diffeomorphism which tells us what points a and b

21Like in Equation (6.2).

- 152 - 6.6. DIGRESSION (CONT’D) to compare to one another. Nevertheless, it is striking that we can still recover such an amount of information about the behaviour of g from Rg, ∇Rg,.... This is also as far as we will get in this thesis to attaining the goal stated at the beginning of Section 6.2.

We now return to discuss the questions raised about the action of $G^k_U$ on $M^l_U$ for $k > l$, given by Equation (6.20), to look at Theorem (6.4.6) and Theorem (6.5.1) from a different point of view.

For convenience, let $\mathcal{U} := \{U \subseteq A \mid U \text{ open}, 0 \in U\}$ be the collection of open neighbourhoods of 0. It turns out to be inconvenient to restrict oneself to diffeomorphisms $U \to U$; we therefore consider the groupoid²² of $C^k$ diffeomorphisms that leave 0 fixed, defined for $k \in \mathbb{N}$ by
\[ G^k := \{\phi : U_1 \to U_2 \mid \phi \ C^k \text{ diffeomorphism}, \phi(0) = 0, U_1, U_2 \in \mathcal{U}\}. \]
Multiplication is defined as $\phi\, \chi := \phi \circ \chi$ whenever $\mathrm{im}(\chi) = \mathrm{dom}(\phi)$ and inversion by taking the inverse (as a function) of the diffeomorphism; this makes $G^k$ a groupoid. We also define
\[ M^k := \bigcup_{U \in \mathcal{U}} M^k_U \]
as the collection of all $C^k$ metrics defined on an open neighbourhood of 0.

Now $G^k$ has the same ‘action’²³ on $M^k$ as defined in Equation (6.20): for $k > l$, $\phi \in G^k$, $g \in M^l$, $g \in C^l(U)$ we define $\phi \cdot g := \phi_* g$ whenever $\mathrm{dom}(\phi) = U$. Note that this agrees with the groupoid multiplication: if $\phi \cdot g$ is defined, then $\chi\, \phi$ is defined if and only if $\mathrm{im}(\phi) = \mathrm{dom}(\chi)$ if and only if $\chi \cdot (\phi \cdot g)$ is defined. Furthermore, by Lemma (6.3.4), if this is the case then $(\chi\, \phi) \cdot g = \chi \cdot (\phi \cdot g)$.

Let us now formalise the ‘considering of the Taylor series’ we did at the end of Section 6.3. Let A Ba/K.

First we need a container for Taylor sequences of diffeomorphisms $\phi \in G^k$ for $k \in \mathbb{N}$.

Define $G^{k,1}_0$ to be the collection of all $f : A \to A$ continuous linear over K that are bijective. For $l > 1$ we define $G^{k,l}_0$ to be the collection of all $f : A^l \to A$ continuous l-linear over K that are symmetric: $f(a_1, \ldots, a_l) = f(a_{\pi(1)}, \ldots, a_{\pi(l)})$ for all $\pi \in S^l$ and $a_1, \ldots, a_l \in A$. Then we define
\[ G^k_0 := \bigcup_{l=1}^{k} G^{k,l}_0. \]
By Theorem (5.1.16) and Lemma (5.1.15) we can for all $k \in \mathbb{N}$ create a well-defined mapping (recall that for $\phi \in G^k$ we have $\phi(0) = 0$, so we discard the first term of $\phi$’s Taylor series)
\[ \rho^k : G^k \to G^k_0 : \phi \mapsto (D_0\phi, D^2_0\phi, \ldots, D^k_0\phi). \]

²² A groupoid is a group in the usual sense of the word, except that the product need not be defined for all pairs of elements of the groupoid.
²³ Just as the multiplication of the groupoid, this is the same as a group action which is not defined for all elements of the groupoid and the set on which it acts.


It is clear that $\rho^k$ is not injective for any $k \in \mathbb{N}$ (by considering diffeomorphisms with different $(k+1)$-th derivative), which is not surprising since we discard a lot of information by only looking at the Taylor series.

Let $(f_1, f_2, \ldots, f_k) \in G^k_0$ be given. Define $\phi : A \to A : a \mapsto f_1(a) + \tfrac{1}{2!} f_2(a, a) + \ldots + \tfrac{1}{k!} f_k(a, \ldots, a)$, where the factors $\tfrac{1}{l!}$ compensate for the factor $l!$ that differentiating a symmetric l-linear map l times along the diagonal produces. Then $\phi$ is $C^\infty$ (as all the $f_l$ are $C^\infty$ because they are continuous l-linear), $\phi(0) = 0$, and $D_0\phi = f_1$. By definition of $G^{k,1}_0$ this makes $D_0\phi$ continuous linear and bijective, so by Theorem (5.5.8) there exist open neighbourhoods $U_1$ and $U_2$ of 0 and $\phi(0) = 0$ in A such that $\phi|_{U_1} : U_1 \to U_2$ is a $C^\infty$ diffeomorphism. Hence $\phi|_{U_1} \in G^k$ and because all $f_l$ are symmetric, $\rho^k(\phi|_{U_1}) = (f_1, f_2, \ldots, f_k)$. Therefore $\rho^k$ is surjective.

We endow $G^k_0$ with a multiplication map $G^k_0 \times G^k_0 \to G^k_0$ (using surjectivity of $\rho^k$ to write all $f_l$ as the $D^l_0\phi$ of a certain $\phi$)
\[
\begin{aligned}
(D_0\chi, D^2_0\chi, \ldots)\,(D_0\phi, D^2_0\phi, \ldots) := \Big( & u_1 \mapsto D_0\chi(D_0\phi(u_1)), \\
& (u_1, u_2) \mapsto D^2_0\chi(D_0\phi(u_1), D_0\phi(u_2)) + D_0\chi(D^2_0\phi(u_1, u_2)), \ldots \Big)
\end{aligned}
\]
given by repeated use of Theorem (5.1.8) and Equation (5.4) to write out $D^l_0(\chi \circ \phi)$ for $1 \leq l \leq k$.²⁴ This definition ensures that $\rho^k(\chi\, \phi) = \rho^k(\chi)\, \rho^k(\phi)$ whenever $\chi\, \phi$ is well-defined.
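The second entry of this product is the chain rule for second derivatives, which is easy to check exactly in one dimension. The sketch below is an illustration under assumptions not made in the thesis: $A = K = \mathbb{Q}$, maps are represented by hypothetical coefficient lists `[c0, c1, c2, ...]` of polynomials with `c0 = 0`, and a 2-jet is the pair $(D_0\phi, D^2_0\phi) = (c_1, 2 c_2)$.

```python
from fractions import Fraction

def jet_mul(chi, phi):
    """Product of 2-jets (D_0, D_0^2) in one dimension, following the text:
    (chi1, chi2)(phi1, phi2) = (chi1*phi1, chi2*phi1^2 + chi1*phi2)."""
    return (chi[0] * phi[0], chi[1] * phi[0] ** 2 + chi[0] * phi[1])

def poly_mul(p, q, deg):
    """Product of coefficient lists p and q, truncated after degree deg."""
    out = [Fraction(0)] * (deg + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= deg:
                out[i + j] += a * b
    return out

def poly_compose(chi, phi, deg):
    """Coefficients of chi(phi(a)) up to a^deg, by Horner's scheme."""
    out = [Fraction(0)] * (deg + 1)
    for c in reversed(chi):
        out = poly_mul(out, phi, deg)
        out[0] += c
    return out

def jet_of(poly):
    """2-jet at 0 of a polynomial: derivatives are l! times coefficients."""
    return (poly[1], 2 * poly[2])
```

For instance, with `chi = [0, 2, Fraction(3, 2), 5]` and `phi = [0, Fraction(1, 3), -1, 7]`, both `jet_of(poly_compose(chi, phi, 2))` and `jet_mul(jet_of(chi), jet_of(phi))` should give the same pair, independently of the cubic coefficients.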

Note that $G^1_0$ is precisely the collection of all continuous linear bijective mappings $A \to A$, which is a group under composition and inversion of mappings. Let for $k \in \mathbb{N}$
\[ \pi^k_1 : G^k_0 \to G^1_0 : (D_0\phi, D^2_0\phi, \ldots, D^k_0\phi) \mapsto D_0\phi \]
which is clearly surjective and not injective for $k > 1$. Then this map satisfies
\[
\begin{aligned}
\pi^k_1\big((D_0\chi, D^2_0\chi, \ldots, D^k_0\chi)\,(D_0\phi, D^2_0\phi, \ldots, D^k_0\phi)\big) &= \pi^k_1(D_0\chi \circ D_0\phi, \ldots) \\
&= D_0\chi \circ D_0\phi \\
&= D_0\chi\; D_0\phi
\end{aligned}
\]
and is therefore compatible with multiplication on $G^k_0$. Note that in particular $(\pi^k_1 \circ \rho^k)(\chi\, \phi) = (\pi^k_1 \circ \rho^k)(\chi)\, (\pi^k_1 \circ \rho^k)(\phi)$ whenever $\chi\, \phi$ is defined in $G^k$.

We will now construct a similar set for $M^k$, $k \in \mathbb{N}$.

Define $M^{k,0}_0$ to be the collection of all $f : A^2 \to K$ continuous 2-linear over K that are symmetric, $f(a_1, a_2) = f(a_2, a_1)$, and satisfy that $A \to A' : a_1 \mapsto (a_2 \mapsto f(a_1, a_2))$ is bijective (recall Definition (6.1.2)). For $l > 0$ we define $M^{k,l}_0$ to consist of all $f : A^{l+2} \to K$ continuous (l+2)-linear over K that are symmetric with respect to the first l and last 2 variables: $f(a_1, \ldots, a_l, a_{l+1}, a_{l+2}) = f(a_1, \ldots, a_l, a_{l+2}, a_{l+1}) = f(a_{\pi(1)}, \ldots, a_{\pi(l)}, a_{l+1}, a_{l+2})$ for all $\pi \in S^l$ and $a_1, \ldots, a_{l+2} \in A$. Then we define
\[ M^k_0 := \bigcup_{l=0}^{k} M^{k,l}_0. \]

²⁴ Note that an inversion map is not easily obtained in $G^k_0$, so we will not consider this set as a group, just as a set with a multiplication map $G^k_0 \times G^k_0 \to G^k_0$ which is associative, because $D^l_0((\psi \circ \chi) \circ \phi) = D^l_0(\psi \circ (\chi \circ \phi)) = D^l_0(\psi \circ \chi \circ \phi)$.


Definition (6.1.2) and Theorem (5.1.16) ensure that for all k ∈ N we can create a well-defined mapping

\[ \sigma^k : M^k \to M^k_0 : g \mapsto (g_0, D_0 g, \ldots, D^k_0 g). \]

As with $\rho^k$, $\sigma^k$ is not injective.

Let $(f_0, f_1, \ldots, f_k) \in M^k_0$ be given. Define $g : A \times A \times A \to K : (a, u, v) \mapsto f_0(u, v) + f_1(a, u, v) + \ldots + \tfrac{1}{k!} f_k(a, \ldots, a, u, v)$, where the factors $\tfrac{1}{l!}$ ensure that $D^l_0 g = f_l$. Then g is a symmetric 2-tensor, $g \in C^\infty(A)$, and $\hat{g}_0 = \big(u \mapsto (v \mapsto f_0(u, v))\big)$ which is continuous linear over K and bijective $A \to A'$ by the fact that $f_0 \in M^{k,0}_0$. In particular by Corollary (4.4.6), $A \simeq A'$ via $\hat{g}_0$, so $A'$ Ba/K. This gives us that $\hat{g}_0 \in L(A, A')^*$, so by Theorem (5.5.6) and the fact that $A \to L(A, A') : a \mapsto \hat{g}_a$ is continuous (as g is $C^\infty$, use Corollary (5.5.7)) we obtain that there is an open neighbourhood U of 0 in A such that for all $a \in U$, $\hat{g}_a \in L(A, A')^*$. Then $\hat{g}^{-1}$ is continuous, because inversion is continuous by Theorem (5.5.6). Therefore $g|_{U \times A \times A}$ is a metric and hence $g|_{U \times A \times A} \in M^k$. Furthermore $\sigma^k(g|_{U \times A \times A}) = (f_0, f_1, \ldots, f_k)$, so $\sigma^k$ is surjective.

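For all $k, l \in \mathbb{N}$, $k > l$, $G^k_0$ has a natural ‘action’²⁵ on $M^l_0$ given by (use surjectivity of $\rho^k$ and $\sigma^l$)
\[
\begin{aligned}
& (D_0\phi, D^2_0\phi, \ldots, D^k_0\phi) \cdot (g_0, D_0 g, \ldots, D^l_0 g) \\
& \quad := \Big( (u_1, u_2) \mapsto g_0(D_0\phi^{-1}(u_1), D_0\phi^{-1}(u_2)), \\
& \qquad\quad (u_1, u_2, u_3) \mapsto D_0 g(D_0\phi^{-1}(u_3))(D_0\phi^{-1}(u_1), D_0\phi^{-1}(u_2)) \\
& \qquad\qquad + g_0(D^2_0\phi^{-1}(u_1, u_3), D_0\phi^{-1}(u_2)) + g_0(D_0\phi^{-1}(u_1), D^2_0\phi^{-1}(u_2, u_3)), \ldots \Big),
\end{aligned}
\]
where all higher order terms can be obtained by writing out $D^m_0(\phi_* g)$ for $0 \leq m \leq l$ using Equation (5.4) and Equation (6.17). Note that with this definition, for $\phi \in G^k$ and $g \in M^l$ for which $\phi \cdot g$ is defined, we have
\[ \sigma^l(\phi \cdot g) = \sigma^l(\phi_* g) = \rho^k(\phi) \cdot \sigma^l(g). \]
So in particular if $\chi\, \phi$ is defined in $G^k$, $\rho^k(\chi) \cdot (\rho^k(\phi) \cdot \sigma^l(g)) = \rho^k(\chi) \cdot \sigma^l(\phi_* g) = \sigma^l(\chi_*(\phi_* g)) = \sigma^l((\chi \circ \phi)_* g) = \rho^k(\chi\, \phi) \cdot \sigma^l(g)$.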
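In one dimension the first two entries of this action reduce to $g_0\, \psi_1^2$ and $g_1\, \psi_1^3 + 2\, g_0\, \psi_1 \psi_2$, where $\psi = \phi^{-1}$, $\psi_1 = D_0\psi$, $\psi_2 = D^2_0\psi$, $g_1 = D_0 g$, and all test vectors are set to 1. The sketch below compares this with a direct finite-difference derivative of the pushforward $(\phi_* g)(b) = g(\psi(b))\, \psi'(b)^2$. The concrete choices $\phi(a) = a/(1-a)$ and $g(a) = 1 + 3a$ are assumptions made purely for illustration, and the function names are hypothetical.

```python
def pushforward_jet(phi_inv_jet, g_jet):
    """First two entries of the action in one dimension (u1 = u2 = u3 = 1)."""
    psi1, psi2 = phi_inv_jet          # D_0(phi^-1), D_0^2(phi^-1)
    g0, g1 = g_jet                    # g_0, D_0 g
    return (g0 * psi1 ** 2, g1 * psi1 ** 3 + 2 * g0 * psi1 * psi2)

def pushforward_metric(b):
    """(phi_* g)(b) = g(psi(b)) * psi'(b)^2 for the assumed phi(a) = a/(1-a)
    and g(a) = 1 + 3a, where psi(b) = b/(1+b) is the inverse of phi."""
    psi = b / (1 + b)
    dpsi = 1 / (1 + b) ** 2
    return (1 + 3 * psi) * dpsi ** 2

def central_difference(fn, x, h=1e-5):
    """Symmetric difference quotient approximating fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)
```

Here $\psi_1 = 1$ and $\psi_2 = -2$, so `pushforward_jet((1, -2), (1, 3))` can be compared with `pushforward_metric(0)` and `central_difference(pushforward_metric, 0.0)`.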

Now we will create a new container for sequences of the form $(g_0, R^g_0, \nabla R^g_0, \ldots)$. Let $k \in \mathbb{N}$.

Define $R^{k,0}_0$ to be the collection of all $f : A^4 \to K$ continuous 4-linear over K satisfying $f(a_1, a_2, a_3, a_4) = -f(a_2, a_1, a_3, a_4)$, $f(a_1, a_2, a_3, a_4) = -f(a_1, a_2, a_4, a_3)$, $f(a_1, a_2, a_3, a_4) = f(a_3, a_4, a_1, a_2)$, and $f(a_1, a_2, a_3, a_4) + f(a_1, a_4, a_2, a_3) + f(a_1, a_3, a_4, a_2) = 0$. For $l > 0$ define $R^{k,l}_0$ to consist of all $f : A^{l+4} \to K$ continuous (l+4)-linear over K that satisfy the same relations for their last four variables as the maps in $R^{k,0}_0$. Then we define
\[ R^k_0 := M^{k,0}_0 \cup \bigcup_{l=0}^{k-2} R^{k,l}_0. \]

²⁵ Not truly a group action, as $G^k_0$ is not a group, but really a pairing $G^k_0 \times M^l_0 \to M^l_0 : (a, b) \mapsto a \cdot b$ which satisfies $a \cdot (b \cdot c) = (a\, b) \cdot c$.


Between $M^k_0$ and $R^k_0$ we can define the Christoffel map (which is well-defined by Equation (6.13), of which the relations are preserved by Equation (6.15))
\[ \mathrm{Chri}^k : M^k_0 \to R^k_0 : (g_0, D_0 g, D^2_0 g, \ldots, D^k_0 g) \mapsto (g_0, R^g_0, \nabla R^g_0, \ldots, \nabla^{k-2} R^g_0), \tag{6.37} \]
where we calculate $R^g_0$ with Equation (6.11) and $\Gamma^g_0$ using Equation (6.4) and Equation (6.7). Terms $\nabla^l R^g_0$ for $l \geq 1$ may inductively be calculated using Equation (6.15) and expressed entirely in terms of $D^m_0 g$, where $D^l R^g_0$ is calculated with Theorem (5.1.8), Equation (5.4), and Equation (6.1) (which gives $D^l \Gamma^g_0$ in terms of derivatives $D^m_0 g$ and $\Gamma^g_0$ itself).

The group $G^1_0$ of all continuous linear bijective mappings $A \to A$ has a natural action on $R^k_0$ given for all $\phi \in G^1_0$ and $(f_2, f_4, \ldots, f_{k+2}) \in R^k_0$ by

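\[
\begin{aligned}
\phi \cdot (f_2, f_4, f_5, \ldots, f_{k+2}) := \Big( & (u_1, u_2) \mapsto f_2(\phi^{-1}(u_1), \phi^{-1}(u_2)), \\
& (u_1, u_2, u_3, u_4) \mapsto f_4(\phi^{-1}(u_1), \phi^{-1}(u_2), \phi^{-1}(u_3), \phi^{-1}(u_4)), \\
& \ldots, \\
& (u_1, \ldots, u_{k+2}) \mapsto f_{k+2}(\phi^{-1}(u_1), \ldots, \phi^{-1}(u_{k+2})) \Big).
\end{aligned}
\]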

It can directly be verified that χ·(φ·(f2, f4, . . . , fk+2)) = (χ φ)·(f2, f4, . . . , fk+2) and idA ·(f2, f4, . . . , fk+2) = (f2, f4, . . . , fk+2), so this really is a group action. Note that for any φ ∈ Gk and g ∈ Ml, k > l, φ · g defined, we have with this definition that (using Lemma (5.1.15) and Theorem (6.2.3))

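$$(\pi^k_1 \circ \rho^k)(\varphi) \cdot (\mathrm{Chri}^l \circ \sigma^l)(g) = D_0\varphi \cdot (g_0, R^g_0, \nabla R^g_0, \ldots)$$
$$= \Big( (u_1, u_2) \mapsto g_0([D_0\varphi]^{-1}(u_1), [D_0\varphi]^{-1}(u_2)), \ (u_1, \ldots, u_4) \mapsto R^g_0([D_0\varphi]^{-1}(u_1), \ldots, [D_0\varphi]^{-1}(u_4)), \ \ldots \Big)$$
$$= \Big( (u_1, u_2) \mapsto g_{\varphi^{-1}(0)}(D_0\varphi^{-1}(u_1), D_0\varphi^{-1}(u_2)), \ (u_1, \ldots, u_4) \mapsto R^g_{\varphi^{-1}(0)}(D_0\varphi^{-1}(u_1), \ldots, D_0\varphi^{-1}(u_4)), \ \ldots \Big)$$
$$= \big( (\varphi_* g)_0, R^{\varphi_* g}_0, \ldots \big) = \mathrm{Chri}^l(\sigma^l(\varphi_* g)) = (\mathrm{Chri}^l \circ \sigma^l)(\varphi \cdot g).$$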

This shows us that the action of $G^1_0$ on $R^l_0$ is compatible with the earlier actions we established.
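The two group-action axioms stated above can be checked numerically in the simplest case. The following is only a minimal sketch under assumptions not taken from the thesis: $A = \mathbb{R}^2$, the bilinear component $f_2$ is stored as a $2 \times 2$ matrix, and the names `act`, `phi`, `chi`, and `f2` are hypothetical. In this representation $(\varphi \cdot f_2)(u_1, u_2) = f_2(\varphi^{-1} u_1, \varphi^{-1} u_2)$ becomes $(\varphi^{-1})^T f_2\, \varphi^{-1}$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inverse(a):
    """Exact inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    d = Fraction(1) / det
    return [[a[1][1] * d, -a[0][1] * d], [-a[1][0] * d, a[0][0] * d]]


def act(phi, form):
    """Matrix of (u1, u2) -> form(phi^{-1} u1, phi^{-1} u2)."""
    inv = inverse(phi)
    return matmul(transpose(inv), matmul(form, inv))


# Hypothetical sample data: two invertible linear maps and a symmetric form.
phi = [[2, 1], [1, 1]]
chi = [[1, 3], [0, 1]]
f2 = [[1, 2], [2, 5]]
identity = [[1, 0], [0, 1]]

lhs = act(chi, act(phi, f2))      # chi . (phi . f2)
rhs = act(matmul(chi, phi), f2)   # (chi phi) . f2
print(lhs == rhs, act(identity, f2) == f2)
```

Pulling back along the inverse is what makes this a left action; using $\varphi$ itself instead of $\varphi^{-1}$ would reverse the order of composition.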


To summarise all of the above, for k, l ∈ N, k > l, we have

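$$\begin{array}{ccccc}
\varphi & \longmapsto & (D_0\varphi, D_0^2\varphi, \ldots) & \longmapsto & D_0\varphi \\[2pt]
G^k & \overset{\rho^k}{\twoheadrightarrow} & G^k_0 & \overset{\pi^k_1}{\twoheadrightarrow} & G^1_0 \\
\vdots & & \vdots & & \vdots \\
M^l & \overset{\sigma^l}{\twoheadrightarrow} & M^l_0 & \overset{\mathrm{Chri}^l}{\longrightarrow} & R^l_0 \\[2pt]
g & \longmapsto & (g_0, D_0 g, \ldots) & \longmapsto & (g_0, R^g_0, \ldots)
\end{array} \tag{6.38}$$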

Here the dotted arrows denote the actions induced by the pushforward (φ, g) 7→ φ∗g. Furthermore, the actions and multiplication maps in the diagram are compatible whenever they are defined:

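$$\rho^k(\chi\,\varphi) = \rho^k(\chi)\,\rho^k(\varphi), \qquad \sigma^l(\varphi \cdot g) = \rho^k(\varphi) \cdot \sigma^l(g),$$
$$(\pi^k_1 \circ \rho^k)(\chi\,\varphi) = (\pi^k_1 \circ \rho^k)(\chi)\,(\pi^k_1 \circ \rho^k)(\varphi),$$
$$(\mathrm{Chri}^l \circ \sigma^l)(\varphi \cdot g) = (\pi^k_1 \circ \rho^k)(\varphi) \cdot (\mathrm{Chri}^l \circ \sigma^l)(g). \tag{6.39}$$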

However, we still have not used Theorem (6.6.1). We therefore define the collection of $C^k$ metrics that are simple at 0 by

$$S^k := \{ g \in M^k \mid g \text{ is a simple metric at } 0 \} \subseteq M^k.$$

Define for $k, l \in \mathbb{N}$ furthermore $S^{k,0}_0 := M^{k,0}_0$, $S^{k,1}_0 := \{0\}$ as the set containing only the zero mapping $0 : A^3 \to A$ (inspired by Equation (6.26)), and for $l > 1$, $S^{k,l}_0$ to consist of all $f \in M^{k,l}_0$ which in addition satisfy (Equation (6.27))

$$f(a_1, a_2, \ldots, a_l, a_{l+1}, a_{l+2}) + f(a_{l+1}, a_1, \ldots, a_{l-1}, a_l, a_{l+2}) + \ldots + f(a_2, a_3, \ldots, a_{l+1}, a_1, a_{l+2}) = 0$$

for all $a_1, \ldots, a_{l+2} \in A$. Then we define

$$S^k_0 := \bigcup_{l=0}^{k} S^{k,l}_0 \subseteq M^k_0.$$

By Lemma (6.4.4) we have that for any $g \in S^{k+1}$, $\sigma^k(g) \in S^k_0$.$^{26}$ We see from the derivation of Equation (6.29) that $\mathrm{Chri}^k$ is injective when restricted to $S^k_0$: all $(g_0, D_0 g, \ldots, D_0^k g) \in S^k_0 \subseteq M^k_0$ satisfy Equation (6.26) and Equation (6.27), which were all that was necessary to derive Equation (6.29). So Equation (6.29) and Theorem (6.4.6) can in a sense be seen as expressions of the fact that the map $\mathrm{Chri}^k$ is injective when restricted to $S^k_0$.

26 It is unclear whether or not $\sigma^k(S^{k+1}) = S^k_0$ however.


Whether or not $\mathrm{Chri}^k$ is surjective is a harder problem: Equation (6.29) gives us a way to construct a Taylor sequence of a metric up to order $k$, given a sequence $(f_2, f_4, f_5, \ldots, f_{k+2}) \in R^k_0$, but whether or not this Taylor sequence again yields $(f_2, f_4, f_5, \ldots, f_{k+2})$ under $\mathrm{Chri}^k$ is unclear, because Equation (6.29) and $\mathrm{Chri}^k$ both become very complicated at higher orders.

By Lemma (6.4.4) we furthermore know that for any $\varphi \in G^1_0$ and $g \in S^k$, we have that $\varphi \cdot g \in S^k$ if $k \geq 2$, after suitably restricting $\varphi$. Likewise, all $\varphi \in G^1_0$ preserve the relations of Equation (6.26) and Equation (6.27), so $\varphi \cdot S^k_0 \subseteq S^k_0$. Hence for all $k \in \mathbb{N}$

$$\begin{array}{ccccc}
G^1_0 & \xrightarrow{\ \sim\ } & G^1_0 & \xrightarrow{\ \sim\ } & G^1_0 \\
\vdots & & \vdots & & \vdots \\
S^{k+1} & \overset{\sigma^k}{\longrightarrow} & S^k_0 & \overset{\mathrm{Chri}^k}{\hookrightarrow} & R^k_0.
\end{array} \tag{6.40}$$

Theorem (6.5.1) states that for $k \geq 3$, and any $g \in M^k$ there exists a $\varphi \in G^{k-1}$ such that $\varphi \cdot g \in S^{k-2}$ if $g$ is restricted to a suitable open neighbourhood of 0. Let $k \in \mathbb{N}$, $(g_0, D_0 g, \ldots, D_0^k g) \in M^k_0$ be arbitrary, then we may assume that $g \in M^\infty$ by the proof of surjectivity of $\sigma^k$. Hence by Theorem (6.5.1) there exists a $\varphi \in G^\infty$ such that $(D_0\varphi, D_0^2\varphi, \ldots, D_0^{k+1}\varphi) \cdot (g_0, D_0 g, \ldots, D_0^k g) = \sigma^k(\varphi \cdot g) \in \sigma^k(S^{k+1}) \subseteq S^k_0$. So for any $(g_0, D_0 g, \ldots) \in M^k_0$ there exists a $(D_0\varphi, D_0^2\varphi, \ldots, D_0^{k+1}\varphi) \in G^{k+1}_0$ such that $(D_0\varphi, D_0^2\varphi, \ldots) \cdot (g_0, D_0 g, \ldots) \in S^k_0$. Therefore Theorem (6.5.1) expresses in this setting the fact that for any $k \in \mathbb{N}$, the ‘orbit’ (as it is not really a group action) of any Taylor sequence in $M^k_0$ under $G^{k+1}_0$ necessarily intersects $S^k_0$. So we can go from Equation (6.38) to Equation (6.40), while staying in the same orbit in $M^k_0$.

The major advantage of this is that the action of $G^1_0$ on $S^k_0$ and $R^k_0$ is much less complicated than the action of $G^{k+1}_0$ on $M^k_0$, and is therefore a much more convenient setting in which to investigate whether or not two elements lie in the same orbit.

This is another way, compared to Theorem (6.6.1), of looking at the original problem in the context of orbits and expressing Theorem (6.4.6) and Theorem (6.5.1) in terms of this new formulation.

Chapter 7

Conclusion

In the first chapters of this thesis we discussed topology (Chapter 2), algebra and group theory (Chapter 3), algebra and topology combined (Chapter 4), and analysis (Chapter 5) to develop the theory necessary to generalise Christoffel’s article, [Chr1869], as much as possible. The many interesting results found in these chapters are listed in Section 1.2.

Christoffel’s article deals with the question of, given two metrics, whether or not we can find a coordinate transformation that transforms these metrics into each other, as per Equation (6.2). In [Chr1869] the consequences of Equation (6.2) are investigated to find necessary conditions for such a coordinate transformation to exist. These conditions have all been generalised from the finite dimensional space $\mathbb{R}^k$ to TVs$/K$ $T_2$ LC (that is, locally convex Hausdorff topological vector spaces over either $\mathbb{R}$ or $\mathbb{C}$) in Theorem (6.2.1), Theorem (6.2.2), and Theorem (6.2.3). In these theorems we find that if such a coordinate transformation exists, we obtain a chain of transformation equations between tensors which are all necessarily satisfied by the transformation.$^1$ Furthermore, these covariant tensors can directly be expressed in terms of the derivatives of the metrics with which they are associated.

Then we digressed from [Chr1869] in Section 6.3 at the point where we try to find a way back from this chain of equations to a coordinate transformation relating the metrics. Because considering all coordinate transformations in their entirety would make this extremely complicated, we fixed two points and approximated both metrics by their Taylor approximations near these points. In Theorem (6.4.6) we then found that the Taylor approximations to two given metrics agree at a certain point, precisely when the sequence of tensors from [Chr1869] agrees at this point, provided that we are working in a UC (uniformly complete) space, and both the metrics are simple$^2$ at this point. We continued with Theorem (6.5.1) which shows that if we are working in a Ba (Banach) space, then for any metric and any fixed point at which the metric is defined, there is a coordinate transformation such that the transformed metric is simple at that point. This in turn led to Theorem (6.6.1) where we show for two fixed

1 More precisely, we find equations between the curvature tensors of both metrics and their covariant derivatives.
2 Expressed in Riemann normal coordinates.

points, that we can find coordinate transformations for both metrics, such that the transformed metrics at these points have an equal Taylor expansion, if and only if there is a continuous linear bijective map which relates the tensors from [Chr1869] at these points. While this does not provide the coordinate transformation itself, it does simplify the question of whether or not there can be a coordinate transformation relating the two fixed points in question, which preserves the Taylor sequences of both metrics up to a certain order, by rephrasing this question entirely in terms of the tensors from [Chr1869]. It furthermore ensures that at these points, we need not consider the entire Taylor expansion of the coordinate transformation, but only its linear part.

At the end of Section 6.3 and continued in Section 6.6 we rephrase the problem in terms of an action of the collection of diffeomorphisms on the collection of metrics (such that two metrics can be transformed into each other via a coordinate transformation if and only if they lie in the same orbit under this action). Here we see that the tensors from [Chr1869] give us a map, given by Equation (6.37), called the Christoffel map which factors nicely through the action of the diffeomorphisms (Equation (6.39)), and is even injective on a certain subset of Taylor sequences of simple metrics by Equation (6.29).

This raises some interesting questions.

• At the end of Section 6.6 we found that the Christoffel map $\mathrm{Chri}^k$ is injective when restricted to $S^k_0$. This set $S^k_0$ contains $\sigma^k(S^{k+1})$, the Taylor sequences of all $C^{k+1}$ metrics that are simple at 0. We can therefore ask ourselves whether $\sigma^k(S^k)$ or $\sigma^k(S^{k+1})$ equals the entire $S^k_0$.

• Another immediate question is what the set $\mathrm{Chri}^k(S^k_0) \subseteq R^k_0$ looks like. Is $\mathrm{Chri}^k$ defined in Equation (6.37) surjective? Or in other words, what conditions must the multilinear maps from a given sequence $(g_0, R^g_0, \nabla R^g_0, \ldots)$ satisfy to actually be of this form for a certain metric $g$?

• And finally, would there be a practical way to employ Theorem (6.6.1) in finding the sought coordinate transformation, or is it only useful in eliminating points that may not be related by any coordinate transformation?

These questions mark the end of this thesis. I would like to thank the reader for taking the time to read it, and I hope to have given an interesting new perspective on Christoffel’s article,

Bas Fagginger Auer.

Chapter 8

Translation

This is an annotated, direct translation of Elwin Bruno Christoffel’s article, [Chr1869]. All footnotes have been added by the author of this thesis, except for the references to the articles of Lamé and Riemann.

About the transformation of homogeneous differential expressions of second degree (Mr. E. B. Christoffel from Zürich, translated by B. O. Fagginger Auer)

1.

Consider the differential expression
$$F = \sum \omega_{ik}\, \partial x_i\, \partial x_k; \qquad i, k = 1, 2, \ldots, n,$$
where the coefficients $\omega$ are arbitrary functions of the independent variables $x_1, x_2, \ldots, x_n$. If these are independent functions of new variables $x'_1, x'_2, \ldots, x'_n$ then $F$ will transform into a new differential expression

$$F' = \sum \omega'_{ik}\, \partial x'_i\, \partial x'_k,$$
which is equal to the original expression. If on the other hand two differential expressions $F$ and $F'$ are given, one can ask under which conditions these may be transformed into each other and which substitution of variables will yield this transformation. For the investigation of this question we define the determinants of the coefficients of $F$, $F'$, and the substitution itself as

$$\sum \pm \frac{\partial x_1}{\partial x'_1} \frac{\partial x_2}{\partial x'_2} \cdots \frac{\partial x_n}{\partial x'_n} = r, \qquad \sum \pm\, \omega_{11}\, \omega_{22} \cdots \omega_{nn} = E, \qquad \sum \pm\, \omega'_{11}\, \omega'_{22} \cdots \omega'_{nn} = E'.$$

From the theory of algebraic invariants it is known that there is precisely one condition
$$E' = r^2 E$$
for our sought after transformation, which would be sufficient for the considered case, provided the coefficients $\omega$ of $F$ and elements of $r$ are constant. In the actual, more general, case we are considering, additional conditions need to be met, as not all systems of linear and homogeneous functions of $\partial x'_1, \partial x'_2, \ldots, \partial x'_n$ inserted in place of $\partial x_1, \partial x_2, \ldots, \partial x_n$ to transform $F$ to $F'$ will solve the posed problem, without satisfying integrability conditions$^1$ that make the expressions for $\partial x_1, \partial x_2, \ldots, \partial x_n$ full-fledged differentials.

If $F$ and $F'$ are homogeneous differential expressions of general, but equal degree, the theory of algebraic invariants yields equations which, in the case that this degree is strictly greater than 2, completely determine the values of the coefficients necessary to transform $F$ into $F'$. However, these coefficients still need to be subjected to the necessary integrability conditions, which involves both the direct and inverse substitution, see section 10.$^2$
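For constant coefficients, the single algebraic condition $E' = r^2 E$ can be checked directly, since the transformation relations then read $\omega'_{\alpha\beta} = \sum_{ik} \omega_{ik}\, \frac{\partial x_i}{\partial x'_\alpha} \frac{\partial x_k}{\partial x'_\beta}$ with a constant matrix of partial derivatives. The following is only an illustrative sketch; the sample matrices and the helper names `det3` and `transform` are assumptions, not part of the article.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def transform(omega, jac):
    """omega'[a][b] = sum_{i,k} omega[i][k] * jac[i][a] * jac[k][b],
    where jac[i][a] stands for the constant derivative dx_i/dx'_a."""
    n = len(omega)
    return [[sum(omega[i][k] * jac[i][a] * jac[k][b]
                 for i in range(n) for k in range(n))
             for b in range(n)] for a in range(n)]


# Hypothetical constant coefficients (symmetric) and a constant substitution.
omega = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
jac = [[1, 2, 0], [0, 1, 1], [1, 0, 3]]

E = det3(omega)
r = det3(jac)
E_prime = det3(transform(omega, jac))
print(E_prime == r * r * E)
```

The check works because the transformed coefficient matrix is $J^T \omega J$, whose determinant is $\det(J)^2 \det(\omega)$.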

1 See the integrability conditions (B.) in section 8.
2 “[...], wobei die Eigenschaft der zugehörigen Formen, unmittelbar nicht die directe, sondern die transponirte Substitution zu liefern, wesentlich in Betracht kommt (vergl. art. 10).”

These kinds of simplifications do not occur for homogeneous differential expressions of second degree, as the algebraic conditions yield only one invariant and one corresponding form. Because of this, we will introduce aids in the following sections, with which this important case can be treated. Finally, it should be remarked that for the case where $F = \partial x_1^2 + \partial x_2^2 + \partial x_3^2$, there exists an extensive work of Mr. Lamé (Théorie des coordonnées curvilignes), treating our present problem in the context of the unwrapping of curved planes.

2.

During the investigation of the conditions that are both necessary and sufficient for the equation

$$\sum \omega_{ik}\, \partial x_i\, \partial x_k = \sum \omega'_{ik}\, \partial x'_i\, \partial x'_k \tag{1.}$$
to hold, we restrict ourselves to the case where both the determinants $E$ and $E'$ of these differential expressions are not identically equal to zero.$^3$ The new variables $x'$ will be assumed to be independent and their differentials constant.

If one would replace each $\partial x'_i$ by $\partial x'_i + \delta x'_i$, where the differentials $\delta x'_i$ correspond to increases of the original variables $x$ by $\delta x$, then in (1.), $\partial x$ would go to $\partial x + \delta x$. Expanding the products on both sides of (1.) we obtain$^4$

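$$\sum \omega_{ik}\, \partial x_i\, \delta x_k = \sum \omega'_{ik}\, \partial x'_i\, \delta x'_k. \tag{2.}$$

Comparing the coefficients for $\partial x'_g$, we find

$$\sum_{ik} \omega_{ik}\, \frac{\partial x_i}{\partial x'_g}\, \delta x_k = \sum_{k} \omega'_{gk}\, \delta x'_k,$$
and from this
$$\delta x'_h = \sum_{gik} \frac{E'_{gh}}{E'}\, \omega_{ik}\, \frac{\partial x_i}{\partial x'_g}\, \delta x_k, \tag{3.}$$
where $E'_{gh}$ is the minor of $\omega'_{gh}$.$^5$

In equation (2.) we now increase each $x'$ by its differential $\partial x'$ and in (1.) each $x'$ by $\delta x'$, which does not change the differentials on the right hand side;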

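3 It is also implicitly assumed that $\omega_{ik} = \omega_{ki}$ and $\omega'_{ik} = \omega'_{ki}$.
4 For $F$ we have $\sum \omega_{ik}\, \partial x_i\, \partial x_k \to \sum \omega_{ik} (\partial x_i + \delta x_i)(\partial x_k + \delta x_k) = \sum \omega_{ik}\, \partial x_i\, \partial x_k + \sum \omega_{ik}\, \delta x_i\, \delta x_k + \sum \omega_{ik}\, \partial x_i\, \delta x_k + \sum \omega_{ik}\, \delta x_i\, \partial x_k$; do the same for $F'$ and use equation (1.) and symmetry of $\omega$ and $\omega'$ to obtain (2.).
5 So $E'_{gh}$ is the determinant of the matrix obtained from deleting the $g$-th row and $h$-th column of the matrix with coefficients $(\omega'_{ik})_{i,k=1,\ldots,n}$. By Cramer's rule and assumed invertibility of $\omega'$, we have therefore $\sum_h (E'_{gh}/E')\, \omega'_{hk} = \delta_{gk}$, which immediately yields equation (3.) by multiplying the right hand side of the equation above (3.) by $E'_{gh}/E'$ and summing over $g$.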

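it then follows that:$^6$

$$\sum \omega_{ik}\, \partial^2 x_i\, \delta x_k + \sum \partial\omega_{ik}\, \partial x_i\, \delta x_k + \sum \omega_{ik}\, \partial x_i\, \partial\delta x_k = \sum \partial\omega'_{ik}\, \partial x'_i\, \delta x'_k,$$
$$\sum \delta\omega_{ik}\, \partial x_i\, \partial x_k + 2 \sum \omega_{ik}\, \partial x_i\, \delta\partial x_k = \sum \delta\omega'_{ik}\, \partial x'_i\, \partial x'_k.$$
Dividing the second equation by 2 and subtracting it from the first we find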

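$$\sum \omega_{ik}\, \partial^2 x_i\, \delta x_k + \sum \partial\omega_{ik}\, \partial x_i\, \delta x_k - \frac{1}{2} \sum \delta\omega_{ik}\, \partial x_i\, \partial x_k = \sum \partial\omega'_{ik}\, \partial x'_i\, \delta x'_k - \frac{1}{2} \sum \delta\omega'_{ik}\, \partial x'_i\, \partial x'_k.$$
To avoid cluttering our formulae we define
$$\frac{1}{2} \left( \frac{\partial \omega_{gk}}{\partial x_h} + \frac{\partial \omega_{hk}}{\partial x_g} - \frac{\partial \omega_{gh}}{\partial x_k} \right) = \begin{bmatrix} g h \\ k \end{bmatrix} \tag{4.}$$
from which
$$\begin{bmatrix} h g \\ k \end{bmatrix} = \begin{bmatrix} g h \\ k \end{bmatrix} \quad \text{and} \quad \frac{\partial \omega_{hk}}{\partial x_g} = \begin{bmatrix} g h \\ k \end{bmatrix} + \begin{bmatrix} g k \\ h \end{bmatrix} \tag{5.}$$
follow, and if we use the same notation for the transformed differential expression, we find$^7$
$$\sum_{ik} \omega_{ik}\, \partial^2 x_i\, \delta x_k + \sum_{ikl} \begin{bmatrix} i l \\ k \end{bmatrix} \partial x_i\, \partial x_l\, \delta x_k = \sum_{\alpha\beta h} \begin{bmatrix} \alpha \beta \\ h \end{bmatrix}' \partial x'_\alpha\, \partial x'_\beta\, \delta x'_h. \tag{6.}$$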

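Substituting $\delta x'_h$ by its value from (3.) causes the right hand side to change to
$$\sum_{\alpha\beta h} \begin{bmatrix} \alpha \beta \\ h \end{bmatrix}' \partial x'_\alpha\, \partial x'_\beta \sum_{gik} \frac{E'_{gh}}{E'}\, \omega_{ik}\, \frac{\partial x_i}{\partial x'_g}\, \delta x_k.$$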

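By equating coefficients of $\delta x_k$ we find
$$\sum_{i} \omega_{ik}\, \partial^2 x_i + \sum_{il} \begin{bmatrix} i l \\ k \end{bmatrix} \partial x_i\, \partial x_l = \sum_{gi\alpha\beta} \omega_{ik}\, \frac{\partial x_i}{\partial x'_g}\, \partial x'_\alpha\, \partial x'_\beta \sum_{h} \begin{bmatrix} \alpha \beta \\ h \end{bmatrix}' \frac{E'_{gh}}{E'}.$$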

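6 We are making a first order Taylor approximation in both (1.) and (2.) by letting $x'_i \to x'_i + \delta x'_i$ and $x'_i \to x'_i + \partial x'_i$ respectively (so $\omega'(x' + \delta x')_{ik} = \omega'(x')_{ik} + \delta\omega'_{ik}$). Note that all $\omega_{ik}$, $\partial x_i$, $\delta x_i$ are functions of $x'$ via the dependence $x = x(x')$. From equation (2.) we obtain under $x'_i \to x'_i + \partial x'_i$ that for the right hand side $\sum \omega'_{ik}\, \partial x'_i\, \delta x'_k \to \sum \omega'_{ik}\, \partial x'_i\, \delta x'_k + \sum \partial\omega'_{ik}\, \partial x'_i\, \delta x'_k$, while for the left hand side (as the $x_i$ depend on the $x'_i$) $\sum \omega_{ik}\, \partial x_i\, \delta x_k \to \sum \omega_{ik}\, \partial x_i\, \delta x_k + \sum \partial\omega_{ik}\, \partial x_i\, \delta x_k + \sum \omega_{ik}\, \partial^2 x_i\, \delta x_k + \sum \omega_{ik}\, \partial x_i\, \partial\delta x_k$. Now equating both sides and using (2.) we obtain the first equation. The second equation is found by doing the same for equation (1.) and using the symmetry of $\omega$.
7 Using equation (5.) we find
$$\sum \partial\omega_{ik}\, \partial x_i\, \delta x_k - \frac{1}{2} \sum \delta\omega_{ik}\, \partial x_i\, \partial x_k$$
$$= \sum \left( \begin{bmatrix} j i \\ k \end{bmatrix} + \begin{bmatrix} j k \\ i \end{bmatrix} \right) \partial x_j\, \partial x_i\, \delta x_k - \frac{1}{2} \sum \left( \begin{bmatrix} j i \\ k \end{bmatrix} + \begin{bmatrix} j k \\ i \end{bmatrix} \right) \delta x_j\, \partial x_i\, \partial x_k$$
$$= \sum \left( \begin{bmatrix} j i \\ k \end{bmatrix} + \begin{bmatrix} j k \\ i \end{bmatrix} - \frac{1}{2} \begin{bmatrix} k i \\ j \end{bmatrix} - \frac{1}{2} \begin{bmatrix} j k \\ i \end{bmatrix} \right) \partial x_j\, \partial x_i\, \delta x_k.$$
Now use the fact that this expression is symmetrical in $j$ and $i$, together with equation (5.) to obtain (6.).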

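We now multiply this equation by $E_{rk}/E$ and sum over $k$. If we define$^8$
$$\sum_{k} \begin{bmatrix} i l \\ k \end{bmatrix} \frac{E_{rk}}{E} = \begin{Bmatrix} i l \\ r \end{Bmatrix} \tag{7.}$$
from which
$$\begin{Bmatrix} i l \\ r \end{Bmatrix} = \begin{Bmatrix} l i \\ r \end{Bmatrix} \tag{8.}$$
follows, it is found that
$$\partial^2 x_r + \sum_{il} \begin{Bmatrix} i l \\ r \end{Bmatrix} \partial x_i\, \partial x_l = \sum_{\alpha\beta g} \begin{Bmatrix} \alpha \beta \\ g \end{Bmatrix}' \frac{\partial x_r}{\partial x'_g}\, \partial x'_\alpha\, \partial x'_\beta,$$
which on the other hand implies equation (6.)$^9$, therefore, for all $\alpha$, $\beta$ and $r$:

$$\frac{\partial^2 x_r}{\partial x'_\alpha\, \partial x'_\beta} + \sum_{ik} \begin{Bmatrix} i k \\ r \end{Bmatrix} \frac{\partial x_i}{\partial x'_\alpha} \frac{\partial x_k}{\partial x'_\beta} = \sum_{\lambda} \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}' \frac{\partial x_r}{\partial x'_\lambda}. \tag{9.}$$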

This equation yields $n(n+1)/2$ equations for each $r$, so in total a system of $n^2(n+1)/2$ partial differential equations for the sought after substitution of variables. If these are satisfied, then so are the integrability conditions for the linear expression in the new differentials.$^{10}$

These equations simplify considerably if all coefficients $\omega$ of the original form are constant. This, by (7.) and (5.), is true if and only if all expressions $\begin{Bmatrix} i k \\ r \end{Bmatrix}$ disappear. In this case we retain a linear system of partial differential equations for all variables. This result was also obtained by Mr. Lamé in the aforementioned case he was considering.
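As a concrete illustration of definitions (4.) and (7.), the symbols can be evaluated numerically for a familiar example. The sketch below assumes the form $F = \partial\rho^2 + \rho^2\, \partial\theta^2$ (the plane in polar coordinates), which does not appear in the article, and approximates the partial derivatives of the coefficients by central differences; all function names are hypothetical.

```python
def metric(x):
    """Coefficients omega_ik of the assumed form d(rho)^2 + rho^2 d(theta)^2."""
    rho, _theta = x
    return [[1.0, 0.0], [0.0, rho * rho]]


def d_metric(x, j, step=1e-5):
    """Central-difference approximation of d(omega_ab)/d(x_j)."""
    xp, xm = list(x), list(x)
    xp[j] += step
    xm[j] -= step
    mp, mm = metric(xp), metric(xm)
    return [[(mp[a][b] - mm[a][b]) / (2 * step) for b in range(2)]
            for a in range(2)]


def first_kind(x):
    """fk[g][h][k] approximates the bracket [gh; k] of equation (4.)."""
    d = [d_metric(x, j) for j in range(2)]   # d[j][a][b] = d(omega_ab)/d(x_j)
    return [[[0.5 * (d[h][g][k] + d[g][h][k] - d[k][g][h])
              for k in range(2)] for h in range(2)] for g in range(2)]


def inverse2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def second_kind(x):
    """sk[i][l][r] approximates {il; r} of equation (7.); E_rk / E is the
    (r, k) entry of the inverse coefficient matrix."""
    fk = first_kind(x)
    inv = inverse2(metric(x))
    return [[[sum(fk[i][l][k] * inv[r][k] for k in range(2))
              for r in range(2)] for l in range(2)] for i in range(2)]


sk = second_kind((2.0, 0.3))
print(sk[1][1][0], sk[0][1][1])   # close to -rho and 1/rho at rho = 2
```

Index 0 stands for $\rho$ and index 1 for $\theta$. The two printed values approximate the familiar polar-coordinate symbols $-\rho$ and $1/\rho$.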

3.

If it is impossible to express the original variables $x$ as functions of $x'$ such that (9.) is satisfied, then the transformation from $F$ to $F'$ is not possible, as (9.) is a direct consequence of the existence of such a change of variables. If on the other hand the equations (9.) are satisfied, then one wonders to what extent this leads back to equation (1.), that is $F = F'$. To investigate this, we insert the found solution to (9.) into $F$, from which follows that

$$F = \sum \omega''_{ik}\, \partial x'_i\, \partial x'_k = F''.$$

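8 These are the actual Christoffel symbols: for a pseudo-Riemannian metric $g$ on a smooth manifold $M$ we can write $g$ in local coordinate patches as $g(x) = \sum_{a,b=1}^{m} g_{ab}(x)\, dx^a \otimes dx^b$, where $g_{ab}(x) = g_{ba}(x)$ is symmetric and the matrix $(g_{ab}(x))_{a,b=1,\ldots,m}$ is invertible. Denoting the inverse matrix by $(g^{ab})_{a,b=1,\ldots,m}$ we find that if we put $g_{ab} = \omega_{ab}$, such that $g = F$, by Cramer's rule $g^{ab} = E_{ab}/E$, so by equation (4.) and (7.)
$$\begin{Bmatrix} a b \\ c \end{Bmatrix} = \sum_{d} \begin{bmatrix} a b \\ d \end{bmatrix} \frac{E_{d,c}}{E} = \sum_{d} \frac{1}{2} \left( \frac{\partial \omega_{ad}}{\partial x_b} + \frac{\partial \omega_{bd}}{\partial x_a} - \frac{\partial \omega_{ab}}{\partial x_d} \right) \frac{E_{d,c}}{E} = \sum_{d} \frac{1}{2}\, g^{cd} \left( \partial_a g_{bd} + \partial_b g_{ad} - \partial_d g_{ab} \right) = \Gamma^c_{ab}.$$
9 The equations are equivalent as $\omega$ is invertible.
10 By equation (8.) we have symmetry in $\alpha$ and $\beta$ of (9.).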

The important thing now is what connections there exist between the respective coefficients of $F''$ and $F'$.

As (9.) is nothing but a rewriting of equation (6.), both equations are equivalent. If we now, instead of starting with $F = F'$, begin by setting $F = F''$ and follow the same reasoning as in section 2, then instead of the current right hand side of (6.), we would obtain the new expression
$$\sum_{\alpha\beta h} \begin{bmatrix} \alpha \beta \\ h \end{bmatrix}'' \partial x'_\alpha\, \partial x'_\beta\, \delta x'_h,$$
and this must equal the original left hand side of (6.). Therefore we find
$$\begin{bmatrix} \alpha \beta \\ h \end{bmatrix}'' = \begin{bmatrix} \alpha \beta \\ h \end{bmatrix}',$$
which with (5.) gives
$$\frac{\partial \omega''_{hk}}{\partial x'_g} = \frac{\partial \omega'_{hk}}{\partial x'_g}$$
for all values of $g$, $h$, and $k$. Therefore the coefficients of $F''$ and $F'$ can only differ by additive constants. To make these constants disappear, it is sufficient that $F'' = F'$ at a certain point, that is, that the solution to (9.) satisfies the transformation relations following from equation (1.):

$$\sum_{ik} \omega_{ik}\, \frac{\partial x_i}{\partial x'_\alpha} \frac{\partial x_k}{\partial x'_\beta} = \omega'_{\alpha\beta} \tag{10.}$$
for a certain collection of values for the new variables. We therefore arrive at the statement:

Whenever it is possible to express the original variables $x_1, x_2, \ldots, x_n$ in terms of the new variables $x'_1, x'_2, \ldots, x'_n$, such that the system of equations (9.) is satisfied, together with the initial condition that their first derivatives for a certain value of the new variables satisfy the transformation relations following from $F = F'$, then $F = F'$ for all possible values of the new variables.

From this it follows that the transformation relations contained in (1.), apart from the required initial conditions, are made redundant by (9.). The equations contained in (9.) are, provided that the initial conditions are satisfied at a certain point, both necessary and sufficient for existence of a transformation of $F$ into $F'$ and completely replace the algebraic and integrability conditions stemming from equation (1.).

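4.

For the possibility of the equations (9.) being satisfied, new integrability conditions are necessary which we will now derive. To also show that these conditions do not again imply (9.), we will consider the following, more general equations instead of (9.)
$$\left\{ \begin{aligned} \frac{\partial^2 x_r}{\partial x'_\alpha\, \partial x'_\beta} + \sum_{gh} \begin{Bmatrix} g h \\ r \end{Bmatrix} u^g_\alpha u^h_\beta &= \sum_{\lambda} \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}' u^r_\lambda + \begin{pmatrix} r \\ \alpha\beta \end{pmatrix} \\ \frac{\partial^2 x_r}{\partial x'_\alpha\, \partial x'_\gamma} + \sum_{gh} \begin{Bmatrix} g h \\ r \end{Bmatrix} u^g_\alpha u^h_\gamma &= \sum_{\lambda} \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}' u^r_\lambda + \begin{pmatrix} r \\ \alpha\gamma \end{pmatrix} \end{aligned} \right. \tag{9'.}$$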

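where, as we will do in the following, we denote the first derivatives

$$\frac{\partial x_i}{\partial x'_\alpha} = u^i_\alpha$$
while we use the usual notation for higher order derivatives. Note that we recover equation (9.) when we put $\begin{pmatrix} r \\ \alpha\beta \end{pmatrix}$ equal to 0.

We obtain the discussed integrability conditions when we differentiate the first equation of (9'.) to $x'_\gamma$, the second to $x'_\beta$ and consider their difference, where the third derivative of $x_r$ disappears. Also noting that the terms which contain a derivative with respect to both $x'_\beta$ and $x'_\gamma$ disappear$^{11}$, it follows:

$$\sum_{ghi} \left( \frac{\partial \begin{Bmatrix} g h \\ r \end{Bmatrix}}{\partial x_i} - \frac{\partial \begin{Bmatrix} g i \\ r \end{Bmatrix}}{\partial x_h} \right) u^g_\alpha u^h_\beta u^i_\gamma + \sum_{ph} \begin{Bmatrix} p h \\ r \end{Bmatrix} \frac{\partial^2 x_p}{\partial x'_\alpha\, \partial x'_\gamma}\, u^h_\beta - \sum_{pi} \begin{Bmatrix} p i \\ r \end{Bmatrix} \frac{\partial^2 x_p}{\partial x'_\alpha\, \partial x'_\beta}\, u^i_\gamma$$
$$= \sum_{\lambda} \left( \frac{\partial \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}'}{\partial x'_\gamma} - \frac{\partial \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}'}{\partial x'_\beta} \right) u^r_\lambda + \sum_{\lambda} \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}' \frac{\partial^2 x_r}{\partial x'_\lambda\, \partial x'_\gamma} - \sum_{\lambda} \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}' \frac{\partial^2 x_r}{\partial x'_\lambda\, \partial x'_\beta}$$
$$+ \frac{\partial \begin{pmatrix} r \\ \alpha\beta \end{pmatrix}}{\partial x'_\gamma} - \frac{\partial \begin{pmatrix} r \\ \alpha\gamma \end{pmatrix}}{\partial x'_\beta}.$$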

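From this it follows that the integrability conditions for (9'.) are precisely those for our original equations (9.), whenever the $\begin{pmatrix} r \\ \alpha\beta \end{pmatrix}$ satisfy the following equation

$$\sum_{ph} \begin{Bmatrix} p h \\ r \end{Bmatrix} \begin{pmatrix} p \\ \alpha\gamma \end{pmatrix} u^h_\beta - \sum_{pi} \begin{Bmatrix} p i \\ r \end{Bmatrix} \begin{pmatrix} p \\ \alpha\beta \end{pmatrix} u^i_\gamma = \sum_{\lambda} \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}' \begin{pmatrix} r \\ \lambda\gamma \end{pmatrix} - \sum_{\lambda} \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}' \begin{pmatrix} r \\ \lambda\beta \end{pmatrix} + \frac{\partial \begin{pmatrix} r \\ \alpha\beta \end{pmatrix}}{\partial x'_\gamma} - \frac{\partial \begin{pmatrix} r \\ \alpha\gamma \end{pmatrix}}{\partial x'_\beta}, \tag{11.}$$
which can easily be put in a more symmetric form. From now on we will suppose that this condition is satisfied for all values of $r$, $\alpha$, $\beta$, and $\gamma$. If we then substitute all second derivatives by their corresponding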

11 Symmetry of higher order derivatives.

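expressions in (9'.), we obtain:

$$\sum_{ghi} \left( \frac{\partial \begin{Bmatrix} g h \\ r \end{Bmatrix}}{\partial x_i} - \frac{\partial \begin{Bmatrix} g i \\ r \end{Bmatrix}}{\partial x_h} \right) u^g_\alpha u^h_\beta u^i_\gamma$$
$$+ \sum_{ph} \begin{Bmatrix} p h \\ r \end{Bmatrix} u^h_\beta \left[ \sum_{\lambda} \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}' u^p_\lambda - \sum_{gi} \begin{Bmatrix} g i \\ p \end{Bmatrix} u^g_\alpha u^i_\gamma \right] - \sum_{ph} \begin{Bmatrix} p h \\ r \end{Bmatrix} u^h_\gamma \left[ \sum_{\lambda} \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}' u^p_\lambda - \sum_{gi} \begin{Bmatrix} g i \\ p \end{Bmatrix} u^g_\alpha u^i_\beta \right]$$
$$= \sum_{\lambda} \left( \frac{\partial \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}'}{\partial x'_\gamma} - \frac{\partial \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}'}{\partial x'_\beta} \right) u^r_\lambda$$
$$+ \sum_{\lambda} \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}' \left[ \sum_{\mu} \begin{Bmatrix} \gamma \lambda \\ \mu \end{Bmatrix}' u^r_\mu - \sum_{pi} \begin{Bmatrix} p i \\ r \end{Bmatrix} u^p_\lambda u^i_\gamma \right] - \sum_{\lambda} \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}' \left[ \sum_{\mu} \begin{Bmatrix} \beta \lambda \\ \mu \end{Bmatrix}' u^r_\mu - \sum_{pi} \begin{Bmatrix} p i \\ r \end{Bmatrix} u^p_\lambda u^i_\beta \right].$$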

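Here all terms containing the product of two $u$'s cancel; if we collect the remaining terms and swap $\lambda$ and $\mu$, it follows that:

$$\sum_{ghi} \left[ \frac{\partial \begin{Bmatrix} g h \\ r \end{Bmatrix}}{\partial x_i} - \frac{\partial \begin{Bmatrix} g i \\ r \end{Bmatrix}}{\partial x_h} + \sum_{p} \left( \begin{Bmatrix} g h \\ p \end{Bmatrix} \begin{Bmatrix} p i \\ r \end{Bmatrix} - \begin{Bmatrix} g i \\ p \end{Bmatrix} \begin{Bmatrix} p h \\ r \end{Bmatrix} \right) \right] u^g_\alpha u^h_\beta u^i_\gamma$$
$$= \sum_{\lambda} \left[ \frac{\partial \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}'}{\partial x'_\gamma} - \frac{\partial \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}'}{\partial x'_\beta} + \sum_{\mu} \left( \begin{Bmatrix} \alpha \beta \\ \mu \end{Bmatrix}' \begin{Bmatrix} \mu \gamma \\ \lambda \end{Bmatrix}' - \begin{Bmatrix} \alpha \gamma \\ \mu \end{Bmatrix}' \begin{Bmatrix} \mu \beta \\ \lambda \end{Bmatrix}' \right) \right] u^r_\lambda. \tag{12.}$$

We therefore obtain the following result.

The system of equations (12.), where $r$, $\alpha$, $\beta$, and $\gamma$ range from 1 to $n$, contains the necessary integrability conditions of equation (9.). However, it cannot replace (9.), as it implies the more general system of equations (9'.) where the independent terms $\begin{pmatrix} r \\ \alpha\beta \end{pmatrix}$ satisfy (11.).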

5.

To continue our investigation, the equations contained in (12.) have to be replaced by a different, but equivalent system of equations. To this end we multiply (12.) by $\omega_{rk} u^k_\delta$ and sum over $k$ and $r$. It is clear that the equations this yields are equivalent to (12.): we can multiply the equations we obtain by the above operations by $\frac{\partial x'_\delta}{\partial x_l} \frac{E_{\rho l}}{E}$, sum over $\delta$ and $l$, and finally set $\rho = r$ to again obtain (12.). Performing these operations we obtain from (12.), using (10.)

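$$\sum_{rk} \omega_{rk}\, u^r_\lambda u^k_\delta = \omega'_{\lambda\delta},$$
the following equation

$$\sum_{ghik} u^g_\alpha u^h_\beta u^i_\gamma u^k_\delta \sum_{r} \omega_{rk} \left[ \frac{\partial \begin{Bmatrix} g h \\ r \end{Bmatrix}}{\partial x_i} - \frac{\partial \begin{Bmatrix} g i \\ r \end{Bmatrix}}{\partial x_h} + \sum_{p} \left( \begin{Bmatrix} g h \\ p \end{Bmatrix} \begin{Bmatrix} p i \\ r \end{Bmatrix} - \begin{Bmatrix} g i \\ p \end{Bmatrix} \begin{Bmatrix} p h \\ r \end{Bmatrix} \right) \right]$$
$$= \sum_{\lambda} \omega'_{\lambda\delta} \left[ \frac{\partial \begin{Bmatrix} \alpha \beta \\ \lambda \end{Bmatrix}'}{\partial x'_\gamma} - \frac{\partial \begin{Bmatrix} \alpha \gamma \\ \lambda \end{Bmatrix}'}{\partial x'_\beta} + \sum_{\mu} \left( \begin{Bmatrix} \alpha \beta \\ \mu \end{Bmatrix}' \begin{Bmatrix} \mu \gamma \\ \lambda \end{Bmatrix}' - \begin{Bmatrix} \alpha \gamma \\ \mu \end{Bmatrix}' \begin{Bmatrix} \mu \beta \\ \lambda \end{Bmatrix}' \right) \right]$$
such that the following manipulations will be the same for both sides of the equation. Now by equation (7.)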

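$$\sum_{r} \omega_{rk} \begin{Bmatrix} g h \\ r \end{Bmatrix} = \begin{bmatrix} g h \\ k \end{bmatrix},$$
hence carrying out the sum over $r$ on the left hand side

$$= \frac{\partial \begin{bmatrix} g h \\ k \end{bmatrix}}{\partial x_i} - \sum_{p} \begin{Bmatrix} g h \\ p \end{Bmatrix} \frac{\partial \omega_{pk}}{\partial x_i} - \frac{\partial \begin{bmatrix} g i \\ k \end{bmatrix}}{\partial x_h} + \sum_{p} \begin{Bmatrix} g i \\ p \end{Bmatrix} \frac{\partial \omega_{pk}}{\partial x_h} + \sum_{p} \left( \begin{Bmatrix} g h \\ p \end{Bmatrix} \begin{bmatrix} p i \\ k \end{bmatrix} - \begin{Bmatrix} g i \\ p \end{Bmatrix} \begin{bmatrix} p h \\ k \end{bmatrix} \right),$$
where in both sums containing derivatives of $\omega$ we replace $r$ by $p$. Furthermore, by (5.)
$$\frac{\partial \omega_{pk}}{\partial x_i} = \begin{bmatrix} i p \\ k \end{bmatrix} + \begin{bmatrix} i k \\ p \end{bmatrix}, \qquad \frac{\partial \omega_{pk}}{\partial x_h} = \begin{bmatrix} h p \\ k \end{bmatrix} + \begin{bmatrix} h k \\ p \end{bmatrix},$$
hence
$$\begin{bmatrix} p i \\ k \end{bmatrix} - \frac{\partial \omega_{pk}}{\partial x_i} = -\begin{bmatrix} i k \\ p \end{bmatrix}, \qquad \frac{\partial \omega_{pk}}{\partial x_h} - \begin{bmatrix} p h \\ k \end{bmatrix} = \begin{bmatrix} h k \\ p \end{bmatrix},$$
which gives us

$$= \frac{\partial \begin{bmatrix} g h \\ k \end{bmatrix}}{\partial x_i} - \frac{\partial \begin{bmatrix} g i \\ k \end{bmatrix}}{\partial x_h} + \sum_{p} \left( \begin{Bmatrix} g i \\ p \end{Bmatrix} \begin{bmatrix} h k \\ p \end{bmatrix} - \begin{Bmatrix} g h \\ p \end{Bmatrix} \begin{bmatrix} i k \\ p \end{bmatrix} \right).$$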

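We will denote this expression by $(gkhi)$$^{12}$, such that when we rewrite $\begin{Bmatrix} g i \\ p \end{Bmatrix}$, $\begin{Bmatrix} g h \\ p \end{Bmatrix}$ using (7.), we find

$$(gkhi) = \frac{\partial \begin{bmatrix} g h \\ k \end{bmatrix}}{\partial x_i} - \frac{\partial \begin{bmatrix} g i \\ k \end{bmatrix}}{\partial x_h} + \sum_{\alpha\beta} \frac{E_{\alpha\beta}}{E} \left( \begin{bmatrix} g i \\ \alpha \end{bmatrix} \begin{bmatrix} h k \\ \beta \end{bmatrix} - \begin{bmatrix} g h \\ \alpha \end{bmatrix} \begin{bmatrix} i k \\ \beta \end{bmatrix} \right), \tag{13.}$$
which is by performing the derivatives equal to

$$(gkhi) = \frac{1}{2} \left( \frac{\partial^2 \omega_{gi}}{\partial x_h\, \partial x_k} + \frac{\partial^2 \omega_{hk}}{\partial x_g\, \partial x_i} - \frac{\partial^2 \omega_{gh}}{\partial x_i\, \partial x_k} - \frac{\partial^2 \omega_{ik}}{\partial x_g\, \partial x_h} \right) + \sum_{\alpha\beta} \frac{E_{\alpha\beta}}{E} \left( \begin{bmatrix} g i \\ \alpha \end{bmatrix} \begin{bmatrix} h k \\ \beta \end{bmatrix} - \begin{bmatrix} g h \\ \alpha \end{bmatrix} \begin{bmatrix} i k \\ \beta \end{bmatrix} \right). \tag{14.}$$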

With this definition, the integrability conditions (12.) take the following form:

$$(\alpha\delta\beta\gamma)' = \sum_{ghik} (gkhi)\, u^g_\alpha u^h_\beta u^i_\gamma u^k_\delta. \tag{15.}$$

For this system of equations we have the same result as we found for (12.): they contain all necessary integrability conditions of (9.), but not (9.) itself, as they imply the more general system (9'.) subject to (11.). Hence they cannot fully replace the integrability conditions of (1.), but are a necessary consequence.

The coefficients $(ghik)$ have the following properties. Exchanging $i$ and $h$ in (13.) we find
$$(gkih) = -(gkhi). \tag{16a.}$$
Exchanging $g$ and $k$ in (14.) we find
$$(kghi) = -(gkhi). \tag{16b.}$$
Exchanging $g$ and $i$, and $k$ and $h$, the right hand side of (14.) does not change, as $E_{\beta\alpha} = E_{\alpha\beta}$, and we find $(ihkg) = (gkhi)$ or
$$(higk) = (gkhi). \tag{16c.}$$
Finally we find
$$(gkhi) + (ghik) + (gikh) = 0. \tag{16d.}$$
With these four formulas, the number of truly different equations contained in (15.) is easily determined.

Because of (a.) and (b.) we can discard all cases where either $\alpha = \delta$ or $\beta = \gamma$, as well as those which follow from exchanging $\alpha$ with $\delta$ or $\beta$ with $\gamma$. Hence the pair $\alpha\delta$, and similarly $\beta\gamma$, may be enumerated by two distinct, ordered elements of $1, 2, \ldots, n$, which gives us a total of $\frac{n(n-1)}{2} = n_2$ possible combinations.

12 $(gkhi)$ really is the Riemann curvature tensor with all its indices lowered.

The expressions $(\alpha\delta\beta\gamma)'$ consist of three groups: the group where $\alpha\delta = \beta\gamma$ consisting of $n_2$ elements, a second of $\frac{n_2(n_2-1)}{2}$ expressions where $\alpha\delta$ does not equal $\beta\gamma$, and finally a third with an equal amount of expressions as the second, where $\alpha\delta$ and $\beta\gamma$ are exchanged. The third group may be discarded because of (c.). Hence we retain, because of (a.), (b.), and (c.) just $\frac{n_2(n_2+1)}{2}$ expressions $(\alpha\delta\beta\gamma)'$.

As every expression $(\alpha\delta\beta\gamma)'$ in which all of $\alpha$, $\beta$, $\gamma$, and $\delta$ are distinct gives us a form of (d.), we see that
$$n_4 = \frac{1}{24}\, n(n-1)(n-2)(n-3)$$
expressions may be rewritten in terms of others, and just

$$\frac{n_2(n_2+1)}{2} - n_4 = \frac{n^2(n^2-1)}{12}$$
distinct expressions $(\alpha\delta\beta\gamma)'$ remain; this is the number of equations contained in (15.) that do not directly follow from each other.

The number of equations contained in (10.) equals $\frac{n(n+1)}{2}$, so adding these to the number of equations in (15.) we find

$$\frac{n(n+1)}{2} + \frac{n^2(n^2-1)}{12} = n^2 + n + \frac{(n+2)(n+1)\,n\,(n-3)}{12}$$
equations and therefore for $n = 2$ the number of equations in (10.) and (15.) is less than the number $n^2 + n$ of unknowns, being the variables $x_i$ and their first derivatives $u^i_\alpha$, for $n = 3$ they are equal, and for $n > 3$ the number of equations is greater than the number of unknowns.
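The counting argument above can be cross-checked by brute force. The sketch below is an illustration and not part of the article: it treats the $n^4$ coefficients $(gkhi)$ as unknowns, imposes the linear relations (16a.) to (16d.), and determines the dimension of the solution space by exact Gaussian elimination. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def curvature_dimension(n):
    """Number of free coefficients (gkhi) left after imposing (16a.)-(16d.)."""
    relations = []
    for g, k, h, i in product(range(n), repeat=4):
        base = (g, k, h, i)
        relations.append([(base, 1), ((g, k, i, h), 1)])     # (16a.)
        relations.append([(base, 1), ((k, g, h, i), 1)])     # (16b.)
        relations.append([(base, 1), ((h, i, g, k), -1)])    # (16c.)
        relations.append([(base, 1), ((g, h, i, k), 1),
                          ((g, i, k, h), 1)])                # (16d.)
    pivots = {}   # leading index tuple -> normalised row
    for terms in relations:
        row = {}
        for key, coeff in terms:
            row[key] = row.get(key, Fraction(0)) + coeff
        row = {key: c for key, c in row.items() if c != 0}
        while row:
            lead = min(row)
            if lead not in pivots:
                scale = row[lead]
                pivots[lead] = {key: c / scale for key, c in row.items()}
                break
            factor = row[lead]
            for key, c in pivots[lead].items():
                value = row.get(key, Fraction(0)) - factor * c
                if value == 0:
                    row.pop(key, None)
                else:
                    row[key] = value
    return n ** 4 - len(pivots)   # unknowns minus rank


def count_from_text(n):
    """n_2 (n_2 + 1) / 2 - n_4, with n_2 and n_4 binomial coefficients."""
    n2, n4 = comb(n, 2), comb(n, 4)
    return n2 * (n2 + 1) // 2 - n4


def equations_total(n):
    """Number of equations in (10.) plus those in (15.)."""
    return n * (n + 1) // 2 + n * n * (n * n - 1) // 12


for n in (2, 3, 4):
    print(n, curvature_dimension(n), count_from_text(n), equations_total(n), n * n + n)
```

Exact rational arithmetic is used so that the rank is not affected by rounding; for larger $n$ the elimination becomes slow, and the closed formula is the practical choice.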

6.

Just as the equations (10.) are transformation relations pertaining to (1.) or (2.), so (15.) contains the transformation relations of a form of order four. Before we continue to the actual construction of this form we will first show that the transformation relations of a more general form together with (9.) give us the transformation relations of a new form, which has an order equal to the order of the original form plus one.

Let
$$G_\mu = \sum_{i_1, \ldots, i_\mu} (i_1 i_2 \ldots i_\mu)\, \partial_1 x_{i_1}\, \partial_2 x_{i_2} \cdots \partial_\mu x_{i_\mu}$$
be a $\mu$-linear form in the differentials $\partial_1 x, \partial_2 x, \ldots, \partial_\mu x$, derived from the coefficients of $F$. Let $G'_\mu$ denote its transform, with
$$(\alpha_1 \alpha_2 \ldots \alpha_\mu)' = \sum_{i_1 \ldots i_\mu} (i_1 i_2 \ldots i_\mu)\, u^{i_1}_{\alpha_1} u^{i_2}_{\alpha_2} \cdots u^{i_\mu}_{\alpha_\mu},$$
where
$$u^i_\alpha = \frac{\partial x_i}{\partial x'_\alpha}.$$

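Differentiating this expression to $x'_\alpha$, we find

$$\frac{\partial (\alpha_1 \alpha_2 \ldots \alpha_\mu)'}{\partial x'_\alpha} = \sum_{i\, i_1 \ldots i_\mu} \frac{\partial (i_1 i_2 \ldots i_\mu)}{\partial x_i}\, u^i_\alpha u^{i_1}_{\alpha_1} \cdots u^{i_\mu}_{\alpha_\mu} + P,$$
$$P = \sum_{\lambda i_2 \ldots i_\mu} (\lambda i_2 \ldots i_\mu)\, \frac{\partial^2 x_\lambda}{\partial x'_\alpha\, \partial x'_{\alpha_1}}\, u^{i_2}_{\alpha_2} \cdots u^{i_\mu}_{\alpha_\mu} + \sum_{i_1 \lambda \ldots i_\mu} (i_1 \lambda \ldots i_\mu)\, u^{i_1}_{\alpha_1}\, \frac{\partial^2 x_\lambda}{\partial x'_\alpha\, \partial x'_{\alpha_2}} \cdots u^{i_\mu}_{\alpha_\mu} + \ldots.$$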

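If we now replace all second derivatives in $P$ by

$$\frac{\partial^2 x_\lambda}{\partial x'_\alpha\, \partial x'_{\alpha_s}} = \sum_{r} \begin{Bmatrix} \alpha \alpha_s \\ r \end{Bmatrix}' u^\lambda_r - \sum_{i\, i_s} \begin{Bmatrix} i\, i_s \\ \lambda \end{Bmatrix} u^i_\alpha u^{i_s}_{\alpha_s}$$
then we can write $P$ as a difference $U - V$ where
$$U = \sum_{r} \begin{Bmatrix} \alpha \alpha_1 \\ r \end{Bmatrix}' \sum_{\lambda i_2 \ldots i_\mu} (\lambda i_2 \ldots i_\mu)\, u^\lambda_r u^{i_2}_{\alpha_2} \cdots u^{i_\mu}_{\alpha_\mu} + \sum_{r} \begin{Bmatrix} \alpha \alpha_2 \\ r \end{Bmatrix}' \sum_{i_1 \lambda \ldots i_\mu} (i_1 \lambda \ldots i_\mu)\, u^{i_1}_{\alpha_1} u^\lambda_r \cdots u^{i_\mu}_{\alpha_\mu} + \ldots$$
$$= \sum_{r} \left[ \begin{Bmatrix} \alpha \alpha_1 \\ r \end{Bmatrix}' (r \alpha_2 \ldots \alpha_\mu)' + \begin{Bmatrix} \alpha \alpha_2 \\ r \end{Bmatrix}' (\alpha_1 r \ldots \alpha_\mu)' + \ldots \right]$$
and
$$V = \sum_{i \ldots i_\mu} u^i_\alpha u^{i_1}_{\alpha_1} \cdots u^{i_\mu}_{\alpha_\mu} \sum_{\lambda} \left[ \begin{Bmatrix} i\, i_1 \\ \lambda \end{Bmatrix} (\lambda i_2 \ldots i_\mu) + \begin{Bmatrix} i\, i_2 \\ \lambda \end{Bmatrix} (i_1 \lambda \ldots i_\mu) + \ldots \right].$$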

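Substituting these expressions in $P$ and bringing $U$ to the left hand side, we arrive at the following statement:

Under the condition that all integrability conditions are satisfied, every system of transformation relations of order $\mu$
$$(\alpha_1 \alpha_2 \ldots \alpha_\mu)' = \sum_{i_1 \ldots i_\mu} (i_1 i_2 \ldots i_\mu)\, u^{i_1}_{\alpha_1} u^{i_2}_{\alpha_2} \cdots u^{i_\mu}_{\alpha_\mu} \tag{17a.}$$
yields a new system of transformation relations of order $\mu + 1$
$$(\alpha \alpha_1 \ldots \alpha_\mu)' = \sum_{i \ldots i_\mu} (i\, i_1 \ldots i_\mu)\, u^i_\alpha u^{i_1}_{\alpha_1} \cdots u^{i_\mu}_{\alpha_\mu}, \tag{17b.}$$
where we define
$$(i\, i_1 \ldots i_\mu) = \frac{\partial (i_1 i_2 \ldots i_\mu)}{\partial x_i} - \sum_{\lambda} \left[ \begin{Bmatrix} i\, i_1 \\ \lambda \end{Bmatrix} (\lambda i_2 \ldots i_\mu) + \begin{Bmatrix} i\, i_2 \\ \lambda \end{Bmatrix} (i_1 \lambda \ldots i_\mu) + \ldots \right] \tag{17c.}$$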

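and use a similar definition for the transformed form.$^{13}$

Using this statement we can from an equation $G_\mu = G'_\mu$ construct a sequence of similar equations $G_{\mu+1} = G'_{\mu+1}$, $G_{\mu+2} = G'_{\mu+2}$, \ldots, until we arrive at identical relations or relations composed of earlier ones.

This in fact occurs for the form $F$ itself. If we would take $(i_1 i_2) = \omega_{i_1 i_2}$, then
$$(i\, i_1 i_2) = \frac{\partial \omega_{i_1 i_2}}{\partial x_i} - \sum_{\lambda} \left[ \begin{Bmatrix} i\, i_1 \\ \lambda \end{Bmatrix} \omega_{\lambda i_2} + \begin{Bmatrix} i\, i_2 \\ \lambda \end{Bmatrix} \omega_{i_1 \lambda} \right] = \frac{\partial \omega_{i_1 i_2}}{\partial x_i} - \begin{bmatrix} i\, i_1 \\ i_2 \end{bmatrix} - \begin{bmatrix} i\, i_2 \\ i_1 \end{bmatrix}$$
which equals zero by equation (5.).$^{14}$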
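That the construction (17c.) gives only vanishing coefficients when applied to $F$ itself can be observed numerically. The following sketch assumes a sample two-dimensional coefficient matrix $\omega$ that is not taken from the article, approximates derivatives by central differences, and evaluates $(i\, i_1 i_2)$ with the second-kind symbols of (7.); the names are hypothetical.

```python
def metric(x):
    """Assumed sample coefficients omega_ik (symmetric, positive definite)."""
    a, b = x
    return [[1.0 + b * b, a * b], [a * b, 1.0 + a * a]]


def d_metric(x, j, step=1e-5):
    """Central-difference approximation of d(omega_ab)/d(x_j)."""
    xp, xm = list(x), list(x)
    xp[j] += step
    xm[j] -= step
    mp, mm = metric(xp), metric(xm)
    return [[(mp[a][b] - mm[a][b]) / (2 * step) for b in range(2)]
            for a in range(2)]


def inverse2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def derived_coefficients(x):
    """(i i1 i2) of equation (17c.) for the choice (i1 i2) = omega_{i1 i2}."""
    w = metric(x)
    inv = inverse2(w)
    d = [d_metric(x, j) for j in range(2)]
    # first kind [gh; k] from (4.), second kind {il; r} from (7.)
    fk = [[[0.5 * (d[h][g][k] + d[g][h][k] - d[k][g][h])
            for k in range(2)] for h in range(2)] for g in range(2)]
    sk = [[[sum(fk[i][l][k] * inv[r][k] for k in range(2))
            for r in range(2)] for l in range(2)] for i in range(2)]
    return [[[d[i][a][b]
              - sum(sk[i][a][lam] * w[lam][b] + sk[i][b][lam] * w[a][lam]
                    for lam in range(2))
              for b in range(2)] for a in range(2)] for i in range(2)]


coeffs = derived_coefficients((0.7, -1.2))
largest = max(abs(c) for plane in coeffs for row in plane for c in row)
print(largest)   # tiny: only floating-point rounding remains
```

The residual vanishes up to rounding at any sample point, in line with equation (5.).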

7.

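We will now construct the form $G_4$ which has (15.) as transformation relation:
$$(\alpha \alpha_1 \alpha_2 \alpha_3)' = \sum_{i \ldots i_3} (i\, i_1 i_2 i_3)\, \frac{\partial x_i}{\partial x'_\alpha} \frac{\partial x_{i_1}}{\partial x'_{\alpha_1}} \frac{\partial x_{i_2}}{\partial x'_{\alpha_2}} \frac{\partial x_{i_3}}{\partial x'_{\alpha_3}}, \tag{18.}$$
and by (16.) we have
$$\left\{ \begin{aligned} &(i\, i_1 i_3 i_2) = -(i\, i_1 i_2 i_3), \quad (i_1 i\, i_2 i_3) = -(i\, i_1 i_2 i_3), \quad (i_2 i_3 i\, i_1) = (i\, i_1 i_2 i_3), \\ &(i\, i_1 i_2 i_3) + (i\, i_2 i_3 i_1) + (i\, i_3 i_1 i_2) = 0. \end{aligned} \right. \tag{19.}$$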

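We multiply equation (18.) with $\partial x'_\alpha\, \delta x'_{\alpha_1}\, D x'_{\alpha_2}\, \Delta x'_{\alpha_3}$, where we consider $\partial$, $\delta$, $D$, and $\Delta$ to be independent differentials, and define
$$\sum_{i \ldots i_3} (i\, i_1 i_2 i_3)\, \partial x_i\, \delta x_{i_1}\, D x_{i_2}\, \Delta x_{i_3} = G_4;$$
then from (18.) we obtain by summing over all values of $\alpha$ that

$$G'_4 = G_4.$$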

Here $G_4$ is a four-linear form in the variables $\partial x$, $\delta x$, $Dx$, and $\Delta x$, all subject to the same linear substitutions$^{15}$. Because of the properties of the coefficients, this form has a number of interesting properties.

If we exchange $i$ and $i_1$, which does not change $G_4$ as we sum over all values, and replace $(i_1 i\, i_2 i_3)$ by $-(i\, i_1 i_2 i_3)$, then it follows that $G_4$ only changes sign

13 This really is the covariant derivative $\nabla T$ of a tensor $T$: on a local coordinate patch of a manifold $M$ we have, if $T(x) = \sum_{a_1, \ldots, a_k} T_{a_1 \ldots a_k}(x)\, dx^{a_1} \otimes \ldots \otimes dx^{a_k}$ is a covariant tensor of order $k$, then the covariant derivative $\nabla T$ of $T$ is a covariant tensor of order $k + 1$ and is given in local coordinates by
$$\nabla T(x) = \sum_{a, a_1, \ldots, a_k} \left( \frac{\partial T_{a_1 \ldots a_k}(x)}{\partial x^a} - \sum_{b} \left[ \Gamma^b_{a a_1}(x)\, T_{b a_2 \ldots a_k}(x) + \Gamma^b_{a a_2}(x)\, T_{a_1 b \ldots a_k}(x) + \ldots + \Gamma^b_{a a_k}(x)\, T_{a_1 a_2 \ldots b}(x) \right] \right) dx^a \otimes dx^{a_1} \otimes \ldots \otimes dx^{a_k}$$
(compare with (17c.)). Hence (17b.) is an expression of the fact that the covariant derivative of a covariant tensor is again a covariant tensor.
14 Which shows that the metric is covariantly constant with respect to this (covariant) derivative.
15 Under changing of variables.

when $\partial$ and $\delta$ are exchanged. Now if we only sum over the distinct $i\, i_1$ terms$^{16}$, then
$$G_4 = \sum (i\, i_1 i_2 i_3)\, (\partial x_i\, \delta x_{i_1} - \partial x_{i_1}\, \delta x_i)\, D x_{i_2}\, \Delta x_{i_3}.$$

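The same thing happens for $D$ and $\Delta$ when $i_2$ and $i_3$ are exchanged, and it follows that

$$G_4 = {\sum}' (i\, i_1 i_2 i_3)\, (\partial x_i\, \delta x_{i_1} - \partial x_{i_1}\, \delta x_i)(D x_{i_2}\, \Delta x_{i_3} - D x_{i_3}\, \Delta x_{i_2}), \tag{20.}$$

where ${\sum}'$ denotes summing over the $\frac{n(n-1)}{2}$ distinct values of $i\, i_1$, and similarly for $i_2 i_3$.

If we exchange $i$ and $i_2$, as well as $i_1$ and $i_3$, and use the equation $(i_2 i_3 i\, i_1) = (i\, i_1 i_2 i_3)$, then $G_4$ remains the same, while $\partial x_i\, \delta x_k - \partial x_k\, \delta x_i$ and $D x_i\, \Delta x_k - D x_k\, \Delta x_i$ are exchanged. Even so, we may not set $D = \partial$ and $\Delta = \delta$, because then the coefficients $(i\, i_1 i_2 i_3)$ and $(i_2 i_1 i\, i_3)$, or equivalently $(i\, i_1 i_2 i_3)$ and $(i\, i_3 i_2 i_1)$, are summed together which prohibits deriving (18.) from $G'_4 = G_4$.

Finally, cyclically permuting $\delta$, $D$, and $\Delta$ yields the forms
$$H_4 = \sum (i\, i_2 i_3 i_1)\, \partial x_i\, \delta x_{i_1}\, D x_{i_2}\, \Delta x_{i_3} = \sum (i\, i_1 i_2 i_3)\, \partial x_i\, D x_{i_1}\, \Delta x_{i_2}\, \delta x_{i_3},$$
$$J_4 = \sum (i\, i_3 i_1 i_2)\, \partial x_i\, \delta x_{i_1}\, D x_{i_2}\, \Delta x_{i_3} = \sum (i\, i_1 i_2 i_3)\, \partial x_i\, \Delta x_{i_1}\, \delta x_{i_2}\, D x_{i_3},$$
for which, using (19.), we have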

G4 + H4 + J4 = 0.
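The identity $G_4 + H_4 + J_4 = 0$ can be checked numerically on any set of four-index symbols that is cyclic in its last three indices. The following is a minimal sketch, not Christoffel's construction: the hypothetical helper `curvature_like` builds symbols from a symmetric matrix `h` by $(i\,i_1 i_2 i_3) = h_{i i_2} h_{i_1 i_3} - h_{i i_3} h_{i_1 i_2}$, an illustrative family chosen only because it has the symmetries used in this section. The matrix and the four integer vectors standing in for $\partial x$, $\delta x$, $Dx$, $\Delta x$ are arbitrary.

```python
from itertools import product


def curvature_like(h):
    """Four-index symbols (i j k l) = h[i][k]*h[j][l] - h[i][l]*h[j][k]."""
    n = len(h)
    return {(i, j, k, l): h[i][k] * h[j][l] - h[i][l] * h[j][k]
            for i, j, k, l in product(range(n), repeat=4)}


def four_linear(coeff, a, b, c, d):
    """Sum of coeff[i,j,k,l] * a[i] * b[j] * c[k] * d[l] over all indices."""
    return sum(v * a[i] * b[j] * c[k] * d[l]
               for (i, j, k, l), v in coeff.items())


# Arbitrary symmetric matrix and integer vectors (illustrative data only).
h = [[2, 1, 0], [1, 3, -1], [0, -1, 5]]
dx, delta_x, Dx, Delta_x = [1, 2, 3], [0, 1, -1], [2, 0, 1], [1, 1, 4]

coeff = curvature_like(h)
G4 = four_linear(coeff, dx, delta_x, Dx, Delta_x)
# Cyclic permutations of the last three sets of variables.
H4 = four_linear(coeff, dx, Dx, Delta_x, delta_x)
J4 = four_linear(coeff, dx, Delta_x, delta_x, Dx)

print("G4 + H4 + J4 =", G4 + H4 + J4)
print("G4 with the first two sets exchanged:",
      four_linear(coeff, delta_x, dx, Dx, Delta_x))
```

Integer arithmetic keeps the check exact. The same helpers also confirm the two symmetries stated earlier: exchanging the first two sets of variables flips the sign of $G_4$, and exchanging the first pair with the second pair leaves it unchanged.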

8.

Using the coefficients $(i_1 i_2 i_3 i_4)$ and applying the technique of section 6. to $G_4$ we obtain a new set of coefficients $(i i_1 i_2 i_3 i_4)$ from which we build a five-linear form $G_5$ using the same technique as in section 7. We can again do the same with $G_5$ to construct a six-linear form $G_6$, from $G_6$ a seven-linear form $G_7$, etc., until we encounter forms whose coefficients vanish or reduce to those of previous forms. This gives us the system of equations

\[ F = F', \quad G_4 = G_4', \quad G_5 = G_5', \quad \dots, \tag{A.} \]
and from the previous considerations we see that the validity of the equations on the right is a necessary consequence of the validity of those on the left.¹⁷ For the equation $F = F'$ to hold it is both necessary and sufficient that the transformation relations
\[ (\alpha_1 \alpha_2)' = \sum_{i_1 i_2} (i_1 i_2)\, u^{i_1}_{\alpha_1} u^{i_2}_{\alpha_2} \tag{F.} \]
for $(ik) = \omega_{ik}$, as well as the integrability conditions

\[ \frac{\partial u^i_\alpha}{\partial x'_\beta} = \frac{\partial u^i_\beta}{\partial x'_\alpha} \tag{B.} \]

¹⁶ That is, only sum over all $i$ and $i_1$ which satisfy $i < i_1$.
¹⁷ From $F = F'$ we found that necessarily $G_4 = G_4'$ by (15.), just as $G_5 = G_5'$ is a direct consequence of $G_4 = G_4'$ by (17b.), and so on.

are satisfied. Because if both conditions are satisfied, then there exist $n$ functions $v_1, v_2, \dots, v_n$ such that the equations (F.) hold for

\[ u^i_\alpha = \frac{\partial v_i}{\partial x'_\alpha} \]
and the substitution $x_1 = v_1$, $x_2 = v_2$, ..., $x_n = v_n$ then gives the desired transformation of $F$ into $F'$.¹⁸ For the equation $G_4 = G_4'$, we need the transformation relations
\[ (\alpha_1 \alpha_2 \alpha_3 \alpha_4)' = \sum_{i_1 \dots i_4} (i_1 i_2 i_3 i_4)\, u^{i_1}_{\alpha_1} u^{i_2}_{\alpha_2} u^{i_3}_{\alpha_3} u^{i_4}_{\alpha_4} \tag{$G_4$.} \]
and the integrability conditions (B.) to be satisfied. In the same fashion the integrability conditions and
\[ (\alpha \alpha_1 \dots \alpha_4)' = \sum_{i i_1 \dots i_4} (i i_1 \dots i_4)\, u^{i}_{\alpha} u^{i_1}_{\alpha_1} \dots u^{i_4}_{\alpha_4} \tag{$G_5$.} \]

must necessarily be satisfied for $G_5 = G_5'$ to hold, etc. With this observation, the transformation of $F$ into $F'$ is impossible when the transformation relations (F.), ($G_4$.), ($G_5$.), ($G_6$.), etc. cannot simultaneously be satisfied, regardless of the necessary integrability conditions (B.). The question now remains whether or not the integrability conditions (B.) are superfluous when the transformation relations are all satisfied.

9.

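To answer this question and avoid unnecessarily long and tedious calculations, we will restrict ourselves in this section to a particular case, which nevertheless allows for direct generalisation, and then find an appropriate answer to our question above.
Suppose that 1) the unknowns $x$ and $u$ are completely and uniquely determined by, for example, the system of equations ($G_4$.), and that 2) their values satisfy the next system of equations ($G_5$.). Taking the derivative of ($G_4$.) with respect to $x'_\alpha$, we obtain
\[
\left.
\begin{aligned}
\frac{\partial(\alpha_1\alpha_2\alpha_3\alpha_4)'}{\partial x'_\alpha}
  &= \sum_{i i_1 \dots i_4} \frac{\partial(i_1 i_2 i_3 i_4)}{\partial x_i}\,\frac{\partial x_i}{\partial x'_\alpha}\, u^{i_1}_{\alpha_1} \dots u^{i_4}_{\alpha_4} + \Pi, \\
\Pi &= \sum_{\lambda i_2 i_3 i_4} (\lambda i_2 i_3 i_4)\, \frac{\partial u^{\lambda}_{\alpha_1}}{\partial x'_\alpha}\, u^{i_2}_{\alpha_2} u^{i_3}_{\alpha_3} u^{i_4}_{\alpha_4} \\
  &\quad + \sum_{i_1 \lambda i_3 i_4} (i_1 \lambda i_3 i_4)\, u^{i_1}_{\alpha_1}\, \frac{\partial u^{\lambda}_{\alpha_2}}{\partial x'_\alpha}\, u^{i_3}_{\alpha_3} u^{i_4}_{\alpha_4} + \dots
\end{aligned}
\right\}
\tag{$G_4'$.}
\]
From our first assumption it follows that the values of all derivatives

\[ \frac{\partial x_i}{\partial x'_\alpha}, \qquad \frac{\partial u^{\lambda}_{\alpha_s}}{\partial x'_\alpha} \]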

¹⁸ See [DK2004II], Lemma 8.2.6 (Poincaré) for the existence of such $v_1, \dots, v_n$.

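are uniquely determined, and therefore that ($G_4'$.) contains a number of equations equal to the number of these unknowns, which are all independent and do not contradict each other. If we now let
\[
\begin{aligned}
\frac{\partial x_i}{\partial x'_\alpha} - u^i_\alpha &= \left[ {i \atop \alpha} \right], \\
\frac{\partial u^{\lambda}_{\alpha_s}}{\partial x'_\alpha} + \sum_{i i_s} \left\{ {i\, i_s \atop \lambda} \right\} u^i_\alpha u^{i_s}_{\alpha_s} - \sum_r \left\{ {\alpha\, \alpha_s \atop r} \right\}' u^{\lambda}_r &= \left[ {\lambda \atop \alpha \alpha_s} \right],
\end{aligned}
\]
then $\left[ {i \atop \alpha} \right] = 0$, $\left[ {\lambda \atop \alpha \alpha_s} \right] = 0$ are the equations that are necessary to derive the equations ($G_5$.) from ($G_4$.) in the same way as we did in section 6.
We can get rid of all derivatives in ($G_4'$.) using these equations and retain on the right hand side a part $U_4$ that is linear in $\left[ {i \atop \alpha} \right]$ and $\left[ {\lambda \atop \alpha \alpha_s} \right]$, which differs from the original right hand side only through replacing $\frac{\partial x_i}{\partial x'_\alpha}$ by $\left[ {i \atop \alpha} \right]$, and $\frac{\partial u^{\lambda}_{\alpha_s}}{\partial x'_\alpha}$ by $\left[ {\lambda \atop \alpha \alpha_s} \right]$. The other terms cancel against the left hand side, where we use that the equations in ($G_5$.) are satisfied by $x$ and $u$. Hence the equations ($G_4'$.) reduce to
\[ U_4 = 0, \]
which is the same as ($G_4'$.) with all the unknowns replaced by $\left[ {i \atop \alpha} \right]$, $\left[ {\lambda \atop \alpha \alpha_s} \right]$ and all other terms removed.
As noted before, this system of equations contains a number of equations equal to the number of unknowns $\left[ {i \atop \alpha} \right]$, $\left[ {\lambda \atop \alpha \alpha_s} \right]$, and the equations are all independent from each other. Hence they can only be satisfied by $\left[ {i \atop \alpha} \right] = 0$ and $\left[ {\lambda \atop \alpha \alpha_s} \right] = 0$, that is, if for all $i$ and $\alpha$

\[ u^i_\alpha = \frac{\partial x_i}{\partial x'_\alpha}. \]

If therefore the equations ($G_4$.) uniquely determine all unknowns, and the values of these unknowns satisfy the equations ($G_5$.) following from ($G_4$.), then the integrability conditions are satisfied and hence the equations (B.) become superfluous.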

10.

After section 9. it is clear without further calculations that the same conclusion is valid when all unknowns have completely and uniquely been determined by the equations ($G_p$.), ($G_q$.), ($G_r$.), ... and furthermore satisfy ($G_{p+1}$.), ($G_{q+1}$.), ($G_{r+1}$.), ... which are derived using the techniques of section 6. Then from every system of equations ($G_s$.) together with ($G_{s+1}$.) we obtain a new set of equations $U_s = 0$, which gives us a sequence of equations

\[ U_p = 0, \quad U_q = 0, \quad U_r = 0, \quad \dots \]
from which we must make a selection consisting of an equal number of equations as unknowns, which form a system with nonvanishing determinant.¹⁹
First of the equations which we can use to determine $x$ and $u$ is (F.), which distinguishes itself from the other equations ($G_4$.), ($G_5$.), ... by the fact that the equations given by its derivative are always satisfied, as we saw in section 6.: for $(i_1 i_2) = \omega_{i_1 i_2}$, $(i i_1 i_2) = 0$.
Hence we arrive at the following statement:
From the original differential expression $F$ we first, using sections 5. and 7., derive the form $G_4$ and then using section 6. the sequence of forms $G_5$, $G_6$, etc. When the transformation relations determined by the equations

\[ F = F', \quad G_4 = G_4', \quad G_5 = G_5', \quad \dots, \quad G_p = G_p' \]
uniquely determine $x$ and $u$, and these values satisfy the transformation relations given by
\[ G_{p+1} = G_{p+1}', \]
then the necessary integrability conditions for the transformation of $F$ into $F'$ are satisfied, and we have
\[ u^i_\alpha = \frac{\partial x_i}{\partial x'_\alpha}. \]
This statement, which in all cases where it may be applied makes the integrability conditions superfluous, opens up a new area of applicability for the theory of algebraic invariants.
It enables us to consider $F$, $G_4$, $G_5$, ... just as homogeneous algebraic forms in the variables $\partial x$, $\partial_1 x$, ..., where it is no longer necessary to consider these as differentials, provided these variables are all subject to the same linear substitution.²⁰ Under these conditions it follows directly from algebra that we can continue the sequence of forms $F$, $G_4$, $G_5$, ... until we have constructed $n$ absolute invariants and an equal number of corresponding forms $\Psi_s$ such that the invariants are independent as functions of $x$ and the $\Psi_s$ do not depend on each other. Then if we continue the sequence $F$, $G_4$, $G_5$, ... one term further, we find a complete system of invariants $I$, $I_1$, ... which are, in relation to the coefficients of the forms in the sequence, mutually independent. With these invariants the possibility of transforming $F$ into $F'$ exists if and only if the equations

\[ I' = r^{\lambda} I, \quad I_1' = r^{\lambda_1} I_1, \quad \dots \]
hold, where $r$ is the determinant of the substitution and the exponents $\lambda$ are constant. It is therefore appropriate to call these invariants $I$, $I_1$, ..., belonging to the sufficiently long, but not longer than necessary, sequence $F$, $G_4$, $G_5$, ..., the complete system of invariants of the differential expression $F$.

¹⁹ And hence uniquely determine the unknowns such that (B.) is satisfied.
²⁰ Described by the matrix $\left( u^i_\alpha \right)_{i,\alpha}$ with determinant $r$.

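Let $U_1, U_2, \dots, U_n$ denote the variables in the original form, and $V_1, V_2, \dots, V_n$ the variables of the transformed form; then the original substitution

\[ \partial x_i = \sum_\alpha u^i_\alpha\, \partial x'_\alpha \]
yields
\[ V_\alpha = \sum_i u^i_\alpha\, U_i \]
as transpose. Then the equations between the corresponding forms become

\[ \Psi_s'(V_1, V_2, \dots) = r^{\mu_s}\, \Psi_s(U_1, U_2, \dots), \]
for $s = 1, 2, \dots, n$ and with $\mu$ constant. It follows that if for a given function $\Omega$ depending on $x$ we set
\[ U_i = \frac{\partial \Omega}{\partial x_i}, \]
the equations are
\[ \Psi_s'\!\left( \frac{\partial \Omega}{\partial x'_1}, \frac{\partial \Omega}{\partial x'_2}, \dots \right) = r^{\mu_s}\, \Psi_s\!\left( \frac{\partial \Omega}{\partial x_1}, \frac{\partial \Omega}{\partial x_2}, \dots \right). \]
We call an expression
\[ \Psi_s\!\left( \frac{\partial \Omega}{\partial x_1}, \frac{\partial \Omega}{\partial x_2}, \dots \right) \]
containing an arbitrary function $\Omega$ an associated form of the differential expression $F$. In a similar fashion the covariants of the system $F$, $G_4$, $G_5$, ... give $n$ equations

\[ \Phi_s'(\partial x'_1, \partial x'_2, \dots) = r^{\nu_s}\, \Phi_s(\partial x_1, \partial x_2, \dots), \]
and we call the differential expressions

\[ \Phi_s(\partial x_1, \partial x_2, \dots) \]
the covariants of the differential expression $F$.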
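The transposed rule for the associated variables is what keeps the pairing $\sum_i U_i\,\partial x_i$ unchanged under the substitution, which is just the chain rule when $U_i = \partial\Omega/\partial x_i$. A minimal sketch of this bookkeeping follows; the matrix `u` and the vectors are arbitrary illustrative integers, and the helper names are hypothetical.

```python
def substitute(u, dx_new):
    """Original differentials: dx[i] = sum_a u[i][a] * dx_new[a]."""
    return [sum(u[i][a] * dx_new[a] for a in range(len(dx_new)))
            for i in range(len(u))]


def substitute_transposed(u, U):
    """Associated variables: V[a] = sum_i u[i][a] * U[i]."""
    return [sum(u[i][a] * U[i] for i in range(len(U)))
            for a in range(len(u[0]))]


def pairing(p, q):
    """Sum of products of corresponding entries."""
    return sum(x * y for x, y in zip(p, q))


u = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]   # hypothetical coefficients u[i][alpha]
dx_new = [1, -2, 5]                     # stand-in for the new differentials
U = [3, 1, -1]                          # stand-in for dOmega/dx_i

dx = substitute(u, dx_new)
V = substitute_transposed(u, U)
print(pairing(U, dx), pairing(V, dx_new))
```

Both printed numbers agree for any choice of `u`, `dx_new`, and `U`, because the two sums are the same double sum $\sum_{i,\alpha} U_i\, u^i_\alpha\, \partial x'_\alpha$ read in two different orders.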

11.

To make the contents of the statement from the previous section more precise, it must be emphasized that the conditions of this statement cannot always be fulfilled. For example, if we consider the question of whether or not two given surfaces can be mapped onto each other without distorting them,²¹ then we must investigate the equation $F = F'$ for $n = 2$. Now even if $F = F'$ can be satisfied for a certain transformation, the transformation relations given by $F = F'$, $G_4 = G_4'$, $G_5 = G_5'$, ... may still fail to uniquely determine $x$ and $u$, no matter how many equations are considered. This is the case when, for example,

²¹ That is, via an isometry with respect to the metrics on the surfaces.

the planes can be translated within themselves, changing the original variables while keeping the new variables fixed.²² Now it is easy to determine the conditions for this case to occur. The domain of the variables $x_1, x_2, \dots, x_n$ is called translatable²³ whenever it is possible to find a substitution of variables such that the transformation of $F$ does not uniquely determine this substitution. It is clear that the conditions for this to occur consist of identities between invariants and forms of $F$, which under the conditions of the previous section should be independent functions of $x$ and $u$.
In this case the integrability conditions are not necessarily fulfilled, because even if all transformation relations are satisfied, the terms $\left[ {i \atop \alpha} \right]$, $\left[ {\lambda \atop \alpha \alpha_s} \right]$ need not be zero, as they are no longer uniquely determined.
For all other cases we have the pleasant and a priori unexpected result that necessary and sufficient conditions for the possibility of transforming $F$ to $F'$ can be phrased as equations between invariants,²⁴ and that the forms and covariants belonging to the transformation problem can be considered and treated in a purely algebraic way.
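A translatable domain can be illustrated with the flat case, assumed here purely as an example in the spirit of the translator's footnote on parallel planes: take $n = 2$ and $\omega = \omega'$ equal to the identity matrix. The relations (F.) then only require the substitution matrix to be orthogonal, so every rotation satisfies them, and since the higher coefficients all vanish for a flat metric, $G_4$, $G_5$, ... add no further conditions. The sketch below uses hypothetical helper names and exact rational rotations built from Pythagorean triples, so that no rounding is involved.

```python
from fractions import Fraction as Fr


def transformed_metric(omega, u):
    """Right-hand side of (F.): sum_{i,k} omega[i][k] * u[i][a] * u[k][b]."""
    n = len(u)
    return [[sum(omega[i][k] * u[i][a] * u[k][b]
                 for i in range(n) for k in range(n))
             for b in range(n)] for a in range(n)]


def rotation(c, s):
    """Rotation matrix with cosine c and sine s (requires c*c + s*s == 1)."""
    return [[c, -s], [s, c]]


flat = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
candidates = [
    rotation(Fr(1), Fr(0)),            # identity substitution
    rotation(Fr(3, 5), Fr(4, 5)),      # from the triple 3-4-5
    rotation(Fr(5, 13), Fr(12, 13)),   # from the triple 5-12-13
]

# Keep every candidate substitution that satisfies (F.) with omega' = omega.
solutions = [u for u in candidates if transformed_metric(flat, u) == flat]
print(len(solutions), "of", len(candidates), "candidates satisfy (F.)")
```

All candidates pass, so the transformation relations leave the substitution undetermined in this example, which is the situation the definition of a translatable domain describes.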

12.

As an example of the discussed theory, we now treat the case $n = 3$, which shows some remarkable simplifications. In equation (20.) we take for $i i_1$ and $i_2 i_3$ only the unique pairs in 1 2 3, so for example just 2 3, 3 1, and 1 2. Then the transformation relations (18.) of $G_4$ can be written as
\[ (\beta_1 \beta_2 \gamma_1 \gamma_2)' = {\sum}' (i_1 i_2 k_1 k_2)\,(u^{i_1}_{\beta_1} u^{i_2}_{\beta_2} - u^{i_1}_{\beta_2} u^{i_2}_{\beta_1})\,(u^{k_1}_{\gamma_1} u^{k_2}_{\gamma_2} - u^{k_1}_{\gamma_2} u^{k_2}_{\gamma_1}). \]

We now define four numbers $\beta$, $\gamma$, $i$, and $k$ by requiring that $\beta \beta_1 \beta_2$, $\gamma \gamma_1 \gamma_2$, $i i_1 i_2$, and $k k_1 k_2$ form positive permutations of the numbers 1 2 3, such that

$\beta = 1$, $\beta_1 = 2$, $\beta_2 = 3$, or $\beta = 2$, $\beta_1 = 3$, $\beta_2 = 1$, or $\beta = 3$, $\beta_1 = 1$, $\beta_2 = 2$.

Conversely we see that from $\beta$ we can uniquely determine $\beta_1$ and $\beta_2$.
Let $r^i_\alpha$ be the minor of the determinant $r$,²⁵ and since the values of $i_1 i_2$ and $k_1 k_2$ are completely determined by the above defined $i$ and $k$, we put

\[ (i_1 i_2 k_1 k_2) = A_{ik}, \]

²² Consider for example the two planes $\{(x, y, 0) \in \mathbb{R}^3 \mid x, y \in \mathbb{R}\}$ and $\{(x, y, 1) \in \mathbb{R}^3 \mid x, y \in \mathbb{R}\}$ with metric induced by the Euclidean metric in $\mathbb{R}^3$. Then the transformation taking the first plane into the second is not at all uniquely determined: for any fixed $(x_0, y_0) \in \mathbb{R}^2$ the map $(x, y, 0) \mapsto (x + x_0, y + y_0, 1)$ is an isometry.
²³ “verschiebbares”
²⁴ “[...], wenn dieser Ausdruck zur Bezeichnung der gleichen Formverhältnisse wie in der Algebra angewandt wird”
²⁵ Recall that $r$ is the determinant of the matrix $\left( \frac{\partial x_i}{\partial x'_\alpha} \right)_{i,\alpha} = \left( u^i_\alpha \right)_{i,\alpha}$.

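such that by (14.) we have

\[
\left.
\begin{aligned}
A_{11} &= \frac12 \left( 2\,\frac{\partial^2 \omega_{23}}{\partial x_2\, \partial x_3} - \frac{\partial^2 \omega_{22}}{\partial x_3^2} - \frac{\partial^2 \omega_{33}}{\partial x_2^2} \right) \\
&\quad + \sum_{\alpha\beta} \frac{E_{\alpha\beta}}{E} \left( \left[ {2\,3 \atop \alpha} \right] \left[ {2\,3 \atop \beta} \right] - \left[ {2\,2 \atop \alpha} \right] \left[ {3\,3 \atop \beta} \right] \right), \\
A_{23} &= \frac12 \left( \frac{\partial^2 \omega_{23}}{\partial x_1^2} + \frac{\partial^2 \omega_{11}}{\partial x_2\, \partial x_3} - \frac{\partial^2 \omega_{12}}{\partial x_1\, \partial x_3} - \frac{\partial^2 \omega_{13}}{\partial x_1\, \partial x_2} \right) \\
&\quad + \sum_{\alpha\beta} \frac{E_{\alpha\beta}}{E} \left( \left[ {1\,1 \atop \alpha} \right] \left[ {2\,3 \atop \beta} \right] - \left[ {1\,2 \atop \alpha} \right] \left[ {1\,3 \atop \beta} \right] \right),
\end{aligned}
\right\}
\tag{a.}
\]
from which by cyclic permutations of 1 2 3 the other components follow. Then the transformation relations of $G_4$ become:²⁶

\[ A'_{\beta\gamma} = \sum_{ik} A_{ik}\, r^i_\beta\, r^k_\gamma. \tag{b.} \]

In a similar way the transformation relations of $F = F'$ can be replaced by:

\[ E'_{\beta\gamma} = \sum_{ik} E_{ik}\, r^i_\beta\, r^k_\gamma. \tag{c.} \]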
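Relation (c.) can be checked directly for $n = 3$ under the assumption, suggested by the appearance of $E_{\alpha\beta}/E$ in (a.), that $E_{ik}$ denotes the cofactor of $\omega_{ik}$ in $E = \det(\omega_{ik})$. The sketch below uses hypothetical helper names and arbitrary integer data; `cofactors` builds the signed $2 \times 2$ minors from cyclic index triples, mirroring the positive permutations used to define $r^i_\alpha$. It also checks two determinant facts about these minors: the determinant of the matrix of minors equals $r^2$, and the column expansion $\sum_i u^i_\beta r^i_\gamma = r\,\delta_{\beta\gamma}$.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cofactors(m):
    """Signed 2x2 minors: entry [i][a] is the cofactor of m[i][a]."""
    def c(i, a):
        i1, i2 = (i + 1) % 3, (i + 2) % 3   # (i, i1, i2) is a cyclic triple
        a1, a2 = (a + 1) % 3, (a + 2) % 3   # (a, a1, a2) is a cyclic triple
        return m[i1][a1] * m[i2][a2] - m[i1][a2] * m[i2][a1]
    return [[c(i, a) for a in range(3)] for i in range(3)]


def congruence(m, t):
    """Entry [b][g] is sum_{i,k} m[i][k] * t[i][b] * t[k][g]."""
    return [[sum(m[i][k] * t[i][b] * t[k][g]
                 for i in range(3) for k in range(3))
             for g in range(3)] for b in range(3)]


omega = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # hypothetical symmetric coefficients
u = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]       # hypothetical substitution u[i][alpha]

r = det3(u)
r_minors = cofactors(u)
omega_new = congruence(omega, u)            # transformation relations (F.)
E, E_new = cofactors(omega), cofactors(omega_new)

# Column expansion: sum_i u[i][b] * r_minors[i][g] equals r if b == g, else 0.
col = [[sum(u[i][b] * r_minors[i][g] for i in range(3)) for g in range(3)]
       for b in range(3)]

print("(c.) holds:", E_new == congruence(E, r_minors))
print("det of minors equals r^2:", det3(r_minors) == r * r)
```

The check of (c.) rests on the multiplicativity of cofactor matrices, so it holds for any integer choice of `omega` and `u`, not only for the sample data.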

Furthermore, because of (19.) we have

\[ A_{ki} = A_{ik} \tag{d.} \]
and $A_{ik}$ changes sign if we exchange $i_1$ with $i_2$ or $k_1$ with $k_2$.
The equations (b.) and (c.) are nothing but transformation relations for the simultaneous transformations of the quadratic forms associated to $F$,
\[ \Gamma = \sum_{ik} A_{ik} X_i X_k, \qquad \Phi = \sum_{ik} E_{ik} X_i X_k \]
into

\[ \Gamma' = \sum_{ik} A'_{ik}\, \Xi_i \Xi_k, \qquad \Phi' = \sum_{ik} E'_{ik}\, \Xi_i \Xi_k \]
via the substitution
\[ X_i = \sum_\beta r^i_\beta\, \Xi_\beta, \tag{e.} \]
with determinant $R = r^2$

²⁶ This is a consequence of the first equation of section 12. and the definition of the $r^i_\alpha$.

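and inverse²⁷
\[ \Xi_\beta = \sum_i \frac{1}{r}\, u^i_\beta\, X_i. \tag{e'.} \]
It is known that this transformation problem can be solved if four simultaneous invariants and three corresponding forms exist that are independent with respect to the variables $X_i$, $\Xi_\beta$ and the coefficients of $\Gamma$ and $\Phi$. These yield three absolute invariants; if these are independent with respect to the variables $x_1$, $x_2$, $x_3$, then by the statement from section 10. it is only necessary to determine $G_5$ to solve the problem. By section 6. we find

\[
(g i_1 i_2 k_1 k_2) = \frac{\partial (i_1 i_2 k_1 k_2)}{\partial x_g} - \sum_\lambda \left[ \left\{ {g\, i_1 \atop \lambda} \right\} (\lambda i_2 k_1 k_2) + \left\{ {g\, i_2 \atop \lambda} \right\} (i_1 \lambda k_1 k_2) + \left\{ {g\, k_1 \atop \lambda} \right\} (i_1 i_2 \lambda k_2) + \left\{ {g\, k_2 \atop \lambda} \right\} (i_1 i_2 k_1 \lambda) \right].
\]

In this summation we only need to take the values $i$ and $i_1$ for $\lambda$ in the first term, as $(i_2 i_2 k_1 k_2) = 0$. Using similar considerations for the other three terms, the cases $\lambda = i$ and $\lambda = k$ being identical, we obtain

\[
\begin{aligned}
(g i_1 i_2 k_1 k_2) &= \frac{\partial (i_1 i_2 k_1 k_2)}{\partial x_g} - (i_1 i_2 k_1 k_2) \left[ \left\{ {g\, i_1 \atop i_1} \right\} + \left\{ {g\, i_2 \atop i_2} \right\} + \left\{ {g\, k_1 \atop k_1} \right\} + \left\{ {g\, k_2 \atop k_2} \right\} \right] \\
&\quad + \left\{ {g\, i_1 \atop i} \right\} (i_2 i k_1 k_2) + \left\{ {g\, i_2 \atop i} \right\} (i i_1 k_1 k_2) + \left\{ {g\, k_1 \atop k} \right\} (i_1 i_2 k_2 k) + \left\{ {g\, k_2 \atop k} \right\} (i_1 i_2 k k_1).
\end{aligned}
\]

Now as $i_1 i_2 i$, $i_2 i i_1$, and $k_1 k_2 k$, $k_2 k k_1$ together with $i i_1 i_2$ and $k k_1 k_2$ are positive permutations of 1 2 3, we obtain after adding $\left\{ {g\, i \atop i} \right\}$, $\left\{ {g\, k \atop k} \right\}$ in $[\,\dots\,]$:

\[
\begin{aligned}
(g i_1 i_2 k_1 k_2) &= \frac{\partial A_{ik}}{\partial x_g} - 2 A_{ik} \sum_r \left\{ {g\, r \atop r} \right\} \\
&\quad + \left\{ {g\, i \atop i} \right\} A_{ik} + \left\{ {g\, i_1 \atop i} \right\} A_{i_1 k} + \left\{ {g\, i_2 \atop i} \right\} A_{i_2 k} + \left\{ {g\, k \atop k} \right\} A_{ik} + \left\{ {g\, k_1 \atop k} \right\} A_{i k_1} + \left\{ {g\, k_2 \atop k} \right\} A_{i k_2}.
\end{aligned}
\]

We denote this expression, which is completely determined by $g$, $i$, and $k$, by $A_{gik}$, such that $(g i_1 i_2 k_1 k_2) = A_{gik}$, and using the easily derived formula

\[ 2 \sum_r \left\{ {g\, r \atop r} \right\} = \frac{1}{E} \frac{\partial E}{\partial x_g} \]
we find
\[ A_{gik} = \frac{\partial A_{ik}}{\partial x_g} - \frac{A_{ik}}{E} \frac{\partial E}{\partial x_g} + \sum_\lambda \left( \left\{ {g\, \lambda \atop i} \right\} A_{\lambda k} + \left\{ {g\, \lambda \atop k} \right\} A_{i \lambda} \right). \tag{f.} \]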
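The formula for the contracted symbol that leads to (f.) can be derived in two lines. The sketch below assumes, as the notation in (a.) suggests, that $E = \det(\omega_{ik})$, that $E_{rs}$ is the cofactor of $\omega_{rs}$ (so that $E_{rs}/E$ are the entries of the inverse matrix), and that the symbol in braces is obtained from the bracket symbol by contraction with $E_{rs}/E$; the explicit expression for the bracket symbol is the usual one and is not restated in this passage.

```latex
\begin{align*}
\sum_r \left\{ {g\,r \atop r} \right\}
  &= \sum_{r,s} \frac{E_{rs}}{E}\,\frac12 \left(
       \frac{\partial \omega_{gs}}{\partial x_r}
     + \frac{\partial \omega_{rs}}{\partial x_g}
     - \frac{\partial \omega_{gr}}{\partial x_s} \right) \\
  &= \frac{1}{2E} \sum_{r,s} E_{rs}\, \frac{\partial \omega_{rs}}{\partial x_g}
   = \frac{1}{2E}\, \frac{\partial E}{\partial x_g}.
\end{align*}
```

In the first line the first and third terms cancel after summation, because $E_{rs}$ is symmetric in $r$ and $s$ while their difference is antisymmetric. The last equality is the rule for differentiating a determinant, $\partial E = \sum_{r,s} E_{rs}\, \partial \omega_{rs}$.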

²⁷ By Cramer's rule for the 3 × 3 matrix $\left( u^i_\alpha \right)_{i,\alpha}$ with determinant $r$.

The expression $A_{gik}$ remains the same if we exchange $i$ and $k$, and changes sign if we exchange $i_1$ with $i_2$ or $k_1$ with $k_2$. Using this notation, the equation $G_5 = G_5'$ becomes
\[ A'_{\alpha\beta\gamma} = \sum_{g i_1 i_2 k_1 k_2} A_{gik}\, u^g_\alpha\, u^{i_1}_{\beta_1} u^{i_2}_{\beta_2} u^{k_1}_{\gamma_1} u^{k_2}_{\gamma_2} \]
which simplifies to
\[ A'_{\alpha\beta\gamma} = \sum_{gik} A_{gik}\, u^g_\alpha\, r^i_\beta\, r^k_\gamma \tag{g.} \]
and leads to a peculiar result. If for the forms $\Gamma$ and $\Phi$ we let $U_1$, $U_2$, $U_3$ be the original and $V_1$, $V_2$, $V_3$ be the new variables, such that the substitution corresponding to (e.) becomes

\[ V_\alpha = \sum_g r^g_\alpha\, U_g, \tag{h.} \]
with inverse
\[ U_g = \sum_\alpha \frac{1}{r}\, u^g_\alpha\, V_\alpha, \tag{h'.} \]
then we obtain from (g.) by multiplying with $V_\alpha \Xi_\beta \Xi_\gamma$ and summing over $\alpha$, $\beta$, and $\gamma$:
\[ \sum_{\alpha\beta\gamma} A'_{\alpha\beta\gamma}\, V_\alpha \Xi_\beta \Xi_\gamma = r \sum_{gik} A_{gik}\, U_g X_i X_k. \]

As $r = R^{1/2}$ is a power of the determinant of the substitution, and in both sums both the original variables and those of the associated forms occur,²⁸ we arrive at the following statement:
To determine the possibility of the transformation of a quadratic differential expression
\[ F = \sum \omega_{ik}\, \partial x_i\, \partial x_k \]
into
\[ F' = \sum \omega'_{ik}\, \partial x'_i\, \partial x'_k, \]
use the coefficients of $F$ to build three algebraic forms
\[ \Gamma = \sum A_{ik} X_i X_k, \qquad \Phi = \sum E_{ik} X_i X_k, \qquad \Theta = \sum A_{gik}\, U_g X_i X_k, \]
with variables $X$ and $U$, and similarly for $F'$ the forms $\Gamma'$, $\Phi'$, and $\Theta'$ with variables $\Xi$ and $V$. Now the conditions necessary and sufficient for the transformation of $F$ into $F'$ are precisely those for the existence of a substitution

\[ X_i = \sum_\alpha r^i_\alpha\, \Xi_\alpha \]

²⁸ “[...], so müssen dieselben entsprechende simultane Zwischenformen sein”

which transforms $\Gamma$ into $\Gamma'$, $\Phi$ into $\Phi'$, and via

\[ V_\alpha = \sum_i r^i_\alpha\, U_i \]
yields
\[ \Theta' = R^{1/2}\, \Theta. \]
For this statement to be applicable, it is necessary that the six absolute associated forms and invariants of $\Gamma$ and $\Phi$ are mutually independent functions of the variables $x_1$, $x_2$, $x_3$, $U_1$, $U_2$, $U_3$.
If this condition is satisfied, then the equations between the original and transformed covariants of $\Gamma$ and $\Phi$ yield the coefficients $\frac{1}{r} u^i_\beta$ of the inverse substitution, so as $r$ is determined by the invariants, the equations yield $u^i_\beta$, and in such a way that
\[ u^i_\beta = \frac{\partial x_i}{\partial x'_\beta}. \]
If the condition is not satisfied, then it follows that the domain of the variables $x_1$, $x_2$, and $x_3$ can be translated into itself without changing $F$.
From section 10. we know that under the conditions of this statement, the necessary and sufficient conditions for the transformation of $F$ into $F'$ are given by the simultaneous invariants of $F$, $G_4$, $G_5$ and their transforms. This result also shows why in the above statement, where we solve for the coefficients $u^i_\alpha$ or $r^i_\alpha$ of the substitution, we do not consider the integrability conditions between these coefficients.²⁹
In the above statement it is demanded that the coefficients $A_{gik}$ may be expressed in terms of the coefficients of $\Gamma$ and $\Phi$, and similarly for $A'_{gik}$, such that 1) $\Theta$ is a Zwischenform of $\Gamma$ and $\Phi$, 2) $\Theta'$ is a simultaneous Zwischenform of $\Gamma'$ and $\Phi'$, and 3) $\Theta' = R^{1/2}\, \Theta$. From the exponent of $R$ we see that these conditions cannot be satisfied when $A_{gik}$ is a rational function of $A_{ik}$ and $E_{ik}$. On the other hand, these conditions are certainly satisfied whenever $F'$ is not arbitrary but obtained from $F$ through a direct substitution, such as the identity $x'_i = x_i$. Hence we find:
For every quadratic differential expression $F$, we can find the coefficients $A_{gik}$ defined in (f.) as irrational functions of the coefficients of $\Gamma$ and $\Phi$, such that $\Theta$ is a simultaneous Zwischenform of $\Gamma$ and $\Phi$, which is related to its transform by $\Theta' = R^{1/2}\, \Theta$. Adding to this the equations $\Gamma' = \Gamma$, $\Phi' = \Phi$ and

\[ \sum \pm\, r^1_1\, r^2_2\, r^3_3 = R, \]
we retain 31 transformation relations from which we, even though the variables of $\Theta$ are subject to different substitutions, can solve the 9 substitution coefficients $r^i_\alpha$ in such a way that they have the invariant form

\[ I' = R^{\lambda} I. \]

²⁹ “Dieses Resultat setzt ebenso wie der obige Satz voraus, dass bei der Elimination der Substitutionscoefficienten $u^i_\alpha$ oder $r^i_\alpha$, aus welcher die in Rede stehenden algebraischen Grundformen hervorgehen, auf Integrabilitätsbedingungen zwischen denselben keinerlei Rücksicht genommen werde.”

In the usual case where the transformations from $\Gamma$ and $\Phi$ fully determine the substitution, the same result holds for $\Gamma$, $\Phi$, and $\Theta$. Then the number of independent invariants 1) from $\Gamma$ and $\Phi$ alone is equal to 4, 2) from $\Gamma$, $\Phi$, and $\Theta$ together is equal to 22, so 3) the number of them which necessarily contain coefficients of $\Theta$ equals 18. Using the terminology of section 10., for the current case the differential expression $F$ has 22 invariants, 21 absolute invariants, 3 independent forms and an equal number of covariants.
About the differential expression $F$ given by the square of the line element in three-dimensional space, which is not covered by the statement from this section, there is a treatment made by Riemann,³⁰ for which Mr. Dedekind is treating the analytical backgrounds.³¹

3rd of January 1869.

³⁰ Ueber die Hypothesen, welche der Geometrie zu Grunde liegen. Abh. der Göttinger Ges. d. W., 1867, Band XIII.
³¹ “[...], zu welcher Herr Dedekind die dort unterdrückten analytischen Entwicklungen in Aussicht gestellt hat.”

Bibliography

[Chr1869] E. B. Christoffel: Ueber die Transformation der homogenen Differentialausdrücke zweiten Grades, Reine Angewandte Mathematik 70 (1869), pages 46–70.
[Bou1947] N. Bourbaki: Éléments de Mathématique VI, Livre II: Algèbre, Chapitre II, Libraire Scientifique Hermann et Cie Paris (1947).
[Car1951] E. Cartan: Géométrie des Espaces de Riemann, deuxième édition, Gauthier-Villars, Paris (1951).
[Bou1953] N. Bourbaki: Éléments de Mathématique XV, Livre V: Espaces Vectoriels Topologiques, Chapitre I–II, Libraire Scientifique Hermann et Cie Paris (1953).
[Bou1955] N. Bourbaki: Éléments de Mathématique XVIII, Livre V: Espaces Vectoriels Topologiques, Chapitre III–V, Libraire Scientifique Hermann et Cie Paris (1955).
[Hus1965] T. Husain: The Open Mapping and Closed Graph Theorems in Topological Vector Spaces, Oxford Mathematical Monographs (1965).
[Ham1982] R. S. Hamilton: The inverse function theorem of Nash and Moser, Bulletin of the American Mathematical Society 7 (1982), pages 65–222.
[Wal1984] R. M. Wald: General Relativity, The University of Chicago Press (1984), ISBN 0226870332.
[DK2000] J. J. Duistermaat, J. A. C. Kolk: Lie Groups, Springer (2000), ISBN 3540152938.
[Mun2000] J. R. Munkres: Topology (second edition), Prentice Hall (2000), ISBN 0131784498.
[Dui2003] J. J. Duistermaat: Functies en Reeksen, lecture notes of Utrecht University (2003).
[DK2004I] J. J. Duistermaat, J. A. C. Kolk: Multidimensional Real Analysis I, Cambridge University Press (2004), ISBN 0521551145.


[DK2004II] J. J. Duistermaat, J. A. C. Kolk: Multidimensional Real Analysis II, Cambridge University Press (2004), ISBN 0521829259.
[Dui2006] J. J. Duistermaat: Klassieke Mechanica, lecture notes of Utrecht University (2006).
[Ban2008] E. P. van den Ban: Riemannian geometry, lecture notes of Utrecht University (2008).

[StAndrews] Biography of Elwin Bruno Christoffel, from http://www-history.mcs.st-andrews.ac.uk/Biographies/Christoffel.html, School of Mathematics and Statistics, University of St Andrews, Scotland, 1997.
