
Ordinary Differential Equations

Peter Philip∗

∗E-Mail: [email protected]

Lecture Notes Originally Created for the Class of Spring Semester 2012 at LMU Munich, Revised and Extended for Several Subsequent Classes

September 19, 2021

Contents

1 Basic Notions
  1.1 Types and First Examples
  1.2 Equivalent Integral Equation
  1.3 Patching and Time Reversion

2 Elementary Solution Methods
  2.1 Geometric Interpretation, Graphing
  2.2 Linear ODE, Variation of Constants
  2.3 Separation of Variables
  2.4 Change of Variables

3 General Theory
  3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
  3.2 Existence of Solutions
  3.3 Uniqueness of Solutions
  3.4 Extension of Solutions, Maximal Solutions
  3.5 Continuity in Initial Conditions

4 Linear ODE
  4.1 Definition, Setting
  4.2 Gronwall’s Inequality
  4.3 Existence, Uniqueness, Vector Space of Solutions
  4.4 Fundamental Matrix Solutions and Variation of Constants
  4.5 Higher-Order, Wronskian
  4.6 Constant Coefficients
    4.6.1 Linear ODE of Higher Order
    4.6.2 Systems of First-Order Linear ODE

5 Stability
  5.1 Qualitative Theory, Phase Portraits
  5.2 Stability at Fixed Points
  5.3 Constant Coefficients
  5.4 Linearization
  5.5 Limit Sets

A Differentiability

B K^n-Valued Integration
  B.1 K^n-Valued Riemann Integral
  B.2 K^n-Valued Lebesgue Integral

C Spaces
  C.1 Distance in Metric Spaces

D Local Lipschitz Continuity

E Maximal Solutions on Nonopen Intervals

F Paths in R^n

G Operator Norms and Matrix Norms

H The Vandermonde Determinant

I Matrix-Valued Functions
  I.1 Product Rule
  I.2 Integration and Matrix Multiplication Commute

J Autonomous ODE
  J.1 Equivalence of Autonomous and Nonautonomous ODE
  J.2 Integral for ODE with Discontinuous Right-Hand Side

K Polar Coordinates

References

1 Basic Notions

1.1 Types of Ordinary Differential Equations (ODE) and First Examples

A differential equation is an equation for some unknown function, involving one or more derivatives of the unknown function. Here are some first examples:

\[ y' = y, \tag{1.1a} \]
\[ y^{(5)} = (y')^2 + \pi x, \tag{1.1b} \]
\[ (y')^2 = c, \tag{1.1c} \]
\[ \partial_t x = e^{2\pi i t}\, x^2, \tag{1.1d} \]
\[ x'' = -3x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \tag{1.1e} \]

One distinguishes between ordinary differential equations (ODE) and partial differential equations (PDE). While ODE contain only derivatives with respect to one variable, PDE can contain (partial) derivatives with respect to several different variables. In general, PDE are much harder to solve than ODE. The equations in (1.1) are all ODE, and only ODE are the subject of this class. We will see precise definitions shortly, but we can already use the examples in (1.1) to get some first exposure to important ODE-related terms and to discuss related issues. As in (1.1), the notation for the unknown function varies in the literature, where the two variants presented in (1.1) are probably the most common ones: In the first three equations of (1.1), the unknown function is denoted $y$, usually assumed to depend on a variable denoted $x$, i.e. $x \mapsto y(x)$. In the last two equations of (1.1), the unknown function is denoted $x$, usually assumed to depend on a variable denoted $t$, i.e. $t \mapsto x(t)$. So one has to use some care due to the different roles of the symbol $x$. The notation $t \mapsto x(t)$ is typically favored in situations arising from physics applications, where $t$ represents time. In this class, we will mostly use the notation $x \mapsto y(x)$.

There is another, in a way slightly more serious, notational issue that one commonly encounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d) is not entirely correct, as functions and function arguments are not properly distinguished. Correctly written, (1.1b) and (1.1d) read

\[ \forall x \in D(y):\quad y^{(5)}(x) = \bigl(y'(x)\bigr)^2 + \pi x, \tag{1.2a} \]
\[ \forall t \in D(x):\quad (\partial_t x)(t) = e^{2\pi i t}\,\bigl(x(t)\bigr)^2, \tag{1.2b} \]

where $D(y)$ and $D(x)$ denote the respective domains of the functions $y$ and $x$. However, one might also notice that the notation in (1.2) is more cumbersome and, perhaps, harder to read. In any case, the type of slight abuse of notation present in (1.1b) and (1.1d) is so common in the literature that one will have to live with it. One speaks of first-order ODE if the equations involve only first derivatives, such as in (1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the precise order is given by the highest derivative occurring in the equation, such that (1.1b) is an ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th. 3.1 that ODE of higher order can be equivalently formulated and solved as systems of ODE of first order, where systems of ODE obviously consist of several ODE to be solved simultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODE in higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODE of second order or as the system

\[ x_1'' = -3x_1 - 1, \tag{1.3a} \]
\[ x_2'' = -3x_2 + 1 \tag{1.3b} \]
of two one-dimensional ODE of second order. One calls an ODE explicit if it has been solved explicitly for the highest-order derivative, otherwise implicit. Thus, in (1.1), all ODE except (1.1c) are explicit. In general, explicit ODE are much easier to solve than implicit ODE (which include, e.g., so-called differential-algebraic equations, cf. Ex. 1.4(g) below), and we will mostly consider explicit ODE in this class. As the reader might already have noticed, without further information, none of the equations in (1.1) makes much sense. Every function, in particular, every function solving an ODE, needs a set as the domain where it is defined, and a set as the range it maps into. Thus, for each ODE, one needs to specify the admissible domains as well as the range of the unknown function. For an ODE, one usually requires a solution to be defined on a nontrivial (bounded or unbounded) interval $I \subseteq R$. Prescribing the possible range of the solution is an integral part of setting up an ODE, and it often completely determines the ODE's meaning and/or its solvability. For example, for (1.1d), (a subset of) $C$ is a reasonable range. Similarly, for (1.1a)–(1.1c), one can require the range to be either $R$ or $C$, where requiring range $R$ for (1.1c) immediately implies there is no solution for $c < 0$. However, one can also specify (a subset of) $R^n$ or $C^n$, $n > 1$, as range for (1.1a), turning the ODE into an $n$-dimensional ODE (or a system of ODE), where $y$ now has $n$ components $(y_1, \dots, y_n)$ (note that, except in cases where we are dealing with matrix multiplications, we sometimes denote elements of $R^n$ as columns and sometimes as rows, switching back and forth without too much care). A reasonable range for (1.1e) is (a subset of) $R^2$ or $C^2$.

One of the important goals regarding ODE is to find conditions under which one can guarantee the existence of solutions. Moreover, if possible, one would like to find conditions that guarantee the existence of a unique solution. Clearly, for each $a \in R$, the function $y : R \longrightarrow R$, $y(x) = a e^x$, is a solution to (1.1a), showing one cannot expect uniqueness without specifying further requirements. The most common additional conditions that often (but not always) guarantee a unique solution are initial conditions (e.g. requiring $y(x_0) = y_0$ with $x_0, y_0$ given) or boundary conditions (e.g. requiring $y(a) = y_a$, $y(b) = y_b$ for $y : [a,b] \longrightarrow C^n$ with $y_a, y_b \in C^n$ given). Let us now proceed to mathematically precise definitions of the abovementioned notions.

Notation 1.1. We will write $K$ in situations where we allow $K$ to be $R$ or $C$.

Definition 1.2. Let $k, n \in N$.

(a) Given $U \subseteq R \times K^{(k+1)n}$ and $F : U \longrightarrow K^n$, call
\[ F\bigl(x, y, y', \dots, y^{(k)}\bigr) = 0 \tag{1.4} \]

an implicit ODE of $k$th order. A solution to this ODE is a $k$ times differentiable function
\[ \phi : I \longrightarrow K^n, \tag{1.5} \]
defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval $I \subseteq R$, satisfying the two conditions

(i) $\bigl\{\bigl(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\bigr) \in I \times K^{(k+1)n} : x \in I\bigr\} \subseteq U$.

(ii) $F\bigl(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\bigr) = 0$ for each $x \in I$.

Note that condition (i) is necessary so that one can even formulate condition (ii).

(b) Given $G \subseteq R \times K^{kn}$ and $f : G \longrightarrow K^n$, call
\[ y^{(k)} = f\bigl(x, y, y', \dots, y^{(k-1)}\bigr) \tag{1.6} \]

an explicit ODE of $k$th order. A solution to this ODE is a $k$ times differentiable function $\phi$ as in (1.5), defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval $I \subseteq R$, satisfying the two conditions

(i) $\bigl\{\bigl(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\bigr) \in I \times K^{kn} : x \in I\bigr\} \subseteq G$.

(ii) $\phi^{(k)}(x) = f\bigl(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x)\bigr)$ for each $x \in I$.

Again, note that condition (i) is necessary so that one can even formulate condition (ii). Also note that $\phi$ is a solution to (1.6) if, and only if, $\phi$ is a solution to the equivalent implicit ODE $y^{(k)} - f\bigl(x, y, y', \dots, y^{(k-1)}\bigr) = 0$.

Definition 1.3. Let $k, n \in N$.

(a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the initial condition

\[ \forall j \in \{0, \dots, k-1\}:\quad y^{(j)}(x_0) = y_{0,j}, \tag{1.7} \]
with given $x_0 \in R$ and $y_{0,0}, \dots, y_{0,k-1} \in K^n$. A solution $\phi$ to the initial value problem is a $k$ times differentiable function $\phi$ as in (1.5) that is a solution to the ODE and that also satisfies (1.7) (with $y$ replaced by $\phi$) – in particular, this requires $x_0 \in I$.

(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the boundary condition

\[ \forall j \in J_a:\quad y^{(j)}(a) = y_{a,j} \qquad\text{and}\qquad \forall j \in J_b:\quad y^{(j)}(b) = y_{b,j}, \tag{1.8} \]
with given $a, b \in R$, $a < b$, index sets $J_a, J_b \subseteq \{0, \dots, k-1\}$, and given values $y_{a,j}, y_{b,j} \in K^n$.

Under suitable hypotheses, initial and boundary value problems for ODE have unique solutions (for initial value problems, we will see some rather general results in Cor. 3.10 and Cor. 3.16 below). However, in general, they can have infinitely many solutions or no solutions, as shown by Examples 1.4(b),(c),(e) below.

Example 1.4. (a) Let $k \in N$. The function $\phi : R \longrightarrow K$, $\phi(x) = a e^x$, $a \in K$, is a solution to the $k$th-order explicit initial value problem

\[ y^{(k)} = y, \tag{1.9a} \]
\[ \forall j \in \{0, \dots, k-1\}:\quad y^{(j)}(0) = a. \tag{1.9b} \]
We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that $\phi$ is the unique solution to (1.9) on $R$.

(b) Consider the one-dimensional explicit first-order initial value problem

\[ y' = \sqrt{|y|}, \tag{1.10a} \]
\[ y(0) = 0. \tag{1.10b} \]

Then, for every $c \geq 0$, the function
\[ \phi_c : R \longrightarrow R, \quad \phi_c(x) := \begin{cases} 0 & \text{for } x \leq c, \\ \frac{(x-c)^2}{4} & \text{for } x \geq c, \end{cases} \tag{1.11} \]

is a solution to (1.10): Clearly, $\phi_c(0) = 0$, $\phi_c$ is differentiable, and

\[ \phi_c' : R \longrightarrow R, \quad \phi_c'(x) := \begin{cases} 0 & \text{for } x \leq c, \\ \frac{x-c}{2} & \text{for } x \geq c, \end{cases} \tag{1.12} \]
solving the ODE. Thus, (1.10) is an example of an initial value problem with uncountably many different solutions, all defined on the same domain.
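The branch computation above can also be delegated to a computer algebra system. The following Python/sympy snippet is a small illustrative sketch (added here, not part of the original notes); it substitutes $t := x - c > 0$ and confirms that $\phi_c$ satisfies (1.10a) on $x > c$:

\begin{verbatim}
# Illustrative sympy check that phi_c from (1.11) solves y' = sqrt(|y|)
# on the branch x > c; with t := x - c > 0, phi_c = t^2/4 and |phi_c| = phi_c.
import sympy as sp

t = sp.symbols('t', positive=True)        # t = x - c > 0
phi = t**2 / 4                            # phi_c on x > c, cf. (1.11)
ode_residual = sp.diff(phi, t) - sp.sqrt(phi)
print(sp.simplify(ode_residual))          # prints 0: (1.10a) holds for x > c
\end{verbatim}

On $x < c$, both sides of (1.10a) vanish, and at $x = c$ the one-sided derivatives agree, in accordance with (1.12).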

(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has no real-valued solution for $c < 0$. For $c \geq 0$, every function $\phi : R \longrightarrow R$, $\phi(x) := a \pm x\sqrt{c}$, $a \in R$, is a solution to (1.1c). Moreover, for $c < 0$, every function $\phi : R \longrightarrow C$, $\phi(x) := a \pm x i \sqrt{-c}$, $a \in C$, is a $C$-valued solution to (1.1c). The one-dimensional implicit first-order ODE

\[ e^{y'} = 0 \tag{1.13} \]

is an example of an ODE that does not even have a $C$-valued solution. It is an exercise to find $f : R \longrightarrow R$ such that the explicit ODE $y' = f(x)$ has no solution.

(d) Let $n \in N$ and let $a, c \in K^n$. Then the function
\[ \phi : R \longrightarrow K^n, \quad \phi(x) := c + xa, \tag{1.14} \]
is the unique solution on $R$ to the $n$-dimensional explicit first-order initial value problem

\[ y' = a, \tag{1.15a} \]
\[ y(0) = c. \tag{1.15b} \]

This situation is a special case of Ex. 1.6 below.

(e) Let $a, b \in R$, $a < b$, and consider the one-dimensional second-order ODE
\[ y'' = -y, \tag{1.16} \]
whose set $L$ of solutions $\phi : [a,b] \longrightarrow K$ is given by

\[ L = \bigl\{ (c_1 \sin + c_2 \cos) : [a,b] \longrightarrow K \;:\; c_1, c_2 \in K \bigr\}. \tag{1.17} \]

In consequence, the boundary value problem

\[ y(0) = 0, \qquad y(\pi/2) = 1, \tag{1.18a} \]

for (1.16) has the unique solution $\phi : [0, \pi/2] \longrightarrow K$, $\phi(x) := \sin x$ (using (1.18a) and (1.17) implies $c_2 = 0$ and $c_1 = 1$); the boundary value problem

\[ y(0) = 0, \qquad y(\pi) = 0, \tag{1.18b} \]

for (1.16) has the infinitely many different solutions $\phi_c : [0, \pi] \longrightarrow K$, $\phi_c(x) := c \sin x$, $c \in K$; and the boundary value problem
\[ y(0) = 0, \qquad y(\pi) = 1, \tag{1.18c} \]

for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory requirements $c_2 = 0$ and $c_2 = -1$).

(f) Consider
\[ F : R \times K^2 \times K^2 \longrightarrow K^2, \quad F\left(x, \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right) := \begin{pmatrix} z_2 \\ z_2 - 1 \end{pmatrix}. \tag{1.19a} \]
Clearly, the implicit $K^2$-valued ODE

\[ F(x, y, y') = \begin{pmatrix} y_2' \\ y_2' - 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{1.19b} \]
has no solution on any nontrivial interval.

(g) Consider

\[ F : R \times C^3 \times C^3 \longrightarrow C^3, \quad F\left(x, \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}\right) := \begin{pmatrix} z_2 + i y_3 - 2i \\ z_1 + y_2 - x^2 \\ y_1 - i e^{-ix} \end{pmatrix}. \tag{1.20a} \]
It is an exercise to show the $C^3$-valued implicit ODE

\[ F(x, y, y') = \begin{pmatrix} y_2' + i y_3 - 2i \\ y_1' + y_2 - x^2 \\ y_1 - i e^{-ix} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{1.20b} \]
has a unique solution on $R$ (note that, here, we do not need to provide initial or boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is an example of a differential-algebraic equation, since, read in components, only its first two equations contain derivatives, whereas its third equation is purely algebraic.

1.2 Equivalent Integral Equation

It is often useful to rewrite a first-order explicit initial value problem as an equivalent integral equation. We provide the details of this equivalence in the following theorem:

Theorem 1.5. If $G \subseteq R \times K^n$, $n \in N$, and $f : G \longrightarrow K^n$ is continuous, then, for each $(x_0, y_0) \in G$, the explicit $n$-dimensional first-order initial value problem
\[ y' = f(x, y), \tag{1.21a} \]

\[ y(x_0) = y_0, \tag{1.21b} \]
is equivalent to the integral equation

\[ y(x) = y_0 + \int_{x_0}^x f\bigl(t, y(t)\bigr)\,dt, \tag{1.22} \]
in the sense that a continuous function $\phi : I \longrightarrow K^n$, with $x_0 \in I \subseteq R$ being a nontrivial interval, and $\phi$ satisfying

\[ \bigl\{\bigl(x, \phi(x)\bigr) \in I \times K^n : x \in I\bigr\} \subseteq G, \tag{1.23} \]
is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,

\[ \forall x \in I:\quad \phi(x) = y_0 + \int_{x_0}^x f\bigl(t, \phi(t)\bigr)\,dt, \tag{1.24} \]
i.e. if, and only if, $\phi$ is a solution to the integral equation (1.22).

n Proof. Assume I R with x0 I to be a nontrivial interval and φ : I K to be a continuous⊆ function, satisfying∈ (1.23). If φ is a solution to (1.21), then−→φ is differentiable and the assumed continuity of f implies the continuity of φ′. In other 1 words, each component φj of φ, j = 1,...,n , is in C (I, K). Thus, the fundamental theorem of calculus [Phi16a, Th. 10.20(b)]{ applies,} and [Phi16a, (10.51b)] yields

\[ \forall x \in I\ \forall j \in \{1, \dots, n\}:\quad \phi_j(x) = \phi_j(x_0) + \int_{x_0}^x f_j\bigl(t, \phi(t)\bigr)\,dt \overset{(1.21b)}{=} y_{0,j} + \int_{x_0}^x f_j\bigl(t, \phi(t)\bigr)\,dt, \tag{1.25} \]
proving $\phi$ satisfies (1.24). Conversely, if $\phi$ satisfies (1.24), then the validity of the initial condition (1.21b) is immediate. Moreover, as $f$ and $\phi$ are continuous, so is the integrand function $t \mapsto f\bigl(t, \phi(t)\bigr)$ of (1.24). Thus, [Phi16a, Th. 10.20(a)] applies to (the components of) $\phi$, proving $\phi'(x) = f\bigl(x, \phi(x)\bigr)$ for each $x \in I$, i.e. $\phi$ is a solution to (1.21). $\square$

Example 1.6. Consider the situation of Th. 1.5. In the particularly simple special case where $f$ does not actually depend on $y$, but merely on $x$, the equivalence between (1.21) and (1.22) can be directly exploited to actually solve the initial value problem: If $f : I \longrightarrow K^n$, where $I \subseteq R$ is some nontrivial interval with $x_0 \in I$, then $\phi : I \longrightarrow K^n$ is a solution of (1.21) if, and only if,
\[ \forall x \in I:\quad \phi(x) = y_0 + \int_{x_0}^x f(t)\,dt, \tag{1.26} \]
i.e. if, and only if, $\phi$ is the antiderivative of $f$ that satisfies the initial condition. In particular, in the present situation, $\phi$ as given by (1.26) is the unique solution to the initial value problem. Of course, depending on $f$, it can still be difficult to carry out the integral in (1.26).
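Numerically, the integral in (1.26) can be evaluated by quadrature. The following Python sketch is an added illustration (the concrete $f$, $x_0$, $y_0$ are arbitrary choices, not from the text); it computes $\phi$ pointwise with scipy:

\begin{verbatim}
# Pointwise evaluation of phi(x) = y0 + int_{x0}^x f(t) dt, cf. (1.26),
# for an f depending only on x; f, x0, y0 are illustrative choices.
import numpy as np
from scipy.integrate import quad

x0, y0 = 0.0, 1.0
f = lambda t: np.exp(-t**2)             # right-hand side f(x)

def phi(x):
    val, _ = quad(f, x0, x)             # numerical antiderivative
    return y0 + val

print(phi(1.0))                         # approx 1.7468
\end{verbatim}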

1.3 Patching and Time Reversion

If solutions defined on different intervals fit together, then they can be patched to obtain a solution on the union of the two intervals:

Lemma 1.7 (Patching of Solutions). Let $k, n \in N$. Given $G \subseteq R \times K^{kn}$ and $f : G \longrightarrow K^n$, if $\phi : I \longrightarrow K^n$ and $\psi : J \longrightarrow K^n$ are both solutions to (1.6), i.e. to
\[ y^{(k)} = f\bigl(x, y, y', \dots, y^{(k-1)}\bigr), \]

such that $I = \,]a,b]$, $J = [b,c[$, $a < b < c$, and

\[ \forall j \in \{0, \dots, k-1\}:\quad \phi^{(j)}(b) = \psi^{(j)}(b), \tag{1.27} \]
then
\[ \sigma : I \cup J \longrightarrow K^n, \quad \sigma(x) := \begin{cases} \phi(x) & \text{for } x \in I, \\ \psi(x) & \text{for } x \in J, \end{cases} \tag{1.28} \]
is also a solution to (1.6).

Proof. Since φ and ψ both are solutions to (1.6),

\[ \bigl\{\bigl(x, \sigma(x), \sigma'(x), \dots, \sigma^{(k-1)}(x)\bigr) \in (I \cup J) \times K^{kn} : x \in I \cup J\bigr\} \subseteq G \tag{1.29} \]
must hold, where (1.27) guarantees that $\sigma^{(j)}(b)$ exists for each $j = 0, \dots, k-1$. Moreover, $\sigma$ is $k$ times differentiable at each $x \in I \cup J$, $x \neq b$, and
\[ \forall x \in I \cup J,\ x \neq b:\quad \sigma^{(k)}(x) = f\bigl(x, \sigma(x), \sigma'(x), \dots, \sigma^{(k-1)}(x)\bigr). \tag{1.30} \]

However, at b, we also have (using the left-hand derivatives for φ and the right-hand derivatives for ψ)

\[ \phi^{(k)}(b) = f\bigl(b, \phi(b), \phi'(b), \dots, \phi^{(k-1)}(b)\bigr) = f\bigl(b, \psi(b), \psi'(b), \dots, \psi^{(k-1)}(b)\bigr) = \psi^{(k)}(b), \tag{1.31} \]
which shows $\sigma$ is $k$ times differentiable and the equality of (1.30) also holds at $x = b$, completing the proof that $\sigma$ is a solution. $\square$

It is sometimes useful to apply what is known as time reversion:

Definition 1.8. Let $k, n \in N$, $G_f \subseteq R \times K^{kn}$, $f : G_f \longrightarrow K^n$, and consider the ODE
\[ y^{(k)} = f\bigl(x, y, y', \dots, y^{(k-1)}\bigr). \tag{1.32} \]

We call the ODE
\[ y^{(k)} = g\bigl(x, y, y', \dots, y^{(k-1)}\bigr), \tag{1.33} \]
where

\[ g : G_g \longrightarrow K^n, \quad g(x, y) := (-1)^k\, f\bigl(-x, y_1, -y_2, \dots, (-1)^{k-1} y_k\bigr), \tag{1.34a} \]
\[ G_g := \bigl\{(x, y) \in R \times K^{kn} : \bigl(-x, y_1, -y_2, \dots, (-1)^{k-1} y_k\bigr) \in G_f\bigr\}, \tag{1.34b} \]
the time-reversed version of (1.32).

Lemma 1.9 (Time Reversion). Let $k, n \in N$, $G_f \subseteq R \times K^{kn}$, and $f : G_f \longrightarrow K^n$.

(a) The time-reversed version of (1.33) is the original ODE, i.e. (1.32).

(b) If $a < b$, then $\phi : \,]a,b[ \longrightarrow K^n$ is a solution to (1.32) if, and only if,
\[ \psi : \,]-b,-a[ \longrightarrow K^n, \quad \psi(x) := \phi(-x), \tag{1.35} \]
is a solution to (1.33).

Proof. (a) is immediate from the definition of $g$ in (1.34).

(b): Due to (a), it suffices to show that if $\phi$ is a solution to (1.32), then $\psi$ is a solution to (1.33). Clearly, if $x \in \,]-b,-a[$, then $-x \in \,]a,b[$. Moreover, noting
\[ \forall j \in \{0, \dots, k\}\ \forall x \in \,]-b,-a[:\quad \psi^{(j)}(x) = (-1)^j\, \phi^{(j)}(-x), \tag{1.36a} \]

one has, for each $x \in \,]-b,-a[$,
\[ \bigl(-x, \phi(-x), \phi'(-x), \dots, \phi^{(k-1)}(-x)\bigr) \in G_f \;\Rightarrow\; \bigl(x, \psi(x), \psi'(x), \dots, \psi^{(k-1)}(x)\bigr) = \bigl(x, \phi(-x), -\phi'(-x), \dots, (-1)^{k-1}\phi^{(k-1)}(-x)\bigr) \in G_g \tag{1.36b} \]
and
\[ \psi^{(k)}(x) = (-1)^k\, f\bigl(-x, \phi(-x), \phi'(-x), \dots, \phi^{(k-1)}(-x)\bigr) = (-1)^k\, f\bigl(-x, \psi(x), -\psi'(x), \dots, (-1)^{k-1}\psi^{(k-1)}(x)\bigr) = g\bigl(x, \psi(x), \psi'(x), \dots, \psi^{(k-1)}(x)\bigr), \tag{1.36c} \]
thereby establishing the case. $\square$

2 Elementary Solution Methods for 1-Dimensional First-Order ODE

2.1 Geometric Interpretation, Graphing

Geometrically, in the one-dimensional real-valued case, the ODE (1.21a) provides a slope $y' = f(x, y)$ for every point $(x, y)$. In other words, it provides a field of directions. The task is to find a differentiable function $\phi$ such that its graph has the prescribed slope at each point it contains. In certain simple cases, drawing the field of directions can help to guess the solutions of the ODE.

Example 2.1. Let $G := R^+ \times R$ and $f : G \longrightarrow R$, $f(x, y) := y/x$, i.e. we consider the ODE $y' = y/x$. Drawing the field of directions leads to the idea that the solutions are functions whose graphs constitute rays, i.e. $\phi_c : R^+ \longrightarrow R$, $y = \phi_c(x) = cx$ with $c \in R$. Indeed, one immediately verifies that each $\phi_c$ constitutes a solution to the ODE.
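As an illustration (a sketch added here, not contained in the original notes), the direction field of Ex. 2.1 can be drawn with a few lines of Python; the plotted rays $\phi_c(x) = cx$ visibly follow the prescribed slopes:

\begin{verbatim}
# Direction field for y' = y/x on (part of) R^+ x R, together with some
# ray solutions phi_c(x) = c*x, cf. Ex. 2.1.
import numpy as np
import matplotlib.pyplot as plt

X, Y = np.meshgrid(np.linspace(0.2, 3.0, 15), np.linspace(-2.0, 2.0, 15))
S = Y / X                               # prescribed slope f(x, y) = y/x
L = np.hypot(1.0, S)                    # length of direction vector (1, S)
plt.quiver(X, Y, 1.0 / L, S / L, angles='xy')
xs = np.linspace(0.2, 3.0, 50)
for c in (-1.0, -0.5, 0.5, 1.0):        # solutions phi_c(x) = c*x
    plt.plot(xs, c * xs)
plt.xlabel('x'); plt.ylabel('y'); plt.show()
\end{verbatim}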

2.2 Linear ODE, Variation of Constants

Definition 2.2. Let $I \subseteq R$ be an open interval and let $a, b : I \longrightarrow K$ be continuous functions. An ODE of the form
\[ y' = a(x)\,y + b(x) \tag{2.1} \]
is called a linear ODE of first order. It is called homogeneous if, and only if, $b \equiv 0$; it is called inhomogeneous if, and only if, it is not homogeneous.

Theorem 2.3 (Variation of Constants). Let $I \subseteq R$ be an open interval and let $a, b : I \longrightarrow K$ be continuous. Moreover, let $x_0 \in I$ and $c \in K$. Then the linear ODE (2.1) has a unique solution $\phi : I \longrightarrow K$ that satisfies the initial condition $y(x_0) = c$. This unique solution is given by
\[ \phi : I \longrightarrow K, \quad \phi(x) = \phi_0(x)\left(c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t)\,dt\right), \tag{2.2a} \]
where
\[ \phi_0 : I \longrightarrow K, \quad \phi_0(x) = \exp\left(\int_{x_0}^x a(t)\,dt\right) = e^{\int_{x_0}^x a(t)\,dt}. \tag{2.2b} \]
Here, and in the following, $\phi_0^{-1}$ denotes $1/\phi_0$ and not the inverse function of $\phi_0$ (which does not even necessarily exist).

Proof. We begin by noting that $\phi_0$ according to (2.2b) is well-defined, since $a$ is assumed to be continuous, i.e., in particular, Riemann integrable on $[x_0, x]$. Moreover, the fundamental theorem of calculus [Phi16a, Th. 10.20(a)] applies, showing $\phi_0$ is differentiable with
\[ \phi_0' : I \longrightarrow K, \quad \phi_0'(x) = a(x)\exp\left(\int_{x_0}^x a(t)\,dt\right) = a(x)\,\phi_0(x), \tag{2.3} \]
where Lem. A.1 of the Appendix was used as well. In particular, $\phi_0$ is continuous. Since $\phi_0 \neq 0$ as well, $\phi_0^{-1}$ is also continuous. Moreover, as $b$ is continuous by hypothesis, $\phi_0^{-1} b$ is continuous and, thus, Riemann integrable on $[x_0, x]$. Once again, [Phi16a, Th. 10.20(a)] applies, yielding $\phi$ to be differentiable with $\phi' : I \longrightarrow K$,
\[ \phi'(x) = \phi_0'(x)\left(c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t)\,dt\right) + \phi_0(x)\,\phi_0(x)^{-1}\, b(x) = a(x)\,\phi_0(x)\left(c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t)\,dt\right) + b(x) = a(x)\,\phi(x) + b(x), \tag{2.4} \]
where the product rule of [Phi16a, Th. 9.7(c)] was used as well. Comparing (2.4) with (2.1) shows $\phi$ is a solution to (2.1). The computation
\[ \phi(x_0) = \phi_0(x_0)(c + 0) = 1 \cdot c = c \tag{2.5} \]
verifies that $\phi$ satisfies the desired initial condition. It remains to prove uniqueness. To this end, let $\psi : I \longrightarrow K$ be an arbitrary differentiable function that satisfies (2.1) as well as the initial condition $\psi(x_0) = c$. We have to show $\psi = \phi$. Since $\phi_0 \neq 0$, we can define $u := \psi/\phi_0$ and still have to verify
\[ \forall x \in I:\quad u(x) = c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t)\,dt. \tag{2.6} \]

We obtain
\[ a\,\phi_0\, u + b = a\,\psi + b = \psi' = (\phi_0\, u)' = \phi_0'\, u + \phi_0\, u' = a\,\phi_0\, u + \phi_0\, u', \tag{2.7} \]
implying $b = \phi_0\, u'$ and $u' = \phi_0^{-1}\, b$. Thus, the fundamental theorem of calculus in the form [Phi16a, Th. 10.20(b)] implies
\[ \forall x \in I:\quad u(x) = u(x_0) + \int_{x_0}^x u'(t)\,dt = c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t)\,dt, \tag{2.8} \]
thereby completing the proof. $\square$

Corollary 2.4. Let $I \subseteq R$ be an open interval and let $a : I \longrightarrow K$ be continuous. Moreover, let $x_0 \in I$ and $c \in K$. Then the homogeneous linear ODE (2.1) (i.e. with $b \equiv 0$) has a unique solution $\phi : I \longrightarrow K$ that satisfies the initial condition $y(x_0) = c$. This unique solution is given by
\[ \phi(x) = c\,\exp\left(\int_{x_0}^x a(t)\,dt\right) = c\, e^{\int_{x_0}^x a(t)\,dt}. \tag{2.9} \]

Proof. One immediately obtains (2.9) by setting $b \equiv 0$ in (2.2). $\square$

Remark 2.5. The name variation of constants for Th. 2.3 can be understood from comparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2) of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying the constant $c$, i.e. by replacing it with the function $x \mapsto c + \int_{x_0}^x \phi_0(t)^{-1}\, b(t)\,dt$.

Example 2.6. Consider the ODE
\[ y' = 2xy + x^3 \tag{2.10} \]
with initial condition $y(0) = c$, $c \in C$. Comparing (2.10) with Def. 2.2, we observe we are facing an inhomogeneous linear ODE with
\[ a : R \longrightarrow R, \quad a(x) := 2x, \tag{2.11a} \]
\[ b : R \longrightarrow R, \quad b(x) := x^3. \tag{2.11b} \]
From Cor. 2.4, we obtain the solution $\phi_{0,c}$ to the homogeneous version of (2.10):
\[ \phi_{0,c} : R \longrightarrow C, \quad \phi_{0,c}(x) = c\,\exp\left(\int_0^x a(t)\,dt\right) = c\, e^{x^2}. \tag{2.12} \]
The solution to (2.10) is given by (2.2a): $\phi : R \longrightarrow C$,
\[
\begin{aligned}
\phi(x) &= e^{x^2}\left(c + \int_0^x e^{-t^2}\, t^3\,dt\right) = e^{x^2}\left(c + \left[-\tfrac{1}{2}\,(t^2+1)\,e^{-t^2}\right]_0^x\right) \\
&= e^{x^2}\left(c + \tfrac{1}{2} - \tfrac{1}{2}\,(x^2+1)\,e^{-x^2}\right) = \left(c + \tfrac{1}{2}\right)e^{x^2} - \tfrac{1}{2}\,(x^2+1).
\end{aligned} \tag{2.13}
\]
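One can let sympy reproduce the computation of Example 2.6; the following snippet (an added sketch, not part of the notes) evaluates the variation-of-constants formula (2.2) and verifies both the closed form (2.13) and the ODE (2.10):

\begin{verbatim}
# Variation of constants (2.2) for y' = 2xy + x^3, y(0) = c, cf. Ex. 2.6.
import sympy as sp

x, t, c = sp.symbols('x t c')
phi0 = sp.exp(x**2)                                    # (2.2b): exp(int_0^x 2t dt)
phi = phi0 * (c + sp.integrate(sp.exp(-t**2) * t**3, (t, 0, x)))   # (2.2a)
closed_form = (c + sp.Rational(1, 2)) * sp.exp(x**2) - (x**2 + 1) / 2
print(sp.simplify(phi - closed_form))                  # 0: phi equals (2.13)
print(sp.simplify(sp.diff(phi, x) - (2*x*phi + x**3))) # 0: phi solves (2.10)
\end{verbatim}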

2.3 Separation of Variables

If the ODE (1.21a) has the particular form

\[ y' = f(x)\,g(y), \tag{2.14} \]

with one-dimensional real-valued continuous functions $f$ and $g$, where $g(y) \neq 0$, then it can be solved by a method known as separation of variables:

Theorem 2.7. Let $I, J \subseteq R$ be (bounded or unbounded) open intervals and suppose that $f : I \longrightarrow R$ and $g : J \longrightarrow R$ are continuous with $g(y) \neq 0$ for each $y \in J$. For each $(x_0, y_0) \in I \times J$, consider the initial value problem consisting of the ODE (2.14) together with the initial condition
\[ y(x_0) = y_0. \tag{2.15} \]
Define the functions
\[ F : I \longrightarrow R, \quad F(x) := \int_{x_0}^x f(t)\,dt, \qquad G : J \longrightarrow R, \quad G(y) := \int_{y_0}^y \frac{dt}{g(t)}. \tag{2.16} \]

(a) Uniqueness: On each open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $F(I') \subseteq G(J)$, the initial value problem consisting of (2.14) and (2.15) has a unique solution. This unique solution is given by

\[ \phi : I' \longrightarrow R, \quad \phi(x) := G^{-1}\bigl(F(x)\bigr), \tag{2.17} \]
where $G^{-1} : G(J) \longrightarrow J$ is the inverse function of $G$ on $G(J)$.

(b) Existence: There exists an open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $F(I') \subseteq G(J)$, i.e. an $I'$ such that (a) applies.

Proof. (a): We begin by proving $G$ has a differentiable inverse function $G^{-1} : G(J) \longrightarrow J$. According to the fundamental theorem of calculus [Phi16a, Th. 10.20(a)], $G$ is differentiable with $G' = 1/g$. Since $g$ is continuous and nonzero, $G$ is even $C^1$. If $G'(y_0) = 1/g(y_0) > 0$, then $G$ is strictly increasing on $J$ (due to the intermediate value theorem [Phi16a, Th. 7.57]; $g(y_0) > 0$, the continuity of $g$, and $g \neq 0$ imply that $g > 0$ on $J$). Analogously, if $G'(y_0) = 1/g(y_0) < 0$, then $G$ is strictly decreasing on $J$. In each case, $G$ has a differentiable inverse function on $G(J)$ by [Phi16a, Th. 9.9]. In the next step, we verify that (2.17) does, indeed, define a solution to (2.14) and (2.15). The assumption $F(I') \subseteq G(J)$ and the existence of $G^{-1}$ as shown above provide that $\phi$ is well-defined by (2.17). Verifying (2.15) is quite simple: $\phi(x_0) = G^{-1}(F(x_0)) =$

$G^{-1}(0) = y_0$. To see that $\phi$ is a solution of (2.14), notice that (2.17) implies $F = G \circ \phi$ on $I'$. Thus, we can apply the chain rule to obtain the derivative of $F = G \circ \phi$ on $I'$:
\[ \forall x \in I':\quad f(x) = F'(x) = G'\bigl(\phi(x)\bigr)\,\phi'(x) = \frac{\phi'(x)}{g\bigl(\phi(x)\bigr)}, \tag{2.18} \]
showing $\phi$ satisfies (2.14). We now proceed to show that each solution $\phi : I' \longrightarrow R$ to (2.14) that satisfies (2.15) must also satisfy (2.17). Since $\phi$ is a solution to (2.14),

\[ \frac{\phi'(x)}{g\bigl(\phi(x)\bigr)} = f(x) \quad\text{for each } x \in I'. \tag{2.19} \]

Integrating (2.19) yields 

\[ \int_{x_0}^x \frac{\phi'(t)}{g\bigl(\phi(t)\bigr)}\,dt = \int_{x_0}^x f(t)\,dt = F(x) \quad\text{for each } x \in I'. \tag{2.20} \]
Using the change of variables formula of [Phi16a, Th. 10.25] in the left-hand side of (2.20) allows one to replace $\phi(t)$ by the new integration variable $u$ (note that each solution $\phi : I' \longrightarrow R$ to (2.14) is in $C^1(I')$, since $f$ and $g$ are presumed continuous). Thus, we obtain from (2.20):

\[ F(x) = \int_{\phi(x_0)}^{\phi(x)} \frac{du}{g(u)} = \int_{y_0}^{\phi(x)} \frac{du}{g(u)} = G\bigl(\phi(x)\bigr) \quad\text{for each } x \in I'. \tag{2.21} \]
Applying $G^{-1}$ to (2.21) establishes that $\phi$ satisfies (2.17).

(b): During the proof of (a), we have already seen $G$ to be either strictly increasing or strictly decreasing. As $G(y_0) = 0$, this implies the existence of $\epsilon > 0$ such that $]-\epsilon, \epsilon[\, \subseteq G(J)$. The function $F$ is differentiable and, in particular, continuous. Since $F(x_0) = 0$, there is $\delta > 0$ such that, for $I' := \,]x_0 - \delta, x_0 + \delta[$, one has $F(I') \subseteq \,]-\epsilon, \epsilon[\, \subseteq G(J)$, as desired. $\square$

Example 2.8. Consider the ODE
\[ y' = -\frac{y}{x} \quad\text{on } I \times J := R^+ \times R^+ \tag{2.22} \]
with the initial condition $y(1) = c$ for some given $c \in R^+$. Introducing functions
\[ f : R^+ \longrightarrow R, \quad f(x) := -\frac{1}{x}, \qquad g : R^+ \longrightarrow R, \quad g(y) := y, \tag{2.23} \]

one sees that Th. 2.7 applies. To compute the solution $\phi = G^{-1} \circ F$, we first have to determine $F$ and $G$:
\[ F : R^+ \longrightarrow R, \quad F(x) = \int_1^x f(t)\,dt = -\int_1^x \frac{dt}{t} = -\ln x, \tag{2.24a} \]
\[ G : R^+ \longrightarrow R, \quad G(y) = \int_c^y \frac{dt}{g(t)} = \int_c^y \frac{dt}{t} = \ln\frac{y}{c}. \tag{2.24b} \]
Here, we can choose $I' = I = R^+$, because $F(R^+) = R = G(R^+)$. That means $\phi$ is defined on the entire interval $I$. The inverse function of $G$ is given by
\[ G^{-1} : R \longrightarrow R^+, \quad G^{-1}(t) = c\,e^t. \tag{2.25} \]
Finally, we get
\[ \phi : R^+ \longrightarrow R, \quad \phi(x) = G^{-1}\bigl(F(x)\bigr) = c\,e^{-\ln x} = \frac{c}{x}. \tag{2.26} \]
The uniqueness part of Th. 2.7 further tells us the above initial value problem can have no solution different from $\phi$. —
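The computation of $F$, $G$, and $G^{-1}$ in Example 2.8 can likewise be carried out symbolically; the following sympy sketch (added here for illustration) retraces Th. 2.7 and recovers $\phi(x) = c/x$:

\begin{verbatim}
# Separation of variables via Th. 2.7 for y' = -y/x, y(1) = c, cf. Ex. 2.8.
import sympy as sp

x, y, t = sp.symbols('x y t', positive=True)
c = sp.symbols('c', positive=True)
F = sp.integrate(-1/t, (t, 1, x))          # F(x) = -ln x, cf. (2.24a)
G = sp.integrate(1/t, (t, c, y))           # G(y) = ln(y/c), cf. (2.24b)
Ginv = sp.solve(sp.Eq(G, t), y)[0]         # G^{-1}(t) = c*exp(t), cf. (2.25)
phi = Ginv.subs(t, F)                      # phi(x) = G^{-1}(F(x))
print(sp.simplify(phi))                    # c/x, cf. (2.26)
\end{verbatim}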

The advantage of using Th. 2.7 as in the previous example, by computing the relevant functions $F$, $G$, and $G^{-1}$, is that it is mathematically rigorous. In particular, one can be sure one has found the unique solution to the ODE with initial condition. However, in practice, it is often easier to use the following heuristic (not entirely rigorous) procedure. In the end, in most cases, one can easily check by differentiation that the function found is, indeed, a solution to the ODE with initial condition. However, one does not know uniqueness without further investigations (general results such as Th. 3.15 below can often help). One also has to determine on which interval the found solution is defined. On the other hand, as one is usually interested in choosing the interval as large as possible, the optimal choice is not always obvious when using Th. 2.7, either. The heuristic procedure is as follows: Start with the ODE (2.14) written in the form
\[ \frac{dy}{dx} = f(x)\,g(y). \tag{2.27a} \]
Multiply by $dx$ and divide by $g(y)$ (i.e. separate the variables):
\[ \frac{dy}{g(y)} = f(x)\,dx. \tag{2.27b} \]
Integrate:
\[ \int \frac{dy}{g(y)} = \int f(x)\,dx. \tag{2.27c} \]

Change the integration variables and supply the appropriate upper and lower limits for the integrals (according to the initial condition):

\[ \int_{y_0}^y \frac{dt}{g(t)} = \int_{x_0}^x f(t)\,dt. \tag{2.27d} \]
Solve this equation for $y$, set $\phi(x) := y$, check by differentiation that $\phi$ is, indeed, a solution to the ODE, and determine the largest interval $I'$ such that $x_0 \in I'$ and such that $\phi$ is defined on $I'$. The use of this heuristic procedure is demonstrated by the following example:

Example 2.9. Consider the ODE

\[ y' = -y^2 \quad\text{on } I \times J := R \times R \tag{2.28} \]

with the initial condition $y(x_0) = y_0$ for given values $x_0, y_0 \in R$. We manipulate (2.28) according to the heuristic procedure described in (2.27) above:

\[
\begin{aligned}
\frac{dy}{dx} = -y^2 \;&\Rightarrow\; y^{-2}\,dy = -\,dx \;\Rightarrow\; \int y^{-2}\,dy = -\int dx \;\Rightarrow\; \int_{y_0}^{y} t^{-2}\,dt = -\int_{x_0}^{x} dt \\
&\Rightarrow\; \frac{1}{y_0} - \frac{1}{y} = -[t]_{x_0}^{x} = x_0 - x \;\Rightarrow\; \phi(x) = y = \frac{y_0}{1 + (x - x_0)\,y_0}.
\end{aligned} \tag{2.29}
\]

Clearly, φ(x0)= y0. Moreover,

\[ \phi'(x) = -\frac{y_0^2}{\bigl(1 + (x - x_0)\,y_0\bigr)^2} = -\bigl(\phi(x)\bigr)^2, \tag{2.30} \]
i.e. $\phi$ does, indeed, provide a solution to (2.28). If $y_0 = 0$, then $\phi \equiv 0$ is defined on the entire interval $I = R$. If $y_0 \neq 0$, then the denominator of $\phi(x)$ has a zero at $x = (x_0 y_0 - 1)/y_0$, and $\phi$ is not defined on all of $R$. In that case, if $y_0 > 0$, then $x_0 > (x_0 y_0 - 1)/y_0 = x_0 - 1/y_0$ and the maximal open interval for $\phi$ to be defined on is $I' = \,]x_0 - 1/y_0, \infty[$; if $y_0 < 0$, then $x_0 < (x_0 y_0 - 1)/y_0 = x_0 - 1/y_0$ and the maximal open interval for $\phi$ to be defined on is $I' = \,]-\infty, x_0 - 1/y_0[$. Note that the formula for $\phi$ obtained by (2.29) works for $y_0 = 0$ as well, even though not every previous expression in (2.29) is meaningful for $y_0 = 0$ and, also, Th. 2.7 does not apply to (2.28) for $y_0 = 0$. In the present example, the subsequent Th. 3.15 does, indeed, imply $\phi$ to be the unique solution to the initial value problem on $I'$.
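The maximal interval just derived can be observed numerically. In the following illustrative sketch (added here; the data $x_0 = 0$, $y_0 = -1$ are an arbitrary choice), a standard solver tracks the solution $\phi(x) = -1/(1-x)$ toward its blow-up point at $x = 1$:

\begin{verbatim}
# Blow-up in Ex. 2.9: for y' = -y^2, y(0) = -1, (2.29) gives
# phi(x) = -1/(1 - x) on ]-infinity, 1[, exploding as x -> 1.
import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda x, y: -y**2, (0.0, 0.999), [-1.0],
                rtol=1e-10, atol=1e-12)
exact = -1.0 / (1.0 - sol.t)            # phi from (2.29) with x0 = 0, y0 = -1
print(np.max(np.abs(sol.y[0] - exact))) # small error despite |y| ~ 10^3
\end{verbatim}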

2.4 Change of Variables

To solve an ODE, it can be useful to transform it into an equivalent ODE, using a so-called change of variables. If one already knows how to solve the transformed ODE, then the equivalence allows one to also solve the original ODE. We first present Th. 2.10, which constitutes the basis for the change of variables technique, followed by examples where the technique is applied.

Theorem 2.10. Let $G \subseteq R \times K^n$ be open, $n \in N$, $f : G \longrightarrow K^n$, and $(x_0, y_0) \in G$. Define
\[ \forall x \in R:\quad G_x := \{y \in K^n : (x, y) \in G\} \tag{2.31} \]
and assume the change of variables function $T : G \longrightarrow K^n$ is differentiable and such that

\[ \forall x \text{ with } G_x \neq \emptyset:\quad T_x := T(x, \cdot) : G_x \longrightarrow T_x(G_x), \quad T_x(y) := T(x, y), \text{ is a diffeomorphism}, \tag{2.32} \]
i.e. $T_x$ is invertible and both $T_x$ and $T_x^{-1}$ are differentiable. Then the first-order initial value problems

\[ y' = f(x, y), \tag{2.33a} \]

\[ y(x_0) = y_0, \tag{2.33b} \]
and

\[ y' = \bigl(DT_x^{-1}(y)\bigr)^{-1}\, f\bigl(x, T_x^{-1}(y)\bigr) + \partial_x T\bigl(x, T_x^{-1}(y)\bigr), \tag{2.34a} \]
\[ y(x_0) = T(x_0, y_0), \tag{2.34b} \]
are equivalent in the following sense:

(a) A differentiable function $\phi : I \longrightarrow K^n$, where $I \subseteq R$ is a nontrivial interval, is a solution to (2.33a) if, and only if, the function

\[ \mu : I \longrightarrow K^n, \quad \mu(x) := T\bigl(x, \phi(x)\bigr) = T_x\bigl(\phi(x)\bigr), \tag{2.35} \]
is a solution to (2.34a).

(b) A differentiable function $\phi : I \longrightarrow K^n$, where $I \subseteq R$ is a nontrivial interval, is a solution to (2.33) if, and only if, the function of (2.35) is a solution to (2.34).

Proof. We start by noting that the assumption of $G$ being open clearly implies each $G_x$, $x \in R$, to be open as well, which, in turn, implies $T_x(G_x)$ to be open, even though this is not as obvious¹. Next, for each $x \in R$ such that $G_x \neq \emptyset$, we can apply the chain rule [Phi16b, Th. 4.31] to $T_x \circ T_x^{-1} = \operatorname{Id}$ to obtain
\[ \forall y \in T_x(G_x):\quad DT_x\bigl(T_x^{-1}(y)\bigr) \circ DT_x^{-1}(y) = \operatorname{Id} \tag{2.36} \]
and, thus, each $DT_x^{-1}(y)$ is invertible with

\[ \forall y \in T_x(G_x):\quad \bigl(DT_x^{-1}(y)\bigr)^{-1} = DT_x\bigl(T_x^{-1}(y)\bigr). \tag{2.37} \]
Consider $\phi$ and $\mu$ as in (a) and notice that (2.35) implies

\[ \forall x \in I:\quad \phi(x) = T_x^{-1}\bigl(\mu(x)\bigr). \tag{2.38} \]
Moreover, the differentiability of $\phi$ and $T$ implies differentiability of $\mu$ by the chain rule, which also yields

\[ \forall x \in I:\quad \mu'(x) = DT\bigl(x, \phi(x)\bigr)\begin{pmatrix} 1 \\ \phi'(x) \end{pmatrix} = DT_x\bigl(\phi(x)\bigr)\,\phi'(x) + \partial_x T\bigl(x, \phi(x)\bigr). \tag{2.39} \]

To prove (a), first assume $\phi : I \longrightarrow K^n$ to be a solution of (2.33a). Then, for each $x \in I$,
\[ \mu'(x) \overset{(2.39),(2.33a)}{=} DT_x\bigl(\phi(x)\bigr)\, f\bigl(x, \phi(x)\bigr) + \partial_x T\bigl(x, \phi(x)\bigr) \]

\[ \overset{(2.38)}{=} DT_x\bigl(T_x^{-1}(\mu(x))\bigr)\, f\bigl(x, T_x^{-1}(\mu(x))\bigr) + \partial_x T\bigl(x, T_x^{-1}(\mu(x))\bigr) \overset{(2.37)}{=} \bigl(DT_x^{-1}(\mu(x))\bigr)^{-1}\, f\bigl(x, T_x^{-1}(\mu(x))\bigr) + \partial_x T\bigl(x, T_x^{-1}(\mu(x))\bigr), \tag{2.40} \]
showing $\mu$ satisfies (2.34a). Conversely, assume $\mu$ to be a solution to (2.34a). Then, for each $x \in I$,
\[ \bigl(DT_x^{-1}(\mu(x))\bigr)^{-1}\, f\bigl(x, T_x^{-1}(\mu(x))\bigr) + \partial_x T\bigl(x, T_x^{-1}(\mu(x))\bigr) \]

\[ \overset{(2.34a)}{=} \mu'(x) \overset{(2.39)}{=} DT_x\bigl(\phi(x)\bigr)\,\phi'(x) + \partial_x T\bigl(x, \phi(x)\bigr). \tag{2.41} \]

¹ If $T_x$ is a continuously differentiable map, then this is related to the inverse function theorem (see, e.g., [Phi16b, Cor. 4.51]); it is still true if $T_x$ is merely continuous and injective, but then it is the invariance of domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer fixed-point theorem [Oss09, 5.6.10], and is much harder to prove.

Using (2.38), one can subtract the second summand from (2.41). Multiplying the result by $DT_x^{-1}\bigl(\mu(x)\bigr)$ from the left and taking into account (2.37) then provides

\[ \forall x \in I:\quad \phi'(x) = f\bigl(x, T_x^{-1}(\mu(x))\bigr) \overset{(2.38)}{=} f\bigl(x, \phi(x)\bigr), \tag{2.42} \]
showing $\phi$ satisfies (2.33a). It remains to prove (b). If $\phi$ satisfies (2.33), then $\mu$ satisfies (2.34a) by (a). Moreover, $\mu(x_0) = T\bigl(x_0, \phi(x_0)\bigr) = T(x_0, y_0)$, i.e. $\mu$ satisfies (2.34b) as well. Conversely, assume $\mu$ satisfies (2.34). Then $\phi$ satisfies (2.33a) by (a). Moreover, by (2.38), $\phi(x_0) = T_{x_0}^{-1}\bigl(\mu(x_0)\bigr) = T_{x_0}^{-1}\bigl(T(x_0, y_0)\bigr) = y_0$, showing $\phi$ satisfies (2.33b) as well. $\square$

As a first application of Th. 2.10, we prove the following theorem about so-called Bernoulli differential equations: Theorem 2.11. Consider the Bernoulli differential equation

\[ y' = f(x, y) := a(x)\,y + b(x)\,y^\alpha, \tag{2.43a} \]
where $\alpha \in R \setminus \{0, 1\}$, the functions $a, b : I \longrightarrow R$ are continuous and defined on an open interval $I \subseteq R$, and $f : I \times R^+ \longrightarrow R$. For (2.43a), we add the initial condition
\[ y(x_0) = y_0, \quad (x_0, y_0) \in I \times R^+, \tag{2.43b} \]
and, furthermore, we also consider the corresponding linear initial value problem

\[ y' = (1 - \alpha)\bigl(a(x)\,y + b(x)\bigr), \tag{2.44a} \]
\[ y(x_0) = y_0^{1-\alpha}, \tag{2.44b} \]
with its unique solution $\psi : I \longrightarrow R$ given by Th. 2.3.

(a) Uniqueness: On each open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $\psi > 0$ on $I'$, the Bernoulli initial value problem (2.43) has a unique solution. This unique solution is given by
\[ \phi : I' \longrightarrow R^+, \quad \phi(x) := \bigl(\psi(x)\bigr)^{\frac{1}{1-\alpha}}. \tag{2.45} \]

(b) Existence: There exists an open interval $I' \subseteq I$ satisfying $x_0 \in I'$ and $\psi > 0$ on $I'$, i.e. an $I'$ such that (a) applies.

Proof. (b) is immediate from Th. 2.3, since $\psi(x_0) = y_0^{1-\alpha} > 0$ and $\psi$ is continuous. To prove (a), we apply Th. 2.10 with the change of variables

\[ T : I \times R^+ \longrightarrow R^+, \quad T(x, y) := y^{1-\alpha}. \tag{2.46} \]

Then $T \in C^1(I \times R^+, R)$ with $\partial_x T \equiv 0$ and $\partial_y T(x, y) = (1 - \alpha)\,y^{-\alpha}$. Moreover,
\[ \forall x \in I:\quad T_x = S, \qquad S : R^+ \longrightarrow R^+, \quad S(y) := y^{1-\alpha}, \tag{2.47} \]
which is differentiable with the differentiable inverse function $S^{-1} : R^+ \longrightarrow R^+$, $S^{-1}(y) = y^{\frac{1}{1-\alpha}}$, $DS^{-1}(y) = (S^{-1})'(y) = \frac{1}{1-\alpha}\, y^{\frac{\alpha}{1-\alpha}}$. Thus, (2.34a) takes the form
\[ y' = \bigl(DT_x^{-1}(y)\bigr)^{-1}\, f\bigl(x, T_x^{-1}(y)\bigr) + \partial_x T\bigl(x, T_x^{-1}(y)\bigr) = (1 - \alpha)\, y^{\frac{-\alpha}{1-\alpha}} \left(a(x)\, y^{\frac{1}{1-\alpha}} + b(x)\, y^{\frac{\alpha}{1-\alpha}}\right) + 0 = (1 - \alpha)\bigl(a(x)\,y + b(x)\bigr). \tag{2.48} \]
Thus, if $I' \subseteq I$ is such that $x_0 \in I'$ and $\psi > 0$ on $I'$, then Th. 2.10 says $\phi$ defined by (2.45) must be a solution to (2.43) (note that the differentiability of $\psi$ implies the differentiability of $\phi$). On the other hand, if $\lambda : I' \longrightarrow R^+$ is an arbitrary solution to (2.43), then Th. 2.10 states $\mu := S \circ \lambda = \lambda^{1-\alpha}$ to be a solution to (2.44). The uniqueness part of Th. 2.3 then yields $\lambda^{1-\alpha} = \psi|_{I'} = \phi^{1-\alpha}$, i.e. $\lambda = \phi$. $\square$

Example 2.12. Consider the initial value problem
\[ y' = f(x, y) := i - \frac{1}{ix - y + 2}, \tag{2.49a} \]
\[ y(1) = i, \tag{2.49b} \]
where $f : G \longrightarrow C$, $G := \{(x, y) \in R \times C : ix - y + 2 \neq 0\}$ ($G$ is open as the continuous preimage of the open set $C \setminus \{0\}$). We apply the change of variables
\[ T : G \longrightarrow C, \quad T(x, y) := ix - y. \tag{2.50} \]
Then $T \in C^1(G, C)$. Moreover, for each $x \in R$,
\[ G_x = \{y \in C : (x, y) \in G\} = C \setminus \{ix + 2\} \tag{2.51} \]
and we have the diffeomorphisms
\[ T_x : C \setminus \{ix + 2\} \longrightarrow C \setminus \{-2\}, \quad T_x(y) = ix - y, \tag{2.52a} \]
\[ T_x^{-1} : C \setminus \{-2\} \longrightarrow C \setminus \{ix + 2\}, \quad T_x^{-1}(y) = ix - y. \tag{2.52b} \]
To obtain the transformed equation, we compute the right-hand side of (2.34a):

 −α 1 1 α = (1 α) y 1−α a(x) y 1−α + b(x) y1−α +0 − = (1 α) a(x) y+ b(x) .   (2.48) − ′ ′ ′ Thus, if I I is such that x0 I and ψ > 0 on I , then Th. 2.10 says φ defined by (2.45) must⊆ be a solution to (2.43)∈ (note that the differentiability of ψ implies the differentiability of φ). On the other hand, if λ : I′ R+ is an arbitrary solution to (2.43), then Th. 2.10 states µ := S λ = λ1−α to be a−→ solution to (2.44). The uniqueness 1−α ◦ 1−α part of Th. 2.3 then yields λ = ψ↾I′ = φ , i.e. λ = φ.  Example 2.12. Consider the initial value problem 1 y′ = f(x,y) := i , (2.49a) − ix y +2 − y(1) = i, (2.49b) where f : G C, G := (x,y) R C : ix y +2 =0 (G is open as the continuous preimage of the−→ open set C{ 0 ∈). We× apply the− change6 } of variables \ { } T : G C, T (x,y) := ix y. (2.50) −→ − Then, T C1(G, C). Moreover, for each x R, ∈ ∈ G = y C : (x,y) G = C ix +2 (2.51) x { ∈ ∈ } \ { } and we have the diffeomorphisms T : C ix +2 C 2 , T (y)= ix y, (2.52a) x \ { } −→ \ {− } x − T −1 : C 2 C ix +2 , T −1(y)= ix y. (2.52b) x \ {− } −→ \ { } x − To obtain the transformed equation, we compute the right-hand side of (2.34a)

−1 −1 −1 −1 DTx (y) f x,Tx (y) + ∂xT x,Tx (y)   1  1  =( 1) i + i = . (2.53) − · − y +2 y +2   2 ELEMENTARY SOLUTION METHODS 24

Thus, the transformed initial value problem is 1 y′ = , (2.54a) y +2 y(1) = T (1,i)= i i =0. (2.54b) − Using seperation of variables, one finds the solution

µ :] 1, [ ] 2, [, µ(x) := √2x +2 2, (2.55) − ∞ −→ − ∞ − to (2.54). Then Th. 2.10 implies that

\[ \phi : \,]-1, \infty[ \longrightarrow C, \quad \phi(x) := T_x^{-1}\bigl(\mu(x)\bigr) = ix - \sqrt{2x + 2} + 2, \tag{2.56} \]
is a solution to (2.49) (that $\phi$ is a solution to (2.49) can now also easily be checked directly). It will become clear from Th. 3.15 below that $\mu$ and $\phi$ are also the unique solutions to their respective initial value problems. —
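Since scipy's initial value solvers also accept complex-valued data, the solution (2.56) can be cross-checked numerically; the snippet below is an added illustration, not part of the original example:

\begin{verbatim}
# Numerical cross-check of Ex. 2.12: integrate (2.49) as a complex IVP
# and compare with phi(x) = i*x - sqrt(2x + 2) + 2 from (2.56).
import numpy as np
from scipy.integrate import solve_ivp

f = lambda x, y: 1j - 1.0 / (1j * x - y + 2.0)
sol = solve_ivp(f, (1.0, 3.0), [1j], rtol=1e-10, atol=1e-12)
exact = 1j * sol.t - np.sqrt(2.0 * sol.t + 2.0) + 2.0
print(np.max(np.abs(sol.y[0] - exact)))   # close to the solver tolerance
\end{verbatim}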

Finding a suitable change of variables to transform a given ODE such that one is in a position to solve the transformed ODE is an art, i.e. it can be very difficult to spot a useful transformation, and it takes a lot of practice and experience.

Remark 2.13. Somewhat analogous to the situation described in the paragraph before (2.27) regarding the separation of variables technique, in practice, one frequently uses a heuristic procedure to apply a change of variables, rather than appealing to the rigorous Th. 2.10. For the initial value problem $y' = f(x, y)$, $y(x_0) = y_0$, this heuristic procedure proceeds as follows:

(1) One introduces the new variable $z := T(x, y)$ and then computes $z'$, i.e. the derivative of the function $x \mapsto z(x) = T\bigl(x, y(x)\bigr)$.

(2) In the result of (1), one eliminates all occurrences of the variable $y$ by first replacing $y'$ by $f(x, y)$ and then replacing $y$ by $T_x^{-1}(z)$, where $T_x(y) := T(x, y) = z$ (i.e. one has to solve the equation $z = T(x, y)$ for $y$). One thereby obtains the transformed initial value problem $z' = g(x, z)$, $z(x_0) = T(x_0, y_0)$, with a suitable function $g$.

(3) One solves the transformed initial value problem to obtain a solution $\mu$, and then $x \mapsto \phi(x) := T_x^{-1}\bigl(\mu(x)\bigr)$ yields a candidate for a solution to the original initial value problem.

(4) One checks that $\phi$ is, indeed, a solution to $y' = f(x, y)$, $y(x_0) = y_0$.

Example 2.14. Consider

\[ f : R^+ \times R \longrightarrow R, \quad f(x, y) := 1 + \frac{y}{x} + \frac{y^2}{x^2}, \tag{2.57} \]
and the initial value problem

\[ y' = f(x, y), \qquad y(1) = 0. \tag{2.58} \]

We introduce the change of variables z := T (x,y) := y/x and proceed according to the steps of Rem. 2.13. According to (1), we compute, using the quotient rule,

\[ z'(x) = \frac{y'(x)\,x - y(x)}{x^2}. \tag{2.59} \]

According to (2), we replace $y'(x)$ by $f(x, y)$ and then replace $y$ by $T_x^{-1}(z) = xz$ to obtain the transformed initial value problem

\[ z' = \frac{1}{x}\left(1 + \frac{y}{x} + \frac{y^2}{x^2}\right) - \frac{y}{x^2} = \frac{1}{x}\,(1 + z + z^2) - \frac{z}{x} = \frac{1 + z^2}{x}, \qquad z(1) = 0/1 = 0. \tag{2.60} \]
According to (3), we next solve (2.60), e.g. by separation of variables, to obtain the solution

\[ \mu : \,\bigl]e^{-\frac{\pi}{2}}, e^{\frac{\pi}{2}}\bigr[ \longrightarrow R, \quad \mu(x) := \tan \ln x, \tag{2.61} \]
of (2.60), and

\[ \phi : \,\bigl]e^{-\frac{\pi}{2}}, e^{\frac{\pi}{2}}\bigr[ \longrightarrow R, \quad \phi(x) := x\,\mu(x) = x \tan \ln x, \tag{2.62} \]
as a candidate for a solution to (2.58). Finally, according to (4), we check that $\phi$ is, indeed, a solution to (2.58): Due to $\phi(1) = 1 \cdot \tan 0 = 0$, $\phi$ satisfies the initial condition, and due to
\[ \phi'(x) = \tan \ln x + x \cdot \frac{1}{x}\,(1 + \tan^2 \ln x) = 1 + \tan \ln x + \tan^2 \ln x = 1 + \frac{\phi(x)}{x} + \frac{\phi^2(x)}{x^2}, \tag{2.63} \]
$\phi$ satisfies the ODE.
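As a final sanity check of the heuristic procedure (an added numerical sketch, with the integration range chosen to stay inside $]e^{-\pi/2}, e^{\pi/2}[$), one can compare $\phi(x) = x \tan \ln x$ with a numerical solution of (2.58):

\begin{verbatim}
# Numerical check of Ex. 2.14: y' = 1 + y/x + y^2/x^2, y(1) = 0,
# against the candidate solution phi(x) = x * tan(ln x) from (2.62).
import numpy as np
from scipy.integrate import solve_ivp

f = lambda x, y: 1.0 + y / x + (y / x) ** 2
sol = solve_ivp(f, (1.0, 4.0), [0.0], rtol=1e-10, atol=1e-12)
exact = sol.t * np.tan(np.log(sol.t))     # (2.62); note e^{pi/2} > 4
print(np.max(np.abs(sol.y[0] - exact)))   # small
\end{verbatim}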

3 General Theory

3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE

It turns out that each one-dimensional $k$th-order ODE is equivalent to a system of $k$ first-order ODE; more generally, each $n$-dimensional $k$th-order ODE is equivalent to a $kn$-dimensional first-order ODE (i.e. to a system of $kn$ one-dimensional first-order ODE). Even though, in this class, we will mainly consider explicit ODE, we provide the equivalence also for the implicit case, as the proof is essentially the same (the explicit case is included as a special case).

Theorem 3.1. In the situation of Def. 1.2(a), i.e. $U \subseteq R \times K^{(k+1)n}$ and $F : U \longrightarrow K^n$, plus $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in R \times K^{kn}$, consider the $k$th-order initial value problem
\[ F\bigl(x, y, y', \dots, y^{(k)}\bigr) = 0, \tag{3.1a} \]
\[ \forall j \in \{0, \dots, k-1\}:\quad y^{(j)}(x_0) = y_{0,j}, \tag{3.1b} \]
and the first-order initial value problem
\[ y_1' - y_2 = 0, \quad y_2' - y_3 = 0, \quad \dots, \quad y_{k-1}' - y_k = 0, \quad F\bigl(x, y_1, \dots, y_k, y_k'\bigr) = 0, \tag{3.2a} \]

\[ y(x_0) = \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix} \tag{3.2b} \]
(note that the unknown function $y$ in (3.1) is $K^n$-valued, whereas the unknown function $y$ in (3.2) is $K^{kn}$-valued). Then both initial value problems are equivalent in the following sense:

(a) If $\phi : I \longrightarrow K^n$ is a solution to (3.1), then
\[ \Phi : I \longrightarrow K^{kn}, \quad \Phi := \begin{pmatrix} \phi \\ \phi' \\ \vdots \\ \phi^{(k-1)} \end{pmatrix}, \tag{3.3} \]
is a solution to (3.2).

(b) If $\Phi : I \longrightarrow K^{kn}$ is a solution to (3.2), then $\phi := \Phi_1$ (which is $K^n$-valued) is a solution to (3.1).

Proof. We rewrite (3.2a) as
\[ G(x, y, y') = 0, \tag{3.4} \]
where

\[ G : V \longrightarrow K^{kn}, \quad V := \bigl\{(x, y, z) \in R \times K^{kn} \times K^{kn} : (x, y, z_k) \in U\bigr\} \subseteq R \times K^{kn} \times K^{kn}, \]
\[ G_1(x, y, z) := z_1 - y_2, \quad G_2(x, y, z) := z_2 - y_3, \quad \dots, \quad G_{k-1}(x, y, z) := z_{k-1} - y_k, \quad G_k(x, y, z) := F(x, y, z_k). \tag{3.5} \]

(a): As a solution to (3.1), φ is k times differentiable and Φ is well-defined. Then (3.1b) implies (3.2b), since

\[ \Phi(x_0) = \begin{pmatrix} \phi(x_0) \\ \phi'(x_0) \\ \vdots \\ \phi^{(k-1)}(x_0) \end{pmatrix} \overset{(3.1b)}{=} \begin{pmatrix} y_{0,0} \\ \vdots \\ y_{0,k-1} \end{pmatrix}. \]
Next, Def. 1.2(a)(i) for $\phi$ implies Def. 1.2(a)(i) for $\Phi$, since

\[ \bigl\{\bigl(x, \Phi(x), \Phi'(x)\bigr) \in I \times K^{kn} \times K^{kn} : x \in I\bigr\} \overset{(3.3)}{=} \bigl\{\bigl(x, \phi(x), \phi'(x), \dots, \phi^{(k-1)}(x), \phi'(x), \dots, \phi^{(k)}(x)\bigr) : x \in I\bigr\} \overset{(x, \phi(x), \dots, \phi^{(k)}(x)) \in U}{\subseteq} V. \tag{3.6} \]
The definition of $\Phi$ in (3.3) implies

\[ \forall j \in \{1, \dots, k-1\}:\quad \Phi_j' = \bigl(\phi^{(j-1)}\bigr)' = \phi^{(j)} = \Phi_{j+1}, \]
showing $\Phi$ satisfies the first $k-1$ equations of (3.2a). As
\[ \Phi_k'(x) = \bigl(\phi^{(k-1)}\bigr)'(x) = \phi^{(k)}(x) \]

and, thus,

\[ \forall x \in I:\quad G_k\bigl(x, \Phi(x), \Phi'(x)\bigr) = F\bigl(x, \phi(x), \phi'(x), \dots, \phi^{(k)}(x)\bigr) \overset{(3.1a)}{=} 0, \tag{3.7} \]
$\Phi$ also satisfies the last equation of (3.2a).

(b): As $\Phi$ is a solution to (3.2), the first $k-1$ equations of (3.2a) imply
\[ \forall j \in \{1, \dots, k-1\}:\quad \Phi_{j+1} = \Phi_j' \overset{\phi = \Phi_1}{=} \phi^{(j)}, \]

i.e. $\phi$ is $k$ times differentiable and $\Phi$ has, once again, the form (3.3) (note $\Phi_1 = \phi$ by the definition of $\phi$). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for $\Phi$ implies Def. 1.2(a)(i) for $\phi$:
\[ \bigl\{\bigl(x, \Phi(x), \Phi'(x)\bigr) \in I \times K^{kn} \times K^{kn} : x \in I\bigr\} \subseteq V \]
and the definition of $V$ in (3.5) imply
\[ \bigl\{\bigl(x, \phi(x), \dots, \phi^{(k)}(x)\bigr) \in I \times K^{(k+1)n} : x \in I\bigr\} \subseteq U. \]
Finally, from the last equation of (3.2a), one obtains

\[ \forall x \in I:\quad F\bigl(x, \phi(x), \dots, \phi^{(k)}(x)\bigr) \overset{(3.3),(3.5)}{=} G_k\bigl(x, \Phi(x), \Phi'(x)\bigr) = 0, \]
proving $\phi$ satisfies (3.1a). $\square$

Example 3.2. The second-order initial value problem
\[ y'' = -y, \qquad y(0) = 0, \quad y'(0) = r, \quad r \in R \text{ given}, \tag{3.8} \]
is equivalent to the following system of two first-order ODE:

\[ y_1' = y_2, \quad y_2' = -y_1, \qquad y(0) = \begin{pmatrix} 0 \\ r \end{pmatrix}. \tag{3.9} \]
The solution to (3.9) is
\[ \Phi : R \longrightarrow R^2, \quad \Phi(x) = \begin{pmatrix} \Phi_1(x) \\ \Phi_2(x) \end{pmatrix} = \begin{pmatrix} r \sin x \\ r \cos x \end{pmatrix}, \tag{3.10} \]
and, thus, the solution to (3.8) is
\[ \phi : R \longrightarrow R, \quad \phi(x) = r \sin x. \tag{3.11} \]
—
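The reduction to a first-order system is also exactly what standard numerical solvers expect. The following illustrative Python sketch (added here) solves the system (3.9) and compares the first component with (3.11):

\begin{verbatim}
# Solve the first-order system (3.9) numerically and compare its first
# component with the solution phi(x) = r*sin(x) of (3.8), cf. Ex. 3.2.
import numpy as np
from scipy.integrate import solve_ivp

r = 2.0
rhs = lambda x, y: [y[1], -y[0]]          # y1' = y2, y2' = -y1, cf. (3.9)
sol = solve_ivp(rhs, (0.0, 10.0), [0.0, r], rtol=1e-10, atol=1e-12)
print(np.max(np.abs(sol.y[0] - r * np.sin(sol.t))))   # small
\end{verbatim}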

As a consequence of Th. 3.1, one can carry out much of the general theory of ODE (such as results regarding existence and uniqueness of solutions) for systems of first- order ODE, obtaining the corresponding results for higher-order ODE as a corollary. This is the strategy usually pursued in the literature and we will follow suit in this class.

3.2 Existence of Solutions

It is a rather remarkable fact that, under the very mild assumption that $f : G \longrightarrow K^n$ is a continuous function defined on an open subset $G$ of $R \times K^{kn}$ with $(x_0, y_{0,0}, \dots, y_{0,k-1}) \in G$, every initial value problem (1.7) for the $n$-dimensional explicit $k$th-order ODE (1.6) has at least one solution $\phi : I \longrightarrow K^n$, defined on a, possibly very small, open interval. This is the content of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b), we already know that uniqueness of the solution cannot be expected without stronger hypotheses. The proof of the Peano theorem requires some work. One of the key ingredients is the Arzelà-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequence of continuous functions to have a uniformly convergent subsequence (the formulation in Th. 3.7 is suitable for our purposes – many different variants of the Arzelà-Ascoli theorem exist in the literature). We begin with some preliminaries from the theory of metric spaces. At this point, the reader might want to review the definition of a metric, a metric space, and basic notions on metric spaces, such as the notion of compactness and the notion of continuity of functions between metric spaces. Also recall that every normed space is a metric space via the metric induced by the norm (in particular, if we use metric notions on normed spaces, they are always meant with respect to the respective induced metric).

Notation 3.3. Let $(X, d)$ be a metric space. Given $x \in X$ and $r \in R^+$, let
\[ B_r(x) := \{y \in X : d(x, y) < r\} \]
denote the open ball with center $x$ and radius $r$.

Definition 3.4. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. We say a sequence of functions $(f_m)_{m \in N}$, $f_m : X \longrightarrow Y$, converges uniformly to a function $f : X \longrightarrow Y$ if, and only if,
\[ \forall \epsilon > 0\ \exists N \in N\ \forall m \geq N\ \forall x \in X:\quad d_Y\bigl(f_m(x), f(x)\bigr) < \epsilon. \]

Theorem 3.5. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. If the sequence $(f_m)_{m \in N}$ of continuous functions $f_m : X \longrightarrow Y$ converges uniformly to the function $f : X \longrightarrow Y$, then $f$ is continuous as well.

Proof. We have to show that $f$ is continuous at every $\xi \in X$. Thus, let $\xi \in X$ and $\epsilon > 0$. Due to the uniform convergence, we can choose $m \in N$ such that $d_Y\bigl(f_m(x), f(x)\bigr) < \epsilon/3$ for every $x \in X$. Moreover, as $f_m$ is continuous at $\xi$, there exists $\delta > 0$ such that $x \in B_\delta(\xi)$ implies $d_Y\bigl(f_m(\xi), f_m(x)\bigr) < \epsilon/3$. Thus, if $x \in B_\delta(\xi)$, then

\[ d_Y\bigl(f(\xi), f(x)\bigr) \leq d_Y\bigl(f(\xi), f_m(\xi)\bigr) + d_Y\bigl(f_m(\xi), f_m(x)\bigr) + d_Y\bigl(f_m(x), f(x)\bigr) < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon, \]
proving $f$ is continuous at $\xi$. $\square$

Definition 3.6. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces and let $\mathcal{F}$ be a set of functions from $X$ into $Y$. Then the set $\mathcal{F}$ (or the functions in $\mathcal{F}$) are said to be uniformly equicontinuous if, and only if, for each $\epsilon > 0$, there is $\delta > 0$ such that

\[ \forall x, \xi \in X:\quad \Bigl( d_X(x, \xi) < \delta \;\Rightarrow\; \forall f \in \mathcal{F}:\ d_Y\bigl(f(x), f(\xi)\bigr) < \epsilon \Bigr). \tag{3.12} \]

Theorem 3.7 (Arzelà-Ascoli). Let $n \in N$, let $\|\cdot\|$ denote some norm on $K^n$, and let $I \subseteq R$ be some bounded interval. If $(f_m)_{m \in N}$ is a sequence of functions $f_m : I \longrightarrow K^n$ such that $\{f_m : m \in N\}$ is uniformly equicontinuous and such that, for each $x \in I$, the sequence $\bigl(f_m(x)\bigr)_{m \in N}$ is bounded, then $(f_m)_{m \in N}$ has a uniformly convergent subsequence $(f_{m_j})_{j \in N}$, i.e. there exists $f : I \longrightarrow K^n$ such that

\[ \forall \epsilon > 0\ \exists N \in N\ \forall j \geq N\ \forall x \in I:\quad \|f_{m_j}(x) - f(x)\| < \epsilon. \]
In particular, the limit function $f$ is continuous.

Proof. Let $(r_1, r_2, \dots)$ be an enumeration of the set of rational numbers in $I$, i.e. of $Q \cap I$. Inductively, we construct a sequence $(\mathcal{F}_m)_{m \in N}$ of subsequences of $(f_m)_{m \in N}$, $\mathcal{F}_m = (f_{m,k})_{k \in N}$, such that

(i) for each $m \in N$, $\mathcal{F}_m$ is a subsequence of each $\mathcal{F}_j$ with $j \in \{1, \dots, m\}$,

(ii) for each $m \in N$, $\mathcal{F}_m$ converges pointwise at each of the first $m$ rational numbers $r_j$; more precisely, there exists a sequence $(z_1, z_2, \dots)$ in $K^n$ such that, for each $m \in N$ and each $j \in \{1, \dots, m\}$:

\[ \lim_{k \to \infty} f_{m,k}(r_j) = z_j. \]

Actually, we construct the $(z_m)_{m \in N}$ inductively together with the $(\mathcal{F}_m)_{m \in N}$: Since the sequence $\bigl(f_m(r_1)\bigr)_{m \in N}$ is, by hypothesis, a bounded sequence in $K^n$, one can apply the

Bolzano-Weierstrass theorem (cf. [Phi16b, Th. 1.31]) to obtain $z_1 \in K^n$ and a subsequence $\mathcal{F}_1 = (f_{1,k})_{k \in N}$ of $(f_m)_{m \in N}$ such that $\lim_{k \to \infty} f_{1,k}(r_1) = z_1$. To proceed by induction, we now assume to have already constructed $\mathcal{F}_1, \dots, \mathcal{F}_M$ and $z_1, \dots, z_M$ for $M \in N$ such that (i) and (ii) hold for each $m \in \{1, \dots, M\}$. Since the sequence $\bigl(f_{M,k}(r_{M+1})\bigr)_{k \in N}$ is a bounded sequence in $K^n$, one can, once more, apply the Bolzano-Weierstrass theorem to obtain $z_{M+1} \in K^n$ and a subsequence $\mathcal{F}_{M+1} = (f_{M+1,k})_{k \in N}$ of $\mathcal{F}_M$ such that $\lim_{k \to \infty} f_{M+1,k}(r_{M+1}) = z_{M+1}$. Since $\mathcal{F}_{M+1}$ is a subsequence of $\mathcal{F}_M$, it is also a subsequence of all previous subsequences, i.e. (i) now also holds for $m = M + 1$. In consequence, $\lim_{k \to \infty} f_{M+1,k}(r_j) = z_j$ for each $j = 1, \dots, M+1$, such that (ii) now also holds for $m = M + 1$, as required.

Next, one considers the diagonal sequence $(g_m)_{m \in N}$, $g_m := f_{m,m}$, and observes that this sequence converges pointwise at each rational number $r_j$ ($\lim_{m \to \infty} g_m(r_j) = z_j$), since, at least for $m \geq j$, $(g_m)_{m \in N}$ is a subsequence of every $\mathcal{F}_j$ (exercise) – in particular, $(g_m)_{m \in N}$ is also a subsequence of the original sequence $(f_m)_{m \in N}$.

In the last step of the proof, we show that $(g_m)_{m \in N}$ converges uniformly on the entire interval $I$ to some $f : I \longrightarrow K^n$. To this end, fix $\epsilon > 0$. Since $\{g_m : m \in N\} \subseteq \{f_m : m \in N\}$, the assumed uniform equicontinuity of $\{f_m : m \in N\}$ yields $\delta > 0$ such that
\[ \forall x, \xi \in I:\quad \Bigl( |x - \xi| < \delta \;\Rightarrow\; \forall m \in N:\ \|g_m(x) - g_m(\xi)\| < \frac{\epsilon}{3} \Bigr). \]

Since $I$ is bounded, it has finite length and, thus, it can be covered with finitely many intervals $I_1, \dots, I_N$, $I = \bigcup_{j=1}^N I_j$, $N \in N$, such that each $I_j$ has length less than $\delta$. Moreover, since $Q$ is dense in $R$, for each $j \in \{1, \dots, N\}$, there exists $k(j) \in N$ such that $r_{k(j)} \in I_j$. Define $M := \max\{k(j) : j = 1, \dots, N\}$. We note that each of the finitely many sequences $\bigl(g_m(r_1)\bigr)_{m \in N}, \dots, \bigl(g_m(r_M)\bigr)_{m \in N}$ is a Cauchy sequence. Thus,
\[ \exists K \in N\ \forall k, l \geq K\ \forall \alpha = 1, \dots, M:\quad \|g_k(r_\alpha) - g_l(r_\alpha)\| < \frac{\epsilon}{3}. \tag{3.13} \]

We now consider an arbitrary $x \in I$ and $k, l \geq K$. Let $j \in \{1, \dots, N\}$ such that $x \in I_j$. Then $r_{k(j)} \in I_j$, $|r_{k(j)} - x| < \delta$, and the estimate in (3.13) holds for $\alpha = k(j)$. In consequence, we obtain the crucial estimate
\[ \forall k, l \geq K:\quad \|g_k(x) - g_l(x)\| \leq \|g_k(x) - g_k(r_{k(j)})\| + \|g_k(r_{k(j)}) - g_l(r_{k(j)})\| + \|g_l(r_{k(j)}) - g_l(x)\| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \tag{3.14} \]

The estimate (3.14) shows $\bigl(g_m(x)\bigr)_{m \in N}$ is a Cauchy sequence for each $x \in I$, and we can define
\[ f : I \longrightarrow K^n, \quad f(x) := \lim_{m \to \infty} g_m(x). \tag{3.15} \]

Since $K$ in (3.14) does not depend on $x \in I$, passing to the limit $k \to \infty$ in the estimate of (3.14) implies
\[ \forall l \geq K\ \forall x \in I:\quad \|g_l(x) - f(x)\| \leq \epsilon, \]

proving uniform convergence of the subsequence $(g_m)_{m \in N}$ of $(f_m)_{m \in N}$, as desired. The continuity of $f$ is now a consequence of Th. 3.5. $\square$

At this point, we have all preparations in place to state and prove the existence theorem.

Theorem 3.8 (Peano). If $G \subseteq R \times K^n$ is open, $n \in N$, and $f : G \longrightarrow K^n$ is continuous, then, for each $(x_0, y_0) \in G$, the explicit $n$-dimensional first-order initial value problem
\[ y' = f(x, y), \tag{3.16a} \]

\[ y(x_0) = y_0, \tag{3.16b} \]

has at least one solution. More precisely, given an arbitrary norm $\|\cdot\|$ on $K^n$, (3.16) has a solution $\phi : I \longrightarrow K^n$, defined on the open interval
\[ I := \,]x_0 - \alpha, x_0 + \alpha[, \tag{3.17} \]
$\alpha = \alpha(b) > 0$, where $b > 0$ is such that

\[ B := \bigl\{(x, y) \in R \times K^n : |x - x_0| \leq b \text{ and } \|y - y_0\| \leq b\bigr\} \subseteq G, \tag{3.18} \]
\[ M := M(b) := \max\bigl\{\|f(x, y)\| : (x, y) \in B\bigr\} < \infty, \tag{3.19} \]
and
\[ \alpha := \alpha(b) := \begin{cases} \min\{b,\, b/M\} & \text{for } M > 0, \\ b & \text{for } M = 0. \end{cases} \tag{3.20} \]
In general, the choice of the norm $\|\cdot\|$ on $K^n$ will influence the possible sizes of $\alpha$ and, thus, of $I$.

Proof. The proof will be conducted in several steps. In the first step, we check that $\alpha = \alpha(b) > 0$ is well-defined: Since $G$ is open, there always exists $b > 0$ such that (3.18) holds. Since $B$ is a closed and bounded subset of the finite-dimensional space $R \times K^n$, $B$ is compact (cf. [Phi16b, Cor. 3.16(c)]). Since $f$ and, thus, $\|f\|$ is continuous (every norm is even Lipschitz continuous due to the inverse triangle inequality), it must assume its maximum on the compact set $B$ (cf. [Phi16b, Th. 3.19]), showing $M \in R_0^+$ is well-defined by (3.19) and $\alpha$ is well-defined by (3.20).

In the second step of the proof, we note that it suffices to prove (3.16) has a solution $\phi_+$, defined on $[x_0, x_0 + \alpha[$: One can then apply the time reversion Lem. 1.9(b): The proof

providing the solution $\phi_+$ also provides a solution $\psi_+ : [-x_0, -x_0 + \alpha[ \longrightarrow K^n$ to the time-reversed initial value problem, consisting of $y' = -f(-x, y)$ and $y(-x_0) = y_0$ (note that the same $M$ and $\alpha$ work for the time-reversed problem). Then, according to Lem. 1.9(b), $\phi_- : \,]x_0 - \alpha, x_0] \longrightarrow K^n$, $\phi_-(x) := \psi_+(-x)$, is a solution to (3.16). According to Lem. 1.7, we can patch $\phi_-$ and $\phi_+$ together to obtain the desired solution

\[ \phi : I \longrightarrow K^n, \quad \phi(x) := \begin{cases} \phi_-(x) & \text{for } x \leq x_0, \\ \phi_+(x) & \text{for } x \geq x_0, \end{cases} \tag{3.21} \]
defined on all of $I$. It is noted that one can also conduct the proof with the second step omitted, but then one has to perform the following steps on all of $I$, which means one has to consider additional cases in some places.

In the third step of the proof, we will define a sequence (φm)m∈N of functions

\[ \phi_m : I_+ \longrightarrow K^n, \quad I_+ := [x_0, x_0 + \alpha], \tag{3.22} \]
that constitute approximate solutions to (3.16). To begin the construction of $\phi_m$, fix $m \in N$. Since $B$ is compact and $f$ is continuous, we know $f$ is even uniformly continuous on $B$ (cf. [Phi16b, Th. 3.20]). In particular,
\[ \exists \delta_m > 0\ \forall (x, y), (\tilde{x}, \tilde{y}) \in B:\quad \Bigl( |x - \tilde{x}| < \delta_m,\ \|y - \tilde{y}\| < \delta_m \;\Rightarrow\; \|f(x, y) - f(\tilde{x}, \tilde{y})\| < \frac{1}{m} \Bigr). \tag{3.23} \]

We now form what is called a discretization of the interval $I_+$, i.e. a partition of $I_+$ into sufficiently many small intervals: Let $N \in N$ and
\[ x_0 < x_1 < \dots < x_N = x_0 + \alpha \tag{3.24} \]
such that

\[ \forall j \in \{1, \dots, N\}:\quad x_j - x_{j-1} < \beta := \begin{cases} \min\{\delta_m,\, \delta_m/M,\, 1/m\} & \text{for } M > 0, \\ \min\{\delta_m,\, 1/m\} & \text{for } M = 0 \end{cases} \tag{3.25} \]

(for example, one could make the equidistant choice $x_j := x_0 + jh$ with $h = \alpha/N$ and $N > \alpha/\beta$, but it does not matter how the $x_j$ are defined as long as (3.24) and (3.25) both hold). Note that we get a different discretization of $I_+$ for each $m \in N$; however, the dependence on $m$ is suppressed in the notation for the sake of readability. We now define recursively

\[ \phi_m : I_+ \longrightarrow K^n, \qquad \phi_m(x_0) := y_0, \qquad \phi_m(x) := \phi_m(x_j) + (x - x_j)\, f\bigl(x_j, \phi_m(x_j)\bigr) \quad\text{for each } x \in [x_j, x_{j+1}]. \tag{3.26} \]

Note that there is no conflict between the two definitions given for $x = x_j$ with $j \in \{1, \dots, N-1\}$. Each function $\phi_m$ defines a polygon in $K^n$. This construction is known as Euler's method, and it can be used to obtain numerical approximations to the solution of the initial value problem (while simple, this method is not very efficient, though); a short implementation is sketched at the end of this step. We still need to verify that the definition (3.26) does actually make sense: We need to check that $f$ can, indeed, be applied to $(x_j, \phi_m(x_j))$, i.e. we have to check $(x_j, \phi_m(x_j)) \in G$. We can actually show the stronger statement

\[ \forall x \in I_+:\quad \bigl(x, \phi_m(x)\bigr) \in B, \tag{3.27} \]
where $B$ is as defined in (3.18). First, it is pointed out that (3.20) implies $\alpha \leq b$, such that $x \in I_+$ implies $|x - x_0| \leq \alpha \leq b$, as required in (3.18). One can now prove (3.27) by showing, by induction on $j \in \{0, \dots, N-1\}$:
\[ \forall x \in [x_j, x_{j+1}]:\quad \bigl(x, \phi_m(x)\bigr) \in B. \tag{3.28} \]

To start the induction, note $\phi_m(x_0) = y_0$ and $(x_0, y_0) \in B$ by (3.18). Now let $j \in \{0, \dots, N-1\}$ and $x \in [x_j, x_{j+1}]$. We estimate
\[
\begin{aligned}
\|\phi_m(x) - y_0\| &\leq \|\phi_m(x) - \phi_m(x_j)\| + \sum_{k=1}^{j} \|\phi_m(x_k) - \phi_m(x_{k-1})\| \\
&\overset{(3.26)}{=} (x - x_j)\,\bigl\|f\bigl(x_j, \phi_m(x_j)\bigr)\bigr\| + \sum_{k=1}^{j} (x_k - x_{k-1})\,\bigl\|f\bigl(x_{k-1}, \phi_m(x_{k-1})\bigr)\bigr\| \\
&\overset{(*)}{\leq} (x - x_j)\,M + \sum_{k=1}^{j} (x_k - x_{k-1})\,M = (x - x_0)\,M \overset{(3.20)}{\leq} \alpha M \leq b,
\end{aligned} \tag{3.29}
\]
where, at $(*)$, it was used that $(x_k, \phi_m(x_k)) \in B$ by induction hypothesis for each $k = 0, \dots, j$, and, thus, $\bigl\|f\bigl(x_k, \phi_m(x_k)\bigr)\bigr\| \leq M$ by (3.19). Estimate (3.29) completes the induction and the third step of the proof.
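The construction (3.26) is easily turned into code. The following minimal implementation is our sketch (using the equidistant choice of the $x_j$ mentioned after (3.25)); it computes the Euler polygon for a first-order system and illustrates the convergence as the grid is refined:

\begin{verbatim}
# Euler polygons (3.26) on [x0, x0 + alpha] with N equidistant steps.
import numpy as np

def euler(f, x0, y0, alpha, N):
    xs = np.linspace(x0, x0 + alpha, N + 1)
    ys = np.empty((N + 1, np.size(y0)))
    ys[0] = y0
    for j in range(N):
        # phi_m(x) = phi_m(x_j) + (x - x_j) f(x_j, phi_m(x_j)) on [x_j, x_{j+1}]
        ys[j + 1] = ys[j] + (xs[j + 1] - xs[j]) * np.asarray(f(xs[j], ys[j]))
    return xs, ys

# For y' = y, y(0) = 1, the polygons approach the exact solution e^x:
for N in (10, 100, 1000):
    xs, ys = euler(lambda x, y: y, 0.0, np.array([1.0]), 1.0, N)
    print(N, abs(ys[-1, 0] - np.e))       # error decreases roughly like 1/N
\end{verbatim}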

In the fourth step of the proof, we establish several properties of the functions $\phi_m$. The first two properties are immediate from (3.26), namely that $\phi_m$ is continuous on $I_+$ and differentiable at each $x \in \,]x_j, x_{j+1}[$, $j \in \{0, \dots, N-1\}$, where
\[ \forall j \in \{0, \dots, N-1\}\ \forall x \in \,]x_j, x_{j+1}[:\quad \phi_m'(x) = f\bigl(x_j, \phi_m(x_j)\bigr). \tag{3.30} \]
The next property to establish is

\[ \forall s, t \in I_+:\quad \|\phi_m(t) - \phi_m(s)\| \leq |t - s|\, M. \tag{3.31} \]

To prove (3.31), we may assume $s < t$, say $s \in [x_p, x_{p+1}]$ and $t \in [x_q, x_{q+1}]$ with $p \leq q$. Then, analogously to (3.29), a telescoping sum together with (3.26) and (3.19) yields
\[ \|\phi_m(t) - \phi_m(s)\| \leq \|\phi_m(t) - \phi_m(x_q)\| + \sum_{k=p+1}^{q-1} \|\phi_m(x_{k+1}) - \phi_m(x_k)\| + \|\phi_m(x_{p+1}) - \phi_m(s)\| \leq (t - s)\, M. \tag{3.32} \]
The next needed property of the $\phi_m$ is

\[ \forall j \in \{0, \dots, N-1\}\ \forall x \in \,]x_j, x_{j+1}[:\quad \bigl\|\phi_m'(x) - f\bigl(x, \phi_m(x)\bigr)\bigr\| < \frac{1}{m}. \tag{3.33} \]
Indeed, (3.33) is a consequence of (3.23), i.e. of the uniform continuity of $f$ on $B$: First, if $M = 0$, then $f \equiv \phi_m' \equiv 0$ and there is nothing to prove. So let $M > 0$. If $x \in \,]x_j, x_{j+1}[$, then, according to (3.30), we have $\phi_m'(x) = f\bigl(x_j, \phi_m(x_j)\bigr)$. Thus, by (3.25),

\[ |x - x_j| < \beta \leq \min\{\delta_m,\, \delta_m/M\} \;\overset{(3.31)}{\Rightarrow}\; \|\phi_m(x) - \phi_m(x_j)\| \leq |x - x_j|\, M < \delta_m, \tag{3.34a} \]
and
\[ \bigl\|\phi_m'(x) - f\bigl(x, \phi_m(x)\bigr)\bigr\| = \bigl\|f\bigl(x_j, \phi_m(x_j)\bigr) - f\bigl(x, \phi_m(x)\bigr)\bigr\| \overset{(3.34a),(3.23)}{<} \frac{1}{m}, \tag{3.34b} \]
proving (3.33).

The last property of the φm we need is

\[ \forall x \in I_+:\quad \|\phi_m(x)\| \leq \|\phi_m(x) - \phi_m(x_0)\| + \|\phi_m(x_0)\| \overset{(3.31)}{\leq} |x - x_0|\, M + \|\phi_m(x_0)\| \leq \alpha M + \|y_0\|, \tag{3.35} \]

which says that the $\phi_m$ are pointwise and even uniformly bounded. In the fifth and last step of the proof, we use the Arzelà-Ascoli Th. 3.7 to obtain a function $\phi_+ : I_+ \longrightarrow K^n$, and we show that $\phi_+$ constitutes a solution to (3.16). According to (3.31), the $\phi_m$ are uniformly equicontinuous (given $\epsilon > 0$, condition (3.12) is satisfied with $\delta := \epsilon/M$ for $M > 0$ and arbitrary $\delta > 0$ for $M = 0$), and according to (3.35) the $\phi_m$ are bounded, such that the Arzelà-Ascoli Th. 3.7 applies to yield a subsequence $(\phi_{m_j})_{j \in N}$ of $(\phi_m)_{m \in N}$ converging uniformly to some continuous function $\phi_+ : I_+ \longrightarrow K^n$. So it merely remains to verify that $\phi_+$ is a solution to (3.16).

As the uniform convergence of the (φmj )j∈N implies pointwise convergence, we have

$\phi_+(x_0) = \lim_{j \to \infty} \phi_{m_j}(x_0) = y_0$, showing $\phi_+$ satisfies the initial condition (3.16b). Next,
\[ \forall x \in I_+:\quad \bigl(x, \phi_+(x)\bigr) = \lim_{j \to \infty} \bigl(x, \phi_{m_j}(x)\bigr) \in B, \]

since each $\bigl(x, \phi_{m_j}(x)\bigr)$ is in $B$ and $B$ is closed. In particular, $f\bigl(x, \phi_+(x)\bigr)$ is well-defined for each $x \in I_+$. To prove that $\phi_+$ also satisfies the ODE (3.16a), by Th. 1.5, it suffices to show
\[ \forall x \in I_+:\quad \phi_+(x) - \phi_+(x_0) - \int_{x_0}^x f\bigl(t, \phi_+(t)\bigr)\,dt = 0 \tag{3.36} \]
(in particular, (3.36) implies $\phi_+$ to be differentiable). Fixing $x \in I_+$ and using the triangle inequality for the umpteenth time, one obtains

x φ (x) φ (x ) f t,φ (t) dt + − + 0 − + Zx0  x φ (x) φ (x) + φ (x) φ (x ) f t,φ (t) dt ≤ k + − mj k mj − + 0 − mj Zx0 x  + f t,φ (t) f t,φ (t) dt , (3.37) mj − + Zx0   

holding for every j N. We will conclude the proof by showing that all three summands on the right-hand side∈ of (3.37) tend to 0 for j . As already mentioned above, → ∞ the uniform convergence of the (φmj )j∈N implies pointwise convergence, implying the convergence of the first summand. We tackle the third summand next, using

x x f t,φ (t) f t,φ (t) dt f t,φ (t) f t,φ (t) dt, (3.38) mj − + ≤ mj − + Zx0 Zx0     

which holds for every norm (cf. Appendix B), but can easily be checked directly for the 1-norm, where (z ,...,z ) := n z (exercise). Given ǫ > 0, the uniform k 1 n k1 j=1 | j| continuity of f on B provides δ > 0 such that f t,φ (t) f t,φ (t) < ǫ/α for P mj − +  

3 GENERAL THEORY 37

φmj (t) φ+(t) < δ. The uniform convergence of (φmj )j∈N then yields K N such thatk φ −(t) φk (t) <δ for every j K and each t I. Thus, ∈ k mj − + k ≥ ∈ x x x0 ǫ f t,φmj (t) f t,φ+(t) dt | − | ǫ, j≥∀K x − ≤ α ≤ Z 0   thereby establishing the convergence of the third summand from the right-hand side of (3.37). For the remaining second summand, we note that the fact that each φm is continuous and piecewise differentiable (with piecewise constant derivative) allows to apply the fundamental theorem of calculus in the form [Phi16a, Th. 10.20(b)] to obtain

x ′ φm(x)= φm(x0)+ φm(t)dt. (3.39) x∈∀I+ Zx0 Using (3.39) in the second summand of the right-hand side of (3.37) provides

x x ′ φmj (x) φ+(x0) f t,φmj (t) dt φmj (t) f t,φmj (t) dt − − x ≤ x − Z 0 Z 0  (3.33) x 1 α  , ≤ m ≤ m Zx0 j j showing the convergence of the second summand, which finally concludes the proof.  Corollary 3.9. If G R Kn is open, n N, f : G Kn is continuous, and C G ⊆ × ∈ −→ ⊆ is compact, then there exists α > 0, such that, for each (x0,y0) C, the explicit n- dimensional first-order initial value problem (3.16) has a solution φ∈: I Kn, defined −→ on the open interval I :=]x0 α,x0 + α[, i.e. always on an interval of the same length 2α. −

Proof. Exercise.  Corollary 3.10. If G R Kkn is open, k,n N, and f : G Kn is continuous, ⊆ × ∈ −→ then, for each (x0,y0,0,...,y0,k−1) G, the explicit n-dimensional kth-order initial value problem consisting of (1.6) and (1.7)∈ , which, for convenience, we rewrite

y(k) = f x,y,y′,...,y(k−1) , (3.40a) (j) y (x0)= y0,j,  (3.40b) j∈{0,...,k∀ −1} has at least one solution. More precisely, there exists an open interval I R with x I and φ : I Kn such that φ is a solution to (3.40). If C G is⊆ compact, 0 ∈ −→ ⊆ then there exists α> 0 such that, for each (x0,y0,0,...,y0,k−1) C, (3.40) has a solution φ : I Kn, defined on the open interval I :=]x α,x +α[,∈ i.e. always on an interval −→ 0 − 0 of the same length 2α. 3 GENERAL THEORY 38

Proof. If f is continuous, then the right-hand side of the equivalent first-order system (3.2a) (written in explicit form) is given by the continuous function

y2 y  3  ˜ kn ˜ . f : G K , f(x,y1,...,yk) := . . (3.41) −→    yk−1    f(x,y1,...,yk)     Thus, Th. 3.8 provides a solution Φ : I Kkn to (3.2) and, then, Th. 3.1(b) yields −→ φ := Φ1 to be a solution to (3.40). Moreover, if C G is compact, then Cor. 3.9 provides α > 0 such that, for each (x ,y ,...,y ) C⊆, (3.2) has a solution Φ : I Kkn, 0 0,0 0,k−1 ∈ −→ defined on the same open interval I :=]x0 α,x0 + α[. In particular, φ := Φ1, the corresponding solution to (3.40) is also defined− on the same I. 

While the Peano theorem is striking in its generality, it does have several drawbacks: (a) the interval, where the existence of a solution is proved can be unnecessarily short; (b) the selection of the subsequence using the Arzel`a-Ascoli theorem makes the proof nonconstructive; (c) uniqueness of solutions is not provided, even in cases, where unique solutions exist; (d) it does not provide information regarding how the solution changes with a change of the initial condition. We will subsequently address all these points, namely (b) and (c) in Sec. 3.3 (we will see that the proof of the Peano theorem becomes constructive in situations, where the solution is unique – in general, a constructive proof is not available), (a) in Sec. 3.4, and (d) in Sec. 3.5.

3.3 Uniqueness of Solutions

Example 1.4(b) shows that the hypotheses of the Peano Th. 3.8 are not strong enough to guarantee the initial value problem (3.16) has a unique solution, not even in some neighborhood of x0. The additional condition that will yield uniqueness is local Lipschitz continuity of f with respect to y.

Definition 3.11. Let m,n N, G R Km, and f : G Kn. ∈ ⊆ × −→ (a) The function f is called (globally) Lipschitz continuous or just (globally) Lipschitz with respect to y if, and only if,

f(x,y) f(x, y¯) L y y¯ . (3.42) L∃≥0 (x,y),(∀x,y¯)∈G − ≤ k − k

3 GENERAL THEORY 39

(b) The function f is called locally Lipschitz continuous or just locally Lipschitz with respect to y if, and only if, for each (x ,y ) G, there exists a (relative) open set 0 0 ∈ U G such that (x0,y0) U (i.e. U is a (relative) open neighborhood of (x0,y0)) and⊆f is Lipschitz continuous∈ with respect to y on U, i.e. if, and only if,

f(x,y) f(x, y¯) L y y¯ . (x0,y∀0)∈G (x0,y0) ∈ U∃ ⊆ G open L∃≥0 (x,y),(∀x,y¯)∈U − ≤ k − k (3.43)

The number L occurring in (a),(b) is called Lipschitz constant. The norms on Km and Kn in (a),(b) are arbitrary. If one changes the norms, then one will, in general, change L, but not the property of f being (locally) Lipschitz.

Caveat 3.12. It is emphasized that f : G Kn, (x,y) f(x,y), being Lipschitz with respect to y does not imply f to be continuous:−→ Indeed,7→ if I R, = A Km, and g : I Kn is an arbitrary discontinuous function, then f⊆: I ∅ 6 A ⊆ Kn, f(x,y) := g−→(x) is not continuous, but satisfies (3.42) with L = 0. × −→ —

While the local neighborhoods U, where a function locally Lipschitz (with respect to y) is actually Lipschitz continuous (with respect to y) can be very small, we will now show that a continuous function is locally Lipschitz (with respect to y) on G if, and only if, it is Lipschitz continuous (with respect to y) on every compact set K G. ⊆ Proposition 3.13. Let m,n N, G R Km, and f : G Kn be continuous. Then f is locally Lipschitz with∈ respect to⊆ y if,× and only if, f is (globally)−→ Lipschitz with respect to y on every compact subset K of G.

Proof. First, assume f is not locally Lipschitz with respect to y. Then there exists (x ,y ) G such that 0 0 ∈

f(xN ,yN,1) f(xN ,yN,2) >N yN,1 yN,2 . (3.44) N∀∈N (xN ,yN,1)∃,(xN ,yN,2) − k − k ∈G∩B1/N (x0,y0)

The set K := (x ,y ) (x ,y ): N N, j 1, 2 { 0 0 }∪ N N,j ∈ ∈ { } is clearly a compact subset of G (e.g. by the Heine-Borel property of compact sets), since every open set containing (x0,y0) must contain all, but finitely many, of the elements of K). Due to (3.44), f is not (globally) Lipschitz with respect to y on the compact set K (so, actually, continuity of f was not used for this direction). 3 GENERAL THEORY 40

Conversely, assume f to be locally Lipschitz with respect to y, and consider a compact subset K of G. Then, for each (x,y) K, there is some (relatively) open U(x,y) G with (x,y) U and such that f is Lipschitz∈ with respect to y in U . By the Heine-Borel⊆ ∈ (x,y) (x,y) property of compact sets, there are finitely many U1 := U(x1,y1),...,UN := U(xN ,yN ), N N, such that ∈ N K U . (3.45) ⊆ j j=1 [ ′ For each j = 1,...,N, let Lj denote the Lipschitz constant for f on Uj and set L := max L ,...,L . As f is assumed continuous and K is compact, we have { 1 N } M := max f(x,y) : (x,y) K < . (3.46) {k k ∈ } ∞ Using the compactness of K once again, there exists a Lebesgue number δ > 0 for the open cover (Uj)j∈{1,...,N} of K (cf. [Phi16b, Th. 3.21]), i.e. δ > 0 such that

y y¯ <δ (x,y), (x, y¯) Uj . (3.47) (x,y),(∀x,y¯)∈K k − k ⇒j∈{1 ∃,...,N} { }⊆   Define L := max L′, 2M/δ . Then, for every (x,y), (x, y¯) K: { } ∈ y y¯ <δ f(x,y) f(x, y¯) L y y¯ L y y¯ , (3.48a) k − k ⇒ k − k≤ jk − k≤ k − k 2Mδ y y¯ δ f(x,y) f(x, y¯) 2M = L y y¯ , (3.48b) k − k≥ ⇒ k − k≤ δ ≤ k − k completing the proof that f is Lipschitz with respect to y on K. 

While, in general, the assertion of Prop. 3.13 becomes false if the continuity of f is omit- ted, for convex G, it does hold without the continuity assumption on f (see Appendix D). The following Prop. 3.14 provides a useful sufficient condition for f : G Kn, G R Km open, to be locally Lipschitz with respect to y: −→ ⊆ × Proposition 3.14. Let m,n N, let G R Km be open, and f : G Kn. A sufficient condition for f to be∈ locally Lipschitz⊆ with× respect to y is f being continuously−→ (real) differentiable with respect to y, i.e., f is locally Lipschitz with respect to y provided

that all partials ∂yk fl; k,l =1,...,n (∂yk,1 fl, ∂yk,2 fl for K = C) exist and are continuous.

Proof. We consider the case K = R; the case K = C is included by using the identifi- cations Cm = R2m and Cn = R2n. Given (x ,y ) G, we have to show f is Lipschitz ∼ ∼ 0 0 ∈ with respect to y on some open set U G with (x0,y0) U. As in the Peano Th. 3.8, since G is open, ⊆ ∈

m B := (x,y) R R : x x0 b and y y0 1 b G, b>∃0 ∈ × | − |≤ k − k ≤ ⊆  3 GENERAL THEORY 41

m where 1 denotes the 1-norm on R . Since the ∂yk fl,(k,l) 1,...,m 1,...,n , are allk continuous · k on the compact set B, ∈ { } × { }

M := max ∂ f (x,y) : (x,y) B, (k,l) 1,...,m 1,...,n < . (3.49) | yk l | ∈ ∈ { } × { } ∞ Applying the (cf. [Phi16b, Th. 4.35]) to the n components of the function f : y Rm : (x,y) B Rn, f (y) := f(x,y), x ∈ ∈ −→ x m we obtain η1,...,ηn  R such that ∈ m f (x,y) f (x, y¯)= ∂ f (x,η )(y y¯ ), (3.50) l − l yk l l k − k Xk=1 and, thus,

n f(x,y) f(x, y¯) = f (x,y) f (x, y¯) − 1 | l − l | l=1 n m X n (3.51) (x,y),(∀x,y¯)∈B (3.49),(3.50) M y y¯ = M y y¯ = nM y y¯ , ≤ | k − k| k − k1 k − k1 Xl=1 Xk=1 Xl=1 i.e. f is Lipschitz with respect to y on B (where

(x,y) R Rm : x x

y′ = f(x,y), (3.52a)

y(x0)= y0, (3.52b)

has a unique solution. More precisely, if I R is an open interval and φ,ψ : I Kn ⊆ −→ are both solutions to (3.52a), then φ(x0) = ψ(x0) for one x0 I implies φ(x) = ψ(x) for all x I: ∈ ∈

φ(x0)= ψ(x0) φ(x)= ψ(x) . (3.53) x0∃∈I ⇒ x∀∈I     3 GENERAL THEORY 42

Proof. We first show that φ and ψ must agree in a small neighborhood of x0:

φ(x)= ψ(x). (3.54) ǫ>∃0 x∈]x0−∀ǫ,x0+ǫ[ Since f is continuous and both φ and ψ are solutions to the initial value problem (3.52), we can use Th. 1.5 to obtain x φ(x) ψ(x)= f t,φ(t) f t,ψ(t) dt. (3.55) x∀∈I − − Zx0    Set y0 := φ(x0)= ψ(x0). As f is locally Lipschitz with respect to y, there exists δ > 0 such that U := (x,y) R Kn : x x < δ, y y <δ G { ∈ × | − 0| k − 0k }⊆ and such that f is Lipschitz with Lipschitz constant L 0 with respect to y on U, where we have chosen some arbitrary norm on Kn. The≥ continuity of φ,ψ implies the existence ofǫ> ˜ 0 such thatǫ ˜ δ, B (x )k · kI, φ(B (x )) B (y ) and ψ(B (x )) ≤ ǫ˜ 0 ⊆ ǫ˜ 0 ⊆ δ 0 ǫ˜ 0 ⊆ Bδ(y0), implying

f x,φ(x) f x,ψ(x) L φ(x) ψ(x) . (3.56) x∈B∀ǫ˜(x0) − ≤ −   Next, define ǫ := min ǫ,˜ 1/(2L) { } and, using the compactness of B (x )=[x ǫ,x + ǫ] plus the continuity of φ,ψ, ǫ 0 0 − 0 M := max φ(x) ψ(x) : x B (x ) < . k − k ∈ ǫ 0 ∞ From (3.55) and (3.56), we obtain x M φ(x) ψ(x) L φ(t) ψ(t) dt L x x0 M (3.57) x∈B∀(x ) k − k≤ k − k ≤ | − | ≤ 2 ǫ 0 Zx0

(note that the integral in (3.57) can be negative for x 0 such that [s,s + α] I. Then the continuity of φ,ψ implies φ(s)= ψ(s), i.e. φ and ψ satisfy ⊆ the same initial value problem at s such that (3.54) must hold with s instead of x0, in contradiction to the definition of s. Finally, φ(x) = ψ(x) for each x x follows in an ≤ 0 completely analogous fashion, which concludes the proof of the theorem.  3 GENERAL THEORY 43

Corollary 3.16. If G R Kkn is open, k,n N, and f : G Kn is continuous and ⊆ × ∈ −→ locally Lipschitz with respect to y, then, for each (x0,y0,0,...,y0,k−1) G, the explicit n-dimensional kth-order initial value problem consisting of (1.6) and (1.7)∈ , i.e.

y(k) = f x,y,y′,...,y(k−1) , (j) y (x0)= y0,j,  j∈{0,...,k∀ −1} has a unique solution. More precisely, if I R is an open interval and φ,ψ : I Kn ⊆ −→ are both solutions to (1.6), then

(j) (j) φ (x0)= ψ (x0) (3.58) j∈{0,...,k∀ −1} holding for one x I implies φ(x)= ψ(x) for all x I. 0 ∈ ∈ Proof. Exercise.  Remark 3.17. According to Th. 3.15, the condition of f being continuous and locally Lipschitz with respect to y is sufficient for each initial value problem (3.52) to have a unique solution. However, this condition is not necessary: It is an exercise to show that the continuous function

1 for y 0, f : R2 R, f(x,y) := ≤ (3.59) −→ 1+ √y for y 0, ( ≥ 2 is not locally Lipschitz with respect to y, but that, for each (x0,y0) R , the initial value problem (3.52) still has a unique solution in the sense that (3.53)∈ holds for each solution φ to (3.52a). And one can (can you?) even find simple examples of f being defined on an open domain such that f is discontinuous at every point in its domain and every initial value problem (3.52) still has a unique solution. —

At the end of Sec. 3.2, it was pointed out that the proof of the Peano Th. 3.8 is non- constructive due to the selection of a subsequence. The following Th. 3.18 shows that, whenever the initial value problem has a unique solution, it becomes unnecessary to select a subsequence, and the construction procedure (namely Euler’s method) used in the proof of Th. 3.8 becomes an effective (if not necessarily efficient) numerical approx- imation procedure for the unique solution. Theorem 3.18. Consider the situation of the Peano Th. 3.8. Under the additional assumption that the solution to the explicit n-dimensional first-order initial value prob- lem (3.16) is unique on some interval J [x ,x + α[, x J, and where α > 0 is ⊆ 0 0 0 ∈ 3 GENERAL THEORY 44

constructed as in Th. 3.8 (i.e. given by (3.18) – (3.20)), every sequence (φm)m∈N of func- tions defined on J according to Euler’s method as in the proof of Th. 3.8 (i.e. defined as in (3.26)) converges uniformly to the unique solution φ : J Kn. An analogous statement also holds for J ]x α,x ], x J. −→ ⊆ 0 − 0 0 ∈

Proof. Seeking a contradiction, assume (φm)m∈N does not converge uniformly to the unique solution φ. Then there exists ǫ> 0 and a subsequence (φmj )j∈N such that

φm φ sup = sup φm (x) φ(x) : x J ǫ. (3.60) j∈∀N k j − k k j − k ∈ ≥  However, as a subsequence, (φmj )j∈N still has all the properties of the (φm)m∈N (namely pointwise boundedness, uniform equicontinuity, piecewise differentiability, being approx- imate solutions according to (3.33)) that guanranteed the existence of a subsequence, converging to a solution. Thus, since the solution is unique on J,(φmj )j∈N must, in turn, have a subsequence, converging uniformly to φ, which is in contradiction to (3.60). This shows the assumption that (φm)m∈N does not converge uniformly to φ must have been false. The proof of the analogous statement for J ]x0 α,x0], x0 J, one obtains, e.g., via time reversion (cf. the second step of the proof⊆ of− Th. 3.8). ∈ 

Remark 3.19. The argument used to prove Th. 3.18 is of a rather general nature: It can be applied whenever a sequence is known to have a subsequence converging to some solution of some equation (or some other problem), provided the same still holds for every subsequence of the original sequence – in that case, the additional knowledge that the solution is unique implies the convergence of the original sequence without the need to select a subsequence.

3.4 Extension of Solutions, Maximal Solutions

The Peano Th. 3.8 and Cor. 3.10 show the existence of local solutions to explicit initial value problems, i.e. the solution’s existence is proved on some, possibly small, interval containing the initial point x0. In the current section, we will address the question in which circumstances such local solutions can be extended, we will prove the existence of maximal solutions (solutions that can not be extended), and we will learn how such maximal solutions can be identified.

Definition 3.20. Let φ : I Kn, n N, be a solution to some ODE (such as (1.6) or (1.4) in the most general case),−→ defined∈ on some open interval I R. ⊆ (a) We say φ has an extension or continuation to the right (resp. to the left) if, and only if, there exists a solution ψ : J Kn to the same ODE, defined on some −→ 3 GENERAL THEORY 45

open interval J I such that ψ↾ = φ and ⊇ I sup J > sup I (resp. inf J < inf I). (3.61) An extension or continuation of φ is a function that is an extension to the right or an extension to the left (or both). (b) The solution φ is called a maximal solution if, and only if, it does not admit any extensions in the sense of (a) (note that we require maximal solutions to be defined on open intervals, cf. Appendix E). Remark 3.21. As an immediate consequence of the time reversion Lem. 1.9(b), if a solution φ : I Kn, n N, to (1.6), defined on some open interval I R, has an extension to the−→ right (resp.∈ to the left) if, and only if, ψ : ( I) Kn, ψ(x⊆) := φ( x), (solution to y(k) = ( 1)k f( x,y, y′,..., ( 1)k−1 y(k−1)))− has−→ an extension to the− left (resp. to the right). − − − − — The existence of maximal solutions is not trivial – a priori it could be that every solution had an extension (analogous to the fact that to every x [0, 1[ (or every x R) there is some bigger element in [0, 1[ (respectively in R)). ∈ ∈ Theorem 3.22. Every solution φ : I Kn to (1.4) (resp. to (1.6)), defined on an 0 0 −→ open interval I R, can be extended to a maximal solution of (1.4) (resp. of (1.6)). 0 ⊆ Proof. The proof is carried out for solutions to (1.4) (the implicit ODE) – the proof for solutions to the explicit ODE (1.6) is analogous and can also be seen as a special case. The idea is to apply Zorn’s lemma. To this end, define a partial order on the set := (I ,φ ) (I,φ): φ : I Kn is solution to (1.4), extending φ (3.62) S { 0 0 } ∪ { −→ 0} by letting (I,φ) (J, ψ) : I J, ψ↾ = φ. (3.63) ≤ ⇔ ⊆ I Every chain , i.e. every totally ordered subset of , has an upper bound, namely (I ,φ ) with CI := I and φ (x) := φ(x), whereS (I,φ) is chosen such that C C C (I,φ)∈C C ∈ C x I (since is a chain, the value of φ (x) does not actually depend on the choice of S C (I,φ∈ ) andC is, thus, well-defined). ∈C Clearly, I is an open interval, I I , and φ extends φ as a function; we still need to C 0 ⊆ C C 0 see that φC is a solution to (1.4). For this, we, once again, use that x IC means there exists (I,φ) such that x I and φ is a solution to (1.4). Thus, using∈ the notation from Def. 1.2(a),∈C ∈

(x,φ (x),φ′ (x),...,φ(k)(x))=(x,φ(x),φ′(x),...,φ(k)(x)) U C C C ∈ 3 GENERAL THEORY 46

and ′ (k) ′ (k) F (x,φC(x),φC(x),...,φC (x)) = F (x,φ(x),φ (x),...,φ (x))=0, showing φ is a solution to (1.4) as defined in Def. 1.2(a). In particular,(I ,φ ) . To C C C ∈S verify (IC,φC) is an upper bound for , note that the definition of (IC,φC) immediately implies I I for each (I,φ) andCφ ↾ = φ for each (I,φ) . ⊆ C ∈C C I ∈C To conclude the proof, we note that all hypotheses of Zorn’s lemma have been verified such that it yields the existence of a maximal element of (Imax,φmax) , i.e. φmax : I Kn must be a maximal solution extending φ . ∈ S  max −→ 0 Proposition 3.23. Let k,n N. Given G R Kkn open and f : G Kn continuous, if φ : I Kn is∈ a solution to (1.6)⊆ such× that I =]a,b[, a

(j) y (b)= ηj. j=0,...,k∀ −1 By Cor. 3.10, there must exist ǫ> 0 such that this initial value problem has a solution ψ :]b ǫ,b+ǫ[ Kn. We now show that φ extended to b via (3.64a) is still a solution to − −→ (j) (1.6). First note the mean value theorem (cf. [Phi16a, Th. 9.18]) yields that φ (b)= ηj exists for j =1,...,k 1 as a left-hand derivative. Moreover, − (k) ′ (k−1) lim φ (x) = lim f x,φ(x),φ (x),...,φ (x) = f(b,η0,...,ηk−1), x↑b x↑b  showing φ(k)(b)= f b,φ(b),φ′(b),...,φ(k−1)(b) (again employing the mean value theo- rem), which proves φ extended to b is a solution to (1.6). Finally, Lem. 1.7 ensures  φ(x) for x b, σ :]a,b + ǫ[ Kn, σ(x) := ≤ −→ ψ(x) for x b, ( ≥ is a solution to (1.6) that extends φ to the right.  3 GENERAL THEORY 47

Proposition 3.24. Let k,n N, let G R Kkn be open, let f : G Kn be continuous, and let φ : I ∈Kn be a solution⊆ to× (1.6) defined on the open−→ interval I. −→ (k−1) Consider x0 I and let gr+(φ) (resp. gr−(φ)) denote the graph of (φ,...,φ ) for x x (resp.∈ for x x ): ≥ 0 ≤ 0 gr (φ) := gr (φ,x ) := x,φ(x),...,φ(k−1)(x) G : x I, x x , (3.65a) + + 0 ∈ ∈ ≥ 0 gr (φ) := gr (φ,x ) := x,φ(x),...,φ(k−1)(x) G : x I, x x . (3.65b) − − 0   ∈ ∈ ≤ 0   If there exists a compact set K G such that gr+(φ) K (resp. gr−(φ) K), then φ has an extension ψ : J Kn to⊆ the right (resp. to the⊆ left) such that ⊆ −→ x,ψ˜ (˜x),...,ψ(k−1)(˜x) / K. (3.66) x˜∃∈J ∈  The statement can be rephrased by saying that gr+(φ) (resp. gr−(φ)) of each maximal solution φ to (1.6) escapes from every compact subset of G when x appoaches the right (resp. the left) boundary of I (where the boundary of I can contain and/or + ). −∞ ∞ Proof. We conduct the proof for extensions to the right; extensions to the left can be handled completely analogously (alternatively, one can apply the time reversion Lem. 1.9(b) as demonstrated in the last paragraph of the proof below). The proof for exten- sions to the right is divided into three steps. Let K G be compact. ⊆ Step 1: We show that gr (φ) K implies φ has an extension to the right: Since K is + ⊆ bounded, so is gr+(φ), implying

b := sup I < (3.67) ∞ as well as

M := sup φ(j)(x) : j 0,...,k 1 , x [x ,b[ < . 1 ∈ { − } ∈ 0 ∞  In the usual way, K compact and f continuous imply M := max f(x,y) : (x,y) K < . 2 k k ∈ ∞ Set  M := max M ,M . { 1 2} According to Prop. 3.23, we need to show (3.64a) holds. To this end, notice

φ(j)(x) φ(j)(¯x) M x x¯ : (3.68) j=0,...,k∀ −1 x,x¯∈∀[x0,b[ k − k≤ | − | 3 GENERAL THEORY 48

Indeed,

x¯ φ(k−1)(x) φ(k−1)(¯x) = f t,φ(t),...,φ(k−1)(t) dt M x x¯ , x,x¯∈∀[x0,b[ k − k ≤ | − | Zx  and, for 0 j < k 1, ≤ − x¯ φ(j)(x) φ(j)(¯x) = φ(j+1)(t)dt M x x¯ , x,x¯∈∀[x0,b[ k − k ≤ | − | Zx

proving (3.68). Since K is compact, there exists a sequence (xm)m∈N in [x0,b[ such that ′ (k−1) lim xm,φ(xm),φ (xm),...,φ (xm) =(b,η0,...,ηk−1). (3.69) (b,η0,...,η∃k−1)∈K m→∞  Usingx ¯ := x in (3.68) yields, for m , m → ∞ (j) φ (x) ηj M x b , j=0,...,k∀ −1 x∈[∀x0,b[ k − k≤ | − | implying (j) lim φ (x)= ηj, j=0,...,k∀ −1 x↑b i.e. (3.64a) holds, completing the proof of Step 1.

Step 2: We show that gr+(φ) K implies φ can be extended to the right to I ]x0,b+α[, where α > 0 does not depend⊆ on b := sup I: Since K is compact, Cor. 3.9∪ guarantees every initial value problem

y(k) = f(x,y,y′,...,y(k−1)), (3.70a) (j) y (ξ0)= y0,j, (ξ0,y0) K, (3.70b) j=0,...,k∀ −1 ∈ has a solution defined on ]ξ0 α,ξ0 + α[ with the same α > 0. As shown in Step 1, the solution φ can be extended− into b = sup I such that it satisfies (3.70b) with (ξ0,y0)=(b,η) K. Thus, using Lem. 1.7, it can be pieced together with the solution to (3.70) given on∈ [b,b + α[ by Cor. 3.9, completing the proof of Step 2. Step 3: We finally show that gr (φ) K implies φ has an extension ψ : J Kn + ⊆ −→ to the right such that (3.66) holds: We set a := inf I and φ0 := φ. Then, by Step 2, φ0 has an extension φ1 defined on ]a,b + α[. Inductively, for each m 1, either there exists m m such that φ :]a,b + m α[ Kn is an extension of≥ φ that can be 0 ≤ m0 0 −→ used as ψ to conclude the proof (i.e. ψ := φm0 satisfies (3.66)) or φm can, once more, be extended to ]a,b +(m + 1)α[. As K is bounded, x x0 : (x,y) K R must also be bounded, say by µ R. Thus, (3.66) must be{ satisfied≥ for some∈ ψ}:= ⊆φ with ∈ m 1 m (µ x )/α. ≤ ≤ − 0 3 GENERAL THEORY 49

As mentioned above, one can argue completely analogous to the above proof to obtain

that gr−(φ) K implies φ to have an extension to the left, satisfying (3.66). Here we show how one,⊆ alternatively, can use the time reversion Lem. 1.9 to this end: Consider the map

h : R Kkn R Kkn, h(x,y ,...,y ):=( x,y ,..., ( 1)k−1y ), × −→ × 1 k − 1 − k which clearly constitutes an R-linear isomophism. Noting (1.6) and (1.32) are the same, we consider the time-reversed version (1.33) and observe Gg = h(G) to be open, h(K) n k ⊆ Gg to be compact, g : Gg K , g =( 1) (f h), to be continuous. If gr−(φ,x0) K, then gr (ψ, x ) h(K),−→ where ψ is the− solution◦ to the time-reversed version (1.33),⊆ + − 0 ⊆ given by Lem. 1.9(b). Then ψ has an extension ψ˜ to the right, satisfying (3.66) with ψ replaced by ψ˜ and K replaced by h(K). Then, by Rem. 3.21, φ must have an extension φ˜ to the left, satisfying (3.66) with ψ replaced by φ˜. 

In Th. 3.28 below, we will show that, for continuous f : G Kn, each maximal solution to (1.6) must go to the boundary of G in the sense of the−→ following definition.

Definition 3.25. Let k,n N, let G R Kkn be open, let f : G Kn, and let φ :]a,b[ Kn, a

gr+(φ,x0) K = (resp. gr−(φ,x0) K = ), (3.71) K ⊆ G∀compact x0∈∃]a,b[ ∩ ∅ ∩ ∅

where gr+(φ,x0) and gr−(φ,x0) are defined as in (3.65) (with I =]a,b[). In other words, φ goes to the boundary of G for x b (resp. for x a) if, and only if, the graph of (φ,...,φ(k−1)) escapes every compact→ subset K of G forever→ for x b (resp. for x a). → → Proposition 3.26. In the situation of Def. 3.25, if the solution φ goes to the boundary of G for x b, then one of the following conditions must hold: → (i) b = , ∞ (ii) b< and L := limsup φ(x),...,φ(k−1)(x) = , ∞ x↑b ∞ (iii) b< , L< (L as defined in (ii)), G = R K kn (i.e. ∂G = ), and ∞ ∞ 6 × 6 ∅ lim dist x,φ(x),...,φ(k−1)(x) , ∂G =0. (3.72) x↑b    An analogous statement is valid for the solution φ going to the boundary of G for x a. → 3 GENERAL THEORY 50

Proof. The proof is carried out for x b; the proof for x a is analogous. → → Assume (i) – (iii) are all false. Choose c ]a,b[. Since (i) and (ii) are false, ∈ φ(x),...,φ(k−1)(x) M. 0≤M<∃ ∞ x∈∀[c,b[ ≤  If (iii) is false because G = R Kkn, then K := (x,y) R Kkn : x [c,b], y M is a compact subset of G that× shows (3.71) does{ not hold.∈ × In the only∈ remainingk k≤ case,} (iii) must be false, since (3.72) does not hold. Thus,

(k−1) dist x1,φ(x1),...,φ (x1) , ∂G δ. δ>∃0 x0∈∀]a,b[ x1∈∃]x0,b[ ≥    Clearly, the set A := (x,y) G : dist (x,y),∂G δ ∈ ≥ kn + is closed (e.g. as the distance function d : R K  R 0 , d( ) := dist( ,∂G) is continuous (see Th. C.4) and A =(d−1[δ, [) (×G ∂G−→)). In consequence,· K · A with K as defined above is a compact subset of∞G that∩ shows∪ (3.71) does not hold.∩ 

Remark 3.27. (a) Examples such as the second ODE of Ex. 3.30(b) below show that the limsup in Prop. 3.26(ii) can not be replaced with a lim.

(b) If f : G Kn is continuous, then the three conditions of Prop. 3.26 are also sufficient for−→φ to go to the boundary of G (cf. Cor. 3.29 below).

(c) For discontinuous f : G Kn, in general, (ii) of Prop. 3.26 is no longer sufficient for φ to go to the boundary−→ of G as is shown by simple examples, whereas (i) and (iii) remain sufficient, even for discontinuous f (exercise). Similarly, simple examples show Prop. 3.24 becomes false without the assumption of f being continuous; and it can also happen that a maximal solution escapes every compact set, but still does not go to the boundary of G (exercise).

Theorem 3.28. In the situation of Def. 3.25, if f : G Kn is continuous and φ :]a,b[ Kn is a maximal solution to (1.6), then φ must go−→ to the boundary of G for both x −→a and x b, i.e., for both x a and x b, it must escape every compact → → → → subset K of G forever and it must satisfy one of the conditions specified in Prop. 3.26 (and one of the analogous conditions for x a). → Proof. We carry out the proof for x b – the proof for x a can be done analogously or by applying the time reversion Lem.→ 1.9, as indicated at→ the end of the proof below. Let φ :]a,b[ Kn be a maximal solution to (1.6). Seeking a contradiction, we assume φ does not go−→ to the boundary of G for x b, i.e. (3.71) does not hold and there exists → 3 GENERAL THEORY 51

a compact subset K of G and a strictly increasing sequence (xm)m∈N in ]a,b[ such that lim x = b< and m→∞ m ∞ (k−1) xm,φ(xm),...,φ (xm) K. (3.73) m∀∈N ∈  We now define C to be another compact subset of G that is strictly between K and G, i.e. K ( C ( G: More precisely, we choose r> 0 such that

C := (x,y) R Kkn : dist (x,y),K r G, ∈ × ≤ ⊆ where   dist (x,y),K = inf (x,y) (˜x, y˜) : (˜x, y˜) K , {k − k2 ∈ } kn+1 2 denoting the Euclidean norm on R for K = R and the Euclidean norm on kR2 ·kn k+1 for K = C (this choice of norm is different from previous choices and will be convenient later during the current proof). As φ is a maximal solution, Prop. 3.24 guarantees the existence of another strictly increasing sequence (ξm)m∈N in ]a,b[ such that limm→∞ ξm = b < , x1 < ξ1 < x2 < ξ2 < ... (i.e. xm < ξm < xm+1 for each m N) and such that ∞ ∈ (k−1) ξm,φ(ξm),...,φ (ξm) / C. m∀∈N ∈  Noting x ,φ(x ),...,φ(k−1)(x ) K by (3.73) and K C, define m m m ∈ ⊆  (k−1) sm := sup s xm : x,φ(x),...,φ (x) C for each x [xm,s] . m∀∈N ≥ ∈ ∈   By the definition of sm as a sup, sm < xm+1

(k−1) (k−1) xm,φ(xm),...,φ (xm) sm,φ(sm),...,φ (sm) r. (3.75) m∀∈N − 2 ≥

  We use the boundedness of the compact set C and (3.74) to provide

M := sup φ(x),...,φ(k−1)(x) : x [x ,s ], m N < , 1 2 ∈ m m ∈ ∞  

3 GENERAL THEORY 52

M := max f(x,y) : (x,y) C < 2 k k2 ∈ ∞ (as C is compact and f continuous),

M := max M ,M . { 1 2} We now notice that each function

J : [x ,s ] R Kkn, J (x) := x,φ(x),...,φ(k−1)(x) , m m m −→ × m is a continuously differentiable or path (using the continuity of f), cf. Def. F.1 2kn+1 (for K = C, we consider Jm as a path in R ). To finish the proof, we will have to make use of the notion of arc length (cf. Def. F.5) of such a continuously differentiable curve: Recall that each such continuously differentiable path is rectifyable, i.e. it has a well-defined finite arc length l(Jm) (cf. Th. F.7). Moreover, l(Jm) satisfies

s (F.4) (F.17) m J (x ) J (s ) l(J ) = J ′ (x) dx k m m − m m k2 ≤ m k m k2 Zxm sm k = 1+ φ(j)(x) 2 dx v k k2 xm u j=1 Z u X sm t 2 2 1+ φ(x),...,φ(k−1)(x) + f(J (x)) dx ≤ 2 m 2 Zxm q sm  √1+2 M 2 dx, (3.76) ≤ Zxm

where it was used that 2 was chosen to be the Euclidean norm. For each m N, we estimate k · k ∈

(3.75) (k−1) (k−1) 0 0.− This contradiction shows our initial assumption that φ does not go to the boundary of G for x b must have been wrong. → To obtain the remaining assertion that φ must go to the boundary of G for x a, one can proceed as in the last paragraph of the proof of Prop. 3.23, making use of→ the function h defined there and of the time reversion Lem. 1.9: If K G is a compact ⊆ 3 GENERAL THEORY 53

set and ψ is the solution to the time-reversed version given by Lem. 1.9(b), then ψ must be maximal as φ is maximal. Thus, for x a, ψ must escape the compact set h(K) forever by the first part of the proof above,→ implying − φ must escape K forever for x a.  → Corollary 3.29. Let k,n N, let G R Kkn be open, and let f : G Kn be continuous. If φ :]a,b[ ∈ Kn, a

(i) φ is a maximal solution.

(ii) φ must go to the boundary of G for both x a and x b in the sense defined in → → Def. 3.25.

(iii) φ satisfies one of the conditions specified in Prop. 3.26 and one of the analogous conditions for x a. → Proof. (i) implies (ii) by Th. 3.28, (ii) implies (iii) by Prop. 3.26, and it is an exercise to show (iii) implies (i) (here, Prop. 3.23 is the clue). 

Example 3.30. The following examples illustrate the different kinds of possible bahav- ior of maximal solutions listed in Prop. 3.26 (the different kinds of bahavior can already be seen for 1-dimensional ODE of first order):

(a) The initial value problem y′ =0, y(0) = 1, − has the maximal solution φ : R R, φ(x)= 1 – here we have −→ − G = R2, f : G R, f(x,y)=0, −→ solution interval I = R, b := sup I = , i.e. we are in Case (i) of Prop. 3.26. ∞ (b) The initial value problem y′ = x−2, y( 1)=1, − has the maximal solution φ :] , 0[ R, φ(x)= x−1 – here we have − ∞ −→ − G =] , 0[ R, f : G R, f(x,y)= x−2, − ∞ × −→

solution interval I =] , 0[, b := sup I = 0, limx↑0 φ(x) = i.e. we are in Case (ii) of Prop. 3.26. − ∞ | | ∞ 3 GENERAL THEORY 54

To obtain an example, where we are also in Case (ii) of Prop. 3.26, but where lim φ(x) , b := sup I, does not exist, consider the initial value problem x↑b | | 1 1 1 1 1 y′ = sin + cos , y =0, −x2 x x3 x −π   which has the maximal solution φ :] , 0[ R, φ(x) = x−1 sin(x−1) (here lim sup φ(x) = , but, as φ( 1/(kπ−)) ∞ = 0−→ for each k N, lim φ(x) does x↑0 | | ∞ − ∈ x↑0 | | not exist) – here we have 1 1 1 1 G =(R 0 ) R, f : G R, f(x,y)= sin + cos . \ { } × −→ −x2 x x3 x

To obtain an example, where we are again in Case (ii) of Prop. 3.26, but where G = R2, consider the initial value problem

y′ = y2, y( 1)=1, − which, as in the first example of (b), has the maximal solution φ :] , 0[ R, φ(x)= x−1 – here we have − ∞ −→ − G = R2, f : G R, f(x,y)= y2. −→ (c) The initial value problem

y′ = y−1, y( 1)=1, − − has the maximal solution φ :] , 1 [ R, φ(x)= √ 2x 1 – here we have − ∞ − 2 −→ − − G = R (R 0 ), f : G R, f(x,y)= y−1, × \ { } −→ − solution interval I =] , 1 [, b := sup I = 1 , ∂G = R 0 , − ∞ − 2 − 2 × { } 1 lim (x,φ(x)) = , 0 ∂G, x↑b −2 ∈   i.e. we are in Case (iii) of Prop. 3.26.

An example, where we are in Case (iii) of Prop. 3.26, but where limx↑b (x,φ(x)) does not exist, is given by the initial value problem

1 1 1 y′ = cos , y =0, −x2 x −π   3 GENERAL THEORY 55

has the maximal solution φ :] , 0[ R, φ(x) = sin(1/x) – here we have − ∞ −→ 1 1 G =(R 0 ) R, f : G R, f(x,y)= cos , \ { } × −→ −x2 x solution interval I =] , 0[, b := sup I = 0, ∂G = 0 R, − ∞ { }× lim dist (x,φ(x)),∂G = lim x =0. x↑0 x↑0 | |  As a final example, where we are again in Case (iii) of Prop. 3.26, reconsider the initial value problem from (a), but this time with G =] 1, 1[ ] 3, 5[, f : G R, f(x,y)=0. − × − −→ Now the maximal solution is φ :] 1, 1[ R, φ(x) = 1, solution interval I = − −→ − ] 1, 1[, b := sup I = 1, and limx↑1 (x,φ(x)) = (1, 1) ∂G. This last example also illustrates− that, even though it is quite common to− omit∈ an explicit specification of the domain G when writing an ODE (as we did in (a)) – where it is usually assumed that the intended domain can be guessed from the context – the maximal solution will typically depend on the specification of G. Example 3.31. We have already seen examples of initial value problems that admit more than one maximal solution – for instance, the initial value problem of Ex. 1.4(b) had infinitely many different maximal solutions, all of them defined on all of R. The fol- lowing example shows that an initial value problem can have maximal solutions defined on different intervals: Let y G := R ] 1, 1[, f : G R, f(x,y) := | | , × − −→ 1 y −p | | and consider the initial value problem p y y′ = f(x,y)= | | , y(0) = 0. (3.78) 1 y −p | | An obvious maximal solution is p φ : R R, φ(0) = 0. −→ However, another maximal solution (that can be found using separation of variables) is

2 1 √1+ x for 1

3.5 Continuity in Initial Conditions

The goal of the present section is to show that, under suitable conditions, small changes in the initial condition for an ODE result in small changes in the solution. As, in situations of nonuniqueness, we can change the solution without having changed the initial condition at all, ensuring unique solutions to initial value problems is a minimal prerequisite for our considerations in this section.

Definition 3.32. Let G R Kkn, k,n N, and f : G Kn. We say that the explicit n-dimensional kth-order⊆ × ODE (1.6),∈ i.e. −→

y(k) = f x,y,y′,...,y(k−1) , (3.79a)

admits unique maximal solutions if, and only if, f is such that every initial value problem consisting of (3.79a) and

(j) n y (ξ)= ηj K , (3.79b) j∈{0,...,k∀ −1} ∈

n with (ξ,η) G, has a unique maximal solution φ(ξ,η) : I(ξ,η) K (combining Cor. 3.16 with Th.∈ 3.22 yields that G being open and f being continuous−→ and locally Lipschitz with respect to y is sufficient for (3.79a) to admit unique maximal solutions, but we know from Rem. 3.17 that this condition is not necessary). If f is such that (3.79a) admits unique maximal solutions, then

Y : D Kn, Y (x,ξ,η) := φ (x), (3.80) f −→ (ξ,η) defined on D := (x,ξ,η) R G : x I , (3.81) f { ∈ × ∈ (ξ,η)} is called the global or general solution to (3.79a). Note that the domain Df of Y is determined entirely by f, which is notationally emphasized by its lower index f.

Lemma 3.33. In the situation of Def. 3.32, the following holds:

(a) Y (ξ,ξ,η)= η for each (ξ,η) G. 0 ∈

(b) If k =1, then η = η0 and Y x, x,Y˜ (˜x,ξ,η) = Y (x,ξ,η) for each (x,ξ,η), (˜x,ξ,η) D . ∈ f  (c) If k =1, then Y ξ,x,Y (x,ξ,η) = η for each (x,ξ,η) D . ∈ f  Proof. (a) holds as Y ( ,ξ,η) is a solution to (3.79b). For (b) note (˜x,ξ,η) D im- · ∈ f plying x,Y˜ (˜x,ξ,η) G, i.e. x,Y˜ (˜x,ξ,η) are admissible initial data. Moreover, ∈   3 GENERAL THEORY 57

Y , x,Y˜ (˜x,ξ,η) and Y ( ,ξ,η) are both maximal solutions for some intial value prob- lem· for (3.79a). Since both· solutions agree at x =x ˜, both functions must be identical  by the assumed uniqueness of solutions. In particular, they are defined for the same x and yield the same value at each x. Setting x := ξ in (b) yields (c). 

The core of the proof of continuity in initial conditions as stated in Cor. 3.36 below is the following Th. 3.34(a), which provides continuity in initial conditions locally. As a byproduct, we will also obtain a version of the Picard-Lindel¨of theorem in Th. 3.34(b), which states the local uniform convergence of the so-called Picard iteration, a method for obtaining approximate solutions that is quite different from the considered above.

Theorem 3.34. Consider the situation of Def. 3.32 for first-order problems, i.e. with k = 1, and with f being continuous and locally Lipschitz with respect to y on G open. Fix an arbitrary norm on Kn. k · k (a) For each (σ, ζ) G R Kn and each 0 satisfying: · (i) For every point (ξ,η) in the open set

U (σ, ζ) := (ξ,η) G : ξ ]a,b[, η Y (ξ,σ,ζ) <δ , (3.82) δ ∈ ∈ − the maximal solution φ = Y ( ,ξ,η) is defined on ]a,b [ (i.e. ]a,b[ I ). (ξ,η) · ⊆ (ξ,η) (ii) The restriction of the global solution (x,ξ,η) Y (x,ξ,η) to the open set 7→ W :=]a,b[ U (σ, ζ) (3.83) × δ is continuous.

(b) (Picard-Lindel¨of) For each (σ, ζ) G, there exists α > 0 such that the Picard ∈ n iteration, i.e. the sequence of functions (φm)m∈N0 , φm :]σ α,σ +α[ K , defined recursively by − −→

φ0(x) := ζ, (3.84a) x φm+1(x) := ζ + f t,φm(t) dt, (3.84b) m∈∀N0 Zσ  converges uniformly to the solution of the initial value problem (3.79) (with k = 1 and (ξ,η):=(σ, ζ)) on ]σ α,σ + α[. − 3 GENERAL THEORY 58

Proof. We will obtain (b) as an aside while proving (a). To simplify notation, we introduce the function ψ : [a,b] Kn, ψ(x) := Y (x,σ,ζ). −→ Since [a,b] is compact and ψ is continuous, γ := (Id,ψ)[a,b]= (x,ψ(x)) R Kn : x [a,b] { ∈ × ∈ } is a compact subset of G (cf. [Phi16b, Th. 3.12]). Thus, γ has a positive distance from the closed set (R Kn) G, implying × \ n C := (x,y) R K : x [a,b], y ψ(x) δ1 G. (3.85) δ1∃>0 ∈ × ∈ − ≤ ⊆  Clearly, C is bounded and C is also closed (using the continuity of the distance function n + d : R K R0 , d( ) := dist( ,γ), the continuity of the projection to the first × −→ n · · −1 −1 component π1 : R K R, and noting C = d [0,δ1] π1 [a,b]). Thus, C is compact, and the hypothesis× −→ of f being locally Lipschitz with∩ respect to y implies f to be globally Lipschitz with some Lipschitz constant L 0 on the compact set C by Prop. 3.13. We can now choose the number δ > 0 claimed≥ to exist in (a) to be any number −L(b−a) 0 <δ

Moreover, with d and π1 as above, Uδ(σ, ζ) as defined in (3.82) can be written in the form U (σ, ζ)= d−1[0,δ[ π−1]a,b[, δ ∩ 1 + showing it is an open set ([0,δ[ is, indeed, open in R0 ). Even though we are mostly interested in what happens on the open set W , it will be convenient to define functions on the slightly larger compact set W := [a,b] U, × U := (x,y) R Kn : x [a,b], y ψ(x) δ = d−1[0,δ] π−1[a,b]. ∈ × ∈ − ≤ ∩ 1 To proceed with the proof, we now carry out a form of the Picard iteration, recursively defining a sequence of functions (ψ ) , ψ : W Kn, defined recursively by m m∈N0 m −→ ψ0(x,ξ,η) := ψ(x)+ η ψ(ξ), (3.88a) x − ψm+1(x,ξ,η) := η + f t,ψm(t,ξ,η) dt. (3.88b) m∈∀N0 Zξ  The proof will be concluded if we can show the (ψm)m∈N0 constitute a sequence of continuous functions converging uniformly on W to Y ↾W . As an intermediate step, we establish the following properties of the ψ (simultaneously) by induction on m N : m ∈ 0 3 GENERAL THEORY 59

(1) ψ is continuous for each m N . m ∈ 0 (2) One has

ψm(x,ξ,η) ψ(x) <δ1 (x,ψm(x,ξ,η)) C . m∈∀N0, − ⇒ ∈ (x,ξ,η)∈W 

In particular, since C G, this shows the ψ are well-defined by (3.88b). ⊆ m (3) One has Lm+1 x ξ m+1 δ ψm+1(x,ξ,η) ψm(x,ξ,η) | − | . m∈∀N0, − ≤ (m + 1)! (x,ξ,η)∈W

To start the induction proof, notice that the continuity of ψ implies the continuity of ψ . Moreover, if (x,ξ,η) W , then 0 ∈ (3.88a) ψ (x,ξ,η) ψ(x) = η ψ(ξ) = η Y (ξ,σ,ζ) δ<δ . (3.89) 0 − − − ≤ 1 Also, from ψ = Y ( ,σ,ζ)= φ , we know, for each x,ξ [a,b ], · (σ,ζ) ∈ x ξ x ψ(x) ψ(ξ)= ζ + f(t,ψ(t))dt ζ f(t,ψ(t))dt = f(t,ψ(t))dt − − − Zσ Zσ Zξ and, for each (x,ξ,η) W , ∈ x ψ (x,ξ,η) ψ (x,ξ,η) = f t,ψ (t,ξ,η) f(t,ψ(t)) dt 1 − 0 0 − Zξ f L-Lip. x   L ψ (t,ξ,η) ψ(t) dt ≤ 0 − Zξ x = L η ψ(ξ) dt L x ξ δ, − ≤ | − | Zξ

completing the proof of (1) – (3) for m = 0. For the induction step, let m N . ∈ 0 It is left as an exercise to prove the continuity of ψm+1. Using the triangle inequality, we estimate, for each (x,ξ,η) W , ∈ ψm+1(x,ξ,η) ψ(x) − m

ψj+1(x,ξ,η) ψj(x,ξ,η) + ψ0(x,ξ,η) ψ(x) ≤ k − k − j=0 Xm (3.89), ind.hyp. for (3) Lj+1 x ξ j+1 δ (3.86) | − | + δ eL|x−ξ| δ < eL(b−a) e−L(b−a) δ = δ , ≤ (j + 1)! ≤ 1 1 j=0 X 3 GENERAL THEORY 60

establishing the estimate of (2) for m +1. To prove the estimate in (3) for m replaced by m + 1, one estimates, for each (x,ξ,η) W , ∈ x ψm+2(x,ξ,η) ψm+1(x,ξ,η) f t,ψm+1(t,ξ,η) f t,ψm(t,ξ,η) dt − ≤ ξ − Z x   L ψ (t,ξ,η) ψ (t,ξ,η) dt ≤ m+1 − m Zξ ind.hyp. x Lm+1 t ξ m+1 δ L | − | dt ≤ (m + 1)! Zξ Lm +2 x ξ m+2 δ = | − | , (m + 2)!

completing the induction proof of (1) – (3). As a consequence of (3), for each l,m N such that m>l: ∈ 0 m Lj (b a)j ψm(x,ξ,η) ψl(x,ξ,η) δ − . (3.90) (x,ξ,η∀)∈W − ≤ j! j=l+1 X

The convergence of the exponential series, thus, implies that (ψm(x,ξ,η))m∈N0 is a Cauchy sequence for each (x,ξ,η) W , yielding pointwise convergence of the ψ to ∈ m some function ψ˜ : W Kn. Letting m tend to infinity in (3.90) then shows −→ ∞ Lj (b a)j ψ˜(x,ξ,η) ψl(x,ξ,η) δ − , (x,ξ,η∀)∈W − ≤ j! j=l+1 X

where the independence of the right-hand side with respect to (x,ξ,η) W proves ∈ ψm ψ˜ uniformly on W . The uniform convergence together with (1) then implies ψ˜ to be→ continuous. In the final step of the proof, we show ψ˜ = Y on W , i.e. ψ˜( ,ξ,η) solves (3.79) (with k = 1). By Th. 1.5, we need to show ·

x ψ˜(x,ξ,η)= η + f t, ψ˜(t,ξ,η) dt (3.91) (x,ξ,η∀)∈W Zξ  (then uniqueness of solutions implies ψ˜( ,ξ,η) = Y ( ,ξ,η)). To verify (3.91), given ǫ> 0, by the uniform convergence ψ ψ·˜, choose m · N sufficiently large such that m → ∈

ψ˜(x,ξ,η) ψk(x,ξ,η) <ǫ k∈{m∀−1,m} (x,ξ,η∀)∈W −

3 GENERAL THEORY 61 and estimate, for each (x,ξ,η) W , ∈ x ψ˜(x,ξ,η) η f t, ψ˜(t,ξ,η) dt − − Zξ x ˜ ˜ ψ(x,ξ,η) ψm(x,ξ,η) + f t,ψm−1(t,ξ,η) f t, ψ(t,ξ,η) dt ≤ − ξ − Z   x   <ǫ + L ψ (t,ξ,η) ψ˜ (t,ξ,η) dt ǫ + Lǫ (b a). (3.92) m−1 − ≤ − Zξ

As (3.92) holds for every ǫ> 0, (3.91) must be true as well. It is noted that we have, indeed, proved (b) as a byproduct, since we know (for example from the Peano Th. 3.8) that ψ must be defined on [σ α,σ + α] for some α > 0 and then φ = ψ ( ,σ,ζ) on [σ α,σ + α] for each m N −.  m m · − ∈ 0 Theorem 3.35. As in Th. 3.34, consider the situation of Def. 3.32 for first-order prob- lems, i.e. with k = 1, and with f being continuous and locally Lipschitz with respect to y on G open. Then the global solution (x,ξ,η) Y (x,ξ,η) as defined in Def. 3.32 is 7→ continuous. Moreover, its domain Df is open.

Proof. Let (x,σ,ζ) Df . Then, using the notation from Def. 3.32, x is in the domain of the maximal solution∈ φ , i.e. x I . Since I is open, there must be

Proof. It was part of the exercise that proved Cor. 3.16 to show that the right-hand side F of the first-order problem equivalent to (3.79) in the sense of Th. 3.1 is continuous and locally Lipschitz with respect to y, provided f is continuous and locally Lipschitz with respect to y. Thus, according to Th. 3.35, the equivalent first-order problem has kn a continuous global solution Υ : DF K , defined on some open set DF . As a consequence of Th. 3.1(b), Y = Υ : D−→ Kn is the global solution to (3.79a). So 1 F −→ we have Df = DF and, as Υ is continuous, so is Y . 

It is sometimes interesting to consider situations where the right-hand side f depends on some (vector of) parameters µ in addition to depending on x and y: 3 GENERAL THEORY 62

Definition 3.37. If G R Kkn Kl with k,n,l N, and f : G Kn is such that, for each (ξ,η,µ) G, the⊆ explicit× n×-dimensional kth-order∈ initial value−→ problem ∈ y(k) = f x,y,y′,...,y(k−1),µ , (3.93a) (j) n y (ξ)= ηj K ,  (3.93b) j∈{0,...,k∀ −1} ∈ has a unique maximal solution φ : I Kn, then (ξ,η,µ) (ξ,η,µ) −→ Y : D Kn, Y (x,ξ,η,µ) := φ (x), (3.94) f −→ (ξ,η,µ) defined on D := (x,ξ,η,µ) R G : x I , (3.95) f { ∈ × ∈ (ξ,η,µ)} is called the global or general solution to (3.93a).

Corollary 3.38. Consider the situation of Def. 3.37 with f being continuous and locally Lipschitz with respect to (y,µ) on G open. Then the global solution Y as defined in Def. 3.94 is continuous. Moreover, its domain Df is open.

Proof. We consider k = 1 (i.e. (3.93a) is of first order) – the case k > 1 can then, in the usual way, be obtained by applying Th. 3.1. To apply Th. 3.35 to the present situation, define the auxiliary function

n+l fj(x,y) for j =1,...,n, F : G K , Fj(x,y) := (3.96) −→ (0 for j = n +1,...,n + l.

Then, since f is continuous and locally Lipschitz with respect to (y,µ), F is continuous and locally Lipschitz with respect to y, and we can apply Th. 3.35 to

y′ = F (x,y), (3.97a) y(ξ)=(η,µ), (3.97b) where (ξ,η,µ) G. According to Th. 3.35, the global solution Y˜ : D Kn+l of ∈ F −→ (3.97a) is continuous on the open set DF . Moreover, by the definition of F in (3.96), we have Y (x,ξ,η,µ) Y˜ (x,ξ,η,µ)= , (x,ξ,η,µ∀)∈DF µ   where Y is as defined in (3.94). In particular, Df = DF and the continuity of Y˜ implies the continuity of Y .  4 LINEAR ODE 63

Example 3.39. As a simple example of a parametrized ODE, consider f : R K2 K, f(x,y,µ) := µy, × −→ y′ = f(x,y,µ)= µy, y(ξ)= η, with the global solution Y : R R K2 K, Y (x,ξ,η,µ)= ηeµ (x−ξ). × × −→

4 Linear ODE

4.1 Definition, Setting

In Sec. 2.2, we saw that the solution of one-dimensional first-order linear ODE was particularly simple. One can now combine the general theory of ODE with some linear algebra to obtain results for n-dimensional linear ODE and, equivalently, for linear ODE of higher order. Notation 4.1. For n N, let (n, K) denote the set of all n n matrices over K. ∈ M × Definition 4.2. Let I R be a nontrivial interval, n N, and let A : I (n, K) and b : I Kn be continuous.⊆ An ODE of the form ∈ −→ M −→ y′ = A(x)y + b(x) (4.1) is called an n-dimensional linear ODE of first order. It is called homogeneous if, and only if, b 0; it is called inhomogeneous if, and only if, it is not homogeneous. ≡ —

Using the notion of matrix norm (cf. Sec. G), it is not hard to show the right-hand side of (4.1) is continuous and locally Lipschitz with respect to y and, thus, every initial value problem for (4.1) has a unique maximal solution (exercise). However, to show the maximal solution is always defined on all of I, we need some additional machinery, which is developed in the next section.

4.2 Gronwall’s Inequality

In the current section, we will provide Gronwall’s inequality, which is also of interest outside the field of ODE. Here, Gronwall’s inequality will allow us to prove the global ex- istence of maximal solutions for ODE with linearly bounded right-hand side – a corollary being that maximal solutions of (4.1) are always defined on all of I. 4 LINEAR ODE 64

As an auxiliary tool on our way to Gronwall’s inequality, we will now briefly study (one-dimensional) differential inequalities:

Definition 4.3. Given G R R = R2, and f : G R, call ⊆ × −→ y′ f(x,y) (4.2) ≤ a (one-dimensional) differential inequality (of first order). A solution to (4.2) is a differ- entiable function w : I R defined on a nontrivial interval I R satisfying the two conditions −→ ⊆

(i) x,w(x) I R : x I G, ∈ × ∈ ⊆ (ii) w′(x) fx,w(x) for each x I. ≤ ∈ Proposition 4.4. Let G R2 be open, let f : G R be continuous and locally Lipschitz with respect to y,⊆ and let

Proof. Consider the auxiliary function

g : G R R, g(x,y,µ) := f(x,y)+ µ (4.4) × −→ and the (parametrized) ODE

y′ = g(x,y,µ)= f(x,y)+ µ. (4.5)

Since f is continuous and locally Lipschitz with respect to y, g is continuous and locally Lipschitz with respect to (y,µ). Thus, continuity in initial conditions as given by Cor. 3.38 applies, yielding the global solution Y : D R, (x,ξ,η,µ) Y (x,ξ,η,µ), to g −→ 7→ be continuous on the open set Dg. We now consider an arbitrary compact subinterval [a,c] [a,b[ with a

4 γǫ := (x,ξ,η,µ) R : dist (x,ξ,η,µ), γ <ǫ Dg. (4.7) ǫ>∃0 ∈ ⊆ n  o 4 LINEAR ODE 65

If we choose the distance in (4.7) to be meant with respect to the max-norm on R4 and if 0 <µ<ǫ, then (x,a,φ(a),µ) γ for each x [a,c], such that φ := Y ( ,a,φ(a),µ) ∈ ǫ ∈ µ · is defined on (a superset of) [a,c]. We proceed to prove w φµ on [a,c]: Seeking a contradiction, assume there exists x [a,c] such that w(x≤) > φ (x ). Due to the 0 ∈ 0 µ 0 continuity of w and φµ, w>φµ must then hold in an entire neighborhood of x0. On the other hand, w(a) φ(a)= φ (a), such that, for ≤ µ x := inf xφ (t) for each t ]x,x ] , 1 0 µ ∈ 0 a x 0, ≤ 1 0 1 µ 1 w(x + h) w(x ) >φ (x + h) φ (x ), 1 − 1 µ 1 − µ 1 implying

′ w(x1 + h) w(x1) φµ(x1 + h) φµ(x1) ′ w (x1) = lim − lim − = φµ(x1) h→0 h ≥ h→0 h

= g x1,φµ(x1),µ = f x1,φµ(x1) + µ>f x1,φµ(x1) = f x1,w(x1) , (4.8) in contradiction to w being a solution to (4.2).   Thus, w φ on [a,c] holds for every 0 <µ<ǫ, and continuity of Y on D yields, ≤ µ g

w(x) lim φµ(x) = lim Y x,a,φ(a),µ = Y x,a,φ(a), 0 = φ(x), (4.9) x∈∀[a,c] ≤ µ→0 µ→0   concluding the proof. 

Theorem 4.5 (Gronwall’s Inequality). Let I := [a,b[, where

(4.10) can be written as ψ(x) w(x). x∀∈I ≤ Moreover, this implies w′(x)= β(x)γ(x)= β(x) α(x)+ ψ(x) β(x)w(x)+ α(x)β(x), x∀∈I ≤ showing w satisfies the (linear) differential inequality y′ β(x) y + α(x) β(x). (4.13) ≤ Continuously extending α and β to x

for each x I, proving (4.17). ∈ —

The following Th. 4.7 will be applied to show maximal solutions to linear ODE are always defined on all of I (with I as in Def. 4.2). However, Th. 4.7 is often also useful to obtain the domains of maximal solutions for nonlinear ODE.

Theorem 4.7. Let n N, let I R be an open interval, and let f : I Kn Kn be continuous. If there exist∈ nonnegative⊆ continuous functions γ, β : I ×R+ such−→ that −→ 0 f(x,y) γ(x)+ β(x) y , (4.19) (x,y)∈∀I×Kn k k≤ k k where denotes some arbitrary norm on Kn, then every maximal solution to k · k y′ = f(x,y) is defined on all of I.

Proof. Let c

Since the continuous function γ is uniformly bounded on the compact interval [x0,d],

x φ(x) C + β(t) φ(t) dt. C∃≥0 x∈[∀x0,d[ k k≤ k k Zx0 Thus, Example 4.6 applies, providing

x φ(x) C exp β(t)dt CeM(d−x0), (4.21) x∈[∀x0,d[ k k≤ ≤ Zx0  where M 0 is a uniform bound for the continuous function β on the compact interval ≥ [x0,d]. As (4.21) states that the graph

gr (φ)= x,φ(x) G : x [x ,d[ + ∈ ∈ 0   4 LINEAR ODE 68

is contained in the compact set

K := [x ,d] y Kn : y CeM(d−x0) , 0 × ∈ k k≤ Prop. 3.24 implies φ has an extension to the right. Now assume a

4.3 Existence, Uniqueness, Vector Space of Solutions

Theorem 4.8. Consider the setting of Def. 4.2 with an open interval I. Then every n initial value problem consisting of the linear ODE (4.1) and y(x0)= y0, x0 I, y0 K , has a unique maximal solution φ : I Kn (note that φ is defined on all∈ of I). ∈ −→ Proof. It is an exercise to show the right-hand side of (4.1) is continuous and locally Lipschitz with respect to y. Thus, every initial value problem has a unique maximal solution by using Cor. 3.16 and Th. 3.22. That each maximal solution is defined on I follows from Th. 4.7, as

A(x)y + b(x) b(x) + A(x) y , x∀∈I ≤ k k where A(x) denotes the matrix norm of A (x) induced by the norm on Kn (cf. Appendix G). k · k 

We will now proceed to study the solution spaces of linear ODE – as it turns out, these solution spaces inherit the linear structure of the ODE.

Notation 4.9. Again, we consider the setting of Def. 4.2. Define i and h to be the respective sets of solutions to (4.1) and its homogeneous version, i.e.L L

:= (φ : I Kn): φ′ = Aφ + b , (4.22a) Li −→ n o := (φ : I Kn): φ′ = Aφ . (4.22b) Lh −→ n o Lemma 4.10. Using Not. 4.9, we have

i = φ + h = φ + ψ : ψ h , (4.23) φ∈L∀ i L L { ∈L } 4 LINEAR ODE 69

i.e. one obtains all solutions to the inhomogeneous equation (4.1) by adding solutions of the homogeneous equation to a particular solution to the inhomogeneous equation (note that this is completely analogous to what occurs for solutions to linear systems of equations in linear algebra).

Proof. Exercise. 

Theorem 4.11. Let I R be a nontrivial interval, n N, and let A : I (n, K) be continuous. Then the⊆ following holds: ∈ −→ M

(a) The set defined in (4.22b) constitutes a vector space over K. Lh (b) For each k N and φ ,...,φ , the following statements are equivalent: ∈ 1 k ∈Lh

(i) The k functions φ1,...,φk are linearly independent over K. n (ii) There exists x0 I such that the k vectors φ1(x0),...,φk(x0) K are linearly independent over∈ K. ∈ n (iii) The k vectors φ1(x),...,φk(x) K are linearly independent over K for every x I. ∈ ∈ (c) The dimension of is n. Lh Proof. (a): Exercise. (b): (iii) trivially implies (ii). That (ii) implies (i) can easily be shown by contraposition: If (i) does not hold, then there is (λ ,...,λ ) Kk 0 such that k λ φ = 0, i.e. 1 k ∈ \ { } j=1 j j k λ φ (x) = 0 holds for each x I, i.e. (ii) does not hold. It remains to show (i) j=1 j j P implies (iii). Once again, we accomplish∈ this via contraposition: If (iii) does not hold, thenP there are (λ ,...,λ ) Kk 0 and x I such that k λ φ (x) = 0. But 1 k ∈ \ { } ∈ j=1 j j then, since k λ φ by (a), k λ φ = 0 (using uniqueness of solutions). In j=1 j j ∈ Lh j=1 j j P consequence, φ1,...,φk are linearly dependent and (i) does not hold. P P (c): Let (b ,...,b ) be a basis of Kn and x I. Let φ ,...,φ be the solutions 1 n 0 ∈ 1 n ∈ Lh to the initial conditions y(x0) = b1,...,y(x0) = bn, respectively. Then the φ1,...,φn must be linearly independent by (b) (as they are linearly independent at x0), proving dim h n. On the other hand, if φ1,...,φk h are linearly independent, k N, L ≥ n ∈ L ∈ n then, once more by (b), φ1(x),...,φk(x) K are linearly independent for each x K , showing k n and dim n. ∈ ∈  ≤ Lh ≤ Example 4.12. In Example 1.4(e), we had claimed that the second-order ODE (1.16) on [a,b], a

had the set of solutions as in (1.17), namely L = (c sin+c cos) : [a,b] K : c ,c K . L 1 2 −→ 1 2 ∈ n  o We are now in a position to verify this claim: The second-order ODE (1.16) is equivalent to the homogeneous linear first-order ODE y′ y 0 1 y 1 = 2 = 1 (4.24) y′ y 1 0 y  2 − 1 −  2 with the vector space of solutions h of dimension 2 over K. Clearly, φ1,φ2 h, where φ ,φ : [a,b] K2 with L ∈L 1 2 −→ sin x cos x φ (x) := , φ (x) := . (4.25) 1 cos x 2 sin x   −  0 1 Moreover, φ1 and φ2 are linearly independent (e.g. since φ1(0) = 1 and φ2(0) = 0 are linearly independent, so are φ ,φ : R K2 by Th. 4.11(b), implying, again by Th. 1 2 −→   4.11(b), the linear independence of φ1(a),φ2(a), finally implying the linear independence of φ ,φ : [a,b] K2). Thus, 1 2 −→ = (c φ + c φ ): [a,b] K2 : c ,c K (4.26) Lh 1 1 2 2 −→ 1 2 ∈ n  o and, since, according to Th. 3.1 the solutions to (1.16) are precisely the first components of solutions to (4.24), the representation (1.17) is verified.

4.4 Fundamental Matrix Solutions and Variation of Constants

Definition and Remark 4.13. A basis (φ1,...,φn), n N, of the n-dimensional vector space over K is called a fundamental system for∈ the linear ODE (4.1). One Lh then also calls the matrix φ11 ... φ1n . . Φ :=  . .  , (4.27) φn1 ... φnn   where the kth column of the matrix consists of the component functions φ1k,...,φnk of φk, k 1,...,n ,a fundamental system or a fundamental matrix solution for (4.1). The latter∈ term { is justified} by the observation that Φ : I (n, K) can be interpreted as a solution to the matrix-valued ODE −→ M Y ′ = A(x) Y : (4.28) Indeed, ′ ′ ′ Φ = φ1,...,φn = A(x) φ1,...,A(x) φn = A(x) Φ.   4 LINEAR ODE 71

Corollary 4.14. Let φ_1, ..., φ_n ∈ L_h, n ∈ N, and let Φ be defined as in (4.27). Then the following statements are equivalent:

(i) Φ is a fundamental system for (4.1).

(ii) There exists x_0 ∈ I such that det Φ(x_0) ≠ 0.

(iii) det Φ(x) ≠ 0 for every x ∈ I.

Proof. The equivalences are a direct consequence of the equivalences in Th. 4.11(b). □

Theorem 4.15 (Variation of Constants). Consider the setting of Def. 4.2. If Φ : I −→ M(n, K) is a fundamental system for (4.1), then the unique solution ψ : I −→ K^n of the initial value problem consisting of (4.1) and y(x_0) = y_0, (x_0, y_0) ∈ I × K^n, is given by

ψ : I −→ K^n,  ψ(x) = Φ(x) Φ^{-1}(x_0) y_0 + Φ(x) \int_{x_0}^{x} Φ^{-1}(t) b(t) dt.   (4.29)

Proof. The initial condition is easily verified:

ψ(x_0) = Φ(x_0) Φ^{-1}(x_0) y_0 + 0 = Id y_0 = y_0.

To check that ψ satisfies (4.1), one computes, for each x ∈ I,

ψ'(x) = Φ'(x) Φ^{-1}(x_0) y_0 + Φ'(x) \int_{x_0}^{x} Φ^{-1}(t) b(t) dt + Φ(x) Φ^{-1}(x) b(x)   (by (I.3))
= A(x) Φ(x) Φ^{-1}(x_0) y_0 + A(x) Φ(x) \int_{x_0}^{x} Φ^{-1}(t) b(t) dt + b(x)   (by (4.28))
= A(x) ψ(x) + b(x),   (4.30)

thereby establishing the case. □

Remark 4.16. The 1-dimensional variation of constants formula (2.2) is actually a special case of (4.29): We note that, for n = 1 and A(x) = a(x), the solution Φ := φ_0 to the 1-dimensional homogeneous equation as defined in (2.2b), i.e.

φ_0 : I −→ K,  φ_0(x) = \exp\Big( \int_{x_0}^{x} a(t) dt \Big) = e^{\int_{x_0}^{x} a(t) dt},

constitutes a fundamental matrix solution in the sense of Def. and Rem. 4.13 (since 1/φ_0 exists). Taking into account Φ(x_0) = φ_0(x_0) = 1, we obtain, for each x ∈ I,

φ(x) = Φ(x) Φ^{-1}(x_0) y_0 + Φ(x) \int_{x_0}^{x} Φ^{-1}(t) b(t) dt   (by (4.29))
= φ_0(x) \Big( y_0 + \int_{x_0}^{x} φ_0^{-1}(t) b(t) dt \Big),   (4.31)

which is (2.2a). —
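To illustrate how (4.29) can be evaluated in practice, the following Python sketch (our addition; the notes themselves contain no code, and NumPy/SciPy as well as the helper name psi are our assumptions) numerically verifies the variation of constants formula for a concrete 2-dimensional system with constant A, where Φ(x) = e^{Ax} serves as an explicitly available fundamental matrix:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec, solve_ivp

# Data for the IVP y' = A y + b(x), y(x0) = y0; A is constant so that
# Phi(x) = e^{Ax} is a known fundamental matrix solution.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
b = lambda x: np.array([0.0, np.cos(x)])
x0, y0 = 0.0, np.array([1.0, 0.0])

def psi(x):
    """Evaluate the variation of constants formula (4.29)."""
    Phi = lambda s: expm(A * s)
    # integral of Phi^{-1}(t) b(t) over [x0, x]:
    integral, _ = quad_vec(lambda t: np.linalg.solve(Phi(t), b(t)), x0, x)
    return Phi(x) @ np.linalg.solve(Phi(x0), y0) + Phi(x) @ integral

# Cross-check against a direct numerical integration of the ODE.
sol = solve_ivp(lambda x, y: A @ y + b(x), (x0, 2.0), y0, rtol=1e-10)
print(np.allclose(psi(2.0), sol.y[:, -1], atol=1e-6))  # expect True
```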

In Sec. 4.6, we will study methods for actually finding fundamental matrix solutions in cases where A is constant. However, in general, fundamental matrix solutions are often not explicitly available. In such situations, the following Th. 4.17 can sometimes help to extract information about solutions.

Theorem 4.17 (Liouville's Formula). Consider the setting of Def. 4.2 and recall that the trace of an n × n matrix A = (a_{kl}) is defined by

tr A := \sum_{k=1}^{n} a_{kk}.

If Φ : I −→ M(n, K) is a fundamental system for (4.1), then, for all x_0, x ∈ I,

\det Φ(x) = \det Φ(x_0) \exp\Big( \int_{x_0}^{x} tr A(t) dt \Big).   (4.32)

Proof. Exercise. □

4.5 Higher-Order, Wronskian

In Th. 3.1, we saw that higher-order ODE are equivalent to systems of first-order ODE. We can now combine Th. 3.1 with our findings regarding first-order linear ODE to help with the solution of higher-order linear ODE.

Definition 4.18. Let I ⊆ R be a nontrivial interval, n ∈ N. Let b : I −→ K and a_0, ..., a_{n-1} : I −→ K be continuous functions. Then a (1-dimensional) linear ODE of nth order is an equation of the form

y^{(n)} = a_{n-1}(x) y^{(n-1)} + \dots + a_1(x) y' + a_0(x) y + b(x).   (4.33)

It is called homogeneous if, and only if, b ≡ 0; it is called inhomogeneous if, and only if, it is not homogeneous. Analogous to (4.22), define the respective sets of solutions

H_i := \Big\{ (φ : I −→ K) : φ^{(n)} = b + \sum_{k=0}^{n-1} a_k φ^{(k)} \Big\},   (4.34a)

H_h := \Big\{ (φ : I −→ K) : φ^{(n)} = \sum_{k=0}^{n-1} a_k φ^{(k)} \Big\}.   (4.34b)

Definition 4.19. Let I ⊆ R be a nontrivial interval, n ∈ N. For each n-tuple of (n − 1) times differentiable functions φ_1, ..., φ_n : I −→ K, define the Wronskian

W(φ_1, ..., φ_n) : I −→ K,  W(φ_1, ..., φ_n)(x) := \det \begin{pmatrix} φ_1(x) & \dots & φ_n(x) \\ φ_1'(x) & \dots & φ_n'(x) \\ \vdots & & \vdots \\ φ_1^{(n-1)}(x) & \dots & φ_n^{(n-1)}(x) \end{pmatrix}.   (4.35)

Theorem 4.20. Consider the setting of Def. 4.18.

(a) If H_i and H_h are the sets defined in (4.34), then H_h is an n-dimensional vector space over K and, if φ ∈ H_i is arbitrary, then

H_i = φ + H_h.   (4.36)

(b) Let φ_1, ..., φ_n ∈ H_h. Then the following statements are equivalent:

(i) φ_1, ..., φ_n are linearly independent over K (i.e. (φ_1, ..., φ_n) forms a basis of H_h).

(ii) There exists x_0 ∈ I such that the Wronskian does not vanish: W(φ_1, ..., φ_n)(x_0) ≠ 0.

(iii) The Wronskian never vanishes, i.e. W(φ_1, ..., φ_n)(x) ≠ 0 for every x ∈ I.

Proof. According to Th. 3.1, (4.33) is equivalent to the first-order linear ODE

y' = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 & 0 \\ 0 & 0 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & 0 \\ 0 & 0 & 0 & \dots & 0 & 1 \\ a_0(x) & a_1(x) & a_2(x) & \dots & a_{n-2}(x) & a_{n-1}(x) \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-2} \\ y_{n-1} \\ y_n \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \\ b(x) \end{pmatrix} =: Ã(x) y + b̃(x).   (4.37)

Define

L_i := \{ (Φ : I −→ K^n) : Φ' = Ã Φ + b̃ \},
L_h := \{ (Φ : I −→ K^n) : Φ' = Ã Φ \}.

(a): Let φ ∈ H_i and define

Φ := \begin{pmatrix} φ \\ φ' \\ \vdots \\ φ^{(n-1)} \end{pmatrix}.

Then

H_h = \{ Ψ_1 : Ψ ∈ L_h \}   (by Th. 3.1(a),(b))

and

H_i = \{ Φ̃_1 : Φ̃ ∈ L_i \} = \{ Φ̃_1 : Φ̃ ∈ Φ + L_h \} = \{ (Φ + Ψ)_1 : Ψ ∈ L_h \} = φ + H_h   (by Th. 3.1(a),(b) and (4.23)).

As a consequence of Th. 3.1, the map J : L_h −→ H_h, J(Φ) := Φ_1, is a linear isomorphism, implying that H_h, like L_h, is an n-dimensional vector space over K.

(b): For φ_1, ..., φ_n ∈ H_h, define Φ_{kl} := φ_l^{(k-1)} for each k, l ∈ {1, ..., n} and, for each x ∈ I,

Φ(x) := (Φ_1(x), ..., Φ_n(x)) = (Φ_{kl}(x)) = \begin{pmatrix} φ_1(x) & \dots & φ_n(x) \\ φ_1'(x) & \dots & φ_n'(x) \\ \vdots & & \vdots \\ φ_1^{(n-1)}(x) & \dots & φ_n^{(n-1)}(x) \end{pmatrix},

such that det Φ(x) = W(φ_1, ..., φ_n)(x) for each x ∈ I. Since Th. 3.1 yields Φ_1, ..., Φ_n ∈ L_h if, and only if, φ_1, ..., φ_n ∈ H_h, the equivalences of (b) follow from the equivalences of Cor. 4.14. □

Example 4.21. Consider a_0, a_1 : R^+ −→ K, a_1(x) := 1/(2x), a_0(x) := −1/(2x²), and

y'' = a_1(x) y' + a_0(x) y = \frac{y'}{2x} − \frac{y}{2x^2}.

One might be able to guess the solutions φ_1, φ_2 : R^+ −→ K, φ_1(x) := x, φ_2(x) := \sqrt{x}. The Wronskian is W(φ_1, φ_2) : R^+ −→ K,

W(φ_1, φ_2)(x) = \det \begin{pmatrix} x & \sqrt{x} \\ 1 & 1/(2\sqrt{x}) \end{pmatrix} = \frac{\sqrt{x}}{2} − \sqrt{x} = −\frac{\sqrt{x}}{2} < 0,

i.e. φ_1 and φ_2 span H_h according to Th. 4.20(b):

H_h = \{ c_1 φ_1 + c_2 φ_2 : c_1, c_2 ∈ K \}.
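For readers who want to double-check such guessed solutions, here is a small SymPy sketch (our addition, not part of the notes) verifying that φ_1(x) = x and φ_2(x) = √x solve the ODE of Example 4.21 and that their Wronskian is −√x/2:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
phi1, phi2 = x, sp.sqrt(x)

# Both functions solve y'' = y'/(2x) - y/(2x^2); the residual simplifies to 0:
for phi in (phi1, phi2):
    residual = sp.diff(phi, x, 2) - sp.diff(phi, x)/(2*x) + phi/(2*x**2)
    print(sp.simplify(residual))  # 0

# Wronskian (4.35): nonvanishing on R^+, so (phi1, phi2) is a basis of H_h.
W = sp.det(sp.Matrix([[phi1, phi2], [sp.diff(phi1, x), sp.diff(phi2, x)]]))
print(sp.simplify(W))  # -sqrt(x)/2
```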

4.6 Constant Coefficients

For 1-dimensional first-order linear ODE, we obtained a solution formula in Th. 2.3 in terms of integrals (of course, in general, evaluating integrals can still be very difficult, and one might need effective and efficient numerical methods). In the previous sections, we have studied systems of first-order linear ODE as well as linear ODE of higher order. Unfortunately, there are no general solution formulas for these situations (one can use (4.29) if one knows a fundamental system, but the problem is the absence of a general procedure to obtain such a fundamental system). However, there is a more satisfying solution theory for the situation of so-called constant coefficients, i.e. if A in (4.1) and the a0,...,an−1 in (4.33) do not depend on x.

4.6.1 Linear ODE of Higher Order

Definition 4.22. Let I ⊆ R be a nontrivial interval, n ∈ N. Let b : I −→ K be continuous and a_0, ..., a_{n-1} ∈ K. Then a (1-dimensional) linear ODE of nth order with constant coefficients is an equation of the form

y^{(n)} = a_{n-1} y^{(n-1)} + \dots + a_1 y' + a_0 y + b(x).   (4.38)

—

In the present context, it is useful to introduce the following notation:

Notation 4.23. Let n ∈ N_0.

(a) Let P denote the set of all polynomials over K, P_n := { P ∈ P : deg P ≤ n }. We will also write P[R], P[C], P_n[R], P_n[C] if we need to be specific about the field of coefficients.

(b) Let I ⊆ R be a nontrivial interval. Let D^n(I) := D^n(I, K) denote the set of all n times differentiable functions f : I −→ K, and let

∂_x : D^1(I) −→ F(I, K) := \{ f : I −→ K \},  ∂_x f := f',

and, for each P ∈ P_n with P(x) = \sum_{j=0}^{n} a_j x^j (a_0, ..., a_n ∈ K), define the differential operator

P(∂_x) : D^n(I) −→ F(I, K),  P(∂_x) f := \sum_{j=0}^{n} a_j ∂_x^j f = \sum_{j=0}^{n} a_j f^{(j)}.   (4.39)

Remark 4.24. Using Not. 4.23(b), the ODE (4.38) can be written concisely as

P(∂_x) y = b(x),  where P(x) := x^n − \sum_{j=0}^{n-1} a_j x^j.   (4.40)

—

The following Prop. 4.25 implies that the differential operator P(∂_x) does not, actually, depend on the representation of the polynomial P.

Proposition 4.25. Let P, P_1, P_2 ∈ P.

(a) If P = P_1 + P_2 and n := max{deg P_1, deg P_2}, then, for each f ∈ D^n(I),

P(∂_x) f = P_1(∂_x) f + P_2(∂_x) f.

(b) If P = P_1 P_2 and n := max{deg P, deg P_1, deg P_2}, then, for each f ∈ D^n(I),

P(∂_x) f = P_1(∂_x) (P_2(∂_x) f).

Proof. Exercise. □

Lemma 4.26. Let λ ∈ K and

f : R −→ K,  f(x) := e^{λx}.   (4.41)

Then, for each P ∈ P,

P(∂_x) f : R −→ K,  P(∂_x) f(x) = P(λ) e^{λx}.   (4.42)

Proof. There exist n ∈ N_0 and a_0, ..., a_n ∈ K such that P(x) = \sum_{j=0}^{n} a_j x^j. One computes

P(∂_x) f(x) = \sum_{j=0}^{n} a_j ∂_x^j e^{λx} = \sum_{j=0}^{n} a_j λ^j e^{λx} = e^{λx} P(λ),

proving (4.42). □

Theorem 4.27. If a_0, ..., a_{n-1} ∈ K, n ∈ N, and P(x) = x^n − \sum_{j=0}^{n-1} a_j x^j has the distinct zeros λ_1, ..., λ_n ∈ K (i.e. P(λ_1) = \dots = P(λ_n) = 0), then (φ_1, ..., φ_n), where

φ_j : I −→ K,  φ_j(x) := e^{λ_j x}  for each j ∈ {1, ..., n},   (4.43)

is a basis of the homogeneous solution space

H_h = \{ (φ : I −→ K) : P(∂_x) φ = 0 \}   (4.44)

to (4.38) (i.e. to (4.40)).

Proof. It is immediate from (4.42) and P(λ_j) = 0 that each φ_j satisfies P(∂_x) φ_j = 0. From Th. 4.20(a), we already know H_h is an n-dimensional vector space over K. Thus, it merely remains to compute the Wronskian. One obtains (cf. (4.35)):

W(φ_1, ..., φ_n)(0) = \begin{vmatrix} 1 & \dots & 1 \\ λ_1 & \dots & λ_n \\ \vdots & & \vdots \\ λ_1^{n-1} & \dots & λ_n^{n-1} \end{vmatrix} = \prod_{1 ≤ l < k ≤ n} (λ_k − λ_l) ≠ 0   (by (H.2)),

since the λ_j are all distinct. We have used that the Wronskian, in the present case, turns out to be a Vandermonde determinant. The formula (H.2) for this type of determinant is provided and proved in Appendix H. We also used that the determinant of a matrix is the same as the determinant of its transpose: det A = det A^t. From W(φ_1, ..., φ_n)(0) ≠ 0 and Th. 4.20(b), we conclude that (φ_1, ..., φ_n) is a basis of H_h. □

Example 4.28. We consider the third-order linear ODE

y''' = 2y'' − y' + 2y,   (4.45)

which can be written as P (∂x)y = 0 with

P(x) := x^3 − 2x^2 + x − 2 = (x^2 + 1)(x − 2) = (x − i)(x + i)(x − 2),   (4.46)

i.e. P has the distinct zeros λ_1 = i, λ_2 = −i, λ_3 = 2. Thus, according to Th. 4.27, the three functions

φ_1, φ_2, φ_3 : R −→ C,  φ_1(x) = e^{ix},  φ_2(x) = e^{-ix},  φ_3(x) = e^{2x},   (4.47)

form a basis of the C-vector space H_h. If we consider (4.45) as an ODE over R, then we are interested in a basis of the R-vector space H_h. We can use linear combinations of φ_1 and φ_2 to obtain such a basis (cf. Rem. 4.33(b) below):

ψ_1, ψ_2 : R −→ R,  ψ_1(x) = \frac{e^{ix} + e^{-ix}}{2} = \cos x,  ψ_2(x) = \frac{e^{ix} − e^{-ix}}{2i} = \sin x.   (4.48)

As explained in Rem. 4.33(b) below, since (φ_1, φ_2, φ_3) is a basis of H_h over C, (ψ_1, ψ_2, φ_3) is a basis of H_h over R.

—
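In examples like 4.28, the zeros of the characteristic polynomial P can also be found numerically; a minimal NumPy sketch (our addition, not part of the notes):

```python
import numpy as np

# Roots of the characteristic polynomial P(x) = x^3 - 2x^2 + x - 2 from (4.46):
print(np.roots([1.0, -2.0, 1.0, -2.0]))  # approx [2, i, -i], matching (4.46)
```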

By working a bit harder, one can generalize Th. 4.27 to the case where P has zeros of higher multiplicity. We provide this generalization in Th. 4.32 below after recalling the notion of zeros of higher multiplicity in Rem. and Def. 4.29, and after providing two preparatory lemmas.

Remark and Definition 4.29. According to the fundamental theorem of algebra (cf. [Phi16a, Th. 8.32, Cor. 8.33(b)]), for every polynomial P ∈ P_n with deg P = n, n ∈ N, there exist r ∈ N with r ≤ n, k_1, ..., k_r ∈ N with k_1 + \dots + k_r = n, and distinct numbers λ_1, ..., λ_r ∈ C such that

P(x) = (x − λ_1)^{k_1} \cdots (x − λ_r)^{k_r}.   (4.49)

Clearly, λ_1, ..., λ_r are precisely the distinct zeros of P and k_j is referred to as the multiplicity of the zero λ_j, j = 1, ..., r.

Lemma 4.30. Let I ⊆ R be a nontrivial interval, λ ∈ K, k ∈ N_0, and f ∈ D^k(I). Then we have, for each x ∈ I,

(∂_x − λ)^k (f(x) e^{λx}) = f^{(k)}(x) e^{λx}.   (4.50)

Proof. The proof is carried out by induction. The case k = 0 is merely the identity (∂_x − λ)^0 (f(x) e^{λx}) = f(x) e^{λx}. For the induction step, let k ≥ 1 and compute, using the product rule,

(∂_x − λ)^k (f(x) e^{λx}) = (∂_x − λ) (f^{(k-1)}(x) e^{λx})   (by ind. hyp.)
= f^{(k)}(x) e^{λx} + f^{(k-1)}(x) λ e^{λx} − λ f^{(k-1)}(x) e^{λx} = f^{(k)}(x) e^{λx},   (4.51)

thereby establishing the case. □

Lemma 4.31. Let P ∈ P and λ ∈ K such that P(λ) ≠ 0. Then, for each Q ∈ P with deg Q = k, k ∈ N_0, it holds that, for each x ∈ R,

P(∂_x) (Q(x) e^{λx}) = R(x) e^{λx},   (4.52)

where R ∈ P is still a polynomial of degree k.

Proof. We can rewrite P (cf. [Phi16a, Th. 6.5(a)]) in the form

P(x) = \sum_{j=0}^{n} b_j (x − λ)^j,  n ∈ N_0,   (4.53)

where b_0 = P(λ) ≠ 0 and the remaining b_j ∈ K can also be calculated from the coefficients of P according to [Phi16a, (6.6)]. We compute

P(∂_x) (Q(x) e^{λx}) = \sum_{j=0}^{n} b_j (∂_x − λ)^j (Q(x) e^{λx}) = \sum_{j=0}^{n} b_j Q^{(j)}(x) e^{λx}   (by (4.53) and (4.50)),

i.e. (4.52) holds with R := \sum_{j=0}^{k} b_j Q^{(j)} (note Q^{(j)} ≡ 0 for j > k), and b_0 ≠ 0 implies deg R = deg Q = k. □

Theorem 4.32. If a_0, ..., a_{n-1} ∈ K, n ∈ N, and P(x) = x^n − \sum_{j=0}^{n-1} a_j x^j has the distinct zeros λ_1, ..., λ_r ∈ K with respective multiplicities k_1, ..., k_r ∈ N, then the set

B := \{ (φ_{jm} : I −→ K) : j ∈ {1, ..., r}, m ∈ {0, ..., k_j − 1} \},   (4.54a)

where

φ_{jm} : I −→ K,  φ_{jm}(x) := x^m e^{λ_j x}  for each j ∈ {1, ..., r}, m ∈ {0, ..., k_j − 1},   (4.54b)

yields a basis of the homogeneous solution space

H_h = \{ (φ : I −→ K) : P(∂_x) φ = 0 \}.

Proof. Since k_1 + \dots + k_r = n implies #B = n and we know dim H_h = n, it suffices to show that B ⊆ H_h and the elements of B are linearly independent. Let φ_{jm} be as in (4.54b). As λ_j is a zero of multiplicity k_j of P, we can write P(x) = Q_j(x) (x − λ_j)^{k_j} with some Q_j ∈ P. From the computation

P(∂_x) φ_{jm}(x) = Q_j(∂_x) (∂_x − λ_j)^{k_j} (x^m e^{λ_j x}) = Q_j(∂_x) ((∂_x^{k_j} x^m) e^{λ_j x}) = 0   (by (4.50) and k_j > m),

we gather B ⊆ H_h. Linear independence of the φ_{jm} is verified by showing

\Big( \sum_{j=1}^{r} Q_j(x) e^{λ_j x} = 0  ∧  Q_j ∈ P_{k_j − 1} for each j ∈ {1, ..., r} \Big)  ⟹  Q_j ≡ 0 for each j ∈ {1, ..., r}.   (4.55)

We prove (4.55) by induction on r. Since e^{λ_1 x} ≠ 0 for each x ∈ R, the case r = 1 is immediate. For the induction step, let r ≥ 2. If at least one Q_j ≡ 0, then the remaining Q_j ≡ 0 as well by the induction hypothesis. It only remains to consider the case that none of the Q_j vanishes identically. In that case, we apply (∂_x − λ_r)^{k_r} to \sum_{j=1}^{r} Q_j(x) e^{λ_j x} = 0, obtaining

\sum_{j=1}^{r-1} R_j(x) e^{λ_j x} = 0   (4.56)

with suitable R_j ∈ P, since Lem. 4.30 yields (∂_x − λ_r)^{k_r} (Q_r(x) e^{λ_r x}) = Q_r^{(k_r)}(x) e^{λ_r x} = 0 and, for j < r, Lem. 4.31 (applied with the polynomial (x − λ_r)^{k_r}, which does not vanish at λ_j) yields R_j with deg R_j = deg Q_j. By the induction hypothesis, (4.56) implies R_j ≡ 0 for each j ∈ {1, ..., r − 1}, in contradiction to deg R_j = deg Q_j ≥ 0 (i.e. R_j ≢ 0). Thus, the case that none of the Q_j vanishes identically cannot actually occur, completing the induction. □

Remark 4.33. (a) If λ_1, λ_2 ∈ C, then complex conjugation has the properties (cf. [Phi16a, Def. and Rem. 5.5])

\overline{λ_1 ± λ_2} = \bar{λ}_1 ± \bar{λ}_2,  \overline{λ_1 λ_2} = \bar{λ}_1 \bar{λ}_2.

In consequence, if P ∈ P[R], then \overline{P(λ)} = P(\bar{λ}) for each λ ∈ C. In particular, if P ∈ P[R] and λ ∈ C \ R is a nonreal zero of P, then \bar{λ} ≠ λ is also a zero of P.

(b) Consider the situation of Th. 4.32 with P ∈ P[R]. Using (a), if φ_{jm} : I −→ C, φ_{jm}(x) = x^m e^{λ_j x}, λ_j ∈ C \ R, occurs in a basis for the C-vector space H_h (with m = 0 in the special case of Th. 4.27), then φ̃_{jm} : I −→ C, φ̃_{jm}(x) = x^m e^{λ̃_j x}, with λ̃_j = \bar{λ}_j will occur as well. Noting that, for each x ∈ R and each λ ∈ C,

e^{λx} = e^{x(Re λ + i Im λ)} = e^{x Re λ} (\cos(x Im λ) + i \sin(x Im λ)),   (4.57a)
e^{\bar{λ}x} = e^{x(Re λ − i Im λ)} = e^{x Re λ} (\cos(x Im λ) − i \sin(x Im λ)),   (4.57b)
\frac{1}{2} (e^{λx} + e^{\bar{λ}x}) = e^{x Re λ} \cos(x Im λ),   (4.57c)
\frac{1}{2i} (e^{λx} − e^{\bar{λ}x}) = e^{x Re λ} \sin(x Im λ),   (4.57d)

one can define

ψ_{jm} : I −→ R,  ψ_{jm}(x) := \frac{1}{2} (φ_{jm}(x) + φ̃_{jm}(x)) = x^m e^{x Re λ_j} \cos(x Im λ_j),   (4.58a)

ψ̃_{jm} : I −→ R,  ψ̃_{jm}(x) := \frac{1}{2i} (φ_{jm}(x) − φ̃_{jm}(x)) = x^m e^{x Re λ_j} \sin(x Im λ_j).   (4.58b)

If one replaces each pair φ_{jm}, φ̃_{jm} in the basis for the C-vector space H_h with the corresponding pair ψ_{jm}, ψ̃_{jm}, then one obtains a basis for the R-vector space H_h: This follows from

\begin{pmatrix} ψ_{jm} \\ ψ̃_{jm} \end{pmatrix} = A \begin{pmatrix} φ_{jm} \\ φ̃_{jm} \end{pmatrix}  with  A := \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2i} & -\frac{1}{2i} \end{pmatrix},  \det A = -\frac{1}{2i} ≠ 0.   (4.59)

Example 4.34. We consider the fourth-order linear ODE

y^{(4)} = −8y'' − 16y,   (4.60)

which can be written as P(∂_x) y = 0 with

P(x) := x^4 + 8x^2 + 16 = (x^2 + 4)^2 = (x − 2i)^2 (x + 2i)^2,   (4.61)

i.e. P has the zeros λ_1 = 2i, λ_2 = −2i, both with multiplicity 2. Thus, according to Th. 4.32, the four functions

φ_{10}, φ_{11}, φ_{20}, φ_{21} : R −→ C,
φ_{10}(x) = e^{2ix},  φ_{11}(x) = x e^{2ix},  φ_{20}(x) = e^{-2ix},  φ_{21}(x) = x e^{-2ix},

form a basis of the C-vector space H_h. If we consider (4.60) as an ODE over R, we can use (4.58) to obtain the basis (ψ_{10}, ψ_{11}, ψ_{20}, ψ_{21}) of the R-vector space H_h, where

ψ_{10}, ψ_{11}, ψ_{20}, ψ_{21} : R −→ R,
ψ_{10}(x) = \cos(2x),  ψ_{11}(x) = x \cos(2x),  ψ_{20}(x) = \sin(2x),  ψ_{21}(x) = x \sin(2x).

—
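As a cross-check of Th. 4.32 in Example 4.34, one can let a computer algebra system solve (4.60) directly; the following SymPy sketch (our addition) reproduces the real basis, up to the naming and ordering of the arbitrary constants:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
# (4.60): y'''' = -8 y'' - 16 y
print(sp.dsolve(sp.Eq(y(x).diff(x, 4), -8*y(x).diff(x, 2) - 16*y(x))))
# expected (up to constant naming):
# y(x) = (C1 + C2*x)*sin(2*x) + (C3 + C4*x)*cos(2*x)
```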

If (4.38) is inhomogeneous, then one can use Th. 4.32 and, if necessary, Rem. 4.33(b), to obtain a basis of the homogeneous solution space H_h, then using the equivalence with systems of first-order linear ODE and variation of constants according to Th. 4.15 to solve (4.38). However, if the function b in (4.38) is such that the following Th. 4.35 applies, then one can avoid using the above strategy to obtain a particular solution φ to (4.38) (and, thus, the entire solution space via H_i = φ + H_h).

Theorem 4.35. Let a_0, ..., a_{n-1} ∈ K, n ∈ N, and P(x) = x^n − \sum_{j=0}^{n-1} a_j x^j. Consider

P(∂_x) y = Q(x) e^{μx},  Q ∈ P,  μ ∈ K.   (4.62)

(a) (no resonance): If P(μ) ≠ 0 and m := deg(Q) ∈ N_0, then there exists a polynomial R ∈ P such that deg(R) = m and

φ : R −→ K,  φ(x) := R(x) e^{μx},   (4.63)

is a solution to (4.62). Moreover, if Q ≡ 1, then one can choose R ≡ 1/P(μ).

(b) (resonance): If μ is a zero of P with multiplicity k ∈ N and m := deg(Q) ∈ N_0, then there exists a solution to (4.62) of the following form:

φ : R −→ K,  φ(x) := R(x) e^{μx},  R ∈ P,  R(x) = \sum_{j=k}^{m+k} c_j x^j,  c_k, ..., c_{m+k} ∈ K.   (4.64)

The reason behind the terms no resonance and resonance will be explained in the following Example 4.36.

Proof. Exercise. □

Example 4.36. Consider the second-order linear ODE

\frac{d^2 x}{dt^2} + ω_0^2 x = a \cos(ωt),  ω, ω_0 ∈ R^+,  a ∈ R \ {0},   (4.65)

which can be written as P(∂_t) x = a \cos(ωt) with

P(t) := t^2 + ω_0^2 = (t − iω_0)(t + iω_0).   (4.66)

Note that the unknown function is written as x depending on the variable t (instead of y depending on x). This is due to the physical interpretation of (4.65), where x represents the position of a so-called harmonic oscillator at time t, having angular frequency ω_0 and being subjected to a periodic external force of angular frequency ω and amplitude a. We can find a particular solution φ to (4.65) by applying Th. 4.35 to

P(∂_t) x = a e^{iωt}.   (4.67)

We have to distinguish two cases:

(a) Case ω ≠ ω_0: In this case, one says that the oscillator and the external force are not in resonance, which explains the term no resonance in Th. 4.35(a). In this case, we can apply Th. 4.35(a) with μ := iω and Q ≡ a, yielding R ≡ a/P(iω) = a/(ω_0^2 − ω^2), i.e.

φ_0 : R −→ C,  φ_0(t) := R(t) e^{μt} = \frac{a}{ω_0^2 − ω^2} e^{iωt},   (4.68a)

is a solution to (4.67) and

φ : R −→ R,  φ(t) := Re φ_0(t) = \frac{a}{ω_0^2 − ω^2} \cos(ωt),   (4.68b)

is a solution to (4.65).

(b) Case ω = ω_0: In this case, one says that the oscillator and the external force are in resonance, which explains the term resonance in Th. 4.35(b). In this case, we can apply Th. 4.35(b) with μ := iω and Q ≡ a, i.e. m = 0, k = 1, yielding R(t) = ct for some c ∈ C. To determine c, we plug x(t) = R(t) e^{μt} into (4.67):

P(∂_t)(c t e^{iωt}) = ∂_t (c e^{iωt} + c iω t e^{iωt}) + ω_0^2 c t e^{iωt}
= c iω e^{iωt} + c iω e^{iωt} − c ω^2 t e^{iωt} + ω_0^2 c t e^{iωt}
= 2c iω e^{iωt} = a e^{iωt}  ⟹  c = a/(2iω).   (4.69)

Thus,

φ_0 : R −→ C,  φ_0(t) := \frac{a}{2iω} t e^{iωt},   (4.70a)

is a solution to (4.67) and

φ : R −→ R,  φ(t) := Re φ_0(t) = \frac{a}{2ω} t \sin(ωt),   (4.70b)

is a solution to (4.65).
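The resonance solution (4.70b) is easily verified symbolically; a short SymPy sketch (our addition; the symbol names are ours):

```python
import sympy as sp

t, omega, a = sp.symbols('t omega a', positive=True)
phi = a*t*sp.sin(omega*t)/(2*omega)   # candidate (4.70b), case omega = omega_0
residual = sp.diff(phi, t, 2) + omega**2*phi - a*sp.cos(omega*t)
print(sp.simplify(residual))  # 0, confirming phi solves (4.65) in resonance
```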

4.6.2 Systems of First-Order Linear ODE

Matrix Exponential Function

Definition 4.37. Let I ⊆ R be a nontrivial interval, n ∈ N, A ∈ M(n, K), and let b : I −→ K^n be continuous. Then a linear ODE with constant coefficients is an equation of the form

y' = A y + b(x),   (4.71)

i.e. a linear ODE where the matrix A does not depend on x.

—

Recalling that the ordinary exponential function exp_a : R −→ C, x ↦ e^{ax}, a ∈ C, is precisely the solution to the initial value problem y' = ay, y(0) = 1, the following definition constitutes a natural generalization:

Definition 4.38. Given n ∈ N, A ∈ M(n, C), define the matrix exponential function

exp_A : R −→ M(n, C),  x ↦ e^{Ax},   (4.72a)

to be the solution to the matrix-valued initial value problem

Y' = AY,  Y(0) = Id,   (4.72b)

i.e. as the fundamental matrix solution of y' = Ay that satisfies Y(0) = Id (sometimes called the principal matrix solution of y' = Ay).

—

The previous definition of the matrix exponential function is further justified by the following result:

Theorem 4.39. For each A ∈ M(n, C), n ∈ N, it holds that, for each x ∈ R,

e^{Ax} = \sum_{k=0}^{∞} \frac{(Ax)^k}{k!}   (4.73)

in the sense that the partial sums on the right-hand side converge pointwise to e^{Ax} on R, where the convergence is even uniform on every compact interval.

Proof. By the equivalence of all norms on C^{n²} ≅ M(n, C), we may choose a convenient norm on M(n, C). So we let ‖·‖ denote an arbitrary operator norm on M(n, C), induced by some norm ‖·‖ on C^n. We first show that the partial sums (A_m(x))_{m∈N}, A_m(x) := \sum_{k=0}^{m} \frac{(Ax)^k}{k!}, in (4.73) form a Cauchy sequence in M(n, C): For M, N ∈ N, N > M, one estimates, for each x ∈ R,

‖A_N(x) − A_M(x)‖ = \Big\| \sum_{k=M+1}^{N} \frac{(Ax)^k}{k!} \Big\| ≤ \sum_{k=M+1}^{N} \frac{‖A‖^k |x|^k}{k!}   (by (G.10)).   (4.74)

Since the convergence \lim_{m→∞} \sum_{k=0}^{m} \frac{‖A‖^k |x|^k}{k!} = e^{‖A‖ |x|} is pointwise for x ∈ R and uniform on every compact interval, (4.74) shows each (A_m(x))_{m∈N} is a Cauchy sequence that converges to some Φ(x) ∈ M(n, C) (by the completeness of M(n, C)), pointwise for x ∈ R and uniformly on every compact interval. It remains to show Φ is the solution to (4.72b), i.e., for each x ∈ R,

Φ(x) = Id + \int_0^x A Φ(t) dt.   (4.75)

Using the identity

A_m(x) = Id + \sum_{k=1}^{m} \frac{(Ax)^k}{k!} = Id + A \sum_{k=0}^{m-1} \frac{A^k x^{k+1}}{(k+1)!} = Id + \int_0^x A \sum_{k=0}^{m-1} \frac{A^k t^k}{k!} dt,

we estimate, for each x ∈ R and each m ∈ N,

\Big\| Φ(x) − Id − \int_0^x A Φ(t) dt \Big\| ≤ ‖Φ(x) − A_m(x)‖ + \Big\| A_m(x) − Id − \int_0^x A Φ(t) dt \Big\|
= ‖Φ(x) − A_m(x)‖ + \Big\| \int_0^x A \Big( \sum_{k=0}^{m-1} \frac{A^k t^k}{k!} − Φ(t) \Big) dt \Big\|
≤ ‖Φ(x) − A_m(x)‖ + ‖A‖ \Big| \int_0^x ‖A_{m-1}(t) − Φ(t)‖ dt \Big| → 0  for m → ∞,

which proves (4.75) and establishes the case.  4 LINEAR ODE 85
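Theorem 4.39 suggests approximating e^{Ax} by partial sums of (4.73). The following Python sketch (our addition, assuming NumPy/SciPy; expm_series is a hypothetical helper name) compares such a partial sum with SciPy's built-in matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

def expm_series(A, x, terms=30):
    """Partial sum of (4.73): sum over k < terms of (Ax)^k / k!."""
    S = np.eye(A.shape[0])  # k = 0 term: identity
    T = np.eye(A.shape[0])  # running term (Ax)^k / k!
    for k in range(1, terms):
        T = T @ (A * x) / k
        S = S + T
    return S

A = np.array([[1.0, 2.0], [0.0, -1.0]])
print(np.allclose(expm_series(A, 0.5), expm(A * 0.5)))  # expect True
```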

The matrix exponential function has some properties that are familiar from the case n = 1 (see Prop. 4.40(a),(b)), but also some properties that are, perhaps, unexpected (see Prop. 4.42(a),(b)).

Proposition 4.40. Let A ∈ M(n, C), n ∈ N.

(a) e^{A(t+s)} = e^{At} e^{As} holds for each s, t ∈ R.

(b) (e^{Ax})^{-1} = e^{A(-x)} = e^{-Ax} holds for each x ∈ R.

(c) For the transpose A^t, one has e^{A^t x} = (e^{Ax})^t for each x ∈ R.

Proof. (a): Fix s ∈ R. The function Φ_s : R −→ M(n, C), Φ_s(t) := e^{A(t+s)}, is a solution to Y' = AY (namely the one for the initial condition Y(−s) = Id). Moreover, the function Ψ_s : R −→ M(n, C), Ψ_s(t) := e^{At} e^{As}, is also a solution to Y' = AY, since

∂_t Ψ_s(t) = ∂_t e^{At} e^{As} = A e^{At} e^{As} = A Ψ_s(t).

Finally, since Ψ_s(0) = e^{A0} e^{As} = Id e^{As} = e^{As} = Φ_s(0), the claimed Φ_s = Ψ_s follows by uniqueness of solutions.

(b) is an easy consequence of (a), since

Id = e^{A0} = e^{A(x-x)} = e^{Ax} e^{-Ax}   (by (a)).

(c): Clearly, the map A ↦ A^t is continuous on M(n, C) (since \lim_{k→∞} A_k = A implies \lim_{k→∞} a_{k,αβ} = a_{αβ} for all components, which implies \lim_{k→∞} A_k^t = A^t), providing, for each x ∈ R,

e^{A^t x} = \lim_{m→∞} \sum_{k=0}^{m} \frac{(A^t x)^k}{k!} = \lim_{m→∞} \Big( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \Big)^t = \Big( \lim_{m→∞} \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \Big)^t = (e^{Ax})^t,

completing the proof. □

Proposition 4.41. Let A ∈ M(n, C), n ∈ N. Then

\det e^{A} = e^{tr A}.

Proof. Applying Liouville's formula (4.32) to Φ(x) := e^{Ax}, x ∈ R, yields

\det e^{Ax} = \det e^{A0} \exp\Big( \int_0^x tr A dt \Big) = 1 \cdot e^{x tr A},   (4.76)

and setting x = 1 in (4.76) proves the proposition. □

Proposition 4.42. Let A, B ∈ M(n, C), n ∈ N.

(a) B e^{Ax} = e^{Ax} B holds for each x ∈ R if, and only if, AB = BA.

(b) e^{(A+B)x} = e^{Ax} e^{Bx} holds for each x ∈ R if, and only if, AB = BA.

Proof. (a): If B e^{Ax} = e^{Ax} B holds for each x ∈ R, then differentiation yields B A e^{Ax} = A e^{Ax} B for each x ∈ R, and the case x = 0 provides B A Id = A Id B, i.e. BA = AB. For the converse, assume BA = AB and define the auxiliary maps

f_B : M(n, C) −→ M(n, C),  f_B(C) := BC,
g_B : M(n, C) −→ M(n, C),  g_B(C) := CB.

If ‖·‖ denotes an operator norm, then ‖B C_1 − B C_2‖ ≤ ‖B‖ ‖C_1 − C_2‖ and ‖C_1 B − C_2 B‖ ≤ ‖B‖ ‖C_1 − C_2‖, showing f_B and g_B to be (even Lipschitz) continuous. Thus,

B e^{Ax} = f_B(e^{Ax}) = f_B\Big( \lim_{m→∞} \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \Big) = \lim_{m→∞} f_B\Big( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \Big)
= \lim_{m→∞} B \sum_{k=0}^{m} \frac{(Ax)^k}{k!} = \lim_{m→∞} \Big( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \Big) B   (using AB = BA)
= \lim_{m→∞} g_B\Big( \sum_{k=0}^{m} \frac{(Ax)^k}{k!} \Big) = g_B(e^{Ax}) = e^{Ax} B,

thereby establishing the case.

(b): Exercise (hint: use (a)). □
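A quick numerical illustration of Prop. 4.42(b) for a noncommuting pair (our addition, assuming NumPy/SciPy):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.allclose(A @ B, B @ A))                    # False: A, B do not commute
print(np.allclose(expm(A + B), expm(A) @ expm(B)))  # False, as Prop. 4.42(b) predicts
```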

Eigenvalues and Jordan Normal Form

We will see that the solution theory of linear ODE with constant coefficients is related to the eigenvalues of A. We recall the definition of this notion:

Definition 4.43. Let n ∈ N and A ∈ M(n, C). Then λ ∈ C is called an eigenvalue of A if, and only if, there exists 0 ≠ v ∈ C^n such that

A v = λ v.   (4.77)

If (4.77) holds, then v ≠ 0 is called an eigenvector for the eigenvalue λ.

Theorem 4.44. Let n ∈ N and A ∈ M(n, C).

(a) For each eigenvalue λ ∈ C of A with eigenvector v ∈ C^n \ {0}, the function

φ : I −→ C^n,  φ(x) := e^{λx} v,   (4.78)

is a solution to the homogeneous version of (4.71).

(b) If {v_1, ..., v_n} is a basis of eigenvectors for C^n, where v_j is an eigenvector with respect to the eigenvalue λ_j ∈ C of A for each j ∈ {1, ..., n}, then φ_1, ..., φ_n with

φ_j : I −→ C^n,  φ_j(x) := e^{λ_j x} v_j  for each j ∈ {1, ..., n},   (4.79)

form a fundamental system for (4.71).

Proof. (a): One computes, for each x ∈ I,

φ'(x) = λ e^{λx} v = e^{λx} A v = A φ(x),

proving that φ solves the homogeneous version of (4.71).

(b): Without loss of generality, we may consider I = R. We already know from (a) that each φ_j is a solution to the homogeneous version of (4.71). Thus, it merely remains to check that φ_1, ..., φ_n are linearly independent. As φ_1(0) = v_1, ..., φ_n(0) = v_n are linearly independent by hypothesis, the linear independence of φ_1, ..., φ_n is provided by Th. 4.11(b). □

To proceed, we need a few more notions and results from linear algebra:

Theorem 4.45. Let n ∈ N and A ∈ M(n, C). Then the following statements (i) and (ii) are equivalent:

(i) There exists a basis B of eigenvectors for C^n, i.e. there exist v_1, ..., v_n ∈ C^n and λ_1, ..., λ_n ∈ C such that B = {v_1, ..., v_n} is a basis of C^n and A v_j = λ_j v_j for each j = 1, ..., n (note that the v_j must all be distinct, whereas some (or all) of the λ_j may be identical).

(ii) There exists an invertible matrix W ∈ M(n, C) such that

W^{-1} A W = \begin{pmatrix} λ_1 & & 0 \\ & \ddots & \\ 0 & & λ_n \end{pmatrix},   (4.80)

i.e. A is diagonalizable (if (4.80) holds, then the columns v_1, ..., v_n of W must actually be the respective eigenvectors to the eigenvalues λ_1, ..., λ_n).

Proof. See, e.g., [Koe03, Th. 8.3.1]. 

Unfortunately, not every matrix A ∈ M(n, C) is diagonalizable. However, every A ∈ M(n, C) can at least be transformed into Jordan normal form:

Theorem 4.46 (Jordan Normal Form). Let n ∈ N and A ∈ M(n, C). There exists an invertible matrix W ∈ M(n, C) such that

B := W^{-1} A W   (4.81)

is in Jordan normal form, i.e. B has block diagonal form

B = \begin{pmatrix} B_1 & & 0 \\ & \ddots & \\ 0 & & B_r \end{pmatrix},   (4.82)

1 ≤ r ≤ n, where each block B_j is a so-called Jordan matrix or Jordan block, i.e.

B_j = (λ_j)  or  B_j = \begin{pmatrix} λ_j & 1 & & & 0 \\ & λ_j & 1 & & \\ & & \ddots & \ddots & \\ & & & λ_j & 1 \\ 0 & & & & λ_j \end{pmatrix},   (4.83)

where λ_j is an eigenvalue of A.

Proof. See, e.g., [Koe03, Th. 9.5.6] or [Str08, Th. 27.13]. 

The reason Th. 4.46 regarding the Jordan normal form is useful for solving linear ODE with constant coefficients is the following theorem:

Theorem 4.47. Let n ∈ N and A, W ∈ M(n, C), where W is assumed invertible.

(a) The following statements (i) and (ii) are equivalent:

(i) φ : I −→ C^n is a solution to y' = Ay.

(ii) ψ := W^{-1} φ : I −→ C^n is a solution to y' = W^{-1} A W y.

(b) e^{W^{-1} A W x} = W^{-1} e^{Ax} W for each x ∈ R.

Proof. (a): The equivalences

φ' = Aφ  ⟺  W^{-1} φ' = W^{-1} A φ  ⟺  ψ' = W^{-1} A W ψ

establish the case.

(b): By definition, x ↦ e^{W^{-1} A W x} is the solution to the initial value problem Y' = W^{-1} A W Y, Y(0) = Id. Thus, noting W^{-1} e^{A0} W = Id and

(W^{-1} e^{Ax} W)' = W^{-1} A e^{Ax} W = W^{-1} A W W^{-1} e^{Ax} W

shows x ↦ W^{-1} e^{Ax} W is a solution to the same initial value problem, establishing (b). □

Remark 4.48. To obtain a fundamental system for (4.71) with A ∈ M(n, C), it suffices to obtain a fundamental system for y' = By, where B := W^{-1} A W is in Jordan normal form and W ∈ M(n, C) is invertible: If φ_1, ..., φ_n are linearly independent solutions to y' = By, then A = W B W^{-1}, Th. 4.47(a), and W being a linear isomorphism yield that ψ_1 := W φ_1, ..., ψ_n := W φ_n are linearly independent solutions to y' = Ay. Moreover, since B is in block diagonal form with each block being a Jordan matrix according to (4.82) and (4.83), it actually suffices to solve y' = By assuming that

B = λ Id + N,   (4.84)

where λ ∈ C and N is a so-called canonical nilpotent matrix, i.e.

N = 0 (zero matrix)  or  N = \begin{pmatrix} 0 & 1 & & & 0 \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ 0 & & & & 0 \end{pmatrix},   (4.85)

where the case N = 0 is already covered by Th. 4.44. The remaining case is covered by the following Th. 4.49.

Theorem 4.49. Let λ ∈ C, k ∈ N, k ≥ 2, and assume 0 ≠ N ∈ M(k, C) is a canonical nilpotent matrix according to (4.85). Then

Φ : R −→ M(k, C),  Φ(x) := e^{λx} \begin{pmatrix} 1 & x & \frac{x^2}{2} & \dots & \frac{x^{k-2}}{(k-2)!} & \frac{x^{k-1}}{(k-1)!} \\ 0 & 1 & x & \dots & \frac{x^{k-3}}{(k-3)!} & \frac{x^{k-2}}{(k-2)!} \\ 0 & 0 & 1 & \ddots & & \vdots \\ \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 & x \\ 0 & 0 & 0 & \dots & 0 & 1 \end{pmatrix},   (4.86)

is a fundamental matrix solution to

Y' = (λ Id + N) Y,  Y(0) = Id,   (4.87)

i.e., for each x ∈ R,

Φ(x) = e^{(λ Id + N) x};   (4.88)

in particular, the columns of Φ provide k solutions to y' = (λ Id + N) y that are linearly independent.

Proof. Φ(0) = Id is immediate from (4.86). Since Φ(x) has upper triangular form with all 1's on the diagonal, we obtain det Φ(x) = e^{kλx} ≠ 0 for each x ∈ R, showing the columns of Φ are linearly independent. Let φ_{αβ} : R −→ C denote the αth component function of the βth column of Φ, i.e., for each α, β ∈ {1, ..., k},

φ_{αβ} : R −→ C,  φ_{αβ}(x) := \begin{cases} e^{λx} \frac{x^{β-α}}{(β-α)!} & \text{for } α ≤ β, \\ 0 & \text{for } α > β. \end{cases}

It remains to show that, for each α, β ∈ {1, ..., k},

φ_{αβ}' = \begin{cases} λ φ_{αβ} + φ_{α+1,β} & \text{for } α < β, \\ λ φ_{αβ} & \text{for } α = β, \\ 0 & \text{for } α > β. \end{cases}   (4.89)

One computes

φ_{αβ}'(x) = \begin{cases} λ e^{λx} \frac{x^{β-α}}{(β-α)!} + e^{λx} \frac{x^{β-(α+1)}}{(β-(α+1))!} & \text{for } α < β, \\ λ e^{λx} \frac{x^{β-α}}{(β-α)!} + 0 & \text{for } α = β, \\ 0 & \text{for } α > β, \end{cases}

i.e. (4.89) holds, completing the proof. □

Example 4.50. For a 2-dimensional real system of linear ODE

y' = Ay,  A ∈ M(2, R),   (4.90)

there exist precisely the following three possibilities (i) – (iii):

(i) The matrix A is diagonalizable with real eigenvalues λ_1, λ_2 ∈ R (λ_1 = λ_2 is possible), i.e. there is a basis {v_1, v_2} of R² such that v_j is an eigenvector for λ_j, j ∈ {1, 2}. In this case, according to Th. 4.44(b), the two functions

φ_1, φ_2 : R −→ K²,  φ_1(x) := e^{λ_1 x} v_1,  φ_2(x) := e^{λ_2 x} v_2,   (4.91)

form a fundamental system for (4.90) (over K).

(ii) The matrix A is diagonalizable with two complex conjugate eigenvalues λ_1, λ_2 ∈ C \ R, λ_2 = λ̄_1. Analogous to (i), one has a basis {v_1, v_2} of C² such that v_j is an eigenvector for λ_j, j ∈ {1, 2}, and the two functions in (4.91) still form a fundamental system for (4.90), but with K replaced by C. However, one can still obtain a real-valued fundamental system as follows: We have

λ_1 = μ + iω,  λ_2 = μ − iω,  where μ ∈ R, ω ∈ R \ {0}.   (4.92)

Thus, if A v_1 = λ_1 v_1 with 0 ≠ v_1 = α + iβ, where α, β ∈ R², then, letting v_2 := v̄_1 = α − iβ, and taking complex conjugates,

A v_2 = A v̄_1 = \overline{A v_1} = \overline{λ_1 v_1} = λ̄_1 v̄_1 = λ_2 v_2

shows v_2 is an eigenvector with respect to λ_2. Thus, φ_2 = φ̄_1 and, similar to the approach described in Rem. 4.33(b) above, we can let

\begin{pmatrix} ψ_1 \\ ψ_2 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2i} & -\frac{1}{2i} \end{pmatrix} \begin{pmatrix} φ_1 \\ φ_2 \end{pmatrix}

to obtain a fundamental system {ψ_1, ψ_2} for (4.90) over R, where ψ_1, ψ_2 : R −→ R²,

ψ_1(x) = Re(φ_1(x)) = Re\big( e^{(μ+iω)x} (α + iβ) \big) = Re\big( e^{μx} (\cos(ωx) + i \sin(ωx)) (α + iβ) \big)
= e^{μx} (α \cos(ωx) − β \sin(ωx)),   (4.93a)

ψ_2(x) = Im(φ_1(x)) = e^{μx} (α \sin(ωx) + β \cos(ωx)).   (4.93b)

(iii) The matrix A has precisely one eigenvalue λ ∈ R and the corresponding eigenspace is 1-dimensional. Then there is an invertible matrix W ∈ M(2, R) such that B := W^{-1} A W is in (nondiagonal) Jordan normal form, i.e.

B = W^{-1} A W = \begin{pmatrix} λ & 1 \\ 0 & λ \end{pmatrix}.

According to Th. 4.49, the two functions

φ_1, φ_2 : R −→ K²,  φ_1(x) := e^{λx} \begin{pmatrix} 1 \\ 0 \end{pmatrix},  φ_2(x) := e^{λx} \begin{pmatrix} x \\ 1 \end{pmatrix},   (4.94)

form a fundamental system for y' = By (over K). Thus, according to Th. 4.47, the two functions

ψ_1, ψ_2 : R −→ K²,  ψ_1(x) := W φ_1(x),  ψ_2(x) := W φ_2(x),   (4.95)

form a fundamental system for (4.90) (over K).

Remark 4.51. One way of finding a fundamental matrix solution for y' = Ay, A ∈ M(n, C), is to obtain e^{Ax}, using the following strategy based on Jordan normal forms:

(i) Determine the distinct eigenvalues λ_1, ..., λ_s, 1 ≤ s ≤ n, of A, which are precisely the zeros of the characteristic polynomial χ_A(x) := det(A − x Id) (the multiplicity of the zero λ_j is called its algebraic multiplicity, the dimension of the eigenspace ker(A − λ_j Id) its geometric multiplicity).

(ii) Determine the Jordan normal form B of A and W such that B = W^{-1} A W. In general, this means computing the (finitely many distinct) powers (A − λ_j Id)^k and (suitable bases of) ker((A − λ_j Id)^k) (in general, this is a somewhat involved procedure and it is referred to [Mar04, Sections 4.2, 4.3] and [Str08, Sec. 27] for details – the lecture notes do not provide further details here, as they rather recommend using the Putzer algorithm as described below instead).

(iii) For each Jordan block B_j (as in (4.83)) of B, compute e^{B_j x} as in (4.86); by Th. 4.47(b), one then obtains e^{Ax} = W e^{Bx} W^{-1}.

As step (ii) above tends to be complicated in practice, it is usually easier to obtain e^{Ax} using the Putzer algorithm described next.

Putzer Algorithm

The Putzer algorithm due to [Put66] is a procedure for computing e^{Ax} that avoids the difficulty of determining the Jordan normal form of A and is, thus, often more efficient to employ in practice than the procedure described in Rem. 4.51 above. The Putzer algorithm is provided by the following theorem:

Theorem 4.52. Let A ∈ M(n, C), n ∈ N, and let λ_1, ..., λ_n ∈ C be precisely the eigenvalues of A (not necessarily distinct, each eigenvalue occurring possibly repeatedly according to its multiplicity). Then, for each x ∈ R,

e^{Ax} = \sum_{k=0}^{n-1} p_{k+1}(x) M_k,   (4.96)

where the functions p_1, ..., p_n : R −→ C and matrices M_0, ..., M_{n-1} ∈ M(n, C) are defined recursively by

p_1' = λ_1 p_1,  p_1(0) = 1,   (4.97a)
p_k' = λ_k p_k + p_{k-1},  p_k(0) = 0  for k = 2, ..., n   (4.97b)

(i.e. each pk is a solution to a (typically nonhomogeneous) 1-dimensional first-order linear ODE that can be solved using (2.2)) and

M_0 := Id,   (4.98a)
M_k := M_{k-1} (A − λ_k Id)  for k = 1, ..., n − 1.   (4.98b)

Proof. Note that (4.98) can be extended to k = n, yielding

M_n = \prod_{k=1}^{n} (A − λ_k Id) = χ_A(A) = 0,

since each matrix annihilates its characteristic polynomial according to the Cayley-Hamilton theorem (cf. [Koe03, Th. 8.4.6] or [Str08, Th. 26.6]). Also note, for each k = 0, ..., n − 1,

A M_k = M_k (A − λ_{k+1} Id) + λ_{k+1} M_k = M_{k+1} + λ_{k+1} M_k.   (4.99)

We have to show that x ↦ Φ(x) := \sum_{k=0}^{n-1} p_{k+1}(x) M_k solves the initial value problem Y' = AY, Y(0) = Id. The initial condition is satisfied, as Φ(0) = p_1(0) M_0 = Id, and the ODE is satisfied, as, for each x ∈ R,

Φ'(x) − A Φ(x) = \sum_{k=0}^{n-1} p_{k+1}'(x) M_k − A \sum_{k=0}^{n-1} p_{k+1}(x) M_k
= λ_1 p_1(x) M_0 + \sum_{k=1}^{n-1} \big( λ_{k+1} p_{k+1}(x) + p_k(x) \big) M_k − \sum_{k=0}^{n-1} p_{k+1}(x) \big( M_{k+1} + λ_{k+1} M_k \big)   (by (4.97), (4.99))
= −p_n(x) M_n = 0,

completing the proof. □
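Since the p_k in (4.97) solve a triangular linear system p' = Lp, the Putzer algorithm is straightforward to implement numerically; a minimal sketch (our addition, assuming NumPy/SciPy; putzer_expm is a hypothetical helper name):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

def putzer_expm(A, x):
    """Compute e^{Ax} via the Putzer algorithm (Th. 4.52), for x > 0.

    The functions p_1, ..., p_n of (4.97) satisfy the lower-bidiagonal
    linear system p' = L p, p(0) = e_1, which we integrate numerically.
    """
    n = A.shape[0]
    lam = np.linalg.eigvals(A)                      # eigenvalues, with multiplicity
    L = np.diag(lam) + np.diag(np.ones(n - 1), -1)  # (4.97) in matrix form
    sol = solve_ivp(lambda t, p: L @ p, (0.0, x),
                    np.eye(n, dtype=complex)[0], rtol=1e-10, atol=1e-12)
    p = sol.y[:, -1]                                # p_1(x), ..., p_n(x)
    M = np.eye(n, dtype=complex)                    # M_0 = Id, cf. (4.98a)
    E = np.zeros((n, n), dtype=complex)
    for k in range(n):                              # (4.96): sum p_{k+1}(x) M_k
        E += p[k] * M
        M = M @ (A - lam[k] * np.eye(n))            # (4.98b)
    return E

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
print(np.allclose(putzer_expm(A, 1.0), expm(A)))    # expect True
```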

5 Stability

5.1 Qualitative Theory, Phase Portraits

In the qualitative theory of ODE, which can be seen as part of the field of dynamical systems, the idea is to understand the set of solutions to an ODE (or to a class of ODE), if possible, without making use of explicit solution formulas, which, in most situations, are not available anyway. Examples of qualitative questions are whether, and under which conditions, solutions to an ODE are constant, periodic, or unbounded, approach some limit (more generally, the solutions' asymptotic behavior), etc. One often thinks of the solutions as depending on a time-like variable, and then qualitative theory typically means disregarding the speed of change and focusing instead on the shape/geometry of the solution's image.

The topic of stability takes continuity in initial conditions further and investigates the behavior of solutions that are, at least initially, close to some given solution: Under which conditions do nearby solutions approach each other or diverge away from each other, show the same or different asymptotic behavior, etc.?

Even though the above-described considerations are not limited to this situation, a natural starting point is to consider first-order ODE where the right-hand side does not depend on x. In the following, we will mostly be concerned with this type of ODE, which has a special name:

Definition 5.1. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n, then the n-dimensional first-order ODE

y' = f(y)   (5.1)

is called autonomous and Ω is called the phase space.

Remark 5.2. In fact, nonautonomous ODE are not really more general than autonomous ODE, due to the, perhaps, surprising Th. J.1 of the Appendix, which states that every nonautonomous ODE is equivalent to an autonomous ODE. However, this fact is of little practical relevance, since the autonomous ODE arising via Th. J.1 from nonautonomous ODE can never have bounded solutions on unbounded intervals, whereas the theory of autonomous ODE is most powerful and useful for ODE that admit bounded solutions on unbounded intervals (such as constant or periodic solutions, or solutions approaching constant or periodic functions).

Lemma 5.3. If, in the context of Def. 5.1, φ : I −→ K^n is a solution to (5.1), defined on the interval I ⊆ R, then, for each ξ ∈ R,

φ_ξ : I − ξ −→ K^n,  φ_ξ(x) := φ(x + ξ),  where I − ξ := \{ x − ξ ∈ R : x ∈ I \},   (5.2)

is another solution to (5.1). In consequence, if φ is a maximal solution, then so is φ_ξ.

Proof. Clearly, I − ξ is an interval. Note x ∈ I − ξ ⟹ x + ξ ∈ I and, since φ is a solution to (5.1), it is φ(I) ⊆ Ω, implying φ_ξ(I − ξ) ⊆ Ω. Finally, for each x ∈ I − ξ,

φ_ξ'(x) = φ'(x + ξ) = f(φ(x + ξ)) = f(φ_ξ(x)),

completing the proof that φ_ξ is a solution. Since each extension of φ yields an extension of φ_ξ and vice versa, φ is a maximal solution if, and only if, φ_ξ is a maximal solution. □

Lemma 5.4. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n is such that (5.1) admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient, but not necessary, cf. Def. 3.32), then the global solution Y : D_f −→ K^n of (5.1) satisfies

(a) Y(x, ξ, η) = Y(x − ξ, 0, η) for each (x, ξ, η) ∈ D_f.

(b) Y(x, 0, Y(x̃, 0, η)) = Y(x + x̃, 0, η) for each (x, x̃, η) ∈ R × R × K^n such that (x̃, 0, η) ∈ D_f and (x, 0, Y(x̃, 0, η)) ∈ D_f.

Proof. (a): If ψ : I_{ξ,η} −→ K^n and φ : I_{0,η} −→ K^n denote the maximal solutions to the initial data y(ξ) = η and y(0) = η, respectively, then (a) claims, using the notation from Lem. 5.3, ψ = φ_{-ξ}. As a consequence of Lem. 5.3, φ_{-ξ} : I_{0,η} + ξ −→ K^n is some maximal solution to (5.1) and, since φ_{-ξ}(ξ) = φ(0) = η = ψ(ξ), the assumed uniqueness yields the claimed ψ = φ_{-ξ}, in particular, I_{ξ,η} = I_{0,η} + ξ.

(b): Let η̃ := Y(x̃, 0, η). If ψ : I_{0,η̃} −→ K^n and φ : I_{0,η} −→ K^n denote the maximal solutions to the initial data y(0) = η̃ and y(0) = η, respectively, then (b) claims ψ = φ_{x̃}. As a consequence of Lem. 5.3, φ_{x̃} : I_{0,η} − x̃ −→ K^n is some maximal solution to (5.1) and, since φ_{x̃}(0) = φ(x̃) = η̃ = ψ(0), the assumed uniqueness yields the claimed ψ = φ_{x̃}, in particular, I_{0,η̃} = I_{0,η} − x̃. □

Definition 5.5. Let I ⊆ R be an interval and φ : I −→ S (in principle, S can be arbitrary).

(a) The image of I under φ, i.e.

O(φ) := φ(I) = \{ φ(x) : x ∈ I \} ⊆ S   (5.3)

is often referred to as the orbit of φ in the present context of qualitative ODE theory.

(b) φ : R −→ S (note I = R) is called periodic if, and only if, there exists a smallest ω > 0 (called the period of φ) such that, for each x ∈ R,

φ(x + ω) = φ(x).   (5.4)

The requirement ω > 0 means constant functions are not periodic in the sense of this definition.

Lemma 5.6. Let φ : R −→ K^n, n ∈ N.

(a) If φ is continuous and (5.4) holds for some ω > 0, then φ is either constant or periodic in the sense of Def. 5.5(b).

(b) (a) is false without the assumption of φ being continuous.

Proof. Exercise. □

Definition 5.7. Let Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n. In the context of the autonomous ODE (5.1), the zeros of f are called the fixed points of the ODE (5.1) (cf. Lem. 5.8 below). One then sometimes uses the notation

F := F_f := \{ η ∈ Ω : f(η) = 0 \}   (5.5)

for the set of fixed points.

Lemma 5.8. Let Ω ⊆ K^n, n ∈ N, f : Ω −→ K^n, η ∈ Ω. Then the following statements are equivalent:

(i) f(η)=0, i.e. η is a fixed point of (5.1).

(ii) φ : R −→ K^n, φ ≡ η, is a solution to (5.1).

Proof. If f(η) = 0 and φ ≡ η, then φ'(x) = 0 = f(φ(x)) for each x ∈ R, i.e. (i) implies (ii). Conversely, if φ ≡ η is a solution to (5.1), then f(η) = f(φ(x)) = φ'(x) = 0, i.e. (ii) implies (i). □

Proposition 5.9. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n is such that (5.1) admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, for maximal solutions φ_1 : I_1 −→ K^n, φ_2 : I_2 −→ K^n to (5.1), defined on open intervals I_1, I_2, respectively, precisely one of the following two statements (i) and (ii) is true:

(i) O(φ_1) ∩ O(φ_2) = ∅, i.e. the solutions have disjoint orbits.

(ii) There exists ξ ∈ R such that

I_2 = I_1 − ξ  and, for each x ∈ I_2,  φ_2(x) = φ_1(x + ξ).   (5.6)

In particular, it follows in this case that O(φ_1) = O(φ_2), i.e. the solutions have the same orbit.

Proof. Suppose (i) does not hold. Then there are x_1 ∈ I_1 and x_2 ∈ I_2 such that φ_1(x_1) = φ_2(x_2). Define ξ := x_1 − x_2 and consider

φ : I_1 − ξ −→ K^n,  φ(x) := φ_1(x + ξ).   (5.7)

Then φ is a maximal solution of (5.1) by Lem. 5.3 and φ(x_2) = φ_1(x_1) = φ_2(x_2). By uniqueness of maximal solutions, we obtain φ = φ_2, in particular, I_2 = I_1 − ξ, proving (5.6). Clearly, (5.6) implies O(φ_1) = O(φ_2). □

Proposition 5.10. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n is such that (5.1) admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, for each maximal solution φ : I −→ K^n to (5.1), defined on the open interval I, precisely one of the following three statements is true:

(i) φ is injective.

(ii) I = R and φ is periodic.

(iii) I = R and φ is constant (in this case η := φ(0) is a fixed point of (5.1)).

Proof. Clearly, (i) – (iii) are mutually exclusive. Suppose (i) does not hold. Then there exist x_1, x_2 ∈ I, x_1 < x_2, such that φ(x_1) = φ(x_2). Applying Prop. 5.9 with φ_1 = φ_2 = φ (and the roles of x_1, x_2 chosen such that ξ := x_2 − x_1 > 0) yields I = I − ξ and φ(x) = φ(x + ξ) for each x ∈ I. Since ξ > 0, this means I = R and the validity of (5.4). As φ is also continuous, by Lem. 5.6(a), either (ii) or (iii) must hold. □

Corollary 5.11. If Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n is such that (5.1) admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then the orbits of maximal solutions to (5.1) partition the phase space Ω into disjoint sets. Moreover, every point η ∈ Ω is either a fixed point, or it belongs to some periodic orbit, or it belongs to the orbit of some injective solution.

Proof. The corollary merely summarizes Prop. 5.9 and Prop. 5.10. 

Definition 5.12. In the situation of Cor. 5.11, a phase portrait for (5.1) is a sketch showing representative orbits. Thus, the sketch shows subsets of the phase space Ω, including fixed points (if any) and representative periodic solutions (if any). Usually, one also uses arrows to indicate the direction in which each drawn orbit is traced as the variable x increases.

Example 5.13. Even though it is a main goal of qualitative theory to obtain phase portraits without the need of explicit solution formulas, and we will study techniques for accomplishing this below, we will make use of explicit solution formulas for our first two examples of phase portraits.

(a) Consider the autonomous linear ODE

\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_2 \\ y_1 \end{pmatrix}.   (5.8)

Here we have Ω = R² and f : Ω −→ Ω, f(y_1, y_2) = (−y_2, y_1). The only fixed point is (0, 0). Clearly, for each r > 0, φ : R −→ R², φ(x) := (r \cos x, r \sin x), is a solution to (5.8) and its orbit is the circle with radius r around the origin. Since every point of Ω belongs to such a circle, every orbit is either the origin or a circle around the origin. Thus, the phase portrait consists of such circles plus the origin and arrows that indicate the circles are traversed counterclockwise.

(b) As compared to the previous one, the phase portrait of the autonomous linear ODE

\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ y_1 \end{pmatrix}   (5.9)

is more complicated: While (0, 0) is still the only fixed point, for each r > 0, all the following functions φ_1, φ_2, φ_3, φ_4 : R −→ R² are solutions:

φ_1(x) := (r \cosh x, r \sinh x),   (5.10a)
φ_2(x) := (−r \cosh x, −r \sinh x),   (5.10b)
φ_3(x) := (r \sinh x, r \cosh x),   (5.10c)
φ_4(x) := (−r \sinh x, −r \cosh x),   (5.10d)

each type describing a hyperbolic orbit in some section of the plane R². These sections are separated by rays, forming the orbits of the solutions φ_5, φ_6, φ_7, φ_8 : R −→ R²:

φ_5(x) := (e^x, e^x),   (5.10e)
φ_6(x) := (−e^x, −e^x),   (5.10f)
φ_7(x) := (e^{-x}, −e^{-x}),   (5.10g)
φ_8(x) := (−e^{-x}, e^{-x}).   (5.10h)

The two rays on \{ (y_1, y_1) : y_1 ≠ 0 \} move away from the origin, whereas the two rays on \{ (y_1, −y_1) : y_1 ≠ 0 \} move toward the origin. The hyperbolic orbits asymptotically approach the ray orbits and are traversed such that the flow direction agrees between approaching orbits.
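Phase portraits such as those of Example 5.13 are conveniently drawn from the direction field alone; a short matplotlib sketch (our addition; all names and plotting choices are ours):

```python
import numpy as np
import matplotlib.pyplot as plt

# Direction-field phase portraits of (5.8) (left) and (5.9) (right).
fig, axes = plt.subplots(1, 2, figsize=(9, 4))
y1, y2 = np.meshgrid(np.linspace(-2, 2, 25), np.linspace(-2, 2, 25))
for ax, (f1, f2), title in [
    (axes[0], (-y2, y1), "(5.8): circles around the origin"),
    (axes[1], (y2, y1), "(5.9): hyperbolas and separating rays"),
]:
    ax.streamplot(y1, y2, f1, f2, density=1.2)
    ax.plot(0, 0, 'ko')   # the fixed point at the origin
    ax.set_title(title)
plt.show()
```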

The next results will be useful to obtain new phase portraits from previously known phase portraits in certain situations.

Proposition 5.14. Let Ω ⊆ K^n, n ∈ N, let I ⊆ R be some nontrivial interval, let f : Ω −→ K^n, and let φ : I −→ K^n be a solution to

y' = γ(x) f(y),   (5.11)

where γ : I −→ R is continuous. If φ'(x) ≠ 0 for each x ∈ I (if one thinks of x as time, then one can think of φ' as the velocity of φ), then there exists a continuously differentiable bijective map λ : J −→ I, defined on some nontrivial interval J, such that (φ ∘ λ) : J −→ K^n is a solution to y' = f(y).

Proof. Since φ'(x) ≠ 0 for each x ∈ I, one has γ(x) ≠ 0 for each x ∈ I. As γ is also continuous, it must be either always negative or always positive. In consequence, fixing x_0 ∈ I, Γ : I −→ R, Γ(x) := \int_{x_0}^{x} γ(t) dt, is continuous and either strictly increasing or strictly decreasing. In particular, Γ is injective, J := Γ(I) is an interval, and Γ : I −→ J ⊆ R is bijective. The desired function λ is λ := Γ^{-1} : J −→ I. Indeed, according to [Phi16a, Th. 9.9], λ is differentiable and its derivative is the continuous function

λ' : J −→ R,  λ'(x) = \frac{1}{γ(λ(x))},

implying, for each x ∈ J,

(φ ∘ λ)'(x) = φ'(λ(x)) λ'(x) = γ(λ(x)) f(φ(λ(x))) \frac{1}{γ(λ(x))} = f(φ(λ(x))),

showing φ ∘ λ is a solution to y' = f(y) as required. □

Proposition 5.15. Let Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n. Moreover, consider a continuous function h : Ω −→ R with the property that either h > 0 everywhere on Ω or h < 0 everywhere on Ω.

(a) If f has no zeros (i.e. F = ∅), then the ODE

y' = f(y),   (5.12a)
y' = h(y) f(y)   (5.12b)

have precisely the same orbits, i.e. every orbit of a solution to (5.12a) is an orbit of a solution to (5.12b) and vice versa.

(b) If f and h are such that the ODE (5.12) admit unique maximal solutions, then the ODE (5.12) have precisely the same orbits (even if F ≠ ∅).

Proof. (a): If φ : I −→ K^n is a solution to (5.12b), then γ := h ∘ φ is well-defined and continuous. Since F = ∅ implies φ' ≠ 0, we can apply Prop. 5.14 to obtain the existence of a bijective λ_1 : J_1 −→ I such that φ ∘ λ_1 is a solution to (5.12a). Thus, O(φ) = O(φ ∘ λ_1). Conversely, if ψ : I −→ K^n is a solution to (5.12a), i.e. to y' = \frac{h(y) f(y)}{h(y)}, then γ := 1/(h ∘ ψ) is well-defined and continuous. Since F = ∅ implies

ψ' ≠ 0, we can apply Prop. 5.14 to obtain the existence of a bijective λ_2 : J_2 −→ I such that ψ ∘ λ_2 is a solution to (5.12b). Thus, O(ψ) = O(ψ ∘ λ_2).

(b): We are now in the situation of Prop. 5.10 and Cor. 5.11, and from (a) we know every nonconstant orbit of (5.12a) is a nonconstant orbit of (5.12b) and vice versa. However, since h > 0 or h < 0, both ODE in (5.12) have precisely the same constant solutions, concluding the proof. □

Remark 5.16. We apply Prop. 5.15 to phase portraits (in particular, assume unique maximal solutions). Prop. 5.15 says that overall multiplication with a continuous positive function h does not change the phase portrait at all. Moreover, Prop. 5.15 also states that overall multiplication with a continuous negative function h does not change the partition of Ω into solution orbits. However, after multiplication with a negative h, the orbits are clearly traversed in the opposite direction, i.e., for negative h, the arrows in the phase portrait have to be reversed. For a general continuous h, this implies the phase portrait remains the same in each region of Ω where h > 0; it remains the same, except for the arrows reversed, in each region of Ω where h < 0; and the zeros of h add additional fixed points, cutting some of the previous orbits. We summarize how to obtain the phase portrait of (5.12b) from that of (5.12a):

(1) Start with the phase portrait of (5.12a).

(2) Add the zeros of h as additional fixed points (if any). Previous orbits are cut where fixed points are added.

(3) Reverse the arrows where h < 0.

Example 5.17. (a) Consider the ODE

y_1' = −y_2 ((y_1 − 1)^2 + y_2^2),
y_2' = y_1 ((y_1 − 1)^2 + y_2^2),   (5.13)

which comes from multiplying the right-hand side of (5.8) by h(y) = (y_1 − 1)^2 + y_2^2. The phase portrait is the same as the one for (5.8), except for the added fixed point at (1, 0).

(b) Consider the ODE

y_1' = −y_1 y_2 + y_2^2,
y_2' = −y_1 y_2 + y_1^2,   (5.14)

which comes from multiplying the right-hand side of (5.8) by h(y) = y_1 − y_2. The phase portrait is obtained from that of (5.8), where additional fixed points are on the line with y_1 = y_2. This line cuts each previously circular orbit into two segments. The arrows have to be reversed for y_2 > y_1, that means above the y_1 = y_2 line.

Definition 5.18. Let Ω ⊆ R^n, n ∈ N, and f : Ω −→ R^n. A function E : Ω −→ R is called an integral for the autonomous ODE (5.1), i.e. for y' = f(y), if, and only if, E ∘ φ is constant for every solution φ of (5.1).

Lemma 5.19. Let Ω ⊆ R^n be open, n ∈ N, and f : Ω −→ R^n such that each initial value problem for (5.1) has at least one solution (f continuous is sufficient by Th. 3.8). Then a differentiable function E : Ω −→ R is an integral for (5.1) if, and only if, for each y ∈ Ω,

(∇E)(y) • f(y) = \sum_{j=1}^{n} ∂_j E(y) f_j(y) = 0.   (5.15)

Proof. Let φ : I −→ R^n be a solution to y' = f(y). Then, by the chain rule, for each x ∈ I,

(E ∘ φ)'(x) = (∇E)(φ(x)) • φ'(x) = (∇E)(φ(x)) • f(φ(x)).   (5.16)

The differentiable function E ∘ φ : I −→ R is constant on the interval I if, and only if, (E ∘ φ)' ≡ 0. Thus, by (5.16), E ∘ φ being constant for every solution φ is equivalent to ((∇E) • f)(y) = 0 for each y ∈ Ω such that at least one solution passes through y. □

Example J.2 of the Appendix, pointed out by Anton Sporrer, shows the hypothesis of Lem. 5.19, that each initial value problem for (5.1) has at least one solution, can not be omitted.

The following Prop. 5.20 makes use of integrals and applies to phase portraits of 2-dimensional real ODE:

Proposition 5.20. Let Ω ⊆ R² be open, and let f : Ω −→ R² be continuous and such that (5.1) admits unique maximal solutions (f being locally Lipschitz is sufficient). Assume E : Ω −→ R to be a continuously differentiable integral for (5.1), i.e. for y' = f(y), satisfying ∇E(y) ≠ 0 for each y ∈ Ω. Then the following statements hold for each maximal solution φ : I −→ R² of (5.1) (I ⊆ R some open interval):

(a) If (x_m)_{m∈N} is a sequence in I such that \lim_{m→∞} φ(x_m) = η ∈ Ω, then η ∈ F (i.e. η is a fixed point) or η ∈ O(φ) (i.e. there exists ξ ∈ I with φ(ξ) = η).

(b) Let C ∈ R be such that E ∘ φ ≡ C (such a C exists, as E is an integral). If E^{-1}{C} = \{ y ∈ Ω : E(y) = C \} is compact and E^{-1}{C} ∩ F = ∅, then φ is periodic.

Proof. Throughout the proof, let C be as in (b), i.e. E ∘ φ ≡ C.

(a): The continuity of E yields E(η) = \lim_{m→∞} E(φ(x_m)) = C. Moreover, by hypothesis, (ǫ_1, ǫ_2) := ∇E(η) ≠ (0, 0). We proceed with the proof for ǫ_2 ≠ 0 – if ǫ_2 = 0 and ǫ_1 ≠ 0, then the roles of the indices 1, 2 have to be switched in the following. We apply the implicit function theorem [Phi16b, Th. 4.49] to the function f̃ : Ω −→ R, f̃(y) := E(y) − C, at its zero η = (η_1, η_2). By [Phi16b, Th. 4.49], there exist ǫ, δ > 0 and a continuously differentiable map g : I_g −→ R, I_g := ]η_1 − δ, η_1 + δ[, such that g(η_1) = η_2,

E(s, g(s)) = C  for each s ∈ I_g,   (5.17a)

and, having fixed some arbitrary norm ‖·‖ on R², for each y ∈ Ω,

(‖y − η‖ < ǫ ∧ E(y) = C)  ⟹  y = (s, g(s)) for some s ∈ I_g.   (5.17b)

We now assume η ∉ F and show η ∈ O(φ). If η ∉ F, then f(η) ≠ 0 and the continuity of f and g imply there is δ̃ > 0, δ̃ ≤ δ, such that, for each s ∈ Ĩ := ]η_1 − δ̃, η_1 + δ̃[, f(s, g(s)) ≠ 0. Define the auxiliary function ϕ : Ĩ −→ Ω, ϕ(s) = (s, g(s)). Since E ∘ ϕ ≡ C, we can employ the chain rule to conclude, for each s ∈ Ĩ,

0 = (E ∘ ϕ)'(s) = (∇E)(ϕ(s)) • ϕ'(s),   (5.18)

i.e. the two-dimensional vectors (∇E)(ϕ(s)) and ϕ'(s) are orthogonal with respect to the Euclidean scalar product. As E is an integral, using (5.15), f(ϕ(s)) is another vector orthogonal to (∇E)(ϕ(s)) and, since all vectors in R² orthogonal to (∇E)(ϕ(s)) form a 1-dimensional subspace of R² (recalling (∇E)(ϕ(s)) ≠ 0), there exists γ(s) ∈ R such that

ϕ'(s) = γ(s) f(ϕ(s))   (5.19)

(note f(ϕ(s)) ≠ 0 as s ∈ Ĩ). We can now apply Prop. 5.14, since (5.19) says ϕ is a solution to (5.11), the function γ : Ĩ −→ R, s ↦ γ(s) = ϕ'(s)/f(ϕ(s)), is continuous, and ϕ'(s) = (1, g'(s)) ≠ (0, 0) for each s ∈ Ĩ. Thus, Prop. 5.14 provides a bijective λ : J −→ Ĩ such that ϕ ∘ λ is a solution to y' = f(y).

As we assume \lim_{m→∞} φ(x_m) = η, there exists M ∈ N such that ‖φ(x_m) − η‖ < ǫ for each m ≥ M. Since E(φ(x_m)) = C also holds, (5.17b) implies the existence of a sequence (s_m)_{m∈N} in Ĩ such that φ(x_m) = (s_m, g(s_m)) for each m ≥ M. Then, for each m ≥ M and τ_m := λ^{-1}(s_m), (ϕ ∘ λ)(τ_m) = ϕ(s_m) = φ(x_m). On the other hand, for τ_0 := λ^{-1}(η_1), (ϕ ∘ λ)(τ_0) = ϕ(η_1) = η, showing φ(x_m), η ∈ O(ϕ ∘ λ). Since φ(x_m) ∈ O(φ) as well, Prop. 5.9 implies O(ϕ ∘ λ) ⊆ O(φ), i.e. η ∈ O(φ), which proves (a). In preparation for (b), we also observe that ‖φ(x_m) − η‖ < ǫ for each m ≥ M implies the s_m for m ≥ M all are in some compact interval I_1 with η_1 ∈ I_1, implying the τ_m to be in the compact interval J_1 := λ^{-1}[I_1] with τ_0 ∈ J_1. We will use for (b) that J_1 is bounded.

(b): As we have O(φ) ⊆ E^{-1}{C} according to the choice of C, the assumed compactness of E^{-1}{C} and Prop. 3.24 show φ can only be maximal if it is defined on all of R (since (x, φ(x)) must escape every compact [−m, m] × E^{-1}{C}, m ∈ N, on the left and on the right). Using the compactness of E^{-1}{C} a second time, we obtain the existence of a sequence (x_m)_{m∈N} in R such that \lim_{m→∞} x_m = ∞ and \lim_{m→∞} φ(x_m) = η ∈ E^{-1}{C}. So we see that we are in the situation of (a). Let ψ be the maximal extension of the solution ϕ ∘ λ constructed in the proof of (a). Then we know O(ψ) ∩ O(φ) ≠ ∅ from the proof of (a) and, since ψ and φ both are maximal, Prop. 5.9 implies O(ψ) = O(φ) and, more importantly for us here, there exists ξ ∈ R such that ψ(x) = φ(x + ξ) for each x ∈ R. Let m ≥ M with M from the proof of (a). If ξ ≠ 0, then φ(x_m) = ψ(τ_m) = φ(τ_m + ξ) shows φ is not injective (since the τ_m are bounded, whereas the x_m are unbounded, x_m = τ_m + ξ cannot be true for all m). If ξ = 0, then φ = ψ and φ(x_m) = φ(τ_m). Since the τ_m are bounded, whereas the x_m are unbounded, x_m = τ_m cannot be true for all m, again showing φ is not injective. Since E^{-1}{C} ∩ F = ∅, φ cannot be constant, therefore it must be periodic by Prop. 5.10. □

Example 5.21. Using condition (5.15), i.e. ∇E • f ≡ 0, one readily verifies that the functions

E : R² −→ R,  E(y_1, y_2) := y_1² + y_2²,   (5.20a)
E : R² −→ R,  E(y_1, y_2) := y_1² − y_2²,   (5.20b)

are integrals for (5.8) and (5.9), i.e. for

\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} -y_2 \\ y_1 \end{pmatrix}  and  \begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} y_2 \\ y_1 \end{pmatrix},

respectively, and we recover the respective phase portraits via the respective level sets E(y_1, y_2) = C, C ∈ R.

Example 5.22. Consider the autonomous ODE

\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} = \begin{pmatrix} 2 y_1 y_2 \\ 1 − 2 y_1² \end{pmatrix}.   (5.21)

We claim that

E : R² −→ R,  E(y_1, y_2) := y_1 e^{-(y_1² + y_2²)},   (5.22)

is an integral for (5.21) and intend to use Prop. 5.20 to establish that (5.21) has orbits that are fixed points, orbits that are periodic, and orbits that are neither. To verify E is an integral, one computes, for each (y_1, y_2) ∈ R²,

∇E(y_1, y_2) • (2 y_1 y_2, 1 − 2 y_1²) = \big( e^{-(y_1²+y_2²)} − 2 y_1² e^{-(y_1²+y_2²)}, −2 y_1 y_2 e^{-(y_1²+y_2²)} \big) • (2 y_1 y_2, 1 − 2 y_1²)
= e^{-(y_1²+y_2²)} (1 − 2 y_1², −2 y_1 y_2) • (2 y_1 y_2, 1 − 2 y_1²) = 0.

Clearly, the set of fixed points is

F = \Big\{ \Big( -\frac{1}{\sqrt{2}}, 0 \Big), \Big( \frac{1}{\sqrt{2}}, 0 \Big) \Big\}.

The level set of 0 is E^{-1}{0} = \{ (0, y_2) : y_2 ∈ R \}, i.e. it is the y_2-axis. This is a nonperiodic orbit (actually, the orbit of solutions of the form φ : R −→ R², φ(x) := (0, x + c), c ∈ R). Now consider the level set

E^{-1}{e^{-1}} = \{ (y_1, y_2) : y_1 > 0, y_2² = \ln y_1 − y_1² + 1 \}.

Using g : R^+ −→ R, g(y_1) = \ln y_1 − y_1² + 1, and its derivative, it is not hard to show g has precisely two zeros, namely λ_1 = 1 and 0 < λ_2 < 1, and g ≥ 0 precisely on the compact interval J := [λ_2, 1], implying E^{-1}{e^{-1}} = \{ (y_1, ±\sqrt{g(y_1)}) : y_1 ∈ J \}, showing E^{-1}{e^{-1}} is compact. According to Prop. 5.20(b), E^{-1}{e^{-1}} must consist of one or more periodic orbits.
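The computation showing that (5.22) is an integral for (5.21) can be delegated to SymPy; a minimal sketch (our addition) checking condition (5.15):

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2', real=True)
E = y1*sp.exp(-(y1**2 + y2**2))           # (5.22)
f = sp.Matrix([2*y1*y2, 1 - 2*y1**2])     # right-hand side of (5.21)
gradE = sp.Matrix([sp.diff(E, y1), sp.diff(E, y2)])
print(sp.simplify(gradE.dot(f)))           # 0, i.e. condition (5.15) holds
```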

5.2 Stability at Fixed Points

Given an autonomous ODE with a fixed point p, we will investigate the question under what conditions a solution φ(x) starting out near p will remain near p as x increases or decreases.

To simplify notation, we will restrict ourselves to initial data y(0) = y0, which, in light of Lem. 5.4(b), is not an essential restriction.

Notation 5.23. Let Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n such that

y' = f(y)   (5.23)

admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). Let Y : D_f −→ K^n denote the general solution to (5.23) and define

Y : D_{f,0} −→ K^n,  Y(x, η) := Y(x, 0, η),
D_{f,0} := \{ (x, η) ∈ R × K^n : (x, 0, η) ∈ D_f \}.   (5.24)

Definition 5.24. Let Ω ⊆ K^n, n ∈ N, and f : Ω −→ K^n such that (5.23) admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). Moreover, assume the set of fixed points to be nonempty, F ≠ ∅, and let p ∈ F. The fixed point p is said to be positively (resp. negatively) stable if, and only if, the following conditions (i) and (ii) hold:

(i) There exists r > 0 such that, for each η ∈ Ω with ‖η − p‖ < r, the maximal solution x ↦ Y(x, η) (cf. (5.24)) is defined on (a superset of) R_0^+ (resp. R_0^−).

(ii) For each ǫ > 0, there exists δ > 0 such that, for each η ∈ Ω,

‖η − p‖ < δ  ⟹  ‖Y(x, η) − p‖ < ǫ for each x ≥ 0 (resp. x ≤ 0).   (5.25)

The fixed point p is said to be positively (resp. negatively) asymptotically stable if, and only if, (i) and (ii) hold plus the additional condition

(iii) There exists γ > 0 such that, for each η ∈ Ω,

‖η − p‖ < γ  ⟹  \lim_{x→∞} Y(x, η) = p  (resp. \lim_{x→−∞} Y(x, η) = p).   (5.26)

The norm ‖·‖ on K^n used in (i) – (iii) above is arbitrary. Due to the equivalence of norms on K^n, changing the norm does not change the defined stability properties, even though, in general, it does change the sizes of r, δ, γ.

Remark 5.25. In the situation of Def. 5.24, consider the time-reversed version of (5.23), i.e.

y' = −f(y).   (5.27)

According to Lem. 1.9(b), (5.27) has the general solution

Ỹ : D_{-f,0} −→ K^n,  Ỹ(x, η) := Y(−x, η),
D_{-f,0} := \{ (x, η) ∈ R × K^n : (−x, η) ∈ D_{f,0} \}.   (5.28)

(a) Clearly, for a fixed point p ∈ F, we have the following equivalences:

p is positively stable for (5.23) ⟺ p is negatively stable for (5.27),
p is negatively stable for (5.23) ⟺ p is positively stable for (5.27).

(b) Clearly, for a fixed point p ∈ F, we have the following equivalences:

p is pos. asympt. stable for (5.23) ⟺ p is neg. asympt. stable for (5.27),
p is neg. asympt. stable for (5.23) ⟺ p is pos. asympt. stable for (5.27).

Lemma 5.26. Consider the situation of Def. 5.24 with f : Ω −→ K^n continuous on Ω ⊆ K^n open. Then the fixed point p is positively (resp. negatively) stable if, and only if, for each ǫ > 0, there exists δ > 0 such that, for each η ∈ Ω,

‖η − p‖ < δ  ⟹  ‖Y(x, η) − p‖ < ǫ for each x ∈ I_{(0,η)} ∩ R_0^+ (resp. x ∈ I_{(0,η)} ∩ R_0^−),   (5.29)

where I_{(0,η)} denotes the domain of the maximal solution Y(·, η).

Proof. Clearly, stability in the sense of Def. 5.24 implies (5.29), and it merely remains to show that (5.29) implies Def. 5.24(i). As (5.29) holds, we can consider $\epsilon:=1$ (possibly shrunk such that the corresponding closed ball around $p$ is contained in $\Omega$) and obtain a corresponding $\delta=:r$. Then (5.29) states that, for each $\eta\in\Omega$ with $\|\eta-p\|<r$, the solution $Y(\cdot,\eta)$ remains in a compact subset of $\Omega$ for each $x\in I_{(0,\eta)}\cap\mathbb{R}_0^+$ (resp. $\mathbb{R}_0^-$); thus, by the results of Sec. 3.4 on the extension of maximal solutions, $Y(\cdot,\eta)$ must be defined on (a superset of) $\mathbb{R}_0^+$ (resp. $\mathbb{R}_0^-$), i.e. Def. 5.24(i) holds. $\Box$

Example 5.27. (a) Consider the one-dimensional $\mathbb{R}$-valued ODE
$$y'=y(y-1). \qquad (5.30)$$
The set of fixed points is $\mathcal{F}=\{0,1\}$. Moreover, $Y'(\cdot,\eta)<0$ for $0<\eta<1$ and $Y'(\cdot,\eta)>0$ for $\eta\in\,]-\infty,0[\,\cup\,]1,\infty[$. It follows that, for $p=0$, the positive stability part of (5.29) holds (where, given $\epsilon>0$, one can choose $\delta:=\min\{1,\epsilon\}$). Moreover, for $\eta<0$ and $0<\eta<1$, one has $\lim_{x\to\infty}Y(x,\eta)=0$. Thus, all three conditions of Def. 5.24 are satisfied and $0$ is positively asymptotically stable. Analogously, one sees that $1$ is negatively asymptotically stable.

(b) For the $\mathbb{R}^2$-valued ODE of (5.8), $(0,0)$ is a fixed point that is positively and negatively stable, but neither positively nor negatively asymptotically stable. For the $\mathbb{R}^2$-valued ODE of (5.9), $(0,0)$ is a fixed point that is neither positively nor negatively stable.

(c) Consider the one-dimensional $\mathbb{R}$-valued ODE
$$y'=y^2. \qquad (5.31)$$
The only fixed point is $0$, which is neither positively nor negatively stable. Indeed, not even Def. 5.24(i) is satisfied: One obtains
$$Y:D_{f,0}\longrightarrow\mathbb{R},\quad Y(x,\eta):=\frac{\eta}{1-\eta x},$$
where
$$D_{f,0}=(\mathbb{R}\times\{0\})\ \cup\ \{(x,\eta)\in\mathbb{R}^2:\ \eta>0,\ x\in\,]-\infty,1/\eta[\,\}\ \cup\ \{(x,\eta)\in\mathbb{R}^2:\ \eta<0,\ x\in\,]1/\eta,\infty[\,\},$$
showing every neighborhood of $0$ contains $\eta$ such that $Y(\cdot,\eta)$ is not defined on all of $\mathbb{R}_0^+$ and $\eta$ such that $Y(\cdot,\eta)$ is not defined on all of $\mathbb{R}_0^-$.
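The failure of Def. 5.24(i) in this example is easy to observe numerically. The following sketch (an addition to these notes, assuming SciPy's `solve_ivp` is available) integrates (5.31) for $\eta>0$ up to just before the blow-up time $1/\eta$ and compares with the closed-form solution:

```python
# Numerical illustration (not from the notes) of Example 5.27(c): for eta > 0,
# Y(x, eta) = eta/(1 - eta*x) solves y' = y^2 and blows up at x = 1/eta,
# so the maximal solution is not defined on all of R_0^+.
import numpy as np
from scipy.integrate import solve_ivp

eta = 0.5                                  # initial value y(0) = eta > 0
blow_up = 1.0 / eta                        # closed-form blow-up time

sol = solve_ivp(lambda x, y: y**2, (0.0, blow_up - 1e-3), [eta], rtol=1e-10)
x_end, y_end = sol.t[-1], sol.y[0, -1]
print(f"numerical   y({x_end:.4f}) = {y_end:.1f}")
print(f"closed form y({x_end:.4f}) = {eta/(1 - eta*x_end):.1f}")
```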

Remark 5.28. There exist examples of autonomous ODE that show fixed points can satisfy Def. 5.24(iii) without satisfying Def. 5.24(ii). For example, [Aul04, Ex. 7.4.16] provides the following ODE in polar coordinates (r, ϕ):

$$r'=r(1-r), \qquad (5.32a)$$
$$\varphi'=\frac{1-\cos\varphi}{2}=\sin^2\frac{\varphi}{2}. \qquad (5.32b)$$
Even though it is somewhat tedious, one can show that its fixed point $(1,0)$ satisfies Def. 5.24(iii) without satisfying Def. 5.24(ii) (see Claim 4 of Example K.2 in the Appendix). —

We will now study a method that, in certain cases, allows one to determine the stability properties of a fixed point without having to know the solutions to the ODE. The method is known as Lyapunov's method. Its key ingredient is a test function $V$, known as a Lyapunov function. Once a Lyapunov function is known, stability is often easily tested. The catch, however, is that Lyapunov functions can be hard to find. From the literature, it appears there is no all-purpose definition of a Lyapunov function, as a suitable choice depends on the circumstances.

Definition 5.29. Let $\Omega_0\subseteq\mathbb{R}^n$ be open, $n\in\mathbb{N}$. A function $V:\Omega_0\longrightarrow\mathbb{R}$ is said to be positive (resp. negative) definite at $p\in\Omega_0$ if, and only if, the following conditions (i) and (ii) hold:

(i) $V(y)\geq 0$ (resp. $V(y)\leq 0$) for each $y\in\Omega_0$.

(ii) $V(y)=0$ if, and only if, $y=p$.

Theorem 5.30 (Lyapunov). Consider the situation of Def. 5.24 with $\mathbb{K}=\mathbb{R}$, $\Omega\subseteq\mathbb{R}^n$ open, and $f:\Omega\longrightarrow\mathbb{R}^n$ continuous. Let $\Omega_0$ be open with $p\in\Omega_0\subseteq\Omega\subseteq\mathbb{R}^n$. Assume $V:\Omega_0\longrightarrow\mathbb{R}$ to be continuously differentiable and define
$$\dot V:\Omega_0\longrightarrow\mathbb{R},\quad \dot V(y):=(\nabla V)(y)\bullet f(y)=\sum_{j=1}^n\partial_jV(y)\,f_j(y). \qquad (5.33)$$

If $V$ is positive definite at $p$ and $\dot V\leq 0$ (resp. $\dot V\geq 0$) on $\Omega_0$, then $p$ is positively (resp. negatively) stable. If, in addition, $\dot V$ is negative (resp. positive) definite at $p$, then $p$ is positively (resp. negatively) asymptotically stable.

Proof. The proof is carried out for the case of positive (asymptotic) stability; the proof for the case of negative (asymptotic) stability is then easily obtained by reversing time, i.e. by using Rem. 5.25 together with noting that $\dot V$ changes its sign when $f$ is replaced with $-f$.

Fix your favorite norm $\|\cdot\|$ on $\mathbb{R}^n$. Let $r>0$ be such that $B_r(p)=\{y\in\mathbb{R}^n:\ \|y-p\|\leq r\}\subseteq\Omega_0$ (such an $r>0$ exists, as $\Omega_0$ is open). Define
$$k:\,]0,r]\longrightarrow\mathbb{R}^+,\quad k(\epsilon):=\min\{V(y):\ \|y-p\|=\epsilon\}, \qquad (5.34)$$
where $k$ is well-defined, since the continuous function $V$ assumes its min on compact sets, and $k(\epsilon)>0$ by the positive definiteness of $V$. Given $\epsilon\in\,]0,r]$, since $V(p)=0$, $k(\epsilon)>0$, and $V$ is continuous,
$$\exists_{0<\delta(\epsilon)<\epsilon}\ \forall_{y\in B_{\delta(\epsilon)}(p)}\quad V(y)<k(\epsilon), \qquad (5.35)$$
where we used Not. 3.3 to denote an open ball with center $p$ with respect to $\|\cdot\|$.

We now claim that, for each $\eta\in B_{\delta(\epsilon)}(p)$, the maximal solution $x\mapsto\phi(x):=Y(x,\eta)$ must remain inside $B_\epsilon(p)$ for each $x\geq 0$ in its domain $I_{(0,\eta)}$ (implying $p$ to be positively stable by Lem. 5.26). Seeking a contradiction, assume there exists $\xi\geq 0$ such that $\|\phi(\xi)-p\|\geq\epsilon$, and let
$$s:=\sup\big\{x\geq 0:\ \phi(t)\in B_\epsilon(p)\text{ for each }t\in[0,x]\big\}\leq\xi<\infty. \qquad (5.36)$$
The continuity of $\phi$ then implies $\|\phi(s)-p\|=\epsilon$, i.e.
$$V(\phi(s))\geq k(\epsilon) \qquad (5.37)$$
by the definition of $k(\epsilon)$. On the other hand, by the chain rule, $(V\circ\phi)'(x)=\dot V(\phi(x))$ (cf. (5.16)), such that $\dot V\leq 0$ implies
$$V(\phi(s))=V(\eta)+\int_0^s\dot V(\phi(x))\,\mathrm{d}x\leq V(\eta)\overset{(5.35)}{<}k(\epsilon), \qquad (5.38)$$
in contradiction to (5.37), proving $\phi(x)\in B_\epsilon(p)$ for each $x\in I_{(0,\eta)}\cap\mathbb{R}_0^+$ and the positive stability of $p$.

For the remaining part of the proof, we additionally assume $\dot V$ to be negative definite at $p$, while continuing to use the notation from above. Set $\gamma:=\delta(r)$. We have to show $\lim_{x\to\infty}Y(x,\eta)=p$ for each $\eta\in B_\gamma(p)$, i.e.
$$\forall_{\epsilon\in\,]0,r]}\ \exists_{\xi_\epsilon\geq 0}\ \forall_{x\geq\xi_\epsilon}\quad \|Y(x,\eta)-p\|<\epsilon. \qquad (5.39)$$

So fix $\eta\in B_\gamma(p)$ and, as above, let $\phi(x):=Y(x,\eta)$. Given $\epsilon\in\,]0,r]$, we first claim that there exists $\xi_\epsilon\geq 0$ such that $\phi(\xi_\epsilon)\in B_{\delta(\epsilon)}(p)$, where $\delta(\epsilon)$ is as in the first part of the proof above. Indeed, seeking a contradiction, assume $\|\phi(x)-p\|\geq\delta(\epsilon)$ for all $x\geq 0$, and set
$$\alpha:=\max\big\{\dot V(y):\ \delta(\epsilon)\leq\|y-p\|\leq r\big\}. \qquad (5.40)$$
Then $\alpha<0$, due to the negative definiteness of $\dot V$ at $p$. Moreover, due to the choice of $\gamma$, we have $\delta(\epsilon)\leq\|\phi(x)-p\|\leq r$ for each $x\geq 0$, implying
$$\forall_{x\geq 0}\quad V(\phi(x))=V(\eta)+\int_0^x\dot V(\phi(t))\,\mathrm{d}t\leq V(\eta)+\alpha x, \qquad (5.41)$$
which is the desired contradiction, as $\alpha<0$ implies the right-hand side goes to $-\infty$ for $x\to\infty$. Thus, we know the existence of $\xi_\epsilon$ such that $\eta_\epsilon:=\phi(\xi_\epsilon)\in B_{\delta(\epsilon)}(p)$.

To finish the proof, we recall from the first part of the proof that $\|Y(x,\eta_\epsilon)-p\|<\epsilon$ for each $x\geq 0$. Using Lem. 5.4(a), we obtain
$$\forall_{x\geq 0}\quad \phi(\xi_\epsilon+x)=Y(\xi_\epsilon+x,\xi_\epsilon,\eta_\epsilon)=Y(\xi_\epsilon+x-\xi_\epsilon,\eta_\epsilon)=Y(x,\eta_\epsilon)\in B_\epsilon(p), \qquad (5.42)$$
showing $\|\phi(x)-p\|<\epsilon$ for each $x\geq\xi_\epsilon$, as needed. $\Box$

Example 5.31. Let $k,m\in\mathbb{N}$ and $\alpha,\beta>0$. We claim that $(0,0)$ is a positively asymptotically stable fixed point for each $\mathbb{R}^2$-valued ODE of the form

$$\begin{pmatrix}y_1'\\y_2'\end{pmatrix}=\begin{pmatrix}-y_1^{2k-1}+\alpha y_1y_2^2\\-y_2^{2m-1}-\beta y_1^2y_2\end{pmatrix}. \qquad (5.43)$$
Indeed, $(0,0)$ is clearly a fixed point, and we consider the Lyapunov function
$$V:\mathbb{R}^2\longrightarrow\mathbb{R},\quad V(y_1,y_2):=\frac{y_1^2}{\alpha}+\frac{y_2^2}{\beta}, \qquad (5.44a)$$
which is clearly positive definite at $(0,0)$. Since $\dot V:\mathbb{R}^2\longrightarrow\mathbb{R}$,
$$\dot V(y_1,y_2)=\nabla V(y_1,y_2)\bullet\big(-y_1^{2k-1}+\alpha y_1y_2^2,\ -y_2^{2m-1}-\beta y_1^2y_2\big)=(2y_1/\alpha,\,2y_2/\beta)\bullet\big(-y_1^{2k-1}+\alpha y_1y_2^2,\ -y_2^{2m-1}-\beta y_1^2y_2\big)=-2\big(y_1^{2k}/\alpha+y_2^{2m}/\beta\big), \qquad (5.44b)$$
is clearly negative definite at $(0,0)$, Th. 5.30 proves $(0,0)$ to be a positively asymptotically stable fixed point.
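For readers who wish to double-check the computation (5.44b), here is a short symbolic verification (an addition to these notes, assuming SymPy; the exponents $k,m$ and the constants $\alpha,\beta$ are kept symbolic):

```python
# Sketch (assumes SymPy): verify (5.44b) of Example 5.31 symbolically.
import sympy as sp

y1, y2 = sp.symbols("y1 y2", real=True)
k, m = sp.symbols("k m", integer=True, positive=True)
al, be = sp.symbols("alpha beta", positive=True)

f = sp.Matrix([-y1**(2*k - 1) + al*y1*y2**2,      # right-hand side of (5.43)
               -y2**(2*m - 1) - be*y1**2*y2])
V = y1**2/al + y2**2/be                           # Lyapunov function (5.44a)

Vdot = sp.Matrix([sp.diff(V, y1), sp.diff(V, y2)]).dot(f)
print(sp.simplify(Vdot))    # -2*y1**(2*k)/alpha - 2*y2**(2*m)/beta
```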

Theorem 5.32. Consider the situation of Def. 5.24 with $\mathbb{K}=\mathbb{R}$. Let $\Omega_0$ be open with $p\in\Omega_0\subseteq\Omega\subseteq\mathbb{R}^n$. Assume $V:\Omega_0\longrightarrow\mathbb{R}$ to be continuously differentiable and assume there is an open set $U\subseteq\Omega_0$ such that the following conditions (i) – (iii) are satisfied:

(i) $p\in\partial U$, i.e. $p$ is in the boundary of $U$.

(ii) $V>0$ and $\dot V>0$ (resp. $\dot V<0$) on $U$ (where $\dot V$ is defined as in (5.33)).

(iii) $V(y)=0$ for each $y\in\Omega_0\cap\partial U$.

Then the fixed point $p$ is not positively (resp. negatively) stable.

Proof. We assume $V$ and $\dot V$ are positive, proving $p$ not to be positively stable; the corresponding statement regarding $p$ not being negatively stable is then, once again, easily obtained by reversing time, i.e. by using Rem. 5.25 together with noting that $\dot V$ changes its sign when $f$ is replaced with $-f$.

Seeking a contradiction, assume $p$ to be positively stable. Then there exists $r>0$ such that $B_r(p)=\{y\in\mathbb{R}^n:\ \|y-p\|\leq r\}\subseteq\Omega_0$ and $\eta\in B_r(p)$ implies $Y(x,\eta)$ is defined for each $x\geq 0$. Moreover, positive stability and $p\in\partial U$ also imply the existence of $\eta\in U\cap B_r(p)$ such that $\phi(x):=Y(x,\eta)\in B_r(p)$ for all $x\geq 0$ (note $p\neq\eta$, as $p\in\partial U$). Set
$$s:=\sup\big\{x\geq 0:\ \phi(t)\in U\text{ for each }t\in[0,x]\big\}. \qquad (5.45)$$
If $s<\infty$, then the maximality of $\phi$ implies $\phi(s)$ to be defined. Moreover, $\phi(s)\in\partial U$ by the definition of $s$, and $\phi(s)\in B_r(p)\subseteq\Omega_0$ by the choice of $\eta$. Thus, $\phi(s)\in\Omega_0\cap\partial U$ and $V(\phi(s))=0$. On the other hand, as $V$ and $\dot V$ are positive on $U$, we have
$$V(\phi(s))=V(\eta)+\int_0^s\dot V(\phi(t))\,\mathrm{d}t>V(\eta)>0, \qquad (5.46)$$
which is a contradiction to $V(\phi(s))=0$, implying $s=\infty$, and $\phi(x)\in U$ as well as $V(\phi(x))>V(\eta)>0$ hold for each $x>0$.

To conclude the proof, consider the compact set
$$C:=B_r(p)\cap\overline{U}\cap V^{-1}\big[V(\eta),\infty\big[\ \subseteq\Omega_0. \qquad (5.47)$$
Then the choice of $\eta$ guarantees $\phi(x)\in C$ for all $x\geq 0$. If $y\in C$, then $V(y)\geq V(\eta)>0$. If $y\in\Omega_0\cap\partial U$, then $V(y)=0$, showing $C\cap\partial U=\emptyset$, i.e. $C\subseteq U$ and
$$\alpha:=\min\{\dot V(y):\ y\in C\}>0. \qquad (5.48)$$
Thus,
$$\forall_{x\geq 0}\quad V(\phi(x))=V(\eta)+\int_0^x\dot V(\phi(t))\,\mathrm{d}t\geq V(\eta)+\alpha x. \qquad (5.49)$$
But this means the continuous function $V$ is unbounded on the compact set $C$, and this contradiction proves $p$ is not positively stable. $\Box$

Example 5.33. Let $h_1,h_2:\Omega\longrightarrow\mathbb{R}$ be continuously differentiable functions, defined on some open set $\Omega\subseteq\mathbb{R}^2$ with $(0,0)\in\Omega$ and $h_1(0,0)>0$, $h_2(0,0)>0$. We claim that $(0,0)$ is not a positively stable fixed point for each $\mathbb{R}^2$-valued ODE of the form
$$\begin{pmatrix}y_1'\\y_2'\end{pmatrix}=\begin{pmatrix}y_2\,h_1(y_1,y_2)\\y_1\,h_2(y_1,y_2)\end{pmatrix}. \qquad (5.50)$$

Indeed, $(0,0)$ is clearly a fixed point, and we let $\Omega_0$ be some open neighborhood of $(0,0)$ on which both $h_1$ and $h_2$ are positive (such an $\Omega_0$ exists by the continuity of $h_1,h_2$ and $h_1,h_2$ being positive at $(0,0)$), and consider the Lyapunov function
$$V:\Omega_0\longrightarrow\mathbb{R},\quad V(y_1,y_2):=y_1y_2, \qquad (5.51a)$$
with $\dot V:\Omega_0\longrightarrow\mathbb{R}$,
$$\dot V(y_1,y_2)=\nabla V(y_1,y_2)\bullet\big(y_2h_1(y_1,y_2),\ y_1h_2(y_1,y_2)\big)=(y_2,y_1)\bullet\big(y_2h_1(y_1,y_2),\ y_1h_2(y_1,y_2)\big)=y_2^2\,h_1(y_1,y_2)+y_1^2\,h_2(y_1,y_2)>0 \ \text{ on }\Omega_0\setminus\{(0,0)\}. \qquad (5.51b)$$
Letting $U:=\Omega_0\cap(\mathbb{R}^+\times\mathbb{R}^+)$, one has $(0,0)\in\partial U$, both $V$ and $\dot V$ are positive on $U$, and $V=0$ on $\Omega_0\cap\partial U\subseteq(\{0\}\times\mathbb{R})\cup(\mathbb{R}\times\{0\})$. Thus, Th. 5.32 applies, yielding that $(0,0)$ is not positively stable.

Theorem 5.34. Let $\Omega\subseteq\mathbb{R}^n$ be open, $n\in\mathbb{N}$. Let $F:\Omega\longrightarrow\mathbb{R}$ be $C^2$ and consider
$$y'=-\nabla F(y). \qquad (5.52)$$
If $p\in\Omega$ is an isolated critical point of $F$ (i.e. $\nabla F(p)=0$ and there exists an open set $O$ with $p\in O\subseteq\Omega$ and $\nabla F\neq 0$ on $O\setminus\{p\}$), then $p$ is a fixed point of (5.52) that is positively asymptotically stable, negatively asymptotically stable, or neither positively nor negatively stable, according as $p$ is a local minimum for $F$, a local maximum for $F$, or neither.

Proof. Note that $F$ being $C^2$ implies $\nabla F$ to be $C^1$ and, in particular, locally Lipschitz, such that (5.52) admits unique maximal solutions. Suppose $F$ has a local min at $p$. As $p$ is an isolated critical point, the local min at $p$ must be strict, i.e. there exists an open neighborhood $\Omega_0$ of $p$ such that $F(p)<F(y)$ for each $y\in\Omega_0\setminus\{p\}$. Then the Lyapunov function $V:\Omega_0\longrightarrow\mathbb{R}$, $V(y):=F(y)-F(p)$, is clearly positive definite at $p$ and $\dot V:\Omega_0\longrightarrow\mathbb{R}$,
$$\dot V(y)=\nabla V(y)\bullet\big(-\nabla F(y)\big)=-\nabla F(y)\bullet\nabla F(y)=-\|\nabla F(y)\|_2^2, \qquad (5.53)$$
is clearly negative definite at $p$. Thus, $p$ is a positively asymptotically stable fixed point by Th. 5.30. If $F$ has a local max at $p$, then the proof is conducted analogously, using $V:\Omega_0\longrightarrow\mathbb{R}$, $V(y):=F(p)-F(y)$, or, alternatively, by using time reversion (if $F$ has a local max at $p$, then $-F$ has a local min at $p$, i.e. $p$ is positively asymptotically stable for $y'=\nabla F(y)$, i.e. $p$ is negatively asymptotically stable for $y'=-\nabla F(y)$ by Rem. 5.25(b)).

If $p$ is neither a local min nor max for $F$, then let $\Omega_0:=O$ and $V:\Omega_0\longrightarrow\mathbb{R}$, $V(y):=F(p)-F(y)$, where $O$ was chosen such that $\nabla F\neq 0$ on $O\setminus\{p\}$, i.e. $\dot V:\Omega_0\longrightarrow\mathbb{R}$, $\dot V(y)=\|\nabla F(y)\|_2^2$, is positive definite at $p$. Let $U:=\{y\in\Omega_0:\ F(y)<F(p)\}$. Then $U$ is open by the continuity of $F$, and $p\in\partial U$, as $p$ is neither a local min nor max for $F$. By the continuity of $F$, $F(y)=F(p)$ for each $y\in\Omega_0\cap\partial U$, i.e. $V=0$ on $\Omega_0\cap\partial U$. Thus, Th. 5.32 applies, showing $p$ is not positively stable. Analogously, using $U:=\{y\in\Omega_0:\ F(y)>F(p)\}$ and $V(y):=F(y)-F(p)$ shows $p$ is not negatively stable. $\Box$

Example 5.35. (a) The function $F:\mathbb{R}^n\longrightarrow\mathbb{R}$, $F(y)=\|y\|_2^2$, has an isolated critical point at $0$, which is also a min for $F$. Thus, by Th. 5.34,
$$y'=-\nabla F(y)=(-2y_1,\ldots,-2y_n) \qquad (5.54)$$
has $0$ as a fixed point that is positively asymptotically stable.

(b) The function $F:\mathbb{R}^2\longrightarrow\mathbb{R}$, $F(y)=e^{y_1y_2}$, has an isolated critical point at $0$, which is neither a local min nor a local max for $F$. Thus, by Th. 5.34,
$$y'=-\nabla F(y)=\big(-y_2e^{y_1y_2},\ -y_1e^{y_1y_2}\big) \qquad (5.55)$$
has $0$ as a fixed point that is neither positively nor negatively stable.

5.3 Constant Coefficients

The stability properties of systems $y'=Ay$ of first-order linear ODE with constant coefficients (cf. Sec. 4.6.2) are closely related to the eigenvalues of the matrix $A$. As it turns out, the stability of the origin is essentially determined by the signs of the real parts of the eigenvalues of $A$ (cf. Th. 5.38 below). We start with a preparatory lemma:

Lemma 5.36. Let $n\in\mathbb{N}$ and let $W\in\mathcal{M}(n,\mathbb{K})$ be invertible. Moreover, let $\|\cdot\|$ be some norm on $\mathcal{M}(n,\mathbb{K})$. Then
$$\|\cdot\|_W:\mathcal{M}(n,\mathbb{K})\longrightarrow\mathbb{R}_0^+,\quad \|A\|_W:=\|W^{-1}AW\|, \qquad (5.56)$$
also constitutes a norm on $\mathcal{M}(n,\mathbb{K})$.

Proof. If $A=0$, then $\|A\|_W=\|W^{-1}0W\|=\|0\|=0$. If $\|A\|_W=0$, then $\|W^{-1}AW\|=0$, i.e. $A=W0W^{-1}=0$, showing $\|\cdot\|_W$ is positive definite. Next,
$$\forall_{\lambda\in\mathbb{K}}\quad \|\lambda A\|_W=\|\lambda\,W^{-1}AW\|=|\lambda|\,\|A\|_W,$$
showing $\|\cdot\|_W$ is homogeneous of degree 1. Finally,
$$\forall_{A,B\in\mathcal{M}(n,\mathbb{K})}\quad \|A+B\|_W=\|W^{-1}(A+B)W\|=\|W^{-1}AW+W^{-1}BW\|\leq\|A\|_W+\|B\|_W,$$
showing $\|\cdot\|_W$ satisfies the triangle inequality. $\Box$

Remark and Definition 5.37. Let $n\in\mathbb{N}$, $A\in\mathcal{M}(n,\mathbb{C})$, and let $\lambda\in\mathbb{C}$ be an eigenvalue of $A$.

(a) Clearly, one has

$$\{0\}\subseteq\ker(A-\lambda\operatorname{Id})\subseteq\ker(A-\lambda\operatorname{Id})^2\subseteq\ldots$$
and the inclusion can be strict at most $n$ times. Let
$$r(\lambda):=\min\big\{k\in\mathbb{N}_0:\ \ker(A-\lambda\operatorname{Id})^k=\ker(A-\lambda\operatorname{Id})^{k+1}\big\}.$$
Then
$$\forall_{k\in\mathbb{N}}\quad \ker(A-\lambda\operatorname{Id})^{r(\lambda)}=\ker(A-\lambda\operatorname{Id})^{r(\lambda)+k}:$$
Indeed, otherwise, let $k_0:=\min\{k\in\mathbb{N}:\ \ker(A-\lambda\operatorname{Id})^{r(\lambda)}\subsetneq\ker(A-\lambda\operatorname{Id})^{r(\lambda)+k}\}$. Then there exists $v\in\mathbb{C}^n$ such that $(A-\lambda\operatorname{Id})^{r(\lambda)+k_0}v=0$, but $(A-\lambda\operatorname{Id})^{r(\lambda)+k_0-1}v\neq 0$. However, that means $w:=(A-\lambda\operatorname{Id})^{k_0-1}v\in\ker(A-\lambda\operatorname{Id})^{r(\lambda)+1}$, but $w\notin\ker(A-\lambda\operatorname{Id})^{r(\lambda)}$, in contradiction to the definition of $r(\lambda)$. The space
$$M(\lambda):=\ker(A-\lambda\operatorname{Id})^{r(\lambda)}$$
is called the generalized eigenspace corresponding to the eigenvalue $\lambda$.

(b) Due to $A(A-\lambda\operatorname{Id})=(A-\lambda\operatorname{Id})A$, one has
$$\forall_{k\in\mathbb{N}_0}\quad A\big(\ker(A-\lambda\operatorname{Id})^k\big)\subseteq\ker(A-\lambda\operatorname{Id})^k,$$
i.e. all the kernels (in particular, the generalized eigenspace $M(\lambda)$) are invariant subspaces for $A$.

(c) As already mentioned in Rem. 4.51, the algebraic multiplicity of $\lambda$, denoted $m_a(\lambda)$, is its multiplicity as a zero of the characteristic polynomial $\chi_A(x)=\det(A-x\operatorname{Id})$, and the geometric multiplicity of $\lambda$ is $m_g(\lambda):=\dim\ker(A-\lambda\operatorname{Id})$. We call the eigenvalue $\lambda$ semisimple if, and only if, its algebraic and geometric multiplicities are equal. We then have the equivalence of the following statements (i) – (iv):

(i) $\lambda$ is semisimple.

(ii) $M(\lambda)=\ker(A-\lambda\operatorname{Id})$.

(iii) $A{\upharpoonright}_{M(\lambda)}$ is diagonalizable.

(iv) All the Jordan blocks corresponding to $\lambda$ are trivial, i.e. they all have size 1 (i.e. there are $\dim\ker(A-\lambda\operatorname{Id})$ such blocks).

Indeed, note that $m_a(\lambda)=\dim\ker(A-\lambda\operatorname{Id})^{m_a(\lambda)}$ (e.g., since, if $A$ is in Jordan normal form, then $m_a(\lambda)$ provides the size of the $\lambda$-block and, for $A-\lambda\operatorname{Id}$, this block is canonically nilpotent). This shows the equivalence between (i) and (ii). Moreover, $m_g(\lambda)=m_a(\lambda)$ means $\ker(A-\lambda\operatorname{Id})$ has a basis of $m_a(\lambda)$ eigenvectors $v_1,\ldots,v_{m_a(\lambda)}$ for the eigenvalue $\lambda$. The equivalence of (i),(ii) with (iii) and with (iv) is then given by Th. 4.45 and Th. 4.46, respectively.

Theorem 5.38. Let $n\in\mathbb{N}$ and $A\in\mathcal{M}(n,\mathbb{C})$. Moreover, let $\|\cdot\|$ be some norm on $\mathcal{M}(n,\mathbb{C})$ and let $\lambda_1,\ldots,\lambda_s\in\mathbb{C}$, $1\leq s\leq n$, be the distinct eigenvalues of $A$.

(a) The following statements (i) – (iii) are equivalent:

(i) There exists $K>0$ such that $\|e^{Ax}\|\leq K$ holds for each $x\geq 0$ (resp. $x\leq 0$).

(ii) $\operatorname{Re}\lambda_j\leq 0$ (resp. $\operatorname{Re}\lambda_j\geq 0$) for every $j=1,\ldots,s$ and, if $\operatorname{Re}\lambda_j=0$ occurs, then $\lambda_j$ is a semisimple eigenvalue (i.e. its algebraic and geometric multiplicities are equal).

(iii) The fixed point $0$ of $y'=Ay$ is positively (resp. negatively) stable.

(b) The following statements (i) – (iii) are equivalent:

(i) There exist $K,\alpha>0$ such that $\|e^{Ax}\|\leq Ke^{-\alpha|x|}$ holds for each $x\geq 0$ (resp. $x\leq 0$).

(ii) $\operatorname{Re}\lambda_j<0$ (resp. $\operatorname{Re}\lambda_j>0$) for every $j=1,\ldots,s$.

(iii) The fixed point $0$ of $y'=Ay$ is positively (resp. negatively) asymptotically stable.

Proof. Let $\|\cdot\|_{\max}$ denote the max-norm on $\mathbb{C}^{n^2}\cong\mathcal{M}(n,\mathbb{C})$, i.e.
$$\|(m_{kl})\|_{\max}:=\max\big\{|m_{kl}|:\ k,l\in\{1,\ldots,n\}\big\}$$
(caveat: for $n>1$, this is not the operator norm induced by the max-norm on $\mathbb{C}^n$). Moreover, using Th. 4.46, let $W\in\mathcal{M}(n,\mathbb{C})$ be invertible and such that $B:=W^{-1}AW$ is in Jordan normal form. Then, according to Lem. 5.36, $\|M\|_{\max}^W:=\|W^{-1}MW\|_{\max}$ also defines a norm on $\mathcal{M}(n,\mathbb{C})$. According to Th. 4.47(b),
$$\forall_{x\in\mathbb{R}}\quad \|e^{Ax}\|_{\max}^W=\|W^{-1}e^{Ax}W\|_{\max}=\|e^{W^{-1}AWx}\|_{\max}=\|e^{Bx}\|_{\max}. \qquad (5.57)$$

According to Th. 4.44 and Th. 4.49, the entries $\beta_{kl}(x)$ of $(\beta_{kl}(x)):=e^{Bx}$ enjoy the following property:
$$\forall_{k,l\in\{1,\ldots,n\}}\ \exists_{j\in\{1,\ldots,s\}}\ \exists_{C>0}\ \exists_{m\in\mathbb{N}_0}\ \forall_{x\in\mathbb{R}}\quad |\beta_{kl}(x)|=C\,e^{\operatorname{Re}\lambda_j x}\,|x|^m. \qquad (5.58)$$
Moreover,
$$|\beta_{kl}(x)|=Ce^{\operatorname{Re}\lambda_j x}|x|^m\ \wedge\ \operatorname{Re}\lambda_j<0\ \Rightarrow\ \lim_{x\to\infty}|\beta_{kl}(x)|=0, \qquad (5.59a)$$
$$|\beta_{kl}(x)|=Ce^{\operatorname{Re}\lambda_j x}|x|^m\ \wedge\ \operatorname{Re}\lambda_j>0\ \Rightarrow\ \lim_{x\to\infty}|\beta_{kl}(x)|=\infty, \qquad (5.59b)$$
$$|\beta_{kl}(x)|=Ce^{\operatorname{Re}\lambda_j x}|x|^m\ \wedge\ \operatorname{Re}\lambda_j=0\ \wedge\ m=0\ \Rightarrow\ |\beta_{kl}|\equiv C, \qquad (5.59c)$$
$$|\beta_{kl}(x)|=Ce^{\operatorname{Re}\lambda_j x}|x|^m\ \wedge\ \operatorname{Re}\lambda_j=0\ \wedge\ m>0\ \Rightarrow\ \lim_{x\to\infty}|\beta_{kl}(x)|=\infty. \qquad (5.59d)$$

(a): We start with the equivalence between (i) and (ii): Suppose $\operatorname{Re}\lambda_j\leq 0$ for every $j=1,\ldots,s$ and, if $\operatorname{Re}\lambda_j=0$ occurs, then $\lambda_j$ is a semisimple eigenvalue. Then, using Rem. and Def. 5.37(c) and (5.58), we are either in situation (5.59a) or in situation (5.59c). Thus, there exists $K_0>0$ such that $|\beta_{kl}(x)|\leq K_0$ for each $x\geq 0$ and each $k,l=1,\ldots,n$. Then there exists $K_1>0$ such that
$$\forall_{x\geq 0}\quad \|e^{Ax}\|\leq K_1\,\|e^{Ax}\|_{\max}^W=K_1\,\|e^{Bx}\|_{\max}\leq K_1K_0, \qquad (5.60)$$
showing (i) holds with $K:=K_1K_0$. Conversely, if there is $j\in\{1,\ldots,s\}$ such that $\operatorname{Re}\lambda_j>0$, then there is $\beta_{kl}$ such that (5.59b) occurs; if there is $j\in\{1,\ldots,s\}$ such that $\operatorname{Re}\lambda_j=0$ and $\lambda_j$ is not semisimple, then, using Rem. and Def. 5.37(c), there is $\beta_{kl}$ such that (5.59d) occurs. In both cases,
$$\lim_{x\to\infty}\|e^{Ax}\|=\lim_{x\to\infty}\|e^{Ax}\|_{\max}^W=\lim_{x\to\infty}\|e^{Bx}\|_{\max}=\infty, \qquad (5.61)$$
i.e. the corresponding statement of (i) cannot be true. The remaining case is handled via time reversion: $\|e^{Ax}\|\leq K$ holds for each $x\leq 0$ if, and only if, $\|e^{-Ax}\|\leq K$ holds for each $x\geq 0$, which holds if, and only if, $\operatorname{Re}(-\lambda_j)\leq 0$ for every $j=1,\ldots,s$ with $-\lambda_j$ semisimple for $\operatorname{Re}(-\lambda_j)=0$, which is equivalent to $\operatorname{Re}\lambda_j\geq 0$ for every $j=1,\ldots,s$ with $\lambda_j$ semisimple for $\operatorname{Re}\lambda_j=0$.

We proceed to the equivalence between (i) and (iii): Fix some arbitrary norm $\|\cdot\|$ on $\mathbb{C}^n$, and let $\|\cdot\|_{\mathrm{op}}$ denote the induced operator norm on $\mathcal{M}(n,\mathbb{C})$. Let $C_1,C_2>0$ be such that $\|M\|_{\mathrm{op}}\leq C_1\|M\|$ and $\|M\|\leq C_2\|M\|_{\mathrm{op}}$ for each $M\in\mathcal{M}(n,\mathbb{C})$. Suppose there exists $K>0$ such that $\|e^{Ax}\|\leq K$ holds for each $x\geq 0$. Given $\epsilon>0$, choose $\delta:=\epsilon/(C_1K)$. Then
$$\forall_{x\geq 0}\ \forall_{\|\eta\|<\delta}\quad \|Y(x,\eta)-0\|=\|e^{Ax}\eta\|\leq\|e^{Ax}\|_{\mathrm{op}}\,\|\eta\|<C_1K\delta=\epsilon, \qquad (5.62)$$
showing $0$ to be positively stable. Conversely, suppose $0$ is positively stable. Then there exists $\delta>0$ such that $\|Y(x,\eta)\|=\|e^{Ax}\eta\|<1$ for each $\eta\in B_\delta(0)$ and each $x\geq 0$. Thus,
$$\forall_{x\geq 0}\quad \|e^{Ax}\|_{\mathrm{op}}\overset{\text{(G.1)}}{=}\sup\big\{\|e^{Ax}\eta\|:\ \eta\in\mathbb{C}^n,\ \|\eta\|=1\big\}=\frac{2}{\delta}\,\sup\big\{\|e^{Ax}\eta\|:\ \eta\in\mathbb{C}^n,\ \|\eta\|=\delta/2\big\}\leq\frac{2}{\delta}, \qquad (5.63)$$
showing (i) holds with $K:=2C_2/\delta$. The remaining case is handled via time reversion: $\|e^{Ax}\|\leq K$ holds for each $x\leq 0$ if, and only if, $\|e^{-Ax}\|\leq K$ holds for each $x\geq 0$, which holds if, and only if, $0$ is positively stable for $y'=-Ay$, which, by Rem. 5.25(a), holds if, and only if, $0$ is negatively stable for $y'=Ay$.

(b): As in (a), we start with the equivalence between (i) and (ii): Suppose $\operatorname{Re}\lambda_j<0$ for every $j=1,\ldots,s$. We first show, using (5.58),
$$\forall_{k,l=1,\ldots,n}\ \exists_{K_{kl},\alpha_{kl}>0}\ \forall_{x\geq 0}\quad |\beta_{kl}(x)|\leq K_{kl}\,e^{-\alpha_{kl}x}: \qquad (5.64)$$
According to (5.58),
$$\exists_{C_{kl}>0}\ \forall_{x\geq 0}\quad |\beta_{kl}(x)|=C_{kl}\,e^{\operatorname{Re}\lambda_j x/2}\,e^{\operatorname{Re}\lambda_j x/2}\,|x|^m.$$
Since $\operatorname{Re}\lambda_j<0$, one has $\lim_{x\to\infty}e^{\operatorname{Re}\lambda_j x/2}|x|^m=0$, i.e. $e^{\operatorname{Re}\lambda_j x/2}|x|^m$ is uniformly bounded on $[0,\infty[$ by some $M_{kl}>0$. Thus, (5.64) holds with $K_{kl}:=C_{kl}M_{kl}$ and $\alpha_{kl}:=-\operatorname{Re}\lambda_j/2$. In consequence, if $K_1$ is chosen as in (5.60), then $\|e^{Ax}\|\leq Ke^{-\alpha|x|}$ for each $x\geq 0$ holds with $K:=K_1\max\{K_{kl}:\ k,l=1,\ldots,n\}$ and $\alpha:=\min\{\alpha_{kl}:\ k,l=1,\ldots,n\}$. Conversely, if there is $j\in\{1,\ldots,s\}$ such that $\operatorname{Re}\lambda_j\geq 0$, then there is $\beta_{kl}$ such that (5.59b) or (5.59c) or (5.59d) occurs. In each case, $\|e^{Ax}\|\not\to 0$ for $x\to\infty$, since
$$\lim_{x\to\infty}\|e^{Ax}\|_{\max}^W=\lim_{x\to\infty}\|e^{Bx}\|_{\max}\in\ ]0,\infty], \qquad (5.65)$$
i.e. the corresponding statement of (i) cannot be true. The remaining case is handled via time reversion: $\|e^{Ax}\|\leq Ke^{-\alpha|x|}$ holds for each $x\leq 0$ if, and only if, $\|e^{-Ax}\|\leq Ke^{-\alpha|x|}$ holds for each $x\geq 0$, which holds if, and only if, $\operatorname{Re}(-\lambda_j)<0$ for every $j=1,\ldots,s$, which is equivalent to $\operatorname{Re}\lambda_j>0$ for every $j=1,\ldots,s$.

It remains to consider the equivalence between (i) and (iii): Let $\|\cdot\|_{\mathrm{op}}$ and $C_1,C_2>0$ be as in the proof of the equivalence between (i) and (iii) in (a). Suppose there exist $K,\alpha>0$ such that $\|e^{Ax}\|\leq Ke^{-\alpha|x|}$ holds for each $x\geq 0$. Since $\|e^{Ax}\|\leq Ke^{-\alpha|x|}\leq K$ for each $x\geq 0$, $0$ is positively stable by (a). Moreover,
$$\forall_{\eta\in\mathbb{C}^n}\ \forall_{x\geq 0}\quad \|Y(x,\eta)\|=\|e^{Ax}\eta\|\leq\|e^{Ax}\|_{\mathrm{op}}\,\|\eta\|\leq C_1Ke^{-\alpha|x|}\,\|\eta\|\to 0 \ \text{ for }x\to\infty, \qquad (5.66)$$
showing $0$ to be positively asymptotically stable. For the converse, we will actually show (iii) implies (ii). If $0$ is positively asymptotically stable, then, in particular, it is positively stable, such that (ii) of (a) must hold. It merely remains to exclude the possibility of a semisimple eigenvalue $\lambda$ with $\operatorname{Re}\lambda=0$. If there were a semisimple eigenvalue $\lambda$ with $\operatorname{Re}\lambda=0$, then $e^{Bx}$ had a Jordan block of size 1 with entry $e^{\lambda x}$, i.e. $\beta_{kk}(x)=e^{\lambda x}$ for some $k\in\{1,\ldots,n\}$. Let $e_k$ be the corresponding standard unit vector of $\mathbb{C}^n$ (all entries $0$, except the $k$th entry, which is $1$). Then, for $\eta:=We_k$,
$$\forall_{\epsilon\in\mathbb{R}^+}\ \forall_{x\in\mathbb{R}}\quad \|W^{-1}e^{Ax}(\epsilon\eta)\|=\epsilon\,\|W^{-1}e^{Ax}We_k\|=\epsilon\,\|e^{Bx}e_k\|=\epsilon\,\|e^{\lambda x}e_k\|=\epsilon\,|e^{\lambda x}|\,\|e_k\|=\epsilon\cdot 1\cdot\|e_k\|>0, \qquad (5.67)$$
showing $0$ were not positively asymptotically stable (e.g., since $y\mapsto\|W^{-1}y\|$ defines a norm on $\mathbb{C}^n$). The remaining case is, once again, handled via time reversion: $\|e^{Ax}\|\leq Ke^{-\alpha|x|}$ holds for each $x\leq 0$ if, and only if, $\|e^{-Ax}\|\leq Ke^{-\alpha|x|}$ holds for each $x\geq 0$, which holds if, and only if, $0$ is positively asymptotically stable for $y'=-Ay$, which, by Rem. 5.25(b), holds if, and only if, $0$ is negatively asymptotically stable for $y'=Ay$. $\Box$

Example 5.39. (a) The matrix
$$A=\begin{pmatrix}2&1\\1&2\end{pmatrix}$$
has eigenvalues $1$ and $3$ and, thus, the fixed point $0$ of $y'=Ay$ is negatively asymptotically stable, but not positively stable.

(b) The matrix
$$A=\begin{pmatrix}0&1\\0&0\end{pmatrix}$$
has eigenvalue $0$, which is not semisimple, i.e. the fixed point $0$ of $y'=Ay$ is neither negatively nor positively stable.

(c) The matrix
$$A=\begin{pmatrix}i&1&2&2-3i\\0&-i&5&-17\\0&0&-1+3i&0\\0&0&0&-5\end{pmatrix}$$
has simple eigenvalues $i$, $-i$, $-1+3i$, $-5$, i.e. the fixed point $0$ of $y'=Ay$ is positively stable (since all real parts are $\leq 0$), but neither negatively stable nor positively asymptotically stable (since there are eigenvalues with $0$ real part).
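The eigenvalue criteria of Th. 5.38 are straightforward to apply numerically. The following sketch (an addition to these notes; the function name `classify` is ours, and it only treats the generic case in which every eigenvalue on the imaginary axis is simple, hence semisimple) classifies the matrices of Example 5.39(a) and (c):

```python
# Hedged sketch (not from the notes): classify stability of 0 for y' = Ay
# via the real parts of the eigenvalues of A, cf. Th. 5.38.
import numpy as np

def classify(A, tol=1e-12):
    re = np.linalg.eigvals(A).real
    if np.all(re < -tol):
        return "0 positively asymptotically stable"
    if np.all(re <= tol):
        return "0 positively stable (if imaginary-axis eigenvalues semisimple)"
    return "0 not positively stable"

A_a = np.array([[2.0, 1.0], [1.0, 2.0]])
A_c = np.array([[1j, 1, 2, 2 - 3j],
                [0, -1j, 5, -17],
                [0, 0, -1 + 3j, 0],
                [0, 0, 0, -5]])
print(classify(A_a))   # eigenvalues 1, 3 -> not positively stable
print(classify(A_c))   # eigenvalues i, -i, -1+3i, -5 -> positively stable
```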

5.4 Linearization

If the right-hand side $f$ of an autonomous ODE is differentiable and $p$ is a fixed point (i.e. $f(p)=0$), then one can sometimes use its linearization, i.e. its derivative $A:=Df(p)$ (which is an $n\times n$ matrix), to infer the stability properties of $y'=f(y)$ at $p$ from those of $y'=Ay$ at $0$ (see Th. 5.44 below). We start with some preparatory results:

Lemma 5.40. Let $n\in\mathbb{N}$ and consider the bilinear function
$$\beta:\mathbb{R}^n\times\mathbb{R}^n\longrightarrow\mathbb{R},\quad \beta(y,z):=y\bullet(Bz)=y^tBz=\sum_{k,l=1}^n y_k\,b_{kl}\,z_l, \qquad (5.68)$$
where $B=(b_{kl})\in\mathcal{M}(n,\mathbb{R})$, "$\bullet$" denotes the Euclidean scalar product, and elements of $\mathbb{R}^n$ are interpreted as column vectors when involved in matrix multiplications.

(a) The function $\beta$ is differentiable (it is even a polynomial, $\deg(\beta)\leq 2$, and, thus, $C^\infty$) and
$$\forall_{k\in\{1,\ldots,n\}}\ \forall_{(y,z)\in\mathbb{R}^n\times\mathbb{R}^n}\quad \partial_{y_k}\beta(y,z)=\sum_{l=1}^n b_{kl}\,z_l=(Bz)_k, \qquad (5.69a)$$
$$\forall_{l\in\{1,\ldots,n\}}\ \forall_{(y,z)\in\mathbb{R}^n\times\mathbb{R}^n}\quad \partial_{z_l}\beta(y,z)=\sum_{k=1}^n y_k\,b_{kl}=(y^tB)_l, \qquad (5.69b)$$
$$\forall_{(y,z)\in\mathbb{R}^n\times\mathbb{R}^n}\quad D\beta(y,z)=\nabla\beta(y,z):\mathbb{R}^n\times\mathbb{R}^n\longrightarrow\mathbb{R},\quad \nabla\beta(y,z)(u,v)=\beta(y,v)+\beta(u,z)=y^tBv+u^tBz. \qquad (5.69c)$$

(b) The function
$$V:\mathbb{R}^n\longrightarrow\mathbb{R},\quad V(y):=\beta(y,y)=y\bullet(By)=y^tBy=\sum_{k,l=1}^n y_k\,b_{kl}\,y_l, \qquad (5.70)$$
is differentiable (it is also even a polynomial, $\deg(V)\leq 2$, and, thus, $C^\infty$) and
$$\forall_{k\in\{1,\ldots,n\}}\ \forall_{y\in\mathbb{R}^n}\quad \partial_{y_k}V(y)=\sum_{l=1}^n y_l(b_{kl}+b_{lk})=\big(y^t(B+B^t)\big)_k, \qquad (5.71a)$$
$$\forall_{y\in\mathbb{R}^n}\quad DV(y)=\nabla V(y):\mathbb{R}^n\longrightarrow\mathbb{R},\quad \nabla V(y)(u)=\beta(y,u)+\beta(u,y)=y^tBu+u^tBy=y^t(B+B^t)u. \qquad (5.71b)$$

Proof. (a): (5.69a) and (5.69b) are immediate from (5.68) and, then, imply (5.69c).

(b): (5.71a) is immediate from (5.70) and, then, implies (5.71b). $\Box$

Lemma 5.41. Let $A,B\in\mathcal{M}(n,\mathbb{R})$, $n\in\mathbb{N}$, and let $V:\mathbb{R}^n\longrightarrow\mathbb{R}$ be as in (5.70). Then
$$\forall_{y\in\mathbb{R}^n}\quad \nabla V(y)\bullet(Ay)=y^t(BA+A^tB)y. \qquad (5.72)$$

Proof. We note

$$\forall_{y\in\mathbb{R}^n}\quad (y^tB^t)\bullet(Ay)=y^tB^tAy=\big(y^tB^tAy\big)^t=y^tA^tBy \qquad (5.73)$$
and, thus, obtain
$$\forall_{y\in\mathbb{R}^n}\quad \nabla V(y)\bullet(Ay)\overset{(5.71b)}{=}y^t(B+B^t)(Ay)\overset{(5.73)}{=}y^t(BA+A^tB)y, \qquad (5.74)$$
proving (5.72). $\Box$

Definition 5.42. A matrix $B\in\mathcal{M}(n,\mathbb{R})$, $n\in\mathbb{N}$, is called positive definite if, and only if, the function $V$ of (5.70) is positive definite at $p=0$ in the sense of Def. 5.29.

Proposition 5.43. Let $A\in\mathcal{M}(n,\mathbb{R})$, $n\in\mathbb{N}$. Then the following statements (i) – (iii) are equivalent:

(i) There exist positive definite matrices $B,C\in\mathcal{M}(n,\mathbb{R})$, satisfying
$$BA+A^tB=-C. \qquad (5.75)$$

(ii) $\operatorname{Re}\lambda<0$ holds for each eigenvalue $\lambda\in\mathbb{C}$ of $A$.

(iii) For each given positive definite (symmetric) $C\in\mathcal{M}(n,\mathbb{R})$, there exists a positive definite (symmetric) $B\in\mathcal{M}(n,\mathbb{R})$, satisfying (5.75).

Proof. (iii) immediately implies (i) (e.g. by applying (iii) with $C:=\operatorname{Id}$). For the proof that (i) implies (ii), let $B,C\in\mathcal{M}(n,\mathbb{R})$ be positive definite matrices, satisfying (5.75). By Th. 5.38(b), it suffices to show $0$ is a positively asymptotically stable fixed point for $y'=Ay$. To this end, we apply Th. 5.30, using $V:\mathbb{R}^n\longrightarrow\mathbb{R}$ of

(5.70) as the Lyapunov function. Then, by Def. 5.42, $B$ being positive definite means $V$ is positive definite at $0$. Since
$$\dot V:\mathbb{R}^n\longrightarrow\mathbb{R},\quad \dot V(y)=\nabla V(y)\bullet(Ay)\overset{(5.72)}{=}y^t(BA+A^tB)y\overset{(5.75)}{=}-y^tCy, \qquad (5.76)$$
and $C$ is positive definite, $\dot V$ is negative definite at $0$, i.e. Th. 5.30 yields $0$ to be a positively asymptotically stable fixed point for $y'=Ay$, as desired.

It remains to show that (ii) implies (iii). If all eigenvalues of $A$ have negative real part, then, as $A$ and $A^t$ have the same eigenvalues, all eigenvalues of $A^t$ have negative real part as well. Thus, according to Th. 5.38(b),
$$\exists_{K,\alpha>0}\ \forall_{x\geq 0}\quad \Big(\|e^{Ax}\|_{\max}\leq Ke^{-\alpha x}\ \wedge\ \|e^{A^tx}\|_{\max}\leq Ke^{-\alpha x}\Big), \qquad (5.77)$$
where we have chosen the norm in (5.77) to mean the max-norm on $\mathbb{R}^{n^2}$ (note that $e^{Ax}$ is real if $A$ is real, e.g. due to the series representation (4.73)). Given $C\in\mathcal{M}(n,\mathbb{R})$, define
$$B:=\int_0^\infty e^{A^tx}\,C\,e^{Ax}\,\mathrm{d}x. \qquad (5.78)$$
To verify that $B\in\mathcal{M}(n,\mathbb{R})$ is well-defined, note that each entry of the integrand matrix of (5.78) constitutes an integrable function on $[0,\infty[$: Indeed,
$$\exists_{M>0}\ \forall_{x\geq 0}\quad \|e^{A^tx}Ce^{Ax}\|_{\max}\leq M\,\|e^{A^tx}\|_{\max}\,\|C\|_{\max}\,\|e^{Ax}\|_{\max}\leq M\,\|C\|_{\max}\,K^2\,e^{-2\alpha x}, \qquad (5.79)$$
which is integrable on $[0,\infty[$. Next, we compute
$$-C\overset{(5.79)}{=}\lim_{x\to\infty}\big(e^{A^tx}Ce^{Ax}\big)-C=\lim_{x\to\infty}\int_0^x\partial_s\big(e^{A^ts}Ce^{As}\big)\,\mathrm{d}s=\int_0^\infty\partial_s\big(e^{A^ts}Ce^{As}\big)\,\mathrm{d}s\overset{\text{(I.3)}}{=}\int_0^\infty\big(A^te^{A^ts}Ce^{As}+e^{A^ts}Ce^{As}A\big)\,\mathrm{d}s\overset{\text{(I.5),(I.6)}}{=}A^t\int_0^\infty e^{A^ts}Ce^{As}\,\mathrm{d}s+\Big(\int_0^\infty e^{A^ts}Ce^{As}\,\mathrm{d}s\Big)A\overset{(5.78)}{=}A^tB+BA, \qquad (5.80)$$
showing (5.75) is satisfied. If $C$ is positive definite and $0\neq y\in\mathbb{R}^n$, then $y^tCy>0$, implying
$$y^tBy=y^t\Big(\int_0^\infty e^{A^tx}Ce^{Ax}\,\mathrm{d}x\Big)y\overset{\text{(I.5),(I.6)}}{=}\int_0^\infty y^te^{A^tx}Ce^{Ax}y\,\mathrm{d}x\overset{\text{Prop. 4.40(c)}}{=}\int_0^\infty\big(e^{Ax}y\big)^t\,C\,e^{Ax}y\,\mathrm{d}x>0, \qquad (5.81)$$
showing $B$ is positive definite as well. Finally, if $C$ is symmetric, then
$$B^t=\Big(\int_0^\infty e^{A^tx}Ce^{Ax}\,\mathrm{d}x\Big)^t\overset{\text{Prop. 4.40(c)}}{=}\int_0^\infty e^{A^tx}C^te^{Ax}\,\mathrm{d}x=B, \qquad (5.82)$$
showing $B$ is symmetric as well. $\Box$
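The construction (5.78) is also how one can obtain $B$ in practice, although numerical libraries use direct methods rather than the integral. As an illustration (an addition to these notes), SciPy's `solve_continuous_lyapunov(M, Q)` solves $MX+XM^H=Q$, so passing $M=A^t$ and $Q=-C$ yields a $B$ with $A^tB+BA=-C$, as in (5.75):

```python
# Sketch (not the notes' method): solve the Lyapunov equation (5.75) for
# C = Id and a Hurwitz A, then check that B is symmetric positive definite.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])          # eigenvalues -1, -3: all Re < 0
C = np.eye(2)

B = solve_continuous_lyapunov(A.T, -C)       # A^t B + B A = -C
print(np.allclose(B @ A + A.T @ B, -C))      # True: (5.75) holds
print(np.linalg.eigvalsh((B + B.T) / 2))     # positive eigenvalues: B pos. def.
```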

Theorem 5.44. Let $\Omega\subseteq\mathbb{R}^n$ be open, $n\in\mathbb{N}$, and let $f:\Omega\longrightarrow\mathbb{R}^n$ be continuously differentiable. Let $p\in\Omega$ be a fixed point (i.e. $f(p)=0$) and let $A:=Df(p)\in\mathcal{M}(n,\mathbb{R})$ be the derivative of $f$ at $p$. If all eigenvalues of $A$ have negative (resp. positive) real parts, then $p$ is a positively (resp. negatively) asymptotically stable fixed point for $y'=f(y)$.

Proof. Let all eigenvalues of $A$ have negative real parts. We first consider the special case $p=0$, i.e. $A=Df(0)$. By the equivalence between (ii) and (iii) of Prop. 5.43, we can choose $C:=\operatorname{Id}$ in (iii) to obtain the existence of a positive definite symmetric matrix $B\in\mathcal{M}(n,\mathbb{R})$, satisfying
$$BA+A^tB=-\operatorname{Id}. \qquad (5.83)$$
The idea is now to apply the Lyapunov Th. 5.30 with $V$ of (5.70), i.e.
$$V:\Omega\longrightarrow\mathbb{R},\quad V(y):=y\bullet(By)=y^tBy=\sum_{k,l=1}^n y_k\,b_{kl}\,y_l. \qquad (5.84)$$
We already know $V$ to be continuously differentiable and positive definite. We will conclude the proof of $0$ being positively asymptotically stable by showing there exists $\delta>0$ such that
$$\dot V:B_\delta(0)\longrightarrow\mathbb{R},\quad \dot V(y)=(\nabla V)(y)\bullet f(y), \qquad (5.85)$$
is negative definite at $0$, where we take $B_\delta(0)$ with respect to the 2-norm $\|\cdot\|_2$ on $\mathbb{R}^n$. The differentiability of $f$ at $0$ implies that (cf. [Phi16b, Lem. 4.20])
$$r:\Omega\longrightarrow\mathbb{R}^n,\quad r(y):=f(y)-Ay, \qquad (5.86)$$
satisfies
$$\lim_{y\to 0}\frac{\|r(y)\|_2}{\|y\|_2}=0. \qquad (5.87)$$
Thus, we compute, for each $y\in\Omega$,
$$\dot V(y)=(\nabla V)(y)\bullet f(y)\overset{(5.71b),(5.86)}{=}(\nabla V)(y)\bullet(Ay)+y^t(B+B^t)\,r(y)\overset{(5.72),\,B=B^t}{=}y^t(BA+A^tB)y+2y^tBr(y)\overset{(5.83)}{=}-\|y\|_2^2+2\,y\bullet Br(y). \qquad (5.88)$$

We can estimate the second summand via the Cauchy-Schwarz inequality to obtain

$$y\bullet Br(y)\leq\|y\|_2\,\|Br(y)\|_2\leq\|y\|_2\,\|B\|\,\|r(y)\|_2, \qquad (5.89)$$
and, thus, using (5.87),
$$\lim_{y\to 0}\frac{y\bullet Br(y)}{\|y\|_2^2}=0. \qquad (5.90)$$
Now choose $\delta>0$ such that $B_\delta(0)\subseteq\Omega$ and such that
$$\forall_{y\in B_\delta(0)}\quad \frac{2\,y\bullet Br(y)}{\|y\|_2^2}<\frac{1}{2}. \qquad (5.91)$$
Then, for each $0\neq y\in B_\delta(0)$,
$$\dot V(y)=-\|y\|_2^2+2\,y\bullet Br(y)\overset{(5.91)}{<}-\|y\|_2^2+\frac{\|y\|_2^2}{2}=-\frac{\|y\|_2^2}{2}<0, \qquad (5.92)$$
showing $\dot V$ to be negative definite at $0$, and $0$ to be positively asymptotically stable.

If $p\neq 0$, then consider the ODE $y'=g(y):=f(y+p)$, $g:(\Omega-p)\longrightarrow\mathbb{R}^n$. Then $0$ is a fixed point for $y'=g(y)$ and $Dg(0)=Df(p)=A$, i.e. $0$ is positively asymptotically stable for $y'=g(y)$. But, since $\psi$ is a solution to $y'=g(y)$ if, and only if, $\phi=\psi+p$ is a solution to $y'=f(y)$, $p$ must be positively asymptotically stable for $y'=f(y)$.

The remaining case that all eigenvalues of $A$ have positive real parts is now treated via time reversion: If all eigenvalues of $A$ have positive real parts, then all eigenvalues of $-A=D(-f)(p)$ have negative real parts, i.e. $p$ is positively asymptotically stable for $y'=-f(y)$, i.e., by Rem. 5.25(b), $p$ is negatively asymptotically stable for $y'=f(y)$. $\Box$

Caveat 5.45. The following example shows that the converse of Th. 5.44 does not hold: A fixed point $p$ can be positively (resp. negatively) asymptotically stable without $A:=Df(p)$ having only eigenvalues with negative (resp. positive) real parts. The same example shows that, in general, one cannot infer anything regarding the stability of the fixed point $p$ if $A:=Df(p)$ is merely stable, but not asymptotically stable: Consider
$$f:\mathbb{R}^2\longrightarrow\mathbb{R}^2,\quad f(y_1,y_2):=\big(y_2+\mu y_1^3,\ -y_1+\mu y_2^3\big),\quad \mu\in\mathbb{R}. \qquad (5.93)$$

Then, independently of $\mu$, $(0,0)$ is a fixed point and
$$Df(0,0)=\begin{pmatrix}0&1\\-1&0\end{pmatrix} \qquad (5.94)$$
with complex eigenvalues $i$ and $-i$. Thus, the linearized system is positively and negatively stable, but not asymptotically stable, still independently of $\mu$. However, we claim that $(0,0)$ is a positively asymptotically stable fixed point for $y'=f(y)$ if $\mu<0$ and a positively asymptotically stable fixed point for $y'=-f(y)$ if $\mu>0$. Indeed, this can be seen by using the Lyapunov function $V:\mathbb{R}^2\longrightarrow\mathbb{R}$, $V(y_1,y_2)=y_1^2+y_2^2$, which has $\nabla V(y_1,y_2)=(2y_1,2y_2)$ and
$$\dot V(y_1,y_2)=\nabla V(y_1,y_2)\bullet f(y_1,y_2)=2\mu\,(y_1^4+y_2^4). \qquad (5.95)$$
Thus, $V$ is positive definite at $(0,0)$, and $\dot V$ is negative definite at $(0,0)$ for $\mu<0$ and positive definite at $(0,0)$ for $\mu>0$.

Example 5.46. Consider $(x,y,z)'=f(x,y,z)$ with
$$f:\mathbb{R}^3\longrightarrow\mathbb{R}^3,\quad f(x,y,z)=\big(-x\cos y,\ -ye^z,\ x^2-2z\big). \qquad (5.96)$$
The derivative is
$$Df:\mathbb{R}^3\longrightarrow\mathcal{M}(3,\mathbb{R}),\quad Df(x,y,z)=\begin{pmatrix}-\cos y&x\sin y&0\\0&-e^z&-ye^z\\2x&0&-2\end{pmatrix}. \qquad (5.97)$$
Clearly, $(0,0,0)$ is a fixed point and $Df(0,0,0)$ has eigenvalues $-1$ and $-2$. Thus, $(0,0,0)$ is a positively asymptotically stable fixed point for $(x,y,z)'=f(x,y,z)$ by Th. 5.44.
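As a quick numerical companion to Example 5.46 (an addition to these notes), one can evaluate (5.97) at the origin and confirm that all eigenvalues of $Df(0,0,0)$ have negative real part, so that Th. 5.44 applies:

```python
# Sketch (not from the notes): eigenvalues of the linearization (5.97) at 0.
import numpy as np

def Df(x, y, z):
    return np.array([[-np.cos(y), x*np.sin(y), 0.0],
                     [0.0, -np.exp(z), -y*np.exp(z)],
                     [2*x, 0.0, -2.0]])

eig = np.linalg.eigvals(Df(0.0, 0.0, 0.0))
print(eig)                      # [-1., -1., -2.]
print(np.all(eig.real < 0))     # True: Th. 5.44 yields asymptotic stability
```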

5.5 Limit Sets

Limit sets are important when studying the asymptotic behavior of solutions, i.e. the behavior of $\phi(x)$ for $x\to\infty$ and for $x\to-\infty$. If a solution has a limit, then its corresponding limit set consists of precisely one point. In general, the limit set of a solution is defined to consist of all points that occur as limits of sequences taken along the solution's orbit (of course, limit sets can also be empty):

Definition 5.47. Let $\Omega\subseteq\mathbb{K}^n$, $n\in\mathbb{N}$, and let $f:\Omega\longrightarrow\mathbb{K}^n$ be such that $y'=f(y)$ admits unique maximal solutions. For each $\eta\in\Omega$, we define the omega limit set and the alpha limit set of $\eta$ as follows:
$$\omega(\eta):=\omega_f(\eta):=\Big\{y\in\Omega:\ \exists_{(x_k)_{k\in\mathbb{N}}\subseteq\mathbb{R}}\ \lim_{k\to\infty}x_k=\infty\ \wedge\ \lim_{k\to\infty}Y(x_k,\eta)=y\Big\}, \qquad (5.98a)$$
$$\alpha(\eta):=\alpha_f(\eta):=\Big\{y\in\Omega:\ \exists_{(x_k)_{k\in\mathbb{N}}\subseteq\mathbb{R}}\ \lim_{k\to\infty}x_k=-\infty\ \wedge\ \lim_{k\to\infty}Y(x_k,\eta)=y\Big\}. \qquad (5.98b)$$

Remark 5.48. In the situation of Def. 5.47, consider the time-reversed version of $y'=f(y)$, i.e. $y'=-f(y)$, with its general solution $\tilde Y(x,\eta)=Y(-x,\eta)$, cf. (5.28). Clearly, for each $\eta\in\Omega$,
$$\omega_f(\eta)=\alpha_{-f}(\eta),\qquad \alpha_f(\eta)=\omega_{-f}(\eta). \qquad (5.99)$$

Proposition 5.49. In the situation of Def. 5.47, the following hold:

(a) If $Y(\cdot,\eta)$ is defined on all of $\mathbb{R}_0^+$, then
$$\omega(\eta)=\bigcap_{m=0}^\infty\overline{\{Y(x,\eta):\ x\geq m\}}; \qquad (5.100a)$$
and if $Y(\cdot,\eta)$ is defined on all of $\mathbb{R}_0^-$, then
$$\alpha(\eta)=\bigcap_{m=0}^\infty\overline{\{Y(x,\eta):\ x\leq -m\}}. \qquad (5.100b)$$

(b) All points in the same orbit have the same omega and alpha limit sets, i.e.
$$\forall_{x\in I_{0,\eta}}\quad \Big(\omega(\eta)=\omega\big(Y(x,\eta)\big)\ \wedge\ \alpha(\eta)=\alpha\big(Y(x,\eta)\big)\Big).$$

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit sets.

(a): Let $y\in\omega(\eta)$ and $m\in\mathbb{N}_0$. Then there is a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty}x_k=\infty$ and $\lim_{k\to\infty}Y(x_k,\eta)=y$. Since, for sufficiently large $k_0\in\mathbb{N}$, the sequence $(Y(x_k,\eta))_{k\geq k_0}$ is in $\{Y(x,\eta):\ x\geq m\}$, the inclusion "$\subseteq$" of (5.100a) is proved. Conversely, assume $y\in\overline{\{Y(x,\eta):\ x\geq m\}}$ for each $m\in\mathbb{N}_0$. Then,
$$\forall_{k\in\mathbb{N}}\ \exists_{x_k\in[k,\infty[}\quad \big\|Y(x_k,\eta)-y\big\|<\frac{1}{k},$$
providing a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty}x_k=\infty$ and $\lim_{k\to\infty}Y(x_k,\eta)=y$, proving $y\in\omega(\eta)$ and the inclusion "$\supseteq$" of (5.100a).

(b): Let $y\in\omega(\eta)$ and $x\in I_{0,\eta}$. Choose a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty}x_k=\infty$ and $\lim_{k\to\infty}Y(x_k,\eta)=y$. Then $\lim_{k\to\infty}(x_k-x)=\infty$ and
$$\lim_{k\to\infty}Y\big(x_k-x,\,Y(x,\eta)\big)\overset{\text{Lem. 5.4(b)}}{=}\lim_{k\to\infty}Y(x_k,\eta)=y, \qquad (5.101)$$
proving $\omega(\eta)\subseteq\omega\big(Y(x,\eta)\big)$. The reversed inclusion then also follows, since
$$Y\big(-x,\,Y(x,\eta)\big)\overset{\text{Lem. 5.4(b)}}{=}Y(0,\eta)=\eta, \qquad (5.102)$$
concluding the proof. $\Box$

Example 5.50. Let $\Omega\subseteq\mathbb{K}^n$, $n\in\mathbb{N}$, and let $f:\Omega\longrightarrow\mathbb{K}^n$ be such that $y'=f(y)$ admits unique maximal solutions.

(a) If $\eta\in\Omega$ is a fixed point, then $\omega(\eta)=\alpha(\eta)=\{\eta\}$. More generally, if $\eta\in\Omega$ is such that $\lim_{x\to\infty}Y(x,\eta)=y\in\mathbb{K}^n$, then $\omega(\eta)=\{y\}$; if $\eta\in\Omega$ is such that $\lim_{x\to-\infty}Y(x,\eta)=y\in\mathbb{K}^n$, then $\alpha(\eta)=\{y\}$.

(b) If $A\in\mathcal{M}(n,\mathbb{C})$ is such that the conditions of Th. 5.38(b) hold (all eigenvalues have negative real parts, $0$ is positively asymptotically stable), then $\omega(\eta)=\{0\}$ for each $\eta\in\mathbb{C}^n$ and $\alpha(\eta)=\emptyset$ for each $\eta\in\mathbb{C}^n\setminus\{0\}$.

(c) If $\eta\in\Omega$ is such that the orbit $\mathcal{O}(\phi)$ of $\phi:=Y(\cdot,\eta)$ is periodic, then $\omega(\eta)=\alpha(\eta)=\mathcal{O}(\phi)$. For example, for (5.8),
$$\forall_{\eta\in\mathbb{R}^2}\quad \omega(\eta)=\alpha(\eta)=S_{\|\eta\|_2}(0)=\big\{y\in\mathbb{R}^2:\ \|y\|_2=\|\eta\|_2\big\}. \qquad (5.103)$$

Example 5.51. As an example with nonperiodic orbits that have limit sets consisting of more than one point, consider

$$y_1'=y_2+y_1\,(1-y_1^2-y_2^2), \qquad (5.104a)$$
$$y_2'=-y_1+y_2\,(1-y_1^2-y_2^2). \qquad (5.104b)$$

We will show that, for each point except the origin (which is clearly a fixed point), the omega limit set is the unit circle, i.e.
$$\forall_{\eta\in\mathbb{R}^2\setminus\{0\}}\quad \omega(\eta)=S_1(0)=\big\{y\in\mathbb{R}^2:\ y_1^2+y_2^2=1\big\}. \qquad (5.105)$$

We first verify that the general solution is

$$Y:D_{f,0}\longrightarrow\mathbb{R}^2,\quad Y(x,\eta_1,\eta_2)=\frac{\big(\eta_1\cos x+\eta_2\sin x,\ \eta_2\cos x-\eta_1\sin x\big)}{\sqrt{\eta_1^2+\eta_2^2+(1-\eta_1^2-\eta_2^2)\,e^{-2x}}}, \qquad (5.106)$$
where, letting
$$\forall_{\eta\in\{y\in\mathbb{R}^2:\,\|y\|_2>1\}}\quad x_\eta:=\frac{1}{2}\,\ln\!\left(\frac{\|\eta\|_2^2-1}{\|\eta\|_2^2}\right), \qquad (5.107)$$
$$D_{f,0}=\big(\mathbb{R}\times\{\eta\in\mathbb{R}^2:\ \|\eta\|_2\leq 1\}\big)\ \cup\ \big\{(x,\eta):\ \|\eta\|_2>1,\ x\in\,]x_\eta,\infty[\,\big\}: \qquad (5.108)$$
For each $(\eta_1,\eta_2)\in\mathbb{R}^2$, $Y(\cdot,\eta_1,\eta_2)$ satisfies the initial condition:
$$Y(0,\eta_1,\eta_2)=\frac{(\eta_1,\eta_2)}{\sqrt{\eta_1^2+\eta_2^2+(1-\eta_1^2-\eta_2^2)}}=(\eta_1,\eta_2). \qquad (5.109)$$

The following computations prepare the check that each $Y(\cdot,\eta_1,\eta_2)$ satisfies (5.104): The square of the 2-norm of the numerator in (5.106) is
$$\big\|\big(\eta_1\cos x+\eta_2\sin x,\ \eta_2\cos x-\eta_1\sin x\big)\big\|_2^2=\eta_1^2\cos^2x+2\eta_1\eta_2\cos x\sin x+\eta_2^2\sin^2x+\eta_2^2\cos^2x-2\eta_1\eta_2\cos x\sin x+\eta_1^2\sin^2x=\eta_1^2+\eta_2^2=\|\eta\|_2^2. \qquad (5.110)$$
Thus,
$$\big\|Y(x,\eta_1,\eta_2)\big\|_2=\frac{\|\eta\|_2}{\sqrt{\|\eta\|_2^2+(1-\|\eta\|_2^2)\,e^{-2x}}} \qquad (5.111)$$
and
$$1-Y_1^2(x,\eta_1,\eta_2)-Y_2^2(x,\eta_1,\eta_2)=\frac{\|\eta\|_2^2+(1-\|\eta\|_2^2)e^{-2x}-\|\eta\|_2^2}{\|\eta\|_2^2+(1-\|\eta\|_2^2)e^{-2x}}=\frac{(1-\|\eta\|_2^2)\,e^{-2x}}{\|\eta\|_2^2+(1-\|\eta\|_2^2)\,e^{-2x}}. \qquad (5.112)$$
In consequence,

$$Y_1'(x,\eta_1,\eta_2)=\frac{(-\eta_1\sin x+\eta_2\cos x)\big(\|\eta\|_2^2+(1-\|\eta\|_2^2)e^{-2x}\big)+(\eta_1\cos x+\eta_2\sin x)(1-\|\eta\|_2^2)e^{-2x}}{\big(\|\eta\|_2^2+(1-\|\eta\|_2^2)e^{-2x}\big)^{3/2}}=Y_2(x,\eta_1,\eta_2)+Y_1(x,\eta_1,\eta_2)\big(1-Y_1^2(x,\eta_1,\eta_2)-Y_2^2(x,\eta_1,\eta_2)\big), \qquad (5.113)$$
verifying (5.104a). Similarly,
$$Y_2'(x,\eta_1,\eta_2)=\frac{(-\eta_2\sin x-\eta_1\cos x)\big(\|\eta\|_2^2+(1-\|\eta\|_2^2)e^{-2x}\big)+(\eta_2\cos x-\eta_1\sin x)(1-\|\eta\|_2^2)e^{-2x}}{\big(\|\eta\|_2^2+(1-\|\eta\|_2^2)e^{-2x}\big)^{3/2}}=-Y_1(x,\eta_1,\eta_2)+Y_2(x,\eta_1,\eta_2)\big(1-Y_1^2(x,\eta_1,\eta_2)-Y_2^2(x,\eta_1,\eta_2)\big), \qquad (5.114)$$
verifying (5.104b).

For $\|\eta\|_2\leq 1$, $Y(\cdot,\eta_1,\eta_2)$ is maximal, as it is defined on $\mathbb{R}$ (the denominator in (5.106) has no zero in this case). For $\|\eta\|_2>1$, the denominator clearly has a zero at $x_\eta<0$, where $x_\eta$ is defined as in (5.107). For $x>x_\eta$, the expression under the square root in (5.106) is positive. Since $\lim_{x\downarrow x_\eta}\|Y(x,\eta_1,\eta_2)\|_2=\infty$ for $\|\eta\|_2>1$, $Y(\cdot,\eta_1,\eta_2)$ is maximal in this case as well, completing the verification that $Y$, defined as in (5.106) – (5.108), is the general solution of (5.104). It remains to prove (5.105). From (5.111), we obtain

$$\forall_{\eta\in\mathbb{R}^2\setminus\{0\}}\quad \lim_{x\to\infty}\big\|Y(x,\eta_1,\eta_2)\big\|_2=1, \qquad (5.115)$$
which implies
$$\forall_{\eta\in\mathbb{R}^2\setminus\{0\}}\quad \omega(\eta)\subseteq S_1(0). \qquad (5.116)$$
Conversely, consider $\eta=(\eta_1,\eta_2)\in\mathbb{R}^2\setminus\{0\}$ and $y=(y_1,y_2)\in S_1(0)$. We will show $y\in\omega(\eta)$: Since $\|y\|_2=1$,
$$\exists_{\varphi_y\in[0,2\pi[}\quad y=(\sin\varphi_y,\ \cos\varphi_y). \qquad (5.117)$$
Analogously,
$$\exists_{\varphi_\eta\in[0,2\pi[}\quad \eta=\|\eta\|_2\,(\sin\varphi_\eta,\ \cos\varphi_\eta) \qquad (5.118)$$
(the reader might note that, in (5.117) and (5.118), we have written $y$ and $\eta$ using their polar coordinates, cf. [Phi17, Ex. 2.67]). Then, according to (5.106), we obtain, for each $x\geq 0$,
$$Y(x,\eta_1,\eta_2)=\frac{\|\eta\|_2\,\big(\sin\varphi_\eta\cos x+\cos\varphi_\eta\sin x,\ \cos\varphi_\eta\cos x-\sin\varphi_\eta\sin x\big)}{\sqrt{\|\eta\|_2^2+(1-\|\eta\|_2^2)\,e^{-2x}}}=\frac{\|\eta\|_2\,\big(\sin(x+\varphi_\eta),\ \cos(x+\varphi_\eta)\big)}{\sqrt{\|\eta\|_2^2+(1-\|\eta\|_2^2)\,e^{-2x}}}. \qquad (5.119)$$
Define
$$\forall_{k\in\mathbb{N}}\quad x_k:=\varphi_y-\varphi_\eta+2\pi k\ \in\mathbb{R}^+. \qquad (5.120)$$
Then $\lim_{k\to\infty}x_k=\infty$ and

$$\lim_{k\to\infty}Y(x_k,\eta_1,\eta_2)\overset{(5.119)}{=}\lim_{k\to\infty}\frac{\|\eta\|_2\,\big(\sin(\varphi_y-\varphi_\eta+2\pi k+\varphi_\eta),\ \cos(\varphi_y-\varphi_\eta+2\pi k+\varphi_\eta)\big)}{\sqrt{\|\eta\|_2^2+(1-\|\eta\|_2^2)\,e^{-2x_k}}}=\lim_{k\to\infty}\frac{\|\eta\|_2\,(\sin\varphi_y,\ \cos\varphi_y)}{\sqrt{\|\eta\|_2^2+(1-\|\eta\|_2^2)\,e^{-2x_k}}}=y, \qquad (5.121)$$
showing $y\in\omega(\eta)$ and $S_1(0)\subseteq\omega(\eta)$.
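The convergence (5.115) of orbits to the unit circle is easy to observe numerically. The following sketch (an addition to these notes, using SciPy's `solve_ivp`) integrates (5.104) from one initial value inside and one outside the circle:

```python
# Numerical illustration (not part of the notes) of Example 5.51: for any
# eta != 0, ||Y(x, eta)||_2 -> 1, so the omega limit set is the unit circle.
import numpy as np
from scipy.integrate import solve_ivp

def f(x, y):
    r2 = y[0]**2 + y[1]**2
    return [y[1] + y[0]*(1 - r2), -y[0] + y[1]*(1 - r2)]

for eta in ([0.1, 0.0], [0.0, 2.5]):        # one orbit inside, one outside
    sol = solve_ivp(f, (0.0, 20.0), eta, rtol=1e-9)
    print(eta, "->", np.hypot(sol.y[0, -1], sol.y[1, -1]))  # approx 1.0
```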

Proposition 5.52. In the situation of Def. 5.47, if $f$ is locally Lipschitz, then orbits that intersect an omega or alpha limit set must entirely remain inside that same omega or alpha limit set, i.e.
$$\forall_{y\in\Omega\cap\omega(\eta)}\ \forall_{x\in I_{0,y}}\quad Y(x,y)\in\omega(\eta) \qquad\text{and}\qquad \forall_{y\in\Omega\cap\alpha(\eta)}\ \forall_{x\in I_{0,y}}\quad Y(x,y)\in\alpha(\eta). \qquad (5.122)$$

Proof. Due to Rem. 5.48, it suffices to prove the statement involving the omega limit set. Let $y\in\Omega\cap\omega(\eta)$ and $x\in I_{0,y}$. Choose a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty}x_k=\infty$ and $\lim_{k\to\infty}Y(x_k,\eta)=y$. Then $\lim_{k\to\infty}(x_k+x)=\infty$ and
$$\lim_{k\to\infty}Y(x_k+x,\eta)\overset{\text{Lem. 5.4(b)}}{=}\lim_{k\to\infty}Y\big(x,\,Y(x_k,\eta)\big)\overset{(*)}{=}Y(x,y), \qquad (5.123)$$
proving $Y(x,y)\in\omega(\eta)$. At "$(*)$", we have used that, due to $f$ being locally Lipschitz by hypothesis, $Y$ is continuous by Th. 3.35. $\Box$

Proposition 5.53. In the situation of Def. 5.47, let $\eta\in\Omega$ be such that there exists a compact set $K\subseteq\Omega$, satisfying
$$\{Y(x,\eta):\ x\geq 0\}\subseteq K \qquad \big(\text{resp. } \{Y(x,\eta):\ x\leq 0\}\subseteq K\big). \qquad (5.124)$$
Then the following hold:

(a) $\omega(\eta)\neq\emptyset$ (resp. $\alpha(\eta)\neq\emptyset$).

(b) $\omega(\eta)$ (resp. $\alpha(\eta)$) is compact.

(c) $\omega(\eta)$ (resp. $\alpha(\eta)$) is a connected set, i.e. if $O_1,O_2$ are disjoint open subsets of $\mathbb{K}^n$ such that $\omega(\eta)\subseteq O_1\cup O_2$ (resp. $\alpha(\eta)\subseteq O_1\cup O_2$), then $\omega(\eta)\cap O_1=\emptyset$ or $\omega(\eta)\cap O_2=\emptyset$ (resp. $\alpha(\eta)\cap O_1=\emptyset$ or $\alpha(\eta)\cap O_2=\emptyset$).

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limit sets.

(a): Since, by hypothesis, $(Y(k,\eta))_{k\in\mathbb{N}}$ is a sequence in the compact set $K$, it must have a subsequence converging to some limit $y\in K$. But then $y\in\omega(\eta)$, i.e. $\omega(\eta)\neq\emptyset$.

(b): According to (5.100a) and (5.124), $\omega(\eta)$ is a closed subset of the compact set $K$, implying $\omega(\eta)$ to be compact as well.

(c): Seeking a contradiction, we suppose the assertion is false, i.e. there are disjoint open subsets $O_1,O_2$ of $\mathbb{K}^n$ such that $\omega(\eta)\subseteq O_1\cup O_2$, $\omega_1:=\omega(\eta)\cap O_1\neq\emptyset$, and $\omega_2:=\omega(\eta)\cap O_2\neq\emptyset$. Then $\omega_1$ and $\omega_2$ are disjoint, since $O_1,O_2$ are disjoint. Moreover, $\omega_1$ and $\omega_2$ are both subsets of the compact set $\omega(\eta)$. Due to $\omega_1=\omega(\eta)\cap(\mathbb{K}^n\setminus O_2)$ and

$\omega_2=\omega(\eta)\cap(\mathbb{K}^n\setminus O_1)$, $\omega_1$ and $\omega_2$ are also closed, hence compact. Then, according to Prop. C.9, $\delta:=\operatorname{dist}(\omega_1,\omega_2)>0$. If $y_1\in\omega_1$ and $y_2\in\omega_2$, then there are numbers $0<s_1<t_1<s_2<t_2<\ldots$ with $\lim_{k\to\infty}s_k=\lim_{k\to\infty}t_k=\infty$, such that
$$\forall_{k\in\mathbb{N}}\quad \big(Y(s_k,\eta)\in O_1\ \wedge\ Y(t_k,\eta)\in O_2\big). \qquad (5.125)$$
Define
$$\forall_{k\in\mathbb{N}}\quad \sigma_k:=\sup\big\{x\geq s_k:\ Y(t,\eta)\in O_1\text{ for each }t\in[s_k,x]\big\}. \qquad (5.126)$$
Then $s_k<\sigma_k<t_k$ and the continuity of the (even differentiable) map $Y(\cdot,\eta)$ yields $\eta_k:=Y(\sigma_k,\eta)\in\partial O_1$. Thus, $(\eta_k)_{k\in\mathbb{N}}$ is a sequence in the compact set $K\cap\partial O_1$ and, therefore, must have a convergent subsequence, converging to some $z\in K\cap\partial O_1$. But then $z\in\omega(\eta)$, but not in $O_1\cup O_2$, in contradiction to $\omega(\eta)\subseteq O_1\cup O_2$. $\Box$

Theorem 5.54 (LaSalle). Let $\Omega\subseteq\mathbb{R}^n$, $n\in\mathbb{N}$, and let $f:\Omega\longrightarrow\mathbb{R}^n$ be such that $y'=f(y)$ admits unique maximal solutions. Moreover, let $\Omega_0$ be an open subset of $\Omega$, assume $V:\Omega_0\longrightarrow\mathbb{R}$ is continuously differentiable, $K:=\{y\in\Omega_0:\ V(y)\leq r\}$ is compact for some $r\in\mathbb{R}$, and $\dot V(y)\leq 0$ (resp. $\dot V(y)\geq 0$) for each $y\in K$, where $\dot V$ is defined as in (5.33). If $\eta\in\Omega_0$ is such that $V(\eta)<r$, then the following hold:

(a) $\mathbb{R}_0^+\subseteq I_{0,\eta}$ (resp. $\mathbb{R}_0^-\subseteq I_{0,\eta}$), i.e. the maximal solution $Y(\cdot,\eta)$ is defined on all of $\mathbb{R}_0^+$ (resp. $\mathbb{R}_0^-$).

(b) $V$ is constant on $\omega(\eta)$ (resp. on $\alpha(\eta)$).

(c) If, in addition, $f$ is locally Lipschitz, then, with

$$M:=\big\{y\in K:\ \dot V\big(Y(x,y)\big)=0\text{ for each }x\geq 0\ (\text{resp. for each }x\leq 0)\big\},$$
one has $\omega(\eta)\subseteq M$ (resp. $\alpha(\eta)\subseteq M$). In particular, $\dot V(y)=0$ for each $y\in\omega(\eta)$ (resp. for each $y\in\alpha(\eta)$).

Proof. As usual, it suffices to prove the assertions for $\dot V(y)\leq 0$, as the assertions for $\dot V(y)\geq 0$ then follow via time reversion.

(a): We claim
$$\forall_{x\in I_{0,\eta}\cap\mathbb{R}_0^+}\quad V\big(Y(x,\eta)\big)<r. \qquad (5.127)$$
Indeed, if (5.127) were false, then, due to the continuity of $V\circ Y(\cdot,\eta)$ and $V(\eta)<r$, there would be a smallest $s\in I_{0,\eta}$, $s>0$, with
$$V\big(Y(s,\eta)\big)=r, \qquad (5.128)$$
and $Y(t,\eta)\in K$ for each $t\in[0,s]$. But then, as $\dot V\leq 0$ on $K$,
$$V\big(Y(s,\eta)\big)=V(\eta)+\int_0^s\dot V\big(Y(t,\eta)\big)\,\mathrm{d}t\leq V(\eta)<r, \qquad (5.129)$$
which is impossible. Thus, (5.127) must hold. However, (5.127) implies $\mathbb{R}_0^+\subseteq I_{0,\eta}$, since $Y(\cdot,\eta)$ is a maximal solution and $K=\{y\in\Omega_0:\ V(y)\leq r\}$ is compact.

(b): Let $\phi:=Y(\cdot,\eta)$. During the proof of (a) above, we have shown $\phi(x)\in K$ for each $x\geq 0$. Since, then, $(V\circ\phi)'(x)=\dot V(\phi(x))\leq 0$ for each $x\geq 0$, $V\circ\phi$ is nonincreasing for $x\geq 0$. Since $V\circ\phi$ is also bounded on $K$,
$$\exists_{c\in\mathbb{R}}\quad c=\lim_{x\to\infty}V\big(\phi(x)\big). \qquad (5.130)$$
If $y\in\omega(\eta)$, then there exists a sequence $(x_k)_{k\in\mathbb{N}}$ in $\mathbb{R}$ such that $\lim_{k\to\infty}x_k=\infty$ and $\lim_{k\to\infty}\phi(x_k)=y$. Thus, $y\in K$ (since $K$ is closed), and
$$V(y)=\lim_{k\to\infty}V\big(\phi(x_k)\big)\overset{(5.130)}{=}c, \qquad (5.131)$$
proving (b).

(c): Let $y\in\omega(\eta)$ and $\phi:=Y(\cdot,y)$. Since $f$ is assumed to be locally Lipschitz, Prop. 5.52 applies and we obtain $\phi(x)=Y(x,y)\in\omega(\eta)$ for each $x\in\mathbb{R}_0^+$. Using (b), we know $V$ to be constant on $\omega(\eta)$, i.e. $V\circ\phi$ must be constant on $\mathbb{R}_0^+$ as well, implying
$$\forall_{x\in\mathbb{R}_0^+}\quad \dot V(\phi(x))=(V\circ\phi)'(x)=0 \qquad (5.132)$$
as claimed. $\Box$

Example 5.55. Let $a<0<b$ and let $h:\,]a,b[\,\longrightarrow\mathbb{R}$ be continuously differentiable, satisfying
$$h(x)<0\ \text{ for }x<0,\qquad h(0)=0,\qquad h(x)>0\ \text{ for }x>0. \qquad (5.133)$$

Consider the autonomous ODE
$$y_1'=y_2, \qquad (5.134a)$$
$$y_2'=-y_1^2y_2-h(y_1). \qquad (5.134b)$$
The right-hand side is defined on $\Omega:=\,]a,b[\,\times\mathbb{R}$ and is clearly $C^1$, i.e. the ODE admits unique maximal solutions. Due to (5.133), $\mathcal{F}=\{(0,0)\}$, i.e. the origin is the only fixed point of (5.134). We will use Th. 5.54(c) to show $(0,0)$ is positively asymptotically stable: We introduce
$$H:\,]a,b[\,\longrightarrow\mathbb{R},\quad H(x):=\int_0^x h(t)\,\mathrm{d}t, \qquad (5.135)$$

and the Lyapunov function

$$V:\Omega\longrightarrow\mathbb{R},\quad V(y_1,y_2):=H(y_1)+\frac{y_2^2}{2}. \qquad (5.136)$$

Since $H$ is positive definite at $0$ ($H$ is actually strictly decreasing on $]a,0]$ and strictly increasing on $[0,b[$), $V$ is positive definite at $(0,0)$. We also obtain
$$\dot V:\Omega\longrightarrow\mathbb{R},\quad \dot V(y_1,y_2)=\big(h(y_1),\,y_2\big)\bullet\big(y_2,\ -y_1^2y_2-h(y_1)\big)=-y_1^2y_2^2\leq 0. \qquad (5.137)$$
Thus, from the Lyapunov Th. 5.30, we already know $(0,0)$ to be positively stable. However, $\dot V$ is not negative definite at $(0,0)$, i.e. we cannot immediately conclude that $(0,0)$ is positively asymptotically stable. Instead, as promised, we apply Th. 5.54(c): To this end, using that $H$ is continuous and positive definite at $0$, we choose $r>0$ and $c,d\in\mathbb{R}$, satisfying
$$a<c<0<d<b \qquad\text{and}\qquad r<\min\{H(c),H(d)\}, \qquad (5.138)$$
and, with $\Omega_0:=\,]c,d[\,\times\mathbb{R}$, define
$$O:=\{(y_1,y_2)\in\Omega_0:\ V(y_1,y_2)<r\}, \qquad (5.139)$$
$$K:=\{(y_1,y_2)\in\Omega_0:\ V(y_1,y_2)\leq r\}. \qquad (5.140)$$
We will show
$$\forall_{(\eta_1,\eta_2)\in O}\quad \lim_{x\to\infty}Y(x,\eta_1,\eta_2)=(0,0). \qquad (5.141)$$

Due to (5.138), points $(y_1,y_2)\in\Omega_0$ with $y_1$ close to $c$ or $d$ satisfy $V(y_1,y_2)>r$, so that $K$ has positive distance to the boundary of $\Omega_0$; moreover, the continuity of $V$ implies $K$ to be closed. Since $K\subseteq[c,d]\times[-\sqrt{2r},\sqrt{2r}]$, it is also bounded, i.e. compact. Thus, Th. 5.54 applies to each $\eta\in O$. So let $\eta\in O$. We will show that $M=\{(0,0)\}$, where $M$ is the set of Th. 5.54(c) (then $\omega(\eta)=\{(0,0)\}$ by Th. 5.54(c), which implies (5.141) as desired). To verify $M=\{(0,0)\}$, note $\dot V(y_1,y_2)<0$ for $y_1,y_2\neq 0$, showing $(y_1,y_2)\notin M$. For $y_1=0$, $y_2\neq 0$, let $\phi:=Y(\cdot,y_1,y_2)$. Then $\phi_2(0)=y_2\neq 0$ and $\phi_1'(0)=y_2\neq 0$, i.e. both $\phi_1$ and $\phi_2$ are nonzero on some interval $]0,\epsilon[$ with $\epsilon>0$, showing $(y_1,y_2)\notin M$. Likewise, if $y_1\neq 0$, $y_2=0$, then let $\phi$ be as before. This time, $\phi_1(0)=y_1\neq 0$ and $\phi_2'(0)=-h(y_1)\neq 0$, again showing both $\phi_1$ and $\phi_2$ are nonzero on some interval $]0,\epsilon[$ with $\epsilon>0$, implying $(y_1,y_2)\notin M$.
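To see Th. 5.54(c) at work numerically, one can simulate (5.134) for a concrete choice of $h$. The sketch below (an addition to these notes) uses the hypothetical choice $h(x):=x$ on $]a,b[\,=\mathbb{R}$, which satisfies (5.133):

```python
# Sketch with the hypothetical choice h(x) = x (satisfying (5.133)):
# trajectories of (5.134) tend to the origin although V-dot from (5.137)
# is only <= 0, illustrating Th. 5.54(c).
import numpy as np
from scipy.integrate import solve_ivp

h = lambda x: x
def f(x, y):
    return [y[1], -y[0]**2 * y[1] - h(y[0])]

sol = solve_ivp(f, (0.0, 200.0), [0.4, -0.3], rtol=1e-9)
print(sol.y[:, -1])    # close to (0, 0), as predicted by Th. 5.54(c)
```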

A Differentiability

We provide a lemma used in the variation of constants Th. 2.3.

Lemma A.1. Let $O\subseteq\mathbb{R}$ be open. If the function $a:O\longrightarrow\mathbb{K}$ is differentiable, then
$$f:O\longrightarrow\mathbb{K},\quad f(x):=e^{a(x)}, \qquad (A.1a)$$
is differentiable with
$$f':O\longrightarrow\mathbb{K},\quad f'(x)=a'(x)\,e^{a(x)}. \qquad (A.1b)$$

Proof. For $\mathbb{K}=\mathbb{R}$, the lemma is immediate from the chain rule of [Phi16a, Th. 9.11]. It remains to consider the case $\mathbb{K}=\mathbb{C}$. Note that we cannot apply the chain rule for holomorphic (i.e. $\mathbb{C}$-differentiable) functions, since $a$ is only $\mathbb{R}$-differentiable and need not have a holomorphic extension. However, we can argue as follows, merely using the chain rule and the product rule for real-valued functions: Write $a=b+ic$ with differentiable functions $b,c:O\longrightarrow\mathbb{R}$. Then
$$f(x)=e^{a(x)}=e^{b(x)+ic(x)}=e^{b(x)}\,e^{ic(x)}=e^{b(x)}\big(\cos c(x)+i\sin c(x)\big). \qquad (A.2)$$

Thus, one computes
$$f'(x)=b'(x)\,e^{b(x)}e^{ic(x)}+e^{b(x)}\big(-c'(x)\sin c(x)+ic'(x)\cos c(x)\big)=b'(x)\,e^{a(x)}+ic'(x)\,e^{b(x)}\big(\cos c(x)+i\sin c(x)\big)=b'(x)\,e^{a(x)}+ic'(x)\,e^{a(x)}=\big(b'(x)+ic'(x)\big)\,e^{a(x)}=a'(x)\,e^{a(x)}, \qquad (A.3)$$
proving (A.1b). $\Box$

B Kn-Valued Integration

During the course of this class, we frequently need $\mathbb{K}^n$-valued integrals. In particular, for $f:I\longrightarrow\mathbb{K}^n$, $I$ an interval in $\mathbb{R}$, we make use of the estimate $\|\int_I f\|\leq\int_I\|f\|$, for example in the proof of the Peano Th. 3.8. As mentioned in the proof of Th. 3.8, the estimate can easily be checked directly for the 1-norm on $\mathbb{K}^n$, but it does hold for every norm on $\mathbb{K}^n$. To verify this result is the main purpose of the present section. Throughout the class, it suffices to use Riemann integrals. However, some readers might be more familiar with Lebesgue integrals, which constitute a more general notion (every Riemann integrable function is also Lebesgue integrable). For convenience, the material is presented twice: first using Riemann integrals and arguments that make specific use of techniques available for Riemann integrals; then, second, using Lebesgue integrals and corresponding techniques. For Riemann integrals, the norm estimate is proved in Th. B.4, for Lebesgue integrals in Th. B.9.

B.1 Kn-Valued Riemann Integral

Definition B.1. Let $a,b\in\mathbb{R}$, $I:=[a,b]$. We call a function $f:I\longrightarrow\mathbb{K}^n$, $n\in\mathbb{N}$, Riemann integrable if, and only if, each coordinate function $f_j=\pi_j\circ f:I\longrightarrow\mathbb{K}$, $j=1,\ldots,n$, is Riemann integrable. Denote the set of all Riemann integrable functions from $I$ into $\mathbb{K}^n$ by $\mathcal{R}(I,\mathbb{K}^n)$. If $f:I\longrightarrow\mathbb{K}^n$ is Riemann integrable, then
$$\int_I f:=\Big(\int_I f_1,\ldots,\int_I f_n\Big)\in\mathbb{K}^n \qquad (B.1)$$
is the ($\mathbb{K}^n$-valued) Riemann integral of $f$ over $I$.

Remark B.2. The linearity of the K-valued integral implies the linearity of the Kn- valued integral.

Theorem B.3. Let $a,b\in\mathbb{R}$, $a\leq b$, $I:=[a,b]$. If $f\in\mathcal{R}(I,\mathbb{K}^n)$, $n\in\mathbb{N}$, and $\phi:f(I)\longrightarrow\mathbb{R}$ is Lipschitz continuous, then $\phi\circ f\in\mathcal{R}(I,\mathbb{R})$.

Proof. If $\mathbb{K}=\mathbb{R}$, then $\phi\circ f=\psi\circ\iota\circ f$, where $\iota:\mathbb{R}^n\longrightarrow\mathbb{C}^n$ is the canonical imbedding, and $\psi:\mathbb{C}^n\longrightarrow\mathbb{R}$, $\psi(z_1,\ldots,z_n):=\phi(\operatorname{Re}z_1,\ldots,\operatorname{Re}z_n)$. Clearly, $\iota\circ f\in\mathcal{R}(I,\mathbb{C}^n)$, and, if $\phi$ is $L$-Lipschitz, $L\geq 0$, then, due to
$$|\psi(z)-\psi(w)|=|\phi(\operatorname{Re}z)-\phi(\operatorname{Re}w)|\leq L\,\|\operatorname{Re}z-\operatorname{Re}w\|\overset{(*)}{\leq}CL\,\|\operatorname{Re}z-\operatorname{Re}w\|_1=CL\sum_{j=1}^n|\operatorname{Re}z_j-\operatorname{Re}w_j|\overset{\text{[Phi16a, Th. 5.9(d)]}}{\leq}CL\sum_{j=1}^n|z_j-w_j|=CL\,\|z-w\|_1\overset{(**)}{\leq}C\tilde CL\,\|z-w\| \quad\text{for each }z,w\in\mathbb{C}^n, \qquad (B.2)$$
where the estimate at $(*)$ holds with $C\in\mathbb{R}^+$, due to the equivalence of $\|\cdot\|$ and $\|\cdot\|_1$ on $\mathbb{R}^n$, and the estimate at $(**)$ holds with $\tilde C\in\mathbb{R}^+$, due to the equivalence of $\|\cdot\|_1$ and $\|\cdot\|$ on $\mathbb{C}^n$. Thus, by (B.2), $\psi$ is Lipschitz as well, namely $C\tilde CL$-Lipschitz, and it suffices to consider the case $\mathbb{K}=\mathbb{C}$, which we proceed to do next. Once again using the equivalence of $\|\cdot\|_1$ and $\|\cdot\|$ on $\mathbb{C}^n$, there exists $c\in\mathbb{R}^+$ such that $\|z\|\leq c\,\|z\|_1$ for each $z\in\mathbb{C}^n$. Assume $\phi$ to be $L$-Lipschitz, $L\geq 0$. If $f\in\mathcal{R}(I,\mathbb{C}^n)$, then $\operatorname{Re}f_1,\ldots,\operatorname{Re}f_n\in\mathcal{R}(I,\mathbb{R})$ and $\operatorname{Im}f_1,\ldots,\operatorname{Im}f_n\in\mathcal{R}(I,\mathbb{R})$, i.e., given $\epsilon>0$, Riemann's integrability criterion of [Phi16a, Th. 10.13] provides partitions $\Delta_1,\ldots,\Delta_n$ of $I$ and $\Pi_1,\ldots,\Pi_n$ of $I$ such that
$$\forall_{j=1,\ldots,n}\quad R(\Delta_j,\operatorname{Re}f_j)-r(\Delta_j,\operatorname{Re}f_j)<\frac{\epsilon}{2ncL},\qquad R(\Pi_j,\operatorname{Im}f_j)-r(\Pi_j,\operatorname{Im}f_j)<\frac{\epsilon}{2ncL}, \qquad (B.3)$$
where $R$ and $r$ denote upper and lower Riemann sums, respectively (cf. [Phi16a, (10.7)]). Letting $\Delta$ be a joint refinement of the $2n$ partitions $\Delta_1,\ldots,\Delta_n,\Pi_1,\ldots,\Pi_n$, we have (cf. [Phi16a, Def. 10.8(a),(b)] and [Phi16a, Th. 10.10(a)])
$$\forall_{j=1,\ldots,n}\quad R(\Delta,\operatorname{Re}f_j)-r(\Delta,\operatorname{Re}f_j)<\frac{\epsilon}{2ncL},\qquad R(\Delta,\operatorname{Im}f_j)-r(\Delta,\operatorname{Im}f_j)<\frac{\epsilon}{2ncL}. \qquad (B.4)$$

Recalling that, for each $g:I\longrightarrow\mathbb{R}$ and $\Delta=(x_0,\ldots,x_N)\in\mathbb{R}^{N+1}$, $N\in\mathbb{N}$, $a=x_0<x_1<\cdots<x_N=b$, $I_k:=[x_{k-1},x_k]$, one sets $M_k(g):=\sup g(I_k)$ and $m_k(g):=\inf g(I_k)$, we obtain from the Lipschitz estimate above, for $\phi\circ f$ and each $k$,
$$M_k(\phi\circ f)-m_k(\phi\circ f)\leq cL\sum_{j=1}^n\Big(\big(M_k(\operatorname{Re}f_j)-m_k(\operatorname{Re}f_j)\big)+\big(M_k(\operatorname{Im}f_j)-m_k(\operatorname{Im}f_j)\big)\Big). \qquad (B.5)$$

Thus,

$$R(\Delta,\phi\circ f)-r(\Delta,\phi\circ f)=\sum_{k=1}^N\big(M_k(\phi\circ f)-m_k(\phi\circ f)\big)\,|I_k|\leq cL\sum_{k=1}^N\sum_{j=1}^n\big(M_k(\operatorname{Re}f_j)-m_k(\operatorname{Re}f_j)\big)\,|I_k|+cL\sum_{k=1}^N\sum_{j=1}^n\big(M_k(\operatorname{Im}f_j)-m_k(\operatorname{Im}f_j)\big)\,|I_k|=cL\sum_{j=1}^n\big(R(\Delta,\operatorname{Re}f_j)-r(\Delta,\operatorname{Re}f_j)\big)+cL\sum_{j=1}^n\big(R(\Delta,\operatorname{Im}f_j)-r(\Delta,\operatorname{Im}f_j)\big) \qquad (B.6)$$
and, hence, by (B.4),
$$R(\Delta,\phi\circ f)-r(\Delta,\phi\circ f)<2ncL\cdot\frac{\epsilon}{2ncL}=\epsilon. \qquad (B.7)$$
Thus, $\phi\circ f\in\mathcal{R}(I,\mathbb{R})$ by [Phi16a, Th. 10.13]. $\Box$

Theorem B.4. Let $a,b\in\mathbb{R}$, $a\leq b$, $I:=[a,b]$. For each norm $\|\cdot\|$ on $\mathbb{K}^n$, $n\in\mathbb{N}$, and each Riemann integrable $f:I\longrightarrow\mathbb{K}^n$, one has $\|f\|\in\mathcal{R}(I,\mathbb{R})$, and the following holds:
$$\Big\|\int_I f\Big\|\leq\int_I\|f\|. \qquad (B.8)$$

Proof. From Th. B.3, we obtain $\|f\|\in\mathcal{R}(I,\mathbb{R})$, as the norm $\|\cdot\|$ is 1-Lipschitz by the inverse triangle inequality. Let $\Delta$ be an arbitrary partition of $I$. Recalling that, for each $g:I\longrightarrow\mathbb{R}$ and $\Delta=(x_0,\ldots,x_N)\in\mathbb{R}^{N+1}$, $N\in\mathbb{N}$, $a=x_0<x_1<\cdots<x_N=b$, $I_k:=[x_{k-1},x_k]$, $t_k\in I_k$, the intermediate Riemann sums are
$$\rho(\Delta,g)=\sum_{k=1}^N g(t_k)\,|I_k|=\sum_{k=1}^N g(t_k)(x_k-x_{k-1}), \qquad (B.9)$$
we obtain, for $\xi_k\in I_k$,
$$\Big\|\Big(\big(\rho(\Delta,\operatorname{Re}f_1),\rho(\Delta,\operatorname{Im}f_1)\big),\ldots,\big(\rho(\Delta,\operatorname{Re}f_n),\rho(\Delta,\operatorname{Im}f_n)\big)\Big)\Big\|=\Big\|\sum_{k=1}^N\Big(\big(\operatorname{Re}f_1(\xi_k),\operatorname{Im}f_1(\xi_k)\big),\ldots,\big(\operatorname{Re}f_n(\xi_k),\operatorname{Im}f_n(\xi_k)\big)\Big)\,|I_k|\Big\|\leq\sum_{k=1}^N\Big\|\Big(\big(\operatorname{Re}f_1(\xi_k),\operatorname{Im}f_1(\xi_k)\big),\ldots,\big(\operatorname{Re}f_n(\xi_k),\operatorname{Im}f_n(\xi_k)\big)\Big)\Big\|\,|I_k|=\sum_{k=1}^N\|f(\xi_k)\|\,|I_k|=\rho(\Delta,\|f\|). \qquad (B.10)$$
Since the intermediate Riemann sums in (B.10) converge to the respective integrals by [Phi16a, (10.25b)], one obtains
$$\Big\|\int_I f\Big\|=\lim_{|\Delta|\to 0}\Big\|\Big(\big(\rho(\Delta,\operatorname{Re}f_1),\rho(\Delta,\operatorname{Im}f_1)\big),\ldots,\big(\rho(\Delta,\operatorname{Re}f_n),\rho(\Delta,\operatorname{Im}f_n)\big)\Big)\Big\|\overset{(B.10)}{\leq}\lim_{|\Delta|\to 0}\rho(\Delta,\|f\|)=\int_I\|f\|, \qquad (B.11)$$
proving (B.8). $\Box$
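The estimate (B.8) is easily checked numerically for concrete $f$ and norms. The following sketch (an addition to these notes) approximates both sides by Riemann sums for $f(x)=(\cos x,\sin x)$ on $[0,\pi]$ with the Euclidean norm, where $\|\int_I f\|_2=\|(0,2)\|_2=2$ and $\int_I\|f\|_2=\pi$:

```python
# Numerical sanity check (not from the notes) of the norm estimate (B.8).
import numpy as np

N = 200_000
x, dx = np.linspace(0.0, np.pi, N, endpoint=False, retstep=True)
f = np.stack([np.cos(x), np.sin(x)])            # f(x) = (cos x, sin x)

int_f = (f * dx).sum(axis=1)                    # Riemann sum per component
lhs = np.linalg.norm(int_f)                     # || int_I f ||_2   -> 2
rhs = (np.linalg.norm(f, axis=0) * dx).sum()    # int_I ||f||_2     -> pi
print(lhs, "<=", rhs)
```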

B.2 Kn-Valued Lebesgue Integral

Definition B.5. Let $I\subseteq\mathbb{R}$ be (Lebesgue) measurable, $n\in\mathbb{N}$.

(a) A function $f:I\longrightarrow\mathbb{K}^n$ is called (Lebesgue) measurable (respectively, (Lebesgue) integrable) if, and only if, each coordinate function $f_j=\pi_j\circ f:I\longrightarrow\mathbb{K}$, $j=1,\ldots,n$, is (Lebesgue) measurable (respectively, (Lebesgue) integrable), which, for $\mathbb{K}=\mathbb{C}$, means if, and only if, each $\operatorname{Re}f_j$ and each $\operatorname{Im}f_j$, $j=1,\ldots,n$, is (Lebesgue) measurable (respectively, (Lebesgue) integrable).

(b) If $f:I\longrightarrow\mathbb{K}^n$ is integrable, then
$$\int_I f:=\Big(\int_I f_1,\ldots,\int_I f_n\Big)\in\mathbb{K}^n \qquad (B.12)$$
is the ($\mathbb{K}^n$-valued) (Lebesgue) integral of $f$ over $I$.

Remark B.6. The linearity of the K-valued integral implies the linearity of the Kn- valued integral.

Theorem B.7. Let $I\subseteq\mathbb{R}$ be measurable, $n\in\mathbb{N}$. Then $f:I\longrightarrow\mathbb{K}^n$ is measurable in the sense of Def. B.5(a) if, and only if, $f^{-1}(O)$ is measurable for each open subset $O$ of $\mathbb{K}^n$.

Proof. Assume $f^{-1}(O)$ is measurable for each open subset $O$ of $\mathbb{K}^n$. Let $j\in\{1,\ldots,n\}$. If $O_j\subseteq\mathbb{K}$ is open in $\mathbb{K}$, then $O:=\pi_j^{-1}(O_j)=\{z\in\mathbb{K}^n:\ z_j\in O_j\}$ is open in $\mathbb{K}^n$. Thus, $f_j^{-1}(O_j)=f^{-1}(O)$ is measurable, showing that each $f_j$ is measurable, i.e. $f$ is measurable. Now assume $f$ is measurable, i.e. each $f_j$ is measurable. Since every open $O\subseteq\mathbb{K}^n$ is a countable union of open sets of the form $O=O_1\times\cdots\times O_n$ with each $O_j$ being an open subset of $\mathbb{K}$, it suffices to show that the preimages of such open sets are measurable. So let $O$ be as above. Then $f^{-1}(O)=\bigcap_{j=1}^n f_j^{-1}(O_j)$, showing that $f^{-1}(O)$ is measurable. $\Box$

Corollary B.8. Let $I\subseteq\mathbb{R}$ be measurable, $n\in\mathbb{N}$. If $f:I\longrightarrow\mathbb{K}^n$ is measurable, then $\|f\|:I\longrightarrow\mathbb{R}$ is measurable.

Proof. If $O\subseteq\mathbb{R}$ is open, then $\|\cdot\|^{-1}(O)$ is an open subset of $\mathbb{K}^n$ by the continuity of the norm. In consequence, $\|f\|^{-1}(O)=f^{-1}\big(\|\cdot\|^{-1}(O)\big)$ is measurable. $\Box$

Theorem B.9. Let $I\subseteq\mathbb{R}$ be measurable, $n\in\mathbb{N}$. For each norm $\|\cdot\|$ on $\mathbb{K}^n$ and each integrable $f:I\longrightarrow\mathbb{K}^n$, the following holds:
$$\Big\|\int_I f\Big\|\leq\int_I\|f\|. \qquad (B.13)$$

Proof. First, assume that $B\subseteq I$ is measurable, $y\in\mathbb{K}^n$, and $f=y\chi_B$, where $\chi_B$ is the characteristic function of $B$ (i.e. the $f_j$ are $y_j$ on $B$ and $0$ on $I\setminus B$). Then
$$\Big\|\int_I f\Big\|=\big\|\big(y_1\lambda(B),\ldots,y_n\lambda(B)\big)\big\|=\lambda(B)\,\|y\|=\int_I\|f\|, \qquad (B.14)$$
where $\lambda$ denotes Lebesgue measure on $\mathbb{R}$. Next, consider the case that $f$ is a so-called simple function, that means $f$ takes only finitely many values $y_1,\ldots,y_N\in\mathbb{K}^n$, $N\in\mathbb{N}$, and each preimage $B_j:=f^{-1}\{y_j\}\subseteq I$ is measurable. Then
$$f=\sum_{j=1}^N y_j\chi_{B_j}, \qquad (B.15)$$
where, without loss of generality, we may assume that the $B_j$ are pairwise disjoint. We obtain
$$\Big\|\int_I f\Big\|=\Big\|\sum_{j=1}^N\int_I y_j\chi_{B_j}\Big\|\leq\sum_{j=1}^N\Big\|\int_I y_j\chi_{B_j}\Big\|=\sum_{j=1}^N\int_I\|y_j\chi_{B_j}\|\overset{(*)}{=}\int_I\Big\|\sum_{j=1}^N y_j\chi_{B_j}\Big\|=\int_I\|f\|, \qquad (B.16)$$
where, at $(*)$, it was used that, as the $B_j$ are disjoint, the integrands of the two integrals are equal at each $x\in I$.

Now, if $f$ is integrable, then each $\operatorname{Re}f_j$ and each $\operatorname{Im}f_j$, $j\in\{1,\ldots,n\}$, is integrable (i.e., for each $j\in\{1,\ldots,n\}$, $\operatorname{Re}f_j,\operatorname{Im}f_j\in L^1(I)$) and there exist sequences of simple functions $\phi_{j,k}:I\longrightarrow\mathbb{R}$ and $\psi_{j,k}:I\longrightarrow\mathbb{R}$ such that $\lim_{k\to\infty}\|\phi_{j,k}-\operatorname{Re}f_j\|_{L^1(I)}=\lim_{k\to\infty}\|\psi_{j,k}-\operatorname{Im}f_j\|_{L^1(I)}=0$. In particular, for each $j\in\{1,\ldots,n\}$,

$$0\leq\lim_{k\to\infty}\Big|\int_I\phi_{j,k}-\int_I\operatorname{Re}f_j\Big|\leq\lim_{k\to\infty}\|\phi_{j,k}-\operatorname{Re}f_j\|_{L^1(I)}=0, \qquad (B.17a)$$
$$0\leq\lim_{k\to\infty}\Big|\int_I\psi_{j,k}-\int_I\operatorname{Im}f_j\Big|\leq\lim_{k\to\infty}\|\psi_{j,k}-\operatorname{Im}f_j\|_{L^1(I)}=0, \qquad (B.17b)$$
and also, for each $j\in\{1,\ldots,n\}$,
$$0\leq\lim_{k\to\infty}\|\phi_{j,k}+i\psi_{j,k}-f_j\|_{L^1(I)}\leq\lim_{k\to\infty}\|\phi_{j,k}-\operatorname{Re}f_j\|_{L^1(I)}+\lim_{k\to\infty}\|\psi_{j,k}-\operatorname{Im}f_j\|_{L^1(I)}=0. \qquad (B.18)$$
Thus, letting, for each $k\in\mathbb{N}$, $\phi_k:=(\phi_{1,k},\ldots,\phi_{n,k})$ and $\psi_k:=(\psi_{1,k},\ldots,\psi_{n,k})$, we obtain

$$\left\| \int_I f \right\| = \left\| \left( \int_I f_1, \dots, \int_I f_n \right) \right\|$$

$$= \left\| \left( \lim_{k\to\infty} \int_I \phi_{1,k} + i \lim_{k\to\infty} \int_I \psi_{1,k}, \; \dots, \; \lim_{k\to\infty} \int_I \phi_{n,k} + i \lim_{k\to\infty} \int_I \psi_{n,k} \right) \right\|$$

$$= \lim_{k\to\infty} \left\| \int_I (\phi_k + i \psi_k) \right\| \le \lim_{k\to\infty} \int_I \| \phi_k + i \psi_k \| \overset{(*)}{=} \int_I \|f\|, \tag{B.19}$$

where the equality at $(*)$ holds due to $\lim_{k\to\infty} \big\| \, \|\phi_k + i \psi_k\| - \|f\| \, \big\|_{L^1(I)} = 0$, which, in turn, is verified by

$$0 \le \int_I \big| \, \| \phi_k + i \psi_k \| - \|f\| \, \big| \le \int_I \| \phi_k + i \psi_k - f \| \le C \int_I \| \phi_k + i \psi_k - f \|_1 = C \sum_{j=1}^n \int_I | \phi_{j,k} + i \psi_{j,k} - f_j | \to 0 \quad \text{for } k \to \infty, \tag{B.20}$$


with $C \in \mathbb{R}^+$, since the norms $\|\cdot\|$ and $\|\cdot\|_1$ are equivalent on $\mathbb{K}^n$. $\blacksquare$
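As a purely numerical aside, the estimate (B.13) can be illustrated for a concrete integrand. The following Python sketch (assuming NumPy; the sample function $f(x) = (e^{2\pi i x}, x^2)$ is an arbitrary choice, not taken from the text) approximates both sides of (B.13) by trapezoidal sums:

```python
# Illustration of (B.13): || int_I f || <= int_I ||f|| for a sample f: [0,1] -> C^2.
import numpy as np

x = np.linspace(0.0, 1.0, 10001)
dx = x[1] - x[0]
f = np.stack([np.exp(2j * np.pi * x), x**2 + 0j], axis=1)  # sample integrand

int_f = (f[:-1] + f[1:]).sum(axis=0) * dx / 2        # componentwise trapezoidal sum
lhs = np.linalg.norm(int_f)                          # || int_I f || (Euclidean norm)
norms = np.linalg.norm(f, axis=1)                    # x |-> ||f(x)||
rhs = ((norms[:-1] + norms[1:]) * dx / 2).sum()      # int_I ||f||
print(lhs, "<=", rhs)                                # here: 0.333... <= 1.089...
```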

C Metric Spaces

C.1 Distance in Metric Spaces

Lemma C.1. The following law holds in every metric space (X,d):

$$| d(x,y) - d(x',y') | \le d(x,x') + d(y,y') \quad \text{for each } x, x', y, y' \in X. \tag{C.1}$$
In particular, (C.1) states the Lipschitz continuity of $d \colon X^2 \longrightarrow \mathbb{R}_0^+$ (with Lipschitz constant $1$) with respect to the metric $d_1$ on $X^2$ defined by

$$d_1 \colon X^2 \times X^2 \longrightarrow \mathbb{R}_0^+, \quad d_1\big( (x,y), (x',y') \big) = d(x,x') + d(y,y'). \tag{C.2}$$
Further consequences are the continuity and even uniform continuity of $d$, and also the continuity of $d$ in both components.

Proof. First, note $d(x,y) \le d(x,x') + d(x',y') + d(y',y)$, i.e.
$$d(x,y) - d(x',y') \le d(x,x') + d(y',y). \tag{C.3a}$$
Second, $d(x',y') \le d(x',x) + d(x,y) + d(y,y')$, i.e.
$$d(x',y') - d(x,y) \le d(x',x) + d(y,y'). \tag{C.3b}$$
Taken together, (C.3a) and (C.3b) complete the proof of (C.1). $\blacksquare$

Definition C.2. Let $(X,d)$ be a nonempty metric space. For each $A, B \subseteq X$, define the distance between $A$ and $B$ by

$$\operatorname{dist}(A,B) := \inf \{ d(a,b) : a \in A, \, b \in B \} \in [0,\infty] \tag{C.4}$$
and
$$\operatorname{dist}(x,B) := \operatorname{dist}(\{x\}, B) \quad \text{and} \quad \operatorname{dist}(A,x) := \operatorname{dist}(A, \{x\}) \quad \text{for each } x \in X. \tag{C.5}$$

Remark C.3. Clearly, for $\operatorname{dist}(A,B)$ as defined in (C.4), we have

$$\operatorname{dist}(A,B) < \infty \quad \Leftrightarrow \quad A \ne \emptyset \text{ and } B \ne \emptyset. \tag{C.6}$$

Theorem C.4. Let $(X,d)$ be a nonempty metric space. If $A \subseteq X$ and $A \ne \emptyset$, then the functions
$$\delta, \tilde\delta \colon X \longrightarrow \mathbb{R}_0^+, \quad \delta(x) := \operatorname{dist}(x,A), \quad \tilde\delta(x) := \operatorname{dist}(A,x), \tag{C.7}$$
are both Lipschitz continuous with Lipschitz constant $1$ (in particular, they are both continuous and even uniformly continuous).

Proof. Since $\operatorname{dist}(x,A) = \operatorname{dist}(A,x)$, it suffices to verify the Lipschitz continuity of $\delta$. We need to show
$$| \operatorname{dist}(x,A) - \operatorname{dist}(y,A) | \le d(x,y) \quad \text{for each } x, y \in X. \tag{C.8}$$
To this end, let $x, y \in X$ and $a \in A$ be arbitrary. Then
$$\operatorname{dist}(x,A) \le d(x,a) \le d(x,y) + d(y,a) \tag{C.9}$$
and
$$\operatorname{dist}(x,A) - d(x,y) \le d(y,a), \tag{C.10}$$
implying
$$\operatorname{dist}(x,A) - d(x,y) \le \operatorname{dist}(y,A) \tag{C.11}$$
and
$$\operatorname{dist}(x,A) - \operatorname{dist}(y,A) \le d(x,y). \tag{C.12}$$
Since $x, y \in X$ were arbitrary, (C.12) also yields
$$\operatorname{dist}(y,A) - \operatorname{dist}(x,A) \le d(x,y), \tag{C.13}$$
where (C.12) and (C.13) together are precisely (C.8). $\blacksquare$
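Numerically, the Lipschitz estimate (C.8) can be spot-checked. The following Python sketch (assuming NumPy; the finite sample set $A \subseteq \mathbb{R}^2$ and the random test points are arbitrary choices, not taken from the text) verifies the bound on random pairs:

```python
# Spot check of (C.8) in (R^2, Euclidean metric) with a finite set A.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))                 # finite A, so the infimum is a minimum

def dist_to_A(x):
    return np.min(np.linalg.norm(A - x, axis=1))

for _ in range(1000):
    x = rng.normal(size=2)
    y = rng.normal(size=2)
    # (C.8): |dist(x,A) - dist(y,A)| <= d(x,y), up to rounding tolerance
    assert abs(dist_to_A(x) - dist_to_A(y)) <= np.linalg.norm(x - y) + 1e-12
print("(C.8) verified on 1000 random pairs")
```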

Definition C.5. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\epsilon \in \mathbb{R}^+$. Define
$$A_\epsilon := \{ x \in X : d(x,A) < \epsilon \}, \tag{C.14a}$$
$$\bar A_\epsilon := \{ x \in X : d(x,A) \le \epsilon \}. \tag{C.14b}$$

We call $A_\epsilon$ the open $\epsilon$-fattening of $A$, and $\bar A_\epsilon$ the closed $\epsilon$-fattening of $A$.

Lemma C.6. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\epsilon \in \mathbb{R}^+$. Then $A_\epsilon$, the open $\epsilon$-fattening of $A$, is, indeed, open, and $\bar A_\epsilon$, the closed $\epsilon$-fattening of $A$, is, indeed, closed.

Proof. Since the distance function $\delta \colon X \longrightarrow \mathbb{R}_0^+$, $\delta(x) := \operatorname{dist}(x,A)$, is continuous by Th. C.4, $A_\epsilon = \delta^{-1}\big( [0,\epsilon[ \big)$ is open as the continuous preimage of an open set (note that $[0,\epsilon[$ is, indeed, (relatively) open in $\mathbb{R}_0^+$); $\bar A_\epsilon = \delta^{-1}\big( [0,\epsilon] \big)$ is closed as the continuous preimage of a closed set. $\blacksquare$

Lemma C.7. Let $(X,d)$ be a metric space, $A \subseteq X$, and $\epsilon \in \mathbb{R}^+$. If $A$ is bounded, then so are the fattenings $A_\epsilon$ and $\bar A_\epsilon$.

Proof. If $A$ is bounded, then there exist $x \in X$ and $r > 0$ such that $A \subseteq B_r(x)$. Let $s := r + \epsilon + 1$. If $y \in \bar A_\epsilon$, then there exists $a \in A$ such that $d(a,y) < \epsilon + 1$. Thus,
$$d(x,y) \le d(x,a) + d(a,y) < r + \epsilon + 1 = s,$$
showing $A_\epsilon \subseteq \bar A_\epsilon \subseteq B_s(x)$. $\blacksquare$

Lemma C.8. Let $(X, \|\cdot\|)$ be a normed vector space over $\mathbb{K}$ with induced metric $d$, let $A \subseteq X$, $A \ne \emptyset$, and let $\epsilon_1, \epsilon_2 \in \mathbb{R}^+$ with $\epsilon_1 < \epsilon_2$.

(a) It holds that
$$A \subseteq A_{\epsilon_1} \subseteq \bar A_{\epsilon_1} \subseteq A_{\epsilon_2} \subseteq \bar A_{\epsilon_2}. \tag{C.15}$$

(b) If there exists $x \in X$ with $\delta := d(x,A) \ge \epsilon_2$, then all inclusions in (C.15) are strict.

Proof. (a) is immediate from (C.14). To prove (b), let $a \in A$ and consider the maps
$$\phi \colon [0,1] \longrightarrow X, \quad \phi(t) := t x + (1-t) a, \tag{C.16a}$$
$$f \colon [0,1] \longrightarrow \mathbb{R}, \quad f(t) := d\big( \phi(t), A \big). \tag{C.16b}$$
If $(s_n)_{n\in\mathbb{N}}$ is a sequence in $[0,1]$ such that $\lim_{n\to\infty} s_n = s \in [0,1]$, then $\lim_{n\to\infty} \phi(s_n) = s x + (1-s) a = \phi(s)$, i.e. $\phi$ is continuous. Then, using Th. C.4, $f$ is also continuous. Thus, since $f(0) = d(a,A) = 0$ and $f(1) = d(x,A) = \delta \ge \epsilon_2$, one can use the intermediate value theorem [Phi16a, Th. 7.57] to obtain, for each $\epsilon \in [0,\epsilon_2]$, some $\tau \in [0,1]$ satisfying $f(\tau) = \epsilon$. If $\epsilon > 0$, then $d(\phi(\tau),A) = f(\tau) = \epsilon > 0$; choosing $\epsilon \in \,]0,\epsilon_1[$ yields $\phi(\tau) \in A_{\epsilon_1} \setminus A$, while choosing $\epsilon := \epsilon_1$ (resp. $\epsilon := \epsilon_2$) yields $\phi(\tau) \in \bar A_{\epsilon_1} \setminus A_{\epsilon_1}$ (resp. $\phi(\tau) \in \bar A_{\epsilon_2} \setminus A_{\epsilon_2}$), showing $A \subsetneq A_{\epsilon_1}$, $A_{\epsilon_1} \subsetneq \bar A_{\epsilon_1}$, and $A_{\epsilon_2} \subsetneq \bar A_{\epsilon_2}$. If $\epsilon := (\epsilon_1 + \epsilon_2)/2$, then $\epsilon_1 < \epsilon = f(\tau) = d(\phi(\tau),A) < \epsilon_2$, i.e. $\phi(\tau) \in A_{\epsilon_2} \setminus \bar A_{\epsilon_1}$, showing $\bar A_{\epsilon_1} \subsetneq A_{\epsilon_2}$. $\blacksquare$

Proposition C.9. Let $(X,d)$ be a metric space, $C, A \subseteq X$. If $C$ is compact, $A$ is closed, and $A \cap C = \emptyset$, then $\operatorname{dist}(C,A) > 0$.

Proof. Proceeding by contraposition, we show that $\operatorname{dist}(C,A) = 0$ implies $A \cap C \ne \emptyset$. If $\operatorname{dist}(C,A) = 0$, then there exists a sequence $\big( (c_k, a_k) \big)_{k\in\mathbb{N}}$ in $C \times A$ such that

$$\lim_{k\to\infty} d(c_k, a_k) = 0. \tag{C.17}$$

As C is compact, we may assume

$$\lim_{k\to\infty} c_k = c \in C, \tag{C.18}$$
also implying

$$\lim_{k\to\infty} a_k = c, \quad \text{since} \quad d(a_k, c) \le d(a_k, c_k) + d(c_k, c) \text{ for each } k \in \mathbb{N}. \tag{C.19}$$
Since $A$ is closed, (C.19) yields $c \in A$, i.e. $c \in A \cap C$. $\blacksquare$

D Local Lipschitz Continuity

In Prop. 3.13, it was shown that a continuous function is locally Lipschitz with respect to $y$ if, and only if, it is globally Lipschitz with respect to $y$ on every compact set. The following Prop. D.1 shows that this equivalence holds even if $f$ is not continuous, provided that each projection $G_x$ as in (D.1) below is convex. On the other hand, Ex. D.2 shows that, in general, there exist discontinuous functions that are locally Lipschitz with respect to $y$ without being globally Lipschitz with respect to $y$ on every compact set.

Proposition D.1. Let $m, n \in \mathbb{N}$, $G \subseteq \mathbb{R} \times \mathbb{K}^m$, and $f \colon G \longrightarrow \mathbb{K}^n$. If $G$ is such that each projection
$$G_x := \{ y \in \mathbb{K}^m : (x,y) \in G \}, \quad x \in \mathbb{R}, \tag{D.1}$$
is convex (in particular, if $G$ itself is convex), then $f$ is locally Lipschitz with respect to $y$ if, and only if, $f$ is (globally) Lipschitz with respect to $y$ on every compact subset $K$ of $G$.

Proof. The proof of Prop. 3.13 shows, without making use of the continuity of $f$, that (global) Lipschitz continuity with respect to $y$ on every compact subset $K$ of $G$ implies local Lipschitz continuity on $G$. Thus, assume $f$ to be locally Lipschitz with respect to $y$ and assume each $G_x$ to be convex. The proof of Prop. 3.13 shows, without making use of the continuity of $f$, that, for each compact $K \subseteq G$, there exist $\delta > 0$ and $L \ge 0$ such that
$$\| y - \bar y \| < \delta \;\Rightarrow\; \| f(x,y) - f(x,\bar y) \| \le L \, \| y - \bar y \| \quad \text{for each } (x,y), (x,\bar y) \in K. \tag{D.2}$$
If $(x,y), (x,\bar y) \in K$ are arbitrary with $y \ne \bar y$, then the convexity of $G_x$ implies
$$\{ (x, (1-t) y + t \bar y) : t \in [0,1] \} \subseteq G. \tag{D.3}$$

Choose $N \in \mathbb{N}$ such that $N > 2 \, \| y - \bar y \| / \delta$ and set $h := \| y - \bar y \| / N$. Then
$$h < \delta/2. \tag{D.4}$$
Define
$$y_k := \frac{k h}{\| y - \bar y \|} \, \bar y + \left( 1 - \frac{k h}{\| y - \bar y \|} \right) y \quad \text{for each } k = 0, \dots, N. \tag{D.5}$$
Then
$$\| y_{k+1} - y_k \| = \left\| - \frac{h}{\| y - \bar y \|} \, y + \frac{h}{\| y - \bar y \|} \, \bar y \right\| = h < \delta \quad \text{for each } k = 0, \dots, N-1, \tag{D.6}$$

and

$$\| f(x,y) - f(x,\bar y) \| \le \sum_{k=0}^{N-1} \| f(x,y_k) - f(x,y_{k+1}) \| \overset{\text{(D.2)}}{\le} \sum_{k=0}^{N-1} L \, \| y_k - y_{k+1} \| = L N h = L \, \| y - \bar y \|, \tag{D.7}$$
showing $f$ to be (globally) $L$-Lipschitz with respect to $y$ on $K$. $\blacksquare$

Example D.2. We provide two examples that show that, in general, a discontinuous function can be locally Lipschitz with respect to $y$ without being globally Lipschitz with respect to $y$ on every compact set.

(a) Consider
$$G := \,]-2,2[ \times \big( \,]-4,-1[ \,\cup\, ]1,4[ \,\big) \tag{D.8}$$
and
$$f \colon G \longrightarrow \mathbb{R}, \quad f(x,y) := \begin{cases} 1/x & \text{for } x \ne 0, \; y \in \,]-4,-1[, \\ 0 & \text{for } x = 0, \; y \in \,]-4,-1[, \\ 0 & \text{for } y \in \,]1,4[. \end{cases} \tag{D.9}$$
For the following open balls with respect to the max norm $\|(x,y)\| := \max\{|x|, |y|\}$, one has $B_1(x,y) \cap G \subseteq \,]-2,2[ \times ]-4,-1[$ for $y \in \,]-4,-1[$, and $B_1(x,y) \cap G \subseteq \,]-2,2[ \times ]1,4[$ for $y \in \,]1,4[$. Thus, $f(x,\cdot)$ is constant on each set $B_1(x,y) \cap G$ (either constantly equal to $1/x$ or constantly equal to $0$), i.e. $0$-Lipschitz with respect to $y$. In particular, $f$ is locally Lipschitz with respect to $y$. However, $f$ is not Lipschitz continuous with respect to $y$ on the compact set
$$K := [-1,1] \times \big( [-3,-2] \cup [2,3] \big): \tag{D.10}$$

For the sequence $\big( (x_k, y_k, \bar y_k) \big)_{k\in\mathbb{N}}$, where

$$x_k := 1/k, \quad y_k := -2, \quad \bar y_k := 2 \quad \text{for each } k \in \mathbb{N}, \tag{D.11}$$

one has
$$\lim_{k\to\infty} \frac{| f(x_k, y_k) - f(x_k, \bar y_k) |}{| y_k - \bar y_k |} = \lim_{k\to\infty} \frac{k - 0}{2 - (-2)} = \infty, \tag{D.12}$$
showing $f$ is not Lipschitz continuous with respect to $y$ on $K$.

(b) If one increases the dimension by $1$, then one can modify the example in (a) such that the set $G$ is even connected (this variant was pointed out by Anton Sporrer): Let

$$A := \big( \,]-4,-1[ \times ]-2,2[ \,\big) \cup \big( \,]-4,4[ \times ]-2,0[ \,\big) \cup \big( \,]1,4[ \times ]-2,2[ \,\big) \subseteq \mathbb{R}^2. \tag{D.13}$$
Then $A$ is open and connected (but not convex) and the same holds for

$$G := \,]-2,2[ \times A \subseteq \mathbb{R}^3. \tag{D.14}$$
Define

$$f \colon G \longrightarrow \mathbb{R}, \quad f(x,y_1,y_2) := \begin{cases} 1/x & \text{for } x \ne 0, \; y_1 \in \,]-4,-1[, \; y_2 > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{D.15}$$

Then everything works essentially as in (a) (it might be helpful to graphically visualize the set $A$ and the behavior of the function $f$): For the following open balls with respect to the max norm $\|(x,y_1,y_2)\| := \max\{ |x|, |y_1|, |y_2| \}$, one has

$$(\xi, \eta_1, \eta_2) \in B_1(x,y_1,y_2) \cap G \;\Rightarrow\; \eta_1 < -1 + 1 = 0 < 1 \quad \text{for each } (x,y_1,y_2) \in G \text{ with } y_1 \in \,]-4,-1[. \tag{D.16}$$

Analogous to (a), $f(x,\cdot)$ is constant on each set $B_1(x,y_1,y_2) \cap G$ (either constantly equal to $1/x$ or constantly equal to $0$), i.e. $0$-Lipschitz with respect to $y$. In particular, $f$ is locally Lipschitz with respect to $y$. However, $f$ is not Lipschitz continuous with respect to $y$ on the compact set

$$K := [-1,1] \times \big( [-3,-2] \cup [2,3] \big) \times [-1,1]: \tag{D.17}$$
For the sequence $\big( (x_k, y_{1,k}, \bar y_{1,k}, y_{2,k}) \big)_{k\in\mathbb{N}}$ with

$$x_k := 1/k, \quad y_{1,k} := -2, \quad \bar y_{1,k} := 2, \quad y_{2,k} := 0 \quad \text{for each } k \in \mathbb{N}, \tag{D.18}$$
one has

$$\lim_{k\to\infty} \frac{| f(x_k, y_{1,k}, y_{2,k}) - f(x_k, \bar y_{1,k}, y_{2,k}) |}{\| (y_{1,k}, y_{2,k}) - (\bar y_{1,k}, y_{2,k}) \|_{\max}} = \lim_{k\to\infty} \frac{k - 0}{\max\{4, 0\}} = \infty, \tag{D.19}$$
showing $f$ is not Lipschitz continuous with respect to $y$ on $K$.
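The blow-up of the difference quotients in (D.12) and (D.19) is easy to observe numerically. The following plain Python sketch illustrates part (a), using exactly the function of (D.9) and the points of (D.11); the quotients grow like $k/4$:

```python
# Difference quotients of Ex. D.2(a) on K, cf. (D.11) and (D.12).
def f(x, y):
    if -4 < y < -1:                 # left strip: f = 1/x for x != 0, else 0
        return 1.0 / x if x != 0 else 0.0
    return 0.0                      # right strip y in ]1,4[

for k in (1, 10, 100, 1000):
    xk = 1.0 / k
    q = abs(f(xk, -2.0) - f(xk, 2.0)) / abs(-2.0 - 2.0)
    print(k, q)                     # q = k/4 -> infinity: no Lipschitz constant works
```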

E Maximal Solutions on Nonopen Intervals

In Def. 3.20, we required a maximal solution to an ODE to be defined on an open interval. The following Ex. E.1 shows it can occur that such a maximal solution has an extension to a larger nonopen interval. In such cases, one might want to call the solution on the nonopen interval maximal rather than the solution on the smaller open interval. However, this would make the treatment of maximal solutions more cumbersome in some places, without adding any real substance, which is why we stick to our requirement for maximal solutions to always be defined on an open interval.

Example E.1. (a) Let
$$G := [0,1] \times \mathbb{R}, \quad f \colon G \longrightarrow \mathbb{R}, \quad f(x,y) := 0. \tag{E.1}$$
Then, for each $(x_0,y_0) \in G$, the function
$$\phi \colon [0,1] \longrightarrow \mathbb{R}, \quad \phi \equiv y_0, \tag{E.2}$$
is a solution to the initial value problem

$$y' = f(x,y), \quad y(x_0) = y_0. \tag{E.3}$$

However, the maximal solution of (E.3) according to Def. 3.20 is $\phi{\upharpoonright}_{]0,1[}$.

(b) The following modification of (a) allows $f$ to be defined on all of $\mathbb{R}^2$: Let

$$G := \mathbb{R}^2, \quad f \colon G \longrightarrow \mathbb{R}, \quad f(x,y) := \begin{cases} 0 & \text{for } x \in [0,1], \\ 1 & \text{for } x \notin [0,1]. \end{cases} \tag{E.4}$$

Then, for each $(x_0,y_0) \in [0,1] \times \mathbb{R}$, the function $\phi$ of (E.2) is a solution to the initial value problem (E.3), but, again, the maximal solution of (E.3) according to Def. 3.20 is $\phi{\upharpoonright}_{]0,1[}$.

F Paths in $\mathbb{R}^n$

Definition F.1. A path or curve in $\mathbb{R}^n$, $n \in \mathbb{N}$, is a continuous map $\psi \colon I \longrightarrow \mathbb{R}^n$, where $I \subseteq \mathbb{R}$ is an interval. One calls the path differentiable, continuously differentiable, etc. if, and only if, the function $\psi$ has the respective property.

Definition F.2. If $a, b \in \mathbb{R}$, $a \le b$, and $I := [a,b]$, then we call
$$|I| := b - a = |a - b| \tag{F.1}$$
the length of $I$.

Definition F.3. Given a real interval $I := [a,b] \subseteq \mathbb{R}$, $a, b \in \mathbb{R}$, $a < b$, a tuple $\Delta := (x_0, x_1, \dots, x_N)$, $N \in \mathbb{N}$, with $a = x_0 < x_1 < \dots < x_N = b$ is called a partition of $[a,b]$, with corresponding subintervals $I_j := [x_{j-1}, x_j]$, $j \in \{1,\dots,N\}$. The number

$$|\Delta| := \max \big\{ |I_j| : j \in \{1,\dots,N\} \big\} \tag{F.2}$$
is called the mesh size of $\Delta$.

Notation F.4. Given $a, b \in \mathbb{R}$, $a < b$, let $\Pi[a,b]$ denote the set of all partitions of $[a,b]$. Moreover, given a path $\psi \colon [a,b] \longrightarrow \mathbb{R}^n$, $n \in \mathbb{N}$, and a partition $\Delta = (x_0, \dots, x_N) \in \Pi[a,b]$, define

$$p_\psi(\Delta) := p_\psi(x_0, \dots, x_N) := \sum_{k=0}^{N-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2, \tag{F.3}$$
using $\|\cdot\|_2$ to denote the 2-norm on $\mathbb{R}^n$, i.e. the Euclidean norm.

Definition F.5. Given $a, b \in \mathbb{R}$, $a \le b$, for each path $\psi \colon [a,b] \longrightarrow \mathbb{R}^n$, $n \in \mathbb{N}$, define
$$l(\psi) := \begin{cases} 0 & \text{for } a = b, \\ \sup \big\{ p_\psi(\Delta) : \Delta \in \Pi[a,b] \big\} \in [0,\infty] & \text{for } a < b. \end{cases} \tag{F.4}$$
The path $\psi$ is called rectifiable if, and only if, $l(\psi) < \infty$; in that case, $l(\psi)$ is called the arc length of $\psi$.

Proposition F.6. Let $a, b \in \mathbb{R}$, $a \le b$, $n \in \mathbb{N}$, and let $\psi \colon [a,b] \longrightarrow \mathbb{R}^n$ be a path.

(a) If $\psi$ is affine-linear, i.e. if there exist $y_0, y_1 \in \mathbb{R}^n$ such that

$$\psi(x) = y_0 + x \, y_1 \quad \text{for each } x \in [a,b], \tag{F.5}$$

then $\psi$ is rectifiable with arc length
$$l(\psi) = \| y_1 \|_2 \, (b-a) = \| \psi(b) - \psi(a) \|_2. \tag{F.6}$$

(b) If the path $\psi$ is $L$-Lipschitz with $L \ge 0$, then $\psi$ is rectifiable and
$$l(\psi) \le L \, (b-a). \tag{F.7}$$

(c) If the paths $\phi, \psi \colon [a,b] \longrightarrow \mathbb{R}^n$ are both rectifiable, then
$$l(\phi) - l(\psi) \le l(\phi - \psi). \tag{F.8}$$


(d) For each $\xi \in [a,b]$, it holds that
$$l(\psi) = l(\psi{\upharpoonright}_{[a,\xi]}) + l(\psi{\upharpoonright}_{[\xi,b]}). \tag{F.9}$$

Proof. (a): For each partition $(x_0, \dots, x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have
$$p_\psi(x_0,\dots,x_N) = \sum_{k=0}^{N-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2 = \sum_{k=0}^{N-1} \| x_{k+1} y_1 - x_k y_1 \|_2 = \| y_1 \|_2 \sum_{k=0}^{N-1} (x_{k+1} - x_k) = \| y_1 \|_2 \, (b-a), \tag{F.10}$$
proving (F.6).

(b): For each partition $(x_0, \dots, x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have
$$p_\psi(x_0,\dots,x_N) = \sum_{k=0}^{N-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2 \le \sum_{k=0}^{N-1} L \, (x_{k+1} - x_k) = L \, (b-a), \tag{F.11}$$
proving (F.7).

(c): For each partition $\Delta = (x_0, \dots, x_N)$, $N \in \mathbb{N}$, of $[a,b]$, we have

$$p_\phi(\Delta) - p_\psi(\Delta) = \sum_{k=0}^{N-1} \| \phi(x_{k+1}) - \phi(x_k) \|_2 - \sum_{k=0}^{N-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2$$

$$\le \sum_{k=0}^{N-1} \Big| \, \| \phi(x_{k+1}) - \phi(x_k) \|_2 - \| \psi(x_{k+1}) - \psi(x_k) \|_2 \, \Big|$$

$$\le \sum_{k=0}^{N-1} \big\| \big( \phi(x_{k+1}) - \psi(x_{k+1}) \big) - \big( \phi(x_k) - \psi(x_k) \big) \big\|_2 = p_{\phi-\psi}(\Delta), \tag{F.12}$$
proving (F.8) (the last estimate in (F.12) holds true due to the inverse triangle inequality).

(d): If $\xi = a$ or $\xi = b$, then there is nothing to prove. Thus, assume $a < \xi < b$. If $\Delta_1 = (x_0, \dots, x_N)$ is a partition of $[a,\xi]$ and $\Delta_2 = (x_N, \dots, x_M)$ is a partition of $[\xi,b]$, $M, N \in \mathbb{N}$ with $M > N$, then $\Delta := (x_0, \dots, x_M)$ is a partition of $[a,b]$. Moreover,
$$p_\psi(\Delta) = p_\psi(\Delta_1) + p_\psi(\Delta_2) \tag{F.13}$$
is immediate from (F.3), implying

$$l(\psi) \ge l(\psi{\upharpoonright}_{[a,\xi]}) + l(\psi{\upharpoonright}_{[\xi,b]}). \tag{F.14}$$
On the other hand, if $\Delta = (x_0, \dots, x_M)$, $M \in \mathbb{N}$, is a partition of $[a,b]$, then either there is $0 < N < M$ such that $\xi = x_N$, in which case (F.13) holds once again, where $\Delta_1$ and $\Delta_2$ are defined as before. Otherwise, there is $N \in \{0, \dots, M-1\}$ such that $x_N < \xi < x_{N+1}$. Letting $\Delta_1 := (x_0, \dots, x_N, \xi)$ and $\Delta_2 := (\xi, x_{N+1}, \dots, x_M)$, we obtain

$$p_\psi(\Delta) = \sum_{k=0}^{M-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2 = \sum_{k=0}^{N-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2 + \| \psi(x_{N+1}) - \psi(x_N) \|_2 + \sum_{k=N+1}^{M-1} \| \psi(x_{k+1}) - \psi(x_k) \|_2 \le p_\psi(\Delta_1) + p_\psi(\Delta_2), \tag{F.15}$$
where the last estimate uses the triangle inequality $\| \psi(x_{N+1}) - \psi(x_N) \|_2 \le \| \psi(\xi) - \psi(x_N) \|_2 + \| \psi(x_{N+1}) - \psi(\xi) \|_2$, showing
$$l(\psi) \le l(\psi{\upharpoonright}_{[a,\xi]}) + l(\psi{\upharpoonright}_{[\xi,b]}) \tag{F.16}$$
and concluding the proof. $\blacksquare$

Theorem F.7. Given $a, b \in \mathbb{R}$, $a < b$, every continuously differentiable path $\psi \colon [a,b] \longrightarrow \mathbb{R}^n$, $n \in \mathbb{N}$, is rectifiable, and its arc length is given by
$$l(\psi) = \int_a^b \| \psi'(x) \|_2 \, \mathrm{d}x. \tag{F.17}$$

Proof. Since $\psi$ is continuously differentiable, it follows from [Phi16b, Th. 4.38] that $\psi$ is Lipschitz continuous on $[a,b]$, i.e. $\psi$ is rectifiable by Prop. F.6(b) above. To prove (F.17), according to the fundamental theorem of calculus [Phi16a, Th. 10.20(b)], it suffices to show the function

$$\lambda \colon [a,b] \longrightarrow \mathbb{R}_0^+, \quad \lambda(x) := l(\psi{\upharpoonright}_{[a,x]}), \tag{F.18}$$
is differentiable with derivative $\lambda'(x) = \| \psi'(x) \|_2$. To this end, first note the continuous function $\psi'$ is even uniformly continuous by [Phi16b, Th. 3.20]. Thus,

$$\underset{\epsilon > 0}{\forall} \; \underset{\delta > 0}{\exists} \; \underset{x_0, x \in [a,b]}{\forall} \quad \big( |x_0 - x| < \delta \;\Rightarrow\; \| \psi'(x_0) - \psi'(x) \|_2 < \epsilon \big). \tag{F.19}$$
Fix $x_0 \in [a,b[$ and consider $x_1 \in \,]a,b[$ such that $x_0 < x_1$ and $x_1 - x_0 < \delta$. Define the auxiliary affine-linear path
$$\alpha \colon [x_0,x_1] \longrightarrow \mathbb{R}^n, \quad \alpha(x) := \psi(x_0) + (x - x_0) \, \psi'(x_0). \tag{F.20}$$

According to Prop. F.6(a), we have

$$l(\alpha) = \| \psi'(x_0) \|_2 \, (x_1 - x_0). \tag{F.21}$$
Moreover, for the path $\psi - \alpha$, we have
$$\| \psi'(x) - \alpha'(x) \|_2 = \| \psi'(x) - \psi'(x_0) \|_2 \overset{\text{(F.19)}}{<} \epsilon \quad \text{for each } x \in [x_0,x_1]. \tag{F.22}$$

Thus, it follows from [Phi16b, Th. K.2] that $\psi - \alpha$ is $\epsilon$-Lipschitz on $[x_0,x_1]$ and, then, Prop. F.6(b) yields
$$l(\psi{\upharpoonright}_{[x_0,x_1]} - \alpha) \le \epsilon \, (x_1 - x_0). \tag{F.23}$$
Prop. F.6(c), in turn, yields

$$l(\psi{\upharpoonright}_{[x_0,x_1]}) - l(\alpha) \le l(\psi{\upharpoonright}_{[x_0,x_1]} - \alpha) \le \epsilon \, (x_1 - x_0). \tag{F.24}$$

Putting everything together, we obtain

$$\frac{l(\psi{\upharpoonright}_{[a,x_1]}) - l(\psi{\upharpoonright}_{[a,x_0]})}{x_1 - x_0} - \| \psi'(x_0) \|_2 \overset{\text{Prop. F.6(d), (F.21)}}{=} \frac{l(\psi{\upharpoonright}_{[x_0,x_1]}) - l(\alpha)}{x_1 - x_0} \overset{\text{(F.24)}}{\le} \frac{\epsilon \, (x_1 - x_0)}{x_1 - x_0} = \epsilon, \tag{F.25}$$

showing the function $\lambda$ from (F.18) has a right-hand derivative at $x_0$ and the value of that right-hand derivative at $x_0$ is the desired $\| \psi'(x_0) \|_2$. Repeating the above argument with $x_0, x_1 \in \,]a,b]$ such that $x_0 < x_1$ and $x_1 - x_0 < \delta$, now considering the difference quotient at $x_1$, shows $\lambda$ also has the left-hand derivative $\| \psi'(x_1) \|_2$ at each $x_1 \in \,]a,b]$, completing the proof. $\blacksquare$
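Theorem F.7 lends itself to a numerical illustration: for a concrete path, the polygonal lengths $p_\psi(\Delta)$ of (F.3) increase towards the integral in (F.17). The following Python sketch (assuming NumPy; the circle path is an arbitrary example) computes $p_\psi(\Delta)$ for uniform partitions of decreasing mesh size:

```python
# Polygonal approximation (F.3) of the arc length (F.17) for psi(x) = (cos x, sin x).
import numpy as np

a, b = 0.0, 2.0 * np.pi                      # arc length of the full circle is 2*pi
for N in (4, 16, 64, 256):
    xs = np.linspace(a, b, N + 1)            # uniform partition with N subintervals
    pts = np.column_stack([np.cos(xs), np.sin(xs)])
    p = np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))
    print(N, p)                              # increases towards l(psi) = 6.2831...
```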

G Operator Norms and Matrix Norms

For the present ODE class, we are mostly interested in linear maps from $\mathbb{K}^n$ into itself. However, introducing the relevant notions for linear maps between general normed vector spaces does not provide much additional difficulty and, hopefully, even provides some extra clarity.

Definition G.1. Let $A \colon X \longrightarrow Y$ be a linear map between two normed vector spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ over $\mathbb{K}$. Then $A$ is called bounded if, and only if, $A$ maps bounded sets to bounded sets, i.e. if, and only if, $A(B)$ is a bounded subset of $Y$ for each bounded $B \subseteq X$. The vector space of all bounded linear maps between $X$ and $Y$ is denoted by $\mathcal{L}(X,Y)$.

Definition G.2. Let $A \colon X \longrightarrow Y$ be a linear map between two normed vector spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ over $\mathbb{K}$. The number
$$\|A\| := \sup \left\{ \frac{\|Ax\|_Y}{\|x\|_X} : x \in X, \, x \ne 0 \right\} = \sup \big\{ \|Ax\|_Y : x \in X, \, \|x\|_X = 1 \big\} \in [0,\infty] \tag{G.1}$$
is called the operator norm of $A$ induced by $\|\cdot\|_X$ and $\|\cdot\|_Y$ (strictly speaking, the term operator norm is only justified if the value is finite, but it is often convenient to use the term in the generalized way defined here). In the special case, where $X = \mathbb{K}^n$, $Y = \mathbb{K}^m$, and $A$ is given via an $m \times n$ matrix, the operator norm is also called a matrix norm.

From now on, the space index of a norm will usually be suppressed, i.e. we write just $\|\cdot\|$ instead of both $\|\cdot\|_X$ and $\|\cdot\|_Y$.

Theorem G.3. For a linear map $A \colon X \longrightarrow Y$ between two normed vector spaces $(X, \|\cdot\|)$ and $(Y, \|\cdot\|)$ over $\mathbb{K}$, the following statements are equivalent:

(a) $A$ is bounded.

(b) $\|A\| < \infty$.

(c) $A$ is Lipschitz continuous.

(d) A is continuous.

(e) There is $x_0 \in X$ such that $A$ is continuous at $x_0$.

Proof. Since every Lipschitz continuous map is continuous and since every continuous map is continuous at every point, “(c) $\Rightarrow$ (d) $\Rightarrow$ (e)” is clear.

“(e) $\Rightarrow$ (c)”: Let $x_0 \in X$ be such that $A$ is continuous at $x_0$. Thus, for each $\epsilon > 0$, there is $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $\|Ax - Ax_0\| < \epsilon$. As $A$ is linear, for each $x \in X$ with $\|x\| < \delta$, one has $\|Ax\| = \|A(x + x_0) - Ax_0\| < \epsilon$, due to $\|x + x_0 - x_0\| = \|x\| < \delta$.

Moreover, one has $\|(\delta x)/2\| \le \delta/2 < \delta$ for each $x \in X$ with $\|x\| \le 1$. Letting $L := 2\epsilon/\delta$, this means that $\|Ax\| = \|A((\delta x)/2)\| / (\delta/2) < 2\epsilon/\delta = L$ for each $x \in X$ with $\|x\| \le 1$. Thus, for each $x, y \in X$ with $x \ne y$, one has
$$\|Ax - Ay\| = \|A(x-y)\| = \|x-y\| \left\| A\left( \frac{x-y}{\|x-y\|} \right) \right\| \le L \, \|x-y\|. \tag{G.2}$$

Together with the fact that $\|Ax - Ay\| \le L \, \|x-y\|$ is trivially true for $x = y$, this shows that $A$ is Lipschitz continuous.

“(c) $\Rightarrow$ (b)”: As $A$ is Lipschitz continuous, there exists $L \in \mathbb{R}_0^+$ such that $\|Ax - Ay\| \le L \, \|x-y\|$ for each $x, y \in X$. Considering the special case $y = 0$ and $\|x\| = 1$ yields $\|Ax\| \le L \, \|x\| = L$, implying $\|A\| \le L < \infty$.

“(b) $\Rightarrow$ (c)”: Let $\|A\| < \infty$. We will show
$$\|Ax - Ay\| \le \|A\| \, \|x-y\| \quad \text{for each } x, y \in X. \tag{G.3}$$
For $x = y$, there is nothing to prove. Thus, let $x \ne y$. One computes
$$\frac{\|Ax - Ay\|}{\|x-y\|} = \left\| A\left( \frac{x-y}{\|x-y\|} \right) \right\| \le \|A\| \tag{G.4}$$
as $\left\| \frac{x-y}{\|x-y\|} \right\| = 1$, thereby establishing (G.3).

“(b) $\Rightarrow$ (a)”: Let $\|A\| < \infty$ and let $M \subseteq X$ be bounded. Then there is $r > 0$ such that $M \subseteq B_r(0)$. Moreover, for each $0 \ne x \in M$:
$$\frac{\|Ax\|}{\|x\|} = \left\| A\left( \frac{x}{\|x\|} \right) \right\| \le \|A\| \tag{G.5}$$

as $\left\| \frac{x}{\|x\|} \right\| = 1$. Thus, $\|Ax\| \le \|A\| \, \|x\| \le r \, \|A\|$, showing that $A(M) \subseteq B_{r\|A\|+1}(0)$. Thus, $A(M)$ is bounded, thereby establishing the case.

“(a) $\Rightarrow$ (b)”: Since $A$ is bounded, it maps the bounded set $B_1(0) \subseteq X$ into some bounded subset of $Y$. Thus, there is $r > 0$ such that $A(B_1(0)) \subseteq B_r(0) \subseteq Y$. In particular, for each $x \in X$ with $\|x\| = 1$, one has $x/2 \in B_1(0)$ and, thus, $\|Ax\| = 2 \, \|A(x/2)\| < 2r$, implying $\|A\| \le 2r < \infty$. $\blacksquare$

Proposition G.4. Let $(X, \|\cdot\|)$ and $(Y, \|\cdot\|)$ be normed vector spaces over $\mathbb{K}$.

(a) The operator norm does, indeed, constitute a norm on the set $\mathcal{L}(X,Y)$ of bounded linear maps.

(b) If $A \in \mathcal{L}(X,Y)$, then $\|A\|$ is the smallest Lipschitz constant for $A$, i.e. $\|A\|$ is a Lipschitz constant for $A$, and $\|Ax - Ay\| \le L \, \|x-y\|$ for each $x, y \in X$ implies $\|A\| \le L$.

A + B = sup (A + B)x : x X, x =1 k k k k ∈ k k sup Ax + Bx : x X, x =1 ≤ k k k k ∈ k k sup Ax : x X, x =1 + sup Bx : x X, x =1 ≤ k k ∈ k k k k ∈ k k = A + B , (G.9) k k k k  showing that the operator norm also satisfies the triangle inequality, thereby completing the verification that it is, indeed, a norm. (b): That A is a Lipschitz constant for A was already shown in the proof of “(b) k k + ⇒ (c)” of Th. G.3. Now let L R0 be such that Ax Ay L x y for each x,y X. Specializing to y = 0 and ∈x = 1 implies Axk −L xk≤= L,k showing− k A L. ∈  k k k k≤ k k k k≤ Remark G.6. Even though it is beyond the scope of the present class, let us mention as an outlook that one can show that (X,Y ) with the operator norm is a (i.e. a complete normed vector space)L provided that Y is a Banach space (even if X is not a Banach space). H THE VANDERMONDE DETERMINANT 153

Lemma G.7. If $\operatorname{Id} \colon X \longrightarrow X$, $\operatorname{Id}(x) := x$, is the identity map on a normed vector space $X$ over $\mathbb{K}$, then $\|\operatorname{Id}\| = 1$ (in particular, the operator norm of a unit matrix is always $1$). Caveat: In principle, one can consider two different norms on $X$ simultaneously, and then the operator norm of the identity can differ from $1$.

Proof. If $\|x\| = 1$, then $\|\operatorname{Id}(x)\| = \|x\| = 1$. $\blacksquare$

Lemma G.8. Let $X, Y, Z$ be normed vector spaces and consider linear maps $A \in \mathcal{L}(X,Y)$, $B \in \mathcal{L}(Y,Z)$. Then
$$\|BA\| \le \|B\| \, \|A\|. \tag{G.10}$$

Proof. Let $x \in X$ with $\|x\| = 1$. If $Ax = 0$, then $\|B(A(x))\| = 0 \le \|B\| \, \|A\|$. If $Ax \ne 0$, then one estimates
$$\|B(Ax)\| = \|Ax\| \left\| B\left( \frac{Ax}{\|Ax\|} \right) \right\| \le \|A\| \, \|B\|, \tag{G.11}$$
thereby establishing the case. $\blacksquare$

Example G.9. Let $m, n \in \mathbb{N}$ and let $A \colon \mathbb{R}^n \longrightarrow \mathbb{R}^m$ be the linear map given by the $m \times n$ matrix $(a_{kl})_{(k,l) \in \{1,\dots,m\} \times \{1,\dots,n\}}$. Then

$$\|A\|_\infty := \max \left\{ \sum_{l=1}^n |a_{kl}| : k \in \{1,\dots,m\} \right\} \tag{G.12a}$$
is called the row sum norm of $A$, and
$$\|A\|_1 := \max \left\{ \sum_{k=1}^m |a_{kl}| : l \in \{1,\dots,n\} \right\} \tag{G.12b}$$
is called the column sum norm of $A$. It is an exercise to show that $\|A\|_\infty$ is the operator norm induced if $\mathbb{R}^n$ and $\mathbb{R}^m$ are endowed with the $\infty$-norm, and $\|A\|_1$ is the operator norm induced if $\mathbb{R}^n$ and $\mathbb{R}^m$ are endowed with the $1$-norm.
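The matrix norms of Example G.9 are easy to compute and to compare with sampled values of the quotient in (G.1). The following Python sketch (assuming NumPy; the random matrix and test vectors are arbitrary, not taken from the text) computes (G.12a) and (G.12b) and a crude sample-based lower estimate of the corresponding operator norms:

```python
# Row and column sum norms (G.12) vs. sampled quotients ||Ax||/||x|| from (G.1).
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))

row_sum = np.max(np.sum(np.abs(A), axis=1))   # ||A||_infty, cf. (G.12a)
col_sum = np.max(np.sum(np.abs(A), axis=0))   # ||A||_1, cf. (G.12b)

X = rng.normal(size=(4, 100000))              # random nonzero test vectors as columns
est_inf = np.max(np.linalg.norm(A @ X, ord=np.inf, axis=0)
                 / np.linalg.norm(X, ord=np.inf, axis=0))
est_one = np.max(np.linalg.norm(A @ X, ord=1, axis=0)
                 / np.linalg.norm(X, ord=1, axis=0))
print(est_inf, "<=", row_sum)                 # sampled quotients never exceed (G.12a)
print(est_one, "<=", col_sum)                 # sampled quotients never exceed (G.12b)
```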

H The Vandermonde Determinant

Theorem H.1. Let $n \in \mathbb{N}$ and $\lambda_0, \lambda_1, \dots, \lambda_n \in \mathbb{C}$. Moreover, let
$$V := \begin{pmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{pmatrix} \tag{H.1}$$
be the corresponding Vandermonde matrix. Then its determinant, the so-called Vandermonde determinant, is given by
$$\det(V) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l). \tag{H.2}$$

Proof. The proof can be conducted by induction with respect to n: For n = 1, we have

$$\det(V) = \begin{vmatrix} 1 & \lambda_0 \\ 1 & \lambda_1 \end{vmatrix} = \lambda_1 - \lambda_0 = \prod_{\substack{k,l=0 \\ k>l}}^{1} (\lambda_k - \lambda_l), \tag{H.3}$$
showing (H.2) holds for $n = 1$. Now let $n > 1$. We know from Linear Algebra that the value of a determinant does not change if we add a multiple of a column to a different column. Adding the $(-\lambda_0)$-fold of the $n$th column to the $(n+1)$st column, we obtain in the $(n+1)$st column
$$\begin{pmatrix} 0 \\ \lambda_1^n - \lambda_1^{n-1} \lambda_0 \\ \vdots \\ \lambda_n^n - \lambda_n^{n-1} \lambda_0 \end{pmatrix}. \tag{H.4}$$
Next, one adds the $(-\lambda_0)$-fold of the $(n-1)$st column to the $n$th column, and, successively, the $(-\lambda_0)$-fold of the $m$th column to the $(m+1)$st column. One finishes, in the $n$th step, by adding the $(-\lambda_0)$-fold of the first column to the second column, obtaining
$$\det(V) = \begin{vmatrix} 1 & \lambda_0 & \dots & \lambda_0^n \\ 1 & \lambda_1 & \dots & \lambda_1^n \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^n \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & \dots & 0 \\ 1 & \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1 \lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1} \lambda_0 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n \lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1} \lambda_0 \end{vmatrix}. \tag{H.5}$$

Applying the rule for determinants of block matrices to (H.5) yields

$$\det(V) = 1 \cdot \begin{vmatrix} \lambda_1 - \lambda_0 & \lambda_1^2 - \lambda_1 \lambda_0 & \dots & \lambda_1^n - \lambda_1^{n-1} \lambda_0 \\ \vdots & \vdots & & \vdots \\ \lambda_n - \lambda_0 & \lambda_n^2 - \lambda_n \lambda_0 & \dots & \lambda_n^n - \lambda_n^{n-1} \lambda_0 \end{vmatrix}. \tag{H.6}$$

As we also know from Linear Algebra that determinants are linear in each row, for each $k$, we can factor out $(\lambda_k - \lambda_0)$ from the $k$th row of (H.6), arriving at
$$\det(V) = \prod_{k=1}^n (\lambda_k - \lambda_0) \cdot \begin{vmatrix} 1 & \lambda_1 & \dots & \lambda_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \dots & \lambda_n^{n-1} \end{vmatrix}. \tag{H.7}$$


However, the determinant in (H.7) is precisely the Vandermonde determinant of the $n$ numbers $\lambda_1, \dots, \lambda_n$, which is given according to the induction hypothesis, implying
$$\det(V) = \prod_{k=1}^n (\lambda_k - \lambda_0) \prod_{\substack{k,l=1 \\ k>l}}^{n} (\lambda_k - \lambda_l) = \prod_{\substack{k,l=0 \\ k>l}}^{n} (\lambda_k - \lambda_l), \tag{H.8}$$
completing the induction proof of (H.2). $\blacksquare$
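Formula (H.2) is readily checked numerically. The following Python sketch (assuming NumPy; the numbers $\lambda_0, \dots, \lambda_3$ are an arbitrary choice) compares a direct determinant computation with the product formula:

```python
# Numerical check of the Vandermonde determinant formula (H.2).
import numpy as np
from itertools import combinations

lam = np.array([0.5, -1.0, 2.0, 3.5])                 # lambda_0, ..., lambda_n
V = np.vander(lam, increasing=True)                   # rows (1, l, ..., l^n), cf. (H.1)
det_direct = np.linalg.det(V)
det_formula = np.prod([lam[k] - lam[l]                # product over k > l
                       for l, k in combinations(range(len(lam)), 2)])
print(det_direct, det_formula)                        # agree up to rounding
```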

I Matrix-Valued Functions

Notation I.1. Given $m, n \in \mathbb{N}$, let $\mathcal{M}(m,n,\mathbb{K})$ denote the set of $m \times n$ matrices over $\mathbb{K}$.

I.1 Product Rule

Proposition I.2. Let $I \subseteq \mathbb{R}$ be a nontrivial interval, let $m, n, l \in \mathbb{N}$, and suppose
$$A \colon I \longrightarrow \mathcal{M}(m,n,\mathbb{K}), \quad A(x) = \big( a_{\alpha\beta}(x) \big), \tag{I.1a}$$
$$B \colon I \longrightarrow \mathcal{M}(n,l,\mathbb{K}), \quad B(x) = \big( b_{\alpha\beta}(x) \big), \tag{I.1b}$$
are differentiable. Then
$$C \colon I \longrightarrow \mathcal{M}(m,l,\mathbb{K}), \quad C(x) := A(x) B(x), \tag{I.2}$$
is differentiable, and one has the product rule
$$C'(x) = A'(x) B(x) + A(x) B'(x) \quad \text{for each } x \in I. \tag{I.3}$$

Proof. Writing $C(x) = \big( c_{\alpha\beta}(x) \big)$ and using the one-dimensional product rule together with the definition of matrix multiplication, one computes, for each $(\alpha,\beta) \in \{1,\dots,m\} \times \{1,\dots,l\}$,
$$c_{\alpha\beta}'(x) = \left( \sum_{\gamma=1}^n a_{\alpha\gamma}(x) \, b_{\gamma\beta}(x) \right)' = \sum_{\gamma=1}^n a_{\alpha\gamma}'(x) \, b_{\gamma\beta}(x) + \sum_{\gamma=1}^n a_{\alpha\gamma}(x) \, b_{\gamma\beta}'(x) = \big( A'(x) B(x) \big)_{\alpha\beta} + \big( A(x) B'(x) \big)_{\alpha\beta}, \tag{I.4}$$
proving the proposition. $\blacksquare$
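The product rule (I.3) can be spot-checked by a central finite difference. The following Python sketch (assuming NumPy; the matrix functions are arbitrary smooth samples with hand-computed derivatives, not taken from the text) compares both sides of (I.3) at one point:

```python
# Finite-difference check of the product rule (I.3) at a sample point.
import numpy as np

A  = lambda x: np.array([[np.cos(x), x], [1.0, x**2]])       # A: R -> M(2,2,R)
dA = lambda x: np.array([[-np.sin(x), 1.0], [0.0, 2*x]])     # A'
B  = lambda x: np.array([[np.exp(x)], [np.sin(x)]])          # B: R -> M(2,1,R)
dB = lambda x: np.array([[np.exp(x)], [np.cos(x)]])          # B'

x, h = 0.7, 1e-6
fd = (A(x + h) @ B(x + h) - A(x - h) @ B(x - h)) / (2 * h)   # C'(x), numerically
exact = dA(x) @ B(x) + A(x) @ dB(x)                          # right-hand side of (I.3)
print(np.max(np.abs(fd - exact)))                            # ~1e-10
```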

I.2 Integration and Matrix Multiplication Commute

Proposition I.3. Let $m, n, p \in \mathbb{N}$, let $I \subseteq \mathbb{R}$ be measurable (e.g. an interval), and let $A \colon I \longrightarrow \mathcal{M}(m,n,\mathbb{K})$, $x \mapsto A(x) = (a_{kl}(x))$, be integrable (i.e. all $\operatorname{Re} a_{kl}, \operatorname{Im} a_{kl} \colon I \longrightarrow \mathbb{R}$ are integrable).

(a) If $B = (b_{jk}) \in \mathcal{M}(p,m,\mathbb{K})$, then
$$B \int_I A(x) \, \mathrm{d}x = \int_I B A(x) \, \mathrm{d}x. \tag{I.5}$$

(b) If $B = (b_{lj}) \in \mathcal{M}(n,p,\mathbb{K})$, then
$$\left( \int_I A(x) \, \mathrm{d}x \right) B = \int_I A(x) \, B \, \mathrm{d}x. \tag{I.6}$$

Proof. (a): One computes, for each $(j,l) \in \{1,\dots,p\} \times \{1,\dots,n\}$,

$$\left( B \int_I A(x) \, \mathrm{d}x \right)_{jl} = \sum_{k=1}^m b_{jk} \int_I a_{kl}(x) \, \mathrm{d}x = \int_I \left( \sum_{k=1}^m b_{jk} \, a_{kl}(x) \right) \mathrm{d}x$$

$$= \int_I \big( B A(x) \big)_{jl} \, \mathrm{d}x = \left( \int_I B A(x) \, \mathrm{d}x \right)_{jl}, \tag{I.7}$$
proving (I.5).

(b): One computes, for each $(k,j) \in \{1,\dots,m\} \times \{1,\dots,p\}$,

$$\left( \left( \int_I A(x) \, \mathrm{d}x \right) B \right)_{kj} = \sum_{l=1}^n \left( \int_I a_{kl}(x) \, \mathrm{d}x \right) b_{lj} = \int_I \left( \sum_{l=1}^n a_{kl}(x) \, b_{lj} \right) \mathrm{d}x$$

$$= \int_I \big( A(x) B \big)_{kj} \, \mathrm{d}x = \left( \int_I A(x) \, B \, \mathrm{d}x \right)_{kj}, \tag{I.8}$$
proving (I.6). $\blacksquare$
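Proposition I.3 also holds on the level of entrywise quadrature rules, since these are linear in the integrand. The following Python sketch (assuming NumPy; the matrix function and the constant matrix are arbitrary samples) checks (I.5) with trapezoidal sums in place of the integrals:

```python
# Check of (I.5) with entrywise trapezoidal sums approximating the integrals.
import numpy as np

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
A = np.stack([np.array([[np.cos(t), t], [t**2, 1.0]]) for t in x])  # A(x), shape (N, 2, 2)
B = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])                 # constant 3x2 matrix

trap = lambda M: (M[:-1] + M[1:]).sum(axis=0) * dx / 2  # entrywise trapezoid rule
lhs = trap(B @ A)          # int_I B A(x) dx (B @ A multiplies B into each A(x))
rhs = B @ trap(A)          # B int_I A(x) dx
print(np.max(np.abs(lhs - rhs)))                        # 0 up to rounding
```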

J Autonomous ODE

J.1 Equivalence of Autonomous and Nonautonomous ODE

Theorem J.1. Let $G \subseteq \mathbb{R} \times \mathbb{K}^n$, $n \in \mathbb{N}$, and $f \colon G \longrightarrow \mathbb{K}^n$. Then the nonautonomous ODE
$$y' = f(x,y) \tag{J.1}$$

is equivalent to the autonomous ODE

$$y' = g(y), \tag{J.2}$$
where
$$g \colon G \longrightarrow \mathbb{K}^{n+1}, \quad g(y_1, \dots, y_{n+1}) := \big( 1, f(y_1, y_2, \dots, y_{n+1}) \big), \tag{J.3}$$
in the following sense:

(a) If $\phi \colon I \longrightarrow \mathbb{K}^n$ is a solution to (J.1), then $\psi \colon I \longrightarrow \mathbb{K}^{n+1}$, $\psi(x) := (x, \phi(x))$, is a solution to (J.2).

(b) If $\psi \colon I \longrightarrow \mathbb{K}^{n+1}$ is a solution to (J.2) with the property

$$\psi_1(x_0) = x_0 \quad \text{for some } x_0 \in I, \tag{J.4}$$
then $\phi \colon I \longrightarrow \mathbb{K}^n$, $\phi(x) := (\psi_2(x), \dots, \psi_{n+1}(x))$, is a solution to (J.1).

Proof. (a): If $\phi \colon I \longrightarrow \mathbb{K}^n$ is a solution to (J.1) and $\psi \colon I \longrightarrow \mathbb{K}^{n+1}$, $\psi(x) := (x, \phi(x))$, then
$$\psi'(x) = (1, \phi'(x)) = \big( 1, f(x, \phi(x)) \big) = g(x, \phi(x)) = g(\psi(x)) \quad \text{for each } x \in I, \tag{J.5}$$
showing $\psi$ is a solution to (J.2).

(b): If $\psi \colon I \longrightarrow \mathbb{K}^{n+1}$ is a solution to (J.2) with the property (J.4) and $\phi \colon I \longrightarrow \mathbb{K}^n$, $\phi(x) := (\psi_2(x), \dots, \psi_{n+1}(x))$, then (J.4) implies $\psi_1(x) = x$ for each $x \in I$ and, thus,
$$\phi'(x) = \big( \psi_2'(x), \dots, \psi_{n+1}'(x) \big) = f\big( x, \psi_2(x), \dots, \psi_{n+1}(x) \big) = f(x, \phi(x)) \quad \text{for each } x \in I, \tag{J.6}$$
showing $\phi$ is a solution to (J.1). $\blacksquare$
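Theorem J.1 translates directly into numerical practice. The following Python sketch (assuming NumPy and SciPy; the right-hand side $f(x,y) = y + \sin x$ is an arbitrary sample) integrates the nonautonomous ODE and its autonomized counterpart (J.2)/(J.3) and compares the results:

```python
# Autonomization per Th. J.1: y' = f(x,y) vs. the system (J.2) with g of (J.3).
import numpy as np
from scipy.integrate import solve_ivp

f = lambda x, y: y + np.sin(x)                     # sample right-hand side

def g(_, z):                                       # g(z) = (1, f(z_1, z_2)), cf. (J.3)
    return [1.0, f(z[0], z[1])]

x0, y0 = 0.0, 1.0
sol_na = solve_ivp(f, (x0, 2.0), [y0], rtol=1e-10, atol=1e-10)
sol_au = solve_ivp(g, (x0, 2.0), [x0, y0], rtol=1e-10, atol=1e-10)
print(sol_na.y[0, -1], sol_au.y[1, -1])            # agree to solver tolerance
```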

While Th. J.1 is somewhat striking and of theoretical interest, it has few useful applications in practice, due to the unbounded first component of solutions to (J.2) (cf. Rem. 5.2).

J.2 Integral for ODE with Discontinuous Right-Hand Side

The following Example J.2, provided by Anton Sporrer, shows that Lem. 5.19 becomes false if the hypothesis that every initial value problem for the considered ODE $y' = f(y)$ has at least one solution is omitted:

Example J.2. Consider

$$f \colon \mathbb{R} \longrightarrow \mathbb{R}, \quad f(y) := \begin{cases} 0 & \text{for } y \in \mathbb{Q}, \\ 1 & \text{for } y \in \mathbb{R} \setminus \mathbb{Q}, \end{cases} \tag{J.7}$$
and the autonomous ODE $y' = f(y)$. If $(x_0,y_0) \in \mathbb{R} \times \mathbb{Q}$, then the initial value problem $y(x_0) = y_0$ has the unique solution $\phi \colon \mathbb{R} \longrightarrow \mathbb{R}$, $\phi \equiv y_0$. However, if $(x_0,y_0) \in \mathbb{R} \times (\mathbb{R} \setminus \mathbb{Q})$, then the initial value problem $y(x_0) = y_0$ has no solution. Since $y' = f(y)$ has only constant solutions, every function $E \colon \mathbb{R} \longrightarrow \mathbb{R}$ is an integral for this ODE according to Def. 5.18. However, not every differentiable function $E \colon \mathbb{R} \longrightarrow \mathbb{R}$ satisfies (5.15): For example, if $E(y) := y$, then $E' \equiv 1$, i.e.
$$E'(y) \, f(y) = 1 \ne 0 \quad \text{for each } y \in \mathbb{R} \setminus \mathbb{Q}, \tag{J.8}$$
showing that Lem. 5.19 does not hold for $y' = f(y)$ with $f$ according to (J.7).

K Polar Coordinates

Recall the following functions, used in polar coordinates of the plane:

$$r \colon \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}^+, \quad r(y_1,y_2) := \sqrt{y_1^2 + y_2^2}, \tag{K.1a}$$

$$\varphi \colon \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow [0, 2\pi[, \quad \varphi(y_1,y_2) := \begin{cases} 0 & \text{for } y_2 = 0, \; y_1 > 0, \\ \operatorname{arccot}(y_1/y_2) & \text{for } y_2 > 0, \\ \pi & \text{for } y_2 = 0, \; y_1 < 0, \\ \pi + \operatorname{arccot}(y_1/y_2) & \text{for } y_2 < 0. \end{cases} \tag{K.1b}$$

Theorem K.1. Consider $f \colon \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}^2$ and the corresponding $\mathbb{R}^2$-valued autonomous ODE

$$y_1' = f_1(y_1,y_2), \tag{K.2a}$$
$$y_2' = f_2(y_1,y_2), \tag{K.2b}$$
together with its polar coordinate version

$$r' = g_1(r,\varphi), \tag{K.3a}$$
$$\varphi' = g_2(r,\varphi), \tag{K.3b}$$

where $g = (g_1, g_2) \colon \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}^2$,
$$g_1 \colon \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}, \quad g_1(r,\varphi) := f_1(r\cos\varphi, r\sin\varphi) \cos\varphi + f_2(r\cos\varphi, r\sin\varphi) \sin\varphi, \tag{K.4a}$$
$$g_2 \colon \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}, \quad g_2(r,\varphi) := \frac{1}{r} \Big( f_2(r\cos\varphi, r\sin\varphi) \cos\varphi - f_1(r\cos\varphi, r\sin\varphi) \sin\varphi \Big). \tag{K.4b}$$
Let $\mu \colon I \longrightarrow \mathbb{R}^2$ be a solution to (K.3).

(a) Then
$$\phi \colon I \longrightarrow \mathbb{R}^2, \quad \phi(x) := \big( \mu_1(x) \cos\mu_2(x), \; \mu_1(x) \sin\mu_2(x) \big), \tag{K.5}$$
is a solution to (K.2).

(b) If $\mu$ satisfies the initial condition

$$\mu_1(0) = \rho, \quad \rho \in \mathbb{R}^+, \tag{K.6a}$$
$$\mu_2(0) = \tau, \quad \tau \in \mathbb{R}, \tag{K.6b}$$
and if

$$\eta_1 = \rho \cos\tau, \tag{K.7a}$$

$$\eta_2 = \rho \sin\tau, \tag{K.7b}$$

then φ satisfies the initial condition

$$\phi_1(0) = \eta_1, \tag{K.8a}$$

$$\phi_2(0) = \eta_2. \tag{K.8b}$$

Note that $\rho > 0$ implies $(\eta_1,\eta_2) \ne (0,0)$, and that, for $(\rho,\tau) \in \mathbb{R}^+ \times [0,2\pi[$, (K.7) is equivalent to

$$r(\eta_1,\eta_2) = \rho, \tag{K.9a}$$

$$\varphi(\eta_1,\eta_2) = \tau \tag{K.9b}$$

(cf. the computations of $\phi^{-1} \circ \phi$ and of $\phi \circ \phi^{-1}$ in [Phi17, Ex. 2.67]).

Proof. Exercise. $\blacksquare$

Example K.2. Consider the autonomous ODE (K.2) with

$$f_1 \colon \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}, \quad f_1(y_1,y_2) := y_1 \big( 1 - r(y_1,y_2) \big) - \frac{y_2 \, \big( r(y_1,y_2) - y_1 \big)}{2 \, r(y_1,y_2)}, \tag{K.10a}$$
$$f_2 \colon \mathbb{R}^2 \setminus \{(0,0)\} \longrightarrow \mathbb{R}, \quad f_2(y_1,y_2) := y_2 \big( 1 - r(y_1,y_2) \big) + \frac{y_1 \, \big( r(y_1,y_2) - y_1 \big)}{2 \, r(y_1,y_2)}, \tag{K.10b}$$
where $r$ is the radial polar coordinate function as defined in (K.1a). Using $g \colon \mathbb{R}^+ \times \mathbb{R} \longrightarrow \mathbb{R}^2$ as defined in (K.4), one obtains, for each $(\rho,\varphi) \in \mathbb{R}^+ \times \mathbb{R}$,
$$g_1(\rho,\varphi) = \rho \cos\varphi \, (1-\rho) \cos\varphi - \frac{\rho \sin\varphi \, (\rho - \rho\cos\varphi) \cos\varphi}{2\rho} + \rho \sin\varphi \, (1-\rho) \sin\varphi + \frac{\rho \cos\varphi \, (\rho - \rho\cos\varphi) \sin\varphi}{2\rho} = \rho \, (1-\rho), \tag{K.11a}$$
$$g_2(\rho,\varphi) = \sin\varphi \, (1-\rho) \cos\varphi + \frac{\cos\varphi \, (\rho - \rho\cos\varphi) \cos\varphi}{2\rho} - \cos\varphi \, (1-\rho) \sin\varphi + \frac{\sin\varphi \, (\rho - \rho\cos\varphi) \sin\varphi}{2\rho} = \frac{1 - \cos\varphi}{2}, \tag{K.11b}$$
such that the autonomous ODE

$$r' = r \, (1-r), \tag{K.12a}$$
$$\varphi' = \frac{1 - \cos\varphi}{2} \overset{\text{[Phi16a, (I.1c)]}}{=} \sin^2\frac{\varphi}{2}, \tag{K.12b}$$
is the polar coordinate version of (K.2) as defined in Th. K.1.

Claim 1. The general solution to

$$r' = p(r), \quad p \colon \mathbb{R}^+ \longrightarrow \mathbb{R}, \quad p(r) := r \, (1-r), \tag{K.13}$$
is
$$Y_p \colon D_{p,0} \longrightarrow \mathbb{R}^+, \quad Y_p(x,\rho) := \frac{\rho}{\rho + (1-\rho) \, e^{-x}}, \tag{K.14}$$
where
$$D_{p,0} = \big( \mathbb{R} \times \,]0,1] \big) \cup \left\{ (x,\rho) : \rho > 1, \; x \in \left] -\ln\frac{\rho}{\rho-1}, \, \infty \right[ \right\}. \tag{K.15}$$

Proof. The initial condition is satisfied, since
$$Y_p(0,\rho) = \frac{\rho}{\rho + (1-\rho)} = \rho \quad \text{for each } \rho \in \mathbb{R}^+. \tag{K.16}$$
The ODE is satisfied, since

$$Y_p'(x,\rho) = \frac{\rho \, (1-\rho) \, e^{-x}}{\big( \rho + (1-\rho) \, e^{-x} \big)^2} = Y_p(x,\rho) \, \big( 1 - Y_p(x,\rho) \big) \quad \text{for each } (x,\rho) \in D_{p,0}. \tag{K.17}$$
To verify the form of $D_{p,0}$, we note that the denominator in (K.14) is positive for each $x \in \mathbb{R}$ if $0 < \rho \le 1$. If $\rho > 1$, then the function $a \colon \mathbb{R} \longrightarrow \mathbb{R}$, $a(x) := \rho + (1-\rho) \, e^{-x}$, is strictly increasing (note $a'(x) = (\rho-1) \, e^{-x} > 0$) and has a unique zero at $x = -\ln\frac{\rho}{\rho-1}$. Thus $\lim_{x \downarrow -\ln\frac{\rho}{\rho-1}} Y_p(x,\rho) = \infty$, proving the maximality of $Y_p(\cdot,\rho)$. $\blacktriangle$

Claim 2. Letting, for each $k \in \mathbb{Z}$,
$$x_0(\tau) := \frac{2\cos\tau + 2}{\sin\tau} \quad \text{for } \tau \in \mathbb{R} \setminus \{ l\pi : l \in \mathbb{Z} \}, \tag{K.18a}$$
$$R_k := \,]0,\pi[ \, + 2k\pi, \tag{K.18b}$$
$$L_k := \,]-\pi,0[ \, + 2k\pi, \tag{K.18c}$$
$$A_0 := \mathbb{R} \times \{ 2k\pi : k \in \mathbb{Z} \}, \tag{K.18d}$$
$$A_{0,k} := \mathbb{R}^- \times \{ \pi + 2k\pi \}, \tag{K.18e}$$
$$B_{0,k} := \mathbb{R}^+ \times \{ \pi + 2k\pi \}, \tag{K.18f}$$
$$A_{1,k} := \big\{ (x,\tau) \in \mathbb{R}^2 : x \in \,]-\infty, x_0(\tau)[, \; \tau \in R_k \big\}, \tag{K.18g}$$
$$A_{2,k} := \big\{ (x,\tau) \in \mathbb{R}^2 : x \in \,]x_0(\tau), \infty[, \; \tau \in L_k \big\}, \tag{K.18h}$$
$$B_{1,k} := \big\{ (x,\tau) \in \mathbb{R}^2 : x \in \,]x_0(\tau), \infty[, \; \tau \in R_k \big\}, \tag{K.18i}$$
$$B_{2,k} := \big\{ (x,\tau) \in \mathbb{R}^2 : x \in \,]-\infty, x_0(\tau)[, \; \tau \in L_k \big\}, \tag{K.18j}$$
$$C_{1,k} := \big\{ (x,\tau) \in \mathbb{R}^2 : x = x_0(\tau), \; \tau \in R_k \big\}, \tag{K.18k}$$
$$C_{2,k} := \big\{ (x,\tau) \in \mathbb{R}^2 : x = x_0(\tau), \; \tau \in L_k \big\}, \tag{K.18l}$$
the general solution to
$$\varphi' = q(\varphi), \quad q \colon \mathbb{R} \longrightarrow \mathbb{R}, \quad q(\varphi) := \frac{1 - \cos\varphi}{2} = \sin^2\frac{\varphi}{2}, \tag{K.19}$$

is
$$Y_q \colon \mathbb{R}^2 \longrightarrow \mathbb{R}, \quad Y_q(x,\tau) := \begin{cases} \tau & \text{for } (x,\tau) \in A_0, \\ 2 \big( k\pi + \arctan\frac{-2}{x} \big) & \text{for } (x,\tau) \in A_{0,k}, \; k \in \mathbb{Z}, \\ 2 \big( (k+1)\pi + \arctan\frac{-2}{x} \big) & \text{for } (x,\tau) \in B_{0,k}, \; k \in \mathbb{Z}, \\ \pi + 2k\pi & \text{for } (x,\tau) = (0, \pi + 2k\pi), \; k \in \mathbb{Z}, \\ 2 \big( k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big) & \text{for } (x,\tau) \in A_{1,k} \cup A_{2,k}, \; k \in \mathbb{Z}, \\ \pi + 2k\pi & \text{for } (x,\tau) \in C_{1,k}, \; k \in \mathbb{Z}, \\ 2 \big( (k+1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big) & \text{for } (x,\tau) \in B_{1,k}, \; k \in \mathbb{Z}, \\ -\pi + 2k\pi & \text{for } (x,\tau) \in C_{2,k}, \; k \in \mathbb{Z}, \\ 2 \big( (k-1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big) & \text{for } (x,\tau) \in B_{2,k}, \; k \in \mathbb{Z}. \end{cases} \tag{K.20}$$

Proof. One observes that Yq is well-defined, since

$$\mathbb{R}^2 = A_0 \,\dot\cup\, \bigcup_{k\in\mathbb{Z}} \Big( \{ (0, \pi + 2k\pi) \} \,\dot\cup\, A_{0,k} \,\dot\cup\, B_{0,k} \,\dot\cup\, A_{1,k} \,\dot\cup\, A_{2,k} \,\dot\cup\, B_{1,k} \,\dot\cup\, B_{2,k} \,\dot\cup\, C_{1,k} \,\dot\cup\, C_{2,k} \Big) \tag{K.21}$$
and, introducing the auxiliary function
$$\Delta \colon \mathbb{R}^2 \longrightarrow \mathbb{R}, \quad \Delta(x,\tau) := 2\cos\tau - x\sin\tau + 2, \tag{K.22}$$
one has
$$\Delta(x,\tau) \ne 0 \quad \text{for each } (x,\tau) \in A_{1,k} \cup A_{2,k} \cup B_{1,k} \cup B_{2,k}, \; k \in \mathbb{Z}. \tag{K.23}$$
It remains to show that, for each $\tau \in \mathbb{R}$, the function $x \mapsto Y_q(x,\tau)$ is differentiable on $\mathbb{R}$, satisfying
$$Y_q'(x,\tau) = \frac{1 - \cos Y_q(x,\tau)}{2} \quad \text{for each } x \in \mathbb{R}, \tag{K.24}$$
and the initial condition
$$Y_q(0,\tau) = \tau. \tag{K.25}$$
The initial condition (K.25) is satisfied, since

$$Y_q(0,\tau) = \tau \quad \text{for each } \tau \in \{ k\pi : k \in \mathbb{Z} \}, \tag{K.26a}$$
$$Y_q(0,\tau) = 2k\pi + 2\arctan\frac{2\sin\tau}{2\cos\tau + 2} \overset{\text{[Phi16a, (I.1d)]}}{=} 2k\pi + 2\arctan\left( \tan\frac{\tau}{2} \right) = 2k\pi + 2\left( \frac{\tau}{2} - k\pi \right) = \tau \quad \text{for each } \tau \in R_k \cup L_k, \; k \in \mathbb{Z}. \tag{K.26b}$$

Next, we show that, for each $\tau \in \mathbb{R}$, the function $x \mapsto Y_q(x,\tau)$ is differentiable on $\mathbb{R}$ and satisfies (K.24): For $\tau \in \{ 2k\pi : k \in \mathbb{Z} \}$, $x \mapsto Y_q(x,\tau)$ is constant, i.e. differentiability is clear, and
$$\frac{1 - \cos Y_q(x,\tau)}{2} = \frac{1 - \cos(2k\pi)}{2} = 0 = Y_q'(x,\tau) \quad \text{for each } x \in \mathbb{R} \tag{K.27}$$
proves (K.24). For each $\tau \in \{ (2k+1)\pi : k \in \mathbb{Z} \}$, differentiability is clear in each $x \in \mathbb{R} \setminus \{0\}$. Moreover,
$$\left( 2\arctan\frac{-2}{x} \right)' = \frac{4}{x^2} \cdot \frac{1}{1 + \frac{4}{x^2}} = \frac{4}{4 + x^2} \quad \text{for each } x \in \mathbb{R} \setminus \{0\}, \tag{K.28}$$
and, thus, for each $x \in \mathbb{R} \setminus \{0\}$,
$$\frac{1 - \cos Y_q(x,\tau)}{2} = \frac{1}{2} - \frac{1}{2} \cos\left( 2\arctan\frac{-2}{x} \right) \overset{\text{[Phi16a, (I.1e)]}}{=} \frac{1}{2} - \frac{1}{2} \cdot \frac{1 - \frac{4}{x^2}}{1 + \frac{4}{x^2}} = \frac{1}{2} - \frac{1}{2} \cdot \frac{x^2 - 4}{x^2 + 4} = \frac{8}{2 \, (4 + x^2)} = \frac{4}{4 + x^2} \overset{\text{(K.28)}}{=} Y_q'(x,\tau), \tag{K.29}$$
proving (K.24) for each $x \in \mathbb{R} \setminus \{0\}$. It remains to consider $x = 0$. One has, by L'Hôpital's rule [Phi16a, Th. 9.25(a)],
$$\lim_{x \uparrow 0} \frac{Y_q(0,\tau) - Y_q(x,\tau)}{0 - x} = \lim_{x \uparrow 0} \frac{\pi + 2k\pi - 2 \big( k\pi + \arctan\frac{-2}{x} \big)}{-x} \overset{\text{[Phi16a, (9.27)], (K.28)}}{=} \lim_{x \uparrow 0} \frac{-\frac{4}{4+x^2}}{-1} = 1 \tag{K.30}$$
and
$$\lim_{x \downarrow 0} \frac{Y_q(0,\tau) - Y_q(x,\tau)}{0 - x} = \lim_{x \downarrow 0} \frac{\pi + 2k\pi - 2 \big( (k+1)\pi + \arctan\frac{-2}{x} \big)}{-x} \overset{\text{[Phi16a, (9.27)], (K.28)}}{=} \lim_{x \downarrow 0} \frac{-\frac{4}{4+x^2}}{-1} = 1, \tag{K.31}$$
showing $x \mapsto Y_q(x,\tau)$ to be differentiable in $x = 0$ with $Y_q'(0,\tau) = 1$. Due to
$$\frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = Y_q'(0, \pi + 2k\pi), \tag{K.32}$$
(K.24) also holds.

For each $\tau \in R_k \cup L_k$, $k \in \mathbb{Z}$, the differentiability is clear in each $x \in \mathbb{R} \setminus \{ x_0(\tau) \}$. Moreover, recalling $\Delta(x,\tau)$ from (K.22), one has, for each $x \in \mathbb{R} \setminus \{ x_0(\tau) \}$,
$$\left( 2\arctan\frac{2\sin\tau}{\Delta(x,\tau)} \right)' = \frac{4\sin^2\tau}{(\Delta(x,\tau))^2} \cdot \frac{1}{1 + \frac{4\sin^2\tau}{(\Delta(x,\tau))^2}} = \frac{4\sin^2\tau}{4\sin^2\tau + (\Delta(x,\tau))^2}, \tag{K.33}$$

and, thus, for each $x \in \mathbb{R} \setminus \{ x_0(\tau) \}$,
$$\frac{1 - \cos Y_q(x,\tau)}{2} = \frac{1}{2} - \frac{1}{2} \cos\left( 2\arctan\frac{2\sin\tau}{\Delta(x,\tau)} \right) \overset{\text{[Phi16a, (I.1e)]}}{=} \frac{1}{2} - \frac{1}{2} \cdot \frac{1 - \frac{4\sin^2\tau}{(\Delta(x,\tau))^2}}{1 + \frac{4\sin^2\tau}{(\Delta(x,\tau))^2}} = \frac{1}{2} - \frac{1}{2} \cdot \frac{(\Delta(x,\tau))^2 - 4\sin^2\tau}{(\Delta(x,\tau))^2 + 4\sin^2\tau} = \frac{8\sin^2\tau}{2 \, \big( 4\sin^2\tau + (\Delta(x,\tau))^2 \big)} = \frac{4\sin^2\tau}{4\sin^2\tau + (\Delta(x,\tau))^2} \overset{\text{(K.33)}}{=} Y_q'(x,\tau), \tag{K.34}$$
proving (K.24) for each $x \in \mathbb{R} \setminus \{ x_0(\tau) \}$. It remains to consider $x = x_0(\tau)$. For $\tau \in R_k$, we have $\sin\tau > 0$ and $x_0(\tau) > 0$, and, thus, by L'Hôpital's rule [Phi16a, Th. 9.25(a)],

$$\lim_{x \uparrow x_0(\tau)} \frac{Y_q(x_0(\tau),\tau) - Y_q(x,\tau)}{x_0(\tau) - x} = \lim_{x \uparrow x_0(\tau)} \frac{\pi + 2k\pi - 2 \big( k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big)}{x_0(\tau) - x} \overset{\text{[Phi16a, (9.27)], (K.33)}}{=} \lim_{x \uparrow x_0(\tau)} \frac{-\frac{4\sin^2\tau}{4\sin^2\tau + (\Delta(x,\tau))^2}}{-1} = 1 \tag{K.35}$$

and
$$\lim_{x \downarrow x_0(\tau)} \frac{Y_q(x_0(\tau),\tau) - Y_q(x,\tau)}{x_0(\tau) - x} = \lim_{x \downarrow x_0(\tau)} \frac{\pi + 2k\pi - 2 \big( (k+1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big)}{x_0(\tau) - x} \overset{\text{[Phi16a, (9.27)], (K.33)}}{=} \lim_{x \downarrow x_0(\tau)} \frac{-\frac{4\sin^2\tau}{4\sin^2\tau + (\Delta(x,\tau))^2}}{-1} = 1, \tag{K.36}$$
showing $x \mapsto Y_q(x,\tau)$ to be differentiable in $x = x_0(\tau)$ with $Y_q'(x_0(\tau),\tau) = 1$. Due to
$$\frac{1 - \cos(\pi + 2k\pi)}{2} = 1 = Y_q'(x_0(\tau),\tau), \tag{K.37}$$
(K.24) also holds. For $\tau \in L_k$, we have $\sin\tau < 0$ and $x_0(\tau) < 0$, and, thus, by L'Hôpital's rule [Phi16a, Th. 9.25(a)],

$$\lim_{x \uparrow x_0(\tau)} \frac{Y_q(x_0(\tau),\tau) - Y_q(x,\tau)}{x_0(\tau) - x} = \lim_{x \uparrow x_0(\tau)} \frac{-\pi + 2k\pi - 2 \big( (k-1)\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big)}{x_0(\tau) - x} \overset{\text{[Phi16a, (9.27)], (K.33)}}{=} \lim_{x \uparrow x_0(\tau)} \frac{-\frac{4\sin^2\tau}{4\sin^2\tau + (\Delta(x,\tau))^2}}{-1} = 1 \tag{K.38}$$

and

$$\lim_{x \downarrow x_0(\tau)} \frac{Y_q(x_0(\tau),\tau) - Y_q(x,\tau)}{x_0(\tau) - x} = \lim_{x \downarrow x_0(\tau)} \frac{-\pi + 2k\pi - 2 \big( k\pi + \arctan\frac{2\sin\tau}{2\cos\tau - x\sin\tau + 2} \big)}{x_0(\tau) - x} \overset{\text{[Phi16a, (9.27)], (K.33)}}{=} \lim_{x \downarrow x_0(\tau)} \frac{-\frac{4\sin^2\tau}{4\sin^2\tau + (\Delta(x,\tau))^2}}{-1} = 1, \tag{K.39}$$
showing $x \mapsto Y_q(x,\tau)$ to be differentiable in $x = x_0(\tau)$ with $Y_q'(x_0(\tau),\tau) = 1$. Due to
$$\frac{1 - \cos(-\pi + 2k\pi)}{2} = 1 = Y_q'(x_0(\tau),\tau), \tag{K.40}$$
(K.24) also holds. $\blacktriangle$
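As a numerical cross-check of Claim 2 (an illustration only; the choice $\tau = 1$ and the restriction to the branch of (K.20) valid on $A_{1,0}$ are mine), the following Python sketch (assuming NumPy and SciPy) compares the closed form with a numerical solution of (K.19):

```python
# Cross-check of Claim 2: the closed form (K.20) on A_{1,0} against a
# numerical solution of phi' = sin^2(phi/2), phi(0) = tau, cf. (K.19).
import numpy as np
from scipy.integrate import solve_ivp

tau = 1.0                                          # tau in R_0 = ]0, pi[
x0 = (2*np.cos(tau) + 2) / np.sin(tau)             # x_0(tau), cf. (K.18a)
xs = np.linspace(0.0, 0.9 * x0, 25)                # stay in A_{1,0}, i.e. x < x_0(tau)

def Yq(x):                                         # branch of (K.20) valid on A_{1,0}
    return 2*np.arctan(2*np.sin(tau) / (2*np.cos(tau) - x*np.sin(tau) + 2))

sol = solve_ivp(lambda x, p: np.sin(p/2)**2, (0.0, xs[-1]), [tau],
                t_eval=xs, rtol=1e-10, atol=1e-12)
print(np.max(np.abs(sol.y[0] - Yq(xs))))           # agreement to solver tolerance
```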

Claim 3. The general solution to (K.2) with $f_1, f_2$ according to (K.10) is

$$Y \colon D_{f,0} \longrightarrow \mathbb{R}^2, \quad Y(x,\eta_1,\eta_2) := \Big( Y_p\big( x, r(\eta_1,\eta_2) \big) \cos Y_q\big( x, \varphi(\eta_1,\eta_2) \big), \; Y_p\big( x, r(\eta_1,\eta_2) \big) \sin Y_q\big( x, \varphi(\eta_1,\eta_2) \big) \Big), \tag{K.41}$$
where $r$ and $\varphi$ are given by (K.1), and

$$D_{f,0} = \Big( \mathbb{R} \times \big\{ \eta \in \mathbb{R}^2 : 0 < \|\eta\|_2 \le 1 \big\} \Big) \cup \left\{ (x,\eta) \in \mathbb{R} \times \mathbb{R}^2 : \|\eta\|_2 > 1, \; x \in \left] -\ln\frac{\|\eta\|_2}{\|\eta\|_2 - 1}, \, \infty \right[ \right\}. \tag{K.42}$$

Proof. Since (K.12) is the polar coordinate version of (K.2) with $f_1, f_2$ according to (K.10), everything follows from combining Th. K.1 with Claims 1 and 2. $\blacktriangle$

Claim 4. The autonomous ODE (K.2) with $f_1, f_2$ according to (K.10) has $(1,0)$ as its only fixed point, and $(1,0)$ satisfies Def. 5.24(iii) for $x \to \infty$ (even for each $\eta \in \mathbb{R}^2 \setminus \{0\}$) without satisfying Def. 5.24(ii) (i.e. without being positively stable).

Proof. For each $\eta \in \mathbb{R}^2 \setminus \{0\}$, it is $r(\eta) > 0$ and, thus,
$$\lim_{x\to\infty} Y_p\big( x, r(\eta) \big) = \lim_{x\to\infty} \frac{r(\eta)}{r(\eta) + \big( 1 - r(\eta) \big) \, e^{-x}} = 1. \tag{K.43}$$
Fix $\eta \in \mathbb{R}^2 \setminus \{0\}$. If $\varphi(\eta) = 0$, then

$$\lim_{x\to\infty} Y_q(x,0) = \lim_{x\to\infty} 0 = 0. \tag{K.44a}$$

If $\varphi(\eta) = \pi$, then
$$\lim_{x\to\infty} Y_q(x,\pi) = \lim_{x\to\infty} 2\left( \pi + \arctan\frac{-2}{x} \right) = 2 \, (\pi + 0) = 2\pi. \tag{K.44b}$$
If $0 < \varphi(\eta) < \pi$ or $\pi < \varphi(\eta) < 2\pi$, then $\sin\varphi(\eta) \ne 0$ and, thus,
$$\lim_{x\to\infty} Y_q\big( x, \varphi(\eta) \big) = \lim_{x\to\infty} 2\left( \pi + \arctan\frac{2\sin\varphi(\eta)}{2\cos\varphi(\eta) - x\sin\varphi(\eta) + 2} \right) = 2 \, (\pi + 0) = 2\pi. \tag{K.44c}$$
Using (K.43) and (K.44) in (K.41) yields
$$\lim_{x\to\infty} Y(x,\eta) = (1,0) \quad \text{for each } \eta \in \mathbb{R}^2 \setminus \{0\}. \tag{K.45}$$

While $(1,0)$ is clearly a fixed point for (K.2) with $f_1, f_2$ according to (K.10), (K.45) shows that no other $\eta \in \mathbb{R}^2 \setminus \{0\}$ can be a fixed point.

For each $\tau \in \,]0,\pi[$ and $\eta := (\cos\tau, \sin\tau)$, it is $\varphi(\eta) = \tau$ and $Y_q(0,\varphi(\eta)) = \tau$. Thus, due to (K.44c) and the intermediate value theorem, the continuous function $Y_q(\cdot,\varphi(\eta))$ must attain every value between $\tau$ and $2\pi$; in particular, there is $x_\tau > 0$ such that $Y_q(x_\tau,\varphi(\eta)) = \pi$ and $Y(x_\tau,\eta) = (\cos\pi, \sin\pi) = (-1,0)$. Since every neighborhood of $(1,0)$ contains points $\eta = (\cos\tau, \sin\tau)$ with $\tau \in \,]0,\pi[$, this shows that $(1,0)$ does not satisfy Def. 5.24(ii) for $x \ge 0$. $\blacktriangle$
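Claim 4 can also be observed by direct numerical integration of (K.2). The following Python sketch (assuming NumPy and SciPy; the initial points are arbitrary samples in $\mathbb{R}^2 \setminus \{0\}$) integrates the field (K.10) and shows the trajectories drifting towards $(1,0)$, in line with (K.45):

```python
# Direct integration of (K.2) with f_1, f_2 of (K.10): trajectories drift to (1,0).
import numpy as np
from scipy.integrate import solve_ivp

def F(_, y):
    r = np.hypot(y[0], y[1])                       # r(y_1, y_2), cf. (K.1a)
    f1 = y[0]*(1 - r) - y[1]*(r - y[0])/(2*r)      # (K.10a)
    f2 = y[1]*(1 - r) + y[0]*(r - y[0])/(2*r)      # (K.10b)
    return [f1, f2]

for eta in ([0.2, 0.1], [-1.5, 2.0], [0.9, 0.99]):
    sol = solve_ivp(F, (0.0, 200.0), eta, rtol=1e-9, atol=1e-9)
    print(eta, "->", np.round(sol.y[:, -1], 4))    # close to (1, 0), cf. (K.45)
```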

References

[Aul04] Bernd Aulbach. Gewöhnliche Differenzialgleichungen, 2nd ed. Spektrum Akademischer Verlag, Heidelberg, Germany, 2004 (German).

[Koe03] Max Koecher. Lineare Algebra und analytische Geometrie, 4th ed. Springer-Verlag, Berlin, 2003 (German), 1st corrected reprint.

[Mar04] Nelson G. Markley. Principles of Differential Equations. Pure and Applied Mathematics, Wiley-Interscience, Hoboken, NJ, USA, 2004.

[Oss09] E. Ossa. Topologie. Vieweg+Teubner, Wiesbaden, Germany, 2009 (German).

[Phi16a] P. Philip. Analysis I: Calculus of One Real Variable. Lecture Notes, LMU Munich, 2015/2016, AMS Open Math Notes Ref. # OMN:202109.111306, available in PDF format at https://www.ams.org/open-math-notes/omn-view-listing?listingId=111306.

[Phi16b] P. Philip. Analysis II: Topology and Differential Calculus of Several Variables. Lecture Notes, LMU Munich, 2016, AMS Open Math Notes Ref. # OMN:202109.111307, available in PDF format at https://www.ams.org/open-math-notes/omn-view-listing?listingId=111307.

[Phi17] P. Philip. Analysis III: Measure and Integration Theory of Several Variables. Lecture Notes, LMU Munich, 2016/2017, AMS Open Math Notes Ref. # OMN:202109.111308, available in PDF format at https://www.ams.org/open-math-notes/omn-view-listing?listingId=111308.

[Put66] E.J. Putzer. Avoiding the Jordan Canonical Form in the Discussion of Linear Systems with Constant Coefficients. The American Mathematical Monthly 73 (1966), No. 1, 2-7.

[Str08] Gernot Stroth. Lineare Algebra, 2nd ed. Berliner Studienreihe zur Mathematik, Vol. 7, Heldermann Verlag, Lemgo, Germany, 2008 (German).

[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (German).