Ordinary Differential Equations
Peter Philip∗
Lecture Notes Originally Created for the Class of Spring Semester 2012 at LMU Munich, Revised and Extended for Several Subsequent Classes
September 19, 2021
Contents
1 Basic Notions
  1.1 Types and First Examples
  1.2 Equivalent Integral Equation
  1.3 Patching and Time Reversion
2 Elementary Solution Methods
  2.1 Geometric Interpretation, Graphing
  2.2 Linear ODE, Variation of Constants
  2.3 Separation of Variables
  2.4 Change of Variables
3 General Theory
  3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE
  3.2 Existence of Solutions

∗E-Mail: [email protected]
  3.3 Uniqueness of Solutions
  3.4 Extension of Solutions, Maximal Solutions
  3.5 Continuity in Initial Conditions
4 Linear ODE
  4.1 Definition, Setting
  4.2 Gronwall's Inequality
  4.3 Existence, Uniqueness, Vector Space of Solutions
  4.4 Fundamental Matrix Solutions and Variation of Constants
  4.5 Higher-Order, Wronskian
  4.6 Constant Coefficients
    4.6.1 Linear ODE of Higher Order
    4.6.2 Systems of First-Order Linear ODE
5 Stability
  5.1 Qualitative Theory, Phase Portraits
  5.2 Stability at Fixed Points
  5.3 Constant Coefficients
  5.4 Linearization
  5.5 Limit Sets
A Differentiability
B Kⁿ-Valued Integration
  B.1 Kⁿ-Valued Riemann Integral
  B.2 Kⁿ-Valued Lebesgue Integral
C Metric Spaces
  C.1 Distance in Metric Spaces
D Local Lipschitz Continuity
E Maximal Solutions on Nonopen Intervals
F Paths in Rⁿ
G Operator Norms and Matrix Norms
H The Vandermonde Determinant
I Matrix-Valued Functions
  I.1 Product Rule
  I.2 Integration and Matrix Multiplication Commute
J Autonomous ODE
  J.1 Equivalence of Autonomous and Nonautonomous ODE
  J.2 Integral for ODE with Discontinuous Right-Hand Side
K Polar Coordinates
References
1 Basic Notions
1.1 Types of Ordinary Differential Equations (ODE) and First Examples
A differential equation is an equation for some unknown function, involving one or more derivatives of the unknown function. Here are some first examples:
    y′ = y, (1.1a)
    y⁽⁵⁾ = (y′)² + πx, (1.1b)
    (y′)² = c, (1.1c)
    ∂ₜx = e^{2πit} x², (1.1d)
    x″ = −3x + (−1, 1). (1.1e)

One distinguishes between ordinary differential equations (ODE) and partial differential equations (PDE). While ODE contain only derivatives with respect to one variable, PDE can contain (partial) derivatives with respect to several different variables. In general, PDE are much harder to solve than ODE. The equations in (1.1) are all ODE, and only ODE are the subject of this class. We will see precise definitions shortly, but we can already use the examples in (1.1) to get some first exposure to important ODE-related terms and to discuss related issues. As in (1.1), the notation for the unknown function varies in the literature, where the two variants presented in (1.1) are probably the most common ones: In the first three equations of (1.1), the unknown function is denoted y, usually assumed to depend on a variable denoted x, i.e. x ↦ y(x). In the last two equations of (1.1), the unknown function is denoted x, usually assumed to depend on a variable denoted t, i.e. t ↦ x(t). So one has to use some care due to the different roles of the symbol x. The notation t ↦ x(t) is typically favored in situations arising from physics applications, where t represents time. In this class, we will mostly use the notation x ↦ y(x). There is another, in a way slightly more serious, notational issue that one commonly encounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d) is not entirely correct, as functions and function arguments are not properly distinguished. Correctly written, (1.1b) and (1.1d) read
    ∀ x ∈ D(y):  y⁽⁵⁾(x) = (y′(x))² + πx, (1.2a)
    ∀ t ∈ D(x):  (∂ₜx)(t) = e^{2πit} (x(t))², (1.2b)
where D(y) and D(x) denote the respective domains of the functions y and x. However, one might also notice that the notation in (1.2) is more cumbersome and, perhaps, harder to read. In any case, the type of slight abuse of notation present in (1.1b) and (1.1d) is so common in the literature that one will have to live with it. One speaks of first-order ODE if the equations involve only first derivatives, such as in (1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the precise order is given by the highest derivative occurring in the equation, such that (1.1b) is an ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th. 3.1 that ODE of higher order can be equivalently formulated and solved as systems of ODE of first order, where systems of ODE obviously consist of several ODE to be solved simultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODE in higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODE of second order or as the system
    x₁″ = −3x₁ − 1, (1.3a)
    x₂″ = −3x₂ + 1 (1.3b)

of two one-dimensional ODE of second order. One calls an ODE explicit if it has been solved explicitly for the highest-order derivative, otherwise implicit. Thus, in (1.1), all ODE except (1.1c) are explicit. In general, explicit ODE are much easier to solve than implicit ODE (which include, e.g., so-called differential-algebraic equations, cf. Ex. 1.4(g) below), and we will mostly consider explicit ODE in this class. As the reader might already have noticed, without further information, none of the equations in (1.1) makes much sense. Every function, in particular every function solving an ODE, needs a set as the domain where it is defined, and a set as the range it maps into. Thus, for each ODE, one needs to specify the admissible domains as well as the range of the unknown function. For an ODE, one usually requires a solution to be defined on a nontrivial (bounded or unbounded) interval I ⊆ R. Prescribing the possible range of the solution is an integral part of setting up an ODE, and it often completely determines the ODE's meaning and/or its solvability. For example, for (1.1d), (a subset of) C is a reasonable range. Similarly, for (1.1a)–(1.1c), one can require the range to be either R or C, where requiring range R for (1.1c) immediately implies there is no solution for c < 0. However, one can also specify (a subset of) Rⁿ or Cⁿ, n > 1, as range for (1.1a), turning the ODE into an n-dimensional ODE (or a system of ODE), where y now has n components (y₁, …, yₙ) (note that, except in cases where we are dealing with matrix multiplications, we sometimes denote elements of Rⁿ as columns and sometimes as rows, switching back and forth without too much care). A reasonable range for (1.1e) is (a subset of) R² or C².
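The reduction of (1.1e) to the first-order setting (which Th. 3.1 will justify in general) can be illustrated numerically. The following sketch is our own and not part of the notes; the function names are hypothetical. It rewrites the system (1.3) in the variables z = (x₁, x₂, x₁′, x₂′) and applies a few steps of the explicit Euler method.

```python
# Hypothetical illustration (our own names, not from the notes): the
# second-order ODE (1.1e), i.e. the system (1.3), rewritten as a
# first-order system in z = (x1, x2, x1', x2').
def rhs(z):
    x1, x2, v1, v2 = z
    # velocities feed the positions; (1.3a), (1.3b) feed the velocities
    return (v1, v2, -3.0 * x1 - 1.0, -3.0 * x2 + 1.0)

def euler_step(z, h):
    """One explicit Euler step z -> z + h * rhs(z) (a crude numerical sketch)."""
    f = rhs(z)
    return tuple(zi + h * fi for zi, fi in zip(z, f))

def integrate(z0, h, steps):
    z = z0
    for _ in range(steps):
        z = euler_step(z, h)
    return z

# The constant functions x1 = -1/3, x2 = 1/3 (with zero velocity) solve
# (1.3a), (1.3b), so Euler should leave this equilibrium essentially fixed.
z = integrate((-1/3, 1/3, 0.0, 0.0), h=0.01, steps=1000)
print(z)
```

Note that the reduction itself is exact; only the Euler stepping is an approximation.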
One of the important goals regarding ODE is to find conditions under which one can guarantee the existence of solutions. Moreover, if possible, one would like to find conditions that guarantee the existence of a unique solution. Clearly, for each a ∈ R, the function y : R → R, y(x) = a eˣ, is a solution to (1.1a), showing one cannot expect uniqueness without specifying further requirements. The most common additional conditions that often (but not always) guarantee a unique solution are initial conditions (e.g. requiring y(x₀) = y₀, with x₀, y₀ given) and boundary conditions (e.g. requiring y(a) = y_a, y(b) = y_b for y : [a,b] → Cⁿ, with y_a, y_b ∈ Cⁿ given). Let us now proceed to mathematically precise definitions of the abovementioned notions.

Notation 1.1. We will write K in situations where we allow K to be R or C.

Definition 1.2. Let k, n ∈ N.

(a) Given U ⊆ R × K^{(k+1)n} and F : U → Kⁿ, call

    F(x, y, y′, …, y⁽ᵏ⁾) = 0 (1.4)
an implicit ODE of kth order. A solution to this ODE is a k times differentiable function

    φ : I → Kⁿ, (1.5)

defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval I ⊆ R, satisfying the two conditions

(i) {(x, φ(x), φ′(x), …, φ⁽ᵏ⁾(x)) ∈ I × K^{(k+1)n} : x ∈ I} ⊆ U.

(ii) F(x, φ(x), φ′(x), …, φ⁽ᵏ⁾(x)) = 0 for each x ∈ I.

Note that condition (i) is necessary so that one can even formulate condition (ii).

(b) Given G ⊆ R × K^{kn} and f : G → Kⁿ, call

    y⁽ᵏ⁾ = f(x, y, y′, …, y⁽ᵏ⁻¹⁾) (1.6)
an explicit ODE of kth order. A solution to this ODE is a k times differentiable function φ as in (1.5), defined on a nontrivial (bounded or unbounded, open or closed or half-open) interval I ⊆ R, satisfying the two conditions

(i) {(x, φ(x), φ′(x), …, φ⁽ᵏ⁻¹⁾(x)) ∈ I × K^{kn} : x ∈ I} ⊆ G.

(ii) φ⁽ᵏ⁾(x) = f(x, φ(x), φ′(x), …, φ⁽ᵏ⁻¹⁾(x)) for each x ∈ I.

Again, note that condition (i) is necessary so that one can even formulate condition (ii). Also note that φ is a solution to (1.6) if, and only if, φ is a solution to the equivalent implicit ODE y⁽ᵏ⁾ − f(x, y, y′, …, y⁽ᵏ⁻¹⁾) = 0.
Definition 1.3. Let k, n ∈ N.

(a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the initial condition
    ∀ j = 0, …, k−1:  y⁽ʲ⁾(x₀) = y_{0,j} (1.7)

with given x₀ ∈ R and y_{0,0}, …, y_{0,k−1} ∈ Kⁿ. A solution φ to the initial value problem is a k times differentiable function φ as in (1.5) that is a solution to the ODE and that also satisfies (1.7) (with y replaced by φ) – in particular, this requires x₀ ∈ I.

(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp. of the ODE (1.6)) plus the boundary condition
    ∀ j ∈ J_a:  y⁽ʲ⁾(a) = y_{a,j}   and   ∀ j ∈ J_b:  y⁽ʲ⁾(b) = y_{b,j} (1.8)

with given a, b ∈ R, a < b, index sets J_a, J_b ⊆ {0, …, k−1}, and given values y_{a,j}, y_{b,j} ∈ Kⁿ.
Under suitable hypotheses, initial and boundary value problems for ODE have unique solutions (for initial value problems, we will see some rather general results in Cor. 3.10 and Cor. 3.16 below). However, in general, they can have infinitely many solutions or no solutions, as shown by Examples 1.4(b), (c), (e) below.

Example 1.4. (a) Let k ∈ N. The function φ : R → K, φ(x) = a eˣ, a ∈ K, is a solution to the kth-order explicit initial value problem
    y⁽ᵏ⁾ = y, (1.9a)
    ∀ j = 0, …, k−1:  y⁽ʲ⁾(0) = a. (1.9b)

We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that φ is the unique solution to (1.9) on R.

(b) Consider the one-dimensional explicit first-order initial value problem
    y′ = √|y|, (1.10a)
    y(0) = 0. (1.10b)
Then, for every c ≥ 0, the function

    φ_c : R → R,  φ_c(x) := { 0 for x ≤ c;  (x − c)²/4 for x ≥ c }, (1.11)
is a solution to (1.10): Clearly, φ_c(0) = 0, φ_c is differentiable, and
    φ_c′ : R → R,  φ_c′(x) := { 0 for x ≤ c;  (x − c)/2 for x ≥ c }, (1.12)

solving the ODE. Thus, (1.10) is an example of an initial value problem with uncountably many different solutions, all defined on the same domain.
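This nonuniqueness can also be checked numerically. The following sketch is ours (the function names are hypothetical, not from the notes): for several values of c, it compares a central difference quotient of φ_c from (1.11) with √|φ_c|, confirming that each φ_c solves (1.10) while the solutions differ from one another.

```python
import math

# Our own numerical cross-check (hypothetical names): each phi_c from (1.11)
# solves y' = sqrt(|y|) with y(0) = 0.
def phi(c, x):
    return 0.0 if x <= c else (x - c) ** 2 / 4.0

def max_residual(c, xs, h=1e-6):
    """Largest deviation between a difference quotient of phi_c and sqrt(|phi_c|)."""
    res = 0.0
    for x in xs:
        dq = (phi(c, x + h) - phi(c, x - h)) / (2.0 * h)
        res = max(res, abs(dq - math.sqrt(abs(phi(c, x)))))
    return res

xs = [-2.0, -0.5, 0.3, 1.7, 3.0, 5.0]
for c in (0.0, 1.0, 2.5):
    assert phi(c, 0.0) == 0.0      # all phi_c satisfy the initial condition
    print(c, max_residual(c, xs))  # residuals are tiny for every c
```

Distinct values of c give genuinely different functions (e.g. φ₀(2) = 1 while φ₁(2) = 1/4), so the tiny residuals for all c exhibit the uncountable solution family.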
(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has no real-valued solution for c < 0. For c ≥ 0, every function φ : R → R, φ(x) := a ± x√c, a ∈ R, is a solution to (1.1c). Moreover, for c < 0, every function φ : R → C, φ(x) := a ± xi√(−c), a ∈ C, is a C-valued solution to (1.1c). The one-dimensional implicit first-order ODE
    e^{y′} = 0 (1.13)
is an example of an ODE that does not even have a C-valued solution. It is an exercise to find f : R → R such that the explicit ODE y′ = f(x) has no solution.

(d) Let n ∈ N and let a, c ∈ Kⁿ. Then the function

    φ : R → Kⁿ,  φ(x) := c + xa, (1.14)

is the unique solution on R to the n-dimensional explicit first-order initial value problem
    y′ = a, (1.15a)
    y(0) = c. (1.15b)
This situation is a special case of Ex. 1.6 below.
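Because the right-hand side of (1.15a) is constant, even the crudest numerical integrator reproduces the affine solution (1.14) exactly (up to rounding). The following check is our own sketch with hypothetical names, shown here for n = 2.

```python
# Our own numerical sketch (hypothetical names) of Ex. 1.4(d) for n = 2:
# explicit Euler applied to y' = a, y(0) = c reproduces the unique solution
# phi(x) = c + x a, since each step adds exactly h * a.
def euler_constant_rhs(c, a, x, steps):
    h = x / steps
    y = list(c)
    for _ in range(steps):
        y = [yi + h * ai for yi, ai in zip(y, a)]
    return y

c, a, x = (1.0, -2.0), (0.5, 3.0), 4.0
y = euler_constant_rhs(c, a, x, steps=400)
exact = [ci + x * ai for ci, ai in zip(c, a)]  # phi(4) from (1.14)
print(y, exact)
```

For a non-constant right-hand side f(x, y), Euler would of course only approximate the solution; the exactness here is special to (1.15).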
(e) Let a, b ∈ R with a < b, and consider the second-order ODE

    y″ = −y. (1.16)

One can show that the set of solutions of (1.16) defined on [a, b] is

    L = { (c₁ sin + c₂ cos) : [a,b] → K  :  c₁, c₂ ∈ K }. (1.17)

In consequence, the boundary value problem

    y(0) = 0,  y(π/2) = 1, (1.18a)

for (1.16) has the unique solution φ : [0, π/2] → K, φ(x) := sin x (using (1.18a) and (1.17) implies c₂ = 0 and c₁ = 1); the boundary value problem

    y(0) = 0,  y(π) = 0, (1.18b)

for (1.16) has the infinitely many different solutions φ_c : [0, π] → K, φ_c(x) := c sin x, c ∈ K; and the boundary value problem

    y(0) = 0,  y(π) = 1, (1.18c)

for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory requirements c₂ = 0 and c₂ = −1).

(f) Consider

    F : R × K² × K² → K²,  F(x, (y₁, y₂), (z₁, z₂)) := ((z₂)², (z₂)² − 1). (1.19a)

Clearly, the implicit K²-valued ODE

    F(x, y, y′) = ((y₂′)², (y₂′)² − 1) = (0, 0) (1.19b)

has no solution on any nontrivial interval.

(g) Consider

    F : R × C³ × C³ → C³,  F(x, (y₁, y₂, y₃), (z₁, z₂, z₃)) := (z₂ + iy₃ − 2i, z₁ + y₂ − x², y₁ − i e^{−ix}). (1.20a)

It is an exercise to show that the C³-valued implicit ODE

    F(x, y, y′) = (y₂′ + iy₃ − 2i, y₁′ + y₂ − x², y₁ − i e^{−ix}) = (0, 0, 0) (1.20b)

has a unique solution on R (note that, here, we do not need to provide initial or boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is an example of a differential-algebraic equation since, read in components, only its first two equations contain derivatives, whereas its third equation is purely algebraic.

1.2 Equivalent Integral Equation

It is often useful to rewrite a first-order explicit initial value problem as an equivalent integral equation. We provide the details of this equivalence in the following theorem:

Theorem 1.5.
If G ⊆ R × Kⁿ, n ∈ N, and f : G → Kⁿ is continuous, then, for each (x₀, y₀) ∈ G, the explicit n-dimensional first-order initial value problem

    y′ = f(x, y), (1.21a)
    y(x₀) = y₀, (1.21b)

is equivalent to the integral equation

    y(x) = y₀ + ∫_{x₀}^{x} f(t, y(t)) dt, (1.22)

in the sense that a continuous function φ : I → Kⁿ, with x₀ ∈ I ⊆ R being a nontrivial interval, and φ satisfying

    {(x, φ(x)) ∈ I × Kⁿ : x ∈ I} ⊆ G, (1.23)

is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,

    ∀ x ∈ I:  φ(x) = y₀ + ∫_{x₀}^{x} f(t, φ(t)) dt, (1.24)

i.e. if, and only if, φ is a solution to the integral equation (1.22).

Proof. Assume I ⊆ R with x₀ ∈ I to be a nontrivial interval and φ : I → Kⁿ to be a continuous function satisfying (1.23). If φ is a solution to (1.21), then φ is differentiable, and the assumed continuity of f implies the continuity of φ′. In other words, each component φ_j of φ, j ∈ {1, …, n}, is in C¹(I, K). Thus, the fundamental theorem of calculus [Phi16a, Th. 10.20(b)] applies, and [Phi16a, (10.51b)] yields

    ∀ x ∈ I, ∀ j ∈ {1, …, n}:  φ_j(x) = φ_j(x₀) + ∫_{x₀}^{x} f_j(t, φ(t)) dt  =  y_{0,j} + ∫_{x₀}^{x} f_j(t, φ(t)) dt, (1.25)

where the last equality used (1.21b), proving φ satisfies (1.24). Conversely, if φ satisfies (1.24), then the validity of the initial condition (1.21b) is immediate. Moreover, as f and φ are continuous, so is the integrand function t ↦ f(t, φ(t)) of (1.24). Thus, [Phi16a, Th. 10.20(a)] applies to (the components of) φ, proving φ′(x) = f(x, φ(x)) for each x ∈ I, i.e. φ is a solution to (1.21).

Example 1.6. Consider the situation of Th. 1.5. In the particularly simple special case where f does not actually depend on y, but merely on x, the equivalence between (1.21) and (1.22) can be directly exploited to actually solve the initial value problem: If f : I → Kⁿ, where I ⊆ R is some nontrivial interval with x₀ ∈ I, then φ : I → Kⁿ is a solution of (1.21) if, and only if,

    ∀ x ∈ I:  φ(x) = y₀ + ∫_{x₀}^{x} f(t) dt, (1.26)

i.e.
if, and only if, φ is the antiderivative of f that satisfies the initial condition. In particular, in the present situation, φ as given by (1.26) is the unique solution to the initial value problem. Of course, depending on f, it can still be difficult to carry out the integral in (1.26).

1.3 Patching and Time Reversion

If solutions defined on different intervals fit together, then they can be patched to obtain a solution on the union of the two intervals:

Lemma 1.7 (Patching of Solutions). Let k, n ∈ N. Given G ⊆ R × K^{kn} and f : G → Kⁿ, if φ : I → Kⁿ and ψ : J → Kⁿ are both solutions to (1.6), i.e. to

    y⁽ᵏ⁾ = f(x, y, y′, …, y⁽ᵏ⁻¹⁾),

such that I = ]a, b], J = [b, c[, a < b < c, and

    ∀ j = 0, …, k−1:  φ⁽ʲ⁾(b) = ψ⁽ʲ⁾(b), (1.27)

then

    σ : I ∪ J → Kⁿ,  σ(x) := { φ(x) for x ∈ I;  ψ(x) for x ∈ J }, (1.28)

is also a solution to (1.6).

Proof. Since φ and ψ both are solutions to (1.6),

    {(x, σ(x), σ′(x), …, σ⁽ᵏ⁻¹⁾(x)) ∈ (I ∪ J) × K^{kn} : x ∈ I ∪ J} ⊆ G (1.29)

must hold, where (1.27) guarantees that σ⁽ʲ⁾(b) exists for each j = 0, …, k−1. Moreover, σ is k times differentiable at each x ∈ I ∪ J, x ≠ b, and

    ∀ x ∈ I ∪ J, x ≠ b:  σ⁽ᵏ⁾(x) = f(x, σ(x), σ′(x), …, σ⁽ᵏ⁻¹⁾(x)). (1.30)

However, at b, we also have (using the left-hand derivatives for φ and the right-hand derivatives for ψ)

    φ⁽ᵏ⁾(b) = f(b, φ(b), φ′(b), …, φ⁽ᵏ⁻¹⁾(b)) = f(b, ψ(b), ψ′(b), …, ψ⁽ᵏ⁻¹⁾(b)) = ψ⁽ᵏ⁾(b), (1.31)

which shows σ is k times differentiable and the equality of (1.30) also holds at x = b, completing the proof that σ is a solution.

It is sometimes useful to apply what is known as time reversion:

Definition 1.8. Let k, n ∈ N, G_f ⊆ R × K^{kn}, f : G_f → Kⁿ, and consider the ODE

    y⁽ᵏ⁾ = f(x, y, y′, …, y⁽ᵏ⁻¹⁾). (1.32)

We call the ODE

    y⁽ᵏ⁾ = g(x, y, y′, …, y⁽ᵏ⁻¹⁾), (1.33)

where

    g : G_g → Kⁿ,  g(x, y) := (−1)ᵏ f(−x, y₁, −y₂, …, (−1)^{k−1} y_k), (1.34a)
    G_g := { (x, y) ∈ R × K^{kn} : (−x, y₁, −y₂, …, (−1)^{k−1} y_k) ∈ G_f }, (1.34b)

the time-reversed version of (1.32).

Lemma 1.9 (Time Reversion). Let k, n ∈ N, G_f ⊆ R × K^{kn}, and f : G_f → Kⁿ.
(a) The time-reversed version of (1.33) is the original ODE, i.e. (1.32).

(b) If a < b and φ : ]a, b[ → Kⁿ is a solution to (1.32), then

    ψ : ]−b, −a[ → Kⁿ,  ψ(x) := φ(−x), (1.35)

is a solution to the time-reversed ODE (1.33), and vice versa.

Proof. (a) is immediate from the definition of g in (1.34).

(b): Due to (a), it suffices to show that if φ is a solution to (1.32), then ψ is a solution to (1.33). Clearly, if x ∈ ]−b, −a[, then −x ∈ ]a, b[. Moreover, noting

    ∀ j = 0, …, k, ∀ x ∈ ]−b, −a[:  ψ⁽ʲ⁾(x) = (−1)ʲ φ⁽ʲ⁾(−x), (1.36a)

one has

    ∀ x ∈ ]−b, −a[:  (−x, φ(−x), φ′(−x), …, φ⁽ᵏ⁻¹⁾(−x)) ∈ G_f
      ⇒  (x, ψ(x), ψ′(x), …, ψ⁽ᵏ⁻¹⁾(x)) = (x, φ(−x), −φ′(−x), …, (−1)^{k−1} φ⁽ᵏ⁻¹⁾(−x)) ∈ G_g (1.36b)

and

    ∀ x ∈ ]−b, −a[:  ψ⁽ᵏ⁾(x) = (−1)ᵏ f(−x, φ(−x), φ′(−x), …, φ⁽ᵏ⁻¹⁾(−x))
      = (−1)ᵏ f(−x, ψ(x), −ψ′(x), …, (−1)^{k−1} ψ⁽ᵏ⁻¹⁾(x)) = g(x, ψ(x), ψ′(x), …, ψ⁽ᵏ⁻¹⁾(x)), (1.36c)

thereby establishing the case.

2 Elementary Solution Methods for 1-Dimensional First-Order ODE

2.1 Geometric Interpretation, Graphing

Geometrically, in the 1-dimensional real-valued case, the ODE (1.21a) provides a slope y′ = f(x, y) for every point (x, y). In other words, it provides a field of directions. The task is to find a differentiable function φ such that its graph has the prescribed slope in each point it contains. In certain simple cases, drawing the field of directions can help to guess the solutions of the ODE.

Example 2.1. Let G := R⁺ × R and f : G → R, f(x, y) := y/x, i.e. we consider the ODE y′ = y/x. Drawing the field of directions leads to the idea that the solutions are functions whose graphs constitute rays, i.e. φ_c : R⁺ → R, y = φ_c(x) = cx with c ∈ R. Indeed, one immediately verifies that each φ_c constitutes a solution to the ODE.

2.2 Linear ODE, Variation of Constants

Definition 2.2. Let I ⊆ R be an open interval and let a, b : I → K be continuous functions. An ODE of the form

    y′ = a(x) y + b(x) (2.1)

is called a linear ODE of first order. It is called homogeneous if, and only if, b ≡ 0; it is called inhomogeneous if, and only if, it is not homogeneous.
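Note that the ODE y′ = y/x of Example 2.1 is a homogeneous linear ODE in the sense of Def. 2.2 (with a(x) = 1/x and b ≡ 0). The following numerical sketch is our own (hypothetical names): it integrates the direction field f(x, y) = y/x by explicit Euler starting at (1, c) and observes that, up to rounding, every step stays on the ray y = cx.

```python
# Our own sketch (hypothetical names) for Example 2.1: explicit Euler on
# y' = y / x from the point (1, c).  Since y + h * (y / x) = y * (x + h) / x,
# a point on the ray y = c x is mapped to the ray point at x + h again.
def euler_ray(c, x_end, steps):
    x, y = 1.0, c
    h = (x_end - 1.0) / steps
    for _ in range(steps):
        y += h * (y / x)   # slope field f(x, y) = y / x of the ODE
        x += h
    return y

for c in (0.5, -2.0, 3.0):
    print(c, euler_ray(c, 5.0, 200), c * 5.0)  # Euler value vs. phi_c(5) = 5c
```

This merely illustrates the direction-field picture; the rigorous existence and uniqueness statements follow from the results of this section.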
Theorem 2.3 (Variation of Constants). Let I ⊆ R be an open interval and let a, b : I → K be continuous. Moreover, let x₀ ∈ I and c ∈ K. Then the linear ODE (2.1) has a unique solution φ : I → K that satisfies the initial condition y(x₀) = c. This unique solution is given by

    φ : I → K,  φ(x) = φ₀(x) ( c + ∫_{x₀}^{x} φ₀(t)⁻¹ b(t) dt ), (2.2a)

where

    φ₀ : I → K,  φ₀(x) = exp( ∫_{x₀}^{x} a(t) dt ). (2.2b)

Here, and in the following, φ₀⁻¹ denotes 1/φ₀ and not the inverse function of φ₀ (which does not even necessarily exist).

Proof. We begin by noting that φ₀ according to (2.2b) is well-defined, since a is assumed to be continuous, i.e., in particular, Riemann integrable on [x₀, x]. Moreover, the fundamental theorem of calculus [Phi16a, Th. 10.20(a)] applies, showing φ₀ is differentiable with

    φ₀′ : I → K,  φ₀′(x) = a(x) exp( ∫_{x₀}^{x} a(t) dt ) = a(x) φ₀(x), (2.3)

where Lem. A.1 of the Appendix was used as well. In particular, φ₀ is continuous. Since φ₀ ≠ 0 as well, φ₀⁻¹ is also continuous. Moreover, as b is continuous by hypothesis, φ₀⁻¹ b is continuous and, thus, Riemann integrable on [x₀, x]. Once again, [Phi16a, Th. 10.20(a)] applies, yielding φ to be differentiable with φ′ : I → K,

    φ′(x) = φ₀′(x) ( c + ∫_{x₀}^{x} φ₀(t)⁻¹ b(t) dt ) + φ₀(x) φ₀(x)⁻¹ b(x)
          = a(x) φ₀(x) ( c + ∫_{x₀}^{x} φ₀(t)⁻¹ b(t) dt ) + b(x) = a(x) φ(x) + b(x), (2.4)

where the product rule of [Phi16a, Th. 9.7(c)] was used as well. Comparing (2.4) with (2.1) shows φ is a solution to (2.1). The computation

    φ(x₀) = φ₀(x₀) (c + 0) = 1 · c = c (2.5)

verifies that φ satisfies the desired initial condition. It remains to prove uniqueness. To this end, let ψ : I → K be an arbitrary differentiable function that satisfies (2.1) as well as the initial condition ψ(x₀) = c. We have to show ψ = φ. Since φ₀ ≠ 0, we can define u := ψ/φ₀ and still have to verify

    ∀ x ∈ I:  u(x) = c + ∫_{x₀}^{x} φ₀(t)⁻¹ b(t) dt. (2.6)
We obtain

    a φ₀ u + b = a ψ + b = ψ′ = (φ₀ u)′ = φ₀′ u + φ₀ u′ = a φ₀ u + φ₀ u′, (2.7)

implying b = φ₀ u′ and u′ = φ₀⁻¹ b. Thus, the fundamental theorem of calculus in the form [Phi16a, Th. 10.20(b)] implies

    ∀ x ∈ I:  u(x) = u(x₀) + ∫_{x₀}^{x} u′(t) dt = c + ∫_{x₀}^{x} φ₀(t)⁻¹ b(t) dt, (2.8)

thereby completing the proof.

Corollary 2.4. Let I ⊆ R be an open interval and let a : I → K be continuous. Moreover, let x₀ ∈ I and c ∈ K. Then the homogeneous linear ODE (2.1) (i.e. with b ≡ 0) has a unique solution φ : I → K that satisfies the initial condition y(x₀) = c. This unique solution is given by

    φ(x) = c exp( ∫_{x₀}^{x} a(t) dt ). (2.9)

Proof. One immediately obtains (2.9) by setting b ≡ 0 in (2.2).

Remark 2.5. The name variation of constants for Th. 2.3 can be understood from comparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2) of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying the constant c, i.e. by replacing it with the function x ↦ c + ∫_{x₀}^{x} φ₀(t)⁻¹ b(t) dt.

Example 2.6. Consider the ODE

    y′ = 2xy + x³ (2.10)

with initial condition y(0) = c, c ∈ C. Comparing (2.10) with Def. 2.2, we observe we are facing an inhomogeneous linear ODE with

    a : R → R,  a(x) := 2x, (2.11a)
    b : R → R,  b(x) := x³. (2.11b)

From Cor. 2.4, we obtain the solution φ_{0,c} to the homogeneous version of (2.10):

    φ_{0,c} : R → C,  φ_{0,c}(x) = c exp( ∫₀ˣ a(t) dt ) = c e^{x²}. (2.12)

The solution to (2.10) is given by (2.2a): φ : R → C,

    φ(x) = e^{x²} ( c + ∫₀ˣ e^{−t²} t³ dt ) = e^{x²} ( c + [ −(1/2)(t² + 1) e^{−t²} ]₀ˣ )
         = e^{x²} ( c + 1/2 − (1/2)(x² + 1) e^{−x²} ) = (c + 1/2) e^{x²} − (1/2)(x² + 1). (2.13)

2.3 Separation of Variables

If the ODE (1.21a) has the particular form

    y′ = f(x) g(y), (2.14)

with one-dimensional real-valued continuous functions f and g, and g(y) ≠ 0, then it can be solved by a method known as separation of variables:

Theorem 2.7.
Let I, J ⊆ R be (bounded or unbounded) open intervals and suppose that f : I → R and g : J → R are continuous with g(y) ≠ 0 for each y ∈ J. For each (x₀, y₀) ∈ I × J, consider the initial value problem consisting of the ODE (2.14) together with the initial condition

    y(x₀) = y₀. (2.15)

Define the functions

    F : I → R,  F(x) := ∫_{x₀}^{x} f(t) dt,    G : J → R,  G(y) := ∫_{y₀}^{y} dt/g(t). (2.16)

(a) Uniqueness: On each open interval I′ ⊆ I satisfying x₀ ∈ I′ and F(I′) ⊆ G(J), the initial value problem consisting of (2.14) and (2.15) has a unique solution. This unique solution is given by

    φ : I′ → R,  φ(x) := G⁻¹(F(x)), (2.17)

where G⁻¹ : G(J) → J is the inverse function of G on G(J).

(b) Existence: There exists an open interval I′ ⊆ I satisfying x₀ ∈ I′ and F(I′) ⊆ G(J), i.e. an I′ such that (a) applies.

Proof. (a): We begin by proving G has a differentiable inverse function G⁻¹ : G(J) → J. According to the fundamental theorem of calculus [Phi16a, Th. 10.20(a)], G is differentiable with G′ = 1/g. Since g is continuous and nonzero, G is even C¹. If G′(y₀) = 1/g(y₀) > 0, then G is strictly increasing on J (due to the intermediate value theorem [Phi16a, Th. 7.57]: g(y₀) > 0, the continuity of g, and g ≠ 0 imply that g > 0 on J). Analogously, if G′(y₀) = 1/g(y₀) < 0, then G is strictly decreasing on J. In each case, G has a differentiable inverse function on G(J) by [Phi16a, Th. 9.9]. In the next step, we verify that (2.17) does, indeed, define a solution to (2.14) and (2.15). The assumption F(I′) ⊆ G(J) and the existence of G⁻¹ as shown above provide that φ is well-defined by (2.17). Verifying (2.15) is quite simple: φ(x₀) = G⁻¹(F(x₀)) = G⁻¹(0) = y₀. To see φ to be a solution of (2.14), notice that (2.17) implies F = G ∘ φ on I′. Thus, we can apply the chain rule to obtain the derivative of F = G ∘ φ on I′:

    ∀ x ∈ I′:  f(x) = F′(x) = G′(φ(x)) φ′(x) = φ′(x)/g(φ(x)), (2.18)

showing φ satisfies (2.14).
We now proceed to show that each solution φ : I′ → R to (2.14) that satisfies (2.15) must also satisfy (2.17). Since φ is a solution to (2.14),

    φ′(x)/g(φ(x)) = f(x) for each x ∈ I′. (2.19)

Integrating (2.19) yields

    ∫_{x₀}^{x} φ′(t)/g(φ(t)) dt = ∫_{x₀}^{x} f(t) dt = F(x) for each x ∈ I′. (2.20)

Using the change of variables formula of [Phi16a, Th. 10.25] in the left-hand side of (2.20) allows one to replace φ(t) by the new integration variable u (note that each solution φ : I′ → R to (2.14) is in C¹(I′), since f and g are presumed continuous). Thus, we obtain from (2.20):

    F(x) = ∫_{φ(x₀)}^{φ(x)} du/g(u) = ∫_{y₀}^{φ(x)} du/g(u) = G(φ(x)) for each x ∈ I′. (2.21)

Applying G⁻¹ to (2.21) establishes that φ satisfies (2.17).

(b): During the proof of (a), we have already seen G to be either strictly increasing or strictly decreasing. As G(y₀) = 0, this implies the existence of ε > 0 such that ]−ε, ε[ ⊆ G(J). The function F is differentiable and, in particular, continuous. Since F(x₀) = 0, there is δ > 0 such that, for I′ := ]x₀ − δ, x₀ + δ[, one has F(I′) ⊆ ]−ε, ε[ ⊆ G(J), as desired.

Example 2.8. Consider the ODE

    y′ = −y/x on I × J := R⁺ × R⁺ (2.22)

with the initial condition y(1) = c for some given c ∈ R⁺. Introducing the functions

    f : R⁺ → R,  f(x) := −1/x,    g : R⁺ → R,  g(y) := y, (2.23)

one sees that Th. 2.7 applies. To compute the solution φ = G⁻¹ ∘ F, we first have to determine F and G:

    F : R⁺ → R,  F(x) = ∫₁ˣ f(t) dt = −∫₁ˣ dt/t = −ln x, (2.24a)
    G : R⁺ → R,  G(y) = ∫_c^y dt/g(t) = ∫_c^y dt/t = ln(y/c). (2.24b)

Here, we can choose I′ = I = R⁺, because F(R⁺) = R = G(R⁺). That means φ is defined on the entire interval I. The inverse function of G is given by

    G⁻¹ : R → R⁺,  G⁻¹(t) = c eᵗ. (2.25)

Finally, we get

    φ : R⁺ → R,  φ(x) = G⁻¹(F(x)) = c e^{−ln x} = c/x. (2.26)

The uniqueness part of Th. 2.7 further tells us the above initial value problem can have no solution different from φ.

The advantage of using Th.
2.7 as in the previous example, by computing the relevant functions F, G, and G⁻¹, is that it is mathematically rigorous. In particular, one can be sure one has found the unique solution to the ODE with initial condition. However, in practice, it is often easier to use the following heuristic (not entirely rigorous) procedure. In the end, in most cases, one can easily check by differentiation that the function found is, indeed, a solution to the ODE with initial condition. However, one does not know uniqueness without further investigation (general results such as Th. 3.15 below can often help). One also has to determine on which interval the found solution is defined. On the other hand, as one is usually interested in choosing the interval as large as possible, the optimal choice is not always obvious when using Th. 2.7, either. The heuristic procedure is as follows: Start with the ODE (2.14) written in the form

    dy/dx = f(x) g(y). (2.27a)

Multiply by dx and divide by g(y) (i.e. separate the variables):

    dy/g(y) = f(x) dx. (2.27b)

Integrate:

    ∫ dy/g(y) = ∫ f(x) dx. (2.27c)

Change the integration variables and supply the appropriate upper and lower limits for the integrals (according to the initial condition):

    ∫_{y₀}^{y} dt/g(t) = ∫_{x₀}^{x} f(t) dt. (2.27d)

Solve this equation for y, set φ(x) := y, check by differentiation that φ is, indeed, a solution to the ODE, and determine the largest interval I′ such that x₀ ∈ I′ and such that φ is defined on I′. The use of this heuristic procedure is demonstrated by the following example:

Example 2.9. Consider the ODE

    y′ = −y² on I × J := R × R (2.28)

with the initial condition y(x₀) = y₀ for given values x₀, y₀ ∈ R. We manipulate (2.28) according to the heuristic procedure described in (2.27) above:

    dy/dx = −y²,   −y⁻² dy = dx,   −∫ y⁻² dy = ∫ dx,
    −∫_{y₀}^{y} t⁻² dt = [1/t]_{y₀}^{y} = 1/y − 1/y₀ = ∫_{x₀}^{x} dt = [t]_{x₀}^{x} = x − x₀,
    φ(x) = y = y₀ / (1 + (x − x₀) y₀). (2.29)

Clearly, φ(x₀) = y₀.
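The formula (2.29) can also be sanity-checked numerically before verifying it by differentiation. The following sketch is our own and not part of the notes (the names are hypothetical): for sample parameters, a central difference quotient of φ agrees with −φ².

```python
# Our own numerical counterpart (hypothetical names) of the differentiation
# check for (2.29): phi' should agree with -phi**2, i.e. with (2.28).
def phi(x, x0, y0):
    return y0 / (1.0 + (x - x0) * y0)

def residual(x, x0, y0, h=1e-6):
    """|difference quotient of phi at x  -  (-phi(x)**2)|."""
    dq = (phi(x + h, x0, y0) - phi(x - h, x0, y0)) / (2.0 * h)
    return abs(dq + phi(x, x0, y0) ** 2)

x0, y0 = 0.0, 1.0
assert phi(x0, x0, y0) == y0   # the initial condition of (2.28)
print(max(residual(x, x0, y0) for x in (-0.5, 0.0, 0.7, 2.0)))
```

Of course, such a check samples only finitely many points away from the pole of φ; the rigorous verification is the differentiation below.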
Moreover,

    φ′(x) = −y₀² / (1 + (x − x₀) y₀)² = −φ(x)², (2.30)

i.e. φ does, indeed, provide a solution to (2.28). If y₀ = 0, then φ ≡ 0 is defined on the entire interval I = R. If y₀ ≠ 0, then the denominator of φ(x) has a zero at x = (x₀y₀ − 1)/y₀, and φ is not defined on all of R. In that case, if y₀ > 0, then x₀ > (x₀y₀ − 1)/y₀ = x₀ − 1/y₀ and the maximal open interval for φ to be defined on is I′ = ]x₀ − 1/y₀, ∞[; if y₀ < 0, then x₀ < (x₀y₀ − 1)/y₀ = x₀ − 1/y₀ and the maximal open interval for φ to be defined on is I′ = ]−∞, x₀ − 1/y₀[. Note that the formula for φ obtained by (2.29) works for y₀ = 0 as well, even though not every previous expression in (2.29) is meaningful for y₀ = 0 and, also, Th. 2.7 does not apply to (2.28) for y₀ = 0. In the present example, the subsequent Th. 3.15 does, indeed, imply φ to be the unique solution to the initial value problem on I′.

2.4 Change of Variables

To solve an ODE, it can be useful to transform it into an equivalent ODE, using a so-called change of variables. If one already knows how to solve the transformed ODE, then the equivalence allows one to also solve the original ODE. We first present the following Th. 2.10, which constitutes the basis for the change of variables technique, followed by examples where the technique is applied.

Theorem 2.10. Let G ⊆ R × Kⁿ be open, n ∈ N, f : G → Kⁿ, and (x₀, y₀) ∈ G. Define

    ∀ x ∈ R:  G_x := { y ∈ Kⁿ : (x, y) ∈ G } (2.31)

and assume the change of variables function T : G → Kⁿ is differentiable and such that

    ∀ x with G_x ≠ ∅:  T_x := T(x, ·) : G_x → T_x(G_x),  T_x(y) := T(x, y),  is a diffeomorphism, (2.32)

i.e. T_x is invertible and both T_x and T_x⁻¹ are differentiable.
Then the first-order initial value problems

    y′ = f(x, y), (2.33a)
    y(x₀) = y₀, (2.33b)

and

    y′ = (DT_x⁻¹(y))⁻¹ f(x, T_x⁻¹(y)) + ∂ₓT(x, T_x⁻¹(y)), (2.34a)
    y(x₀) = T(x₀, y₀), (2.34b)

are equivalent in the following sense:

(a) A differentiable function φ : I → Kⁿ, where I ⊆ R is a nontrivial interval, is a solution to (2.33a) if, and only if, the function

    µ : I → Kⁿ,  µ(x) := T_x(φ(x)) = T(x, φ(x)), (2.35)

is a solution to (2.34a).

(b) A differentiable function φ : I → Kⁿ, where I ⊆ R is a nontrivial interval, is a solution to (2.33) if, and only if, the function of (2.35) is a solution to (2.34).

Proof. We start by noting that the assumption of G being open clearly implies each G_x, x ∈ R, to be open as well, which, in turn, implies T_x(G_x) to be open, even though this is not as obvious¹. Next, for each x ∈ R such that G_x ≠ ∅, we can apply the chain rule [Phi16b, Th. 4.31] to T_x ∘ T_x⁻¹ = Id to obtain

    ∀ y ∈ T_x(G_x):  DT_x(T_x⁻¹(y)) ∘ DT_x⁻¹(y) = Id (2.36)

and, thus, each DT_x⁻¹(y) is invertible with

    ∀ y ∈ T_x(G_x):  (DT_x⁻¹(y))⁻¹ = DT_x(T_x⁻¹(y)). (2.37)

Consider φ and µ as in (a) and notice that (2.35) implies

    ∀ x ∈ I:  φ(x) = T_x⁻¹(µ(x)). (2.38)

Moreover, the differentiability of φ and T implies differentiability of µ by the chain rule, which also yields

    ∀ x ∈ I:  µ′(x) = DT(x, φ(x)) (1, φ′(x)) = DT_x(φ(x)) φ′(x) + ∂ₓT(x, φ(x)). (2.39)

To prove (a), first assume φ : I → Kⁿ to be a solution of (2.33a). Then, for each x ∈ I,

    µ′(x) = DT_x(φ(x)) f(x, φ(x)) + ∂ₓT(x, φ(x))    (by (2.39), (2.33a))
          = DT_x(T_x⁻¹(µ(x))) f(x, T_x⁻¹(µ(x))) + ∂ₓT(x, T_x⁻¹(µ(x)))    (by (2.38))
          = (DT_x⁻¹(µ(x)))⁻¹ f(x, T_x⁻¹(µ(x))) + ∂ₓT(x, T_x⁻¹(µ(x)))    (by (2.37)), (2.40)

showing µ satisfies (2.34a). Conversely, assume µ to be a solution to (2.34a). Then, for each x ∈ I,

    (DT_x⁻¹(µ(x)))⁻¹ f(x, T_x⁻¹(µ(x))) + ∂ₓT(x, T_x⁻¹(µ(x))) = µ′(x) = DT_x(φ(x)) φ′(x) + ∂ₓT(x, φ(x))    (by (2.34a), (2.39)). (2.41)

¹ If T_x is a continuously differentiable map, then this is related to the inverse function theorem (see, e.g., [Phi16b, Cor.
4.51]); it is still true if T_x is merely continuous and injective, but then it rests on the invariance of domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer fixed-point theorem [Oss09, 5.6.10] and is much harder to prove.

Using (2.38), one can subtract the second summand from both sides of (2.41). Multiplying the result by DT_x^{-1}(μ(x)) from the left and taking into account (2.37) then provides

  φ′(x) = f(x, T_x^{-1}(μ(x))) = f(x, φ(x))  for each x ∈ I    (2.42)

(the last equality by (2.38)), showing φ satisfies (2.33a).

It remains to prove (b). If φ satisfies (2.33), then μ satisfies (2.34a) by (a). Moreover, μ(x_0) = T(x_0, φ(x_0)) = T(x_0, y_0), i.e. μ satisfies (2.34b) as well. Conversely, assume μ satisfies (2.34). Then φ satisfies (2.33a) by (a). Moreover, by (2.38), φ(x_0) = T_{x_0}^{-1}(μ(x_0)) = T_{x_0}^{-1}(T(x_0, y_0)) = y_0, showing φ satisfies (2.33b) as well. ∎

As a first application of Th. 2.10, we prove the following theorem about so-called Bernoulli differential equations:

Theorem 2.11. Consider the Bernoulli differential equation

  y′ = f(x, y) := a(x) y + b(x) y^α,    (2.43a)

where α ∈ R \ {0, 1}, the functions a, b : I → R are continuous and defined on an open interval I ⊆ R, and f : I × R^+ → R. For (2.43a), we add the initial condition

  y(x_0) = y_0,  (x_0, y_0) ∈ I × R^+,    (2.43b)

and, furthermore, we also consider the corresponding linear initial value problem

  y′ = (1 − α) (a(x) y + b(x)),    (2.44a)
  y(x_0) = y_0^{1−α},    (2.44b)

with its unique solution ψ : I → R given by Th. 2.3.

(a) Uniqueness: On each open interval I′ ⊆ I satisfying x_0 ∈ I′ and ψ > 0 on I′, the Bernoulli initial value problem (2.43) has a unique solution. This unique solution is given by

  φ : I′ → R^+,  φ(x) := (ψ(x))^{1/(1−α)}.    (2.45)

(b) Existence: There exists an open interval I′ ⊆ I satisfying x_0 ∈ I′ and ψ > 0 on I′, i.e. an I′ such that (a) applies.

Proof. (b) is immediate from Th. 2.3, since ψ(x_0) = y_0^{1−α} > 0 and ψ is continuous.

To prove (a), we apply Th.
2.10 with the change of variables

  T : I × R^+ → R^+,  T(x, y) := y^{1−α}.    (2.46)

Then T ∈ C¹(I × R^+, R) with ∂_x T ≡ 0 and ∂_y T(x, y) = (1 − α) y^{−α}. Moreover,

  T_x = S  for each x ∈ I,  where  S : R^+ → R^+,  S(y) := y^{1−α},    (2.47)

which is differentiable with the differentiable inverse function S^{-1} : R^+ → R^+, S^{-1}(y) = y^{1/(1−α)}, DS^{-1}(y) = (S^{-1})′(y) = (1/(1−α)) y^{α/(1−α)}. Thus, (2.34a) takes the form

  y′ = (DT_x^{-1}(y))^{-1} f(x, T_x^{-1}(y)) + ∂_x T(x, T_x^{-1}(y))
     = (1 − α) y^{−α/(1−α)} (a(x) y^{1/(1−α)} + b(x) y^{α/(1−α)}) + 0
     = (1 − α) (a(x) y + b(x)).    (2.48)

Thus, if I′ ⊆ I is such that x_0 ∈ I′ and ψ > 0 on I′, then Th. 2.10 says φ defined by (2.45) must be a solution to (2.43) (note that the differentiability of ψ implies the differentiability of φ). On the other hand, if λ : I′ → R^+ is an arbitrary solution to (2.43), then Th. 2.10 states μ := S ∘ λ = λ^{1−α} to be a solution to (2.44). The uniqueness part of Th. 2.3 then yields λ^{1−α} = ψ↾_{I′} = φ^{1−α}, i.e. λ = φ. ∎

Example 2.12. Consider the initial value problem

  y′ = f(x, y) := i − 1/(ix − y + 2),    (2.49a)
  y(1) = i,    (2.49b)

where f : G → C, G := {(x, y) ∈ R × C : ix − y + 2 ≠ 0} (G is open as the continuous preimage of the open set C \ {0}). We apply the change of variables

  T : G → C,  T(x, y) := ix − y.    (2.50)

Then, T ∈ C¹(G, C). Moreover, for each x ∈ R,

  G_x = {y ∈ C : (x, y) ∈ G} = C \ {ix + 2}    (2.51)

and we have the diffeomorphisms

  T_x : C \ {ix + 2} → C \ {−2},  T_x(y) = ix − y,    (2.52a)
  T_x^{-1} : C \ {−2} → C \ {ix + 2},  T_x^{-1}(y) = ix − y.    (2.52b)

To obtain the transformed equation, we compute the right-hand side of (2.34a):

  (DT_x^{-1}(y))^{-1} f(x, T_x^{-1}(y)) + ∂_x T(x, T_x^{-1}(y)) = (−1) · (i − 1/(y + 2)) + i = 1/(y + 2).    (2.53)

Thus, the transformed initial value problem is

  y′ = 1/(y + 2),    (2.54a)
  y(1) = T(1, i) = i − i = 0.    (2.54b)

Using separation of variables, one finds the solution

  μ : ]−1, ∞[ → ]−2, ∞[,  μ(x) := √(2x + 2) − 2,    (2.55)

to (2.54). Then Th.
2.10 implies that

  φ : ]−1, ∞[ → C,  φ(x) := T_x^{-1}(μ(x)) = ix − √(2x + 2) + 2,    (2.56)

is a solution to (2.49) (that φ is a solution to (2.49) can now also easily be checked directly). It will become clear from Th. 3.15 below that φ and μ are also the unique solutions to their respective initial value problems.

—

Finding a suitable change of variables to transform a given ODE such that one is in a position to solve the transformed ODE is an art, i.e. it can be very difficult to spot a useful transformation, and it takes a lot of practice and experience.

Remark 2.13. Somewhat analogous to the situation described in the paragraph before (2.27) regarding the separation of variables technique, in practice, one frequently uses a heuristic procedure to apply a change of variables, rather than appealing to the rigorous Th. 2.10. For the initial value problem y′ = f(x, y), y(x_0) = y_0, this heuristic procedure proceeds as follows:

(1) One introduces the new variable z := T(x, y) and then computes z′, i.e. the derivative of the function x ↦ z(x) = T(x, y(x)).

(2) In the result of (1), one eliminates all occurrences of the variable y by first replacing y′ by f(x, y) and then replacing y by T_x^{-1}(z), where T_x(y) := T(x, y) = z (i.e. one has to solve the equation z = T(x, y) for y). One thereby obtains the transformed initial value problem z′ = g(x, z), z(x_0) = T(x_0, y_0), with a suitable function g.

(3) One solves the transformed initial value problem to obtain a solution μ, and then x ↦ φ(x) := T_x^{-1}(μ(x)) yields a candidate for a solution to the original initial value problem.

(4) One checks that φ is, indeed, a solution to y′ = f(x, y), y(x_0) = y_0.

Example 2.14. Consider

  f : R^+ × R → R,  f(x, y) := 1 + y/x + y²/x²,    (2.57)

and the initial value problem

  y′ = f(x, y),  y(1) = 0.    (2.58)

We introduce the change of variables z := T(x, y) := y/x and proceed according to the steps of Rem. 2.13.
According to (1), we compute, using the quotient rule,

  z′(x) = (y′(x) x − y(x)) / x².    (2.59)

According to (2), we replace y′(x) by f(x, y) and then replace y by T_x^{-1}(z) = xz to obtain the transformed initial value problem

  z′ = (1/x) (1 + y/x + y²/x²) − y/x² = (1/x) (1 + z + z²) − z/x = (1 + z²)/x,  z(1) = 0/1 = 0.    (2.60)

According to (3), we next solve (2.60), e.g. by separation of variables, to obtain the solution

  μ : ]e^{−π/2}, e^{π/2}[ → R,  μ(x) := tan ln x,    (2.61)

of (2.60), and

  φ : ]e^{−π/2}, e^{π/2}[ → R,  φ(x) := x μ(x) = x tan ln x,    (2.62)

as a candidate for a solution to (2.58). Finally, according to (4), we check that φ is, indeed, a solution to (2.58): Due to φ(1) = 1 · tan 0 = 0, φ satisfies the initial condition, and due to

  φ′(x) = tan ln x + x · (1/x) (1 + tan² ln x) = 1 + tan ln x + tan² ln x = 1 + φ(x)/x + φ²(x)/x²,    (2.63)

φ satisfies the ODE.

3 General Theory

3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE

It turns out that each one-dimensional kth-order ODE is equivalent to a system of k first-order ODE; more generally, each n-dimensional kth-order ODE is equivalent to a kn-dimensional first-order ODE (i.e. to a system of kn one-dimensional first-order ODE). Even though, in this class, we will mainly consider explicit ODE, we provide the equivalence also for the implicit case, as the proof is essentially the same (the explicit case is included as a special case).

Theorem 3.1. In the situation of Def. 1.2(a), i.e. U ⊆ R × K^{(k+1)n} and F : U → K^n, plus (x_0, y_{0,0}, ..., y_{0,k−1}) ∈ R × K^{kn}, consider the kth-order initial value problem

  F(x, y, y′, ..., y^{(k)}) = 0,    (3.1a)
  y^{(j)}(x_0) = y_{0,j}  for each j ∈ {0, ..., k−1},    (3.1b)

and the first-order initial value problem

  y_1′ − y_2 = 0,
  y_2′ − y_3 = 0,
    ⋮
  y_{k−1}′ − y_k = 0,
  F(x, y_1, ..., y_k, y_k′) = 0,    (3.2a)

  y(x_0) = (y_{0,0}, ..., y_{0,k−1})    (3.2b)

(note that the unknown function y in (3.1) is K^n-valued, whereas the unknown function y in (3.2) is K^{kn}-valued).
Then both initial value problems are equivalent in the following sense:

(a) If φ : I → K^n is a solution to (3.1), then

  Φ : I → K^{kn},  Φ := (φ, φ′, ..., φ^{(k−1)}),    (3.3)

is a solution to (3.2).

(b) If Φ : I → K^{kn} is a solution to (3.2), then φ := Φ_1 (which is K^n-valued) is a solution to (3.1).

Proof. We rewrite (3.2a) as

  G(x, y, y′) = 0,    (3.4)

where G : V → K^{kn},

  V := {(x, y, z) ∈ R × K^{kn} × K^{kn} : (x, y, z_k) ∈ U} ⊆ R × K^{kn} × K^{kn},
  G_1(x, y, z) := z_1 − y_2,
  G_2(x, y, z) := z_2 − y_3,
    ⋮
  G_{k−1}(x, y, z) := z_{k−1} − y_k,
  G_k(x, y, z) := F(x, y, z_k).    (3.5)

(a): As a solution to (3.1), φ is k times differentiable and Φ is well-defined. Then (3.1b) implies (3.2b), since

  Φ(x_0) = (φ(x_0), φ′(x_0), ..., φ^{(k−1)}(x_0)) = (y_{0,0}, ..., y_{0,k−1})  by (3.1b).

Next, Def. 1.2(a)(i) for φ implies Def. 1.2(a)(i) for Φ, since

  {(x, Φ(x), Φ′(x)) ∈ I × K^{kn} × K^{kn} : x ∈ I}
  = {(x, φ(x), φ′(x), ..., φ^{(k−1)}(x), φ′(x), ..., φ^{(k)}(x)) ∈ I × K^{kn} × K^{kn} : x ∈ I} ⊆ V,    (3.6)

using (3.3) and (x, φ(x), ..., φ^{(k)}(x)) ∈ U. The definition of Φ in (3.3) implies

  Φ_j′ = (φ^{(j−1)})′ = φ^{(j)} = Φ_{j+1}  for each j ∈ {1, ..., k−1},

showing Φ satisfies the first k − 1 equations of (3.2a). As

  Φ_k′(x) = (φ^{(k−1)})′(x) = φ^{(k)}(x)

and, thus,

  G_k(x, Φ(x), Φ′(x)) = F(x, φ(x), φ′(x), ..., φ^{(k)}(x)) = 0  for each x ∈ I  by (3.1a),    (3.7)

Φ also satisfies the last equation of (3.2a).

(b): As Φ is a solution to (3.2), the first k − 1 equations of (3.2a) imply

  Φ_{j+1} = Φ_j′ = φ^{(j)}  for each j ∈ {1, ..., k−1}  (using φ = Φ_1),

i.e. φ is k times differentiable and Φ has, once again, the form (3.3) (note Φ_1 = φ by the definition of φ). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for Φ implies Def. 1.2(a)(i) for φ:

  {(x, Φ(x), Φ′(x)) ∈ I × K^{kn} × K^{kn} : x ∈ I} ⊆ V

and the definition of V in (3.5) imply

  {(x, φ(x), ..., φ^{(k)}(x)) ∈ I × K^{(k+1)n} : x ∈ I} ⊆ U.

Finally, from the last equation of (3.2a), one obtains, for each x ∈ I,

  F(x, φ(x), ..., φ^{(k)}(x)) = G_k(x, Φ(x), Φ′(x)) = 0  by (3.3), (3.5),

proving φ satisfies (3.1a). ∎

Example 3.2.
The second-order initial value problem

  y′′ = −y,  y(0) = 0,  y′(0) = r,  r ∈ R given,    (3.8)

is equivalent to the following system of two first-order ODE:

  y_1′ = y_2,
  y_2′ = −y_1,  y(0) = (0, r).    (3.9)

The solution to (3.9) is

  Φ : R → R²,  Φ(x) = (Φ_1(x), Φ_2(x)) = (r sin x, r cos x),    (3.10)

and, thus, the solution to (3.8) is

  φ : R → R,  φ(x) = r sin x.    (3.11)

—

As a consequence of Th. 3.1, one can carry out much of the general theory of ODE (such as results regarding existence and uniqueness of solutions) for systems of first-order ODE, obtaining the corresponding results for higher-order ODE as a corollary. This is the strategy usually pursued in the literature and we will follow suit in this class.

3.2 Existence of Solutions

It is a rather remarkable fact that, under the very mild assumption that f : G → K^n is a continuous function defined on an open subset G of R × K^{kn} with (x_0, y_{0,0}, ..., y_{0,k−1}) ∈ G, every initial value problem (1.7) for the n-dimensional explicit kth-order ODE (1.6) has at least one solution φ : I → K^n, defined on a, possibly very small, open interval. This is the content of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b), we already know that uniqueness of the solution cannot be expected without stronger hypotheses.

The proof of the Peano theorem requires some work. One of the key ingredients is the Arzelà-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequence of continuous functions to have a uniformly convergent subsequence (the formulation in Th. 3.7 is suitable for our purposes – many different variants of the Arzelà-Ascoli theorem exist in the literature). We begin with some preliminaries from the theory of metric spaces. At this point, the reader might want to review the definition of a metric, a metric space, and basic notions on metric spaces, such as the notion of compactness and the notion of continuity of functions between metric spaces.
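As a small computational aside (an illustration only, not part of the formal development): the standard sequence f_m(x) = x^m on [0, 1], a textbook example not taken from these notes, converges pointwise but not uniformly, which is precisely the distinction formalized in Def. 3.4 below. The following plain-Python sketch approximates the supremum distance on a finite grid; the grid size and the sampled values of m are illustrative choices.

```python
# Illustrative example (not from the notes): f_m(x) = x^m on [0, 1]
# converges pointwise to f(x) = 0 for x < 1 and f(1) = 1, but the
# distance sup_x |f_m(x) - f(x)| does not tend to 0, so the convergence
# is not uniform. A finite grid only approximates the supremum.

def f_m(m, x):
    return x ** m

def f_limit(x):
    return 1.0 if x == 1.0 else 0.0

grid = [j / 1000 for j in range(1001)]  # sample points in [0, 1]

for m in (1, 10, 100, 1000):
    sup_dist = max(abs(f_m(m, x) - f_limit(x)) for x in grid)
    print(m, sup_dist)  # stays bounded away from 0 for these m
```

On the full interval the supremum in fact equals 1 for every m (take x close to 1), so no N as in Def. 3.4 can exist for ε < 1; the same family also fails to be uniformly equicontinuous near x = 1, matching the hypothesis that the Arzelà-Ascoli Th. 3.7 imposes.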
Also recall that every normed space is a metric space via the metric induced by the norm (in particular, if we use metric notions on normed spaces, they are always meant with respect to the respective induced metric).

Notation 3.3. Let (X, d) be a metric space. Given x ∈ X and r ∈ R^+, let

  B_r(x) := {y ∈ X : d(x, y) < r}

denote the open ball with center x and radius r.

Definition 3.4. Let (X, d_X) and (Y, d_Y) be metric spaces. We say a sequence of functions (f_m)_{m∈N}, f_m : X → Y, converges uniformly to a function f : X → Y if, and only if,

  for each ε > 0, there exists N ∈ N such that, for each m ≥ N and each x ∈ X:  d_Y(f_m(x), f(x)) < ε.

Theorem 3.5. Let (X, d_X) and (Y, d_Y) be metric spaces. If the sequence (f_m)_{m∈N} of continuous functions f_m : X → Y converges uniformly to the function f : X → Y, then f is continuous as well.

Proof. We have to show that f is continuous at every ξ ∈ X. Thus, let ξ ∈ X and ε > 0. Due to the uniform convergence, we can choose m ∈ N such that d_Y(f_m(x), f(x)) < ε/3 for every x ∈ X. Moreover, as f_m is continuous at ξ, there exists δ > 0 such that x ∈ B_δ(ξ) implies d_Y(f_m(ξ), f_m(x)) < ε/3. Thus, if x ∈ B_δ(ξ), then

  d_Y(f(ξ), f(x)) ≤ d_Y(f(ξ), f_m(ξ)) + d_Y(f_m(ξ), f_m(x)) + d_Y(f_m(x), f(x)) < ε/3 + ε/3 + ε/3 = ε,

proving f is continuous at ξ. ∎

Definition 3.6. Let (X, d_X) and (Y, d_Y) be metric spaces and let F be a set of functions from X into Y. Then the set F (or the functions in F) are said to be uniformly equicontinuous if, and only if, for each ε > 0, there is δ > 0 such that

  for each x, ξ ∈ X and each f ∈ F:  d_X(x, ξ) < δ  ⇒  d_Y(f(x), f(ξ)) < ε.    (3.12)

Theorem 3.7 (Arzelà-Ascoli). Let n ∈ N, let ‖·‖ denote some norm on K^n, and let I ⊆ R be some bounded interval. If (f_m)_{m∈N} is a sequence of functions f_m : I → K^n such that {f_m : m ∈ N} is uniformly equicontinuous and such that, for each x ∈ I, the sequence (f_m(x))_{m∈N} is bounded, then (f_m)_{m∈N} has a uniformly convergent subsequence (f_{m_j})_{j∈N}, i.e. there exists f : I → K^n such that

  for each ε > 0, there exists N ∈ N such that, for each j ≥ N and each x ∈ I:  ‖f_{m_j}(x) − f(x)‖ < ε.

In particular, the limit function f is continuous.

Proof. Let (r_1, r_2, ...
) be an enumeration of the set of rational numbers in I, i.e. of Q ∩ I. Inductively, we construct a sequence (F_m)_{m∈N} of subsequences of (f_m)_{m∈N}, F_m = (f_{m,k})_{k∈N}, such that

(i) for each m ∈ N, F_m is a subsequence of each F_j with j ∈ {1, ..., m},

(ii) for each m ∈ N, F_m converges pointwise at each of the first m rational numbers r_j; more precisely, there exists a sequence (z_1, z_2, ...) in K^n such that, for each m ∈ N and each j ∈ {1, ..., m}:

  lim_{k→∞} f_{m,k}(r_j) = z_j.

Actually, we construct the (z_m)_{m∈N} inductively together with the (F_m)_{m∈N}: Since the sequence (f_m(r_1))_{m∈N} is, by hypothesis, a bounded sequence in K^n, one can apply the Bolzano-Weierstrass theorem (cf. [Phi16b, Th. 1.31]) to obtain z_1 ∈ K^n and a subsequence F_1 = (f_{1,k})_{k∈N} of (f_m)_{m∈N} such that lim_{k→∞} f_{1,k}(r_1) = z_1. To proceed by induction, we now assume to have already constructed F_1, ..., F_M and z_1, ..., z_M for M ∈ N such that (i) and (ii) hold for each m ∈ {1, ..., M}. Since the sequence (f_{M,k}(r_{M+1}))_{k∈N} is a bounded sequence in K^n, one can, once more, apply the Bolzano-Weierstrass theorem to obtain z_{M+1} ∈ K^n and a subsequence F_{M+1} = (f_{M+1,k})_{k∈N} of F_M such that lim_{k→∞} f_{M+1,k}(r_{M+1}) = z_{M+1}. Since F_{M+1} is a subsequence of F_M, it is also a subsequence of all previous subsequences, i.e. (i) now also holds for m = M + 1. In consequence, lim_{k→∞} f_{M+1,k}(r_j) = z_j for each j = 1, ..., M + 1, such that (ii) now also holds for m = M + 1 as required.

Next, one considers the diagonal sequence (g_m)_{m∈N}, g_m := f_{m,m}, and observes that this sequence converges pointwise at each rational number r_j (lim_{m→∞} g_m(r_j) = z_j), since, at least for m ≥ j, (g_m)_{m≥j} is a subsequence of every F_j (exercise) – in particular, (g_m)_{m∈N} is also a subsequence of the original sequence (f_m)_{m∈N}.

In the last step of the proof, we show that (g_m)_{m∈N} converges uniformly on the entire interval I to some f : I → K^n. To this end, fix ε > 0.
Since {g_m : m ∈ N} ⊆ {f_m : m ∈ N}, the assumed uniform equicontinuity of {f_m : m ∈ N} yields δ > 0 such that

  for each x, ξ ∈ I and each m ∈ N:  |x − ξ| < δ  ⇒  ‖g_m(x) − g_m(ξ)‖ < ε/3.

Since I is bounded, it has finite length and, thus, it can be covered with finitely many intervals I_1, ..., I_N, I = ⋃_{j=1}^{N} I_j, N ∈ N, such that each I_j has length less than δ. Moreover, since Q is dense in R, for each j ∈ {1, ..., N}, there exists k(j) ∈ N such that r_{k(j)} ∈ I_j. Define M := max{k(j) : j = 1, ..., N}. We note that each of the finitely many sequences (g_m(r_1))_{m∈N}, ..., (g_m(r_M))_{m∈N} is a Cauchy sequence. Thus, there exists K ∈ N such that

  for each k, l ≥ K and each α = 1, ..., M:  ‖g_k(r_α) − g_l(r_α)‖ < ε/3.    (3.13)

We now consider an arbitrary x ∈ I and k, l ≥ K. Let j ∈ {1, ..., N} such that x ∈ I_j. Then r_{k(j)} ∈ I_j, |r_{k(j)} − x| < δ, and the estimate in (3.13) holds for α = k(j). In consequence, we obtain, for each k, l ≥ K, the crucial estimate

  ‖g_k(x) − g_l(x)‖ ≤ ‖g_k(x) − g_k(r_{k(j)})‖ + ‖g_k(r_{k(j)}) − g_l(r_{k(j)})‖ + ‖g_l(r_{k(j)}) − g_l(x)‖ < ε/3 + ε/3 + ε/3 = ε.    (3.14)

The estimate (3.14) shows (g_m(x))_{m∈N} is a Cauchy sequence for each x ∈ I, and we can define

  f : I → K^n,  f(x) := lim_{m→∞} g_m(x).    (3.15)

Since K in (3.14) does not depend on x ∈ I, passing to the limit k → ∞ in the estimate of (3.14) implies

  for each l ≥ K and each x ∈ I:  ‖g_l(x) − f(x)‖ ≤ ε,

proving uniform convergence of the subsequence (g_m)_{m∈N} of (f_m)_{m∈N} as desired. The continuity of f is now a consequence of Th. 3.5. ∎

At this point, we have all preparations in place to state and prove the existence theorem.

Theorem 3.8 (Peano). If G ⊆ R × K^n is open, n ∈ N, and f : G → K^n is continuous, then, for each (x_0, y_0) ∈ G, the explicit n-dimensional first-order initial value problem

  y′ = f(x, y),    (3.16a)
  y(x_0) = y_0,    (3.16b)

has at least one solution.
More precisely, given an arbitrary norm ‖·‖ on K^n, (3.16) has a solution φ : I → K^n, defined on the open interval

  I := ]x_0 − α, x_0 + α[,    (3.17)

α = α(b) > 0, where b > 0 is such that

  B := {(x, y) ∈ R × K^n : |x − x_0| ≤ b and ‖y − y_0‖ ≤ b} ⊆ G,    (3.18)
  M := M(b) := max{‖f(x, y)‖ : (x, y) ∈ B} < ∞,    (3.19)

and

  α := α(b) := min{b, b/M}  for M > 0,    α := α(b) := b  for M = 0.    (3.20)

In general, the choice of the norm ‖·‖ on K^n will influence the possible sizes of α and, thus, of I.

Proof. The proof will be conducted in several steps.

In the first step, we check α = α(b) > 0 is well-defined: Since G is open, there always exists b > 0 such that (3.18) holds. Since B is a closed and bounded subset of the finite-dimensional space R × K^n, B is compact (cf. [Phi16b, Cor. 3.16(c)]). Since f and, thus, ‖f‖ is continuous (every norm is even Lipschitz continuous due to the inverse triangle inequality), it must assume its maximum on the compact set B (cf. [Phi16b, Th. 3.19]), showing M ∈ R_0^+ is well-defined by (3.19) and α is well-defined by (3.20).

In the second step of the proof, we note that it suffices to prove (3.16) has a solution φ_+, defined on [x_0, x_0 + α[: One can then apply the time reversion Lem. 1.9(b): The proof providing the solution φ_+ also provides a solution ψ_+ : [−x_0, −x_0 + α[ → K^n to the time-reversed initial value problem, consisting of y′ = −f(−x, y) and y(−x_0) = y_0 (note that the same M and α work for the time-reversed problem). Then, according to Lem. 1.9(b), φ_− : ]x_0 − α, x_0] → K^n, φ_−(x) := ψ_+(−x), is a solution to (3.16). According to Lem. 1.7, we can patch φ_− and φ_+ together to obtain the desired solution

  φ : I → K^n,  φ(x) := φ_−(x) for x ≤ x_0,  φ(x) := φ_+(x) for x ≥ x_0,    (3.21)

defined on all of I. It is noted that one can also conduct the proof with the second step omitted, but then one has to perform the following steps on all of I, which means one has to consider additional cases in some places.
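The third step of the proof, which follows, constructs the approximating functions φ_m as Euler polygons via the recursion (3.26). As a brief numerical aside (an illustration only, not part of the proof), here is a minimal Python sketch of that construction, applied to the scalar initial value problem y′ = −y², y(0) = 1 from the example around (2.28)–(2.30), whose exact solution is 1/(1 + x); the equidistant step size h is an illustrative choice, playing the role of x_{j+1} − x_j.

```python
# Euler polygon (cf. (3.26)): phi(x_{j+1}) = phi(x_j) + (x_{j+1} - x_j) * f(x_j, phi(x_j)).
# Illustrative IVP: y' = -y^2, y(0) = 1, with exact solution 1/(1 + x).

def euler(f, x0, y0, h, steps):
    xs, ys = [x0], [y0]
    for _ in range(steps):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))  # one Euler step
        xs.append(xs[-1] + h)
    return xs, ys

f = lambda x, y: -y * y
xs, ys = euler(f, 0.0, 1.0, 0.01, 100)      # approximate on [0, 1]
print(abs(ys[-1] - 1.0 / (1.0 + xs[-1])))   # error at x = 1; shrinks with h
```

Refining the step size (e.g. h = 0.001 with 1000 steps) reduces the error, in line with the polygons approximating the solution ever better as the discretization is refined, though, as remarked below, Euler's method is not very efficient compared with more sophisticated schemes.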
In the third step of the proof, we will define a sequence (φ_m)_{m∈N} of functions

  φ_m : I_+ → K^n,  I_+ := [x_0, x_0 + α],    (3.22)

that constitute approximate solutions to (3.16). To begin the construction of φ_m, fix m ∈ N. Since B is compact and f is continuous, we know f is even uniformly continuous on B (cf. [Phi16b, Th. 3.20]). In particular, there exists δ_m > 0 such that

  for each (x, y), (x̃, ỹ) ∈ B:  |x − x̃| < δ_m, ‖y − ỹ‖ < δ_m  ⇒  ‖f(x, y) − f(x̃, ỹ)‖ < 1/m.    (3.23)

We now form what is called a discretization of the interval I_+, i.e. a partition of I_+ into sufficiently many small intervals: Let N ∈ N and

  x_0 < x_1 < ... < x_N = x_0 + α,    (3.24)
  x_j − x_{j−1} < β  for each j ∈ {1, ..., N},  where β := min{δ_m, δ_m/M, 1/m} for M > 0,  β := min{δ_m, 1/m} for M = 0    (3.25)

(for example, one could make the equidistant choice x_j := x_0 + jh with h := α/N and N > α/β, but it does not matter how the x_j are defined as long as (3.24) and (3.25) both hold). Note that we get a different discretization of I_+ for each m ∈ N; however, the dependence on m is suppressed in the notation for the sake of readability. We now define recursively

  φ_m : I_+ → K^n,
  φ_m(x_0) := y_0,    (3.26)
  φ_m(x) := φ_m(x_j) + (x − x_j) f(x_j, φ_m(x_j))  for each x ∈ [x_j, x_{j+1}].

Note that there is no conflict between the two definitions given for x = x_j with j ∈ {1, ..., N−1}. Each function φ_m defines a polygon in K^n. This construction is known as Euler's method and it can be used to obtain numerical approximations to the solution of the initial value problem (while simple, this method is not very efficient, though).

We still need to verify that the definition (3.26) does actually make sense: We need to check that f can, indeed, be applied to (x_j, φ_m(x_j)), i.e. we have to check (x_j, φ_m(x_j)) ∈ G. We can actually show the stronger statement

  (x, φ_m(x)) ∈ B  for each x ∈ I_+,    (3.27)

where B is as defined in (3.18). First, it is pointed out that (3.20) implies α ≤ b, such that x ∈ I_+ implies |x − x_0| ≤ α ≤ b as required in (3.18).
One can now prove (3.27) by showing, by induction on j ∈ {0, ..., N−1}:

  (x, φ_m(x)) ∈ B  for each x ∈ [x_j, x_{j+1}].    (3.28)

To start the induction, note φ_m(x_0) = y_0 and (x_0, y_0) ∈ B by (3.18). Now let j ∈ {0, ..., N−1} and x ∈ [x_j, x_{j+1}]. We estimate

  ‖φ_m(x) − y_0‖ ≤ ‖φ_m(x) − φ_m(x_j)‖ + Σ_{k=1}^{j} ‖φ_m(x_k) − φ_m(x_{k−1})‖
    = ‖(x − x_j) f(x_j, φ_m(x_j))‖ + Σ_{k=1}^{j} ‖(x_k − x_{k−1}) f(x_{k−1}, φ_m(x_{k−1}))‖    (by (3.26))
    ≤ (x − x_j) M + Σ_{k=1}^{j} (x_k − x_{k−1}) M = (x − x_0) M    (∗)
    ≤ α M ≤ b    (by (3.20)),    (3.29)

where, at (∗), it was used that (x_k, φ_m(x_k)) ∈ B by induction hypothesis for each k = 0, ..., j, and, thus, ‖f(x_k, φ_m(x_k))‖ ≤ M by (3.19). Estimate (3.29) completes the induction and the third step of the proof.

In the fourth step of the proof, we establish several properties of the functions φ_m. The first two properties are immediate from (3.26), namely that φ_m is continuous on I_+ and differentiable at each x ∈ ]x_j, x_{j+1}[, j ∈ {0, ..., N−1}, where

  φ_m′(x) = f(x_j, φ_m(x_j))  for each j ∈ {0, ..., N−1} and each x ∈ ]x_j, x_{j+1}[.    (3.30)

The next property to establish is

  ‖φ_m(t) − φ_m(s)‖ ≤ |t − s| M  for each s, t ∈ I_+.    (3.31)

To prove (3.31), we may assume s