
Ordinary Differential Equation

Alexander Grigorian University of Bielefeld Lecture Notes, April - July 2009

Contents

1 Introduction: the notion of ODEs and examples
  1.1 Separable ODE
  1.2 Linear ODE of 1st order
  1.3 Quasi-linear ODEs and differential forms
  1.4 Integrating factor
  1.5 Second order ODE
    1.5.1 Newton's second law
    1.5.2 Electrical circuit
  1.6 Higher order ODE and normal systems
  1.7 Some Analysis in Rⁿ
    1.7.1 Norms in Rⁿ
    1.7.2 Continuous mappings
    1.7.3 Linear operators and matrices

2 Linear equations and systems
  2.1 Existence of solutions for normal systems
  2.2 Existence of solutions for higher order linear ODE
  2.3 Space of solutions of linear ODEs
  2.4 Solving linear homogeneous ODEs with constant coefficients
  2.5 Solving linear inhomogeneous ODEs with constant coefficients
  2.6 Second order ODE with periodic right hand side
  2.7 The method of variation of parameters
    2.7.1 A system of the 1st order
    2.7.2 A scalar ODE of n-th order
  2.8 Wronskian and the Liouville formula
  2.9 Linear homogeneous systems with constant coefficients
    2.9.1 A simple case
    2.9.2 Functions of operators and matrices
    2.9.3 Jordan cells
    2.9.4 The Jordan normal form
    2.9.5 Transformation of an operator to a Jordan normal form

3 The initial value problem for general ODEs
  3.1 Existence and uniqueness
  3.2 Maximal solutions
  3.3 Continuity of solutions with respect to f(t, x)
  3.4 Continuity of solutions with respect to a parameter
  3.5 Differentiability of solutions in parameter

4 Qualitative analysis of ODEs
  4.1 Autonomous systems
  4.2 Stability for a linear system
  4.3 Lyapunov's theorems
  4.4 Zeros of solutions
  4.5 The Bessel equation

Lecture 1 14.04.09 Prof. A. Grigorian, ODE, SS 2009

1 Introduction: the notion of ODEs and examples

A general ordinary differential equation¹ (shortly, ODE) has the form

F(x, y, y', ..., y⁽ⁿ⁾) = 0,   (1.1)

where x ∈ R is an independent variable, y = y(x) is an unknown function, and F is a given function of n + 2 variables. The number n, the maximal order of the derivative in (1.1), is called the order of the ODE.

The equation (1.1) is called differential because it contains the derivatives² of the unknown function. It is called ordinary³ because the derivatives y⁽ᵏ⁾ are ordinary as opposed to partial derivatives. There is a theory of partial differential equations, where the unknown function depends on more than one variable and, hence, partial derivatives are involved, but this is a topic of another lecture course.

ODEs arise in many areas of Mathematics, as well as in the Sciences and Engineering. In most applications, one needs to find explicitly or numerically a solution y(x) of (1.1) satisfying some additional conditions. There are only a few types of ODEs where one can explicitly find all the solutions. However, for quite a general class of ODEs one can prove various properties of solutions without evaluating them, including existence of solutions, uniqueness, smoothness, etc.

In the Introduction we will be concerned with various examples and specific classes of ODEs of the first and second order, postponing the general theory to the next Chapters.

Consider the differential equation of the first order

y' = f(x, y),   (1.2)

where y = y(x) is the unknown real-valued function of a real argument x, and f(x, y) is a given function of two real variables.

Consider a couple (x, y) as a point in R² and assume that the function f is defined on a set D ⊂ R², which is called the domain⁴ of the function f and of the equation (1.2). Then the expression f(x, y) makes sense whenever (x, y) ∈ D.

Definition. A real-valued function y(x), defined on an interval I ⊂ R, is called a (particular) solution of (1.2) if

1. y(x) is differentiable at any x ∈ I,

2. the point (x, y(x)) belongs to D for any x ∈ I,

3. and the identity y'(x) = f(x, y(x)) holds for all x ∈ I.

The graph of a particular solution is called an integral curve of the equation. Obviously, any integral curve is contained in the domain D. The family of all particular solutions of (1.2) is called the general solution.

¹ Die gewöhnlichen Differentialgleichungen  ² Ableitung  ³ gewöhnlich  ⁴ Definitionsbereich

3 Here and below by an interval we mean any set of the form

(a, b)= x R : a

ln y = ∫ dx = x + C,

whence

y = e^C e^x = C₁ e^x,

where C₁ = e^C. Since C ∈ R is arbitrary, C₁ = e^C is any positive number. Hence, any positive solution y has the form

y = C₁ e^x,  C₁ > 0.

If y(x) < 0 for all x, then use

y'/y = (ln(−y))'

and obtain in the same way

y = −C₁ e^x,

⁵ unbestimmtes Integral  ⁶ Stammfunktion  ⁷ stetig

where C₁ > 0. Combining these two cases, we obtain that any solution y(x) that remains positive or negative has the form

y(x) = C e^x,

where C > 0 or C < 0. Clearly, C = 0 suits as well, since y ≡ 0 is a solution. The next plot contains the curves of such solutions:

[Figure: integral curves y = Ce^x for several values of C, plotted for x between −2 and 2]

Claim. The family of solutions y = Ce^x, C ∈ R, is the general solution of the ODE y' = y.

The constant C that parametrizes the solutions is referred to as a parameter. It is clear that distinct particular solutions are distinguished by the values of the parameter.

Proof. Let y(x) be a solution defined on an open interval I. Assume that y(x) takes a positive value somewhere in I, and let (a, b) be a maximal open interval where y(x) > 0. Then either (a, b) = I or one of the points a, b belongs to I, say a ∈ I and y(a) = 0. By the above argument, y(x) = Ce^x in (a, b), where C > 0. Since e^x ≠ 0, this solution does not vanish at a. Hence, the second alternative cannot take place, and we conclude that (a, b) = I, that is, y(x) = Ce^x in I. The same argument applies if y(x) < 0 for some x. Finally, if y(x) ≡ 0 then also y(x) = Ce^x with C = 0.
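For readers who want to double-check such computations, here is a minimal sketch in Python, assuming SymPy is available; it merely reproduces symbolically what was shown above and is not part of the argument.

    import sympy as sp

    x = sp.symbols('x')
    y = sp.Function('y')

    # general solution of y' = y
    print(sp.dsolve(sp.Eq(y(x).diff(x), y(x)), y(x)))   # y(x) = C1*exp(x)

    # any member of the family C*exp(x) satisfies the equation
    C = sp.symbols('C')
    candidate = C * sp.exp(x)
    print(sp.simplify(candidate.diff(x) - candidate))    # 0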

1.1 Separable ODE Consider a separable ODE, that is, an ODE of the form

y0 = f (x) g (y) . (1.3)

Any separable equation can be solved by means of the following theorem.

Theorem 1.1 (The method of separation of variables) Let f(x) and g(y) be continuous functions on open intervals I and J, respectively, and assume that g(y) ≠ 0 on J. Let F(x) be a primitive function of f(x) on I and G(y) be a primitive function of 1/g(y) on J.

Then a function y, defined on some subinterval of I, solves the differential equation (1.3) if and only if it satisfies the identity

G(y(x)) = F(x) + C,   (1.4)

for all x in the domain of y, where C is a real constant.

For example, consider again the ODE y' = y in the domain x ∈ R, y > 0. Then f(x) = 1 and g(y) = y ≠ 0, so that Theorem 1.1 applies. We have

F(x) = ∫ f(x) dx = ∫ dx = x

and

G(y) = ∫ dy/g(y) = ∫ dy/y = ln y,

where we do not write the constant of integration because we need only one primitive function. The equation (1.4) becomes

ln y = x + C,

whence we obtain y = C₁e^x as in the previous example. Note that Theorem 1.1 does not cover the case when g(y) may vanish, which must be analyzed separately when needed.

Proof. Let y(x) solve (1.3). Since g(y) ≠ 0, we can divide (1.3) by g(y), which yields

y'/g(y) = f(x).   (1.5)

Observe that by the hypothesis f(x) = F'(x) and 1/g(y) = G'(y), which implies by the chain rule

y'/g(y) = G'(y) y' = (G(y(x)))'.

Hence, the equation (1.3) is equivalent to

(G(y(x)))' = F'(x),   (1.6)

which implies (1.4). Conversely, if a function y satisfies (1.4) and is known to be differentiable in its domain, then differentiating (1.4) in x we obtain (1.6); arguing backwards, we arrive at (1.3). The only question that remains to be answered is why y(x) is differentiable. Since the function g(y) does not vanish, it is either positive or negative in the whole domain. Then the function G(y), whose derivative is 1/g(y), is either strictly increasing or strictly decreasing in the whole domain. In both cases, the inverse function G⁻¹ is defined and differentiable. It follows from (1.4) that

y(x) = G⁻¹(F(x) + C).   (1.7)

Since both F and G⁻¹ are differentiable, we conclude by the chain rule that y is also differentiable, which finishes the proof.

Corollary. Under the conditions of Theorem 1.1, for all x₀ ∈ I and y₀ ∈ J there exists a unique value of the constant C such that the solution y(x) of (1.3) defined by (1.7) satisfies the condition y(x₀) = y₀.

[Figure: the point (x₀, y₀) with x₀ ∈ I and y₀ ∈ J]

The condition y(x₀) = y₀ is called the initial condition⁸.

Proof. Setting x = x₀ and y = y₀ in (1.4), we obtain G(y₀) = F(x₀) + C, which allows us to determine the value of C uniquely, that is, C = G(y₀) − F(x₀). Let us prove that this value of C determines by (1.7) a solution y(x). The only problem is to check that the right hand side of (1.7) is defined on an interval containing x₀ (a priori it may happen that the composite function G⁻¹(F(x) + C) has empty domain). For x = x₀ the right hand side of (1.7) is

G⁻¹(F(x₀) + C) = G⁻¹(G(y₀)) = y₀,

so that the function y(x) is defined at x = x₀. Since both functions G⁻¹ and F + C are continuous and defined on open intervals, their composition is defined on an open set. Since this set contains x₀, it also contains an interval around x₀. Hence, the function y is defined on an interval around x₀, which finishes the proof.

One can rephrase the statement of the Corollary as follows: for all x₀ ∈ I and y₀ ∈ J there exists a unique solution⁹ y(x) of (1.3) that satisfies in addition the initial condition y(x₀) = y₀; that is, for every point (x₀, y₀) ∈ I × J there is exactly one integral curve of the ODE that goes through this point.

In applications of Theorem 1.1, it is necessary to find the functions F and G. Technically, it is convenient to combine the evaluation of F and G with the other computations as follows. The first step is always dividing (1.3) by g to obtain (1.5). Then integrate both sides in x to obtain

∫ (y'/g(y)) dx = ∫ f(x) dx.   (1.8)

Then we need to evaluate the integral in the right hand side. If F(x) is a primitive of f then we write

∫ f(x) dx = F(x) + C.

⁸ Anfangsbedingung
⁹ The domain of this solution is determined by (1.7) and is, hence, the maximal possible.

In the left hand side of (1.8), we have y' dx = dy. Hence, we can change variables in the integral, replacing the function y(x) by an independent variable y. We obtain

∫ (y'/g(y)) dx = ∫ dy/g(y) = G(y) + C.

Combining the above lines, we obtain the identity (1.4).

Lecture 2 20.04.2009 Prof. A. Grigorian, ODE, SS 2009

Assume that in the separable equation y' = f(x) g(y) the function g(y) vanishes at a sequence of points, say y₁, y₂, ..., enumerated in increasing order. Obviously, the constant function y(x) = y_k is a solution of this ODE. The method of separation of variables allows us to evaluate the solutions in any domain y ∈ (y_k, y_{k+1}), where g does not vanish. Then the structure of the general solution requires an additional investigation.

Example. Consider the ODE

y' − xy² = 2xy,

with domain (x, y) ∈ R². Rewriting it in the form

y' = x(y² + 2y),

we see that it is separable. The function g(y) = y² + 2y vanishes at two points, y = 0 and y = −2. Hence, we have two constant solutions y ≡ 0 and y ≡ −2. Now restrict the ODE to the domain where g(y) ≠ 0, that is, to one of the domains

R × (−∞, −2),  R × (−2, 0),  R × (0, +∞).

In any of these domains, we can use the method of separation of variables, which yields

(1/2) ln |y/(y + 2)| = ∫ dy/(y(y + 2)) = ∫ x dx = x²/2 + C,

whence

y/(y + 2) = C₁ e^{x²},

where C₁ = ±e^{2C}. Clearly, C₁ is any non-zero number here. However, since y ≡ 0 is a solution, C₁ can be 0 as well. Renaming C₁ to C, we obtain

y/(y + 2) = C e^{x²},

where C is any real number, whence it follows that

y = 2C e^{x²} / (1 − C e^{x²}).   (1.9)

We obtain the following family of solutions:

y = 2C e^{x²} / (1 − C e^{x²})  and  y ≡ −2,   (1.10)

the integral curves of which are shown on the diagram:

[Figure: integral curves of the family (1.10) for x between −5 and 5 and y between −5 and 5]

Note that the constructed integral curves never intersect each other. Indeed, by Theorem 1.1, through any point (x₀, y₀) with y₀ ≠ 0, −2 there is a unique integral curve from the family (1.9) with C ≠ 0. If y₀ = 0 or y₀ = −2 then the same is true with the family (1.10), because if C ≠ 0 then the function (1.9) never takes the values 0 and −2. This implies, as in the previous section, that we have found all the solutions, so that (1.10) represents the general solution.

Let us show how to find a particular solution that satisfies a prescribed initial condition, say y(0) = −4. Substituting x = 0 and y = −4 into (1.9), we obtain an equation for C:

2C/(1 − C) = −4,

whence C = 2. Hence, the particular solution is

y = 4e^{x²} / (1 − 2e^{x²}).
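As a quick check of this particular solution, one can verify the equation and the initial condition symbolically. The sketch below assumes Python with SymPy; it only repeats the computation done by hand above.

    import sympy as sp

    x = sp.symbols('x')
    y = 4*sp.exp(x**2) / (1 - 2*sp.exp(x**2))

    # verify the ODE y' - x*y^2 = 2*x*y and the initial condition y(0) = -4
    print(sp.simplify(y.diff(x) - x*y**2 - 2*x*y))   # 0
    print(y.subs(x, 0))                              # -4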

Example. Consider the equation

y' = √|y|,

which is defined for all x, y ∈ R. Since the right hand side vanishes for y = 0, the constant function y ≡ 0 is a solution. In the domains y > 0 and y < 0, the equation can be solved using separation of variables. For example, in the domain y > 0, we obtain

∫ dy/√y = ∫ dx,

whence

2√y = x + C

and

y = (1/4)(x + C)²,  x > −C

(the restriction x > −C comes from the previous line). Similarly, in the domain y < 0, we obtain

∫ dy/√(−y) = ∫ dx,

whence

−2√(−y) = x + C

and

y = −(1/4)(x + C)²,  x < −C.

We obtain the following integral curves:

[Figure: integral curves of y' = √|y| in the domains y > 0 and y < 0, touching the line y = 0]

We see that the integral curves in the domains y > 0 and y < 0 touch the line y = 0. This allows us to construct more solutions as follows: for any couple of reals a < b, consider the function

y(x) = −(1/4)(x − a)²  for x < a,
       0                for a ≤ x ≤ b,
       (1/4)(x − b)²    for x > b,   (1.11)

which is obviously a solution with the domain R. If we allow a to be −∞ and b to be +∞, with the obvious meaning of (1.11) in these cases, then (1.11) represents the general solution to y' = √|y|. Clearly, through any point (x₀, y₀) ∈ R² there are infinitely many integral curves of the given equation.

1.2 Linear ODE of 1st order

Consider the ODE of the form

y' + a(x) y = b(x)   (1.12)

where a and b are given functions of x, defined on a certain interval I. This equation is called linear because it depends linearly on y and y'. A linear ODE can be solved as follows.

Theorem 1.2 (The method of variation of parameter) Let the functions a(x) and b(x) be continuous on an interval I. Then the general solution of the linear ODE (1.12) has the form

y(x) = e^{−A(x)} ∫ b(x) e^{A(x)} dx,   (1.13)

where A(x) is a primitive of a(x) on I.

Note that the function y(x) given by (1.13) is defined on the full interval I.

Proof. Let us make the change of the unknown function u(x) = y(x) e^{A(x)}, that is,

y(x) = u(x) e^{−A(x)}.   (1.14)

Substituting this into the equation (1.12) we obtain

(u e^{−A})' + a u e^{−A} = b,

u' e^{−A} − u e^{−A} A' + a u e^{−A} = b.

Since A' = a, the two terms in the left hand side cancel out, and we end up with a very simple equation for u(x):

u' e^{−A} = b,

whence u' = b e^{A} and

u = ∫ b e^{A} dx.

Substituting into (1.14), we finish the proof.

One may wonder how one can guess to make the change (1.14). Here is the motivation. Consider first the case when b(x) ≡ 0. In this case, the equation (1.12) becomes

y' + a(x) y = 0

and is called homogeneous. Clearly, the homogeneous linear equation is separable. In the domains y > 0 and y < 0 we have

y'/y = −a(x)

and

∫ dy/y = −∫ a(x) dx = −A(x) + C.

Then ln|y| = −A(x) + C and

y(x) = C e^{−A(x)},

where C can be any real number (including C = 0, which corresponds to the solution y ≡ 0). For the general equation (1.12), take the above solution of the homogeneous equation and replace the constant C by a function C(x) (which was denoted by u(x) in the proof); this results in the above change. Since we have replaced a constant parameter by a function, this method is called the method of variation of parameter. It applies to linear equations of higher order as well.

Example. Consider the equation

y' + (1/x) y = e^{x²}   (1.15)

in the domain x > 0. Then

A(x) = ∫ a(x) dx = ∫ dx/x = ln x

(we do not add a constant C since A(x) is one of the primitives of a(x)), and

y(x) = (1/x) ∫ e^{x²} x dx = (1/(2x)) ∫ e^{x²} d(x²) = (1/(2x)) (e^{x²} + C),

where C is an arbitrary constant.

Alternatively, one can first solve the homogeneous equation

y' + (1/x) y = 0

using separation of variables:

y'/y = −1/x,
(ln y)' = −(ln x)',
ln y = −ln x + C₁,
y = C/x.

Next, replace the constant C by a function C(x) and substitute into (1.15):

(C(x)/x)' + (1/x)(C/x) = e^{x²},
(C'x − C)/x² + C/x² = e^{x²},
C'/x = e^{x²},
C' = x e^{x²},
C(x) = ∫ e^{x²} x dx = (1/2)(e^{x²} + C₀).

Hence,

y = C(x)/x = (1/(2x))(e^{x²} + C₀),

where C₀ is an arbitrary constant. The integral curves are shown on the following diagram:

[Figure: integral curves of (1.15) in the domain x > 0]

Corollary. Under the conditions of Theorem 1.2, for any x₀ ∈ I and any y₀ ∈ R there exists exactly one solution y(x) defined on I and such that y(x₀) = y₀. That is, through any point (x₀, y₀) ∈ I × R there goes exactly one integral curve of the equation.

Proof. Let B(x) be a primitive of b e^{A}, so that the general solution can be written in the form

y = e^{−A(x)} (B(x) + C)

with an arbitrary constant C. Obviously, any such solution is defined on I. The condition y(x₀) = y₀ allows us to determine C uniquely from the equation:

C = y₀ e^{A(x₀)} − B(x₀),

whence the claim follows.
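For completeness, here is a small sketch (Python with SymPy assumed) showing that a computer algebra system reproduces the general solution of (1.15) and that an initial condition determines the constant uniquely, as stated in the Corollary. The condition y(1) = 0 below is an arbitrary choice made only for this illustration.

    import sympy as sp

    x = sp.symbols('x', positive=True)
    y = sp.Function('y')

    ode = sp.Eq(y(x).diff(x) + y(x)/x, sp.exp(x**2))   # equation (1.15)
    print(sp.dsolve(ode, y(x)))
    # the same family as (e^{x^2} + C0)/(2x), up to renaming the constant

    # with an initial condition, e.g. y(1) = 0, the constant is determined uniquely
    print(sp.dsolve(ode, y(x), ics={y(1): 0}))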

1.3 Quasi-linear ODEs and differential forms

Let F(x, y) be a real valued function defined in an open set Ω ⊂ R². Recall that F is called differentiable at a point (x, y) ∈ Ω if there exist real numbers a, b such that

F(x + dx, y + dy) − F(x, y) = a dx + b dy + o(|dx| + |dy|),

as |dx| + |dy| → 0. Here dx and dy are the increments of x and y, respectively, which are considered as new independent variables (the differentials). The linear function a dx + b dy of the variables dx, dy is called the differential of F at (x, y) and is denoted by dF, that is,

dF = a dx + b dy.   (1.16)

In general, a and b are functions of (x, y). Recall also the following relations between the notion of a differential and partial derivatives:

1. If F is differentiable at some point (x, y) and its differential is given by (1.16) then the partial derivatives F_x = ∂F/∂x and F_y = ∂F/∂y exist at this point and

Fx = a, Fy = b.

2. If F is continuously differentiable in Ω, that is, the partial derivatives Fx and Fy exist in Ω and are continuous functions, then F is differentiable at any point in Ω.

Definition. Given two functions a(x, y) and b(x, y) in Ω, consider the expression

a(x, y) dx + b(x, y) dy,

which is called a differential form. The differential form is called exact in Ω if there is a differentiable function F in Ω such that

dF = a dx + b dy,   (1.17)

and inexact otherwise. If the form is exact then the function F from (1.17) is called the integral of the form. Observe that not every differential form is exact, as one can see from the following statement.

Lemma 1.3 If the functions a, b are continuously differentiable in Ω then a necessary condition for the form a dx + b dy to be exact is the identity

a_y = b_x.   (1.18)

Lecture 3 21.04.2009 Prof. A. Grigorian, ODE, SS 2009

Proof. Indeed, if F is an integral of the form a dx + b dy then F_x = a and F_y = b, whence it follows that the derivatives F_x and F_y are continuously differentiable. By the Schwarz theorem from Analysis, this implies that F_xy = F_yx, whence a_y = b_x.

Definition. The differential form a dx + b dy is called closed¹⁰ in Ω if it satisfies the condition a_y = b_x in Ω.

Hence, Lemma 1.3 says that any exact form must be closed. The converse is in general not true, as will be shown later on. However, since it is easier to verify closedness than exactness, it is desirable to know under what additional conditions closedness implies exactness. Such a result will be stated and proved below.

Example. The form y dx − x dy is not closed because a_y = 1 while b_x = −1. Hence, it is inexact.

The form y dx + x dy is exact because it has the integral F(x, y) = xy. Hence, it is also closed, which can be easily verified directly by (1.18).

The form 2xy dx + (x² + y²) dy is exact because it has the integral F(x, y) = x²y + y³/3 (it will be explained later how one can obtain an integral). Again, the closedness can be verified by a straightforward differentiation, while the exactness requires construction (or guessing) of the integral.

If the differential form a dx + b dy is exact then this allows us to solve easily the following differential equation:

a(x, y) + b(x, y) y' = 0,   (1.19)

as is stated in the next Theorem. The ODE (1.19) is called quasi-linear because it is linear with respect to y' but not necessarily linear with respect to y. Using y' = dy/dx, one can write (1.19) in the form

a(x, y) dx + b(x, y) dy = 0,

which explains why the equation (1.19) is related to the differential form a dx + b dy. We say that the equation (1.19) is exact (or closed) if the form a dx + b dy is exact (or closed).

Theorem 1.4 Let Ω be an open subset of R², and let a, b be continuous functions on Ω such that the form a dx + b dy is exact. Let F be an integral of this form, and let y(x) be a differentiable function defined on an interval I ⊂ R such that (x, y(x)) ∈ Ω for any x ∈ I (that is, the graph of y is contained in Ω). Then y is a solution of the equation (1.19) if and only if

F(x, y(x)) = const on I   (1.20)

(that is, if the function F remains constant on the graph of y).

The identity (1.20) can be considered as (an implicit form of) the general solution of (1.19). The function F is also referred to as the integral of the ODE (1.19).

Proof. The hypothesis that the graph of y(x) is contained in Ω implies that the composite function F(x, y(x)) is defined on I. By the chain rule, we have

(d/dx) F(x, y(x)) = F_x + F_y y' = a + b y'.

¹⁰ geschlossen

Hence, the equation a + b y' = 0 is equivalent to (d/dx) F(x, y(x)) = 0 on I, and the latter is equivalent to F(x, y(x)) = const.

Example. The equation y + x y' = 0 is exact and is equivalent to xy = C, because y dx + x dy = d(xy). The same can be obtained using the method of separation of variables.

The equation 2xy + (x² + y²) y' = 0 is exact and, hence, is equivalent to

x²y + y³/3 = C,

because the left hand side is the integral of the corresponding differential form (cf. the previous Example). Some integral curves of this equation are shown on the diagram:

[Figure: integral curves of 2xy + (x² + y²) y' = 0, that is, the level curves of x²y + y³/3]

We say that a set Ω ⊂ R² is a rectangle (box) if it has the form I × J where I and J are intervals in R. A rectangle is open if both I and J are open intervals. The following theorem answers the question of how to decide whether a given differential form is exact in a rectangle.

Theorem 1.5 (The Poincaré lemma) Let Ω be an open rectangle in R². Let a, b be continuously differentiable functions on Ω such that a_y ≡ b_x. Then the differential form a dx + b dy is exact in Ω.

It follows from Lemma 1.3 and Theorem 1.5 that if Ω is a rectangle then the form a dx + b dy is exact in Ω if and only if it is closed.

Proof. First we try to obtain an explicit formula for the integral F, assuming that it exists. Then we use this formula to prove the existence of the integral. Fix some reference point (x₀, y₀) and assume without loss of generality that F(x₀, y₀) = 0 (this can always be achieved by adding a constant to F). For any point (x, y) ∈ Ω, the point (x, y₀) also belongs to Ω; moreover, the intervals [(x₀, y₀), (x, y₀)] and [(x, y₀), (x, y)] are contained in Ω because Ω is a rectangle (see the diagram).

[Figure: the path from (x₀, y₀) to (x, y₀) and then to (x, y), contained in the rectangle Ω = I × J]

Since F_x = a and F_y = b, we obtain by the fundamental theorem of calculus that

F(x, y₀) = F(x, y₀) − F(x₀, y₀) = ∫_{x₀}^{x} F_x(s, y₀) ds = ∫_{x₀}^{x} a(s, y₀) ds

and

F(x, y) − F(x, y₀) = ∫_{y₀}^{y} F_y(x, t) dt = ∫_{y₀}^{y} b(x, t) dt,

whence

F(x, y) = ∫_{x₀}^{x} a(s, y₀) ds + ∫_{y₀}^{y} b(x, t) dt.   (1.21)

Now we start the actual proof, where we assume that the form a dx + b dy is closed and use the formula (1.21) to define the function F(x, y). We need to show that F is the integral; this will not only imply that the form a dx + b dy is exact but will also give an effective way of evaluating the integral. Due to the continuous differentiability of a and b, in order to prove that F is indeed the integral of the form a dx + b dy, it suffices to verify the identities

F_x = a and F_y = b.

It is easy to see from (1.21) that

F_y = ∂/∂y ∫_{y₀}^{y} b(x, t) dt = b(x, y).

Next, we have

F_x = ∂/∂x ∫_{x₀}^{x} a(s, y₀) ds + ∂/∂x ∫_{y₀}^{y} b(x, t) dt
    = a(x, y₀) + ∫_{y₀}^{y} ∂b/∂x (x, t) dt.   (1.22)

The fact that the integral and the derivative ∂/∂x can be interchanged will be justified below (see Lemma 1.6). Using the hypothesis b_x = a_y, we obtain from (1.22)

F_x = a(x, y₀) + ∫_{y₀}^{y} a_y(x, t) dt
    = a(x, y₀) + (a(x, y) − a(x, y₀))
    = a(x, y),

which finishes the proof. Now we state and prove a lemma that justifies (1.22).

Lemma 1.6 Let g(x, t) be a continuous function on I × J where I and J are bounded closed intervals in R. Consider the function

f(x) = ∫_α^β g(x, t) dt,

where [α, β] = J, which is hence defined for all x ∈ I. If the partial derivative g_x exists and is continuous on I × J then f is continuously differentiable on I and, for any x ∈ I,

f'(x) = ∫_α^β g_x(x, t) dt.

In other words, the operations of differentiation in x and integration in t, when applied to g(x, t), are interchangeable.

Proof of Lemma 1.6. We need to show that, for all x ∈ I,

(f(y) − f(x))/(y − x) → ∫_α^β g_x(x, t) dt as y → x,

which amounts to

∫_α^β (g(y, t) − g(x, t))/(y − x) dt → ∫_α^β g_x(x, t) dt as y → x.

Note that by the definition of a partial derivative, for any t ∈ [α, β],

(g(y, t) − g(x, t))/(y − x) → g_x(x, t) as y → x.   (1.23)

Consider all parts of (1.23) as functions of t, with fixed x and with y as a parameter. Then we have a convergence of a sequence of functions, and we would like to deduce that their integrals converge as well. By a result from Analysis II, this is the case if the convergence is uniform (gleichmässig) in the whole interval [α, β], that is, if

sup_{t ∈ [α,β]} | (g(y, t) − g(x, t))/(y − x) − g_x(x, t) | → 0 as y → x.   (1.24)

By the mean value theorem, for any t ∈ [α, β], there is ξ ∈ [x, y] such that

(g(y, t) − g(x, t))/(y − x) = g_x(ξ, t).

Hence, the difference quotient in (1.24) can be replaced by g_x(ξ, t). To proceed further, recall that a continuous function on a compact set is uniformly continuous. In particular, the function g_x(x, t) is uniformly continuous on I × J, that is, for any ε > 0 there is δ > 0 such that

x, ξ ∈ I, |x − ξ| < δ and t, s ∈ J, |t − s| < δ  imply  |g_x(x, t) − g_x(ξ, s)| < ε.   (1.25)

If |x − y| < δ then also |x − ξ| < δ and, by (1.25) with s = t,

2 2 ay =(2xy)y =2x = x + y x = bx, we conclude by Theorem 1.5 that the given form¡ is exact.¢ The integral F can be found by (1.21) taking x0 = y0 =0: x y y3 F (x, y)= 2s0ds + x2 + t2 dt = x2y + , 3 Z0 Z0 ¡ ¢ as it was observed above. Example. Consider the differential form ydx + xdy ¡ (1.26) x2 + y2

2 in Ω = R 0 . This form satisfies the condition ay = bx because nf g y (x2 + y2) 2y2 y2 x2 ay = = ¡ = ¡ ¡ x2 + y2 ¡ (x2 + y2)2 (x2 + y2)2 µ ¶y and x (x2 + y2) 2x2 y2 x2 bx = = ¡ = ¡ . x2 + y2 (x2 + y2)2 (x2 + y2)2 µ ¶x By Theorem 1.5 we conclude that the given form is exact in any rectangular subdomain of Ω.However,Ω itself is not a rectangle, and let us show that the form is inexact in Ω. To that end, consider the function θ (x, y) which is the polar angle that is definedinthe domain 2 Ω0 = R (x, 0) : x 0 nf · g by the conditions y x sin θ = , cos θ = ,θ ( π,π) , r r 2 ¡ 20 2 2 where r = x + y .LetusshowthatinΩ0 p ydx + xdy dθ = ¡ , (1.27) x2 + y2

y that is, θ is the integral of (1.26) in Ω0. In the half-plane x>0 we have tan θ = x and θ ( π/2,π/2) whence f g 2 ¡ y θ = arctan . x Then (1.27) follows by differentiation of the arctan:

1 xdy ydx ydx + xdy dθ = ¡ = ¡ . 1+(y/x)2 x2 x2 + y2

In the half-plane y>0 we have cot θ = x and θ (0,π) whence f g y 2 x θ = arccot y and (1.27) follows again. Finally, in the half-plane y<0 we have cot θ = x and θ f g y 2 ( π,0) whence ¡ x θ = arccot , ¡ ¡y µ ¶ and (1.27) follows again. Since Ω0 is the union of the three half-planes x>0 , y>0 , f g f g y<0 , we conclude that (1.27) holds in Ω0 and, hence, the form (1.26) is exact in Ω0. f Nowwecanprovethattheform(1.26)isinexacting Ω. Assume from the contrary that it is exact in Ω and that F is its integral in Ω,thatis, ydx + xdy dF = ¡ . x2 + y2

11 Then dF = dθ in Ω0 whence it follows that d (F θ) = 0 and, hence F = θ +constin ¡ Ω0. It follows from this identity that function θ canbeextendedfromΩ0 to a continuous function on Ω, which however is not true, because the limits of θ when approaching the point ( 1, 0)(oranyotherpoint(x, 0) with x<0) from above and below are different: π and ¡π respectively. The¡ moral of this example is that the statement of Theorem 1.5 is not true for an arbitrary open set Ω. It is possible to show that the statement of Theorem 1.5 is true if and only if the set Ω is simply connected, that is, if any closed curve in Ω can be continuously deformed to a point while staying in Ω. Obviously, the rectangles are simply 2 connected (as well as Ω0), while the set Ω = R 0 is not simply connected. nf g

11We use the following fact from Analysis II: if the di®erential of a function is identical zero in a connected open set U Rn then the function is constant in this set. Recall that the set U is called connected if any two points½ from U can be connected by a polygonal line that is contained in U. The set −0 is obviously connected.
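The formula (1.21) is easy to automate. The following sketch (Python with SymPy assumed) checks the closedness of the form 2xy dx + (x² + y²) dy and reconstructs its integral F with x₀ = y₀ = 0, as in the first example above.

    import sympy as sp

    x, y, s, t = sp.symbols('x y s t')

    a = 2*x*y
    b = x**2 + y**2
    print(sp.simplify(sp.diff(a, y) - sp.diff(b, x)))   # 0, so the form is closed

    # formula (1.21) with the reference point (0, 0)
    F = sp.integrate(a.subs({x: s, y: 0}), (s, 0, x)) + sp.integrate(b.subs(y, t), (t, 0, y))
    print(sp.expand(F))                                 # x**2*y + y**3/3
    print(sp.simplify(sp.diff(F, x) - a), sp.simplify(sp.diff(F, y) - b))   # 0 0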

Lecture 4 27.04.2009 Prof. A. Grigorian, ODE, SS 2009

1.4 Integrating factor Consider again the quasilinear equation

a (x, y)+b (x, y) y0 =0 (1.28) and assume that it is inexact. Write this equation in the form

adx + bdy =0.

After multiplying by a non-zero function M (x, y), we obtain an equivalent equation

M a dx + M b dy = 0,

which may become exact, provided the function M is suitably chosen.

Definition. A function M(x, y) is called an integrating factor for the differential equation (1.28) in Ω if M is a non-zero function in Ω such that the form Ma dx + Mb dy is exact in Ω.

If one has found an integrating factor then, multiplying (1.28) by M, the problem amounts to the case of Theorem 1.4.

Example. Consider the ODE

y' = y/(4x²y + x)

in the domain {x > 0, y > 0} and write it in the form

y dx − (4x²y + x) dy = 0.

Clearly, this equation is not exact. However, dividing it by x², we obtain the equation

(y/x²) dx − (4y + 1/x) dy = 0,

which is already exact in any rectangular domain because

(y/x²)_y = 1/x² = (−(4y + 1/x))_x.

Taking x₀ = y₀ = 1 in (1.21), we obtain the integral of the form as follows:

F(x, y) = ∫₁ˣ (1/s²) ds − ∫₁ʸ (4t + 1/x) dt = 3 − 2y² − y/x.

By Theorem 1.4, the general solution is given by the identity

2y² + y/x = C.

There is no general method for finding integrating factors.
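A short verification of the last example (Python with SymPy assumed): the original form is not closed, the form divided by x² is closed, and the function F found above is indeed its integral.

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)

    # original (inexact) form y dx - (4x^2 y + x) dy and the integrating factor M = 1/x^2
    a, b = y, -(4*x**2*y + x)
    print(sp.simplify(sp.diff(a, y) - sp.diff(b, x)))       # nonzero: not closed

    M = 1/x**2
    print(sp.simplify(sp.diff(M*a, y) - sp.diff(M*b, x)))   # 0: the rescaled form is closed

    # integral of the rescaled form, as computed above
    F = 3 - 2*y**2 - y/x
    print(sp.simplify(sp.diff(F, x) - M*a), sp.simplify(sp.diff(F, y) - M*b))   # 0 0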

1.5 Second order ODE

For higher order ODEs we will use a different notation: the independent variable will be denoted by t and the unknown function by x(t). In this notation, a general second order ODE, resolved with respect to x'', has the form

x'' = f(t, x, x'),

where f is a given function of three variables. We consider here some problems that amount to a second order ODE.

1.5.1 Newton's second law

Consider the movement of a point particle along a straight line and let its coordinate at time t be x(t). The velocity¹² of the particle is v(t) = x'(t) and the acceleration¹³ is a(t) = x''(t). Newton's second law says that at any time

m x'' = F,   (1.29)

where m is the mass of the particle and F is the force¹⁴ acting on the particle. In general, F is a function of t, x, x', that is, F = F(t, x, x'), so that (1.29) can be regarded as a second order ODE for x(t).

The force F is called conservative if F depends only on the position x. For example, the gravitational, elastic, and electrostatic forces are conservative, while friction¹⁵ and air resistance¹⁶ are non-conservative, as they depend on the velocity. Assuming that F = F(x), let U(x) be a primitive function of −F(x). The function U is called the potential of the force F. Multiplying the equation (1.29) by x' and integrating in t, we obtain

m ∫ x'' x' dt = ∫ F(x) x' dt,

(m/2) ∫ (d/dt)(x')² dt = ∫ F(x) dx,

m (x')²/2 = −U(x) + C

and

m v²/2 + U(x) = C.

The sum m v²/2 + U(x) is called the mechanical energy of the particle (which is the sum of the kinetic energy and the potential energy). Hence, we have obtained the law of conservation of energy¹⁷: the total mechanical energy of a particle in a conservative field remains constant.

¹² Geschwindigkeit  ¹³ Beschleunigung  ¹⁴ Kraft  ¹⁵ Reibung  ¹⁶ Strömungswiderstand  ¹⁷ Energieerhaltungssatz
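The conservation law can also be observed numerically. The following sketch integrates m x'' = F(x) with the classical Runge-Kutta method for the concrete choice m = 1 and F(x) = −kx (a spring; this particular force is an assumption of the sketch, not part of the notes) and checks that the mechanical energy stays constant up to the discretization error.

    m, k = 1.0, 2.0
    F = lambda x: -k * x           # conservative force
    U = lambda x: 0.5 * k * x**2   # potential, so that U' = -F

    def rk4_step(x, v, dt):
        # one step of the classical Runge-Kutta method for the system x' = v, v' = F(x)/m
        a = lambda x: F(x) / m
        k1x, k1v = v, a(x)
        k2x, k2v = v + 0.5*dt*k1v, a(x + 0.5*dt*k1x)
        k3x, k3v = v + 0.5*dt*k2v, a(x + 0.5*dt*k2x)
        k4x, k4v = v + dt*k3v, a(x + dt*k3x)
        return (x + dt*(k1x + 2*k2x + 2*k3x + k4x)/6,
                v + dt*(k1v + 2*k2v + 2*k3v + k4v)/6)

    x, v = 1.0, 0.0
    energy0 = 0.5*m*v**2 + U(x)
    for _ in range(10_000):
        x, v = rk4_step(x, v, 0.001)
    print(energy0, 0.5*m*v**2 + U(x))   # both values equal 1.0 up to a tiny error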

1.5.2 Electrical circuit

Consider an RLC-circuit, that is, an electrical circuit¹⁸ where a resistor, an inductor and a capacitor are connected in series:

[Figure: an RLC circuit with resistor R, inductor L, capacitor C, current I(t) and voltage source V(t)]

Denote by R the resistance¹⁹ of the resistor, by L the inductance²⁰ of the inductor, and by C the capacitance²¹ of the capacitor. Let the circuit contain a power source with voltage V(t)²² depending on time t. Denote by I(t) the current²³ in the circuit at time t. Using the laws of electromagnetism, we obtain that the voltage drop²⁴ v_R on the resistor R is equal to

v_R = R I

(Ohm's law²⁵), and the voltage drop v_L on the inductor is equal to

v_L = L dI/dt

(Faraday's law). The voltage drop v_C on the capacitor is equal to

v_C = Q/C,

where Q is the charge²⁶ of the capacitor; also we have Q' = I. By Kirchhoff's law, we obtain

v_R + v_L + v_C = V(t),

whence

R I + L I' + Q/C = V(t).

¹⁸ Schaltung  ¹⁹ Widerstand  ²⁰ Induktivität  ²¹ Kapazität  ²² Spannung  ²³ Strom  ²⁴ Spannungsfall  ²⁵ Gesetz  ²⁶ Ladungsmenge

Differentiating in t, we obtain

L I'' + R I' + I/C = V',   (1.30)

which is a second order ODE with respect to I(t). We will come back to this equation after having developed the theory of linear ODEs.

1.6 Higher order ODE and normal systems

A general ODE of order n, resolved with respect to the highest derivative, can be written in the form

y⁽ⁿ⁾ = F(t, y, ..., y⁽ⁿ⁻¹⁾),   (1.31)

where t is an independent variable and y(t) is an unknown function. It is frequently more convenient to replace this equation by a system of ODEs of the 1st order.

Let x(t) be a vector function of a real variable t, which takes values in Rⁿ. Denote by x_k the components of x. Then the derivative x'(t) is defined component-wise by

x' = (x₁', x₂', ..., xₙ').

Consider now a vector ODE of the first order

x' = f(t, x),   (1.32)

where f is a given function of n + 1 variables, which takes values in Rⁿ, that is, f : Ω → Rⁿ, where Ω is an open subset²⁷ of R^{n+1}. Here the couple (t, x) is identified with a point in R^{n+1} as follows:

(t, x) = (t, x₁, ..., xₙ).

Denoting by f_k the components of f, we can rewrite the vector equation (1.32) as a system of n scalar equations

x₁' = f₁(t, x₁, ..., xₙ)
...
x_k' = f_k(t, x₁, ..., xₙ)   (1.33)
...
xₙ' = fₙ(t, x₁, ..., xₙ)

A system of ODEs of the form (1.32) or (1.33) is called a normal system. As in the case of scalar ODEs, define a solution of the normal system (1.32) as a differentiable function x : I → Rⁿ, where I is an interval in R, such that (t, x(t)) ∈ Ω for all t ∈ I and x'(t) = f(t, x(t)) for all t ∈ I.

Let us show how the equation (1.31) can be reduced to the normal system (1.33). Indeed, with any function y(t) let us associate the vector function

x = (y, y', ..., y⁽ⁿ⁻¹⁾),

which takes values in Rⁿ. That is, we have

x₁ = y, x₂ = y', ..., xₙ = y⁽ⁿ⁻¹⁾.

²⁷ Teilmenge

Obviously,

x' = (y', y'', ..., y⁽ⁿ⁾),

and using (1.31) we obtain the system of equations

x₁' = x₂
x₂' = x₃
...   (1.34)
x_{n−1}' = xₙ
xₙ' = F(t, x₁, ..., xₙ)

Obviously, we can rewrite this system as a vector equation (1.32) with the function f as follows:

f(t, x) = (x₂, x₃, ..., xₙ, F(t, x₁, ..., xₙ)).   (1.35)

Conversely, the system (1.34) implies

x₁⁽ⁿ⁾ = xₙ' = F(t, x₁, ..., xₙ) = F(t, x₁, x₁', ..., x₁⁽ⁿ⁻¹⁾),

so that we obtain equation (1.31) with respect to y = x₁. Hence, the equation (1.31) is equivalent to the vector equation (1.32) with the function f defined by (1.35).

Example. Consider the second order equation

y'' = F(t, y, y').

Setting x = (y, y') we obtain

x' = (y', y''),

whence

x₁' = x₂
x₂' = F(t, x₁, x₂)

Hence, we obtain the normal system (1.32) with

f(t, x) = (x₂, F(t, x₁, x₂)).
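Numerically, this reduction is exactly what standard ODE solvers expect. The sketch below (Python with NumPy and SciPy assumed) feeds the normal system for the assumed example F(t, y, y') = −y, that is, y'' = −y with y(0) = 0, y'(0) = 1, to a solver and compares the first component with sin t.

    import numpy as np
    from scipy.integrate import solve_ivp

    def f(t, x):
        # x = (x1, x2) = (y, y'); the normal system is x1' = x2, x2' = F(t, x1, x2)
        x1, x2 = x
        return [x2, -x1]

    sol = solve_ivp(f, (0.0, 10.0), [0.0, 1.0], dense_output=True)
    t = np.linspace(0.0, 10.0, 5)
    print(sol.sol(t)[0])   # first component y(t)
    print(np.sin(t))       # agrees with sin t up to the solver tolerance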

As we have seen in the case of the first order scalar ODE, adding the initial condition allows usually to uniquely identify the solution. The problem of finding a solution sat- isfying the initial condition is called the initial value problem, shortly IVP. What initial value problem is associated with the vector equation (1.32) and the scalar higher order equation (1.31)? Motivated by the 1st order scalar ODE, one can presume that it makes sense to consider the following IVP also for the 1st order vector ODE:

x0 = f (t, x) , x (t0)=x0, ½ n where (t0,x0) Ω is a given point. In particular, x0 is a given vector from R ,which 2 is called the initial value of x (t), and t0 R is the initial time. For the equation 2

(1.31), this means that the initial condition should prescribe the value of the vector x = (y, y', ..., y⁽ⁿ⁻¹⁾) at some t₀, which amounts to n scalar conditions

y(t₀) = y₀
y'(t₀) = y₁
...
y⁽ⁿ⁻¹⁾(t₀) = y_{n−1}

where y₀, ..., y_{n−1} are given values. Hence, the initial value problem (IVP) for the scalar equation of order n can be stated as follows:

y⁽ⁿ⁾ = F(t, y, y', ..., y⁽ⁿ⁻¹⁾)
y(t₀) = y₀
y'(t₀) = y₁
...
y⁽ⁿ⁻¹⁾(t₀) = y_{n−1}.

1.7 Some Analysis in Rⁿ

Here we briefly revise some necessary facts from Analysis in Rⁿ.

1.7.1 Norms in Rn Recall that a norm in Rn is a function N : Rn [0, + ) with the following properties: ! 1 1. N (x) = 0 if and only if x =0.

2. N (cx)= c N (x)forallx Rn and c R. j j 2 2 3. N (x + y) N (x)+N (y)forallx, y Rn. · 2 If N (x)satisfies2and3butnotnecessarily1thenN (x) is called a semi-norm. For example, the function x is a norm in R. Usually one uses the notation x for a norm instead of N (x). j j k k Example. For any p 1, the p-norm in Rn is defined by ¸ n 1/p x = x p . (1.36) k kp j kj Ãk=1 ! X In particular, for p =1wehave n x = x , k k1 j kj k=1 X and for p =2 n 1/2 x = x2 . k k2 k Ãk=1 ! X For p = set 1 x =max xk . ∞ 1 k n k k ≤ ≤ j j 27 It is known that the p-norm for any p [1, ] is indeed a norm. 2 1 It follows from the definition of a norm that in R any norm has the form x = c x k k j j where c is a positive constant. In Rn, n 2, there is a great variety of non-proportional ¸ norms. However, it is known that all possible norms in Rn are equivalent in the following n sense: if N1 (x)andN2 (x)aretwonormsinR then there are positive constants C0 and C00 such that N1 (x) C00 C0 for all x = 0. (1.37) · N2 (x) · 6

Forexample,itfollowsfromthedefinitions of x p and x that k k k k∞ x 1 k kp n1/p. · x · k k∞ For most applications, the relation (1.37) means that the choice of a specificnormis unimportant. Fix a norm x in Rn.Foranyx Rn and r>0, define an open ball k k 2 n B (x, r)= y R : x y

[Figure: the unit balls B(0, 1) in R² for the 1-norm (a diamond), the 2-norm (a round ball), the 4-norm, and the ∞-norm (a square ball)]
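For a concrete feeling of the p-norms, here is a tiny numerical sketch (Python with NumPy assumed) evaluating ‖x‖_p for the vector x = (3, −4) and several p; the values illustrate the equivalence (1.37).

    import numpy as np

    x = np.array([3.0, -4.0])
    for p in (1, 2, 4, np.inf):
        print(p, np.linalg.norm(x, ord=p))
    # prints 7.0, 5.0, about 4.28, 4.0; the values decrease with p and satisfy
    # ||x||_inf <= ||x||_p <= 2**(1/p) * ||x||_inf, an instance of (1.37)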

Lecture 5 28.04.2009 Prof. A. Grigorian, ODE, SS 2009

1.7.2 Continuous mappings

Let S be a subset of Rⁿ and f be a mapping²⁸ from S to R^m. The mapping f is called continuous²⁹ at x ∈ S if f(y) → f(x) as y → x, that is,

‖f(y) − f(x)‖ → 0 as ‖y − x‖ → 0.

Here in the expression ‖y − x‖ we use a norm in Rⁿ, whereas in the expression ‖f(y) − f(x)‖ we use a norm in R^m, where the norms are arbitrary but fixed. Thanks to the equivalence of norms in Euclidean spaces, the definition of a continuous mapping does not depend on a particular choice of the norms.

Note that any norm N(x) in Rⁿ is a continuous function, because by the triangle inequality

|N(y) − N(x)| ≤ N(y − x) → 0 as y → x.

Recall that a set S ⊂ Rⁿ is called

• open³⁰ if for any x ∈ S there is r > 0 such that the ball B(x, r) is contained in S;
• closed³¹ if S^c is open;
• compact if S is bounded and closed.

An important fact from Analysis is that if S is a compact subset of Rⁿ and f : S → R^m is a continuous mapping, then the image³² f(S) is a compact subset of R^m. In particular, if f is a continuous numerical function on S, that is, f : S → R, then f is bounded on S and attains on S its maximum and minimum.

1.7.3 Linear operators and matrices

Let us consider a special class of mappings from Rn to Rm that are called linear operators or linear mappings. Namely, a linear operator A : Rn Rm is a mapping with the following properties: !

1. A(x + y) = Ax + Ay for all x, y ∈ Rⁿ;

2. A(cx) = cAx for all c ∈ R and x ∈ Rⁿ.

Any linear operator from Rⁿ to R^m can be represented by an m × n matrix (a_ij), where i is the row³³ index, i = 1, ..., m, and j is the column³⁴ index, j = 1, ..., n, as follows:

(Ax)_i = Σ_{j=1}^{n} a_ij x_j

30 (the elements aij of the matrix (aij) are called the components of the operator A). In other words, the column vector Ax is obtained from the column-vector x by the left multiplication with the matrix (aij):

(Ax)1 a11 ...... a1n x1 ...... 0 1 0 1 0 1 (Ax)i = ...... aij ...... xj B ... C B ...... C B ... C B C B C B C B (Ax) C B am1 ...... amn C B xn C B m C B C B C @ A @ A @ A n m m n n m The class of all linear operators from R to R is denoted by R × or by (R , R ). One m n L defines in the obvious way the addition A+B of operators in R × and the multiplication by a constant cA with c R: 2 1. (A + B)(x)=Ax + Bx,

2. (cA)(x)=c (Ax) .

m n Clearly, R × with these operations is a linear space over R.Sinceanym n matrix mn £ m n has mn components, it can be identified with an element in R ;consequently,R × is linearly isomorphic to Rmn. n m m n Let us fixsomenormsinR and R . For any linear operator A R × ,define its by 2 Ax A =supk k, (1.38) k k x Rn 0 x ∈ \{ } k k where x is the norm in Rn and Ax is the norm in Rm. We claim that always A < . k k k k k k 1 Indeed, it suffices to verify this if the norm in Rn is the 1-norm and the norm in Rm is the -norm. Using the matrix representation of the operator A as above, we obtain 1 n

Ax =max (Ax)i max aij xj = a x 1 ∞ i i,j k k j j· j j j=1 j j k k X where a =maxi,j aij . It follows that A a< . Hence, one canj dejfine A as the minimalk k· possible1 real number such that the following inequality is true: k k n Ax A x for all x R . k k·k kk k 2 m n As a consequence, we obtain that any linear mapping A R × is continuous, because 2 Ay Ax = A (y x) A y x 0as y x. k ¡ k k ¡ k·k kk ¡ k! ! m n Let us verify that the operator norm is a norm in the linear space R × . Indeed, by (1.38) we have A 0; moreover, if A = 0 then there is x Cn such that Ax =0, whence Ax >k0andk¸ 6 2 6 k k Ax A k k > 0. k k¸ x k k

31 The triangle inequality can be deduced from (1.38) as follows: (A + B) x Ax + Bx A + B =supk k sup k k k k k k x x · x x k k k k Ax Bx sup k k +supk k · x x x x k k k k = A + B . k k k k Finally, the scaling property trivially follows from (1.38): (λA) x λ Ax λA =supk k =supj jk k = λ A . k k x x x x j jk k k k k k n m n Similarly, one defines the notion of a norm in C , the space of linear operators C × m n and the operator norm in C × . All the properties of the norms in the complex spaces canbeeitherprovedinthesamewayasintherealspaces,orsimplydeducedfromthe real case by using the natural identification of Cn with R2n.

2 Linear equations and systems

A normal linear system of ODEs is the following vector equation

x0 = A (t) x + B (t) , (2.1)

n n n where A : I R × , B : I R , I is an interval in R,andx = x (t)isanunknown ! ! function with values in Rn. In other words, for each t I, A (t)isalinearoperatorinRn, while A (t) x and B (t) 2 are the vectors in Rn. In the coordinate form, (2.1) is equivalent to the following system of linear equations n

xi0 = Aij (t) xj + Bi (t) ,i=1, ..., n, l=1 X where Aij and Bi are the components of A and B, respectively. We will consider the ODE (2.1) only when A (t)andB (t) are continuous in t, that is, when the mappings n n n A : I R × and B : I R are continuous. It is easy to show that the continuity of ! ! these mappings is equivalent to the continuity of all the components Aij (t)andBi (t), respectively

2.1 Existence of solutions for normal systems Theorem 2.1 In the above notation, let A (t) and B (t) be continuous in t I. Then, n 2 for any t0 I and x0 R ,theIVP 2 2 x = A (t) x + B (t) 0 (2.2) x (t0)=x0 ½ has a unique solution x (t) defined on I, and this solution is unique.

Before we start the proof, let us prove the following useful lemma.

32 Lemma 2.2 (The Gronwall inequality) Let z (t) be a non-negative continuous function on [t ,t ] where t 0. Therefore, (2.4) holds with any C>0, whence it followsthatitholdswithC =0. Hence, assume in the sequel that C>0. This implies that the right hand side of (2.3) is positive. Set t F (t)=C + L z (s) ds Zt0 and observe that F is differentiable and F 0 = Lz. It follows from (2.3) that z F whence ·

F 0 = Lz LF. · This is a differential inequality for F that can be solved similarly to the separable ODE. Since F>0, dividing by F we obtain F 0 L, F · whence by integration

t t F (t) F 0 (s) ln = ds Lds = L (t t0) , F (t ) F (s) · ¡ 0 Zt0 Zt0 for all t [t ,t ]. It follows that 2 0 1 F (t) F (t )exp(L (t t )) = C exp (L (t t )) . · 0 ¡ 0 ¡ 0 Using again (2.3), that is, z F , we obtain (2.4). Proof of Theorem 2.1.· Let us fix some bounded closed interval [α, β] I such that t [α, β]. The proof consists of the following three parts. ½ 0 2 1. the uniqueness of a solution on [α, β];

2. the existence of a solution on [α, β];

3. the existence and uniqueness of a solution on I.

33 We start with the observation that if x (t) is a solution of (2.2) on I then, for all t I, 2 t x (t)=x0 + x0 (s) ds t0 Z t = x0 + (A (s) x (s)+B (s)) ds. (2.5) Zt0 In particular, if x (t)andy (t) are two solutions of IVP (2.2) on the interval [a, β], then they both satisfy (2.5) for all t [α, β] whence 2 t x (t) y (t)= A (s)(x (s) y (s)) ds. ¡ ¡ Zt0 Setting z (t)= x (t) y (t) k ¡ k and using that A (y x) A y x = A z, k ¡ k·k kk ¡ k k k we obtain t z (t) A (s) z (s) ds, · k k Zt0 for all t [t0,β], and a similar inequality for t [α, t0] where the order of integration should be2 reversed. Set 2 a =sup A (s) . (2.6) s [α,β] k k ∈ Since s A (s) is a continuous function, s A (s) is also continuous; hence, it is bounded7! on the interval [α, β]sothata< .7!k It followsk that, for all t [t ,β], 1 2 0 t z (t) a z (s) ds, · Zt0 whence by Lemma 2.2 z (t) 0 and, hence, z (t) = 0. By a similar argument, the same · holds for all t [α, t0]sothatz (t) 0on[α, β], which implies the identity of the solutions x (t)andy (t)on[2 α, β]. ´

In the second part, consider a sequence xk (t) k∞=0 of functions on [α, β]defined in- ductively by f g x0 (t) x0 ´ and t xk (t)=x0 + (A (s) xk 1 (s)+B (s)) ds, k 1. (2.7) − ¸ Zt0 We will prove that the sequence xk k∞=0 converges on [α, β] to a solution of (2.2) as k . f g !1Using the identity (2.7) and

t xk 1 (t)=x0 + (A (s) xk 2 (s)+B (s)) ds − − Zt0

34 we obtain, for any k 2andt [t ,β], ¸ 2 0 t xk (t) xk 1 (t) A (s) xk 1 (s) xk 2 (s) ds − − − k ¡ k· t0 k kk ¡ k Z t a xk 1 (s) xk 2 (s) ds · k − ¡ − k Zt0 where a is defined by (2.6). Denoting

zk (t)= xk (t) xk 1 (t) , k ¡ − k we obtain the recursive inequality

t zk (t) a zk 1 (s) ds. · − Zt0 In order to be able to use it, let us first estimate z1 (t)= x1 (t) x0 (t) .Bydefinition, we have, for all t [t ,β], k ¡ k 2 0 t z (t)= (A (s) x + B (s)) ds b (t t ) , 1 0 · ¡ 0 °Zt0 ° ° ° where ° ° ° ° b =sup A (s) x0 + B (s) < . s [α,β] k k 1 ∈ It follows by induction that t (t t )2 z (t) ab (s t ) ds = ab ¡ 0 , 2 · ¡ 0 2 Zt0 t (s t )2 (t t )3 z (t) a2b ¡ 0 ds = a2b ¡ 0 , 3 · 2 3! ... Zt0 k k 1 (t t0) z (t) a − b ¡ . k · k!

Setting c =max(a, b) and using the same argument for t [α, t0], rewrite this inequality in the form 2 k (c t t0 ) xk (t) xk 1 (t) j ¡ j , k ¡ − k· k! for all t [α, β]. Since the series 2 (c t t )k j ¡ 0j k! k X is the exponential series and, hence, is convergent for all t, in particular, uniformly35 in any bounded interval of t, we obtain by the comparison test, that the series

∞ xk (t) xk 1 (t) k ¡ − k k=1 X 35gleichmÄassig

35 converges uniformly in t [α, β], which implies that also the series 2

∞ (xk (t) xk 1 (t)) ¡ − k=1 X converges uniformly in t [α, β]. Since the N-thpartialsumofthisseriesisx (t) x , 2 N ¡ 0 we conclude that the sequence xk (t) converges uniformly in t [α, β]ask . Setting f g 2 !1 x (t) = lim xk (t) k →∞ and passing in the identity

t xk (t)=x0 + (A (s) xk 1 (s)+B (s)) ds − Zt0 to the limit as k ,obtain !1 t x (t)=x0 + (A (s) x (s)+B (s)) ds. (2.8) Zt0 This implies that x (t)solvesthegivenIVP(2.2)on[α, β]. Indeed, x (t) is continuous on [α, β] as a uniform limit of continuous functions; it follows that the right hand side of (2.8) is a differentiable function of t, whence

d t x = x + (A (s) x (s)+B (s)) ds. = A (t) x (t)+B (t) . 0 dt 0 µ Zt0 ¶

Finally, it is clear from (2.8) that x (t0)=x0.

36 Lecture 6 04.05.2009 Prof. A. Grigorian, ODE, SS 2009 Having constructed the solution of (2.2) on any bounded closed interval [α, β] I,let us extend it to the whole interval I as follows. For any interval I, there is an increasing½ sequence of bounded closed intervals [αl,βl] l∞=1 such that their union is I; furthermore, we can assume that t [α ,β ] for all fl.Denotebyg x (t) be the solution of (2.2) on [α ,β ]. 0 2 l l l l l Then xl+1 (t) is also the solution of (2.2) on [αl,βl] whence it follows by the uniqueness statement of the first part that xl+1 (t)=xl (t)on[αl,βl]. Hence, in the sequence xl (t) any function is an extension of the previous function to a larger interval, whichf impliesg that the function x (t):=x (t)ift [α ,β ] l 2 l l is well-defined for all t I and is a solution of IVP (2.2). Finally, this solution is unique 2 on I because the uniqueness takes place in each of the intervals [αl,βl].
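The iteration (2.7) used in the proof is constructive and can be run numerically. Here is a minimal sketch (Python with NumPy assumed) for the scalar case A(t) ≡ 1, B(t) ≡ 0, x(0) = 1 (an assumed toy example), where the Picard iterates converge to the exact solution e^t.

    import numpy as np

    ts = np.linspace(0.0, 1.0, 201)
    x0 = 1.0

    def rhs(xk):
        return 1.0 * xk + 0.0      # A(t) x_{k-1}(t) + B(t)

    xk = np.full_like(ts, x0)      # x_0(t) = x0
    for _ in range(10):
        integrand = rhs(xk)
        # cumulative trapezoidal approximation of the integral from t0 = 0 to each t
        integral = np.concatenate(([0.0],
            np.cumsum((integrand[1:] + integrand[:-1]) / 2 * np.diff(ts))))
        xk = x0 + integral         # formula (2.7)

    print(xk[-1], np.exp(1.0))     # after 10 iterations, close to e = 2.71828...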

2.2 Existence of solutions for higher order linear ODE Consider a scalar linear ODE of the order n,thatis,theODE

x⁽ⁿ⁾ + a₁(t) x⁽ⁿ⁻¹⁾ + ... + aₙ(t) x = b(t),   (2.9)

where all the functions a_k(t), b(t) are defined on an interval I ⊂ R. Following the general procedure, this ODE can be reduced to a normal linear system as follows. Consider the vector function

x(t) = (x(t), x'(t), ..., x⁽ⁿ⁻¹⁾(t))   (2.10)

so that

x₁ = x, x₂ = x', ..., x_{n−1} = x⁽ⁿ⁻²⁾, xₙ = x⁽ⁿ⁻¹⁾.

Then (2.9) is equivalent to the system

x₁' = x₂
x₂' = x₃
...
x_{n−1}' = xₙ
xₙ' = −a₁ xₙ − a₂ x_{n−1} − ... − aₙ x₁ + b,

that is, to the vector ODE

x' = A(t) x + B(t)   (2.11)

where

A = ( 0     1      0     ...  0  )
    ( 0     0      1     ...  0  )
    ( ............................ )
    ( 0     0      0     ...  1  )
    ( −aₙ  −a_{n−1} −a_{n−2} ... −a₁ )

and B = (0, 0, ..., 0, b)ᵀ.   (2.12)

Theorem 2.1 then implies the following.

Corollary. Assume that all the functions a_k(t), b(t) in (2.9) are continuous on I. Then, for any t₀ ∈ I and any vector (x₀, x₁, ..., x_{n−1}) ∈ Rⁿ there is a unique solution x(t) of the ODE (2.9) with the initial conditions

x(t₀) = x₀
x'(t₀) = x₁   (2.13)
...
x⁽ⁿ⁻¹⁾(t₀) = x_{n−1}

Proof. Indeed, setting x₀ = (x₀, x₁, ..., x_{n−1}), we can rewrite the IVP (2.9)-(2.13) in the form

x' = A x + B
x(t₀) = x₀

which is the IVP considered in Theorem 2.1.

2.3 Space of solutions of linear ODEs

The normal linear system x' = A(t) x + B(t) is called homogeneous if B(t) ≡ 0, and inhomogeneous otherwise.

Consider an inhomogeneous normal linear system

x' = A(t) x + B(t),   (2.14)

where A(t) : I → R^{n×n} and B(t) : I → Rⁿ are continuous mappings on an open interval I ⊂ R, and the associated homogeneous equation

x' = A(t) x.   (2.15)

Denote by A the set of all solutions of (2.15) defined on I.

Theorem 2.3 (a) The set A is a linear space³⁶ over R and dim A = n. Consequently, if x₁(t), ..., xₙ(t) are n linearly independent³⁷ solutions to (2.15) then the general solution of (2.15) has the form

x(t) = C₁ x₁(t) + ... + Cₙ xₙ(t),   (2.16)

where C₁, ..., Cₙ are arbitrary constants.

(b) If x₀(t) is a particular solution of (2.14) and x₁(t), ..., xₙ(t) are n linearly independent solutions to (2.15) then the general solution of (2.14) is given by

x (t)=x0 (t)+C1x1 (t)+... + Cnxn (t) . (2.17)

Proof. (a)ThesetofallfunctionsI Rn is a linear space with respect to the operations addition and multiplication by a !constant. Zero element is the function which is constant 0 on I. We need to prove that the set of solutions is a of the space of all functions. It suffices to show that is closed underA operations of addition and multiplication by constant. A

36linearer Raum 37linear unabhÄangig

38 If x and y then also x + y because 2A 2A

(x + y)0 = x0 + y0 = Ax + Ax = A (x + y) and similarly cx for any c R.Hence, is a linear space. 2A 2 A n Fix t0 I and consider the mapping Φ : R given by Φ (x)=x (t0); that is, Φ (x) 2 A! is the value of x (t)att = t0. This mapping is obviously linear. It is surjective since by n Theorem 2.1 for any v R there is a solution x (t) with the initial condition x (t0)=v. Also, this mapping is injective2 because x (t )=0impliesx (t) 0 by the uniqueness of 0 ´ the solution. Hence, Φ is a linear isomorphism between and Rn, whence it follows that A dim =dimRn = n. A Consequently, if x1, ..., xn are linearly independent functions from then they form a in . It follows that any element of is a linear combination ofAx ,...,x ,thatis, A A 1 n any solution to x0 = A (t) x has the form (2.16). (b) We claim that a function x (t):I Rn solves (2.14) if and only if the function ! y (t)=x (t) x0 (t) solves (2.15). Indeed, the homogeneous equation for y is equivalent to ¡

(x x )0 = A (x x ) , ¡ 0 ¡ 0 x0 = Ax +(x0 Ax ) . 0 ¡ 0

Using x0 = Ax0 + B, we see that the latter equation is equivalent to (2.14). By part (a), the function y (t) solves (2.15) if and only if it has the form

y = C1x1 (t)+... + Cnxn (t) , (2.18) whence it follows that all solutions to (2.14) are given by (2.17). Consider now a scalar ODE

(n) (n 1) x + a1 (t) x − + .... + an (t) x = b (t) (2.19) where all functions a1, ..., an,f are continuous on an interval I, and the associated homo- geneous ODE (n) (n 1) x + a1 (t) x − + .... + an (t) x =0. (2.20) Denote by the set of all solutions of (2.20) defined on I. A Corollary. (a) The set is a linear space over R and dim = n.Consequently, if e A A x1,...,xn are n linearly independent solutions of (2.20) then the general solution of (2.20) has the form e e x (t)=C1x1 (t)+... + Cnxn (t) , where C1,...,Cn are arbitrary constants. (b) If x0 (t) is a particular solution of (2.19) and x1, ..., xn are n linearly independent solutions of (2.20) then the general solution of (2.19) is given by

x (t)=x0 (t)+C1x1 (t)+... + Cnxn (t) .

39 Proof. (a)Thefactthat is a linear space is obvious (cf. the proof of Theorem 2.3). A To prove that dim = n, consider the associated normal system A e x = A (t) x, (2.21) e 0 where (n 1) x = x, x0, ..., x − (2.22) and A (t) is given by (2.12). Denoting by the space of solutions of (2.21), we see that ¡ A ¢ the identity (2.22) defines a linear mapping from to . This mapping is obviously injective (if x (t) 0thenx (t) 0) and surjective,A becauseA any solution x of (2.21) ´ ´ gives back a solution x (t)of(2.20).Hence, and e are linearly isomorphic. Since by A A Theorem 2.3 dim = n, it follows that dim = n.The rest is obvious. (b) The proofA uses the same argument asA ine Theorem 2.3 and is omitted. Frequently it is desirable to consider complexe valued solutionsofODEs(forexample, this will be the case in the next section). The above results can be easily generalized in this direction as follows. Consider again a normal system

x0 = A (t) x + B (t) ,

n n n n where x (t) is now an unknown function with values in C , A : I C × and B : I C , ! ! where I is an interval in R;thatis,foranyt I, A (t) is a linear operator in Cn,and 2 B (t)isavectorfromCn. Assuming that A (t)andB (t) are continuous, one obtains the n following extension of Theorem 2.1: for any t0 I and x0 C ,theIVP 2 2 x = Ax + B 0 (2.23) x (t0)=x0 ½ has a solution x : I Cn, and this solution is unique. Alternatively, the complex valued ! problem (2.23) can be reduced to a real valued problem by identifying Cn with R2n,and the claim follows by a direct application of Theorem 2.1 to the real system of order 2n. Also, similarly to Theorem 2.3, one shows that the set of all solutions to the homoge- neous system x0 = Ax is a linear space over C and its is n. The same applies to higher order scalar ODE with complex coefficients.

2.4 Solving linear homogeneous ODEs with constant coefficients Consider the scalar ODE

x⁽ⁿ⁾ + a₁ x⁽ⁿ⁻¹⁾ + ... + aₙ x = 0,   (2.24)

where a₁, ..., aₙ are real (or complex) constants. We discuss here methods of constructing n linearly independent solutions of (2.24), which will then give the general solution of (2.24). It will be convenient to obtain the complex valued general solution x(t) and then to extract the real valued general solution, if necessary.

The idea is very simple. Let us look for a solution in the form x(t) = e^{λt}, where λ is a complex number to be determined. Substituting this function into (2.24) and noticing that x⁽ᵏ⁾ = λᵏ e^{λt}, we obtain after cancellation by e^{λt} the following equation for λ:

n n 1 λ + a1λ − + .... + an =0.

40 This equation is called the characteristic equation of (2.24) and the polynomial P (λ)= n n 1 λ + a1λ − + .... + an is called the characteristic polynomial of (2.24). Hence, if λ is the root38 of the characteristic polynomial then the function eλt solves (2.24). We try to obtain in this way n independent solutions.

Theorem 2.4 If the characteristic polynomial P (λ) of (2.24) has n distinct complex roots λ1, ..., λn, then the following n functions

eλ1t, ..., eλnt (2.25) are linearly independent complex solutions of (2.24). Consequently, the general complex solution of (2.24) is given by

λ1t λnt x (t)=C1e + ... + Cne , (2.26) where Cj are arbitrary complex numbers. Let a , ...a be reals. If λ = α + iβ is a non-real root of P (λ) then λ = α iβ is also 1 n ¡ arootofP (λ), and the functions eλt, eλt in the sequence (2.25) canbereplacedbythe real-valued functions eαt cos βt, eαt sin βt. By doing so to any couple of roots, one obtains n real-valued linearly independent solutions of (2.24),andthegeneral real solution of (2.24) is their linear combination with real coefficients.

Example. Consider the ODE x00 3x0 +2x =0. ¡ 2 The characteristic polynomial is P (λ)=λ 3λ + 2, which has the roots λ1 =1and ¡ t 2t λ2 = 2. Hence, the linearly independent solutions are e and e , and the general solution t 2t is C1e + C2e . More precisely, we obtain the general real solution if C1,C2 vary in R, and the general complex solution if C1,C2 C. 2
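For readers who wish to double-check such computations, here is a minimal Python sketch (assuming the sympy library is available; it is not part of these notes) verifying that C_1 e^t + C_2 e^{2t} indeed solves x'' − 3x' + 2x = 0.

    import sympy as sp

    t, C1, C2 = sp.symbols('t C1 C2')
    x = C1*sp.exp(t) + C2*sp.exp(2*t)
    # residual of x'' - 3x' + 2x; it simplifies to 0 for all C1, C2
    print(sp.simplify(sp.diff(x, t, 2) - 3*sp.diff(x, t) + 2*x))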

38Nullstelle

41 Lecture 7 05.05.2009 Prof. A. Grigorian, ODE, SS 2009

Example. Consider the ODE x00 + x = 0. The characteristic polynomial is P (λ)= 2 λ + 1, which has the complex roots λ1 = i and λ2 = i. Hence, we obtain the complex it it ¡ independent solutions e and e− . Out of them, we can get also real linearly independent solutions. Indeed, just replace these two functions by their two linear combinations (which corresponds to a change of the basis in the space of solutions)

it it it it e + e− e e− =cost and ¡ =sint. 2 2i Hence, we conclude that cos t and sin t are linearly independent solutions and the general solution is C1 cos t + C2 sin t.

Example. Consider the ODE x''' − x = 0. The characteristic polynomial is P(λ) = λ³ − 1 = (λ − 1)(λ² + λ + 1), which has the roots λ_1 = 1 and λ_{2,3} = −1/2 ± i√3/2. Hence, we obtain the three linearly independent real solutions

e^t,   e^{−t/2} cos(√3 t/2),   e^{−t/2} sin(√3 t/2),

and the real general solution is

C_1 e^t + e^{−t/2} ( C_2 cos(√3 t/2) + C_3 sin(√3 t/2) ).

Proof of Theorem 2.4. Since we know already that eλkt are solutions of (2.24), it suffices to prove that the functions in the list (2.25) are linearly independent. The fact that the general solution has the form (2.26) follows then from Corollary to Theorem 2.3. Let us prove by induction in n that the functions eλ1t, ..., eλnt are linearly independent whenever λ1,...,λn are distinct complex numbers. If n = 1 then the claim is trivial, just because the exponential function is not identical zero. Inductive step from n 1ton: ¡ Assume that, for some complex constants C1, ..., Cn and all t R, 2 λ1t λnt C1e + ... + Cne =0, (2.27)

λnt and prove that C1 = ... = Cn =0. Dividing (2.27) by e and setting μj = λj λn,we obtain ¡ μ1t μn 1t C1e + ... + Cn 1e − + Cn =0. − Differentiating in t,weobtain

μ1t μn 1t C1μ1e + ... + Cn 1μn 1e − =0. − −

By the inductive hypothesis, we conclude that Cjμj =0whenbyμj = 0 we conclude C =0,forallj =1,...,n 1. Substituting into (2.27), we obtain also C6 =0. j ¡ n Let a1, .., an be reals. Since the complex conjugations commutes with addition and multiplication of numbers, the identity P (λ)=0impliesP λ =0(sinceak are real, we have ak = ak). Next, we have ¡ ¢ eλt = eαt (cos βt + i sin βt)andeλt = eαt (cos βt sin βt) (2.28) ¡ 42 so that eλt and eλt are linear combinations of eαt cos βt and eαt sin βt.Theconverseis true also, because 1 1 eαt cos βt = eλt + eλt and eαt sin βt = eλt eλt . (2.29) 2 2i ¡ ³ ´ ³ ´ Hence, replacing in the sequence eλ1t, ...., eλnt the functions eλt and eλt by eαt cos βt and eαt sin βt preserves the of the sequence. After replacing all couples eλt, eλt by the real-valued solutions as above, we obtain n linearly independent real-valued solutions of (2.24), which hence form a basis in the space of all solutions. Taking theirlinearcombinationswithrealcoefficients, we obtain all real-valued solutions. What to do when P (λ)hasfewerthann distinct roots? Recall the fundamental the- orem of algebra (which is normally proved in a course of Complex Analysis): any polyno- mial P (λ)ofdegreen with complex coefficients has exactly n complex roots counted with multiplicity. If λ0 is a root of P (λ) then its multiplicity is the maximal natural number m m such that P (λ) is divisible by (λ λ0) , that is, the following identity holds ¡ m P (λ)=(λ λ0) Q (λ)forallλ C, ¡ 2 where Q (λ) is another polynomial of λ.NotethatP (λ) is always divisible by λ λ0 so that m 1. The fact that m is maximal possible is equivalent to Q (λ) =0.The¡ ¸ 6 fundamental theorem of algebra can be stated as follows: if λ1, ..., λr are all distinct roots of P (λ) and the multiplicity of λj is mj then

m1 + ... + mr = n and, hence, m1 mr P (λ)=(λ λ1) ... (λ λr) for all λ C. ¡ ¡ 2 In order to obtain n independent solutions to the ODE (2.24), each root λj should give rise to mj independent solutions. This is indeed can be done as is stated in the following theorem.

Theorem 2.5 Let λ1, ..., λr be all the distinct complex roots of the characteristic polyno- mial P (λ) with the multiplicities m1, ..., mr, respectively. Then the following n functions are linearly independent solutions of (2.24):

tkeλj t ,j=1, ..., r, k =0, ..., m 1. (2.30) j ¡ Consequently, the general© solutionª of (2.24) is

r mj 1 − k λj t x (t)= Ckjt e , (2.31) j=1 k=0 X X where Ckj are arbitrary complex constants. If a1, ..., an are real and λ = α + iβ is a non-real root of P (λ) of multiplicity m,then λ = α iβ is also a root of the same multiplicity m, and the functions tkeλt, tkeλt in the sequence¡ (2.30) can be replaced by the real-valued functions tkeαt cos βt, tkeαt sin βt,for any k =0,...,m 1. ¡ 43 Remark. Setting mj 1 − k Pj (t)= Cjkt , k=1 X we obtain from (2.31) r λj t x (t)= Pj (t) e . (2.32) j=1 X Hence, any solution to (2.24) has the form (2.32) where Pj is an arbitrary polynomial of t of the degree at most m 1. j ¡ Example. Consider the ODE x00 2x0 + x = 0 which has the characteristic polynomial ¡ P (λ)=λ2 2λ +1=(λ 1)2 . ¡ ¡ Obviously, λ = 1 is the root of multiplicity 2. Hence, by Theorem 2.5, the functions et and tet are linearly independent solutions, and the general solution is

t x (t)=(C1 + C2t) e .
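As a quick sanity check (a sketch assuming sympy; not part of the original notes), one can verify symbolically that the second solution t e^t produced by the double root indeed satisfies the equation.

    import sympy as sp

    t = sp.symbols('t')
    x = t*sp.exp(t)
    # x'' - 2x' + x vanishes identically, so t*e^t is a solution
    print(sp.simplify(sp.diff(x, t, 2) - 2*sp.diff(x, t) + x))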

V IV Example. Consider the ODE x + x 2x000 2x00 + x0 + x = 0. The characteristic polynomial is ¡ ¡

P (λ)=λ5 + λ4 2λ3 2λ2 + λ +1=(λ 1)2 (λ +1)3 . ¡ ¡ ¡

Hence, the roots are λ_1 = 1 with m_1 = 2 and λ_2 = −1 with m_2 = 3. We conclude that the following 5 functions are linearly independent solutions:

e^t,   t e^t,   e^{−t},   t e^{−t},   t² e^{−t}.

The general solution is

t 2 t x (t)=(C1 + C2t) e + C3 + C4t + C5t e− . ¡ ¢

V Example. Consider the ODE x +2x000 + x0 = 0. Its characteristic polynomial is

P(λ) = λ⁵ + 2λ³ + λ = λ (λ² + 1)² = λ (λ + i)² (λ − i)²,

and it has the roots λ_1 = 0, λ_2 = i and λ_3 = −i, where λ_2 and λ_3 have multiplicity 2. The following 5 functions are linearly independent solutions:

1,   e^{it},   t e^{it},   e^{−it},   t e^{−it}.    (2.33)

The general complex solution is then

it it C1 +(C2 + C3t) e +(C4 + C5t) e− .

44 it it it it Replacing in the sequence (2.33) e ,e− by cos t, sin t and te ,te− by t cos t, t sin t,we obtain the linearly independent real solutions

1, cos t, t cos t, sin t, t sin t, and the general real solution

C1 +(C2 + C3t)cost +(C4 + C5t)sint.
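The roots and multiplicities used in this example can also be read off mechanically from the characteristic polynomial; the following sketch (assuming sympy) returns each root together with its multiplicity.

    import sympy as sp

    lam = sp.symbols('lambda')
    P = lam**5 + 2*lam**3 + lam
    # roots with multiplicities: 0 once, i and -i twice each,
    # giving the basis 1, e^{it}, t e^{it}, e^{-it}, t e^{-it}
    print(sp.roots(P, lam))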

We make some preparation for the proof of Theorem 2.5. Given a polynomial

n n 1 P (λ)=a0λ + a1λ − + ... + an with complex coefficients, associate with it the differential operator

n n 1 d d d − P = a + a + ... + a dt 0 dt 1 dt 0 µ ¶ µn ¶ n 1µ ¶ d d − = a0 n + a1 n 1 + ... + a0, dt dt − where we use the convention that the “product” of differential operators is the composi- d tion. That is, the operator P dt acts on a smooth enough function f (t)bytherule d¡ ¢ P f = a f (n) + a f (n 1) + ... + a f dt 0 1 − 0 µ ¶

(here the constant term a0 is understood as a multiplication operator). For example, the ODE

(n) (n 1) x + a1x − + ... + anx =0 (2.34) can be written shortly in the form d P x =0 dt µ ¶ n n 1 where P (λ)=λ + a1λ − + ... + an is the characteristic polynomial of (2.34). As an example of usage of this notation, let us prove the following identity: d P eλt = P (λ) eλt. (2.35) dt µ ¶ Indeed, it suffices to verify it for P (λ)=λk and then use the linearity of this identity. For such P (λ)=λk,wehave

d dk P eλt = eλt = λkeλt = P (λ) eλt, dt dtk µ ¶ which was to be proved. If λ is a root of P then (2.35) implies that eλt is a solution to (2.24), which has been observed and used above.
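The identity (2.35) is easy to test on a concrete polynomial; the sketch below (assuming sympy) uses P(λ) = λ² + 2λ + 3, an arbitrary choice made only for illustration.

    import sympy as sp

    t, lam = sp.symbols('t lambda')
    x = sp.exp(lam*t)
    applied  = sp.diff(x, t, 2) + 2*sp.diff(x, t) + 3*x   # P(d/dt) e^{lambda t}
    expected = (lam**2 + 2*lam + 3)*sp.exp(lam*t)         # P(lambda) e^{lambda t}
    print(sp.simplify(applied - expected))                # 0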

45 Lecture 8 11.05.2009 Prof. A. Grigorian, ODE, SS 2009 The following lemma is a far reaching generalization of (2.35).

Lemma 2.6 If f (t) ,g(t) are n times differentiable functions on an interval then, for n n 1 any polynomial P (λ)=a0λ + a1λ − + ... + an of the order at most n, the following identity holds: d n 1 d P (fg)= f (j)P (j) g. (2.36) dt j! dt j=0 µ ¶ X µ ¶

2 Example. Let P (λ)=λ + λ +1.ThenP 0 (λ)=2λ +1,P00 =2,and(2.36)becomes

d d 1 d (fg)00 +(fg)0 + fg = fP g + f P g + f P g dt 0 0 dt 2 00 00 dt µ ¶ µ ¶ µ ¶ = f (g00 + g0 + g)+f 0 (2g0 + g)+f 00g.

It is an easy exercise to see directly that this identity is correct. Proof. It suffices to prove the identity (2.36) in the case when P (λ)=λk,k n, because then for a general polynomial (2.36) will follow by taking linear combination· of those for λk.IfP (λ)=λk then, for j k · (j) k j P = k (k 1) ... (k j +1)λ − ¡ ¡ and P (j) 0forj>k. Hence, ´ k j d d − P (j) = k (k 1) ... (k j +1) ,j k, dt ¡ ¡ dt · µ ¶ µ ¶ d P (j) =0,j>k, dt µ ¶ and (2.36) becomes

k k (k) k (k 1) ... (k j +1) (j) (k j) k (j) (k j) (fg) = ¡ ¡ f g − = f g − . (2.37) j! j j=0 j=0 X X µ ¶ The latter identity is known from Analysis and is called the Leibniz formula 39.

Lemma 2.7 A complex number λ is a root of a polynomial P with the multiplicity m if and only if P (k) (λ)=0for all k =0,...,m 1 and P (m) (λ) =0. (2.38) ¡ 6 39If k = 1 then (2.37) amounts to the familiar product rule

(fg)0 = f 0g + fg0:

For arbitrary k N, (2.37) is proved by induction in k. 2

46 Proof. If P has a root λ with multiplicity m then we have the identity

m P (z)=(z λ) Q (z)forallz C, ¡ 2 where Q is a polynomial such that Q (λ) = 0. We use this identity for z R. For any natural k,wehavebytheLeibnizformula6 2

k k P (k) (z)= ((z λ)m)(j) Q(k j) (z) . j − j=0 ¡ X µ ¶ If k

m (j) m j ((z λ) ) =const(z λ) − , ¡ ¡ which vanishes at z = λ. Hence, for k

P (λ) P (n) (λ) P (z)=P (λ)+ 0 (z λ)+... + (z λ)n 1! ¡ n! ¡ P (m) (λ) P (n) (λ) = (z λ)m + ... + (z λ)n m! ¡ n! ¡ =(z λ)m Q (z) ¡ where (m) (m+1) (n) P (λ) P (λ) P (λ) n m Q (z)= + (z λ)+... + (z λ) − . m! (m +1)! ¡ n! ¡

(m) Obviously, Q (λ)= P (λ) = 0, which implies that λ is a root of multiplicity m. m! 6

Lemma 2.8 If λ1, ..., λr are distinct complex numbers and if, for some polynomials Pj (t),

r λj t Pj (t) e =0 for all t R, (2.39) 2 j=1 X then P (t) 0 for all j. j ´ Proof. Induction in r.Ifr = 1 then there is nothing to prove. Let us prove the λrt inductive step from r 1tor. Dividing (2.39) by e and setting μj = λj λr,weobtain the identity ¡ ¡ r 1 − μj t Pj (t) e + Pr (t)=0. (2.40) j=1 X

47 Choose some integer k>deg Pr,wheredegP as the maximal power of t that enters P with non-zero coefficient. Differentiating the above identity k times, we obtain

r 1 − μj t Qj (t) e =0, j=1 X (k) wherewehaveusedthefactthat(Pr) =0and

(k) μj t μt Pj (t) e = Qj (t) e for some polynomial Qj (this for¡ example¢ follows from the Leibniz formula). By the inductive hypothesis, we conclude that all Q 0, which implies that j ´ (k) μj t Pje =0.

μ t Hence, the function Pje j must be equal¡ to¢ a polynomial, say Rj (t). We need to show that P 0. Assuming the contrary, we obtain the identity j ´ Rj (t) eμj t = (2.41) Pj (t) which holds for all t except for the (finitely many) roots of Pj (t). If μj is real and μj > 0 then (2.41) is not possible since eμj t goes to as t + faster than the rational 40 1 ! 1 function on the right hand side of (2.41). If μj < 0 then the same argument applies when t .Letnowμj be complex, say μj = α + iβ with β = 0. Then we obtain from (2.41)!¡1 that 6 R (t) R (t) eαt cos βt =Re j and eαt sin βt =Im j . Pj (t) Pj (t) It is easy to see that the real and imaginary parts of a rational function are again ratio- nal functions. Since cos βt and sin βt vanish at infinitely many points and any rational function vanishes at finitely many points unless it is identical constant, we conclude that R 0 and hence P 0, for any j =1, .., r 1. Substituting into (2.40), we obtain that j ´ j ´ ¡ also Pr 0. ProofofTheorem2.5.´ Let P (λ) be the characteristic polynomial of (2.34). We first prove that if λ is a root of multiplicity m then the function tkeλt solves (2.34) for any k =0, ..., m 1. By Lemma 2.6, we have ¡ n d 1 (j) d P tkeλt = tk P (j) eλt dt j! dt µ ¶ j=0 µ ¶ ¡ ¢ Xn ¡ ¢ 1 (j) = tk P (j) (λ) eλt. j! j=0 X ¡ ¢ If j>kthen the tk (j) 0. If j k then j

48 that is, the function x (t)=tkeλt solves (2.34). If λ1, ..., λr are all distinct complex roots of P (λ)andmj is the multiplicity of λj then it follows that each function in the following sequence

tkeλj t ,j=1, ..., r, k =0,...,m 1, (2.42) j ¡ is a solution of (2.34). Let© us showª that these functions are linearly independent. Clearly, each linear combination of functions (2.42) has the form

r mj 1 r − k λj t λj t Cjkt e = Pj (t) e (2.43) j=1 k=0 j=1 X X X mj 1 k where Pj (t)= k=0− Cjkt are polynomials. If the linear combination is identical zero then by Lemma 2.8 Pj 0, which implies that all Cjk are 0. Hence, the functions (2.42) are linearly independent,P ´ and by Theorem 2.3 the general solution of (2.34) has the form (2.43). Let us show that if λ = α + iβ is a complex (non-real) root of multiplicity m then λ = α iβ is also a root of the same multiplicity m. Indeed, by Lemma 2.7, λ satisfies the relations¡ (2.38). Applying the complex conjugation and using the fact that the coefficients of P arereal,weobtainthatthesamerelationsholdforλ instead of λ, which implies that λ is also a root of multiplicity m. The last claim that every couple tkeλt,tkeλt in (2.42) can be replaced by real-valued functions tkeαt cos βt, tkeαt sin βt, follows from the observation that the functions tkeαt cos βt, tkeαt sin βt are linear combinations of tkeλt,tkeλt, and vice versa, which one sees from the identities 1 1 eαt cos βt = eλt + eλt ,eαt sin βt = eλt eλt , 2 2i ¡ ³ ´ ³ ´ eλt = eαt (cos βt + i sin βt) ,eλt = eαt (cos βt i sin βt) , ¡ multiplied by tk (compare the proof of Theorem 2.4).

2.5 Solving linear inhomogeneous ODEs with constant coeffi- cients Here we consider the ODE

(n) (n 1) x + a1x − + ... + anx = f (t) , (2.44) where the function f (t)isaquasi-polynomial,thatis,f has the form

μj t f (t)= Rj (t) e j X where Rj (t)arepolynomials,μj are complex numbers, and the sum is finite. It is obvious that the sum and the product of two quasi-polynomials is again a quasi-polynomial. In particular, the following functions are quasi-polynomials

tkeαt cos βt and tkeαt sin βt

49 (where k is a non-negative integer and α, β R)because 2 iβt iβt iβt iβt e + e− e e− cos βt = and sin βt = ¡ . 2 2i As before, denote by P (λ) the characteristic polynomial of (2.44), that is

n n 1 P (λ)=λ + a1λ − + ... + an.

d Then the equation (2.44) can be written shortly in the form P dt x = f,whichwillbe used below. We start with the following observation. ¡ ¢ d Claim. If f = f1 +...+fk and x1 (t) ,...,xk (t) are solutions to the equation P dt xj = fj, d then x = x1 + ... + xk solves the equation P dt x = f. Proof. This is trivial because ¡ ¢ ¡ ¢ d d d P x = P x = P x = f = f. dt dt j dt j j j j j µ ¶ µ ¶ X X µ ¶ X

50 Lecture 9 12.05.2009 Prof. A. Grigorian, ODE, SS 2009 Hence, we can assume that the function f in (2.44) is of the form f (t)=R (t) eμt where R (t) is a polynomial. As we know, the general solution of the inhomogeneous equation (2.44) is obtained as a sum of the general solution of the associated homogeneous equation and a particular solution of (2.44). Hence, we focus on finding a particular solution of (2.44). To illustrate the method, which will be used in this Section, consider first the following particular case: d P x = eμt (2.45) dt µ ¶ where μ is not a root of the characteristic polynomial P (λ)(non-resonant case). We claim that (2.45) has a particular solution in the form x (t)=aeμt where a is a complex constant to be chosen. Indeed, we have by (2.35) d P eμt = P (μ) eμt, dt µ ¶ ¡ ¢ whence d P aeμt = eμt dt µ ¶ provided ¡ ¢ 1 a = . (2.46) P (μ) Example. Let us find a particular solution to the ODE

t x00 +2x0 + x = e .

Note that P (λ)=λ2 +2λ +1andμ =1isnotarootofP . Lookforasolutioninthe form x (t)=aet. Substituting into the equation, we obtain

aet +2aet + aet = et whence we obtain the equation for a: 1 4a =1,a = . 4 Alternatively, we can obtain a from (2.46), that is, 1 1 1 a = = = . P (μ) 1+2+1 4

1 t Hence, the answer is x (t)= 4 e . Consider another equation:

x00 +2x0 + x =sint (2.47)

Note that sin t is the imaginary part of eit.So,wefirst solve

it x00 +2x0 + x = e

51 and then take the imaginary part of the solution. Looking for a solution in the form x (t)=aeit,weobtain 1 1 1 i a = = = = . P (μ) i2 +2i +1 2i ¡2 Hence, the solution is i i 1 i x = eit = (cos t + i sin t)= sin t cos t. ¡2 ¡2 2 ¡ 2 Therefore, its imaginary part x (t)= 1 cos t solves the equation (2.47). ¡ 2 Example. Consider yet another ODE t x00 +2x0 + x = e− cos t. (2.48) t μt Here e− cos t is a real part of e where μ = 1+i. Hence, first solve ¡ μt x00 +2x0 + x = e . Setting x (t)=aeμt,weobtain 1 1 a = = = 1. P (μ) ( 1+i)2 +2( 1+i)+1 ¡ ¡ ¡ ( 1+i)t t t Hence, the complex solution is x (t)= e − = e− cos t ie− sin t,andthesolution t ¡ ¡ ¡ to (2.48) is x (t)= e− cos t. ¡ Example. Finally, let us combine the above examples into one: t t x00 +2x0 + x =2e sin t + e− cos t. (2.49) ¡ A particular solution is obtained by combining the above particular solutions:

1 t 1 t x (t)=2 e cos t + e− cos t 4 ¡ ¡2 ¡ µ ¶ µ ¶ 1 t 1 t ¡ ¢ = e + cos t e− cos t. 2 2 ¡

Since the general solution to the homogeneous ODE x00 +2x0 + x =0is t x (t)=(C1 + C2t) e− , we obtain the general solution to (2.49)

t 1 t 1 t x (t)=(C + C t) e− + e + cos t e− cos t. 1 2 2 2 ¡
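The particular solution assembled above can be checked directly; the following sketch (assuming sympy) substitutes x_p(t) = (1/2) e^t + (1/2) cos t − e^{−t} cos t into the left hand side of (2.49).

    import sympy as sp

    t = sp.symbols('t')
    x = sp.Rational(1, 2)*sp.exp(t) + sp.Rational(1, 2)*sp.cos(t) - sp.exp(-t)*sp.cos(t)
    rhs = 2*sp.exp(t) - sp.sin(t) + sp.exp(-t)*sp.cos(t)
    # the residual x'' + 2x' + x - rhs simplifies to 0
    print(sp.simplify(sp.diff(x, t, 2) + 2*sp.diff(x, t) + x - rhs))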

Consider one more equation t x00 +2x0 + x = e− . This time μ = 1isarootofP (λ)=λ2 +2λ +1andtheabovemethoddoesnotwork. ¡ t Indeed, if we look for a solution in the form x = ae− then after substitution we get 0 in t the left hand side because e− solves the homogeneous equation. Thecasewhenμ is a root of P (λ)isreferredtoasaresonance.Thiscaseaswellas the case of the general quasi-polynomial in the right hand side is treated in the following theorem.

52 Theorem 2.9 Let R (t) be a non-zero polynomial of degree k 0 and μ be a complex number. Let m be the multiplicity of μ if μ is a root of P and m¸=0if μ is not a root of P . Then the equation d P x = R (t) eμt (2.50) dt µ ¶ has a particular solution of the form

x (t)=tmQ (t) eμt, where Q (t) is a polynomial of degree k (which is to be found). d μt In the case when k =0and R (t) a,theODEP dt x = ae has a particular solution ´ a x (t)= tmeμt. ¡ ¢ (2.51) P (m) (μ)

Example. Come back to the equation

t x00 +2x0 + x = e− .

Here μ = 1 is a root of multiplicity m =2andR (t) = 1 is a polynomial of degree 0. Hence, the¡ solution should be sought in the form

2 t x (t)=at e− where a isaconstantthatreplacesQ (indeed, Q must have degree 0 and, hence, is a constant). Substituting this into the equation, we obtain

2 t 2 t 2 t t a t e− 00 +2 t e− 0 + t e− = e− (2.52) ³ ´ Expanding the expression in¡ the bra¢ ckets,¡ we obtain¢ the identity

2 t 2 t 2 t t t e− 00 +2 t e− 0 + t e− =2e− ,

¡ ¢ ¡1 ¢ so that (2.52) becomes 2a =1anda = 2 . Hence, a particular solution is 1 x (t)= t2e t. 2 − Alternatively, this solution can be obtained by (2.51): 1 1 x (t)= t2e t = t2e t. P ( 1) − 2 − 00 ¡ Consider one more example.

t x00 +2x0 + x = te− with the same μ = 1andR (t)=t.SincedegR = 1, the polynomial Q must have degree 1, that is, Q (¡t)=at + b.Thecoefficients a and b can be determined as follows. Substituting 2 t 3 2 t x (t)=(at + b) t e− = at + bt e− ¡ ¢ 53 into the equation, we obtain

3 2 t 3 2 t 3 2 t x00 +2x0 + x = at + bt e− 00 +2 at + bt e− 0 + at + bt e− =(2b +6at) e t. ¡¡ ¢− ¢ ¡¡ ¢ ¢ ¡ ¢ Hence, comparing with the equation, we obtain

2b +6at = t

1 so that b =0anda = 6 .Thefinal answer is t3 x (t)= e t. 6 −
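Again, such resonant particular solutions are easy to verify by substitution; a short sketch (assuming sympy) checking x(t) = t³ e^{−t}/6 against the right hand side t e^{−t}:

    import sympy as sp

    t = sp.symbols('t')
    x = t**3*sp.exp(-t)/6
    # x'' + 2x' + x - t*e^{-t} simplifies to 0
    print(sp.simplify(sp.diff(x, t, 2) + 2*sp.diff(x, t) + x - t*sp.exp(-t)))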

Proof of Theorem 2.9. Let us prove that the equation

d P x = R (t) eμt dt µ ¶ has a solution in the form x (t)=tmQ (t) eμt where m is the multiplicity of μ and deg Q = k =degR.UsingLemma2.6,wehave

d d 1 d P x = P tmQ (t) eμt = (tmQ (t))(j) P (j) eμt dt dt j! dt µ ¶ µ ¶ j 0 µ ¶ X≥ 1 ¡ ¢ = (tmQ (t))(j) P (j) (μ) eμt. (2.53) j! j 0 X≥ By Lemma 2.6, the summation here runs from j =0toj = n but we can allow any j 0 because for j>nthe derivative P (j) is identical zero anyway. Furthermore, since P (¸j) (μ)=0forallj m 1, we can restrict the summation to j m.Set · ¡ ¸ y (t)=(tmQ (t))(m) (2.54) and observe that y (t) is a polynomial of degree k,providedsoisQ (t). Conversely, for any polynomial y (t)ofdegreek, there is a polynomial Q (t)ofdegreek such that (2.54) holds. Indeed, integrating (2.54) m times without adding constants and then dividing by tm,weobtainQ (t) as a polynomial of degree k. It follows from (2.53) that y must satisfy the ODE

P (j) (μ) y(j m) = R (t) , j! − j m X≥ which we rewrite in the form

(l) b0y + b1y0 + ... + bly + ... = R (t) (2.55)

54 (l+m) where b = P (μ) (in fact, the index l = j m can be restricted to l k since y(l) 0 l (l+m)! ¡ · ´ for l>k). Note that P (m) (μ) b0 = =0. (2.56) m! 6 Hence, the problem amounts to the following: given a polynomial R (t)ofdegreek,prove that there exists a polynomial y (t)ofdegreek that satisfies (2.55). Let us prove the existence of y by induction in k. The inductive basis for k =0.Inthiscase,R (t)isaconstant,sayR (t)=a,andy (t) is sought as a constant too, say y (t)=c. Then (2.55) becomes b0c = a whence c = a/b0. Here we use that b0 =0. The inductive step6 from the values smaller than k to k.Representy in the from

y = ctk + z (t) , (2.57) where z is a polynomial of degree

(l) k k b z + b z0 + ... + b z + ... = R (t) b ct + b ct 0 + ... =: R (t) . 0 1 l ¡ 0 1 ³ ´ k ¡ ¢ Setting R (t)=at + lower order terms and choosing c from the equatione b0c = a,that k is, c = a/b0,weseethatthetermt cancels out and R (t) is a polynomial of degree

k has a solution z (t) that is a polynomial of degree

m!a (tmQ (t))(m) = P (m) (μ) whence after m integrations we find one of solutions for Q (t): a Q (t)= . P (m) (μ)

d μt Therefore, the ODE P dt x = ae has a particular solution ¡ ¢ a x (t)= tmeμt. P (m) (μ)

55 2.6 Second order ODE with periodic right hand side Consider a second order ODE

x00 + px0 + qx = f (t) , (2.58) which occurs in various physical phenomena. For example, (2.58) describes the movement of a point body of mass m along the axis x, where the term px0 comes from the friction forces, qx - from the elastic forces, and f (t) is an external time-dependant force. Another physical situation that is described by (2.58), is an RLC-circuit:

R x(t)

+ V(t) _

C

As before, let R the resistance, L be the inductance, and C be the capacitance of the circuit. Let V (t)bethevoltageofthepowersourceinthecircuitandx (t) be the current in the circuit at time t.Wehaveseenthatx (t)satisfied the following ODE x Lx + Rx + = V . 00 0 C 0 If L = 0 then dividing by L we obtain an ODE of the form (2.58) where p = R/L, 6 q =1/ (LC), and f = V 0/L. As an example of application of the above methods of solving linear ODEs with con- stant coefficients, we investigate here the ODE

x00 + px0 + qx = A sin ωt, (2.59) where A, ω are given positive reals. The function A sin ωt is a model for a more general periodic force f (t), which makes good physical sense in all the above examples. For example, in the case of electrical circuit the external force has the form A sin ωt if the power source is an electrical socket with the alternating current41 (AC). The number ω 42 2π is called the frequency of the external force (note that the period = ω )ortheexternal frequency, and the number A is called the amplitude43 (the maximum value) of the external force. Assume in the sequel that p 0andq>0, which is physically most interesting case. To find a particular solution of¸ (2.59), let us consider first the ODE with complex right hand side: iωt x00 + px0 + qx = Ae . (2.60)

41Wechselstrom 42Frequenz 43Amplitude

56 Let P (λ)=λ2 + pλ + q be the characteristic polynomial. Consider first the non-resonant case when iω is not a root of P (λ). Searching the solution in the from ceiωt,weobtain by Theorem 2.9 A A c = = =: a + ib P (iω) ω2 + piω + q ¡ and the particular solution of (2.60) is

(a + ib) eiωt =(a cos ωt b sin ωt)+i (a sin ωt + b cos ωt) . ¡ Taking its imaginary part, we obtain a particular solution of (2.59)

x (t)=a sin ωt + b cos ωt. (2.61)

57 Lecture 10 18.05.2009 Prof. A. Grigorian, ODE, SS 2009 Rewrite (2.61) in the form x (t)=B (cos ϕ sin ωt +sinϕ cos ωt)=B sin (ωt + ϕ) , where A B = pa2 + b2 = c = , (2.62) j j (q ω2)2 + ω2p2 ¡ and the angle ϕ [0, 2π) is determined from theq identities 2 a b cos ϕ = , sin ϕ = . B B The number B is the amplitude of the solution, and ϕ is called the phase. To obtain the general solution of (2.59), we need to add to (2.61) the general solution of the homogeneous equation x00 + px0 + qx =0.
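Numerically, the amplitude B and the phase ϕ of the particular solution are obtained directly from c = A/P(iω), as in (2.62); the sketch below (plain Python, with illustrative values of p, q, A, ω that are not taken from the text) shows the computation.

    import math

    p, q, A, w = 1.0, 2.0, 1.0, 3.0          # illustrative values only
    c = A / ((1j*w)**2 + p*(1j*w) + q)       # c = a + ib = A / P(i*omega)
    a, b = c.real, c.imag
    B = abs(c)                               # amplitude of B*sin(w*t + phi)
    phi = math.atan2(b, a)                   # phase, up to adding 2*pi
    # the two computations of B agree with formula (2.62)
    print(B, A/math.sqrt((q - w**2)**2 + (w*p)**2), phi)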

Let λ1 and λ2 are the roots of P (λ), that is, p p2 λ1,2 = q. ¡2 § r 4 ¡ Consider the following cases. λ1 and λ2 are real. Since p 0andq>0, we see that λ1,λ2 < 0. The general solution of the homogeneous equation¸ has the from C eλ1t + C eλ2t if λ = λ , 1 2 1 6 2 λ1t (C1 + C2t) e if λ1 = λ2. In the both cases, it decays exponentially in t as t + . Hence, the general solution of (2.59) has the form ! 1 x (t)=B sin (ωt + ϕ) + exponentially decaying terms. As we see, when t the leading term of x (t) is the above particular solution B sin (ωt + ϕ). For the!1 electrical circuit this means that the current quickly stabilizes and becomes periodic with the same frequency ω as the external force. Here is a plot of t/4 such a function x (t)=sint +2e− :

[Plot of x(t) = sin t + 2 e^{−t/4} over 0 ≤ t ≤ 25.]

58 λ1 and λ2 are complex. Let λ1,2 = α iβ where § p2 α = p/2 0andβ = q > 0. ¡ · r ¡ 4 The general solution to the homogeneous equation is

αt αt e (C1 cos βt + C2 sin βt)=Ce sin (βt + ψ) , where C and ψ are arbitrary reals. The number β is called the natural frequency of the physical system in question (pendulum, electrical circuit, spring) for the obvious reason - in absence of the external force, the system oscillates with the natural frequency β. Hence, the general solution to (2.59) is

x (t)=B sin (ωt + ϕ)+Ceαt sin (βt + ψ) .

Consider two further sub-cases. If α<0 then the leading term is again B sin (ωt + ϕ). t/4 Here is a plot of a particular example of this type: x (t)=sint +2e− sin πt.

[Plot of x(t) = sin t + 2 e^{−t/4} sin πt over 0 ≤ t ≤ 25.]

If α =0,thenp =0,q= β2, and the equation has the form

2 x00 + β x = A sin ωt.

The roots of the characteristic polynomial are iβ. The assumption that iω is not a root means that ω = β. The general solution is § 6 x (t)=B sin (ωt + ϕ)+C sin (βt + ψ) , which is the sum of two sin waves with different frequencies — the natural frequency and the external frequency. If ω and β are incommensurable44 then x (t) is not periodic unless C = 0. Here is a particular example of such a function: x (t)=sint +2sinπt.

44inkommensurabel

[Plot of x(t) = sin t + 2 sin πt over 0 ≤ t ≤ 25.]

Strictly speaking, in practice electrical circuits with p = 0 do not occur, since the resistance is always positive.

Let us come back to the formula (2.62) for the amplitude B and, as an example of its application, consider the following question: for what value of the external frequency ω is the amplitude B maximal? Assuming that A does not depend on ω and using the identity

B² = A² / ( ω⁴ + (p² − 2q) ω² + q² ),

we see that the maximum of B occurs when the denominator takes its minimum value. If p² ≥ 2q then the minimum value occurs at ω = 0, which is not very interesting physically. Assume that p² < 2q (in particular, this implies that p² < 4q and, hence, λ_1 and λ_2 are complex). Then the maximum of B occurs when

ω² = −(1/2)(p² − 2q) = q − p²/2.

The value

ω_0 := √(q − p²/2)

is called the resonant frequency of the physical system in question. If the external force has the resonant frequency, that is, if ω = ω_0, then the system exhibits the highest response to this force. This phenomenon is called a resonance.

Note for comparison that the natural frequency is equal to β = √(q − p²/4), which is in general larger than ω_0. In terms of ω_0 and β, we can write

B² = A² / ( ω⁴ − 2ω_0² ω² + q² ) = A² / ( (ω² − ω_0²)² + q² − ω_0⁴ ) = A² / ( (ω² − ω_0²)² + p²β² ),

where we have used that

q² − ω_0⁴ = q² − (q − p²/2)² = qp² − p⁴/4 = p²β².

In particular, the maximum amplitude, which occurs at ω = ω_0, is B_max = A/(pβ).

Consider a numerical example of this type. For the ODE

x'' + 6x' + 34x = sin ωt,

the natural frequency is β = √(q − p²/4) = 5 and the resonant frequency is ω_0 = √(q − p²/2) = 4. The particular solution (2.61) in the case ω = ω_0 = 4 is x(t) = (1/50) sin 4t − (2/75) cos 4t with amplitude B = B_max = 1/30, and in the case ω = 7 it is x(t) = −(5/663) sin 7t − (14/663) cos 7t with amplitude B ≈ 0.0224. The plots of these two functions are shown below:

[Plots of the two particular solutions, x(t) = (1/50) sin 4t − (2/75) cos 4t and x(t) = −(5/663) sin 7t − (14/663) cos 7t, over 0 ≤ t ≤ 6.]
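The two amplitudes quoted in this example follow from (2.62); a small sketch (plain Python) reproducing them:

    import math

    def amplitude(p, q, A, w):
        # B = A / sqrt((q - w^2)^2 + w^2 p^2), formula (2.62)
        return A / math.sqrt((q - w**2)**2 + (w*p)**2)

    print(amplitude(6, 34, 1, 4))   # 0.0333... = 1/30, at the resonant frequency
    print(amplitude(6, 34, 1, 7))   # 0.0224..., away from it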

In conclusion, consider the case, when iω is a root of P (λ), that is

(iω)2 + piω + q =0,

which implies p = 0 and q = ω². In this case α = 0 and ω = ω_0 = β = √q, and the equation (2.59) has the form

x'' + ω² x = A sin ωt.    (2.63)

Considering the ODE

x'' + ω² x = A e^{iωt},

and searching a particular solution in the form x(t) = c t e^{iωt}, we obtain by Theorem 2.9

c = A / P'(iω) = A / (2iω).

Hence, the complex particular solution is

x(t) = (At/(2iω)) e^{iωt} = −i (At/(2ω)) cos ωt + (At/(2ω)) sin ωt.

Using its imaginary part, we obtain the general solution of (2.63) as follows:

x(t) = −(At/(2ω)) cos ωt + C sin(ωt + ψ).

Here is a plot of such a function, x(t) = −t cos t + 2 sin t:

[Plot of x(t) = −t cos t + 2 sin t over 0 ≤ t ≤ 25, showing oscillations of linearly growing amplitude.]

In this case we have a complete resonance: the external frequency ω is simultaneously equal to the natural frequency and the resonant frequency. In the case of a complete resonance, the amplitude increases in time unbounded. Since unbounded oscillations are physically impossible, either the system falls apart over time or the mathematical model becomes unsuitable for describing the physical system.

2.7 The method of 2.7.1 A system of the 1st order Consider again a general linear system

x0 = A (t) x + B (t) (2.64)

n n n where as before A (t):I R × and B (t):I R are continuous, I being an interval in ! ! R. We present here the method of variation of parameters that allows to solve (2.64), given n linearly independent solutions x1 (t) ,...,xn (t) of the homogeneous system x0 = A (t) x. We start with the following observation.

Lemma 2.10 If the solutions x1 (t) , ..., xn (t) of the system x0 = A (t) x are linearly in- dependent then, for any t I, the vectors x (t ) ,...,x (t ) arelinearlyindependent. 0 2 1 0 n 0 Proof. This statement is contained implicitly in the proof of Theorem 2.3, but nev- ertheless we repeat the argument. Assume that for some constant C1, ...., Cn

C1x1 (t0)+... + Cnxn (t0)=0.

Consider the function x (t)=C1x1 (t)+... + Cnxn (t) . Then x (t)solvestheIVP

x0 = A (t) x, x (t0)=0, ½ 62 whence by the uniqueness theorem x (t) 0. Since the solutions x , ..., x are independent, ´ 1 n it follows that C1 = ... = Cn = 0, whence the independence of vectors x1 (t0) , ..., xn (t0) follows. Example. Consider two vector functions

x_1(t) = (cos t, sin t)^T   and   x_2(t) = (sin t, cos t)^T,

which are obviously linearly independent. However, for t = π/4, we have

x_1(t) = (√2/2, √2/2)^T = x_2(t),

so that the vectors x_1(π/4) and x_2(π/4) are linearly dependent. Hence, x_1(t) and x_2(t) cannot be solutions of the same system x' = Ax. For comparison, the functions

x_1(t) = (cos t, sin t)^T   and   x_2(t) = (−sin t, cos t)^T

are solutions of the same system

x' = [[0, −1], [1, 0]] x,

and, hence, the vectors x_1(t) and x_2(t) are linearly independent for any t. This follows also from

det(x_1 | x_2) = det [[cos t, −sin t], [sin t, cos t]] = 1 ≠ 0.

Given n linearly independent solutions to x0 = A (t) x,letusforman n matrix £ X (t)=(x (t) x (t) ... x (t)) 1 j 2 j j n where the k-th column is the column-vector xk (t) ,k=1, ..., n. The matrix X is called the fundamental matrix of the system x0 = Ax. It follows from Lemma 2.10 that the column of X (t) are linearly independent for any 1 t I, which implies that det X (t) =0andthattheinversematrixX− (t)isdefined for all2t I. This allows us to solve the6 inhomogeneous system as follows. 2 Theorem 2.11 The general solution to the system

x0 = A (t) x + B (t) , (2.65) is given by 1 x (t)=X (t) X− (t) B (t) dt, (2.66) Z where X is the fundamental matrix of the system x0 = Ax.

63 Lecture 11 19.05.2009 Prof. A. Grigorian, ODE, SS 2009 Proof. Let us look for a solution to (2.65) in the form

x (t)=C1 (t) x1 (t)+... + Cn (t) xn (t) (2.67) where C1,C2, .., Cn are now unknown real-valued functions to be determined. Since for any n t I, the vectors x1 (t) , ..., xn (t) are linearly independent, any R -valued function x (t) can2 be represented in the form (2.67). Let C (t)bethecolumn-vectorwithcomponents C1 (t) , ..., Cn (t). Then identity (2.67) can be written in the matrix form as x = XC

1 whence C = X− x. It follows that C1, ..., Cn are rational functions of x1,...,xn,x.Since x1,...,xn,x are differentiable, the functions C1,...,Cn are also differentiable. Differentiating the identity (2.67) in t and using xk0 = Axk,weobtain

x0 = C1x10 + C2x20 + ... + Cnxn0

+C10 x1 + C20 x2 + ... + Cn0 xn

= C1Ax1 + C2Ax2 + ... + CnAxn

+C10 x1 + C20 x2 + ... + Cn0 xn

= Ax + C10 x1 + C20 x2 + ... + Cn0 xn = Ax + XC0.

Hence, the equation x0 = Ax + B is equivalent to

XC0 = B. (2.68)

Solving this vector equation for C0,weobtain

1 C0 = X− B,

1 C (t)= X− (t) B (t) dt, Z and 1 x (t)=XC = X (t) X− (t) B (t) dt. Z

The term “variation of parameters” comes from the identity (2.67). Indeed, if C1, ...., Cn are constant parameters then this identity determines the general solution of the homoge- neous ODE x0 = Ax.ByallowingC1, ..., Cn to be variable, we obtain the general solution to x0 = Ax + B. Second proof. Observe ¯rst that the matrix X satis¯es the following ODE

X0 = AX:

Indeed, this identity holds for any column xk of X, whence it follows for the whole matrix. Di®erentiating (2.66) in t and using the product rule, we obtain

1 1 x0 = X0 (t) X¡ (t) B (t) dt + X (t) X¡ (t) B (t) Z 1 ¡ ¢ = AX X¡ B (t) dt + B (t) Z = Ax + B (t) :

64 Hence, x (t) solves (2.65). Let us show that (2.66) gives all the solutions. Note that the integral in (2.66) is inde¯nite so that it can be presented in the form

1 X¡ (t) B (t) dt = V (t)+C; Z where V (t) is a vector function and C =(C1;:::;Cn) is an arbitrary constant vector. Hence, (2.66) gives x (t)=X (t) V (t)+X (t) C

= x_0(t) + C_1 x_1(t) + ... + C_n x_n(t),

where x_0(t) = X(t) V(t) is a solution of (2.65). By Theorem 2.3 we conclude that x(t) is indeed the general solution.

Example. Consider the system

x_1' = −x_2,   x_2' = x_1,

or, in the vector form,

x' = [[0, −1], [1, 0]] x.

It is easy to see that this system has two independent solutions

x_1(t) = (cos t, sin t)^T   and   x_2(t) = (−sin t, cos t)^T.

Hence, the corresponding fundamental matrix is

X = [[cos t, −sin t], [sin t, cos t]],

and

X^{−1} = [[cos t, sin t], [−sin t, cos t]].

Consider now the ODE x' = A(t) x + B(t) where B(t) = (b_1(t), b_2(t))^T. By (2.66), we obtain the general solution

x = [[cos t, −sin t], [sin t, cos t]] ∫ [[cos t, sin t], [−sin t, cos t]] (b_1(t), b_2(t))^T dt
  = [[cos t, −sin t], [sin t, cos t]] ∫ (b_1(t) cos t + b_2(t) sin t, −b_1(t) sin t + b_2(t) cos t)^T dt.

Consider a particular example, B(t) = (1, −t)^T. Then the integral is

∫ (cos t − t sin t, −sin t − t cos t)^T dt = (t cos t + C_1, −t sin t + C_2)^T,

whence

x = [[cos t, −sin t], [sin t, cos t]] (t cos t + C_1, −t sin t + C_2)^T
  = (C_1 cos t − C_2 sin t + t, C_1 sin t + C_2 cos t)^T
  = (t, 0)^T + C_1 (cos t, sin t)^T + C_2 (−sin t, cos t)^T.
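A quick symbolic check of this result (a sketch assuming sympy): the particular solution (t, 0)^T found above satisfies x' = Ax + B with A = [[0, −1], [1, 0]] and B(t) = (1, −t)^T.

    import sympy as sp

    t = sp.symbols('t')
    A = sp.Matrix([[0, -1], [1, 0]])
    B = sp.Matrix([1, -t])
    x = sp.Matrix([t, 0])
    # x' - (A*x + B) is the zero vector
    print(sp.simplify(sp.diff(x, t) - (A*x + B)))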

65 2.7.2 A scalar ODE of n-th order Consider now a scalar ODE of order n

(n) (n 1) x + a1 (t) x − + ... + an (t) x = f (t) , (2.69) where ak (t)andf (t) are continuous functions on some interval I. Recall that it can be reduced to the vector ODE x0 = A (t) x + B (t) where x (t) x (t) x (t)= 0 0 ... 1 (n 1) B x − (t) C B C and @ A 01 0... 0 0 00 1... 0 0 A = 0 ...... 1 and B = . 0 ... 1 B 00 0... 1 C B C B f C B an an 1 an 2 ... a1 C B C B ¡ ¡ − ¡ − ¡ C @ A If x1,...,xn are n linearly@ independent solutions to theA homogeneous ODE

(n) (n 1) x + a1x − + ... + an (t) x =0 then denoting by x1, ..., xn the corresponding vector solutions, we obtain the fundamental matrix x1 x2 ... xn x10 x20 ... xn0 X =(x1 x2 ... xn )=0 1 . j j j ...... (n 1) (n 1) (n 1) B x − x − ... xn − C B 1 2 C 1 @ A1 We need to multiply X− by B.Denotebyyik the element of X− at position i, k where i is the row index and k is the column index. Denote also by yk the k-th column y1k 1 of X− ,thatis,yk = ... .Then 0 1 ynk @ A y11 ... y1n 0 y1nf 1 X− B = ...... = ... = fyn, 0 1 0 1 0 1 yn1 ... ynn f ynnf @ A @ A @ A and the general vector solution is

x = X (t) f (t) yn (t) dt. Z We need the function x (t)whichisthefirst component of x. Therefore, we need only to take the first row of X to multiply by the column vector f (t) yn (t) dt, whence n R x (t)= xj (t) f (t) yjn (t) dt. j=1 X Z 66 Hence, we have proved the following.

Corollary. Let x1, ..., xn be n linearly independent solutions of

(n) (n 1) x + a1 (t) x − + ... + an (t) x =0 and X be the corresponding fundamental matrix. Then, for any continuous function f (t), the general solution of (2.69) is given by

n

x (t)= xj (t) f (t) yjn (t) dt, (2.70) j=1 X Z 1 where yjk are the entries of the matrix X− . Example. Consider the ODE x00 + x = f (t) . (2.71)

The independent solutions of the homogeneous equation x00 + x =0arex1 (t)=cost and x2 (t)=sint, so that the fundamental matrix is

cos t sin t X = , sin t cos t µ ¡ ¶ and the inverse is cos t sin t X 1 = . − sin t ¡cos t µ ¶ By (2.70), the general solution to (2.71) is

x (t)=x1 (t) f (t) y12 (t) dt + x2 (t) f (t) y22 (t) dt Z Z =cost f (t)( sin t) dt +sint f (t)costdt. (2.72) ¡ Z Z For example, if f (t)=sint then we obtain

x (t)=cost sin t ( sin t) dt +sint sin t cos tdt ¡ Z 1 Z = cos t sin2 tdt + sin t sin 2tdt ¡ 2 Z Z 1 1 1 = cos t t sin 2t + C + sin t ( cos 2t + C ) ¡ 2 ¡ 4 1 4 ¡ 2 µ ¶ 1 1 = t cos t + (sin 2t cos t sin t cos 2t)+c cos t + c sin t ¡2 4 ¡ 1 2 1 = t cos t + c cos t + c sin t. ¡2 1 2 Of course, the same result can be obtained by Theorem 2.9. Consider one more example

67 f (t)=tant that cannot be handled by Theorem 2.9. In this case, we obtain from (2.72)45

x =cost tan t ( sin t) dt +sint tan t cos tdt ¡ Z Z 1 1 sin t =cost ln ¡ +sint sin t cos t + c cos t + c sin t 2 1+sint ¡ 1 2 µ µ ¶ ¶ 1 1 sin t = cos t ln ¡ + c cos t + c sin t. 2 1+sint 1 2 µ ¶
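The answer can be confirmed by substitution; the sketch below (assuming sympy) checks that the particular part (1/2) cos t · ln((1 − sin t)/(1 + sin t)) satisfies x'' + x = tan t.

    import sympy as sp

    t = sp.symbols('t')
    x = sp.Rational(1, 2)*sp.cos(t)*sp.log((1 - sp.sin(t))/(1 + sp.sin(t)))
    # the residual x'' + x - tan t simplifies to 0
    print(sp.simplify(sp.diff(x, t, 2) + x - sp.tan(t)))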

Let us show how one can use the method of variation of parameters for the equation (2.71) directly, without using the formula (2.70). We start with writing up the general solution to the homogeneous ODE x00 + x =0,whichis

x(t) = C_1 cos t + C_2 sin t,    (2.73)

where C_1 and C_2 are constants. Next, we look for the solution of (2.71) in the form

x (t)=C1 (t)cost + C2 (t)sint, (2.74) which is obtained from (2.73) by replacing the constants by functions. To obtain the equations for the unknown functions C1 (t) ,C2 (t), differentiate (2.74):

x0 (t)= C1 (t)sint + C2 (t)cost (2.75) ¡ +C10 (t)cost + C20 (t)sint.

The first equation for C1,C2 comes from the requirement that the second line here (that is, the sum of the terms with C10 and C20 )mustvanish,thatis,

C10 cos t + C20 sin t =0. (2.76)

The motivation for this choice is as follows. Switching to the normal system, one must have the identity x (t)=C1 (t) x1 (t)+C2x2 (t) , which componentwise is

x (t)=C1 (t)cost + C2 (t)sint

x0 (t)=C1 (t)(cost)0 + C2 (t)(sint)0 .

Differentiating the first line and subtracting the second line, we obtain (2.76).

45The intergal tan x sin tdt is taken as follows:

R sin2 t 1 cos2 t dt tan x sin tdt = dt = ¡ dt = sin t: cos t cos t cos t ¡ Z Z Z Z Next, we have dt d sin t d sin t 1 1 sin t = = = ln ¡ : cos t cos2 t 1 sin2 t 2 1+sint Z Z Z ¡

68 It follows from (2.75) and (2.76) that

x00 = C cos t C sin t ¡ 1 ¡ 2 C0 sin t + C0 cos t, ¡ 1 2 whence x00 + x = C0 sin t + C0 cos t ¡ 1 2 (note that the terms with C1 and C2 cancel out and that this will always be the case provided all computations are done correctly). Hence, the second equation for C10 and C20 is C0 sin t + C0 cos t = f (t) . ¡ 1 2 Solving the system of linear algebraic equations

C10 cos t + C20 sin t =0 C0 sin t + C0 cos t = f (t) ½ ¡ 1 2 we obtain C0 = f (t)sint, C0 = f (t)cost 1 ¡ 2 whence C = f (t)sintdt, C = f (t)costdt. 1 ¡ 2 Z Z Substituting into (2.74), we obtain (2.72).

2.8 Wronskian and the Liouville formula

Let I be an open interval in R. n Definition. Given a sequence of n vector functions x1, ..., xn : I R ,define their Wronskian W (t) as a real valued function on I by !

W(t) = det ( x_1(t) | x_2(t) | ... | x_n(t) ),

where the matrix on the right hand side is formed by the column-vectors x_1, ..., x_n. Hence, W(t) is the determinant of an n × n matrix.

Definition. Let x_1, ..., x_n be n real-valued functions on I which are n − 1 times differentiable on I. Then their Wronskian is defined by

W(t) = det [[x_1, x_2, ..., x_n], [x_1', x_2', ..., x_n'], ..., [x_1^{(n−1)}, x_2^{(n−1)}, ..., x_n^{(n−1)}]].

n Lemma 2.12 (a) Let x1, ..., xn beasequenceofR -valued functions that solve a linear system x0 = A (t) x,andletW (t) be their Wronskian. Then either W (t) 0 for all t I and the functions x , ..., x are linearly dependent or W (t) =0for all t ´I and the 2 1 n 6 2 functions x1, ..., xn are linearly independent.

69 (b) Let x1, ..., xn be a sequence of real-valued functions that solve a linear scalar ODE

(n) (n 1) x + a1 (t) x − + ... + an (t) x =0, and let W (t) be their Wronskian. Then either W (t) 0 for all t I and the functions ´ 2 x1,...,xn are linearly dependent or W (t) =0for all t I and the functions x1,...,xn are linearly independent. 6 2

Proof. (a) Indeed, if the functions x1, ..., xn are linearly independent then, by Lemma 2.10, the vectors x1 (t) , ..., xn (t) are linearly independent for any value of t,whichim- plies W (t) = 0. If the functions x ,...,x are linearly dependent then also the vectors 6 1 n x1 (t) , ..., xn (t) are linearly dependent for any t,whenceW (t) 0. (b)Define the vector function ´

xk x x = k0 k 0 ... 1 (n 1) B x − C B k C @ A so that x1, ..., xk is the sequence of vector functions that solve a vector ODE x0 = A (t) x. The Wronskian of x1, ..., xn is obviously the same as the Wronskian of x1, ..., xn,andthe sequence x1, ..., xn is linearly independent if and only so is x1, ..., xn. Hence, the rest follows from part (a).

n Theorem 2.13 (The Liouville formula) Let xi i=1 be a sequence of n solutions of the n n f g ODE x0 = A (t) x, where A : I R × is continuous. Then the Wronskian W (t) of this sequence satisfies the identity !

t W (t)=W (t0)exp A (τ) dτ , (2.77) µZt0 ¶ for all t, t I. 0 2 Recall that the trace46 trace A of the matrix A is the sum of all the entries of the matrix. Corollary. Consider a scalar ODE

(n) (n 1) x + a1 (t) x − + ... + an (t) x =0, where ak (t) are continuous functions on an interval I R.Ifx1 (t) , ..., xn (t) are n solutions to this equation then their Wronskian W (t) satis½fies the identity

t W (t)=W (t )exp a (τ) dτ . (2.78) 0 ¡ 1 µ Zt0 ¶
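As an illustration of (2.78), the following sketch (assuming sympy) uses the ODE x'' + 2x' + x = 0, so that a_1 = 2 and the solutions are e^{−t} and t e^{−t}.

    import sympy as sp

    t = sp.symbols('t')
    x1, x2 = sp.exp(-t), t*sp.exp(-t)
    W = sp.Matrix([[x1, x2], [sp.diff(x1, t), sp.diff(x2, t)]]).det()
    print(sp.simplify(W))                                  # exp(-2*t)
    # (2.78) with t0 = 0: W(t) = W(0) * exp(-integral of a_1) = exp(-2*t)
    print(sp.simplify(W - W.subs(t, 0)*sp.exp(-2*t)))      # 0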

46die Spur

70 Lecture 12 25.05.2009 Prof. A. Grigorian, ODE, SS 2009 Proof of Theorem 2.13. Let the entries of the matrix (x x ... x )bex where i 1j 2j j n ij is the row index and j is the column index; in particular, the components of the vector xj are x1j,x2j,...,xnj.Denotebyri the i-th row of this matrix, that is, ri =(xi1,xi2,...,xin); then r1 r W =det 2 0 ... 1 B rn C B C We use the following formula for differentiation@ ofA the determinant, which follows from the full expansion of the determinant and the product rule:

r10 r1 r1 r r r W (t)=det 2 +det 20 + ... +det 2 . (2.79) 0 0 ... 1 0 ... 1 0 ... 1 B rn C B rn C B r0 C B C B C B n C @ A @ A @ A Indeed, if f1 (t) ,...,fn (t) are real-valued differentiable functions then the product rule implies by induction

(f1...fn)0 = f10 f2...fn + f1f20 ...fn + ... + f1f2...fn0 .

Hence, when differentiating the full expansion of the determinant, each term of the de- terminant gives rise to n terms where one of the multiples is replaced by its derivative. Combining properly all such terms, we obtain that the derivative of the determinant is the sum of n where one of the rows is replaced by its derivative, that is, (2.79).

The fact that each vector xj satisfies the equation xj0 = Axj can be written in the coordinate form as follows n

xij0 = Aikxkj. (2.80) k=1 X For any fixed i, the sequence x n is nothing other than the components of the row f ijgj=1 ri.Sincethecoefficients Aik do not depend on j, (2.80) implies the same identity for the rows: n

ri0 = Aikrk. k=1 X That is, the derivative ri0 of the i-th row is a linear combination of all rows rk. For example,

r10 = A11r1 + A12r2 + ... + A1nrn which implies that

r10 r1 r2 rn r r r r det 2 = A det 2 + A det 2 + ... + A det 2 . 0 ... 1 11 0 ... 1 12 0 ... 1 1n 0 ... 1 B rn C B rn C B rn C B rn C B C B C B C B C @ A @ A @ A @ A

71 All the determinants except for the 1st one vanish since they have equal rows. Hence,

r10 r1 r r det 2 = A det 2 = A W (t) . 0 ... 1 11 0 ... 1 11 B rn C B rn C B C B C @ A @ A Evaluating similarly the other terms in (2.79), we obtain

W 0 (t)=(A11 + A22 + ... + Ann) W (t) = (trace A) W (t) .

By Lemma 2.12, W (t)iseitheridentical0orneverzero.Inthefirst case there is nothing to prove. In the second case, we can solve the above ODE using the method of separation of variables. Indeed, dividing it W (t)andintegratingint,weobtain

W (t) t ln = trace A (τ) dτ W (t ) 0 Zt0

(note that W (t)andW (t0) have the same sign so that the argument of ln is positive), whence (2.77) follows. ProofofCorollary. The scalar ODE is equivalent to the normal system x0 = Ax where 01 0... 0 x 00 1... 0 x A = 0 ...... 1 and x = 0 . 0 ... 1 B 00 0... 1 C (n 1) B C B x − C B an an 1 an 2 ... a1 C B C B ¡ ¡ − ¡ − ¡ C @ A Since the Wronskian@ of the normal system coincidesA with W (t), (2.78) follows from (2.77) because trace A = a1. InthecaseoftheODEofthe2ndorder¡

x00 + a1 (t) x0 + a2 (t) x =0 the Liouville formula can help in finding the general solution if a particular solution is known. Indeed, if x0 (t) is a particular non-zero solution and x (t)isanyothersolution then we have by (2.78)

x0 x det = C exp a1 (t) dt , x0 x0 ¡ µ 0 ¶ µ Z ¶ that is

x x0 xx0 = C exp a (t) dt . 0 ¡ 0 ¡ 1 µ Z ¶ Using the identity x x0 xx0 x 0 0 ¡ 0 = x2 x 0 µ 0 ¶ we obtain the ODE x 0 C exp a (t) dt = ¡ 1 , (2.81) x x2 µ 0 ¶ ¡ R0 ¢ 72 and by integrating it we obtain x and, hence, x. x0 Example. Consider the ODE

2 x00 2 1+tan t x =0. ¡

One solution can be guessed x0 (t)=tan¡ t using the¢ fact that d 1 tan t = =tan2 t +1 dt cos2 t and d2 tan t =2tant tan2 t +1 . dt2 Hence, for x (t) we obtain from (2.81) ¡ ¢ x C 0 = tan t tan2 t ³ ´ whence47 dt x = C tan t = C tan t ( t cot t + C ) . tan2 t ¡ ¡ 1 Z The answer can also be written in the form

x (t)=c1 (t tan t +1)+c2 tan t where c = C and c = CC . 1 ¡ 2 1 2.9 Linear homogeneous systems with constant coefficients

Consider a normal homogeneous system x' = Ax, where A ∈ C^{n×n} is a constant n × n matrix with complex entries and x(t) is a function from R to C^n. As we know, the general solution of this system can be obtained as a linear combination of n linearly independent solutions. Here we will be concerned with the construction of such solutions.

2.9.1 A simple case We start with a simple observation. Let us try to find a solution in the form x = eλtv n where v is a non-zero vector in C that does not depend on t. Then the equation x0 = Ax becomes λeλtv = eλtAv that is, Av = λv. Recall that any non-zero vector v,thatsatisfies the identity Av = λv for some constant λ, is called an eigenvector of A,andλ is called the eigenvalue48. Hence,

47 dt 2 To evaluate the integral tan2 t = cot tdt use the identity

2 R R (cot t)0 = cot t 1 ¡ ¡ that yields cot2 tdt = t cot t + C: ¡ ¡ Z 48der Eigenwert

73 λt the function x (t)=e v is a non-trivial solution to x0 = Ax provided v is an eigenvector of A and λ is the corresponding eigenvalue. It is easy to see that λ is an eigenvalue if and only if the matrix A λ id is not invertible, that is, when ¡ det (A λ id) = 0, (2.82) ¡ where id is the n n . This equation is called the characteristic equation of the matrix A and£ can be used to determine the eigenvalues. Then the eigenvector is determined from the equation (A λ id) v =0. (2.83) ¡ Note that the eigenvector is not unique; for example, if v is an eigenvector then cv is also an eigenvector for any constant c =0. The function 6 P (λ):=det(A λ id) ¡ is a polynomial of λ of order n, that is called the characteristic polynomial of the matrix A. Hence, the eigenvalues of A are the roots of the characteristic polynomial P (λ).

Lemma 2.14 If a n n matrix A has n linearly independent eigenvectors v , ..., v with £ 1 n the (complex) eigenvalues λ1, ..., λn then the following functions

λ1t λ2t λnt e v1,e v2, ..., e vn (2.84) are n linearly independent solutions of the system x0 = Ax. Consequently, the general solution of this system is n λkt x (t)= Cke vk, k=1 X where C1,...,Ck are arbitrary complex constants.. If A is a real matrix and λ is a non-real eigenvalue of A with an eigenvector v then λ is an eigenvalue with eigenvector v,andthetermseλtv, eλtv in (2.84) can be replaced by the couple Re eλtv , Im eλtv .

λkt Proof. As¡ we¢ have¡ seen¢ already, each function e vk is a solution. Since vectors n n λkt vk k=1 are linearly independent, the functions e vk k=1 are linearly independent, whencef g the first claim follows from Theorem 2.3. If Av = λv then applying the complex conjugation© andª using the fact the entries of A are real, we obtain Av = λv so that λ is an eigenvalue with eigenvector v.Sincethe functions eλtv and eλtv are solutions, their linear combinations

eλtv + eλtv eλtv eλtv Re eλtv = and Im eλtv = ¡ 2 2i are also solutions. Since eλtv and eλtv canalsobeexpressedviathesesolutions:

eλtv =Reeλtv + i Im eλtv eλtv =Reeλtv i Im eλtv, ¡ replacing in (2.84) the terms eλt,eλt bythecoupleRe eλtv ,Im eλtv results in again n linearly independent solutions. ¡ ¢ ¡ ¢ 74 It is known from that if A has n distinct eigenvalues then their eigen- vectors are automatically linearly independent,andLemma2.14applies.Anothercase when the hypotheses of Lemma 2.14 are satisfied is if A is a symmetric real matrix; in this case there is always a basis of the eigenvectors. Example. Consider the system x = y 0 . y0 = x ½ 01 The vector form of this system is x = Ax where x = x and A = .The y 10 µ ¶ characteristic polynomial is ¡ ¢ λ 1 P (λ)=det ¡ = λ2 1, 1 λ ¡ µ ¡ ¶ 2 the characteristic equation is λ 1 = 0, whence the eigenvalues are λ1 =1,λ2 = 1. ¡ a ¡ For λ = λ1 = 1 we obtain the equation (2.83) for v = b : 11 a ¡ ¢ ¡ =0, 1 1 b µ ¡ ¶µ ¶ which gives only one independent equation a b =0.Choosinga =1,weobtainb =1 whence ¡ 1 v = . 1 1 µ ¶ Similarly, for λ = λ = 1 we have the equation for v = a 2 ¡ b 11 a ¡ ¢ =0, 11 b µ ¶µ ¶ which amounts to a + b = 0. Hence, the eigenvector for λ = 1is 2 ¡ 1 v = . 2 1 µ¡ ¶ Since the vectors v1 and v2 are independent, we obtain the general solution in the form 1 1 C et + C e t x (t)=C et + C e t = 1 2 − , 1 1 2 − 1 C et C e t µ ¶ µ¡ ¶ µ 1 ¡ 2 − ¶ t t t t that is, x (t)=C e + C e− and y (t)=C e C e− . 1 2 1 ¡ 2 Example. Consider the system x0 = y ¡ . y0 = x ½ 0 1 The matrix of the system is A = , and the the characteristic polynomial is 10¡ µ ¶ λ 1 P (λ)=det = λ2 +1. ¡1 ¡λ µ ¡ ¶ 75 2 Hence, the characteristic equation is λ +1 = 0 whence λ1 = i and λ2 = i.Forλ = λ1 = i a ¡ we obtain the equation for the eigenvector v = b i 1 ¡a¢ ¡ ¡ =0, 1 i b µ ¡ ¶µ ¶ which amounts to the single equation ia+b =0.Choosinga = i,weobtainb =1, whence

i v = 1 1 µ ¶ and the corresponding solution of the ODE is

i sin t + i cos t x (t)=eit = ¡ . 1 1 cos t + i sin t µ ¶ µ ¶ Since this solution is complex, we obtain the general solution using the second claim of Lemma 2.14: sin t cos t C sin t + C cos t x (t)=C Re x + C Im x = C ¡ + C = ¡ 1 2 . 1 1 2 1 1 cos t 2 sin t C cos t + C sin t µ ¶ µ ¶ µ 1 2 ¶
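Eigenvalues and eigenvectors of such small matrices can of course also be found numerically; a sketch (assuming numpy) for the matrix of this example:

    import numpy as np

    A = np.array([[0.0, -1.0], [1.0, 0.0]])
    eigvals, eigvecs = np.linalg.eig(A)
    print(eigvals)        # the eigenvalues i and -i (order may vary)
    print(eigvecs[:, 0])  # an eigenvector for the first eigenvalue, a multiple of (i, 1) or (-i, 1)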

Example. Consider a normal system

x0 = y y0 =0. ½

This system is trivially solved to obtain y = C1 and x = C1t + C2.However,ifwe try to solve it using the above method, we fail. Indeed, the matrix of the system is 01 A = , the characteristic polynomial is 00 µ ¶ λ 1 P (λ)=det = λ2, ¡0 λ µ ¡ ¶ and the characteristic equation P (λ) = 0 yields only one eigenvalue λ = 0. The eigen- a vector v = b satisfies the equation ¡ ¢ 01 a =0, 00 b µ ¶µ ¶ 1 whence b = 0. That is, the only eigenvector (up to a constant multiple) is v = 0 , and theonlysolutionweobtaininthiswayisx (t)= 1 . The problem lies in the properties 0 ¡ ¢ of this matrix: it does not have a basis of eigenvectors, which is needed for this method. ¡ ¢ In order to handle such cases, we use a different approach.

76 Lecture 13 26.05.2009 Prof. A. Grigorian, ODE, SS 2009

2.9.2 Functions of operators and matrices

At Recall that an scalar ODE x0 = Ax has a solution x (t)=Ce t. Now if A is a linear operator in Rn then we may be able to use this formula if we define what is eAt.Inthis section, we define the exponential function eA for any linear operator A in Cn.Denoteby (Cn) the space of all linear operators from Cn to Cn. L Fix a norm in Cn and recall that the operator norm of an operator A (Cn)is defined by 2L Ax A =supk k. (2.85) k k x Cn 0 x ∈ \{ } k k The operator norm is a norm in the linear space (Cn). Note that in addition to the L linear operations A + B and cA, that are defined for all A, B (Cn)andc C,the 2L 2 space (Cn) enjoys the operation of the product of operators AB that is the composition of A andL B: (AB) x := A (Bx) and that clearly belongs to (Cn) as well. The operator norm satisfies the following multiplicative inequality L AB A B . (2.86) k k·k kk k Indeed, it follows from (2.85) that Ax A x whence k k·k kk k (AB) x = A (Bx) A Bx A B x , k k k k·k kk k·k kk kk k which yields (2.86). Definition. If A (Cn)thendefine eA (Cn)bymeansoftheidentity 2L 2L A2 Ak ∞ Ak eA =id+A + + ... + + ... = , (2.87) 2! k! k! k=0 X where id is the identity operator. Of course, in order to justify this definition, we need to verify the convergence of the series (2.87).

Lemma 2.15 The exponential series (2.87) converges for any A (Cn) . 2L Proof. It suffices to show that the series converges absolutely, that is,

∞ Ak < . k! 1 k=0 ° ° X ° ° ° ° It follows from (2.86) that Ak A k°whence° ·k k ° ° k k °∞ °A ∞ A A k k = ek k < , k! · k! 1 k=0 ° ° k=0 X ° ° X ° ° and the claim follows. ° °

77 n tA Theorem 2.16 For any A (C ) the function F (t)=e satisfies the ODE F 0 = AF . 2L tA n Consequently, the general solution of the ODE x0 = Ax is given by x = e v where v C is an arbitrary vector. 2

Here x = x (t) is as usually a Cn-valued function on R, while F (t)isan (Cn)- 2 L valued function on R.Since (Cn) is linearly isomorphic to Cn , we can also say that n2 L F (t)isaC -valuedfunctiononR, which allows to understand the ODE F 0 = AF in the same sense as general vectors ODE. The novelty here is that we regard A (Cn) 2L as an operator in (Cn) (that is, an element of ( (Cn))) by means of the operator multiplication. L L L Proof. We have by definition

∞ tkAk F (t)=etA = . k! k=0 X Consider the series of the derivatives:

∞ d tkAk ∞ tk 1Ak ∞ tk 1Ak 1 G (t):= = − = A − − = AF. dt k! (k 1)! (k 1)! k=0 k=1 k=1 X µ ¶ X ¡ X ¡ It is easy to see (in the same way as Lemma 2.15) that this series converges locally uniformly in t, which implies that F is differentiable in t and F 0 = G. It follows that F 0 = AF. For function x (t)=etAv,wehave

tA tA x0 = e 0 v = Ae v = Ax so that x (t)solvestheODEx0 = Ax¡ for¢ any¡v. ¢ If x (t)isanysolutiontox0 = Ax then set v = x (0) and observe that the function etAv satisfies the same ODE and the initial condition

$$\left. e^{tA} v \right|_{t=0} = \mathrm{id}\, v = v.$$
Hence, both $x(t)$ and $e^{tA}v$ solve the same initial value problem, whence the identity $x(t) = e^{tA}v$ follows by the uniqueness part of Theorem 2.1.

Remark. If $v_1, \dots, v_n$ are linearly independent vectors in $\mathbb{C}^n$ then the solutions $e^{tA}v_1, \dots, e^{tA}v_n$ are also linearly independent. If $v_1, \dots, v_n$ is the canonical basis in $\mathbb{C}^n$, then $e^{tA}v_k$ is the $k$-th column of the matrix $e^{tA}$. Hence, the columns of the matrix $e^{tA}$ form $n$ linearly independent solutions, that is, $e^{tA}$ is the fundamental matrix of the system $x' = Ax$.

Example. Let $A$ be the diagonal matrix
$$A = \mathrm{diag}(\lambda_1, \dots, \lambda_n).$$
Then
$$A^k = \mathrm{diag}\left(\lambda_1^k, \dots, \lambda_n^k\right) \quad\text{and}\quad e^{tA} = \mathrm{diag}\left(e^{\lambda_1 t}, \dots, e^{\lambda_n t}\right).$$
Let now
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$
Then $A^2 = 0$ and all higher powers of $A$ are also $0$, and we obtain
$$e^{tA} = \mathrm{id} + tA = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}.$$
Hence, the general solution to $x' = Ax$ is
$$x(t) = e^{tA}v = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}\begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = \begin{pmatrix} C_1 + C_2 t \\ C_2 \end{pmatrix},$$
where $C_1, C_2$ are the components of $v$.
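As a quick cross-check of the last example, one can compare the closed form $e^{tA} = \mathrm{id} + tA$ with a numerically computed matrix exponential. This is a sketch, not part of the notes; SciPy is assumed and the value of $t$ is arbitrary.

```python
# Sketch: for the nilpotent matrix A above, e^{tA} = [[1, t], [0, 1]].
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])    # nilpotent: A @ A == 0
t = 1.7                                    # arbitrary test value
print(np.allclose(expm(t * A), np.array([[1.0, t], [0.0, 1.0]])))   # True
```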

Definition. Operators A, B (Cn)aresaidto commute if AB = BA. 2L In general, the operators do not have to commute. If A and B commute then various nice formulas take places, for example,

(A + B)2 = A2 +2AB + B2. (2.88)

Indeed, in general we have

(A + B)2 =(A + B)(A + B)=A2 + AB + BA + B2, which yields (2.88) if AB = BA.

Lemma 2.17 If A and B commute then

eA+B = eAeB.

Proof. Let us prove a sequence of claims. Claim 1. If A, B, C commutepairwisethensodoAC and B. Indeed, we have

(AC) B = A (CB)=A (BC)=(AB) C =(BA) C = B (AC) .

Claim 2. If A and B commutethensodoeA and B. It follows from Claim 1 that Ak and B commute for any natural k, whence

∞ Ak ∞ Ak eAB = B = B = BeA. k! k! Ãk=0 ! Ãk=0 ! X X

Claim 3. If A (t) and B (t) are differentiable functions from R to (Cn) then L

(A (t) B (t))0 = A0 (t) B (t)+A (t) B0 (t) . (2.89)

79 Indeed,wehaveforanycomponent

0 (AB)ij0 = AikBkj = Aik0 Bkj + AikBkj0 =(A0B)ij +(AB0)ij =(A0B + AB0)ij , Ã k ! k k X X X whence (2.89) follows. Now we can finish the proof of the lemma. Consider the function F : R (Cn) defined by !L F (t)=etAetB. Differentiating it using Theorem 2.16, Claims 2 and 3, we obtain

tA tB tA tB tA tB tA tB tA tB tA tB F 0 (t)= e 0 e +e e 0 = Ae e +e Be = Ae e +Be e =(A + B) F (t) .

On the other¡ ¢ hand, by¡ Theorem¢ 2.16, the function G (t)=et(A+B) satisfies the same equation G0 =(A + B) G. Since G (0) = F (0) = id, we obtain that the vector functions F (t)andG (t)solvethe same IVP, whence by the uniqueness theorem they are identically equal. In particular, F (1) = G (1), which means eAeB = eA+B. Alternative proof. Letusbrie°ydiscussadirectalgebraicproofofeA+B = eAeB.One¯rstproves the binomial formula n n n k n k (A + B) = A B ¡ k k=0 X µ ¶ using the fact that A and B commute(thiscanbedonebyinductioninthesamewayasfornumbers). Then we have n n 1 (A + B) 1 AkBn k eA+B = = ¡ : n! k!(n k)! n=0 n=0 k=0 X X X ¡ On the other hand, using the Cauchy product formula for absolutely convergent series of operators, we have n 1 Am 1 Bl 1 AkBn k eAeB = = ¡ ; m! l! k!(n k)! m=0 l=0 n=0 k=0 X X X X ¡ whence the identity eA+B = eAeB follows.
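A numerical illustration of Lemma 2.17 (a sketch under the assumption that NumPy and SciPy are available; the matrices are arbitrary choices): for commuting matrices the identity $e^{A+B} = e^A e^B$ holds to machine precision, while for a non-commuting pair it generally fails.

```python
# Sketch: e^{A+B} = e^A e^B when AB = BA (here B is a polynomial in A, so the
# two commute), and the identity generally fails for non-commuting matrices.
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = 2 * A @ A - A                                      # commutes with A
print(np.allclose(expm(A + B), expm(A) @ expm(B)))     # True

C = np.array([[0.0, 1.0], [0.0, 0.0]])
D = np.array([[0.0, 0.0], [1.0, 0.0]])                 # CD != DC
print(np.allclose(expm(C + D), expm(C) @ expm(D)))     # False
```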

2.9.3 Jordan cells

Here we show how to compute $e^{A}$ provided $A$ is a Jordan cell.

Definition. An $n \times n$ matrix $J$ is called a Jordan cell if it has the form
$$J = \begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \lambda & 1 \\
0 & \cdots & \cdots & 0 & \lambda
\end{pmatrix}, \qquad (2.90)$$
where $\lambda$ is any complex number. That is, all the entries on the main diagonal of $J$ are $\lambda$, all the entries on the second diagonal are $1$, and all other entries of $J$ are $0$.

Let us use Lemma 2.17 to evaluate $e^{tJ}$. Clearly, we have $J = \lambda\,\mathrm{id} + N$, where
$$N = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
\vdots & & & \ddots & 1 \\
0 & \cdots & \cdots & \cdots & 0
\end{pmatrix}. \qquad (2.91)$$
The matrix (2.91) is called a nilpotent Jordan cell. Since the matrices $\lambda\,\mathrm{id}$ and $N$ commute (because $\mathrm{id}$ commutes with any matrix), Lemma 2.17 yields
$$e^{tJ} = e^{t\lambda\,\mathrm{id}}\, e^{tN} = e^{t\lambda}\, e^{tN}. \qquad (2.92)$$
Hence, we need to evaluate $e^{tN}$, and for that we first evaluate the powers $N^2, N^3$, etc. Observe that the components of the matrix $N$ are as follows:
$$N_{ij} = \begin{cases} 1, & \text{if } j = i + 1, \\ 0, & \text{otherwise,} \end{cases}$$
where $i$ is the row index and $j$ is the column index. It follows that
$$\left(N^2\right)_{ij} = \sum_{k=1}^{n} N_{ik} N_{kj} = \begin{cases} 1, & \text{if } j = i + 2, \\ 0, & \text{otherwise,} \end{cases}$$
that is,
$$N^2 = \begin{pmatrix}
0 & 0 & 1 & & 0 \\
& \ddots & \ddots & \ddots & \\
& & \ddots & \ddots & 1 \\
& & & \ddots & 0 \\
0 & & & & 0
\end{pmatrix},$$
where the entries with value $1$ are located on the 3rd diagonal, that is, two positions above the main diagonal, and all other entries are $0$. Similarly, for any $k$, the entries of $N^k$ with value $1$ are located on the $(k+1)$-st diagonal, that is, $k$ positions above the main diagonal, provided $k \le n - 1$, and all other entries are $0$; for $k \ge n$ we have $N^k = 0$.

Lemma 2.18 If $J$ is a Jordan cell (2.90) then, for any $t \in \mathbb{R}$,
$$e^{tJ} = \begin{pmatrix}
e^{\lambda t} & \frac{t}{1!}e^{\lambda t} & \frac{t^2}{2!}e^{\lambda t} & \cdots & \frac{t^{n-1}}{(n-1)!}e^{\lambda t} \\
& e^{\lambda t} & \frac{t}{1!}e^{\lambda t} & \ddots & \vdots \\
& & \ddots & \ddots & \frac{t^2}{2!}e^{\lambda t} \\
& & & \ddots & \frac{t}{1!}e^{\lambda t} \\
0 & & & & e^{\lambda t}
\end{pmatrix}. \qquad (2.94)$$

Taking the columns of (2.94), we obtain $n$ linearly independent solutions of the system $x' = Jx$:
$$x_1(t) = \begin{pmatrix} e^{\lambda t} \\ 0 \\ \vdots \\ 0 \end{pmatrix},\quad
x_2(t) = \begin{pmatrix} \frac{t}{1!}e^{\lambda t} \\ e^{\lambda t} \\ 0 \\ \vdots \\ 0 \end{pmatrix},\quad
x_3(t) = \begin{pmatrix} \frac{t^2}{2!}e^{\lambda t} \\ \frac{t}{1!}e^{\lambda t} \\ e^{\lambda t} \\ \vdots \\ 0 \end{pmatrix},\ \dots,\
x_n(t) = \begin{pmatrix} \frac{t^{n-1}}{(n-1)!}e^{\lambda t} \\ \vdots \\ \frac{t^2}{2!}e^{\lambda t} \\ \frac{t}{1!}e^{\lambda t} \\ e^{\lambda t} \end{pmatrix}.$$

49Any matrix A with the property that Ak = 0 for some natural k is called nilpotent.Hence,N is a , which explains the term \a nilpotent Jordan cell".
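The formula (2.94) is easy to check numerically. Below is a small sketch (not from the notes; NumPy and SciPy are assumed): it assembles $e^{tJ}$ by placing $\frac{t^k}{k!}e^{\lambda t}$ on the $k$-th diagonal above the main one and compares the result with a direct matrix exponential; the size, $\lambda$ and $t$ are arbitrary.

```python
# Sketch: build e^{tJ} for a Jordan cell J = lambda*id + N from formula (2.94)
# and compare with the matrix exponential computed by SciPy.
import numpy as np
from math import factorial
from scipy.linalg import expm

def jordan_cell(lam, n):
    return lam * np.eye(n) + np.diag(np.ones(n - 1), k=1)

def exp_tJ(lam, n, t):
    E = np.zeros((n, n))
    for k in range(n):                                  # k-th superdiagonal
        E += (t**k / factorial(k)) * np.diag(np.ones(n - k), k=k)
    return np.exp(lam * t) * E

lam, n, t = 2.0, 4, 0.8
print(np.allclose(exp_tJ(lam, n, t), expm(t * jordan_cell(lam, n))))   # True
```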

82 Lecture 14 02.06.2009 Prof. A. Grigorian, ODE, SS 2009

2.9.4 The

Definition. If $A$ is an $m \times m$ matrix and $B$ is an $l \times l$ matrix then their tensor product is an $n \times n$ matrix $C$, where $n = m + l$ and
$$C = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.$$

That is, matrix C consists of two blocks@A and B locatedA on the main diagonal, and all other terms are 0. Notation for the tensor product: C = A B. − Lemma 2.19 The following identity is true:

A B A B e ⊗ = e e . (2.95) − In extended notation, (2.95) means that

eA 0 eC = . 0 0 eB 1 @ A Proof. Observe firstthatifA1,A2 are m m matrices and B1,B2 are l l matrices then £ £ (A B )(A B )=(A A ) (B B ) . (2.96) 1 − 1 2 − 2 1 2 − 1 2 Indeed, in the extended form this identity means

A1 0 A2 0 A1A2 0 = 0 0 B1 1 0 0 B2 1 0 0 B1B2 1 which follows easily@ from the ruleA @ of multiplicationA of@ matrices. Hence,A the tensor product commutes with the . It is also obvious that the tensor product commutes with addition of matrices and taking limits. Therefore, we obtain

k k k k k A B ∞ (A B) ∞ A B ∞ A ∞ B A B e ⊗ = − = − = = e e . k! k! k! − k! − k=0 k=0 Ãk=0 ! Ãk=0 ! X X X X

Definition. The tensor product of a finite number of Jordan cells is called a Jordan normal form. That is, if a Jordan normal form is a matrix as follows:

J1 J2 0 0 . 1 J1 J2 Jk = .. , − − ¢¢¢− B C B 0 Jk 1 C B − C B Jk C B C @ A 83 where Jj are Jordan cells. Lemmas 2.18 and 2.19 allow to evaluate etA if A is a Jordan normal form.

Example. Solve the system x0 = Ax where

$$A = \begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 2 & 1 \\
0 & 0 & 0 & 2
\end{pmatrix}.$$
Clearly, the matrix $A$ is the tensor product of two Jordan cells:
$$J_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad J_2 = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.$$
By Lemma 2.18, we obtain
$$e^{tJ_1} = \begin{pmatrix} e^t & te^t \\ 0 & e^t \end{pmatrix} \quad\text{and}\quad e^{tJ_2} = \begin{pmatrix} e^{2t} & te^{2t} \\ 0 & e^{2t} \end{pmatrix},$$
whence by Lemma 2.19,
$$e^{tA} = \begin{pmatrix}
e^t & te^t & 0 & 0 \\
0 & e^t & 0 & 0 \\
0 & 0 & e^{2t} & te^{2t} \\
0 & 0 & 0 & e^{2t}
\end{pmatrix}.$$
The columns of this matrix form 4 linearly independent solutions, and the general solution is their linear combination:
$$x(t) = \begin{pmatrix} C_1 e^t + C_2 t e^t \\ C_2 e^t \\ C_3 e^{2t} + C_4 t e^{2t} \\ C_4 e^{2t} \end{pmatrix}.$$
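A sketch (not in the notes; SciPy assumed) confirming the block computation in this example: in line with Lemma 2.19, the exponential of the block-diagonal matrix $A$ is the block-diagonal matrix of the exponentials $e^{tJ_1}$ and $e^{tJ_2}$.

```python
# Sketch for the 4x4 example: e^{tA} is block diagonal with blocks e^{tJ1},
# e^{tJ2}; we check this against the direct exponential. t is arbitrary.
import numpy as np
from scipy.linalg import expm, block_diag

J1 = np.array([[1.0, 1.0], [0.0, 1.0]])
J2 = np.array([[2.0, 1.0], [0.0, 2.0]])
A = block_diag(J1, J2)
t = 0.5
print(np.allclose(expm(t * A), block_diag(expm(t * J1), expm(t * J2))))  # True
```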

2.9.5 Transformation of an operator to a Jordan normal form

n n b Given a basis b = b1,b2, ..., bn in C and a vector x C ,denotebyx the column f g b 2 b vector that represents x in this basis. That is, if xj is the j-th component of x then

n b b b b x = x1b1 + x2b2 + ... + xnbn = xjbj. j=1 X Similarly, if A is a linear operator in Cn then denote by Ab the matrix that represents A in the basis b. It is determined by the identity

(Ax)b = Abxb, which should be true for all x Cn, where in the right hand side we have the product of the n n matrix Ab and the column-vector2 xb. £

84 b b Clearly, (bj) =(0, ...1,...0) where 1 is at position j, which implies that (Abj) = b b b A (bj) is the j-th column of A . In other words, we have the identity Ab = (Ab )b (Ab )b (Ab )b , 1 j 2 j¢¢¢j n that can be stated as the following³ rule: ´ b the j-th column of A is the column vector Abj writteninthebasisb1,...,bn.

2 Example. Consider the operator A in C that is given in the canonical basis e = e1,e2 by the matrix f g 01 Ae = . 10 µ ¶ Consider another basis b = b ,b defined by f 1 2g 1 1 b = e e = and b = e + e = . 1 1 ¡ 2 1 2 1 2 1 µ ¡ ¶ µ ¶ Then 01 1 1 (Ab )e = = 1 10 1 ¡1 µ ¶µ ¡ ¶ µ ¶ and 01 1 1 (Ab )e = = . 2 10 1 1 µ ¶µ ¶ µ ¶ It follows that Ab = b and Ab = b whence 1 ¡ 1 2 2 10 Ab = . ¡01 µ ¶ The following theorem is proved in Linear Algebra courses. Theorem. For any operator A (Cn) there is a basis b in Cn such that the matrix Ab is a Jordan normal form. 2L The basis b is called the Jordan basis of A, and the matrix Ab is called the Jordan normal form of A. Let J be a Jordan cell of Ab with λ on the diagonal and suppose that the rows (and columns) of J in Ab are indexed by j, j +1,...,j+ p 1sothatJ is a p p matrix. Then ¡ £ the sequence of vectors bj, ..., bj+p 1 is referred to as the Jordan chain of the given Jordan cell. In particular, the basis b is the− disjoint union of the Jordan chains. Since j j+p 1 ··· ··· − ↓ ↓ ... . 0 .. 1 01 0 B ¢¢¢ C j B .. .. . C ← b B 0 . . . C A λ id = B . C ¡ B . .. .. C ··· B . . . 1 C B 0 00 C ··· B C j+p 1 B ¢¢¢ C ← − B ... C B C B . C B .. C B C @ A 85 b and the k-thcolumnofA λ id is the vector (A λ id) bk (written in the basis b), we conclude that ¡ ¡

(A λ id) b =0 ¡ j (A λ id) bj+1 = bj ¡ (A λ id) b = b ¡ j+2 j+1 ¢¢¢ (A λ id) bj+p 1 = bj+p 2. ¡ − −

In particular, bj is an eigenvector of A with the eigenvalue λ. The vectors bj+1, ..., bj+p 1 − are called the generalized eigenvectors of A (more precisely, bj+1 is the 1st , bj+2 is the second generalized eigenvector, etc.). Hence, any Jordan chain contains exactly one eigenvector and the rest vectors are the generalized eigenvectors.

Theorem 2.20 Consider the system x0 = Ax with a constant linear operator A and let Ab be the Jordan normal form of A. Then each Jordan cell J of Ab of dimension p with λ on the diagonal gives rise to p linearly independent solutions as follows:

λt x1 (t)=e v1 t x (t)=eλt v + v 2 1! 1 2 µ ¶ t2 t x (t)=eλt v + v + v 3 2! 1 1! 2 3 ... µ ¶ p 1 λt t − t xp (t)=e v1 + ... + vp 1 + vp , (p 1)! 1! − µ ¡ ¶ where v1, ..., vp is the Jordan chain of J.Thesetofalln solutions obtained across all Jordancellsislinearlyindependent.f g

Proof. In the basis b,wehavebyLemmas2.18and2.19

... 0 ... 1 B C B C B p 1 C B eλt t etλ t − etλ C B 1! (p 1)! C B ¢¢¢ − C B tλ .. . C tAb B 0 e . . C e = B C , B . C B ...... t tλ C B . 1! e C B C B tλ C B 0 0 e C B ¢¢¢ C B C B . C B .. C B C B C B .. C B . C B C @ A 86 where the block in the middle is etJ . By Theorem 2.16, the columns of this matrix give n linearly independent solutions to the ODE x0 = Ax. Out of these solutions, select p solutions that correspond to p columns of the cell etJ ,thatis,

...... p 1 eλt t eλt t − eλt 0 1 0 1! 1 0 (p 1)! 1 0 eλt −... t2 λt x1 (t)=B ... C ,x2 = B 0 C ,...,xp = B C , B C B C B 2! e C B ... C B ... C B t λt C B C B C B 1! e C B C B C B λt C B 0 C B 0 C B e C B C B C B C B ... C B ... C B ... C B C B C B C @ A @ A @ A where all the vectors are written in the basis b, the boxed rows correspond to the cell J, and all the terms outside boxes are zeros. Representing these vectors in the coordinateless form via the Jordan chain v1, ..., vp, we obtain the solutions as in the statement of Theorem 2.20. Let λ be an eigenvalue of an operator A.Denotebym the algebraic multiplicity of λ, that is, its multiplicity as a root of characteristic polynomial50 P (λ)=det(A λ id). Denote by g the geometric multiplicity of λ, that is the dimension of the eigenspace¡ of λ:

g =dimker(A λ id) . ¡ In other words, g is the maximal number of linearly independent eigenvectors of λ.The numbers m and g can be characterized in terms of the Jordan normal form Ab of A as follows: m is the total number of occurrences of λ on the diagonal51 of Ab,whereasg is equal to the number of the Jordan cells with λ on the diagonal52. It follows that g m, and the equality occurs if and only if all the Jordan cells with the eigenvalue λ ·have dimension 1. Consider some examples of application of Theorem 2.20. Example. Solve the system 21 x = x. 0 14 µ ¡ ¶ The characteristic polynomial is

2 λ 1 2 P (λ)=det(A λ id) = det ¡ = λ2 6λ +9=(λ 3) , ¡ 14λ ¡ ¡ µ ¡ ¡ ¶ and the only eigenvalue is λ1 = 3 with the algebraic multiplicity m1 =2.Theequation for eigenvector v is (A λ id) v =0 ¡ 50 To compute P (¸), one needs to write the operator A in some basis b as a matrix Ab and then evaluate det (Ab ¸ id). The characteristic polynomial does not depend on the choice of basis b. Indeed, ¡ 1 if b0 is another basis then the relation between the matrices Ab and Ab0 is given by Ab = CAb0 C¡ 1 where C is the matrix of transformation of basis. It follows that Ab ¸ id = C (Ab0 ¸ id) C¡ whence 1 ¡ ¡ det (Ab ¸ id) = det C det (Ab0 ¸ id) det C¡ =det(Ab0 ¸ id) : 51 ¡ ¡ ¡ If ¸ occurs k times on the diagonal of Ab then ¸ is a root of multiplicity k of the characteristic polynomial of Ab that coincides with that of A.Hence,k = m. 52Note that each Jordan cell correponds to exactly one eigenvector.

that is, setting $v = (a, b)$,
$$\begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = 0,$$
which is equivalent to $-a + b = 0$. Choosing $a = 1$ and $b = 1$, we obtain the unique (up to a constant multiple) eigenvector
$$v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

Hence, the geometric multiplicity is $g_1 = 1$. Therefore, there is only one Jordan cell with the eigenvalue $\lambda_1$, which allows us to immediately determine the Jordan normal form of the given matrix:
$$\begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}.$$
By Theorem 2.20, we obtain the solutions
$$x_1(t) = e^{3t} v_1, \qquad x_2(t) = e^{3t}\left(t v_1 + v_2\right),$$
where $v_2$ is the 1st generalized eigenvector that can be determined from the equation
$$(A - \lambda\,\mathrm{id})\, v_2 = v_1.$$

Setting $v_2 = (a, b)$, we obtain the equation
$$\begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
which is equivalent to $-a + b = 1$. Hence, setting $a = 0$ and $b = 1$, we obtain
$$v_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
whence
$$x_2(t) = e^{3t}\begin{pmatrix} t \\ t + 1 \end{pmatrix}.$$

Finally, the general solution is
$$x(t) = C_1 x_1 + C_2 x_2 = e^{3t}\begin{pmatrix} C_1 + C_2 t \\ C_1 + C_2 (t+1) \end{pmatrix}.$$
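As a sanity check of this example (a sketch, not part of the notes; NumPy assumed), one can verify numerically that $x_2(t) = e^{3t}(t,\ t+1)$ satisfies $x' = Ax$ by comparing $Ax_2(t)$ with a central-difference approximation of $x_2'(t)$; the test point $t$ and the step $h$ are arbitrary.

```python
# Sketch: check that x2(t) = e^{3t} (t, t+1) solves x' = A x.
import numpy as np

A = np.array([[2.0, 1.0], [-1.0, 4.0]])
x2 = lambda t: np.exp(3 * t) * np.array([t, t + 1.0])

t, h = 0.3, 1e-6
numeric_derivative = (x2(t + h) - x2(t - h)) / (2 * h)
print(np.allclose(numeric_derivative, A @ x2(t), atol=1e-4))   # True
```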

Example. Solve the system

211 x0 = 20 1 x. 0 ¡212¡ 1 @ A

88 The characteristic polynomial is

2 λ 11 P (λ)=det(A λ id) = det ¡2 λ 1 ¡ 0 ¡212¡ ¡ λ 1 ¡ = λ3 +4λ2 5λ +2=(2@ λ)(λ 1)2 . A ¡ ¡ ¡ ¡

The roots are λ1 =2withm1 =1andλ2 =1withm2 =2.Theeigenvectorsv for λ1 are determined from the equation (A λ id) v =0, ¡ 1 whence, for v =(a, b, c) 011 a 2 2 1 b =0, 0 ¡210¡ ¡ 1 0 c 1 that is, @ A @ A b + c =0 2a 2b c =0 8 ¡ ¡ ¡ < 2a + b =0. The second equation is a linear combination of the first and the last ones. Setting a =1 : we find b = 2andc = 2 so that the unique (up to a constant multiple) eigenvector is ¡ 1 v = 2 , 0 ¡2 1 @ A which gives the first solution 1 2t x1 (t)=e 2 . 0 ¡2 1 @ A The eigenvectors for λ2 = 1 satisfy the equation

(A λ id) v =0, ¡ 2 whence, for v =(a, b, c), 111 a 2 1 1 b =0, 0 ¡211¡ ¡ 1 0 c 1 whence @ A @ A a + b + c =0 2a b c =0 8 ¡ ¡ ¡ < 2a + b + c =0. Solving the system, we obtain a unique (up to a constant multiple) solution a =0,b =1, : c = 1. Hence, we obtain only one eigenvector ¡ 0 v1 = 1 . 0 1 1 ¡ @ A 89 Therefore, g2 = 1, that is, there is only one Jordan cell with the eigenvalue λ2,which implies that the Jordan normal form of the given matrix is as follows: 200 011 . 0 0011 @ A By Theorem 2.20, the cell with λ2 = 1 gives rise to two more solutions 0 t t x2 (t)=e v1 = e 1 0 1 1 ¡ @ A and t x3 (t)=e (tv1 + v2) , where v2 is the first generalized eigenvector to be determined from the equation

(A λ2 id) v2 = v1. ¡ Setting v2 =(a, b, c)weobtain 111 a 0 2 1 1 b = 1 , 0 ¡211¡ ¡ 1 0 c 1 0 1 1 ¡ @ A @ A @ A that is a + b + c =0 2a b c =1 8 2¡a +¡b +¡c = 1. < ¡ This system has a solution a = 1, b =0andc = 1. Hence, ¡ : 1 ¡ v2 = 0 , 0 1 1 @ A andthethirdsolutionis 1 t t ¡ x3 (t)=e (tv1 + v2)=e t . 0 1 t 1 ¡ @ A Finally, the general solution is

2t t C1e C3e ¡2t t x (t)=C1x1 + C2x2 + C3x3 = 2C1e +(C2 + C3t) e . 0 2¡C e2t +(C C C t) et 1 1 3 ¡ 2 ¡ 3 @ A Corollary. Let λ C be an eigenvalue of an operator A with the algebraic multiplicity m and the geometric2 multiplicity g.Thenλ gives rise to m linearly independent solutions of the system x0 = Ax that can be found in the form

λt s 1 x (t)=e u1 + u2t + ... + ust − (2.97) ¡ ¢ 90 where s = m g +1and u are vectors that can be determined by substituting the above ¡ j function to the equation x0 = Ax. The set of all n solutions obtained in this way using all the eigenvalues of A is linearly independent.

Remark. For practical use, one should substitute (2.97) into the system x0 = Ax con- sidering uij as unknowns (where uij is the i-th component of the vector uj) and solve the resulting linear algebraic system with respect to uij. The result will contain m arbitrary constants, and the solution in the form (2.97)willappearasalinearcombinationofm independent solutions.
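The procedure described in this remark can be carried out mechanically with a computer algebra system. Below is a hedged sketch (not from the notes; SymPy assumed) for the $2 \times 2$ example above with $\lambda = 3$, $m = 2$, $g = 1$, hence $s = 2$: substituting $x(t) = e^{3t}(u_1 + u_2 t)$ with unknown vectors $u_1, u_2$ and equating the coefficients of the powers of $t$ leaves exactly $m = 2$ free parameters.

```python
# Sketch of the remark's procedure: substitute the ansatz (2.97) into x' = Ax
# and solve the resulting linear algebraic system for the coefficients.
import sympy as sp

t = sp.symbols('t')
u11, u21, u12, u22 = sp.symbols('u11 u21 u12 u22')
u1 = sp.Matrix([u11, u21])            # coefficient of e^{3t}
u2 = sp.Matrix([u12, u22])            # coefficient of t e^{3t}
A = sp.Matrix([[2, 1], [-1, 4]])

x = sp.exp(3 * t) * (u1 + t * u2)
residual = sp.diff(x, t) - A * x       # must vanish identically in t

equations = []
for entry in residual:
    # strip the common factor e^{3t}; every coefficient in t must be zero
    poly = sp.Poly(sp.simplify(entry / sp.exp(3 * t)), t)
    equations.extend(poly.all_coeffs())

print(sp.solve(equations, [u11, u21, u12, u22]))
# {u21: u11 + u12, u22: u12}: u11 and u12 remain free, i.e. m = 2 parameters
```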

Proof. Let p1, .., pg be the dimensions of all the Jordan cells with the eigenvalue λ (as we know, the number of such cells is g). Then λ occurs p1 + ... + pg times on the diagonal of the Jordan normal form, which implies

g

pj = m. j=1 X Hence, the total number of linearly independent solutions that are given by Theorem 2.20 for the eigenvalue λ is equal to m. Let us show that each of the solutions of Theorem 2.20 has the form (2.97). Indeed, each solution of Theorem 2.20 is already in the form

eλt times a polynomial of t of degree p 1. · j ¡ To ensure that these solutions can be represented in the form (2.97), we only need to verify that p 1 s 1. Indeed, we have j ¡ · ¡ g g (p 1) = p g = m g = s 1, j ¡ j ¡ ¡ ¡ j=1 Ã j=1 ! X X whence the inequality pj 1 s 1 follows. ¡ · ¡

91 Lecture 15 08.06.2009 Prof. A. Grigorian, ODE, SS 2009

3 The initial value problem for general ODEs

Definition. Given a function f : Ω Rn,whereΩ is an open set in Rn+1, consider the IVP ! x = f (t, x) , 0 (3.1) x (t0)=x0, ½ n where (t0,x0) is a given point in Ω.Afunctionx (t):I R is called a solution (3.1) ! if the domain I is an open interval in R containing t0, x (t)isdifferentiable in t in I, (t, x (t)) Ω for all t I,andx (t)satisfies the identity x0 (t)=f (t, x (t)) for all t I 2 2 2 and the initial condition x (t0)=x0. The graph of function x (t), that is, the set of points (t, x (t)), is hence a curve in Ω that goes through the point (t0,x0). It is also called the integral curve of the ODE x0 = f (t, x). Our aim here is to prove the unique solvability of (3.1) under reasonable assumptions about the function f (t, x), and to investigate the dependence of the solution on the initial condition and on other possible parameters.

3.1 Existence and uniqueness

Let Ω be an open subset of Rn+1 and f = f (t, x) be a mapping from Ω to Rn.We start with a description of the class of functions f (t, x),whichwillbeusedinallmain theorems. Definition. Function f (t, x)iscalledLipschitz in x in Ω if there is a constant L such that for all (t, x) , (t, y) Ω 2 f (t, x) f (t, y) L x y . (3.2) k ¡ k· k ¡ k The constant L is called the Lipschitz constant of f in Ω. In the view of the equivalence of any two norms in Rn, the property to be Lipschitz does not depend on the choice of the norm (but the value of the Lipschitz constant L does). AsubsetG of Rn+1 will be called a cylinder if it has the form G = I B where I is £ an interval in R and B is a ball (open or closed) in Rn. The cylinder is closed if both I and B are closed, and open if both I and B are open.
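A small numerical illustration of the definition (3.2) (a sketch, not part of the notes; NumPy assumed): for $f(t, x) = t + \sin x$ the constant $L = 1$ works, since $|\sin x - \sin y| \le |x - y|$, and sampled difference quotients indeed stay below this bound.

```python
# Sketch: sample difference quotients of f(t,x) = t + sin(x) in x and compare
# them with the Lipschitz constant L = 1.
import numpy as np

f = lambda t, x: t + np.sin(x)          # Lipschitz in x with constant L = 1
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, 10_000)
x = rng.uniform(-5.0, 5.0, 10_000)
y = rng.uniform(-5.0, 5.0, 10_000)
quotients = np.abs(f(t, x) - f(t, y)) / np.abs(x - y)
print(quotients.max())                  # close to 1 and essentially never above it
```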

Definition. Function f (t, x)iscalledlocally Lipschitz in x in Ω if, for any (t0,x0) Ω, there exist constants ε, δ > 0 such that the cylinder 2 G =[t δ,t + δ] B (x ,ε) 0 ¡ 0 £ 0 is contained in Ω and f is Lipschitz in x in G. Note that the Lipschitz constant may be different for different cylinders G. Example. The function f (t, x)= x is Lipschitz in x in Rn because by the triangle inequality k k x y x y . jk k¡k kj · k ¡ k 92 The Lipschitz constant is equal to 1. Note that the function f (t, x)isnotdifferentiable at x =0. Many examples of locally Lipschitz functions can be obtained by using the following lemma.

Lemma 3.1 (a) If all components fk of f are differentiable functions in a cylinder G and all the partial derivatives ∂fk are bounded in G then the function f (t, x) is Lipschitz ∂xi in x in G. (b) If all partial derivatives ∂fk exists and are continuous in Ω then f (t, x) is locally ∂xj Lipschitz in x in Ω.

Proof. Let us use the following mean value property of functions in Rn:ifg is a differentiable real valued function in a ball B Rn then, for all x, y B there is ξ [x, y] such that ½ 2 2 n ∂g g (y) g (x)= (ξ)(y x ) (3.3) ¡ ∂x j ¡ j j=1 j X (note that the interval [x, y] is contained in the ball B so that ∂g (ξ) makes sense). Indeed, ∂xj consider the function

h (t)=g (x + t (y x)) where t [0, 1] . ¡ 2 The function h (t)isdifferentiable on [0, 1] and, by the mean value theorem in R,thereis τ (0, 1) such that 2 g (y) g (x)=h (1) h (0) = h0 (τ) . ¡ ¡ Noticing that by the chain rule

n ∂g h (τ)= (x + τ (y x)) (y x ) 0 ∂x j j j=1 j ¡ ¡ X and setting ξ = x + τ (y x), we obtain (3.3). ¡ (a)LetG = I B where I is an interval in R and B is a ball in Rn.If(t, x) , (t, y) G then t I and x, y£ B. Applying the above mean value property for the k-th component2 2 2 fk of f, we obtain that

n ∂f f (t, x) f (t, y)= k (t, ξ)(x y ) , (3.4) k ¡ k ∂x j ¡ j j=1 j X where ξ is a point in the interval [x, y] B.Set ½ ∂f C =maxsup k k,j ∂x G ¯ j ¯ ¯ ¯ and note that by the hypothesis C< . Hence,¯ by (3.4)¯ 1 ¯ ¯ n f (t, x) f (t, y) C x y = C x y . j k ¡ k j· j j ¡ jj k ¡ k1 j=1 X 93 Taking max in k,weobtain

f (t, x) f (t, y) C x y 1. k ¡ k∞ · k ¡ k Switching in the both sides to the given norm and using the equivalence of all norms, we obtain that f is Lipschitz in x in G. k¢k (b)Givenapoint(t ,x ) Ω, choose positive ε and δ so that the cylinder 0 0 2 G =[t δ,t + δ] B (x ,ε) 0 ¡ 0 £ 0 is contained in Ω, which is possible by the openness of Ω.Sincethecomponentsfk are continuously differentiable, they are differentiable. Since G is a closed bounded set and the partial derivatives ∂fk are continuous, they are bounded on G.Bypart(a)we ∂xj conclude that f is Lipschitz in x in G,whichfinishes the proof. Now we can state the main existence and uniqueness theorem.

Theorem 3.2 (Picard - Lindel¨of Theorem) Consider the equation

x0 = f (t, x) where f : Ω Rn is a mapping from an open set Ω Rn+1 to Rn.Assumethatf is ! ½ continuous on Ω and is locally Lipschitz in x. Then, for any point (t0,x0) Ω, the initial value problem (3.1) has a solution. 2 Furthermore, if x (t) and y (t) are two solutions of (3.1) then x (t)=y (t) in their common domain.

Remark. By Lemma 3.1, the hypothesis of Theorem 3.2 that f is locally Lipschitz in x could be replaced by a simpler hypotheses that all partial derivatives ∂fk exist and are ∂xj continuous in Ω. However, as we have seen above, there are examples of functions that are Lipschitz but not differentiable, and Theorem 3.2 applies to such functions as well. If we completely drop the Lipschitz condition and assume only that f is continuous in (t, x) then the existence of a solution is still the case (Peano’s theorem) while the uniqueness fails in general as will be seen in the next example.

Example. Consider the equation $x' = \sqrt{|x|}$, which was already considered in Section 1.1. The function $x(t) \equiv 0$ is a solution, and the following two functions
$$x(t) = \frac{t^2}{4},\quad t > 0, \qquad\text{and}\qquad x(t) = -\frac{t^2}{4},\quad t < 0,$$
are also solutions (this can be trivially verified by substituting them into the ODE). Gluing together these two functions and extending the resulting function to $t = 0$ by setting $x(0) = 0$, we obtain a new solution defined for all real $t$ (see the diagram below). Hence, there are at least two solutions that satisfy the initial condition $x(0) = 0$.

[Figure: solutions of $x' = \sqrt{|x|}$: the trivial solution $x \equiv 0$ and the glued parabolas $x(t) = \pm t^2/4$ through the origin.]
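A quick check of this example (a sketch, not from the notes; NumPy assumed): both the trivial solution and the glued parabola satisfy $x' = \sqrt{|x|}$, up to the error of the finite-difference derivative.

```python
# Sketch: verify the two solutions of x' = sqrt(|x|) with x(0) = 0.
import numpy as np

t = np.linspace(-4.0, 4.0, 801)
glued = np.where(t > 0, t**2 / 4, -t**2 / 4)   # parabolas glued at the origin
for x in (np.zeros_like(t), glued):
    dx = np.gradient(x, t)                     # numerical derivative
    print(np.max(np.abs(dx - np.sqrt(np.abs(x)))))   # both small (discretization error only)
```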

The uniqueness breaks down because the function x is not Lipschitz near 0. j j The proof of Theorem 3.2 uses essentially the followingp abstract result. Banach fixpointtheorem.Let (X, d) beacompletemetricspace53 and f : X X be a contraction mapping54,thatis, !

d (f (x) ,f(y)) qd (x, y) · for some q (0, 1) and all x, y X. Then f has exactly one fixed point, that is, a point x X such2 that f (x)=x. 2 2 Proof. Choose an arbitrary point x X and define a sequence xn n∞=0 by induction using 2 f g x = x and x = f (x )foranyn 0. 0 n+1 n ¸ Our purpose will be to show that the sequence xn converges and that the limit is a fixed point of f. We start with the observation thatf g

d (xn+1,xn)=d (f (xn) ,f(xn 1)) qd(xn,xn 1) . − · − It follows by induction that

d (x ,x ) qnd (x ,x )=Cqn (3.5) n+1 n · 1 0 where C = d (x1,x0) . Letusshowthatthesequence xn is Cauchy. Indeed, for any m>n, we obtain, using the triangle inequality and (3.5),f g

d (xm,xn) d (xn,xn+1)+d (xn+1,xn+2)+... + d (xm 1,xm) − · n n+1 m 1 C q + q + ... + q − · Cqn ¡ . ¢ · 1 q ¡ 53VollstÄandiger metrische Raum 54Kontraktionsabbildung

95 Therefore, d (xm,xn) 0asn, m , that is, the sequence xn is Cauchy. By the definition of a complete! ,!1 any Cauchy sequence converges.f g Hence, we conclude that xn converges, and let a = limn xn.Then f g →∞ d (f (x ) ,f(a)) qd (x ,a) 0asn , n · n ! !1 so that f (xn) f (a). On the other hand, f (xn)=xn+1 a as n , whence it follows that f (!a)=a,thatis,a is a fixed point. ! !1 Let us prove the uniqueness of the fixed point. If b is another fixed point then

d (a, b)=d (f (a) ,f(b)) qd (a, b) , · which is only possible if d (a, b) = 0 and, hence, a = b. Proof of Theorem 3.2. We start with the following claim. Claim. Afunctionx (t) solves IVP if and only if x (t) is a continuous function on an open interval I such that t I, (t, x (t)) Ω for all t I,and 0 2 2 2 t x (t)=x0 + f (s, x (s)) ds. (3.6) Zt0 Here the integral of the vector valued function is understood component-wise. If x solves IVP then (3.6) follows from xk0 = fk (t, x (t)) just by integration:

t t xk0 (s) ds = fk (s, x (s)) ds Zt0 Zt0 whence t x (t) (x ) = f (s, x (s)) ds k ¡ 0 k k Zt0 and (3.6) follows. Conversely, if x is a continuous function that satisfies (3.6) then

t

xk =(x0)k + fk (s, x (s)) ds. Zt0

The right hand side here is differentiable in t whence it follows that xk (t)isdifferentiable.

It is trivial that xk (t0)=(x0)k,andafterdifferentiation we obtain xk0 = fk (t, x)and, hence, x0 = f (t, x). Fix a point (t0,x0) Ω and prove the existence of a solution of (3.1). Let ε, δ be the parameter from the the2 local Lipschitz condition at this point, that is, there is a constant L such that f (t, x) f (t, y) L x y k ¡ k· k ¡ k for all t [t0 δ, t0 + δ]andx, y B (x0,ε). Choose some r (0,δ] to be specified later on, and2 set ¡ 2 2 I =[t r, t + r]andJ = B (x ,ε) . 0 ¡ 0 0 Denote by X the family of all continuous functions x (t):I J, that is, ! X = x : I J : x is continuous . f ! g

96 The following diagram illustrates the setting in the case n =1:

[Figure: the cylinder $I \times J = [t_0 - r, t_0 + r] \times [x_0 - \varepsilon, x_0 + \varepsilon]$ inside the domain $\Omega$.]

Consider the integral operator A defined on functions x (t)by

t Ax (t)=x0 + f (s, x (s)) ds. Zt0 We would like to ensure that x X implies Ax X.Notethat,foranyx X,the point (s, x (s)) belongs to Ω so that2 the above integral2 makes sense and the function2 Ax is defined on I. This function is obviously continuous. We are left to verify that the image of Ax is contained in J. Indeed, the latter condition means that

Ax (t) x ε for all t I. (3.7) k ¡ 0k· 2 We have, for any t I, 2 t Ax (t) x = f (s, x (s)) ds k ¡ 0k °Zt0 ° ° t ° ° f (s, x (s)) ds° · ° k k ° Zt0 sup f (s, x) t t0 Mr, · s I,x J k kj ¡ j· ∈ ∈ where M =sup f (s, x) < . s [t0 δ,t0+δ] k k 1 ∈ − x B(x0,ε). ∈ Hence, if r is so small that Mr ε then (3.7) is satisfied and, hence, Ax X. To summarize the above argument,· we have defined a function family X2and a mapping A : X X. By the above Claim, a function x X solves the IVP if x = Ax,thatis,if ! 2 97 x is a fixed point of the mapping A.Theexistenceofafixed point of A will be proved using the Banach fixed point theorem, but in order to be able to apply this theorem, we must introduce a distance function55 d on X so that (X, d) is a complete metric space and A is a contraction mapping with respect to this distance. Define the distance function on the function family X as follows: for all x, y X,set 2 d (x, y)=sup x (t) y (t) . t I k ¡ k ∈ Then (X, d) is known to be a complete metric space. We are left to ensure that the mapping A : X X is a contraction. For any two functions x, y X and any t I, t t ,wehavex!(t) ,y(t) J whence by the Lipschitz condition 2 2 ¸ 0 2 t t Ax (t) Ay (t) = f (s, x (s)) ds f (s, y (s)) ds k ¡ k ¡ °Zt0 Zt0 ° ° t ° ° ° ° f (s, x (s)) f (s, y (s)) ds ° · t0 k ¡ k Z t L x (s) y (s) ds · k ¡ k Zt0 L (t t0)sup x (s) y (s) · ¡ s I k ¡ k ∈ Lrd (x, y) . · Thesameinequalityholdsfort t . Taking sup in t I,weobtain · 0 2 d (Ax, Ay) Lrd (x, y) . · Hence, choosing r<1/L, we obtain that A is a contraction. By the Banach fixed point theorem, we conclude that the equation Ax = x has a solution x X, which hence solves the IVP. 2 Assume that x (t)andy (t) are two solutions of the same IVP both definedonanopen interval U R and prove that they coincide on U.Wefirst prove that the two solution coincide in½ some interval I =[t r, t + r]wherer>0istobedefined. Let ε and δ 0 ¡ 0 be the parameters from the Lipschitz condition at the point (t0,x0)asabove.Choose r>0 to satisfy all the above restrictions r δ, r ε/M, r < 1 , and in addition to be so · · L small that the both functions x (t)andy (t) restricted to I =[t0 r, t0 + r] take values ¡ in J = B (x0,ε) (which is possible because x (t0)=y (t0)=x0 and both x (t)andy (t) are continuous functions). Then the both functions x and y belong to the space X and are fixed points of the mapping A. By the uniqueness part of the Banach fixed point theorem, we conclude that x = y as elements of X,thatis,x (t)=y (t) for all t I. 2

55Abstand

98 Lecture 16 09.06.2009 Prof. A. Grigorian, ODE, SS 2009 Alternatively, the same can be proved by using the Gronwall inequality (Lemma 2.2). Indeed, the both solutions satisfy the integral identity

t x (t)=x0 + f (s; x (s)) ds Zt0 for all t I. Hence, for the di®erence z (t):= x (t) y (t) ,wehave 2 k ¡ k t z (t)= x (t) y (t) f (s; x (s)) f (s; y (s)) ds; k ¡ k· k ¡ k Zt0 assuming for certainty that t0 t t0 + r. Since the both points (s; x (s)) and (s; y (s)) in the given range of s are contained in I ·J, we· obtain by the Lipschitz condition £ f (s; x (s)) f (s; y (s)) L x (s) y (s) k ¡ k· k ¡ k whence t z (t) L z (s) ds: · Zt0 Appling the Gronwall inequality with C =0weobtainz (t) 0. Since z 0, we conclude that z (t) 0 · ¸ ´ for all t [t ;t + r]. In the same way, one gets that z (t) 0fort [t r; t ], which proves that the 2 0 0 ´ 2 0 ¡ 0 solutions x (t)andy (t) coincide on the interval I. Now we prove that they coincide on the full interval U. Consider the set

S = t U : x (t)=y (t) f 2 g and let us show that the set S is both closed and open in I. The closedness is obvious: if x (tk)=y (tk) for a sequence tk and tk t U as k then passing to the limit and using the continuity of thef solutions,g we! obtain2 x (t)=!1y (t), that is, t S. 2 Let us prove that the set S is open. Fix some t1 S.Sincex (t1)=y (t1)=:x1, 2 the both functions x (t)andy (t) solve the same IVP with the initial data (t1,x1). By the above argument, x (t)=y (t)insomeintervalI =[t1 r, t1 + r]withr>0. Hence, I S, which implies that S is open. ¡ ½ Now we use the following fact from Analysis: If S is a subset of an interval U R that is both open and closed in U then either S is empty or S = U.Sincetheset½S is non-empty (it contains t0) and is both open and closed in U,weconcludethatS = U, which finishes the proof of uniqueness. For the sake of completeness, let us give also the proof of the fact that S is either empty or S = U. Set Sc = U S so that Sc is closed in U.AssumethatbothS and Sc are non-empty and choose some n c a0+b0 c points a0 S, b0 S .Setc = 2 so that c U and, hence, c belongs to S or S .Outoftheintervals 2 2 2 c [a0;c], [c; b0] choose the one whose endpoints belong to di®erent sets S; S and rename it by [a1;b1], say c a1+b1 a1 S and b1 S . Considering the point c = 2 , we repeat the same argument and construct an 2 2 c interval [a2;b2]beingoneoftwohalfsof[a1;b1] such that a2 S and b2 S . Continuing further on, 2 2 c we obtain a nested sequence [a ;b ] 1 of intervals such that a S, b S and b a 0. By f k k gk=0 k 2 k 2 j k ¡ kj! the principle of nested intervals56, there is a common point x [a ;b ]forallk.Notethatx U.Since 2 k k 2 a x,wemusthavex S, and since b x,wemusthavex Sc, because both sets S and Sc are k ! 2 k ! 2 closed in U. This contradiction ¯nishes the proof. Example. Themethodoftheproofoftheexistencein Theorem 3.2 suggests the following procedure of computation of the solution of IVP. We start with any function x X (using 0 2 56Intervallschachtelungsprinzip

99 the same notation as in the proof) and construct the sequence x ∞ of functions in f ngn=0 X using the rule xn+1 = Axn. The sequence xn is called the Picard iterations, and it converges uniformly to the solution x (t). f g Let us illustrate this method on the following example:

$$\begin{cases} x' = x, \\ x(0) = 1. \end{cases}$$
The operator $A$ is given by
$$Ax\,(t) = 1 + \int_0^t x(s)\,ds,$$
whence, setting $x_0(t) \equiv 1$, we obtain
$$x_1(t) = 1 + \int_0^t x_0\,ds = 1 + t,$$
$$x_2(t) = 1 + \int_0^t x_1\,ds = 1 + t + \frac{t^2}{2},$$
$$x_3(t) = 1 + \int_0^t x_2\,ds = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!},$$
and by induction
$$x_n(t) = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \dots + \frac{t^n}{n!}.$$
Clearly, $x_n \to e^t$ as $n \to \infty$, and the function $x(t) = e^t$ indeed solves the above IVP.

Remark. Let us summarize the proof of the existence part of Theorem 3.2 as follows. For any point $(t_0, x_0) \in \Omega$, we first choose positive constants $\varepsilon, \delta, L$ from the Lipschitz condition, so that the cylinder

G =[t δ,t + δ] B (x ,ε) 0 ¡ 0 £ 0 is contained in Ω and, for any two points (t, x)and(t, y)fromG with the same value of t, f (t, x) f (t, y) L x y . k ¡ k· k ¡ k Let M =sup f (t, x) (t,x) G k k ∈ and choose any positive r to satisfy ε 1 r δ, r ,r< . (3.8) · · M L Then there exists a solution x (t)totheIVP,whichisdefinedontheinterval[t r, t + r] 0 ¡ 0 and takes values in B (x0,ε). The fact that the domain of the solution admits the explicit estimates (3.8) can be used as follows.

100 Corollary. Under the conditions of Theorem 3.2 for any point (t ,x ) Ω there are 0 0 2 positive constants ε and r such that, for any t1 [t0 r/2,t0 + r/2] and x1 B (x0,ε/2), the IVP 2 ¡ 2 x = f (t, x) , 0 (3.9) x (t1)=x1, ½ has a solution x (t) which is defined for all t [t r/2,t + r/2] and takes values in 2 0 ¡ 0 B (x0,ε) (see the diagram below for the case n =1).

[Figure: the cylinder $G$ in $\Omega$, with the new initial point $(t_1, x_1)$ lying in the smaller cylinder $[t_0 - r/2, t_0 + r/2] \times \overline{B}(x_0, \varepsilon/2)$.]

The essence of this statement is that despite the variation of the initial conditions (t ,x ), the solution of (3.9) is definedonthesameinterval[t r/2,t + r/2] and takes 1 1 0 ¡ 0 values in the same ball B (x0,ε)provided(t1,x1) is close enough to (t0,x0). Proof. Let ε, δ, L, M be as in the proof of Theorem 3.2 above. Assuming that t [t δ/2,t + δ/2] and x B (x ,ε/2), we obtain that the cylinder 1 2 0 ¡ 0 1 2 0 G =[t δ/2,t + δ/2] B (x ,ε/2) 1 1 ¡ 1 £ 1 is contained in G. Hence, the values of L and M for the cylinder G1 can be taken the same as those for G. Therefore, choosing r>0 to satisfy the conditions ε 1 r δ/2,r ,r< , · · 2M L for example, δ ε 1 r =min , , , 2 2M 2L µ ¶ we obtain that the IVP (3.9) has a solution x (t)thatisdefinedintheinterval[t r, t + r] 1 ¡ 1 and takes values in B (x1,ε/2) B (x, ε). If t1 [t0 r/2,t0 + r/2] then [t0 r/2,t0 + r/2] ½ 2 ¡ ¡ ½ 101 [t1 r, t1 + r], and we see that the solution x (t)of(3.9)isdefined on [t0 r/2,t0 + r/2] and¡ takes value in B (x, ε), which was to be proved (see the diagram below).¡

[Figure: the cylinder $G_1 = [t_1 - r, t_1 + r] \times \overline{B}(x_1, \varepsilon/2)$ inside $G$; the solution through $(t_1, x_1)$ stays in $\overline{B}(x_0, \varepsilon)$ on $[t_0 - r/2, t_0 + r/2]$.]

3.2 Maximal solutions Consider again the ODE x0 = f (t, x) (3.10) where f : Ω Rn is a mapping from an open set Ω Rn+1 to Rn, which is continuous on Ω and locally! Lipschitz in x. ½ Although the uniqueness part of Theorem 3.2 says that any two solutions of the same initial value problem x = f (t, x) , 0 (3.11) x (t0)=x0 ½ coincide in their common interval, still there are many different solutions to the same IVP because, strictly speaking, the functions that are definedondifferent domains are different. The purpose of what follows is to determine the maximal possible domain of the solution to (3.11). We say that a solution y (t)oftheODEisanextension of a solution x (t)ifthedomain of y (t) contains the domain of x (t) and the solutions coincide in the common domain. Definition. Asolutionx (t) of the ODE is called maximal if it is definedonanopen interval and cannot be extended to any larger open interval.

Theorem 3.3 Under the conditions of Theorem 3.2, the following is true.

102 (a) Theinitialvalueproblem(3.11) has is a unique maximal solution for any initial values (t0,x0) Ω. (b) If x (t)2and y (t) are two maximal solutions of (3.10) and x (t)=y (t) for some value of t, then the solutions x and y are identical, including the identity of their domains. (c) If x (t) is a maximal solution with the domain (a, b) then x (t) leaves any compact set K Ω as t a and as t b. ½ ! ! Herethephrase“x (t) leaves any compact set K as t b” means the follows: there is T (a, b) such that for any t (T,b), the point (t, x (t))! does not belong to K. Similarly, thephrase“2 x (t)leavesanycompactset2 K as t a” means that there is T (a, b)such that for any t (a, T), the point (t, x (t)) does not! belong to K. 2 2 2 2 Example. 1. Consider the ODE x0 = x in the domain Ω = R . Thisisseparable equation and can be solved as follows. Obviously, x 0 is a constant solution. In the domains where x =0wehave ´ 6 x dt 0 = dt x2 Z Z whence 1 dx = = dt = t + C ¡x x2 Z Z 1 and x (t)= t C (where we have replaced C by C). Hence, the family of all solutions ¡ − ¡ 1 consists of a straight line x (t)=0andhyperbolasx (t)= C t with the maximal domains (C, + )and( ,C) (see the diagram below). − 1 ¡1 y 50

[Figure: the family of solutions $x \equiv 0$ and $x(t) = \frac{1}{C - t}$.]

Each of these solutions leaves any compact set K,butindifferent ways: the solutions 1 x (t)=0leavesK as t because K is bounded, while x (t)= C t leaves K as t C because x (t) !§1. − ! !§1
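The blow-up in this example can also be observed numerically. The following sketch (not in the notes; SciPy assumed) integrates $x' = x^2$ with the chosen initial condition $x(0) = 1$, so that the exact solution is $x(t) = 1/(1 - t)$ with maximal domain $(-\infty, 1)$: the integrator cannot be continued past $t \approx 1$, reflecting the fact that the solution leaves every compact set as $t \to 1$.

```python
# Sketch: numerical blow-up of x' = x^2, x(0) = 1 near t = 1.
import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, x: x**2, (0.0, 2.0), [1.0], rtol=1e-8)
# typically terminates with status -1 (step size underflow) just before t = 1
print(sol.status, sol.t[-1], sol.y[0, -1])
```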

2. Consider the ODE $x' = \frac{1}{x}$ in the domain $\Omega = \mathbb{R} \times (0, +\infty)$ (that is, $t \in \mathbb{R}$ and $x > 0$). By separation of variables, we obtain
$$\frac{1}{2}x^2 = \int x\,dx = \int x x'\,dt = \int dt = t + C,$$
whence
$$x(t) = \sqrt{2(t - C)}, \qquad t > C.$$
See the diagram below:

[Figure: the solutions $x(t) = \sqrt{2(t - C)}$, $t > C$.]

Obviously, the maximal domain of the solution is (C, + ). The solution leaves any compact K Ω as t C because (t, x (t)) tends to the point1 (C, 0) at the boundary of Ω. ½ ! The proof of Theorem 3.3 will be preceded by a lemma.

Lemma 3.4 Let xα (t) α A be a family of solutions to the same IVP where A is any f g ∈ index set, and let the domain of xα be an open interval Iα.SetI = α A Iα and define a function x (t) on I as follows: ∈ S x (t)=x (t) if t I . (3.12) α 2 α Then I is an open interval and x (t) isasolutiontothesameIVPonI.

The function x (t)defined by (3.12) is referred to as the union of the family xα (t) . Proof. First of all, let us verify that the identity (3.12) defines x (t) correctly,f thatg is, the right hand side does not depend on the choice of α. Indeed, if also t I then t 2 β belongs to the intersection Iα Iβ and by the uniqueness theorem, xα (t)=xβ (t). Hence, the value of x (t) is independent\ of the choice of the index α. Note that the graph of x (t) is the union of the graphs of all functions xα (t). Set a =infI, b =supI and show that I =(a, b). Let us first verify that (a, b) I, that is, any t (a, b) belongs also to I. Assume for certainty that t t .Sinceb =sup½ I, 2 ¸ 0 there is t1 I such that t

105 Lecture 17 15.06.2009 Prof. A. Grigorian, ODE, SS 2009 Let y (t) be another maximal solution to the IVP (3.11) and let z (t) be the union of the solutions x (t)andy (t). By Lemma 3.4, z (t) solves the same IVP and extends both x (t)andy (t), which implies by the maximality of x and y that z is identical to both x and y. Hence, x and y are identical (including the identity of the domains), which proves the uniqueness of a maximal solution. (b)Letx (t)andy (t) be two maximal solutions of (3.10) that coincide at some t,say t = t1.Setx1 = x (t1)=y (t1). Then both x and y are solutions to the same IVP with the initial point (t1,x1) and, hence, they coincide by part (a). (c)Letx (t) be a maximal solution of (3.10) definedon(a, b)wherea 0 such that the IVP with the initial point inside the cylinder

G =[t r/2,t + r/2] B (x ,ε/2) 0 ¡ 0 £ 0 has a solution defined for all t [t0 r/2,t0 + r/2]. In particular, if k is large enough then (t ,x ) G, which implies2 that¡ the solution y (t) to the following IVP k k 2

y0 = f (t, y) , y (tk)=xk, ½ is defined for all t [t r/2,t + r/2] (see the diagram below). 2 0 ¡ 0 x

x(t)

_ (tk, xk) B(x0,ε/2)

y(t) (t0, x0) K

t [t0-r/2,t0+r/2]

Since x (t) also solves this IVP, the union z (t)ofx (t)andy (t)solvesthesameIVP. Note that x (t)isdefined only for t>t0 while z (t)isdefined also for t [t0 r/2,t0]. 2 ¡ 106 Hence, the solution x (t) can be extended to a larger interval, which contradicts the maximality of x (t). Remark. By definition, a maximal solution x (t)isdefinedonanopeninterval,say (a, b),anditcannotbeextendedtoalargeropeninterval.Onemaywonderifx (t)can be extended at least to the endpoints t = a or t = b. Itturnsoutthatthisisneverthe case (unless the domain Ω of the function f (t, x) can be enlarged). Indeed, if x (t)can be defined as a solution to the ODE also for t = a then (a, x (a)) Ω and, hence, there is 2 ball B in Rn+1 centered at the point (a, x (a)) such that B Ω. By shrinking the radius of B, we can assume that the corresponding closed ball B ½is also contained in Ω.Since x (t) x (a)ast a, we obtain that (t, x (t)) B for all t close enough to a.Therefore, the solution! x (t)! does not leave the compact set2 B Ω as t a, which contradicts part (c) of Theorem 3.3. ½ !

3.3 Continuity of solutions with respect to f (t, x)

Let Ω be an open set in Rn+1 and f,g be two mappings from Ω to Rn. Assume in what follows that both f,g are continuous and locally Lipschitz in x in Ω, and consider two initial value problems x = f (t, x) 0 (3.13) x (t0)=x0 ½ and y = g (t, y) 0 (3.14) y (t0)=x0 ½ where (t0,x0)isafixed point in Ω. Let the function f be fixed and x (t)beafixed solution of (3.13). The function g will be treated as variable. Our purpose is to show that if g is chosen close enough to f then the solution y (t) of (3.14) is close enough to x (t). Apart from the theoretical interest, this result has significant practical consequences. For example, if one knows the function f (t, x) only approximately (which is always the case in applications in Sciences and ) then solving (3.13) approximately means solving another problem (3.14) where g is an approximation to f. Hence, it is important to know that the solution y (t) of (3.14) is actually an approximation of x (t).

Theorem 3.5 Let x (t) be a solution to the IVP (3.13) defined on a closed bounded in- terval [α, β] such that α0, there exists η>0 with the following property: for any function g : Ω Rn as above and such that ! sup f g η, (3.15) Ω k ¡ k· there is a solution y (t) of the IVP (3.14) defined on [α, β], and this solution satisfies the inequality sup x (t) y (t) ε. (3.16) [α,β] k ¡ k·

Proof. We start with the following estimate of the difference x (t) y (t) . k ¡ k Claim 1. Let x (t) and y (t) be solutions of (3.13) and (3.14) respectively, and assume that they both are defined on some interval (a, b) containing t0. Assume also that the

107 graphs of x (t) and y (t) over (a, b) belongtoasubsetK of Ω such that f (t, x) is Lipschitz in K in x with the Lipschitz constant L. Then

L(b a) sup x (t) y (t) e − (b a)sup f g . (3.17) t (a,b) k ¡ k· ¡ K k ¡ k ∈

Let us use the integral identities

t t x (t)=x0 + f (s, x (s)) ds and y (t)=x0 + g (s, y (s)) ds. Zt0 Zt0 Assuming that t t

L(t t0) x (t) y (t) e − (b a)sup f g k ¡ k· ¡ K k ¡ k L(b a) e − (b a)sup f g . (3.19) · ¡ K k ¡ k

In the same way, (3.19) holds for a

The estimate (3.17) implies that if supΩ f g is small enough then also the difference x (t) y (t) is small, which is essentiallyk what¡ k is claimed in Theorem 3.5. However, in orderk to¡ be ablek to use the estimate (3.17) in the setting of Theorem 3.5, we have to deal with the following two issues: to show that the solution y (t)isdefinedontheinterval[α, β]; ¡ to show that the graphs of x (t)andy (t)arecontainedinasetK where f satisfies the¡ Lipschitz condition in x. We proceed with the construction of such a set. For any ε 0, consider the set ¸ n+1 Kε = (t, x) R : α t β, x x (t) ε (3.20) 2 · · k ¡ k· which can be regarded as© the closed ε-neighborhood in Rn+1 of the graphª of the function t x (t)wheret [α, β]. In particular, K0 is the graph of the function x (t)on[α, β] (see7! the diagram below).2

[Figure: the graph $K_0$ of $x(t)$ over $[\alpha, \beta]$ and its closed $\varepsilon$-neighborhood $K_\varepsilon$ inside $\Omega$.]

The set K0 is compact because it is the image of the compact interval [α, β]underthe continuous mapping t (t, x (t)). Hence, K0 is bounded and closed, which implies that 7! n+1 also Kε for any ε>0 is bounded and closed. Thus, Kε is a compact subset of R for any ε 0. ¸ Claim 2. There is ε>0 such that Kε Ω and f is Lipschitz in Kε in x. Indeed, by the local Lipschitz condition,½ for any point (t, x) Ω (in particular, for any (t, x) K ), there are constants ε, δ > 0 such that the cylinder2 2 0 G =[t δ, t + δ] B (x, ε) ¡ £ is contained in Ω and f is Lipschitz in G (see the diagram below).

[Figure: a point of $K_0$ with its cylinder $G = [t - \delta, t + \delta] \times \overline{B}(x, \varepsilon)$ contained in $\Omega$.]

Consider also an open cylinder H =(t δ,t + δ) B x, 1 ε . ¡ £ 2 Varying the point (t, x)inK0,weobtainacoverof¡ K0 ¢by such cylinders. Since K0 is compact, there is a finite subcover, that is, a finite number of points (t ,x) m on K f i i gi=1 0 and numbers εi,δi > 0, such that the cylinders

1 Hi =(ti δi,ti + δi) B xi, εi ¡ £ 2 ¡ ¢ 109 cover all K0.Set G =[t δ ,t + δ ] B (x ,ε) i i ¡ i i i £ i i and let Li be the Lipschitz constant of f in Gi, which exists by the choice of εi,δi. Define ε and L as follows 1 ε = min εi,L=maxLi, (3.21) 2 1 i m 1 i m ≤ ≤ ≤ ≤ and prove that K Ω and that the function f is Lipschitz in K with the constant L. ε ½ ε Foranypoint(t, x) Kε,wehavebythedefinition of Kε that t [α, β], (t, x (t)) K0 and 2 2 2 x x (t) ε. k ¡ k· Then the point (t, x (t)) belongs to one of the cylinders Hi so that t (t δ ,t + δ )and x (t) x < 1 ε 2 i ¡ i i i k ¡ ik 2 i (see the diagram below).

[Figure: the cylinders $H_i \subset G_i$ covering $K_0$; a point $(t, x) \in K_\varepsilon$ with $(t, x(t)) \in H_i$ lies in $G_i$.]

By the triangle inequality, we have x x x x (t) + x (t) x <ε+ ε /2 ε , k ¡ ik·k ¡ k k ¡ ik i · i wherewehaveusedthatby(3.21)ε ε /2. Therefore, x B (x ,ε) whence it follows · i 2 i i that (t, x) Gi and (t, x) Ω. Hence, we have shown that any point from Kε belongs to Ω,whichprovesthat2 K 2Ω. ε ½ If (t, x) , (t, , y) Kε then by the above argument the both points x, y belong to the same ball B (x ,ε)2 that is determined by the condition (t, x (t)) H .Then(t, x) , (t, , y) i i 2 i 2 Gi and, since f is Lipschitz in Gi with the constant Li,weobtain f (t, x) f (t, y) L x y L x y , k ¡ k· i k ¡ k· k ¡ k wherewehaveusedthedefinition (3.21) of L.Thisshowsthatf is Lipschitz in x in Kε and finishes the proof of Claim 2.

110 Let now y (t) be the maximal solution to the IVP (3.14), and let J be its domain; in particular, J is an open interval containing t0.Wearegoingtospotaninterval(a, b) where both functions x (t)andy (t)aredefined and their graphs over (a, b)belongto Kε.Sincey (t0)=x0,thepoint(t0,y(t0)) of the graph of y (t)iscontainedinKε.By Theorem 3.3, the graph of y (t)leavesKε when t goes to the endpoints of J. It follows that the following sets are non-empty:

A :=t J, t < t :(t, y (t)) / K , f 2 0 2 εg B :=t J, t > t0 :(t, y (t)) / Kε . f 2 2 g

On the other hand, in a small neighborhood of t0,thegraphofy (t) is close to the point (t0,x0) and, hence, is in Kε. Therefore, the sets A and B are separated out from t0,the set A being on the left from t0,andthesetB being on the right from t0.Set

a =supA and b =infB

(see the diagram below for two cases: a>αand a = α).

[Figure: the graph of $y(t)$ inside $K_\varepsilon$ between the sets $A$ and $B$, in the two cases $a > \alpha$ and $a = \alpha$.]

111 Lecture 18 16.06.2009 Prof. A. Grigorian, ODE, SS 2009

Claim 3. We have at0.SinceA is contained in J but is separated from the rightendpointofJ, it follows that a J; similarly, b J.Anypointt (a, b) belongs 2 2 2 to J but does not belong to A or B, whence (t, y (t)) Kε; therefore, t [α, β], which implies that a, b [α, β]. 2 2 Now we can fi2nish the proof of Theorem 3.5 as follows. Assuming that

f g η k ¡ kKε · and using b a β α, we obtain from (3.17) with K = K that ¡ · ¡ ε L(β α) ε sup x (t) y (t) e − (β α) η , (3.22) t (a,b) k ¡ k· ¡ · 2 ∈

ε L(β α) provided we choose η = 2(β α) e− − . It follows then by letting t a+that − ! x (a) y (a) ε/2. (3.23) k ¡ k·

Let us deduce from (3.23) that a = α.Indeed,foranyt A,wehave(t, y (t)) / Kε, whichcanoccurintwoways: 2 2

1. either t<α(exit through the left boundary of Kε)

2. or t α and x (t) y (t) >ε(exit through the lateral boundary of Kε) ¸ k ¡ k If a>αthen there is a point t A such that t>α.Atthispointwehave x (t) y (t) >εwhence it follows by2 letting t a that k ¡ k ! ¡ x (a) y (a) ε, k ¡ k¸ which contradicts (3.23). Hence, we conclude that a = α and in the same way b = β. Since [a, b] J, it follows that the solution y (t)isdefinedontheinterval[α, β]. Finally, (3.16) follows½ from (3.22). Using the proof of Theorem 3.5, we can refine the statement of Theorem 3.5 as follows. Corollary. Under the hypotheses of Theorem 3.5, let ε>0 be sufficiently small so that K Ω and f (t, x) is Lipschitz in x in K with the Lipschitz constant L (where K ε ½ ε ε is defined by (3.20)). If supKε f g is sufficiently small, then the IVP (3.14) has a solution y (t) defined on [α, β], andk ¡ thek following estimate holds

L(β α) sup x (t) y (t) e − (β α)sup f g . (3.24) [α,β] k ¡ k· ¡ Kε k ¡ k

Proof. Indeed, as it was shown in the proof of Theorem 3.5, if supKε f g is small enough then α = b and β = b. Hence, (3.24) follows from (3.17). k ¡ k

112 3.4 Continuity of solutions with respect to a parameter

Consider the IVP with a parameter s Rm 2 x = f (t, x, s) 0 (3.25) x (t0)=x0 ½ where f : Ω Rn and Ω is an open subset of Rn+m+1. Here the triple (t, x, s)isidentified ! as a point in Rn+m+1 as follows:

(t, x, s)=(t, x1, .., xn,s1, ..., sm) .

How do we understand (3.25)? For any s Rm, consider the open set 2 n+1 Ωs = (t, x) R :(t, x, s) Ω . 2 2

Denote by S the set of those s,forwhich© Ωs contains (t0,x0ª), that is,

m m S = s R :(t0,x0) Ωs = s R :(t0,x0,s) Ω f 2 2 g f 2 2 g


Then the IVP (3.25) can be considered in the domain Ωs for any s S.Wealways assume that the set S is non-empty. Assume also in the sequel that f (t,2 x, s)isacontin- uous function in (t, x, s) Ω and is locally Lipschitz in x for any s S. For any s S, 2 2 2 denote by x (t, s) the maximal solution of (3.25) and let Is be its domain (that is, Is is an open interval on the axis t). Hence, x (t, s) as a function of (t, s)isdefined in the set

m+1 U = (t, s) R : s S, t Is . 2 2 2 © ª

Theorem 3.6 Under the above assumptions, the set U is an open subset of Rm+1 and the function x (t, s):U Rn is continuous in (t, s). ! 113 Proof. Fix some s S and consider solution x (t)=x (t, s )defined for t I . 0 2 0 2 s0 Choose some interval [α, β] Is0 such that t0 (α, β). We will prove that there is δ>0 such that ½ 2 [α, β] B (s ,δ) U, (3.26) £ 0 ½ m which will imply that U is open. Here B (s0,δ) is a closed ball in R with respect to -norm (we can assume that all the norms in various spaces Rk are the -norms). 1 1 m s R 2


As in the proof of Theorem 3.5, consider a set

n+1 Kε = (t, x) R : α t β, x x (t) ε 2 · · k ¡ k· anditsextensioninRn+m©+1 defined by ª

$$\widetilde{K}_{\varepsilon,\delta} = K_\varepsilon \times \overline{B}(s_0, \delta) = \left\{ (t, x, s) \in \mathbb{R}^{n+m+1} : \alpha \le t \le \beta,\ \|x - x(t)\| \le \varepsilon,\ \|s - s_0\| \le \delta \right\}$$
(see the diagram below).

[Figure: the compact set $\widetilde{K}_{\varepsilon,\delta} = K_\varepsilon \times \overline{B}(s_0, \delta)$ and the solutions $x(t, s_0)$ and $x(t, s)$ inside it.]

114 If ε and δ are small enough then Kε,δ is contained in Ω (cf. the proof of Theorem 3.5). Hence, for any s B(s ,δ), the function f (t, x, s)isdefined for all (t, x) K .Sincethe 2 0 2 ε function f is continuous on Ω, it is uniformlye continuous on the compact set Kε,δ, whence it follows that sup f (t, x, s0) f (t, x, s) 0ass s0. e (t,x) Kε k ¡ k! ! ∈ 57 Using Corollary to Theorem 3.5 with f (t, x)=f (t, x, s0)andg (t, x)=f (t, x, s)where s B (s ,δ), we obtain that if 2 0

sup f (t, x, s) f (t, x, s0) (t,x) Kε k ¡ k ∈ is small enough then the solution y (t)=x (t, s)isdefinedon[α, β]. In particular, this implies (3.26) for small enough δ. Furthermore, applying the estimate (3.24) of Corollary to Theorem 3.5, we obtain that

sup x (t, s) x (t, s0) C sup f (t, x, s0) f (t, x, s) , t [α,β] k ¡ k· (t,x) Kε k ¡ k ∈ ∈

L(β α) where C = e − (β α) depends only on α, β and the Lipschitz constant L of the function f (t, x, s )inK¡. Letting s s , we obtain that 0 ε ! 0

sup x (t, s) x (t, s0) 0ass s0, (3.27) t [α,β] k ¡ k! ! ∈ so that x (t, s) is continuous in s at s = s0 uniformly in t [α, β]. Since x (t, s)is continuous in t for any fixed s, it follows that x is continuous in2 (t, s). Indeed, fixapoint (t0,s0) U (where t0 (α, β) is not necessarily the value from the initial condition) and prove that2 x (t, s) x2(t ,s )as(t, s) (t ,s ). Using (3.27) and the continuity of the ! 0 0 ! 0 0 function x (t, s0)int,weobtain

x (t, s) x (t ,s ) x (t, s) x (t, s ) + x (t, s ) x (t ,s ) k ¡ 0 0 k·k ¡ 0 k k 0 ¡ 0 0 k sup x (t, s) x (t, s0) + x (t, s0) x (t0,s0) · t [α,β] k ¡ k k ¡ k ∈ 0ass s and t t , ! ! 0 ! 0 which finishes the proof. As an example of application of Theorem 3.6, consider the dependence of a solution on the initial condition. Corollary. Consider the initial value problem

x0 = f (t, x) x (t0)=x0 ½ where f : Ω Rn isacontinuousfunctioninanopensetΩ Rn+1 that is Lipschitz in x in Ω. For! any point (t ,x ) Ω,denotebyx (t, t ,x ) the½ maximal solution of this 0 0 2 0 0 problem. Then the function x (t, t0,x0) is continuous jointly in (t, t0,x0).

57 Since the common domain of the functions f (t; x; s)andf (t; x; s0)is(t; x) −s0 −s,Theorem3.5 should be applied with this domain. 2 \

115 Proof. Consider a new function y (t)=x (t + t ) x that satisfies the ODE 0 ¡ 0 y0 (t)=x0 (t + t0)=f (t + t0,x(t + t0)) = f (t + t0,y(t)+x0) .

Considering s := (t0,x0) as a parameter and setting

F (t, y, s)=f (t + t0,y+ x0) , we see that y satisfies the IVP y0 = F (t, y, s) y (0) = 0. ½ 2n+2 The domain of function F consists of those points (t, y, t0,x0) R for which 2 (t + t ,y+ x ) Ω, 0 0 2 which obviously is an open subset of R2n+2. Since the function F (t, y, s) is clearly continuous in (t, y, s) and is locally Lipschitz in y,weconcludebyTheorem3.6that the maximal solution y = y (t, s) is continuous in (t, s). Consequently, the function x (t, t ,x )=y (t t ,t ,x )+x is continuous in (t, t ,x ). 0 0 ¡ 0 0 0 0 0 0 3.5 Differentiability of solutions in parameter Consider again the initial value problem with parameter x = f(t, x, s), 0 (3.28) x (t0)=x0, ½ where f : Ω Rn is a continuous function definedonanopensetΩ Rn+m+1 and ! ½ where (t, x, s)=(t, x1, ..., xn,s1, ..., sm) . Let us use the following notation for the partial Jacobian matrices of f with respect to the variables x and s: ∂f ∂f f = ∂ f = := i , x x ∂x ∂x µ k ¶ where i =1, ..., n is the row index and k =1, ..., n is the column index, so that fx is an n n matrix, and £ ∂f ∂f f = ∂ f = := i , s s ∂s ∂s µ l ¶ where i =1, ..., n is the row index and l =1, ..., m is the column index, so that fs is an n m matrix. £ Note that if fx is continuous in Ω then, by Lemma 3.1, f is locally Lipschitz in x so that all the previous results apply. Let x (t, s) be the maximal solution to (3.28). Recall that, by Theorem 3.6, the domain U of x (t, s)isanopensubsetofRm+1 and x : U Rn is continuous. !

Theorem 3.7 Assume that function f (t, x, s) is continuous and fx and fs exist and are also continuous in Ω.Thenx (t, s) is continuously differentiable in (t, s) U and the 2 Jacobian matrix y = ∂sx solves the initial value problem y = f (t, x (t, s) ,s) y + f (t, x (t, s) ,s) , 0 x s (3.29) y (t0)=0. ½

116 ∂xk Here ∂sx = is an n m matrix where k =1,..,nis the row index and l =1, ..., m ∂sl £ n m is the column index.³ ´ Hence, y = ∂sx canbeconsideredasavectorinR × depending on t and s. Thebothtermsintherighthandsideof(3.29)arealson m matrices so that £ (3.29) makes sense. Indeed, fs is an n m matrix, and fxy is the product of the n n and n m matrices, which is again an n£ m matrix. The notation f (t, x (t, s) ,s)means£ £ £ x that one first evaluates the partial derivative fx (t, x, s) and then substitute the values of s and x = x (t, s) (and the same remark applies to fs (t, x (t, s) ,s)). The ODE in (3.29) is called the variational equation for (3.28) along the solution x (t, s)(ortheequation in variations).

117 Lecture 19 22.06.2009 Prof. A. Grigorian, ODE, SS 2009 Note that the variational equation is linear. Indeed, for any fixed s, it can be written in the form y0 = a (t) y + b (t) , where a (t)=fx (t, x (t, s) ,s) ,b(t)=fs (t, x (t, s) ,s) . Since f is continuous and x (t, s) is continuous by Theorem 3.6, the functions a (t)and b (t) are continuous in t. If the domain in t of the solution x (t, s)isIs then the domain n m of the variational equation is Is R × . By Theorem 2.1, the solution y (t) of (3.29) £ exists in the full interval Is. Hence, Theorem 3.7 can be stated as follows: if x (t, s)isthe solution of (3.28) on Is and y (t) is the solution of (3.29) on Is then we have the identity y (t)=∂sx (t, s)forallt Is. This provides a method of evaluating ∂sx (t, s)forafixed s without finding x (t, s)forall2 s. Example. Consider the IVP with parameter

2 x0 = x +2s/t x (1) = 1 ½ ¡ in the domain (0, + ) R R (that is, t>0andx, s are arbitrary real). Let us evaluate 1 £ £ 2 x (t, s)and∂sx for s = 0. Obviously, the function f (t, x, s)=x +2s/t is continuously dif- ferentiable in (x, s) whence it follows that the solution x (t, s) is continuously differentiable in (t, s). For s =0wehavetheIVP 2 x0 = x x (1) = 1 ½ ¡ 1 whence we obtain x (t, 0) = t .Noticingthatfx =2x and fs =2/t we obtain the variational equation along this¡ solution 2 2 y0 = fx (t, x, s) x= 1 ,s=0 y + fs (t, s, x) x= 1 ,s=0 = y + . j − t j − t ¡ t t ³ ´ ³ ´ This is a linear equation of the form y0 = a (t) y + b (t) which is solved by the formula

A(t) A(t) y = e e− b (t) dt, Z where A (t) is a primitive of a (t)= 2/t,thatisA (t)= 2lnt. Hence, ¡ ¡ 2 y (t)=t 2 t2 dt = t 2 t2 + C =1+Ct 2. − t − − Z ¡ ¢ 2 The initial condition y (1) = 0 is satisfied for C = 1sothaty (t)=1 t− .ByTheorem 2 ¡ ¡ 3.7, we conclude that ∂sx (t, 0) = 1 t− . Expanding x (t, s) as a function¡ of s by the Taylor formula of the order 1, we obtain

x (t, s)=x (t, 0) + ∂ x (t, 0) s + o (s)ass 0, s ! that is, 1 1 x (t, s)= + 1 s + o (s)ass 0. ¡ t ¡ t2 ! µ ¶ 118 In particular, we obtain for small s an approximation 1 1 x (t, s) + 1 s. ¼¡t ¡ t2 µ ¶ Later we will be able to obtain more terms in the Taylor formula and, hence, to get a better approximation for x (t, s). Let us discuss the variational equation (3.29). It is easy to deduce (3.29) provided it is known that the mixed partial derivatives ∂s∂tx and ∂t∂sx exist and are equal (for example, this is the case when x (t, s) C2 (U)). Then differentiating (3.28) in s and using the chain rule, we obtain 2

∂t∂sx = ∂s (∂tx)=∂s [f (t, x (t, s) ,s)] = fx (t, x (t, s) ,s) ∂sx + fs (t, x (t, s) ,s) ,

58 which implies (3.29) after substitution ∂sx = y. Although this argument is not a proof of Theorem 3.7, it allows to memorize the variational equation. Yet another point of view on the equation (3.29) is as follows. Fix a value of s = s0 and set x (t)=x (t, s0). Using the differentiability of f (t, x, s)in(x, s), we can write, for any fixed t,

f (t, x, s)=f (t, x (t) ,s )+f (t, x (t) ,s )(x x (t)) + f (t, x (t) ,s )(s s )+R, 0 x 0 ¡ s 0 ¡ 0 where R = o ( x x (t) + s s0 )as x x (t) + s s0 0. If s is close to s0 then x (t, s)isclosetok ¡ xk(t),k and¡ wek obtaink the¡ approximationk k ¡ k!

f (t, x (t, s) ,s) f (t, x (t) ,s )+a (t)(x (t, s) x (t)) + b (t)(s s ) , ¼ 0 ¡ ¡ 0 whence x0 (t, s) f (t, x (t) ,s )+a (t)(x (t, s) x (t)) + b (t)(s s ) . ¼ 0 ¡ ¡ 0 This equation is linear with respect to x (t, s) and is called the linearized equation at the solution x (t). Substituting f (t, x (t) ,s0)byx0 (t) and dividing by s s0,weobtainthat x(t,s) x(t) ¡ the function z (t, s)= − satisfies the approximate equation s s0 −

z0 a (t) z + b (t) . ¼

It is not difficult to believe that the derivative y (t)=∂sx s=s0 = lims s0 z (t, s)satisfies this equation exactly. This argument (although not rigorous)j shows that→ the variational equation for y originates from the linearized equation for x. How can one evaluate the higher derivatives of x (t, s)ins? Let us show how to find the ODE for the second derivative z = ∂ssx assuming for simplicity that n = m =1,that is, both x and s are one-dimensional. For the derivative y = ∂sx we have the IVP (3.29), which we write in the form y = g (t, y, s) 0 (3.30) y (t0)=0 ½ where g (t, y, s)=fx (t, x (t, s) ,s) y + fs (t, x (t, s) ,s) . (3.31)

58 In the actual proof of Theorem 3.7, the variational equation is obtained before the identity @t@sx = @s@tx. The main technical di±culty in the proof of Theorem 3.7 lies in verifying the di®erentiability of x in s.

119 For what follows we use the notation F (a, b, c, ...) Ck (a, b, c, ...) if all the partial deriva- tives of the order up to k of the function F with respect2 to the specified variables a, b, c... exist and are continuous functions, in the domain of F . For example, the condition in 1 Theorem 3.7 that fx and fs are continuous, can be shortly written as f C (x, s) , and the claim of Theorem 3.7 is that x (t, s) C1 (t, s) . 2 Assume now that f C2 (x, s). Then2 by (3.31) we obtain that g is continuous and g C1 (y, s), whence by2 Theorem 3.7 y C1 (s). In particular, the function z = ∂ y = 2 2 s ∂ssx is defined. Applying the variational equation to the problem (3.30), we obtain the equation for z z0 = gy (t, y (t, s) ,s) z + gs (t, y (t, s) ,s) .

Since gy = fx (t, x, s),

gs (t, y, s)=fxx (t, x, s)(∂sx) y + fxs (t, x, s) y + fsx (t, x, s) ∂sx + fss (t, x, s) , and ∂sx = y, we conclude that z = f (t, x, s) z + f (t, x, s) y2 +2f (t, x, s) y + f (t, x, s) 0 x xx xs ss (3.32) z0 (t0)=0. ½ Note that here x must be substituted by x (t, s)andy —byy (t, s). The equation (3.32) is called the variational equation of the second order (or the second variational equation). It is a linear ODE and it has the same coefficient fx (t, x (t, s) ,s) in front of the unknown function as the first variational equation. Similarly one can find the variational equations of the higher orders. Example. This is a continuation of the previous example of the IVP with parameter

2 x0 = x +2s/t x (1) = 1 ½ ¡ where we have already computed that 1 1 x (t):=x (t, 0) = and y (t):=∂ x (t, 0) = 1 . ¡ t s ¡ t2

Let us now evaluate z (t)=∂ssx (t, 0). Since

fx =2x, fxx =2,fxs =0,fss =0, we obtain the second variational equation

2 z0 = fx x= 1 ,s=0 z + fxx x= 1 ,s=0 y j − t j − t ³ 2 ´ 2 ³2 ´ = z +2 1 t− . ¡ t ¡ ¡ ¢ 2 Solving this equation similarly to the first variational equation with the same a (t)= t 2 2 ¡ and with b (t)=2(1 t− ) ,weobtain ¡ A(t) A(t) 2 2 2 2 z (t)=e e− b (t) dt = t− 2t 1 t− dt ¡ Z Z 2 2 3 2 2 ¡2 4 ¢ C = t− t 4t + C = t + . 3 ¡ t ¡ 3 ¡ t3 ¡ t t2 µ ¶ 120 16 The initial condition z (1) = 0 yields C = 3 whence 2 4 16 2 z (t)= t + . 3 ¡ t 3t2 ¡ t3 Expanding x (t, s)ats = 0 by the Taylor formula of the second order, we obtain as s 0 ! 1 x (t, s)=x (t)+y (t) s + z (t) s2 + o s2 2 1 2 1 2¡ ¢8 1 2 2 = + 1 t− s + t + s + o s . ¡ t ¡ 3 ¡ t 3t2 ¡ t3 µ ¶ ¡ ¢ ¡ ¢ For comparison, the plots below show for s =0.1thesolutionx (t, s)(black)com- 1 puted by numerical methods (MAPLE), the zero order approximation t (blue), the 1 2 ¡ first order approximation t +(1 t− ) s (green), and the second order approximation 1 2 1 2¡ 8 ¡1 2 +(1 t− ) s + t + 2 3 s (red). ¡ t ¡ 3 ¡ t 3t ¡ t ¡ ¢

[Plot: the numerical solution x (t, s) for s = 0.1 (black) on 0 < t ≤ 10, together with the zero order (blue), first order (green) and second order (red) approximations.]
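The same comparison can be reproduced with any numerical ODE solver in place of MAPLE. The following sketch (an illustration, not from the notes) prints the numerical value of x (t, s) for s = 0.1 next to the three approximations at a few values of t; scipy is used here purely as a stand-in for the numerics.

    # Reproducing the comparison above: x(t, 0.1) from a numerical solver next to
    # the zero, first and second order approximations in s (scipy replaces MAPLE here).
    import numpy as np
    from scipy.integrate import solve_ivp

    s = 0.1
    sol = solve_ivp(lambda t, x: x**2 + 2*s/t, (1.0, 10.0), [-1.0],
                    rtol=1e-10, atol=1e-12, dense_output=True)

    for t in (2.0, 5.0, 10.0):
        x_num = sol.sol(t)[0]
        x0 = -1.0/t                                                   # zero order
        x1 = x0 + (1.0 - t**-2) * s                                   # first order
        x2 = x1 + 0.5 * (2*t/3 - 4/t + 16/(3*t**2) - 2/t**3) * s**2   # second order
        print(t, x_num, x0, x1, x2)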

Let us discuss an alternative method of obtaining the equations for the derivatives of x (t, s)ins.Asabove,letx (t), y (t) ,z(t) be respectively x (t, 0), ∂sx (t, 0) and ∂ssx (t, 0) so that by the Taylor formula 1 x (t, s)=x (t)+y (t) s + z (t) s2 + o s2 . (3.33) 2 ¡ ¢ Let us write a similar expansion for x0 = ∂tx, assuming that the derivatives ∂t and ∂s commute on x.Wehave ∂sx0 = ∂t∂sx = y0 andinthesameway ∂ssx0 = ∂sy0 = ∂t∂sy = z0. Hence, 1 x (t, s)=x (t)+y (t) s + z (t) s2 + o s2 . 0 0 0 2 0 ¡ ¢ 121 Substituting this into the equation

2 x0 = x +2s/t we obtain 1 1 2 x (t)+y (t) s + z (t) s2 + o s2 = x (t)+y (t) s + z (t) s2 + o s2 +2s/t 0 0 2 0 2 µ ¶ ¡ ¢ ¡ ¢ whence 1 x (t)+y (t) s + z (t) s2 = x2 (t)+2x (t) y (t) s + y (t)2 + x (t) z (t) s2 +2s/t + o s2 . 0 0 2 0 Equating the terms with the same powers of s (which¡ can be done by¢ the uniqueness¡ ¢ of the Taylor expansion), we obtain the equations

2 x0 (t)=x (t)

y0 (t)=2x (t) y (t)+2s/t 2 z0 (t)=2x (t) z (t)+2y (t) .

From the initial condition x (1, s) = −1 we obtain

−1 = x (1) + s y (1) + (s²/2) z (1) + o (s²),

whence x (1) = −1, y (1) = z (1) = 0. Solving successively the above equations with these initial conditions, we obtain the same result as above.

Before we prove Theorem 3.7, let us prove some auxiliary statements from Analysis.

Definition. A set K ⊂ Rⁿ is called convex if for any two points x, y ∈ K, also the full interval [x, y] is contained in K, that is, the point (1 − λ) x + λy belongs to K for any λ ∈ [0, 1].

Example. Let us show that any ball B (z, r) in Rⁿ with respect to any norm is convex. Indeed, it suffices to treat the case z = 0. If x, y ∈ B (0, r) then ‖x‖ < r and ‖y‖ < r, whence, for any λ ∈ [0, 1],

‖(1 − λ) x + λy‖ ≤ (1 − λ) ‖x‖ + λ ‖y‖ < r,

so that (1 − λ) x + λy ∈ B (0, r).

Lemma 3.8 (The Hadamard lemma) Let f (t, x) be a continuous mapping from an open set Ω ⊂ R^{n+1} to R^l such that the partial derivative f_x exists and is continuous in Ω. Assume that, for any t ∈ R, the set Ω_t = {x ∈ Rⁿ : (t, x) ∈ Ω} is convex, and set

Ω′ = {(t, x, y) ∈ R^{2n+1} : t ∈ R, x, y ∈ Ω_t}
   = {(t, x, y) ∈ R^{2n+1} : (t, x) ∈ Ω and (t, y) ∈ Ω}.

Then there exists a continuous mapping ϕ (t, x, y) : Ω′ → R^{l×n} such that the following identity holds:

f (t, y) − f (t, x) = ϕ (t, x, y) (y − x)

for all (t, x, y) ∈ Ω′ (here ϕ (t, x, y) (y − x) is the product of the l × n matrix and the column-vector). Furthermore, we have for all (t, x) ∈ Ω the identity

ϕ (t, x, x) = f_x (t, x).     (3.34)

[Diagram: for a fixed t, the points (t, x) and (t, y) of Ω lie on the same vertical line in the (t, x) space.]

123 Lecture 20 23.06.2009 Prof. A. Grigorian, ODE, SS 2009 Let us discuss the statement of Lemma 3.8 before the proof. The statement involves the parameter t (that can also be multi-dimensional), whichisneededfortheapplication of this Lemma for the proof of Theorem 3.7. However, the statement of Lemma 3.8 is significantly simpler when function f does not depend on t. Indeed, in this case Lemma 3.8 can be stated as follows. Let Ω be a convex open subset of Rn and f : Ω Rl be a continuously differentiable function. Then there is a continuous function ϕ (x,! y): l n Ω Ω R × such that the following identity is true for all x, y Ω: £ ! 2 f (y) f (x)=ϕ (x, y)(y x) . (3.35) ¡ ¡ Furthermore, we have ϕ (x, x)=fx (x) . (3.36) Note that, by the differentiability of f,wehave f (y) f (x)=f (x)(y x)+o ( y x )asy x. ¡ x ¡ k ¡ k ! The point of the identity (3.35) is that the term o ( x y ) can be eliminated replacing k ¡ k fx (x) by a continuous function ϕ (t, x, y). Consider some simple examples of functions f (x)withn = l =1.Say,iff (x)=x2 then we have f (y) f (x)=(y + x)(y x) ¡ ¡ so that ϕ (x, y)=y + x. In particular, ϕ (x, x)=2x = f 0 (x). A similar formula holds for f (x)=xk with any k N: 2 k 1 k 2 k 1 f (y) f (x)= x − + x − y + ... + y − (y x) . ¡ ¡ k 1 k 2 k 1 k 1 so that ϕ (x, y)=x − + x − y + ...¡ + y − and ϕ (x, x)=kx¢ − . In the case n = l =1,onecandefine ϕ (x, y) to satisfy (3.35) and (3.36) as follows:

f(y) f(x) − ,y= x, ϕ (x, y)= y x 6 f (x−) ,y= x. ½ 0 It is obviously continuous in (x, y)forx = y, and it is continuous at (x, x)becauseif (x ,y ) (x, x)ask then 6 k k ! !1 f (yk) f (xk) ¡ = f 0 (ξ ) y x k k ¡ k where ξ (x ,y ), which implies that ξ x and hence, f 0 (ξ ) f 0 (x),wherewehave k 2 k k k ! k ! used the continuity of the derivative f 0 (x). This argument does not work in the case n>1 since one cannot divide by y x.In the general case, we use a different approach. ¡ ProofofLemma3.8. It suffices to prove this lemma for each component fi separately. Hence, we can assume that l =1sothatϕ is a row (ϕ1,...,ϕn). Hence, we need to prove the existence of n real valued continuous functions ϕ1, ..., ϕn of (t, x, y)such that the following identity holds:

n

f (t, y) f (t, x)= ϕi (t, x, y)(yi xi) . ¡ i=1 ¡ X 124 Fix a point (t, x, y) Ω0 and consider a function 2 F (λ)=f (t, x + λ (y x)) ¡ on the interval λ [0, 1]. Since x, y Ω and Ω is convex, the point x + λ (y x) 2 2 t t ¡ belongs to Ωt. Therefore, (t, x + λ (y x)) Ω and the function F (λ)isindeeddefined for all λ [0, 1]. Clearly, F (0) = f ¡(t, x), 2F (1) = f (t, y). By the chain rule, F (λ)is continuously2 differentiable and

n F 0 (λ)= f (t, x + λ (y x)) (y x ) . xi ¡ i ¡ i i=1 X By the fundamental theorem of calculus, we obtain

f (t, y) f (t, x)=F (1) F (0) ¡ 1 ¡ = F 0 (λ) dλ 0 Zn 1 = f (t, x + λ (y x)) (y x ) dλ xi ¡ i ¡ i i=1 Z0 Xn = ϕ (t, x, y)(y x ) i i ¡ i i=1 X where 1 ϕ (t, x, y)= f (t, x + λ (y x)) dλ. (3.37) i xi ¡ Z0 We are left to verify that ϕi is continuous. Observe firstthatthedomainΩ0 of ϕi is an 2n+1 open subset of R . Indeed, if (t, x, y) Ω0 then (t, x)and(t, y) Ω which implies by the openness of Ω that there is ε>0 such2 that the balls B ((t, x) ,ε2)andB ((t, y) ,ε)in Rn+1 are contained in Ω. Assuming the norm in all spaces in question is the -norm, we 1 obtain that B ((t, x, y) ,ε) Ω0. The continuity of ϕi follows from the following general statement. ½

Lemma 3.9 Let f (λ, u) be a continuous real-valued function on [a, b] U where U is an £ open subset of Rk, λ [a, β] and u U. Then the function 2 2 b ϕ (u)= f (λ, u) dλ Za is continuous in u U. 2

ProofofLemma3.9Let u 1 be a sequence in U that converges to some u U.Thenall f kgk=1 2 uk with large enough index k are contained in a closed ball B (u; ") U.Sincef (¸; u) is continuous in [a; b] U, it is uniformly continuous on any compact set in this domain,½ in particular, in [a; b] B (u; ") : Hence,£ the convergence £ f (¸; u ) f (¸; u)ask k ! !1 is uniform in ¸ [0; 1]. Since the operations of integration and the uniform convergence are interchange- 2 able,weconcludethat' (u ) ' (u), which proves the continuity of '. k !

125 The proof of Lemma 3.8 is finished as follows. Consider f (t, x + λ (y x)) as a xi ¡ function of (λ, t, x, y) [0, 1] Ω0. Thisfunctioniscontinuousin(λ, t, x, y), which implies 2 £ by Lemma 3.9 that also ϕi (t, x, y) is continuous in (t, x, y). Finally, if x = y then f (t, x + λ (y x)) = f (t, x) which implies by (3.37) that xi ¡ xi

ϕi (t, x, x)=fxi (t, x) and, hence, ϕ (t, x, x)=fx (t, x), that is, (3.34). Now we are in position to prove Theorem 3.7. Proof of Theorem 3.7. Recall that x (t, s) is the maximal solution of the initial value problem (3.28), it is definedinanopensetU Rm+1 and is continuous in (t, s)in U (cf. Theorem 3.6). In the main part of the proof,2 we show that the partial derivative

∂si x exists in U. Since this can be done separately for any component si,inthispartwe can and will assume that s is one-dimensional (that is, m =1). Fix a value s of the parameter s and prove that ∂ x (t, s)existsats = s for any t 0 s 0 2 Is0 ,whereIs0 is the domain in t of the solution x (t):=x (t, s0). Since the differentiability is a local property, it suffices to prove that ∂ x (t, s)existsats = s for any t (α, β) s 0 2 where [α, β] is any bounded closed interval that is a subset of Is0 . For a technical reason, we will assume that (α, β)containst0. By Theorem 3.6, for any ε>0thereisδ>0such that the rectangle [a, β] [s δ,s + δ] is contained in U and £ 0 ¡ 0

sup x (t, s) x (t) <ε for all s [s0 δ,s0 + δ] (3.38) t [α,β] k ¡ k 2 ¡ ∈ (see the diagram below).

[Diagram: the rectangle [α, β] × [s0 − δ, s0 + δ] inside the open set U in the (t, s) plane; t0 lies in (α, β).]

Furthermore, ε and δ can be chosen so small that the following set

Ω̃ := {(t, x, s) ∈ R^{n+m+1} : α < t < β, ‖x − x (t)‖ < ε, |s − s0| < δ}     (3.39)

is contained in Ω (see the diagram below).

[Diagram: the set Ω̃ inside Ω; for each t ∈ (α, β), the slice Ω̃_t is the neighborhood B (x (t), ε) × (s0 − δ, s0 + δ) of the solution x (t).]

In what follows, we restrict the domain of the variables (t, x, s) to Ω̃. Note that Ω̃ is an open subset of R^{n+m+1}, and it is convex with respect to the variable (x, s), for any fixed t. Indeed, by the definition (3.39), (t, x, s) ∈ Ω̃ if and only if

(x, s) ∈ B (x (t), ε) × (s0 − δ, s0 + δ),

so that Ω̃_t = B (x (t), ε) × (s0 − δ, s0 + δ). Since both the ball B (x (t), ε) and the interval (s0 − δ, s0 + δ) are convex sets, Ω̃_t is also convex.

Applying the Hadamard lemma to the function f (t, x, s) in the domain Ω̃ and using the fact that f is continuously differentiable with respect to (x, s), we obtain the identity

f (t, y, s) − f (t, x, s0) = ϕ (t, x, s0, y, s) (y − x) + ψ (t, x, s0, y, s) (s − s0),

where ϕ and ψ are continuous functions on the appropriate domains. In particular, substituting x = x (t) and y = x (t, s), we obtain

f (t, x (t, s) ,s) f (t, x (t) ,s )=ϕ (t, x (t) ,s ,x(t, s) ,s)(x (t, s) x (t)) ¡ 0 0 ¡ +ψ (t, x (t) ,s ,x(t, s) ,s)(s s ) 0 ¡ 0 = a (t, s)(x (t, s) x (t)) + b (t, s)(s s ) , ¡ ¡ 0 where the functions

a (t, s)=ϕ (t, x (t) ,s0,x(t, s) ,s)and b (t, s)=ψ (t, x (t) ,s0,x(t, s) ,s) (3.40) are continuous in (t, s) (α, β) (s δ, s + δ) (the dependence on s is suppressed 2 £ 0 ¡ 0 0 because s0 is fixed).

127 For any s (s δ, s + δ) s and t (α, β), set 2 0 ¡ 0 nf 0g 2 x (t, s) x (t) z (t, s)= ¡ (3.41) s s ¡ 0 and observe that, by (3.28) and (3.40),

x0 (t, s) x0 (t) f (t, x (t, s) ,s) f (t, x (t) ,s0) z0 = ¡ = ¡ s s0 s s0 = a (t, s)¡z + b (t, s) . ¡

Note also that z (t0,s) = 0 because both x (t, s)andx (t, s0)satisfythesameinitial condition. Hence, function z (t, s) solves for any fixed s (s δ,s + δ) s the IVP 2 0 ¡ 0 nf 0g z = a (t, s) z + b (t, s) 0 (3.42) z (t0,s)=0. ½ Since this ODE is linear and the functions a and b are continuous in (t, s) (α, β) 2 £ (s0 δ,s0 + δ), we conclude by Theorem 2.1 that the solution to this IVP exists for all s ¡(s δ,s + δ)andt (α, β) and, by Theorem 3.6, the solution is continuous in 2 0 ¡ 0 2 (t, s) (α, β) (s0 δ,s0 + δ). By the uniqueness theorem, the solution of (3.42) for s = s2 coincides£ with¡ the function z (t, s)defined by (3.41). Although (3.41) does not 6 0 allow to define z (t, s)fors = s0,wecandefine z (t, s0) as the solution of the IVP (3.42) with s = s0. Using the continuity of z (t, s)ins,weobtain

lim z (t, s)=z (t, s0) , s s0 → that is, x (t, s) x (t, s0) ∂sx (t, s) s=s0 =lim ¡ =limz (t, s)=z (t, s0) . j s s0 s s s s0 → ¡ 0 → Hence, the derivative y (t)=∂sx (t, s) s=s0 exists and is equal to z (t, s0), that is, y (t) satisfies the IVP j y0 = a (t, s0) y + b (t, s0) , y (t0)=0. ½ Note that by (3.40) and Lemma 3.8

a (t, s0)=ϕ (t, x (t) ,s0,x(t) ,s0)=fx (t, x (t) ,s0) and b (t, s0)=ψ (t, x (t) ,s0,x(t) ,s0)=fs (t, x (t) ,s0) Hence, we obtain that y (t)satisfies the variational equation (3.29). To finishtheproof,wehavetoverifythatx (t, s) is continuously differentiable in (t, s). m Here we come back to the general case s R .Thederivative∂sx = y satisfies the IVP 2 (3.29) and, hence, is continuous in (t, s) by Theorem 3.6. Finally, for the derivative ∂tx we have the identity ∂tx = f (t, x (t, s) ,s) , (3.43) which implies that ∂tx is also continuous in (t, s). Hence, x is continuously differentiable in (t, s).

128 Remark. It follows from (3.43) that ∂tx is differentiable in s and, by the chain rule,

∂s (∂tx)=∂s [f (t, x (t, s) ,s)] = fx (t, x (t, s) ,s) ∂sx + fs (t, x (t, s) ,s) . (3.44)

On the other hand, it follows from (3.29) that

∂t (∂sx)=∂ty = fx (t, x (t, s) ,s) ∂sx + fs (t, x (t, s) ,s) , (3.45) whence we conclude that ∂s∂tx = ∂t∂sx. (3.46) 59 Hence, the derivatives ∂s and ∂t commute on x.Aswehaveseenabove,ifoneknewthe identity (3.46) a priori then the derivation of the variational equation (3.29) would have been easy. However, in the present proof the identity (3.46) comes after the variational equation.

Theorem 3.10 Under the conditions of Theorem 3.7, assume that, for some k N, f (t, x, s) Ck (x, s). Then the maximal solution x (t, s) belongs to Ck (s). Moreover,2 for any multiindex2 α =(α , ..., α ) of the order α k, we have 1 m j j· α α ∂t∂s x = ∂s ∂tx. (3.47)

A multiindex α is a sequence (α1, ..., αm)ofm non-negative integers αi,itsorder α is defined by α = α + ... + α , and the derivative ∂α is defined by j j j j 1 n s ∂ α ∂α = | | . s α1 αm ∂s1 ...∂sm Proof. Induction in k.Ifk = 1 then the fact that x C1 (s) is the claim of Theorem 3.7, and the equation (3.47) with α =1wasverified in2 the above Remark. Let us make the inductive step from k 1tojk,j for any k 2. Assume f Ck (x, s). Since also k 1 ¡ ¸ k 1 2 f C − (x, s), by the inductive hypothesis we have x C − (s). Set y = ∂sx and recall that2 by Theorem 3.7 2 y = f (t, x, s) y + f (t, x, s) , 0 x s (3.48) y (t0)=0, ½ k 1 k 1 where x = x (t, s). Since fx and fs belong to C − (x, s)andx (t, s) C − (s), we obtain 2 k 1 that the composite functions fx (t, x (t, s) ,s)andfs (t, x (t, s) ,s) are of the class C − (s). k 1 Hence, the right hand side in (3.48) is of the class C − (y, s) and, by the inductive k 1 k hypothesis,weconcludethaty C − (s). It follows that x C (s). 2 2

59The equality of the mixed derivatives can be concluded by a theorem from Analysis II if one knows that both @s@tx and @t@sx are continuous. Their continuity follows from the identities (3.44) and (3.45), whichproveatthesametimealsotheirequality.

129 Lecture 21 29.06.2009 Prof. A. Grigorian, ODE, SS 2009 Let us now prove (3.47). Choose some index i so that αi 1, and set β =(α1, ..., αi 1, ...αn), ¸ α β ¡ where 1 is subtracted only at the position i. It follows that ∂s = ∂s ∂si .Setyi = ∂si x so that yi is the i-thcolumnofthematrixy = ∂sx. It follows from (3.48) that the vector-function yi satisfies the ODE

yi0 = fx (t, x, s) yi + fsi (t, x, s) . (3.49)

α β k Using the identity ∂s = ∂s ∂si that holds on C (s) functions, the fact that f (t, x (t, s) ,s) Ck (s), which follows from the first part of the proof, and the chain rule, we obtain 2

α α β β ∂s ∂tx = ∂s f (t, x, s)=∂s ∂si f (t, x, s)=∂s (fxi (t, x, s) ∂si x + fsi ) β β = ∂s (fxi (t, x, s) yi + fsi )=∂s ∂tyi.

k 1 Observe that the function on the right hand side of (3.49) belongs to the class C − (y,s). Applying the inductive hypothesis to the ODE (3.49) and noticing that β = k 1, we obtain j j ¡ β β ∂s ∂tyi = ∂t∂s yi, whence it follows that

α β β α ∂s ∂tx = ∂t∂s yi = ∂t∂s ∂si x = ∂t∂s x.

4 Qualitative analysis of ODEs

4.1 Autonomous systems Consider a vector ODE x0 = f (x) (4.1) wheretherighthandsidedoesnotdependont. Such equations are called autonomous. Here f is definedonanopensetΩ Rn (or Ω Cn)andtakesvaluesinRn (resp., Cn), ½ ½ so that the domain of the ODE is R Ω. £ Definition. The set Ω is called the of the ODE. Any path x : I Ω,where x (t) is a solution of the ODE on an interval I,iscalledaphase trajectory. A! plot of all (or typical) phase trajectories is called the phase diagram or the . Recall that the graph of a solution (or the integral curve) is the set of points (t, x (t)) in R Ω. Hence, the phase trajectory can be regarded as the projection of the integral curve£ onto Ω. Assume in the sequel that f is continuously differentiable in Ω.Foranys Ω,denote by x (t, s) the maximal solution to the IVP 2

x0 = f (x) x (0) = s. ½ Recall that, by Theorem 3.7, the domain of function x (t, s)isanopensubsetofRn+1 and x (t, s)iscontinuouslydifferentiable in this domain. The fact that f does not depend on t, implies easily the following two consequences.

130 1. If x (t) is a solution of (4.1) then also x (t a) is a solution of (4.1), for any a R. In particular, the function x (t t ,s) solves¡ the following IVP 2 ¡ 0

x0 = f (x) x (t0)=s. ½ 2. If f (x )=0forsomex Ω then the constant function x (t) x is a solution of 0 0 2 ´ 0 x0 = f (x). Conversely, if x (t) x is a constant solution then f (x )=0. ´ 0 0 Definition. If f (x )=0atsomepointx Ω then x is called a stationary point 60 or 0 0 2 0 a stationary solution of the ODE x0 = f (x).

It follows from the above observation that x0 Ω is a stationary point if and only if 2 x (t, x0) x0. It is frequently the case that the stationary points determine the shape of the phase´ diagram. Example. Consider the following system x = y + xy 0 (4.2) y0 = x xy ½ ¡ ¡ that can be partially solved as follows. Dividing one equation by the other, we obtain a separable ODE for y as a function of x: dy x (1 + y) = ; dx ¡y (1 + x) whence it follows that ydy xdx = 1+y ¡ 1+x Z Z and y ln y +1 + x ln x +1 = C: (4.3) ¡ j j ¡ j j Although the equation (4.3) does not give the dependence of x and y of t, it does show the dependence between x and y and allows to plot the phase diagram using di®erent values of the constant C:

[Phase diagram of the system (4.2): level curves of (4.3) for several values of C, plotted for −5 ≤ x, y ≤ 5.]

One can see that there are two points of "attraction" on this plot: (0, 0) and (−1, −1), that are exactly the stationary points of (4.2).
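One can also check the conservation of (4.3) numerically: integrating (4.2) (with the right hand side x′ = y + xy, y′ = −x − xy, the signs being those used for the same system (4.11) in Section 4.3) and evaluating the left hand side of (4.3) along the computed trajectory should give a constant. A short sketch (not part of the notes), with scipy as an illustrative choice of solver:

    # Checking numerically that the left hand side of (4.3) is constant along a
    # trajectory of (4.2) (right hand side: x' = y + x*y, y' = -x - x*y).
    import numpy as np
    from scipy.integrate import solve_ivp

    def rhs(t, u):
        x, y = u
        return [y + x*y, -x - x*y]

    def H(x, y):                                   # the left hand side of (4.3)
        return x - np.log(np.abs(x + 1)) + y - np.log(np.abs(y + 1))

    sol = solve_ivp(rhs, (0.0, 20.0), [0.5, 0.0], rtol=1e-10, atol=1e-12,
                    t_eval=np.linspace(0.0, 20.0, 201))
    values = H(sol.y[0], sol.y[1])
    print(values.max() - values.min())             # should be close to 0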

60In the literature one can ¯nd the following synonyms for the term \stationary point": rest point, singular point, equilibrium point, ¯xed point.

131 Definition. A stationary point x0 is called (Lyapunov) stable for the system x0 = f (x) (or the system is called stable at x0) if, for any ε>0, there exists δ>0 with the following property: for all s Ω such that s x0 <δ,thesolutionx (t, s)isdefined for all t>0 and 2 k ¡ k sup x (t, s) x0 <ε. (4.4) t [0,+ ) k ¡ k ∈ ∞

In other words, the means that if x (0) is close enough to x0 then the solution x (t)isdefined for all t>0and

x (0) B (x ,δ)= x (t) B (x ,ε)forallt>0. 2 0 ) 2 0 If we replace in (4.4) the interval [0, + ) by any bounded interval [0,T] then, by the continuity of x (t, s), 1

sup x (t, s) x0 =sup x (t, s) x (t, x0) 0ass x0. t [0,T ] k ¡ k t [0,T ] k ¡ k! ! ∈ ∈ Hence, the main issue for the stability is the behavior of the solutions as t + . ! 1 Definition. A stationary point x0 is called asymptotically stable for the system x0 = f (x) (or the system is called asymptotically stable at x0), if it is Lyapunov stable and, in addition, x (t, s) x 0ast + , k ¡ 0k! ! 1 provided s x is small enough. k ¡ 0k Observe, the stability and asymptotic stability do not depend on the choice of the norm in Rn because all norms in Rn are equivalent.

4.2 Stability for a linear system

n Consider a linear system x0 = Ax in R where A is a constant operator. Clearly, x =0is a stationary point.

Theorem 4.1 If for all complex eigenvalues λ of A, we have Re λ<0 then 0 is asymp- totically stable for the system x0 = Ax. If, for some eigenvalue λ of A, Re λ 0 then 0 is not asymptotically stable. If, for some eigenvalue λ of A, Re λ>0 then 0 is¸ unstable.

Proof. By Corollary to Theorem 2.20, the general complex solution of the system x = Ax has the form 0 n λkt x (t)= Cke Pk (t) , (4.5) k=1 X where Ck are arbitrary complex constants, λ1, ..., λn are all the eigenvalues of A listed with the algebraic multiplicity, and Pk (t) are some vector valued polynomials of t.The l 1 latter means that Pk (t)=u1 + u2t + ... + ult − for some l N and for some vectors 2 u1, ..., ul.

132 It follows from (4.5) that, for all t 0, ¸ n x (t) C eλkt P (t) (4.6) k k· k k k k k=1 X ¯ ¯ ¯ ¯ n (Re λk)t max Ck e Pk (t) · k j j k k k=1 n X αt max Ck e Pk (t) · k j j k k k=1 X where α =maxRe λk < 0. k Observe that the polynomials admits the estimates of the type

N Pk (t) c 1+t k k· for all t>0 and for some large enough constants¡ c and¢ N. On the other hand, since

n

x (0) = CkPk (0) , k=1 X we see that the coefficients Ck are the components of the initial value x (0) in the basis n Pk (0) k=1.Therefore,maxCk is nothing other but the -norm x (0) in that basis. f g j j 1 k k∞ Replacing this norm by the current norm in Rn at expense of an additional constant multiple, we obtain from (4.6) that

x (t) C x (0) eαt 1+tN (4.7) k k· k k for some constant C and all t 0. ¡ ¢ Since the function eαt 1+¸tN is bounded on [0, + ), we obtain that, for all t 0, 1 ¸ ¡ ¢ x (t) K x (0) , k k· k k αt N where K = C supt 0 e 1+t , whence it follows that the stationary point 0 is Lyapunov stable. Moreover,≥ since ¡ eαt¢ 1+tN 0ast + , ! ! 1 we conclude from (4.7) that x¡(t) ¢ 0ast , that is, the stationary point 0 is asymptotically stable. k k! !1 Assume now that Re λ 0 for some eigenvalue λ and prove that the stationary point 0 is not asymptotically stable.¸ For that it suffices to construct a real solution x (t)of the system x0 = Ax such that x (t) 0ast + . Indeed, for any real c =0,the solution cx (t) has the initial valuek cxk (0),6! which! can1 be made arbitrarily small provided6 c is small enough, whereas cx (t) does not go to 0 as t + , which implies that there is no asymptotic stability. To construct such a solution!x (t),1fix an eigenvector v of the eigenvalue λ with Re λ 0. Then we have a particular solution ¸ x (t)=eλtv (4.8)

133 for which x (t) = eλt v = et Re λ v v . (4.9) k k k k k k¸k k Hence, x (t) does not tend to 0 as t¯ ¯ .Ifx (t) is a real solution then this completes the proof of the asymptotic instability.¯ !1¯ If x (t) is a complex solution then both Re x (t) and Im x (t) are real solutions and one of them must not go to 0 as t . Finally, assume that Re λ>0 for some eigenvalue λ and prove that!1 0 is unstable. It suffices to prove the existence of a real solution x (t)suchthat x (t) is unbounded. For the same solution (4.8), we see from (4.9) that x (t) ask t k + , which settles the claim in the case when x (t) is real-valued.k For ak!1 complex-valued! 1x (t), one of the solutions Re x (t)orImx (t) is unbounded, which finishes the proof. Note that Theorem 4.1 does not answer the question whether 0 is Lyapunov stable or not in the case when Re λ 0 for all eigenvalues λ but there is an eigenvalue λ with Re λ = 0. Although we know that· 0 is not asymptotically stable in his case, it can actually be either Lyapunov stable or unstable, as we will see below.
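In practice, the criterion of Theorem 4.1 is applied by computing the eigenvalues numerically and inspecting the largest real part. A small sketch (illustrative only; the matrix is the one from the example in Section 4.3 below):

    # The spectral criterion of Theorem 4.1 for a sample matrix.
    import numpy as np

    A = np.array([[-2.0, -1.0],
                  [ 3.0, -4.0]])
    alpha = max(np.linalg.eigvals(A).real)
    if alpha < 0:
        print("0 is asymptotically stable for x' = Ax (max Re lambda =", alpha, ")")
    elif alpha > 0:
        print("0 is unstable (max Re lambda =", alpha, ")")
    else:
        print("max Re lambda = 0: Theorem 4.1 gives no conclusion about Lyapunov stability")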

134 Lecture 22 30.06.2009 Prof. A. Grigorian, ODE, SS 2009 Consider in details the case n =2wherewegiveafullclassification of the stability cases and a detailed description of the phase diagrams. Consider a linear system x0 = Ax 2 2 in R where A is a constant linear operator in R .Letb = b1,b2 be the Jordan basis of A so that Ab has the Jordan normal form. Consider firstf the caseg when the Jordan normal form of A has two Jordan cells, that is,

λ 0 Ab = 1 . 0 λ2 µ ¶

Then b1 and b2 are the eigenvectors of the eigenvalues λ1 and λ2, respectively, and the general solution is λ1t λ2t x (t)=C1e b1 + C2e b2. In other words, in the basis b,

λ1t λ2t x (t)= C1e ,C2e and x (0) = (C1,C2). It follows that ¡ ¢

λ1t λ2t Re λ1t Re λ2t αt x (t) =max C1e , C2e =max C1 e , C2 e x (0) e k k∞ j j j j ·k k∞ ¡¯ ¯ ¯ ¯¢ ¡ ¢ where ¯ ¯ ¯ ¯ α =max(Reλ1, Re λ2) . If α 0then · x (t) x (0) k k∞ ·k k which implies the Lyapunov stability. As we know from Theorem 4.1, if α>0 then the stationary point 0 is unstable. Hence, in this particular situation, the Lyapunov stability is equivalent to α 0. · Let us construct the phase diagrams of the system x0 = Ax under the above assump- tions. Case λ1,λ2 are real. Let x (t)andx (t) be the components of the solution x (t) in the basis b ,b . Then 1 2 f 1 2g

λ1t λ2t x1 = C1e and x2 = C2e .

Assuming that λ ,λ = 0, we obtain the relation between x and x as follows: 1 2 6 1 2 x = C x γ , 2 j 1j where γ = λ2/λ1. Hence, the phase diagram consists of all curves of this type as well as of the half-axis x1 > 0,x1 < 0,x2 > 0,x2 < 0. If γ > 0 (that is, λ1 and λ2 are of the same sign) then the phase diagram (or the stationary point) is called a node. One distinguishes a stable node when λ1,λ2 < 0and unstable node when λ1,λ2 > 0. Here is a node with γ > 1:

[Phase diagram: a node with γ > 1 in the coordinates (x_1, x_2).]

and here is a node with γ = 1:

[Phase diagram: a node with γ = 1.]

If one or both of λ1, λ2 is 0 then we have a degenerate phase diagram (horizontal or vertical straight lines or just dots). If γ < 0 (that is, λ1 and λ2 are of different signs) then the phase diagram is called a saddle:

[Phase diagram: a saddle in the coordinates (x_1, x_2).]

136 Of course, the saddle is always unstable. Case λ and λ are complex, say λ = α iβ and λ = α + iβ with β =0. 1 2 1 ¡ 2 6 Note that b1 is the eigenvector of λ1 and b2 = b1 is the eigenvector of λ2.Letu =Reb1 and v =Imb1 so that b1 = u + iv and b2 = u iv. Since the eigenvectors b1 and b2 form 2 ¡ b1+b2 b1 b2 2 a basis in C , it follows that also the vectors u = 2 and v = 2−i form a basis in C ; consequently, the couple u, v is a basis in R2.Thegeneralrealsolutionis

(α iβ)t (α iβ)t x (t)=C1 Re e − b1 + C2 Im e − b1 = C eαt Re (cos βt i sin βt)(u + iv)+C eαt Im (cos βt i sin βt)(u + iv) 1 ¡ 2 ¡ = eαt (C cos βt C sin βt) u + eαt (C sin βt + C cos βt) v 1 ¡ 2 1 2 = Ceαt cos (βt + ψ) u + Ceαt sin (βt + ψ) v

2 2 where C = C1 + C2 and C1 C2 p cos ψ = , sin ψ = . C C Hence, in the basis (u, v), the solution x (t)isasfollows:

eαt cos (βt + ψ) x (t)=C . eαt sin (βt + ψ) µ ¶ If (r, θ) are the polar coordinates on the plane in the basis (u, v), then the polar coordinates of the solution x (t)are r (t)=Ceαt and θ (t)=βt + ψ. If α = 0 then these equations define a logarithmic spiral, and the phase diagram is called a focus6 or a spiral:

[Phase diagram: a focus (spiral) in the coordinates (x_1, x_2).]

The focus is stable if α < 0 and unstable if α > 0. If α = 0 (that is, both eigenvalues λ1 and λ2 are purely imaginary), then r (t) = C, that is, we get a family of concentric circles around 0, and this phase diagram is called a center:

[Phase diagram: a center, concentric circles around the origin in the coordinates (x_1, x_2).]

In this case, the stationary point is stable but not asymptotically stable. Consider now the case when the Jordan normal form of A has only one Jordan cell, that is, λ 1 Ab = . 0 λ µ ¶ In this case, λ must be real because if λ is an imaginary root of a characteristic polynomial then λ must also be a root, which is not possible since λ does not occur on the diagonal of Ab. By Theorem 2.20, the general solution is

λt λt x (t)=C1e b1 + C2e (b1t + b2) λt λt =(C1 + C2t) e b1 + C2e b2 whence x (0) = C1b1 + C2b2. That is, in the basis b,wecanwritex (0) = (C1,C2)and

λt λt x (t)= e (C1 + C2t) ,e C2 (4.10) whence ¡ ¢ x (t) = eλt C + C t + eλt C . k k1 j 1 2 j j 2j If λ<0 then we obtain again the asymptotic stability (which follows also from Theorem 4.1), whereas in the case λ 0 the stationary point 0 is unstable. Indeed, taking C =0 ¸ 1 and C2 = 1, we obtain a particular solution with the norm x (t) = eλt (1 + t) , k k1 which is unbounded, whenever λ 0. ¸ If λ = 0 then it follows from (4.10) that the components x1,x2 of x are related as follows: 6 x C 1 x 1 = 1 + t and t = ln 2 x2 C2 λ C2 whence x ln x x = Cx + 2 j 2j, 1 2 λ

where C = C1/C2 − (ln |C2|) /λ. Here is the phase diagram in this case:

[Phase diagram: a node corresponding to a single Jordan cell, in the coordinates (x_1, x_2).]

This phase diagram is also called a node. It is stable if λ<0 and unstable if λ>0. If λ = 0 then we obtain a degenerate phase diagram - parallel straight lines. Hence, the main types of the phases diagrams are:

a node (λ ,λ are real, non-zero and of the same sign); ² 1 2 a saddle (λ ,λ are real, non-zero and of opposite signs); ² 1 2

a focus or spiral (λ1,λ2 are imaginary and Re λ =0); ² 6 a center (λ ,λ are purely imaginary). ² 1 2 Otherwise, the phase diagram consists of parallel straight lines or just dots, and is referred to as degenerate. To summarize the stability investigation, let us emphasize that in the case max Re λ = 0 both stability and instability can happen, depending on the structure of the Jordan normal form.
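This classification is easy to automate: given a 2 × 2 matrix A, the type of the phase diagram of x′ = Ax is read off from the eigenvalues. The helper below is an illustrative sketch (degenerate cases are only flagged, not subdivided):

    # Classify the phase diagram of x' = Ax for a 2x2 matrix A by its eigenvalues.
    import numpy as np

    def classify(A, tol=1e-12):
        l1, l2 = np.linalg.eigvals(A)
        if abs(l1.imag) > tol:                       # complex conjugate pair
            return "center" if abs(l1.real) < tol else "focus (spiral)"
        l1, l2 = l1.real, l2.real
        if abs(l1) < tol or abs(l2) < tol:
            return "degenerate"
        return "node" if l1 * l2 > 0 else "saddle"

    print(classify(np.array([[ 0.0, 1.0], [-1.0,  0.0]])))   # center
    print(classify(np.array([[-1.0, 0.0], [ 0.0, -2.0]])))   # node
    print(classify(np.array([[-1.0, 0.0], [ 0.0,  1.0]])))   # saddle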

4.3 Lyapunov’s theorems

n Consider again an autonomous ODE x0 = f (x)wheref : Ω R is continuously n ! differentiable and Ω is an open set in R .Letx0 be a stationary point of the system x0 = f (x), that is, f (x0) = 0. We investigate the stability of the stationary point x0.

Theorem 4.2 (Lyapunov’s theorem) Assume that f C2 (Ω) and set A = f (x ) (that 2 x 0 is, A is the Jacobian matrix of f at x0). (a) If Re λ<0 for all eigenvalues λ of A then the stationary point x0 is asymptotically stableforthesystemx0 = f (x). (b) If Re λ>0 for some eigenvalue λ of A then the stationary point x0 is unstable for the system x0 = f (x).

139 The proof of part (b) is somewhat lengthy and will not be presented here. Example. Consider the system

x+y x0 = p4+4y 2e ¡ y0 =sin3x +ln(1 4y) . ½ ¡ It is easy to see that the right hand side vanishes at (0, 0) so that (0, 0) is a stationary point. Setting p4+4y 2ex+y f (x, y)= , sin 3x +ln(1¡ 4y) µ ¡ ¶ we obtain ∂xf1 ∂yf1 2 1 A = fx (0, 0) = = ¡ ¡ . ∂xf2 ∂yf2 3 4 µ ¶ µ ¡ ¶ Another way to obtain this matrix is to expand each component of f (x; y) by the Taylor formula: y f (x; y)=21+y 2ex+y =2 1+ + o (x) 2(1+(x + y)+o ( x + y )) 1 ¡ 2 ¡ j j j j = p2x y + o ( x + y )³ ´ ¡ ¡ j j j j and

f (x; y)=sin3x +ln(1 4y)=3x + o (x) 4y + o (y) 2 ¡ ¡ =3x 4y + o ( x + y ) : ¡ j j j j Hence, 2 1 x f (x; y)= ¡ ¡ + o ( x + y ) ; 3 4 y j j j j µ ¡ ¶µ ¶ whenceweobtainthesamematrixA. The characteristic polynomial of A is

2 λ 1 det = λ2 +6λ +11, ¡ 3¡ 4¡ λ µ ¡ ¡ ¶ and the eigenvalues are λ1,2 = 3 ip2. Hence, Re λ<0forallλ,whenceweconclude that 0 is asymptotically stable.¡ § Coming back to the setting of Theorem 4.2, represent the function f (x)intheform

f (x)=A (x x )+ϕ (x) ¡ 0 where ϕ (x)=o ( x x0 )asx x0. Consider a new unknown function X (t)=x (t) x0 that obviously satisk fi¡es thenk the! ODE ¡

X0 = AX + o ( X ) . k k

By dropping out the term o ( X ), we obtain the linearized equation X0 = AX.Fre- quently, the matrix A can be foundk k directly by the linearization rather than by evaluating fx (x0). The stability of the stationary point 0 of the linear ODE X0 = AX is closely related (but not identical) to the stability of the stationary point x0 of the non-linear ODE x0 = f (x). The hypotheses that Re ¸<0 for all the eigenvalues ¸ of A yields by Theorem 4.2 that x0 is asymptotically stable for the non-linear system x0 = f (x), and by Theorem 4.1 that 0 is asymptotically stable for the linear system

140 X0 = AX. Similarly, under the hypothesis Re ¸>0, x0 is unstable for x0 = f (x) and 0 is unstable for

X0 = AX.However,inthecasemaxRe¸ = 0, the stability types for ODEs x0 = f (x)andX0 = AX are not necessarily identical. Example. Consider the system

x = y + xy, 0 (4.11) y0 = x xy. ½ ¡ ¡ Solving the equations y + xy =0 x + xy =0 ½ we obtain the two stationary points (0, 0) and ( 1, 1). Let us linearize the system at the stationary point ( 1, 1). Setting X = x +1and¡ ¡Y = y + 1, we obtain the system ¡ ¡

X0 =(Y 1) X = X + XY = X + o ( (X, Y ) ) ¡ ¡ ¡ k k (4.12) Y 0 = (X 1) Y = Y XY = Y + o ( (X, Y ) ) ½ ¡ ¡ ¡ k k whose linearization is X0 = X ¡ Y 0 = Y. ½ Hence, the matrix is 10 A = , ¡01 µ ¶ and the eigenvalues are 1 and +1 so that the type of the stationary point is a saddle. Hence, the system (4.11)¡ is unstable at ( 1, 1) because one of the eigenvalues is positive. Consider now the stationary point (0¡, 0).¡ Near this point, the system (4.11) can be written in the form x0 = y + o ( (x, y) ) k k y0 = x + o ( (x, y) ) ½ ¡ k k so that the linearized system is x0 = y, y0 = x. ½ ¡ Hence, the matrix is 01 A = , 10 µ ¡ ¶ and the eigenvalues are i. Since they are purely imaginary, the type of the stationary point (0, 0) for the linearized§ system is a center. Hence, the linearized system is stable at (0, 0) but not asymptotically stable. For the non-linear system (4.11), no conclusion can be drawn from the eigenvalues. In this case, one can use the fact that the phase trajectories for the system (4.11) can be explicitly described by the equation: x ln x +1 + y ln y +1 = C: ¡ j j ¡ j j (cf. (4.3)). It follows that the phase trajectories near (0; 0)areclosedcurves(seetheplotbelow)and, hence, the stationary point (0; 0) of the system (4.11) is Lyapunov stable but not asymptotically stable.

[Plot: closed phase trajectories of the system (4.11) around the stationary point (0, 0).]

142 Lecture 23 6.07.2009 Prof. A. Grigorian, ODE, SS 2009 The main tool for the proof of Theorem 4.2 is the second Lyapunov theorem, that is of its own interest. Recall that for any vector u Rn and a differentiable function V in n 2 a domain in R , the directional derivative ∂uV can be determined by

n ∂V ∂uV (x)=Vx (x) u = (x) uk. ∂xk k=1 X

Theorem 4.3 (Lyapunov’s second theorem) Consider the system x0 = f (x) where f 1 2 C (Ω) and let x0 be a stationary point of it. Let U be an open subset of Ω containing x0, and V (x) be a C1 function on U such that V (x )=0and V (x) > 0 for any x U x . 0 2 nf 0g (a) If, for all x U, 2 ∂ V (x) 0, (4.13) f(x) · then the stationary point x0 is stable.

(b) Let W (x) be a continuous function on U such that W (x) > 0 for x U x0 .If, for all x U, 2 nf g 2 ∂ V (x) W (x) , (4.14) f(x) ·¡ then the stationary point x0 is asymptotically stable. (c) If, for all x U, 2 ∂ V (x) W (x) , (4.15) f(x) ¸ where W is as above then the stationary point x0 is unstable.

A function V from the statement is called the . Note that the vector field f (x) in the expression ∂f(x)V (x)dependsonx.Bydefinition, we have

n ∂V ∂f(x)V (x)= (x) fk (x) . ∂xk k=1 X In this context, ∂f V is also called the orbital derivative of V with respect to the ODE x0 = f (x). There are no general methods for constructing Lyapunov functions. Let us consider some examples. n Example. Consider the system x0 = Ax where A (R ). In order to investigate the stability of the stationary point 0, consider the function2L

n V (x)= x 2 = x2, k k2 k k=1 X n which is positive in R 0 and vanishes at 0. Setting f (x)=Ax and A =(Akj), we obtain for the componentsnf g n

fk (x)= Akjxj. j=1 X

143 ∂V Since =2xk, it follows that ∂xk

n ∂V n ∂f V = fk =2 Akjxjxk. ∂xk k=1 j,k=1 X X Recall that the matrix A is called a non-positive definite if

n n Akjxjxk 0forallx R . · 2 j,k=1 X

Hence, if A is non-positive definite, then ∂f V 0; by Theorem 4.3(a), we conclude that 0 is Lyapunov stable. The matrix A is called negative· definite if

n n Akjxjxk < 0forallx R 0 . 2 nf g j,k=1 X n In this case, set W (x)= 2 j,k=1 Akjxjxk so that ∂f V = W . Thenweconcludeby Theorem 4.3(b), that 0 is¡ asymptotically stable. Similarly,¡ if the matrix A is positive definite then 0 is unstable byP Theorem 4.3(c). For example, if A =diag(λ1,...,λn)whereλk are real, then A is non-positive definite if all λ 0, negative definite if all λ < 0, and positive definite if all λ > 0. k · k k Example. Consider the second order scalar ODE x00 +kx0 = F (x) that describes the one- dimensional movement of a particle under the external potential force F (x)andfriction with the coefficient k. This ODE can be written as a normal system
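In matrix form, the orbital derivative computed above is ∂f V = 2 Σ Akj xj xk = xᵀ(A + Aᵀ) x, so the three sign conditions are conditions on the eigenvalues of the symmetric part of A (a standard reformulation, used here as an assumption). A numerical sketch with illustrative matrices:

    # For V(x) = |x|_2^2 and x' = Ax, the orbital derivative is x.(A + A^T)x,
    # so its sign is governed by the eigenvalues of the symmetric part of A.
    import numpy as np

    def orbital_derivative_sign(A):
        eig = np.linalg.eigvalsh((A + A.T) / 2.0)    # real eigenvalues of the symmetric part
        if eig.max() < 0:
            return "negative definite: 0 is asymptotically stable (Theorem 4.3(b))"
        if eig.max() <= 0:
            return "non-positive definite: 0 is Lyapunov stable (Theorem 4.3(a))"
        if eig.min() > 0:
            return "positive definite: 0 is unstable (Theorem 4.3(c))"
        return "indefinite: this particular Lyapunov function gives no conclusion"

    print(orbital_derivative_sign(np.array([[-1.0, 0.0], [0.0, -3.0]])))
    print(orbital_derivative_sign(np.array([[ 0.0, 1.0], [-1.0, 0.0]])))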

x0 = y y0 = ky + F (x) . ½ ¡ Note that the phase space is R2 (assuming that F is defined for all x R)andapoint (x, y) in the phase space is a couple position-velocity. 2 Assume F (0)=0sothat(0, 0) is a stationary point. We would like to decide if (0, 0) is stable or not. The Lyapunov function can be constructed in this case as the full energy y2 V (x, y)= + U (x) , 2 where U (x)= F (x) dx ¡ Z y2 is the potential energy and 2 is the kinetic energy. More precisely, assume that k 0 and ¸ F (x) < 0forx>0,F(x) > 0forx<0, (4.16) and set x U (x)= F (s) ds, ¡ Z0 so that U (0) = 0 and U (x) > 0forx = 0. Then the function V (x, y)vanishesat(0, 0) and is positive away from (0, 0). 6

144 Setting f (x, y)=(y, ky + F (x)) , ¡ compute the orbital derivative ∂f V : ∂ V = V y + V ( ky + F (x)) f x y ¡ = U 0 (x) y + y ( ky + F (x)) ¡ = F (x) y ky2 + yF (x) ¡ ¡ = ky2 0. ¡ · Hence, V is indeed the Lyapunov function, and by Theorem 4.3 the stationary point (0, 0) is Lyapunov stable. Physically the condition (4.16) means that the force always acts in the direction of the origin thus forcing the displaced particle to the origin, which causes the stability. For example, the following functions satisfy (4.16): F (x)= x and F (x)= x3. ¡ ¡ The corresponding Lyapunov functions are x2 y2 x4 y2 V (x, y)= + and V (x, y)= + , 2 2 4 2 respectively. Example. Consider a system x0 = y x ¡3 (4.17) y0 = x . ½ ¡ x4 y2 The function V (x, y)= 4 + 2 is positive everywhere except for the origin. The orbital derivative of this function with respect to the given ODE is

∂f V = Vxf1 + Vyf2 = x3 (y x) yx3 ¡ ¡ = x4 0. ¡ · Hence, by Theorem 4.3(a), (0, 0) is Lyapunov stable for the system (4.17). Using a more subtle Lyapunov function V (x; y)=(x y)2 + y2,onecanshowthat(0; 0) is, in ¡ 11 fact, asymptotically stable for the system (4.17). Since the matrix of the linearized system ¡00 (4.17) has the eigenvalues 0 and 1, the stability of (0; 0) for the systemµ (4.17)¶ cannot be deduced from ¡ Theorem 4.2. Observe that, by Theorem 4.1, the stationary point (0; 0) is not asymptotically stable for the linearized system. However, it is still stable since the matrix is diagonalizable (cf. Section 4.2).
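The orbital derivative computation for (4.17) is easy to double-check symbolically. The sketch below (illustrative, using sympy) reproduces ∂f V = −x⁴ for V = x⁴/4 + y²/2 and also prints the orbital derivative of the finer function (x − y)² + y² mentioned above, without asserting anything beyond the printed expressions:

    # Symbolic check of the orbital derivatives for the system (4.17).
    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    f1, f2 = y - x, -x**3                    # right hand side of (4.17)

    V = x**4/4 + y**2/2
    print(sp.simplify(sp.diff(V, x)*f1 + sp.diff(V, y)*f2))   # expect -x**4

    W = (x - y)**2 + y**2                    # the finer Lyapunov function mentioned above
    print(sp.expand(sp.diff(W, x)*f1 + sp.diff(W, y)*f2))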

Example. Consider again the system (4.11), that is,

x0 = y + xy y0 = x xy ½ ¡ ¡ and the stationary point (0, 0). As it was shown above, the phase trajectories of this system satisfy the following equation: x ln x +1 + y ln y +1 = C. ¡ j j ¡ j j 145 Thismotivatesustoconsiderthefunction V (x, y)=x ln x +1 + y ln y +1 ¡ j j ¡ j j in a neighborhood of (0, 0). It vanishes at (0, 0) and is positive away from (0, 0). Its orbital derivative is

∂f V = Vxf1 + Vyf2 1 1 = 1 (y + xy)+ 1 ( x xy) ¡ x +1 ¡ y +1 ¡ ¡ µ ¶ µ ¶ = xy xy =0. ¡ By Theorem 4.3(a), the point (0, 0) is Lyapunov stable. Proof of Theorem 4.3. (a) The main idea is that, as long as a solution x (t) remains in the domain U of V ,wehavebythechainrule d V (x (t)) = V 0 (x) x0 (t)=V 0 (x) f (x)=∂ V (x) 0. (4.18) dt f(x) · It follows that the function V is decreasing along any solution x (t). Hence, if x (0) is close to x0 then V (x (0)) must be small, whence it follows that V (x (t)) is also small for all t>0, so that x (t)isclosetox0. To implement this idea, we first shrink U so that U is bounded and V (x)isdefined on the closure U.Set n Bε = B (x0,ε)= x R : x x0 <ε . f 2 k ¡ k g Let ε>0besosmallthatB U.Foranysuchε,set ε ½ m (ε)= inf V (x) . x U Bε ∈ \

Since V is continuous and U Bε is a compact set (bounded and closed), by the minimal value theorem, the infimum ofn V is taken at some point. Since V is positive away from 0, we obtain m (ε) > 0.

[Diagram: the balls B (x0, δ) ⊂ B (x0, ε) ⊂ U ⊂ Ω and a trajectory x (t) starting at x (0) ∈ B (x0, δ); on U \ B (x0, ε) the function V is bounded below by m (ε).]

146 It follows from the definition of m (ε)that

V (x) m (ε)forallx U Bε. (4.19) ¸ 2 n

Since V (x0)=0,foranygivenε>0thereisδ>0sosmallthat V (x) 0andthatx (t) B£for all t>0, 2 δ 2 ε which will prove the Lyapunov stability of the stationary point x0.Sincex (0) Bδ,we have by (4.20) that 2 V (x (0))

V (x (t)) 0,

61 as long as x (t)isdefined . It follows from (4.19) that x (t) Bε. We are left to verify that x (t)isdefined for all t>0.2 Indeed, assume that x (t)is defined only for t

c =limV (x (t)) t + → ∞ exists. Assume from the contrary that c>0. By the continuity of V ,thereisr>0such that V (x) 0, it follows that x (t) / B for all t>0. Set ¸ 2 r m =infW (z) > 0. z U Br ∈ \ It follows that W (x (t)) m for all t>0 whence ¸ d V (x (t)) W (x (t)) m dt ·¡ ·¡ 61Since x (t) has been de¯ned as the maximal solution in the domain R U,thepointx (t)isalways contained in U as long as it is de¯ned. £

147 for all t>0. However, this implies upon integration in t that

V (x (t)) V (x (0)) mt, · ¡ whence it follows that V (x (t)) < 0 for large enough t. This contradiction finishes the proof. (c) Assume from the contrary that the stationary point x0 is stable, that is, for any ε>0thereisδ>0suchthatx (0) Bδ implies that x (t)isdefined for all t>0and x (t) B .Letε be so small that B2 U.Chosex (0) to be any point in B x . 2 ε ε ½ δ nf 0g Since x (t) Bε for all t>0, we have also x (t) U for all t>0. It follows from the hypothesis (4.15)2 that 2 d V (x (t)) W (x (t)) 0 dt ¸ ¸ so that the function V (x (t)) is monotone increasing. Then, for all t>0,

V (x (t)) ≥ V (x (0)) =: c > 0.

Let r > 0 be so small that V (x) < c for all x ∈ B_r = B (x0, r) (such r exists since V (x0) = 0 and V is continuous).

[Diagram: the trajectory x (t) starting at x (0) ∈ B (x0, δ) stays outside the ball B_r, on which V < c.]

It follows that x (t) / B for all t>0. Setting as before 2 r m =infW (z) > 0, z U Br ∈ \ we obtain that W (x (t)) m whence ¸ d V (x (t)) m for all t>0. dt ¸ It follows that V (x (t)) mt + as t + , which is impossible since V is bounded in U. ¸ ! 1 ! 1 Proof of Theorem 4.2. Without loss of generality, set x0 =0sothatf (0) = 0. By the differentiability of f (x), we have

f (x)=Ax + h (x) (4.21)

148 where h (x)=o ( x )asx .Usingthatf C2, let us prove that, in fact, k k !1 2 h (x) C x 2 (4.22) k k· k k provided x is small enough. Applying the Taylor formula to any component fk of f, we write k k n 1 n f (x)= ∂ f (0) x + ∂ f (0) x x + o x 2 as x 0. k i k i 2 ij k i j k k ! i=1 i,j=1 X X ¡ ¢ The first term on the right hand side is the k-th component of Ax, whereas the rest is the k-th component of h (x), that is,

1 n h (x)= ∂ f (0) x x + o x 2 . k 2 ij k i j i,j=1 k k X ¡ ¢ Setting B =max ∂ f (0) ,weobtain i,j,k j ij k j n 2 2 2 hk (x) B xixj + o x = B x 1 + o x . j j· i,j=1 j j k k k k k k X ¡ ¢ ¡ ¢ Replacing x by x at expense of an additional constant factor and passing from the k k1 k k components hk to the vector-function h, we obtain (4.22). Consider the following function

∞ sA 2 V (x)= e x 2 ds, (4.23) Z0 ° ° and prove that V (x) is the Lyapunov function.° Firstly,° let us first verify that V (x)is finite for any x Rn, that is, the integral in (4.23) converges. Indeed, in the proof of Theorem 4.1 we2 have established the inequality

etAx Ceαt tN +1 x , (4.24) · k k ° ° ¡ ¢ where C, N are some positive numbers° ° (depending on A)and α =maxReλ, where max is taken over all eigenvalues λ of A. Since by hypothesis α<0, (4.24) implies that esAx decays exponentially as s + , whence the convergence of the integral in (4.23) follows. ! 1 ° ° 1 n n Secondly,° ° let us show that V (x)isoftheclassC (R )(infact,C∞ (R )). For that, n represent x in the canonical basis v1, ..., vn of R as n

x = xivi i=1 X and notice that n 2 2 x 2 = xi = x x. k k i=1 j j ¢ X 149 Therefore,

sA 2 sA sA sA sA e x 2 = e x e x = xi e vi xj e vj ¢ Ã i ! ¢ Ã j ! ° ° X X sA sA ¡ ¢ ¡ ¢ ° ° = xixj e vi e vj . i,j ¢ X ¡ ¢ Integrating in s,weobtain

V (x)= bijxixj, i,j X ∞ sA sA where bij = 0 e vi e vj ds are constants. This means that V (x) is a quadratic form that is obviously infinitely¢ many times differentiable in x. R ¡ ¢ Remark. Usually we work with any norm in Rn. In the definition (4.23) of V (x), we have specifically chosen the 2-norm to ensure the smoothness of V (x). Function V (x) is obviously non-negative and V (x) = 0 if and only if x =0.Inorder to complete the proof of the fact that V (x) is the Lyapunov function, we need to estimate ∂f(x)V (x). Noticing that by (4.21)

∂f(x)V (x)=∂AxV (x)+∂h(x)V (x) , (4.25)

n let us first evaluate ∂AxV (x)foranyx R . We claim that 2 d ∂ V (x)= V etAx . (4.26) Ax dt ¯t=0 ¡ ¢¯ ¯ Indeed, using the chain rule, we obtain ¯ d d V etAx = V etAx etAx dt x dt whence by Theorem 2.16 ¡ ¢ ¡ ¢ ¡ ¢ d V etAx = V etAx AetAx. dt x ¡ ¢ ¡ ¢ Setting here t = 0 and using that e0A =id,weobtain d V etAx = V (x) Ax = ∂ V (x) , dt x Ax ¯t=0 ¡ ¢¯ ¯ which proves (4.26). ¯ To evaluate the right hand side of (4.26), observe first that

tA ∞ sA tA 2 ∞ (s+t)A 2 ∞ τA 2 V e x = e e x 2 ds = e x 2 ds = e x 2 dτ, Z0 Z0 Zt ¡ ¢ ° ¡ ¢° ° ° ° ° wherewehavemadethechange° τ =°s + t. Therefore,° diff°erentiating this° identity° in t,we obtain d 2 V etAx = etAx . dt ¡ 2 ¡ ¢ ° ° 150 ° ° Setting t = 0 and combining with (4.26), we obtain

∂ V (x)= x 2 . Ax ¡k k2 The second term in (4.25) can be estimated as follows:

∂ V (x)=V (x) h (x) V (x) h (x) h(x) x ·k x kk k n (where Vx (x) is the operator norm of the linear operator Vx (x):R R; the latter is, k k ! in fact, an 1 n matrix that can also be identified with a vector in Rn). It follows that £ ∂ V (x)=∂ V (x)+∂ V (x) x 2 + V (x) h (x) . f(x) Ax h(x) ·¡k k2 k x kk k Using (4.22) and switching there to the 2-norm of x, we obtain that, in a small neighbor- hood of 0, 2 ∂ V (x) x + C Vx (x) x . f(x) ·¡k k2 k kk k2 ObservenextthatthefunctionV (x) has minimum at 0, which implies that Vx (0) = 0. Hence, for any ε>0, V (x) ε k x k· 1 provided x is small enough. Choosing ε = 2 C and combining together the above two lines, we obtaink k that, in a small neighborhood U of 0, 1 1 ∂ V (x) x 2 + x 2 = x 2 . f(x) ·¡k k2 2 k k2 ¡2 k k2

1 2 Setting W (x)= 2 x 2, we conclude by Theorem 4.3, that the ODE x0 = f (x)isasymp- totically stable atk 0. k
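The function (4.23) is the quadratic form V (x) = xᵀ B x with B = ∫₀^∞ e^{sAᵀ} e^{sA} ds, and this B satisfies the matrix equation Aᵀ B + B A = −I (a standard fact, consistent with the identity ∂Ax V (x) = −‖x‖₂² obtained above, though not stated in the notes). Numerically, B can therefore be obtained from a Lyapunov-equation solver; a sketch using scipy as an illustrative tool:

    # The quadratic form V(x) = x.B x with B = int_0^inf exp(sA^T) exp(sA) ds;
    # B solves A^T B + B A = -I (standard fact, taken here as an assumption).
    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    A = np.array([[-2.0, -1.0],
                  [ 3.0, -4.0]])                 # all eigenvalues have negative real part
    B = solve_continuous_lyapunov(A.T, -np.eye(2))

    print(np.allclose(A.T @ B + B @ A, -np.eye(2)))        # the matrix equation holds
    print(np.all(np.linalg.eigvalsh((B + B.T)/2) > 0))     # B is positive definite

    x = np.array([1.0, -0.5])
    print("V(x) =", x @ B @ x, "; orbital derivative along x' = Ax:", -x @ x)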

151 Lecture 24-25 07 and 13.07.2009 Prof. A. Grigorian, ODE, SS 2009

4.4 Zeros of solutions In this section, we consider a scalar linear second order ODE

x00 + p (t) x0 + q (t) x =0, (4.27) where p (t)andq (t) are continuous functions on some interval I R. We will be con- cerned with the structure of zeros of a solution x (t). ½ For example, the ODE x00 + x = 0 has solutions sin t and cos t that have infinitely many zeros, while a similar ODE x00 x = 0 has solutions sinh t and cosh t with at most 1 zero. One of the questions to be discussed¡ is how to determine or to estimate the number of zeros of solutions of (4.27). Let us start with the following simple observation.

Lemma 4.4 If x (t) is a solution to (4.27) on I that is not identical zero then, on any bounded closed interval J I, the function x (t) has at most finitely many distinct zeros. Moreover, every zero of x½(t) is simple.

Azerot of x (t)iscalledsimple if x0 (t ) =0andmultiple if x0 (t )=0.Thisdefinition 0 0 6 0 matches the notion of simple and multiple roots of polynomials. Note that if t0 is a simple zero then x (t) changes signed at t0. Proof. If t0 isamultiplezerothenthenx (t)solvestheIVP

x00 + px0 + qx =0 x (t )=0 , 8 0 < x0 (t0)=0 whence, by the uniqueness theorem,: we conclude that x (t) 0. ´ Let x (t)haveinfinitely many distinct zeros on J,sayx (t )=0where t ∞ is k f kgk=1 a sequence of distinct reals in J. Then, by the Weierstrass theorem, the sequence tk contains a convergent subsequence. Without loss of generality, we can assume that tf g k ! t J.Thenx (t ) = 0 but also x0 (t ) = 0, which follows from 0 2 0 0 x (tk) x (t0) x0 (t0)= lim ¡ =0. k tk t0 →∞ ¡ Hence, the zero t is multiple, whence x (t) 0. 0 ´ Theorem 4.5 (Theorem of Sturm) Consider two ODEs on an interval I R ½

x'' + p (t) x' + q1 (t) x = 0  and  y'' + p (t) y' + q2 (t) y = 0,

where p ∈ C¹ (I), q1, q2 ∈ C (I), and, for all t ∈ I,

q1 (t) ≤ q2 (t).

If x (t) is a non-zero solution of the first ODE and y (t) is a solution of the second ODE then between any two distinct zeros of x (t) there is a zero of y (t) (that is, if a < b are distinct zeros of x (t) then y (t) has a zero in [a, b]).

A mnemonic rule: the larger q (t) the more likely that a solution has zeros.

Example. Let q1 and q2 be positive constants and p = 0. Then the solutions are

x (t) = C1 sin (√q1 t + ϕ1)  and  y (t) = C2 sin (√q2 t + ϕ2).

Zeros of the function x (t) form an arithmetic sequence with the difference π/√q1, and zeros of y (t) form an arithmetic sequence with the difference π/√q2 ≤ π/√q1. Clearly, between any two terms of the first sequence there is a term of the second sequence.
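This interlacing is easy to observe numerically: locate the zeros of both solutions on an interval by sign changes and check that every gap between consecutive zeros of x contains a zero of y. A short sketch (not from the notes):

    # Sturm comparison for x'' + q1 x = 0 and y'' + q2 y = 0 with constants q1 <= q2.
    import numpy as np

    q1, q2 = 2.0, 5.0
    t = np.linspace(0.0, 20.0, 200001)
    x = np.sin(np.sqrt(q1)*t + 0.3)              # a solution of x'' + q1 x = 0
    y = np.sin(np.sqrt(q2)*t + 1.1)              # a solution of y'' + q2 y = 0

    def zeros(u):                                # approximate zeros via sign changes
        idx = np.where(np.sign(u[:-1]) * np.sign(u[1:]) < 0)[0]
        return t[idx]

    zx, zy = zeros(x), zeros(y)
    ok = all(np.any((zy > a) & (zy < b)) for a, b in zip(zx[:-1], zx[1:]))
    print("each gap between consecutive zeros of x contains a zero of y:", ok)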

Example. Let q1 (t) = q2 (t) = q (t) and let x and y be linearly independent solutions of the same ODE x'' + p (t) x' + q (t) x = 0. Then we claim that if a < b are two consecutive zeros of x (t) then y (t) has exactly one zero in (a, b). Indeed, by Theorem 4.5 (applied with q1 = q2 = q), y (t) has a zero in [a, b]. Let us show that y (a) ≠ 0 and y (b) ≠ 0. Assume from the contrary that, say, y (a) = 0. Then y solves the IVP

y00 + py0 + qy =0 y (a)=0 8 < y0 (a)=Cx0 (a)

y0(a) where C = (note that x0 (a) =: 0 by Lemma 4.4). Since Cx(t)solvesthesameIVP, x0(a) 6 we conclude by the uniqueness theorem that y (t) Cx(t). However, this contradicts to the hypothesis that x and y are linearly independent.´ Finally, let us show that y (t)has auniquerootin[a, b]. Indeed, if c

x00 + p (t) x0 + q (t) x =0

1 the change of unknown function u (t)=x (t)exp 2 p (t) dt , we transform it to the form

u00 + Q (t) u¡=0R ¢ where p2 p Q (t)=q 0 ¡ 4 ¡ 2 (hereweusethehypothesisthatp C1). Obviously, the zeros of x (t)andu (t) are the 2 same. Also, if q1 q2 then also Q1 Q2. Therefore, it suffices to consider the case p 0, which will be assumed· in the sequel.· ´ Since the set of zeros of x (t)onanyboundedclosedintervalisfinite, it suffices to show that function y (t) has a zero between any two consecutive zeros of x (t). Let a 0in(a, b). This implies that x0 (a) > 0andx0 (b) < 0. Indeed, x (t) > 0in(a, b) implies x (t) x (a) x0 (a)= lim ¡ 0. t a,t>a t a ¸ → ¡ 153 It follows that x0 (a) > 0becauseifx0 (a)=0thena is a multiple root, which is prohibited by Lemma 4.4. In the same way, x0 (b) < 0. If y (t)doesnotvanishin[a, b]thenwecan assume that y (t) > 0on[a, b]. Let us show that these assumptions lead to a contradiction. Multiplying the equation x00 + q1x =0byy,theequationy00 + q2y =0byx,and subtracting one from the other, we obtain

(x00 + q (t) x) y (y00 + q (t) y) x =0, 1 ¡ 2

x00y y00x =(q q ) xy, ¡ 2 ¡ 1 whence (x0y y0x)0 =(q q ) xy. ¡ 2 ¡ 1 Integrating the above identity from a to b and using x (a)=x (b) = 0, we obtain

b b x0 (b) y (b) x0 (a) y (a)=[x0y y0x] = (q (t) q (t)) x (t) y (t) dt. (4.28) ¡ ¡ a 2 ¡ 1 Za

Since q2 q1 on [a, b]andx (t)andy (t) are non-negative on [a, b], the integral in (4.28) is non-negative.¸ On the other hand, the left hand side of (4.28) is negative because y (a) and y (b) are positive whereas x0 (b)and x0 (a) are negative. This contradiction finishes the proof. ¡

Corollary. Under the conditions of Theorem 4.5, let q1 (t) 0in(a, b). If y has no zeros in (a, b) then we can assume that y (t) > 0 on (a, b) whence y (a) 0andy (b) 0. The integral in the right hand side of (4.28) is ¸ ¸ positive because q2 (t) >q1 (t)andx (t) ,y(t) are positive on (a, b), while the left hand side is non-positive because x0 (b) 0andx0 (a) 0. Consider the differential operator· ¸

d2 d L = + p (t) + q (t) (4.29) dt2 dt so that the ODE (4.27) can be shortly written as Lx = 0. Assume in the sequel that p C1 (I)andq C (I) for some interval I. 2 2 Definition. Any C2 function y satisfying Ly 0 is called a supersolution of the operator L (or of the ODE Lx =0). · Corollary. If L has a positive supersolution y (t) on an interval I then any non-zero solution x (t) of Lx =0hasatmostonezeroonI. Proof. Indeed, define function q (t) by the equation

y'' + p(t) y' + q̃(t) y = 0.

Comparing with

Ly = y'' + p(t) y' + q(t) y ≤ 0,

we conclude that q̃(t) ≥ q(t). Since x'' + p x' + q x = 0 and q ≤ q̃, we obtain by Theorem 4.5 that between any two distinct zeros of x(t) there must be a zero of y(t). Since y(t) has no zeros, x(t) cannot have two distinct zeros.

Example. If q(t) ≤ 0 on some interval I, then the function y(t) ≡ 1 is obviously a positive supersolution. Hence, any non-zero solution of x'' + q(t) x = 0 has at most one zero on I. It follows that, for any solution of the IVP

x'' + q(t) x = 0,   x(t_0) = 0,   x'(t_0) = a,

with q(t) ≤ 0 and a ≠ 0, we have x(t) ≠ 0 for all t ≠ t_0. In particular, if a > 0 then x(t) > 0 for all t > t_0.

Corollary. (The comparison principle) Assume that the operator L has a positive supersolution y on an interval [a, b]. If x_1(t) and x_2(t) are two C² functions on [a, b] such that Lx_1 = Lx_2 and x_1(t) ≤ x_2(t) for t = a and t = b, then x_1(t) ≤ x_2(t) holds for all t ∈ [a, b].

Proof. Setting x = x_2 − x_1, we obtain that Lx = 0 and x(t) ≥ 0 at t = a and t = b. That is, x(t) is a solution that has non-negative values at the endpoints a and b. We need to prove that x(t) ≥ 0 inside [a, b] as well. Indeed, assume that x(c) < 0 at some point c ∈ (a, b). Then, by the intermediate value theorem, x(t) has zeros on each of the intervals [a, c) and (c, b]. However, since L has a positive supersolution on [a, b], x(t) cannot have two zeros on [a, b] by the previous corollary.

Consider the following boundary value problem (BVP) for the operator (4.29):

Lx = f(t),   x(a) = α,   x(b) = β,

where f(t) is a given function on I, a, b are two given distinct points in I, and α, β are given reals. It follows from the comparison principle that if L has a positive supersolution on [a, b], then the solution to the BVP is unique. Indeed, if x_1 and x_2 are two solutions, then the comparison principle yields x_1 ≤ x_2 and x_2 ≤ x_1, whence x_1 ≡ x_2. The hypothesis that L has a positive supersolution is essential, since in general there is no uniqueness: the BVP x'' + x = 0 with x(0) = x(π) = 0 has a whole family of solutions x(t) = C sin t for any real C. Let us return to the study of the cases with “many” zeros.

Theorem 4.6 Consider the ODE x'' + q(t) x = 0, where q(t) ≥ a > 0 on [t_0, +∞). Then the zeros of any non-zero solution x(t) on [t_0, +∞) form a sequence {t_k}_{k=1}^∞ that can be numbered so that t_{k+1} > t_k, and t_k → +∞. Furthermore, if

lim_{t→+∞} q(t) = b

then

lim_{k→∞} (t_{k+1} − t_k) = π/√b.   (4.30)

Proof. By Lemma 4.4, the number of zeros of x(t) on any bounded interval [t_0, T] is finite, which implies that the set of zeros in [t_0, +∞) is at most countable and that all zeros can be numbered in increasing order.

Consider the ODE y'' + a y = 0, which has the solution y(t) = sin(√a t). By Theorem 4.5, x(t) has a zero between any two zeros of y(t), that is, in any interval [πk/√a, π(k+1)/√a] ⊂ [t_0, +∞). This implies that x(t) has infinitely many zeros in [t_0, +∞). Hence, the set of zeros of x(t) is countable and forms an increasing sequence {t_k}_{k=1}^∞. The fact that any bounded interval contains finitely many terms of this sequence implies that t_k → +∞.

Lecture 26   14.07.2009   Prof. A. Grigorian, ODE, SS 2009

To prove the second claim, fix some T > t_0 and set

m = m(T) = inf_{t ∈ [T,+∞)} q(t).

Consider the ODE y'' + m y = 0. Since m ≤ q(t) in [T, +∞), between any two zeros of y(t) in [T, +∞) there is a zero of x(t). Consider a zero t_k of x(t) that is contained in [T, +∞) and prove that

t_{k+1} − t_k ≤ π/√m.   (4.31)

Assume from the contrary that t_{k+1} − t_k > π/√m. Consider a solution

y(t) = sin(√m t + ϕ),

whose zeros form an arithmetic sequence {s_j} with difference π/√m, that is, for all j,

s_{j+1} − s_j = π/√m.

Since t_{k+1} − t_k > π/√m, the phase ϕ can be chosen so that, for some j,

[s_j, s_{j+1}] ⊂ (t_k, t_{k+1}).

However, this means that between the zeros s_j, s_{j+1} of y there is no zero of x. This contradiction proves (4.31).

If b = +∞ then, by letting T → ∞, we obtain m → ∞ and, hence, t_{k+1} − t_k → 0 as k → ∞, which proves (4.30) in this case.

Consider the case when b is finite. Then, setting

M = M(T) = sup_{t ∈ [T,+∞)} q(t),

we obtain in the same way that

t_{k+1} − t_k ≥ π/√M.

When T → ∞, both m(T) and M(T) tend to b, which implies that

t_{k+1} − t_k → π/√b.
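The asymptotic spacing (4.30) is easy to observe numerically. The following sketch (a Python illustration; the coefficient q(t) = 4 + 1/(1 + t²) and all numerical parameters are an arbitrary choice made here, not part of the lecture) integrates x'' + q(t) x = 0 with a standard solver and prints the gaps between consecutive zeros, which approach π/√4 = π/2:

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Illustration of Theorem 4.6: x'' + q(t) x = 0 with q(t) -> b = 4,
# so consecutive zeros should eventually be spaced by pi/sqrt(4) = pi/2.
def q(t):
    return 4.0 + 1.0 / (1.0 + t**2)

def rhs(t, y):
    x, v = y          # y = (x, x')
    return [v, -q(t) * x]

# Solve on [0, 60] with x(0) = 0, x'(0) = 1 and keep a dense interpolant.
sol = solve_ivp(rhs, (0.0, 60.0), [0.0, 1.0], max_step=0.05, dense_output=True)

# Locate zeros of x(t): detect sign changes on a fine grid,
# then refine each bracket with a root finder.
ts = np.linspace(0.0, 60.0, 6001)
xs = sol.sol(ts)[0]
zeros = [brentq(lambda t: sol.sol(t)[0], ts[i], ts[i + 1])
         for i in range(len(ts) - 1) if xs[i] * xs[i + 1] < 0.0]

gaps = np.diff(zeros)
print(gaps[-5:])     # the last few gaps
print(np.pi / 2)     # the limit predicted by (4.30)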

4.5 The Bessel equation

The Bessel equation is the ODE

t² x'' + t x' + (t² − α²) x = 0   (4.32)

where t > 0 is an independent variable, x = x(t) is the unknown function, and α ∈ ℝ is a given parameter⁶². The Bessel functions⁶³ are certain particular solutions of this equation. The value of α is called the order of the Bessel equation.
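As a quick numerical glimpse (this uses SciPy's built-in Bessel routines and is added only as an illustration), the first positive zeros of J_0 already show the spacing approaching π that Theorem 4.7 below establishes for arbitrary non-zero solutions:

import numpy as np
from scipy.special import jn_zeros

# First positive zeros of the Bessel function J_0 and their consecutive
# differences; the differences tend to pi (Theorem 4.7).
zeros = jn_zeros(0, 8)
print(zeros)            # 2.4048, 5.5201, 8.6537, ...
print(np.diff(zeros))   # 3.1153, 3.1336, 3.1378, ... -> pi
print(np.pi)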

Theorem 4.7 Let x(t) be a non-zero solution to the Bessel equation on (0, +∞). Then the zeros of x(t) form an infinite sequence {t_k}_{k=1}^∞ such that t_k < t_{k+1}, t_k → +∞, and t_{k+1} − t_k → π as k → ∞.

Proof. Dividing (4.32) by t², rewrite the Bessel equation in the form

x'' + (1/t) x' + (1 − α²/t²) x = 0,

which is an ODE of the form (4.27) with p(t) = 1/t and q(t) = 1 − α²/t². As in the proof of Theorem 4.5, the change of unknown function u(t) = x(t) exp((1/2) ∫ dt/t) = t^{1/2} x(t) brings it to the form u'' + Q(t) u = 0 with

Q(t) = q − p²/4 − p'/2 = 1 + (1/4 − α²)/t².   (4.34)

The zeros of u(t) and x(t) in (0, +∞) are the same. Since Q(t) → 1 as t → +∞, the hypotheses of Theorem 4.6 are satisfied on [T, +∞) for any sufficiently large T, whence the zeros of x(t) in [T, +∞) form an increasing sequence tending to +∞ whose consecutive differences tend to π/√1 = π. Thus Theorem 4.6 can be applied on such intervals [T, +∞), but cannot be applied to the interval (0, T] because the ODE in question is not defined at 0. Let us show that, for small enough τ > 0, the interval (0, τ) contains no zeros of x(t). Consider the following function on (0, τ):

z(t) = ln(1/t) − sin t,

⁶² In general, one can let α be a complex number as well, but here we restrict ourselves to the real case.
⁶³ The Bessel function of the first kind is defined by

J_α(t) = Σ_{m=0}^∞ [(−1)^m / (m! Γ(m + α + 1))] (t/2)^{2m+α}.

It is possible to prove that J_α(t) solves (4.32). If α is non-integer, then J_α and J_{−α} are linearly independent solutions to (4.32). If α = n is an integer, then the independent solutions are J_n and Y_n, where

Y_n(t) = lim_{α→n} [J_α(t) cos(απ) − J_{−α}(t)] / sin(απ)

is the Bessel function of the second kind.

which is positive in (0, τ) provided τ is small enough (clearly, z(t) → +∞ as t → 0). For this function we have

z' = −1/t − cos t   and   z'' = 1/t² + sin t,

whence

z'' + (1/t) z' + z = ln(1/t) − (cos t)/t.

Since (cos t)/t ~ 1/t and ln(1/t) = o(1/t) as t → 0, we see that the right hand side here is negative in (0, τ) provided τ is small enough. It follows that

z'' + (1/t) z' + (1 − α²/t²) z < 0,   (4.35)

so that z(t) is a positive supersolution of the Bessel equation in (0, τ). By the Corollary of Theorem 4.5, x(t) has at most one zero in (0, τ). By further reducing τ, we obtain that x(t) has no zeros on (0, τ), which finishes the proof.

Example. In the case α = 1/2 we obtain from (4.34) Q(t) ≡ 1, and the ODE for u(t) becomes u'' + u = 0. Using the solutions u(t) = cos t and u(t) = sin t and the relation x(t) = t^{−1/2} u(t), we obtain the independent solutions of the Bessel equation: x(t) = t^{−1/2} sin t and x(t) = t^{−1/2} cos t. Clearly, in this case we have exactly t_{k+1} − t_k = π. The functions t^{−1/2} sin t and t^{−1/2} cos t show the typical behavior of solutions to the Bessel equation: oscillations with decaying amplitude as t → ∞:

[Figure: graphs of t^{−1/2} sin t and t^{−1/2} cos t for 0 ≤ t ≤ 25, showing oscillations of decaying amplitude.]
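For instance, one can check directly that x(t) = t^{−1/2} sin t solves (4.32) with α = 1/2 (a small verification added for illustration): from

x' = −(1/2) t^{−3/2} sin t + t^{−1/2} cos t,
x'' = (3/4) t^{−5/2} sin t − t^{−3/2} cos t − t^{−1/2} sin t,

we get

t² x'' + t x' + (t² − 1/4) x = (3/4 − 1/2 − 1/4) t^{−1/2} sin t + (−1 + 1) t^{1/2} cos t + (−1 + 1) t^{3/2} sin t = 0.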

Remark. In (4.35) we have used that α² ≥ 0, which is the case for real α. For imaginary α one may have α² < 0, and the above argument does not work. In this case a solution to the Bessel equation can actually have a sequence of zeros accumulating at 0 (see Exercise 64).
