
A Geometric Approach to Linear Ordinary Differential Equations

R.C. Churchill
Hunter College and the Graduate Center of CUNY, and the University of Calgary

Address for correspondence: Department of Mathematics, Hunter College, 695 Park Avenue, New York, NY 10021, USA

October 19, 2006

Prepared for the Kolchin Seminar on Differential Algebra, Graduate Center, City University of New York, 13 October 2006. Amended and Corrected, 19 October 2006.

Abstract. We offer a formulation of linear ordinary differential equations midway between what one encounters in a first undergraduate ODE course and what one encounters in a graduate Differential Geometry course (in the latter instance under the heading of "connections"). Analogies with elementary linear algebra are emphasized; no familiarity with Differential Geometry is assumed.

Contents

Introduction
§1. Preliminaries on Derivations
§2. Differential Modules
§3. nth-Order Linear Differential Equations
§4. Dimension Considerations
§5. Fundamental Matrix Solutions
§6. Dual Structures and Adjoint Equations
§7. Cyclic Vectors
§8. Extensions of Differential Structures
§9. The Differential Galois Group
Bibliography

1 Introduction

Let V be a finite-dimensional vector space over a field K and let T : V → V be a K-linear mapping. In a first course in linear algebra one learns that T can be studied via matrices as follows.

• Select an ordered basis e = (e_1, e_2, . . . , e_n) of V and let B = (b_{ij}) be the e-matrix of T, i.e., the n × n matrix with entries b_{ij} ∈ K defined by

(i)   T e_j = \sum_{i=1}^n b_{ij} e_i ,   j = 1, 2, . . . , n.

• Let col_n(K) ≃ K^n denote the K-space of n × 1 column vectors of elements of K, i.e., n × 1 matrices, and let β_e : V → col_n(K) be the isomorphism defined by

(ii)   v = \sum_j k_j e_j ∈ V ↦ v_e := (k_1, k_2, . . . , k_n)^τ ∈ col_n(K),

τ denoting transposition.

• Let T_B : col_n(K) → col_n(K) denote left multiplication by B, i.e., the mapping x ∈ col_n(K) ↦ Bx ∈ col_n(K). Then T_B is K-linear and one has a commutative diagram

(iii)
               T
         V ---------> V
      β_e |           | β_e
          v    T_B    v
    col_n(K) ---> col_n(K)

The commutativity of the diagram suggests that anything one wants to know about T should somehow be reflected in the K-linear mapping TB or, better yet, in the matrix B alone. The hope is that information desired about T is more easily extracted from B.

Example : An ordered pair (λ, v) ∈ K × (V \ {0}) is an eigenpair for T if

(iv) T v = λv; one calls λ an eigenvalue of T , v an eigenvector, and the two entities are said to be “associated.”

We interpret (iv) geometrically: the effect of T on v is to "scale" this vector by a factor of λ. The idea is easy enough to comprehend, but suppose one actually needs to compute such pairs? In that case one switches from T to the matrix B: one learns that the eigenvalues of T are the roots of the characteristic polynomial

char(B) := det(xI − B)

of B, and once these have been determined one can begin searching for associated eigenvectors.
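Everything in this discussion can be checked by machine. The following sketch is our own illustration, not the author's, and assumes the third-party SymPy library; the matrices B and P are invented for the example. It computes char(B) and confirms that a similar matrix A = PBP⁻¹ shares the characteristic polynomial, determinant and trace.

```python
# Our own sketch (SymPy assumed): similar matrices share char, det and trace.
import sympy as sp

x = sp.symbols('x')
B = sp.Matrix([[2, 1], [0, 3]])           # a sample matrix representation
P = sp.Matrix([[1, 1], [1, 2]])           # any non-singular change of basis
A = P * B * P.inv()                       # a second matrix representation of T

char_B = (x * sp.eye(2) - B).det().expand()
char_A = (x * sp.eye(2) - A).det().expand()

assert char_A == char_B                   # char(T) is well defined
assert sp.simplify(A.det() - B.det()) == 0
assert sp.simplify(A.trace() - B.trace()) == 0
print(char_B)                             # x**2 - 5*x + 6, roots 2 and 3
```

The roots 2 and 3 of the printed polynomial are the eigenvalues of T, independently of the basis used to represent it.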

Of course the matrix B introduced above is not the unique "matrix representation" of T : a different choice of basis results in a different matrix, say A. One learns that A and B must be "similar," i.e., that A = PBP⁻¹ for some non-singular n × n matrix P, and that similar matrices have the same characteristic polynomial. This shows that the characteristic polynomial is something intrinsically associated with T, not simply with matrix representations of T. In particular, one can unambiguously define the characteristic polynomial char(T) of T by char(T) := char(B). Since the determinant (up to sign) and the negative of the trace of B appear as coefficients of this polynomial, these entities are also directly associated with T, not simply with matrix representations thereof. One can therefore define det(T) := det(B) and tr(T) := tr(B), again without ambiguity.

To summarize: there are entities intrinsically associated with T which can easily be computed from matrix representations of this operator.

In these notes we will view linear ordinary differential equations in an analogous manner. Specifically, we will explain how to think of a first-order system

      ẋ_1 + b_{11}x_1 + b_{12}x_2 + ··· + b_{1n}x_n = 0
      ẋ_2 + b_{21}x_1 + b_{22}x_2 + ··· + b_{2n}x_n = 0
(v)                      ⋮
      ẋ_n + b_{n1}x_1 + b_{n2}x_2 + ··· + b_{nn}x_n = 0

as a "basis representation" of a geometric entity which we call a "differential module." For our purposes the fundamental object intrinsically associated with such an entity is the "differential Galois group," a group traditionally associated with basis representations as in (v). We need some preliminaries on derivations.

1. Preliminaries on Derivations

In this section R is a (not necessarily commutative) ring with multiplicative identity 1, with 1 = 0 allowed (in which case R = 0).

An additive group endomorphism δ : r ∈ R ↦ r' ∈ R is a derivation if the product or Leibniz rule

(1.1)   (rs)' = rs' + r's

holds for all r, s ∈ R. One also writes r' as r^{(1)} and defines r^{(n)} := (r^{(n−1)})' for n ≥ 2. The notation r^{(0)} := r proves convenient.

The usual differentiation operators d/dz on the polynomial ring C[z] and on the quotient field C(z) are the basic examples of derivations. The second can be viewed as a derivation on the field M(P¹) of meromorphic functions on the Riemann sphere by means of the usual identification C(z) ≃ M(P¹). The same derivation on M(P¹) is described in terms of the usual coordinate t = 1/z at ∞ by −t² d/dt.

Another example of a derivation is provided by the zero mapping r ∈ R ↦ 0 ∈ R; this is the trivial derivation. For an example involving a non-commutative ring choose an integer n > 1, let R be the collection of n × n matrices with entries in a commutative ring A with a derivation a ↦ a', and for r = (a_{ij}) ∈ R define r' := (a_{ij}').

When r ↦ r' is a derivation on R one sees from (1.1) that 1' = (1 · 1)' = 1 · 1' + 1' · 1 = 1' + 1', and as a result that

(1.2)   1' = 0 .

When r ∈ R is a unit it then follows from 1 = rr−1 and (1.1) that

0 = (rr⁻¹)' = r · (r⁻¹)' + r' · r⁻¹,

whence

(1.3)   (r⁻¹)' = −r⁻¹ · r' · r⁻¹ .

This formula is particularly useful in the matrix example given above. When R is commutative it assumes the more familiar form

(1.4)   (r⁻¹)' = −r' r⁻² .
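Formula (1.3) is easy to test in the matrix example above. A minimal sketch of our own (SymPy assumed), with A = C(z) as the commutative coefficient ring and entrywise differentiation as the derivation; the particular unit r is made up:

```python
# Our own check of (1.3): (r^{-1})' = -r^{-1} r' r^{-1} for a matrix over C(z).
import sympy as sp

z = sp.symbols('z')
r = sp.Matrix([[z, 1], [0, z**2]])        # a unit in the 2x2 matrix ring over C(z)

d = lambda m: m.applyfunc(lambda a: sp.diff(a, z))   # the derivation r -> r'

lhs = d(r.inv())
rhs = -r.inv() * d(r) * r.inv()
assert sp.simplify(lhs - rhs) == sp.zeros(2, 2)
```

The same check with a 1 × 1 matrix recovers the commutative form (1.4).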

The ring assumption suggests the generality of the concept of a derivation, but our main interest will be in derivations on fields. In this regard we note that any derivation on an integral domain extends uniquely, via the quotient rule, to the quotient field.

Henceforth K denotes a differential field (of characteristic 0), i.e., a field K equipped with a non-trivial derivation k ↦ k'. By a constant we mean an element k ∈ K satisfying k' = 0; e.g., we see from (1.2) that 1 ∈ K has this property. Indeed, the collection K_C ⊂ K of constants is easily seen to be a subfield containing Q; this is the field of constants (of K = (K, δ)). When K = C(z) with derivation d/dz we have K_C = C. The constants of the differential field M(P¹) defined above are the constant functions f : P¹ → C.

The determinant

(1.5)   W := W(k_1, . . . , k_n) := \det \begin{pmatrix} k_1 & k_2 & \cdots & k_n \\ k_1' & k_2' & \cdots & k_n' \\ \vdots & & & \vdots \\ k_1^{(n-1)} & k_2^{(n-1)} & \cdots & k_n^{(n-1)} \end{pmatrix}

is the Wronskian of the elements k_1, . . . , k_n ∈ K. This entity is useful for determining linear (in)dependence over K_C.

Proposition 1.6 : Elements k_1, . . . , k_n of a differential field K are linearly dependent over the field of constants K_C if and only if their Wronskian is 0.

Proof :
⇒ For any c_1, . . . , c_n ∈ K_C and any 0 ≤ m ≤ n we have (\sum_j c_j k_j)^{(m)} = \sum_j c_j k_j^{(m)}. In particular, when \sum_j c_j k_j = 0 the same equality holds when k_j is replaced by the jth column of the Wronskian and 0 is replaced by a column of zeros. The forward assertion follows.
⇐ The vanishing of the Wronskian implies a dependence relation (over K) among the columns, and as a result there must be elements c_1, . . . , c_n ∈ K, not all 0, such that

(i)   \sum_{j=1}^n c_j k_j^{(m)} = 0   for m = 0, . . . , n − 1.

What requires proof is that the cj may be chosen in KC , and this we establish by induction on n. As the case n = 1 is trivial we assume n > 1 and that the result holds for any subset of K with at most n − 1 elements.

If there is also a dependence relation (over K) among the columns of the Wronskian of k_2, . . . , k_n, e.g., if c_1 = 0, then by the induction hypothesis the elements k_2, . . . , k_n ∈ K must be linearly dependent over K_C. But the same then holds for k_1, . . . , k_n, which is precisely what we want to prove. We therefore assume (w.l.o.g.) that c_1 = 1 and that the columns of the Wronskian of k_2, . . . , k_n are linearly independent over K. From (i) we then have

0 = (\sum_{j=1}^n c_j k_j^{(m)})' = \sum_{j=1}^n c_j k_j^{(m+1)} + \sum_{j=1}^n c_j' k_j^{(m)} = 0 + \sum_{j=2}^n c_j' k_j^{(m)} = \sum_{j=2}^n c_j' k_j^{(m)}

for m = 0, . . . , n − 2, thereby forcing c_2' = ··· = c_n' = 0. But this means c_j ∈ K_C for j = 1, . . . , n, and the proof is complete. q.e.d.
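Proposition 1.6 can be exercised concretely in K = C(z), where K_C = C. A small sketch of our own (SymPy assumed; the helper name `wronskian` and the sample elements are ours):

```python
# Our own illustration of Proposition 1.6 in K = C(z), K_C = C.
import sympy as sp

z = sp.symbols('z')

def wronskian(*ks):
    """Determinant (1.5) built from the elements ks of C(z)."""
    n = len(ks)
    rows = [[sp.diff(k, z, m) for k in ks] for m in range(n)]
    return sp.Matrix(rows).det()

# z and z**2 are linearly independent over C: the Wronskian is z**2 != 0.
assert sp.simplify(wronskian(z, z**2)) != 0

# z, 2z + 3z**2, z**2 satisfy 2*(z) + 3*(z**2) - (2z + 3z**2) = 0 over C,
# so their Wronskian must vanish.
assert sp.simplify(wronskian(z, 2*z + 3*z**2, z**2)) == 0
```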

2. Differential Modules

Throughout this section K denotes a differential field with derivation k ↦ k' and V is a K-space (i.e., a vector space over K). The collection of n × n matrices with entries in K is denoted gl(n, K); the group of invertible matrices in gl(n, K) is denoted GL(n, K).

A differential structure on V is an additive group homomorphism D : V → V satisfying

(2.1)   D(kv) = k'v + kDv,   k ∈ K, v ∈ V,

where Dv abbreviates D(v). The Leibniz rule terminology is also used with (2.1). Vectors v ∈ V satisfying Dv = 0 are said to be horizontal 1. The zero vector 0 ∈ V always has this property; other such vectors need not exist.

When D : V → V is a differential structure the pair (V, D) is called a differential K-module, or simply a differential module when K is clear from context. When dim_K(V) = n < ∞ the integer n is the dimension of the differential module.

For an example of a differential structure/module take V := col_n(K) and define D : col_n(K) → col_n(K) by

   0  k1 k1  k   k 0   2   2  (2.2) D  .  :=  .  .  .   .  0 kn kn

Further examples will be evident from Proposition 2.12. Since KC is a subfield of K we can regard V as a vector space over KC by restricting scalar multiplication to KC × V .

Proposition 2.3 :

(a) Any differential structure D : V → V is K_C-linear.

(b) The collection of horizontal vectors of a differential structure D : V → V coincides with the kernel ker D of D when D is considered as a K_C-linear mapping.

1The terminology is borrowed from Differential Geometry.

(c) The horizontal vectors of a differential structure D : V → V constitute a K_C-subspace of (the K_C-space) V.

Proof : (a) Immediate from (2.1). (b) Obvious from the definition of horizontal. (c) Immediate from (b). q.e.d.

To obtain a basis description of a differential structure D : V → V let e = (e_j)^n_{j=1} ⊂ V be a(n ordered) basis of V and let B = (b_{ij}) ∈ gl(n, K) be defined by

(2.4)   De_j := \sum_{i=1}^n b_{ij} e_i ,   j = 1, . . . , n.

(Example: For D as in (2.2) and e_j = (0, . . . , 0, 1, 0, . . . , 0)^τ 2 [1 in slot j] for j = 1, . . . , n we have B = (0) [the zero matrix].) We refer to B as the defining (e)-matrix of D, or as the defining matrix of D relative to the basis e. Note that, for any v = \sum_{j=1}^n v_j e_j ∈ V, additivity and the Leibniz rule (2.1) give

(2.5)   Dv = \sum_{i=1}^n (v_i' + \sum_{j=1}^n b_{ij} v_j) e_i .

This is better expressed in the matrix form

(2.6)   (Dv)_e = v_e' + B v_e ,

wherein v_e is as in (ii) of the Introduction, i.e., the image of v under the isomorphism β_e : v = \sum_j v_j e_j ∈ V ↦ (v_1, v_2, . . . , v_n)^τ ∈ col_n(K), and v_e' := (v_1', v_2', . . . , v_n')^τ.

2The superscript τ (“tau”) denotes transposition.

Where all this is leading should hardly be a surprise: the mapping D_B : x ∈ col_n(K) ↦ x' + Bx ∈ col_n(K) defines a differential structure on the K-space col_n(K), and one has a commutative diagram

(2.7)
               D
         V ---------> V
      β_e |           | β_e
          v    D_B    v
    col_n(K) ---> col_n(K).

An immediate connection with linear ordinary differential equations is seen from (2.6): a vector v ∈ V is horizontal if and only if v_e ∈ col_n(K) is a solution of the first-order system

(2.8)   x' + Bx = 0.

This is the defining (e)-equation of D. Linear systems of ordinary differential equations of the form

(2.9)   x' + Bx = 0

are called homogeneous. One can also ask for solutions of inhomogeneous systems, i.e., systems of the form

(2.10)   x' + Bx = b,

wherein 0 ≠ b ∈ col_n(K) is given. For b = w_e this is equivalent to the search for a vector v ∈ V satisfying

(2.11) Dv = w .

Equation (2.9) is the homogeneous equation corresponding to (2.10).
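That D_B : x ↦ x' + Bx really is a differential structure, i.e., satisfies the Leibniz rule (2.1), can be verified symbolically. A sketch of our own (SymPy assumed; the matrix B and the generic functions are invented for the check, with K = C(z)):

```python
# Our own check that D_B(kv) = k'v + k D_B(v) on col_2(C(z)).
import sympy as sp

z = sp.symbols('z')
f = sp.Function('f')                       # a generic scalar k in K
g, h = sp.Function('g'), sp.Function('h')  # generic entries of v

B = sp.Matrix([[1/z, 0], [z, 2]])          # an arbitrary matrix in gl(2, C(z))
v = sp.Matrix([g(z), h(z)])
k = f(z)

D = lambda x: x.diff(z) + B * x            # the structure D_B of (2.7)

lhs = D(k * v)
rhs = sp.diff(k, z) * v + k * D(v)
assert sp.simplify(lhs - rhs) == sp.zeros(2, 1)
```

The assertion holds identically because differentiating k·v entrywise already obeys the product rule, and the term B(kv) = k·Bv is K-linear.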

Proposition 2.12 : When dim_K V < ∞ and e is a basis the correspondence between differential structures D : V → V and n × n matrices B defined by (2.4) is bijective; the inverse assigns to a matrix B ∈ gl(n, K) the differential structure D : V → V defined by (2.6).

Proof : The proof is by routine verification. q.e.d.

9 Proposition 2.13 :

(a) The solutions of (2.9) within coln(K) form a vector space over KC .

(b) When dim_K V = n < ∞ and e is a basis of V the K-linear isomorphism v ∈ V ↦ v_e ∈ col_n(K) restricts to a K_C-linear isomorphism between the K_C-subspace of V consisting of horizontal vectors and the K_C-subspace of col_n(K) consisting of solutions of (2.9).

Proof :

(a) When y1, y2 ∈ coln(K) are solutions and c1, c2 ∈ KC we have

(c_1y_1 + c_2y_2)' = (c_1y_1)' + (c_2y_2)'
                   = c_1'y_1 + c_1y_1' + c_2'y_2 + c_2y_2'
                   = 0 · y_1 + c_1(−By_1) + 0 · y_2 + c_2(−By_2)
                   = −Bc_1y_1 − Bc_2y_2
                   = −B(c_1y_1 + c_2y_2) .

(b) That the mapping restricts to a bijection between horizontal vectors and so- lutions was already noted immediately before (2.9), and since the correspondence v 7→ ve is K-linear and KC is a subfield of K any restriction to a KC -subspace must be KC -linear. q.e.d.

Suppose ê = (ê_j)^n_{j=1} ⊂ V is a second basis and P = (p_{ij}) is the transition matrix, i.e., ê_j = \sum_{i=1}^n p_{ij} e_i. Then the defining e- and ê-matrices B and A of D are easily seen to be related by

(2.14)   A := PBP⁻¹ − P'P⁻¹,

where P' := (p_{ij}'). The transition from B to A is viewed classically as a change of variables: substitute w = Px in (2.9); then note from

w' = Px' + P'x = P(−Bx) + P'P⁻¹w = −PBP⁻¹w + P'P⁻¹w

that

w' + (PBP⁻¹ − P'P⁻¹)w = 0 .

The modern viewpoint is to regard (P, B) ↦ PBP⁻¹ − P'P⁻¹ as defining a left action of GL(n, K) on gl(n, K); this is the action by gauge transformations.
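The classical computation just given rests on the operator identity (Px)' + A(Px) = P(x' + Bx) for A := PBP⁻¹ − P'P⁻¹, which can be checked by machine. A sketch of our own (SymPy assumed; B and P are invented data over C(z)):

```python
# Our own check: the gauge transform A = P B P^{-1} - P' P^{-1} is exactly
# the matrix making w = Px satisfy w' + Aw = P(x' + Bx).
import sympy as sp

z = sp.symbols('z')
x1, x2 = sp.Function('x1'), sp.Function('x2')
x = sp.Matrix([x1(z), x2(z)])              # a generic column vector

B = sp.Matrix([[z, 1], [0, 1/z]])
P = sp.Matrix([[1, z], [0, z**2]])         # any element of GL(2, C(z))

d = lambda m: m.applyfunc(lambda a: sp.diff(a, z))
A = P * B * P.inv() - d(P) * P.inv()

w = P * x
lhs = d(w) + A * w
rhs = P * (d(x) + B * x)
assert sp.simplify(lhs - rhs) == sp.zeros(2, 1)
```

In particular w = Px solves the transformed system precisely when x solves the original one.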

Example 2.15 : Assume K = C(z) with derivation d/dz and consider the first-order system

(i)   x' + \begin{pmatrix} \frac{10z^4-(2\nu^2-1)z^2-2}{z(2z^4-1)} & \frac{4z^6-4\nu^2z^4-4z^2+1}{z^4(2z^4-1)} \\[4pt] -\frac{z^2(z^4+3z^2-\nu^2)}{2z^4-1} & \frac{(2\nu^2-1)z^2+1}{z(2z^4-1)} \end{pmatrix} x = 0 ,

i.e.,

x_1' + \frac{10z^4-(2\nu^2-1)z^2-2}{z(2z^4-1)} x_1 + \frac{4z^6-4\nu^2z^4-4z^2+1}{z^4(2z^4-1)} x_2 = 0 ,
x_2' − \frac{z^2(z^4+3z^2-\nu^2)}{2z^4-1} x_1 + \frac{(2\nu^2-1)z^2+1}{z(2z^4-1)} x_2 = 0 ,

wherein ν is a complex parameter. This has the form (2.9) with

B := \begin{pmatrix} \frac{10z^4-(2\nu^2-1)z^2-2}{z(2z^4-1)} & \frac{4z^6-4\nu^2z^4-4z^2+1}{z^4(2z^4-1)} \\[4pt] -\frac{z^2(z^4+3z^2-\nu^2)}{2z^4-1} & \frac{(2\nu^2-1)z^2+1}{z(2z^4-1)} \end{pmatrix} ,

and with the choice3

P := \begin{pmatrix} z^2 & 2z \\ z^3 & z^{-2} \end{pmatrix}

one sees that the transformed system is

(ii)   x' + Ax = 0,   where A := PBP⁻¹ − P'P⁻¹ = \begin{pmatrix} 0 & -1 \\ 1 - \frac{\nu^2}{z^2} & \frac{1}{z} \end{pmatrix} .

We regard (i)-(ii) as distinct basis descriptions of the same differential structure.

3At this point readers should not be concerned with how this particular P was constructed.

3. nth-Order Linear Differential Equations

The concept of an nth-order linear homogeneous equation in the context of a differential field K is formulated in the obvious way: an element k ∈ K is a solution of

(3.1)   y^{(n)} + \ell_1 y^{(n-1)} + ··· + \ell_{n-1} y' + \ell_n y = 0 ,

where \ell_1, . . . , \ell_n ∈ K, if and only if

(3.2)   k^{(n)} + \ell_1 k^{(n-1)} + ··· + \ell_{n-1} k' + \ell_n k = 0 ,

where k^{(2)} := k'' := (k')' and k^{(j)} := (k^{(j-1)})' for j > 2. Using a Wronskian argument one can easily prove that (3.1) has at most n solutions (in K) linearly independent over K_C.

As in the classical case, k ∈ K is a solution of (3.1) if and only if the column vector (k, k', . . . , k^{(n-1)})^τ is a solution of

 0 −1 0 ··· 0  . .  . 0 −1 .     .   0 ..  0   (3.3) x + Bx = 0,B =   .  ..   . −1 0       0 −1  `n `n−1 ······ `2 `1

Indeed, one has the following analogue of Proposition 2.13.

Proposition 3.4 :

(a) The solutions of (3.1) within K form a vector space over K_C.

(b) The K-linear mapping (y, y', . . . , y^{(n-1)})^τ ∈ col_n(K) ↦ y ∈ K restricts to a K_C-linear isomorphism between the K_C-subspace of col_n(K) consisting of solutions of (3.3) and the K_C-subspace of K described in (a).

Proof : The proof is a routine verification. q.e.d.

Example 3.5 : Equation (ii) of Example 2.15, i.e.,

(i)   x' + \begin{pmatrix} 0 & -1 \\ 1 - \frac{\nu^2}{z^2} & \frac{1}{z} \end{pmatrix} x = 0 ,

has the form seen in (3.3). This linear differential equation would commonly be written as

(ii)   y'' + \frac{1}{z} y' + \left(1 - \frac{\nu^2}{z^2}\right) y = 0 ,

or as

(iii)   z^2 y'' + z y' + (z^2 − \nu^2) y = 0.

Either form can be regarded as a basis description of the differential module of that example. Some readers will immediately recognize (iii): it is Bessel’s equation.

Converting the nth-order equation (3.1) to the first-order system (3.3) is standard practice. Less well known is the fact that any first-order system of n equations can be converted to the form (3.3), and as a consequence can be expressed in nth-order form. We will prove this in §7. For many purposes nth-order form has distinct advantages, e.g., explicit solutions are often easily constructed with series expansions; equation (iii) of Example 3.5, for instance, can be solved explicitly in terms of Bessel functions.
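As a spot check of that closing remark, SymPy's `dsolve` recognizes equation (iii) of Example 3.5 and returns a general solution built from Bessel functions. A sketch (ours, assuming SymPy's Bessel-equation solver is available):

```python
# Our own check that equation (iii) is solved by Bessel functions.
import sympy as sp

z, nu = sp.symbols('z nu')
y = sp.Function('y')

bessel = sp.Eq(z**2 * y(z).diff(z, 2) + z * y(z).diff(z)
               + (z**2 - nu**2) * y(z), 0)
sol = sp.dsolve(bessel, y(z))
assert sol.rhs.has(sp.besselj)   # solution involves J_nu (and Y_nu)
```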

4. Dimension Considerations

In this section K denotes a differential field and (V,D) is a differential K-module of dimension n ≥ 1.

Proposition 4.1 : When V is a K-space with differential structure D : V → V the following assertions hold.

(a) A collection of horizontal vectors within V is linearly independent over K if and only if it is linearly independent over KC .

(b) The collection of horizontal vectors of V is a vector space over K_C of dimension at most n.

Proof :

(a) ⇒ Immediate from the inclusion K_C ⊂ K. (In this direction the horizontal assumption is unnecessary.)
⇐ If the implication is false there is a collection of horizontal vectors in V which is K_C-(linearly) independent but K-dependent, and from this collection we can choose vectors v_1, . . . , v_m which are K-dependent with m > 1 minimal w.r.t. this property. We can then write v_m = \sum_{j=1}^{m-1} k_j v_j, with k_j ∈ K, whereupon applying D and the hypotheses Dv_j = 0 results in the identity 0 = \sum_{j=1}^{m-1} k_j' v_j. By the minimality of m this forces k_j' = 0, j = 1, . . . , m − 1, i.e., k_j ∈ K_C, and this contradicts linear independence over K_C.

(b) This is immediate from (a) and the fact that any K-linearly independent subset of V can be extended to a basis. q.e.d.

Suppose dim_K V = n < ∞, e is a basis of V, and x' + Bx = 0 is the defining e-equation of D. Then assertion (b) of the preceding result has the following standard formulation.

Corollary 4.2 : For any matrix B ∈ gl(n, K) a collection of solutions of

(i)   x' + Bx = 0

within coln(K) is linearly independent over K if and only if the collection is linearly independent over KC . In particular, the KC -subspace of coln(K) consisting of solutions of (i) has dimension at most n.

14 Proof : By Proposition 2.13. q.e.d.

Equation (i) of Corollary 4.2 is always satisfied by the column vector x = (0, 0, . . . , 0)^τ; this is the trivial solution, and any other is non-trivial. Unfortunately, non-trivial solutions (with entries in K) need not exist. For example, the linear differential equation y' − y = 0 admits only the trivial solution in the field C(z): for non-trivial solutions one must recast the problem so as to include the extension field (C(z))(exp(z)).

Corollary 4.3 : For any elements \ell_1, . . . , \ell_n ∈ K a collection of solutions {y_j}^m_{j=1} ⊂ K of

(i)   y^{(n)} + \ell_1 y^{(n-1)} + ··· + \ell_{n-1} y' + \ell_n y = 0

is linearly independent over K_C if and only if the collection {(y_j, y_j', . . . , y_j^{(n-1)})^τ}^m_{j=1} is linearly independent over K. In particular, the K_C-subspace of K consisting of solutions of (i) has dimension at most n.

Proof : Use Proposition 3.4(b) and Corollary 4.2. q.e.d.

5. Fundamental Matrix Solutions

Throughout the section K is a differential field and (V, D) is a differential module of positive dimension n.

Choose any basis e of V and let

(5.1)   x' + Bx = 0

be the defining e-equation of D. A non-singular matrix M ∈ gl(n, K) is a fundamental matrix solution of (5.1) if M satisfies this equation, i.e., if and only if

(5.2)   M' + BM = 0.

Example : Take K = C(z) with the usual derivation δ = d/dz; then

M := \begin{pmatrix} z^2 & 1+z \\ 0 & z^3 \end{pmatrix}

is a fundamental matrix solution of the equation

x' + \begin{pmatrix} -\frac{2}{z} & \frac{z+2}{z^4} \\ 0 & -\frac{3}{z} \end{pmatrix} x = 0 .
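The example is easy to verify by machine: M is non-singular (det M = z⁵) and M' + BM = 0. A sketch of the check (ours, SymPy assumed):

```python
# Our own verification of the example: M' + BM = 0 and det M != 0.
import sympy as sp

z = sp.symbols('z')
M = sp.Matrix([[z**2, 1 + z], [0, z**3]])
B = sp.Matrix([[-2/z, (z + 2)/z**4], [0, -3/z]])

d = lambda m: m.applyfunc(lambda a: sp.diff(a, z))
assert sp.simplify(d(M) + B * M) == sp.zeros(2, 2)
assert sp.simplify(M.det()) != 0          # det M = z**5
```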

Proposition 5.3 : A matrix M ∈ gl(n, K) is a fundamental matrix solution of (5.1) if and only if the columns of M constitute n solutions of that equation linearly independent over KC .

Of course linear independence over K is equivalent to the non-vanishing of the Wronskian W (y1, . . . , yn).

Proof : First note that (5.2) holds if and only if the columns of M are solutions of (5.1). Next observe that M is non-singular if and only if these columns are linearly independent over K. Finally, note from Propositions 2.13(b) and 4.1(a) that this will be the case if and only if these columns are linearly independent over KC . q.e.d.

Corollary 5.4 : Equation (5.1) admits a fundamental matrix solution in GL(n, K) if and only if V admits a basis consisting of D-horizontal vectors.

Proof : By Proposition 5.3 the existence of a fundamental matrix solution in GL(n, K) is equivalent to the existence of n D-horizontal vectors linearly independent over K_C. Now recall Proposition 4.1. q.e.d.

Proposition 5.5 : Suppose M, N ∈ gl(n, K) and M is a fundamental matrix solution of (5.1). Then N is a fundamental matrix solution if and only if N = MC for some matrix C ∈ GL(n, K_C).

Proof :
⇒ : By (1.3) we have

(M⁻¹N)' = M⁻¹ · N' + (M⁻¹)' · N
        = M⁻¹ · (−BN) + (−M⁻¹M'M⁻¹) · N
        = −M⁻¹BN + (−M⁻¹)(−BM)M⁻¹ · N
        = −M⁻¹BN + M⁻¹BN
        = 0 ,

whence C := M⁻¹N ∈ GL(n, K_C) and N = MC.

⇐ : We have N' = (MC)' = M'C = −BM · C = −B · MC = −BN. q.e.d.

Fundamental matrix solutions have an interesting geometric characterization. To explain that we need to introduce the dual of a differential module.

6. Dual Structures and Adjoint Equations

Differential structures allow for a simple conceptual formulation of the "adjoint equation" of a linear ordinary differential equation.

In this section K is a differential field and (V, D) is a differential K-module of dimension n ≥ 1. Recall that the dual space V∗ of V is defined as the K-space of linear functionals v∗ : V → K, and the dual basis e∗ of a basis e = {e_α} of V is the basis {e∗_α} of V∗ satisfying e∗_β e_α = δ_αβ (wherein δ_αβ is the usual Kronecker delta, i.e., δ_αβ := 1 if and only if α = β; otherwise δ_αβ := 0).

There is a dual differential structure D∗ : V ∗ → V ∗ on the dual space V ∗ naturally associated with D: the definition is

(6.1) (D∗u∗)v = δ(u∗v) − u∗(Dv), u∗ ∈ V ∗, v ∈ V.

The verification that this is a differential structure is straightforward, and is left to the reader. One often sees u∗v written as hv, u∗i, and when this notation is used (6.1) becomes

(6.2) δhv, u∗i = hDv, u∗i + hv, D∗u∗i.

This is the Lagrange identity; it implies that u∗v ∈ K_C whenever v and u∗ are horizontal.

Proposition 6.3 : Suppose e = (e_j)^n_{j=1} ⊂ V is a basis of V and B = (b_{ij}) ∈ gl(n, K) is the defining e-matrix of D. Then the defining e∗-matrix of D∗ is −B^τ.

The proof, like most of the proofs in this section, is a simple application of the Lagrange identity.

Proof : First note from (6.2) that for any 1 ≤ i, j ≤ n we have

0 = δ_{ij}'
  = δ⟨e_i, e_j∗⟩
  = ⟨De_i, e_j∗⟩ + ⟨e_i, D∗e_j∗⟩
  = ⟨\sum_k b_{ki} e_k, e_j∗⟩ + ⟨e_i, D∗e_j∗⟩
  = \sum_k b_{ki} ⟨e_k, e_j∗⟩ + ⟨e_i, D∗e_j∗⟩
  = b_{ji} + ⟨e_i, D∗e_j∗⟩,

from which we see that

(i)   ⟨e_i, D∗e_j∗⟩ = −b_{ji} .

However, for the defining e∗-matrix C = (c_{ij}) of D∗ we have

D∗e_i∗ = \sum_k c_{ki} e_k∗ ,

and therefore

⟨e_i, D∗e_j∗⟩ = ⟨e_i, \sum_k c_{kj} e_k∗⟩ = \sum_k c_{kj} ⟨e_i, e_k∗⟩ = c_{ij}

for any 1 ≤ i, j ≤ n. The result now follows from (i). q.e.d.

As an immediate consequence of Proposition 6.3 we see that the defining e∗-equation of D∗ is

(6.4)   y' − B^τ y = 0 ;

this is the adjoint equation of

(6.5)   x' + Bx = 0.

Note from the usual identification V∗∗ ≃ V and −(−B^τ)^τ = B that (6.5) can be viewed as the adjoint equation of (6.4). Intrinsically: the identification V ≃ V∗∗ induces the additional identification D∗∗ := (D∗)∗ ≃ D.

Equations (6.5) and (6.4) are interchangeable in the sense that information about either one can always be obtained from information about the other. In particular, there is a fundamental relationship between solutions of a linear differential equation and the solutions of the adjoint equation. This is explained in the following result, which also contains the promised geometric characterization of a fundamental matrix solution.

Proposition 6.6 : Suppose e is a basis of V and

(i)   x' + Bx = 0

is the defining e-equation of D. Then for any M ∈ gl(n, K) the following statements are equivalent:

(a) M is a fundamental matrix solution of (i);

(b) (M^τ)⁻¹ is a fundamental matrix solution of the adjoint equation

(ii)   x' − B^τ x = 0

of (i); and

(c) M τ is the transition matrix from the dual basis e∗ of e to a basis of V ∗ consisting of D∗-horizontal vectors.

Proof : First note that (M^τ)' = (M')^τ; hence that

(iii)   M' + BM = 0 ⇔ (M^τ)' + M^τ B^τ = 0.

(a) ⇔ (b) : From (iii), (M^τ)⁻¹ = (M⁻¹)^τ and (M^τ)' = (M')^τ we have

M' + BM = 0 ⇔ (M')^τ + M^τ B^τ = 0
           ⇔ (M^τ)⁻¹ (M')^τ (M^τ)⁻¹ + (M^τ)⁻¹ M^τ B^τ (M^τ)⁻¹ = 0
           ⇔ (M⁻¹)^τ (M')^τ (M⁻¹)^τ + B^τ (M⁻¹)^τ = 0
           ⇔ (M⁻¹ M' M⁻¹)^τ + B^τ (M⁻¹)^τ = 0
           ⇔ −((M⁻¹)')^τ + B^τ (M⁻¹)^τ = 0
           ⇔ ((M^τ)⁻¹)' − B^τ (M^τ)⁻¹ = 0 .

(a) ⇔ (c) : One has

M' + BM = 0 ⇔ (M^τ)' + M^τ B^τ = 0
           ⇔ (M^τ)'(M^τ)⁻¹ + M^τ B^τ (M^τ)⁻¹ = 0
           ⇔ (M^τ)(−B^τ)(M^τ)⁻¹ − (M^τ)'(M^τ)⁻¹ = 0.

This last line is precisely what one obtains when A, B and P in (2.14) are replaced by 0, −B^τ and M^τ respectively, i.e., it asserts that when M^τ is regarded as a transition matrix from e∗ to some other basis ê∗ of V∗, the defining ê∗-matrix of D∗ must vanish. Now simply observe from (2.4) that the defining matrix of a basis vanishes if and only if that basis consists of horizontal vectors. q.e.d.
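The equivalence (a) ⇔ (b) can be watched in action on the example of §5: there M solves M' + BM = 0, and (M^τ)⁻¹ indeed solves the adjoint equation. A sketch of our own (SymPy assumed):

```python
# Our own check of Proposition 6.6 (a) <=> (b) on the Section 5 example.
import sympy as sp

z = sp.symbols('z')
M = sp.Matrix([[z**2, 1 + z], [0, z**3]])
B = sp.Matrix([[-2/z, (z + 2)/z**4], [0, -3/z]])

d = lambda m: m.applyfunc(lambda a: sp.diff(a, z))
assert sp.simplify(d(M) + B * M) == sp.zeros(2, 2)    # M is fundamental

N = M.T.inv()
assert sp.simplify(d(N) - B.T * N) == sp.zeros(2, 2)  # N solves the adjoint
```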

Dual differential structures are useful for solving equations of the form Du = w, wherein w ∈ V is given. The fundamental result in this direction is the following.

Proposition 6.7 : Suppose (v_j∗)^n_{j=1} is a basis of V∗ consisting of horizontal vectors and (v_j)^n_{j=1} is the dual basis of V ≃ V∗∗. Then the following statements hold.

(a) All v_j are horizontal.

(b) Suppose w ∈ V and there are elements k_j ∈ K such that k_j' = ⟨w, v_j∗⟩ for j = 1, . . . , n. Then the vector

(i)   u := \sum_j k_j v_j ∈ V

satisfies

(ii)   Du = w .

The vector u introduced in (i) is not the unique solution of (ii): the sum of u and any horizontal vector will also satisfy that equation.

Proof :
(a) This can be seen as a corollary of Proposition 6.6, but a direct proof is quite easy: from ⟨v_i, v_j∗⟩ ∈ {0, 1} ⊂ K_C, Lagrange's identity (6.2) and the hypothesis D∗v_j∗ = 0 we have

0 = ⟨v_i, v_j∗⟩'
  = ⟨Dv_i, v_j∗⟩ + ⟨v_i, D∗v_j∗⟩
  = ⟨Dv_i, v_j∗⟩ ,

and since (v_j∗) is a basis this forces Dv_i = 0 for i = 1, . . . , n.

(b) First note that for any v ∈ V we have

(iii)   v = \sum_j ⟨v, v_j∗⟩ v_j .

Indeed, we can always write v in the form v = \sum_i c_i v_i, where c_i ∈ K, and applying v_j∗ to this identity gives ⟨v, v_j∗⟩ = \sum_i c_i ⟨v_i, v_j∗⟩ = c_j.

From (a) and (iii) we then have

Du = \sum_j D(k_j v_j)
   = \sum_j (k_j' v_j + k_j Dv_j)
   = \sum_j (⟨w, v_j∗⟩ v_j + k_j · 0)
   = \sum_j ⟨w, v_j∗⟩ v_j
   = w .   q.e.d.

The following corollary explains the relevance of the adjoint equation for solving inhomogeneous systems. In the statement we adopt more classical notation: when k, ℓ ∈ K satisfy ℓ' = k we write ℓ as ∫k, omit specific reference to ℓ, and simply assert that ∫k ∈ K. Moreover, we use the usual inner product ⟨y, z⟩ := \sum_j y_j z_j to identify col_n(K) with (col_n(K))∗, i.e., we identify the two spaces by means of the K-isomorphism v ∈ col_n(K) ↦ (w ∈ col_n(K) ↦ ⟨w, v⟩ ∈ K) ∈ (col_n(K))∗.

Corollary 6.8 4: Suppose:

(a) B ∈ gl(n, K);

(b) b ∈ col_n(K);

(c) (z_j)^n_{j=1} is a basis of col_n(K) consisting of solutions of the adjoint equation

(i)   x' − B^τ x = 0

of

(ii)   x' + Bx = 0 ;

(d) ∫⟨b, z_j⟩ ∈ K for j = 1, . . . , n; and

(e) (y_i)^n_{i=1} is a basis of col_n(K) satisfying

(iii)   ⟨y_i, z_j⟩ = 1 if i = j, and 0 otherwise.

Then (y_j)^n_{j=1} is a basis of solutions of the homogeneous equation (ii) and the vector

(iv)   y := \sum_j (∫⟨b, z_j⟩) · y_j

is a solution of the inhomogeneous system

(v)   x' + Bx = b .

4For a classical account of this result see, e.g., [Poole, Chapter III, §10, pp. 36-39]. In fact the treatment in this reference was the inspiration for our formulation of this corollary.

The appearance of the integral in (iv) explains why solutions of (i) are called integrating factors of (ii) (and vice-versa, since, as already noted, (i) may be regarded as the adjoint equation of (ii)). The result is immediate from Proposition 6.7. However, it is a simple enough matter to give a direct proof, and we therefore do so.

Proof : Hypothesis (e) identifies (z_j)^n_{j=1} with the dual basis of (y_i)^n_{i=1}. In particular, it allows us to view (z_j)^n_{j=1} as a basis of (col_n(K))∗. To prove that the y_j satisfy (ii) simply note from (iii) and (i) that

0 = ⟨y_i, z_j⟩'
  = ⟨y_i', z_j⟩ + ⟨y_i, z_j'⟩
  = ⟨y_i', z_j⟩ + ⟨y_i, B^τ z_j⟩
  = ⟨y_i', z_j⟩ + ⟨By_i, z_j⟩
  = ⟨y_i' + By_i, z_j⟩.

Since (z_j)^n_{j=1} is a basis (of (col_n(K))∗) it follows that

(vi)   y_j' + By_j = 0 ,   j = 1, . . . , n.

Next observe, as in (iii) of the proof of Proposition 6.7, that for any b ∈ col_n(K) condition (iii) implies

b = \sum_j ⟨b, z_j⟩ y_j .

It then follows from (vi) that

y' = \sum_j ((∫⟨b, z_j⟩)' · y_j + (∫⟨b, z_j⟩) · y_j')
   = \sum_j (⟨b, z_j⟩ y_j − (∫⟨b, z_j⟩) · By_j)
   = −B \sum_j (∫⟨b, z_j⟩) · y_j + \sum_j ⟨b, z_j⟩ y_j
   = −By + b .

q.e.d.

Corollary 6.8 was formulated so as to make the role of the adjoint equation evident. The following alternate formulation is easier to apply in practice.

Corollary 6.9 : Suppose B ∈ gl(n, K) and M ∈ GL(n, K) is a fundamental matrix solution of

(i)   x' + Bx = 0.

Denote the jth columns of M and (M^τ)⁻¹ by y_j and z_j respectively, and suppose b ∈ col_n(K) and ∫⟨b, z_j⟩ ∈ K for j = 1, . . . , n. Then

(ii)   y := \sum_j (∫⟨b, z_j⟩) · y_j

is a solution of the inhomogeneous system

(iii)   x' + Bx = b .

Proof : By Proposition 5.3 the z_j form a basis of col_n(K), and from (M^τ)⁻¹ = (M⁻¹)^τ and M⁻¹M = I we see that

⟨y_i, z_j⟩ = 1 if i = j, and 0 otherwise.

The result is now evident from Corollary 6.8. q.e.d.
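Corollary 6.9 can be run end to end on the example of §5. With the choice b := (z², 0)^τ (ours, picked so that the integrals ∫⟨b, z_j⟩ are rational, as the statement requires) a sketch of the computation reads:

```python
# Our own worked instance of Corollary 6.9 on the Section 5 example.
import sympy as sp

z = sp.symbols('z')
M = sp.Matrix([[z**2, 1 + z], [0, z**3]])       # fundamental matrix solution
B = sp.Matrix([[-2/z, (z + 2)/z**4], [0, -3/z]])
b = sp.Matrix([z**2, 0])

Z = M.T.inv()                                    # columns are the z_j
y = sp.zeros(2, 1)
for j in range(2):
    integrand = (b.T * Z[:, j])[0, 0]            # <b, z_j>
    y += sp.integrate(integrand, z) * M[:, j]    # (int <b, z_j>) * y_j

d = lambda m: m.applyfunc(lambda a: sp.diff(a, z))
assert sp.simplify(d(y) + B * y - b) == sp.zeros(2, 1)
```

Here ⟨b, z₁⟩ = 1 and ⟨b, z₂⟩ = 0, so the solution produced is simply y = z · y₁.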

Finally, consider the case of an nth-order equation

(6.10)   u^{(n)} + \ell_1 u^{(n-1)} + ··· + \ell_{n-1} u' + \ell_n u = 0 .

In this instance the adjoint equation generally refers to the nth-order linear equation

(6.11)   (−1)^n v^{(n)} + (−1)^{n-1} (\ell_1 v)^{(n-1)} + ··· + (−1)(\ell_{n-1} v)' + \ell_n v = 0 ,

e.g., the adjoint equation of

(6.12)   u'' + \ell_1 u' + \ell_2 u = 0

is

(6.13)   v'' − \ell_1 v' + (\ell_2 − \ell_1') v = 0 .

Examples 6.14 :

(a) The adjoint equation of Bessel’s equation

y'' + \frac{1}{x} y' + \left(1 − \frac{\nu^2}{x^2}\right) y = 0

is

z'' − \frac{1}{x} z' + \left(1 − \frac{\nu^2 − 1}{x^2}\right) z = 0 .

(b) The adjoint equation of any second-order equation of the form

y'' + \ell_2 y = 0

is the identical equation (despite the fact that they describe differential struc- tures on spaces dual to one another).

To understand why the “adjoint” terminology is used with (6.11) first convert (6.10) to the first order form (3.3) and write the corresponding adjoint equation accordingly, i.e., as

  0 0 0 ··· 0 −`n    1 0 0 0 −`   n−1   . .   0 1 0 . .  0 τ τ   (6.15) x − B x = 0, −B =   .  .. ..   . .     .   . 1 0 −`2    0 0 1 −`1

The use of the terminology is then explained by the following result.

Proposition 6.16 : A column vector x = (x_1, . . . , x_n)^τ ∈ col_n(K) is a solution of (6.15) if and only if x_n is a solution of (6.11) and

(i)   x_{n-j} = (−1)^j x_n^{(j)} + \sum_{i=0}^{j-1} (−1)^i (\ell_{j-i} x_n)^{(i)}   for j = 1, . . . , n − 1 .

Proof :
⇒ If x = (x_1, . . . , x_n)^τ satisfies (6.15) then

(ii)   x_j' = −x_{j-1} + \ell_{n+1-j} x_n   for j = 1, . . . , n ,

where x_0 := 0. It follows that

x_n' = −x_{n-1} + \ell_1 x_n ,

x_n'' = −x_{n-1}' + (\ell_1 x_n)'
      = −(−x_{n-2} + \ell_2 x_n) + (\ell_1 x_n)'
      = (−1)^2 x_{n-2} + (−1)\ell_2 x_n + (\ell_1 x_n)' ,

(3) 2 0 0 00 xn = (−1) xn−2 + (−1)(`2xn) + (`1xn) 2 0 00 = (−1) (−xn−3 + `3xn) + (−1)(`2xn) + (`1xn) 3 2 0 00 = (−1) xn−3 + (−1) `3xn + (−1)(`2xn) + (`1xn) , and by induction (on j) that

j−1 (j) j X j−1−i (i) xn = (−1) xn−j + (−1) (`j−ixn) , j = 1, . . . , n . i=0 This is equivalent to (i), and equation (6.11) amounts to the case j = n.

⇐ Conversely, suppose xn is a solution of (6.11) and that (i) holds. We must show that (iii) holds or, equivalently, that

0 xn−j = −xn−(j+1) + `j+1xn for j = 1, . . . , n. This, however, is immediate from (i). Indeed, we have

0 j (j+1) Pj−1 i i+1 xn−j = (−1) xn + i=0 (−1) (`j−ixn) j (j+1) Pj i+1 (i) = (−1) xn + i=1(−1) (`j+1−ixn) j (j+1) Pj i+1 (i) = (−1) xn + i=0(−1) (`j+1−ixn) + `j+1xn  j+1 (j+1) Pj (i) = − (−1) xn + i=0(`j+1−ixn) + `j+1xn

= −xn−(j+1) + `j+1xn . q.e.d. For completeness we record the nth-order formulation of Corollary 6.9.

Proposition 6.17 (“Variation of Constants”) : Suppose $\ell_1, \dots, \ell_n \in K$ and $\{y_1, \dots, y_n\} \subset \mathrm{col}_n(K)$ is a collection of solutions of the $n^{th}$-order equation

(i) $u^{(n)} + \ell_1 u^{(n-1)} + \cdots + \ell_{n-1} u' + \ell_n u = 0$

linearly independent over $K_C$. Let
$$M := \begin{pmatrix}
y_1 & y_2 & \cdots & y_n \\
y_1' & y_2' & \cdots & y_n' \\
y_1^{(2)} & y_2^{(2)} & \cdots & y_n^{(2)} \\
\vdots & \vdots & & \vdots \\
y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)}
\end{pmatrix}$$

and let $(z_1, \dots, z_n)$ denote the $n^{th}$ row of the matrix $(M^\tau)^{-1}$. Suppose $k \in K$ is such that $\int k z_j \in K$ for $j = 1, \dots, n$. Then

(ii) $\displaystyle y := \sum_j \Bigl( \int k z_j \Bigr) \cdot y_j$

is a solution of the inhomogeneous equation

(iii) $u^{(n)} + \ell_1 u^{(n-1)} + \cdots + \ell_{n-1} u' + \ell_n u = k\,.$

Proof : Convert (i) to a first-order system as in (3.3) and note from Propositions 1.6 and 3.4 that $M$ is a fundamental matrix solution. Denote the $j^{th}$ column of $M$ by $\hat y_j$ and apply Corollary 6.9 with $b = (0, \dots, 0, k)^\tau$ and $y_j$ (in that statement) replaced by $\hat y_j$ so as to achieve $\hat y\,' + B \hat y = b$. Now write $\hat y = (y, \hat y_2, \dots, \hat y_n)^\tau$ and eliminate $\hat y_2, \dots, \hat y_n$ in the final row of this system by expressing these entities in terms of $y$ and derivatives thereof: the result is (iii). q.e.d.

Since our formulation of Proposition 6.17 is not quite standard⁵, a simple example seems warranted.

⁵Cf. [C-L, Chapter 3, §6, Theorem 6.4, p. 87].

Example 6.18 : For $K = \mathbf{C}(x)$ with derivation $\frac{d}{dx}$ we consider the inhomogeneous second-order equation

(i) $u'' + \dfrac{2}{x}\, u' - \dfrac{6}{x^2}\, u = x^3 + 4x\,,$

and for the associated homogeneous equation
$$u'' + \frac{2}{x}\, u' - \frac{6}{x^2}\, u = 0$$
take $y_1 = x^2$ and $y_2 = 1/x^3$ so as to satisfy the hypothesis of Proposition 6.17. In the notation of that proposition we have
$$M = \begin{pmatrix} x^2 & \dfrac{1}{x^3} \\[4pt] 2x & -\dfrac{3}{x^4} \end{pmatrix}, \qquad (M^\tau)^{-1} = \begin{pmatrix} \dfrac{3}{5x^2} & \dfrac{2x^3}{5} \\[4pt] \dfrac{1}{5x} & -\dfrac{x^4}{5} \end{pmatrix},$$
and from the second matrix we see that $z_1 = \dfrac{1}{5x}$, $z_2 = -\dfrac{x^4}{5}$. A solution to (i) is therefore given by
$$y = \Bigl( \int \frac{1}{5x} \cdot (x^3 + 4x) \Bigr) \cdot x^2 + (-1)\Bigl( \int \frac{x^4}{5} \cdot (x^3 + 4x) \Bigr) \cdot \frac{1}{x^3} = \frac{x^5}{24} + \frac{2x^3}{3}\,,$$
as is easily checked directly.
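The computation in Example 6.18 can be reproduced step by step with a computer algebra system. The following is our own verification sketch in Python's sympy, following the notation of Proposition 6.17 for $n = 2$:

```python
import sympy as sp

x = sp.symbols('x')
y1, y2 = x**2, 1/x**3                      # homogeneous solutions from the example
M = sp.Matrix([[y1, y2],
               [sp.diff(y1, x), sp.diff(y2, x)]])
Z = (M.T).inv().applyfunc(sp.simplify)     # (M^tau)^(-1)
z1, z2 = Z[1, 0], Z[1, 1]                  # its second (= n-th) row
k = x**3 + 4*x                             # right-hand side of (i)
y = sp.integrate(k*z1, x)*y1 + sp.integrate(k*z2, x)*y2
assert sp.simplify(y - (x**5/24 + 2*x**3/3)) == 0

# y really does solve the inhomogeneous equation (i)
residual = sp.diff(y, x, 2) + (2/x)*sp.diff(y, x) - (6/x**2)*y - k
assert sp.simplify(residual) == 0
```

The final assertion checks directly that $y$ satisfies (i), as claimed in the text.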

7. Cyclic Vectors

Throughout the section $K$ is a differential field and $(V,D)$ is a differential $K$-module of dimension $n \geq 1$.

Define $D^0 := \mathrm{id}_V$ (the identity operator on $V$), $D^1 := D$, and $D^k := D \circ D^{k-1}$ for $k > 1$. Using (1.1) and induction on $n \geq 1$ we see that for any $k \in K$ and $v \in V$ we have the Leibniz rule

(7.1) $\displaystyle D^n(kv) = \sum_{j=0}^{n} \binom{n}{j}\, k^{(j)} D^{n-j} v\,.$
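Once a basis of horizontal vectors is fixed, (7.1) acts componentwise and reduces to the classical Leibniz rule for products of functions. A quick symbolic check (our own sketch in sympy, with the arbitrary choice $n = 4$ and generic functions $k$, $f$, where $f$ stands for a component of $v$):

```python
import sympy as sp

x = sp.symbols('x')
n = 4                                             # our arbitrary choice of order
k, f = sp.Function('k')(x), sp.Function('f')(x)   # f plays the role of a component of v
lhs = sp.diff(k*f, x, n)
rhs = sum(sp.binomial(n, j)*sp.diff(k, x, j)*sp.diff(f, x, n - j)
          for j in range(n + 1))
assert sp.simplify(lhs - rhs) == 0
```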

A vector $v \in V$ is cyclic (w.r.t. $D$), or is $D$-cyclic, if $\{v, Dv, \dots, D^{n-1}v\}$ is a basis of $V$.

Example 7.2 : Suppose $K$ has characteristic 0 and contains an element $k$ such that $k' = 1$. Moreover, assume $V$ admits a basis $\{e_1, \dots, e_n\}$ of horizontal vectors. Then the vector
$$v := \sum_{j=0}^{n-1} \frac{k^j}{j!}\, e_{j+1}$$
is cyclic. Indeed, by induction one sees that
$$D^\ell v = \sum_{j=\ell}^{n-1} \frac{k^{j-\ell}}{(j-\ell)!}\, e_{j+1}$$
for $\ell = 1, \dots, n-1$, and the claim follows easily.
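Example 7.2 can be made concrete by taking $K = \mathbf{C}(x)$, $k = x$, and identifying $V$ with $K^n$ via the horizontal basis, so that $D$ acts componentwise as $d/dx$. The following sympy sketch (with our own choice $n = 4$) confirms that $v, Dv, \dots, D^{n-1}v$ then form a basis:

```python
import sympy as sp

x = sp.symbols('x')
n = 4                                   # our arbitrary choice of dimension
# v = sum_j k^j/j! * e_{j+1} with k = x, in the horizontal (standard) basis
v = sp.Matrix([x**j / sp.factorial(j) for j in range(n)])
iterates = [v]
for _ in range(n - 1):                  # apply D = d/dx componentwise
    iterates.append(iterates[-1].diff(x))
W = sp.Matrix.hstack(*iterates)         # columns v, Dv, ..., D^{n-1} v
assert sp.simplify(W.det()) == 1        # nonzero determinant, so v is cyclic
```

The matrix of iterates is lower triangular with ones on the diagonal, which makes the claim in the example transparent.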

The hypotheses of Example 7.2 are too restrictive for most applications, but we will see that cyclic vectors always exist when $K$ has characteristic 0, the inclusion $K_C \subset K$ is proper (i.e., the derivation on $K$ is nontrivial), and the field of constants $K_C$ is algebraically closed.⁶ The goal of the section is a proof of this assertion, but it seems preferable to first indicate why such vectors might be of interest.

⁶The proof we offer is due to J. Kovacic (unpublished). However, any errors that appear are the responsibility of R. Churchill. For a constructive proof, and references to alternate proofs, see [C-K].

Proposition 7.3 : The following assertions are equivalent:

(a) D∗ : V ∗ → V ∗ admits a cyclic vector;

(b) there is a basis e of V such that the defining e-equation of D has the form

 0 −1 0 ··· 0  .  .   0 0 −1 0 .   . . . .  0  ......  (i) x +   x = 0,      0 −1  p0 p1 ··· pn−2 pn−1

and

(c) there is a basis representation of D which can be converted to the nth-order form

(ii) $y^{(n)} + p_{n-1} y^{(n-1)} + \cdots + p_1 y' + p_0 y = 0\,.$

Less formally: finding a cyclic vector for D∗ is equivalent to expressing D in terms of an nth-order homogeneous linear differential equation.

Proof :
(a) ⇔ (b) : From the definition of a cyclic vector one sees that $D^*$ admits such a vector $v^*$ if and only if there is a basis $(v^*, e_2^*, \dots, e_n^*)$ of $V^*$ such that the associated defining matrix has the form
$$\begin{pmatrix}
0 & 0 & \cdots & 0 & -p_0 \\
1 & 0 & & 0 & -p_1 \\
0 & 1 & 0 & & \vdots \\
\vdots & & \ddots & \ddots & \vdots \\
 & & & 0 & -p_{n-2} \\
0 & 0 & \cdots & 1 & -p_{n-1}
\end{pmatrix}.$$
Since this matrix is the negative transpose of that in (i), the asserted equivalence is evident from Proposition 6.3.

(b) ⇔ (c) : This equivalence has already been discussed: see the paragraph surrounding (3.3). q.e.d.
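For $n = 2$ the passage between the system (i) and the scalar equation (ii) is the familiar companion-matrix conversion, and can be checked symbolically. A sketch in sympy with generic coefficient functions (the names p0, p1 are ours):

```python
import sympy as sp

x = sp.symbols('x')
p0, p1, y = (sp.Function(s)(x) for s in ('p0', 'p1', 'y'))

X = sp.Matrix([y, y.diff(x)])              # x = (y, y')^tau
B = sp.Matrix([[0, -1],
               [p0, p1]])                  # the matrix in (i) for n = 2
r = X.diff(x) + B*X

# the first row of x' + Bx vanishes identically, and the second row
# is exactly the left-hand side of the 2nd-order equation (ii)
assert sp.simplify(r[0]) == 0
assert sp.simplify(r[1] - (y.diff(x, 2) + p1*y.diff(x) + p0*y)) == 0
```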

We turn our attention to the existence problem. Although Proposition 7.3 is phrased in terms of cyclic vectors for $D^*$, it suffices, since $D^{**} \simeq D$, to concentrate on the existence of cyclic vectors for $D$.

A $D$-invariant subspace $W \subset V$ is a differential subspace, or $D$-subspace, of $V$. As the reader can easily check, the intersection of any family of such subspaces is again such a subspace. In particular, the intersection $\langle S \rangle$ of those differential subspaces containing some nonempty subset $S \subset V$ is a $D$-subspace; this is the subspace differentially generated by $S$. When $S = \{s_1, \dots, s_r\} \subset V$ is finite the notation $\langle \{s_1, \dots, s_r\} \rangle$ is abbreviated to $\langle s_1, \dots, s_r \rangle$. In particular, $\langle v \rangle$ denotes the subspace differentially generated by a single vector $v \in V$.

Proposition 7.4 : When $1 \leq m \leq n$ and $W \subset V$ is an $m$-dimensional $D$-space and $v \in W$ the following statements are equivalent:

(a) $\langle v \rangle = W$;

(b) $\{v, Dv, \dots, D^{m-1}v\}$ is a basis of $W$.

Proof :
(a) ⇒ (b) : If (b) is false there must be scalars $k_j \in K$, not all zero, such that $\sum_{j=0}^{m-1} k_j D^j v = 0$. Letting $p$ denote the maximal $j \leq m-1$ satisfying $k_j \neq 0$ we can write this dependence relation in the form $D^p v = \sum_{j=0}^{p-1} \hat k_j D^j v$, from which we easily see that $D^{p+s} v$ must be in the span of $\{v, Dv, \dots, D^{p-1}v\}$ for all $s \geq 0$. We conclude that $\dim_K(\langle v \rangle) \leq p < m = \dim_K(W)$, hence that $\langle v \rangle \neq W$, and we have a contradiction.
(b) ⇒ (a) : Obvious. q.e.d.

Corollary 7.5 : A vector $v \in V$ is cyclic if and only if $V = \langle v \rangle$.

Our proof of the existence of cyclic vectors requires four lemmas.

Lemma 7.6 : Suppose $w, v \in V$ are non-zero vectors satisfying both $V = \langle w, v \rangle$ and $\langle v \rangle \neq V$. Then :

(a) $V = \langle w \rangle + \langle v \rangle$ ;

(b) $\langle w \rangle \cap \langle v \rangle = \{0\}$ if and only if $\dim_K(V) = \dim_K(\langle v \rangle) + \dim_K(\langle w \rangle)$; and

(c) the inclusion $\langle w \rangle \cap \langle v \rangle \subset \langle w \rangle$ is proper.

Proof :
(a) : $\langle w \rangle + \langle v \rangle$ is a differential subspace of $V$ containing the set $\{w, v\}$, and $\langle w, v \rangle = V$ is the intersection of all such subspaces. The equality follows.
(b) and (c) : By (a) (and elementary linear algebra) we have
$$\dim_K(V) = \dim_K(\langle w \rangle) + \dim_K(\langle v \rangle) - \dim_K(\langle w \rangle \cap \langle v \rangle)\,.$$
Assertion (b) follows immediately, and if (c) fails the equality reduces to $\dim_K(V) = \dim_K(\langle v \rangle)$, contradicting $V \neq \langle v \rangle$. q.e.d.

Lemma 7.7 : Suppose $K_C$ is algebraically closed, of characteristic 0, and the inclusion $K_C \subset K$ is proper. Let $W \subset V$ be a nontrivial differential subspace of $V$, let $1 \leq r \in \mathbf{Z}$, and let $w_0, w_1, \dots, w_r \in W$ be subject only to the restriction $w_r \neq 0$. Then there is an element $\ell \in K$ such that $\sum_{j=0}^{r} \ell^{(j)} w_j \neq 0$.

Without the characteristic 0 assumption the proper inclusion hypothesis on KC ⊂ K can fail, e.g., when K is a finite field one has KC = K.

Proof : Choose any $k \in K \setminus K_C$. It is not difficult to see that when $k$ is algebraic over $K_C$ one has $k \in K_C$, contrary to hypothesis, and we conclude that $k$ must be transcendental over $K_C$. In particular, the collection $\{1, k, \dots, k^r\}$ must be linearly independent over $K_C$.

Fix a basis $(e_1, e_2, \dots, e_n)$ of $V$, write $w_j = \sum_{i=1}^{n} a_{ij} e_i$, and choose $\ell \in K$ at random. Then from
$$\sum_{j=0}^{r} \ell^{(j)} w_j = \sum_j \ell^{(j)} \sum_i a_{ij} e_i = \sum_i \Bigl( \sum_j a_{ij} \ell^{(j)} \Bigr) e_i$$
we see that
$$\sum_{j=0}^{r} \ell^{(j)} w_j = 0 \iff \sum_j a_{ij} \ell^{(j)} = 0 \ \text{ for } i = 1, \dots, n\,.$$
In other words, $\sum_{j=0}^{r} \ell^{(j)} w_j = 0$ if and only if $\ell$ is a solution of the system
$$\begin{aligned}
a_{1r}\, y^{(r)} + a_{1,r-1}\, y^{(r-1)} + \cdots + a_{10}\, y &= 0 \\
a_{2r}\, y^{(r)} + a_{2,r-1}\, y^{(r-1)} + \cdots + a_{20}\, y &= 0 \\
\vdots \qquad\qquad & \\
a_{nr}\, y^{(r)} + a_{n,r-1}\, y^{(r-1)} + \cdots + a_{n0}\, y &= 0\,.
\end{aligned}$$
But from Corollary 4.3 we know that the solution space of any one (and therefore all) of these $r^{th}$-order homogeneous linear differential equations has $K_C$-dimension at most $r$. From the previous paragraph we conclude that for any $k \in K \setminus K_C$ at least one of $1, k, \dots, k^r$ cannot be a solution, and the lemma is thereby established. q.e.d.
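The counting step at the heart of the proof, namely that at most $r$ of the $r+1$ powers $1, k, \dots, k^r$ can solve an $r$-th order homogeneous linear equation, is easy to illustrate for $K = \mathbf{C}(x)$ and $k = x$. A sketch in sympy; the test equation $y'' = 0$ is our own choice:

```python
import sympy as sp

x = sp.symbols('x')
# Among the powers x**j, only those with j < 2 solve the 2nd-order
# equation y'' = 0: at most r = 2 of them, as the proof requires.
solving_powers = [j for j in range(6) if sp.diff(x**j, x, 2) == 0]
assert solving_powers == [0, 1]
```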

Lemma 7.8 : Assume $K_C$ is algebraically closed of characteristic 0 and the inclusion $K_C \subset K$ is proper. Suppose $w, v \in V$ are non-zero and satisfy $\langle w \rangle \cap \langle v \rangle = \{0\}$. Then there is an element $\ell \in K$ with the property that for $\hat v := v + \ell w$ one has

(a) $\langle w, \hat v \rangle = \langle w, v \rangle$ and

(b) $\dim_K(\langle \hat v \rangle) > \dim_K(\langle v \rangle)$.

Proof :
(a) The equality holds for any $\ell \in K$.
(b) Let $r := \dim_K(\langle v \rangle)$. Then $\{v, Dv, \dots, D^{r-1}v\}$ is a basis of this space (by Proposition 7.4(b)) and the collection $\{v, Dv, \dots, D^{r-1}v, D^r v\}$ is therefore linearly dependent over $K$. In particular we can find scalars $a_0, \dots, a_r \in K$, where w.l.o.g. $a_r = 1$, such that $\sum_{j=0}^{r} a_j D^j v = 0$. Now set
$$w_j := \sum_{i=j}^{r} \binom{i}{j} a_i D^{i-j} w\,, \qquad j = 0, 1, \dots, r\,,$$
and note that $w_r = w \neq 0$. It follows from Lemma 7.7 that we can find an element $\ell \in K$ such that

(i) $\displaystyle 0 \neq \sum_{j=0}^{r} \ell^{(j)} w_j = \sum_{j=0}^{r} \ell^{(j)} \sum_{i=j}^{r} \binom{i}{j} a_i D^{i-j} w\,.$

Define $\hat v := v + \ell w$.

Suppose (b) is false. Then there is an integer $1 \leq s \leq r$, and elements $b_i \in K$ for $i = 0, \dots, s$, where w.l.o.g. $b_s = 1$, such that $\sum_{i=0}^{s} b_i D^i \hat v = 0$, i.e., such that
$$0 = \sum_{i=0}^{s} b_i D^i \hat v = \sum_{i=0}^{s} b_i D^i v + \sum_{i=0}^{s} b_i \sum_{j=0}^{i} \binom{i}{j} \ell^{(j)} D^{i-j} w\,,$$
where in computing the final term we have used (7.1). The first term on the right is in $\langle v \rangle$ while the second is in $\langle w \rangle$, and since $\langle w \rangle \cap \langle v \rangle = \{0\}$ it follows that both must vanish. Immediate consequences of this vanishing are: $s = r$ (because $\sum_{i=0}^{s} b_i D^i v = 0$ and $b_s = 1$); $a_i = b_i$ for $i = 0, 1, \dots, r$; and
$$0 = \sum_{i=0}^{r} a_i \sum_{j=0}^{i} \binom{i}{j} \ell^{(j)} D^{i-j} w = \sum_{j=0}^{r} \ell^{(j)} \sum_{i=j}^{r} \binom{i}{j} a_i D^{i-j} w = \sum_{j=0}^{r} \ell^{(j)} w_j\,.$$
This contradicts (i), and (b) is thereby established. q.e.d.

When $W \subset V$ is a $D$-subspace the quotient space $V/W$ inherits a differential structure in the expected way, i.e., $[v] := v + W \mapsto [Dv] := Dv + W$. This “quotient differential structure” is useful for induction arguments, such as the proof of the following result.

Lemma 7.9 : Suppose $n = \dim_K(V) > 1$ and every differential $K$-space of lower positive dimension admits a cyclic vector. Then for any $0 \neq w \in V$ there is a $v \in V$ such that $\langle w, v \rangle = V$.

Proof : If $\langle w \rangle = V$ take $v = 0$; if not, give $V/\langle w \rangle$ the quotient differential structure and let $\pi : V \to V/\langle w \rangle$ denote the quotient mapping. By assumption $V/\langle w \rangle$ contains a cyclic vector $[\hat v]$, and for any $v \in \pi^{-1}([\hat v])$ we then have $V = \langle w, v \rangle$. q.e.d.

Theorem 7.10 (The Cyclic Vector Theorem) : Suppose $K_C$ is algebraically closed of characteristic 0 and the inclusion $K_C \subset K$ is proper. Then $V$ contains a cyclic vector.

Proof : We argue by induction on $n = \dim_K(V)$, omitting the trivial case $n = 1$. We assume $n > 1$, and that the result holds for all nontrivial differential $K$-spaces of dimension strictly less than $n$.

Choose $0 \neq w_1 \in V$ at random. If $\langle w_1 \rangle = V$ we are done; otherwise we invoke Lemma 7.9 (and the induction hypothesis) to guarantee the existence of a vector $v_1 \in V$ such that $\langle w_1, v_1 \rangle = V$. We may assume that

(i) $\dim_K(\langle v_1 \rangle) > \dim_K(V) - \dim_K(\langle w_1 \rangle)$,

and (as a consequence) that

(ii) $\langle w_1 \rangle \cap \langle v_1 \rangle \neq \{0\}$.

Indeed, if (i) fails then $\langle w_1 \rangle \cap \langle v_1 \rangle = \{0\}$ by Lemma 7.6(b), and we can use Lemma 7.8 to replace $v_1$ with a vector $\hat v_1$ such that $\langle w_1, \hat v_1 \rangle = \langle w_1, v_1 \rangle = V$ and $\dim_K(\langle \hat v_1 \rangle) > \dim_K(\langle v_1 \rangle)$. Inequality (i) is then evident from $\dim_K(\langle v_1 \rangle) = \dim_K(V) - \dim_K(\langle w_1 \rangle)$.

If $\langle v_1 \rangle = V$ we are done. Otherwise we use (ii) to find some $0 \neq w_2 \in \langle w_1 \rangle \cap \langle v_1 \rangle$, noting from Lemma 7.6(c) that $\dim_K(\langle w_2 \rangle) < \dim_K(\langle w_1 \rangle)$ (which implies that $w_2$ cannot be cyclic). Repeating the argument of the previous two paragraphs with $w_2$ replacing $w_1$ we then produce a vector $v_2 \in V$ such that $V = \langle w_2, v_2 \rangle$,
$$\dim_K(\langle v_2 \rangle) > \dim_K(V) - \dim_K(\langle w_2 \rangle)\,, \quad \text{and} \quad \langle w_2 \rangle \cap \langle v_2 \rangle \neq \{0\}\,.$$

If $\langle v_2 \rangle = V$ we are done; otherwise we repeat the construction once again, etc. The result is a sequence of subspaces $\langle w_j \rangle$ with strictly decreasing dimensions and a sequence of vectors $v_j$ satisfying
$$\dim_K(\langle v_j \rangle) > \dim_K(V) - \dim_K(\langle w_j \rangle)\,.$$

But this inequality is impossible if we reach $\dim_K(\langle w_j \rangle) = 0$ (because $\langle v_j \rangle \subset V$), and we conclude that the iteration terminates after finitely many steps. Since the only requirement for continuing is that $v_j$ not be cyclic, this proves the theorem. q.e.d.

8. Extensions of Differential Structures

Here $K$ is a differential field with derivation $k \mapsto k'$, $V$ is a $K$-space (i.e., a vector space over $K$) of dimension $n$, and $D : V \to V$ is a differential structure. Primes will also be used to indicate the derivation on any differential extension field $L \supset K$.

Recall⁷ that when $L \supset K$ is an extension field of $K$ (not necessarily differential) the tensor product $L \otimes_K V$ over $K$ admits an $L$-space structure characterized by $\ell \cdot (m \otimes_K v) = (\ell m) \otimes_K v$. This structure will always be assumed. By means of the $K$-embedding

(8.1) $v \in V \mapsto 1 \otimes_K v \in L \otimes_K V$

one views $V$ as a $K$-subspace of $L \otimes_K V$ when the latter is considered as a $K$-space. In particular, any (ordered) basis $e$ of $V$ can be regarded as a subset of $L \otimes_K V$.

Proposition 8.2 (“Extension of the Base”) : Assuming the notation of the previous paragraph, any basis of the $K$-space $V$ is also a basis of the $L$-space $L \otimes_K V$. In particular,

(i) $\dim_K V = \dim_L(L \otimes_K V)\,.$

Proof : See, e.g., [Lang, Chapter XVI, §4, Proposition 4.1, p. 623]. q.e.d.

Proposition 8.3 : Suppose $W, \hat V$ and $\hat W$ are finite-dimensional⁸ $K$-spaces and $T : V \to W$ and $\hat T : \hat V \to \hat W$ are $K$-linear mappings. Then there is a $K$-linear mapping $T \otimes_K \hat T : V \otimes_K \hat V \to W \otimes_K \hat W$ characterized by

(i) $(T \otimes_K \hat T)(v \otimes_K \hat v) = Tv \otimes_K \hat T\hat v\,, \qquad v \otimes_K \hat v \in V \otimes_K \hat V\,.$

Proof⁹ : There is a standard characterization of the tensor product $V \otimes_K \hat V$ in terms of $K$-bilinear mappings of $V \times \hat V$ into $K$-spaces $Y$, e.g., see [Lang, Chapter XVI, §1, p. 602]. The proposition results from considering the $K$-bilinear mapping $(v, \hat v) \in V \times \hat V \mapsto Tv \otimes_K \hat T\hat v \in W \otimes_K \hat W$. q.e.d.

⁷As a general reference for the remarks in this paragraph see, e.g., [Lang, Chapter XVI, §4, pp. 623-4, particularly Example 2]. Except for references to bases, most of what we say does not require $V$ to be finite-dimensional.
⁸The finite-dimensional hypothesis is not needed; it is assumed only because this is a standing hypothesis for $V$.
⁹We offer only a quick sketch. The result is more important for our purposes than a formal proof, and filling in all the details would lead us too far afield.

Proposition 8.4 : To any differential field extension $L \supset K$ there corresponds a unique differential structure $D_L : L \otimes_K V \to L \otimes_K V$ extending $D : V \to V$, and this structure is characterized by the property

(i) $D_L(\ell \otimes_K v) = \ell\,' \otimes_K v + \ell \otimes_K Dv\,, \qquad \ell \otimes_K v \in L \otimes_K V\,.$

Recall from (8.1) that we are viewing $V$ as a $K$-subspace of $L \otimes_K V$ by identifying $V$ with its image under the embedding $v \mapsto 1 \otimes_K v$. Assuming (i) we have $D_L(1 \otimes_K v) = 1 \otimes_K Dv \simeq Dv$ for any $v \in V$, and this is the meaning of $D_L$ “extending” $D$.

In the proof we denote the derivation $\ell \mapsto \ell\,'$ by $\delta : L \to L$, and we also write the restriction $\delta|_K$ as $\delta$. One is tempted to prove the proposition by invoking Proposition 8.3 so as to define mappings $\delta \otimes_K \mathrm{id}_V : L \otimes_K V \to L \otimes_K V$ and $\mathrm{id}_L \otimes_K D : L \otimes_K V \to L \otimes_K V$ and to then set $D_L := \delta \otimes_K \mathrm{id}_V + \mathrm{id}_L \otimes_K D$. Unfortunately, Proposition 8.3 does not apply since $D$ is not $K$-linear.

Proof¹⁰ : The way around the problem is to first use the fact that $D$ is $K_C$-linear; one can then conclude from Proposition 8.3 (with $K$ replaced by $K_C$) that a $K_C$-linear mapping $\hat D : L \otimes_{K_C} V \to L \otimes_{K_C} V$ is defined by

(ii) $\hat D := \delta \otimes_{K_C} \mathrm{id}_V + \mathrm{id}_L \otimes_{K_C} D\,.$

The next step is to define $Y \subset L \otimes_{K_C} V$ to be the $K_C$-subspace generated by all vectors of the form $\ell k \otimes_{K_C} v - \ell \otimes_{K_C} kv$, where $\ell \in L$, $k \in K$ and $v \in V$. Then from the calculation
$$\begin{aligned}
\hat D(\ell k \otimes_{K_C} v - \ell \otimes_{K_C} kv)
&= \delta(\ell k) \otimes_{K_C} v + \ell k \otimes_{K_C} Dv - \delta(\ell) \otimes_{K_C} kv - \ell \otimes_{K_C} D(kv) \\
&= \ell\,'k \otimes_{K_C} v + \ell k\,' \otimes_{K_C} v + \ell k \otimes_{K_C} Dv - \ell\,' \otimes_{K_C} kv - \ell \otimes_{K_C} (k\,'v + kDv) \\
&= \bigl( \ell k\,' \otimes_{K_C} v - \ell \otimes_{K_C} k\,'v \bigr) + \bigl( \ell\,'k \otimes_{K_C} v - \ell\,' \otimes_{K_C} kv \bigr) + \bigl( \ell k \otimes_{K_C} Dv - \ell \otimes_{K_C} kDv \bigr)
\end{aligned}$$

¹⁰Footnote 9 applies here also.

we see that $Y$ is $\hat D$-invariant, and $\hat D$ therefore induces a $K_C$-linear mapping $\tilde D : (L \otimes_{K_C} V)/Y \to (L \otimes_{K_C} V)/Y$ which by (ii) satisfies

(iii) $\tilde D([\ell \otimes_{K_C} v]) = [\ell\,' \otimes_{K_C} v] + [\ell \otimes_{K_C} Dv]\,,$

where the bracket [ ] denotes the equivalence class (i.e., coset) of the accompanying element.

Now observe that when $L \otimes_{K_C} V$ is viewed as an $L$-space (resp. $K$-space), $Y$ becomes an $L$-subspace (resp. a $K$-subspace), and it follows from (iii) that $\tilde D$ is a differential structure when the $L$-space (resp. $K$-space) structure is assumed.

In view of the $K$-space structure on $(L \otimes_{K_C} V)/Y$ the $K$-bilinear mapping¹¹

(iv) $(\ell, v) \mapsto [\ell \otimes_{K_C} v]$

induces a $K$-linear mapping $T : L \otimes_K V \to (L \otimes_{K_C} V)/Y$ which one verifies to be a $K$-isomorphism. It then follows from (iii) and (iv) that the mapping $D_L := T^{-1} \circ \tilde D \circ T : L \otimes_K V \to L \otimes_K V$ satisfies (i), and it follows that $D_L$ is a differential structure on the $L$-space $L \otimes_K V$.

As for uniqueness, suppose $\check D : L \otimes_K V \to L \otimes_K V$ is any differential structure extending $D$, i.e., having the property
$$\check D(1 \otimes_K v) = 1 \otimes_K Dv\,, \qquad v \in V\,.$$

Then for any $\ell \otimes_K v \in L \otimes_K V$ one has
$$\begin{aligned}
\check D(\ell \otimes_K v) &= \check D(\ell \cdot (1 \otimes_K v)) \\
&= \ell\,' \cdot (1 \otimes_K v) + \ell \cdot \check D(1 \otimes_K v) \\
&= \ell\,' \otimes_K v + \ell \cdot (1 \otimes_K Dv) \\
&= \ell\,' \otimes_K v + \ell \otimes_K Dv \\
&= D_L(\ell \otimes_K v)\,,
\end{aligned}$$
hence $\check D = D_L$. q.e.d.

When considered over the differential field $\mathbf{C}(z) = (\mathbf{C}(z), \frac{d}{dz})$ the linear differential equation $y'' = y$ has only the trivial solution, but if we “allow solutions” from the differential field extension $\mathbf{C}(z)(\exp(z)) = (\mathbf{C}(z)(\exp(z)), \frac{d}{dz})$ this is no longer the case. At the conceptual level, “allowing solutions from a differential field extension” simply means considering extensions of the given differential structure. However, at the computational level these extensions play no role, as one sees from the following result.

¹¹Which we note is not $L$-bilinear, since for $v \in V$ the product $\ell v$ is only defined when $\ell \in K$.

Proposition 8.5 : Suppose $e$ is a basis of $V$ and

(i) $x' + Bx = 0$

is the defining e-equation for D. Let L ⊃ K be a differential field extension and consider e as a basis for the L-space L ⊗K V . Then the defining e-equation for the extended differential structure DL : L ⊗K V → L ⊗K V is also (i).

Proof : Since $D_L$ extends $D$ the $e$-matrices of these two differential structures are the same. q.e.d.

A differential field extension $L \supset K$ has no new constants if $L_C = K_C$. (Note that $L_C \supset K_C$ is automatic.)

Proposition 8.6 : Suppose $e$ and $\hat e$ are two bases of $V$ and

(i) $x' + Bx = 0$ and

(ii) $x' + Ax = 0$

are the defining $e$- and $\hat e$-equations of $D$ respectively. Let $L \supset K$ be a no new constants differential field extension in which each equation admits a fundamental matrix solution. Then the field extensions of $K$ generated by the entries of these fundamental matrix solutions are the same.

Proof : In view of the discussion surrounding (2.14) we can assume $A$ and $B$ are related by $A = PBP^{-1} - P'P^{-1}$, where $P \in \mathrm{GL}(n, K)$. Let $M, N \in \mathrm{GL}(n, L)$ be fundamental matrix solutions of (i) and (ii) respectively and set $\hat M := PM$. Then from
$$\hat M\,' = (PM)' = PM' + P'M = P(-BM) + P'M = -PBP^{-1}\hat M + P'P^{-1}\hat M$$
we see that
$$0 = \hat M\,' + (PBP^{-1} - P'P^{-1})\hat M = \hat M\,' + A\hat M\,,$$
and we conclude that $\hat M \in \mathrm{GL}(n, L)$ is also a fundamental matrix solution of (ii). By Proposition 5.5 we have $PM = \hat M = NC$ for some $C \in \mathrm{GL}(n, L_C)$, and we can therefore write

(iii) $M = P^{-1}NC\,.$

The entries of $P^{-1}$ are in $K$, and by the no new constants hypothesis the same holds for the entries of $C$. The result follows. q.e.d.
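The key identity in the proof, namely that $\hat M := PM$ satisfies the gauge-transformed equation whenever $A = PBP^{-1} - P'P^{-1}$ and $M' + BM = 0$, is purely formal, and can be confirmed for $n = 2$ with generic matrices of undetermined functions. A sympy sketch (all names here are our own):

```python
import sympy as sp

x = sp.symbols('x')
n = 2

def fmat(name):
    # generic n x n matrix of undetermined functions of x
    return sp.Matrix(n, n, lambda i, j: sp.Function(f'{name}{i}{j}')(x))

P, B, M = fmat('p'), fmat('b'), fmat('m')
A = P*B*P.inv() - P.diff(x)*P.inv()

# (PM)' + A(PM) = P(M' + BM), so M' + BM = 0 forces (PM)' + A(PM) = 0
lhs = (P*M).diff(x) + A*(P*M)
rhs = P*(M.diff(x) + B*M)
diffmat = (lhs - rhs).applyfunc(sp.cancel)
assert diffmat == sp.zeros(n, n)
```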

9. The Differential Galois Group

Here $K$ is a differential field and $(V,D)$ is a differential $K$-module. We assume $\dim_K V = n < \infty$.

A Picard-Vessiot extension for (V,D) is a differential field extension L ⊃ K satisfying the following conditions:

(a) the extension has no new constants;

(b) the $L$-space $L \otimes_K V$ admits a basis consisting of horizontal vectors of $D_L$; and

(c) when $M \supset K$ is any other differential extension satisfying (a) and (b) there is a differential field embedding $\varphi : L \to M$ over $K$, i.e., a field embedding over $K$ satisfying

(9.1) $\varphi \circ \delta_L = \delta_M \circ \varphi\,.$

The intuitive idea behind (c) is that $L \supset K$ is “minimal” among differential extensions $M \supset K$ satisfying properties (a) and (b): any other such extension is “at least as big” in the sense that it must contain an isomorphic copy of $L$.

The differential Galois group of $(V,D)$ corresponding to an associated Picard-Vessiot extension $L \supset K$ is the group $G_L$ of automorphisms of $L$ over $K$ which commute with the derivation $\delta_L$ on $L$. This group obviously depends on $L$, but, as we will see, only up to isomorphism.

We need a few preliminaries. Define a fundamental matrix solution of $(V,D)$ in $\mathrm{GL}(n, L)$ to be any fundamental matrix solution of any defining equation of $D_L$.

Proposition 9.2 : Condition (b) is equivalent to the existence of a fundamental matrix solution for (L ⊗K V,DL) in GL(n, L).

Proof : By Corollary 5.4.

I thank J. Kovacic for the next observation.

Proposition 9.3 : When $L \supset K$ is a Picard-Vessiot extension for $(V,D)$ the following equivalent conditions hold.

(c1) If L ⊃ M ⊃ K is an intermediate differential field, and if the differential field extension M ⊃ K also satisfies (a) and (b), then there is a differential field embedding φ : L → M over K.

(c2) For any fundamental matrix solution Z ∈ GL(n, L) for (V,D) one has L = K(Z), i.e., L is generated by the entries of Z.

(c3) If L ⊃ M ⊃ K is an intermediate differential field, and if the differential field extension M ⊃ K also satisfies (a) and (b), then M = L.

Proof : Condition (c1) is immediate from (c).

(c1) ⇒ (c2) : Set M := K(Z) ⊂ L. We must prove that L ⊂ M. Since the extension L ⊃ K has no new constants the same is true of M ⊃ K, and by construction M contains the fundamental solution matrix Z of (V,D). It follows from Proposition 9.2 that conditions (a) and (b) in the definition of a Picard-Vessiot extension are satisfied. By (c) there is a differential embedding φ : L → K(Z), and from (9.1) one sees that φ(Z) is also a fundamental matrix solution for (V,D). From Proposition 5.5 we conclude that ZC = φ(Z) for some C ∈ GL(n, LC ) = GL(n, KC ), the last equality by (a). Choose any ` ∈ L. Then φ(`) ∈ M = K(Z), and we can therefore write

$$\varphi(\ell) = \frac{p(Z)}{q(Z)}\,, \qquad \text{where } p, q \in K[X_{ij}] \text{ and } q(Z) \neq 0\,.$$

Consider the elements p(ZC−1) and q(ZC−1) of M. Since φ is an embedding over K and C ∈ GL(n, KC ) ⊂ GL(n, K) we have

$$\varphi(p(ZC^{-1})) = p(\varphi(ZC^{-1})) = p(\varphi(Z)\varphi(C^{-1})) = p(\varphi(Z)C^{-1}) = p(Z)$$

and, similarly, $\varphi(q(ZC^{-1})) = q(Z)$. Note from this last identity and $q(Z) \neq 0$ that $q(ZC^{-1}) \neq 0$. Since $\varphi$ is injective we see from
$$\varphi\Bigl( \ell - \frac{p(ZC^{-1})}{q(ZC^{-1})} \Bigr) = \varphi(\ell) - \frac{p(Z)}{q(Z)} = 0$$
that $\ell = \dfrac{p(ZC^{-1})}{q(ZC^{-1})} \in M$, and $L \subset M$ follows.

(c2) ⇒ (c3) : If $W \in \mathrm{GL}(n, M)$ is a fundamental matrix solution for $(V,D)$ then $M = K(W)$ by (c2). However, $W$ may also be considered as a fundamental matrix solution of $(V,D)$ in $\mathrm{GL}(n, L)$, and, as above, we therefore have $W = ZC$ for some $C \in \mathrm{GL}(n, K_C)$. The equalities $M = K(W) = K(ZC) = K(Z) = L$ follow.

(c3) ⇒ (c1) : Take φ := idL. q.e.d.

Corollary 9.4 : When $L \supset K$ is a Picard-Vessiot extension for $(V,D)$ any differential field embedding $\varphi : L \to L$ over $K$ is an automorphism.

Proof : φ(L) ⊂ L satisfies (a) and (b), hence φ(L) = L by (c3). q.e.d.

Theorem 9.5 : Suppose L ⊃ K and M ⊃ K are Picard-Vessiot extensions for (V,D). Then there is a differential field isomorphism φ : L → M over K, and the associated differential Galois groups are isomorphic.

The mapping φ is not unique, e.g., for any g ∈ GM the composition g ◦ φ : L → M is another differential field isomorphism over K. Nevertheless, the result will be used to justify reference to “the” Picard-Vessiot extension of a differential K-module (V,D). A similar convention is used with differential Galois groups: one refers to “the” differential Galois group of (V,D).

Proof : By definition we have differential embeddings $L \rightleftarrows M$ over $K$, which by Corollary 9.4 must be isomorphisms. If we let $\varphi : L \to M$ denote the upper arrow we obtain an isomorphism $\sigma : G_L \to G_M$ between the differential Galois groups by assigning $g \in G_L$ to $\varphi \circ g \circ \varphi^{-1} \in G_M$; the inverse is given by $h \in G_M \mapsto \varphi^{-1} \circ h \circ \varphi \in G_L$. q.e.d.

Picard-Vessiot extensions and differential Galois groups have traditionally been associated with homogeneous linear ordinary differential equations, i.e., with what we are viewing as basis representations of differential modules. Readers familiar with that approach will see immediately from Proposition 9.3(c2) that the extensions and groups we have defined for a differential module agree with the traditional definitions for any defining equation of that structure.

One question we have not addressed is existence, i.e., does a Picard-Vessiot extension exist for any differential structure? In general the answer is no: in the standard existence proof one needs characteristic zero and algebraically closed hypotheses on $K_C$.

Bibliography

[C-K] R.C. Churchill and J. Kovacic, Cyclic Vectors, in Differential Algebra and Related Topics, (Li Guo, P. Cassidy, W. Keigher and W. Sit, eds.), World Scientific, Singapore, 2002.

[C-L] E.A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill, New York, 1955.

[Lang] S. Lang, Algebra, Revised Third Edition, Springer-Verlag, New York, 2002.

[Poole] E.G.C. Poole, Introduction to the Theory of Linear Differential Equations, Dover Publications, New York, 1960.

[vdP-S] M. van der Put and M.F. Singer, Galois Theory of Linear Differential Equations, Springer-Verlag, Berlin, 2003.

R.C. Churchill Department of Mathematics Hunter College and the Graduate Center, CUNY, and the University of Calgary e-mail [email protected]
