
Sturm’s Theorem: determining the number of zeroes of polynomials in an open interval.

Bachelor’s thesis

Eric Spreen, University of Groningen

July 12, 2014

Supervisors: Prof. Dr. J. Top University of Groningen

Dr. R. Dyer University of Groningen

Abstract

A review of the theory of rings and extension fields is presented, followed by an introduction to ordered, formally real, and real closed fields. This theory is then used to prove Sturm’s Theorem, a classical result that enables one to find the number of roots of a polynomial that are contained within an open interval, simply by counting the number of sign changes in two sequences. This result can be extended to decide the existence of a root of a family of polynomials, by evaluating a set of polynomial equations, inequations and inequalities with integer coefficients.

Contents

1 Introduction

2 Polynomials and Extensions
2.1 Polynomial rings
2.2 Degree arithmetic
2.3 Euclidean division algorithm
2.3.1 Polynomial factors
2.4 Field extensions
2.4.1 Simple Field Extensions
2.4.2 Dimensionality of an Extension
2.4.3 Splitting Fields
2.4.4 Galois Theory

3 Real Closed Fields
3.1 Ordered and Formally Real Fields
3.2 Real Closed Fields
3.3 The Intermediate Value Theorem

4 Sturm’s Theorem
4.1 Variations in sign
4.2 Systems of equations, inequations and inequalities
4.3 Sturm’s Theorem Parametrized
4.3.1 Tarski’s Principle

Chapter 1

Introduction

In many of the natural sciences, polynomials and polynomial systems occur as useful approximations to real-world phenomena. As an example of this we give the harmonic oscillator, which is used in physics to approximate dynamical systems that are very close to an equilibrium point. The potential energy of such a (one-dimensional) system takes the form
\[ V(x) = \frac{1}{2} k x^2, \]
which is a second-degree polynomial in one variable. A similar case occurs in higher-dimensional systems (in 3D, and with multiple bodies). Another example is the hydrogen atom in quantum mechanics, where the radial wavefunctions take on the form [3]:
\[ R_{nl}(r) = \frac{1}{r} e^{-\rho} \rho^{l+1} v(\rho), \qquad \rho = \frac{r}{an}, \]
where $v(\rho)$ is a polynomial, $n \in \mathbb{N}$ (see footnote 1), $n > 0$ and $l \in \mathbb{N}$. It is a crucial problem to find the zeroes of these functions in order to determine the electronic structure of the atom, which can be done by finding the zeroes of the polynomial $\rho^{l+1} v(\rho)$.

It is clear from these examples that finding solutions of polynomial equations is a fundamental problem in applied mathematics. A classical result that enables us to do this numerically is Sturm’s Theorem, named after Jacques Charles François Sturm. This theorem gives the number of zeroes of a polynomial that are contained within a certain open interval, enabling us to determine the zeroes of a polynomial numerically (by partitioning the number line appropriately, up to machine precision).

We will discuss Sturm’s Theorem in the context of real closed fields, an abstraction of the real number system that has significantly different realizations. The key to the success of Sturm’s Theorem in real closed fields is the analog of the intermediate value theorem for polynomials. It can be shown that this result, and various other key theorems from real analysis, hold for polynomials in real closed fields.

After the discussion of Sturm’s Theorem, we will discuss an extension of Sturm’s Theorem that allows us to simplify the problem of the existence of a zero in a certain interval for a whole family of polynomials. The result will be a finite set of systems of polynomial equations, inequations and inequalities with integer coefficients, at least one of which must be satisfied by the parameters of the family for the resulting polynomial to have a zero in the interval. From this we can quickly establish criteria for the existence of a zero of a whole family of polynomials. A secondary result is that if a polynomial with rational coefficients has a zero in one real closed field, it will have a zero in every other real closed field.

As a high school student I often wondered whether it would be possible to form an equation of which the solvability is undecidable, in particular when I was unable to solve a particular problem. At the very end we will touch on this question.

A significant portion of this report follows [4] and [5]. If no citations have been provided, these are the sources. We will assume that the reader has a basic understanding of algebraic structures, such as monoids, groups and rings.

Footnote 1: We use the convention that $0 \in \mathbb{N}$.

Chapter 2

Polynomials and Extensions

Before we can begin our study of real closed fields, we will develop the theory of polynomial rings over a field to some extent. This chapter will give basic results on arbitrary polynomial rings, and applications to extension rings and fields. Most of this chapter will follow [4].

We will say that a subring $R$ of a ring $S$ is generated by a set $A \subset S$ if $R$ is the smallest subring that contains $A$. Also, if $u_1, \ldots, u_n \in S$ (and $\forall r \in R : u_i r = r u_i$, $1 \le i \le n$), then we denote the ring that is generated by $R \cup \{ u_1, \ldots, u_n \}$ by $R[u_1, \ldots, u_n]$. We can readily note that $R[u_1, u_2] = (R[u_1])[u_2]$ by definition. The existence of such a subring follows from the observation that an arbitrary intersection of subrings of $S$ is again a subring of $S$.

2.1 Polynomial rings

Definition 2.1.1. Given a ring $R$, its polynomial ring $R[x]$ is the ring of functions $f : \mathbb{N} \to R$ such that there exists a $k \in \mathbb{N}$ with $\forall n \in \mathbb{N} : n \ge k \implies f(n) = 0$. Addition and multiplication in $R[x]$ are defined as follows: for $f(x), g(x) \in R[x]$ and any $n \in \mathbb{N}$,
\[ (f + g)(n) = f(n) + g(n), \qquad (fg)(n) = \sum_{i=0}^{n} f(i)\, g(n - i), \]
with $0$ and $1$ defined in the obvious way. The elements of $R[x]$ are called polynomials with coefficients in $R$, and the values of a polynomial are called the coefficients of the polynomial.
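Definition 2.1.1 can be made concrete with a minimal sketch, assuming integer coefficients and a Python list whose entry at index n plays the role of f(n). The helper names `trim`, `poly_add` and `poly_mul` are illustrative choices and do not come from the text.

```python
def trim(coeffs):
    """Drop trailing zeros, so the list stops at the last non-zero coefficient."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_add(f, g):
    """(f + g)(n) = f(n) + g(n), coefficient by coefficient."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def poly_mul(f, g):
    """(fg)(n) = sum over i of f(i) * g(n - i), i.e. a convolution."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


# (1 + x)(1 - x) = 1 - x^2 and (1 + x) + (1 - x) = 2
product = poly_mul([1, 1], [1, -1])
total = poly_add([1, 1], [1, -1])
```

The empty list stands for the zero polynomial, which matches the requirement that f(n) = 0 from some index k onwards.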

We will establish the basic properties of polynomial rings. The proofs will be omitted and can be found in [4, Sec. 2.10].

Proposition 2.1.1 (Properties of polynomial rings). Let $R$ be a ring and $R[x]$ its polynomial ring. Then:

1. There exists an injective homomorphism $R \to R[x]$, so that $R$ may be regarded as a subring of $R[x]$.

2. Let $x \in R[x]$ be the polynomial with $x(1) = 1$ and $x(n) = 0$ if $n \in \mathbb{N} \setminus \{ 1 \}$. Then $R[x]$ is generated by $R \cup \{ x \}$. Furthermore, all elements of $R$ commute with $x$.

3. For any $f(x) \in R[x]$, there exist an $n \in \mathbb{N}$ and unique $a_0, \ldots, a_n \in R$ such that $f(x) = \sum_{i=0}^{n} a_i x^i$.

4. If $R$ is commutative, then so is $R[x]$.

Proposition 2.1.2 (Evaluation homomorphism). If $R$ and $S$ are rings, $\varphi : R \to S$ is a homomorphism, $u \in S$ and $\forall r \in R : \varphi(r) u = u \varphi(r)$, then there exists a unique homomorphism $\psi : R[x] \to S$ such that $\psi|_R = \varphi$ and $\psi(x) = u$. Also, the kernel of $\psi$ is an ideal $I \subseteq R[x]$ such that $I \cap R = \ker(\varphi)$. This homomorphism is called the evaluation homomorphism in $u$.

Note: Evaluation in an overring. From the proposition above, it follows immediately that if we let $R$ be a subring of $S$ and $\varphi$ the inclusion homomorphism, the kernel of every evaluation homomorphism in an element of $S$ will be an ideal $I$ of $R[x]$ with $I \cap R = \{ 0 \}$.

We can note that if $R$ is a ring, then the following property is “universal” for the polynomial ring $R[x]$ (in the sense that any two rings that have this property are isomorphic): if $S$ is any other ring, $\varphi : R \to S$ is a homomorphism, $u \in S$ and $\forall r \in R : \varphi(r) u = u \varphi(r)$, then there exist an $x \in R[x]$ and a unique homomorphism $\psi : R[x] \to S$ so that $\psi|_R = \varphi$, $\psi(x) = u$ and $R[x]$ is generated by $R \cup \{ x \}$. [4, p.124]

Note: Notation of polynomials. From now on, we will denote a polynomial with coefficients in a ring $R$ as $f(x)$ and its value under an evaluation homomorphism in some $u$ as $f(u)$. Also, if $f(u) = 0$, then $u$ is called a zero of $f(x)$ in $S$.

Corollary 2.1.3. If $R$ and $S$ are rings, and $\varphi : R \to S$ is a homomorphism, then there exists a unique homomorphism $\psi : R[x] \to S[x']$ such that $\psi|_R = \varphi$ and $\psi(x) = x'$.

We will also use the notion of a polynomial in multiple indeterminates. We can formalize this notion by (for any $n \in \mathbb{N}$, $n > 1$) defining the ring $R[x_1, \ldots, x_n] := R[x_1] \ldots [x_n]$. By induction we can then get an evaluation homomorphism in multiple variables.

2.2 Degree arithmetic

We have seen that for any polynomial $0 \ne f(x) = \sum_{i=0}^{n} a_i x^i$ there is some $k \in \mathbb{N}$ such that $a_k \ne 0$, but $a_i = 0$ if $i > k$. This observation is a strong tool that we will use often in further arguments. We therefore define:

Definition 2.2.1. If $R$ is a ring and $0 \ne f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$, then the degree of $f(x)$ is the largest $k \in \mathbb{N}$ such that $a_k \ne 0$. If $f(x) = 0$, then the degree is $-\infty$. We will denote the degree of $f(x)$ by $\deg(f)$ or $\deg(f(x))$. Furthermore, the leading coefficient of $f(x)$ is $a_{\deg(f)}$ if $f(x) \ne 0$ and $0$ if $f(x) = 0$. This will be denoted by $\operatorname{lc}(f)$ or $\operatorname{lc}(f(x))$. A polynomial $f(x)$ will be called monic if $\operatorname{lc}(f) = 1$.

The following two lemmas can be proven quickly from the definition of the degree and by considering the leading coefficients of $f(x) + g(x)$ and $f(x)g(x)$ respectively.

Lemma 2.2.1. If $R$ is a ring, then for any $f(x), g(x) \in R[x]$: $\deg(f(x) + g(x)) \le \max(\deg(f), \deg(g))$.

Lemma 2.2.2. If $D$ is a domain, then $D[x]$ is also a domain, and for all $f(x), g(x) \in D[x]$ we have $\deg(fg) = \deg(f) + \deg(g)$ (see footnote 1). Also, the units of $D[x]$ will be the units of $D$.
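The degree rules can be checked on a small example. The following is a sketch under the assumption that polynomials over the integers are stored as coefficient lists (lowest degree first); the names `deg`, `lc`, `poly_mul` and `NEG_INF` are illustrative. Using the floating-point value negative infinity for the degree of the zero polynomial makes the convention that minus infinity plus a equals minus infinity hold automatically.

```python
NEG_INF = float("-inf")


def deg(f):
    """Largest k with f[k] != 0, or -infinity for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return NEG_INF


def lc(f):
    """Leading coefficient; 0 for the zero polynomial."""
    d = deg(f)
    return 0 if d == NEG_INF else f[d]


def poly_mul(f, g):
    """Product of two coefficient lists (may contain trailing zeros)."""
    out = [0] * max(len(f) + len(g) - 1, 0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [1, 0, 2]  # 1 + 2x^2
g = [3, 1]     # 3 + x
zero = []      # the zero polynomial

# Over the domain of integers, deg(fg) = deg(f) + deg(g), also when a factor is zero.
degree_rule_holds = deg(poly_mul(f, g)) == deg(f) + deg(g)
zero_rule_holds = deg(poly_mul(f, zero)) == deg(f) + deg(zero)
```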

2.3 Euclidean division algorithm

The proof of the following proposition will be given when we prove algorithm 1.

Proposition 2.3.1. Let $R$ be a commutative ring, and $f(x), g(x) \in R[x]$ with $g(x) \ne 0$, $m = \deg(g)$ and $b_m$ the leading coefficient of $g(x)$. Then there exist $k \in \mathbb{N}$, $q(x), r(x) \in R[x]$ such that:
\[ b_m^k f(x) = q(x) g(x) + r(x) \ \land\ \deg(r) < \deg(g). \qquad (2.1) \]

Corollary 2.3.2. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \ne 0$. Then there exist unique $q(x), r(x) \in F[x]$ such that:
\[ f(x) = q(x) g(x) + r(x) \ \land\ \deg(r) < \deg(g). \qquad (2.2) \]

Proof. By proposition 2.3.1 we can find some $k \in \mathbb{N}$ and $q(x), r(x) \in F[x]$ such that $b_m^k f(x) = q(x) g(x) + r(x)$ and $\deg(r) < \deg(g)$, where $b_m = \operatorname{lc}(g)$. Now since $g(x) \ne 0$, we have $b_m \ne 0$ and thus $f(x) = (b_m^{-k} q(x)) g(x) + (b_m^{-k} r(x))$. Also, since $F$ is a domain: $\deg(b_m^{-k} r(x)) = \deg(b_m^{-k}) + \deg(r(x)) = \deg(r(x)) < \deg(g(x))$.

Now let $q'(x), r'(x) \in F[x]$ also satisfy (2.2). Then $(q(x) - q'(x)) g(x) = r'(x) - r(x)$. Without loss of generality we may assume that $\deg(r) \ge \deg(r')$. It then follows that $\deg(g) > \deg(r) \ge \deg(r' - r) = \deg(q - q') + \deg(g)$ and this is only possible if $\deg(q - q') = -\infty$. Then $q(x) = q'(x)$ and thus $r'(x) = r(x)$.

Footnote 1: It is to be understood here that $-\infty + a = -\infty$ for any $a \in \mathbb{N} \cup \{ -\infty \}$.

Algorithm 1: Euclidean Division Algorithm

Let $R$ be a commutative ring and $f(x), g(x) \in R[x]$ with $f(x), g(x) \ne 0$. Also let $m = \deg(g) \in \mathbb{N}$ and $0 \ne b \in R$ the leading coefficient of $g(x)$. Define the following three coupled sequences:
\[ n_0 = \deg(f_0), \qquad f_0(x) = f(x), \qquad a_0 = \operatorname{lc}(f_0), \]
\[ n_{i+1} = \deg(f_{i+1}), \qquad f_{i+1}(x) = \begin{cases} b f_i(x) - a_i x^{n_i - m} g(x) & n_i \ge m \\ 0 & n_i < m \end{cases}, \qquad a_{i+1} = \operatorname{lc}(f_{i+1}). \]

Then there exists a $k \in \mathbb{N}$ such that $f_k(x) \ne 0$, $f_{k+1}(x) = 0$ and $n_k < m$. Also (see footnote 2):
\[ b^k f(x) = \left( \sum_{l=0}^{k-1} a_l b^{k-l-1} x^{n_l - m} \right) g(x) + f_k(x). \qquad (2.3) \]

Proof. Let $i \in \mathbb{N}$. We then see that $\deg(f_{i+1}) = \deg(b f_i(x) - a_i x^{n_i - m} g(x)) < \deg(f_i(x))$, since the leading coefficients of $b f_i(x)$ and $a_i x^{n_i - m} g(x)$ are both $a_i b$. This shows that the degree strictly decreases each step, and since $f_0(x) \ne 0$, there exists some $k \in \mathbb{N}$ such that $f_k(x) \ne 0$, $\deg(f_k) = n_k < m$ and thus $f_{k+1}(x) = 0$. We may see this $k$ as the terminal step of the algorithm, since from this point on only zero polynomials will be produced.

We will now prove the following: for any $i \in \mathbb{N}$ such that $i \le k$ we have
\[ b^i f(x) = f_i(x) + \left( \sum_{l=0}^{i-1} a_l b^{i-l-1} x^{n_l - m} \right) g(x). \]
For $i = 0$ this is clear, so pick $i \in \mathbb{N}$ with $0 < i \le k$ and assume this holds for $i - 1$. Then:
\[ \begin{aligned} b^i f(x) = b\, b^{i-1} f(x) &= b f_{i-1}(x) + b \left( \sum_{l=0}^{i-2} a_l b^{i-l-2} x^{n_l - m} \right) g(x) \\ &= f_i(x) + a_{i-1} x^{n_{i-1} - m} g(x) + \left( \sum_{l=0}^{i-2} a_l b^{i-l-1} x^{n_l - m} \right) g(x) \\ &= f_i(x) + \left( \sum_{l=0}^{i-1} a_l b^{i-l-1} x^{n_l - m} \right) g(x). \end{aligned} \]
This proves our claim. We can then set $i = k$ to obtain our final formula for $f(x)$, which concludes the proof and also proves proposition 2.3.1, since $\deg(f_k) = n_k < m = \deg(g)$.

Footnote 2: We understand here that if $k = 0$, the sum evaluates to $0$.
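The recursion of algorithm 1 can be sketched in a few lines, assuming integer coefficients stored lowest degree first. The function name `pseudo_divide` and the sentinel degree -1 for the zero polynomial are choices made for this illustration only; the quotient is accumulated with the update q ← b·q + a_i x^{n_i − m}, which reproduces the sum in (2.3).

```python
def deg(f):
    """Degree of a coefficient list; -1 is used as a sentinel for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return -1


def pseudo_divide(f, g):
    """Return (k, q, r) with b**k * f = q * g + r and deg(r) < deg(g), where b = lc(g)."""
    m = deg(g)
    if m < 0:
        raise ValueError("g must be non-zero")
    b = g[m]
    r = list(f)                       # plays the role of f_i
    q = [0] * max(len(f) - m, 1)      # accumulated quotient
    k = 0
    while deg(r) >= m:
        n = deg(r)
        a = r[n]
        # f_{i+1} = b * f_i - a_i * x^(n_i - m) * g
        r = [b * c for c in r]
        for j in range(m + 1):
            r[n - m + j] -= a * g[j]
        # q_{i+1} = b * q_i + a_i * x^(n_i - m)
        q = [b * c for c in q]
        q[n - m] += a
        k += 1
    return k, q, r


# f = x^3 + 1, g = 2x + 1: no division by 2 is needed, only multiplication by b = 2.
k, q, r = pseudo_divide([1, 0, 0, 1], [1, 2])
```

For this input the loop runs three times and yields 8(x^3 + 1) = (4x^2 − 2x + 1)(2x + 1) + 7, so no fractions are needed even though 2 is not a unit in the integers.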

2.3.1 Polynomial factors

The Euclidean division algorithm can be used to prove an array of useful facts. The first of these will concern factors of polynomials. Since we will almost exclusively be concerned with commutative rings from this point on, $R$ will denote a commutative ring in the rest of this chapter.

Definition 2.3.1. If $f(x), g(x) \in R[x]$, then $g(x)$ is a factor of $f(x)$ (denoted as $g(x) \mid f(x)$) if and only if there exists an $h(x) \in R[x]$ such that $f(x) = g(x) h(x)$. Also, a polynomial $f(x) \in R[x]$ of positive degree will be called reducible if there exist $g(x), h(x) \in R[x]$ of positive degree such that $f(x) = g(x) h(x)$. Otherwise, $f(x)$ will be called irreducible (see footnote 3).

The following two results characterize the zeroes of a polynomial. They will be used a couple of times in the next chapters.

Lemma 2.3.3. If $f(x) \in R[x]$ and $a \in R$, then there exists a unique $q(x) \in R[x]$ such that $f(x) = (x - a) q(x) + f(a)$.

Proof. By the Euclidean division algorithm we may pick $q(x), r(x) \in R[x]$ with $\deg(r) < \deg(x - a) = 1$ and $f(x) = (x - a) q(x) + r(x)$. We then immediately see that $f(a) = (a - a) q(a) + r(a) = r(a)$, and since $\deg(r) < 1$ we must have $r(x) = f(a)$. Also, since $r(x)$ is fixed in this way, if $q'(x)$ also satisfies $f(x) = (x - a) q'(x) + r(x)$, then $(x - a)(q(x) - q'(x)) = 0$. Now, since the leading coefficient of $x - a$ is $1$, which is not a zero divisor, we get $q(x) = q'(x)$.

Corollary 2.3.4. Let $f(x) \in R[x]$ and $a \in R$. Then $a$ is a zero of $f(x)$ if and only if $(x - a) \mid f(x)$.

Proof. By the previous lemma there is a $q(x) \in R[x]$ such that $f(x) = (x - a) q(x) + f(a)$. So, if $f(a) = 0$, then $(x - a) \mid f(x)$. Conversely, if $(x - a) \mid f(x)$, there exists some $h(x) \in R[x]$ such that $f(x) = (x - a) h(x)$. But then $f(a) = (a - a) h(a) = 0$.
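Lemma 2.3.3 and corollary 2.3.4 correspond to what is usually called synthetic division or Horner's scheme. The sketch below assumes coefficient lists with the lowest degree first; `divide_by_linear` is a hypothetical helper name, not something defined in the text.

```python
def divide_by_linear(f, a):
    """Return (q, remainder) with f(x) = (x - a) * q(x) + remainder.

    By lemma 2.3.3 the remainder equals f(a).
    """
    if not f:
        return [], 0
    acc = 0
    partial = []
    for c in reversed(f):          # Horner's scheme, highest coefficient first
        acc = acc * a + c
        partial.append(acc)
    remainder = partial.pop()      # the last Horner value is f(a)
    q = list(reversed(partial))    # the earlier values are the coefficients of q
    return q, remainder


# f(x) = x^2 - 3x + 2 = (x - 1)(x - 2)
f = [2, -3, 1]
q_at_1, rem_at_1 = divide_by_linear(f, 1)   # remainder 0, so (x - 1) | f(x)
q_at_3, rem_at_3 = divide_by_linear(f, 3)   # remainder f(3) = 2, so 3 is not a zero
```

Only ring operations are used, so the same scheme works over any commutative ring, in line with the lemma.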

We can also apply the Euclidean division algorithm to determine a greatest common factor of two polynomials with coefficients in a field $F$. By a greatest common factor (or divisor) of a pair of polynomials $(f(x), g(x))$ we mean a polynomial $h(x)$ such that $h(x) \mid f(x)$, $h(x) \mid g(x)$ and if $d(x) \in F[x]$ is such that $d(x) \mid f(x)$ and $d(x) \mid g(x)$, then $d(x) \mid h(x)$. Degree considerations quickly show that two greatest common factors differ by a unit factor in $F$. Now, for any two polynomials $f(x), g(x) \in F[x]$ we then define $\gcd(f, g) \in F[x]$ to be the unique monic greatest common divisor.

Lemma 2.3.5. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \ne 0$, and $q(x), r(x) \in F[x]$ such that $\deg(r) < \deg(g)$ and $f(x) = q(x) g(x) + r(x)$. Then for every $h(x) \in F[x]$: $h(x) \mid f(x)$ and $h(x) \mid g(x)$ if and only if $h(x) \mid g(x)$ and $h(x) \mid r(x)$.

Proof. Let $h(x) \in F[x]$. If $h(x) \mid f(x)$ and $h(x) \mid g(x)$, then there exist some $\alpha(x), \beta(x) \in F[x]$ such that $f(x) = \alpha(x) h(x)$ and $g(x) = \beta(x) h(x)$. Then $r(x) = f(x) - q(x) g(x) = \alpha(x) h(x) - q(x) \beta(x) h(x) = (\alpha(x) - q(x) \beta(x)) h(x)$, so that $h(x) \mid r(x)$ and $h(x) \mid g(x)$. Conversely, let $h(x) \mid g(x)$ and $h(x) \mid r(x)$. Then there exist $\gamma(x), \rho(x) \in F[x]$ so that $g(x) = \gamma(x) h(x)$ and $r(x) = \rho(x) h(x)$, so that $f(x) = (q(x) \gamma(x) + \rho(x)) h(x)$ and thus $h(x) \mid f(x)$ and $h(x) \mid g(x)$.

Footnote 3: Several other definitions are possible. For example, a polynomial may be called irreducible if it is not a unit, and if it can be written as a product of two polynomials, one of them must be a unit. However, in polynomial rings over a field, this leads to the same concept. Hence we adopt this definition.

Algorithm 2: The GCD and Euclidean sequence of two polynomials over a field

Let $f(x), g(x) \in F[x]$ where $F$ is a field and $g(x) \ne 0$, and define the following sequence:
\[ h_0(x) = f(x), \qquad h_1(x) = g(x), \qquad h_{i+2}(x) = \begin{cases} q_{i+1}(x) h_{i+1}(x) - h_i(x), & \text{if } h_{i+1}(x) \ne 0 \\ 0, & \text{if } h_{i+1}(x) = 0 \end{cases} \]
with
\[ \deg(h_{i+2}) < \deg(h_{i+1}), \qquad i \in \mathbb{N}. \]
(It is understood here that $-\infty < -\infty$.) Then there exists a $1 \le s \in \mathbb{N}$ such that $h_s(x) \ne 0$, but $h_{s+1}(x) = 0$. Furthermore, $h_s(x)$ is a greatest common factor of $f(x)$ and $g(x)$. The finite sequence (terminating at $h_s(x)$) thus defined is called the Euclidean sequence of $f(x)$ and $g(x)$.

Proof. Let $d(x) \in F[x]$ be a common divisor of $f(x)$ and $g(x)$. Then by repeated use of lemma 2.3.5 we see that this is the case if and only if $d(x)$ is a common divisor of $h_{s-1}(x)$ and $h_s(x)$. Therefore, every common divisor of $f(x)$ and $g(x)$ will be a factor of $h_s(x)$. Also, since $h_s(x) \mid q_s(x) h_s(x) = h_{s-1}(x)$ and $h_s(x)$ is a factor of itself, $h_s(x)$ is a common factor of $f(x)$ and $g(x)$, and thus a greatest common factor.
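Algorithm 2 can be sketched as follows for the field of rational numbers, using Python's `fractions.Fraction` for exact arithmetic. The names `poly_divmod` and `euclidean_sequence` are illustrative. Since h_i = q_{i+1} h_{i+1} + r with deg(r) < deg(h_{i+1}), the next term h_{i+2} = q_{i+1} h_{i+1} − h_i is simply the negated remainder.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients (lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_divmod(f, g):
    """Division with remainder in Q[x]: f = q * g + r with deg(r) < deg(g)."""
    f, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = list(f)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]
        q[shift] = factor
        for j, c in enumerate(g):
            r[shift + j] -= factor * c
        r = trim(r)
    return q, r


def euclidean_sequence(f, g):
    """Return [h_0, ..., h_s] as in algorithm 2; h_s is a greatest common factor."""
    h = [trim(Fraction(c) for c in f), trim(Fraction(c) for c in g)]
    while h[-1]:
        _, r = poly_divmod(h[-2], h[-1])
        h.append([-c for c in r])    # h_{i+2} = q * h_{i+1} - h_i = -r
    return h[:-1]                    # drop the final zero polynomial


# f = x^3 - x and its derivative g = 3x^2 - 1
seq = euclidean_sequence([0, -1, 0, 1], [-1, 0, 3])
```

For this input the sequence ends in a non-zero constant, so the two polynomials have no common factor of positive degree.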

2.4 Field extensions

In the theory of fields, the main subject of study is a field extension. Informally, this is a field that contains some other field. This notion is particularly important for the study of polynomial equations. We will discuss field extensions that are generated by finitely many elements, which can be seen as fields that are obtained by adjoining some elements.

Definition 2.4.1. Let $F$ be a field. A field extension over $F$ is then a field $E$ such that $F$ is a subfield of $E$.

If $S \subseteq F$, then a subfield $K$ of $F$ is said to be generated by $S$ if it is the smallest subfield containing $S$. If $E$ is a field extension over $F$, and $S \subseteq E$, we denote the subfield generated by $S \cup F$ by $F(S)$. If $S = \{ u_1, \ldots, u_n \}$ is finite, we denote it by $F(u_1, \ldots, u_n)$.

2.4.1 Simple Field Extensions

We will first consider the structure of simple field extensions, that is, field extensions over $F$ of the form $F(u)$. The following proposition will be crucial in our discussion. In this discussion we take a slightly different route than in [4].

Proposition 2.4.1. Let $F$ be a field. Then $F[x]$ is a principal ideal domain.

Proof. It is clear that the trivial ideal $\{ 0 \}$ is a principal ideal. Therefore, let $\{ 0 \} \ne I \subseteq F[x]$ be an ideal of $F[x]$ and $f(x) \in I$. We may also clearly pick a non-zero $g(x) \in I$ of minimal degree. Then there exist $q(x), r(x) \in F[x]$ such that $f(x) = q(x) g(x) + r(x)$ with $\deg(r) < \deg(g)$. But then $r(x) = f(x) - q(x) g(x) \in I$. Since $g(x)$ was of minimal degree, we must have that $r(x) = 0$. Therefore $f(x) = q(x) g(x)$ for some $q(x) \in F[x]$ and we conclude that $I = (g(x))$ is principal. We have also already seen that, since $F$ is a domain, $F[x]$ is a domain. This concludes the proof.

Definition 2.4.2. If $R$ is a subring of a commutative ring $S$, and $u \in S$, we will call $u$ algebraic over $R$ if there exists a $0 \ne f(x) \in R[x]$ such that $f(u) = 0$. Otherwise, we will call $u$ transcendental over $R$. A field extension $E$ over $F$ will be called algebraic if and only if every element of $E$ is algebraic over $F$.

Lemma 2.4.2. If $R$ is a subring of the commutative ring $S$ and $u \in S$ is transcendental over $R$, then $R[x] \cong R[u]$.

Proof. If $u$ is transcendental over $R$, then the kernel of the evaluation homomorphism $\rho : R[x] \to S$ in $u$ is $\{ 0 \}$. This shows that $R[x] \cong A = \rho(R[x])$. Since $A$ contains $R$ and $u$, we obtain $R[u] \subseteq A$. Also, if $y \in A$, then there exists some $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ such that $y = f(u) = \sum_{i=0}^{n} a_i u^i \in R[u]$. Therefore, $A \subseteq R[u]$ and we conclude that $R[x] \cong A = R[u]$.

Lemma 2.4.3. Let $F$ be a field and $f(x) \in F[x]$ of positive degree and irreducible. Then $F[x]/(f(x))$ is a field. Furthermore, $F[x]/(f(x)) = F(u)$, where $u = x \bmod (f(x))$ is a zero of $f(x)$, when regarded as a polynomial with coefficients in $F[x]/(f(x))$.

Proof. Let $J' \subseteq F[x]/I$ be an ideal, where $I = (f(x))$. Then there exists an ideal $J \subseteq F[x]$ such that $I \subseteq J$ and $J' = J/I$. Since $F[x]$ is a principal ideal domain, we can find a $g(x) \in F[x]$ such that $J = (g(x))$. Therefore, there exists an $h(x) \in F[x]$ such that $f(x) = h(x) g(x)$. Now, since $f(x)$ is irreducible, we have either $h(x) \in F$ or $g(x) \in F$. In the first case, we get $J = I$, so that $J' = \{ 0 \bmod I \}$. In the second case we get $J = F[x]$, so that $J' = F[x]/I$. This shows that $F[x]/I$ has only the trivial ideals and thus it is a field.

Now set $K = F[x]/(f(x))$ and let $u = x \bmod (f(x)) \in K$. Then clearly $f(u) = f(x) \bmod (f(x)) = 0 \bmod (f(x))$. We also have $F(u) \subseteq K$. Now let $a \in K$. Then there exists some $b(x) \in F[x]$ so that $a = b(x) \bmod (f(x))$. But there also exist $q(x), r(x) \in F[x]$ with $\deg(r) < \deg(f)$ and $b(x) = q(x) f(x) + r(x)$. Therefore $a = r(x) \bmod (f(x)) = r(u)$. Now write $r(x) = \sum_{i=0}^{m} r_i x^i$. Then $a = r(u) = \sum_{i=0}^{m} r_i u^i \in F(u)$. We therefore conclude that $K = F(u)$.

Proposition 2.4.4. Let $E$ be a field extension over $F$ and $u \in E$. If $u$ is transcendental over $F$, then $F(u) \cong F(x)$, the field of fractions of $F[x]$. If $u$ is algebraic over $F$, there exists some irreducible $g(x) \in F[x]$ such that $F(u) \cong F[x]/(g(x))$. Moreover, this $g(x)$ is unique up to a unit multiplier.

Proof. If $u$ is transcendental, then $F[x] \cong F[u]$. Now, since $F[u] \subseteq F(u)$, and the field of fractions of $F[u]$ is the smallest field containing $F[u]$ (and the fields of fractions of two isomorphic integral domains are isomorphic), we get $F(u) \cong F(x)$.

Now suppose that $u$ is algebraic over $F$. Then, since $F[x]$ is a principal ideal domain, there exists some $g(x) \in F[x]$ such that the kernel of the evaluation homomorphism $\rho : F[x] \to E$ in $u$ is $I = (g(x))$ and thus $\rho(F[x]) \cong F[x]/I$, by the first isomorphism theorem of rings. We claim that $F[x]/I$ is a field. Suppose that there exist $f(x), h(x) \in F[x]$ such that $g(x) = f(x) h(x)$. Now, if $f(x) \in I$, then there exists a $k(x) \in F[x]$ such that $f(x) = k(x) g(x)$ and hence $g(x) = h(x) k(x) g(x)$. This would imply that $\deg(h) = 0$, and hence $h(x) \in F$. Similarly, if $h(x) \in I$, then $f(x) \in F$. Now let both $f(x), h(x) \notin I$. Then $f(u) \ne 0 \ne h(u)$, but $f(u) h(u) = g(u) = 0$, so then $E$ would contain non-zero zero-divisors. From this we conclude that $g(x)$ is irreducible, and by lemma 2.4.3 we conclude that $F[x]/I$ is a field.

We now observe that $F \subseteq \rho(F[x])$ (since $I \cap F = \{ 0 \}$) and $u \in \rho(F[x])$, so that $F(u) \subseteq \rho(F[x])$. But if $y \in \rho(F[x])$, then there exists an $f(x) = \sum_{i=0}^{n} a_i x^i \in F[x]$ with $y = f(u) = \sum_{i=0}^{n} a_i u^i \in F(u)$. Therefore, $F(u) = \rho(F[x]) \cong F[x]/(g(x))$.

Now let $\{ 0 \} \ne I \subseteq F[x]$ be an ideal and $f(x), g(x) \in F[x]$ such that $I = (f(x)) = (g(x))$. Then there exist $h(x), k(x) \in F[x]$ such that $f(x) = h(x) g(x)$ and $g(x) = k(x) f(x)$, so that $f(x) = h(x) k(x) f(x)$. Degree considerations then show that $0 \ne h(x), k(x) \in F$. Therefore, $f(x) = a g(x)$ for some unit $a \in F$.

Definition 2.4.3. If $E$ is a field extension over $F$, and $u \in E$ is algebraic over $F$, we call the unique monic polynomial $g(x) \in F[x]$ such that $F(u) \cong F[x]/(g(x))$ the minimum polynomial of $u$ over $F$. Also, if $E = F(u)$, we call $E$ a simple field extension over $F$ with generator $u$, and $u$ is called a primitive element of $E$.

2.4.2 Dimensionality of an Extension

If we have a field extension $E$ over $F$, we may regard $E$ as a vector space over $F$. In this vector space, the addition is the normal addition of the field, and the scalar multiplication is the normal multiplication in $E$, where the scalars lie in $F$. In particular, in vector spaces we have the notion of a dimension. This dimension turns out to be of critical importance.

Definition 2.4.4. If $E$ is a field extension over $F$, the dimensionality (or degree) of $E$ over $F$ is the dimensionality of $E$ regarded as a vector space over $F$, which shall be denoted as $[E : F]$.

Proposition 2.4.5. Let $E$ be a field extension over $F$ and $u \in E$. Then $u$ is algebraic over $F$ if and only if $[F(u) : F] < \infty$. Moreover, if $u$ is algebraic, then $[F(u) : F]$ is the degree of the minimum polynomial of $u$.

Proof. Let $u$ be algebraic over $F$ and $f(x) \in F[x]$ its minimum polynomial. Now let $a \in F(u)$ be arbitrary. Then there exists a $g(x) = \sum_{i=0}^{n-1} a_i x^i \in F[x]$ such that $a = g(u) = \sum_{i=0}^{n-1} a_i u^i$, where $n = \deg(f)$ and $a_i \in F$ for $0 \le i \le n - 1$. Therefore, $(1, u, \ldots, u^{n-1})$ spans the vector space $F(u)$ over $F$. Now let $b_0, \ldots, b_{n-1} \in F$ such that $\sum_{i=0}^{n-1} b_i u^i = 0$. Then $h(x) = \sum_{i=0}^{n-1} b_i x^i \in (f(x))$. But since $\deg(h) < \deg(f)$ we get $h(x) = 0$, so that $b_0 = \cdots = b_{n-1} = 0$. This shows that $(1, u, \ldots, u^{n-1})$ is a base for the vector space $F(u)$ over $F$, and hence $[F(u) : F] = n < \infty$.

Now let $u$ be transcendental over $F$. Then let $n \in \mathbb{N}$ and $a_0, \ldots, a_n \in F$. We recall that $F[x] \subseteq F(x) \cong F(u)$. Under this isomorphism, $0 = \sum_{i=0}^{n} a_i u^i$ corresponds to $0 = \sum_{i=0}^{n} a_i x^i$, which implies that $a_0 = \cdots = a_n = 0$. This shows that there exists no finite base for $F[x]$ as a vector space over $F$. Now, since $F[x]$ is (isomorphic to) a subspace of $F(u)$, we then certainly have that $[F(u) : F] = \infty$. By negation of this argument, we get that if $[F(u) : F] < \infty$, then $u$ must be algebraic over $F$.

Proposition 2.4.6 (Dimensionality formula). Let $K$ be a field extension over $E$, which is in turn a field extension over $F$. Then $K$ is a field extension over $F$ and $[K : F] < \infty$ if and only if $[K : E], [E : F] < \infty$. If $[K : F] < \infty$, then:
\[ [K : F] = [K : E][E : F]. \qquad (2.4) \]

Proof. It is trivial that $K$ is a field extension over $F$. If $[K : F] < \infty$, then $[E : F] < \infty$, since $E$ is a subspace of $K$. Now let $\{ u_1, \ldots, u_n \} \subset K$ be a base for $K$ over $F$. Then for every $a \in K$ there exist $a_1, \ldots, a_n \in F \subseteq E$ such that $\sum_{i=1}^{n} a_i u_i = a$. Therefore, $\{ u_1, \ldots, u_n \}$ spans $K$ as a vector space over $E$. We conclude that $[K : E] \le n < \infty$.

Now let $[K : E], [E : F] < \infty$ and pick bases $(u_1, \ldots, u_m) \subseteq E$ and $(v_1, \ldots, v_n) \subseteq K$ for $E$ over $F$ and $K$ over $E$ respectively. Pick any $a \in K$. Then there exist $a_1, \ldots, a_n \in E$ such that $a = \sum_{i=1}^{n} a_i v_i$. Also, for $1 \le i \le n$ there exist $b_{i1}, \ldots, b_{im} \in F$ such that $a_i = \sum_{j=1}^{m} b_{ij} u_j$. This gives us $a = \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij} u_j v_i$, so that the set $\{ u_j v_i \mid 1 \le i \le n \land 1 \le j \le m \}$ spans $K$ as a vector space over $F$. Now let $c_{11}, \ldots, c_{nm} \in F$ such that $\sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} u_j v_i = 0$. Since the $v_i$ are linearly independent over $E$, we have for $1 \le i \le n$: $d_i = \sum_{j=1}^{m} c_{ij} u_j = 0$. Since the $u_j$ are linearly independent over $F$, this implies that for $1 \le i \le n$ and $1 \le j \le m$ we have $c_{ij} = 0$. Therefore we have obtained a base for $K$ over $F$ and $[K : F] = nm = [K : E][E : F] < \infty$.

The following corollary is immediate and will be used later on.

Corollary 2.4.7. Let $K$ be a field extension of $F$ with $[K : F] < \infty$. Then for any field extension $E \subseteq K$ of $F$, $[E : F] \mid [K : F]$. Also, if $[K : F]$ is prime, then the only subfields of $K$ that contain $F$ are $K$ and $F$ themselves.
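As a small worked example of the dimensionality formula (an illustration that is not part of the original text; it assumes the standard facts that $x^2 - 2$ and $x^4 - 2$ are irreducible over $\mathbb{Q}$), take $F = \mathbb{Q}$, $E = \mathbb{Q}(\sqrt{2})$ and $K = \mathbb{Q}(\sqrt[4]{2})$. By proposition 2.4.5 we have $[E : F] = 2$ and $[K : F] = 4$, so formula (2.4) gives

\[ [K : E] = \frac{[K : F]}{[E : F]} = \frac{4}{2} = 2. \]

This is consistent with the fact that $\sqrt[4]{2}$ is a zero of $x^2 - \sqrt{2} \in E[x]$. It also illustrates corollary 2.4.7: the degree $[E : F] = 2$ of the intermediate field divides $[K : F] = 4$.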

2.4.3 Splitting Fields

We have seen before that if we have a field extension $E$ over $F$ and some monic polynomial $f(x) \in F[x]$, then $u \in E$ is a zero of $f(x)$ if and only if $(x - u) \mid f(x)$, where we regard $f(x)$ as a polynomial in $E[x]$ (by the induced inclusion homomorphism). It would of course be great if we could find a field extension $E$ over $F$ where we could write $f(x) = \prod_{i=1}^{n} (x - r_i)$ for $r_1, \ldots, r_n \in E$. Also, taking into account the results we obtained for the dimensionalities of field extensions, we would like that $E = F(r_1, \ldots, r_n)$, since then $[E : F] = \prod_{i=1}^{n} [F(r_1, \ldots, r_i) : F(r_1, \ldots, r_{i-1})]$ (where the first term in the product is understood to be $[F(r_1) : F]$). Also, if $r \in E$ is then a zero of $f(x)$, then $0 = f(r) = \prod_{i=1}^{n} (r - r_i)$, so that $r = r_i$ for some $1 \le i \le n$. We will call such a field extension a splitting field of $f(x)$ over $F$.

Definition 2.4.5. Let $F$ be a field and $f(x) \in F[x]$ monic. Then a splitting field of $f(x)$ over $F$ is a field extension $E$ over $F$, such that in $E[x]$ we can write $f(x) = \prod_{i=1}^{n} (x - r_i)$ for $r_1, \ldots, r_n \in E$ and $E = F(r_1, \ldots, r_n)$.

Proposition 2.4.8. If $F$ is a field and $f(x) \in F[x]$ is monic, then there exists a splitting field $E$ of $f(x)$ over $F$.

Proof. Let $f(x) = \prod_{i=1}^{k} f_i(x)$ where for $1 \le i \le k$, $f_i(x) \in F[x]$ is monic and irreducible. Then $k \le n = \deg(f)$. If $n = k$, $F$ itself is a splitting field of $f(x)$. Now let $n > k > 0$. Then for some $j \in \{ 1, \ldots, k \}$ we have $\deg(f_j) > 1$; after renumbering the factors we may assume that $j = 1$. Set $K = F[x]/(f_1(x))$, which is a field that contains $F$, and $r = x \bmod (f_1(x))$, so that $K = F(r)$ and $f_1(r) = 0$. Then in $K[x]$ we have $f(x) = \prod_{i=1}^{l} g_i(x)$, where for $1 \le i \le l$, $g_i(x) \in K[x]$ are the irreducible factors of $f(x)$ in $K[x]$. Since these factors can be obtained by taking the irreducible factors of the $f_i(x)$, and $(x - r) \in K[x]$ is an irreducible factor of $f_1(x)$, we obtain $n \ge l > k$, so that $n - l < n - k$. By induction we then obtain an extension field $E = K(r_1, \ldots, r_n)$ such that $f(x) = \prod_{i=1}^{n} (x - r_i)$, where $r = r_i$ for some $1 \le i \le n$. This shows that $E = F(r)(r_1, \ldots, r_n) = F(r_1, \ldots, r_n)$ is a splitting field of $f(x)$ over $F$.

We state the following proposition without proof; a proof can be found in [4].

Proposition 2.4.9. Let $F$ be a field, $f(x) \in F[x]$ monic and of positive degree, and $E$ and $E'$ splitting fields of $f(x)$ over $F$. Then $E \cong E'$.

We may now quickly see that the splitting of a monic polynomial in factors of degree one is unique. Therefore, the zeroes are unique and the following definition is consistent over every possible splitting field.

Definition 2.4.6. Let $F$ be a field, $f(x) \in F[x]$ monic and of positive degree, and $E$ a splitting field of $f(x)$ over $F$. Write $f(x) = \prod_{i=1}^{m} (x - r_i)^{k_i}$, where $r_i \ne r_j$ if $i \ne j$. We then call $k_i$ the multiplicity of $r_i$. Also, a zero $r_i$ is called a simple zero if and only if it has multiplicity 1. Otherwise, it is called a multiple zero.

We lastly make the connection between the derivative of a polynomial and the character of its zeroes. Informally, we will see that a zero (in a splitting field) has multiplicity greater than 1 if and only if the polynomial and its derivative have a common factor of positive degree. For this we define the following map on a polynomial ring of a field.

Definition 2.4.7. Let $F$ be a field. We then define the standard derivation in $F[x]$ as the unique function $F[x] \to F[x] : f(x) \mapsto f'(x)$ so that for any $f(x), g(x) \in F[x]$:

1. $(f + g)'(x) = f'(x) + g'(x)$

2. $(fg)'(x) = f'(x) g(x) + f(x) g'(x)$

3. $x' = 1$.

As in real analysis we may quickly derive all the familiar algebraic properties of polynomial derivatives. We can now state the following proposition, the proof of which can be found in [4, Sec. 4.4].

Proposition 2.4.10. Let $F$ be a field, $f(x) \in F[x]$ monic and of positive degree, and $E$ any splitting field of $f(x)$ over $F$. Then all zeroes of $f(x)$ in $E$ are simple if and only if $\gcd(f, f') = 1$.
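Proposition 2.4.10 gives a test for multiple zeroes that never leaves the base field. Here is a minimal sketch over the rational numbers, with coefficient lists stored lowest degree first; `derivative`, `monic_gcd` and `has_only_simple_zeroes` are names invented for this illustration.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zero coefficients."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def derivative(f):
    """Formal derivative: the coefficient of x^(i-1) becomes i * f[i]."""
    return trim(i * c for i, c in enumerate(f) if i > 0)


def poly_rem(f, g):
    """Remainder of f on division by a non-zero g in Q[x]."""
    r = trim(f)
    while len(r) >= len(g):
        factor = r[-1] / g[-1]
        shift = len(r) - len(g)
        for j, c in enumerate(g):
            r[shift + j] -= factor * c
        r = trim(r)
    return r


def monic_gcd(f, g):
    """Monic greatest common divisor in Q[x]; f is assumed to be non-zero."""
    f = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    while g:
        f, g = g, poly_rem(f, g)
    return [c / f[-1] for c in f]


def has_only_simple_zeroes(f):
    """True if and only if gcd(f, f') = 1."""
    return monic_gcd(f, derivative(f)) == [1]


simple = has_only_simple_zeroes([-2, 0, 1])        # x^2 - 2
repeated = has_only_simple_zeroes([1, -1, -1, 1])  # (x - 1)^2 (x + 1)
```

The first polynomial has the two distinct zeroes ±√2 in its splitting field, while for the second the gcd with its derivative is x − 1, which exposes the double zero at 1.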

2.4.4 Galois Theory

Galois theory is one of the pearls of modern mathematics. It allows one to study solutions of algebraic equations in a purely algebraic way. At the heart of the theory is the connection between the solutions of such equations and group theory. We will state the fundamental results without proof for later use. An extensive treatment of this subject may (again) be found in [4, Ch. 4]. Throughout this subsection, $F$ denotes a field.

Definition 2.4.8. $f(x) \in F[x]$ is called separable if and only if its irreducible factors have distinct zeroes in any splitting field. An algebraic field extension $E$ over $F$ is called separable over $F$ if and only if the minimum polynomial over $F$ of every element of $E$ is separable. Also, $E$ is called normal over $F$ if and only if every irreducible polynomial in $F[x]$ that has a zero in $E$ splits into factors of degree 1 in $E[x]$.

Lemma 2.4.11. Any field extension $E$ over $F$ of characteristic 0 is separable.

Definition 2.4.9 (The Galois group). Let $E$ be a field extension over $F$. The Galois group of $E$ over $F$ is then the group $\operatorname{Gal}(E/F)$ of automorphisms of $E$ that reduce to the identity when restricted to $F$. Also, if $G$ is any subgroup of the group of automorphisms of $E$, then $\operatorname{Inv} G \subseteq E$ is the subfield of elements that are invariant under all automorphisms in $G$ (see footnote 4).

Definition 2.4.10. A field extension $E$ over $F$ is called a Galois field extension if and only if $E$ is a splitting field of $f(x)$ over $F$ for some separable $f(x) \in F[x]$.

Lemma 2.4.12. If $E$ is a splitting field of $f(x)$ over $F$ for some monic separable $f(x) \in F[x]$, then $|\operatorname{Gal}(E/F)| = [E : F]$.

Proposition 2.4.13. Let $E$ be a field extension over $F$. Then the following statements are equivalent:

• $E$ is a Galois field extension over $F$.

• $F = \operatorname{Inv} G$ for some finite subgroup $G$ of $\operatorname{Aut} E$.

• $[E : F] < \infty$, and $E$ is normal and separable over $F$.

Theorem 2.4.14 (Fundamental Theorem of Galois Theory). Let E be a Galois field extension over F and define:

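• $\Gamma$ is the set of subgroups of $\operatorname{Gal}(E/F)$.

• $\Sigma$ is the set of subfields $K \subseteq E$ such that $F \subseteq K$.

• $\gamma : \Gamma \to \Sigma : H \mapsto \operatorname{Inv}(H)$.

• $\sigma : \Sigma \to \Gamma : K \mapsto \operatorname{Gal}(E/K)$.

Then $\gamma$ and $\sigma$ are inverse bijections, and with $G = \operatorname{Gal}(E/F)$ we have the following properties:

1. $\forall H_1, H_2 \in \Gamma : H_1 \subseteq H_2 \iff \operatorname{Inv} H_1 \supseteq \operatorname{Inv} H_2$,

2. $\forall H \in \Gamma : |H| = [E : \operatorname{Inv} H] \land [G : H] = [\operatorname{Inv} H : F]$,

3. $\forall H \in \Gamma$: $H$ is a normal subgroup of $\operatorname{Gal}(E/F)$ if and only if $\operatorname{Inv} H$ is a normal field extension over $F$. In this case $\operatorname{Gal}((\operatorname{Inv} H)/F) \cong G/H$.

Footnote 4: It follows that $G = \operatorname{Gal}(E/F)$ is the subgroup of $\operatorname{Aut} E$ such that $\operatorname{Inv} G = F$.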

Chapter 3

Real Closed Fields

In this chapter we will develop the framework in which we prove Sturm’s Theorem. We will begin with a discussion of ordered fields, showing that a field can be (compatibly) ordered if and only if the field is formally real (meaning that zero cannot be written as a sum of non-zero squares). We will then discuss real closed fields and some of their key properties, going on to investigate several equivalent characterizations of real closed fields. This serves to illustrate the importance of real closed fields in applications. Again we will follow [4] in our discourse.

3.1 Ordered and Formally Real Fields

Definition 3.1.1. An ordered field is a pair $(F, P)$ where $F$ is a field, and $P \subset F$ such that

1. $0 \notin P$,

2. $\forall a \in F : a = 0 \lor a \in P \lor -a \in P$,

3. $\forall a, b \in P : a + b \in P \land ab \in P$.

We call the elements of $P$ the positive elements of $F$. We also say that a field $F$ can be ordered if and only if a $P \subset F$ exists so that $(F, P)$ is an ordered field.

Lemma 3.1.1. If $(F, P)$ is an ordered field, define the set of negative elements $N = \{ x \in F \mid \exists p \in P : x = -p \}$. Then $P$, $N$ and $\{ 0 \}$ are pairwise disjoint and $F = P \cup \{ 0 \} \cup N$.

Proof. We first note that $0 \notin P$ by property 1. This implies that $0 = -0 \notin N$. Therefore $P \cap \{ 0 \} = \emptyset = N \cap \{ 0 \}$. Now suppose that $P \cap N \neq \emptyset$ and let $a \in P \cap N$. Then $-a \in P$, so by property 3: $0 = a - a \in P$. This contradiction with property 1 shows that $P \cap N = \emptyset$. Now let $a \in F \setminus \{ 0 \}$. Then by property 2, $a \in P$ or $a \in N$, so $a \in P \cup N$ and $F = P \cup \{ 0 \} \cup N$.

Definition 3.1.1 of an ordered field is not so intuitive at first glance, but it becomes more transparent when we recall that $P$ consists of the positive elements and we consider the following:

Proposition 3.1.2. Any ordered field $(F, P)$ induces a strict total order $>$ by:

\[ \forall a, b \in F : a > b \iff a - b \in P, \tag{3.1} \]

with the following properties:

1. $\forall a \in F : [a > 0 \iff \forall b \in F : a + b > b]$,

2. $\forall a, b \in F : a > 0 \land b > 0 \implies ab > 0$.

Conversely, if $>$ is a strict total order with the properties above, then $P = \{ x \in F \mid x > 0 \}$ defines an ordered field $(F, P)$.

Proof. Let $(F, P)$ be an ordered field and define $>$ as above. We shall first show that $>$ is a strict total order. Let $a, b, c \in F$ such that $a > b$ and $b > c$. Then $a - b \in P$ and $b - c \in P$. We then have $a - c = (a - b) + (b - c) \in P$, so $a > c$, which shows that $>$ is transitive. Then, if we take $a, b \in F$, we see by lemma 3.1.1 that exactly one of $a = b$, $a - b \in P$, or $b - a \in P$ holds. Therefore, exactly one of $a = b$, $a > b$ or $b > a$ holds, so $>$ is trichotomous and a strict total order.

To prove property 1, let $a \in F$. If $a > 0$, then $\forall b \in F : (a + b) - b = a \in P$, so $\forall b \in F : a + b > b$. Conversely, if $\forall b \in F : a + b > b$, then this is in particular the case for $b = 0$: $a > 0$. To prove property 2, let $a, b \in F$ with $a > 0$ and $b > 0$. Then $a, b \in P$ and thus $ab \in P$, which shows that $ab > 0$.

Let us now suppose that we are given a strict total order $>$ on $F$ satisfying properties 1 and 2. Define $P = \{ x \in F \mid x > 0 \}$. Then clearly $0 \notin P$, since otherwise $0 > 0$, which is impossible for a strict order; this is the first property of an ordered field. Also, by the trichotomy of $>$, we have for any $a \in F$: $a > 0$, $a = 0$, or $0 > a$. This means: $a = a - 0 \in P$, $a = 0$, or $-a = 0 - a \in P$, so $P$ also satisfies the second property. Now let $a, b \in P$. Then by property 1 and the transitivity of $>$: $a > 0 \implies a + b > b > 0 \implies a + b \in P$. Also, by property 2: $a > 0 \land b > 0 \implies ab > 0 \implies ab \in P$, which finally shows that $(F, P)$ is an ordered field.

Note: Notation of ordered fields

From now on, if we speak of an ordered field $(F, P)$, and we use the symbol $>$, this will denote the induced strict total order. Also, the symbol $\geq$ will denote the total order induced by $>$ (defined by $a \geq b \iff a > b \lor a = b$). Similarly, for $a \in F$ we write $|a| = a$ if $a = 0$ or $a > 0$ and $|a| = -a$ if $a < 0$. If the set $P$ is not used, we may also just write: “the ordered field $F$”.

Lemma 3.1.3. If $(F, P)$ is an ordered field, then for any $a \in F^* : a^2 \in P$. In particular we see that $1 = 1^2 \in P$.

Proof. Let $a \in F^*$. Then either $a \in P$ or $-a \in P$, so that $a^2 = (-a)^2 \in P$, since $P$ is closed under multiplication.

We state the following lemma without proof, as it is simply proven by considering the various possibilities of the signs of $a$ and $b$.

Lemma 3.1.4 (Triangle inequality). If $(F, P)$ is an ordered field, then for any $a, b \in F$:

\[ |a + b| \leq |a| + |b|. \tag{3.2} \]

We will now go on to prove a nice characterization of an ordered field in terms of sums of squares. The following definition is due to Artin and Schreier [1] (footnote 1).

Definition 3.1.2. A formally real field is a field $F$ that satisfies the following property:

\[ \forall n \in \mathbb{N}\ \forall a_1, \dots, a_n \in F : \left[ \sum_{i=1}^{n} a_i^2 = 0 \implies \forall i \in \{ 1, \dots, n \} : a_i = 0 \right], \]

i.e. the zero of the field is not the sum of non-zero squares, or the vanishing of a sum of squares implies the vanishing of all the individual squares. The following lemma illustrates a different characterization (footnote 2).

Lemma 3.1.5. A field $F$ is formally real if and only if $\nexists a_1, \dots, a_n \in F$ such that $\sum_{i=1}^{n} a_i^2 = -1$.

Proof. Let $F$ be formally real. Now suppose that there exist $a_1, \dots, a_n \in F$ such that $\sum_{i=1}^{n} a_i^2 = -1$. Then $\sum_{i=1}^{n} a_i^2 + 1^2 = -1 + 1 = 0$, which is forbidden since $1 \neq 0$, so no such $\{ a_i \}$ exist.

Conversely, let there exist no $a_1, \dots, a_n \in F$ such that $\sum_{i=1}^{n} a_i^2 = -1$ and take $b_0, \dots, b_m \in F$ such that $\sum_{i=0}^{m} b_i^2 = 0$, and suppose that $b_0 \neq 0$ (i.e. one of the $b_i$ is non-zero). Then $b_0^2 \neq 0$ and

\[ \sum_{i=1}^{m} b_i^2 = -b_0^2 \implies \sum_{i=1}^{m} \left( \frac{b_i}{b_0} \right)^2 = -1, \]

which gives us a contradiction. Therefore, $F$ is formally real.

Lemma 3.1.6. Any ordered field $(F, P)$ is formally real.

Proof. Let $a \in F \setminus \{ 0 \}$. Then either $a > 0$ or $-a > 0$ and thus $a^2 = (-a)^2 > 0$. We will now show by induction that any sum of non-zero squares is strictly greater than zero. The induction basis was the first step of the proof. Therefore, let $k \in \mathbb{N}$, $k > 0$, $a_1, \dots, a_{k+1} \in F \setminus \{ 0 \}$ and $\sum_{i=1}^{k} a_i^2 > 0$. Then $a_{k+1}^2 > 0$ and thus $\sum_{i=1}^{k+1} a_i^2 > \sum_{i=1}^{k} a_i^2 > 0$.

Therefore, if $a_1, \dots, a_n \in F$ and $\sum_{i=1}^{n} a_i^2 = 0$, we must have that $a_1 = \dots = a_n = 0$ and thus $F$ is formally real.

Footnote 1: Artin and Schreier chose this as one of the key properties of the real number system, in an effort to characterize the real numbers in a purely algebraic way.

Footnote 2: This was actually the original definition of Artin and Schreier.
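As a small illustration of Lemma 3.1.5 on a field that is not formally real, the following Python sketch searches a prime field $\mathbb{F}_p$ exhaustively for $a, b$ with $a^2 + b^2 = -1$. The function name `minus_one_as_two_squares` and the chosen primes are our own assumptions for this illustration; they do not appear in the text above.

```python
def minus_one_as_two_squares(p):
    """Search F_p for (a, b) with a^2 + b^2 = -1 (mod p).

    Returns the first witness found, or None if there is none.
    The search is exhaustive, so it is only meant for small primes p.
    """
    for a in range(p):
        for b in range(p):
            if (a * a + b * b + 1) % p == 0:
                return a, b
    return None


# Collect a witness for a few small primes (illustrative choice).
witnesses = {p: minus_one_as_two_squares(p) for p in (2, 3, 5, 7, 11, 13)}
```

Each witness exhibits $-1$ as a sum of squares in $\mathbb{F}_p$, so by Lemma 3.1.5 none of these fields is formally real, and hence none of them can be ordered.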

The converse of the foregoing lemma is a theorem that was proved by Artin and Schreier [1, Satz 1.], and gives the definite answer on the connection between the sums of squares in a field and orderings. We follow a proof of Jean-Pierre Serre [6] (footnote 3). We first prove the following

Lemma 3.1.7. If $P_0$ is a subgroup of the multiplicative group $F^*$ of a field $F$, such that $P_0$ is closed under addition and contains all non-zero squares, and if $a \in F^*$ such that $-a \notin P_0$, then

\[ P_1 = P_0 + P_0 a = \{ p \in F \mid \exists x, y \in P_0 : p = x + ya \} \]

is a subgroup of $F^*$ that is closed under addition.

Proof. Let $p_1 = x_1 + y_1 a$, $p_2 = x_2 + y_2 a \in P_1$, where $x_1, y_1, x_2, y_2 \in P_0$. Then $p_1 + p_2 = (x_1 + x_2) + (y_1 + y_2) a \in P_1$, since $P_0$ is closed under addition. Also, $p_1 p_2 = (x_1 + y_1 a)(x_2 + y_2 a) = (x_1 x_2 + y_1 y_2 a^2) + (x_1 y_2 + x_2 y_1) a \in P_1$, since $a^2 \in P_0$ and $P_0$ is closed under addition and multiplication.

If $0 \in P_1$, then $\exists x, y \in P_0 : x + ya = 0$, so $-a = x y^{-1} \in P_0$, which leads to a contradiction. This shows that $0 \notin P_1$ and thus $P_1 \subseteq F^*$.

Lastly, let $p = x + ya \in P_1$, $x, y \in P_0$. Then

\[ p^{-1} = (x + ya)^{-1} = (x + ya)(x + ya)^{-2} = \left[ x \left( (x + ya)^{-1} \right)^2 \right] + \left[ y \left( (x + ya)^{-1} \right)^2 \right] a \in P_1, \]

since $x + ya \neq 0$ and $P_0$ contains all non-zero squares. Therefore, $P_1 \subseteq F^*$ is a subgroup of the multiplicative group of $F$ that is closed under addition.

Proposition 3.1.8 (Serre). If $L$ is an extension field of an ordered field $(K, P)$, then $L$ can be ordered as $(L, P_L)$ with $P \subseteq P_L$ (i.e. the ordering on $L$ extends that on $K$) if and only if for all $p_1, \dots, p_n \in P$ and $x_1, \dots, x_n \in L$: $\sum_{i=1}^{n} p_i x_i^2 = 0 \implies x_1 = \dots = x_n = 0$.

Proof. Let $(L, P_L)$ be an ordered extension field of an ordered field $(K, P)$ with $P \subseteq P_L$. Now take $p_1, \dots, p_n \in P$ and $x_1, \dots, x_n \in L$. We first see that for every $x_i$ either $x_i = 0$, in which case $x_i^2 = 0$, or $x_i > 0$ or $-x_i > 0$ so that $x_i^2 = (-x_i)^2 > 0$. Therefore, since each $p_i > 0$ and the positive elements are closed under addition and multiplication, if one of the $x_i \neq 0$: $\sum_{i=1}^{n} p_i x_i^2 > 0$. So, $\sum_{i=1}^{n} p_i x_i^2 = 0$ implies that all $x_i = 0$.

Now suppose that the converse is true. Define $T$ as the set of subgroups of $L^*$ that are closed under addition and contain all elements of the form $p x^2$ where $p \in P$ and $x \in L^*$.

Clearly the set $P_0 = \{ \sum_{i=1}^{n} p_i x_i^2 \mid p_i \in P \land x_i \in L^* \}$ is closed under addition. Now let $x = \sum_{i=1}^{n} p_i x_i^2$, $y = \sum_{j=1}^{m} q_j y_j^2 \in P_0$, where all $p_i, q_j \in P$ and $x_i, y_j \in L^*$. Then $xy = \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j (x_i y_j)^2 \in P_0$, so $P_0$ is closed under multiplication. Also, if $x \in P_0$, then $x^{-1} = x x^{-2} \in P_0$, because $x \neq 0$ (by the hypothesis) and thus $x^{-2} = (x^{-1})^2 \in P_0$. This shows that $P_0 \in T$, and thus $T$ is non-empty.

Footnote 3: It is entertaining to note that although this argument was thought up by Serre, it was presented at a seminar by Élie Cartan.

By Zorn’s Lemma we may now pick a maximal element $P_L \in T$. We claim that this $P_L$ makes $L$ an ordered field that extends $K$. To see this, let $a \in L^*$. If both $a$ and $-a \in P_L$, then $0 \in P_L$, which is a contradiction, so $a$ and $-a$ cannot be simultaneously in $P_L$. Now, if $-a \notin P_L$, define $P' = \{ x + ya \mid x, y \in P_L \}$. Since $P_L$ certainly contains all non-zero squares ($1 \in P$), we can conclude by lemma 3.1.7 that $P'$ also is a subgroup of $L^*$ that is closed under addition. Furthermore, take $p \in P$ and $x \in L^*$. Then $p x^2 = p x^2 (1 + a)(1 + a)^{-1} = (p x^2 + p x^2 a)(1 + a)^{-1} \in P'$, so $P' \in T$. Also, if $x \in P_L$, then $x = x (1 + a)(1 + a)^{-1} = (x + xa)(1 + a)^{-1} \in P'$, so $P_L \subseteq P'$. Because we took $P_L$ to be maximal in $T$ we can now conclude that $P_L = P'$. Lastly, $a = a (1 + a)(1 + a)^{-1} = (a^2 + a)(1 + a)^{-1} \in P' = P_L$, since $P_L$ contains all non-zero squares. We can now conclude that either $-a \in P_L$ or $a \in P_L$ exclusively.

The above showed that $(L, P_L)$ is an ordered field. Now, if $p \in P \subseteq K$, then $p = p \cdot 1^2 \in P_L$, so $P \subseteq P_L$ and the order extends the order on $K$.

Now we are ready to prove

Theorem 3.1.9. A field F can be ordered if and only if it is formally real.

Proof. We already saw that if a field $F$ can be ordered, then it is formally real. Conversely, let $F$ be a formally real field. Then its characteristic is 0 (for if it has characteristic $n \geq 1$, then $\sum_{i=1}^{n} 1^2 = 0$, which is not the case), and thus it contains $\mathbb{Q}$ as a subfield.

Let $0 < \frac{p_1}{q_1}, \dots, \frac{p_n}{q_n} \in \mathbb{Q}$ where all $p_i \in \mathbb{Z}$ and $q_i \in \mathbb{Z} \setminus \{ 0 \}$ (without loss of generality both positive), and $x_1, \dots, x_n \in F$ with $\sum_{i=1}^{n} \frac{p_i}{q_i} x_i^2 = 0$. Let us multiply with $q_1 \cdots q_n$:

\[ 0 = q_1 \cdots q_n \sum_{i=1}^{n} \frac{p_i}{q_i} x_i^2 = \sum_{i=1}^{n} \Bigg( \prod_{\substack{j=1 \\ j \neq i}}^{n} q_j \Bigg) p_i x_i^2. \]

This is now a sum of positive integer multiples of squares, and thus simply a sum of squares. Since $F$ is formally real, we can conclude that all $x_i = 0$. By proposition 3.1.8 we can therefore conclude that there exists an order on $F$ that extends the standard order on $\mathbb{Q}$.

As our last result on formally real/ordered fields we will give the following lemma, which provides us with bounds on the zeroes of a monic polynomial.

Lemma 3.1.10. Let $F$ be an ordered field, $f(x) = x^n + \sum_{i=0}^{n-1} a_i x^i \in F[x]$ monic and of positive degree, and $c \in F$. Define $M = \max(1, \sum_{i=0}^{n-1} |a_i|)$. Then $|c| > M$ implies that $|f(c)| > 0$. Conversely, if $f(c) = 0$, then $-M \leq c \leq M$.

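Proof. Let $c \in F$ with $|c| > M$. We first note $c \neq 0$, so that: $1 = c^{-n} f(c) - \sum_{i=0}^{n-1} a_i c^{i-n}$. Also, $|c^{-n}| < 1$, and for $i \in \{ 0, \dots, n - 1 \}$ we have $|c^{i-n}| \leq |c^{-1}|$. From this, and the triangle inequality, it follows that:

\[
\begin{aligned}
1 &= \left| c^{-n} f(c) - \sum_{i=0}^{n-1} a_i c^{i-n} \right| \\
  &\leq |c^{-n}| |f(c)| + \sum_{i=0}^{n-1} |a_i| |c^{i-n}| \\
  &\leq |f(c)| + |c^{-1}| \sum_{i=0}^{n-1} |a_i| \\
  &< |f(c)| + M^{-1} M = |f(c)| + 1,
\end{aligned}
\]

where the last step uses $|c^{-1}| < M^{-1}$ and $\sum_{i=0}^{n-1} |a_i| \leq M$. From this we can conclude that $|f(c)| > 0$. Taking the contrapositive of this statement, $f(c) = 0$ implies that $-M \leq c \leq M$.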
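The bound of Lemma 3.1.10 is easy to compute. The following Python sketch works over $\mathbb{Q}$ with exact `fractions.Fraction` arithmetic; the helper names `root_bound` and `evaluate_monic`, the convention of passing only the lower coefficients $a_0, \dots, a_{n-1}$, and the sample polynomial are assumptions made for this illustration.

```python
from fractions import Fraction


def root_bound(lower_coeffs):
    """M = max(1, sum |a_i|) for the monic polynomial x^n + sum a_i x^i.

    `lower_coeffs` lists a_0, ..., a_{n-1}; the leading 1 is implicit.
    """
    return max(Fraction(1), sum(abs(Fraction(a)) for a in lower_coeffs))


def evaluate_monic(lower_coeffs, c):
    """Evaluate x^n + sum a_i x^i at x = c using exact arithmetic."""
    c = Fraction(c)
    n = len(lower_coeffs)
    return c**n + sum(Fraction(a) * c**i for i, a in enumerate(lower_coeffs))


# Sample: f(x) = x^2 - 5x + 6 = (x - 2)(x - 3), so a_0 = 6 and a_1 = -5.
coeffs = [6, -5]
M = root_bound(coeffs)  # M = max(1, 6 + 5) = 11
```

Here both zeroes, 2 and 3, indeed lie in $[-M, M] = [-11, 11]$, and `evaluate_monic(coeffs, c)` is non-zero for every $c$ with $|c| > 11$ that one cares to try.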

3.2 Real Closed Fields

Artin and Schreier defined a refinement of formally real fields in an attempt to capture the characteristic algebraic properties of the real numbers. There are several useful examples of such fields, which include the real numbers, the real numbers that are algebraic over $\mathbb{Q}$, the hyperreal numbers and the computable numbers. Let us state the definition.

Definition 3.2.1. A field $F$ is called real closed if and only if $F$ is formally real and no proper algebraic extension field of $F$ is formally real.

This definition and the foregoing discussion of formally real fields show that a real closed field $F$ is closed in the sense that it can be ordered, but no proper algebraic extension of it can be ordered. We will go on to find some more useful characterizations. We first observe the following very useful facts, where we follow the proof in [1].

Lemma 3.2.1. If $F$ is a real closed field, then:

• Every sum of squares in $F$ can also be written as a single square.

• $\forall x \in F\ \exists y \in F : x = y^2 \lor -x = y^2$.

• Every polynomial $f(x) \in F[x]$ of odd degree has a zero in $F$.

Proof. Let $\gamma \in F$ not be a square. Then the polynomial $x^2 - \gamma \in F[x]$ is irreducible, so $F(\sqrt{\gamma}) = F[x]/(x^2 - \gamma)$ is a proper field extension of $F$, hence it is not formally real. This shows that there exist $\alpha_1, \dots, \alpha_n, \beta_1, \dots, \beta_n \in F$ such that

\[ \sum_{\nu=1}^{n} (\alpha_\nu \sqrt{\gamma} + \beta_\nu)^2 = \gamma \sum_{\nu=1}^{n} \alpha_\nu^2 + 2 \sqrt{\gamma} \sum_{\nu=1}^{n} \alpha_\nu \beta_\nu + \sum_{\nu=1}^{n} \beta_\nu^2 = -1. \]

If $\sum_{\nu=1}^{n} \alpha_\nu \beta_\nu \neq 0$, then $\sqrt{\gamma} \in F$, which leads to a contradiction, so that this sum vanishes. Also, if $\sum_{\nu=1}^{n} \alpha_\nu^2 = 0$, then $-1$ would be a sum of squares in $F$, which is also a contradiction, so that sum does not vanish. We can then conclude that $\gamma$ is not a sum of squares in $F$, since otherwise $-1$ would be a sum of squares in $F$. Negating this statement leads to the first property.

By the first property we may now pick $\alpha, \beta \in F$ such that:

\[ \alpha^2 = \sum_{\nu=1}^{n} \alpha_\nu^2, \qquad \beta^2 = 1 + \sum_{\nu=1}^{n} \beta_\nu^2 \]

(observe that $1 = 1^2$) and thus:

\[ -\gamma = \frac{1 + \sum_{\nu=1}^{n} \beta_\nu^2}{\sum_{\nu=1}^{n} \alpha_\nu^2} = \frac{\beta^2}{\alpha^2} = \left( \frac{\beta}{\alpha} \right)^2. \]

From this we can conclude that either $\gamma$ is a square, or $-\gamma$ is a square, which shows the second property.

Now let us pick any polynomial $f(x) \in F[x]$ with $\deg(f) = 2n + 1$, where $n \in \mathbb{N}$. Without loss of generality we may assume $f$ to be monic, since $F$ is a field. The third statement can then be proven by induction with respect to $n$. If $n = 0$, then the polynomial is of first degree and thus of the form $f(x) = x - a$, where $a \in F$ is a zero of $f(x)$. Now let $n \geq 1$ and the statement be true for all $g(x) \in F[x]$, $\deg(g) = 2k + 1$, $k \in \mathbb{N}$ and $k < n$. If $f(x)$ is reducible, then it can be written as $f(x) = g(x) h(x)$, where $g(x), h(x) \in F[x]$ are monic and of positive degree strictly smaller than $2n + 1$, and one of them (say $g(x)$) must be of odd degree, since $\deg(f) = \deg(g) + \deg(h)$. By the induction hypothesis, $g(x)$ then has a zero in $F$, and hence so does $f(x)$.

If $f(x)$ is irreducible, we can form the proper field extension $F(\alpha) = F[x]/(f(x))$, where $\alpha \in F(\alpha)$ is a zero of $f(x)$. We then know that $F(\alpha)$ is not formally real, and thus there exist $q_1(x), \dots, q_r(x) \in F[x]$ with degree smaller than $2n + 1$ such that:

\[ \sum_{\nu=1}^{r} (q_\nu(\alpha))^2 = -1 \in F. \]

This then shows that there exists some $g(x) \in F[x]$ such that:

\[ \sum_{\nu=1}^{r} (q_\nu(x))^2 + f(x) g(x) = -1. \]

Now, the degree of the $q_\nu(x)^2$ must be even, and therefore the degree of the sum must be even and positive and strictly less than $4n + 2$. We therefore conclude that $g(x)$ has odd degree less than or equal to $2n - 1$. Therefore $g(x)$ has a zero $\rho \in F$. However, then:

\[ -1 = \sum_{\nu=1}^{r} (q_\nu(\rho))^2 + f(\rho) g(\rho) = \sum_{\nu=1}^{r} (q_\nu(\rho))^2. \]

I.e. $-1$ is a sum of squares in $F$, leading to a contradiction. Therefore $f(x)$ must be reducible and the third statement has been proven.

Lemma 3.2.2. If a field $F$ is real closed, there exists one and only one $P \subseteq F^*$ such that $(F, P)$ is an ordered field. I.e. a real closed field can be uniquely ordered.

Proof. If $F$ is formally real, then we know that it can be ordered. Let $P \subset F$ be the positive numbers of such an ordering. Then we know that any non-zero square $x^2$, $x \in F^*$, must be positive. Now, if $F$ is real closed, it is formally real and can thus be ordered. Also, $\forall x \in F^*\ \exists y \in F^*$ such that $x = y^2$, in which case $x$ must be positive, or $-x = y^2$, in which case $-x$ must be positive, in any ordering. Since this covers all non-zero elements of $F$, there exists only one ordering, namely the one where exactly all the non-zero squares are positive.

Note From now on, when we speak of a real closed field, we will implicitly assume that it is equipped with this unique order.

The following result is the analog of the classical Fundamental Theorem of Algebra, and shows that real closed fields capture the important property that we may obtain an algebraically closed field by adjoining a single square root. In particular, this shows that the real numbers $\mathbb{R}$ form a real closed field, since $\mathbb{C}$ is obtained by adjoining $\sqrt{-1}$ and is algebraically closed.

Theorem 3.2.3. A field $F$ is real closed if and only if it is not algebraically closed, and $C = F(\sqrt{-1}) = F[x]/(x^2 + 1)$ is algebraically closed.

Proof. Let $F$ be a real closed field. Then we see that $x^2 + 1$ has no zeroes in $F$, since otherwise $-1$ would be a square in $F$, and hence it is irreducible. We can then define the field $C = F[x]/(x^2 + 1)$. We first define the automorphism $z = a + bi \mapsto \bar{z} = a - bi$ of $C$, where $i \in C$ denotes a zero (any one of the two) of $x^2 + 1$. This induces an automorphism $f(x) \mapsto \bar{f}(x)$ of $C[x]$. We then see that if $f(x) \in C[x]$, then $f(x) \bar{f}(x) \in F[x]$. Also, if $f(x) \bar{f}(x)$ has a zero $r$ in $C$, then $f(r) \bar{f}(r) = 0$, and hence $f(x)$ has a zero in $C$ (namely $r$ or $\bar{r}$).

We now show that every element of $C$ can be written as a square. To this end, let $z = a + bi \in C$. Then $z \bar{z} = a^2 + b^2 \in F$ and non-negative, so that $\exists \alpha \in F : a^2 + b^2 = \alpha^2$. Also, $\alpha^2 \geq a^2$ so that $|\alpha| \geq |a|$ and hence $\exists c_1, c_2 \in F$, where we can pick $c_1 c_2$ with the same sign as $b$, such that

\[ c_1^2 = \frac{a + |\alpha|}{2}, \qquad c_2^2 = \frac{-a + |\alpha|}{2}. \]

Also:

\[ (2 c_1 c_2)^2 = 4 \cdot \frac{a + |\alpha|}{2} \cdot \frac{-a + |\alpha|}{2} = -a^2 + (a^2 + b^2) = b^2. \]

We can therefore conclude that $(c_1 + c_2 i)^2 = c_1^2 - c_2^2 + 2 c_1 c_2 i = a + bi$. This shows that there exists no algebraic extension field $E$ of $C$ with $[E : C] = 2$ (since any quadratic polynomial over $C$ is reducible).

With the foregoing in mind, we now let $f(x) \in F[x]$ be a monic polynomial of even degree. We define $E$ to be a splitting field over $F$ of $f(x)(x^2 + 1)$, such that $C \subseteq E$. Then $E$ is Galois over $F$ (since $F$ is of characteristic 0 and thus any polynomial in $F[x]$ is separable; hence $E$ is the splitting field of a separable polynomial). We write $|\operatorname{Gal} E/F| = 2^e m$, where $m$ is odd. By Sylow’s theorem, $\operatorname{Gal} E/F$ contains a subgroup $H$ with $|H| = 2^e$. Let $K$ be the subfield of $E$ containing $F$ corresponding to $H$ under the Galois pairing. Therefore, $2^e m = [E : F] = [E : K][K : F] = 2^e [K : F]$ so that $[K : F] = m$. But since every polynomial of odd degree in $F[x]$ has a zero in $F$, $F$ has no proper odd-dimensional algebraic extension fields. Therefore, $m = 1$, $H = \operatorname{Gal} E/F$, and $K = F$. We can now conclude that because $|\operatorname{Gal} E/F|$ is a power of 2, we can obtain $E$ by repeatedly adjoining square roots. However, since we obtained $C$ by adjoining a square root and every element of $C$ already has a square root in $C$, we must have that $C = E$. Therefore, $C$ is a splitting field of $f(x)(x^2 + 1)$ and hence contains all zeroes of $f(x)$ (footnote 4). Together with lemma 3.2.1 for polynomials of odd degree, this shows that every polynomial of positive degree in $F[x]$ has a zero in $C$. By the reasoning above we can then conclude that $C$ is algebraically closed.

We will now go on to show the converse. Let $F$ be a field that is not algebraically closed, but let $C = F(i) = F[x]/(x^2 + 1)$ be algebraically closed. We then clearly see that $\sqrt{-1} \notin F$, since otherwise $x^2 + 1$ would be reducible and $C$ would not be a field. Now let $a, b \in F$. Since $C$ is algebraically closed, every element of $C$ can be written as a square, and so we pick $z \in C$ such that $z^2 = a + bi$. Then $a^2 + b^2 = (a + bi)(a - bi) = z^2 \bar{z}^2 = (z \bar{z})^2$ and $z \bar{z} \in F$. This shows that every sum of squares in $F$ can be written as a square. In particular, $-1$ is not a square, and hence not a sum of squares, so that $F$ is formally real. We can also see that $C$ is an algebraic closure of $F$ (since $C$ is generated by $i = \sqrt{-1}$ with minimum polynomial $x^2 + 1$, so that $[C : F] = 2 < \infty$ and $C$ is algebraically closed), so that every algebraic extension of $F$ is contained within $C$. But then, $C$ is the only proper algebraic extension field, and $C$ is not formally real (as $i^2 = -1$), so that $F$ is real closed.

Footnote 4: We note that $x^2 + 1$ splits in $C[x]$ as $(x - i)(x + i)$.
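The explicit square root constructed in the first half of the proof can be tried out numerically. The Python sketch below takes $F = \mathbb{R}$, approximated by floating-point numbers, so it is only an illustration under that assumption and not an exact computation in a real closed field. The function name `complex_square_root` is our own.

```python
import math


def complex_square_root(a, b):
    """Return (c1, c2) with (c1 + c2*i)^2 = a + b*i, up to rounding.

    Follows the construction in the proof:
      c1^2 = (a + |alpha|) / 2,  c2^2 = (-a + |alpha|) / 2,
    where alpha^2 = a^2 + b^2 and c1*c2 has the same sign as b.
    """
    alpha = math.hypot(a, b)  # non-negative alpha with alpha^2 = a^2 + b^2
    # max(0.0, ...) guards against tiny negative values caused by rounding.
    c1 = math.sqrt(max(0.0, (a + alpha) / 2))
    c2 = math.sqrt(max(0.0, (-a + alpha) / 2))
    if b < 0:
        c2 = -c2  # choose signs so that 2*c1*c2 = b
    return c1, c2


c1, c2 = complex_square_root(3.0, 4.0)
# Real and imaginary parts of (c1 + c2*i)^2, to compare with (3, 4).
real_part = c1 * c1 - c2 * c2
imag_part = 2 * c1 * c2
```

For $z = 3 + 4i$ this gives $|\alpha| = 5$, $c_1^2 = 4$ and $c_2^2 = 1$, in agreement with $(2 + i)^2 = 3 + 4i$.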

3.3 The Intermediate Value Theorem

In this section we will discuss a very important theorem for real continuous and differentiable functions that holds in the context of polynomials with coefficients in a real closed field. This is the familiar intermediate value theorem, and it will be the key to our success in the next chapter.

Theorem 3.3.1 (Intermediate Value Theorem). Let $F$ be a real closed field, $f(x) \in F[x]$, $a, b \in F$ and $a < b$. Then if $f(a) f(b) < 0$, there exists a $c \in F$ such that $a < c < b$ and $f(c) = 0$.

Proof. From theorem 3.2.3 we already know that the only irreducible polynomials in $F[x]$ are those of degree 1 or 2. Furthermore, a polynomial $x^2 + \alpha x + \beta \in F[x]$ is irreducible if and only if $\alpha^2 - 4\beta < 0$. This follows in the same way as for second degree polynomials with real coefficients. Now let us pick $f(x) \in F[x]$ to be monic and of positive degree. The general case then follows quickly by dividing out the leading coefficient and by noting that the premise cannot hold for polynomials of degree zero. We can write $f(x)$ in terms of its irreducible factors as:

\[ f(x) = \prod_{i=1}^{m} (x - r_i) \prod_{j=1}^{s} g_j(x), \]

where $r_1, \dots, r_m \in F$ and $g_1(x), \dots, g_s(x) \in F[x]$ with:

\[ g_j(x) = x^2 + a_j x + b_j, \qquad a_j^2 - 4 b_j < 0, \qquad 1 \leq j \leq s. \]

For $j \in \{ 1, \dots, s \}$ we can, by lemma 3.2.1, find $0 < c_j \in F$ such that $c_j^2 = \frac{1}{4} (4 b_j - a_j^2)$. We can then write:

\[ g_j(x) = \left( x + \frac{a_j}{2} \right)^2 + c_j^2, \]

so that for all $u \in F$, $g_j(u) > 0$.

We first rule out the case that $f(x)$ has no irreducible factors of first degree. If this were the case, then $f(a) f(b) = \prod_{j=1}^{s} g_j(a) g_j(b) > 0$, contradicting our hypothesis.

Now, for any $i \in \{ 1, \dots, m \}$ with $a < r_i \land b < r_i$, or with $a > r_i \land b > r_i$, we have $(a - r_i)(b - r_i) > 0$. Hence, if every $r_i$ satisfied one of these two conditions, then

\[ f(a) f(b) = \prod_{i=1}^{m} (a - r_i)(b - r_i) \prod_{j=1}^{s} g_j(a) g_j(b) > 0. \]

Since $f(a) f(b) < 0$ also rules out $r_i = a$ and $r_i = b$, we conclude that there exists an $i \in \{ 1, \dots, m \}$ such that $a < r_i < b$ and $f(r_i) = 0$, which concludes the proof.

The key property in the proof above was that every positive element of $F$ can be written as a square, which is a characteristic property of real closed fields. It turns out that analogues of several other important theorems in real analysis, such as Rolle’s Theorem and the Mean Value Theorem, hold for polynomials in a real closed field as well.

Chapter 4

Sturm’s Theorem

In this chapter we will study the classical method for determining the number of zeroes of a polynomial with real coefficients that are contained within an open interval, which is based on a theorem by J.C.F. Sturm, published in 1829 [7]. In particular, this method allows us to symbolically locate the zeroes of a polynomial up to an arbitrary precision. We will study this method in the context of real closed fields, which we have shown to encompass the real number system. We will give two versions of the theorem. The first gives a decision method in terms of variations in sign of a sequence of numbers. The second answers when a parametrized family of polynomials has a zero in a certain interval, by reducing it to a set of polynomial equations and inequations for the parameters of the family, where the equations and inequations have integer coefficients. From the last theorem we can then quickly show that if a polynomial with rational coefficients has a zero in one real closed field, it will have a zero in any real closed field.

Throughout this chapter, $R$ will denote a real closed field, equipped with the strict total order $>$. Also, if $a, b \in R$ and $a < b$ we will use the notations $[a, b] = \{ x \in R \mid a \leq x \leq b \}$ and $]a, b[ = \{ x \in R \mid a < x < b \}$ for closed and open intervals respectively. Most of this chapter draws from [4], but several definitions and theorems have been modified to streamline the discussion and to get some more general results.

4.1 Variations in sign

Definition 4.1.1. Let $(c_0, \dots, c_n) \in R^{n+1}$ be a sequence of numbers in $R$. Then the number of variations in sign of this sequence is defined to be

\[ \left| \{ i \in \{ 1, \dots, n' \} \mid c'_{i-1} c'_i < 0 \} \right|, \]

where $(c'_0, \dots, c'_{n'})$ is the subsequence obtained by dropping the zero elements of the original sequence.
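Definition 4.1.1 translates directly into code. The following Python sketch counts the variations in sign of a finite sequence of numbers; the function name `sign_variations` and the sample sequence are our own choices for illustration.

```python
def sign_variations(values):
    """Number of variations in sign of a finite sequence.

    Zero entries are dropped first; then each pair of neighbours in the
    remaining subsequence whose product is negative counts as one variation.
    """
    nonzero = [v for v in values if v != 0]
    return sum(1 for x, y in zip(nonzero, nonzero[1:]) if x * y < 0)


# Dropping the zeroes of (1, 0, -2, 3, 0, 0, 4) leaves (1, -2, 3, 4),
# which changes sign twice.
example = sign_variations([1, 0, -2, 3, 0, 0, 4])  # 2
```

Note that a sequence consisting only of zeroes, or of a single element, has no variations in sign under this definition.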

Definition 4.1.2. Let $f(x) \in R[x]$ and $a, b \in R$ with $a < b$. Then a Sturm sequence for $f(x)$ on $[a, b]$ is a sequence of polynomials $(f_0(x), \dots, f_s(x)) \in R[x]^{s+1}$ such that $f_0(x) = f(x)$ and:

1. $f_0(a) f_0(b) \neq 0$,

2. $\forall c \in [a, b] : f_s(c) \neq 0$ (i.e. $f_s(x)$ has no zeroes in $[a, b]$),

3. If $c \in [a, b]$ and $f_j(c) = 0$ for some $j \in \{ 1, \dots, s - 1 \}$, then $f_{j-1}(c) f_{j+1}(c) < 0$,

4. If $c \in [a, b]$ and $f(c) = 0$, there exist open intervals $]c_1, c[, ]c, c_2[ \subset R$ such that $\forall u \in ]c_1, c[ : f_0(u) f_1(u) < 0$ and $\forall u \in ]c, c_2[ : f_0(u) f_1(u) > 0$.

In the proposition below we will show that a Sturm sequence can be used to calculate the number of distinct (i.e. not counting multiplicity) zeroes of the polynomial that lie in some open interval.

Proposition 4.1.1. Let $f(x) \in R[x]$ be of positive degree, $a, b \in R$ with $a < b$, and $(f_0(x), \dots, f_s(x))$ a Sturm sequence for $f(x)$ on $[a, b]$. For any $c \in [a, b]$, denote the number of variations in sign of $(f_0(c), \dots, f_s(c))$ as $V_c$. Then the number of distinct zeroes of $f(x)$ within $]a, b[$ is $V_a - V_b$.

Proof. Since the number of zeroes of all the $f_i(x)$ within $[a, b]$ is finite, we can write them down as $a = a_0 < a_1 < \dots < a_m = b$ so that no $f_j(x)$ has a zero in any of the open intervals $]a_{i-1}, a_i[$, $1 \leq i \leq m$. Now pick for $1 \leq i \leq m$: $c_i \in ]a_{i-1}, a_i[$.

First we see that no $f_j(x)$ has a zero in $]a_0, c_1]$. Then by the contrapositive of theorem 3.3.1 we have $f_j(a_0) f_j(c_1) > 0$ for every $j \in \{ 0, \dots, s \}$ with $f_j(a_0) \neq 0$. Now let $k \in \{ 0, \dots, s \}$ with $f_k(a_0) = 0$. Then clearly $0 < k < s$, since $f_0(a_0) \neq 0 \neq f_s(a_0)$, and so $f_{k-1}(a_0) f_{k+1}(a_0) < 0$. Then $f_{k-1}(a_0) f_{k+1}(a_0) f_{k-1}(c_1) f_{k+1}(c_1) > 0$ implies that $f_{k-1}(c_1) f_{k+1}(c_1) < 0$. Taking into account all such $k$, we get $V_{a_0} = V_{c_1}$. In exactly the same way we may prove that $V_{c_m} = V_{a_m}$.

We now let $i \in \{ 1, \dots, m - 1 \}$. Then if $f(a_i) \neq 0$, we can carry through the same argument to get $V_{c_i} - V_{c_{i+1}} = 0$. If $f(a_i) = 0$, we note that (possibly by repicking our $c_i$ and $c_{i+1}$ to comply with property 4 of a Sturm sequence) $f_0(c_i) f_1(c_i) < 0$ and $f_0(c_{i+1}) f_1(c_{i+1}) > 0$. Furthermore, the argument above again shows that if $1 < j < s$, then $f_{j-1}(c_i), f_j(c_i), f_{j+1}(c_i)$ and $f_{j-1}(c_{i+1}), f_j(c_{i+1}), f_{j+1}(c_{i+1})$ have the same number of variations in sign. Therefore in this case $V_{c_i} - V_{c_{i+1}} = 1$. We can now write:

\[ V_a - V_b = (V_a - V_{c_1}) + \sum_{i=1}^{m-1} (V_{c_i} - V_{c_{i+1}}) + (V_{c_m} - V_{a_m}) = \sum_{i=1}^{m-1} \delta_i, \]

where $\delta_i = 1$ if $f(a_i) = 0$ and $\delta_i = 0$ if $f(a_i) \neq 0$. Now since all of the zeroes of $f(x)$ that lie within $]a, b[$ are by definition among the $a_i$, we have counted all the zeroes. Therefore, $V_a - V_b$ is the total number of distinct zeroes of $f(x)$ that lie within $]a, b[$.

Now that we have a method of determining how many distinct zeroes a polynomial has in some open interval, given a Sturm sequence, we will need a method to actually produce a Sturm sequence. If we do this, we have a full-blown algorithm to determine the number of zeroes of a polynomial in some interval. Even better, if we can find a bound on the absolute values of the zeroes of a polynomial and strategically dissect the resulting interval, we can locate the zeroes numerically up to an arbitrary precision! It turns out that we can construct a Sturm sequence in a formalized way, using the Euclidean division algorithm.

Definition 4.1.3. Let $f(x) \in R[x]$ be of positive degree and $f'(x) \in R[x]$ its formal derivative. Then define the following sequence, terminating when $f_{s+1}(x) = 0$:

\[
\begin{aligned}
f_0(x) &= f(x), \\
f_1(x) &= f'(x), \\
f_{i+1}(x) &= q_i(x) f_i(x) - f_{i-1}(x), \qquad \deg(f_{i+1}) < \deg(f_i), \quad 1 \leq i \leq s,
\end{aligned}
\tag{4.1}
\]

where $q_i(x) \in R[x]$. Then $(f_0(x), \dots, f_s(x))$ is called the standard sequence of $f(x)$.

Note: Existence and uniqueness

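The polynomials $f_{i+1}(x)$ and $q_i(x)$ exist and are unique by corollary 2.3.2. Note however that we have picked $f_{i+1}(x) = -r(x)$, where $r(x)$ denotes the remainder of the Euclidean division of $f_{i-1}(x)$ by $f_i(x)$. This is the key in producing a Sturm sequence.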
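The construction of the standard sequence can be sketched in a few lines of Python over $\mathbb{Q}$, using exact `fractions.Fraction` arithmetic. In this sketch a polynomial is a list of coefficients ordered from the constant term upwards; this representation and the helper names `trim`, `derivative`, `remainder` and `standard_sequence` are assumptions made for the illustration and are not the thesis' algorithm 1.

```python
from fractions import Fraction


def trim(p):
    """Return a copy of p without trailing (highest-degree) zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    """Formal derivative of p (coefficients ordered from degree 0 upwards)."""
    return trim([i * c for i, c in enumerate(p)][1:])


def remainder(num, den):
    """Remainder of the Euclidean division of num by a non-zero den."""
    num, den = trim(num), trim(den)
    while len(num) >= len(den):
        factor = num[-1] / den[-1]
        shift = len(num) - len(den)
        for i, c in enumerate(den):
            num[shift + i] -= factor * c  # cancels the leading term of num
        num = trim(num)
    return num


def standard_sequence(coeffs):
    """Standard sequence f_0 = f, f_1 = f', f_{i+1} = -(f_{i-1} mod f_i)."""
    f0 = trim([Fraction(c) for c in coeffs])
    seq = [f0, derivative(f0)]
    while seq[-1]:  # stop once the zero polynomial (empty list) appears
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq[:-1]  # drop the trailing zero polynomial


# f(x) = x^3 + 3x + 1, written as [a_0, a_1, a_2, a_3].
sequence = standard_sequence([1, 3, 0, 1])
```

For $f(x) = x^3 + 3x + 1$ this yields the lists `[1, 3, 0, 1]`, `[3, 0, 3]`, `[-1, -2]` and `[-15/4]`, i.e. $f$, $3x^2 + 3$, $-(2x + 1)$ and $-\frac{15}{4}$.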

We notice that if $(f_0(x), \dots, f_s(x))$ is the standard sequence for some $f(x) \in R[x]$, then $f_s(x)$ is a common factor of $f(x)$ and $f'(x)$ and all $f_i(x)$, and any such common factor will be a factor of $f_s(x)$. Temporarily passing to the field of fractions of $R[x]$, we can then define a derived sequence $(g_0(x), \dots, g_s(x))$ by setting $g_i(x) = f_i(x) f_s(x)^{-1}$ for $0 \leq i \leq s$ and observing that each $g_i(x) \in R[x]$.

Lemma 4.1.2. Let $f(x) \in R[x]$ be of positive degree and $(f_0(x), \dots, f_s(x))$ be its standard sequence. Define the derived sequence of $f(x)$ as $(g_0(x), \dots, g_s(x))$, where $g_i(x) = f_i(x) f_s(x)^{-1} \in R(x)$ (footnote 1) for $0 \leq i \leq s$. Then each $g_i(x) \in R[x]$, and the derived sequence is a Sturm sequence for $g_0(x)$ on every interval $[a, b]$ such that $g_0(a) g_0(b) \neq 0$. Furthermore, $\forall c \in R : f(c) = 0 \iff g_0(c) = 0$.

Footnote 1: $R(x)$ denotes the field of fractions of $R[x]$.

Proof. We showed above that $f_s(x)$ is a common factor of all the $f_i(x)$. Therefore, for every $0 \leq i \leq s$ we have some $h_i(x) \in R[x]$ such that $f_i(x) = h_i(x) f_s(x)$ and thus $g_i(x) = h_i(x) f_s(x) f_s(x)^{-1} = h_i(x) \in R[x]$.

We will now show that the derived sequence is a Sturm sequence. Let $a, b \in R$ with $a < b$ and $g_0(a) g_0(b) \neq 0$. Then clearly property 1 holds. Furthermore, $g_s(x) = 1$, so that $g_s(x)$ has no zeroes in $R$ and hence not in $[a, b]$. We now use the definition of the standard sequence to see that for $1 \leq i \leq s$ (where it is understood that $g_{s+1}(x) = 0$):

\[
\begin{aligned}
g_{i-1}(x) &= f_{i-1}(x) f_s(x)^{-1} \\
           &= (q_i(x) f_i(x) - f_{i+1}(x)) f_s(x)^{-1} \\
           &= q_i(x) g_i(x) - g_{i+1}(x).
\end{aligned}
\]

Suppose that $c \in [a, b]$ and $g_j(c) = 0$ for $0 < j < s$. Then $g_{j-1}(c) g_{j+1}(c) = q_j(c) g_j(c) g_{j+1}(c) - (g_{j+1}(c))^2 = -(g_{j+1}(c))^2 \leq 0$. Also, $g_{j-1}(c) = -g_{j+1}(c)$, so if $g_{j-1}(c) = 0$, then $g_j(c) = 0 = g_{j+1}(c)$ and by induction we can then show that $g_s(c) = 0$, which is not the case. Therefore property 3 holds.

Lastly, suppose that $c \in [a, b]$ and $g_0(c) = 0$. Then $f(c) = g_0(c) f_s(c) = 0$, so there exist $h(x) \in R[x]$ and $e \in \mathbb{N}$ such that $f(x) = (x - c)^e h(x)$, $e > 0$ and $h(c) \neq 0$. Also, $f'(x) = e (x - c)^{e-1} h(x) + (x - c)^e h'(x)$. Therefore, $(x - c)^{e-1}$ is a common factor of $f(x)$ and $f'(x)$ and hence a factor of $f_s(x)$. It follows that there exists a $k(x) \in R[x]$ such that $f_s(x) = (x - c)^{e-1} k(x)$ and $k(c) \neq 0$. Then $h(x) = k(x) l(x)$ and $h'(x) = k(x) m(x)$ for some $l(x), m(x) \in R[x]$ with $l(c) \neq 0$. Then $g_0(x) = (x - c) l(x)$ and $g_1(x) = (x - c) m(x) + e\, l(x)$ and thus $g_1(c) = e\, l(c) \neq 0$. We may then choose an interval $[c_1, c_2]$ such that $c \in\, ]c_1, c_2[$ and the interval contains no zeroes of $g_1(x)$ nor $l(x)$ (footnote 2). Then, since $g_1(c) l(c) = e\, l(c)^2 > 0$, theorem 3.3.1 shows that $g_1(\gamma) l(\gamma) > 0$ on this interval, so that for $\gamma \in [c_1, c_2]$: $g_0(\gamma) g_1(\gamma) = (\gamma - c) g_1(\gamma) l(\gamma)$, which has the same sign as $\gamma - c$ and thus is negative when $\gamma \in\, ]c_1, c[$ and positive when $\gamma \in\, ]c, c_2[$. Hence property 4 holds and the derived sequence is a Sturm sequence for $g_0(x)$ in $[a, b]$.

By combining the foregoing lemma and proposition, we may now prove the main result of this section.

Theorem 4.1.3 (Sturm’s Theorem). Let $f(x) \in R[x]$ be of positive degree and $(f_0(x), \dots, f_s(x))$ its standard sequence. For all $c \in R$, let $V_c$ be the number of variations in sign of $(f_0(c), \dots, f_s(c))$. Then, if $a, b \in R$, $a < b$ and $f(a) f(b) \neq 0$, the number of distinct zeroes of $f(x)$ in the interval $]a, b[$ is $V_a - V_b$.

Proof. Let $(g_0(x), \dots, g_s(x))$ be the derived sequence of $f(x)$. We have seen that $f(x)$ and $g_0(x)$ have the same distinct zeroes, so the derived sequence is a Sturm sequence for $g_0(x)$ on $[a, b]$. Also, since $f(a) \neq 0 \neq f(b)$, neither $(x - a)$ nor $(x - b)$ are common factors of $f(x)$ and $f'(x)$. It then follows that $f_s(a) \neq 0 \neq f_s(b)$ and thus the sequences

\[ f_i(a) = g_i(a) f_s(a) \quad \text{and} \quad f_i(b) = g_i(b) f_s(b) \]

have the same variations in sign as the $g_i(a)$ and $g_i(b)$ respectively. Now, by the foregoing proposition and the observation above, the number of distinct zeroes of $f(x)$ in $]a, b[$ is equal to the number of distinct zeroes of $g_0(x)$ in the interval, which is $V_a - V_b$.

We can use the foregoing result to form a useful algorithm, that runs in polynomial time with respect to the degree of the polynomial in question.

Footnote 2: E.g. by choosing a random such interval and then filtering out the zeroes of $g_1(x)$ and $l(x)$ by taking the ones closest to $c$ and averaging with $c$.

Algorithm 3: Calculating the total number of zeroes of a polynomial

Let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ be monic and of positive degree. Define $\mu = 1 + \max(1, \sum_{i=0}^{n-1} |a_i|)$. Calculate the standard sequence $(f_0(x), \dots, f_s(x))$ of $f(x)$ by repetitive use of algorithm 1. For $c \in R$, let $V_c$ denote the number of variations in sign of the sequence $(f_0(c), \dots, f_s(c))$. Then the total number of distinct zeroes of $f(x)$ in $R$ is $V_{-\mu} - V_{\mu}$.

Proof. We have found in lemma 3.1.10 that all zeroes of $f(x)$ are contained in the interval $[-M, M]$, where $M = \max(1, \sum_{i=0}^{n-1} |a_i|)$. Therefore, all zeroes of $f(x)$ are certainly contained in the open interval $]-\mu, \mu[$, where $\mu = 1 + M$, and $f(-\mu) f(\mu) \neq 0$. If we combine this with Sturm's theorem, we get $V_{-\mu} - V_{\mu}$ as the total number of distinct zeroes of $f(x)$.

Example. We let $f(x) = x^3 + 3x + 1 \in R[x]$. Then $f'(x) = 3x^2 + 3$ and the Euclidean sequence of $f(x)$ and $f'(x)$ (and thus the standard sequence of $f(x)$) is:

\[
\begin{aligned}
f_0(x) &= x^3 + 3x + 1, \\
f_1(x) &= 3x^2 + 3, \\
f_2(x) &= -(2x + 1), \\
f_3(x) &= -\tfrac{15}{4}.
\end{aligned}
\]

We observe that all zeroes of $f(x)$ will lie in the interval $]-M-1, M+1[$, where $M = \max(1, 4) = 4$. We therefore evaluate the standard sequence at $-5$ and $5$.

\[
\begin{array}{ll}
f_0(-5) = -139 < 0, & f_0(5) = 141 > 0, \\
f_1(-5) = 78 > 0, & f_1(5) = 78 > 0, \\
f_2(-5) = 9 > 0, & f_2(5) = -11 < 0, \\
f_3(-5) = -\tfrac{15}{4} < 0, & f_3(5) = -\tfrac{15}{4} < 0.
\end{array}
\]

From this we see that $V_{-5} - V_5 = 2 - 1 = 1$, so $f(x)$ has 1 distinct zero in any real closed field.
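The computation in Algorithm 3 and the example above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the thesis: it assumes rational coefficients (represented exactly with `fractions.Fraction`), stores polynomials as coefficient lists with the highest degree first, and all function names are hypothetical. The input is normalised to be monic before the bound $\mu$ is computed.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest degree first), keeping one entry."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def is_zero(p):
    return all(c == 0 for c in p)

def poly_rem(a, b):
    """Remainder of the Euclidean division of a by b (b must be nonzero)."""
    a = [Fraction(c) for c in a]
    b = trim([Fraction(c) for c in b])
    while len(a) >= len(b) and any(a):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a = a[1:]  # the leading coefficient is now zero
    return trim(a) if a else [Fraction(0)]

def derivative(p):
    n = len(p) - 1
    d = [c * (n - i) for i, c in enumerate(p[:-1])]
    return d if d else [Fraction(0)]

def standard_sequence(f):
    """f_0 = f, f_1 = f', and f_{j+1} = -(remainder of f_{j-1} by f_j)."""
    f = trim([Fraction(c) for c in f])
    seq = [f, derivative(f)]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if is_zero(r):
            return seq
        seq.append([-c for c in r])

def evaluate(p, x):
    """Horner evaluation."""
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value

def sign_variations(seq, x):
    """Number of sign changes in the sequence of values at x, ignoring zeroes."""
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

def count_real_zeroes(f):
    """Total number of distinct real zeroes of f (positive degree), via V_{-mu} - V_{mu}."""
    f = trim([Fraction(c) for c in f])
    monic = [c / f[0] for c in f]
    mu = 1 + max(Fraction(1), sum(abs(c) for c in monic[1:]))
    seq = standard_sequence(monic)
    return sign_variations(seq, -mu) - sign_variations(seq, mu)

print(count_real_zeroes([1, 0, 3, 1]))  # the example polynomial x^3 + 3x + 1
```

For the example polynomial the sketch builds the same four-term sequence and evaluates it at $\pm 5$, as in the text. Exact rational arithmetic is used deliberately, since floating-point rounding could flip the sign of a value that is close to zero.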

4.2 Systems of equations, inequations and inequalities

This section serves as a preamble to the next section. We will now develop the notion of a system of equations, inequations and inequalities, which are expressions $v(t_1, \dots, t_r) = 0$, $v(t_1, \dots, t_r) \neq 0$, and $v(t_1, \dots, t_r) > 0$ respectively, where $v \in \mathbb{Z}[t_1, \dots, t_r]$ for indeterminates $t_i$, $1 \le i \le r$. Note that we will write $v(t_i)$ for $v(t_1, \dots, t_r)$ if it is more convenient. We can consider any ordered field $F$, which will contain $\mathbb{Z}$ as a subring. We then have an evaluation homomorphism $\mathbb{Z}[t_1, \dots, t_r] \to F$ induced by the inclusion homomorphism, that sends $\mathbb{Z}$ to $\mathbb{Z}$ and $t_i$ to some $c_i \in F$. In this way we can look for solutions of such an expression in the field $F$.

We further note that if $v(c_1, \dots, c_r) \neq 0$ and $w(c_1, \dots, c_r) \neq 0$, then, since the solutions of these two inequations are in a field $F$, we can rewrite this equivalently as $v(c_1, \dots, c_r)\, w(c_1, \dots, c_r) \neq 0$. So, any finite set of inequations can be replaced by a single inequation. We can now state the following definition.

Definition 4.2.1. An r-system (of equations, inequations and inequalities) is a triple

\[ \Gamma = ((v_1, \dots, v_s),\ v,\ (v_{>1}, \dots, v_{>u})) \in \bigcup_{i=1}^{\infty} \mathbb{Z}[t_1, \dots, t_r]^{(i)} \times \mathbb{Z}[t_1, \dots, t_r] \times \bigcup_{i=1}^{\infty} \mathbb{Z}[t_1, \dots, t_r]^{(i)}. \]

Moreover, if $(F, P)$ is an ordered field, then the solution set of $\Gamma$ is the set $\Gamma(F)$ of $(c_1, \dots, c_r) \in F^{(r)}$ such that:

\[
\begin{aligned}
& v_1(c_i) = \dots = v_s(c_i) = 0, \\
& v(c_i) \neq 0, \\
& v_{>1}(c_i), \dots, v_{>u}(c_i) > 0.
\end{aligned}
\]

If we wish to specify a system without equalities, we can specify the trivial equality $0 = 0$. Similarly, we can adjoin the trivial inequation $1 \neq 0$ and inequality $1 > 0$. In this chapter, we shall not use inequalities much, and when we do not need them, we shall drop the last term in the triple, assuming the trivial inequality is to be adjoined. Also, when no inequation (the second element in the triple) has been specified, we assume that the trivial inequation must be adjoined.

We can now ask when a set of systems covers all possible cases. The following definition will make this formal.

Definition 4.2.2. An r-cover is a finite set of r-systems $\delta = \{\Delta_1, \dots, \Delta_s\}$ such that for any ordered field $F$:
\[ \bigcup_{\Delta \in \delta} \Delta(F) = F^{(r)}. \]
Also, a refinement of an r-cover $\gamma$ is an r-cover $\delta$, such that for any ordered field $F$:
\[ \forall \Delta \in \delta\ \exists \Gamma \in \gamma : \Delta(F) \subseteq \Gamma(F). \]

Definition 4.2.3. If $\Gamma$ and $\Delta$ are r-systems, their join is defined to be the r-system $\Gamma \sqcap \Delta$ (see footnote 3) that has as its equalities and inequalities both those of $\Gamma$ and $\Delta$, and as inequation the product of the inequations of $\Gamma$ and $\Delta$.

We will give the following lemmas without proof, as they are quite straightforward if you just write out the definitions.

Lemma 4.2.1. Let Γ and ∆ be r-systems. Then:

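\[ (\Gamma \sqcap \Delta)(F) = \Gamma(F) \cap \Delta(F), \]
for any ordered field $F$.

Lemma 4.2.2. If $\Gamma$ is an r-system and $\delta = \{\Delta_1, \dots, \Delta_s\}$ is a finite r-cover, and we define $\Gamma_j = \Gamma \sqcap \Delta_j$ for $1 \le j \le s$, then $\bigcup_{j=1}^{s} \Gamma_j(F) = \Gamma(F)$ for every ordered field $F$.

Lemma 4.2.3. Let $\gamma = \{\Gamma_1, \dots, \Gamma_u\}$ and $\delta = \{\Delta_1, \dots, \Delta_s\}$ be r-covers and define $\Gamma'_j = \Gamma_1 \sqcap \Delta_j$. Then $\gamma' = \{\Gamma'_1, \dots, \Gamma'_s, \Gamma_2, \dots, \Gamma_u\}$ is again an r-cover, and a refinement of $\gamma$.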
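To make the definitions of this section concrete, the following Python sketch models an r-system as a triple of polynomial functions and implements the join. It is only an illustration under stated assumptions: polynomials are represented as plain Python callables, the class `System` and the function `join` are hypothetical names, and the checks sample integer points only, so they illustrate lemma 4.2.1 and the cover property rather than prove them.

```python
from itertools import product

def one(*point):
    """The trivial inequation 1 != 0."""
    return 1

class System:
    """An r-system: equations v_i = 0, one inequation v != 0, inequalities v_j > 0."""

    def __init__(self, equations=(), inequation=one, inequalities=()):
        self.equations = tuple(equations)
        self.inequation = inequation
        self.inequalities = tuple(inequalities)

    def contains(self, point):
        """Is the point in the solution set of this system?"""
        return (all(v(*point) == 0 for v in self.equations)
                and self.inequation(*point) != 0
                and all(v(*point) > 0 for v in self.inequalities))

def join(gamma, delta):
    """Join of two systems: union of the equations and of the inequalities,
    product of the inequations."""
    return System(
        gamma.equations + delta.equations,
        lambda *point: gamma.inequation(*point) * delta.inequation(*point),
        gamma.inequalities + delta.inequalities,
    )

def disc(p, q):
    return p * p - 4 * q

delta1 = System(equations=[disc])             # p^2 - 4q = 0
delta2 = System(inequation=disc)              # p^2 - 4q != 0
gamma = System(inequalities=[lambda p, q: p])  # p > 0

points = list(product(range(-4, 5), repeat=2))

# {delta1, delta2} behaves like a 2-cover on the sampled points.
covers = all(delta1.contains(pt) or delta2.contains(pt) for pt in points)

# Lemma 4.2.1 on the sampled points: (gamma join delta1)(F) = gamma(F) ∩ delta1(F).
lemma_ok = all(
    join(gamma, delta1).contains(pt) == (gamma.contains(pt) and delta1.contains(pt))
    for pt in points
)
print(covers, lemma_ok)
```

Taking the product of the two inequations in `join` mirrors the remark above that finitely many inequations over a field can be replaced by a single one.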

4.3 Sturm’s Theorem Parametrized

We will now consider a family of polynomials in a formally real field $R$ whose coefficients are parametrized as multivariate polynomials over its prime ring $\mathbb{Z}$. That is, the family of polynomials is represented by a polynomial in $\mathbb{Z}[t_1, \dots, t_r][x]$. The $t_i$ represent parameters, and the $x$ represents a variable we wish to solve for. Using Sturm's Theorem we will show that we can, algorithmically, obtain a cover of systems in $\mathbb{Z}$ such that a member of this family has a zero in a certain interval if and only if the parameters and boundaries satisfy one of those systems. This method could be extended to parametrize the systems that the coefficients have to satisfy with respect to the boundaries of the system, but that extension will not be considered here.

In order to get to our main result, we first let $K = \mathbb{Z}$ and $R$ be a real closed field. We also let $r \in \mathbb{N}$, $r \ge 1$ and define $A = K[t_1, \dots, t_r]$, where the $t_i$, $1 \le i \le r$, are indeterminates. Now, if we pick $(c_1, \dots, c_r) \in R^{(r)}$, we have a homomorphism $A \to R$ that extends the inclusion homomorphism $K \to R$ and sends $t_i \mapsto c_i$. Therefore, we have an extension of this homomorphism $A[x] \to R[x]$ that maps each parametrized polynomial to a polynomial with coefficients in $R$: $F(t_i; x) \mapsto F(c_i; x)$.

Footnote 3: This is not standard notation, but it proves intuitive given lemma 4.2.1.

Since $A$ is a commutative ring, we can perfectly well perform Euclidean polynomial division in $A[x]$. If we now make the connection with the evaluation in $(c_1, \dots, c_r)$, we can make the following important observation.

Lemma 4.3.1. Let $F(t_i; x), G(t_i; x) \in A[x]$ with $G(t_i; x) \neq 0$ and $v_m(t_i)$ the leading coefficient of $G$. Then there exist an even $e \in \mathbb{N}$ and $Q(t_i; x), R(t_i; x) \in A[x]$ with $\deg(R) < \deg(G)$ and:

\[ v_m(t_i)^e F(t_i; x) = Q(t_i; x) G(t_i; x) - R(t_i; x). \]

Also, if $(c_1, \dots, c_r) \in R^{(r)}$ and $v_m(c_i) \neq 0$, then the $q(x), r(x) \in R[x]$ with $F(c_i; x) = q(x) G(c_i; x) - r(x)$ and $\deg(r) < \deg(G(c_i))$ differ from $Q(c_i; x)$ and $R(c_i; x)$ by a common positive multiplier.

We also note that the choice of the $Q(t_i; x)$, $R(t_i; x)$ and $e$ is independent of which real closed field we use.

Proof. The existence of an arbitrary $e \in \mathbb{N}$ and the $Q(t_i; x), R(t_i; x) \in A[x]$ follows from the Euclidean division algorithm. However, if $e$ is odd, we may multiply the entire equation by $v_m(t_i)$ and so obtain a new $\tilde{Q}(t_i; x)$ and $\tilde{R}(t_i; x)$ and an even $\tilde{e}$ so that the equation still holds.

Now, if $(c_1, \dots, c_r) \in R^{(r)}$ is such that $v_m(c_i) \neq 0$, then since $e$ is even we have $v_m(c_i)^e > 0$. Then evaluating the equation in the $c_i$ and dividing by $v_m(c_i)^e$, we obtain:

\[
\begin{aligned}
F(c_i; x) &= v_m(c_i)^{-e} Q(c_i; x) G(c_i; x) - v_m(c_i)^{-e} R(c_i; x) \\
          &= q(x) G(c_i; x) - r(x),
\end{aligned}
\]

where the $q(x), r(x) \in R[x]$ are as above. And since such $q(x)$ and $r(x)$ are unique in the polynomial ring of a field, we have $Q(c_i; x) = v_m(c_i)^e q(x)$ and $R(c_i; x) = v_m(c_i)^e r(x)$.

We are now ready to state the following proposition, that allows us to use Sturm's theorem on the parametrized polynomials.

Proposition 4.3.2. Let $F(t_i; x), G(t_i; x) \in A[x]$ with $G(t_i; x) = \sum_{j=0}^{m} v_j(t_i) x^j \neq 0$. Define $G_k(t_i; x) = \sum_{j=0}^{k} v_j(t_i) x^j$ and the r-systems $\Gamma_k = ((v_j,\ j > k),\ v_k)$ for $0 \le k \le m$ and $\Gamma_{-\infty} = ((v_0, \dots, v_m),\ 1)$ (see footnote 4). Then we can obtain, in a finite number of steps, an r-cover $\delta = \{\Delta_1, \dots, \Delta_h\}$ that is a refinement of the cover $\gamma = \{\Gamma_{-\infty}, \Gamma_0, \dots, \Gamma_m\}$ and $h$ sequences of polynomials $(F_{j0}(t_i; x), \dots, F_{js_j}(t_i; x))$ in $A[x]$ such that, if $(c_1, \dots, c_r) \in \Delta_j(R)$, then the terms of $(F_{j0}(c_i; x), \dots, F_{js_j}(c_i; x))$ differ from the terms of the Euclidean sequence of $F(c_i; x)$ and $G(c_i; x)$ by a positive multiplier. Also, if this property holds in one real closed field, then it holds for any real closed field.

Footnote 4: The $k$ and $-\infty$ correspond to the degree of $G(c_i; x)$ if $(c_i) \in \Gamma_k(R)$.

Proof. We consider any $k \in \{0, \dots, m\}$ with $v_k(t_i) \neq 0$ (or equivalently $\Gamma_k(R) \neq \emptyset$), for else $G_k(t_i; x) = G_j(t_i; x)$ for some $j < k$ and $\Gamma_k$ would not be contributing to the cover $\gamma$. We can then just as well omit $\Gamma_k$ in our refinement. Now find $Q_k(t_i; x), R_k(t_i; x) \in A[x]$ as in the foregoing lemma. We have to consider two cases.

If $R_k(t_i; x) = 0$, we can take the sequence $(F, G, 0)$ and $\Gamma_k$ as the corresponding system. This suffices because if $(c_1, \dots, c_r) \in \Gamma_k(R)$, then $G(c_i; x) = G_k(c_i; x)$ and thus the Euclidean sequence would be $(F(c_i), G_k(c_i))$. Note that we will use this case as an induction basis in the next case.

Now let $R_k(t_i; x) \neq 0$. If $k = m > \deg(F)$, then we see that $R_k(t_i; x) = F(t_i; x)$. We may then obtain the result for $G(t_i; x)$ and $R_k(t_i; x)$, by going through the argument again and seeing that this case is then excluded. Otherwise $\deg(R_k) + \deg(G_k) < \deg(F) + \deg(G)$, so by induction on the sum of the degrees, we may obtain a cover $\delta_k = \{\Delta_{k0}, \dots, \Delta_{kh_k}\}$ and $h_k$ sequences $(F_{kl0}(t_i; x), \dots, F_{kls_{kl}}(t_i; x))$ so that the required property holds for $G_k(t_i; x)$ and $R_k(t_i; x)$. We now define $\Gamma_{kl} = \Gamma_k \sqcap \Delta_{kl}$ for $l \in \{0, \dots, h_k\}$. Then, if $(c_i) \in \Gamma_{kl}(R) \subseteq \Gamma_k(R)$, we have $G_k(c_i; x) = G(c_i; x)$. Also, since $F_{kl0}(c_i; x) = G_k(c_i; x) = G(c_i; x)$ and $F_{kl1}(c_i; x) = -R_k(c_i; x)$, we can take the sequences $(F(c_i; x), F_{kl0}(c_i; x), \dots, F_{kls_{kl}}(c_i; x))$, whose terms differ from the Euclidean sequence of $F(c_i; x)$ and $G(c_i; x)$ by a positive multiplier, and pair these with the respective $\Gamma_{kl}$.

If we now let $\delta$ consist of the systems obtained above, and pair these with their respective sequences, including $\Gamma_{-\infty}$ with $(F, 0, 0)$, we have obtained a refinement of $\gamma$ that satisfies our requirements. We also note that the choice of the systems and sequences did not depend on the real closed field in question, so that the property holds for any real closed field.

Example. Let $F(p, q; x) = x^2 + px + q$ and $G(p, q; x) = 2x + p$. We then have $\Gamma_{-\infty}(R) = \Gamma_0(R) = \emptyset$ and $\Gamma_1(R) = R^{(2)}$. We therefore consider only $k = 2$, $G_2(p, q; x) = G(p, q; x)$. We first observe that:

\[ 2^2 F(p, q; x) = (2x + p)\, G(p, q; x) - (p^2 - 4q). \]

We therefore set $R_2(p, q; x) = p^2 - 4q$. Now, since $R_2(p, q; x) \in A$, another step (only possible if $p^2 - 4q \neq 0$) will give us the 0 polynomial. We therefore have the 2-cover and corresponding sequences:

\[
\begin{aligned}
\Gamma_1 &: p^2 - 4q = 0 \ \leftrightarrow\ (F, G), \\
\Gamma_2 &: p^2 - 4q \neq 0 \ \leftrightarrow\ (F, G, R_2).
\end{aligned}
\]
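The division step of lemma 4.3.1 can be tried out numerically once the parameters have been given integer values. The Python sketch below is an illustration only, under explicit assumptions: it works with integer coefficient lists (highest degree first) instead of elements of $A[x]$, the name `pseudo_divide` is hypothetical, and it follows the sign convention of the lemma, $\mathrm{lc}(G)^e F = Q G - R$ with $e$ even.

```python
def pseudo_divide(F, G):
    """Return (e, Q, R) with lc(G)**e * F == Q*G - R, e even and deg R < deg G.

    F and G are integer coefficient lists, highest degree first, and G[0] != 0.
    No division is performed, so all intermediate coefficients stay integers.
    """
    lc = G[0]
    rem = list(F)
    quo = []
    e = 0
    while len(rem) >= len(G):
        lead = rem[0]
        # Multiply the identity by lc, then cancel the leading term of rem.
        quo = [lc * c for c in quo] + [lead]
        rem = [lc * c for c in rem]
        for i in range(len(G)):
            rem[i] -= lead * G[i]
        rem = rem[1:]
        e += 1
    if not quo:
        quo = [0]
    if not rem:
        rem = [0]
    R = [-c for c in rem]  # sign convention of the lemma: ... = Q*G - R
    if e % 2 == 1:         # make the exponent even by one extra factor lc
        quo = [lc * c for c in quo]
        R = [lc * c for c in R]
        e += 1
    return e, quo, R

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_sub(a, b):
    n = max(len(a), len(b))
    a = [0] * (n - len(a)) + list(a)
    b = [0] * (n - len(b)) + list(b)
    return [x - y for x, y in zip(a, b)]

# The example with p = 3, q = 1: F = x^2 + 3x + 1, G = 2x + 3.
p, q = 3, 1
e, Q, R = pseudo_divide([1, p, q], [2, p])
print(e, Q, R)  # Q should be 2x + p and R should be p^2 - 4q
```

With $p = 3$ and $q = 1$ the remainder is the constant $p^2 - 4q = 5$, matching $R_2$ in the example. Choosing values with $p^2 = 4q$ instead gives a zero remainder, which corresponds to the system $\Gamma_1$.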

If we now recall that the standard sequence of a polynomial fpxq is simply the Euclidean sequence of fpxq and its formal derivative, we can quickly prove the following theorem, which is our second main result. Note that this version is more general than the one in [4], as the systems we have to obtain include requirements on the bounds of our interval.

Theorem 4.3.3 (Parametrized version of Sturm's Theorem). Let $F(t_i; x) \in A[x]$. Then there exists a finite set $\omega$ of $(r+2)$-systems in $K$ (see footnote 5), which we can obtain in a finite number of steps, such that for every $(c_1, \dots, c_r, a, b) \in R^{(r+2)}$ with $a < b$, $F(c_i; x)$ has a zero in $[a, b]$ if and only if either $F(c_i; a) F(c_i; b) = 0$, or $F(c_i; a) F(c_i; b) \neq 0$ and there is some $\Omega \in \omega$ such that $(c_1, \dots, c_r, a, b) \in \Omega(R)$.

We can restate this theorem as follows: Let $F(t_i; x)$ be a family of polynomials whose coefficients are parametrized by polynomials with integer coefficients. Then for any interval $]a, b[$ we can obtain a finite set of systems of equations, inequations and inequalities, so that $F(c_i; x)$ has a zero in that interval if and only if the coefficient parameters $(c_i)$ and the boundaries $a$ and $b$ satisfy one of those systems (provided that $F(c_i; a) F(c_i; b) \neq 0$).

Proof of the parametrized version of Sturm's Theorem, 4.3.3. Let $F(t_i; x) = \sum_{\nu=0}^{n} u_\nu(t_i) x^\nu$, where $u_\nu(t_i) \in A$, and $u_n(t_i)$ is the leading coefficient. Now, if $F'(t_i; x) = 0$, then $F(t_i; x) = u_0(t_i)$ is constant and we can take the sole system $((u_0))$.

We can therefore now assume that $0 \neq F'(t_i; x) = \sum_{\nu=1}^{n} \nu u_\nu(t_i) x^{\nu-1}$. Then by proposition 4.3.2 we can obtain a cover $\delta = \{\Delta_0, \dots, \Delta_h\}$ and corresponding sequences $(F_{j0}(t_i; x), \dots, F_{js_j}(t_i; x))$ ($0 \le j \le h$) such that if $(c_1, \dots, c_r) \in \Delta_j(R)$, then the terms of $(F_{j0}(c_i; x), \dots, F_{js_j}(c_i; x))$ differ from the terms of the standard sequence of $F(c_i; x)$ by positive multipliers. In particular, we see that at any point, the number of sign changes is the same.

Now pick any $j \in \{0, \dots, h\}$. If we let $\gamma$ be the same cover as in proposition 4.3.2, then $\delta$ is a refinement of $\gamma$. Therefore, if $(c_i) \in \Delta_j(R)$, we have either $u_n(c_i) = \dots = u_1(c_i) = 0$, in which case $F(c_i; x)$ has a zero if and only if $u_0(c_i) = 0$, or there is some $k \in \{1, \dots, n\}$ such that $u_k(c_i) \neq 0$ but $u_l(c_i) = 0$ for $l > k$. In the first case we set $\omega_j = \{((u_0))\}$: the sole equation $u_0 = 0$. In the latter case we may construct the following two sequences:

\[
\begin{aligned}
\alpha_{jl}(t_i; x_a, x_b) &= u_m(t_i)^{2 n_l} F_{jl}(t_i; x_a) \in A[x_a, x_b], \\
\beta_{jl}(t_i; x_a, x_b) &= u_m(t_i)^{2 n_l} F_{jl}(t_i; x_b) \in A[x_a, x_b],
\end{aligned}
\]

for $0 \le l \le s_j$, where $n_l = \deg(F_{jl})$ and $x_a$ and $x_b$ are new indeterminates. Then all the $\alpha_{jl}(c_i; a, b)$ and $\beta_{jl}(c_i; a, b)$ differ from $F_{jl}(c_i; a)$ and $F_{jl}(c_i; b)$ respectively, and hence from the standard sequence of $F(c_i; x)$ at those points, by a positive multiplier. If $F(c_i; a) F(c_i; b) \neq 0$, we now conclude by Sturm's Theorem that $F(c_i; x)$ has a zero in $]a, b[$ if and only if the numbers of variations in sign of the sequences $(\alpha_{j0}(c_i; a, 0), \dots, \alpha_{js_j}(c_i; a, 0))$ and $(\beta_{j0}(c_i; 0, b), \dots, \beta_{js_j}(c_i; 0, b))$ are not equal.

We therefore now take all possible $(r+2)$-systems on $K$ that can be formed by the elements of those sequences (of which there are finitely many), select the ones that lead to a difference in the number of variations in sign between the sequences, and for each take the join with $\Delta_j$ (see footnote 6), to form the set of systems $\omega_j$. Then, if $(c_i; a, b) \in \Omega(R)$ for some $\Omega \in \omega_j$, then $(c_i) \in \Delta_j(R)$, so that the above applies, and there is a difference between the variation in sign in the sequences $(\alpha_{jl}(c_i, a, 0))$ and $(\beta_{jl}(c_i, 0, b))$, so that $F(c_i; x)$ has a zero in $]a, b[$, provided that $F(c_i; a) F(c_i; b) \neq 0$. We also observe that if $F(c_i; a) F(c_i; b) = 0$, then $F(c_i; x)$ has a zero in $[a, b]$.

If we now let $\omega = \bigcup_{j=0}^{h} \omega_j$, we obtain the set of $(r+2)$-systems we require, since $\delta$ is a cover of $K$.

Footnote 5: The 2 extra parameters in the systems of $\omega$ correspond to the bounds of our interval. They serve to keep the set of systems independent of the real closed field that we choose to use.

Example. In our last example we obtained a 2-cover and the corresponding sequences for $F(p, q; x) = x^2 + px + q$ and $F'(p, q; x) = 2x + p$. We write:

\[
\begin{aligned}
\Delta_1 &: p^2 - 4q = 0 \ \leftrightarrow\ (F, F'), \\
\Delta_2 &: p^2 - 4q \neq 0 \ \leftrightarrow\ (F, F', p^2 - 4q).
\end{aligned}
\]

We can then define the corresponding $\alpha_{jl}$ and $\beta_{jl}$ as follows:

\[
\begin{array}{ll}
\alpha_{10}(p, q; x_a, x_b) = x_a^2 + p x_a + q, & \beta_{10}(p, q; x_a, x_b) = x_b^2 + p x_b + q, \\
\alpha_{11}(p, q; x_a, x_b) = 2 x_a + p, & \beta_{11}(p, q; x_a, x_b) = 2 x_b + p, \\[4pt]
\alpha_{20}(p, q; x_a, x_b) = x_a^2 + p x_a + q, & \beta_{20}(p, q; x_a, x_b) = x_b^2 + p x_b + q, \\
\alpha_{21}(p, q; x_a, x_b) = 2 x_a + p, & \beta_{21}(p, q; x_a, x_b) = 2 x_b + p, \\
\alpha_{22}(p, q; x_a, x_b) = p^2 - 4q, & \beta_{22}(p, q; x_a, x_b) = p^2 - 4q.
\end{array}
\]

For $\Delta_1$ we get the sole system
\[ \Omega_{11} = \big( (p^2 - 4q),\ 1,\ ( -(x_a^2 + p x_a + q)(2 x_a + p),\ (x_b^2 + p x_b + q)(2 x_b + p) ) \big), \]
that is, the $\alpha_{1l}$ will change sign, but the $\beta_{1l}$ will not. For $\Delta_2$, we have to consider three cases: the $\alpha_{2l}$ change sign once and the $\beta_{2l}$ do not (one zero), the $\alpha_{2l}$ change sign twice and the $\beta_{2l}$ do not (two zeroes), or the $\alpha_{2l}$ change sign twice and the $\beta_{2l}$ change sign once (one zero). These cases can all occur in several ways, and so we end up with a whole pile of systems.

Note: Existence of a zero

We have introduced two extra indeterminates in our systems in order to account for the zeroes of our polynomial. However, if we choose to investigate the problem of the existence of a zero in the entire field, we can drop those two parameters. This can be done by noting that we do not have to consider $a$ and $b$ up to the point that we define the sequences $(\alpha_{jl}(c_i))$ and $(\beta_{jl}(c_i))$. In particular, at that point we can see that for any $(c_i) \in R^{(r)}$, if $\rho \in R$ is to be a zero of $F(c_i; x)$, then necessarily $-\mu < \rho < \mu$, where
\[ \mu = k + 1 + \sum_{\nu=0}^{m-1} u_\nu(c_i)^2\, u_k(c_i)^{-2}. \]
We may from that point on let $a(t_i), b(t_i) \in A$ depend on the parameters and modify the $\alpha(t_i)$ and $\beta(t_i)$ accordingly, and finish the argument in the same way. We can therefore state the following corollary.

Footnote 6: Technically, we now have to transform the r-system $\Delta_j$ to an $(r+2)$-system by using the inclusion homomorphism $A \to A[x_a, x_b]$ on all the elements of the system.

Corollary 4.3.4. Let $F(t_i; x) \in A[x]$. Then we can construct a finite set of r-systems $\omega$ in $K$ such that for any real closed field $R$ and $(c_1, \dots, c_r) \in R^{(r)}$: $F(c_i; x)$ has a zero in $R$ if and only if $(c_1, \dots, c_r) \in \Omega(R)$ for some $\Omega \in \omega$.

Restated: Let $F(t_i; x) \in A[x]$ be a family of polynomials whose coefficients are parametrized by polynomials with integer coefficients. Then we can construct a finite set of systems of polynomial equations, inequations and inequalities with integer coefficients, independent of the real closed field in question, so that for any choice $(c_i)$ of the parameters, $F(c_i; x) \in R[x]$ has a zero in $R$ if and only if the $(c_i)$ satisfy one of the constructed systems.

Now let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ and let $F(t_i; x) = \sum_{i=0}^{n} t_i x^i$. We then see that $F(a_i; x) = f(x)$. Suppose that all the $a_i \in \mathbb{Q} \subset R$ and that $f(x)$ has a zero in $R$. Then by corollary 4.3.4 we can construct a set of $(n+1)$-systems in $\mathbb{Z}$ such that the $(a_i)$ satisfy one of those systems. Now let $R'$ be another real closed field. Then clearly all $a_i \in R'$ (by an isomorphism of the prime fields) and they still satisfy one of those systems. Therefore, the corresponding polynomial in $R'[x]$ will also have a zero in $R'$.

Corollary 4.3.5. If a polynomial $f(x)$ with rational coefficients has a zero in one real closed field, it will have a zero in any real closed field.

This last corollary is, among other things, of the utmost importance for computer calculations. For example, the computable numbers, described by Turing as "the numbers whose expressions as a decimal are calculable by a machine" [9], can be shown to be real closed [2]. This then gives the result that a polynomial with rational coefficients has a zero in the real numbers if and only if it has a zero in the computable numbers. Therefore, for any polynomial with rational coefficients, we are in principle able to compute all its real zeroes with a computer (or any realization of a Turing machine).

4.3.1 Tarski's Principle

The question now naturally arises whether we can generalize this procedure to families of polynomials in multiple indeterminates. The answer turns out to be positive. The idea we can pursue is to replace an equation in multiple indeterminates by a set of equations in one fewer indeterminate. We may go on with this procedure to eventually obtain a set of equations that have to be satisfied for the original equation to be solvable. If we then invoke the parametrized version of Sturm's Theorem for each of these, we obtain a set of systems that will have to be satisfied by the parameters for our equation to be solvable. [4, Sec. 5.6]

This method has an important application in the field of metamathematics, where the properties of mathematics itself are studied. In particular, it implies that every "elementary" sentence in the logic of a real closed field is decidable. This was shown by Tarski in 1948 for the real numbers. [8] Note that by the logic of a real closed field, we mean the first-order logic that remains when only the axioms of the field itself are assumed. Set-theoretic sentences are not allowed. It does, however, to quote Tarski, give "the mathematician the assurance that he will be able to solve every such problem (an elementary problem in a real closed field) by working at it long enough." And with that assurance we can continue to make algebraic exercises for high school students.

Epilogue

Sturm's Theorem has provided us with a very simple way to determine the number of zeroes of a polynomial that lie within a certain interval. It is interesting to note that despite the simplicity of this method, it is not widely taught in undergraduate calculus courses. Perhaps this can be attributed to the inefficiency of the algorithm compared to more modern root-finding methods, the amount of algebra involved, or simply its age (almost 200 years!). In any case I would like to express my hope that the tide could change in this respect.

Nevertheless, the theorem not only provides us with this calculation method, it also leads to several important theoretical implications. As examples we have seen the decidability of the theory of real closed fields (in metamathematics), and the fact that if a polynomial with rational coefficients has a zero in one real closed field, then it has one in every real closed field. The last result finds an application in computer science, where we can conclude that we can compute every zero of a polynomial with rational (even computable!) coefficients with a computer program.

I have personally enjoyed this project very much due to the large amount of new algebra I have come to learn, and the discovery of an obscure, but fun and useful result. I know that I will definitely have use for Sturm's Theorem in the future.

Lastly, I would like to acknowledge Prof. Dr. Jaap Top and Dr. Ramsay Dyer for their support during the course of this project. Prof. Top recommended this project, and they have both provided me with very useful feedback on the report, for which I am very grateful.

Bibliography

[1] E. Artin and O. Schreier. Algebraische Konstruktion reeller Körper. Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, 5(1):85–99, December 1927. Conference proceedings from June 1926.

[2] M. Braverman. On the complexity of real functions. In Proceedings of the 2005 46th Annual IEEE Symposium on Foundations of Computer Science. IEEE, 2005.

[3] D.J. Griffiths. Introduction to Quantum Mechanics. Pearson Education, 2nd edition, 2005.

[4] N. Jacobson. Basic Algebra, volume 1. Dover, dover edition, 2009.

[5] N. Jacobson. Basic Algebra, volume 2. Dover, dover edition, 2009.

[6] J.P. Serre. Extensions de corps ordonnés. In Comptes rendus des séances de l'Académie des Sciences, pages 576–577, September 1949.

[7] J.C.F. Sturm. Mémoire sur la résolution des équations numériques. Bulletin des Sciences de Férussac, 11:419–425, 1829.

[8] A. Tarski. A Decision Method for Elementary Algebra and Geometry. RAND Corporation, 1948.

[9] A.M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42:230–265, 1937.
