
EFFECTIVE ELIMINATION OVER REAL CLOSED FIELDS

N. Vorobjov

Tarski (1931, 1951): Elementary algebra and geometry is decidable. More generally, there is an algorithm for quantifier elimination in the first-order theory of the reals.

Boolean formula $F(x_0, x_1, \dots, x_\nu)$ with atoms of the kind $f > 0$, where $f \in \mathbb{R}[x_0, x_1, \dots, x_\nu]$, $x_i = (x_{i,1}, \dots, x_{i,n_i})$.

$\Phi(x_0) := Q_1 x_1 Q_2 x_2 \cdots Q_\nu x_\nu F(x_0, x_1, \dots, x_\nu)$, where $Q_i \in \{\exists, \forall\}$, $Q_i \neq Q_{i+1}$.

Theorem 1. There is a quantifier-free formula $\Psi(x_0)$ such that $\Psi(x_0) \Leftrightarrow \Phi(x_0)$ over $\mathbb{R}$, i.e., $\{x_0 \in \mathbb{R}^{n_0} \mid \Psi(x_0)\} = \{x_0 \in \mathbb{R}^{n_0} \mid \Phi(x_0)\}$.

The theorem remains true if $\mathbb{R}$ is replaced by any real closed field $R$, e.g., the real algebraic numbers or Puiseux series over $\mathbb{R}$ in an infinitesimal.
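A standard textbook instance of Theorem 1 (not taken from the slides) may help fix the notation. Here $\nu = 1$, the free variables are $x_0 = (b, c)$, the quantified variable is $x_1 = x$, and the atom $f = 0$ is shorthand for $\neg(f > 0) \wedge \neg(-f > 0)$:

```latex
\Phi(b, c) := \exists x \, (x^2 + bx + c = 0)
\qquad\Longleftrightarrow\qquad
\Psi(b, c) := (b^2 - 4c \ge 0).
```

The set defined by $\Phi$ in the $(b, c)$-plane is therefore the region on or below the parabola $c = b^2/4$.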

Theorem 2. If the atoms $f \in \mathbb{Z}[x_0, x_1, \dots, x_\nu]$, then there is an algorithm which eliminates quantifiers from $\Phi(x_0)$, i.e., for a given $\Phi(x_0)$ produces $\Psi(x_0)$.

Corollary 3. The first-order theory of the reals is decidable, i.e., there is an algorithm which for a given closed formula $Q_1 x_1 Q_2 x_2 \cdots Q_\nu x_\nu F(x_1, \dots, x_\nu)$ decides whether it’s true or false.

Integer coefficients are mentioned here so that the problem can be handled on a Turing machine. If we are less picky about the model of computation, for example allowing exact arithmetic operations and comparisons in a real closed field $R$, then Theorem 2 and Corollary 3 are also true for formulae over $R$. In what follows we will use exactly this approach.

Let $R$ be a real closed field.

Definition 4. A set $X \subset R^n$ is called semialgebraic if $X$ is representable in the form $X = \{x \in R^n \mid F(x)\}$, where $F(x)$ is a quantifier-free formula.

Corollary 5. Let $X = \{x_0 \in R^{n_0} \mid \Phi(x_0)\}$, where $\Phi(x_0) := Q_1 x_1 Q_2 x_2 \cdots Q_\nu x_\nu F(x_0, x_1, \dots, x_\nu)$. Then $X$ is semialgebraic.

Examples:

• Feasibility of a system of equations and inequalities (= is a semialgebraic set empty?)

• Representation of the closure (or interior, or singular locus, or tubular neighbourhood, etc.) of a semialgebraic set by a Boolean combination of polynomial inequalities

• Polynomial optimization: find the set of points of local minima of a polynomial function on a semialgebraic set.

There is no need to explain to logicians the usefulness of quantifier elimination. Instead I list a few straightforward examples of how it can be used in “ordinary” mathematics.

In all the examples the semialgebraic set can initially be represented in a natural way by prenex formulae with quantifiers (sometimes involving ε/δ language).
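To make the ε/δ remark concrete, here is one way the closure of $X = \{x \in R^n \mid F(x)\}$ might be written as a prenex formula. This is a sketch under the usual metric definition of closure, with $y = (y_1, \dots, y_n)$ an auxiliary block of variables; it is not taken from the slides.

```latex
\overline{X} = \Bigl\{\, x \in R^n \;\Bigm|\;
  \forall \varepsilon \; \exists y \;
  \Bigl( \varepsilon \le 0 \;\vee\;
         \bigl( F(y) \,\wedge\, \sum_{1 \le j \le n} (x_j - y_j)^2 < \varepsilon^2 \bigr) \Bigr)
\,\Bigr\}
```

Eliminating the two quantifier blocks yields the Boolean combination of polynomial inequalities mentioned in the second example.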

Complexity:

Tarski’s quantifier elimination algorithm: “non-elementary” (can’t be bounded from above by any tower of exponentials of a finite height).

Problem: construct an efficient algorithm

Possible approach: cylindrical (algebraic) cell decomposition (mid-70s, Wüthrich, Collins)

Definition 6. A cylindrical cell is defined by induction as follows.

1. A cylindrical 0-cell in $R^n$ is an isolated point. A cylindrical 1-cell in $R$ is an open interval $(a, b) \subset R$.

2. For $n \ge 2$ and $0 \le k < n$, a cylindrical $(k+1)$-cell $B$ in $R^n$ is either the graph of a continuous bounded function $f : C \to R$, where $C$ is a cylindrical $(k+1)$-cell in $R^{n-1}$, or else a set of the form
$$\{(x_1, \dots, x_n) \in R^n \mid (x_1, \dots, x_{n-1}) \in C \ \text{and}\ f(x_1, \dots, x_{n-1}) < x_n < g(x_1, \dots, x_{n-1})\},$$
where $C$ is a cylindrical $k$-cell in $R^{n-1}$, and $f, g : C \to R$ are continuous bounded functions such that $f(x_1, \dots, x_{n-1}) < g(x_1, \dots, x_{n-1})$ for all points $(x_1, \dots, x_{n-1}) \in C$.
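The inductive shape of Definition 6 can be mirrored by a small recursive data structure. The sketch below is only an illustration: the class names `Point`, `Interval`, `Graph`, `Band` and the functions `dim` and `contains` are hypothetical, the bounding functions are ordinary Python callables, and nothing checks that they are continuous, bounded, or satisfy $f < g$.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Union

@dataclass
class Point:            # 0-cell in R: the isolated point {a}
    a: float

@dataclass
class Interval:         # 1-cell in R: the open interval (a, b)
    a: float
    b: float

@dataclass
class Graph:            # graph of f over a base cell in R^(n-1)
    base: "Cell"
    f: Callable[[Sequence[float]], float]

@dataclass
class Band:             # {f < x_n < g} over a base cell in R^(n-1)
    base: "Cell"
    f: Callable[[Sequence[float]], float]
    g: Callable[[Sequence[float]], float]

Cell = Union[Point, Interval, Graph, Band]

def dim(cell):
    """Dimension of the cell: a graph keeps it, a band raises it by one."""
    if isinstance(cell, Point):
        return 0
    if isinstance(cell, Interval):
        return 1
    if isinstance(cell, Graph):
        return dim(cell.base)
    return dim(cell.base) + 1

def contains(cell, x):
    """Membership test, peeling off the last coordinate at each level."""
    if isinstance(cell, Point):
        return len(x) == 1 and x[0] == cell.a
    if isinstance(cell, Interval):
        return len(x) == 1 and cell.a < x[0] < cell.b
    head, last = x[:-1], x[-1]
    if not contains(cell.base, head):
        return False
    if isinstance(cell, Graph):
        return last == cell.f(head)
    return cell.f(head) < last < cell.g(head)

# The open unit disc as a 2-cell in R^2: a band over the interval (-1, 1).
disc = Band(Interval(-1.0, 1.0),
            lambda p: -math.sqrt(1 - p[0] ** 2),
            lambda p: math.sqrt(1 - p[0] ** 2))

# A piece of the parabola x2 = x1^2 as a 1-cell in R^2: a graph over (-1, 1).
parabola = Graph(Interval(-1.0, 1.0), lambda p: p[0] ** 2)
```

Floating-point comparison is used only to keep the sketch short; the slides assume exact arithmetic in $R$.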

Cylindrical cell decomposition is a very useful tool when we study o-minimal structures. Since I’ll need this concept also in my other lecture, let me give a definition. We first define a cylindrical cell. One can notice that any cylindrical cell is homeomorphic to an open ball of the matching dimension (semialgebraically homeomorphic in the case of an arbitrary $R$).

We will give the definition only for the bounded case.

Definition 7. A cylindrical cell decomposition $D$ of a subset $A \subset R^n$ is defined by induction as follows.

1. If $n = 1$, then $D$ is a finite family of pairwise disjoint cylindrical cells (i.e., isolated points and intervals) whose union is $A$.

2. If $n \ge 2$, then $D$ is a finite family of pairwise disjoint cylindrical cells in $R^n$ whose union is $A$, and there is a cylindrical cell decomposition $D'$ of $\pi(A)$ such that $\pi(C)$ is its cell for each $C \in D$, where $\pi : R^n \to R^{n-1}$ is the projection map onto the coordinate subspace of $x_1, \dots, x_{n-1}$. We say that $D'$ is induced by $D$.

Definition 8. Let $B \subset A \subset R^n$ and let $D$ be a CCD of $A$. Then $D$ is compatible with $B$ if for any $C \in D$ we have either $C \subset B$ or $C \cap B = \emptyset$ (i.e., some subset $D' \subset D$ is a CCD of $B$).

Example: [figure not reproduced]

Theorem 9 (Collins, Wüthrich). If $X \subset R^n$ is a semialgebraic set, then there is an algorithm for computing a CCD of $R^n$ compatible with $X$. Moreover, the CCD is semialgebraic, i.e., each cell is described by a system of polynomial equations and inequalities.

Technical tools: resultants and sub-resultants.

Compare with classical elimination theory (van der Waerden, Modern Algebra) and Chevalley’s Theorem.

Corollary 10. There is an algorithm for quantifier elimination.

The corollary is almost obvious from the definition of the CCD. One reasons by induction on quantifiers starting from $Q_\nu$ and moving outward. On each step we use the relation (described in the definition) between the decomposition of the space of all current variables and the decomposition of the subspace of current free variables.

Complexity:

Let $\Phi(x_0)$ involve $s$ atoms of the kind $f > 0$, where $f \in R[x_0, \dots, x_\nu]$, $n := n_0 + \cdots + n_\nu$, and $\deg(f) < d$.

Then the complexity of constructing a CCD and of the corresponding quantifier elimination has the upper bound $(sd)^{O(1)^n}$.

A similar upper bound holds for the number of cells and for the degrees and number of the polynomials describing the CCD and the quantifier-free formula equivalent to $\Phi$.

If the coefficients of the polynomials $f$ are integers of bit-sizes less than $M$, then the (Turing machine) complexity does not exceed $M^{O(1)} (sd)^{O(1)^n}$.

Davenport & Heintz (1988) (similar results by Weispfenning (1988)) found a (parametric) example of $\Phi$ of degrees $\le 4$, with two free variables and $n_1 \le 2, \dots, n_\nu \le 2$ (i.e., each quantifier is responsible for at most two variables), defining in $R^2$ a semialgebraic set consisting of $2^{2^n}$ isolated points.

⇒ No CCD for $\Phi$ can contain fewer cells.

⇒ The lower complexity bound for any CCD-based algorithm is doubly exponential.

Fischer & Rabin (1974): the lower bound for decision methods is exponential in the number of quantifier alternations.

Simplest case: $\exists x \in R \ (f(x) = 0)$, where $f \in R[x]$, $\deg(f) < d$.

Sturm’s Theorem (1835):

Euclidean algorithm for the division of $f$ by its derivative $f'$.

Let $g_1 = f$, $g_2 = f'$, and let $g_3, \dots, g_t$ be the sequence of the negatives of the remainders in the Euclidean algorithm. For any $x \in R$ let $V(x)$ be the number of sign changes in $g_1(x), g_2(x), \dots, g_t(x)$.

Theorem 11. The number of distinct real roots of $f$ in $[a, b] \subset R$ is $S(f, f') := V(a) - V(b)$.

Note: $\operatorname{sign}(f(\infty)) = \operatorname{sign}(a_0)$, where $a_0$ is the leading coefficient of $f$, and $\operatorname{sign}(f(-\infty)) = \operatorname{sign}(a_0 x^{\deg f}$ at $x = -1)$.

Note that over an algebraically closed field the answer is always “Yes”.

Sturm’s theorem provides an algorithm to decide the existential formula over any real closed field.
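The construction in Theorem 11 can be sketched in a few lines over the rationals. This is a minimal illustration, assuming polynomials are given as coefficient lists with the highest degree first and counting roots on the whole line via the signs at $\pm\infty$; the function names are made up for this sketch.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest degree first)."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(a, b):
    """Remainder of a divided by b; b must have a nonzero leading coefficient."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        q = a[0] / b[0]
        # subtract q * x^(deg a - deg b) * b, then drop the cancelled leading term
        a = [c - q * (b[i] if i < len(b) else 0) for i, c in enumerate(a)][1:]
    return trim(a) if a else [Fraction(0)]

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]

def sturm_sequence(g1, g2):
    """g1, g2, then the negated remainders of the Euclidean algorithm."""
    seq = [trim(g1), trim(g2)]
    while any(seq[-1]):
        seq.append([-c for c in poly_rem(seq[-2], seq[-1])])
    return seq[:-1]  # drop the final zero polynomial

def sign_at_inf(p, minus=False):
    s = (p[0] > 0) - (p[0] < 0)
    odd_degree = (len(p) - 1) % 2 == 1
    return -s if (minus and odd_degree) else s

def variations(signs):
    signs = [s for s in signs if s != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def count_real_roots(f):
    """Number of distinct real roots of f: V(-infinity) - V(+infinity)."""
    seq = sturm_sequence(f, derivative(f))
    return (variations([sign_at_inf(g, minus=True) for g in seq])
            - variations([sign_at_inf(g) for g in seq]))
```

With this, deciding $\exists x\,(f(x) = 0)$ amounts to testing `count_real_roots(f) > 0`; for example `count_real_roots([1, 0, -1, 0])` handles $f = x^3 - x$.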

Now we want to generalize this algorithm to handle systems of many equations and inequalities (in one variable, for the time being).

Ben-Or/Kozen/Reif algorithm (1986):

Definition 12. Let $f_1, \dots, f_k \in R[x]$, $\deg(f_i) < d$. A consistent sign assignment is a string $\sigma = (\sigma_1, \dots, \sigma_k)$, where $\sigma_i \in \{>, <, =\}$, such that the system $f_1 \sigma_1 0, \dots, f_k \sigma_k 0$ has a solution in $R$.

Input: $f_1, \dots, f_k \in R[x]$.
Output: The list of all consistent sign assignments.

Prepare the input: Make all $f_i$ squarefree and relatively prime. Then the sets of roots of the polynomials $f_i$ are disjoint; in particular, any consistent sign assignment has at most one $=$. Moreover, we can add a new polynomial $f$ so that all consistent assignments for $f_1, \dots, f_k$ can be reconstructed from all consistent assignments of $<$ and $>$ for $f_1, \dots, f_k$ at the roots of $f$.

Case $k = 0$: $f = 0$. Sturm’s Theorem.

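Case $k = 1$: $f, f_1$

$C_1 := \{x \mid f = 0 \wedge f_1 > 0\}$

$\overline{C}_1 := \{x \mid f = 0 \wedge f_1 < 0\}$

Obviously, the total number of roots of $f$ is $S(f, f') = |C_1| + |\overline{C}_1|$. (Here and in the sequel $S$ corresponds to the interval $[-\infty, \infty]$.)

Lemma 13. $S(f, f' f_1) = |C_1| - |\overline{C}_1|$.

Sturm + Lemma ⇒
$$\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} |C_1| \\ |\overline{C}_1| \end{pmatrix} = \begin{pmatrix} S(f, f') \\ S(f, f' f_1) \end{pmatrix}$$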
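The case $k = 1$ can be tried out numerically. The sketch below assumes that $S(f, f'g)$ is computed as $V(-\infty) - V(+\infty)$ for the sequence starting with $g_1 = f$, $g_2 = f'g$, with polynomials as coefficient lists (highest degree first) over the rationals; the helper names and the sample polynomials $f = x^3 - x$, $f_1 = 2x - 1$ are choices made for this illustration.

```python
from fractions import Fraction

def mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def deriv(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def rem(a, b):
    """Remainder of a divided by b (leading coefficient of b nonzero)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        q = a[0] / b[0]
        a = [c - q * (b[i] if i < len(b) else 0) for i, c in enumerate(a)][1:]
    while len(a) > 1 and a[0] == 0:
        a = a[1:]
    return a or [Fraction(0)]

def sign_at_inf(p, minus):
    s = (p[0] > 0) - (p[0] < 0)
    return -s if minus and (len(p) - 1) % 2 else s

def variations(signs):
    signs = [s for s in signs if s]
    return sum(a != b for a, b in zip(signs, signs[1:]))

def sturm_query(f, g):
    """S(f, f'g) over the whole line, for a non-constant f."""
    seq = [f, mul(deriv(f), g)]
    while any(seq[-1]):
        seq.append([-c for c in rem(seq[-2], seq[-1])])
    seq.pop()  # drop the final zero polynomial
    return (variations([sign_at_inf(p, True) for p in seq])
            - variations([sign_at_inf(p, False) for p in seq]))

f = [1, 0, -1, 0]   # x^3 - x, roots -1, 0, 1
f1 = [2, -1]        # 2x - 1, negative at -1 and 0, positive at 1

s0 = sturm_query(f, [1])   # S(f, f')    = |C1| + |C1 bar|
s1 = sturm_query(f, f1)    # S(f, f' f1) = |C1| - |C1 bar|

# Invert the 2x2 system from the text.
c_pos = Fraction(s0 + s1, 2)   # |C1|
c_neg = Fraction(s0 - s1, 2)   # |C1 bar|
```

Since both counts come out non-zero, both sign assignments $f = 0 \wedge f_1 > 0$ and $f = 0 \wedge f_1 < 0$ are consistent for this sample input.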

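Case $k = 2$: $f, f_1, f_2$

$C_2 := \{x \mid f = 0 \wedge f_2 > 0\}$

$\overline{C}_2 := \{x \mid f = 0 \wedge f_2 < 0\}$

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} |C_1 \cap C_2| \\ |\overline{C}_1 \cap C_2| \\ |C_1 \cap \overline{C}_2| \\ |\overline{C}_1 \cap \overline{C}_2| \end{pmatrix} = \begin{pmatrix} S(f, f') \\ S(f, f' f_1) \\ S(f, f' f_2) \\ S(f, f' f_1 f_2) \end{pmatrix}$$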

In case $k$ we get a matrix $A_k$ which is a tensor product of $k$ copies of $A_1$ and is non-singular. The unique solution of the corresponding linear system describes all consistent sign assignments for $f = 0, f_1, \dots, f_k$. Solving the system by any standard method (e.g., Gaussian elimination), we list all consistent assignments. However, this naive approach leads to complexity exponential in $k$.
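The tensor-product structure can be checked directly. The sketch below builds $A_k$ as a Kronecker power of $A_1$ and solves the system using the identity $A_k A_k = 2^k I$ (which follows from $A_1 A_1 = 2I$) instead of Gaussian elimination; that shortcut, the function names, and the sample right-hand side are choices made for this illustration. The right-hand side `[3, -1, 1, 1]` is worked out by hand for $f = x^3 - x$, $f_1 = 2x - 1$, $f_2 = 2x + 1$.

```python
from fractions import Fraction

A1 = [[1, 1], [1, -1]]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def tensor_power(k):
    """A_k = A_1 (x) ... (x) A_1, k factors; A_0 is the 1x1 identity."""
    m = [[1]]
    for _ in range(k):
        m = kron(A1, m)
    return m

def solve(k, s):
    """Solve A_k c = s using A_k^{-1} = A_k / 2^k."""
    a = tensor_power(k)
    return [Fraction(sum(r * v for r, v in zip(row, s)), 2 ** k) for row in a]

# Unknowns ordered as in the k = 2 system of the text:
# |C1 n C2|, |C1bar n C2|, |C1 n C2bar|, |C1bar n C2bar|.
counts = solve(2, [3, -1, 1, 1])
```

The matrix has $2^k$ rows, which is exactly the exponential blow-up in $k$ noted above.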

Polynomial-time algorithm: $\deg(f_i) < d$ ⇒ the number of all roots of all $f_i$ is $O(kd)$ ⇒ the number of all consistent sign assignments is $O(kd)$.

“Divide-and-conquer” then leads to complexity polynomial in kd.

PLAN:

$X = \{x_0 \in R^{n_0} \mid \Phi(x_0)\}$, where $\Phi(x_0) := Q_1 x_1 Q_2 x_2 \cdots Q_\nu x_\nu F(x_0, x_1, \dots, x_\nu)$.

1. The algorithm eliminates quantifiers by induction starting from $Q_\nu$ outwards.

2. It’s sufficient to consider just the case of $\Phi(y) := \exists x\, F(x, y)$, where $x \in R^n$, $y \in R^m$.

3. We will first construct an algorithm for the problem $\exists x\, F(x)$ and then “parameterize” the algorithm with the vector $y$.

(3) → (2) → (1)

$\exists x\, F(x)$, $x \in R^n$, where $F(x)$ is a Boolean formula with $s$ atoms of the kind $f * 0$, $f \in R[x]$, $* \in \{=, >, <, \ge, \le\}$, $\deg(f) < d$.

In case $n = 1$: Ben-Or/Kozen/Reif applied to the set of all atomic polynomials. The list of all consistent sign assignments allows us to decide whether $\exists x\, F(x)$ is true.
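Once the list of consistent sign assignments is known, the decision step for $n = 1$ is a plain lookup. In the sketch below the list for the atomic polynomials $x^2 - 1$ and $x$ is written out by hand (it is an assumption of the example, not the output of an implementation of Ben-Or/Kozen/Reif), and a formula is modelled as a Python predicate on sign strings.

```python
# Consistent sign assignments for (x^2 - 1, x), listed from left to right
# along the real line: x < -1, x = -1, -1 < x < 0, x = 0, 0 < x < 1, x = 1, x > 1.
assignments = [
    (">", "<"), ("=", "<"), ("<", "<"), ("<", "="),
    ("<", ">"), ("=", ">"), (">", ">"),
]

def exists_x(formula, sign_assignments):
    """Decide 'exists x F(x)' given all consistent sign assignments of the atoms."""
    return any(formula(sigma) for sigma in sign_assignments)

def inside_and_positive(sigma):
    # F := (x^2 - 1 < 0) and (x > 0)
    return sigma[0] == "<" and sigma[1] == ">"

def both_vanish(sigma):
    # F := (x^2 - 1 = 0) and (x = 0)
    return sigma[0] == "=" and sigma[1] == "="

first = exists_x(inside_and_positive, assignments)
second = exists_x(both_vanish, assignments)
```

The first formula is satisfiable (any $0 < x < 1$ works), the second is not, since the two polynomials have no common root.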

Arbitrary n: Similar strategy: list all consistent sign assignments.

How: Find a finite set $\Lambda \subset R^n$ such that any consistent assignment defines a system of equations and strict inequalities having a solution in this set. Knowing $\Lambda$ it’s easy to find the list of consistent assignments.

Points from $\Lambda$ are (isolated) solutions of systems of polynomial equations based on atoms of $F(x)$.

Solving systems of equations: Let a system of polynomial equations have at most a finite number of solutions over the algebraic closure $\overline{R}$.

Possible approaches:

• Gröbner bases

• Effective Hilbert’s Nullstellensatz

• u-resultants

We will use u-resultants.

Homogeneous polynomials: A polynomial in $n$ variables is an expression
$$\sum_{(i_1, \dots, i_n)} a_{i_1, \dots, i_n} x_1^{i_1} \cdots x_n^{i_n}.$$
A polynomial is homogeneous, or a form, if $i_1 + \cdots + i_n = \text{const}$ for all $(i_1, \dots, i_n)$ with $a_{i_1, \dots, i_n} \neq 0$.

Examples: Homogeneous: $x^2 + xy$. Non-homogeneous: $x^2 - 1$.

$0$ is a root of any non-constant homogeneous polynomial.

If $x = (x_1, \dots, x_n) \in \overline{R}^n$ is a root of a homogeneous polynomial $h$, then for any $a \in \overline{R}$ the point $ax = (ax_1, \dots, ax_n)$ is also a root of $h$.

⇒ If $x \neq 0$, then there is a straight line of roots of $h$ passing through $0$ and $x$. We can say that the line itself is a root of $h$.

Projective space: The space of all straight lines through the origin is called the $(n-1)$-dimensional projective space $P^{n-1}(\overline{R})$.

An element of $P^{n-1}(\overline{R})$, which is a line passing through $0$ and $x = (x_1, \dots, x_n) \neq 0$, is denoted by $(x_1 : x_2 : \cdots : x_n)$.

Any polynomial $h \in \overline{R}[x]$ can be homogenized: Let $\deg(h) = d$. Introduce a new variable $x_0$ and a homogeneous polynomial $\operatorname{hom}(h) := x_0^d\, h(x_1/x_0, \dots, x_n/x_0)$.

If $(x_1, \dots, x_n)$ is a root of $h$, then $(1 : x_1 : \cdots : x_n)$ is a root of $\operatorname{hom}(h)$. If $(y_0 : y_1 : \cdots : y_n) \in P^n(\overline{R})$ is a root of $\operatorname{hom}(h)$ and $y_0 \neq 0$, then $(y_1/y_0, \dots, y_n/y_0)$ is a root of $h$. Of course, $\operatorname{hom}(h)$ can have a root of the form $(0 : z_1 : \cdots : z_n)$, which does not correspond in this sense to any root of $h$; it’s called a “root at infinity”.

A similar relation is true for systems of polynomial equations: the homogenized system has no fewer solutions than the original system.

Example: original system: $x_1 - x_2 = x_1 - x_2 + 1 = 0$; homogenization: $x_1 - x_2 = x_1 - x_2 + x_0 = 0$. The original system is inconsistent, while the homogenized one has a unique solution $(0 : 1 : 1)$ (at infinity).
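Homogenization is easy to carry out mechanically. The sketch below assumes a sparse representation in which a polynomial is a dictionary from exponent tuples $(i_1, \dots, i_n)$ to non-zero coefficients; the representation and the function names are choices made for this illustration. It reproduces the example system above.

```python
def homogenize(poly):
    """Return hom(poly): each monomial is padded with a power of x0
    so that every exponent tuple (i0, i1, ..., in) sums to deg(poly)."""
    d = max(sum(e) for e in poly)
    return {(d - sum(e),) + tuple(e): c for e, c in poly.items()}

def is_homogeneous(poly):
    return len({sum(e) for e in poly}) <= 1

def evaluate(poly, point):
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for x, i in zip(point, exponents):
            term *= x ** i
        total += term
    return total

# The example system from the text, in the variables (x1, x2).
h1 = {(1, 0): 1, (0, 1): -1}               # x1 - x2
h2 = {(1, 0): 1, (0, 1): -1, (0, 0): 1}    # x1 - x2 + 1

H1, H2 = homogenize(h1), homogenize(h2)     # variables (x0, x1, x2)
at_infinity = (0, 1, 1)                     # the point (0 : 1 : 1)
values = (evaluate(H1, at_infinity), evaluate(H2, at_infinity))
```

Both homogenized equations vanish at $(0 : 1 : 1)$, although the original system has no solution.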

u-resultant:

Consider a system of $n$ homogeneous equations in $n + 1$ variables: $h_1 = \cdots = h_n = 0$.

The system is always consistent in $P^n(\overline{R})$. It can have either a finite or an infinite number of solutions.

Introduce new variables $u_0, u_1, \dots, u_n$.

Proposition 14. There is a homogeneous polynomial $R_{h_1, \dots, h_n} \in \overline{R}[u_0, \dots, u_n]$, called the u-resultant, such that $R_{h_1, \dots, h_n} \not\equiv 0$ iff the system has a finite number of solutions.

If $R_{h_1, \dots, h_n} \not\equiv 0$, then
$$R_{h_1, \dots, h_n} = \prod_{1 \le i \le r} (\chi_0^{(i)} u_0 + \cdots + \chi_n^{(i)} u_n),$$
where $(\chi_0^{(i)} : \cdots : \chi_n^{(i)})$, $1 \le i \le r$, are all solutions in $P^n(\overline{R})$ of the system.
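A tiny worked case may help. Take $n = 1$ and the single form $h_1 = x_1^2 - x_0^2$, the homogenization of $x^2 - 1$. The sketch assumes the usual normalisation of the u-resultant up to a non-zero constant factor, which the slides do not spell out.

```latex
h_1 = x_1^2 - x_0^2 = 0
\quad\text{has the solutions}\quad (1 : 1),\ (1 : -1) \in P^1(\overline{R}),
\qquad
R_{h_1} = (u_0 + u_1)(u_0 - u_1) = u_0^2 - u_1^2 .
```

Each linear factor carries the coordinates of one solution as its coefficient vector, which is exactly the information the algorithm later reads off.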

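Assume that in $\exists x\, F(x)$ the formula $F(x)$ is just a system of inequalities $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$, and that $\{x \mid F(x)\} \subset R^n$ is bounded.

Let $\deg(f_i) < d$ for all $1 \le i \le s$ and let $D$ be the minimal even integer $\ge sd + 1$.

Introduce a new variable $\varepsilon$ and consider
$$g(\varepsilon) := \prod_{1 \le i \le s} (f_i + \varepsilon) - \varepsilon^{s+1} \sum_{1 \le j \le n} x_j^D.$$

Lemma 15. If the system $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$ is consistent, then there is a solution which is a limit of a solution of
$$\frac{\partial g(\varepsilon)}{\partial x_1} = \cdots = \frac{\partial g(\varepsilon)}{\partial x_n} = 0$$
as $\varepsilon \to 0$.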
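For orientation, here is the construction in the smallest non-trivial case, chosen for this illustration: $n = 1$, $s = 1$, $f_1 = 1 - x^2$, so that $\{x \mid f_1 \ge 0\} = [-1, 1]$ is bounded. One may take $d = 3$, hence $D = 4$.

```latex
g(\varepsilon) = (1 - x^2 + \varepsilon) - \varepsilon^{2} x^{4},
\qquad
\frac{\partial g(\varepsilon)}{\partial x}
  = -2x - 4\varepsilon^{2} x^{3}
  = -2x\,(1 + 2\varepsilon^{2} x^{2}) = 0 .
```

For every $\varepsilon > 0$ the only solution in $R$ is $x = 0$; its limit as $\varepsilon \to 0$ is $0$, and indeed $f_1(0) = 1 \ge 0$, as Lemma 15 predicts.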

We consider just the case of a system of inequalities to simplify the explanations. The case of an arbitrary formula is slightly more complicated geometrically, but very similar in spirit.

If $\{x \mid F(x)\}$ is not bounded, then the problem can be reduced to the bounded case by intersecting this set with a ball of sufficiently large radius.

Picture: [figure not reproduced]

Lemma 16. For all sufficiently small $\varepsilon > 0$, if $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$ is consistent, then the homogenization of
$$\frac{\partial g(\varepsilon)}{\partial x_1} = \cdots = \frac{\partial g(\varepsilon)}{\partial x_n} = 0$$
has a finite number of solutions.

Proof. Gröbner basis or u-resultant techniques.

Denote the homogenization by $G_1(\varepsilon) = \cdots = G_n(\varepsilon) = 0$. By Proposition 14, every solution of this system is a coefficient vector in a divisor of the u-resultant $R_{G_1(\varepsilon), \dots, G_n(\varepsilon)}$.

The lemma of course means that the system itself also has a finite number of solutions, since the homogenization always has at least as many solutions.

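Represent $R_{G_1(\varepsilon), \dots, G_n(\varepsilon)}$ in the form
$$R_{G_1(\varepsilon), \dots, G_n(\varepsilon)} = R_m \varepsilon^m + \sum_{j > m} R_j \varepsilon^j,$$
where $m$ is the lowest degree w.r.t. $\varepsilon$.

Lemma 17.
$$R_m = \prod_{1 \le i \le r} (\eta_0^{(i)} u_0 + \cdots + \eta_n^{(i)} u_n),$$
where $(\eta_0^{(i)}, \dots, \eta_n^{(i)})$ is the limit of $(\chi_0^{(i)}, \dots, \chi_n^{(i)})$ as $\varepsilon \to 0$.

Corollary 18. If $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$ is consistent, then it has a solution of the form
$$(\eta_1^{(i)} / \eta_0^{(i)}, \dots, \eta_n^{(i)} / \eta_0^{(i)}),$$
where $\eta_0^{(i)} \neq 0$ and $(\eta_0^{(i)} u_0 + \cdots + \eta_n^{(i)} u_n)$ is a divisor of $R_m$.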

If we were able to factorize $R_m$ over $\overline{R}$, then the algorithm for consistency of $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$ could be roughly as follows:

• Construct the u-resultant $R_{G_1(\varepsilon), \dots, G_n(\varepsilon)}$;

• Find the lowest term $R_m$ w.r.t. $\varepsilon$ in $R_{G_1(\varepsilon), \dots, G_n(\varepsilon)}$;

• Factorize $R_m$, finding the limits $(\eta_0^{(i)}, \dots, \eta_n^{(i)})$ (when $\varepsilon \to 0$);

• Check whether some point $(\eta_1^{(i)} / \eta_0^{(i)}, \dots, \eta_n^{(i)} / \eta_0^{(i)})$ satisfies $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$. If yes, then this system is consistent, else it’s inconsistent.

Unfortunately, we are unable to factorize a polynomial in the general case. Even in special cases, when effective multivariate factorization algorithms are known, they are very involved.

Therefore, our scheme will be different.

Consider in the affine space $\overline{R}^{n+1}$ the hypersurface $\{(u_0, \dots, u_n) \in \overline{R}^{n+1} \mid R_m = 0\}$.

Since $R_m$ factors into linear forms, the hypersurface is a union of hyperplanes passing through $0$.

$\eta_0^{(i)} u_0 + \cdots + \eta_n^{(i)} u_n = 0$ is the equation defining hyperplane number $i$, $1 \le i \le r$;
⇒ the vector $(\eta_0^{(i)}, \dots, \eta_n^{(i)})$ is orthogonal to hyperplane $i$;
⇒ the vector $(\eta_0^{(i)}, \dots, \eta_n^{(i)})$ is collinear to the gradient of $R_m$ at a non-singular point $x$ of hyperplane $i$, i.e., to
$$\left( \frac{\partial R_m}{\partial u_0}(x), \dots, \frac{\partial R_m}{\partial u_n}(x) \right).$$

To find a finite set of non-singular points on all hyperplanes, intersect $\{R_m = 0\}$ with a “generic” straight line in $\overline{R}^{n+1}$ defined with coefficients from $R$ (actually, from $\mathbb{Z}$).
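The gradient trick can be followed by hand on a toy polynomial, chosen for this illustration: $R_m = u_0^2 - u_1^2 = (u_0 + u_1)(u_0 - u_1)$ with $n = 1$, and the line $\alpha t + \beta$ with $\alpha = (0, 1)$, $\beta = (1, 0)$, which is parallel to neither hyperplane and misses their intersection.

```latex
R_m(\alpha t + \beta) = R_m(1, t) = 1 - t^2 = 0 \iff t = \pm 1,
\qquad
\nabla R_m = (2u_0,\; -2u_1),
\\[4pt]
t = 1:\ \nabla R_m(1, 1) = (2, -2) \parallel (1, -1),
\qquad
t = -1:\ \nabla R_m(1, -1) = (2, 2) \parallel (1, 1).
```

The two gradients recover the coefficient vectors of the factors $u_0 - u_1$ and $u_0 + u_1$ without factorizing $R_m$; the ratios $\eta_1 / \eta_0$ are $-1$ and $1$.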

Algorithm for deciding $\exists x\, F(x)$:

• Write the homogeneous system $G_1(\varepsilon) = \cdots = G_n(\varepsilon) = 0$.

• Construct the u-resultant $R_{G_1(\varepsilon), \dots, G_n(\varepsilon)}$ (how – later).

• Find the lowest term $R_m$ w.r.t. $\varepsilon$ in $R_{G_1(\varepsilon), \dots, G_n(\varepsilon)}$.

• Find a generic straight line in the parametric form $\alpha t + \beta$, where $\alpha, \beta \in R^{n+1}$ and $t$ is a single variable. The roots of the univariate equation $R_m(\alpha t + \beta) = 0$ correspond to the points of intersection of the line with the hyperplanes. If $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$ is consistent, then for some root $t'$ of $R_m(\alpha t + \beta) = 0$ the following point is its solution:
$$\left( \frac{\partial R_m / \partial u_1}{\partial R_m / \partial u_0}(\alpha t' + \beta), \dots, \frac{\partial R_m / \partial u_n}{\partial R_m / \partial u_0}(\alpha t' + \beta) \right).$$

• Denoting
$$y_j(\alpha t + \beta) := \frac{\partial R_m / \partial u_j}{\partial R_m / \partial u_0}(\alpha t + \beta),$$
we get:

The system $f_1 \ge 0 \wedge \cdots \wedge f_s \ge 0$ is consistent iff there exists $t' \in R$ such that $R_m(\alpha t' + \beta) = 0$ and $f_i(y_1(\alpha t' + \beta), \dots, y_n(\alpha t' + \beta)) \ge 0$ for all $1 \le i \le s$.

• Check consistency of the latter system of univariate polynomial inequalities using Ben-Or/Kozen/Reif.

Technical details we left out:

• Computing the u-resultant (a polynomial of degree $(sd)^{O(n)}$).

• Finding a generic straight line in $\overline{R}^{n+1}$.

• Case of vanishing gradient ⇔ $\eta_0^{(i)} u_0 + \cdots + \eta_n^{(i)} u_n$ occurs in $R_m$ with power $> 1$ ⇔ the root $t'$ of $R_m(\alpha t + \beta) = 0$ is multiple.

• Case of an arbitrary Boolean formula as $F(x)$.

Complexity:

Straightforward calculation: $(sd)^{O(n)}$.

Accomplished item (3) of the PLAN above: deciding $\exists x\, F(x)$.

Now, item (2): quantifier elimination in $\exists x\, F(x, y)$.

Main idea: Consider the variables $y$ as parameters and try to execute the decidability algorithm for a parametric formula (i.e., with variable coefficients).

The decidability algorithm uses only $+$, $*$ and comparisons over elements of $R$. Arithmetic can be easily performed parametrically, while comparisons require branching.

The parametric algorithm is represented by an algebraic decision tree.

Picture: [figure not reproduced]

The input of the tree is $y$. Each vertex of the tree is associated with a polynomial in $y$, which is the composition of the arithmetic operations performed along the branch leading from the root to this vertex.

Upon a specialization of the input $y$, the polynomial at the root is evaluated, and the value is compared to zero. Depending on the result of the comparison, one of the three branches is chosen, thereby determining the next vertex and the polynomial evaluation to be made, and so on until a leaf is reached.

Each leaf is assigned the value “True” or “False”. All specializations of $y$ arriving at a leaf marked “True” correspond to closed formulae that are true. Similarly for “False”.

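An elimination algorithm:

• Use the decision algorithm parametrically to obtain a decision tree with the variable input $y$.

• Determine all branches from the root to leaves marked True.

• Take the conjunction of the inequalities along each such branch:
$$\bigwedge_{i \in \text{branch}} h_i(y)\, \sigma_i\, 0,$$
where $\sigma_i \in \{<, >, =\}$.

• Output the disjunction over these branches:
$$\exists x\, F(x, y) \;\Leftrightarrow\; \bigvee_{\text{branches}} \ \bigwedge_{i \in \text{branch}} h_i(y)\, \sigma_i\, 0$$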
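The last two steps are a simple tree traversal. In the sketch below a decision tree is a nested tuple `(polynomial_label, children)` with a child for each of `<`, `=`, `>`, and a leaf is `True` or `False`; this encoding and the function names are hypothetical. The sample tree for $\exists x\,(ax + b = 0)$ is built by hand, not produced by the parametric algorithm.

```python
def true_branches(tree, path=()):
    """All root-to-leaf paths ending in a leaf marked True."""
    if isinstance(tree, bool):
        return [path] if tree else []
    poly, children = tree
    found = []
    for sigma in ("<", "=", ">"):
        found += true_branches(children[sigma], path + ((poly, sigma),))
    return found

def to_formula(tree):
    """Disjunction over True branches of the conjunction of their sign conditions."""
    if isinstance(tree, bool):
        return "True" if tree else "False"
    branches = true_branches(tree)
    if not branches:
        return "False"
    return " or ".join(
        "(" + " and ".join(f"{poly} {sigma} 0" for poly, sigma in branch) + ")"
        for branch in branches
    )

# Hand-built tree for 'exists x (a*x + b = 0)' with parameters y = (a, b):
# if a != 0 there is a solution; if a = 0 we need b = 0.
tree = ("a", {
    "<": True,
    "=": ("b", {"<": False, "=": True, ">": False}),
    ">": True,
})

formula = to_formula(tree)
```

Here `formula` is a quantifier-free description of the parameters $(a, b)$ for which the equation is solvable.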

Difficulty:

Too many branches in the tree:

Height = complexity of decidability = $(sd)^{O(n)}$ ⇒ $3^{(sd)^{O(n)}}$ branches (leaves)

But:

Most branches are not followed by any specialization of $y$, i.e., most leaves correspond to Boolean formulae defining $\emptyset$.

Because:

Lemma 19. Let $g_1, \dots, g_k \in R[x_1, \dots, x_n]$ be a finite set of polynomials with all degrees $\deg(g_i) < D$. Then the number of all consistent sign assignments for this set is less than $(kD)^{O(n)}$.

Carefully examining the decidability algorithm we see that the number of distinct polynomials $h_i$ in the tree is essentially the same as in one branch, i.e., $(sd)^{O(n)}$. From the complexity estimate we know that their degrees are $(sd)^{O(n)}$. Then, by Lemma 19, the number of distinct non-empty sets associated with leaves is $(sd)^{O(n^2)}$.

⇒ Complexity of ∃ elimination: $(sd)^{O(n^2)}$

∀ elimination:

$\forall x\, F(x) \Leftrightarrow \neg \exists x\, \neg F(x)$

General case (item (1) of the PLAN above):

$Q_1 x_1 Q_2 x_2 \cdots Q_\nu x_\nu F(x_0, x_1, \dots, x_\nu)$, where $x_i \in R^{n_i}$, $n := n_0 + \cdots + n_\nu$.

Complexity: $(sd)^{n^{O(\nu)}}$

Compare with the CCD algorithm: $(sd)^{O(1)^n}$

Using finer technical tools one can construct a quantifier elimination algorithm having complexity $(sd)^{\prod_{0 \le i \le \nu} O(n_i)}$.
