Euclidean division for multivariate polynomials

Anderson Beraldo de Araújo, RA 065156. Course: Anéis e Corpos (Rings and Fields). Prof. Dr. Fernando Eduardo Torres Orihuela

Abstract. In contrast to what is usually done in the literature, this article extends Euclidean division to multivariate polynomials while preserving the uniqueness of remainders.

1 Introduction

In the context of polynomials in one variable, Euclidean division is the process of dividing one polynomial by another, which produces a quotient and a remainder, where the remainder is either zero or has degree smaller than that of the divisor. Its main property is that, under some conditions, the quotient and the remainder are unique polynomials. According to Brown (1971), the generalization of Euclidean division to multivariate polynomials has a long history; it is straightforward, but has many pitfalls.

Notably, for polynomials in one variable, Euclidean division is the basis of Euclid's algorithm, which permits us to compute greatest common divisors. In the multivariate case this is not so. Bézout's identity, gcd(a,b) = ra + sb, holds in Euclidean domains. It fails in the multivariate rings F[x1,...,xn], n ≥ 2, since gcd(x1,x2) = 1 but there is no Bézout equation 1 = x1 f + x2 g (evaluating at x1 = 0 = x2 would imply 1 = 0 in F). Moreover, the usual version of Euclidean division for multivariate polynomials does not guarantee the uniqueness of remainders.

In this work, we provide a simple variation of Euclidean division as presented in Cox et al. (2015, p. 61-68) that preserves the uniqueness of remainders. The fundamental idea is to use a quicksort algorithm to order the monomials before performing the division. This small change is sufficient to ensure the uniqueness of remainders, as will be shown. In Section 2, we recall some concepts related to multivariate polynomials, with special emphasis on orderings of monomials. In Section 3, we present Euclidean division for multivariate polynomials. To conclude, we make some remarks on future work.

2 Monomial orderings

We are going to discuss polynomials in n variables x1,...,xn with coefficients in an arbitrary field F. We start by defining monomials.

Definition 2.1. A monomial in x1,...,xn is a product of the form

$$x_1^{\alpha_1} \cdot x_2^{\alpha_2} \cdots x_n^{\alpha_n},$$

where all of the exponents α1,...,αn are nonnegative integers. The total degree of this monomial is the sum α1 + ··· + αn.

We can simplify the notation for monomials as follows: let α = (α1,...,αn) be an n-tuple of nonnegative integers. Then we set

$$x^{\alpha} = x_1^{\alpha_1} \cdot x_2^{\alpha_2} \cdots x_n^{\alpha_n}.$$

When α = (0,...,0), note that $x^{\alpha} = 1$. We also let |α| = α1 + ··· + αn denote the total degree of the monomial $x^{\alpha}$.

Definition 2.2. A polynomial f in x1,...,xn with coefficients in a field F is a finite linear combination (with coefficients in F) of monomials. We will write a polynomial f in the form

$$f = \sum_{\alpha} a_{\alpha} x^{\alpha},$$

where the sum is over a finite number of n-tuples α = (α1,...,αn). The set of all polynomials in x1,...,xn with coefficients in F is denoted F[x1,...,xn]. When dealing with polynomials in a small number of variables, we will usually dispense with subscripts. Thus, polynomials in one, two, and three variables lie in F[x], F[x,y], and F[x,y,z], respectively. For example,

$$p = 2x^3y^2z + \frac{1}{2}y^3z^3 - 3xyz + y^2$$

is a polynomial in Q[x,y,z]. We will usually use the letters f, g, p, q, r to refer to polynomials.

The definition of sums and products of multivariate polynomials is analogous to the one-variable case. The sum and product of two polynomials is again a polynomial. We say that a polynomial f divides a polynomial g provided that g = fp for some polynomial p ∈ F[x1,...,xn]. One can show that, under addition and multiplication, F[x1,...,xn] satisfies all of the field axioms except for the existence of multiplicative inverses. This means that F[x1,...,xn] is a commutative ring. We will use the following terminology in dealing with polynomials.

Definition 2.3. Let $p = \sum_{\alpha} a_{\alpha} x^{\alpha}$ be a polynomial in F[x1,...,xn]. We call $a_{\alpha}$ the coefficient of the monomial $x^{\alpha}$. If $a_{\alpha} \neq 0$, then we call $a_{\alpha} x^{\alpha}$ a term of p. The total degree of p ≠ 0, denoted deg(p), is the maximum |α| such that the coefficient $a_{\alpha}$ is nonzero. The total degree of the zero polynomial is undefined.
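Definition 2.3 can be illustrated on the polynomial p introduced above. The following is a minimal sketch, assuming a polynomial is stored as a Python dictionary that maps exponent tuples to coefficients; this representation and the helper names `terms`, `total_degree`, and `top_degree_terms` are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

# Assumed representation: exponent tuple (a_x, a_y, a_z) -> coefficient in Q.
p = {
    (3, 2, 1): Fraction(2),      # 2 x^3 y^2 z
    (0, 3, 3): Fraction(1, 2),   # (1/2) y^3 z^3
    (1, 1, 1): Fraction(-3),     # -3 x y z
    (0, 2, 0): Fraction(1),      # y^2
}

def terms(poly):
    """Return the (exponent, coefficient) pairs whose coefficient is nonzero."""
    return [(alpha, c) for alpha, c in poly.items() if c != 0]

def total_degree(poly):
    """Maximum |alpha| over the terms; undefined for the zero polynomial."""
    nonzero = terms(poly)
    if not nonzero:
        raise ValueError("total degree of the zero polynomial is undefined")
    return max(sum(alpha) for alpha, _ in nonzero)

def top_degree_terms(poly):
    """Exponent tuples of the terms that attain the total degree."""
    d = total_degree(poly)
    return sorted(alpha for alpha, _ in terms(poly) if sum(alpha) == d)
```

Under these assumptions, `total_degree(p)` is attained by two different exponent tuples, which is exactly the ambiguity that a monomial ordering is meant to resolve.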

As an example, the polynomial $p = 2x^3y^2z + \frac{1}{2}y^3z^3 - 3xyz + y^2$ given above has four terms and total degree six. Note that there are two terms of maximal total degree, which is something that cannot happen for polynomials in one variable. For this reason, we need to order the terms of multivariate polynomials. For Euclidean division on polynomials in one variable, we are dealing with the degree ordering on the one-variable monomials:

$$\cdots > x^{m+1} > x^m > \cdots > x^2 > x > 1.$$

The success of the algorithm depends on working systematically with the leading terms in f and g, and not removing terms at random from f using arbitrary terms from g.

First, we note that we can reconstruct the monomial $x^{\alpha} = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ from the n-tuple of exponents $\alpha = (\alpha_1,\ldots,\alpha_n) \in \mathbb{N}^n$, in the sense that any ordering > we establish on the space $\mathbb{N}^n$ gives us an ordering on monomials: if α > β according to this ordering, we will also say that $x^{\alpha} > x^{\beta}$. Here the natural numbers $\mathbb{N}$ include 0.

We also want our orderings to be compatible with the algebraic structure of polynomial rings. To begin, since a polynomial is a sum of monomials, we would like to be able to arrange the terms in a polynomial unambiguously in descending (or ascending) order. To do this, we must be able to compare every pair of monomials to establish their proper relative positions. Thus, we will require that our orderings be linear or total orderings. This means that for every pair of monomials $x^{\alpha}$ and $x^{\beta}$, exactly one of $x^{\alpha} > x^{\beta}$, $x^{\alpha} = x^{\beta}$, or $x^{\alpha} < x^{\beta}$ holds. A total order is also required to be transitive, so that $x^{\alpha} > x^{\beta}$ and $x^{\beta} > x^{\gamma}$ imply $x^{\alpha} > x^{\gamma}$.

Next, we must take into account the effect of the sum and product operations on polynomials. When we add polynomials, after combining like terms, we may simply rearrange the terms present into the appropriate order, so sums present no difficulties. Products are more subtle, however. Since multiplication in a polynomial ring distributes over addition, it suffices to consider what happens when we multiply a polynomial by a monomial. If doing this changed the relative ordering of terms, significant problems could result in any process similar to Euclidean division in F[x], in which we must identify the leading terms of polynomials. The reason is that the leading term of the product could be different from the product of the monomial and the leading term of the original polynomial.
Hence, we will require that all monomial orderings have the following additional property. If $x^{\alpha} > x^{\beta}$ and $x^{\gamma}$ is any monomial, then we require that $x^{\alpha} x^{\gamma} > x^{\beta} x^{\gamma}$. In terms of the exponent vectors, this property means that if α > β in our ordering on $\mathbb{N}^n$, then, for all $\gamma \in \mathbb{N}^n$, α + γ > β + γ.

Finally, we will need that > is a well-ordering. This means that every nonempty subset of $\mathbb{N}^n$ has a smallest element under >. In other words, if $A \subseteq \mathbb{N}^n$ is nonempty, then there is α ∈ A such that β > α for every β ≠ α in A. With these considerations in mind, we make the following definition.

Definition 2.4. A monomial ordering on F[x1,...,xn] is a relation > on $\mathbb{N}^n$ such that, first, > is a total ordering on $\mathbb{N}^n$; second, if $x^{\alpha} > x^{\beta}$ and $\gamma \in \mathbb{N}^n$, then $x^{\alpha} x^{\gamma} > x^{\beta} x^{\gamma}$; and, third, > is a well-ordering on $\mathbb{N}^n$.

Given a monomial ordering >, we say that α ≥ β when either α > β or α = β.

Lemma 2.1. An order relation > on $\mathbb{N}^n$ is a well-ordering if and only if every strictly decreasing sequence α(1) > α(2) > α(3) > ··· in $\mathbb{N}^n$ eventually terminates.

Proof. We will prove this in contrapositive form: > is not a well-ordering if and only if there is an infinite strictly decreasing sequence in $\mathbb{N}^n$. If > is not a well-ordering, then some nonempty subset $S \subseteq \mathbb{N}^n$ has no least element. Now pick α(1) ∈ S. Since α(1) is not the least element, we can find α(2) in S with α(1) > α(2). Then α(2) is also not the least element, so there is α(3) in S with α(2) > α(3). Continuing this way, we get an infinite strictly decreasing sequence α(1) > α(2) > α(3) > ···. Conversely, given such an infinite sequence, {α(1), α(2), α(3), ...} is a nonempty subset of $\mathbb{N}^n$ with no least element, and thus > is not a well-ordering. □

In this work we will use an ordering on n-tuples called the lexicographic order >lex.

Definition 2.5. Let α = (α1,...,αn) and β = (β1,...,βn) be in $\mathbb{N}^n$. We say α >lex β if the leftmost nonzero entry of the vector difference $\alpha - \beta \in \mathbb{Z}^n$ is positive. We will write $x^{\alpha} >_{lex} x^{\beta}$ if α >lex β.
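The comparison in Definition 2.5 can be sketched in a few lines of Python. This is only an illustration under the assumption that exponent vectors are stored as tuples of nonnegative integers; the function name `lex_greater` is hypothetical.

```python
def lex_greater(alpha, beta):
    """Return True when alpha >_lex beta in the sense of Definition 2.5."""
    if len(alpha) != len(beta):
        raise ValueError("exponent vectors must have the same length")
    for a, b in zip(alpha, beta):
        if a != b:
            # The leftmost nonzero entry of alpha - beta decides the comparison.
            return a > b
    return False  # alpha == beta, so alpha is not strictly greater
```

For tuples of equal length, Python's built-in tuple comparison `alpha > beta` gives the same answer, which is convenient when sorting exponent vectors.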

Example 2.1. We have (1,2,0) >lex (0,3,4) since α − β = (1,−1,−4). On the other hand, (3,2,4) >lex (3,2,1) since α − β = (0,0,3). The variables x1,...,xn are ordered in the usual way by the lex ordering: (1,0,...,0) >lex (0,1,0,...,0) >lex ··· >lex (0,...,0,1). Thus, x1 >lex x2 >lex ··· >lex xn.

In practice, when we work with polynomials in two or three variables, we will call the variables x, y, z rather than x1, x2, x3. We will also assume that the alphabetical order x > y > z on the variables is used to define the lexicographic ordering unless we explicitly say otherwise.

Proposition 2.1. The lexicographic ordering on $\mathbb{N}^n$ is a monomial ordering.

Proof. That >lex is a total ordering follows directly from the definition and the fact that the usual numerical order on $\mathbb{N}$ is a total ordering. If α >lex β, then the leftmost nonzero entry in α − β, say αi − βi, is positive. But $x^{\alpha} x^{\gamma} = x^{\alpha+\gamma}$ and $x^{\beta} x^{\gamma} = x^{\beta+\gamma}$. Then in (α + γ) − (β + γ) = α − β, the leftmost nonzero entry is again αi − βi > 0, so α + γ >lex β + γ.

Suppose that >lex were not a well-ordering. Then by Lemma 2.1, there would be an infinite strictly descending sequence α(1) >lex α(2) >lex α(3) >lex ··· of elements of $\mathbb{N}^n$. We will show that this leads to a contradiction. Consider the first entries of the vectors $\alpha(i) \in \mathbb{N}^n$. By the definition of the lexicographic order, these first entries form a nonincreasing sequence of nonnegative integers. Since $\mathbb{N}$ is well-ordered, the first entries of the α(i) must stabilize eventually. In other words, there exists an i0 such that all the first entries of the α(i) with i ≥ i0 are equal. Beginning at α(i0), the second and subsequent entries come into play in determining the lexicographic order. The second entries of α(i0), α(i0 + 1), ... form a nonincreasing sequence. By the same reasoning as before, the second entries stabilize eventually as well. Continuing in the same way, we see that for some i1, the α(i1), α(i1 + 1), ... are all equal. This contradicts the fact that α(i1) >lex α(i1 + 1). □

Example 2.2. Consider the polynomial $f = 4xy^2z + 4z^2 - 5x^3 + 7x^2z^2 \in F[x,y,z]$. With respect to lexicographic order, we would reorder the terms of f in decreasing order as $f = -5x^3 + 7x^2z^2 + 4xy^2z + 4z^2$.

Given a monomial ordering, we can now define the final concepts that are necessary to introduce Euclidean division for multivariate polynomials.

Definition 2.6. Let $f = \sum_{\alpha} a_{\alpha} x^{\alpha}$ be a nonzero polynomial in F[x1,...,xn] and let > be a monomial order. The multidegree of f is $\mathrm{mdeg}(f) = \max\{\alpha \in \mathbb{N}^n : a_{\alpha} \neq 0\}$ (the maximum is taken with respect to >). The leading coefficient of f is $\mathrm{LC}(f) = a_{\mathrm{mdeg}(f)} \in F$. The leading monomial of f is $\mathrm{LM}(f) = x^{\mathrm{mdeg}(f)}$ (with coefficient 1). The leading term of f is LT(f) = LC(f) · LM(f).

Example 2.3. To illustrate, let $f = 4xy^2z + 4z^2 - 5x^3 + 7x^2z^2$ as before and let > denote lexicographic order. Then mdeg(f) = (3,0,0), LC(f) = −5, $\mathrm{LM}(f) = x^3$, and $\mathrm{LT}(f) = -5x^3$.
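The quantities in this example can be recomputed mechanically. The sketch below assumes the dictionary representation (exponent tuple to coefficient) and relies on the fact that Python compares equal-length tuples lexicographically; the function names are illustrative assumptions rather than notation from the text.

```python
from fractions import Fraction

def mdeg(poly):
    """Multidegree under lex order: the largest exponent tuple with nonzero coefficient."""
    support = [alpha for alpha, c in poly.items() if c != 0]
    if not support:
        raise ValueError("multidegree of the zero polynomial is undefined")
    return max(support)  # tuple comparison coincides with >_lex

def leading_coefficient(poly):
    """LC(f): the coefficient attached to the multidegree."""
    return poly[mdeg(poly)]

def leading_term(poly):
    """LT(f) = LC(f) * LM(f), returned as a one-term polynomial."""
    alpha = mdeg(poly)
    return {alpha: poly[alpha]}

def sorted_terms(poly):
    """Terms of poly listed in decreasing lex order."""
    return sorted(((a, c) for a, c in poly.items() if c != 0), reverse=True)

# f = 4xy^2z + 4z^2 - 5x^3 + 7x^2z^2 with variables ordered x > y > z.
f = {
    (1, 2, 1): Fraction(4),
    (0, 0, 2): Fraction(4),
    (3, 0, 0): Fraction(-5),
    (2, 0, 2): Fraction(7),
}
```

Here `mdeg(f)` returns `(3, 0, 0)` and `leading_term(f)` returns the single term with exponent `(3, 0, 0)` and coefficient −5, that is, $-5x^3$.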

From the definitions it is clear that, for all nonzero f, g ∈ F[x1,...,xn], mdeg(fg) = mdeg(f) + mdeg(g) and, if f + g ≠ 0, then mdeg(f + g) ≤ max(mdeg(f), mdeg(g)). If, in addition, mdeg(f) ≠ mdeg(g), then equality occurs. This is all that we need with respect to the ordering of monomials.
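These multidegree rules can be checked on small examples. The sketch below is an assumption-laden illustration: polynomials are dictionaries from exponent tuples to rational coefficients, lex order is Python's tuple order, and the helper names `multiply` and `add` and the sample polynomials are hypothetical.

```python
from fractions import Fraction

def mdeg(poly):
    """Multidegree under lex order of a nonzero polynomial."""
    return max(alpha for alpha, c in poly.items() if c != 0)

def multiply(f, g):
    """Product of two polynomials, with cancelled terms removed."""
    product = {}
    for alpha, a in f.items():
        for beta, b in g.items():
            gamma = tuple(x + y for x, y in zip(alpha, beta))
            product[gamma] = product.get(gamma, 0) + a * b
    return {gamma: c for gamma, c in product.items() if c != 0}

def add(f, g):
    """Sum of two polynomials, with cancelled terms removed."""
    total = dict(f)
    for beta, b in g.items():
        total[beta] = total.get(beta, 0) + b
    return {alpha: c for alpha, c in total.items() if c != 0}

f = {(1, 1): Fraction(1), (0, 0): Fraction(-1)}   # xy - 1
g = {(0, 2): Fraction(1), (0, 0): Fraction(-1)}   # y^2 - 1
h = {(1, 1): Fraction(-1), (0, 1): Fraction(1)}   # -xy + y, same multidegree as f
```

With these inputs, `mdeg(multiply(f, g))` equals the componentwise sum of the two multidegrees, `mdeg(add(f, g))` equals the larger multidegree because the multidegrees differ, and `mdeg(add(f, h))` is strictly smaller because the leading terms cancel.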

3 Multivariate Euclidean division

The goal is to divide f ∈ F[x1,...,xn] by f1,..., fs ∈ F[x1,...,xn]. This means expressing f in the form

f = q1 f1 + ··· + qs fs + r,

where the quotients q1,...,qs and the remainder r lie in F[x1,...,xn]. Some care will be needed in deciding how to characterize the remainder. This is where we will use monomial orderings. To do this, we will first apply a version of the quicksort algorithm to order the divisors.

Algorithm 1 Partition(A, lower, higher)
    pivot := A[higher]
    i := lower
    for j := lower to higher − 1 do
        if A[j] < pivot then
            swap A[i] with A[j]
            i := i + 1
        end if
    end for
    swap A[i] with A[higher]
    return i

Algorithm 2 Quicksort(A, lower, higher)
    if lower < higher then
        partition := Partition(A, lower, higher)
        Quicksort(A, lower, partition − 1)
        Quicksort(A, partition + 1, higher)
    end if
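Algorithms 1 and 2 can be sketched in Python as follows. This is a minimal illustration using the standard Lomuto partition scheme, in which the pivot is swapped into place once, after the scanning loop. It assumes 0-based inclusive bounds and that the items are compared with `<`; sorting divisors by the exponent tuple of their leading monomial is one possible choice of key, which is an assumption here and not something the text fixes.

```python
def partition(items, lower, higher):
    """Lomuto partition of items[lower..higher] around the pivot items[higher]."""
    pivot = items[higher]
    i = lower
    for j in range(lower, higher):
        if items[j] < pivot:
            items[i], items[j] = items[j], items[i]
            i += 1
    # Place the pivot between the smaller and the larger elements.
    items[i], items[higher] = items[higher], items[i]
    return i

def quicksort(items, lower=0, higher=None):
    """Sort items[lower..higher] in place (inclusive bounds) and return the list."""
    if higher is None:
        higher = len(items) - 1
    if lower < higher:
        split = partition(items, lower, higher)
        quicksort(items, lower, split - 1)
        quicksort(items, split + 1, higher)
    return items
```

For example, `quicksort([(1, 1), (0, 2), (3, 0), (0, 0)])` sorts a list of leading-monomial exponent tuples into increasing lexicographic order.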

The basic idea of the algorithm is the same as in the one-variable case: we want to cancel the leading term of f (with respect to a fixed monomial order) by multiplying some fi by an appropriate monomial and subtracting. Then this monomial becomes a term in the corresponding qi. We can now state the general form of the division algorithm.

Algorithm 3 Euclidean((f1,...,fs), f)
    q1 := 0, ..., qs := 0, r := 0
    p := f
    (f_{i_1},...,f_{i_s}) := Quicksort((f1,...,fs), 1, s)
    while p ≠ 0 do
        j := 1
        divided := false
        while j ≤ s and divided = false do
            if LT(f_{i_j}) divides LT(p) then
                q_j := q_j + LT(p)/LT(f_{i_j})
                p := p − (LT(p)/LT(f_{i_j})) · f_{i_j}
                divided := true
            else
                j := j + 1
            end if
        end while
        if divided = false then
            r := r + LT(p)
            p := p − LT(p)
        end if
    end while
    return [q1,...,qs,r]
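The main loop of Algorithm 3 can be sketched in Python. This is an illustration under several assumptions that are not in the original: polynomials are dictionaries from exponent tuples to rational coefficients with no zero entries, the monomial order is lexicographic (Python's tuple order), every divisor is nonzero, and the divisors are passed in the order in which they should be tried, so the sorting step is assumed to have been done beforehand. The names `leading`, `add_term`, and `divide` are hypothetical.

```python
from fractions import Fraction

def leading(poly):
    """Return (exponent, coefficient) of the lex-leading term of a nonzero polynomial."""
    alpha = max(poly)  # tuple comparison is lexicographic
    return alpha, Fraction(poly[alpha])

def add_term(poly, alpha, coeff):
    """Add coeff * x^alpha to poly in place, dropping the entry if it cancels."""
    value = poly.get(alpha, Fraction(0)) + coeff
    if value == 0:
        poly.pop(alpha, None)
    else:
        poly[alpha] = value

def divide(f, divisors):
    """Divide f by the ordered list of divisors; return (quotients, remainder)."""
    quotients = [dict() for _ in divisors]
    remainder = {}
    p = {alpha: Fraction(c) for alpha, c in f.items() if c != 0}
    while p:
        alpha, a = leading(p)
        for q, g in zip(quotients, divisors):
            beta, b = leading(g)
            if all(x >= y for x, y in zip(alpha, beta)):  # LT(g) divides LT(p)
                gamma = tuple(x - y for x, y in zip(alpha, beta))
                c = a / b
                add_term(q, gamma, c)                     # Division Step
                for delta, d in g.items():                # p := p - c * x^gamma * g
                    shifted = tuple(x + y for x, y in zip(gamma, delta))
                    add_term(p, shifted, -c * d)
                break
        else:
            add_term(remainder, alpha, a)                 # Remainder Step
            del p[alpha]
    return quotients, remainder

# Sample data with x > y: f = x^2 y + x y^2 + y^2, f1 = xy - 1, f2 = y^2 - 1.
f = {(2, 1): 1, (1, 2): 1, (0, 2): 1}
f1 = {(1, 1): 1, (0, 0): -1}
f2 = {(0, 2): 1, (0, 0): -1}
```

With these inputs, `divide(f, [f1, f2])` yields remainder x + y + 1, while `divide(f, [f2, f1])` yields remainder 2x + 1. The result depends on the order in which the divisors are tried, which is the reason for fixing that order before dividing.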

Let us first work through some examples to see what is involved.

Example 3.1. We will first divide $f = xy^2 + 1$ by $f_1 = xy + 1$ and $f_2 = y + 1$, using lexicographic order with x > y. We want to employ the same scheme as for division of one-variable polynomials, the difference being that there are now several divisors f1, f2 and correspondingly several quotients q1, q2.

The leading terms LT(f1) = xy and LT(f2) = y both divide the leading term $\mathrm{LT}(f) = xy^2$. Since f1 is listed first, we will use it. Thus, we divide xy into $xy^2$, leaving y, and then subtract y · f1 from f, which gives $xy^2 + 1 - y(xy + 1) = -y + 1$.

Now we repeat the same process on −y + 1. This time we must use f2, since LT(f1) = xy does not divide LT(−y + 1) = −y. We obtain −y + 1 − (−1)(y + 1) = 2.

Since LT(f1) and LT(f2) do not divide 2, the remainder is r = 2 and we are done. Thus, we have written $f = xy^2 + 1$ in the form

$$xy^2 + 1 = y \cdot (xy + 1) + (-1) \cdot (y + 1) + 2.$$

Example 3.2. In this example, we will encounter an unexpected subtlety that can occur when dealing with polynomials of more than one variable. Let us divide $f = x^2y + xy^2 + y^2$ by $f_1 = xy - 1$ and $f_2 = y^2 - 1$. As in the previous example, we will use lexicographic order with x > y. The first two steps of the algorithm go as usual (remember that when both leading terms divide, we use f1): dividing by LT(f1) = xy adds x and then y to q1 and leaves $x + y^2 + y$.

Note that neither LT(f1) = xy nor $\mathrm{LT}(f_2) = y^2$ divides $\mathrm{LT}(x + y^2 + y) = x$. However, $x + y^2 + y$ is not the remainder, since LT(f2) divides $y^2$. Thus, if we move x to the remainder, we can continue dividing. (This is something that never happens in the one-variable case: once the leading term of the divisor no longer divides the leading term of what is at the bottom of the division, the algorithm terminates.)

To implement this idea, we create a remainder column r, to the right of the division, where we put the terms belonging to the remainder. Also, we call the polynomial at the bottom of the division the intermediate dividend. Then we continue dividing until the intermediate dividend is zero. In the next step we move x to the remainder column, which leaves the intermediate dividend $y^2 + y$.

Now we continue dividing. If we can divide by LT(f1) or LT(f2), we proceed as usual, and if neither divides, we move the leading term of the intermediate dividend to the remainder column. Here $\mathrm{LT}(f_2) = y^2$ divides $y^2$, which adds 1 to q2 and leaves y + 1; neither leading term divides y or 1, so both are moved to the remainder column in turn.

Thus, the remainder is x + y + 1, and we obtain

$$x^2y + xy^2 + y^2 = (x + y) \cdot (xy - 1) + 1 \cdot (y^2 - 1) + x + y + 1.$$

Note that the remainder is a sum of monomials, none of which is divisible by the leading terms LT(f1) or LT(f2). The above example is a fairly complete illustration of how the division algorithm works. It also shows us what property we want the remainder to have: none of its terms should be divisible by the leading terms of the polynomials by which we are dividing.

Theorem 3.1 (Multivariate Euclidean division). Let > be a monomial order on $\mathbb{N}^n$, and let (f1,...,fs) be an ordered s-tuple of polynomials in F[x1,...,xn]. Then the application of Euclidean division to f ∈ F[x1,...,xn] uniquely generates q1,...,qs, r ∈ F[x1,...,xn] such that

$$f = q_1 f_{i_1} + \cdots + q_s f_{i_s} + r,$$

where either r = 0 or r is a linear combination, with coefficients in F, of monomials none of which is divisible by any of $\mathrm{LT}(f_{i_1}),\ldots,\mathrm{LT}(f_{i_s})$. We call r the remainder of f on division by (f1,...,fs). Furthermore, if $q_j f_{i_j} \neq 0$, then $\mathrm{mdeg}(f) \geq \mathrm{mdeg}(q_j f_{i_j})$.

Proof. We can relate Algorithm 3 to the previous example by noting that the variable p represents the intermediate dividend at each stage, the variable r represents the column on the right-hand side, and the variables q1,...,qs are the quotients listed above the division. Finally, the boolean variable divided tells us when some $\mathrm{LT}(f_{i_j})$ divides the leading term of the intermediate dividend. Each time we go through the main while loop, precisely one of two things happens:

Division Step. If some $\mathrm{LT}(f_{i_j})$ divides LT(p), then the algorithm proceeds as in the one-variable case of Euclidean division.

Remainder Step. If no $\mathrm{LT}(f_{i_j})$ divides LT(p), then the algorithm adds LT(p) to the remainder.

To prove that the algorithm works, we will first show that

$$(*) \quad f = q_1 f_{i_1} + \cdots + q_s f_{i_s} + p + r$$

holds at every stage. This is clearly true for the initial values of q1,...,qs, p, and r. Now suppose that (*) holds at one step of the algorithm. If the next step is a Division Step, then some $\mathrm{LT}(f_{i_j})$ divides LT(p), and the equality

$$q_j f_{i_j} + p = \left(q_j + \frac{\mathrm{LT}(p)}{\mathrm{LT}(f_{i_j})}\right) f_{i_j} + \left(p - \frac{\mathrm{LT}(p)}{\mathrm{LT}(f_{i_j})} f_{i_j}\right)$$

shows that $q_j f_{i_j} + p$ is unchanged. Since all other variables are unaffected, (*) remains true in this case. On the other hand, if the next step is a Remainder Step, then p and r will be changed, but the sum p + r is unchanged since

$$p + r = (p - \mathrm{LT}(p)) + (r + \mathrm{LT}(p)).$$

As before, equality (*) is still preserved. Next, notice that the algorithm comes to a halt when p = 0. In this situation, (*) becomes

$$f = q_1 f_{i_1} + \cdots + q_s f_{i_s} + r.$$

Since terms are added to r only when they are divisible by none of the $\mathrm{LT}(f_{i_j})$, it follows that q1,...,qs and r have the desired properties when the algorithm terminates.

Finally, we need to show that the algorithm does eventually terminate. The key observation is that each time we redefine the variable p, either its multidegree drops (relative to our term ordering) or it becomes 0. To see this, first suppose that during a Division Step, p is redefined to be

$$p' = p - \frac{\mathrm{LT}(p)}{\mathrm{LT}(f_{i_j})} f_{i_j}.$$

Now, we have

$$\mathrm{LT}\left(\frac{\mathrm{LT}(p)}{\mathrm{LT}(f_{i_j})} f_{i_j}\right) = \frac{\mathrm{LT}(p)}{\mathrm{LT}(f_{i_j})} \mathrm{LT}(f_{i_j}) = \mathrm{LT}(p),$$

so that p and $\frac{\mathrm{LT}(p)}{\mathrm{LT}(f_{i_j})} f_{i_j}$ have the same leading term. Hence, their difference p' must have strictly smaller multidegree when p' ≠ 0. Next, suppose that during a Remainder Step, p is redefined to be

$$p' = p - \mathrm{LT}(p).$$

Here, it is obvious that mdeg(p') < mdeg(p) when p' ≠ 0. Thus, in either case, the multidegree must decrease. If the algorithm never terminated, then we would get an infinite decreasing sequence of multidegrees. The well-ordering property of >, as stated in Lemma 2.1, shows that this cannot occur. Thus p = 0 must happen eventually, so that the algorithm terminates after finitely many steps.

It remains to study the relation between mdeg(f) and $\mathrm{mdeg}(q_j f_{i_j})$. Every term in $q_j$ is of the form $\mathrm{LT}(p)/\mathrm{LT}(f_{i_j})$ for some value of the variable p. The algorithm starts with p = f, and we just finished proving that the multidegree of p decreases. This shows that mdeg(p) ≤ mdeg(f), and then it follows easily, using the second condition in the definition of a monomial ordering, that $\mathrm{mdeg}(q_j f_{i_j}) \leq \mathrm{mdeg}(f)$ when $q_j f_{i_j} \neq 0$. □

4 Conclusion

As we discussed in the Introduction, obtaining an algorithm for the calculation of greatest common divisors requires studying other approaches. In this direction, one of the most widely applied methods is that of Gröbner bases (cf. Sturmfels (2005)). What we have done here is just a simple change that guarantees the uniqueness of remainders. It could be revealing, however, to see how our version of Euclidean division could be extended to obtain Gröbner bases, since Buchberger's algorithm, which is usually used to compute Gröbner bases, is actually a variation of Euclidean division.

5 References

W. S. BROWN. On Euclid's Algorithm and the Computation of Polynomial Greatest Common Divisors. Journal of the Association for Computing Machinery 18(4): 478-504, 1971.

D. A. COX, J. LITTLE, D. O'SHEA. Ideals, Varieties, and Algorithms. Fourth Edition. Berlin: Springer, 2015.

B. STURMFELS. What is a Gröbner Basis? Notices of the AMS 52(10): 2-3, 2005.
