Fast computation of generic bivariate resultants∗
JORIS VAN DER HOEVENa, GRÉGOIRE LECERFb CNRS, École polytechnique, Institut Polytechnique de Paris Laboratoire d'informatique de l'École polytechnique (LIX, UMR 7161) 1, rue Honoré d'Estienne d'Orves Bâtiment Alan Turing, CS35003 91120 Palaiseau, France a. Email: [email protected] b. Email: [email protected]
Preliminary version of May 21, 2020
We prove that the resultant of two “sufficiently generic” bivariate polynomials over a finite field can be computed in quasi-linear expected time, using a randomized algo- rithm of Las Vegas type. A similar complexity bound is proved for the computation of the lexicographical Gröbner basis for the ideal generated by the two polynomials.
KEYWORDS: complexity, algorithm, computer algebra, resultant, elimination, multi- point evaluation
1. INTRODUCTION
The efficient computation of resultants is a fundamental problem in elimination theory and for the algebraic resolution of systems of polynomial equations. Given an effective field 핂, it is well known [10, chapter 11] that the resultant of two univariate polynomials P,Q∈핂[x] of respective degrees d⩾e can be computed using O˜ (d) field operations in 핂. Here the soft-Oh notation O˜ (E) is an abbreviation for E(log E)O(1), for any expression E. Given two bivariate polynomials P,Q∈핂[x,y] of respective total degrees d⩾e, their 2 resultant Resy(P, Q) in y can be computed in time O˜ (d e); e.g. see [20, Theorem 25] and references in that paper. If d=e, then this corresponds to a complexity exponent of 3/2 in terms of input/output size. An important open question in algebraic complexity theory is whether this exponent can be lowered. In the present paper, we consider the case when P and Q are “sufficiently generic”. If the coefficients of P and Q are chosen at random in a finite field 핂= 픽q with sufficiently many elements, then this will be the case with high probability. Under a suitable hypoth- esis of “grevlex-lex-generic position” (defined below) and assuming the random access memory (RAM) bit complexity model, our main result is the following theorem:
THEOREM 1. Let 휖 > 0 be a fixed rational number. Let P, Q ∈ 핂[x, y] be two polynomials of respective total degrees d ⩾ e over a finite field 핂 = 픽q. If P and Q are in grevlex-lex-generic position, then Resy(P,Q) can be computed in expected time
O((delog q)1+휖)+O˜ (d2 log q), using a randomized algorithm of Las Vegas type.
∗. This paper is part of a project that has received funding from the French “Agence de l'Innovation de Défense”.
1 2 FAST COMPUTATION OF GENERIC BIVARIATE RESULTANTS
Figure 1. Illustration of the Gröbner stairs for P and Q in generic position with respect to A major result in a similar direction has recently been obtained by Villard [25]. For a general effective field 핂, and under different genericity assumptions, he proposed an algorithm that computes the resultant in y of two polynomials P,Q∈핂[x,y] of degree dx 2−1/휔 1+o(1) in x and degree dy in y using (dx dy ) operations in 핂. Here 휔 is the usual expo- nent for matrix multiplication (such that two n × n matrices over 핂 can be multiplied using O(n휔) operations in 핂). Le Gall has shown in [19] that one may take 휔<2.373. Relationship to Gröbner bases Resultants are related to elimination theory. Throughout the paper, we assume that the reader is familiar with the basic theory of Gröbner bases, as found in standard text books [4, 10], so we only briefly recall basic terminology. Let 핂 still be a general effective field. Two common monomial orderings on the polynomial ring 핂[x, y] are the lexicographical ordering i j k l x y We say that P and Q are in lex-generic (resp. grevlex-generic) position if the leading mono- mials of the reduced Gröbner basis of I with respect to If we were able to rapidly convert a Gröbner basis for Specific fields For 핂 = ℚ, Mehrabi and Schost [21] showed how to compute a basis of I ≔(P,Q) with respect to 2. PRELIMINARIES Computational model Throughout this paper, 핂 is an effective field. Most of our algo- rithms work in the algebraic complexity models of straight-line programs (SLPs) or com- putation trees [3], in which execution times correspond to the required number of field operations in 핂. The genericity assumptions imply that non-trivial zero tests always fail, so the straight-line program framework actually suffices. In section 6, where we prove Theorem 1, we specialize 핂 to become a finite field 픽q. From that point on, we assume a RAM bit complexity model and recall that field opera- tions in 픽q can be performed in softly linear time O˜ (log q). 4 FAST COMPUTATION OF GENERIC BIVARIATE RESULTANTS Transposition principle Given a finite dimensional 핂-vector space V of 핂[z1, . . . , z휈] ℕ ℕ that admits V ∩ z1 ⋅ ⋅ ⋅ z휈 as a basis, it is convenient to mentally represent elements of V as column vectors with respect to this basis and linear forms 휆: V → 핂 as row vectors. Linear maps between two vector spaces V,W of this type correspond to matrices. Writing V ∗ for the set of linear forms 휆:V →핂, the transpose of a linear map L:V →W is the linear map L∗:W ∗ →V ∗ such that L∗(휆)( f )=휆(L( f )) for all f ∈V. If L can be com- puted by a linear SLP over 핂 of length ℓ, then it is well-known [3, Theorem 13.20] that L∗ can be computed by an SLP of length ℓ + O(dim핂 V + dim핂 W). This “transposition principle” is in general easy to be put into practice on concrete programs, as exemplified in [1]. Roughly speaking, the program is regarded as a composition of individual steps that are easy to transpose. For the transposition of a composition L ∘ K, where K: U → V is another 핂-linear map, we next apply the usual formula (L∘K)∗ =K ∗ ∘L∗. Gröbner bases Given indeterminates z1,...,z휈 and positive integers n1,...,n휈, we define 핂[z1, . . . ,z휈]n1, . . . ,n휈 ≔ {A∈핂[z1, . . . ,z휈] :degz1 A Consider a Gröbner basis G of an ideal I ⊆핂[z1,...,z휈] for some term ordering on the set ℕ ℕ i1 i휈 of monomials z1 ⋅⋅⋅ z휈 ≔{z1 ⋅⋅⋅ z휈 :i1,...,i휈 ∈ℕ}. We write 핂[z1,...,z휈]G for the 핂-vector space of polynomials f ∈ 핂[z1, . . . , z휈] that are reduced with respect to G. The reduced ℕ ℕ monomials in BG ≔핂[z1,...,z휈]G ∩z1 ⋅⋅⋅z휈 form a basis for 핂[z1,...,z휈]G and correspond to the monomials “under the Gröbner stairs”. In other words, BG consists of the monomials that are not divisible by the leading monomial of an element in G. We also write 휌G:핂[z1, . . . ,z휈] →핂[z1, . . . ,z휈]G for the map that computes the normal form of a polynomial f ∈핂[z1,...,z휈] with respect to G, i.e. the unique polynomial 휌G( f ) in 핂[z1, . . . ,z휈]G such that f − 휌G( f )∈I. 3. CONCISE REPRESENTATION OF THE QUOTIENT ALGEBRA In the remainder of this paper, let P,Q∈핂[x,y] be two polynomials of total degrees d⩾e, in grevlex-lex-generic position. We write I ≔ (P, Q) for the ideal generated by P and Q, and 픸≔ 핂[x, y]/I for the corresponding quotient algebra. Let us start by recalling sev- eral facts from [12]. ∗ Gröbner basis The reduced Gröbner basis G of I with respect to Checking the genericity assumption By means of [12, Remark 4, Theorem 28, and Proposition 31], the condition that P and Q are indeed in grevlex-generic position can be checked using O˜ (d2) operations in 핂 by running the basis computation and aborting when an irregularity occurs. Multiplication in the quotient algebra [12, section 6.2 and Theorem 33] Assume that the concise Gröbner basis of I has been computed. We represent elements in the quotient algebra 픸 = 핂[x, y]/I by normal forms in 핂[x, y]G. Given 휑, 휓 ∈ 핂[x, y]G, we may now compute 휑 휓 ∈ 핂[x, y] using O˜ (d e) operations in 핂, since degx(휑 휓) ⩽ 2 (d + e − 2) and degy(휑휓)⩽2(e−1). By what precedes, we may therefore compute 휌G(휑휓)∈핂[x,y]G in time O˜ (de). In other words, products in 픸 can be computed in softly linear time. 4. REDUCTION TO BIVARIATE MODULAR COMPOSITION As above, P and Q are polynomials in 핂[x,y] in grevlex-lex-generic position, I ≔(P,Q), and 픸≔핂[x,y]/I. 4.1. Resultants and minimal polynomials Consider the 핂-linear multiplication map 휉: 픸 → 픸; a ↦ (x + I) a. It is known that the characteristic polynomial 휒 ∈ 핂[t] of this map equals 훼 Resy(P(t, y), Q(t, y)) for some 훼∈핂; see for instance [5, Proposition 2.7] applied with n=1 and r=1. When all the zeros of I are regular, 휒 is separable and the latter property follows more straightforwardly by comparing the sets of roots of 휒(t) and of Resy(P(t, y), Q(t, y)) via the Stickelberger eigenvalue theorem [18]. On the other hand, since P and Q are in lex-generic position, we have deg 휒 =dim핂 픸=de. We introduce the minimal polynomial 휇∈핂[t] of 휉 as the monic polynomial of minimal degree such that 휇(휉)=0, or, equivalently, 휇(x)∈I. In particular, 휇(x) coincides with the unique element of the reduced Gröbner basis of I for LEMMA 2. With P and Q in grevlex-lex-generic position, we have 휒 =휇. In addition, if |핂|>de, 2 then Resy(P,Q) can be recovered from 휇 using O˜ (d ) operations in 핂. Proof. We always have 휇|휒. The polynomials 휇 and 휒 coincide whenever deg 휇=deg 휒 = de; this is the case if and only if P and Q are in lex-generic position. Once 휒 is known, since |핂| >de, we may find a 훽∈핂 such that 휒(훽) ≠ 0 using O˜ (de) operations in 핂, by means of fast multipoint evaluation. Then the above value 훼 can be computed using O˜ (d2) further operations, as 휒(훽) 훼= . Resy(P(훽,y),Q(훽,y)) From 휒 and 훼, we deduce Resy(P(t,y),Q(t,y))=휒(t)/훼. □ The above discussion shows that the computation of Resy(P,Q) reduces to the deter- mination of 휇. 6 FAST COMPUTATION OF GENERIC BIVARIATE RESULTANTS 4.2. Wiedemann's algorithm We use Wiedemann's algorithm and the transposition principle for the computation of 휇, as follows: • We first select a random linear form 휆: 핂[x, y]G → 핂. More precisely, assuming that |핂|⩾4de, we select a finite subset S⊆핂 of size |S|⩾4de and take 휆 to be a row vector with random entries from S. • Taking N ≔2de, we define the map Ex,G:핂[t]N ⟶ 핂[x,y]G 휑 ⟼ 휌G(휑(x)). We will explain how to evaluate Ex,G efficiently in section 5. • Then we compute the sequence N−1 (휆∘Ex,G)(1),(휆∘Ex,G)(t), . . . ,(휆∘Ex,G)(t ). (1) This task is an extension of the usual “power projection” problem to the bivariate case, since it corresponds to one evaluation of the transposed map of Ex,G: ∗ ∗ ∗ Ex,G:핂[x,y]G ⟶ 핂[t]N N−1 휆 ⟼ ((휆∘Ex,G)(1),(휆∘Ex,G)(t), . . . ,(휆∘Ex,G)(t )). • Using the fast variant of the Berlekamp–Massey algorithm [10, chapter 12, Algo- rithm 12.9 combined with the extended half-gcd algorithm], we determine the linear recurrence relation of smallest order m⩽de satisfied by the sequence (1). Stated other- wise, this means that we compute the monic polynomial 휇∗ of minimal degree m⩽de such that ∗ ∗ N−1−m ∗ (휆∘Ex,G)(휇 )=(휆∘Ex,G)(t 휇 )= ⋅ ⋅ ⋅ =(휆∘Ex,G)(t 휇 )=0. i • The set of polynomials 휑 for which (휆 ∘ Ex,G)(t 휑) = 0 for i = 0, . . . , N − 1 − deg 휑 is closed under gcds and clearly contains 휇. This implies that we always have 휇∗ | 휇. If ∗ ∗ deg 휇 =de, then we are sure that 휇 =휇=휒 =훼Resy(P(t,y),Q(t,y)). The next subsec- tion reminds why this happens with high probability. 4.3. Probability analysis ∗ The above polynomials 휇 and 휇 coincide if, and only if, 휆(Ex,G(휇/휓)) ≠ 0 for any irre- ducible factor 휓 of 휇. Now given an irreducible factor 휓 of 휇, we have Ex,G(휇/휓) ≠ 0. A random linear form 휆: 핂[x, y]G → 핂 as above annihilates a fixed non-zero element of 핂[x, y]G with probability at most 1/|S|. The probability that 휆 annihilates Ex,G(휇/휓) is therefore bounded by 1/|S|. We conclude that the probability 풫success that none of the ⩽de irreducible factors 휓 of 휇 annihilates Ex,G(휇/휓) is at least 1 de 1 de 3 풫 ⩾ 1− ⩾ 1− > . success |S| 4de 4 The algorithm of the previous subsection and the present probability analysis are sum- marized in the following lemma. LEMMA 3. Assume that P and Q are in grevlex-lex-generic position and that |핂|⩾4de. Then the computation of 휇 takes an expected number of O˜ (d e) operations in 핂 plus an expected number of O(1) computations of sequences (1) for different values of 휆. JORIS VAN DER HOEVEN, GRÉGOIRE LECERF 7 5. REDUCTION TO MULTIPOINT EVALUATION In this section, we show how to efficiently reduce the evaluation of Ex,G to multivariate multipoint evaluation. We mostly follow the same Kronecker segmentation strategy as in [13, 16, 22] for modular composition. 5.1. Kronecker segmentation Given an integer 휈 that will be specified later, let 훿 ≔ ⌈(2 d e)1/휈⌉ be the smallest integer such that 훿휈 ⩾2de. We define the Kronecker map K:핂[z1, . . . ,z휈]훿, . . . ,훿 ⟶ 핂[t]훿휈 훿i−1 zi ⟼ t , i=1, . . . ,휈, ˇ as the restriction to 핂[z1, . . . , z휈]훿, . . . ,훿 of the unique morphism K: 핂[z1, . . . , z휈] → 핂[t] of 훿i−1 핂-algebras that sends zi to t for i = 1, . . . , 휈. Notice that K is bijective and that both K and its inverse can be computed in linear time with respect to the monomial bases. Let Dx ≔d+e−2 and Dy ≔e−1 be upper bounds for the degrees in x and y of elements in 핂[x,y]G. We may compute 훿i−1 gi ≔휌G x ∈핂[x,y]Dx+1,Dy+1 for i=1, . . . ,휈 using binary powering. By what has been said in section 3, this requires O˜ (d e 휈 log 훿) −1 operations in 핂. For any 휙∈핂[t]훿휈 and f =K (휙), we notice that 휌G(휙(x)) = 휌G( f (g1(x,y), . . . , g휈(x,y))). Let Nx ≔휈 (훿 − 1)Dx +1, Ny ≔휈 (훿 − 1)Dy +1, and Eg:핂[z1, . . . ,z휈]훿, . . . ,훿 ⟶ 핂[x,y]Nx,Ny f ⟼ f (g1(x,y), . . . , g휈(x,y)). Note that degx( f (g1(x,y),..., g휈(x,y))) 5.2. Evaluation-interpolation We will compute the map Eg using evaluation-interpolation. Assume for the time being that |핂|⩾Nx and let 훼1,...,훼Nx ∈핂 be pairwise distinct points. Define 훽i ≔훼i for i=1,...,Ny. Setting Α ≔{훼1, . . . ,훼Nx}, Β≔{훽1, . . . ,훽Ny}, consider the evaluation map Α×Β EΑ×Β:핂[x,y]Nx,Ny ⟶ 핂 h ⟼ (h(훼i,훽j))(훼i,훽j)∈Α×Β, which is a 핂-linear bijection. Using traditional univariate evaluation-interpolation in −1 each coordinate [10, chapter 10], both EΑ×Β and its inverse EΑ×Β can be evaluated using ˜ SLPs of length O(Nx Ny) over 핂. In particular, we can compute Α×Β Γi ≔EΑ×Β(gi)∈핂 for i=1, . . . ,휈 (3) in time O˜ (휈 Nx Ny). We next define the map Α×Β EΓ:핂[z1, . . . ,z휈]훿, . . . ,훿 ⟶ 핂 f ⟼ ( f ((Γ1)(훼i,훽j), . . . ,(Γ휈)(훼i,훽j)))(훼i,훽j)∈Α×Β. 8 FAST COMPUTATION OF GENERIC BIVARIATE RESULTANTS Then we have −1 Eg = EΑ×Β ∘EΓ. Combined with (2), this yields −1 −1 Ex,G = 휌G ∘EΑ×Β ∘EΓ ∘K . (4) 6. FAST COMPUTATION OF RESULTANTS In section 4, we have reduced the computation of bivariate resultants to the evaluation of ∗ the transposed map Ex,G of Ex,G. In section 5 the evaluation of Ex,G has been reduced to multivariate multipoint evaluation. We first recall how to perform the latter evaluation using algorithms by Kedlaya and Umans. We next combine the above reductions with the transposition principle and prove our main result. 6.1. Fast multipoint evaluation Kedlaya and Umans designed various algorithms for modular composition and mul- tipoint evaluation [16]. They also gave algorithms for the transposed operations. For the computation of Ex,G and its transpose, we will rely on the following result, which is a direct consequence of [16, Corollary 4.5 and Theorem 7.6]: THEOREM 4. Let 휖 >0 be a fixed rational number. Given f ∈픽q[z1, . . . , z휈]훿, . . . ,훿 and evaluation 휈 o(1) points 훾1,...,훾ℓ ∈픽q such that 휈=훿 , there exists an algorithm that outputs f (훾i) for i=1,...,ℓ, and that runs in time O(((훿휈 +ℓ)log q)1+휖). The transpose of the linear map f ↦( f (훾i))1⩽i⩽ℓ can be computed with the same complexity. Note that [16, Corollary 4.5 and Theorem 7.6] are actually stated in an more precise manner and that we voluntarily simplified the presentation. We also refer to [13] for some recent refinements. ∗ 6.2. Evaluating Ex,G and Ex,G Assume from now on that 핂 = 픽q. Before we prove our main result, let us first study ∗ the complexity of evaluating the transpose Ex,G of Ex,G. Recall that the computation of a sequence as in (1) reduces to one such evaluation. PROPOSITION 5. Let 휖 > 0 be a constant, thought to be small. Assume that q ⩾ max (4 d e, Nx) and that the concise representation of G and the sets Γ1,...,Γ휈 of (3) have been precomputed. Then ∗ 1+휖 one evaluation of Ex,G or of its transpose Ex,G takes time O((delog q) ). Proof. We let 휈 ≔⌈log log (d+3)⌉ and verify the following bounds: 훿 = O (de)1/loglogd 훿휈 ⩽ (2(2de)1/휈)휈 = 2휈+1 de = (de)1+o(1) 2 2 2 1+2/loglogd 1+o(1) ℓ=|Α||Β| ⩽ 휈 훿 Dx Dy = O (log log d) (de) = (de) . ∗ 1+휖 Theorem 4 therefore implies that EΓ and EΓ can be computed in time O(delog q) . JORIS VAN DER HOEVEN, GRÉGOIRE LECERF 9 −1 −1 In the previous sections, we have already shown that 휌G, EΑ×Β and K can be com- 1+휖 puted using O((d e) ) operations over 픽q. Combining this with (4), it follows that one 1+휖 evaluation of Ex,G takes time O((delog q) ). Using our genericity assumptions, we also observed that these computations can be carried out by linear SLPs over 픽q. In view of the transposition principle (see section 2), ∗ −1 ∗ −1 ∗ 1+휖 it follows that 휌G, (EΑ×Β) and (K ) can also be computed using O((de) ) operations over 픽q. Combining this with (4), we conclude that ∗ −1 ∗ ∗ −1 ∗ ∗ Ex,G = (K ) ∘EΓ ∘(EΑ×Β) ∘휌G can be computed in time O((delog q)1+휖) as well. □ 6.3. Proof of Theorem 1 2 Let us first reduce to the case when q ⩾ 4 d e and q ⩾ Nx. Let q′ = O(q d ) be the smallest power of q such that q′ ⩾ 4 d e and q′ ⩾ Nx. Whenever q′ > q, we replace 픽q by the exten- sion field 픽q′. Since log q′ = O(log q + log d), the construction of 픽q′ takes bit complexity (log d)O(1) O˜ (log q), e.g. by using [10, Corollary 14.39], and the overhead involved by this extension only concerns hidden logarithmic factors in the complexity bounds. Consequently from now q⩾ 4 d e and q ⩾ Nx are satisfied. The concise representation ˜ 2 of the Gröbner basis of I for 7. FURTHER NOTES 7.1. Parametrization Recall from section 4.1 that 휇 coincides with the unique element in 핂[x] of the Gröbner lex basis G of I with respect to O((delog q)1+휖)+O˜ (d2 log q). One technique for doing this goes back to Kronecker [17]: it performs a first order defor- mation in order to compute the minimal polynomial of x + t y + O(t2) modulo I; see for instance in [11, section 2]. In the present paper, we appeal to an other known method relying once more on power projections; as in [8, Algorithm 2], for instance. For this purpose, let 휆 be the linear form that has successfully led to the computation of 휇 and let 휇˜(z)≔zde 휇(z−1) be the reciprocal of 휇. We compute the polynomial 휗 ∈핂[z] of degree