arXiv:2006.16729v2 [math.NT] 30 Oct 2020

A TIME-SPACE TRADEOFF FOR LEHMAN'S DETERMINISTIC INTEGER FACTORIZATION METHOD

MARKUS HITTMEIR

Abstract. Fermat's well-known factorization algorithm is based on finding a representation of natural numbers $N$ as the difference of squares. In 1895, Lawrence generalized this idea and applied it to multiples $kN$ of the original number. A systematic approach to choose suitable values for $k$ was introduced by Lehman in 1974, which resulted in the first deterministic factorization algorithm considerably faster than trial division. In this paper, we construct a time-space tradeoff for Lawrence's generalization and apply it together with Lehman's result to obtain a deterministic integer factorization algorithm with runtime complexity $O(N^{2/9+o(1)})$. This is the first exponential improvement since the establishment of the $O(N^{1/4+o(1)})$ bound in 1977.

2010 Mathematics Subject Classification. 11Y05.

SBA Research (SBA-K1) is a COMET Centre within the framework of COMET – Competence Centers for Excellent Technologies Programme and funded by BMK, BMDW, and the federal state of Vienna. The COMET Programme is managed by FFG.

1. Introduction

We consider the problem of computing the prime factorization of natural numbers $N$. There is a large variety of probabilistic and heuristic factorization methods achieving subexponential complexity. We refer the reader to the survey [Len00] and to the monographs [Rie94] and [Wag13]. The focus of the present paper is a more theoretical aspect of the integer factorization problem, which concerns deterministic algorithms and their rigorous analysis. We will describe runtime complexities by using the bit-complexity model of the multitape Turing machine [Pap94].

In [Str77], Strassen used fast polynomial multiplication and multipoint evaluation techniques to establish a deterministic and rigorous factorization algorithm running in $\widetilde{O}(N^{1/4})$ bit operations. Since the publication of Strassen's approach in 1977, there have been a few refinements of the runtime complexity. However, none of these improvements has been able to reduce the exponent $1/4$. The best of these bounds has been proved in [Hit18] and is given by
\[ \widetilde{O}\!\left(N^{1/4} \exp\!\left(-C \log N / \log\log N\right)\right) \]
for a positive constant $C$, where the $\widetilde{O}$-notation is used to omit logarithmic factors. The contribution of this paper is to finally break the $1/4$-exponent threshold for deterministic integer factorization by proving the following theorem.

Theorem 1.1. There exists an algorithm to deterministically compute the prime factorization of any natural number $N$ in
\[ O\!\left(M_{\mathrm{int}}\!\left(N^{2/9} \log^{5/3} N\right) \log N\right) \]
bit operations.

Here, $M_{\mathrm{int}}(k)$ denotes the cost of multiplying two $\lceil k \rceil$-bit integers. Following [BGS07, p.1782], we assume that
\[ \frac{M_{\mathrm{int}}(k)}{k} \le \frac{M_{\mathrm{int}}(k')}{k'} \ \text{ if } k \le k', \quad \text{and} \quad M_{\mathrm{int}}(kk') \le k^2 M_{\mathrm{int}}(k') \tag{1.1} \]
throughout the paper. Due to recent improvements ([HH19]), $M_{\mathrm{int}}(k)$ may be bounded by $O(k \log k)$.

For the sake of simplicity, we will now consider a natural number $N = pq$, where $p$ and $q$ are distinct primes. In order to prove Theorem 1.1, we will establish a time-space tradeoff for Fermat's well-known factorization method and its generalizations by Lawrence ([Law95]) and Lehman ([Leh74]). Fermat's method is a special-purpose algorithm first described in 1643. The idea of the procedure is based on the fact that
\[ (p+q)^2 - (q-p)^2 = 4N. \]
If the prime difference $\Delta := q - p$ is small, the number $S := p + q$ is close to $2\sqrt{N}$. In fact, one proves that
\[ 0 < S - 2\sqrt{N} < \frac{\Delta^2}{4N^{1/2}}. \]
Hence, we compute $(z + \lceil 2\sqrt{N} \rceil)^2 - 4N$ for $z = 0, 1, 2, \ldots$ and check whether the result is a square. If this procedure indeed yields an equality of the form $(z + \lceil 2\sqrt{N} \rceil)^2 - 4N = y^2$, we may obtain a proper factor of $N$ by computing $\gcd(z + \lceil 2\sqrt{N} \rceil - y, N)$.

Lawrence's generalization works in a similar manner, but is based on finding linear combinations $ap + bq$ of the prime factors of $N$. In 1974, Lehman published a deterministic factorization method running in time $\widetilde{O}(N^{1/3})$, thus being significantly faster than trial division. The algorithm applies Lawrence's approach in a systematic manner and relies on suitable lower and upper bounds for the linear combinations $ap + bq$ (see Theorem 2.1). All pairs $(a, b)$ in a certain range are checked, and the bounds for the corresponding linear combinations vary depending on the size of the value of $ab$.

In our improvement, we utilize Lehman's theorem and elaborate on an idea from [Hit18]. There, it was shown that $\alpha^S \equiv \alpha^{N+1} \bmod N$ holds for every $\alpha$ coprime to $N$. Subsequently, this congruence has been used to search for $S = p + q$. In Lemma 4.2 of the present paper, we will extend this statement to the linear combinations $ap + bq$ used by Lehman. As a consequence, we are able to construct two sets $\mathcal{B}$ and $\mathcal{S}$ which are disjoint modulo $N$, but not disjoint modulo $p$ or $q$.
In order to find the corresponding collision and, hence, a proper factor of $N$, we will adapt the already mentioned factorization scheme by Strassen. The application of efficient polynomial arithmetic techniques together with Lehman's bounds leads to the runtime complexity stated in Theorem 1.1.

The remainder of the paper is organized as follows: In Section 2, we discuss two theorems important to our improvement, one of which is Lehman's result. In Section 3, we consider our adaptation of Strassen's factorization scheme and introduce the concept of revealing subsets. Section 4 contains the core idea of this paper, namely the time-space tradeoff for Lawrence's algorithm. In Section 5, we explain our strategy for applying Lehman's bounds. Finally, Section 6 puts all the pieces together and finishes the proof.
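As an illustration of the classical search described in the introduction, the following is a minimal sketch of Fermat's method and of Lawrence's variant for a fixed multiplier $k = ab$. It is not the algorithm of this paper; the function names `ceil_sqrt` and `fermat_lawrence` and the step limit `max_steps` are hypothetical choices made for this example.

```python
from math import gcd, isqrt


def ceil_sqrt(n):
    """Smallest integer r with r * r >= n."""
    r = isqrt(n)
    return r if r * r == n else r + 1


def fermat_lawrence(N, k=1, max_steps=10**6):
    """Search for x >= ceil(2*sqrt(k*N)) such that x^2 - 4kN is a square.

    k = 1 is Fermat's method; k = a*b corresponds to Lawrence's
    generalization. Name and step limit are illustrative assumptions.
    """
    x0 = ceil_sqrt(4 * k * N)  # equals ceil(2 * sqrt(k * N))
    for z in range(max_steps):
        x = x0 + z
        d = x * x - 4 * k * N  # nonnegative because x >= sqrt(4kN)
        y = isqrt(d)
        if y * y == d:
            g = gcd(x - y, N)
            if 1 < g < N:  # ignore solutions yielding only a trivial divisor
                return g
    return None
```

For $N = 5959 = 59 \cdot 101$, the call `fermat_lawrence(5959)` reaches $x = 160 = 59 + 101$ after a few candidates. With $k = 15$ (that is, $a = 5$, $b = 3$, so that $a/b$ approximates $q/p$), the very first candidate $x = 598 = 5 \cdot 59 + 3 \cdot 101$ already succeeds, which is the effect Lawrence's multipliers are meant to achieve.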
2. Preliminary Work
This short section discusses two preliminary results our improvement is based on. Throughout the paper, we will use the notations $\mathbb{N} = \{1, 2, 3, \ldots\}$ and $\mathbb{Z}_n := \mathbb{Z}/n\mathbb{Z}$ for $n \in \mathbb{N}$. The first and most important ingredient of our algorithm is the following theorem, which was proved in [Leh74]. Lehman applied it with $\eta \in O(N^{1/3})$ to obtain a new factorization technique running in time $\widetilde{O}(N^{1/3})$. Compared to trial division, this was a major improvement.

Theorem 2.1. [Lehman, 1974] Suppose that $N \in \mathbb{N}$ is odd and $\eta$ is an integer such that $1 \le \eta < N^{1/2}$. If $N = pq$ with primes $p$, $q$ and $(N/(\eta+1))^{1/2} < p \le N^{1/2}$, then there are nonnegative integers $x$, $y$ and $k$ such that
\[ x^2 - y^2 = 4kN, \quad 1 \le k \le \eta, \]
\[ 0 \le x - (4kN)^{1/2} \le \frac{N^{1/2}}{4k^{1/2}(\eta+1)}. \tag{2.1} \]
These integers are of the form $x = ap + bq$, $y = |ap - bq|$ and $k = ab$, where $a, b \in \mathbb{N}$.

Lehman used Lawrence's approach systematically and searched for linear combinations of the prime divisors of $N$. The idea is to divide the interval $[0, 1]$ into parts, where each part corresponds to a fraction $a/b$. Considering many such fractions, the goal is to find the best approximation for $q/p$. This implies that the value of $|ap - bq|$ is sufficiently small such that the inequality (2.1) holds true. The shape of $x$, $y$ and $k$ is shown in (4.6) of Lehman's proof in [Leh74, p.641] and is not part of the original formulation of the theorem. However, it is crucial for our application of the result.

Our second ingredient is a deterministic algorithm for finding elements of large order modulo $N$. We will need such an element in our application of the already mentioned Lemma 4.2. The following result has been proved in [Hit18, Theorem 6.3].

Theorem 2.2. [Hittmeir, 2018] Let $N \in \mathbb{N}$ and $\delta$ be an integer such that $N^{2/5} \le \delta \le N$. There exists an algorithm that either returns some $\alpha \in \mathbb{Z}_N^*$ with $\mathrm{ord}_N(\alpha) > \delta$, or some nontrivial factor of $N$, or proves $N$ prime. Its runtime complexity is bounded by
\[ O\!\left(\frac{\delta^{1/2} \log^2 N}{\sqrt{\log\log \delta}}\right) \]
bit operations.
In this algorithm, we apply the standard babystep-giantstep procedure to compute small orders of elements $\alpha \in \mathbb{Z}_N^*$. If $m := \mathrm{ord}_N(\alpha)$ is actually found, we try to find a factor of $N$ via $\gcd(N, \alpha^{m/r} - 1)$ for every $r \mid m$. If this fails, we know that $m \mid p - 1$ for every prime divisor $p$ of $N$. Repeating this process for various values of $\alpha$, we either find an element of sufficiently large order or obtain enough information about the factorization of $p - 1$ for $p \mid N$ such that factoring $N$ directly is feasible.

3. Revealing Subsets

We briefly recall Strassen's idea for factoring natural numbers. Let $N \in \mathbb{N}$ and set $d := \lceil N^{1/4} \rceil$. We want to compute subproducts of the product $\lceil N^{1/2} \rceil!$ to find a factor of $N$. For this task, the polynomial $f = (X + 1)(X + 2) \cdots (X + d)$ is computed modulo $N$ and evaluated at the points $0, d, 2d, \ldots, (d-1)d$ by using fast polynomial arithmetic techniques. In the next step, $g_i := \gcd(f(id), N)$ is computed for $i = 0, \ldots, d - 1$. If $N$ is not a prime, one of these GCDs is not equal to 1. If $g_{i_0} = N$, we obtain a nontrivial factor of $N$ by computing $\gcd(i_0 d + j, N)$ for $j = 1, \ldots, d$.

In this paper, we will apply Strassen's idea in a generalized setting. We will still use linear polynomials as factors of $f$, but with different sets of zeros. Similarly, we will use different sets of evaluation points for computing the GCDs. The following definition clarifies the conditions under which such sets will reveal a factor of $N$ in the same way the original procedure of Strassen does. Subsequently, we formulate the corresponding algorithm.

Definition 3.1. Let $N \in \mathbb{N}$. A pair of subsets $\mathcal{B}$ and $\mathcal{S}$ of $\mathbb{Z}_N^*$ is called revealing if the following two conditions hold:
(1) $\mathcal{B}$ and $\mathcal{S}$ are disjoint modulo $N$.
(2) If $N$ is composite, then there is a nontrivial divisor $v$ of $N$ such that $\mathcal{B}$ and $\mathcal{S}$ are not disjoint modulo $v$.

Algorithm 3.2.
Input: A natural number $N$ and a pair of disjoint subsets $\mathcal{B}$ and $\mathcal{S}$ of $\mathbb{Z}_N^*$. We denote $\mathcal{B} = \{b_i : 1 \le i \le \beta\}$ and $\mathcal{S} = \{s_j : 1 \le j \le \sigma\}$.
1: Compute $f := \prod_{i=1}^{\beta} (X - b_i)$.
2: Compute $f(s_1), f(s_2), \ldots, f(s_\sigma)$.
3: for $j = 1, \ldots, \sigma$ do
4:   Compute $\gamma_j := \gcd(f(s_j), N)$.
5:   if $1 < \gamma_j < N$ then return $\gamma_j$.
6:   if $\gamma_j = N$ then
7:     for $i = 1, \ldots, \beta$ do
8:       Compute $\gamma_{j,i} := \gcd(s_j - b_i, N)$.
9:       if $\gamma_{j,i} > 1$ then return $\gamma_{j,i}$.
10: Return "No factor found"

In the subsequent sections, we will employ Algorithm 3.2 with different choices for $\mathcal{B}$ and $\mathcal{S}$. We now prove correctness and analyze the runtime of the procedure.

Lemma 3.3. Algorithm 3.2 runs in time $O(M_{\mathrm{int}}(\delta \log N) \log N)$, where $\delta := \max\{\beta, \sigma\}$. If $N$ is composite and $\mathcal{B}$ and $\mathcal{S}$ are revealing, it returns a nontrivial factor of $N$.

Proof. We first prove correctness. Let $N$ be composite and $\mathcal{B}$ and $\mathcal{S}$ be revealing subsets. There are elements $b_{i'} \in \mathcal{B}$ and $s_{j'} \in \mathcal{S}$ such that $b_{i'} \equiv s_{j'} \bmod v$ for some nontrivial divisor $v$ of $N$. As a result, one easily observes that $\gamma_{j'} > 1$ in Step 4 of the algorithm. If $\gamma_{j'} <$ [...]

We now discuss the runtime complexity. Since $\mathcal{B}$ and $\mathcal{S}$ are subsets of $\mathbb{Z}_N^*$, we have $\delta <$ [...]

One observes that we obtain the original approach of Strassen as a special case. We will apply it in form of the following corollary.

Corollary 3.4. Let $N \in \mathbb{N}$ and $\Delta \in \mathbb{N}$ with $\Delta \le N^{1/2}$. We can find a nontrivial divisor $\ell$ of $N$ such that $\ell \le \Delta$, or prove that no such divisor exists, in
\[ O\!\left(M_{\mathrm{int}}\!\left(\Delta^{1/2} \log N\right) \log N\right) \]
bit operations.

Proof. Assume that $N$ is sufficiently large and let $d = \lceil \Delta^{1/2} \rceil$. We note that $1 \le d$ [...]

Lemma 3.5. Let $N \in \mathbb{N}$ and $\mathcal{L}_1$ and $\mathcal{L}_2$ be two indexed lists of elements in $\mathbb{Z}_N^*$ and indices of bit size $O(\log N)$. Moreover, assume that the elements of $\mathcal{L}_1$ are all distinct. We may compute the set $\mathcal{M}$ of all the pairs of indices $(i, j)$ for which the element with index $i$ in $\mathcal{L}_1$ is equal to the element with index $j$ in $\mathcal{L}_2$. The bit-complexity is bounded by $O(\delta \log^2 N)$, where $\delta$ is the maximum of the lengths of the lists.

Proof. In the multitape Turing machine model, we use merge sort to solve this task.
We denote every number in the lists as a string of at most $O(\log N)$ bits and sort both $\mathcal{L}_1$ and $\mathcal{L}_2$ by performing at most $O(\delta \log \delta)$ comparisons, yielding a bit-complexity bounded by $O(\delta \log^2 N)$. After applying merge sort, it is easy to see that one is able to find all matches in the lists in $O(\delta \log N)$ bit operations. For more detailed information on merge sort, we refer the reader to [SW11, Chap. 2.2]. For the claimed complexity, see Proposition F in [SW11, Page 272].

4. Time-Space Tradeoff for Lawrence's algorithm

Let $N \in \mathbb{N}$ and $u$ and $v = N/u$ be nontrivial divisors of $N$. As discussed in the introduction, Fermat's factorization algorithm searches for $S = u + v$ by checking if $i^2 - 4N$ is a square for each $i \ge \lceil 2\sqrt{N} \rceil$. Lawrence ([Law95]) first noted that, since the equality $(au + bv)^2 - (au - bv)^2 = 4abN$ holds true for all $a, b \in \mathbb{N}$, we can try to find any linear combination of $u$ and $v$, not only $S$. For each $i \ge \lceil 2\sqrt{abN} \rceil$, we check if $i^2 - 4abN$ is a square number. As in Fermat's original approach, we have
\[ 0 < au + bv - 2\sqrt{abN} < \frac{(au - bv)^2}{4\sqrt{abN}}, \tag{4.1} \]
hence finding $au + bv$ works best if $|au - bv|$ is relatively small or, in other words, if $a/b$ is a good approximation of $v/u$. The following lemma shows how to factorize $N$, given that we know $au + bv$.

Lemma 4.1. Let $N \in \mathbb{N}$ and $u$ and $v = N/u$ be unknown nontrivial divisors of $N$. Given as input positive numbers $a, b, L \in O(N)$ such that $a$ and $b$ are coprime to $N$, we may test if $L = au + bv$ or $L = bu + av$, and if so compute $u$ and $v$, in $O(M_{\mathrm{int}}(\log N))$ bit operations.

Proof. We compute the roots $r_1$ and $r_2$ of the polynomial $X^2 - LX + abN$. One easily checks that either $r_1/a$ or $r_1/b$ is equal to one of the factors of $N$. Since all involved quantities have $O(\log N)$ bits, we achieve the claimed runtime complexity by using the quadratic formula.

In this section, we introduce a time-space tradeoff for Lawrence's extension for semiprime numbers $N$.
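The root computation in the proof of Lemma 4.1 can be sketched as follows. This is an illustrative reading of the lemma under the assumption that exact integer arithmetic is used for the quadratic formula; the function name `recover_factors` is hypothetical and does not appear in the paper.

```python
from math import isqrt


def recover_factors(N, a, b, L):
    """Test whether L = a*u + b*v or L = b*u + a*v for divisors u, v of N.

    Solves X^2 - L*X + a*b*N = 0 over the integers; its roots are the two
    summands of L, so dividing a root by a or by b may expose a factor.
    Returns (u, v) with u <= v, or None if the test fails.
    """
    disc = L * L - 4 * a * b * N
    if disc < 0:
        return None
    s = isqrt(disc)
    if s * s != disc:
        return None  # no integer roots, hence L is not of the desired form
    # L and s have the same parity because s^2 = L^2 - 4abN.
    for root in ((L + s) // 2, (L - s) // 2):
        for c in (a, b):
            if root % c == 0:
                u = root // c
                if 1 < u < N and N % u == 0:
                    return tuple(sorted((u, N // u)))
    return None
```

For example, with $N = 5959$, $a = 5$, $b = 3$ and the candidate $L = 598$, the discriminant equals $64$, the roots are $303 = 3 \cdot 101$ and $295 = 5 \cdot 59$, and the function returns the pair $(59, 101)$.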
The extension is based on the following lemma, which may be considered as a generalization of the fact that $\alpha^{p+q} \equiv \alpha^{N+1} \bmod N$ for every $\alpha \in \mathbb{Z}_N^*$.

Lemma 4.2. Let $N$ be semiprime with the distinct factors $p$ and $q$. Furthermore, let $a, b \in \mathbb{Z}$, $\alpha \in \mathbb{Z}_N^*$ and set $t := \alpha^{bN+a} \pmod{N}$. We have
\[ \alpha^{ap+bq} \equiv t \bmod p \quad \text{and} \quad \alpha^{bp+aq} \equiv t \bmod q. \]

Proof. The first congruence follows from the fact that $p \equiv 1 \bmod p - 1$ and, hence,
\[ ap + bq \equiv a + bpq \equiv a + bN \bmod p - 1. \]
The proof of the second congruence is similar.

Setting $t := \alpha^{bN+a} \pmod{N}$ introduces an asymmetry of $a$ and $b$ and defines which congruence in Lemma 4.2 holds modulo which prime factor. The following arguments show how to find $ap + bq$ by considering the first congruence modulo $p$. However, they may easily be modified to fit the search for $bp + aq$ via the congruence modulo $q$. In the subsequent theorem, we search for both quantities at once.

Let $x_0 = ap + bq - \lceil 2\sqrt{abN} \rceil$ and $t_0 := \alpha^{bN+a-\lceil 2\sqrt{abN} \rceil} \pmod{N}$. From Lemma 4.2, it follows that $\alpha^{x_0} \equiv t_0 \bmod p$. Now let $\Lambda \in \mathbb{N}$ such that $x_0 < \Lambda$. For example, $\Lambda$ may be taken from inequality (4.1) or Lehman's bound in Theorem 2.1. Let $m = \lceil \Lambda^{1/2} \rceil$ and write $x_0 = m l_0 + r_0$ for unknown numbers $l_0, r_0 \in \{0, 1, \ldots, m - 1\}$. We then have
\[ \alpha^{r_0} \equiv \alpha^{-m l_0} t_0 \bmod p. \tag{4.2} \]
Our goal is to solve (4.2) in order to find $l_0$ and $r_0$ and obtain a proper factor of $N$. However, (4.2) is a congruence modulo an unknown prime factor. We will hence employ the approach discussed in Section 3, which allows us to prove the following result.

Theorem 4.3. Let $N$ and $\Lambda$ be natural numbers. There is an algorithm that takes as input $\alpha \in \mathbb{Z}_N^*$ such that $\mathrm{ord}_N(\alpha) > \lceil \Lambda^{1/2} \rceil$ and positive integers $a, b \in O(N)$ which are coprime to $N$. Its runtime complexity is bounded by
\[ O\!\left(M_{\mathrm{int}}\!\left(\Lambda^{1/2} \log N\right) \log N\right). \]
If $N$ is semiprime with distinct factors $p$ and $q$, and
\[ ap + bq - \lceil 2\sqrt{abN} \rceil < \Lambda \quad \text{or} \quad bp + aq - \lceil 2\sqrt{abN} \rceil < \Lambda \]
holds, the algorithm returns $p$ and $q$.

Proof. Put $m = \lceil \Lambda^{1/2} \rceil$ and let $x_0 = ap + bq - \lceil 2\sqrt{abN} \rceil$ and $x_1 = bp + aq - \lceil 2\sqrt{abN} \rceil$.
Our assumptions imply that $x_0 < \Lambda$ or $x_1 < \Lambda$ holds, and we will use Lemma 4.2 to search for them simultaneously. W.l.o.g., we assume that $x_0 < \Lambda$ and $x_0 = m l_0 + r_0$ for unknown numbers $l_0, r_0 \in \{0, 1, \ldots, m - 1\}$. Before applying Algorithm 3.2, we perform D. Shanks' babystep-giantstep method to search for matches of the form (4.2), but modulo $N$ instead of modulo $p$. We start by computing $t_0$ in time $O(M_{\mathrm{int}}(\log N) \log N)$ via the Square-and-Multiply algorithm. Next, we compute the lists of the babysteps $\alpha^i \pmod{N}$ for $0 \le i \le m - 1$ and the giantsteps $\alpha^{-mj} t_0 \pmod{N}$ for $0 \le j \le m - 1$. Note that these elements should be stored together with their indices $i$ and $j$. It is easy to see that this can be done by performing $O(m)$ multiplications modulo $N$, which is asymptotically negligible.

In the next step of the algorithm, we compute $g_i := \gcd(N, \alpha^i - 1 \pmod{N})$ for $i = 1, \ldots, m - 1$. The computation of $m$ GCDs of numbers bounded by $O(N)$ can be done in $O(m \cdot M_{\mathrm{int}}(\log N) \log\log N)$ bit operations, which is also negligible. Since we assumed $\mathrm{ord}_N(\alpha) > m$, it follows that $g_i < N$ for every $i$. As a result, we either find a factor of $N$ or we obtain $g_i = 1$ for every $i$. If we do not find a factor, we know that $\mathrm{ord}_p(\alpha) \ge m$ and $\mathrm{ord}_q(\alpha) \ge m$. We now suppose this is the case. Since our assumptions on the order of $\alpha$ modulo $N$ imply that all the babysteps are distinct, we may apply Lemma 3.5 to compute the set $\mathcal{M}$ of pairs of indices corresponding to the matches between the babysteps and the giantsteps. The runtime is bounded by $O(m \log^2 N)$, which again is negligible. Moreover, we deduce that $|\mathcal{M}| \le m$. For each solution $(i', j')$ in $\mathcal{M}$, we check if $mj' + i' = x_0$. This can be done by using $mj' + i' + \lceil 2\sqrt{abN} \rceil$ as candidate for $L$ in Lemma 4.1. Assume that we do not find $p$ and $q$. The lower bounds on the order of $\alpha$ modulo $p$ and modulo $q$ imply that $\alpha^{i'}$ is the only babystep that matches $\alpha^{-mj'} t_0$ modulo the prime factors of $N$. It follows that $j' \ne l_0$.
As a result, we delete the element $\alpha^{-mj'} t_0 \pmod{N}$ from the list of giantsteps. The overall runtime complexity for these applications of Lemma 4.1 can be bounded by $O(m \cdot M_{\mathrm{int}}(\log N))$, and hence is negligible.

If we have not found a factor at this point, we define the set $\mathcal{B}$ consisting of the babysteps and the set $\mathcal{S}$ consisting of the remaining giantsteps. Note that the list of giantsteps may have contained several instances of the same element in $\mathbb{Z}_N^*$, stored together with different indices. In the sets $\mathcal{B}$ and $\mathcal{S}$, we ignore multiplicities and no longer keep track of the indices. Clearly, $\mathcal{B}$ and $\mathcal{S}$ are disjoint modulo $N$. Moreover, we point out that the set of giantsteps is not empty, as it must still contain the element $\alpha^{-m l_0} t_0 \pmod{N}$. It follows that the two sets are not disjoint modulo $p$ or not disjoint modulo $q$. We conclude that $\mathcal{B}$ and $\mathcal{S}$ are a pair of revealing subsets of $\mathbb{Z}_N^*$. We hence may apply Lemma 3.3 and the claim follows.

5. Utilizing Lehman's bound

We now discuss the application of Lehman's bound (2.1) in Theorem 2.1, which is at the core of our improvement. In the following, we suppose that $N$ is a prime or a semiprime number. A procedure for reducing the factorization of any natural number to the factorization of primes and semiprimes will be discussed at the beginning of the next section.

Assuming that $N = pq$, our goal is to find the number $x$ which satisfies the bound (2.1). As discussed in Section 2, this number is of the shape $x = ap + bq$. We consider all possible values of $k = ab$ in the interval $[1, \eta]$ and apply the time-space tradeoff established in Theorem 4.3 to compute the factors of $N$. We have to account for the fact that said approach cannot search for all decompositions of a fixed $1 \le k \le \eta$ at once. Instead, we have to go through all $(a, b)$ with $1 \le ab \le \eta$. However, we will see that the effect on the runtime complexity is marginal.
Moreover, we may assume that $a \le b$, since our procedure still searches for $ap + bq$ and $bp + aq$ at once.

In addition, we split the interval $[1, \eta]$ into $[1, \xi]$ and $(\xi, \eta]$, employing different strategies for the pairs $(a, b)$ depending on whether $ab$ is in the first or the second range. This idea is based on the fact that Lehman's bound (2.1) is large for smaller values of $k \le \xi$, which allows for an effective application of the procedure in Theorem 4.3 for every single pair $(a, b)$. However, for larger values of $k > \xi$, the bound (2.1) is decreasing rapidly. In this case, we are able to denote all remaining candidates for $x$ as $\delta + \lceil 2\sqrt{kN} \rceil$ for some sufficiently small $\delta$, and we will require only one more run of an approach similar to the one taken in Theorem 4.3 to find $x$ and, hence, the prime factors $p$ and $q$. Both parameters $\xi$ and $\eta$ have a strong impact on the final runtime complexity and will be optimized later. Based on the explanations above, we now suggest the following algorithm for a time-space tradeoff of Lehman's algorithm.

Algorithm 5.1.
Input: A natural number $N$ and two integers $\xi$ and $\eta$ such that $1 \le \xi \le \eta$ [...]

Lemma 5.2. The bit-complexity of Algorithm 5.1 is bounded by
\[ O\!\left( M_{\mathrm{int}}\!\left(\frac{N^{1/4} \xi^{3/4} \log \xi}{\eta^{1/2}} \log N\right) \log N + M_{\mathrm{int}}\!\left(\frac{N^{1/2}}{\xi^{1/2} \eta} \log N\right) \log N + M_{\mathrm{int}}(\eta \log \eta \log N) \log N \right). \]
If $N$ is semiprime with distinct factors $p$ and $q$, it returns a prime factor of $N$.

Proof. In order to prove correctness, we let $\bar{a}$ and $\bar{b}$ be such that $x := \bar{a}p + \bar{b}q$ and $k := \bar{a}\bar{b}$ in Theorem 2.1. If $k \le \xi$, one easily checks that $p$ and $q$ will be found in the corresponding run of Step 1. Just note that our assumptions imply
\[ \mathrm{ord}_N(\alpha) > \lceil N^{1/2}/(\xi^{1/2} \eta) \rceil > \lceil N^{1/4}/\eta^{1/2} \rceil \ge \lceil \Lambda_{a,b}^{1/2} \rceil \]
for all $(a, b)$, hence Theorem 4.3 is applied correctly. We also point out that it suffices to consider the pairs $(a, b)$ with $a \le b$, since Theorem 4.3 searches for both $ap + bq$ and $bp + aq$ at once.

We now assume that $\xi < k \le \eta$.
In this case, note that Lehman's bound (2.1) implies that we may write $x = \delta + \lceil 2\sqrt{kN} \rceil$ for some
\[ 0 \le \delta \le \frac{N^{1/2}}{4k^{1/2}(\eta + 1)} < \frac{N^{1/2}}{\xi^{1/2} \eta}. \]
As a result, we derive $\alpha^{\delta} \equiv \alpha^{\bar{b}N + \bar{a} - \lceil 2\sqrt{kN} \rceil} \bmod p$. In Step 2, we construct indexed lists modulo $N$ such that there is a collision modulo $p$ which corresponds to this congruence. The remainder of the algorithm and the following arguments are similar to the approach we took in the proof of Theorem 4.3. The goal is to end up with a pair of revealing subsets in $\mathbb{Z}_N^*$. In Step 3, we either find a prime factor of $N$ or make sure that both $\mathrm{ord}_p(\alpha)$ and $\mathrm{ord}_q(\alpha)$ are greater than or equal to $\lceil N^{1/2}/(\xi^{1/2} \eta) \rceil$. In Step 4, we search for matches between $\mathcal{L}_1$ and $\mathcal{L}_2$ modulo $N$ and check if the corresponding indices yield candidates that are equal to $x$. The lower bounds on the order of $\alpha$ modulo $p$ and modulo $q$ imply that at most one babystep matches each giantstep modulo these prime factors. Since we delete all matching elements from $\mathcal{L}_2$ in Step 4, the sets $\mathcal{B}$ and $\mathcal{S}$ in Step 5 are disjoint modulo $N$. Moreover, the fact that no factor has been found at this point implies that the element corresponding to $(\bar{a}, \bar{b})$ is still in $\mathcal{S}$. Of course, the element $\alpha^{\delta} \pmod{N}$ is in $\mathcal{B}$, and the desired result follows.

We proceed by considering the runtime complexity of the algorithm. For Step 1, we consider Theorem 4.3 and the sum over all $(a, b)$ with $a \le b$ and $ab \le \xi$. Hence, we write
\[ \sum_{a=1}^{\lfloor \xi^{1/2} \rfloor} \sum_{b=a}^{\lfloor \xi/a \rfloor} M_{\mathrm{int}}\!\left(\Lambda_{a,b}^{1/2} \log N\right) \log N = \sum_{a=1}^{\lfloor \xi^{1/2} \rfloor} \sum_{b=a}^{\lfloor \xi/a \rfloor} M_{\mathrm{int}}\!\left(\frac{N^{1/4}}{(ab)^{1/4} \eta^{1/2}} \log N\right) \log N \]
\[ \le M_{\mathrm{int}}\!\left(\frac{N^{1/4} \log N}{\eta^{1/2}} \cdot \sum_{a=1}^{\lfloor \xi^{1/2} \rfloor} \frac{1}{a^{1/4}} \sum_{b=a}^{\lfloor \xi/a \rfloor} \frac{1}{b^{1/4}}\right) \log N, \]
where we have used that $M_{\mathrm{int}}(k) + M_{\mathrm{int}}(k') \le M_{\mathrm{int}}(k + k')$ holds for all $k, k'$, which follows from the first assumption in (1.1). Now by considering
\[ \sum_{a=1}^{\lfloor \xi^{1/2} \rfloor} \frac{1}{a^{1/4}} \sum_{b=a}^{\lfloor \xi/a \rfloor} \frac{1}{b^{1/4}} \le \sum_{a=1}^{\lfloor \xi^{1/2} \rfloor} \frac{1}{a^{1/4}} \int_0^{\lfloor \xi/a \rfloor} t^{-1/4} \, dt \in O\!\left(\sum_{a=1}^{\lfloor \xi^{1/2} \rfloor} \frac{\xi^{3/4}}{a}\right) \subseteq O(\xi^{3/4} \log \xi), \]
we are able to obtain the first summand in our claimed runtime complexity bound.
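The double-sum estimate above can be checked numerically for small $\xi$. The following sketch compares the sum with the explicit bound $\tfrac{4}{3}\,\xi^{3/4}\,(1 + \tfrac{1}{2}\ln \xi)$, which results from evaluating the integral and bounding the harmonic sum by $1 + \ln \lfloor \xi^{1/2} \rfloor$; the explicit constants and the function names `pair_sum` and `integral_bound` are assumptions made for this illustration and are not stated in the paper.

```python
from math import isqrt, log


def pair_sum(xi):
    """Sum of (a*b)^(-1/4) over all pairs a <= b with a*b <= xi."""
    total = 0.0
    for a in range(1, isqrt(xi) + 1):
        for b in range(a, xi // a + 1):
            total += (a * b) ** -0.25
    return total


def integral_bound(xi):
    """Explicit form of the O(xi^(3/4) log xi) bound.

    Uses int_0^{xi/a} t^(-1/4) dt = (4/3) (xi/a)^(3/4) and the harmonic
    sum estimate H_n <= 1 + ln(n) with n = floor(sqrt(xi)).
    """
    return (4.0 / 3.0) * xi ** 0.75 * (1.0 + 0.5 * log(xi))
```

Evaluating both functions for, say, $\xi \in \{10, 100, 1000\}$ shows the sum staying below the bound, in line with the estimate used for the first summand.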
For estimating the runtime complexity of the Steps 2-4, let $\kappa$ denote the maximum of the lengths of the two lists $\mathcal{L}_1$ and $\mathcal{L}_2$. We first note that the length of $\mathcal{L}_2$ is certainly no more than the number of pairs $(a, b)$ with $1 \le ab \le \eta$, which is bounded by
\[ \sum_{a=1}^{\lfloor \eta^{1/2} \rfloor} \sum_{b=1}^{\lfloor \eta/a \rfloor} 1 \le \sum_{a=1}^{\lfloor \eta^{1/2} \rfloor} \frac{\eta}{a} \in O(\eta \log \eta). \]
Therefore, it follows that
\[ \kappa \in O\!\left(\max\left\{\frac{N^{1/2}}{\xi^{1/2} \eta},\ \eta \log \eta\right\}\right). \]
Since all exponents in Step 2 are in $O(N^2)$, the construction of the two lists via the well-known Square-and-Multiply algorithm is finished after at most $O(\kappa \cdot M_{\mathrm{int}}(\log N) \log N)$ bit operations, which is negligible. The computation of the GCDs in Step 3 is finished after at most $O(\kappa \cdot M_{\mathrm{int}}(\log N) \log\log N)$ bit operations. The complexity for applying Lemma 3.5 in Step 4 is $O(\kappa \log^2 N)$, which is also negligible due to the fact that $k \le M_{\mathrm{int}}(k)$ is true for all $k$. The same holds for the cost of applying Lemma 4.1 with all possible candidates, which may be bounded by $O(\kappa \cdot M_{\mathrm{int}}(\log N))$. Finally, the runtime complexity for Step 5 is a result of Lemma 3.3, which finishes the proof.

6. Proof of Theorem 1.1

Let $N$ be any natural number. We start by performing the following preparation step: We apply Corollary 3.4 to $N$, putting $\Delta_0 = \lceil N^{1/3} \rceil$. If any divisor of $N$ is found, we remove it and denote the resulting number by $N_1$. We then apply Corollary 3.4 to $N_1$, setting $\Delta_1 = \lceil N_1^{1/3} \rceil$. We proceed in this way until no more divisors are found. The number of factors of $N$ obtained in this manner is bounded by $O(\log N)$. Since $\Delta_i \le \lceil N^{1/3} \rceil$ for every $i \in \mathbb{N}_0$, the bit-complexity of this procedure is at most
\[ O\!\left(M_{\mathrm{int}}\!\left(N^{1/6} \log N\right) \log^2 N\right). \]
Compared to the complexity bound stated in Theorem 1.1, this is asymptotically negligible. Moreover, note that the factors of $N$ obtained in this preparation step are bounded by $O(N^{1/3})$. Hence, we may compute their complete prime factorizations in negligible time by further applications of Corollary 3.4.
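The control flow of this preparation step can be sketched as follows. As a loudly stated simplification, plain trial division stands in for Corollary 3.4 here, so the sketch illustrates only the repeated stripping of divisors up to $\lceil N_i^{1/3} \rceil$ and not the claimed complexity; the names `ceil_cuberoot` and `strip_small_divisors` are hypothetical.

```python
def ceil_cuberoot(n):
    """Smallest integer r with r**3 >= n (float guess, integer correction)."""
    r = round(n ** (1.0 / 3.0))
    while r ** 3 < n:
        r += 1
    while r > 0 and (r - 1) ** 3 >= n:
        r -= 1
    return r


def strip_small_divisors(N):
    """Repeatedly remove a divisor d <= ceil(N_i^(1/3)) from the current N_i.

    Trial division replaces Corollary 3.4 in this sketch. Returns the list
    of removed divisors and the remaining cofactor, which has at most two
    prime factors because it has no divisor up to its cube root.
    """
    removed = []
    while True:
        bound = ceil_cuberoot(N)
        d = next((t for t in range(2, bound + 1) if N % t == 0), None)
        if d is None:
            return removed, N
        removed.append(d)
        N //= d
```

For instance, for $N = 2310 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11$ the loop removes $2$, $3$ and $5$ and stops at the cofactor $77 = 7 \cdot 11$, which has no divisor up to $\lceil 77^{1/3} \rceil = 5$.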
Let $N_k$ be the number which remains after these computations. Since no factor smaller than or equal to $\lceil N_k^{1/3} \rceil$ has been found by applying Corollary 3.4, $N_k$ does not have more than two prime factors. We may check easily if $N_k$ is a square number; if not, $N_k$ has to be a prime or a semiprime number. For the remainder of the proof, we may hence suppose that $N$ is either prime or $N = pq$ such that $p$, $q$ are distinct primes. We now run the following procedure.

Algorithm 6.1.
Input: A prime or semiprime $N$, $\xi := \lceil N^{1/9}/\log^{2/3} N \rceil$ and $\eta := \lceil N^{2/9}/\log^{1/3} N \rceil$.
Output: The prime factorization of $N$.
1: Apply Corollary 3.4 with $\Delta = \lceil (N/(\eta + 1))^{1/2} \rceil$. If a factor is found, return and stop.
2: Apply Theorem 2.2 with $\delta = \lceil N^{2/5} \rceil$. If a factor of $N$ is found or $N$ is proved to be prime, return and stop; otherwise, $\alpha \in \mathbb{Z}_N^*$ is found such that $\mathrm{ord}_N(\alpha) > \lceil N^{2/5} \rceil$.
3: Apply Algorithm 5.1 with $\xi$, $\eta$ and the element $\alpha$ found in Step 2. If no factor is found, return that $N$ is prime.

Note that, for sufficiently large inputs $N$, we have $\mathrm{ord}_N(\alpha) > N^{2/5} \gg N^{1/2}/(\xi^{1/2} \eta)$ and the parameters $\xi$ and $\eta$ satisfy the assumptions of Algorithm 5.1. In Step 1, we make sure that $p > (N/(\eta + 1))^{1/2}$ holds. As a result, Lemma 5.2 implies that the procedure above finds the factors of a semiprime number. Therefore, if we do not find any proper factors, $N$ must be prime. We conclude that the algorithm is correct and are left with the final task to analyze the runtime complexity.

The runtime of Step 1 is bounded by $O(M_{\mathrm{int}}(\Delta^{1/2} \log N) \log N) \subseteq \widetilde{O}(N^{7/36})$, which is asymptotically negligible. The same is true for Step 2, which runs in time
\[ O\!\left(\frac{N^{1/5} \log^2 N}{\sqrt{\log\log N}}\right). \]
We now show that our choice of $\xi$ and $\eta$ optimizes the runtime complexity of Step 3. Considering the three summands in Lemma 5.2, one easily observes that the best possible choice for $\xi$ and $\eta$ is obtained by solving
\[ \eta \log \eta = \frac{N^{1/2}}{\xi^{1/2} \eta} = \frac{N^{1/4} \xi^{3/4} \log \xi}{\eta^{1/2}} \tag{6.1} \]
under $O$-notation.
Assuming that both $\log \eta$ and $\log \xi$ are in $O(\log N)$, we solve $\eta^2 = N^{1/2}/(\xi^{1/2} \log N)$ and $\eta^{3/2} = N^{1/4} \xi^{3/4}$ for $\xi$ and $\eta$. It is easy to check that our choice of these parameters is a solution to these equations, and that the three terms in (6.1) are all equal to $N^{2/9} \log^{2/3} N$ under $O$-notation. As a result, the overall bit-complexity of Step 3 may be bounded by
\[ O\!\left(M_{\mathrm{int}}\!\left(N^{2/9} \log^{5/3} N\right) \log N\right), \]
and the claim follows.

Acknowledgment

I want to thank an anonymous referee for the helpful and most valuable suggestions in the report on the first version of this paper.

References

[BGS07] A. Bostan, P. Gaudry, É. Schost, Linear recurrences with polynomial coefficients and application to integer factorization and Cartier-Manin operator, SIAM J. Comput., 36(6): 1777–1806, 2007.
[CH14] E. Costa, D. Harvey, Faster deterministic integer factorization, Math. Comp., 83: 339–345, 2014.
[GG03] J. Gerhard, J. von zur Gathen, Modern Computer Algebra, Second Edition, Cambridge University Press, 2003.
[HH19] D. Harvey, J. van der Hoeven, Integer multiplication in time O(n log n), hal-02070778, 2019.
[Hit18] M. Hittmeir, A babystep-giantstep method for faster deterministic integer factorization, Math. Comp., 87(314): 2915–2935, 2018.
[Law95] F.W. Lawrence, Factorisation of numbers, Messenger of Math., 24: 100–109, 1895.
[Leh74] R.S. Lehman, Factoring Large Integers, Math. Comp., 28(126): 637–646, 1974.
[Len00] A. K. Lenstra, Integer Factoring, Designs, Codes and Cryptography, 19: 101–128, 2000.
[Pap94] C. H. Papadimitriou, Computational complexity, Addison-Wesley Publishing Company, Reading, MA, 1994.
[Rie94] H. Riesel, Prime Numbers and Computer Methods for Factorization, Progress in Mathematics (Volume 126), Second Edition, Birkhäuser Boston, 1994.
[SW11] R. Sedgewick, K. Wayne, Algorithms, Princeton University, Fourth Edition, Addison-Wesley, 2011.
[Sho05] V. Shoup, A Computational Introduction to Number Theory and Algebra, Cambridge University Press, 2005.
[Str77] V. Strassen, Einige Resultate über Berechnungskomplexität, Jber. Deutsch. Math.-Verein., 78(1): 1–8, 1976/77.
[Wag13] S.S. Wagstaff Jr., The Joy of Factoring, American Math. Society, Providence, RI, 2013.

Current address: SBA Research, Floragasse 7, A-1040 Vienna
Email address: [email protected]