CS 6550 Design and Analysis of Algorithms, Section A, Lecture #8
Quadratic Sieve and Extensions
Instructor: Richard Peng
Feb 15, 2021

DISCLAIMER: These notes are not necessarily an accurate representation of what I said during the class. They are mostly what I intend to say, and have not been carefully edited.

These notes are directly adapted from Eric Bach's course notes on arithmetic algorithms, http://pages.cs.wisc.edu/~cs812-1/, specifically Lectures 21-23.

Recall that given some n = pq, we want to find x ≢ ±y (mod n) such that

    x^2 ≡ y^2 (mod n).

We will further develop the strategy from Dixon's algorithm [Dix81].

1. GENERATE:

    a_1^2 ≡ p_1^{e_{11}} * ... * p_s^{e_{1s}} (mod n)    (1)
    a_2^2 ≡ p_1^{e_{21}} * ... * p_s^{e_{2s}} (mod n)    (2)
        ...                                              (3)
    a_r^2 ≡ p_1^{e_{r1}} * ... * p_s^{e_{rs}} (mod n)    (4)

2. COMBINE: Select a product of the a_i that makes all the exponents of the p_i on the resulting right-hand side even. Call these numbers a and b, so that a^2 ≡ b^2 (mod n).

3. SPLIT: Compute gcd(a ± b, n), and show (via backwards analysis from the random source of a and b) that this is a nontrivial divisor of n.

1 Quadratic Sieve

Dixon's algorithm chooses the a_i to be random elements of Z_n^*. The quadratic sieve [Pom84] uses a different GENERATE step that both makes the generated numbers easier to factorize, and also keeps them smaller (compared to n). Let

    m ≈ √n,                  (5)
    f(x) = (x + m)^2 - n.    (6)

This seems more or less like generating the entire list of random numbers via some random hash function, and then using sieve-like methods to speed up the factorization process for them. By expanding out the square (and using m^2 ≤ n), we get

    f(x) ≤ x^2 + 2xm         (7)
         = O(x(x + 2m))      (8)

for any x > 0. Such an algorithm has two major advantages:

1. The residues are smaller, hence more likely to be smooth. We will ensure x = n^{o(1)}, which means f(x) ≤ n^{1/2 + o(1)}.

2. The residues are values of a polynomial, which we can factor using a sieve. Observe that for any prime p, p | f(x) exactly when (x + m)^2 ≡ n (mod p), so the x with p | f(x) fall into at most two residue classes modulo p.
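To make this concrete, here is a toy sketch (with a small made-up n; this is illustrative, not the lecture's code) that brute-forces, for each small prime p, the residue classes of x modulo p at which p divides f(x) = (x + m)^2 - n:

```python
from math import isqrt

# Toy composite and the quadratic sieve polynomial f(x) = (x + m)^2 - n.
n = 2041            # = 13 * 157, a made-up small example
m = isqrt(n)        # m ~ sqrt(n), so f(x) stays small for small x

def f(x):
    return (x + m) ** 2 - n

def sieve_roots(p):
    """Brute-force the (at most two) classes x mod p with p | f(x)."""
    return [x for x in range(p) if f(x) % p == 0]

for p in [2, 3, 5, 7, 11, 13]:
    print(p, sieve_roots(p))
# e.g. p = 11 yields no classes (n is not a square mod 11),
# while p = 13 yields exactly one class (13 divides n).
```

A real implementation would of course find these roots by computing square roots of n mod p rather than by brute force.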
So we can generate all such divisors by `walking along' these residue classes, in the same way we do a sieve.

A fully rigorous analysis of the complexity of the quadratic sieve is not known. The heuristic justification for the bound is that the values of f(x) factor like randomly chosen numbers of the same size. Suppose we check all x ∈ [1, U], and we only keep the ones that are y-smooth. The cost of the sieve is then about

    Σ_{2 ≤ p ≤ y, p prime} U/p ≈ U · log log y.

Then recall from last time that the probability of a number up to m being m^{1/λ}-smooth is

    Pr[x ≤ m is m^{1/λ}-smooth] ≈ λ^{-λ}.

In this case, substituting in m ≈ n^{1/2} gives

    λ = log n / (2 log y).

So to get about y fully factored quadratic residues we should take U ≈ y · λ^λ. Since y = n^{1/(2λ)}, we get a total cost for the factoring algorithm of

    T = n^{1/(2λ)} · λ^λ + n^{3/(2λ)},

after taking the time for Gaussian elimination into account. We get a "good" value for λ by setting these two terms equal and solving for λ. The analysis proceeds similarly to the Dixon algorithm, and with L = log n we get

    T = exp((√(9/8) + o(1)) · √(L log L)).

The constant √(9/8) = 1.060660... in the exponent is significantly smaller than that of Dixon's algorithm. Note that this is a constant in the exponent: it more or less square-roots the runtime.

2 Improvements and Sanity Checks

The quadratic sieve is a practical factoring algorithm, and was the workhorse for large number factorization during the 80's and early 90's. To make it run even faster:

1. Instead of factoring the f(x), put log f(x) into the array and subtract log p instead of dividing by p. This replaces division (an expensive step) by a single-precision operation.

2. Use multiple polynomials [Sil87].

3. Use sparse matrix techniques on the linear equations, instead of Gaussian elimination [Mon95].

4. Parallelize. If you have many processors, give each one the task of sieving a block of values of f(x). Alternatively, give each processor its own polynomial to work on. The master processor does the combining step at the end, which (in practice) is faster than sieving.

5.
Use higher degree polynomials: this will be the focus of the rest of our discussion. Here we largely follow the presentation from [Pom96].

On the other hand, note that the theoretical assumption of f(x) behaving like random numbers is a very major one. It's mapping a rather narrow band of numbers, x = 1 ... exp(L^{1/2}), onto another rather narrow band of numbers around m ≈ √n. What we need is that when we multiply these numbers further (using the computed exponent vector), and take their residue modulo n, the roots don't collide with good probability. The only high-level intuition I have for this is that square roots are essentially random: for any `low description' sets A, B, the probability of a number in A squaring to a number in B is roughly |A||B|/n, at least once the wrap-around effect kicks in. There seems to be a lot of recent progress in math on how such sets interact, but I'm not aware of work in theoretical algorithms on this topic this century. Any pointers would be helpful: this issue is going to show up again in the next part as well, when we go to fancier f(x) functions.

3 Using Higher Degree Polynomials

We now work back in asymptotics, and ignore constants in the exponent. The QS can be viewed as generating square roots via the mapping

    f(x) = x^2 + 2mx + m^2 - n.

As long as this number is larger than n, we are able to `randomize' the pre-image of the square.

To use higher degree polynomials, the idea is to go to a degree d polynomial, and obtain these large values through the product of a number of smaller polynomials. That is, we treat m as a special symbol, and piece together the higher degree polynomial via a product

    Π_i (a_i + b_i m).

The issue is that we still need to do a modulo n. For that, it's useful to define fields/rings augmented with algebraic integers. That is, we pick a polynomial f such that

    n = f(m),

or equivalently, let θ denote a root of the polynomial obtained from the base-m representation of n.
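The phrase `the polynomial obtained from the base-m representation of n' can be sketched directly (toy parameters only; real number field sieve implementations choose m and d much more carefully):

```python
def base_m_poly(n, m):
    """Base-m digits of n, i.e. coefficients [c_0, ..., c_d] of a
    polynomial f with f(m) = sum(c_i * m^i) = n."""
    coeffs = []
    while n > 0:
        coeffs.append(n % m)
        n //= m
    return coeffs

n = 2 ** 31 - 1      # toy n
m = 1290             # roughly n^(1/3), so f has degree 3
c = base_m_poly(n, m)
# here c == [7, 616, 0, 1], i.e. f(x) = x^3 + 616*x + 7 and f(1290) = n
assert sum(ci * m ** i for i, ci in enumerate(c)) == n
```

Declaring θ to be a root of this f lets us substitute m for θ whenever we work modulo n, since f(m) ≡ 0 (mod n).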
For n with L digits, we will pick parameters so that d = L^{1/3}, and thus log m ≈ L^{2/3}. We will also pick the smoothness threshold y to be exp(L^{1/3}). The main issue is how to ensure that

    Π_i (a_i + b_i θ)

is a square over Z[θ]. In this lecture, we make a special assumption, which leads to the special number field sieve: that Z[θ] is a unique factorization domain. In this case, we can factor each a_i + b_i θ into a product of primes of Z[θ]. Let the min-poly of θ be

    x^d + c_{d-1} x^{d-1} + ... + c_0.

The norm of a + bθ is defined as

    N(a + bθ) := a^d - c_{d-1} a^{d-1} b + c_{d-2} a^{d-2} b^2 - ... + (-1)^d c_0 b^d.

It can be shown that norms are multiplicative. Furthermore, the key property over the (assumed) unique factorization domain is:

Lemma 3.1. If Z[θ] is a unique factorization domain, then a + bθ factors into primes of Z[θ] whose norms give exactly the prime factorization of N(a + bθ).

Note that if a_i, b_i are picked to have L^{1/3} digits, the norm has at most L^{2/3} digits. So by the `sqrt' rule, both a_i + b_i m and N(a_i + b_i θ) are y-smooth (for log y ≈ L^{1/3}) with probability exp(-L^{1/3}). So as long as one tries more than exp(2L^{1/3}) pairs, we get more than 10y ones where both are y-smooth. Solving equations on the exponents modulo 2, we are able to get a subset S such that

    Π_{i ∈ S} (a_i + b_i m)    and    Π_{i ∈ S} (a_i + b_i θ)

are squares in Z and Z[θ] respectively. Evaluating the latter under the mapping θ → m then gives two numbers whose squares match modulo n. This is roughly how an exp(L^{1/3})-type runtime arises.

The general issue is that Z[θ] cannot be expected to be a unique factorization domain in general. We will also discuss how to interpret these norms next time.

4 More on Norms

The formal definition of norms is based on embeddings of Q[θ] into C. Such embeddings are determined by what θ gets mapped to: if σ : Q[θ] → C is an embedding, then the requirement f(σ(θ)) = 0 means σ(θ) must be mapped to a root of f.
As polynomials factorize completely over the complex numbers, we can factor f into its roots

    f(x) = Π_{i=1}^{d} (x - θ_i)

and define the embeddings σ_1, ..., σ_d via σ_i(θ) = θ_i. Note that this implies that for any α ∈ Q[θ], which is really a polynomial α(θ) = Σ_{i=0}^{d-1} α_i θ^i, we have

    σ_j(α) = α(θ_j).

Then formally, the norm of some α ∈ Q[θ] is defined as

    N(α) := Π_{1 ≤ i ≤ d} σ_i(α).

For the discussion above, we needed:

1. Norm is multiplicative: N(a) · N(b) = N(ab).
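As a sanity check on both the embedding definition and the alternating-sign formula from the previous section, here is a small numerical experiment (a toy choice of min-poly x^3 - 2, i.e. θ = 2^{1/3}; floating-point roots, so comparisons use a tolerance):

```python
import cmath

# The three complex roots of the min-poly x^3 - 2: these define sigma_1..sigma_3.
r = 2 ** (1 / 3)
roots = [r * cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

def embed(poly, t):
    """sigma(alpha) for alpha = a_0 + a_1*theta + a_2*theta^2, sending theta -> t."""
    return sum(a * t ** i for i, a in enumerate(poly))

def norm_embed(poly):
    """N(alpha) = product of sigma_i(alpha) over all embeddings."""
    prod = 1
    for t in roots:
        prod *= embed(poly, t)
    return prod

def norm_formula(a, b):
    """The explicit formula for N(a + b*theta): here d = 3, c_2 = c_1 = 0,
    c_0 = -2, so it reads a^3 + 2*b^3."""
    return a ** 3 + 2 * b ** 3

# The two definitions agree: N(3 + theta) = 29.
assert abs(norm_embed([3, 1, 0]) - norm_formula(3, 1)) < 1e-6

# Multiplicativity: (1 + theta)(2 + theta) = 2 + 3*theta + theta^2,
# and N(1 + theta) * N(2 + theta) = 3 * 10 = 30 = N(2 + 3*theta + theta^2).
assert abs(norm_embed([2, 3, 1]) - norm_formula(1, 1) * norm_formula(2, 1)) < 1e-6
```

The products over embeddings come out (numerically) real, as they must, since they are values of integer polynomials in the coefficients.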
