Integer Factorization - an Investigation of Methods and Implementation

Integer Factorization - An Investigation of Methods and Implementation Josh Boone Southern Illinois University at Carbondale Carbondale, IL 62901 April 24, 2007 Abstract Integer factorization is the breaking down of a composite integer into its prime factors. This unique factorization is then used to ana- lyze the number, or in the case of most cryptographical applications, breaking the cryptoscheme. There are many methods of factorization, but we will focus on those based off of Fermat's Factorization Method. We will give a proof of correctness of this method, as well as an example. We will discuss Dixon's Factorization Method at length, with an example given to show how it works. Finally, we will give some insight on the Quadratic Sieve method, a factorization algorithm that uses quadratic congruences to reduce the amount of time needed to factor an integer. 1 Introduction Integer factorization has been a topic of study since the beginning of number theory. The French mathematicial Pierre de Fermat (1601-1665) is credited with one of the earliest algorithms, aptly named Fermat's Factorization Method. This method is based on a congruence of squares, which is the backbone of many factorization methods. We will prove the correctness and show the strength and weaknesses of this algorithm, and discuss the extensions of Fermat's Method that are more efficient: Dixon's Factorization Method and The Quadratic Sieve Method. Before we begin discussion of this and the other methods, we need some results and definitions from elementary number theory. 1 2 Some Number Theory All of our methods will rely upon the following important theorem, without which factorization would be unimportant. Theorem 2.1 (Fundamental Theorem of Arithmetic). Every integer greater than 1 can be written as a unique product of prime numbers. This rather intuitive result was first proven by Euclid in a more limited form, but was first proven completely by Carl Friedrich Gauss at the age of 21. Here is a simple proof. proof: Factorization: Assume that there exists a number that is not a product of primes. By the well-ordering principal, there must be a smallest integer that has this property. Call it n. It must be the case that n > 1, and that n is composite (since any prime is obviously a product of primes). Then n = ab, where a and b are positive integers less than n. It follows that a and b must be a product of primes. So, n = ab must be a product of primes, a contradiction. Uniqueness: Assume we have two factorizations, n = p1p2 : : : ps = q1q2 : : : qt, where pi and qi are primes and, WLOG, s ≤ t. Also WLOG, we can assume that the primes are written in increasing order, i.e. p1 ≤ p2 ≤ ::: ≤ ps, and q1 ≤ q2 ≤ ::: ≤ qt. We have p1jq1q2 : : : qt, so p1 = qk for some k )p1 ≥ q1. Similarly, q1jp1p2 : : : ps, so q1 ≥ p1. So p1 = q1. Continuing this algorithm, we end up with 1 = qs+1qs+2 : : : qt. Hence each q = 1. So we have s = t and pi = qi for each i. So, the factorizations are identical. We will also need some definitions to discuss our algorithms. Definition 2.2 (Congruence of Squares). Two integers x and y, x 6≡ ±y satisfy a congruence of squares modulo n if: x2 ≡ y2(mod n) for some positive integer n. Notice that this congruence implies x2 − y2 ≡ 0(mod n), i.e. (x − y)(x + y) ≡ 0(mod n). 2 Definition 2.3 (Quadratic Residue). Let a, m be positive integers. We say a is quadratic residue of m if gcd(a; m) = 1 and x2 ≡ a(mod n) has a solution. This term will come up when we discuss the Quadratic Sieve Method. 3 The Most Basic Algorithm: Trial Division Now that we know a little background material, we can discuss some factorization algorithms. We will begin with the most basic of algorithms, trial division. This algorithm should be familiar, since nearly every student has used it to factor an integer in an algebra class. Algorithm 3.1 (Trial Division). To factor an integer n, p 1. For p odd from 2 to n, if p divides n, p is a factor of n. 2. For each p dividing n, while pjjn; j 2 Z+, the multiplicity of the factor p equals j. If n has t factors, we get the factorization n = pj1 pj2 : : : pjt . However, p 1 2 t n this factorization takes, on average, 2 steps.[1] So, if n has two factors of similar size (like most cryptographic schemes) this algorithm is certainly computationally infeasible for large n. 4 Fermat's Factorization Method Now, our first algorithm involving the congruence of squares. Algorithm 4.1 (Fermat's Factorization Method). INPUT: An odd composite integer n. OUTPUT: Two integers a, b such that n = ab. p 1. r d( n)e s0 r2 − n 2. While s0 is not a perfect square: r r + 1 s0 r2 − n 3 3. If r = n+1 , return 'error: n is prime' 2 p p Otherwise, return a = r − s0; b = r + s0 Example 4.2. Factor n = 6077 using Fermat's Method p r = d ne = 78 782 − 6077 = 7 792 − 6077 = 164 802 − 6077 = 323 812 − 6077 = 484 Since 484 = 222, we see that: 6077 = 812 − 222 Hence, 6077 = (81 − 22)(81 + 22) = 59 ∗ 103 Of course, this method does not always give the full factorization, just two odd integers that divide n. The idea is that this information will lead to an easy analysis of n (or a complete factorization if n is simply a product of two primes). This seems like it would be a perfect algorithm for breaking the RSA modulus, so why is this not the end of our discussion? 5 Efficiency of Fermat's Method p Notice that Fermat's method is very fast if n = pq, where p ≈ q ≈ n. Be- cause of this fact, RSA primes are chosen carefully to not have this property, just as they are chosen to not have very small factors (which are easy to find with trial division). For this reason, Fermat's method is not as efficient for breaking cryptoschemes as it looks at first glance. In fact, as the distance p between p and n increases, the running time increases faster than expo- nentially.[2] So, with just a tiny bit of foresight, it is easy to design an RSA modulus that makes Fermat's method computationally infeasible. However, Fermat's Method is a very important topic of study, since it was the first factoring method to use the congruence of squares as a basis. We will see that two important extensions of Fermat's method are still in use today, one of which is the premier factoring algorithm for numbers with less than 115 decimal digits. For now, let us study Fermat's method in its entirety. 4 6 Correctness of Fermat's Method Now we will show that Fermat's method does indeed always find a factor of n. Say n = ab. We want to show that we will always find those factors, i.e. that either a or b is in the range of our iteration. The proof will also explain n+1 why n is prime if r reaches 2 . Theorem 6.1 (Correctness of Fermat's Method). For any odd composite integer n, Algorithm 4.1 will always find a divisor of n. proof: Let n = ab. Then, 1 1 2 2 a+b 2 a−b 2 n = ab = 4 (2ab + 2ab) = 4 ((a + b) − (a − b) ) = ( 2 ) − ( 2 ) a+b a−b 2 2 So, if we let r = 2 and s = 2 , we see that n = r − s Note: r and s are integers, because n odd ) a; b also odd. p We will now show that r is in the range of the iteration, i.e. d ne ≤ r < n+1 : p 2 Assume that r < n. Then, p n = r2 − s2 < n2 − s2 = n − s2 ) s2 < 0, an obvious contradiction. n+1 Assume that r ≥ 2 . Then, 2 2 n+1 2 2 n = r − s ≥ ( 2 ) − s 2 n+1 2 n2 n 1 n2 n 1 n−1 2 ) s ≥ ( 2 ) − n = ( 4 + 2 + 4 ) − n = 4 − 2 + 4 = ( 2 ) n−1 ) s ≥ 2 n+1 n−1 So, r + s ≥ 2 + 2 = n But we know that n = r2 − s2 = (r + s)(r − s) So it must be that r + s = n, r − s = 1. ) n is prime, another contradiction. So it must be that the value r can always be found by Algorithm 4.1, hence the factors a and b are always found. 7 Extension 1 of Fermat's Method - Dixon's Method Fermat's method is academically interesting, even if it has limited applica- tion, because it has extensions that are very useful for factoring integers. John D. Dixon of Carlten University, Ontario, devised this method in 1981 that is an extension of Fermat's method.[3] Dixon's method uses congruences of squares to find a divisor of n. It also uses Gaussian elimination to 5 solve the resulting matrix. We also must prepare a table of primes before we begin, the size of which we will discuss later. Algorithm 7.1 (Dixon's Factorization Method). INPUT: A composite integer n to be factored, a set fSg, called the factor base[3] of all primes less than some integer S called the prime bound, an integer R called the relation bound. OUTPUT: An integer a such that ajn. 1.

Integer Factorization - an Investigation of Methods and Implementation

On the Number Field Sieve: Polynomial Selection and Smooth Elements in Number Fields

Integer Factoring

Prime Factorization and Cryptography a Theoretical Introduction to the General Number Field Sieve

Eindhoven University of Technology MASTER a Study of the General

New Pragmatic Algorithms to Improve Factoring of Large Numbers

Polynomial Selection for the Number Field Sieve

Integer Factorization with the General Number Field Sieve

Number Field Sieve with Provable Complexity

Implementation of the Quadratic Sieve