AAAI Proceedings Template

Reverse Factorization and Comparison of Factorization Algorithms in attack to RSA

Sadi Evren SEKER
Dept. of Business Administration, Istanbul Medeniyet University
[email protected]

Cihan MERT
Electrical Engineering Dept., The University of Texas at Dallas
[email protected]

ABSTRACT
Factorization algorithms play a major role in computer security and cryptography. Most of the widely used cryptographic algorithms, like RSA, are built on the mathematical difficulty of factoring the product of large prime numbers. This research proposes a new approach to factorization that uses two new enhancements. The new approach is compared with six different factorization algorithms, and its performance is evaluated in a big data environment. The algorithms covered are the elliptic curve method, quadratic sieve, Fermat's method, trial division and Pollard rho methods. Success rates are compared over a million integers of differing difficulty. We have implemented our own algorithm for random number generation, which is also explained in the paper. We also show empirically that the new approach has an advantage in the factorization attack on RSA.

Keywords
Factorization, Cryptography, Benchmarking

Acknowledgement
The work of Sadi Evren SEKER is supported by Istanbul University, research projects department, under project number YADOP-27254.

1. INTRODUCTION
This study can be viewed as three major steps. In the first step, we generate big integers with a new approach to generation. After the generation, the factorization algorithms, including the new approach, are executed. Finally, in the last step, the performance of the algorithms is evaluated.

In this paper, the problem is defined and an overview of it is given in the problem statement section. The related work section gives a brief literature review of contemporary studies on factorization algorithms. The background chapter briefly describes the factorization algorithms we have implemented in this study. The experiments section details the large set of generated integers, their properties, and the evaluation of the algorithms.

2. PROBLEM STATEMENT
A stepwise view of the study is given in Figure 1.

Figure 1. Overview of Study

In order to simulate the RSA factorization problem, we concentrate only on semi-prime numbers. The random generator is designed to generate semi-prime numbers. To make the time performance more explicit, we have generated a huge number of semi-prime numbers and stored them in a database. After storing, the factorization algorithms are executed on those numbers. Finally, each algorithm is evaluated on its time performance.

3. BACKGROUND
The factorization of composite numbers has been an interesting area of study since early times, and some algorithms, like the Sieve of Eratosthenes (276–194 BC), date back to antiquity.

With the spread of modern cryptographic systems, some of which, like RSA [1], are built on the difficulty of factoring, the factorization problem has remained an active area of study.

Factoring initially meant dividing a number by larger and larger primes until the factorization was found. This trial division was not improved until Fermat's method, which uses the factorization of a difference of two squares. While Fermat's method can be much faster than trial division, in the real world of factoring, for example for an RSA modulus several hundred digits long, the purely iterative Fermat's method is too slow. This led to the development of several other methods, such as a pair of probabilistic methods by Pollard in the mid 70's, the p − 1 method and the ρ method, and the Elliptic Curve Method discovered by H. Lenstra in 1987. However, the fastest algorithms, such as the Number Field Sieve (and its variants), the Quadratic Sieve (and its variants), and the Continued Fraction Method, use the same difference-of-squares trick as Fermat. The remainder of this paper briefly discusses some of the above methods and focuses on the reverse factorization method, a new approach.

3.1. Factorization by Trial Division
Trial division is a brute-force method of finding a divisor of an integer $N$ by simply testing whether $N$ is divisible by 2, 3, 5, 7, 11, 13, 17, …, i.e., all primes less than or equal to $\sqrt{N}$ in succession, until a divisor is reached.

Trial division is an effective and simple method to partially or completely factor $N$. It is a reasonable factoring method when $N$ is not too large.

3.2. Fermat Factorization
Fermat's factorization method [2] looks for a representation of an odd integer $N$ as the difference of two squares, $N = a^2 - b^2$. Then
$N = (a - b)(a + b)$
and $N$ is factored.

To factor a number $N$, first calculate $\sqrt{N}$. Then compute $a^2 - N$, starting with $a$ as the smallest integer not less than $\sqrt{N}$, and continue with increasing $a$ until the result is a square $b^2$. Since $a^2 - N = b^2$, we have $N = a^2 - b^2$, so $N$ is factorized into $N = (a - b)(a + b)$. If the only factors found are $N$ and 1, then $N$ is a prime number. If $N$ is not prime, use the same algorithm for each factor.

Fermat's method works well when the number factors into two terms of approximately equal size. It works poorly when the factors are of very different sizes.

3.3. Quadratic Sieve
To factorize a number $n$, the quadratic sieve method [3] attempts to find two numbers $x$ and $y$ such that $x \not\equiv \pm y \pmod{n}$ and $x^2 \equiv y^2 \pmod{n}$. If two such numbers are found, this implies that $(x - y)(x + y) \equiv 0 \pmod{n}$. Then $x - y$ must have a non-trivial factor in common with $n$. A common strategy for finding such $x$ and $y$ is the following. Choose a smoothness bound $B$. The number $\pi(B)$, which denotes the number of prime numbers less than $B$, controls both the number of vectors needed and the length of the vectors. Then use sieving to locate $\pi(B) + 1$ numbers $x_i$ such that $y_i \equiv x_i^2 \pmod{n}$ is $B$-smooth. Factor the $y_i$ and generate exponent vectors mod 2 for each one. Find a subset of these vectors which add to the zero vector. Multiply the corresponding $x_i$ together, naming the result mod $n$ as $x$, and multiply the $y_i$ together, which yields a $B$-smooth square $y^2$. The resulting equality $x^2 \equiv y^2 \pmod{n}$ gives two square roots of $x^2 \bmod n$: one is $y$, obtained by taking the square root of $y^2$ in the integers, and the other is the $x$ computed in the previous step. Having the desired identity $(x - y)(x + y) \equiv 0 \pmod{n}$, compute $\gcd(x - y, n)$. This gives a factor. If the factor is trivial, try again with a different linear dependency or a different $x$.

3.4. Pollard Rho
Pollard's rho method [4] combines two ideas, Floyd's cycle-finding algorithm and the birthday paradox, that are also useful for various other factoring methods.

Let $N$ be a number that is neither a perfect power nor a prime, and let $p$ be the smallest prime factor of $N$. Generate a sequence of numbers $x_0, x_1, x_2, \ldots$ from $\mathbb{Z}_N$ uniformly and independently at random. After at most $p + 1$ such pickings there are, for the first time, two numbers $x_i$ and $x_s$ with $i < s$ such that $x_i \equiv x_s \pmod{p}$. Since $N$ is not a perfect power, there is another prime factor $q > p$ of $N$. Since the numbers $x_i$ and $x_s$ are randomly chosen from $\mathbb{Z}_N$, by the Chinese remainder theorem, $x_i \not\equiv x_s \pmod{q}$ with probability $1 - 1/q$, even under the condition that $x_i \equiv x_s \pmod{p}$. Therefore, $\gcd(x_i - x_s, N)$ is a nontrivial factor of $N$ with probability at least $1 - 1/q$.

Since the $x_i \bmod p$ behave more or less as random integers in $\{0, 1, \ldots, p - 1\}$, computing $\gcd(x_i - x_j, N)$ for $i \neq j$ yields the factorization of $N$ after about $c\sqrt{p}$ elements of the sequence, for some small constant $c$.

This suggests that approximately $(c\sqrt{p})^2/2$ pairs $x_i, x_j$ have to be considered. However, this can easily be avoided by only computing $\gcd(x_i - x_{2i}, N)$ for $i = 0, 1, \ldots$, i.e., by generating two copies of the sequence, one at the regular speed and one at the double speed. This can be expected to result in a factorization of $N$ after approximately $2\sqrt{p}$ gcd computations. If this GCD ever comes to $N$, the algorithm terminates with failure, since this means $x_i = x_{2i}$ and therefore, by Floyd's cycle-finding algorithm, the sequence has cycled and continuing any further would only repeat previous work.

4. SEMI-PRIME FACTORIZATION IN RSA
This study focuses on fast and efficient factorization of semi-prime numbers. A semi-prime number is the product of two prime numbers, say $p$ and $q$. For this reason, some sources also call semi-prime numbers pq numbers.

The advantage in factorizing the semi-prime numbers of the RSA cryptosystem is that the two prime factors should have an equal, or almost equal, number of digits. The reason is, if the number of digits of one prime of the semi-prime [...]

Where the number of prime factors of $cn$ is considered to be $m + 1$.

For the given $cn$, equation (2) can be concluded.

$$(n \in N \wedge n \mid f_i) \Leftrightarrow n_i \mid \left\{ \mathbb{Z} \mid n = \left( \bigcap_{i=1}^{m} \mathbb{Z} \mid f_i \right) \right\} \qquad (2)$$

Where $N$ is the domain set of search for the prime numbers, which are the numbers from 2 to $\sqrt{cn}$.
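To make three of the background methods from Section 3 concrete, the following is a minimal Python sketch of trial division, Fermat's method, and Pollard's rho with Floyd's two-speed cycle detection, applied to a small semi-prime. It is an illustration under stated assumptions and not the implementation evaluated in this study: the function names are hypothetical, trial division tests every odd candidate rather than only primes, and the rho sequence uses the common map x → x² + c (mod n) with assumed defaults x0 = 2 and c = 1 in place of truly random picks.

```python
from math import gcd, isqrt


def trial_division(n):
    """Return the smallest non-trivial divisor of n, or n itself if n is prime."""
    if n % 2 == 0:
        return 2
    # Simplification (assumption): try every odd candidate up to sqrt(n)
    # instead of only the primes.
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return d
    return n


def fermat(n):
    """Return (a - b, a + b) with n = a*a - b*b. Intended for odd n."""
    a = isqrt(n)
    if a * a < n:
        a += 1  # smallest a with a*a >= n
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:  # a*a - n is a perfect square
            return a - b, a + b
        a += 1


def pollard_rho(n, x0=2, c=1):
    """Return a non-trivial factor of n, or None when the gcd reaches n."""
    def step(x):
        # Assumed pseudo-random map; the text describes random picks from Z_N.
        return (x * x + c) % n

    x = y = x0
    while True:
        x = step(x)        # sequence at regular speed: x_i
        y = step(step(y))  # sequence at double speed: x_2i
        d = gcd(abs(x - y), n)
        if d == n:
            return None    # sequence has cycled; retry with other x0 or c
        if d > 1:
            return d


n = 8051  # 83 * 97, a small semi-prime chosen for illustration
print(trial_division(n), fermat(n), pollard_rho(n))  # 83 (83, 97) 97
```

Because 83 and 97 are close to each other, Fermat's method succeeds on its first candidate (a = 90, since 90² − 8051 = 49 = 7²), which mirrors the observation that semi-primes with factors of nearly equal size are the favourable case for that method.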
