
A study of Factoring related to the RSA Cryptosystems

by

NORLIZA BINTI MOHAMED

Dissertation submitted in partial fulfillment of the requirements for the degree of Master of Science ()

May 2008

ACKNOWLEDGEMENT

All praise to Allah, for His blessing and grace in giving me enough strength to complete this dissertation. Every difficulty can be overcome with calm. Prayers and regards also go, with great reverence, to our Prophet Muhammad S.A.W.

My appreciation and high regard go especially to Dr. Hailiza Binti Kamarul Haili, who supervised me in completing this dissertation. Only God is able to repay all her sacrifice and noble effort. My appreciation also goes to all the lecturers of the School of Mathematical Sciences, Universiti Sains Malaysia, who taught me during this course.

Much affection goes to my beloved husband (Mohd Khir bin Ahmad) and mother (Merak Mas Binti Zakaria), who have given me their sincere love, attention, support and understanding, and who have helped me get through good and rough times. I would like to express my deepest love and appreciation to my beloved children (Muhammad Nasrullah, Nur Karmila, Nur Dalila, Nur Qistina and Ahmad Zubair) for their endless love.

Finally, to all involved, your sacrifice and assistance shall be remembered forever. What is good comes from Allah S.W.T. and the weaknesses are all mine. Wallahua'alam.

Thank you very much.

CONTENTS

Acknowledgement

Contents

List of tables

Abstrak

Abstract

CHAPTER 1 : INTRODUCTION

1.1 Introduction

1.2 Objectives

1.3 Dissertation outline

CHAPTER 2 : LITERATURE REVIEW ON RSA CRYPTOSYSTEMS

2.1 Introduction

2.2 History

2.3 RSA Cryptosystem

2.4 Implementation of RSA

2.5 The RSA Algorithm

CHAPTER 3 : PRIMALITY TEST

3.1 Naive methods

3.2 Probabilistic tests

3.3 Miller-Rabin methods

3.3.1 Mathematica Function for the Primality Test

3.4 Fermat Test

CHAPTER 4 : FACTORING ALGORITHMS

4.1

4.2 The Pollard p-1 Factoring Algorithm

4.2.1 Pseudocode: Pollard p-1 Factorization

4.3 The Pollard Rho Algorithm

4.3.1 Pseudocode: Pollard Rho Factorization

4.4 The Pollard p-1 and Pollard's Rho Findings

4.5 Number Field Sieve

4.5.1 General Number Field Sieve

4.5.2 Special Number Field Sieve

CHAPTER 5 : GENERAL NUMBER FIELD SIEVE IN RSA

5.1 Factorization Record using GNFS

5.2 RSA Number (Factoring Challenge)

5.3 Attacks on RSA

5.3.1 Cracking the RSA encryption system

CHAPTER 6 : CONCLUSION AND SUGGESTIONS FOR FURTHER WORKS

6.1 Conclusion

6.2 Suggestions for further work

References

Attachment

List of Tables

Table 4.4.1 : The Pollard p-1 Methods

Table 4.4.2 : The Pollard's Rho Methods

Table 4.5.2.1 : Factoring Using the Special Number Field Sieve

Table 4.5.1.2 : Factoring Using the General Number Field Sieve

Table 4.5.1.1 : Summary of the possible divisibility scenarios

Table 5.2.1 : The recorded RSA Numbers factored, with the algorithm used

Abstrak

A STUDY OF FACTORING ALGORITHMS AND THEIR RELATION TO THE RSA CRYPTOSYSTEM

Cryptography refers to the study of ways of sending messages or information secretly, so that only the recipient of the information can read what is conveyed. In 1977, Ronald Rivest, Adi Shamir and Leonard Adleman, using the initials of their surnames (RSA), created a new technology in the world of cryptography called the RSA Cryptosystem. This system is a public-key cryptosystem, which introduced both an encryption key and a decryption key. The effectiveness of the RSA Cryptosystem depends on the prime numbers used and also on how effective the available integer factoring algorithms are.

In this thesis we discuss several factoring algorithms and primality tests related to the RSA cryptosystem. Some examples are shown to illustrate the mathematical concepts used. We have also used MATHEMATICA programming to help understand primality testing and the factoring of composite numbers into primes. Several findings by researchers in integer factorization are shown and discussed. Possibly, in the future, a more efficient algorithm for factoring large integers will be found.

InsyaAllah.

Abstract

Cryptography is the study of methods for sending messages in disguised form so that only the intended recipient can remove the disguise and read the message. This is extremely useful today. In 1977, Ronald Rivest, Adi Shamir and Leonard Adleman publicly described the RSA Cryptosystem, named after the initials of their surnames. The RSA Cryptosystem is a public-key cryptosystem that offers both encryption and digital signatures, which makes it more secure. The security of the RSA cryptosystem relies very much on the length of the prime numbers used as well as on the effectiveness of the available integer factoring algorithms.

In this thesis, several factoring algorithms and primality tests related to the RSA cryptosystem are discussed. Some examples are given to indicate the underlying mathematical concepts used in the process. Some light programming using MATHEMATICA was also carried out for primality testing and for factoring composite numbers into primes. Some challenges and results on the latest developments in integer factorization are shown and discussed. It is possible that new factoring algorithms will be developed in the future which once again target primes with certain properties.

CHAPTER 1

INTRODUCTION

1.1 Introduction

The study of cryptography is the study of the science of writing in secret code, which is an ancient art. In 1977, Ronald Rivest, Adi Shamir and Leonard Adleman proposed a public-key cryptosystem that uses only elementary ideas from number theory (Johannes, 2000).

Their cryptosystem was the first real public-key cryptosystem capable of both encryption and digital signatures. The enciphering system is called RSA, after the initials of the algorithm's inventors. Its security depends on the assumption that, in the current state of computer technology, the factorization of composite numbers with large prime factors is prohibitively time-consuming. The RSA algorithm has become the foundation of an entire generation of public-key cryptography security products because it provides secure communications over distances between parties that have not previously met. Indeed, RSA has provided the ideal mechanism required for private communications over electronic networks. It forms the basis of almost all of the security products currently in use on the Internet for financial and other private communications, including most organizational-level Public Key Infrastructure systems.

RSA uses a variable-size encryption block and a variable-size key. The key pair is derived from a very large number, n, that is the product of two prime numbers chosen according to special rules. These primes may each be 100 or more digits in length, yielding an n with roughly twice as many digits as the prime factors. The public key information includes n and a derivative of one of the factors of n; an attacker cannot determine the prime factors of n (and, therefore, the private key) from this information alone, and that is what makes the RSA algorithm so secure. The ability of computers to factor large numbers, and therefore to attack schemes such as RSA, is rapidly improving. Systems today can even find the prime factors of numbers with more than 200 digits. Nevertheless, if a large number is created from two prime factors that are roughly of the same size, there is no known factorization algorithm that will solve the problem in a reasonable amount of time.

The security of the RSA public-key cryptography system is based on the computational intractability of factoring large integers. As a more modest application, hash table performance typically improves when the table size is a prime number. To get this benefit, an initialization routine must identify a prime near the desired table size. Finally, prime numbers are just interesting to play with. Although factoring and primality testing are related problems, algorithmically they are quite different. There are algorithms that can demonstrate that an integer is composite (i.e. not prime) without actually giving the factors.

Considerably faster factoring algorithms exist, whose correctness depends upon more substantial number theory. The fastest known algorithm, the general number field sieve, uses randomness to construct a system of congruences, the solution of which usually gives a factor of the integer.

1.2 Objectives

This dissertation studies the basics of the RSA Cryptosystem in cryptography. We also study a few factoring algorithms related to the RSA Cryptosystem.

1.3 Dissertation Outline

This dissertation has six chapters. The first chapter is the introduction, in which we set out the main problem that we are going to study.

In chapter two, we briefly discuss cryptography and its history. The RSA Cryptosystem is also discussed in this chapter, together with the RSA Algorithm and its steps. In chapter three we discuss primality tests, which are used to find out whether the input is a prime number or not. Through these primality tests, we can determine which primality test is used for the factoring algorithm.

In chapter four, we discuss integer factoring and factoring algorithms. Our discussion of factoring algorithms includes the Pollard p-1 algorithm, the Pollard Rho algorithm, the Number Field Sieve, the Special Number Field Sieve and the General Number Field Sieve. The chapter ends with a discussion of the fastest algorithm for factoring large integers.

Chapter five provides a detailed discussion of the fastest factoring algorithm of this decade, the application of this algorithm to the RSA Cryptosystem and its implementation. A brief discussion of the RSA Factoring Challenge and its records is included.

Finally, chapter six gives a conclusion on the factoring algorithms and a few suggestions for further work.

CHAPTER 2

LITERATURE REVIEW ON RSA CRYPTOSYSTEM

2.1 Introduction of Cryptography

People have always had a fascination with keeping information away from others, and have tried to keep information secret from adversaries. Governments, too, have used many methods to prevent the enemy from learning sensitive military information.

Today, the need for more sophisticated methods of protecting data has increased. The demand for information and electronic services is growing as the world becomes more connected, and we need to protect our data and electronic systems accordingly. The field of cryptography provides the techniques needed to protect data.

Cryptography is the study of techniques and applications that depend on the existence of difficult problems. Cryptanalysis is the study of how to compromise (defeat) cryptographic mechanisms, and cryptology (from the Greek kryptós lógos, meaning "hidden word") is the discipline of cryptography and cryptanalysis combined.

To most people, cryptography is concerned with keeping communications private. Indeed, the protection of sensitive communications has been the emphasis of cryptography throughout much of its history. However, this is only one part of today's cryptography.

Modern cryptography is a field that draws heavily upon mathematics, computer science, and cleverness. One of the most important assumptions in modern cryptography is Kerckhoffs's Principle (Trappe et al, 2002): in assessing the security of a cryptosystem, one should always assume that the enemy knows the method being used. Auguste Kerckhoffs enunciated the principle in 1883 in his classic treatise La Cryptographie Militaire (Trappe et al, 2002).

Encryption is the transformation of data into a form that is almost impossible to read without the appropriate knowledge. Its purpose is to ensure privacy by keeping information hidden from anyone for whom it is not intended, even those who have access to the encrypted data. Decryption is the reverse of encryption; it is the transformation of encrypted data back into an intelligible form. Encryption and decryption generally require the use of some secret information, referred to as a key. For some encryption mechanisms, the same key is used for both encryption and decryption; for other mechanisms, the keys used for encryption and decryption are different. Today's cryptography is more than encryption and decryption. Authentication is as fundamental a part of our lives as privacy. We use authentication throughout our everyday lives, for example when we sign our name on a document. As we move to a world where our decisions and agreements are communicated electronically, we need electronic techniques for providing authentication.

In cryptography, RSA is an algorithm for public-key cryptography. It is known as the first algorithm suitable for signing as well as encryption, and as one of the first great advances in public-key cryptography. RSA is widely used in electronic commerce protocols, and is believed to be secure given sufficiently long keys and up-to-date implementations.

2.2 History

Cryptography may be viewed as overt secret writing in the sense that the writing is clearly seen to be disguised. The first recorded instance of a cryptographic technique was literally written in stone almost four millennia ago. This was done by an Egyptian scribe who used hieroglyphic symbol substitution in his writing on a rock wall in the tomb of a nobleman of the time, Khnumhotep.

The oldest extant cryptography from ancient Mesopotamia is an enciphered cuneiform tablet, which has a formula for making pottery glazes and dates from around 1500 B.C. It was found on the site of Seleucia on the banks of the Tigris river (Koblitz, 1997). Also, Babylonian and Assyrian scribes occasionally used exceptional or unusual cuneiform symbols on their clay tablets to 'sign off' the message with a date and signature, called colophons. The first known establishment of military cryptography was given to us by the Spartans, who used the first transposition cipher device, called a skytale. It consisted of a wooden staff around which a strip of parchment was tightly wrapped, layer upon layer. The secret message was written on the parchment lengthwise down the staff. Then the parchment was unwrapped and sent.

In the 16th century, the French cryptographer Vigenère invented a variant on the Roman system that is not quite so easy to break. He took a message unit to be a block of k letters, in modern terminology; in other words, his map from plaintext to ciphertext acted on message units of k letters. For the most part, until about 20 years ago only rather elementary algebra and number theory were used in cryptography. Perhaps the most sophisticated mathematical result in cryptography before the 1970s was the famous theorem of information theory (Koblitz, 1997).

Before the 1970s, all secure communication relied on variants of private-key cryptography. This means that the sender and the receiver of an encrypted message must share, and keep private, a common cipher key used for encoding and decoding messages. In 1976, Whitfield Diffie and Martin Hellman introduced a more innovative alternative. In their cryptographic system, the key used to encrypt a message is different from the key used to decrypt it. Such a system is called public-key cryptography: a publicly distributed key performs one function (either encoding or decoding), while the other key, kept private, performs the reverse function.

A year later, in 1977, the RSA algorithm was publicly described by Ron Rivest, Adi Shamir, and Leonard Adleman at MIT; the letters RSA are the initials of their surnames. Clifford Cocks, a British mathematician working for the UK intelligence agency GCHQ, described an equivalent system in an internal document in 1973. Due to the relatively expensive computers needed to implement it at the time, it was mostly considered a curiosity and, as far as is publicly known, was never deployed. His discovery, however, was not revealed until 1997 due to its top-secret classification, and Rivest, Shamir, and Adleman devised RSA independently of Cocks' work.

2.3 RSA Cryptosystem

A cryptosystem consisting of a set of enciphering and deciphering transformations, in which the enciphering key can be made public without revealing the deciphering key, is called a Public-key Cryptosystem or an Asymmetric Cryptosystem (Koblitz, 1997). The RSA cryptosystem is a public-key cryptosystem that offers both encryption and digital signatures (authentication). The idea of a public-key cryptosystem was put forward by Diffie and Hellman in 1976. In 1977, Rivest, Shamir and Adleman invented the well-known RSA Cryptosystem. The RSA algorithm works as follows: take two large primes, p and q, and compute their product n = pq; n is called the modulus. Choose a number e, less than n and relatively prime to (p-1)(q-1), which means that e and (p-1)(q-1) have no common factors except 1. Find another number d such that (ed - 1) is divisible by (p-1)(q-1). The values e and d are called the public and private exponents, respectively. The public key is the pair (n, e); the private key is (n, d). The factors p and q may be destroyed or kept with the private key.

It is currently difficult to obtain the private key d from the public key (n, e). However, if one could factor n into p and q, then one could obtain the private key d. Thus the security of the RSA system is based on the assumption that factoring is difficult. The discovery of an easy method of factoring would "break" RSA.

Encryption

Suppose Alice wants to send a message m to Bob. Alice creates the ciphertext c by exponentiating: $c = m^e \bmod n$, where e and n are Bob's public key. She sends c to Bob. To decrypt, Bob also exponentiates: $m = c^d \bmod n$; the relationship between e and d ensures that Bob correctly recovers m. Since only Bob knows d, only Bob can decrypt this message.

Digital Signature

Suppose Alice wants to send a message m to Bob in such a way that Bob is assured that the message is authentic, has not been tampered with, and comes from Alice. Alice creates a digital signature s by exponentiating: $s = m^d \bmod n$, where d and n are Alice's private key. She sends m and s to Bob. To verify the signature, Bob exponentiates and checks that the message m is recovered: $m = s^e \bmod n$, where e and n are Alice's public key.

Thus encryption and authentication take place without any sharing of private keys: each person uses only another's public key or their own private key. Anyone can send an encrypted message or verify a signed message, but only someone in possession of the correct private key can decrypt or sign a message.
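The encryption and signature steps described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the toy key (n, e, d) = (3233, 17, 2753) that appears in Example 1 of Section 2.5; the function names are hypothetical, a single key pair stands in for both Alice's and Bob's keys, and no padding is applied, so the sketch is not a secure implementation.

```python
# Toy "textbook" RSA: plain modular exponentiation, no padding, not secure.
# Assumed key values: n = 61 * 53 = 3233, e = 17, d = 2753.
N, E, D = 3233, 17, 2753

def encrypt(m, e=E, n=N):
    """Encrypt m with the public key (n, e): c = m^e mod n."""
    return pow(m, e, n)

def decrypt(c, d=D, n=N):
    """Decrypt c with the private exponent d: m = c^d mod n."""
    return pow(c, d, n)

def sign(m, d=D, n=N):
    """Sign m with the private exponent d: s = m^d mod n."""
    return pow(m, d, n)

def verify(m, s, e=E, n=N):
    """Accept the signature if s^e mod n recovers the message m."""
    return pow(s, e, n) == m

c = encrypt(123)
print(c, decrypt(c))            # 855 123
print(verify(123, sign(123)))   # True
```

Python's built-in three-argument `pow` performs the modular exponentiation, so the intermediate powers never grow beyond the size of n.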

2.4 Implementing RSA

The RSA cryptosystem uses computation in $\mathbb{Z}_n$, where n is the product of two distinct odd primes p and q. For such an integer n, note that $\varphi(n) = (p-1)(q-1)$. The formal description is given as Cryptosystem 1 below.

Cryptosystem 1 : RSA Cryptosystem

Let $n = pq$, where p and q are primes. Let $\mathcal{P} = \mathcal{C} = \mathbb{Z}_n$ and define

$K = \{(n, p, q, a, b) : ab \equiv 1 \pmod{\varphi(n)}\}$.

For $K = (n, p, q, a, b)$, define

$e_K(x) = x^b \bmod n$

and

$d_K(y) = y^a \bmod n$

$(x, y \in \mathbb{Z}_n)$. The values n and b comprise the public key, and the values p, q, and a form the private key.

There are many aspects of the RSA Cryptosystem to discuss, including the efficiency of encrypting and decrypting. To set up the system, Bob uses the RSA PARAMETER GENERATION algorithm, which is presented informally as Algorithm 1.

Algorithm 1 : RSA PARAMETER GENERATION

1. Generate two large primes, p and q, such that $p \neq q$.

2. $n \leftarrow pq$ and $\varphi(n) \leftarrow (p-1)(q-1)$.

3. Choose a random b $(1 < b < \varphi(n))$ such that $\gcd(b, \varphi(n)) = 1$.

4. $a \leftarrow b^{-1} \bmod \varphi(n)$.

5. The public key is (n, b) and the private key is (p, q, a).

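Example:

1. Let p = 3 and q = 7.

2. n = pq, therefore n = 21, and $\varphi(n) = (p-1)(q-1) = 2 \times 6 = 12$.

3. Choose b with $1 < b < 12$ and $\gcd(b, 12) = 1$; take b = 5.

4. $a = b^{-1} \bmod 12 = 5$, since $5 \times 5 = 25 \equiv 1 \pmod{12}$. Any a congruent to 5 modulo 12, such as a = 17, deciphers equally well.

5. The public key is (21, 5) and the private key is (3, 7, 5).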

If the RSA Cryptosystem is to be secure, it is certainly necessary that n = pq be large enough that factoring it is computationally infeasible. For greater security we can choose larger primes. In order to make them hard for our antagonists to discover, we should select the primes randomly. One way of doing this is to use a random number generator to pick an integer with the appropriate number of digits, add one if it is even, and test the resulting integer for primality. Primality testing is discussed further in chapter 3.
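The parameter generation steps of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the primes p and q and the public exponent b are passed in by the caller instead of being generated at random, the function name `rsa_parameters` is hypothetical, and the modular inverse uses `pow(b, -1, phi)`, which requires Python 3.8 or later.

```python
from math import gcd

def rsa_parameters(p, q, b):
    """Steps 2-5 of Algorithm 1 for given primes p != q and exponent b."""
    if p == q:
        raise ValueError("p and q must be distinct")
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < b < phi) or gcd(b, phi) != 1:
        raise ValueError("need 1 < b < phi(n) and gcd(b, phi(n)) = 1")
    a = pow(b, -1, phi)      # a = b^(-1) mod phi(n), so a*b = 1 (mod phi(n))
    return (n, b), (p, q, a)

public, private = rsa_parameters(3, 7, 5)
print(public, private)       # (21, 5) (3, 7, 5)
```

The returned private exponent always satisfies $ab \equiv 1 \pmod{\varphi(n)}$, which is the condition that makes decryption undo encryption.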

2.5 RSA Algorithm

Here we discuss the RSA Algorithm in more detail. The RSA algorithm is actually a cipher, which means that it works on letters of the alphabet or on the symbols used to write a language rather than on words or meaningful phrases of the language. It really acts on a collection of numbers, so the first task is to get a uniform method of converting the symbols we want to transmit into numbers. To begin creating your public-key cryptography system, choose two large prime numbers p and q. The larger p and q are, the more secure your encryption system will be. Until recently, it sufficed to use primes that were approximately 80 digits long. If the primes p and q chosen are large enough, say 100 digits each, then n will be a 200-digit number that cannot be factored within any reasonable amount of time. Much effort has been expended in trying to factor large numbers like n quickly.

Today, using primes between 100 and 300 digits long is recommended for maximum security. Meanwhile, the record so far is the factoring of a 193-digit number (Mollin, 2003). The requirements are certain to increase steadily, as more computing power and better methods for attacking the RSA algorithm become available.

The security of the RSA cipher is based on the ease of finding the deciphering number d when the factorization of n = pq is known, and the difficulty of finding d from n and e when the factorization is not known. We now discuss encryption and decryption. Suppose you want to use the public key (n, e) to encipher a message x in $X = \mathbb{Z}_n$. You raise x to the power e and reduce modulo n to obtain $y = x^E \equiv x^e \pmod{n}$. To decipher the message y, the holder of the secret deciphering key (n, d) raises y to the power d and reduces modulo n to obtain $x = y^D \equiv y^d \pmod{n}$.

Suppose (n, e) = (33, 7). The message x = 17 is enciphered as

$y = 17^E = 17^7 \equiv (17)(-16)^6 = (17)((-16)^2)^3 = (17)(256)^3 \equiv (17)(-8)^3 = (17)(64)(-8) \equiv (17)(-2)(-8) = (-34)(-8) \equiv (-1)(-8) = 8 \pmod{33}$.

Then y = 8 is deciphered with the secret key (n, d) = (33, 3) to obtain

$x = y^D = 8^3 = 8(8^2) = 8(64) \equiv 8(-2) = -16 \equiv 17 \pmod{33}$.

This is how the decipherer decrypts the message.

Alice can send a signed message u to Bob by forming $v = u^{D_A}$ and then $w = v^{E_B} = u^{D_A E_B}$, which Bob can read as $u = w^{D_B E_A}$. Alice must apply $D_A$ and $E_B$ in the correct order. For example, suppose Alice has enciphering key $(n_A, e_A) = (33, 7)$ and deciphering key $(n_A, d_A) = (33, 3)$, while Bob has keys $(n_B, e_B) = (65, 11)$ and $(n_B, d_B) = (65, 35)$, and Bob wants to send the signed message x = 18 to Alice. Bob calculates $y = 18^{E_A} = 18^7 \bmod 33 = 6$ and then $z = 6^{D_B} = 6^{35} \bmod 65 = 11$, which he sends to Alice. Alice then applies Bob's encipherer $E_B$ to z = 11 to obtain $y = 11^{11} \bmod 65 = 6$, and then applies her decipherer $D_A$ to y = 6 to obtain $x = 6^{D_A} = 6^3 \bmod 33 = 18$, recovering the original message x = 18.

In this case, the order of application of the operators is important. When sending a signed message, the operator (encipherer or decipherer) associated with the smaller modulus n must be applied first, and when receiving a signed message the operator associated with the larger modulus must be applied first.
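The ordering rule for signed messages can be checked with a short Python sketch. It assumes the toy keys from the example (Alice: n = 33, e = 7, d = 3; Bob: n = 65, e = 11, d = 35); the dictionary layout and the function names `send_signed` and `receive_signed` are hypothetical and chosen only for illustration.

```python
# Assumed toy keys from the worked example above.
ALICE = {"n": 33, "e": 7, "d": 3}
BOB = {"n": 65, "e": 11, "d": 35}

def send_signed(x, sender, receiver):
    """Apply the sender's D and the receiver's E, smaller modulus first."""
    ops = [(sender["d"], sender["n"]), (receiver["e"], receiver["n"])]
    ops.sort(key=lambda op: op[1])                 # smaller modulus first
    for exponent, modulus in ops:
        x = pow(x, exponent, modulus)
    return x

def receive_signed(z, sender, receiver):
    """Apply the sender's E and the receiver's D, larger modulus first."""
    ops = [(sender["e"], sender["n"]), (receiver["d"], receiver["n"])]
    ops.sort(key=lambda op: op[1], reverse=True)   # larger modulus first
    for exponent, modulus in ops:
        z = pow(z, exponent, modulus)
    return z

z = send_signed(18, sender=BOB, receiver=ALICE)
print(z, receive_signed(z, sender=BOB, receiver=ALICE))   # 11 18
```

Applying the smaller modulus first guarantees that the intermediate value is smaller than the larger modulus, so no information is lost when the second operator is applied.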

Example 1

P = 61 first prime number (destroy this after computing E and D)

Q = 53 second prime number (destroy this after computing E and D)

PQ = 3233 modulus (give this to others)

E = 17 public exponent (give this to others)

D = 2753 private exponent (keep this secret)

Your public key is (E, PQ).

Your private key is D.

The encryption function is:

Encrypt(T) = (T^E) mod PQ

= (T^17) mod 3233

The decryption function is:

Decrypt(C) = (C^D) mod PQ

= (C^2753) mod 3233

To encrypt the plaintext value 123, do this:

Encrypt(123) = (123^17) mod 3233

= 337587917446653715596592958817679803 mod 3233

= 855

To decrypt the ciphertext value 855, do this:

Decrypt(855) = (855^2753) mod 3233

= 123

One way to compute the value of 855^2753 mod 3233 is like this:

2753 = 101011000001 base 2, therefore

2753 = 1 + 64 + 128 + 512 + 2048

Consider this table of powers of 855:

855^1 = 855 (mod 3233)

855^2 = 855^2 (mod 3233) = 367 (mod 3233)

855^4 = 367^2 (mod 3233) = 2136 (mod 3233)

855^8 = 2136^2 (mod 3233) = 733 (mod 3233)

855^16 = 733^2 (mod 3233) = 611 (mod 3233)

855^32 = 611^2 (mod 3233) = 1526 (mod 3233)

855^64 = 1526^2 (mod 3233) = 916 (mod 3233)

855^128 = 916^2 (mod 3233) = 1709 (mod 3233)

855^256 = 1709^2 (mod 3233) = 1282 (mod 3233)

855^512 = 1282^2 (mod 3233) = 1160 (mod 3233)

855^1024 = 1160^2 (mod 3233) = 672 (mod 3233)

855^2048 = 672^2 (mod 3233) = 2197 (mod 3233)

Given the above, we know this:

855^2753 (mod 3233)

= 855^(1 + 64 + 128 + 512 + 2048) (mod 3233)

= 855^1 * 855^64 * 855^128 * 855^512 * 855^2048 (mod 3233)

= 855 * 916 * 1709 * 1160 * 2197 (mod 3233)

= 794 * 1709 * 1160 * 2197 (mod 3233)

= 2319 * 1160 * 2197 (mod 3233)

= 184 * 2197 (mod 3233)

= 123 (mod 3233)

= 123
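The repeated-squaring computation above is an instance of binary (square-and-multiply) exponentiation. A minimal Python sketch is given below; the function name `power_mod` is hypothetical, and in practice Python's built-in `pow(base, exponent, modulus)` does the same job.

```python
def power_mod(base, exponent, modulus):
    """Right-to-left binary exponentiation (square and multiply)."""
    result = 1
    square = base % modulus        # holds base^(2^i) mod modulus at step i
    while exponent > 0:
        if exponent & 1:           # current binary digit of the exponent is 1
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1             # move to the next binary digit
    return result

print(bin(2753))                    # 0b101011000001
print(power_mod(855, 2753, 3233))   # 123
```

Only the powers whose binary digit is 1 (here 1, 64, 128, 512 and 2048) are multiplied into the result, which is exactly the selection made in the table.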

Example 2:

P = 3 first prime number (destroy this after computing E and D)

Q = 7 second prime number (destroy this after computing E and D)

PQ = 21 modulus (give this to others)

E = 5 public exponent (give this to others)

D = 17 private exponent (keep this secret)

Your public key is (E, PQ).

Your private key is D.

The encryption function is:

Encrypt(T) = (T^E) mod PQ

= (T^5) mod 21

The decryption function is:

Decrypt(C) = (C^D) mod PQ

= (C^17) mod 21

To encrypt the plaintext value 10, do this:

Encrypt(10) = (10^5) mod 21

= 100000 mod 21

= 19

To decrypt the ciphertext value 19, do this:

Decrypt(19) = (19^17) mod 21

= 10

One way to compute the value of 19^17 mod 21 is like this:

17 = 10001 base 2, therefore

17 = 1 + 16

Consider this table of powers of 19:

19^1 = 19 (mod 21)

19^2 = 19^2 (mod 21) = 4 (mod 21)

19^4 = 4^2 (mod 21) = 16 (mod 21)

19^8 = 16^2 (mod 21) = 4 (mod 21)

19^16 = 4^2 (mod 21) = 16 (mod 21)

Given the above, we know this:

19^17 (mod 21)

= 19^(1 + 16) (mod 21)

= 19^1 * 19^16 (mod 21)

= 19 * 16 (mod 21)

= 10 (mod 21)

= 10

CHAPTER 3

PRIMALITY TEST

In the previous chapter, we saw that factoring the RSA modulus is as difficult as finding the secret RSA key. In many public-key cryptosystems, large random prime numbers are used. They are produced by generating random numbers of the right size and by testing whether those random numbers are prime. Fast algorithms for primality testing have been of widespread interest to computer scientists since the early 1970s (Stinson, 2006) because of the important role they play in many modern cryptosystems. Keys for the RSA algorithm, for instance, are numbers that are the product of two large primes, generated with a primality test algorithm. The security of the algorithm rests on the fact that while multiplying two large numbers is easy, factoring a number into its prime components is hard. Part of the magic of RSA is that one can determine quickly whether a number is prime or composite. But in general, no one knows a quick way to find a composite number's prime factors. The fast primality tests used in practice are randomized algorithms, which make use of random number generators. These algorithms have a high probability of giving a correct answer, and that probability can be made higher still by running the algorithm multiple times. Even so, a tiny probability of error always remains.

In this chapter, we discuss how to decide whether a given positive integer is a prime number. As mentioned in chapter 2, one way to find out whether the input is a prime number is to use a primality test. A primality test is simply a function that determines whether a given integer greater than 1 is prime or composite. Primality tests come in two varieties: deterministic and probabilistic. Deterministic tests determine with absolute certainty whether a number is prime. Examples of deterministic tests include the Lucas-Lehmer test and elliptic curve primality proving. Probabilistic tests can potentially (although with very small probability) falsely identify a composite number as prime. However, they are in general much faster than deterministic tests. Numbers that have passed a probabilistic prime test are therefore properly referred to as probable primes until their primality can be demonstrated deterministically. The most efficient factorization and primality testing algorithms known today are probabilistic, in the sense that they use sophisticated techniques that will almost always return a result but do not do so with absolute mathematical certainty. Faster primality testing does not pose any immediate risk to the security of electronic communication. However, it does open up the possibility of greatly speeding up mathematical computation in many areas of number theory. There are many primality testing algorithms; here we discuss a few of them: the naive method, probabilistic tests, the Miller-Rabin method and the Fermat test.

3.1 Naive methods

This is the simplest primality test. Given an input number n, check whether any integer m from 2 to n - 1 divides n. If n is divisible by some m, then n is composite; otherwise it is prime.

Example:

1. n = 27, n - 1 = 26

m = 2, 3, 4, ..., 26

n ÷ m = 27 ÷ 9 = 3, thus n is composite.

2. n = 61, n - 1 = 60

m = 2, 3, 4, ..., 60

No m in this range divides 61, thus n is a prime number.

We can also test m only up to $\sqrt{n}$. If n is composite then it can be factored into two values, at least one of which must be less than or equal to $\sqrt{n}$.

Example:

1. n = 79, $\sqrt{n} \approx 8.888$

therefore m = 2, 3, 4, ..., 8

No m in this range divides 79, thus n is a prime number.

A good way to speed up these methods is to pre-compute and store a list of all primes up to a certain bound. Then, before testing n for primality with a serious method, one first checks whether n is divisible by any prime from the list. This remains a simple primality test, however, and it is not practical for numbers with many digits.
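The naive method with the $\sqrt{n}$ bound can be sketched in Python as follows. The function name `is_prime_trial` is hypothetical, and `math.isqrt` (available from Python 3.8) is used to obtain the integer part of the square root without floating-point error.

```python
from math import isqrt

def is_prime_trial(n):
    """Naive test: try every m with 2 <= m <= floor(sqrt(n))."""
    if n < 2:
        return False
    for m in range(2, isqrt(n) + 1):
        if n % m == 0:
            return False     # m is a nontrivial divisor, so n is composite
    return True

print([n for n in (27, 61, 79) if is_prime_trial(n)])   # [61, 79]
```

The running time grows like $\sqrt{n}$, which is why this approach is only usable for small inputs and not for numbers of the size used in RSA.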

3.2 Probabilistic tests

Probabilistic tests are among the most popular primality tests. Apart from the tested number n, these tests use another number a, which is chosen at random. It is possible for a composite number to be reported as prime, but the usual randomized primality tests never report a prime number as composite. The error probability of these tests can be reduced by repeating the test several times. For example, for two commonly used tests, for any composite n at least half of the values of a detect n's compositeness, so k repetitions reduce the error probability to at most $2^{-k}$.

The basic structure of a randomized primality test is as follows:

1. Randomly pick a number a.

2. Check some equality involving a and the given number n. If the equality fails to hold, then n is a composite number, a is known as a witness for the compositeness, and the test stops.

3. Repeat from step 1 until the required certainty is achieved.

If, after several iterations, n has not been found to be composite, then it can be declared probably prime. The simplest probabilistic test is the Fermat primality test; another is the Rabin-Miller test. Both methods are discussed later. The Fermat test is only a heuristic test; some composite numbers will be declared 'probably prime' no matter what witness is chosen. Nevertheless, it is sometimes used if a rapid screening of numbers is needed, for instance in the key generation phase of the RSA public-key cryptographic algorithm.
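The three-step structure above can be illustrated with the Fermat test, taking the equality in step 2 to be $a^{n-1} \equiv 1 \pmod{n}$. The sketch below is a minimal illustration only; the function name `fermat_test`, the default of 20 rounds and the optional `rng` parameter are assumptions made for this example.

```python
import random

def fermat_test(n, rounds=20, rng=random):
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)      # step 1: random a with 2 <= a <= n - 2
        if pow(a, n - 1, n) != 1:        # step 2: the equality fails
            return False                 # a is a witness, so n is composite
    return True                          # step 3: no witness found

print(fermat_test(97))        # True
print(pow(2, 560, 561))       # 1
```

The second line of output shows the weakness of this particular equality: 561 = 3 × 11 × 17 is composite, yet the base 2 satisfies the congruence and therefore does not expose it.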

3.3 Miller-Rabin Methods

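The Miller-Rabin primality test was developed by Rabin (Nigel, 2003), based on Miller's idea. This algorithm provides a fast method of determining the primality of a number with a controllably small probability of error. The algorithm is described as follows.

Let n > 1 be an odd integer. Write $n - 1 = 2^k m$ with m odd. Choose a random integer a with $1 < a < n - 1$. Compute $b_0 \equiv a^m \pmod{n}$. If $b_0 \equiv \pm 1 \pmod{n}$, then stop and declare that n is probably prime. Otherwise, let $b_1 \equiv b_0^2 \pmod{n}$. If $b_1 \equiv 1 \pmod{n}$, then n is composite. If $b_1 \equiv -1 \pmod{n}$, then stop and declare that n is probably prime. Otherwise, let $b_2 \equiv b_1^2 \pmod{n}$. If $b_2 \equiv 1 \pmod{n}$, then n is composite. If $b_2 \equiv -1 \pmod{n}$, then stop and declare that n is probably prime. Continue in this way until stopping or reaching $b_{k-1}$. If $b_{k-1} \not\equiv -1 \pmod{n}$, then n is composite.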

Example;

1. Let n = 561, then n - 1 = 560 = 16 * 35, so $2^k = 2^4$ and m = 35.

Let a = 2. Then,

$b_0 \equiv 2^{35} \equiv 263 \pmod{561}$

$b_1 \equiv b_0^2 \equiv 166 \pmod{561}$

$b_2 \equiv b_1^2 \equiv 67 \pmod{561}$

$b_3 \equiv b_2^2 \equiv 1 \pmod{561}$

Since $b_3 \equiv 1 \pmod{561}$, we conclude that 561 is composite. Moreover, $\gcd(b_2 - 1, 561) = 33$, which is a nontrivial factor of 561.

For the efficiency of the Miller-Rabin test, it is important that there are sufficiently many witnesses for the compositeness of a composite number. Suppose we determine all witnesses for the compositeness of n = 15. We have n - 1 = 14 = 2 * 7. Therefore, s = 1 and d = 7. An integer a, which is prime to 15, is a witness for the compositeness of n if and only if $a^7 \bmod 15 \neq 1$ and $a^7 \bmod 15 \neq -1$.
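The witnesses for n = 15 can be enumerated directly. The following Python sketch is an illustration only (the function name is an assumption made for this example); it writes n - 1 = 2^s d and lists every a prime to n that is a witness in the sense described above.

```python
from math import gcd


def compositeness_witnesses(n):
    """List all a in [1, n-1], coprime to n, that witness the compositeness of n."""
    # Write n - 1 = 2^s * d with d odd.
    s, d = 0, n - 1
    while d % 2 == 0:
        s += 1
        d //= 2

    witnesses = []
    for a in range(1, n):
        if gcd(a, n) != 1:
            continue
        b = pow(a, d, n)
        # a is NOT a witness if a^d = 1 or a^(2^r d) = -1 for some r < s.
        not_a_witness = (b == 1)
        for _ in range(s):
            if b == n - 1:
                not_a_witness = True
            b = pow(b, 2, n)
        if not not_a_witness:
            witnesses.append(a)
    return witnesses


if __name__ == "__main__":
    print(compositeness_witnesses(15))
```

For n = 15 only a = 1 and a = 14 fail to be witnesses among the eight integers prime to 15, which illustrates why a random choice of a is likely to detect compositeness.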

Mathematica implements a multiple Rabin-Miller test combined with another test, so that the primality test can be done with the function PrimeQ[n]. However, this method is only efficient for $n < 10^{16}$, so Mathematica is still not very efficient for detecting very long prime numbers. A more detailed analysis of the Miller-Rabin test has shown that the error probability is in fact even smaller (Nigel, 2003). The Miller-Rabin primality test is a more sophisticated variant which can detect all composites. It is also the method of choice because it is much faster than other general primality tests (Primality test).

The Miller-Rabin test is given by the following pseudocode;

    Write n - 1 = 2^s m with m odd;
    for (j = 0; j < k; j++)
    {
        pick a from [2, ... , n-1];
        b = a^m mod n;
        if (b != 1)
        {
            flag = true;
            for (i = 0; i < s; i++)
            {
                if (b == (n - 1))
                {
                    flag = false;
                    break;
                }
                b = b^2 mod n;
            }
            if (flag == true)
            {
                output (composite, a);
                exit;
            }
        }
    }
    output "probably prime";
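The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the Mathematica implementation referred to in this dissertation; the function names, the default of 20 rounds and the optional random generator argument are assumptions made for this example.

```python
import random


def decompose(n):
    """Write n - 1 = 2^s * m with m odd and return (s, m)."""
    s, m = 0, n - 1
    while m % 2 == 0:
        s += 1
        m //= 2
    return s, m


def b_sequence(n, a):
    """Return b_0, b_1, ..., b_(s-1) where b_0 = a^m and b_(i+1) = b_i^2 (mod n)."""
    s, m = decompose(n)
    b = pow(a, m, n)
    sequence = [b]
    for _ in range(s - 1):
        b = pow(b, 2, n)
        sequence.append(b)
    return sequence


def is_probably_prime(n, rounds=20, rng=None):
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    rng = rng or random.Random()
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random a with 1 < a < n - 1
        sequence = b_sequence(n, a)
        if sequence[0] == 1 or (n - 1) in sequence:
            continue  # this a does not prove compositeness
        return False  # a is a witness for the compositeness of n
    return True


if __name__ == "__main__":
    print(b_sequence(561, 2))      # the worked example with n = 561, a = 2
    print(is_probably_prime(561))
    print(is_probably_prime(79))
```

The helper b_sequence reproduces the values b_0, ..., b_3 of the example with n = 561 and a = 2, while is_probably_prime repeats the test with random values of a.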

3.3.1 Mathematica Function for the Primality Test

The function used here is to prove primality or compositeness. It can not only prove primality, but can also generate a certificate of primality. A certificate of primality is a relatively short set of data that can be easily used to prove primality. The word easily means that using the data to prove primality is much easier and faster than generating the data in the first place. As a simple example of a certificate, the factors of a composite number provide a certificate of compositeness. Multiplying the numbers together to show that they are the factors is much easier than finding the factors. The certificate of primality used here for large n is based on the theory of elliptic curves. The function can also generate a certificate of compositeness for composite numbers.

For example, PrimeQCertificate[3837523] returns the certificate {2, 3837522, 3837523}, which is intended to show that $2^{3837522} \pmod{3837523}$ is not equal to 1. The function "ProvablePrimeQ[n]" returns True or False depending on whether n is prime or not. The certificate for primality or compositeness is generated by "PrimeQCertificate[n]". "ProvablePrimeQ" calls "PrimeQCertificate" and stores the result, so it does not take any extra time to create a certificate once "ProvablePrimeQ" has returned an answer. The certificate generated by "PrimeQCertificate" can be checked by "PrimeQCertificateCheck". This function recognizes whether the certificate asserts primality or compositeness and then uses the certificate to verify the assertion.

3.4 Fermat Test

The Fermat test is based on Fermat's theorem, which states that if p is a prime number, then for any integer a, $a^p - a$ is evenly divisible by p.

Theorem 3.3.1 (Fermat's theorem) [1]

If n is a prime number, then $a^{n-1} \equiv 1 \pmod{n}$ for all $a \in \mathbb{Z}$ with gcd(a, n) = 1.

This theorem can be used to determine that a positive integer is composite. In the notation of modular arithmetic, the first statement reads

$a^p \equiv a \pmod{p}$

If p is a prime and a is an integer coprime to p, then $a^{p-1} - 1$ is evenly divisible by p. In the notation of modular arithmetic

$a^{p-1} \equiv 1 \pmod{p}$

In other words, if p is a prime number and a is any integer that does not have p as a factor, then a raised to the power p - 1 leaves a remainder of 1 when divided by p.

Example;

Consider n = 341 = 11 * 31. We have

$2^{340} \equiv 1 \pmod{341}$,

although n is composite. Therefore, if we use the Fermat test with n = 341 and a = 2, then we obtain y = 1, which proves nothing. On the other hand, we have

$3^{340} \equiv 56 \pmod{341}$.

If we use the Fermat test with n = 341 and a = 3, then n is proven composite.
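The two computations of this example can be checked with a few lines of Python. This is only a sketch; the function names are assumptions made for this illustration.

```python
def fermat_residue(n, a):
    """Return y = a^(n-1) mod n, the value examined by the Fermat test."""
    return pow(a, n - 1, n)


def fermat_proves_composite(n, a):
    """True if the base a proves n composite; False means the test is inconclusive."""
    return fermat_residue(n, a) != 1


if __name__ == "__main__":
    print(fermat_residue(341, 2))           # y = 1: proves nothing
    print(fermat_residue(341, 3))           # y != 1: 341 is composite
    print(fermat_proves_composite(341, 3))
```

A result of False from fermat_proves_composite does not show that n is prime; it only means that the chosen base a gives no information.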

The encryption program PGP (Pretty Good Privacy: a computer program that provides cryptographic privacy and authentication) uses this primality test in its algorithms. The chance of PGP generating a Carmichael number is less than 1 in $10^{50}$, which is more than adequate for practical purposes.

CHAPTER 4

FACTORING ALGORITHMS

In this chapter we discuss factoring algorithms. Every integer greater than 1 can be represented uniquely as a product of prime numbers. The art of factorization is almost as old as mathematics itself. However, the study of fast algorithms for factoring is only a few decades old. One possible algorithm for factoring an integer is to divide the input by all small prime numbers iteratively until the remaining number is prime. This is efficient only for integers that are, say, of size less than $10^{16}$, as this already requires trying all primes up to $10^8$. In public-key cryptosystems based on the problem of factoring, numbers are of size $10^{300}$; this would require trying all primes up to $10^{150}$, and there are about $10^{147}$ such prime numbers according to the prime number theorem. This far exceeds the number of atoms in the universe, and is unlikely to be enumerated by any effort. The easy instance of factoring is the case where the given integer has only small prime factors. For example, 759375 is easy to factor as we can write it as $3^5 \cdot 5^5$. In cryptography we want to use only those integers that have only large prime factors. Preferably we select an integer with two large prime factors, as is done in the RSA cryptosystem.
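The "divide by small numbers until nothing is left" procedure described above can be sketched in Python as follows. This is an illustrative sketch under the assumption that n is small enough for trial division; the function name is not taken from the dissertation's programs.

```python
def factor_by_trial_division(n):
    """Return the prime factorization of n as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        # Divide out every copy of d; only prime d can divide at this point.
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever remains is a prime larger than the square root.
        factors[n] = factors.get(n, 0) + 1
    return factors


if __name__ == "__main__":
    print(factor_by_trial_division(759375))
```

For 759375 the loop finishes almost immediately because all prime factors are small, which is exactly the easy instance mentioned in the text.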

The three algorithms that are most effective on very large numbers are the Quadratic Sieve, the Elliptic Curve Factoring Algorithm and the Number Field Sieve. Other well-known algorithms that were precursors include Pollard's rho method, Pollard's p - 1 algorithm and Williams' p + 1 algorithm.

4.1 Integer Factorization in the context of Cryptography

Cryptography is an important building block of e-commerce systems. Public key cryptography can be used for ensuring the confidentiality, authenticity, and integrity of information in an organization. To protect the sensitive information in an organization, encryption can be applied to conceal sensitive data so that the encrypted data is completely meaningless except to the authorized individuals with the correct decryption key. To preserve the authenticity and integrity of data, digital signature can be performed on the data such that other people can neither impersonate the correct signer nor modify the signed data without being detected.

Integer factorization is the process of breaking down a composite number into smaller divisors which, when multiplied together, equal the original integer. When the numbers are very large, no efficient integer factorization algorithm is publicly known; a recent effort which factored a 200-digit number (RSA-200) took eighteen months and used over half a century of computer time (http://en.wikipedia.org/wiki/Integer_factorization).

Not all numbers of a given length are equally hard to factor. According to the fundamental theorem of arithmetic, every positive integer greater than one has a unique prime factorization. A fast integer factorization algorithm would mean that the RSA public-key algorithm is insecure. Some cryptographic systems, such as the Rabin public-key algorithm and the Blum Blum Shub pseudorandom number generator, can make a stronger guarantee: any means of breaking them can be used to build a fast integer factorization algorithm, so if integer factorization is hard then they are strong.

Integer factorization also has many positive applications in algorithms. For example, once an integer n is placed in its prime factorization representation, it enables the rapid computation of multiplicative functions on n. It can also be used to save storage, since any multiset of prime numbers can be stored without loss of information as its product.

4.2 The Pollard p - 1 Algorithm

The Pollard p - 1 algorithm, which dates from 1974 (Weisstein, 2002), is a specialized method. It is only useful for finding prime factors p such that p - 1 is divisible only by small factors, and it does not work particularly well outside those cases. This tells us that another way to make an RSA public key secure is to make sure that, for each prime factor p of n, the number p - 1 has a large prime factor. This algorithm has two inputs: the (odd) integer n to be factored and a prespecified "bound", B. It is a special-purpose algorithm, meaning that it is only suitable for integers with specific types of factors. The algorithm is based on the insight that numbers of the form $a^b - 1$ tend to be highly composite when b is itself composite. Since it is computationally simple to evaluate numbers of this form in modular arithmetic, the algorithm allows one to check many potential factors with great efficiency.

Algorithm : POLLARD p - 1 FACTORING ALGORITHM (n, B)

    a ← 2
    for j ← 2 to B
        do a ← a^j mod n
    d ← gcd(a - 1, n)
    if 1 < d < n
        then return (d)
        else return ("failure")

Example;

If n = 15770708441 and we apply the algorithm with B = 180, then we find that a = 11620221425 and d is computed to be 135979. In fact, the complete factorization of n into primes is

15770708441 = 135979 × 115979

In this example, the factorization succeeds because 135978 has only 'small' prime factors;

135978 = 2 × 3 × 131 × 173

Hence, by taking B ≥ 173, it will be the case that 135978 | B!, as desired.
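The algorithm with inputs (n, B) can be sketched in Python as follows, using the example n = 15770708441 and B = 180 from the text. This is an illustrative sketch only; the function name and the use of None to signal "failure" are assumptions made for this example.

```python
from math import gcd


def pollard_p_minus_1(n, bound):
    """Pollard p - 1 with base 2: return a factor d with 1 < d < n, or None on failure."""
    a = 2
    for j in range(2, bound + 1):
        a = pow(a, j, n)  # after the loop, a = 2^(bound!) mod n
    d = gcd(a - 1, n)
    if 1 < d < n:
        return d
    return None


if __name__ == "__main__":
    n = 15770708441
    d = pollard_p_minus_1(n, 180)
    print(d, n // d)
    print(pollard_p_minus_1(n, 100))  # bound below 131 and 173: failure
```

With B = 100 the bound is smaller than the prime factors 131 and 173 of 135978, so the gcd is 1 and the sketch reports failure.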

Note that failure of this algorithm can occur for two different reasons. One is that there is no prime factor p of n such that p - 1 is B-smooth. In that case the gcd computed is 1 all the time. The other failure mode is that every prime factor q of n has q - 1 being B-smooth, in which case the gcd computed can be n itself. There could be other tricks to make this method faster, but until they are found, it is not very useful when trying to factor very large RSA numbers.

4.2.1 Pseudocode : Pollard p - 1 Factorization

function pollard_p1(N)
    # Initial value 2^(k!) for k = 0
    two_k_fact := 2
    for k from 1 to infinity
        # Calculate 2^(k!) (mod N) from 2^((k-1)!)
        two_k_fact := modPow(two_k_fact, k, N)
        rk := gcd(two_k_fact - 1, N)
        if rk <> 1 and rk <> N then
            return rk, N/rk
        end if
    end for
end function

Here modPow(a, b, N) denotes modular exponentiation, $a^b \bmod N$. This function is typically provided in languages with big integer operations.

4.3 The Pollard Rho Algorithm

The Pollard rho algorithm is a probabilistic method for factoring a composite number N by iterating a function modulo N. The method was published by J.M. Pollard in 1975 (Weisstein, 2002). Pollard's rho method quickly finds relatively small factors of composite numbers. It is a very simple factorization method which already runs several times faster than trial division for numbers whose smallest prime factor is about 1,000,000. Further, it has the practical virtue that if a number has a small prime factor, then the method finds such a factor faster than it would find a large factor. This method generates two sequences of pseudo-random numbers from a quadratic map and takes the difference of those sequences. Then it tests the differences against n for a greatest common divisor (GCD).

This is better than trial division because, instead of just dividing one number into n to see if it is a factor, it looks for a greatest common divisor between the numbers, which allows it to test a large number of candidates at once. Instead of just trying to divide the number, it tests for factors that a difference shares with the number. Below is the Pollard rho algorithm;

Algorithm : POLLARD RHO FACTORING ALGORITHM (n, x1)

    external f
    x ← x1
    x' ← f(x) mod n
    p ← gcd(x - x', n)
    while p = 1
        comment : in the i-th iteration, x = x_i and x' = x_2i
        do x ← f(x) mod n
           x' ← f(x') mod n
           x' ← f(x') mod n
           p ← gcd(x - x', n)
    if p = n
        then return ("failure")
        else return (p)

The description of the algorithm is simple. Given an integer n, initialize by setting x = 2 and $y = x^2 + 1$.

a. Compute p = gcd(x - y, n).

b. If 1 < p < n, stop: p is a proper factor of n.

c. If p = 1, replace x by $x^2 + 1$ and y by $(y^2 + 1)^2 + 1$ and repeat.

d. If p = n, we have failure, and the algorithm needs to be reinitialized. This rarely occurs.

This method starts with one small number and uses it to generate a pseudo-random map, and with this map it tests for the greatest common divisor with respect to the number being factored. Let us start with x = 2 and $y = x^2 + 1 = 5$. If the GCD of x - y and n is strictly between one and n, the GCD is the factor. If it is one, replace x by $x^2 + 1$ and y by $(y^2 + 1)^2 + 1$, and repeat until the GCD is no longer one. If the GCD is n, you have to start over and change the formula for x to something of the form $x^2 + c$, where c is not 0 or -2.
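The steps above can be sketched in Python as follows. This is a minimal illustration and not the Mathematica program used for the findings in this chapter; the function name, the parameter c of the map x^2 + c, the step limit max_steps and the test value 8051 = 83 × 97 are assumptions made for this example.

```python
from math import gcd


def pollard_rho(n, x0=2, c=1, max_steps=1_000_000):
    """Return a proper factor of n, or None if this choice of x0 and c fails."""

    def f(v):
        return (v * v + c) % n

    x = x0        # plays the role of x_i
    y = f(x0)     # plays the role of x_2i
    for _ in range(max_steps):
        p = gcd(x - y, n)   # math.gcd ignores the sign of x - y
        if 1 < p < n:
            return p        # proper factor found
        if p == n:
            return None     # failure: restart with another c or x0
        x = f(x)            # one step for x
        y = f(f(y))         # two steps for y
    return None             # step limit reached


if __name__ == "__main__":
    p = pollard_rho(8051)
    print(p, 8051 // p)
```

All values are reduced modulo n at each step, which leaves the gcd unchanged but keeps the numbers small. When None is returned, the sketch can be called again with a different c, as the text suggests.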

In implementing this algorithm, a limit should probably be imposed on the number of cycles to run through before making some adjustments. Since the worst-case scenario for this algorithm is a complete failure, it would be unwise to test primality with this algorithm, since failure can also occur for composite numbers. Thus, in practice, before using Pollard's rho one should first apply a primality test to n. Pollard's rho method is fastest for small prime factors of up to about 7 digits (Weisstein, 2002). This means that it is not very efficient against the RSA cryptosystem, where we could easily be working with factors of up to 100 digits. Pollard's rho method shows that the prime factors of an RSA modulus need to be about the same size, although other factoring algorithms start by searching for factors that are of the same size.

4.3.1 Pseudocode : Pollard rho Factorization

function pollardRho(N)
    # Initial values x(i) and x(2*i) for i = 0
    xi := 2
    x2i := 2
    do
        # Find x(i+1) and x(2*(i+1))
        xiPrime := xi ^ 2 + 1
        x2iPrime := (x2i ^ 2 + 1) ^ 2 + 1
        # Increment i: change our running values for x(i), x(2*i)
        xi := xiPrime % N
        x2i := x2iPrime % N
        s := gcd(xi - x2i, N)
        if s <> 1 and s <> N then
            return s, N/s
        end if
    end do
end function

Here a % m is the modulo operation, which returns the least nonnegative integer y such that $a \equiv y \pmod{m}$.

4.4 The Pollard p- 1 and Pollard's Rho Findings

The programs were written in Mathematica and used to factor numbers of one to twenty-eight digits. From the results, we can also look for the running time as a function of n and for the number of steps through each loop as a function of n. For the Pollard p - 1 method, we choose b = 2 and $b = n^{1/6}$ to run the program. The program gives the time used and prints the factor (g), the number of times through the GCD loop (c) and the number of times through the prime loop (r). After running the program for n up to 28 digits, we found that with $b = n^{1/6}$ it went a different number of times through each of the loops. This shows that the method is not very useful when trying to factor very large RSA numbers.

For the Pollard's rho method, the program uses (t) to count the number of steps and (a) to find the factor. The number 10 is changed according to the number of steps the algorithm will go through, which is said to be usually about $n^{1/4}$. Here, we used this program for n up to 28 digits only and then looked for formulas relating the running time to n and the number of steps to n. After running the program, we noticed that the rho method takes the least amount of time to run. We can also look at how the method works when the two primes are very far apart, making one much smaller than the other, and are then multiplied together to get a different type of n. This algorithm was much faster in that case, because the Pollard's rho method is suited to small prime factors of up to about 8 digits (see Table 4.4.2). This means that this algorithm is also not very efficient against the RSA cryptosystem, which works with factors of up to 100 digits. Nevertheless, the Pollard's rho method shows us that the prime factors need to be of about the same size, so there is a balance. This also shows that the Pollard's rho and p - 1 methods are good tools for studying factorization and some cryptography. The Pollard's rho method shows us that p and q should be large, and the p - 1 method points out that p - 1 and q - 1 should also have large factors, when p*q = n. These precautions can protect the public key from easy attacks on the RSA cryptosystem.

Table 4.4.1 The Pollard p - 1 Methods

# digits n  p              # digits p  q                # digits q  n            run time (s)
1           2              1           3                1           6            0
2           5              1           7                1           35           0
3           11             2           13               2           143          0
4           23             2           107              3           2461         0
5           43             2           709              3           30487        0
6           293            3           487              3           142691       0
7           1499           4           2029             4           3041471      0
8           5039           4           11443            5           57661277     0.03
9           8221           4           73783            5           606570043    0.01
10          39901          5           52267            5           2085505567   0.01
11          35023          5           521641           6           18269432743  0
12          190207         6           924059           6           1.75762E+11  0.02
13          907141         6           1972121          7           1.78899E+12  0.12
14          2057683        7           12001051         8           2.46944E+13  2.76
15          17396147       8           17840671         8           3.10359E+14  0.98
16          20603417       8           63919529         8           1.31696E+15  1.18
17          81493593       8           481878377        9           3.927E+16    0.54
18          426914513      9           890014343        9           3.7996E+17   41.76
19          955679821      9           8079331567       10          7.72125E+18  91.61
20          1662192247     10          7310089201       10          1.21508E+19  1.08
21          3768926873     10          64747108099      11          2.44027E+20  12.19
22          523326649      9           3672701191391    13          1.92202E+21  76.18
23          9704936951     10          10231277255671   14          9.92939E+22  1.85
24          95172440929    11          4021288227503    13          3.82716E+23  23.62
25          313042493843   12          4169629019921    13          1.30527E+24  1.05
26          599986552873   12          45133643332471   14          2.70796E+25  54.9
27          8867424188003  13          15816623460283   14          1.40253E+26  621.88
28          4596296379151  13          748760052489131  15          3.44152E+27  675.96

Table 4.4.2 The Pollard's Rho Methods

# digits n  p              # digits p  q                # digits q  n            run time (s)
1           2              1           3                1           6            0
2           5              1           7                1           35           0
3           11             2           13               2           143          0
4           23             2           107              3           2461         0
5           43             2           709              3           30487        0.01
6           293            3           487              3           142691       0
7           1499           4           2029             4           3041471      0.01
8           5039           4           11443            5           57661277     0.02
9           8221           4           73783            5           606570043    0.02
10          39901          5           52267            5           2085505567   0.07
11          35023          5           521641           6           18269432743  0.05
12          190207         6           924059           6           1.75762E+11  0.17
13          907141         6           1972121          7           1.78899E+12  0.25
14          2057683        7           12001051         8           2.46944E+13  0.55
15          17396147       8           17840671         8           3.10359E+14  1.32
16          20603417       8           63919529         8           1.31696E+15  1.03
17          81493593       8           481878377        9           3.927E+16    3.95
18          426914513      9           890014343        9           3.7996E+17   10.33
19          955679821      9           8079331567       10          7.72125E+18  13.4
20          1662192247     10          7310089201       10          1.21508E+19  5.78
21          3768926873     10          64747108099      11          2.44027E+20  14.05
22          523326649      9           3672701191391    13          1.92202E+21  7.15
23          9704936951     10          10231277255671   14          9.92939E+22  48.47
24          95172440929    11          4021288227503    13          3.82716E+23  77.88
25          313042493843   12          4169629019921    13          1.30527E+24  143.94
26          599986552873   12          45133643332471   14          2.70796E+25  344.66
27          8867424188003  13          15816623460283   14          1.40253E+26  1249.30
28          4596296379151  13          748760052489131  15          3.44152E+27  755.12

To run the Pollard p - 1 and Pollard's rho computations we used Mathematica commands. Here the program automatically chooses the values of the prime numbers (p and q). We apply the Pollard p - 1 and Pollard's rho algorithms to find the running time and also the factor of the composite number. Mathematica is used because it is easy to code the algorithms in it.

4.5 Number Field Sieve

This algorithm is an extremely fast factorization method (Johannes B.A., 2000) developed by Pollard, which was used to factor the RSA-130 number. The effectiveness of the NFS algorithm becomes apparent for very large integers; it can factor any integer of size $10^{150}$ in a few months' time (Johannes B.A., 2000). The NFS algorithm takes sub-exponential time (which is still not very efficient).

The Number Field Sieve (NFS) algorithm is known to be the most efficient for large integers (110 digits or more) (Buchmann Johannes A., (2000)). Factorization of a 512-bit number was successfully done in 1999 and, in March 2003, factorization of a 530-bit number was done. These experiments used distributed PCs and supercomputers, which are general-purpose computers and hard to scale to bigger sizes.

The NFS has two common variations: the Special Number Field Sieve (SNFS) and the General Number Field Sieve (GNFS). The SNFS is an algorithm that can quickly factor large numbers but works only for numbers of a special form. The General Number Field Sieve works for all composite numbers, but this flexibility comes at the cost of the General Number Field Sieve being slightly slower than the Special Number Field Sieve. The General Number Field Sieve is the method of choice in many factoring challenges because of its increased flexibility. SNFS and GNFS are essentially the same algorithm. SNFS is simply a special case where a particularly simple polynomial is known, $\mathbb{Z}[\alpha]$ is a unique factorization domain, and some other nice algebraic properties are present. In the case of a general integer, and a more complex polynomial, some things get messier.

Table 4.5.2.1 gives the number of mips-years required for the special number field sieve to factor numbers of different lengths.

Table 4.5.2.1: Factoring Using the Special Number Field Sieve

# of bits    mips-years required to factor
512          < 200
768          100,000
1024         3 * 10^7
1280         3 * 10^9
1536         2 * 10^11
2048         4 * 10^14

Table 4.5.1.2: Factoring Using the General Number Field Sieve

# of bits    mips-years required to factor
512          30,000
768          2 * 10^8
1024         3 * 10^11
1280         1 * 10^14
1536         3 * 10^16
2048         3 * 10^20

Mathematicians keep coming up with new tricks, new optimizations, and new techniques. There is no reason to think this trend will not continue. A related algorithm, the special number field sieve, can already factor numbers of a certain specialized form (numbers not generally used for cryptography) much faster than the general number field sieve can factor general numbers of the same size. It is not unreasonable to assume that the general number field sieve can be optimized to run this fast.

4.5.1 General Number Field Sieve Algorithm

The General Number Field Sieve is the most efficient algorithm known for integer factorization. It can factor integers larger than 100 digits and holds the factoring records (http://en.wikipedia.org/wiki/General_number_field_sieve). Its complexity for factoring an integer n is of the form

$O\left(\exp\left((1.92 + o(1))(\ln n)^{1/3}(\ln \ln n)^{2/3}\right)\right) = L_n[1/3, c]$

(in O and L notations) for a constant c which depends on the complexity measure and on the variant of the algorithm.

The General Number Field Sieve algorithm includes two parameters that must be chosen to meet certain criteria. These free variables will be used throughout the derivation of the General Number Field Sieve along with a composite integer n that is to be factored.

Here is the explanation for the GNFS algorithm.

Suppose we want to factor a composite n and that, for two numbers $r, t \in \mathbb{Z}$, we have $r^2 \equiv t^2 \pmod{n}$, where n has the prime factorization n = pq. Then,

$r^2 \equiv t^2 \pmod{n}$

$\Rightarrow pq \mid (r + t)(r - t)$

$\Rightarrow p \mid (r + t)(r - t)$ and $q \mid (r + t)(r - t)$

Number theory states that if $c \mid ab$ and gcd(b, c) = 1 then $c \mid a$. According to this statement, it means that

$p \mid (r + t)$ or $p \mid (r - t)$

$q \mid (r + t)$ or $q \mid (r - t)$.

This implies that it is not possible that $p \nmid (r + t)$ and $p \nmid (r - t)$. Similarly, it is not possible that $q \nmid (r + t)$ and $q \nmid (r - t)$. Table 4.5.1.1 summarizes the possibilities for p and q dividing r + t and r - t.

As an explanation (row 6), suppose $p \mid (r + t)$, $p \nmid (r - t)$, $q \mid (r + t)$ and $q \mid (r - t)$. We have $\gcd(pq, r + t) \in \{1, p, q, pq\}$, the divisors of n = pq. Since $p \mid (r + t)$ and $q \mid (r + t)$, it follows that $pq \mid \gcd(pq, r + t)$, so $\gcd(pq, r + t) = pq$. And $\gcd(pq, r - t) = q$ because $p \nmid (r - t)$. Since one of the gcd's was able to isolate either p or q, this is a successful factorization of n = pq. If it is assumed that all of the combinations in Table 4.5.1.1 are equally likely, then $r^2 \equiv t^2 \pmod{n}$ implies that either $\gcd(pq, r + t)$ or $\gcd(pq, r - t)$ gives a nontrivial factor of n = pq with probability 2/3.

Table 4.5.1.1 Summary of the possible divisibility scenarios

Possible Divisibility Scenarios                GCD Results
p | (r+t)  p | (r-t)  q | (r+t)  q | (r-t)    gcd(pq, r+t)  gcd(pq, r-t)  Successful Factorization
No         Yes        No         Yes          1             pq
No         Yes        Yes        No           q             p             *
No         Yes        Yes        Yes          q             pq            *
Yes        No         No         Yes          p             q             *
Yes        No         Yes        No           pq            1
Yes        No         Yes        Yes          pq            q             *
Yes        Yes        No         Yes          p             pq            *
Yes        Yes        Yes        No           pq            p             *
Yes        Yes        Yes        Yes          pq            pq
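The role of the two gcd computations can be illustrated with a short Python sketch. It is an illustration only; the function name and the small test modulus n = 8051 = 83 × 97 (with 90^2 - 7^2 = 8051) are assumptions made for this example.

```python
from math import gcd


def factor_from_squares(n, r, t):
    """Given r^2 = t^2 (mod n), try gcd(r + t, n) and gcd(r - t, n) for a proper factor."""
    if (r * r - t * t) % n != 0:
        raise ValueError("r^2 and t^2 are not congruent modulo n")
    for g in (gcd(r + t, n), gcd(r - t, n)):
        if 1 < g < n:
            return g   # one of the starred rows of the table
    return None        # both gcds are 1 or n: an unsuccessful row


if __name__ == "__main__":
    print(factor_from_squares(8051, 90, 7))     # successful scenario
    print(factor_from_squares(8051, 8058, 7))   # r = t (mod n): trivial relation
```

The second call shows an unsuccessful scenario: when r ≡ t (mod n), one gcd equals n and the other equals 1, so no factor is isolated.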

In the GNFS algorithm there are two free parameters that must be chosen. Both free variables will be used along with a composite integer n that is to be factored. The first parameter is a polynomial $f : \mathbb{R} \to \mathbb{R}$ with integer coefficients, and the second parameter is a natural number $m \in \mathbb{N}$ that satisfies $f(m) \equiv 0 \pmod{n}$.

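Theorem 4.5.1 : Given a polynomial f(x) with integer coefficients, a root $\theta \in \mathbb{C}$, and an $m \in \mathbb{Z}/n\mathbb{Z}$ such that $f(m) \equiv 0 \pmod{n}$, there exists a unique mapping $\phi : \mathbb{Z}[\theta] \to \mathbb{Z}/n\mathbb{Z}$ satisfying

1. $\phi(ab) = \phi(a)\phi(b)$ for all $a, b \in \mathbb{Z}[\theta]$

2. $\phi(a + b) = \phi(a) + \phi(b)$ for all $a, b \in \mathbb{Z}[\theta]$

3. $\phi(1) \equiv 1 \pmod{n}$

4. $\phi(\theta) \equiv m \pmod{n}$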

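One can apply this theorem to obtain a difference of squares congruence in the following way: suppose there exists a finite set U of pairs of integers (a, b) such that

$\prod_{(a,b) \in U} (a + b\theta) = \beta^2$ and $\prod_{(a,b) \in U} (a + bm) = y^2$

for $\beta \in \mathbb{Z}[\theta]$ and $y \in \mathbb{Z}$. Let $x = \phi(\beta)$. Then, working modulo n,

$x^2 \equiv \phi(\beta)\phi(\beta)$

$\equiv \phi(\beta^2)$

$\equiv \phi\left(\prod_{(a,b) \in U} (a + b\theta)\right)$

$\equiv \prod_{(a,b) \in U} \phi(a + b\theta)$

$\equiv \prod_{(a,b) \in U} (a + bm)$

$\equiv y^2$

Thus a relation $x^2 \equiv y^2 \pmod{n}$ has been created and there is a probability of 2/3 that this will lead to a factorization of n.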

4.5.2 Special Number Field Sieve Algorithm

The Special Number Field Sieve is one of the Number Field Sieve algorithms. It is a special-purpose integer factorization algorithm, and the General Number Field Sieve was derived from it. This algorithm is efficient for integers of the form $r^e \pm s$, where r and s are small. In particular, it is ideal for factoring Fermat numbers. The running time of the Special Number Field Sieve is

$O\left(\exp\left(\left(\tfrac{32}{9} \ln n\right)^{1/3} (\ln \ln n)^{2/3}\right)\right) = L_n\left[1/3, (32/9)^{1/3}\right]$

The Special Number Field Sieve is based on the simpler rational sieve. This means that one has to know about the rational sieve before studying this algorithm. The SNFS proceeds as follows;

Let n be the integer we want to factor. The algorithm can be broken into two steps;

a. Find a large number of multiplicative relations among a factor base of elements of $\mathbb{Z}/n\mathbb{Z}$; the number of relations must be larger than the number of elements in the factor base.

b. Then, multiply together subsets of these relations in such a way that all the exponents are even, resulting in congruences of the form $a^2 \equiv b^2 \pmod{n}$. These in turn immediately lead to factorizations of n: $n = \gcd(a + b, n) \times \gcd(a - b, n)$.

Let us say we set the factor base as

$F = \{ a_j = \phi(j) \in \mathbb{Z}/n\mathbb{Z} : j \in S \}$

where

$S_1 = \{ p \in \mathbb{Z} : p \text{ is prime and } p \le B \}$,

$S_2 = \{ u_j : j = 1, 2, \ldots, r_1 + r_2 - 1, \text{ where } u_j \text{ is a generator of } U_F \}$,

$S_3 = \{ \rho = a + b\alpha \in \mathbb{Z}[\alpha] : N_F(\rho) = p < B_2 \text{ where } p \text{ is prime} \}$

We may assume $\gcd(a_j, n) = 1$ for all $j \in S$, since otherwise we have a factorization of n and the algorithm terminates.

This algorithm is very efficient for numbers of the form $r^e \pm s$, for r and s relatively small. It is also efficient for any integers which can be represented as a polynomial with small coefficients. This includes integers of the more general form $a r^e \pm b s^f$, and also many integers whose binary representation has low Hamming weight. The reason for this is that the Number Field Sieve performs sieving in two different fields. The first field is usually the rationals. The second is a higher degree field. The efficiency of the algorithm strongly depends on the norms of certain elements in these fields.

CHAPTER 5

GENERAL NUMBER FIELD SIEVE IN THE RSA

5.1 Factorization Record using GNFS

Factoring a large number is very hard. Unfortunately for algorithm designers, it is getting easier. Today's factoring algorithms work by finding distinct square roots of the same value modulo the number to be factored. In 1999, the record was a 155-digit RSA challenge number, which was factored by a team of mathematicians. Then, a team of researchers at the University of Bonn established a record in the art of factoring general integers into primes on 18 January 2002 (http://en.wikipedia.org/wiki/General_number_field_sieve, 2004). This has implications for public-key cryptography. Using a new implementation of the general number field sieve (GNFS), they factored a 158-digit divisor of $2^{953} - 1$, establishing a record for the factorization of general integers without small divisors into primes. A new record was later set by Bahr F., Boehm M., Franke J., and Kleinjung T. with RSA-640 of the RSA Factoring Challenge (RSA-640).

The General Number Field Sieve is the asymptotically fastest algorithm for factoring large integers (http://en.wikipedia.org/wiki/General_number_field_sieve, 2004). Its runtime depends on a good choice of a polynomial pair. GNFS is an improvement on earlier sieve methods. The algorithm uses ideas from diverse fields of mathematics such as algebraic number theory, graph theory, finite fields, and linear algebra. One of the improvements is that the polynomial being used is not limited to quadratics, but may be cubic or of even higher degree. An implementation of the number field sieve can be written such that the sieving process takes place on several computers. This algorithm has been used for the polynomial selection stage of the factorization of many numbers. The largest number whose factorization has been completed and where a number of similar size has been factored using the original Montgomery-Murphy method is a composite 143-digit factor of $2^{1064} + 1$.

5.2 RSA Number (Factoring Challenge)

In mathematics, the RSA numbers are a set of large composite numbers that are part of the RSA Factoring Challenge. RSA Laboratories sponsors the RSA Factoring Challenge to encourage research into computational number theory and the practical difficulty of factoring large integers. The results can help users of the RSA public-key cryptography algorithm to choose a suitable key length. The prizes for this challenge were awarded according to a complicated formula. The original numbers were named according to their number of decimal digits, so RSA-200 was a two-hundred-digit number. The unfactored challenge numbers were later removed from the prize list and replaced with a new set of numbers. At this stage, the naming convention was also changed so that the trailing number would indicate the number of digits in the binary representation of the number. Hence RSA-576 has 576 binary digits, which translates to 174 digits in decimal.
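The conversion between binary and decimal lengths can be checked with a short Python sketch. It is an illustration only; the function name is an assumption, and the sketch does not use the actual RSA challenge numbers, only the smallest and largest integers with the given number of bits.

```python
def decimal_digit_range(bits):
    """Return (fewest, most) decimal digits that an integer with exactly `bits` bits can have."""
    smallest = 1 << (bits - 1)   # 2^(bits-1), the smallest number with that many bits
    largest = (1 << bits) - 1    # 2^bits - 1, the largest number with that many bits
    return len(str(smallest)), len(str(largest))


if __name__ == "__main__":
    for bits in (576, 640):
        print(bits, decimal_digit_range(bits))
```

For 576 and 640 bits both ends of the range agree, so every number with that many binary digits has the same number of decimal digits.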

Table 5.2.1 The recorded RSA Numbers factored, with the algorithm used.

Number    Digits  Factored           Algorithm
RSA-100   100     April 1991         Quadratic Sieve
RSA-110   110     April 1992         Quadratic Sieve
RSA-120   120     June 1993          Quadratic Sieve
RSA-129   129     April 1994         Multiple Polynomial Quadratic Sieve
RSA-130   130     April 10, 1996     Number Field Sieve
RSA-140   140     February 2, 1999   General Number Field Sieve
RSA-150   150     April 6, 2004      General Number Field Sieve
RSA-155   155     August 22, 1999    General Number Field Sieve
RSA-160   160     April 1, 2003      General Number Field Sieve
RSA-200   200     May 9, 2005        General Number Field Sieve
RSA-576   174     December 3, 2003   General Number Field Sieve
RSA-640   193     November 4, 2005   General Number Field Sieve

RSA-129 was used by R. Rivest, A. Shamir, and L. Adleman to publish one of the first public-key messages, together with a $100 reward for the message's decryption. RSA-129 was factored in 1994 using a distributed computing approach involving roughly 600 computers (Nigel, 2003). The factorization was done using the Multiple Polynomial Quadratic Sieve (MPQS). On April 10th, 1996, the RSA-130 number of the RSA Factoring Challenge was factored by Lenstra. Then, on February 2nd, 1999, RSA-140 was factored by Herman J.J. te Riele. A few years later, on December 3rd, 2003, RSA-576 (174 decimal digits) was factored using the General Number Field Sieve by Franke J. et al. This factorization yielded two 87-digit prime factors.

In November 2005, the number RSA-640 of the RSA Factoring Challenge was factored by Bahr F., Boehm M., Franke J. and Kleinjung T. This factorization took five months on 80 2.2 GHz AMD Opteron CPUs using the GNFS. The challenge was open to anyone clever and persistent enough to factor the next challenge numbers in the series, namely RSA-704, RSA-768, RSA-896, RSA-1024, RSA-1536 and RSA-2048, which carried awards from $30,000 to $200,000.

5.3 Attacks on RSA

An attack on a cryptosystem is an attempt to decrypt an encrypted message without knowledge of the key. There are four basic sorts of attack: ciphertext only, known plaintext, chosen plaintext and encryption key. The first three are relevant to symmetric cryptosystems. Of these three, the ciphertext-only attack is clearly the most difficult for the cryptanalyst, while the chosen-plaintext attack is the easiest. In the fourth, the attacker knows the encryption key, as is always the case in a public-key system such as RSA. If the collection of all possible messages is known, and is relatively small, then the attacker need only encrypt all the messages until a match is found.

For example, if the message is known to be either 'yes' or 'no', then only one encryption needs to be computed to determine the plaintext. Therefore, especially in the case of a small message, the message should be padded by adding random bits at the front and/or back.
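The guessing attack described above can be sketched as follows. This is a minimal illustration in Python, not a procedure taken from the cited sources. The toy key (the Mersenne primes 2^31 - 1 and 2^61 - 1 with exponent 65537), the helper names and the padding layout are all assumptions made for the example, and the textbook encryption shown is deliberately unpadded.

```python
# Toy public key, for illustration only (assumed values, far too small for real use).
P = 2**31 - 1            # Mersenne prime
Q = 2**61 - 1            # Mersenne prime
MODULUS = P * Q
EXPONENT = 65537


def to_int(text):
    """Encode a short ASCII message as an integer."""
    return int.from_bytes(text.encode("ascii"), "big")


def encrypt(value):
    """Textbook (unpadded) RSA encryption with the public key."""
    return pow(value, EXPONENT, MODULUS)


def guess_message(ciphertext, candidates):
    """Encrypt every candidate with the public key and compare with the ciphertext."""
    for text in candidates:
        if encrypt(to_int(text)) == ciphertext:
            return text
    return None


def pad(value, random_bits):
    """Hypothetical padding: place random bits in front of a message of at most 24 bits."""
    return (random_bits << 24) | value


if __name__ == "__main__":
    import secrets

    plain = encrypt(to_int("no"))
    print(guess_message(plain, ["yes", "no"]))        # the guess succeeds
    padded = encrypt(pad(to_int("no"), secrets.randbits(32)))
    print(guess_message(padded, ["yes", "no"]))       # the guess no longer matches
```

With padding, the attacker would also have to guess the random bits, so the number of trial encryptions grows with the length of the padding.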

5.3.1 Cracking the RSA encryption system

The most obvious way to crack the RSA system is to use the recipient's public key, the numbers m and n (where m = pq is the modulus and n is the encryption exponent), to find the recipient's private key k. In fact, you do not need to find exactly the private key used by the recipient. All you need is any number k' such that (p-1)(q-1) is a factor of the number k'n - 1. Given any such number k', you can decrypt an encrypted number by raising it to the power k' mod m.
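The claim that any such k' decrypts correctly can be checked on a small example. The sketch below is in Python and uses the notation of this section (modulus m, public exponent n, private key k). The toy primes 61 and 53, the exponent 17 and the function names are assumptions chosen for illustration.

```python
p, q = 61, 53                 # toy primes (assumed for the example)
m = p * q                     # modulus, 3233
n = 17                        # public exponent
phi = (p - 1) * (q - 1)       # (p-1)(q-1) = 3120


def is_valid_private_key(k_prime):
    """True when (p-1)(q-1) is a factor of k'n - 1."""
    return (k_prime * n - 1) % phi == 0


def decrypt(ciphertext, k_prime):
    """Raise the encrypted number to the power k' mod m."""
    return pow(ciphertext, k_prime, m)


k = pow(n, -1, phi)           # modular inverse of n (Python 3.8+)
k_alt = k + phi               # a different number meeting the same requirement

if __name__ == "__main__":
    message = 65
    ciphertext = pow(message, n, m)
    print(k, k_alt)
    print(decrypt(ciphertext, k), decrypt(ciphertext, k_alt))
```

Both exponents recover the message, which illustrates why an attacker is not restricted to the recipient's own value of k.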

In principle, there could be a way to find a private key k' directly, without having to find the primes p and q first. In practice, there is no known strategy for finding a private key k' directly that would be faster than simply factoring the number m, finding the primes p and q, and finding k' using p and q. On the other hand, if the primes p and q are chosen large enough, say, 100 digits each, then m will be a 200-digit number that cannot be factored within any reasonable amount of time. Much effort has been expended to try to factor large numbers like m quickly. The record reported by Mollin (2003) is the factoring of a 155-digit number using nearly 300 computers and taking about four months.
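The route through factoring can be sketched in Python for a modulus small enough to factor by trial division. The function names are hypothetical, trial division stands in for the far stronger methods discussed in this dissertation, and the modulus 2461 = 23 * 107 with public exponent 5 is an assumed toy key.

```python
def factor_semiprime(m):
    """Return (p, q) with p * q == m, found by trial division."""
    if m % 2 == 0:
        return 2, m // 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return d, m // d
        d += 2
    raise ValueError("m has no nontrivial factor")


def recover_private_key(m, n):
    """Factor the modulus m, then invert the public exponent n modulo (p-1)(q-1)."""
    p, q = factor_semiprime(m)
    return pow(n, -1, (p - 1) * (q - 1))   # Python 3.8+


if __name__ == "__main__":
    m, n = 2461, 5
    k = recover_private_key(m, n)
    ciphertext = pow(1234, n, m)
    print(k, pow(ciphertext, k, m))
```

The loop runs on the order of the square root of m steps, which is what makes this approach hopeless for a 200-digit modulus.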

The size of these numbers assures the security of RSA encryption today. Note, however, that no one knows for sure that factoring a 200-digit number requires so much computing power and time. It is possible that someone could find a revolutionary new approach that can factor a 200-digit number in a few hours or even minutes. We do not think this will happen, because too many brilliant people have already given their best efforts to this challenge. Still, sudden surprising advances in mathematics do occur. If factoring large numbers became feasible, it would immediately cripple all Internet activities such as e-commerce that rely on secure communication.

CHAPTER 6

CONCLUSION AND SUGGESTIONS FOR FURTHER WORKS

6.1 Conclusion

Pollard's p - 1 and Pollard's Rho methods were good tools for studying factorization and some cryptography. No huge breakthroughs were found, but the methods were interesting to examine, and they gave us guidelines to follow in order to choose a strong public key. Pollard's Rho method shows that p and q should be large, and the p - 1 method shows that p - 1 and q - 1 should each have a large prime factor, where p * q = n. These two guidelines protect the public key from two easy attacks on the RSA cryptosystem.
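The guideline on p - 1 can be illustrated with a short Python sketch. It uses the common factorial form of Pollard's p - 1 method, which differs in detail from the prime-power form used in the Mathematica notebook of Appendix E. The function name and the test numbers 2461 = 23 * 107 and 5029 = 47 * 107 are assumptions chosen for the example.

```python
from math import gcd


def pollard_p_minus_1(n, bound):
    """Try to split n using exponents 2!, 3!, ..., bound!; return a factor or None."""
    a = 2
    for j in range(2, bound + 1):
        a = pow(a, j, n)              # a is now 2^(j!) mod n
        d = gcd(a - 1, n)
        if 1 < d < n:
            return d
    return None


if __name__ == "__main__":
    # 23 - 1 = 2 * 11, so a bound of 11 is enough to expose the factor 23.
    print(pollard_p_minus_1(2461, 11))
    # 47 - 1 = 2 * 23 and 107 - 1 = 2 * 53, so the same bound fails here.
    print(pollard_p_minus_1(5029, 11))
    print(pollard_p_minus_1(5029, 23))
```

The bound needed grows with the largest prime factor of p - 1, which is why primes whose p - 1 has a large prime factor resist this attack.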

Another factoring algorithm, since supplanted by the Number Field Sieve, is the Quadratic Sieve. It is the fastest known algorithm for numbers less than about 110 digits long and has been used extensively. A faster version of this algorithm is the Multiple Polynomial Quadratic Sieve. Another good method is the elliptic curve method. According to RSA Laboratories, however, the best factoring method for large numbers is the General Number Field Sieve, although it is very complex. Researchers continue to devise more sophisticated factoring algorithms, and each improvement further threatens the security of RSA.

6.2 Suggestions for further work

In further work, I would spend more time studying attacks on the RSA cryptosystem. Time constraints prevented me from studying the factoring algorithms in depth. For further work I would study the following matters in detail:

i. to study the General Number Field Sieve algorithm by implementing it in a program.

ii. to study the Special Number Field Sieve algorithm by implementing it in a program.

iii. to identify a new factoring algorithm for the RSA cryptosystem.

iv. to study the Elliptic Curve factoring algorithm in the RSA cryptosystem.

REFERENCES

BOOK:

Burton D. M. (2002), Elementary Number Theory, McGraw-Hill, New York.

Johannes B. A. (2000), Introduction to Cryptography, Springer-Verlag, New York.

Mollin R.A. (2003), RSA and Public-key Cryptography, Chapman & Hall/CRC, New York.

Neal K. (1997), Algebraic Aspects of Cryptography, Springer-Verlag, Berlin.

Nigel S. (2003), Cryptography: An Introduction, McGraw-Hill, New York.

Randall K. N. (1999), ICSA Guide to Cryptography, McGraw-Hill, New York.

Schneier B. (1996), Applied Cryptography, John Wiley & Sons, Inc., New York.

Stinson D.R. (2006), Cryptography Theory and Practice, Chapman & Hall/CRC, New York.

Trappe W., Washington L.C. (2002), Introduction to Cryptography with Coding Theory, Prentice Hall, New Jersey.

WEB PAGE:

Algorithm [Online] [Accessed 26th Dec 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Algorithm

Connelly Barnes (2004) Integer Factorization Algorithms [Online] [Accessed 22nd Nov 2007] Available from World Wide Web: http://oregonstate.edu/~barnesc/documents/factoring.pdf

Cryptography Watch [Online] [Accessed 24th Jan 2008] Available from World Wide Web: http://crypnet.net/people/vab/blogs/cryptowatch/category/factoring

General Number Field Sieve - From Wikipedia, an online encyclopedia. [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/General_number_field_sieve

Integer factorization - Difficulty and complexity - From Wikipedia, an online encyclopedia. [Accessed 24th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Integer_factorization

Integer factorization Record - From Wikipedia, an online encyclopedia. [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Integer_factorization_records

Primality Test [Online] [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Primality_test

Prime Number [Online] [Accessed 27th Nov 2007] Available from World Wide Web: http://simple.wikipedia.org/wiki/Prime_number

Quadratic Sieve [Online] [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Quadratic_sieve

RSA numbers - From Wikipedia [Online] [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/RSA_numbers

Special Number Field Sieve - From Wikipedia, an online encyclopedia. [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Special_number_field_sieve

Weisstein, Eric (2002) Pollard p - 1 Algorithm [Online] [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Pollard%27s_p-1_algorithm

Weisstein, Eric (2002) Pollard Rho Algorithm [Online] [Accessed 27th Nov 2007] Available from World Wide Web: http://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm

Weisstein, Eric (2002) Rabin-Miller Strong Pseudoprime Test [Online] [Accessed 29th Nov 2007] Available from World Wide Web: http://mathworld.wolfram.com/Rabin-MillerStrongPseudoprimeTest.html

Wolfram Mathematica - Documentation Centre [Online] [Accessed 28th Nov 2007] Available from World Wide Web: http://reference.wolfram.com/mathematica/ref/GCD.html

Wolfram Mathematica - Documentation Centre [Online] [Accessed 28th Nov 2007] Available from World Wide Web: http://reference.wolfram.com/mathematica/guide/NumberTheory.html

JOURNALS

Daniel J.B. and A.K. Lenstra (2002), A General Number Field Sieve Implementation [Online], Accessed 9th December 2007, Available from World Wide Web: http://springerlink.metapress.com/content/?k=factorization+prime+number

Daniel N. (2004), 2 RSA and Probabilistic Prime Number Tests, Journal of Cryptology [Online] Accessed 9th December 2007. Available from World Wide Web: http://springerlink.metapress.com/content/

Kazumaro Aoki, et al (2002), A Kilobit Special Number Field Sieve Factorization [Online], Accessed 1ih December 2007. Available from World Wide Web: http://www.eprint.iacr.org/2007/205.pdf

Peter M (2007), Application of Number : Codes and Public Key Cryptography [Online], Accessed 1ih December 2007. Available from World Wide Web: http://commerce.metapress.com/content/n753j0091lu 13q41

THESIS

Matthew E.B. (1998), An Introduction to the General Number Field Sieve , Master of Science in Mathematics thesis, Blacksburg, Virginia.

ATTACHMENT

Appendix A

Factorization Result

RSA-120 =
  32741455569349801575114630374914188063642403240171463406883
  * 693342667110830181197325401899700641361965863127336680673013

RSA-129 =
  3490529510847650949147849619903898133417764638493387843990820577
  * 3276913299326670954996198819083446141377642967992942539798288533

RSA-130 =
  39685999459597454290161126162883786067576449112810064832555157243
  * 45534498646735972188403686897274408864356301263205069600999044599

RSA-140 =
  3398717423028438554530123627613875835633986495969597423490929302771479
  * 6264200187401285096151654948264442219302037178623509019111660653946049

RSA-576 =
  398075086424064937397125500550386491199064362342526708406385189575946388957261768583317
  * 472772146107435302536223071973048224632914695302097116459852171130520711256363590397527

RSA-640 =
  1634733645809253848443133883865090859841783670033092312181110852389333100104508151212118167511579
  * 1900871281664822113126851573935413975413975471896789968515493666638539088027103802104498957191261465571

Appendix C

Proving Primality and Compositeness function.

Function (Syntax)                          Description
PrimeQ[expr]                               tests whether a number is prime
Prime[n]                                   gives the nth prime number
ProvablePrimeQ[n]                          gives True if n can be proved to be prime, and False if n can be proved to be composite
ProvablePrimeQ[n, "Certificate" -> True]   prints a certificate that can be used to verify the result
PrimeQCertificate[n]                       prints a certificate that n is prime or that n is composite
PrimeQCertificateCheck[cert, n]            verifies that the certificate cert proves the primality or compositeness of n

Appendix D

Pollard rho and Pollard p-1 functions

Syntax                          Description
n                               defined as a composite number
f[v_]                           defines a function that takes any single argument
Log[n]                          gives the natural logarithm of n
GCD[b, n]                       gives the greatest common divisor of these two integers
While                           evaluates an expression while a criterion is true
PrimeQ                          tests whether an integer is prime
PrimeQ[t]                       yields True if t is a prime number, and False otherwise
PowerMod[b, p^l, n]             gives b^(p^l) mod n
Floor[Log[n]/Log[p]]            gives the greatest integer not exceeding Log[n]/Log[p]
Mod[Floor[Log[n]/Log[p]], n]    gives the remainder on division of Floor[Log[n]/Log[p]] by n

Appendix E

Pollard's p-1.nb

n = 143;
Timing[r = 0; b = 2; g = GCD[b, n]; c = 1;
  If[g >= 2, {Print[g]}, {t = 1,
    While[g < 2, {p = 1,
      (* advance t to the next prime p; r counts the candidates tried *)
      While[p < 2, {++t, If[PrimeQ[t], p = t, p = 1], r++}],
      (* p^l is the largest power of p not exceeding n *)
      l = Mod[Floor[Log[n]/Log[p]], n],
      b = PowerMod[b, p^l, n], g = GCD[b - 1, n], c++}]}]]
Print[g]
Print[c]
Print[r]

{0. Second, {1, Null}}

13

3

2

n = 2461;
Timing[r = 0; b = 2; g = GCD[b, n]; c = 1;
  If[g >= 2, {Print[g]}, {t = 1,
    While[g < 2, {p = 1,
      (* advance t to the next prime p; r counts the candidates tried *)
      While[p < 2, {++t, If[PrimeQ[t], p = t, p = 1], r++}],
      (* p^l is the largest power of p not exceeding n *)
      l = Mod[Floor[Log[n]/Log[p]], n],
      b = PowerMod[b, p^l, n], g = GCD[b - 1, n], c++}]}]]
Print[g]
Print[c]
Print[r]

{0.01 Second, {1, Null}}

23

6

10

Appendix F

Pollard's rho.nb

n = 6;
(* f advances the first component by one step and the second by two steps *)
f[v_] := {Mod[v[[1]]^2 + 1, n], Mod[(v[[2]]^2 + 1)^2 + 1, n]}
g[v_] := GCD[v[[1]] - v[[2]], n]
w = f[{2, 5}]; a = 1;
Timing[Do[t = g[w]; ++a; If[t > 1, Break[]]; w = f[w], {i, 10}]]
Print[t]
Print[a]

{0. Second, Null}

6

2

n = 35;
(* f advances the first component by one step and the second by two steps *)
f[v_] := {Mod[v[[1]]^2 + 1, n], Mod[(v[[2]]^2 + 1)^2 + 1, n]}
g[v_] := GCD[v[[1]] - v[[2]], n]
w = f[{2, 5}]; a = 1;
Timing[Do[t = g[w]; ++a; If[t > 1, Break[]]; w = f[w], {i, 10}]]
Print[t]
Print[a]

{0. Second, Null}

7

2

n = 143;
(* f advances the first component by one step and the second by two steps *)
f[v_] := {Mod[v[[1]]^2 + 1, n], Mod[(v[[2]]^2 + 1)^2 + 1, n]}
g[v_] := GCD[v[[1]] - v[[2]], n]
w = f[{2, 5}]; a = 1;
Timing[Do[t = g[w]; ++a; If[t > 1, Break[]]; w = f[w], {i, 10}]]
Print[t]
Print[a]

{0. Second, Null}

143

4
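The run for n = 143 ends with the divisor 143 itself, so it does not split the number. The following Python sketch (an illustration only; the function name and default arguments are assumptions, not part of the notebook) reproduces that behaviour with the starting pair (2, 5) and shows that the usual Floyd pairing, where both sequences start from the same value 2, does find a proper factor with the same polynomial v^2 + 1.

```python
from math import gcd


def pollard_rho(n, x0=2, y0=2, c=1):
    """Return a proper divisor of n, or None if this run only yields n itself."""
    def f(v):
        return (v * v + c) % n

    x, y = x0, y0
    while True:
        x = f(x)               # one step
        y = f(f(y))            # two steps
        d = gcd(x - y, n)
        if d == n:
            return None        # the sequences collided modulo every factor at once
        if d > 1:
            return d


if __name__ == "__main__":
    print(pollard_rho(143, y0=5))   # same starting pair as the notebook
    print(pollard_rho(143))         # both sequences start from 2
    print(pollard_rho(35, y0=5))
```

When a run returns None, a common remedy is to retry with a different constant c or a different starting value.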
