Primality Testing A survey of techniques

Harish G. Rohan Ramanath Department of Computer Science & Engineering Department of Computer Science & Engineering R.V. College of Engineering R.V. College of Engineering Bangalore, India. Bangalore, India. [email protected] [email protected]

Abstract — This expository paper briefly surveys the history where s has to be regarded as a complex variable and the of testing whether a number is prime. The recently discovered product is taken over all primes. The series, and so the deterministic time due to Agrawal, product, converges absolutely for real( s) > 1. The identity Kayal and Saxena is presented and some improvements are between the series and the product is the analytic version of briefly discussed. the unique factorization of integers, and provides another Keywords— primality tests, polynomial time, P, NP proof for the existence of infinitely many prime numbers which is due to Euler: assuming that there are only finitely I. INTRODUCTION many primes, the product converges throughout the complex plane, contradicting the fact that the series reduces for s = 1 to Prime numbers are studied in but they the divergent harmonic series. occur in many other subfields of mathematics. In the last few The Riemann hypothesis claims that the complex zeros of decades, prime numbers entered the real world in many (s) all lie on the so-called critical line Re s = 1/2 in the applications, e.g. as generator for keys in modern complex plane. This famous conjecture was stated by cryptographic algorithms. Riemann in 1859 and is still unproved. If the Riemann An integer n > 1 is called prime if it has no other positive hypothesis is true, the error term in the theorem than 1 and itself (within the set of integers); otherwise 1/2 n is said to be composite . is as small as possible, namely ~ x log x, and so the prime Every integer has a unique factorization into powers of numbers are distributed as uniformly as possible [7]. distinct prime numbers. Euclid was the first who proved that II. BACKGROUND there are infinitely many primes. His proof is that if p1, p2, …, It is easy to check that 97 is prime and 99 is not, but it pm are prime, then the number seems much harder to answer the same question for the q = p 1 × … × pm + 1 numbers 10 000 000 000 097 and 10 000 000 000 099, at least is not divisible by any of the p js. Thus q has a prime in the same time. Indeed, a fundamental problem in number different from p 1, …, pm (one of which can be q itself). This construction of a new prime number out of an arbitrary finite theory is the decision problem collection of given primes implies the infinitude of prime Primes: given a positive integer n, decide whether n is numbers. Other proofs of this basic fact are in Ribbenboim [7]. prime or not! The celebrated prime number theorem gives information This problem became very important by developments in on how the primes are distributed. On the first view the prime cryptography in the late 1970s. It is easy to multiply two large numbers seem to appear in the sequence of positive integers prime numbers but it is much harder to factor a given large without any visible rule. However, as conjectured by Gauss integer; at least there are no factoring algorithms of satisfying and first proved by Hadamard and de la Vallée-Poussin speed known so far. This simple observation led to so-called (independently) on the base of outstanding contributions due public key cryptosystems where the key, a large integer N of to Riemann, they satisfy a distribution law. Roughly speaking, about two hundred digits, is public knowledge (as the the number π(x) of primes less than or equal to x is: telephone number) but its prime factorization is the secret of its owner. This idea is attackable if N splits into small primes, (1) but if N is the product of two (carefully chosen) primes with = + errorterm The appearing logarithmic integral is asymptotically about hundred digits, the factorization of N is a nearly equal to x/ log x, where log x is the natural logarithm. The error unsolvable task with present day computers [4]. For term in the prime number theorem is small in comparison with generating such keys one needs to find large prime numbers x/ log x and is closely related to the zero distribution of the or, in other words, one needs to have a fast primality test, Riemann zeta-function. where fast means that the running time depending on the size of the number to be tested is small . Notice that a factoring (2) algorithm and a primality test are different things: a number n ζ = = 1 −

1

can fail a primality test and the test does not tell us any of its expectation value for the number of Mersenne primes M p with divisors, whereas a factoring algorithm gives the complete p ≤ x is factorization of n. 1 1 1 log log One of the first ideas for testing a given number n of being ∼ ∼ log 2 − 1 log 2 log 2 prime might be trial division, i.e., to try all positive integers ≤ which tends with x to infinity; the last asymptotic identity whether they divide n or not. Obviously, if there is no n relies on taking the logarithm in Equation (2). Note that this divisor of n among them, then n is prime. This strategy is not fits pretty well to the number of detected Mersenne primes. very useful if n is large. For example, it would take about 10 50 10 arithmetic operations to test an integer with 100 digits. If 10 III. DESIRED CHARACTERISTICS OF PRIMALITY TESTS operations can be performed by a computer within one second, then this test would take about 10 40 seconds, which is still A. Generality much more than 12 billion years, the estimated age of the There are many fast primality tests but they work for Universe. However, hypothetical quantum computers that are numbers with only certain properties. For example, the Lucas– computers which compute with quantum states, if once Lehmer test for Mersenne numbers can only be applied for realized, would solve this factorization problem within a Mersenne numbers. Pépin's test works only for Fermat fraction of a second. The simple idea of trial division leads to numbers. We would like an algorithm that can be used to test the sieve of Eratosthenes (due to the ancient Greek any general number for primality. Eratosthenes who was the first to measure approximately the circumference of the Earth 250 B.C.). If one deletes out of a B. Polynomial time in input size list of integers 1 < n ≤ x all multiples n of the primes p ≤ , The maximum running time of the algorithm can be then only the prime numbers in between and x remain. This x expressed as a polynomial over the number of digits in the gives a list of all primes under a given magnitude. Moreover, target number. Certain algorithms can deterministically we obtain the factorizations of all integers in the list. For a determine whether a given number is prime or not, but their primality test, this is a lot of superfluous information and we running time is not polynomial for all inputs. Algorithms like might ask for faster algorithms for detecting primes. ECPP (Elliptic Curve Primality Proving) and APR (Adleman- For some special numbers, primality tests of satisfying Pomerance-Rumely) deterministically prove primality, but speed are known for quite a long time. For instance, the they do not have polynomial running time for all inputs. We Mersenne numbers, invented by the monk Mersenne in 1644, would like an algorithm that runs in polynomial time for all p are defined by M p = 2 − 1, where p ≥ 3 is prime. It is easily possible inputs. seen that composite exponents cannot produce primes of this form. In 1750, Euler corrected Mersenne’s erroneous list of C. Deterministic numbers by use of the following criterion: if The algorithm guarantees to deterministically distinguish p is a prime number of the form p = 4k + 3, then q = 2p + 1 is whether the target number is prime or composite. There are a divisor of M p if and only if q is prime; primes of the form certain randomized algorithms like Solovay-Strassen and 2p+1 for prime p are called Sophie Germain-primes. For Miller-Rabin, that can test any given number for primality in example, M 11 = 2047 = 23 × 89 is not prime as it was stated by polynomial time, but they may produce some false positives Mersenne. In 1878, Lucas found a simple and fast primality also. We would like an algorithm that can “prove” or test for Mersenne numbers (but only in 1935 Lehmer gave the “disprove” primality with certainty. first proof of the underlying mathematical theorem). His algorithm makes use of the congruence calculus and the test D. Unconditional (called Lucas-Lehmer test) can be described as follows: The Miller test for primality is fully deterministic and runs ≥ Input: a prime p 3. Output: Mp is prime or composite. in polynomial time over all inputs, but its correctness depends 1. Put s = 4. 2 on the truth of the yet-unproven generalized Riemann 2. For j from 3 to p do s := s − 2 mod M p. hypothesis. We would like an algorithm that is not based on 3. If s = 0, return prime; otherwise return composite. any unproven hypothesis. Crandall et. al. give a proof [4]. The first iterations (without reducing modulo Mp) are IV. KINDS OF ALGORITHMS AND COMPLEXITY CLASSES s = 4 → 14 = 2 × 7 → 194 → 37 634 = 2 × 31 × 607, A. Monte Carlo which yields the first two Mersenne primes M 3 = 7 and M 5 = 31. The world record among prime numbers, i.e., the largest For any x which does not belong to the language L, the known prime number, is a Mersenne prime, namely M20996011 algorithms that belong to this class will conclusively say that x = 2 20996011 − 1. This number has more than six million digits ∉ L. For any x which belongs to the language L (class of and was found in November 2003. It is an open question prime numbers with certain properties), these algorithms say whether there are infinitely many Mersenne primes. With a bit that x ∊ L with at least 0.5 probability. This type of behavior is heuristics we can be optimistic. We may interpret the prime termed as “x is accepted with one-sided error”. The class of number theorem, shown in Equation (1) as follows: a positive languages with polynomial time Monte Carlo recognition integer n is prime with probability 1/log n. Then, the algorithms is termed as RP .

2

B. Atlantic City theorem to : a positive integer n > 1 is prime if For any x, algorithms that belong to this class will and only if = (3) say either x ∊ L or x ∉ L with a probability of 0.75. This is x + 1 + 1 termed as “x is accepted with two-sided error”. The class of in the ring of polynomials with coefficients from Z/nZ. For languages with polynomial time Atlantic City recognition example, the n = 561 leads to the algorithms is termed as BPP . polynomial (x + 1) 561 = (x561 + … + 51x 11 + … + 1) mod 561. C. Las Vegas The proof makes only use of Fermat’s little theorem ap-1≡1 It can be defined as a combination of two algorithms: a (mod p) and divisibility properties of binomial coefficients. Monte Carlo algorithm for L and a Monte Carlo algorithm for However, this characterization would not give a polynomial (complement of L). These classes of algorithms conclusively time primality test since for testing n, one has to compute Lsay whether x L or x L, but the running time is about n coefficients for the polynomial on the left hand side of probabilistic. The∊ class of languages∉ with polynomial time Las (3). Agrawal et. al. replaced the polynomial identity (3) by a Vegas algorithms is termed as ZPP . Both Strassen-Solvay and set of weaker congruences n n r Miller-Rabin tests belong to the Monte Carlo class of (x − a) ≡ x − a mod (n, x − 1) (4) algorithms. where the a’s have to be small residue classes modulo n and the r is a small positive integer. However, to assure that V. DETERMINISTIC PRIMALITY TEST switching from the polynomial identity Equation (3) to the set These tests are mostly based on factorization techniques. of congruences Equation (4) still yields a characterization of Two best current methods are Cyclotomic Ring Test and prime numbers, one has to consider quite many a’s and r’s. On Elliptic Curve Test [1]. The Lucas Test discussed before is a the contrary, these congruences can be checked much faster deterministic primality test designed to find only Mersenne than Equation (3) since it suffices to compute with primes sequence [2]. Deterministic tests are so complicated to polynomials of degree ≤ 2r. The right balance leads to a implement that probability of making an error in the deterministic primality test with polynomial running time. implementation far exceeds the probability that a probabilistic test will return composite [1]. Theorem 1 (Agrawal, Kayal, Saxena). Let s, n be positive integers. Suppose that q and r are primes such that q divides A. Pocklington Primality Test (r−1)/q r − 1 , n !≡ 0, 1 mod r , and [ ] ≥ Let where is the factored part of a number If for all 1 ≤ a < s , a coprime with n, the congruence (3) holds − 1 = to be true, then n is a . , where and . = ⋯ GCD , = 1 < Pocklington's theorem, also known as the Pocklington - Lehmer test, then says that if there exists a for i=1, …, r VI. PROBABILISTIC PRIMALITY TEST These tests can determine whether a number is prime or such that and , not with a given degree of confidence. Assuming this degree ≡ 1mod GCD b − 1, n = 1 of confidence is large enough, these tests are good enough [5]. then n is prime. As probabilistic tests are fast, it is suggested that they be –100 -31 B. Elliptic Curve Primality Proving (ECPP) used with error ratios smaller than 2 (~7.8x10 ) [9]. Fermat, Slovay-Strassen, Lehmann, Miller-Rabin (M&R) and Elliptic Curve Primality Proving (ECPP) is a method based Frobenius tests can be given as examples of probabilistic on elliptic curves to prove the primality of a number. It is a primality tests. More theory about these tests can be found in general-purpose algorithm, meaning it does not depend on the [5], [10-15]. Numbers passing these tests are called probable number being a special form. ECPP is currently, in practice, primes or . the fastest known algorithm for testing the primality of general A number between 1 and n that can be used to demonstrate numbers, but the worst-case execution time is not known. the compositeness (non-primeness) of n is called witness . The ε ECPP heuristically runs in time: for some > 0. density of witnesses (1-d) is very small. The probability of logε This exponent may be decreased to 4 + for some versions by number n to be prime after i iterations is given: heuristic arguments. ECPP works the same way as most Prb(prime) = 1 - Prb(composite) = 1 – (1-d) i other primality tests do, i.e., finding a group and showing its A larger d means faster convergence to the desired confidence size is such that p is prime. For ECPP, the group is an elliptic threshold. M&R is the most popular in practice since density curve over a finite set of quadratic forms such that p − 1 is is large (d=0.75) [16]. To have an error ratio about 2 –100 , 50 trivial to factor over the group. different base (a) values must be taken with M&R test. About C. AKS Test 99.9% of the possible base (a) values are witnesses [5].

In August 2002, Agrawal, Kayal and Saxena [3] gave a Fermat’s Theorem & : This theorem is the first deterministic primality test in polynomial time without basis for probable primality tests. n is called Fermat assuming any unproven hypothesis. The main idea of the pseudoprime to the base a, if n satisfies Equation 3 for base a AKS-algorithm is the following extension of Fermat’s little (a to be any integer 1 ≤ a ≤ n-1). In other words, a

3

pseudoprime is a number that “pretends” to be prime by C. Carmichael Numbers passing Fermat’s theorem for given base values. Carmichael numbers are numbers which mislead A. Solovay Strassen Test probabilistic probability tests. Andrew Granville and Pomerance proved that for any given finite set of bases there The test was proposed by Solovay and Strassen [6] and are infinitely many Carmichael numbers that are ‘strong was the first efficient algorithm for primality testing. Its pseudoprimes’ to all the bases in that set [17]. Pinch shows starting point is a restatement of Fermat’s Little Theorem: that there are 246.683 prime numbers up to 10 16 , all with at Theorem (Fermat’s Little Theorem, Restatement 1) For most 10 prime factors [18]. any odd prime number n, and for any number a, D. Frobenius Numbers 0 < a < n, = ±1 (mod n). It is an easy observation that for prime n, a is a In number theory, a Frobenius pseudoprime is a composite (in other words, a = b 2 (mod n) for some b) if number which passes a three-step test set out by Jon Grantham in section 3 of his paper "Frobenius and only if - = 1 (mod n). The Legendre symbol equals a pseudoprimes". Although a single round of Frobenius is 1 if a is a quadratic residue modulo n else equals −1 for prime slower than a single round of most standard tests, it has the n. Therefore, for prime n, advantage of a much smaller worst-case per-round error bound of 1/7710, which would require 7 rounds to achieve with = n) mod the Miller-Rabin primality test according to best known Legendre symbol can be generalized to composite numbers by bounds. defining: VII. CONCLUSION = In view of the primality test of Agrawal and his students it follows that the decision problem Primes P. On the other where , p i is prime for each i. This generalization ∊ n = side, the integer factoring problem, which reads as given an is called . Jacobi symbol satisfies quadratic integer N, find the prime factorization of N, is not expected to reciprocity law: lie in class P but in class NP. The class NP is, roughly speaking, the class of decision problems having solutions that, . = −1 once given, can be verified in polynomial time. By definition This along with the property that gives an the classes P and NP seem to be quite different: solving a = algorithm to compute that takes only O(log n) arithmetic problem seems to be harder than verifying a given solution. In the language of prime numbers, it is rather difficult to factor operations. For composite n, it is no longer necessary that a given large integer, e.g., N = 10 000 000 000 097, into its =1 if and only if a is a quadratic residue modulo n or that prime divisors, but it is easy to check whether or not 811 × 12330456227 is the prime factorization of N. Once the = . This suggests that checking if factorization of an integer is produced by some factoring mod algorithm, we can use the AKS-algorithm to test its factors on = may be a test for primality of n. Solovay mod primality in polynomial time. This shows that Factoring NP. and Strassen showed that this works with high probability ∊ when a is chosen randomly. It is widely expected that Factoring does not lie in P; public key-cryptography relies in the main part on this belief B. Miller Rabin Test (however, this is not true for hypothetical quantum This test was proposed by Michael Rabin [9] by slightly computers). Surprisingly, it seems to be rather difficult to find modifying a test by Miller [8]. The starting point is another an example which is a member of NP but not of P. Moreover, restatement of Fermat’s Little Theorem: it is an open problem to prove (or disprove) P ≠ NP. We Theorem (Fermat’s Little Theorem, Restatement 2) For conclude our report on primes, primality testing and open any odd prime n = 2s · t with t odd, and for any number a, 0 < problems with a nice quotation due to Paul Leyland who 2t expressed his surprise about the unexpected discovery of a a < n, the sequence at (mod n), a (mod n), (mod n),…, a simple deterministic polynomial time primality test by saying: (mod n) either has all 1’s or the pair −1, 1 occurs a “Everyone is now wondering what else has been similarly somewhere in the sequence. overlooked.” If n is composite, then the sequence may not satisfy the above property. Miller proved that, assuming Extended REFERENCES 2 Riemann Hypothesis, for at least one a between 1 and log n, [1] Silverman, R.D., Fast Generation of Random, Strong RSA Primes, the above sequence fails to satisfy the property when n is RSA Laboratories’ Crypto Bytes Magazine - Volume 3, Number 1, composite but not a prime power. Miller proved that the same 1997. holds with high probability for a random a without any [2] Emerson P., Prime Number Generation and Primality Testing, MSc Thesis, Computer Science at Middlebury College, 1997. hypothesis. The test requires O(log n) arithmetic operations [3] Agrawal M., Kayal N., Saxena N. Primes is in P, available at and hence is polynomial time. http://www.cse.iitk.ac.in/news/primality.html

4

[4] Crandall R., Pomerance C. Prime numbers – a computational perspective, Springer 2001 [5] Schneier B., Applied Cryptography (Second Edition), John Wiley & Sons Inc., pages: 258-261, 1996. [6] R. Solovay and V. Strassen. A fast Monte-Carlo test for primality. SIAM Journal on Computing, 6:84–86, 1977. [7] Ribbenboim P. The new book of prime number records, Springer, 3rd ed., 1996 [8] Menezes, A. and Oorschot P., Handbook of Applied Cryptography, CRC Press, 1997. [9] M. O. Rabin. Probabilistic algorithm for testing primality. J. Number Theory, 12:128–138, 1980. [10] G. L. Miller. Riemann’s hypothesis and tests for primality. J. Comput. Sys. Sci.,13:300–317, 1976. [11] Caldwell C.K., Finding Primes & Proving Primality, http://www.utm.edu/research/primes/prove1.html, 1997. [12] Zachary S, McGregor Dorsey, Methods of Primality Testing, http://www-math.mit.edu/phase2/UJM/vol1/DORSEY-F.PDF , 1999 [13] Grantham, J., Frobenius Pseudoprimes, Institute for Defense Analyses, Center for Computing Sciences, 1998. [14] Grantham, J., A Probable Prime Test with High Confidence, Journal of Number Theory 72, 32-47, 1998. [15] Maurer, U.M., Fast Generation of Prime Numbers& Secure Public-Key Cryptographic Parameters, to appear in Journal of Cryptography, 1994. [16] Segre, A., Computer and Network Security, Iowa University “Data Security” Lecture Notes (last modified 97), 2000. [17] Granville A., Primality Testing & Carmichael Numbers, Notices Amer. Math. Soc. 39 (pages: 696-700), 1992. [18] Pinch, R.G.E., The Carmichael Numbers Up to 1016 www.chalcedon.demon.co.uk/rcam.html, 1998.

5