Parametrized Hardware Architectures for the Lucas Primality Test
Total Page:16
File Type:pdf, Size:1020Kb
Parametrized Hardware Architectures for the Lucas Primality Test Adrien Le Masle, Wayne Luk Csaba Andras Moritz Department of Computing BlueRISC, Inc Imperial College London, UK Amherst, MA, USA fal1108,[email protected] [email protected] Abstract—We present our parametric hardware architecture Lucas probabilistic test can be performed to ensure that the of the NIST approved Lucas probabilistic primality test. To number is prime. In fact, as of today no composite number our knowledge, our work is the first hardware architecture passing an appropriate number of Rabin-Miller tests plus a for the Lucas test. Our main contributions are a hardware architecture for calculating the Jacobi symbol based on the binary Lucas test is known. Jacobi algorithm, a pipelined modular add-shift module for FPGAs and ASICs are relevant platforms for cryptographic calculating the Lucas sequences, a dependencies analysis and a algorithm implementations. First, the structure of FPGAs scheduling of the Lucas sequences computation. Our architecture makes them particularly fit for pipelined applications, which is implemented on a Virtex-5 FPGA is 30% slower but 3 times the case for most of the basic cryptographic operations. Sec- more energy efficient than the software version running on a Intel Xeon W3505. Our fastest 45 nm ASIC implementation is ond, FPGAs and ASICs can be used to embed security into low 3.6 times faster and 400 times more energy efficient than the power environments keeping very good performance. Finally, optimised software implementation in comparable technology. a pure hardware implementation of a cryptographic algorithm The performance scaling of our architecture is much better is inherently less vulnerable than its software counter-parts than linear in area. Different speed/area/energy trade-offs are which are usually run in a multi-tasking operating system. available through parametrization. The cell count and the power consumption of our ASIC implementations make them suitable For instance, software-based attacks such as cache-attacks [4] for integration into an embedded system whereas our FPGA do not apply to hardware. implementation would more likely benefit server applications. This paper presents the first hardware architecture for the NIST approved Lucas probabilistic primality test. Our main I. INTRODUCTION contributions include: Many public-key cryptographic algorithms require large • A hardware architecture for calculating the Jacobi symbol prime numbers in order to generate a key pair. This is based on the binary Jacobi algorithm for example the case of the Diffie-Hellman key exchange • A pipelined modular add-shift module for calculating the protocol or the most popular Rivest, Shamir and Adleman Lucas sequences (RSA) encryption scheme. The common method for finding • A dependencies analysis and a scheduling of the Lucas a large prime number consists in testing randomly generated sequences computation numbers until a prime is determined. The AKS primality • An implementation of the proposed design on a Xilinx test [1] determines if a number is prime in polynomial time. Virtex-5 FPGA and in TSMC 65 nm and 45 nm ASIC This test is deterministic. However, due to the complexity processes of implementing this algorithm in hardware as well as in • A comparison of the performance of our hardware imple- software, probabilistic tests are still in use. mentations with an optimised software implementation in A common primality tester first tries to divide the number terms of speed and energy efficiency under test by a few first primes, then performs a number Our architecture implemented on a Virtex-5 FPGA is 30% of Miller-Rabin probabilistic tests [2]. The error probability slower but 3 times more energy efficient than the software of such a primality test depends on the bitwidth of the version running on a Intel Xeon W3505. Our design is scalable number under test and on the number of Miller-Rabin tests and the performance scaling of our architecture is much better performed. The American National Institute of Standards and than linear in area. Hence we expect the energy efficiency of Technology (NIST) gives recommended values for different our FPGA designs would improve with advances in technology algorithms and different key sizes [3]. However, even with and area available. Our 65 nm ASIC implementation is more well-chosen parameters there is still a small probability that than 2 times faster and 40 times more energy efficient than the a composite number passes the primality test. Ultimately, a FPGA implementation. Our fastest 45 nm ASIC implementa- tion is 3.6 times faster and 400 times more energy efficient The support of BlueRISC, Alpha Data, Xilinx, UK EPSRC and the HiPEAC than the optimised software implementation. NoE is gratefully acknowledged. The research leading to these results has received funding from the European Union Seventh Framework Programme The rest of the paper is organised as follows. Section II under grant agreement number 248976 and 257906. explains the background relevant to our work. In section III, we present the challenges we face when designing the Algorithm 1: Simple Lucas test Lucas primality tester and how we solve them. In section IV, Input: n odd integer we compare our software, FPGA and ASIC implementations. Output: composite if n is composite, probably prime if Finally, section V concludes the paper. n is probably prime 2 1 Choose a; b 6= 0 such that D = a − 4b 6= 0, II. BACKGROUND gcd(b; n) = 1 The Lucas primality test is a probabilistic test based on the and D = −1 properties of the Lucas sequences. We first introduce the Lucas n 2 Compute Un+1(a; b) sequences and the Lucas theorem. Then we describe the Lucas 3 if Un+1(a; b) mod n = 0 then primality test algorithm. 4 return probably prime A. Lucas Sequences 5 else 6 return composite The Lucas sequences U(a; b) and V (a; b) of the pair (a; b) are the sequences: U(a; b) = (U0(a; b);U1(a; b);U2(a; b); :::) (1) another probabilistic primality test such as the Miller-Rabin V (a; b) = (V0(a; b);V1(a; b);V2(a; b); :::) (2) test. Consider line 1 of Alg. 1. Different methods to choose a, b such that for each k ≥ 0: and D are given in [6], [7]. In particular, it is shown that D αk − βk should not be a square modulo n. In that case, the Lucas test U (a; b) = (3) k α − β would be an ordinary primality test. That is in fact ensured k k D Vk(a; b) = α + β (4) by the condition n = −1. A method proposed by Selfridge [7] consists in choosing D as the first element in the sequence where α and β are the two roots of the quadratic equation f5; −7; 9; −11; 13; :::g such that D = −1, a = 1 and b = 2 n x − ax + b = 0, with a; b chosen such that a; b 6= 0 and the (1 − D)=4. If n is square, it can be shown that D > −1 2 n discriminant D = a − 4b 6= 0 [5]. for any D. Hence, so that the algorithm terminates, we have a to check if n is a perfect square either before looking for a Let us define the Jacobi symbol . For a integer n correct D or after a few trials for D. and p prime, we have: Consider line 2. Several methods can be used to compute 8 Un+1. A simple method is to use equation 3 directly. However, > 0 if a = 0 mod p > 1 if a 6= 0 mod p and x2 = a mod p this method is not efficient as it requires two exponentiations a <> = is soluble for some integer x and possibly one division. Instead we can use a recurrence p > 2 relation. For example, we can easily show that: > −1 if a 6= 0 mod p and x = a mod p :> is not soluble Uk(a; b) = aUk−1 − bUk−2 (5) α1 α2 αk For an integer n with prime decomposition n = p1 p2 :::pk , Vk(a; b) = aVk−1 − bVk−2 (6) we have: In [8], a more complex recurrence relation is used to come a a α1 a α2 a αk = ::: up with an efficient method to compute the Lucas sequences: n p1 p2 pk j Ui+j(a; b) = UiVj − b Ui−j (7) j The Lucas theorem is defined as follows [5]. Vi+j(a; b) = ViVj − b Vi−j (8) Lucas Theorem: Let a, b, D, and Uk be as above. If n is D The American National Institute of Standards and Tech- prime, gcd(b; n) = 1 and n = −1, then n divides Un+1. nology (NIST) approved Lucas probabilistic primality test is The Lucas test is based on the contrapositive of the given in Alg. 2. The NIST algorithm first checks if n is Lucas theorem stating that if, under the previous assumptions, a perfect square. If not, it uses the algorithm described by an odd positive integer n does not divide Un+1, then n is Selfridge to find a correct value for D. This test is sometimes composite. called the Lucas-Selfridge probabilistic primality test. Lines D 4-5 rely on the fact that if n = 0, a factor of n exists [6] B. Lucas Test Algorithms and we can therefore stop the test. Lines 6 to 18 computes A simple algorithm performing a Lucas Test is given in Un+1(a; b), with a = 1 and b = (1 − D)=4, using a modified Alg. 1. Note that a Lucas test alone is not enough to determine version of the method presented in [8]. After r iterations, whether a number is prime with a low probability of error.