<<

Implementation of Simultaneous Protections Against Observation and Perturbation Attacks for ECC

Audrey Lucas1 and Arnaud Tisserand2 1CNRS, IRISA UMR 6074, INRIA Centre Rennes - Bretagne Atlantique and Univ Rennes, Lannion, France 2CNRS, Lab-STICC UMR 6285 and University South Britany, Lorient, France

Keywords: Elliptic Curve Cryptography, Side Channel Attack, Fault Injection Attack, Protection, Countermeasure.

Abstract: Scalar multiplication is the main operation in elliptic curve cryptography. In embedded systems, it is vulner- able to both observation and perturbation attacks. Most of protections only target one of these two types of attacks. Unfortunately, many protections against one type of attack may reduce the protection against the other one. In this paper, we simultaneously deal with protections against both types of attacks. Two countermea- sures are presented for scalar multiplication and implemented on a Cortex-M0 microcontroller. The first one protects finite field operations over point coordinates. The second one protects the scalar (or key) .

1 INTRODUCTION tacks. In this work, we simultaneously deal with protec- Elliptic curve cryptography (ECC) is promoted for tions against both types of attacks. We propose two providing public-key cryptography (PKC) support in countermeasures, developed onto specific curves, for embedded systems due to its smaller cost, e.g. sil- scalar multiplication (SM) in ECC. They are probably icon area and energy, and better performances than adaptable onto other curves. The first one protects fi- RSA (Cohen and Frey, 2005; Hankerson et al., 2004). nite field operations over point coordinates. The sec- Embedded systems are widespread in our society, ond one protects the scalar itself during SM. thus their protection against various types of attacks Our paper is organized as follows. Sections 2, 3 is essential. Due to their proximity with other users, and 4 respectively recall background elements on potentially malicious ones, embedded circuits are vul- ECC, SCAs/FAs attacks and ECC attacks and pro- nerable to physical attacks. In this paper, we focus on tections. Our two propositions are presented in Sec- side channel attacks (SCAs) and fault attacks (FAs). tion 5. Section 6 reports implementation results The first ones, use observations of physical parame- on Cortex-M0 and the µNaCl li- ters, such as computation timings or power consump- brary (Dull¨ et al., ). tion, which are analyzed using statistical tools to de- duce links between physical measurements and inter- nal secret values. The second ones, use perturbations 2 BACKGROUND ON ECC of the circuit such as variations of the power supply or electromagnetic radiations to inject fault(s) during ECC (Hankerson et al., 2004; Cohen and Frey, 2005) algorithms execution. These faults are exploited to is a PKC based on elliptic curves (ECs). deduce internal secret values. In the case of prime fields Fp, short Weierstrass Numerous countermeasures exist against SCAs curves form EWS and Montgomery (Montgomery, and FAs at various levels: mathematics, algorithm, 1987) curves EM are defined, with a,b Fp and spe- architecture, circuit. Most of these protections only cific conditions on a,b (see books (Hankerson∈ et al., target one type of attack. For example, uniformiza- 2004; Cohen and Frey, 2005)), respectively by: tion schemes are efficient against SCAs but not for 2 3 EWS : y = x + ax + b, FAs. Some error correcting codes can be used against 2 3 2 (1) FAs but not for SCAs. Unfortunately, many protec- EM : by = x + ax + x. tions, against one type of attack, leave or may make In this paper, we only consider these curves onto the implementation vulnerable to the other type of at- prime fields Fp.

404

Lucas, A. and Tisserand, A. Microcontroller Implementation of Simultaneous Protections Against Observation and Perturbation Attacks for ECC. DOI: 10.5220/0006884604040411 In Proceedings of the 15th International Joint Conference on e-Business and Telecommunications (ICETE 2018) - Volume 2: SECRYPT, pages 404-411 ISBN: 978-989-758-319-3 Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved Microcontroller Implementation of Simultaneous Protections Against Observation and Perturbation Attacks for ECC

The most critical operation in ECC is the scalar 768 bits a few years ago. In this work, we do not con- multiplication (SM) [k]P between a curve point P and sider these attacks. Physical attacks are totally dif- a scalar k (either the public or private key). When k is ferent from logical ones and require specific protec- private, it must be protected. SM can be performed by tions. Embedded systems have to be protected against various algorithms based on point addition (ADD) and them since circuits in charge of security tasks can be point doubling (DBL) operations at curve level. When very close to the attackers. Typical physical attacks ADD and DBL have different behaviors, their differ- include: reverse engineering, observation (or SCAs) ences can be a leakage source in observation attacks. and perturbation (or FAs). In this paper, we only con- The easiest way to perform SM is the double and sider SCAs and FAs. It is possible to combine them add (DA) algorithm 1. In case of EM, the Montgomery such as (Roche et al., 2011). ladder (ML) algorithm 2. is commonly used. 3.1 SCAs and Countermeasures Algorithm 1: SM - double and add.

Input: P and k = (km 1,...,k0)2 SCAs observe physical parameters such as tim- Result: [k] P − ings (Kocher, 1996), power consummation (Man- · 1 T O ← gard et al., 2007) or electromagnetic radia- 2 for i = m 1 to 0 do − tions (EM) (Agrawal et al., 2002) at run time. 3 T 2 T DBL ← · They exploit potential correlations between measure- 4 if ki = 1 then 5 T T + P ADD ments of physical parameter(s) and some secret data ← manipulated during execution. 6 return T SCAs are often decomposed into two types. On one hand, simple power analysis (SPA) uses a sin- gle trace of power measurements. For instance, algorithm 1 is vulnerable to SPA. On the other Montgomery ladder Algorithm 2: SM - . hand, various attacks use multiple traces and statis- Input: P and k = (km 1,...,k0)2 tical tools. For instance, differential power analy- Result: [k] P − · sis (DPA) (Kocher et al., 2011) uses difference of av- 1 T O, T P 1 ← 2 ← erages and correlation power analysis (CPA) (Brier 2 for i = m 1 to 0 do − 3 if ki = 1 then et al., 2004) uses Pearson correlation. Both simple 4 T T + T ADD and differential-like attacks exist for other physical 1 ← 1 2 5 T 2 T DBL parameters (e.g. EM). 2 ← · 2 6 else For SCA protection, one must avoid, or strongly 7 T T + T ADD 2 ← 1 2 reduce, dependencies between secret values and ob- 8 T 2 T DBL 1 ← · 1 servable variations of the physical parameter(s). A 9 return T1 first type of protection is denoted uniformization: op- erations sequences must be indistinguishable what- ever the actual secret bits manipulated in the circuit. In order to perform T1 + T2, the x coordinate of Useless operations can be added to uniformize some T1 T2 can be known. During ML [k]P internal itera- algorithms. A second type of SCA protection is de- tions,− T T is always equal to the base point P. 1 − 2 noted randomization: a random activity generates a Several ADD and DBL formulas for different curves scramble in the measurements. Statistic tools con- are available on the EFD website (Bernstein and sider this random activity as data and their results Lange, ). are disturbed. For instance, random useless opera- tions or random masks can be added. Many variations and combinations of uniformization and randomiza- 3 BACKGROUND ON PHYSICAL tion protections have been proposed. ATTACKS 3.2 FAs and Countermeasures Embedded systems have to face attacks at both logi- cal and physical levels. Logical attacks target mathe- Lasers, electromagnetic radiations, variations in sup- matical properties of cryptosystems, networking pro- ply voltage or circuit temperature, glitches in clock tocols, weak software implementations, etc. For in- signals are used to disturb the circuit by injecting stance, very efficient factorization algorithms and par- fault(s) during algorithm execution (Bar-El et al., allel implementations have been used against RSA 2006; Verbauwhede et al., 2011). These faults can

405 SECRYPT 2018 - International Conference on Security and Cryptography

be temporary or permanent and equivalent at logical 1 0 0 0 1 level to a flip, bit set, bit reset or bit stuck-at (on DBL ADD DBL DBL DBL DBL ADD single or multiple bits). FAs exploit some unspecified circuit behavior, di- Figure 1: Basic DA algorithm. rectly or not, in order to deduce the secret. For instance, they can use differences between faulty Valette, 2003). In practice, some randomization and correct outputs thanks to differential fault anal- schemes can be applied against DPA-like attacks in ysis (DFA) (Biham and Shamir, 1997). many protocols. Then SPA-like ones are considered Safe-error analysis (SEA) (Yen and Joye, 2000) as a major threat in ECC. In this paper, we only deal checks if the injected fault has an impact on the final with SPA-like attacks. result. By determining whether a corrupted data was Among SCA protections uniformization and ran- effectively used or not, SEA is very efficient against domization have been widely used in ECC. SCA protections based on useless/dummy operations. Among uniformization countermeasures, double Attackers can produce fault(s) on data, control or and add always (DAA) (Coron, 1999) and ML are external memory. In this paper, we only consider typical SPA protections. The DAA algorithm is simi- faults on data since we target software implementa- lar to DA where a useless ADD is added when the key tions with on-chip memory. bit is zero. This is good for SPA protection but very Two types of protections exist against FAs: detec- bad for SEA ones (injecting a fault during the useless tion and correction schemes. Detection schemes al- ADD has not impact on the output, then the attacker low various policy solutions when an attack occurs: knows that the operation was a dummy one and the execution stop and re-run, algorithm change, eras- corresponding key bit was 0). ing/destroying secret values, etc. Detection can be ML is widely used in practice since it is SPA and achieved at various levels: in hardware using intru- SEA resistant. The same operations sequence is made sion sensors, at algorithm using redundant computa- regardless of the key bits. Attackers cannot distin- tions (spatial and/or temporal) or data integrity checks guish ADD and DBL patterns. Furthermore, all inter- for instance. Correction schemes use methods per- mediate computations impact the final result. forming the expected operations even in presence of Among randomization countermeasures, scalar faults (e.g. use of majority voters). In this paper, we randomization and point blinding protections have only consider detection schemes. been proposed against DPA (Coron, 1999). Scalar randomization consists in performing [k]P = [k + r λ]P where r is the order of E and λ is a random num-· 4 ATTACKS AND PROTECTIONS ber. Point blinding performs [k]P = [k](P + R) [k]R instead of [k]P, where R is a random point.− Other ON ECC randomization countermeasures use projective coor- dinates. Before each SM execution, P coordinates are In this section, several SCAs, FAs and related protec- randomized thanks to the multiplication by a random tions for ECC are recalled. Attacks objective is to re- number λ, so P = (λxP,λyP,λzP). This new P is em- cover the secret scalar/key k from execution(s) of the ployed during SM. scalar multiplication Q = [k]P. 4.2 FAs on ECC and Protections 4.1 SCAs on ECC and Protections Attackers can inject faults on several types of data During SM, each sequence of curve-level operations during SM: curves parameters, scalar, field represen- depends on the actual scalar bits. If ADD and DBL op- tation (Ciet and Joye, 2005), base point (Biehl et al., erations can be distinguished (through physical mea- 2000) and current point (Blomer¨ et al., 2006). The surements) and DA algorithm is used, then SM is vul- attacker aims DFA or transferring the ECDLP (dis- nerable to SPA. Indeed, a 1 key bit generates a DBL crete logarithm problem) onto a weaker curve. Com- followed by an ADD, while a 0 key bit only generates monly, the transfer is possible since b parameter in a DBL. If partial traces for ADD and DBL are different curve equation 1 is unused during SM. Below, two (even with a few differences), an attacker is able to attack examples with different targets are recalled. distinguish what operation is made and then recover In (Biehl et al., 2000), the base point P belongs to the key bits from the trace as illustrated in Figure 1. E instead of E . Curve E has a smaller order Several other SCAs on ECC exist including tim- SW SW SW than E and it is defined by: e ings, DPA, zero-value point attacks (Akishita and SW Takagi, 2003) or doubling attacks (Fouque and e b = y2 x3e ax. (2) − − e 406 Microcontroller Implementation of Simultaneous Protections Against Observation and Perturbation Attacks for ECC

As [k]P = [k]P, the DLP is transferred onto sub- 1 0 0 1 group of smaller order and an attacker recovers k mod DBL V ADD DBL V DBL V DBL V ADD ord(P). By reiteratinge with several other P, the at- tacker recovers the key value thanks to the Chinese Figure 2: Point verification in DA. remaindere theorem (CRT). e In (Bao et al., 1997), the attack proposed on RSA of PV proposed for protection against FAs but we can also apply to ECC. The fault is supposed one bit added uniformization for SPA protection. Our sec- flip located on the random key bit at index j. Let Q ond countermeasure, called iteration , protects the faulty SM result, then Q and Q can be written as: the scalar bits against FAs with a uniform behavior j 1 m 1 e for SPA protection. In this section, we describe these − i j − i Q = [k]P = ∑ ki2 P + k j2 Pe+ ∑ ki2 P countermeasures for Weierstrass curves (with both ja- i=0 i= j+1 (3) m 1 cobian and projective coordinates) and Montgomery e e − i e curves (with XZ coordinates). Q = [k]P = ∑ ki2 P i=0 Their cost will be first evaluated in terms of the Knowing both Q and Q, one can compute Q Q number of Fp operations: multiplication M, square S − and addition/subtraction A. In Section 6, detailed which helps to deduce the key bit k j. Indeed, if comparisons will be reported for microcontroller im- Q Q = 2 jP then k =e 0 and if Q Q = 2 jP thene − − j − plementation. k j = 1. Finally, the attacker recovers the full key iter- ativelye for the other j ranks. e Most FAs on ECC are defeated using point verifi- 5.1 Point Verification (PV) cation (PV) (Biehl et al., 2000) at various steps of SM. For instance, PV ensures that final or current point be- PV injects point coordinates into the curve equation longs to the curve by injecting their coordinates into to verify that the checked point effectively belongs to the curve equation. Thus, PV protects against FA the curve. PV can be integrated in SM at different which targets the curve parameters and the point P periods leading to various trade-offs between security of SM [k]P. and performance. For low cost protection, PV can be In case of Montgomery curve, the y coordinate is performed at beginning and end of SM. Then, a FA unused during SM, and PV is equivalent to verify if on an intermediate point is detected very late. For x3+ax2+x earlier detection, but with a higher run time, PV can C = b is a square with Legendre symbol (LS). p 1 be performed every d iterations or randomly during As a reminder LS is computed by C −2 , and if C ≡ SM. The strongest protection is obtained for d = 1 0 mod p (p is prime) then, LS equals to 0. In other where PV is performed at each SM iteration. In this case, LS equals to 1 if C is square modulo p and 1 − paper, we denote ` the number of executed PVs during else. However, PV is ineffective against FA targeting one SM (1 ` m + 1). ≤ ≤ scalar bits k js. Care must be taken when applying PV to not weakening the implementation against SCAs. For in- stance, using PV after DBL operations is safer (they do 5 PROPOSED PROTECTIONS not depend on k bits). For our first countermeasure, we explore and mod- As protections against one type of attack may weaken ify PV to ensure a uniform behavior against SPA-like the implementation against the other one, simultane- attacks. The sequence of Fp operations added for PV, ously dealing with both SCAs and FAs is important denoted V, can be used to ”fill” the differences be- but it can be tricky. For instance, basic uniformization tween ADD and DBL as illustrated in Figure 2. We schemes using dummy operations for SCA protection modified the complete computations to ensure the ex- may be weak against SEA. Or in case of FA protec- act same behavior, i.e. same sequence of Fp opera- tion, adding redundancy checks must not reduce the tions, for both ADD and DBL+V to make our SM uni- robustness against SCAs (for instance by breaking the form (against SPA). uniform behavior). In most of the literature, FAs and SCAs are considered independently. 5.1.1 Uniform PV on Weierstrass Curves We propose two combined countermeasures for protecting SM simultaneously against both SPA-like We first present uniformization of SM with PV for attacks and major FAs. Standard protections against projective and jacobian coordinates where ADD is DPA-like attacks can be used on top of our counter- more complex than DBL. We include PV in DBL, de- measures. Our first countermeasure is an extension noted DBL+V, to ensure a uniform behavior of SM.

407 SECRYPT 2018 - International Conference on Security and Cryptography

Weierstrass curves onto Fp in projective coordi- SPA and FA (only SPA for DAA) and has a smaller nates are defined below with a = 3: cost than DAA. − Obviously, one can use a smaller security level E y2z = x3 + axz2 + bz3. WP : (4) with less frequent PVs (with d > 1) leading to smaller Multiplication by b, denoted Mb, can be imple- overheads (while d = 1 for DAA). mented with a generic multiplication or additions de- pending on b value (e.g. sparse decomposition). 5.1.2 Uniform PV on Montgomery Curves Table 1 reports the cost of curve operations and verification V. Obviously, V cost is too small to di- Direct PV is too expensive for Montgomery curves rectly uniformize SM. We add operations in V by mul- using XZ coordinates. The unused y coordinate forces tiplying by y such that: to check current point using LS. Furthermore, Mont- gomery curves with XZ coordinates and ML are more V : y3z = x3y + axyz2 + byz3. (5) efficient than Weierstrass curves. Our proposition takes advantage of a constant in- After factorization of some operations and add of side ML Algorithm 2. Indeed, T T is always equal operation M to ADD (this is not a dummy operation, 2 1 b to P at each iteration. If point T is− faulted, it becomes see the source code), the new cost for DBL+V is equal 1 = to the ADD cost (11M+6S+18A+1M ). The final over- T1 and T2 T1 P. The computation T2 T1 with b addition formulas− 6 is possible at each iteration− since head for our uniform SM is 6`M + 4`A + 2`Mb. Te1 + T2 is alsoe computed. Table 1: Operations costs for Weierstrass curves. T1 + T2 = (x3,z3) is performed using the x coordi- nate of P denoted x . After, T T is made with x . Projective Jacobian P 2 − 1 3 ADD 11M+6S+18A 11M+ 5S+13A A first verification consist in comparing between re- DBL 5M+6S+14A 3M+ 6S+ 15A sult of T2 T1 and P. This verification does not detect attacks since− the x must be normalized by z . Then, Basic V 4M+3S+5A+1Mb 1M+6A+1Mb 3 3 a final test is equivalent to x3(1 zP) = 0 which is V 6M+4A+2Mb 7M+ 7A+2Mb − always true since zP = 1. To solve this problem, SM is modified. Instead For jacobian coordinates, we first transform DBL to deal with one bit, the new ladderStep, denote (remove 1 S and add 1 M): ML V, deals with simultaneously two bits (the itera- tions number is halved). Indeed, the curve operations xx = x2 are performed as in the algorithm 3. The variable T6 is 1 yy = y2 yy = y2 1 the new T2 after ML V. If ki = ki+1 then, T5 replaces T1. 1 t0 = x1 + x1 6 t0 = x1 yy Else T1 is replaced by T7. In order to perform PV, we · t1 = t0 + x1 t1 = t0 +t0 (6) note that T6 P = T7 and T6 + P = T5. As T5 and T7 ⇒ t2 = t0 +t0 − s = t1 +t1 are calculated earlier, T = T P can be performed s = t2 yy 8 6 t2 = xx + xx · ± t3 = t1 x1 with the x coordinate of the new T1 (T5 or T7). t3 = t2 +t2 · Then, the verification equation is transformed Algorithm 3: ML V. thanks to a multiplication by the z coordinate: Input: T , T , xor k k 1 2 ← i ⊕ i+1 1 T 2T V : y2z = x3z + axz5 + bz7 (7) 3 ← 1 2 T T + T 4 ← 1 2 3 T 2T new T , if xor = 1 After integration into DBL and some factorizations, 5 ← 4 1 4 T T + T new T our uniform SM overhead is 7`M + 7`A + 2M . 6 ← 3 4 2 b 5 T 2T new T , if xor = 0 7 ← 3 1 6 T T6 + P use x of new T Table 2: Overheads. 8 ← 1 7 if xor = 1 then Projective Jacobian 8 T8 = T7 `M + `A + `M `M + `A + `M PV 6 4 2 b 7 7 2 b 9 else DAA 6.5`M + 3`S + 9`A 6.5`M + 2.5`S + 7.5`A ' ' 10 T8 = T5 Despite ADD and DBL have the same cost, their be- haviors are distinguishable. Then, we reschedule the If a fault is performed during SM then, the equal- operations sequences of ADD and DBL+V to ensure the ity between T8 and T7 (xor = 1) and between T8 and exact same behavior. T5 (xor = 0) is wrong. We compare our uniform PV with DAA in Ta- The cost of the original LadderStep is 5M + 4S + ble 2. Our uniform SM with PV protects against both 8A + 1Ma. As ML V deals with two key bits instead of

408 Microcontroller Implementation of Simultaneous Protections Against Observation and Perturbation Attacks for ECC

one, its overhead equals to 8M + 4S + 7A + Ma. Thus, added. This result is compared to a reference value. ` for one SM the overhead is 4`M + 2`S + 4.5`A + 2 Ma. This new version, denoted ICC, costs 1 swap, 5 small The verification in ML V is faster than verifying if integer additions, 2 shifts and 2m small random num- x3 + ax2 + x is a square using LS. Nevertheless, this ber generations. PV does not ensure that P belongs to curve. Thus, the

first attack from Section 4 is possible. To avoid this, r1 C1 + i if i%2 = 0 0 Θ1 + i if i%2 = 1 the y coordinate is kept and a basic PV is performed at r the beginning of SM. In latter steps of SM, the y co- 2 C2 + i if i%2 = 1 0 Θ2 + i if i%2 = 0 ordinate can be removed without security reduction. r3 Θ3 + λ3 The cost of the beginning curve equation computa- r4 Θ4 + λ4 tion is 1M + 2S + 3A + Ma. Finally, our uniform PV simultaneously protects intermediate point and curve Figure 3: ICC split on registers during SM iterations. parameters against major FAs and SPA. As ML is uniform, if late detection is acceptable, ICC detects bit flip attacks and is resistant to SPA. one can only use PV at beginning and end of SM for But it does not detect bit set or bit reset faults (the bits a very low cost. are forced to 1 and 0 respectively) used in some SEAs. Indeed, if the i-th bit is set to 1 and actually ki = 1, 5.2 Iteration Counter (IC) this ”non-modification” is not detected. But SEAs can be avoided using masking schemes such as (Coron, During SM, key bits manipulations are very short and 1999). have a different behavior compared to field opera- tions. But injecting faults during them is also pos- 5.3 Fault Detection Policy sible (see Sec. 4). PV only protects curve parameters and verified points against FAs but not key bits (PV is Protection against SCAs is based on good properties verified with faulted key bits). of the algorithms and implementations without detec- In order to protect all scalar bits against FAs, we tion at run time. But fault detection can be an ac- propose the iteration counter countermeasure. It en- tive . Several detection policies can be applied sures that the executed SM iterations effectively cor- at run time: stopping the execution, erasing the se- respond to the actual key k even in presence of FAs cret data, re-computing with the same or another al- with a uniform behavior. gorithm, continuing computations with a random key. A naive solution is a check sum which counts the The policy choice depends on the application and re- Hamming weight of k. Nevertheless, this idea is not lated threats. sufficient when attackers can flip two key bits at dif- In this work, we implemented the continuation us- ferent indexes (i = i ). 6 0 ing a random scalar. A random scalar kr is generated Another solution is to count ADDs using a weight before SM. If an attack is detected, kr is used instead depending on the iteration index i. When ki = 1, in- of k as soon as possible. The cost of this additional dex i is added to a register reg (remember that i is protection is the random generation of m bits (i.e. k small). Attackers have to forge multiple bit flips ac- length) and k kr swap when a detection occurs. cording to interesting values of i, which is very un- ↔ likely. Then the overwhelming majority of faults in k are detected. Unfortunately, when ki = 0, reg is not modified leading to a small but measurable activity 6 IMPLEMENTATION RESULTS drop. This second solution is good against FAs but not sufficient against SPA. We implemented our countermeasures on a 32- Our final solution consists in splitting reg into 4 bit Cortex-M0 microcontroller (STM32F0 Discovery registers r1,...,r4 as illustrated in Figure 3. Thanks to board) and the µNaCl library (Dull¨ et al., ) at 128-bit these registers, the cswap function (used during ML) security level. This is a variant of the NaCl (Daniel can both the IC registers and current point co- J. Bernstein and Schwabe, ) library where the Bern- ordinates according to key bits at each ML iteration. stein curve (Bernstein, 2006) (EM with a = 48662, 255 When ki = ki 1, cswap (r1, r2) with (r3, b = 1 and p = 2 19) is implemented thanks to 6 − − r4). Regardless of key bits, random values are added ML and XZ coordinates. to r3 and r4. If i%2 = 0, i is added to first part of r1. The main SM variable is state composed of: co- Else, i is added to first part of r2. ordinates of points T1 and T2, xP coordinate, scalar, At the end of the SM, r1 and r2 are shifted and previous bit and downwards counter.

409 SECRYPT 2018 - International Conference on Security and Cryptography

100 1e+07 ICC ICC & PV PV begin & end ICC & PV & sparse key 9e+06 PV DAA 80 8e+06 60 7e+06

6e+06 40 Weier. − Jac. 5e+06 Weier. − Proj. 20

Montg. − XZ Computation time overhead (%) 4e+06 DAA − Jac. Computation time (clock cycles) DAA − Proj. 0 3e+06 Montg. Weier. J. Weier. P. 0 50 100 150 200 250 PV number during SM Figure 5: Overhead depending on type of protection. Figure 4: Clock cycles depending on PV numbers. Table 3: Experimental results overhaed. The SM loop on key bits is handled by the cswap protection type code size RAM size ICC 2.7% 2.8% which swaps T1 and T2 when the current key bit is dif- ferent from the last one. After cswap, a LadderStep PV end 10.6% 11% XZ PV 12.6% 13.2% is performed on state. Montg. PV+ICC+answer 17.3% 16.9% We use the structure and parameters defined in ICC 2.5% 2.6% µNaCl for SM with ML onto Montgomery curves. In PV 2.9% 2.6% case of Weierstrass curves, we use the same base field PV begin+end 3.9% 4% Jac. with the NIST parameters and y-coordinate is added Weier. PV+ICC+answer 5.6% 5.8% to state. DAA 0.4% 0.4% In order to implement ICC, the variable Reg is cre- ICC 2.3% 2.4% PV 2.1% 2.2% ated to hold r1, r2, r3 and r4. Moreover, the small PV begin+end 2.1% 2.2% Proj. random numbers are generated with random number Weier. PV+ICC+answer 4.8% 5% generator of µNaCl. Before performing ICC compu- DAA 0.4% 0.2% tations, the swap function is used on Reg at each iter- ation. Figure 5 illustrates the overhead of PV during the offs. They have been implemented for short Weier- SM with a 256-bit scalar. The practical overhead strass and Montgomery curves on a 32-bit microcon- is larger than theoretical overhead. The original li- troller. brary was optimized to fill the processors registers ef- A uniform PV in Weierstrass curves was pro- ficiently. When PV is added more memory pressure posed. It leads to faster SM than DAA with early leads to slower execution. detection of FAs (at each SM iteration if needed) and Cost increases linearly with the number of PVs, protection against SPA-like attacks. For Montgomery this leads to trade-offs between the detection level curves, a PV was proposed against FAs (with a uni- and the performance. The PV overhead for Montgo- form behavior). For protecting scalar bits against FAs, mery curves is 62% in worst case against only 2.3% a specific countermeasure called iteration counter was when PV is only used the beginning and the end of proposed. It is low cost and robust to SPA-like at- SM. Similar observations can be made for Weierstrass tacks. curves. Nevertheless, overheads of uniform algorithm Our code will be distributed as open source soft- (the worst case) are smaller than DAA. In addition to ware. Future works will focus on new randomiza- the number of clock cycles, the overhead of size code tion schemes and other types of ECs and point co- and intermediate RAM are reported in Table 3. ordinates.

7 CONCLUSION REFERENCES

We proposed two ECC countermeasures combined si- Agrawal, D., Archambeault, B., Rao, J. R., and Rohatgi, multaneous protection against SCAs and FAs. They P. (2002). The EM Side-Channel(s). In Proc. Cryp- protect field operations on point coordinates and the tographic Hardware and Embedded Systems - CHES, scalar bits against both major fault and SPA-like at- pages 29–45. tacks with various security vs. performance trade- Akishita, T. and Takagi, T. (2003). Zero-Value Point At-

410 Microcontroller Implementation of Simultaneous Protections Against Observation and Perturbation Attacks for ECC

tacks on Elliptic Curve Cryptosystem. In Proc. Infor- Montgomery, P. L. (1987). Speeding the pollar and ellip- mation Security - ISC, pages 218–233. tic curves methods of factorisation. Mathematics of Bao, F., Deng, R. H., Han, Y., and more authors (1997). Computation, 48(177):243–264. Breaking Public Key Cryptosystems on Tamper Re- Roche, T., Lomne,´ V., and Khalfallah, K. (2011). Com- sistant Devices in the Presence of Transient Faults. In bined Fault and Side-Channel Attack on Protected Im- Proc. Security Protocols, pages 115–124. plementations of AES. In Proc. Smart Card Research Bar-El, H., Choukri, H., Naccache, D., Tunstall, M., and and Advanced Applications -CARDIS, pages 65–83. Whelan, C. (2006). The sorcerer’s apprentice guide Verbauwhede, I., Karaklajic, D., and Schmidt, J. (2011). to fault attacks. Proceedings of the IEEE, 94(2):370– The Fault Attack Jungle - A Classification Model to 382. Guide You. In Proc. Workshop on Fault Diagnosis Bernstein, D. J. (2006). Curve25519: New Diffie-Hellman and Tolerance in Cryptography, pages 3–8. Speed Records. In Proc. Public Key Cryptography - Yen, S. and Joye, M. (2000). Checking Before Output May PKC, pages 207–228. Not Be Enough Against Fault-Based Cryptanalysis. Bernstein, D. J. and Lange, T. Explicit-formulas database. IEEE Trans. , 49(9):967–970. http://hyperelliptic.org/EFD/. Biehl, I., Meyer, B., and Muller,¨ V. (2000). Differential Fault Attacks on Elliptic Curve Cryptosystems. In Proc. Advances in Cryptology - CRYPTO, pages 131– 146. Biham, E. and Shamir, A. (1997). Differential Fault Anal- ysis of Secret Key Cryptosystems. In Proc. Advances in Cryptology, pages 513–525. Blomer,¨ J., Otto, M., and Seifert, J. (2006). Sign Change Fault Attacks on Elliptic Curve Cryptosystems. In Proc. Fault Diagnosis and Tolerance in Cryptography - FDTC, pages 36–52. Brier, E., Clavier, C., and Olivier, F. (2004). Correlation Power Analysis with a Leakage Model. In Proc. Cryp- tographic Hardware and Embedded Systems - CHES, pages 16–29. Ciet, M. and Joye, M. (2005). Elliptic Curve Cryptosystems in the Presence of Permanent and Transient Faults. Des. Codes Cryptography, 36(1):33–43. Cohen, H. and Frey, G., editors (2005). Handbook of Ellip- tic and Hyperelliptic Curve Cryptography. Discrete Maths and Applications. Chapman & Hall/CRC. Coron, J. (1999). Resistance Against Differential Power Analysis For Elliptic Curve Cryptosystems. In Proc. Cryptographic Hardware and Embedded Systems - CHES, pages 292–302. Daniel J. Bernstein, T. L. and Schwabe, P. µNaCl library. https://nacl.cr.yp.to/. Dull,¨ M., Haase, B., and Sanchez,´ A. H. µNaCl library. http://munacl.cryptojedi.org/index.shtml. Fouque, P. and Valette, F. (2003). The Doubling Attack - Why Upwards Is Better than Downwards. In Proc. Cryptographic Hardware and Embedded Systems - CHES, pages 269–280. Hankerson, D., Menezes, A., and Vanstone, S. (2004). Guide to Elliptic Curve Cryptography. Springer. Kocher, P. (1996). Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems. In Proc. Advances in Cryptology - CRYPTO, pages 104– 113. Kocher, P., Jaffe, J., Jun, B., and Rohatgi, P. (2011). In- troduction to differential power analysis. J. Crypto- graphic Engineering, 1(1):5–27. Mangard, S., Oswald, E., and Popp, T. (2007). Power Anal- ysis Attacks: Revealing the Secrets of Smart Cards. Springer.

411