<<

An Analytic Attack Against ARX Addition Exploiting Standard Side-Channel Leakage

Yan Yan1, Elisabeth Oswald1 and Srinivas Vivek2 1University of Klagenfurt, Klagenfurt, Austria 2IIIT Bangalore, India {yan.yan, elisabeth.oswald}@aau.at, [email protected]

Keywords: ARX construction, Side-channel analysis, Hamming weight, Chosen plaintext attack

Abstract: In the last few years a new design paradigm, the so-called ARX (modular addition, rotation, exclusive-or) ciphers, have gained popularity in part because of their non-linear operation’s seemingly ‘inherent resilience’ against Differential Power Analysis (DPA) Attacks: the non-linear modular addition is not only known to be a poor target for DPA attacks, but also the computational complexity of DPA-style attacks grows exponentially with the operand size and thus DPA-style attacks quickly become practically infeasible. We however propose a novel DPA-style attack strategy that scales linearly with respect to the operand size in the chosen-message attack setting.

1 Introduction ever are different: they offer a potentially ‘high reso- lution’ for the adversary. In principle, under suitably Ciphers that base their round function on the sim- strong assumptions, adversaries can not only observe ple combination of modular addition, rotation, and leaks for all instructions that are executed on a proces- exclusive-or, (short ARX) have gained recent popu- sor, but indeed attribute leakage points (more or less larity for their lightweight implementations that are accurately) to instructions (Mangard et al., 2007). suitable for resource constrained devices. Some re- Achieving security in this scenario has proven to cent examples include Chacha20 (Nir and Langley, be extremely challenging, and countermeasures such 2015) and Salsa20 (Bernstein, 2008) family stream as masking (secret sharing) are well understood but ciphers, SHA-3 finalists BLAKE (Aumasson et al., costly (Schneider et al., 2015). In the case of ARX 2008) and SKEIN (Ferguson et al., 2010), as well as constructions one has to cope with the fact that there other block ciphers such as SPECK (Beaulieu et al., are Boolean operations (requiring Boolean masking 2015), SPARX (Dinu et al., 2016) and the related or secret sharing) and arithmetic operations (requir- SPARKLE (Beierle et al., 2008) which made into the ing arithmetic masking or secret sharing). Thus se- second round of the NIST lightweight cipher compe- curing ARX ciphers against power (and EM) attacks tition. Another second round candidate of the NIST is potentially very costly; unless, it could be argued lightweight competition Gimli (Bernstein et al., 2017) that they are inherently ‘secure enough’ against such also proposed a variation, namely Gimli-SPARX, that attacks that “non-provable” countermeasures (such as has adopted the ARX paradigm. hiding via randomisation of instructions, etc.) could The fact that the round function has an efficient possibly suffice. The most recent work on securing and simple expression via functions that are typi- ARX implementations is (Jungk et al., 2018). cally available as instructions on small embedded de- vices enables excellent performance with respect to It is well known that completely linear targets such code size, execution time and energy consumption. as the rotation and the exclusive-or operation are dif- Any implementation on an embedded device however ficult to attack with differential (power or EM) anal- also needs to be able to withstand the threat of Side- ysis (DPA for short): attacks on such targets require Channel (timing, power, EM, cache) Attacks (SCA, many more traces than attacks on highly non-linear for short). The absence of dependent loops, or target functions, and even with a very large number indeed tables, implies resistance to timing and cache of leakage traces there remains some keys that cannot attacks. Power (and synonymously EM) attacks how- be distinguished from each other (Prouff, 2005). 1.1 Background and Related Work on To date, the Butterfly attack (Zohner et al., 2012) SCA on ARX Ciphers remains the most effective result in attacking modular addition. This attack demonstrated that it is possi- n ble to improve on straightforward DPA-style attacks The idea of combining addition modulo 2 , exclusive- when targeting modular additions by testing pairs of or, and rotation as a round function, has been sug- correlations induced by the symmetrical structure of gested as early as 1987 in the FEAL modular additions. However Butterfly attacks are (Shimizu and Miyaguchi, 1988). Since NIST kick- constrained by the fact that knowledge of one adder is started its lightweight project in 2015, required which does not hold for some ARX ciphers the interest in ARX constructions has received re- such as SPECK (Beaulieu et al., 2015) and SPARX newed interest. The ciphers SIMON and SPECK (Dinu et al., 2016). (Beaulieu et al., 2015), which were submitted to the first of the two workshops hosted by NIST, gained a considerable amount of interest from within the 1.2 Our Contribution crypto community. In 2016 the SPARX family of ci- phers was introduced (Dinu et al., 2016). In this paper we propose a novel attack strategy The appeal of the ARX construction is primarily against the modular addition in ARX-Boxes. Our in the fact that when choosing n equal to the word size method, in comparison to previous work, requires no of a processor, software implementations gain consid- knowledge of the adders and thus is more generally erable speedups. Furthermore, because the non-linear applicable on targets where a Butterfly attack is not component is given by the addition modulo 2n, it does an option, such as SPECK (Beaulieu et al., 2015) and not need to be encoded as a table lookup which sig- SPARX (Dinu et al., 2016). nificantly reduces the memory usage. The absence Our method requires to obtain the leakage from of lookup tables is also perceived as a distinctive ad- the result of the modular addition only, which we need vantage when the threats of various side channel at- to be a bit-linear function (e.g. Hamming Weight, or tacks are considered (Biryukov et al., 2016; Biryukov weighted Hamming weight with positive weights of and Perrin, 2017; Dinu et al., 2016). The absence similar magnitude). We only need to be able to ob- of cache also implies that the instructions are always serve if the leakage increases, decreases, or remains performed in a constant time and thus there is unlikely the same upon a single-bit flip in the plaintext. Based any key dependent leakage exploitable in the execu- on this information, we show how to reconstruct the tion time (Dinu et al., 2016). Being free from the ta- adder output. With the adder output, and based on a bles also significantly reduces the number of mem- further related plaintext, we then show how to recon- ory accesses as these instructions have shown to be struct the secret key. the most exploitable targets in power analysis attacks We consider our novel methodology of indepen- (Biryukov et al., 2016). dent theoretical interest as we leverage minimal side channel leakage to perform a cryptanalytic-style anal- However, it has been shown in (Yan and Os- ysis for ARX constructions. wald, 2019) that na¨ıve implementations of ARX ci- phers may still leave vulnerabilities easily exploitable. In (Yan and Oswald, 2019), the authors simply 1.3 Organisation of the Paper attempted straightforward correlation power analy- sis attacks on the reference implementation of the In Section 2 we formalise our attack on ARX-Boxes SPARX cipher (Dinu et al., 2016) on some real plat- as the (Noisy) Hidden Adder Problem, (N)HAP, and forms, and found that the key was efficiently recov- propose its sub-problem the (Noisy) Hidden Sum ered exploiting the leakage amplified by consecutive Problem, (N)HSP. In Section 3 we explain how HSP XOR and shifting instructions. can be solved, then use the solution to solve HAP in Nevertheless, for the modular addition in ARX- Section 4 thus providing a full key recovery attack Boxes, the authors of (Yan and Oswald, 2019) re- given ideal leakage. Section 5 completes the attack by ported unsuccessful attacks targeting the addition in- adapting the attack to noisy leakage. We present sim- struction which coincide with the previous results re- ulation results in Section 5.1 and also discuss practical ported by (Biryukov et al., 2016) and that align with considerations. (Yan and Oswald, 2019; Zohner et al., 2012; Dinu et al., 2016): their argument is that the weak non- 1.4 Notation linearity of modular addition leaves a relatively lower margin in distinguishing the keys comparing to a typ- In this work we frequently use both the integer and the ical S-Box instruction. binary representation of operands. The notation [x] indicates the binary representation of a non-negative 2.1 Outline of Our Attack integer x: n−1 In a nutshell, there are two steps in our attack strat- i x = [x]n−1[x]n−2...[x]1[x]0 = ∑ 2 [x]i, egy. The first step is to recover the sum s(x,y) from i=0 the leakage. From there, we then recover the subkeys hence [x]i denotes the i-th bit of [x]. The notation [x][y] (α,β) by solving equations involving x,y and s(x,y). implies the concatenation of two bit strings [x] and We begin by explaining how the attack works in an [y]. The notation [x]k denotes k times repetition of [x]. ideal world where the adversary observes ideal leak- Specifically, [∗]k denotes an arbitrary k-bit string. age without the Gaussian noise e, and then we show In this paper we assume that all integers are drawn how such solution can be adapted to realistic noisy + from Z2n , where n ∈ N . The  denotes modular ad- leakage by using statistical methods. To this end, we dition over Z2n . HW(x) denotes the Hamming weight define the following problems: of [x]. For a n-bit integer x, xe refers to the (one’s) Definition 1 (Hidden Adder Problem (HAP)). Let complement of x, where all the bits are flipped. That (α,β) be randomly chosen from Z2n × Z2n . The ad- is, versary chooses as many pairs x,y ∈ Z2n and obtains x = x ⊕ ( n − ). e 2 1 leakage of the form HW(s(x,y)) = HW((x⊕α)(y⊕ We often require to be able to change the value of β)) for each pair. The adversary must then recover a single bit to its complement (i.e. we flip a bit, but (α,β). leave all other bits unchanged). For this purpose we Definition 2 (Noisy Hidden Adder Problem define the flip function Fi(x) which returns x with the (NHAP)). Let (α,β) be randomly chosen from i-th bit flipped: Z2n × Z2n . The adversary chooses as many pairs i Fi(x) = x ⊕ 2 . x,y ∈ Z2n and obtains leakage of the form Lα,β(x,y) = HW(s(x,y)) + e = HW((x ⊕ α)  (y ⊕ β)) + e where e ∼ N (0,σ2). The adversary must then recover 2 Problem Description (α,β). Note that HAP (and so is NHAP) reflects the ul- In this work we explore alternative SCA strategies timate goal of the adversary to reveal the subkeys when targeting the modular addition in a more general (α,β). To explain our attack, we further define setting as described by the generalised ARX-Box in two sub-problems of HAP and NHAP called Hidden (Yan and Oswald, 2019): Sum Problem (HSP) and Noisy Hidden Sum Problem (NHSP): s(x,y) := (x ⊕ α)  (y ⊕ β), (1) where (x,y) are some known input values and (α,β) Definition 3 (Hidden Sum Problem (HSP)). Let are the unknown sub-keys. All x,y,α,β and s(x,y) are (α,β) be randomly chosen from Z2n × Z2n . The ad- n-bit variables. We further assume an adversary who versary chooses as many pairs x,y ∈ Z2n and obtains is able to choose input pairs (x,y), and observe the leakage of the form HW(s(x,y)) = HW((x⊕α)(y⊕ (noisy) leakage of s(x,y). For simplicity we assume β)) for each pair. The adversary must then recover that the (noisy) leakage is the Hamming weight with s(x,y). Gaussian noise, which is the most commonly consid- Definition 4 (Noisy Hidden Sum Problem (NHSP)). ered leakage model in side-channel literature: Let (α,β) be randomly chosen from Z2n × Z2n . The adversary chooses pairs x,y ∈ Z2n and obtains L(x,y) = HW(s(x,y))+e = HW((x⊕α)(y⊕β))+e, leakage of the form Lα,β(x,y) = HW(s(x,y)) + e = where the Gaussian noise e ∼ (0,σ2). We note that N HW((x ⊕ α) (y ⊕ β)) + e where e ∼ (0,σ2). The our statements also hold for a more general bit-linear  N adversary must then recover s(x,y). model assuming that all coefficients have the same sign. The adversaries in the problems HSP and NHSP The goal of the adversary is to recover the subkeys are given exactly the same form of leakage as the (α,β) given a set of chosen inputs (x,y) and their as- adder problems (HAP and NHAP). Only their goals sociated leakage L(x,y). Later in Section 5 we ex- are changed to recover the sum s(x,y) in the sum plain that how the chosen inputs requirement can be problems rather than the sub-keys in the adder prob- relaxed in different settings. lems. We model our attack that exploits ideal and noisy In this paper, we always consider the general case leakage as the Hidden Adder Problem (HAP) and where n ≥ 2. For the special case where n = 1, we Noisy Hidden Adder Problem (NHAP), respectively, have as formalised in Definition 1 and Definition 2. HW(s(x,y)) = s(x,y) = x ⊕ α ⊕ y ⊕ β which immediately gives α ⊕ β given HW(s(x,y)),x Note that [s]n−1 ∈ {0,1}, ∆sy([x]n−1) ∈ and y. It should also be noted that as shown in (Yan {+2n−1,−2n−1} according to Equation (2), and and Oswald, 2019), any side-channel attack target- −2n−1 mod 2n = +2n−1. Equation (5) can, there- ing the modular addition of the generalised ARX-Box fore, be categorised into four cases: will yield at least two pairs of the subkeys. Conse- n−1 quently, as we will see later in Section 4, all the above 1. If [s]n−1 = 0, ∆sy([x]n−1) = +2 , then problems have at least two pairs of solutions in (α,β). n−2 0 n−1 n−1 i n sn−1 = (+2 + 0 · 2 + ∑ [s]i2 )(mod 2 ) i=0 3 Solving the Hidden Sum Problem = [1][s]n−2...[s]1[s]0. n−1 2. If [s]n−1 = 0, ∆sy([x]n−1) = −2 , then Our solution to the HSP recovers s(x,y) one bit at n−2 a time. Starting from the MSB down to the LSB, we 0 n−1 n−1 i n sn−1 = (−2 + 0 · 2 + ∑ [s]i2 )(mod 2 ) flip each bit of x and observe the resulting differences i=0 in the Hamming weight leakage. In the end we re- = [1][s] ...[s] [s] . cover all bits of the hidden sum s(x,y). We abbreviate n−2 1 0 0 n−1 s := s(x,y) and si := s(Fi(x),y), which are the sum, 3. If [s]n−1 = 1, ∆sy([x]n−1) = +2 , then and the sum with [x]i flipped, respectively. n−2 0 n−1 n−1 i n Since flipping [x]i also flips the bit [x ⊕ α]i, this s = (+2 + 1 · 2 + [s]i2 )(mod 2 ) i n−1 ∑ effectively changes the sum by ±2 due to commuta- i=0 tivity of addition. In the following sections we explain = [0][s]n−2...[s]1[s]0. how to exploit this property of s(x,y) to solve HSP. n−1 Define ∆sy([x]i) to be the difference in s(x,y) in- 4. If [s]n−1 = 1, ∆sy([x]n−1) = −2 , then duced by flipping [x]i. We have: n−2 i n s0 = (− n−1 + · n−1 + [s] i)( n) ∆sy([x]i) ≡ s(Fi(x),y) − s(x,y) ≡ ±2 (mod 2 ). n−1 2 1 2 ∑ i2 mod 2 (2) i=0 Equivalently, = [0][s]n−2...[s]1[s]0. i s(Fi(x),y) = s(x,y) ± 2 . (3) Observe that in Cases 1 and 2, where [s]n−1 = 0, we have 0 3.1 Recovering the MSB sn−1 = [1][s]n−2...[s]1[s]0. Similarly, in Cases 3 and 4 where [s]n−1 = 1, we have In this section we give a solution that recovers the 0 MSB of s, which is the base case for our algorithm. sn−1 = [0][s]n−2...[s]1[s]0. Note that the MSB of s can be reduced to XORing Therefore, we obtain the carry bit c from the sum of the lower-order bits, HW(s0 ) = and the MSBs of x ⊕ α and y ⊕ β. Because flipping n−1  [x]n−1 flips [x ⊕ α]n−1 and thus [s]n−1, the adversary HW(1) + HW([s]n−2[s]n−3...[s]1[s]0)if [s]n−1 = 0, can determine that [s]n−1 = 0 if the Hamming weight HW(0) + HW([s]n−2[s]n−3...[s]1[s]0)if [s]n−1 = 1. increases (from 0 to 1) and vice versa. (6) Lemma 1. Given x, y, Hamming weights of the sums Denote by ∆HWn−1 the (signed) difference in 0 s(x,y) and s(Fn−1(x),y), the MSB of the sum s is: Hamming weight between s to sn−1. Subtracting Equation (6) by Equation (4), we have: ( 0 if HW(s0 ) − HW(s) > 0, 0 n−1 ∆HWn−1 = HW(sn−1) − HW(s) [s]n−1 = 0 1 if HW(sn−1) − HW(s) < 0. HW(1) − HW(0) = +1 if [s]n− = 0, (7) = 1 Proof. We can write HW(s) as: HW(0) − HW(1) = −1 if [s]n−1 = 1. HW(s) = HW([s]n−1) + HW([s]n−2[s]n−3...[s]1[s]0). Observing Equation (7), we can see that the sign (4) of ∆HWn−1 solely depends on [s]n−1. Since both Also, HW(s0 ) and HW(s) can be obtained as (ideal) 0 n−1 sn−1 ≡ ∆sy([x]n−1) + s leakage, we can thus recover [s]n−1 by computing n−2 ∆HWn−1 and then applying Equation (7). n−1 i n ≡ ∆sy([x]n−1) + [s]n−12 + ∑ [s]i · 2 (mod 2 ). i=0 Algorithm 1 provides the pseudo code for recov- (5) ering the MSB. Conditions Algorithm 1 Compute MSB of s ∆HW [s] ∆s ([x] ) Overflow? s n−m function [s] = GetMsb(x,y) n−m y n−m known n−1 + n−m [∗]m−1 ∆HW = HW(s0 ) − HW(s); 2 No +1 n−1 [ ] m−1 if ∆HW > 0 then 0 n−m Yes [0] +m −2 m−(k+2) k return 0; No [∗] [1][0] +k Yes [1]m−1 -m else +2n−m return 1; [1] No [∗]m−(k+2)[0][1]k -k end if −2n−m No [∗]m−1 -1 end function Table 1: ∆HW under different conditions, where 2 ≤ m ≤ n, 0 ≤ k ≤ m. 3.2 Recovering the m-th bit No carry bit. In the following conditions there is no In Section 3.1 we explained how the MSB of s = carry bit: n−m s(x,y) can be recovered by the noiseless Hamming 1. If [s]n−m = 0 and ∆sy([x]n−m) = +2 , then weight leakage. We now show how to recover the ∆HWn−m = +1. n−m remaining bits. 2. If [s]n−m = 1 and ∆sy([x]n−m) = −2 , then ∆HW = −1. Lemma 2. Suppose we flip the bit [x] . If: n−m n−m Otherwise there must exist a carry bit. 0 •HW(sn−m) > HW(s), then [s]n−m = 0, Carry bit. The existence of a carry bit implies either 0 one of the following conditions: •HW(sn−m) = HW(s), then [s]n−m = [es]n−(m−1), n−m •HW(s0 ) < HW(s), then [s] = 1, • Case C1 : [s]n−m = 1 and ∆sy([x]n−m) = +2 . n−m n−m n−m • Case C2 : [s]n−m = 0 and ∆sy([x]n−m) = −2 . for 2 ≤ m ≤ n. is satisfied. This can be further categorised into: Proof. We assume that the higher-order m − 1 bits of Overflow. In this case, all the bits of sknown are s: flipped after the addition: sknown = [s]n−1[s]n−2...[s]n−(m−1) 1. In the Case C1, it is required that sknown = m−1 [1] . The propagation results in sknown has been determined. The goal is then to recover m−1 flipped to [0] , with ∆HWn−m = −m. the next bit [s]n−m. According to Equation (3), when 0 2. In the Case C2, it is required that sknown = [x]n−m is flipped, we have the flipped sum sn−m: m−1 [0] . The propagation results in sknown m−1 0 flipped to [1] , with ∆HWn−m = +m. sn−m = s + ∆sy([x]n−m) (8) No overflow. In this case, only a part of sknown is n−m where ∆sy([x]n−m) = ±2 . flipped after adding ∆sy([x]n−m). Denote by k ∈ In the RHS of Equation (8), bits “lower” than [0,m − 2] the number of bits flipped in sknown [s]n−m is unchanged after the addition operation and before the carry propagation terminates, then thus does not affect the Hamming weight. On the 1. In the Case C1, the carry propagation termi- other hand, the addition to (or subtraction from) nates at the least significant [0] of sknown which [s]n−m may potentially generate a carry bit that propa- is required to have the form gates through bits “higher” than [s] and result into n−m s = [s] ...[s] [0][1]k. a change of Hamming weight. known n−1 n−(m−(k+2)) n−m Let ∆HWn−m be the (signed) change of Hamming After the addition with ∆sy([x]n−m) = +2 , weight induced by flipping [x]n−m: sknown changes to k 0 [s]n−1...[s]n−(m−(k+2))[1][0] . ∆HWn−m = HW(sn−m) − HW(s). (9) Therefore ∆HWn−m = −k. We can categorise ∆HWn−m by: 2. The Case C2 is just the opposite of C1 with • Whether there exists or not a carry bit (either pos- ∆HWn−m = +k. itive or negative), Table 1 summarises the above scenarios. It is • If there exists a carry bit, then shown that positive ∆HWn−m implies [s]n−m = [0] – Whether the carry triggers an overflow (and and negative ∆HWn−m implies [s]n−m = [1] as both hence modular reduction). m,k ≥ 0. The case ∆HWn−m = 0 is only possible when k = 0, which indicates a carry bit exists without over- We next analyse each of the above cases. flow. In such a case sknown is required to be either: Algorithm 2 Compute m-th significant bit [s]n−m (2 ≤ 4 Solving HAP m ≤ n) In this section we show how HAP (cf. Defini- function [s]n−m = tion 1) can be solved using a solution to HSP (cf. Sec- GetNextBit(x,y,m,[s]n−1[s]n−2...[s]n−(m−1)) 0 tion 3). ∆HW = HW(sn−1) − HW(s); if ∆HW > 0 then Lemma 3. Let ∆ := ((s(x,y) − s(xe,y) − 1) return 0; (mod 2n))  1 (here  refers to the right shift else if ∆HW < 0 then operator). return 1; The solutions to HAP are: else . ∆HW == 0 α = ∆ ⊕ x if [s]n−m+1 == 0 then n return 1; β = y ⊕ ((s(x,y) − ∆)(mod 2 )) else or return 0; ( α = (∆ 2n−1) ⊕ x end if  end if β = y ⊕ ((s(x,y) − (∆ + 2n−1)) (mod 2n)), end function for arbitrary x,y ∈ Z2n . m−2 • sknown = [∗] [1], for [s]n−m = 0, or Proof. Observe that for any x ∈ Z2n , we have m−2 • sknown = [∗] [0], for [s]n−m = 1. n xe⊕ α = x]⊕ α = 2 − 1 − (x ⊕ α). Hence In either case, [s]n−m can be determined by the LSB of sknown. s(x,y) − s(x,y) To summarise, given ∆HWn−m, we can uniquely e n determine [s]n−m. = ((x ⊕ α) + (y ⊕ β)) − ((xe⊕ α) + (y ⊕ β)) (mod 2 ), = (x ⊕ α) − (2n − 1 − (x ⊕ α)) (mod 2n), = 2(x ⊕ α) + 1 (mod 2n). Algorithm 2 provides the pseudo code that com- putes sn−m for 2 ≤ m ≤ n. Note that we have already computed the values s(x,y) and s(xe,y) in Section 3. Since 2 is not co-prime to the modulo 2n, there are exactly two values of x ⊕ α that 3.3 Complete Solution to HSP n−1 satisfy the above equation: ∆ and ∆2 . Hence the lemma follows. Combining the methods described in Section 3.1 and Section 3.2, we now have a full solution to the HSP, Algorithm 4 provides the pseudo code for solving as summarised in Algorithm 3. Notice that the same HAP. The algorithm has trace complexity O(n) - re- HW(s) can indeed be reused in Algorithm 1 and Al- quiring 2n + 2 calls to the (ideal) leakage function. gorithm 2 ; hence Algorithm 3 only needs n+1 traces to recover s(x,y). Algorithm 4 Compute (α,β) function (α,β) = GetAlphaBeta(void) Algorithm 3 Compute s Pick arbitrary (x,y); function s = GetSum(x,y) . Compute s(x,y) and s(xe,y) by Algorithm 3 . We initialise the sum to its MSB S0 = GetSum(x,y); s = GetMsb(x,y); S1 = GetSum(xe,y); . Recover one bit at a time from 2nd MSB to . Recover (α,β) using Lemma 3 LSB a1 = ((S0 − S1 − 1)(mod 2n)) >> 1; for (m = 2;m ≤ n;m + +) do a2 = a1 ⊕ 2n−1; s = [s][GetNextBit(x,y,m,s)]; b1 = y ⊕ ((S0 − a1) mod 2n); end for b2 = y ⊕ ((S0 − a2) mod 2n); return s; return {(a1,b1),(a2,b2)}; end function end function Algorithm 5 Determine sign of ∆HW Alternative to Chosen Input Requirement. Due to the symmetric structure of Equation (1), Lemma 2 function ∆HW = CompareHW(S1,S2) also applies when the i-th bit of y is flipped. It should (t, p) = test(S1,S2) if p/2 ≥ Signi ficanceLevel then . Accept also be noted that flipping the i-th bit of α is effec- ∆HW = 0 tively equivalent of flipping the i-th bit of x: return 0 x ⊕ Fi(α) = Fi(x) ⊕ α = Fi(x ⊕ α) else if t > 0 then . +1 for positive ∆HW. and vice versa for y and β. Therefore our attack can return +1 also be achieved in a fault attack set up where the else . −1 for negative ∆HW. plaintexts are known to the adversary and for each bit return −1 the adversary can induce an i-th bit flip of either x, y, end if α or β. end if Checking the recovered key-pair. In many attacks end function utilising side channel information another practically relevant question is that of which part of the side 5 Converting to Noisy Leakage channel trace to use (often not just a single value is available but a vector of leakages). Typical side chan- nel attacks proceed via applying the analysis to all In a real world attack setting an adversary is un- leakage points independently (this lends itself to an likely to have noise free leakages. We thus now con- efficiently parallelisable algorithm). This technique sider how to translate the developed attack strategy applies also to our attack: each leakage point eventu- into a more realistic setting. ally gives one key pair. We can test each key pair via In principle, the reduction explained in Section 4 a pair of known plaintext and . In case of a also holds for NHAP to NHSP, as long as the adver- plaintext only attack, another test would be possible: sary is able to recover s(x,y) given the noisy leak- if we set x = α and y = β then both adder inputs are age in NHSP. Further examining the HSP solution equal to 0 which can be detected by a collision anal- in Section 3, we see it is indeed sufficient to solve ysis (Schramm et al., 2004; Bogdanov, 2007; Moradi HSP given only the signs of the difference ∆HWi = 0 et al., 2010). HW(si) − HW(s),i ∈ [0,n − 1]. In the case of noisy leakages we can reveal this difference by sampling the 5.1 Experiments leakage function (i.e. the device) multiple times on the same input. We thus get two sets of leakages: Attack simulations are a valuable tool because they 0 enable us to produce results quickly for different sets S1 = {HW(si) + e}, of parameter choices. In the absence of detailed char- S = {HW(s) + e}. 2 acterisations of devices and their actual power mod- Clearly by subtracting the averages of these sets, we els, one cannot fall back on established statistical can recover ∆HWi also in the noisy case. Moreover techniques such as a power-based sample size anal- because we are only interested in the sign of ∆HWi, ysis to derive how many leakage traces are necessary we can hope that in practice we don’t require ‘large’ in practice to conduct successful attacks. Thus simu- sets. To add a bit more rigour, we opted to implement lations are the established ‘workaround’ when exam- a standard two-tailed t-test in our experiments. A two- ining new attack techniques. The attack is mainly af- tailed test can tell us if fected by three parameters in practice. These are the 0 noise distribution (characterised by σ), the number of • HW(si) = HW(s), repeat queries to the leakage function N, and the sig- 0 • HW(si) > HW(s), or nificance level of the two-tailed test. It is well under- • HW(s0) < HW(s). stood that these three quantities jointly determine how i well a test performs, and thus in turn, how often our Algorithm 5 summarises the pseudo code that de- attack succeeds. termines the sign of ∆HW. It first conducts a two- We simulated the attack with 16-bit word size tailed test: (n = 16) which has been chosen in certain ARX ci- H0 : S1 = S2 phers such as SPARX. Algorithm 5 is implemented 1 H1 : S1 6= S2 using a t-test . S1 and S2 are chosen to have the same sample size for simplicity. We simulated the attack using a set significance level, and interprets the result 1 in terms of the sign of ∆HWi. We follow the assumption of Gaussian noise, which weight leakage information at first, and then discusses an adaptation to cope with a noisy leakage function. To ascertain the impact of noise we provide a case study and via simulations we demonstrate the impact and trade-off between noise and significance level that would need to be considered in practice. Purely from a practical perspective, there exists a more powerful attack for ARX ciphers on a spe- (a) Significance level of 0.01 cific class of devices as described in (Yan and Os- wald, 2019). They observe a device specific effect called leakage amplification that makes the attack on the rotation very trace efficient on a specific type of device. We are not aware of any successful attack however, beyond the Butterfly attack that requires a known adder input, on the modular addition in the context of an ARX cipher. Thus we believe our contribution is of general . (b) Significance level of 0 05 cryptanalytic interest because it opens up a new av- Figure 1: Simulation results for the attack. The left figure enue for analysing modular additions utilising rather shows the success rate as a function of the number of leak- generic leakage from the adder output. age traces (per query) when choosing a significance of 0.01. The right figure is identical but for a significance of 0.05. using different configurations where σ ∈ [0.1,6] and 7 Acknowledgements N ∈ [100,1000]. Recall that N is the number of sam- ples used for each tests. A complete attack hence uses This work has been funded in parts by the Eu- 2N(n + 1) traces. ropean Union (EU) via the ERC project 725042 The result follows our expectations in general. For (acronym SEAL). The third author’s work was funded the same choice of significance level, the number of by the INSPIRE Faculty Award (DST, Govt. of India). traces required to achieve the same success rate in- creases as the noise variance increases. This directly derives from the fact that the power of t-test weak- ens as the noise variance increases. With the same REFERENCES amount of traces and the same noise level, the signif- icance level of 0.05 has a lower success rate cap than Aumasson, J.-P., Henzen, L., Meier, W., and Phan, R. C.- 0.01, which are around 0.74 and 0.96, respectively. W. (2008). SHA-3 proposal BLAKE. Submission to But in return the former also showed a better noise NIST. tolerance than the latter. Since the significance level Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., and Wingers, L. (2015). The SIMON and of 0.05 implies a higher rate of false positives in re- SPECK lightweight block ciphers. In Proceedings of turn for less traces required; thus the results follow an the 52nd Annual Design Automation Conference, San expected and natural trend. Francisco, CA, USA, June 7-11, 2015, pages 175:1– 175:6. ACM. Beierle, C., Biryukov, A., dos Santos, L. C., Großschadl,¨ J., Perrin, L., Udovenko, A., Velichkov, V., and Wang, 6 Conclusions Q. (2008). SchwaemmandEsch: Lightweight Authen- ticated andHashing using the Sparkle Per- We present a novel theoretical concept to attack mutation Family. Inf. Comput., 206(2-4):378–401. the modular addition in the context of ARX ciphers Bernstein, D. J. (2008). The Salsa20 Family of Stream Ci- using side channel leakage. Our assumptions are min- phers. In Robshaw, M. J. B. and Billet, O., editors, imal w.r.t. the leakage because we only require to ob- New Designs - The eSTREAM Final- serve a change in the leakage magnitude relating to ists, volume 4986 of Lecture Notes in Computer Sci- the adder output in an implementation. Our paper de- ence, pages 84–97. Springer. tails the mathematical idea using idealised Hamming Bernstein, D. J., Kolbl,¨ S., Lucks, S., Massolino, P. M. C., Mendel, F., Nawaz, K., Schneider, T., Schwabe, P., motivates the choice of the t-test. Thus if noise follows a Standaert, F.-X., Todo, Y., and Viguier, B. (2017). different distribution, then a different two-tailed test could Gimli : A cross-platform permutation. In Fischer, be used. W. and Homma, N., editors, Cryptographic Hardware and Embedded Systems – CHES 2017, pages 299– June 2-5, 2015, Revised Selected Papers, pages 559– 320, Cham. Springer International Publishing. 578. Biryukov, A., Dinu, D., and Großschadl,¨ J. (2016). Cor- Schramm, K., Leander, G., Felke, P., and Paar, C. (2004). relation power analysis of lightweight block ciphers: A collision-attack on AES: combining side channel- From theory to practice. In Manulis, M., Sadeghi, and differential-attack. In Joye, M. and Quisquater, J., A., and Schneider, S., editors, Applied Cryptogra- editors, Cryptographic Hardware and Embedded Sys- phy and Network Security - 14th International Confer- tems - CHES 2004: 6th International Workshop Cam- ence, ACNS 2016, Guildford, UK, June 19-22, 2016. bridge, MA, USA, August 11-13, 2004. Proceedings, Proceedings, volume 9696 of Lecture Notes in Com- volume 3156 of Lecture Notes in Computer Science, puter Science, pages 537–557. Springer. pages 163–175. Springer. Biryukov, A. and Perrin, L. (2017). State of the art in Shimizu, A. and Miyaguchi, S. (1988). FEAL - fast data lightweight symmetric cryptography. IACR Cryptol- encipherment algorithm. Systems and Computers in ogy ePrint Archive, 2017:511. Japan, 19(7):20–34. Bogdanov, A. (2007). Improved side-channel collision at- Yan, Y. and Oswald, E. (2019). Examining the practical tacks on AES. In Adams, C. M., Miri, A., and Wiener, side channel resilience of ARX-boxes. In Palumbo, M. J., editors, Selected Areas in Cryptography, 14th F., Becchi, M., Schulz, M., and Sato, K., editors, Pro- International Workshop, SAC 2007, Ottawa, Canada, ceedings of the 16th ACM International Conference August 16-17, 2007, Revised Selected Papers, volume on Computing Frontiers, CF 2019, Alghero, Italy, 4876 of Lecture Notes in Computer Science, pages April 30 - May 2, 2019., pages 373–379. ACM. 84–95. Springer. Zohner, M., Kasper, M., and Stottinger,¨ M. (2012). Dinu, D., Perrin, L., Udovenko, A., Velichkov, V., Butterfly-attack on Skein’s modular addition. In Großschadl,¨ J., and Biryukov, A. (2016). Design Schindler, W. and Huss, S. A., editors, Construc- strategies for ARX with provable bounds: SPARX tive Side-Channel Analysis and Secure Design - Third and LAX. In Cheon, J. H. and Takagi, T., editors, International Workshop, COSADE 2012, Darmstadt, Advances in Cryptology - ASIACRYPT 2016 - 22nd Germany, May 3-4, 2012. Proceedings, volume 7275 International Conference on the Theory and Applica- of Lecture Notes in Computer Science, pages 215– tion of Cryptology and Information Security, Hanoi, 230. Springer. Vietnam, December 4-8, 2016, Proceedings, Part I, volume 10031 of Lecture Notes in Computer Science, pages 484–513. Ferguson, N., Lucks, S., Schneier, B., Whiting, D., Bel- lare, M., Kohno, T., Callas, J., and Walker, J. (2010). The Skein hash function family. Submission to NIST (round 3), 7(7.5):3. Jungk, B., Petri, R., and Stottinger,¨ M. (2018). Effi- cient side-channel protections of ARX ciphers. IACR Transactions on Cryptographic Hardware and Em- bedded Systems, 2018(3):627–653. Mangard, S., Oswald, E., and Popp, T. (2007). Power analysis attacks - revealing the secrets of smart cards. Springer. Moradi, A., Mischke, O., and Eisenbarth, T. (2010). Correlation-enhanced power analysis . In Mangard, S. and Standaert, F., editors, Crypto- graphic Hardware and Embedded Systems, CHES 2010, 12th International Workshop, Santa Barbara, CA, USA, August 17-20, 2010. Proceedings, volume 6225 of Lecture Notes in Computer Science, pages 125–139. Springer. Nir, Y. and Langley, A. (2015). ChaCha20 and for IETF Protocols. RFC 7539 (Informational). Prouff, E. (2005). DPA attacks and s-boxes. In Fast Soft- ware Encryption: 12th International Workshop, FSE 2005, Paris, France, February 21-23, 2005, Revised Selected Papers, pages 424–441. Schneider, T., Moradi, A., and Guneysu,¨ T. (2015). Arith- metic addition over boolean masking - towards first- and second-order resistance in hardware. In Applied Cryptography and Network Security - 13th Interna- tional Conference, ACNS 2015, New York, NY, USA,