
How to Backdoor Diffie-Hellman


David Wong

NCC Group, June 2016

Abstract

Lately, several backdoors in cryptographic constructions, protocols and implementations have been surfacing in the wild: Dual-EC in RSA’s B-Safe product, a modified Dual-EC in Juniper’s operating system ScreenOS and a non-prime modulus in the open-source tool socat. Many papers have already discussed the fragility of cryptographic constructions not using nothing-up-my-sleeve numbers, as well as how such numbers can be safely picked. However, the question of how to introduce a backdoor in an already secure, safe and easy to audit implementation has so far rarely been researched (in the public).

We present a new way of building a Nobody-But-Us (NOBUS) Diffie-Hellman backdoor by using a composite modulus with a smooth order. We then explain how we were able to implement a proof of concept with socat and OpenSSL in order to exploit our backdoor on the TLS protocol.

Keywords: Diffie-Hellman, Ephemeral, DHE, NOBUS, Backdoor, Discrete Logarithm, Small Subgroup Attack, Pohlig-Hellman, Pollard Rho, Factorization, Pollard’s p-1, ECM, Dual-EC, Juniper, socat

Update (December 2016): Dorey et al.[1] have pointed out an attack on our first contribution, as well as an improvement for the exploitation of our second contribution. This work has been updated to reflect these advances.

[1] Kristen Dorey, Nicholas Chang-Fong, and Aleksander Essex. Indiscreet Logs: Persistent Diffie-Hellman Backdoors in TLS. Cryptology ePrint Archive, Report 2016/999. http://eprint.iacr.org/2016/999. 2016

1 Introduction

Around Christmas 2015 Juniper, a networking hardware company, released an out-of-cycle security bulletin[2]. Two vulnerabilities were disclosed without many details to help us grasp the seriousness of the situation. Fortunately, at this period of the year many researchers were home with nothing else to do but solve this puzzle. By quickly comparing the patched and vulnerable binaries, the two issues were pinpointed. While one of the vulnerabilities was a simple “master” password implemented at a crucial step of the product’s authentication, the other discovery was a bit more subtle: a unique value was modified. More accurately, a number in the source code was replaced. The introduction of the vulnerability was so simple that, because the number was stored as a string of hexadecimal digits, the trivial use of the UNIX command line tool strings was enough to discover it.

[2] https://kb.juniper.net/InfoCenter/index?page=content&id=JSA10713

Figure 1: The strings of the patched binary

Figure 2: The strings of the vulnerable binary

The special value ended up being a constant used in the system’s pseudo-random number generator (PRNG) Dual EC, an odd algorithm believed to have been backdoored by the NSA[3]. The PRNG’s core has the ability to provide a Nobody-But-Us (NOBUS) trapdoor: a secret passage that can only be accessed by the people holding the secret key. In our case: the k in the Dual EC equation Q = [k]P (where P and Q are the two elliptic curve points used in the foundation of Dual EC). Solely the NSA is thought to be in possession of that k value, making them the only ones able to climb back to the PRNG’s internal state from random outputs, and then able to predict the PRNG’s future states and outputs. The backdoor in Dual EC was pointed out by Shumow and Ferguson[4] at Crypto 2007, which might have been the reason why Juniper generated their own point Q in their implementation of Dual EC. Shortly after that revision, a mysterious update would change that Q point one more time, magically allowing another organization, or person, to access that backdoor in place of the NSA or Juniper. Although the quest to find Juniper’s backdoor and the numerous open questions that arose from that work are a fascinating read by themselves[5], it is only the introduction of the work you are currently reading. Here we aim to show how secure and strong cryptographic constructions are a single and subtle change away from being your own secretive peep show.

[3] Daniel J. Bernstein, Tanja Lange, and Ruben Niederhagen. Dual EC: A Standardized Back Door. Cryptology ePrint Archive, Report 2015/767. http://eprint.iacr.org/2015/767. 2015
[4] Shumow and Ferguson. On the Possibility of a Back Door in the NIST SP800-90 Dual Ec Prng. Crypto 2007. http://rump2007.cr.yp.to/15-shumow.pdf. 2007
[5] Stephen Checkoway et al. A Systematic Analysis of the Juniper Dual EC Incident. Cryptology ePrint Archive, Report 2016/376. http://eprint.iacr.org/2016/376. 2016

On February 1st, 2016, only a few months after Juniper’s debacle, socat published a security advisory of its own[6]:

    In the OpenSSL address implementation the hard coded 1024-bit DH p parameter was not prime. The effective cryptographic strength of a key exchange using these parameters was weaker than the one one could get by using a prime p. Moreover, since there is no indication of how these parameters were chosen, the existence of a trapdoor that makes possible for an eavesdropper to recover the shared secret from a key exchange that uses them cannot be ruled out.

[6] http://www.openwall.com/lists/oss-security/2016/02/01/4

In the same vein as Juniper’s problem, a single number was at issue. This time it was the public modulus, an integer used to generate the ephemeral Diffie-Hellman keys of both parties during socat’s TLS handshakes. This algorithm had been, contrary to Dual-EC, considered secure from the start. But as it turned out, it was badly understood as well: as the Logjam[7] paper had demonstrated earlier in the previous year, most servers would use Diffie-Hellman key exchanges to perform ephemeral handshakes, and the same servers would generate their ephemeral keys from hardcoded defaults (often the same ones) provided by various TLS libraries. The paper raised a wave of discussion around how developers should use Diffie-Hellman, at the same time scaring people away from 1024 bit DH: “We estimate that even in the 1024-bit case, the computations are plausible given nation-state resources”.

[7] David Adrian et al. “Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice”. In: 22nd ACM Conference on Computer and Communications Security. https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf. Oct. 2015

Securely integrating DH in a protocol is unfortunately not well understood. Defensive approaches are discussed in several RFCs[8][9], but few papers so far have taken the point of view of the attacker. The combination of the current trend of increasing the bitsize of DH parameters with the now old trend of using open source libraries’ defaults to generate ephemeral Diffie-Hellman keys would give opportunist attackers a valid excuse to submit their bigger (more secure) and backdoored parameters into open-source or closed-source libraries. This work is about generating such backdoors and implementing them in TLS, showing how easy and subtle the process is. The working code along with explanations on how to reproduce our setup is available on Github[10].

[8] Eric Rescorla. RFC 2631: Diffie-Hellman Key Agreement Method. RFC 2631. https://rfc-editor.org/rfc/rfc2631.txt. 2013. doi: 10.17487/rfc2631
[9] Robert Zuccherato. RFC 2785: Methods for Avoiding the Small-Subgroup Attacks on the Diffie-Hellman Key Agreement Method for S/MIME. RFC 2785. https://rfc-editor.org/rfc/rfc2785.txt. 2013. doi: 10.17487/rfc2785
[10] https://github.com/mimoo/Diffie-Hellman_Backdoor

In section 2, we will first briefly talk about several attacks possible on Diffie-Hellman, from small subgroup attacks to Pohlig-Hellman’s algorithm. In section 3 we will introduce our first attempt at a DH backdoor. We will present another DH backdoor in section 4 by using the ideas of the previous section with a composite modulus. In section 5 we will see another method using a composite modulus that allows us to choose a specific generator, allowing us to only modify the modulus value when implementing our backdoor. In section 6 we will explain how we implemented the backdoor in TLS and how we exploited it. We will then see in section 7 how to detect such backdoors and how to prevent them. Eventually we will wrap it all up in section 8.

2 Attacks on Diffie-Hellman and the Discrete Logarithm

To attack a Diffie-Hellman key exchange, one could extract the secret key $a$ from one of the peers’ public keys $y_a = g^a \pmod{p}$. One could then compute the shared key $g^{ab} \pmod{p}$ using the other peer’s public key $y_b = g^b \pmod{p}$.

The naive way to go about this is to compute each power of $g$ (while tracking the exponent) until the public key is found. This is called trial multiplication and would need on average $\frac{q}{2}$ operations to find a solution (with

$q$ the order of the base). More efficiently, algorithms that compute discrete logarithms in an expected $\sqrt{q}$ steps, like Shanks’ baby-step giant-step (deterministic), Pollard Rho or Pollard Kangaroo (both probabilistic), can be used. Because of the memory required for baby-step giant-step, Pollard’s algorithms are often preferred. While both are parallelizable, Pollard Kangaroo is used when the order is unknown or known to be in a small interval. For larger orders the Index Calculus or Number Field Sieve (NFS) algorithms are the most efficient. But so far, computing a discrete logarithm in polynomial time on a classical computer is still an open problem.

2.1 Pollard Rho

The algorithm that interests us here is Pollard Rho: it is fast in relatively small orders, it is parallelizable and it takes very little memory to run. The idea comes from the birthday paradox and the following equation (where $x$ is the secret key we are looking for; and $a$, $a'$, $b$ and $b'$ are known):

$$g^{xa+b} = g^{xa'+b'} \pmod{p} \implies x = (a - a')^{-1}(b' - b) \pmod{p - 1}$$

The birthday paradox tells us that by looking for a random collision we can quickly find one in $O(\sqrt{p})$. A random function is used to efficiently step through various $g^{xa+b}$ until two values repeat themselves; it is then straightforward to calculate $x$. Cycle-finding algorithms are used to avoid storing every iteration of the algorithm (two different iterations of $g^{xa+b}$ are started and end up in a loop past a certain step) and the technique of distinguished points is used to parallelize the algorithm. (Machines only save and share particular iterations, for example iterations starting with a chosen number of zeros.)

2.2 Pohlig-Hellman

In 1978, Pohlig and Hellman discovered a shortcut to the discrete logarithm problem[11]: if you know the complete factorization of the order of the group, and all of the factors are relatively small, then the discrete logarithm can be quickly computed.

[11] Stephen Pohlig and Martin Hellman. An Improved Algorithm for Computing Logarithms over GF(p) and its Cryptographic Significance. http://www-ee.stanford.edu/~hellman/publications/28.pdf. 1978

The idea is to find the value of the secret key $x$ modulo the divisors of the group’s order by reducing the public key $y = g^x \pmod{p}$ in subgroups of order dividing the group order. Thanks to the Chinese Remainder Theorem (CRT) stated later on, the secret key can then be reassembled in the group order. Summed up below is the full Pohlig-Hellman algorithm (with $\varphi$ being Euler’s totient function):

1. Determine the prime factorization of the order of the group: $\varphi(p) = \prod_i p_i^{k_i}$
2. Determine the value of $x$ modulo $p_i^{k_i}$ for each $i$
3. Recompute $x \pmod{\varphi(p)}$ with the CRT

The central idea of Pohlig and Hellman’s algorithm is in how they determine the value of the secret key $x$ modulo each factor $p_i^{k_i}$ of the order. One way of doing it is to try to reduce the public key to the subgroup we’re looking at by computing:

$$y^{\varphi(p)/p_i^{k_i}} \pmod{p}$$

Computing the discrete logarithm of that value, we get $x \pmod{p_i^{k_i}}$. This works because of the following observation (note that $x$ can be written $x_1 + p_i^{k_i} x_2$ for some $x_1$ and $x_2$):

$$\begin{aligned}
y^{\varphi(p)/p_i^{k_i}} &= (g^x)^{\varphi(p)/p_i^{k_i}} \pmod{p} \\
&= g^{(x_1 + p_i^{k_i} x_2)\,\varphi(p)/p_i^{k_i}} \pmod{p} \\
&= g^{x_1 \varphi(p)/p_i^{k_i}} \, g^{x_2 \varphi(p)} \pmod{p} \\
&= g^{x_1 \varphi(p)/p_i^{k_i}} \pmod{p} \\
&= \left(g^{\varphi(p)/p_i^{k_i}}\right)^{x_1} \pmod{p}
\end{aligned}$$

The value we obtain is a generator of the subgroup of order $p_i^{k_i}$ raised to the power $x_1$. By computing the discrete logarithm of this value we will obtain $x_1$, which is the value of $x$ modulo $p_i^{k_i}$. Generally we will use the Pollard Rho algorithm to compute that discrete logarithm.

The Chinese Remainder Theorem, sometimes used for good[12], will be of use here for evil. The following theorem states why it is possible for us to find a solution to our problem once we find a solution modulo each prime power factor of the order.

[12] Shinde and Fadewar. Faster RSA Algorithm for Decryption Using Chinese Remainder Theorem. http://www.techscience.com/doi/10.3970/icces.2008.005.255.pdf

Theorem 1. Suppose $m = \prod_{i=1}^{k} m_i$ with $m_1, \cdots, m_k$ pairwise co-prime. For any $(a_1, \cdots, a_k)$ there exists an $x$ such that:

$$\begin{cases} x = a_1 \pmod{m_1} \\ \quad\vdots \\ x = a_k \pmod{m_k} \end{cases}$$

There is a simple way to recover $x \pmod{m}$ which is stated in the following theorem:

Theorem 2. Moreover there exists a unique solution for $x \pmod{m}$:

$$x = \sum_{i=1}^{k} a_i \left( \prod_{j \neq i} m_j \bar{m}_j \right) \pmod{m} \quad \text{with } \bar{m}_j = m_j^{-1} \pmod{m_i}$$

At first, it might be kind of hard to grasp where that formula is coming from. But let’s see where it comes from by starting with only two equations. Keep in mind that we want to find the value of $x$ modulo $m = m_1 m_2$:

$$\left.\begin{aligned} x &= a_1 \pmod{m_1} \\ x &= a_2 \pmod{m_2} \end{aligned}\right\} \implies x = \,? \pmod{m}$$

How can we start building the value of $x$?

If $x = a_1 m_2 \pmod{m}$, then

$$\begin{cases} x = a_1 m_2 \pmod{m_1} \\ x = 0 \pmod{m_2} \end{cases}$$

Not quite what we want, but we are getting there. Let’s add to it:

If $x = a_1 m_2 \bar{m}_2 \pmod{m}$ with $\bar{m}_2$ the integer congruent to $m_2^{-1} \pmod{m_1}$, then

$$\begin{cases} x = a_1 m_2 \bar{m}_2 = a_1 \pmod{m_1} \\ x = 0 \pmod{m_2} \end{cases}$$

That’s almost what we want! Half of what we want actually. We just need to do the same thing for the other side of the equation, and we have:

$$x = a_1 m_2 \bar{m}_2 + a_2 m_1 \bar{m}_1 \pmod{m}$$

$$x = a_1 m_2 \bar{m}_2 \pmod{m_1} = a_1 \pmod{m_1}$$

$$x = a_2 m_1 \bar{m}_1 \pmod{m_2} = a_2 \pmod{m_2}$$
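The two-congruence construction above can be checked numerically. The following is a minimal Python sketch, not the paper's own code: the function name `crt` and the sample residues are illustrative assumptions, and it relies on Python 3.8+ for `pow(x, -1, m)` and `math.prod`. For brevity the general helper inverts the whole product of the other moduli at once, which is equivalent to multiplying the individual inverses.

```python
from math import prod

def crt(residues, moduli):
    """Return the unique x modulo prod(moduli) with x = a_i (mod m_i).

    Assumes the moduli are pairwise coprime, as in Theorem 1.
    """
    m = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        rest = m // m_i                # product of all m_j with j != i
        rest_inv = pow(rest, -1, m_i)  # inverse of that product modulo m_i
        x += a_i * rest * rest_inv
    return x % m

# Two-congruence case from the text:
# x = a1*m2*m2bar + a2*m1*m1bar (mod m1*m2)
a1, m1 = 2, 5
a2, m2 = 3, 7
m2bar = pow(m2, -1, m1)  # m2^-1 (mod m1)
m1bar = pow(m1, -1, m2)  # m1^-1 (mod m2)
x = (a1 * m2 * m2bar + a2 * m1 * m1bar) % (m1 * m2)

print(x, x % m1, x % m2)  # 17 2 3
```

The same helper accepts any number of pairwise coprime moduli, which is what the Pohlig-Hellman reassembly step needs.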

In the combined expression for $x$, $\bar{m}_2$ is the integer congruent to $m_2^{-1} \pmod{m_1}$ and $\bar{m}_1$ is the integer congruent to $m_1^{-1} \pmod{m_2}$.

Everything works as we wanted! Now you should understand better how we came up with that general formula. There have been improvements to it with Garner’s algorithm[13], but this method is so fast anyway that it is not the bottleneck of the whole attack.

[13] http://www.csee.umbc.edu/~lomonaco/s08/441/handouts/GarnerAlg.pdf

2.3 Small Subgroup Attacks

The attack we just visited is a passive attack: the knowledge of one Diffie-Hellman exchange between two parties is enough to obtain the following shared key. But instead of reducing one party’s public key to an element of different subgroups, there is another clever attack called a small subgroup attack that creates the different subgroup generators directly and sends them to one peer successively to obtain its private key. It is an active attack that doesn’t work against ephemeral protocols that renew the Diffie-Hellman public key for every new key exchange. This is for example the case with TLS when using ephemeral Diffie-Hellman (DHE) as a key exchange during the handshake.

The attack is straightforward and summed up below:

1. Determine the prime factorization of the order of the group: $\varphi(p) = \prod_i p_i^{k_i}$
2. Find a generator for every subgroup of order $p_i^{k_i}$; this can be done by picking a random element $\alpha$ and computing $\alpha^{\varphi(p)/p_i^{k_i}} \pmod{p}$
3. Send generators one by one as your public keys in different Diffie-Hellman key exchanges
4. Determine the value of $x$ modulo $p_i^{k_i}$ for each shared key computed
5. Recompute $x \pmod{\varphi(p)}$ with the CRT

The fourth step can be done by having access to an oracle telling you what the shared key computed by the victim is. In TLS this is done by brute-forcing the possible solutions and seeing which one has been used by the victim in the encrypted messages that follow (for example the MAC computation in the Finished message during the handshake). With these constraints the attack would be weaker than Pohlig-Hellman since the brute-force is slower than Pollard Rho, or even trial multiplication. Because of the previously stated limitations and the fact that this attack only works for rather small subgroups, we won’t use it in this work.

3 A First Backdoor Attempt in Prime Groups

The naive approach to creating a backdoor would be to weaken the parameters enough to make the computation of discrete logarithms affordable. Making the modulus a prime of a special form ($r^e + s$ with small $r$ and $s$) would facilitate the Special Number Field Sieve (SNFS) algorithm. Having a small modulus would also allow for easier pre-computation of the General Number Field Sieve (GNFS) algorithm. It is believed[14] that the NSA has enough power to achieve the first pre-computing phases of GNFS on 1024 bit primes, which would then allow them to compute discrete logarithms in such large groups in a matter of seconds. But these ideas are pure computational advantages that involve no secret key to make the use of efficient backdoors possible. Moreover they are downright impractical: the attacker would have to re-do the pre-computing phase entirely for every different modulus, and the next generation of recommended modulus bitsize (2048+) would make these kinds of computational advantages fruitless.

[14] David Adrian et al. “Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice”. In: 22nd ACM Conference on Computer and Communications Security. https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf. Oct. 2015

Another approach could be to use a generator of a smaller subgroup (without publishing what smaller subgroup we use) so that algorithms like Pollard Rho would be cost-effective again.

[Diagram: $\varphi(p) = p - 1 = p_1 \times \cdots \times p_k$; order of the generator in $y = g^x \pmod{p}$]

But then algorithms like Pollard Kangaroo, which run in the same amount of time as Pollard Rho and do not require the knowledge of the base’s order, could be used as well by anyone willing to try. This makes it a poorly hidden backdoor that we cannot qualify as NOBUS.

Section 4 uses both of these ideas to insert a backdoor into Diffie-Hellman by using a composite modulus. GNFS and SNFS can then be used modulo the factors of the composite modulus, or better as we will see, the generator’s “small” subgroups can be concealed modulo the factors.

Back to our prime modulus. A second idea would be to set the scene for the Pohlig-Hellman algorithm to work. This can be done by fixing a prime modulus $p$ such that $p - 1$ is B-smooth with B small enough for discrete logarithms in bases of order B to be possible.

[Diagram: $y = g^x \pmod{p}$ with $\varphi(p) = p - 1 = p_1 \times \cdots \times p_k$ → Pohlig-Hellman → $x \pmod{p_1}, \cdots, x \pmod{p_k}$ → CRT → $x \pmod{\varphi(p)}$]

But this design is flawed in the same ways as the previous ones were: anyone can compute the order of the group (by subtracting 1 from $p$) and try to factor it. Choosing $p$ such that $p - 1$ would include factors small enough to use one of the $O(\sqrt{p})$ algorithms would make it dangerously factorisable. Using the Elliptic Curve Method (ECM), a factorization algorithm whose complexity only depends on the size of the smallest factor (or for a full factorization, on the size of the second largest factor), the latest records[15] were able to find factors of around 300 bits. This necessary lower bound on the factors makes it unfeasible to use any of the $O(\sqrt{p})$ algorithms, which would take, for example, more than $2^{150}$ operations to solve the discrete logarithm of 300 bit orders.

[15] http://www.loria.fr/~zimmerma/records/top50.html

Our contribution in section 5 uses a composite modulus to hide the smoothness of the order as long as the modulus cannot be factored. This method is preferred over the other methods as it is not reversible and might only need a change of modulus. For example, in many DH parameters or implementations, g = 2 is often used as a generator. While the first method will not allow any easy ways to find a specific generator, our second method will.

4 A Second Backdoor Attempt With a Composite Modulus and a Hidden Subgroup

Update: this section was first presented as a way to construct a Nobody-But-Us backdoor, improving on the previous section. Unfortunately, Dorey et al.[16] have pointed out an attack on this method that makes the backdoor reversible. This section has been left as is, but it should be noted that there exists a way to reverse it according to Coron et al.[17].

[16] Kristen Dorey, Nicholas Chang-Fong, and Aleksander Essex. Indiscreet Logs: Persistent Diffie-Hellman Backdoors in TLS. Cryptology ePrint Archive, Report 2016/999. http://eprint.iacr.org/2016/999. 2016
[17] Jean-Sebastien Coron et al. Cryptanalysis of the RSA Subgroup Assumption from TCC 2005. Cryptology ePrint Archive, Report 2010/650. http://eprint.iacr.org/2010/650. 2010

Our first backdoor gets around the previous problems using a composite modulus $n = pq$ with $p$ and $q$ large enough to avoid the factorization of $n$. This requires the same precautions used to secure RSA instances, with $n$ typically reaching 2048 bits and with two factors $p$ and $q$ nearing the same size.

With the factorization of $n$ known, the discrete logarithm problem can be reduced modulo $p$ and $q$ and solved there, before being reconstructed modulo $\varphi(n)$ with the help of the CRT.

[Diagram: $y = g^x \pmod{n = pq}$ → $y \pmod{p}$ and $y \pmod{q}$ → NFS/SNFS on each → $x \pmod{p - 1}$ and $x \pmod{q - 1}$ → CRT → $x \pmod{(p - 1)(q - 1)}$]

$p$ and $q$ could be hand-picked as SNFS primes, or we could use GNFS to compute the discrete logarithm modulo $p$ and $q$. But a more efficient way exists to ease the discrete logarithm problem. Choosing a generator $g$ such that both $g$ modulo $p$ and $g$ modulo $q$ generate “small” subgroups would allow us to compute two discrete logarithms in two small subgroups instead of one discrete logarithm in one large group.

For example, we could pick $p$ and $q$ such that $p - 1 = 2 p_1 p_2$ and $q - 1 = 2 q_1 q_2$ with $p_1$ and $q_1$ two small prime factors and $p_2$, $q_2$ two large prime factors. Lagrange’s theorem tells us that the possible orders of the subgroups are divisors of the group order. This means we can probably find an element $g$ of order $p_1 q_1$ to be our Diffie-Hellman generator.

[Diagram: $p - 1 = 2\,p_1\,p_2$ and $q - 1 = 2\,q_1\,q_2$, showing the order of the subgroup generated by $g \pmod{p}$ and by $g \pmod{q}$]

By reducing the discrete logarithm problem $y = g^x$ modulo $p$ and $q$ with our new backdoored generator, we can compute $x$ modulo $p - 1$ and $q - 1$ more easily and then recompute an equivalent secret key modulo $(p - 1)(q - 1)$.
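The reduction described above can be sketched end to end with toy parameters. Everything in this block is an illustrative assumption (the tiny primes 71 and 67, the private keys, the helper names); real parameters would use 1024-bit primes, and Pollard Rho rather than brute force in the subgroups. It requires Python 3.8+ for `pow(x, -1, m)`.

```python
p, q = 71, 67    # toy primes: p - 1 = 2*5*7 and q - 1 = 2*3*11
p1, q1 = 5, 3    # the hidden "small" subgroup orders
n = p * q

def element_of_order(r, prime):
    """Return an element of prime order r modulo the given prime."""
    for h in range(2, prime):
        candidate = pow(h, (prime - 1) // r, prime)
        if candidate != 1:
            return candidate
    raise ValueError("no element found")

def crt_pair(a, m, b, k):
    """Combine x = a (mod m) and x = b (mod k) for coprime m and k."""
    return (a * k * pow(k, -1, m) + b * m * pow(m, -1, k)) % (m * k)

def small_dlog(base, target, order, modulus):
    """Brute-force the exponent in a subgroup of small order."""
    for x in range(order):
        if pow(base, x, modulus) == target:
            return x
    raise ValueError("target not in subgroup")

# Backdoored generator: order p1 modulo p and order q1 modulo q.
g = crt_pair(element_of_order(p1, p), p, element_of_order(q1, q), q)

# Honest peers pick private keys without knowing about the subgroup.
alice_private, bob_private = 1234, 2345
alice_public = pow(g, alice_private, n)
bob_public = pow(g, bob_private, n)

# Attacker who knows p and q: two tiny discrete logarithms, then the CRT.
x_p = small_dlog(g % p, alice_public % p, p1, p)
x_q = small_dlog(g % q, alice_public % q, q1, q)
equivalent_key = crt_pair(x_p, p1, x_q, q1)

attacker_shared = pow(bob_public, equivalent_key, n)
real_shared = pow(bob_public, alice_private, n)
```

In this sketch `equivalent_key` is only Alice's key reduced modulo $p_1 q_1$, yet `attacker_shared` matches `real_shared`, because the generator's order divides $p_1 q_1$.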

This reconstruction will find the exact original secret key with a probability of $\frac{1}{4 p_2 q_2}$, which is tiny, but this doesn’t matter since the shared key we will compute with that solution and the other peer’s public key will be a valid shared key.

Proof. Let $a + k_a p_1 q_1$ be Alice’s private key for $k_a \in \mathbb{Z}$ and let $b + k_b p_1 q_1$ be Bob’s private key for $k_b \in \mathbb{Z}$; then Bob’s shared key will be $(g^{a + k_a p_1 q_1})^{b + k_b p_1 q_1} = g^{ab} \pmod{n}$. Let $a + k_c p_1 q_1$ be the solution we found for $k_c \in \mathbb{Z}$; then the shared key we will compute will be $(g^{b + k_b p_1 q_1})^{a + k_c p_1 q_1} = g^{ab} \pmod{n}$, which is the same as Bob’s shared key.

Update: here, the research of Dorey et al.[18] pointed out that $p_1$ and $p_2$ can be found in $O(\max(p_1, p_2))$ operations with Coron et al.’s attack[19], which makes this backdoor as easy to reverse as it is to exploit. Thus this can no longer be called a Nobody-But-Us backdoor. Fortunately, our contribution in section 5 still holds.

[18] Kristen Dorey, Nicholas Chang-Fong, and Aleksander Essex. Indiscreet Logs: Persistent Diffie-Hellman Backdoors in TLS. Cryptology ePrint Archive, Report 2016/999. http://eprint.iacr.org/2016/999. 2016
[19] Jean-Sebastien Coron et al. Cryptanalysis of the RSA Subgroup Assumption from TCC 2005. Cryptology ePrint Archive, Report 2010/650. http://eprint.iacr.org/2010/650. 2010

We used the Pollard Rho function in Sage 6.10 on a Macbook Pro with an i7 Intel Core @ 3.1GHz to compute discrete logarithms modulo safe primes of diverse bitsizes. The results are summed up in the table below.

order size    expected complexity    time
40 bits       2^20                   1 s
45 bits       2^22                   4 s
50 bits       2^25                   34 s

A stronger and more clever attacker would parallelize this algorithm on more powerful machines to obtain better numbers. To be able to exploit the backdoor “live” we want a running time close to zero. Using an 80 bit integer as our generator’s order, someone with no knowledge of the factorization of the modulus would take around $2^{40}$ operations to compute a discrete logarithm, while this would take us on average $2^{21}$ thanks to the trapdoor. A more serious adversary with a higher computation power and a care for security might want to choose a 200 bit integer as the generator’s order. For that he would need to be able to perform $2^{50}$ operations instantaneously if he wanted to tamper with the encrypted communications following the key exchange, while an outsider would have to perform an “impossible” number of $2^{100}$ operations. The size of the two primes $p$ and $q$, and of the resulting $n = pq$, should be chosen large enough to resist the same attacks as RSA. That is, an $n$ of 2048 bits with $p$ and $q$ both being 1024 bits long would suffice.

To use such a backdoor, one must not only generate two primes $p$ and $q$ to satisfy the previous shape, but also find a suitable generator $g$. This is not a hard task, unless you want to use a specific generator $g$. For example, many libraries use g = 2 by default, so implementing this backdoor would mean changing both the modulus and the generator. This is because the probability that an element in a group of order $q$ is the generator of a subgroup of order $d$ is $\frac{d}{q}$. This means that with our example g = 2, we would need to generate many moduli hoping that g = 2 would work as a generator. The probability that it would work for each try would be:

$$\frac{p_1 q_1}{(p - 1)(q - 1)} \sim \frac{1}{pq} = \frac{1}{n}$$

This is obviously too small of a probability for us to try to generate many parameters until one admits our targeted $g$ as a generator of our “small” subgroup. This is a problem if we want to only replace the modulus of an implementation to activate our backdoor. Since changing only one value would be more subtle than changing two values, our next contribution revises the way we generate the backdoor parameters to solve this problem.

5 A Composite Modulus for a NOBUS Backdoor with a B-Smooth Order

Let’s start again with a composite modulus $n = pq$, but this time let’s choose $p$ and $q$ such that $p - 1$ and $q - 1$ are both B-smooth with B small enough for the discrete logarithm to be doable in subgroups of order B. We’ll see later how to choose B.

Let $p - 1 = p_1 \times \cdots \times p_k \times 2$ and $q - 1 = q_1 \times \cdots \times q_l \times 2$ such that $\gcd(p - 1, q - 1) = 2$ and such that $p_i \leq B$ and $q_j \leq B$ for all $i \in \{1, \ldots, k\}$ and $j \in \{1, \ldots, l\}$ respectively. This makes the order of the group $\varphi(n) = (p - 1)(q - 1)$ B-smooth.

Constructing the Diffie-Hellman modulus this way permits anyone with both the knowledge of the order factorization and the ability to compute the discrete logarithm in subgroups of order B to compute the discrete logarithm modulo $n$ by using the Pohlig-Hellman method.

Since $p - 1$ and $q - 1$ are both B-smooth, they are susceptible to being factored with Pollard’s p-1 factorization algorithm, which can find a factor $p$ of $n$ if $p - 1$ is partially smooth. RSA counters this problem using safe primes of the form $p = 2q + 1$ with $q$ prime as well, but this would break our backdoor. Instead, as a way of countering Pollard’s p-1 we can add a large factor to both $p - 1$ and $q - 1$ that we will call $p_{big}$ and $q_{big}$ respectively.

To exploit this backdoor we can reduce our public key $y$ modulo $p$ and $q$, as we did in our first method, and proceed with Pohlig-Hellman’s algorithm there. This is not a necessary step but it will reduce the size of the numbers in our calculations, speeding up the attack. We then carry on with the CRT to recompute the private key modulo its order, which can be picked at a secure maximum of $\frac{(p - 1)(q - 1)}{2}$, which brings around the same security promises as a safe-prime modulus. This is because of the following isomorphism: $(\mathbb{Z}_n)^* \simeq (\mathbb{Z}_p)^* \times (\mathbb{Z}_q)^*$, with the product’s orders $s = |(\mathbb{Z}_p)^*|$ and $t = |(\mathbb{Z}_q)^*|$ not being coprime ($\gcd(p - 1, q - 1) = 2$). This results in a non-cyclic group with an upper bound on possible subgroup orders of $\mathrm{lcm}(s, t) = \frac{(p - 1)(q - 1)}{2}$.

To decide how big $p_{big}$ and $q_{big}$ should be, we can look at the world records for Pollard’s p-1 factorization algorithm[20]: the largest B2 parameter (the large factor) used in a factorization is $10^{15} \sim 50$ bits in 2015. As with our previous method, we could use much larger factors of around 100 bits to avoid any powerful adversaries and have an agreeable $2^{51}$ computations on average to solve the discrete logarithm problem in these large subgroups.

[20] http://www.loria.fr/~zimmerma/records/Pminus1.html
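To see why the extra factors $p_{big}$ and $q_{big}$ matter, here is a minimal sketch of a one-stage Pollard p-1 attempt on two toy moduli. The numbers and the function name are illustrative assumptions; record-setting implementations add a second stage (the B2 parameter mentioned above) that this sketch omits.

```python
from math import gcd

def pollard_p_minus_1(n, bound, base=2):
    """One-stage Pollard p-1: return a nontrivial factor of n, or None.

    It succeeds when the order of `base` modulo one prime factor divides
    bound! while this is not the case modulo the other prime factor.
    """
    a = base
    for j in range(2, bound + 1):
        a = pow(a, j, n)   # after the loop, a = base^(bound!) mod n
    d = gcd(a - 1, n)
    return d if 1 < d < n else None

smooth_n = 211 * 2039    # 211 - 1 = 2*3*5*7 is 7-smooth; 2039 = 2*1019 + 1
guarded_n = 2039 * 2063  # both p - 1 hide a large prime: 2063 = 2*1031 + 1

found = pollard_p_minus_1(smooth_n, bound=7)
not_found = pollard_p_minus_1(guarded_n, bound=100)
print(found, not_found)  # 211 None
```

A fully smooth $p - 1$ falls to a tiny bound, while a single prime factor larger than the bound is enough to make this stage fail, which is the role played by $p_{big}$ and $q_{big}$.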

This method brings the security of our overall scheme to that of a perfectly secure Diffie-Hellman. Its security also relies on the RSA assumption that factoring $n$ is difficult if $n$ is large enough. More than that, the probability of having a targeted element be a valid generator can be as large as $\frac{1}{2}$ in our example of a secure subgroup of order $\frac{(p-1)(q-1)}{2}$. This will allow us to easily generate a backdoored modulus that will fit a specific generator, thus increasing the stealthiness of the implementation phase of our scheme.

In the case where the two large subgroups of order $p_{big}$ and $q_{big}$ need to be avoided, one could think about generating many moduli, hoping that the targeted $g$ would fit. This succeeds with probability $\frac{1}{2 p_{big} q_{big}}$ for each try, which is way too low. But worse, this would give a free start to someone trying to factor the modulus using Pollard's p-1. Another way would be to pass a fake order to the program, forcing it to generate small ephemeral private keys upper-bounded by $\frac{\varphi(n)}{p_{big} q_{big}}$. After this, using Pohlig-Hellman over the small factors, ignoring $p_{big}$ and $q_{big}$, would be enough to find the private key. This can be done in OpenSSL, or in libraries making use of it, by passing the fake order to OpenSSL via dh->q. Of course doing this would bring back our first method's issues by having to add extra lines of code to our malicious patch.

6 Implementing and Exploiting the Backdoor in TLS

Theoretically, any application including Diffie-Hellman might be backdoored using our method. As TLS is one of the most well known protocols using Diffie-Hellman, it is particularly interesting to abuse for a field test of our work.

Most TLS applications making use of the Diffie-Hellman algorithm for the handshake – although this is an algorithm rarely used in TLS – would have their DH public key and parameters baked into users' or servers' generated certificates. Interestingly, the parameters of the – much more commonly used – ephemeral version of Diffie-Hellman, used to add the properties of Perfect Forward Secrecy, are rarely chosen by end users and thus never engraved into users' or servers' certificates. Furthermore, most libraries implementing the TLS protocol (socat, Apache, ...) have predefined or hardcoded ephemeral DH parameters. Developers using those libraries will rarely generate their own parameters and will use the default ones. This was the source of many discussions after being pointed out by Logjam²¹ last year, creating a movement of awareness, pushing people to migrate to bigger parameters and increase the bit sizes of applications' Diffie-Hellman moduli from 1024 bits or lower to 2048+ bits. This trend seems like the perfect excuse to submit a backdoored patch claiming to improve the security of a library.

²¹ David Adrian et al. “Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice”. In: 22nd ACM Conference on Computer and Communications Security. https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf. Oct. 2015

We'll first see how TLS works with ephemeral Diffie-Hellman in the next section. This is followed by a demonstration of how the backdoor was implemented in real open source libraries. Finally we'll explain how our setup worked to make use of the backdoor.

6.1 Background

An ephemeral handshake allows two parties to negotiate a “fresh” set of keys for every new TLS connection. This has become the preferred way of using TLS as it increases its security, providing the property that we call Forward Secrecy or Perfect Forward Secrecy, that is: if the long term key is compromised, recorded past communications won't be affected and future communications will still resist passive attacks. This is done by using one of the two Diffie-Hellman algorithms provided by TLS: “normal” Diffie-Hellman, present in the ciphersuites containing DHE in their names, and Elliptic Curve Diffie-Hellman (ECDH), present in the ciphersuites containing ECDHE in their names. Note that the concept of “ephemeral” is not defined the same by everyone: the default behavior of OpenSSL, up until recent versions, has been to generate the ephemeral DH key at boot time and cache it until reboot, unless specified not to do so. Such behavior would greatly speed up our attack.

At the start of a new ephemeral handshake, both the server and the client will send each other their ephemeral DH (DHE) public keys via a ServerKeyExchange and a ClientKeyExchange message respectively. The server will dictate as well what the DHE parameters are via the same ServerKeyExchange message.

Figure 3: The serverKeyExchange message

Figure 4: The clientKeyExchange message

Let $c$ and $s$ be the client's and the server's ephemeral private keys respectively, so that the public keys they exchange are $g^c$ and $g^s \pmod n$. The following computation is done on each side, right after the key exchange, to derive the session keys that will encrypt further communications (including final handshake messages):

1. premaster_secret = $g^{cs} \pmod n$

2. master_secret = PRF(premaster_secret, “master secret”, ClientHello.random + ServerHello.random)

3. keys = PRF(master_secret, “key expansion”, ServerHello.random + ClientHello.random)

Right after trading their ephemeral DH public keys, the TLS peers compute the Diffie-Hellman algorithm by exponentiating the other's public key with their own private key. The output is stored in a premaster_secret variable that is sent into a pseudo-random function (PRF) with the string “master secret” and the public random values of both parties, taken from their Hello messages, as parameters. This is because the DH output can be of fluctuating lengths: TLS offers several parameters and algorithms to perform this part of the handshake, and passing the output through a PRF first aims to normalize its size before deriving the keys from it. The output of that first PRF is then sent into the same PRF repeatedly along with different arguments: the string “key expansion” and the random values we just used, in reversed order, until enough bits are produced for the many keys used to encrypt and authenticate the post-handshake communications. Two authentication keys are first derived, client_write_MAC_key and server_write_MAC_key, one for each direction. Then two encryption keys are derived as well, client_write_key and server_write_key. For AEAD ciphers, MAC keys are ignored and two more values are derived after the encryption keys: client_write_IV and server_write_IV.
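The three derivation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of any TLS library: it assumes TLS 1.2 with the HMAC-SHA256 based PRF of RFC 5246, the helper names (p_sha256, prf, derive_keys) are hypothetical, and the default key sizes correspond to one possible CBC ciphersuite. An attacker who recovers one private key can run exactly this computation, since every other input is sent in the clear.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion of RFC 5246 section 5, instantiated with HMAC-SHA256."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 PRF: PRF(secret, label, seed) = P_SHA256(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)


def derive_keys(dh_shared: int, client_random: bytes, server_random: bytes,
                mac_len: int = 32, key_len: int = 16, iv_len: int = 0) -> dict:
    """Hypothetical helper: premaster secret -> master secret -> key block."""
    # Step 1: the premaster secret is the DH output, big-endian,
    # without leading zero bytes.
    premaster = dh_shared.to_bytes((dh_shared.bit_length() + 7) // 8, "big")
    # Step 2: master secret (always 48 bytes), seed = client random + server random.
    master = prf(premaster, b"master secret", client_random + server_random, 48)
    # Step 3: key block, note the reversed order of the random values.
    total = 2 * (mac_len + key_len + iv_len)
    block = prf(master, b"key expansion", server_random + client_random, total)
    sizes = [("client_write_MAC_key", mac_len), ("server_write_MAC_key", mac_len),
             ("client_write_key", key_len), ("server_write_key", key_len),
             ("client_write_IV", iv_len), ("server_write_IV", iv_len)]
    keys, offset = {}, 0
    for name, size in sizes:
        keys[name] = block[offset:offset + size]
        offset += size
    return keys


if __name__ == "__main__":
    # Toy parameters for illustration only (assumed values, far too small for real use).
    n, g = 2**127 - 1, 2
    c, s = 0x1234567, 0x7654321                  # client and server private keys
    shared = pow(pow(g, s, n), c, n)             # client side: (g^s)^c mod n
    keys = derive_keys(shared, bytes(32), bytes(32))
    print({name: value.hex() for name, value in keys.items()})
```

For an AEAD ciphersuite, one would call derive_keys with mac_len=0 and a non-zero iv_len, which matches the remark above that MAC keys are ignored and two IVs are derived instead.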

6.2 Implementation

We started by generating small parameters with the method in section 5 for our tests.

Figure 5: socat's xio-openssl.c file

Figure 6: The same backdoored socat file after our changes

socat makes use of OpenSSL underneath the surface, and being a useful nifty command line tool, it is what we first used to quickly test our backdoor parameters. With the help of the dhparam argument, we can use an ASN.1 file encoded in the DER format to specify our backdoored DH parameters to the program. But as this exercise is one of implementation, we will show how we modified the source code of socat to introduce the backdoor to the masses. Since socat's generator is 2, we only had to modify the modulus in the dh1024_p variable of the xio-openssl.c file. Above are the differences between a valid version and a backdoored version of socat. If we want to submit such a patch as an excuse to increase the size of the DH parameters, we could mimic socat's case²²:

Figure 7: The commit diff page of the socat “presumed” backdoor

²² http://repo.or.cz/socat.git/commitdiff/281d1bd6515c2f0f8984fc168fb3d3b91c20bdc0

6.3 Exploitation

To exploit this kind of backdoor, we first need to obtain a Man-In-The-Middle position between the client and the server. This can be done passively by obtaining later access to logs of TLS records, but it was done actively in our proof of concept²³ by using a machine as a proxy to the server and making the client connect to the proxy directly instead of the server. The proxy blindly forwards the packets back and forth until a TLS connection is initiated. It then observes the handshake, storing the random values at first, until the server decides to send its public Diffie-Hellman parameters to be used in an ephemeral key exchange. If the proxy recognizes the backdoor parameters in the server's ServerKeyExchange message, it runs the attack, recovering one party's private key and computing the session keys out of that information. With the session keys in hand, the proxy can then observe the traffic in clear and even tamper with the messages being exchanged.

²³ https://github.com/mimoo/Diffie-Hellman_Backdoor

Handshake values observed by the proxy:
Client → Server: ClientHello.random
Server → Client: ServerHello.random
Server → Client: ServerKeyExchange.DHparams & ServerKeyExchange.pubKey
Client → Server: ClientKeyExchange.pubKey

Depending on the security margins chosen during the generation of the backdoor, and on the computing power of the attacker, it may be the case that the attacker would not be able to derive the session keys until the first few messages have been exchanged, exempting them from tampering. For better results, the work could be parallelized and the two public keys could be attacked simultaneously, as one might be recovered more quickly than the other one. As soon as the private key of one party is recovered, the Diffie-Hellman and session key computations are done in negligible time, and the proxy can start live decrypting and live tampering with the packets. If the attacker really wants to be able to tamper with the first messages, it can delay the end of the handshake by sending TLS warning alerts that can keep a handshake alive indefinitely, or for a period of time depending on the TLS implementation used by both parties.

Update: exploitation of this backdoor has been improved by Dorey et al.²⁴ who pointed out that an active man-in-the-middle attacker can also force both a client and a server to use a DHE cipher suite (to force them to use the backdoor) if both peers already support a DHE cipher suite.

²⁴ Kristen Dorey, Nicholas Chang-Fong, and Aleksander Essex. Indiscreet Logs: Persistent Diffie-Hellman Backdoors in TLS. Cryptology ePrint Archive, Report 2016/999. http://eprint.iacr.org/2016/999. 2016

7 Detecting a Backdoor and Defending Against One

In the course of this work several open source libraries were tested for composite moduli with no positive results. TLS handshakes of the full range of IPv4 addresses obtained from scans.io were inspected on March 3rd, 2016. A total of 50,222,805 handshakes were analyzed, of which 4,522,263 used ephemeral Diffie-Hellman. Of these, only 30 handshakes used a composite modulus. All of them had a small factor, but none of them could be factored in less than 5 hours using the ECM or Pollard's p-1 factorization algorithms. Most IPs were hosting webpages, in some cases the same one. All administrators were contacted about the issue.

Our contributions should withstand any kind of reversing and thus we should not be able to detect any backdoor produced by people who would have reached the same conclusions as ours. The addition of easy to find small factors could have been intentionally done to provide plausible deniability. Interestingly, it is also hard to differentiate a mistake in the modulus generation from a backdoor. From the Handbook of Applied Cryptography²⁵ fact 3.7:

²⁵ http://cacr.uwaterloo.ca/hac/about/chap3.pdf

Definition 1. Let $n$ be chosen uniformly at random from the interval $[1, x]$.

1. If $1/2 \le \alpha \le 1$, then the probability that the largest prime factor of $n$ is $\le x^{\alpha}$ is approximately $1 + \ln \alpha$. Thus, for example, the probability that $n$ has a prime factor $> \sqrt{x}$ is $\ln 2 \approx 0.69$.

2. The probability that the second-largest prime factor of $n$ is $\le x^{0.2117}$ is about $1/2$.

3. The expected total number of prime factors of $n$ is $\ln \ln x + O(1)$. (If $n = \prod p_i^{e_i}$, the total number of prime factors of $n$ is $\sum e_i$.)

Since it might be easier to visualize this with numbers, take a 1024-bit composite modulus $n$, that is $x = 2^{1024}$:

1. The probability that $n$ has a prime factor greater than 512 bits is ≈ 0.69.

2. The probability that the second-largest prime factor of $n$ is smaller than 217 bits ($0.2117 \times 1024 \approx 217$) is 1/2.

3. The total number of prime factors of $n$ is expected to be about 7 ($\ln \ln 2^{1024} \approx 6.6$).

Considering that a full factorization with ECM runs in a complexity tied to the size of the second largest factor, it might be hard or impossible to do it half of the time. The rest of the time it might take a bit of work, but since the largest factor found using ECM²⁶ is 274 bits, it is possible.

²⁶ http://www.loria.fr/~zimmerma/records/top50.html

The question of how to avoid these kinds of backdoors or weaknesses is also interesting and well understood, but rarely done correctly. First, it is known that by using safe primes – primes of the form 2q + 1 with q prime – the generator's subgroup will have an order close to the modulus' size. Since it is easy to check if a number is a safe prime, the client should also only accept such moduli. The current state is that most programs don't even check for a prime modulus. As an example, no browser currently warns the user if a composite modulus is detected.

Another way to prevent this is to have a predefined list of public parameters²⁷. This would make Diffie-Hellman look similar to Elliptic Curve Diffie-Hellman in the sense that only a few curves are pre-defined and accepted in most exchanges.

²⁷ Matt Lepinski and Dr. Stephen T. Kent. RFC 5114: Additional Diffie-Hellman Groups for Use with IETF Standards. RFC 5114. https://rfc-editor.org/rfc/rfc5114.txt. 2015. doi: 10.17487/rfc5114; Tero Kivinen. RFC 3526: More Modular Exponential (MODP) Diffie-Hellman groups for Internet Key Exchange (IKE). RFC 3526. https://rfc-editor.org/rfc/rfc3526.txt. 2015. doi: 10.17487/rfc3526; Daniel Kahn Gillmor. IETF Draft: Negotiated Finite Field Diffie-Hellman Ephemeral Parameters for TLS. Internet-Draft draft-ietf-tls-negotiated-ff-dhe-10. https://tools.ietf.org/html/draft-ietf-tls-negotiated-ff-dhe-10. Internet Engineering Task Force, 2015

Both mitigations can be hard to integrate if the two endpoints of a key exchange are not controlled. For example this is the case between

browsers and websites' TLS connections, where the browser is a different program from what is running on the server. Asserting for these special primes might just break the connection, which would be worse from the user's perspective. This is why Google Chrome is currently removing DHE from its list of supported cipher suites²⁸, and recommending that server administrators migrate from DHE to ECDHE. This is also one of the recommendations from Logjam²⁹. These security measures might very well prevent this work's efforts: backdooring ECDHE in a stealthy way as we did with DHE remains an open problem.

²⁸ https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/AAdv838-koo/discussion

²⁹ David Adrian et al. “Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice”. In: 22nd ACM Conference on Computer and Communications Security. https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf. Oct. 2015

8 Conclusion

Many cryptographic constructions are not subject to change, unless a breakthrough comes along and the whole construction has to be replaced. Very rarely the excuse of updating a reviewed and considered strong cryptographic implementation to change a single number comes along, and very few people understand such subtle changes. According to the grading system of a whitepaper by Schneier et al.³⁰, here is how such a backdoor scores:

³⁰ Bruce Schneier et al. Surreptitiously Weakening Cryptographic Systems. Cryptology ePrint Archive, Report 2015/097. http://eprint.iacr.org/. 2015

• medium undetectability: to discover the backdoor one would have to test for the primality of the modulus. A pretty easy task, although not typically performed, as seen in socat's case where it took them more than a year to notice the composite modulus.

• high lack of conspiracy: in the case of socat only the person who had submitted the vulnerability would be the target of investigation. It turns out he is a regular employee at Oracle.

• high plausible deniability: three things help us in the creation of a good story in socat's case: reversing the bytes of the fake prime gives us a prime, some small factors were found, and anyone with weak knowledge of [...] could have submitted a composite number.

• medium ease of use: man-in-the-middling the connection and observing the first handshake would allow the attacker to take advantage of the backdoor.

• high severity: having access to that backdoor lets us observe, or if exploited in real time lets us tamper with, any communications made over TLS.

• medium durability: system admins would have to update to newer versions to remove the backdoor.

• high monitorability: the saboteur cannot detect if other attackers are taking advantage of the backdoor, which is OK since the backdoors in this work are NOBUS ones.

• high scale: backdooring an open-source library would allow access to many systems' and users' communications.

• high precision: the saboteur doesn't weaken any system, only the saboteur himself can access the backdoor.

• high control: like Dual-EC, only the saboteur can exploit the backdoor.
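The primality test mentioned under undetectability, together with the safe-prime check suggested in section 7, can be sketched as follows. This is a minimal illustration rather than the check of any particular TLS stack: the function names are hypothetical, and the probabilistic Miller-Rabin test used here is one common choice, assumed for the example.

```python
import random

SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]


def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin test: never rejects a prime, and accepts a composite
    with probability at most 4**-rounds."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def check_dh_modulus(p: int) -> str:
    """Hypothetical helper classifying a received DH modulus."""
    if not is_probable_prime(p):
        return "composite"
    if not is_probable_prime((p - 1) // 2):
        return "prime, but not a safe prime"
    return "safe prime"


if __name__ == "__main__":
    # Toy inputs: 211 * 67 is composite, 2^127 - 1 is prime but not safe,
    # 1019 = 2 * 509 + 1 is a safe prime.
    for modulus in (211 * 67, 2**127 - 1, 1019):
        print(modulus, "->", check_dh_modulus(modulus))
```

A client applying such a check to the modulus received in the ServerKeyExchange message would reject the composite moduli discussed in this work, at the cost of also rejecting legitimate primes that are not safe primes.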

While this work is mostly a fictive exercise, we hope to raise awareness of the need for better tooling and deeper reviews of open source – as well as closed source – implementations of cryptographic algorithms.

Acknowledgements

Many thanks to both Scott Fluhrer and Scott Contini who have been of precious help in the core ideas of this paper.

Thanks as well to Pete L. Clark, Tom Ritter, Mike Brown, Roman Zabicki, Vincent Lynch, Drew Suarez, Ryan Koppenhaver, Divya Natesan, Andy Grant and Jake Massimo for feedback and discussions.

References

[1] David Adrian et al. “Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice”. In: 22nd ACM Conference on Computer and Communications Security. https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf. Oct. 2015.

[2] Daniel J. Bernstein, Tanja Lange, and Ruben Niederhagen. Dual EC: A Standardized Back Door. Cryptology ePrint Archive, Report 2015/767. http://eprint.iacr.org/2015/767. 2015.

[3] Stephen Checkoway et al. A Systematic Analysis of the Juniper Dual EC Incident. Cryptology ePrint Archive, Report 2016/376. http://eprint.iacr.org/2016/376. 2016.

[4] Jean-Sebastien Coron et al. Cryptanalysis of the RSA Subgroup Assumption from TCC 2005. Cryptology ePrint Archive, Report 2010/650. http://eprint.iacr.org/2010/650. 2010.

[5] Kristen Dorey, Nicholas Chang-Fong, and Aleksander Essex. Indiscreet Logs: Persistent Diffie-Hellman Backdoors in TLS. Cryptology ePrint Archive, Report 2016/999. http://eprint.iacr.org/2016/999. 2016.

[6] Daniel Kahn Gillmor. IETF Draft: Negotiated Finite Field Diffie-Hellman Ephemeral Parameters for TLS. Internet-Draft draft-ietf-tls-negotiated-ff-dhe-10. https://tools.ietf.org/html/draft-ietf-tls-negotiated-ff-dhe-10. Internet Engineering Task Force, 2015.

[7] Tero Kivinen. RFC 3526: More Modular Exponential (MODP) Diffie-Hellman groups for Internet Key Exchange (IKE). RFC 3526. https://rfc-editor.org/rfc/rfc3526.txt. 2015. doi: 10.17487/rfc3526.

[8] Matt Lepinski and Dr. Stephen T. Kent. RFC 5114: Additional Diffie-Hellman Groups for Use with IETF Standards. RFC 5114. https://rfc-editor.org/rfc/rfc5114.txt. 2015. doi: 10.17487/rfc5114.

[9] Stephen Pohlig and Martin Hellman. An Improved Algorithm for Computing Logarithms over GF(p) and its Cryptographic Significance. http://www-ee.stanford.edu/~hellman/publications/28.pdf. 1978.

[10] Eric Rescorla. RFC 2631: Diffie-Hellman Key Agreement Method. RFC 2631. https://rfc-editor.org/rfc/rfc2631.txt. 2013. doi: 10.17487/rfc2631.

[11] Bruce Schneier et al. Surreptitiously Weakening Cryptographic Systems. Cryptology ePrint Archive, Report 2015/097. http://eprint.iacr.org/. 2015.

[12] Shinde and Fadewar. Faster RSA Algorithm for Decryption Using Chinese Remainder Theorem. http://www.techscience.com/doi/10.3970/icces.2008.005.255.pdf.

[13] Shumow and Ferguson. On the Possibility of a Back Door in the NIST SP800-90 Dual Ec Prng. Crypto 2007. http://rump2007.cr.yp.to/15-shumow.pdf. 2007.

[14] Robert Zuccherato. RFC 2785: Methods for Avoiding the Small-Subgroup Attacks on the Diffie-Hellman Key Agreement Method for S/MIME. RFC 2785. https://rfc-editor.org/rfc/rfc2785.txt. 2013. doi: 10.17487/rfc2785.
