On the Weaknesses of PBKDF2 ⋆
Andrea Visconti⋆⋆, Simone Bossi, Hany Ragab, and Alexandro Calò

Cryptography and Coding Laboratory (CLUB), Department of Computer Science, Università degli Studi di Milano
http://www.club.di.unimi.it/
[email protected]
{simone.bossi2,hany.ragab,alexandro.calo}@studenti.unimi.it

Abstract. Password-based key derivation functions are of particular interest in cryptography because they (a) take as input a password/passphrase (which is usually short and lacks enough entropy) and derive a cryptographic key; (b) slow down brute force and dictionary attacks as much as possible. In PKCS#5 [19], RSA Laboratories described a password-based key derivation function called PBKDF2 that has been widely adopted in many security-related applications [7], [13], [6]. In order to slow down brute force attacks, PBKDF2 introduces CPU-intensive operations based on an iterated pseudorandom function. Such a pseudorandom function is HMAC-SHA-1 by default. In this paper we show that, if HMAC-SHA-1 is computed in a standard mode without following the performance improvements described in the implementation note of RFC 2104 [15] and FIPS 198-1 [16], an attacker is able to avoid 50% of PBKDF2's CPU-intensive operations by replacing them with precomputed values. We note that a number of well-known and widely used crypto libraries are subject to this vulnerability. In addition to this vulnerability, we describe some other minor optimizations that an attacker can exploit to reduce the key derivation time even further.

Keywords: Key derivation function (KDF), HMAC-SHA1, PKCS#5, optimizations, CPU-intensive operations

⋆ A shorter version of this paper appeared in the Proceedings of the 14th International Conference on Cryptology and Network Security (CANS 2015), Springer International Publishing, LNCS 9476.
⋆⋆ Corresponding author: [email protected]

1 Introduction

Passwords are widely used to protect secret data or to gain access to specific resources. For the sake of security, they should be strong enough to prevent well-known attacks such as dictionary and brute force attacks. Unfortunately, user-chosen passwords are generally short and lack enough entropy [11], [21], [18]. For these reasons, they cannot be directly used as a key to implement secure cryptographic systems. A possible solution to this issue is to adopt a key derivation function (KDF), that is, a function which takes a source of initial keying material and derives from it one or more pseudorandom keys. Such keying material can be the output of a pseudo-random number generator, a bit sequence obtained by a statistical sampler, a shared Diffie-Hellman value, a user-chosen password, or any bit sequence from a source of more or less entropy [14]. KDFs that take user passwords as input are known as password-based KDFs. Such functions are of particular interest in cryptography because they introduce CPU-intensive operations on the attacker side, increasing the cost of an exhaustive search. By applying a KDF to a user password, we allow legitimate users to spend a moderate amount of time on key derivation, while increasing the time an attacker takes to test each possible password. The KDF-based approach not only slows down a brute force attack as much as possible but also makes it possible to increase the size of a cryptographic key.

In PKCS#5 [19], RSA Laboratories provides a number of recommendations for the implementation of password-based cryptography. In particular, they describe Password-Based Key Derivation Function version 2 (PBKDF2), a function widely used to derive keys and implemented in many security-related systems. For example, PBKDF2 is involved in Android's full disk encryption (from version 3.0 Honeycomb to 4.3 Jelly Bean)¹, in the WPA/WPA2 encryption process [13], in LUKS [12], [7], EncFS [2], FileVault Mac OS X [6], [8], GRUB2 [3], WinRAR [5], and many others.
In order to slow down attackers, PBKDF2 uses a salt to prevent building universal dictionaries, and an iteration count which specifies the number of times the underlying pseudorandom function is called to generate a block of keying material. The number of iterations is a crucial point of the KDF. The choice of a reasonable value for the iteration count depends on the environment and can vary from one application to another. In SP 800-132 [17], Turan et al. suggest that it is good practice to select the iteration count as large as possible, as long as the time required to generate the key is acceptable for the user. Moreover, they specify that for very critical keys on very powerful systems an iteration count of 10,000,000 may be appropriate, while a minimum of 1,000 iterations is recommended for general purposes.

¹ At the time of writing, this represents 58% of the Android devices market share (see developer.android.com).

PBKDF2 introduces CPU-intensive operations based on an iterated pseudorandom function. Such a pseudorandom function is HMAC-SHA-1 by default. In this paper we show that, if HMAC-SHA-1 is computed in a standard mode without following the performance improvements described in the implementation note of RFC 2104 [15] and FIPS 198-1 [16], an attacker is able to avoid 50% of PBKDF2's CPU-intensive operations by replacing them with precomputed values. Note that a number of well-known and widely used crypto libraries, e.g., [4], [1], are subject to this vulnerability; an attacker is therefore able to derive keys significantly faster than a regular user can. Moreover, we present some other minor optimizations (based on the hash function used) that can be exploited by an attacker to reduce the key derivation time even further.

The remainder of the paper is organized as follows. In Section 2, we present Password Based Key Derivation Function version 2 (PBKDF2).
In Section 3 we briefly describe HMAC, the pseudorandom function adopted in PBKDF2. In Section 4 we present the weaknesses of PBKDF2. Finally, discussion and conclusions are drawn in Section 5.

2 PBKDF2

Password Based Key Derivation Function version 2, PBKDF2 for short, is a key derivation function published by RSA Laboratories in PKCS #5 [19]. In order to counter brute force attacks based on weak user passwords, PBKDF2 introduces CPU-intensive operations. Such operations are based on an iterated pseudorandom function (PRF), e.g., a hash function, cipher, or HMAC, which maps input values to a derived key. One of the most important properties to ensure is that PBKDF2 is cycle free. If this is not so, a malicious user can avoid the CPU-intensive operations and get the derived key by executing a set of equivalent, but less onerous, instructions.

Unlike its predecessor (PBKDF version 1), in which the length of the derived key is bounded by the length of the underlying PRF output, PBKDF2 can derive keys of arbitrary length. More precisely, PBKDF2 generates as many blocks $T_i$ as needed to cover the desired key length. Each block $T_i$ is computed by iterating the PRF as many times as specified by an iteration count. The length of such blocks is bounded by $hLen$, which is the length of the underlying PRF output.

In the sequel, by PRF we will refer to HMAC with the SHA-1 hash function, which is the default as per [19]. Note that HMAC can be used with any other iterated hash function such as RIPEMD, SHA-256 or SHA-512.

PBKDF2 takes as input a user password/passphrase $p$, a random salt $s$, an iteration counter $c$, and a derived key length $dkLen$. It outputs a derived key $DK$:

$$DK = \mathrm{PBKDF2}(p, s, c, dkLen) \tag{1}$$

The derived key is defined as the concatenation of $\lceil dkLen/hLen \rceil$ blocks:

$$DK = T_1 \,\|\, T_2 \,\|\, \dots \,\|\, T_{\lceil dkLen/hLen \rceil} \tag{2}$$

where

$$\begin{aligned}
T_1 &= \mathrm{Function}(p, s, c, 1) \\
T_2 &= \mathrm{Function}(p, s, c, 2) \\
&\;\;\vdots \\
T_{\lceil dkLen/hLen \rceil} &= \mathrm{Function}(p, s, c, \lceil dkLen/hLen \rceil).
\end{aligned}$$

Each single block $T_i$, i.e., $T_i = \mathrm{Function}(p, s, c, i)$, is computed as

$$T_i = U_1 \oplus U_2 \oplus \dots \oplus U_c \tag{3}$$

where

$$\begin{aligned}
U_1 &= \mathrm{PRF}(p, s \,\|\, i) \\
U_2 &= \mathrm{PRF}(p, U_1) \\
&\;\;\vdots \\
U_c &= \mathrm{PRF}(p, U_{c-1}).
\end{aligned}$$

3 HMAC

Hash-based Message Authentication Code (HMAC) is an algorithm for computing a message authentication code based on a cryptographic hash function. The definition of HMAC [15] requires
- $H$: any cryptographic hash function;
- $K$: the secret key;
- $text$: the message to be authenticated.

In this paper, we will focus on HMAC-SHA1, which is the default PRF for PBKDF2 [19]. As described in RFC 2104 [15], HMAC can be defined as follows:

$$\mathrm{HMAC} = H(K \oplus opad,\; H(K \oplus ipad,\; text)) \tag{4}$$

where $H$ is the chosen hash function, $K$ is the secret key, and $ipad$, $opad$ are the constant values (respectively, the bytes 0x36 and 0x5C repeated 64 times) XORed with the password. Equation 4 can be expanded in the form:

$$\begin{aligned}
h &= H(K \oplus ipad \,\|\, text) \\
\mathrm{HMAC} &= H(K \oplus opad \,\|\, h)
\end{aligned} \tag{5}$$

Using SHA1 as the cryptographic hash function, Equation 5 can be graphically represented as in Figure 1.

Fig. 1. HMAC-SHA-1

4 Weaknesses

In this section we present some weaknesses of PBKDF2. The major one concerns the precomputation of specific values that can be reused during the key derivation process. The others aim to avoid useless operations during the computation of the hash function.

We will describe these weaknesses using as an example the parameters defined by the LUKS format [11][12], i.e., a salt length of 256 bits and HMAC-SHA-1 as PRF.
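As a concrete reference point for the setting just described, Equations (1)-(3) can be sketched in Python roughly as follows. This is a minimal, unoptimized sketch rather than a reference implementation: the names `prf`, `block_f`, and `pbkdf2_sha1` are illustrative and do not appear in the paper, and the encoding of the block index i as a 4-byte big-endian integer is taken from the PKCS#5 specification rather than from the text above.

```python
import hashlib
import hmac
import struct

H_LEN = 20  # hLen: output length of HMAC-SHA-1 in bytes


def prf(p: bytes, data: bytes) -> bytes:
    """PRF(p, data): HMAC-SHA-1 keyed with the password p."""
    return hmac.new(p, data, hashlib.sha1).digest()


def block_f(p: bytes, s: bytes, c: int, i: int) -> bytes:
    """T_i = U_1 xor U_2 xor ... xor U_c, as in Equation (3)."""
    # U_1 = PRF(p, s || i); i is encoded as a 4-byte big-endian integer
    # (assumption taken from PKCS#5, not stated in the text above).
    u = prf(p, s + struct.pack(">I", i))
    t = int.from_bytes(u, "big")
    for _ in range(c - 1):
        u = prf(p, u)                   # U_j = PRF(p, U_{j-1})
        t ^= int.from_bytes(u, "big")   # accumulate the XOR
    return t.to_bytes(H_LEN, "big")


def pbkdf2_sha1(p: bytes, s: bytes, c: int, dk_len: int) -> bytes:
    """DK = T_1 || T_2 || ... truncated to dk_len bytes, Equations (1)-(2)."""
    n_blocks = -(-dk_len // H_LEN)      # ceil(dkLen / hLen)
    dk = b"".join(block_f(p, s, c, i) for i in range(1, n_blocks + 1))
    return dk[:dk_len]
```

With a 256-bit salt and a 32-byte derived key, for instance, two blocks are needed, so a call with c = 1000 performs 2000 HMAC-SHA-1 evaluations, all keyed with the same password.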
4.1 Precomputing a message block

Looking closely at Figure 2, note the following:
- the first message block of a keyed hash function is repeated c times (the green rectangles in Figure 2);
- the first message block of a second keyed hash function is repeated c times (the orange rectangles in Figure 2);
- all green rectangles have the same content, and they can assume only the value SHA1(P ⊕ ipad);
- all orange rectangles have the same content, and they can assume only the value SHA1(P ⊕ opad).

Thus, it is possible to compute these two blocks in advance, i.e., SHA1(P ⊕ ipad) and SHA1(P ⊕ opad)², and then reuse these values c times.
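The reuse described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the helper names `precompute_states`, `prf_cached`, and `pbkdf2_cached` are hypothetical, the 4-byte big-endian block index follows PKCS#5, and the hash state reached after absorbing the 64-byte block P ⊕ ipad (respectively P ⊕ opad) is cached through the `copy()` method of `hashlib` objects. Whether the library compresses that block eagerly or buffers it is an implementation detail of `hashlib`; the sketch shows the structure of the optimization, not a measured speedup.

```python
import hashlib
import struct

BLOCK_SIZE = 64  # SHA-1 message block size in bytes
H_LEN = 20       # SHA-1 / HMAC-SHA-1 output length in bytes


def precompute_states(p: bytes):
    """Absorb P xor ipad and P xor opad once; return the two hash states."""
    if len(p) > BLOCK_SIZE:              # long keys are hashed first (RFC 2104)
        p = hashlib.sha1(p).digest()
    key = p.ljust(BLOCK_SIZE, b"\x00")   # zero-pad the key to one full block
    inner = hashlib.sha1(bytes(b ^ 0x36 for b in key))
    outer = hashlib.sha1(bytes(b ^ 0x5C for b in key))
    return inner, outer


def prf_cached(inner, outer, data: bytes) -> bytes:
    """HMAC-SHA-1 that resumes from the cached inner and outer states."""
    h_in = inner.copy()
    h_in.update(data)
    h_out = outer.copy()
    h_out.update(h_in.digest())
    return h_out.digest()


def pbkdf2_cached(p: bytes, s: bytes, c: int, dk_len: int) -> bytes:
    """PBKDF2-HMAC-SHA-1 in which the two key blocks are processed only once."""
    inner, outer = precompute_states(p)
    dk = b""
    for i in range(1, -(-dk_len // H_LEN) + 1):
        u = prf_cached(inner, outer, s + struct.pack(">I", i))
        t = int.from_bytes(u, "big")
        for _ in range(c - 1):
            u = prf_cached(inner, outer, u)
            t ^= int.from_bytes(u, "big")
        dk += t.to_bytes(H_LEN, "big")
    return dk[:dk_len]
```

As a rough count, with a 256-bit salt every HMAC-SHA-1 input fits in a single message block, so a standard evaluation processes four SHA-1 blocks (two per hash), whereas the cached variant processes two. For one output block with c = 1000 this is about 4000 block computations against 2002 (2000 plus the two precomputed ones), which matches the 50% figure stated in the introduction.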