Second Preimages on N-Bit Hash Functions for Much Less Than 2N Work

Second Preimages on n-bit Hash Functions for Much Less than 2n Work John Kelsey1 and Bruce Schneier2 1 National Institute of Standards and Technology, [email protected] 2 Counterpane Internet Security, Inc., [email protected] Abstract. We expand a previous result of Dean [Dea99] to provide a second preimage attack on all n-bit iterated hash functions with Damgºard- Merkle strengthening and n-bit intermediate states, allowing a second preimage to be found for a 2k-message-block message with about k £ 2n=2+1 +2n¡k+1 work. Using RIPEMD-160 as an example, our attack can ¯nd a second preimage for a 260 byte message in about 2106 work, rather than the previously expected 2160 work. We also provide slightly cheaper ways to ¯nd multicollisions than the method of Joux [Jou04]. Both of these results are based on expandable messages{patterns for producing messages of varying length, which all collide on the intermediate hash result immediately after processing the message. We provide an algorithm for ¯nding expandable messages for any n-bit hash function built using the Damgºard-Merkleconstruction, which requires only a small multiple of the work done to ¯nd a single collision in the hash function. 1 Introduction The security goal for an n-bit hash function is that collisions require about 2n=2 work, while preimages and second preimages require about 2n work. In [Dea99], Dean demonstrated that this goal could not be accomplished by hash functions whose compression functions allowed the easy ¯nding of ¯xed points, such as MD5 [Riv92] and SHA1 [SHA02]. In this paper, we use the multicollision- ¯nding result of [Jou04] to demonstrate that the standard way of constructing iterated hash functions (the Damgºard-Merkleconstruction) cannot meet this goal, regardless of the compression function used. Thus, hash functions such as RIPEMD-160 [DBP96] and Whirlpool [BR00] (when used with a full 512- bit result) provide less than the previously-expected amount of resistance to second-preimage attacks, just as do hash functions like SHA1. For a message of 2k message blocks, we provide a second preimage attack requiring about k £ 2n=2+1 + 2n¡k+1 work. Like Dean's attack, ours is made possible by the notion of expandable messages{patterns of messages of di®erent lengths which all yield the same intermediate hash value after processing them. These expandable messages do not directly yield collisions on the whole hash function because of the length padding done at the end of modern hash functions, and in any event are no easier to ¯nd than collisions. However, they allow second preimages and multicollisions to be found much more cheaply than had previously been expected. This result may be compared with an earlier generic preimage attack by Merkle: the attacker is given 2k distinct n-bit hash outputs, and expects to ¯nd a preimage for one of the outputs with about 2n¡k work, but has no ability to choose which of the outputs is to be matched. (Note that Merkle's attack is truly generic, in that it applies to any hash fuction with an n-bit output, even a random oracle.) Our attack, like the earlier attack of Dean, probably has no practical impact on the security of any system currently relying upon a hash function such as SHA1, Whirlpool, or RIPEMD-160. This is true because the attack is always at least as expensive as collision search on the hash function, and because the di±culty of the attack grows quickly as the message gets shorter. For example, a 160-bit hash function like SHA1 or RIPEMD-160 requires about 2128 work to ¯nd a second preimage for a 238-byte message, and a target message of only one megabyte (220 bytes) requires about 2146 work to ¯nd a second preimage. Also, the attack only recovers second preimages{it doesn't allow an attacker to invert the hash function. The signi¯cance of our result is in demonstrating another important way in which the behavior of the hash functions we know how to construct di®ers from both the commonly claimed security bounds of these functions, and from the random oracles with which we often model them. When combined with the recent results of Joux [Jou04], our results raise questions about the usefulness of the widely-used Damgºard-Merkleconstruction for hash functions where attackers can do more than 2n=2 work. The remainder of the paper is organized as follows: First, we discuss basic hash function constructions and security requirements. Next, we demonstrate a generic way to ¯nd expandable messages, and review the method of Dean. We then demonstrate how these expandable messages can be used to violate the second preimage resistance of nearly all currently speci¯ed cryptographic hash functions with less than 2n work. Finally, we demonstrate an even more e±cient (albeit much less elegant) way to ¯nd multicollisions than the method of Joux, using Dean's ¯xed-point-based expandable messages. We end with a discussion of how this a®ects our understanding of iterated hash function security. 2 Hash Function Basics In 1989, Merkle and Damgºard[Mer89,Dam89] independently provided security proofs for the basic construction used for almost all modern cryptographic hash functions3. Here, we describe this construction4, and its normal security claims. 3 These functions date back to Rabin, and were widely used by hash function designers throughout the 1980s [Pre05,MvOV96]. 4 Damgºardproposed two methods for constructing hash functions. This paper ad- dresses only the more commonly used one, which was independently invented by Merkle. A hash function with an n-bit output is expected to have three minimal security properties. (In practice, a number of other properties are expected, as well.) 1. Collision-resistance: An attacker should not be able to ¯nd a pair of messages M 6= M 0 such that hash(M) = hash(M 0) with less than about 2n=2 work. 2. Preimage-resistance: An attacker given an output value Y in the range of hash should not be able to ¯nd an input X from its domain so that Y = hash(X) with less than about 2n work. 3. Second preimage-resistance: An attacker given one message M should not be able to ¯nd a second message, M 0 to satisfy hash(M) = hash(M 0) with less than about 2n work. A collision attack on an n-bit hash function with less than 2n=2 work, or a preimage or second preimage attack with less than 2n work, is formally a break of the hash function. Whether the break poses a practical threat to systems using the hash function depends on speci¯cs of the attack. Following the Damgºard-Merkleconstruction, an iterated hash function is built from a ¯xed-length component called a compression function, which takes an n-bit input chaining value and an m-bit message block, and derives a new n-bit output chaining value. In this paper, F (H; M) is used to represent the ap- plication of this compression function on hash chaining variable H and message block M. In order to hash a full message, the following steps are carried out: 1. The input string is padded to ensure that it is an integer multiple of m bits in length, and that the length of the original, unpadded message appears in the last block of the padded message. 2. The hash chaining value h[i] is started at some ¯xed IV, h[¡1], for the hash function, and updated for each successive message block M[i] as h[i] = F (h[i ¡ 1];M[i]) 3. The value of h[i] after processing the last block of the padded message is the hash output value. This construction gives a reduction proof: If an attacker can ¯nd a collision in the whole hash, then he can likewise ¯nd one in the compression function. The inclusion of the length at the end of the message is important for this reduction proof, and is also important for preventing a number of attacks, including long- message attacks [MvOV96]. Besides the claimed security bounds, there are two concepts from this brief discussion that are important for the rest of this paper: 1. A message made up of many blocks, M[0; 1; 2; :::; 2k ¡1], has a corresponding sequence of intermediate hash values, h[0; 1; 2; :::; 2k ¡ 1]. 2. The padding of the ¯nal block includes the length, and thus prevents collisions between messages of di®erent lengths in the intermediate hash states from yielding collisions in the full hash function. 3 Finding Expandable Messages An expandable message is a kind of multicollision, in which the colliding messages have di®erent lengths, and the message hashes collide in the input to the last compression function computation, before the length of the message is processed. Consider a starting hash value h[¡1]. Then an \expandable message" from h[¡1] is a pattern for generating messages of di®erent lengths, all of which yield the same intermediate hash value when they are processed by the hash, starting from h[¡1], without the ¯nal padding block with the message length being included. In the remainder of the paper, an expandable message that can take on any length between a and b message blocks, inclusive, will be called an (a; b)-expandable message. 3.1 Dean's Fixed-Point Expandable Messages In [Dea99], there appears a technique for building expandable messages when ¯xed points can easily be found in the compression function5. For a compression function h[i] = F (h[i ¡ 1];M[i]), a ¯xed point is a pair (h[i ¡ 1];M[i]) such that h[i ¡ 1] = F (h[i ¡ 1];M[i]).

Load more