Introduction to Design and Cryptanalysis of Cryptographic Hash
Total Page:16
File Type:pdf, Size:1020Kb
Hash Functions - Bart Preneel June 2014 Hash functions X.509 Annex D RIPEMD-160 MDC-2 SHA-256 SHA-3 MD2, MD4, MD5 SHA-512 Introduction to the SHA-1 Design and Cryptanalysis of This is an input to a crypto- Cryptographic Hash Functions graphic hash function. The input is a very long string, that is reduced by the hash function to a string of fixed length. There are 1A3FD4128A198FB3CA345932 Bart Preneel additional security conditions: it h should be very hard to find an KU Leuven - COSIC input hashing to a given value (a [email protected] preimage) or to find two colliding inputs (a collision). Sibenik, June 2014 Insert presenter logo here on slide master 2 Applications Agenda • short unique identifier to a string • Definitions – digital signatures – data authentication • Iterations (modes) • one-way function of a string • Compression functions – protection of passwords – micro-payments • Constructions • confirmation of knowledge/commitment • SHA-3 • pseudo-random string generation/key derivation • Conclusions • entropy extraction • construction of MAC algorithms, stream ciphers, block ciphers,… 2005: 800 uses of MD5 in Microsoft Windows 3 4 Security requirements (n-bit result) Preimage resistance preimage 2nd preimage collision preimage • in a password file, one does not store – (username, password) x ? ? ? ? ? • but – (username,hash(password)) h h h h h h • this is sufficient to verify a password • an attacker with access to the password file has to find a preimage h(x) h(x) = h(x’) h(x) = h(x’) h(x) 2n 2n 2n/2 2n 5 6 1 Hash Functions - Bart Preneel June 2014 Second preimage resistance Collision resistance 2nd preimage • hacker Alice prepares two versions collision x of a software driver for the O/S company Bob Channel 1: high capacity and insecure x’ x ? – x is correct code x h(x) – x’ contains a backdoor that gives Alice access to the machine Channel 2: low capacity but secure (= authenticated – cannot be modified) • Alice submits x for inspection to Bob h h • if Bob is satisfied, he digitally signs h h h(x) with his private key • an attacker can modify x but not h(x) • Alice now distributes x’ to users of h(x) = h(x’) • he can only fool the recipient if he the O/S; these users verify the h(x) = h(x’) finds a second preimage of x signature with Bob’s public key n n/2 2 • this signature works for x and for x’, 2 since h(x) = h(x’) 7 8 Indifferentiability from a random oracle Pseudo-random function or PRO property [Maurer+04] computationally indistinguishable from a random function variant of indistinguishability appropriate when distinguisher $ prf $ hK(.) f Advh = Pr [ K K: A 1] - Pr [ f RAND(m,n):A 1] has access to inner component (e.g. building block of a hash function) RAND(m,n): set of all functions from m-bit to n-bit strings Simulator S, distinguisher D, AdvPRO(H,S) is small K h FIL f H VIL RO S (hash function) RO ? or ? This concept makes only ? or ? sense for a function with a D secret key D [Ristenpart-Shacham-Shrimpton’11] 9 [Demay-Gaz-Hirt-Maurer’13] 10 Brute force (2nd) preimage Brute force attacks in practice • multiple target second preimage (1 out of many): • (2nd) preimage search t – if one can attack 2 simultaneous targets, the effort to find a single 40 preimage is 2n-t – n = 128: 14 B$ for 1 year if one can attack 2 targets in parallel • multiple target second preimage (many out of many): • parallel collision search: small memory using cycle finding algorithms (distinguished points) – time-memory trade-off with Θ(2n) precomputation and storage Θ(22n/3) time per (2nd) preimage: Θ(22n/3) – n = 128: 1 M$ for 5 hours (or 1 year on 60K PCs) [Hellman’80] – n = 160: 56 M$ for 1 year – need 256-bit result for long term security (30 years or more) • answer: randomize hash function with a parameter S (salt, key, spice,…) 11 12 2 Hash Functions - Bart Preneel June 2014 Quantum computers Properties in practice • in principle exponential parallelism • collision resistance is not always necessary • inverting a one-way function: 2n reduced to 2n/2 • other properties are needed: [Grover’96] – PRF: pseudo-randomness if keyed (with secret key) • collision search: can we do better than 2n/2 ? – PRO: pseudo-random oracle property – near-collision resistance – 2n/3 computation + hardware [Brassard-Hoyer-Tapp’98] = 22n/3 – partial preimage resistance (most of input known) n/4 – [Bernstein’09] classical collision search requires 2 computation – multiplication freeness and hardware (= standard cost of 2n/2 ) • how to formalize these requirements and the relation between them? 13 14 How not to construct a hash function • Divide the message into t blocks xi of n bits each Message block 1: x1 Iteration Message block 2: x2 (mode of compression function) … Message block t: xt = Hash value h(x) 15 15 16 Hash function: iterated structure Security relation between f and h • iterating f can degrade its security – trivial example: 2nd preimage IV H1 H2 H3 f f f f g IV H H H 1 2 3 g x1 x2 x3 x4 f f f f • split messages into blocks of fixed length and hash them x1 x2 x3 x4 block by block with a compression function f • need padding at the end IV = H1 H2 H3 f f f g efficient and elegant…. but … x2 x3 x4 17 18 3 Hash Functions - Bart Preneel June 2014 Security relation between f and h (2) Security relation between f and h (3) • solution: Merkle-Damgård (MD) strengthening length extension: if one knows h(x), easy to compute h(x || y) without knowing x or IV – fix IV, use unambiguous padding and insert length at the end IV H H 1 2 H = h(x) f f f 3 • f is collision resistant h is collision resistant x x x [Merkle’89-Damgård’89] 1 2 3 IV H H H H = h(x || y) nd nd 1 2 3 4 • f is ideally 2 preimage resistant ? h is ideally 2 f f f f preimage resistant [Lai-Massey’92] x1 x2 x3 y • PRO preservation Col, Sec and Pre for ideal compression function solution: output transformation – but for narrow pipe bounds for Sec and Pre are at most 2n/2 rather than 2n IV H1 H2 H3 • many other results f f f f g 19 x1 x2 x3 x4 20 How (NOT) to strengthen a hash function? Attacks on MD-type iterations [Coppersmith’85][Joux’04] • long message 2nd preimage attack • answer: concatenation [Dean-Felten-Hu'99], [Kelsey-Schneier’05] – Sec security degrades lineary with number 2t of message blocks • h1 (n1-bit result) and h2 (n2-bit result) hashed: 2n-t+1 + t 2n/2+1 – appending the length does not help here! • multi-collision attack and impact on concatenation [Joux’04] • intuition: the strength of g against h h collision/(2nd) preimage attacks is the 1 2 • herding attack [Kelsey-Kohno’06] product of the strength of h1 and h2 – reduces security of commitment using a hash function from 2n — if both are “independent” – on-line 2n-t + precomputation 2.2(n+t)/2 + storage 2t g(x) = h1(x) || h2(x) • but…. 21 22 Multiple collisions multi-collision Multi-collisions on iterated hash function (2) Assume “ideal” hash function h with n-bit result IV H1 H2 H3 • Θ(2n/2) evaluations of h (or steps): 1 collision f f f f – h(x)=h(x’) x , x’ x , x’ x , x’ x , x’ • Θ(r. 2n/2) steps: r2 collisions 1 1 2 2 3 3 4 4 – h(x1)=h(x1’) ; h(x2)=h(x2’) ; … ; h(xr2)=h(xr2’) • for IV: collision for block 1: x1, x’1 • for H : collision for block 2: x , x’ • Θ(22n/3) steps: a 3-collision 1 2 2 for H : collision for block 3: x , x’ – h(x)= h(x’)=h(x’’) • 2 3 3 • for H3: collision for block 4: x4, x’4 • Θ(2n(t-1)/t) steps: a t-fold collision (multi-collision) • now h(x1||x2||x3||x4) = h(x’1||x2||x3||x4) = h(x’1||x’2||x3||x4) = … – h(x1)= h(x2)= … =h(xt) = h(x’1||x’2||x’3||x’4) a 16-fold collision (time: 4 collisions) 23 24 4 Hash Functions - Bart Preneel June 2014 Multi-collisions [Coppersmith’85][Joux ’04] Multi-collisions [Coppersmith’85][ Joux ’04] • finding multi-collisions for an iterated hash function is not consider h1 (n1-bit result) and h2 (n2-bit result), with n1 n2. much harder than finding a single collision (if the size of the concatenation of 2 iterated hash functions (g(x)= h (x) || h (x)) internal memory is n bits) 1 2 is as most as strong as the strongest of the two (even if both are independent) • algorithm R • generate R = 2n1/2-fold • cost of collision attack against g at most n2/2 n1/2 (n1 + n2)/2 multi-collision for h2 n1 . 2 + 2 << 2 • in R: search by brute • cost of (2nd) preimage attack against g at most force for h1 n1 . 2n2/2 + 2n1 + 2n2 << 2n1 + n2 • if either of the functions is weak, the attacks may work better • Time: n1. 2n2/2 + 2n1/2 h1 h2 << 2(n1 + n2)/2 g(x) = h1(x) || h2(x) 25 26 Improving MD iteration Improving MD iteration salt + output transformation + counter + wide pipe • degradation with use: salting (family of functions, randomization) salt salt salt salt salt – or should a salt be part of the input? • PRO: strong output transformation g IV H1 H2 H 3 – also solves length extension 2n 2n 2n 2n g n f f f f 2n • long message 2nd preimage: preclude fix points – counter f fi [Biham-Dunkelman’07] |x| x x x x n/2 1 1 2 2 3 3 4 4 • multi-collisions, herding: avoid breakdown at 2 with larger internal memory: known as wide pipe security reductions well understood – e.g., extended MD4, RIPEMD, [Lucks’05] many more results on property preservation impact of theory limited 27 28 Tree structure: parallelism Permutation (π) based: sponge x x x x h1 h2 [Damgård’89], [Pal-Sarkar’03], [Keccak team’13] 1 2 3 4 H1 x1 x2 x3 x4 x5 x6 x7 x8 0 r f f f f π π π π π π H2 0 … c f f absorb squeeze if result has n bits, H1 has r bits (rate), H2 has c bits (capacity) and f the permutation π is “ideal” collisions min (2c/2 , 2n/2) 2nd preimage min (2c/2 , 2n) c