Hash Functions - Bart Preneel June 2016

Hash Functions - Bart Preneel June 2016 Hash functions X.509 Annex D RIPEMD-160 MDC-2 SHA-256 SHA-3 MD2, MD4, MD5 SHA-512 Introduction to the SHA-1 Design and Cryptanalysis of This is an input to a crypto- Cryptographic Hash Functions graphic hash function. The input is a very long string, that is reduced by the hash function to a string of fixed length. There are 1A3FD4128A198FB3CA345932 Bart Preneel additional security conditions: it h should be very hard to find an KU Leuven - COSIC input hashing to a given value (a [email protected] preimage) or to find two colliding inputs (a collision). Sibenik, June 2016 Insert presenter logo here on slide master 2 Applications Agenda • short unique identifier to a string • Definitions – digital signatures – data authentication • Iterations (modes) • one-way function of a string • Compression functions – protection of passwords – micro-payments • Constructions • confirmation of knowledge/commitment • SHA-3 • pseudo-random string generation/key derivation • Conclusions • entropy extraction • construction of MAC algorithms, stream ciphers, block ciphers,… 2005: 800 uses of MD5 in Microsoft Windows 3 4 Security requirements (n-bit result) Preimage resistance preimage 2nd preimage collision preimage • in a password file, one does not store – (username, password) x ? ? ? ? ? • but – (username,hash(password)) h h h h h h • this is sufficient to verify a password • an attacker with access to the password file has to find a preimage h(x) h(x) = h(x’) h(x) = h(x’) h(x) 2n 2n 2n/2 2n 5 6 1 Hash Functions - Bart Preneel June 2016 Second preimage resistance Collision resistance 2nd preimage • hacker Alice prepares two versions collision x of a software driver for the O/S company Bob Channel 1: high capacity and insecure x’ x ? – x is correct code x h(x) – x’ contains a backdoor that gives Alice access to the machine Channel 2: low capacity but secure (= authenticated – cannot be modified) • Alice submits x for inspection to Bob h h • if Bob is satisfied, he digitally signs h h h(x) with his private key • an attacker can modify x but not h(x) • Alice now distributes x’ to users of h(x) = h(x’) • he can only fool the recipient if he the O/S; these users verify the h(x) = h(x’) finds a second preimage of x signature with Bob’s public key n n/2 2 • this signature works for x and for x’, 2 since h(x) = h(x’) 7 8 Pseudo-random function Brute force (2nd) preimage computationally indistinguishable from a random function • multiple target second preimage (1 out of many): $ $ prf hK(.) f t Advh = Pr [ K K: A 1] - Pr [ f RAND(m,n):A 1] – if one can attack 2 simultaneous targets, the effort to find a single preimage is 2n-t RAND(m,n): set of all functions from m-bit to n-bit strings • multiple target second preimage (many out of many): n K – time-memory trade-off with Θ(2 ) precomputation and h storage Θ(22n/3) time per (2nd) preimage: Θ(22n/3) f [Hellman’80] ? or ? • answer: randomize hash function with a parameter S This concept makes only (salt, key, spice,…) sense for a function with a D secret key 9 10 Brute force attacks in practice Quantum computers • in principle exponential parallelism • (2nd) preimage search n n/2 – n = 128: 14 B$ for 1 year if one can attack 240 targets in • inverting a one-way function: 2 reduced to 2 parallel [Grover’96] • parallel collision search: small memory using • collision search: can we do better than 2n/2 ? cycle finding algorithms (distinguished points) – 2n/3 computation + hardware [Brassard-Hoyer-Tapp’98] = 22n/3 n/4 – n = 128: 1 M$ for 5 hours (or 1 year on 60K PCs) – [Bernstein’09] classical collision search requires 2 computation and hardware (= standard cost of 2n/2 ) – n = 160: 56 M$ for 1 year – need 256-bit result for long term security (30 years or more) 11 12 2 Hash Functions - Bart Preneel June 2016 Properties in practice • collision resistance is not always necessary • other properties are needed: – PRF: pseudo-randomness if keyed (with secret key) – PRO: pseudo-random oracle property Iteration – near-collision resistance – partial preimage resistance (most of input known) (mode of compression function) – multiplication freeness • how to formalize these requirements and the relation between them? 13 14 14 How not to construct a hash function Hash function: iterated structure • Divide the message into t blocks xi of n bits each IV H1 H2 H3 Message block 1: x1 f f f f g Message block 2: x2 x1 x2 x3 x4 … • split messages into blocks of fixed length and hash them block by block with a compression function f Message block t: xt • need padding at the end = efficient and elegant…. but … Hash value h(x) 15 16 Security relation between f and h Security relation between f and h (2) • iterating f can degrade its security • solution: Merkle-Damgård (MD) strengthening – trivial example: 2nd preimage – fix IV, use unambiguous padding and insert length at the end • f is collision resistant h is collision resistant IV H1 H2 H3 [Merkle’89-Damgård’89] f f f f g • f is ideally 2nd preimage resistant ? h is ideally 2nd preimage resistant [Lai-Massey’92] x1 x2 x3 x4 IV = H1 H2 H3 • many other results f f f g x2 x3 x4 17 18 3 Hash Functions - Bart Preneel June 2016 Security relation between f and h (3) Attacks on MD-type iterations length extension: if one knows h(x), easy to compute h(x || y) without knowing x or IV • long message 2nd preimage attack IV H H [Dean-Felten-Hu'99], [Kelsey-Schneier’05] 1 2 H = h(x) t f f f 3 – Sec security degrades lineary with number 2 of message blocks hashed: 2n-t+1 + t 2n/2+1 – appending the length does not help here! x1 x2 x3 IV H1 H2 H3 H4= h(x || y) f f f f • multi-collision attack and impact on concatenation [Joux’04] x x x 1 2 3 y • herding attack [Kelsey-Kohno’06] – reduces security of commitment using a hash function from 2n solution: output transformation – on-line 2n-t + precomputation 2.2(n+t)/2 + storage 2t IV H1 H2 H3 f f f f g x1 x2 x3 x4 19 20 How (NOT) to strengthen a hash function? [Coppersmith’85][Joux’04] Multiple collisions multi-collision • answer: concatenation Assume “ideal” hash function h with n-bit result • Θ(2n/2) evaluations of h (or steps): 1 collision • h (n1-bit result) and h (n2-bit result) 1 2 – h(x)=h(x’) • Θ(r. 2n/2) steps: r2 collisions – h(x )=h(x ’) ; h(x )=h(x ’) ; … ; h(x 2)=h(x 2’) • intuition: the strength of g against h h 1 1 2 2 r r collision/(2nd) preimage attacks is the 1 2 2n/3 product of the strength of h1 and h2 • Θ(2 ) steps: a 3-collision — if both are “independent” – h(x)= h(x’)=h(x’’) g(x) = h1(x) || h2(x) • but…. • Θ(2n(t-1)/t) steps: a t-fold collision (multi-collision) – h(x1)= h(x2)= … =h(xt) 21 22 Multi-collisions on iterated hash function (2) Multi-collisions [Coppersmith’85][Joux ’04] IV H H H • finding multi-collisions for an iterated hash function is not 1 2 3 much harder than finding a single collision (if the size of the f f f f internal memory is n bits) • algorithm R x1, x’1 x2, x’2 x3, x’3 x4, x’4 • generate R = 2n1/2-fold • for IV: collision for block 1: x1, x’1 multi-collision for h2 • in R: search by brute • for H1: collision for block 2: x2, x’2 force for h1 • for H2: collision for block 3: x3, x’3 n2/2 n1/2 • for H3: collision for block 4: x4, x’4 • Time: n1. 2 + 2 h1 h2 << 2(n1 + n2)/2 • now h(x1||x2||x3||x4) = h(x’1||x2||x3||x4) = h(x’1||x’2||x3||x4) = … = h(x’1||x’2||x’3||x’4) a 16-fold collision (time: 4 collisions) g(x) = h1(x) || h2(x) 23 24 4 Hash Functions - Bart Preneel June 2016 Multi-collisions Improving MD iteration [Coppersmith’85][ Joux ’04] consider h (n1-bit result) and h (n2-bit result), with n1 n2. 1 2 salt + output transformation + counter + wide pipe concatenation of 2 iterated hash functions (g(x)= h1(x) || h2(x)) is as most as strong as the strongest of the two (even if both are independent) salt salt salt salt salt • cost of collision attack against g at most IV H1 H2 H3 n2/2 n1/2 (n1 + n2)/2 g n1 . 2 + 2 << 2 2n f 2n f 2n f 2n f 2n n • cost of (2nd) preimage attack against g at most n1 . 2n2/2 + 2n1 + 2n2 << 2n1 + n2 x x x x |x| • if either of the functions is weak, the attacks may work better 1 1 2 2 3 3 4 4 security reductions well understood many more results on property preservation impact of theory limited 25 26 Improving MD iteration Tree structure: parallelism • degradation with use: salting (family of functions, [Damgård’89], [Pal-Sarkar’03], [Keccak team’13] randomization) x1 x2 x3 x4 x5 x6 x7 x8 – or should a salt be part of the input? • PRO: strong output transformation g f f f f – also solves length extension • long message 2nd preimage: preclude fix points – counter f fi [Biham-Dunkelman’07] f f • multi-collisions, herding: avoid breakdown at 2n/2 with larger internal memory: known as wide pipe – e.g., extended MD4, RIPEMD, [Lucks’05] f 27 28 Permutation (π) based: sponge Modes: summary x x x h1 h2 1 2 x3 4 • growing theory to reduce security properties of hash function to that of compression function H10 r (MD) or permutation (sponge) – preservation of large range of properties π π π π π π H2 – relation between properties 0 … c • it is very nice to assume multiple properties of the compression function f, but unfortunately it is very absorb squeeze hard to verify these • still no single comprehensive theory if result has n bits, H1 has r bits (rate), H2 has c bits (capacity) and the permutation π is “ideal” collisions min (2c/2 , 2n/2) 2nd preimage min (2c/2 , 2n) c n preimage min (2 , 2 ) 29 30 5 Hash Functions - Bart Preneel June 2016 Single block length [Rabin’78] x1 x2 x3 x4 H H H3 H4 IV E 1 E 2

Load more