<<

Hash Functions - June 2016

Hash functions

X.509 Annex D RIPEMD-160 MDC-2 SHA-256 SHA-3 MD2, MD4, MD5 SHA-512 Introduction to the SHA-1 Design and of This is an input to a crypto- Cryptographic Hash Functions graphic . The input is a very long string, that is reduced by the hash function to a string of fixed length. There are 1A3FD4128A198FB3CA345932 Bart Preneel additional security conditions: it h should be very hard to find an KU Leuven - COSIC input hashing to a given value (a [email protected] preimage) or to find two colliding inputs (a collision). Sibenik, June 2016 Insert presenter logo here on slide master 2

Applications Agenda

• short unique identifier to a string • Definitions – digital signatures – data authentication • Iterations (modes) • one-way function of a string • Compression functions – protection of – micro-payments • Constructions • confirmation of knowledge/commitment • SHA-3 • pseudo-random string generation/ derivation • Conclusions • entropy extraction • construction of MAC algorithms, stream ciphers, block ciphers,…

2005: 800 uses of MD5 in Microsoft Windows 3 4

Security requirements (n-bit result) Preimage resistance

preimage 2nd preimage collision preimage • in a file, one does not store – (username, password) x ?  ? ?  ? ? • but – (username,hash(password)) h h h h h h • this is sufficient to verify a password • an attacker with access to the password file has to find a preimage h(x) h(x) = h(x’) h(x) = h(x’) h(x) 2n 2n 2n/2 2n

5 6

1 Hash Functions - Bart Preneel June 2016

Second preimage resistance

2nd preimage • hacker Alice prepares two versions collision x of a software driver for the O/S company Bob Channel 1: high capacity and insecure x’ x  ? – x is correct code x  h(x) – x’ contains a backdoor that gives Alice access to the machine Channel 2: low capacity but secure (= authenticated – cannot be modified) • Alice submits x for inspection to Bob h h • if Bob is satisfied, he digitally signs h h h(x) with his private key • an attacker can modify x but not h(x) • Alice now distributes x’ to users of h(x) = h(x’) • he can only fool the recipient if he the O/S; these users verify the h(x) = h(x’) finds a second preimage of x signature with Bob’s public key n n/2 2 • this signature works for x and for x’, 2 since h(x) = h(x’) 7 8

Pseudo-random function Brute force (2nd) preimage

computationally indistinguishable from a random function • multiple target second preimage (1 out of many): $ $ prf hK(.) f t Advh = Pr [ K  K: A 1] - Pr [ f  RAND(m,n):A 1] – if one can attack 2 simultaneous targets, the effort to find a single preimage is 2n-t RAND(m,n): set of all functions from m-bit to n-bit strings • multiple target second preimage (many out of many): n K – time-memory trade-off with Θ(2 ) precomputation and h storage Θ(22n/3) time per (2nd) preimage: Θ(22n/3) f [Hellman’80]

? or ? • answer: randomize hash function with a parameter S This concept makes only (, key, spice,…) sense for a function with a D secret key

9 10

Brute force attacks in practice Quantum computers

• in principle exponential parallelism • (2nd) preimage search n n/2 – n = 128: 14 B$ for 1 year if one can attack 240 targets in • inverting a one-way function: 2 reduced to 2 parallel [Grover’96] • parallel collision search: small memory using • collision search: can we do better than 2n/2 ? cycle finding algorithms (distinguished points) – 2n/3 computation + hardware [Brassard-Hoyer-Tapp’98] = 22n/3 n/4 – n = 128: 1 M$ for 5 hours (or 1 year on 60K PCs) – [Bernstein’09] classical collision search requires 2 computation and hardware (= standard cost of 2n/2 ) – n = 160: 56 M$ for 1 year – need 256-bit result for long term security (30 years or more)

11 12

2 Hash Functions - Bart Preneel June 2016

Properties in practice

• collision resistance is not always necessary • other properties are needed: – PRF: pseudo-randomness if keyed (with secret key) – PRO: pseudo- property Iteration – near-collision resistance – partial preimage resistance (most of input known) (mode of compression function) – multiplication freeness • how to formalize these requirements and the relation between them?

13 14 14

How not to construct a hash function Hash function: iterated structure

• Divide the message into t blocks xi of n bits each IV H1 H2 H3 Message block 1: x1 f f f f g 

Message block 2: x2 x1 x2 x3 x4  … • split messages into blocks of fixed length and hash them  block by block with a compression function f

Message block t: xt • need padding at the end = efficient and elegant…. but … Hash value h(x)

15 16

Security relation between f and h Security relation between f and h (2)

• iterating f can degrade its security • solution: Merkle-Damgård (MD) strengthening – trivial example: 2nd preimage – fix IV, use unambiguous padding and insert length at the end

• f is collision resistant  h is collision resistant IV H1 H2 H3 [Merkle’89-Damgård’89] f f f f g • f is ideally 2nd preimage resistant ? h is ideally 2nd preimage resistant [Lai-Massey’92] x1 x2 x3 x4

IV = H1 H2 H3 • many other results f f f g

x2 x3 x4 17 18

3 Hash Functions - Bart Preneel June 2016

Security relation between f and h (3) Attacks on MD-type iterations

length extension: if one knows h(x), easy to compute h(x || y) without knowing x or IV • long message 2nd IV H H [Dean-Felten-Hu'99], [Kelsey-Schneier’05] 1 2 H = h(x) t f f f 3 – Sec security degrades lineary with number 2 of message blocks hashed: 2n-t+1 + t 2n/2+1 – appending the length does not help here! x1 x2 x3

IV H1 H2 H3 H4= h(x || y) f f f f • multi- and impact on concatenation [Joux’04] x x x 1 2 3 y • herding attack [Kelsey-Kohno’06] – reduces security of commitment using a hash function from 2n solution: output transformation – on-line 2n-t + precomputation 2.2(n+t)/2 + storage 2t

IV H1 H2 H3 f f f f g

x1 x2 x3 x4 19 20

How (NOT) to strengthen a hash function? [Coppersmith’85][Joux’04] Multiple collisions  multi-collision

• answer: concatenation Assume “ideal” hash function h with n-bit result • Θ(2n/2) evaluations of h (or steps): 1 collision • h (n1-bit result) and h (n2-bit result) 1 2 – h(x)=h(x’)

• Θ(r. 2n/2) steps: r2 collisions

– h(x )=h(x ’) ; h(x )=h(x ’) ; … ; h(x 2)=h(x 2’) • intuition: the strength of g against h h 1 1 2 2 r r collision/(2nd) preimage attacks is the 1 2 2n/3 product of the strength of h1 and h2 • Θ(2 ) steps: a 3-collision — if both are “independent” – h(x)= h(x’)=h(x’’) g(x) = h1(x) || h2(x) • but…. • Θ(2n(t-1)/t) steps: a t-fold collision (multi-collision) – h(x1)= h(x2)= … =h(xt)

21 22

Multi-collisions on iterated hash function (2) Multi-collisions [Coppersmith’85][Joux ’04] IV H H H • finding multi-collisions for an iterated hash function is not 1 2 3 much harder than finding a single collision (if the size of the f f f f internal memory is n bits)

• algorithm R x1, x’1 x2, x’2 x3, x’3 x4, x’4 • generate R = 2n1/2-fold • for IV: collision for block 1: x1, x’1 multi-collision for h2 • in R: search by brute • for H1: collision for block 2: x2, x’2 force for h1 • for H2: collision for block 3: x3, x’3 n2/2 n1/2 • for H3: collision for block 4: x4, x’4 • Time: n1. 2 + 2 h1 h2 << 2(n1 + n2)/2 • now h(x1||x2||x3||x4) = h(x’1||x2||x3||x4) = h(x’1||x’2||x3||x4) = … = h(x’1||x’2||x’3||x’4) a 16-fold collision (time: 4 collisions) g(x) = h1(x) || h2(x) 23 24

4 Hash Functions - Bart Preneel June 2016

Multi-collisions Improving MD iteration [Coppersmith’85][ Joux ’04]

consider h (n1-bit result) and h (n2-bit result), with n1  n2. 1 2 salt + output transformation + counter + wide pipe concatenation of 2 iterated hash functions (g(x)= h1(x) || h2(x)) is as most as strong as the strongest of the two (even if both are independent) salt salt salt salt salt

• cost of collision attack against g at most IV H1 H2 H3 n2/2 n1/2 (n1 + n2)/2 g n1 . 2 + 2 << 2 2n f 2n f 2n f 2n f 2n n • cost of (2nd) preimage attack against g at most n1 . 2n2/2 + 2n1 + 2n2 << 2n1 + n2 x x x x |x| • if either of the functions is weak, the attacks may work better 1 1 2 2 3 3 4 4

security reductions well understood many more results on property preservation impact of theory limited 25 26

Improving MD iteration Tree structure: parallelism

• degradation with use: salting (family of functions, [Damgård’89], [Pal-Sarkar’03], [Keccak team’13] randomization) x1 x2 x3 x4 x5 x6 x7 x8 – or should a salt be part of the input? • PRO: strong output transformation g f f f f – also solves length extension • long message 2nd preimage: preclude fix points

– counter f fi [Biham-Dunkelman’07] f f • multi-collisions, herding: avoid breakdown at 2n/2 with larger internal memory: known as wide pipe – e.g., extended MD4, RIPEMD, [Lucks’05] f

27 28

Permutation (π) based: sponge Modes: summary

x x x h1 h2 1 2 x3 4 • growing theory to reduce security properties of hash function to that of compression function H10 r (MD) or permutation (sponge) – preservation of large range of properties π π π π π π H2 – relation between properties 0 … c • it is very nice to assume multiple properties of the compression function f, but unfortunately it is very absorb squeeze hard to verify these • still no single comprehensive theory if result has n bits, H1 has r bits (rate), H2 has c bits (capacity) and the permutation π is “ideal” collisions min (2c/2 , 2n/2) 2nd preimage min (2c/2 , 2n) c n preimage min (2 , 2 ) 29 30

5 Hash Functions - Bart Preneel June 2016

Single block length [Rabin’78]

x1 x2 x3 x4

H H H3 H4 IV E 1 E 2 E E Compression functions H’2 H’’2

• Merkle’s meet-in-the-middle: (2nd) preimage in time 2n/2

n/2 – select 2 values for (x1,x2) and compute forward H’2 n/2 – select 2 values for (x3,x4) and compute backward H’’2 – by the birthday paradox expect a match and thus a (2nd) preimage

31 31 32

Block cipher (EK) based: single block length Permutation (π) based

Davies-Meyer Miyaguchi-Preneel parazoa small permutation JH Grøstl

H H x i i i x H x i i-1 E i E π1 H1i-1 H1i

H2i-1 π H2i Hi xi Hi-1 Hi-1 π2 • output length = block length m; rate 1; 1 per • 12 secure compression functions (in ideal cipher model) • lower bounds: collision 2m/2, (2nd) preimage 2m • [Preneel+’93], [Black-Rogaway-Shrimpton’02], [Duo-Li’06], [Stam’09],…

33 34

Block cipher (EK) based: double block length (3n to 2n compression) Iteration modes and compression functions

• security of simple modes well understood collision Open problems: • powerful tools available [Mennink12] • what is the best (3) collision/preimage security for • analysis of slightly more complex schemes very n 2 block cipher calls? difficult • For optimal collision security: what is the best preimage [Jetchev-Özen-Stam12] security for s block cipher (2) • MD versus sponge debate: 2n/3 calls? (upper bounds are MDC-4 known) – sponge is simpler 5n/8 MDC-2 (4) – sponge easier to extend to , MAC,.. 3n/5 (2) – should xi and Hi-1 be treated differently? n/2 trivial (1)

preimage 2n n5n/4 3n/2 35 36

6 Hash Functions - Bart Preneel June 2016

Hash function history 101

DES RSA 1980 single block ad hoc Dedicated length schemes

HARDWARE MD2 Hash function MD4 1990 double MD5 block security SHA-1 constructions length reduction for RIPEMD-160 factoring, AES SHA-2 2000 small DLOG, lattices Whirlpool

SOFTWARE permu- tations

SHA-3 2010

37 37 38

MDx-type hash function history MD5: 4 rounds of 16 steps [Rivest’91]

H MD4 Ext. MD4 90 i-1 A0 B0 C0 D0

x0 f A1 B1 C1 D1 MD5 91 … x 15 f A16 B16 C16 D16 92 g A B C D H AVA L RIPEMD xp(0) 17 17 17 17 x … i g xp(15) A32 B32 C32 D32 K

93 A B C D i SHA(-0) xq(0) h 33 33 33 33 …

94 xq(15) h A48 B48 C48 D48 SHA-1 RIPEMD-160 95 A B C D xr(0) j 49 49 49 49 … x j B D SHA-2 r(15) A64 64 C64 64 02 + SHA-3 H 12 39 i 40

State updates in the MD4 family The complexity of collision attacks

brute force: 1 million PCs (1 year) or US$ 100,000 hardware (4 days)

MD4 SHA/SHA-1 SHA-256 90 80

K + Σ0 Σ1 + 70 K + f + MD4 M W A C 60 + + + J H MD5 W << s + 50 + SHA-0 + 40 SHA-1 AN BN CN DN EN FN GN HN 30 20 Brute force Design principles copied in MD5, RIPEMD, HAVAL, SHA, 10 SHA-1, SHA-256, ... 0

– All hash functions in use today 4 6 2 4 6 0 9 0 1 99 00 00 1992 1992 19 1 1998 2000 20 2 2 2008 20

41 42 Slide credit: C. Rechberger

7 Hash Functions - Bart Preneel June 2016

SHA-1 Rogue CA attack designed by NSA in ‘94 [Sotirov-Stevens-Appelbaum-Lenstra-Molnar-Osvik-de Weger ’08]

log2 complexity • request user cert; by special collision this results in a fake CA Self-signed root key [Wang+’05] cert (need to predict serial [Mendel+’08] [Manuel+’09] number + validity period) CA1 CA2 Rogue CA [Wang+’04] [Stevens’12]

User1 User2 User x [Sugita+’06] [McDonald+’09] impact: rogue CA that can issue certs that are trusted by Most attacks unpublished/withdrawn all browsers

•6 CAs have issued certificates signed with MD5 in 2008: prediction: collision for SHA-1 in the next 12 months — Rapid SSL, Free SSL (free trial certificates offered by RapidSSL), TC TrustCenter 57.7 AG, RSA Data Security, Verisign.co.jp - 75/80 steps takes 2 [Grechnikov-Adinetz’11] 43 44

Collisions for SHA-1 compression function Upgrades [Stevens-Karpman-Peyrin’15]

• 10 days on a cluster of 64 GPUs (2K$) • RIPEMD-160 is good replacement for SHA-1 – does not lead to a collision for SHA-1 with fixed IV • upgrading algorithms is always hard – compare to [denBoer-Bosselaers’93] for MD5 • by extrapolation: 100K$ for SHA-1 collision • TLS uses MD5 || SHA-1 to protect algorithm negotiation (up to v1.1) • browser industry: planned stop accepting • upgrading negotiation algorithm is even SHA-1 certs in 2017 harder: need to upgrade TLS 1.1 (‘06) to – September 2015: 28.2% of certs still use SHA-1 TLS 1.2 (‘08) – progress in November 2013 (Google, Microsoft) – but TLS 1.2 allows MD5 only!! SLOTH attack [late 2015]

45 46

SHA-2 : FIPS 180 designed by NSA, published in 2002

• SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/256 – non-linear message expansion – 64/80 steps – SHA-384 and SHA-512: 64-bit architectures

• SHA-256 collisions: 31/64 steps 265.5 [Mendel+’13] 128- SHA-3 – free start collision: 52/64 steps (2 ) [Li+12] – non-randomness 47/64 steps (practical) [Biryukov+11][Mendel+11] (bits and bytes) • SHA-256 preimages: 45/64 steps (2256-) [Khovratovitch’12]

• implementations today faster than anticipated – 18 cycles/byte on Core 2 (2008)  7.8 cycles/byte on Haswell (2013)

• adoption accelerated by other attacks on TLS – since 2013 deployment in TLS 1.2 47 48 48

8 Hash Functions - Bart Preneel June 2016

Properties: bits and bytes NIST AHS competition (SHA-3) [Watanabe’10]

• SHA-3: 224, 256, 384, and 512-bit message digests • (similar to SHA-2) Call: 02/11/07 Deadline (64): 31/10/08 Round 1 (51): 09/12/08 Round 2 (14): 24/7/09 Final (5): 10/12/10 80 64 Selection: 02/10/12 60 51 Standard 05/08/15 40 20 14 5 1 0 Q4/08 Q3/09 Q4/10 Q4/12

round 1 round 2 final 5-bit Sbox 49 50

Software performance Hardware: post-place & route results eBash [Bernstein-Lange] ASIC 130nm [Guo-Huang-Nazhandali-Schaumont’10]

Throughput logarithmic scale (Gbps) 20 slower SHA256 Blake Keccak BMW 16 CubeHash ECHO 12 Grostl Grøstl Hamsi JH 8 Keccak Luffa Blake Shabal SHAvite 4 JH SIMD Skein 0 Area 0 40,000 80,000 120,000 160,000 200,000 (GateEqv) 51 52 factor 4 in cycles/byte Slide credit: Patrick Schaumont, Virginia Tech

Keccak: FIPS 202 Keccak (published: 5 August 2015)

• append 2 extra bits for domain separation to allow – flexible output length (XOFs or eXtendable Output Functions) – tree structure (Sakura) allowed by additional encoding • 6 versions – SHA3-224: n=224; c = 448; r = 1152 (72%) – SHA3-256: n=256; c = 512; r = 1088 (68%) – SHA3-384: n=384; c = 768; r = 832 (52%) pad 01 – SHA3-512: n=512; c = 1024; r = 576 (36%) – SHAKE128: n=x; c = 256; r = 1344 (84%) pad 11 for XOF – SHAKE256: n=x; c = 512; r = 1088 (68%)

permutation: 25, 50, 100, 200, 400, 800, 1600 nominal version: if result has n bits, H1 has r bits (rate), H2 has c bits (capacity) and the permutation π is “ideal” collisions min (2c/2 , 2n/2) • 5x5 array of 64 bits [Dinur+13]: collisions 5 nd c/2 n out of 24 rounds 2 preimage min (2 , 2 ) • 24 rounds of 5 steps preimage min (2c , 2n)

9 Hash Functions - Bart Preneel June 2016

Performance of hash functions – Bernstein-Lange Hash functions: conclusions (cycles/byte) Intel Core 2 Quad Q9550; 4 x 2833MHz (2008)

2001 • SHA-1 would have needed 128-160 steps instead of 80 • 2004-2009 attacks: cryptographic meltdown but not dramatic for most applications • theory is developing for more robust iteration modes and extra features; still early for building blocks

• Nirwana: efficient hash functions with security reduction

(estimated) 56

The end

Thank you for your attention

57

10