<<

www.ecrypt . eu. org

Theory and practiceTitle of Presentation for hash functions Kath o lie ke Un ivers ite it Leuven - COSIC [email protected]

Cambridge, 1 February 2012

Insert presenter logo here on slide master Hash functions

X.509 Annex D RIPEMD-160 MDC-2 SHA-256 SHA-3 MD2, MD4, MD5 SHA-512 SHA-1

This is an input to a crypto - graphic . The input is a very long string, that is reduced by the hash function to a string of fixed length. There are 1A3FD4128A198FB3CA345932 additional security conditions: it h should be very hard to find an input hashing to a given value (a preimage) or to find two colliding inputs (a collision).

2 Applications

• short unique identifier to a string – digital signatures – data • one-way function of a string – protection of passwords – micro-payments • confirma tion o f k nowl e dge /comm itmen t

• pseudo-random string generation/ derivation • entropy extraction • construction of MAC algorithms, stream ciphers, block ciphers,…

2005: 800 uses of MD5 in Microsoft Windows 3 Bridging theory and practice?

4 Agenda

Definitions

Iterations (modes)

Compression functions

Hash functions

SHA-3

5 Security requirements (n-bit result) preimage 2nd preimage collision ? x ≠ ? ? ≠ ?

h h h h h

h(()x) h(()x) = h(()x’) h(()x) = h(x’) 2n 2n 2n/2

6 Length extension length extension: if one knows h(x), easy to compute h(x || y) without knowing x or IV

IV H H 1 2 H = h(x) f f f 3

x1 x2 x3

IV H1 H2 H3 H4= h(x || y) f f f f

x1 x2 x3 y

solution: output transformation

IV H1 H2 H3 f f f f g

x1 x2 x3 x4 7 Informal definitions: NIST requirements for SHA-3

• One Way Hash Function (OWHF) – preimage resistance (2n) – 2nd preimage resistance (2n-L for 2L-bit message) • Collision Resistant Hash Function (CRHF): – collision resistant • Resistance to length extension • Properti es sca le we ll w ith trunca tion • HMAC-PRF indistinguishable up to 2n/2 queries • Secure in randomized hashing mode

8 Brute force (2nd) preimage

• multippgle target second preima ge (()1 out of M): if one can attack 2t simultaneous targets, the effort to find a single preimage is 2n-t • multiple target second preimage (M out of M): – time-memory trade -off with Θ(2n) precomputation and storage Θ(22n/3) time per (2nd) preimage: Θ(22n/3) [Hellman’80] – full cost per (2nd) preimage from Θ(2n) to Θ(22n/5) [Wiener’02] (if Θ(23n/5)t) target s are att ack ed)

answer: randomize hash function with a parameter S ( , key , spice,… )

9 Formal definition: (2nd) preimage resistance

Notation: Σ = {0,1}, l(n)>n A one-way hash function (OWFH) h is a function with domain D=Σl(n) and range R=Σn that satisfies the following conditions: • Pre: let x be selected uniformly in D and let M be an adversary that on input h(x) uses time ≤ t and outputs M(h(x)) ∈ D.

For each adversary M, Prx ∈ D { h(M(h(x)))=h(x) } < ε. Here the prob. is also taken over the random choices of M. • Sec:let: let x be selected uniformly in D=Σl(n) and let M' be an adversary who on input x uses time ≤ t and outputs x' ∈ D with ≠ x' x. For each adversary M', Prx ∈ D {h(M{ h(M'(x)) =h(x) } < ε. Here the prob. is taken over the random choices of M'.

10 Formal definitions: collision resistance

A collision-resistant hash function (col) h is a function with domain D=Σl(()n) and range R=Σn that that satisfies the following conditions: • for each collision string finder F that uses time ≤ t and outputs either “?” or a pair x, x' ∈ Σl(n) with x' ≠ x such that h(x‘)=h(x) we have that Pr{ F(H) ≠ “?‘” } < ε. Here the prob. is taken over the random choices of F.

Is this correct? No ! F or every h th ere are many ( sh or t) collidi ng s tr ings, and so there exist many very simple collision string finders F that output a collision with probability 1

11 Formal definitions: collision resistance

A collision-resistant hash function family (CRHF) H is a l(n) n function family {hS} with domain D=Σ and range R=Σ and indexed by a “key” S that that satisfies the following conditions: • let F be a collision string finder that on input S ∈ Σs uses time ≤ t and outputs either “?”? or a pair x, x ' ∈ Σl(n) with x' ≠

x such that hS(x‘)=hS(x). For each F, ≠ PrS { F(H) “?‘” } < ε. Here the prob. is also taken over the random choices of F.

12 Formal definitions - continued

• for collision resistance: formalization requires a family, btbut see [Stinson ’06], [R ogaway’06] : “formali z ing human ignorance”

• for (2nd) preimage resistance, one can choose the challengg()e (x) and/or the ke y()y (S) that selects the function , resulting in 3 flavors [Rogaway-Shrimpton’04]: 1. random challenge, random key (Pre and Sec) 2. randkfidhll(PdSdom key, fixed challenge (ePre and eSec - everywh)here) (eSec=UOWHF) 3. fixed keyyg(, random challenge (aPre and aSec - alwayy)s)

• complex relationship (see figure on next slide)

13 Relations between security properties

[Rogaway-Shrimpton’04] Even if Coll [Andreeva-Stam’10] eSec/Pre: bound always 2n/2 << 2n

∀Q rPre Easy to prove d Relevant in practice How does ePre (theoretically easy to prove) relate to dPre (practically relevant)?

For sufficiently Col secure H, ePre implies dPre (if the message space has sufficient Rényi2 entropy > |Y|) 14 Collision resistance

• more complex to formalize • hard to achieve in theory – [Simon’ 98] one cannot derive collision resistance from “general” preimage resistance (there exists no black box reduction) • hard to achieve in practice – requires double output length 2n/2 versus 2n – many attacks

Good news: not alwayyys necessary Even better news: rarely necessary

15 How to solve collision problem in digital signatures?

Message m UOWHF TCR $ Random r r ← {0,1}k eSEC

Signature Sig SKA ( r || hr(m)) SKA

• Hash fffunction is only chosen after the message has been committed to, so collisions for a particular hash function are useless • If two messages m , m ’ are found with the same signature, the signer must have cheated by first choosing r and then finding a collision • Fine in theory, but will this work in p ractice?

16 Courts and second preimage resistance

17 Pseudo-random function computationally indistinguishable from a random function $ $ prf ← hK(.) ← f Advh = Pr [ K K: A 1] - Pr [ f RAND(m,n): A 1] RAND(m,n): set of all functions from m-bit to n-bit strings

K h f

?or?? or ? This concept makes only sense for a function with a D secret key

18 Indifferentiability from a random oracle or PRO property [Maurer+04] variant of indistinguishability appropriate when distinguisher has access to inner component (e.g. bu ilding bloc k o f a hash function) ∃ Simulator S, ∀ distinguisher DAdvD, AdvPRO(H,S) is small

FIL H VIL RO S (hash function) RO ? or ?

!! But [Ristenpart-Shacham- D Shrimpton’11]: not always suffici ent f or ROM securi ty

19 Properties in practice

• other properties are needed: – near-collis ion res is tance – partial preimage resistance (most of input known) – multiplication freeness – MAC property

Which properties How to formalize are relevant? these requirements and the relation between all these properties?

Keyed or unkeyed? Who chooses the key? What about the salt?

20 21 Itera tion (df(mode of compressi ifon functi ti)on)

22 22 Hash function: iterated structure

IV H1 H2 H3 f f f f g x1 x2 x3 x4

Fix IV Are the roles of the Unambiguous padding 2 inputs of the Suffix-free encoding compression function really equivalent? [Merkle’89-Damgård’89] f is collision resistant  h is collision resistant

23 Attacks on MD-type iterations

• multi- and impact on concatenation [Joux’04]

• long message 2nd [Dean-Felten-Hu'99], [Kelsey-Schneier’05] – Sec security degrades lineary with number 2t of message blocks hashed: 2n-t+1 +t2+ t 2n/2+1 – appending the length does not help!

• herding attack [Kelsey-Kohno’06] – reduces security of commitment using a hash function from 2n – on-line 2n-t + precomputtitation 2 22.2(n+t)/2 + storage 2t

24 Improving MD iteration

salt + output transformation + counter + wide pipe

salt salt salt salt salt

IV H1 H H 2 3 g 2n f 2n f 2n f 2n f 2n n x x x x |x| 1 1 2 2 3 3 4 4

25 Improving MD iteration

• degradation with use: salting (family of functions, randii)domization) – or should a salt be part of the input? • PRO/Indif: strong output transformation g – also solves length extension • long message 2nd preimage: preclude fix points → – counter f fi [Biham-Dunkelman’07] • multi-collisions, herding: avoid breakdown at 2n/2 with larger internal memory: known as wide pipe – e.g., extended MD4, RIPEMD, [Lucks’05]

26 Some Domain Extender Security Results [Andreeva-Neven-P-Shrimpton07], [Andreeva-Mennink-P11]

Col Sec aSec eSec Pre aPre ePre Indif 1. SMD [M90, D90] + - - - - - + - 2. Linear hash [BR97] ------++ 3. XOR Linear hash [BR97] + -- + -- + - 4. Shoup's hash [Sho00] + -- + -- + - 5. BCM [AP08] ++ ----+ ? 6. Prefix-free MD [CDMP05] ------++ 7. Randomized hash [HK06] + -- --- + - 8. HAIFA [BD06] + -- --- + + 9. Enveloped MD [BR06] + -- --- ++ 10. Str. [[]Mer80] + -- --- + ? 11. Linear Tree [BR97] ------+ ? 12. ROX [ANPS07] ++ + + + + + ?

27 Preservation

• which constructions preserve which properties? • or is it ok if we can prove security for the iteration? • assumptions on compression function f for PRO/Indif – f is preimage aware (MD) – f is FIL-PRO (variant of MD)

Can tight Impact of results reductions be in ideal model? made?

Is it meaningful to preserve Pre?

28 Compression function

29 29 (EK) based: single block length

Davies-Meyer Miyaguchi-Preneel

Hi Hi H x i-1 E i E

xi Hi-1

• outppgut length = block len g;;ygth m; rate 1; 1 per encr yption • 12 secure compression functions (in ideal cipher model) • lower bounds: collision 2m/2, (2nd) preimage 2m • [Prenee l+ ’93], [Bl ac k-Rogaway-Shri mpt on’02] , [D uo-Li’06], [Stam ’09],…

30 Blake

Davies-Meyer + double pipe internally

Hi-1 Hi E

xi

31 Permutation (π) based: sponge

x1 x2 x3 x4 h1 h2

H10

π π π π π π π π H2 0 …

absorb buffer squeeze

Examples: , RadioGatun Keccak (no buffer) Generalization called Parazoa JH, Cubehash, Fuge, Grindahl, Hamsi, Luffa 32 Permutation (π) based

parazoa small permutation JH Grøstl xi xi π1 H1i-1 H1i

H2i-1 π H2i Hi Hi-1 π2

33 MDC-2 and MDC-4 [Brachtl+’89]

E1,E, E2 n-bit block cipher with n-bit key Some security results [[g,Steinberger’07,Mennink+’11 ] 2n-to-n bit compression function with 3 permutations

Earlier work: [RogawaySteinberger‘08, Stam‘08, Steinberger’10]

Slide credit: Bart Mennink 2n-to-n bit compression function with 3 permutations and XORs [Mennink+‘11]

Slide credit: Bart Mennink Iteration modes and compression functions

• security of simple modes well understood • powerful tools available • analysis of slightly more complex schemes very difficult • need strong assumptions – imho not justified is there a point to build larger hash functions from small building blocks? – why not construct large permutations (e. g. Keccack = 1600 bits)? – what if the building block is not ideal (the larger they are the more risky)?

MD versus sponge?

38 HhftiHash function constructions

39 39 Hash function history 101

DES RSA

1980 E RR single block ad hoc Dedicated

RDWA length schemes AA

H MD2 MD4 1990 double MD5 block security SHA-1 length reduction

RE for RIPEMD-160 AA factoring, SHA-2 2000 AES permu- DLOG, lattices Whirlpool SOFTW tations

SHA-3 2010 40 Performance of hash functions [Bernstein-Lange] (cycles/) AMD Intel Pentium D 2992 MHz (f64)

45 2001 40 35 30 25 20 15 10 5 0 MD4MD5 SHA-1 RMD- DESSHA- SHA- Whirl-AES AES- hash (esti- 160 256 512 pool 41 mated) The complexity of collision attacks

brute force: 1 million PCs (1 year) or US$ 100,000 hardware (4 days)

90 80 70 MD4 60 MD5 50 SHA-0 40 30 SHA-1 20 Brute force 10 0

2 6 92 92 94 96 98 00 0 04 0 08 10 9 9 9 9 0 0 0 0 0 0 19 1 1 1 1 2 2 2 2 2 2

42 MD5

• advice (RIPE since ‘92, RSA si nce ‘96) : stop using MD5 • largely ignored by industry until 2009 (lik(click on a cer t... )

43 Rogue CA attack [Sotirov-Stevens-Appelbaum-Lenstra-Molnar-Osvik-de Weger ’08]

• reqq;ypuest user cert; by special Self-signed collision this results in a fake CA root key cert (need to predict serial number + validity period) CA1 CA2 Rogue CA

impact: rogue CA User1 User2 User x that can issue certs that are trusted by all browsers

• 6 CAs have issued certificates signed with MD5 in 2008: — Rapp(id SSL, Free SSL (free trial certificates offered b y RapidSSL), TC TrustCenter AG, RSA Data Security, Verisign.co.jp

44 SHA-1 log2 complexity 90 80 [Wang+’05] 70 [Mendel+’08] [Manuel+’09] 60 [Wang+’04] 50 SHA- 1 40 [Sugita+’06] [McDonald+’09] 30 20 Most attacks 10 unpublished/withdrawn 0 2003 2004 2005 2006 2007 2008 2009 2010 prediction: collision for SHA-1 in the next 12-18 months 45 46 SHA-3

47 47 NIST AHS competition (SHA-3)

• SHA-3 must support 224, 256, 384, and 512-bit message digg,ests, and must supp ort a maximum messa ge len gth of at least 264 bits Call: 02/11/07 Deadline (64): 31/10/08 Round 1 (51): 9/12/08 80 64 Round 2 (14): 24/7/09 60 51 Final (5): 9/12/10 40 Standard: Q3 2012 20 14 5 1 0 Q4/08Q3/09Q4/10Q3/12

round1d 1 final round 2 48 The candidates

49 Slide credit: Christophe De Cannière Preliminary

50 Slide credit: Christophe De Cannière End of Round 1 candidates

a

51 Slide credit: Christophe De Cannière Round 2 candidates

a

52 Slide credit: Christophe De Cannière Properties: bits and [Watanabe’10]

53 Security Analysis of the 2nd Round SHA-3 Candidates [Andreeva-Mennink-PP10]’10]

54 Security results for 256-bit versions

l-state m-bl. Pre Sec Col Indif size size BLAKE-256 256 512 256 256 128 128 Grøstl-256 512 512 256 256-L 128 128 JH-256 1024 512 256 256 128 170 KkKeccak-256 1600 1088 256 256 128 256 -256 512 512 256 256 128 256 NIST requirements 256 256-L 128 - L: # msg blocks

E. Andreeva, A. Luykx and B. Mennink, "Provable Security of BLAKE with Non- Ideal Compression Function", ePrint Archive, Report 2011/620. D. Chang, M. Nandi and M. Yung: “Indifferentiability of the hash algorithm BLAKE” , ePrint Archive,,p Report 2011/623. E. Andreeva, B. Mennink and B. Preneel, "Security Analysis and Comparison of the SHA-3 Finalists BLAKE, Grøstl, JH, Keccak, and Skein ", 2012. 55 Slide credit: Elena Andreeva Security Results for 512-bit Versions

l-state m-bl. Pre Sec Col Indif size size BLAKE-512 512 1024 512 512 256 256

Grøstl-512 1024 1024 512 512-L 256 256 JH-512 1024 512 256 256 256 170 Keccak-512 1600 576 512 512 256 512 Skein-512 512 512 512 512 256 256 NIST requirements 512 512-L 256 - L: # msg blocks

E. Andreeva, A. Luykx and B. Mennink, "Provable Security with Non -Ideal Compression Function", ePrint Archive, Report 2011/620. D. Chang, M. Nandi and M. Yung: “Indifferentiability of the hash algorithm BLAKE” , ePrint Archive,,p Report 2011/ 623. E. Andreeva, B. Mennink and B. Preneel , "Security Analysis and Comparison of the SHA-3 Finalists BLAKE, Grøstl, JH, Keccak, and Skein ", 2012. 56 Slide credit: Elena Andreeva Open questions

• standard model security proofs (other than Col preservation) • improved Indif bounds • JH-512 optimality for Sec and Pre security in the ideal model

57 57 Slide credit: Elena Andreeva SWIFFTX [Arbitman-Dogon-Lyubashevsky-Micciancio- Peikert-Rosen’08]

• compression function: 32 64 64 – SWIFFT: FFT-like opera tion from (Z2 ) tZto Z257 – sandwich: 3xSWIFFT - S-boxes - 1xSWIFFT • asymptotic proof of security: “it can be formally proved that finding a collision in a randomly- chosen compression function from the SWIFFTX family is at least as hard as finding short vectors in cyyg[clic/ideal lattices over the ring Z[α]/(α n+1) is in the worst case.” • note: SWIFFT mapping is linear and some heuristics are needed to “kill” the linearity • speed: 57 cpb

58 FSB [Augot-Finiasz-Gaborit-Manuel-Sendrier’08]

• compression function: multiplication of vector of Hamming weight w with a truncated quasi-cyclic binary matrix – can be interpreted as a syndrome computation of error pattern with weight w • MD iteration with Whirlpool as output transformation • security can be reduced to: (Computational Syndrome Decoding) Given a binary r x n matrix H , a word s ∈ {0,1}r and an integer w > 0, find a word e ∈ {0,1}n of Hamming weight ≤ w such that eHT = s. (Codeword Finding) Given a binary r x n matrix H and an integer w > 0, and a non-zero word e ∈ {0,1}n of Hamming weight ≤ w with an all zero H- syndrome. • speed – 324 cycles/byte – RFSB: 10.67 cycles/byte on Core 2 Q9550 [Bernstein-Lange’ 11]

59 Hash functions: conclusions

• 2004-2009 attacks: cryptographic meltdown but not dramatic for most applications • half-life of a hash function is < 1 year • solid progress in theory, but still quite a large gap between theory and practice

• nirwana: e ffici ent h ash functi ons with secur ity re duc tions 60