Cryptographic hash functions

Myrto Arapinis
School of Informatics
University of Edinburgh

January 26, 2015

Introduction

Encryption ⇒ confidentiality against eavesdropping

What about authenticity and integrity against an active attacker?
→ cryptographic hash functions and message authentication codes
→ this lecture

One-way functions (OWFs)

A OWF is a function that is easy to compute but hard to invert:

Definition (One-way)
A function f is a one-way function if for all x there is no efficient algorithm which, given f(x), can compute x.

- Constant functions ARE OWFs: for any function f(x) = c (c a constant) it is impossible to retrieve x from f(x)
- The successor function in N IS NOT a OWF: given succ(n) it is easy to retrieve n = succ(n) − 1
- Multiplication of large primes IS a OWF: integer factorization is a hard problem; given p × q (where p and q are primes) it is hard to retrieve p and q

Collision-resistant functions (CRFs)

A function is a CRF if it is hard to find two messages that get mapped to the same value through this function:

Definition (Collision resistance)
A function f is collision resistant if there is no efficient algorithm that can find two distinct messages $m_1$ and $m_2$ such that $f(m_1) = f(m_2)$.

- Constant functions ARE NOT CRFs: for all $m_1$ and $m_2$, $f(m_1) = f(m_2)$
- The successor function in N IS a CRF: the predecessor of a positive integer is unique
- Multiplication of large primes IS a CRF: every positive integer has a unique prime factorization
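The three examples above can be tried out on toy inputs. The sketch below is only an illustration: the function names are made up for this example, and trial division stands in for "an inversion algorithm" that is feasible for tiny primes but not for the large primes the slide has in mind.

```python
def succ(n: int) -> int:
    """Successor function on the natural numbers."""
    return n + 1


def invert_succ(y: int) -> int:
    """Inverting succ is a single subtraction, so succ is not one-way."""
    return y - 1


def const(x: int) -> int:
    """A constant function: any two inputs collide, so it is not a CRF."""
    return 0


def multiply(p: int, q: int) -> int:
    """Multiplying two primes is easy in the forward direction."""
    return p * q


def factor_semiprime(n: int) -> tuple[int, int]:
    """Invert multiply() by trial division (toy sizes only).

    The loop runs about sqrt(n) times, which is hopeless for primes
    of cryptographic size.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    raise ValueError("n has no non-trivial factor")
```

For example, `factor_semiprime(multiply(101, 103))` recovers the two factors quickly only because they are tiny.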

Cryptographic hash functions

A cryptographic hash function takes messages of arbitrary length and returns a fixed-size bit string such that any change to the data will (with very high probability) change the corresponding hash value.

Definition (Cryptographic hash function)
A cryptographic hash function $H : M \to T$ is a function that satisfies the following 4 properties:
- $|M| \gg |T|$
- it is easy to compute the hash value for any given message
- it is hard to retrieve a message from its hashed value (OWF)
- it is hard to find two different messages with the same hash value (CRF)

Examples: MD4, MD5, SHA-1, SHA-256, Whirlpool, ...

Cryptographic hash functions: applications

- Commitments - Allow a participant to commit to a value v by publishing the hash H(v) of this value, but revealing v only later. Ex: electronic voting protocols, digital signatures, ...
- File integrity - Hashes are sometimes posted along with files on "read-only" spaces to allow verification of integrity of the files. Ex: SHA-256 is used to authenticate Debian GNU/Linux software packages
- Password verification - Instead of storing passwords in cleartext, only the hash digest of each password is stored. To authenticate a user, the password presented by the user is hashed and compared with the stored hash.
- Key derivation - Derive new keys or passwords from a single, secure key or password.
- Building block of other crypto primitives - Used to build MACs, block ciphers, PRGs, ...
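Two of the applications above (commitments and password verification) fit in a few lines with Python's standard `hashlib`. This is a minimal sketch that mirrors the slide literally: the helper names are invented for the example, SHA-256 is an arbitrary choice of H, and the random nonce or salt and slow password hash that practical schemes typically add are left out.

```python
import hashlib
import hmac


def H(data: bytes) -> bytes:
    """The hash function H; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(data).digest()


def commit(value: bytes) -> bytes:
    """Commit to a value by publishing its hash H(v)."""
    return H(value)


def open_commitment(commitment: bytes, revealed: bytes) -> bool:
    """Check that the value revealed later matches the commitment."""
    return hmac.compare_digest(commitment, H(revealed))


def store_password(password: str) -> bytes:
    """Store only the digest of the password, never the cleartext."""
    return H(password.encode("utf-8"))


def check_password(stored: bytes, presented: str) -> bool:
    """Hash the presented password and compare with the stored digest."""
    return hmac.compare_digest(stored, H(presented.encode("utf-8")))
```

`hmac.compare_digest` is used for the comparisons only because it runs in constant time; a plain `==` would give the same results.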


Collision resistance and the birthday attack

Theorem
Let $H : M \to \{0,1\}^n$ be a cryptographic hash function ($|M| \gg 2^n$).
Generic algorithm to find a collision in time $O(2^{n/2})$ hashes:
1. Choose $2^{n/2}$ random messages in M: $m_1, \ldots, m_{2^{n/2}}$
2. For $i = 1, \ldots, 2^{n/2}$ compute $t_i = H(m_i)$
3. If there exists a collision ($\exists\, i \neq j.\ t_i = t_j$) then return $(m_i, m_j)$ else go back to 1

Proof: (details on the board)
the expected number of iterations is about 2 ⇒ running time $O(2^{n/2})$

⇒ Cryptographic hash functions used in new projects should have an output size $n \geq 256$!

The Merkle-Damgård construction

[Figure: the message is split into blocks m0, m1, m2, m3 || PB. Starting from a fixed IV, the compression function h is applied to each block in turn, producing chaining values H1, H2, H3 and finally H4 = H(m).]

- Compression function: $h : T \times X \to T$
- Padding Block (PB): 1000...0 || mes-len (add extra block if needed)

Example of MD constructions: MD5, SHA-1, SHA-2, ...
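The generic collision-finding algorithm from the theorem can be run against a deliberately weakened hash. In the sketch below, `truncated_hash` (SHA-256 cut down to `n_bits` bits) is an assumption made so that $2^{n/2}$ stays small; the three steps follow the slide, with step 3 checked on the fly through a dictionary.

```python
import hashlib
import secrets


def truncated_hash(message: bytes, n_bits: int) -> int:
    """Toy hash with n-bit output: the top n_bits bits of SHA-256."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)


def find_collision(n_bits: int = 24) -> tuple[bytes, bytes]:
    """Birthday attack: batches of 2^(n/2) random messages until two collide."""
    batch_size = 2 ** (n_bits // 2)
    while True:                      # "else go back to 1"
        seen = {}                    # maps t_i -> m_i for the current batch
        for _ in range(batch_size):  # steps 1 and 2
            m = secrets.token_bytes(16)
            t = truncated_hash(m, n_bits)
            # step 3: same hash value, different message
            if t in seen and seen[t] != m:
                return seen[t], m
            seen[t] = m
```

With `n_bits = 24` each batch is only 4096 messages, so a collision usually appears after a handful of batches. The same loop against the full 256-bit output would need about $2^{128}$ hashes per batch.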

Properties of MD construction

Theorem
Let H be built using the MD construction applied to the compression function h. If H admits a collision, so does h.

Proof: (details on the board)

Compression functions from block ciphers

Let $E : K \times \{0,1\}^n \to \{0,1\}^n$ be a block cipher.

[Figure: two ways to build a compression function from E, each computing $H_{i+1}$ from the chaining value $H_i$ and the message block $m_i$ with one call to E followed by XOR: Davies-Meyer and Miyaguchi-Preneel.]
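The Merkle-Damgård iteration and a Davies-Meyer compression function can be sketched together. Everything concrete here is an assumption for illustration: the block cipher is a hand-written XTEA-style cipher (64-bit block, 128-bit key) standing in for E, the IV is all zeros, and the padding is 0x80, zeros, then a 64-bit message length. Davies-Meyer is taken in its usual form $h(H_i, m_i) = E(m_i, H_i) \oplus H_i$, with the message block used as the cipher key. The resulting 64-bit digest is far too short for real use.

```python
import struct

BLOCK = 16          # message block size in bytes (= cipher key size)
STATE = 8           # chaining value size in bytes (= cipher block size)
IV = bytes(STATE)   # fixed IV; all zeros is an arbitrary choice here


def xtea_encrypt(key: bytes, block: bytes, rounds: int = 32) -> bytes:
    """XTEA-style encryption of one 8-byte block under a 16-byte key."""
    v0, v1 = struct.unpack(">2I", block)
    k = struct.unpack(">4I", key)
    mask, delta, total = 0xFFFFFFFF, 0x9E3779B9, 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (total + k[total & 3]))) & mask
        total = (total + delta) & mask
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (total + k[(total >> 11) & 3]))) & mask
    return struct.pack(">2I", v0, v1)


def davies_meyer(h_i: bytes, m_i: bytes) -> bytes:
    """Compression function: H_{i+1} = E(m_i, H_i) XOR H_i."""
    encrypted = xtea_encrypt(m_i, h_i)
    return bytes(a ^ b for a, b in zip(encrypted, h_i))


def md_pad(message: bytes) -> bytes:
    """Append 1000...0 || mes-len, adding an extra block if needed."""
    padded = message + b"\x80"
    padded += b"\x00" * ((-len(padded) - 8) % BLOCK)
    return padded + (8 * len(message)).to_bytes(8, "big")


def md_hash(message: bytes) -> bytes:
    """Iterate the compression function over the padded message blocks."""
    state = IV
    padded = md_pad(message)
    for i in range(0, len(padded), BLOCK):
        state = davies_meyer(state, padded[i:i + BLOCK])
    return state
```

An 8-byte message already needs a second block, because the length field no longer fits next to the 0x80 marker: `len(md_pad(bytes(8)))` is 32.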

Example of cryptographic hash function: SHA-256

- Structure: Merkle-Damgård
- Compression function: Davies-Meyer
- Block cipher: SHACAL-2

[Figure: SHACAL-2 takes a 512-bit key and maps a 256-bit block to a 256-bit block.]
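The two sizes in the SHACAL-2 figure are visible from Python's `hashlib`: the 512-bit key is the message block consumed per compression call, and the 256-bit block is the chaining value and final digest. A small check, using the standard "abc" test input:

```python
import hashlib

h = hashlib.sha256()

# Bits of message consumed per compression call (the SHACAL-2 key input).
message_block_bits = h.block_size * 8

# Bits of chaining value / final digest (the SHACAL-2 block size).
digest_bits = h.digest_size * 8

# Digest of the three-byte message "abc", as a hex string.
digest_hex = hashlib.sha256(b"abc").hexdigest()

print(message_block_bits, digest_bits, digest_hex)
```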

Message Authentication Codes (MACs)

Goal: message integrity

[Figure: Alice and Bob share a key k. Alice generates a tag t for message m and sends (m, t); Bob verifies the tag. The first version of the slide labels the two steps t ← MDC(m) and V(m, t) = "yes"?; the second uses the keyed algorithms t ← S(k, m) and V(k, m, t) = "yes"?.]

A MAC is a pair of algorithms (S, V) defined over (K, M, T):
- $S : K \times M \to T$
- $V : K \times M \times T \to \{\top, \bot\}$
- Consistency: $V(k, m, S(k, m)) = \top$
and such that
- It is hard to compute a valid pair (m, S(k, m)) without knowing k
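The (S, V) interface can be sketched with any concrete MAC. Here HMAC-SHA256 from Python's standard `hmac` module is swapped in as S purely for illustration, and V recomputes the tag and compares; `keygen` is a helper added for the example.

```python
import hashlib
import hmac
import secrets


def keygen() -> bytes:
    """Pick a random key k from K (32 bytes here, an arbitrary choice)."""
    return secrets.token_bytes(32)


def S(k: bytes, m: bytes) -> bytes:
    """Signing algorithm S : K x M -> T (HMAC-SHA256 as a stand-in)."""
    return hmac.new(k, m, hashlib.sha256).digest()


def V(k: bytes, m: bytes, t: bytes) -> bool:
    """Verification algorithm V : K x M x T -> {True, False}."""
    return hmac.compare_digest(S(k, m), t)
```

Consistency holds by construction: `V(k, m, S(k, m))` is always `True`, while a changed message or a tag computed under a different key is rejected.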

File system protection

- At installation time, a tag is computed for each file F1, F2, ..., Fn:
  $t_1 = S(k, F_1)$, $t_2 = S(k, F_2)$, ..., $t_n = S(k, F_n)$
  k derived from user password
- To check for file tampering/alteration by a virus:
  - reboot to clean OS
  - supply password
  - any file modification will be detected

Block ciphers and message integrity

Let (E, D) be a block cipher. We build a MAC (S, V) using (E, D) as follows:
- S(k, m) = E(k, m)
- V(k, m, t) = if m = D(k, t) then return ⊤ else return ⊥

But: block ciphers can usually process only 128 or 256 bits

Our goal now: construct MACs for long messages
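The file system protection scheme can be sketched as follows. Several details are assumptions, since the slide does not say how k is derived or which MAC is used: PBKDF2 with a hypothetical salt and iteration count derives the key, HMAC-SHA256 plays the role of S, and files are modelled as an in-memory dictionary from name to contents.

```python
import hashlib
import hmac


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive k from the user password (PBKDF2 is this sketch's choice)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def tag_files(k: bytes, files: dict[str, bytes]) -> dict[str, bytes]:
    """Installation time: compute t_i = S(k, F_i) for every file."""
    return {name: hmac.new(k, data, hashlib.sha256).digest()
            for name, data in files.items()}


def modified_files(k: bytes, files: dict[str, bytes],
                   tags: dict[str, bytes]) -> list[str]:
    """Check time: list the files whose current tag differs from the stored one."""
    current = tag_files(k, files)
    return sorted(name for name in files
                  if not hmac.compare_digest(current[name], tags.get(name, b"")))
```

The check is only meaningful if it runs from a clean OS, as the slide says: the password and the derived key must never be available to the code being checked.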

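ECBC-MAC

[Figure: message blocks m0, m1, m2, m3 are processed in CBC fashion: each block is XORed with the previous output and encrypted with E(K1, ·); the final chaining value is encrypted once more with E(K2, ·) to give the tag t.]

- $E : K \times \{0,1\}^n \to \{0,1\}^n$ a block cipher
- ECBC-MAC $: K^2 \times \{0,1\}^* \to \{0,1\}^n$
→ the last encryption is crucial to avoid forgeries!! (details on the board)

Ex: 802.11i uses AES based ECBC-MAC

PMAC

[Figure: each message block $m_i$ is XORed with the mask $P(K_2, i)$; the masked blocks are processed in parallel with E(K1, ·), the results are XORed together, and a final E(K1, ·) gives the tag t.]

- $E : K \times \{0,1\}^n \to \{0,1\}^n$ a block cipher
- $P : K \times \mathbb{N} \to \{0,1\}^n$ any easy to compute function
- PMAC $: K^2 \times \{0,1\}^* \to \{0,1\}^n$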
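Why the last encryption in ECBC-MAC matters can be illustrated with a toy implementation. The sketch below makes two assumptions: E is replaced by a stand-in (truncated HMAC-SHA256, a keyed function rather than a real block cipher, which is enough because CBC-MAC only uses the encryption direction), and messages are a whole number of 16-byte blocks. Without the final E(K2, ·), a one-block pair (m, t) yields a valid tag for the two-block message m || (t ⊕ m).

```python
import hashlib
import hmac

N = 16  # block size in bytes


def E(k: bytes, x: bytes) -> bytes:
    """Stand-in for block cipher encryption (NOT a real block cipher)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:N]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def raw_cbc(k1: bytes, message: bytes) -> bytes:
    """CBC chain without the final encryption step."""
    if len(message) == 0 or len(message) % N != 0:
        raise ValueError("message must be a non-empty multiple of the block size")
    state = bytes(N)
    for i in range(0, len(message), N):
        state = E(k1, xor(state, message[i:i + N]))
    return state


def ecbc_mac(k1: bytes, k2: bytes, message: bytes) -> bytes:
    """ECBC-MAC: raw CBC under K1, then one more encryption under K2."""
    return E(k2, raw_cbc(k1, message))


def extend_raw_cbc(m: bytes, t: bytes) -> bytes:
    """Forgery against raw CBC: a two-block message with the same tag t."""
    return m + xor(t, m)
```

Against ECBC-MAC the attacker only sees E(K2, t), not the raw chaining value t, so the second block t ⊕ m cannot be formed.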

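HMAC

MAC built from cryptographic hash functions

$\mathrm{HMAC}(k, m) = H(k \oplus \mathrm{OP} \,\|\, H(k \oplus \mathrm{IP} \,\|\, m))$

IP, OP: publicly known padding constants

[Figure: inner chain: starting from the fixed IV, h processes k ⊕ IP and then the message blocks m0, m1, m2. Outer chain: starting from the fixed IV, h processes k ⊕ OP and then the inner result, giving the tag t.]

Ex: SSL, IPsec, SSH, ...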
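The HMAC formula can be written out directly and compared with Python's standard `hmac` module. The concrete choices come from the HMAC standard (RFC 2104) rather than from the slide: H is SHA-256, IP is the byte 0x36 and OP the byte 0x5C repeated over one 64-byte block, and the key is zero-padded to the block size (or hashed first if it is longer).

```python
import hashlib
import hmac

BLOCK_SIZE = 64                    # SHA-256 message block size in bytes
IPAD = bytes([0x36]) * BLOCK_SIZE  # inner padding constant IP
OPAD = bytes([0x5C]) * BLOCK_SIZE  # outer padding constant OP


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def hmac_sha256(k: bytes, m: bytes) -> bytes:
    """HMAC(k, m) = H(k XOR OP || H(k XOR IP || m)) with H = SHA-256."""
    if len(k) > BLOCK_SIZE:
        k = hashlib.sha256(k).digest()   # long keys are hashed first
    k = k.ljust(BLOCK_SIZE, b"\x00")     # then zero-padded to one block
    inner = hashlib.sha256(xor(k, IPAD) + m).digest()
    return hashlib.sha256(xor(k, OPAD) + inner).digest()
```

For any key and message, `hmac_sha256(k, m)` should agree with `hmac.new(k, m, hashlib.sha256).digest()`.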
