Cryptography 7.

Hash functions

Hash functions in data structures

A hash function is a compression function on arbitrary-length input: $H : \{0,1\}^* \to \{0,1\}^k$ for $k = 128, 160, 256$, etc.
Classical application: data structures
Storing a set of elements in a table indexed by the hash values
Achieving O(1) insertion and lookup time
The element x is stored in the table cell H(x)
Retrieve x by computing H(x) and checking the respective cell
Collision: $x \neq x'$ with $H(x) = H(x')$
A hash function is "good" if there are few collisions, i.e. it spreads the elements well

Hash functions in data structures vs. cryptography

Common goals: compressing data, few collisions
Collision resistance is only desired in data structures, but crucial in cryptography
In data structures x and H(.) are independent
In cryptography the adversary can choose x arbitrarily to cause a collision
Cryptographic hash functions are harder to construct...

Hash functions in cryptography

Definition
A collision in a function $H(\cdot)$ is a pair of inputs $x \neq x'$ such that $H(x) = H(x')$.
A function $H(\cdot)$ is collision resistant if any PPT adversary can find a collision only with negligible probability.
A function $H(\cdot)$ is a hash function if $H : \{0,1\}^* \to \{0,1\}^n$.

Notions of security, in decreasing order of strength:
1. Collision resistance: see above
2. Second pre-image resistance: given x, it is infeasible for a PPT adversary to find $x' \neq x$ with $H(x') = H(x)$
3. Pre-image resistance: given $y = H(x)$ for a random (and unknown) x, it is infeasible for a PPT adversary to find $x'$ with $H(x') = y$ (in other words, H is a one-way function)

Design principles
Collision resistance
Second pre-image resistance
Pre-image resistance
Avalanche effect: a small change in the input ⟹ a large change in the output
Strict avalanche criterion: if a single input bit is complemented ⟹ every output bit changes with probability 1/2
Bit independence criterion: $\forall i, j, k$: if a single input bit i is complemented ⟹ output bits j and k change independently

Attacks and weaknesses

Theorem (Birthday paradox)

Let $x_1, \dots, x_n \in_R \{1, \dots, d\}$ be uniform random values. Then

$$P(\exists i, j \in \{1, \dots, n\} : i \neq j,\ x_i = x_j) \approx 1 - e^{-\frac{n(n-1)}{2d}}$$

Birthday attack
For a hash function $H : \{0,1\}^* \to \{0,1\}^n$ a collision can be found with probability about 1/2 by computing roughly $2^{n/2}$ hash values.

Significantly faster than brute force ⟹ $n \geq 160$
A collision can be found faster than the birthday attack ⟹ the hash is "broken"
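The birthday bound can be checked experimentally. The following is a minimal sketch, assuming a toy hash obtained by truncating SHA-256 to a few bits; the names `birthday_probability`, `truncated_hash` and `find_collision` are hypothetical helpers introduced only for this illustration.

```python
import hashlib
import math


def birthday_probability(n: int, d: int) -> float:
    """Approximate collision probability for n uniform samples from d values."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * d))


def truncated_hash(data: bytes, bits: int) -> int:
    """Toy hash: the first `bits` bits of SHA-256 (an assumption for this demo)."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)


def find_collision(bits: int):
    """Hash the inputs 0, 1, 2, ... until two of them collide."""
    seen = {}  # hash value -> input that produced it
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        value = truncated_hash(msg, bits)
        if value in seen:
            return seen[value], msg, counter + 1
        seen[value] = msg
        counter += 1


if __name__ == "__main__":
    # Classic birthday setting: 23 people, 365 days.
    print(birthday_probability(23, 365))
    # For a 24-bit hash the number of tries is on the order of 2^12.
    x1, x2, tries = find_collision(24)
    print(x1.hex(), x2.hex(), tries)
```

The search stores every hash value seen so far, so it trades memory for time; this is the plain birthday attack, not one of the memory-efficient variants.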

Sophisticated collision attacks: birthday paradox +

Chosen-prefix attack

Given two prefixes $p_1 \neq p_2$, find $m_1, m_2$ such that $H(p_1 \| m_1) = H(p_2 \| m_2)$.

Specific to Merkle-Damgård
Real-world attacks against MD5-based implementations

Length-extension attack
Given the hash value $H(m)$ and the message length $|m|$, compute $H(m \| m')$ for some $m'$ chosen by the attacker.

Padding-based attack: $H(\text{data} \| \text{padding})$ ⟹ $H(\text{data} \| \text{padding} \| \text{OurData} \| \text{NewPadding})$
Merkle-Damgård is vulnerable: attacks on MD5, SHA-1, SHA-2
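The length-extension attack can be illustrated on a toy iterated hash. This is a sketch under stated assumptions: the compression function `h` is a truncated SHA-256 stand-in, the padding is "zeros plus a length block", and lengths are counted in bytes. None of these choices matches the real padding of MD5 or SHA-1; all function names are hypothetical.

```python
import hashlib

N = 16  # block and digest size in bytes (n = 128 bits), chosen for the demo


def h(z: bytes, block: bytes) -> bytes:
    """Toy compression function {0,1}^2n -> {0,1}^n (truncated SHA-256)."""
    return hashlib.sha256(z + block).digest()[:N]


def pad(m: bytes) -> bytes:
    """Zero-pad m to a multiple of N bytes and append a length block."""
    return m + b"\x00" * (-len(m) % N) + len(m).to_bytes(N, "big")


def md_hash(m: bytes) -> bytes:
    """Iterate h over the padded message, starting from the all-zero IV."""
    z = b"\x00" * N
    padded = pad(m)
    for i in range(0, len(padded), N):
        z = h(z, padded[i:i + N])
    return z


def extend(digest: bytes, length: int, suffix: bytes):
    """Forge H(m || padding || suffix) from H(m) and |m| only, without m."""
    glue = b"\x00" * (-length % N) + length.to_bytes(N, "big")  # padding of m
    total = length + len(glue) + len(suffix)  # length of the forged message
    tail = suffix + b"\x00" * (-len(suffix) % N) + total.to_bytes(N, "big")
    z = digest  # resume the iteration from the known chaining value
    for i in range(0, len(tail), N):
        z = h(z, tail[i:i + N])
    return glue + suffix, z


if __name__ == "__main__":
    secret_message = b"key=hidden;amount=10"
    appended, forged = extend(md_hash(secret_message), len(secret_message), b";amount=1000000")
    print(md_hash(secret_message + appended) == forged)
```

The attacker never sees `secret_message`, only its hash and length, which is why a tag of the form H(key | message) built from such a hash is forgeable.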

Rainbow tables
Find a preimage using a precomputed table of hash chains.

Application: password recovery
Storing the input-output pairs for hash-reduction chains
Searching for identical output values
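A hash-reduction chain and the lookup procedure can be sketched as follows. The setting is an assumption made for illustration: the "passwords" are 4-digit PINs, the hash is SHA-256, and the reduction functions simply add the step index before reducing modulo the size of the PIN space. All names (`reduce_to_pin`, `build_table`, `lookup`) are hypothetical.

```python
import hashlib

CHAIN_LENGTH = 50   # number of hash/reduce steps per chain
SPACE = 10_000      # toy password space: 4-digit PINs


def H(pin: str) -> bytes:
    return hashlib.sha256(pin.encode()).digest()


def reduce_to_pin(digest: bytes, step: int) -> str:
    """Step-dependent reduction function mapping a digest back to a PIN."""
    return f"{(int.from_bytes(digest, 'big') + step) % SPACE:04d}"


def build_table(starts):
    """Store only the end point of every chain, mapped to its start points."""
    table = {}
    for start in starts:
        pin = start
        for step in range(CHAIN_LENGTH):
            pin = reduce_to_pin(H(pin), step)
        table.setdefault(pin, []).append(start)
    return table


def lookup(table, target: bytes):
    """Try to find a PIN with H(PIN) == target; None if no chain covers it."""
    for pos in range(CHAIN_LENGTH - 1, -1, -1):
        # Assume the target sits in column `pos` and walk to the chain end.
        pin = reduce_to_pin(target, pos)
        for step in range(pos + 1, CHAIN_LENGTH):
            pin = reduce_to_pin(H(pin), step)
        for start in table.get(pin, []):
            # Matching end point: rebuild the chain to recover the preimage.
            candidate = start
            for step in range(CHAIN_LENGTH):
                if H(candidate) == target:
                    return candidate
                candidate = reduce_to_pin(H(candidate), step)
    return None


if __name__ == "__main__":
    table = build_table([f"{i:04d}" for i in range(0, SPACE, 25)])
    print(lookup(table, H("0025")))
```

Only the chain end points are stored, so the table is much smaller than a full list of input-output pairs; a matching end point may still be a false alarm, which is why the chain is rebuilt and checked against the target.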

(Figure: rainbow table with 3 reduction functions, created for Wikipedia by User:Dake)

Side-channel attacks
Any attack based on information obtained from the implementation of a given algorithm instead of weaknesses in the algorithm itself.

Timing information
Power consumption
Electromagnetic leaks
Sound
Statistical methods

Merkle-Damgård transform

Practical constructions handle fixed-length input only
Methodology to construct a full-fledged hash function
Let $h : \{0,1\}^{2n} \to \{0,1\}^n$ be a fixed-length hash function and $m \in \{0,1\}^*$ with $|m| = \ell < 2^n$. Then the following $H(\cdot)$ is a variable-length hash function:

1. Split m into blocks of length n, i.e. let $b := \lceil \ell / n \rceil$ and $m = (m_1 | m_2 | \dots | m_b)$ (the last block is padded with zeros if necessary)
2. Set $m_{b+1} := \ell \in \{0,1\}^n$, $z_0 := 0^n$
3. For $i = 1, \dots, b+1$ compute $z_i := h(z_{i-1} | m_i)$
4. $H(m) := z_{b+1}$

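The four steps of the transform can be sketched in a few lines. This is an illustration under assumptions that are not part of the construction: the fixed-length hash `h` is a truncated SHA-256 stand-in, n is 128 bits, and the length is counted in bytes rather than bits.

```python
import hashlib

N_BYTES = 16  # n = 128 bits


def h(block: bytes) -> bytes:
    """Stand-in fixed-length hash {0,1}^2n -> {0,1}^n (truncated SHA-256)."""
    assert len(block) == 2 * N_BYTES
    return hashlib.sha256(block).digest()[:N_BYTES]


def merkle_damgard(m: bytes) -> bytes:
    length = len(m)  # l = |m|, measured in bytes in this sketch
    # Step 1: split m into b blocks of n bits, zero-padding the last one.
    m = m + b"\x00" * (-length % N_BYTES)
    blocks = [m[i:i + N_BYTES] for i in range(0, len(m), N_BYTES)]
    # Step 2: append the length block m_{b+1} := l and set z_0 := 0^n.
    blocks.append(length.to_bytes(N_BYTES, "big"))
    z = b"\x00" * N_BYTES
    # Step 3: z_i := h(z_{i-1} | m_i) for i = 1, ..., b + 1.
    for block in blocks:
        z = h(z + block)
    # Step 4: H(m) := z_{b+1}.
    return z


if __name__ == "__main__":
    print(merkle_damgard(b"abc").hex())
```

The length block is what separates `b"abc"` from `b"abc\x00"`: both produce the same zero-padded first block, but they differ in the final block.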

Practice: it is enough to consider fixed-length constructions
Theory: the amount of compression is not important

Initialization vector (IV): $z_0$ can be chosen freely
Security: if $h(\cdot)$ is collision resistant then $H(\cdot)$ is collision resistant as well

MD5 - Description

512-to-128-bit compression function extended by Merkle-Damgård
Works on 32-bit words
m is divided into 512 (= 16·32)-bit blocks
Operates on a 128 (= 4·32)-bit state
The initial values of the state words A, B, C, D are fixed
4 rounds, 16 similar operations each
Four possible non-linear functions:
1. $F(B,C,D) = (B \wedge C) \vee (\neg B \wedge D)$
2. $G(B,C,D) = (B \wedge D) \vee (C \wedge \neg D)$
3. $H(B,C,D) = B \oplus C \oplus D$
4. $I(B,C,D) = C \oplus (B \vee \neg D)$
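The four non-linear functions and a single MD5 operation can be written on 32-bit words as below. This is a sketch of one operation only, not a complete MD5 implementation; the helper names `rotl` and `step` are hypothetical, and the message word, constant and rotation amount are passed in by the caller.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits


def F(b, c, d):
    return ((b & c) | (~b & d)) & MASK


def G(b, c, d):
    return ((b & d) | (c & ~d)) & MASK


def H(b, c, d):
    return (b ^ c ^ d) & MASK


def I(b, c, d):
    return (c ^ (b | ~d)) & MASK


def rotl(x, s):
    """Rotate the 32-bit word x left by s bits (0 < s < 32)."""
    return ((x << s) | (x >> (32 - s))) & MASK


def step(func, a, b, c, d, m_i, k_i, s):
    """One MD5 operation; returns the new state words (A, B, C, D)."""
    new_b = (b + rotl((a + func(b, c, d) + m_i + k_i) & MASK, s)) & MASK
    return d, new_b, b, c


if __name__ == "__main__":
    # F selects bitwise between C and D depending on B.
    print(hex(F(0xFFFFFFFF, 0x12345678, 0x9ABCDEF0)))
```

F and G act as bitwise selectors ("if B then C else D" and "if D then B else C"), which is easy to confirm by feeding them all-ones and all-zeros words.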

$M_i$ is a message block

$K_i$ is a constant and s is a rotation parameter; both vary for each operation

MD5 – Analysis

NOT collision resistant!
128-bit output ⟹ birthday attack is possible...
1992 - MD5 published
1993 - "pseudo-collision" in the compression function (IV based attack)
1996 - collision in the compression function
2004 - MD5CRK, a distributed effort using birthday attack
2004 - collision found within 1 hour (analytical attack)
2005 - practical collision of two X.509 certificates with different public keys and the same MD5 hash value
2010 - first published single-block collision

SHA-1

SHA – Secure Hash Algorithm
Designed by the U.S. NSA, published by U.S. NIST
Similar to MD5
Versions:
SHA-0 (1993)
  160-bit output, 32-bit words, 80 rounds
  Operations: ⊕, ⊞ (addition mod $2^{32}$), ∧, ∨, ≪
  Collision found
SHA-1 (1995)
  160-bit output, 32-bit words, 80 rounds
  Operations: ⊕, ⊞ (addition mod $2^{32}$), ∧, ∨, ≪
  More resistant, theoretical attack of complexity $2^{61}$ (2011)

$W_t$: expanded message word for round t
$K_t$: round constant for round t
(Figure: SHA-1, original diagram for Wikipedia created by User:Matt Crypto)

SHA-2

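SHA-2 (2001) = SHA-256/SHA-512
  256/512-bit output, 32/64-bit words, 64/80 rounds
  Operations: ⊕, ⊞, ∧, ∨, ≪, rot
  $\mathrm{Ch}(E,F,G) = (E \wedge F) \oplus (\neg E \wedge G)$
  $\mathrm{Ma}(A,B,C) = (A \wedge B) \oplus (A \wedge C) \oplus (B \wedge C)$
  $\Sigma_0(A) = (A \ggg 2) \oplus (A \ggg 13) \oplus (A \ggg 22)$
  $\Sigma_1(E) = (E \ggg 6) \oplus (E \ggg 11) \oplus (E \ggg 25)$
  (here $\ggg$ denotes right rotation; the rotation amounts shown are those of SHA-256)
  No collision found (yet)
SHA-3 (2014-)
  Different design
  An alternative to SHA-2
(Figure: SHA-2, original diagram for Wikipedia created by User:kockmeyer)

RIPEMD-160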

Published in 1996
160-bit hash value
Similar design principles as MD5
A bit faster than SHA-1
BUT designed in the open academic community!
Developed in the framework of the EU project RIPE (RACE Integrity Primitives Evaluation)
No collision found (yet)
Optional extensions: RIPEMD-256 and RIPEMD-320
  Longer hash values
  The same levels of security
A possible alternative to SHA-1

NIST hash function competition (2007 – 2012)

Development process similar to the AES competition
Oct. 2008: Submission deadline
Dec. 2008: 51 candidates for Round 1
Feb. 2009: NIST conference, submitters presented their algorithms
Jul. 2009: 14 candidates accepted to Round 2
Aug. 2010: CRYPTO 2010, the second-round candidates were discussed
Dec. 2010: Announcement of finalists
  Performance: small hardware requirement
  Security: possible crypto/design weaknesses
  Analysis: (lack of) cryptanalysis by the whole crypto community
  Diversity: different modes of operation and internal structures
Oct. 2012: Winner: Keccak
Aug. 2013: NIST announced changes in the proposed standard to achieve a better security/performance trade-off...
Aug. 2015: Keccak aka SHA-3 is the hashing standard

One finalist: Grøstl

The Grøstl hash-function

Knudsen et al. (TU of Denmark & TU Graz)
Modified Merkle-Damgård

$h_0 = iv$, $h_i = f(h_{i-1}, m_i)$
Compression function f based on permutations P, Q (see later)
$H(m) = \Omega(h_t)$
Output transformation $\Omega(x) = \mathrm{trunc}_n(P(x) \oplus x)$

$f(h, m) = P(h \oplus m) \oplus Q(m) \oplus h$
The designs of P and Q are inspired by AES
Small number of permutations ⟹ simple analysis
Well-known design principles
Provably secure if the permutations are ideal
  Finding a collision requires $\geq 2^{\ell/4}$ evaluations of P, Q
  Finding a preimage requires $\geq 2^{\ell/2}$ evaluations of P, Q
Indifferentiable from a random oracle
(Figure: the compression function f of Grøstl)

One finalist: Skein

Schneier et al.
Main components
1. Threefish
   A tweakable block cipher
   Tweak: an extra input that provides variability
   Large number of simple rounds instead of fewer complex rounds, plus tweak and subkeys
   Mix: ⊕, ⊞ (addition mod $2^{64}$), ⋘ (rotation)
   (Figure: four of the 72 rounds of Threefish-512)
2. Unique Block Iteration (UBI): a chaining mode using Threefish to build a compression function
   Example: 166-byte input with 3 calls of Threefish-512
   Tweak: length + first/last block + "type"
   (Figure: hashing a three-block message using UBI)
   Skein: multiple invocations of UBI
3. Optional Argument System
   For extensions and other modes
(Figure: Skein in normal hashing mode)

SHA-3/Keccak

(Figure: diagram of a sponge construction from http://sponge.noekeon.org/)

Winner of the NIST hash function competition (2012)
Created by Bertoni, Daemen, Peeters and Van Assche
Sponge construction, built from a fixed-length permutation f and a padding rule:

1. m is padded and split into r-bit blocks $p_i$

2. Absorbing: the $p_i$ are XORed into the hash state at a given rate r, interleaved with applications of f (f : 24 rounds of simple operations on a state consisting of a 5 × 5 array of 64-bit words)

3. Squeezing: the output blocks $z_i$ are read from the state similarly, at the same rate

GPU-resistant hash functions

Serial vs. parallel hashers
A possible solution: RandomHash

N rounds, the $H_i$ are well-known hash functions

In every round $H \in_R \{H_1, \dots, H_{18}\}$
Output is expanded for memory-hardness
(Figure: RandomHash design by Herman Schoenfeld)

Privacy vs. Integrity

Secure communication
  Alice wants to send a message to Bob
  Open communication channel
  Privacy
  Tool: encryption
Message integrity
  Alice wants to send a message to Bob
  Open communication channel
  Authenticity (caller ID, email address)
  Integrity
  Prevent any undetected tampering
  Adversarial tampering is not a crypto problem (physical countermeasures)
  Tool: ???

Encryption vs.

Encryption using stream ciphers

Let $c := E_k(m) = G(k) \oplus m$ be the ciphertext, where $G(\cdot)$ is a PRG
Flipping a bit in c ⟹ flipping the same bit in m
Example: flipping the 11th least significant bit of an amount changes it by $2^{10} = 1024$, i.e. a difference of about 1000 dollars
The scheme is still secure as an encryption scheme
Similar attack against the unconditionally secure one-time pad
Encryption using block ciphers
The same attack works for OFB and CTR modes
Slightly more sophisticated methods for ECB and CBC modes
Encryption itself does not provide integrity
c completely hides the contents of m
BUT the adversary can modify c in a meaningful way!
Every possible c corresponds to some m...
We need something new

Message Authentication Codes: Definition

The communicating parties have a common secret (a private key)
Send an authenticated message
Know whether the message was tampered with

Definition
A message authentication code is a triple (Gen, Mac, Vrfy) of algorithms for which the following holds:
Key generation: Gen outputs a secret key k on input of the security parameter $1^n$, with $|k| \geq n$

Tag generation: Mac outputs the MAC tag $t := \mathrm{Mac}_k(m)$ for every message $m \in \{0,1\}^*$

Verification: Vrfy outputs a bit $b := \mathrm{Vrfy}_k(m, t)$, with b = 1 if the MAC tag is valid and 0 otherwise.
Furthermore the scheme has to be correct: for every key k and every message m, $\mathrm{Vrfy}_k(m, \mathrm{Mac}_k(m)) = 1$.

Message Authentication Codes: Definition of Security

How to attack such a scheme? The adversary performs the following steps:
1. Asks Alice for the MAC tags of some messages (influence on the content of m)
2. Makes some computation based on the results
3. Outputs a forgery: a valid t for a new m (not asked previously)
If this attack is "hard", then the scheme is called secure
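The triple (Gen, Mac, Vrfy) and the three attack steps can be written as a small experiment. This is a minimal sketch, assuming HMAC-SHA256 from the Python standard library as the MAC; the two adversaries and the name `forgery_experiment` are hypothetical and only illustrate what counts as a forgery.

```python
import hashlib
import hmac
import secrets


def gen(n: int = 32) -> bytes:
    """Gen: sample a uniformly random n-byte key."""
    return secrets.token_bytes(n)


def mac(k: bytes, m: bytes) -> bytes:
    """Mac: here HMAC-SHA256, an assumption for the demo."""
    return hmac.new(k, m, hashlib.sha256).digest()


def vrfy(k: bytes, m: bytes, t: bytes) -> bool:
    """Vrfy: recompute the tag and compare in constant time."""
    return hmac.compare_digest(mac(k, m), t)


def forgery_experiment(adversary) -> bool:
    """Run the chosen-message game; True iff the adversary forges."""
    k = gen()
    queried = set()

    def oracle(m: bytes) -> bytes:
        queried.add(m)  # remember every message whose tag was handed out
        return mac(k, m)

    m, t = adversary(oracle)
    return m not in queried and vrfy(k, m, t)


def replay_adversary(oracle):
    m = b"pay 10 to Bob"
    return m, oracle(m)  # valid tag, but m was queried: not a forgery


def guessing_adversary(oracle):
    oracle(b"pay 10 to Bob")
    return b"pay 1000 to Eve", bytes(32)  # blind guess of a 256-bit tag


if __name__ == "__main__":
    print(forgery_experiment(replay_adversary), forgery_experiment(guessing_adversary))
```

The replay adversary shows why the game only counts new messages: resending a previously tagged message is outside what a MAC alone can prevent.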

Definition
A message authentication code is existentially unforgeable under an adaptive chosen-message attack (or secure, for short) if every PPT adversary, after asking for the tags $t'$ of several messages $m' \neq m$ of its choice, can generate a valid MAC tag t for a message m only with negligible probability.

Too strong a definition?
  The adversary can request the tag of any message
  Generating a valid tag for any message "breaks" the scheme
  Only meaningful messages are important in practice
  What does "meaningful" mean?
Replay attacks
  Not prevented by a MAC alone
  Other methods: sequence numbers or time stamps concatenated with m
  Drawbacks: storage or synchronization problems

MAC constructions for fixed length messages

Fixed-length MAC
Let $\mathrm{PRF} : \{0,1\}^n \to \{0,1\}^n$ be a pseudorandom function. Then the following is a fixed-length MAC:
Gen: $k \in_R \{0,1\}^n$
Mac: Given k and a message $m \in \{0,1\}^n$, the tag is $t := \mathrm{PRF}_k(m)$
Vrfy: Given k, a message $m \in \{0,1\}^n$ and a tag $t \in \{0,1\}^n$, the output is 1 iff $t = \mathrm{PRF}_k(m)$

If PRF is a pseudorandom function then this scheme is secure
Drawback: only for fixed-length messages

MAC constructions for variable length messages

We have a secure MAC (Gen', Mac', Vrfy') for fixed-length m
How to extend it to arbitrary-length m?
Some wrong (but increasingly better) ideas

Split m into b blocks $m_1, \dots, m_b$ and authenticate blockwise
1. Authenticate the sum of the blocks: $t := \mathrm{Mac}'_k(\oplus_i m_i)$
   Easy to forge: give a new message m' with $\oplus_i m'_i = \oplus_i m_i$
2. Authenticate each block separately: $t := (t_1, \dots, t_b)$ with $t_i := \mathrm{Mac}'_k(m_i)$
   Easy to forge: permute the blocks
3. Authenticate each block with a sequence number: $t := (t_1, \dots, t_b)$ with $t_i := \mathrm{Mac}'_k(i | m_i)$
   Easy to forge: drop blocks, or mix and match the blocks of different messages
Additional information is needed in every block to prevent
   Length-based attacks
   Combining the blocks of different messages

Variable-length MAC
Let (Gen', Mac', Vrfy') be a fixed-length MAC for messages of length n
Gen: The same as Gen'
Mac: Given k and a message $m \in \{0,1\}^*$ with $\ell := |m| < 2^{n/4}$, split m into b blocks $m_1, \dots, m_b$ with $|m_i| = n/4$ and choose $r \in_R \{0,1\}^{n/4}$. Compute the tags $t_i := \mathrm{Mac}'_k(r | \ell | i | m_i)$, then the tag is $t := (r, t_1, \dots, t_b)$
Vrfy: Given k, a message $m \in \{0,1\}^*$ and a tag $t = (r, t_1, \dots, t_{b'})$, split m into b blocks. The output is 1 iff $b' = b$ and $\mathrm{Vrfy}'_k(r | \ell | i | m_i, t_i) = 1$ for $i = 1, \dots, b$.
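The construction can be sketched with n = 128 bits, so that r, the length, the block index and each message block take 4 bytes each. The fixed-length MAC used here (`mac_fixed`, HMAC-SHA256 on 16-byte inputs) is an assumption for the demo, as are the byte-based length encoding and the treatment of the empty message as one zero block.

```python
import hashlib
import hmac
import secrets

Q = 4  # n/4 in bytes, for n = 128 bits


def mac_fixed(k: bytes, x: bytes) -> bytes:
    """Stand-in fixed-length MAC' for inputs of exactly n = 4*Q bytes."""
    assert len(x) == 4 * Q
    return hmac.new(k, x, hashlib.sha256).digest()


def split(m: bytes):
    """Split m into blocks of Q bytes, zero-padding the last block."""
    padded = m + b"\x00" * (-len(m) % Q)
    if not padded:
        padded = b"\x00" * Q  # the empty message still gets one block
    return [padded[i:i + Q] for i in range(0, len(padded), Q)]


def block_input(r: bytes, length: int, i: int, block: bytes) -> bytes:
    """Encode r | l | i | m_i as a single n-bit string."""
    return r + length.to_bytes(Q, "big") + i.to_bytes(Q, "big") + block


def mac(k: bytes, m: bytes):
    r = secrets.token_bytes(Q)  # fresh random identifier for this message
    tags = [mac_fixed(k, block_input(r, len(m), i, block))
            for i, block in enumerate(split(m), start=1)]
    return r, tags


def vrfy(k: bytes, m: bytes, tag) -> bool:
    r, tags = tag
    blocks = split(m)
    if len(r) != Q or len(tags) != len(blocks):
        return False
    return all(
        hmac.compare_digest(mac_fixed(k, block_input(r, len(m), i, block)), t)
        for i, (block, t) in enumerate(zip(blocks, tags), start=1)
    )


if __name__ == "__main__":
    key = secrets.token_bytes(16)
    r, tags = mac(key, b"AAAABBBBCCCC")
    print(vrfy(key, b"AAAABBBBCCCC", (r, tags)))
    print(vrfy(key, b"BBBBAAAACCCC", (r, [tags[1], tags[0], tags[2]])))
```

The index i stops reordering, the length stops truncation, and the random r stops mixing blocks from two different tagged messages.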

If (Gen', Mac', Vrfy') is a secure fixed-length MAC then (Gen, Mac, Vrfy) is a secure variable-length MAC

MAC from hash functions: Nested MAC

NMAC
Let $h : \{0,1\}^{2n} \to \{0,1\}^n$ be a compression function and let $H : \{0,1\}^* \to \{0,1\}^n$ be a hash function constructed by the Merkle-Damgård transform
Gen: $k_1, k_2 \in_R \{0,1\}^n$
Mac: Given $k_1, k_2$ and a message $m \in \{0,1\}^*$, the tag is $t := h(k_1 | H_{k_2}(m))$
Vrfy: Given $k_1, k_2$, a message $m \in \{0,1\}^*$ and a tag $t \in \{0,1\}^n$, the output is 1 iff $t = \mathrm{Mac}_{k_1,k_2}(m)$

$H_{IV}(\cdot)$ denotes the keyed Merkle-Damgård hash with initialization vector $z_0 := IV \in \{0,1\}^n$
The tag is the compression of a key and the output of a keyed Merkle-Damgård hash
If $h(\cdot)$ is collision resistant and yields a secure MAC then NMAC is secure

MAC from hash functions: HMAC

HMAC
Let $h : \{0,1\}^{2n} \to \{0,1\}^n$ be a compression function, let $H : \{0,1\}^* \to \{0,1\}^n$ be a hash function constructed by the Merkle-Damgård transform and let $IV, ipad, opad \in \{0,1\}^n$ be fixed.
Gen: $k \in_R \{0,1\}^n$
Mac: Given k and a message $m \in \{0,1\}^*$, the tag is $t := h\big(h(IV | k \oplus opad) \,|\, H_{IV}(k \oplus ipad | m)\big)$
Vrfy: Given k, a message $m \in \{0,1\}^*$ and a tag $t \in \{0,1\}^n$, the output is 1 iff $t = \mathrm{Mac}_k(m)$

Improvement of NMAC: uses a fixed IV and a single secret key only

In fact it is a special case of NMAC: $k_1 := h(IV | k \oplus opad)$, $k_2 := h(IV | k \oplus ipad)$

HMAC-X
Let $H_X : \{0,1\}^* \to \{0,1\}^n$ be an arbitrary hash function and let $ipad, opad \in \{0,1\}^n$ be fixed.
Gen: $k \in_R \{0,1\}^n$
Mac: Given k and a message $m \in \{0,1\}^*$, the tag is $t := H_X\big((k \oplus opad) \,|\, H_X((k \oplus ipad) | m)\big)$
Vrfy: Given k, a message $m \in \{0,1\}^*$ and a tag $t \in \{0,1\}^n$, the output is 1 iff $t = \mathrm{Mac}_k(m)$
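The HMAC-X formula can be checked against Python's `hmac` module for $H_X$ = SHA-256. The sketch follows the standardized variant (RFC 2104), where the key is padded to the 64-byte input block of the hash and ipad/opad are the bytes 0x36 and 0x5C repeated; these details are assumptions beyond the definition above, and `hmac_sha256` is a hypothetical name.

```python
import hashlib
import hmac

BLOCK = 64  # input block size of SHA-256 in bytes


def hmac_sha256(k: bytes, m: bytes) -> bytes:
    if len(k) > BLOCK:
        k = hashlib.sha256(k).digest()  # long keys are hashed first
    k = k.ljust(BLOCK, b"\x00")  # then zero-padded to a full block
    k_ipad = bytes(b ^ 0x36 for b in k)
    k_opad = bytes(b ^ 0x5C for b in k)
    inner = hashlib.sha256(k_ipad + m).digest()  # H_X((k xor ipad) | m)
    return hashlib.sha256(k_opad + inner).digest()  # H_X((k xor opad) | inner)


if __name__ == "__main__":
    ours = hmac_sha256(b"key", b"message")
    reference = hmac.new(b"key", b"message", hashlib.sha256).digest()
    print(ours == reference)
```

In application code the library function is the better choice; the hand-written version only serves to make the nesting of the two hash calls visible.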

Avoids several weaknesses of $H_X$
Example: HMAC-SHA1
Immune to length-extension attack

MAC from block ciphers: CBC-MAC

Fixed length CBC-MAC

Let $E_k : \{0,1\}^n \to \{0,1\}^n$ be a block cipher and let x be a fixed length (number of blocks).
Gen: $k \in_R \{0,1\}^n$
Mac: Given k and a message $m \in \{0,1\}^{x \cdot n}$, first split m into blocks of length n, i.e. $m = (m_1 | m_2 | \dots | m_x)$, and compute $t_i := E_k(t_{i-1} \oplus m_i)$ for i = 1 to x with $t_0 := 0^n$. The tag is $t := t_x$
Vrfy: Given k, a message $m \in \{0,1\}^{x \cdot n}$ and a tag $t \in \{0,1\}^n$, the output is 1 iff $t = \mathrm{Mac}_k(m)$

If E is a PRF then this is a secure fixed-length MAC
Generalizations to variable-length input:

1. Use the key $k_x := E_k(x)$ in the block cipher
2. Prepend m with its length (add one more round with $m_0 := |m|$)
3. Use two keys $k_1, k_2 \in \{0,1\}^n$: first compute the CBC-MAC t with $k_1$, then the tag is $t' := E_{k_2}(t)$
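Basic CBC-MAC and the three generalizations can be sketched as follows, together with the classic forgery that shows why the basic version must not be used for messages of varying length. The standard library has no block cipher, so `E` is a stand-in PRF (HMAC-SHA256 truncated to 16 bytes); this is an assumption, not a real block cipher, and all function names are hypothetical.

```python
import hashlib
import hmac

N = 16  # block size in bytes


def E(k: bytes, block: bytes) -> bytes:
    """Stand-in PRF on n-bit blocks (truncated HMAC-SHA256, not a real cipher)."""
    assert len(block) == N
    return hmac.new(k, block, hashlib.sha256).digest()[:N]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_mac(k: bytes, m: bytes) -> bytes:
    """Basic CBC-MAC: t_i := E_k(t_{i-1} xor m_i) with t_0 := 0^n."""
    assert len(m) > 0 and len(m) % N == 0
    t = bytes(N)
    for i in range(0, len(m), N):
        t = E(k, xor(t, m[i:i + N]))
    return t


def cbc_mac_length_key(k: bytes, m: bytes) -> bytes:
    """Variant 1: derive the key k_x := E_k(x) from the number of blocks x."""
    return cbc_mac(E(k, (len(m) // N).to_bytes(N, "big")), m)


def cbc_mac_prepend_length(k: bytes, m: bytes) -> bytes:
    """Variant 2: prepend the length as an extra block m_0 := |m|."""
    return cbc_mac(k, len(m).to_bytes(N, "big") + m)


def cbc_mac_two_keys(k1: bytes, k2: bytes, m: bytes) -> bytes:
    """Variant 3: process the basic CBC-MAC tag once more under a second key."""
    return E(k2, cbc_mac(k1, m))


if __name__ == "__main__":
    key = b"0123456789abcdef"
    m1 = b"pay 10 to Bob..."  # exactly one block
    t = cbc_mac(key, m1)
    forged = m1 + xor(m1, t)  # two-block message built from (m1, t) alone
    print(cbc_mac(key, forged) == t)  # the old tag is valid for the new message
    t2 = cbc_mac_prepend_length(key, m1)
    print(cbc_mac_prepend_length(key, m1 + xor(m1, t2)) == t2)
```

The forgery works because the second block cancels the chaining value: $E_k(t \oplus m_1 \oplus t) = E_k(m_1) = t$. Binding the length into the computation breaks this cancellation.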