Hash Functions
CSC 482/582: Computer Security

Topics

1. Hash Functions
2. Applications of Hash Functions
3. Secure Hash Functions
4. Collision Attacks
5. Pre-Image Attacks
6. Current Hash Functions
7. HMAC: Keyed Hash Functions

Hash Functions

Hash function h: M -> MD
  Input M: variable-length message
  Output MD: fixed-length “message digest” of the input
Many inputs produce the same output (called a hash collision)
  Limited number of outputs; infinite number of inputs
Avalanche effect: small input change -> big output change
Example hash function
  Sum the 32-bit words of the message mod $2^{32}$

[Figure: M -> h -> MD = h(M)]

Applications of Hash Functions

Verifying file integrity
  How do you know that a file you downloaded was not corrupted during download?
Storing passwords (confidentiality)
  To avoid compromise of all passwords by an attacker who has gained admin access, store hashes of the passwords rather than the passwords themselves.
  Additional features (such as salting) are needed for secure password storage.
Digital signatures (authentication)
  Cryptographic verification that data was downloaded from the intended source and not modified.
  Used for operating system patches and packages.

Why attack hash functions?

Create a forged security certificate to
  Make a phishing site appear legitimate.
  Bypass code signing checks on updates.
Distribute malware
  Replace a legitimate app with a malware app.
  Ensure both apps have the legitimate hash value, so victims cannot distinguish between them.
Forge digital signatures
  Replace a contract where the victim pays $50 to the attacker with one where the victim pays $5,000.

Flame Malware

Cyber espionage tool discovered in 2012
  Records audio, screenshots, Bluetooth, and file data.
  Exfiltrates data via an SSL-encrypted channel.
Bypassed code signing security in MS Windows
  Used a hash collision to create a certificate apparently signed by the Microsoft Certificate Authority.
  Malware was digitally signed with the forged certificate.
  Code signing accepted the malware as valid because the certificate was apparently signed by the MS CA.
Attack could be used as a MITM attack on MS Update
  Attacker substitutes malware for a Windows patch.

Avalanche Effect

The avalanche effect is shown when a small change to the input of a block cipher or hash function makes a large change in the output.

Hashing “Cryptography”:
  MD5 (128-bit) = 64ef07ce3e4b420c334227eecb3b3f4c
  SHA1 (160-bit) = b804ec5a0d83d19d8db908572f51196505d09f98
Hashing “Cryptography1”:
  MD5 (128-bit) = 443d4fb1fedeb86b69582169c2719c24
  SHA1 (160-bit) = 838498e48147106062a64c523ddfe11bd07a5eac

Secure Hash Function

A function h = hash(m) must have 3 properties to be secure:

1. Pre-image resistance: Given a hash h, it should be difficult to find any message m such that h = hash(m). Functions that lack this property are vulnerable to pre-image attacks.
2. Second pre-image resistance: Given an input m1, it should be difficult to find another input m2 such that m1 ≠ m2 and hash(m1) = hash(m2). Functions that lack this property are vulnerable to second pre-image attacks.
3. Collision resistance: It should be difficult to find two different messages m1 and m2 such that hash(m1) = hash(m2). Such a pair is called a cryptographic hash collision. This property is sometimes referred to as strong collision resistance. It requires a hash value at least twice as long as that required for pre-image resistance; otherwise collisions may be found by a birthday attack.

Pre-image Attacks

A pre-image attack attempts to find a message m that has a specific hash value h, such that h = hash(m).
  Would allow an attacker to substitute a malicious document matching the hash of a valid document, allowing forgery of SSL certificates or digitally signed contracts.
Brute force attack is possible with $2^n$ operations, where n is the length of the hash value in bits.
  For n >= 64, brute force is considered infeasible.
A one-way function is pre-image resistant.
No practical pre-image attacks exist against widely used hash functions.
  The best known pre-image attack on MD5 requires about $2^{123.4}$ operations.

Collision Attacks

A collision attack attempts to find two different messages m1 and m2 such that hash(m1) = hash(m2).
Collisions must exist because there are more inputs than fixed-size outputs for hash functions.
  Pigeonhole principle: if there are n containers for n+1 objects, then at least 1 container will have at least 2 objects in it.
Two types of collision attacks exist
  Birthday Attack
  Chosen Prefix Attack
Collision attacks do not impact password hashing, but do allow for forged certificates and signatures.

The Birthday Paradox

The birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday.
  By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367.
  However, 99% probability is reached with just 57 people, and 50% probability with 23 people.
The birthday paradox is a violation of our intuition, not a true paradox.
  It arises because the chance of shared birthdays increases with the number of unique pairs of people, which is n(n-1)/2 for n people.

Birthday Attack

A birthday attack exploits the mathematics behind the birthday problem to find hash collisions.
Suppose a hash function h has a b-bit output.
  Therefore there are $2^b$ possible hash values.
Attacker generates many random messages
  Computes the hash of each one.
  Searches for pairs of messages with the same hash value.
By mathematics similar to that of the birthday problem, the attacker can find a collision with about $2^{b/2}$ messages.
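
The birthday attack described above can be tried on a deliberately weakened hash. The following is a minimal sketch, not a real-world attack: it assumes a toy hash made by truncating SHA-256 to b bits (the names `truncated_hash` and `birthday_attack` are hypothetical), and it uses a simple counter instead of random messages so that runs are repeatable.

```python
import hashlib
import itertools


def truncated_hash(message: bytes, bits: int) -> int:
    """Toy b-bit hash (an assumption for illustration): top `bits` bits of SHA-256."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def birthday_attack(bits: int):
    """Hash distinct messages until two of them share a truncated hash value.

    Returns (m1, m2, attempts), where attempts is the number of messages hashed.
    """
    seen = {}  # truncated hash value -> first message that produced it
    for counter in itertools.count():
        # Counter-based messages stand in for the "random messages" in the text.
        message = str(counter).encode()
        h = truncated_hash(message, bits)
        if h in seen:
            return seen[h], message, counter + 1
        seen[h] = message


if __name__ == "__main__":
    b = 24
    m1, m2, attempts = birthday_attack(b)
    print(f"b = {b}: {m1!r} and {m2!r} collide after {attempts} messages")
    print(f"birthday estimate 2^(b/2) = {2 ** (b // 2)}")
```

For small b the number of messages hashed should land in the neighborhood of $2^{b/2}$, far below the $2^b$ possible hash values. The dictionary of previously seen values is what makes each new message a cheap lookup; it trades memory for time.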

Birthday Attack Analysis

The birthday attack procedure follows these steps:

1. Randomly generate a sequence of plaintexts $X_1, X_2, X_3, \ldots$
2. For each $X_i$, compute $y_i = h(X_i)$ and test whether $y_i = y_j$ for some $j < i$
3. Stop as soon as a collision has been found

If there are m possible hash values, the probability that the i-th plaintext does not collide with any of the previous i - 1 plaintexts is

  $1 - (i - 1)/m$

The probability $F_k$ that the attack fails (no collisions) after k plaintexts is

  $F_k = (1 - 1/m)(1 - 2/m)(1 - 3/m) \cdots (1 - (k - 1)/m)$

Using the standard approximation $1 - x \approx e^{-x}$,

  $F_k \approx e^{-(1/m + 2/m + 3/m + \cdots + (k-1)/m)} = e^{-k(k-1)/2m}$

The attack succeeds/fails with probability ½ when $F_k = 1/2$, that is,

  $e^{-k(k-1)/2m} = 1/2$, so $k(k-1) = 2m \ln 2$ and $k \approx 1.17\, m^{1/2}$

We conclude that a hash function with b-bit values (so $m = 2^b$) provides ~b/2 bits of security.

Chosen Prefix Attacks

A chosen prefix attack is a hash collision attack that starts with two different prefixes p1 and p2 and attempts to find two suffixes m1 and m2 such that hash(p1 ∥ m1) = hash(p2 ∥ m2).
Such an attack allows custom creation of two completely different documents with identical hashes.
Example attack
  Attacker creates two SSL certificate files for two different domains but with identical hashes.
  Attacker asks a CA to sign the certificate for one domain.
  Attacker uses the certificate to create a phishing site for the other domain.
  The user's browser successfully validates the SSL certificate signature and tells the user that the phishing site is the real site.

Merkle–Damgård construction

Select a fixed-input-size cryptographic hash function (compression function) f(m, d).
Apply it repeatedly to the fixed-size blocks $m_i$ of the message.
Use the output of the previous stage, $d_i$, as the second input.
Start with the initialization vector $d_0$ = IV.

Message-Digest Algorithm 5 (MD5)

Developed by Ron Rivest in 1991
Uses 128-bit hash values
Merkle–Damgård construction
Still widely used in legacy applications even though collision vulnerabilities allow forgery of digital signatures and SSL certificates.

MD5 Collision Attack History

1. Initial attacks (2004) could only find collisions in files that differed only in the last few bytes.
2. Early attacks (2008) used a cluster of 200 PS3s for a couple of days.
3. Current attacks can find a collision in seconds on a single PC.

Lesson: Cryptanalytic attacks always improve. Change algorithms before they do.

Secure Hash Algorithm (SHA-1)

Developed by NSA; approved as a federal standard by NIST
  SHA-0 (1993) and SHA-1 (1995)
  160-bit hash values
  Merkle–Damgård construction
SHA-1 was developed to correct the insecurity of SHA-0
SHA-1 is still found in legacy applications
Vulnerabilities less severe than those of MD5
  Can find a SHA-1 collision in $2^{69}$ operations.
  Can find a SHA-0 collision in $2^{39}$ operations.

SHA-2

Developed by NSA; approved as a federal standard by NIST
  SHA-2 (2001)
  224, 256, 384, or 512-bit hash values
  Merkle–Damgård construction
Current recommended hash function for security applications like digital signatures or SSL certificates.
Cryptanalysts are making progress, but there are no breaks
  Collisions can only be found if the hash algorithm is modified by reducing the number of rounds from 80 to 46 (SHA-512) or from 64 to 41 (SHA-256).

SHA-3

Winner of an open NIST competition (2007-2012)
  Final standard expected by 2014 Q2.
  SHA-3 (2012)
  224, 256, 384, or 512-bit hash values
  Keccak was the winning algorithm out of a field of 64.
An alternative to SHA-2
  Not a replacement, as SHA-2 is not broken.
  Built on a sponge function instead of the Merkle–Damgård construction used by MD5, SHA-1, and SHA-2, so that the same cryptanalytic techniques will not work against SHA-3.

HMAC

A keyed-hash message authentication code (HMAC) uses a hash function to calculate a message authentication code (MAC) from a message in combination with a secret cryptographic key.
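
As an illustration of the definition above, here is a minimal sketch using the `hmac` and `hashlib` modules from Python's standard library. The key, the messages, and the helper names `compute_tag` and `verify_tag` are hypothetical values chosen for this example, and HMAC-SHA-256 is just one possible choice of underlying hash function.

```python
import hashlib
import hmac


def compute_tag(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA-256 tag of `message` under `key` as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = compute_tag(key, message)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"shared-secret-key"  # hypothetical key known only to sender and receiver
    message = b"Victim pays $50 to attacker"
    tag = compute_tag(key, message)

    print(verify_tag(key, message, tag))                           # genuine message
    print(verify_tag(key, b"Victim pays $5,000 to attacker", tag))  # altered message
```

Unlike a plain hash, the tag cannot be recomputed by someone who alters the message but does not know the key. `hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters of a guessed tag are correct.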