Foundations of Cryptography Cryptographic Ideas Are Central to Security Technology

Overview Foundations of Cryptography Cryptographic ideas are central to security technology.

Tuesday April 19 and Thursday April 21, 2016 Most practical security systems involve many of the following technologies: Reading: S&M Secs. 7.1 – 7.4, 7.6; Chs. 8-10 o Symmetric-key cryptography, in which parties use a shared Schneier Applied Cryptography, Chs. 1, 2-4, 7-12, 18.14 secret key to encrypt/decrypt messages to communicate Ferguson, Schneier, & Kohno, Cryptography Engineering, Chs. 2,5,6 confidentially Kaufman, Perlman, & Speciner, Ch. 9 (esp. 9.7.2-9.7.); 11; 13-14, 15 o Hashing for message authentication and digital signatures. Anderson, Security Engineering, Ch. 3 o Public-key cryptography, in which asymmetric public/private key pairs address the key exchange problem and support CS342 Computer Security encryption and digital signatures o Public key infrastructure (PKI), including certificate Department of Computer Science authorities, that help public key crypto work in practice. Wellesley College These technologies are embedded in protocols of messages between two parties that implement secure communication.

22-2

Cast of Characters Eves Lament

Eve

M = one if by land two if by sea

Mallory Alice Bob

Alice wants to send a message M to Bob.

Eve may be eavesdropping on the communication (passive attacker)

Mallory may be able to change the message (active attacker) http://xkcd.com/177/ 22-3 22-4 Goals of Cryptography Terminology Confidentiality: No one can read M except encryption/ Cryptography = scrambling messages: converting messages into Alice and Bob (e.g., whispering, locked box); decryption "gibberish" that can be converted back to message.

Authentication: Bob knows M comes from Alice, Cryptanalysis = breaking secret codes. not someone masquerading as Alice (e.g., wax seal, watermark paper, signature); Cryptology = cryptography + cryptanalysis (in practice cryptography is often used for cryptology). Integrity: Bob can verify that M has not be message Steganography = hiding messages, ``security through obscurity''. altered after it was sent by Alice authentication, E.g., under wax, under hair, invisible ink, in lower order bits (e.g., tamperproof packaging); digital of images, in whitespace ( signatures http://compsoc.dur.ac.uk/whitespace/tutorial.html) Nonrepudiation: Alice should not be able to falsely deny sending M.

22-5 Note: Can combine steganography and cryptography. 22-6

The Basic Scenario Kerkhoffss Principle

o Separate cryptography into public algorithms Alices key Bobs key and private keys (Dutch linguist Auguste Kerckhoffs, 1883) KA KB o Private algorithm is an example of security through obscurity, which is usually not secure, M C = E(KA,M) M = D(K , C) Encrypt Decrypt B at least once algorithm is known plaintext ciphertext plaintext (e.g., Navajo code talkers in WW2). (text,image, voice, …) (Eve can see Mallory can see/alter) o Proprietary algorithms often contain holes; public algorithms are analyzed by lots of smart people to find potential problems. E(key, msg) is the encryption (enciphering) function o People (rightly) suspicious of private algorithms, so hard to D(key, msg) is the decryption (deciphering) function adopt on widespread basis.

Need D(KB, E(KA , M)) = M for all M.

22-7 22-8 Symmetric vs. Public-Key Cryptography Symmetric Cryptography Algorithms

Symmetric (a.k.a. shared-key, secret-key, private-key): For simplicity, often assume messages & keys use 26-letter alphabet. (in reality, typically use 8-bit = 256-character bytes or raw bits). o Alice and Bob share the same key (or they have different keys that can be easily calculated from one another). In each of the following examples: What is the key? How many keys are there? (Modern crypto relies on computational intractability.) o Communicating keys is a big problem. (How to do e-commerce?) o Shift cipher o We begin with this o Vigenere cipher Asymmetric (public-key): o One-Time pad o Bob can publish a public key that anyone (including Alice) can use o Substitution cipher to encrypt a message, but only Bob has a private key that can decrypt the message. The private key cannot easily be determined o Transposition cipher from the public key. o Rotor Machines o Alice can contact Bob without exchanging keys with him! But o Block Cipher public-key management is still a problem. o Stream Cipher o Later in this lecture 22-9 22-10

Shift Cipher Shift Cipher: JavaScript Malware, 2010

shift message o Idea: shift the letter in each In a spam .html message sent to Takis: plaintext position by the same amount 0 ibm o Caesar cipher: shift by 3 1 jcn 2 kdo o Rot13: shift by 13 (doing twice is id) … … o How easy is it to break an encrypted message? 25 hal

Shift cipher of 1 for which redirects to what appears to be a drive-by download site.

Cryptographic Tools 22-11 22-12 Vigenere Cipher Frequency Analysis

Idea: shift the letter in each plaintext position English letter frequencies: as determined by repeating key

Plaintext: oneifbylandtwoifbysea Key: cabcabcabcabcabcabcab Ciphertext: qnfkfcalbpduyojhbzueb o The same plaintext at different positions may be encrypted to different ciphertext (e.g. ifby). Most common digrams (in order): th, he, in, en, nt, re, er, an, ti, es, on, at, se, nd, o How secure is Vigenere? Check out or, ar, al, te, co, de, to, ra, et, ed, it, sa, em, ro. http://cs.wellesley.edu/~fturbak/codman-archive/codman-nov-02-2007/ The most common trigrams (in order): the, and, tha, ent, ing, ion, tio, for, nde, has, nce, edt, tis, oft, sth, men

See http://pages.central.edu/emp/LintonT/classes/spring01/cryptography/letterfreq.html 22-13 Cryptographic Tools 22-14

XOR Operation (⊕) On Bits One-Time Pad Properties ⊕ 0 1 • Associativity: ( x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) View message as n bits and have n-bit random pad as key. 0 0 1 • Commutativity: x ⊕ y = y ⊕ x Effectively Vigenere with key as long as message. 1 1 0 • Identity: x ⊕ 0 = x = 0 ⊕ x • Invertability: x ⊕ x = 0 Consequence: (p ⊕ k) ⊕ k = p plaintext p1 p2 p3 p4 … plaintext o n e i f message as bits ascii 111 110 101 105 102 key k1 ⊕ k2 ⊕ k3 ⊕ k4 ⊕ bits 0110 1111 0110 1110 0110 0101 0110 1001 0110 0110 c c key 1011 0101 0101 1010 1110 1111 0100 0000 0110 1101 1 c2 c3 4 … ciphertext 1101 1010 0011 0100 1000 1010 0010 1001 0000 1011 k k k k key 1011 0101 0101 1010 1110 1111 0100 0000 0110 1101 key 1 ⊕ 2 ⊕ 3 ⊕ 4 ⊕ bits 0110 1111 0110 1110 0110 0101 0110 1001 0110 0110 p p p p ascii 111 110 101 105 102 1 2 3 4 … plaintext o n e i f

22-15 22-16 Evaluating One-Time Pad One-Time Pads in the News

o If used properly, one-time pad is perfectly secure. Ciphertext C for message M can decrypt to any message M with some key.

o New key must be chosen for every message:

• Reusing keys breaks security by opening messages to analysis (e.g., Russian Venona ciphers).

• How to communicate such long keys securely? (Why not send messages the same way?)

o If pad isnt really random but psuedo-random generated by seed, this is a stream cipher .

http://www.nytimes.com/2012/11/02/world/europe/world-war-ii-pigeons-message-a-mystery.html 22-17 22-18

Stream Cipher Evaluating Stream Ciphers Basic architecture is like one-time pad, except XORing is done with key stream generated by key. E.g., key is seed to A simple way to encrypt long/infinite streams of data psuedo-random number generator (PRNG). Example: RC4. (e.g., a network link).

plaintext message as bits Security of stream cipher depends entirely on details of keystream generator. p p 1 p2 p3 4 … keystream key ks ks ks ks generator 1 ⊕ 2 ⊕ 3 ⊕ 4 ⊕

c1 c2 c3 c4 keystream key ks1 ks2 ks3 ks4 generator ⊕ ⊕ ⊕ ⊕

p1 p2 p3 p4

22-19 22-20 Substitution Cipher Breaking Substitution Ciphers 26 Idea: replace plaintext letters according to alphabet permutation. There are 26! = 4x10 keys. So substitution ciphers must be very secure, right? ! A key is one such permutation. E.g.:

plaintext: a b c d e f g h i j k l m n o p q r s t u v w x y z ciphertext: j d u s c l n f r h a g z w k y i p t m x b v q o e

Encrypt a message via this permutation. E.g.:

plaintext: o n e i f b y l a n d t w o i f b y s e a ciphertext: k w c r l d o g j w s m v k r l d o t c j

o Same plaintext at different positions encrypts to same ciphertext. " Oops! How are o Can implement by rotors (e.g. WW2 Enigma) substitution ciphers broken? o How many keys are there? 22-21 22-22

Transposition Cipher Block Cipher

Idea: shuffle chunks of the plaintext message Fixed-length plaintext block (e.g. 64 bits)

A key is one such shuffling. E.g.: … How many different blocks with n bits? key o n e i f b y l a n d t w o i f b y s e a x y z E In theory, how many … different encryption Fixed-length ciphertext block functions? n b i o f l e y n o t a w f d i y x e b a z s y In practice far fewer, depending on key size. o Same plaintext at different positions can be shuffled differently. key D How many encryption o May need to pad end of message. Pads must be chosen carefully! functions for key size k?

o How many keys are there for shuffle blocks of length n? … Examples: DES, AES. Fixed-length plaintext block Skipjack, IDEA, o How can transposition ciphers be broken? Blowfish, RC5

22-23 22-24 Block Cipher: Electronic Codebook Mode (ECB) Evaluating ECB Advantages: plaintext message as blocks (may need padding at end) • Easy to parallelize • Can modify part of encrypted file without re-encrypting whole file. P1 P2 P3 P4 • Ciphertext bit errors in one block don t propagate to other blocks. … Disadvantages: k k k k E E E E • Can make codebook of any decrypted blocks. C C C 1 2 3 C4 … • Can perform statistical attacks on blocks. Stereotyped beginnings and endings of messages especially problematic.

k k k k D D D D • Adding or losing a ciphertext bit (synchronization error) throws everything off if no frames.

P1 P2 P3 P4 … • Block replay: Mallory can replace, remove, repeat, interchange blocks (e.g. modify bank transfers to other

22-25 accounts to move money to his account). 22-26

Block Cipher: Cipher-Block Chaining Mode (CBC) Evaluating CBC Why Initialization Vectors? Initialization Vector (IV) P P P P Same message (or message prefix) won t yield same ciphertext 1 2 3 4 for different IVs (like password hashing salt). IVs chosen randomly.

⊕ ⊕ ⊕ ⊕ … Advantages of CBC:

k E k E k E k E Resistant to codebooks, statistical attacks, and block replay.

Disadvantages of CBC:

C1 C2 C3 C4 … • Inherently sequential.

k D k D k D k D • Error problems: o Changed bit in ciphertext block garbles current plaintext block and changes one bit of subsequent block . ⊕ ⊕ ⊕ ⊕ … o Adding/removing single bit in ciphertext block garbles rest of P1 P2 P3 P4 plaintext message.

22-27 22 22-28

Data Encryption Standard (DES) Encryption/Decryption with openssl Command o In 1972, National Bureau of Standards issued request for standard [cs342@puma] cat secret.txt crypto algorithm. One if by land, two if by sea. o Data Encryption Standard (DES) was adopted as federal standard [cs342@puma] openssl enc -des -nosalt -pass pass:foobar -in secret.txt -out in 1977 and ANSI standard in 1981. secret.enc o Grew out of IBM Lucifer system and evaluated by the National [cs342@puma] cat secret.enc Security Agency (NSA). People worried that NSA reduced key size S\3725\337*|^T\341\345^^\366\316\207\370\261\351d\371^D^A# from 128 to 56 bits and may have introduced a trapdoor. \245^V1>\240Gy\230\256 o DES is a block cipher --- maps 64-bit plaintext block to 64 bit [cs342@puma] openssl enc -d -des -nosalt -pass pass:foobar -in secret.enc ciphertext block controlled by 56-bit key. Uses 16 rounds of the One if by land, two if by sea. same substitution/permutation operation.

22-29 22-30

Brute Force Attacks on DES Other Block Ciphers

Although denied by US govt. (esp. NSA) people suspected DES was National Institute of Standards and Technology (NIST) requested breakable by brute force (i.e., try all possible keys) with enough proposals for new encryption standard in 1997; winner in 2000 computing power. was dubbed Advanced Encryption Standard (AES) . Minimum of 128-bit keys; up to 256-bit keys. Shown in series of RSA-Labs-sponsored challenges to break DES keys: Other block ciphers: o Jun. 1997: 140 days by DESCHALL project (distributed.net), as described in Matt Curtins o Skipjack (Clipper chip/Fortezza program, 80-bit keys) Brute Force: Cracking the Data Encryption Standard o Feb. 1998: 39 days by distributed.net o Triple-DES (TDES) = three rounds of DES with 3 different keys; 2*56 = 112 effective bits of security with 3*56 = 168-bit keys. o Jul. 1998: 56 hours by Electronic Frontier Foundation's (EFF) "Deep Crack" machine o IDEA (128-bit keys) o Jan. 1999, key broken in 22 hours and 15 minutes by Deep Crack + distributed.net o Blowfish (Schneier, up to 448-bit keys).

Moral: Key size matters! o RC5 (Rivest, parameterized over key & block size)

22-31 22-32

Breaking Cryptography (S&M Ch. 8) Cryptanalytic Attack Classification Brute force attack: Try all possible keys. Feasability depends on key size, available computrons. Add ciphertext-only: given only encrypted messages, deduce messages/key. Information theoretic attacks: Letter frequency information can break shift & substitutions cipher quickly. (To reduce redundancy of messages, some crypto implementations first compress message.) known-plaintext: given plaintext messages & encryptions, deduce key/algorithm Algorithm/implementation attacks: take advantage of details in algorithm or its implementation – e.g. choice of randomness, padding functions . Examples: chosen-plaintext: deduce key from black-box encrypter. • Netscape SSL random session key easily guessable with time/process info. • Kerberos v.4 DES 56-bit session key only had 20 bits of info. chosen-ciphertext: deduce key from black-box decrypter.

Side channel attacks: determine key from timing, memory usage, power, purchase-key/rubber-hose: bribe/threaten key holder electromagnetic field, etc. until s/he gives up key Keyjacking: Hijack calls to cryptographic API.

Other approaches weve seen: information leakage (find keys on stack, in memory pages, etc.) and social engineering (shoulder surfing, dumpster diving, etc.) 22-33 22-34

Change of Focus: Message Authentication Message Authentication via Digital Signatures

Confidentiality: No one can read M except encryption/ With encryption scenario, nothing prevents Eve from recording and Alice and Bob (e.g., whispering, locked box); decryption replaying messages and Mallory from modifying messages (e.g. splicing parts of different messages together). Authentication: Bob knows M comes from Alice, not someone masquerading as Alice Want a way to determine that an plaintext message is authentic -- (e.g., wax seal, watermark paper, signature); it s really from who it says it's from, and has not been changed.

message If Alice sends Bob (unencrypted) M, want it accompanied with a Integrity: Bob can verify that M has not be authentication, digital signature SA,M that has the following properties: altered after it was sent by Alice digital (e.g., tamperproof packaging); signatures • authenticity: Bob confident M is from Alice (only she signs S ). A,M Nonrepudiation: Alice should not be able to • integrity of signed document. Changing M invalidates SA,M falsely deny sending M. • unforgeability: no one but Alice knows how to sign SA,M.

• unreusabilty: SA,M depends on M, so cant be used with another msg. • nonrepudiation of signature: Alice cant claim she didnt send M.

22-35 22-36 Message Authentication Code (MAC) CBC-MAC Idea: Break message into n fixed-sized blocks (padding if necessary) Shared authentication key K and encrypt with block cipher in CBC mode. MAC is final cipher block. Initialization padded Vector (IV) M1 M2 Mn

M S = MAC(K,M) S S ⊕ ⊕ ⊕ MAC M S = MAC(K,M) K E K E … K E MAC M discard discard MAC o For untampered message, M = M and S = S = S. Otherwise, Bob thinks message has been changed and/or is not from Alice. Ferguson, Schneier, and Kohno (FSK) warn: o Replay a problem, but can be solved with sequence #s/timestamps. o Never use same key for encryption and authentication. o MAC provides no confidentiality by itself. M could be plaintext. o Collision attacks limit security to half the length of block size. o MAC can be combined with encryption, but authentication key should be o Not recommended, because difficult to use correctly. unrelated to encryption key. 22-37 22-38

CMAC Digression: Cryptographic Hash Functions A cryptographic hash function H(M) returns a fixed-length hash Idea: Like CBC-MAC, but XOR key-dependent value in input to final value h for any size message M. This value h is known as a block to disrupt some CBC-MAC attacks cryptographic checksum, fingerprint, message digest. They’re used in passwords, digital signatures, integrity checking, etc. Initialization padded Vector (IV) M M M 1 2 n 01011010001 01011101101 H 11010011010110…110 100110110… ⊕ ⊕ ⊕ H(M) = h (fixed size)

K K f M (arbitrarily large) E E … ⊕

K E Hash Function Bits in Fingerprint discard discard MD5 (Message Digest, Rivest, 1992) 128 SHA-1 (Secure Hash Algorithm, NSA, 1995) 160 MAC SHA-2 (NSA, 2001) 224, 256, 384, 512 SHA-3 (NIST, 2012) 224, 256, 384, 512

22-39 22-40 Desirable Crypto Hash Function Properties Hashing Examples Using openssl One-way (a.k.a. Pre-image Resistance): • Easy to compute H(M) = h. [cs342@tempest ~] echo "One if by land, two if by sea." | openssl dgst -md5 (stdin)= fe66cbf9d929934b09cc7e8be890522e • Given h, hard to find an M such that H(M') = h (even though

there are ∞ such values, by the pigeon-hole principle). [cs342@tempest ~] echo "One if by land, two if by sea." | openssl dgst -md5 -c • Implies that, given M, hard to find M such that H(M) = H(M). (stdin)= fe:66:cb:f9:d9:29:93:4b:09:cc:7e:8b:e8:90:52:2e

Collision Resistance: [cs342@tempest ~] echo "One if by land, two if by sea" | openssl dgst -md5 -c (stdin)= 78:4b:61:4a:66:98:17:82:18:d9:25:ca:c9:64:c5:56 • Practically difficult to find (M1, M2) that collide: H(M1) = H(M2). [cs342@tempest ~] echo "One if by land, Two if by sea." | openssl dgst -md5 -c Seemingly Random Behavior: (stdin)= 96:07:e8:69:cd:97:59:98:ad:21:8e:46:a8:c0:4f:0e • H behaves like a random mapping: changing a single bit in M should change about half the bits in H(M), in an unpredictable way [cs342@tempest ~] echo "One if by land, two if by sea." | openssl dgst -sha1 -c (stdin)= 28:47:d1:e5:9a:96:83:bf:2f:2a:91:b8:f3:ec:21:63:d3:be:5a:6b

Note: do note confuse cryptographic hash functions with much [cs342@tempest ~] echo "One if by land, two if by sea" | openssl dgst -sha1 -c simpler hash functions used in hash tables! (stdin)= 24:e2:d1:19:44:c2:17:49:1f:d8:9c:23:d0:9d:d2:d9:87:87:11:f1

22-41 22-42

More Hashing Examples Hashes in Practice: Safe Downloads & Intrusion Detection

Safe downloads from untrusted sites (as long as have file hash from trusted [cs342@tempest ~] echo "One if by land, two if by sea." | openssl dgst -sha256 -c site). Can use hashes to guarantee file integrity. (stdin)= 67:82:97:b4:e2:4f:95:28:b7:f1:cf:37:dc:8b:49:83:3f:94:d6:45:50:eb:1c:4f:79:86: 56:0a:59:8e:e1:ed E.g., http://www.safer-networking.org/en/download/index.html [cs342@tempest ~] echo "One if by land, two if by sea" | openssl dgst -sha256 -c (stdin)= 94:09:b6:88:06:44:df:e8:47:28:e2:9c:5e:99:0c:16:76:5c:ad:1d:32:36:25:ef:2b:c3: [lynux@localhost ~]$ openssl dgst -md5 Desktop/spybotsd160.exe 0e:d8:7a:ed:38:95 MD5(Desktop/spybotsd160.exe)= 0e7fbf50f87b3b7c384a2471154a7558

[cs342@tempest ~] echo "One if by land, two if by sea." | openssl dgst -sha512 -c Intrusion detection: use hash (possibly combined with key) to fingerprint all (stdin)= 93:fc:e3:a3:07:6a:90:ec:51:29:be:71:da:bf:47:dd:d0:67:ed:89:c6:b4:f9:27:ba:f6: c8:8c:a1:78:d2:53:ac:92:bc:3d:a5:52:06:98:d5:40:14:9f:2e:ad:fe:ab:55:ae:6f:d7:67:cf:1e: important system files. b4:85:0a:01:9c:78:8f:97:22

[cs342@tempest ~] echo "One if by land, two if by sea" | openssl dgst -sha512 -c (stdin)= 8c:89:d6:ad:9e:30:09:53:00:70:bd:a0:4c:1a:62:34:7c:5e:f2:a3:c0:25:02:99:e2:7a: 89:b0:ac:2f:a5:fc:7c:44:72:a7:a6:69:75:45:c9:3f:18:f9:12:e0:a7:50:bf:73:c4:f8:61:42:fd: 90:78:3d:06:28:7b:f0:48:8c

22-43 22-44 Hashes Arent Digital Signatures! Hash-Based MACs

One-way hashes can be used for integrity in some cases (e.g., code A MAC combine hashes with keys to so that only key-holders can fingerprint from ``reputable'' website), but by themselves they're authenticate. Useful for authenticating files between users and not suitable for signing messages from Alice to Bob. determining if user files have changed

Why? Basic idea: send pair of message M and signature S, where the S is calculated from M and key K. Some possible signatures:

• encrypt hash value of M with key: S = E(K, H(M)).

• hash encrypted value of M: S = H(E(K, M)).

• hash result of concatenating key and message: (1) S = H(K @ M) or (2) S = H(M @ K).

Schneier & FSK mention some vulnerabilties of these approaches.

22-45 22-46

Generic Hash Attacks Birthday Problems

Recall: want hash function H to have both the following properties: The following two problems are very different:

Pre-Image Resistance: given h, hard to find M s.t. H(M) = h 1. Alice is in a room of n people. Whats the smallest n for which theres a 50% probability that someone else shares Alices birthday? Collision Resistance: hard to find M1 and M2 s.t. H(M1) = H(M2)

For k-bit hashes:

• given h, about how many messages M do we expect to examine in 2. (Birthday Paradox) There are n people in a room. What s the smallest a brute force attack on h? n for which there s a 50% probability that at least one pair share the same birthday?

• about how many messages M do we expect to examine to find a collision?

To figure these out, lets think about birthdays …

22-47 22-48 Birthday Problem #1: Details Birthday Paradox: Details Solution to Birthday Problem #1: Solution to Birthday Problem #2 (Birthday Paradox) Let B(n) = at least one of n people shares Alices birthday Let B(n) = at least two of n people share same birthday and NB(n) = not one of the n people shares Alices birthday and NB(n) = no two people share the same birthday. Then p(B(n)) + p(NB(n)) = 1, so p(B(n)) = 1 – p(NB(n)). Then p(B(n)) + p(NB(n)) = 1, so p(B(n)) = 1 – p(NB(n)).

n p(NB(n)) = (364/365) n p(NB(n)) p(B(n)) p(NB(n)) = (365/365)*(364/365)*(363/365)* … * ((366 – n )/365)

50 .872 .128 = 365! / ((365 – n)! * 365n) Note: expected number of 100 .760 .240 people to have first match 150 .663 .337 n p(NB(n)) p(B(n)) is 365. 200 .578 .422 10 .883 .117 250 .504 .496 15 .747 .253

253 .500 .500 20 .589 .411 Consequence: for k-bit hash, 300 .439 .561 23 .493 .507 expect to find a message with 400 .334 .666 30 .294 .706 a given hash value h after 500 .254 .746 40 .109 .891 k 2 hashes. 750 .128 .872 50 .03 .97 1000 .064 .936 60 .006 .994 70 .001 .999 22-49 22-50

Birthday Paradox: Results Birthday Paradox: Practical Consequences Suppose a system generates a random sequence of n MD5 hash (128 bits) values in [1 .. L]. o Expect to find collisions after 264 ~ 18x1018 messages. Once unthinkably large, but now very thinkable. There are (n * (n-1))/2 pairs of sequence elements o Cryptanalytic advances starting in 2005 allow finding collisions in much (ignoring order). fewer than 264 computations. o While the existence of such efficent collision finding attacks may not There is a 1/L chance that any pair has equal values. immediately break all uses of MD5, it is safe to say that MD5 is very weak and should no longer be used. (FSK, Ch. 5) So the probability of a collision is (n * (n-1))/2L ~ n2/2L. SHA-1 hash (160 bits) When is the collision probability about 50%? o Expect to find collisions after 280 ~ 1.2x1024 messages. Within the realm of thinkability. What does this imply about the number of messages that o Algorithm details make it possible to finding collisions in much fewer than need to be examined to find collisions for a hashing 280 computations. function with a k-digit fingerprint? o It is no longer safe to trust SHA-1. (FSK, Ch. 5)

22-51 22-52

Iterative Hash Functions Attacks on Iterative Hash Functions

In practice, a hash function H on fixed size blocks (typically 512 bits) is applied iteratively to message blocks starting with a fixed value. Length Extension Attack:

padded o Knowing H(M @ M … @ M ), we know a lot about H(M @ M … @ M @ M ). h (fixed) 1 2 n 1 2 n n+1 0 M M M 1 2 n o E.g., Consider MAC(K,M) = H(K @ M) = h. Mallory can add an extra block to the end of M without changing h. ⊕ ⊕ ⊕ Partial-Message Collisions: H H … H o Consider MAC(K,M) = H(M @ K) = h, where K is a multiple of a block size. Then if Mallory finds M such that H(M) = H(M), she can replace M by M without changing h. h0 h1 h2 hn = H(M)

Properties: • Easy to implement. FSK suggest the hash function H (M) = H(H(M) @ M) to fix these problems. • Can compute as soon as part of message received, so works well on stream of data. • But, subject to some attacks.

22-53 22-54

Password Interception Man-in-the-Middle /Chess Grandmaster Attacks

Many password interception techniques are examples of password <Im Alice, H(password) > man-in-the-middle (MITM) attacks: Mallory impersonates Alice to Bob and Bob to Alice. Bob Bob Alice Alice Mallory

message to Bob message to Bob

message to Alice message to Alice Bob Alice Alice Mallory

This is also known as the chess grandmaster attack: Mallory can play remote chess against grandmasters Alice (white) and Bob (black) at the same time and will win one game or draw both.

22-55 22-56 Replay Attack Foiling Replay Attack o Timestamps/Sequence Numbers/Nonces but, as well see in more detail later, these have problems: <Im Alice, H(password) > • timestamps require synchronization and might be guessable • Alice Alice consecutive sequence numbers are obviously related <Im Alice, H(password) > Bob • nonces must be remembered by Bob. Eve o Password Aging • bad social properties; people try to circumvent. Eve can record encrypted password and later replay it. • how frequent to be effective deterrent? o One-time Passwords (OTP): Alice has a sequence of passwords How to foil replay attack? and uses each only once (theyre nonreusable!).

• how to generate password sequence?

• how to synchronize Alice and Bob? o Challenge-Response: Alice responds to new challenge from Bob on each login.

22-57 22-58

One-Way Authentication: Challenge-Response MITM Attack on Challenge-Response

Alice responds to nonce challenge R from Bob by encrypting Im Alice Im Alice or hashing R with shared key KAB. R R Bob Bob Im Alice Alice E(KAB, R) or H( ) Mallory E(KAB, R) or H( ) R Bob Bob Alice Alice MIG-in-the-Middle attack. described by Ross Anderson: E(KAB, R) or H( )

R R R

E(K , R) E(K , R) E(K , R) SAAF SAAF Angola in SAAF SAAF jet jet SAAF in Namibia Namibia in Cuban MIG MIG Cuban over Angola Cuban radar Cuban SAAF radar radar SAAF over Namibia

22-59 22-60 Other Challenge-Response Attacks Review: Symmetric-Key Crypto Violated assumptions can lead to many attacks: K K Impersonation: Nothing authenticates Bob, so Mallory can impersonate Bob. Alice Bob

Known Plaintext: Plaintext R can help Eve figure out key KAB. M C = E(K,M) M = D(K, C) Encrypt Decrypt Chosen Plaintext: As Bob, Mallory can get Alice to encrypt any M. plaintext ciphertext plaintext Session Hijacking: Mallory hijacks conversation after initial handshake. o Alice and Bob share same key K.

Server Database Attack: Mallory may be able to find KAB (and thereby o Need D(K, E(K, M)) = M for all keys K and messages M. impersonate Alice) by attacking Bobs database.

o Fundamental problem is key exchange: Predictable Nonce: If nonce is a sequence number or coarse-grained timestamp, Mallory can replace R by a later R′, obtain E(KAB, R′) from Alice, • How do Alice and Bob choose & share K, especially if and use this to impersonate Alice for later nonce R′. (This is impractical for they never meet in person? fine-grained timestamp.) • How many keys are needed for n parties, each of whom Moral: Cryptography not enough by itself – must consider system in which its want to communicate in pairs? used. (Radia Perlman s talk on How to Build an Insecure System out of Its essential to solve this problem for electronic commerce. Perfectly Good Cryptography.) 22-61 22-62

Diffie-Hellman Key Exchange (1976) Public-Key Crypto

• Allows Alice and Bob to create a new session key without meeting Bobs public Bobs private • Does not authenticate Alice or Bob. key, KBpub key, KBpriv Alice Bob gx (mod n) M C = E(KBpub,M) M = D(K , C) picks random x picks random y Encrypt Decrypt Bpriv gy (mod n)

Bob Bob plaintext ciphertext plaintext

now knows key Alice now knows key k = gxy (mod n) k = gxy (mod n) o Each person has a key pair:

Public knowledge: E.g. 2 is a generator for 11: 1. a public key Kpub that is published to the world; 210 = 1 (mod 11) 29 = 6 (mod 11) • a large prime n 2. a private key Kpriv that only the person knows. 21 = 2 (mod 11) 27 = 7 (mod 11) • a generator g for n (i.e., any number 8 3 2 = 3 (mod 11) 2 = 8 (mod 11) o Need D(Kpriv, E(Kpub, M)) = M for all key pairs (Kpub, Kpriv) and all M. k between 1 and n-1 can be expressed 22 = 4 (mod 11) 26 = 9 (mod 11) i 4 5 as g (mod n) for some i) 2 = 5 (mod 11) 2 = 10 (mod 11) o Must be practically impossible to determine Kpriv from Kpub o Solves key exchange problem. Anyone can look up Ps public key Depends on hardness of discrete log problem: in a public directory to encrypt a message for P, but only P Hard to calculate i from gi (mod n) can decrypt it. (But well see there are problems with public 22-63 directories…) 22-64 Sounds Magical … Do Algorithms Exist? Yes! RSA Foundations: Modular Arithmetic

Idea of public-key crypto due to Diffie/Hellman and Merkle (1976). x =(mod n) y means x mod n = y mod n

Implementations based on computationally intractable problems: equivalently: x + kn = y for some k x = y (mod n) is an alternative notation. o Merkle/Hellman encryption (1978) based on knapsack problem (NP complete). For all x, y, n:

o RSA (Rivest,Shamir, Adelman, 1978) based on difficulty of x + y = (x mod n) + (y mod n) factoring large numbers. The most popular approach to public- (mod n) key crypto, it supports both encryption and digital signatures. x * y =(mod n) (x mod n) * (y mod n) Well study this algorithm. m m x =(mod n) (x mod n) o El Gamal (1985) based on difficulty of calculating discrete logarithms in finite field. Supports both encryption and Examples: digital signatures. 7002 * 7775 =(mod 7) ?? 17 In contrast, symmetric-key crypto implementations are a black art . 153154 =(mod 153) ??

22-65 22-66

RSA Foundations: Multiplicative Modular Inverses RSA Foundations: Fermats Little Theorem

Multiplicative modular inverses of relative primes: Fermats Little Theorem:

(p-1) If x, y are relatively prime (i.e., gcd(x,y) = 1) then theres a k s.t. If p is prime and p doesnt divide x, then x =(mod p) 1

-1 x⋅k =(mod y) 1 (alternatively, k =(mod y) x ) E.g. 96 5 =(mod 97) 1 This is a consequence of Euclids gcd algorithm. 64 96 = 1 (mod 97) 96 E.g. 21, 25: 96 =(mod 97) 1 96 300 =(mod 97) 1 21 ⋅ 6 =(mod 25) 126 =(mod 25) 5 ⋅ 25 + 1 =(mod 25) 1 But 96 25 ⋅ 16 =(mod 21) 400 =(mod 21) 19 ⋅ 21 + 1 =(mod 21) 1 97 =(mod 97) 0 96 194 =(mod 97) 0

22-67 22-68 RSA Foundations: Chinese Remainder Theorem RSA Foundations: Practice Lemma

Chinese Remainder Theorem† Suppose p, q are prime.

Suppose a and b are relatively prime. Then: Using Fermats Little Theorem and Chinese Remainder Theorem, show the following:

(x =(mod a) y) and (x =(mod b) y) iff (x =(mod ab) y) w if w =(mod (p-1)(q-1)) 1 then m =(mod pq) m for all k and all m < p ⋅q Examples: Consider: 10 = 22 and 10 = 22 implies 10 = 22 (mod 3) (mod 4) (mod 12) 1+k(p-1)(q-1) 23 =(mod 30) 53 implies 23 =(mod 2) 53 and 23 =(mod 15) 53 m =(mod p) ?

23 =(mod 3) 53 and 23 =(mod 10) 53 1+k(p-1)(q-1) m =(mod q) ? 23 =(mod 5) 53 and 23 =(mod 6) 53 Does it matter if one of p or q divides m?

† This is a specialized version of the Chinese Remainder Theorem. The general version involves n equations and n numbers that are pairwise relatively prime. 22-69 22-70

RSA (Rivest, Shamir, Adelman 1978) RSA Implements Public-Key Crypto!

1. Choose two large (100s of digits) random prime numbers, p and q. D(kpriv, E(kpub, mi)) 2. Define n = p ⋅ q = D((d,n), E((e,n), mi)) (by defns of kpriv and kpub ) 3. Choose e relatively prime to (p-1)(q-1). e d = ((mi mod n) mod n) (by defns of E & D) -1 4. Define d =(mod (p-1)(q-1)) e (by Thm 1) ed =(mod n) mi (by modular arithmetic) 5. Define public key Kpub = pair (e,n) 1+k(p-1)(q-1) -1 =(mod n) mi (because d =(mod (p-1)(q-1)) e ) 6. Define private key Kpriv = pair (d,n) =(mod n) mi (by Practice Lemma; recall n = p ⋅ q) 7. To encrypt m with public key (e,n), split m (as bitstring) into

m1 @ m2 @ … s.t. each mi (as number) is < n and calculate: e E((e,n), mi) = (mi mod n) = ci d 8. To decrypt ci: D((d,n), ci) = (ci mod n)

Note: E(kpub, m) stands for E(kpub, m1) @ E(kpub, m2) @ …

D(kpriv, c) stands for D(kpriv, c1) @ D(kpriv, c2) @ …

22-71 22-72 Problem: Larger Numbers are Being Factored Chosen Plaintext Attack on Public Key Crypto

Security of RSA depends on hardness of factoring large numbers. Suppose Eve has ciphertext C = E(KBpub,M). In 1977, Rivest said factoring a 125-digit number would take 40 quadrillion years (reported by Schneier). But computers are If M comes from a small range of possible messages, getting faster, so he was wrong! Eve can try them all (because KBpub is public)!

Decimal digits Bits When factored

90 299 1988

100 333 1989

129 429 1994 140 466 1999 155 515 1999 576 1914 2003 640 2127 2005 Moral: use large numbers in RSA. Consider: If someone found a way to break RSA, would they tell you?

22-74 22-75

Chosen Ciphertext Attack on RSA, #1 Chosen Ciphertext Attack on RSA, #2 & #3

Eve has ciphertext C = Me mod n of message M encrypted by Alice. Mallory wants to trick notary public Trent into signing bad msg B (might have wrong timestamp, claim to be from Alice, etc.) Suppose Eve can trick Alice into signing a message. e Attack 1: Mallory finds an X such that G =(mod n) B (X ) is a good Eve chooses random R < n and calculates: message that Trent will sign. e X =(mod n) R d Y =(mod n) XC Trent returns G mod n. What is this? -1 T =(mod n) R

d Eve tricks Alice into signing Y to yield U = Y mod n. Attack 2: Mallory finds good G1 and G2 such that B =(mod n) G1 G2.

d d Now Eve computes TU mod n. What is this? Trent returns G1 mod n and G2 mod n.

How can Mallory get Bd mod n?

Moral: Never sign a random document from a stranger!

22-76 22-77 Other RSA Attacks (Schneier, S&M) RSA Provides Digital Signatures, Too!

Common Modulus Attack: If two parties use same n but different Since exponentiation is commutative, we also have: e1 and e2, it s possible to decrypt a message encrypted with both. E(Kpub, D(Kpriv, mi) ) Moral: Never share n! d e = ((mi mod n) mod n) Low Encryption Exponent Attack: Small e makes encryption faster, = m ed = m (by same reasoning as above) but have problems if same small e is used for different n. (mod n) i (mod n) i

(1) Can decrypt e copies of same M via Chinese Remainder Thm. D(Kpriv, m) is a public-key signature of m. (2) can decrypt e(e+1)/2 linearly dependent messages. Moral: Be careful with small e; use random padding for larger Alices private Alices public messages. key, K key, KApub Apriv Low Decryption Exponent Attack: Small d makes decryption faster, M s = D(KApriv, M) s M= E(K ,s) but it is vulnerable to attack if its small (< n/4). Moral: Use large d! D E Apub M Timing Attack: Can determine bits of keys with decryption timing attack . Moral: implementation details matter!

22-78 22-79

Properties of Public-Key Signatures Combining Encryption & Signatures

Bob receives message signed D(k , m) Apriv Alice wants to send a secret, signed message m to Bob. (i.e. he receives pair ) Apriv Assume only Alice knows kApriv (it hasnt been stolen): She can send this to Bob:

Authenticity: Bob confident message from Alice (only she knows k ). Apriv E(K_Bpub, <From Alice, D(K_Apriv, M)>) Unforgeability: because no one but Alice knows kpriv. Bob can process her message as follows: Unreusabilty: signature depends on documents & cant be used for

another document. 1. Bob uses K_Bpriv to get <From Alice, signed_M>. What about documents (e.g. digital checks of a certain amount) 2. Bob sees its from Alice, and computes E(K_Apub, signed_M). that might be sent more than once? If its not gobbdelygook, trust message. If not, ignore message. Integrity of signed document. Changing m requires changing Alternatives that Alice could send to Bob: signature in way that cant be done without knowing k Apriv. Nonrepudiation of signature: Alice cant claim she didnt send m. • <From Alice, E(K_Bpub, D(K_Apriv, M))> • <From Alice, D(K_Apriv, E(K_Bpub, M))>

22-80 22-81 Public-Key in Practice Public Key Infrastructure (PKI)

In practice, public-key crypto is orders of magnitude slower than But how does Alice know KBpub and Bob know KApub? symmetric-key crypto. o This is the realm of Public Key Infrastructure (PKI) = everything Assume Alice and Bob have never met. needed to make public key cryptography work in practice. o Suppose Alice wants to encrypt a large message M to Bob. Its too o PKI is all the stuff (S&M) needed to make public-key expensive to use public key crypto to encrypt M. What to do? cryptography work in the real world. Its essentially an issue of trust management. • Alice uses a key generator (PRNG) to generate a session key K_s for symmetric key crypto (e.g. AES). o In particular, why should we trust that what is claimed to be Alices public key is Alices public key? • Alice sends to Bob <From Alice, E(K_Bpub, K_s), E(K_s, M)> • Certainly wed trust Alice to give us her own public key directly. • Bob uses K_Bpriv to extract K_s, and uses that to decrypt M. But this is no better than secret-key distribution! o Suppose Alice wants to send a signed large message M to Bob. Its • The key issue in PKI is finding ways to trust Alices public key too expensive to use public key crypto to sign M. What to do? when it comes from other sources. • Alice sends to Bob <From Alice, M, D(K_Apriv, H(M))> o In practice, the X.509 standard dominates PKI. See readings for • Bob uses K_Apub and H to verify M. details.

22-82 22-98

Person h Key Diffie and Hellmans Vision Certification Authority (CA): Idea Alice KA Bob K Idea: Public keys are stored B We (and many others) trust Carlo (the Certification Authority = CA) Carol KC to manage and distribute public keys. … in a public directory …

Alice registers her public key KA with Carlo, who has high standards for verifying that Alice is really Alice (and not Mallory) – thats why Problems: we trust him! • How do we know information is correct? Who guarantees that the key published for Alice is her public key K and not A Everyone knows Carlos public key KCApub. Mallorys key KM. Public-key crypto breaks if this can happen! Carlo creates a signed certificate (signed with KCApriv) that KA is • One global directory has many practical problems, so likely to Alices public key. be many local directories. Some will be more trustable than others. How do we decide trust? This certificate specifies Alices public key and can be distributed by anyone! • For a web-based directory how do we know it hasnt been

compromised? Are we sure were even talking to the right directory?

22-99 22-100 Certification Authority (CA): Picture Chrome Certificates

<“Alices public key, K > Apub “Alices key?

certA = <“cert from Carlo”, D(KCApriv, “Alices public key is KApub)> certA Carlo (CA)

“Whats your public key? “Alices key? cert Alice Alice

A Elllie cert

Bob Bob A E(KApub, “Msg from Bob: … ) Dave

E(KApub, “Msg from Dave: … )

E(KApub, “Msg from Ellie: … )

Note: certA can be published anywhere! 22-101 22-102

Certificate Chain Chrome Certificate Authorities Menu> Settings> Advanced Settings> HTTPS/SSL Manage Certificates

22-103 22-104 SSL Protocol (used in HTTPS) Ways to Make Certificates (S&M, Ch. 10)

Assume Bob knows CA’s public key.

Strawman 0: CA signs public key from someone claiming to be Alice.

Strawman 1: CA generates key pair for Alice, issues it to her, then signs her public key.

Strawman 2: Alice generates her key pair and brings it to CA for signing.

Strawman 3: Alice generates her key pair and brings it to CA for signing along with signed request proving she knows the private key.

Practical matters: o its important for Alice to provide proof shes who she says she is. o Where is cert stored? o CAs private key must remain private! 22-105 Source:(h*ps://upload.wikimedia.org/wikipedia/commons/e/e5/( 22-107 Ssl_handshake_with_two_way_authen

End of SSL SSL -> TLS

Secure(Socket(Layer((SSL)(is(being(phased(out(and(replaced(by(( Transport(Layer(Security((TLS).(

The(Payment(Card(Industry(Security(Standards(Council((PCI(SSC)( released(a(special(bulle

Thanks(to(Carissa(Sun(for(this(link:( Thanks(to(Carissa(Sun(for(this(link:( h*p://www.tenable.com/blog/pciBsscBannouncesBtheBendBofBsslBusageBforBtheBpaymentBcardBindustry( h*ps://www.sans.org/readingBroom/whitepapers/protocols/sslBtlsBbeginnersBguideB1029( 22-108 22-109 Trust Chains Problems with Trust

Notation: Write [M]U for D(KUpriv, M) o What is trust? o Suppose Bob trusts Cindy to verify Alice. Then Bob trusts the cert Entity A trusts entity B when A assumes that

[Alices public key is KAlicePub]Cindy B will behave exactly as A expects. o Suppose Doug doesnt know Cindy, but trusts Edna to verify Cindy. Adams and Lloyd, Then Doug can use the cert chain Understanding Public Key Infrastructure

[Cindys public key is KCindyPub]Edna → [Alices public key is Involves assumptions, behavior, expectations. KAicePub]Cindy trust anchor for Doug o Problem: trust isnt transitive! Just because Doug trusts Edna to verify Cindys public key doesnt o Can extend chains to any length, as long as theyre rooted at at mean he trusts Cindy to verify Alices public key. E.g., Cindy might be trust anchor. very lax about whose keys she signs.

[Cindys public key is KCindyPub]Edna → [Alices public key is KAicePub]Cindy

22-111 22-112

Models of Trust Revocation

Monopoly: world chooses one trusted CA. Need revocation when (1) private key is compromised or (2) person can no longer be trusted (e.g. disgruntled employee). Oligarchy: there are many trusted CAs. E.g., the CAs baked into browsers. Revocation mechanisms:

Anarchy: each user responsible for choosing trust anchors and o Certifications have built-in expiration date that limits damage time. evaluating trust chains. E.g. PGP web of trust, signing parties. o Certification Revocation List (CRL) = list of unexpired serial numbers that should no longer be honored. Recent CRL must be checked before using public key in cert.

o Delta CRLs: provide small change lists on top of larger complete CRLs.

22-113 22-114 Pop-up Box Problems Bad Certificates Kaufman, Perlman, & Speciner consider the following scenarios: Expired/revoked/self-signed certificates raise warning flags. o Warning: This was signed by an unknown CA. Would you like to E.g. https://cs.wellesley.edu accept the certificate anyway? (OK)

o Would you like to always accept this certificate without being asked in the future? (OK)

o Would you like to always accept certificates from the CA that issued that certificate? (OK)

o Would you like to always accept certificates from any CA? (OK)

o Since youre willing to trust anyone for anything, would you like me to make random edits to the files on your hard drive without bothering you with a pop-up box? (OK)

… it would be an interesting psychology exercise to see how outrageous you can be before a user stops cliciking OK.

22-115 22-116