Internet Security (SS 2011)

01 Basic concepts and definitions

– Dichotomy (splitting the topic "security" into two subsets)
  ● Host
    • node, physical or virtual machine
    • is (or can be) well-controlled by well-developed authentication and authorization models
    • strong notion of "privileged" state
    • usually penetrated via a buggy application
    • focus of Jean-Pierre Seifert's lectures
  ● Network
    • anyone can connect
    • connectivity can only be controlled in very small, well-regulated environments, maybe not even then
    • different operating systems have different or no notions of user IDs and privileges
    • focus of Anja Feldmann's lectures

– Principle: Trust nothing. A host should trust nothing that comes over the wire.
– Network security tools
  ● Network-based access control (e.g. firewalls)
  ● Monitoring
  ● Protocol analysis by formal verification
  ● paranoid design
– Protocol design
  ● heavy use of crypto and authentication
  ● ensure that sensitive fields are protected
  ● bilateral authentication
  ● proper authorization
  ● defend against
    • eavesdropping
    • modification
    • deletion
    • replay
    • and combinations thereof

– Definition of Computer Security:
  ● deals with the prevention and detection of unauthorized actions by users or others of a system.
  ● Main components:
    • Confidentiality:
      ◦ prevent unauthorized disclosure of information
      ◦ origin: historical link of security to secrecy (few organizations dealing with classified data)
    • Integrity: prevent unauthorized modification of information
    • Availability:
      ◦ prevent unauthorized withholding of information
      ◦ services are accessible and usable (without undue delay) whenever needed by an authorized entity
      ◦ fault-tolerance is needed, e.g. against DoS
  ● Additional components:
    • authenticity
    • accountability:
      ◦ actions affecting security must be traceable to the responsible party
      ◦ audit information must be kept and protected
      ◦ access control is needed
    • reliability – deals with accidental damage
    • safety – impact of system failure on the environment
    • dependability – reliance can be justifiably placed on the system
    • survivability – recovery of the system after massive failure
– more terms:
  ● Vulnerability:
    • error or weakness in the design, implementation or operation of a system
    • Hardware: Interruption (DoS), Modification, Interception (Theft), Fabrication (Substitution)
    • Software: Interruption (Deletion), Modification, Interception, Fabrication
    • Data: Interruption (Loss), Modification, Interception, Fabrication
  ● Attack: means of exploiting some vulnerability in a system
  ● Threat:
    • adversary who is motivated and capable of exploiting a vulnerability
    • different enemies have different abilities
    • serious enemies can exploit the "three Bs": burglary, bribery and blackmail (in addition to social engineering)
    • one cannot design a security system without knowing who the enemy is
  ● Types of attackers:
    • "Joy hackers":
      ◦ most are "script kiddies", some are very competent
      ◦ share more tools than the good guys do
    • Hacking for profit
      ◦ allied with the spammers and phishers
      ◦ primary motivation: money
      ◦ sophisticated attacks, less pure vandalism
      ◦ botnets
    • organized crime
    • industrial espionage
      ◦ less than 5% of attacks are detected
      ◦ professionals know what they want and use non-technical means (social engineering)
      ◦ won't use your machine to attack others, so they are hard to find
    • inside jobs: What if your system administrator turns to the "Dark Side"?
    • Spies
    • Distinction needed because security is a matter of economics: how much security do you need, and how much can you afford?
  ● Assets
    • host-resident data
    • bandwidth
    • CPU time
    • knowledge of what hosts exist
  ● Cipher:
    • algorithm for encryption and decryption
    • operates syntactically
    • on elements of an alphabet or groups of letters (arbitrary plaintext → compute ciphertext)
    • usually depends on a piece of additional information, the key
    • distinct from codes
  ● Code:
    • operates semantically
    • on words, phrases or sentences
    • e.g. per codebooks
    • classical cryptography: substituting according to a large codebook which links a random string of characters to a word or phrase [wikipedia]

– Weakest point: Human
  "Humans are incapable of securely storing high-quality cryptographic keys, and they have unacceptable speed and accuracy when performing cryptographic operations. They are also large, expensive to maintain, difficult to manage, and they pollute the environment. It is astonishing that these devices continue to be manufactured and deployed, but they are sufficiently pervasive that we must design our protocols around their limitations." (Kaufman et al.)

– examples of attacks:
  ● Bandwidth attacks
    • clog your bandwidth, e.g. DoS
    • use your bandwidth to attack someone else, e.g. reflector attacks: forge the source address for a UDP-based service where the response is bigger than the request
    • network identity attack: run a server with illegal content on a hacked machine
    • eavesdropping: sniff passwords, credit card details
  ● Packet sniffing
    • easiest case: broadcast media
      ◦ promiscuous NIC reads all packets passing by
      ◦ can read all unencrypted data

02 Crypto Basics

  ● cryptosystem: pair of algorithms that take a key and convert plaintexts to ciphertexts and backwards
  ● symmetric = secret key = private key cryptosystem / cryptography
  ● asymmetric = public key cryptosystem / cryptography
  ● Main assumption: Kerckhoffs' Principle (1883): Assume the adversary knows the algorithm used, but not the key.
  ● Types of attacks:
    • ciphertext only
    • known plaintext
    • chosen plaintext
  ● Attacks:
    • mathematical attacks
    • statistical analysis
      – make assumptions about the distribution of letters, pairs of letters (digrams), triplets of letters (trigrams) etc. = models of the language
      – examine ciphertext, correlate its properties with the assumptions
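The statistical attack can be illustrated on the simplest possible case. The following is a minimal sketch, assuming an English plaintext in which "E" is the most frequent letter and a Caesar-style shift cipher; the function names and the sample message are made up for this illustration.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift every letter by `shift` positions; other characters pass through."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(ciphertext: str) -> int:
    """Language model (assumption): the most frequent ciphertext letter stands for 'E'."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    most_common_letter, _ = counts.most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26

message = "MEET ME HERE BEFORE THE NEXT SECRET MEETING"
ciphertext = caesar_encrypt(message, 3)

# Ciphertext-only attack: the attacker sees only `ciphertext`.
shift = guess_shift(ciphertext)
recovered = caesar_encrypt(ciphertext, -shift)
print(shift, recovered)
```

On short or unusual texts the single-letter guess can fail, which is why real attacks also use digram and trigram statistics.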

– Symmetric Cryptography:
– Substitution cipher
  ● monoalphabetic cipher: substitute one letter for another (e.g. Caesar cipher)
  ● Caesar cipher (fixed shift): only 26 keys, so an attacker can simply enumerate all keys (brute-force attack)
  ● general monoalphabetic substitution: 26! ≈ 4·10^26 possible keys, too many to enumerate, but still insecure due to language characteristics (statistical attack on the frequency of letters)
– One Time Pad (Vernam/Mauborgne cipher 1917-18)
  ● Exclusive-Or a key stream tape (random sequence of 0 and 1) with the plaintext
  ● online encryption of teletype traffic, combined with transmission
  ● provably secure
  ● but needs truly random keying tapes which are never reused
– Stream ciphers
  ● process message one bit or byte at a time when en/decrypting
  ● key stream generator produces a sequence S of pseudo-random bytes (like a one time pad)
  ● key stream bytes are combined (usually via XOR) with plaintext bytes
  ● properties:
    • very good for asynchronous traffic
    • best known stream cipher: RC4
    • key streams must never be reused for different plaintexts
  ● RC4:
    • internal state: 256-byte array S plus two integers i and j
    • modifies the state and outputs a byte of the keystream in each iteration:
      ◦ increments i,
      ◦ adds the value of S pointed to by i to j,
      ◦ exchanges the values of S[i] and S[j],
      ◦ outputs the value of S at the location S[i] + S[j] (modulo 256).
      ◦ Each value of S is swapped at least once every 256 iterations
    • no resynchronization except via re-keying + starting over
– Block ciphers
  ● process message in blocks, each of which is then en/decrypted
  ● like a substitution on very big characters
  ● codebook would need 2^64 entries, one for each 64-bit block
  ● instead create it from smaller building blocks
  ● Claude Shannon: substitution-permutation networks
    • two primitive cryptographic operations:
    • substitution (S-Box): makes relationship between ciphertext and plaintext as complex as possible
    • permutation (P-Box): diffusion of statistical properties of plaintext over the bulk of ciphertext
    • these are the basis for the Feistel cipher structure
  ● Feistel Cipher Structure
    • based on concept of invertible product cipher
    • partition input block into two halves
    • process block through multiple rounds which:
      ◦ perform a substitution on the left data half based on a round function of the right half and the subkey
      ◦ then have a permutation swapping the halves
    • Design elements: block size, key size, number of rounds, subkey generation algorithm, round function, fast software en/decryption, ease of analysis
  ● Cipher design
    • number of rounds: more is better, exhaustive search should be the best attack
    • function f: provides confusion, is nonlinear, avalanche
    • have issues of how S-boxes are selected
    • key schedule: complex subkey creation, key avalanche
  ● 5 Standard modes of operation:
    • Electronic Code Book (ECB)
      ◦ direct use
      ◦ primarily to transmit encrypted keys
      ◦ very weak for general-purpose encryption
      ◦ similar blocks of plaintext produce similar blocks of ciphertext
      ◦ enemy can build "code book" of plaintext/ciphertext equivalents
      ◦ encryption only works for messages that are a multiple of the block size
      ◦ initialization vector should not be transmitted, else it does not increase security
    • Cipher Block Chaining (CBC)
      ◦ most frequently used mode for message encryption
      ◦ initialization vector XOR plaintext1 is encrypted, the result is ciphertext1; ciphertext1 XOR plaintext2 is encrypted, the result is ciphertext2, and so on
      ◦ ciphertext of each encrypted block depends on the plaintext of all preceding blocks
      ◦ subsets of blocks appear valid and will decrypt properly
      ◦ message integrity has to be ensured by other means
      ◦ used for general file or packet encryption (input must be padded)
      ◦ drawback: encryption is sequential, cannot be parallelized
      ◦ one bit error in ciphertext causes complete corruption of the corresponding plaintext block and inverts the corresponding bit in the following plaintext block
    • Cipher Feedback (CFB) with n-bit shift
      ◦ relatively close to CBC
      ◦ move encryption operation before XOR operator
      ◦ ciphertext is next round's input for the encryption function
      ◦ can be made a self-synchronizing stream cipher: can recover after n bit errors
      ◦ initialize a shift register of the block size with the initialization vector
      ◦ encrypt it with the block cipher
      ◦ highest n bits of the result are XOR'ed with n bits of the plaintext to produce n bits of ciphertext
      ◦ shift the n bits of ciphertext into the register
      ◦ n bit errors → n bits of incorrect plaintext, but then the shift register has correct values again and the cipher resynchronizes
      ◦ known as CFB-1 or CFB-8 according to the number n of bits shifted per step
      ◦ encryption cannot be parallelized
      ◦ advantage: message does not need to be padded
    • Output Feedback (OFB)
      ◦ makes a block cipher into a synchronous stream cipher
      ◦ pseudo one time pad
      ◦ generates keystream blocks
      ◦ IV → encryption → result1, result1 XOR plaintext1 is ciphertext1; result1 → encryption → result2, result2 XOR plaintext2 is ciphertext2, and so on
      ◦ one bit error in ciphertext changes only one bit of plaintext → error correction codes work even on encrypted text
      ◦ no error propagation
      ◦ encryption cannot be performed in parallel, but keystream can be generated in advance before plaintext is available
      ◦ active attacker can make controlled changes to plaintext
      ◦ can have a short cycle (keystream repeats after 2^32 bits)
      ◦ use: bit streams over noisy lines, where error propagation is undesirable
    • Counter (CTR)
      ◦ form of stream cipher
      ◦ generates next keystream block by encrypting a counter value
      ◦ T → encrypt → result XOR P1 = Ciphertext1
      ◦ T+1 → encrypt → result XOR P2 = Ciphertext2
      ◦ …
      ◦ parallelizable
      ◦ no linkage between stages
      ◦ counter often split into message number and block number
      ◦ counter must never be repeated!
      ◦ very high speed
  ● integrity checks
    • recognize bit errors and modification of the message
    • requires a separate pass over data
    • cannot be parallelized: undesired burden, degrades performance of application
    • combined modes: Galois Counter Mode (GCM), counter with CBC-MAC (CCM)
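Going back to the RC4 description in the stream cipher notes: the per-byte state update can be sketched as below. The key-scheduling loop that initializes S from the key is the standard RC4 one and is not described in these notes; the function names are chosen for this illustration. RC4 is shown for study only, it is no longer considered secure.

```python
def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes for `key`."""
    # Key scheduling (standard RC4, not covered in the notes):
    # start from the identity permutation and mix in the key bytes.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]

    # Output loop, exactly the four steps listed in the notes.
    i = j = 0
    while True:
        i = (i + 1) % 256                 # increment i
        j = (j + S[i]) % 256              # add S[i] to j
        S[i], S[j] = S[j], S[i]           # exchange S[i] and S[j]
        yield S[(S[i] + S[j]) % 256]      # output S[S[i] + S[j] mod 256]

def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt (same operation): XOR data with the keystream."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())

# Why a keystream must never be reused: XOR of two ciphertexts under the
# same key equals XOR of the two plaintexts, so the key drops out.
p1, p2 = b"attack at dawn", b"retreat at ten"
c1, c2 = rc4(b"Key", p1), rc4(b"Key", p2)
leak = bytes(a ^ b for a, b in zip(c1, c2))
```

Because encryption and decryption are the same XOR, `rc4(key, rc4(key, data))` returns the original data.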

– Data Encryption Standard (DES)
  ● Block Cipher
  ● developed as Lucifer by IBM in the late 60s, team led by Feistel
  ● redeveloped as commercial cipher with input from NSA and others
  ● 1973 National Bureau of Standards issues public call, IBM submitted Lucifer
  ● adopted in 1977 by NBS (now NIST)
  ● encrypts 64-bit data using 56-bit key
  ● widespread use
  ● Initial Permutation:
    • reorder input data bits
    • even bits to left half, odd bits to right half
    • regular structure → easy in hardware
  ● 16 rounds:
    • use 32-bit L & R halves
    • take 32-bit right half (R) and 48-bit subkey
    • expand R to 48 bits using permutation E
    • add subkey using XOR
    • pass through 8 S-Boxes
      ◦ split 48 bits into 8 blocks of 6 bits
      ◦ map each 6-bit block to 4 bits
      ◦ each S-box is actually 4 little 4-bit boxes
      ◦ outer bits 1 & 6 (row bits) select one row out of 4 (depends on both data and key)
      ◦ inner bits 2-5 are substituted
      ◦ result is 8 blocks of 4 bits, or 32 bits
    • permute using 32-bit permutation P
    • swap R and L
  ● subkeys used in each round:
    • initial permutation of the key (PC1) which splits the 56 bits into two 28-bit halves
    • 16 stages consisting of:
      ◦ rotating each half separately either 1 or 2 places depending on the key rotation schedule K
      ◦ selecting 24 bits from each half & permuting them by PC2 for use in round function F
  ● Decryption:
    • unwind = do encryption steps again using subkeys in reverse order
    • first, the Initial Permutation undoes the Final Permutation of the encryption step
    • 1st round with Key 16 undoes 16th encryption round
    • …
    • the Final Permutation undoes the Initial Permutation of the encryption
  ● Avalanche effect:
    • desirable property: changing one input or key bit changes approx. half of the output bits
    • makes attempts to home in on the key by guessing impossible
  ● Key size:
    • 56-bit keys: 2^56 ≈ 7.2·10^16 values
    • brute force is possible: in 1999, dedicated hardware combined with distributed computing needed 22 hours
    • you must still be able to recognize plaintext
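The "decrypt by running the same rounds with the subkeys in reverse order" property follows from the Feistel structure alone. The sketch below is a toy Feistel cipher, not DES: the round function and key schedule are hash-based stand-ins invented for this illustration (no S-boxes, no permutations), and all names are hypothetical. It also contrasts ECB and CBC from the modes list on a message with two identical blocks.

```python
import hashlib

BLOCK = 8  # toy block size in bytes (64 bits, same as DES)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_function(half: bytes, subkey: bytes) -> bytes:
    """Stand-in for F: truncated SHA-256 (assumption, not the DES design)."""
    return hashlib.sha256(subkey + half).digest()[: BLOCK // 2]

def subkeys(key: bytes, rounds: int = 16) -> list:
    """Hypothetical key schedule: one hash-derived subkey per round."""
    return [hashlib.sha256(key + bytes([r])).digest() for r in range(rounds)]

def feistel(block: bytes, keys: list) -> bytes:
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for k in keys:
        # substitute the left half using F(right, subkey), then swap halves
        left, right = right, xor(left, round_function(right, k))
    return right + left  # undo the final swap so the same code decrypts

def encrypt_block(block: bytes, key: bytes) -> bytes:
    return feistel(block, subkeys(key))

def decrypt_block(block: bytes, key: bytes) -> bytes:
    return feistel(block, subkeys(key)[::-1])  # same rounds, reversed subkeys

def ecb_encrypt(data: bytes, key: bytes) -> bytes:
    # No padding here: len(data) must be a multiple of BLOCK.
    return b"".join(encrypt_block(data[i:i + BLOCK], key)
                    for i in range(0, len(data), BLOCK))

def cbc_encrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        prev = encrypt_block(xor(data[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(xor(decrypt_block(block, key), prev))
        prev = block
    return b"".join(out)

key, iv = b"toy key", bytes(BLOCK)
plaintext = b"SAMEBLOKSAMEBLOK"          # two identical 8-byte blocks
ecb = ecb_encrypt(plaintext, key)
cbc = cbc_encrypt(plaintext, key, iv)
print(ecb[:BLOCK] == ecb[BLOCK:], cbc[:BLOCK] == cbc[BLOCK:])
```

ECB maps the two equal plaintext blocks to equal ciphertext blocks, which is the "code book" weakness; in CBC the chaining hides the repetition. An all-zero IV is used only to keep the example deterministic.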

– Differential Cryptanalysis
  ● attack on block ciphers/Feistel ciphers
    • block cipher consists of a network of S-boxes (substitution) and P-boxes (permutation)
    • plaintext goes through r rounds
    • in each round, a different subkey is used
    • → output is influenced by both plaintext and key
    • analysis of the algorithm's internals
    • try to determine which subkey was used in each round
  ● chosen plaintext
  ● statistical attack
  ● known by NSA in the 70's, cf. DES design
  ● published in the 90's by Murphy, Biham, Shamir
  ● DES reasonably resistant to it
  ● Attack mechanics: [wikipedia]
    • use pairs of plaintexts with a constant difference to each other
    • encrypt them and compare the results, search for statistical patterns in their distribution
    • e.g. assume that the difference between the plaintexts persists for (r-1) rounds
    • deduce which round keys are possible in the final round
    • if the key is short, just decrypt the ciphertexts one round with each possible round key
    • if the results of this one-round decryption with a certain round key have the same differences as the input plaintexts, this suggests a possibly correct round key
    • iterate the process over many rounds (with decreasing probabilities)
  ● a 13-round iterated characteristic can break the full 16-round DES
  ● can break DES with 2^47 chosen plaintexts
– Linear Cryptanalysis
  ● attack on block ciphers and stream ciphers
  ● statistical method
  ● developed in the early 90's by Matsui et al.
  ● Attack mechanics: [wikipedia]
    • construct linear equations relating plaintext, ciphertext and key bits
    • here: binary variables (0 or 1) combined with exclusive-or (XOR)
    • e.g. first bit of plaintext XOR third bit of plaintext XOR first bit of ciphertext = second bit of key
    • an equation can hold (be true) or not hold (be false)
    • these linear equations (also called approximations) have probabilities of holding over the space of all possible values of their variables
    • compute many ciphertexts with known plaintexts, check if the equation is true for possible key bits
    • in an ideal cipher, any linear equation would hold with probability p = 50%
    • construct many equations, combine them with known permutation/key mixing
    • goal: find linear equations where the probability is either very close to 1 or to 0
    • then, apply a straightforward algorithm (Matsui's Algorithm 2) to guess the values of the key bits
    • compute many ciphertexts with known plaintexts
    • for each set of key bits (= partial key), count how many times the equation holds true: count T
    • the partial key where T has the greatest absolute difference from half the number of plaintext-ciphertext pairs is the most likely set of values for the key bits
    • repeat until the number of unknown key bits is low enough to attack with brute force
  ● can attack DES with 2^43 known plaintexts → impractical
– Moore's Law:
  ● adding one bit to the key doubles the work for a brute-force attack
  ● effect on encryption time is often negligible or even free
  ● for example, it costs nothing to use a longer RC4 key
  ● going from 128-bit AES to 256-bit AES takes (at most) 40% longer for en-/decryption but increases the attacker's effort by a factor of 2^128
  ● using triple DES costs 3x more to encrypt, but increases the attacker's effort by a factor of 2^112
  ● → Moore's Law favors the defender
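For the linear cryptanalysis notes above, the first step (finding linear equations that hold with a probability far from 50%) can be tried out on a single small S-box. This is a minimal sketch: the 4-bit S-box is just an illustrative choice, masks select which input and output bits are XORed together, and the key-recovery part (Matsui's Algorithm 2) is not shown.

```python
# Illustrative 4-bit S-box (assumption: chosen only as an example).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def parity(x: int) -> int:
    """XOR of all bits of x."""
    return bin(x).count("1") % 2

def linear_approximation_table(sbox):
    """table[in_mask][out_mask] = (#inputs where the equation holds) - n/2.

    The equation is: XOR of the input bits selected by in_mask
    == XOR of the output bits selected by out_mask.
    """
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for in_mask in range(n):
        for out_mask in range(n):
            holds = sum(parity(x & in_mask) == parity(sbox[x] & out_mask)
                        for x in range(n))
            table[in_mask][out_mask] = holds - n // 2
    return table

lat = linear_approximation_table(SBOX)

# Best non-trivial approximation: largest deviation from "holds half the time".
deviation, in_mask, out_mask = max(
    (abs(lat[a][b]), a, b) for a in range(1, 16) for b in range(1, 16)
)
probability = (lat[in_mask][out_mask] + 8) / 16
print(f"in_mask={in_mask:04b} out_mask={out_mask:04b} holds with p={probability}")
```

An entry of 0 corresponds to p = 50% (useless to the attacker); entries near +8 or -8 correspond to probabilities near 1 or 0, which is what the attack looks for.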

– Public-key cryptography
  ● Basic problem: How to share a secret (e.g. exchange a symmetric key) with someone you have never met?
  ● → Use asymmetric keys
  ● also used for digital signatures: verify that a message comes intact from the claimed sender
– Basic principle:
  ● use two keys:
    • a public key, which may be known to anybody and is used to encrypt messages and verify signatures
    • a private key, known only to the recipient, used to decrypt messages and sign (create signatures)
  ● those who encrypt messages or verify signatures cannot decrypt messages or create signatures
  ● design goals:
    • computationally infeasible to find the decryption key knowing only the algorithm and the encryption key
    • computationally easy to en/decrypt messages when the relevant key is known
    • either of the two keys can be used for encryption, with the other for decryption
  ● Security:
    • brute-force attack is always theoretically possible
    • but keys are usually too large (> 512 bits)
    • relies on the difference between easy operations and hard (cryptanalysis) problems
    • requires large numbers
    • slow compared to private-key (symmetric) schemes

– Mathematical basics in number theory
  ● Prime numbers:
    • numbers that only have divisors of 1 and self
    • cannot be written as a product of other numbers
  ● Relatively prime: a, b are relatively prime if they have no common divisors apart from 1
  ● Greatest Common Divisor: compare prime factors of numbers, use least powers
  ● Fermat's (little) theorem: [wikipedia]
    • p is a prime number, a is any integer
    • a^p ≡ a (mod p)
    • congruence relation ≡ from modular arithmetic means: a^p and a have the same remainder after dividing by p → a^p – a is divisible by p without remainder
    • variant of this theorem: a^(p-1) ≡ 1 (mod p), if a is not divisible by p
    • useful for testing if a given number is prime
    • useful for public-key cryptography
  ● Euler Totient function φ(n)
    • when doing arithmetic modulo n, there is a set of possible residues (remainders): 0 … n-1
    • reduced set of residues is those numbers which are relatively prime to n: e.g. for n = 10, complete set is 0 … 9, reduced set is only {1, 3, 7, 9}
    • number of elements in the reduced set of residues is called Euler Totient Function
    • to compute φ(n), must compute number of residues to be excluded
    • in general, need prime factorization of n
    • but for p (p is prime): φ(p) = p – 1 because all residues are relatively prime to p, except 0
    • for p * q (p, q are distinct primes): φ(pq) = (p – 1) * (q – 1)
  ● Euler's Theorem
    • generalization of Fermat's Little Theorem
    • if a and n are relatively prime (coprime): a^φ(n) ≡ 1 (mod n)
    • can be reversed: if this holds true, then a and n are relatively prime
  ● Primitive Roots modulo n [wikipedia]
    • any number g with the property that any number coprime to n is congruent to a power of g modulo n
    • in other words (for prime n): the result of a^m mod n for m = 1 .. (n-1) must go through all values of 1 .. (n-1)
    • e.g. 3 is a primitive root mod 7:
      3^1 ≡ 3 (mod 7)
      3^2 ≡ 2 (mod 7)
      3^3 ≡ 6 (mod 7)
      3^4 ≡ 4 (mod 7)
      3^5 ≡ 5 (mod 7)
      3^6 ≡ 1 (mod 7)
      → remainders cover all values from 1 to 6 and do not repeat until then
    • in other words, [slides]: have equation a^m ≡ 1 (mod n), a and n are coprime
    • is true for m = φ(n) (Euler's theorem), but may also be true for smaller m
    • if the smallest possible m is φ(n), then a is called a primitive root or generating element
    • successive powers of a "generate" the group mod p
  ● Discrete Logarithms
    • inverse problem to exponentiation
    • y = g^x (mod p) → find x
    • x = log_g y (mod p)
    • if g is a primitive root, then it always exists
    • exponentiation is easy, but finding discrete logarithms is generally hard
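The definitions above can be checked on the small numbers used in the notes. This is a brute-force sketch for tiny values only; the function names are chosen for this illustration, and nothing here scales to cryptographic sizes (which is the point of the discrete logarithm problem).

```python
from math import gcd

def phi(n: int) -> int:
    """Euler totient: size of the reduced set of residues modulo n."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)

def is_primitive_root(g: int, p: int) -> bool:
    """For prime p: do the powers g^1 .. g^(p-1) hit every value 1 .. p-1?"""
    return {pow(g, m, p) for m in range(1, p)} == set(range(1, p))

def discrete_log(y: int, g: int, p: int) -> int:
    """Find x with g^x ≡ y (mod p) by trying every exponent (exponential cost)."""
    for x in range(1, p):
        if pow(g, x, p) == y:
            return x
    raise ValueError("no discrete logarithm exists")

print(phi(10), phi(7), phi(3 * 11))          # reduced residue set sizes
print([pow(3, m, 7) for m in range(1, 7)])   # powers of 3 modulo 7
print(pow(4, 7 - 1, 7))                      # Fermat: a^(p-1) mod p
print(pow(3, phi(10), 10))                   # Euler: a^phi(n) mod n
print(discrete_log(6, 3, 7))                 # x with 3^x ≡ 6 (mod 7)
```

The three-argument `pow(base, exp, mod)` is Python's built-in modular exponentiation, so the "easy" direction stays fast even for very large numbers.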

– Diffie-Hellman
  ● first public-key scheme proposed (1976)
  ● public-key distribution scheme: public exchange of a secret key
  ● cannot be used to exchange an arbitrary message
  ● based on exponentiation and the difficulty of computing discrete logarithms
  ● Setup: all users agree on global parameters:
    • q = large prime integer or polynomial
    • a = a primitive root modulo q
  ● Key exchange:
    • User A generates random X_A < q; calculates Y_A = a^(X_A) mod q
    • A sends Y_A to B
    • B generates random X_B < q; calculates Y_B = a^(X_B) mod q, calculates secret key K = Y_A^(X_B) mod q
    • B sends Y_B to A
    • A computes secret key K = Y_B^(X_A) mod q
  ● K can now be used as a session key
  ● an attacker would need to compute a discrete logarithm modulo q to recover one of the secret values X_A or X_B, and from it the session key
  ● vulnerable to man-in-the-middle attack
  ● authentication of the keys is needed
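The exchange above can be traced with toy parameters. A minimal sketch, assuming q = 23 and a = 5 (a primitive root modulo 23); real deployments use primes of 2048 bits or more, and the helper names are made up for this illustration. The last part shows why unauthenticated public values allow a man-in-the-middle.

```python
import secrets

Q = 23  # toy prime modulus (far too small for real use)
A = 5   # primitive root modulo 23

def keypair():
    """Return (private X, public Y = A^X mod Q)."""
    x = secrets.randbelow(Q - 2) + 1      # random X with 1 <= X <= Q-2
    return x, pow(A, x, Q)

def shared_key(own_private: int, other_public: int) -> int:
    """K = (other side's Y)^(own X) mod Q."""
    return pow(other_public, own_private, Q)

# Honest exchange: only Y_A and Y_B travel over the wire.
x_a, y_a = keypair()
x_b, y_b = keypair()
k_a = shared_key(x_a, y_b)
k_b = shared_key(x_b, y_a)
print(k_a == k_b)

# Man-in-the-middle: Mallory replaces Y_B with her own Y_M on the way to A.
x_m, y_m = keypair()
k_a_with_mallory = shared_key(x_a, y_m)   # A believes this key is shared with B
k_mallory_with_a = shared_key(x_m, y_a)   # Mallory computes the same key
print(k_a_with_mallory == k_mallory_with_a)
```

Both sides arrive at a^(X_A * X_B) mod q, and nothing in the exchange tells A whose public value she actually received, hence the need for authenticated keys.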

– RSA (Rivest, Shamir & Adleman)
  ● 1977, MIT
  ● best-known and widely used public-key scheme
  ● based on modular exponentiation of integers modulo n = p * q (a product of two primes)
  ● uses large integers
  ● security due to cost of factoring large numbers
  ● key setup:
    • each user generates a public/private key pair by:
    • selecting two large primes at random: p, q
    • computing their system modulus n = p * q, φ(n) = (p – 1) * (q – 1)
    • selecting at random the encryption key e, so that 1 < e < φ(n) and e and φ(n) are coprime
    • in practice: e is low to reduce computational load, e.g. e = 2^16 + 1 = 65537
    • solving the following equation to find the decryption key d (multiplicative inverse of e): e * d ≡ 1 (mod φ(n)) and 0 <= d <= n → solve equation e * d + φ(n) * k = 1 (using the Extended Euclidean Algorithm)
    • publishing their public encryption key PU = {e, n}
    • keeping secret their private decryption key PR = {d, n}
  ● RSA use:
    • to encrypt a message M for a recipient:
      ◦ obtain public key of recipient: PU = {e, n}
      ◦ compute ciphertext: C = M^e mod n, where 0 <= M < n
    • to decrypt the ciphertext C:
      ◦ owner uses their private key PR = {d, n}
      ◦ computes: M = C^d mod n
      ◦ message M must be smaller than the modulus n (split into blocks if needed)
    • showing that RSA works:
      ◦ the decrypted result is the same as the original: need M^φ(n) ≡ 1 (mod n) (Euler's theorem) and e * d = 1 + φ(n) * k' for some integer k' >= 0
      ◦ this follows from the key setup equation e * d + φ(n) * k = 1, i.e. e * d = 1 – φ(n) * k, where k is zero or negative; set k' = –k
      ◦ C^d = M^(e*d) = M^(1 + φ(n) * k') = M * (M^φ(n))^k' ≡ M * 1^k' = M (mod n)
  ● Practical implementation
    • exponentiation: can use Square and Multiply Algorithm
      ◦ fast, efficient algorithm for exponentiation
      ◦ concept is based on repeatedly squaring the base and multiplying in the squares that are needed to compute the result
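Key setup, encryption, decryption and square-and-multiply can be followed end to end with textbook-sized numbers. This is a sketch under loud assumptions: p = 61, q = 53 and e = 17 are toy values picked for illustration, there is no padding, and the function names are hypothetical. It is not a usable RSA implementation.

```python
from math import gcd

def square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Compute base^exponent mod modulus, scanning the exponent bit by bit."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                       # this bit is set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus         # square for the next bit
        exponent >>= 1
    return result

def extended_gcd(a: int, b: int):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

# Key setup with toy primes.
p, q = 61, 53
n = p * q
phi_n = (p - 1) * (q - 1)
e = 17
assert gcd(e, phi_n) == 1                      # e must be coprime to phi(n)
_, d, k = extended_gcd(e, phi_n)               # e*d + phi(n)*k == 1
d %= phi_n                                     # normalize d into 0 .. phi(n)-1

public_key, private_key = (e, n), (d, n)

# Use: encrypt with the public key, decrypt with the private key.
message = 65
ciphertext = square_and_multiply(message, e, n)
decrypted = square_and_multiply(ciphertext, d, n)
print(n, phi_n, d, ciphertext, decrypted)
```

The loop in `square_and_multiply` runs once per bit of the exponent, so the number of multiplications grows with the bit length of the exponent rather than with its value.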

      ◦ square-and-multiply only needs O(log2 n) multiplications for an exponent n
    • efficient encryption:
      ◦ uses exponentiation to power e
      ◦ if e is smaller, this will be faster
      ◦ if e is fixed, must ensure gcd(e, φ(n)) = 1 → reject any p or q for which p – 1 or q – 1 is not relatively prime to e
    • efficient decryption:
      ◦ uses exponentiation to power d
      ◦ d is likely large
      ◦ use the Chinese Remainder Theorem (CRT) to compute mod p and mod q separately, then combine to the desired answer
      ◦ approx. 4 times faster than doing it directly
    • key generation:
      ◦ determine two primes at random: p, q
      ◦ select either e or d and compute the other
      ◦ primes p, q must not be easily derived from modulus n = p * q
      ◦ means they must be sufficiently large
      ◦ typically guess a candidate and use a probabilistic primality test
      ◦ exponents e, d are inverses, so use the inverse algorithm to compute the other
  ● possible attacks [wikipedia / http://members.tripod.com/irish_ronan/rsa/attacks.html ]
    • brute force key search:
      ◦ encrypt every possible message block,
      ◦ until a match is found with one of the ciphertext blocks
      ◦ infeasible given block size
    • guessing d
      ◦ known plaintext and ciphertext
      ◦ e.g. trying every possible key until it returns the original plaintext
      ◦ once d has been discovered, it is easy to find the factors of n (p and q)
      ◦ need long keys (currently, 2048 bit is recommended)
    • factorization of n
      ◦ n = p * q
      ◦ private key can be computed easily if p and q are known (just like the legitimate key owner has computed d)
      ◦ n should be large enough that factorization is infeasible
    • low public exponent attack
      ◦ to reduce encryption or signature verification time, often choose a small public exponent e such as 3, 17 or 65537
      ◦ makes modular exponentiation faster
      ◦ also easier to test if e and (p-1) are coprime, and if e and (q-1) are coprime
      ◦ if the public exponent is small and the plaintext m is very short, the RSA function is easy to invert, revealing the plaintext m
    • Coppersmith's attack (Håstad's broadcast attack): [wikipedia]
      ◦ same clear text message is sent encrypted to e or more recipients
      ◦ they share the same exponent e, but have different p, q and therefore n
      ◦ it is easy to recover the original clear text message via the Chinese remainder theorem
    • side-channel attacks
      ◦ timing and power attacks (on running decryption)
      ◦ check how many operations are performed on the hardware for decrypting a ciphertext, how long they take, how much power they consume
  ● Implementation details matter.
  ● Self-made certificate verification is a bad idea, for example
  ● security by obscurity is, too
– Message authentication:
  ● sign message with private key and encrypt with public key to have secrecy and authentication
  ● Message Authentication Code (MAC) is generated by an algorithm that creates a small fixed-size block

    • cryptographic checksum MAC = C_K(M)
    • condenses a variable-length message M
    • using a secret key K
    • to a fixed-size authenticator
    • needs to satisfy the following:
      ◦ knowing a message and a MAC, it is infeasible to find another message with the same MAC
      ◦ MAC should be uniformly distributed
      ◦ MAC should depend equally on all bits of the message
  ● like encryption it depends on key and plaintext, but it need not be reversible
  ● appended to the message like a signature, but it is not a signature
  ● receiver performs the same computation on the message and checks its MAC
  ● provides assurance that the message is unaltered and comes from the sender
  ● can use any block cipher chaining mode and use the final block as MAC
  ● Data Authentication Algorithm (DAA) is a widely used MAC based on DES-CBC
    • using IV = 0 and zero-pad of final block
    • encrypt message using DES in CBC mode
    • and send just the final block as the MAC, or its leftmost M bits
    • final MAC is now too small for security
– Hash functions
  ● condense an arbitrary message to a fixed size: h = H(M)
  ● usually assume that the hash function is public and not keyed
  ● hash used to detect changes to the message
  ● can be used in various ways with the message
  ● most often to create a digital signature
  ● easy to compute
  ● infeasible to reverse
  ● infeasible to find another message with the same hash
  ● a 64-bit hash can be broken by a birthday attack: generate about 2^32 variants each of a valid and a forged message, compare the two sets to find a pair with the same hash (probability > 0.5 by the birthday paradox), then have the user sign the valid message and substitute the forged message, which has the same hash
  ● can use block ciphers as hash functions
    • using H_0 = 0 and zero-pad of final block
    • H_i = E_(M_i)[H_(i-1)]
    • use final block as the hash value
    • similar to CBC but without a key
    • resulting hash is too small (64 bit)
  ● attacks:
    • brute-force attacks possible: 128 bits looks vulnerable, 160 bits is better
    • MACs with known message-MAC pairs: at least 128 bits needed for security
    • number of analytic attacks: exploit properties of rounds of block ciphers
– Secure Hash Algorithm (SHA)
  ● designed by NIST & NSA in 1993
  ● revised in 1995 as SHA-1
  ● based on design of MD4 with key differences
  ● produces 160-bit hash values
  ● standard revised multiple times (SHA-256, SHA-384, SHA-512)
    • for compatibility with the increased security provided by the AES cipher
    • SHA-512: compression function processes 1024-bit blocks
    • SHA-512: consists of 80 rounds
– Whirlpool
  ● hash function
  ● endorsed by European NESSIE project
  ● uses modified AES internals as compression function
  ● similar to AES
– Keyed Hash functions as MAC (HMAC)
  ● original proposal: KeyedHash = Hash(Key|Message)
  ● had some weaknesses:
    • if the key is prepended to the message, it is possible to append data to the message without knowing the key and still obtain another valid MAC
    • if the key is appended, finding a collision in the unkeyed hash function results in a collision in the keyed hash function
  ● → HMAC was developed with an inner and an outer hash function
  ● take two constants opad = 0x5c5c5c... and ipad = 0x363636... (have a large Hamming distance, so the key bits used are different)
  ● HMAC: HMAC_K(M) = Hash[ (K+ xor opad) || Hash[ (K+ xor ipad) || M ] ]
  ● K+ is the key padded out to the hash block size; opad, ipad are the specified padding constants
  ● overhead is just 3 more hash block calculations than the message needs alone
  ● any hash function can be used, e.g. MD5, SHA-1, RIPEMD-160, Whirlpool
– Digital Signatures
  ● verify author, date and time of signature
  ● authenticate message contents
  ● direct signature
    • involves only sender and receiver
    • receiver has sender's public key
    • encrypt with private, decrypt with public key
  ● arbitrated digital signatures
    • use of arbiter A who validates any signed message, dates it and sends it to the recipient
    • requires suitable level of trust
    • can be implemented with either public or private key algorithms
  ● Digital Signature Standard (DSS)
    • US government approved signature scheme
    • uses the SHA hash algorithm; the signature algorithm itself is called DSA
    • revisions include RSA as another algorithm
– Authentication protocols
  ● One-way authentication
    • required when sender and receiver are not in communication at the same time (e.g. email)
    • using symmetric encryption: can refine use of KDC but cannot have final exchange of nonces, not secure against replay attacks
    • public-key approaches
    • Digital Signature Standard (DSS)
  ● Many-to-many authentication
    • users provide their identities when requesting services from machines on the network
    • naive solution: every server knows every password
    • inefficient: to change their password, a user must contact every server
    • insecure: compromising one server is enough to compromise all users
    • use a Key Distribution Center which assigns a key to any pair of parties who want to talk
    • trusted authentication service on the network
– replay attacks
  ● valid signature is copied and resent later
  ● repetition that cannot be logged/detected
  ● backward replay without modification
  ● countermeasures:
    • use sequence numbers (generally impractical)
    • timestamps (need synchronized clocks)
    • challenge/response (using unique nonce)
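The challenge/response countermeasure can be sketched with the HMAC construction from earlier in this section. Assumptions for this illustration: SHA-256 as the hash (64-byte block size), a pre-shared key between the two parties, and hypothetical helper names. The hand-written HMAC is only there to mirror the formula; in practice one would call Python's `hmac` module directly.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 64  # block size of SHA-256 in bytes

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC_K(M) = Hash[(K+ xor opad) || Hash[(K+ xor ipad) || M]]."""
    if len(key) > BLOCK_SIZE:                 # overly long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")      # K+: key padded to the block size
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()

shared_key = secrets.token_bytes(32)          # assumed to be pre-shared

def make_challenge() -> bytes:
    """Verifier picks a fresh, unpredictable nonce for every login attempt."""
    return secrets.token_bytes(16)

def respond(key: bytes, nonce: bytes) -> bytes:
    """Prover shows knowledge of the key by MACing the nonce."""
    return hmac_sha256(key, nonce)

def verify(key: bytes, nonce: bytes, response: bytes) -> bool:
    return hmac.compare_digest(hmac_sha256(key, nonce), response)

# First run: the attacker records the response.
nonce_1 = make_challenge()
recorded = respond(shared_key, nonce_1)
print(verify(shared_key, nonce_1, recorded))

# Replay: the verifier issues a new nonce, so the recorded response is useless.
nonce_2 = make_challenge()
print(verify(shared_key, nonce_2, recorded))
```

Unlike timestamps, this needs no synchronized clocks, at the cost of one extra message from the verifier to the prover.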

– Standards for using Public-key encryption
– Kerberos
  ● entity: user or system service
  ● Tickets: cryptographically sealed messages with session keys and identifiers, used to obtain a service
  ● Ticket-Granting Ticket (TGT): ticket to obtain other tickets
  ● Kerberos Key Distribution Center issues the TGT via extended Needham-Schroeder
  ● user requests service tickets from the Ticket Granting Service (TGS) via extended Needham-Schroeder
  ● each entity has a long-term key and negotiates session keys
  ● realm: identifies Kerberos server (different key databases)
  ● access a service in another realm:
    • get ticket from home-TGS for remote-realm TGS
    • get ticket for remote service from remote-realm TGS
  ● issues:
    • password dictionary attacks on client master key
    • ticket cache security
    • replay of authenticators: 5-minute lifetimes are long enough for replay
    • timestamps assume global, secure, synchronized clocks
    • challenge-response would be better
    • same user-server key used for all sessions
    • homebrewed PCBC mode of encryption (tries to combine integrity check with encryption)
    • extraneous double encryption of tickets
    • no ticket delegation: printer cannot fetch email from server on your behalf
  ● Kerberos version 5
    • better user-server authentication: separate subkey for each user-server session instead of re-using the session key contained in the ticket; authentication via subkeys, not timestamp increments
    • authentication forwarding: servers can access other services on the user's behalf
    • realm hierarchies for inter-realm authentication
    • richer ticket functionality
    • explicit integrity checking + standard CBC mode
    • multiple encryption schemes, not just DES
  ● practical uses of Kerberos
    • email, FTP, network file systems – transparent for the end user
    • standard authentication for Windows
    • local authentication: login and su in OpenBSD
    • authentication for network protocols: rlogin, rsh, telnet, AFS
    • secure windowing system: xdm, kx
– SSL
  ● originally designed by Netscape in 1993
  ● provides confidentiality, integrity, authentication
  ● available to all TCP applications via secure socket interface
  ● Toy SSL: a simple secure channel
    • handshake: Alice and Bob use their certificates and private keys to authenticate each other and exchange a shared secret
    • key derivation: use the shared secret to derive a set of keys
    • connection closure: special messages to securely close the connection
    • key derivation:
      ▪ different keys for message authentication code (MAC) and encryption
      ▪ four keys: also different keys for each direction
      ▪ keys derived via a key derivation function (KDF): takes the master secret and additional random data to generate the keys
      ▪ use a version of HMAC
    • data transfer:
      ▪ data broken up into a series of records
      ▪ each record carries a MAC
      ▪ each record has a length header
      ▪ the sequence number goes into the MAC computation (it is not sent as a separate field) and protects against reordering and replay of records
      ▪ session nonce against replay of a whole session
      ▪ type field: 0 for data, 1 for closure – against truncation attack
    • Toy SSL has no encryption algorithm negotiation
  ● most common symmetric ciphers:
    • DES
    • 3DES
    • RC2
    • RC4
  ● most common public-key algorithm: RSA
  ● Cipher suite:
    • public-key algorithm
    • symmetric encryption algorithm
    • MAC algorithm
    • client offers choice, server picks one
  ● Real SSL: Handshake [with client authentication]
    • – client generates random nonce –
    • ClientHello: random nonce, list of cipher suites, [sessionID]
    • – server generates random nonce –
    • ServerHello: chosen cipher suite, random nonce, [sessionID]
    • Server: Certificate (X.509)
    • – client verifies certificate –
    • [Server: CertificateRequest]
    • ServerHelloDone
    • [Client: Certificate]
    • – client generates random number: pre_master_secret –
    • ClientKeyExchange: pre_master_secret encrypted with the server's public key
    • [Client: CertificateVerify: signature over the previous handshake messages to prove that the client owns the certificate]
    • – both independently compute the master secret from pre_master_secret, use it to compute MAC and symmetric encryption keys –
    • Client: ChangeCipherSpec: everything from now on will be encrypted
    • Client: Finished: encrypted MAC of all handshake messages
    • Server: ChangeCipherSpec
    • Server: Finished: encrypted MAC of all handshake messages
    • the last 2 steps protect against tampering with the handshake
    • now both exchange application_data via the record protocol
    • the connection ends with close_notify, followed by TCP FIN
  ● Key derivation
    • client generates random pre_master_secret
    • encrypts pre_master_secret with the server's public key, sends it to the server
    • both feed client random, server random and pre_master_secret into a pseudo-random function → master secret
    • master secret, client and server random numbers go into another pseudo-random number generator → produces key block
    • key block sliced and diced: client MAC key, server MAC key, client encryption key, server encryption key, client initialization vector, server initialization vector
  ● SSL record protocol
    • data split into fragments (each at most 2^14 bytes = 16 KBytes)
    • append MAC (computed over the data and the sequence number, using the MAC key)
    • encrypt data and MAC
    • prepend record header (content type, version, length)
    • content type: application_data, Alert, Handshake, Change_cipher_spec
  ● SSL performance
    • server handshake: typically over half the SSL handshake CPU time goes to RSA decryption of the encrypted pre_master_secret
    • client handshake: public-key encryption is less expensive, the server is the bottleneck
    • data transfer: symmetric encryption and MAC calculation are not CPU-expensive
  ● Session resumption
    • full handshake is expensive in CPU time
    • if client and server have already communicated once, they skip the full handshake and proceed directly to data transfer
    • client sends session_id in ClientHello
    • server then agrees to resume in ServerHello
    • new key block computed from master_secret and the new client and server random numbers
  ● Client authentication: server sends CertificateRequest
  ● Technical attack:
    • shopping on an unencrypted page
    • click on „Checkout“
    • next page, downloaded without SSL, has a login link, which uses SSL
    • → an attacker can tamper with that unencrypted page (and thus with the login link)

03 Web insecurity

– Web attacker
  ● basic web security threat model
  ● capabilities:
    • operates a website on the reachable network
    • can obtain a TLS/SSL certificate for server
    • may operate other machines on own domain
    • user visits website
    • attacker.com different from honest domains
    • attacker has no access to user's machine
  ● variation: gadget attacker – produces a gadget that is included in otherwise honest mashups
– Analogy between operating system and web browser

                     Operating system               Web browser
  Primitives         system calls                   Document Object Model
                     processes                      frames
                     disk                           cookies / localStorage
  Principals         users                          origins
                     discretionary access control   mandatory access control
  Vulnerabilities    buffer overflow                cross-site scripting
                     root exploit                   universal scripting

– Browser:
  ● requests content (may involve requesting images, subframes...)
  ● receives it
  ● renders it
    • processes HTML and scripts (e.g. JavaScript)
    • displays the page
  ● responds to events
– Browser events:
  ● types:
    • mouse event
    • onLoad (rendering)
    • onMouseMove
    • onKeyPress
    • onUnLoad
    • timeout set by setTimeout()
  ● order of execution
    • if an element and one of its ancestors have an event handler for the same event, which one should fire first?
    • MS IE: bubble-up (innermost element first)
    • Netscape: window-down event capture (outermost element first)
    • W3C: specify event listener with event capture or event bubbling semantics
    • → different browsers use different methods
– Document object model (DOM)
  ● data structure of a website in HTML
  ● nested HTML tags → hierarchical structure of DOM
  ● contains objects with properties and methods
  ● examples:
    • Properties: document.alinkColor, document.URL, document.forms[ ], document.links[ ], document.anchors[ ]
    • Methods: document.write(document.referrer)
  ● can be manipulated using JavaScript
  ● e.g. createElement, createTextNode, appendChild, removeChild...
  ● other things such as clipboard contents are also part of the model
  ● stealing clipboard contents:
    • create hidden form, enter clipboard text, post form
– Browser Object Model (BOM)
  ● components of a browser
  ● Window object
  ● Frame object
  ● Document which contains DOM
  ● others: History, Location, Navigator
– Components of browser security policy
  ● Frame-frame relationships
    • canScript(A,B): frame A can execute a script that manipulates arbitrary/non-trivial DOM elements of frame B
    • canNavigate(A,B): frame A can change the origin of the content of frame B, i.e. can load another site in the frame
  ● Frame-principal relationships
    • readCookie(A,S)
    • writeCookie(A,S)
    • can frame A read/write cookies from site S?
  ● Security indicator (lock icon)
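The canScript(A,B) relation can be sketched as a comparison of origins. The sketch below assumes the common browser rule that an origin is the triple (scheme, host, port) and that a missing port means the scheme's default port; real browsers add further rules (e.g. document.domain) that are left out here. The helper names origin and can_script are made up for this example.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not name one explicitly (assumption:
# only http and https matter for this sketch).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def can_script(frame_a_url: str, frame_b_url: str) -> bool:
    """Frame A may script frame B only if both have the same origin."""
    return origin(frame_a_url) == origin(frame_b_url)


if __name__ == "__main__":
    print(can_script("http://example.com/a", "http://example.com:80/b"))
    print(can_script("http://example.com/", "https://example.com/"))
    print(can_script("http://example.com/", "http://www.example.com/"))
```

The path is deliberately ignored: two pages on the same server can script each other no matter where they live on that server.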

– Browser Same Origin Policy (SOP)
  ● web sites from different domains cannot interact except in very limited ways
  ● two origins are the same if domain name, port and protocol are equal
  ● applies to:
    • Cookies: cookie from origin A not visible to origin B
    • DOM: script from origin A cannot read or set properties for origin B
  ● example:
    • navigating a child frame is allowed
    • reading the child's source is not
  ● generally misunderstood
    • the term usually just refers to the canScript relation
  ● full policy of current browsers is complex
    • evolved via penetrate and patch
    • common scripting and cookie policies:
      ▪ canScript considers: scheme, host, port
      ▪ canReadCookie: scheme, host, path
      ▪ canWriteCookie: host
– Example for Same origin policy: JavaScript security model
  ● sandbox design: no direct file access, restricted network access
  ● frame can only read properties of documents and windows from the same origin: server, protocol, port
  ● however, this does not apply to:
    • a script loaded into the enclosing frame from an arbitrary site via <script src=…>: it runs with the origin of the including page
    • cross-site scripting (XSS):
      ▪ user clicks on a malicious link either in an e-mail or on a website
      ▪ victim server generates a page which contains the attacker's JavaScript code instead of a simple string
      ▪ client receives this page from the victim server
      ▪ client's browser executes the script as if it came from the victim server itself
      ▪ → attacker circumvents the same-origin policy
      ▪ e.g. client unwittingly sends the attacker his cookie for victim.com
      ▪ attacker can now impersonate the client and hijack his session
    • Example: Samy worm
      ▪ on Myspace.com, users can post HTML on their pages
      ▪ Myspace ensures that the HTML contains no JavaScript, but JavaScript is still possible inside CSS
      ▪ the worm infects anyone who visits an infected Myspace page and adds Samy as a friend
    • Example: vulnerability in the Adobe Acrobat Reader plugin <= 7.0.x
      ▪ attacker locates a PDF file hosted on website.com
      ▪ attacker creates a specially crafted URL pointing to the PDF, appended with some JavaScript malware in the fragment portion (example: http://website.com/path/to/file.pdf#s=javascript:alert("xss");)
      ▪ attacker entices a victim to click on the link
      ▪ if the victim has Adobe Acrobat Reader plugin 7.0.x or older (confirmed in Firefox and Internet Explorer), the JavaScript malware executes
      ▪ everything XSS has been shown to be capable of, including phishing w/ superbait, intranet hacking, web worms, history stealing etc., is now available to the attacker
      ▪ the URL can point not only to any PDF hosted on the web, but also to a default PDF stored on Windows systems → script runs in local context!
– Protection of the server:
  ● validate all headers, cookies and query strings against a rigorous specification of what should be allowed
    • never trust client input, only allow what you expect
    • remove or encode special characters
    • &lt; for <, &gt; for > …, only allow safe commands – no <script>
  ● UrlEncode
    • for untrusted input used in a URL (such as a value in the querystring)
    • example: the target of a „Click Here!“ link
  ● XmlEncode
    • for untrusted input used in XML output
    • example: [Untrusted input] as element content
  ● XmlAttributeEncode
    • for untrusted input used as an XML attribute value
    • example: attribute of an element with the content „Some Text“
– Encoding functions:
  ● PHP: htmlspecialchars(string)
    • htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
    • Outputs: &lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
  ● ASP.NET 1.1: Server.HtmlEncode(string), similar to PHP htmlspecialchars
  ● ASP.NET validateRequest (on by default) crashes page if