Analysis and Improvement of Message Authentication Codes in Openssh

Volume xx (200y), Number z, pp. 1–7 Analysis and improvement of message authentication codes in OpenSSH Peter Valchev University of Calgary, Alberta, Canada Abstract In this paper, we will analyze the current MAC algorithms used in OpenSSH in light of recent attacks to HMAC functions. We will evaluate the suitability of UMAC as an alternative to the existing algorithms, in terms of both security and efficiency. Categories and Subject Descriptors (according to ACM CCS): D.4.6 [Security]: Authentication, Verification 1. Introduction struction briefly [BCK96a]. The HMAC functionality and security is based on a cryptographically secure hash func- A message authentication code (MAC) is a keyed crypto- tion. OpenSSH uses HMAC based on the MD5, SHA-1 graphic hash function designed to ensure data integrity (mes- and RIPEMD-160 hash functions. Recent results in cryp- sage has not been altered) and source authentication (mes- tographic research have presented theoretical limitations of sage originated from the purported sender). Suppose Alice HMAC [CY06, KBPH06, Bel06] that we will consider. We and Bob share a secret key. When Bob sends a message to will then assess the suitability of UMAC as an alternative. Alice, he computes the MAC of the message using the key, We will research the current state-of-the-art of the security of and sends the message and MAC to her using the commu- UMAC and the methods for implementing it in comparison nication channel. Alice then uses the shared key to compute to HMAC. In order to compare the performance of UMAC the MAC on the message received from Bob and compares with HMAC, we will implement UMAC in OpenSSH. We the computed MAC with the MAC received. If these match, will benchmark the performance of the UMAC implemen- then Alice is assured that the message was not altered and tation and compare it with the existing HMAC implementa- that it came from Bob, because only someone possessing tions, and time permitting, include benchmarking results on the shared key can compute a valid MAC for the message. the VIA C7 with hardware AES. MAC algorithms are widely used in Internet protocols (SSH, SSL/TLS, IPsec) for providing data integrity and source au- The remainder of this paper is organized as follows. thentication. Section 2 describes HMAC and the best attacks currently known. Section 3 describes UMAC with some of its se- OpenSSH is a free implementation of the SSH protocol curity proofs and briefly compares it to HMAC. Section 4 [Ope] which allows the establishment of a secure communi- deals with the implementation and benchmarking results in cation channel between two users. The first version of the OpenSSH. Section 5 concludes the paper. protocol did not use MAC algorithms [SSH]. The SSH-2 protocol, which has been in wide use since 1996, and has been proposed as an Internet standard in 2006, uses public 2. HMAC key cryptography to authenticate the remote user and pro- vides a secure and reliable way to exchange data using en- Originally, MACs were constructed from block ciphers. Peo- cryption and message authentication codes. ple started considering the construction of MACs from cryptographic hash functions for a variety of reasons. For one The goals of this project will be to assess the current MAC thing, they are much faster than block ciphers when imple- implementations in OpenSSH. Each of these algorithms are mented in software. Additionally, there were tight crypto- based on HMAC [Ope] and we will show the HMAC con- graphic export restrictions in the US and other countries at submitted to CPSC 503 Forum (3/2007). 2 Peter Valchev / MAC the time which affected block ciphers, but not hash func- Definition A message authentication code (MAC) is a functions. The HMAC (and related NMAC) design, first intro- tion which takes a secret key k and a message m as input, and duced in 1996 [BCK96a], describes a MAC scheme based returns an authentication tag MACk(m) as output. on cryptographic hash functions such as MD5 and SHA-1. In order for a cryptographic hash function to be useful These hash functions were not originally designed for mes- in the context of message authentication, it needs to have sage authentication, and it was studied how to turn them into the ability to incorporate a secret key. In their normally in- keyed primitives. There were various proposed designs to tended use, anyone can compute a hash of a message with- address the problems encountered, until a construction was out secrecy. The approach in [BCK96a] is to key these hash presented which was backed by a rigorous security analy- functions through modifying their initial vector (IV) which sis [BCK96a]. is usually fixed. The modified IV is kept secret and becomes The resulting NMAC and HMAC schemes could utilize the key. any cryptographic hash function as a “black box” and had Definition Assuming K is the secret key, h is the crypto- many attractive features. The results showed that the secu- graphic hash function and m is the message to be authenti- rity of these constructs were directly related to the secu- cated, HMAC is defined as: rity of the underlying hash function in use. It was proven that if any significant weaknesses are found in these MAC HMACK(m) = h((K ⊕ opad)||h((K ⊕ ipad)||m)), schemes, not only does the underlying hash function need to where ⊕ is XOR, || is the concatenation symbol and be dropped from these particular usages, but it also must be ipad,opad are padding constants. The m above is one block dropped from all other uses. long. The message can have an arbitrary length, and it is split up in blocks of fixed size. 2.1. MACs based on Hash Functions (HMAC) In order to construct MACs, we will first explain what a hash 2.2. Security properties function is. In the context of a MAC, security usually exclusively refers Definition A hash function is a function h which has, as a to resistance to forgery, the ability of an attacker to produce minimum, the following properties: a new message and to compute a correct authentication tag for it under the secret key. 1. compression - H maps an input x of arbitrary length to an output of fixed length n. The adversary may see a sequence 2. it is easy to compute - given h and an input x, h(x) is easy (m1,a1),(m2,a2),...,(mq,aq) of pairs of messages and to compute. their corresponding tags (ai = MACk(mi)) transmitted between the communicating parties. They break the MAC if This definition implies an unkeyed hash func- they can find a new message m together with its correspond- tion. [MvOV96] ing valid tag a = MACk(m), given that they do not possess Hash functions are many-to-one by definition, as they the key k. When the adversary has no way of influencing compress an arbitrarily large input to a small, fixed-size out- the messages exchanged by the parties, but is simply put. In order to be useful for cryptographic uses, they need eavesdropping on the wire, this is called a known message to have some additional properties. attack. In some cases they can choose the sequence of messages m1,...,mq, and it is then called a chosen message Definition Cryptographic hash functions are hash functions attack. designed with some additional goals [BCK96a]. Primarily, they need to be collision resistant: if our hash function is There are some generic attacks that apply to MAC algo- h, then it should be infeasible for an adversary to find two rithms that need to be considered. Most simply, an adversary unique strings m,m0 such that h(m) = h(m0). can attempt a naive exhaustive search (brute force) on all possible keys - this will require O(2k) operations for a k-bit As we mentioned, one of main functions of a hash is to key. The problem is that verifying such an attack (determin- compress. Thus we define a compression function. ing whether the MAC is correct) requires a total of (k,m) pairs of (key,MAC), where the hash function has an m-bit Definition A compression function processes short fixed- output. If m < k, the adversary may prefer to simply guess length inputs, and is iterated in a particular way (in an iter- the MAC corresponding to a chosen message, without the ated hash function) to hash arbitrarily long inputs. If we de- need of recovering the key - the probability of success then note the compression function as f , it will accept two inputs: is 1/2m, but this attack is not verifiable. The birthday attack a chaining variable (length l) and a block of data (length b), is discussed in [MvOV96]. It is named after a classic prob- and produces output of length l [BCK96a]. ability problem, in which if we have 23 people in a room, And now we formally define what a message authentica- the probability that at least 2 of them have the same birth- tion code is. day is ≈ 0.507, which is surprisingly large. For an m − bit submitted to CPSC 503 Forum (3/2007). Peter Valchev / MAC 3 tag, this attack allows us to forge given the ability to per- this last step, the algorithm used is HMAC [KBPH06]. By form about 2m/2 MAC queries, which is a big improvement the birthday paradox, this attack requires O(2l/2) messages. over the brute force attack. When considering other attacks Any further distinguishing and forgery attacks we consider on MAC functions, we can only consider them practical if must have better complexity than this, in order for us to con- they do better than the birthday attack.

Load more