Lecture Notes on Stream Ciphers and RC4

Rick Wash
[email protected]

Abstract. In these notes I explain symmetric key additive keystream ciphers, using the cipher RC4 as an example. I discuss a number of attack models for this class of ciphers, using attacks on RC4 as examples. I cover a number of attacks on RC4, some of which are effective against implementations of RC4 used in the real world.

1 Introduction

Stream ciphers are a very important class of encryption algorithms. These notes explain what stream ciphers are, describe common subclasses of stream ciphers, and discuss the attack models relevant to stream ciphers. They also discuss the stream cipher RC4 in detail, using it as an example for discussing a number of different attacks.

2 Stream Ciphers

Symmetric key cryptosystems are an important type of modern cryptosystem. They are cryptosystems in which the same key is used for both encryption and decryption. This class of cryptosystems is important in modern cryptography because, in general, symmetric key cryptosystems are much faster than public key cryptosystems.

2.1 Block vs. Stream Ciphers

The two major types of symmetric key systems are block ciphers and stream ciphers. Block ciphers generally process the plaintext in relatively large blocks. The encryption function is the same for every block. A block cipher can be represented by a function $f$ that accepts a fixed-size block of plaintext and a key as input and outputs a block of ciphertext; for each fixed key, $f$ is a bijection on blocks. See Eq. 1.

$$f(p, k) = c \tag{1}$$

Stream ciphers, on the other hand, process plaintext in small blocks (sometimes as small as a single bit). In contrast to block ciphers, stream ciphers keep some sort of memory, or state, as they process the plaintext, and use this state as an input to the cipher algorithm. More specifically, a stream cipher is two functions, $f$ and $g$, given in Eq. 2.
$$\sigma_{t+1} = f(\sigma_t, p_t, k), \qquad c_t = g(\sigma_t, p_t, k) \tag{2}$$

$f$ is the next state function, which, given the current state, the next block of plaintext, and the key, produces a new state. $g$ is the output function, which, given the same three inputs, produces a block of ciphertext as output. Note that the next time $f$ and $g$ are called (at time $t + 1$), the state will be different.

2.2 Types of Stream Ciphers

In [1], an interesting distinction is made between two types of stream ciphers: synchronous stream ciphers and self-synchronizing stream ciphers. A synchronous stream cipher is a cipher in which the keystream is generated separately from the plaintext and is then combined with the plaintext to form the ciphertext. More specifically, a synchronous stream cipher is

$$\sigma_{t+1} = f(\sigma_t, k), \qquad z_t = g(\sigma_t, k), \qquad c_t = h(z_t, p_t) \tag{3}$$

where $f$ is the next state function, $g$ is the keystream output function, and $h$ is the function that combines the plaintext with the keystream to produce the ciphertext. Note that decryption only requires inverting $h$, since the receiver can regenerate the same keystream from the key.

The two functions $f$ and $g$ in Eq. 3 are together known as the keystream generator. The output of these two functions, the sequence of $z_t$ values, is known as the keystream. As such, synchronous stream ciphers are also known as keystream generator ciphers. This class of ciphers has the advantage that the keystream can be precomputed without knowledge of the plaintext or ciphertext.

A particularly popular subclass of keystream generator stream ciphers is the binary additive stream ciphers. In this class of ciphers, the $h$ function is the XOR function (represented by $\oplus$). This is the primary model of stream cipher analyzed in these notes.

3 Attack Models

Stream ciphers are generally studied with respect to a number of common attack models. These attack models are considered good models in which to study the security of stream ciphers, but they do not cover all possible attacks. Some of these models provide for more practical attacks than others do.
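To make the synchronous model of Eq. 3 concrete, the following is a minimal sketch of a binary additive stream cipher. It is not RC4: the counter state and the SHA-256-based output function are assumptions chosen only for illustration, and the names `next_state`, `keystream_byte`, `combine`, and `crypt` are hypothetical. The keystream unit here is one byte rather than one bit.

```python
import hashlib


def next_state(state: int, key: bytes) -> int:
    """f in Eq. 3: the next state function (here just a counter, a toy choice)."""
    return state + 1


def keystream_byte(state: int, key: bytes) -> int:
    """g in Eq. 3: derive one keystream byte z_t from the state and the key."""
    digest = hashlib.sha256(key + state.to_bytes(8, "big")).digest()
    return digest[0]


def combine(z: int, p: int) -> int:
    """h in Eq. 3: XOR, which makes the cipher binary additive."""
    return z ^ p


def crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR is its own inverse, so one routine does both."""
    state = 0
    out = bytearray()
    for byte in data:
        out.append(combine(keystream_byte(state, key), byte))
        state = next_state(state, key)
    return bytes(out)


if __name__ == "__main__":
    key = b"example key"
    ciphertext = crypt(key, b"attack at dawn")
    print(crypt(key, ciphertext))
```

Because neither `next_state` nor `keystream_byte` looks at the plaintext, the keystream could be computed in advance, which is the precomputation advantage noted for this class of ciphers.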
One thing to remember is that stream ciphers (and most encryption algorithms in general) do not provide message integrity. This must be provided by some external algorithm, such as a MAC (Message Authentication Code). It is possible to use the lack of message integrity checks to partially determine some of the plaintext, but that is beyond the scope of these notes.

A MAC is particularly important for binary additive stream ciphers. For this class of ciphers, flipping a bit of the ciphertext flips the corresponding bit of the plaintext, and affects only that one bit. An attacker can use this to change messages. A MAC will prevent this type of attack. This property, that a change in the ciphertext produces a known, predictable change in the plaintext, is called malleability.

3.1 Brute Force Key Search

The basic attack against any symmetric key cryptosystem is the brute force attack. In this attack, the attacker keeps guessing the key until they guess correctly. In general, one known plaintext, or the ability to recognize a correct plaintext, is all that is needed for this attack. However, all good cryptosystems should be designed such that this attack is impractical.

For a key of $n$ bits, a brute force search consists of trying all $2^n$ keys to see which one works. If it is possible to recognize the correct plaintext, then on average the correct key will be found in $2^{n-1}$ guesses. Although this attack works on all modern cryptosystems, its complexity grows exponentially with the key size, so choosing an appropriately large key size will provide the needed security. Experts suggest having at least a 90 bit key for security in today's world.

Time/Memory Tradeoffs. It is possible to speed up a brute force search by pre-computing some values and storing them in memory. This trades memory usage for the time it takes to perform the attack.
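The brute force search of Section 3.1 can be sketched as follows for a deliberately tiny key. This is only an illustration under stated assumptions: the 16-bit key space, the SHA-256-based `keystream` function, and the helper names `encrypt` and `brute_force` are all hypothetical and are not part of RC4 or of any real system. The attacker is assumed to hold one known plaintext and its ciphertext.

```python
import hashlib


def keystream(key: int, length: int) -> bytes:
    """Toy keystream generator keyed by a small integer (illustrative only)."""
    seed = key.to_bytes(4, "big")
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key: int, plaintext: bytes) -> bytes:
    """Binary additive encryption: XOR the plaintext with the keystream."""
    ks = keystream(key, len(plaintext))
    return bytes(p ^ z for p, z in zip(plaintext, ks))


def brute_force(known_plaintext: bytes, ciphertext: bytes, key_bits: int = 16):
    """Try every key in order; return (key, number of guesses made)."""
    # With a known plaintext, XOR recovers the keystream the key must produce.
    target = bytes(p ^ c for p, c in zip(known_plaintext, ciphertext))
    for guess in range(2 ** key_bits):
        if keystream(guess, len(target)) == target:
            return guess, guess + 1
    return None, 2 ** key_bits


if __name__ == "__main__":
    secret = 0x1234
    message = b"known plaintext"
    found, guesses = brute_force(message, encrypt(secret, message))
    print(found == secret, guesses)
```

Each extra key bit doubles the size of the loop in `brute_force`, which is the exponential growth that makes the attack impractical for adequately long keys.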
Sampling Resistance. In [2], an interesting property of stream ciphers is discussed in the context of time/memory tradeoffs for brute force attacks. This property, known as the sampling resistance, describes how easy it is to find keys that generate keystreams whose output has a certain property.

3.2 Real or Random Distinguishers

In a theoretical model, the output of a "good" keystream generator should appear random. If the output were truly random, then the cipher would be a one-time pad. However, since the output cannot be truly random, at some point in the keystream we should be able to distinguish between the real keystream generator and a truly random keystream generator. It follows that the more keystream output is needed to distinguish between the real cipher and random output, the closer that output is to being random, and the better the cipher.

Therefore, a common academic attack model for stream ciphers is the real or random distinguisher. In this model, the goal is to come up with a distinguisher. A distinguisher is a probabilistic polynomial-time algorithm $A$. $A$ takes as input $N$ bits of data, which are either from the real stream cipher or are completely random data. $A$ then has to, in polynomial time, output either "Real" or "Random". If $A$ is correct with probability non-negligibly better than the $\frac{1}{2}$ achieved by guessing, then it is a good distinguisher. The value of $N$ indicates how good the distinguisher is. In most cases, a larger $N$ will produce a more accurate distinguisher.

In [3], a distinction is made between two different types of distinguishers. The first type is what has already been discussed, and is called a polynomial-time distinguisher. The other type is known as a polynomial-space distinguisher. For this type of distinguisher, the attacker is given a black box which is either the cipher or a true random generator. This black box can be reset and rerun with a random key a polynomial number of times.
The goal is to distinguish between the two black boxes. According to [3], these two notions are equivalent from the information-theoretic viewpoint, though the difference in bias can be significant.

3.3 Key Weaknesses

There are also some more specific models that should be discussed, because they are directly relevant to the discussion of the security of RC4. These are related to the key. Specifically, the key completely determines the output sequence from a keystream generator. In a good keystream generator, each bit of the output will depend on the entire key for its value, and the relationship between the key and a given bit (or set of bits) should be extremely complicated.

Key-Biased Output. The first condition listed in Section 3.3 is that every bit of the output depends on the entire key for its value. This means that changing any single bit of the key should have a $\frac{1}{2}$ probability of changing each bit of the output. When this property holds, every possible key must be tried in order to brute force a key, and there is no exploitable relationship between bits of the key and the output. This means that the uncertainties of the individual bits of the key multiply when calculating the total number of possible keys.

Let us see what happens when this property does not hold. Assume that the first 8 bits of the output depend only on the first 8 bits of the key. This means that for a given value $q = k_1 \ldots k_8$ of the first 8 bits, all keys that have the value $q$ for their first 8 bits will produce the same first 8 bits of output.
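The failure case just described can be illustrated with a small sketch. The generator below is a hypothetical, deliberately flawed construction (it is not RC4 and does not model any real cipher): its first output byte is computed from the first key byte alone. The names `weak_keystream` and `first_key_byte_candidates` are assumptions made for this example.

```python
import hashlib


def weak_keystream(key: bytes, length: int) -> bytes:
    """Hypothetical flawed generator: output byte 0 depends only on key[0]."""
    first = hashlib.sha256(key[:1]).digest()[:1]
    rest = b""
    counter = 0
    while len(rest) < length - 1:
        # The remaining bytes depend on the whole key, as they should.
        rest += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return (first + rest)[:length]


def first_key_byte_candidates(observed_first_byte: int) -> list:
    """Search only the 2^8 values of q, ignoring every other key bit."""
    return [
        q for q in range(256)
        if hashlib.sha256(bytes([q])).digest()[0] == observed_first_byte
    ]


if __name__ == "__main__":
    key_a = bytes([0x41]) + b"rest of key one"
    key_b = bytes([0x41]) + b"a different tail"
    # Same q for the first 8 key bits gives the same first 8 output bits.
    print(weak_keystream(key_a, 8)[0] == weak_keystream(key_b, 8)[0])
    print(0x41 in first_key_byte_candidates(weak_keystream(key_a, 8)[0]))
```

In this toy setting the first 8 key bits can be narrowed down with at most 256 trials, independently of how long the rest of the key is.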
