CS 4005-705-01 I Prof. Alan Kaminsky Graduate Project: Team: “The Number Crunchers” Members: Sharif Hdairis Andrew Hoffman Nelson Powell

Introduction

Cryptography has progressed significantly from the days of substitution and affine ciphers to the modern block and stream ciphers based on higher-order mathematics. This course provides an introduction to the fundamentals of cryptography as it applies to the field of security via the examination of both cryptography and cryptanalysis.

This project is an empirical investigation of the statistical attributes of stream ciphers, specifically the randomness of the keystream produced by the published Rabbit cipher. The Rabbit Cipher algorithm is a stream cipher utilizing a 128-bit secret key with a 64-bit Initialization Vector (IV) [1][2]. The Rabbit Cipher efficiently encrypts 128 bits per iteration of the algorithm in a synchronous manner to provide an effective ciphered bit stream.

The cipher was implemented in Java with the intent to utilize the Parallel Java Library [3] provided by Dr. Alan Kaminsky. Using Object Oriented Design (OOD), the components of the cipher were abstracted in an effort to maximize the utility of fundamental components as well as provide for a possible context-based environment. The fundamental components were validated against test sets [2][4] to ensure functional compliance.

The statistical test suite TestU01 was used to assess the randomness of the cipher’s keystream. Since the Rabbit cipher uses four iterations to mix the key followed by four iterations to mix the IV, the analysis examines the effects on randomness within the keystream with respect to the number of initialization rounds. The “Crush” battery of tests outlined in TestU01 was used to check the randomness claimed by the original authors.


Table of Contents

Introduction
Rabbit Algorithm
Mathematical Syntax
Component Breakdown
Algorithm Breakdown
Functional Component Descriptions
Key Expansion
Key Expansion
Counter ReInitialization
IV Expansion
Next State Function
Carry Bit Resolution
Counter Iteration
State Iteration
Encryption
KeyStream Generation
Key Stream and Data Combining
TestU01 Test Suite
Random Data Generator
Test Results
Results Analysis
Comparison to Other Analyses
Conclusion
Project Documentation
Configuration Management
Wiki Environment
Project Environment
Build Processes
User’s Manual
Project Discovery
Future Research
Statement of Work
References


Rabbit Algorithm

The Rabbit Cipher, as described by the authors in [1][2][5], is a stream cipher utilizing chaos theory as a method of injecting nonlinearity into the algorithm. In their follow-on documentation, the authors include a cryptanalysis that provides evidence of the resilience of Rabbit against algebraic, distinguishing, and brute-force attacks, as well as statistical analysis [1][6][7][8]. Additional authors have focused on distinguishing or bias attacks as a method to approach the brute-force limit of $2^{128}$ [9][10], while others have attempted different attacks, such as fault analysis [11].

Mathematical Syntax

To describe the Rabbit Cipher algorithm, a standard mathematical notation must be established in order to familiarize the reader with the mathematical expressions within. The addition sign “+” will be used for addition, and assumes that the register used to perform the addition has enough bits to hold the result without rollover. If the addition is performed with a modulus, the modulus will be stated explicitly in the expression.

Bit rotations and shifts extend the shift notation of ANSI C: ‘<<<’ and ‘>>>’ represent logical rotations left and right, respectively, and ‘<<’ and ‘>>’ represent logical shifts left and right, respectively. The symbol $\|$ is used for the logical concatenation of two bit fields. The symbol $\oplus$ will be used for the logical bitwise exclusive OR (XOR) function between two bit fields.

The Rabbit cipher uses registers of the same size and function throughout the standard. Furthermore, each register has state with respect to time. To describe a set of registers over a period of time, the following notation is used: $X_{j,i}$, where j describes the register index number and i provides the iteration with respect to time.

When referencing a bit field within a variable, the form X[m...n] describes the bits “m” through “n” of the register “X”, where m > n, ‘m’ is considered the MSB, and ‘n’ is the LSB. If the bit field describes a register in an array of registers over the course of numerous iterations, the variable is described as a combination of the bit field and the standard register representation, such as $X_{j,i}[m \ldots n]$.

Finally, to describe numbers in varying numerical systems, decimal numbers will be written with standard Arabic values, binary numbers will be succeeded by a ‘b’, such as 100110b for example, and hexadecimal numbers will be preceded by a ‘0x’ value such as 0x1F823C.
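The notation above maps onto Java integer operations fairly directly. The following is a minimal sketch, not taken from the project source; the class and method names are hypothetical. One point to watch is that Java's own `>>>` operator is a logical shift right, not the rotation that ‘>>>’ denotes in this report, so rotations go through `Integer.rotateLeft`.

```java
final class BitNotation {
    /** Rotate left by n bits: the report's '<<<' (hypothetical helper name). */
    public static int rotl(int x, int n) {
        return Integer.rotateLeft(x, n);
    }

    /** Extract the bit field X[m...n], where m is the MSB index and n the LSB index. */
    public static int bits(int x, int m, int n) {
        int width = m - n + 1;
        int mask = (width >= 32) ? -1 : (1 << width) - 1;
        return (x >>> n) & mask; // Java '>>>' is a logical shift here, not a rotation
    }

    /** Concatenate two 16-bit fields, hi || lo, into one 32-bit register. */
    public static int concat16(int hi, int lo) {
        return (hi << 16) | (lo & 0xFFFF);
    }

    public static void main(String[] args) {
        System.out.printf("0x%08X%n", rotl(0x80000001, 1));      // 0x00000003
        System.out.printf("0x%02X%n", bits(0x1F823C, 15, 8));    // 0x82
        System.out.printf("0x%08X%n", concat16(0x1234, 0xABCD)); // 0x1234ABCD
    }
}
```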

Component Breakdown

The Rabbit Cipher has a total of 513 bits of state information at any given iteration throughout the cipher and decipher processes. The 513 bits of data are broken into eight 32-bit state registers, eight 32-bit counter registers, and a single carry bit. Depending on implementation, one may choose to store more data to avoid extra processing, or maintain all other data items as volatile memory items for code compactness.

The eight 32-bit state registers are called the x-registers, or $X_{j,i}$ where $0 \le j \le 7$ and i represents the iteration counter. Similar to the x-registers, the counter registers, or $C_{j,i}$ where $0 \le j \le 7$, are 32-bit registers coupled with the state registers throughout the process of the algorithm. There is only one carry bit, but this does not detract from the possibility of one creating a data structure in which the carry is also associated with the state and counter registers.

It is important to realize that any given instance of the cipher, especially in object oriented designs, must maintain its own instance of the 513 bits of data. Also, any given instance always begins with the ‘i’ value at 0 (i = 0). The usage and derivation of the single carry bit will be explained later. To avoid future confusion, it is important to note that the carry bit is set to 0 at i = -1, and that the carry bit is the only data item that has a notion of i = -1 at the start of a session. Keeping this in mind during the description of the carry bit will resolve many misconceptions about the use of the carry bit throughout the implementation of the cipher.

Though Boesgaard et al. [7] suggest that only 513 bits of non-volatile memory are required for an implementation of the Rabbit Cipher, this team suggests also storing an additional 128 bits of keystream in a buffer for fast access to the keystream. Failure to store the keystream between encryption of bits, bytes, or blocks would require additional memory to store the previously used location in the keystream as well as special processing to recover the new keystream information. Though location information requires less memory, the coding effort outweighs the extra memory required to store the current keystream information.

Algorithm Breakdown

The Rabbit Cipher can be broken into four fundamental phases of operation as seen in Figure 1: Key and IV Selection/Insertion, Key Initialization, IV Initialization, and Encryption. The Key and IV selection is any method in which an implementation chooses to obtain a secret key and a public IV for the current context to be encrypted or decrypted. This process is outside of the scope of the algorithm itself, and is therefore left to the reader to determine the best business practices for Key management.

Figure 1 - Rabbit Cipher Process Overview

The second phase of the Rabbit cipher is the Key Setup phase. In this phase, the Key is applied to both the internal State Registers (X) and Counter Registers (C). As seen in Figure 2, the Key bits are first distributed across both the state and counter registers to give initial values for both $X_{j,0}$ and $C_{j,0}$. The NextState() function, also known as the Iterate() function, is called four times to scramble the initial values of X and increment the counters. These four iterations account for the register values $X_{j,1}$ and $C_{j,1}$ through $X_{j,4}$ and $C_{j,4}$.


Finally, the eight counters are re-initialized by XORing the state registers with the counters. It is important to note that $C_{j,4}$ is merely replaced with the result of this XOR process, and the ‘i’ value is not incremented during counter re-initialization. According to [5], re-initializing the counters prevents an attacker from reversing the counter iteration process and thereby recovering the initial Key value.

Figure 2 - Secret KEY Setup Process

The third process, as seen in Figure 3, is known as the IV Setup process. The IV Setup process is a method of further obfuscating the Secret Key by modifying the eight counter registers, distributively XORing every bit in the IV with every bit in the counters. This ensures that there are $2^{64}$ possible unique keystreams for any given Secret Key.

One should be aware that the combining of the IV with the counters, much like the Counter Re-Initialization of the Key Setup process, does not constitute an iteration in the system. The IV is combined with the counter registers at iteration 4, where $i = 4$.

After combining the IV with the counters, the IV Setup process ensures the mixing of the IV bits by again processing four iterations of the NextState() function. The iterations will cause a mixing of the IV bits between the counters and the state registers. These four iterations account for the register values $X_{j,5}$ and $C_{j,5}$ through $X_{j,8}$ and $C_{j,8}$.


Figure 3 - IV Setup Process

The final process in the cipher is the Encryption Cycle. There are two distinct sub-phases within the encryption process, the NextState() or iteration function, and the KeyStream generation. As seen in Figure 4, the NextState() function can be further broken into two subprocesses, Counter Iteration and State Iteration.

Upon entry to the encryption process, the counter registers must be iterated once. The counter registers are updated by combining their current values with a hard-coded constant and the value of the carry bit. Unlike most processes within the Rabbit cipher, counter iteration must be performed sequentially, as the carry bit is recalculated after each counter register, and the new value of the carry is used in the processing of the next counter register.

After the eight counter registers are updated to the $i+1$ state, the state registers are calculated in the State Register Iteration phase. Each state register and associated counter register pair is used to generate a G value $G_{j,i}$, defined as a function of $X_{j,i}$ and $C_{j,i+1}$. Once the G values are calculated, each state register is updated as a function of various G values. This allows each state register to be a function of the previous state and next counter as well as two other states and their associated counters.

After iterating the counters and states, the final phase of encryption is the KeyStream generation. Though the KeyStream generation is conceptually part of the encryption process, it is not part of the NextState() function, and does not need to be executed when the NextState() function is called from either the Key Setup or IV Setup processes.


Figure 4 - Encryption Process

The KeyStream generation can be confused for some type of dynamic S-Box generation, but is in reality a simple XOR function between different state registers to create eight 16-bit keystream registers. The output of this phase is 128 bits of pseudorandom keystream that is XORed with the bits of the plaintext data stream.

Functional Component Descriptions

Each of the major processes in the Algorithm Breakdown of the previous section identified unique code blocks that can be described separately in greater detail. This section will provide the detailed algorithms and diagrams to expand upon each functional block.

Key Expansion

The Key Setup process is composed of two functional blocks, the Key Expansion and the Counter ReInitialization. Although the NextState() function is called four times between these two blocks, its description is deferred to a later section.

Key Expansion

The Secret Key is expanded into both the State and Counter registers using concatenation of various sections of the key. The key is first broken into eight subkeys $k_y$ where $0 \le y \le 7$. Each $k_y$ is composed of a 16-bit portion of the original key K such that $k_y = K[16y+15 \ldots 16y]$. Figure 5 displays the breakdown of the Secret Key into Subkey registers.


Figure 5 - Subkey Creation

After the subkey registers are created, the State and Counter registers are initialized using a concatenation of two subkey registers per state and counter register. When concatenating subkeys to initialize a state or counter, there are two methods based on the value of j. If j is even the first pattern is used, and the second pattern is used when j is odd. Furthermore, the initialization of the counter uses the two patterns in the opposite manner based on j being odd or even, as well as reverses the order of the concatenation.

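The state registers are initialized using Equation 1 and the counter registers are initialized using Equation 2, where $\|$ denotes concatenation.

$$X_{j,0} = \begin{cases} k_{(j+1) \bmod 8} \,\|\, k_j & \text{for } j \text{ even} \\ k_{(j+5) \bmod 8} \,\|\, k_{(j+4) \bmod 8} & \text{for } j \text{ odd} \end{cases}$$

Equation 1 - State Register Initialization

$$C_{j,0} = \begin{cases} k_{(j+4) \bmod 8} \,\|\, k_{(j+5) \bmod 8} & \text{for } j \text{ even} \\ k_j \,\|\, k_{(j+1) \bmod 8} & \text{for } j \text{ odd} \end{cases}$$

Equation 2 - Counter Register Initialization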

Figure 6 is an example block diagram of the State Register initialization. This figure focuses on the first four state registers to provide a clear example of the initialization process. By reducing the complexity of the image, the diagram clarifies the distribution of the subkeys using the indexing (mod 8), as seen in the initialization of register $X_{3,0}$. Since j = 3, the upper 16 bits of the register are initialized by the $k_0$ subkey. Likewise, Figure 7 provides the block diagram for the Counter Register initialization. Note that the order of the 16-bit words is reversed, and the pattern selected per register is reversed based on j.

Figure 6 - Loading the State Registers

Figure 7 - Loading the Counter Registers
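The loading patterns in Figures 6 and 7 can be sketched as follows. This is an illustrative sketch rather than the project's CryptoState code: the class and method names are hypothetical, and the subkeys are assumed to be already split into an array `k[0..7]` of 16-bit values with `k[y] = K[16y+15...16y]`, following the published Rabbit specification [1][2].

```java
final class KeyExpansionSketch {
    /** hi || lo for two 16-bit subkeys. */
    private static int concat(int hi, int lo) {
        return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF);
    }

    /** State registers X[j] at i = 0, built from subkeys k[0..7]. */
    public static int[] initState(int[] k) {
        int[] x = new int[8];
        for (int j = 0; j < 8; j++) {
            x[j] = (j % 2 == 0)
                    ? concat(k[(j + 1) % 8], k[j])            // even j
                    : concat(k[(j + 5) % 8], k[(j + 4) % 8]); // odd j
        }
        return x;
    }

    /** Counter registers C[j] at i = 0: patterns swapped and word order reversed. */
    public static int[] initCounters(int[] k) {
        int[] c = new int[8];
        for (int j = 0; j < 8; j++) {
            c[j] = (j % 2 == 0)
                    ? concat(k[(j + 4) % 8], k[(j + 5) % 8])  // even j
                    : concat(k[j], k[(j + 1) % 8]);           // odd j
        }
        return c;
    }

    public static void main(String[] args) {
        int[] k = {0, 1, 2, 3, 4, 5, 6, 7}; // toy subkeys, chosen so each is recognizable
        int[] x = initState(k);
        int[] c = initCounters(k);
        System.out.printf("X3 = 0x%08X, C3 = 0x%08X%n", x[3], c[3]);
    }
}
```

With the toy subkeys above, the upper half of X3 comes from subkey 0 and the lower half from subkey 7, which matches the j = 3 case highlighted for Figure 6.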

Counter ReInitialization

The process known as Counter ReInitialization is not a true initialization in the sense that the counter registers are not overwritten with a fresh value. Rather, the current counter registers, as of iteration i = 4, are updated by combining the current value with one of the current state registers. Equation 3 provides the algorithm used to combine the appropriate State register with the corresponding Counter register. Figure 8 provides the block diagram of this process as a visual aid.

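$$C_{j,4} = C_{j,4} \oplus X_{(j+4) \bmod 8,\,4}, \quad 0 \le j \le 7$$

Equation 3 - Counter Register ReInitialization (where $\oplus$ denotes bitwise XOR)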

Figure 8 - Re-Initializing Counter Registers
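As a minimal sketch of this step, assuming the pairing used in the published Rabbit specification (counter j is combined with state register (j + 4) mod 8), the update is a single in-place XOR loop. The class and method names are hypothetical.

```java
final class CounterReinitSketch {
    /** In-place update: C[j] = C[j] XOR X[(j + 4) mod 8]; the iteration index i is unchanged. */
    public static void reinitCounters(int[] x, int[] c) {
        for (int j = 0; j < 8; j++) {
            c[j] ^= x[(j + 4) % 8];
        }
    }

    public static void main(String[] args) {
        int[] x = {0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17}; // toy state values
        int[] c = new int[8];                                       // toy counters, all zero
        reinitCounters(x, c);
        System.out.printf("C0 = 0x%02X, C5 = 0x%02X%n", c[0], c[5]);
    }
}
```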

IV Expansion

The IV Expansion process is the method by which the 64-bit IV effects change on all 256 bits of the Counter registers. To begin, the IV is broken into four 16-bit IV registers represented by $IV_y$ where $0 \le y \le 3$. Next, the 16-bit IV registers are concatenated into four 32-bit fields which are XORed with the counter registers. Much like the re-initialization phase of the Key Setup process, the XOR of the IV and the Counters does not increment the iteration. The results of the process merely update the values of the Counter registers for iteration 4.

Figure 9 - Reordering the IV Bits

Figure 9 provides a block diagram overview of the process of separating the IV bits into the four 16-bit registers, and then logically concatenating them into four 32-bit register values. Next, Figure 10 provides the process by which each of the four 32-bit reordered IV registers are XOR’ed with two of the Counter registers to assert a new value.


Figure 10 - IV Bits XORed with the Counter Registers
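The reordering and XOR steps of Figures 9 and 10 can be sketched as below. The field pairings follow the published Rabbit specification [1][2]; the class and method names, and the choice to pass the IV as a Java `long` with IV[63...0] in natural bit order, are assumptions made for illustration.

```java
final class IvExpansionSketch {
    /** XOR the reordered 64-bit IV into the eight counter registers, in place. */
    public static void applyIv(int[] c, long iv) {
        int iv0 = (int) (iv & 0xFFFF);          // IV[15...0]
        int iv1 = (int) ((iv >>> 16) & 0xFFFF); // IV[31...16]
        int iv2 = (int) ((iv >>> 32) & 0xFFFF); // IV[47...32]
        int iv3 = (int) ((iv >>> 48) & 0xFFFF); // IV[63...48]

        int[] r = {
            (iv1 << 16) | iv0, // IV[31...0],               used for C0 and C4
            (iv3 << 16) | iv1, // IV[63...48] || IV[31...16], used for C1 and C5
            (iv3 << 16) | iv2, // IV[63...32],              used for C2 and C6
            (iv2 << 16) | iv0  // IV[47...32] || IV[15...0],  used for C3 and C7
        };
        for (int j = 0; j < 8; j++) {
            c[j] ^= r[j % 4]; // each reordered field touches two counters
        }
    }

    public static void main(String[] args) {
        int[] c = new int[8];
        applyIv(c, 0x3333222211110000L); // toy IV with recognizable 16-bit fields
        for (int j = 0; j < 8; j++) {
            System.out.printf("C%d = 0x%08X%n", j, c[j]);
        }
    }
}
```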

After the IV affects the counter registers, the NextState function is called four times. This function, however, is out of scope for the current functional block, and is left to the next section for detailed description.

Next State Function

As previously described, the Next State function is broken into three primary components: the Carry Bit Resolution, the Counter Iteration, and the State Iteration.

Carry Bit Resolution

The Carry bit, referred to as Phi or $\phi$, is a single bit that is modified for each counter register update. By using the carry bit to shift data from one counter to the next in a circular pattern, every prior state bit has an impact on all other future state bits. This is in alignment with the Chaos Theory from which the cipher was formed.

A noteworthy behavior is that the Carry bit is initialized to 0 for all instances of the cipher. Though $\phi_{j,i}$ is a function of both i and j, where $0 \le j \le 7$, $\phi_{7,i}$ is a special case in the algorithm: the first counter update requires a value for $i = -1$, which is out of the valid range for i. Therefore, the default value of 0 is used in the first instance of the carry bit. The algorithm for the derivation of $\phi$ is given in Equation 4. In simplest terms, if the aggregate of the terms that compose the next counter register value is greater than or equal to $2^{32}$, then the carry is set.

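$$\phi_{j,i+1} = \begin{cases} 1 & \text{if } j = 0 \text{ and } C_{0,i} + \alpha_0 + \phi_{7,i} \ge 2^{32} \\ 1 & \text{if } j > 0 \text{ and } C_{j,i} + \alpha_j + \phi_{j-1,i+1} \ge 2^{32} \\ 0 & \text{otherwise} \end{cases}$$

Equation 4 - Calculating Phi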

Since the first iteration starts with $i = 0$, there is technically no $\phi_{7,i}$ for i = -1, which is the impetus for the default value of 0.

Counter Iteration

The Counter registers are updated sequentially, accompanied by the update of $\phi$ as described in the previous section. Each Counter is updated using its current value, a hardcoded alpha register value, and the $\phi$ previously derived from the last counter register updated. Equation 5 provides the Counter Iteration equations. Notice that for all $C_{0,i+1}$ calculated, $\phi_{7,i}$ is used as the carry bit.

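$$C_{0,i+1} = (C_{0,i} + \alpha_0 + \phi_{7,i}) \bmod 2^{32}$$

$$C_{j,i+1} = (C_{j,i} + \alpha_j + \phi_{j-1,i+1}) \bmod 2^{32}, \quad 1 \le j \le 7$$

Equation 5 - Counter Iteration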

After each counter is calculated, a new value for $\phi$ is generated for use in the next counter register calculation. In each equation, an alpha value is hardcoded in the calculation of the counter. The values used for alpha are provided in Equation 6.

$\alpha_0 = \alpha_3 = \alpha_6 =$ 0x4D34D34D
$\alpha_1 = \alpha_4 = \alpha_7 =$ 0xD34D34D3
$\alpha_2 = \alpha_5 =$ 0x34D34D34

Equation 6 - Alpha Constants for Counter Register Updates
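The sequential counter update and carry propagation can be sketched in a few lines. This is an illustrative sketch with hypothetical names, not the project's implementation; it assumes the assignment of the three alpha constants to the eight counters given in the published Rabbit specification [1][2], and it widens each sum to 64 bits so that the carry can be read from bit 32.

```java
final class CounterIterationSketch {
    // Alpha constants per counter index j = 0..7 (assignment per the Rabbit specification).
    private static final int[] ALPHA = {
        0x4D34D34D, 0xD34D34D3, 0x34D34D34, 0x4D34D34D,
        0xD34D34D3, 0x34D34D34, 0x4D34D34D, 0xD34D34D3
    };

    /**
     * Updates the eight counters in place, in order, threading the carry through.
     * carryIn is the carry left by the previous iteration (0 at the start of a session).
     * Returns the carry left after counter 7, to be saved for the next iteration.
     */
    public static int iterate(int[] c, int carryIn) {
        int carry = carryIn;
        for (int j = 0; j < 8; j++) {
            long sum = (c[j] & 0xFFFFFFFFL) + (ALPHA[j] & 0xFFFFFFFFL) + carry;
            c[j] = (int) sum;           // keep the low 32 bits (mod 2^32)
            carry = (int) (sum >>> 32); // 1 if the sum reached 2^32, else 0
        }
        return carry;
    }

    public static void main(String[] args) {
        int[] c = new int[8];
        int carry = iterate(c, 0);
        System.out.printf("C0 = 0x%08X, carry = %d%n", c[0], carry);
    }
}
```

The loop cannot be parallelized across j because each counter needs the carry produced by the one before it.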

State Iteration

At the core of this encryption algorithm is the internal state iteration. The 32-bit X values are calculated through a process of rotating and adding the previously calculated G values. G values are calculated using the formula in Equation 7. The 32-bit G values are calculated using the previous X values and the current counter values. First the X value is added to the C value and the sum is squared, creating a 64-bit number which is XORed with a copy of itself shifted right by 32 bits. The result is reduced modulo $2^{32}$, keeping only the low 32 bits.

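$$G_{j,i} = \left( (X_{j,i} + C_{j,i+1})^2 \oplus \left( (X_{j,i} + C_{j,i+1})^2 \gg 32 \right) \right) \bmod 2^{32}$$

where the inner sum $X_{j,i} + C_{j,i+1}$ is taken modulo $2^{32}$.

Equation 7 - G Value Creation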

The G values are temporary values used to calculate the new state values. To create these new state values, two different equations are used; both are shown in Equation 8. Which equation is used to calculate the new X value depends on whether the register index j is odd or even.

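$$X_{j,i+1} = \begin{cases} G_{j,i} + (G_{(j-1) \bmod 8,\,i} \lll 16) + (G_{(j-2) \bmod 8,\,i} \lll 16) & \text{for } j \text{ even} \\ G_{j,i} + (G_{(j-1) \bmod 8,\,i} \lll 8) + G_{(j-2) \bmod 8,\,i} & \text{for } j \text{ odd} \end{cases}$$

where all additions are modulo $2^{32}$.

Equation 8 - New State Value Calculation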

Each new X value is calculated by the addition of three different G values. When calculating an even X value, both the second and third G values are rotated left by 16 bits, but when calculating an odd X value, only the second G value is rotated, and it is rotated left by 8 bits.

Figure 11 - New States from G Values
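A compact sketch of the G function and the state update follows. It is written against the published Rabbit specification [1][2] rather than the project's RabbitCipher class, and the names are hypothetical. The counters passed in are assumed to have already been iterated for this round.

```java
final class StateIterationSketch {
    /** G value: square the 32-bit sum x + c as a 64-bit number, then XOR its two halves. */
    public static int g(int x, int c) {
        long sum = (x + c) & 0xFFFFFFFFL; // (x + c) mod 2^32, as an unsigned value
        long sq = sum * sum;              // 64-bit square (bit pattern is exact mod 2^64)
        return (int) (sq ^ (sq >>> 32));  // low word XOR high word
    }

    /** Computes the next eight state registers from the current state and updated counters. */
    public static int[] nextState(int[] x, int[] c) {
        int[] gv = new int[8];
        for (int j = 0; j < 8; j++) {
            gv[j] = g(x[j], c[j]);
        }
        int[] next = new int[8];
        for (int j = 0; j < 8; j++) {
            int g1 = gv[(j + 7) % 8]; // G value of register j - 1 (mod 8)
            int g2 = gv[(j + 6) % 8]; // G value of register j - 2 (mod 8)
            next[j] = (j % 2 == 0)
                    ? gv[j] + Integer.rotateLeft(g1, 16) + Integer.rotateLeft(g2, 16)
                    : gv[j] + Integer.rotateLeft(g1, 8) + g2;
            // Java int addition wraps, which gives the required mod 2^32 behavior.
        }
        return next;
    }

    public static void main(String[] args) {
        int[] x = {1, 0, 0, 0, 0, 0, 0, 0}; // toy state: a single set bit
        int[] c = new int[8];               // toy counters, all zero
        int[] next = nextState(x, c);
        for (int j = 0; j < 8; j++) {
            System.out.printf("X%d = 0x%08X%n", j, next[j]);
        }
    }
}
```

Running the toy input shows a single set bit in X0 spreading into X0, X1, and X2 after one iteration, which is the diffusion that Figure 11 depicts.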

Using this method to calculate new state values guarantees that each new state will be affected by at least six previous State and Count values. This, in combination with the key setup scheme, guarantees that in each iteration every state variable has been affected by each key bit [1].

Encryption

KeyStream Generation

The main function behind the operation of a stream cipher is its ability to create a random sequence of bits called the “Key Stream.” A keystream is a pseudorandomly generated string of bits of arbitrary length. The strength of a stream cipher is directly related to the randomness of its generated keystream. The operating philosophy behind stream ciphers is that they generate a keystream equal in length to the plaintext data. The Rabbit cipher creates this keystream output in 128-bit blocks, but only uses the bits as needed. The process of creating each 128-bit block is performed after each internal iteration. Extraction is done by XORing the upper and lower halves of two separate X values, creating a 16-bit keystream sub-block. This process is repeated for all X values, then the results are concatenated into the output 128-bit keystream.


Figure 12: Keystream Generation Process
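The extraction step can be sketched as eight XORs of 16-bit halves. The register pairings below are taken from the published Rabbit specification [1][2]; the class and method names, and the choice to return the eight sub-blocks as an array with `s[0]` holding keystream bits 15...0, are illustrative assumptions. How the sub-blocks are serialized to bytes is left out, since that depends on the implementation's byte ordering.

```java
final class KeystreamExtractSketch {
    private static int lo(int x) { return x & 0xFFFF; }  // X[15...0]
    private static int hi(int x) { return x >>> 16; }    // X[31...16]

    /** Returns eight 16-bit sub-blocks; s[n] holds keystream bits 16n+15...16n. */
    public static int[] extract(int[] x) {
        int[] s = new int[8];
        s[0] = lo(x[0]) ^ hi(x[5]);
        s[1] = hi(x[0]) ^ lo(x[3]);
        s[2] = lo(x[2]) ^ hi(x[7]);
        s[3] = hi(x[2]) ^ lo(x[5]);
        s[4] = lo(x[4]) ^ hi(x[1]);
        s[5] = hi(x[4]) ^ lo(x[7]);
        s[6] = lo(x[6]) ^ hi(x[3]);
        s[7] = hi(x[6]) ^ lo(x[1]);
        return s;
    }

    public static void main(String[] args) {
        int[] x = {0x00010002, 0, 0, 0x05000600, 0, 0x00300040, 0, 0}; // toy state
        int[] s = extract(x);
        for (int n = 0; n < 8; n++) {
            System.out.printf("s[%d] = 0x%04X%n", n, s[n]);
        }
    }
}
```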

Key Stream and Data Combining

The purpose of a cipher is to output a stream of bits that appear entirely random but contain encrypted data. This encryption process is done by XORing the keystream with the plaintext. This creates a ciphertext whose length equals that of the plaintext. For our testing purposes we stored each of the generated keystreams in a file.
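Because the combining step is a plain XOR, the same routine serves for both encryption and decryption. The sketch below is illustrative only, with hypothetical names; the keystream bytes are passed in as an argument rather than produced by a cipher instance.

```java
final class XorCombineSketch {
    /** XORs data with the leading bytes of the keystream; output length equals data length. */
    public static byte[] combine(byte[] data, byte[] keystream) {
        if (keystream.length < data.length) {
            throw new IllegalArgumentException("keystream shorter than data");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ keystream[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plain = {'R', 'a', 'b', 'b', 'i', 't'};
        byte[] ks = {0x5A, 0x13, (byte) 0xC7, 0x01, 0x7F, (byte) 0x80, 0x22}; // toy keystream bytes
        byte[] cipher = combine(plain, ks);
        byte[] back = combine(cipher, ks); // applying the same keystream again decrypts
        System.out.println(new String(back, java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```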

TestU01 Test Suite

The project requires that the cipher primitive be statistically tested using an established Random Number Generator (RNG) statistical suite. The test suite TestU01 was picked for this project because of both its flexibility with input data and various unique test batteries. An additional reason we chose TestU01 is because it hadn’t been mentioned in any of the cryptanalysis work performed on the Rabbit Cipher.

TestU01 was designed to test RNG implementations. Its modules follow a naming convention indicating their role in the software package. The u_modules are uniform generator modules, which typically hold the implementation of an RNG to test. A special generator comes with the package that can produce random numbers by reading from a file of either binary encoded 32-bit numbers or ASCII encoded floating point numbers between 0.0 and 1.0. This file module, ufile, was used in this project to reduce the amount of effort required to get the keystream from our Java implementation of the cipher into the native binary executable of TestU01.

TestU01 contains a collection of test functions. Each individual function performs a statistical test looking for patterns of non-randomness in the RNG output, which can take the form of correlations, pattern repetition, etc. For most uses, running individual tests does not give a good enough measure of the quality of the RNG. For this reason it is common to run multiple sets of non-overlapping tests. TestU01 comes with preconfigured test batteries, which are placed into b_modules.

The authors of TestU01 suggest that when beginning to analyze RNGs it is best to start with the Small Crush battery [12]. This 10-test battery is good enough to catch most weak RNG implementations. The most common test suite for TestU01 is Crush. Crush is a thorough set of 96 tests, and it is suggested that any RNG passing Crush is good enough for general use.


For the serious cryptographer, an even longer test battery is available, called Big Crush, which has an estimated running time of 8 hours, covering 106 different tests. While Big Crush is more suitable for testing cryptographic grade RNGs, its long execution time makes it prohibitive to collect the necessary amount of data in the timeframe allocated to this project; for this reason, we have chosen to use Crush as our main test battery. It is worth noting that other batteries are available, such as pseudoDIEHARD, FIPS, Alphabit, and Rabbit, the last of which is not to be confused with this project’s Rabbit Cipher.

In order to study the role of initialization rounds, a battery of tests was performed on the output of the Rabbit cipher. We first selected a number of key and IV pairs to use for keystream generation. Different initialization pairs were selected using different criteria. First, we used key and IV combinations of constant 0s and 1s; then we used a cipher initialization value from the eStream implementation test vectors; lastly, we selected a random bit pattern from random.org to initialize the cipher. The selected cipher initialization pairs are listed below.

all zeros
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

all ones
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

zeros key, ones iv
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff

ones key, zeros iv
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00

eStream test vector #6
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 27 17 f4 d2 1a 56 eb a6

random.org bit pattern
6d 6e e6 ea 19 fb fd 12 d5 f2 b0 3a b1 96 65 f5 8e 95 24 b2 03 4b 2b

These vectors were then used to initialize the Rabbit cipher and generate keystreams for each of the round variations 4 through 0. By analyzing the outputs of reduced-round Rabbit and comparing them to the full-round version, we expect to be able to draw some conclusion regarding the role of the initialization round count on the randomness of the keystream.

The details of the individual tests in Crush can be found in [12]. Here we list the names of the tests, omitting the parameters and parameter variations for tests that are executed more than once.

Crush

smarsa_BirthdaySpacings
sknuth_Collision
sknuth_Gap
sknuth_SimpPoker
sknuth_CouponCollector
sknuth_MaxOft
svaria_WeightDistrib
smarsa_MatrixRank
sstring_HammingIndep
swalk_RandomWalk1
smarsa_SerialOver
smarsa_CollisionOver
snpair_ClosePairs
snpair_ClosePairsBitMatch
sknuth_Run
sknuth_Permutation
sknuth_CollisionPermut
svaria_SampleProd
svaria_SampleMean
svaria_SampleCorr
svaria_AppearanceSpacings
smarsa_Savir2
smarsa_GCD
scomp_LinearComp
scomp_LempelZiv
sspectral_Fourier3
sstring_LongestHeadRun
sstring_PeriodsInStrings
sstring_HammingWeight2
sstring_HammingCorr
sstring_Run
sstring_AutoCor

Random Data Generator

The Random Number Generator, or the Rabbit Stream Cipher, was a command line Java application that read in a Test Vector, number of initialization rounds, and byte count, and output a binary file containing the prescribed number of random bits. This file was then used as input to the TestU01 test suite.

A stream cipher is responsible for generating what amounts to a random sequence of bits. Therefore, there is no need to actually encrypt a file to get random data, merely generate the key stream. For this experiment, the RNG was used to create 8GB binary files as input to the test suite.

The typical command line used for the Java RNG was as follows:

$ java FileDriver -r 0 -s 8000000000

The User’s Manual provides more detail on each option presented in this example. The command line shows that the user is selecting a 0-round initialization process with a required 8-GByte output file. Knowing that the Rabbit stream cipher mandates a 4-round initialization process, test data files were generated for each round count R where $0 \le R \le 4$.

The format of the Key File was an ASCII text file containing 24 bytes of data. The first 16 bytes were used for the Key and the last 8 bytes were used for the IV.

| Key Bytes | IV Bytes |
| 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F | 10 11 12 13 14 15 16 17 |
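Reading such a key file can be sketched as below. This is not the FileDriver code: the class and method names are hypothetical, and the sketch assumes the ASCII file holds 24 whitespace-separated two-digit hexadecimal byte values, which is one plausible reading of the layout above.

```java
import java.util.Arrays;

final class KeyFileSketch {
    /** Parses whitespace-separated hex byte values such as "00 01 0A ff". */
    public static byte[] parseHexBytes(String text) {
        String[] tokens = text.trim().split("\\s+");
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return out;
    }

    /** First 16 bytes: the key. */
    public static byte[] key(byte[] all) {
        return Arrays.copyOfRange(all, 0, 16);
    }

    /** Last 8 of the 24 bytes: the IV. */
    public static byte[] iv(byte[] all) {
        return Arrays.copyOfRange(all, 16, 24);
    }

    public static void main(String[] args) {
        String text = "00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F "
                    + "10 11 12 13 14 15 16 17";
        byte[] all = parseHexBytes(text);
        System.out.println("key bytes: " + key(all).length + ", iv bytes: " + iv(all).length);
    }
}
```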


The output file was a pure binary datafile. Since the file was filled with random data, there was no need for any type of file header or control information within the data file itself.

Test Results

Results were collected after some modifications were made to TestU01. The first change reduced the amount of unique data consumed by the suite. Initially, by trial and error, we realized that Crush consumes over 60 GBytes of random data. Rather than generate files of that size, we modified the Crush suite so that the file was rewound between independent test executions, as this would not affect an individual test’s results. We later discovered that this change was not sufficient. Further debugging led us to a platform issue with large files (>2GB), which was remedied by updating the file manipulation functions to handle 64-bit file offsets properly.

These modifications allowed Crush to run to completion using less than 8 GBytes of random data. Each Crush run produces 186 p-values, and there are six key/IV pairs and five possible round values. These tests resulted in the generation of 320 GBytes of keystream, with a total test suite execution time of ~22.5 hours. Due to its size, the results spreadsheet is not included in this report, but it is available from our main wiki site at: TestU01Results.xlsx. A summary supporting our analysis is given in the next section.

Results Analysis

The TestU01 authors designed their suite so that the RNG is part of the implementation and can be invoked as many times as necessary to produce a potentially infinite amount of data. For this reason they discourage running the tests on a fixed data size and using a hard threshold of 0.01 or 0.001 as the pass criterion. What they recommend instead is that, if one encounters a suspicious p-value, one run multiple iterations of the test on a longer data set. By performing multiple iterations, it is possible to verify whether the observed p-value was due to chance or to an actual weakness in the RNG.

With limited time and storage space, we opted against the recommendation, and leave that as an area to be improved on when doing future investigation. The summary of suspicious p-values observed is provided in Tables 1 and 2.

| Significance threshold | 4 Rounds | 3 Rounds | 2 Rounds | 1 Round | 0 Rounds |
| 1% | 22 | 16 | 30 | 18 | 34 |
| .1% | 3 | 2 | 1 | 0 | 1 |

Table 1 - Count of p values exceeding both 1% and .1% limits by # rounds

The result summary in Table 1 shows that the number of tests failing at the 0.01 threshold is consistently above 11 (1% of the 6 × 186 = 1116 p-values produced per round count is 11.16). The number of expected failures due to chance alone at the 0.001 threshold is 1, which is too small to perform any tests on.

Rabbit’s authors claim that two rounds of initialization are sufficient for the security of the cipher, but they specify four to provide a safety margin. For this reason we can run a simple Chi-squared test on the 0.01 threshold failures for Rounds 4 through 2, which yields a Chi-squared value of 11 + 2.27 + 32.8 = 46.07 with 2 degrees of freedom. That translates to a p-value that rounds to 0.0000, which indicates that these observed failures are not due to chance and that they point to non-uniformity in Rabbit’s output.
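The arithmetic behind this figure can be reproduced with a few lines of Java. The sketch assumes an expected count of 11 failures per column, which is what the three terms above imply, and it uses the closed-form tail probability exp(-x/2) that holds for a Chi-squared distribution with exactly 2 degrees of freedom. The class and method names are hypothetical.

```java
final class ChiSquaredSketch {
    /** Sum of (observed - expected)^2 / expected over all categories. */
    public static double chiSquared(int[] observed, double expected) {
        double sum = 0.0;
        for (int o : observed) {
            double d = o - expected;
            sum += d * d / expected;
        }
        return sum;
    }

    /** Upper-tail p-value for 2 degrees of freedom only: P(X >= chi2) = exp(-chi2 / 2). */
    public static double pValueTwoDf(double chi2) {
        return Math.exp(-chi2 / 2.0);
    }

    public static void main(String[] args) {
        int[] failures = {22, 16, 30}; // 1% row of Table 1 for 4, 3, and 2 rounds
        double chi2 = chiSquared(failures, 11.0);
        System.out.printf("chi-squared = %.2f, p-value = %.3e%n", chi2, pValueTwoDf(chi2));
    }
}
```

Computed without intermediate rounding the statistic comes out near 46.09; the 46.07 quoted above is the sum of the individually rounded terms. Either way the p-value is far below 0.0001.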

In order to determine if any particular key and IV initialization value were responsible for the non-uniformity, we sorted the failures according to the key and IV pairs, which led us to Table 2.

| Significance threshold | K=0's IV=0's | K=0's IV=1's | K=1's IV=0's | K=1's IV=1's | K=0's IV=Rand | K=Rand IV=Rand |
| 1% | 18 | 26 | 20 | 15 | 19 | 22 |
| .1% | 2 | 3 | 0 | 1 | 0 | 1 |

Table 2 - Count of p values exceeding both 1% and .1% limits by Key&IV value

In Table 2, we can see that the failed tests at both threshold levels span each of the key and IV pairs. This indicates that no specific key and IV combination was causing the output bias in the cipher, and it is likely a characteristic of the cipher itself.

Comparison to Other Analyses

Bias in Rabbit’s output has already been confirmed and emphasized in [9]. Although the bias is present, it was not found to have any known security implications. The SADIST statistical package was used in [13] but was unable to detect any bias, although differential analysis performed on the algorithm showed that flipping input bits correlates highly with output bits of the keystream. The remaining papers [8][10][11] do not focus on running test suites and prefer analytical techniques; the consensus is that differential attacks are the most likely attacks to succeed on Rabbit.

Conclusion

The results obtained strongly suggest that there is a bias in the output bits of Rabbit. Because this bias is present at every round count tested, it was not possible to perform much analysis on the effect of initialization rounds on the output uniformity. Previous analysis efforts have also found Rabbit’s output to be non-uniform. We have not been able to further analyze the non-uniformity to assess the possibility of formulating an attack more efficient than keyspace search. While the bias does exist in Rabbit, the cipher remains secure in that no known attack is able to retrieve the key in less time than a keyspace search.

Project Documentation

Configuration Management

The project documentation and associated source code are hosted on Bitbucket at https://bitbucket.org/shdairis/4005-705-cryptography/wiki/Home. The Wiki and the Code Repository both use the Git source control software package as the basis for Configuration Management (CM).

The standard client required the following hardware and software configuration to build, execute, and develop the project successfully:

Hardware:
    Intel Pentium or higher
    2GB RAM or more
    32 bit Single Core CPU (minimum)

Software:
    Windows XP SP2 or later
    Eclipse version 3.7.1 for Windows
    Cygwin 2.763
    gcc 4.5.3
    GNU make 3.82.90
    bash 4.0-6
    Java Runtime Environment 1.6
    Tortoise Git 1.7.7.0
    git 1.7.9.msysgit.0

Wiki Environment

Users may update the Wiki directly through a web browser, but the preferred alternative is to add files and hyperlinks via the Git interface to the web server. To perform advanced operations, one can use the following command to create a local copy of the Wiki Git repository.

$ git clone https://bitbucket.org/shdairis/4005-705-cryptography.git/wiki

Access to the Wiki is public, but pushing and pulling via git requires elevated access rights. Submitting a request via the homepage allows the wiki administrators to grant access.

Project Environment

In order to download or modify the code from the Git repository, a user must create a Bitbucket account.

$ git clone https://@bitbucket.org/shdairis/4005-705-cryptography.git

Build Processes

Three Java source files make up the Rabbit stream cipher: FileDriver.java, RabbitCipher.java, and CryptoState.java. The RabbitCipher class implements the StreamCipher interface, and therefore requires StreamCipher.java to be present in the local environment. A number of Parallel Java Library [3] components were also used to support the development of the RabbitCipher class.

To reduce potential conflicts between user source code revisions, a single environment was created and committed to the Git repository. The environment GitTestProject was created initially to test all team members’ Eclipse and Git installations, but also became the base directory for all source code development. Pulling this environment to a new platform will allow building of the FileDriver class and full access to the project’s source code and configuration settings.

If a command line build is required, the javac command can be used provided the classpath is set to include the Parallel Java Library. This was not the method employed by this team, and it is outside the scope of this document.

Though not directly a part of the project, the TestU01 suite requires a very specific environment to compile and execute. When using Cygwin, one must remove any instance of autoconf, automake, autobuild, and libtool. Having any of these build environment tools installed will cause failure during configuration or during the initial make process.

User’s Manual

The Java class FileDriver is designed to allow a user to create a keystream, encrypt a file, or test the algorithm at certain intervals. The interface is rudimentary, but provides the necessary functionality for the given task.

A user may wish to test the algorithm against predefined test vectors and output sets, such as those provided by eStream or by the authors of the Rabbit Cipher. In these instances, the cipher is required to produce three blocks of keystream based on the input data. In these modes, input is restricted to two types of data: a 128-bit Key only, or a 128-bit Key with a 64-bit IV. If the user plans to execute the RabbitCipher in a test mode, the [-k] or [-i] option must appear as the first argument to the FileDriver application.

After the initialization option specification (key only, or key and IV), the key filename must be the next argument on the command line. The file must contain 16 or 24 hexadecimal byte values written in ASCII (two hex digits per byte), representing the 16 bytes of key data and optionally the 8 bytes of IV data. For non-testing modes, the program requires all 24 bytes. Unconventionally, no ‘0x’ may precede the hex digits in the file. For example:

00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17
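The key file format described above can be read with a parser along the following lines. This is a minimal sketch rather than the FileDriver code: the class and method names are hypothetical, and it assumes the byte values are separated by whitespace, as in the example line.

```java
import java.util.Arrays;

/** Hypothetical key file parser (not the FileDriver implementation). */
final class KeyFileParser {

    /** Parses 16 or 24 whitespace-separated two-digit hex values into bytes. */
    static byte[] parseHexBytes(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("key file is empty");
        }
        String[] tokens = trimmed.split("\\s+");
        if (tokens.length != 16 && tokens.length != 24) {
            throw new IllegalArgumentException("expected 16 or 24 bytes, got " + tokens.length);
        }
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            String t = tokens[i];
            // Exactly two hex digits per byte; this also rejects a '0x' prefix.
            int hi = t.length() == 2 ? Character.digit(t.charAt(0), 16) : -1;
            int lo = t.length() == 2 ? Character.digit(t.charAt(1), 16) : -1;
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a two-digit hex byte: " + t);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    /** The first 16 bytes are the key. */
    static byte[] key(byte[] material) {
        return Arrays.copyOfRange(material, 0, 16);
    }

    /** The last 8 bytes are the IV, or an empty array for key-only files. */
    static byte[] iv(byte[] material) {
        return material.length == 24 ? Arrays.copyOfRange(material, 16, 24) : new byte[0];
    }
}
```

Applied to the example line, key() returns the bytes 00 through 0F and iv() returns the bytes 10 through 17.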

The next argument, optionally, is the number of rounds field. Since the Rabbit Cipher performs 4 rounds of NextState functions in both the key setup and the IV setup operations, the number of calls to the NextState function in these processes can be modified by invoking the [-r #] option. The user should enter a positive integer in place of the ‘#’ symbol to configure the engine.

If a user wishes to encrypt a file, the use of the [-e filename] option should be the next argument on the command line. The [-e] cannot be used in conjunction with either the [-k] or [-i] options. This mode will read in the file and encrypt the data 128 bits at a time. If there are fewer than 128 bits of data remaining, only the data remaining is encrypted.
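The block-wise behaviour described here, including the short final block, can be sketched as follows. The KeystreamSource interface and the DemoKeystream class are hypothetical stand-ins introduced only for illustration; DemoKeystream is a simple counter and provides no security. In the project, the keystream would come from the RabbitCipher class instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Illustrative sketch of combining keystream and data 128 bits at a time. */
final class XorCombiner {

    /** Hypothetical stand-in for a cipher that fills 16-byte keystream blocks. */
    interface KeystreamSource {
        void nextBlock(byte[] block16);
    }

    /** Placeholder keystream (a byte counter); NOT Rabbit and NOT secure. */
    static final class DemoKeystream implements KeystreamSource {
        private int counter = 0;

        public void nextBlock(byte[] block16) {
            for (int i = 0; i < block16.length; i++) {
                block16[i] = (byte) counter++;
            }
        }
    }

    /** XORs the input with the keystream; only the bytes present in the last block are used. */
    static void combine(InputStream in, OutputStream out, KeystreamSource ks) throws IOException {
        byte[] data = new byte[16];
        byte[] stream = new byte[16];
        int n;
        while ((n = readBlock(in, data)) > 0) {
            ks.nextBlock(stream);
            for (int i = 0; i < n; i++) {
                data[i] ^= stream[i];
            }
            out.write(data, 0, n);
        }
    }

    /** Reads up to one full block and returns the number of bytes actually read. */
    private static int readBlock(InputStream in, byte[] buf) throws IOException {
        int filled = 0;
        while (filled < buf.length) {
            int r = in.read(buf, filled, buf.length - filled);
            if (r < 0) {
                break;
            }
            filled += r;
        }
        return filled;
    }
}
```

Because the combining step is an XOR, running the output through combine() again with a freshly initialized keystream recovers the original data.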

For the purposes of the TestU01 suite, the FileDriver was required to generate test vectors. Therefore, the [-s prngLen] option is made available to the user. The [-s] option is not compatible with the [-k], [-i], or [-e] options. If the [-s] option is chosen, it must be the next argument on the command line. The prngLen value is the number of bytes the application is to produce.

The final argument is the output filename which is where the keystream or encrypted data will be written to, in binary form.

The following is the help screen displayed upon invalid input to the system from the command line:

    usage: FileDriver [-k] [-i] <keyFile> [-r <rounds>] [[ -e <srcFile> ]|[ -s <prngLen> ]] <OutFile>

    Options: (order matters)
    -k, -K : Key Test only mode
    -i, -I : Key and IV Test only mode
    keyFile: Key and IV. the 16 (key only) or 24 (key + iv) hexadecimal bytes, encoded as a string, used initialize the cipher.
    -r, -R : Override the number of init rounds (Default = 4)
    rounds : Number of initialization rounds
    -e, -E : Encrypt a data file specified by the following argument
    srcFile: Binary file to encrypt
    -s, -S : Specify the number of PRNG bytes to generate (the cipher's keystream)
    prngLen: desired size of output file
    OutFile: File storing the encrypted data from the srcFile, or the desired amount of PRNG output
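The ordering rules in this manual can be summarized with a small positional parser. The sketch below is a reconstruction from the prose and is not the actual FileDriver code; the class, field, and enum names are hypothetical, and it assumes that exactly one of the four modes must be selected and that the output file is always the last argument.

```java
/** Hypothetical parser for the argument order described above. */
final class FileDriverArgs {

    enum Mode { KEY_TEST, KEY_IV_TEST, ENCRYPT, KEYSTREAM }

    Mode mode;
    String keyFile;
    int rounds = 4;   // default stated in the help text
    String srcFile;   // set only for ENCRYPT
    long prngLen;     // set only for KEYSTREAM
    String outFile;

    static FileDriverArgs parse(String[] args) {
        FileDriverArgs p = new FileDriverArgs();
        int i = 0;
        if (has(args, i, "-k")) {
            p.mode = Mode.KEY_TEST;
            i++;
        } else if (has(args, i, "-i")) {
            p.mode = Mode.KEY_IV_TEST;
            i++;
        }
        p.keyFile = next(args, i++, "keyFile");
        if (has(args, i, "-r")) {
            p.rounds = Integer.parseInt(next(args, i + 1, "rounds"));
            if (p.rounds <= 0) {
                throw new IllegalArgumentException("rounds must be a positive integer");
            }
            i += 2;
        }
        if (has(args, i, "-e") || has(args, i, "-s")) {
            if (p.mode != null) {
                throw new IllegalArgumentException("-e and -s cannot be combined with -k or -i");
            }
            if (has(args, i, "-e")) {
                p.mode = Mode.ENCRYPT;
                p.srcFile = next(args, i + 1, "srcFile");
            } else {
                p.mode = Mode.KEYSTREAM;
                p.prngLen = Long.parseLong(next(args, i + 1, "prngLen"));
            }
            i += 2;
        }
        if (p.mode == null) {
            throw new IllegalArgumentException("one of -k, -i, -e or -s is required");
        }
        p.outFile = next(args, i++, "OutFile");
        if (i != args.length) {
            throw new IllegalArgumentException("unexpected trailing arguments");
        }
        return p;
    }

    /** True if position i exists and holds the flag in either case (-k or -K). */
    private static boolean has(String[] a, int i, String flag) {
        return i < a.length && a[i].equalsIgnoreCase(flag);
    }

    private static String next(String[] a, int i, String name) {
        if (i >= a.length) {
            throw new IllegalArgumentException("missing argument: " + name);
        }
        return a[i];
    }
}
```

For example, the arguments key.txt -r 2 -s 1024 out.bin would select keystream generation with two initialization rounds and 1024 bytes of output.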

Project Discovery

While working on this project, the team has gained valuable knowledge only attainable through experience. We highly recommend that the choice of statistical test suite be considered when selecting the cipher implementation language. Had the cipher been written in C rather than Java, it would have been easier to integrate into the test suite and would have saved us many hours of debugging and frustration.

We have learned that it is very difficult to write a proper cryptographic specification, as the intended audience spans several disciplines. Our particular specification was written with a strong focus on the mathematics and very little consideration for software or hardware implementors. We found the syntax, bit ordering, and portions of the notation confusing. When the cipher could not produce output matching the given test vectors, we had to refer to the C implementation, insert debugging statements, and compare the output at various stages to the Java version. This was essential to understanding which portions of the specification had been interpreted incorrectly.

Future Research

Our small data sets were able to detect bias and confirm non-uniformity of Rabbit’s output. Therefore, we do not suggest enhancing or adding further tests to detect bias. Instead, we expect future work to go into analyzing the source of the types of non-uniformity that were detected. We have not classified the failed tests to determine if they are related to known biases or if they even fall within the same category of tests. This type of analysis would be good grounds for future efforts.

Another experiment we suggest is to correlate the biases with the Alpha values from the algorithm. One observation is that the rotation operations rotate by a multiple of 8. With registers of size 32, this implies that rotation alone can move a single bit to only 4 bit positions (its own included) in any of the registers. An experiment that varies the rotation amounts to improve diffusion may help reduce the bias in the output.
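The rotation observation can be checked with a small Java sketch that computes every bit position reachable from a starting bit by repeatedly applying a set of rotations to a 32-bit word. It uses the rotation amounts 8 and 16 that appear in the next-state function of RFC 4503 [2]. It models rotation only: the additions and the g-function in the real cipher also move bits through carries and squaring, so this is an illustration of the argument rather than a diffusion measurement. The class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.TreeSet;

/** Illustrative model of bit movement under rotation alone. */
final class RotationReach {

    /** Returns all bit positions reachable from startBit using the given left rotations. */
    static TreeSet<Integer> reachable(int startBit, int[] rotations) {
        TreeSet<Integer> seen = new TreeSet<Integer>();
        ArrayDeque<Integer> pending = new ArrayDeque<Integer>();
        seen.add(startBit);
        pending.add(startBit);
        while (!pending.isEmpty()) {
            int word = 1 << pending.poll();  // one-hot word for the current position
            for (int r : rotations) {
                int pos = Integer.numberOfTrailingZeros(Integer.rotateLeft(word, r));
                if (seen.add(pos)) {
                    pending.add(pos);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Rotations by multiples of 8 keep a bit within four positions.
        System.out.println(reachable(5, new int[] {8, 16}));       // prints [5, 13, 21, 29]
        // An odd rotation amount lets a bit reach every position.
        System.out.println(reachable(5, new int[] {7, 16}).size()); // prints 32
    }
}
```

The second call shows the kind of variation the proposed experiment would explore: a rotation amount that is coprime to 32 removes the four-position restriction in this simplified model.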

Due to our choice of implementation language, limited time, and storage constraints, we had to set a hard limit of 8GB of keystream to run through the test suite. Running multiple iterations of the failed tests, as the TestU01 authors suggest, would help eliminate any doubts regarding suspicious p-values. This would require considerably more storage space, or adapting the C implementation provided by eStream to be an RNG u_* module that TestU01 can use. We recommend this because many of the failure values were within the suspicious range.

Statement of Work

Sharif Hdairis:
● Statistical test suite research and selection.
● Compiling, executing, and debugging the TestU01 suite.
● Creating the project and shared environment in Eclipse, including proper linking to the Parallel Java library.
● Setting up and administering the Wiki and Git repository.
● Implemented initial FileDriver frontend to encrypt/decrypt files using the cipher’s external interface.
● Scripted and executed all of the keystream producing programs.
● Scripted and executed all the statistical tests and collected all the results.
● Assisted with the debugging of the Rabbit implementation.
● Collaborated on this report document.
● Updated and refined the Javadocs in the deliverable source.
● Prepared the source for packaging and submission.

Andrew Hoffman:
● Assisted development of RabbitCipher.java
    ○ g-function implementation and debugging.
    ○ encryption function
● Final report
    ○ Visio diagrams for the presentations and this report.
    ○ additional content and proofreading.
    ○ Wrote excel scripts to extract analysis data

Nelson Powell:
● Wrote my first Java program ever
    ○ Wrote CryptoState.java
    ○ Implemented the StreamCipher interface in RabbitCipher.java
        ■ Wrote all Key and IV functions
        ■ Wrote the keystream generation functions
        ■ Implemented Key/IV test feature
    ○ Modified FileDriver interface
        ■ Added user options for tests, key or key+IV files, encryption, and round selection
● Imported eStream version of Rabbit Cipher
    ○ Imported all eStream test vectors
    ○ Performed all debugging and output comparisons between the two versions for validation
● Worked on debugging TestU01
● Derived test vectors for analysis
    ○ Collected and formatted data for analysis
● Formatted final paper
    ○ Authored Introduction through Counter Iteration and Project Documentation to the end


References

[1] M. Boesgaard, M. Vesterager, T. Pedersen, J. Christiansen and O. Scavenius, The Stream Cipher Rabbit, http://www.ecrypt.eu.org/stream/p3ciphers/rabbit/rabbit_p3.pdf, Presented at Fast Software Encryption Conference, 2003

[2] M. Boesgaard, M. Vesterager, E. Zenner, A Description of the Rabbit Stream Cipher Algorithm, RFC 4503, May 2006

[3] A. Kaminsky, Parallel Java Library, http://www.cs.rit.edu/~ark/pj.shtml

[4] eCrypt Source Code and Test Vectors, http://www.ecrypt.eu.org/stream/p3ciphers/rabbit/rabbit_p3source.zip

[5] Anonymous, Rabbit Stream Cipher Algorithm Specification, http://www.cryptico.com/images/pages/WP_Rabbit_Specification.pdf, Cryptico A/S, 2005

[6] Anonymous, Security Analysis of the IV Setup for Rabbit, http://www.cryptico.com/images/pages/wp_security_analysis_ivsetup.pdf, Cryptico A/S, 2003

[7] M. Boesgaard, M. Vesterager, T. Pedersen, J. Christiansen and O. Scavenius, Rabbit: A New High-Performance Stream Cipher, Proceedings of Fast Software Encryption 2003, Springer, Berlin, 2003

[8] Unknown, Second Degree Approximations of the g-Function, http://www.cryptico.com/images/pages/wp_second_degree_approx.pdf, Cryptico A/S, 2003

[9] Jean-Philippe Aumasson, On a bias of Rabbit, http://sasc.crypto.rub.de/files/sasc2007_316.pdf, SASC 2007

[10] Y. Lu, H. Wang, and S. Ling, Cryptanalysis of Rabbit, Information Security 11th International Conference, ISC 2008, Taipei, Taiwan, September 15-18, 2008. Proceedings

[11] A. Kircanski, A. M. Youssef, Differential Fault Analysis of Rabbit, Selected Areas in Cryptography 16th Annual International Workshop, SAC 2009, Calgary, Alberta, Canada, August 13-14, 2009, Revised Selected Papers

[12] P. L’Ecuyer, R. Simard, TestU01, http://www.iro.umontreal.ca/~simardr/testu01/guideshorttestu01.pdf, Université de Montréal, Département d’Informatique et de Recherche Opérationnelle, August 2009

[13] Lütfü Tarkan ÖLÇÜOĞLU, Analysis of Rabbit Cipher, http://www3.iam.metu.edu.tr/iam/images/6/6a/Tarkanolcuogluterm.pdf, Middle East Technical University, Institute of Applied Mathematics, Jan 2009
