
Assessing the Viability of the Rotor Cipher in the Modern World

by

Jordan Zink Gahanna Lincoln High School 140 S Hamilton Road Gahanna, Ohio 43230

[email protected] (614) 307-3669

Prepared for GLHS Science Academy Symposium February 5, 2010

ASSESSING THE VIABILITY OF THE ROTOR CIPHER IN THE MODERN WORLD Jordan Zink Gahanna Lincoln High School, 140 S Hamilton Road, Gahanna, Ohio 43230

The purpose of this project is to determine the viability of the rotor cipher in the modern world of computer cryptography. This project consists of two phases: modifying the cipher to increase its security and running a simulation to assess the effectiveness of a brute force attack.

While many modifications were made, one modification involved shortening the plaintext before encryption by removing unnecessary letters and replacing words with symbols. An experiment was run to determine the level of shortening that would not distort meaning. Twenty subjects participated, and it was found that a conservative level of shortening did not significantly distort meaning, while a liberal level of shortening did distort meaning slightly. It was also found that there was no significant difference between youth and adult subjects, or between subjects familiar and unfamiliar with texting and online lingo.

To simulate a brute force attack, a computer program was written in Visual Basic. Initial testing found that household computers could not break a message encrypted with 3 or more rotors. A regression equation was found to predict the search speed based on plaintext length (R^2 = 0.9988). Also, the equation t = 256^(2n)/s was created to show how the time t needed to run a brute force attack relates to the number of rotors n and the key search speed s.

An investigation into the parts of a computer that make a brute force attack run faster was also conducted. It was found that processors with high L2 cache, voltage, and front side bus speed were the fastest, and RAM had little to no effect on time.

It was concluded that the rotor cipher is a viable cipher in the modern world. Also, plaintext shortening can be applied with most ciphers, which can boost security.

Table of Contents

I. Introduction

II. Review of Literature

III. Methods

IV. Results & Analysis

V. Conclusion

VI. Acknowledgments

VII. Works Cited

VIII. Appendix

Introduction

Cryptography is a field that often goes unappreciated. Many people do not know what a cryptologist even does. However, cryptography is one of the most important fields in computer science. Although it often goes unseen, it is used by everyone daily. The simple concept of keeping information secret from prying eyes is, in reality, a complex science rooted deep in logic and mathematics. While it may be considered an evil, it is a necessary evil for the world today.

Within cryptography there lie many different ciphers, or methods of encrypting and decrypting data. The point of this project is to research one of these ciphers called a rotor cipher.

This cipher was used by the Germans in WWII and required a large effort by the Allies to break, including the creation of the first computer. However, in the move to computers and modern ciphers after the war, the rotor cipher was left behind and became merely cryptographic history.

This project hopes to assess whether, through implementation on a computer, the rotor cipher still presents a viable secure cipher. The project consists of two phases: modifications to the cipher to increase security and a simulation to test the effectiveness of a brute force attack on the cipher.

Review of Literature

Introduction to Cryptography

The first recorded instance of concealing information is in the Greek historian Herodotus’s The Histories, where he chronicles the conflicts of the fifth century BC between Persia and Greece. He mentions that a Greek in Persia hid a message revealing Persian military actions on a piece of wood by covering it with wax. The message was not discovered as it left Persia and ended up saving Greece from a Persian assault. This event is an example of steganography, or hiding messages by hiding the existence of a message. While effective when not discovered, steganography provides no security if the message itself is discovered (Singh, 1999).

The weaknesses of steganography led to the creation of cryptography. Cryptography hides information not by concealing the existence of a message, but by making the information unreadable except by those parties meant to read the message. In order to hide a message, a sender would take the plaintext (information to be encrypted) and what is called a key and combine them through an algorithm to produce a ciphertext (the encrypted message to be sent). The algorithm is the basic instructions on how to encipher a message, while the key is a further set of instructions which is specific to each message. Once the receiver gets the message, he or she will use the algorithm along with the same key to decrypt the message, revealing the original plaintext (Singh, 1999).

The first ciphers fell into two categories: transposition and substitution. Transposition revolves around rearranging the order of the letters in a word. For example, the word “example” would become “xplamee”. The weakness of transposition is that for the algorithm to be more secure, it must become very complex. Simple algorithms, like writing the message backwards, provide little security. The second method, substitution, involves replacing certain letters with other letters. The most famous of these ciphers was the Caesar cipher. The cipher involved replacing a letter with the letter a certain number of places after it in the alphabet. So if the shift was one, “example” would become “fybnqmf”, and if the shift was two, “example” would become “gzcorng” (Singh, 1999).
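The Caesar shift described above can be sketched in a few lines of Python. This is only an illustration; the function name caesar_encrypt is an assumption and does not come from the sources cited.

```python
import string

def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Replace each letter with the letter `shift` places after it, wrapping at z."""
    alphabet = string.ascii_lowercase
    result = []
    for ch in plaintext.lower():
        if ch in alphabet:
            result.append(alphabet[(alphabet.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation unchanged
    return "".join(result)

print(caesar_encrypt("example", 1))  # fybnqmf
print(caesar_encrypt("example", 2))  # gzcorng
```

Because there are only 25 useful shifts, every key can be tried by hand, which is why this cipher provides little security.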

The weakness of the simple substitution cipher led to the creation of a critical part of cryptography: cryptanalysis. Cryptanalysis, or codebreaking, is the study of deriving the plaintext from a ciphertext without knowing the key. The first reference to cryptanalysis is in the works of al-Kindī, an Arab philosopher. He suggested counting the number of occurrences of a letter in a ciphertext and comparing that to the normal distribution of letters in that language. This method, known as frequency analysis, shows the weakness of substitution ciphers, since they have the same letter distribution in the ciphertext and the plaintext. Al-Kindī’s work also showed that cryptographic ciphers could be broken by analyzing the ciphertext (Singh, 1999).
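A minimal sketch of frequency analysis against a Caesar-style shift follows. It assumes the plaintext is English and that “e” is its most common letter, which holds only for reasonably long messages; the function names are hypothetical.

```python
from collections import Counter

def letter_frequencies(text: str) -> list[tuple[str, int]]:
    """Count the letters a-z in the text, most common first."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    return Counter(letters).most_common()

def guess_caesar_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most common ciphertext letter stands for 'e'."""
    most_common_letter = letter_frequencies(ciphertext)[0][0]
    return (ord(most_common_letter) - ord("e")) % 26
```

The guess can be wrong for short messages, where the letter counts do not yet resemble the normal distribution of the language.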

Over time, the simple ciphers of the early days of cryptography were replaced by newer, more complex ones. One of the most famous was Le Chiffre Indéchiffrable (French for “the unbreakable cipher”), or the Vigenère Cipher. Substitution ciphers up to the Renaissance were monoalphabetic ciphers, meaning a single substitution alphabet was used for the entire message. While these ciphers allowed for easy frequency analysis, the lack of common knowledge in such matters kept these ciphers secure. But with the Renaissance came a need for more secure ciphers. In the mid-1500s, a French diplomat named Blaise de Vigenère developed Le Chiffre Indéchiffrable. This cipher was a polyalphabetic cipher, meaning it used multiple substitution alphabets to encrypt a single message. In the cipher, the plaintext would be encrypted using one of 26 different substitution alphabets selected by the key (which was usually a keyword repeated for the length of the message). Using multiple substitution alphabets made simple frequency analysis useless since a given letter did not necessarily encrypt to the same letter throughout the message. This cipher remained secure until it was broken by Charles Babbage in the 19th century, who took advantage of the repeating keyword to perform complex frequency analysis (Pincock, 2006).
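The keyword-driven choice of substitution alphabet can be illustrated with a short sketch. It assumes the common convention that a keyword letter “a” means a shift of zero, “b” a shift of one, and so on; the function name and the sample keyword are illustrative only.

```python
def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Shift each letter by the amount given by the repeating keyword."""
    out = []
    key_index = 0
    for ch in plaintext.lower():
        if "a" <= ch <= "z":
            key_letter = keyword[key_index % len(keyword)].lower()
            shift = ord(key_letter) - ord("a")
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
            key_index += 1  # the keyword only advances on letters
        else:
            out.append(ch)
    return "".join(out)

print(vigenere_encrypt("attackatdawn", "lemon"))  # lxfopvefrnhr
```

In this example the four occurrences of “a” in the plaintext encrypt to “l”, “o”, “e”, and “n”, which is what defeats simple frequency analysis.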

During WWI, many new ciphers were created to keep military traffic secret. However, they were all built on outdated thoughts and ideas, allowing cryptanalysts to break most ciphers. Countries began looking for more powerful ciphers, and cipher machines presented an answer. A cipher machine uses a mechanism to perform encryption rather than a person using pencil, paper, and look-up tables. The first cipher machines were disks which simply made it easier to perform the Vigenère Cipher. However, cipher machines would revolutionize cryptography in 1918 (Singh, 1999).

The Enigma Machine

Arthur Scherbius, a German inventor with an understanding of electrical engineering, invented the Enigma machine in 1918. The machine consisted of a keyboard, a series of scrambling mechanisms, and a lamp board. After pressing a key, an electrical current traveled down a wire to the scrambling mechanism, which consisted of a series of rotors (the exact number varies based on Enigma model; the standard is three) and a reflector. Each rotor had 26 contacts, each corresponding to a letter of the alphabet. Inside the rotor was a jumble of wires which connected each contact on one side with a random contact on the other side. The electrical current would pass through these rotors, changing position along the way, until it reached the reflector. The reflector paired together random pairs of contacts (which can be thought of as letters). So if the current entered the “A” contact, it would leave through the “G” contact, moving in the opposite direction, and vice versa (current entering “G” would leave as “A”). After passing through the reflector, the current traveled back through the rotors and again changed positions. Once back at the side of the rotors where it began, it traveled to a lamp board which illuminated a small light bulb under the letter corresponding to the position at which the current left the last rotor (Singh, 1999).

As described, this machine merely produces a monoalphabetic substitution cipher (a single cipher alphabet for the entire message). However, there is one key feature of the machine which provides impressive security. After a key is pressed, a stepper mechanism rotates the first rotor one twenty-sixth of a rotation, so that it advances to the next contact. Every time the first rotor completes a full rotation, the next rotor over is rotated one contact. So for every 26 steps of the first rotor, the next rotor is stepped once, and for every 676 (26x26) steps of the first rotor, the rotor one spot over is stepped 26 times and the rotor two spots over is stepped once, and so on. The operator could change when this action occurred by setting a notch on the rotor to a specific setting, but only one turn of the next rotor would ever occur for a full rotation of the original rotor. This stepping action creates a powerful polyalphabetic cipher by making a random cipher alphabet for every letter encrypted with no repeating keyword (Pincock, 2006).
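The stepping described above behaves like an odometer, which the following sketch illustrates. It is a simplification: it assumes the carry happens when a rotor returns to position 0 and ignores the adjustable notch setting, and the function name is hypothetical.

```python
def step_rotors(positions: list[int], size: int = 26) -> list[int]:
    """Advance the first rotor one contact, carrying to the next rotor on a full rotation."""
    stepped = positions[:]
    for i in range(len(stepped)):
        stepped[i] = (stepped[i] + 1) % size
        if stepped[i] != 0:
            break  # this rotor has not completed a rotation, so nothing carries
    return stepped

positions = [0, 0, 0]
for _ in range(26):
    positions = step_rotors(positions)
print(positions)  # [0, 1, 0]
```

With three rotors the positions only repeat after 26x26x26 = 17,576 key presses.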

Further security is provided by the plug board. The plug board was a series of 26 electrical plugs on the front of the machine with one for each letter of the alphabet. The electrical current would pass through the plug board both before and after passing through the rotors. The operator would insert a wire with jacks on both ends into two of the plugs. This would switch letters connected together whenever current passed through the plug board (letters with no connections remained the same) (Singh, 1999). The number of connections used varied by who was using the Enigma (i.e. Army or Navy), but six was average (Kahn, 1991). In 1941, all Enigma traffic was standardized to ten connections (Miller, 1996).

Operators of the Enigma had many settings on the machine to change depending on the key. The order of the rotors (each rotor could be removed from the machine and placed in a different position), the rotors’ starting positions, the notch position on each rotor, and the plug board could all be changed based on the key (Ratcliff, 2003).

Arthur Scherbius marketed his new invention in 1923, hoping to sell models to both civilians (such as businessmen) and military and diplomatic offices. While the Enigma failed to gain popularity in the civilian sector, the German Navy, eager for a new cipher system after learning that theirs from WWI was broken, took interest in it. They developed a system to use the Enigma machine and increase security. Codebooks would be issued to ships which would contain the keys to be used for a certain period of time (at first, keys were used for weeks or months, but this later changed to daily key changes). Using these key settings, the operator would arrange the rotors in the machine, turn the rotors to their starting positions, and plug the plug board. The operator would make a random key and type it twice (like “ABCABC” for a three-rotor machine). The operator would then reset the machine using the random key and type the message. This meant that the daily keys issued to every ship would only be used to encrypt a random key rather than an entire message (Kahn, 1991). The Navy adopted the Enigma machine in 1926. The German Army followed suit in 1928 (Kahn, 1993). These military models contained rotors with different wiring than the commercial models, preventing enemies of Germany from simply buying an Enigma to read their messages (Kahn, 1991).


Enigma Strengths

The Enigma machine had many strengths for the Germans. One was the ease of use. The reflector mechanism made it so that, on the same settings, any letter’s encrypted counterpart will encrypt back to the first letter. For example, on the setting ABC, if “A” encrypts to “B”, then “B” will encrypt to “A”. This allowed for very easy decryption, since operators simply needed to put in the same settings used to encrypt and type the ciphertext to get the plaintext (“Cryptanalysis of the Enigma”, 2009).

The main strength that the Germans cited for the security of the Enigma machine was the number of possible setups for the machine. There were 6.5x10^79 different ways to wire the three rotors, 17,576 different possible settings for three rotors, 676 different notch settings, 7.9x10^12 different reflector wirings, and 5.3x10^14 different plug board wirings, bringing the grand total number of combinations to 3x10^114. This number was further increased when the encryption procedure was changed. Instead of only three rotors which could be rearranged, rotors were now selected from a group of five, increasing the number of settings even further. This incredibly large number allowed the Germans to feel safe, believing brute force attacks to be infeasible and frequency analysis out of the question (Miller, 1996).
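The figures quoted above can be reproduced with exact integer arithmetic. The counting formulas below are not given in the text; they are one standard way of counting that arrives at the same values, so they should be read as an assumption about how the figures were derived.

```python
from math import factorial

# Each rotor is a permutation of 26 contacts, and there are three rotors.
ROTOR_WIRINGS = factorial(26) ** 3
# Three rotors, each with 26 starting positions.
ROTOR_POSITIONS = 26 ** 3
# Notch settings, as quoted: 26 x 26.
NOTCH_SETTINGS = 26 ** 2
# The reflector pairs the 26 contacts into 13 unordered pairs.
REFLECTOR_WIRINGS = factorial(26) // (factorial(13) * 2 ** 13)
# Plug board: sum over every possible number of cables p, from 0 to 13.
PLUGBOARD_WIRINGS = sum(
    factorial(26) // (factorial(26 - 2 * p) * factorial(p) * 2 ** p)
    for p in range(14)
)

TOTAL = (ROTOR_WIRINGS * ROTOR_POSITIONS * NOTCH_SETTINGS
         * REFLECTOR_WIRINGS * PLUGBOARD_WIRINGS)

print(f"{float(ROTOR_WIRINGS):.1e}")      # about 6.6e+79
print(f"{float(REFLECTOR_WIRINGS):.1e}")  # about 7.9e+12
print(f"{float(PLUGBOARD_WIRINGS):.1e}")  # about 5.3e+14
print(f"{float(TOTAL):.1e}")              # about 3.3e+114
```

The rotor wirings dominate the total, which matters for the argument in the following section: once the wirings are known, most of this number disappears.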

Breaking Enigma

The Germans’ belief that Enigma was unbreakable due to the large number of different settings was largely false. In order to take advantage of the vast number of ways to wire the three rotors (which provides the bulk of the different combinations), they would have to have a way to rewire the rotors, or have each Enigma operator possess 6.5x10^79 different rotors. However, the Germans never implemented a rewirable rotor, only using the same hardwired rotors for thousands of messages. Using this, Allied cryptanalysts could reconstruct the rotor wirings of the military rotors (Ratcliff, 2003).

For the most part, the same was true for the reflector. The reflector, even though it could be removed and replaced with another, was normally kept the same by Enigma operators. However, near the end of WWII, Germany developed a rewirable reflector. This development terrified the Allied cryptanalysts, but never became an issue because it was never fully implemented and was later abandoned. Due to the weaknesses presented by the hardwiring of the rotors and reflector, the number of possible settings was reduced from 3x10^114 to 1x10^23 (Ratcliff, 2003).

Even though the Allies were able to reconstruct the wirings for the rotors, there was still the problem of the settings presented in the key: the rotor arrangement, the rotor settings, the notch settings, and the plug board. To discover the key for a message, the Allied cryptanalysts used a brute force attack. A brute force attack is a cryptanalytic method where the cryptanalyst tries using every key possible to decode a message, under the assumption that every key but the correct one will make an incomprehensible message. The one key that produces a message that makes sense must be the correct key. Allied cryptanalysts took advantage of many weaknesses of the Enigma and its operation to reduce the number of keys that were required to check, and then used a variety of tools (most of which led to the construction of the modern computer) to check the remaining keys (“Brute force attack”, 2009).

Enigma Weaknesses Used to Reduce the Number of Keys to Check

One of Enigma’s biggest weaknesses was the mechanism that made it so easy to use: the reflector. The reflector allowed encryption and decryption to be done easily using the same settings. The reflector also had the effect of preventing any letter from encrypting to itself. So an “A” in the ciphertext could never correspond to an “A” in the plaintext (“Cryptanalysis of the Enigma”, 2009).

Allied cryptanalysts relied heavily on cribs for breaking Enigma. Crib is a cryptographic term used to refer to a plaintext that is known or suspected to be in a ciphertext. Enigma operators were notorious for including long phrases in many reports, such as Keine besonderen Ereignisse (figuratively translates to “nothing to report”) (Milner-Barry, 1993), or even using small words like eins (one) in their messages repeatedly (“Cryptanalysis of the Enigma”, 2009). Operators would include callsigns (the identification of the transmitting and receiving stations) in their broadcasts, allowing identical plaintexts to be cross-referenced (Welchman, 1984). The British would even place mines in certain areas of ocean in order to try to control what a German message would say (a method known as “gardening”) (Morris, 1993). Knowing or guessing plaintext provided the basis for most methods of cryptanalysis of the Enigma.
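The fact that no letter could encrypt to itself combines naturally with cribs: any alignment in which a crib letter sits under the same ciphertext letter can be ruled out immediately. The sketch below illustrates that elimination step only; the function name and the sample ciphertext are made up for the example.

```python
def possible_crib_positions(ciphertext: str, crib: str) -> list[int]:
    """Return the start positions where the crib could lie under the ciphertext."""
    positions = []
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start:start + len(crib)]
        # A position is impossible if any letter would have encrypted to itself.
        if all(c != p for c, p in zip(window, crib)):
            positions.append(start)
    return positions

print(possible_crib_positions("XEINSQRTU", "EINS"))  # [0, 2, 3, 4, 5]
```

Longer cribs rule out more positions, which is part of why stereotyped long phrases were so useful to the cryptanalysts.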

The plug board, a tool which was added to provide extra security to the Enigma machine, also provided a weakness. If “A” was connected to “B”, then “A” would always switch with “B” and “B” would always switch with “A”. Since this stayed the same for the entire message, Allied cryptanalysts could exploit this feature. By creating what was called a diagonal board, they could reduce the number of rotor settings to be checked considerably (Welchman, 1984).

German Enigma protocols, while made to increase security, created weaknesses in the cipher. Operators were instructed to use the daily key to encrypt a made-up key that was typed twice (“ABCABC” for example). Since the rotors were turning during this process, it seemed that this was a secure practice. But in reality, by encrypting the same plaintext in two different places, the Germans had given Allied cryptanalysts an advantage because they could compare two different ciphertexts knowing that the plaintexts were identical. This allowed Allied cryptanalysts to make cycle groups (also called boxes of chains) which reduced the number of possibilities to be tested from 10,000 trillion to 105,456, which is the number of possible rotor settings (Singh, 1999).

The human operators of the Enigma machine also introduced weaknesses into the cipher. The “made-up” keys at the beginning of every message (which already present a weakness by repeating) were notoriously not that random. Operators usually used nearby key combinations, like “QWE” (the first three letters on the top row of the keyboard) (Rejewski, 1984). Also, operators would use German three-letter words, such as “IST” (which translates to “is”), and one operator even used the initials of his girlfriend (Pincock, 2006). All of these nonrandom combinations made it much easier for Allied cryptanalysts to break the keys.

More human error came from encrypting and sending messages twice. Often in the German Navy, messages would be sent via the Enigma system and then resent verbatim using weaker, easy to break ciphers. Known as a kiss, these foolish practices by human operators allowed Allied cryptanalysts to compare full plaintexts with ciphertext, increasing their understanding of Enigma and allowing completely different messages to be read more easily (Mahon, 1945).

It was generally assumed that adding more rotors would increase security of the Enigma machine. However, to make it portable, most versions of the Enigma machine used only three rotors (“Enigma machine”, 2009). High security models used by Army HQ staff had many more rotors than three (“Enigma machine”, 2009). But when it comes down to it, mechanics limited the Enigma machine’s effectiveness. The need for a physical device with moving parts limited how many rotors could be used, or even how they were used. Modern computers can generate the effect of an Enigma machine, but, theoretically, without the limit on the number of rotors.

Another physical limit was stepping action. The stepping action of the rotors was created by ratchets and pawls (Hamer, 1997). Due to how complex these mechanisms would be if intricate stepping was used, only single steps were used in most Enigmas (as described earlier) (Ratcliff, 2003). Some Enigmas contained more advanced stepping action (such as early stepping of the middle and end rotors or double steps), but the normally simple stepping action made cryptanalysis much easier than erratic stepping would have.

Modern Cryptography

After WWII, science saw a revolution with the development of the computer, which was ironically rooted in the Allied cryptanalytic effort during the war. During this time of development, ciphers became shrouded in secrecy with the start of the Cold War. The US government had no computer-based encryption standard (that is known publicly) until the early 1970s with the creation of the Data Encryption Standard, or DES. This system, which was disclosed to the public, became the government standard until it was replaced by the AES system (“”, 2009).

One issue in modern cryptography is key size. The key size is the amount of information (in bits) in each key. If there were eight possible keys for a cipher, the key size would be 3 bits (2^3 = 8). DES uses a 56-bit key (“Data Encryption Standard”, 2009) while AES uses various key sizes: 128, 192, and 256 bits (“Advanced encryption standard”, 2009). Because keys must be transferred securely, a smaller key is seen as easier to keep safe than a larger key, but larger keys present more security. A key the size of the original message would provide perfect security if it was kept secret (this method is known as a one-time pad) (Singh, 1999).
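The relationship between the number of possible keys and the key size in bits can be checked with a short sketch. The function names are illustrative only.

```python
def key_size_bits(number_of_keys: int) -> int:
    """Smallest number of bits that can label every one of the possible keys."""
    return (number_of_keys - 1).bit_length()

def number_of_keys(bits: int) -> int:
    """Number of distinct keys that fit in the given number of bits."""
    return 2 ** bits

print(key_size_bits(8))    # 3
print(number_of_keys(56))  # 72057594037927936 (the DES key space)
```

Each added bit doubles the number of keys, so a 128-bit key space is not about twice as large as a 56-bit one but 2^72 times as large.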

Brute force attacks can easily be implemented on modern computers. As chips become faster, larger keys are required to keep systems secure from this attack. DES, using custom chips, can be brute forced in a matter of days. AES has yet to be publicly brute forced (“Brute force attack”, 2009).

An important issue for cryptographers and cryptanalysts is determining when a brute force attack would be ineffective. If there were only 100 keys that a cryptanalyst would have to search through, they could find the key in almost no time, but if there were 1x10^100 keys to search through, it would take much too long to find the key. Now, 1x10^100 might be a bit overkill, but any number of keys that would require more than a certain amount of time to search can be considered secure from brute force attack. The certain amount of time would vary based on the information being encrypted. If the information that is being encrypted is battle plans that are to be carried out the next day, then the information may only need to be secure for one week, but if the information is launch codes for nuclear missile silos, the information would need to remain secure for decades (if not longer). The number of keys required then would be proportional to the length of time that the cipher would remain secure (“Key size”, 2009).
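The reasoning above amounts to dividing the number of keys by the speed at which keys can be checked. The sketch below illustrates this; the search speed of one million keys per second is a made-up figure chosen only for the example, not a measurement from this project.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def worst_case_seconds(number_of_keys: int, keys_per_second: float) -> float:
    """Time to try every key, which is needed if the correct key is checked last."""
    return number_of_keys / keys_per_second

def average_case_seconds(number_of_keys: int, keys_per_second: float) -> float:
    """On average the correct key is found after searching half of the keys."""
    return worst_case_seconds(number_of_keys, keys_per_second) / 2

HYPOTHETICAL_SPEED = 1_000_000  # keys per second, an assumed value

print(worst_case_seconds(100, HYPOTHETICAL_SPEED))
print(worst_case_seconds(10 ** 100, HYPOTHETICAL_SPEED) / SECONDS_PER_YEAR)
```

Under this assumed speed, 100 keys take a fraction of a millisecond, while 1x10^100 keys take on the order of 10^86 years, far longer than any information would need to stay secret.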

Other Issues in Modern Cryptography

Every cipher mentioned to this point is called a symmetric cipher. That means the key used to encrypt and decrypt is identical. There is another system called an asymmetric cipher, or public key encryption. This involves the encrypting key (which is not kept secret and can be used by anyone) being different than a decrypting key (which is kept secret). As of now, this method is secure, but no proof of its security exists (Pincock, 2006).

Outside of brute force attacks (and other cryptanalytic attacks that focus on the cipher itself like frequency analysis) are side-channel attacks. These attacks involve looking at the physical system used to implement the cipher rather than the cipher itself. By watching the system, attackers can discover information which can allow for the breaking of an otherwise secure system (“Side-channel attack”, 2009). Since this attack has to do with physical implementation of a cipher and not the theory of the cipher itself, this paper will not address side-channel attacks on rotor ciphers.

Kerckhoffs’s Principle

Kerckhoffs’s Principle is a fundamental idea of cryptography. The principle states that an encryption system should remain secure even if everything about the system (outside of the message-specific key) is known. In other words, the key should be the only factor relied upon for security (Kahn, 1996).

Various Other Topics

Best/Average/Worst Cases

When looking at an algorithm, one issue taken into consideration is the best, average, and worst case scenario. These are the amounts of resources, such as run time or memory, that an algorithm will require in three different situations. Best case would be an optimal run of the algorithm using the fewest resources, while worst case would be a run of the algorithm that uses the most resources. Average case would be the average amount of resources required to run the algorithm. For most algorithms, it is ideal to have average case near to or equal to best case, but for cryptography, it is ideal to have average case equal to worst case (“Best, worst and average case”, 2009).


SMS Language

During the first decade of the 21st century, short messaging service, or SMS, has become one of the most popular forms of communication. Usually referred to as texting, it involves sending a message of no more than 160 characters (http://www.3gpp.org). The need for short messages that are quick to write or type has caused the evolution of a new language (“New Language Spawned by SMS Abbreviations”, 2009). Called SMS language, it involves removing letters from words, replacing common phrases with abbreviations, and using non-letter symbols to replace words (Alvi, 2009). Popular with teenagers, this language has found its way into Standard English (such as the abbreviation “LOL” meaning “laugh out loud”). Certain teenagers are even using aspects of the language in essays for school, like shortening “you” to “u”, which has created controversy over the language’s validity (“New Language Spawned by SMS Abbreviations”, 2009).

Methods

This project consists of two phases: making modifications to the Enigma cipher to strengthen it from the cryptanalysis run against it during WWII, and implementing these modifications on a computer and testing for the viability of a brute force attack based on the number of rotors in the cipher.

Modifications to the cipher

The first, and seemingly most obvious, modification is the elimination of the reflector. The reflector was added so that a letter would be encrypted through the rotors twice rather than once and so that encryption and decryption follow the exact same process (same settings and everything). However, the weakness of the ciphertext character never being the same as the plaintext character is much too great to justify either of the two strengths, and computers can do things a mechanical device cannot do, allowing for both strengths to be implemented without the need of a reflector. So the reflector is eliminated from the cipher.

When it was used during WWII, the rotor cipher consisted of rotors with 26 contacts, one for every letter of the alphabet. For the modified cipher, the rotors had 256 contacts. This serves two main advantages. 256 converts perfectly to binary (8 binary digits), which allows for a medium of data transfer that is universal as opposed to simply the 26 letters of the English alphabet. Also, 256 provides an immense number of wiring possibilities (256!). The number of rotors is easily changeable, but the exact number is dependent on the amount of security desired (see “Assessing the effectiveness of a brute force attack” section for more information).

Another major modification to the cipher is the addition of a “shift rotor” in addition to the normal rotors (from now on referred to as substitution rotors). The substitution rotors provide the actual substitution alphabet for each letter to be encrypted with, but the shift rotor tells how to advance the substitution rotors. The Enigma machine would advance each rotor one step at a time and would not turn the 2nd or 3rd rotor until the previous rotor had made a complete revolution. All of these factors aided in making the cryptanalysis of the Enigma much easier. But during those few times when the Enigma would advance the 2nd rotor early, or with the special Enigma variations which would occasionally “double step” (advance two settings), the Allied cryptanalysts were drastically slowed down. So by devoting entire rotors to shifting the substitution rotors, cryptanalysis should be made extremely challenging. Each substitution rotor has its own shift rotor which is unique to that rotor (no two rotors should use the same shift rotor). The shift rotors themselves will also shift, but will simply advance one position for every letter encrypted.

The key tells the initial positions of every rotor. When specifying the number of rotors, one is specifying how many substitution rotors there are. So if there are three rotors, there are actually six rotors, three substitution rotors and three shift rotors. So the length of a key is the number of characters equal to twice the number of rotors. Each character, which gives an individual start position for a rotor, is really eight bits. So the key size in bits is 16 times the number of rotors.
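The modified cipher described in this section can be sketched as follows. This is not the project’s Visual Basic program. Several details are assumptions made only for illustration: the rotor wirings are random permutations generated from arbitrary seeds, the value a shift rotor shows at its current position is taken as the number of steps its substitution rotor advances, and the first half of the key gives the substitution rotor positions while the second half gives the shift rotor positions.

```python
import random

ROTOR_SIZE = 256  # one contact per possible byte value

def make_rotor(seed: int) -> list[int]:
    """Hypothetical rotor wiring: a random permutation of 0..255."""
    wiring = list(range(ROTOR_SIZE))
    random.Random(seed).shuffle(wiring)
    return wiring

def invert(wiring: list[int]) -> list[int]:
    """Wiring read in the opposite direction, used for decryption."""
    inverse = [0] * ROTOR_SIZE
    for source, target in enumerate(wiring):
        inverse[target] = source
    return inverse

def rotor_crypt(data: bytes, sub_rotors, shift_rotors, key: bytes,
                decrypt: bool = False) -> bytes:
    n = len(sub_rotors)
    if len(shift_rotors) != n or len(key) != 2 * n:
        raise ValueError("need one shift rotor per substitution rotor and 2n key bytes")
    sub_pos = list(key[:n])    # assumed key layout: substitution positions first
    shift_pos = list(key[n:])  # then shift rotor positions
    inverses = [invert(w) for w in sub_rotors]
    out = bytearray()
    for byte in data:
        if decrypt:
            # Undo the rotors in reverse order; there is no reflector.
            for i in reversed(range(n)):
                byte = (inverses[i][byte] - sub_pos[i]) % ROTOR_SIZE
        else:
            for i in range(n):
                byte = sub_rotors[i][(byte + sub_pos[i]) % ROTOR_SIZE]
        out.append(byte)
        for i in range(n):
            # The shift rotor decides how far its substitution rotor advances,
            # then the shift rotor itself advances one position.
            advance = shift_rotors[i][shift_pos[i]]
            sub_pos[i] = (sub_pos[i] + advance) % ROTOR_SIZE
            shift_pos[i] = (shift_pos[i] + 1) % ROTOR_SIZE
    return bytes(out)

def key_size_bits(rotors: int) -> int:
    return 16 * rotors          # two 8-bit start positions per rotor

def key_count(rotors: int) -> int:
    return 256 ** (2 * rotors)  # equal to 2 ** key_size_bits(rotors)
```

A short usage example under the same assumptions: with three substitution rotors the key is six bytes (48 bits), and decrypting with the same rotors and key returns the original message.

```python
subs = [make_rotor(seed) for seed in (1, 2, 3)]
shifts = [make_rotor(seed) for seed in (4, 5, 6)]
key = bytes([10, 20, 30, 40, 50, 60])
ciphertext = rotor_crypt(b"attack at dawn", subs, shifts, key)
print(rotor_crypt(ciphertext, subs, shifts, key, decrypt=True))  # b'attack at dawn'
```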

Modifying the plaintext prior to encryption

Even with a stronger cipher, there still lie several problems with the plaintext. Commonly repeated words provide suspected plaintexts, or cribs, for cryptanalysts to use to try to break a cipher. Normal English text contains certain letters that are used more commonly than others (like “E”), which allow for frequency analysis, the main method used for cryptanalysis of polyalphabetic substitution ciphers. Finding a way to strengthen plaintext before encryption is vital to a cipher’s security.


Text Shortening

One way to do this is to shorten the plaintext. English is an inefficient language with long words that can be shortened. Evolution of this can be seen in SMS language, or texting language, where the writer uses abbreviations and letter removal (usually vowels) to create a message still readable but much shorter in length. However, SMS language is not always clear, causing occasional miscommunication. Individuals who do not know the abbreviations may not even be able to read the language.

To implement the concept of text shortening, rules must be developed that a computer can follow and that produce a shortened text still readable by any individual. Ideally, a human could provide the best shortening since a person can use complex logic to know whether removal of certain characters will yield understandable text, but long passages of text that need shortening would be time consuming for a person to shorten, so a computer method was used here. The shortened text must be understandable to any individual as well since the process of shortening is one way (that is, the text will be shortened before encryption, but after decryption, the text will not be “un-shortened”; it will stay shortened).

To find the best rules, an experiment was devised to test how text shortening affects comprehension. Two types of shortening methods were developed, which are referred to as conservative and liberal. Some rules are followed by both methods (like “to” becoming “2”), but each method has some individual rules. Conservative removes certain unnecessary letters while liberal removes all vowels (following some rules, however). The complete set of rules can be found in Figure 1.1.

For testing, many different passages were created. Two different types of text were identified: conversational and analytical. Conversational text is meant to simulate everyday communication by average people, such as E-Mail traffic. It contains many 1st, 2nd, and 3rd person pronouns, mostly simple words, and few or no large names, numerical figures, or specific location names, but may contain dates and times. Analytical text is supposed to simulate secretive traffic by organizations such as the government or scientists. It contains many numbers, figures, names, and specific locations. It is crucial that this information remain clear, since miscommunication could be disastrous.

There were four different samples created for each type of text. The four samples for conversational have no theme to them and were referred to as Sample 1, Sample 2, Sample 3, and Sample 4. For analytical, four different sub-types were identified: Diplomatic, Military, Corporate, and Scientific. Each simulates a different field where encryption would be necessary to preserve secrecy. All passages are approximately the same size in words and characters (with other passages from their type), and all passages can be found in Figure 1.2. Each passage was shortened with both methods of shortening by a computer program written by me. The program can be found in Figure 1.3.

Testing involved a test subject reading passages out loud while observed by myself (the judge). Three measures of data were collected: stumbles, missed words, and time. Every time a subject paused their reading for a period of time outside normal pauses, or when the subject began to read a word incorrectly and then fixed the word, the judge marked down a “stumble”.

Every time a subject read a word completely incorrectly, the judge marked down a “missed word”. Also, if a subject paused for approximately a second before reading a word because he or she could not figure out the meaning of the word, the judge told the subject to skip the word and continue, and marked down a “missed word”. The time taken to read each passage was also recorded.

Each subject read sixteen passages. Eight of the passages were shortened (half conservatively, half liberally), and eight of the passages were controls (not shortened at all). A subject always read the shortened form of a passage before the control of the passage. The control allows for a time unique to each subject for comparing the shortened text to normal text.

For the test, subjects read three shortened passages, then alternated between control and shortened until all shortened passages were read, then completed the remaining controls.

Shortened passages were alternated between conservatively shortened and liberally shortened.

The order of the specific passages was randomized from subject to subject excluding the first passage read. Since this is the first shortened passage the subject would read, it was decided that it was crucial this did not repeat, so over the course of sixteen runs, each of the eight passages should be read first once under each shortening. Also, after eight tests, each passage should be read four times under each shortening. A random order is important so that there is no bias to the learning curve that subjects develop while reading shortened texts.

The testing environment during each test was quiet and free of distractions. Each test subject had no knowledge of the content of the passages or of the specifics of the text shortening process (all subjects did know the general premise of the test, and those subjects under 18 and their parents signed consent forms). It was important that subjects had no knowledge of the specifics of the text shortening (outside of experience possibly gained from the internet or texting) so that the test could reflect an average person’s comprehension. Subjects were selected with no distinction (such as gender, race, background, etc.), but it was noted whether the subject was over 18 or 18 and under. This distinction, referred to as adult vs. youth, is to see if the increased use of internet and texting by the generation currently in high school translates to increased abilities in reading shortened texts. After the test, subjects were asked four questions: two concerning their familiarity with texting, and two concerning their familiarity with online instant messaging. Both pairs of questions look to see if there is a correlation between texting/internet use and the ability to read shortened texts.

In order to try to eliminate any bias from me as a judge, a second judge was present at a few of the tests. During the test, he would also mark down stumbles and missed words. Neither judge could see what the other was marking. This was to make sure a single judge was not pulling the experiment one way or the other.

Other modifications to the plaintext prior to encryption

Even with text shortening, there are still letters used more commonly than others. A way to get around this is to trick cryptanalysts by switching out common characters. Since there are 256 characters at our disposal when encrypting text, not all 256 characters are going to be used. These extra characters would otherwise be useless, but if common characters (like “E”) were replaced with certain extra characters, frequency analysis would be hindered. Decryption would be simple, since each extra character would always correspond to a specific normal character (like “$” always equaling “A”). This method would not make frequency analysis impossible by any means; it is merely a means of slowing the process of cryptanalysis. Also, keeping in mind Kerckhoff’s Principle, the system of character exchange should be changed frequently in case a cryptanalyst discovers the system.
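One possible reading of this character exchange is sketched below in Python. It assumes that each common letter is given several stand-in byte values that are used in rotation, and that byte values from 0x80 upward never occur in the plaintext. The table `STAND_INS` and the function names are hypothetical and are not taken from the program used in this project.

```python
from itertools import cycle

# Hypothetical exchange table: each common letter gets several stand-ins.
# Byte values 0x80 and up are assumed to be unused by the plaintext.
STAND_INS = {
    ord("e"): [0x80, 0x81, 0x82],
    ord("t"): [0x83, 0x84],
}
# Every stand-in maps back to exactly one normal character.
REVERSE = {sub: letter for letter, subs in STAND_INS.items() for sub in subs}


def disguise(plaintext: bytes) -> bytes:
    """Replace common letters with stand-ins before encryption."""
    if any(b in REVERSE for b in plaintext):
        raise ValueError("plaintext already contains a stand-in value")
    pools = {letter: cycle(subs) for letter, subs in STAND_INS.items()}
    return bytes(next(pools[b]) if b in pools else b for b in plaintext)


def restore(data: bytes) -> bytes:
    """Undo the exchange after decryption."""
    return bytes(REVERSE.get(b, b) for b in data)


message = b"meet me at the gate"
hidden = disguise(message)
print(hidden.count(ord("e")), restore(hidden) == message)
```

Spreading one letter over several stand-ins flattens its frequency count, which is the effect the paragraph describes. A single stand-in per letter would only relabel the letter.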

While this process would be included in a full implementation of the cipher, it was omitted from the brute force simulation described below because the simulation assumes a cryptanalyst knows everything about the cipher (i.e. rotor wirings) except the message-specific key (Kerckhoff’s Principle).

Assessing brute force effectiveness

The modifications to the cipher were all made in order to strengthen the rotor cipher against cryptanalysis. However, there is one attack that may still be effective: the brute force attack. Unless the key size is very large (like a one time pad, where the key size is the same as the size of the plaintext), a brute force attack can always produce the key to decipher any ciphertext. However, as key size increases, the number of searches to run for a brute force attack, and the time it would take to run the attack, increase exponentially. At a certain point, it can be determined that a brute force attack would be ineffective because a search would require years to discover the key. Since a smaller key is preferable to a larger key, the key need only be as large as it takes to make a brute force attack ineffective (larger is okay, but it is in a sense overkill).

To find the key size where brute force is ineffective, an experiment was run with the modified rotor cipher. The rotor cipher’s key size is easy to modify because it is easy to add or remove rotors from the cipher. To experiment with brute force attacks, the cipher was coded in Visual Basic. A language like Assembler would be superior in speed to Visual Basic, but due to resources and my knowledge of programming, Visual Basic was used. The cipher was coded as efficiently as possible, since it is assumed that a cryptanalyst would use only the most efficient ways to implement the cipher for a brute force attack. A screenshot of the interface can be found in Figure 1.4 and the program can be found in Figure 1.5.

Once the program was completed, small modifications allowed it to run a brute force attack on a ciphertext created using the cipher. These modifications included searching through every key along with a very simple plaintext analysis to determine which key is the correct one (a good plaintext analysis would be very complicated; mine simply looked for the word “the”).
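The structure of such a search can be sketched in Python. The cipher below is a toy stand-in with a two character (16 bit) key, used only so that the example is self-contained. It is not the modified rotor cipher, and every name in it is hypothetical. Only the overall shape follows the description above: try every key, decrypt, and keep the candidates that contain the word “the”.

```python
from itertools import product


def toy_encrypt(plaintext: bytes, key: tuple) -> bytes:
    """Toy stand-in cipher (NOT the rotor cipher): a shift that grows
    with position, controlled by a two character key."""
    a, b = key
    return bytes((p + a + i * b) % 256 for i, p in enumerate(plaintext))


def toy_decrypt(ciphertext: bytes, key: tuple) -> bytes:
    """Inverse of toy_encrypt for the same key."""
    a, b = key
    return bytes((c - a - i * b) % 256 for i, c in enumerate(ciphertext))


def looks_like_english(candidate: bytes) -> bool:
    """Very simple plaintext analysis: look for the word 'the'."""
    return b"the" in candidate.lower()


def brute_force(ciphertext: bytes) -> list:
    """Worst case search: every one of the 256^2 keys is tried."""
    hits = []
    for key in product(range(256), repeat=2):
        candidate = toy_decrypt(ciphertext, key)
        if looks_like_english(candidate):
            hits.append((key, candidate))
    return hits


secret = toy_encrypt(b"the quick brown fox", (17, 5))
for key, text in brute_force(secret):
    print(key, text)
```

Because the test for “the” is so crude, a wrong key can occasionally pass it, so the sketch returns every candidate rather than stopping at the first hit.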

For experimentation, a worst case scenario was simulated. The best case scenario would be the first guess being the correct key, so it does not require experimentation. Every possible key is equally likely to be the correct one, so there is a 50% chance the key will be found by halfway through the search. The average case scenario should therefore take about half the time of the worst case scenario.

The experiment consisted of many trials, with every trial being five runs of the worst case scenario (a full key search), with the program clocking the run time of each search. An encrypted form of “The quick brown fox jumps over the lazy dog.” was used as the ciphertext being attacked. The variable was the number of rotors (the key size). Starting with one rotor, the number of rotors increased after every trial until the time it took to run the trials became unreasonable (i.e. taking more than a few hours to run a full search). Once this was done, data analysis produced a graph of run time versus rotors (key size). From that, a regression line was created to predict run time for any number of rotors.

One major issue was the choice of computers. Due to accessible resources, the trials were all run on household/workplace grade computers with the algorithm running as a Windows application (coded in Visual Basic). This represents the key size that would be necessary to keep the cipher secure from an average person running a brute force attack. The ideal brute force attack would be conducted on a supercomputer with the algorithm coded in Assembler or machine code for speed, but lack of resources prevented any such test from being run.

It is important to remember that this experiment assumes that a cryptanalyst knows the wiring and order of the rotors, making the only unknown the message-specific key (rotor settings). In practice, it would be an immense challenge for cryptanalysts to discover the rotor wirings. But unless the rotors are changed frequently, Kerckhoff’s Principle must be taken into account and it can be assumed that a cryptanalyst could discover the rotor wirings.

A second part of this phase of the experiment was also conducted to determine what specific aspects of a computer increase the speed of the cipher. Two aspects were looked at: the processor and the amount of RAM. To conduct this experiment, a brute force attack was run ten times on each of many different computers, with the search time recorded. The parameters of the brute force attack were a single rotor with an encrypted form of the text “the quick brown fox”.

Information on each computer was recorded as well; specifically operating system, type of processor, the speed of the processor(s) in GHz, and the amount of RAM in GB (all of this information was accessed by right clicking on “My Computer” and selecting Properties).

An issue with the brute force simulation in general is what else the computer was running while the test was run. Each test was conducted with all other applications closed, but since Microsoft Windows runs many processes in the background, it is hard to know whether the processor was being fully used to run the brute force attack. However, since no other major applications were being run, the speed reduction can be considered minor. Further experimentation regarding this issue could yield faster times, but this will not be discussed in this paper.

Results & Analysis

Results for text shortening

A total of 20 subjects took part in the experiment. Of those 20, 6 were adult (over 18) while 14 were youth (18 and under). Each subject read 16 passages (8 shortened and 8 control) meaning that there were 320 pieces of data, with each piece containing number of stumbles, number of missed words, and time. Three categories of comparison were used for comparing data: number of stumbles, number of missed words, and Vs Control % (percent difference between the time of a shortened passage vs. the time it took the same subject to read the control for that passage). Pure time was not used for comparison since what needed to be identified was the difference between reading shortened text and normal text.
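The Vs Control % measure can be sketched as a small Python function. It assumes the percent difference is taken relative to the control time, which matches the later statement that shortened text took a given percentage longer to read than control. The function name and the sample times are hypothetical and are not data from the experiment.

```python
def vs_control_percent(shortened_seconds: float, control_seconds: float) -> float:
    """Percent by which the shortened reading was slower (positive)
    or faster (negative) than the same subject's control reading."""
    return (shortened_seconds - control_seconds) / control_seconds * 100.0


# Hypothetical reading times for one subject and one passage, in seconds.
print(vs_control_percent(33.0, 30.0))
```

Using each subject's own control time as the baseline removes differences in natural reading speed between subjects.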

After running a t-test, it was found that there is a very significant difference in the Vs Control % of conservatively shortened text and liberally shortened text. Conservatively shortened text took on average 11.2% longer to read than control while liberally shortened text took 45.6% longer to read than control (the P value between the shortening methods was $5.50 \times 10^{-20}$). This result confirms the belief that a more heavily shortened text is harder to comprehend and takes longer to read, but it does not show how understandable the shortened text is compared to normal text. A complete table of comparison can be found in Figure 2.1.

To see how understandable the text is, the shortened text must be compared to the control text. They were compared with number of stumbles and number of missed words (it is already known that both shortening methods took longer to read, so time was not used). Also, since it is being compared to a control, each method of shortening can now be looked at individually.

Passages using the liberal method of shortening contained a significantly higher number of stumbles and missed words than the control, meaning that liberally shortened text is very hard to comprehend. On average, about 2 missed words were present for every liberally shortened text, so meaning may be distorted by the text shortening. However, this is only 2 words in a 65-70 word passage, which is not that much. Liberally shortened analytical passages also had more stumbles and missed words than conversational passages, which makes sense since analytical passages contain larger words that could be harder to read if shortened. A complete table of comparison can be found in Figure 2.2.

Conservatively shortened passages had a significantly higher number of stumbles than control, but the increase in missed words was not significant (P value was 0.0873). The number of missed words on average for conservatively shortened passages was 0.05, which is very low. This means that while comprehension may be slowed down, meaning is almost completely preserved. A complete table of comparison can be found in Figure 2.3.

While a general trend existed of adults having slightly higher stumbles, missed words, and Vs Control %, no trend exhibited any significance. The lowest P value of any youth-adult comparison was 0.2029, which is not near significant. A complete table of comparison can be found in Figure 2.4.

After every test, the subject was asked four questions to assess their familiarity with texting and online lingo. When comparing those who said they were familiar with those who said they were not familiar, no significant difference was found. Interestingly, there was a very slight increase in stumbles and missed words for those who were familiar with texting and online lingo compared to those who were not, but none of the data was significant (the lowest P value was 0.2390). A complete table of comparison can be found in Figure 2.5 and Figure 2.6.

When the data I gathered was compared to the other judge’s data, no significant difference was found in our results, which were very close to each other’s. The P value of the comparison was 0.8660.

Results of Brute Force Attack simulation

When the brute force attack simulation was run, a major problem was discovered. The time it took to run through a single rotor (16 bit key) was about 1.787 seconds (data can be found in Figure 2.7). If the number of keys searched is divided by the average time ($2^{16} / 1.787$), a search speed of 36,674 keys per second is produced. If a test were to be run to try and find the search time for a second rotor, a total of 4,294,967,296 ($256^4$) keys would need to be searched. At the speed of 36,674 keys per second, that would take 117,112 seconds, or about 32 hours. Even greater quantities of time would be required to search any more rotors. Knowing this, it can be concluded that an average computer not designed for cryptography could not run an effective brute force attack on a ciphertext encrypted with three or more rotors.

While this is a result in itself, further research was conducted on the brute force attack.

For the time trials, a very short message was used (“the quick brown fox”), but an investigation into the relation between message size and key search speed seemed necessary, since most messages would be much longer than “the quick brown fox”. So a second set of time trials was run. A brute force attack was run on messages of various sizes (one rotor only, due to already mentioned reasons). Then, using Excel, the data was graphed and a regression equation was created. Two graphs were made: one of the raw times, the other of the key search speed (in keys per second). Both graphs can be found in Figure 2.8 and Figure 2.9.

To estimate the amount of time it would take to run a brute force attack, one would need to take the number of keys and divide it by the key search speed at that size of message. The number of keys can be found by taking 256 to the power of two times the number of rotors. So the equation would look like this:

$t = 256^{2n} / s$

where t = time, n = number of rotors, and s = key search speed. The key search speed is relative to the size of the message. The regression equation in Figure 2.9 would be used for an average computer not designed for cryptography, but a new regression equation would need to be made for computers and processors specifically designed for running brute force searches. However, the equation should still be effective in finding the number of rotors required to keep a message secure if one were to put in the time desired for the message to remain secure and the key search speed.

Something to remember is that all of this is working on the worst case scenario of the brute force attack, where every key must be searched. In reality, an average case scenario would be more realistic. To find the average case scenario time, one can simply divide the worst case scenario time by 2. So, writing $t_{avg}$ for the average search time, the equation would be: $2t_{avg} = 256^{2n} / s$, or equivalently $t_{avg} = 256^{2n} / (2s)$.
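The relation between search time, number of rotors, and key search speed can be worked through in a short Python sketch. The function names are hypothetical. The only measured figure used is the single rotor time of about 1.787 seconds reported above, and the one year target is an arbitrary example.

```python
def key_count(rotors: int) -> int:
    """Number of possible keys: 256 to the power of twice the rotor count."""
    return 256 ** (2 * rotors)


def worst_case_seconds(rotors: int, keys_per_second: float) -> float:
    """Time to search every key at the given search speed."""
    return key_count(rotors) / keys_per_second


def average_case_seconds(rotors: int, keys_per_second: float) -> float:
    """On average the key turns up halfway through the search."""
    return worst_case_seconds(rotors, keys_per_second) / 2


def rotors_needed(secure_seconds: float, keys_per_second: float) -> int:
    """Smallest rotor count whose full search takes at least secure_seconds."""
    rotors = 1
    while worst_case_seconds(rotors, keys_per_second) < secure_seconds:
        rotors += 1
    return rotors


speed = key_count(1) / 1.787        # single rotor search took about 1.787 s
one_year = 365 * 24 * 3600          # example target, in seconds

print(round(speed))                              # keys per second
print(worst_case_seconds(2, speed) / 3600)       # hours for two rotors
print(rotors_needed(one_year, speed))            # rotors for a one year search
```

`rotors_needed` uses the worst case time. To size the key against the average case instead, the target time can simply be doubled before it is passed in.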

Results of brute force simulation on different computers

Nine computers were tested. All ran Microsoft Windows XP except one, which ran a 64 bit version of Windows 7. Information on all computers can be found in Figure 2.10. Average times of the ten trials on each computer were between 0.9 seconds and 1.75 seconds, except for that of the Netbook, which averaged almost 4 seconds. This raises an interesting point, since the GHz of the processors and the GB of the RAM for the Netbook were not outside the normal range of the other computers, but the time from the Netbook was well above the other computers. This is because factors other than the frequency of the processor are at play. Factors like the efficiency of the processor and cache size also affect processor performance. Because these other factors are hard to keep track of, and since all other computers had very similar times, these other factors were considered negligible for data analysis. The Netbook’s data, however, was left out of the data analysis since its processor behaved so differently from the other processors tested.

For analysis, a regression was run using Microsoft Excel. Three independent variables (first processor frequency, second processor frequency, and RAM) were analyzed relative to one dependent variable (time). When no second processor was present, 0 was used. The results of the regression can be found in Figure 2.11. Using the coefficients, the following equation was predicted:

$t = 2.08134 - 0.18688 P_1 + 0.04271 P_2 - 0.14551 R$

where t = time (sec), $P_1$ = the first processor’s frequency (GHz), $P_2$ = the second processor’s frequency (GHz), and R = RAM (GB). The $R^2$ value of the equation was 0.65693, which indicates only a moderate fit. An interesting part of the equation is the fact that the coefficient for the second processor’s frequency is positive, not negative. While this could mean that the second processor is actually detrimental to the run time for the brute force attack, it is much more likely that the program is not even using the second processor and the positive coefficient is a result of the regression trying to find the best fit.

Based on this, a second regression was run looking only at first processor frequency and RAM. The results of the regression can be found in Figure 2.12. Using the coefficients, the following equation was predicted:

$t = 1.95775 - 0.11970 P_1 - 0.13280 R$

The $R^2$ value of the equation was 0.65385, which is almost identical to the $R^2$ value of the first regression. This equation shows that RAM and processor frequency have very similar effects on run time, with RAM having a slightly larger effect. However, the $R^2$ value was still far from ideal. This means that the other factors in the processor that were considered negligible do indeed affect the time it takes to run a brute force attack.

To find a better equation, a third regression was run. This regression used 5 more factors related to the processor: level-2 cache (in MB), front side bus frequency (in MHz), multiplier, voltage (in volts), and thermal design power (in watts). Data can be found in Figure 2.13. The results of the regression can be found in Figure 2.14. The coefficients produced the following equation:

$t = 12.62375 - 0.19966 P - 0.00371 R - 0.73653 C - 0.00203 F - 0.03836 M - 5.85850 V + 0.00820 T$

where t = time (sec), P = first processor’s frequency (GHz), R = RAM (GB), C = level-2 cache (MB), F = front side bus frequency (MHz), M = multiplier, V = voltage (V), and T = thermal design power (W). With an $R^2$ value of 0.9853, this equation is a very good predictor of run time. This regression equation is limited, however. Since it is linear, it predicts impossible circumstances. For instance, the equation could predict a zero second (or even negative time) brute force attack if certain numbers are high enough. But for most normal cases, it should provide a good prediction.
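The third regression equation can be evaluated directly, as in the Python sketch below. The coefficients are the ones reported above. The function name and the two sets of input values are hypothetical and do not describe any computer that was tested.

```python
def predicted_time(p_ghz, ram_gb, cache_mb, fsb_mhz, multiplier, volts, tdp_watts):
    """Predicted single rotor search time in seconds, using the
    coefficients of the third regression."""
    return (12.62375
            - 0.19966 * p_ghz
            - 0.00371 * ram_gb
            - 0.73653 * cache_mb
            - 0.00203 * fsb_mhz
            - 0.03836 * multiplier
            - 5.85850 * volts
            + 0.00820 * tdp_watts)


# Hypothetical household grade machine.
typical = predicted_time(2.0, 2.0, 2.0, 800.0, 10.0, 1.2, 65.0)

# Hypothetical extreme machine: the linear model goes negative, which is
# the impossible circumstance described above.
extreme = predicted_time(4.0, 8.0, 12.0, 1600.0, 12.0, 1.4, 65.0)

print(typical, extreme)
```

The negative result for the second machine shows why the equation should only be trusted for values close to those of the computers that were actually measured.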

To aid in analysis of the equation, a table was created which showed each factor’s effect on the time (this was accomplished by multiplying the technical data by the coefficients produced by the regression). The table can be found in Figure 2.15. When sorted from lowest time to highest time, two factors really set apart the two fastest computers (the two that had times less than one second) from the rest: level-2 cache and front side bus speed. Ironically, these two computers had low voltages, which had a large detrimental effect on the time. While almost every factor had an effect on time, one factor had essentially no effect: RAM. This is most likely because a brute force attack requires a high quantity of low-memory operations, something much more suited to cache memory than RAM. Knowing this, any party wishing to construct the fastest computer using only household grade computing equipment would need to acquire a processor with high level-2 cache, front side bus speed, and voltage. RAM does not need to be large and, unless the program used was modified to take advantage of a second processor, a second processor would not be necessary or helpful.

Conclusion

The goal of this project was to assess the viability of a rotor cipher in the modern world of computers. Success was found in modifying the cipher to strengthen it from its weakened state as the Enigma machine of WWII. Taking advantage of the large scale effort to break the cipher in WWII has allowed the cipher to be changed to counter the weaknesses found.

The cipher has many different advantages. It is easily modifiable: rotors can be added or removed to adjust the security level to what is needed or desired. It provides effective security from brute force attack and, unlike other ciphers that die out as technology advances, more rotors can be added to the cipher to increase its security. All of the tests run on the cipher assume that a cryptanalyst would know the wiring of the rotors, but in reality, a cryptanalyst would have a very challenging time determining the wiring of the rotors.

With the cipher’s advantages also come several disadvantages. The cipher, while very simple, requires the storage of 768 bytes of information per rotor. This may present problems when trying to transfer the rotors to all parties who would use the cipher without an unwanted party receiving a copy. This would also present problems to organizations like governments or banks that already require large databases of keys and would be bogged down by additional rotor data. Finally, the cipher, while strengthened against certain cryptanalytic attacks, is not guaranteed secure against every cryptanalytic attack. Further research could either prove or disprove its security against other attacks, but it is safe to assume it is fairly secure.

So where does this leave the rotor cipher? The rotor cipher presents a means of providing high security for limited traffic between parties, but large quantities of encrypted traffic may be better suited for another cipher. However, with these modifications, it is safe to say that the rotor cipher can be considered a viable cipher in the modern world.

Shortening text also went through extensive testing and experimentation. After analyzing the results, it can be concluded that using the conservative form of text shortening will result in text that, while slightly harder to comprehend, contains no meaning distortion. Liberal text shortening will result in minor meaning distortion. This means normal text can be shortened by approximately 10% without any meaning distortion. This is important for cryptography since shorter messages make cryptanalysis harder. While this process was discussed as part of the rotor cipher, it can actually be used with any cipher since it modifies the plaintext prior to encryption.

Acknowledgements

I would like to thank my instructor, Mr. Donelson, for his help and guidance in this project, Houston Fortney for his help as a second judge, and the subjects of the text shortening experiment for their participation.

Works Cited

______. “Advanced Encryption Standard”. Wikipedia, from http://en.wikipedia.org/wiki/Advanced_Encryption_Standard.

Alvi, Muzammal. “Saying it With a Smile Instead of Words”. EzineArticles.com. July 20, 2009.

______. “Best, worst and average case”. Wikipedia, from http://en.wikipedia.org/wiki/Best,_worst_and_average_case.

______. “Brute force attack”. Wikipedia, from http://en.wikipedia.org/wiki/Brute_force_attack.

Budiansky, Stephen. Battle of Wits. The Free Press, New York City. 2000.

______. “Cryptanalysis of the Enigma”. Wikipedia, from http://en.wikipedia.org/wiki/Cryptanalysis_of_the_Enigma.

______. “Data Encryption Standard”. Wikipedia, from http://en.wikipedia.org/wiki/Data_Encryption_Standard.

DeBrosse, Jim and Colin Burke. The Secret in Building 26. Random House, New York. 2004.

Grinter, Rebecca E. and Margery A. Eldridge. “y do tngrs luv 2 txt msg?” European Conference on Computer Supported Cooperative Work. 2001.

Hamer, David H. “Enigma: Actions Involved in the ‘Double Stepping’ of the Middle Rotor”. Cryptologia. January 21, 1997.

Hamer, David H., Geoff Sullivan, and Frode Weierud. “Enigma Variations: An Extended Family of Machines”. Cryptologia. July 1998.

______. “History of cryptography”. Wikipedia, from http://en.wikipedia.org/wiki/History_of_cryptography.

http://www.3gpp.org

Kahn, David. “An Enigma Chronology”. Cryptologia. July, 1993.

Kahn, David. Seizing the Enigma. Houghton Mifflin Company, Boston. 1991.

Kahn, David. The Codebreakers: The Comprehensive History of Secret Communication from Ancient Times to the Internet. Simon and Schuster, ______. 1996.

______. “Key size”. Wikipedia, from http://en.wikipedia.org/wiki/Key_size.

Mahon, Patrick. "History of Hut 8 to December 1941". The Essential Turing: Seminal Writings in Computer Logic, Philosophy, Artificial Intelligence and Artificial Life. Oxford University Press, Oxford. 2004.

Miller, Ray. “The Cryptographic Mathematics of Enigma”. Center for Cryptologic History, Fort Meade. 1996.

Milner-Barry, Stuart. “Navy Hut 6: Early days”. Codebreakers: The Inside Story of Bletchley Park. Oxford University Press, Oxford. 1993.

Morris, Christopher. “Navy Ultra’s Poor Relations”. Codebreakers: The Inside Story of Bletchley Park. Oxford University Press, Oxford. 1993.

______. “New Language Spawned by SMS Abbreviations”. Asian News International. June 29, 2009.

Pincock, Stephen. Codebreaker: The History of Codes and Ciphers, From the Ancient Pharaohs to Quantum Cryptography. Walker & Company, New York. 2006.

Ratcliff, R. A. “How Statistics Led the Germans to Believe Enigma Secure and Why They Were Wrong”. Cryptologia. April 2003.

Rejewski, Marian and Richard Woytak. “A Conversation with Marian Rejewski: Appendix B”. Enigma: How the German machine cipher was broken, and how it was read by the Allies in World War Two. University Publications of America, ______. 1984.

______. “Side-Channel attack”. Wikipedia, from http://en.wikipedia.org/wiki/Side_channel_attack.

Singh, Simon. The Code Book. Doubleday, New York, 1999.

Welchman, Gordon. The Hut Six Story: Breaking the Enigma Codes. Penguin Books, Harmondsworth. 1984.

Appendix

Figure 1.1

Rules for Shortening:

Both:
• “and” → “&”
• “to”, “too” → “2”
• “you” → “u”
• “at” → “@”
• “see” → “c”
• “for” → “4”
• “are” → “r”
• “why” → “y”
• “because” → “cuz”
• “be” → “b”
• Use all contractions (“can not” → “can’t”)
• Replace spelled out numbers with numerals (“seven” → “7”)
• Do NOT shorten proper names with any method above or below

Conservative only:
• “with” → “wit”
• “until” → “til”
• “about” → “bout”
• “-ing” → “-in”
• “-ed” → “-d”
• “-er” → “-r”

Liberal only:
• Remove all vowels (a, e, i, o, u)
  o Keep vowels in words three letters or less
  o Keep vowels if first or last letter in word

Note: All substitutions preserve capitalization (i.e. “Because…” → “Cuz…”, not “cuz…”)
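The liberal vowel rule can be illustrated with a minimal Python sketch. It covers only the three “Liberal only” rules above. The shared word substitutions, contractions, numerals, and the proper name exception are left out, and the function names are hypothetical. This is not the program in Figure 1.3.

```python
import re

VOWELS = set("aeiouAEIOU")


def strip_vowels(word: str) -> str:
    """Apply the liberal rule to one word."""
    if len(word) <= 3:
        return word                       # short words keep their vowels
    middle = "".join(ch for ch in word[1:-1] if ch not in VOWELS)
    return word[0] + middle + word[-1]    # first and last letters are kept


def liberal_vowel_rule(text: str) -> str:
    """Apply the rule to every run of letters, leaving spaces,
    punctuation, and digits untouched."""
    return re.sub(r"[A-Za-z]+", lambda m: strip_vowels(m.group(0)), text)


print(liberal_vowel_rule("Work has been a great pain as normal."))
```

On this sentence from Sample 1 the sketch gives “Wrk has bn a grt pn as nrml.”, which matches the liberally shortened passage in Figure 1.2. Because proper names are not detected, a name longer than three letters would also be shortened by this sketch.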

Figure 1.2

Passages for text shortening test:

Conversational: Sample 1: Plaintext: 77 words, 353 characters Hey. It is John. How are you doing? I have not talk to you in a long time. Why is that? I have been doing pretty well. Work has been a great pain as normal. How is your job? The wife and kids are doing great. My oldest son Jim is the captain of his soccer team. Write me back, man. I want to hear about how it is going up north. Talk to you later, John.

Shortened Text (Conservative): 324 characters, 91.8% Hey. It’s John. How r u doin? I have not talk 2 u in a long time. Y is that? I’ve been doin pretty well. Work has been a great pain as normal. How’s your job? The wife & kids r doin great. My oldest son Jim is the captain of his soccr team. Write me back, man. I want 2 hear bout how it’s goin up north. Talk 2 u latr, John.

Shortened Text (Liberal): 280 characters, 79.3% Hey. It’s John. How r u dng? I hve not tlk 2 u in a lng tme. Y is tht? I’ve bn dng prtty wll. Wrk has bn a grt pn as nrml. How’s yr job? The wfe & kds r dng grt. My oldst son Jim is the cptn of his sccr tm. Wrte me bck, man. I wnt 2 hr abt how it’s gng up nrth. Tlk 2 u ltr, John.

Sample 2: Plaintext: 76 words, 351 characters What is up, Jill? It is your big sister. I just got the news. Great job on that award. I am so proud of you. Mom and Dad will be so surprised when they hear about this. Is the ceremony on the tenth or the twelfth? Make sure you get me some tickets. I want to be in the front row when you get the award. I truly am proud of you. Your big sister, Alice.

Shortened Text (Conservative): 327 characters, 93.2% What’s up, Jill? It’s your big sistr. I just got the news. Great job on that award. I’m so proud of u. Mom & Dad will b so surprisd when they hear bout this. Is the ceremony on the 10th or the 12th? Make sure u get me some tickets. I want 2 b in the front row when u get the award. I truly am proud of u. Your big sistr, Alice.

Shortened Text (Liberal): 288 characters, 82.1% Wht’s up, Jill? It’s yr big sstr. I jst got the nws. Grt job on tht awrd. I’m so prd of u. Mom & Dad wll b so srprsd whn thy hr abt ths. Is the crmny on the 10th or the 12th? Mke sre u get me sme tckts. I wnt 2 b in the frnt row whn u get the awrd. I trly am prd of u. Yr big sstr, Alice.

Sample 3: Plaintext: 75 words, 378 characters Hey Mary. That was a great party I went to last night at Jenny’s house. Why did you not show up? It would have been much better if you were there. We should get together for dinner sometime. I am free every day until Friday. I know a great Italian place downtown with the best pasta you have ever tasted. What do you say to Thursday at six? Send me back soon. Your friend, Beth.

Shortened Text (Conservative): 351 characters, 92.9% Hey Mary. That was a great party I went 2 last night @ Jenny’s house. Y did u not show up? It would have been much bettr if u were there. We should get togethr 4 dinnr sometime. I’m free every day til Friday. I know a great Italian place downtown wit the best pasta u have evr tastd. What do u say 2 Thursday @ 6? Send me back soon. Your friend, Beth.

Shortened Text (Liberal): 295 characters, 78.0% 42

Hey Mary. Tht was a grt prty I wnt 2 lst nght @ Jenny’s hse. Y did u not shw up? It wld hve bn mch bttr if u wre thre. We shld get tgthr 4 dnnr smtme. I’m fre evry day untl Fri. I knw a grt Itln plce dwntwn wth the bst psta u hve evr tstd. Wht do u say 2 Thurs @ 6? Snd me bck sn. Yr frnd, Beth.

Sample 4: Plaintext: 78 words, 366 characters It is Michael. I have been picked to lead the youth group in our town. Do you think I should take the job, because I am having my doubts. I do not really enjoy the job right now, and being the leader would mean more work for me. But it is a very good experience, and I would be helping the community. I think I am going to take it, but I want to know what you think.

Shortened Text (Conservative): 338 characters, 92.3% It’s Michael. I have been pickd 2 lead the youth group in our town. Do u think I should take the job, cuz I am havin my doubts. I don’t really enjoy the job right now, & bein the leadr would mean more work 4 me. But it’s a very good experience, & I would b helpin the community. I think I’m goin 2 take it, but I want 2 know what u think.

Shortened Text (Liberal): 287 characters, 78.4% It’s Michael. I hve bn pckd 2 ld the yth grp in our twn. Do u thnk I shld tke the job, cz I am hvng my dbts. I don’t rlly enjy the job rght now, & bng the ldr wld mn mre wrk 4 me. But it’s a vry gd exprnce, & I wld b hlpng the cmmnty. I thnk I’m gng 2 tke it, but I wnt 2 knw wht u thnk.

Analytical:

State Department:
Plaintext: 65 words, 410 characters
Status Report from Moscow. Relations with the Kremlin are beginning to break down. The Russian President is calling for immediate reductions in our nuclear stockpiles by three hundred warheads or they threaten to sanction exports to the United States. As Ambassador to Russia, I would strongly suggest that President Smith use his influence to try and resolve this issue, or military answers may be necessary.

Shortened Text (Conservative): 369 characters, 90.0%
Status Report from Moscow. Relations wit the Kremlin r beginnin 2 break down. The Russian President is callin 4 immediate reductions in our nuclear stockpiles by 300 warheads or they threaten 2 sanction exports 2 the US. As Ambassador 2 Russia, I’d strongly suggest that President Smith use his influence 2 try & resolve this issue, or military answers may b necessary.

Shortened Text (Liberal): 304 characters, 74.1%
Stts Rprt frm Moscow. Rltns wth the Kremlin r bgnnng 2 brk dwn. The Russian Prsdnt is cllng 4 immdte rdctns in our nclr stckpls by 300 wrhds or thy thrtn 2 snctn exprts 2 the US. As Ambssdr 2 Russia, I’d strngly sggst tht Prsdnt Smith use his inflnce 2 try & rslve ths isse, or mltry answrs may b ncssry.

Military:
Plaintext: 67 words, 399 characters
Field Report from General Brown. The army has been drilling three miles south of Fort Victory for the past six months. In light of the recent aggression by the enemy, we have moved five miles north west towards the front. We are currently positioned outside the village of Johnstown. An ambush has been planned by Generals Smith and Jones and I will recommend reinforcements immediately for success.

Shortened Text (Conservative): 372 characters, 93.2%
Field Report from General Brown. The army has been drillin 3 miles south of Fort Victory 4 the past 6 months. In light of the recent aggression by the enemy, we’ve movd 5 miles north west towards the front. We r currently positiond outside the village of Johnstown. An ambush has been plannd by Generals Smith & Jones & I’ll recommend reinforcements immediately 4 success.

Shortened Text (Liberal): 308 characters, 77.2%
Fld Rprt frm Gnrl Brown. The army has bn drllng 3 mls sth of Frt Victory 4 the pst 6 mnths. In lght of the rcnt aggrssn by the enmy, we’ve mvd 5 mls nrth wst twrds the frnt. We r crrntly pstnd otsde the vllge of Johnstown. An ambsh has bn plnnd by Gnrls Smith & Jones & I’ll rcmmnd rnfrcmnts immdtly 4 sccss.

Business/Corporate:
Plaintext: 66 words, 397 characters
To CEO Williams. We have completed the prototype for the new product with excellent results. Efficiency is improved forty three percent while cost is reduced thirteen percent. The final design will be ready on November 9. It is crucial that our rival Anderson Technologies does not find out about this, since using this product could boost their profit by one million dollars. From R&D, Mr. Davis.

Shortened Text (Conservative): 342 characters, 86.1%
2 CEO Williams. We have completd the prototype 4 the new product wit excellent results. Efficiency is improvd 43% while cost is reducd 13%. The final design will b ready on Nov 9. It is crucial that our rival Anderson Technologies does not find out bout this, since usin this product could boost their profit by 1 mil. $. From R&D, Mr. Davis.

Shortened Text (Liberal): 291 characters, 73.3%
2 CEO Williams. We hve cmpltd the prttype 4 the new prdct wth excllnt rslts. Effcncy is imprvd 43% whle cst is rdcd 13%. The fnl dsgn wll b rdy on Nov 9. It is crcl tht our rvl Anderson Technologies ds not fnd out abt ths, snce usng ths prdct cld bst thr prft by 1 mil $. Frm R&D, Mr. Davis.

Scientific:
Plaintext: 67 words, 369 characters
To Dr. Tompson. We have discovered a new form of hydrogen while experimenting in Switzerland. It seems to act as if effected by some unknown force. We think it may be gravity, but are not sure. Our test involved one thousand pieces of hydrogen and resulted in four hundred being changed. The test rig was moving at fifty miles per hour during the test. From Dr. Wilson.

Shortened Text (Conservative): 322 characters, 87.3%
2 Dr. Tompson. We have discoverd a new form of hydrogen while experimentin in Switzerland. It seems 2 act as if effectd by some unknown force. We think it may b gravity, but aren’t sure. Our test involvd 1000 pieces of hydrogen & resultd in 400 bein changd. The test rig was movin @ 50 mph durin the test. From Dr. Wilson.

Shortened Text (Liberal): 288 characters, 78.0%
2 Dr. Tompson. We hve dscvrd a new frm of hydrogen whle exprmntng in Switzerland. It sms 2 act as if effctd by sme unknwn frce. We thnk it may b grvty, but arn’t sre. Our tst invlvd 1000 pcs of hydrogen & rsltd in 400 bng chngd. The tst rig was mvng @ 50 mph drng the tst. Frm Dr. Wilson.

Average percent of reduction for conservative shortening: 9.2%
Average percent of reduction for liberal shortening: 22.4%
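The two averages can be checked from the character counts listed with each passage. The sketch below assumes the average is the plain mean of the eight per-passage ratios (shortened length divided by plaintext length); the names `COUNTS` and `average_reduction` are illustrative only.

```python
# (plaintext, conservative, liberal) character counts for the eight passages,
# in the order they appear above.
COUNTS = [
    (353, 324, 280),  # Conversational, Sample 1
    (351, 327, 288),  # Conversational, Sample 2
    (378, 351, 295),  # Conversational, Sample 3
    (366, 338, 287),  # Conversational, Sample 4
    (410, 369, 304),  # State Department
    (399, 372, 308),  # Military
    (397, 342, 291),  # Business/Corporate
    (369, 322, 288),  # Scientific
]


def average_reduction(column):
    """Mean percent reduction for one shortening method (1 = conservative, 2 = liberal)."""
    ratios = [row[column] / row[0] for row in COUNTS]
    return 100 * (1 - sum(ratios) / len(ratios))


conservative = average_reduction(1)
liberal = average_reduction(2)
print(f"conservative: {conservative:.2f}%   liberal: {liberal:.2f}%")
```

Under that assumption the computed means come out close to the reported 9.2% and 22.4%.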

Figure 1.3

Automated Text Shortening Program
Written by Jordan Zink in VBA (Word)

'declaration of a public variable
Public ReplaceCheck As Boolean

Sub TextShortener()
    'defines whether shortening method is conservative (false) or liberal (true);
    'this is changed manually (could be easily modified for friendly user interface)
    IsLiberalShortening = False
    'gets the length of the text
    TextLength = Len(ActiveDocument.Range.Text) - 1
    'A is a counter for the current spot the program is searching in the text (like a cursor)
    A = 1
    Do While A <= TextLength
        'check if current character is a letter or some other character using LetterCheck function
        If LetterCheck(Mid(ActiveDocument.Range.Text, A, 1)) = False Then
            'it is some other character (.,:"'?! etc.), no shortening will be applied, add to A and loop
            A = A + 1
        Else
            'it is a letter, continue
            'find length of word
            'B is a counter for the length of the word
            B = 1
            Do
                'searches until it finds a non-letter character
                If LetterCheck(Mid(ActiveDocument.Range.Text, A + B, 1)) = False Then Exit Do
                B = B + 1
            Loop
            'check for proper noun (all caps)
            If UCase(Mid(ActiveDocument.Range.Text, A, B)) = Mid(ActiveDocument.Range.Text, A, B) Then
                'proper noun, no text shortening applied, add to A and loop
                A = A + B
            Else
                'not proper, continue
                'check substitution database
                'first, check if capitalized (so substitute will also be capitalized)
                IsCapitalized = False
                If UCase(Mid(ActiveDocument.Range.Text, A, 1)) = Mid(ActiveDocument.Range.Text, A, 1) Then IsCapitalized = True
                'puts word in "Txt" variable (makes it lower case for easier substitution searching)
                Txt = LCase(Mid(ActiveDocument.Range.Text, A, B))
                'ReplaceCheck sees if any modifications have been made (starts out false,
                'will be set to true if the ReplaceDocTxt function is run)
                ReplaceCheck = False

                'this section contains substitutions to be applied for both conservative and liberal shortening

                'IMPORTANT: the way the program applies modifications is by placing a "ß" character
                'wherever there is a character that needs to be removed. A later function will remove
                'all "ß" (this was done for ease and speed of programming)
                '("ß" character used because it is an odd character not used in English)

                'substitutions are self explanatory
                If Txt = "and" Then Temp = ReplaceDocTxt(A, "&ßß")
                If Txt = "to" Then Temp = ReplaceDocTxt(A, "2ß")
                If Txt = "too" Then Temp = ReplaceDocTxt(A, "2ßß")
                If Txt = "you" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Ußß")
                If Txt = "you" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "ußß")
                If Txt = "at" Then Temp = ReplaceDocTxt(A, "@ß")
                If Txt = "see" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Cßß")
                If Txt = "see" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "cßß")
                If Txt = "for" Then Temp = ReplaceDocTxt(A, "4ßß")
                If Txt = "are" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Rßß")
                If Txt = "are" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "rßß")
                If Txt = "why" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Yßß")
                If Txt = "why" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "yßß")
                If Txt = "because" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Cuzßßßß")
                If Txt = "because" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "cuzßßßß")
                If Txt = "be" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Bß")
                If Txt = "be" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "bß")
                If Txt = "one" Then Temp = ReplaceDocTxt(A, "1ßß")
                If Txt = "two" Then Temp = ReplaceDocTxt(A, "2ßß")
                If Txt = "three" Then Temp = ReplaceDocTxt(A, "3ßßßß")
                If Txt = "four" Then Temp = ReplaceDocTxt(A, "4ßßß")
                If Txt = "five" Then Temp = ReplaceDocTxt(A, "5ßßß")
                If Txt = "six" Then Temp = ReplaceDocTxt(A, "6ßß")
                If Txt = "seven" Then Temp = ReplaceDocTxt(A, "7ßßßß")
                If Txt = "eight" Then Temp = ReplaceDocTxt(A, "8ßßßß")
                If Txt = "nine" Then Temp = ReplaceDocTxt(A, "9ßßß")
                If Txt = "ten" Then Temp = ReplaceDocTxt(A, "10ß")
                'apply type-specific shortening
                If IsLiberalShortening = False Then
                    'conservative only
                    If B > 1 Then
                        'check if ending in "-er" or "-ed". If so, remove the "e"
                        If Mid(Txt, B - 1, 2) = "ed" Then Temp = ReplaceDocTxt(A + B - 2, "ß")
                        If Mid(Txt, B - 1, 2) = "er" Then Temp = ReplaceDocTxt(A + B - 2, "ß")
                    End If
                    If B > 2 Then
                        'check if ending in "-ing". If so, remove "g"
                        If Mid(Txt, B - 2, 3) = "ing" Then Temp = ReplaceDocTxt(A + B - 1, "ß")
                    End If
                    'other substitutions (conservative only)
                    If Txt = "with" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "Witß")
                    If Txt = "with" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "witß")
                    If Txt = "until" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "ßßTil")
                    If Txt = "until" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "ßßtil")
                    If Txt = "about" And IsCapitalized = True Then Temp = ReplaceDocTxt(A, "ßBout")
                    If Txt = "about" And IsCapitalized = False Then Temp = ReplaceDocTxt(A, "ßbout")
                Else
                    'liberal only
                    If ReplaceCheck = False Then
                        'still no modifications, continue
                        'check word length
                        If B > 3 Then
                            'the word is long enough; remove vowels
                            For Search = 2 To B - 1
                                If VowelCheck(Mid(Txt, Search, 1)) = True Then Temp = ReplaceDocTxt(A + Search - 1, "ß")
                            Next Search
                        End If
                    End If
                End If
                'move cursor past word that was just shortened
                A = A + B
            End If
        End If
    Loop
    'function Replacer actually removes all "ß" characters
    '(Temp is for the returned value of the function (unused))
    Temp = Replacer("ß", "")
End Sub

Function ReplaceDocTxt(Where, What As String)
    'sets ReplaceCheck to true to show that a change has been made
    ReplaceCheck = True
    'sticks in string requested
    ActiveDocument.Characters(Where + Len(What) - 1).InsertAfter (What)
    'removes old string
    For Tmp = Where To Where + Len(What) - 1
        ActiveDocument.Characters(Where).Delete
    Next Tmp
End Function

Function LetterCheck(Txt As String) As Boolean
    LetterCheck = True
    'if upper case and lower case equal each other, then it is a non-letter character
    If UCase(Txt) = LCase(Txt) Then LetterCheck = False
End Function

Function VowelCheck(Txt As String) As Boolean
    VowelCheck = False
    'checks if it is a vowel
    If Txt = "a" Or Txt = "e" Or Txt = "i" Or Txt = "o" Or Txt = "u" Then VowelCheck = True
End Function

Function Replacer(FindT As String, ReplaceT As String)
    'replaces requested text with another requested text
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = FindT
        .Replacement.Text = ReplaceT
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Function

Figure 1.4

Figure 1.5

Rotor Cipher Program
Written by Jordan Zink in Visual Basic (2008 edition)
Note: “ _” at the end of a line denotes that the statement continues on the next line

Sub BruteForceAttack()
    'start timer
    Dim startSec As Integer = Now.Second
    Dim startMSec As Integer = Now.Millisecond
    'initialize variables
    Dim PlainText As String
    Dim CipherText As String
    Dim CurChar As String
    Dim Hits As Integer
    Dim TempA As Byte
    Dim TempB As Byte
    Hits = 0
    'load plaintext
    PlainText = System.IO.File.ReadAllText(TextBox2.Text + TextBox1.Text, _
        System.Text.Encoding.Default)
    'initialize rotors
    Dim RotorsSub(0 To NumericUpDown1.Value) As String
    Dim RotorsSft(0 To NumericUpDown1.Value) As String
    Dim RotorsSet(0 To NumericUpDown1.Value, 0 To 1) As Integer
    Dim RotorsSetStart(0 To NumericUpDown1.Value, 0 To 1) As Integer
    'load substitution and shift rotors
    For A As Byte = 1 To CByte(NumericUpDown1.Value)
        RotorsSub(A) = Mid(System.IO.File.ReadAllText(TextBox2.Text + "Rotor" + _
            CStr(NumericUpDown1.Value + 1 - A) + "Sub.txt", _
            System.Text.Encoding.Default), 257, 256)
        RotorsSft(A) = System.IO.File.ReadAllText(TextBox2.Text + "Rotor" + _
            CStr(NumericUpDown1.Value + 1 - A) + "Sft.txt", _
            System.Text.Encoding.Default)
    Next
    'prep key search
    For A As Integer = 1 To NumericUpDown1.Value
        For B = 0 To 1
            RotorsSetStart(A, B) = 0
        Next
    Next
    Do
        'load key
        For A As Integer = 1 To NumericUpDown1.Value
            For B = 0 To 1
                RotorsSet(A, B) = RotorsSetStart(A, B)
            Next
        Next
        'run decrypt (see below for comments)
        CipherText = ""
        For WhichChar As Integer = 1 To Len(PlainText)
            CurChar = Mid(PlainText, WhichChar, 1)
            For NumRotor As Byte = 1 To CByte(NumericUpDown1.Value)
                CurChar = Chr(Mod256(Asc(Mid(RotorsSub(NumRotor), Asc(CurChar) + 1, 1)) - _
                    RotorsSet(NumRotor, 0)))
            Next
            CipherText = CipherText + CurChar
            '"rotate rotors" (advance rotor settings)
            For NumRotor As Byte = 1 To CByte(NumericUpDown1.Value)
                RotorsSet(NumRotor, 0) = Asc(Mid(RotorsSft(NumRotor), _
                    Mod256(RotorsSet(NumRotor, 0) + RotorsSet(NumRotor, 1)) + 1, 1))
                RotorsSet(NumRotor, 1) = RotorsSet(NumRotor, 1) + 1
                If RotorsSet(NumRotor, 1) = 256 Then RotorsSet(NumRotor, 1) = 0
            Next
        Next
        'look at decrypted text for "the" (to determine if correct decrypted text)
        For A = 1 To Len(CipherText) - 2
            If Mid(CipherText, A, 3) = "the" Then Hits = Hits + 1
        Next
        'advance to next key
        TempA = 1
        TempB = 0
        Do
            RotorsSetStart(TempA, TempB) = RotorsSetStart(TempA, TempB) + 1
            If RotorsSetStart(TempA, TempB) = 256 Then
                RotorsSetStart(TempA, TempB) = 0
                TempB = TempB + 1
                If TempB = 2 Then
                    TempB = 0
                    TempA = TempA + 1
                    If TempA > NumericUpDown1.Value Then Exit Do
                End If
            Else
                Exit Do
            End If
        Loop
        If TempA > NumericUpDown1.Value Then Exit Do
    Loop
    'search done, stop timer
    Dim finishMSec As Integer = Now.Millisecond
    Dim finishSec As Integer = Now.Second
    'display time
    'note: only the seconds and milliseconds fields are compared, so the
    'displayed value is not valid if the run crosses a minute boundary
    TextBox4.Text = finishSec + finishMSec / 1000 - (startSec + startMSec / 1000)
    TextBox5.Text = Hits
End Sub

Sub EncryptText()
    Dim PlainText As String
    Dim CipherText As String
    'load plaintext
    PlainText = System.IO.File.ReadAllText(TextBox2.Text + TextBox1.Text, _
        System.Text.Encoding.Default)
    CipherText = ""
    'initialize rotors
    Dim RotorsSub(0 To NumericUpDown1.Value) As String
    Dim RotorsSft(0 To NumericUpDown1.Value) As String
    Dim RotorsSet(0 To NumericUpDown1.Value, 0 To 1) As Integer
    'load substitution and shift rotors (if decrypting, in reverse order)
    For A As Byte = 1 To CByte(NumericUpDown1.Value)
        If EncryptSelected.Checked = True Then RotorsSub(A) = _
            Mid(System.IO.File.ReadAllText(TextBox2.Text + "Rotor" + CStr(A) + _
            "Sub.txt", System.Text.Encoding.Default), 1, 256)
        If DecryptSelected.Checked = True Then RotorsSub(A) = _
            Mid(System.IO.File.ReadAllText(TextBox2.Text + "Rotor" + _
            CStr(NumericUpDown1.Value + 1 - A) + "Sub.txt", _
            System.Text.Encoding.Default), 257, 256)
        If EncryptSelected.Checked = True Then RotorsSft(A) = _
            System.IO.File.ReadAllText(TextBox2.Text + "Rotor" + CStr(A) + "Sft.txt", _
            System.Text.Encoding.Default)
        If DecryptSelected.Checked = True Then RotorsSft(A) = _
            System.IO.File.ReadAllText(TextBox2.Text + "Rotor" + _
            CStr(NumericUpDown1.Value + 1 - A) + "Sft.txt", _
            System.Text.Encoding.Default)
    Next
    'load key (forward if encrypting, reversed if decrypting)
    Dim Counter As Integer
    If EncryptSelected.Checked = True Then
        Counter = 1
        For A As Byte = 1 To CByte(NumericUpDown1.Value)
            For B = 0 To 1
                RotorsSet(A, B) = Asc(Mid(TextBox3.Text, Counter, 1))
                Counter = Counter + 1
            Next
        Next
    End If
    If DecryptSelected.Checked = True Then
        Counter = NumericUpDown1.Value * 2
        For A As Byte = 1 To CByte(NumericUpDown1.Value)
            For B = 1 To 0 Step -1
                RotorsSet(A, B) = Asc(Mid(TextBox3.Text, Counter, 1))
                Counter = Counter - 1
            Next
        Next
    End If
    Dim CurChar As String
    'run cipher (same process for encrypt and decrypt)
    For WhichChar As Integer = 1 To Len(PlainText)
        'load current character
        CurChar = Mid(PlainText, WhichChar, 1)
        'move through every rotor
        For NumRotor As Byte = 1 To CByte(NumericUpDown1.Value)
            'looks up and performs the substitution
            If EncryptSelected.Checked = True Then CurChar = Mid(RotorsSub(NumRotor), _
                Mod256(Asc(CurChar) + RotorsSet(NumRotor, 0)) + 1, 1)
            If DecryptSelected.Checked = True Then CurChar = _
                Chr(Mod256(Asc(Mid(RotorsSub(NumRotor), Asc(CurChar) + 1, 1)) - _
                RotorsSet(NumRotor, 0)))
        Next
        CipherText = CipherText + CurChar
        '"rotate rotors" (advance rotor settings)
        For NumRotor As Byte = 1 To CByte(NumericUpDown1.Value)
            RotorsSet(NumRotor, 0) = Asc(Mid(RotorsSft(NumRotor), _
                Mod256(RotorsSet(NumRotor, 0) + RotorsSet(NumRotor, 1)) + 1, 1))
            RotorsSet(NumRotor, 1) = RotorsSet(NumRotor, 1) + 1
            If RotorsSet(NumRotor, 1) = 256 Then RotorsSet(NumRotor, 1) = 0
        Next
    Next
    'write text to new file
    If EncryptSelected.Checked = True Then System.IO.File.WriteAllText(TextBox2.Text + _
        Mid(TextBox1.Text, 1, Len(TextBox1.Text) - 4) + "(encrypted).txt", CipherText, _
        System.Text.Encoding.Default)
    If DecryptSelected.Checked = True Then System.IO.File.WriteAllText(TextBox2.Text + _
        Mid(TextBox1.Text, 1, Len(TextBox1.Text) - 4) + "(decrypted).txt", CipherText, _
        System.Text.Encoding.Default)
End Sub

Function Mod256(ByVal WhatNum As Integer) As Integer
    'performs modulo 256 (for inputs between -256 and 511)
    If WhatNum > 255 Then
        WhatNum = WhatNum - 256
    End If
    If WhatNum < 0 Then
        WhatNum = WhatNum + 256
    End If
    Mod256 = WhatNum
End Function

Sub RotorGenerator()
    'initialize variables
    Dim CurRotor As String
    Dim Temp(0 To 256) As Byte
    Dim Z As Byte
    'makes number of rotors specified
    For NumRotor As Byte = 1 To CByte(NumericUpDown2.Value)
        'sets rotor type (1 = substitution rotor, 2 = shift rotor)
        For RType As Byte = 1 To 2
            'initialize Temp array
            For A As Integer = 0 To 255
                Temp(A) = A
            Next
            CurRotor = ""
            'generate rotor
            For A As Integer = 0 To 255
                Randomize()
                Z = Int((255 - A) * Rnd())
                'gets random character from array that has not been used yet
                CurRotor = CurRotor + Chr(Temp(Z))
                'advances array so character will not be used again
                For B As Integer = Z To 255 - A
                    Temp(B) = Temp(B + 1)
                Next
            Next
            'checks if substitution rotor
            If RType = 1 Then
                'reverses rotor and adds to end of original rotor (for easy decryption)
                For A As Integer = 1 To 256
                    Temp(Asc(Mid(CurRotor, A, 1))) = A - 1
                Next
                For A As Integer = 0 To 255
                    CurRotor = CurRotor + Chr(Temp(A))
                Next
            End If
            'writes rotors
            If RType = 1 Then System.IO.File.WriteAllText(TextBox2.Text + "Rotor" + _
                CStr(NumRotor) + "Sub.txt", CurRotor, System.Text.Encoding.Default)
            If RType = 2 Then System.IO.File.WriteAllText(TextBox2.Text + "Rotor" + _
                CStr(NumRotor) + "Sft.txt", CurRotor, System.Text.Encoding.Default)
        Next
    Next
End Sub
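For readers who do not use Visual Basic, the encrypt, decrypt, and rotor-advance steps of the listing can be sketched in Python. This is a simplified sketch under stated assumptions, not a port of the program: it works on bytes in memory instead of rotor files, it builds rotors with `random.shuffle` in place of the `RotorGenerator` routine, and the names `make_rotor`, `advance`, `encrypt`, and `decrypt` are hypothetical. Each rotor's key is a pair of bytes (offset, counter), matching the two settings per rotor in the listing.

```python
import random


def make_rotor(rng):
    """Return (substitution table, its inverse, shift table), each over 0..255."""
    sub = list(range(256))
    rng.shuffle(sub)
    inv = [0] * 256
    for index, value in enumerate(sub):
        inv[value] = index
    sft = list(range(256))
    rng.shuffle(sft)
    return sub, inv, sft


def advance(state, rotors):
    """Step every rotor: look up a new offset in the shift table, then bump the counter."""
    for k, (_, _, sft) in enumerate(rotors):
        offset, counter = state[k]
        state[k] = (sft[(offset + counter) % 256], (counter + 1) % 256)


def encrypt(data, rotors, key):
    state = list(key)
    out = bytearray()
    for byte in data:
        # Pass the byte through each rotor in order.
        for (sub, _, _), (offset, _) in zip(rotors, state):
            byte = sub[(byte + offset) % 256]
        out.append(byte)
        advance(state, rotors)
    return bytes(out)


def decrypt(data, rotors, key):
    state = list(key)
    out = bytearray()
    for byte in data:
        # Undo the rotors in reverse order using the inverse tables.
        for (_, inv, _), (offset, _) in zip(reversed(rotors), reversed(state)):
            byte = (inv[byte] - offset) % 256
        out.append(byte)
        advance(state, rotors)
    return bytes(out)


rng = random.Random(2010)                 # fixed seed so the example is repeatable
rotors = [make_rotor(rng) for _ in range(3)]
key = [(17, 200), (3, 0), (255, 255)]     # one (offset, counter) pair per rotor
message = b"Status Report from Moscow."
assert decrypt(encrypt(message, rotors, key), rotors, key) == message
```

With two key bytes per rotor, a key for n rotors has 256^(2n) possible values, which is the space the brute force routine walks through one setting at a time.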

Figure 2.1

Conservative Shortening vs. Liberal Shortening

| Type of Comparison | Specifics | Mean of 1st Set | Mean of 2nd Set | P value |
|---|---|---|---|---|
| Vs Control % | None | 11.2% | 45.6% | 5.50E-20 |
| Stumbles | None | 0.61 | 1.54 | 1.53E-07 |
| Missed Words | None | 0.05 | 2.15 | 9.56E-13 |
| Vs Control % | Conversational | 10.2% | 37.8% | 3.23E-10 |
| Stumbles | Conversational | 0.68 | 1.31 | 0.0019 |
| Missed Words | Conversational | 0.05 | 1.33 | 4.91E-06 |
| Vs Control % | Analytical | 12.2% | 54.1% | 4.67E-12 |
| Stumbles | Analytical | 0.55 | 1.79 | 4.66E-05 |
| Missed Words | Analytical | 0.05 | 3.05 | 5.72E-09 |


Figure 2.2

Liberal Shortening vs. Control (No Shortening)

| Type of Comparison | Specifics | Mean of 1st Set | Mean of 2nd Set | P value |
|---|---|---|---|---|
| Stumbles | None | 1.54 | 0.21 | 6.48E-14 |
| Missed Words | None | 2.15 | 0.006 | 4.30E-13 |
| Stumbles | Conversational | 1.31 | 0.23 | 1.29E-09 |
| Missed Words | Conversational | 1.33 | 0 | 2.24E-06 |
| Stumbles | Analytical | 1.79 | 0.20 | 4.65E-07 |
| Missed Words | Analytical | 3.05 | 0.01 | 4.84E-09 |

Figure 2.3

Conservative Shortening vs. Control (No Shortening)

| Type of Comparison | Specifics | Mean of 1st Set | Mean of 2nd Set | P value |
|---|---|---|---|---|
| Stumbles | None | 0.61 | 0.21 | 2.24E-05 |
| Missed Words | None | 0.05 | 0.006 | 0.0873 |
| Stumbles | Conversational | 0.68 | 0.23 | 0.0028 |
| Missed Words | Conversational | 0.053 | 0 | 0.1600 |
| Stumbles | Analytical | 0.55 | 0.20 | 0.0028 |
| Missed Words | Analytical | 0.048 | 0.013 | 0.3274 |

Figure 2.4

Youth vs. Adult

| Type of Comparison | Specifics | Mean of 1st Set | Mean of 2nd Set | P value |
|---|---|---|---|---|
| Vs Control % | None | 27.7% | 30.1% | 0.5668 |
| Stumbles | None | 0.60 | 0.75 | 0.2029 |
| Misses | None | 0.51 | 0.65 | 0.4739 |
| Vs Control % | Conversational | 24.9% | 24.1% | 0.8732 |
| Vs Control % | Analytical | 30.4% | 36.1% | 0.4087 |
| Vs Control % | Conservative | 11.1% | 11.4% | 0.8922 |
| Vs Control % | Liberal | 44.2% | 48.8% | 0.3993 |

Figure 2.5

Familiar with Texting vs. Not Familiar

| Type of Comparison | Specifics | Mean of 1st Set | Mean of 2nd Set | P value |
|---|---|---|---|---|
| Vs Control % | None | 27.8% | 29.9% | 0.6109 |
| Stumbles | None | 0.68 | 0.55 | 0.2390 |
| Missed Words | None | 0.58 | 0.48 | 0.5241 |

Figure 2.6

Familiar with Online Lingo vs. Not Familiar

| Type of Comparison | Specifics | Mean of 1st Set | Mean of 2nd Set | P value |
|---|---|---|---|---|
| Vs Control % | None | 28.5% | 28.0% | 0.9048 |
| Stumbles | None | 0.64 | 0.66 | 0.9044 |
| Missed Words | None | 0.58 | 0.45 | 0.4860 |

Figure 2.7

Running on a Windows XP OS with an AMD Athlon 2500+ processor at 1.84 GHz

1.765 1.828 1.797 1.734 1.813 *Time in seconds

Figure 2.8

[Chart: Brute Force Time (One Rotor). x-axis: Number of characters in plaintext/ciphertext (0 to 450). y-axis: Time (seconds) (0 to 60). Fitted curve: y = 8E-05x^2 + 0.105x - 0.0608, R^2 = 1]


Figure 2.9

[Chart: Key Search Speed. x-axis: Number of characters in plaintext/ciphertext (0 to 450). y-axis: Speed (keys per second) (0 to 160000). Fitted curve: y = 752257x^(-1.0634), R^2 = 0.9988]
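The fitted speed curve can be combined with the relation from the abstract, t = 256^(2n)/s, to get a rough idea of how brute force time grows with the number of rotors n. The sketch below reads the Figure 2.9 trendline as a power law, y = 752257·x^(-1.0634), and assumes for illustration that the same key search speed holds for any number of rotors on the machine where the curve was measured; in practice each extra rotor adds work per key, so the real times would be longer. The function names are hypothetical.

```python
def key_search_speed(chars):
    """Keys per second predicted by the Figure 2.9 power-law fit (assumed form)."""
    return 752257 * chars ** -1.0634


def brute_force_seconds(rotors, speed):
    """Time to try every key: two key bytes per rotor gives 256^(2n) keys."""
    return 256 ** (2 * rotors) / speed


speed = key_search_speed(400)  # a message of about 400 characters
for n in (1, 2, 3):
    seconds = brute_force_seconds(n, speed)
    print(f"{n} rotor(s): {seconds:.3g} s  ({seconds / 86400:.3g} days)")
```

Each added rotor multiplies the key count by 256^2 = 65,536, so the estimate moves from under a minute for one rotor to weeks for two and thousands of years for three at this speed.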

Figure 2.10

Brute force run times on different computers

| General Type | OS | Type of processor | 1st processor freq. (GHz) | 2nd processor freq. (GHz) | RAM (GB) | Average Time (seconds) |
|---|---|---|---|---|---|---|
| Desktop | Windows XP | AMD Athlon XP 3200+ | 2.19 | None | 1.00 | 1.472 |
| Desktop | Windows XP | AMD Athlon XP 2500+ | 1.84 | None | 1.50 | 1.750 |
| Desktop | Windows XP | Intel Pentium D CPU | 2.80 | 2.80 | 1.00 | 1.580 |
| Netbook | Windows XP | Intel Atom CPU N270 | 1.60 | None | 1.00 | 3.964 |
| Laptop | Windows XP | Intel Core 2 Duo CPU P8400 | 2.26 | 1.58 | 2.95 | 0.952 |
| Laptop | Windows 7 (64 bit) | Intel Core 2 Duo CPU P7350 | 2.00 | 2.00 | 6.00 | 0.981 |
| Laptop | Windows XP | Intel Pentium M 725 | 1.60 | 0.60 | 1.00 | 1.660 |
| Desktop | Windows XP | Intel Pentium 4 CPU HT 650 | 3.40 | 3.40 | 3.00 | 1.268 |
| Laptop | Windows XP | Mobile Intel Pentium 4 CPU HT 3.2 | 3.20 | 1.85 | 0.47 | 1.444 |


Figure 2.11

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.818508438 |
| R Square | 0.669956063 |
| Adjusted R Square | 0.656928012 |
| Standard Error | 0.166077627 |
| Observations | 80 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 3 | 4.255106336 | 1.418368779 | 51.42412365 | 2.93808E-18 |
| Residual | 76 | 2.096215152 | 0.027581778 | | |
| Total | 79 | 6.351321488 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 2.081336029 | 0.125853008 | 16.537833 | 6.78802E-27 |
| 1st Processor Frequency | -0.186882119 | 0.060106823 | -3.109166497 | 0.002640396 |
| 2nd Processor Frequency | 0.042711387 | 0.032851338 | 1.300141466 | 0.197481677 |
| RAM | -0.145507826 | 0.014679091 | -9.91259126 | 2.43087E-15 |

Figure 2.12

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.814011874 |
| R Square | 0.66261533 |
| Adjusted R Square | 0.653852092 |
| Standard Error | 0.166820477 |
| Observations | 80 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 4.208482985 | 2.104241492 | 75.6130687 | 6.80496E-19 |
| Residual | 77 | 2.142838503 | 0.027829071 | | |
| Total | 79 | 6.351321488 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 1.957750737 | 0.082852127 | 23.62945671 | 4.99392E-37 |
| 1st Processor Frequency | -0.119696147 | 0.030836372 | -3.881654717 | 0.000217534 |
| RAM | -0.132799481 | 0.011000385 | -12.07225769 | 1.93305E-19 |
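As a worked check of how the Figure 2.12 coefficients would be used, the sketch below evaluates the two-predictor equation for one machine from Figure 2.10. It assumes the response variable is the run time in seconds, that frequency is in GHz, and that RAM is in GB; the function name `predicted_time` is illustrative only.

```python
# Coefficients from the Figure 2.12 regression output.
INTERCEPT = 1.957750737
FREQ_COEF = -0.119696147   # per GHz of 1st processor frequency
RAM_COEF = -0.132799481    # per GB of RAM


def predicted_time(freq_ghz, ram_gb):
    """Predicted brute force run time in seconds (assumed units)."""
    return INTERCEPT + FREQ_COEF * freq_ghz + RAM_COEF * ram_gb


# AMD Athlon XP 3200+ from Figure 2.10: 2.19 GHz, 1.00 GB, measured average 1.472 s.
estimate = predicted_time(2.19, 1.00)
print(f"predicted {estimate:.3f} s vs. measured 1.472 s")
```

The prediction lands within about a tenth of a second of the measured average, which is in line with the standard error of roughly 0.17 reported for this model.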


Figure 2.13

More processor data

| Type of processor | L2-Cache | Front Side Bus | Multiplier | Voltage | TDP |
|---|---|---|---|---|---|
| AMD Athlon XP 3200+ | 512 KB | 400 MHz | 11x | 1.65 V | 76.8 W |
| AMD Athlon XP 2500+ | 512 KB | 333 MHz | 11x | 1.65 V | 68.3 W |
| Intel Pentium D CPU | 2 MB | 800 MHz | 14x | 1.3 V | 95 W |
| Intel Atom CPU N270 | 512 KB | 533 MHz | 12x | 1.1 V | 2.5 W |
| Intel Core 2 Duo CPU P8400 | 3 MB | 1066 MHz | 8.5x | 1.15 V | 25 W |
| Intel Core 2 Duo CPU P7350 | 3 MB | 1066 MHz | 7.5x | 1.15 V | 25 W |
| Intel Pentium M 725 | 2 MB | 400 MHz | 16x | 1.34 V | 15 W |
| Intel Pentium 4 CPU HT 650 | 2 MB | 800 MHz | 17x | 1.3 V | 84 W |
| Mobile Intel Pentium 4 CPU HT 3.2 | 512 KB | 533 MHz | 24x | 1.5 V | 76 W |

Figure 2.14

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.999083791 |
| R Square | 0.998168421 |
| Adjusted R Square | 0.985347368 |
| Standard Error | 0.109270413 |
| Observations | 9 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 7 | 6.507038199 | 0.929576886 | 77.85385915 | 0.087052268 |
| Residual | 1 | 0.011940023 | 0.011940023 | | |
| Total | 8 | 6.518978222 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 12.62375 | 0.78125764 | 16.15823675 | 0.039348901 |
| Processor Frequency | -0.19966 | 0.277376856 | -0.719808786 | 0.602814764 |
| RAM | -0.00371 | 0.042665112 | -0.086984021 | 0.944763284 |
| L2 - Cache | -0.73653 | 0.080882086 | -9.10627595 | 0.069630999 |
| Front Side Bus | -0.00203 | 0.00062386 | -3.260044702 | 0.189478828 |
| Multiplier | -0.03836 | 0.02365915 | -1.621392709 | 0.351826277 |
| Voltage | -5.85850 | 0.475595476 | -12.31823343 | 0.051568009 |
| TDP | 0.00820 | 0.003804659 | 2.154207112 | 0.276678909 |


Figure 2.15

Note: Time (Act.) is the actual average time. Time (Equ.) is the time predicted by the regression equation.

Each Factor's Effect on Time

| Time (Act.) | Time (Equ.) | Frq | RAM | L2-Cache | Front Side Bus | Mult | Voltage | TDP |
|---|---|---|---|---|---|---|---|---|
| 0.95 | 0.93 | -0.45 | -0.011 | -2.21 | -2.17 | -0.33 | -6.7 | 0.20 |
| 0.98 | 1.00 | -0.40 | -0.022 | -2.21 | -2.17 | -0.29 | -6.7 | 0.20 |
| 1.27 | 1.25 | -0.68 | -0.011 | -1.47 | -1.63 | -0.65 | -7.6 | 0.69 |
| 1.44 | 1.45 | -0.64 | -0.002 | -0.37 | -1.08 | -0.92 | -8.8 | 0.62 |
| 1.47 | 1.54 | -0.44 | -0.004 | -0.37 | -0.81 | -0.42 | -9.7 | 0.63 |
| 1.58 | 1.59 | -0.56 | -0.004 | -1.47 | -1.63 | -0.54 | -7.6 | 0.78 |
| 1.66 | 1.67 | -0.32 | -0.004 | -1.47 | -0.81 | -0.61 | -7.9 | 0.12 |
| 1.75 | 1.68 | -0.37 | -0.006 | -0.37 | -0.68 | -0.42 | -9.7 | 0.56 |
| 3.96 | 3.96 | -0.32 | -0.004 | -0.37 | -1.08 | -0.46 | -6.4 | 0.02 |
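Each entry in the factor columns appears to be a regression coefficient from Figure 2.14 multiplied by that machine's value for the factor, with Time (Equ.) equal to the intercept plus the sum of the contributions. The sketch below reproduces the first row under that assumption, using the Intel Core 2 Duo P8400 values from Figures 2.10 and 2.13 (L2 cache taken in MB, front side bus in MHz). The dictionary names are illustrative, and small differences from the table come from the rounded coefficients.

```python
INTERCEPT = 12.62375
# Coefficients from the Figure 2.14 regression output.
COEFFICIENTS = {
    "Frq": -0.19966,
    "RAM": -0.00371,
    "L2-Cache": -0.73653,
    "Front Side Bus": -0.00203,
    "Mult": -0.03836,
    "Voltage": -5.85850,
    "TDP": 0.00820,
}
# Intel Core 2 Duo CPU P8400: GHz, GB, MB, MHz, multiplier, volts, watts.
P8400 = {
    "Frq": 2.26,
    "RAM": 2.95,
    "L2-Cache": 3,
    "Front Side Bus": 1066,
    "Mult": 8.5,
    "Voltage": 1.15,
    "TDP": 25,
}

# Contribution of each factor = coefficient * machine value.
contributions = {name: COEFFICIENTS[name] * P8400[name] for name in COEFFICIENTS}
predicted = INTERCEPT + sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:15s} {value:8.3f}")
print(f"predicted time: {predicted:.2f} s (measured 0.95 s)")
```

Voltage and L2 cache carry the largest negative contributions for this machine, which matches the pattern visible down those columns of the table.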