
“Breaking and Entering”: Evaluation of Various Decryption Techniques to Decipher a Polyalphabetic Substitution Cipher

A dissertation submitted in partial fulfilment of the requirements for the degree of Bachelor of Science (Honours) in Computing

by Ryan J. Brown

Department of Computing & Information Systems Cardiff School of Management

Cardiff Metropolitan University

April 2017


Declaration
I hereby declare that this dissertation entitled “Breaking and Entering”: Evaluation of Various Decryption Techniques to Decipher a Polyalphabetic Substitution Cipher is entirely my own work, and it has never been submitted nor is it currently being submitted for any other degree.

Candidate: Ryan Brown

Signature:

Date: 18/04/2017

Supervisor: Dr Chaminda Hewage/Dr Ambikesh Jayal

Signature:

Date: 18/04/2017


Abstract
The dissertation research is based upon both cryptology (the study of encryption and decryption) and the analysis of machine learning algorithms. Combining these two studies helped in both understanding cryptosystems and testing the performance of the different machine learning algorithms. The purpose is to use the machine learning algorithms to decrypt a polyalphabetic cipher by using a cipher-text-only attack and exploiting the key used.

This dissertation topic mainly looked at the effectiveness of the different algorithms' performance whilst decrypting a polyalphabetic cipher. This produced useful data and reliable information about each algorithm, including its functionality, its complexity, and its ability to find better keys.


Acknowledgments
I would like to thank my family and friends for supporting me throughout University. I would especially like to thank my parents and brother for making many cups of tea during my studies. Without this I may have not been as motivated.

A big thank you to my supervisor Dr Chaminda Hewage/Dr. Ambikesh Jayal for the continuous help and support throughout the dissertation. Also, I would like to thank Dr Ambikesh Jayal for the initial input and ideas.


Table of Contents
Abstract ...... iii
Acknowledgments ...... iv
Table of Contents ...... v
List of Tables ...... vii
List of Figures ...... viii
1. Introduction ...... 1
1.1. Purpose and Structure ...... 1
1.2. Research Question ...... 2
2. Literature Review ...... 2
2.1. History of Cryptology ...... 2
2.2. Cryptology ...... 6
2.2.1. Cryptography ...... 7
2.2.1.1. Substitution Cipher ...... 7
2.2.1.1.1. Simple Substitution/Monoalphabetic ...... 8
2.2.1.1.2. Polyalphabetic Cipher ...... 9
2.2.1.2. Transposition Cipher ...... 11
2.2.2. Cryptanalysis ...... 11
2.2.2.1. Cryptanalysis Attacks ...... 11
2.3. Machine Learning Algorithms ...... 14
2.3.1. Hill Climbing Algorithm ...... 15
2.3.2. Genetic Algorithm ...... 16
2.3.3. Simulated Annealing ...... 17
2.4. The Big O Notation ...... 18
3. Methodology ...... 19
3.1. Approach ...... 19
3.2. Strategy and Research Design ...... 19
3.2.1. Encryption ...... 20
3.2.2. Decryption ...... 22
3.2.3. Scoring ...... 23
3.2.4. Machine Learning Methods ...... 25
3.2.4.1. 'Random Hill Climbing' ...... 26
3.2.4.2. 'Genetic Algorithm' ...... 27
3.2.4.3. 'Random Optimisation' ...... 28
3.2.4.4. 'Simulated Annealing' ...... 28
3.3. Data Collection and Analysis Methods ...... 29
3.4. Limitations ...... 30
4. Evaluation and Results ...... 31
4.1. Random Hill Climbing ...... 31
4.2. Genetic Algorithm ...... 35
4.3. Random Optimisation ...... 39
4.4. Simulated Annealing ...... 42
4.5. Evaluation ...... 46
5. Conclusion ...... 48
References ...... 50
Appendices ...... 54
Appendix 1 – 217 words for encryption from 'Winston Churchill to Franklin D. Roosevelt' message, November 23, 1940 (National Churchill Museum, no date) ...... 54
Appendix 2 – Encrypting the message by 'Winston Churchill' in Appendix 1 ...... 56
Appendix 3 – JUnit Testing the encrypted message ...... 59
Appendix 4 – Decrypting the polyalphabetic cipher ...... 61
Appendix 5 – Fitness Scoring of text using Quad grams (Practical Cryptography, 2009) ...... 62
Appendix 6 – Machine Learning Algorithms implementation ...... 64
Appendix 7 – Testing performance of algorithms ...... 71
Appendix 8 – Ethics form (2016D0099) ...... 75


List of Tables
Table 1 - Polyalphabetic cipher using the modern tabula recta in figure 3 ...... 10
Table 2 - Random Hill Climbing data ...... 31
Table 3 - Random Hill Climbing graphs ...... 33
Table 4 - Genetic Algorithm data ...... 35
Table 5 - Genetic Algorithm graphs ...... 37
Table 6 - Random Optimisation data ...... 39
Table 7 - Random Optimisation graphs ...... 41
Table 8 - Simulated Annealing data ...... 42
Table 9 - Simulated Annealing graphs ...... 44


List of Figures
Figure 1 - Simple Monoalphabetic Substitution Cipher (Stamp, 2011) ...... 3
Figure 2 - Alberti Cipher disk (Rodriguez-Clark, 2013) ...... 4
Figure 3 - Modern Tabula Recta (Kahn, 1997) ...... 5
Figure 4 - Monoalphabetic Cipher in order from left to right ...... 8
Figure 5 - Hill Climbing flow chart ...... 15
Figure 6 - Genetic Algorithm stages (Zang, Zhang and Hapeshi, 2010) ...... 16
Figure 7 - Simulated Annealing flow chart ...... 17
Figure 8 - Score of the best key ...... 31
Figure 9 - Hill Climbing table 2 representation ...... 34
Figure 10 - Genetic Algorithm table 4 representation ...... 38
Figure 11 - Random Optimisation table 6 representation ...... 42
Figure 12 - Simulated Annealing table 8 representation ...... 45
Figure 13 - Plot.ly graph made with data from a random iteration ...... 46
Figure 14 - Results from all algorithms' iterations ...... 47


1. Introduction
1.1. Purpose and Structure
This dissertation analyses decryption techniques. Its primary purpose is to identify trends in how the decryption techniques arrive at a solution, as well as the time it takes to complete the decryption. The decryption techniques used within this dissertation deploy machine learning algorithms [1] to solve the polyalphabetic cipher.

One of the main reasons for using machine learning algorithms as a way of decryption is to look at how the different types of algorithms perform, to see their benefits, and perhaps what they may not be able to achieve. Therefore, this dissertation analyses machine learning algorithms to the extent that they can help solve real-life problems. The problem posed in this dissertation is that an encrypted polyalphabetic cipher [2] needs to be decrypted, and using such algorithms will show their tested performances.

The dissertation is structured as follows:
o First, the introduction outlines the purpose and structure of the project as well as the research question.
o Second, the literature review provides an evaluation of the previous writings in this research area and shows how this dissertation fits within the literature basis.
o Third, the methodology section outlines the methodological approach that was taken to answer the dissertation's primary research question.
o Fourth, an evaluation of the progress and results of the project.
o Finally, a conclusion is provided which summarises the research findings and identifies the scope for future research stemming from the project.

[1] "Machine Learning" describes using a set of tools and methods on data to find patterns and extract insights into the real world (Conway and White, 2012); this allows computers to understand and critically analyse given data.
[2] The Vigenere cipher is used for encryption. The Vigenere cipher was regarded as the 'model' for all other polyalphabetic substitutions and "probably the most famous cipher of all time" (Kahn, 1997). This is one of the main reasons for using this cipher for encryption, given its importance in history.

1.2. Research Question
This dissertation evaluates the various decryption techniques to decipher a polyalphabetic substitution cipher. Hence the primary research question can be described as follows: How effective are machine learning algorithms in decrypting polyalphabetic substitution ciphers?

The above research question is examined through a number of sub-questions, namely:
o Which polyalphabetic substitution cipher would be most suited to the task?
o Which machine learning algorithms would be best as methods of decryption?
o What is the best type of decryption method for decrypting a polyalphabetic cipher?

Therefore, this dissertation shall focus on the analysis of these sub-questions to fully evaluate the overall research premise.

2. Literature Review
This section discusses the background literature that helped gather the research findings in order to complete this project. The foundation of this dissertation topic was discovered by reading a dissertation by Dorey (n.d.), which looks at analysing hill climbing algorithms to decipher a monoalphabetic type of encryption. Where this dissertation differs is in analysing different methods and a greater number of algorithms whilst using a polyalphabetic encryption.

2.1. History of Cryptology
To understand cryptology today, it is important to look at its history and the techniques created and used over many years. In summary, cryptology is the study of encrypting and decrypting a message to prevent others from intercepting its secret content.

The first evidence of cryptology was discovered in an Egyptian town called Menet Khufu, dating from around 1900 BC (Kahn, 1997). In that case, many centuries ago, a master scribe had told the story of his lord's life using unusual hieroglyphs (Kahn, 1997). However, this type of cryptology was not used as a form of secret writing, but to 'impart dignity and authority' to the writings (Kahn, 1997, p.65).

These texts are seen as cryptographic messages, as they are a 'deliberate transformation of the writing' (Kahn, 1997), suggesting that during this time cryptology was used as a way to show one's knowledge and for future success. Kahn (1997) continues to state that the development of cryptography paralleled the development of humankind itself.

Over time, cryptology started to develop with security in mind, but at this stage it was used as a way to hold the reader's attention and encourage them to try to solve the riddles buried within the messages (Kahn, 1997). It often proved unsuccessful, as many readers had no desire to decipher them; thus began the age of cryptography and its secrecy (Kahn, 1997).

Later, other civilisations such as China and India started to develop cryptology using their own methods, proving its importance to the military, for provisions and for espionage. An example of this is "The Arthasastra", which was used by India's espionage service to distribute messages to all spies in 'secret writings' (Kahn, 1997). It is also worth mentioning that the famous "Kama Sutra" also "lists secret writing as one of the 64 arts, or yogas, that women should know and practice" (Kahn, 1997). Even though 'cryptography' and 'cryptology' were evident, there was no real evidence of 'cryptanalysis' (the study of decrypting) being written down or documented until it was discovered amongst the Arabs (Kahn, 1997). This helped to improve its study, and cryptanalysis became just as important as cryptography; hence the term 'cryptology'.

The first substitution cipher (created around 100 BC by the very well-known Julius Caesar) was one of the simplest but most effective early forms of cryptography. The 'Caesar Cipher' involved changing the alphabet by shifting it up three places, as shown below in figure 1 (Stamp, 2011).

Plain text:  a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher text: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

Figure 1 - Simple Monoalphabetic Substitution Cipher - Caesar Cipher (Stamp, 2011)

This proved to be important for Julius Caesar when communicating with his generals during his military campaigns, as a way of protecting confidential information about military plans (Rivest, 2013). Using this cipher, shown in figure 1, the words 'attack at dawn' would read 'dwwdfn dw gdzq', making it difficult for an interceptor to understand. However, this type of cipher can easily be solved using a frequency analysis or brute force, due to the cipher's limitations. This is discussed further in section 2.2.

However, up until the middle ages, there were no major changes in the development of cryptography. During this time the 'polyalphabetic' substitution cipher was developed by Leon Battista Alberti, one of the most notable figures in the field and often called the "Father of Western Cryptology", and it proved to be more secure than the monoalphabetic cipher (Damico, 2009). Alberti created a method using "two copper disks that fit together", each with an alphabet inscribed on it. It would use a random starting point (key/code) that would change the order of the alphabet when one of the disks was turned, as shown in figure 2 (Damico, 2009). The position of the starting alphabet would be the main indicator of the transformation from 'plain text' to 'cipher text', making the cipher text secure enough that it would be difficult to run a frequency analysis, see section 2.2 (Rodriguez-Clark, 2013). This laid the foundation for the contemporary conception of polyalphabetic substitution.

Figure 2 - Alberti Cipher disk (Rodriguez-Clark, 2013)

The polyalphabetic substitution cipher varied throughout time, until Blaise de Vigenere discovered a suitable way of using a keyword that "repeats above the plaintext letters until each one has a keyletter" (Kahn, 1997). This uses a modern tabula recta (figure 3): "26 standard horizontal alphabets, each slid one space to the left of the one above", allowing the encipherer to change each letter to its corresponding alignment in the tabula recta (Kahn, 1997). This cipher proved difficult to solve for over 300 years; however, it was considered a slow method for its time, due to the period spent encrypting long plaintext messages into cipher text, as well as the time the recipient spent deciphering the message, and for that main reason it was not widely used (Kahn, 1997).

Figure 3 - Modern Tabula Recta (Kahn, 1997)

A more secure adaptation of this type of polyalphabetic cipher is known as the 'autokey', which uses 'the message itself as its own key' (Kahn, 1997). This reduces the chance of the cipher being solved, and it would subsequently prove difficult to break. However, it bears the same issues as the Vigenere cipher because of the time it would take to encrypt and decrypt a message. As a result, the Vigenere cipher was seen as 'far more susceptible to solution than Vigenere's autokey' (Kahn, 1997).

For a long time, there has been a major use of symmetric algorithms (including the ciphers previously discussed), in which the sender and recipient have the same key to encrypt and decrypt a message (Schneier, 1996). This relies upon the key to secure the content, and there are two distinct types: first, 'stream ciphers' process the plaintext a single bit or a single character at a time; and second, 'block ciphers', as the name suggests, process a block of text or bits at a time (Schneier, 1996). Public key algorithms were later developed, in the twentieth century, to improve security; the key used to encrypt plaintext is different to the key used for decryption, which consequently made it a lot more secure (Schneier, 1996).

New techniques and algorithms have been created and used over many centuries (e.g. the Vigenere cipher) to make decryption more difficult and time-consuming. It may therefore be suggested that data become useless after a certain period of time (Hat, 2017). The early twentieth century was an important time for cryptology and its study, with the occurrence of both World Wars and the need for secret and important messages to be communicated (Kahn, 1997). From then on, improvements and techniques of cryptology became an important aspect of securing data.

2.2. Cryptology
Stamp (2011) states that: 'Cryptology is the art and science of making and breaking "secret codes"'. Martin (2012) defines cryptology as a loosely used term to describe 'the design and analysis of mechanisms based on mathematical techniques' to secure data and information.

There are two types of studies in cryptology. “Cryptography” describes the fundamentals of securing data by using such mechanisms to design an algorithm (Martin, 2012). “Cryptanalysis” is the opposite of cryptography and uses an ‘analysis of such mechanisms’ to decrypt its encryption (Martin, 2012). Cryptology is therefore a way of transforming an original message into cipher text that an interceptor may not be able to read and understand. However, the main recipient of the message could transform the message back to its original readable message by using a decryption technique.

However, more importantly, 'cryptography' and 'cryptanalysis', when used together, improve the cryptosystem's security and its ability to withhold its important information (Tilborg, 2006, p.1). Churchhouse (2001, p.4) describes cryptographers and cryptanalysts as adversaries, as "each tries to outwit the other", and it becomes a "fascinating intellectual battle" between both parties trying to solve each other's problems. This certainly implies that cryptology is ever-evolving and needs to grow to keep up with securing important personal and business data. The history presented in section 2.1 clearly shows cryptology's evolution over time and the influence it has had on different cultures and events, even in today's society.

The question is: who would want to use a cryptosystem today, and why? Tilborg (2006, p.1) states that a cryptosystem provides the necessary 'confidentiality' of data being transmitted over the internet to its rightful recipient, without a third party seeing or using others' data. Furthermore, 'authentication' allows the recipient to identify the owner of a work or where something has been sent from. Finally, 'integrity' of data and information makes sure that the correct modifications have been made by the correct author and not by a third party (Tilborg, 2006, p.1).

2.2.1. Cryptography
Cryptography is perhaps more important to understand than trying to decipher an encrypted message. A strong encryption makes data somewhat impenetrable, and prevents the data from being modified or stolen by a third party. However, having discussed the history of cryptology, it is apparent that encryption needs to evolve, because the techniques become solvable over time (Schneier, 1996, p.20). As suggested by Schneier (1996, p.20), depending on what type of algorithm is used, the difficulty of breaking it varies. Therefore, if the 'amount of data encrypted with a single key is less than the amount of data necessary to break the algorithm, then you're probably safe' (Schneier, ibid.).

2.2.1.1. Substitution Cipher
As the name suggests, a substitution cipher is one where a character in the plaintext is substituted for another character (Schneier, 1996, p.22), as shown previously in figure 1 with the Caesar Cipher. There are, however, various types of substitution cipher, which have different characteristics and levels of security.


2.2.1.1.1. Simple Substitution/Monoalphabetic Ciphers
The first substitution cipher (see section 2.1) is perhaps the most famous cipher and is best known as the Caesar Cipher, due to its simplicity and effectiveness for its time. It simply works by shifting the alphabet up three places, as shown in figure 1. However, this type of cipher would not be regarded as useful today, as it provides little security and can easily be solved (Martin, 2012, p.53).

Following on from the Caesar cipher, another encryption method developed to become known as the monoalphabetic cipher, where the alphabet is rearranged in a different order. This takes the form of a one-to-one relationship between the two alphabets, known as a permutation (Churchhouse, 2001, p.27), as shown in figure 4. Both the sender and recipient of the message would have to know the permuted alphabet in order to decipher it. As an example of using this cipher with figure 4, the word 'TREE' is permuted to 'HFLL'; subsequently, to find the original message the process is reversed.

Alphabet:        a b c d e f g h i j k l m n o p q r s t u v w x y z
Cipher Alphabet: I M R S L B X T A K J E D C U W V F Z H G Y P O N Q

Figure 4 - Monoalphabetic Cipher in order from left to right

Unlike the Caesar cipher, which can be solved by shifting the alphabet at most 25 times until it becomes readable English, the monoalphabetic cipher has a far greater number of possibilities to search through (Martin, 2012, p.50). In fact, Martin (2012, p.51) states that the total number of possible permutations is 26 factorial, which is represented as:

26 x 25 x 24 x …. x 3 x 2 x 1


This number can also be represented as approximately 4 × 10^26. Being such an enormous number, it would take a 'computer capable of testing one thousand million alphabets every second…. a hundred million years to complete the task' using brute force [3] (Churchhouse, 2001, p.30).

[3] A brute-force attack is a widely known form of cryptanalysis which performs an exhaustive search of all possible keys until a solution is found (Martin, 2012).

However, these techniques of encryption can easily be broken through the use of a frequency analysis or brute force (Schneier, 1996). A frequency analysis can exploit a simple substitution because each plaintext character is always represented by the same specific character in the cipher text (Martin, 2012, p.53). In the English language, the frequency of the most common letters in the alphabet is well known: for example, in about "every 100 000 letters of typical English text we would expect about 8167 As, 12 702 Es, but only 74 Zs." (Martin, 2012, p.53). Because the cipher text retains the same frequency distribution as the underlying plaintext, this analysis can solve a monoalphabetic cipher quite easily.
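
To make this concrete, the short Python sketch below (an illustration only, not code from this project) counts the relative frequency of each letter in a piece of cipher text so that it can be compared against the expected English distribution:

from collections import Counter

def letter_frequencies(text):
    """Return the relative frequency of each letter A-Z in a piece of text."""
    letters = [c for c in text.upper() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {letter: counts.get(letter, 0) / total for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}

# A monoalphabetic cipher preserves these frequencies, so the most common cipher
# letter most likely stands for 'E', the next most common for 'T', and so on.
print(letter_frequencies("WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ"))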

2.2.1.1.2. Polyalphabetic Cipher
The polyalphabetic cipher, as discussed in section 2.1, is a far more advanced version of a simple substitution cipher. Although the simple substitution ciphers can be solved through a frequency analysis, the polyalphabetic cipher cannot; this is largely because of its unique use of a key to change the permutation of the alphabet multiple times (Martin, 2012, p.67).

This cipher works by using the types of devices shown in figures 2 and 3, where a key is identified and used as the main interpreter for the new cipher text. The key is used as the starting point for the new alphabet. Using figure 3 to help work out the solution, if the first letter in the key is 'F' (the 6th letter of the alphabet) and the first letter of the plaintext is 'L', then a shift of 6 is made and a new permuted alphabet is available to use (Churchhouse, 2001, p.41). Following the steps of the Caesar cipher, the new ciphered letter would be 'Q'. Therefore, the first letter in the key encrypts the first letter in the plaintext; this continues until all the plaintext letters are converted into cipher text (Schneier, 1996, p.23). The key is reused to fit the length of the plaintext and is called the 'period'. In traditional cryptography, the longer the key, the tougher this type of cipher is to solve (Schneier, 1996, p.23). This cipher can be represented in the example below:

Plaintext: A T T A C K A T D A W N
Key:       T E S T T E S T T E S T
Cipher:    T X L T V O S M W E O G

Table 1 - Polyalphabetic cipher using the modern tabula recta in figure 3

Thus, the message 'ATTACK AT DAWN' is now represented as the new cipher text 'TXLTVO SM WEOG', using the key 'TEST'. The cipher text does not bear any resemblance to the English language and a frequency analysis would be ineffective on this type of cipher, due to the irregularity of the letters used (Martin, 2012, p.67). As with the simple substitution ciphers, reversing the process of encryption makes it possible to decrypt, as long as the same key is used.

The most well-known polyalphabetic method is the Vigenere cipher, which for a long period proved to be unsolvable. This is the method that this research paper adopts as part of the investigation into testing and analysing the performance of the machine learning algorithms.

On the other hand, even though polyalphabetic ciphers were thought to be unbreakable, after a long period of study they too were eventually solved (Kahn, 1997). One of the methods that uncovered the polyalphabetic cipher's weaknesses is the Kasiski method. This method exploits the length of the key used in the cryptosystem, which can then be used to help find a solution. As this type of cryptosystem relies on using the key more than once, repeated phrases are visible in the cipher text; therefore, once the length of the key is found, methods such as frequency analysis can help solve this type of encryption (Tilborg, 2006, p.19).
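
As an illustration of the idea only (the Kasiski method was not implemented in this project), the following Python sketch records the distances between repeated three-letter sequences in a cipher text; common factors of these distances suggest the key length:

from collections import defaultdict
from functools import reduce
from math import gcd

def repeated_sequence_distances(ciphertext, length=3):
    """Map each repeated substring of the given length to the gaps between its occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: [b - a for a, b in zip(pos, pos[1:])]
            for seq, pos in positions.items() if len(pos) > 1}

# Toy example: repeated sequences 12 positions apart hint at a key length dividing 12.
gaps = repeated_sequence_distances("TXLTVOSMWEOGTXLTVOSMWEOG")
all_gaps = [g for distances in gaps.values() for g in distances]
print(gaps)
print("candidate key length factor:", reduce(gcd, all_gaps) if all_gaps else None)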


2.2.1.2. Transposition Cipher
Unlike the substitution cipher, which exchanges its letters for new letters, the transposition cryptosystem is one which uses the same plaintext message re-arranged in a specific order (Churchhouse, 2001). This leaves the message un-ordered and hard to read. The order in which the letters are shuffled is decided between the sender and recipient. However, a frequency analysis (as discussed previously in section 2.2.1.1.1) could aid a cryptanalyst in determining the correct order of the encrypted message. Nevertheless, this type of cryptosystem is perhaps weak, as it is a simple cipher that can be exploited (Schneier, 1996).

2.2.2. Cryptanalysis
The purpose of cryptanalysis is to uncover or exploit encrypted information. It is the study and science of recovering the original plaintext without knowing the key (Schneier, 1996). In other words, cryptanalysis is seen most commonly today as 'code-breaking' and 'hacking', but is perhaps better known as an 'attack' (Schneier, 1996). It is about finding and attacking the vulnerabilities in weak methods to gain knowledge of the plaintext (Bergmann, 2007).

As discussed earlier in the cryptography section (2.2.1), even a relatively simple form of encryption (i.e. the simple substitution cipher) would take millions of years to break by exhaustive search. A cryptanalyst's role is therefore not to try to solve the cipher text itself, but to expose the initial key used. Thus, the most important part of any cryptosystem is its secret key, which allows messages to be converted between their encrypted and decrypted states (Bergmann, 2007). Bergmann (2007) states that by identifying the key used in one message, it can then be utilised to intercept more messages.

2.2.2.1. Cryptanalysis Attacks
As a cryptanalyst, it is important to understand what type of algorithm has been used before attempting to unravel the cipher and give meaning to its content. Schneier (1996) explains that if a cryptanalyst cannot break the algorithm even with knowledge of how it works, then they are unlikely to be successful at breaking it. Therefore, before any cryptanalyst can 'attack' an encrypted message, it is important to discover and analyse the type of method used for the cryptosystem.

There are two main general types of attacks used in cryptanalysis: first, a passive attack takes the approach of unethically stealing or intercepting information, but does not modify the data in any way and sometimes cannot be identified either; and second, an active attack is when a deliberate modification of the data has been conducted (Martin, 2012). However, this research considers more specific types of attacks which are most commonly used by a cryptanalyst. Below are types of attacks which can be adopted by a cryptanalyst:

o Cipher text-only attack. This is the most common type of attack; it intercepts messages that have been encrypted using the same type of encryption method. A cryptanalyst would analyse and determine the encryption method used in order to exploit the plaintext or even the key (Schneier, 1996). Gaining access to the key allows the cryptanalyst to use it on other messages encrypted with the same key.

o Known-plaintext attack. Unlike the cipher text-only attack, where only the cipher text is available, the known-plaintext attack gives the cryptanalyst knowledge of both the plaintext and the cipher text. Therefore, it focuses on exploiting the key used in the cryptosystem (Schneier, 1996).

o Chosen-plaintext attack. The cryptanalyst has both the encrypted and decrypted text, having chosen the plaintext to be encrypted. This type of attack uses the plaintext to better understand the algorithm and how it works; knowing the plaintext which has been encrypted simplifies the task of uncovering the key (tutorialspoint.com, 2017).

o Chosen-cipher text attack. The cryptanalyst chooses cipher texts to be decrypted and gains access to the resulting plaintexts, and can therefore work towards the key that correctly produces the plaintext (Schneier, 1996).


o Chosen-key attack. Schneier (1996) explains that this type of attack is perhaps unreliable and impractical, because the cryptanalyst should have some knowledge of the 'relationship between different keys' before a key can be used to encrypt or decrypt some text. To summarise, this attack takes wild guesses at what the key of the cryptosystem is and can be considered a less common type of attack.

o Adaptive chosen-plaintext attack. This is a more advanced version of the ‘chosen-plaintext attack’ in which a cryptanalyst can choose smaller segments of the plaintext to encrypt and throughout the process can choose different segments of the text. By doing this, the cryptanalyst can exhaust the keys used to identify the best key for the cryptosystem (Schneier, 1996).

o Timing attack. This type of attack helps to exploit the length of the key, which can be very useful for a cryptanalyst in finding the best solution to the cryptosystem. It does this by measuring the time the algorithm takes to run; a longer completion time suggests a longer key (tutorialspoint.com, 2017).

o Rubber-hose attack. This occurs when violent actions are taken by a cryptanalyst to gain information about the key used in the cryptosystem. Examples of the methods used to gain information about the key include threatening, blackmailing or even torture (Schneier, 1996).

The chance of a successful attack is determined by the amount of information that the cryptanalyst has available. The type of attack that will be used is dependent upon the information that the cryptanalyst has intercepted.


2.3. Machine Learning Algorithms
Machine Learning makes up part of artificial intelligence, and its purpose is to find out whether a computer can question what it is doing and then improve from its experience, just like a human. Therefore, the computer learns consistently over time, which helps in finding a better solution to a problem (Bell, 2015). More importantly, it learns 'without being explicitly programmed', which means that it has the ability to learn from tasks and experiences with other similar datasets, in turn improving the program's performance and producing better solutions (Bell, 2015). It is without doubt that this kind of research is very important in solving problems today, such as: discovering new medicines and accurately diagnosing patients; working out better solutions to a specific problem (e.g. the travelling salesman problem [4]); and finally, using machine learning to better understand cryptology.

More specifically, machine learning does this by identifying patterns and insights in datasets drawn from real-world examples, as previously stated (Conway and White, 2012). The better a computer understands the datasets, the more accurate and reliable the results. In machine learning, there are two distinct learning techniques: supervised learning, where humans use their own knowledge to help guide the computer in making better decisions using training data; and unsupervised learning, which runs an algorithm without any supervision to find any pattern or correlation in the data (Lones, 2014).

Some machine learning algorithms are based upon nature and its inspirations, as this can provide a useful way of looking at a particular problem. This means that nature-inspired algorithms can be employed to achieve solutions to difficult tasks (Zang, Zhang and Hapeshi, 2010). For example, ant colony optimisation is a nature-inspired algorithm analysing ants' 'social behaviour in finding shortest paths' (Zang, Zhang and Hapeshi, 2010); this type of algorithm could also help to solve other real-world issues that require finding the shortest route.

[4] The travelling salesman problem is a computer science and mathematical problem which tries to find the most effective, shortest route for travelling between a set of different cities.

A very well illustrated and explained example of these types of optimisation algorithms is given by Segaran (2007), where he implements the different algorithms to find the cheapest flights and to determine whether a group of people can meet on time in New York. What is useful about this example is that it helps to compare and identify the strengths and weaknesses of the different algorithms.

This research takes into consideration three main machine learning nature-inspired algorithms. These are outlined below.

2.3.1. Hill Climbing Algorithm
This algorithm is inspired by nature as it 'resembles trying to find the top of Mount Everest in a thick fog while suffering from amnesia' (Russell and Norvig, 2016). The purpose of this type of algorithm is to find and improve on the best local solution to the problem after each step, checking whether the neighbouring results are better or worse than the current position (also known as a 'local search') (Lones, 2014). A problem with this algorithm is that it can reach a local maximum [5] quite quickly: it may have found a good enough solution, but not the best (the global maximum). Ways to resolve this include using multiple restarts or simply allowing the algorithm to accept negative moves (Lones, 2014). In turn, the hill climbing algorithm has more scope with the data and a better chance of finding the best solution.

[Flow chart: randomise a key to become the best solution → generate neighbouring results → evaluate fitness of neighbours → is a neighbour better than the current solution? YES: try to find other better solutions; NO: best solution found]

Figure 5 - Hill Climbing flow chart

[5] Local maximum: a state in the data where no better neighbouring solution can be found; however, there could be an even better solution that has not yet been reached, known as the global maximum (Russell and Norvig, 2016).

To implement this sort of algorithm, a random starting key needs to be initialised to start the process. Following this, the neighbouring keys of the starting key are generated, and a fitness score for each is calculated to find the best solution. The process repeats until no better solutions are found, at which point it has reached a local maximum and, in some cases, the global maximum (Segaran, 2007, p.93).

2.3.2. Genetic Algorithm
A genetic algorithm (GA) is another well-known nature-inspired method; it is also known as an evolutionary approach. In particular, GA is inspired by biological evolution: for example, in the gene selection stage it takes both parents' genes and applies mutation or crossover to produce a new genetic composition (Zang, Zhang and Hapeshi, 2010). It is a very useful algorithm for solving 'local search, optimisation and machine learning problems' (Zang, Zhang and Hapeshi, 2010). This type of algorithm works by finding the best successor (result) from a combination of parents that are modified by either mutation or crossover (Russell and Norvig, 2016).

Figure 6 - Genetic Algorithm stages (Zang, Zhang and Hapeshi, 2010).

GA uses a randomly created population in order to start the process of generating better solutions/populations. It does this by utilising a fitness function to calculate a score for each member of the population. Following the score calculation, the algorithm ranks the best and worst parents (Segaran, 2007, p.97). By selecting two of the best parent keys at random, GA increases the chance of identifying a better successor. For instance, if two parents produce a good fitness score then, most likely, their successor will produce a better score. Also, within this process both parents' genes can either 'crossover' or 'mutate', dependent on a randomisation function. The new generation is then used to repeat this process until a particular criterion is met or the number of iterations is complete (Zang, Zhang and Hapeshi, 2010).

2.3.3. Simulated Annealing
The simulated annealing algorithm is based upon the process of heating metals and glass to very high temperatures and slowly cooling them to the shape required (Russell and Norvig, 2016). It is quite similar to the hill climbing algorithm, but is implemented to prevent becoming stuck at a local maximum. Therefore, while the temperature is still high, the acceptance probability allows the annealing process to accept worse answers as well as better ones. This improves the scope in which the algorithm can search, and as it moves towards a good solution the probability of accepting worse solutions decreases. It is illustrated well by Russell and Norvig (2016) that simulated annealing can be imagined using gradient descent, in which a ping pong ball is rolled down the gradient. If the ball is left to roll it will reach a local minimum; however, if the surface is shaken then the ball has the ability to move and find a better solution.

[Flow chart: random key generated, temperature and cooling rate set → generate neighbouring results → evaluate fitness → is it better than the current solution? (if not, accept it anyway when a random number is less than the acceptance probability) → criteria complete? YES: best solution found; NO: cool the temperature and repeat]

Figure 7 - Simulated Annealing flow chart


The annealing process starts in the same way as hill climbing, where a random solution is created. The algorithm is limited by the number of iterations performed, which is determined by a starting (high) temperature and the rate at which the cooling process takes place (Segaran, 2007). During the iterations, a random direction changes the solution in some way so its fitness score can be tested, but acceptance also depends on the temperature. If the temperature is high there is a greater probability that a worse answer can be accepted; on the other hand, if the temperature is lower there is less chance of a worse answer being accepted (Segaran, 2007). This suggests that accepting worse solutions early can be important for eventually finding a better one.

2.4. The Big O Notation
Quite simply, the Big O notation is regarded as one of the most important aspects of analysing the complexity or performance of algorithms (Bell, 2017). It is important because it allows programmers and mathematicians to understand whether a particular type of algorithm will scale well when implemented. The question is whether the algorithm works with other solutions and larger problems. Bell (2017) describes the Big O as finding the worst-case scenario of the algorithm, which establishes either the time it will take to complete or the amount of memory needed. There are different types of Big O notation, but the most commonly found are (Bell, 2017):

o Constant-time O(1) - The algorithm runs in the same time regardless of the input given: for example, printing a variable to the screen.

o Linear-time O(N) - The algorithm's running time grows in proportion to the size of the input, a 1:1 relationship between performance and input size. A for loop that processes each element of an array is a typical example.


o Quadratic-time O(N²) – Unlike linear time, the running time grows with the square of the input size, so doubling the input roughly quadruples the time taken. Nested for loops are a typical example of quadratic time. A brief sketch of all three classes is given below.
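
As a brief illustration (added for clarity and not part of the project's own code), the following Python snippets show typical examples of each class:

def constant_time(items):
    # O(1): a single operation, regardless of how large the input is.
    return items[0]

def linear_time(items):
    # O(N): one pass over the input, so time grows in proportion to its size.
    total = 0
    for x in items:
        total += x
    return total

def quadratic_time(items):
    # O(N^2): nested loops over the input - doubling the input roughly quadruples the work.
    pairs = []
    for a in items:
        for b in items:
            pairs.append((a, b))
    return pairs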

3. Methodology
3.1. Approach
The methodological approach of positivism has been adopted in this research paper to enable clear observation and measurement of the quantitative data collected on the different algorithms' performances. Using deductive reasoning helped to ensure a clear understanding of the data and a more defined conclusion. Deductive reasoning was chosen in this research to allow a clear observation of results; by doing so, this improved the reliability and authenticity of the data.

3.2. Strategy and Research Design
The focus of this research was to create a program that implements the different machine learning algorithms to decrypt a polyalphabetic cipher. The algorithms that have been used are: (1) hill climbing; (2) genetic algorithm; (3) simulated annealing; and (4) random optimisation. The main aim of the research was to find out and demonstrate which of these algorithms was best at solving this type of encryption. This process enabled a better observation of, and insight into, the strengths and weaknesses of the algorithms.

To clarify, two separate programming languages were used: Java for the encryption of the plaintext (Appendix 1), and Python, which used the cipher text generated by the Java program for decryption and for testing the performance of the algorithms. Using two different programming languages helped to avoid bias; the output from Java was used as the input for the Python program.

Java was chosen as part of this dissertation as the researcher is familiar with this programming language. The researcher viewed it as a good starting point for encryption of the polyalphabetic cipher. Python is another frequently used programming language that the researcher enjoys and is interested in. However, Python was used primarily for decryption for the following reasons: it was a useful tool for scoring the fitness of the text using quad grams (Practical Cryptography, 2009); and it creates graphs quite easily at the end of each iteration.

3.2.1. Encryption
As previously mentioned, encryption was implemented using the Java programming language. First, before creating the program, it was worth finding a suitable text for this project to both encrypt and decrypt. A message from Winston Churchill to Franklin D. Roosevelt was used (Appendix 1) to add a sense of realism to the project's goal of uncovering sensitive data. For this to work, all punctuation and spaces were removed from the text when encrypting, making the output text (cipher text) a single block of unreadable script. This ensured that the text was more difficult to comprehend and provided slightly more protection.

The encryption used a polyalphabetic cipher (the Vigenere cipher) to encrypt the message, and was interpreted and implemented using the resources discussed in section 2.2 of this dissertation. Using the Eclipse IDE for implementing the Java program made it easier to work with and execute the different classes, as shown in Appendix 2.

There are two different classes shown in Appendix 2: one used to execute the program, the other to create an instance of a new cipher text. Before encrypting the input text in Appendix 1 it is stored in a string as the plaintext in the Main class. (Appendix 2)

Both the Practical Cryptography (2009) website and 'Everyday Cryptography' (Martin, 2012) helped to inspire the writer's implementation of the Vigenere polyalphabetic cipher in Java (Appendix 2). When the 'Encryption' class is first initialised in the 'Main' class, variables including the plaintext and the seven-letter key used for encryption ('WINSTON') are also constructed. The formula adopted to encrypt a Vigenere cipher is from Practical Cryptography (2009):


Cipher Character = Plaintext Character + Key Character (mod 26)

Using this formula to transform the text proved to be very successful (as shown in the output of Appendix 1). The encryption focused on using a for loop to look at each individual character of the text and change it to a new corresponding letter. As shown in Appendix 2, the letters are converted to their corresponding decimal numbers using the ASCII numbering sequence (Microsoft, 2017), as this is a very useful tool for calculating a new cipher text letter and an easier solution compared to other methods. Another method considered was using an array of alphabetic letters indexed by position; however, this method seemed less practical and consequently was not used.

Finally, when encrypting the plaintext, the program uses a for loop to run through each character of the plaintext as well as the corresponding key character, converting them to ASCII. This helps to calculate a new letter using the formula above. The modulo 26 corresponds to the 26 letters of the alphabet. In this process, all letters are converted to uppercase, which have a different decimal representation compared to lowercase letters; this can be seen in the ASCII chart on Microsoft's site (2017). Any punctuation and spaces are ignored, so the output is a single block of text. The result is an encrypted piece of cipher text like the one illustrated in Appendix 1. An example of this formula working is shown below: if 'R' (decimal number 82) is the letter from the plaintext and 'C' (decimal number 67) is the character chosen from the key, then using the formula above:

(82 ('R') + 67 ('C')) % 26 = 19, which corresponds to the letter 'T'

As shown above, the formula finds the letter's position in the alphabet; it is then simple to convert this to an ASCII decimal number by adding the value of the letter 'A' (65). This results in the new cipher letter being 'T', with a decimal number of 84 due to the capitalisation. Each of the ASCII numbers taken from the plaintext and key is changed to represent a position in the alphabet and converted back to an ASCII character once finished (shown in Appendix 2). This procedure loops until completed, appending the new cipher letters to a string which becomes the new cipher text for this encryption. An example of Winston Churchill's message encrypted can be seen in the output of Appendix 1. To ensure that the plaintext had been successfully encrypted into the Vigenere cipher, JUnit testing was used; this is discussed further in Appendix 3.
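
Although the project's encryption was written in Java (Appendix 2), the following Python sketch of the same formula illustrates the process; the function name and the example call are illustrative assumptions rather than the project's actual code:

def vigenere_encrypt(plaintext, key):
    """Encrypt letters only, following: Cipher = Plaintext + Key (mod 26)."""
    letters = [c for c in plaintext.upper() if c.isalpha()]   # strip spaces and punctuation
    cipher = []
    for i, p in enumerate(letters):
        k = key[i % len(key)].upper()                         # repeat the key over the text
        shift = (ord(p) - ord('A') + ord(k) - ord('A')) % 26  # position in the alphabet
        cipher.append(chr(shift + ord('A')))                  # convert back to an ASCII letter
    return "".join(cipher)

print(vigenere_encrypt("Attack at dawn", "WINSTON"))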

Having finished discussing the implementation of the encryption, it is also important to understand and discuss how decryption was implemented in the Python language.

3.2.2. Decryption
The Python language was used primarily because of the capabilities it provides for cryptography and machine learning. One of the main reasons for using Python to decrypt the polyalphabetic cipher is that a useful external Python file was found which scores the cipher text on how closely it represents the English language. This proved to be an important part of analysing the different machine learning algorithms (see section 3.3).

Decryption was implemented by following the code structure represented in Appendix 2; quite simply, it is the reverse of the encryption process. The cipher text output from Appendix 1 was used for decryption to help analyse the machine learning algorithms, checking whether the cipher text could be decrypted. Therefore, the JUnit testing in Appendix 3 was important to check whether the encryption was a success; otherwise the purpose of this research could be undermined.

Practical Cryptography (2009) provided useful information on the Vigenere cipher, although, this time a different formula was implemented to decrypt the content:

Plaintext Character = Cipher Character – Key Character (mod 26)

Like encryption, ASCII is used to obtain the decimal number of each character so that it can be manipulated to find and create a new letter; however, the formula used to decrypt is different. Again, the procedure of decryption is very similar to encryption, as it executes a for loop to run through each character of the cipher text, finding the ASCII decimal numbers of the cipher text and key characters at each iteration. Following the formula above, it should discover the original plaintext character of the message and, once finished, will return a string of the plaintext. For example, given the cipher text letter 'Y' (decimal number 89) and the key letter 'D' (decimal number 68):

(89 ('Y') – 68 ('D')) % 26 = 21, which corresponds to the letter 'V'

Once all the cipher text letters have been converted back, the program should output the same plaintext as the message from Winston Churchill (Appendix 1). However, if the wrong key has been used then the output will be wrong or unreadable, and so the machine learning algorithms are tested to observe which can solve this decryption using a cipher text-only attack. Even though this dissertation identifies and uses the plaintext message to encrypt, the Python program only has the ability to understand the cipher text and the cryptosystem used; thus, this research takes the approach of the most common type of attack.
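
For completeness, a matching Python sketch of the reverse formula is shown below (again an illustration rather than the project's own routine, which appears in Appendix 4); decrypting with the wrong key yields unreadable text, which is exactly what the fitness score is later used to detect:

def vigenere_decrypt(ciphertext, key):
    """Recover letters using: Plaintext = Cipher - Key (mod 26)."""
    plain = []
    for i, c in enumerate(ciphertext.upper()):
        k = key[i % len(key)].upper()
        idx = (ord(c) - ord(k)) % 26          # the two 'A' offsets cancel out
        plain.append(chr(idx + ord('A')))
    return "".join(plain)

print(vigenere_decrypt("TXLTVOSMWEOG", "TEST"))    # ATTACKATDAWN
print(vigenere_decrypt("TXLTVOSMWEOG", "WRONGK"))  # unreadable output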

The purpose at this point of the research was to find out whether the algorithms used could decrypt the cipher text with the same key used for encryption; in other words, to get each algorithm to find the best decryption key. It is also worth pointing out that the key size has been hard-coded into the program, enabling the algorithms to work at an efficient rate and focus on the decryption of the cipher text. The Kasiski method (discussed in the literature review) could have been implemented to recover the key length, but this was not deemed important for the analysis of the algorithms.

3.2.3. Scoring
To test the algorithms, it was important to utilise a scoring method which could assess each algorithm's performance in finding a better key. A useful tool for calculating a fitness score of the cipher text was provided by Practical Cryptography (2009). This helped in accurately measuring how closely a candidate decryption resembles the English language. An example of this scoring method, where the code has been employed within the main project, is shown in Appendix 5. No modifications to the files were made; hence, it was essential to understand the code before use, in addition to making sure that it achieved accurate results.

Before discussing the use of this scoring method, it is worth noting that 'quad grams' are groups of four letters that occur within text (Practical Cryptography, 2009). For example, the word 'ENCRYPT' contains four different quad grams: 'ENCR', 'NCRY', 'CRYP' and 'RYPT'. Some of the scores for the quad grams can be seen in Appendix 5 in the example 'english_quadgrams.txt'.

For this scoring method to work, both the Python file that assesses the fitness score and the text file with all the English quad grams were needed. Before any fitness score can be calculated, the quad grams need to be imported into a stored list, which is then used for running multiple fitness scores throughout the process (as shown in Appendix 5). This improved the calculation time and, ultimately, allowed the algorithms to focus primarily on the task at hand. Quad grams were chosen over trigrams or bigrams given the scope and length of the cipher text, which affects the time taken to assess the score as well as the time in which the algorithms run.

The 'ngram_score.py' file in Appendix 5 works by obtaining the relevant count for each quad gram found in the text and using this to create a log probability. Taking the log of the quad gram scores helps to identify whether the text is English or not. To calculate the log probability, the following equations from Practical Cryptography (2009) are used (these equations use the earlier example, 'ENCRYPT'):

probability(ENCR) = count(ENCR) / total number of quad grams

Equation 1 - individual quad gram calculation (Practical Cryptography, 2009)

probability(ENCRYPT) = p(ENCR) × p(NCRY) × p(CRYP) × p(RYPT)

Equation 2 - Find the log probability of all quad grams (Practical Cryptography, 2009)


Equation 2 is used to calculate the overall probability of all the quad grams found in the piece of text or sentence. Multiplying many small probabilities produces a number that a float cannot represent accurately, so it was very useful to take the logarithm of the probability and store that as a float instead (see Practical Cryptography (2009) and Appendix 5). Therefore, the higher the log probability, the more likely the text is to represent the English language, whereas a lower score suggests just random letters (Practical Cryptography, 2009).
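
The simplified Python sketch below illustrates how such a log-probability score can be computed; it assumes a quad gram count file in the same format as 'english_quadgrams.txt' and is only an outline of the unmodified ngram_score.py used in the project:

import math

def load_quadgrams(path="english_quadgrams.txt"):
    """Build a log10-probability table from lines of the form 'TION 13168375'."""
    counts = {}
    with open(path) as f:
        for line in f:
            gram, count = line.split()
            counts[gram] = int(count)
    total = sum(counts.values())
    table = {g: math.log10(c / total) for g, c in counts.items()}
    floor = math.log10(0.01 / total)          # penalty applied to unseen quad grams
    return table, floor

TABLE, FLOOR = load_quadgrams()

def fitness(text):
    """Sum the log probabilities of every quad gram in the text; higher = more English-like."""
    text = "".join(c for c in text.upper() if c.isalpha())
    return sum(TABLE.get(text[i:i + 4], FLOOR) for i in range(len(text) - 3))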

3.2.4. Machine Learning Methods
The machine learning algorithms adopted in this research were adapted to remove limitations and provide better performance. Some examples of the limitations of these methods are: reaching a state of local maxima or minima; the number of iterations not being enough (parameters were changed to improve this); and limits on the ability to achieve better results throughout the process.

These algorithms are normally seen as optimisation techniques, in that they strive to reach a state better than the previous one. In this research, the optimisation techniques are also considered as machine learning algorithms. The focus was to implement these types of algorithms in a fair environment to test their performances. However, before starting to collect any data, it was important to test and analyse each of the algorithms separately and, by doing so, discover which of the parameters performed the best (see Appendix 7). Hence, the collection of the results from the machine learning algorithms can be considered fair practice and suggests that the results are accurate.

The purpose of using these algorithms was to use the key as the main target of decryption. Instead of expending many hours analysing the cipher text, it was much easier to break the key. As a result, the key remained the main route to finding and breaking the polyalphabetic cipher, and the score is calculated from the decrypted text (as discussed in section 3.2.3).


The approach for implementing the algorithms was to create them in Python, which allowed the algorithms to gain access to both the scoring method and the decryption. A timer and a list (to store the best results) were incorporated for each of the algorithms, displaying a completion time and the recorded scores; these are used as part of the analysis of the algorithms. Once the algorithms had finished generating keys and scores, these were collected as part of the data (discussed in section 3.3).

Due to the limitations discussed in section 3.4, the algorithms were developed from the book 'Programming Collective Intelligence' by Segaran (2007, pp. 86-100). The focus has been on developing these algorithms to work with the dissertation's topic of cryptanalysis of the polyalphabetic cipher. The main reason for using the code from the book was that it provided valuable information on implementing the machine learning algorithms (which were chosen prior to reading it). However, the only similarities between the code developed and the original are the functionality and the nature in which the algorithms are supposed to work; they differ in respect of the problems or goals they are trying to solve. For this reason, in both scenarios the algorithms have been designed around a particular problem. Segaran (2007) uses these algorithms to search and analyse flight times, whereas this project takes a completely different approach, using them to analyse their performances whilst decrypting a polyalphabetic cipher (shown in Appendix 6). Therefore, the book was a great starting point and a template on which this project was based. The machine learning algorithms below are the ones that have been used and developed for the main analysis of this research.

3.2.4.1. 'Random Hill Climbing'
This algorithm has been implemented so that, once the hill climbing has reached a local maximum, it will pick another point to assess, consequently searching a wider scope to find the global maximum. The purpose of this type of algorithm is to assess the key's neighbouring solutions and to ascend to a better result, but when a local maximum is reached the random restarts help widen the search scope.


First, before assessing the problem, the algorithm initialised a timer and started with a random key solution. Appendix 6 shows that the hill climbing algorithm runs through a main loop where it assesses a new set of neighbours every time a better solution is found, consequently achieving a better result. It does this by changing a specific character of the key in a random way for each of the neighbouring solutions (as displayed in Appendix 6). Changing a specific character enabled the program to assess three different solutions: the best solution, the next solution, and the previous one. It is noteworthy that if the best solution had no neighbouring solutions with a better score, then the key index was incremented and the algorithm focused on the next character of the key. Once this cycle was completed, the best result was found, the timer was stopped, and all the relevant results were displayed to the console. This information was then manually copied to a Microsoft Excel document and the data were stored for the evaluation.
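
A condensed Python sketch of this random-restart approach is given below; it relies on the hypothetical fitness and vigenere_decrypt helpers sketched earlier and a fixed key length, and is an outline of the idea rather than the project's exact code (Appendix 6):

import random
import string
import time

def random_key(length=7):
    return "".join(random.choice(string.ascii_uppercase) for _ in range(length))

def hill_climb(ciphertext, key_length=7, restarts=20):
    """Random-restart hill climbing: tweak one key character at a time and keep
    the change whenever the decrypted text's fitness score improves."""
    start = time.time()
    best_key, best_score = None, float("-inf")
    for _ in range(restarts):
        key = list(random_key(key_length))
        score = fitness(vigenere_decrypt(ciphertext, "".join(key)))
        improved = True
        while improved:
            improved = False
            for i in range(key_length):                     # neighbours differ in one position
                for letter in string.ascii_uppercase:
                    candidate = key[:i] + [letter] + key[i + 1:]
                    s = fitness(vigenere_decrypt(ciphertext, "".join(candidate)))
                    if s > score:
                        key, score, improved = candidate, s, True
        if score > best_score:                              # keep the best local maximum so far
            best_key, best_score = "".join(key), score
    print("best key:", best_key, "score:", best_score, "time:", round(time.time() - start, 2))
    return best_key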

3.2.4.2. ‘Genetic Algorithm’

The genetic algorithm has several parameters that are needed for this method to work. As discussed in the literature review, it defines the size of the population that can be created, the chance of mutation, and the number of elite (best) candidates kept from the population (see Appendix 6). The process is controlled by the number of times the algorithm iterates to find the best genetic result, which is set by the ‘maxiter’ parameter. It is unlike any of the other algorithms used in this research in that it contains two inner operations: mutation accepts a key and changes a single character in some random way, whereas crossover takes two of the best keys and combines them from a specific index.
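A minimal sketch of these two operations is shown below (keys are assumed to be lists of uppercase characters; the mutation here simply replaces one letter with a random one, whereas the version in Appendix 6 shifts the existing letter, so this is an illustration rather than the exact implementation):

import random

KEY_SIZE = 7   # length of the key being searched for

def mutate(key):
    # Change a single, randomly chosen character of the key.
    i = random.randint(0, KEY_SIZE - 1)
    return key[:i] + [chr(random.randint(0, 25) + 65)] + key[i + 1:]

def crossover(parent1, parent2):
    # Splice two parent keys together at a randomly chosen index.
    i = random.randint(1, KEY_SIZE - 2)
    return parent1[:i] + parent2[i:]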

The complexity of this algorithm meant that it took longer to implement; nevertheless, it proved to be very useful. The starting point is to generate a new list that holds randomly generated keys. Setting up the parameters at the start controls how many elites can be chosen from the population. Before the population is ranked, the scoring method (see section 3.2.3) is used to find each key's fitness score. The population can then be overwritten by the ranked keys, and the elite ‘parents’ are used for either mutating or crossing over keys. Once completed, the new generation of the population is processed in the same way. This is repeated until the set number of generations is completed and no better solution can be found, as shown in Appendix 6, where the algorithm has been utilised.

3.2.4.3. ‘Random Optimisation’

The purpose of using this type of algorithm lies in its ability to try random solutions continuously in the search for a better one. As it is very unpredictable whether it will find a better key, the method was useful as a baseline against which to compare the others. The algorithm does, however, keep a better key whenever one is found; this was easy to implement because of the simplicity with which the algorithm works.

Random Optimisation follows the same outline as the other methods, with a main loop iterating a set number of times in search of a better result. The method starts by initialising the worst possible score (as shown in Appendix 6). From there, it generates random keys, and at each iteration a fitness score (Appendix 5) is calculated to see whether the new key beats the previous best; if so, the best solution is updated. The number of iterations depends on the parameter setting, which gives the algorithm a wider or narrower search scope (see again Appendix 6).
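The loop can be sketched as follows (assuming the fitness_score helper from Appendices 5 and 6 and a seven-letter key; this is an outline of the idea rather than the exact implementation):

import random

def random_optimise(cipher, iterations=7000):
    # Generate a completely random key each iteration and keep it only
    # if its fitness score beats the best score seen so far.
    best_score = float('-inf')              # start from the worst possible score
    best_key = None
    for _ in range(iterations):
        key = [chr(random.randint(0, 25) + 65) for _ in range(7)]
        score = fitness_score(cipher, key)  # quadgram fitness, Appendix 5
        if score > best_score:
            best_score, best_key = score, key
    return best_key, best_score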

3.2.4.4. ‘Simulated Annealing’

This proved to be the most difficult algorithm to implement. A probability controls whether a worse result can be accepted; however, during implementation, worse results were being accepted throughout the whole process. Although the purpose of this algorithm is to occasionally accept worse results, it is not supposed to keep accepting them right up to the end. A lot of trial and error was needed to resolve this issue until a simple fix was found; the final code is shown in Appendix 6.


As with the other algorithms, this one starts by initialising local variables, and a random solution is created as the starting point from which to find a result better than the previous score. However, unlike the other methods it uses a decreasing temperature to control the number of iterations performed: the parameters establish the temperature at which it starts and the rate at which it cools down. In each iteration a random direction is chosen to modify the solution, changing one character, and the fitness score (Appendix 5) of the result is calculated to decide whether it can be accepted. Within this procedure a probability is also calculated, so that even if the score is not better than the previous one it may still be accepted, depending on the probability and the current temperature. As the temperature decreases, the probability becomes smaller, so that eventually only better results are accepted. In turn, this provided a useful method for searching the scope while reducing the limitations of local maxima and minima.
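The acceptance rule at the heart of this process can be sketched as follows (a standard formulation given for illustration only; the dissertation's own implementation is in Appendix 6):

import math
import random

def accept(candidate_score, current_score, temperature):
    # A better score (closer to 0) is always accepted; a worse one is
    # accepted with a probability that shrinks as the temperature cools.
    if candidate_score > current_score:
        return True
    probability = math.exp((candidate_score - current_score) / temperature)
    return random.random() < probability

Because the scores are negative (0 is best), the exponent is negative whenever the candidate is worse, so the probability is always below 1 and falls towards 0 as the temperature decreases.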

3.3. Data Collection and Analysis Methods

The data collected were based upon several factors: the time taken to complete; the best key found for the decryption; its score; and the number of times a better solution was found. These data have helped to analyse the functionality of the algorithms and to judge whether they are useful for this type of task, as well as for more advanced versions of polyalphabetic ciphers. Even though the collection of the data was imperative for this study, the graphs have nonetheless been more valuable for discovering the way in which the algorithms work and showing how they strive to find their best results.

Before the data were collected, the program was checked to ensure the algorithms were working correctly and outputting sensible information. Having done this, the performance of each algorithm was profiled by testing it with different parameters (as shown in Appendix 7). This helped in choosing the best parameters for the main test and, in turn, produced better outcomes for each of the algorithms, which supported the main analysis.


The program was executed ten times, and the output data were recorded and entered into a Microsoft Excel spreadsheet; the generated graphs were also saved alongside it. Using the spreadsheet helped in analysing the data and in creating further graphs that gave a better overview of each algorithm. The data collection was significant, as it supported a sound evaluation of the topic and thereby aided in answering the research question at hand.

The data have been analysed based upon the algorithms' results, taking an objective approach towards each of them. This included observing the cipher score, where each key is used to decrypt the cipher text and the fitness of the result is measured (a score closer to 0 indicates text closer to English); the key used; the time taken to complete; and the number of better keys found throughout the process. Another method for analysing these algorithms is Big O notation, which considers how each algorithm copes as the size of its input scales; this was also a useful contributor to judging how effective the machine learning algorithms are at this particular task. It could be suggested that only limited data were collected; however, the analysis carried out was successful and supported a sound conclusion and evaluation.
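To illustrate what the scores mean, the ngram_score class from Appendix 5 can be used directly: an English phrase scores closer to 0 than a random string of letters (the example strings below are illustrative only):

import ngram_score as ns

fitness = ns.ngram_score('english_quadgrams.txt')     # quadgram statistics, Appendix 5
english = fitness.score("DEFENDTHEEASTWALLOFTHECASTLE")
gibberish = fitness.score("QXZKVJQWPLRMNBVCXZQWPLRMNBVC")
print("English text score: " + str(english))          # expected to be closer to 0
print("Random text score: " + str(gibberish))         # expected to be much more negative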

3.4. Limitations

Limitations of this research have been considered and are listed below:

• Firstly, the time permitted for the implementation of this research was a major limitation.
• As a result of the limited time, the researcher used the existing scoring method shown in Appendix 5, by Practical Cryptography (2009), to analyse the cipher text.
• Again, owing to the time constraints, pre-existing code was used to develop and implement the different machine learning algorithms. The code from 'Programming Collective Intelligence' by Segaran (2007) was very helpful and provided a template upon which to develop and create this project.


4. Evaluation and Results

This section evaluates and discusses the different algorithms used. With the data completed and recorded, a clear insight into how effective each of the machine learning algorithms is can be presented. It is vital that the data are of good quality and reliability, and that the runs constitute a fair test, as described in the methodology. This section discusses the results for all the algorithms; however, it is noteworthy that the order in which the algorithms appear has no bearing on their effectiveness.

It is worth noting that the best score possible is produced by the key which originally encrypted the text. That key was ‘WINSTON’, and it is therefore the only key which can fully decrypt the cipher text; its score is shown below in Figure 8.

Figure 8 - Score of the best key
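As a hypothetical check of this point (reusing the fitness_score helper and the cipher variable from Appendices 6 and 7 rather than any new code), decrypting the cipher text with the original key and scoring the result gives the benchmark value shown in Figure 8:

# Score obtained when the cipher text is decrypted with the original key.
benchmark = fitness_score(cipher, list("WINSTON"))
print("Score of the correct key 'WINSTON': " + str(benchmark))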

4.1. Random Hill Climbing

Hill Climbing Data Table

Iteration   Best Key   Cipher Score   No. of best solutions   Time (s)
1           WINIMON    -5531.01       6719                    51.10
2           AEDSTON    -5835.32       6764                    49.35
3           HERSTON    -5892.45       6858                    50.01
4           LCNSTON    -5579.58       6900                    50.73
5           WINSTUN    -4852.16       6816                    53.18
6           ARNSTON    -5498.92       6828                    49.72
7           HINMTON    -5802.73       6893                    48.58
8           ACNSTON    -5479.97       6793                    48.30
9           HINSTOR    -5374.43       6773                    50.32
10          CINSTEN    -5548.68       6794                    49.75
Average                -5539.52       6813.80                 50.11

Table 2 - Random Hill Climbing data


The hill climbing algorithm was modified to widen the search scope, and it has shown the ability to get very close to the correct key. This can be seen in iteration 5 of table 2, where the best score found corresponds to the output ‘WINSTUN’. This nearly matches the key used to encrypt (‘WINSTON’), so the decrypted text would closely resemble the text in Appendix 1. Indeed, all of the outputs from this algorithm in table 2 bear some likeness to the key, which suggests that using these keys for decryption would help a cryptanalyst to identify some words or letters in the cipher text. Consequently, this type of algorithm has fared considerably well in the short time taken to nearly decrypt the message, and this suggests that if it were run for a minute or more it could potentially recover the key used and uncover the original plaintext.

Table 2 also suggests that this algorithm is capable of searching effectively for the best result, because of the number of better solutions it finds in quite a short period. For example, iteration 8, with an output of ‘ACNSTON’, has the shortest completion time of 48.30 s, yet found 6793 better results during the process. With only two letters wrong in this key, a decryption using it would be partially successful, as some words or letters would be recognisable. Because this algorithm constantly strives for a better result, it came near to achieving its full potential: although it did not find the correct key within the 10 iterations tested, it nevertheless came close to solving it.

The graphs in table 3 represent each of the iterations from table 2. Although they resemble sound waves, they are in fact a collection of all the different hill climbing points within the scope. The lowest points in each graph represent the starting points from which the algorithm climbs to a better result, shown at the peaks throughout the process. It is not clear in advance where the best result will be found, and graph 5 shows this clearly, since part way through the process a great leap is made to a result better than any key previously found. Observing these graphs has helped to clarify how these algorithms work and that each iteration has a different output; their efficiency at finding better results is shown in the table below.


Hill Climbing Graphs (Y-axis: Score; X-axis: No. of Best Solutions Found)

[Grid of ten graphs, numbered 1 to 10, one per iteration in Table 2]

Table 3 - Random Hill Climbing graphs


[Bar chart of the cipher score (0 is best) achieved by the best key in each of the ten iterations from Table 2]

Figure 9 - Hill Climbing table 2 representation

After assessing the implementation of the algorithm, it is clear that its complexity is quadratic-time, O(N²), because of the nested for loops in the implementation. This means that a larger input rapidly increases the time taken to complete the algorithm, suggesting that it is useful for inputs of smaller sizes (as shown in this research) but may not be feasible for larger inputs (i.e. bigger key sizes); this could be examined in a further test.


4.2. Genetic Algorithm

Genetic Algorithm Data Table

Iteration   Best Key   Cipher Score   No. of best solutions   Time (s)
1           WINSTON    -3948.52       7200                    24.24
2           WINSTAN    -4823.49       7200                    23.37
3           WINSZUN    -5302.01       7200                    22.91
4           WINWTEN    -5600.01       7200                    23.44
5           WINYTON    -4828.43       7200                    23.10
6           WINSTON    -3948.52       7200                    22.62
7           WINSZON    -4824.53       7200                    22.67
8           WIPSTON    -4894.46       7200                    22.76
9           WINSTON    -3948.52       7200                    25.35
10          WONSTOJ    -5414.26       7200                    23.55
Average                -4753.27       7200                    23.40

Table 4 - Genetic Algorithm data

The genetic algorithm in table 4 has been very successful, finding the correct key on three occasions; it can therefore recover the key used to encrypt the polyalphabetic cipher. Looking at the remaining best keys, it finds at least 5 of the 7 letters correctly in every run, which suggests that using any of these keys could uncover some words or phrases in the decrypted text. Thus, this is a successful algorithm. Adding to this success, the number of best solutions found is consistent across iterations, because of the way in which the algorithm works and is implemented, which makes it a reliable and efficient process to use. Another contributor is the time in which each iteration completes: as shown in table 4, the algorithm finds a good number of best solutions in an average of only 23.40 s. This is a successful all-round set of results.

The graphs below, from table 5, represent the data collected in table 4. They have a clear characteristic unlike the others: throughout the process they grow towards a better result without fluctuating in score. For example, iteration 1 in table 5 shows rapid growth, finding its best result after around 1000 other better solutions. However, a rectangular shape appears continuously once the best solution has been found. This shape is a fluctuation between one best solution and another, and suggests that the algorithm cannot improve beyond this point (as seen in all the other iterations). Looking back at the data in table 4 shows that once the algorithm has found what it considers the best key, it does not have the capability to progress beyond it. Nevertheless, the scores collected prove that this algorithm has the efficiency and reliability to find a solution to this problem in a small amount of time. It could be suggested that the number of best solutions found throughout the process is what makes this algorithm reliable.


Genetic Algorithm Graphs (Y-axis: Score; X-axis: No. of Best Solutions Found)

[Grid of ten graphs, numbered 1 to 10, one per iteration in Table 4]

Table 5 - Genetic Algorithm graphs


[Bar chart of the cipher score (0 is best) achieved by the best key in each of the ten iterations from Table 4]

Figure 10 – Genetic Algorithm table 4 representation

Looking at the implementation of this algorithm in Appendix 6, its complexity also appears to be quadratic-time, O(N²), due to the nested loops which increase the time taken for each iteration. This complexity suggests that the running time grows quadratically as the input size increases, which is perhaps not feasible for larger inputs in the long run. However, it could still be worth using this algorithm to find better solutions because of its efficiency and reliability (as shown in figure 10).


4.3. Random Optimisation

Random Optimize Data Table

Iteration   Best Key   Cipher Score   No. of best solutions   Time (s)
1           JYNYTOA    -6591.53       10                      10.01
2           IKERTON    -6659.19       9                       9.24
3           WINSKGM    -6325.53       23                      9.63
4           BEPSTON    -6147.19       4                       9.18
5           WEHSTOT    -6085.54       6                       9.28
6           EFUSTOJ    -6573.54       11                      9.73
7           ZETSTEN    -6743.52       6                       9.22
8           WCNQION    -6488.86       7                       9.17
9           HXNVTON    -6500.34       11                      9.21
10          FINKTOR    -6383.89       8                       9.21
Average                -6449.91       9.5                     9.39

Table 6 - Random Optimisation data

The random optimisation algorithm has done considerably well for a random search. It is not as reliable as the other algorithms, but the data in table 6 show some fair results. Evaluating the data, it is apparent that it takes less time to complete and can still produce reasonable results: some of the letters in the best keys match the original key used to encrypt, so it gets fairly close to the correct key, although not once did the algorithm complete the process successfully. The data suggest that it could search a much wider scope if the number of iterations were increased. The best keys found here could have the potential to uncover some small phrases, which could be useful to a cryptanalyst.

As mentioned previously, the inconsistency of this algorithm is what makes it weak, and ultimately impractical. On the other hand, the graphs in table 7 all show an improving trend, resembling a hill climbing process in which the algorithm tries to find a better solution within the scope. Each iteration has a different climb and a different number of best solutions found (as displayed in table 6). Interestingly, they all share the quality of aiming for better solutions, but they take different paths because of the algorithm's randomness and unreliability. For smaller problems this algorithm could be useful, although it has shown many weaknesses within this research; for larger keys, the chance of solving them is minimal. Because of the nature of this algorithm, it would take a huge amount of luck to find the correct key.


Random Optimise Graphs (Y-axis: Score; X-axis: No. of Best Solutions Found)

[Grid of ten graphs, numbered 1 to 10, one per iteration in Table 6]

Table 7 - Random Optimisation graphs


[Bar chart of the cipher score (0 is best) achieved by the best key in each of the ten iterations from Table 6]

Figure 11 - Random Optimisation table 6 representation

Random optimisation has been a relatively good algorithm to use and test as a comparison against the others. From its simple implementation of a for loop and an if statement, its complexity is linear-time, O(N), which means that for a larger data set the running time grows in proportion to the input size, increasing steadily with each additional input.

4.4. Simulated Annealing

Simulated Annealing Data Table

Iteration   Best Key   Cipher Score   No. of best solutions   Time (s)
1           JVNITAC    -6930.51       3817                    66.28
2           IGGHTEJ    -7192.06       3762                    75.85
3           AYGOTOC    -6858.18       3842                    67.73
4           YVAFTZZ    -7214.97       3878                    68.61
5           WUJOEOD    -7012.83       3865                    70.50
6           COZETAN    -6897.47       3811                    66.40
7           HMNOIOC    -6899.66       3867                    66.27
8           WBSDIOB    -7210.67       3885                    66.38
9           FBCHTOC    -6919.98       3826                    66.49
10          EOFOEBU    -7559.53       3819                    67.14
Average                -7069.59       3837.2                  68.17

Table 8 - Simulated Annealing data


The simulated annealing algorithm was unsuccessful in all attempts, failing to find either the correct key or keys closely resembling it. Table 8 shows that none of the best keys bears any resemblance to the key used to encrypt the text (WINSTON). This suggests either that the algorithm is not working to its potential or that some part of the implementation, such as the probability which controls how many worse keys can still be accepted, was used incorrectly. After careful deliberation, it is concluded that the algorithm was implemented correctly, and that it is its behaviour which prevents it from reaching any good results. It is also noteworthy that, even though the algorithm was not successful this time, it showed the ability to find many different best solutions throughout the process in a reasonable amount of time. The scores in table 8, however, suggest otherwise, as it does not produce any good results: the average cipher score implies that a text deciphered with such a key would be a failure, since the message would remain unclear and not readable English.

On the other hand, the graphs in table 9 show a different side of the algorithm's performance. Analysing each graph, it is interesting to see what happens towards the end of the process. Each run has a poor start, with some results getting worse, as in iteration 6. The problem is that it is quite unpredictable whether the algorithm will accept better or worse results and build on them. Towards the end of the process, however, the algorithm adjusts itself to accept only better results, because the probability of accepting worse results has fallen; it therefore gradually ascends to a result better than the earlier ones. A good example is iteration 6: it starts off unpredictably, but at the end a curve appears where the score begins to increase, accepting only better results.


Simulated Annealing Graphs (Y-axis: Score; X-axis: No. of Best Solutions Found)

[Grid of ten graphs, numbered 1 to 10, one per iteration in Table 8]

Table 9 - Simulated Annealing graphs


[Bar chart of the cipher score (0 is best) achieved by the best key in each of the ten iterations from Table 8]

Figure 12 - Simulated Annealing table 8 representation

The implementation of this algorithm has a complexity of O(N), as it consists of a single loop which processes one input per iteration. This suggests linear-time behaviour, meaning the running time grows in proportion to the size of the input, which could make the algorithm more suitable for larger problems. As shown in table 9, all of the iterations improve towards the end of the process, so it may be that with larger inputs this algorithm would become more effective at finding better solutions; this is something that could be tested in further research.


4.5. Evaluation

Figure 13 - Plot.ly graph made with data from a random iteration

Figure 13 above plots all the different algorithms over a single, randomly chosen iteration. This graph makes clear that the algorithms all behave differently from one another, which is one of the main reasons why they produce different results. Each algorithm's performance and functionality has characteristics which allow it to search a scope of sample data to find solutions.

Testing these algorithms to find out the effectiveness of each one has been useful, and has helped to identify trends and patterns. In this case (as displayed in figure 13), and from analysing the results of all the tests, the genetic algorithm has been the most effective and efficient way of finding the correct solution; in fact, it was the only algorithm to find the correct key at all. It must be remembered, however, that this test was not only about finding the solution but also about other factors such as the time taken and the number of best solutions found in the process. Taking all of these into account, it is fair to say that the genetic algorithm has been the most effective. This is shown in figure 13, which illustrates that the genetic algorithm nearly reached the best solution after processing fewer than 1000 better solutions, whereas the other algorithms spent far longer processing weaker keys and did not reach the best possible result. It must be acknowledged, though, that the hill climbing algorithm also performed well, nearly reaching the correct solution by widening its search scope; in this respect it can also be considered a very effective algorithm and could be useful for other problems.

[Chart 'Combined Results': score (0 is best) across the ten iterations for Hill Climbing, Genetic Algorithm, Random Optimisation and Simulated Annealing]

Figure 14 - Results from all algorithms iterations

The other algorithms used in this research (random optimisation and simulated annealing) have not been successful this time, but they have shown potential for solving different types of problems. Both, for example, show a consistency of results in figure 14; nonetheless, on the results obtained they should be considered weak and ineffective for the purposes of this study. Figure 14 shows that the two highest-scoring algorithms perform more effectively and consistently. It could also be suggested that hill climbing and the genetic algorithm share the quality of constantly striving for the best solution, unlike the others, and as a result they find good solutions to the problem of decrypting a polyalphabetic cipher. From this, it could be implied that because simulated annealing accepts worse moves and random optimisation is unreliable, they will never perform as well as the two most effective algorithms, hill climbing and the genetic algorithm.

5. Conclusion

This research has enabled an analysis of different machine learning algorithms by applying them to cryptology. It has observed how the algorithms searched for solutions to this problem, and how their behaviour aided the researcher with data collection.

From the data collected, it can be concluded that the genetic algorithm was the most effective algorithm used in this research, with hill climbing second; both have the potential to be useful for larger problems. The main reason the genetic algorithm performed best is that it recovered the correct key in 3 out of 10 iterations, with an average time of 23.40 s and a consistent number of best solutions found. It is perhaps the nature of how the algorithm works that makes it so successful; this would be an ideal opportunity for further, extended research.

Throughout this dissertation the polyalphabetic technique has been adopted for both encryption and decryption. Given the difficulty of this type of cipher, it may be suggested that all the algorithms did considerably well, though some performed better than others. In the end, using a cipher-text only attack showed that the algorithms could exploit very limited resources, namely the size of the key used and the cipher text created from the encryption in Java.

Using the template code from Segaran (2007) to help implement the algorithms was beneficial to this research. Using the quadgram fitness scoring (Appendix 5) has also been a success, as it provided the means to test the functionality of the algorithms. Without these implementations, the research would not have been as accurate or as useful.


It must be noted that, even though fairness was taken into consideration when running each of the algorithms, simulated annealing and random optimisation could have been improved to try to match the scores of the better performers. Possible improvements include increasing the amount of time the algorithms were run and changing the parameters to allow a greater number of iterations. On the other hand, given the nature of these algorithms, attempting to improve them might not produce better results and could therefore waste time and effort. Owing to the limited time remaining, this was not pursued, as the data already collected were sufficient.

Having examined and discussed the complexities of the algorithms, it can now be determined which is the most efficient in terms of its ability to scale. After careful deliberation, it is clear that both random optimisation and simulated annealing scale best to larger inputs; they can therefore be considered the most efficient in terms of complexity in this dissertation, as they run in linear time. In terms of which algorithm provided the most accurate results, this would be the genetic algorithm, which was consistent and achieved the best scores, making it the most accurate algorithm in this dissertation.

On reflection, this project was successful: it implemented the polyalphabetic cipher for encrypting and decrypting as well as the machine learning algorithms, and it provided a platform on which the algorithms were tested. Overall, the research has produced useful information that could help in understanding the improvements needed to make these algorithms work more effectively. For future research, it would be useful to use and enhance the genetic algorithm, to establish more fully how it could be exploited.

Word Count – 12,427


References

Books

Bell, J. (2015). Machine learning. 1st ed. Indianapolis, Indiana: John Wiley & Sons, Inc.

Churchhouse, R.F. (2001) Codes and ciphers: Julius Caesar, the enigma and the internet. Cambridge: Cambridge University Press.

Conway, D. and White, J.M. (2012) Machine learning for hackers. Sudbury, MA, United States: O’Reilly Media, Inc, USA.

Gollmann, D. (2010) Computer security. 3rd edn. Chichester, United Kingdom: Wiley, John & Sons.

Kahn, D. (1997) The codebreakers: The comprehensive history of secret communication from ancient times to the Internet. New York, NY: Simon & Schuster Adult Publishing Group.

Martin, K.M. (2012) Everyday cryptography: Fundamental principles and applications. Oxford: Oxford University Press.

Russell, S. and Norvig, P. (2016). Artificial intelligence. 3rd ed. Pearson, pp.120-129.

Schneier, B. (1996). Applied cryptography, second edition. 2nd ed. John Wiley & Sons.

Segaran, T. (2007) Programming collective intelligence: Building smart web 2.0 applications. United States: O’Reilly Media, Inc, USA.

Stamp, M. (2011) Information security: Principles and practice. 2nd edition. Oxford: Wiley, John & Sons.

Tilborg, H. (2006). Fundamentals of cryptology: A Professional Reference and Interactive Tutorial. 1st ed. Springer Science & Business Media.


Journal Articles

Damico, T.M. (2009) ‘A brief ’, Inquiries Journal, 1(11).

Lones, M. (2014). Metaheuristics in nature-inspired algorithms. Proceedings of the 2014 conference companion on Genetic and evolutionary computation companion - GECCO Comp '14.

Zang, H., Zhang, S. and Hapeshi, K. (2010). A Review of Nature-Inspired Algorithms. Journal of Bionic Engineering, 7, pp. S232-S237.

Dissertation/Thesis

Bergmann, K. (2007). Cryptanalysis Using Nature-Inspired Optimization Algorithms. Master of Science. University of Calgary.

Dorey, E. (n.d.). Optimising Hill Climbing Algorithms to Solve Monoalphabetic Substitution Ciphers. Undergraduate. University of Gloucestershire.

Maximov, A. (2006). Some Words on Cryptanalysis of Stream Ciphers. Ph.D. Lund University.

Websites

Beach, J. (no date) Syntax Highlight Code in Word Documents. Available at: http://www.planetb.ca/syntax-highlight-word (Accessed: 1 March 2017).

Bell, R. (2017). A beginner's guide to Big O notation. Rob-bell.net. Available at: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/ (Accessed: 12 April 2017).

Hat, R. (2017) A brief history of Cryptography. Available at: https://access.redhat.com/blogs/766093/posts/1976023 (Accessed: 2 February 2017).


Junit.org. (2017). JUnit Framework. Available at: http://junit.org/junit4/ (Accessed: 4 April 2017).

Microsoft. (2017). ASCII Character Codes. Available at: https://msdn.microsoft.com/en-us/library/60ecse8t(v=vs.80).aspx (Accessed: 3 April 2017).

National Churchill Museum (no date) Winston Churchill to Franklin D. Roosevelt, 1940. Available at: https://www.nationalchurchillmuseum.org/winston-churchill-to-franklin-d-roosevelt-november-23-1940.html (Accessed: 1 March 2017).

Practical Cryptography (2009) Quadgram Statistics as a Fitness Measure. Available at: http://practicalcryptography.com/cryptanalysis/text-characterisation/quadgrams/#a-python-implementation (Accessed: 13 February 2017).

Programming-algorithms.net. (2017). Vigenère cipher. Available at: http://www.programming-algorithms.net/article/45623/Vigenere-cipher (Accessed: 2 April 2017).

Rivest, R. (2013) Cryptography. Available at: http://www.newworldencyclopedia.org/entry/Cryptography (Accessed: 9 February 2017).

Rodriguez-Clark, D. (2013) Polyalphabetic substitution ciphers. Available at: http://crypto.interactive-maths.com/polyalphabetic-substitution-ciphers.html (Accessed: 9 February 2017).

Socialresearchmethods.net. (2017). Philosophy of Research. Available at: https://www.socialresearchmethods.net/kb/philosophy.php (Accessed: 2 April 2017).

tutorialspoint.com. (2017). Attacks on Cryptosystems. Available at: https://www.tutorialspoint.com/cryptography/attacks_on_cryptosystems.htm (Accessed: 28 March 2017).

tutorialspoint.com. (2017). JUnit Writing a Test. Available at: https://www.tutorialspoint.com/junit/junit_writing_tests.htm (Accessed: 4 April 2017).


Appendices

Appendix 1 – 217 words for encryption from ‘Winston Churchill to Franklin D. Roosevelt’ message November 23, 1940. (National Churchill Museum, no date)

Input: Our accounts show that the situation in Spain is deteriorating and that the Peninsula is not far from the starvation point. An offer by you to dole out food month by month so long as they keep out of the war might be decisive. Small things do not count now and this is a time for very plain talk to them. The occupation by Germany of both sides of the Straits would be a grievous addition to our naval strain, already severe. The Germans would soon have batteries working by radio direction finding which would close the Straits both by night and day. With a major campaign developing in the Eastern Mediterranean and need of reinforcement and supply of our armies there all round the Cape we could not contemplate any military action on the mainland at or near the Straits. The Rock of Gibraltar will stand a long siege but what is the good of that if we cannot use the harbour or pass the Straits? Once in Morocco the Germans will work South, and U-boats and aircraft will soon be operating from Casablanca and Dakar. I need not, Mr. President, enlarge upon the trouble this will cause to us or approach of trouble to the Western Hemisphere. We must gain as much time as possible.

Key used to encrypt = ‘WINSTON’

Output: KCESVQBQVGKLVBSBUSMHUAAVLNOGEWAAGGCWQAALRRPMEAHFNPQAYTBQPPNLMVRLMA AGGHHIVKGCGBIEXKCZPPRKMOERIGAHBCKQALTBBBNRJUMLKCGGWCYAWHLYCBZUBFMVOUU BFMVFKTBFZOFPPRQDSRLWHLHTGDMJSKAVCPGTXRRYQFAOSFIIYDMVVJOFVHBBPKBMGHAKE NFWHUEAVKTHVIMSGKJRNGCDTWAPIYCMCGDMZLASBYKHHTHVKVOQZSEIIAQHTOKBUKBRRO WSLASFPZNAMGJKCYVUSNCZVWOCHOIQVBHVKVGGHIEJIISEGGNIVFTZEAIQQLSIAZRLASTAZZSG GJKCYVLCBJPNNXPNPBRJBSFSWECBBTXGESWWBZQEWVHVKVSAGRVJOJZBQUSWHDWQYKARL ASFPZNAMGOKBUTRBVCPGSGRQWGJAMVNIIWGKQNIXNAZBQADRDHDVJOVFMVRAIFLXFAIMQ AMSENIAWTBNJLAWXRBBZRAGTBNKREXBGWVQKNDCHGBXHIEWZZAXGGDMEWTZYNWHFWH 54

UAKNHXKRYWHDWBBPKBFMSZLTNLXOAUUVDBHNNGNUMWBJWALASZWQADTBQWBBJGSNN BUWLHEWQGKMVRNWPCHTTEJESEHNNEVDEGGWVQSECACAVWZSOQBJZTHVOBUWZCBZWSL AOGENJWVOAJWGMLSGDMUSKPBQZBJIOFOBUWLHEWQGKHBPAQAEHFBYKBLASTAZZSGGJETY OHFXOWHLAOAZCOGTHFWVQSBFPNISLPWYHABGGPRKXRJTHVJOSJHAPWANTEOAYINFWRNGI EAGSRZVBLFFCNMFAWSAPMADTFTACCGGHUABEGNPYABUALKVHTPSNGRPWHKHFNLXEGTQU KNGJHIOHMGGMVRSMFLXFADMZALDUAZROXAHOBTSBBNOUHUAHVIMNKICFOQODX


Appendix 2 – Encrypting the message by ‘Winston Churchill’ in Appendix 1

Main.java

package uk.co.ryanjamesbrown.dissertation;

public class Main {

    public static void main(String[] args) {

        System.out.println("Enter message to encrypt here:\n");

        String plaintext = "Our accounts show that the situation in Spain is deteriorating and that the Peninsula is not far from the starvation point.....rest of message";
        Encryption encrypt = new Encryption(plaintext);

        System.out.println("Plain Text:\t" + encrypt.getPlainText() + "\n\nEncrypted Message:\t" + encrypt.getEncryptedMessage());
    }
}


Encryption.java

1. package uk.co.ryanjamesbrown.dissertation; 2. 3. public class Encryption { 4. 5. private String plaintext; 6. private String key = "WINSTON"; //change key for the testing 7. 8. //start of the alphabet 'A' in the ASCII. 9. private static final int ASCII_UPPERCASE = 65; 10. 11. private StringBuilder encryptedMessage; 12. 13. Encryption(String plaintext) { 14. 15. this.plaintext = plaintext.toUpperCase(); 16. this.encryptedMessage = new StringBuilder(); 17. 18. startEncryption(); 19. } 20. 21. public void setPlainText(String plnText){ 22. 23. this.encryptedMessage = new StringBuilder(); 24. this.plaintext = plnText.toUpperCase(); 25. } 26. 27. private void startEncryption() { 28. 29. int keyNum = 0; 30. 31. //running through message characters 32. for(int i = 0; i < plaintext.length(); i++){ 33. 34. //gets the character at the specific point of message 35. char plnTxt = plaintext.charAt(i); 36. 37. int convertASCII = (int) plnTxt; 38. int plnTextNum = convertASCII - ASCII_UPPERCASE; 39. 40. char currentKey = key.charAt(keyNum); 41. 42. int convert_ASCII_key = (int) currentKey; 43. int indexOfKey = convert_ASCII_key - ASCII_UPPERCASE; 44.


45. if(convertASCII >= 65 && convertASCII <= 90) { 46. 47. shiftLetters(plnTextNum, indexOfKey); 48. 49. if(keyNum < key.length()-1) { 50. keyNum = keyNum + 1; 51. } else { 52. //reset 53. keyNum = 0; 54. } 55. } 56. } 57. } 58. 59. private void shiftLetters(int plainText, int key) { 60. 61. // Ci = (Pi + Ki) mod 26; 62. int newLetter = (plainText + key) % 26; 63. 64. int getAscii_letter = (newLetter + ASCII_UPPERCASE); 65. char encryptedCharacter = (char) getAscii_letter; 66. 67. encryptedMessage.append(encryptedCharacter); 68. } 69. 70. public StringBuilder getEncryptedMessage() { 71. return encryptedMessage; 72. } 73. 74. public String getKey() { 75. return key; 76. } 77. 78. public String getPlainText() { 79. return plaintext; 80. } 81. 82. }


Appendix 3 – JUnit Testing the encrypted message

The purpose of this testing is to ensure that the encrypted message is correct before proceeding with the decryption and with testing the algorithms. This is an important result for this research, because the test needed to pass to ensure the reliability of the decryption techniques. The test takes a small segment of the text from Appendix 1, whose expected ciphertext was calculated by hand beforehand using the techniques discussed in section 3.2.1 and the Tabula Recta in figure 3. The 'JUnit writing a test' guide from tutorialspoint.com (2017) proved very helpful in setting up and implementing this test, and Junit.org (2017) provided the JUnit jar file used in the Java application for testing.
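For illustration, the first few letters of the expected cipher text in the test can be reproduced from the shift Ci = (Pi + Ki) mod 26 used in Encryption.java (a small standalone sketch, not part of the test itself):

# Plaintext 'OUR' with key 'WIN': O+W -> K, U+I -> C, R+N -> E,
# matching the start of the expected string "KCESVQ...".
for p, k in zip("OUR", "WIN"):
    c = chr((ord(p) - 65 + ord(k) - 65) % 26 + 65)
    print(p + " + " + k + " -> " + c)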

TestEncryption.java

package uk.co.ryanjamesbrown.dissertation;

import org.junit.runner.JUnitCore;              // setup for JUnit. Import libraries
import org.junit.runner.Result;
import org.junit.runner.notification.Failure;

public class TestEncryption {

    public static void main(String[] args) {

        Result encrypted = JUnitCore.runClasses(Main.class);   // get success status

        for (Failure fail : encrypted.getFailures()) {          // check for any failures
            System.out.println(fail.toString());
        }

        System.out.println("\nThe encryption was successful (TRUE) or failed (FALSE):\t" + encrypted.wasSuccessful());
    }
}


Main.java

package uk.co.ryanjamesbrown.dissertation;

import org.junit.Test;                        // import JUnit library
import static org.junit.Assert.assertEquals;

public class Main {

    @Test
    public void testEncryption() {

        System.out.println("Enter message to encrypt here:\n");

        String plaintext = "Our accounts show that the situation in Spain";
        // create a polyalphabetic encryption from the Winston Churchill message.
        Encryption encrypt = new Encryption(plaintext);
        String encryptedMessage = encrypt.getEncryptedMessage().toString();

        System.out.println("Plain Text:\t" + encrypt.getPlainText() + "\n\nEncrypted Message:\t" + encryptedMessage);
        // check to see if the encrypted message is as expected
        assertEquals(encryptedMessage, "KCESVQBQVGKLVBSBUSMHUAAVLNOGEWAAGGCWQA");
    }
}

The test was successful, with the correct output shown. This helped in advancing further with, and completing, the project.


Appendix 4 – Decrypting the polyalphabetic cipher

The code is broadly the same as the encryption, but it reverses the process in order to decrypt the polyalphabetic cipher.

1. class deciphering(object): 2. 3. key = "" 4. ascii_uppercase = 65 5. decryptedMessage = [] 6. 7. def __init__(self, msg, key): #input key/solution made from algorithm 8. 9. self.encryptedMessage = msg #store message 10. 11. if type(key) == str: 12. self.key = key 13. elif type(key) == list: 14. temp = ''.join(key) 15. self.key = temp 16. 17. def startDecryption(self): 18. 19. self.decryptedMessage = [] #create a list to store new letters 20. 21. self.key_num = 0 22. 23. for i in range(0, len(self.encryptedMessage)): #loop through message 24. 25. charAtCipher = self.encryptedMessage[i] #get specific character 26. cipher_index = ord(charAtCipher) 27. cipher_alphabet_index = cipher_index - self.ascii_uppercase 28. 29. charAtKey = self.key[self.key_num] #get key character 30. key_index = ord(charAtKey) 31. key_alphabet_index = key_index - self.ascii_uppercase 32. 33. if cipher_index >= 65 and cipher_index <= 90: #Within alphabet 34. #find plaintext letter 35. self.shiftLetters(cipher_alphabet_index, key_alphabet_index) 36. 37. if self.key_num < len(self.key)-1: 38. self.key_num = self.key_num + 1 39. else: 40. self.key_num = 0 41. 42. def shiftLetters(self, cipher_text, key): 43. #Pi = (Ci – Ki) % 26 44. shift = ((cipher_text - key) + 26) % 26 #decrypt letter using formula 45. 46. newLetter = (shift + self.ascii_uppercase) 47. newCharacter = chr(newLetter) 48. 49. self.decryptedMessage.append(newCharacter) #append letter to list 50. 51. def getDecryptedMessage(self): 52. new_string = ''.join(self.decryptedMessage) #convert list to string 53. return new_string
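For illustration, a hypothetical use of the class above would be as follows (assuming the encrypted message from Appendix 1 is held in the variable cipher_text):

d = deciphering(cipher_text, "WINSTON")   # candidate key produced by an algorithm
d.startDecryption()
print(d.getDecryptedMessage())            # plaintext guess for this key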


Appendix 5 – Fitness Scoring of text using Quad grams (Practical Cryptography, 2009)

These files were used for calculating the scores of the cipher text, which aided in testing the performance of the machine learning algorithms. The files were taken from the Practical Cryptography (2009) website and no modifications were made to them.

ngram_score.py (Practical Cryptography, 2009)

'''
Allows scoring of text using n-gram probabilities
17/07/12
'''
from math import log10

class ngram_score(object):
    def __init__(self,ngramfile,sep=' '):
        ''' load a file containing ngrams and counts, calculate log probabilities '''
        self.ngrams = {}
        for line in file(ngramfile):
            key,count = line.split(sep)
            self.ngrams[key] = int(count)
        self.L = len(key)
        self.N = sum(self.ngrams.itervalues())
        #calculate log probabilities
        for key in self.ngrams.keys():
            self.ngrams[key] = log10(float(self.ngrams[key])/self.N)
        self.floor = log10(0.01/self.N)

    def score(self,text):
        ''' compute the score of text '''
        score = 0
        ngrams = self.ngrams.__getitem__
        for i in xrange(len(text)-self.L+1):
            if text[i:i+self.L] in self.ngrams:
                score += ngrams(text[i:i+self.L])
            else: score += self.floor
        return score


‘english_quadgrams.txt’ example contents in file (Practical Cryptography, 2009):

TION 13168375
NTHE 11234972
THER 10218035
THAT 8980536
OFTH 8132597
FTHE 8100836
THES 7717675
WITH 7627991
INTH 7261789
ATIO 7104943
OTHE 6900574
TTHE 6553056
DTHE 6470280
INGT 6461147
ETHE 6135216
SAND 5996705
STHE 5748611
HERE 5630500
THEC 5466071

Implementation in the main program:

#fitness scoring using quadgrams (Practical Cryptography (2009))
import ngram_score as ns
fitness = ns.ngram_score('english_quadgrams.txt')   #loads the file of quad grams

#calculates a score from the decipherment of key and cipher text
def fitness_score(problem, sol):

    decipher = dc.deciphering(problem, sol)   #decipher message with the current key (dc is the deciphering module from Appendix 4)
    decipher.startDecryption()
    message = decipher.getDecryptedMessage()

    score = fitness.score(message)            #score of decrypted message.

    return score


Appendix 6 – Machine Learning Algorithms implementation

The code here was created from, and inspired by, the literature surrounding the topic. Due to the limitations discussed in section 3.4, the algorithms were taken from 'Programming Collective Intelligence' by Segaran (2007) and modified and developed into a program that can decrypt a polyalphabetic cipher text.


Hill Climbing:

1. #some code used from collective intelligence (Segeran, 2007) 2. def hillclimb(problem, rstarts=1000, random_startkeys = {}): 3. 4. print "Hill Climbing Algorithm Starting....\n" 5. 6. start_time = time.time()#time the algorithm's performance completion 7. 8. hillClimbScores = [] #collecst all the best scores 9. 10. #number of random starts 11. for random_start in range(1,rstarts): 12. 13. #start of the index of keyword. changes as it progresses through decipher 14. key_indicator = 0 15. 16. # Create a random solution 17. sol=[chr(random.randint(0,25)+65) for i in range(keySize)] 18. 19. while 1: # Main loop 20. 21. currentSol = fitness_score(problem,sol) 22. best = currentSol 23. 24. #two random solutions made from solution 25. prevSol = [sol[p] for p in range(len(sol))] 26. nextSol = [sol[n] for n in range(len(sol))] 27. 28. if key_indicator < keySize: #if index of key is less than the size of key

29. 30. getnum = ord(sol[key_indicator]) 31. 32. #from alphabet of 0 to 25 (1 to 26) change to ascii uppercase letters (add 65. 65 being 'A' in ascii alphabet) 33. #previous neighbour 34. prevChar = chr(getnum - 1) 35. prevNum = ord(prevChar) 36. 37. if prevNum >= 65 and prevNum <= 90: 38. prevSol[key_indicator] = prevChar 39. elif prevNum < 65: 40. #assign Z as A - 1 goes to Z 41. prevSol[key_indicator] = chr(90) 42. 43. #from alphabet of 0 to 25 (1 to 26) change to ascii uppercase letters (add 65. 65 being 'A' in ascii alphabet) 44. #next neighbour 45. nextChar = chr(getnum + 1)


46. nextNum = ord(nextChar) 47. 48. if nextNum >= 65 and nextNum <= 90: 49. nextSol[key_indicator] = nextChar 50. elif nextNum > 90: 51. #assign A as Z + 1 goes back to start of alphabet A. 52. nextSol[key_indicator] = chr(65) 53. 54. elif key_indicator >= keySize: #if index of key is greater than size of key 55. break #break out of the while loop 56. 57. neighbour = [prevSol, nextSol] #assign two neighbours together 58. 59. for j in range(len(neighbour)): #check neighbours to see if their better than s olution 60. 61. current_score = fitness_score(problem,neighbour[j]) 62. 63. #if current score is greater (so closer to 0) then 64. if current_score > best: 65. best=current_score 66. sol=neighbour[j] 67. hillClimbScores.append(best) 68. 69. if best == currentSol: 70. key_indicator = key_indicator + 1 71. 72. random_startkeys[''.join(sol)] = best 73. 74. best_of = None 75. num = 0 76. best_of_sol = [] 77. 78. #run through the best random iterations and find the best result 79. for item in random_startkeys: 80. 81. if num == 0: 82. best_of = random_startkeys[item] 83. best_of_sol = list(item) 84. num += 1 85. 86. else: 87. current = random_startkeys[item] 88. 89. if current > best_of: 90. best_of = current 91. best_of_sol = list(item) 92. 93. end_time = time.time() 94. print "\nHill Climbing time (seconds):\t" + str(end_time - start_time) + "\n" 95. 96. create_graphs(hillClimbScores, 'Hill Climbing', 'r') 97. 98. print "Number of best solutions found:\t" + str(len(hillClimbScores)) 99. print "Hill Climbing Algorithm Completed....\n" 100. print str(''.join(best_of_sol)) + "\n" + str(best_of) 101. 102. return hillClimbScores


Genetic Algorithm:

1. #some code used from collective intelligence (Segeran, 2007) 2. def geneticoptimize(problem, popsize=100, step=1, mutprod=0.6, elite=0.4, maxiter=1 20): 3. 4. print "Genetic Algorithm Starting....\n" 5. 6. start_time = time.time() #time the algorithm's performance completion 7. 8. geneScores = [] #collects the best scores 9. 10. # Mutation Operation 11. def mutate(vec): 12. 13. i = random.randint(0, keySize-1) #choose random index to mutate 14. 15. if random.random()<0.5 and len(vec[i])>0: 16. 17. newNum = ord(vec[i]) - step #find neighbour alphabet to change 18. 19. if newNum >= 65 and newNum <= 90: #If character is in the Alphabet 20. newChr = vec[0:i]+[chr(ord(vec[i])-step)]+vec[i+1:] #mutation 21. elif newNum < 65: #If character is less than the character A assign Z 22. newChr = vec[0:i]+[chr(90)]+vec[i+1:] 23. 24. return newChr 25. 26. elif len(vec[i])= 65 and newNum <= 90:#If character is in the Alphabet 31. newChr = vec[0:i]+[chr(ord(vec[i])+step)]+vec[i+1:] #mutation 32. elif newNum > 90: #If character is more than the character Z assign A 33. newChr = vec[0:i]+[chr(65)]+vec[i+1:] 34. 35. return newChr 36. 37. # Crossover Operation 38. def crossover(r1,r2): 39. i = random.randint(1,len(r1)-2) #index of key to changeover 40. cross = r1[0:i]+r2[i:] #crossover the two elite 41. return cross 42. 43. pop=[] #Initialise the starting population 44. 45. for i in range(popsize): #create a number of random keys 46. sol = [chr(random.randint(0,25)+65)for i in range(keySize)] 47. pop.append(sol) 48. 49. topelite = int(elite*popsize) #find the number of elite in the population 50. 51. for i in range(maxiter): #start of main nloop 52. 53. scores=[(fitness_score(problem,v),v) for v in pop] #find fitness scores of population 54. scores.sort(reverse=True) #make best results show first 55. ranked=[v for (s,v) in scores] #rank these scores


56. 57. # Start with the pure winners and overwrite existing list of better populat ion 58. pop = ranked[0:topelite] 59. 60. # Add mutated and bred forms of the winners 61. while len(pop)


Random Optimisation:

1. #some code used from collective intelligence (Segeran, 2007) 2. def randomoptimize(problem, riterations=7000): 3. 4. print "Random Optimize Starting....\n" 5. 6. #time the algorithm's performance completion 7. start_time = time.time() 8. 9. randomScores = [] 10. 11. best = -999999999 #best is the largest number possible first (number is less than 0) 12. best_result = None 13. 14. for i in range(0,riterations): 15. 16. # Create a random solution of a key of the size key set 17. sol=[chr(random.randint(0,25)+65)for i in range(keySize)] 18. 19. score = fitness_score(problem,sol)# Get the fitness score of the current so lution 20. 21. # Compare the current fitness score with the best. If so, update new best.

22. if score > best: 23. best = score 24. best_result = sol 25. randomScores.append(best) 26. 27. end_time = time.time() # 28. print "\nRandom Optimize time (seconds):\t" + str(end_time - start_time) + "\ n" 29. 30. create_graphs(randomScores, 'Random Optimize', 'g') 31. 32. print "Number of best solutions found:\t" + str(len(randomScores)) 33. print "Random Optimize Completed....\n" 34. print str(''.join(best_result)) + "\n" + str(best) 35. 36. return randomScores


Simulated Annealing:

1. #some code used from collective intelligence (Segeran, 2007) 2. def annealingoptimize(problem, T=100000.0, cool=0.9995, step=1): 3. 4. print "Simulated Annealing Starting....\n" 5. 6. start_time = time.time() #time the algorithm's performance completion 7. 8. best_sol_score = None 9. annealingScores = [] 10. 11. # create a random solution of a key with the given size 12. sol=[chr(random.randint(0,25)+65) for i in range(keySize)] 13. 14. while T > 0.1: 15. 16. # randomise the chosen index 17. i = random.randint(0, keySize-1) 18. 19. # edit the index of character 20. dir = random.choice([1,-1]) 21. 22. # Create a new list with one of the values changed 23. current_sol = sol[:] 24. newChar = current_sol[i] 25. newNum = ord(newChar) + dir 26. 27. if newNum >= 65 and newNum <= 90: #if new character is part of Alphabet 28. current_sol[i] = chr(newNum) 29. elif newNum < 65: #if new character is less than A then assign Z 30. current_sol[i] = chr(90) 31. elif newNum > 90: #if new character is more than Z then assign A 32. current_sol[i] = chr(65) 33. 34. # Calculate the current cost and the new cost 35. best_sol_score = fitness_score(problem, sol) 36. current_sol_score = fitness_score(problem, current_sol) 37. 38. #find the probability from results. The lower the probability the less errors it accepts 39. probability = pow(math.e,(current_sol_score - -best_sol_score) / T) 40. 41. if (current_sol_score > best_sol_score or random.random() < probability): 42. sol=current_sol 43. best_sol_score = current_sol_score 44. annealingScores.append(best_sol_score) 45. 46. # Decrease the temperature 47. T = T * cool 48. 49. end_time = time.time() 50. print "\nSimulated Annealing time (seconds):\t" + str(end_time - start_time) + "\n" 51. 52. create_graphs(annealingScores, 'Simulated Annealing', 'y') 53. 54. print "Number of best solutions found:\t" + str(len(annealingScores)) 55. print "Simulated Annealing Completed....\n" 56. print str(''.join(sol)) + "\n" +str(best_sol_score) 57. return annealingScores


Appendix 7 – Testing performance of algorithms

The algorithms used in this research were tested to find the best parameters for each. Improving the performance of each one in this way helps to produce better results and should enhance the final test. Below is the additional code used to test the algorithms' performance:


1. def hillClimb_ptesting(): 2. 3. start_time = time.time() 4. hillclimb(cipher, 100) 5. end_time = time.time() 6. print "\nHill Climbing 1 time (seconds):\t" + str(end_time - start_time) + "\n" 7. 8. start_time = time.time() 9. hillclimb(cipher, 200) 10. end_time = time.time() 11. print "\nHill Climbing 2 time (seconds):\t" + str(end_time - start_time) + "\n" 12. 13. start_time = time.time() 14. hillclimb(cipher, 300) 15. end_time = time.time() 16. print "\nHill Climbing 3 time (seconds):\t" + str(end_time - start_time) + "\n" 17. 18. def genetic_ptesting(): 19. 20. start_time = time.time() 21. geneticoptimize(cipher, 50, 1, 0.2, 0.1, 80) 22. end_time = time.time() 23. print "\nGenetic Algorithm time (seconds):\t" + str(end_time - start_time) + "\n" 24. 25. start_time = time.time() 26. geneticoptimize(cipher, 70, 1, 0.4, 0.2, 100) 27. end_time = time.time() 28. print "\nGenetic Algorithm time (seconds):\t" + str(end_time - start_time) + "\n" 29. 30. start_time = time.time() 31. geneticoptimize(cipher, 100, 1, 0.6, 0.4, 120) 32. end_time = time.time() 33. print "\nGenetic Algorithm time (seconds):\t" + str(end_time - start_time) + "\n" 34. 35. def random_ptesting(): 36. 37. start_time = time.time() 38. randomoptimize(cipher, 3000) 39. end_time = time.time() 40. print "\nRandom Optimize time (seconds):\t" + str(end_time - start_time) + "\n" 41. 42. start_time = time.time() 43. randomoptimize(cipher, 5000) 44. end_time = time.time() 45. print "\nRandom Optimize time (seconds):\t" + str(end_time - start_time) + "\n" 46. 47. start_time = time.time() 48. randomoptimize(cipher, 7000) 49. end_time = time.time() 50. print "\nRandom Optimize time (seconds):\t" + str(end_time - start_time) + "\n"


51. 52. def annealing_ptesting(): 53. 54. start_time = time.time() 55. annealingoptimize(cipher, 100000.0, 0.9995, 1) 56. end_time = time.time() 57. print "\nSimulated Annealing time (seconds):\t" + str(end_time - start_time) + "\n" 58. 59. start_time = time.time() 60. annealingoptimize(cipher, 1000000.0, 0.995, 1) 61. end_time = time.time() 62. print "\nSimulated Annealing time (seconds):\t" + str(end_time - start_time) + "\n" 63. 64. start_time = time.time() 65. annealingoptimize(cipher, 10000000.0, 0.9995, 1) 66. end_time = time.time() 67. print "\nSimulated Annealing time (seconds):\t" + str(end_time - start_time) + "\n"

Hill Climbing

Best Outputted Scores (Parameters Used: problem, iterations)

hillclimb(cipher, 100)    Run 1: -5752.286562 (5.16 s)    Run 2: -5477.846894 (5.32 s)    Run 3: -6128.886104 (5.46 s)    Average: -5786.339853 (5.31 s)
hillclimb(cipher, 500)    Run 1: -5752.286562 (26.02 s)   Run 2: -5477.846894 (26.01 s)   Run 3: -4737.055471 (10.17 s)   Average: -5322.396309 (20.73 s)
hillclimb(cipher, 1000)   Run 1: -5432.748693 (52.57 s)   Run 2: -5316.9699 (54.02 s)     Run 3: -4737.055471 (53.06 s)   Average: -5162.258021 (53.22 s)   <- SETTING CHOSEN

Testing this algorithm made it clear that increasing the number of iterations allowed a wider search of candidate keys and therefore produced better results. 1000 random hill climbing iterations were therefore chosen for the final testing in section 4.1.
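As a quick check of the Average column (an illustrative calculation only), the mean of the three run scores and times for hillclimb(cipher, 1000) reproduces the reported averages:

scores = [-5432.748693, -5316.9699, -4737.055471]
times = [52.57, 54.02, 53.06]
print(sum(scores) / len(scores))   # -5162.258021..., matching the Average score
print(sum(times) / len(times))     # 53.216..., which rounds to the reported 53.22 s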

Genetic Algorithm

Best Outputted Scores (Parameter Settings: problem, popsize, step, mutprod, elite, maxiter)

geneticoptimize(cipher, 50, 1, 0.2, 0.1, 80)     Run 1: -5475.658591 (10.12 s)   Run 2: -4962.192344 (9.46 s)    Run 3: -6187.126142 (9.93 s)    Average: -5541.659 (9.84 s)
geneticoptimize(cipher, 70, 1, 0.4, 0.2, 100)    Run 1: -6386.050171 (18.72 s)   Run 2: -5944.400916 (15.11 s)   Run 3: -4788.541182 (15.94 s)   Average: -5706.3308 (16.59 s)
geneticoptimize(cipher, 100, 1, 0.6, 0.4, 120)   Run 1: -4830.030634 (23.32 s)   Run 2: -5544.027862 (23.55 s)   Run 3: -4737.055471 (23.02 s)   Average: -5037.038 (23.30 s)   <- SETTING CHOSEN

This algorithm required more thought about which parameters to use. The setting marked above achieved the best score overall, although it also took the longest to complete; in general, the longer this algorithm is allowed to run, the better the results it produces. That parameter setting has therefore been chosen for the final test results in section 4.2.

Random Optimize

Best Outputted Scores (Parameters Used: problem, iterations)

randomoptimize(cipher, 3000)   Run 1: -6551.871064 (4.00 s)   Run 2: -6856.845628 (4.04 s)   Run 3: -6535.415624 (4.29 s)   Average: -6648.044105 (4.11 s)
randomoptimize(cipher, 5000)   Run 1: -6581.94738 (6.69 s)    Run 2: -6657.871058 (6.81 s)   Run 3: -6761.367491 (7.04 s)   Average: -6667.061976 (6.85 s)
randomoptimize(cipher, 7000)   Run 1: -6734.122574 (9.32 s)   Run 2: -6851.043175 (9.58 s)   Run 3: -6416.355671 (9.40 s)   Average: -6667.173807 (9.44 s)

As shown in the table above, random optimisation is not reliable and its results are not consistent. To make a fair comparison with the other algorithms, however, it is better to let it run for longer, so the final test uses 7000 iterations. This at least gives the algorithm a reasonable chance of finding a decent score/key.
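One further observation from the table (an illustrative calculation, not part of the original analysis) is that running time grows almost linearly with the iteration count, at roughly 1.35 to 1.37 ms per random key tried, so doubling the iterations simply doubles the time:

for iterations, avg_time in [(3000, 4.11), (5000, 6.85), (7000, 9.44)]:
    # average time per random key tried, in milliseconds
    print("%d iterations: %.2f ms per key" % (iterations, avg_time / iterations * 1000))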

Simulated Annealing

Best Outputted Scores (Parameter Settings: problem, temperature, cooling, step)

annealing(cipher, 100000.0, 0.9995, 1)      Run 1: -7132.610754 (71.60 s)    Run 2: -5904.320339 (67.81 s)    Run 3: -6434.416686 (73.41 s)    Average: -6490.45 (70.94 s)   <- SETTING CHOSEN
annealing(cipher, 1000000.0, 0.995, 1)      Run 1: -7032.570318 (7.75 s)     Run 2: -6985.980967 (8.03 s)     Run 3: -7227.976007 (8.35 s)     Average: -7082.18 (8.04 s)
annealing(cipher, 10000000.0, 0.9995, 1)    Run 1: -6419.462345 (103.12 s)   Run 2: -6900.859297 (107.35 s)   Run 3: -7081.347545 (113.78 s)   Average: -6800.56 (108.08 s)

Simulated annealing works by using a starting temperature and a cooling rate. It was configured with several parameter combinations, which produce quite different running times as the table above shows. The first parameter setting, marked above, has been selected for the final and main test: it produces the best score on average and completes in a reasonable amount of time.
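The differences in running time follow directly from the cooling schedule: the loop in Appendix 6 repeats until T drops below 0.1, so the starting temperature and cooling rate determine the iteration count. The short check below (an illustrative calculation using that same stopping condition) shows why the second setting finishes so quickly:

import math

def annealing_iterations(T0, cool, T_min=0.1):
    # number of times T is multiplied by the cooling rate before it falls below T_min
    return int(math.ceil(math.log(T_min / T0) / math.log(cool)))

for T0, cool in [(100000.0, 0.9995), (1000000.0, 0.995), (10000000.0, 0.9995)]:
    print("T0 = %.1f, cool = %.4f -> about %d iterations"
          % (T0, cool, annealing_iterations(T0, cool)))
# roughly 27,600, 3,200 and 36,800 iterations respectively, which is consistent with the
# observed ~71 s, ~8 s and ~108 s average running times (about 2.5 to 2.9 ms per iteration)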


Appendix 8 – Ethics form (2016D0099)

When undertaking a research or enterprise project, Cardiff Met staff and students are obliged to complete this form in order that the ethics implications of that project may be considered. If the project requires ethics approval from an external agency (e.g., NHS), you will not need to seek additional ethics approval from Cardiff Met. You should however complete Part One of this form and attach a copy of your ethics letter(s) of approval in order that your School has a record of the project. The document Ethics application guidance notes will help you complete this form. It is available from the Cardiff Met website. The School or Unit in which you are based may also have produced some guidance documents; please consult your supervisor or School Ethics Coordinator. Once you have completed the form, sign the declaration and forward to the appropriate person(s) in your School or Unit. PLEASE NOTE: Participant recruitment or data collection MUST NOT commence until ethics approval has been obtained.

PART ONE
Name of applicant: Ryan Brown
Supervisor (if student project): Dr Chaminda Hewage
School / Unit: Cardiff School of Management
Student number (if applicable): St20067780
Programme enrolled on (if applicable): BSc (Hons) Computing
Project Title: “Breaking and Entering”: Evaluation of Various Decryption Techniques to Decipher a Polyalphabetic Substitution Cipher

Expected start date of data collection: 01/12/2016
Approximate duration of data collection: 2 months
Funding Body (if applicable): N/A
Other researcher(s) working on the project: N/A
Will the study involve NHS patients or staff? No
Will the study involve human samples and/or human cell lines? No

Does your project fall entirely within one of the following categories:
Paper based, involving only documents in the public domain: Yes
Laboratory based, not involving human participants or human samples: No
Practice based, not involving human participants (eg curatorial, practice audit): No
Compulsory projects in professional practice (eg Initial Teacher Education): No
A project for which external approval has been obtained (e.g., NHS): No

If you have answered YES to any of these questions, expand on your answer in the non-technical summary. No further information regarding your project is required. If you have answered NO to all of these questions, you must complete Part 2 of this form.

In no more than 150 words, give a non-technical summary of the project:
Overall, the project consists of creating a Java application which can encrypt ‘plain text’ (a string of words or sentences) using a type of algorithm called the ‘polyalphabetic cipher’, which as a result scrambles the ‘plain text’ into ‘encrypted text’. The main part of the project will be to implement different types of decryption techniques (algorithms to solve equations) and use these to decipher the ‘encrypted text’, whilst analysing and evaluating each of their performances. This means looking at which decryption techniques come closer to solving the ‘polyalphabetic cipher’ and which ones are faster at doing so.

This research will be conducted by first creating the polyalphabetic cipher to encrypt a string of words, and then using the deciphering methods to decrypt the encrypted message. Alongside the decoding, scoring and testing will be implemented for analysing the data produced by the decryption methods.

DECLARATION: I confirm that this project conforms with the Cardiff Met Research Governance Framework.


I confirm that I will abide by the Cardiff Met requirements regarding confidentiality and anonymity when conducting this project.

STUDENTS: I confirm that I will not disclose any information about this project without the prior approval of my supervisor.

Signature of the applicant: Ryan Brown
Date: 02/11/2016

FOR STUDENT PROJECTS ONLY
Name of supervisor: Chaminda Hewage
Date: 17/11/2016

Signature of supervisor:

Research Ethics Committee use only

Decision reached: Project approved | Project approved in principle | Decision deferred | Project not approved | Project rejected
Project reference number: 2016D0099

Name: Dr Hilary Berger Date: 22/11/2016

Signature: Hilary Berger

Details of any conditions upon which approval is dependent: None
