Malware Detection Using the Index of Coincidence

Total Page:16

File Type:pdf, Size:1020Kb

Malware Detection Using the Index of Coincidence San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Winter 1-23-2017 Malware Detection using the Index of Coincidence Bhavna Gurnani San Jose State University Follow this and additional works at: https://scholarworks.sjsu.edu/etd_projects Part of the Artificial Intelligence and Robotics Commons, and the Information Security Commons Recommended Citation Gurnani, Bhavna, "Malware Detection using the Index of Coincidence" (2017). Master's Projects. 507. DOI: https://doi.org/10.31979/etd.9mkp-kstb https://scholarworks.sjsu.edu/etd_projects/507 This Master's Project is brought to you for free and open access by the Master's Theses and Graduate Research at SJSU ScholarWorks. It has been accepted for inclusion in Master's Projects by an authorized administrator of SJSU ScholarWorks. For more information, please contact [email protected]. Malware Detection using the Index of Coincidence A Project Presented to The Faculty of the Department of Computer Science San Jose State University In Partial Fulfillment of the Requirements for the Degree Master of Science by Bhavna Gurnani Dec 2016 ○c 2016 Bhavna Gurnani ALL RIGHTS RESERVED The Designated Project Committee Approves the Project Titled Malware Detection using the Index of Coincidence by Bhavna Gurnani APPROVED FOR THE DEPARTMENTS OF COMPUTER SCIENCE SAN JOSE STATE UNIVERSITY Dec 2016 Dr. Mark Stamp Department of Computer Science Dr. Thomas Austin Department of Computer Science Mr. Fabio Di Troia Department of Computer Science ABSTRACT Malware Detection using the Index of Coincidence by Bhavna Gurnani In this research, we apply the Index of Coincidence (IC) to problems in malware analysis. The IC, which is often used in cryptanalysis of classic ciphers, is a technique for measuring the repeat rate in a string of symbols. A score based on the IC is applied to a variety of challenging malware families. We find that this relatively simple IC score performs surprisingly well, with superior results in comparison to various machine learning based scores, at least in some cases. ACKNOWLEDGMENTS I am grateful to Dr. Stamp for providing me with the opportunity to work on this project and guiding me. I would like to thank my committee members for reviewing my work. Also, special thanks to Fabio Di Troia for helping me throughout the project. I would also like to thank my brother for constantly motivating me during the project. v TABLE OF CONTENTS CHAPTER 1 Introduction ...............................1 1.1 Introduction to Malware.......................1 1.2 Types of Malware...........................1 1.3 Techniques to detect malware....................2 1.4 Technique used............................2 2 Background ................................4 2.1 Previous work and research......................4 2.2 Index of Coincidence.........................5 2.2.1 Calculation..........................5 2.2.2 Application..........................6 3 Implementation ..............................7 3.1 Choices Made in Calculation.....................7 3.2 Process to calculate IC........................8 4 Results ................................... 10 4.1 With variation of normalizing coefficient.............. 12 4.2 With variation of Train dataset................... 17 4.3 With variation of Test dataset.................... 21 4.4 Summary of results.......................... 24 4.5 Comparison with other scores.................... 25 5 Observations ............................... 27 vi 5.1 Observation 1:............................. 27 5.2 Observation 2:............................. 28 5.3 Observation 3:............................. 29 6 Conclusion and Future Work ..................... 31 APPENDIX Generated ROC Curves .......................... 34 vii LIST OF TABLES 1 Dataset................................ 10 2 Parameters for c as 26........................ 13 3 Parameters for c as 79........................ 13 4 Results with different normalizing coefficients........... 14 5 Results with different c and training datasets........... 16 6 Parameters for NGVCK as training dataset............ 18 7 Parameters for Zeroaccess as training dataset........... 19 8 Results with different training datasets............... 20 9 Parameters for Cridex as test dataset................ 22 10 Parameters for Cleaman as test dataset............... 23 11 Results with different testing datasets............... 23 12 Comparison across different scores................. 25 13 Percentage of matching opcodes................... 27 14 Frequency percentage of ‘other’ opcodes.............. 28 A.15 Parameters NGVCK-Benign-26................... 34 A.16 Parameters NGVCK-Cleaman-26.................. 34 A.17 Parameters NGVCK-Cridex-26................... 35 A.18 Parameters ZeroAccess-Benign-26.................. 35 A.19 Parameters Smart HDD-Benign-26................. 36 A.20 Parameters Security Shield-Benign-26................ 36 A.21 Parameters Winwebsec-Benign-26.................. 37 viii A.22 Parameters Zbot-Benign-26..................... 37 A.23 Parameters Cridex-Benign-26.................... 38 A.24 Parameters Cleaman-Benign-26................... 38 A.25 Parameters Harebot-Benign-26................... 39 A.26 Parameters NGVCK-Benign-79................... 39 A.27 Parameters NGVCK-Zbot-79.................... 40 A.28 Parameters NGVCK-Benign-63................... 40 A.29 Parameters NGVCK-Security Shield-63............... 41 A.30 Parameters NGVCK-Benign-50................... 41 A.31 Parameters NGVCK-Winwebsec-50................. 42 A.32 Parameters NGVCK-Benign-45................... 42 A.33 Parameters NGVCK-Smart HDD-45................ 43 A.34 Parameters NGVCK-Benign-34................... 43 A.35 Parameters NGVCK-Zeroaccess-34................. 44 A.36 Parameters NGVCK-Benign-12................... 44 A.37 Parameters NGVCK-Harebot-12.................. 45 A.38 Parameters NGVCK-Benign-6.................... 45 ix LIST OF FIGURES 1 NGVCK-Benign scatter plot, c is 26................ 11 2 NGVCK-Benign scatter plot, c is 30................ 11 3 ROC curve for c as 26........................ 12 4 ROC curve for c as 79........................ 13 5 AUC with different normalizing coefficients: line graph...... 14 6 AUC with different normalizing coefficients: bar graph...... 15 7 Different c and train dataset across benign............. 17 8 ROC curve for NGVCK as training dataset............ 18 9 ROC curve for Zeroaccess as training dataset........... 19 10 AUC by different training datasets................. 20 11 ROC curve for Cridex as test dataset................ 22 12 ROC curve for Cleaman as test dataset............... 23 13 AUC by different testing datasets.................. 24 14 Comparing results with different scores............... 25 15 Percentage of matching opcodes................... 28 16 Frequency percentage of ’other’ opcodes.............. 29 17 Opcode Distribution across families................. 30 A.18 ROC Curve NGVCK-Benign-26................... 34 A.19 ROC Curve NGVCK-Cleaman-26.................. 34 A.20 ROC Curve NGVCK-Cridex-26................... 35 A.21 ROC Curve ZeroAccess-Benign-26.................. 35 x A.22 ROC Curve Smart HDD-Benign-26................. 36 A.23 ROC Curve Security Shield-Benign-26............... 36 A.24 ROC Curve Winwebsec-Benign-26................. 37 A.25 ROC Curve Zbot-Benign-26..................... 37 A.26 ROC Curve Cridex-Benign-26.................... 38 A.27 ROC Curve Cleaman-Benign-26................... 38 A.28 ROC Curve Harebot-Benign-26................... 39 A.29 ROC Curve NGVCK-Benign-79................... 39 A.30 ROC Curve NGVCK-Zbot-79.................... 40 A.31 ROC Curve NGVCK-Benign-63................... 40 A.32 ROC Curve NGVCK-Security Shield-63.............. 41 A.33 ROC Curve NGVCK-Benign-50................... 41 A.34 ROC Curve NGVCK-Winwebsec-50................. 42 A.35 ROC Curve NGVCK-Benign-45................... 42 A.36 ROC Curve NGVCK-SmartHDD-45................ 43 A.37 ROC Curve NGVCK-Benign-34................... 43 A.38 ROC Curve NGVCK-Zeroaccess-26................. 44 A.39 ROC Curve NGVCK-Benign-12................... 44 A.40 ROC Curve NGVCK-Harebot-12.................. 45 A.41 ROC Curve NGVCK-Benign-6................... 45 xi CHAPTER 1 Introduction 1.1 Introduction to Malware Malware [2] is an umbrella term for various kinds of malicious software. Such programs are designed to harm computers in several ways including stealing personal information, installing illegitimate software, reformatting hard drives, deleting files and so on. Only six years after the launch of the first personal computer in 1975, the world was presented with its first computer virus [13]. Ever since, malware has seemed unstoppable. The rate with which malicious code and other unwanted programs are released is increasing in comparison to the release rate of legitimate software applications. In this paper, the terms virus and malware are used interchangeably. 1.2 Types of Malware There has been an arms race between malware writers and anti-virus software publishers. The most commonly and widely used malware detection technique is signature detection [2]. In this technique, pattern matching is used to scan for par- ticular byte patterns that are found in malware. If a matching pattern is found, the file is categorized as malware. This is a very simple and often effective technique, but malware writers have developed various techniques aimed at defeating signature detection. Encrypting the body of a virus can effectively defeat signature detection. How- ever, the decryption
Recommended publications
  • Amy Bell Abilene, TX December 2005
    Compositional Cryptology Thesis Presented to the Honors Committee of McMurry University In partial fulfillment of the requirements for Undergraduate Honors in Math By Amy Bell Abilene, TX December 2005 i ii Acknowledgements I could not have completed this thesis without all the support of my professors, family, and friends. Dr. McCoun especially deserves many thanks for helping me to develop the idea of compositional cryptology and for all the countless hours spent discussing new ideas and ways to expand my thesis. Because of his persistence and dedication, I was able to learn and go deeper into the subject matter than I ever expected. My committee members, Dr. Rittenhouse and Dr. Thornburg were also extremely helpful in giving me great advice for presenting my thesis. I also want to thank my family for always supporting me through everything. Without their love and encouragement I would never have been able to complete my thesis. Thanks also should go to my wonderful roommates who helped to keep me motivated during the final stressful months of my thesis. I especially want to thank my fiancé, Gian Falco, who has always believed in me and given me so much love and support throughout my college career. There are many more professors, coaches, and friends that I want to thank not only for encouraging me with my thesis, but also for helping me through all my pursuits at school. Thank you to all of my McMurry family! iii Preface The goal of this research was to gain a deeper understanding of some existing cryptosystems, to implement these cryptosystems in a computer programming language of my choice, and to discover whether the composition of cryptosystems leads to greater security.
    [Show full text]
  • 9 Purple 18/2
    THE CONCORD REVIEW 223 A VERY PURPLE-XING CODE Michael Cohen Groups cannot work together without communication between them. In wartime, it is critical that correspondence between the groups, or nations in the case of World War II, be concealed from the eyes of the enemy. This necessity leads nations to develop codes to hide their messages’ meanings from unwanted recipients. Among the many codes used in World War II, none has achieved a higher level of fame than Japan’s Purple code, or rather the code that Japan’s Purple machine produced. The breaking of this code helped the Allied forces to defeat their enemies in World War II in the Pacific by providing them with critical information. The code was more intricate than any other coding system invented before modern computers. Using codebreaking strategy from previous war codes, the U.S. was able to crack the Purple code. Unfortunately, the U.S. could not use its newfound knowl- edge to prevent the attack at Pearl Harbor. It took a Herculean feat of American intellect to break Purple. It was dramatically intro- duced to Congress in the Congressional hearing into the Pearl Harbor disaster.1 In the ensuing years, it was discovered that the deciphering of the Purple Code affected the course of the Pacific war in more ways than one. For instance, it turned out that before the Americans had dropped nuclear bombs on Japan, Purple Michael Cohen is a Senior at the Commonwealth School in Boston, Massachusetts, where he wrote this paper for Tom Harsanyi’s United States History course in the 2006/2007 academic year.
    [Show full text]
  • Historical Ciphers • A
    ECE 646 - Lecture 6 Required Reading • W. Stallings, Cryptography and Network Security, Chapter 2, Classical Encryption Techniques Historical Ciphers • A. Menezes et al., Handbook of Applied Cryptography, Chapter 7.3 Classical ciphers and historical development Why (not) to study historical ciphers? Secret Writing AGAINST FOR Steganography Cryptography (hidden messages) (encrypted messages) Not similar to Basic components became modern ciphers a part of modern ciphers Under special circumstances modern ciphers can be Substitution Transposition Long abandoned Ciphers reduced to historical ciphers Transformations (change the order Influence on world events of letters) Codes Substitution The only ciphers you Ciphers can break! (replace words) (replace letters) Selected world events affected by cryptology Mary, Queen of Scots 1586 - trial of Mary Queen of Scots - substitution cipher • Scottish Queen, a cousin of Elisabeth I of England • Forced to flee Scotland by uprising against 1917 - Zimmermann telegram, America enters World War I her and her husband • Treated as a candidate to the throne of England by many British Catholics unhappy about 1939-1945 Battle of England, Battle of Atlantic, D-day - a reign of Elisabeth I, a Protestant ENIGMA machine cipher • Imprisoned by Elisabeth for 19 years • Involved in several plots to assassinate Elisabeth 1944 – world’s first computer, Colossus - • Put on trial for treason by a court of about German Lorenz machine cipher 40 noblemen, including Catholics, after being implicated in the Babington Plot by her own 1950s – operation Venona – breaking ciphers of soviet spies letters sent from prison to her co-conspirators stealing secrets of the U.S. atomic bomb in the encrypted form – one-time pad 1 Mary, Queen of Scots – cont.
    [Show full text]
  • The Mathemathics of Secrets.Pdf
    THE MATHEMATICS OF SECRETS THE MATHEMATICS OF SECRETS CRYPTOGRAPHY FROM CAESAR CIPHERS TO DIGITAL ENCRYPTION JOSHUA HOLDEN PRINCETON UNIVERSITY PRESS PRINCETON AND OXFORD Copyright c 2017 by Princeton University Press Published by Princeton University Press, 41 William Street, Princeton, New Jersey 08540 In the United Kingdom: Princeton University Press, 6 Oxford Street, Woodstock, Oxfordshire OX20 1TR press.princeton.edu Jacket image courtesy of Shutterstock; design by Lorraine Betz Doneker All Rights Reserved Library of Congress Cataloging-in-Publication Data Names: Holden, Joshua, 1970– author. Title: The mathematics of secrets : cryptography from Caesar ciphers to digital encryption / Joshua Holden. Description: Princeton : Princeton University Press, [2017] | Includes bibliographical references and index. Identifiers: LCCN 2016014840 | ISBN 9780691141756 (hardcover : alk. paper) Subjects: LCSH: Cryptography—Mathematics. | Ciphers. | Computer security. Classification: LCC Z103 .H664 2017 | DDC 005.8/2—dc23 LC record available at https://lccn.loc.gov/2016014840 British Library Cataloging-in-Publication Data is available This book has been composed in Linux Libertine Printed on acid-free paper. ∞ Printed in the United States of America 13579108642 To Lana and Richard for their love and support CONTENTS Preface xi Acknowledgments xiii Introduction to Ciphers and Substitution 1 1.1 Alice and Bob and Carl and Julius: Terminology and Caesar Cipher 1 1.2 The Key to the Matter: Generalizing the Caesar Cipher 4 1.3 Multiplicative Ciphers 6
    [Show full text]
  • Index-Of-Coincidence.Pdf
    The Index of Coincidence William F. Friedman in the 1930s developed the index of coincidence. For a given text X, where X is the sequence of letters x1x2…xn, the index of coincidence IC(X) is defined to be the probability that two randomly selected letters in the ciphertext represent, the same plaintext symbol. For a given ciphertext of length n, let n0, n1, …, n25 be the respective letter counts of A, B, C, . , Z in the ciphertext. Then, the index of coincidence can be computed as 25 ni (ni −1) IC = ∑ i=0 n(n −1) We can also calculate this index for any language source. For some source of letters, let p be the probability of occurrence of the letter a, p be the probability of occurrence of a € b the letter b, and so on. Then the index of coincidence for this source is 25 2 Isource = pa pa + pb pb +…+ pz pz = ∑ pi i=0 We can interpret the index of coincidence as the probability of randomly selecting two identical letters from the source. To see why the index of coincidence gives us useful information, first€ note that the empirical probability of randomly selecting two identical letters from a large English plaintext is approximately 0.065. This implies that an (English) ciphertext having an index of coincidence I of approximately 0.065 is probably associated with a mono-alphabetic substitution cipher, since this statistic will not change if the letters are simply relabeled (which is the effect of encrypting with a simple substitution). The longer and more random a Vigenere cipher keyword is, the more evenly the letters are distributed throughout the ciphertext.
    [Show full text]
  • A Hybrid Cryptosystem Based on Vigenère Cipher and Columnar Transposition Cipher
    International Journal of Advanced Technology & Engineering Research (IJATER) www.ijater.com A HYBRID CRYPTOSYSTEM BASED ON VIGENÈRE CIPHER AND COLUMNAR TRANSPOSITION CIPHER Quist-Aphetsi Kester, MIEEE, Lecturer Faculty of Informatics, Ghana Technology University College, PMB 100 Accra North, Ghana Phone Contact +233 209822141 Email: [email protected] / [email protected] graphy that use the same cryptographic keys for both en- Abstract cryption of plaintext and decryption of cipher text. The keys may be identical or there may be a simple transformation to Privacy is one of the key issues addressed by information go between the two keys. The keys, in practice, represent a Security. Through cryptographic encryption methods, one shared secret between two or more parties that can be used can prevent a third party from understanding transmitted raw to maintain a private information link [5]. This requirement data over unsecured channel during signal transmission. The that both parties have access to the secret key is one of the cryptographic methods for enhancing the security of digital main drawbacks of symmetric key encryption, in compari- contents have gained high significance in the current era. son to public-key encryption. Typical examples symmetric Breach of security and misuse of confidential information algorithms are Advanced Encryption Standard (AES), Blow- that has been intercepted by unauthorized parties are key fish, Tripple Data Encryption Standard (3DES) and Serpent problems that information security tries to solve. [6]. This paper sets out to contribute to the general body of Asymmetric or Public key encryption on the other hand is an knowledge in the area of classical cryptography by develop- encryption method where a message encrypted with a reci- ing a new hybrid way of encryption of plaintext.
    [Show full text]
  • Secure Communications One Time Pad Cipher
    Cipher Machines & Cryptology © 2010 – D. Rijmenants http://users.telenet.be/d.rijmenants THE COMPLETE GUIDE TO SECURE COMMUNICATIONS WITH THE ONE TIME PAD CIPHER DIRK RIJMENANTS Abstract : This paper provides standard instructions on how to protect short text messages with one-time pad encryption. The encryption is performed with nothing more than a pencil and paper, but provides absolute message security. If properly applied, it is mathematically impossible for any eavesdropper to decrypt or break the message without the proper key. Keywords : cryptography, one-time pad, encryption, message security, conversion table, steganography, codebook, message verification code, covert communications, Jargon code, Morse cut numbers. version 012-2011 1 Contents Section Page I. Introduction 2 II. The One-time Pad 3 III. Message Preparation 4 IV. Encryption 5 V. Decryption 6 VI. The Optional Codebook 7 VII. Security Rules and Advice 8 VIII. Appendices 17 I. Introduction One-time pad encryption is a basic yet solid method to protect short text messages. This paper explains how to use one-time pads, how to set up secure one-time pad communications and how to deal with its various security issues. It is easy to learn to work with one-time pads, the system is transparent, and you do not need special equipment or any knowledge about cryptographic techniques or math. If properly used, the system provides truly unbreakable encryption and it will be impossible for any eavesdropper to decrypt or break one-time pad encrypted message by any type of cryptanalytic attack without the proper key, even with infinite computational power (see section VII.b) However, to ensure the security of the message, it is of paramount importance to carefully read and strictly follow the security rules and advice (see section VII).
    [Show full text]
  • The Enigma History and Mathematics
    The Enigma History and Mathematics by Stephanie Faint A thesis presented to the University of Waterloo in fulfilment of the thesis requirement for the degree of Master of Mathematics m Pure Mathematics Waterloo, Ontario, Canada, 1999 @Stephanie Faint 1999 I hereby declare that I am the sole author of this thesis. I authorize the University of Waterloo to lend this thesis to other institutions or individuals for the purpose of scholarly research. I further authorize the University of Waterloo to reproduce this thesis by pho­ tocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research. 11 The University of Waterloo requires the signatures of all persons using or pho­ tocopying this thesis. Please sign below, and give address and date. ill Abstract In this thesis we look at 'the solution to the German code machine, the Enigma machine. This solution was originally found by Polish cryptologists. We look at the solution from a historical perspective, but most importantly, from a mathematical point of view. Although there are no complete records of the Polish solution, we try to reconstruct what was done, sometimes filling in blanks, and sometimes finding a more mathematical way than was originally found. We also look at whether the solution would have been possible without the help of information obtained from a German spy. IV Acknowledgements I would like to thank all of the people who helped me write this thesis, and who encouraged me to keep going with it. In particular, I would like to thank my friends and fellow grad students for their support, especially Nico Spronk and Philippe Larocque for their help with latex.
    [Show full text]
  • Shift Cipher Substitution Cipher Vigenère Cipher Hill Cipher
    Lecture 2 Classical Cryptosystems Shift cipher Substitution cipher Vigenère cipher Hill cipher 1 Shift Cipher • A Substitution Cipher • The Key Space: – [0 … 25] • Encryption given a key K: – each letter in the plaintext P is replaced with the K’th letter following the corresponding number ( shift right ) • Decryption given K: – shift left • History: K = 3, Caesar’s cipher 2 Shift Cipher • Formally: • Let P=C= K=Z 26 For 0≤K≤25 ek(x) = x+K mod 26 and dk(y) = y-K mod 26 ʚͬ, ͭ ∈ ͔ͦͪ ʛ 3 Shift Cipher: An Example ABCDEFGHIJKLMNOPQRSTUVWXYZ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 • P = CRYPTOGRAPHYISFUN Note that punctuation is often • K = 11 eliminated • C = NCJAVZRCLASJTDQFY • C → 2; 2+11 mod 26 = 13 → N • R → 17; 17+11 mod 26 = 2 → C • … • N → 13; 13+11 mod 26 = 24 → Y 4 Shift Cipher: Cryptanalysis • Can an attacker find K? – YES: exhaustive search, key space is small (<= 26 possible keys). – Once K is found, very easy to decrypt Exercise 1: decrypt the following ciphertext hphtwwxppelextoytrse Exercise 2: decrypt the following ciphertext jbcrclqrwcrvnbjenbwrwn VERY useful MATLAB functions can be found here: http://www2.math.umd.edu/~lcw/MatlabCode/ 5 General Mono-alphabetical Substitution Cipher • The key space: all possible permutations of Σ = {A, B, C, …, Z} • Encryption, given a key (permutation) π: – each letter X in the plaintext P is replaced with π(X) • Decryption, given a key π: – each letter Y in the ciphertext C is replaced with π-1(Y) • Example ABCDEFGHIJKLMNOPQRSTUVWXYZ πBADCZHWYGOQXSVTRNMSKJI PEFU • BECAUSE AZDBJSZ 6 Strength of the General Substitution Cipher • Exhaustive search is now infeasible – key space size is 26! ≈ 4*10 26 • Dominates the art of secret writing throughout the first millennium A.D.
    [Show full text]
  • Vigen`Ere Cipher – Friedman's Attack
    Statistics 116 - Fall 2002 Breaking the Code Lecture # 5 – Vigen`ere Cipher – Friedman’s Attack Jonathan Taylor 1 Index of Coincidence In this lecture, we will discuss a more “automated” approach to guessing the keyword length in the Vigen`erecipher. The procedure is based on what we will call the index of coincidence of two strings of text S1 and S2 of a given length n. The index of coincidence is just the proportion of positions in the strings where the corresponding characters of the two strings match, and is defined for the strings S1 = (s11, s12, . , s1n) and S2 = (s21, s22, . , s2n) n 1 X I(S ,S ) = δ(s , s ) 1 2 n 1i 2i i=1 where the function ( 1 if x = y δ(x, y) = 0 otherwise. Now, in our situation we have to imagine that the string we observe is “random”. Suppose then, that S1 and S2 are two random strings generated by choosing letters at random with the frequencies (p0, . , p25) and (q0, . , q25), respectively. If the strings are random we can talk about the “average” index of coincidence of the two strings S1 and S2. In statistics, if we have a random quantity X we write E(X) for the “expectation” or “average” of X. It is not difficult to show that if S1 is a string of random letters with frequencies (p0, . , p25) and (q0, . , q25) then the average index of coincidence of these two strings is given by 25 X piqi. (1) i=0 1 This is because, for any of the letters in the string, the probability that the two characters match is the probability that they are both A’s, both B’s, .
    [Show full text]
  • Loads of Codes – Cryptography Activities for the Classroom
    Loads of Codes – Cryptography Activities for the Classroom Paul Kelley Anoka High School Anoka, Minnesota In the next 90 minutes, we’ll look at cryptosystems: Caesar cipher St. Cyr cipher Tie-ins with algebra Frequency distribution Vigenere cipher Cryptosystem – an algorithm (or series of algorithms) needed to implement encryption and decryption. For our purposes, the words encrypt and encipher will be used interchangeably, as will decrypt and decipher. The idea behind all this is that you want some message to get somewhere in a secure fashion, without being intercepted by “the bad guys.” Code – a substitution at the level of words or phrases Cipher – a substitution at the level of letters or symbols However, I think “Loads of Codes” sounds much cooler than “Loads of Ciphers.” Blackmail = King = Today = Capture = Prince = Tonight = Protect = Minister = Tomorrow = Capture King Tomorrow Plaintext: the letter before encryption Ciphertext: the letter after encryption Rail Fence Cipher – an example of a “transposition cipher,” one which doesn’t change any letters when enciphered. Example: Encipher “DO NOT DELAY IN ESCAPING,” using a rail fence cipher. You would send: DNTEAIECPN OODLYNSAIG Null cipher – not the entire message is meaningful. My aunt is not supposed to read every epistle tonight. BXMT SSESSBW POE ILTWQS RIA QBTNMAAD OPMNIKQT RMI MNDLJ ALNN BRIGH PIG ORHD LLTYQ BXMT SSESSBW POE ILTWQS RIA QBTNMAAD OPMNIKQT RMI MNDLJ ALNN BRIGH PIG ORHD LLTYQ Anagram – use the letters of one word, phrase or sentence to form a different one. Example: “Meet behind the castle” becomes “These belched a mitten.” Substitution cipher – one in which the letters change during encryption.
    [Show full text]
  • Classic Crypto
    Classic Crypto Classic Crypto 1 Overview We briefly consider the following classic (pen and paper) ciphers o Transposition ciphers o Substitution ciphers o One-time pad o Codebook These were all chosen for a reason o We see same principles in modern ciphers Classic Crypto 2 Transposition Ciphers In transposition ciphers, we transpose (scramble) the plaintext letters o The scrambled text is the ciphertext o The transposition is the key Corresponds to Shannon’s principle of diffusion (more about this later) o This idea is widely used in modern ciphers Classic Crypto 3 Scytale Spartans, circa 500 BC Wind strip of leather around a rod Write message across the rod T H E T I M E H A S C O M E T H E W A L R U S S A I D T O T A L K O F M A N Y T H I N G S When unwrapped, letters are scrambled TSATAHCLONEORTYTMUATIESLHMTS… Classic Crypto 4 Scytale Suppose Alice and Bob use Scytale to encrypt a message o What is the key? o How hard is it for Trudy to break without key? Suppose many different rod diameters are available to Alice and Bob… o How hard is it for Trudy to break a message? o Can Trudy attack messages automatically—without manually examining each putative decrypt? Classic Crypto 5 Columnar Transposition Put plaintext into rows of matrix then read ciphertext out of columns For example, suppose matrix is 3 x 4 o Plaintext: SEETHELIGHT o Ciphertext: SHGEEHELTTIX Same effect as Scytale o What is the key? Classic Crypto 6 Keyword Columnar Transposition For example o Plaintext: CRYPTOISFUN o Matrix 3 x 4 and keyword MATH o Ciphertext:
    [Show full text]