In this section we learn about the basics of cryptography.

Reference: [Mark Stamp] Chapter 2

46 You can use a password to protect your computer; however, once the data leaves your computer and travels over a public data network (such as the Internet), what can you do to protect it?

Answer: data security by cryptography

Data sent over public channels can be intercepted and read by others. Encrypting the data keeps it confidential even if it is read while in transit on a public channel.

Data sent over public channels may also be modified in transit by an attacker or corrupted accidentally. Encrypted data can be modified or corrupted too, so encryption alone does not necessarily solve that problem.

47 The objectives of this section are to learn the concepts of data encryption and decryption, and the relationship between key size and data security.

48 49 The figure shows encryption. Encryption is the process of converting plaintext to ciphertext. Decryption is the opposite: conversion of the ciphertext back to plaintext. A cryptographic system may use keys (strings of bits) for encryption and decryption. In practice, the encryption and decryption methods should be public. The decryption key is kept confidential to ensure data security.

50 Cryptography may be used to set up a secure communication system. In symmetric key cryptography, a single shared key is used to both encrypt and decrypt the message, as shown in the above diagram. For security reasons, the key should be a secret known only to the sender and receiver (i.e., the people who are authorized to read the message).

Problem: how can the sender and receiver agree on a key securely over the Internet? They need to share a key BEFORE they can encrypt messages to each other, but if one of them sends the key to the other over public channels, it is vulnerable to interception. Later, we will see some clever strategies that allow them to share a key securely.

51 That system is public key cryptography, i.e., the use of two keys: one for encryption and one for decryption. The encryption key can be made public. So long as the decryption key is kept private, the information remains private. This is the basis of the public key system.

52 In public key cryptography, every user has two keys. One key is public. The other key is kept private; only the owner can access it. Given the public key, it is not computationally feasible to compute the corresponding private key.

For confidential communication, the receiver’s public key is used for encryption and the receiver’s private key is used for decryption.

Since no one other than its owner has the private key, no one else can decrypt the message.

53 For integrity in communication, the sender’s private key is used for encryption and the sender’s public key is used for decryption. The encrypted message is NOT private since anyone can use the sender’s public key to decrypt it. However, only the sender could have encrypted that message, so the fact that the ciphertext decrypts with some particular user’s public key is proof that that message originally came from that user.

A message encrypted with a private key is considered to be “digitally signed” by that private key’s owner. Digital signatures are important for many online business applications.

Note that public key cryptography does not completely eliminate the key distribution problem. The public key has to be distributed through a reliable channel, or we must otherwise make sure that we are using the right public key for encryption. Also, the private key has to be protected from disclosure or tampering.

54 55 The first step is to decide on the encryption approach. Part of the system may be made public knowledge, especially if widespread adoption of the system is desirable.

56 Why should anyone trust a cryptosystem that is proprietary/classified/non-public? Attempts to replace open systems with “black box” systems (e.g., the Clipper Chip) failed for two reasons: 1) the system’s designers could not publish their security analysis (to convince others), and 2) security experts could not do their own security analysis. Some also worry that the designer of a black box cryptosystem might build a “backdoor” into the system for their own use. In short: those in charge of securing data assets could not trust black box systems. The system MUST be public for security experts to trust it to secure assets under their protection. In modern practice, the design of the system is public. It is assumed that an enemy can and will uncover the details of any encryption system. The security of the system depends only on the key, so keeping the key safe is an important requirement for data security.

57 The above six principles remain relevant today.

58 Users want privacy. However, they often find it hard to use cryptographic tools because they don’t understand how the tools work, and they don’t know what should and should not be done to ensure security. Administrators, on the other hand, are trying hard to make sure all sensitive data is encrypted. Trudy wants to find out the type of encryption used first, so that she can try to break it.

Active Learning Task: With your partner, analyze the use of cryptography for data security from these three different perspectives.

59 A shift cipher is an encryption system in which each character in the input text is shifted by a numerical key. The key in the above example is 2. There are other types of shift as well: you can vary the shift for each character position, e.g., the first character shifted by 2, the second by -3, the third by 5, and so on.

60 For a single shift cipher, there can be as many keys as there are characters – so, the key can be between 0 and 255, assuming an extended ASCII character set of 256 letters, symbols, and special characters.
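A single shift cipher over a 256-character set can be sketched in a few lines of Java. This is only an illustration: the class and method names (ShiftCipher, shift, encrypt, decrypt) are made up for this example, and the code assumes every input character has a code between 0 and 255.

```java
final class ShiftCipher {
    // Shifts every character code by `key`, wrapping around modulo 256.
    // Assumption: each character code in `text` lies in the range 0..255.
    static String shift(String text, int key) {
        StringBuilder out = new StringBuilder(text.length());
        for (char ch : text.toCharArray()) {
            out.append((char) Math.floorMod(ch + key, 256));
        }
        return out.toString();
    }

    static String encrypt(String plaintext, int key) {
        return shift(plaintext, key);
    }

    // Decryption is a shift by the same amount in the opposite direction.
    static String decrypt(String ciphertext, int key) {
        return shift(ciphertext, -key);
    }
}
```

With a key of 2, encrypt("abc", 2) yields "cde", and decrypt with the same key restores "abc".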

61 Simple substitution is similar to a shift cipher. Each letter is substituted by a letter or symbol from the substitution table. Thus, the plaintext “CAB” is encrypted to “(@!”. Try to decrypt the above with the help of the substitution table.
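A table lookup of this kind might be sketched as follows. The table here is hypothetical and contains only the three pairs that the “CAB” example implies (C to “(”, A to “@”, B to “!”); the full table on the slide is not reproduced, and all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

final class SimpleSubstitution {
    // Hypothetical three-entry table: only the pairs implied by the
    // "CAB" -> "(@!" example. A real table covers the whole alphabet.
    static Map<Character, Character> demoTable() {
        Map<Character, Character> table = new HashMap<>();
        table.put('C', '(');
        table.put('A', '@');
        table.put('B', '!');
        return table;
    }

    // Decryption uses the same table read in the other direction.
    static Map<Character, Character> invert(Map<Character, Character> table) {
        Map<Character, Character> inverse = new HashMap<>();
        for (Map.Entry<Character, Character> e : table.entrySet()) {
            inverse.put(e.getValue(), e.getKey());
        }
        return inverse;
    }

    // Replaces each character by its table entry; fails on unknown characters.
    static String apply(String text, Map<Character, Character> table) {
        StringBuilder out = new StringBuilder(text.length());
        for (char ch : text.toCharArray()) {
            Character mapped = table.get(ch);
            if (mapped == null) {
                throw new IllegalArgumentException("No table entry for '" + ch + "'");
            }
            out.append(mapped.charValue());
        }
        return out.toString();
    }
}
```

Encrypting is apply(text, table); decrypting is apply(text, invert(table)).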

62 The brute-force method involves trying to break a cipher simply by trying different keys to see if they work.

How many trials does an attacker need to guess the substitution table correctly? Assume that there are n possible characters. The first letter may be substituted with any of the n characters. The second letter may be substituted with one of the other n-1 characters, and so on. So the total number of possibilities is n * (n-1) * (n-2) * ... * 2 * 1 = n!

If n is 256, find the number of possibilities. Assuming each combination takes one millisecond to try out and verify, how long will it take on average to uncover the message?
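Numbers of this size overflow ordinary integer types, so one way to explore the question is with BigInteger. The sketch below is illustrative only: the names are invented, and it assumes that on average about half of the key space must be searched and that a year has 365 days.

```java
import java.math.BigInteger;

final class KeySpace {
    // n! = n * (n-1) * ... * 2 * 1, computed exactly.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Average brute-force time in whole years.
    // Assumptions: half of the n! tables are tried on average,
    // and each trial takes `millisPerTrial` milliseconds.
    static BigInteger averageYears(int n, long millisPerTrial) {
        BigInteger averageTrials = factorial(n).divide(BigInteger.valueOf(2));
        BigInteger totalMillis = averageTrials.multiply(BigInteger.valueOf(millisPerTrial));
        BigInteger millisPerYear = BigInteger.valueOf(1000L * 60 * 60 * 24 * 365);
        return totalMillis.divide(millisPerYear);
    }
}
```

Printing KeySpace.factorial(256) shows a number with more than 500 decimal digits, which is why brute force against the substitution table itself is hopeless.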

63 Not all letters in English text are equally likely; some letters appear more often than others. For example, “e” is the most commonly found letter. Since the plaintext statistics are reflected in the simple substitution ciphertext as well, given enough ciphertext it is straightforward to find the one-to-one correspondence between plaintext and ciphertext characters.
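Counting letters is easy to automate. The following sketch tallies the letters a to z and, under the extra assumption that the text was encrypted with a single shift over the 26-letter alphabet and that “e” is its most common plaintext letter, guesses the shift. The names are illustrative, and on short texts the guess will often be wrong.

```java
import java.util.Locale;

final class LetterFrequency {
    // Counts occurrences of a..z, ignoring case and all other characters.
    static int[] count(String text) {
        int[] counts = new int[26];
        for (char ch : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
            }
        }
        return counts;
    }

    // Returns the most frequent letter (the earliest one on ties).
    static char mostFrequent(String text) {
        int[] counts = count(text);
        int best = 0;
        for (int i = 1; i < 26; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return (char) ('a' + best);
    }

    // Assumption: a 26-letter shift cipher whose most common
    // plaintext letter is 'e'. Returns the guessed shift, 0..25.
    static int guessShift(String ciphertext) {
        return Math.floorMod(mostFrequent(ciphertext) - 'e', 26);
    }
}
```

For the ciphertext "oggv og jgtg" the most frequent letter is "g", so the guessed shift is 2.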

64 The above text is probably not long enough to perform frequency analysis unless we get lucky. Given that it is a shift cipher, things may get a little easier. If no information is given, one can only try out different possibilities. You may use an online tool (e.g., http://www.cs.uri.edu/cryptography/classicalshiftdemo.htm).

65 Each word in the input message is encrypted with the help of a codebook. In this example, “Nuke Device Ready” is converted to “Cat Food Empty”. From the table, it is clear that Nuke = Cat, Device = Food, and so on.

66 Transposition ciphers (or permutation ciphers) are like jumbles. There is no substitution here; instead, the letters are permuted, as can be seen from the above example. The key (3, 1, 2, etc.) tells which position each letter goes to: C goes to the third position, A goes to the first position, and so on.

To break a permutation cipher without the key, you would need to try out different possibilities. One way would be to write a program that rearranges the characters to see whether the result makes sense. With a short ciphertext, someone can intuitively try to rearrange the letters to uncover the message.
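The rule “the i-th letter goes to position key[i]” can be written down directly. The sketch below handles a single block whose length equals the key length; the names are illustrative, and the 1-based key convention is an assumption taken from the description above.

```java
final class Transposition {
    // The letter at index i moves to position key[i] (1-based).
    // Assumption: block.length() equals key.length and key is a
    // permutation of 1..key.length.
    static String encrypt(String block, int[] key) {
        char[] out = new char[block.length()];
        for (int i = 0; i < key.length; i++) {
            out[key[i] - 1] = block.charAt(i);
        }
        return new String(out);
    }

    // Reverses the move: the letter now at position key[i] returns to index i.
    static String decrypt(String block, int[] key) {
        char[] out = new char[block.length()];
        for (int i = 0; i < key.length; i++) {
            out[i] = block.charAt(key[i] - 1);
        }
        return new String(out);
    }
}
```

With key {3, 1, 2}, the block "CAB" becomes "ABC": C moves to the third position, A to the first, and B to the second.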

67 In double transposition ciphers, a message is entered into a grid. Spaces are often removed or replaced with random characters.

Both row transposition and column transposition are done. The first row becomes the second row, the second row becomes the first, the third row becomes the fourth, and the fourth row becomes the third (key: 2, 1, 4, 3). For the column transposition, the first column becomes the fifth, the second becomes the first, the third remains the same, the fourth becomes the second, and the fifth becomes the fourth (key: 5, 1, 3, 2, 4). Some double transposition ciphers are quite difficult to break for a large body of text.

68 Even though the plaintext characters are not disguised, they are rearranged, which diffuses the plaintext characteristics (e.g., by breaking up common two-letter digrams like th and er). Column and row transposition make it harder to identify the words. It is a non-trivial cipher, especially for a large body of text.

What is the total number of possible keys? For an m x n grid of text (m rows, n columns), it is m! * n!.
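The row and column moves described for the 4 x 5 grid can be combined in one pass. In this sketch the grid is stored row by row in a string, and the names and layout are assumptions made for illustration; the text must contain exactly rows x columns characters.

```java
final class DoubleTransposition {
    // Row r moves to row rowKey[r] and column c moves to column colKey[c]
    // (both keys are 1-based permutations). The grid is read row by row.
    // Assumption: text.length() == rowKey.length * colKey.length.
    static String encrypt(String text, int[] rowKey, int[] colKey) {
        int cols = colKey.length;
        char[] out = new char[text.length()];
        for (int r = 0; r < rowKey.length; r++) {
            for (int c = 0; c < cols; c++) {
                out[(rowKey[r] - 1) * cols + (colKey[c] - 1)] = text.charAt(r * cols + c);
            }
        }
        return new String(out);
    }

    // Moves every character back to its original row and column.
    static String decrypt(String text, int[] rowKey, int[] colKey) {
        int cols = colKey.length;
        char[] out = new char[text.length()];
        for (int r = 0; r < rowKey.length; r++) {
            for (int c = 0; c < cols; c++) {
                out[r * cols + c] = text.charAt((rowKey[r] - 1) * cols + (colKey[c] - 1));
            }
        }
        return new String(out);
    }
}
```

With row key {2, 1, 4, 3} and column key {5, 1, 3, 2, 4}, the 20 letters A to T are rearranged to "GIHJFBDCEAQSRTPLNMOK". A 4 x 5 grid has 4! * 5! = 24 * 120 = 2880 possible key pairs.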

69 Remember, all characters are stored in a computer system as 1s and 0s. So, the plaintext “Hello world” is encoded using the above table, giving a binary string that begins “01000 00101 01100 01100 01111” (h = 8, e = 5, l = 12, etc.). The key is randomly chosen and is the bit stream “10010 00……”. Each plaintext message bit is XORed with the corresponding key bit, resulting in the ciphertext string “zddsb …”. In a one-time pad, the random key must be as long as the message and each key must be used one time only.

XOR is its own inverse, so if we encrypt M by computing M XOR K = C, we can decrypt C by computing C XOR K = M.
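The self-inverse property can be checked with 5-bit letter codes. This sketch assumes the encoding a = 1, b = 2, ..., z = 26 mentioned above; the class name, method names, and any key values beyond the first group quoted in the notes are made up for illustration.

```java
final class FiveBitXor {
    // Encodes letters as 5-bit values with a = 1, ..., z = 26.
    static int[] encode(String letters) {
        int[] codes = new int[letters.length()];
        for (int i = 0; i < codes.length; i++) {
            char ch = Character.toLowerCase(letters.charAt(i));
            if (ch < 'a' || ch > 'z') {
                throw new IllegalArgumentException("Only the letters a-z are supported");
            }
            codes[i] = ch - 'a' + 1;
        }
        return codes;
    }

    // XORs each 5-bit value with the matching key value.
    // The same call both encrypts and decrypts.
    static int[] xor(int[] data, int[] key) {
        if (key.length < data.length) {
            throw new IllegalArgumentException("Key must be at least as long as the data");
        }
        int[] out = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (data[i] ^ key[i]) & 0b11111;
        }
        return out;
    }
}
```

For example, h = 01000 XORed with the key group 10010 gives 11010 = 26, the code for z; XORing 11010 with 10010 again returns 01000.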

70 It is often very hard to generate truly random keys. In many cases encryption software uses flawed random number generators, and if the numbers are predictable, the key is weak. Also, pseudorandom number generators eventually repeat their sequence of numbers; the user only decides the starting point of the sequence (the seed).

71 Confusion obscures the relationship between plaintext and ciphertext. A shift cipher offers poor confusion (and poor diffusion, as can be seen in the next slide), whereas some substitution ciphers can offer good confusion. A transposition cipher offers little confusion (especially with a large block of text).

72 Diffusion removes the statistical properties of the plaintext from the ciphertext. A simple substitution cipher offers no diffusion. A transposition cipher offers good diffusion, but no confusion.

73 That is, the fact that ciphertext is given does not make it any easier (or harder) to find the plaintext.

74 Assume some 3 letter plaintext was encoded using the above table (where a=1,b=2,c=3,…) then XORed with a randomly chosen key, resulting in the binary ciphertext 00011 00001 10100. If ciphertext message bits are XORed with that key’s bits, we should get the plaintext message back. An attacker who doesn’t know the key must try every possible key, trying to find one that decodes to a sensible message.

Unfortunately for the attacker, with a one-time pad (OTP) there is a key that decodes that C into any possible 3-letter M, so the attacker cannot tell which key is the correct one!
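This can be demonstrated with the ciphertext 00011 00001 10100 from the slide. For any candidate message M', the key K' = C XOR M' decodes C to M'. The sketch assumes the encoding a = 1, ..., z = 26; the candidate words and all names are chosen only for illustration.

```java
final class PadAmbiguity {
    // Encodes lowercase letters as a = 1, ..., z = 26.
    static int[] encode(String letters) {
        int[] codes = new int[letters.length()];
        for (int i = 0; i < codes.length; i++) {
            codes[i] = letters.charAt(i) - 'a' + 1;
        }
        return codes;
    }

    // Decodes 1..26 back to letters; other 5-bit values are shown as '?'.
    static String decode(int[] codes) {
        StringBuilder out = new StringBuilder();
        for (int code : codes) {
            out.append(code >= 1 && code <= 26 ? (char) ('a' + code - 1) : '?');
        }
        return out.toString();
    }

    static int[] xor(int[] a, int[] b) {
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] ^ b[i];
        }
        return out;
    }

    // The key that would turn `cipher` into `candidate`: K' = C XOR M'.
    static int[] keyFor(int[] cipher, String candidate) {
        return xor(cipher, encode(candidate));
    }
}
```

With C = {00011, 00001, 10100}, keyFor(C, "dog") is {00111, 01110, 10011}, and decoding C with that key gives "dog". The same works for "yes" or any other three-letter word, so every candidate is equally consistent with the ciphertext.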

75 Regardless of the input message, the probability of getting a particular ciphertext is the same.

76 In a stream cipher, a fixed-length key is used to generate a key stream used for encryption. The fixed-length key is used repeatedly, as often as necessary, to get a key stream as long as the message. Each bit in the message stream and the corresponding bit in the key stream are XORed together to produce the ciphertext.
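The repeated-key scheme described above can be sketched by indexing the key modulo its length. This is a teaching illustration with invented names, not a secure cipher: a short key that repeats makes the key stream predictable.

```java
final class RepeatingKeyXor {
    // XORs each data byte with the key byte at the same position,
    // restarting from the beginning of the key whenever it runs out.
    // The same call both encrypts and decrypts.
    static byte[] apply(byte[] data, byte[] key) {
        if (key.length == 0) {
            throw new IllegalArgumentException("Key must not be empty");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key[i % key.length]);
        }
        return out;
    }
}
```

Applying the method twice with the same key returns the original bytes, because XOR is its own inverse.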

77 Block ciphers can provide both “confusion” and “diffusion”.

78 We will learn more about DES and AES later.

79 Users find it hard to understand public key cryptography, the difference between public and private keys, etc. PGP is a tool that users can use to perform encryption and decryption. PGP has a user interface, but as can be seen from the paper “Why Johnny Can’t Encrypt”, the software is not easy for users to use. Usability problems can affect the ways in which users use the software, or users may decide not to use encryption at all. This affects security, so we should pay attention to it. Anyone designing a user interface for security should be aware of the properties described above.

80 81 In many tools (e.g., GPG), passwords are used to protect private keys. This ensures that the private key cannot be used without the user’s approval: before the private key is used, the user has to authorize its use by entering the key password. If this password is weak, it may be cracked easily. Even if the cryptography is strong, security is only as strong as the weakest link.

82 (Please see the assignment description for more information; this assignment may be done in C as well)

83 Bulletproof your program so that it can handle any input. Your program should be as usable as possible. Use an appropriate data structure for storing information. Flawed random number generators can be a security threat, so use a secure random number generator. Reduce data lifetime by destroying information right after it is used.

Use structured programming and provide good internal and external documentation.

84 Import the Scanner class for reading input from the keyboard; import the SecureRandom class for setting up the random number generator for the key stream.

85 Create an instance of the Scanner class and initialize it to read from System.in (normally the keyboard). Read a line from the keyboard and store it in the String variable s. Use s to initialize a BigInteger variable called message.

86 Create a random object that uses “SHA1PRNG” algorithm (provided by Java library) for . Set a , in this example, 10 for the random number generator.

87 How can we fix above to ensure that key length is long enough for security?
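One possible direction, offered only as a sketch and not as the assignment’s reference solution, is to tie the key length to the message length instead of fixing it in advance. The code below assumes the message is converted to a BigInteger from its UTF-8 bytes and that the key is drawn as a random BigInteger with one bit per message bit; the class and method names are invented, and reading the line with Scanner is left out so the example stays testable.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

final class PadSketch {
    // Interprets the text's bytes as a non-negative integer.
    static BigInteger toMessage(String s) {
        return new BigInteger(1, s.getBytes(StandardCharsets.UTF_8));
    }

    // Draws a random key of `bits` bits from the SHA1PRNG generator.
    static BigInteger newKey(int bits) {
        try {
            SecureRandom random = SecureRandom.getInstance("SHA1PRNG");
            return new BigInteger(bits, random);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA1PRNG is not available", e);
        }
    }

    // Converts a decrypted integer back to text, dropping the
    // sign byte that toByteArray() may add in front.
    static String toText(BigInteger value) {
        byte[] bytes = value.toByteArray();
        int from = (bytes.length > 1 && bytes[0] == 0) ? 1 : 0;
        return new String(bytes, from, bytes.length - from, StandardCharsets.UTF_8);
    }
}
```

A caller could compute the key length as 8 times the number of message bytes, encrypt with message.xor(key), and decrypt with cipher.xor(key). As with any one-time pad, the key must never be reused for a second message.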

88 In the above example, the variable password retains its value until the very end of the program. This is unnecessary, and it gives someone room to steal the password from memory. (More about passwords later in Section 3.)

89 The line password=null indicates that the password data is no longer needed. Calling System.gc() afterwards will hopefully let the garbage collector free it, although there is no guarantee. If you are using the C programming language, you have better control over this issue.
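A common alternative in Java, which is a different technique from the password=null approach on the slide, is to hold the password in a char[] and overwrite it as soon as it has been used. A String cannot be overwritten because it is immutable. The sketch below uses invented names, and it only narrows the window of exposure; it does not guarantee that no other copy of the data exists in memory.

```java
import java.util.Arrays;

final class PasswordScrub {
    // Overwrites the secret in place so the characters no longer
    // sit in this array while it waits for garbage collection.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    // Compares the entered password with the expected one, then wipes
    // the entered copy even if the comparison throws.
    static boolean checkAndWipe(char[] entered, char[] expected) {
        try {
            return Arrays.equals(entered, expected);
        } finally {
            wipe(entered);
        }
    }
}
```

After checkAndWipe returns, the entered array contains only zero characters, whether or not the password matched.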

For more information, see: Jim Chow, Ben Pfaff, Tal Garfinkel, and Mendel Rosenblum, “Shredding Your Garbage: Reducing Data Lifetime Through Secure Deallocation”, 14th USENIX Security Symposium, July/August 2005.

http://stanford.edu/~blp/papers/shredding.pdf
