Part III Essential Resources

The four appendixes introduce several mathematical topics that are needed for a complete understanding of the material in the book. Of particular importance is Appendix D on finite fields. This topic was originally developed in the 1820s by Evariste Galois for his work on the solvability of equations, and for 150 years it was of interest to mathematicians only. Like other fields of pure mathematics, this topic has since found extensive applications in several aspects of coding theory, such as data encryption and error-correcting codes. The answers to the exercises are provided, but the reader is encouraged to give each exercise a try before peeking at the answer. The timeline may be of interest to history buffs, and a glossary always comes in handy.

And being of the council call'd 'the Privy,'
Lord Henry walk'd into his cabinet,
To furnish matter for some future Livy
To tell how he reduced the nation's debt;
And if their full contents I do not give ye,
It is because I do not know them yet;
But I shall add them in a brief appendix,
To come between mine epic and its index.
-George Byron, Don Juan (1821)

Appendix A Convolution

Convolution is an important operation that has many practical applications. It is used in Section 12.6 to hide data in audio files as an echo. The following section introduces the concept of a linear system and uses it to develop the basic one-dimensional convolution. Section A.2 shows how this useful operation can be extended to two dimensions and beyond.

A.1 One-Dimensional Convolution

We start with the simple, intuitive concept of a system. This is anything that receives input and generates output in response. The input and output can be one-dimensional (functions of time), two-dimensional (functions of two spatial variables), or can have any number of dimensions. We will be concerned with the relation of the output to the input, not with the internal operation of the system. We will also concentrate on linear systems, since they are both simple and important.

A linear system is defined as follows: If input $x_1(t)$ produces output $y_1(t)$ [we denote this by $x_1(t) \to y_1(t)$] and if $x_2(t) \to y_2(t)$, then $x_1(t) + x_2(t) \to y_1(t) + y_2(t)$. Any system that does not satisfy this condition is considered nonlinear. This definition implies that $2x_1(t) = x_1(t) + x_1(t) \to y_1(t) + y_1(t) = 2y_1(t)$ or, in general, $a x_1(t) \to a y_1(t)$ for any real $a$.

Some linear systems are shift invariant. If such a linear system satisfies $x(t) \to y(t)$, then $x(t-T) \to y(t-T)$; i.e., shifting the input by an amount $T$ shifts the output by the same amount, but does not otherwise affect the output. In the discussion of convolution, we assume that the systems in question are linear and shift-invariant. This is true (or true to a very good approximation) for electrical networks and optical systems, the main pieces of hardware used in image processing and data compression.

System. Frequently used without need. "Dayton has adopted the commission system of government" is better written "Dayton has adopted government by commission." "The dormitory system" is better written "Dormitories."
-Strunk and White, The Elements of Style (1979)

It is useful to have a general relation between the input and output of a linear, shift-invariant system. It turns out that the expression

$$y(t) = \int_{-\infty}^{+\infty} f(t,\tau)\,x(\tau)\,d\tau, \tag{A.1}$$

is general enough for this purpose. In other words, there is always a two-parameter function $f(t,\tau)$ that can be used to predict the output $y(t)$ if the input $x(\tau)$ is known for all times $\tau$. However, we want to express this relation with a one-parameter function, and we use the shift-invariance of the system for this purpose. For a linear, shift-invariant system we can write

$$y(t-T) = \int_{-\infty}^{+\infty} f(t,\tau)\,x(\tau-T)\,d\tau.$$

If we change variables by adding $T$ to both $t$ and $\tau$, we get

$$y(t) = \int_{-\infty}^{+\infty} f(t+T,\tau+T)\,x(\tau)\,d\tau. \tag{A.2}$$

Comparing Equations (A.1) and (A.2) shows that $f(t,\tau) = f(t+T,\tau+T)$. Thus, function $f$ has the property that if we add $T$ to both its parameters, it does not change. The function is constant as long as the difference between its parameters is constant. Function $f$ depends only on the difference of its parameters, so it is essentially a single-parameter function. We can therefore write $g(t-\tau) = f(t,\tau)$, which changes Equation (A.1) to

$$y(t) = \int_{-\infty}^{+\infty} g(t-\tau)\,x(\tau)\,d\tau. \tag{A.3}$$

This is the convolution integral, an important relation between x(t) and y(t) or between x(t) and g(t). This relation is denoted $y = g * x$, and it says that the output of a linear, shift-invariant system is given by the convolution of its input x with a certain function g(t) (or by convolving x with g). Function g, which is characteristic of the system, is called the impulse response of the system. Figure A.1 shows a graphical description of a convolution, where the final result (the integral) is the gray area under the curve.

The convolution operation has a number of important and useful properties. It is commutative, associative, and distributive over addition. These features are listed in Equation (A.4).

[Figure A.1 panels: input function; convolving function g(τ); g(τ) reflected; functions superimposed; product of functions.]

Figure A.1: The Convolution of x(t) and g(t).

"Oh no," George said. "It was more than money." He leaned his forehead in his hand and tried to remember what else more than money. The darkness inside his head was full of convolutions. His eardrums were too tight. Only the higher registers of sound were getting through.
-Paul Scott, The Bender (1963)

[Figure A.2 panels: input function; smooth function.]

Figure A.2: Applying Convolution to Denoising a Function.

Equation (A.4):

$$\begin{aligned}
f * g &= g * f,\\
f * (g * h) &= (f * g) * h,\\
f * (g + h) &= f * g + f * h.
\end{aligned} \tag{A.4}$$

Practical problems normally involve discrete sequences of numbers, rather than continuous functions, so the discrete convolution is useful. The discrete convolution of the two sequences f(i) and g(i) is defined as

$$h(i) = f(i) * g(i) = \sum_{j} f(j)\,g(i-j). \tag{A.5}$$

If the lengths of f(i) and g(i) are m and n, respectively, then h(i) has length m + n - 1.

Example: Given the two sequences f = (f(0), f(1), ..., f(5)) (six elements) and g = (g(0), g(1), ..., g(4)) (five elements), Equation (A.5) yields the 10 elements of the convolution h = f * g:

$$\begin{aligned}
h(0) &= \sum_{j=0}^{0} f(j)g(0-j) = f(0)g(0),\\
h(1) &= \sum_{j=0}^{1} f(j)g(1-j) = f(0)g(1) + f(1)g(0),\\
h(2) &= \sum_{j=0}^{2} f(j)g(2-j) = f(0)g(2) + f(1)g(1) + f(2)g(0),\\
h(3) &= \sum_{j=0}^{3} f(j)g(3-j) = f(0)g(3) + f(1)g(2) + f(2)g(1) + f(3)g(0),\\
h(4) &= \sum_{j=0}^{4} f(j)g(4-j) = f(0)g(4) + f(1)g(3) + f(2)g(2) + f(3)g(1) + f(4)g(0),\\
h(5) &= \sum_{j=1}^{5} f(j)g(5-j) = f(1)g(4) + f(2)g(3) + f(3)g(2) + f(4)g(1) + f(5)g(0),\\
h(6) &= \sum_{j=2}^{5} f(j)g(6-j) = f(2)g(4) + f(3)g(3) + f(4)g(2) + f(5)g(1),\\
h(7) &= \sum_{j=3}^{5} f(j)g(7-j) = f(3)g(4) + f(4)g(3) + f(5)g(2),\\
h(8) &= \sum_{j=4}^{5} f(j)g(8-j) = f(4)g(4) + f(5)g(3),\\
h(9) &= \sum_{j=5}^{5} f(j)g(9-j) = f(5)g(4).
\end{aligned}$$
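The example above can be checked with a few lines of code. The following is a minimal Python sketch of Equation (A.5); the function name `convolve` and the sample values of `f` and `g` are made up for illustration and do not come from the text.

```python
def convolve(f, g):
    """Discrete convolution h(i) = sum over j of f(j) * g(i - j), Equation (A.5)."""
    m, n = len(f), len(g)
    h = [0] * (m + n - 1)
    for i in range(m + n - 1):
        # Only indexes with 0 <= j < m and 0 <= i - j < n contribute to h(i).
        for j in range(max(0, i - n + 1), min(i, m - 1) + 1):
            h[i] += f[j] * g[i - j]
    return h


f = [1, 2, 3, 4, 5, 6]   # six illustrative values f(0)..f(5)
g = [1, 0, -1, 2, 1]     # five illustrative values g(0)..g(4)
h = convolve(f, g)

print(len(h))                # 10, i.e., m + n - 1
print(h)
print(convolve(g, f) == h)   # commutativity, the first line of Equation (A.4)
```

The loop limits reproduce the summation limits of the worked example: for h(5), for instance, j runs from 1 to 5.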

A simple example of the use of a convolution is smoothing (or denoising). This shows how convolution can be used as a filter. Given a noisy function f(t) (Figure A.2), we select a rectangular pulse as the convolving function g(t). It is defined as

$$g(t) = \begin{cases} 1, & -a/2 < t < a/2,\\ \tfrac{1}{2}, & t = \pm a/2,\\ 0, & \text{elsewhere}, \end{cases}$$

where a is a suitably small value (typically 1, but could be anything). As the convolution proceeds, the pulse is moved from left to right and is multiplied by f(t). The result of the product is a local average of f(t) over an interval of width a. This has the effect of suppressing the high-frequency fluctuations of f(t).
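A discrete version of this smoothing can be sketched as follows. The sketch assumes sampled data and a pulse that is normalized to unit area, so that each output sample is the plain average of its neighbors; the names `smooth` and `rms_error`, the test signal, and the window width of 9 are illustrative choices, not taken from the text.

```python
import math
import random


def smooth(samples, width):
    """Replace each sample by the mean of the `width` samples centered on it.

    This is a convolution with a discrete rectangular pulse of unit area.
    Near the two ends, only the samples that exist are averaged.
    """
    half = width // 2
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def rms_error(a, b):
    """Root-mean-square difference between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


random.seed(1)
clean = [math.sin(2 * math.pi * i / 100) for i in range(200)]   # slow sine wave
noisy = [c + random.uniform(-0.3, 0.3) for c in clean]          # added noise
smoothed = smooth(noisy, 9)

# The smoothed signal should be closer to the clean one than the noisy input is.
print(rms_error(noisy, clean), rms_error(smoothed, clean))
```

A wider window removes more noise but also flattens genuine rapid changes in the signal, which is why the text calls for a suitably small pulse width.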

From the Dictionary convolution: coiling together convolve: roll together.

A.2 Two-Dimensional Convolution

The basic concept of convolution can be extended to any number of dimensions. In general, an n-dimensional structure A can be convolved with a kernel K that must have the same dimensionality but is normally smaller than A. Figure A.3a illustrates the principle. The kernel (a 3 x 3 array in this example) is placed over every possible area of A (a 5 x 6 array in the figure). For each placement of the kernel, an operation similar to a dot product is performed. Each of the elements of the kernel is multiplied by the element of A located "behind" it and the partial products are added. The result becomes one element of the two-dimensional convolution of A and the kernel.

Notice that the edges of A present a problem. When the center of the kernel is located over one of the elements on an edge of A, some of the kernel elements do not cover any elements of A. The common solution is to produce a convolution that is smaller than A. If the dimensions of A are m x n and the kernel is 3 x 3, the convolution has dimensions (m - 2) x (n - 2); a larger kernel shrinks it further. Another solution is to extend A, to dimensions (m + 2) x (n + 2) for a 3 x 3 kernel, which requires adding elements. Depending on the specific application, the added elements can be zeros, copies of their near neighbors in A, or anything else.

Image processing is an important field where a two-dimensional convolution is commonly used. Figure A.3b-e shows four simple kernels that perform common image processing operations. The kernel of Figure A.3b results in image blurring. Each pixel of the convolved image retains 20% of its original value and receives contributions of 8-12% from its eight near neighbors. Note that the weights add up to 100%.
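The placement-and-sum operation described above can be sketched in Python as follows. The sketch keeps only those placements where the kernel lies entirely inside A, so the result is smaller than A. The function name `convolve2d_valid` and the 5 x 6 test array are illustrative; the blur weights are those quoted for Figure A.3b.

```python
def convolve2d_valid(a, kernel):
    """Place `kernel` over every area of `a` that it fits in completely.

    As in the text, each kernel element multiplies the element of `a`
    located behind it (no kernel flip), and the products are added.
    """
    rows, cols = len(a), len(a[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = []
    for r in range(rows - k_rows + 1):
        out_row = []
        for c in range(cols - k_cols + 1):
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += kernel[i][j] * a[r + i][c + j]
            out_row.append(total)
        out.append(out_row)
    return out


# Blur kernel with the weights of Figure A.3b (they add up to 1).
blur = [[0.08, 0.12, 0.08],
        [0.12, 0.20, 0.12],
        [0.08, 0.12, 0.08]]

# A made-up 5 x 6 array whose values grow steadily across rows and columns.
image = [[10 * r + c for c in range(6)] for r in range(5)]

result = convolve2d_valid(image, blur)
print(len(result), len(result[0]))   # 3 4: a 5 x 6 array shrinks by 2 each way
```

Extending A with zeros or with copies of its edge elements before calling the function is one way to obtain a result of the same size as A.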

[Figure A.3 panels: (a) a 3 x 3 kernel placed over a 5 x 6 array A; (b)-(d) three 3 x 3 kernels; (e) a group of 5 x 5 pixels.]

(b)
.08 .12 .08
.12 .20 .12
.08 .12 .08

(c)
-1  0  0
 0  0  0
 0  0  1

(d)
 0 -1  0
-1  5 -1
 0 -1  0

(e)
 0  0  4  4  4
 0  4  4  4  6
 0  1  8  8  6
10  3  7  8  6
10  1  7  9  5

Figure A.3: Two-Dimensional Convolution and Kernels.

Exercise A.1: Try to figure out the effect of the kernel of Figure A.3d.

The kernel of Figure A.3c produces an embossed image. Each pixel in the convolved image becomes the difference of its bottom-right and top-left neighbors. If these neighbors are about equal, the pixel becomes close to zero. If these neighbors are close opposites (as happens along a diagonal edge of the image), the pixel is set to a large (positive or negative) value. Swapping the 1 and -1 of this kernel results in an embossing where the light seems to come from the opposite direction. This convolution should be followed by a rescaling of pixel values such that large negative values become zero (white), zero becomes 50% gray, and large positive values become dark gray. Similar rescalings may be necessary with other kernels.

Figure A.3e shows a different approach to convolution. An image is scanned and each pixel P is replaced by the median of a group of 5 x 5 pixels centered on P. This creates the effect of a watercolor. In the figure, the center pixel (8) is modified to 4 because 4 is the median of the group of 25 pixels shown (half are less than 4 and the other half are greater than 4).

An important application of the two-dimensional image convolution is edge detection. This is done by (among other filters) the two Sobel kernels

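$$G_x = \begin{bmatrix} -1 & 0 & +1\\ -2 & 0 & +2\\ -1 & 0 & +1 \end{bmatrix} \quad\text{and}\quad G_y = \begin{bmatrix} +1 & +2 & +1\\ 0 & 0 & 0\\ -1 & -2 & -1 \end{bmatrix}.$$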

$G_x$ detects vertical edges. The left and right columns have equal and opposite values, so they magnify any differences between the left and right neighbors of a pixel. Similarly, $G_y$, which is a rotated version of $G_x$, detects horizontal edges. It is possible to detect edges in all directions by combining the two kernels. Assume that applying the two kernels to a pixel P in the original image results in the two numbers $G_x$ and $G_y$. Pixel P is replaced by $\sqrt{G_x^2 + G_y^2}$. Figure A.4 shows the results of applying $G_x$ [part (c)], $G_y$ [part (d)], and their combination [part (b)] to the Lena image shown in part (a) of the figure. The Matlab code that generates those images is also included.

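[Figure A.4 panels: (a) the original Lena image; (b) the combination of both kernels; (c) the result of Gx; (d) the result of Gy.]

Figure A.4: Edge Detection with the Sobel Filter.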

    % 2D convolution of an image with a 3x3 kernel
    kernel=[-1 0 1;-2 0 2;-1 0 1]; % Sobel X filter (detects vertical edges)
    filename='lena128'; dim=128;
    fid=fopen(filename,'r');
    if fid==-1, error('file not found'); end; % stop here, fread would fail
    img=fread(fid,[dim,dim])'; % read the raw bytes, transpose to row order
    fclose(fid);
    figure(1), imagesc(img), colormap(gray), axis off, axis square
    ximg=zeros(dim); % border pixels are left at zero
    for i=2:dim-1, for j=2:dim-1,
      ximg(i,j)=img(i-1,j-1)*kernel(1,1)+ ...
      img(i-1,j)*kernel(1,2)+img(i-1,j+1)*kernel(1,3)+ ...
      img(i,j-1)*kernel(2,1)+img(i,j)*kernel(2,2)+ ...
      img(i,j+1)*kernel(2,3)+img(i+1,j-1)*kernel(3,1)+ ...
      img(i+1,j)*kernel(3,2)+img(i+1,j+1)*kernel(3,3);
    end; end;
    figure(2), imagesc(ximg), colormap(gray), axis off, axis square
    yimg=zeros(dim);
    kernel=fliplr(rot90(kernel)); % Sobel Y filter (detects horizontal edges)
    for i=2:dim-1, for j=2:dim-1,
      yimg(i,j)=img(i-1,j-1)*kernel(1,1)+ ...
      img(i-1,j)*kernel(1,2)+img(i-1,j+1)*kernel(1,3)+ ...
      img(i,j-1)*kernel(2,1)+img(i,j)*kernel(2,2)+ ...
      img(i,j+1)*kernel(2,3)+img(i+1,j-1)*kernel(3,1)+ ...
      img(i+1,j)*kernel(3,2)+img(i+1,j+1)*kernel(3,3);
    end; end;
    figure(3), imagesc(yimg), colormap(gray), axis off, axis square
    img=sqrt(ximg.^2+yimg.^2); % combine the two responses
    figure(4), imagesc(img), colormap(gray), axis off, axis square

Matlab Code for Figure A.4
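For readers without Matlab, the same computation can be sketched in Python on a small synthetic image instead of the Lena file. This is an illustrative sketch, not a translation of the listing: the names `apply_kernel` and `sobel_magnitude` and the 6 x 6 test image are assumptions made here. As in the Matlab code, border pixels are left at zero.

```python
import math

# The two Sobel kernels given in the text.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]


def apply_kernel(img, kernel, r, c):
    """Weighted sum of the 3 x 3 neighborhood of pixel (r, c)."""
    return sum(kernel[i][j] * img[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))


def sobel_magnitude(img):
    """Replace each interior pixel by sqrt(Gx^2 + Gy^2)."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_kernel(img, GX, r, c)
            gy = apply_kernel(img, GY, r, c)
            out[r][c] = math.sqrt(gx * gx + gy * gy)
    return out


# Synthetic image: dark left half, bright right half (one vertical edge).
img = [[0, 0, 0, 9, 9, 9] for _ in range(6)]
edges = sobel_magnitude(img)
print(edges[2])   # [0.0, 0.0, 36.0, 36.0, 0.0, 0.0]
```

Only the two columns that straddle the jump from 0 to 9 respond, and the response comes entirely from Gx, which matches the statement that Gx detects vertical edges.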

Sound like a shock wave, herself the sound and the sounding board, vision over vision, a fire in her bones, thunder in her veins, a heart-contracting experience of pain and pleasure so intense and so total that every nerve in her body and every convolution of her brain echoed.
-Anne McCaffrey, Crystal Singer (1982)

Appendix B Hashing

A hash function H accepts an argument x and scrambles the individual bits of x to generate a result y that's an n-bit integer (where n is a constant that's either built into H or can be modified). Hash functions have traditionally been used in the computing field to implement a data structure known as a hash table. Recently, hash functions have found applications in cryptography and (Section 11.1). We start with a short description of hash tables.

B.1 Hash Tables

A hash table is a data structure allowing for fast insertions, searches, and deletions of data items. The table itself is just an array A, and it is based on a hash function H such that H(k) produces an index to array A, where k is the key of a data item.

A good example of the use of a hash table is a symbol table. Virtually all computer languages use variables. A variable provides a name for a value that will be stored in memory, in a certain address M, when the program is eventually executed. When the program is compiled, each variable has two attributes, its name N (a string of characters assigned by the programmer) and its memory address M, assigned by the compiler. The compiler uses a hash table A to store all the information about variables. For each variable, its address M and name N are stored in the table as a data item and a key, respectively. The compiler reads the name from the program source file, hashes it, finds it in the hash table, and retrieves the address in order to compile the current statement. If the variable is not found in the hash table (i.e., it is being seen for the first time), it is assigned an address, and both the name and address are stored in the table (in principle, only the address need be stored, but the name is also stored because of collisions).

The hash function H takes as argument a key, which may be a number or a string. It scrambles (hashes) the bits of the key to produce an index to array A. In practice the array size is normally $2^n$, so the result produced by H should be an n-bit number. A hash table is a good data structure, since any operation on the hash table (adding, searching, or deleting) can be performed in one step, regardless of the table size.

The only problem is collisions. In most applications it is possible for two distinct keys $k_1$ and $k_2$ to be hashed to the same index; i.e., $H(k_1) = H(k_2)$ for $k_1 \ne k_2$. The example of a symbol table makes it easy to understand the reason for this. Assuming that variable names consist of five letters, there may be $26^5 = 11{,}881{,}376$ variable names. Any particular program uses just a small percentage of this number, perhaps a few hundred or a few thousand names. Thus, the size of the hash table should be only a few thousand entries, and hashing 11.8 million names into a few thousand index values must involve many collisions.

Exercise B.1: How many names are possible if a name consists of exactly eight letters and digits?

Terminology. Two different keys that hash to the same index are called synonyms. If a hash table of size $2^n$ contains m keys out of a set of M possible keys, then m/M is the density of the table, and $\alpha = m/2^n$ is its loading factor.
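The numbers in this discussion are easy to reproduce. The Python sketch below counts the possible five-letter names, hashes a few thousand random names into a table of $2^{12}$ entries, and reports the density, the loading factor, and how many names share an index with another name. The function `toy_hash` is a hypothetical placeholder chosen only for brevity; it is not one of the hash functions recommended in this appendix, and the name count and table size are arbitrary.

```python
import random
import string
from collections import Counter


def toy_hash(name, n_bits):
    """Hypothetical placeholder hash: base-31 polynomial reduced to n_bits bits."""
    value = 0
    for ch in name:
        value = value * 31 + ord(ch)
    return value % (1 << n_bits)


possible_names = 26 ** 5          # M, all five-letter names
n_bits = 12                       # table size is 2^12 = 4096 entries

random.seed(7)
names = {"".join(random.choices(string.ascii_uppercase, k=5))
         for _ in range(2000)}
m = len(names)                    # distinct names actually used

slots = Counter(toy_hash(name, n_bits) for name in names)
synonyms = sum(count for count in slots.values() if count > 1)

print(possible_names)             # 11881376
print(m / possible_names)         # density m/M
print(m / 2 ** n_bits)            # loading factor alpha
print(synonyms)                   # names that collide with at least one other
```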

to "circumvent a technological measure" means to descramble a scrambled work, to decrypt an encrypted work, or otherwise to avoid, bypass, remove, deactivate, or impair a technological measure, without the authority of the copyright owner;

-Copyright Law of the United States of America

B.2 Hash Functions

A hash function has to be fast to compute and should minimize collisions. The function should make use of all the bits of the key, such that changing even one bit would normally (although not always) produce a different index. An ideal hash function should also produce indexes that are uniformly distributed (invoking the function many times with random keys should produce each index the same number of times). A function that produces, for example, index 118 most of the time is obviously biased and leads to collisions. The function should also assume that many keys may be similar. A programmer may sometimes assign names such as A1 and A2 to variables. A hash function that uses just the leftmost bits of a key would produce the same index for such names, leading to many collisions. Following are some examples of hash functions used in practice.

Mid-Square: The key k is considered an integer; it is squared and the middle n bits of $k^2$ are extracted to become the index. Squaring k has the advantage that the middle bits of $k^2$ depend on all the bits of k. Thus, two keys differing by one bit would tend to produce different indexes. A variation, slower but suitable for large keys, is to divide the bits of the original key into several groups, add all the groups, square the result, and extract its middle n bits. The keys A1, A2, and A3, for example, become the 16-bit numbers

01000001|00110001, 01000001|00110010, and 01000001|00110011.

After squaring and extracting the middle eight bits, the resulting indexes are 158, 166, and 175, respectively.

Modulo: $H_m(k) = k \bmod m$. The result is the remainder of the integer division k/m, a number in the interval [0, m - 1]. In order for the result to be a valid index, the hash table size should be m. The value of m is critical and should be selected carefully. If m is a power of 2, say, $2^i$, then the remainder of k/m is simply the i rightmost bits of k. This would be a very biased hash function. If m is even, then the remainder of k/m has the same parity as k (it is odd when k is odd and even when k is even). This again is a bad choice for m, since it produces a biased hash function that maps odd keys to odd locations of A and even keys to even locations. If p is a prime number that divides m evenly, then keys that are permutations of each other (e.g., ABC, ACB, and CBA) may often be mapped to indexes that differ by p or by a multiple of p, again causing a nonuniform distribution of the keys. It can be shown that the modulo hash function achieves best results when m is a prime number that does not evenly divide $8^a \pm b$, where a and b are small numbers. In practice, good choices for m are composite numbers whose prime divisors are > 20.

Folding: This function is suitable for large keys. The bits constituting the key are divided into several groups, which are then added. The middle n bits of the sum are extracted to become the index. A variation is reverse folding, where every other group of bits is reversed before being added.

No, no. Look, here's the hash on the side because I didn't know how much you took.
-Amy Wright as Shelley in Stardust Memories (1980)
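The three hash functions described above can be sketched in Python as follows. The sketch assumes that keys are ASCII strings packed 8 bits per character, that the square of a 16-bit key is viewed as a 32-bit number, and that the "middle" bits of the folded sum are taken relative to the actual bit length of that sum. All function names and these conventions are assumptions made here for illustration.

```python
def key_to_int(key):
    """Concatenate the 8-bit character codes of an ASCII string key."""
    value = 0
    for ch in key:
        value = (value << 8) | ord(ch)
    return value


def mid_square(k, key_bits=16, n=8):
    """Middle n bits of k^2, with k^2 viewed as a (2 * key_bits)-bit number."""
    low = (2 * key_bits - n) // 2          # bits discarded on the right
    return (k * k >> low) & ((1 << n) - 1)


def modulo(k, m):
    """Modulo hash: the remainder of k divided by the table size m."""
    return k % m


def folding(k, key_bits, group_bits, n):
    """Add the group_bits-wide groups of k, then take n middle bits of the sum."""
    mask = (1 << group_bits) - 1
    total = 0
    for shift in range(0, key_bits, group_bits):
        total += (k >> shift) & mask
    low = (max(total.bit_length(), n) - n) // 2
    return (total >> low) & ((1 << n) - 1)


keys = [key_to_int(name) for name in ("A1", "A2", "A3")]
print([mid_square(k) for k in keys])       # [158, 166, 175]

# With m a power of 2, the modulo hash keeps only the rightmost bits of the key.
print(all(modulo(k, 16) == (k & 15) for k in keys))

print(folding(key_to_int("ABCD"), key_bits=32, group_bits=8, n=8))
```

The mid-square results agree with the indexes quoted in the text for the keys A1, A2, and A3.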

B.3 Collision Handling

When an index i is produced by the hash function H(k), the software should first check location A[i] for a collision. There must, therefore, be a way for the software to tell whether entry A[i] is empty or occupied. Initializing all entries of A to zero is normally not enough, since zero may be a valid data item. A simple approach is to have an additional array F, of size $2^n/8$ bytes, where each bit is associated with an entry of A. Each bit of F acts as a flag indicating whether the corresponding entry of A is empty or not. The entire array F is initially set to zeros, indicating that all entries of A are empty. When the software decides to insert a data item into A[i], it has to locate the bit in F that corresponds to entry i and check it. The software should therefore calculate $j = \lfloor i/8 \rfloor$, $k = i - 8j$, and check bit k of byte F[j]. If the bit is zero, entry A[i] is empty and can be used for a new data item. Bit k of F[j] then has to be set, which is done by using k to select one of the eight masks

00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000

and logically ORing it with F[j]. If the bit is 1, entry A[i] is already occupied, and this is a collision. The software should be able to check and tell whether entry A[i] contains the data item d that corresponds to key k. This is why the keys have to be saved, together with the data items, in the hash table.

What should the software do in case of a collision? The simplest choice is to check entries A[i + 1], A[i + 2], ..., A[$2^n$ - 1], A[0], A[1], ... until an empty entry is found or until the search reaches entry A[i - 1]. In the latter case the software knows that the data item is not in the table (if this was a search for an item) or that the table is full (if this was an attempt to insert a new item in the table). This process is called linear search.

Searching for a data item, which in principle should take one step, can now, because of collisions, take up to $2^n - 1$ steps. Also, experience indicates that a linear search causes occupied entries in the table to cluster, which is intuitively easy to understand. If the hash function is not ideal and it hashes many keys to, say, index 54, then table entries 54, 55, ... will quickly fill up, creating a cluster. Clusters also tend to grow and merge, creating even larger clusters and thereby increasing the search time.

A theoretical analysis shows that the expected number of steps needed to locate an item when linear search is used is $(2 - \alpha)/(2 - 2\alpha)$, where $\alpha$ is the loading factor (the fraction of the table that is full). For $\alpha = 0.5$ we can expect 1.5 steps on average, but for $\alpha = 0.75$ the expected number of steps rises to 2.5, and for $\alpha = 0.9$ it becomes 5.5. It is clear that when linear search is used, the loading factor should be kept low (perhaps below 0.6-0.7). If more items need to be added to the table, a good solution is to construct a new table, twice as large as the original one, transfer all items from the old table to the new one (using a new hash function), and delete the old table.
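The flag array and the linear search described above can be sketched as a small Python class. This is a minimal illustration under stated assumptions: keys are non-negative integers, the hash function is simply the key modulo the table size (a placeholder, not a recommended choice), and deletion and table doubling are left out. The class and function names are hypothetical.

```python
class LinearProbingTable:
    """Hash table of size 2**n with a flag array F holding one bit per entry."""

    def __init__(self, n):
        self.size = 1 << n
        self.keys = [None] * self.size
        self.items = [None] * self.size
        self.flags = bytearray(max(1, self.size // 8))   # 2**n / 8 zero bytes

    def _occupied(self, i):
        j, k = i // 8, i % 8             # bit k of byte F[j]
        return bool(self.flags[j] & (1 << k))

    def _mark(self, i):
        j, k = i // 8, i % 8
        self.flags[j] |= 1 << k          # OR the k-th mask into F[j]

    def _index(self, key):
        return key % self.size           # placeholder hash function

    def insert(self, key, item):
        """Store item under key and return the index that was used."""
        i = self._index(key)
        for _ in range(self.size):
            if not self._occupied(i):
                self.keys[i], self.items[i] = key, item
                self._mark(i)
                return i
            if self.keys[i] == key:      # key already present, update it
                self.items[i] = item
                return i
            i = (i + 1) % self.size      # linear search, wrapping around
        raise OverflowError("table is full")

    def search(self, key):
        """Return the item stored under key, or None if it is absent."""
        i = self._index(key)
        for _ in range(self.size):
            if not self._occupied(i):
                return None
            if self.keys[i] == key:
                return self.items[i]
            i = (i + 1) % self.size
        return None


def expected_steps(alpha):
    """Expected number of steps of a linear search at loading factor alpha."""
    return (2 - alpha) / (2 - 2 * alpha)


table = LinearProbingTable(3)                                      # eight entries
print([table.insert(key, str(key)) for key in (5, 13, 21, 29)])   # [5, 6, 7, 0]
print(table.search(13), table.search(37))                         # 13 None
print(expected_steps(0.5), expected_steps(0.75))                  # 1.5 2.5
```

The four keys all hash to index 5, so they form exactly the kind of cluster the text describes, and the last one wraps around to entry 0.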

Dinner was at one o'clock; and on Monday, Tuesday, and Wednesday it consisted of beef, roast, hashed, and minced, and on Thursday, Friday, and Saturday of mutton. On Sunday they ate one of their own chickens.
-W. Somerset Maugham, Of Human Bondage (1915)

A more sophisticated method of handling collisions is quadratic search. Assume that A is an array of size N. When entry A[i] is found to be occupied, the software checks entries $A[(i \pm j^2) \bmod N]$ where $0 \le j \le (N - 1)/2$. It can be shown that if N is a prime number of the form 4j + 3 (where j is an integer), quadratic search will end up examining every entry of A.

A third approach to the problem of collisions is to rehash. The software should have a choice of several hashing functions $H_1$, $H_2$, .... If $i = H_1(k)$ and A[i] is occupied, the software should calculate $i = H_2(k)$, then try the new A[i].

Another approach is to generate an array R of N unique pseudo-random numbers in the range [0, N - 1]. If entry A[i] is occupied, the software should set $i \leftarrow (i + R[i]) \bmod N$ and try the new A[i].
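The claim about quadratic search and primes of the form 4j + 3 can be checked numerically. The Python sketch below lists the distinct entries visited by the sequence $(i \pm j^2) \bmod N$ and counts them for a few table sizes; the function name and the sample sizes are illustrative choices.

```python
def quadratic_probe_sequence(i, n_slots):
    """Distinct indexes (i + j^2) mod N and (i - j^2) mod N, 0 <= j <= (N-1)/2."""
    seen = []
    for j in range((n_slots - 1) // 2 + 1):
        for candidate in ((i + j * j) % n_slots, (i - j * j) % n_slots):
            if candidate not in seen:
                seen.append(candidate)
    return seen


for n_slots in (7, 11, 19, 13):
    visited = len(quadratic_probe_sequence(3, n_slots))
    print(n_slots, n_slots % 4, visited)
```

For N = 7, 11, and 19 (each leaves remainder 3 when divided by 4) every entry is visited. For N = 13, a prime of the form 4j + 1, only 7 of the 13 entries are reached, so an empty entry can be missed.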

It is possible to design a perfect hash function that, for a given set of data items, will not have any collisions. This makes sense for sets of data that never change. Examples are indexes of the Bible, the works of Shakespeare, or any data written on a CD-ROM. The size N of the hash table should, in such a case, normally be larger than the number of data items. It is also possible to design a minimal perfect hash function where the hash table size equals the size of the data (i.e., no entries remain empty after all data items have been inserted). See [Czech 92], [Fox 91], and [Havas 93] for details on these special hash functions.

B.4 Secure Hash Functions

A secure hash function H has two properties: (1) if z = H(x), then it is computationally infeasible to find a $y \ne x$ such that z = H(y); (2) collisions are extremely rare (i.e., it is also computationally infeasible to find two different arguments x and y that will hash to the same z).

This section describes the secure hash standard (SHS), a hash function adopted in 1995 by the United States government as the FIPS-180-1 standard [FIPS 95]. This standard specifies a secure hash algorithm (SHA-1) that accepts an argument (termed a "message") of any length less than $2^{64}$ bits and computes a 160-bit output called a message digest. SHA-1 uses the following functions and constants.

1. A circular left shift $S^i(X)$ where X is a 32-bit word and i is an integer between 0 and 31. This rotates X i positions to the left.
2. Addition. Adding two 32-bit words is done modulo $2^{32}$.
3. Complement. A unary minus stands for 1's complement.
4. A total of 80 logical functions $f_t$ are defined, each operating on three 32-bit words and producing one 32-bit word. Their definitions are

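$$f_t(B, C, D) = \begin{cases} (B \wedge C) \vee (-B \wedge D), & 0 \le t \le 19,\\ B \text{ XOR } C \text{ XOR } D, & 20 \le t \le 39,\\ (B \wedge C) \vee (B \wedge D) \vee (C \wedge D), & 40 \le t \le 59,\\ B \text{ XOR } C \text{ XOR } D, & 60 \le t \le 79. \end{cases}$$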

5. A set of 80 constants $K_t$ is also defined as follows

$$K_t = \begin{cases} \text{5A827999}_{16}, & 0 \le t \le 19,\\ \text{6ED9EBA1}_{16}, & 20 \le t \le 39,\\ \text{8F1BBCDC}_{16}, & 40 \le t \le 59,\\ \text{CA62C1D6}_{16}, & 60 \le t \le 79. \end{cases}$$

The first step of SHA-1 is to pad the message such that its length becomes an integer multiple of 512 bits. Denoting the original message length by l, padding is done by appending a 1 followed by m zeros, followed by the number l represented in 64 bits. The value of m depends on l and is determined by equating l + 1 + m + 64 to 512n for the smallest possible positive integer n.
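The padding rule can be sketched in Python as follows. The sketch assumes a message made of whole bytes, so the single 1 bit and the first seven zeros arrive together as the byte 80 (hexadecimal); the function name `sha1_pad` is illustrative.

```python
def sha1_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 512 bits (64 bytes) as SHA-1 requires."""
    length_in_bits = 8 * len(message)            # l, the original length
    padded = message + b"\x80"                   # a 1 bit followed by seven 0 bits
    # Add zero bytes until exactly 8 bytes (64 bits) remain in the last block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + length_in_bits.to_bytes(8, "big")


for size in (3, 55, 56, 64):
    print(size, len(sha1_pad(b"a" * size)))
# 3 64
# 55 64
# 56 128
# 64 128
```

A 55-byte message is the longest that still fits in a single block, since 55 + 1 + 8 = 64; one more byte forces a second block.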

The remaining computations use two buffers consisting of five 32-bit words each. The five words of one buffer are labeled A, B, C, D, and E, and the five words of the other buffer are labeled $H_0$, $H_1$, $H_2$, $H_3$, and $H_4$. Before any processing starts, the five $H_i$ words are initialized to $H_0$ = 67452301, $H_1$ = EFCDAB89, $H_2$ = 98BADCFE, $H_3$ = 10325476, and $H_4$ = C3D2E1F0 (all hexadecimal). There is also a sequence of eighty 32-bit words labeled $W_0$ through $W_{79}$ and a one-word buffer TEMP. The message is divided into n 16-word (512 bits) blocks $M_1$ through $M_n$, and each block $M_i$ goes through five steps as follows.

Step 1. Partition $M_i$ into 16 words $W_0$ through $W_{15}$, where $W_0$ is the leftmost word.
Step 2. For t = 16 to 79 let $W_t = S^1(W_{t-3} \text{ XOR } W_{t-8} \text{ XOR } W_{t-14} \text{ XOR } W_{t-16})$.
Step 3. Let A = $H_0$, B = $H_1$, C = $H_2$, D = $H_3$, and E = $H_4$.
Step 4. For t = 0 to 79 do
  TEMP = $S^5(A) + f_t(B, C, D) + E + W_t + K_t$;
  E = D; D = C; C = $S^{30}(B)$; B = A; A = TEMP.
Step 5. Set $H_0 = H_0 + A$, $H_1 = H_1 + B$, $H_2 = H_2 + C$, $H_3 = H_3 + D$, and $H_4 = H_4 + E$.

After all n blocks have been processed in this way, the 160-bit result is found in the five words $H_0 H_1 H_2 H_3 H_4$.
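The five steps can be followed line by line in the Python sketch below, which handles byte-string messages and compares its result with the `hashlib` module of the Python standard library. It is meant as a reading aid under those assumptions, not as a replacement for a vetted implementation; the names `rotl`, `f_t`, and `sha1` are chosen here for illustration.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF   # keeps every result within 32 bits (addition modulo 2^32)


def rotl(x, i):
    """Circular left shift S^i(X) of a 32-bit word."""
    return ((x << i) | (x >> (32 - i))) & MASK


def f_t(t, b, c, d):
    """The 80 logical functions, selected by the round number t."""
    if t < 20:
        return (b & c) | (~b & d)
    if 40 <= t < 60:
        return (b & c) | (b & d) | (c & d)
    return b ^ c ^ d                      # rounds 20-39 and 60-79


def sha1(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    k = ([0x5A827999] * 20 + [0x6ED9EBA1] * 20 +
         [0x8F1BBCDC] * 20 + [0xCA62C1D6] * 20)

    # Padding: a 1 bit, zeros, then the 64-bit length of the original message.
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data) % 64) % 64)
    data += (8 * len(message)).to_bytes(8, "big")

    for start in range(0, len(data), 64):
        w = list(struct.unpack(">16I", data[start:start + 64]))        # step 1
        for t in range(16, 80):                                         # step 2
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
        a, b, c, d, e = h                                               # step 3
        for t in range(80):                                             # step 4
            temp = (rotl(a, 5) + f_t(t, b, c, d) + e + w[t] + k[t]) & MASK
            e, d, c, b, a = d, c, rotl(b, 30), a, temp
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]        # step 5

    return "".join(format(word, "08x") for word in h)


print(sha1(b"abc"))   # a9993e364706816aba3e25717850c26c9cd0d89d
print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())   # True
```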

"Why," said he, "a magician could call up a lot of genies, and they would hash you up like nothing before you could say Jack Robinson. They are as tall as a tree and as big around as a church."
-Mark Twain, The Adventures of Huckleberry Finn (1885)

Appendix C Cyclic Redundancy Codes

The idea of a parity bit is simple, old, and familiar to most computer practitioners. A parity bit is the simplest type of error-detecting code. It adds reliability to a group of bits by making it possible for hardware or software to detect certain errors that occur when the group is stored in memory, is written on a disk, or is transmitted over communication lines between . A single parity bit does not make the group absolutely reliable. There are certain errors that cannot be detected with a parity bit, but experience shows that even a single parity bit can make data transmission reliable in most practical cases. The parity bit is computed from a group of n - 1 bits, then added to the group, making it n bits long. A common example is a 7-bit ASCII code that becomes 8 bits long after a parity bit is added. The parity bit p is computed by counting the number of ones in the original group, and setting p to complete that number to either odd or even. The former is called odd parity and the latter is even parity. Instead of counting the number of ones, odd parity can be computed as the exclusive OR (XOR) of the n - 1 data bits. Examples: Given the group of seven bits 1010111, the number of ones is 5, which is odd. Assuming odd parity, the value of p should be 0, leaving the total number of Is odd. Similarly, the group 1010101 has four Is, so its odd parity bit should also be a 1, bringing the total number of Is to five. Imagine a block of data where the most significant bit (MSB) of each byte is an odd parity bit, and the bytes are written vertically (Table C.1a). When this block is read from a disk or is received by a modem, it may contain transmission errors, errors that have been caused by imperfect hardware or by electrical interference during transmission. We can think of the parity bits as horizontal reliability. When the block is read, the hardware can check every byte, verifying the parity. This is done by simply counting the number of ones in the byte. 
If this number is odd, the 384 Appendix C Cyclic Redundancy Codes

1 01101001 1 01101001 1 01101001 1 01101001 o 00001011 o 00001011 o 00001011 o 00001011 o 11110010 o 11010010 o 11010110 o 11010110 o 01101110 o 01101110 o 01101110 o 01101110 1 11101101 1 11101101 1 11101101 1 11101101 1 01001110 1 01001110 1 01001110 1 01001110 o 11101001 o 11101001 o 11101001 o 11101001 1 11010111 1 11010111 1 11010111 1 11010111 o 00011100

(a) (b) (c) (d) Table c.1: Horizontal and Vertical Parities. hardware assumes that the byte is good. This assumption is not always correct, since two bits may get corrupted during transmission (Table C.1c). A single parity bit is thus useful (Table C.1 b) but does not provide full error-detection capability. A simple way to increase the reliability of a block of data is to compute vertical parities. The block is considered eight vertical columns, and an odd parity bit is com• puted for each column (Table C.1d). If two bits in one byte get corrupted, the horizontal parity will not detect the error, but two of the vertical parity bits will. Even the vertical bits do not provide complete error-detection capability, but they are a simple way to significantly improve data reliability. Vertical parity is the simplest example of a CRC. CRC stands for Cyclical Redun• dancy Check (or Cyclical Redundancy Code). It is a rule that specifies how to compute the vertical check bits (they are now called check bits, not just simple parity bits) from all the bits of the data. Here is how CRC-32 is computed (CRC-32 is one of the many standards developed by the CCITT). The block of data is written as one long binary number. In our example this will be the 64-bit number 1011010011000001011101111001010011011101111101101110100111010111010011111010111. The individual bits are considered the coefficients of a polynomial. In our example, this will be the degree-63 polynomial

$$
\begin{aligned}
P(x) &= 1\times x^{63} + 0\times x^{62} + 1\times x^{61} + 1\times x^{60} + \cdots + 1\times x^{2} + 1\times x^{1} + 1\times x^{0}\\
     &= x^{63} + x^{61} + x^{60} + \cdots + x^{2} + x + 1.
\end{aligned}
$$

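This polynomial is then divided by the standard CRC-32 generating polynomial

$$\mathrm{CRC}_{32}(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1.$$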

When an integer $M$ is divided by an integer $N$, the result is a quotient $Q$ (which is irrelevant for CRC) and a remainder $R$, which is in the interval $[0, N-1]$. Similarly, when a high-degree polynomial $P(x)$ is divided by a degree-32 polynomial, the result is two polynomials, a quotient and a remainder. The remainder is a polynomial whose degree is in the range $[0, 31]$, implying that it has 32 coefficients, each a single bit. (If the degree of the remainder polynomial is less than 31, some of its leftmost coefficients are zeros.) Those 32 bits are the CRC-32 code, which is appended to the block of data as four bytes. As an example, the CRC-32 of a recent version of the file with the text of this Appendix is 586DE4FE$_{16}$.

The CRC is sometimes called the "fingerprint" of the file. Of course, since it is a 32-bit number, there are only $2^{32}$ different CRCs. This number equals approximately 4.3 billion, so in principle there are different files with the same CRC, but in practice this is rare. The CRC is useful as an error-detecting code because it has the following properties.
1. Every bit in the data block is used to compute the CRC. This means that changing even one bit may produce a different CRC.
2. Even small changes in the data normally result in very different CRCs. Experience with CRC-32 shows that it is very rare that introducing errors in the data does not change the CRC.
3. Any histogram of CRC-32 values for different data blocks is flat (or very close to flat). For a given data block, the probability of any of the $2^{32}$ possible CRCs being produced is practically the same.

Other common generating polynomials are $\mathrm{CRC}_{12}(x) = x^{12} + x^{3} + x + 1$ and $\mathrm{CRC}_{16}(x) = x^{16} + x^{15} + x^{2} + 1$. They generate the common CRC-12 and CRC-16 codes, which are 12 and 16 bits long, respectively.
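The division described above can be sketched as follows. This is a minimal illustration that follows the text literally (the check bits are the plain remainder of the data polynomial divided by the generator); practical CRC implementations usually add conventions such as initial values and bit reflection, so their outputs will generally differ. The names `poly_mod2` and `crc_remainder` are hypothetical.

```python
def poly_mod2(dividend: int, divisor: int) -> int:
    """Remainder of a polynomial division over GF(2).

    Bit i of each integer is the coefficient of x^i, and subtraction is XOR.
    """
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        # Align the divisor with the leading term of the dividend and cancel it.
        shift = dividend.bit_length() - divisor_len
        dividend ^= divisor << shift
    return dividend


# x^16 + x^15 + x^2 + 1, the CRC-16 generating polynomial quoted in the text.
CRC16_POLY = (1 << 16) | (1 << 15) | (1 << 2) | 1


def crc_remainder(data: bytes, generator: int) -> int:
    """Treat the data block as one long binary number and reduce it."""
    block = int.from_bytes(data, "big")
    return poly_mod2(block, generator)


# (x^3 + x + 1) mod (x^2 + 1) = 1, since x^3 + x + 1 = x(x^2 + 1) + 1.
print(poly_mod2(0b1011, 0b101))
print(hex(crc_remainder(b"parity", CRC16_POLY)))
```

Because the generator has a constant term of 1, no single power of $x$ is divisible by it, so flipping any one bit of the data always changes the remainder.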

I agree that there is adequate overlap in longitude, but I'm still worried about redundancy.
-Carl Sagan, Contact (1985)

Appendix D Galois Fields

This appendix is an introduction to finite fields for those who need to brush up on this topic. Finite fields are used in cryptography in the Rijndael (AES) algorithm and in stream ciphers.

D.1 Field Definitions and Operations

The mathematical concept of a field is based on that of a group, so we start with basic definitions and simple examples of groups and fields. A group $G$ is a set of mathematical elements with a binary operation denoted by "+" defined on the elements that satisfies the following conditions.
1. Closure: for any $a, b \in G$, the sum $(a + b)$ is an element of $G$.
2. Associativity: any $a, b, c \in G$ satisfies $(a + b) + c = a + (b + c)$.
3. Identity: there exists $e \in G$ such that for all $a \in G$, $(a + e) = (e + a) = a$.
4. Inverses: for each $a \in G$ there exists a unique element $a^{-1} \in G$ such that $a + a^{-1} = a^{-1} + a = e$.
5. If the group operation is commutative, i.e., if $a + b = b + a$ for any $a, b \in G$, the group is called Abelian.

Question: What's purple and commutes? Answer: An Abelian grape.

Examples of groups:
1. The set of all the integers with integer addition. The identity element is the integer 0. This is an infinite group.
2. The (finite) set of the integers $0, 1, 2, \ldots, m-1$ with modulo-$m$ addition.
3. The integers $1, 2, \ldots, q-1$ for a prime $q$ with modulo-$q$ multiplication.
4. The set of all rotations in two dimensions under the operation: The sum of the two rotations by $\alpha$ and $\beta$ degrees is a rotation by $\alpha + \beta$ degrees.

The set (0, 1, 2, 3) with modulo-4 addition is a group denoted by G(4). It obeys the addition table

 + | 0 1 2 3
 0 | 0 1 2 3
 1 | 1 2 3 0
 2 | 2 3 0 1
 3 | 3 0 1 2

The order of a group (its cardinality) is the number of elements. It is denoted by ord($G$). The order of G(4) is 4. A subgroup is a subset of the elements of a group that is itself a group under the group's operation. A theorem by Lagrange states that if $S$ is a subgroup of $G$, then ord($S$) divides ord($G$). For example, if $S$ is the subgroup (0, 2) of G(4), then ord($S$) = 2 divides ord(G(4)) = 4 and G(4) can be partitioned into the cosets $S$ and $S + 1$.

A field $F$ is a set with two operations, addition "+" and multiplication "×" that satisfies the following conditions.
1. $F$ is an Abelian group under the + operation.
2. $F$ is closed under the × operation.
3. The nonzero elements of $F$ form an Abelian group under ×.
4. The elements obey the distributive law $(a + b) \times c = a \times c + b \times c$.

Examples of fields are:
1. The real numbers under the normal addition and multiplication;
2. The complex numbers;
3. The rational numbers.

Notice that the integers do not form a field under addition and multiplication because the multiplicative inverse (reciprocal) of an integer $a$ is $1/a$, which is generally a noninteger. Also, a finite set of real numbers is not a field under normal addition and multiplication, because these operations can create a result outside the set. In order for a finite set of numbers to be a field, its two operations have to be defined carefully, so they satisfy the closure requirement. Finite fields are intriguing, because the finite number of elements implies that the two operations could be performed by computers exactly (with full precision). This is why much research has been devoted to the use of finite fields in practical applications.

A Galois field, abbreviated GF, is a finite field. These fields were "discovered," studied, and precisely defined by the young French mathematician Evariste Galois, and today they have many applications in fields as diverse as error-control codes, cryptography, random-number generation, VLSI testing, and digital signal processing.
Galois proved that the size of a finite field must be a power $m$ of a prime number $q$ and that there is exactly one finite field (up to isomorphism) with any given size $q^m$. This justifies talking about the finite field with $q^m$ elements and this field is denoted by GF($q^m$).

If $m = 1$, the size of the field GF($q$) is a prime number $q$, its elements are the integers $0, 1, \ldots, q-1$, and the two operations are integer addition and multiplication modulo $q$. The simplest examples are GF(2) and GF(3). The simple field GF(2) consists of the two elements 0 and 1 and is the smallest finite field. Its operations are integer addition and multiplication modulo 2, which are summarized by

 + | 0 1        × | 0 1
 0 | 0 1        0 | 0 0
 1 | 1 0        1 | 0 1

Notice that the addition is actually an XOR and the multiplication is a logical AND. The next field is GF(3) whose elements are 0, 1, and 2. Its operations are integer addition and multiplication modulo 3, summarized by the tables

 + | 0 1 2      × | 0 1 2
 0 | 0 1 2      0 | 0 0 0
 1 | 1 2 0      1 | 0 1 2
 2 | 2 0 1      2 | 0 2 1

The additive inverse of 1 is 2 because $1 + 2 = 2 + 1 = 0$. Similarly, the multiplicative inverse of 2 is itself because $2 \times 2 = 1$.
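The tables of GF($q$) for a prime $q$, and the inverses read off from them, can be generated mechanically. The following is a small sketch; the function names `gf_tables` and `gf_inverses` are hypothetical, and the code assumes that `q` is prime.

```python
def gf_tables(q: int):
    """Addition and multiplication tables of GF(q), q assumed prime."""
    add = [[(a + b) % q for b in range(q)] for a in range(q)]
    mul = [[(a * b) % q for b in range(q)] for a in range(q)]
    return add, mul


def gf_inverses(q: int):
    """Additive inverses of all elements, multiplicative inverses of the nonzero ones."""
    add_inv = {a: (-a) % q for a in range(q)}
    # Search each row of the multiplication table for the entry equal to 1.
    mul_inv = {a: next(b for b in range(1, q) if (a * b) % q == 1)
               for a in range(1, q)}
    return add_inv, mul_inv


add, mul = gf_tables(3)
add_inv, mul_inv = gf_inverses(3)
print(add)         # rows of the GF(3) addition table
print(mul)         # rows of the GF(3) multiplication table
print(add_inv[1])  # additive inverse of 1
print(mul_inv[2])  # multiplicative inverse of 2
```

The exhaustive search for multiplicative inverses is chosen for clarity; it succeeds for every nonzero element only because `q` is prime.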

◊ Exercise D.1: Write the addition and multiplication tables of GF(5).

He is the only candidate who gave poor answers. He knows absolutely nothing. I was told that this student has an extraordinary capacity for mathematics. This astonishes me greatly, for, after his examination, I believed him to have but little intelligence or that his intelligence is so well hidden that I was unable to uncover it. If he really is what he appears to be, I doubt very much that he will make a good teacher.
-French physicist Jean Claude Eugène Péclet, one of Galois's examiners in 1829.

◊ Exercise D.2: Compute the addition and multiplication tables of GF(4) as if 4 were a prime and show why these tables don't make sense.

If $m > 1$, the elements of GF($q^m$) are polynomials of degree less than $m$ over GF($q$) [i.e., polynomials whose coefficients are elements of GF($q$)], and the operations are special versions of polynomial addition and polynomial multiplication. Hence, if the polynomial $a_{m-1}x^{m-1} + \cdots + a_1x + a_0$ is an element of GF($q^m$), then $a_0, a_1, \ldots, a_{m-1}$ are elements of the Galois field GF($q$). The degree of the polynomial is the largest $i$ for which $a_i \ne 0$.

Adding elements of GF($q^m$) is easy. If the polynomials $a(x)$ and $b(x)$ are elements of GF($q^m$), then the sum $c(x) = a(x) + b(x)$ is a polynomial with coefficients $c_i = (a_i + b_i) \bmod q$. The sum is a polynomial whose degree is at most the greater of the degrees of $a(x)$ and $b(x)$, so it is an element of GF($q^m$). Also, the rule for addition implies that this operation is associative and that there is an identity (the polynomial whose coefficients are all zeros).

◊ Exercise D.3: In order for GF($q^m$) to be a field, each element must have an additive inverse. What is it?

Multiplying elements of GF($q^m$) is a bit trickier, because the normal multiplication of two polynomials of degrees $m$ and $n$ results in a polynomial of degree $m + n$. Multiplication of polynomials in GF($q^m$) must therefore be defined (similar to addition) modulo something. In analogy to addition, which is done modulo a prime integer, multiplication is performed modulo a prime polynomial. Such a polynomial is called irreducible. Much as a prime number is not a product of smaller integers, an irreducible polynomial is not a product of lower-degree polynomials. The irreducible polynomials we are interested in are irreducible over GF($q$), which means that such a polynomial cannot be factored into a product of lower-degree polynomials over GF($q$). [Note. A polynomial of degree 2 or more that is irreducible over GF($q$) has no roots in GF($q$). The opposite, however, isn't true. A polynomial with no roots in GF($q$) may be reducible over GF($q$).] Section D.3 shows how to multiply two polynomials modulo a third polynomial.

The polynomial $x^2 - 1$ over the reals can be factored into $(x - 1)(x + 1)$, so it is reducible. Its relative, the polynomial $x^2 + 1$, is irreducible over the real numbers. This same polynomial, however, is reducible over GF(2) because the polynomial product $(x + 1)(x + 1)$, which equals $x \times x + 1 \times x + x \times 1 + 1 \times 1$, can also be written $x^2 + (1 + 1)x + 1 \times 1$, and in GF(2) this equals $x^2 + 1$.

Another example is the polynomial $(x^2 + x + 1)^2$. It is easy to verify that neither 0 nor 1 is a root of this polynomial. It therefore does not have any roots in GF(2), but it is not irreducible over GF(2) because it is obviously a product of two lower-degree polynomials.

◊ Exercise D.4: Show that the polynomial $x^8 + 1$ with coefficients in GF(2) is reducible.
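The difference between "has no roots" and "is irreducible" can be checked by brute force for small polynomials over GF(2). The sketch below packs a polynomial into an integer (bit $i$ is the coefficient of $x^i$) and simply tries every possible divisor of lower degree. The function names are hypothetical, and trial division is an assumption made here for clarity; it is practical only for small degrees.

```python
def poly_mod2(dividend: int, divisor: int) -> int:
    """Remainder of a polynomial division over GF(2) (coefficients packed in ints)."""
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        dividend ^= divisor << (dividend.bit_length() - divisor_len)
    return dividend


def has_root_gf2(p: int) -> bool:
    """True if 0 or 1 is a root of p over GF(2)."""
    value_at_0 = p & 1                  # the constant term
    value_at_1 = bin(p).count("1") % 2  # sum of all coefficients mod 2
    return value_at_0 == 0 or value_at_1 == 0


def is_irreducible_gf2(p: int) -> bool:
    """Trial division by every polynomial of degree 1 .. deg(p)//2."""
    degree = p.bit_length() - 1
    if degree < 1:
        return False
    for d in range(2, 1 << (degree // 2 + 1)):
        if poly_mod2(p, d) == 0:
            return False
    return True


print(is_irreducible_gf2(0b101))    # x^2 + 1 = (x + 1)^2
print(is_irreducible_gf2(0b111))    # x^2 + x + 1
print(has_root_gf2(0b10101))        # (x^2 + x + 1)^2 = x^4 + x^2 + 1
print(is_irreducible_gf2(0b10101))
```

The last two calls reproduce the example in the text: $x^4 + x^2 + 1$ has no roots in GF(2) and is nevertheless reducible.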

No doubt this style and this efficiency were due to his peasant heredity. Perhaps also to the fact that manual work (whatever the demagogues may say) does not demand a veritable genius, since it is more difficult to extract a square root than a gorse root. -Marcel Pagnol, Jean de Florette

The simplest example of a Galois field of the form GF($q^m$) for $m > 1$ is GF($2^2$) = GF(4). Its elements are polynomials $a_1x + a_0$ over GF(2) (meaning, with coefficients that are 0 or 1). If we denote such an element by the two bits $a_1a_0$, then the four field elements are $0 = 00_2 = 0 \times x + 0$, $1 = 01_2 = 0 \times x + 1$, $2 = 10_2 = x + 0$, and $3 = 11_2 = x + 1$. If we now select the polynomial $x^2 + x + 1$, which is irreducible over GF(2), and multiply modulo this polynomial, then the two field operations become

 + | 0 1 2 3      × | 0 1 2 3
 0 | 0 1 2 3      0 | 0 0 0 0
 1 | 1 0 3 2      1 | 0 1 2 3
 2 | 2 3 0 1      2 | 0 2 3 1
 3 | 3 2 1 0      3 | 0 3 1 2

The multiplication table shows that $2 \times 2 = 3$. In polynomial notation, element 2 is the polynomial $x$ and element 3 is $x + 1$. This is why the product $x \times x$, which over the reals is $x^2$, equals $x + 1$ in GF(4). Notice that the multiplication table implies that $2^3 = (2 \times 2) \times 2 = 3 \times 2 = 1$, so we can consider element 2 a cube root of unity. Over the complex numbers, such a cube root is $(i\sqrt{3} - 1)/2$, which shows that the names 0, 1, 2, and 3 are arbitrary.

Choosing a different irreducible polynomial of degree $m$ produces a different multiplication table, but all the tables that can be generated in this way are isomorphic; they have the same essential structure in terms of the two operations and differ by the names of the field's elements. However, as we already know, the names are arbitrary.

◊ Exercise D.5: Explain why GF(6) does not exist.

Another simple example is GF($2^3$) = GF(8). Its elements are polynomials $a_2x^2 + a_1x + a_0$ with coefficients $a_i$ in GF(2) (i.e., bits). We denote such an element by the three bits $a_2a_1a_0$, so element $6 = 110_2$ is the polynomial $x^2 + x$. Addition is simple: the sum of $x^2 + 1$ and $x + 1$ is $x^2 + x + 1 + 1 = x^2 + x$. For multiplication, we select the irreducible polynomial $x^3 + x + 1$. The results are summarized in Table D.1.

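 + | 0 1 2 3 4 5 6 7      × | 0 1 2 3 4 5 6 7
 0 | 0 1 2 3 4 5 6 7      0 | 0 0 0 0 0 0 0 0
 1 | 1 0 3 2 5 4 7 6      1 | 0 1 2 3 4 5 6 7
 2 | 2 3 0 1 6 7 4 5      2 | 0 2 4 6 3 1 7 5
 3 | 3 2 1 0 7 6 5 4      3 | 0 3 6 5 7 4 1 2
 4 | 4 5 6 7 0 1 2 3      4 | 0 4 3 7 6 2 5 1
 5 | 5 4 7 6 1 0 3 2      5 | 0 5 1 4 2 7 3 6
 6 | 6 7 4 5 2 3 0 1      6 | 0 6 7 1 5 3 2 4
 7 | 7 6 5 4 3 2 1 0      7 | 0 7 5 2 1 6 4 3

Table D.1: Addition and Multiplication in GF(8).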
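The entries of Table D.1 can be reproduced by a short routine. The following sketch assumes the bit-packed representation used in the text (element 6 is the bit pattern 110) and the irreducible polynomial $x^3 + x + 1$ (bit pattern 1011); the name `gf_mul` and its parameters are hypothetical.

```python
def gf_mul(a: int, b: int, modulus: int = 0b1011, m: int = 3) -> int:
    """Multiply two elements of GF(2^m), reducing modulo an irreducible polynomial."""
    product = 0
    while b:
        if b & 1:          # this term of b is present: add (XOR) the shifted a
            product ^= a
        b >>= 1
        a <<= 1            # multiply a by x
        if a >> m:         # degree reached m: subtract (XOR) the modulus
            a ^= modulus
    return product


# Addition in GF(2^m) is a plain XOR of the bit patterns.
print(5 ^ 3)                                # (x^2 + 1) + (x + 1) = x^2 + x = 6
print(gf_mul(5, 3))                         # 4, as in the multiplication table
print([gf_mul(6, b) for b in range(8)])     # row 6 of the multiplication table
print(gf_mul(2, 2, modulus=0b111, m=2))     # 2 x 2 = 3 in GF(4)
```

The same routine covers GF(4) when called with the modulus $x^2 + x + 1$ and `m=2`.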

As an example, the GF(8) multiplication table indicates that $5 \times 3 = 4$, or in binary $101 \times 011 = 100$, or in polynomial notation $(x^2 + 1)(x + 1) = (x^3 + x^2 + x + 1) = x^2 \bmod (x^3 + x + 1)$. The modulo operation results in the remainder of the polynomial division $(x^3 + x^2 + x + 1)/(x^3 + x + 1)$.

◊ Exercise D.6: Choose some of the elements of the GF(4) and GF(8) multiplication tables and show how they are computed.

◊ Exercise D.7: List the additive and multiplicative inverses of the eight elements of GF(8).

The existence of the additive and multiplicative inverses makes it possible to subtract and divide field elements. To subtract $a - b$ just add $a$ to the additive inverse of $b$ (since the additive inverse of $b$ is $b$ itself, subtraction in GF(8) is identical to addition). To divide $a/b$, multiply $a$ by the multiplicative inverse of $b$.

The particular definition of multiplication in GF($q^m$) satisfies the requirements for a field. The product of two field elements is a polynomial of degree $m - 1$ or less, so it is an element of the field. The multiplication is associative and there is an identity element, namely the polynomial 1. In order to figure out the inverse of element $p(x)$, we denote by $m(x)$ the particular irreducible polynomial that we use for the multiplication and apply the extended Euclidean algorithm. This algorithm (page 10) finds two polynomials $a(x)$ and $b(x)$ such that $p(x)a(x) + m(x)b(x) = 1$. This implies that $a(x)p(x) \bmod m(x) = 1$ or $p^{-1}(x) = a(x) \bmod m(x)$. Section D.2 describes another approach to computing the multiplicative inverse (reciprocal) of a field element.
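The extended Euclidean algorithm mentioned above can be sketched for polynomials over GF(2) as follows. This is an illustration only: polynomials are packed into integers (bit $i$ is the coefficient of $x^i$), the helper names are hypothetical, and only the coefficient $a(x)$ of $p(x)$ is tracked because $b(x)$ is not needed for the inverse.

```python
def poly_divmod2(a: int, b: int):
    """Quotient and remainder of a / b over GF(2); b must be nonzero."""
    quotient = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        quotient ^= 1 << shift
        a ^= b << shift
    return quotient, a


def poly_mul2(a: int, b: int) -> int:
    """Product of two polynomials over GF(2), without any modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf_inverse(p: int, modulus: int) -> int:
    """Inverse of p modulo an irreducible polynomial, by extended Euclid."""
    # Invariant: r0 = s0 * p and r1 = s1 * p (both modulo the modulus).
    r0, r1 = modulus, p
    s0, s1 = 0, 1
    while r1 != 0:
        q, remainder = poly_divmod2(r0, r1)
        r0, r1 = r1, remainder
        s0, s1 = s1, s0 ^ poly_mul2(q, s1)
    if r0 != 1:
        raise ValueError("element has no inverse")
    return poly_divmod2(s0, modulus)[1]


print(gf_inverse(5, 0b1011))   # in GF(8), 5 x 2 = 1 (Table D.1)
print(gf_inverse(3, 0b1011))   # in GF(8), 3 x 6 = 1 (Table D.1)
```

Calling `gf_inverse(0, ...)` raises an error, which matches the fact that element zero has no reciprocal.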

In those days, my head was full of the romantic prose of E.T. Bell's Men of Mathematics, a collection of biographies of the great mathematicians. This is a splendid book for a young boy to read (unfortunately, there is not much in it to inspire a girl, with Sonya Kovalevsky allotted only half a chapter), and it has awakened many people of my generation to the beauties of mathematics. The most memorable chapter is called "Genius and Stupidity" and describes the life and death of the French mathematician Galois, who was killed in a duel at the age of twenty. ... "All night long he had spent the fleeting hours feverishly dashing off his scientific last will and testament, writing against time to glean a few of the great things in his teeming mind before the death he saw could overtake him. Time after time he broke off to scribble in the margin 'I have not time; I have not time,' and passed on to the next frantically scrawled outline. What he wrote in those last desperate hours before the dawn will keep generations of mathematicians busy for hundreds of years. He had found, once and for all, the true solution of a riddle which had tormented mathematicians for centuries: under what conditions can an equation be solved?"
-Freeman Dyson, Disturbing the Universe (1979)

The Exponential Representation of Galois Fields. We start with the simple field GF($q$) and define the order of a field element. Let $\beta$ be a nonzero element in GF($q$). The order of $\beta$ is denoted by ord($\beta$) and is defined as the smallest positive integer $m$ such that $\beta^m = 1$. It can be shown that if $t$ is the order of $\beta$ for some $\beta$ in GF($q$), then $t$ divides $(q - 1)$. An element with order $(q - 1)$ in GF($q$) is called a primitive element in GF($q$). Every field GF($q$) contains at least one primitive element $\alpha$. The elements of GF($q$) can be represented as zero followed by the $(q - 1)$ consecutive powers of any primitive element $\alpha$

$$0,\ \alpha,\ \alpha^2,\ \ldots,\ \alpha^{q-1} = 1.$$

This is the exponential representation of GF($q$). Notice that we don't have to know the value of any particular root $\alpha$. All we need is this particular sequence of powers of $\alpha$.

A simple example is element 2 of GF(3). The multiplication table of GF(3) shows that the smallest $n$ for which $2^n = 1$ is $n = 2 = 3 - 1$. Thus, element 2 is primitive and GF(3) can be represented as the set $(0, 2, 2^2 = 1)$. Another example is GF(5). Exercise D.1 shows that element 2 of GF(5) is primitive because the smallest $n$ for which $2^n = 1$ is $n = 4 = 5 - 1$. Hence, the exponential representation of GF(5) with respect to 2 is $(0, 2, 2^2 = 4, 2^3 = 3, 2^4 = 1)$.

◊ Exercise D.8: Show that 3 is also a primitive element of GF(5).

The exponential representation of Galois fields can be extended to fields GF($q^m$) where $m > 1$. An irreducible polynomial $p(x)$ of degree $m$ over GF($q$) is said to be primitive if the smallest positive integer $n$ for which $p(x)$ divides $x^n - 1$ is $n = q^m - 1$. It can be shown that the roots $\alpha_j$ of an $m$th-degree primitive polynomial $p(x)$ over GF($q$) have order $q^m - 1$. This implies that the roots $\alpha_j$ of $p(x)$ are primitive elements in GF($q^m$). The exponential representation of GF($q^m$) can therefore be constructed from any of these roots.

As an example, we show the construction of the exponential representation of GF($2^3$). The polynomial $p(x) = x^3 + x + 1$ is primitive over GF(2). Let $\alpha$ be any root of $p(x) = x^3 + x + 1$. From $\alpha^3 + \alpha + 1 = 0$ we get $\alpha^3 = \alpha + 1$ [this is done by adding $\alpha + 1$ to both sides, since in GF(2) $1 + 1 = 0$] and from this, the exponential representation of GF(8) can be constructed (Table D.2). The second column of the table is the power of $\alpha$. These are the field elements in the exponential representation (notice how element zero is termed 7 in this representation). The rightmost columns list the field elements in the polynomial representation.
The exponential representation listed in Table D.2 also makes it clear that the nonzero elements of any Galois field form a cyclic group.

 element   exp rep   polynomial representation
                     polynomial                    binary   decimal
 0         7         0                             000      0
 α^0       0         1                             001      1
 α^1       1         α                             010      2
 α^2       2         α^2                           100      4
 α^3       3         α + 1                         011      3
 α^4       4         α^2 + α                       110      6
 α^5       5         α^3 + α^2 = α^2 + α + 1       111      7
 α^6       6         α^2 + 1                       101      5
 α^7                 1                             001      1

Table D.2: Exponential and Polynomial Representations of GF(8).
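The construction of Table D.2 can be sketched in code by repeatedly multiplying by $\alpha$ (element 2, the polynomial $x$) modulo $x^3 + x + 1$. The names `gf8_mul`, `POWERS`, `LOG`, and `mul_via_exponents` are hypothetical; the sketch also shows how the exponents turn a multiplication into an addition modulo 7.

```python
def gf8_mul(a: int, b: int) -> int:
    """Multiply in GF(8) modulo x^3 + x + 1 (bit pattern 1011)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= 0b1011
    return product


ALPHA = 0b010                       # the polynomial x, a root of x^3 + x + 1
POWERS = [1]                        # POWERS[k] is alpha^k as a 3-bit element
for _ in range(6):
    POWERS.append(gf8_mul(POWERS[-1], ALPHA))
LOG = {element: k for k, element in enumerate(POWERS)}


def mul_via_exponents(a: int, b: int) -> int:
    """Multiply nonzero elements by adding their exponents modulo 7."""
    if a == 0 or b == 0:
        return 0
    return POWERS[(LOG[a] + LOG[b]) % 7]


print(POWERS)                       # decimal column of Table D.2, rows alpha^0..alpha^6
print(mul_via_exponents(6, 7))      # alpha^4 x alpha^5 = alpha^2, element 4
```

Element zero has no exponent in this sketch and is handled as a special case, whereas Table D.2 assigns it the label 7.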

Which representation is better? The exponential representation (the second column of Table D.2) is useful for multiplication. Adding two elements in this column (modulo 7) produces their product. Thus, $\alpha^4 \times \alpha^5 = \alpha^{9 \bmod (2^3 - 1)} = \alpha^2$. The polynomial representation (the rightmost three columns of Table D.2) is useful for addition. Thus, adding elements 4 and 7 (a bitwise XOR of 100 and 111) produces 011 = 3. Notice that the sum (i.e., the XOR) of all the field elements is zero. This is a general result. Notice also how all the powers of $\alpha$ are expressed in terms of $\alpha^0 = 1$, $\alpha^1 = \alpha$, and $\alpha^2$. These three powers of $\alpha$ are the basis for the polynomial representation of GF(8).

A direct check using Table D.1 shows that elements 2, 4, and 6 of GF(8) are primitive. Each can be the $\alpha$ of Table D.2.

◊ Exercise D.9: Show that elements 2 and 3 of GF(4) are primitive elements of this field.

◊ Exercise D.10: Given that the polynomial $x^4 + x^3 + 1$ is primitive over GF(2), construct the exponential representation of GF($2^4$) = GF(16).

Any root $\alpha$ is therefore a generator of a finite field. A generator is defined as an element whose successive powers take on every element of the field except the zero. It is possible to check every field element for this property, but this is time consuming. For example, we can test elements of GF(7) by computing successive powers modulo 7 of each nonzero element. It is clear that element 1 cannot be a generator. Successive powers of 2 modulo 7 produce $2$, $2^2 = 4$, $2^3 = 1$, but $2^4 = 2$, implying that $2^5$ will be 4, same as $2^2$, so 2 is not a generator. Next, we try element 3. Its successive powers taken modulo 7 are $3$, $3^2 = 2$, $3^3 = 6$, $3^4 = 4$, $3^5 = 5$, and $3^6 = 1$, which establishes 3 as a generator of this field.

The following discussion attempts to shed light on the nature of the elements of GF($q^m$) and on the mysterious $\alpha$. Perhaps the best way to understand finite fields and their elements is to consider algebraic equations of various degrees (Galois himself developed the concepts of groups and fields when trying to answer the question, "Under what conditions does an equation have a solution?").

Consider the linear (degree-1) equation $2x - 1 = 0$. Its coefficients are integers, but its solution is not. It is the rational number 1/2. Similarly, the quadratic equation $x^2 - 2 = 0$ has the irrational solution $x = \sqrt{2}$. Continuing along the same line, we examine the quadratic equation $x^2 + 1 = 0$. Its coefficients are 0 and 1 (the coefficient of $x$ is zero). If we consider the coefficients real numbers, then the solutions are $x^2 = -1$ or $x = \pm\sqrt{-1}$. There is no real number whose square is $-1$, so we extend the concept of number and construct the field of complex numbers. We can say that when the equation $x^2 + 1 = 0$ is over the reals, its solutions are over the field of complex numbers.
Alternatively, we can say that the base field of our equation is the reals and the extension field is the complex numbers. This shows that the solutions of an equation may sometimes lie in a field different from that of the coefficients. Thus, in order to solve an equation, we sometimes have to extend the concept of numbers and develop new types of mathematical entities.

Next, we consider the equation $x^2 + x + 1 = 0$. When we assume its coefficients to be over the reals, the solutions are the complex numbers $(-1 \pm i\sqrt{3})/2$. They are obtained by the well-known general solution of the quadratic equation. However, when we consider the coefficients elements of GF(2), we have to use GF(2) arithmetic to solve it. It is easy to see that no element of GF(2) is a solution. Trying $x = 0$ produces $0 \times 0 + 0 + 1 = 0$ and trying $x = 1$ yields $1 \times 1 + 1 + 1 = 0$, both contradictions. Thus, we realize that the solutions are not in GF(2) and we have to extend our concept of a field.

We therefore denote one of the two (unknown) solutions by $\alpha$ and observe that $\alpha$ satisfies $\alpha^2 + \alpha + 1 = 0$ or $\alpha^2 = \alpha + 1$. We still don't know what mathematical entity $\alpha$ is, but we know that (1) $\alpha$ is neither 0 nor 1, since neither of those elements of GF(2) is a solution to our equation and (2) that the two solutions are $\alpha$ and $\alpha^2$ [the latter is a solution because $(\alpha^2)^2 = (\alpha + 1)^2 = \alpha^2 + 1 = \alpha$, so $(\alpha^2)^2 + \alpha^2 + 1 = \alpha + (\alpha + 1) + 1 = (1 + 1)\alpha + 1 + 1 = 0$].

We don't know how to express $\alpha$ in terms of real or complex numbers. We don't even know if this is possible. However, we also don't "know" what $\sqrt{-1}$ is; it also cannot be expressed in terms of elements of "simpler" fields. We simply accept the "existence" of $\sqrt{-1}$ and use it to perform calculations. In much the same way, we can accept the existence of $\alpha$ and use it to denote elements of finite fields. The entire finite field GF($2^2$) can now be constructed as the 4-tuple $(0, 1, \alpha, \alpha^2)$. Clearly, elements 0 and 1 are needed; they are the identities for the two operations. Elements $\alpha$ and $\alpha^2$ complete the field because higher powers of $\alpha$ reduce to 1, $\alpha$, or $\alpha^2$.

D.2 GF(256) and Rijndael

The Rijndael algorithm (Section 7.7) employs several transformations in each of its rounds. The first transformation of a round is byte substitution, which depends on computing the multiplicative inverses (reciprocals) of the elements of GF($2^8$). This section presents two approaches to computing the 255 nonzero reciprocals of this field.

The first approach employs a method that echoes the use of logarithms. The approach is based on the simple polynomial $x + 1$, which can also be denoted by $03_{16}$. This happens to be the simplest generator polynomial for GF(256), so successive powers of this polynomial generate all the 255 nonzero elements of the field. Table D.3 lists those powers. For any byte $rc$, row $r$ and column $c$ of the table contain $(x + 1)^{rc}$. The table shows, for example, that $(x + 1)^{73}$ is the polynomial 7d = 01111101 or $x^6 + x^5 + x^4 + x^3 + x^2 + 1$. This table is easy to compute by appropriate software and it can be considered the equivalent of a table of antilogarithms.

      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
 0   01 03 05 0f 11 33 55 ff 1a 2e 72 96 a1 f8 13 35
 1   5f e1 38 48 d8 73 95 a4 f7 02 06 0a 1e 22 66 aa
 2   e5 34 5c e4 37 59 eb 26 6a be d9 70 90 ab e6 31
 3   53 f5 04 0c 14 3c 44 cc 4f d1 68 b8 d3 6e b2 cd
 4   4c d4 67 a9 e0 3b 4d d7 62 a6 f1 08 18 28 78 88
 5   83 9e b9 d0 6b bd dc 7f 81 98 b3 ce 49 db 76 9a
 6   b5 c4 57 f9 10 30 50 f0 0b 1d 27 69 bb d6 61 a3
 7   fe 19 2b 7d 87 92 ad ec 2f 71 93 ae e9 20 60 a0
 8   fb 16 3a 4e d2 6d b7 c2 5d e7 32 56 fa 15 3f 41
 9   c3 5e e2 3d 47 c9 40 c0 5b ed 2c 74 9c bf da 75
 a   9f ba d5 64 ac ef 2a 7e 82 9d bc df 7a 8e 89 80
 b   9b b6 c1 58 e8 23 65 af ea 25 6f b1 c8 43 c5 54
 c   fc 1f 21 63 a5 f4 07 09 1b 2d 77 99 b0 cb 46 ca
 d   45 cf 4a de 79 8b 86 91 a8 e3 3e 42 c6 51 f3 0e
 e   12 36 5a ee 29 7b 8d 8c 8f 8a 85 94 a7 f2 0d 17
 f   39 4b dd 7c 84 97 a2 fd 1c 24 6c b4 c7 52 f6 01

Table D.3: Exponents $(x + 1)^{rc}$ in GF($2^8$).

The opposite (or inverse) values are listed in Table D.4, which is denoted by $L$ and can be considered a table of "logarithms." Entry $L(rc)$ is the byte that satisfies $rc = (x + 1)^{L(rc)}$. For example, entry 76 of $L$ is 5e, so entry 5e of Table D.3 is 76.

      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
 0      00 19 01 32 02 1a c6 4b c7 1b 68 33 ee df 03
 1   64 04 e0 0e 34 8d 81 ef 4c 71 08 c8 f8 69 1c c1
 2   7d c2 1d b5 f9 b9 27 6a 4d e4 a6 72 9a c9 09 78
 3   65 2f 8a 05 21 0f e1 24 12 f0 82 45 35 93 da 8e
 4   96 8f db bd 36 d0 ce 94 13 5c d2 f1 40 46 83 38
 5   66 dd fd 30 bf 06 8b 62 b3 25 e2 98 22 88 91 10
 6   7e 6e 48 c3 a3 b6 1e 42 3a 6b 28 54 fa 85 3d ba
 7   2b 79 0a 15 9b 9f 5e ca 4e d4 ac e5 f3 73 a7 57
 8   af 58 a8 50 f4 ea d6 74 4f ae e9 d5 e7 e6 ad e8
 9   2c d7 75 7a eb 16 0b f5 59 cb 5f b0 9c a9 51 a0
 a   7f 0c f6 6f 17 c4 49 ec d8 43 1f 2d a4 76 7b b7
 b   cc bb 3e 5a fb 60 b1 86 3b 52 a1 6c aa 55 29 9d
 c   97 b2 87 90 61 be dc fc bc 95 cf cd 37 3f 5b d1
 d   53 39 84 3c 41 a2 6d 47 14 2a 9e 5d 56 f2 d3 ab
 e   44 11 92 d9 23 20 2e 89 b4 7c b8 26 77 99 e3 a5
 f   67 4a ed de c5 31 fe 18 0d 63 8c 80 c0 f7 70 07

Table D.4: Logarithms $rc = (x + 1)^{L(rc)}$ in GF($2^8$).

The two tables can be used to multiply elements of GF(256). In order to multiply b6 × 53 we first look up $L$(b6) = b1 and $L$(53) = 30, then add the two logarithms modulo 255 (the number of nonzero field elements) to obtain b1 + 30 = e1, and finally select the entry in row e column 1 of the antilogarithm table. This entry is 36, implying that b6 × 53 = 36, or $03^{b1} \times 03^{30} = 03^{e1} = 36$.

Once the use of this "version" of a logarithm table is clear, it is easy to see how it can be exploited for finding reciprocals in GF(256). The principle is that the inverse of $(x + 1)^{rc}$ is $(x + 1)^{ff - rc}$. Thus, to find the reciprocal of 6b we (1) use the "logarithm" table to find out that 6b = $(x + 1)^{54}$; (2) compute the reciprocal as $(x + 1)^{ff - 54} = (x + 1)^{ab}$; (3) use the antilogarithm table to find that the element at row a column b is df. Hence df is the reciprocal of 6b in GF(256). Table D.5 lists all the 255 reciprocals (element zero does not have a reciprocal and in Rijndael it is considered its own inverse).

      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
 0      01 8d f6 cb 52 7b d1 e8 4f 29 c0 b0 e1 e5 c7
 1   74 b4 aa 4b 99 2b 60 5f 58 3f fd cc ff 40 ee b2
 2   3a 6e 5a f1 55 4d a8 c9 c1 0a 98 15 30 44 a2 c2
 3   2c 45 92 6c f3 39 66 42 f2 35 20 6f 77 bb 59 19
 4   1d fe 37 67 2d 31 f5 69 a7 64 ab 13 54 25 e9 09
 5   ed 5c 05 ca 4c 24 87 bf 18 3e 22 f0 51 ec 61 17
 6   16 5e af d3 49 a6 36 43 f4 47 91 df 33 93 21 3b
 7   79 b7 97 85 10 b5 ba 3c b6 70 d0 06 a1 fa 81 82
 8   83 7e 7f 80 96 73 be 56 9b 9e 95 d9 f7 02 b9 a4
 9   de 6a 32 6d d8 8a 84 72 2a 14 9f 88 f9 dc 89 9a
 a   fb 7c 2e c3 8f b8 65 48 26 c8 12 4a ce e7 d2 62
 b   0c e0 1f ef 11 75 78 71 a5 8e 76 3d bd bc 86 57
 c   0b 28 2f a3 da d4 e4 0f a9 27 53 04 1b fc ac e6
 d   7a 07 ae 63 c5 db e2 ea 94 8b c4 d5 9d f8 90 6b
 e   b1 0d d6 eb c6 0e cf ad 08 4e d7 e3 5d 50 1e b3
 f   5b 23 38 34 68 46 03 8c dd 9c 7d a0 cd 1a 41 1c

Table D.5: Reciprocals of GF($2^8$).

Rijndael also employs the simple operation of multiplying an element of GF(256), which is a polynomial of degree 7 or less, by the field element $x$, which is the polynomial 02 (this operation is used in the subkeys computation). We now show how this operation can be implemented by a left shift that's sometimes followed by an XOR. We denote the general degree-7 polynomial $b_7x^7 + b_6x^6 + \cdots + b_1x + b_0$ by the byte $b_7b_6 \ldots b_1b_0$. When this is multiplied by $x$, it results in the degree-8 polynomial $b_7x^8 + b_6x^7 + \cdots + b_1x^2 + b_0x$ that's denoted by the nine bits $b_7b_6 \ldots b_1b_00$. If $b_7 = 0$, then the product is the byte $b_6 \ldots b_1b_00$. This is the original byte shifted to the left one position. If $b_7 = 1$, then the irreducible polynomial $m(x)$ that's used in Rijndael should be subtracted to reduce the result to a degree-7 polynomial. The specific $m(x)$ used in Rijndael is $x^8 + x^4 + x^3 + x + 1$ (11b in hexadecimal), whose low eight bits are 1b, so when $b_7 = 1$, a multiplication by $x$ is done by a left shift of $b_7b_6 \ldots b_1b_0$ to $b_6 \ldots b_1b_00$, followed by an XOR with 1b = $00011011_2$. In Rijndael, this computation is denoted by a = xtime(b) and it can be used to multiply an element of GF(256) by any other element, not just 02. We first show how element 57 can be multiplied by elements that are powers of 02.

57 × 02 = xtime(57) = ae,
57 × 04 = xtime(ae) = 47,
57 × 08 = xtime(47) = 8e,
57 × 10 = xtime(8e) = 07.

From 13 = 01 ⊕ 02 ⊕ 10, we can easily compute the product

57 × 13 = 57 × (01 ⊕ 02 ⊕ 10) = 57 ⊕ ae ⊕ 07 = fe.
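The xtime operation, general multiplication built from it, and the logarithm-based approach to reciprocals can be sketched together as follows. This is an illustration under stated assumptions, not the Rijndael reference code: the names `gf256_mul`, `ANTILOG`, `LOG`, `mul_via_logs`, and `reciprocal` are hypothetical, and the tables are generated by repeated multiplication by 03 rather than copied from Tables D.3 and D.4.

```python
def xtime(b: int) -> int:
    """Multiply a byte by x (02): left shift, then reduce if bit 8 is set."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B        # x^8 + x^4 + x^3 + x + 1; clears bit 8 and XORs in 1b
    return b


def gf256_mul(a: int, b: int) -> int:
    """Multiply two bytes by XORing the xtime multiples selected by the bits of b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


# ANTILOG[k] = 03^k for k = 0..254; multiplying by 03 = 02 XOR 01 is xtime(v) ^ v.
ANTILOG = [1]
for _ in range(254):
    previous = ANTILOG[-1]
    ANTILOG.append(xtime(previous) ^ previous)
LOG = {value: k for k, value in enumerate(ANTILOG)}


def mul_via_logs(a: int, b: int) -> int:
    """Multiply by adding logarithms modulo 255."""
    if a == 0 or b == 0:
        return 0
    return ANTILOG[(LOG[a] + LOG[b]) % 255]


def reciprocal(a: int) -> int:
    """Inverse of 03^k is 03^(255 - k); zero is mapped to itself, as in Rijndael."""
    if a == 0:
        return 0
    return ANTILOG[(255 - LOG[a]) % 255]


print(hex(xtime(0x57)))               # 0xae
print(hex(gf256_mul(0x57, 0x13)))     # 0xfe
print(hex(mul_via_logs(0xB6, 0x53)))  # 0x36
print(hex(reciprocal(0x6B)))          # 0xdf
```

The four printed values reproduce the worked examples of this section.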

The second approach to computing the multiplicative inverses (reciprocals) of the elements of GF($2^8$) is due to [Rijmen 01]. It uses the simple relation that exists between the elements of this field and those of GF($2^4$). The former Galois field has 256 elements and the latter has 16. Since all these elements are polynomials, we can express an element of GF($2^8$) as the degree-1 polynomial $bx + c$ where $b$ and $c$ are elements of GF($2^4$). Since each of $b$ and $c$ can have 16 values, the polynomial $bx + c$ can have 256 values and is therefore a general element of GF($2^8$). For the irreducible polynomial we select $x^2 + Ax + B$, where $A$ and $B$ are elements of GF($2^4$) to be determined later. We denote the inverse of $bx + c$ by $\alpha x + \beta$ and expand the basic relation

$$(bx + c)(\alpha x + \beta) = 1 \bmod (x^2 + Ax + B)$$

to obtain

$$b\alpha x^2 + (b\beta + c\alpha)x + \beta c = 1 \bmod (x^2 + Ax + B). \tag{D.1}$$

It is obvious that $x^2 + Ax + B$ modulo itself is zero, so any multiple of $x^2 + Ax + B$ modulo itself is also zero. Thus, we can write

$$b\alpha(x^2 + Ax + B) = 0 \bmod (x^2 + Ax + B) \tag{D.2}$$

and subtract Equation (D.2) from Equation (D.1) to obtain

$$(b\beta + c\alpha - b\alpha A)x + \beta c - b\alpha B = 1.$$

This relation has to hold for any value of $x$, so we conclude that $(c - bA)\alpha + b\beta = 0$ and $-bB\alpha + c\beta = 1$. The solutions are $\alpha = b(b^2B + bcA + c^2)^{-1}$ and $\beta = (c + bA)(b^2B + bcA + c^2)^{-1}$. These solutions imply that computing the inverse of an element of GF($2^8$) involves multiplications, additions, and squaring of elements of GF($2^4$) and finding the inverses of such elements. The latter problem is trivial, as there are only 16 such elements and their inverses can be stored in a table. Multiplying by element $A$ can be simplified if we choose it to be the multiplicative unity (element 1111). Similarly, multiplying by $B$ can be converted into a shift if we select it as element 0001. Figure D.6 is a schematic diagram that can be implemented by hardware or software.

Figure D.6: Computing the Inverse in GF($2^8$).

D.3 Polynomial Arithmetic

This section describes the four arithmetic operations on polynomials, especially division, which is needed to compute one polynomial modulo another.

Polynomial Addition/Subtraction. Adding two polynomials is done by adding corresponding coefficients. Thus, adding $P(x) = \sum_{i=0}^{m} a_ix^i$ and $Q(x) = \sum_{i=0}^{n} b_ix^i$ is done by adding $(a_i + b_i)$. Subtraction is done similarly by subtracting the coefficients (subtraction is defined over the reals, but in general, a field has only addition and multiplication defined). A simple example is the sum $(5x^2 + 3x - 2) + (-x^3 + x^2 + 7)$ which, over the reals, equals $-x^3 + 6x^2 + 3x + 5$. It is clear that the degree of the polynomial sum is at most $\max(m, n)$, and equals it unless the leading terms cancel.

Polynomial Multiplication. Multiplying two polynomials $P$ and $Q$ is done by multiplying every coefficient $a_i$ in $P$ by every coefficient $b_j$ in $Q$. A simple example serves to make this clear

(x3 - 3x + 4)( _x2 + 2x + 1) = x 3 ( _x2 + 2x + 1) - 3x( _x2 + 2x + 1) + 4( _x2 + 2x + 1) = (_x5 + 2x4 + x 3 ) + (3x3 - 6x2 - 3x) + (_4x2 + 8x + 4) = _x5 + 2X4 + 4x3 - 10x2 + 5x + 4.

The degree of the product polynomial is the sum of the degrees of the multiplied polyno• mials. [Notice that this example is done over the reals. When done over a different field, the rules may be different. When polynomials are multiplied over GF(2), for example, the arithmetic rule 1 + 1 = 0 applies.] Polynomial Division. Dividing two integers produces a quotient and a remainder. If m and n are integers, then m mod n is the remainder of the integer division m 7 n and is therefore in the range [0, n - 1]. Similarly, if P and Q are polynomials, then the polynomial division P7Q produces a quotient polynomial and a remainder polynomial. The latter is denoted by P mod Q and its degree is less than that of Q. We illustrate polynomial division with an example. We use the compact notation (8,5,4,1,0) for the polynomial x 8 +x5 +x4+x+1 and show the steps of dividing P = (13,11,9,8,6,5,4,3,0) by Q = (8,4,3,1,0).

Step 1: Divide x 13 j x 8 to obtain x 5 . This is the highest term of the quotient polynomial. Step 2: Multiply (5) x (8, 4, 3,1,0) to obtain (13,9,8,6,5). Step 3: Add modulo 2 (i.e., XOR) (13,11,9,8,6,5,4,3,0) and (13,9,8,6,5) to ob• tain (11,4,3,0). Repeat the three steps for this polynomial. Step 4: Divide x ll jx8 to obtain x 3 . This is the second term of the quotient poly• nomial. Step 5: Multiply (3) x (8, 4, 3,1,0) to obtain (11,7,6,4,3). Step 6: XOR (11,4,3,0) and (11,7,6,4,3) to obtain (7,6,0). This is the final result P mod Q, since the next step would have to divide x 7 by x 8 . 400 Appendix 0 Galois Fields
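The six steps of the division example can be sketched in code. The following is a minimal Python sketch, not part of the original text, that assumes a polynomial over GF(2) is stored as an integer whose bit $i$ is the coefficient of $x^i$. The function names are our own choice.

```python
def poly_from_exponents(exps):
    """Pack a compact exponent list such as (8, 4, 3, 1, 0) into a bit mask."""
    value = 0
    for e in exps:
        value ^= 1 << e
    return value


def exponents(value):
    """Unpack a bit mask into a tuple of exponents, highest first."""
    return tuple(i for i in range(value.bit_length() - 1, -1, -1) if (value >> i) & 1)


def gf2_divmod(p, q):
    """Return (quotient, remainder) of p / q with coefficients in GF(2)."""
    if q == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient = 0
    while p.bit_length() >= q.bit_length():
        shift = p.bit_length() - q.bit_length()  # degree of the next quotient term
        quotient ^= 1 << shift
        p ^= q << shift                          # subtracting is XOR in GF(2)
    return quotient, p


P = poly_from_exponents((13, 11, 9, 8, 6, 5, 4, 3, 0))
Q = poly_from_exponents((8, 4, 3, 1, 0))
quot, rem = gf2_divmod(P, Q)
print(exponents(quot), exponents(rem))  # (5, 3) (7, 6, 0)
```

Each pass through the loop performs one divide, multiply, and XOR round of the kind listed in Steps 1 through 3.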

In Galois Fields, full of flowers
primitive elements dance for hours
climbing sequentially through the trees
and shouting occasional parities.
The syndromes like ghosts in the misty damp
feed the smoldering fires of the Berlekamp
and high flying exponents sometimes are downed
on the jagged peaks of the Gilbert bound.
-S. B. Weinstein, IEEE Transactions on Information Theory (1971)

Exercise D.11: Compute the three polynomial divisions (quotients and remainders): $(x^5 + x^2 + x + 1)/(x^2 + 1)$, $(x^5 + x^2 + 1)/(x^2 + 1)$, and $(x^4 + x^3 + x)/(x^4 + 1)$. Consider the coefficients elements of GF(2) and add them modulo 2.

In those innumerable glowing fires,-in those infinite fields of light which surround them, and which neither storms nor darkness can extinguish, is there nothing but empty space and an eternal void?
-Bernardin de Saint Pierre, Paul and Virginia

Answers to Exercises

Note to Internet friends: I'm extremely grateful that hundreds of you have taken time to read these drafts, and to detect and report errors that you've found. Your comments have improved the material enormously. But I must confess that I'm also disappointed to have had absolutely no feedback so far on several of the exercises on which I worked hardest when I was preparing this material. Could it be that (1) you've said nothing about them because I somehow managed to get the details perfect? Or is it that (2) you shy away from the more difficult stuff, being unable to spend more than a few minutes on any particular topic? Although I do not like to think that readers are lazy, I fear that hypothesis (1) is far less likely than hypothesis (2). Thus I would like to enter here a plea for some readers to tell me explicitly, "Dear Don, I have read exercise N and its answer very carefully, and I believe ...."
From http://Sunburn.Stanford.EDU/~knuth/news.html

1: We know that $m$ can be one of the 12 numbers 1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, and 25. These numbers are of the form $2n + 1$ for certain nonnegative integers $n$, but $m = 2n + 1$ implies $(m - 1)/2 = n$, so $(m - 1)/2$ is a nonnegative integer. From this we conclude that

$$13m \bmod 26 = \bigl(13 + 13(m - 1)\bigr) \bmod 26 = \Bigl(13 + 26\,\frac{m - 1}{2}\Bigr) \bmod 26 = (13 + 26n) \bmod 26 = 13.$$

We therefore conclude that any multiplicative cipher transforms "n" into "N."

2: For $m = 1$, there are 25 such keys, because $a = 0$ is the only value that results in a fixed point. For the eleven values $m > 1$, odd values of $a$ result in no fixed points. There are 13 such values, so the total number of no-fixed-point affine ciphers is $25 + 11 \times 13 = 168$. To see why odd values of $a$ have this property, we observe that a fixed point, i.e., the case $x = x \cdot m + a \bmod 26$, is equivalent to $x(m - 1) = -a \bmod 26$. Since $m$ is relatively prime to 26, it is odd, implying that $m - 1$, and therefore also the left-hand side $x(m - 1)$, is even. In order for a solution to exist, the right-hand side must also be even. If the right-hand side (i.e., $a$) is odd, there are no solutions to the fixed-point equation $x = x \cdot m + a \bmod 26$, so there are no fixed points for those keys.
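The count of 168 can be confirmed by brute force. The following Python sketch is our own illustration, with hypothetical function names. It tries every affine key $(m, a)$ with $m$ relatively prime to 26 and counts those that leave no letter in place.

```python
from math import gcd


def has_fixed_point(m, a, n=26):
    """True if some letter x satisfies x = x*m + a (mod n)."""
    return any((x * m + a) % n == x for x in range(n))


def count_keys_without_fixed_points(n=26):
    """Count affine keys (m, a), m relatively prime to n, that move every letter."""
    count = 0
    for m in range(1, n):
        if gcd(m, n) != 1:
            continue  # not a valid multiplier
        for a in range(n):
            if not has_fixed_point(m, a, n):
                count += 1
    return count


print(count_keys_without_fixed_points())  # 168
```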

3: The inverse of y = x·23 + 7 mod 126 is

$$x = 23^{-1}(y - 7) \bmod 126 = (23^{-1} \bmod 126)(y - 7) \bmod 126 = 11(y - 7) \bmod 126 = (11y - 77) \bmod 126 = (11y + 49) \bmod 126.$$
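The inverse can be checked numerically. This short Python sketch (an illustration with our own function names) relies on the built-in `pow` with a negative exponent, available in Python 3.8 and later, to obtain the modular inverse of 23.

```python
def encrypt(x, m=23, a=7, n=126):
    """Affine encryption y = x*m + a (mod n)."""
    return (x * m + a) % n


def decrypt(y, m=23, a=7, n=126):
    """Affine decryption x = m^(-1) * (y - a) (mod n)."""
    m_inv = pow(m, -1, n)  # modular inverse, Python 3.8+
    return (m_inv * (y - a)) % n


print(pow(23, -1, 126))                                        # 11
print(all(decrypt(encrypt(x)) == x for x in range(126)))      # True
print(all(decrypt(y) == (11 * y + 49) % 126 for y in range(126)))  # True
```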

4: Decryption is the opposite of encryption. Cipherletter $c_i$ is decrypted using keyletter $k_i$ by rotating the alphabet such that it starts with $k_i$, and then selecting the letter found at position $c_i$. For example, if $c_i$ is the letter p and $k_i$ is d, then the alphabet is rotated three positions to the left to become defghijklmnopqrstuvwxyzabc and p is decrypted by locating it in the rotated alphabet and selecting m (the letter that occupies the same position in the original alphabet).

5: The number of 64-bit keys is $2^{64}$ = 18,446,744,073,709,551,616 or approximately $1.8 \times 10^{19}$. The following examples illustrate the magnitude of this key space.
1. $2^{64}$ seconds equal 584,942,417,355 years.
2. The unit of electrical current is the Ampere. One Ampere is defined as $6.24 \times 10^{18}$ electrons per second. Even this huge number is smaller than $2^{64}$.
3. Even light, traveling (in vacuum) at 299,792,458 m/s, takes 61,531,714,963 seconds (about 1,951 years) to cover $2^{64}$ meters. This distance is therefore about 1,951 light years.
4. In a fast, 5 GHz computer, the clock ticks five billion times per second. In one year, the clock ticks $5 \cdot 10^9 \cdot (3 \cdot 10^7) = 1.5 \cdot 10^{17}$ times.
5. The mass of the sun is roughly $2 \cdot 10^{31}$ kg and the mass of a single proton is approximately $1.67 \cdot 10^{-27}$ kg. There are therefore approximately $10^{58}$ protons in the sun. This number is about $2^{193}$, so searching a keyspace of 193 bits is equivalent to trying to find a single proton in the sun (notwithstanding the fact that all protons are identical and that the sun is not all protons and is hot). The proverbial needle in a haystack problem pales in comparison.
6. The term "femto" stands for $10^{-15}$. Thus, a femtometer is $10^{-15}$ m, and a cubic femtometer is $10^{-45}$ cubic meters, an incredibly small unit of volume. A light year is $10^{16}$ meters, so assuming that the universe is a sphere of radius 15 billion light years, its volume is $(4/3)\pi(15 \times 10^9 \times 10^{16})^3 = 1.41372 \times 10^{79}$ cubic meters or about $10^{124}$ cubic femtometers. This is roughly $2^{411}$, so searching a keyspace of 411 bits is like trying to locate a particular cubic femtometer in the entire universe.
These examples illustrate the power of large numbers and should convince any rational person that breaking a code by searching the entire key space is an illusion.

As for the claim that "there is a chance that the first key tried will be the right one," for a 64-bit keyspace this chance is $2^{-64}$. To get a feeling for how small this number is, consider that light travels $1.6 \times 10^{-11}$ meters (about the size of 10 atoms laid side by side) in $2^{-64}$ seconds.

1.1: See, for example, [Gaines 56].

1.2: Follow each letter in the key polybiuscher with its first successor that's still not included in the key. Thus, p should be followed by q and o should be followed by p, but because p is already included in the key (as are q, r, and s), the o is followed by t. This process produces first the 22-letter string pqotlmyzbcikuvswhnefrx which is then extended in the same way to become the 25-letter string paqdogtlmyzbcikuvswhnefrx.

1.3: FO → MF, LX → PU, LO → SM, WM → HL, EX → NE, EA → AT, YX → ZY. The encryption is FOLLOWMEEARLY → MFPUSMHLNEATZY.

1.4: An integer $N$ in the range $[a, b]$ can be converted to an integer in the range $[c, d]$ by the transformation

$$\mathrm{round}\Bigl((N - a)\,\frac{d - c}{b - a} + c\Bigr).$$

A simpler method is to use a generator that generates random real numbers $R$ in the range [0,1]. For each $R$, the value $\lfloor 12 \times R \rfloor$ is examined. If it is in the right range (in the interval [1,3] for a D), then it is used, otherwise another random $R$ is generated and examined.

2.1: A space-filling curve completely fills up a square (or in general, part of a multidimensional space) by passing through every point in it. It does that by changing direction repeatedly. Figure Ans.1a-c shows examples of the well-known Hilbert, Sierpinski, and Peano curves. It is obvious that any square can be completely scanned by such a curve. Each space-filling curve is defined recursively and can be refined to fill a square grid of any size.

Figure Ans.1: The Hilbert, Sierpinski, and Peano Curves.

2.2: Collection can be done by diagonals, zigzags, or a spiral, as suggested by Figure 2.3. Collecting the plaintext of Figure 2.5 by diagonals from top-right to bottom-left results in the ciphertext BEJAIDHOCGNRFKQMPL. (See also Exercise 2.1.) There are, of course, many other ways to scan a square, such as going down the first column, up the second column, and alternating this way.

2.3: A transposition method encrypts by a permutation and the result of two consecutive permutations is another permutation. Thus, just combining several transposition methods does not, by itself, increase security. A combination of transposition ciphers may be more secure than any of its individual methods if the methods being combined use keys. A combination of several methods requires several keys, and the security provided by such a combination may be equivalent to that provided by a long key. Also, combining a transposition method and a substitution method (such as in Section 2.6) may result in improved encryption.

2.4: This is trivial. The Caesar shift of one position results in the simple permutation

abcdefghijklmnopqrstuvwxyz
BCDEFGHIJKLMNOPQRSTUVWXYZA

that obviously has one cycle.

2.5: The three groups are BFIKMRV03, DLNQU2579, and GJOSXZ148.

2.6: In an 8×8 template there should be 8·8/4 = 16 holes. The template is written as four 4×4 small templates, and each of the 16 holes can be selected in four ways. The total number of hole configurations is therefore $4^{16}$ = 4,294,967,296.

2.7: The letter D is the fourth one in the alphabet, implying that the template size should be 4×4. Of its 16 squares, only 16/4 = 4 should be holes. The first four letters of the key are DOGO, so they produce the numeric string 1324. The resulting template is shown in Figure Ans.2.

Figure Ans.2: A 4×4 Turning Template.

2.8: For 12 November 2001, the weighted sum is

$$50 \cdot 1 + 51 \cdot 2 + 52 \cdot 1 + 53 \cdot 1 + 54 \cdot 0 + 55 \cdot 1 = 312$$

and $312 \bmod 190 = 122$. Thus, the page number is 123.

2.9: The second key is 6 letters long, so the initial rectangle has six columns. The length of the ciphertext is 32 letters. The quotient of 32 ÷ 6 is 5, so the rectangle has 5 + 1 = 6 rows. The remainder is 2, so the first two columns are full (six rows each) and the remaining four columns have five rows each. The second key is TRIPLE, which corresponds to the numeric sequence 652431. The ciphertext starts with the 5-letter string thbnc that's written into the last column (whose number is 1). This column has just five rows. The ciphertext continues with the 5-letter string rtttn that's placed in the third column (the one labeled 2), and so on. After six steps, the rectangle looks as in Figure 2.10b, and a similar process ends up with the rectangle of Figure 2.10a. Reading this rectangle in rows yields the plaintext.
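The conversion of a key to a numeric sequence, used in answers 2.7 and 2.9, can be sketched as follows. This Python fragment is our own illustration. It assumes that letters are ranked alphabetically and that repeated letters are ranked from left to right, an assumption that agrees with the sequences 1324 and 652431 quoted in these answers. The helper `column_lengths` is likewise hypothetical.

```python
def key_to_sequence(key):
    """Rank the letters of key alphabetically; equal letters rank left to right."""
    order = sorted(range(len(key)), key=lambda i: (key[i], i))
    ranks = [0] * len(key)
    for rank, position in enumerate(order, start=1):
        ranks[position] = rank
    return ranks


def column_lengths(text_length, columns):
    """Rows per column when text_length letters fill the rectangle row by row."""
    full_rows, extra = divmod(text_length, columns)
    return [full_rows + 1 if c < extra else full_rows for c in range(columns)]


print(key_to_sequence("DOGO"))    # [1, 3, 2, 4]
print(key_to_sequence("TRIPLE"))  # [6, 5, 2, 4, 3, 1]
print(column_lengths(32, 6))      # [6, 6, 5, 5, 5, 5]
```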

2.10: Two simple variations on AMSCO are shown in Figure Ans.3a,b. They are easy to figure out. AMSCO has been named after its developer, A. M. Scott, so if your name is Claude Isaac Fairchild, you would name your cipher CIFAIR.

Figure Ans.3: Two Variations on AMSCO (both panels use the key QUALITY).

3.1: If the book is in English then E is the most common letter of the key. Many plaintext letters will therefore be encrypted with the E row (row 4) of the Vigenere letter-square. If the plaintext is also in English, having many occurrences of E, then many of those Es would be enciphered by row 4, to become Is in the ciphertext. Thus, we can expect I to be most common in the ciphertext.

Tell yourself that everything in nature is a symbol of something like a specimen of an abstruse cryptogram, all the characters of which conceal some meaning. But when we have succeeded in deciphering these living texts, and have grasped the allusion; when, beside the symbol, we have succeeded in finding the commentary, then the most desolate corner of the earth appears to the solitary seeker as a gallery full of the masterpieces of an unsuspected art. Fabre puts into our hands the golden key which opens the doors of this marvellous museum. -Georges Victor Legros, Fabre, Poet of Science

3.2: BUT␣WILL␣SHE␣DECRYPT␣MORE␣AFGHAN is another good key. It is short and easy to memorize, and it produces the 20-letter string BUTWILSHEDCRYPMOAFGN. Appending the six remaining letters JKQVXZ to this string results in the permutation

abcdefghijklmnopqrstuvwxyz
BUTWILSHEDCRYPMOAFGNJKQVXZ
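The construction of a substitution alphabet from a key phrase can be sketched in a few lines. The following Python function is our own illustration, and its name is hypothetical. It keeps the first occurrence of each letter of the phrase and then appends the unused letters in alphabetical order.

```python
import string


def keyword_alphabet(phrase):
    """Return (distinct letters of phrase, full 26-letter substitution alphabet)."""
    seen = []
    for ch in phrase.upper():
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)  # keep only the first occurrence of each letter
    rest = [ch for ch in string.ascii_uppercase if ch not in seen]
    return "".join(seen), "".join(seen + rest)


head, alphabet = keyword_alphabet("BUT WILL SHE DECRYPT MORE AFGHAN")
print(head)      # BUTWILSHEDCRYPMOAFGN
print(alphabet)  # BUTWILSHEDCRYPMOAFGNJKQVXZ
```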

3.3: The four integers relatively prime to 8 are 1, 3, 5, and 7. They generate the following permutations.

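$$\begin{pmatrix}\texttt{abcdefgh}\\ \texttt{abcdefgh}\end{pmatrix}\quad \begin{pmatrix}\texttt{abcdefgh}\\ \texttt{adgbehcf}\end{pmatrix}\quad \begin{pmatrix}\texttt{abcdefgh}\\ \texttt{afchebgd}\end{pmatrix}\quad \begin{pmatrix}\texttt{abcdefgh}\\ \texttt{ahgfedcb}\end{pmatrix}.$$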
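These four permutations can be generated mechanically. The sketch below is our own Python illustration with a hypothetical function name. It assumes that letter number $i$ (counting from zero) of the bottom row is the letter numbered $i \cdot m \bmod 8$ of the top row, which reproduces the four rows shown above.

```python
from math import gcd


def multiplicative_permutation(m, alphabet="abcdefgh"):
    """Bottom row of the permutation generated by the multiplier m."""
    n = len(alphabet)
    if gcd(m, n) != 1:
        raise ValueError("m must be relatively prime to the alphabet size")
    return "".join(alphabet[(i * m) % n] for i in range(n))


for m in (1, 3, 5, 7):
    print(m, multiplicative_permutation(m))
```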

3.4: Equation (3.5) implies

3.5: Imagine a plaintext that's 38 letters long. The first 36 letters can easily be encrypted and decrypted. Encrypting the remaining two letters is also easy, but decrypting them must be done by examining 25 strings of ciphertext and selecting the one whose first two letters are the last two plainletters. This may be ambiguous. Interestingly enough, a purely numeric sequence may sometimes make sense as, for example, in 1984 1949, which may refer to the book 1984, written in 1949.

3.6: The following table illustrates the idea of balanced codes.

ETAOINSHRDLUMWYFCGBPKVXQJZ
01234567890122109876543210

The digit 0 is assigned to the most common and also the least common letters. The 1 is assigned to the second most common and the second least common letters, and so on.

3.7: Yes, as is easy to see by examining the following examples (notice the two occurrences of 22 in the ciphertext and how they produce different plaintexts).

Plaintext    66 05 66 11 61      Ciphertext   22 61 88 22 27
Key        + 66 66 22 11 66      Key        - 66 66 22 11 66
Ciphertext   22 61 88 22 27      Plaintext    66 05 66 11 61
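The arithmetic in this example is digit-by-digit addition and subtraction modulo 10, with no carries. A minimal Python sketch (our own, with hypothetical function names) reproduces both directions.

```python
def add_digits(text, key):
    """Add two digit strings position by position, modulo 10, without carry."""
    return "".join(str((int(t) + int(k)) % 10) for t, k in zip(text, key))


def sub_digits(text, key):
    """Subtract key from text position by position, modulo 10, without borrow."""
    return "".join(str((int(t) - int(k)) % 10) for t, k in zip(text, key))


plaintext = "6605661161"
key = "6666221166"
ciphertext = add_digits(plaintext, key)
print(ciphertext)                   # 2261882227
print(sub_digits(ciphertext, key))  # 6605661161
```

The first and fourth ciphertext pairs are both 22, yet they decrypt to 66 and 11 because the corresponding key digits differ.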

3.8: Each $P_i$ in the sum $\sum_{i=1}^{26} P_i^2$ is a probability, so it lies in the interval [0,1], implying that $P_i^2$ cannot be bigger than $P_i$. The sum $\sum_{i=1}^{26} P_i$ equals 1, so the sum $\sum_{i=1}^{26} P_i^2$ cannot exceed 1. On the other hand, this sum cannot be less than 0.038, as shown below, so this sum lies in the right interval and is therefore a probability. In order to place the lower limit of 0.038 on our sum, we observe that

$$\sum_{i=1}^{26}\Bigl(P_i - \frac{1}{26}\Bigr)^2 = \sum_{i=1}^{26} P_i^2 - 2\sum_{i=1}^{26}\frac{P_i}{26} + \sum_{i=1}^{26}\frac{1}{26^2} = \sum_{i=1}^{26} P_i^2 - \frac{2}{26}\sum_{i=1}^{26} P_i + \frac{26}{26^2} = \sum_{i=1}^{26} P_i^2 - \frac{1}{26},$$

which implies that

$$\sum_{i=1}^{26} P_i^2 = \frac{1}{26} + \sum_{i=1}^{26}\Bigl(P_i - \frac{1}{26}\Bigr)^2 \ge \frac{1}{26} \approx 0.038.$$

4.1: We consider the general problem of runs (of any length) of identical base-n digits in a string of L digits. The special case investigated here is that of L = 1000 and n = 10. Direct calculation for small values of L suggests the formula

$$\frac{L-1}{n} - \frac{L-2}{n^2}. \tag{Ans.1}$$

Equation (Ans.1) is easy to prove by induction on $L$. For $L = 2$, there are $n^2$ 2-digit strings and $n$ of them, namely 11, 22, ..., $nn$, are runs. The total number of runs divided by the total number of strings is $1/n$, and this is what Equation (Ans.1) yields for $L = 2$. Assuming that the equation is correct for strings of length $L$, we start with a general string $d_1 d_2 \ldots d_{L-1} d_L$ and append another digit $d_{L+1}$ to it. The new, $(L+1)$-digit string has a new run if $d_L = d_{L+1}$ (an event with probability $1/n$) and $d_{L-1} \ne d_L$ (an event with probability $1 - 1/n$). The probability of a new run is therefore

$$\frac{1}{n} - \frac{1}{n^2}.$$

Adding this to Equation (Ans.1) yields

$$\frac{L-1}{n} - \frac{L-2}{n^2} + \frac{1}{n} - \frac{1}{n^2} = \frac{(L+1)-1}{n} - \frac{(L+1)-2}{n^2}$$

(end of proof). The author is indebted to J. Robert Henderson for this information. For $L = 1000$ and $n = 10$, Equation (Ans.1) yields

$$\frac{999}{10} - \frac{998}{100} = 89.92.$$

Table Ans.4 lists 1000 pseudo-random decimal digits, obtained by the Mathematica command MatrixForm[Table[Random[Integer, {0, 9}], {10}, {100}]]. A direct check verifies that there are 88 runs (mostly pairs) among them.

4366139157098226943753187288954739903033 797395491 777246827481064776612611267184767108419528567246123 07752603174846096203548793263363781700 1 79669515945363363564967828195934367064 75930483141 05 7351 799161 766206221286985283444433877581915321337445403069405 5 81 7 7888653 36560 1688 71 02 8 5 203 7 89 3 72 5 8506 79 4548042 719699268842118744239331094383909802056339907929 70750 1 069 783548240 13 5 80 292 5 51860851 02 4413058 3 3323 9 95 423962221574 7454811325519505129158715399041503734024186 7973623804 7623816083309482973949050833724968 7 6822704141 765406591660875046211425087 43731809418659602331927513555090 140248983564657869830 162127 553 7 271198592487032923843591 75553826845872178278184577713615024267090376577959671 70396183331991 722237348 88064557519304 7984680580484310 16967861333851190753586508627205333092825 796638 77264555511590 149289553 588646483299943019194102269161871113607327 4464 775510944551386448563604 766813350815152219981079872688 16911883559607385033935496452371 0944 73678258245054266678904925131329848551 7509101406027030840314131 0

Table Ans.4: 1000 Pseudo-Random Decimal Digits.
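Equation (Ans.1) can also be checked exhaustively for short strings. The following Python sketch is our own illustration with hypothetical function names. It counts a run each time a digit equals its predecessor without continuing an earlier run, and compares the exact average over all strings with the formula.

```python
from fractions import Fraction
from itertools import product


def expected_runs(L, n):
    """Equation (Ans.1): expected number of runs in L random base-n digits."""
    return Fraction(L - 1, n) - Fraction(L - 2, n * n)


def count_runs(digits):
    """Count maximal runs (length 2 or more) of identical adjacent symbols."""
    runs = 0
    for i in range(1, len(digits)):
        equal_to_previous = digits[i] == digits[i - 1]
        continues_run = i >= 2 and digits[i - 1] == digits[i - 2]
        if equal_to_previous and not continues_run:
            runs += 1
    return runs


def average_runs(L, n):
    """Exact average number of runs over all n**L strings (small cases only)."""
    total = sum(count_runs(s) for s in product(range(n), repeat=L))
    return Fraction(total, n ** L)


print(float(expected_runs(1000, 10)))             # 89.92
print(average_runs(4, 3) == expected_runs(4, 3))  # True
```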

4.2: Each of the 50 numbers has a probability of 1/50 to be selected. The probability of selecting any sequence of six numbers is therefore $(1/50)^6 = 6.4 \times 10^{-11}$.

4.3: A computer program loops indefinitely through all the symbols of the alphabet. Each time radioactive decay is detected (by a Geiger counter or a similar detector), the detector interrupts the computer. The interrupt handling routine prints the current symbol of the loop, and resumes the loop.

4.4: The following is a quote from http://random.org/, a Web site offering true random numbers. "A radio is tuned into a frequency where nobody is broadcasting. The atmospheric noise picked up by the receiver is fed into a Sun SPARC workstation through the microphone port where it is sampled by a program as an eight bit mono signal at a frequency of 8KHz. The upper seven bits of each sample are discarded immediately and the remaining bits are gathered and turned into a stream of bits with a high content of entropy. Skew correction is performed on the bit stream, in order to insure that there is an approximately even distribution of 0s and 1s. The skew correction algorithm used is based on transition mapping. Bits are read two at a time, and if there is a transition between values (the bits are 01 or 10) one of them-say the first-is passed on as random. If there is no transition (the bits are 00 or 11), the bits are discarded and the next two are read."
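The skew correction described in the quote is simple enough to sketch. The Python function below is our own reading of that description, not code from the site, and its name is hypothetical. It reads the bits in non-overlapping pairs, passes on the first bit of each pair that shows a transition, and discards pairs without one.

```python
def skew_correct(bits):
    """Transition mapping: keep the first bit of each 01 or 10 pair, drop 00 and 11."""
    corrected = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            corrected.append(first)
    return corrected


print(skew_correct([0, 1, 1, 0, 0, 0, 1, 1, 1, 0]))  # [0, 1, 1]
```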

4.5: In the case of two bits, there is no difference between multiplication and logical AND.

4.6: The choice $a_1 = 1$, $a_2 = 0$, and $a_3 = 1$ with a starting value of 100, produces the period-7 sequence 100, 001, 011, 111, 110, 101, and 010. This is the maximal period because $2^3 - 1 = 7$.

5.1: Five consecutive As are encrypted to e, e, b, d, and e.

5.2: Imagine an 8-contact rotor where contacts 1, 2, 3, 4, 5, 6, 7, and 8 are connected to contacts 4, 8, 2, 6, 1, 7, 5, and 3, respectively. The eight differences are 3, 6, 7, 2, 4, 1, 6, and 3. Difference 3 appears twice and difference 5 is omitted. The sum of the differences is 32, which taken modulo 8 yields 0.

5.3: Quantum mechanics was developed in 1925 by Erwin Schrödinger and Werner Heisenberg, working independently and using different approaches (although Schrödinger published his famous equation in 1926). Another example is the modern electronic computer, which was invented by several teams (such as Eckert-Mauchly and Atanasoff-Berry) during a short period in the late 1940s.

5.4: There are 26 holes in the plugboard. Once the first cable is plugged into a hole, its other end can be plugged into any of the remaining 25 holes. There are therefore 25 ways to plug the first cable. Once that cable is fully plugged in, 24 holes remain, so there are 23 possibilities for the second cable. The total number of possible ways to plug 6 cables is therefore $25 \times 23 \times 21 \times 19 \times 17 \times 15 \approx 5.85034 \times 10^7$. With 10 cables, the number mushrooms to approximately $5.27057 \times 10^{11}$.
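The two products quoted in this answer are easy to reproduce. This Python sketch (our own, with a hypothetical function name) follows the counting argument of the answer literally: each new cable finds two fewer free holes than the previous one.

```python
def plugboard_ways(cables, holes=26):
    """Product 25 x 23 x 21 x ... with one factor per cable."""
    ways = 1
    for k in range(cables):
        ways *= holes - 1 - 2 * k  # partners available to the k-th cable
    return ways


print(plugboard_ways(6))   # 58503375
print(plugboard_ways(10))  # 527056905375
```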

5.5: Following the paths shown in Figure 5.6, the sequence of substitution steps is

II III IV refl IV III II plug

5.6: There are $26^4$ = 456,976 initial positions of four rotors and 4! = 24 ways to plug them into the machine. The total number of substitution rules is therefore $26^4 \times 24$ = 10,967,424, much bigger than 105,456.

5.7: A rotor combination of the form 3xy means having to choose 2 out of 7 rotors (which can be done in $\binom{7}{2} = 21$ ways) and to select a permutation of these two, which can be done in two ways. The total number of rotor combinations of the form 3xy is therefore 42, and the total number of rotor combinations of the form 3xy, x6y, and xy1 is 3 × 42 = 126. The total number of rotor combinations was therefore reduced from $\binom{8}{3} \times 3! = 336$ to 210, a savings of 37.5%.

Cypherpunk [from cyberpunk] Someone interested in the uses of encryption via electronic ciphers for enhancing personal privacy and guarding against tyranny by centralized, authoritarian power structures, especially government. There is an active cypherpunks mailing list at cypherpunks-request@toad.com coordinating work on public-key encryption freeware, privacy, and digital cash. See also tentacle.
-The New Hacker's Dictionary ver. 4.2.2

5.8: This can be done by sliding KOMMANDER under the message from left to right and eliminating all positions where any letter of KOMMANDER is identical to the letter of the message right above it. Direct check shows that the first few impossible positions are 3-5,8,9,16,17, and 20 (there are more).

6.1: The logical operation XNOR (the inverse of XOR, denoted by ⊗) also has the property: If B = A⊗K, then A = B⊗K.

6.2: The following special cases support this claim:
Case 1. A plaintext of all zeros. The ciphertext is the keystream, which is random.
Case 2. A plaintext of all 1s. The ciphertext is the inverse of the keystream, which is also random.

Case 3. A plaintext with repeating patterns. Each repetition of a pattern is encrypted differently because of the randomness of the keystream, so the ciphertext does not contain the plaintext's patterns.
These special cases do not prove that the ciphertext is random, but they support the claim.

6.3: The average word size in English is 4-5 letters. We therefore start by examining 4-letter words. There are 26 letters, so the number of combinations of 4 letters is $26^4$ = 456,976. A good English-language dictionary contains about 100,000 words. Assuming that half these words have 4 letters, the fraction of valid 4-letter words is $50000/26^4 \approx 0.11$. The fraction of 5-letter words is obtained similarly as $50000/26^5 \approx 0.004$. Random text may therefore have some short (2-4 letters) words, and very few 5-6 letter words, but longer words would be very rare.

6.4: Any 4-stage shift register where the rightmost stage is not a tap will serve. In such a shift register, the state 0001 is followed by 0000 regardless of which of the three left stages are taps.

6.5: The rightmost and leftmost stages of this shift register are taps. Therefore, a direct check produces the following 15-state sequence

1000, 1100, 1110, 1111, 0111, 1011, 0101, 1010, 1101, 0110, 0011, 1001, 0100, 0010, 0001.
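The 15-state sequence can be reproduced by a short simulation. The Python sketch below is our own and rests on an assumption inferred from the listed states: the register shifts to the right and the XOR of the tapped stages enters at the left. Stage 0 denotes the leftmost stage, and the function name is hypothetical.

```python
def lfsr_states(start, taps):
    """List the states of a shift register until one of them repeats.

    start is a bit string; taps are stage indexes (0 = leftmost stage).
    The XOR of the tapped stages is shifted in at the left (an assumption).
    """
    state = start
    visited = []
    while state not in visited:
        visited.append(state)
        feedback = 0
        for t in taps:
            feedback ^= int(state[t])
        state = str(feedback) + state[:-1]  # shift right, drop the rightmost bit
    return visited


states = lfsr_states("1000", taps=(0, 3))
print(len(states))  # 15
print(states)
```

With the same conventions, a register whose rightmost stage is not a tap takes state 0001 directly to 0000, as claimed in answer 6.4: `lfsr_states("0001", taps=(0, 1, 2))` returns `['0001', '0000']`.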

6.6: The truth table of a basic Boolean function with 2 inputs has 4 elements (Table 6.1), so there can be $2^4 = 16$ Boolean functions of 2 inputs. Similarly, the truth table of a Boolean function with $n$ inputs has $2^n$ elements, so there can be $2^{2^n}$ such tables. For $n = 8$, for example, the (huge) number of Boolean functions is $2^{2^8} = 2^{256} \approx 1.16 \times 10^{77}$.

6.7: The output sequence of R1 is the 7-bit repeating string 1001011. The output string of R2 is the string 110101111000100 with a 15-bit period. The output of R3 is the 31-bit periodic string 1001010110000111001101111101000. The final output is 1011101010100001011110110001110.

6.8: The output sequence of R1 is the 7-bit periodic sequence 0011101. The output sequence of R2 is the 31-bit sequence 1010000100101100111110001101110. The final output is 10000101111101110.

6.9: If location a of the table contains byte value a, then no special information is needed to construct the inverse table. It should be identical to the forward table.

7.1: The key is implicit in the particular table used. In the case of 3-bit blocks, for example, the table has 8 entries, so there can be 8! tables, and all the parties using this cipher have to agree upon which table to use. To decrypt cipherblock C, the table should be searched until an entry with C is found. The index of that entry is the plainblock P. This also implies that table entries should be unique.

7.2: The fact that an XOR is its own inverse is exploited. The XOR of (A ⊕ B) with B produces A.

7.3: The hexadecimal values of the four keys are

0101010101010101, 1F1F1F1F0E0E0E0E, E0E0E0E0F1F1F1F1, FEFEFEFEFEFEFEFE.

7.4: There are 18 P-keys and 4 × 256 S-box entries. Each iteration computes 2 of them, so the total number of iterations is (18 + 4 × 256)/2 = 521.

7.5: The number of 128-bit keys is $2^{128}$. A "gig" (or "giga") is defined as $2^{30}$. This is a little more than a billion. The result of the division $2^{128}/(2^{30} \times 2^{30})$ is $2^{68}$ or approximately $2.95 \times 10^{20}$. This means that if we build a piece of hardware that tries a giga keys per second and if we run a giga of them in parallel, it would still take more than $10^{20}$ seconds to check all the keys. This is about $9.4 \times 10^{12}$ years, and the universe is "only" about $15 \times 10^9$ years old. (Unfortunately, those who believe in multiple universes may find little solace in this result.)

8.1: Mixing salt and pepper is a one-way operation in practice (in principle, they can be separated). Heat flow from high to low temperature in a closed system is a one-way process in principle. Giving birth is one-way in principle, while squeezing glue out of a tube is one-way in practice.

8.2: We arbitrarily select $q = 10$ and the two slopes $a_1 = 1$ and $a_2 = 2$. The two lines passing through point (5,10) are computed by $10 = 1 \times 5 + b_1 \rightarrow b_1 = 5$ and $10 = 2 \times 5 + b_2 \rightarrow b_2 = 0$. Each of the two individuals involved gets one of the two pairs (1,5) and (2,0).

8.3: Denoting the secret by $a$, we select a number $b$ at random and consider $(a, b)$ a line pair (i.e., a slope and a y-intercept). We then select $n$ different random values $x_i$ and compute a $y_i$ for each by means of $y_i = ax_i + b$. The $n$ pairs $(x_i, y_i)$ are points on the line $y = ax + b$ and they are distributed to the $n$ participants in the secret. Any two of them can use their two points to compute $(a, b)$. One limitation is that the slope $a$ should not be zero. The line $y = 0x + b$ is a horizontal line where all the points have the same y-coordinate $b$. This does not mean that any participant will be able to obtain the secret $a$ single-handedly (after all, they do not know that the line is horizontal), but it is cryptographically weak. Another limitation is that no point should have an x-coordinate of zero. If we know that point $(0, y_i)$ is on a line, then $b$ can be obtained from the basic equation $y_i = a \cdot 0 + b$. This does not disclose the secret $a$ but it amounts to providing the opponent with a clue.
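The scheme of this answer can be sketched as follows. The Python code is our own illustration, with hypothetical names and parameter ranges. It works with ordinary integers and exact fractions, whereas a practical scheme would normally compute in a finite field. It enforces the two limitations mentioned above by rejecting a zero slope and by drawing only nonzero x-coordinates.

```python
import random
from fractions import Fraction


def make_shares(secret_slope, n, rng=None):
    """Return n points on the line y = secret_slope * x + b for a random b."""
    if secret_slope == 0:
        raise ValueError("a horizontal line is cryptographically weak")
    rng = rng or random.Random()
    b = rng.randint(1, 10**6)              # random y-intercept
    xs = rng.sample(range(1, 10**6), n)    # distinct, nonzero x-coordinates
    return [(x, secret_slope * x + b) for x in xs]


def recover(share1, share2):
    """Recover (slope, intercept) from any two distinct shares."""
    (x1, y1), (x2, y2) = share1, share2
    a = Fraction(y1 - y2, x1 - x2)
    b = y1 - a * x1
    return a, b


shares = make_shares(42, 5)
slope, intercept = recover(shares[0], shares[3])
print(slope)  # 42
```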

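8.4: We outline two approaches. In the first approach we assume that the three points $P_i = (x_i, y_i, z_i)$, $i = 1, 2, 3$, are given. We write the four equations

$$\begin{aligned} Ax + By + Cz + D &= 0,\\ Ax_1 + By_1 + Cz_1 + D &= 0,\\ Ax_2 + By_2 + Cz_2 + D &= 0,\\ Ax_3 + By_3 + Cz_3 + D &= 0.\end{aligned}$$

The first equation is true for any point $(x, y, z)$ on the plane. We cannot solve this system of four equations in four unknowns, but we know that it has a nontrivial solution if and only if its determinant is zero. The expression below assumes this and also expands the determinant by its top row:

$$0 = \begin{vmatrix} x & y & z & 1\\ x_1 & y_1 & z_1 & 1\\ x_2 & y_2 & z_2 & 1\\ x_3 & y_3 & z_3 & 1\end{vmatrix} = x\begin{vmatrix} y_1 & z_1 & 1\\ y_2 & z_2 & 1\\ y_3 & z_3 & 1\end{vmatrix} - y\begin{vmatrix} x_1 & z_1 & 1\\ x_2 & z_2 & 1\\ x_3 & z_3 & 1\end{vmatrix} + z\begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1\end{vmatrix} - \begin{vmatrix} x_1 & y_1 & z_1\\ x_2 & y_2 & z_2\\ x_3 & y_3 & z_3\end{vmatrix}.$$

This is of the form $Ax + By + Cz + D = 0$, so we conclude that

$$A = \begin{vmatrix} y_1 & z_1 & 1\\ y_2 & z_2 & 1\\ y_3 & z_3 & 1\end{vmatrix},\quad B = -\begin{vmatrix} x_1 & z_1 & 1\\ x_2 & z_2 & 1\\ x_3 & z_3 & 1\end{vmatrix},\quad C = \begin{vmatrix} x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1\end{vmatrix},\quad D = -\begin{vmatrix} x_1 & y_1 & z_1\\ x_2 & y_2 & z_2\\ x_3 & y_3 & z_3\end{vmatrix}.$$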

The second approach uses vector analysis. Given three points $P_1$, $P_2$, and $P_3$, we subtract $V_1 = P_1 - P_2$ and $V_2 = P_1 - P_3$. The two vectors $V_1$ and $V_2$ are in the plane, so their cross-product $N = V_1 \times V_2$ is the normal to the plane. We now select any of the three points, say, $P_1$ and a general point $X = (x, y, z)$ on the plane. The difference $P_1 - X$ is a vector in the plane and is therefore perpendicular to the normal, implying that the dot product $N \cdot (P_1 - X)$ must equal zero. This yields the plane equation $N \cdot X - N \cdot P_1 = 0$ or $N_x x + N_y y + N_z z + s = 0$ where $s$ is the number $-N \cdot P_1$.
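The second approach translates directly into code. The following Python sketch is our own illustration, and the helper names are hypothetical. It computes the normal as a cross product and returns the four coefficients $(N_x, N_y, N_z, s)$ of the plane equation.

```python
def sub(u, v):
    """Componentwise difference of two 3D points."""
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3D vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def plane_through(p1, p2, p3):
    """Return (Nx, Ny, Nz, s) such that Nx*x + Ny*y + Nz*z + s = 0 on the plane."""
    normal = cross(sub(p1, p2), sub(p1, p3))
    if normal == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return (*normal, -dot(normal, p1))


print(plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1)))  # (1, 1, 1, -1)
```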

8.5: Equation (8.6) yields 3P = P + 2P = (-0.11138, -0.576327).

8.6: Equation (8.7) yields 2P = P + P = (2/3, -1/2) and Equation (8.6) yields the sum 3P of the distinct points P + 2P = (2/3,1/2). Notice that points 2P and 3P have the same x-coordinate (Figure Ans.5). Their sum, 5P is therefore O. Thus, we say that point P has order 5.

10.1: Data can be compressed because their original representation has redundancies. Secret data can be embedded in a cover in "holes" that exist in the cover because of redundancies. Thus, redundancy plays a central role in both fields (as well as in error-correcting codes).

Figure Ans.5: Adding Points in an Elliptic Curve.

10.2: Any phrase with the word "love" may indicate the letter N. Any phrase with a mention of speed may indicate the letter E, and any phrase with the name John may indicate a D. Thus, the text "Make haste. With love. John" indicates the word END.

10.3: The check digit is zero because

$$0 \times 10 + 3 \times 9 + 8 \times 8 + 7 \times 7 + 9 \times 6 + 8 \times 5 + 6 \times 4 + 8 \times 3 + 2 \times 2 = 286 = 26 \times 11.$$
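The computation above follows the usual ISBN-10 rule: the nine digits are weighted 10 down to 2, and the check digit brings the total to a multiple of 11. A short Python sketch (our own, with a hypothetical function name) makes the rule explicit.

```python
def isbn10_check_digit(first_nine):
    """Return the ISBN-10 check character for a string of nine digits."""
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


print(isbn10_check_digit("038798682"))  # 0
```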

10.4: The text "hidden letters will defy simple codebreaking" looks innocent. These six words have 2, 2, 1, 2, 2, and 3 syllables, respectively, thus hiding the two triplets 221 and 223.

10.5: The data are "meet me at nine," hidden in the second letter of every word.

10.6: Direct check reveals the bits 00d0dd0d0d0ddd01d0101dd, where d stands for "undefined."

10.7: An alternative solution is to have dictionary types with 2, 4, 8, 16, etc. words. If one bit remains to be hidden, a 2-word dictionary type is used to hide it regardless of the dictionary type that's specified by the current syntax rule for the next step.

10.8: This is straightforward. The sentences are "Alice is sending clean data," "Alice is sending clean clothes," "Alice is sending dirty data," "Alice is sending dirty clothes," then the same four sentences with "Alice is receiving ..." instead of "sending," and then eight more sentences with "Bob" instead of "Alice," for a total of 16 sentences.

11.1: The bitmap size for this case is $3 \times 2^{10} \times 2^{10} = 3 \times 2^{20}$ = 3 Mbytes.

11.2: The permutation 0 → 2, 1 → 3, up to 253 → 255.

11.3: There are $\binom{m\cdot n}{2^r-1}$ ways to choose $2^r-1$ objects from a set of $m\cdot n$ objects. We can assign the integers from 1 to $2^r-1$ to the first $2^r-1$ elements of W, and this can be done in $(2^r-1)!$ ways. We can then choose each of the remaining $m\cdot n-(2^r-1)$ elements at random from the set of $2^r-1$ valid integers, and this can be done in $(2^r-1)^{m\cdot n-(2^r-1)}$ ways. The total number of ways to choose matrix W is therefore

$$\binom{m\cdot n}{2^r-1}\times(2^r-1)!\times(2^r-1)^{m\cdot n-(2^r-1)}.$$

For m = n = 8 and r = 5, this number is

$$\binom{64}{31}\times 31!\times 31^{33}\approx 2.397\cdot10^{101},$$

too big to allow for a brute force approach where every possible W is checked.
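The size of this number is easy to confirm with exact integer arithmetic. The sketch below simply multiplies the three factors described in the answer; the function name count_w_matrices is ours.

```python
import math


def count_w_matrices(m, n, r):
    """Number of ways to fill an m-by-n matrix W as described in the answer:
    pick 2^r - 1 cells, place the integers 1..2^r - 1 in them in some order,
    then fill every remaining cell with any of the 2^r - 1 valid integers."""
    k = 2**r - 1
    cells = m * n
    return math.comb(cells, k) * math.factorial(k) * k ** (cells - k)


total = count_w_matrices(8, 8, 5)
print(f"{total:.3e}")  # on the order of 10^101
```

Python integers have arbitrary precision, so the product is computed exactly and only rounded when printed.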

12.1: Each 0 would result in silence and each sample of 1, in the same tone. The result would be a nonuniform buzz. The amplitude is constant but the frequency varies. It is low when the sound contains long runs of zeros and ones.

12.2: The experiment should be repeated with several persons, preferably of different ages. The person should be placed in a sound-insulated chamber and a pure tone of frequency f should be played. The amplitude of the tone should be gradually increased from zero until the person can just barely hear it. If this happens at a decibel value d, point (d, f) should be plotted. This should be repeated for many frequencies until a graph similar to Figure 12.4a is obtained.

12.3: This is trivial. The filter coefficients are h(0) = 1 and h(2) = β. The combined signal is produced by y(j) = x(j)h(0) + x(j - 2)h(2).
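As an illustration, the two-tap filter can be applied to a short list of samples. This is a minimal sketch: the echo amplitude 0.5, the sample values, and the name add_echo are our assumptions, and samples before the start of the signal are taken to be zero.

```python
def add_echo(x, delay=2, beta=0.5):
    """Return y with y[j] = x[j]*h(0) + x[j - delay]*h(delay),
    where h(0) = 1 and h(delay) = beta. Samples before the start count as 0.
    The output has the same length as the input, so the echo tail is cut off."""
    y = []
    for j in range(len(x)):
        delayed = x[j - delay] if j >= delay else 0.0
        y.append(x[j] * 1.0 + delayed * beta)
    return y


signal = [1.0, 2.0, 3.0, 4.0]
print(add_echo(signal))  # [1.0, 2.0, 3.5, 5.0]
```

The first two output samples are unchanged because the echo has not yet arrived; from the third sample on, each output is the current sample plus a scaled copy of the sample two positions earlier.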

12.4: By definition, $F_2$ has the value $K_2 \times C = [0,1,1,1,0]\,C = C_1 \oplus C_2 \oplus C_3 = 1010$. To change it to 1101 we need the difference vector $D = 1010 \oplus 1101 = 0111$. The computation described in the text yields

$$C = C \oplus (K_2^T \times D) = C \oplus \begin{bmatrix}0\\1\\1\\1\\0\end{bmatrix}[0,1,1,1] = \begin{bmatrix}0001\\1000\\0101\\0111\\0100\end{bmatrix} \oplus \begin{bmatrix}0000\\0111\\0111\\0111\\0000\end{bmatrix} = \begin{bmatrix}0001\\1111\\0010\\0000\\0100\end{bmatrix}.$$

A direct check verifies that the new value of $F_2$ is $K_2 \times C = 1101$ and that the two older files $F_0 = K_0 \times C = 1100$ and $F_1 = K_1 \times C = 1110$ haven't changed. This result has been achieved because rows $C_1$, $C_2$, and $C_3$ of C were modified such that the XORs of any two of them have been preserved.
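The update of the matrix rows can be replayed with integers standing in for the 4-bit rows. The sketch below uses only the values that appear in this answer (the five rows, the key [0,1,1,1,0], and the target 1101); the keys of the two older files are not listed here, so the check is limited to the new file value and to the pairwise XORs of the modified rows. The helper name combine is ours.

```python
def combine(key_row, rows):
    """XOR together the rows selected by the 1-bits of key_row."""
    value = 0
    for bit, row in zip(key_row, rows):
        if bit:
            value ^= row
    return value


C = [0b0001, 0b1000, 0b0101, 0b0111, 0b0100]  # rows C0..C4
K2 = [0, 1, 1, 1, 0]

old = combine(K2, C)       # 0b1010
D = old ^ 0b1101           # difference vector, 0b0111
# XOR D into every row selected by K2 (the outer product of K2 and D).
new_C = [row ^ D if bit else row for bit, row in zip(K2, C)]

print([format(row, "04b") for row in new_C])
print(format(combine(K2, new_C), "04b"))  # 1101
```

Because the key selects an odd number of rows (three), XORing D into each of them changes their combined XOR by exactly D, while the XOR of any two modified rows is unchanged since D cancels itself.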

12.5: We assume that the probability of a 1-bit is greater than 0.5. Therefore, regardless of the size of the region, the bit configuration with the highest probability is that of all 1s. When the size of the region is odd, this configuration has an odd number of 1s, so it has a parity of 1 and thus contributes to the probability of interest, raising it above 0.5. For an even-sized region, this bit configuration has an even number of 1s and so is not included in the probability we compute, resulting in low probability (below 0.5).

A.1: Each pixel of the convolved image becomes five times its original value minus the values of its four immediate neighbors. This tends to magnify the differences between a pixel and its neighbors and results in a sharper image. Thus, this is a sharpening kernel. If a pixel is identical to its four neighbors, then this kernel has no effect, but if a pixel differs from its neighbors, this kernel makes it differ from them even more.
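The behavior described here can be tried on a tiny image. The sketch below assumes the kernel has 5 at its center, -1 at the four immediate neighbors, and 0 at the corners, which is what the description above implies; leaving border pixels unchanged is our simplification, and the function name is hypothetical.

```python
KERNEL = [[0, -1, 0],
          [-1, 5, -1],
          [0, -1, 0]]


def convolve_interior(image, kernel=KERNEL):
    """Apply a 3x3 kernel to every interior pixel of a 2D list.
    Border pixels are copied unchanged. The kernel is symmetric, so
    flipping it (as strict convolution requires) makes no difference."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(kernel[i][j] * image[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out


flat = [[10] * 3 for _ in range(3)]
bump = [[10, 10, 10], [10, 12, 10], [10, 10, 10]]
print(convolve_interior(flat)[1][1])  # 10: a uniform region is unchanged
print(convolve_interior(bump)[1][1])  # 20: a difference of 2 grows to 10
```

The center pixel p becomes 5p minus the sum of its four neighbors, that is, p plus four times its difference from the neighbors' mean, which is why small differences are amplified.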

B.1: Each of the 8 characters of a name can be one of the 26 letters or the 10 digits, so the total number of names is $36^8$ = 2,821,109,907,456; close to 3 trillion.

D.1: Since 5 is a prime, both addition and multiplication in GF(5) are done modulo 5. The tables are

+ | 0 1 2 3 4
0 | 0 1 2 3 4
1 | 1 2 3 4 0
2 | 2 3 4 0 1
3 | 3 4 0 1 2
4 | 4 0 1 2 3

× | 0 1 2 3 4
0 | 0 0 0 0 0
1 | 0 1 2 3 4
2 | 0 2 4 1 3
3 | 0 3 1 4 2
4 | 0 4 3 2 1

D.2: It is easy to add and multiply numbers modulo 4 and produce the tables

+ | 0 1 2 3
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

× | 0 1 2 3
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 0 2
3 | 0 3 2 1

The multiplication table doesn't make sense, since 2×1 = 2×3 and 2×0 = 2×2. Elements 1 and 3 cannot be obtained by multiplying 2 by another element. Also, element 2 doesn't have a multiplicative inverse. This happens because 4 is not a prime and field element 2 is a factor of 4. Trying to define multiplication in GF(6) leads to similar results, because 2 and 3 are factors of 6.
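The tables of Exercises D.1 and D.2 can be generated, and the failure for a nonprime modulus detected, with a few lines of code. This is a minimal sketch; the function names are ours.

```python
def mod_tables(m):
    """Return the addition and multiplication tables of the integers modulo m."""
    add = [[(a + b) % m for b in range(m)] for a in range(m)]
    mul = [[(a * b) % m for b in range(m)] for a in range(m)]
    return add, mul


def has_all_inverses(mul):
    """True if every nonzero row of the multiplication table contains a 1,
    i.e., every nonzero element has a multiplicative inverse."""
    return all(1 in row for row in mul[1:])


add5, mul5 = mod_tables(5)
for row in mul5:
    print(*row)

_, mul4 = mod_tables(4)
_, mul6 = mod_tables(6)
print(has_all_inverses(mul5), has_all_inverses(mul4), has_all_inverses(mul6))
# True False False
```

For modulus 4 the row of element 2 is 0 2 0 2, which never reaches 1; for modulus 6 the rows of 2, 3, and 4 fail in the same way.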

D.3: The additive inverse of a polynomial a(x) is itself because the coefficients of the sum a(x) + a(x) are either 0 + 0 or 1 + 1 = 0.

D.4: It is easy to show that $x^8 + 1 = (x^4 + 1)^2$.

$$(x^4 + 1)(x^4 + 1) = x^4 \times x^4 + x^4 \times 1 + 1 \times x^4 + 1 \times 1 = x^8 + x^4 \times (1 + 1) + 1 = x^8 + 1.$$

D.5: GF(6) does not exist because 6 is not a prime and cannot be expressed as an integer power of a prime.

D.6: We start with the product 2×2 in GF(4). In binary, this is 10×10 and in polynomial notation it is $(x + 0)(x + 0)$. This equals $x^2$, and $x^2 \bmod (x^2 + x + 1)$ is the polynomial $x + 1$, which in our notation is $11_2$ or 3. (See Section D.3 and especially Exercise D.11 for polynomial modulo computations.) Another example in GF(4) is the product 2×3, which is $x(x + 1) = x^2 + x$. When computed modulo $x^2 + x + 1$, the result is 1. The last example is the product 5×6 in GF(8). This is the polynomial product $(x^2 + 1)(x^2 + x)$. It equals $x^4 + x^3 + x^2 + x$, which when computed modulo $x^3 + x + 1$ yields a remainder of $x + 1$ or $011_2 = 3$.
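These three products can be reproduced by treating each field element as an integer whose bits are the polynomial coefficients. The following is a minimal shift-and-XOR sketch under that convention; gf_mul is our name, and both operands are assumed to be already reduced (smaller than 2 raised to the degree of the modulus).

```python
def gf_mul(a, b, modulus):
    """Multiply a and b as polynomials with coefficients modulo 2,
    reducing modulo `modulus` (bit i is the coefficient of x^i)."""
    degree = modulus.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> degree) & 1:
            a ^= modulus         # reduce when the degree reaches that of the modulus
    return result


print(gf_mul(2, 2, 0b111))    # 3  (GF(4), modulus x^2 + x + 1)
print(gf_mul(2, 3, 0b111))    # 1
print(gf_mul(5, 6, 0b1011))   # 3  (GF(8), modulus x^3 + x + 1)
```

The same function confirms the reciprocal pairs of GF(8) listed in the answer to Exercise D.7, for example gf_mul(2, 5, 0b1011) returns 1.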

D.7: A look at Table D.1 shows that the additive inverse (in some sense it is the "negative") of each element is itself. The multiplicative inverses (reciprocals) of the seven nonzero elements are 1, 5, 6, 7, 2, 3, and 4. Notice that 0 does not have a reciprocal and may sometimes be considered its own inverse.

D.8: The multiplication table of GF(5) (Exercise D.1) shows that the smallest n for which $3^n = 1$ is n = 4 = 5 - 1. Hence, the exponential representation of GF(5) with respect to 3 is $(0, 3, 3^2 = 4, 3^3 = 2, 3^4 = 1)$.
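The successive powers are easy to list by repeated multiplication modulo 5. The sketch below stops at the first power that equals 1; powers_mod is a name of our choosing, and the modulus is assumed to be prime.

```python
def powers_mod(g, p):
    """Return [g^1, g^2, ...] modulo p, stopping at the first power equal to 1
    (at most p - 1 powers are computed)."""
    powers, value = [], 1
    for _ in range(p - 1):
        value = (value * g) % p
        powers.append(value)
        if value == 1:
            break
    return powers


print(powers_mod(3, 5))  # [3, 4, 2, 1]
print(powers_mod(4, 5))  # [4, 1]
```

Element 3 runs through all four nonzero elements before returning to 1, whereas element 4 returns to 1 after only two steps and so cannot serve as the base of an exponential representation.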

D.9: This is easy. The multiplication table of GF(4) shows that the smallest n such that $2^n = 1$ is 3 = 4 - 1, and the same is true for element 3.

D.10: Let $\alpha$ be any root of $x^4 + x^3 + 1$. From $\alpha^4 + \alpha^3 + 1 = 0$ we get $\alpha^4 = \alpha^3 + 1$ and the entire exponential representation of GF(16) can be constructed from this relation (Table Ans.6). Notice how the first four powers of $\alpha$ (elements 1, 2, 4, and 8) form a basis for the polynomial representation of GF(16).

Exp. repr.     Exponent  Polynomial representation            Binary  Decimal
0              -         0                                    0000    0
$\alpha^0$     0         1                                    0001    1
$\alpha^1$     1         $\alpha$                             0010    2
$\alpha^2$     2         $\alpha^2$                           0100    4
$\alpha^3$     3         $\alpha^3$                           1000    8
$\alpha^4$     4         $\alpha^3+1$                         1001    9
$\alpha^5$     5         $\alpha^3+\alpha+1$                  1011    11
$\alpha^6$     6         $\alpha^3+\alpha^2+\alpha+1$         1111    15
$\alpha^7$     7         $\alpha^2+\alpha+1$                  0111    7
$\alpha^8$     8         $\alpha^3+\alpha^2+\alpha$           1110    14
$\alpha^9$     9         $\alpha^2+1$                         0101    5
$\alpha^{10}$  10        $\alpha^3+\alpha$                    1010    10
$\alpha^{11}$  11        $\alpha^3+\alpha^2+1$                1101    13
$\alpha^{12}$  12        $\alpha+1$                           0011    3
$\alpha^{13}$  13        $\alpha^2+\alpha$                    0110    6
$\alpha^{14}$  14        $\alpha^3+\alpha^2$                  1100    12
$\alpha^{15}$  15        1                                    0001    1

Table Ans.6: Exponential and Polynomial Representations of GF(16).
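The powers of the root can be generated by repeatedly multiplying by x and reducing with the relation derived above. The following sketch stores each element as a 4-bit integer (bit i is the coefficient of the i-th power of the root) and encodes the polynomial x^4 + x^3 + 1 as the binary number 11001; the function name is ours.

```python
def alpha_powers(modulus=0b11001, count=16):
    """Return [alpha^0, alpha^1, ..., alpha^(count-1)] as integers, where alpha
    is a root of the polynomial encoded by `modulus` (coefficients modulo 2)."""
    degree = modulus.bit_length() - 1
    table, value = [], 1
    for _ in range(count):
        table.append(value)
        value <<= 1                  # multiply by alpha
        if (value >> degree) & 1:
            value ^= modulus         # replace alpha^4 by alpha^3 + 1
    return table


for k, v in enumerate(alpha_powers()):
    print(f"alpha^{k:<2} = {v:04b} = {v}")
```

The fifteen powers from exponent 0 to 14 are all different and cover every nonzero element of GF(16), and the sixteenth entry returns to 1, as expected for a primitive element.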

D.11: A polynomial division can be summarized in a form similar to the long division of integers, so Figure Ans.7 employs this form to summarize the results of the three divisions. Figure Ans.7a shows a quotient of $x^3 + x + 1$ and a remainder (modulo) of 0. Figure Ans.7b has the same quotient and the modulo $x$. The quotient of Figure Ans.7c is 1 and the modulo is $x^3 + x + 1$.

Polynomial Division

If $f(x)$ and $d(x) \ne 0$ are polynomials, and the degree of $d(x)$ is less than or equal to the degree of $f(x)$, then there exist unique polynomials $q(x)$ and $r(x)$, so that

$$\frac{f(x)}{d(x)} = q(x) + \frac{r(x)}{d(x)},$$

and so that the degree of $r(x)$ is less than the degree of $d(x)$. In the special case where $r(x) = 0$, we say that $d(x)$ divides evenly into $f(x)$.

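(a) $(x^5 + x^2 + x + 1) \div (x^2 + 1)$: quotient $x^3 + x + 1$, remainder 0.
(b) $(x^5 + x^2 + 1) \div (x^2 + 1)$: quotient $x^3 + x + 1$, remainder $x$.
(c) $(x^4 + x^3 + x) \div (x^4 + 1)$: quotient 1, remainder $x^3 + x + 1$.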

Figure Ans.7: Three Polynomial Divisions.
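The three divisions can be verified with a short long-division routine for polynomials whose coefficients are taken modulo 2. In this sketch a polynomial is stored as an integer whose bit i is the coefficient of x^i, so subtraction is XOR; gf2_divmod is a name we introduce for illustration.

```python
def gf2_divmod(f, d):
    """Divide polynomial f by d (coefficients modulo 2, bit i = coefficient
    of x^i) and return the pair (quotient, remainder)."""
    if d == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    dlen = d.bit_length()
    while f.bit_length() >= dlen:
        shift = f.bit_length() - dlen
        q ^= 1 << shift      # next quotient term, x^shift
        f ^= d << shift      # subtract (XOR) d * x^shift
    return q, f


print(gf2_divmod(0b100111, 0b101))   # (11, 0): quotient x^3+x+1, remainder 0
print(gf2_divmod(0b100101, 0b101))   # (11, 2): quotient x^3+x+1, remainder x
print(gf2_divmod(0b11010, 0b10001))  # (1, 11): quotient 1, remainder x^3+x+1
```

In each case the remainder has a smaller degree than the divisor, in agreement with the division rule stated above.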

Keep the faculty of effort alive in you by a little gratuitous exercise every day.
-William James, The Principles of Psychology (1890)

Cryptography Timeline

About 1900 B.C. An unknown Egyptian scribe uses nonstandard hieroglyphs in an inscription. This may be the first known example of written cryptography [Kahn 96 p.71].

1500 B.C. An encrypted formula for making glaze for pottery is written on a clay tablet in Mesopotamia [Kahn 96 p.75].

500-600 B.C. Hebrew scribes writing the book of Jeremiah use a simple substitution cipher known as ATBASH, where the last letter is substituted for the first, the next-to-last is substituted for the second letter, and so on [Kahn 96 p.77].

487 B.C. A long, narrow strip of leather is wrapped around a stick of wood and written on. The leather is then unwrapped and worn as a belt. The receiver has a matching stick to wrap the leather on and decrypt the message. This device, known as a skytale, was used by the Greeks [Kahn 96 p.82]. (The author's own experience indicates that the skytale was independently invented and used by children as late as the 1940s and likely even today.) It is interesting to note that the ancient Greeks also introduced a form of steganography. The head of a messenger was shaved and a message written or tattooed on the scalp. Once the hair grew, the messenger was sent on his way, to be shaved again by the receiver.

50-60 B.C. Julius Caesar develops the shift substitution cipher named after him for Roman government (and his private) communications [Kahn 96 p.83].

1-400? A.D. Mallanaga Vatsayana (India) writes the Kama Sutra (lessons of love) and lists cryptography as the 44th and 45th of 64 yogas (arts) anyone should learn. The Kama Sutra (which may be a compilation of earlier works) was probably written between the first and fourth centuries A.D. [Burton 91].

200 Quoting [Kahn 96 p.91] "The so-called Leiden papyrus ... employs cipher to conceal the crucial portions of important [magic] recipes."

725-790? Quoting [Kahn 96 p.97] "Abu 'Abd al-Rahman al-Khalil ibn Ahmad ibn 'Amr ibn Tammam al Farahidi al-Zadi al Yahmadi wrote a (now lost) book on cryptography, inspired by his solution of a cryptogram in Greek for the Byzantine emperor. His solution was based on known (correctly guessed) plaintext at the message start-a standard cryptanalytic method, used even in WW-II against Enigma messages."

855 Several cipher alphabets, traditionally used for magic, are published by Abu Bakr Ahmad ben 'Ali ben Wahshiyya an-Nabati [Kahn 96 p.93].

855 Quoting [Kahn 96 p.94] "A few documents with ciphertext survive from the Ghaznavid government of conquered Persia, and one chronicler reports that high officials were supplied with a personal cipher before setting out for new posts. But the general lack of continuity of Islamic states and the consequent failure to develop a permanent civil service and to set up permanent embassies in other countries militated against cryptography's more widespread use."

1226 Quoting [Kahn 96 p.106] "As early as 1226, a faint political cryptography appeared in the archives of Venice, where dots or crosses replaced the vowels in a few scattered words."

1250 Quoting Roger Bacon, "A man is crazy who writes a secret in any other way than one which will conceal it from the vulgar" [Davis 23].

1379 At the request of Pope Clement VII, a combination substitution alphabet and small code is compiled by Gabrieli di Lavinde. This is apparently the first example of a nomenclator. Nomenclators are easy to use, so they remained popular until about 1800, even though more secure ciphers became available during that time [Kahn 96 p.107].

1300s Quoting [Kahn 96 p.97] "Abd al-Rahman Ibn Khaldun wrote 'The Muqaddimah,' a substantial survey of history which cites the use of 'names of perfumes, fruits, birds, or flowers to indicate the letters, or ... of forms different from the accepted forms of the letters' as a cipher among tax and army bureaus. He also includes a reference to cryptanalysis, noting, 'Well-known writings on the subject are in the possession of the people.' "

1392 The Equatorie of the Planetis, attributed to Geoffrey Chaucer, contains passages in a simple substitution cipher with an alphabet consisting of letters, digits, and symbols [Price 55, pp.182-187].

1412 Subh al-a'sha, an Arabic encyclopedia that includes a chapter on cryptology, is written by Shihab al-Din abu 'l-'Abbas Ahmad ben 'Ali ben Ahmad 'Abd Allah al-Qalqashandi. The author attributes this material to another Arab scholar who lived from 1312 to 1361 but whose writings on cryptology have been lost. This chapter discusses both substitution and transposition ciphers and also, apparently for the first time, a cipher with multiple substitutions for each plaintext letter. There is also an exposition on and worked examples of cryptanalysis, including the use of letter frequencies and sets of letters that cannot occur together in one word [Kahn 96 p.95].

1466-7 Leon Battista Alberti (possibly instructed by Leonardo Dato) develops the first polyalphabetic cipher and constructs a cipher disk (Figure 5.1) to mechanize the process. Alberti also wrote extensively on the state of the art in ciphers (Chapter 3) [Kahn 96 p.127].

1473-1490 Quoting [Kahn 96 p.91] "A manuscript ... by Arnaldus de Bruxella uses five lines of cipher to conceal the crucial part of the operation of making a philosopher's stone."

1518 Johannes Trithemius writes and publishes the first printed book on cryptology, Polygraphiae Libri Sex. He also develops a steganographic cipher in which each letter is represented as a word taken from a succession of columns, such that the resulting string of words constitutes a legitimate prayer. He also describes polyalphabetic ciphers in the now-standard form of rectangular substitution tables (Figure 3.4) and introduces the notion of changing alphabets with each letter (Chapter 3 and [Kahn 96 p.130-136]).

1553 Giovan Batista Belaso adds a key to the multirow code table of Trithemius, resulting in the algorithm that today is attributed to Vigenère [Kahn 96 p.137].

1563 Giambattista (Giovanni Battista) della Porta writes a text on ciphers, introducing the digraphic cipher. Porta classified ciphers as transposition, substitution, and symbol substitution (use of a strange alphabet) and proposed the use of synonyms and misspellings to confuse the cryptanalyst. He apparently introduced the notion of a mixed alphabet in a polyalphabetic tableau [Kahn 96 p.138].

1564 Giovan Batista Belaso publishes an autokey cipher, improving on the work of Girolamo Cardano, who appears to have invented the idea [Ore 53].

1585 Blaise de Vigenère writes Traicté des chiffres, a book on ciphers including the first authentic plaintext and ciphertext autokey ciphers (in which previous plaintext or ciphertext letters are used for the current letter's key). These ciphers were later forgotten and were reinvented late in the 19th century [Kahn 96 p.146]. The autokey idea was revived in the CBC and CFB modes of DES (Section 7.3).

1623 Sir Francis Bacon describes a biliteral cipher known today as a 5-bit binary encoding. He considers it a steganographic method, and uses variation in typeface to carry each bit of the encoding [Bacon 23].

1790s Thomas Jefferson invents his wheel cipher.

1817 Colonel Decius Wadsworth constructs a geared cipher disk with a different number of letters in the plain and cipher alphabets, resulting in a progressive cipher in which the permuted alphabets are used irregularly, depending on the plaintext [Kahn 96 p.195].

1854 Charles Wheatstone develops the cipher popularized by his friend Lyon Playfair and known today as the Playfair cipher (Section 1.6) [Kahn 96 p.198]. Wheatstone also reinvented the Wadsworth device.

1854 Charles Babbage seems to have reinvented the wheel cipher [Kahn 96 p. 81].

1857 Following the death of Admiral Sir Francis Beaufort, his cipher (a variant of the Vigenère cipher) is published by his brother in the form of a 4 x 5 inch card [Kahn 96 p.202].

1859 Pliny Earle Chase publishes the first fractionating (tomographic) cipher [Kahn 96 p.203].

1861 Friedrich W. Kasiski publishes Die Geheimschriften und die Dechiffrierkunst (Secret writings and the art of deciphering) presenting a general solution of a polyalphabetic cipher with repeating key, thus ending several hundred years of dominance of the Vigenère cipher [Kahn 96 p.207].

1861-5 During the Civil War, the Union Army uses (in addition to other ciphers) substitution of select words followed by word column-transposition. The Confederacy, on the other hand, uses the Vigenère cipher (just when it was broken by Kasiski) [Kahn 96 p.215].

1891 Major Etienne Bazeries develops his version of the Jefferson wheel cipher and publishes the design in 1901 after the French Army rejected it [Kahn 96 p.81].

1891 September 24 Birth of William Frederick Friedman, in Kishinev, Russia.

1895 The development of commercial, practical radio, by Guglielmo Marconi, had caused a revolution in cryptography. Suddenly, there was no longer a need to string wires and have both sender and receiver located near a telegraph office. Anyone with a radio transmitter could send messages and anyone with a receiver could receive them. Obviously, messages sent by radio can easily be intercepted by anyone, so cryptography became indispensable. From that moment, important messages HAD to be encrypted.

1912 June 23 Birth of Alan Turing at Paddington, London.

1913 Captain Parker Hitt reinvents the wheel cipher, in strip form, leading to the M-138-A cipher of World War II [Kahn 96 p. 81].

1916 Major Joseph O. Mauborgne modifies Hitt's strip device back to wheel form and strengthens the alphabet construction. This led to the M-94 cipher device [Kahn 96 p.81].

1917 William Frederick Friedman, the father of American cryptanalysis, is employed as a civilian cryptanalyst (along with his wife Elizebeth) at Riverbank Laboratories where he concentrates on cryptanalysis for the United States Government (which had no cryptanalytic expertise of its own at this time). He goes on to start a school for military cryptanalysts at Riverbank, later moving it to Washington [Kahn 96 p.371].

1917 Gilbert S. Vernam, working for AT&T, invents a practical polyalphabetic cipher machine using a random, nonrepeating key, a one-time-tape. A United States patent (1,310,719) was issued on July 22, 1919 for this device. The device used two tapes of random characters to generate a stream of random characters. The ciphertext is generated by combining ASCII plaintext with a one-time pad or key. The key is combined with the plaintext stream by exclusive-oring the two ASCII codes, thus creating the encrypted ciphertext. If implemented correctly, such a device is absolutely secure. The machine was offered to the Government for use in World War I but was rejected. It was sold commercially in 1920 [Kahn 96 p.401].

1917-1918 The United States Army creates the Cipher Bureau, part of the Military Intelligence Division. The small Signal Intelligence Service of the Army Signal Corps later carries on its duties.

1918 The ADFGVX cipher (Section 2.6) is used by the Germans near the end of World War I. This cipher includes a substitution (through a keyed array), fractionation, and transposition of the letter fractions. It was broken by the French cryptanalyst, Lieutenant Georges Painvin [Kahn 96 pp.340-5].

1919 Hugo Alexander Koch files a patent in the Netherlands on a rotor cipher machine. In 1927, he assigns his patent rights to Arthur Scherbius who invented and had been marketing the Enigma machine (Section 5.2) since about 1923 [Kahn 96 p.420].

1919 Arvid Gerhard Damm applies for a patent in Sweden on a mechanical rotor cipher machine. This machine grew into a family of cipher machines under the direction of Boris Caesar Wilhelm Hagelin who took over the business and was the only commercial cryptographer of this period to become a successful businessman. After the war, a Swedish law that enabled the government to appropriate inventions it felt important to defense caused Hagelin to move the company to Zug in Switzerland where it was incorporated as Crypto AG. The company is still in operation, although facing controversy for having allegedly weakened a cipher product for sale to Iran [Kahn 96 p.422].

1921 Edward Hebern incorporates "Hebern Electric Code," a California company to manufacture the electromechanical rotor cipher machine he invented. The machine was based on scrambling rotors turning each other as in an odometer [Kahn 96 p.415].

1923 "Chiffriermaschinen Aktiengesellschaft" is founded by Arthur Scherbius and Richard Ritter to make and sell the Enigma machine [Kahn 96 p.421].

1924 Alexander von Kryha develops and sells a "coding machine." This machine was cryptographically weak, because of a short period (a test cryptogram of 1135 characters was solved by American cryptanalysts in less than three hours), but sold well for three decades, owing to the salesmanship of the inventor [Deavours and Kruh 85, p.151].

1924 The United States Navy creates its first cryptanalytic group, part of the Code and Signal Section of the Office of Naval Communications.

1928 Polish intelligence becomes interested in the German Enigma machine. They get a commercial version of the Enigma but are still unable to decipher German military communications.

1929 Lester S. Hill publishes "Cryptography in an Algebraic Alphabet" [Hill 29] in which a block of plaintext is enciphered by a matrix operation [Kahn 96 p.404].

Early 1930s Polish codebreakers led by Marian Rejewski break the Enigma code (with help from German documents and Enigma keys obtained from French intelligence) and routinely read German military messages (Section 5.4).

1936 Publication of "On Computable Numbers" [Turing 36]. Turing machine is proposed.

1937 The Japanese cipher machine (code name Purple) is developed in response to revelations by Herbert O. Yardley. Its code is broken by a team headed by William Frederick Friedman. The machine used telephone stepping relays instead of rotors. As a result, the substitution rules for the individual steps were not related in a simple, odometerlike way as in a rotor machine [Kahn 96 p.18ff].

1930s The American SIGABA cipher machine (code name M-134-C) is developed (by William F. Friedman or someone on his team). It uses random stepping of its rotors on each enciphering step rather than the simple, odometerlike stepping of rotors as in the Enigma. It also has 15 rotors instead of the more conventional 3 or 4 [Kahn 96 p.510ff].

1939 Polish intelligence passes their work on the Enigma to French and British Intelligence.

1939-40 Turing is recruited for Bletchley Park and is introduced to the Enigma. He has his breakthrough in January 1940, designs the Bombes, and helps in daily decryption.

1939-42 Bletchley Park cryptographers break the German Navy Enigma cipher (used by U-boats), thereby turning the tide in the battle of the Atlantic.

1940s The United States military creates the Army-Navy Communications Intelligence Board (ANCIB) to facilitate cooperation in intelligence gathering.

1945 ANCIB adds the State Department to its membership and becomes the State-Army-Navy Communications Intelligence Board (STANCIB).

1946 STANCIB becomes the United States Communications Intelligence Board (USCIB) and adds the FBI to its membership.

1947 The United States Congress passes the National Security Act, aiming to centralize U.S. intelligence operations. The act establishes the National Security Council (NSC) and the CIA.

1949 Secretary of Defense Louis A. Johnson issues a directive creating the Armed Forces Security Agency, the intelligence and security arm of the military.

1952 American president Harry S. Truman issues a top-secret directive creating the National Security Agency (NSA). Major General Ralph Canine is named its first director.

1954 June 7 Death of Alan Turing by cyanide poisoning, Wilmslow, Cheshire.

1960 The NSA demonstrates its new capabilities by photographing and interpreting the Soviet military buildup in Cuba, including the installation of missiles aimed at the United States.

1970 Horst Feistel leads a research team at IBM whose work culminates in the Lucifer cipher (Section 7.2, [Feistel 70] and [Feistel 74]). This was the predecessor of the family of "Feistel ciphers" which includes the data encryption standard (DES, Section 7.3).

1971 The cryptographic agencies of the United States Air Force, Army, and Navy are reorganized into the newly created Central Security Service (CSS), operating under the NSA.

1975-1976 The United States House and Senate create permanent committees to oversee the actions of the American intelligence community. This was done in response to revelations in the media that the NSA and other government agencies had spied on citizens who participated in the civil rights and anti-Vietnam war movements.

1976 The data encryption standard (DES, Section 7.3), designed by IBM and based on Lucifer, is selected as the standard for encryption in the United States. It has since gained worldwide acceptance.

1976 Whitfield Diffie and Martin Hellman publish "New Directions in Cryptography" [Diffie and Hellman 76], introducing the idea of public-key cryptography by means of a one-way function. They also propose the idea of message authentication.

April 1977 Ronald Rivest, Adi Shamir, and Leonard Adleman, inspired by [Diffie and Hellman 76], develop the RSA algorithm, a practical public-key cipher whose security is based on the difficulty of factoring large numbers. The algorithm is published in 1978 [Rivest, Shamir, and Adleman 78].

1978 Congress passes the Foreign Intelligence Surveillance Act to regulate electronic intelligence gathering. The act includes the creation of a special court to handle requests by the NSA to perform electronic surveillance on targeted U.S. persons. Later classified regulations deal with the handling of foreign intelligence electronic surveillance.

1982-present Gilles Brassard, Charles Bennett, and collaborators work on quantum cryptography (Chapter 9). Two representative publications are [Bennett et al. 82] and [Bennett et al. 92]. Photons are used to generate a random stream of bits that becomes a one-time pad. Encryption is done with a polyalphabetic cipher using this random key. The use of a one-time pad provides absolute security, and in addition, the method generates information on how many bits may have been intercepted during transmission. On the downside, a direct fiber-optic connection between sender and receiver is required.

1984 The proliferation of personal computers and computer communications gives rise to the new "field" of computer crime. American President Ronald Reagan issues a directive assigning the NSA the responsibility of maintaining security of government computers.

1984-5? The ROT13 cipher (short for Rotate 13) is introduced. This simple Caesar cipher is intended to render text temporarily unreadable to the casual observer by shifting each letter 13 positions. Most newsreaders use this cipher, even though anyone can easily break it, because it obfuscates objectionable material, such as dirty jokes in humor newsgroups.

1987 The United States Congress passes the Computer Security Act. This law states that in the area of unclassified computing systems, it is not the NSA but the National Institute of Standards and Technology (NIST) that's responsible for the development of technical standards for civilian communication systems.

1990 The International Data Encryption Algorithm (IDEA, Section 7.5) is proposed by Xuejia Lai and James Massey [Lai and Massey 91] as a potential replacement for DES. The IDEA algorithm employs a 128-bit key and operations that are easy to implement.

1991 The first version of PGP (pretty good privacy, Section 8.6) is released as freeware by Phil Zimmermann. The high-security encryption offered by PGP, combined with its ease of use, quickly makes it a worldwide de facto standard.

1993 The United States government proposes the Clipper chip. The idea is to place a special chip inside communication devices to allow for easy encryption of private communications. The Clipper was supposed to use an encryption algorithm, dubbed Skipjack, that was developed by the NSA and initially kept secret. The controversial part of the proposal had to do with establishing a third party (escrow) to keep the keys used by all the Clipper chips, so that the government could decrypt (with court permission) all communications. Faced with heavy opposition from privacy groups and scientists, the government, in 1998, gave up on the Clipper idea and made the details of Skipjack public.

1994 The RC5 block-encryption algorithm (Section 7.6) is designed and published by Ronald Rivest. It uses data-dependent rotation as its nonlinear operation and is parameterized so that the user can vary the block size, number of rounds, and key length. It is suspected, but as yet not proven, that certain values of the parameters may produce better encryption than the data encryption standard (DES).

1994 Stego, one of the first modern image-steganography methods, is developed by Romana Machado.

1998 The NSA proposes the Echelon project, which it insists complies with United States law, but which privacy groups describe as a worldwide surveillance network that eavesdrops on all communications and shares its knowledge with several allies of the United States. A report issued by the European Parliament claims that Echelon targets civilian communications, concentrating on groups such as Amnesty International and Greenpeace.

1999 The Electronic Privacy Information Center files suit in United States Federal Court, seeking the release of NSA documents concerning potential surveillance of American citizens by the Echelon project.

1999 The director of an Australian intelligence agency publicly acknowledges a long-rumored relationship between American and British intelligence agencies known as UKUSA that allows them to share data.

2000 The NSA denies allegations that it collects all electronic communications, spies on American citizens, and provides intelligence information to U.S. companies. At the same time, NSA director Michael Hayden and CIA director George Tenet, while testifying before Congress, refuse to either confirm or deny the existence of Echelon.

October 2000 Rijndael (Section 7.7) is selected by the National Institute of Standards and Technology (NIST) as the new Advanced Encryption Standard (AES).

The invention of cryptography is not limited to either civilians or the government. Wherever the need for secrecy is felt, the invention occurs. However, over time the quality of the best available system continues to improve and those best systems were often invented by civilians.
-David Kahn, The Codebreakers (1967)

Glossary

Adversary. The eavesdropper, the opponent, the enemy, or any other mischievous person who tries to compromise our security.

AES. Advanced Encryption Standard, adopted by NIST as a replacement for the DES. (See Section 7.7.)

Affine cipher. The term affine refers to a linear function, a function of the form f(x) = ax+b where b is nonzero. The affine cipher (in the Introduction) is an extension of the basic Caesar cipher where a plainletter is multiplied by a key before the Caesar key is added to it. (See also Caesar Cipher.)

Algorithm. A mathematical procedure where a task is executed in a finite sequence of steps.

Alice. A term for the first user of cryptography in discussions and examples. Bob's associate.

In Bruce Schneier's definitive introductory text Applied Cryptography he introduces a table of dramatis personae headed by Alice and Bob. Others include Carol (a participant in three- and four-party protocols), Dave (a participant in four-party protocols), Eve (an eavesdropper), Mallory (a malicious active attacker), Trent (a trusted arbitrator), Walter (a warden), Peggy (a prover) and Victor (a verifier). These names for roles are either already standard or, given the wide popularity of the book, may be expected to quickly become so. -The New Hacker's Dictionary ver. 4.2.2

Anagram. A word, phrase, or sentence formed from another by rearranging its letters: "erects" is an anagram of "secret."

ASCII. Short for "American Standard Code for Information Interchange," a standard that assigns 7-bit codes to a set of 128 characters.

Asymmetric algorithm. A cryptographic algorithm where different keys are used for encryption and decryption. Most often a public-key algorithm.

Asymmetric key. A cryptographic technique where encryption and decryption use different keys.

Attack. An approach used by a codebreaker to decrypt encrypted data or to reveal hidden data. An attack may use brute force, where every key is tried, or a sophisticated approach such as differential cryptanalysis. An attacker may use only known ciphertext or known ciphertext and plaintext.

Authentication. The process of verifying that a particular name really belongs to a particular entity.

Authenticity. The ability to ensure that the given information was in fact produced by the entity whose name or identification it carries and that it was not forged or modified.

Autokey. A mode in which the cipher is used to generate the key stream. Also called output feedback (OFB) mode.

Back door. A feature in the design of an algorithm that permits those familiar with the feature to bypass the security of the algorithm. The term trapdoor refers to a similar feature. (See Trapdoor.)

Block. A fixed length string of bits. Longer sequences of bits can be broken down into blocks.

Block cipher. A symmetric cipher that encrypts a message by breaking it down into blocks and encrypting each block. DES, IDEA, and SKIPJACK are block ciphers.

BMP. BMP is the native format for image files in the Microsoft Windows operating system. It has been modified several times since its inception, but has remained stable from version 3 of Windows. BMP is a palette-based graphics file format for images with 1, 2, 4, 8, 16, 24, or 32 bitplanes. It uses a simple form of RLE to compress images with 4 or 8 bitplanes.

Bob. A term used for the second user in cryptographic discussions and examples. Alice's associate.

BPCS steganography. A sophisticated algorithm for hiding data bits in individual bit planes of an image. (See Section 11.2.)

Caesar cipher. A cipher where each letter is replaced by the letter located cyclically n positions in front of it in the alphabet. (See also Affine Cipher.)

Camouflage. A term in steganography. Any steganography method that hides a data file D in a cover file A by scrambling D, then appending it to A.

Capstone. A United States government project to develop a set of standards for publicly available cryptography, as authorized by the Computer Security Act of 1987.

Checksum. A numeric value used to verify the integrity of a block of data. (See CRC.)

Chrominance. Components of color. They represent color in terms of the presence or absence of blue (Cb) and red (Cr) for a given luminance intensity. (See also Luminance.)

Cipher. A key-based algorithm that transforms a message between plaintext and ciphertext. A cryptographic algorithm.

Ciphertext. Data after being encrypted with a cipher, as opposed to plaintext.

Clipper. An encryption chip developed and sponsored by the United States government as part of the Capstone project.

Code. A cryptographic technique that uses a codebook to replace words and letters in the plaintext with symbols from the code book.

Combiner. A mechanism that mixes two data items into a single result. The XOR operation is a common combiner because it is reversible. Other examples are the Geffe generator and the summation generator (See Latin square combiner, Geffe generator, and Section 6.5).

Confidentiality. Ensuring that information is not disclosed to people who aren't authorized to receive it.

Confusion. The part of an encryption algorithm that modifies the correspondence between plain symbols and cipher symbols. (See also Diffusion.)

Context-free grammar (CFG). A set of rewriting (or production) rules used to generate strings of various patterns. CFGs are used by the steganographic method Mimic Functions to generate innocuous text files that hide data. (See Mimic functions.)

Cover (in steganography). A piece of data in which another datum is hidden. Also known as a host, or a carrier.

CRC. An error-detecting code (Appendix C) based on polynomial operations. It is appended to a block of data to increase its error-detection and correction capabilities. (See Checksum.)

The CRC result is an excellent (but linear) hash value corresponding to the data. Compared with other hash alternatives, CRCs are simple and straightforward. They are well understood. They have a strong and complete basis in mathematics, so there can be no surprises. CRC error-detection is mathematically tractable and provable without recourse to unproven assumptions. Such is not the case for most cryptographic hash constructions.

Cryptanalysis. The science and art of breaking encryption (recovering plaintext from ciphertext when the key is unknown).

Cryptanalyst. One who tries to break encrypted codes.

Cryptographer. One who develops encryption methods.

Cryptography. The art and science of using mathematics to obscure the meaning of data by applying transformations to them that are impractical or impossible to reverse without the knowledge of some key. The term comes from Greek for "hidden writing."

Cryptology. The branch of mathematics concerned with secret writing in all its forms. It includes cryptography, cryptanalysis, and steganography.

Indiman drew from a locked drawer in the big centre-table the long strip of bluish paper covered with its incomprehensible dashes. "One of the oldest of devices for secret writing," he remarked. "This slip of paper was originally wrapped about a cylinder of a certain diameter and the message traced upon it, and it can only be deciphered by rerolling it upon another cylinder of the same diameter. Easy enough to find the right one by the empiric method-I mean experiment. Once you recognize the fundamental character of the cryptogram the rest follows with ridiculous certainty. Behold!" -Van Tassel Sutphen, The Gates of Chance

Cryptoperiod. The amount of time a particular key is used. Sometimes refers to the amount of data encrypted with it.

Cryptosystem. An encryption and decryption algorithm (cipher), together with all its possible plaintexts, ciphertexts, and keys.

Data compression. The field concerned with reducing the size of data by eliminating redundancies in the data representation. (See Exercise 10.1.)

Data Encryption Standard (DES). A block cipher based on the work of Horst Feistel in the 1970s that's widely used in commercial systems. DES is a 64-bit block cipher with a 56-bit key organized in 16 rounds of operations.

Data hiding. See Steganography.

Data key. A cryptographic key that encrypts data, as opposed to a key that encrypts other keys. Also called a session key.

Decipher. To transform an encrypted message (ciphertext) back to the original message (plaintext).

Decode. To decipher.

Decryption. To extract encrypted data and make them readable. To decipher. (See also Decipher, Decode, Encryption.)

DES. See Data Encryption Standard.

Differential cryptanalysis. A technique for attacking a cipher by feeding it carefully selected plaintext and watching for patterns in the ciphertext.

Diffie-Hellman (DH). A public-key cryptography algorithm that generates a key between two entities after they publicly share some randomly-generated data.
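The exchange can be sketched with toy numbers. The prime p = 23, the base g = 5, and the private values 4 and 3 are illustrative assumptions only; real systems use primes of thousands of bits, and the helper names below are hypothetical.

```python
def dh_public(g: int, private: int, p: int) -> int:
    """Value a party publishes: g raised to its private exponent, mod p."""
    return pow(g, private, p)

def dh_shared(other_public: int, private: int, p: int) -> int:
    """Key a party derives from the other side's published value."""
    return pow(other_public, private, p)

# Toy parameters (assumption: far too small for real use).
p, g = 23, 5
alice_private, bob_private = 4, 3

alice_public = dh_public(g, alice_private, p)   # sent in the clear
bob_public = dh_public(g, bob_private, p)       # sent in the clear

alice_key = dh_shared(bob_public, alice_private, p)
bob_key = dh_shared(alice_public, bob_private, p)
```

Both parties arrive at the same key even though only the public values were exchanged.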

Diffusion. An important principle of encryption. Changing one plain-symbol will change adjacent or nearby cipher-symbols. In a block cipher, diffusion propagates bit changes from one part of a block to other parts of the same block. Diffusion is achieved by mixing, and the step-by-step process of increasing diffusion is described as avalanche. (See also Confusion.)

Digital signature. Data value generated by a public-key algorithm based on the content of a block of data and on a private key. It generates an individualized checksum.

Digital Signature Standard (DSS). A digital signature algorithm developed by the NSA and endorsed by NIST.

Elliptic curve cryptography. A cryptographic method that employs elliptic curves to generate very large finite fields.

Embedding capacity. A concept in steganography. A measure of the amount of data that can be hidden in a cover.

Encipher. To transform an original message (plaintext) to an encrypted message (ciphertext).

Encode. To encipher.

Encryption. The transformation of plaintext into ciphertext through a mathematical process.

Entering wedge. Weakness in a cryptographic or other security system that gives an attacker a way to break down some of the system's protections.

Error-correcting code. Codes that increase data reliability by adding redundancy to the data. Such codes can automatically correct certain errors and can also detect (but not correct) more serious errors.

Error-detecting code. Codes that increase data reliability by adding redundancy to the data. Such codes can automatically detect (but not correct) certain errors.

Escrowed Encryption Standard (EES). A standard proposed by the NSA that requires users to deposit their cryptographic keys with a third party and allows law enforcement to obtain these keys. This standard is not used in any currently-available systems or products.

Eve. A term used in cryptography discussions and examples for the ubiquitous eavesdropper.

Exclusive-OR. A logical (Boolean) operation that's also its own inverse, which makes it useful in cryptography. It is identical to adding two bits modulo 2. (See XOR.)
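A minimal sketch of the self-inverse property, assuming a data block and a key of equal length (the function name `xor_bytes` is hypothetical):

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings, bit by bit."""
    if len(data) != len(key):
        raise ValueError("data and key must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"secret"
key = bytes([0x5A, 0x13, 0xC7, 0x08, 0xFE, 0x21])  # arbitrary example key

ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)  # applying the same key again undoes it
```

Applying the same key twice returns the original data, which is why the operation serves for both encryption and decryption.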

Factor. Given an integer N, a factor is any integer that divides it without a remainder.

Factoring. The process of finding the prime factors of an integer.

Feistel cipher. A special class of iterated block ciphers where the ciphertext is calculated from the plaintext by repeated application of the same transformation called a round function.

Field. A set of mathematical entities satisfying certain rules. Finite fields, also called Galois fields (Appendix D), are used in cryptography in the Rijndael (AES) algorithm and in stream ciphers. (See also Finite field, Group.)

Finite field. See Field.

Function. A mathematical relationship between two values called the input and the output, such that for each input there is precisely one output.

Galois field. See Field.

Geffe generator. A method used by nonlinear stream ciphers to combine two streams of pseudo-random bits. (See Combiner and Section 6.5.)

Giga. The quantity giga is defined as 2^30 = 1,073,741,824. In contrast, a billion is defined (in the United States) as 10^9. (See Mega.)

Gray code. Binary codes with the useful property that the codes of consecutive numbers differ by exactly one bit.
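One common construction, the reflected binary Gray code, can be sketched as follows. The glossary does not say which Gray code the book uses, so this particular construction is an assumption.

```python
def gray(n: int) -> int:
    """Reflected binary Gray code of the nonnegative integer n."""
    return n ^ (n >> 1)

def differing_bits(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# Codes of 0..7, written as 3-bit strings.
codes = [format(gray(n), "03b") for n in range(8)]
```

Each code in `codes` differs from its predecessor in exactly one bit position.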

Group. A set of mathematical entities obeying certain rules. (See Field.)

Hashing. An operation that scrambles the bits of a data item to obtain a value that can be used as a pointer to a data structure called a hash table. (See Appendix B.)

Hide and seek. Steganography software to hide data in the least significant bits of an image. (See also LSB and Section 12.10.2.)

Hill cipher. A polyalphabetic cipher that employs the modulus function and techniques of linear algebra. (See Section 3.12.)

Homophonic substitution cipher. A cryptographic technique where each plainletter has several potential cipherletters that can replace it. The word comes from the Greek for "the same sound." (See Section 1.7.)

Therefore, though the whole point of his "Current Shorthand" is that it can express every sound in the language perfectly, vowels as well as consonants, and that your hand has to make no stroke except the easy and current ones with which you write m, n, and u, l, p, and q, scribbling them at whatever angle comes easiest to you, his unfortunate determination to make this remarkable and quite legible script serve also as a Shorthand reduced it in his own practice to the most inscrutable of cryptograms. -George Bernard Shaw, Pygmalion (1916)

IDEA. A patented block cipher developed by James Massey and Xuejia Lai in 1992. It uses a 128-bit key and 64-bit blocks. IDEA uses no internal tables and is known mostly because it is used in PGP. (See also Pretty good privacy (PGP) and Section 7.5.)

Inline encryptor. A hardware product that automatically encrypts all data passing along a data link.

International Data Encryption Algorithm (IDEA). (See IDEA.)

Invisibility. A measure of the quality of a steganographic method.

Involution. Any mapping that's its own inverse. (See Section 5.4.)

Kerberos. An authentication service developed by the Project Athena team at MIT.

Kerckhoffs' principle. An important principle in cryptography. It states that the security of an encrypted message must depend on keeping the key secret and should not depend on keeping the encryption algorithm secret.

Key. Information (normally secret) used to encrypt or decrypt a message in a distinctive manner. A key may belong to an individual or to a group of users.

Key distribution. The process (or rather the problem) of safely distributing a cryptographic key to a (possibly large) group of authorized parties.

Key escrow. A scheme for storing copies of cryptographic keys so that a third, authorized party can recover them if necessary to decrypt messages.

Key space. The number of possible key values. For example, there are 2^64 key values for a 64-bit key. (See Exercise 5.)

Latin square combiner. A cryptographic combining algorithm. In a simple Latin square combiner algorithm, two consecutive plaintext symbols A and B are used to select a third symbol C from the square and the resulting ciphertext consists of either A and C or B and C. (See also Combiner and Section 6.8.)

LFSR. Linear feedback shift register. A simple, efficient technique to produce a large number of pseudo-random bits. (See Shift register and Section 6.3.)
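A minimal sketch of the idea, assuming a 4-bit register whose feedback is the XOR of its two lowest bits. This tap choice is an illustrative assumption, not taken from Section 6.3; it happens to cycle through all 15 nonzero states.

```python
def lfsr_step(state: int) -> tuple[int, int]:
    """Advance a 4-bit register by one step; return (new_state, output_bit)."""
    out = state & 1                             # bit shifted out of the register
    feedback = (state ^ (state >> 1)) & 1       # XOR of the two lowest bits
    new_state = (state >> 1) | (feedback << 3)  # feedback enters at the top
    return new_state, out

def lfsr_bits(seed: int, count: int) -> list[int]:
    """First `count` output bits, starting from a nonzero 4-bit seed."""
    state, bits = seed, []
    for _ in range(count):
        state, bit = lfsr_step(state)
        bits.append(bit)
    return bits

def lfsr_period(seed: int) -> int:
    """Number of steps until the register returns to the seed."""
    state, steps = lfsr_step(seed)[0], 1
    while state != seed:
        state, steps = lfsr_step(state)[0], steps + 1
    return steps
```

An all-zero seed must be avoided, since the register would then stay at zero forever.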

LSB. The least significant (rightmost) bit of a data item. (See also LSB encoding, MSB.)

LSB encoding. Steganographic methods that hide data in the least significant bits of an image. (See also Hide and seek, BPCS, LSB, S-tools, Stego, and Section 11.1.)
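The basic idea can be sketched on a list of 8-bit pixel values. The helper names and the one-bit-per-pixel layout are assumptions for illustration; practical programs add scrambling, headers, and pixel selection.

```python
def embed_bits(pixels: list[int], bits: list[int]) -> list[int]:
    """Hide one data bit in the least significant bit of each pixel."""
    if len(bits) > len(pixels):
        raise ValueError("cover is too small for the data")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the data bit
    return stego

def extract_bits(pixels: list[int], count: int) -> list[int]:
    """Read back the first `count` hidden bits."""
    return [p & 1 for p in pixels[:count]]

cover = [200, 13, 77, 254]
secret = [1, 0, 1, 1]
stego = embed_bits(cover, secret)
```

No pixel value changes by more than 1, which is why the modification is normally invisible to the eye.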

Luminance. A component of color. Roughly speaking, luminance corresponds to brightness as perceived by the human eye. (See also Chrominance.)

Mega. Mega is defined as 2^20 = 1,048,576. In contrast, a million is defined as 10^6. (See Giga.)

Mimic functions. A steganographic method that uses context-free grammars to generate innocuous text files that hide data. (See Context-free grammar (CFG) and Section 10.8.)

Monoalphabetic substitution cipher. A cryptographic algorithm with a fixed substitution rule. (See Chapter 1.)

MSB. The most significant (leftmost) bit of a data item. (See also LSB.)

Multiple encryption. The process of encrypting an already encrypted ciphertext. Such secondary encryption should be done with a different key, not the key used for the first encryption. Multiple encryption may involve more than two encryption steps. The main advantage of multiple encryption is that the input to the second encryption step is the output of the first step, so it is ciphertext that looks random. An attack on the second encryption step should therefore produce something that looks random, making it extremely hard for the codebreaker to decide whether the attack was successful. Multiple encryption also helps to protect the cipher from a known plaintext attack.

National Computer Security Center (NCSC). United States government organization that evaluates computing equipment for high-security applications.

National Institute of Standards and Technology (NIST). An agency of the United States government that establishes national standards.

National Security Agency (NSA). A branch of the United States Department of Defense responsible for intercepting foreign communications and for ensuring the security of United States government communications.

Network encryption. Cryptographic services applied to data above the data link level but below the application software level in a network. This allows cryptographic protections to use existing networking services and existing application software in a way that's transparent to the user.

Nomenclator. A cipher that consists of a list where each entry associates a letter, syllable, word, or name with a number. Encryption is done by finding a plain word in the list and replacing it by the corresponding number. If a word is not found in the list, its syllables or letters are individually replaced by numbers.

Nonrepudiation. Accountability. An important goal of cryptography. The idea that the reception of a message cannot later be denied by the receiver.

One-time pad. A random sequence of bits that is as long as the message itself and is used as a key. Alternative definition: A Vernam cipher in which one bit of new, purely random key is used for every bit of data being encrypted. (See Vernam cipher.)

Permutation. Any arrangement or rearrangement of symbols or data items.

Plaintext. An as-yet unencrypted message.

Polyalphabetic substitution. A cryptographic technique where the rule of substitution changes all the time.

Polynomial. A function of the form P_n(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n. Polynomials are simple functions that have many practical applications.

Pretty good privacy (PGP). Encryption software developed by Philip Zimmermann. PGP encrypts a message with the IDEA algorithm and uses public-key cryptography to encrypt the IDEA key. (See IDEA and Section 8.6.)

Prime. Any positive integer that's evenly divisible only by itself and by 1. The number 1 is considered neither prime nor nonprime. The integer 2 is the only even prime. Prime numbers have important applications in public-key cryptography.

Private key. The key used to decrypt messages in any implementation of public-key cryptography.

PRNG. A pseudo-random number generator. This is a hardware device or a software procedure that uses deterministic rules to generate a sequence of numbers that passes tests of randomness. (See Pseudo-random numbers, Random numbers.)

Pseudo-random numbers. A sequence of numbers that appears to be random but is constructed according to deterministic rules. (See PRNG, Random numbers.)

Public key. The key used to encrypt messages in any implementation of public-key cryptography.

Public-key algorithm. A cipher that uses a pair of keys, a public key and a private key, for encryption and decryption. Also called an asymmetric algorithm.

Public-key cryptography. Cryptography based on methods involving a public key and a private key.

Public-key cryptography standards (PKCS). Standards published by RSA Data Security that describe how to use public-key cryptography in a reliable, secure, and interoperable fashion.

Public-key steganography. Steganography based on methods involving a public key and a private key. (Section 12.9.)

Quantum cryptography. An approach to cryptography using the Heisenberg uncertainty principle to generate any number of true random bits and thereby achieve absolute security.

Random numbers. A sequence of numbers that passes certain statistical randomness tests. Only a sequence can be random. A single number is neither random nor nonrandom. (See also PRNG, Pseudo-random numbers.)

Robustness. A measure of the ability of a steganographic algorithm to retain the data embedded in the cover even after the cover has been subjected to various modifications as a result of lossy compression and decompression or of certain types of processing such as conversion to analog and back to digital.

RSA Data Security, Inc. (RSADSI). The company [RSA Security 02] primarily engaged in selling and licensing public-key cryptography for commercial purposes.

S-box. A substitution box used by many block ciphers as part of the substitution-permutation network of the cipher. Such a box is a table that has internal connections between its inputs and outputs. For any bit pattern sent as input to the box, a certain bit pattern emerges as output.

S-tools. Software for hiding data in the least significant bits of an image or an audio file. (See also LSB and Section 12.10.3.)

Secret-key algorithm. Cryptographic algorithm that uses the same key to encrypt data and to decrypt data. Also called a symmetric algorithm.

Security. The process of protecting vital information from prying eyes. This is done either by encryption or hiding.

Semantic methods. Steganographic methods that hide data in a cover text by slightly modifying semantic elements of the text, such as word usage. (See Syntactic methods.)

Shift register. An array of simple storage elements (normally flip-flops or latches) where the value of each element is moved into the next (or the previous) element. Such registers (implemented in either software or hardware) are used by many stream ciphers. (See LFSR, Stream cipher.)

Signal-to-noise ratio (SNR). A measure of invisibility (or its opposite, detectability) of hidden data.

SKIPJACK. Block cipher developed by NSA and included in the CAPSTONE, CLIPPER, and FORTEZZA devices.

Spread-spectrum steganography. A steganographic method that hides data bits in an image by adding noise to image pixels and hiding one bit in each noise component without changing the statistical properties of the noise. (See Section 11.4.)

Steganographic file system. A method to hide a data file among several other data files. The hidden file can be retrieved with a password, but someone who does not know the password cannot see the hidden file, cannot extract it, and cannot even find out whether the file exists. (Section 12.7.)

Steganography. The art and science of hiding information, as opposed to cryptography, which hides the meaning of the information.

Stego. Software for hiding data in the least significant bits of an image. (See also LSB and Section 12.10.1.)

Stream cipher. A cipher that encrypts one bit at a time. (See LFSR, Shift register.)

Substitution cipher. A cipher that replaces letters of the plaintext with another set of letters or symbols, without changing the order of the letters.

Symmetric cryptography. A cryptographic technique where the same key is used for encryption and decryption.

Syntactic methods. Steganographic methods that hide data in a cover text by slightly modifying syntactic elements of the text, such as punctuation. (See Semantic methods.)

Transform. An operation applied to the pixels of an image to remove correlations between the pixels. Transforms are used for image compression, so there is a need for steganographic methods that hide data in images such that the data are retained after the image is compressed by a transform (as in JPEG or with wavelet methods) and then decompressed.

Transposition cipher. A cipher where the plainletters are rearranged in a different permutation.

Trapdoor. See Back door.

Turing machine. A theoretical model of a computing device, proposed by Alan Turing.

Undetectability. A measure of the quality of a steganographic method.

Vernam cipher. Cipher developed for encrypting teletype traffic by computing the exclusive OR of the data bits and the key bits. This is a common approach to constructing stream ciphers. (See One-time pad.)

Vigenere cipher. A historically important polyalphabetic cipher where a letter-square and a key are used to determine the rule of substitution for each plainletter.
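Looking up the letter-square is equivalent to adding the key letter to the plainletter modulo 26, which the following minimal sketch uses. It assumes uppercase A-Z text with no spaces and hypothetical function names.

```python
A = ord("A")

def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Shift each plainletter by the matching key letter (key repeats cyclically)."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(key[i % len(key)]) - A
        out.append(chr((ord(ch) - A + shift) % 26 + A))
    return "".join(out)

def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Undo the shift applied by vigenere_encrypt."""
    out = []
    for i, ch in enumerate(ciphertext):
        shift = ord(key[i % len(key)]) - A
        out.append(chr((ord(ch) - A - shift) % 26 + A))
    return "".join(out)
```

With the key LEMON, the plaintext ATTACKATDAWN becomes LXFOPVEFRNHR; note that the two T's in ATTACK are replaced by different cipherletters, which is what makes the cipher polyalphabetic.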

Watermarking. A steganographic term. A small amount of data that indicates ownership, authorship, or another kind of relationship between the cover and a person or an organization.

Weak key. A key value that results in easy breaking of a cipher. The various weak keys of DES are well known (Section 7.3.1).

XOR. (See Exclusive-OR.)

Z_N. The set of integers modulo N, i.e., {0, 1, ..., N-1}. The notation Z*_N denotes the set of integers {a ∈ Z_N | gcd(a, N) = 1}.

For the benefit of those who may care to delve into the derivation of the proper names used in the text, and thus obtain some slight insight into the language of the race, there is appended an incomplete glossary taken from some of Lord Greystoke's notes. -Edgar Rice Burroughs, Tarzan the Terrible (1921)

Bibliography
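Both sets are easy to enumerate for small N. The following is a minimal sketch with hypothetical function names, intended only to make the notation concrete.

```python
from math import gcd

def z_n(n: int) -> list[int]:
    """The integers modulo n: 0, 1, ..., n-1."""
    return list(range(n))

def z_n_star(n: int) -> list[int]:
    """Members of Z_n that are relatively prime to n."""
    return [a for a in range(n) if gcd(a, n) == 1]
```

For N = 10 the second set is {1, 3, 7, 9}; for a prime N it contains every nonzero element of Z_N.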

The last thing one knows when writing a book is what to put first. -Blaise Pascal, Pensees (1670)

ACA (2001) is URL http://www.und.nodak.edu/org/crypto/crypto/.

Aegean Park Press (2001) is URL http://www.aegeanparkpress.com/.

AES (2002) is URL http://csrc.nist.gov/encryption/aes/rijndael/.

AFAC (2001) is URL http://www-vips.icn.gov.ru/.

Anderson, Ross, Roger Needham, and Adi Shamir (1998) "The Steganographic File System," in David Aucsmith (ed.) Proceedings of the Second Information Hiding Workshop, IWIH, pp. 73-82, April. Also available from http://citeseer.nj.nec.com/anderson98steganographic.html.

Augarten, Stan (1984) Bit by Bit: An Illustrated History of Computers, New York, Ticknor and Fields.

Aura, Thomas (1996) "Practical Invisibility in Digital Communication," in Proceedings of the Workshop on Information Hiding, Cambridge, England, May 1996, pp. 265-278, Lecture Notes in Computer Science 1174, New York, Springer Verlag. Also available from http://www.tcs.hut.fi/Personnel/tuomas.html.

Bacon, Sir Francis (1623) De Augmentis Scientarum, Book 6, Chapter i, Leiden, A. Wijngaerden.

Baharav, Z. and D. Shaked (1999) "Watermarking of Dither Halftoned Images," in Proceedings of the SPIE 3657 Security and Watermarking of Multimedia Contents, pp. 307-316. Available at http://www.hpl.hp.com/techreports/98/HPL-98-32.html in PDF format.

Bailey, D. H., P. B. Borwein, and S. Plouffe (1995) "A New Formula for Picking off Pieces of Pi," Science News, 148(Oct 28)279. Also available, in PDF format, from URL http://www.cecm.sfu.ca/~pborwein.

Barker, Wayne G. (1981) Cryptanalysis of the Hagelin Cryptograph, Laguna Hills, Calif., Aegean Park Press, vol. C-17.

Barker, Wayne G. (1984) Cryptanalysis of Shift-Register Generated Stream Cipher Systems, Laguna Hills, Calif., Aegean Park Press, vol. C-39.

Barker, Wayne G. (1989) Introduction to the Analysis of the Data Encryption Standard (DES), Laguna Hills, Calif., Aegean Park Press, vol. C-55.

Barker, Wayne G. (1992) Cryptanalysis of the Single Columnar Transposition Cipher, Laguna Hills, Calif., Aegean Park Press, vol. C-59.

Barker, Wayne G. (1996) Cryptanalysis of the Double Transposition Cipher, Laguna Hills, Calif., Aegean Park Press, vol. C-69.

Bassia, P. and I. Pitas (1998) "Robust Audio Watermarking in the Time Domain," in IX European Signal Processing Conference (EUSIPCO'98), Rhodes, Greece, vol. I, pp. 25-28, 8-11 September.

Bauer, Friedrich Ludwig (2000) Decrypted Secrets: Methods and Maxims of Cryptology, 2nd (revised and extended) edition, Springer Verlag.

Bednar, J. B. and T. L. Watt (1984) "Alpha-Trimmed Means and Their Relationship to the Median Filter," IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(1)145-153, February.

Bender, W., D. Gruhl, N. Morimoto, and A. Lu (1996) "Techniques for Data Hiding," IBM Systems Journal, 35(3,4)313-336.

Bennett, Charles H., Gilles Brassard, Seth Breidbart, and Stephen Wiesner (1982) "Quantum Cryptography, or Unforgeable Subway Tokens," in David Chaum, Ronald L. Rivest and Alan T. Sherman, ed., Advances in Cryptology: Proceedings of Crypto 82, pp. 267-275, 23-25 August 1982, New York and London, Plenum.

Bennett, Charles H., Gilles Brassard, and Arthur K. Ekert (1992) "Quantum Cryptography," Scientific American, 267(4)50-57, October. Reprinted in The Computer in the 21st Century, Scientific American Press, 1995, pp. 164-171.

Blake, Ian, Gadiel Seroussi, and Nigel Smart (1999) Elliptic Curves in Cryptography, Cambridge, Cambridge University Press.

Blakley, G. R. (1979) "Safeguarding Cryptographic Keys," in AFIPS Conference Proceedings, 48:313-317.

Bletchley Park Trust (2001) is at URL http://www.bletchleypark.org.uk/.

Bogert, B. P., M. J. R. Healy, and J. W. Tukey (1963) "The Quefrency Alanysis of Time Series for Echoes: Cepstrum, Pseudo-Autocovariance, Cross-Cepstrum, and Saphe Cracking," in Proceedings of the Symposium on Time Series Analysis, M. Rosenthal, ed., New York, John Wiley, pp. 209-243.

Borwein, J. M., and P. B. Borwein (1987) π and the AGM: A Study in Analytic Number Theory and Computational Complexity, New York, John Wiley.

BPCS (2001) is URL http://www.know.comp.kyutech.ac.jp/BPCSe/ file BPCSe-principle.html.

Burton, Sir Richard F. (translator) (1991) The Kama Sutra of Vatsayana, Inner Traditions.

Busch, C., W. Funk, and S. Wolthusen (1999) "Digital Watermarking: From Concepts to Real-Time Video Applications," IEEE Computer Graphics and Applications, Image Security, January/February, pp. 25-35.

Cain, Thomas R. and Alan T. Sherman (1997) "How to Break Gifford's Cipher," Cryptologia, 21(3)237-286, July.

Campbell, K. W. and M. J. Wiener (1993) "DES Is Not a Group," Advances in Cryptology, CRYPTO '92, New York, Springer Verlag, pp. 512-520.

Casanova, Giacomo (1757) Histoire de Ma Vie, in 12 volumes. Translated by Willard R. Trask as The History of My Life, Baltimore, Johns Hopkins University Press, 1967, reissued 1997.

Chen Yu-Yuan, Hsiang-Kuang Pan, and Yu-Chee Tseng (2000) "A Secure Data Hiding Scheme for Two-Color Images," in IEEE Symposium on Computers and Communications, ISCC 2000, pp. 750-755. Also available (in PDF format) from URL http://citeseer.nj.nec.com/chen00secure.html.

Childs, J. Rives (2000) General Solution of the ADFGVX Cipher System, Laguna Hills, Calif., Aegean Park Press, vol. C-88.

Chomsky, Noam and George A. Miller (1958) "Finite State Languages," Information and Control, 1(2)91-112, May.

Chudnovsky, David V. and Gregory V. Chudnovsky (1989) "The Computation of Classical Constants," Proceedings of the National Academy of Science USA, 86(21)8178-8182.

Codes and Ciphers (2001) is URL http://www.codesandciphers.org.uk/.

Collier, Bruce, and James MacLachlan (1998) Charles Babbage and the Engines of Perfection (Oxford Portraits in Science), Oxford University Press.

Conceptlabs (2001) is URL http://www.conceptlabs.co.uk/alicebob.html.

Coppersmith, Donald and Philip Rogaway (1994) "A Software-Optimized Encryption Algorithm," Fast Software Encryption, Cambridge Security Workshop Proceedings, New York, Springer-Verlag, pp. 56-63.

Coppersmith, Donald and Philip Rogaway (1995) "Software-Efficient Pseudorandom Function and the Use Thereof for Encryption," United States Patent 5,454,039, 26 September.

Cox, Ingemar J. (2002) Digital Watermarking, San Francisco, Morgan Kaufmann.

Cox, Ingemar J., Joe Kilian, Tom Leighton, and Talal Shamoon (1996) "A Secure, Robust Watermark for Multimedia," Workshop on Information Hiding, Newton Institute, Cambridge University, May. Also available in PDF format from ftp://ftp.nj.nec.com/pub/ingemar/papers/cam96.zip.

Crap (2002) is URL http://www.ii.uib.no/~larsr/crap.html.

Cryptologia (2001) is URL http://www.dean.usma.edu/math/pubs/cryptologia/.

Cryptology (2001) is http://link.springer.de/link/service/journals/00145/.

CSE (2001) is URL http://www.cse.dnd.ca/.

Czech, Z. J., et al. (1992) "An Optimal Algorithm for Generating Minimal Perfect Hash Functions," Information Processing Letters 43:257-264.

Daemen, Joan, and Vincent Rijmen (2002) The Design of Rijndael, Berlin, Springer-Verlag.

Davis, Tenney (translator) (1923) Roger Bacon's Letter Concerning the Marvelous Power of Art and of Nature and Concerning the Nullity of Magic, Easton, PA, Chemical Publishing.

Deavours, Cipher A. and Louis Kruh (1985) Machine Cryptography and Modern Cryptanalysis, Norwood, MA, Artech House.

DES (1999) is http://csrc.nist.gov/publications/fips/fips46-3/fips46-3.pdf.

Diffie, Whitfield and M. E. Hellman (1976) "New Directions in Cryptography," IEEE Transactions on Information Theory, IT-22(6)644-654, November.

DSD (2001) is URL http://www.dsd.gov.au/.

Dunham W. (1990) Journey Through Genius: The Great Theorems of Mathematics, New York, John Wiley.

El Gamal, T. (1985) "A Public-Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms," IEEE Transactions on Information Theory, IT-31(4)469-472, July.

Fang I. (1966) "It Isn't ETAOIN SHRDLU; It's ETAONI RSHDLC," Journalism Quarterly 43:761-762.

Feige, Uriel, Amos Fiat and Adi Shamir (1988) "Zero Knowledge Proofs of Identity," Journal of Cryptology, 1(2)77-94.

Feistel, Horst (1970) "Cryptographic Coding for Data-Bank Privacy," IBM Research Report RC2827, March.

Feistel, Horst (1973) "Cryptography and Computer Privacy," Scientific American, 228(5)15-23, May.

Feistel, Horst (1974) "Block Cipher Cryptographic System," United States Patent 3798359, March 19.

FIPS 140-1 (1992), "Security Requirements for Cryptographic Modules," Federal Information Processing Standards, Publication 140-1, United States Department of Commerce/NIST, National Technical Information Service. This standard is available in PDF format from http://csrc.nist.gov/publications/fips/fips140-1/fips1401.pdf.
FIPS (1995) is NIST publication 180-1, available at http://www.itl.nist.gov/fipspubs/fip180-1.htm.
Flannery, Sarah and David Flannery (2001) In Code: A Mathematical Journey, Workman Publishing Company.
Fourmilab (2001) is URL http://www.fourmilab.ch/random/.
Fox, E. A. et al. (1991) "Order Preserving Minimal Perfect Hash Functions and Information Retrieval," ACM Transactions on Information Systems 9(2)281-308.
Fraunhofer (2001) is URL http://syscop.igd.fhg.de/.
FreeBSD Words (2001) is URL ftp://www.freebsd.org/usr/share/dict/words.
Fridrich, Jiri (1998) "Image Watermarking for Tamper Detection," in Proceedings of the International Conference on Image Processing, ICIP '98, Chicago, October.
Fridrich, Jiri (1999) "Methods for Tamper Detection in Digital Images," in Proceedings of the ACM Workshop on Multimedia and Security, pp. 19-23, Orlando, FL, October.
Fridrich, Jessica (Jiri), Miroslav Goljan, and Rui Du (2002) "Lossless Data Embedding for All Image Formats," in Proceedings of the SPIE Photonics West, vol. 4675, Electronic Imaging 2002, Security and Watermarking of Multimedia Contents, San Jose, California, pp. 572-583, January.
Friedman, William F. (1996) The Index of Coincidence and Its Applications in Cryptanalysis, Laguna Hills, Calif., Aegean Park Press, vol. C-49.
Friedman, William F. and Charles J. Mendelsohn (2000) The Zimmermann Telegram of January 16, 1917 and Its Cryptographic Background, Laguna Hills, Calif., Aegean Park Press, vol. C-13.
Gaines, Helen Fouche (1956) Cryptanalysis: A Study of Ciphers and Their Solutions, New York, Dover.
Gardner, Martin (1972) "Mathematical Games," Scientific American, 227(2)106, August.
Garfinkel, Simson (1995) PGP: Pretty Good Privacy, Sebastopol, Calif., O'Reilly.
GCHQ (2001) is URL http://www.gchq.gov.uk/.
Gifford, David K. et al. (1985) "The Application of Digital Broadcast Communications to Large-Scale Information Systems," IEEE Journal on Selected Areas in Communications, SAC-3(3)457-467, May.
Golomb, Solomon W. (1982) Shift Register Sequences, 2nd edition, Laguna Hills, Calif., Aegean Park Press.

Gray, Frank (1953) "Pulse Code Communication," United States Patent 2,632,058, March 17.
Gruhl, Daniel, Walter Bender, and Anthony Lu (1996) "Echo Hiding," in Information Hiding: First International Workshop, Lecture Notes in Computer Science, volume 1174, R. J. Anderson, ed., pp. 295-315, Springer-Verlag, Berlin.
Guillou, Louis and Jean-Jacques Quisquater (1988) "A Practical Zero-Knowledge Protocol Fitted to Security Microprocessors Minimizing Both Transmission and Memory," in Advances in Cryptology, Eurocrypt '88 Proceedings, pp. 123-128, Berlin, Springer-Verlag.
Gutenberg (2001) is URL http://promo.net/pg/.
Hamming, Richard W. (1980) Coding and Information Theory, Englewood Cliffs, N.J., Prentice-Hall.
Havas, G. et al. (1993) "Graphs, Hypergraphs and Hashing," in Proceedings of the International Workshop on Graph-Theoretic Concepts in Computer Science (WG'93), Berlin, Springer-Verlag.
Heath, F. G. (1972) "Origins of the Binary Code," Scientific American, 227(2):76, August.
Heckbert, Paul (1982) "Color Image Quantization for Frame Buffer Display," in Proceedings of SIGGRAPH 82, pp. 297-307, July.
Hill, Lester S. (1929) "Cryptography in an Algebraic Alphabet," American Mathematical Monthly 36(6)306-312, June. Also available from http://members.aol.com/tonyspatti/hill29.htm.
Hinsley, F. H., and Alan Stripp (eds.) (1992) The Codebreakers: The Inside Story of Bletchley Park, Oxford, Oxford University Press.
Hotbits (2001) is URL http://www.fourmilab.ch/hotbits/.
Hunter, R. and A. H. Robinson (1980) "International Digital Facsimile Coding Standards," Proceedings of the IEEE, 68(7):854-867, July.
Hyman, Anthony (1982) Charles Babbage: Pioneer of the Computer, Oxford, Oxford University Press.
Johnson, Neil F. et al. (2001) Information Hiding: Steganography and Watermarking-Attacks and Countermeasures, Advances in Information Security, volume 1, Boston, Kluwer Academic.
Kahn, David (1981) (Title unknown), Cryptologia 5(4)193-208.
Kahn, David (1996) The Codebreakers: The Comprehensive History of Secret Communications from Ancient Times to the Internet, revised edition, New York, Scribner.
Katzenbeisser, Stefan and Fabien A. P. Petitcolas (eds.) (2000) Information Hiding Techniques for Steganography and Digital Watermarking, Norwood, Mass., Artech House.

Kerckhoffs, Auguste (1883) "La Cryptographie Militaire," Journal des Sciences Militaires, 9:5-38, 161-191, January-February. Also available in html format from URL http://www.cl.cam.ac.uk/~fapp2/kerckhoffs/la_cryptographie_militaire_i.htm.
Knuth, Donald E. (1969) The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Reading, Mass., Addison-Wesley.
Knuth, Donald E. (1984) The TeXbook, Reading, Mass., Addison-Wesley.
Konheim, Alan G. (1981) Cryptography: A Primer, New York, John Wiley and Sons.
Kullback, Solomon (1990) General Solution for the Double Transposition Cipher, Laguna Hills, Calif., Aegean Park Press, vol. C-84.
Kundur, Deepa and Dimitrios Hatzinakos (1997) "A Robust Digital Image Watermarking Scheme Using Wavelet-Based Fusion," in Proceedings of the IEEE International Conference On Image Processing, Santa Barbara, California, 1, pp. 544-547, October.
Kundur, Deepa and Dimitrios Hatzinakos (1998) "Digital Watermarking Using Multiresolution Wavelet Decomposition," Proceedings of the IEEE International Conference On Acoustics, Speech and Signal Processing, Seattle, Wash., 5, pp. 2969-2972, May.
Lai, Xuejia (1992) "On the Design and Security of Block Ciphers," ETH Series in Information Processing, vol. 1, Konstanz, Hartung-Gorre Verlag.
Lai, Xuejia, and James L. Massey (1991) "A Proposal for a New Block Encryption Standard," EUROCRYPT, 90:389-404, Berlin, Springer-Verlag.
Larson, P. A. and A. Kajla (1984) "Implementation of a Method Guaranteeing Retrieval in One Access," Communications of the ACM, 27(7)670-677, July.
Lavarnd (2003) is URL http://www.lavarnd.org/.
Lehmer, D. H. (1949) "Mathematical Methods in Large-Scale Computing Units," in Proceedings of the Second Symposium on Large-Scale Digital Calculating Machinery, Cambridge, Mass., 1949, Harvard University Press, Cambridge, Mass., 1951, pp. 141-146.
Levy, Steven (2001) Crypto, New York, Viking.
MacLaren, M. Donald and George Marsaglia (1965) "Uniform Random Number Generators," Journal of the ACM, 12(1)83-89, January.
Mandelbrot, Benoit (1982) The Fractal Geometry of Nature, San Francisco, W. H. Freeman.
Marvel, Lisa M., Charles G. Boncelet, Jr., and Charles T. Retter (1999) "Spread Spectrum Image Steganography," IEEE Transactions on Image Processing 8, pp. 1075-1083, August. Also available from http://citeseer.nj.nec.com/404493.html.
MathWorld (2002) is html file Gram-SchmidtOrthonormalization.html in URL http://mathworld.wolfram.com/.

McDonald, Andrew D. and Markus G. Kuhn (1999) "StegFS: A Steganographic File System for Linux," in Proceedings of Information Hiding, New York, Springer-Verlag, LNCS 1768, pp. 463-477. Also available from http://www.mcdonald.org.uk/StegFS/.
Merkle, R. C. and M. Hellman (1981) "On the Security of Multiple Encryption," Communications of the ACM, 24(7)465-467.
NCM (2001) is URL http://www.nsa.gov/museum/.
Newton, David E. (1997) Encyclopedia of Cryptology, Santa Barbara, Calif., ABC-Clio.
Nicetext (2001) is URL http://www.ctgi.net/nicetext/.
NSA (2001) is URL http://www.nsa.gov/.
Ore, Øystein (1953) Cardano, the Gambling Scholar, Princeton, N.J., Princeton University Press (reprinted by Dover).
Park, Stephen K., and Keith W. Miller (1988) "Random Number Generators: Good Ones Are Hard to Find," Communications of the ACM, 31(10)1192-1201, October.
Pennebaker, William B. and Joan L. Mitchell (1992) JPEG Still Image Data Compression Standard, New York, Van Nostrand Reinhold.
Petitcolas (2001) is URL http://www.cl.cam.ac.uk/~fapp2/steganography/bibliography/.
Pfitzmann, B. (1996) "Information Hiding Terminology," in Information Hiding, New York, Springer Lecture Notes in Computer Science, 1174:347-350.
Pitas, Ioannis (1996) "A Method for Signature Casting on Digital Images," 1996 IEEE International Conference on Image Processing (ICIP'96), Lausanne, Switzerland, vol. III, pp. 215-218, 16-19 September. Also available as file Pitas96a.ps.Z from URL http://poseidon.csd.auth.gr/papers/PUBLISHED/CONFERENCE/Pitas96a/.
Podilchuk, C. I., and W. Zeng (1997) "Digital Image Watermarking Using Visual Models," in Proceedings of the IS&T/SPIE Conference on Human Vision and Electronic Imaging II, 3016, pp. 100-111, February.
Pohlmann, Ken (1985) Principles of Digital Audio, Indianapolis, Ind., Howard Sams.
Press, W. H., B. P. Flannery et al. (1988) Numerical Recipes in C: The Art of Scientific Computing, Cambridge, Cambridge University Press. (Also available on-line by anonymous ftp from http://www.nr.com/.)
Price, Derek J. (1955) The Equatorie of the Planetis (with a Linguistic Analysis by R.M. Wilson), Cambridge, Cambridge University Press, 1955.
Rabin, Michael O. (1979) "Digitized Signatures and Public-Key Functions as Intractable as Factorization," MIT Laboratory for Computer Science Tech. Report MIT/LCS/TR-212.
Rao, K. R. and J. J. Hwang (1996) Techniques and Standards for Image, Video, and Audio Coding, Upper Saddle River, N.J., Prentice-Hall, pp. 273-322.

Rejewski, Marian (1981) "How Polish Mathematicians Broke the Enigma Cipher," IEEE Annals of the History of Computing, 3(3), July.
Rijmen (2001) is http://www.esat.kuleuven.ac.be/~rijmen/rijndael/sbox.pdf.
Rijmen (2002) is URL http://www.esat.kuleuven.ac.be/~rijmen/rijndael/.
Ritter, Terry (1990) "Substitution Cipher with Pseudo-Random Shuffling: The Dynamic Substitution Combiner," Cryptologia 14(4)289-303. An updated version is available at http://www.ciphersbyritter.com/DYNSUB.HTM.
Ritter (1999) is URL http://www.ciphersbyritter.com/ARTS/PRACTLAT.HTM.
Rivest, R., A. Shamir, and L. Adleman (1978) "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems," Communications of the ACM, 21(2)120-126, February.
Rivest, Ronald (1995a) "The RC5 Encryption Algorithm," Dr. Dobb's Journal, 20(1)146-148, January.
Rivest, Ronald (1995b) "The RC5 Encryption Algorithm," in Proceedings of the 1994 Leuven Workshop on Fast Software Encryption, New York, Springer-Verlag, pp. 86-96. Also available online from http://theory.lcs.mit.edu/~rivest/publications.html.

Rivest, Ronald L. (1995c) "The RC5 Encryption Algorithm," CryptoBytes, 1(1)9-11.
RSA (2001) is URL http://www.rsasecurity.com/rsalabs/challenges/factoring/ file faq.html.
RSA Security (2002) is URL http://www.rsasecurity.com/.
Salomon, David (2000) Data Compression: The Complete Reference, 2nd edition, New York, Springer-Verlag.
Savard (2001) is URL http://home.ecn.ab.ca/~jsavard/crypto/jscrypt.htm.
Schneier, Bruce (1993) "Fast Software Encryption," in Cambridge Security Workshop Proceedings, pp. 191-204. New York, Springer-Verlag. Also available from http://www.counterpane.com/bfsverlag.html.
Schneier, Bruce (1995) Applied Cryptography: Protocols, Algorithms, and Source Code in C, 2nd Edition, New York, John Wiley.
Schneier, Bruce (2002) is URL http://www.counterpane.com/crypto-gram.html.
Schnorr, Claus Peter (1991) "Efficient Signature Generation for Smart Cards," Journal of Cryptology, 4(3)161-174.
Schotti, Gaspari (1665) Schola Steganographica, Jobus Hertz, printer. Some page photos from this old book are available at http://www.cl.cam.ac.uk/~fapp2/steganography/steganographica/index.html.

Shamir, Adi (1979) "How to Share a Secret," Communications of the ACM, 22(11)612-613, November.

Shannon, Claude E. (1949) "Communication Theory of Secrecy Systems," Bell System Technical Journal, 28:656-715, October.
Shannon, Claude E. (1951) "Prediction and Entropy of Printed English," Bell System Technical Journal, 30:50-64, January.
Simovits, Mikael J. (1996) The DES, an Extensive Documentation and Evaluation, Laguna Hills, Calif., Aegean Park Press, vol. C-68.
Singh, Simon (1999) The Code Book, New York, Doubleday.
Sinkov, A. (1980) Elementary Cryptanalysis: A Mathematical Approach (New Mathematical Library, No. 22), Washington, D.C., Mathematical Assn. of America.
Sloane, Neil (2001) is URL http://www.research.att.com/~njas/sequences/.
Sorkin, Arthur (1984) "Lucifer, A Cryptographic Algorithm," Cryptologia, 8(1)22-41, January. An addendum is in 8(3)260-261.
Stallings, William (1998) Cryptography and Network Security: Principles and Practice, Englewood Cliffs, N.J., Prentice-Hall.
Steganosaurus (2001) is URL http://www.fourmilab.to/stego/.
Stego (2001) is URL http://www.stego.com/.
Trithemius, Johannes (1606) Steganographia. Available (for private use only) from URL http://www.esotericarchives.com/tritheim/stegano.htm.
Tseng, Yu-Chee and Hsiang-Kuang Pan (2001) "Secure and Invisible Data Hiding in 2-Color Images," IEEE Infocom 2001. Also available from http://www.ieee-infocom.org/2001/paper/20.pdf.
Tuchman, Barbara W. (1985) The Zimmermann Telegram, New York, Ballantine.
Turing, Alan (1936) "On Computable Numbers, with an Application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Ser. 2, 42:230-265.
Unicode (2001) is URL http://www.unicode.org.
Unicode Standard (1996) The Unicode Standard, Version 2.0, Reading, Mass., Addison-Wesley.
WatermarkingWorld (2001) is located at URL http://www.watermarkingworld.org/.
Wayner, Peter (1992) "Mimic Functions," Cryptologia, XVI(3)193-214, July.
Wayner, Peter (2002) Disappearing Cryptography, 2nd edition, London, Academic Press.
Wegman, Mark N. and J. Lawrence Carter (1981) "New Hash Functions and Their Use in Authentication and Set Equality," Journal of Computer and Systems Sciences, 22(3)265-279.
Wikramaratna, R. S. (1989) "ACORN, a New Method For Generating Sequences of Uniformly-Distributed Pseudo-Random Numbers," Journal of Computational Physics, 83:16-31.

Wiles, Andrew (1995) "Modular Elliptic Curves and Fermat's Last Theorem," Annals of Mathematics, 141(3)443-551.
Williams, Henry Smith (1904) A History of Science, volume 4, New York, London, Harper.
Wolfram (2002a) is URL http://www.wolfram.com.
Wolfram, Stephen (2002b) A New Kind of Science, Champaign, Ill., Wolfram Media.
Wu, M. Y. and J. H. Lee (1998) "A Novel Data Embedding Method for Two-Color Images," in Proceedings of the International Symposium on Multimedia Information Processing, December.
Wuarchive (2001) is URL ftp://wuarchive.wustl.edu/doc/misc/pi/.
Xia, Xiang-Gen, Charles G. Boncelet, and Gonzalo R. Arce (1998) "Wavelet-Transform Based Watermark for Digital Images," Optics Express 3(12)497-511, December 7.
Zhao, J. and E. Koch (1995) "Embedding Robust Labels into Images for Copyright Protection," in Proceedings of the International Conference on Intellectual Property Rights for Specialized Information Knowledge and New Technologies, August 21-25, Vienna, Austria, Oldenbourg Verlag, pp. 242-251. Also available in PDF format from http://citeseer.nj.nec.com/zhao95embedding.html.
Zimmermann, Philip (1995) PGP Source Code and Internals, Cambridge, Mass., MIT Press.
Zimmermann, Philip (2001) is http://www.philzimmermann.com/.

There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files. This book is about the latter. -Bruce Schneier, Applied Cryptography (1995)

Index

The index caters to those who have already read the book and want to locate a familiar item, as well as to those new to the book who are looking for a particular topic. I have included any terms that may occur to a reader interested in any of the topics discussed in the book (even topics that are just mentioned in passing). As a result, even a quick glancing over the index gives the reader an idea of the terms and topics included in the book. A special effort was made to include full names (first and middle names instead of initials) and dates of persons mentioned in the book.

φ(n), Euler function, 9, 77
π, calculation of, 104-105
Z_N, 202, 439
1984 (novel), 406
Abel, Niels Henrik (1802-1829), 214
Abelian groups, 225
absolutely secure ciphers, 11-14
ACORN pseudo-random number generator, 98-99
Adams, Douglas (1952-2001), 241
ADC (analog-to-digital converter), 344
Addison, Joseph (1672-1719), 113
additive cipher, 8-10
additive Gaussian white noise (AGWN), 294
ADFGVX cipher, ix, 52-53, 79, 423
Adleman, Leonard M. (1945-), 199, 425
Advanced Encryption Standard (AES), 173, 183-194, 387, 395-398, 427, 429, 434
AES, see Advanced Encryption Standard
affine cipher, 8-10, 77, 429
  fixed point, 11, 401, 402
AGWN, see additive Gaussian white noise
Alberti, Leon Battista (1404-1472), 59, 107, 421
Alice (generic name of person A), 6
alphabet (in cryptography), 6
ambiguity (in ciphers), 85-87
American cipher machine (SIGABA), 424
Amis, Kingsley (1922-1995), 45
Ampere (electrical current), 402
AMSCO cipher, 51-52, 405
anagram (as a transposition cipher), 43
analytical engine, 66
ANCIB (Army-Navy Communications Intelligence Board), 424
Anna Karenina (novel), 73
Antheil, George (1900-1959), 297
ARC4, see RC4 stream cipher
ARCFOUR, see RC4 stream cipher
arithmetic coding (compression method), 293
arithmetic of polynomials, 399-400
ASCII, 383
asymmetric-key cryptography, see public-key cryptography
Atanasoff, John Vincent (1904-1995), 408
ATBASH (ancient Hebrew cipher), 419
attack (on encrypted or hidden data), 430
audio compression
  frequency masking, 347-350
  temporal masking, 347, 350
audio watermarking
  echo hiding, 353-356
  time domain, 351-353
audio, digital, 344-347
authentication, 4, 205, 212-217
  digital signatures, 219-220
  Feige-Fiat-Shamir protocol, 216
  Guillou-Quisquater, 216-217
  Schnorr, 217
  zero-knowledge protocols, 214
author's email address, x
avalanche effect (in block ciphers), 160, 433
  in Blowfish, 176
Babbage, Charles (1791-1871), 66, 422
back door, 430
Back, Adam, 200
Bacon, Roger (1214-1294), 420
Bacon, Sir Francis (1561-1626), 90, 160, 421
Bacon's biliteral cipher, 254
Bailey, David (and the digits of π), 105
Bark (unit of critical band rate), 349
Barkhausen, Heinrich Georg (1881-1956), 350
  and critical bands, 349
Bauer, Friedrich L. (1924-), 57
Bazeries, Etienne (1846-1931), 83, 422
Beaufort cipher, 62-63
Beaufort, Sir Francis (1774-1857), 61, 62, 422
Belaso, Giovan Batista, 421
Bell, Alexander Graham (1847-1922), 111
Bennett, Charles H. (1943-), 235, 238, 426
Bernstorff, Johann von (German ambassador), 2
Berry, Clifford (collaborator of John Atanasoff), 408
bits per bit (bpb, hiding capacity), 266
Blair, Eric Arthur (George Orwell, 1903-1950), 211
blind cover (in steganography), 248
block ciphers, 155-194
  AES, 183-194, 395-398
  Blowfish, 175-178
  DES, 162-174
  IDEA, 178-181
  RC5, 181-183
  Rijndael, 183-194, 395-398
Blowfish (block cipher), 155, 175-178
BMP (graphics file format), 277, 363, 430
BMP file compression, 430
Bob (generic name of person B), 6
bombe (decrypting machine), 124
book cipher, 7, 73, 156
Borwein, Peter (and the digits of π), 105
Boswell, James (1740-1795), 105
bpb, see bits per bit
BPCS (image steganography), 271, 280-283, 430
Brassard, Gilles (1955-), 235, 238, 426
British Government Communications Headquarters (GCHQ), 7
Brown, Andy (S-Tools), 363
Brown, Derek, 183
Browne, Sir Thomas (1605-1682), 4
Buchan, John (1875-1940), 234
Burroughs, Edgar Rice (1875-1950), 439
Byron, Lord George (1788-1824), 367
Caesar cipher, 7-10, 12, 63, 65, 75, 429, 430
Caesar, Julius (100-44 B.C.), 7, 419
camouflage (in steganography), 255, 430
Canine, Ralph J. (1895-1969), 425
Cardano, Girolamo (1501-1576), 64, 421
Carranza, Venustiano (1859-1920), 2
Casanova, Giacomo Girolamo (1725-1798), 198
cellular automata PRNG, 99-100, 139
Central Security Service (CSS), 425
CFG, see context-free grammars
Chapman, Mark T. (Nicetext), 258
characteristic of a finite field, 227
Chase, Pliny Earle (developer of fractionating cipher), 422
Chinese remainder theorem, 202
Chomsky, Avram Noam (1928-), 263
Christie, Samuel Hunter (1784-1865), 30
chrominance (color component), 279, 290, 298, 304, 431
Chudnovsky, David and Gregory (and the computation of π), 105
Churchill, Sir Winston Leonard Spencer (1874-1965), 112

CIA (Central Intelligence Agency), 424
ciphers
  ADFGVX, ix, 52-53, 79, 423
  absolutely secure, 11-14
  additive, 8-10
  AES, 183-194, 395-398
  affine, 8-10, 77, 429
  ambiguity, 85-87
  AMSCO, 51-52, 405
  Bacon's biliteral, 254
  Beaufort, 62-63
  block, 155-194
  Blowfish, 155, 175-178
  book, 7, 73, 156
  Caesar, 7-10, 12, 429
  definition of, 5, 431
  Delastelle, 33
  Delastelle trifid, 34-35
  DES, 155, 162-175
  double Playfair, 32-33
  double transposition, 49-51
  El Gamal, 204-205
  Eyraud, 77-79
  Feistel, 158
  Four winds, 41
  fractionating, 32-34, 422
  Greek cross, 41
  Gronsfeld, 75
  Hill, 80-81, 434
  homophonic substitution, ix, 35-37, 434
  IDEA, 155, 175, 178-181, 205, 435
  Jefferson, 67, 82
  knock, 29
  Lorenz, viii
  Lucifer, 161-162
  M-94, 83
  monoalphabetic substitution, ix, 21-24, 436
  multiplex, 82
  multiplicative, 8-10
  Myszkowsky, 51
  nihilistic, 29
  nomenclator, 5, 420, 436
  one-time pad, ix, 11-14, 67, 90, 135, 207, 235, 361, 436
  pigpen, 28
  Playfair, ix, 30-33
  polyalphabetic substitution, ix, 59-129, 200, 436
  Polybius monoalphabetic, 29-30, 36, 52-53
  Polybius polyalphabetic, ix, 87-88
  polyphonic, 82, 85-87
  Porta, 60-61
  product, 157
  public key, 198-206
  public-key, 437
  Rabin, 203-204
  rail fence, 41
  RC4, 150-153
  RC5, 155, 181-183, 426
  Rijndael, 155, 183-194, 395-398, 427
  Rot13, 8, 44, 426
  RSA, 199-202, 235, 425
  secure, 4-206
  self-reciprocal, 60-61
  stream, 100, 134-153, 387, 434, 438
    and cellular automata, 139
    RC4, 150-153
  strip, 85
  TDEA, 133, 162-174
  transposition, ix, 39-57, 439
  trifid fractionating, 34-35
  Trithemius, ix, 63-64
  ultimate secret, 67
  Vernam, 14, 135, 156, 436, 439
  Vigenere, ix, 64-75, 81, 439
    deciphering, 66-73
ciphertext
  definition of, 6
  written in groups of 5, 6
Clipper chip (dead proposal), 426, 431
clock-controlled generator (shift register), 142, 144-145
codes (variable size), 86
color lookup table (in steganography), 276-278
, viii
Coltelli, Francesco Procopio dei (ice cream bombe inventor), 124
columnar transposition ciphers, 48-53
  decryption of, 53-56
  double encryption, 48
combiner (in stream ciphers), 134, 145, 431
Combs, Holly Marie (1973-), 365
Comité Consultatif International Télégraphique et Téléphonique (CCITT), 336, 384
completeness effect (in block ciphers), 160

Computer Security Act, 426
confidentiality, 4, 212
confusion (in cryptography), 159, 431
context-free grammars, 262-267, 431
convolution, 317, 354, 356, 369-376
  2D, 373-376
correlation
  of pixels, 283, 297, 342
  of video frames, 342
cover (in steganography), 245, 247, 250, 412, 431
  as noise, 248
  escrow, 248
CPT (data hiding in binary image), 329-332
CRC (cyclic redundancy code), x, 137, 213, 383-385, 431
cryptanalysis (definition of), 4, 431
cryptanalyst (definition of), 4, 431
cryptographer (definition of), 4, 5, 432
cryptography, 4-206
  as overt secret writing, 4, 245
  authentication, 430
  definition of, 4, 432
  Diffie-Hellman-Merkle key exchange, 196-198, 218-219
  elliptic curve, 218-234, 433
  Enigma machine, 107-129
  index of coincidence, ix, 88-90
  PGP, 205-206, 426, 437
  public-key, 198-206, 362, 437
  quantum, 235-241, 437
  random numbers in, ix, 91-104
  rotor encryption machines, 107-129
  rules of, 4, 15, 29, 56, 63, 65, 73, 120, 122, 135, 137, 209, 435
cryptology (definition of), 432
cryptoperiod, 432
CSS, see Central Security Service
curves (elliptic), 220-225
curves (space-filling), 41, 403
cyclic notation of permutations, 44, 404
cypherpunk, 409
DAC (digital-to-analog converter), 345
Daemen, Joan (Rijndael), 183
Damm, Arvid Gerhard, 111, 423
Darwin, Charles Robert (1809-1882), 56
data compression (and encryption), 432
data compression (lossy), 269
data encryption algorithm (DEA), 162
data encryption standard (DES), 155, 162-175, 425, 426, 430, 432
  challenges, 172-173
data hiding, see steganography
Dato, Leonardo, 421
Davida, George I. (Nicetext), 258
DCT, see discrete cosine transform
DEA, see data encryption algorithm
DEA-1, see data encryption algorithm
deciphering monoalphabetic ciphers, 24-25
decryption (unique), 6
Delastelle fractionation cipher, 33
Delastelle trifid cipher, 34-35
Delastelle, Felix Marie (1840-1902), 33
Della Porta, Giambattista (1535?-1615), 60, 64, 421
deniability (and shared secrets), 207
DES, see data encryption standard
determinant (and plane equation), 412
difference engine, 66
Diffie, Bailey Whitfield (1944-), 196, 198, 199, 425
Diffie-Hellman-Merkle key exchange, 196-198, 212, 433
  and elliptic curves, 198, 218-219
diffusion (in cryptography), 159, 433
digital audio, 344-347
digrams, 24
  common, 72
  self-reciprocal, 60
discrete cosine transform (DCT), 289-291, 301-304, 309, 319, 342
discrete logarithm problem, 218
discrete wavelet transform (DWT), 314-317
discriminant of a polynomial, 222
distribution of letters, 22
DOS (operating system), 364
Dostoevsky, Fyodor Mikhailovich (1821-1881), 129
double Playfair cipher, 32-33
double transposition cipher, 49-51
DWT, see discrete wavelet transform
dynamic substitution cipher, 145-147
Dyson, Freeman (1923-), 392
ear (human), 347-350
Echelon (project of NSA), 427
echo hiding (audio data hiding), 353-356
Eckert, John Presper (1919-1995), 408
Eckhardt, Heinrich von, 2
EDE, see encrypt-decrypt-encrypt mode
Edison, Thomas Alva (1847-1931), 353
Einstein, Albert (1879-1955), 24
  and Brownian motion, 111
  and the photoelectric effect, 236
El Gamal public-key method, 204-205
elliptic curve cryptography, 218-234, 433
elliptic curves, 220-225, 433
  and complex multiplication, 105
  and Diffie-Hellman-Merkle key exchange, 198
Ellis, James H., British cryptographer (?-1997), 200
email address of author, x
embedding capacity (in steganography), 247, 433
encrypt-decrypt-encrypt (EDE) mode, 174
encryption (multiple), 436
encryption (unique or not unique), 6, 36
English (statistical properties of), 24, 53, 56, 72
English text
  frequencies of letters, 23, 72
  frequencies of vowels, 53
  word start, 56
Enigma machine, viii, ix, 107-129
  breaking the code, 44, 117-129
  history of, 111-112
  operation of, 113-117
error-correcting codes, 433
error-detecting codes, 383, 433
escrow cover (in steganography), 248
ETAOINSHRDLU (letter probabilities), 72
Euclid's algorithm, 9
  extended, 10, 392
Euler function φ(n), 9, 77
Eve (generic name of eavesdropper), 6
exclusive OR (XOR), 207, 357, 383, 433, 439
Eyraud cipher, 77-79
Eyraud, Charles, 77
Fabyan, George, 90
FAT, see file allocation table
fax images (data hiding in), 336-337
Feige-Fiat-Shamir identification protocol, 216
Feistel ciphers, 158
Feistel, Horst, Lucifer designer (1915-1990), 158, 161, 162, 425
Fermat's last theorem (and elliptic curves), 220
Feynman, Richard Phillips (1918-1988), 346
field (in mathematics), 387-400, 434
  characteristic of, 227
file allocation table (FAT), 364
fingerprinting (digital data), 247, 252
finite fields, see Galois fields
fixed point affine ciphers, 11, 401, 402
floppy disk (format of), 364
Flowers, Thomas Harold (1905-1998), viii
Four winds cipher, 41
fractionating ciphers, 32-35, 422
, 33-34
Freese, Jerry, 276
frequency domain, 349
frequency masking, 347-350
Fridrich, Jessica, 293
Friedman, Elizebeth (nee Smith 1892-1980), 90, 422
Friedman, William (Wolfe) Frederick (1891-1969), 82, 83, 88, 90, 422, 424
Gaboriau, Emile (1832-1873), 119
Galois fields, 367, 387-400, 434
  characteristic of, 227
Galois, Evariste (1811-1832), 214, 367, 388
Gaskell, Elizabeth (1810-1865), 33
Gauss's theorem, 202
Geffe generator (in stream ciphers), 142, 431, 434
generation of permutations, 76-77
German (letter frequencies), 24
GF(256) and Rijndael, 395-398
GIF (graphics file format), 277, 363
  data hiding in, 289, 291-293
Gifford pseudo-random number generator, 143
giga (definition of), 411, 434
golden ratio φ (used in RC5), 182
Gray codes, 280, 283-284, 434
Gray, Elisha (1835-1901) telephone inventor, 111
Greek cross cipher, 41

Greene, Henry Graham (1904-1991), 75, 204, 254
grille, see turning template transposition ciphers
Gronsfeld cipher, 75
group (in mathematics), 174, 387-388, 434
  Abelian, 225
  multiplicative, 202
Guillou-Quisquater identification protocol, 216-217
Hagelin, Boris Caesar Wilhelm (1892-1983), 111, 423
Halmos, Paul Richard (1916-), x
HAS, see human auditory system
hash functions, 377-382
  secure, 271, 381-382
  secure hash standard (SHS), 381-382
hashing, 377-382, 434
Hayden, Michael (NSA director), 427
hearing (properties of), 347-350
Hebern, Eduard Hugo (1869-1952), 108, 111, 423
Heisenberg, Werner Karl (1901-1976), 408
Hellman, Martin E. (1945-), 196, 425
Henderson, Robert J., v, 407
hide and seek (steganography software), 271, 363, 434
Hilbert space-filling curve, 403
  and steganography, 271
Hill cipher, 80-81, 434
Hill, Lester S. (1891-1961), 424
Histiaeus (and intuitive steganography), 252
Hitt, Parker (codeveloper of wheel cipher), 422
Homer (c. 800 B.C.), 197
homophonic substitution codes, ix, 35-37, 434
Hotbit (true random numbers), 93
Huffman algorithm, 336
human auditory system (HAS), 339, 347-350
human visual system (HVS), 279, 311, 323
human voice (range of), 347
HVS, see human visual system
IDEA (block cipher), 155, 175, 178-181, 205, 363, 435
IEC, see International Electrotechnical Committee
image steganography, 269-276
image transforms, 315-317
index of coincidence, ix, 88-90
innocuous text (steganography), 258-262
integrity, 4, 212-213
International Electrotechnical Committee, 341
International Standard Book Number (ISBN), 48, 253
International Standardization Organization (ISO), 341
International Telecommunications Union
  and MPEG, 341
invisibility (in steganography), 247, 435
invisible ink (for data hiding), 252
involutary permutations, 44, 60-61, 114, 118-119, 286
involution, 117-119, 435
ISO, see International Standardization Organization
ITU, see International Telecommunications Union
Jacquard loom, 66
James, William (1842-1910), 417
Japanese cipher machine (purple), 424
Jefferson cipher, 67, 82
Jefferson, Thomas (1743-1826) and cryptography, 67, 82, 421, 422
Johnson, Louis Arthur (1891-1956), 425
JPEG images (data hiding in), 289-291, 303-308
JPEG 2000 (wavelet image compression), 319
Kahn, David A. (1930-), 71, 88
Kanada, Yasumasa (and the computation of π), 105
Kasiski, Friedrich Wilhelm (1805-1881), 66, 422
Kawaguchi, Eiji (BPCS steganography), 280
Kerckhoffs' principle, ix, 15, 29, 63, 120, 163, 249, 435
key (in cryptography), 435
  asymmetric, 198, 430, 437
  bad choice of, 76
  distribution problem, 11, 14, 64, 73, 133, 195, 196, 198, 200, 238, 435
  private, 437

    public, 198-206, 437
    symmetric, 198, 438
    weak, 170-171, 439
key (in steganography), 249
key space, 15, 435
    exhaustive search of, 15, 402
keyword in transposition ciphers, ix, 44, 48-53
Kirby, William (1817-1906), 116
knight's tour (as a transposition cipher), 40
knock cipher, 29
Koblitz, Neal (1948-), 218
Koch, Hugo Alexander (1870-1928), 111, 423
Korn, Willi (inventor of Enigma's reflector), 114
Kryha, Alexander von, 423

Lagrange, Joseph-Louis (1736-1813), 225, 388
Lai, Xuejia, 178, 426, 435
Lamarr, Hedy (Hedwig Eva Maria Kiesler 1914-2000), 297
Langer, Gwido, 124
Laplace distribution, 276, 297
Laplace transform (of image pixels), 276
latches (SR), 136
Latin square
    combiner, 147-148, 435
    ideal, 85
    in cylinder ciphers, 83
    in self-reciprocal tables, 60
Lavinde, Gabrieli di, 420
Legros, Georges Victor, 405
Lena (image), 272
letter distribution in a language, 22
letter frequencies, ix
    English, 23
    German, 24
    polyalphabetic ciphers, 68
    Portuguese, 24
    transposition ciphers, 39
Levy, Steven, 195
LFSR, see linear feedback shift registers
linear feedback shift registers (LFSR), 136-139, 435
linear systems, 369-373
logarithms (in finite fields), 218, 395-396
Lorenz cipher, viii
lossless data hiding, 285-293
    in GIF images, 291-293
    in JPEG images, 289-291
lossy data compression, 269
Lotstein, Michael (1970-), 64
LSB (least significant bit), 269, 285, 435
LSB encoding (image steganography), 269-276, 435
Lucifer (predecessor of DES), 161-162, 425
luminance (color component), 279, 298, 304, 435
LZW compression method, 293

M-138 strip cipher, 85
M-138-A strip cipher, 67
M-94 cylinder cipher, 83
Machado, Romana, 363
Machado, Romana (Stego developer), 269, 363, 427
MacLaren-Marsaglia pseudo-random number generator, 98
magic square (as a transposition cipher), 40
Mandelbrot, Benoit B. (1924-), 364
MandelSteg (steganography software), 364-365
Maor, Eli, 44
Marconi, Guglielmo (1874-1937), 422
Maroney, Colin (hide and seek), 363
Massey, James, 178, 426, 435
Mauborgne, Joseph O., 83, 422
Mauchly, John William (1907-1980), 408
Maugham, William Somerset (1874-1965), 380
McCaffrey, Anne Inez (1926-), 376
mega (definition of), 435
Merkle, Ralph C., 196
Miller, Victor S., 218
mimic functions (steganography), 262-267, 436
modulus, viii, 49, 97, 98, 109, 136, 187, 194, 216, 408
    and square roots, 203
    and XOR, 145
    as a one-way function, 196, 200, 215
    in finite fields, 388-399, 417
    in hashing, 379
    in IDEA, 178-179
    in the Hill cipher, 80-81, 434
monoalphabetic substitution ciphers, ix, 21-24, 436
    deciphering, 24-25
    extended, 30-51
Monte Carlo method for π, 102
Morse code (in cryptography), 33-34, 52
MPEG-2 video compression (data hiding), 339, 341-344
MSB (most significant bit), 436
multifid alphabet, 34
multiple encryption, 436
multiplex cipher, 82
multiplicative cipher, 8-10
music scores (watermarking), 339-341
Musset, Alfred de (1810-1857), xiv
Myszkowsky cipher, 51

Nadin, Mihai, 120, 157
National Institute of Standards and Technology (NIST), 162, 173, 183, 232, 426, 427, 429, 433, 436
National Security Agency (NSA), 7, 258, 425, 436
Neumann, Peter G., 19
Newman, Max (and Colossus), viii
NFSR, see nonlinear feedback shift registers
Nieuwenhoff, Jean Guillaume Hubert Victor François Alexandre Auguste Kerckhoffs von (1835-1903), 15, 249
nihilistic cipher, 29
NIST, see National Institute of Standards and Technology
noise
    additive Gaussian white noise, 294
    in a binary image, 304, 325
    in a color image, 303
    white, 294
nomenclator (secure code), 5, 420, 436
nonlinear combination generator (shift register), 142-143
nonlinear feedback shift registers (NFSR), 139-145
nonlinear filter generator (shift register), 142-144
nonrepudiation, 4, 212-213, 436
NSC (National Security Council), 424
Nyquist rate, 346

oblivious cover (in steganography), 248
Ohaver, M. E., 33, 61
one-time pad cipher, ix, 11-14, 67, 90, 135, 362, 436
    and shared secrets, 207
    in quantum cryptography, 235
    in steganography, 361
one-way function, 196, 198, 199, 359
Orwell, George, see Blair, Eric Arthur
Ovid (Publius Ovidius Naso), 353

Pagnol, Marcel (1895-1974), 327, 390
Painvin, Georges-Jean, breaker of ADFGVX cipher (1886-1982), 52, 79, 423
palette, 277
pangram (sentence with all 26 letters), 76
parity, 383
    vertical, 384
Pascal, Blaise (1623-1662), 441
patchwork (statistical steganography), 276, 298-299
Patterson, Robert, 67
payload (in steganography), 247
Peano space-filling curve, 403
    in steganography, 271
Peclet, Jean Claude Eugene (1793-1857), 389
pel (in fax compression), 336
Pemberton, John Styth (1831-1888) Coca-Cola inventor, 206
perfect shuffle (as a transposition cipher), 131
permutations, 436
    as a substitution rule, 117
    automatically generated, 76-77
    by a key, 44, 48
    consecutive, 404
    cyclic notation, 44, 404
    involutary, 44, 60-61, 114, 117-119, 286
    monoalphabetic substitution ciphers, 21
    multiplying, 39, 76
    random, 74
    transposition ciphers, ix, 39-57
Petitcolas, Fabien A. P., 246
PGP, see pretty good privacy
photons (in quantum cryptography), 235-241
pigpen cipher, 28
plaintext
    ambiguities in, 6
    definition of, 6
plane (equation of), 209
plausible deniability, 356, 357
Playfair cipher, ix, 30-33
Playfair, Baron Lyon (1818-1898), 30, 421
Plouffe, Simon (and the digits of π), 105
Poe, Edgar Allan (1809-1849), 14, 22
polyalphabetic substitution ciphers, ix, 59-129, 436
    compared to RSA, 200
    letter frequencies, 68
Polybius cipher
    and transposition, 52-53
    monoalphabetic, 29-30, 36
    Morse code, 52
    polyalphabetic, ix, 87-88
polynomials
    and CRC, 384
    and secret sharing, 210-211
    arithmetic, 399-400
    definition of, 437
    monic, 224
    primitive, 137
polyphonic ciphers, 82, 85-87
Porchez, Jean-François (1964-), 37
Porta cipher, 60-61
Portuguese (letter frequencies), 24, 73
prefix rule (for variable-size codes), 33, 86
pretty good privacy (PGP), 205-206, 426, 437
prime numbers (definition of), 437
primitive polynomial (in cryptography), 137
PRNG, see pseudo-random number generator
product cipher, 157
pseudo-random number generator, 96-100, 312, 313, 437
    ACORN, 98-99
    cellular automata, 99-100, 139
    Gifford, 143
    MacLaren-Marsaglia, 98
    table shuffling, 97-98
pseudo-random numbers, 96, 100, 136, 437
psychoacoustics, 347-350
public-key cryptography, 198-206, 212, 362
public-key steganography, 255, 339, 362, 437
pulse code modulation (PCM), 346
Purple (Japanese cipher machine), 424
pyramid (wavelet image decomposition), 314, 321, 322

quantization (steganography), 297-298
quantum cryptography, 235-241, 437

Rabin public-key method, 203-204
rail fence cipher, 41
Ramanujan, Srinivasa Aiyangar (1887-1920) and the computation of π, 105
random numbers
    in cryptography, ix, 91-104
    produced by Hotbit, 93
    produced by radio noise, 408
    pseudo-random, 96, 100, 136, 437
    statistical tests for, ix, 100-104
randomness (criterion for), 91
RC4 stream cipher, 150-153
RC5 block cipher, 155, 181-183, 426
Reagan, Ronald Wilson (1911-), 426
redundancy
    in algebraic codes, 412
    in artificial languages, 82
    in compressed data, 280
    in error-correcting codes, 433
    in natural languages, 82
    in steganography, 276
Rejewski, Marian (1905-1980), 122-125, 424
repetition in cryptography, 65, 73, 122
RGB color space, 277-279, 293, 304
Rijmen, Vincent (Rijndael), 183
Rijndael (AES), 155, 183-194, 427
    and GF(256), 395-398
Ritter, Richard, 112, 423
Rivest, Ronald L., 150, 181, 199, 425, 426
robust frequency domain watermarking, 309-311
robustness (in steganography), 247-249, 251, 305, 437
ROT13 cipher, 8, 44, 426
rotor encryption machines, ix, 107-129
    Enigma, 111-129
    Purple, 424
    SIGABA, 424
RSA cryptography, 150, 199-202, 235, 425
    cycling attack, 202
    multiplicative property of, 202

S-box, 156, 158, 160, 161, 166, 167, 169, 178, 179, 186, 190, 192, 438
S-Tools (steganography software), 363-364, 438
Sagan, Carl Edward (1934-1996), 267, 337, 385
Saint Pierre, Bernardin de (1737-1814), 400
sampling of sound, 344-347
Scherbius, Arthur (1878-1929), 111-115, 423
Scheutz, George and Edvard, 66
Schmidt, Hans-Thilo, 120, 124
Schneier, Bruce (1963-), 175, 451
    crypto-gram, 17
Schnorr identification protocol, 217
Schotti, Gaspari (1608-1666), 75, 245, 254
Schrodinger, Erwin Rudolf Josef Alexander (1887-1961), 408
Schwartau, Winn, 81
Schwartzkopf, Melvin (as a bad choice of key), 76
Scott, Paul Mark (1920-1978), 371
SEAL stream cipher, 148-150
secrets (sharing), 206-212
    deniability, 207
secure codes (cryptography), 4-206
secure codes (steganography), viii, x, 245-365, 438
secure codes (watermarking), 251-252
secure hash algorithm (SHA-1), 148, 381-382
secure hash functions, 271, 381-382
secure hash standard (SHS), 381-382
self-complementary magic square (as a transposition cipher), 40
self-reciprocal ciphers, 60-61
semantic methods (in steganography), 257, 438
Seutonius (Gaius Seutonius Tranquillus 707-1307 B.C.), 7
SHA-1, see secure hash algorithm
Shakespeare, William (1564-1616), 263
    letter frequencies, 22, 23
    Nicetext, 262
Shamir, Adi, 199, 425
sharing secrets, 206-212
    deniability, 207
    threshold scheme, 206
Shaw, George Bernard (1856-1950), 434
shift invariance, 369-373
shift registers, 438
    clock-controlled generator, 142, 144-145
    linear, 136-139, 435
    nonlinear, 139-145
    nonlinear combination generator, 142-143
    nonlinear filter generator, 142-144
SHS, see secure hash standard
Sierpinski curve, 403
SIGABA (American cipher machine), 424
signal-to-noise ratio (SNR) in steganography, 248, 438
signature, see watermarking
signature authentication, 219-220
signature casting in images, 299-300
Skipjack (Clipper encryption method), 426, 438
skytel (ancient Greek encryption device), 419
Smoluchowski, Marian (1872-1917), 111
SNR, see signal-to-noise ratio
software for steganography, 362-365
    hide and seek, 271, 363, 434
    MandelSteg, 364-365
    S-Tools, 363-364, 438
    Stego, 270, 363, 438
sound sampling, 344-347
space-filling curves (as transposition cipher), 41, 403
spatial frequency (of pixels in an image), 301, 324
spread spectrum steganography, 294-297, 309, 312, 438
Square cipher (predecessor of Rijndael), 184
SR latch, 136
SSIS, see spread spectrum steganography
STANCIB (State-Army-Navy Communications Intelligence Board), 424
standard (wavelet image decomposition), 314
steganalysis (definition of), 4
steganographic file system, 207, 339, 356-360, 438
steganography, viii, x, 245-365, 419, 438
    and compression, 248, 412
    applications of, 250-251
    as covert secret writing, 4, 245
    audio watermarking, 351-353
    binary images, 269, 325-341
    blind cover, 248
    BPCS, 271, 280-283, 430
    camouflage, 255, 430
    color lookup table, 276-278
    CPT method, 329-332
    definition of, 4
    echo hiding, 353-356
    embedding capacity, 247, 433
    escrow cover, 248
    fax images, 336-337
    GIF images, 291-293
    hiding data in text, 255-267
    innocuous text, 258-262
    intuitive methods, 252-254
    invisibility, 247, 435
    JPEG images, 289-291, 303-308
    lossless, 285-293
    LSB encoding, 269-276, 435
    mimic functions, 262-267, 436
    MPEG-2 Video, 339, 341-344
    Nicetext, 258-262
    oblivious cover, 248
    patchwork, 276, 298-299
    payload, 247
    public key, 255, 339, 362, 437
    pure, 255
    quantization, 297-298
    robust frequency domain watermarking, 309-311
    robustness, 248
    secret, 255
    semantic methods, 257, 438
    signal-to-noise ratio (SNR), 248, 438
    signature casting, 299-300
    simple digital methods, 255-267
    software, 362-365
    spread spectrum, 294-297, 309, 312, 438
    steganographic file system, 207, 339, 356-360, 438
    steganosaurus, 258
    stego-key, 249
    syntactic methods, 256, 439
    tamper resistance, 248
    TP method, 332-335
    traitor tracing, 250
    transform domain, 269, 301-324, 439
    ultimate, 339, 361-362
    undetectability, 247, 439
    watermarking, 250, 309-311, 319-324, 439
    watermarking music scores, 339-341
    wavelet methods, 314-324
    Wu Lee method, 328-329
    Zhao Koch Method, 325-327
steganosaurus (steganography software), 258
Stego (steganography software), 270, 363, 427, 438
stego-key (in steganography), 213, 249
Stimson, Henry Lewis (1867-1950), 17
stream ciphers, 100, 134-153, 387, 434, 438
    and cellular automata, 139
    combiner, 134, 145, 431, 434
    dynamic substitution, 145-147
    Latin square combiner, 147-148, 435
    RC4, 150-153
    SEAL, 148-150
strip ciphers, 85
    M-138, 85
    M-138-A, 67
subband transform, 315-317
substitution ciphers, 7
    consecutive, 157
substitution-permutation (SP) ciphers, 156-158
summation generator (in stream ciphers), 142, 431
Sutphen, Van Tassel (1861-1945), 432
syllables (encryption of), 37
syntactic methods (in steganography), 256, 439

table shuffling PRNG, 97-98
Taine, Hippolyte A. (1828-1893), 51
Takahashi, Daisuke (and the computation of π), 105
tamper resistance watermarking, 312-313
TAOSWCIHBD (letters at start of word), 56
taps (wavelet filter coefficients), 417
Tartaglia (Niccolo Fontana, 1499-1557), 214
TDEA, see triple data encryption standard
Teller, Edward (1908-), 192
temporal masking, 347, 350
Tenet, George (CIA director), 427
ternary digit (trit), 30
tests for randomness, ix, 100-104
TeX (and data hiding), 256
text
    data hiding in, 255-267
    English, 72
Thatcher, Margaret Hilda (1925-), 251
threshold scheme (for secret sharing), 206, 212
TP (data hiding in binary image), 332-335
traitor tracing (in steganography), 250
transform domain (data hiding in), 269, 301-324, 439
transforms
    images, 315-317
    subband, 315-317
transposition ciphers, ix, 39-57, 439
    anagram, 43
    columnar, 48-53
    combined, 43
    consecutive, 157
    drawbacks of, 56-57
    key, ix
    knight's tour, 40
    letter frequencies, 39
    magic square, 40
    self-complementary magic square, 40
    space-filling curves, 41, 403
    turning template, ix, 45-47
trapdoor, 430, 439
trigrams, 24
    common, 72
    self-reciprocal, 60
triple data encryption standard (TDEA), 133, 162-174
trit (ternary digit), 30
Trithemius cipher, ix, 63-64
Trithemius, Johannes Heidenberg (1462-1516), 63, 245, 253, 421
Truman, Harry Spencer (1884-1972), 425
Turing machine, 439
Turing, Alan Mathison (1912-1954), 125-127, 422
turning template transposition ciphers, ix, 45-47
Twain, Mark (Samuel Langhorne Clemens 1835-1910), 382

ultimate secret codes, 67
ultimate steganography, 339, 361-362
uncertainty principle (and quantum cryptography), 237, 238, 437
undetectability (in steganography), 247, 439
unicode, 131-132
USCIB (United States Communications Intelligence Board), 424

variable-size codes, 86
Vernam cipher (one-time pad), 14, 135, 156, 436, 439
Vernam, Gilbert S. (1890-1960), 14, 135, 423
Verne, Jules Gabriel (1828-1905), 4, 110, 190
Viaris, Gaetan Henri Leon de (1847-1901), 83
Vigenere cipher, ix, 64-75, 77, 81, 439
    deciphering of, 66-73
    index of coincidence, ix, 88-90
    long keys, 72-73
    nonshift variant, 74-75
Vigenere, Blaise de (1523-1596), 64, 421
von Neumann, John (1903-1957), 97

Wadsworth, Decius (inventor of cipher disk), 421
Walker, John (steganosaurus), 258
watermarking, 213
    digital data, 251-252
    in steganography, 250, 309-311, 319-324, 439
    music scores, 339-341
    tamper resistance, 312-313
wavelet image decomposition
    pyramid, 314, 321, 322
    standard, 314
wavelet-based watermarking, 314-324
    Kundur-Hatzinakos, 321-324
Wayner, Peter, 262, 267
weak keys, 170-171, 439
Web site of this book, x
Weinstein, Stephen B., 400
Weyman, Stanley John (1855-1928), 194, 360
Wheatstone, Sir Charles (1802-1875), 30, 421
white noise, 294
Williams, Henry Smith (1863-1943), 25
Wilson, Thomas Woodrow (1856-1924), 1
Witham, Steve, 243
Wolfram, Stephen (1959-), 92
Wostrowitz, Eduard Fleissner von (turning template cipher), 45
Wright, Amy, 379
Wu Lee method (data hiding in binary image), 328-329

XOR, see exclusive OR

Yardley, Herbert Osborne (1889-1958), 424
YCbCr color space, 277, 279, 298, 304

zero-knowledge protocols, 214
Zhao Koch Method (data hiding in binary image), 325-327
Zimmermann telegram, ix, 1-2
Zimmermann, Arthur (1864-1940), 1-2
Zimmermann, Philip R., PGP developer (1954-), 205, 426, 437

Index machines were so called because the operator used a guide, called an index, to pick the letter he wanted to type. Generally, once the desired letter was chosen, another action was necessary to type it. Yes, this was usually a slow and arduous process, but the cost of the index machine made it very appealing. In a time when a standard, or typebar, machine could run $100-$125, the index machines could be had for as little as $1, generally no more than $15-20. -Chuck Dilts, http://users.erols.com/chuckl0l/idex.htm