Appendix: Axiomatic Information Theory

    The logic of secrecy was the mirror-image of the logic of information.
        Colin Burke 1994

Perfect security was promised at all times by the inventors of cryptosystems, particularly of crypto machines (Bazeries: "je suis indéchiffrable"). In 1945, Claude E. Shannon (1916–2001) gave, in the framework of his information theory, a clean definition of what could be meant by perfect security. We show in the following that it is possible to introduce the cryptologically relevant part of information theory axiomatically.

Shannon was in contact with cryptanalysis, since he worked 1936–1938 in the team of Vannevar Bush, who developed the COMPARATOR for determination of character coincidences. His studies in the Bell Laboratories, going back to the year 1940, led to a confidential report (A Mathematical Theory of Cryptography) dated Sept. 1, 1945, containing, apart from the definition of Shannon entropy (Sect. 16.5), the basic relations to be discussed in this appendix. The report was published four years later: Communication Theory of Secrecy Systems, Bell System Technical Journal 28, 656–715 (1949).

A.1 Axioms of an Axiomatic Information Theory

It is expedient to begin with events, i.e., sets X, Y, Z of 'elementary events', and with the uncertainty¹ (Shannon: 'equivocation') on events, the uncertainties being expressed by non-negative real numbers. More precisely, H_Y(X) denotes the uncertainty on X, provided Y is known; H(X) = H_∅(X) denotes the uncertainty on X, provided nothing is known.

A.1.1 Intuitively patent axioms for the real-valued binary set function H:

(0)  0 ≤ H_Y(X)   ("Uncertainty is nonnegative.")
     For 0 = H_Y(X) we say "Y uniquely determines X."

(1)  H_{Y∪Z}(X) ≤ H_Z(X)   ("Uncertainty decreases if more is known.")
     For H_{Y∪Z}(X) = H_Z(X) we say "Y says nothing about X."

The critical axiom on additivity is

(2)  H_Z(X∪Y) = H_{Y∪Z}(X) + H_Z(Y).

This says that uncertainty can be built up additively over events.
Since in particular H(X∪Y) = H_Y(X) + H(Y), and thus H(X∪Y) = H(X) + H(Y) whenever Y says nothing about X, H is called an 'entropy', in analogy to the additive entropy of thermodynamical systems.

¹ The term 'uncertainty' was used as early as 1938 by Solomon Kullback.

The classical stochastic model for this axiomatic information theory is based on p_X(a) = Pr[X = a], the probability that the random variable X assumes the value a, and defines (Nyquist, 1944)

    H_∅({X})      = − Σ_{s : p_X(s) > 0}          p_X(s) · ld p_X(s)
    H_∅({X}∪{Y}) = − Σ_{s,t : p_{X,Y}(s,t) > 0}  p_{X,Y}(s,t) · ld p_{X,Y}(s,t)
    H_{Y}({X})    = − Σ_{s,t : p_{X|Y}(s/t) > 0}  p_{X,Y}(s,t) · ld p_{X|Y}(s/t)

(ld denoting the logarithm to base 2), where p_{X,Y}(a,b) =def Pr[(X = a) ∧ (Y = b)] and p_{X|Y}(a/b) obeys Bayes' rule for conditional probabilities:

    p_{X,Y}(s,t) = p_Y(t) · p_{X|Y}(s/t),   thus
    −ld p_{X,Y}(s,t) = −ld p_Y(t) − ld p_{X|Y}(s/t).

A.1.2 From the axioms (0), (1), and (2), all the other properties usually derived for the classical model can be obtained.

For Y = ∅, (2) yields
(2a)  H_Z(∅) = 0   ("There is no uncertainty on the empty event set.")
(1) and (2) imply
(3a)  H_Z(X∪Y) ≤ H_Z(X) + H_Z(Y)   ("Uncertainty is subadditive.")
(0) and (2) imply
(3b)  H_Z(Y) ≤ H_Z(X∪Y)   ("Uncertainty increases with larger event set.")
From (2) and the commutativity of .∪. follows
(4)  H_Z(X) − H_{Y∪Z}(X) = H_Z(Y) − H_{X∪Z}(Y).

(4) suggests the following definition: the mutual information of X and Y under knowledge of Z is defined as

    I_Z(X,Y) =def H_Z(X) − H_{Y∪Z}(X).

Thus, the mutual information I_Z(X,Y) is a symmetric (and, because of (1), nonnegative) function of the events X and Y. From (2),

    I_Z(X,Y) = H_Z(X) + H_Z(Y) − H_Z(X∪Y).

Because of (4), "Y says nothing about X" and "X says nothing about Y" are equivalent and are expressed by I_Z(X,Y) = 0. Another way of saying this is that under knowledge of Z, the events X and Y are mutually independent. In the classical stochastic model, this situation is given if and only if X, Y are independent random variables: p_{X,Y}(s,t) = p_X(s) · p_Y(t).
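These relations can be checked numerically in the classical stochastic model. The following sketch uses a made-up joint distribution (purely illustrative, not from the text) and verifies the additivity axiom (2) and the two expressions for the mutual information:

```python
import math

# hypothetical joint distribution p_{X,Y}(s, t), chosen arbitrarily
joint = {('a', 0): 0.30, ('a', 1): 0.20,
         ('b', 0): 0.10, ('b', 1): 0.40}

def entropy(dist):
    """H = -sum p · ld p over entries with p > 0 (ld = log base 2)."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)

# marginal distributions p_X and p_Y
pX, pY = {}, {}
for (s, t), q in joint.items():
    pX[s] = pX.get(s, 0.0) + q
    pY[t] = pY.get(t, 0.0) + q

H_XY = entropy(joint)
H_X, H_Y = entropy(pX), entropy(pY)
# equivocation H_Y(X) = -sum p_{X,Y}(s,t) · ld p_{X|Y}(s/t)
H_X_given_Y = -sum(q * math.log2(q / pY[t])
                   for (s, t), q in joint.items() if q > 0)

# axiom (2) with Z = ∅:  H(X ∪ Y) = H_Y(X) + H(Y)
assert abs(H_XY - (H_X_given_Y + H_Y)) < 1e-12
# mutual information: I(X,Y) = H(X) - H_Y(X) = H(X) + H(Y) - H(X ∪ Y)
I_XY = H_X - H_X_given_Y
assert abs(I_XY - (H_X + H_Y - H_XY)) < 1e-12
assert I_XY >= 0.0  # nonnegative, as (1) demands
```

Here I_XY comes out strictly positive, since the chosen X and Y are not independent random variables.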
I_Z(X,Y) = 0 is equivalent with the additivity of H under knowledge of Z:

(5)  I_Z(X,Y) = 0 if and only if H_Z(X) + H_Z(Y) = H_Z(X∪Y).

A.2 Axiomatic Information Theory of Cryptosystems

For a cryptosystem X, events in the sense of abstract information theory are sets of finite texts over Z_m as an alphabet. Let P be a plaintext(-event), C a cryptotext(-event), K a keytext(-event).²

² Following a widespread notational misusage, in the sequel we replace {X} by X and {X}∪{Y} by X, Y; we also omit ∅ as subscript.

The uncertainties

    H(K), H_C(K), H_P(K), H(C), H_P(C), H_K(C), H(P), H_K(P), H_C(P)

are now called equivocations.

A.2.1 First of all, from (1) one obtains

    H(K) ≥ H_P(K),   H(C) ≥ H_P(C),
    H(C) ≥ H_K(C),   H(P) ≥ H_K(P),
    H(P) ≥ H_C(P),   H(K) ≥ H_C(K).

A.2.1.1 If X is functional, then C is uniquely determined by P and K, thus

(CRYPT)  H_{P,K}(C) = 0,   i.e.,   I_K(P,C) = H_K(C),   I_P(K,C) = H_P(C)

("Plaintext and keytext together allow no uncertainty on the cryptotext.")

A.2.1.2 If X is injective, then P is uniquely determined by C and K, thus

(DECRYPT)  H_{C,K}(P) = 0,   i.e.,   I_C(K,P) = H_C(P),   I_K(C,P) = H_K(P)

("Cryptotext and keytext together allow no uncertainty on the plaintext.")

A.2.1.3 If X is Shannon, then K is uniquely determined by C and P, thus

(SHANN)  H_{C,P}(K) = 0,   i.e.,   I_P(C,K) = H_P(K),   I_C(P,K) = H_C(K)

("Cryptotext and plaintext together allow no uncertainty on the keytext.")

A.2.2 From (2) and (4) follows immediately

    H_K(P,C) = H_K(C) + H_{K,C}(P) = H_K(P) + H_{K,P}(C),
    H_P(K,C) = H_P(C) + H_{P,C}(K) = H_P(K) + H_{P,K}(C),
    H_C(P,K) = H_C(P) + H_{C,P}(K) = H_C(K) + H_{C,K}(P).

With (0) this gives

Theorem 1:
    (CRYPT)   implies  H_K(C) ≤ H_K(P)  and  H_P(C) ≤ H_P(K),
    (DECRYPT) implies  H_C(P) ≤ H_C(K)  and  H_K(P) ≤ H_K(C),
    (SHANN)   implies  H_P(K) ≤ H_P(C)  and  H_C(K) ≤ H_C(P).

A.2.3 In a cryptosystem, X is normally injective, i.e., (DECRYPT) holds. In Figure 188, the resulting numerical relations are shown graphically. In the
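The three conditions can be verified concretely on a minimal VERNAM-style cipher over Z_3 with c = (p + k) mod 3. The model below (a single-character toy with made-up distributions, not the book's notation) is functional, injective, and Shannon, so all three equivocations (CRYPT), (DECRYPT), (SHANN) vanish:

```python
import math
from itertools import product

m = 3  # toy alphabet Z_3; the cipher is c = (p + k) mod m, VERNAM-style
pP = {0: 0.5, 1: 0.3, 2: 0.2}       # hypothetical plaintext distribution
pK = {k: 1 / m for k in range(m)}   # independent uniform keytext

# joint distribution over triples (p, k, c); the cipher is deterministic
joint = {(p, k, (p + k) % m): pP[p] * pK[k]
         for p, k in product(range(m), repeat=2)}

def marginal(idx):
    """Marginal distribution of the chosen coordinates (0=P, 1=K, 2=C)."""
    out = {}
    for t, q in joint.items():
        key = tuple(t[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def equivocation(x, given=()):
    """H_given(x): uncertainty on coordinates x when 'given' is known."""
    pxy, py = marginal(x + given), marginal(given)
    return -sum(q * math.log2(q / py[k[len(x):]])
                for k, q in pxy.items() if q > 0)

# (CRYPT):   plaintext and keytext determine the cryptotext
assert abs(equivocation((2,), given=(0, 1))) < 1e-12
# (DECRYPT): cryptotext and keytext determine the plaintext
assert abs(equivocation((0,), given=(2, 1))) < 1e-12
# (SHANN):   cryptotext and plaintext determine the keytext
assert abs(equivocation((1,), given=(2, 0))) < 1e-12
# Theorem 1, e.g. (DECRYPT) implies H_C(P) <= H_C(K)
assert equivocation((0,), given=(2,)) <= equivocation((1,), given=(2,)) + 1e-12
```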
[Figure: diagram of the order relations among H(K), H(C), H(P) and the six equivocations H_P(K), H_P(C), H_C(K), H_C(P), H_K(P), H_K(C) established above.]
Fig. 188. Numerical equivocation relations for injective cryptosystems

classical professional cryptosystems, there are usually no homophones and the Shannon condition (2.6.4) holds. Monoalphabetic simple substitution and transposition are trivial, and VIGENÈRE, BEAUFORT, and in particular VERNAM are serious examples of such classical cryptosystems.

The conjunction of any two of the three conditions (CRYPT), (DECRYPT), (SHANN) has far-reaching consequences in view of the antisymmetry of the numerical relations:

Theorem 2:
    (CRYPT) ∧ (DECRYPT) implies H_K(C) = H_K(P)
    ("Uncertainty on the cryptotext under knowledge of the keytext equals uncertainty on the plaintext under knowledge of the keytext."),
    (DECRYPT) ∧ (SHANN) implies H_C(P) = H_C(K)
    ("Uncertainty on the plaintext under knowledge of the cryptotext equals uncertainty on the keytext under knowledge of the cryptotext."),
    (CRYPT) ∧ (SHANN) implies H_P(K) = H_P(C)
    ("Uncertainty on the keytext under knowledge of the plaintext equals uncertainty on the cryptotext under knowledge of the plaintext.").

In Figure 189, the resulting numerical relations for classical cryptosystems with (CRYPT), (DECRYPT), and (SHANN) are shown graphically.

[Figure: as Fig. 188, with the additional equalities H_P(K) = H_P(C), H_C(K) = H_C(P), H_K(P) = H_K(C) of Theorem 2.]
Fig. 189. Numerical equivocation relations for classical cryptosystems

A.3 Perfect and Independent Key Cryptosystems

A.3.1 A cryptosystem is called a perfect cryptosystem if plaintext and cryptotext are mutually independent: I(P,C) = 0. This is equivalent to H(P) = H_C(P) and to H(C) = H_P(C) ("Without knowing the keytext: knowledge of the cryptotext does not change the uncertainty on the plaintext, and knowledge of the plaintext does not change the uncertainty on the cryptotext") and is, according to (5), equivalent to

    H(P,C) = H(P) + H(C).
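A uniform, independent keytext makes the toy Z_3 VERNAM cipher perfect, while any bias in the key leaks information. The sketch below (made-up distributions; I(P,C) computed as H(C) − H_P(C)) illustrates the definition:

```python
import math
from itertools import product

m = 3  # toy alphabet Z_3, cipher c = (p + k) mod m, key independent of plaintext
pP = {0: 0.5, 1: 0.3, 2: 0.2}  # hypothetical plaintext distribution

def mutual_info_PC(pK):
    """I(P, C) = H(C) - H_P(C) for the toy cipher under keytext distribution pK."""
    joint, pC = {}, {}
    for p, k in product(range(m), repeat=2):
        c = (p + k) % m
        joint[(p, c)] = joint.get((p, c), 0.0) + pP[p] * pK[k]
    for (p, c), q in joint.items():
        pC[c] = pC.get(c, 0.0) + q
    H_C = -sum(q * math.log2(q) for q in pC.values() if q > 0)
    H_C_given_P = -sum(q * math.log2(q / pP[p])
                       for (p, c), q in joint.items() if q > 0)
    return H_C - H_C_given_P

uniform = {k: 1 / m for k in range(m)}
biased = {0: 0.6, 1: 0.3, 2: 0.1}
assert abs(mutual_info_PC(uniform)) < 1e-9  # perfect: I(P, C) = 0
assert mutual_info_PC(biased) > 0.1         # biased key: P and C dependent
```

With the uniform key, every cryptotext character is equally likely whatever the plaintext is, which is exactly the independence the definition demands.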
A.3.2 A cryptosystem is called an independent key cryptosystem if plaintext and keytext are mutually independent: I(P,K) = 0. This is equivalent to H(P) = H_K(P) and to H(K) = H_P(K) ("Without knowing the cryptotext: knowledge of the keytext does not change the uncertainty on the plaintext, and knowledge of the plaintext does not change the uncertainty on the keytext") and, according to (5), is equivalent to

    H(K,P) = H(K) + H(P).

[Figure: as Fig. 189, with the additional equalities H(P) = H_C(P), H(C) = H_P(C) (perfect) and H(P) = H_K(P), H(K) = H_P(K) (independent key).]
Fig. 190. Numerical equivocation relations for classical cryptosystems, with additional properties perfect and/or independent key

A.3.3 Shannon also proved a pessimistic inequality.

Theorem 3K: In a perfect classical cryptosystem (Fig. 190),

    H(P) ≤ H(K)   and   H(C) ≤ H(K).

Proof:
    H(P) = H_C(P)       (perfect),
    H_C(P) ≤ H_C(K)     ((DECRYPT), Theorem 1),
    H_C(K) ≤ H(K)       ((1)),
hence H(P) ≤ H(K); analogously, H(C) = H_P(C) ≤ H_P(K) ≤ H(K) by (perfect), (CRYPT) with Theorem 1, and (1).
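The chain of the proof can be traced numerically on the same toy system (uniform key over Z_3, made-up plaintext distribution; a sketch, not the book's apparatus):

```python
import math
from itertools import product

m = 3  # toy Z_3 VERNAM: perfect and classical (CRYPT, DECRYPT, SHANN all hold)
pP = {0: 0.5, 1: 0.3, 2: 0.2}       # hypothetical plaintext distribution
pK = {k: 1 / m for k in range(m)}   # uniform independent keytext

joint = {(p, k, (p + k) % m): pP[p] * pK[k]
         for p, k in product(range(m), repeat=2)}  # coordinates (P, K, C)

def marginal(idx):
    out = {}
    for t, q in joint.items():
        key = tuple(t[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def equivocation(x, given=()):
    pxy, py = marginal(x + given), marginal(given)
    return -sum(q * math.log2(q / py[k[len(x):]])
                for k, q in pxy.items() if q > 0)

H_P, H_K = equivocation((0,)), equivocation((1,))
H_C_of_P, H_C_of_K = equivocation((0,), (2,)), equivocation((1,), (2,))

assert abs(H_P - H_C_of_P) < 1e-9   # perfect: H(P) = H_C(P)
assert H_C_of_P <= H_C_of_K + 1e-9  # (DECRYPT), Theorem 1
assert H_C_of_K <= H_K + 1e-9       # axiom (1)
assert H_P <= H_K                   # Theorem 3K: H(P) <= H(K)
```

Here H(K) = ld 3 ≈ 1.585 strictly exceeds H(P) ≈ 1.485, matching the pessimistic bound: a perfect system needs at least as much key uncertainty as plaintext uncertainty.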