Information-Theoretic Analysis Using Theorem Proving
Information-Theoretic Analysis using Theorem Proving

Tarek Mhamdi

A Thesis
in
The Department of Electrical and Computer Engineering

Presented in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy at
Concordia University
Montréal, Québec, Canada

December 2012

© Tarek Mhamdi, 2012

CONCORDIA UNIVERSITY
Division of Graduate Studies

This is to certify that the thesis prepared

By: Tarek Mhamdi
Entitled: Information-Theoretic Analysis using Theorem Proving

and submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy complies with the regulations of this University and meets the accepted standards with respect to originality and quality.

Signed by the final examining committee:

Dr. Deborah Dysart-Gale
Dr. Amy Felty
Dr. Nizar Bouguila
Dr. Weiping Zhu
Dr. Otmane Ait Mohamed
Dr. Sofiène Tahar

Approved by the Chair of the ECE Department and the Dean of Engineering, 2012.

ABSTRACT

Information-Theoretic Analysis using Theorem Proving

Tarek Mhamdi
Concordia University, 2012

Information theory is used to analyze a wide range of scientific and engineering problems, including cryptography, neurobiology, quantum computing, plagiarism detection and other forms of data analysis. Despite the safety-critical nature of some of these applications, most information-theoretic analysis is carried out using informal techniques, mainly computer simulation and paper-and-pencil analysis, and thus cannot be completely relied upon. The unreliable nature of the produced results poses a serious problem in safety-critical applications and may result in heavy financial losses or even the loss of human life. In order to overcome the inaccuracy limitations of these techniques, this thesis proposes to conduct the analysis within the trusted kernel of a higher-order-logic (HOL) theorem prover.
For this purpose, we provide HOL formalizations of the fundamental theories of measure, Lebesgue integration and probability, and use them to formalize some of the most widely used information-theoretic principles. We use the Kullback-Leibler divergence as a unified measure of information, which is in turn used to define the main measures of information, such as the Shannon entropy, mutual information and conditional mutual information. Furthermore, we introduce two new measures of information leakage, namely the information leakage degree and the conditional information leakage degree, and compare them with existing measures. We illustrate the usefulness of the proposed framework by tackling various applications, including the performance analysis of a communication encoder used in the proof of the Shannon source coding theorem, and the quantitative analysis, using information-theoretic measures, of the privacy properties of a digital communications mixer and of the one-time pad encryption system.

To My Wife and Daughter, My Mom and Dad, My Sisters and Brother.

ACKNOWLEDGEMENTS

First, I am deeply grateful to Dr. Sofiène Tahar for his help, guidance and encouragement throughout my graduate studies. I could not have wished for a better thesis supervisor. Many thanks to Dr. Osman Hasan for his support in my research and for his encouragement. I would like to thank the love of my life, Souha Mahmoudi, for her unconditional love and support. I am deeply grateful to my parents for all the love and support they have provided me over the years. Nothing would be possible without them. Many thanks to the members of the thesis committee for the assistance they provided at all levels of the research project. Finally, many thanks to my good friends in the hardware verification group for their support and motivation.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ACRONYMS

1 Introduction
  1.1 Motivation
  1.2 State-of-the-Art
    1.2.1 Measure and Probability
    1.2.2 Lebesgue Integration
    1.2.3 Information Theory
    1.2.4 Probabilistic Model Checking
  1.3 Proposed Methodology
  1.4 Thesis Contributions
  1.5 Thesis Organization

2 Preliminaries
  2.1 Information Theory
    2.1.1 Quantifying Uncertainty
    2.1.2 Measures of Information
  2.2 Theorem Proving

3 Measure Theory and Lebesgue Integration in HOL
  3.1 Formalization of Extended Real Numbers
    3.1.1 Type and Operators
    3.1.2 Supremum and Infimum
    3.1.3 Summation over a Set
  3.2 Formalization of Measure Theory in HOL
    3.2.1 Basic Definitions
    3.2.2 Borel Sigma Algebra
      Rational Numbers
      Topology
      Borel Measurable Sets
    3.2.3 Extended-Real-Valued Measurable Functions
    3.2.4 Products of Measure Spaces
  3.3 Formalization of Lebesgue Integration in HOL
    3.3.1 Lebesgue Integral
    3.3.2 Lebesgue Monotone Convergence
    3.3.3 Integrability
    3.3.4 Integral Properties
  3.4 Summary and Discussions

4 Formalization of Information Theory
  4.1 Formalization of Probability in HOL
    4.1.1 Basic Definitions
    4.1.2 Statistical Properties
      Expectation
      Variance
  4.2 Formalization of Information Measures in HOL
    4.2.1 Radon-Nikodym Derivative
    4.2.2 Kullback-Leibler Divergence
    4.2.3 Mutual Information
    4.2.4 Shannon Entropy
    4.2.5 Conditional Mutual Information
  4.3 Quantitative Analysis of Information Flow
    4.3.1 Information Leakage Degree
    4.3.2 Conditional Information Leakage Degree
  4.4 Summary and Discussions

5 Applications
  5.1 Data Compression
    5.1.1 Chebyshev and Markov Inequalities
    5.1.2 Weak Law of Large Numbers (WLLN)
    5.1.3 Asymptotic Equipartition Property (AEP)
    5.1.4 Typical Set
    5.1.5 Data Compression Limit
    5.1.6 Discussions
  5.2 Anonymity-based Single MIX
    5.2.1 Discussions
  5.3 One-Time Pad
    5.3.1 Encoding - Decoding
    5.3.2 Encryption - Decryption
    5.3.3 Perfect Security
    5.3.4 Discussions
  5.4 Summary and Discussions

6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work

Bibliography
Biography

LIST OF TABLES

2.1 HOL Symbols and Functions
5.1 Straddling Checkerboard Example

LIST OF FIGURES

1.1 Overview of the Proposed Framework
5.1 Source Coding
5.2 Typical Set Encoding
5.3 Single MIX Example
5.4 Counter-Example for [68]
5.5 A Russian One-Time Pad, Captured by MI5 [53]
5.6 One-Time Pad Encryption

LIST OF ACRONYMS

ACL2   A Computational Logic for Applicative Common Lisp
AEP    Asymptotic Equipartition Property
CDF    Cumulative Distribution Function
DPI    Data Processing Inequality
HOL    Higher-Order Logic
HOL4   HOL4 Theorem Prover
LCF    Logic of Computable Functions
KL     Kullback-Leibler
LISP   LISt Processing
ML     Meta Language
OTP    One-Time Pad
PDF    Probability Density Function
PMF    Probability Mass Function
PRISM  PRobabilistIc Symbolic Model checker
PVS    Prototype Verification System
TOR    The Onion Router
WLLN   Weak Law of Large Numbers
ZFC    Zermelo-Fraenkel with axiom of Choice

Chapter 1

Introduction

1.1 Motivation

Knowing ignorance is strength. Ignoring knowledge is sickness.
Lao-Tse, Tao Te Ching [6th Century BCE]

Hardware and software systems usually exhibit some kind of random or unpredictable behavior, either because of a faulty component, interaction with an unpredictable environment, or simply because of the use of a probabilistic algorithm within the system. Due to these elements of randomness and uncertainty, establishing the correctness of a system under all circumstances usually becomes impractically expensive. The engineering approach to evaluating the performance of such systems is probabilistic analysis. Furthermore, information-theoretic analysis is gaining ground in the study of the correctness and performance of a broad range of scientific and engineering systems. In fact, since it was first introduced by Shannon in his seminal paper [62], information theory has become an increasingly popular discipline in a wide range of applications including communication, cryptography, quantum computing,