Algebraic Information Theory
Marc Pouly
[email protected]
Apsia Breakfast Seminar
Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg
June 2011

Hartley's Measure (1928)
Given a set S = {s1, ..., sn}, how can we measure its uncertainty u(S)?
1. uncertainty is a non-negative value
2. monotone: |S1| ≤ |S2| ⇒ u(S1) ≤ u(S2)
3. additive: u(S1 × S2) = u(S1) + u(S2)
Theorem: There is only one function that satisfies these properties: u(S) = log |S|

From Uncertainty to Information
The uncertainty of S = {s1, ..., s8} is log 8 = 3 bits.
Assume now that someone gives more precise information, for example that either s3 or s7 has been transmitted.
We then have S' = {s3, s7} with a remaining uncertainty of log 2 = 1 bit.
The information reduces the uncertainty by log 8 − log 2 = 2 bits.
Information is the reduction of uncertainty!

Shannon's Measure (1948)
How much uncertainty is contained in a set S = {s1, ..., sn} if the probability pi = p(si) of each element is known?
S(p1, ..., pn) = − Σ_{i=1}^{n} pi log pi
We have a similar uniqueness result for specific properties.
An information theory is derived by the same principle.
This is what people call classical or statistical information theory.
Shannon generalizes Hartley: S(1/n, ..., 1/n) = log n

How do we represent Information?
Thanks to Hartley we have an information theory for sets, and thanks to Shannon an information theory for probabilities.
But there are other ways of representing information (on computers): databases (relations), systems of equations and inequalities, constraint systems, possibilistic formalisms, formalisms for imprecise probabilities, Spohn potentials, graphs, logics, ...
Statistical information theory is not enough!

Extending Hartley's Theory
(Diagram: Hartley's information theory over alphabets extends in two directions, to probabilistic sources, giving Shannon's information theory, and via isomorphisms, giving relational information theory.)
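The Hartley and Shannon measures above are easy to compute directly; here is a minimal Python sketch reproducing the numbers from the slides (the function names hartley and shannon are illustrative, not part of the original material):

```python
import math

def hartley(s):
    """Hartley measure u(S) = log2 |S| of a finite set S, in bits."""
    return math.log2(len(s))

def shannon(probs):
    """Shannon measure -sum p_i log2 p_i of a probability distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

S = {f"s{i}" for i in range(1, 9)}          # S = {s1, ..., s8}
print(hartley(S))                            # 3.0 bits of uncertainty
print(hartley({"s3", "s7"}))                 # 1.0 bit remains after the hint
print(hartley(S) - hartley({"s3", "s7"}))    # 2.0 bits: information = reduction of uncertainty

# Shannon generalizes Hartley: the uniform distribution over n elements gives log2 n
print(shannon([1/8] * 8))                    # 3.0 bits, the same as hartley(S)
```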
The Fundamental Theorem of Lattice Theory
Hartley's information theory assumes a finite alphabet S and assigns values to subsets, i.e. u : P(S) → R≥0.
P(S) is a distributive lattice with meet ∩ and join ∪.
Theorem (Fundamental Theorem of Lattice Theory): Every distributive lattice is isomorphic to a lattice of subsets.
We can carry over Hartley's measure to isomorphic formalisms, for example to the relational algebra used in databases.

Relational Information Theory
We can therefore measure the uncertainty in relations, for example

R =   Destination  Departure  Gate
      Heathrow     10:00      7
      Heathrow     14:00      9
      Gatwick      08:30      4
      City         11:15      5
      City         15:20      7

and obtain u(R) = log 5 bits.
If we agree on the three properties stated by Hartley, then u(S) = log |S| is the only correct way of measuring uncertainty in subset systems, and hence also the right way for isomorphic formalisms such as the relational algebra.

Duality of Information
Consider the same flight relation R and two different questions:
1. How do I get to London? The more tuples, the more information.
2. I am waiting for my friend; which flight might she have taken? The fewer tuples, the more information.
Such a dualism is always present in order theory, but not for measures.
Is measuring information sometimes too restrictive?

Linear Equation Systems
Solution sets of linear equation systems form affine spaces, for example

X1 − 2X2 + 2X3 = −1
3X1 + 5X2 − 3X3 = 8
4X1 + 3X2 −  X3 = 7

The null space of the equation matrix A is
N(A) = { (x1, x2, x3) ∈ R³ : x1 = −(4/11) x3, x2 = (9/11) x3 }
How much uncertainty is contained in an equation system?
Can we treat this just as another subset system?

Equational Information Theory
Linear equation systems can have no, one or infinitely many solutions.
Hence, the uncertainty is either log 0, log 1 or log ∞.
Here, a (quantitative) measure of information is not appropriate.
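The claims about this example system can be checked numerically. A small sketch using NumPy (assuming NumPy is available; the variable names are illustrative):

```python
import numpy as np

A = np.array([[1.0, -2.0,  2.0],
              [3.0,  5.0, -3.0],
              [4.0,  3.0, -1.0]])
b = np.array([-1.0, 8.0, 7.0])

# rank(A) = rank([A | b]) = 2 < 3, so the system has infinitely many solutions
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(np.column_stack([A, b])))

# every multiple of (-4/11, 9/11, 1) lies in the null space N(A)
v = np.array([-4/11, 9/11, 1.0])
print(np.allclose(A @ v, 0))   # True
```

With rank 2 over three unknowns, this particular system falls into the "infinitely many solutions" case, i.e. the log ∞ case above.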
A first Summary
A theory of information should explain what information is.
Hartley & Shannon: information = reduction of uncertainty.
Both rely on the assumption that information can be measured.
There are many formalisms for representing information on computers that are not covered by this theory.
Does this theory reflect our daily perception of information? What is our perception of information?

What is Information?
... information exists in pieces
... information comes from different sources
... information refers to questions
... pieces of information can be combined
... information can be focussed on the questions of interest

Towards a formal Framework
- information exists in pieces φ, ψ ∈ Φ
- there is a universe of questions r, and every piece of information φ ∈ Φ refers to a finite set of questions d(φ) ⊆ r
- combination of information: φ ⊗ ψ
- focussing of information: if d(φ) = x and y ⊆ x, then φ↓y ∈ Φ

... and the same again for nerds ...
This is a two-sorted algebra (Φ, r) with a universe of questions r and information pieces Φ:
- labeling operator d : Φ → P(r)
- combination operator ⊗ : Φ × Φ → Φ
- focussing operator ↓ : Φ × P(r) → Φ
But the operations cannot be arbitrary: they must satisfy some rules!

Axioms of Information
1. it should not matter in which order information is combined:
   φ ⊗ ψ = ψ ⊗ φ and (φ ⊗ ψ) ⊗ ν = φ ⊗ (ψ ⊗ ν)
2. a combination refers to the union of the question sets:
   d(φ ⊗ ψ) = d(φ) ∪ d(ψ)
3. focussing information on x ⊆ d(φ) gives information about x:
   d(φ↓x) = x
4. focussing can be done step-wise, i.e. if x ⊆ y ⊆ d(φ):
   φ↓x = (φ↓y)↓x
5. combining a piece of information with a part of itself gives nothing new:
   φ ⊗ φ↓x = φ

The Combination Axiom
How shall ⊗ and ↓ behave with respect to each other?
6. If φ, ψ ∈ Φ with d(φ) = x and d(ψ) = y, then
   (φ ⊗ ψ)↓x = φ ⊗ ψ↓(x ∩ y)
Compare with the distributive law: (a × b) + (a × c) = a × (b + c)
Definition (Kohlas, 2003): A system (Φ, r) satisfying the six axioms is called an information algebra.

Relational Databases
Relations are pieces of information, for example a relation over the attributes Player, Club, Goals and a relation over Player, Nationality, with tuples for players such as Ronaldinho
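The relational algebra is the running example for these operations: relations are the pieces of information, their attribute sets are the labels, natural join plays the role of combination ⊗, and projection plays the role of focussing ↓. Below is a minimal Python sketch of this reading; the Relation class and the small example data are illustrative and not taken from the slides.

```python
from itertools import product

class Relation:
    """A piece of information: a set of tuples over a finite set of attributes."""

    def __init__(self, attrs, rows):
        self.attrs = frozenset(attrs)   # d(phi): the questions the relation refers to
        self.rows = {frozenset((a, r[a]) for a in self.attrs) for r in rows}

    def label(self):
        """Labeling operator d."""
        return self.attrs

    def combine(self, other):
        """Combination ⊗ realised as the natural join."""
        common = self.attrs & other.attrs
        rows = []
        for r1, r2 in product(self.rows, other.rows):
            d1, d2 = dict(r1), dict(r2)
            if all(d1[a] == d2[a] for a in common):
                rows.append({**d1, **d2})
        return Relation(self.attrs | other.attrs, rows)

    def focus(self, attrs):
        """Focussing ↓ realised as projection onto a subset of the attributes."""
        return Relation(frozenset(attrs) & self.attrs, [dict(r) for r in self.rows])

    def __eq__(self, other):
        return self.attrs == other.attrs and self.rows == other.rows


# two small pieces of information (example data invented for illustration)
clubs = Relation(["Player", "Club"],
                 [{"Player": "Ronaldinho", "Club": "Milan"}])
nations = Relation(["Player", "Nationality"],
                   [{"Player": "Ronaldinho", "Nationality": "Brazil"},
                    {"Player": "Figo", "Nationality": "Portugal"}])

# combination axiom: (phi ⊗ psi)↓x  ==  phi ⊗ psi↓(x ∩ y)
x, y = clubs.label(), nations.label()
lhs = clubs.combine(nations).focus(x)
rhs = clubs.combine(nations.focus(x & y))
print(lhs == rhs)   # True
```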