An Elementary Introduction to Information Geometry
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Geometric Manifolds
Wintersemester 2015/2016 University of Heidelberg Geometric Structures on Manifolds Geometric Manifolds by Stephan Schmitt Contents Introduction, first Definitions and Results 1 Manifolds - The Group way .................................... 1 Geometric Structures ........................................ 2 The Developing Map and Completeness 4 An introductory discussion of the torus ............................. 4 Definition of the Developing map ................................. 6 Developing map and Manifolds, Completeness 10 Developing Manifolds ....................................... 10 some completeness results ..................................... 10 Some selected results 11 Discrete Groups .......................................... 11 Stephan Schmitt INTRODUCTION, FIRST DEFINITIONS AND RESULTS Introduction, first Definitions and Results Manifolds - The Group way The keystone of working mathematically in Differential Geometry, is the basic notion of a Manifold, when we usually talk about Manifolds we mean a Topological Space that, at least locally, looks just like Euclidean Space. The usual formalization of that Concept is well known, we take charts to ’map out’ the Manifold, in this paper, for sake of Convenience we will take a slightly different approach to formalize the Concept of ’locally euclidean’, to formulate it, we need some tools, let us introduce them now: Definition 1.1. Pseudogroups A pseudogroup on a topological space X is a set G of homeomorphisms between open sets of X satisfying the following conditions: • The Domains of the elements g 2 G cover X • The restriction of an element g 2 G to any open set contained in its Domain is also in G. • The Composition g1 ◦ g2 of two elements of G, when defined, is in G • The inverse of an Element of G is in G. • The property of being in G is local, that is, if g : U ! V is a homeomorphism between open sets of X and U is covered by open sets Uα such that each restriction gjUα is in G, then g 2 G Definition 1.2. -
Affine Connections and Second-Order Affine Structures
Affine connections and second-order affine structures Filip Bár Dedicated to my good friend Tom Rewwer on the occasion of his 35th birthday. Abstract Smooth manifolds have been always understood intuitively as spaces with an affine geometry on the infinitesimal scale. In Synthetic Differential Geometry this can be made precise by showing that a smooth manifold carries a natural struc- ture of an infinitesimally affine space. This structure is comprised of two pieces of data: a sequence of symmetric and reflexive relations defining the tuples of mu- tual infinitesimally close points, called an infinitesimal structure, and an action of affine combinations on these tuples. For smooth manifolds the only natural infin- itesimal structure that has been considered so far is the one generated by the first neighbourhood of the diagonal. In this paper we construct natural infinitesimal structures for higher-order neighbourhoods of the diagonal and show that on any manifold any symmetric affine connection extends to a second-order infinitesimally affine structure. Introduction A deeply rooted intuition about smooth manifolds is that of spaces that become linear spaces in the infinitesimal neighbourhood of each point. On the infinitesimal scale the geometry underlying a manifold is thus affine geometry. To make this intuition precise requires a good theory of infinitesimals as well as defining precisely what it means for two points on a manifold to be infinitesimally close. As regards infinitesimals we make use of Synthetic Differential Geometry (SDG) and adopt the neighbourhoods of the diagonal from Algebraic Geometry to define when two points are infinitesimally close. The key observations on how to proceed have been made by Kock in [8]: 1) The first neighbourhood arXiv:1809.05944v2 [math.DG] 29 Aug 2020 of the diagonal exists on formal manifolds and can be understood as a symmetric, reflexive relation on points, saying when two points are infinitesimal neighbours, and 2) we can form affine combinations of points that are mutual neighbours. -
2. Chern Connections and Chern Curvatures1
1 2. Chern connections and Chern curvatures1 Let V be a complex vector space with dimC V = n. A hermitian metric h on V is h : V £ V ¡¡! C such that h(av; bu) = abh(v; u) h(a1v1 + a2v2; u) = a1h(v1; u) + a2h(v2; u) h(v; u) = h(u; v) h(u; u) > 0; u 6= 0 where v; v1; v2; u 2 V and a; b; a1; a2 2 C. If we ¯x a basis feig of V , and set hij = h(ei; ej) then ¤ ¤ ¤ ¤ h = hijei ej 2 V V ¤ ¤ ¤ ¤ where ei 2 V is the dual of ei and ei 2 V is the conjugate dual of ei, i.e. X ¤ ei ( ajej) = ai It is obvious that (hij) is a hermitian positive matrix. De¯nition 0.1. A complex vector bundle E is said to be hermitian if there is a positive de¯nite hermitian tensor h on E. r Let ' : EjU ¡¡! U £ C be a trivilization and e = (e1; ¢ ¢ ¢ ; er) be the corresponding frame. The r hermitian metric h is represented by a positive hermitian matrix (hij) 2 ¡(; EndC ) such that hei(x); ej(x)i = hij(x); x 2 U Then hermitian metric on the chart (U; ') could be written as X ¤ ¤ h = hijei ej For example, there are two charts (U; ') and (V; Ã). We set g = à ± '¡1 :(U \ V ) £ Cr ¡¡! (U \ V ) £ Cr and g is represented by matrix (gij). On U \ V , we have X X X ¡1 ¡1 ¡1 ¡1 ¡1 ei(x) = ' (x; "i) = à ± à ± ' (x; "i) = à (x; gij"j) = gijà (x; "j) = gije~j(x) j j For the metric X ~ hij = hei(x); ej(x)i = hgike~k(x); gjle~l(x)i = gikhklgjl k;l that is h = g ¢ h~ ¢ g¤ 12008.04.30 If there are some errors, please contact to: [email protected] 2 Example 0.2 (Fubini-Study metric on holomorphic tangent bundle T 1;0Pn). -
Introduction to Information Geometry – Based on the Book “Methods of Information Geometry ”Written by Shun-Ichi Amari and Hiroshi Nagaoka
Introduction to Information Geometry – based on the book “Methods of Information Geometry ”written by Shun-Ichi Amari and Hiroshi Nagaoka Yunshu Liu 2012-02-17 Introduction to differential geometry Geometric structure of statistical models and statistical inference Outline 1 Introduction to differential geometry Manifold and Submanifold Tangent vector, Tangent space and Vector field Riemannian metric and Affine connection Flatness and autoparallel 2 Geometric structure of statistical models and statistical inference The Fisher metric and α-connection Exponential family Divergence and Geometric statistical inference Yunshu Liu (ASPITRG) Introduction to Information Geometry 2 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Part I Introduction to differential geometry Yunshu Liu (ASPITRG) Introduction to Information Geometry 3 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Basic concepts in differential geometry Basic concepts Manifold and Submanifold Tangent vector, Tangent space and Vector field Riemannian metric and Affine connection Flatness and autoparallel Yunshu Liu (ASPITRG) Introduction to Information Geometry 4 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Manifold Manifold S n Manifold: a set with a coordinate system, a one-to-one mapping from S to R , n supposed to be ”locally” looks like an open subset of R ” n Elements of the set(points): points in R , probability distribution, linear system. Figure : A coordinate system ξ for a manifold S Yunshu Liu (ASPITRG) Introduction to Information Geometry 5 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Manifold Manifold S Definition: Let S be a set, if there exists a set of coordinate systems A for S which satisfies the condition (1) and (2) below, we call S an n-dimensional C1 differentiable manifold. -
Arxiv:1907.11122V2 [Math-Ph] 23 Aug 2019 on M Which Are Dual with Respect to G
Canonical divergence for flat α-connections: Classical and Quantum Domenico Felice1, ∗ and Nihat Ay2, y 1Max Planck Institute for Mathematics in the Sciences Inselstrasse 22{04103 Leipzig, Germany 2 Max Planck Institute for Mathematics in the Sciences Inselstrasse 22{04103 Leipzig, Germany Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA Faculty of Mathematics and Computer Science, University of Leipzig, PF 100920, 04009 Leipzig, Germany A recent canonical divergence, which is introduced on a smooth manifold M endowed with a general dualistic structure (g; r; r∗), is considered for flat α-connections. In the classical setting, we compute such a canonical divergence on the manifold of positive measures and prove that it coincides with the classical α-divergence. In the quantum framework, the recent canonical divergence is evaluated for the quantum α-connections on the manifold of all positive definite Hermitian operators. Also in this case we obtain that the recent canonical divergence is the quantum α-divergence. PACS numbers: Classical differential geometry (02.40.Hw), Riemannian geometries (02.40.Ky), Quantum Information (03.67.-a). I. INTRODUCTION Methods of Information Geometry (IG) [1] are ubiquitous in physical sciences and encom- pass both classical and quantum systems [2]. The natural object of study in IG is a quadruple (M; g; r; r∗) given by a smooth manifold M, a Riemannian metric g and a pair of affine connections arXiv:1907.11122v2 [math-ph] 23 Aug 2019 on M which are dual with respect to g, ∗ X g (Y; Z) = g (rX Y; Z) + g (Y; rX Z) ; (1) for all sections X; Y; Z 2 T (M). -
EXOTIC SPHERES and CURVATURE 1. Introduction Exotic
BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY Volume 45, Number 4, October 2008, Pages 595–616 S 0273-0979(08)01213-5 Article electronically published on July 1, 2008 EXOTIC SPHERES AND CURVATURE M. JOACHIM AND D. J. WRAITH Abstract. Since their discovery by Milnor in 1956, exotic spheres have pro- vided a fascinating object of study for geometers. In this article we survey what is known about the curvature of exotic spheres. 1. Introduction Exotic spheres are manifolds which are homeomorphic but not diffeomorphic to a standard sphere. In this introduction our aims are twofold: First, to give a brief account of the discovery of exotic spheres and to make some general remarks about the structure of these objects as smooth manifolds. Second, to outline the basics of curvature for Riemannian manifolds which we will need later on. In subsequent sections, we will explore the interaction between topology and geometry for exotic spheres. We will use the term differentiable to mean differentiable of class C∞,and all diffeomorphisms will be assumed to be smooth. As every graduate student knows, a smooth manifold is a topological manifold that is equipped with a smooth (differentiable) structure, that is, a smooth maximal atlas. Recall that an atlas is a collection of charts (homeomorphisms from open neighbourhoods in the manifold onto open subsets of some Euclidean space), the domains of which cover the manifold. Where the chart domains overlap, we impose a smooth compatibility condition for the charts [doC, chapter 0] if we wish our manifold to be smooth. Such an atlas can then be extended to a maximal smooth atlas by including all possible charts which satisfy the compatibility condition with the original maps. -
Information Geometry of Orthogonal Initializations and Training
Published as a conference paper at ICLR 2020 INFORMATION GEOMETRY OF ORTHOGONAL INITIALIZATIONS AND TRAINING Piotr Aleksander Sokół and Il Memming Park Department of Neurobiology and Behavior Departments of Applied Mathematics and Statistics, and Electrical and Computer Engineering Institutes for Advanced Computing Science and AI-driven Discovery and Innovation Stony Brook University, Stony Brook, NY 11733 {memming.park, piotr.sokol}@stonybrook.edu ABSTRACT Recently mean field theory has been successfully used to analyze properties of wide, random neural networks. It gave rise to a prescriptive theory for initializing feed-forward neural networks with orthogonal weights, which ensures that both the forward propagated activations and the backpropagated gradients are near `2 isome- tries and as a consequence training is orders of magnitude faster. Despite strong empirical performance, the mechanisms by which critical initializations confer an advantage in the optimization of deep neural networks are poorly understood. Here we show a novel connection between the maximum curvature of the optimization landscape (gradient smoothness) as measured by the Fisher information matrix (FIM) and the spectral radius of the input-output Jacobian, which partially explains why more isometric networks can train much faster. Furthermore, given that or- thogonal weights are necessary to ensure that gradient norms are approximately preserved at initialization, we experimentally investigate the benefits of maintaining orthogonality throughout training, and we conclude that manifold optimization of weights performs well regardless of the smoothness of the gradients. Moreover, we observe a surprising yet robust behavior of highly isometric initializations — even though such networks have a lower FIM condition number at initialization, and therefore by analogy to convex functions should be easier to optimize, experimen- tally they prove to be much harder to train with stochastic gradient descent. -
A Geodesic Connection in Fréchet Geometry
A geodesic connection in Fr´echet Geometry Ali Suri Abstract. In this paper first we propose a formula to lift a connection on M to its higher order tangent bundles T rM, r 2 N. More precisely, for a given connection r on T rM, r 2 N [ f0g, we construct the connection rc on T r+1M. Setting rci = rci−1 c, we show that rc1 = lim rci exists − and it is a connection on the Fr´echet manifold T 1M = lim T iM and the − geodesics with respect to rc1 exist. In the next step, we will consider a Riemannian manifold (M; g) with its Levi-Civita connection r. Under suitable conditions this procedure i gives a sequence of Riemannian manifolds f(T M, gi)gi2N equipped with ci c1 a sequence of Riemannian connections fr gi2N. Then we show that r produces the curves which are the (local) length minimizer of T 1M. M.S.C. 2010: 58A05, 58B20. Key words: Complete lift; spray; geodesic; Fr´echet manifolds; Banach manifold; connection. 1 Introduction In the first section we remind the bijective correspondence between linear connections and homogeneous sprays. Then using the results of [6] for complete lift of sprays, we propose a formula to lift a connection on M to its higher order tangent bundles T rM, r 2 N. More precisely, for a given connection r on T rM, r 2 N [ f0g, we construct its associated spray S and then we lift it to a homogeneous spray Sc on T r+1M [6]. Then, using the bijective correspondence between connections and sprays, we derive the connection rc on T r+1M from Sc. -
Lecture Notes on Foliation Theory
INDIAN INSTITUTE OF TECHNOLOGY BOMBAY Department of Mathematics Seminar Lectures on Foliation Theory 1 : FALL 2008 Lecture 1 Basic requirements for this Seminar Series: Familiarity with the notion of differential manifold, submersion, vector bundles. 1 Some Examples Let us begin with some examples: m d m−d (1) Write R = R × R . As we know this is one of the several cartesian product m decomposition of R . Via the second projection, this can also be thought of as a ‘trivial m−d vector bundle’ of rank d over R . This also gives the trivial example of a codim. d- n d foliation of R , as a decomposition into d-dimensional leaves R × {y} as y varies over m−d R . (2) A little more generally, we may consider any two manifolds M, N and a submersion f : M → N. Here M can be written as a disjoint union of fibres of f each one is a submanifold of dimension equal to dim M − dim N = d. We say f is a submersion of M of codimension d. The manifold structure for the fibres comes from an atlas for M via the surjective form of implicit function theorem since dfp : TpM → Tf(p)N is surjective at every point of M. We would like to consider this description also as a codim d foliation. However, this is also too simple minded one and hence we would call them simple foliations. If the fibres of the submersion are connected as well, then we call it strictly simple. (3) Kronecker Foliation of a Torus Let us now consider something non trivial. -
Jacob Wolfowitz 1910–1981
NATIONAL ACADEMY OF SCIENCES JACOB WOLFOWITZ 1910–1981 A Biographical Memoir by SHELEMYAHU ZACKS Any opinions expressed in this memoir are those of the author and do not necessarily reflect the views of the National Academy of Sciences. Biographical Memoirs, VOLUME 82 PUBLISHED 2002 BY THE NATIONAL ACADEMY PRESS WASHINGTON, D.C. JACOB WOLFOWITZ March 19, 1910–July 16, 1981 BY SHELEMYAHU ZACKS ACOB WOLFOWITZ, A GIANT among the founders of modern Jstatistics, will always be remembered for his originality, deep thinking, clear mind, excellence in teaching, and vast contributions to statistical and information sciences. I met Wolfowitz for the first time in 1957, when he spent a sab- batical year at the Technion, Israel Institute of Technology. I was at the time a graduate student and a statistician at the building research station of the Technion. I had read papers of Wald and Wolfowitz before, and for me the meeting with Wolfowitz was a great opportunity to associate with a great scholar who was very kind to me and most helpful. I took his class at the Technion on statistical decision theory. Outside the classroom we used to spend time together over a cup of coffee or in his office discussing statistical problems. He gave me a lot of his time, as though I was his student. His advice on the correct approach to the theory of statistics accompanied my development as statistician for many years to come. Later we kept in touch, mostly by correspondence and in meetings of the Institute of Mathematical Statistics. I saw him the last time in his office at the University of Southern Florida in Tampa, where he spent the last years 3 4 BIOGRAPHICAL MEMOIRS of his life. -
Information Geometry (Part 1)
October 22, 2010 Information Geometry (Part 1) John Baez Information geometry is the study of 'statistical manifolds', which are spaces where each point is a hypothesis about some state of affairs. In statistics a hypothesis amounts to a probability distribution, but we'll also be looking at the quantum version of a probability distribution, which is called a 'mixed state'. Every statistical manifold comes with a way of measuring distances and angles, called the Fisher information metric. In the first seven articles in this series, I'll try to figure out what this metric really means. The formula for it is simple enough, but when I first saw it, it seemed quite mysterious. A good place to start is this interesting paper: • Gavin E. Crooks, Measuring thermodynamic length. which was pointed out by John Furey in a discussion about entropy and uncertainty. The idea here should work for either classical or quantum statistical mechanics. The paper describes the classical version, so just for a change of pace let me describe the quantum version. First a lightning review of quantum statistical mechanics. Suppose you have a quantum system with some Hilbert space. When you know as much as possible about your system, then you describe it by a unit vector in this Hilbert space, and you say your system is in a pure state. Sometimes people just call a pure state a 'state'. But that can be confusing, because in statistical mechanics you also need more general 'mixed states' where you don't know as much as possible. A mixed state is described by a density matrix, meaning a positive operator with trace equal to 1: tr( ) = 1 The idea is that any observable is described by a self-adjoint operator A, and the expected value of this observable in the mixed state is A = tr( A) The entropy of a mixed state is defined by S( ) = −tr( ln ) where we take the logarithm of the density matrix just by taking the log of each of its eigenvalues, while keeping the same eigenvectors. -
Statistical Manifold, Exponential Family, Autoparallel Submanifold
Global Journal of Advanced Research on Classical and Modern Geometries ISSN: 2284-5569, Vol.8, (2019), Issue 1, pp.18-25 SUBMANIFOLDS OF EXPONENTIAL FAMILIES MAHESH T. V. AND K.S. SUBRAHAMANIAN MOOSATH ABSTRACT . Exponential family with 1 - connection plays an important role in information geom- ± etry. Amari proved that a submanifold M of an exponential family S is exponential if and only if M is a 1- autoparallel submanifold. We show that if all 1- auto parallel proper submanifolds of ∇ ∇ a 1 flat statistical manifold S are exponential then S is an exponential family. Also shown that ± − the submanifold of a parameterized model S which is an exponential family is a 1 - autoparallel ∇ submanifold. Keywords: statistical manifold, exponential family, autoparallel submanifold. 2010 MSC: 53A15 1. I NTRODUCTION Information geometry emerged from the geometric study of a statistical model of probability distributions. The information geometric tools are widely applied to various fields such as statis- tics, information theory, stochastic processes, neural networks, statistical physics, neuroscience etc.[3][7]. The importance of the differential geometric approach to the field of statistics was first noticed by C R Rao [6]. On a statistical model of probability distributions he introduced a Riemannian metric defined by the Fisher information known as the Fisher information metric. Another milestone in this area is the work of Amari [1][2][5]. He introduced the α - geometric structures on a statistical manifold consisting of Fisher information metric and the α - con- nections. Harsha and Moosath [4] introduced more generalized geometric structures± called the (F, G) geometry on a statistical manifold which is a generalization of α geometry.