An Elementary Introduction to Information Geometry
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
When Is a Connection a Metric Connection?
NEW ZEALAND JOURNAL OF MATHEMATICS Volume 38 (2008), 225–238 WHEN IS A CONNECTION A METRIC CONNECTION? Richard Atkins (Received April 2008) Abstract. There are two sides to this investigation: first, the determination of the integrability conditions that ensure the existence of a local parallel metric in the neighbourhood of a given point and second, the characterization of the topological obstruction to a globally defined parallel metric. The local problem has been previously solved by means of constructing a derived flag. Herein we continue the inquiry to the global aspect, which is formulated in terms of the cohomology of the constant sheaf of sections in the general linear group. 1. Introduction A connection on a manifold is a type of differentiation that acts on vector fields, differential forms and tensor products of these objects. Its importance lies in the fact that given a piecewise continuous curve connecting two points on the manifold, the connection defines a linear isomorphism between the respective tangent spaces at these points. Another fundamental concept in the study of differential geometry is that of a metric, which provides a measure of distance on the manifold. It is well known that a metric uniquely determines a Levi-Civita connection: a symmetric connection for which the metric is parallel. Not all connections are derived from a metric and so it is natural to ask, for a symmetric connection, if there exists a parallel metric, that is, whether the connection is a Levi-Civita connection. More generally, a connection on a manifold M, symmetric or not, is said to be metric if there exists a parallel metric defined on M. -
Lecture 24/25: Riemannian Metrics
Lecture 24/25: Riemannian metrics An inner product on Rn allows us to do the following: Given two curves intersecting at a point x Rn,onecandeterminetheanglebetweentheir tangents: ∈ cos θ = u, v / u v . | || | On a general manifold, we would again like to have an inner product on each TpM.Theniftwocurvesintersectatapointp,onecanmeasuretheangle between them. Of course, there are other geometric invariants that arise out of this structure. 1. Riemannian and Hermitian metrics Definition 24.2. A Riemannian metric on a vector bundle E is a section g Γ(Hom(E E,R)) ∈ ⊗ such that g is symmetric and positive definite. A Riemannian manifold is a manifold together with a choice of Riemannian metric on its tangent bundle. Chit-chat 24.3. A section of Hom(E E,R)isasmoothchoiceofalinear map ⊗ gp : Ep Ep R ⊗ → at every point p.Thatg is symmetric means that for every u, v E ,wehave ∈ p gp(u, v)=gp(v, u). That g is positive definite means that g (v, v) 0 p ≥ for all v E ,andequalityholdsifandonlyifv =0 E . ∈ p ∈ p Chit-chat 24.4. As usual, one can try to understand g in local coordinates. If one chooses a trivializing set of linearly independent sections si ,oneob- tains a matrix of functions { } gij = gp((si)p, (sj)p). By symmetry of g,thisisasymmetricmatrix. 85 n Example 24.5. T R is trivial. Let gij = δij be the constant matrix of functions, so that gij(p)=I is the identity matrix for every point. Then on n n every fiber, g defines the usual inner product on TpR ∼= R . -
Introduction to Information Geometry – Based on the Book “Methods of Information Geometry ”Written by Shun-Ichi Amari and Hiroshi Nagaoka
Introduction to Information Geometry – based on the book “Methods of Information Geometry ”written by Shun-Ichi Amari and Hiroshi Nagaoka Yunshu Liu 2012-02-17 Introduction to differential geometry Geometric structure of statistical models and statistical inference Outline 1 Introduction to differential geometry Manifold and Submanifold Tangent vector, Tangent space and Vector field Riemannian metric and Affine connection Flatness and autoparallel 2 Geometric structure of statistical models and statistical inference The Fisher metric and α-connection Exponential family Divergence and Geometric statistical inference Yunshu Liu (ASPITRG) Introduction to Information Geometry 2 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Part I Introduction to differential geometry Yunshu Liu (ASPITRG) Introduction to Information Geometry 3 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Basic concepts in differential geometry Basic concepts Manifold and Submanifold Tangent vector, Tangent space and Vector field Riemannian metric and Affine connection Flatness and autoparallel Yunshu Liu (ASPITRG) Introduction to Information Geometry 4 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Manifold Manifold S n Manifold: a set with a coordinate system, a one-to-one mapping from S to R , n supposed to be ”locally” looks like an open subset of R ” n Elements of the set(points): points in R , probability distribution, linear system. Figure : A coordinate system ξ for a manifold S Yunshu Liu (ASPITRG) Introduction to Information Geometry 5 / 79 Introduction to differential geometry Geometric structure of statistical models and statistical inference Manifold Manifold S Definition: Let S be a set, if there exists a set of coordinate systems A for S which satisfies the condition (1) and (2) below, we call S an n-dimensional C1 differentiable manifold. -
Arxiv:1907.11122V2 [Math-Ph] 23 Aug 2019 on M Which Are Dual with Respect to G
Canonical divergence for flat α-connections: Classical and Quantum Domenico Felice1, ∗ and Nihat Ay2, y 1Max Planck Institute for Mathematics in the Sciences Inselstrasse 22{04103 Leipzig, Germany 2 Max Planck Institute for Mathematics in the Sciences Inselstrasse 22{04103 Leipzig, Germany Santa Fe Institute, 1399 Hyde Park Rd, Santa Fe, NM 87501, USA Faculty of Mathematics and Computer Science, University of Leipzig, PF 100920, 04009 Leipzig, Germany A recent canonical divergence, which is introduced on a smooth manifold M endowed with a general dualistic structure (g; r; r∗), is considered for flat α-connections. In the classical setting, we compute such a canonical divergence on the manifold of positive measures and prove that it coincides with the classical α-divergence. In the quantum framework, the recent canonical divergence is evaluated for the quantum α-connections on the manifold of all positive definite Hermitian operators. Also in this case we obtain that the recent canonical divergence is the quantum α-divergence. PACS numbers: Classical differential geometry (02.40.Hw), Riemannian geometries (02.40.Ky), Quantum Information (03.67.-a). I. INTRODUCTION Methods of Information Geometry (IG) [1] are ubiquitous in physical sciences and encom- pass both classical and quantum systems [2]. The natural object of study in IG is a quadruple (M; g; r; r∗) given by a smooth manifold M, a Riemannian metric g and a pair of affine connections arXiv:1907.11122v2 [math-ph] 23 Aug 2019 on M which are dual with respect to g, ∗ X g (Y; Z) = g (rX Y; Z) + g (Y; rX Z) ; (1) for all sections X; Y; Z 2 T (M). -
Riemannian Geometry and Multilinear Tensors with Vector Fields on Manifolds Md
International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September-2014 157 ISSN 2229-5518 Riemannian Geometry and Multilinear Tensors with Vector Fields on Manifolds Md. Abdul Halim Sajal Saha Md Shafiqul Islam Abstract-In the paper some aspects of Riemannian manifolds, pseudo-Riemannian manifolds, Lorentz manifolds, Riemannian metrics, affine connections, parallel transport, curvature tensors, torsion tensors, killing vector fields, conformal killing vector fields are focused. The purpose of this paper is to develop the theory of manifolds equipped with Riemannian metric. I have developed some theorems on torsion and Riemannian curvature tensors using affine connection. A Theorem 1.20 named “Fundamental Theorem of Pseudo-Riemannian Geometry” has been established on Riemannian geometry using tensors with metric. The main tools used in the theorem of pseudo Riemannian are tensors fields defined on a Riemannian manifold. Keywords: Riemannian manifolds, pseudo-Riemannian manifolds, Lorentz manifolds, Riemannian metrics, affine connections, parallel transport, curvature tensors, torsion tensors, killing vector fields, conformal killing vector fields. —————————— —————————— I. Introduction (c) { } is a family of open sets which covers , that is, 푖 = . Riemannian manifold is a pair ( , g) consisting of smooth 푈 푀 manifold and Riemannian metric g. A manifold may carry a (d) ⋃ is푈 푖푖 a homeomorphism푀 from onto an open subset of 푀 ′ further structure if it is endowed with a metric tensor, which is a 푖 . 푖 푖 휑 푈 푈 natural generation푀 of the inner product between two vectors in 푛 ℝ to an arbitrary manifold. Riemannian metrics, affine (e) Given and such that , the map = connections,푛 parallel transport, curvature tensors, torsion tensors, ( ( ) killingℝ vector fields and conformal killing vector fields play from푖 푗 ) to 푖 푗 is infinitely푖푗 −1 푈 푈 푈 ∩ 푈 ≠ ∅ 휓 important role to develop the theorem of Riemannian manifolds. -
Information Geometry of Orthogonal Initializations and Training
Published as a conference paper at ICLR 2020 INFORMATION GEOMETRY OF ORTHOGONAL INITIALIZATIONS AND TRAINING Piotr Aleksander Sokół and Il Memming Park Department of Neurobiology and Behavior Departments of Applied Mathematics and Statistics, and Electrical and Computer Engineering Institutes for Advanced Computing Science and AI-driven Discovery and Innovation Stony Brook University, Stony Brook, NY 11733 {memming.park, piotr.sokol}@stonybrook.edu ABSTRACT Recently mean field theory has been successfully used to analyze properties of wide, random neural networks. It gave rise to a prescriptive theory for initializing feed-forward neural networks with orthogonal weights, which ensures that both the forward propagated activations and the backpropagated gradients are near `2 isome- tries and as a consequence training is orders of magnitude faster. Despite strong empirical performance, the mechanisms by which critical initializations confer an advantage in the optimization of deep neural networks are poorly understood. Here we show a novel connection between the maximum curvature of the optimization landscape (gradient smoothness) as measured by the Fisher information matrix (FIM) and the spectral radius of the input-output Jacobian, which partially explains why more isometric networks can train much faster. Furthermore, given that or- thogonal weights are necessary to ensure that gradient norms are approximately preserved at initialization, we experimentally investigate the benefits of maintaining orthogonality throughout training, and we conclude that manifold optimization of weights performs well regardless of the smoothness of the gradients. Moreover, we observe a surprising yet robust behavior of highly isometric initializations — even though such networks have a lower FIM condition number at initialization, and therefore by analogy to convex functions should be easier to optimize, experimen- tally they prove to be much harder to train with stochastic gradient descent. -
Information Geometry and Optimal Transport
Call for Papers Special Issue Affine Differential Geometry and Hesse Geometry: A Tribute and Memorial to Jean-Louis Koszul Submission Deadline: 30th November 2019 Jean-Louis Koszul (January 3, 1921 – January 12, 2018) was a French mathematician with prominent influence to a wide range of mathematical fields. He was a second generation member of Bourbaki, with several notions in geometry and algebra named after him. He made a great contribution to the fundamental theory of Differential Geometry, which is foundation of Information Geometry. The special issue is dedicated to Koszul for the mathematics he developed that bear on information sciences. Both original contributions and review articles are solicited. Topics include but are not limited to: Affine differential geometry over statistical manifolds Hessian and Kahler geometry Divergence geometry Convex geometry and analysis Differential geometry over homogeneous and symmetric spaces Jordan algebras and graded Lie algebras Pre-Lie algebras and their cohomology Geometric mechanics and Thermodynamics over homogeneous spaces Guest Editor Hideyuki Ishi (Graduate School of Mathematics, Nagoya University) e-mail: [email protected] Submit at www.editorialmanager.com/inge Please select 'S.I.: Affine differential geometry and Hessian geometry Jean-Louis Koszul (Memorial/Koszul)' Photo Author: Konrad Jacobs. Source: Archives of the Mathema�sches Forschungsins�tut Oberwolfach Editor-in-Chief Shinto Eguchi (Tokyo) Special Issue Co-Editors Information Geometry and Optimal Transport Nihat -
Hodge Theory of SKT Manifolds
Hodge theory and deformations of SKT manifolds Gil R. Cavalcanti∗ Department of Mathematics Utrecht University Abstract We use tools from generalized complex geometry to develop the theory of SKT (a.k.a. pluriclosed Hermitian) manifolds and more generally manifolds with special holonomy with respect to a metric connection with closed skew-symmetric torsion. We develop Hodge theory on such manifolds show- ing how the reduction of the holonomy group causes a decomposition of the twisted cohomology. For SKT manifolds this decomposition is accompanied by an identity between different Laplacian operators and forces the collapse of a spectral sequence at the first page. Further we study the de- formation theory of SKT structures, identifying the space where the obstructions live. We illustrate our theory with examples based on Calabi{Eckmann manifolds, instantons, Hopf surfaces and Lie groups. Contents 1 Linear algebra 3 2 Intrinsic torsion of generalized Hermitian structures8 2.1 The Nijenhuis tensor....................................9 2.2 The intrinsic torsion and the road to integrability.................... 10 2.3 The operators δ± and δ± ................................. 12 3 Parallel Hermitian and bi-Hermitian structures 12 4 SKT structures 14 5 Hodge theory 19 5.1 Differential operators, their adjoints and Laplacians.................. 19 5.2 Signature and Euler characteristic of almost Hermitian manifolds........... 20 5.3 Hodge theory on parallel Hermitian manifolds...................... 22 5.4 Hodge theory on SKT manifolds............................. 23 5.5 Relation to Dolbeault cohomology............................ 23 5.6 Hermitian symplectic structures.............................. 27 6 Hodge theory beyond U(n) 28 6.1 Integrability......................................... 29 Keywords. Strong KT structure, generalized complex geometry, generalized K¨ahlergeometry, Hodge theory, instan- tons, deformations. -
Information Geometry (Part 1)
October 22, 2010 Information Geometry (Part 1) John Baez Information geometry is the study of 'statistical manifolds', which are spaces where each point is a hypothesis about some state of affairs. In statistics a hypothesis amounts to a probability distribution, but we'll also be looking at the quantum version of a probability distribution, which is called a 'mixed state'. Every statistical manifold comes with a way of measuring distances and angles, called the Fisher information metric. In the first seven articles in this series, I'll try to figure out what this metric really means. The formula for it is simple enough, but when I first saw it, it seemed quite mysterious. A good place to start is this interesting paper: • Gavin E. Crooks, Measuring thermodynamic length. which was pointed out by John Furey in a discussion about entropy and uncertainty. The idea here should work for either classical or quantum statistical mechanics. The paper describes the classical version, so just for a change of pace let me describe the quantum version. First a lightning review of quantum statistical mechanics. Suppose you have a quantum system with some Hilbert space. When you know as much as possible about your system, then you describe it by a unit vector in this Hilbert space, and you say your system is in a pure state. Sometimes people just call a pure state a 'state'. But that can be confusing, because in statistical mechanics you also need more general 'mixed states' where you don't know as much as possible. A mixed state is described by a density matrix, meaning a positive operator with trace equal to 1: tr( ) = 1 The idea is that any observable is described by a self-adjoint operator A, and the expected value of this observable in the mixed state is A = tr( A) The entropy of a mixed state is defined by S( ) = −tr( ln ) where we take the logarithm of the density matrix just by taking the log of each of its eigenvalues, while keeping the same eigenvectors. -
Statistical Manifold, Exponential Family, Autoparallel Submanifold
Global Journal of Advanced Research on Classical and Modern Geometries ISSN: 2284-5569, Vol.8, (2019), Issue 1, pp.18-25 SUBMANIFOLDS OF EXPONENTIAL FAMILIES MAHESH T. V. AND K.S. SUBRAHAMANIAN MOOSATH ABSTRACT . Exponential family with 1 - connection plays an important role in information geom- ± etry. Amari proved that a submanifold M of an exponential family S is exponential if and only if M is a 1- autoparallel submanifold. We show that if all 1- auto parallel proper submanifolds of ∇ ∇ a 1 flat statistical manifold S are exponential then S is an exponential family. Also shown that ± − the submanifold of a parameterized model S which is an exponential family is a 1 - autoparallel ∇ submanifold. Keywords: statistical manifold, exponential family, autoparallel submanifold. 2010 MSC: 53A15 1. I NTRODUCTION Information geometry emerged from the geometric study of a statistical model of probability distributions. The information geometric tools are widely applied to various fields such as statis- tics, information theory, stochastic processes, neural networks, statistical physics, neuroscience etc.[3][7]. The importance of the differential geometric approach to the field of statistics was first noticed by C R Rao [6]. On a statistical model of probability distributions he introduced a Riemannian metric defined by the Fisher information known as the Fisher information metric. Another milestone in this area is the work of Amari [1][2][5]. He introduced the α - geometric structures on a statistical manifold consisting of Fisher information metric and the α - con- nections. Harsha and Moosath [4] introduced more generalized geometric structures± called the (F, G) geometry on a statistical manifold which is a generalization of α geometry. -
Linear Connections and Curvature Tensors in the Geometry Of
LINEAR CONNECTIONS AND CURVATURE TENSORS IN THE GEOMETRY OF PARALLELIZABLE MANIFOLDS Nabil L. Youssef and Amr M. Sid-Ahmed Department of Mathematics, Faculty of Science, Cairo University, Giza, Egypt. [email protected] and [email protected] Dedicated to the memory of Prof. Dr. A. TAMIM Abstract. In this paper we discuss linear connections and curvature tensors in the context of geometry of parallelizable manifolds (or absolute parallelism geometry). Different curvature tensors are expressed in a compact form in terms of the torsion tensor of the canonical connection. Using the Bianchi identities, some other identities are derived from the expressions obtained. These identities, in turn, are used to reveal some of the properties satisfied by an intriguing fourth order tensor which we refer to as Wanas tensor. A further condition on the canonical connection is imposed, assuming it is semi- symmetric. The formulae thus obtained, together with other formulae (Ricci tensors and scalar curvatures of the different connections admitted by the space) are cal- arXiv:gr-qc/0604111v2 8 Feb 2007 culated under this additional assumption. Considering a specific form of the semi- symmetric connection causes all nonvanishing curvature tensors to coincide, up to a constant, with the Wanas tensor. Physical aspects of some of the geometric objects considered are pointed out. 1 Keywords: Parallelizable manifold, Absolute parallelism geometry, dual connection, semi-symmetric connection, Wanas tensor, Field equations. AMS Subject Classification. 51P05, 83C05, 53B21. 1This paper was presented in “The international Conference on Developing and Extending Ein- stein’s Ideas ”held at Cairo, Egypt, December 19-21, 2005. 1 1. -
Exponential Families: Dually-Flat, Hessian and Legendre Structures
entropy Review Information Geometry of k-Exponential Families: Dually-Flat, Hessian and Legendre Structures Antonio M. Scarfone 1,* ID , Hiroshi Matsuzoe 2 ID and Tatsuaki Wada 3 ID 1 Istituto dei Sistemi Complessi, Consiglio Nazionale delle Ricerche (ISC-CNR), c/o Politecnico di Torino, 10129 Torino, Italy 2 Department of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya 466-8555, Japan; [email protected] 3 Region of Electrical and Electronic Systems Engineering, Ibaraki University, Nakanarusawa-cho, Hitachi 316-8511, Japan; [email protected] * Correspondence: [email protected]; Tel.: +39-011-090-7339 Received: 9 May 2018; Accepted: 1 June 2018; Published: 5 June 2018 Abstract: In this paper, we present a review of recent developments on the k-deformed statistical mechanics in the framework of the information geometry. Three different geometric structures are introduced in the k-formalism which are obtained starting from three, not equivalent, divergence functions, corresponding to the k-deformed version of Kullback–Leibler, “Kerridge” and Brègman divergences. The first statistical manifold derived from the k-Kullback–Leibler divergence form an invariant geometry with a positive curvature that vanishes in the k → 0 limit. The other two statistical manifolds are related to each other by means of a scaling transform and are both dually-flat. They have a dualistic Hessian structure endowed by a deformed Fisher metric and an affine connection that are consistent with a statistical scalar product based on the k-escort expectation. These flat geometries admit dual potentials corresponding to the thermodynamic Massieu and entropy functions that induce a Legendre structure of k-thermodynamics in the picture of the information geometry.