Lec 6: Time-Dependent Linear Iterations


Sonia Martínez
May 9, 2018

MAE247, Cooperative Control of Multi-Agent Systems. Permission is granted to copy, distribute and modify this file, provided original sources are acknowledged.

Abstract

Iterations correspond to algorithm executions. Linear iterations are amenable to methods from linear algebra. We saw this when analyzing the flocking example with fixed topology. However, that analysis no longer applies if the topology changes from one timestep to the next. The inherent richness of cooperative systems makes linear algebra insufficient; we need to invoke notions and tools from other areas, such as graph theory and stability analysis. This is what we do in this set of lecture notes. The treatment corresponds to selected parts of Chapter 1 in [1]. In addition, the convergence factor of a time-invariant linear iteration is introduced (see [1] and [2]), and we point out how it depends on several properties of the underlying graph.

Contents

1 Time-dependent averaging algorithms
  1.1 Stability
  1.2 Convergence for general sequences of stochastic matrices
  1.3 Convergence for sequences of stochastic symmetric matrices
  1.4 Final agreement value
2 Convergence speed of linear iterations
A Toeplitz and tridiagonal circulant matrices

1 Time-dependent averaging algorithms

Linear distributed algorithms on synchronous networks are discrete-time linear dynamical systems whose evolution map is linear and has a sparsity structure related to the network. These algorithms represent an important class of iterative algorithms that find applications in optimization, in the solution of systems of equations, and in distributed decision making; see, for instance, [3]. In this section, we present some relevant results on distributed linear algorithms.

Example 1.1 (Flocking) In our previous lecture notes we fully characterized the stability and convergence of the flocking algorithm,

\[
\theta(\ell+1) = F\,\theta(\ell), \qquad F = (I_n + D)^{-1}(I_n + A),
\]

when the interaction topology is fixed. However, if the topology changes as the agents move (something that seems reasonable if it is determined by proximity among agents, no?), then our previous analysis does not quite apply. In this case, we would have something like

\[
\theta(\ell+1) = F(\ell)\,\theta(\ell), \qquad F(\ell) = (I_n + D(\ell))^{-1}(I_n + A(\ell)).
\]

For instance, before it was good enough to determine whether $F$ was semi-convergent. Now, if the topology is changing, even if each individual $F(\ell)$ is semi-convergent, how do we know that arbitrary products of them will be? Assuming they are, and the agents eventually flock, can we determine the asymptotic common heading? •

We study linear combination algorithms over time-dependent weighted directed graphs; we restrict our analysis to nonnegative weights. The averaging algorithm associated to a sequence of stochastic matrices $\{F(\ell) \mid \ell \in \mathbb{Z}_{\geq 0}\} \subset \mathbb{R}^{n \times n}$ is the discrete-time dynamical system

\[
w(\ell+1) = F(\ell)\, w(\ell), \qquad \ell \in \mathbb{Z}_{\geq 0}. \tag{1}
\]

In the literature, such algorithms are often referred to as agreement algorithms or consensus algorithms. As we have seen in previous lectures, there are useful ways to compute a stochastic matrix, and therefore a time-independent averaging algorithm, from a weighted digraph.
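Before formalizing these constructions, here is a minimal numerical sketch (not part of the original notes) of the time-dependent iteration (1) with the flocking matrices of Example 1.1; the helper names `flocking_matrix` and `run_averaging` and the alternating cycle topologies are illustrative assumptions.

```python
import numpy as np

def flocking_matrix(A):
    """F = (I_n + D)^{-1} (I_n + A) for an adjacency matrix A with
    nonnegative weights; D is the diagonal out-degree matrix."""
    n = A.shape[0]
    d = A.sum(axis=1)                              # out-degrees
    return (np.eye(n) + A) / (1.0 + d)[:, None]    # row-stochastic by construction

def run_averaging(adjacency_seq, w0):
    """Iterate w(l+1) = F(l) w(l) along a sequence of adjacency matrices."""
    w = np.asarray(w0, dtype=float)
    for A in adjacency_seq:
        w = flocking_matrix(A) @ w
    return w

# Two alternating topologies on four agents: a directed cycle and its reverse.
A1 = np.array([[0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1],
               [1, 0, 0, 0]], dtype=float)
A2 = A1.T
print(run_averaging([A1, A2] * 50, [1.0, 2.0, 3.0, 4.0]))  # components cluster
```

Each $F(\ell)$ here is semi-convergent on its own, but, as the example above asks, it is the behavior of the arbitrary products that the results below address.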
Definition 1.2 (Adjacency- and Laplacian-based averaging) Let $G$ be a weighted digraph with node set $\{1, \dots, n\}$, weighted adjacency matrix $A$, weighted out-degree matrix $D_{\text{out}}$, and weighted Laplacian $L$. Then

(i) the adjacency-based averaging algorithm is defined by the stochastic matrix $(I_n + D_{\text{out}})^{-1}(I_n + A)$ and reads in components

\[
w_i(\ell+1) = \frac{1}{1 + d_{\text{out}}(i)} \Big( w_i(\ell) + \sum_{j=1}^{n} a_{ij}\, w_j(\ell) \Big); \tag{2}
\]

(ii) given a positive scalar $\varepsilon$ upper bounded by $\min\{1/d_{\text{out}}(i) \mid i \in \{1, \dots, n\}\}$, the Laplacian-based averaging algorithm is defined by the stochastic matrix $I_n - \varepsilon L(G)$ and reads in components

\[
w_i(\ell+1) = \Big(1 - \varepsilon \sum_{j=1,\, j \neq i}^{n} a_{ij}\Big)\, w_i(\ell) + \varepsilon \sum_{j=1,\, j \neq i}^{n} a_{ij}\, w_j(\ell). \tag{3}
\]

These notions are immediately extended to sequences of stochastic matrices arising from sequences of weighted digraphs. •

We recognize both types of averaging algorithms from previous lectures: adjacency-based averaging corresponds to the flocking example, and Laplacian-based averaging corresponds to the discretization of the agreement example.

Adjacency-based averaging algorithms arising from unweighted undirected graphs without self-loops are also known as the equal-neighbor averaging rule or Vicsek's model [4]. Specifically, if $G$ is an unweighted graph with vertices $\{1, \dots, n\}$ and without self-loops, then the equal-neighbor averaging rule is

\[
w_i(\ell+1) = \operatorname{avrg}\big(\{w_i(\ell)\} \cup \{w_j(\ell) \mid j \in \mathcal{N}_G(i)\}\big), \tag{4}
\]

where we adopt the shorthand $\operatorname{avrg}(\{x_1, \dots, x_k\}) = (x_1 + \cdots + x_k)/k$.
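As a quick illustration of Definition 1.2 (a sketch of mine, not code from the notes), both stochastic matrices can be assembled directly from a weighted adjacency matrix; the function names and the sample matrix are assumptions, and `eps` must respect the bound $\min\{1/d_{\text{out}}(i)\}$.

```python
import numpy as np

def adjacency_based(A):
    """Stochastic matrix (I_n + D_out)^{-1} (I_n + A) of Definition 1.2(i)."""
    n = A.shape[0]
    d_out = A.sum(axis=1)
    return (np.eye(n) + A) / (1.0 + d_out)[:, None]

def laplacian_based(A, eps):
    """Stochastic matrix I_n - eps * L(G), with L = D_out - A, of Definition 1.2(ii)."""
    d_out = A.sum(axis=1)
    if np.any(eps * d_out > 1.0):            # enforces eps <= min_i 1/d_out(i)
        raise ValueError("eps exceeds min{1/d_out(i)}")
    return np.eye(A.shape[0]) - eps * (np.diag(d_out) - A)

# A small weighted digraph on three nodes (illustrative values).
A = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 3.0, 0.0]])
for F in (adjacency_based(A), laplacian_based(A, 0.25)):
    assert np.allclose(F.sum(axis=1), 1.0) and (F >= 0).all()  # row-stochastic
```

Both constructions produce nonnegative matrices with unit row sums, consistent with the component forms (2) and (3).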
Remark 1.3 (Sensing versus communication interpretation of directed edges) In the definition of averaging algorithms arising from digraphs, the digraph edges play the role of "sensing edges," not that of "communication edges." In other words, a nonzero entry $a_{ij}$, corresponding to the digraph edge $(i, j)$, implies that the $i$-th component of the state is updated with the $j$-th component of the state. It is as if node $i$ could sense the state of node $j$, rather than node $i$ transmitting its own state to node $j$. •

1.1 Stability

Here we present the main stability result for averaging algorithms associated to a sequence of stochastic matrices. We start by discussing equilibrium points and their stability. Recall that $\mathbf{1}_n$ is an eigenvector of any stochastic matrix with eigenvalue $1$, and that the diagonal set $\operatorname{diag}(\mathbb{R}^n)$ is the vector subspace generated by $\mathbf{1}_n$. Therefore, any point in $\operatorname{diag}(\mathbb{R}^n)$ is an equilibrium for any averaging algorithm. We refer to the points of $\operatorname{diag}(\mathbb{R}^n)$ as agreement configurations, since all the components of an element in $\operatorname{diag}(\mathbb{R}^n)$ are equal to the same value. We will informally say that an algorithm achieves agreement if it steers the network state toward the set of agreement configurations.

Lemma 1.4 (Stability of agreement configurations) Any averaging algorithm (1) in $\mathbb{R}^n$ is uniformly stable and uniformly bounded with respect to $\operatorname{diag}(\mathbb{R}^n)$.

1.2 Convergence for general sequences of stochastic matrices

Regarding convergence results, we need to introduce a useful property of collections of stochastic matrices. Given $\alpha \in\, ]0, 1]$, the set of non-degenerate matrices with respect to $\alpha$ consists of all stochastic matrices $F$ with entries $f_{ij}$, for $i, j \in \{1, \dots, n\}$, satisfying

\[
f_{ii} \in [\alpha, 1], \quad \text{and} \quad f_{ij} \in \{0\} \cup [\alpha, 1] \ \text{ for } j \neq i.
\]

Additionally, the sequence of stochastic matrices $\{F(\ell) \mid \ell \in \mathbb{Z}_{\geq 0}\}$ is non-degenerate if there exists $\alpha \in\, ]0, 1]$ such that $F(\ell)$ is non-degenerate with respect to $\alpha$ for all $\ell \in \mathbb{Z}_{\geq 0}$. We now state the main convergence result.

Theorem 1.5 (Convergence for time-dependent stochastic matrices) Let $\{F(\ell) \mid \ell \in \mathbb{Z}_{\geq 0}\} \subset \mathbb{R}^{n \times n}$ be a non-degenerate sequence of stochastic matrices. For $\ell \in \mathbb{Z}_{\geq 0}$, let $G(\ell)$ be the unweighted digraph associated to $F(\ell)$. The following statements are equivalent:

(i) the set $\operatorname{diag}(\mathbb{R}^n)$ is uniformly globally asymptotically stable for the averaging algorithm associated to $\{F(\ell) \mid \ell \in \mathbb{Z}_{\geq 0}\}$; and

(ii) there exists a duration $\delta \in \mathbb{N}$ such that, for all $\ell \in \mathbb{Z}_{\geq 0}$, the digraph

\[
G(\ell+1) \cup \dots \cup G(\ell+\delta)
\]

contains a globally reachable vertex.

We collect a few observations about this result.

Remarks 1.6 (Discussion of Theorem 1.5)

(i) The statement in Theorem 1.5(i) means that each solution to the time-dependent linear dynamical system (1) converges uniformly and asymptotically to the vector subspace generated by $\mathbf{1}_n$.

(ii) The necessary and sufficient condition in Theorem 1.5(ii) amounts to the existence of a uniformly bounded time duration $\delta$ with the property that a weak connectivity assumption holds over each collection of $\delta$ consecutive digraphs. We refer to Example 1.7 below for a counterexample showing that, if the duration in Theorem 1.5 is not uniformly bounded, then there exist algorithms that do not converge.

(iii) Uniform convergence is a property of all solutions to system (1) starting at any arbitrary time, and not only at time equal to zero. If we restrict our attention to solutions that start only at time zero, then Theorem 1.5 should be …
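The hypotheses of Theorem 1.5 lend themselves to numerical checks. The sketch below is my own (the names `is_nondegenerate` and `union_has_globally_reachable_vertex` are illustrative): it tests non-degeneracy with respect to a given $\alpha$, and tests whether the union of a window of $\delta$ digraphs contains a globally reachable vertex via a Boolean transitive closure.

```python
import numpy as np

def is_nondegenerate(F, alpha):
    """Non-degeneracy w.r.t. alpha: f_ii in [alpha, 1], and every
    off-diagonal entry lies in {0} union [alpha, 1]."""
    off = F[~np.eye(F.shape[0], dtype=bool)]
    return bool(np.all(np.diag(F) >= alpha)
                and np.all((off == 0) | (off >= alpha)))

def union_has_globally_reachable_vertex(adjacencies):
    """Does G(l+1) union ... union G(l+delta) contain a vertex reachable
    from every other vertex? `adjacencies` is a list of 0/1 matrices."""
    U = np.logical_or.reduce([A.astype(bool) for A in adjacencies])
    n = U.shape[0]
    R = U | np.eye(n, dtype=bool)                 # reachability in <= 1 step
    for _ in range(int(np.ceil(np.log2(n))) + 1):
        R = R | (R.astype(int) @ R.astype(int) > 0)   # path doubling
    return bool(R.all(axis=0).any())              # a column of all True

# A directed 3-cycle and its reverse: their union is strongly connected,
# so every vertex is globally reachable.
A1 = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
print(union_has_globally_reachable_vertex([A1, A1.T]))  # True
```

The closure uses path doubling, so on the order of $\log n$ Boolean matrix products suffice to decide reachability over the union digraph.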