<<

MATAR, MONA, Ph.D., August, 2019 APPLIED MATHEMATICS

NODE AND EDGE IMPORTANCE IN NETWORKS

VIA THE MATRIX EXPONENTIAL (131 pp.)

Directors of Dissertation: Lothar Reichel, Omar De la Cruz Cabrera

The matrix exponential has been identified as a useful tool for the analysis of undirected networks, with sound theoretical justifications for its ability to model important aspects of a given network. Its use for directed networks, however, is less developed and has been less successful so far. In this dissertation we discuss some methods to identify important nodes in a directed network using the matrix exponential, taking into account that the notion of importance differs depending on whether we consider the influence of a given node along the edge directions (downstream influence) or how it is influenced by directed paths that point to it (upstream influence). In addition, we introduce a family of importance measures based on counting walks that are allowed to reverse their direction a limited number of times, thus capturing relationships arising from influencing the same nodes, or being influenced by the same nodes, without sacrificing information about edge direction. These measures provide information about branch points.

This dissertation is also concerned with the identification of important edges in a network, in both their roles as transmitters and receivers of information. We propose a method based on computing the matrix exponential of a matrix associated with a line graph of the given network. Both undirected and directed networks are considered.

Edges may be given positive weights. Computed examples illustrate the performance of the proposed method.

In addition to the identification of important nodes and edges in unweighted and edge-weighted networks, we study the importance of nodes in node-weighted graphs.

To the best of our knowledge, adjacency matrices for node-weighted graphs have not received much attention. This dissertation describes how the line graph associated with a node-weighted graph can be used to construct an edge-weighted graph that can be analyzed with available methods. Both undirected and directed graphs with positive node weights are considered. We show that when the weight of a node increases, the importance of this node in the graph increases as well. Some applications to real-life problems are shown.

NODE AND EDGE IMPORTANCE IN NETWORKS

VIA THE MATRIX EXPONENTIAL

A dissertation submitted to

Kent State University in partial

fulfillment of the requirements for the

degree of Doctor of Philosophy

by

Mona Matar

August, 2019

Dissertation written by

Mona Matar

B.S., The Lebanese University, 2004

M.S., The University of Akron, 2014

Ph.D., Kent State University, 2019

Approved by

Lothar Reichel, Chairs, Doctoral Dissertation Committee

Omar De la Cruz Cabrera,

Jing Li, Members, Doctoral Dissertation Committee

Jun Li,

Austin Melton,

Hassan Peyravi,

Accepted by

Andrew Tonge, Chair, Department of Mathematical Sciences

James L. Blank, Dean, College of Arts and Sciences

TABLE OF CONTENTS

TABLE OF CONTENTS ...... v

ACKNOWLEDGEMENTS ...... viii

1 Introduction ...... 1

2 Basic Definitions and Properties ...... 7

2.1 Graphs ...... 7

2.2 Adjacency Matrix ...... 8

2.3 Incidence Matrix of Undirected Graphs ...... 10

2.3.1 Incidence to Adjacency Matrix for Undirected Graphs . . . . . 10

2.3.2 Adjacency to Incidence Matrix for Undirected Graphs . . . . . 11

2.4 Incidence Matrix for Directed Graphs ...... 12

2.4.1 Incidence Matrix for Directed Graphs Using Entries ±1 . . . . 12

2.4.2 Incidence and Exsurgence Matrices for Directed Graphs . . . 15

2.5 Weights ...... 15

3 Analysis of Directed Networks Via the Matrix Exponential ...... 17

3.1 The Matrix Exponential and Other Matrix Functions ...... 17

3.2 Node Importance and Communicability Using the Matrix Exponential 18

3.2.1 Existing Methods ...... 18

3.2.2 Aggregate Upstream and Downstream Reachability ...... 22

3.3 Reverting Walks with a Bounded Number of Reversions ...... 24

3.4 Examples ...... 27

3.4.1 Small Examples ...... 27

3.4.2 Real-Life Large Examples ...... 35

3.4.3 Bus Route Network Targeting Specific Nodes ...... 38

3.5 Numerical Considerations ...... 40

4 Edge Importance in a Network Via Line Graphs and the Matrix Exponential 47

4.1 Incidence and Exsurgence Matrices ...... 47

4.2 Line Graphs ...... 48

4.2.1 Line Graphs of an Undirected Graph ...... 48

4.2.2 Line Graphs of a Directed Graph ...... 48

4.3 Edge Weights ...... 50

4.3.1 Example ...... 53

4.4 Computing the Most Important Edges in an Undirected Network by the Matrix Exponential ...... 53

4.4.1 Review of the Adjacency Matrix Exponential ...... 53

4.4.2 Exponential of the Line Graph Adjacency Matrix for Undirected Graphs ...... 54

4.4.3 A Comparison of Downdating Methods for Undirected Graphs 56

4.4.4 A Comparison of Downdating Methods for Directed Graphs . 62

4.5 Computing the Most Important Edges in a Directed Unweighted Network Using the Matrix Exponential ...... 68

4.5.1 The Exponential of the Extended Line Graph Adjacency Matrix E+ ...... 68

4.5.2 The Exponential of the Line Graph Adjacency Matrix E→ .. 70

4.6 Computing the Most Important Edges in a Directed Weighted Network Using the Matrix Exponential ...... 75

4.6.1 Example ...... 75

4.6.2 Flight Example II ...... 76

4.7 Computational Aspects ...... 77

5 Node Importance in Node-Weighted Networks via Line Graphs and the Matrix Exponential ...... 79

5.1 The Sensitivity of Node Importance to Weight Change ...... 79

5.1.1 Preliminaries ...... 80

5.1.2 Matrix Perturbation Results ...... 82

5.1.3 Example on Sensitivity to Weight Change ...... 88

5.2 Node-Weighted to Edge-Weighted ...... 89

5.2.1 Edge Weights from Endpoint Node Weights ...... 91

5.2.2 The Case when h is Factorizable ...... 93

5.2.3 Node Weights to Line Graph Edge Weights ...... 95

5.3 Computing Node Importance in Node-Weighted Networks ...... 99

5.4 Real-Life Examples ...... 101

5.4.1 Example of Genotype Mutation ...... 101

5.4.2 Example of Social Networks: Medium and Twitter ...... 105

5.5 Computational Aspects ...... 107

5.5.1 The Arnoldi Process ...... 109

5.5.2 The Nonsymmetric Lanczos Process ...... 110

5.5.3 Approximations for the Medium-Twitter Example ...... 111

6 Conclusions ...... 114

ACKNOWLEDGEMENTS

The following work would not have been possible without the guidance of my advisors Dr. Lothar Reichel and Dr. Omar De la Cruz Cabrera, and the support of my parents and my husband. To my daughters I say, you inspire me every day to work hard, and be the best version of myself.

Thank you!

CHAPTER 1

Introduction

Often a complex system can be modeled as a network: a set of nodes, any two of which might be connected in some fashion by edges. The nature of the nodes and the connections may vary widely from application to application. Like all models, network models leave out many details of reality; however, they are able to capture a substantial part of the complexity of a system in a way that is amenable to mathematical and computational analysis. Mathematically, we represent a network by a graph, which may be directed or undirected [25, 27, 36, 50].

In spite of their simplicity, which makes mathematical analysis tractable, these concepts can capture much of the complexity of the behavior of the system. Some examples are:

• Social networks: Nodes are individuals (human or animal), and edges represent social relationships (e.g., acquaintance, friendship, allegiance).

• Online social networks: These are internet-based services in which registered users can establish formalized relationships that modulate sharing of information (the prototype is Facebook). Nodes are users, and edges represent connections like “friendship” (undirected) or “following” (directed).

• Road networks: Each intersection or endpoint is a node, and each road section connecting one node to another one is an edge.

• In molecular biology, genes and/or proteins can be regarded as nodes, connected by relationships like regulation (directed) or interaction (undirected) [57].

Network analysis can be carried out on at least three levels: individual vertices and edges, subgraphs, and global properties of the whole network [17, 27, 50]. In this work, we are interested in determining the relative importance of nodes, as well as communicability between pairs of nodes. The importance of a node not only depends on how many edges originate from or end at the node, but also on the importance of the neighboring nodes. For instance, consider a graph in which the nodes represent papers and the edges represent citations. An important paper conveys importance to papers that it cites and, to a lesser extent, papers that cite an important paper also may be important. Network analysis can help determine which nodes (papers) contribute the most in broadcasting or receiving of information through the network.

Various measures to quantify the importance of a node in a network have been proposed in the literature; see, e.g., [11, 23, 30, 34, 43, 44]. In the undirected case, quantities that try to capture the intuitive notion of importance have come to be known as notions of centrality [13, 17, 27, 31], based on the idea that important nodes should be reachable from many nodes in fairly few steps. In directed networks, a node can be important in two ways: a node can have high downstream influence (can reach many nodes in fairly few steps along the direction of the edges) or high upstream influence (is reached by many nodes in fairly few steps along the direction of the edges); a node with high centrality should have high influence both upstream and downstream. Each one of these concepts can be of independent interest, depending on the application.

Some approaches (see, e.g., [5, 10, 11, 12, 13, 30, 34]) involve the use of matrix functions, like the matrix exponential of the adjacency matrix; see Section 3.1 for further details. These methods have proved to be useful when applied to undirected networks.

For directed networks, the record is less clear. Some authors claim that the application of matrix exponential methods leads to counter-intuitive results in simple examples ([33]; see Section 3.2.1). The hubs and authorities approach [44] was proposed as a way to avoid some of those perceived shortcomings, while acknowledging that importance in a directed network should depend on whether it is considered upstream or downstream. A variation of this approach has recently been described in [11]. Katz [43] proposed that the resolvent of the adjacency matrix A or its transpose A^T times the vector 1 = [1, 1, ..., 1]^T be used as a centrality measure. Specifically, Katz considered the entries of the vectors

(1.0.1)    (I − µA)^{−1} 1,    (I − µA^T)^{−1} 1,

as centrality measures. Here, the scalar µ > 0 is chosen sufficiently small so that the power series expansions of the above expressions converge; see below for further details. More recently, Benzi et al. [11, 12, 13] considered analogues of the expressions (1.0.1) with the resolvents replaced by the exponential functions of A, A^T, and of the matrix (3.2.2). However, for directed graphs very little computational analysis has been reported in the literature that sheds light on the performance of the expressions e^A 1 and e^{A^T} 1 as measures of a node’s importance and the ease of traveling to or from the node. It is the purpose of this dissertation to elucidate these issues, as well as to introduce matrix functions that allow a finite number of reversals of paths. These matrix functions are helpful in identifying branch points.
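To make the two directions concrete, the following sketch (in Python with NumPy/SciPy here, although the computations in this dissertation are carried out in MATLAB) evaluates the Katz vectors (1.0.1) and their exponential analogues e^A 1 and e^{A^T} 1 for the small digraph of Figure 2.1; the choice µ = 0.5/ρ(A) is an illustrative admissible value, not one prescribed in the text.

```python
import numpy as np
from scipy.linalg import expm

# Adjacency matrix of the directed network of Figure 2.1:
# edges v1->v2, v1->v3, v2->v3, v3->v4, v4->v1.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
one = np.ones(4)

# Katz vectors (1.0.1); mu < 1/rho(A) guarantees convergence of the series.
rho = max(abs(np.linalg.eigvals(A)))
mu = 0.5 / rho
katz_down = np.linalg.solve(np.eye(4) - mu * A, one)    # (I - mu A)^(-1) 1
katz_up = np.linalg.solve(np.eye(4) - mu * A.T, one)    # (I - mu A^T)^(-1) 1

# Exponential analogues: row sums of e^A rank nodes by downstream reach
# (broadcasters); row sums of e^(A^T) rank them by upstream reach (receivers).
down = expm(A) @ one
up = expm(A.T) @ one
```

Note that 1^T e^A 1 = 1^T e^{A^T} 1, so the two exponential measures distribute the same total walk count differently over the nodes.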

Different measures of node centrality/importance capture different network features. This is particularly true for directed networks, since the notion of importance depends on the choice of direction (upstream or downstream). One of our goals is to show that, for directed networks, using two or more measures simultaneously is necessary to obtain a more complete picture.

In this dissertation, we are interested in measuring the importance of edges in a network as well. This is a problem that arises in various applications. For example, in [18], the authors are concerned about highway sections with congestion that reduces the overall highway network efficiency. Intuitively, an important edge is a good target for deletion when the goal is to disrupt the network and, therefore, worthy of protection when the goal is to preserve it. On the other hand, unimportant edges may possibly be eliminated (say, to save resources) with a small overall effect.

Our approach to study the importance of edges is to regard them as nodes in a line graph (see Section 4.2) and apply node centrality measures determined by a matrix function, in particular the matrix exponential. This works in a straightforward way for undirected networks, but becomes more complex for directed ones. We also consider the effect of edge weights.

It is often meaningful to assign weights to edges and nodes. For example, in networks in which each node represents a city and each edge represents a road, the edge weight may represent the capacity of transportation of the road. Edge-weighted networks have received considerable attention in the literature; see, e.g., [7, 20, 49, 54, 65]. It also may be purposeful to assign weights to nodes. The interpretation of node weights depends on the context of the model. For instance, in a network that models a part of the brain, where each node corresponds to a region of the brain, node weights may be chosen proportional to the size of the region of interest [1]. In networks in which each node corresponds to a city and the edges are roads between cities, a node weight may be chosen proportional to the number of restaurants in a city [46]. Node weights also may measure precipitation of a geographical region in a climate network [61]. However, despite many applications of node-weighted networks, the construction of suitable adjacency matrices for such networks has, to the best of our knowledge, not been discussed in the literature.

This part of the work is concerned with the identification of the most important nodes of a node-weighted network by using matrix functions, in particular the matrix exponential. A main challenge is the construction of a suitable adjacency matrix.

Our approach is to transform the given node-weighted graph to an edge-weighted graph based on the line graph associated with the given graph. We describe several ways to construct edge-weighted line graphs. Both undirected and directed graphs are considered. We also discuss how the change of a node weight affects the importance of the node.

This dissertation is organized as follows. Chapter 2 introduces basic notions about graphs and matrix functions. Chapter 3 discusses several ways to measure the importance of a node. In particular, we discuss the application of e^A 1 and e^{A^T} 1 as measures to quantify aggregate upstream and downstream reachability, as well as relativized versions of this approach that can identify the nodes that have most influence on a predetermined set of nodes. We discuss the possibility of reverting the direction of the walk a limited number of times. We then give examples that show the functions e^A 1 and e^{A^T} 1 of a non-symmetric adjacency matrix A provide meaningful rankings of the nodes in their roles as broadcasters and receivers, respectively. Numerical methods for large-scale networks with direction reversal are discussed. Chapter 4 introduces graphs and associated matrices and discusses line graphs. Line graphs for both undirected and directed graphs are considered. For directed graphs we define several line graphs. We are also concerned with graphs that have weighted edges. The identification of the most important edges of an undirected or directed graph with uniform weights by using the exponential function is discussed. The computation of the most important edges of a directed weighted graph is discussed, and computed illustrations are provided in most sections. We discuss the computations required to apply the described method. Chapter 5 discusses node-weighted networks. We show results on how the importance of the nodes changes when a node weight is modified, and discuss ways that we can transform a node-weighted graph into an edge-weighted graph. We then show how we can identify the most important node(s) and edge(s) of a node-weighted graph, and present applications of our methods to real-life examples. Computed illustrations are provided in most sections. We summarize our results in Chapter 6.

CHAPTER 2

Basic Definitions and Properties

Algebraic graph theory uses algebraic methods to study graphs. In particular, the use of Linear Algebra has proved useful for the analysis of networks. Detailed expositions can be found, for example, in [25, 27, 36]. In this chapter we will develop only concepts that will be needed below; some notations and definitions are non-standard.

2.1 Graphs

A network can be described mathematically by a graph G = (V, E), where V = {v1, v2, ..., vn} is the set of nodes (or vertices) and E = {e1, e2, ..., em} is the set of edges; basic facts about graphs can be found, e.g., in [25, 27, 36, 50]. If some of the edges are directed, then we call the graph directed, otherwise we call it undirected. A directed edge ek pointing from node vi to node vj can be identified with the ordered pair (vi, vj), and we say that ek incides on vj, exsurges from vi, and connects vi and vj; in the undirected case, each element of E is an unordered pair ek = {vi, vj} of elements of V, and we say that ek incides on both vi and vj, and connects vi and vj (and also vj and vi). Notice that in either case it is possible that vi = vj; in specific cases, we will require that such “self-loops” do not exist. We assume that there are no multiple edges between any pair of vertices. Two nodes connected by an edge are called adjacent.

The out-degree of a node counts the number of edges exsurging from that node, and the in-degree counts those inciding directly at it.

A (standard) walk of length k is a sequence vi1, vi2, ..., vik+1 of nodes and a sequence ei1, ei2, ..., eik of edges such that eij points from vij to vij+1. A walk with no repeated vertices is called a path. An alternating walk from node vi1 to node vik is a sequence of nodes vi1, vi2, ..., vik, such that the direction of the edges is reversed at each step. If this walk starts with an edge pointing from vi1, then an edge eij points from vij to vij+1 if j is odd, and from vij+1 to vij if j is even. If this walk starts with an edge pointing to vi1, then an edge eij points from vij+1 to vij if j is odd, and from vij to vij+1 if j is even [11, 21]. We are interested in identifying nodes that are good broadcasters or good receivers, i.e., nodes that originate several (standard) walks (originate much information flow) or are targets of several (standard) walks. We remark that good broadcasters or receivers are not necessarily good hubs or authorities, respectively; see Section 3.2.1.2 for a discussion of the latter concepts and references to the literature.

Figure 2.1: Graph of a directed network (nodes v1, v2, v3, v4).

2.2 Adjacency Matrix

We can describe a network of n nodes or vertices by the n × n adjacency matrix A = [Aij] of G, with Aij = 1 if there exists an edge that connects nodes vi and vj, and Aij = 0 otherwise. The choice of 1 for all nonzero elements of A follows the assumption that the network is unweighted, i.e., all connections are equally important; self-loops correspond to diagonal entries. For directed networks as in Figure 2.1, the edges are directed, and the resulting adjacency matrix is generally nonsymmetric. The adjacency matrix for the graph in Figure 2.1 is

(2.2.1)    A = [ 0 1 1 0
                 0 0 1 0
                 0 0 0 1
                 1 0 0 0 ].

Here and below the superscript T denotes transposition. The transpose A^T of the adjacency matrix can be thought of as the adjacency matrix of the graph obtained if we switch the direction of the edges.

Figure 2.2: Graph of an undirected network (nodes v1–v5, edges e1–e8).

For undirected networks as in Figure 2.2, the edges are undirected, and the resulting adjacency matrix is symmetric. The adjacency matrix for the graph in Figure 2.2 is

(2.2.2)    A = [ 0 1 0 0 1
                 1 0 1 1 1
                 0 1 0 1 1
                 0 1 1 0 1
                 1 1 1 1 0 ].

2.3 Incidence Matrix of Undirected Graphs

Let the graph G be undirected and unweighted. Then the incidence matrix of G is an n × m matrix B = [Bij] with Bij = 1 if ej incides on vi, and Bij = 0 otherwise. Each row represents a node in the graph, and each column an edge. Notice that each column of B has exactly two entries equal to 1 (and the rest zero), unless the corresponding edge is a self-loop, in which case exactly one entry is 1. If there are no self-loops, then BB^T = A + D, where D = [Dij] is a diagonal matrix with the diagonal entry Dii equal to the degree of vi. The incidence matrix corresponding to the network in Figure 2.2 is

(2.3.1)    B = [ 1 1 0 0 0 0 0 0
                 1 0 1 1 1 0 0 0
                 0 0 1 0 0 1 1 0
                 0 0 0 1 0 1 0 1
                 0 1 0 0 1 0 1 1 ].

2.3.1 Incidence to Adjacency Matrix for Undirected Graphs

The adjacency matrix of an undirected network can be recovered from the corresponding incidence matrix by the formula in Lemma 2.3.1, as in [36].

Lemma 2.3.1. The adjacency matrix A of an undirected network can be computed from the corresponding incidence matrix B by A = BB^T − diag(BB^T).

Proof. For an undirected network, the diagonal of the matrix BB^T counts the number of edges in contact with each node. In other words, the degree of node vi is the ith diagonal entry of BB^T, calculated as the inner product of the ith row of B with itself. That row holds the entry 1 each time node vi is involved with an edge. A nondiagonal entry (BB^T)_ij is equal to 1 when node vi is connected to node vj and 0 otherwise; therefore it coincides with the corresponding component of the adjacency matrix.

We note that the networks we study do not have self-loops, and thus the diagonal entries of the adjacency matrix are all zero. We conclude that the adjacency matrix A of an undirected network can be computed from the corresponding incidence matrix B by A = BB^T − diag(BB^T).

2.3.2 Adjacency to Incidence Matrix for Undirected Graphs

An incidence matrix can be constructed by reading the graph of the network, as well as by using the corresponding adjacency matrix. Algorithm 1, written in MATLAB, describes that process. This algorithm is written as a function with Adj the symmetric adjacency input matrix and Inc the incidence output matrix. Starting with Adj ∈ R^{n×n}, we use the variable k to enumerate the edges in the graph. Each row and each column of Adj represents a node. Since Adj is symmetric, the 1s above the diagonal cover all the edges. We use the variable i to fix a row of Adj, search for all Adjij = 1 in that row to the right of column i, and assign a new column in the incidence matrix to each one of them. For the kth edge found, we place a 1 in the ith row and another 1 in the jth row of the incidence matrix, in column k. As the incidence matrix grows, it gets filled with 0 entries to preserve the correct dimensions. We assume that there are no standalone nodes in the graph; therefore, at the end of the algorithm, Inc ∈ R^{n×m}. Note that if the adjacency matrix is in sparse format, appropriate adjustments should be made to the algorithm.

Algorithm 1: Adjacency to incidence for undirected networks

function [Inc] = AdjToInc(Adj)
    n = size(Adj, 1);
    % Compute the incidence matrix
    k = 0;
    for i = 1:n
        for j = i+1:n
            if Adj(i, j) == 1
                k = k + 1;
                Inc(i, k) = 1;
                Inc(j, k) = 1;
            end
        end
    end
end

2.4 Incidence Matrix for Directed Graphs

In this section we present two common methods for describing the incidence matrix

of a directed network. In Chapter 4, we will present a new way of writing such

matrices.

2.4.1 Incidence Matrix for Directed Graphs Using Entries ±1

We can define the incidence matrix by B̂ik = −1 if edge ek emerges from node vi, B̂ik = 1 if edge ek points to node vi, and B̂ik = 0 otherwise [19, 36]. Since each edge can emerge from only one node, and point to only one, each column of B̂ contains one component equal to 1 and another one equal to −1, and the rest are zeros. MATLAB uses this notation to define the incidence matrix for digraphs. For the network in Figure 2.1, the incidence matrix is

(2.4.1)    B̂ = [ −1 −1  0  0  1
                   1  0 −1  0  0
                   0  1  1 −1  0
                   0  0  0  1 −1 ].

2.4.1.1 Incidence to Adjacency Matrix for Directed Graphs Using Entries ±1

The formula in Lemma 2.3.1 fails to recover the adjacency matrix A from the incidence matrix of a directed network using entries −1 and 1. One observes that B̂B̂^T is symmetric, whereas A is not. We derive the adjacency matrix from the corresponding incidence matrix for those networks by using Algorithm 2 for a function that we implement in MATLAB, where Inc is the incidence input matrix, and AdjCalc the calculated adjacency output matrix. In this algorithm, n, the number of nodes, and m, the number of edges, are determined by the size of Inc. We initialize the calculated adjacency matrix as an n × n zero matrix. The variable k goes through the edges from 1 to m. For a fixed column k of Inc, we search for the entry holding −1, representing the source node. If that entry is in the ith row, then there is a 1 in the ith row of AdjCalc. The entry holding 1 in the kth column of Inc represents the target node. If that entry is in the jth row, then the edge is represented by AdjCalcij = 1.

Algorithm 2: Incidence to adjacency for directed networks using entries 1 and −1

function [AdjCalc] = IncToAdjDir(Inc)
    % Compute the adjacency matrix from the incidence matrix
    [n, m] = size(Inc);
    AdjCalc = zeros(n, n);
    for k = 1:m
        for i = 1:n
            if Inc(i, k) == -1
                for j = 1:n
                    if Inc(j, k) == 1
                        AdjCalc(i, j) = 1;
                    end
                end
            end
        end
    end
end

2.4.1.2 Adjacency to Incidence Matrix for Directed Graphs Using Entries ±1

An incidence matrix can be constructed by reading the graph of the network, as well as by using the corresponding adjacency matrix. Algorithm 3, written in MATLAB, describes that process. This algorithm is written as a function with Adj as the adjacency input matrix and Inc as the incidence output matrix. In addition, n is the number of nodes in the network, and m the number of edges. This algorithm works like Algorithm 1 with two differences. The first one is that we place a −1 instead of a 1 in Inc for the source nodes. The second difference is that since the matrix is nonsymmetric and the edges are directed, we need to take into account all of the 1 entries in the adjacency matrix, and not just the ones above the diagonal.

Algorithm 3: Adjacency to incidence for directed networks using entries 1 and −1

function [Inc] = AdjToIncDir(Adj)
    n = size(Adj, 1);
    % Compute the incidence matrix
    k = 0;
    for i = 1:n
        for j = 1:n
            if Adj(i, j) == 1
                k = k + 1;
                Inc(i, k) = -1;   % source node
                Inc(j, k) = 1;    % target node
            end
        end
    end
end

2.4.2 Incidence and Exsurgence Matrices for Directed Graphs

Assume now that G is directed and unweighted. We then define the incidence and exsurgence matrices of G as the n × m matrices B^i = [B^i_ij] and B^e = [B^e_ij], respectively, with B^i_ij = 1 if ej incides on vi, B^e_ij = 1 if ej exsurges from vi, and entries zero otherwise.

In this dissertation, wherever we plan on using the incidence matrix concept for directed graphs, we will be appropriately using the incidence and exsurgence matrices B^i and B^e described above.

2.5 Weights

Whether directed or not, a network can be edge-weighted (or node-weighted), if there is a number assigned to each edge (or node). Figure 2.3 illustrates an edge- weighted graph, in which each edge weight represents the travel cost of traversing the road that corresponds to the edge. The nodes are unweighted, i.e., each node has unit weight. Figure 2.4 displays a node-weighted graph, in which each node weight is the price of a hotel room at the node. The edges are unweighted, i.e., they all have weight one. Graphs also may be both edge-weighted and node-weighted, but this case is beyond the scope of this work. The interpretation of the weights depends on the application. In general, node weights correspond to the “size” of a node, while edge weights indicate a capacity or speed of transportation, or the reciprocal of a transfer or communication cost. In this dissertation weights are positive.

For undirected unweighted graphs, the degree of a node is defined as the number of edges ending at it; for directed unweighted graphs, we identify the in-degree of a node as the number of edges ending at it, and the out-degree of a node as the number of edges originating from it.

Figure 2.3: Directed edge-weighted graph.

Figure 2.4: Directed node-weighted graph.

For node-weighted graphs, all edges have weight one. When constructing an associated edge-weighted graph, its nodes will have weight one, and its edges will have weights as described in Section 5.2.

2.5.0.1 Adjacency and Incidence Matrices of Edge-Weighted Graphs

Let the edges of the graph G have positive weights and denote the associated weighted adjacency matrix by Ã. Thus, the (i, j)th entry of Ã is the weight of the edge from node vi to node vj. We refer to the adjacency matrix Ã as edge-scaled. The “unweighted” adjacency matrix A that is associated with Ã has all edge weights equal to one. Thus, the entries of A belong to {0, 1}.

Consider the unweighted adjacency matrix A = B^e (B^i)^T associated with the edge-weighted graph G, and let the diagonal matrix Z with diagonal entries z1, z2, ..., zm hold the edge weights of the graph. Then the weighted adjacency matrix for the graph G can be written as

(2.5.1)    Ã = B^e Z (B^i)^T.
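The factorization (2.5.1) can be checked on the digraph of Figure 2.1; the sketch below (Python/NumPy, with one fixed edge ordering and hypothetical positive weights z1, ..., z5 chosen only for illustration) recovers the 0/1 adjacency matrix (2.2.1) and then scales each edge by its weight.

```python
import numpy as np

# Incidence (B^i) and exsurgence (B^e) matrices for the directed graph of
# Figure 2.1, with edges ordered e1: v1->v2, e2: v1->v3, e3: v2->v3,
# e4: v3->v4, e5: v4->v1.
Bi = np.array([[0, 0, 0, 0, 1],   # Bi[i, j] = 1 if e_j incides on v_i
               [1, 0, 0, 0, 0],
               [0, 1, 1, 0, 0],
               [0, 0, 0, 1, 0]])
Be = np.array([[1, 1, 0, 0, 0],   # Be[i, j] = 1 if e_j exsurges from v_i
               [0, 0, 1, 0, 0],
               [0, 0, 0, 1, 0],
               [0, 0, 0, 0, 1]])

# Hypothetical positive edge weights z_1, ..., z_5 on the diagonal of Z.
Z = np.diag([2.0, 1.0, 3.0, 0.5, 4.0])

A_unweighted = Be @ Bi.T    # recovers the 0/1 adjacency matrix (2.2.1)
A_weighted = Be @ Z @ Bi.T  # edge-scaled adjacency matrix (2.5.1)
```

Each nonzero entry of the product lands at the (source, target) position of one edge and carries that edge's weight.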

CHAPTER 3

Analysis of Directed Networks Via the Matrix Exponential

3.1 The Matrix Exponential and Other Matrix Functions

The (i, j)th element of A^k is the number of walks of length k starting at node vi and ending at node vj. A matrix function can be defined by a power series of the form

(3.1.1)    f(A) = Σ_{p=0}^{∞} c_p A^p,

which can be interpreted as the sum of counts of walks of various lengths between the nodes of the network, weighted according to their length by the coefficients cp.

Generally, these coefficients are nonnegative and decreasing. This implies that long walks are weighted less than short walks, i.e., they are considered less important than short walks. Moreover, we would like the coefficients to decrease to zero quickly enough so that the series (3.1.1) converges. A nice introduction to the use of matrix functions in network analysis is provided by Estrada and Higham [30].
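A small check of the walk-counting interpretation (a Python/NumPy sketch on the digraph of Figure 2.1; the truncation length 25 is an arbitrary choice): powers of A count walks, and weighting the powers by c_p = 1/p! sums them toward the matrix exponential.

```python
import numpy as np
from math import factorial

# Adjacency matrix of Figure 2.1; (A^k)_{ij} counts the walks of length k
# from v_i to v_j.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)

A2 = A @ A   # the two walks of length 2 leaving v1 are
             # v1 -> v2 -> v3 and v1 -> v3 -> v4

# Partial sum of (3.1.1) with c_p = 1/p!; the factorial decay makes the
# contribution of long walks negligible well before p = 25.
F = sum(np.linalg.matrix_power(A, p) / factorial(p) for p in range(25))
```

The diagonal entry F[0, 0] exceeds 1 because v1 lies on closed walks (e.g., the length-3 cycle v1 → v3 → v4 → v1), each weighted by 1/p!.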

For matrices that are of small to moderate size and are diagonalizable, matrix functions generally can be evaluated by first computing the spectral factorization of A and then evaluating the corresponding function of a real or complex variable at the eigenvalues of the matrix. Many other approaches to define and evaluate matrix functions are available; see, e.g., [42] for techniques suitable when the matrix A is small enough to conveniently be factored and [8, 9, 10, 34, 37] for the approximation of functions of large matrices.

The most commonly used matrix functions for network analysis are the matrix exponential, obtained by taking cp = 1/p!, and the resolvent

(3.1.2)    f(A) = (I − µA)^{−1} = I + µA + µ²A² + ··· ,

where the scalar µ > 0 is chosen small enough so that the above series converges; see [30] for a discussion and illustrations.
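As a sanity check on the series representation (a Python/NumPy sketch; the matrix is the one of Figure 2.1 and µ = 0.3 is an arbitrary value below 1/ρ(A)), a truncated power series reproduces the resolvent:

```python
import numpy as np

# Truncating the series I + mu*A + mu^2*A^2 + ... reproduces the resolvent
# (I - mu*A)^(-1) when mu < 1/rho(A); here rho(A) is about 1.22.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
mu = 0.3

resolvent = np.linalg.inv(np.eye(4) - mu * A)
series = sum((mu ** p) * np.linalg.matrix_power(A, p) for p in range(60))
```

With µ·ρ(A) ≈ 0.37, the neglected tail of the series is far below machine precision after 60 terms.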

3.2 Node Importance and Communicability Using the Matrix Exponential

3.2.1 Existing Methods

3.2.1.1 Methods for Undirected Networks

Measures of importance include simple notions like degree (degree centrality) as well as more elaborate concepts [13, 17, 27]. A popular approach is eigenvector or feedback centrality [15, 16, 35, 41], which formalizes the circular notion that "a node is central if it is connected to many central nodes" via an eigenvalue/eigenvector equation; the centrality measure is given by a Perron–Frobenius eigenvector of A.

In [30], the authors use matrix functions to rank the nodes in undirected networks, based on the heuristic that matrix functions produce weighted sums of walks connecting pairs of nodes with weights that depend only on the length of the walk. The subgraph centrality of a node v_i is defined by [exp(A)]_ii [31].

Benzi et al. [11] discussed the application of the matrix exponential to the adjacency matrix for a directed network and observed that it may not be meaningful to use the diagonal entries of the exponential as a measure of importance of the nodes of a directed network. They considered the following directed network

v_1 → v_2 → v_3 → ⋯ → v_{n−1} → v_n

Figure 3.1: Graph of a directed chain of nodes.

with the associated adjacency matrix

  0 1 0 ··· 0       0 0 1 ··· 0      ......  n×n (3.2.1) A =  . . . . .  ∈ R .       0 0 0 ··· 1       0 0 0 ··· 0

All diagonal entries of the exponential of this matrix are equal to 1, giving the same importance to each of the nodes. This result was not satisfying for the authors, since the first and last nodes should not be equally important. Benzi et al. [11] therefore proposed to bipartize the network into hubs and authorities. We outline this approach in the following subsection.
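This behavior is easy to reproduce numerically: the sketch below builds the adjacency matrix (3.2.1) for n = 5 and confirms that every diagonal entry of its exponential equals 1 (A is nilpotent, so exp(A) is unit upper triangular):

```python
import numpy as np
from scipy.linalg import expm

n = 5
A = np.diag(np.ones(n - 1), k=1)   # directed chain v1 -> v2 -> ... -> v5
E = expm(A)
# [E]_{ij} = 1/(j-i)! for j >= i, and the diagonal is identically 1
```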

3.2.1.2 Directed Networks: Bipartization

The out-degree and in-degree are simple measures of the importance of a node in its roles as hub and authority, respectively. These measures only consider local information of each node, and do not propagate the effect of a node through the whole network.

Kleinberg [44] proposed to split a directed network into hubs and authorities, and Benzi et al. [11] constructed a related bipartite network that makes it possible to compute the hub centrality and authority centrality of nodes using the matrix exponential. This construction is based on forming an undirected bipartite graph with twice the number of nodes of the original graph; the first n nodes represent the nodes of the original network in their hub role, i.e., edges emerge from those nodes, and the second n nodes of the bipartite graph represent them in their authority role, i.e., edges point to those nodes.

This way, an edge e_k pointing from node v_i to node v_j in the directed network can be represented by an undirected edge connecting node v_i to node v_{n+j} in the corresponding bipartite graph. The new undirected graph related to the directed graph in Figure 2.1 is given by Figure 3.2.


Figure 3.2: Bipartite undirected network graph to represent the graph of the directed network in Figure 2.1.

Specifically, Benzi et al. [11] considered the symmetric matrix

(3.2.2) A = ⎡ 0    A ⎤
            ⎣ A^T  0 ⎦ .

We calculate the exponential

(3.2.3) exp(A) = I + A + A^2/2! + A^3/3! + ... ,

where

A^2 = ⎡ AA^T    0    ⎤      A^3 = ⎡ 0        AA^T A ⎤
      ⎣ 0       A^T A⎦ ,          ⎣ A^T AA^T    0   ⎦ ,

A^4 = ⎡ AA^T AA^T    0         ⎤
      ⎣ 0            A^T AA^T A⎦ , ... .

The quantity [AA^T]_ii, for i = 1, ..., n, counts the number of alternating walks of length 2 starting at node v_i. Similarly, [(AA^T)^p]_ii counts the number of alternating walks of length 2p starting at node v_i, and [(A^T A)^p]_ii counts the number of alternating walks of length 2p ending at node v_i. Therefore, the diagonal entry [exp(A)]_ii, for i ≤ n, is a weighted sum of all alternating walks starting at node v_i, penalized by a factorial factor, and gives a measure for the hub centrality of node v_i. Likewise, the diagonal entry [exp(A)]_{n+i,n+i}, for 1 ≤ i ≤ n, is a weighted sum of all alternating walks ending at node v_i, penalized by a factorial factor, and gives a measure for the authority centrality of node v_i. Efficient numerical methods for this purpose are described in [5, 11].

However, while determining the importance of nodes by ranking their hub and authority roles using the diagonal entries of the exponential exp(A) as outlined above yields valuable information in many situations, it is not satisfactory for identifying good broadcasters and receivers. For instance, consider the network of Figure 3.1 and let n = 5. Following the process described above, we obtain the ranking displayed in Table 1.

(a) hub role                    (b) authority role
node v_i   [exp(A)]_ii          node v_i   [exp(A)]_{n+i,n+i}
v_1        1.543081             v_2        1.543081
v_2        1.543081             v_3        1.543081
v_3        1.543081             v_4        1.543081
v_4        1.543081             v_5        1.543081
v_5        1                    v_1        1

Table 1: Ranking the nodes in Figure 3.1 by bipartizing the graph.

Although this ranking shows node v_5 to be the least important node in its hub role, and node v_1 to be the least important in its authority role, this approach fails to give a reasonable ranking for the other nodes in the graph, which appear to be equally important as hubs and authorities. However, since node v_1 is able to send information to all the other nodes in the network, it should receive the highest broadcaster ranking.

More generally, node v_i broadcasts to all of the nodes v_j with j > i. Therefore, one should expect a node v_i with a smaller index i to be a more important broadcaster than a node with a larger index. Similarly, each node v_i receives information from all nodes v_j with j < i. Therefore, a node v_i with a larger index i is more important as a receiver. We conclude that computing the diagonal entries of the matrix exp(A) does not always give an intuitively correct ranking of the nodes in their broadcaster and receiver roles.

In the next section, we discuss an alternative way to rank the nodes of a directed network, without bipartization of the graph. This approach will rank the nodes of the graph of Figure 3.1 so that nodes vi with smaller index i receive higher broadcaster and lower receiver rankings. This ranking method also can be applied when we are interested in reaching or avoiding particular nodes of a network.

3.2.2 Aggregate Upstream and Downstream Reachability

Consider the exponential of a nonsymmetric adjacency matrix A associated with a directed network. The entry [exp(A)]_ij calculates a weighted sum of walks from node v_i to node v_j, giving more weight to shorter walks. Introduce the vector u = [u_1, u_2, ..., u_n]^T for measuring how important it is to reach each node of the network; we let u_i = 0 if the node v_i is considered an uninteresting destination. A large value of u_i indicates that node v_i is a highly targeted node. In the following we only consider entries u_i = 0 or u_i = 1. Now exp(A)u allows us to determine which nodes are most important for the requested flow, i.e., which nodes are the most important broadcasters. Similarly, (u^T exp(A))^T = exp(A^T)u gives a ranking of the nodes according to their role as receivers from the nodes specified by the nonzero entries of u.

We define the aggregate downstream reachability as the vector

(3.2.4) ADR = exp(A)1.

Thus, here u = 1. This vector gives the same importance to the goal of reaching any node in the network. Similarly, the aggregate upstream reachability is defined as

(3.2.5) AUR = exp(AT )1.

ADR provides a reasonable ranking for the nodes in their broadcaster role, i.e., in their ability to broadcast information through the network. Likewise, AUR can be used to rank the nodes according to their receiver role. We remark that Benzi et al. [11, Section 8.1] tabulated the column sums of exp(A), but did not discuss this approach in any detail. The difference between row and column sums is used by Croft and Higham [22] with the goal of finding a hierarchical ordering of network nodes. Theoretical results are shown by Benzi and Klymko [13].

For the graph of Figure 3.1, the ADR and AUR methods determine the rankings displayed in Table 2. These rankings satisfy the requirements mentioned at the end of Section 3.2.1.2.
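For the chain of Figure 3.1 these quantities take one line each to compute; the sketch below reproduces the monotone rankings of Table 2 (ADR decreasing in the node index, AUR increasing):

```python
import numpy as np
from scipy.linalg import expm

n = 5
A = np.diag(np.ones(n - 1), k=1)   # chain of Figure 3.1
one = np.ones(n)

ADR = expm(A) @ one      # aggregate downstream reachability (3.2.4)
AUR = expm(A.T) @ one    # aggregate upstream reachability (3.2.5)
```

For this graph, ADR_i = Σ_{d=0}^{n−i} 1/d!, e.g., ADR_1 = 1 + 1 + 1/2 + 1/6 + 1/24 ≈ 2.71, and AUR is ADR reversed.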

3.3 Reverting Walks with a Bounded Number of Reversions

An appealing feature of the hubs-and-authorities model described by Benzi et al. [11] is that two nodes v_i, v_j can be considered related in situations when they are connected by alternating walks; the drawback is that only strictly alternating walks are considered. On the other hand, measures based solely on exp(A), like ADR and AUR, fail to capture these "lateral" connections. In this section we consider an approach that allows us to recover some of these connections. Taking lateral connections into account helps us identify important branch points.

A (somewhat trivial) way to include lateral connections is simply to transform the directed network into an undirected one by disregarding edge directions. This can be achieved by symmetrizing the adjacency matrix, i.e., by computing the exponential of the symmetric matrix A_S = (1/2)(A + A^T) instead of the exponential of A. The matrix A_S is the closest symmetric matrix to A in the Frobenius norm. Of course, all directionality information is lost when replacing A by A_S. Nevertheless, for comparison purposes, we define the symmetrized aggregate reachability by

SAR = exp((1/2)A + (1/2)A^T)1.

Consider now the matrix exp((1/2)A) exp((1/2)A^T) (this matrix equals the one used in SAR if and only if A and A^T commute; see the remark at the end of this section). It can be written as:

(3.3.1) exp((1/2)A) exp((1/2)A^T) = ( Σ_{n=0}^∞ A^n / (2^n n!) ) ( Σ_{n=0}^∞ (A^T)^n / (2^n n!) )
                                  = Σ_{i=0}^∞ Σ_{j=0}^∞ A^i (A^T)^j / (2^{i+j} i! j!).

Since the entries of A^i (A^T)^j count walks that move along i edges, and then in reverse along j edges, exp((1/2)A) exp((1/2)A^T) is comprised of weighted sums of counts of walks that start in the forward direction, and have exactly one change of direction (either leg of the walk, or both, can be of length zero). Similarly, exp((1/2)A^T) exp((1/2)A) contains weighted sums of walks that start in reverse, and then change direction exactly once. The vectors exp((1/2)A) exp((1/2)A^T)1 and exp((1/2)A^T) exp((1/2)A)1 are helpful in identifying branch points of directed networks. This will be illustrated in Section 3.4.

Analogously to (3.3.1), we can construct matrices containing information about walks with at most two reversions, namely

exp((1/3)A) exp((1/3)A^T) exp((1/3)A)   and   exp((1/3)A^T) exp((1/3)A) exp((1/3)A^T).

In general, we can build a matrix containing weighted sums of numbers of walks with at most k reversions using alternating products of k + 1 factors of the form exp(A/(k+1)) and exp(A^T/(k+1)). We introduce the bounded number of reversions notions of reachability, denoted by BNR(x, k), where x ∈ {d, u}, by

(3.3.2) BNR(d, k) = ( exp(A/(k+1)) exp(A^T/(k+1)) ··· ) 1,

and

(3.3.3) BNR(u, k) = ( exp(A^T/(k+1)) exp(A/(k+1)) ··· ) 1.

Here each product has k + 1 alternating exponential factors.
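A sketch of the BNR computation (the helper function and its name are ours, not from the text); for the chain of Figure 3.1 it reproduces the BNR(d, 1) column of Table 2:

```python
import numpy as np
from scipy.linalg import expm

def bnr(A, k, x="d"):
    """BNR(x, k): k+1 alternating factors exp(A/(k+1)), exp(A^T/(k+1)), applied to 1."""
    n = A.shape[0]
    factors = [expm(A / (k + 1)), expm(A.T / (k + 1))]
    if x == "u":                  # upstream version starts with exp(A^T/(k+1))
        factors.reverse()
    M = np.eye(n)
    for p in range(k + 1):
        M = M @ factors[p % 2]
    return M @ np.ones(n)

A = np.diag(np.ones(4), k=1)      # chain of Figure 3.1, n = 5
bnr_d1 = bnr(A, 1, "d")           # cf. the BNR(d, 1) column of Table 2
```

Increasing k makes the result approach SAR, in line with Proposition 3.3.1 below.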

We are mostly interested in small values of k (say, k = 1 or 2), since intuition suggests that the more reversals we allow, the closer we get to losing all directional information, as in the symmetrization approach SAR. This intuition is formalized by the following result:

Proposition 3.3.1. limk→∞ BNR(d, k) = limk→∞ BNR(u, k) = SAR.

Proof. The statement follows from the Lie product formula (see, e.g., [39, page 35]):

exp(X + Y) = lim_{m→∞} ( exp(X/m) exp(Y/m) )^m.

Taking X = (1/2)A and Y = (1/2)A^T, we obtain for each odd k = 2m − 1 that

exp(A^T/(k+1)) exp(A/(k+1)) ··· exp(A/(k+1)) = ( exp((1/m)(1/2)A^T) exp((1/m)(1/2)A) )^m → exp((1/2)A^T + (1/2)A)

as m → ∞. The same happens for the products starting with exp(A/(k+1)). For k even, say k = 2m, we obtain terms of the form

( exp(A^T/(2m+1)) exp(A/(2m+1)) )^m exp(A^T/(2m+1)).

The last factor converges to I; the remaining part can be shown to converge to exp((1/2)A^T + (1/2)A) by the same method as the proof of the Lie product formula in [39]. Finally, the result is obtained by multiplying by 1 on the right.

We remark that the matrices A and A^T commute if and only if A is normal, i.e., A has a unitary eigenvector matrix. In this case, both BNR(d, k) and BNR(u, k) equal SAR, for all odd k. This means that the BNR measures may discard all directionality information, even for small k, but only for a fairly restrictive class of adjacency matrices. Indeed, equality of the diagonal entries of AA^T and A^T A implies that each vertex has equal in-degree and out-degree, which is a reasonable condition if the directed network represents a volume-preserving flow; equality of the off-diagonal entries imposes an even stronger restriction. Examples of non-symmetric normal adjacency matrices include circulant matrices, which correspond to a cyclic arrangement of the nodes, and some block-circulant matrices. In any case, A will have at least some eigenvalues with non-zero imaginary part, corresponding to some sort of cyclical structure in the network.
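The circulant case is easy to check numerically; for a directed ring (a cyclic shift matrix, which is normal), the product exp((1/2)A) exp((1/2)A^T) coincides with exp((1/2)A + (1/2)A^T), so BNR(d, 1) equals SAR. The small example below is our own:

```python
import numpy as np
from scipy.linalg import expm

# Directed ring on 4 nodes: a circulant (cyclic shift) adjacency matrix.
A = np.roll(np.eye(4), 1, axis=1)

assert np.allclose(A @ A.T, A.T @ A)   # A is normal: A A^T = A^T A = I

one = np.ones(4)
sar = expm(0.5 * (A + A.T)) @ one                 # SAR
bnr_d1 = expm(0.5 * A) @ expm(0.5 * A.T) @ one    # BNR(d, 1)
```

Since the ring is vertex-transitive, both vectors are constant, reflecting the complete loss of directional distinctions for this normal adjacency matrix.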

3.4 Examples

The vector u in all examples of Sections 3.4.1 and 3.4.2 is chosen to be 1.

3.4.1 Small Examples

In this subsection we give examples of small synthetic directed networks, and rank their nodes in their broadcaster and receiver roles. We compare our results with the ranking described in Section 3.2.1.2 for each of these networks.

3.4.1.1 Example 1: Simple Chain

This example is illustrated in Figure 3.1 for n = 5. The different measures discussed in this chapter are summarized in Table 2.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.54          2.71   1.99       2.14
v_2        1.54          2.67   2.55       2.62
v_3        1.54          2.50   2.65       2.64
v_4        1.54          2.00   2.47       2.39
v_5        1.00          1.00   1.65       1.53

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.65       1.53       1.83
v_2        1.54                 2.00   2.47       2.39       2.53
v_3        1.54                 2.50   2.65       2.64       2.66
v_4        1.54                 2.67   2.55       2.62       2.53
v_5        1.54                 2.71   1.99       2.14       1.83

Table 2: Comparison of influence measures for the nodes in Figure 3.1, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.1.2 Example 2: Simple Chain with One Branch

Consider the graph depicted in Figure 3.3. Bipartization of this graph gives the nodes v_1, v_2, v_4, and v_5 the same hub rank, since they all point to only one node that is not pointed to by any other node; see Table 3. However, the graph suggests an obvious advantage for nodes v_1 and v_2, because information from these nodes can spread to node v_3, and from there deeper into the network. The ADR ranking shows the nodes v_2 and v_1 to be the 2nd and 3rd most important broadcasters, respectively.

As for the authority role, the nodes v_4 and v_5 get the highest ranking by the bipartization method, because of the alternating walks reaching them from v_2. However, they actually only receive information from one node, v_3. Using the exponential of A^T, v_6 and v_7 are ranked the highest, since they receive information from the nodes they are directly attached to, from node v_3 through a walk of length 2, from node v_2 through a walk of length 3, as well as from node v_1 through a walk of length 4. Thus, the AUR ranking determines an intuitively reasonable ordering.


Figure 3.3: Graph of Example 2.

The measures BNR(d, k) and BNR(u, k) identify node v3 to be the most important for both k = 1 and k = 2, because the graph has a branch point at node v3. This example suggests that the BNR(d, k) and BNR(u, k) measures can be applied to identify important branch points. Node v3 is the most important node also if all directed edges are replaced by undirected ones.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.54          2.92   2.03       2.20
v_2        1.54          3.33   2.79       2.95
v_3        2.18          4.00   3.68       3.86
v_4        1.54          2.00   2.47       2.53
v_5        1.54          2.00   2.47       2.53
v_6        1.00          1.00   1.65       1.55
v_7        1.00          1.00   1.65       1.55

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.67       1.54       1.86
v_2        1.54                 2.00   2.63       2.48       2.72
v_3        1.54                 2.50   3.35       3.22       3.58
v_4        1.59                 2.67   2.88       2.80       2.72
v_5        1.59                 2.67   2.88       2.80       2.72
v_6        1.54                 2.71   2.07       2.17       1.86
v_7        1.54                 2.71   2.07       2.17       1.86

Table 3: Comparison of influence measures for the nodes in Figure 3.3, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.1.3 Example 3: Branching at Two Levels

Consider the example in Figure 3.4. Using the bipartization method, Table 4 indicates that the vertices v_1 and v_3 have the same importance as hubs. However, Figure 3.4 suggests that this is clearly not the case, since node v_1 broadcasts to more nodes in the network than node v_3. In fact, v_3 directly reaches two nodes, v_4 and v_5, while v_1 also directly reaches two nodes, v_2 and v_3, in addition to reaching the nodes v_4 and v_5 through the hub role of node v_3. When computing the ADR ranking, the role of v_1 as a more important broadcaster is detected.


Figure 3.4: Graph of Example 3.

Turning to the receiver role of the nodes, we observe from Table 4 that AUR ranks the vertices v_4 and v_5 as more important receivers than v_2 and v_3, since they receive more information from the network, whereas the bipartization method does not show this difference.

Turning to the measures BNR(d, 1) and BNR(d, 2), we observe that both of them identify the vertices v_1 and v_3 as important; these are branch points for outflow of the graph. BNR(u, 1) and BNR(u, 2) are large for vertex v_3, because this vertex is a branch point for inflow. These measures are not large at v_1, because there is no inflow to this vertex. This example illustrates that the measures BNR(d, k) and BNR(u, k) for k ≥ 1 reveal important information about branch points of a graph.

This subsection has compared rankings determined by bipartization, ADR, AUR,

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        2.18          4.00   2.91       3.25
v_2        1.00          1.00   1.50       1.59
v_3        2.18          3.00   3.12       3.36
v_4        1.00          1.00   1.62       1.65
v_5        1.00          1.00   1.62       1.65

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   2.25       2.04       2.67
v_2        1.59                 2.00   2.12       2.01       1.85
v_3        1.59                 2.00   3.12       2.94       3.25
v_4        1.59                 2.50   2.28       2.26       1.99
v_5        1.59                 2.50   2.28       2.26       1.99

Table 4: Comparison of influence measures for the nodes in Figure 3.4, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

and BNR. These ranking methods are seen to be sensitive to different node and edge configurations. When we are interested in broadcasters and receivers, and how information flows, the latter ranking schemes appear to be more appropriate.

3.4.1.4 Example 4: More than One Path

All the examples given so far present nodes that interact in a unique way, that is, there exists at most one path that leads from one node to another one. We now give an example where connections can happen through more than one path. In Figure 3.5, node v_2 can send information to node v_4 either directly or through node v_3, and therefore more than one path exists between nodes v_1 and v_4, as well as between v_2 and v_5. In Table 6, all methods give the primary hub role to node v_2, and the strongest authority role to v_4. However, they do not agree on the first runner-up. The bipartization method places v_3 as the second strongest hub, and the BNR measures also rank v_3 above v_1. This is the result of the AA^T factor in the computation of

Top 10 ranked nodes v_i using various measures

[exp(A)]_ii   ADR   BNR(d,1)   BNR(d,2)
216           149   149        149
72            219   219        217
217           218   218        216
71            178   217        219
149           174   216        218
219           81    174        145
218           82    178        81
178           157   81         178
75            216   82         198
76            217   145        82

[exp(A)]_{n+i,n+i}   AUR   BNR(u,1)   BNR(u,2)   SAR
305                  305   305        305        71
71                   71    71         71         72
72                   72    72         72         217
74                   146   73         73         216
73                   276   146        74         305
76                   73    74         76         76
78                   74    76         146        73
75                   76    276        75         75
217                  234   75         78         74
77                   75    78         217        78

Table 5: Comparison of influence measures for the nodes in the C. elegans network, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

these methods, which counts alternating walks starting at that node. In this case v_3 is considered connected to node v_2 via a walk of length 2, and then to itself and v_4 via walks of length 3, and so on. ADR, however, only sees forward walks, and places v_1 as a stronger hub than v_3, since the former sends information to all the other nodes in the network, whereas the latter only reaches v_4 and v_5. A similar discussion applies to the authority ranking.


Figure 3.5: Graph of Example 4.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.54          3.38   2.33       2.50
v_2        2.23          4.17   3.99       4.06
v_3        1.59          2.50   2.98       3.02
v_4        1.54          2.00   3.17       3.13
v_5        1.00          1.00   1.79       1.64

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.79       1.64       2.05
v_2        1.54                 2.00   3.17       3.13       3.67
v_3        1.59                 2.50   2.98       3.02       3.10
v_4        2.23                 4.17   3.99       4.06       3.67
v_5        1.54                 3.38   2.33       2.50       2.05

Table 6: Comparison of influence measures for the nodes in Figure 3.5, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.1.5 Example 5: Sinkhole

The network graphed in Figure 3.6 displays the node v_5 as a sinkhole: one way or another, all information will reach it. Table 7 shows it to be as important as v_4 in their authority role, each having three edges pointing to it, with no possible alternating walks starting with a reverse step. But in reality, the information emerging from nodes v_1, v_2, and v_3 travels through v_4 and reaches v_5, even if weakened along the way, which is depicted by the aggregate reachability method. Similarly, all the nodes except v_5 have the same importance in their broadcasting role according to the bipartization, whereas the ADR favors the first three nodes, since they can also send information to node v_5 through a two-step walk.


Figure 3.6: Graph of Example 5.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.64          2.50   2.61       2.44
v_2        1.64          2.50   2.61       2.44
v_3        1.64          2.50   2.61       2.44
v_4        1.64          2.00   3.94       3.58
v_5        1.00          1.00   2.88       2.56
v_6        1.64          2.00   2.44       2.19
v_7        1.64          2.00   2.44       2.19

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.62       1.79       2.18
v_2        1.00                 1.00   1.62       1.79       2.18
v_3        1.00                 1.00   1.62       1.79       2.18
v_4        2.91                 4.00   3.94       4.51       4.23
v_5        2.91                 5.50   3.86       4.52       3.60
v_6        1.00                 1.00   1.50       1.72       2.04
v_7        1.00                 1.00   1.50       1.72       2.04

Table 7: Comparison of influence measures for the nodes in Figure 3.6, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.2 Real-Life Large Examples

3.4.2.1 C. Elegans Network

We consider the neural network of the worm Caenorhabditis elegans [2, 26]. The adjacency matrix for this network is of size 306 × 306. The nodes represent individual neurons, and the edges are links connecting the neurons. Table 5 shows various results regarding which neuron affects the entire network the most, with ADR and BNR(d, 1) agreeing on number 149, and bipartization favoring node 216. We note that BNR(d, 1) and BNR(d, 2) give node 149 a 9% higher score than the 2nd ranked nodes, and the scores for the following nodes are closer together. ADR gives node 149 a 5% higher score than the 2nd ranked node. Bipartization gives the leading node only a 3% higher score than the following nodes. This suggests that node 149 may be more important than node 216. All measures show node 305 to be the node most affected by other nodes in the network. Looking at the top ten broadcasters and receivers, we find many similarities, which suggests their validity.

Graph models ignore many properties of the neurons in a network. Therefore, it can be difficult to determine from a graph alone which neurons are the most important ones. Nevertheless, high ADR or AUR scores suggest that the corresponding neurons may be important broadcasters or receivers, respectively, and high BNR(d, k) or BNR(u, k) values indicate that the corresponding neurons may be important branch points of the network.

3.4.2.2 Gene Regulatory Network of the Human B-Cell Interactome

We consider a network of protein-protein, protein-DNA, and modulatory interactions in human B cells [62]. There are 5,737 nodes (genes/proteins) and 84,892 directed edges. Taking the matrix exponential of the adjacency matrix (multiplied by a coefficient of 0.25), we found one main strongly connected component (3,891 genes), with 1,833 genes downstream (grouped in singletons, pairs, triplets, or quadruplets) and 13 individual genes upstream. Analyzing the main strongly connected component, we are able to find the distribution of genes based on their aggregate downstream (exp(0.25A)1) and aggregate upstream (1^T exp(0.25A)) reachability; comparing the two, we can identify the overall role of genes as regulators, regulated, or both. Figure 3.7 displays the network.


Figure 3.7: Gene network: B Cell Interactome. Upstream (1^T exp(0.25A)) vs. downstream (exp(0.25A)1) aggregate reachability for genes in the network. Genes on the top right are well known, highly influential genes, like MYC and TP53. The genes on the left, like CYP27A1, perform metabolic functions but have little effect on other genes. Some ribosomal genes (RPS17, RPS27, etc.) form a small cluster. The gene COPE, which only has one incoming and one outgoing edge, appears less extreme than others.

This example illustrates how ADR and AUR can be used to identify important nodes in a complex directed network, and their different roles. Genes with high ADR and low AUR influence many genes, directly or indirectly, while being influenced by relatively few other genes; typically, these are transcription factors, which control the expression level of many other genes through regulatory pathways (ZBTB48, ZNF263, BACH1, and SOX5 are transcription factors). Genes with low ADR and high AUR can be expected to be "workhorse" genes, which perform important duties, and are therefore controlled by many upstream genes, but do not have a regulatory function (CYP11A1, CYP27A1, and CYP21A2, for example, encode enzymes with metabolic functions). Finally, genes with high ADR and high AUR can be regarded as very central in the network, brokers of influence that collect information from many genes upstream and control the expression of many genes downstream; unsurprisingly, crucial master genes like TP53 and MAPK1 have extreme values in both measures. (The information about specific genes mentioned above was obtained from GeneCards [63].)

An approach described by Croft and Higham [22] is closely related, but designed with the goal of extracting a hierarchical structure. In our notation, their measure becomes ADR − AUR, and it can be used to identify putatively influential transcription factors; however, it would fail to distinguish between TP53 and a relatively unimportant gene like COPE. One could define a measure for centrality given by ADR + AUR, which would distinguish between TP53 and COPE; of course, ADR − AUR and ADR + AUR are essentially an axis rotation of the measures ADR and AUR. It seems clear that a full picture requires at least two separate measures. However, in this example the BNR measures do not seem to add much beyond what is provided by ADR and AUR, at least for the top ranked genes.

We remark that scaling of an adjacency matrix may enhance the usefulness of the ordering determined. An interpretation of graphs as oscillator networks in which the scaling coefficient corresponds to inverse temperature is provided in [29]. Although the proper choice of scale is an important issue, it falls outside the scope of this work; our choice of 0.25 as scaling factor is approximately the largest value that prevents the entries of exp(A) from growing to the point of numerical overflow.

Top 10 ranked nodes v_i using various measures

[exp(A)]_ii   ADR     BNR(d,1)   BNR(d,2)
MYC           MYC     MYC        MYC
ESR1          ESR1    ESR1       ESR1
CREB1         CREB1   CREB1      CREB1
RBL2          RBL2    RBL2       RBL2
FOXM1         TP53    TP53       TP53
SP3           SP3     SP3        SP3
JUND          EP300   EP300      EP300
POU2F2        E2F4    E2F4       E2F4
E2F4          MAPK1   MAPK1      MAPK1
TCF1          STAT1   STAT1      STAT1

[exp(A)]_{n+i,n+i}   AUR     BNR(u,1)   BNR(u,2)   SAR
EP300                MAPK1   MAPK1      MAPK1      MYC
CREBBP               TP53    TP53       TP53       ESR1
CDC2                 GRB2    GRB2       GRB2       CREB1
PCNA                 FYN     FYN        FYN        RBL2
BRCA1                CDC2    CDC2       CDC2       JUND
AURKA                SRC     SRC        SRC        SP3
CCNA2                MAPK8   MAPK8      MAPK8      FOXM1
LYN                  JUNB    JUNB       JUNB       POU2F2
LTK                  STAT3   STAT3      STAT3      E2F4
JUN                  TRAP1   TRAP1      TRAP1      STAT1

Table 8: Comparison of influence measures for the nodes in the Gene network, in- cluding aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.3 Bus Route Network Targeting Specific Nodes

In this example we rank the nodes according to their downstream or upstream influence on particular nodes. Consider the Kent State University main campus bus system, illustrated in Figure 3.8, left panel. The route consists of four working loops: Front Campus/Summit East (in blue), Reverse Loop (in green), Gateway Loop (in orange), and Alberton (in purple). Due to road construction, Campus Loop (in red) was not running at the time of writing, and is not included in our example.

Figure 3.8: Left: Kent State University main campus bus route [56]. Right: Network graph of Kent State University main campus bus route.

The bus stops are the nodes v1, ... , v10, which are connected by directed edges, according to the map and schedule information [56]. We assign only one node to each named bus stop, even when the bus stops on both sides of the street. The graph of the directed network is shown in Figure 3.8, right panel. In this example, we assume that all edges have the same weight, regardless of travel length.

We compute exp(A)u for different vectors u. In some applications, one might need to subtract the identity matrix I from the Taylor series expansion (3.1.2) of exp(A).

In our example, we use the expansion (3.1.2) as is, because this allows a person at a bus stop to stay there instead of riding the bus. As mentioned in Section 3.2.2, the entries of the vector u indicate how important it is to reach each node in the network.

In this example we let these entries be either 0 or 1, depending on whether we are interested in reaching a node or not.
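As a sketch of this targeted use of u, with the chain of Figure 3.1 rather than the bus network (so the numbers are easy to verify by hand), choosing u with a single 1 in position 4 ranks the nodes by how well they reach, or receive from, node v_4:

```python
import numpy as np
from scipy.linalg import expm

n = 5
A = np.diag(np.ones(n - 1), k=1)    # chain v1 -> v2 -> ... -> v5
u = np.zeros(n)
u[3] = 1.0                          # only reaching node v4 is of interest

reach_v4 = expm(A) @ u              # broadcaster scores toward v4: column 4 of exp(A)
from_v4 = expm(A.T) @ u             # receiver scores from v4
```

Here reach_v4 = [1/6, 1/2, 1, 1, 0]: nodes closer to v_4 along the chain score higher, and v_5 cannot reach v_4 at all.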

For Table 9, we let all entries of u_1 be zero except for the fourth entry. Thus, we are interested in ranking the nodes according to how much they contribute to reaching node v_4. Table 10 shows how much each node contributes to receiving from node v_4. This is a new way of calculating the communicability among the nodes.

Let the vector u_2 have the fourth and the seventh entries equal to 1 and the other entries zero. In other words, we are interested in determining the best node to place our information in order to reach nodes v_4 or v_7. This is displayed in Table 9. The best node at which to gather information coming from nodes v_4 or v_7 is shown by Table 10.

broadcasting

node v_i  (exp(A)1)_i   node v_i  (exp(A)u_1)_i   node v_i  (exp(A)u_2)_i   node v_i  (exp(A)u_3)_i
v_9       10.00         v_4       1.02            v_4       1.61            v_6       3.25
v_6       9.56          v_3       1.00            v_6       1.47            v_4       2.29
v_1       9.31          v_9       0.60            v_7       1.20            v_8       2.29
v_7       5.34          v_1       0.20            v_3       1.19            v_9       2.21
v_10      5.34          v_7       0.19            v_1       0.81            v_1       1.12

Table 9: Comparing the top 5 ranked nodes in Figure 3.8 as broadcasters, where u_1 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]^T, u_2 = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]^T, and u_3 = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]^T.

Take the scenario of a driver who would like to drop off four students at the university. One of the students is going to node v4, the second one to v7, the third to v8, and the last one to node v10. The driver can only take them to one bus stop. Where should he stop his car? Table 9 indicates that it is best to take the students to v6, where each one of them rides the bus to his/her destination. Table 10 also shows that it is best that they all ride the bus to node v9, where the driver picks all of them up at the same time. It is noticeable that, as the number of nonzero elements in u increases, the ranking looks more and more like the ranking for u = 1.
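The targeted ranking just described is easy to sketch in code. The following Python/NumPy snippet uses a small hypothetical five-node directed graph (not the bus network of Figure 3.8) together with a homemade Taylor-series matrix exponential; the graph, the helper, and all numbers are illustrative assumptions, not data from the dissertation.

```python
import numpy as np

def expm(M, terms=40):
    # Matrix exponential by scaling and squaring with a truncated Taylor series.
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 1), 1.0)))))
    X = M / 2.0**s
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ X / k
        E = E + T
    for _ in range(s):
        E = E @ E
    return E

# Hypothetical directed network: v1->v2, v2->v3, v3->v4, v1->v3, v4->v5
# (0-based indices below).
n = 5
A = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (2, 3), (0, 2), (3, 4)]:
    A[i, j] = 1.0

# Target vector u: we are interested in reaching node v4 (index 3).
u = np.zeros(n)
u[3] = 1.0

broadcast = expm(A) @ u    # [exp(A)u]_i: contribution of node v_i to reaching v4
receive = expm(A.T) @ u    # [exp(A^T)u]_i: how well v_i gathers information from v4

rank_broadcast = np.argsort(-broadcast)  # best broadcasters toward v4 first
```

Here broadcast[i] sums the walks from node i to v4 weighted by 1/p!, including the trivial walk of length zero from v4 to itself, matching the decision in the text not to subtract the identity.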

3.5 Numerical Considerations

When the network has fairly few nodes and, therefore, the associated adjacency matrix A is small, evaluation of the matrix exponential is quite inexpensive. We then can calculate expressions of the forms (3.2.4), (3.2.5), (3.3.2), and (3.3.3) by first evaluating exp(A) and then computing the desired expression(s), where we may use that exp(A^T) = (exp(A))^T.

node vi  [exp(A^T)1]_i    node vi  [exp(A^T)u1]_i    node vi  [exp(A^T)u2]_i    node vi  [exp(A^T)u3]_i
v9       11.51            v6       1.37              v9       1.83              v9       3.67
v6        8.33            v4       1.02              v7       1.61              v6       2.94
v1        7.23            v1       0.61              v6       1.47              v1       2.39
v3        5.74            v7       0.59              v4       1.20              v2       2.22
v2        5.74            v8       0.59              v1       0.81              v7       2.22

Table 10: Comparing the top 5 ranked nodes in Figure 3.8 as receivers, where u1 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]^T, u2 = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]^T, and u3 = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]^T.

However, when the network has many nodes and, therefore, the adjacency matrix A is large, the explicit calculation of exp(A) is too expensive to be attractive. This section discusses how approximations of the expressions (3.2.4), (3.2.5), (3.3.2), and (3.3.3) can be evaluated fairly inexpensively for large adjacency matrices with the aid of the Arnoldi process.

Let ‖·‖ denote the Euclidean vector norm. Application of ℓ steps of the Arnoldi process to the matrix A with initial vector w ≠ 0 gives the decomposition

(3.5.1)  A W_ℓ = W_ℓ H_ℓ + g_ℓ e_ℓ^T,

where the matrix W_ℓ = [w_1, w_2, ..., w_ℓ] ∈ R^{n×ℓ} has orthonormal columns that span the Krylov subspace K_ℓ(A, w) = span{w, Aw, ..., A^{ℓ-1}w} with w_1 = w/‖w‖. The matrix H_ℓ ∈ R^{ℓ×ℓ} is of upper Hessenberg form, g_ℓ ∈ R^n satisfies W_ℓ^T g_ℓ = 0, and e_ℓ = [0, ..., 0, 1, 0, ..., 0]^T denotes the ℓth column of an identity matrix of appropriate order; see, e.g., Saad [60, Chapter 6] for further details on the Arnoldi process. We assume that ℓ is small enough so that the decomposition (3.5.1) with the stated properties exists. This is the generic situation. The computation of this decomposition requires the evaluation of ℓ matrix-vector products with the matrix A.

Expressions of the form exp(A)w are commonly approximated by the right-hand side of

exp(A)w ≈ W_ℓ exp(H_ℓ) e_1 ‖w‖;

see, e.g., [8, 45] for discussions. In particular, we obtain an approximation of (3.2.4) by letting w = 1. When A is large, the dominating computational work for calculating this approximation is the evaluation of the ℓ matrix-vector products required to determine the decomposition (3.5.1).

An approximation of the expression (3.2.5) can be determined similarly: we apply the Arnoldi process to the matrix A^T with initial vector ŵ = 1. This gives the decomposition

(3.5.2)  A^T Ŵ_ℓ = Ŵ_ℓ Ĥ_ℓ + ĝ_ℓ e_ℓ^T,

which is analogous to (3.5.1). We then evaluate the right-hand side of

(3.5.3)  exp(A^T) ŵ ≈ Ŵ_ℓ exp(Ĥ_ℓ) e_1 ‖ŵ‖.

Again, when A is large, the dominating computational effort to calculate this approximation is the evaluation of the ℓ matrix-vector products with A^T needed to determine the decomposition (3.5.2).

We turn to the approximation of the expression (3.3.2) for k = 1. Extension to the situation when k > 1 is straightforward. The expression (3.3.3) can be computed in a similar fashion. We first compute the Arnoldi decomposition (3.5.2) with initial vector ŵ = 1 and then evaluate the Arnoldi decomposition (3.5.1) with initial vector w = Ŵ_ℓ exp(Ĥ_ℓ) e_1 ‖1‖. This gives the approximation

(3.5.4)  W_ℓ exp(H_ℓ) e_1 ‖w‖

of (3.3.2). The following result sheds some light on this approximation.

Proposition 3.5.1. Let f be a polynomial of degree at most ℓ − 1 and let ŵ be an initial vector for the Arnoldi decomposition (3.5.2). Consider the approximation

Ŵ_ℓ f(Ĥ_ℓ) e_1 ‖ŵ‖

of f(A^T) ŵ. Use the above vector as initial vector w for the decomposition (3.5.1) and compute the approximation

(3.5.5)  W_ℓ f(H_ℓ) e_1 ‖w‖

of f(A) f(A^T) ŵ. Then this approximation is exact. We assume that the required Arnoldi decompositions can be computed without breakdown of the Arnoldi process.

Proof. Consider the decomposition (3.5.1). It is well known that for any polynomial f of degree at most ℓ − 1, we have

f(A)w = W_ℓ f(H_ℓ) e_1 ‖w‖;

see, e.g., [8]. Clearly, an analogous result holds if the decomposition (3.5.1) is replaced by the decomposition (3.5.2). The desired result follows.
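The two-stage construction of Proposition 3.5.1 can be checked numerically. In the sketch below (Python/NumPy; the random test matrix, ℓ = 6, and the degree-5 polynomial with coefficients 1/k! are illustrative assumptions), the relative error is at the level of rounding errors, as the proposition predicts.

```python
import math
import numpy as np

def arnoldi(A, w, ell):
    # ell steps of the Arnoldi process (assumes no breakdown).
    n = A.shape[0]
    W = np.zeros((n, ell + 1))
    H = np.zeros((ell + 1, ell))
    W[:, 0] = w / np.linalg.norm(w)
    for p in range(ell):
        v = A @ W[:, p]
        for i in range(p + 1):
            H[i, p] = W[:, i] @ v
            v = v - H[i, p] * W[:, i]
        H[p + 1, p] = np.linalg.norm(v)
        W[:, p + 1] = v / H[p + 1, p]
    return W[:, :ell], H[:ell, :ell]

def poly_mat(M, coeffs):
    # Evaluate f(M) = sum_k coeffs[k] M^k by Horner's rule.
    R = coeffs[-1] * np.eye(M.shape[0])
    for c in reversed(coeffs[:-1]):
        R = R @ M + c * np.eye(M.shape[0])
    return R

rng = np.random.default_rng(1)
n, ell = 30, 6
A = (rng.random((n, n)) < 0.2).astype(float)
wb = np.ones(n)
coeffs = [1.0 / math.factorial(k) for k in range(ell)]  # f of degree ell - 1
e1 = np.zeros(ell)
e1[0] = 1.0

# Stage 1: Arnoldi with A^T, initial vector wb; gives f(A^T) wb exactly.
Wh, Hh = arnoldi(A.T, wb, ell)
w = Wh @ (poly_mat(Hh, coeffs) @ e1) * np.linalg.norm(wb)

# Stage 2: Arnoldi with A, initial vector w; gives f(A) f(A^T) wb exactly.
W2, H2 = arnoldi(A, w, ell)
approx = W2 @ (poly_mat(H2, coeffs) @ e1) * np.linalg.norm(w)

exact = poly_mat(A, coeffs) @ (poly_mat(A.T, coeffs) @ wb)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```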

The approximation (3.5.5) of f(A) f(A^T) ŵ requires the evaluation of 2ℓ matrix-vector products, ℓ with each one of the matrices A and A^T. When the adjacency matrix A is stored in a format that makes the evaluation of matrix-vector products with A^T more expensive than with A, it may be tempting to carry out 2ℓ steps of the Arnoldi process applied to A with initial vector ŵ and then use the low-rank matrices

(3.5.6)  exp(A) ≈ W_{2ℓ} exp(H_{2ℓ}) W_{2ℓ}^T,  exp(A^T) ≈ W_{2ℓ} exp(H_{2ℓ}^T) W_{2ℓ}^T,

to approximate exp(A) exp(A^T) ŵ. The evaluation of this approximation requires the same number of matrix-vector product evaluations as the approach described in Proposition 3.5.1. However, no analogue of this proposition is available for the approximation (3.5.6) and, indeed, this approximation typically is of significantly worse quality than an approximation computed by the approach of Proposition 3.5.1.

We conclude this section with a comparison of the evaluation of the expression (3.3.2) for k = 1 for the matrix A of Section 3.4.2.2 by explicitly computing the matrix exponential exp(A) and by evaluating Arnoldi decompositions as described by Proposition 3.5.1. The matrix A is of order 3891. The computation of the matrix exponential M = exp(A/2) using the MATLAB function expm required 1222.15 seconds (≈ 20.37 minutes).¹ This is the dominating work. Let w = [1, 1, ..., 1]^T ∈ R^3891. Having the matrix M, we can calculate BNR(d, 1) by evaluating two matrix-vector products, BNR(d, 1) = M(M^T w). The top 10 ranked nodes obtained in this manner are shown in column 3 in the top part of Table 8.

We turn to the approximation of the expression (3.3.2) with the aid of Arnoldi decompositions. First we compute the decomposition (3.5.2) for ℓ = 6 and initial vector w. This required only 0.037 seconds and gives the approximation ẑ := Ŵ_6 exp(Ĥ_6) e_1 ‖w‖ of M^T w. The total time needed to compute ẑ was 0.077 seconds. Next we evaluate the Arnoldi decomposition (3.5.1) with ℓ = 6 and initial vector ẑ. The calculation of this decomposition required 0.023 seconds.

¹All computations were carried out in MATLAB with about 15 significant decimal digits on a Lenovo ideapad 510 laptop computer with a 2.5 GHz Intel Core i7 processor and 6 GB 2133 MHz DDR4 memory.

Algorithm 4: Calculating exp(A/2) exp(A^T/2) w using Arnoldi iterations

% Load the adjacency matrix A of the gene network.
load genenetwork;
Adj = A;

k = 12;   % total number of Arnoldi iterations
nb = 10;  % number of top results we would like to see

[n, n] = size(Adj);
I = eye(n);
e1 = I(:, 1);
vector_one = ones(n, 1);
vector_one_unit = vector_one / norm(vector_one);

half_k = k/2;
% Arnoldi decomposition of Adj' with initial vector 1/||1||; cf. (3.5.2).
[Q_half_e1, H_half_e1] = Arnoldi(Adj', vector_one_unit, half_k);
Hk_half_e1 = H_half_e1(1:half_k, 1:half_k);
Qk_half_e1 = Q_half_e1(:, 1:half_k);
w = Qk_half_e1 * expm(Hk_half_e1) * e1(1:half_k, 1) * norm(vector_one);

w_unit = w / norm(w);
% Arnoldi decomposition of Adj with initial vector w/||w||; cf. (3.5.1).
[Q_half_w, H_half_w] = Arnoldi(Adj, w_unit, half_k);
Hk_half_w = H_half_w(1:half_k, 1:half_k);
Qk_half_w = Q_half_w(:, 1:half_k);
z = Qk_half_w * expm(Hk_half_w) * e1(1:half_k, 1) * norm(w);

function [Q, H] = Arnoldi(A, b, k)
% Arnoldi iteration, k steps, initial vector b.
[n, n] = size(A);
Q = b / norm(b);
for p = 1:k
    v = A * Q(:, p);
    for i = 1:p
        H(i, p) = Q(:, i)' * v;
        v = v - H(i, p) * Q(:, i);
    end
    H(p+1, p) = norm(v);
    Q(:, p+1) = v / norm(v);
end
end

The difference in time required to compute the decompositions (3.5.2) and (3.5.1) depends on the storage format for the matrix A used by MATLAB. The total time needed to compute an approximation of (3.3.2) for k = 1 in this manner is only 0.129 seconds. The top 10 ranked nodes are those of column 3 in the top part of Table 8. Thus, the application of the Arnoldi process twice with ℓ = 6 as described by Proposition 3.5.1 gives the same ranking of the nodes as the evaluation of exp(A/2) exp(A^T/2) w and requires much less time. Algorithm 4, written in MATLAB, follows the described process step by step.

CHAPTER 4

Edge Importance in a Network Via Line Graphs and the Matrix Exponential

4.1 Incidence and Exsurgence Matrices

Assume now that G is directed and unweighted. We use the definitions of the incidence matrix B^i and the exsurgence matrix B^e of G as described in Section 2.4.2, where B^i_{ij} = 1 if edge e_j incides on node v_i, B^e_{ij} = 1 if e_j exsurges from v_i, and the entries are zero otherwise.

Proposition 4.1.1. Let G be a directed unweighted graph, and let B^i = [B^i_{ij}] and B^e = [B^e_{ij}] denote the associated incidence and exsurgence matrices. Then

(i) each column of B^i and of B^e contains exactly one entry equal to 1, with the remaining entries of the column zero,

(ii) A = [A_{jk}] = B^e B^{iT}. Moreover, the entries of B^e and B^i are such that in each sum

(4.1.1)  A_{jk} = Σ_{ℓ=1}^{m} B^e_{jℓ} B^i_{kℓ},  1 ≤ j, k ≤ n,

there is at most one nonvanishing term. Each nonvanishing element B^e_{jℓ} of B^e is paired with precisely one nonvanishing entry B^i_{kℓ} of B^i. It follows that each nonvanishing entry A_{jk} can be written as B^e_{jℓ} B^i_{kℓ} for precisely one index ℓ ∈ {1, 2, ..., m}. Moreover, each entry B^e_{jℓ} and each entry B^i_{kℓ} determine precisely one entry A_{jk}.

Proof. Statement (i) follows from the definition of the matrices B^i and B^e. The factorization in (ii) expresses the entries of A in terms of inciding and exsurging edges. Each nonvanishing term of the sum (4.1.1) represents an edge from node v_j to node v_k. Since the network is assumed to have simple edges only, there can be at most one nonvanishing term in each one of the sums (4.1.1). The fact that each nonvanishing element B^e_{jℓ} is paired with precisely one nonvanishing entry B^i_{kℓ} follows from the observation that an exsurgent edge has to lead somewhere, and cannot have more than one destination.
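The factorization A = B^e B^{iT} is easy to verify on a small example. The following Python/NumPy sketch uses a hypothetical three-node directed graph (our own illustrative choice).

```python
import numpy as np

# Hypothetical directed graph with edges e1 = v1->v2, e2 = v2->v3,
# e3 = v3->v1, e4 = v1->v3 (0-based indices below).
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
n, m = 3, len(edges)

Bi = np.zeros((n, m))  # incidence:  Bi[v, k] = 1 if edge e_k incides on v
Be = np.zeros((n, m))  # exsurgence: Be[v, k] = 1 if edge e_k exsurges from v
for k, (u, v) in enumerate(edges):
    Be[u, k] = 1.0
    Bi[v, k] = 1.0

A = Be @ Bi.T  # adjacency matrix: A[j, k] = 1 iff there is an edge v_j -> v_k
```

Each column of Bi and Be contains exactly one entry 1, as stated in part (i) of the proposition.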

4.2 Line Graphs

4.2.1 Line Graphs of an Undirected Graph

Given an undirected graph G = (V, E), the line graph of G is an undirected graph G* = (E, F), in which there is an edge f ∈ F that connects the nodes e, e′ ∈ E if and only if there is a node v ∈ V such that both e and e′ incide on v in G. Line graphs have particular characteristics. For example, each node v in G induces a clique (a complete subgraph, that is, a set of nodes that are all connected to each other) in G*, containing all e ∈ E that incide on v. In fact, the collection of cliques produced by nodes in G with degree at least 2 creates a partition of F; see [55]. If B is the incidence matrix for G, then it can easily be shown that E = B^T B − 2I is the adjacency matrix for G*. Throughout this chapter, I stands for the identity matrix of suitable order.

We refer to E as the line graph adjacency matrix.
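The identity E = B^T B − 2I can be verified directly; the sketch below uses a hypothetical undirected graph (a 4-cycle with one diagonal), an illustrative choice of our own.

```python
import numpy as np

# Undirected graph: 4-cycle v1-v2-v3-v4 plus the diagonal v1-v3
# (0-based indices below).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n, m = 4, len(edges)

B = np.zeros((n, m))  # incidence matrix: B[v, k] = 1 if v is an endpoint of e_k
for k, (u, v) in enumerate(edges):
    B[u, k] = 1.0
    B[v, k] = 1.0

E = B.T @ B - 2.0 * np.eye(m)  # line graph adjacency matrix
```

The subtraction of 2I removes the contribution of each edge's two endpoints to its own diagonal entry, so E has a zero diagonal and E[k, l] = 1 exactly when edges e_k and e_l share a node.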

4.2.2 Line Graphs of a Directed Graph

Defining a single line graph for a directed graph is difficult, as there are several possible choices, none of them canonical. We therefore introduce four line graphs that capture different relationships between the edges of a directed graph.

Definition 1. Let G = (V, E) be a directed graph. We define the following associated line graphs:

1. The undirected co-incidence line graph G∨ = (E, F∨), in which distinct edges e, e′ ∈ E are connected if and only if both e and e′ incide on the same node in G.

2. The undirected co-exsurgence line graph G∧ = (E, F∧), in which distinct edges e, e′ ∈ E are connected if and only if both e and e′ exsurge from the same node in G.

3. The directed continuation line graph G→ = (E, F→), in which the edge e is connected to the edge e′ if e incides on v and e′ exsurges from v for some node v in G.

For completeness, we also define a fourth line graph: the reverse continuation line graph G← = (E, F←), where (e, e′) ∈ F← if and only if (e′, e) ∈ F→. The line graph G→ is well known; it is described, e.g., in [38, page 265]. The line graphs G∨ and G∧ are new. It is worth noting that for an undirected graph G∨ = G∧ = G→ = G←, because there is no distinction between the types of edge connections described above.

The following properties are easily shown:

Proposition 4.2.1. Let G = (V, E) be a directed graph.

1. G∨ and G∧ have no self-loops; e ∈ E has a self-loop in G→ if and only if e is a self-loop in G.

2. G∨ partitions E into edge- and vertex-disjoint cliques, with one clique for each v ∈ V with positive indegree. The same happens with G∧, with one clique for each v with positive outdegree.

3. Suppose (e, e′) ∈ F→. Then {e, e″} ∈ F∨ implies that (e″, e′) ∈ F→, and {e′, e″} ∈ F∧ implies that (e, e″) ∈ F→.

4. B^e B^{eT} and B^i B^{iT} are n × n diagonal matrices containing the outdegrees and indegrees, respectively, of the nodes of G.

5. The adjacency matrices for G∨, G∧, G→, and G← are given by E∨ = B^{iT} B^i, E∧ = B^{eT} B^e, E→ = B^{iT} B^e, and E← = B^{eT} B^i, respectively. Note that for undirected networks E∨ = E∧ = E→ = E← = E.

6. Let B^+ = [B^e  B^i] ∈ R^{n×2m} and define E^+ = B^{+T} B^+ ∈ R^{2m×2m}. Then

(4.2.1)  E^+ = [ E∧  E← ]
               [ E→  E∨ ].

The matrix E→ is known as the line graph adjacency matrix associated with the graph G; see [64]. To distinguish this matrix from the adjacency matrices for other line graphs, we refer to E+ as the extended line graph adjacency matrix of G. It is a symmetric matrix, and it can be interpreted as the adjacency matrix of an undirected graph with two copies of the set of edges E.
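The four line graph adjacency matrices and the block structure (4.2.1) can be formed directly from B^i and B^e. A Python/NumPy sketch on a hypothetical three-node directed graph (an illustrative choice of our own):

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 0), (0, 2)]  # hypothetical directed graph
n, m = 3, len(edges)
Bi = np.zeros((n, m))
Be = np.zeros((n, m))
for k, (u, v) in enumerate(edges):
    Be[u, k] = 1.0  # e_k exsurges from u
    Bi[v, k] = 1.0  # e_k incides on v

E_coin = Bi.T @ Bi  # co-incidence line graph G-vee:   same head node
E_coex = Be.T @ Be  # co-exsurgence line graph G-wedge: same tail node
E_cont = Bi.T @ Be  # continuation line graph G->:      e ends where e' starts
E_rev = Be.T @ Bi   # reverse continuation line graph G<-

Bplus = np.hstack([Be, Bi])  # B^+ = [B^e  B^i]
Eplus = Bplus.T @ Bplus      # extended line graph adjacency matrix
```

The blocks of Eplus reproduce (4.2.1), and E← is the transpose of E→, mirroring the definition of the reverse continuation line graph.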

4.3 Edge Weights

The algebraic approach is, clearly, crucial for the use of matrix functions like the matrix exponential (see Section 4.4.1 below), which are important tools to assess the importance of a node or an edge. However, the choice of a particular matrix representation implies some assumptions about the network, as well as limitations in the kinds of networks that can be represented. For example, a node-node adjacency matrix with all entries zero or one cannot accommodate multiple edges between the same pair of nodes, unless we are willing to represent them with weights that count the number of edges.

This section considers directed graphs with weighted edges. The weights are assumed to be positive. The interpretation of the weights depends on the application. In general, edge weights correspond to a capacity or speed of transportation, or the reciprocal of a transfer or communication cost.

Let Ã denote an edge-weighted adjacency matrix. This matrix is obtained by associating a positive weight with each edge. Thus, the (ij)th entry of Ã is the weight of the edge from node v_i to node v_j. We refer to this matrix as edge-scaled. The "unweighted" adjacency matrix A that is associated with Ã has all edge weights equal to one. Thus, the entries of A belong to {0, 1}.

Theorem 4.3.1. Let Ã = [Ã_{ij}] be the n × n weighted adjacency matrix of a directed edge-weighted graph G of n nodes and m edges. Let z_k > 0 denote the weight of edge e_k for 1 ≤ k ≤ m. Define the diagonal matrix Z with diagonal entries z_1, z_2, ..., z_m in some order. Let A = B^e B^{iT} be the adjacency matrix for the unweighted directed graph associated with G, where B^i = [B^i_{ij}] and B^e = [B^e_{ij}] denote the incidence and exsurgence matrices for the unweighted graph; see Proposition 4.1.1. Then Ã = B^e Z B^{iT}. In particular, each nonvanishing entry of Ã equals one of the diagonal entries of the matrix Z, and each diagonal entry of Z corresponds to precisely one nonvanishing entry of Ã.

Proof. The result follows from Proposition 4.1.1. Each column of B^e has precisely one nonvanishing entry 1. Therefore, B^e Z is a weighted exsurgence matrix, with each diagonal entry z_j appearing in exactly one column. The theorem now follows from part (ii) of Proposition 4.1.1 with B^e replaced by B^e Z.

The matrix Z = diag[z_1, z_2, ..., z_m] of Theorem 4.3.1 can be factored according to Z = Z^e Z^i, where Z^e = diag[z^e_1, z^e_2, ..., z^e_m] and Z^i = diag[z^i_1, z^i_2, ..., z^i_m] have positive diagonal entries. Then B̃^e = B^e Z^e is a weighted exsurgence matrix, such that each entry z^e_j of Z^e appears in exactly one column. Similarly, B̃^i = B^i Z^i is a weighted incidence matrix, such that each entry z^i_j of Z^i appears in precisely one column. The weighted line graph G→ is defined by the matrix Ẽ→ = B̃^{iT} B̃^e. We will use the matrices Z^e = Z^i = Z^{1/2} in the computed examples reported in this chapter, but other choices of Z^e and Z^i also are possible.
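A quick numerical check of Theorem 4.3.1 and of the splitting Z^e = Z^i = Z^{1/2} follows (Python/NumPy; the graph and the weights are hypothetical choices of our own).

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 0), (0, 2)]   # hypothetical directed graph
weights = np.array([2.0, 0.5, 3.0, 1.5])   # positive edge weights z_k
n, m = 3, len(edges)
Bi = np.zeros((n, m))
Be = np.zeros((n, m))
for k, (u, v) in enumerate(edges):
    Be[u, k] = 1.0
    Bi[v, k] = 1.0

Z = np.diag(weights)
A_tilde = Be @ Z @ Bi.T          # weighted adjacency matrix, Theorem 4.3.1

Zhalf = np.diag(np.sqrt(weights))  # Z^e = Z^i = Z^(1/2)
Be_w = Be @ Zhalf                  # weighted exsurgence matrix
Bi_w = Bi @ Zhalf                  # weighted incidence matrix
E_cont_w = Bi_w.T @ Be_w           # weighted continuation line graph
```

Each nonzero entry of A_tilde is one of the diagonal entries of Z, and the factored form B̃^e B̃^{iT} reproduces Ã up to rounding.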

For certain (unweighted) adjacency matrices A = BeBiT and weighting matrices

Z, the weighted adjacency matrix A˜ = BeZBiT can be expressed by row and/or column scaling of A. We summarize this in the following proposition, whose proof is straightforward.

e iT Proposition 4.3.2. Let A = [Aij] = B B be an n × n unweighted adjacency

e iT n×n matrix with m edges, let A˜ = [A˜ij] = B ZB ∈ R be an associated edge-weighted adjacency matrix, and let W = diag[w1, w2, ... , wn] be a diagonal matrix with positive diagonal entries. Then (i) A˜ = AW if and only if the weighting matrix Z is such that

A˜ij = Aijwj for all 1 ≤ i, j ≤ n, (ii) A˜ = WA if and only if the weighting matrix

Z is such that A˜ij = Aijwi for all 1 ≤ i, j ≤ n, (iii) A˜ = W AW if and only if the weighting matrix Z is such that A˜ij = Aijwiwj for all 1 ≤ i, j ≤ n.

The significance of the above result is that we may express the edge-weighting defined by the edge-weighted adjacency matrix A˜ in terms of row or column scaling of the “unweighted” adjacency matrix A.

4.3.1 Example

Consider the cyclic upper Hessenberg adjacency matrix

        [ 0  0  0  ···  0  1 ]
        [ 1  0  0  ···  0  0 ]
A =     [ 0  1  0  ···  0  0 ]   ∈ R^{n×n}.
        [ ⋮        ⋱       ⋮ ]
        [ 0  0  ···    1  0 ]

Then, for any weighting matrix Z, the edge-weighted matrix Ã can be expressed as Ã = AW_1 and Ã = W_2 A for suitable diagonal weighting matrices W_1 and W_2 with positive diagonal entries.

4.4 Computing the Most Important Edges in an Undirected Network by the Matrix Exponential

4.4.1 Review of the Adjacency Matrix Exponential

We saw in Chapter 3 that for a symmetric unweighted adjacency matrix A ∈ R^{n×n}, the (ij)th entry of A^p gives the number of walks of length p between nodes v_i and v_j. The (ij)th entry of the matrix function

(4.4.1)  f(A) = Σ_{p=0}^{∞} c_p A^p

gives a weighted average of the number of walks of various lengths between the nodes v_i and v_j.

The entry [f(A)]_{ii} defines the subgraph centrality of the node v_i, and the entry [f(A)]_{ij}, with i ≠ j, defines the communicability between the nodes v_i and v_j. A relatively large value [f(A)]_{ii} indicates that node v_i is important, and a relatively large value [f(A)]_{ij}, i ≠ j, suggests that communication between the nodes v_i and v_j is relatively easy; see, e.g., [27, 30]. Another importance measure for nodes is furnished by the row sums of the function (4.4.1), i.e., by the entries [f(A)1]_i, 1 ≤ i ≤ n, where 1 = [1, 1, ..., 1]^T. A relatively large value of [f(A)1]_i suggests that the node v_i is important; see [12] as well as [34, Section 2] and [43].

4.4.2 Exponential of the Line Graph Adjacency Matrix for Undirected Graphs

We seek to rank the edges of graphs and first consider undirected graphs. Let E ∈ R^{m×m} be the line graph adjacency matrix for an undirected graph with n nodes and m edges; see Section 4.2.1 for its definition. We compute the matrix exponential of E and obtain, analogously to (4.4.1),

(4.4.2)  exp(E) = Σ_{p=0}^{∞} (1/p!) E^p.

Similarly to the discussion in [30] and above, we can interpret the entries of exp(E) as indicators of the centrality and communicability of the edges of the graph. For instance, a relatively large diagonal entry [exp(E)]_{kk} indicates that the edge e_k is well connected to other edges through the network and therefore is important. Similarly, a relatively large off-diagonal entry [exp(E)]_{kl}, k ≠ l, suggests that information that travels via edge e_k is likely to also travel via edge e_l. One also may define the centrality of the edge e_k as [exp(E)1]_k. We will use the latter measure and define the edge line graph centrality of an edge e_k between the nodes v_i and v_j as

(4.4.3)  eLC_k = [exp(E)1]_k.
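The centrality (4.4.3) can be computed in a few lines. The following Python/NumPy sketch uses a hypothetical undirected graph, a 4-cycle with one diagonal edge (our own illustrative choice, not the graphs of the examples below); by symmetry the four cycle edges must tie, while the diagonal, which touches every other edge, must rank first.

```python
import numpy as np

def expm(M, terms=40):
    # Matrix exponential by scaling and squaring with a truncated Taylor series.
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 1), 1.0)))))
    X = M / 2.0**s
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ X / k
        E = E + T
    for _ in range(s):
        E = E @ E
    return E

# Hypothetical undirected graph: 4-cycle plus diagonal edge e5 = (v1, v3).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n, m = 4, len(edges)
B = np.zeros((n, m))
for k, (u, v) in enumerate(edges):
    B[u, k] = 1.0
    B[v, k] = 1.0

E = B.T @ B - 2.0 * np.eye(m)   # line graph adjacency matrix
eLC = expm(E) @ np.ones(m)      # edge line graph centrality (4.4.3)
best = int(np.argmax(eLC))      # index of the most central edge
```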

The following examples illustrate the ranking of edges using this measure.

4.4.2.1 Example

Consider the graph of Figure 4.1 and the associated line graph of Figure 4.2. We can see from the graph and the line graph that edge e3 is the most important edge, because it is directly connected to all the other edges in the graph. The symmetry of the graph and the line graph suggests that the edges e1, e2, e4, and e5 are equally important. Table 1 confirms this by computing the edge line graph centrality (4.4.3).

[Graph drawings omitted.]

Figure 4.1: Graph of Example 4.4.2.1. Figure 4.2: Line graph of Example 4.4.2.1.

edge e_k   eLC_k
e3         29.73
e2         24.11
e4         24.11
e1         24.11
e5         24.11

Table 1: Ranking of the edges of Example 4.4.2.1 by the centrality measure (4.4.3).

4.4.2.2 Example

The graph for this example is shown in Figure 4.3. Visual inspection suggests that the edge e2 is the most important edge of the graph. Looking at the graph, one might guess that the edge e1 comes next in importance. However, Table 2 shows the edges e3, e4, and e6 to be ranked higher. The line graph for the graph, shown in Figure 4.4, sheds light on this ordering. It shows the edges e2, e3, e4, and e6 to be well connected. This example illustrates that looking at a graph may not always give a good idea of which edges are the most important ones.

[Graph drawings omitted.]

Figure 4.3: Graph of Example 4.4.2.2. Figure 4.4: Line graph of Example 4.4.2.2.

edge e_k   eLC_k
e2         28.07
e4         23.68
e3         23.68
e6         23.68
e1         17.23
e7         10.61
e5         10.61

Table 2: Ranking of the edges of Example 4.4.2.2 by using the centrality measure (4.4.3).

4.4.3 A Comparison of Downdating Methods for Undirected Graphs

Let the adjacency matrix A define an undirected graph. We would like to remove an edge from this graph so that the total network communicability, defined by

(4.4.4)  TC(A) = 1^T exp(A) 1,

and considered by Benzi and Klymko [12], decreases the least. This problem is discussed by Arrigo and Benzi [4]. Another communicability measure used in the latter paper is the Estrada Index [28, 30] defined by

(4.4.5)  EE(A) = tr(exp(A)) = Σ_{i=1}^{n} [exp(A)]_{ii}.

There are several ways to measure the importance of an edge. Arrigo and Benzi [4] define the edge total communicability centrality of an existing edge between the nodes v_i and v_j as

(4.4.6)  eTC(v_i, v_j) = [exp(A)1]_i [exp(A)1]_j,

and the edge subgraph centrality of an edge between the nodes v_i and v_j as

(4.4.7)  eSC(v_i, v_j) = [exp(A)]_{ii} [exp(A)]_{jj}.

In this chapter, we propose to compute the exponential of the line graph adjacency matrix and remove edges with the lowest centrality defined by (4.4.3). For this purpose we introduce the total line graph centrality measure of the network with line graph adjacency matrix E,

(4.4.8)  LC(A) = Σ_{k=1}^{m} [exp(E)1]_k = 1^T exp(E) 1.

This approach is shown to be competitive with the approach of removing edges with the lowest edge centrality defined by (4.4.6) and (4.4.7), as proposed in [4], and we will show examples where it decreases the total communicability less than the other approaches.

The following examples illustrate the use of the measures described above. We show two examples, for which removing the three least important edges from a graph, using the exponential of the edge line graph to identify these edges, results in a graph with a higher communicability with respect to all the measures (4.4.3), (4.4.5), and (4.4.6).

In the first example, we compute the exponential of the line graph, remove the three least important edges, and then calculate the measures after removal. This approach is referred to as nongreedy downdating in [12]. The second example recalculates the measures after the removal of each edge, to make the decision of removal of the next edge based on the updated graph. This approach is called greedy downdating in [12].

[Graph drawings omitted.]

Figure 4.5: Undirected graph and associated line graph. (a) The graph for Example 4.4.3.1. (b) The line graph of Example 4.4.3.1.
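Greedy downdating can be sketched as follows (Python/NumPy). The test graph is hypothetical, and for brevity the removal criterion used here is the total network communicability (4.4.4); the same loop works with any of the edge centrality measures discussed in this section.

```python
import numpy as np

def expm(M, terms=40):
    # Matrix exponential by scaling and squaring with a truncated Taylor series.
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 1), 1.0)))))
    X = M / 2.0**s
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ X / k
        E = E + T
    for _ in range(s):
        E = E @ E
    return E

def total_communicability(A):
    one = np.ones(A.shape[0])
    return float(one @ expm(A) @ one)   # TC(A) = 1^T exp(A) 1

def connected(A):
    # Depth-first search from node 0 on the undirected graph.
    n = A.shape[0]
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for u in range(n):
            if A[v, u] > 0 and u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == n

def greedy_downdate(A, k):
    # Remove k edges one at a time; at each step drop the edge whose removal
    # decreases TC(A) the least while keeping the graph connected.
    A = A.copy()
    n = A.shape[0]
    for _ in range(k):
        best_edge, best_tc = None, -np.inf
        for i in range(n):
            for j in range(i + 1, n):
                if A[i, j] > 0:
                    A[i, j] = A[j, i] = 0.0
                    if connected(A):
                        tc = total_communicability(A)
                        if tc > best_tc:
                            best_edge, best_tc = (i, j), tc
                    A[i, j] = A[j, i] = 1.0
        i, j = best_edge
        A[i, j] = A[j, i] = 0.0
    return A

# Hypothetical 5-node test graph with 7 edges.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

A3 = greedy_downdate(A, 3)
```

The nongreedy variant differs only in that all k edges are chosen from the ranking computed once, before any removal.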

4.4.3.1 Nongreedy Downdating Example

Consider the connected network shown in Figure 4.5. We use different edge cen- trality measures to decide which three edges should be eliminated from the network so that the total communicability is decreased the least, with the constraint that the graph obtained after edge removal should be connected.

Table 3 shows the edges with the lowest centrality as measured by several centrality measures defined above. The first method computes the edge total communicability (4.4.6), the second one ranks the edges according to the edge subgraph centrality measure (4.4.7), and the third one uses the edge line graph centrality (4.4.3). We then remove the three edges with the lowest centrality and determine which method decreases the total communicability the least.

edge e_k  eTC(v_i, v_j)    edge e_k  eSC(v_i, v_j)    edge e_k  eLC_k
e2        549.51           e2        18.82            e7        85.08
e10       549.51           e10       18.82            e9        85.08
e6        581.11           e6        18.96            e5        85.08
e9        581.11           e9        18.96            e6        97.91

Table 3: The least important edges in Figure 4.5(a) ordered according to increasing importance by using the edge total communicability, the edge subgraph centrality, and the edge line graph centrality. The edge ek connects the nodes vi and vj.

When using the edge total communicability centrality measure or the edge subgraph centrality measure, as suggested in [4], the first 4 columns of Table 3 indicate that the edges e2, e6, and e9 are to be removed. The graph obtained after removal of these edges is shown in Figure 4.6(a). We note that while some edges have a smaller centrality measure than the ones we remove, those edges are not removable because this would disconnect the graph. We then use the edge line graph centrality measure, see the last two columns of Table 3 for the edge ranking, to conclude that the edges e5, e7, and e9 are to be removed. The graph obtained after removal of these edges is shown in Figure 4.6(b).

We compute the centrality measures TC(A), EE(A), and LC(A) for both graphs of Figure 4.6 and report them in Table 4. In each case, we obtain better connectivity for the graph of Figure 4.6(b) than for the graph of Figure 4.6(a), i.e., when we use the edge line graph centrality measure to determine which edges to remove.

                                        Figure 4.6(a)   Figure 4.6(b)
Total network communicability TC(A)         86.22           92.31
Estrada Index EE(A)                         20.68           21.16
Total line graph centrality LC(A)         2144.41         3104.77

Table 4: Comparison of various network connectivity measures for the downdated graphs in Figure 4.6 obtained by the method in [4] versus using the exponential of the line graph. The reduced graph is required to be connected.

[Graph drawings omitted.]

Figure 4.6: The nongreedy downdated network of Figure 4.5(a). (a) Graph obtained by removing edges using the technique in [4]. (b) Graph obtained by removing edges using the exponential of the line graph matrix.

4.4.3.2 Greedy Downdating Example

We again consider the network in Figure 4.5 and remove the three edges with the lowest centrality using the same measures as in the previous example, but here we update the measures after each edge removal. Like in the previous example, we require the graph obtained after edge removal to be connected. The edge total communicability centrality and the edge subgraph centrality yield the graph in Figure 4.7(a), while using the exponential of the adjacency matrix for the line graph gives the graph of Figure 4.7(b).

[Graph drawings omitted.]

Figure 4.7: The greedy downdated network of Figure 4.5(a). (a) Graph obtained by using the greedy technique in [4]. (b) Graph obtained by using the exponential of the line graph.

We compute the network centrality measures TC(A), EE(A), and LC(A) for the graphs of Figure 4.7. These measures are reported in Table 5 and are larger than the entries of Table 4 for the nongreedy algorithm. This means that the greedy approach gives graphs with better connectivity than the nongreedy one. Moreover, the graph of Figure 4.7(b) has better connectivity than the graph of Figure 4.7(a). This also holds for numerous other examples. This strongly suggests that edge removal by the greedy approach based on using the exponential of the line graph adjacency matrix generally is preferable.

                                        Figure 4.7(a)   Figure 4.7(b)
Total network communicability TC(A)         89.04           92.52
Estrada Index EE(A)                         20.91           22.11
Total line graph centrality LC(A)         2507.77         3119.72

Table 5: Comparison of network connectivity measures for the downdated graphs in Figure 4.7 obtained by the method in [4] versus the use of the exponential of the adjacency matrix for the line graph. The reduced graph is required to be connected.

4.4.3.3 Downdating Example When the Reduced Graph is Not Required to be Connected

In Examples 4.4.3.1 and 4.4.3.2, we downdated the graphs with the requirement that the reduced graph be connected, as was done in [12]. In this example, we remove three edges with the smallest centrality of the graph of Figure 4.5(a) without requiring the reduced graph to be connected. The edge centrality measures used are the same as in Examples 4.4.3.1 and 4.4.3.2.

When measuring edge importance by the edge total communicability centrality and the edge subgraph centrality, we obtain the disconnected reduced graph in Figure 4.8 both when we apply the nongreedy and greedy methods described in Examples 4.4.3.1 and 4.4.3.2, respectively. Using the exponential of the adjacency matrix for the line graph gives the graph of Figure 4.6(b) for the nongreedy method, and the graph of Figure 4.7(b) for the greedy method.

Table 6 reports the network centrality measures TC(A), EE(A), and LC(A) for the graphs obtained. The graphs of Figures 4.6(b) and 4.7(b) have larger connectivity measures than the graph of Figure 4.8, except for the Estrada Index, which is smaller but close. We conclude that edge removal by using the exponential of the line graph adjacency matrix E can be competitive also when we do not require the reduced graph to be connected.

[Graph drawing omitted.]

Figure 4.8: Downdated graph of Figure 4.5(a) obtained by using the nongreedy or greedy techniques in [4] without requiring the network to stay connected.

                                        Figure 4.8   Figure 4.6(b)   Figure 4.7(b)
Total network communicability TC(A)        91.5          92.31           92.52
Estrada Index EE(A)                        22.33         21.16           22.11
Total line graph centrality LC(A)        2501.24       3104.77         3119.72

Table 6: Comparison of network connectivity measures for the downdated graphs in Figure 4.8 obtained by the method in [4] versus the use of the exponential of the adjacency matrix for the line graph. The reduced graph is not required to be connected.

4.4.4 A Comparison of Downdating Methods for Directed Graphs

In this section the adjacency matrix A corresponds to a directed graph. We would like to remove edges of a directed graph so that the communicability of the network is decreased the least. A measure used by Arrigo and Benzi [3] is the total network communicability, the same as in (4.4.4), with A nonsymmetric. They also use a measure that takes into account alternating walks in a directed network. This measure assigns the network a value of its hub strength, referred to as the total hub communicability,

ThC(A) = 1^T cosh(√(AA^T)) 1,

and a value of its authority strength, referred to as the total authority communicability,

TaC(A) = 1^T cosh(√(A^T A)) 1.

To evaluate the total communicability of the network, the authors of [3] compute the sum ThC(A) + TaC(A). This method is based on the idea of expressing a directed graph by an undirected graph with twice the number of nodes, and applying the matrix exponential to the adjacency matrix associated with the latter graph,

(4.4.9)    𝒜 = [ 0  A ; A^T  0 ];

see [11] for details.
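The identity that makes this embedding work is that the diagonal blocks of exp of the block matrix in (4.4.9) are cosh(√(AA^T)) and cosh(√(A^T A)). A minimal numerical check of this fact, on a small directed graph of our own choosing (not one from the text):

```python
import numpy as np
from scipy.linalg import expm

# Small directed test graph (arbitrary example, not from the dissertation).
A = np.array([[0., 1., 1.],
              [0., 0., 1.],
              [1., 0., 0.]])
n = A.shape[0]
one = np.ones(n)

# Bipartite embedding (4.4.9): a symmetric 2n x 2n matrix.
calA = np.block([[np.zeros((n, n)), A],
                 [A.T, np.zeros((n, n))]])
E = expm(calA)

# The top-left block of exp(calA) equals cosh(sqrt(A A^T)),
# evaluated here via the eigendecomposition of the symmetric PSD matrix A A^T.
lam, V = np.linalg.eigh(A @ A.T)
cosh_sqrt = V @ np.diag(np.cosh(np.sqrt(np.abs(lam)))) @ V.T

ThC_block = one @ E[:n, :n] @ one    # total hub communicability from the block
ThC_direct = one @ cosh_sqrt @ one   # the same quantity computed directly
```

The even powers of the block matrix are block diagonal with blocks (AA^T)^k and (A^T A)^k, which is exactly why the hyperbolic cosine appears.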

This section extends the total line graph centrality measure defined in (4.4.8) to directed graphs. Taking into account the existing edges e_k, we use the line graph adjacency matrix E→ to define the total line graph centrality by

(4.4.10)    eLC(A) = Σ_{k=1}^m [exp(E→)1]_k = 1^T exp(E→)1.

Note that when assessing the effect of removing an edge on the communicability of a network, the measure ThC(A) + TaC(A) may yield significantly different values than the measures TC(A) and eLC(A), since the latter measures capture the flow deeply in the network, whereas the first measure does not.

We review some of the edge measures used in [3] for ranking edges to identify the least connected edges, which are to be removed. The edge total communicability centrality of an existing edge going from node v_i to v_j is given by

eTC(v_i, v_j) = [exp(A)1]_i [1^T exp(A)]_j,

and its application to the corresponding matrix 𝒜 in (4.4.9) yields the measure

eTC(v_i, v_j) = [exp(𝒜)1]_i [1^T exp(𝒜)]_{n+j}.

Based on ideas in [11], the authors of [3] use the generalized hyperbolic sine to define the edge total communicability,

egTC(v_i, v_j) = C_h(i) C_a(j),

where the total hub communicability of node v_i and the total authority communicability of node v_j are defined by

C_h(i) = [U sinh(Σ) V^T 1]_i  and  C_a(j) = [V sinh(Σ) U^T 1]_j,

respectively. The matrices U, Σ, and V are determined by the singular value decomposition A = U Σ V^T.

Analogously to our definition of the edge centrality (4.4.3), we define the edge line graph outcentrality as

(4.4.11)    eLCout_k = [exp(E→)1]_k,

and the edge line graph incentrality,

(4.4.12)    eLCin_k = [exp(E→^T)1]_k.

A relatively large eLCout_k indicates that edge e_k is an important transmitter of information through the network, and a relatively large eLCin_k suggests that edge e_k is an important receiver of information.

4.4.4.1 Nongreedy Downdating Example

Consider the directed graph in Figure 4.9(a). We would like to remove the three edges with the lowest edge centrality as measured by eTC applied to A, eTC applied to the matrix in (4.4.9), egTC, or eLC. The graphs obtained after removing the three edges identified by these measures are shown in Figures 4.9(b-d). In this example, we allow the resulting network to be disconnected.

                                       Figure 4.9(b)   Figure 4.9(c)   Figure 4.9(d)
Total network communicability TC(A)            69.62           69.32           94.97
ThC(A) + TaC(A)                                   61              61              62
Total line graph centrality eLC(A)            160.13          158.99          261.41

Table 7: Comparison of network communicability measures for the downdated graphs in Figure 4.9.

We compute the communicability of the graphs of Figure 4.9. Table 7 shows that when removing the edges with the lowest edge line graph centrality, all the measures of network communicability have a larger value than when we remove edges using one

(a) Graph for Example 4.4.4.1. (b) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) values. (c) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) and egTC(v_i, v_j) values. (d) Downdated graph determined by removing three edges with the smallest eLC_k values. Figure 4.9: The nongreedy downdating of a directed graph.

of the other edge centrality measures. While this is not the case for every example we may encounter, it nevertheless suggests that the edge line graph centrality is an important measure of edge centrality.

4.4.4.2 Greedy Downdating Example

This example differs from the previous one only in that we now update the communicability measures of the graph after each edge removal. Removing edges based on the measure eLC(v_i, v_j) performed as well as removing edges by using the measure eTC(v_i, v_j). These two measures outperform the other measures. The downdated graphs obtained by using the various edge centrality measures are displayed in Figure 4.10, and the communicability measures for the downdated graphs are reported in Table 8. Note that the graph of Figure 4.10(a) is disconnected, while the graphs of Figures 4.10(b) and 4.10(c) are connected.

(a) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) and eLC_k values. (b) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) values. (c) Downdated graph determined by removing three edges with the smallest egTC(v_i, v_j) values. Figure 4.10: The greedy downdating of the directed graph in Figure 4.9(a).

                                       Figure 4.10(a)   Figure 4.10(b)   Figure 4.10(c)
Total network communicability TC(A)             94.97            60.91            84.78
ThC(A) + TaC(A)                                    62               61               61
Total line graph centrality eLC(A)             261.41           127.59           219.59

Table 8: Comparison of network communicability measures for the downdated graphs in Figure 4.10.

4.5 Computing the Most Important Edges in a Directed Unweighted Network Using the Matrix Exponential

We introduced in Section 4.2 three types of connections between adjacent edges of a directed graph. Each one of these connection types yields a line graph, and the exponential of the adjacency matrices associated with these line graphs defines centrality measures for edges. We are interested in determining which edges are the most and least important ones in a directed network and investigate the application of the matrix exponential to these line graph adjacency matrices.

4.5.1 The Exponential of the Extended Line Graph Adjacency Matrix E+

We defined in Section 4.2.2 the symmetric line graph adjacency matrix E+ to represent the connections between the edges in a directed graph G by using the corresponding undirected bipartite graph. A centrality measure for edges based on E+ is given by the matrix exponential

exp(E+) = Σ_{p=0}^∞ (1/p!) (E+)^p,

which has an interesting structure. To exploit this structure, it is helpful to introduce the notion of the total degree of a node in G, which is the sum of the indegree and outdegree of the node.

Proposition 4.5.1. Let the extended line graph adjacency matrix E+ for a directed unweighted graph G be given by (4.2.1). Then E+ = B^{+T} B^+, where B^+ = [B^e  B^i]; see Proposition 4.2.1. Let m_i denote the total degree of node v_i of G. Then

(4.5.1)    exp(E+) = I + B^{+T} diag( (exp(m_1) − 1)/m_1, (exp(m_2) − 1)/m_2, (exp(m_3) − 1)/m_3, ... ) B^+.

Thus, the structure of the matrix exp(E+) is related to the structure of B^{+T} B^+, but the entries of the former matrix depend on the total degrees of the nodes. Since the function x ↦ (exp(x) − 1)/x, x > 0, is increasing, the importance of a node generally increases with its total degree.

Proof. We have

(E+)^2 = (B^{+T} B^+)(B^{+T} B^+) = B^{+T} (B^+ B^{+T}) B^+ = B^{+T} (B^e B^{eT} + B^i B^{iT}) B^+,

where we note that the matrix M = B^e B^{eT} + B^i B^{iT} is diagonal. Its nontrivial entries are the total degrees of the nodes of G. It is easy to show that for any integer p ≥ 1, we have (E+)^p = B^{+T} M^{p−1} B^+, from which it follows that

exp(E+) = I + B^{+T} B^+ + (B^{+T} M B^+)/2! + (B^{+T} M^2 B^+)/3! + ...
        = I + B^{+T} M^{−1}(exp(M) − I) B^+.

This shows (4.5.1).
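Proposition 4.5.1 is easy to verify numerically. The sketch below is our own encoding of the path graph of Figure 4.11: it builds B^e, B^i, and E+ explicitly and checks (4.5.1) entrywise.

```python
import numpy as np
from scipy.linalg import expm

# Path graph of Figure 4.11: v1 -> v2 -> v3 -> v4, edges e1, e2, e3 (0-indexed).
n, m = 4, 3
Be = np.zeros((n, m)); Bi = np.zeros((n, m))
for k, (tail, head) in enumerate([(0, 1), (1, 2), (2, 3)]):
    Be[tail, k] = 1.0        # exsurgence: edge k leaves its tail node
    Bi[head, k] = 1.0        # incidence: edge k enters its head node

Bplus = np.hstack([Be, Bi])              # B+ = [B^e  B^i]
Eplus = Bplus.T @ Bplus                  # extended line graph matrix E+
deg = Be.sum(axis=1) + Bi.sum(axis=1)    # total degrees m_1, ..., m_n

# Right-hand side of (4.5.1).
rhs = np.eye(2 * m) + Bplus.T @ np.diag((np.exp(deg) - 1) / deg) @ Bplus
```

The check `expm(Eplus)[0, 5] == 0` below also reproduces the vanishing entry [exp(E+)]_{16} discussed after Figure 4.11.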

Let us take the simple example in Figure 4.11 to test how meaningful the use of exp(E+) is for measuring the importance of edges in this graph.

Figure 4.11: Simple directed network (v1 → v2 → v3 → v4, with edges e1, e2, e3).

Proposition 4.2.1 describes the construction of the adjacency matrix E+ associated with this graph.

Its matrix exponential is given by

exp(E+) = [ 2.72     0      0      0      0      0
               0  4.19      0   3.19      0      0
               0     0   4.19      0   3.19      0
               0  3.19      0   4.19      0      0
               0     0   3.19      0   4.19      0
               0     0      0      0      0   2.72 ].

The top right quarter of the above matrix represents the propagation of signals from edges traveling through the network. The fact that the entry [exp(E+)]_{16} vanishes indicates that the edge e1 cannot connect to the edge e3 by any number of steps. But we easily see in Figure 4.11 that these edges are connected by two steps: e1 to e2 followed by e2 to e3. This illustrates that the matrix exp(E+) is poorly suited to indicate the importance of edges. We conclude that a matrix other than E+ is needed to make the exponential function meaningful for edges.

4.5.2 The Exponential of the Line Graph Adjacency Matrix E→

Thulasiraman and Swamy [64] discuss the line graph adjacency matrix E→ = B^{iT} B^e of a directed graph. It has an entry 1 in position (i, j) if and only if the edge e_i passes information to the edge e_j through a node, i.e., if and only if the head of edge e_i coincides with the tail of edge e_j. The entries of the matrix (E→)^2 tell us whether information is passed from an edge to another one through two nodes. In other words, [(E→)^2]_{ij} = 1 if there exists an edge pointing from the target node of e_i to the source node of e_j. Similarly, [(E→)^p]_{ij} counts the number of ways information is transferred from the edge e_i to the edge e_j through p nodes. The matrix exponential exp(E→) is a weighted sum of positive powers of E→, with transfers of information via many nodes having a smaller weight than transfers via few nodes; cf. (4.4.2). Note that the matrix E→ generally is nonsymmetric. Each row of exp(E→) represents an edge in its emitter role, and each column expresses its role in receiving information.

Similarly to the discussion in Section 4.4.3, we can determine the ability of an edge to transmit information through the network by ordering the elements of the row sums of the matrix exp(E→), i.e., of exp(E→)1. The largest entry of this vector corresponds to the most important transmitter. Similarly, the vector 1T exp(E→) provides an ordering of the edges in their role as information receivers, where the largest entry corresponds to the most important receiver. We will use these measures in the following examples.

We note that powers of E∨ and E∧ defined in Proposition 4.2.1 do not add any information about propagation through the network because of the nature of connections between edges that they provide. Moreover, E← = (E→)T . Therefore, calculating exp(E←)1 is equivalent to evaluating 1T exp(E→).
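The equivalence of the two receiver computations is immediate numerically, since transposition commutes with the matrix exponential. A minimal check, with a random 0/1 matrix standing in for E→ (our own test data):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
# Random 0/1 stand-in for a line graph adjacency matrix E->.
E = (rng.random((12, 12)) < 0.25).astype(float)
one = np.ones(12)

receiver_via_reverse = expm(E.T) @ one   # exp(E<-)1 with E<- = (E->)^T
receiver_via_columns = one @ expm(E)     # 1^T exp(E->)
```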

4.5.2.1 Example

We consider the ranking of edges of a graph G that is the small tree shown in Figure 4.12(a). The corresponding line graph is displayed in Figure 4.12(b). Table 9 displays the centrality measures exp(E→)1 and 1^T exp(E→). The edge e2 contributes the most to broadcasting information through the network, because it is the only edge that points to a node from which two edges emerge. The edge e1 is the next most important edge, because it is one step further away from the split at the node v3 than the edge e2. The edges e5 and e6 are dead ends and therefore are ranked as the least important transmitters. On the other hand, the latter edges have the highest capability of receiving information and therefore are ranked as the most important receivers. We conclude that the ranking of transmitters and receivers determined by the measures exp(E→)1 and 1^T exp(E→) is in agreement with intuition based on the graphs in Figure 4.12.


(a) Graph of a directed tree. (b) Line graph of a directed tree. Figure 4.12: Graph and line graph for Example 4.5.2.1.

edge e_k   [exp(E→)1]_k        edge e_k   [1^T exp(E→)]_k
e2         4                   e5         2.67
e1         3.33                e6         2.67
e3         2                   e3         2.5
e4         2                   e4         2.5
e5         1                   e2         2
e6         1                   e1         1

Table 9: Ranking of the edges of Figure 4.12 using the exponential of the matrix E→.
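The computation behind Table 9 can be sketched as follows. The edge list is read off Figure 4.12(a), and E→ is built directly from the rule that entry (k, l) is 1 when the head of e_k coincides with the tail of e_l.

```python
import numpy as np
from scipy.linalg import expm

# Edges of the tree in Figure 4.12(a), as (tail, head) pairs: e1, ..., e6.
edges = [(1, 2), (2, 3), (3, 4), (3, 5), (4, 6), (5, 7)]
m = len(edges)

# E->[k, l] = 1 iff the head of e_k coincides with the tail of e_l.
E = np.zeros((m, m))
for k, (_, head) in enumerate(edges):
    for l, (tail, _) in enumerate(edges):
        if head == tail:
            E[k, l] = 1.0

one = np.ones(m)
out_centrality = expm(E) @ one   # transmitter scores [exp(E->)1]_k
in_centrality = one @ expm(E)    # receiver scores [1^T exp(E->)]_k
```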

4.5.2.2 Example

This example illustrates the effect of a loop in the directed graph shown in Figure 4.13. Table 10 shows the edge e6 to be both the highest ranked broadcaster and the highest ranked receiver. The directed loop between the nodes v1, v4, and v6 makes the edges between these nodes strong broadcasters.

(a) Graph with a loop. (b) Line graph. Figure 4.13: Graph and line graph for Example 4.5.2.2.

edge e_k   [exp(E→)1]_k        edge e_k   [1^T exp(E→)]_k
e6         4.78                e6         3.76
e2         3.97                e1         3.23
e5         3.56                e2         3.23
e7         3.56                e3         2.89
e1         2                   e4         2.89
e3         1                   e5         2.89
e4         1                   e7         1

Table 10: Ranking of the edges of Figure 4.13 using the exponential of the matrix E→.

4.5.2.3 Flight Example I

The last example of this section uses a directed unweighted network determined by domestic flights in the US during year 2016, as reported by the Bureau of Trans- portation Statistics of the United States Department of Transportation [53]. The airports are nodes and the flight segments are edges. This yields a nonsymmetric adjacency matrix A ∈ R705×705. Since most flights have return flights, the matrix A is close to symmetric.

We determine the matrix E→ using the adjacency matrix for the flights network, and rank the departing flights by computing exp(E→)1. The six largest entries of

Most departing flights          Most landing flights
from airport   to airport       from airport   to airport
IAH            ATL              ATL            RDU
CLE            ATL              ATL            DSM
TPA            ATL              ATL            BHM
GRR            ATL              ATL            GSP
ALB            ATL              ATL            FSD
PIT            ATL              ATL            BTV

Table 11: Ranking of flight segments in the domestic flights network using the exponential of the matrix E→.

this vector determine the most important flights. They are displayed in Table 11. We rank the arriving flights by computing the largest entries of 1^T exp(E→). However, the computation of exp(E→)1 and 1^T exp(E→) may result in numerical overflow on many computers. To avoid this difficulty, we compute the spectral radius µ of E→ and evaluate exp(E→ − µI)1 and 1^T exp(E→ − µI) instead. This eliminates overflow and does not affect the relative size of the entries of the computed vectors. Therefore, the ordering is not affected by this modification. This approach of avoiding overflow has previously been applied in computations reported in [34].
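The shift cannot change the induced ordering: E→ commutes with µI, so exp(E→ − µI)1 = e^{−µ} exp(E→)1, and every entry is scaled by the same positive factor. A sketch of this check, with a random matrix standing in for E→:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
E = rng.random((15, 15))                 # stand-in for a (small) E->
one = np.ones(15)

mu = np.abs(np.linalg.eigvals(E)).max()  # spectral radius of E
shifted = expm(E - mu * np.eye(15)) @ one  # safe against overflow
plain = expm(E) @ one                      # may overflow for large networks
```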

We conclude that a flight from George Bush Intercontinental Airport in Houston to Hartsfield-Jackson Airport in Atlanta transmits the best through the network of all domestic flights in the US. The flight from Cleveland-Hopkins International Airport to Atlanta comes second, followed by the flight from Tampa International Airport to Atlanta. Similarly, the flight from Raleigh-Durham International Airport in North Carolina to Atlanta absorbs the most flights through the network of national flights. We remark that although the Los Angeles International Airport and Chicago O'Hare International Airport are among the three top ranked airports according to the Federal Aviation Administration [32], these airports are not among the ones shown in Table 11. This is because our model disregards the number of flights (if larger than one) and the number of passengers on each segment.

4.6 Computing the Most Important Edges in a Directed Weighted Network Using the Matrix Exponential

Similarly as in the previous section, we can rank the edges of a directed weighted network in their role as transmitters of information by calculating the entries of the vector exp(Ẽ→)1, where the matrix Ẽ→ = B̃^{iT} B̃^e is determined by taking weights into account as described in Section 4.3. The relative size of the entries of the vector 1^T exp(Ẽ→) provides an ordering of the edges in their role as information receivers, and the relative size of the elements of exp(Ẽ→)1 furnishes an ordering of the edges as information transmitters.

4.6.1 Example

The graph of this example is displayed in Figure 4.14(a), with the edge weight shown for each edge. Let the diagonal entries of the diagonal matrix Z contain the positive edge weights. Figure 4.14(b) shows the associated line graph for Z^i = Z^e = Z^{1/2}; see Section 4.3 for the definition of the matrices Z^i and Z^e.


(a) Weighted graph. (b) Line graph. Figure 4.14: A directed weighted tree with line graph.

Table 12 ranks the edges according to their importance as transmitters and receivers. Although the edge e5 has weight 1 and e6 has weight 3, and these edges are positioned in a similar way, the output of our algorithm suggests that the edge e5 absorbs more information than e6.

edge e_k   [exp(Ẽ→)1]_k       edge e_k   [1^T exp(Ẽ→)]_k
e1         10.77               e5         6.89
e2         6.87                e3         5.83
e3         3.00                e6         4.41
e4         2.73                e2         3.83
e5         1.00                e4         3.41
e6         1.00                e1         1.00

Table 12: Ranking of the edges of Figure 4.14 using the exponential of the matrix Ẽ→.
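The computation behind Table 12 can be sketched as follows. The edge list and weights are read off Figure 4.14(a); with the choice Z^i = Z^e = Z^{1/2}, the weighted line graph has entry √(z_k z_l) where edge e_k feeds edge e_l, which matches the line graph weights shown in Figure 4.14(b).

```python
import numpy as np
from scipy.linalg import expm

# Edges of Figure 4.14(a): (tail, head, weight) for e1, ..., e6.
edges = [(1, 2, 8.0), (2, 3, 1.0), (3, 4, 4.0),
         (3, 5, 1.0), (4, 6, 1.0), (5, 7, 3.0)]
m = len(edges)

# Weighted line graph: entry (k, l) = sqrt(z_k z_l) when head(e_k) = tail(e_l).
Et = np.zeros((m, m))
for k, (_, head_k, zk) in enumerate(edges):
    for l, (tail_l, _, zl) in enumerate(edges):
        if head_k == tail_l:
            Et[k, l] = np.sqrt(zk * zl)

one = np.ones(m)
transmit = expm(Et) @ one   # [exp(E~->)1]_k, left column of Table 12
receive = one @ expm(Et)    # [1^T exp(E~->)]_k, right column of Table 12
```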

4.6.2 Flight Example II

We take the same example studied in Section 4.5.2.3, but this time we include a weight with each edge. A weight is equal to the total number of enplanements on all the flights for that segment, as reported by the Bureau of Transportation Statistics [53]. The weights define the diagonal matrix Z and determine an edge-weighted adjacency matrix Ã as in Theorem 4.3.1. Due to the weights, a flight segment with thousands of passengers will affect the flow more than a flight segment with only a few passengers.

To avoid overflow, we evaluate exp(E→ − µI), where µ is the spectral radius of E→, similarly as in Section 4.5.2.3. Table 13 displays the six top ranked edges of the network. All of the segments in the table start and end at one of the top airports as described by the Federal Aviation Administration [32]. In particular, the segment from the Hartsfield-Jackson Airport in Atlanta to the O'Hare Airport in Chicago is the one that dissipates the highest number of passengers through the network of

Most dissipating flights        Most absorbing flights
from airport   to airport       from airport   to airport
ATL            ORD              ORD            ATL
DTW            ORD              ORD            DTW
OGG            LAX              LAX            LAS
PHL            DEN              DEN            PHL
LAS            LAX              LAX            SEA
SEA            LAX              DEN            LAX

Table 13: Ranking of the segments in the domestic flights network, taking the passenger enplanements as the segment weights, and using the exponential of the line graph adjacency matrix.

domestic flights in the US, and the same segment in the opposite direction receives the most passengers through the network. These two airports are among the top three busiest airports in the US according to the Federal Aviation Administration of the U.S. Department of Transportation [32]. This example attests to the validity of our algorithm.

4.7 Computational Aspects

We comment in this section on the computations required to evaluate

(4.7.1)    exp(E→)1  or  1^T exp(E→).

Generally, the matrix E→ is nonsymmetric. For small networks this matrix is small and can be formed explicitly. The exponential exp(E→) then can easily be evaluated, such as by the MATLAB function expm, and the desired quantities (4.7.1) can be determined. If the matrix E→ is large enough that overflow may occur when evaluating its exponential, the spectral factorization of E→ may be computed. This yields the spectral radius µ of E→. Moreover, the spectral factorization can be used to evaluate exp(E→ − µI)1 and 1^T exp(E→ − µI). These quantities may only be computable with reduced accuracy when the eigenvector matrix of E→ is severely ill-conditioned. This has not been an issue in our computations.

When the matrix E→ is large, it may be attractive to evaluate approximations of the quantities (4.7.1) with the aid of the nonsymmetric Lanczos process or the Arnoldi process. Their application does not require the matrix E→ to be formed; only matrix-vector products with E→, and possibly with its transpose, have to be computed; see [23, 34] for details. The eigenvalues of the reduced matrix computed with the nonsymmetric Lanczos or Arnoldi processes yield sufficiently accurate approximations of the spectral radius to avoid overflow in the computation of exp(E→ − µI)1 and 1^T exp(E→ − µI).
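A minimal Arnoldi sketch of this idea (our own, not the implementation used in [23, 34]): build an orthonormal basis V_m of the Krylov subspace span{1, E→1, ..., (E→)^{m−1} 1} from matrix-vector products only, and approximate exp(E→)1 ≈ ||1|| V_m exp(H_m) e_1, where H_m = V_m^T E→ V_m is a small Hessenberg matrix.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi_expv(matvec, b, k):
    """Approximate exp(E) b from a k-dimensional Krylov subspace,
    using only matrix-vector products with E."""
    n = b.size
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    m = k
    for j in range(k):
        w = matvec(V[:, j])
        for i in range(j + 1):              # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:             # invariant subspace found
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    # exp(E) b is approximated by beta * V_m exp(H_m) e_1.
    return beta * V[:, :m] @ expm(H[:m, :m])[:, 0]

# Stand-in for a line graph matrix; only matvecs with it are needed.
rng = np.random.default_rng(4)
E = 0.1 * rng.random((30, 30))
b = np.ones(30)
approx = arnoldi_expv(lambda x: E @ x, b, 15)
```

Only the small m × m matrix H_m is exponentiated, so the cost per step is one matrix-vector product plus the orthogonalization.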

CHAPTER 5

Node Importance in Node-Weighted Networks

via Line Graphs and the Matrix Exponential

This chapter is concerned with node-weighted graphs, for which each node is assigned a positive weight. All edges have weight one. When constructing an associated edge-weighted graph, its nodes will have weight one, and its edges will have weights as described in Section 5.2.

5.1 The Sensitivity of Node Centrality to Weight Change

This section is concerned with how the importance of a node changes when its weight is modified. In particular, we show that the rank of a node as broadcaster will increase, or at least remain the same, if its weight is increased. Related issues have been discussed by Bini et al. [14] for ranking methods based on the relative size of the entries of the left Perron vector of a row stochastic adjacency matrix of a graph. Bini et al. [14] consider the ranking of scientific papers by taking into account not only the importance of the paper, but also the importance of the authors, and possibly of the workplace. In this situation, it may be meaningful to scale the adjacency matrix to be row stochastic. Pozza and Tudesco [59] investigate the effect on the top ranked nodes of adding a new edge to a graph or increasing the weight of an existing edge, using the subgraph centrality measure furnished by the diagonal entries of the exponential of the adjacency matrix.

To simplify notation, we denote the edge-weighted adjacency matrix by A in this section (this matrix is referred to as Ã elsewhere in this chapter). Let the weight of node v_s increase. Then the edge(s) exsurging from node v_s increase in weight, while no edge exsurging from v_s decreases in weight. With each increment δ in the weight A_st of the edge pointing from node v_s to node v_t, the adjacency matrix associated with the perturbed graph can be expressed as

Â = A + δ 1_s 1_t^T,

where 1_t = [0, ..., 0, 1, 0, ..., 0]^T denotes the t-th axis vector. We show in Section 5.1.2 that the ADR measure increases the most for node v_s, under certain conditions on the graph. Therefore, its ranking as broadcaster either increases or remains the same.

5.1.1 Preliminaries

The results of this section will be applied in later sections.

Lemma 5.1.1. Let the matrix A ∈ R^{n×n} with nonnegative entries satisfy

(5.1.1)    A1 ≤ 1,

where the inequality is component-wise. Then

(5.1.2)    A^p 1 ≤ 1,    p ≥ 1,

and

(5.1.3)    exp(A)1 ≤ e1.

These inequalities are sharp.

Proof. The result is easily shown by induction. The inequality (5.1.2) holds for p = 1 by assumption. Assume it holds for some p ≥ 1. Then

A^{p+1} 1 = A(A^p 1) ≤ A1 ≤ 1,

where we have used that all entries of A are nonnegative. For the exponential, we have

exp(A)1 = Σ_{p=0}^∞ A^p 1/p! ≤ (Σ_{p=0}^∞ 1/p!) 1 = e1.

The inequalities (5.1.2) and (5.1.3) become equalities for certain matrices, including the identity matrix and the cyclic shift matrix. The latter corresponds to an unweighted cyclic graph.
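Both the inequalities and the sharpness claim are easy to check numerically. The sketch below uses a scaled random matrix satisfying (5.1.1) and the cyclic shift matrix (our own test pair, not from the text):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 6
one = np.ones(n)

# Nonnegative A with A1 <= 1 (all row sums equal to 0.9 here).
M = rng.random((n, n))
A = 0.9 * M / M.sum(axis=1, keepdims=True)

# Cyclic shift matrix: adjacency matrix of a directed cycle, with P1 = 1.
P = np.roll(np.eye(n), 1, axis=1)
```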

Results analogous to those of Lemma 5.1.1 of course also can be shown for AT .

Lemma 5.1.2. Let the matrix A satisfy the conditions of Lemma 5.1.1. Then for any nonnegative integer n_2,

(5.1.4)    Σ_{n_1=1}^∞ ((n_2 + 1)!/(n_2 + n_1 + 1)!) A^{n_1} 1 < 1,

where the inequality holds component-wise.

Proof. We first bound the coefficients in (5.1.4) by induction over n_1. For n_1 = 1, we have

(n_2 + 1)!/(n_2 + 2)! = 1/(n_2 + 2) ≤ 1/2 = 1/(2 · 1!).

Let n_1 ≥ 1 be an arbitrary integer, and assume that

(n_2 + 1)!/(n_2 + n_1 + 1)! < 1/(2 · n_1!).

We would like to show that the above inequality holds for n_1 replaced by n_1 + 1. Using the above inequality, we obtain

(n_2 + 1)!/(n_2 + n_1 + 2)! = (n_2 + 1)!/[(n_2 + n_1 + 1)!(n_2 + n_1 + 2)] < 1/[2 · n_1!(n_2 + n_1 + 2)] < 1/[2 · n_1!(n_1 + 1)] = 1/[2(n_1 + 1)!].

It follows that

(5.1.5)    Σ_{n_1=1}^∞ ((n_2 + 1)!/(n_2 + n_1 + 1)!) A^{n_1} 1 < Σ_{n_1=1}^∞ (1/(2(n_1 + 1)!)) A^{n_1} 1 < (1/2)(exp(A)1 − 1) ≤ ((e − 1)/2) 1 < 1,

where the penultimate inequality is a consequence of (5.1.3). This shows the lemma.

5.1.2 Matrix Perturbation Results

This section considers adjacency matrices that satisfy the conditions of Lemma 5.1.1. As usual, 1 = [1, 1, ..., 1]^T ∈ R^n is the vector of all ones, and 1_j = [0, ..., 0, 1, 0, ..., 0]^T ∈ R^n denotes the j-th axis vector for j = 1, ..., n. We will perturb the entry A_st of the adjacency matrix A ∈ R^{n×n} by δ. This perturbation is denoted by δA. Thus, δA = δ 1_s 1_t^T. We will use the formulas

δA 1 = δ 1_s,    A δA = δ A 1_s 1_t^T,    δA A = δ 1_s 1_t^T A.

When s ≠ t, we have (δA)^2 = 0.

Theorem 5.1.3. Let the adjacency matrix A = [A_ij] ∈ R^{n×n} for the graph G satisfy the conditions of Lemma 5.1.1. Add δ > 0 to the matrix entry A_st for some s ≠ t, without changing any of the other entries of A. If δ is small enough, the ADR value of the vertex v_s increases more than the ADR value of any other vertex. It follows that the rank of the vertex v_s as a broadcaster either increases or stays the same. More precisely, let δA = δ 1_s 1_t^T and Â = A + δA. Then

(5.1.6)    [exp(Â)1]_s − [exp(A)1]_s > [exp(Â)1]_q − [exp(A)1]_q,    ∀q ≠ s.

Proof. Application of the binomial expansion gives

(5.1.7)    exp(Â) − exp(A) = Σ_{p=0}^∞ (1/p!)((A + δA)^p − A^p)
                           = δA + Σ_{p=2}^∞ (1/p!)(A^{p−1} δA + A^{p−2} δA A + ... + A δA A^{p−2} + δA A^{p−1}) + O(δ^2).

Multiplying the terms in the above sum by 1 from the right-hand side gives

for p = 2:  (1/2!)(A δA + δA A)1 = (δ/2!)(A 1_s + 1_s 1_t^T A 1),
for p = 3:  (1/3!)(A^2 δA + A δA A + δA A^2)1 = (δ/3!)(A^2 1_s + A 1_s 1_t^T A 1 + 1_s 1_t^T A^2 1),
for p = 4:  (1/4!)(A^3 δA + A^2 δA A + A δA A^2 + δA A^3)1 = (δ/4!)(A^3 1_s + A^2 1_s 1_t^T A 1 + A 1_s 1_t^T A^2 1 + 1_s 1_t^T A^3 1),
... .

Adding all the above terms "column-wise" and substituting into (5.1.7) multiplied by 1 from the right-hand side, we get

(5.1.8)    (exp(Â) − exp(A))1 = δ [Σ_{n_1=0}^∞ A^{n_1} 1_s/(n_1 + 1)!]
                               + δ [Σ_{n_1=0}^∞ A^{n_1} 1_s/(n_1 + 2)!] (1_t^T A 1)
                               + δ [Σ_{n_1=0}^∞ A^{n_1} 1_s/(n_1 + 3)!] (1_t^T A^2 1) + ... + O(δ^2)
                             = δ Σ_{n_1=0}^∞ Σ_{n_2=0}^∞ (A^{n_1} 1_s/(n_1 + n_2 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2).

It follows that

(5.1.9)    [exp(Â)1]_s − [exp(A)1]_s = 1_s^T (exp(Â) − exp(A)) 1
                                     = δ Σ_{n_1=0}^∞ Σ_{n_2=0}^∞ (1_s^T A^{n_1} 1_s/(n_1 + n_2 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2)
                                     ≥ δ Σ_{n_2=0}^∞ (1/(n_2 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2),

where the inequality is obtained by ignoring all terms with n_1 > 0. Now applying Lemma 5.1.2, and using the inequality 1_q^T A^{n_1} 1_s ≤ 1_q^T A^{n_1} 1 and the fact that 1_q^T A^0 1_s = 0 for q ≠ s, gives

Σ_{n_2=0}^∞ (1/(n_2 + 1)!) (1_t^T A^{n_2} 1) > Σ_{n_2=0}^∞ Σ_{n_1=1}^∞ ((n_2 + 1)!/(n_2 + n_1 + 1)!) (1_q^T A^{n_1} 1) (1/(n_2 + 1)!) (1_t^T A^{n_2} 1)
                                             ≥ Σ_{n_2=0}^∞ Σ_{n_1=1}^∞ (1_q^T A^{n_1} 1_s/(n_2 + n_1 + 1)!) (1_t^T A^{n_2} 1)
                                             = Σ_{n_2=0}^∞ Σ_{n_1=0}^∞ (1_q^T A^{n_1} 1_s/(n_2 + n_1 + 1)!) (1_t^T A^{n_2} 1).

Comparing this expression with (5.1.9) shows that

[exp(Â)1]_s − [exp(A)1]_s > [exp(Â)1]_q − [exp(A)1]_q + O(δ^2).

This concludes the proof.
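Theorem 5.1.3 can be illustrated numerically: scale a random nonnegative adjacency matrix so that (5.1.1) holds, add a small δ to one entry, and check that the ADR increase is largest at the perturbed row. The instance below is our own, not one from the text.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n, s, t, delta = 8, 2, 5, 1e-3

A = rng.random((n, n))
np.fill_diagonal(A, 0.0)
A /= A.sum(axis=1).max()     # now A1 <= 1, i.e. (5.1.1) holds

one = np.ones(n)
dA = np.zeros((n, n)); dA[s, t] = delta      # perturbation delta*1_s 1_t^T
diffs = expm(A + dA) @ one - expm(A) @ one   # ADR change for each node
```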

The above proof shows that

[exp(Â)1]_q − [exp(A)1]_q = δ Σ_{n_2=0}^∞ Σ_{n_1=0}^∞ (1_q^T A^{n_1} 1_s/(n_2 + n_1 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2).

The terms have the following network interpretation: The expression 1_q^T A^{n_1} 1_s is the weighted number of walks of length n_1 from node v_q to node v_s, and the expression 1_t^T A^{n_2} 1 is the weighted number of walks of length n_2 from node v_t to any node in the network. Finally, δ 1_q^T A^{n_1} 1_s 1_t^T A^{n_2} 1 is the weighted number of walks of length n_2 + n_1 + 1 that reach node v_s from node v_q in n_1 steps, go through the edge pointing from node v_s to v_t (counted with its weight δ), and continue to any node of the network in n_2 steps; see Figure 5.1.

Corollary 5.1.4. Let the conditions of Theorem 5.1.3 hold. Assume in addition that the network consists of two clusters that are only connected by one directed edge, from node v_s in the first cluster to node v_t in the second one. Then (5.1.6) holds for any δ > 0.

Figure 5.1: A walk from v_q to v_s, then to v_t, then to v_j (n_1 steps from v_q to v_s, followed by the edge (v_s, v_t), followed by n_2 steps to v_j).

Proof. Let G_1 be the graph of the first cluster of n_1 nodes, including the node v_s, and denote the associated adjacency matrix by A_1 ∈ R^{n_1×n_1}. Similarly, let G_2 be the graph of the second cluster of n_2 nodes, including the node v_t, and let A_2 ∈ R^{n_2×n_2} be the adjacency matrix for G_2. We can arrange the rows and columns of A to have the form

(5.1.10)    A = [ A_1  B_1 ; 0  A_2 ],

where B_1 is an n_1 × n_2 matrix with all entries zero except for the entry (s, t), which is the weight of the edge going from node v_s to node v_t. The lower left block of A is an n_2 × n_1 matrix with all entries zero, because no edge goes from G_2 to G_1. We can show by induction that all powers of A have the structure

A^p = [ A_1^p  B_p ; 0  A_2^p ],

where B_p is some n_1 × n_2 matrix. Indeed, by (5.1.10), the matrix A^p has the desired structure for p = 1. Suppose A^p has the desired structure. Then

A^{p+1} = A^p A = [ A_1^p  B_p ; 0  A_2^p ][ A_1  B_1 ; 0  A_2 ] = [ A_1^{p+1}  A_1^p B_1 + B_p A_2 ; 0  A_2^{p+1} ] = [ A_1^{p+1}  B_{p+1} ; 0  A_2^{p+1} ].

It follows that [A^p]_{ts} = 0 for all p = 1, 2, 3, ..., because this entry is in the lower left quadrant of A^p.

Let δA = δ 1_s 1_t^T with s ≠ t. Then (δA)^p = 0 for all p > 1. In addition,

δA A^p δA = δ 1_s 1_t^T A^p δ 1_s 1_t^T = δ^2 [A^p]_{ts} 1_s 1_t^T = 0.

Therefore, all terms in O(δ^2) in (5.1.7) vanish. This eliminates the need to require that 0 < δ ≪ 1 in the proof of Theorem 5.1.3.

Corollary 5.1.5. Let the conditions of Theorem 5.1.3 hold. Assume in addition that the graph G is such that there exists no walk from node v_t to node v_s. Then (5.1.6) holds for any δ > 0.

Proof. Since there exist no walks from node v_t to node v_s, we have [A^p]_{ts} = 0 for all p = 1, 2, 3, ... . Hence,

δA A^p δA = δ 1_s 1_t^T A^p δ 1_s 1_t^T = δ^2 [A^p]_{ts} 1_s 1_t^T = 0.

Similarly to the proof of Corollary 5.1.4, we conclude that all terms in O(δ^2) in (5.1.7) vanish, and the desired result follows.

Corollary 5.1.6. Let the transpose of the adjacency matrix A = [A_ij] ∈ R^{n×n} for the graph G satisfy the conditions of Lemma 5.1.1. Add δ > 0 to the matrix entry A_st, for some s ≠ t, without changing any of the other entries of A. If δ is small enough, the AUR value of the vertex v_t increases more than the AUR value of any other vertex. It follows that the rank of the vertex v_t as a receiver either increases or stays the same. More precisely, let δA = δ 1_s 1_t^T and Â = A + δA. Then

[exp(Â^T)1]_t − [exp(A^T)1]_t > [exp(Â^T)1]_q − [exp(A^T)1]_q,    ∀q ≠ t.

Proof. The result follows by applying Theorem 5.1.3 to the matrix AT .

5.1.3 Example on Sensitivity to Weight Change

Consider the weighted network in Figure 5.2. The numbers in parentheses are the weights of the edges. To satisfy the condition of Theorem 5.1.3, we scale the adjacency matrix A for the graph by the maximum of the largest column sum and the largest row sum; that is, we divide all the adjacency matrix entries by 11.

We increase the weight of the edge pointing from node v_1 to node v_3 by δ = 0.01. The new adjacency matrix is Â = A + δ 1_1 1_3^T. According to Theorem 5.1.3, node v_1 gets the highest ADR increase, which is in agreement with the values reported in Table 1. By Corollary 5.1.6, no node should get a larger increase in its AUR value than node


Figure 5.2: Network for the example in Section 5.1.3.

v_3, which is also shown in Table 1.

Dissipating nodes (ADR)                        Absorbing nodes (AUR)
node v_q   [exp(Â)1]_q − [exp(A)1]_q           node v_q   [exp(Â^T)1]_q − [exp(A^T)1]_q
v1         2.395                               v3         2.168
v7         0.299                               v2         0.280
v10        0.100                               v5         0.280
v8         0.008                               v6         0.074

Table 1: Ranking top 4 nodes of the graph in Figure 5.2 showing ADR and AUR values before and after perturbation of the edge pointing from node v1 to node v3. The graph G is scaled to satisfy the conditions of Lemma 5.1.1.

We found experimentally that the conclusion of Theorem 5.1.1 holds for a larger class of networks than allowed by the theorem, but it does not hold for all networks. For instance, scaling the adjacency matrix A so that (5.1.1) holds is not required for all networks in order for the conclusion of Theorem 5.1.1 to hold. In the examples in the following sections, we will not scale the adjacency matrices to enforce (5.1.1).
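The perturbation experiment of this section can be reproduced numerically. The following is a minimal NumPy/SciPy sketch on a hypothetical 4-node directed graph (not the network of Figure 5.2; the dissertation's own computations use MATLAB's expm):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 4-node directed weighted graph (an illustrative stand-in
# for the network of Figure 5.2).
A = np.array([[0., 3., 1., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 4.],
              [1., 0., 0., 0.]])

# Scale the adjacency matrix by the maximum of the largest column sum
# and the largest row sum, as done for the example of this section.
scale = max(A.sum(axis=0).max(), A.sum(axis=1).max())
A = A / scale

# Perturb the edge v1 -> v3:  A_hat = A + delta * 1_1 1_3^T.
delta = 0.01
A_hat = A.copy()
A_hat[0, 2] += delta

one = np.ones(4)
adr_gain = expm(A_hat) @ one - expm(A) @ one      # change in ADR values
aur_gain = expm(A_hat.T) @ one - expm(A.T) @ one  # change in AUR values

# The source node v1 gets the largest ADR increase and the target node
# v3 the largest AUR increase, as Theorem 5.1.3 and Corollary 5.1.6 predict.
print(adr_gain.argmax(), aur_gain.argmax())
```

The perturbed source and target nodes dominate both difference vectors, in agreement with Table 1.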

5.2 Node-Weighted to Edge-Weighted

In this section we consider ways of incorporating node weights into an adjacency matrix. Unlike in the case of edge weights, there is no single natural approach to encode node weights into a single matrix. But, as will be seen below, different approaches can prove useful in some circumstances.

As shown in Section 2.5.0.1, the weighted adjacency matrix Ã of an edge-weighted graph can be defined in a natural way as Ã = B^e Z B^{iT}, where B^i and B^e are the unweighted incidence and exsurgence matrices, respectively, and Z is a diagonal matrix holding the edge weights z_1, ..., z_m. This can be rewritten as Ã = B̃^e B̃^{iT} using weighted incidence and exsurgence matrices defined by B̃^i = B^i Z_1 and B̃^e = B^e Z_2, where Z_2 Z_1 = Z (as in [24]). However, the choice of Z_1 and Z_2 is not unique.

Assume we are given node weights w_1, ..., w_n, and let W be a diagonal matrix containing them. The goal of encoding these weights into an adjacency matrix Ã implies that we must find edge weights Z = H(W) depending only on W. In principle, each z_k may depend on all of the node weights, but we will consider only “local” dependencies, that is, each edge weight z_k will be a function h of the node weights of its endpoints. Below we discuss various ways of defining Z = H(W) and, for each approach, we describe a scenario where it would be appealing to choose that method. We will also use the “toy” node-weighted graphs in Figure 5.3 to compare the discussed modeling approaches. Weights are given in parentheses.

Figure 5.3: Node-weighted sample graphs. (a) A directed chain. (b) A cyclic network. (c) Random undirected. (d) Random directed.

5.2.1 Edge Weights from Endpoint Node Weights

As mentioned above, if the nodes v_i, v_j are the endpoints of edge e_k, then z_k = h(w_i, w_j) for some function h. If the network is undirected, h must be symmetric (i.e., h(x, y) = h(y, x) for all x, y).

5.2.1.1 Sum of Endpoint Node Weights

Consider a network consisting of buildings as nodes, and the street linking two buildings as an edge. Each one of these edges should be built large enough to accommodate all occupants from both buildings in case they have to escape a fire where they live. If this network is given as node-weighted, with a node's weight corresponding to the building capacity, then we can convert it into an edge-weighted network with each edge weight equal to the sum of its endpoint node weights. This weighting is most meaningful for undirected graphs. We determine Z from W by calculating the sum of node weights (snw),

(5.2.1)  Z = snw(W), or z_k = w_i + w_j.

This defines the function H.

Figure 5.4 shows the edge-weighted graphs obtained by assigning each edge the sum of the weights from Figure 5.3 of the nodes it connects. Zou et al. [66] assigned to each edge half the sum of its endpoint node weights. This is simply a scaling by 1/2 of the adjacency matrix that we obtain.

5.2.1.2 Product of Endpoint Node Weights

Another approach is to assume that the computed edge weight is proportional to each of the two endpoint node weights. Symmetry considerations indicate that the constant of proportionality should be the same for all edges. Then, up to a scaling

Figure 5.4: Edge-weighted graphs obtained from the graphs in Figure 5.3 using the snw method. (a) A directed chain. (b) A cyclic network. (c) Random undirected. (d) Random directed.

factor, we have that z_k = h(w_i, w_j) = w_i w_j. This gives the product node weights (pnw),

(5.2.2)  Z = pnw(W), or z_k = w_i × w_j.

A situation in which this approach to defining the function H is reasonable is when the nodes represent cities, the edges represent roads connecting the cities, and the traffic between the cities is assumed to be proportional to their sizes (populations).

Another example is when the node weights correspond to probabilities of the node becoming “activated” at a given time. If different nodes get activated independently of each other, and we define an edge becoming activated when both endpoints are activated, then the product node weight scheme provides the probability of activation for each edge. Variations on this scheme (e.g., products of specified powers of node

weights) are described in Section 5.2.2.

5.2.1.3 Inheriting the Weight of an Endpoint Node

When the network is directed, it may be meaningful for h to be non-symmetric.

Obvious examples are h(w_i, w_j) = w_i and h(w_i, w_j) = w_j; these correspond, respectively, to inheriting the weight of the source node, which we call pull node weight

(pll), and inheriting the weight of the target node, which we call push node weight

(psh). The pull node weight was considered in [58] and the push node weight in [40].

These approaches correspond to the assumption that the edge weight is proportional to the weight of the source node (or target node). In the random activation example described above, it would correspond to assuming that the edge is activated whenever its source or target is activated.

However, if an edge inherits its weight from its source node, then the weights of terminal nodes (those with no edges exsurging from them) will be missing from the line graph representation. If an edge inherits its weight from its target node, then the weights of the vertices not pointed to by any edge will not appear in the line graph.

One solution is to add a dummy node together with an edge inheriting the otherwise lost node weight, as in [58].

5.2.2 The Case when h is Factorizable

Whenever the function h can be factored as

(5.2.3)  h(w_i, w_j) = h_1(w_i) h_2(w_j),

the relationship between W and Z = H(W) can be expressed neatly in matrix-algebraic terms.

Assume (5.2.3), and let H_1(W) be the diagonal matrix with the kth diagonal entry equal to h_1(w_i), whenever v_i is the source node of the edge e_k, for k = 1, ..., m; similarly for H_2(W) (using target nodes). Then H(W) = H_1(W) H_2(W), and we have

(5.2.4)  Ã = B^e Z B^{iT} = B^e H(W) B^{iT} = B^e H_1(W) H_2(W) B^{iT}.

If we define h_1(W) as the entry-wise application of h_1 to the diagonal elements of W, and similarly for h_2(W), then h_1(W) B^e = B^e H_1(W) and h_2(W) B^i = B^i H_2(W). Therefore,

(5.2.5)  Ã = h_1(W) B^e B^{iT} h_2(W) = h_1(W) A h_2(W).

Thus, h_1(W) B^e and B^e H_1(W) are equivalent definitions for B̃^e (and similarly for B̃^i).

An important case in which h is factorizable is when h(x, y) = x^α y^β, for fixed α, β ∈ R. We call the approach of obtaining Z from W in this way product node weight alpha beta (pnwαβ). In this case, Z = pnwαβ(W) satisfies

B^e Z B^{iT} = W^α A W^β.

This weighting scheme includes most of the schemes described above as special cases: α = β = 1 corresponds to product node weight; α = 1, β = 0 corresponds to pull node weight; α = 0, β = 1 corresponds to push node weight. Notice that h(x, y) = x^α y^β is symmetric (and therefore applicable to undirected networks) only when α = β; this includes the case α = β = 1/2, when each edge weight is the geometric mean of the endpoint node weights.

Negative values of α or β may make sense in some modeling situations. For example, α = 1 and β = −1 corresponds to the case when each edge weight is proportional to the weight of the source node and inversely proportional to the weight of the target node.
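The pnwαβ family is a one-liner in matrix form, Ã = W^α A W^β. A minimal NumPy sketch, again on the chain of Figure 5.3(a), recovering pnw, pull, and push as special cases:

```python
import numpy as np

# pnw_alpha_beta: encode node weights W into the adjacency matrix via
# A~ = W^alpha A W^beta.  Chain of Figure 5.3(a): v1(4)->v2(2)->v3(7)->v4(1).
w = np.array([4., 2., 7., 1.])
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = 1.0

def pnw_ab(A, w, alpha, beta):
    """Edge-weighted adjacency W^alpha A W^beta for node weights w."""
    return np.diag(w ** alpha) @ A @ np.diag(w ** beta)

A_pnw = pnw_ab(A, w, 1.0, 1.0)   # product node weight: z_k = w_i * w_j
A_pll = pnw_ab(A, w, 1.0, 0.0)   # pull: edge inherits its source weight
A_psh = pnw_ab(A, w, 0.0, 1.0)   # push: edge inherits its target weight
print(A_pnw[0, 1], A_pll[0, 1], A_psh[0, 1])   # 8.0 4.0 2.0
```

For the edge v1 → v2, the three choices give w1·w2 = 8, w1 = 4, and w2 = 2, respectively.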

5.2.3 Node Weights to Line Graph Edge Weights

Since, roughly speaking, the roles of nodes and edges of a graph G are interchanged

in the associated line graph G∗ (see Section 4.2), we investigate how a given set of

node weights for G can be incorporated into an adjacency matrix for G∗.

5.2.3.1 Simple Weighting

By using the expressions that relate the adjacency matrix of G∗ to the incidence

matrix (or incidence matrices) for G (see Section 4.2), we obtain expressions for

incorporating node weights W for G into the adjacency matrix for G∗.

Consider first the directed case. Here we have that E→ = B^{iT} B^e. Similarly to the expression for the weighted adjacency matrix Ã in (2.5.1), we define the simply weighted adjacency matrix of the line graph as

(5.2.6)  Ẽ→_SW = B^{iT} W B^e.

For undirected networks, the unweighted adjacency matrix of the undirected line graph is E = B^T B − 2 I_m, where B is the incidence matrix described in Subsection 2.3. We define the simply weighted adjacency matrix of the line graph as

(5.2.7)  Ẽ_SW = B^T W B − C,

where C is the diagonal matrix with c_kk = w_i + w_j, whenever v_i and v_j are the endpoints of the edge e_k, k = 1, ..., m. Figure 5.5 shows edge-weighted line graphs corresponding to the graphs in Figure 5.3. In (c), all connections in the left cluster have a weight of 1, and those to the right have a weight of 4.

Figure 5.5: Edge-weighted line graphs of the node-weighted sample graphs in Figure 5.3. (a) A directed chain. (b) A cyclic network. (c) Random undirected. (d) Random directed.
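The directed simply weighted line-graph adjacency (5.2.6) can be sketched directly from the incidence and exsurgence matrices; a minimal NumPy version for the chain of Figure 5.3(a):

```python
import numpy as np

# Simply weighted line-graph adjacency (5.2.6) for a directed graph:
# E~_SW = B^{iT} W B^e, where B^i marks each edge's target node and
# B^e its source node.  Chain of Figure 5.3(a).
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
w = np.array([4., 2., 7., 1.])
m = len(edges)

Bi = np.zeros((n, m))   # incidence: node i is the target of edge k
Be = np.zeros((n, m))   # exsurgence: node i is the source of edge k
for k, (i, j) in enumerate(edges):
    Be[i, k] = 1.0
    Bi[j, k] = 1.0

E_sw = Bi.T @ np.diag(w) @ Be
# [E_sw]_{kl} is the weight of the shared node when edge e_k feeds
# edge e_l: e1 -> e2 carries w2 = 2, e2 -> e3 carries w3 = 7,
# matching the line graph in Figure 5.5(a).
print(E_sw)
```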

We notice that this method does not capture the weights of nodes that are only in direct contact with one edge in the original graph. In order to accommodate them without changing the existing edges, we add a self-loop at each node of the graphs in Figure 5.3, and then create the associated line graph. (We only show this approach in Figure 5.6 for the network in Figure 5.3(a), but we perform it on all networks from this point onward.) Note that while these self-loops add nodes and edges in the line graph, we are not concerned with their ranking.

Figure 5.6: Self-loops added to the graph in Figure 5.3(a) and the corresponding line graph. (a) Adding self-loops. (b) Edge-weighted line graph. The effect of these additions is drawn in gray.

5.2.3.2 Scaling by Node Degree

As is well known, a node v in G does not necessarily become a single edge in G∗; in fact, in undirected graphs, each node v_i produces a complete subgraph (a clique) in G∗, containing (d_{v_i} choose 2) edges in G∗, where d_{v_i} is the degree of v_i in G. The simple weighting approach described above assigns the weight w_i to all those (d_{v_i} choose 2) edges in G∗.

For example, in Figure 5.5(c), we notice that all edges of the cluster to the left have weight 1, which is the weight of node v4 connecting these edges in Figure 5.3(c). From a modeling point of view, one can argue that in some applications v4 should distribute its weight to the surrounding edges, that is, each edge can communicate with a weight of 1/4. This suggests scaling the weights of those edges by the degree of v4.

For an undirected network where D is the diagonal matrix holding the degrees of its nodes, we define the degree scaled weighted adjacency matrix of the line graph as

(5.2.8)  Ẽ_DS = B^T W D^{−1} B − C_DS,

where C_DS = diag(c_kk) is the diagonal matrix with c_kk = w_i/d_{v_i} + w_j/d_{v_j}, and v_i and v_j are the endpoints of the edge e_k in G, of degrees d_{v_i} and d_{v_j}, respectively.

For directed networks, each node v_i in G results in indegree(v_i) × outdegree(v_i) edges in G∗ (connecting each of the G-edges inciding on v_i to each of the G-edges exsurging from v_i). The number of edges in the line graph depends on the number of edges exsurging from the nodes in the original graph. Let D_out be the diagonal matrix holding the out-degrees of the nodes of G and define the out-degree scaled weighted adjacency matrix of the line graph as

(5.2.9)  Ẽ→_ODS = B^{iT} W D_out^{−1} B^e.

The in-degree scaled version Ẽ→_IDS is defined similarly.

5.2.3.3 Strong Degree Scaling

Rather than scaling node v_i by its degree d_{v_i}, it makes sense in some situations to divide by the number of corresponding edges in G∗. For undirected graphs, this means dividing by (d_{v_i} choose 2). The algebra is similar to the scaling above, using a diagonal matrix D_s that contains the values (d_{v_i} choose 2), i = 1, ..., n, instead of the matrix D. This gives the strongly degree scaled adjacency matrix Ẽ_SDS.

Likewise, for directed networks, we define the strongly degree scaled weighted adjacency matrix of the line graph as

(5.2.10)  Ẽ→_SDS = B^{iT} W D_io^{−1} B^e,

where D_io = D_in D_out. Here D_in is a diagonal matrix holding the indegrees of the nodes, and D_out is the one holding their outdegrees. To avoid dividing by zero whenever a node is a source (and therefore has zero indegree) or a sink (and thus has zero outdegree), we add self-loops to each node of G before deriving G∗, as described above.
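A minimal NumPy sketch of the strong degree scaling (5.2.10), with self-loops added first so that no node has zero in- or out-degree; the graph is again the chain of Figure 5.3(a):

```python
import numpy as np

# Strongly degree scaled line-graph adjacency (5.2.10):
# E~_SDS = B^{iT} W D_io^{-1} B^e, with D_io = D_in D_out.
# Self-loops are added before forming the incidence matrices.
n = 4
w = np.array([4., 2., 7., 1.])                 # node weights, Figure 5.3(a)
edges = [(0, 1), (1, 2), (2, 3)] + [(i, i) for i in range(n)]
m = len(edges)

Bi, Be = np.zeros((n, m)), np.zeros((n, m))
for k, (i, j) in enumerate(edges):
    Be[i, k] = 1.0                             # exsurgence (source)
    Bi[j, k] = 1.0                             # incidence (target)

d_out = Be.sum(axis=1)                         # out-degrees (with loops)
d_in = Bi.sum(axis=1)                          # in-degrees (with loops)
D_io_inv = np.diag(1.0 / (d_in * d_out))       # no zero degrees remain

E_sds = Bi.T @ np.diag(w) @ D_io_inv @ Be
print(E_sds.shape)                             # one row/column per edge
```

For instance, the entry coupling e1 to e2 is w2/(d_in(v2)·d_out(v2)) = 2/(2·2) = 0.5.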

5.3 Computing Node Importance in Node-Weighted Networks

Based on the several ways to incorporate node weights into adjacency matrices described above, we can now use aggregate downstream and upstream reachability measures (described in Chapter 3) to find the most important nodes and edges in the network. To rank the edges, we apply (4.4.3) (if the network is undirected) and

(4.4.11) and (4.4.12) (if the network is directed). Which one of the line graphs from

Section 5.2.3 (Ẽ_SW, Ẽ_DS, Ẽ_SDS, or their counterparts for directed networks) is most appropriate depends on the application. Recall that we will add self-loops before deriving the line graph.

We apply (3.2.4) and (3.2.5) to rank the nodes and have to decide which edge-weighted adjacency matrix to use. The edge weights may be defined by snw, pnw, psh, pll, or pnwαβ. In addition, we may determine edge weights as follows: after ranking the edges of the graph via the line graph, we can plug these values as edge weights (without the self-loops) into the original graph, i.e.,

(5.3.1)  z_k = {eLC}_k, from (4.4.3), if the network is undirected,

(5.3.2)  z_k = {eLC_out}_k + {eLC_in}_k, from (4.4.11) and (4.4.12), if it is directed,

and get the edge-weighted adjacency matrix via (2.5.1), while removing the diagonal corresponding to the added self-loops.

The advantage of this method is that it determines the weight of each edge taking into consideration not only the weights of its source and target nodes, but also all other nodes in the network, and the network architecture. Therefore the edge weights are already well informed by the whole graph.
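Once edge weights z_k are available (from any of the schemes above), the node-ranking step is a direct matrix-exponential computation. A small NumPy/SciPy sketch with hypothetical edge weights on the chain of Figure 5.3(a):

```python
import numpy as np
from scipy.linalg import expm

# Node ranking from an edge-weighted adjacency matrix: ADR = exp(A~)1
# ranks transmitters, AUR = exp(A~^T)1 ranks receivers.
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
z = np.array([6., 9., 8.])            # hypothetical edge weights

A_tilde = np.zeros((n, n))
for k, (i, j) in enumerate(edges):
    A_tilde[i, j] = z[k]

one = np.ones(n)
adr = expm(A_tilde) @ one             # aggregate downstream reachability
aur = expm(A_tilde.T) @ one           # aggregate upstream reachability
print(adr.argmax(), aur.argmax())     # 0 3: v1 best transmitter, v4 best receiver
```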

To rank the nodes of the network, we calculate the ADR and AUR of (3.2.4) and (3.2.5)

and compare the values between nodes.

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW      ODS     SDS     SW      ODS     SDS
e1         64.05   12.06   3.55      5.00    3.00   3.00
e2         51.17   17.92   6.72     13.67    4.83   2.71
e3          2.00    2.00   1.50    135.87   25.49   7.99

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW        ODS       SDS     SW        ODS       SDS
v1         1.0517E5  1.7569E3  136.16  0.0000E5  0.0010E3    1.00
v2         0.0454E5  0.3364E3   55.19  0.0007E5  0.0161E3    7.55
v3         0.0014E5  0.0285E3   10.49  0.0230E5  0.1951E3   41.32
v4         0.0000E5  0.0010E3    1.00  1.0747E5  1.9107E3  152.96

Table 2: Top part: Ranking of the edges of example (a) in Figure 5.3 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the adjacency matrix options of the line graph described in Section 5.2.3. The options are simply weighted (SW): Ẽ→_SW = B^{iT} W B^e, out-degree scaled (ODS): Ẽ→_ODS = B^{iT} W D_out^{−1} B^e, and strongly degree scaled (SDS): Ẽ→_SDS = B^{iT} W D_io^{−1} B^e. Bottom part: for the options above, the edges in Figure 5.3 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

We turn to the graphs in Figure 5.3 and rank their edges and nodes. Tables 2, 3, and 5 show the calculated measures for the components of the graphs in Figure 5.3(a), (b), and (d). The top parts display the values of {eLC_out}_k and {eLC_in}_k for each edge e_k, using the SW, ODS, and SDS adjacency matrix options of the line graph as described in Section 5.2.3.

To turn the original network into an edge-weighted one, we assign each edge a weight equal to the sum of its in and out values as in (5.3.2). The corresponding weighted adjacency matrix becomes Ã = B^e Z B^{iT}. The bottom parts of Tables 2, 3, and 5 give the ADR and AUR values for each node v_i using the newly calculated Ã. We notice that the SW method becomes computationally expensive very fast. The ODS and SDS options give the same rankings in Tables 2 and 3, since the largest out-degree in those networks is 2, which is equal to its factorial value. Most rankings are the same for all methods.

The example in Figure 5.3(c) is undirected. Since adding self-loops to an undirected network adds many edges to its line graph, the computation of SW may result in numerical overflow on many computers. A way to avoid overflow is discussed in Section 5.5. Table 4 shows the calculated measures for the components of the graph in Figure 5.3(c).

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW      ODS    SDS      SW      ODS    SDS
e1         131.7   13.2   3.6      130.8   12.7   4.1
e2         205.2   22.7   7.0       81.1    8.6   2.9
e3          53.6    5.1   1.9      316.9   31.0   8.1
e4         211.0   18.1   4.8       83.0    7.6   2.3

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW       ODS     SDS    SW       ODS     SDS
v1         3.9E130  6.1E12  5.5E3  4.6E130  6.3E12  5.2E3
v2         4.5E130  6.9E12  6.1E3  4.0E130  5.5E12  4.6E3
v3         4.7E130  6.5E12  5.3E3  3.8E130  5.9E12  5.4E3
v4         3.8E130  5.3E12  4.5E3  4.7E130  7.2E12  6.3E3

Table 3: Top part: Ranking of the edges of example (b) in Figure 5.3 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3. Bottom part: for the options above, the edges in Figure 5.3 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

5.4 Real-Life Examples

5.4.1 Example of Genotype Mutation

This section discusses a biological example to illustrate some of the methods described in this dissertation. To study the resistance of bacteria to an antibiotic, Nichol et

Edge importance                    Node importance (ADR = AUR)
edge e_k   SW     DS     SDS       node v_i   SW     DS       SDS
e1         3.8E5  38.1    2.5      v1         1.42   3.5E34   0.4E14
e2         4.2E5  53.8   13.8      v2         0.05   0.3E34   0.4E14
e3         3.7E5  35.7    3.0      v3         0.07   1.3E34   4.7E14
e4         0.4E5   8.5    2.8      v4         0.81   2.5E34   4.8E14
e5         0.6E5  42.4   33.3      v5         0.05   0.3E34   0.4E14
e6         0.4E5   8.5    2.8      v6         0.88   2.4E34   0.2E14
                                   v7         0.78   1.6E34   0.4E13

Table 4: Left part: Ranking of the edges of example (c) in Figure 5.3 using (4.4.11) and (4.4.12), for the adjacency matrix options of the line graph described in Section 5.2.3. The options are simply weighted (SW): Ẽ_SW = B^T W B − C, degree scaled (DS): Ẽ_DS = B^T W D^{−1} B − C_DS, and strongly degree scaled (SDS): Ẽ_SDS = B^T W D_s^{−1} B − C_SDS. Right part: for the given options, the edges in Figure 5.3 are given as weight the importance value from the left part of this table, then ADR is calculated to rank the nodes. Highest values are bold. Note that the SW values for node importance are calculated after subtracting µI from Ã, where µ is the spectral radius of Ã, to avoid overflow.

al. [51, 52] use an example that involves genotypes of 3 bits; see Figure 5.7(a). Each genotype is assigned a fitness level according to the fitness landscape in Figure 5.7(c), and a genotype can only mutate to other genotypes if they have a higher fitness level.

The authors present possible scenarios for the probability of these transitions, such as the probability being proportional to the fitness level increase, or the probability being that of a random walk as shown in Figure 5.7(b). In this dissertation we suggest transition probabilities based on the edge weights calculated using (5.3.2).

We display the network's node-weighted graph in Figure 5.8, with node weights equal to the corresponding fitness levels. Note that we do not add self-loops only to nodes v1 and v8 as in [51]; since we allow each genotype to remain the same and not mutate, we add self-loops to all nodes of the graph as described in Section 5.2.3.

We construct Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS from Section 5.2.3 to get the edge-weighted adjacency matrix of the line graph. We then calculate {eLC_out}_k and {eLC_in}_k from

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW      ODS    SDS      SW      ODS    SDS
e1          18.7    8.2   3.1      339.5   17.4   3.9
e2         128.0    5.4   1.9      339.5   17.4   3.9
e3           8.0    8.0   4.5      200.0    9.9   2.8
e4           3.0    3.0   2.0       86.8    3.6   1.5
e5         526.0   27.8   4.51      86.8    3.6   1.5
e6         241.5   14.1   4.4      204.4   19.2   4.7
e7         526.0   27.8   4.51       2.0    1.5   1.5

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW       ODS     SDS    SW       ODS     SDS
v1         7.1E218  4.6E12  1.6E3  6.6E218  4.9E12  1.4E3
v2         1.2E203  1.9E01  8.3E0  4.7E218  4.4E12  1.5E3
v3         1.0E000  1.0E00  1.0E0  1.9E218  2.7E12  1.6E3
v4         7.7E218  5.8E12  1.9E3  6.1E218  3.9E12  1.2E3
v5         1.0E000  1.0E00  1.0E0  1.1E218  9.0E11  6.2E2
v6         6.3E218  5.3E12  2.1E3  7.4E218  4.3E12  1.1E3
v7         6.6E218  5.4E12  1.8E3  1.0E000  1.0E00  1.0E0

Table 5: Top part: Ranking of the edges of example (d) in Figure 5.3 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3. Bottom part: for the options above, the edges in Figure 5.3 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

(4.4.11) and (4.4.12), respectively, and report them at the top of Table 6. The ODS method resulted in edges e3, e4, and e8 having the highest eLC_out value. Both the SW and SDS methods favor edges e5 and e10. All methods ranked edge e4, representing the transition from “010” to “000”, as having the highest eLC_in value.

Now we use the sum of {eLC_in}_k and {eLC_out}_k as the weight of edge e_k, for k = 1, ..., 12, in the edge-weighted version of the graph in Figure 5.8. We identify the importance of the nodes as transmitters by calculating ADR, and their importance as receivers by AUR, and report them at the bottom of Table 6.

We conclude from the ADR values of all methods that the genotypes “110”, i.e.

Figure 5.7: The genotype network for bit strings of length 3 and the corresponding stochastic transitions according to the fitness levels and equations presented in [51]. (a) Graph connecting genotypes. (b) The corresponding stochastic transitions based on mutation to a better-fitted neighbor. (c) Fitness landscape.

Figure 5.8: The genotype directed graph of Figure 5.7. The node weights are from Figure 5.7(c).

node v6, and “001”, i.e. node v1, are the least stable genotype states, that is, the most likely to transition into another state. This holds regardless of the choice of SW, ODS, or SDS. On the other hand, Table 6 also shows that genotype “111”, i.e. node v7, is most likely to eventually be the last genotype reached by mutation, followed by “000”, i.e. node v8. Note that genotypes “111” and “000” have an ADR value equal to 1 because, according to the fitness landscape in this example, they have a higher fitness score than the states that differ from them by one digit. Therefore “111” and “000” do not mutate. Similarly, “110” and “001” have an AUR of 1 because they have a lower fitness score than the genotypes that differ from them by one digit, so the

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW    ODS   SDS         SW    ODS   SDS
e1         1.57  1.17  1.07        1.32  1.08  1.08
e2         1.86  1.41  1.12        1.32  1.08  1.08
e3         1.73  1.73  1.18        1.32  1.08  1.08
e4         1.73  1.73  1.18        3.04  1.90  1.28
e5         2.46  1.68  1.19        1.31  1.10  1.05
e6         1.49  1.49  1.12        1.31  1.10  1.05
e7         1.86  1.41  1.12        1.52  1.16  1.08
e8         1.73  1.73  1.18        1.52  1.16  1.08
e9         1.49  1.49  1.12        2.33  1.58  1.18
e10        2.46  1.68  1.19        1.10  1.03  1.03
e11        1.98  1.29  1.13        1.10  1.03  1.03
e12        1.49  1.49  1.12        1.10  1.03  1.03

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW    ODS   SDS         SW    ODS   SDS
v1         34.4  22.2  16.89        1.0   1.0   1.0
v2          5.8   4.6   3.5        13.8   9.6   7.9
v3         16.6  11.4   8.2         3.9   3.2   3.2
v4         14.1  10.4   8.0         4.1   3.3   3.2
v5          4.8   4.1   3.3        12.8   9.0   7.8
v6         35.5  22.8  16.88        1.0   1.0   1.0
v7          1.0   1.0   1.0        33.4  22.9  16.9
v8          1.0   1.0   1.0        43.2  27.4  17.9

Table 6: Top part: Ranking of the edges of the genotype mutation graph in Figure 5.8 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3. Bottom part: for the options above, the edges in Figure 5.8 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

latter ones do not mutate to “110” nor to “001”. The specific value of 1 comes from the identity matrix in the Taylor expansion of the exponential function.

5.4.2 Example of Social Networks: Medium and Twitter

In this example we predict which are the most influential users on the Medium platform, based on their connectivity on Medium and the influence of their

Twitter accounts. We use a data set collected in 2016 that describes 1,075,983

users, who are identified by numerical IDs to protect their privacy [47]. The data set

contains information about the users whom they follow and those they are followed

by on Medium, along with information about whether their account is linked to their

Twitter account, and some information about the Twitter account, if available. The

data were collected to argue that linking Medium with Twitter is helpful to attract

a large number of new users. We use the data provided publicly, and construct our

network adjacency matrix from the data showing how accounts follow each other

on Medium. We only take into account the users who have a linked Twitter account,

and use the number of followers they have on Twitter as the node weights in the

graph.

For computational purposes, we narrow down our dataset and pick users who have

more than 5,000 Twitter followers. Our network consists of the subset of 10,077 users

represented by nodes and 992,539 directed connections expressed by edges. Because

of the large network size, the MATLAB function expm fails to carry out the computation. In the

following computations in this section, we follow the numerical methods described in

Section 5.5 to avoid computational overflow.

For each of the Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS options from Section 5.2.3, we calculate eLC_out = exp(Ẽ→)1 and eLC_in = 1^T exp(Ẽ→) by the approximations in Section 5.5. We use the sum of eLC_out and eLC_in as the weight of the edges in the edge-weighted version of the graph representing the Medium network. Before applying ADR and AUR to the constructed edge-weighted adjacency matrices, we again use the numerical methods described in Section 5.5. The top 15 ranked users for Medium are reported in

Table 7.

No further comments can be made on the top-ranked accounts, since they are anonymous, but we observe considerable overlap in IDs among our methods, and between influencing accounts and influenced ones. A likely explanation is that high-impact social media users are also highly impacted by others.

Dissipating nodes (ADR)            Absorbing nodes (AUR)
SW      ODS     SDS                SW      ODS     SDS
822     9790    3801               822     9790    9790
2265    3801    5277               2265    5277    5277
2326    11359   403527             2326    29395   3801
540     5277    38396              540     22969   23125
2806    20198   16842              2806    3801    20150
14745   23099   14745              14745   20198   149845
7539    22811   9790               8813    22811   6572
8813    20150   22839              7539    23413   9911
2681    45      34607              2681    11359   12346
722     403527  75389              722     20150   15354
988     9911    20137              6385    9917    35687
6385    34607   22798              45      12346   96277
9790    14745   250195             988     9911    26342
2631    38396   22032              2631    17238   22969
8058    16842   65898              8058    10629   146948

Table 7: Top 15 ranked accounts in the Medium social network, based on the associated accounts' influence on Twitter, using the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3.

5.5 Computational Aspects

This section discusses some computational aspects of how to evaluate exp(E→)1 and related matrix functions for large networks. For small to medium-sized (square) matrices M, we can first evaluate exp(M) with the MATLAB function expm (provided that overflow does not occur), and then multiply the matrix exp(M) by the vector

1. Methods for evaluating exp(M) for small to medium-sized matrices are described by Higham [42]. However, when the matrix M is large, the evaluation of exp(M) is

more difficult for several reasons:

1. Adjacency matrices M that represent networks typically are sparse; the matrix exp(M) generally is not. The memory requirement for computing and subsequently storing exp(M) may be substantial.

2. The computational effort required for evaluating exp(M) for a large matrix M

may be prohibitive.

3. Overflow is more likely to take place when M is a large adjacency matrix than when M is small.

Our models require that we first evaluate exp(E→)1 and 1^T exp(E→) to form the edge-weighted graph. This defines the adjacency matrix Ã. Subsequently, we compute exp(Ã)1 and 1^T exp(Ã) to rank the nodes. To simplify the discussion, we will let M denote either one of the matrices E→ and Ã.

To avoid overflow, we can evaluate (an approximation of) the spectral radius µ of M and compute exp(M − µI) instead of exp(M). The replacement of M by M − µI does not affect the relative importance of edges and nodes in the graph. This rescaling has also been used in [34]. We applied it in the computations for the genotype example of Section 5.4.1.

The large memory requirement makes it impossible to evaluate the exponential of the matrices from the Medium-Twitter example in Section 5.4.2 on a standard laptop computer. This difficulty can be circumvented by approximating exp(M) by a low-rank matrix that is determined by applying a few steps of the Arnoldi or nonsymmetric Lanczos processes. We will compare these methods.

108 5.5.1 The Arnoldi Process

Let M ∈ Rn×n and 1) = [1, ... , 1]T ∈ Rn. Application of `  n steps of the

Arnoldi process to the matrix A with initial vector 1 gives the Arnoldi decomposition

T (5.5.1) MW` = W`H` + g`1` ,

n×` where the columns of the matrix W` = [w1, w2, ... , w`] ∈ R form an orthonormal

`−1 basis for the Krylov subspace K`(M, 1) = span{w1, Mw1, ... , M w1} and w1 =

`×` 1/k1k. Here k · k denotes the Euclidean vector norm. The matrix H` ∈ R is of

n T T upper Hessenberg form, g` ∈ R satisfies W` g` = 0, and 1` = [0, ... , 0, 1, 0, ... , 0] denotes the `th axis vector of appropriate order; details on the Arnoldi process can be found, e.g., in Meurant [48] and Saad [60, Chapter 6]. We assume that ` is small enough so that the decomposition (5.5.1) with the stated properties exists. This is the generic situation. The computation of this decomposition requires the evaluation of ` matrix-vector products with the matrix M. We approximate exp(M)1 by the right-hand side of

(5.5.2)  exp(M)1 ≈ W_ℓ exp(H_ℓ) e_1 ‖1‖;

see, e.g., [8, 45] for properties of this approximation method. For many adjacency matrices M, it suffices to let ℓ in the decomposition (5.5.1) be fairly small to obtain a good enough approximation of exp(M)1. This is illustrated below. When the matrix M is large and ℓ is fairly small, the dominating computational effort required to compute the decomposition (5.5.1) is the evaluation of ℓ matrix-vector products with M. In applications of interest to us the matrix M generally is nonsymmetric. The computations then have to be repeated with M replaced by M^T when an approximation of exp(M^T)1 also is desired.
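A minimal sketch may clarify the procedure of (5.5.1)-(5.5.2). The following Python code (NumPy, with SciPy's dense expm as the "exact" reference; the random test matrix and the function name are purely illustrative) carries out ℓ Arnoldi steps with modified Gram-Schmidt and forms W_ℓ exp(H_ℓ) e_1 ‖1‖:

```python
import numpy as np
from scipy.linalg import expm

def arnoldi_expm_ones(M, ell):
    """Approximate exp(M) @ 1 from ell Arnoldi steps, cf. (5.5.1)-(5.5.2)."""
    n = M.shape[0]
    b = np.ones(n)
    beta = np.linalg.norm(b)
    W = np.zeros((n, ell + 1))
    H = np.zeros((ell + 1, ell))
    W[:, 0] = b / beta
    for j in range(ell):
        v = M @ W[:, j]
        for i in range(j + 1):          # modified Gram-Schmidt orthogonalization
            H[i, j] = W[:, i] @ v
            v -= H[i, j] * W[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] < 1e-12:         # lucky breakdown: invariant subspace found
            ell = j + 1
            break
        W[:, j + 1] = v / H[j + 1, j]
    e1 = np.zeros(ell)
    e1[0] = 1.0
    # Right-hand side of (5.5.2): W_ell exp(H_ell) e_1 ||1||
    return W[:, :ell] @ (expm(H[:ell, :ell]) @ e1) * beta

# Illustrative sparse nonsymmetric "adjacency matrix"
rng = np.random.default_rng(1)
M = (rng.random((50, 50)) < 0.1).astype(float)
np.fill_diagonal(M, 0.0)
exact = expm(M) @ np.ones(50)
approx = arnoldi_expm_ones(M, 35)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

For this small example a modest ℓ already yields a small relative error, consistent with the behavior reported below for the Medium-Twitter example; only ℓ matrix-vector products with M are needed.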

5.5.2 The Nonsymmetric Lanczos Process

Application of ℓ steps of the nonsymmetric Lanczos process to the matrix M ∈ R^{n×n} with initial vector 1 = [1, …, 1]^T ∈ R^n gives the Lanczos decompositions

(5.5.3)  M V_ℓ = V_ℓ T_ℓ + δ_{ℓ+1} v_{ℓ+1} e_ℓ^T,
         M^T W_ℓ = W_ℓ T_ℓ^T + β_{ℓ+1} w_{ℓ+1} e_ℓ^T,

where the columns of the matrix V_ℓ = [v_1, v_2, …, v_ℓ] ∈ R^{n×ℓ} span the Krylov subspace K_ℓ(M, v_1) = span{v_1, M v_1, …, M^{ℓ−1} v_1} with v_1 = 1/‖1‖, and the columns of the matrix W_ℓ = [w_1, w_2, …, w_ℓ] ∈ R^{n×ℓ} span the Krylov subspace K_ℓ(M^T, w_1) = span{w_1, M^T w_1, …, (M^T)^{ℓ−1} w_1} with w_1 = 1/‖1‖. The columns of the matrices V_ℓ and W_ℓ are biorthogonal, i.e., W_ℓ^T V_ℓ = I_ℓ. It follows from (5.5.3) that

W_ℓ^T M V_ℓ = T_ℓ.

The matrix T_ℓ ∈ R^{ℓ×ℓ} is tridiagonal. For details on the nonsymmetric Lanczos method, see, e.g., Saad [60, Chapter 7]. We assume that ℓ is small enough so that the decomposition (5.5.3) with the stated properties exists. How to proceed when this is not the case is discussed in [6]. The computation of the decomposition (5.5.3) requires ℓ matrix-vector product evaluations with M and with M^T. Analogously to (5.5.2), we use the approximation

(5.5.4)  exp(M)1 ≈ V_ℓ exp(T_ℓ) e_1 ‖1‖.

In our experience with large-scale real-world networks, we found that a small number of steps ℓ with the nonsymmetric Lanczos algorithm typically was sufficient to render a quite accurate approximation of exp(M)1.
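A compact Python sketch of (5.5.3)-(5.5.4) may be helpful (NumPy/SciPy; the recurrences below use one common convention for the coefficients δ and β, and the test matrix is illustrative). Serious breakdowns are simply reported here rather than cured by the look-ahead techniques discussed in [6]:

```python
import numpy as np
from scipy.linalg import expm

def lanczos_expm_ones(M, ell):
    """Two-sided (nonsymmetric) Lanczos sketch for exp(M) @ 1, cf. (5.5.3)-(5.5.4)."""
    n = M.shape[0]
    b = np.ones(n)
    nb = np.linalg.norm(b)
    V = np.zeros((n, ell))
    W = np.zeros((n, ell))
    alpha = np.zeros(ell)
    beta = np.zeros(ell)    # superdiagonal of T
    delta = np.zeros(ell)   # subdiagonal of T
    V[:, 0] = W[:, 0] = b / nb            # w_1^T v_1 = 1
    for j in range(ell):
        alpha[j] = W[:, j] @ (M @ V[:, j])
        r = M @ V[:, j] - alpha[j] * V[:, j]
        s = M.T @ W[:, j] - alpha[j] * W[:, j]
        if j > 0:
            r -= beta[j] * V[:, j - 1]
            s -= delta[j] * W[:, j - 1]
        if j == ell - 1:
            break
        sr = s @ r
        if abs(sr) < 1e-14:               # serious breakdown; see [6] for remedies
            raise RuntimeError("Lanczos breakdown")
        delta[j + 1] = np.sqrt(abs(sr))
        beta[j + 1] = sr / delta[j + 1]
        V[:, j + 1] = r / delta[j + 1]
        W[:, j + 1] = s / beta[j + 1]
    T = np.diag(alpha) + np.diag(delta[1:], -1) + np.diag(beta[1:], 1)
    e1 = np.zeros(ell)
    e1[0] = 1.0
    # Right-hand side of (5.5.4): V_ell exp(T_ell) e_1 ||1||
    return V @ (expm(T) @ e1) * nb

# Illustrative test on a small random digraph
rng = np.random.default_rng(2)
M = (rng.random((60, 60)) < 0.1).astype(float)
np.fill_diagonal(M, 0.0)
exact = expm(M) @ np.ones(60)
approx = lanczos_expm_ones(M, 35)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

Note that each step requires one matrix-vector product with M and one with M^T; the same run therefore also produces the basis W_ℓ needed for the upstream quantity exp(M^T)1.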

5.5.3 Approximations for the Medium-Twitter Example

This section discusses in detail applications of the Arnoldi and nonsymmetric Lanczos processes to the ranking of the nodes of the Medium-Twitter example of Section 5.4.2. We first apply ℓ steps of the Arnoldi or nonsymmetric Lanczos processes to one of the three matrices Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS, with initial vector 1 = [1, …, 1]^T. The purpose of these computations is to determine edge weights and form the weighted adjacency matrix Ã following (5.3.2). Subsequently, we apply the Arnoldi and nonsymmetric Lanczos processes to approximate exp(Ã)1 and 1^T exp(Ã) to rank the nodes of the original graph.

               Arnoldi                                Nonsymmetric Lanczos
       number of  time    ADR rel.  AUR rel.   number of  time    ADR rel.  AUR rel.
       iterations (sec.)  error     error      iterations (sec.)  error     error
SW        16      34.5    2.2E-2    9.9E-3         8      33.5    5.8E+1    1.4E+3
          17      36.2    4.1E-3    2.0E-3        11      41.6    4.5E-3    8.6E-2
          20      40.6    2.1E-5    3.9E-5        14      52.1    9.1E-6    1.1E-4
ODS       10      20.9    1.8E-1    2.0E-1         6      27.9    1.5E-2    1.5E-2
          11      24.0    3.2E-2    1.3E-3         7      30.4    2.3E-3    2.3E-3
          20      39.6    8.7E-9    3.4E-7        14      32.7    6.1E-7    6.2E-7
SDS        8       3.7    1.1E-1    3.1E-2         5       3.0    8.1E-1    8.1E-1
           9       4.1    1.7E-3    9.0E-3         6       3.5    6.6E-3    6.6E-3
          20      10.8    2.1E-10   3.1E-9        14       7.5    2.7E-7    1.6E-7

Table 8: Comparison of the performance of Arnoldi and nonsymmetric Lanczos approximations when applied to the Medium social network using simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrices for the line graph.

Since we are interested in the node ranking, we show the smallest number of iterations with the Arnoldi and nonsymmetric Lanczos processes, when applied to one of the matrices Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS and to Ã, required so that the ranking of the top 20 nodes does not change when carrying out more steps. While this "stopping criterion" is not practical to use for the Arnoldi and nonsymmetric Lanczos processes, it illustrates that only a fairly small number of steps is required to gain insight into the node ordering. We found this to be true for other real-world large-scale networks as well. Hence, the computations required for many real-world large-scale network problems are not very expensive. Table 8 reports results for the matrices Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS in the top row of each "window". Of course, identical ranking does not imply identical ADR and AUR values. The table therefore also displays the error in these values, as well as the errors achieved when the number of iterations is increased. The "exact values" are determined by carrying out 100 iterations with the Arnoldi and nonsymmetric Lanczos processes.
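The ranking-based "stopping criterion" described above can be made concrete with a short Python helper (illustrative; the function names are ours): after each batch of additional Arnoldi or Lanczos steps, compare the current top-20 node ranking with the previous one and stop once it no longer changes.

```python
import numpy as np

def top_k_ranking(scores, k=20):
    """Indices of the k highest-scoring nodes, best first."""
    return np.argsort(-scores, kind="stable")[:k]

def ranking_stable(scores_prev, scores_curr, k=20):
    """True when two successive approximations rank the top-k nodes identically."""
    return np.array_equal(top_k_ranking(scores_prev, k),
                          top_k_ranking(scores_curr, k))

# Example: a monotone rescaling of the scores leaves the ranking unchanged,
# while swapping two high scores changes it.
s1 = np.arange(30.0)
s2 = 2.0 * s1 + 1.0
s3 = s1.copy()
s3[28], s3[29] = s3[29], s3[28]
```

In a loop over increasing ℓ one would retain the previous score vector and stop as soon as `ranking_stable` returns True; as noted above, this serves to illustrate the convergence behavior rather than as a practical stopping rule.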

The computations were carried out on a Lenovo Ideapad 510 laptop computer with a 2.5 GHz Intel Core i7 processor and 6 GB 2133 MHz DDR4 memory. Each time reported in Table 8 is the average of 10 runs. The adjacency matrix Ẽ→_SW has the most nonvanishing entries, and the adjacency matrix Ẽ→_SDS the fewest. We observe that the former matrix requires the most iterations and the longest computing time to satisfy our "stopping criterion", and the latter matrix the fewest iterations and the shortest computing time. In this example, the Arnoldi process requires more iterations to satisfy the "stopping criterion" than the nonsymmetric Lanczos process. However, each step of the latter requires two matrix-vector product evaluations, while the former demands only one matrix-vector product evaluation per step; as a result, the Lanczos process does not always require less time than the Arnoldi process.

The line graph adjacency matrices are about 10^6 × 10^6, whereas the matrix Ã is only about 10^4 × 10^4. The line graph adjacency matrices are very sparse. On average, the Arnoldi and nonsymmetric Lanczos processes applied to the approximation of exp(Ã)1 required 5 iterations each, computed in less than 0.1 seconds.

CHAPTER 6

Conclusions

Until now the use of matrix functions based on the exponential has not received much attention for ranking the nodes of a directed network. Differently from the situation for undirected networks, it is generally not so useful to tabulate the diagonal entries of the exponential of the adjacency matrix. This already has been observed in the literature; see, e.g., Benzi et al. [11].

An important difference between directed and undirected networks is that in the former the notions of centrality and importance are quite distinct. Nodes in the "periphery" may influence a large number of other nodes, directly or indirectly, while highly central nodes may be interpreted as important intermediaries which collect influence or information from many nodes and broadcast it to many others. This suggests that one measure of importance is rarely sufficient for analyzing complex directed networks, and that a combination of measures will often provide a more complete picture. The gene network example (Section 3.4.2.2) illustrates this by combining upstream and downstream aggregate reachabilities to identify genes that play different roles and are important in different ways.

We also introduced a family of reachability measures that consider walks that are allowed to change direction a bounded number of times. This allows us to take into account "lateral" relationships between nodes (they influence the same nodes, or are influenced by the same nodes). Such relationships might escape the aggregate reachability measures AUR and ADR. At the same time, we limit the loss of information about directionality by limiting the number of turns that a walk may take. These measures are helpful for identifying important branch points.

This dissertation discusses the determination of the most important edges of an undirected or directed graph by using an associated line graph. For directed graphs several line graphs are described and their usefulness for ranking edges is discussed.

We also consider the task of removing unimportant edges. Computed examples illustrate the feasibility of the methods described.

We also presented modeling approaches to find the most important nodes in a network for which node weights are provided.

The methods we describe are built upon the notion of aggregate reachability, which uses matrix functions like the matrix exponential, applied to the adjacency matrix of the network. Since the adjacency matrix is such a key concept for this approach, we investigated different ways in which node weights can be incorporated into some version of an adjacency matrix, both for the original network and its line graph.

The large number of possible approaches is the result of several modeling design decisions that can be made. This flexibility is useful, as it makes it possible to model many different situations. However, it is difficult to obtain clear rules about which approach is best in a particular situation. We broke down the design process into a small set of design decisions, and provided guidelines to help with these choices.

As an argument in favor of using aggregate reachability as a measure of importance when node weights are provided, we proved in Section 5.1 that if the weight of a node is increased, then its ranking does not decrease and, for special cases, receives the highest increase in importance.

One should always keep in mind that the definition of importance is subjective and depends on what the network modeler considers a quality versus a flaw; the interpretation of node and edge importance therefore always lies in the eye of the beholder.

Bibliography

[1] S. Achard, R. Salvador, B. Whitcher, J. Suckling, and E. D. Bullmore. A resilient, low-frequency, small-world human brain functional network with highly connected association cortical hubs. Journal of Neuroscience, 26(1):63–72, 2006.

[2] L. A. Amaral, A. Scala, M. Barthelemy, and H. E. Stanley. Classes of small-world networks. Proceedings of the National Academy of Sciences, 97(21):11149–11152, 2000.

[3] F. Arrigo and M. Benzi. Edge modification criteria for enhancing the communicability of digraphs. SIAM Journal on Matrix Analysis and Applications, 37(1):443–468, 2016.

[4] F. Arrigo and M. Benzi. Updating and downdating techniques for optimizing network communicability. SIAM Journal on Scientific Computing, 38(1):B25–B49, 2016.

[5] J. Baglama, C. Fenu, L. Reichel, and G. Rodriguez. Analysis of directed networks via partial singular value decomposition and Gauss quadrature. Linear Algebra and its Applications, 456:93–121, 2014.

[6] Z. Bai, D. Day, and Q. Ye. ABLE: An adaptive block Lanczos method for non-Hermitian eigenvalue problems. SIAM Journal on Matrix Analysis and Applications, 20, 1999.

[7] A. Barrat, M. Barthelemy, R. Pastor-Satorras, and A. Vespignani. The architecture of complex weighted networks. Proceedings of the National Academy of Sciences, 101(11):3747–3752, 2004.

[8] B. Beckermann and L. Reichel. Error estimation and evaluation of matrix functions via the Faber transform. SIAM Journal on Numerical Analysis, 47(5):3849–3883, 2009.

[9] M. Bellalij, L. Reichel, G. Rodriguez, and H. Sadok. Bounding matrix functionals via partial global block Lanczos decomposition. Applied Numerical Mathematics, 94:127–139, 2015.

[10] M. Benzi and P. Boito. Quadrature rule-based bounds for functions of adjacency matrices. Linear Algebra and its Applications, 433(3):637–652, 2010.

[11] M. Benzi, E. Estrada, and C. Klymko. Ranking hubs and authorities using matrix functions. Linear Algebra and its Applications, 438(5):2447–2474, 2013.

[12] M. Benzi and C. Klymko. Total communicability as a centrality measure. Journal of Complex Networks, 1(2):124–149, 2013.

[13] M. Benzi and C. Klymko. On the limiting behavior of parameter-dependent network centrality measures. SIAM Journal on Matrix Analysis and Applications, 36(2):686–706, 2015.

[14] D. A. Bini, G. M. Del Corso, and F. Romani. Evaluating scientific products by means of citation-based models: a first analysis and validation. Electronic Transactions on Numerical Analysis, 33:1–16, 2008.

[15] P. Bonacich. Factoring and weighting approaches to status scores and clique identification. Journal of Mathematical Sociology, 2(1):113–120, 1972.

[16] P. Bonacich. Power and centrality: a family of measures. American Journal of Sociology, 92(5):1170–1182, 1987.

[17] U. Brandes and T. Erlebach. Network Analysis: Methodological Foundations, volume 3418. Springer, New York, 2005.

[18] C. Chen, Z. Jia, and P. Varaiya. Causes and cures of highway congestion. IEEE Control Systems Magazine, 21(6):26–32, 2001.

[19] W.-K. Chen. Graph Theory and its Engineering Applications, volume 5. World Scientific, Singapore, 1997.

[20] X. Chu, Z. Zhang, J. Guan, and S. Zhou. Epidemic spreading with nonlinear infectivity in weighted scale-free networks. Physica A: Statistical Mechanics and its Applications, 390(3):471–481, 2011.

[21] J. J. Crofts, E. Estrada, D. J. Higham, and A. Taylor. Mapping directed networks. Electronic Transactions on Numerical Analysis, 37:337–350, 2010.

[22] J. J. Crofts and D. J. Higham. Googling the brain: Discovering hierarchical and asymmetric network structures, with applications in neuroscience. Internet Mathematics, 7(4):233–254, 2011.

[23] O. De la Cruz Cabrera, M. Matar, and L. Reichel. Analysis of directed networks via the matrix exponential. Journal of Computational and Applied Mathematics, 355:182–192, 2019.

[24] O. De la Cruz Cabrera, M. Matar, and L. Reichel. Edge importance in a network via line graphs and the matrix exponential. Numerical Algorithms, 2019. In press.

[25] R. Diestel. Graph Theory. Springer, Berlin, 2000.

[26] J. Duch and A. Arenas. Community detection in complex networks using extremal optimization. Physical Review E, 72(2):027104, 2005.

[27] E. Estrada. The Structure of Complex Networks: Theory and Applications. Oxford University Press, Oxford, 2012.

[28] E. Estrada and N. Hatano. Statistical-mechanical approach to subgraph centrality in complex networks. Chemical Physics Letters, 439(1-3):247–251, 2007.

[29] E. Estrada, N. Hatano, and M. Benzi. The physics of communicability in complex networks. Physics Reports, 514(3):89–119, 2012.

[30] E. Estrada and D. J. Higham. Network properties revealed through matrix functions. SIAM Review, 52(4):696–714, 2010.

[31] E. Estrada and J. A. Rodriguez-Velazquez. Subgraph centrality in complex networks. Physical Review E, 71(5):056103, 2005.

[32] Federal Aviation Administration (FAA). Passenger boarding (enplanement) and all-cargo data for US airports, 2016.

[33] A. Farahat, T. LoFaro, J. C. Miller, G. Rae, and L. A. Ward. Authority rankings from HITS, PageRank, and SALSA: Existence, uniqueness, and effect of initialization. SIAM Journal on Scientific Computing, 27(4):1181–1201, 2006.

[34] C. Fenu, D. Martin, L. Reichel, and G. Rodriguez. Network analysis via partial spectral factorization and Gauss quadrature. SIAM Journal on Scientific Computing, 35(4):A2046–A2068, 2013.

[35] D. F. Gleich. PageRank beyond the web. SIAM Review, 57(3):321–363, 2015.

[36] C. Godsil and G. F. Royle. Algebraic Graph Theory, volume 207. Springer, New York, 2013.

[37] G. H. Golub and G. Meurant. Matrices, Moments and Quadrature with Applications, volume 30. Princeton University Press, 2009.

[38] J. L. Gross and J. Yellen. Graph Theory and Its Applications, Second Edition. Taylor & Francis, Boca Raton, 2006.

[39] B. Hall. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, volume 222. Springer, New York, 2003.

[40] J. Heitzig, J. F. Donges, Y. Zou, N. Marwan, and J. Kurths. Node-weighted measures for complex networks with spatially embedded, sampled, or differently sized nodes. The European Physical Journal B, 85(1):38, 2012.

[41] V. E. Henson and G. Sanders. Locally supported eigenvectors of matrices associated with connected and unweighted power-law graphs. Electronic Transactions on Numerical Analysis, 39:353–379, 2012.

[42] N. J. Higham. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008.

[43] L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39–43, 1953.

[44] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604–632, 1999.

[45] L. A. Knizhnerman. Calculation of functions of unsymmetric matrices using Arnoldi's method. USSR Computational Mathematics and Mathematical Physics, 31(1):1–9, 1991.

[46] I. X. Y. Leung, S. Y. Chan, P. Hui, and P. Lio. Intra-city urban network and traffic flow analysis from GPS mobility trace. arXiv preprint arXiv:1105.5839, 2011.

[47] F. Li, Y. Chen, R. Xie, F. Ben Abdesslem, and A. Lindgren. Understanding service integration of online social networks: A data-driven study. In 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), pages 848–853. IEEE, 2018.

[48] G. Meurant. Computer Solution of Large Linear Systems. Elsevier, Amsterdam, 1999.

[49] M. E. J. Newman. Analysis of weighted networks. Physical Review E, 70(5):056131, 2004.

[50] M. E. J. Newman. Networks: An Introduction. Oxford University Press, Oxford, 2010.

[51] D. Nichol, P. Jeavons, A. G. Fletcher, R. A. Bonomo, P. K. Maini, J. L. Paul, R. A. Gatenby, A. R. A. Anderson, and J. G. Scott. Steering evolution with sequential therapy to prevent the emergence of bacterial antibiotic resistance. PLoS Computational Biology, 11(9):e1004493, 2015.

[52] D. Nichol, P. Jeavons, A. G. Fletcher, R. A. Bonomo, P. K. Maini, J. L. Paul, R. A. Gatenby, A. R. A. Anderson, and J. G. Scott. Exploiting evolutionary non-commutativity to prevent the emergence of bacterial antibiotic resistance. BioRxiv, page 007542, 2015.

[53] Bureau of Transportation Statistics. Research and Innovative Technology Administration / TranStats.

[54] T. Opsahl, F. Agneessens, and J. Skvoretz. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks, 32(3):245–251, 2010.

[55] J. Orlin. Contentment in graph theory: covering graphs with cliques. Indagationes Mathematicae (Proceedings), 80(5):406–424, 1977.

[56] PARTA. Kent State Campus Bus Service. http://www.partaonline.org/ride-parta/campus-bus-service/.

[57] G. A. Pavlopoulos, M. Secrier, C. N. Moschopoulos, T. G. Soldatos, S. Kossida, J. Aerts, R. Schneider, and P. G. Bagos. Using graph theory to analyze biological networks. BioData Mining, 4(1):10, 2011.

[58] M. Pelillo, K. Siddiqi, and S. W. Zucker. Many-to-many matching of attributed trees using association graphs and game dynamics. In International Workshop on Visual Form, pages 583–593. Springer, 2001.

[59] S. Pozza and F. Tudisco. On the stability of network indices defined by means of matrix functions. SIAM Journal on Matrix Analysis and Applications, 39(4):1521–1546, 2018.

[60] Y. Saad. Iterative Methods for Sparse Linear Systems, volume 82. SIAM, Philadelphia, 2003.

[61] S. Scarsoglio, F. Laio, and L. Ridolfi. Climate dynamics: a network-based approach for the analysis of global precipitation. PLoS One, 8(8):e71129, 2013.

[62] C. Stark, B. J. Breitkreutz, T. Reguly, L. Boucher, A. Breitkreutz, and M. Tyers. BioGRID: a general repository for interaction datasets. Nucleic Acids Research, 34(suppl. 1):D535–D539, 2006.

[63] G. Stelzer, N. Rosen, I. Plaschkes, S. Zimmerman, M. Twik, S. Fishilevich, T. I. Stein, R. Nudel, I. Lieder, Y. Mazor, S. Kaplan, D. Dahary, D. Warshawsky, Y. Guan-Golan, A. Kohn, N. Rappaport, M. Safran, and D. Lancet. The GeneCards suite: from gene data mining to disease genome sequence analyses. Current Protocols in Bioinformatics, 54(1):1–30, 2016.

[64] K. Thulasiraman and M. N. S. Swamy. Graphs: Theory and Algorithms. Wiley, New York, 1992.

[65] D. Wei, X. Deng, X. Zhang, Y. Deng, and S. Mahadevan. Identifying influential nodes in weighted networks based on evidence theory. Physica A: Statistical Mechanics and its Applications, 392(10):2564–2575, 2013.

[66] F. Zou, X. Li, S. Gao, and W. Wu. Node-weighted Steiner tree approximation in unit disk graphs. Journal of Combinatorial Optimization, 18(4):342, 2009.