<<

MATAR, MONA, Ph.D., August, 2019 APPLIED MATHEMATICS

NODE AND EDGE IMPORTANCE IN NETWORKS

VIA THE MATRIX EXPONENTIAL (131 pp.)

Directors of Dissertation: Lothar Reichel, Omar De la Cruz Cabrera

The matrix exponential has been identified as a useful tool for the analysis of undirected networks, with sound theoretical justifications for its ability to model important aspects of a given network. Its use for directed networks, however, is less developed and has been less successful so far. In this dissertation we discuss some methods to identify important nodes in a directed network using the matrix exponential, taking into account that the notion of importance differs depending on whether we consider the influence of a given node along the edge directions (downstream influence) or how it is influenced by directed paths that point to it (upstream influence). In addition, we introduce a family of importance measures based on counting walks that are allowed to reverse their direction a limited number of times, thus capturing relationships arising from influencing the same nodes, or being influenced by the same nodes, without sacrificing information about edge direction. These measures provide information about branch points.

This dissertation is also concerned with the identification of important edges in a network, in both their roles as transmitters and receivers of information. We propose a method based on computing the matrix exponential of a matrix associated with a line graph of the given network. Both undirected and directed networks are considered.

Edges may be given positive weights. Computed examples illustrate the performance of the proposed method.

In addition to the identification of important nodes and edges in unweighted and edge-weighted networks, we study the importance of nodes in node-weighted graphs.

To the best of our knowledge, adjacency matrices for node-weighted graphs have not received much attention. This dissertation describes how the line graph associated with a node-weighted graph can be used to construct an edge-weighted graph that can be analyzed with available methods. Both undirected and directed graphs with positive node weights are considered. We show that when the weight of a node increases, the importance of this node in the graph increases as well. Some applications to real-life problems are shown.

NODE AND EDGE IMPORTANCE IN NETWORKS

VIA THE MATRIX EXPONENTIAL

A dissertation submitted to

Kent State University in partial

fulfillment of the requirements for the

degree of Doctor of Philosophy

by

Mona Matar

August, 2019

Dissertation written by

Mona Matar

B.S., The Lebanese University, 2004

M.S., The University of Akron, 2014

Ph.D., Kent State University, 2019

Approved by

Lothar Reichel, Chairs, Doctoral Dissertation Committee

Omar De la Cruz Cabrera,

Jing Li, Members, Doctoral Dissertation Committee

Jun Li,

Austin Melton,

Hassan Peyravi,

Accepted by

Andrew Tonge, Chair, Department of Mathematical Sciences

James L. Blank, Dean, College of Arts and Sciences

TABLE OF CONTENTS

TABLE OF CONTENTS ...... v

ACKNOWLEDGEMENTS ...... viii

1 Introduction ...... 1

2 Basic Definitions and Properties ...... 7

2.1 Graphs ...... 7

2.2 Adjacency Matrix ...... 8

2.3 Incidence Matrix of Undirected Graphs ...... 10

2.3.1 Incidence to Adjacency Matrix for Undirected Graphs . . . . . 10

2.3.2 Adjacency to Incidence Matrix for Undirected Graphs . . . . . 11

2.4 Incidence Matrix for Directed Graphs ...... 12

2.4.1 Incidence Matrix for Directed Graphs Using Entries ±1 . . . . 12

2.4.2 Incidence and Exsurgence Matrices for Directed Graphs . . . 15

2.5 Weights ...... 15

3 Analysis of Directed Networks Via the Matrix Exponential ...... 17

3.1 The Matrix Exponential and Other Matrix Functions ...... 17

3.2 Node Importance and Communicability Using the Matrix Exponential 18

3.2.1 Existing Methods ...... 18

3.2.2 Aggregate Upstream and Downstream Reachability ...... 22

3.3 Reverting Walks with a Bounded Number of Reversions ...... 24

3.4 Examples ...... 27

3.4.1 Small Examples ...... 27

3.4.2 Real-Life Large Examples ...... 35

3.4.3 Bus Route Network Targeting Specific Nodes ...... 38

3.5 Numerical Considerations ...... 40

4 Edge Importance in a Network Via Line Graphs and the Matrix Exponential 47

4.1 Incidence and Exsurgence Matrices ...... 47

4.2 Line Graphs ...... 48

4.2.1 Line Graphs of an Undirected Graph ...... 48

4.2.2 Line Graphs of a Directed Graph ...... 48

4.3 Edge Weights ...... 50

4.3.1 Example ...... 53

4.4 Computing the Most Important Edges in an Undirected Network by the Matrix Exponential ...... 53

4.4.1 Review of the Adjacency Matrix Exponential ...... 53

4.4.2 Exponential of the Line Graph Adjacency Matrix for Undirected Graphs ...... 54

4.4.3 A Comparison of Downdating Methods for Undirected Graphs 56

4.4.4 A Comparison of Downdating Methods for Directed Graphs . 62

4.5 Computing the Most Important Edges in a Directed Unweighted Network Using the Matrix Exponential ...... 68

4.5.1 The Exponential of the Extended Line Graph Adjacency Matrix E+ ...... 68

4.5.2 The Exponential of the Line Graph Adjacency Matrix E→ .. 70

4.6 Computing the Most Important Edges in a Directed Weighted Network Using the Matrix Exponential ...... 75

4.6.1 Example ...... 75

4.6.2 Flight Example II ...... 76

4.7 Computational Aspects ...... 77

5 Node Importance in Node-Weighted Networks via Line Graphs and the Matrix Exponential ...... 79

5.1 The Sensitivity of Node Importance to Weight Change ...... 79

5.1.1 Preliminaries ...... 80

5.1.2 Matrix Perturbation Results ...... 82

5.1.3 Example on Sensitivity to Weight Change ...... 88

5.2 Node-Weighted to Edge-Weighted ...... 89

5.2.1 Edge Weights from Endpoint Node Weights ...... 91

5.2.2 The Case when h is Factorizable ...... 93

5.2.3 Node Weights to Line Graph Edge Weights ...... 95

5.3 Computing Node Importance in Node-Weighted Networks ...... 99

5.4 Real-Life Examples ...... 101

5.4.1 Example of Genotype Mutation ...... 101

5.4.2 Example of Social Networks: Medium and Twitter ...... 105

5.5 Computational Aspects ...... 107

5.5.1 The Arnoldi Process ...... 109

5.5.2 The Nonsymmetric Lanczos Process ...... 110

5.5.3 Approximations for the Medium-Twitter Example ...... 111

6 Conclusions ...... 114

ACKNOWLEDGEMENTS

The following work would not have been possible without the guidance of my advisors Dr. Lothar Reichel and Dr. Omar De la Cruz Cabrera, and the support of my parents and my husband. To my daughters I say, you inspire me every day to work hard, and be the best version of myself.

Thank you!

CHAPTER 1

Introduction

Often a complex system can be modeled as a network: a set of nodes, any two of which might be connected in some fashion by edges. The nature of the nodes and the connections may vary widely from application to application. Like all models, network models leave out many details of reality; however, they are able to capture a substantial part of the complexity of a system in a way that is amenable to mathematical and computational analysis. Mathematically, we represent a network by a graph, which may be directed or undirected [25, 27, 36, 50].

In spite of their simplicity, which makes mathematical analysis tractable, these concepts can capture much of the complexity of the behavior of the system. Some examples are:

• Social networks: Nodes are individuals (human or animal), and edges represent social relationships (e.g., acquaintance, friendship, allegiance).

• Online social networks: These are internet-based services in which registered users can establish formalized relationships that modulate sharing of information (the prototype is Facebook). Nodes are users, and edges represent connections like “friendship” (undirected) or “following” (directed).

• Road networks: Each intersection or endpoint is a node, and each road section connecting one node to another one is an edge.

• In molecular biology, genes and/or proteins can be regarded as nodes, connected by relationships like regulation (directed) or interaction (undirected) [57].

Network analysis can be carried out on at least three levels: individual vertices and edges, subgraphs, and global properties of the whole network [17, 27, 50]. In this work, we are interested in determining the relative importance of nodes, as well as communicability between pairs of nodes. The importance of a node not only depends on how many edges originate from or end at the node, but also on the importance of the neighboring nodes. For instance, consider a graph in which the nodes represent papers and the edges represent citations. An important paper conveys importance to papers that it cites and, to a lesser extent, papers that cite an important paper also may be important. Network analysis can help determine which nodes (papers) contribute the most in broadcasting or receiving of information through the network.

Various measures to quantify the importance of a node in a network have been proposed in the literature; see, e.g., [11, 23, 30, 34, 43, 44]. In the undirected case, quantities that try to capture the intuitive notion of importance have come to be known as notions of centrality [13, 17, 27, 31], based on the idea that important nodes should be reachable from many nodes in fairly few steps. In directed networks, a node can be important in two ways: a node can have high downstream influence (can reach many nodes in fairly few steps along the direction of the edges) or high upstream influence (is reached by many nodes in fairly few steps along the direction of the edges); a node with high centrality should have high influence both upstream and downstream. Each one of these concepts can be of independent interest, depending on the application.

Some approaches (see, e.g., [5, 10, 11, 12, 13, 30, 34]) involve the use of matrix functions, like the matrix exponential of the adjacency matrix; see Section 3.1 for further details. These methods have proved to be useful when applied to undirected networks.

For directed networks, the record is less clear. Some authors claim that the application of matrix exponential methods leads to counter-intuitive results in simple examples ([33]; see Section 3.2.1). The hubs and authorities approach [44] was proposed as a way to avoid some of those perceived shortcomings, while acknowledging that importance in a directed network should depend on whether it is considered upstream or downstream. A variation of this approach has recently been described in [11]. Katz [43] proposed that the resolvent of the adjacency matrix A or its transpose A^T times the vector 1 = [1, 1, ..., 1]^T be used as a centrality measure. Specifically, Katz considered the entries of the vectors

(1.0.1)    (I − µA)^{−1} 1,    (I − µA^T)^{−1} 1,

as centrality measures. Here, the scalar µ > 0 is chosen sufficiently small so that the power series expansions of the above expressions converge; see below for further details. More recently, Benzi et al. [11, 12, 13] considered analogues of the expressions (1.0.1) with the resolvents replaced by the exponential functions of A, A^T, and of the matrix (3.2.2). However, for directed graphs very little computational analysis has been reported in the literature that sheds light on the performance of the expressions e^A 1 and e^{A^T} 1 as measures of a node’s importance and the ease of traveling to or from the node. It is the purpose of this dissertation to elucidate these issues, as well as to introduce matrix functions that allow a finite number of reversals of paths. These matrix functions are helpful in identifying branch points.
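To make the two directions concrete, the following sketch (in Python with NumPy/SciPy here, although the computations in this dissertation are carried out in MATLAB) evaluates the Katz vectors (1.0.1) and their exponential analogues e^A 1 and e^{A^T} 1 for the small digraph of Figure 2.1; the choice µ = 0.5/ρ(A) is an illustrative admissible value, not one prescribed in the text.

```python
import numpy as np
from scipy.linalg import expm

# Adjacency matrix of the directed network of Figure 2.1:
# edges v1->v2, v1->v3, v2->v3, v3->v4, v4->v1.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
one = np.ones(4)

# Katz vectors (1.0.1); mu < 1/rho(A) guarantees convergence of the series.
rho = max(abs(np.linalg.eigvals(A)))
mu = 0.5 / rho
katz_down = np.linalg.solve(np.eye(4) - mu * A, one)    # (I - mu A)^(-1) 1
katz_up = np.linalg.solve(np.eye(4) - mu * A.T, one)    # (I - mu A^T)^(-1) 1

# Exponential analogues: row sums of e^A rank nodes by downstream reach
# (broadcasters); row sums of e^(A^T) rank them by upstream reach (receivers).
down = expm(A) @ one
up = expm(A.T) @ one
```

Note that 1^T e^A 1 = 1^T e^{A^T} 1, so the two exponential measures distribute the same total walk count differently over the nodes.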

Different measures of node centrality/importance capture different network features. This is particularly true for directed networks, since the notion of importance depends on the choice of direction (upstream or downstream). One of our goals is to show that, for directed networks, using two or more measures simultaneously is necessary to obtain a more complete picture.

In this dissertation, we are interested in measuring the importance of edges in a network as well. This is a problem that arises in various applications. For example, in [18], the authors are concerned about highway sections with congestion that reduces the overall highway network efficiency. Intuitively, an important edge is a good target for deletion when the goal is to disrupt the network and, therefore, worthy of protection when the goal is to preserve it. On the other hand, unimportant edges may possibly be eliminated (say, to save resources) with a small overall effect.

Our approach to study the importance of edges is to regard them as nodes in a line graph (see Section 4.2) and apply node centrality measures determined by a matrix function, in particular the matrix exponential. This works in a straightforward way for undirected networks, but becomes more complex for directed ones. We also consider the effect of edge weights.

It is often meaningful to assign weights to edges and nodes. For example, in networks in which each node represents a city and each edge represents a road, the edge weight may represent the capacity of transportation of the road. Edge-weighted networks have received considerable attention in the literature; see, e.g., [7, 20, 49, 54, 65]. It also may be purposeful to assign weights to nodes. The interpretation of node weights depends on the context of the model. For instance, in a network that models a part of the brain, where each node corresponds to a region of the brain, node weights may be chosen proportional to the size of the region of interest [1]. In networks in which each node corresponds to a city and the edges are roads between cities, a node weight may be chosen proportional to the number of restaurants in a city [46]. Node weights also may measure precipitation of a geographical region in a climate network [61]. However, despite many applications of node-weighted networks, the construction of suitable adjacency matrices for such networks has, to the best of our knowledge, not been discussed in the literature.

This part of the work is concerned with the identification of the most important nodes of a node-weighted network by using matrix functions, in particular the matrix exponential. A main challenge is the construction of a suitable adjacency matrix.

Our approach is to transform the given node-weighted graph to an edge-weighted graph based on the line graph associated with the given graph. We describe several ways to construct edge-weighted line graphs. Both undirected and directed graphs are considered. We also discuss how the change of a node weight affects the importance of the node.

This dissertation is organized as follows. Chapter 2 introduces basic notions about graphs and matrix functions. Chapter 3 discusses several ways to measure the importance of a node. In particular, we discuss the application of e^A 1 and e^{A^T} 1 as measures to quantify aggregate upstream and downstream reachability, as well as relativized versions of this approach that can identify the nodes that have most influence on a predetermined set of nodes. We discuss the possibility of reverting the direction of the walk a limited number of times. We then give examples that show the functions e^A 1 and e^{A^T} 1 of a non-symmetric adjacency matrix A provide meaningful rankings of the nodes in their roles as broadcasters and receivers, respectively. Numerical methods for large-scale networks with direction reversal are discussed. Chapter 4 introduces graphs and associated matrices and discusses line graphs. Line graphs for both undirected and directed graphs are considered. For directed graphs we define several line graphs. We are also concerned with graphs that have weighted edges. The identification of the most important edges of an undirected or directed graph with uniform weights by using the exponential function is discussed. The computation of the most important edges of a directed weighted graph is discussed, and computed illustrations are provided in most sections. We discuss the computations required to apply the described method. Chapter 5 discusses node-weighted networks. We show results on how the importance of the nodes changes when a node weight is modified, and discuss ways that we can transform a node-weighted graph into an edge-weighted graph. We then show how we can identify the most important node(s) and edge(s) of a node-weighted graph, and present applications of our methods to real-life examples. Computed illustrations are provided in most sections. We summarize our results in Chapter 6.

CHAPTER 2

Basic Definitions and Properties

Algebraic graph theory uses algebraic methods to study graphs. In particular, the use of Linear Algebra has proved useful for the analysis of networks. Detailed expositions can be found, for example, in [25, 27, 36]. In this chapter we will develop only concepts that will be needed below; some notations and definitions are non-standard.

2.1 Graphs

A network can be described mathematically by a graph G = (V, E), where V = {v1, v2, ..., vn} is the set of nodes (or vertices) and E = {e1, e2, ..., em} is the set of edges; basic facts about graphs can be found, e.g., in [25, 27, 36, 50]. If some of the edges are directed, then we call the graph directed, otherwise we call it undirected. A directed edge ek pointing from node vi to node vj can be identified with the ordered pair (vi, vj), and we say that ek incides on vj, exsurges from vi, and connects vi and vj; in the undirected case, each element of E is an unordered pair ek = {vi, vj} of elements of V, and we say that ek incides on both vi and vj, and connects vi and vj (and also vj and vi). Notice that in either case it is possible that vi = vj; in specific cases, we will require that such “self-loops” do not exist. We assume that there are no multiple edges between any pair of vertices. Two nodes connected by an edge are called adjacent.

The out-degree of a node counts the number of edges exsurging from that node, and the in-degree counts those inciding directly at it.

A (standard) walk of length k is a sequence vi1, vi2, ..., vik+1 of nodes and a sequence ei1, ei2, ..., eik of edges such that eij points from vij to vij+1. A walk with no repeated vertices is called a path. An alternating walk from node vi1 to node vik is a sequence of nodes vi1, vi2, ..., vik, such that the direction of the edges is reversed at each step. If this walk starts with an edge pointing from vi1, then an edge eij points from vij to vij+1 if j is odd, and from vij+1 to vij if j is even. If this walk starts with an edge pointing to vi1, then an edge eij points from vij+1 to vij if j is odd, and from vij to vij+1 if j is even [11, 21]. We are interested in identifying nodes that are good broadcasters or good receivers, i.e., nodes that originate several (standard) walks (originate much information flow) or are targets of several (standard) walks. We remark that good broadcasters or receivers are not necessarily good hubs or authorities, respectively; see Section 3.2.1.2 for a discussion of the latter concepts and references to the literature.

Figure 2.1: Graph of a directed network (nodes v1, v2, v3, v4).

2.2 Adjacency Matrix

We can describe a network of n nodes or vertices by the n × n adjacency matrix A = [Aij] of G, with Aij = 1 if there exists an edge that connects nodes vi and vj, and Aij = 0 otherwise. The choice of 1 for all nonzero elements of A follows the assumption that the network is unweighted, i.e., all connections are equally important; self-loops correspond to diagonal entries. For directed networks as in Figure 2.1, the edges are directed, and the resulting adjacency matrix is generally nonsymmetric. The adjacency matrix for the graph in Figure 2.1 is

(2.2.1)    A = [ 0 1 1 0
                 0 0 1 0
                 0 0 0 1
                 1 0 0 0 ].

Here and below the superscript T denotes transposition. The transpose A^T of the adjacency matrix can be thought of as the adjacency matrix of the graph obtained if we switch the direction of the edges.

Figure 2.2: Graph of an undirected network (nodes v1–v5, edges e1–e8).

For undirected networks as in Figure 2.2, the edges are undirected, and the resulting adjacency matrix is symmetric. The adjacency matrix for the graph in Figure 2.2 is

(2.2.2)    A = [ 0 1 0 0 1
                 1 0 1 1 1
                 0 1 0 1 1
                 0 1 1 0 1
                 1 1 1 1 0 ].

2.3 Incidence Matrix of Undirected Graphs

Let the graph G be undirected and unweighted. Then the incidence matrix of G is an n × m matrix B = [Bij] with Bij = 1 if ej incides on vi, and Bij = 0 otherwise. Each row represents a node in the graph, and each column an edge. Notice that each column of B has exactly two entries equal to 1 (and the rest zero), unless the corresponding edge is a self-loop, in which case exactly one entry is 1. If there are no self-loops, then BB^T = A + D, where D = [Dij] is a diagonal matrix with the diagonal entry Dii equal to the degree of vi. The incidence matrix corresponding to the network in Figure 2.2 is

(2.3.1)    B = [ 1 1 0 0 0 0 0 0
                 1 0 1 1 1 0 0 0
                 0 0 1 0 0 1 1 0
                 0 0 0 1 0 1 0 1
                 0 1 0 0 1 0 1 1 ].

2.3.1 Incidence to Adjacency Matrix for Undirected Graphs

The adjacency matrix of an undirected network can be recovered from the corresponding incidence matrix by the formula in Lemma 2.3.1, as in [36].

Lemma 2.3.1. The adjacency matrix A of an undirected network can be computed from the corresponding incidence matrix B by A = BB^T − diag(BB^T).

Proof. For an undirected network, the diagonal of the matrix BB^T counts the number of edges in contact with each node. In other words, the degree of node vi is the ith diagonal entry of BB^T, calculated as the inner product of the ith row of B with itself. That row holds the entry 1 each time node vi is involved with an edge. A nondiagonal entry (BB^T)_ij is equal to 1 when node vi is connected to node vj and 0 otherwise; therefore it coincides with the corresponding component of the adjacency matrix.

We note that the networks we study do not have self-loops, and thus the diagonal entries of the adjacency matrix are all zero. We conclude that the adjacency matrix A of an undirected network can be computed from the corresponding incidence matrix B by A = BB^T − diag(BB^T).

2.3.2 Adjacency to Incidence Matrix for Undirected Graphs

An incidence matrix can be constructed by reading the graph of the network, as well as by using the corresponding adjacency matrix. Algorithm 1, written in MATLAB, describes that process. This algorithm is written as a function with Adj the symmetric adjacency input matrix and Inc the incidence output matrix. Starting with Adj ∈ R^{n×n}, we use the variable k to enumerate the edges in the graph. Each row and each column of Adj represents a node. Since Adj is symmetric, the 1s above the diagonal cover all the edges. We use the variable i to fix a row of Adj, search for all Adjij = 1 in that row to the right of column i, and assign a new column in the incidence matrix to each one of them. For the kth edge found, we place a 1 in the ith row and another 1 in the jth row of the incidence matrix, in column k. As the incidence matrix grows, it gets filled with 0 entries to preserve the correct dimensions. We assume that there are no standalone nodes in the graph; therefore, at the end of the algorithm, Inc ∈ R^{n×m}. Note that if the adjacency matrix is in sparse format, appropriate adjustments should be made to the algorithm.

Algorithm 1: Adjacency to incidence for undirected networks

function [Inc] = AdjToInc(Adj)
    n = size(Adj, 1);
    % Compute the incidence matrix
    k = 0;
    for i = 1:n
        for j = i+1:n
            if Adj(i, j) == 1
                k = k + 1;
                Inc(i, k) = 1;
                Inc(j, k) = 1;
            end
        end
    end
end

2.4 Incidence Matrix for Directed Graphs

In this section we present two common methods for describing the incidence matrix

of a directed network. In Chapter 4, we will present a new way of writing such

matrices.

2.4.1 Incidence Matrix for Directed Graphs Using Entries ±1

We can define the incidence matrix by B̂ik = −1 if edge ek emerges from node vi, B̂ik = 1 if edge ek points to node vi, and B̂ik = 0 otherwise [19, 36]. Since each edge can emerge from only one node, and point to only one, each column of B̂ contains one component equal to 1 and another one equal to −1, and the rest are zeros. MATLAB uses this notation to define the incidence matrix for digraphs. For the network in Figure 2.1, the incidence matrix is

(2.4.1)    B̂ = [ −1 −1  0  0  1
                   1  0 −1  0  0
                   0  1  1 −1  0
                   0  0  0  1 −1 ].

2.4.1.1 Incidence to Adjacency Matrix for Directed Graphs Using Entries ±1

The formula in Lemma 2.3.1 fails to recover the adjacency matrix A from the incidence matrix of a directed network using entries −1 and 1. One observes that B̂B̂^T is symmetric, whereas A is not. We derive the adjacency matrix from the corresponding incidence matrix for those networks by using Algorithm 2 for a function that we implement in MATLAB, where Inc is the incidence input matrix, and AdjCalc the calculated adjacency output matrix. In this algorithm, n, the number of nodes, and m, the number of edges, are determined by the size of Inc. We initialize the calculated adjacency matrix as an n × n zero matrix. The variable k goes through the edges from 1 to m. For a fixed column k of Inc, we search for the entry holding −1, representing the source node. If that entry is in the ith row, then there is a 1 in the ith row of AdjCalc. The entry holding 1 in the kth column of Inc represents the target node. If that entry is in the jth row, then the edge is represented by AdjCalcij = 1.

Algorithm 2: Incidence to adjacency for directed networks using entries 1 and −1

function [AdjCalc] = IncToAdjDir(Inc)
    % Compute the adjacency matrix from the incidence matrix
    [n, m] = size(Inc);
    AdjCalc = zeros(n, n);
    for k = 1:m
        for i = 1:n
            if Inc(i, k) == -1
                for j = 1:n
                    if Inc(j, k) == 1
                        AdjCalc(i, j) = 1;
                    end
                end
            end
        end
    end
end

2.4.1.2 Adjacency to Incidence Matrix for Directed Graphs Using Entries ±1

An incidence matrix can be constructed by reading the graph of the network, as well as by using the corresponding adjacency matrix. Algorithm 3, written in MATLAB, describes that process. This algorithm is written as a function with Adj as the adjacency input matrix and Inc as the incidence output matrix. In addition, n is the number of nodes in the network, and m the number of edges. This algorithm works like Algorithm 1 with two differences. The first one is that we place a −1 instead of a 1 in Inc for the source nodes. The second difference is that since the matrix is nonsymmetric and the edges are directed, we need to take into account all of the 1 entries in the adjacency matrix, and not just the ones above the diagonal.

Algorithm 3: Adjacency to incidence for directed networks using entries 1 and −1

function [Inc] = AdjToIncDir(Adj)
    n = size(Adj, 1);
    % Compute the incidence matrix
    k = 0;
    for i = 1:n
        for j = 1:n
            if Adj(i, j) == 1
                k = k + 1;
                Inc(i, k) = -1;   % source node
                Inc(j, k) = 1;    % target node
            end
        end
    end
end

2.4.2 Incidence and Exsurgence Matrices for Directed Graphs

Assume now that G is directed and unweighted. We then define the incidence and exsurgence matrices of G as the n × m matrices B^i = [B^i_ij] and B^e = [B^e_ij], respectively, with B^i_ij = 1 if ej incides on vi, B^e_ij = 1 if ej exsurges from vi, and entries zero otherwise.

In this dissertation, wherever we plan on using the incidence matrix concept for directed graphs, we will be appropriately using the incidence and exsurgence matrices B^i and B^e described above.

2.5 Weights

Whether directed or not, a network can be edge-weighted (or node-weighted), if there is a number assigned to each edge (or node). Figure 2.3 illustrates an edge- weighted graph, in which each edge weight represents the travel cost of traversing the road that corresponds to the edge. The nodes are unweighted, i.e., each node has unit weight. Figure 2.4 displays a node-weighted graph, in which each node weight is the price of a hotel room at the node. The edges are unweighted, i.e., they all have weight one. Graphs also may be both edge-weighted and node-weighted, but this case is beyond the scope of this work. The interpretation of the weights depends on the application. In general, node weights correspond to the “size” of a node, while edge weights indicate a capacity or speed of transportation, or the reciprocal of a transfer or communication cost. In this dissertation weights are positive.

For undirected unweighted graphs, the degree of a node is defined as the number of edges ending at it; for directed unweighted graphs, we identify the in-degree of a node as the number of edges ending at it, and the out-degree of a node as the number of edges originating from it.

Figure 2.3: Directed edge-weighted graph.

Figure 2.4: Directed node-weighted graph.

For node-weighted graphs, all edges have weight one. When constructing an associated edge-weighted graph, its nodes will have weight one, and its edges will have weights as described in Section 5.2.

2.5.0.1 Adjacency and Incidence Matrices of Edge-Weighted Graphs

Let the edges of the graph G have positive weights and denote the associated weighted adjacency matrix by Ã. Thus, the (i, j)th entry of Ã is the weight of the edge from node vi to node vj. We refer to the adjacency matrix Ã as edge-scaled. The “unweighted” adjacency matrix A that is associated with Ã has all edge weights equal to one. Thus, the entries of A belong to {0, 1}.

Consider the unweighted adjacency matrix A = B^e (B^i)^T associated with the edge-weighted graph G, and let the diagonal matrix Z with diagonal entries z1, z2, ..., zm hold the edge weights of the graph. Then the weighted adjacency matrix for the graph G can be written as

(2.5.1)    Ã = B^e Z (B^i)^T.
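The factorization (2.5.1) can be checked on the digraph of Figure 2.1; the sketch below (Python/NumPy, with one fixed edge ordering and hypothetical positive weights z1, ..., z5 chosen only for illustration) recovers the 0/1 adjacency matrix (2.2.1) and then scales each edge by its weight.

```python
import numpy as np

# Incidence (B^i) and exsurgence (B^e) matrices for the directed graph of
# Figure 2.1, with edges ordered e1: v1->v2, e2: v1->v3, e3: v2->v3,
# e4: v3->v4, e5: v4->v1.
Bi = np.array([[0, 0, 0, 0, 1],   # Bi[i, j] = 1 if e_j incides on v_i
               [1, 0, 0, 0, 0],
               [0, 1, 1, 0, 0],
               [0, 0, 0, 1, 0]])
Be = np.array([[1, 1, 0, 0, 0],   # Be[i, j] = 1 if e_j exsurges from v_i
               [0, 0, 1, 0, 0],
               [0, 0, 0, 1, 0],
               [0, 0, 0, 0, 1]])

# Hypothetical positive edge weights z_1, ..., z_5 on the diagonal of Z.
Z = np.diag([2.0, 1.0, 3.0, 0.5, 4.0])

A_unweighted = Be @ Bi.T    # recovers the 0/1 adjacency matrix (2.2.1)
A_weighted = Be @ Z @ Bi.T  # edge-scaled adjacency matrix (2.5.1)
```

Each nonzero entry of the product lands at the (source, target) position of one edge and carries that edge's weight.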

CHAPTER 3

Analysis of Directed Networks Via the Matrix Exponential

3.1 The Matrix Exponential and Other Matrix Functions

The (i, j)th element of A^k is the number of walks of length k starting at node vi and ending at node vj. A matrix function can be defined by a power series of the form

(3.1.1)    f(A) = Σ_{p=0}^{∞} c_p A^p,

which can be interpreted as the sum of counts of walks of various lengths between the nodes of the network, weighted according to their length by the coefficients cp.

Generally, these coefficients are nonnegative and decreasing. This implies that long walks are weighted less than short walks, i.e., they are considered less important than short walks. Moreover, we would like the coefficients to decrease to zero quickly enough so that the series (3.1.1) converges. A nice introduction to the use of matrix functions in network analysis is provided by Estrada and Higham [30].
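A small check of the walk-counting interpretation (a Python/NumPy sketch on the digraph of Figure 2.1; the truncation length 25 is an arbitrary choice): powers of A count walks, and weighting the powers by c_p = 1/p! sums them toward the matrix exponential.

```python
import numpy as np
from math import factorial

# Adjacency matrix of Figure 2.1; (A^k)_{ij} counts the walks of length k
# from v_i to v_j.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)

A2 = A @ A   # the two walks of length 2 leaving v1 are
             # v1 -> v2 -> v3 and v1 -> v3 -> v4

# Partial sum of (3.1.1) with c_p = 1/p!; the factorial decay makes the
# contribution of long walks negligible well before p = 25.
F = sum(np.linalg.matrix_power(A, p) / factorial(p) for p in range(25))
```

The diagonal entry F[0, 0] exceeds 1 because v1 lies on closed walks (e.g., the length-3 cycle v1 → v3 → v4 → v1), each weighted by 1/p!.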

For matrices that are of small to moderate size and are diagonalizable, matrix functions generally can be evaluated by first computing the spectral factorization of A and then evaluating the corresponding function of a real or complex variable at the eigenvalues of the matrix. Many other approaches to define and evaluate matrix functions are available; see, e.g., [42] for techniques suitable when the matrix A is small enough to conveniently be factored and [8, 9, 10, 34, 37] for the approximation of functions of large matrices.

The most commonly used matrix functions for network analysis are the matrix exponential, obtained by taking cp = 1/p!, and the resolvent

(3.1.2)    f(A) = (I − µA)^{−1} = I + µA + µ²A² + ··· ,

where the scalar µ > 0 is chosen small enough so that the above series converges; see [30] for a discussion and illustrations.
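As a sanity check on the series representation (a Python/NumPy sketch; the matrix is the one of Figure 2.1 and µ = 0.3 is an arbitrary value below 1/ρ(A)), a truncated power series reproduces the resolvent:

```python
import numpy as np

# Truncating the series I + mu*A + mu^2*A^2 + ... reproduces the resolvent
# (I - mu*A)^(-1) when mu < 1/rho(A); here rho(A) is about 1.22.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
mu = 0.3

resolvent = np.linalg.inv(np.eye(4) - mu * A)
series = sum((mu ** p) * np.linalg.matrix_power(A, p) for p in range(60))
```

With µ·ρ(A) ≈ 0.37, the neglected tail of the series is far below machine precision after 60 terms.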

3.2 Node Importance and Communicability Using the Matrix Exponential

3.2.1 Existing Methods

3.2.1.1 Methods for Undirected Networks

Measures of importance include simple notions like degree (degree centrality) as well as more elaborate concepts [13, 17, 27]. A popular approach is eigenvector or feedback centrality [15, 16, 35, 41], which formalizes the circular notion that "a node is central if it is connected to many central nodes" via an eigenvalue/eigenvector equation; the centrality measure is given by a Perron–Frobenius eigenvector of A.

In [30], the authors use matrix functions to rank the nodes in undirected networks, based on the heuristic that matrix functions produce weighted sums of walks connecting pairs of nodes with weights that depend only on the length of the walk. The subgraph centrality of a node v_i is defined by [exp(A)]_ii [31].

Benzi et al. [11] discussed the application of the matrix exponential to the adjacency matrix for a directed network and observed that it may not be meaningful to use the diagonal entries of the exponential as a measure of importance of the nodes of a directed network. They considered the following directed network

v_1 → v_2 → v_3 → ⋯ → v_{n−1} → v_n

Figure 3.1: Graph of a directed chain of nodes.

with the associated adjacency matrix

  0 1 0 ··· 0       0 0 1 ··· 0      ......  n×n (3.2.1) A =  . . . . .  ∈ R .       0 0 0 ··· 1       0 0 0 ··· 0

All diagonal entries of the exponential of this matrix are equal to 1, giving the same importance to each of the nodes. This result was not satisfying for the authors, since the first and last nodes should not be equally important. Benzi et al. [11] therefore proposed to bipartize the network into hubs and authorities. We outline this approach in the following subsection.
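This behavior is easy to reproduce numerically: the sketch below builds the adjacency matrix (3.2.1) for n = 5 and confirms that every diagonal entry of its exponential equals 1 (A is nilpotent, so exp(A) is unit upper triangular):

```python
import numpy as np
from scipy.linalg import expm

n = 5
A = np.diag(np.ones(n - 1), k=1)   # directed chain v1 -> v2 -> ... -> v5
E = expm(A)
# [E]_{ij} = 1/(j-i)! for j >= i, and the diagonal is identically 1
```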

3.2.1.2 Directed Networks: Bipartization

The out-degree and in-degree are simple measures of the importance of a node in its roles as hub and authority, respectively. These measures only consider local information of each node, and do not propagate the effect of a node through the whole network.

Kleinberg [44] proposed to split a directed network into hubs and authorities, and Benzi et al. [11] constructed a related bipartite network that makes it possible to compute the hub centrality and authority centrality of nodes using the matrix exponential. This construction is based on forming an undirected bipartite graph with twice the number of nodes of the original graph; the first n nodes represent the nodes of the original network in their hub role, i.e., edges emerge from those nodes, and the second n nodes of the bipartite graph represent them in their authority role, i.e., edges point to those nodes.

This way, an edge e_k pointing from node v_i to node v_j in the directed network can be represented by an undirected edge connecting node v_i to node v_{n+j} in the corresponding bipartite graph. The new undirected graph related to the directed graph in Figure 2.1 is given by Figure 3.2.


Figure 3.2: Bipartite undirected network graph to represent the graph of the directed network in Figure 2.1.

Specifically, Benzi et al. [11] considered the symmetric matrix

(3.2.2) A = ⎡ 0    A ⎤
            ⎣ A^T  0 ⎦ .

We calculate the exponential

(3.2.3) exp(A) = I + A + A^2/2! + A^3/3! + ... ,

where

A^2 = ⎡ AA^T    0    ⎤      A^3 = ⎡ 0        AA^T A ⎤
      ⎣ 0       A^T A⎦ ,          ⎣ A^T AA^T    0   ⎦ ,

A^4 = ⎡ AA^T AA^T    0         ⎤
      ⎣ 0            A^T AA^T A⎦ , ... .

The quantity [AA^T]_ii, for i = 1, ..., n, counts the number of alternating walks of length 2 starting at node v_i. Similarly, [(AA^T)^p]_ii counts the number of alternating walks of length 2p starting at node v_i, and [(A^T A)^p]_ii counts the number of alternating walks of length 2p ending at node v_i. Therefore, the diagonal entry [exp(A)]_ii, for i ≤ n, is a weighted sum of all alternating walks starting at node v_i, penalized by a factorial factor, and gives a measure for the hub centrality of node v_i. Likewise, the diagonal entry [exp(A)]_{n+i,n+i}, for 1 ≤ i ≤ n, is a weighted sum of all alternating walks ending at node v_i, penalized by a factorial factor, and gives a measure for the authority centrality of node v_i. Efficient numerical methods for this purpose are described in [5, 11].

However, while determining the importance of nodes by ranking their hub and authority roles using the diagonal entries of the exponential exp(A) as outlined above yields valuable information in many situations, it is not satisfactory for identifying good broadcasters and receivers. For instance, consider the network of Figure 3.1 and let n = 5. Following the process described above, we obtain the ranking displayed in Table 1.

(a) hub role                    (b) authority role
node v_i   [exp(A)]_ii          node v_i   [exp(A)]_{n+i,n+i}
v_1        1.543081             v_2        1.543081
v_2        1.543081             v_3        1.543081
v_3        1.543081             v_4        1.543081
v_4        1.543081             v_5        1.543081
v_5        1                    v_1        1

Table 1: Ranking the nodes in Figure 3.1 by bipartizing the graph.

Although this ranking shows node v_5 to be the least important node in its hub role, and node v_1 to be the least important in its authority role, this approach fails to give a reasonable ranking for the other nodes in the graph, which appear to be equally important as hubs and authorities. However, since node v_1 is able to send information to all the other nodes in the network, it should receive the highest broadcaster ranking.

More generally, node v_i broadcasts to all of the nodes v_j with j > i. Therefore, one should expect a node v_i with a smaller index i to be a more important broadcaster than a node with a larger index. Similarly, each node v_i receives information from all nodes v_j with j < i. Therefore, a node v_i with a larger index i is more important as a receiver. We conclude that computing the diagonal entries of the matrix exp(A) does not always give an intuitively correct ranking of the nodes in their broadcaster and receiver roles.

In the next section, we discuss an alternative way to rank the nodes of a directed network, without bipartization of the graph. This approach will rank the nodes of the graph of Figure 3.1 so that nodes vi with smaller index i receive higher broadcaster and lower receiver rankings. This ranking method also can be applied when we are interested in reaching or avoiding particular nodes of a network.

3.2.2 Aggregate Upstream and Downstream Reachability

Consider the exponential of a nonsymmetric adjacency matrix A associated with a directed network. The entry [exp(A)]_ij calculates a weighted sum of walks from node v_i to node v_j, giving more weight to shorter walks. Introduce the vector u = [u_1, u_2, ..., u_n]^T for measuring how important it is to reach each node of the network; we let u_i = 0 if the node v_i is considered an uninteresting destination. A large value of u_i indicates that node v_i is a highly targeted node. In the following we only consider entries u_i = 0 or u_i = 1. Now exp(A)u allows us to determine which nodes are most important for the requested flow, i.e., which nodes are the most important broadcasters. Similarly, (u^T exp(A))^T = exp(A^T)u gives a ranking of the nodes according to their role as receivers from the nodes specified by the nonzero entries of u.

We define the aggregate downstream reachability as the vector

(3.2.4) ADR = exp(A)1.

Thus, here u = 1. This vector gives the same importance to the goal of reaching any node in the network. Similarly, the aggregate upstream reachability is defined as

(3.2.5) AUR = exp(AT )1.

ADR provides a reasonable ranking for the nodes in their broadcaster role, i.e., in their ability to broadcast information through the network. Likewise, AUR can be used to rank the nodes according to their receiver role. We remark that Benzi et al. [11, Section 8.1] tabulated the column sums of exp(A), but did not discuss this approach in any detail. The difference between row and column sums is used by Croft and Higham [22] with the goal of finding a hierarchical ordering of network nodes. Theoretical results are shown by Benzi and Klymko [13].

For the graph of Figure 3.1, the ADR and AUR methods determine the rankings displayed in Table 2. These rankings satisfy the requirements mentioned at the end of Section 3.2.1.2.
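For the chain of Figure 3.1 these quantities take one line each to compute; the sketch below reproduces the monotone rankings of Table 2 (ADR decreasing in the node index, AUR increasing):

```python
import numpy as np
from scipy.linalg import expm

n = 5
A = np.diag(np.ones(n - 1), k=1)   # chain of Figure 3.1
one = np.ones(n)

ADR = expm(A) @ one      # aggregate downstream reachability (3.2.4)
AUR = expm(A.T) @ one    # aggregate upstream reachability (3.2.5)
```

For this graph, ADR_i = Σ_{d=0}^{n−i} 1/d!, e.g., ADR_1 = 1 + 1 + 1/2 + 1/6 + 1/24 ≈ 2.71, and AUR is ADR reversed.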

3.3 Reverting Walks with a Bounded Number of Reversions

An appealing feature of the hubs-and-authorities model described by Benzi et al. [11] is that two nodes v_i, v_j can be considered related in situations when they are connected by alternating walks; the drawback is that only strictly alternating walks are considered. On the other hand, measures based solely on exp(A), like ADR and AUR, fail to capture these "lateral" connections. In this section we consider an approach that allows us to recover some of these connections. Taking lateral connections into account helps us identify important branch points.

A (somewhat trivial) way to include lateral connections is simply to transform the directed network into an undirected one by disregarding edge directions. This can be achieved by symmetrizing the adjacency matrix, i.e., by computing the exponential of the symmetric matrix A_S = (1/2)(A + A^T) instead of the exponential of A. The matrix A_S is the closest symmetric matrix to A in the Frobenius norm. Of course, all directionality information is lost when replacing A by A_S. Nevertheless, for comparison purposes, we define the symmetrized aggregate reachability by

SAR = exp((1/2)A + (1/2)A^T)1.

Consider now the matrix exp((1/2)A) exp((1/2)A^T) (this matrix equals the one used in SAR if and only if A and A^T commute; see the remark at the end of this section). It can be written as:

(3.3.1) exp((1/2)A) exp((1/2)A^T) = ( Σ_{n=0}^∞ A^n / (2^n n!) ) ( Σ_{n=0}^∞ (A^T)^n / (2^n n!) )
                                  = Σ_{i=0}^∞ Σ_{j=0}^∞ A^i (A^T)^j / (2^{i+j} i! j!).

Since the entries of A^i (A^T)^j count walks that move along i edges, and then in reverse along j edges, exp((1/2)A) exp((1/2)A^T) is comprised of weighted sums of counts of walks that start in the forward direction, and have exactly one change of direction (either leg of the walk, or both, can be of length zero). Similarly, exp((1/2)A^T) exp((1/2)A) contains weighted sums of walks that start in reverse, and then change direction exactly once. The vectors exp((1/2)A) exp((1/2)A^T)1 and exp((1/2)A^T) exp((1/2)A)1 are helpful in identifying branch points of directed networks. This will be illustrated in Section 3.4.

Analogously to (3.3.1), we can construct matrices containing information about walks with at most two reversions, namely

exp((1/3)A) exp((1/3)A^T) exp((1/3)A)   and   exp((1/3)A^T) exp((1/3)A) exp((1/3)A^T).

In general, we can build a matrix containing weighted sums of numbers of walks with at most k reversions using alternating products of k + 1 factors of the form exp(A/(k+1)) and exp(A^T/(k+1)). We introduce the bounded number of reversions notions of reachability, denoted by BNR(x, k), where x ∈ {d, u}, by

(3.3.2) BNR(d, k) = ( exp(A/(k+1)) exp(A^T/(k+1)) ··· ) 1,

and

(3.3.3) BNR(u, k) = ( exp(A^T/(k+1)) exp(A/(k+1)) ··· ) 1.

Here each product has k + 1 alternating exponential factors.
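A sketch of the BNR computation (the helper function and its name are ours, not from the text); for the chain of Figure 3.1 it reproduces the BNR(d, 1) column of Table 2:

```python
import numpy as np
from scipy.linalg import expm

def bnr(A, k, x="d"):
    """BNR(x, k): k+1 alternating factors exp(A/(k+1)), exp(A^T/(k+1)), applied to 1."""
    n = A.shape[0]
    factors = [expm(A / (k + 1)), expm(A.T / (k + 1))]
    if x == "u":                  # upstream version starts with exp(A^T/(k+1))
        factors.reverse()
    M = np.eye(n)
    for p in range(k + 1):
        M = M @ factors[p % 2]
    return M @ np.ones(n)

A = np.diag(np.ones(4), k=1)      # chain of Figure 3.1, n = 5
bnr_d1 = bnr(A, 1, "d")           # cf. the BNR(d, 1) column of Table 2
```

Increasing k makes the result approach SAR, in line with Proposition 3.3.1 below.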

We are mostly interested in small values of k (say, k = 1 or 2), since intuition suggests that the more reversals we allow, the closer we get to losing all directional information, as in the symmetrization approach SAR. This intuition is formalized by the following result:

Proposition 3.3.1. limk→∞ BNR(d, k) = limk→∞ BNR(u, k) = SAR.

Proof. The statement follows from the Lie product formula (see, e.g., [39, page 35]):

exp(X + Y) = lim_{m→∞} ( exp(X/m) exp(Y/m) )^m.

Taking X = (1/2)A and Y = (1/2)A^T, we obtain for each odd k = 2m − 1 that

exp(A^T/(k+1)) exp(A/(k+1)) ··· exp(A/(k+1)) = ( exp((1/m)(1/2)A^T) exp((1/m)(1/2)A) )^m → exp((1/2)A^T + (1/2)A)

as m → ∞. The same happens for the products starting with exp(A/(k+1)). For k even, say k = 2m, we obtain terms of the form

( exp(A^T/(2m+1)) exp(A/(2m+1)) )^m exp(A^T/(2m+1)).

The last factor converges to I; the remaining part can be shown to converge to exp((1/2)A^T + (1/2)A) by the same method as the proof of the Lie product formula in [39]. Finally, the result is obtained by multiplying by 1 on the right.

We remark that the matrices A and A^T commute if and only if A is normal, i.e., A has a unitary eigenvector matrix. In this case, both BNR(d, k) and BNR(u, k) equal SAR, for all odd k. This means that the BNR measures may discard all directionality information, even for small k, but only for a fairly restrictive class of adjacency matrices. Indeed, equality of the diagonal entries of AA^T and A^T A implies that each vertex has equal in-degree and out-degree, which is a reasonable condition if the directed network represents a volume-preserving flow; equality of the off-diagonal entries imposes an even stronger restriction. Examples of non-symmetric normal adjacency matrices include circulant matrices, which correspond to a cyclic arrangement of the nodes, and some block-circulant matrices. In any case, A will have at least some eigenvalues with non-zero imaginary part, corresponding to some sort of cyclical structure in the network.
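The circulant case is easy to check numerically; for a directed ring (a cyclic shift matrix, which is normal), the product exp((1/2)A) exp((1/2)A^T) coincides with exp((1/2)A + (1/2)A^T), so BNR(d, 1) equals SAR. The small example below is our own:

```python
import numpy as np
from scipy.linalg import expm

# Directed ring on 4 nodes: a circulant (cyclic shift) adjacency matrix.
A = np.roll(np.eye(4), 1, axis=1)

assert np.allclose(A @ A.T, A.T @ A)   # A is normal: A A^T = A^T A = I

one = np.ones(4)
sar = expm(0.5 * (A + A.T)) @ one                 # SAR
bnr_d1 = expm(0.5 * A) @ expm(0.5 * A.T) @ one    # BNR(d, 1)
```

Since the ring is vertex-transitive, both vectors are constant, reflecting the complete loss of directional distinctions for this normal adjacency matrix.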

3.4 Examples

The vector u in all examples of Sections 3.4.1 and 3.4.2 is chosen to be 1.

3.4.1 Small Examples

In this subsection we give examples of small synthetic directed networks, and rank their nodes in their broadcaster and receiver roles. We compare our results with the ranking described in Section 3.2.1.2 for each of these networks.

3.4.1.1 Example 1: Simple Chain

This example is illustrated in Figure 3.1 for n = 5. The different measures discussed in this chapter are summarized in Table 2.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.54          2.71   1.99       2.14
v_2        1.54          2.67   2.55       2.62
v_3        1.54          2.50   2.65       2.64
v_4        1.54          2.00   2.47       2.39
v_5        1.00          1.00   1.65       1.53

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.65       1.53       1.83
v_2        1.54                 2.00   2.47       2.39       2.53
v_3        1.54                 2.50   2.65       2.64       2.66
v_4        1.54                 2.67   2.55       2.62       2.53
v_5        1.54                 2.71   1.99       2.14       1.83

Table 2: Comparison of influence measures for the nodes in Figure 3.1, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.1.2 Example 2: Simple Chain with One Branch

Consider the graph depicted in Figure 3.3. Bipartization of this graph gives the nodes v_1, v_2, v_4, and v_5 the same hub rank, since they all point to only one node that is not pointed to by any other node; see Table 3. However, the graph suggests an obvious advantage for nodes v_1 and v_2, because information from these nodes can spread to node v_3, and from there deeper into the network. The ADR ranking shows the nodes v_2 and v_1 to be the 2nd and 3rd most important broadcasters, respectively.

As for the authority role, the nodes v_4 and v_5 get the highest ranking by the bipartization method, because of the alternating walks reaching them from v_2. However, they actually only receive information from one node, v_3. Using the exponential of A^T, v_6 and v_7 are ranked the highest, since they receive information from the nodes they are directly attached to, from node v_3 through a walk of length 2, from node v_2 through a walk of length 3, as well as from node v_1 through a walk of length 4. Thus, the AUR ranking determines an intuitively reasonable ordering.


Figure 3.3: Graph of Example 2.

The measures BNR(d, k) and BNR(u, k) identify node v3 to be the most important for both k = 1 and k = 2, because the graph has a branch point at node v3. This example suggests that the BNR(d, k) and BNR(u, k) measures can be applied to identify important branch points. Node v3 is the most important node also if all directed edges are replaced by undirected ones.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.54          2.92   2.03       2.20
v_2        1.54          3.33   2.79       2.95
v_3        2.18          4.00   3.68       3.86
v_4        1.54          2.00   2.47       2.53
v_5        1.54          2.00   2.47       2.53
v_6        1.00          1.00   1.65       1.55
v_7        1.00          1.00   1.65       1.55

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.67       1.54       1.86
v_2        1.54                 2.00   2.63       2.48       2.72
v_3        1.54                 2.50   3.35       3.22       3.58
v_4        1.59                 2.67   2.88       2.80       2.72
v_5        1.59                 2.67   2.88       2.80       2.72
v_6        1.54                 2.71   2.07       2.17       1.86
v_7        1.54                 2.71   2.07       2.17       1.86

Table 3: Comparison of influence measures for the nodes in Figure 3.3, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.1.3 Example 3: Branching at Two Levels

Consider the example in Figure 3.4. Using the bipartization method, Table 4 indicates that the vertices v_1 and v_3 have the same importance as hubs. However, Figure 3.4 suggests that this is clearly not the case, since node v_1 broadcasts to more nodes in the network than node v_3. In fact, v_3 directly reaches two nodes, v_4 and v_5, while v_1 also directly reaches two nodes, v_2 and v_3, in addition to reaching the nodes v_4 and v_5 through the hub role of node v_3. When computing the ADR ranking, the role of v_1 as a more important broadcaster is detected.


Figure 3.4: Graph of Example 3.

Turning to the receiver role of the nodes, we observe from Table 4 that AUR ranks the vertices v_4 and v_5 as more important receivers than v_2 and v_3, since they receive more information from the network, whereas the bipartization method does not show this difference.

Turning to the measures BNR(d, 1) and BNR(d, 2), we observe that both of them identify the vertices v_1 and v_3 as important; these are branch points for outflow of the graph. BNR(u, 1) and BNR(u, 2) are large for vertex v_3, because this vertex is a branch point for inflow. These measures are not large at v_1, because there is no inflow to this vertex. This example illustrates that the measures BNR(d, k) and BNR(u, k) for k ≥ 1 reveal important information about branch points of a graph.

This subsection has compared rankings determined by bipartization, ADR, AUR,

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        2.18          4.00   2.91       3.25
v_2        1.00          1.00   1.50       1.59
v_3        2.18          3.00   3.12       3.36
v_4        1.00          1.00   1.62       1.65
v_5        1.00          1.00   1.62       1.65

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   2.25       2.04       2.67
v_2        1.59                 2.00   2.12       2.01       1.85
v_3        1.59                 2.00   3.12       2.94       3.25
v_4        1.59                 2.50   2.28       2.26       1.99
v_5        1.59                 2.50   2.28       2.26       1.99

Table 4: Comparison of influence measures for the nodes in Figure 3.4, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

and BNR. These ranking methods are seen to be sensitive to different node and edge configurations. When we are interested in broadcasters and receivers, and how information flows, the latter ranking schemes appear to be more appropriate.

3.4.1.4 Example 4: More than One Path

All the examples given so far present nodes that interact in a unique way, that is, there exists at most one path that leads from one node to another one. We now give an example where connections can happen through more than one path. In Figure 3.5, node v_2 can send information to node v_4 either directly or through node v_3, and therefore more than one path exists between nodes v_1 and v_4, as well as between v_2 and v_5. In Table 6, all methods give the primary hub role to node v_2, and the strongest authority role to v_4. However, they do not agree on the first runner-up. The bipartization method places v_3 as the second strongest hub, and the BNR measures also rank v_3 above v_1. This is the result of the AA^T factor in the computation of

Top 10 ranked nodes v_i using various measures

[exp(A)]_ii   ADR   BNR(d,1)   BNR(d,2)
216           149   149        149
72            219   219        217
217           218   218        216
71            178   217        219
149           174   216        218
219           81    174        145
218           82    178        81
178           157   81         178
75            216   82         198
76            217   145        82

[exp(A)]_{n+i,n+i}   AUR   BNR(u,1)   BNR(u,2)   SAR
305                  305   305        305        71
71                   71    71         71         72
72                   72    72         72         217
74                   146   73         73         216
73                   276   146        74         305
76                   73    74         76         76
78                   74    76         146        73
75                   76    276        75         75
217                  234   75         78         74
77                   75    78         217        78

Table 5: Comparison of influence measures for the nodes in the C. elegans network, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

these methods, which counts alternating walks starting at that node. In this case v_3 is considered connected to node v_2 via a walk of length 2, and then to itself and v_4 via walks of length 3, and so on. ADR, however, only sees forward walks, and places v_1 as a stronger hub than v_3, since the former sends information to all the other nodes in the network, whereas the latter only reaches v_4 and v_5. A similar discussion applies to the authority ranking.


Figure 3.5: Graph of Example 4.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.54          3.38   2.33       2.50
v_2        2.23          4.17   3.99       4.06
v_3        1.59          2.50   2.98       3.02
v_4        1.54          2.00   3.17       3.13
v_5        1.00          1.00   1.79       1.64

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.79       1.64       2.05
v_2        1.54                 2.00   3.17       3.13       3.67
v_3        1.59                 2.50   2.98       3.02       3.10
v_4        2.23                 4.17   3.99       4.06       3.67
v_5        1.54                 3.38   2.33       2.50       2.05

Table 6: Comparison of influence measures for the nodes in Figure 3.5, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.1.5 Example 5: Sinkhole

The network graphed in Figure 3.6 displays the node v_5 as a sinkhole: one way or another, all information will reach it. Table 7 shows it to be as important as v_4 in their authority role, each having three edges pointing to it, with no possible alternating walks starting with a reverse step. But in reality, the information emerging from nodes v_1, v_2, and v_3 travels through v_4 and reaches v_5, even if weakened along the way, which is depicted by the aggregate reachability method. Similarly, all the nodes except v_5 have the same importance in their broadcasting role according to the bipartization, whereas the ADR favors the first three nodes, since they can also send information to node v_5 through a two-step walk.


Figure 3.6: Graph of Example 5.

node v_i   [exp(A)]_ii   ADR    BNR(d,1)   BNR(d,2)
v_1        1.64          2.50   2.61       2.44
v_2        1.64          2.50   2.61       2.44
v_3        1.64          2.50   2.61       2.44
v_4        1.64          2.00   3.94       3.58
v_5        1.00          1.00   2.88       2.56
v_6        1.64          2.00   2.44       2.19
v_7        1.64          2.00   2.44       2.19

node v_i   [exp(A)]_{n+i,n+i}   AUR    BNR(u,1)   BNR(u,2)   SAR
v_1        1.00                 1.00   1.62       1.79       2.18
v_2        1.00                 1.00   1.62       1.79       2.18
v_3        1.00                 1.00   1.62       1.79       2.18
v_4        2.91                 4.00   3.94       4.51       4.23
v_5        2.91                 5.50   3.86       4.52       3.60
v_6        1.00                 1.00   1.50       1.72       2.04
v_7        1.00                 1.00   1.50       1.72       2.04

Table 7: Comparison of influence measures for the nodes in Figure 3.6, including aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.2 Real-Life Large Examples

3.4.2.1 C. Elegans Network

We consider the neural network of the worm Caenorhabditis elegans [2, 26]. The adjacency matrix for this network is of size 306 × 306. The nodes represent individual neurons, and the edges are links connecting the neurons. Table 5 shows various results regarding which neuron affects the entire network the most, with ADR and BNR(d, 1) agreeing on number 149, and bipartization favoring node 216. We note that BNR(d, 1) and BNR(d, 2) give node 149 a 9% higher score than the 2nd ranked nodes, and the scores for the following nodes are closer together. ADR gives node 149 a 5% higher score than the 2nd ranked node. Bipartization gives the leading node only a 3% higher score than the following nodes. This suggests that node 149 may be more important than node 216. All measures show node 305 to be the node most affected by other nodes in the network. Looking at the top ten broadcasters and receivers, we find many similarities, which suggests their validity.

Graph models ignore many properties of the neurons in a network. Therefore, it can be difficult to determine from a graph alone which neurons are the most important ones. Nevertheless, high ADR or AUR scores suggest that the corresponding neurons may be important broadcasters or receivers, respectively, and high BNR(d, k) or BNR(u, k) values indicate that the corresponding neurons may be important branch points of the network.

3.4.2.2 Gene Regulatory Network of the Human B-Cell Interactome

We consider a network of protein-protein, protein-DNA, and modulatory interactions in human B cells [62]. There are 5,737 nodes (genes/proteins) and 84,892 directed edges. Taking the matrix exponential of the adjacency matrix (multiplied by a coefficient of 0.25), we found one main strongly connected component (3,891 genes), with 1,833 genes downstream (grouped in singletons, pairs, triplets, or quadruplets) and 13 individual genes upstream. Analyzing the main strongly connected component, we are able to find the distribution of genes based on their aggregate downstream (exp(0.25A)1) and aggregate upstream (1^T exp(0.25A)) reachability; comparing the two, we can identify the overall role of genes as regulators, regulated, or both. Figure 3.7 displays the network.


Figure 3.7: Gene network: B Cell Interactome. Upstream (1^T exp(0.25A)) vs. downstream (exp(0.25A)1) aggregate reachability for genes in the network. Genes on the top right are well known, highly influential genes, like MYC and TP53. The genes on the left, like CYP27A1, perform metabolic functions but have little effect on other genes. Some ribosomal genes (RPS17, RPS27, etc.) form a small cluster. The gene COPE, which only has one incoming and one outgoing edge, appears less extreme than others.

This example illustrates how ADR and AUR can be used to identify important nodes in a complex directed network, and their different roles. Genes with high ADR and low AUR influence many genes, directly or indirectly, while being influenced by relatively few other genes; typically, these are transcription factors, which control the expression level of many other genes through regulatory pathways (ZBTB48, ZNF263, BACH1, and SOX5 are transcription factors). Genes with low ADR and high AUR can be expected to be "workhorse" genes, which perform important duties, and are therefore controlled by many upstream genes, but do not have a regulatory function (CYP11A1, CYP27A1, and CYP21A2, for example, encode enzymes with metabolic functions). Finally, genes with high ADR and high AUR can be regarded as very central in the network, brokers of influence that collect information from many genes upstream and control the expression of many genes downstream; unsurprisingly, crucial master genes like TP53 and MAPK1 have extreme values in both measures. (The information about specific genes mentioned above was obtained from GeneCards [63].)

An approach described by Croft and Higham [22] is closely related, but designed with the goal of extracting a hierarchical structure. In our notation, their measure becomes ADR − AUR, and it can be used to identify putatively influential transcription factors; however, it would fail to distinguish between TP53 and a relatively unimportant gene like COPE. One could define a measure for centrality given by ADR + AUR, which would distinguish between TP53 and COPE; of course, ADR − AUR and ADR + AUR are essentially an axis rotation of the measures ADR and AUR. It seems clear that a full picture requires at least two separate measures. However, in this example the BNR measures do not seem to add much beyond what is provided by ADR and AUR, at least for the top ranked genes.

We remark that scaling of an adjacency matrix may enhance the usefulness of the ordering determined. An interpretation of graphs as oscillator networks in which the scaling coefficient corresponds to inverse temperature is provided in [29]. Although the proper choice of scale is an important issue, it falls outside the scope of this work; our choice of 0.25 as scaling factor is approximately the largest value that prevents the entries of exp(A) from growing to the point of numerical overflow.

Top 10 ranked nodes v_i using various measures

[exp(A)]_ii   ADR     BNR(d,1)   BNR(d,2)
MYC           MYC     MYC        MYC
ESR1          ESR1    ESR1       ESR1
CREB1         CREB1   CREB1      CREB1
RBL2          RBL2    RBL2       RBL2
FOXM1         TP53    TP53       TP53
SP3           SP3     SP3        SP3
JUND          EP300   EP300      EP300
POU2F2        E2F4    E2F4       E2F4
E2F4          MAPK1   MAPK1      MAPK1
TCF1          STAT1   STAT1      STAT1

[exp(A)]_{n+i,n+i}   AUR     BNR(u,1)   BNR(u,2)   SAR
EP300                MAPK1   MAPK1      MAPK1      MYC
CREBBP               TP53    TP53       TP53       ESR1
CDC2                 GRB2    GRB2       GRB2       CREB1
PCNA                 FYN     FYN        FYN        RBL2
BRCA1                CDC2    CDC2       CDC2       JUND
AURKA                SRC     SRC        SRC        SP3
CCNA2                MAPK8   MAPK8      MAPK8      FOXM1
LYN                  JUNB    JUNB       JUNB       POU2F2
LTK                  STAT3   STAT3      STAT3      E2F4
JUN                  TRAP1   TRAP1      TRAP1      STAT1

Table 8: Comparison of influence measures for the nodes in the Gene network, in- cluding aggregate reachability using walks with a bounded number of reversions, as well as disregarding orientation altogether; see Section 3.3 for definitions.

3.4.3 Bus Route Network Targeting Specific Nodes

In this example we rank the nodes according to their downstream or upstream influence on particular nodes. Consider the Kent State University main campus bus system, illustrated in Figure 3.8, left panel. The route consists of four working loops: Front Campus/Summit East (in blue), Reverse Loop (in green), Gateway Loop (in orange), and Alberton (in purple). Due to road construction, Campus Loop (in red) was not running at the time of writing, and is not included in our example.

Figure 3.8: Left: Kent State University main campus bus route [56]. Right: Network graph of Kent State University main campus bus route.

The bus stops are the nodes v1, ... , v10, which are connected by directed edges, according to the map and schedule information [56]. We assign only one node to each named bus stop, even when the bus stops on both sides of the street. The graph of the directed network is shown in Figure 3.8, right panel. In this example, we assume that all edges have the same weight, regardless of travel length.

We compute exp(A)u for different vectors u. In some applications, one might need to subtract the identity matrix I from the Taylor series expansion (3.1.2) of exp(A).

In our example, we use the expansion (3.1.2) as is, because this allows a person at a bus stop to stay there instead of riding the bus. As mentioned in Section 3.2.2, the entries of the vector u indicate how important it is to reach each node in the network.

In this example we let these entries be either 0 or 1, depending on whether we are interested in reaching a node or not.
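As a sketch of this targeted use of u, with the chain of Figure 3.1 rather than the bus network (so the numbers are easy to verify by hand), choosing u with a single 1 in position 4 ranks the nodes by how well they reach, or receive from, node v_4:

```python
import numpy as np
from scipy.linalg import expm

n = 5
A = np.diag(np.ones(n - 1), k=1)    # chain v1 -> v2 -> ... -> v5
u = np.zeros(n)
u[3] = 1.0                          # only reaching node v4 is of interest

reach_v4 = expm(A) @ u              # broadcaster scores toward v4: column 4 of exp(A)
from_v4 = expm(A.T) @ u             # receiver scores from v4
```

Here reach_v4 = [1/6, 1/2, 1, 1, 0]: nodes closer to v_4 along the chain score higher, and v_5 cannot reach v_4 at all.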

For Table 9, we let all entries of u_1 be zero except for the fourth entry. Thus, we are interested in ranking the nodes according to how much they contribute to reaching node v_4. Table 10 shows how much each node contributes to receiving from node v_4. This is a new way of calculating the communicability among the nodes.

Let the vector u_2 have the fourth and the seventh entries equal to 1 and the other entries zero. In other words, we are interested in determining the best node to place our information in order to reach nodes v_4 or v_7. This is displayed in Table 9. The best node at which to gather information coming from nodes v_4 or v_7 is shown by Table 10.

broadcasting

node v_i  (exp(A)1)_i   node v_i  (exp(A)u_1)_i   node v_i  (exp(A)u_2)_i   node v_i  (exp(A)u_3)_i
v_9       10.00         v_4       1.02            v_4       1.61            v_6       3.25
v_6       9.56          v_3       1.00            v_6       1.47            v_4       2.29
v_1       9.31          v_9       0.60            v_7       1.20            v_8       2.29
v_7       5.34          v_1       0.20            v_3       1.19            v_9       2.21
v_10      5.34          v_7       0.19            v_1       0.81            v_1       1.12

Table 9: Comparing the top 5 ranked nodes in Figure 3.8 as broadcasters, where u_1 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]^T, u_2 = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]^T, and u_3 = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]^T.

Take the scenario of a driver who would like to drop off four students at the university. One of the students is going to node v4, the second one to v7, the third to v8, and the last one to node v10. The driver can only take them to one bus stop. Where should he stop his car? Table 9 indicates that it is best to take the students to v6, where each one of them rides the bus to his/her destination. Table 10 also shows that it is best that they all ride the bus to node v9, where the driver picks all of them up at the same time. It is noticeable that, as the number of nonzero elements in u increases, the ranking looks more and more like the ranking for u = 1.
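The targeted ranking just described is easy to sketch in code. The following Python/NumPy snippet uses a small hypothetical five-node directed graph (not the bus network of Figure 3.8) together with a homemade Taylor-series matrix exponential; the graph, the helper, and all numbers are illustrative assumptions, not data from the dissertation.

```python
import numpy as np

def expm(M, terms=40):
    # Matrix exponential by scaling and squaring with a truncated Taylor series.
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 1), 1.0)))))
    X = M / 2.0**s
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ X / k
        E = E + T
    for _ in range(s):
        E = E @ E
    return E

# Hypothetical directed network: v1->v2, v2->v3, v3->v4, v1->v3, v4->v5
# (0-based indices below).
n = 5
A = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (2, 3), (0, 2), (3, 4)]:
    A[i, j] = 1.0

# Target vector u: we are interested in reaching node v4 (index 3).
u = np.zeros(n)
u[3] = 1.0

broadcast = expm(A) @ u    # [exp(A)u]_i: contribution of node v_i to reaching v4
receive = expm(A.T) @ u    # [exp(A^T)u]_i: how well v_i gathers information from v4

rank_broadcast = np.argsort(-broadcast)  # best broadcasters toward v4 first
```

Here broadcast[i] sums the walks from node i to v4 weighted by 1/p!, including the trivial walk of length zero from v4 to itself, matching the decision in the text not to subtract the identity.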

3.5 Numerical Considerations

When the network has fairly few nodes and, therefore, the associated adjacency matrix A is small, evaluation of the matrix exponential is quite inexpensive. We then can calculate expressions of the forms (3.2.4), (3.2.5), (3.3.2), and (3.3.3) by first evaluating exp(A) and then computing the desired expression(s), where we may use that exp(A^T) = (exp(A))^T.

node vi  [exp(A^T)1]_i    node vi  [exp(A^T)u1]_i    node vi  [exp(A^T)u2]_i    node vi  [exp(A^T)u3]_i
v9       11.51            v6       1.37              v9       1.83              v9       3.67
v6        8.33            v4       1.02              v7       1.61              v6       2.94
v1        7.23            v1       0.61              v6       1.47              v1       2.39
v3        5.74            v7       0.59              v4       1.20              v2       2.22
v2        5.74            v8       0.59              v1       0.81              v7       2.22

Table 10: Comparing the top 5 ranked nodes in Figure 3.8 as receivers, where u1 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]^T, u2 = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]^T, and u3 = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]^T.

However, when the network has many nodes and, therefore, the adjacency matrix A is large, the explicit calculation of exp(A) is too expensive to be attractive. This section discusses how approximations of the expressions (3.2.4), (3.2.5), (3.3.2), and (3.3.3) can be evaluated fairly inexpensively for large adjacency matrices with the aid of the Arnoldi process.

Let ‖·‖ denote the Euclidean vector norm. Application of ℓ steps of the Arnoldi process to the matrix A with initial vector w ≠ 0 gives the decomposition

(3.5.1)  A W_ℓ = W_ℓ H_ℓ + g_ℓ e_ℓ^T,

where the matrix W_ℓ = [w_1, w_2, ..., w_ℓ] ∈ R^{n×ℓ} has orthonormal columns that span the Krylov subspace K_ℓ(A, w) = span{w, Aw, ..., A^{ℓ-1}w} with w_1 = w/‖w‖. The matrix H_ℓ ∈ R^{ℓ×ℓ} is of upper Hessenberg form, g_ℓ ∈ R^n satisfies W_ℓ^T g_ℓ = 0, and e_ℓ = [0, ..., 0, 1, 0, ..., 0]^T denotes the ℓth column of an identity matrix of appropriate order; see, e.g., Saad [60, Chapter 6] for further details on the Arnoldi process. We assume that ℓ is small enough so that the decomposition (3.5.1) with the stated properties exists. This is the generic situation. The computation of this decomposition requires the evaluation of ℓ matrix-vector products with the matrix A.

Expressions of the form exp(A)w are commonly approximated by the right-hand side of

exp(A)w ≈ W_ℓ exp(H_ℓ) e_1 ‖w‖;

see, e.g., [8, 45] for discussions. In particular, we obtain an approximation of (3.2.4) by letting w = 1. When A is large, the dominating computational work for calculating this approximation is the evaluation of the ℓ matrix-vector products required to determine the decomposition (3.5.1).

An approximation of the expression (3.2.5) can be determined similarly: we apply the Arnoldi process to the matrix A^T with initial vector ŵ = 1. This gives the decomposition

(3.5.2)  A^T Ŵ_ℓ = Ŵ_ℓ Ĥ_ℓ + ĝ_ℓ e_ℓ^T,

which is analogous to (3.5.1). We then evaluate the right-hand side of

(3.5.3)  exp(A^T) ŵ ≈ Ŵ_ℓ exp(Ĥ_ℓ) e_1 ‖ŵ‖.

Again, when A is large, the dominating computational effort to calculate this approximation is the evaluation of the ℓ matrix-vector products with A^T needed to determine the decomposition (3.5.2).

We turn to the approximation of the expression (3.3.2) for k = 1. Extension to the situation when k > 1 is straightforward. The expression (3.3.3) can be computed in a similar fashion. We first compute the Arnoldi decomposition (3.5.2) with initial vector ŵ = 1 and then evaluate the Arnoldi decomposition (3.5.1) with initial vector w = Ŵ_ℓ exp(Ĥ_ℓ) e_1 ‖1‖. This gives the approximation

(3.5.4)  W_ℓ exp(H_ℓ) e_1 ‖w‖

of (3.3.2). The following result sheds some light on this approximation.

Proposition 3.5.1. Let f be a polynomial of degree at most ℓ − 1 and let ŵ be an initial vector for the Arnoldi decomposition (3.5.2). Consider the approximation

Ŵ_ℓ f(Ĥ_ℓ) e_1 ‖ŵ‖

of f(A^T) ŵ. Use the above vector as initial vector w for the decomposition (3.5.1) and compute the approximation

(3.5.5)  W_ℓ f(H_ℓ) e_1 ‖w‖

of f(A) f(A^T) ŵ. Then this approximation is exact. We assume that the required Arnoldi decompositions can be computed without breakdown of the Arnoldi process.

Proof. Consider the decomposition (3.5.1). It is well known that for any polynomial f of degree at most ℓ − 1, we have

f(A)w = W_ℓ f(H_ℓ) e_1 ‖w‖;

see, e.g., [8]. Clearly, an analogous result holds if the decomposition (3.5.1) is replaced by the decomposition (3.5.2). The desired result follows.
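The two-stage construction of Proposition 3.5.1 can be checked numerically. In the sketch below (Python/NumPy; the random test matrix, ℓ = 6, and the degree-5 polynomial with coefficients 1/k! are illustrative assumptions), the relative error is at the level of rounding errors, as the proposition predicts.

```python
import math
import numpy as np

def arnoldi(A, w, ell):
    # ell steps of the Arnoldi process (assumes no breakdown).
    n = A.shape[0]
    W = np.zeros((n, ell + 1))
    H = np.zeros((ell + 1, ell))
    W[:, 0] = w / np.linalg.norm(w)
    for p in range(ell):
        v = A @ W[:, p]
        for i in range(p + 1):
            H[i, p] = W[:, i] @ v
            v = v - H[i, p] * W[:, i]
        H[p + 1, p] = np.linalg.norm(v)
        W[:, p + 1] = v / H[p + 1, p]
    return W[:, :ell], H[:ell, :ell]

def poly_mat(M, coeffs):
    # Evaluate f(M) = sum_k coeffs[k] M^k by Horner's rule.
    R = coeffs[-1] * np.eye(M.shape[0])
    for c in reversed(coeffs[:-1]):
        R = R @ M + c * np.eye(M.shape[0])
    return R

rng = np.random.default_rng(1)
n, ell = 30, 6
A = (rng.random((n, n)) < 0.2).astype(float)
wb = np.ones(n)
coeffs = [1.0 / math.factorial(k) for k in range(ell)]  # f of degree ell - 1
e1 = np.zeros(ell)
e1[0] = 1.0

# Stage 1: Arnoldi with A^T, initial vector wb; gives f(A^T) wb exactly.
Wh, Hh = arnoldi(A.T, wb, ell)
w = Wh @ (poly_mat(Hh, coeffs) @ e1) * np.linalg.norm(wb)

# Stage 2: Arnoldi with A, initial vector w; gives f(A) f(A^T) wb exactly.
W2, H2 = arnoldi(A, w, ell)
approx = W2 @ (poly_mat(H2, coeffs) @ e1) * np.linalg.norm(w)

exact = poly_mat(A, coeffs) @ (poly_mat(A.T, coeffs) @ wb)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```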

The approximation (3.5.5) of f(A) f(A^T) ŵ requires the evaluation of 2ℓ matrix-vector products, ℓ with each one of the matrices A and A^T. When the adjacency matrix A is stored in a format that makes the evaluation of matrix-vector products with A^T more expensive than with A, it may be tempting to carry out 2ℓ steps of the Arnoldi process applied to A with initial vector ŵ and then use the low-rank matrices

(3.5.6)  exp(A) ≈ W_{2ℓ} exp(H_{2ℓ}) W_{2ℓ}^T,  exp(A^T) ≈ W_{2ℓ} exp(H_{2ℓ}^T) W_{2ℓ}^T,

to approximate exp(A) exp(A^T) ŵ. The evaluation of this approximation requires the same number of matrix-vector product evaluations as the approach described in Proposition 3.5.1. However, no analogue of this proposition is available for the approximation (3.5.6) and, indeed, this approximation typically is of significantly worse quality than an approximation computed by the approach of Proposition 3.5.1.

We conclude this section with a comparison of the evaluation of the expression (3.3.2) for k = 1 for the matrix A of Section 3.4.2.2 by explicitly computing the matrix exponential exp(A) and by evaluating Arnoldi decompositions as described by Proposition 3.5.1. The matrix A is of order 3891. The computation of the matrix exponential M = exp(A/2) using the MATLAB function expm required 1222.15 seconds (≈ 20.37 minutes).¹ This is the dominating work. Let w = [1, 1, ..., 1]^T ∈ R^3891. Having the matrix M, we can calculate BNR(d, 1) by evaluating two matrix-vector products, BNR(d, 1) = M(M^T w). The top 10 ranked nodes obtained in this manner are shown in column 3 in the top part of Table 8.

We turn to the approximation of the expression (3.3.2) with the aid of Arnoldi decompositions. First we compute the decomposition (3.5.2) for ℓ = 6 and initial vector w. This required only 0.037 seconds and gives the approximation ẑ := Ŵ_6 exp(Ĥ_6) e_1 ‖w‖ of M^T w. The total time needed to compute ẑ was 0.077 seconds. Next we evaluate the Arnoldi decomposition (3.5.1) with ℓ = 6 and initial vector ẑ. The calculation of this decomposition required 0.023 seconds.

¹All computations were carried out in MATLAB with about 15 significant decimal digits on a Lenovo ideapad 510 laptop computer with a 2.5 GHz Intel Core i7 processor and 6 GB 2133 MHz DDR4 memory.

Algorithm 4: Calculating exp(A/2) exp(A^T/2) w using Arnoldi iterations

% Load the adjacency matrix A of the gene network.
load genenetwork;
Adj = A;

k = 12;   % total number of Arnoldi iterations
nb = 10;  % number of top results we would like to see

[n, n] = size(Adj);
I = eye(n);
e1 = I(:, 1);
vector_one = ones(n, 1);
vector_one_unit = vector_one / norm(vector_one);

half_k = k/2;
% Arnoldi decomposition of Adj' with initial vector 1/||1||; cf. (3.5.2).
[Q_half_e1, H_half_e1] = Arnoldi(Adj', vector_one_unit, half_k);
Hk_half_e1 = H_half_e1(1:half_k, 1:half_k);
Qk_half_e1 = Q_half_e1(:, 1:half_k);
w = Qk_half_e1 * expm(Hk_half_e1) * e1(1:half_k, 1) * norm(vector_one);

w_unit = w / norm(w);
% Arnoldi decomposition of Adj with initial vector w/||w||; cf. (3.5.1).
[Q_half_w, H_half_w] = Arnoldi(Adj, w_unit, half_k);
Hk_half_w = H_half_w(1:half_k, 1:half_k);
Qk_half_w = Q_half_w(:, 1:half_k);
z = Qk_half_w * expm(Hk_half_w) * e1(1:half_k, 1) * norm(w);

function [Q, H] = Arnoldi(A, b, k)
% Arnoldi iteration, k steps, initial vector b.
[n, n] = size(A);
Q = b / norm(b);
for p = 1:k
    v = A * Q(:, p);
    for i = 1:p
        H(i, p) = Q(:, i)' * v;
        v = v - H(i, p) * Q(:, i);
    end
    H(p+1, p) = norm(v);
    Q(:, p+1) = v / norm(v);
end
end

The difference in time required to compute the decompositions (3.5.2) and (3.5.1) depends on the storage format for the matrix A used by MATLAB. The total time needed to compute an approximation of (3.3.2) for k = 1 in this manner is only 0.129 seconds. The top 10 ranked nodes are those of column 3 in the top part of Table 8. Thus, the application of the Arnoldi process twice with ℓ = 6 as described by Proposition 3.5.1 gives the same ranking of the nodes as the evaluation of exp(A/2) exp(A^T/2) w and requires much less time. Algorithm 4, written in MATLAB, follows the described process step by step.

CHAPTER 4

Edge Importance in a Network Via Line Graphs and the Matrix Exponential

4.1 Incidence and Exsurgence Matrices

Assume now that G is directed and unweighted. We use the definitions of the incidence matrix B^i and the exsurgence matrix B^e of G as described in Section 2.4.2, where B^i_{ij} = 1 if edge e_j incides on node v_i, B^e_{ij} = 1 if e_j exsurges from v_i, and the entries are zero otherwise.

Proposition 4.1.1. Let G be a directed unweighted graph, and let B^i = [B^i_{ij}] and B^e = [B^e_{ij}] denote the associated incidence and exsurgence matrices. Then

(i) each column of B^i and of B^e contains exactly one entry equal to 1, with the remaining entries of the column zero,

(ii) A = [A_{jk}] = B^e B^{iT}. Moreover, the entries of B^e and B^i are such that in each sum

(4.1.1)  A_{jk} = Σ_{ℓ=1}^{m} B^e_{jℓ} B^i_{kℓ},  1 ≤ j, k ≤ n,

there is at most one nonvanishing term. Each nonvanishing element B^e_{jℓ} of B^e is paired with precisely one nonvanishing entry B^i_{kℓ} of B^i. It follows that each nonvanishing entry A_{jk} can be written as B^e_{jℓ} B^i_{kℓ} for precisely one index ℓ ∈ {1, 2, ..., m}. Moreover, each entry B^e_{jℓ} and each entry B^i_{kℓ} determine precisely one entry A_{jk}.

Proof. Statement (i) follows from the definition of the matrices B^i and B^e. The factorization in (ii) expresses the entries of A in terms of inciding and exsurging edges. Each nonvanishing term of the sum (4.1.1) represents an edge from node v_j to node v_k. Since the network is assumed to have simple edges only, there can be at most one nonvanishing term in each one of the sums (4.1.1). The fact that each nonvanishing element B^e_{jℓ} is paired with precisely one nonvanishing entry B^i_{kℓ} follows from the observation that an exsurgent edge has to lead somewhere, and cannot have more than one destination.
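The factorization A = B^e B^{iT} is easy to verify on a small example. The following Python/NumPy sketch uses a hypothetical three-node directed graph (our own illustrative choice).

```python
import numpy as np

# Hypothetical directed graph with edges e1 = v1->v2, e2 = v2->v3,
# e3 = v3->v1, e4 = v1->v3 (0-based indices below).
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
n, m = 3, len(edges)

Bi = np.zeros((n, m))  # incidence:  Bi[v, k] = 1 if edge e_k incides on v
Be = np.zeros((n, m))  # exsurgence: Be[v, k] = 1 if edge e_k exsurges from v
for k, (u, v) in enumerate(edges):
    Be[u, k] = 1.0
    Bi[v, k] = 1.0

A = Be @ Bi.T  # adjacency matrix: A[j, k] = 1 iff there is an edge v_j -> v_k
```

Each column of Bi and Be contains exactly one entry 1, as stated in part (i) of the proposition.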

4.2 Line Graphs

4.2.1 Line Graphs of an Undirected Graph

Given an undirected graph G = (V, E), the line graph of G is an undirected graph G* = (E, F), in which there is an edge f ∈ F that connects the nodes e, e′ ∈ E if and only if there is a node v ∈ V such that both e and e′ incide on v in G. Line graphs have particular characteristics. For example, each node v in G induces a clique (a complete subgraph, that is, a set of nodes that are all connected to each other) in G*, containing all e ∈ E that incide on v. In fact, the collection of cliques produced by nodes in G with degree at least 2 creates a partition of F; see [55]. If B is the incidence matrix for G, then it can easily be shown that E = B^T B − 2I is the adjacency matrix for G*. Throughout this chapter, I stands for the identity matrix of suitable order.

We refer to E as the line graph adjacency matrix.
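The identity E = B^T B − 2I can be verified directly; the sketch below uses a hypothetical undirected graph (a 4-cycle with one diagonal), an illustrative choice of our own.

```python
import numpy as np

# Undirected graph: 4-cycle v1-v2-v3-v4 plus the diagonal v1-v3
# (0-based indices below).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n, m = 4, len(edges)

B = np.zeros((n, m))  # incidence matrix: B[v, k] = 1 if v is an endpoint of e_k
for k, (u, v) in enumerate(edges):
    B[u, k] = 1.0
    B[v, k] = 1.0

E = B.T @ B - 2.0 * np.eye(m)  # line graph adjacency matrix
```

The subtraction of 2I removes the contribution of each edge's two endpoints to its own diagonal entry, so E has a zero diagonal and E[k, l] = 1 exactly when edges e_k and e_l share a node.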

4.2.2 Line Graphs of a Directed Graph

Defining a single line graph for a directed graph is difficult, as there are several possible choices, none of them canonical. We therefore introduce four line graphs that capture different relationships between the edges of a directed graph.

Definition 1. Let G = (V, E) be a directed graph. We define the following associated line graphs:

1. The undirected co-incidence line graph G∨ = (E, F∨), in which distinct edges e, e′ ∈ E are connected if and only if both e and e′ incide on the same node in G.

2. The undirected co-exsurgence line graph G∧ = (E, F∧), in which distinct edges e, e′ ∈ E are connected if and only if both e and e′ exsurge from the same node in G.

3. The directed continuation line graph G→ = (E, F→), in which the edge e is connected to the edge e′ if e incides on v and e′ exsurges from v for some node v in G.

For completeness, we also define a fourth line graph: the reverse continuation line graph G← = (E, F←), where (e, e′) ∈ F← if and only if (e′, e) ∈ F→. The line graph G→ is well known; it is described, e.g., in [38, page 265]. The line graphs G∨ and G∧ are new. It is worth noting that for an undirected graph G∨ = G∧ = G→ = G←, because there is no distinction between the types of edge connections described above.

The following properties are easily shown:

Proposition 4.2.1. Let G = (V, E) be a directed graph.

1. G∨ and G∧ have no self-loops; e ∈ E has a self-loop in G→ if and only if e is a self-loop in G.

2. G∨ partitions E into edge- and vertex-disjoint cliques, with one clique for each v ∈ V with positive indegree. The same happens with G∧, with one clique for each v with positive outdegree.

3. Suppose (e, e′) ∈ F→. Then {e, e″} ∈ F∨ implies that (e″, e′) ∈ F→, and {e′, e″} ∈ F∧ implies that (e, e″) ∈ F→.

4. B^e B^{eT} and B^i B^{iT} are n × n diagonal matrices containing the outdegrees and indegrees, respectively, of the nodes of G.

5. The adjacency matrices for G∨, G∧, G→, and G← are given by E∨ = B^{iT} B^i, E∧ = B^{eT} B^e, E→ = B^{iT} B^e, and E← = B^{eT} B^i, respectively. Note that for undirected networks E∨ = E∧ = E→ = E← = E.

6. Let B^+ = [B^e  B^i] ∈ R^{n×2m} and define E^+ = B^{+T} B^+ ∈ R^{2m×2m}. Then

(4.2.1)  E^+ = [ E∧  E← ]
               [ E→  E∨ ].

The matrix E→ is known as the line graph adjacency matrix associated with the graph G; see [64]. To distinguish this matrix from the adjacency matrices for other line graphs, we refer to E+ as the extended line graph adjacency matrix of G. It is a symmetric matrix, and it can be interpreted as the adjacency matrix of an undirected graph with two copies of the set of edges E.
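The four line graph adjacency matrices and the block structure (4.2.1) can be formed directly from B^i and B^e. A Python/NumPy sketch on a hypothetical three-node directed graph (an illustrative choice of our own):

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 0), (0, 2)]  # hypothetical directed graph
n, m = 3, len(edges)
Bi = np.zeros((n, m))
Be = np.zeros((n, m))
for k, (u, v) in enumerate(edges):
    Be[u, k] = 1.0  # e_k exsurges from u
    Bi[v, k] = 1.0  # e_k incides on v

E_coin = Bi.T @ Bi  # co-incidence line graph G-vee:   same head node
E_coex = Be.T @ Be  # co-exsurgence line graph G-wedge: same tail node
E_cont = Bi.T @ Be  # continuation line graph G->:      e ends where e' starts
E_rev = Be.T @ Bi   # reverse continuation line graph G<-

Bplus = np.hstack([Be, Bi])  # B^+ = [B^e  B^i]
Eplus = Bplus.T @ Bplus      # extended line graph adjacency matrix
```

The blocks of Eplus reproduce (4.2.1), and E← is the transpose of E→, mirroring the definition of the reverse continuation line graph.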

4.3 Edge Weights

The algebraic approach is, clearly, crucial for the use of matrix functions like the matrix exponential (see Section 4.4.1 below), which are important tools to assess the importance of a node or an edge. However, the choice of a particular matrix representation implies some assumptions about the network, as well as limitations in the kinds of networks that can be represented. For example, a node-node adjacency matrix with all entries zero or one cannot accommodate multiple edges between the same pair of nodes, unless we are willing to represent them with weights that count the number of edges.

This section considers directed graphs with weighted edges. The weights are assumed to be positive. The interpretation of the weights depends on the application. In general, edge weights correspond to a capacity or speed of transportation, or the reciprocal of a transfer or communication cost.

Let Ã denote an edge-weighted adjacency matrix. This matrix is obtained by associating a positive weight with each edge. Thus, the (ij)th entry of Ã is the weight of the edge from node v_i to node v_j. We refer to this matrix as edge-scaled. The "unweighted" adjacency matrix A that is associated with Ã has all edge weights equal to one. Thus, the entries of A belong to {0, 1}.

Theorem 4.3.1. Let Ã = [Ã_{ij}] be the n × n weighted adjacency matrix of a directed edge-weighted graph G of n nodes and m edges. Let z_k > 0 denote the weight of edge e_k for 1 ≤ k ≤ m. Define the diagonal matrix Z with diagonal entries z_1, z_2, ..., z_m in some order. Let A = B^e B^{iT} be the adjacency matrix for the unweighted directed graph associated with G, where B^i = [B^i_{ij}] and B^e = [B^e_{ij}] denote the incidence and exsurgence matrices for the unweighted graph; see Proposition 4.1.1. Then Ã = B^e Z B^{iT}. In particular, each nonvanishing entry of Ã equals one of the diagonal entries of the matrix Z, and each diagonal entry of Z corresponds to precisely one nonvanishing entry of Ã.

Proof. The result follows from Proposition 4.1.1. Each column of B^e has precisely one nonvanishing entry 1. Therefore, B^e Z is a weighted exsurgence matrix, with each diagonal entry z_j appearing in exactly one column. The theorem now follows from part (ii) of Proposition 4.1.1 with B^e replaced by B^e Z.

The matrix Z = diag[z_1, z_2, ..., z_m] of Theorem 4.3.1 can be factored according to Z = Z^e Z^i, where Z^e = diag[z^e_1, z^e_2, ..., z^e_m] and Z^i = diag[z^i_1, z^i_2, ..., z^i_m] have positive diagonal entries. Then B̃^e = B^e Z^e is a weighted exsurgence matrix, such that each entry z^e_j of Z^e appears in exactly one column. Similarly, B̃^i = B^i Z^i is a weighted incidence matrix, such that each entry z^i_j of Z^i appears in precisely one column. The weighted line graph G→ is defined by the matrix Ẽ→ = B̃^{iT} B̃^e. We will use the matrices Z^e = Z^i = Z^{1/2} in the computed examples reported in this chapter, but other choices of Z^e and Z^i also are possible.
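A quick numerical check of Theorem 4.3.1 and of the splitting Z^e = Z^i = Z^{1/2} follows (Python/NumPy; the graph and the weights are hypothetical choices of our own).

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 0), (0, 2)]   # hypothetical directed graph
weights = np.array([2.0, 0.5, 3.0, 1.5])   # positive edge weights z_k
n, m = 3, len(edges)
Bi = np.zeros((n, m))
Be = np.zeros((n, m))
for k, (u, v) in enumerate(edges):
    Be[u, k] = 1.0
    Bi[v, k] = 1.0

Z = np.diag(weights)
A_tilde = Be @ Z @ Bi.T          # weighted adjacency matrix, Theorem 4.3.1

Zhalf = np.diag(np.sqrt(weights))  # Z^e = Z^i = Z^(1/2)
Be_w = Be @ Zhalf                  # weighted exsurgence matrix
Bi_w = Bi @ Zhalf                  # weighted incidence matrix
E_cont_w = Bi_w.T @ Be_w           # weighted continuation line graph
```

Each nonzero entry of A_tilde is one of the diagonal entries of Z, and the factored form B̃^e B̃^{iT} reproduces Ã up to rounding.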

For certain (unweighted) adjacency matrices A = BeBiT and weighting matrices

Z, the weighted adjacency matrix A˜ = BeZBiT can be expressed by row and/or column scaling of A. We summarize this in the following proposition, whose proof is straightforward.

e iT Proposition 4.3.2. Let A = [Aij] = B B be an n × n unweighted adjacency

e iT n×n matrix with m edges, let A˜ = [A˜ij] = B ZB ∈ R be an associated edge-weighted adjacency matrix, and let W = diag[w1, w2, ... , wn] be a diagonal matrix with positive diagonal entries. Then (i) A˜ = AW if and only if the weighting matrix Z is such that

A˜ij = Aijwj for all 1 ≤ i, j ≤ n, (ii) A˜ = WA if and only if the weighting matrix

Z is such that A˜ij = Aijwi for all 1 ≤ i, j ≤ n, (iii) A˜ = W AW if and only if the weighting matrix Z is such that A˜ij = Aijwiwj for all 1 ≤ i, j ≤ n.

The significance of the above result is that we may express the edge-weighting defined by the edge-weighted adjacency matrix A˜ in terms of row or column scaling of the “unweighted” adjacency matrix A.

4.3.1 Example

Consider the cyclic upper Hessenberg adjacency matrix

        [ 0  0  0  ···  0  1 ]
        [ 1  0  0  ···  0  0 ]
A =     [ 0  1  0  ···  0  0 ]   ∈ R^{n×n}.
        [ ⋮        ⋱       ⋮ ]
        [ 0  0  ···    1  0 ]

Then, for any weighting matrix Z, the edge-weighted matrix Ã can be expressed as Ã = AW_1 and Ã = W_2 A for suitable diagonal weighting matrices W_1 and W_2 with positive diagonal entries.

4.4 Computing the Most Important Edges in an Undirected Network by the Matrix Exponential

4.4.1 Review of the Adjacency Matrix Exponential

We saw in Chapter 3 that for a symmetric unweighted adjacency matrix A ∈ R^{n×n}, the (ij)th entry of A^p gives the number of walks of length p between nodes v_i and v_j. The (ij)th entry of the matrix function

(4.4.1)  f(A) = Σ_{p=0}^{∞} c_p A^p

gives a weighted average of the number of walks of various lengths between the nodes v_i and v_j.

The entry [f(A)]_{ii} defines the subgraph centrality of the node v_i, and the entry [f(A)]_{ij}, with i ≠ j, defines the communicability between the nodes v_i and v_j. A relatively large value [f(A)]_{ii} indicates that node v_i is important, and a relatively large value [f(A)]_{ij}, i ≠ j, suggests that communication between the nodes v_i and v_j is relatively easy; see, e.g., [27, 30]. Another importance measure for nodes is furnished by the row sums of the function (4.4.1), i.e., by the entries [f(A)1]_i, 1 ≤ i ≤ n, where 1 = [1, 1, ..., 1]^T. A relatively large value of [f(A)1]_i suggests that the node v_i is important; see [12] as well as [34, Section 2] and [43].

4.4.2 Exponential of the Line Graph Adjacency Matrix for Undirected Graphs

We seek to rank the edges of graphs and first consider undirected graphs. Let E ∈ R^{m×m} be the line graph adjacency matrix for an undirected graph with n nodes and m edges; see Section 4.2.1 for its definition. We compute the matrix exponential of E and obtain, analogously to (4.4.1),

(4.4.2)  exp(E) = Σ_{p=0}^{∞} (1/p!) E^p.

Similarly to the discussion in [30] and above, we can interpret the entries of exp(E) as indicators of the centrality and communicability of the edges of the graph. For instance, a relatively large diagonal entry [exp(E)]_{kk} indicates that the edge e_k is well connected to other edges through the network and therefore is important. Similarly, a relatively large off-diagonal entry [exp(E)]_{kl}, k ≠ l, suggests that information that travels via edge e_k is likely to also travel via edge e_l. One also may define the centrality of the edge e_k as [exp(E)1]_k. We will use the latter measure and define the edge line graph centrality of an edge e_k between the nodes v_i and v_j as

(4.4.3)  eLC_k = [exp(E)1]_k.
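The centrality (4.4.3) can be computed in a few lines. The following Python/NumPy sketch uses a hypothetical undirected graph, a 4-cycle with one diagonal edge (our own illustrative choice, not the graphs of the examples below); by symmetry the four cycle edges must tie, while the diagonal, which touches every other edge, must rank first.

```python
import numpy as np

def expm(M, terms=40):
    # Matrix exponential by scaling and squaring with a truncated Taylor series.
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 1), 1.0)))))
    X = M / 2.0**s
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ X / k
        E = E + T
    for _ in range(s):
        E = E @ E
    return E

# Hypothetical undirected graph: 4-cycle plus diagonal edge e5 = (v1, v3).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n, m = 4, len(edges)
B = np.zeros((n, m))
for k, (u, v) in enumerate(edges):
    B[u, k] = 1.0
    B[v, k] = 1.0

E = B.T @ B - 2.0 * np.eye(m)   # line graph adjacency matrix
eLC = expm(E) @ np.ones(m)      # edge line graph centrality (4.4.3)
best = int(np.argmax(eLC))      # index of the most central edge
```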

The following examples illustrate the ranking of edges using this measure.

4.4.2.1 Example

Consider the graph of Figure 4.1 and the associated line graph of Figure 4.2. We can see from the graph and the line graph that edge e3 is the most important edge, because it is directly connected to all the other edges in the graph. The symmetry of the graph and the line graph suggests that the edges e1, e2, e4, and e5 are equally important. Table 1 confirms this by computing the edge line graph centrality (4.4.3).

[Graph drawings omitted.]

Figure 4.1: Graph of Example 4.4.2.1. Figure 4.2: Line graph of Example 4.4.2.1.

edge e_k   eLC_k
e3         29.73
e2         24.11
e4         24.11
e1         24.11
e5         24.11

Table 1: Ranking of the edges of Example 4.4.2.1 by the centrality measure (4.4.3).

4.4.2.2 Example

The graph for this example is shown in Figure 4.3. Visual inspection suggests that the edge e2 is the most important edge of the graph. Looking at the graph, one might guess that the edge e1 comes next in importance. However, Table 2 shows the edges e3, e4, and e6 to be ranked higher. The line graph for the graph, shown in Figure 4.4, sheds light on this ordering. It shows the edges e2, e3, e4, and e6 to be well connected. This example illustrates that looking at a graph may not always give a good idea of which edges are the most important ones.

[Graph drawings omitted.]

Figure 4.3: Graph of Example 4.4.2.2. Figure 4.4: Line graph of Example 4.4.2.2.

edge e_k   eLC_k
e2         28.07
e4         23.68
e3         23.68
e6         23.68
e1         17.23
e7         10.61
e5         10.61

Table 2: Ranking of the edges of Example 4.4.2.2 by using the centrality measure (4.4.3).

4.4.3 A Comparison of Downdating Methods for Undirected Graphs

Let the adjacency matrix A define an undirected graph. We would like to remove an edge from this graph so that the total network communicability, defined by

(4.4.4)  TC(A) = 1^T exp(A) 1,

and considered by Benzi and Klymko [12], decreases the least. This problem is discussed by Arrigo and Benzi [4]. Another communicability measure used in the latter paper is the Estrada Index [28, 30] defined by

(4.4.5)  EE(A) = tr(exp(A)) = Σ_{i=1}^{n} [exp(A)]_{ii}.

There are several ways to measure the importance of an edge. Arrigo and Benzi [4] define the edge total communicability centrality of an existing edge between the nodes v_i and v_j as

(4.4.6)  eTC(v_i, v_j) = [exp(A)1]_i [exp(A)1]_j,

and the edge subgraph centrality of an edge between the nodes v_i and v_j as

(4.4.7)  eSC(v_i, v_j) = [exp(A)]_{ii} [exp(A)]_{jj}.

In this chapter, we propose to compute the exponential of the line graph adjacency matrix and remove edges with the lowest centrality defined by (4.4.3). For this purpose we introduce the total line graph centrality measure of the network with line graph adjacency matrix E,

(4.4.8)  LC(A) = Σ_{k=1}^{m} [exp(E)1]_k = 1^T exp(E) 1.

This approach is shown to be competitive with the approach of removing edges with the lowest edge centrality defined by (4.4.6) and (4.4.7), as proposed in [4], and we will show examples where it decreases the total communicability less than the other approaches.

The following examples illustrate the use of the measures described above. We show two examples, for which removing the three least important edges from a graph, using the exponential of the edge line graph to identify these edges, results in a graph with a higher communicability with respect to all the measures (4.4.3), (4.4.5), and (4.4.6).

In the first example, we compute the exponential of the line graph, remove the three least important edges, and then calculate the measures after removal. This approach is referred to as nongreedy downdating in [12]. The second example recalculates the measures after the removal of each edge, to make the decision of removal of the next edge based on the updated graph. This approach is called greedy downdating in [12].

[Graph drawings omitted.]

Figure 4.5: Undirected graph and associated line graph. (a) The graph for Example 4.4.3.1. (b) The line graph of Example 4.4.3.1.
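Greedy downdating can be sketched as follows (Python/NumPy). The test graph is hypothetical, and for brevity the removal criterion used here is the total network communicability (4.4.4); the same loop works with any of the edge centrality measures discussed in this section.

```python
import numpy as np

def expm(M, terms=40):
    # Matrix exponential by scaling and squaring with a truncated Taylor series.
    s = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 1), 1.0)))))
    X = M / 2.0**s
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ X / k
        E = E + T
    for _ in range(s):
        E = E @ E
    return E

def total_communicability(A):
    one = np.ones(A.shape[0])
    return float(one @ expm(A) @ one)   # TC(A) = 1^T exp(A) 1

def connected(A):
    # Depth-first search from node 0 on the undirected graph.
    n = A.shape[0]
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for u in range(n):
            if A[v, u] > 0 and u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == n

def greedy_downdate(A, k):
    # Remove k edges one at a time; at each step drop the edge whose removal
    # decreases TC(A) the least while keeping the graph connected.
    A = A.copy()
    n = A.shape[0]
    for _ in range(k):
        best_edge, best_tc = None, -np.inf
        for i in range(n):
            for j in range(i + 1, n):
                if A[i, j] > 0:
                    A[i, j] = A[j, i] = 0.0
                    if connected(A):
                        tc = total_communicability(A)
                        if tc > best_tc:
                            best_edge, best_tc = (i, j), tc
                    A[i, j] = A[j, i] = 1.0
        i, j = best_edge
        A[i, j] = A[j, i] = 0.0
    return A

# Hypothetical 5-node test graph with 7 edges.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

A3 = greedy_downdate(A, 3)
```

The nongreedy variant differs only in that all k edges are chosen from the ranking computed once, before any removal.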

4.4.3.1 Nongreedy Downdating Example

Consider the connected network shown in Figure 4.5. We use different edge cen- trality measures to decide which three edges should be eliminated from the network so that the total communicability is decreased the least, with the constraint that the graph obtained after edge removal should be connected.

Table 3 shows the edges with the lowest centrality as measured by several centrality measures defined above. The first method computes the edge total communicability (4.4.6), the second one ranks the edges according to the edge subgraph centrality measure (4.4.7), and the third one uses the edge line graph centrality (4.4.3). We then remove the three edges with the lowest centrality and determine which method decreases the total communicability the least.

edge e_k  eTC(v_i, v_j)    edge e_k  eSC(v_i, v_j)    edge e_k  eLC_k
e2        549.51           e2        18.82            e7        85.08
e10       549.51           e10       18.82            e9        85.08
e6        581.11           e6        18.96            e5        85.08
e9        581.11           e9        18.96            e6        97.91

Table 3: The least important edges in Figure 4.5(a) ordered according to increasing importance by using the edge total communicability, the edge subgraph centrality, and the edge line graph centrality. The edge ek connects the nodes vi and vj.

When using the edge total communicability centrality measure or the edge subgraph centrality measure, as suggested in [4], the first 4 columns of Table 3 indicate that the edges e2, e6, and e9 are to be removed. The graph obtained after removal of these edges is shown in Figure 4.6(a). We note that while some edges have a smaller centrality measure than the ones we remove, those edges are not removable because this would disconnect the graph. We then use the edge line graph centrality measure, see the last two columns of Table 3 for the edge ranking, to conclude that the edges e5, e7, and e9 are to be removed. The graph obtained after removal of these edges is shown in Figure 4.6(b).

We compute the centrality measures TC(A), EE(A), and LC(A) for both graphs of Figure 4.6 and report them in Table 4. In each case, we obtain better connectivity for the graph of Figure 4.6(b) than for the graph of Figure 4.6(a), i.e., when we use the edge line graph centrality measure to determine which edges to remove.

                                        Figure 4.6(a)   Figure 4.6(b)
Total network communicability TC(A)         86.22           92.31
Estrada Index EE(A)                         20.68           21.16
Total line graph centrality LC(A)         2144.41         3104.77

Table 4: Comparison of various network connectivity measures for the downdated graphs in Figure 4.6 obtained by the method in [4] versus using the exponential of the line graph. The reduced graph is required to be connected.

[Graph drawings omitted.]

Figure 4.6: The nongreedy downdated network of Figure 4.5(a). (a) Graph obtained by removing edges using the technique in [4]. (b) Graph obtained by removing edges using the exponential of the line graph matrix.

4.4.3.2 Greedy Downdating Example

We again consider the network in Figure 4.5 and remove the three edges with the lowest centrality using the same measures as in the previous example, but here we update the measures after each edge removal. Like in the previous example, we require the graph obtained after edge removal to be connected. The edge total communicability centrality and the edge subgraph centrality yield the graph in Figure 4.7(a), while using the exponential of the adjacency matrix for the line graph gives the graph of Figure 4.7(b).

[Graph drawings omitted.]

Figure 4.7: The greedy downdated network of Figure 4.5(a). (a) Graph obtained by using the greedy technique in [4]. (b) Graph obtained by using the exponential of the line graph.

We compute the network centrality measures TC(A), EE(A), and LC(A) for the graphs of Figure 4.7. These measures are reported in Table 5 and are larger than the entries of Table 4 for the nongreedy algorithm. This means that the greedy approach gives graphs with better connectivity than the nongreedy one. Moreover, the graph of Figure 4.7(b) has better connectivity than the graph of Figure 4.7(a). This also holds for numerous other examples. This strongly suggests that edge removal by the greedy approach based on using the exponential of the line graph adjacency matrix generally is preferable.

                                        Figure 4.7(a)   Figure 4.7(b)
Total network communicability TC(A)         89.04           92.52
Estrada Index EE(A)                         20.91           22.11
Total line graph centrality LC(A)         2507.77         3119.72

Table 5: Comparison of network connectivity measures for the downdated graphs in Figure 4.7 obtained by the method in [4] versus the use of the exponential of the adjacency matrix for the line graph. The reduced graph is required to be connected.

4.4.3.3 Downdating Example When the Reduced Graph is Not Required to be Connected

In Examples 4.4.3.1 and 4.4.3.2, we downdated the graphs with the requirement that the reduced graph be connected, as was done in [12]. In this example, we remove three edges with the smallest centrality of the graph of Figure 4.5(a) without requiring the reduced graph to be connected. The edge centrality measures used are the same as in Examples 4.4.3.1 and 4.4.3.2.

When measuring edge importance by the edge total communicability centrality and the edge subgraph centrality, we obtain the disconnected reduced graph in Figure 4.8 both when we apply the nongreedy and greedy methods described in Examples 4.4.3.1 and 4.4.3.2, respectively. Using the exponential of the adjacency matrix for the line graph gives the graph of Figure 4.6(b) for the nongreedy method, and the graph of Figure 4.7(b) for the greedy method.

Table 6 reports the network centrality measures TC(A), EE(A), and LC(A) for the graphs obtained. The graphs of Figures 4.6(b) and 4.7(b) have larger connectivity measures than the graph of Figure 4.8, except for the Estrada Index, which is smaller but close. We conclude that edge removal by using the exponential of the line graph adjacency matrix E can be competitive also when we do not require the reduced graph to be connected.

[Graph drawing omitted.]

Figure 4.8: Downdated graph of Figure 4.5(a) obtained by using the nongreedy or greedy techniques in [4] without requiring the network to stay connected.

                                        Figure 4.8   Figure 4.6(b)   Figure 4.7(b)
Total network communicability TC(A)        91.5          92.31           92.52
Estrada Index EE(A)                        22.33         21.16           22.11
Total line graph centrality LC(A)        2501.24       3104.77         3119.72

Table 6: Comparison of network connectivity measures for the downdated graphs in Figure 4.8 obtained by the method in [4] versus the use of the exponential of the adjacency matrix for the line graph. The reduced graph is not required to be connected.

4.4.4 A Comparison of Downdating Methods for Directed Graphs

In this section the adjacency matrix A corresponds to a directed graph. We would like to remove edges of a directed graph so that the communicability of the network is decreased the least. A measure used by Arrigo and Benzi [3] is the total network communicability, the same as in (4.4.4), with A nonsymmetric. They also use a measure that takes into account alternating walks in a directed network. This measure assigns the network a value of its hub strength, referred to as the total hub communicability,

ThC(A) = 1^T cosh(√(AA^T)) 1,

and a value of its authority strength, referred to as the total authority communicability,

TaC(A) = 1^T cosh(√(A^T A)) 1.

To evaluate the total communicability of the network, the authors of [3] compute the sum ThC(A) + TaC(A). This method is based on the idea of expressing a directed graph by an undirected graph with twice the number of nodes, and applying the matrix exponential to the adjacency matrix associated with the latter graph,

(4.4.9)    𝒜 = [ 0  A ; A^T  0 ];

see [11] for details.
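The identity that makes this embedding work is that the diagonal blocks of exp of the block matrix in (4.4.9) are cosh(√(AA^T)) and cosh(√(A^T A)). A minimal numerical check of this fact, on a small directed graph of our own choosing (not one from the text):

```python
import numpy as np
from scipy.linalg import expm

# Small directed test graph (arbitrary example, not from the dissertation).
A = np.array([[0., 1., 1.],
              [0., 0., 1.],
              [1., 0., 0.]])
n = A.shape[0]
one = np.ones(n)

# Bipartite embedding (4.4.9): a symmetric 2n x 2n matrix.
calA = np.block([[np.zeros((n, n)), A],
                 [A.T, np.zeros((n, n))]])
E = expm(calA)

# The top-left block of exp(calA) equals cosh(sqrt(A A^T)),
# evaluated here via the eigendecomposition of the symmetric PSD matrix A A^T.
lam, V = np.linalg.eigh(A @ A.T)
cosh_sqrt = V @ np.diag(np.cosh(np.sqrt(np.abs(lam)))) @ V.T

ThC_block = one @ E[:n, :n] @ one    # total hub communicability from the block
ThC_direct = one @ cosh_sqrt @ one   # the same quantity computed directly
```

The even powers of the block matrix are block diagonal with blocks (AA^T)^k and (A^T A)^k, which is exactly why the hyperbolic cosine appears.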

This section extends the total line graph centrality measure defined in (4.4.8) to directed graphs. Taking into account the existing edges e_k, we use the line graph adjacency matrix E→ to define the total line graph centrality by

(4.4.10)    eLC(A) = Σ_{k=1}^m [exp(E→)1]_k = 1^T exp(E→)1.

Note that when assessing the effect of removing an edge on the communicability of a network, the measure ThC(A) + TaC(A) may yield significantly different values than the measures TC(A) and eLC(A), since the latter measures capture the flow deeply in the network, whereas the first measure does not.

We review some of the edge measures used in [3] for ranking edges to identify the least connected edges, which are to be removed. The edge total communicability centrality of an existing edge going from node v_i to v_j is given by

eTC(v_i, v_j) = [exp(A)1]_i [1^T exp(A)]_j,

and its application to the corresponding matrix 𝒜 in (4.4.9) yields the measure

eTC(v_i, v_j) = [exp(𝒜)1]_i [1^T exp(𝒜)]_{n+j}.

Based on ideas in [11], the authors of [3] use the generalized hyperbolic sine to define the edge total communicability,

egTC(v_i, v_j) = C_h(i) C_a(j),

where the total hub communicability of node v_i and the total authority communicability of node v_j are defined by

C_h(i) = [U sinh(Σ) V^T 1]_i  and  C_a(j) = [V sinh(Σ) U^T 1]_j,

respectively. The matrices U, Σ, and V are determined by the singular value decomposition A = U Σ V^T.

Analogously to our definition of the edge centrality (4.4.3), we define the edge line graph outcentrality as

(4.4.11)    eLCout_k = [exp(E→)1]_k,

and the edge line graph incentrality,

(4.4.12)    eLCin_k = [exp(E→^T)1]_k.

A relatively large eLCout_k indicates that edge e_k is an important transmitter of information through the network, and a relatively large eLCin_k suggests that edge e_k is an important receiver of information.

4.4.4.1 Nongreedy Downdating Example

Consider the directed graph in Figure 4.9(a). We would like to remove the three edges with the lowest edge centrality as measured by eTC applied to A, eTC applied to the matrix in (4.4.9), egTC, or eLC. The graphs obtained after removing the three edges identified by these measures are shown in Figures 4.9(b-d). In this example, we allow the resulting network to be disconnected.

                                       Figure 4.9(b)   Figure 4.9(c)   Figure 4.9(d)
Total network communicability TC(A)            69.62           69.32           94.97
ThC(A) + TaC(A)                                   61              61              62
Total line graph centrality eLC(A)            160.13          158.99          261.41

Table 7: Comparison of network communicability measures for the downdated graphs in Figure 4.9.

We compute the communicability of the graphs of Figure 4.9. Table 7 shows that when removing the edges with the lowest edge line graph centrality, all the measures of network communicability have a larger value than when we remove edges using one

(a) Graph for Example 4.4.4.1. (b) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) values. (c) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) and egTC(v_i, v_j) values. (d) Downdated graph determined by removing three edges with the smallest eLC_k values. Figure 4.9: The nongreedy downdating of a directed graph.

of the other edge centrality measures. While this is not the case for every example we may encounter, it nevertheless suggests that the edge line graph centrality is an important measure of edge centrality.

4.4.4.2 Greedy Downdating Example

This example differs from the previous one only in that we now update the communicability measures of the graph after each edge removal. Removing edges based on the measure eLC(v_i, v_j) performed as well as removing edges by using the measure eTC(v_i, v_j). These two measures outperform the other measures. The downdated graphs obtained by using the various edge centrality measures are displayed in Figure 4.10, and the communicability measures for the downdated graphs are reported in Table 8. Note that the graph of Figure 4.10(a) is disconnected, while the graphs of Figures 4.10(b) and 4.10(c) are connected.

(a) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) and eLC_k values. (b) Downdated graph determined by removing three edges with the smallest eTC(v_i, v_j) values. (c) Downdated graph determined by removing three edges with the smallest egTC(v_i, v_j) values. Figure 4.10: The greedy downdating of the directed graph in Figure 4.9(a).

                                       Figure 4.10(a)   Figure 4.10(b)   Figure 4.10(c)
Total network communicability TC(A)             94.97            60.91            84.78
ThC(A) + TaC(A)                                    62               61               61
Total line graph centrality eLC(A)             261.41           127.59           219.59

Table 8: Comparison of network communicability measures for the downdated graphs in Figure 4.10.

4.5 Computing the Most Important Edges in a Directed Unweighted Network Using the Matrix Exponential

We introduced in Section 4.2 three types of connections between adjacent edges of a directed graph. Each one of these connection types yields a line graph, and the exponential of the adjacency matrices associated with these line graphs defines centrality measures for edges. We are interested in determining which edges are the most and least important ones in a directed network and investigate the application of the matrix exponential to these line graph adjacency matrices.

4.5.1 The Exponential of the Extended Line Graph Adjacency Matrix E+

We defined in Section 4.2.2 the symmetric line graph adjacency matrix E+ to represent the connections between the edges in a directed graph G by using the corresponding undirected bipartite graph. A centrality measure for edges based on E+ is given by the matrix exponential

exp(E+) = Σ_{p=0}^∞ (1/p!) (E+)^p,

which has an interesting structure. To exploit this structure, it is helpful to introduce the notion of the total degree of a node in G, which is the sum of the indegree and outdegree of the node.

Proposition 4.5.1. Let the extended line graph adjacency matrix E+ for a directed unweighted graph G be given by (4.2.1). Then E+ = B^{+T} B^+, where B^+ = [B^e  B^i]; see Proposition 4.2.1. Let m_i denote the total degree of node v_i of G. Then

(4.5.1)    exp(E+) = I + B^{+T} diag( (exp(m_1) − 1)/m_1, (exp(m_2) − 1)/m_2, (exp(m_3) − 1)/m_3, ... ) B^+.

Thus, the structure of the matrix exp(E+) is related to the structure of B^{+T} B^+, but the entries of the former matrix depend on the total degrees of the nodes. Since the function x ↦ (exp(x) − 1)/x, x > 0, is increasing, the importance of a node generally increases with its total degree.

Proof. We have

(E+)^2 = (B^{+T} B^+)(B^{+T} B^+) = B^{+T} (B^+ B^{+T}) B^+ = B^{+T} (B^e B^{eT} + B^i B^{iT}) B^+,

where we note that the matrix M = B^e B^{eT} + B^i B^{iT} is diagonal. Its nontrivial entries are the total degrees of the nodes of G. It is easy to show that for any integer p ≥ 1, we have (E+)^p = B^{+T} M^{p−1} B^+, from which it follows that

exp(E+) = I + B^{+T} B^+ + (B^{+T} M B^+)/2! + (B^{+T} M^2 B^+)/3! + ...
        = I + B^{+T} M^{−1}(exp(M) − I) B^+.

This shows (4.5.1).
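Proposition 4.5.1 is easy to verify numerically. The sketch below is our own encoding of the path graph of Figure 4.11: it builds B^e, B^i, and E+ explicitly and checks (4.5.1) entrywise.

```python
import numpy as np
from scipy.linalg import expm

# Path graph of Figure 4.11: v1 -> v2 -> v3 -> v4, edges e1, e2, e3 (0-indexed).
n, m = 4, 3
Be = np.zeros((n, m)); Bi = np.zeros((n, m))
for k, (tail, head) in enumerate([(0, 1), (1, 2), (2, 3)]):
    Be[tail, k] = 1.0        # exsurgence: edge k leaves its tail node
    Bi[head, k] = 1.0        # incidence: edge k enters its head node

Bplus = np.hstack([Be, Bi])              # B+ = [B^e  B^i]
Eplus = Bplus.T @ Bplus                  # extended line graph matrix E+
deg = Be.sum(axis=1) + Bi.sum(axis=1)    # total degrees m_1, ..., m_n

# Right-hand side of (4.5.1).
rhs = np.eye(2 * m) + Bplus.T @ np.diag((np.exp(deg) - 1) / deg) @ Bplus
```

The check `expm(Eplus)[0, 5] == 0` below also reproduces the vanishing entry [exp(E+)]_{16} discussed after Figure 4.11.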

Let us take the simple example in Figure 4.11 to test how meaningful the use of exp(E+) is for measuring the importance of edges in this graph.

Figure 4.11: Simple directed network (v1 → v2 → v3 → v4, with edges e1, e2, e3).

Proposition 4.2.1 describes the construction of the adjacency matrix E+ associated with this graph.

Its matrix exponential is given by

exp(E+) = [ 2.72     0      0      0      0      0
               0  4.19      0   3.19      0      0
               0     0   4.19      0   3.19      0
               0  3.19      0   4.19      0      0
               0     0   3.19      0   4.19      0
               0     0      0      0      0   2.72 ].

The top right quarter of the above matrix represents the propagation of signals from edges traveling through the network. The fact that the entry [exp(E+)]_{16} vanishes indicates that the edge e1 cannot connect to the edge e3 by any number of steps. But we easily see in Figure 4.11 that these edges are connected by two steps: e1 to e2 followed by e2 to e3. This illustrates that the matrix exp(E+) is poorly suited to indicate the importance of edges. We conclude that a matrix other than E+ is needed to make the exponential function meaningful for edges.

4.5.2 The Exponential of the Line Graph Adjacency Matrix E→

Thulasiraman and Swamy [64] discuss the line graph adjacency matrix E→ = B^{iT} B^e of a directed graph. It has an entry 1 in position (i, j) if and only if the edge e_i passes information to the edge e_j through a node, i.e., if and only if the head of edge e_i coincides with the tail of edge e_j. The entries of the matrix (E→)^2 tell us whether information is passed from an edge to another one through two nodes. In other words, [(E→)^2]_{ij} = 1 if there exists an edge pointing from the target node of e_i to the source node of e_j. Similarly, [(E→)^p]_{ij} counts the number of ways information is transferred from the edge e_i to the edge e_j through p nodes. The matrix exponential exp(E→) is a weighted sum of positive powers of E→, with transfers of information via many nodes having a smaller weight than transfers via few nodes; cf. (4.4.2). Note that the matrix E→ generally is nonsymmetric. Each row of exp(E→) represents an edge in its emitter role, and each column expresses its role in receiving information.

Similarly to the discussion in Section 4.4.3, we can determine the ability of an edge to transmit information through the network by ordering the elements of the row sums of the matrix exp(E→), i.e., of exp(E→)1. The largest entry of this vector corresponds to the most important transmitter. Similarly, the vector 1T exp(E→) provides an ordering of the edges in their role as information receivers, where the largest entry corresponds to the most important receiver. We will use these measures in the following examples.

We note that powers of E∨ and E∧ defined in Proposition 4.2.1 do not add any information about propagation through the network because of the nature of connections between edges that they provide. Moreover, E← = (E→)T . Therefore, calculating exp(E←)1 is equivalent to evaluating 1T exp(E→).
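The equivalence of the two receiver computations is immediate numerically, since transposition commutes with the matrix exponential. A minimal check, with a random 0/1 matrix standing in for E→ (our own test data):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
# Random 0/1 stand-in for a line graph adjacency matrix E->.
E = (rng.random((12, 12)) < 0.25).astype(float)
one = np.ones(12)

receiver_via_reverse = expm(E.T) @ one   # exp(E<-)1 with E<- = (E->)^T
receiver_via_columns = one @ expm(E)     # 1^T exp(E->)
```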

4.5.2.1 Example

We consider the ranking of edges of a graph G that is the small tree shown in Figure 4.12(a). The corresponding line graph is displayed in Figure 4.12(b). Table 9 displays the centrality measures exp(E→)1 and 1^T exp(E→). The edge e2 contributes the most to broadcasting information through the network, because it is the only edge that points to a node from which two edges emerge. The edge e1 is the next most important edge, because it is one step further away from the split at the node v3 than the edge e2. The edges e5 and e6 are dead ends and therefore are ranked as the least important transmitters. On the other hand, the latter edges have the highest capability of receiving information and therefore are ranked as the most important receivers. We conclude that the ranking of transmitters and receivers determined by the measures exp(E→)1 and 1^T exp(E→) is in agreement with intuition based on the graphs in Figure 4.12.


(a) Graph of a directed tree. (b) Line graph of a directed tree. Figure 4.12: Graph and line graph for Example 4.5.2.1.

edge e_k   [exp(E→)1]_k        edge e_k   [1^T exp(E→)]_k
e2         4                   e5         2.67
e1         3.33                e6         2.67
e3         2                   e3         2.5
e4         2                   e4         2.5
e5         1                   e2         2
e6         1                   e1         1

Table 9: Ranking of the edges of Figure 4.12 using the exponential of the matrix E→.
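The computation behind Table 9 can be sketched as follows. The edge list is read off Figure 4.12(a), and E→ is built directly from the rule that entry (k, l) is 1 when the head of e_k coincides with the tail of e_l.

```python
import numpy as np
from scipy.linalg import expm

# Edges of the tree in Figure 4.12(a), as (tail, head) pairs: e1, ..., e6.
edges = [(1, 2), (2, 3), (3, 4), (3, 5), (4, 6), (5, 7)]
m = len(edges)

# E->[k, l] = 1 iff the head of e_k coincides with the tail of e_l.
E = np.zeros((m, m))
for k, (_, head) in enumerate(edges):
    for l, (tail, _) in enumerate(edges):
        if head == tail:
            E[k, l] = 1.0

one = np.ones(m)
out_centrality = expm(E) @ one   # transmitter scores [exp(E->)1]_k
in_centrality = one @ expm(E)    # receiver scores [1^T exp(E->)]_k
```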

4.5.2.2 Example

This example illustrates the effect of a loop in the directed graph shown in Figure 4.13. Table 10 shows the edge e6 to be both the highest ranked broadcaster and the highest ranked receiver. The directed loop between the nodes v1, v4, and v6 makes the edges between these nodes strong broadcasters.

(a) Graph with a loop. (b) Line graph. Figure 4.13: Graph and line graph for Example 4.5.2.2.

edge e_k   [exp(E→)1]_k        edge e_k   [1^T exp(E→)]_k
e6         4.78                e6         3.76
e2         3.97                e1         3.23
e5         3.56                e2         3.23
e7         3.56                e3         2.89
e1         2                   e4         2.89
e3         1                   e5         2.89
e4         1                   e7         1

Table 10: Ranking of the edges of Figure 4.13 using the exponential of the matrix E→.

4.5.2.3 Flight Example I

The last example of this section uses a directed unweighted network determined by domestic flights in the US during year 2016, as reported by the Bureau of Trans- portation Statistics of the United States Department of Transportation [53]. The airports are nodes and the flight segments are edges. This yields a nonsymmetric adjacency matrix A ∈ R705×705. Since most flights have return flights, the matrix A is close to symmetric.

We determine the matrix E→ using the adjacency matrix for the flights network, and rank the departing flights by computing exp(E→)1. The six largest entries of

Most departing flights          Most landing flights
from airport   to airport       from airport   to airport
IAH            ATL              ATL            RDU
CLE            ATL              ATL            DSM
TPA            ATL              ATL            BHM
GRR            ATL              ATL            GSP
ALB            ATL              ATL            FSD
PIT            ATL              ATL            BTV

Table 11: Ranking of flight segments in the domestic flights network using the exponential of the matrix E→.

this vector determine the most important flights. They are displayed in Table 11. We rank the arriving flights by computing the largest entries of 1^T exp(E→). However, the computation of exp(E→)1 and 1^T exp(E→) may result in numerical overflow on many computers. To avoid this difficulty, we compute the spectral radius µ of E→ and evaluate exp(E→ − µI)1 and 1^T exp(E→ − µI) instead. This eliminates overflow and does not affect the relative size of the entries of the computed vectors. Therefore, the ordering is not affected by this modification. This approach of avoiding overflow has previously been applied in computations reported in [34].
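The shift cannot change the induced ordering: E→ commutes with µI, so exp(E→ − µI)1 = e^{−µ} exp(E→)1, and every entry is scaled by the same positive factor. A sketch of this check, with a random matrix standing in for E→:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
E = rng.random((15, 15))                 # stand-in for a (small) E->
one = np.ones(15)

mu = np.abs(np.linalg.eigvals(E)).max()  # spectral radius of E
shifted = expm(E - mu * np.eye(15)) @ one  # safe against overflow
plain = expm(E) @ one                      # may overflow for large networks
```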

We conclude that a flight from George Bush Intercontinental Airport in Houston to Hartsfield-Jackson Airport in Atlanta transmits the best through the network of all domestic flights in the US. The flight from Cleveland-Hopkins International Airport to Atlanta comes second, followed by the flight from Tampa International Airport to Atlanta. Similarly, the flight from Raleigh-Durham International Airport in North Carolina to Atlanta absorbs the most flights through the network of national flights. We remark that although the Los Angeles International Airport and Chicago O'Hare International Airport are among the three top ranked airports according to the Federal Aviation Administration [32], these airports are not among the ones shown in Table 11. This is because our model disregards the number of flights (if larger than one) and the number of passengers on each segment.

4.6 Computing the Most Important Edges in a Directed Weighted Network Using the Matrix Exponential

Similarly as in the previous section, we can rank the edges of a directed weighted network in their role as transmitters of information by calculating the entries of the vector exp(Ẽ→)1, where the matrix Ẽ→ = B̃^{iT} B̃^e is determined by taking weights into account as described in Section 4.3. The relative size of the entries of the vector 1^T exp(Ẽ→) provides an ordering of the edges in their role as information receivers, and the relative size of the elements of exp(Ẽ→)1 furnishes an ordering of the edges as information transmitters.

4.6.1 Example

The graph of this example is displayed in Figure 4.14(a), with the edge weight shown for each edge. Let the diagonal entries of the diagonal matrix Z contain the positive edge weights. Figure 4.14(b) shows the associated line graph for Z^i = Z^e = Z^{1/2}; see Section 4.3 for the definition of the matrices Z^i and Z^e.


(a) Weighted graph. (b) Line graph. Figure 4.14: A directed weighted tree with line graph.

Table 12 ranks the edges according to their importance as transmitters and receivers. Although the edge e5 has weight 1 and e6 has weight 3, and these edges are positioned in a similar way, the output of our algorithm suggests that the edge e5 absorbs more information than e6.

edge e_k   [exp(Ẽ→)1]_k       edge e_k   [1^T exp(Ẽ→)]_k
e1         10.77               e5         6.89
e2         6.87                e3         5.83
e3         3.00                e6         4.41
e4         2.73                e2         3.83
e5         1.00                e4         3.41
e6         1.00                e1         1.00

Table 12: Ranking of the edges of Figure 4.14 using the exponential of the matrix Ẽ→.
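The computation behind Table 12 can be sketched as follows. The edge list and weights are read off Figure 4.14(a); with the choice Z^i = Z^e = Z^{1/2}, the weighted line graph has entry √(z_k z_l) where edge e_k feeds edge e_l, which matches the line graph weights shown in Figure 4.14(b).

```python
import numpy as np
from scipy.linalg import expm

# Edges of Figure 4.14(a): (tail, head, weight) for e1, ..., e6.
edges = [(1, 2, 8.0), (2, 3, 1.0), (3, 4, 4.0),
         (3, 5, 1.0), (4, 6, 1.0), (5, 7, 3.0)]
m = len(edges)

# Weighted line graph: entry (k, l) = sqrt(z_k z_l) when head(e_k) = tail(e_l).
Et = np.zeros((m, m))
for k, (_, head_k, zk) in enumerate(edges):
    for l, (tail_l, _, zl) in enumerate(edges):
        if head_k == tail_l:
            Et[k, l] = np.sqrt(zk * zl)

one = np.ones(m)
transmit = expm(Et) @ one   # [exp(E~->)1]_k, left column of Table 12
receive = one @ expm(Et)    # [1^T exp(E~->)]_k, right column of Table 12
```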

4.6.2 Flight Example II

We take the same example studied in Section 4.5.2.3, but this time we include a weight with each edge. A weight is equal to the total number of enplanements on all the flights for that segment, as reported by the Bureau of Transportation Statistics [53]. The weights define the diagonal matrix Z and determine an edge-weighted adjacency matrix Ã as in Theorem 4.3.1. Due to the weights, a flight segment with thousands of passengers will affect the flow more than a flight segment with only a few passengers.

To avoid overflow, we evaluate exp(E→ − µI), where µ is the spectral radius of E→, similarly as in Section 4.5.2.3. Table 13 displays the six top ranked edges of the network. All of the segments in the table start and end at one of the top airports as described by the Federal Aviation Administration [32]. In particular, the segment from the Hartsfield-Jackson Airport in Atlanta to the O'Hare Airport in Chicago is the one that dissipates the highest number of passengers through the network of

Most dissipating flights        Most absorbing flights
from airport   to airport       from airport   to airport
ATL            ORD              ORD            ATL
DTW            ORD              ORD            DTW
OGG            LAX              LAX            LAS
PHL            DEN              DEN            PHL
LAS            LAX              LAX            SEA
SEA            LAX              DEN            LAX

Table 13: Ranking of the segments in the domestic flights network, taking the passenger enplanements as the segment weights, and using the exponential of the line graph adjacency matrix.

domestic flights in the US, and the same segment in the opposite direction receives the most passengers through the network. These two airports are among the top three busiest airports in the US according to the Federal Aviation Administration of the U.S. Department of Transportation [32]. This example attests to the validity of our algorithm.

4.7 Computational Aspects

We comment in this section on the computations required to evaluate

(4.7.1)    exp(E→)1  or  1^T exp(E→).

Generally, the matrix E→ is nonsymmetric. For small networks this matrix is small and can be formed explicitly. The exponential exp(E→) then can easily be evaluated, such as by the MATLAB function expm, and the desired quantities (4.7.1) can be determined. If the matrix E→ is large enough that overflow may occur when evaluating its exponential, the spectral factorization of E→ may be computed. This yields the spectral radius µ of E→. Moreover, the spectral factorization can be used to evaluate exp(E→ − µI)1 and 1^T exp(E→ − µI). These quantities may only be computable with reduced accuracy when the eigenvector matrix of E→ is severely ill-conditioned. This has not been an issue in our computations.

When the matrix E→ is large, it may be attractive to evaluate approximations of the quantities (4.7.1) with the aid of the nonsymmetric Lanczos process or the Arnoldi process. Their application does not require the matrix E→ to be formed; only matrix-vector products with E→, and possibly with its transpose, have to be computed; see [23, 34] for details. The eigenvalues of the reduced matrix computed with the nonsymmetric Lanczos or Arnoldi processes yield sufficiently accurate approximations of the spectral radius to avoid overflow in the computation of exp(E→ − µI)1 and 1^T exp(E→ − µI).
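A minimal Arnoldi sketch of this idea (our own, not the implementation used in [23, 34]): build an orthonormal basis V_m of the Krylov subspace span{1, E→1, ..., (E→)^{m−1} 1} from matrix-vector products only, and approximate exp(E→)1 ≈ ||1|| V_m exp(H_m) e_1, where H_m = V_m^T E→ V_m is a small Hessenberg matrix.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi_expv(matvec, b, k):
    """Approximate exp(E) b from a k-dimensional Krylov subspace,
    using only matrix-vector products with E."""
    n = b.size
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    V[:, 0] = b / beta
    m = k
    for j in range(k):
        w = matvec(V[:, j])
        for i in range(j + 1):              # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:             # invariant subspace found
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    # exp(E) b is approximated by beta * V_m exp(H_m) e_1.
    return beta * V[:, :m] @ expm(H[:m, :m])[:, 0]

# Stand-in for a line graph matrix; only matvecs with it are needed.
rng = np.random.default_rng(4)
E = 0.1 * rng.random((30, 30))
b = np.ones(30)
approx = arnoldi_expv(lambda x: E @ x, b, 15)
```

Only the small m × m matrix H_m is exponentiated, so the cost per step is one matrix-vector product plus the orthogonalization.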

CHAPTER 5

Node Importance in Node-Weighted Networks

via Line Graphs and the Matrix Exponential

This chapter is concerned with node-weighted graphs, for which each node is assigned a positive weight. All edges have weight one. When constructing an associated edge-weighted graph, its nodes will have weight one, and its edges will have weights as described in Section 5.2.

5.1 The Sensitivity of Node Centrality to Weight Change

This section is concerned with how the importance of a node changes when its weight is modified. In particular, we show that the rank of a node as broadcaster will increase, or at least remain the same, if its weight is increased. Related issues have been discussed by Bini et al. [14] for ranking methods based on the relative size of the entries of the left Perron vector of a row stochastic adjacency matrix of a graph. Bini et al. [14] consider the ranking of scientific papers by taking into account not only the importance of the paper, but also the importance of the authors, and possibly of the workplace. In this situation, it may be meaningful to scale the adjacency matrix to be row stochastic. Pozza and Tudesco [59] investigate the effect on the top ranked nodes of adding a new edge to a graph or increasing the weight of an existing edge, using the subgraph centrality measure furnished by the diagonal entries of the exponential of the adjacency matrix.

To simplify notation, we denote the edge-weighted adjacency matrix by A in this section (this matrix is referred to as Ã elsewhere in this chapter). Let the weight of node v_s increase. Then the edge(s) exsurging from node v_s increase in weight, while no edge exsurging from v_s decreases in weight. With each increment δ in the weight A_st of the edge pointing from node v_s to node v_t, the adjacency matrix associated with the perturbed graph can be expressed as

Â = A + δ 1_s 1_t^T,

where 1_t = [0, ..., 0, 1, 0, ..., 0]^T denotes the t-th axis vector. We show in Section 5.1.2 that the ADR measure increases the most for node v_s, under certain conditions on the graph. Therefore, its ranking as broadcaster either increases or remains the same.

5.1.1 Preliminaries

The results of this section will be applied in later sections.

Lemma 5.1.1. Let the matrix A ∈ R^{n×n} with nonnegative entries satisfy

(5.1.1)    A1 ≤ 1,

where the inequality is component-wise. Then

(5.1.2)    A^p 1 ≤ 1,    p ≥ 1,

and

(5.1.3)    exp(A)1 ≤ e1.

These inequalities are sharp.

Proof. The result is easily shown by induction. The inequality (5.1.2) holds for p = 1 by assumption. Assume it holds for some p ≥ 1. Then

A^{p+1} 1 = A(A^p 1) ≤ A1 ≤ 1,

where we have used that all entries of A are nonnegative. For the exponential, we have

exp(A)1 = Σ_{p=0}^∞ A^p 1/p! ≤ (Σ_{p=0}^∞ 1/p!) 1 = e1.

The inequalities (5.1.2) and (5.1.3) become equalities for certain matrices, including the identity matrix and the cyclic shift matrix. The latter corresponds to an unweighted cyclic graph.
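Both the inequalities and the sharpness claim are easy to check numerically. The sketch below uses a scaled random matrix satisfying (5.1.1) and the cyclic shift matrix (our own test pair, not from the text):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 6
one = np.ones(n)

# Nonnegative A with A1 <= 1 (all row sums equal to 0.9 here).
M = rng.random((n, n))
A = 0.9 * M / M.sum(axis=1, keepdims=True)

# Cyclic shift matrix: adjacency matrix of a directed cycle, with P1 = 1.
P = np.roll(np.eye(n), 1, axis=1)
```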

Results analogous to those of Lemma 5.1.1 of course also can be shown for AT .

Lemma 5.1.2. Let the matrix A satisfy the conditions of Lemma 5.1.1. Then for any nonnegative integer n_2,

(5.1.4)    Σ_{n_1=1}^∞ ((n_2 + 1)!/(n_2 + n_1 + 1)!) A^{n_1} 1 < 1,

where the inequality holds component-wise.

Proof. We first bound the coefficients in (5.1.4) by induction over n_1. For n_1 = 1, we have

(n_2 + 1)!/(n_2 + 2)! = 1/(n_2 + 2) ≤ 1/2 = 1/(2 · 1!).

Let n_1 ≥ 1 be an arbitrary integer, and assume that

(n_2 + 1)!/(n_2 + n_1 + 1)! < 1/(2 · n_1!).

We would like to show that the above inequality holds for n_1 replaced by n_1 + 1. Using the above inequality, we obtain

(n_2 + 1)!/(n_2 + n_1 + 2)! = (n_2 + 1)!/[(n_2 + n_1 + 1)!(n_2 + n_1 + 2)] < 1/[2 · n_1!(n_2 + n_1 + 2)] < 1/[2 · n_1!(n_1 + 1)] = 1/[2(n_1 + 1)!].

It follows that

(5.1.5)    Σ_{n_1=1}^∞ ((n_2 + 1)!/(n_2 + n_1 + 1)!) A^{n_1} 1 < Σ_{n_1=1}^∞ (1/(2(n_1 + 1)!)) A^{n_1} 1 < (1/2)(exp(A)1 − 1) ≤ ((e − 1)/2) 1 < 1,

where the penultimate inequality is a consequence of (5.1.3). This shows the lemma.

5.1.2 Matrix Perturbation Results

This section considers adjacency matrices that satisfy the conditions of Lemma 5.1.1. As usual, 1 = [1, 1, ..., 1]^T ∈ R^n is the vector of all ones, and 1_j = [0, ..., 0, 1, 0, ..., 0]^T ∈ R^n denotes the j-th axis vector for j = 1, ..., n. We will perturb the entry A_st of the adjacency matrix A ∈ R^{n×n} by δ. This perturbation is denoted by δA. Thus, δA = δ 1_s 1_t^T. We will use the formulas

δA 1 = δ 1_s,    A δA = δ A 1_s 1_t^T,    δA A = δ 1_s 1_t^T A.

When s ≠ t, we have (δA)^2 = 0.

Theorem 5.1.3. Let the adjacency matrix A = [A_ij] ∈ R^{n×n} for the graph G satisfy the conditions of Lemma 5.1.1. Add δ > 0 to the matrix entry A_st for some s ≠ t, without changing any of the other entries of A. If δ is small enough, the ADR value of the vertex v_s increases more than the ADR value of any other vertex. It follows that the rank of the vertex v_s as a broadcaster either increases or stays the same. More precisely, let δA = δ 1_s 1_t^T and Â = A + δA. Then

(5.1.6)    [exp(Â)1]_s − [exp(A)1]_s > [exp(Â)1]_q − [exp(A)1]_q,    ∀q ≠ s.

Proof. Application of the binomial expansion gives

(5.1.7)    exp(Â) − exp(A) = Σ_{p=0}^∞ (1/p!)((A + δA)^p − A^p)
                           = δA + Σ_{p=2}^∞ (1/p!)(A^{p−1} δA + A^{p−2} δA A + ... + A δA A^{p−2} + δA A^{p−1}) + O(δ^2).

Multiplying the terms in the above sum by 1 from the right-hand side gives

for p = 2:  (1/2!)(A δA + δA A)1 = (δ/2!)(A 1_s + 1_s 1_t^T A 1),
for p = 3:  (1/3!)(A^2 δA + A δA A + δA A^2)1 = (δ/3!)(A^2 1_s + A 1_s 1_t^T A 1 + 1_s 1_t^T A^2 1),
for p = 4:  (1/4!)(A^3 δA + A^2 δA A + A δA A^2 + δA A^3)1 = (δ/4!)(A^3 1_s + A^2 1_s 1_t^T A 1 + A 1_s 1_t^T A^2 1 + 1_s 1_t^T A^3 1),
... .

Adding all the above terms "column-wise" and substituting into (5.1.7) multiplied by 1 from the right-hand side, we get

(5.1.8)    (exp(Â) − exp(A))1 = δ [Σ_{n_1=0}^∞ A^{n_1} 1_s/(n_1 + 1)!]
                               + δ [Σ_{n_1=0}^∞ A^{n_1} 1_s/(n_1 + 2)!] (1_t^T A 1)
                               + δ [Σ_{n_1=0}^∞ A^{n_1} 1_s/(n_1 + 3)!] (1_t^T A^2 1) + ... + O(δ^2)
                             = δ Σ_{n_1=0}^∞ Σ_{n_2=0}^∞ (A^{n_1} 1_s/(n_1 + n_2 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2).

It follows that

(5.1.9)    [exp(Â)1]_s − [exp(A)1]_s = 1_s^T (exp(Â) − exp(A)) 1
                                     = δ Σ_{n_1=0}^∞ Σ_{n_2=0}^∞ (1_s^T A^{n_1} 1_s/(n_1 + n_2 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2)
                                     ≥ δ Σ_{n_2=0}^∞ (1/(n_2 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2),

where the inequality is obtained by ignoring all terms with n_1 > 0. Now applying Lemma 5.1.2, and using the inequality 1_q^T A^{n_1} 1_s ≤ 1_q^T A^{n_1} 1 and the fact that 1_q^T A^0 1_s = 0 for q ≠ s, gives

Σ_{n_2=0}^∞ (1/(n_2 + 1)!) (1_t^T A^{n_2} 1) > Σ_{n_2=0}^∞ Σ_{n_1=1}^∞ ((n_2 + 1)!/(n_2 + n_1 + 1)!) (1_q^T A^{n_1} 1) (1/(n_2 + 1)!) (1_t^T A^{n_2} 1)
                                             ≥ Σ_{n_2=0}^∞ Σ_{n_1=1}^∞ (1_q^T A^{n_1} 1_s/(n_2 + n_1 + 1)!) (1_t^T A^{n_2} 1)
                                             = Σ_{n_2=0}^∞ Σ_{n_1=0}^∞ (1_q^T A^{n_1} 1_s/(n_2 + n_1 + 1)!) (1_t^T A^{n_2} 1).

Comparing this expression with (5.1.9) shows that

[exp(Â)1]_s − [exp(A)1]_s > [exp(Â)1]_q − [exp(A)1]_q + O(δ^2).

This concludes the proof.
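Theorem 5.1.3 can be illustrated numerically: scale a random nonnegative adjacency matrix so that (5.1.1) holds, add a small δ to one entry, and check that the ADR increase is largest at the perturbed row. The instance below is our own, not one from the text.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n, s, t, delta = 8, 2, 5, 1e-3

A = rng.random((n, n))
np.fill_diagonal(A, 0.0)
A /= A.sum(axis=1).max()     # now A1 <= 1, i.e. (5.1.1) holds

one = np.ones(n)
dA = np.zeros((n, n)); dA[s, t] = delta      # perturbation delta*1_s 1_t^T
diffs = expm(A + dA) @ one - expm(A) @ one   # ADR change for each node
```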

The above proof shows that

[exp(Â)1]_q − [exp(A)1]_q = δ Σ_{n_2=0}^∞ Σ_{n_1=0}^∞ (1_q^T A^{n_1} 1_s/(n_2 + n_1 + 1)!) (1_t^T A^{n_2} 1) + O(δ^2).

The terms have the following network interpretation: The expression 1_q^T A^{n_1} 1_s is the weighted number of walks of length n_1 from node v_q to node v_s, and the expression 1_t^T A^{n_2} 1 is the weighted number of walks of length n_2 from node v_t to any node in the network. Finally, δ 1_q^T A^{n_1} 1_s 1_t^T A^{n_2} 1 is the weighted number of walks of length n_2 + n_1 + 1 that reach node v_s from node v_q in n_1 steps, go through the edge pointing from node v_s to v_t (counted with its weight δ), and continue to any node of the network in n_2 steps; see Figure 5.1.

Corollary 5.1.4. Let the conditions of Theorem 5.1.3 hold. Assume in addition that the network consists of two clusters that are only connected by one directed edge, from node v_s in the first cluster to node v_t in the second one. Then (5.1.6) holds for any δ > 0.

Figure 5.1: A walk from v_q to v_s, then to v_t, then to v_j (n_1 steps from v_q to v_s, followed by the edge (v_s, v_t), followed by n_2 steps to v_j).

Proof. Let G_1 be the graph of the first cluster of n_1 nodes, including the node v_s, and denote the associated adjacency matrix by A_1 ∈ R^{n_1×n_1}. Similarly, let G_2 be the graph of the second cluster of n_2 nodes, including the node v_t, and let A_2 ∈ R^{n_2×n_2} be the adjacency matrix for G_2. We can arrange the rows and columns of A to have the form

(5.1.10)    A = [ A_1  B_1 ; 0  A_2 ],

where B_1 is an n_1 × n_2 matrix with all entries zero except for the entry (s, t), which is the weight of the edge going from node v_s to node v_t. The lower left block of A is an n_2 × n_1 matrix with all entries zero, because no edge goes from G_2 to G_1. We can show by induction that all powers of A have the structure

A^p = [ A_1^p  B_p ; 0  A_2^p ],

where B_p is some n_1 × n_2 matrix. Indeed, by (5.1.10), the matrix A^p has the desired structure for p = 1. Suppose A^p has the desired structure. Then

A^{p+1} = A^p A = [ A_1^p  B_p ; 0  A_2^p ][ A_1  B_1 ; 0  A_2 ] = [ A_1^{p+1}  A_1^p B_1 + B_p A_2 ; 0  A_2^{p+1} ] = [ A_1^{p+1}  B_{p+1} ; 0  A_2^{p+1} ].

It follows that [A^p]_{ts} = 0 for all p = 1, 2, 3, ..., because this entry is in the lower left quadrant of A^p.

Let δA = δ 1_s 1_t^T with s ≠ t. Then (δA)^p = 0 for all p > 1. In addition,

δA A^p δA = δ 1_s 1_t^T A^p δ 1_s 1_t^T = δ^2 [A^p]_{ts} 1_s 1_t^T = 0.

Therefore, all terms in O(δ^2) in (5.1.7) vanish. This eliminates the need to require that 0 < δ ≪ 1 in the proof of Theorem 5.1.3.

Corollary 5.1.5. Let the conditions of Theorem 5.1.3 hold. Assume in addition that the graph G is such that there exists no walk from node v_t to node v_s. Then (5.1.6) holds for any δ > 0.

Proof. Since there exist no walks from node v_t to node v_s, we have [A^p]_{ts} = 0 for all p = 1, 2, 3, ... . Hence,

δA A^p δA = δ 1_s 1_t^T A^p δ 1_s 1_t^T = δ^2 [A^p]_{ts} 1_s 1_t^T = 0.

Similarly to the proof of Corollary 5.1.4, we conclude that all terms in O(δ^2) in (5.1.7) vanish, and the desired result follows.

Corollary 5.1.6. Let the transpose of the adjacency matrix A = [A_ij] ∈ R^{n×n} for the graph G satisfy the conditions of Lemma 5.1.1. Add δ > 0 to the matrix entry A_st, for some s ≠ t, without changing any of the other entries of A. If δ is small enough, the AUR value of the vertex v_t increases more than the AUR value of any other vertex. It follows that the rank of the vertex v_t as a receiver either increases or stays the same. More precisely, let δA = δ 1_s 1_t^T and Â = A + δA. Then

[exp(Â^T)1]_t − [exp(A^T)1]_t > [exp(Â^T)1]_q − [exp(A^T)1]_q,    ∀q ≠ t.

Proof. The result follows by applying Theorem 5.1.3 to the matrix AT .

5.1.3 Example on Sensitivity to Weight Change

Consider the weighted network in Figure 5.2. The numbers in parentheses are the weights of the edges. To satisfy the condition of Theorem 5.1.3, we scale the adjacency matrix A for the graph by the maximum of the largest column sum and the largest row sum; that is, we divide all the adjacency matrix entries by 11.

We increase the weight of the edge pointing from node v_1 to node v_3 by δ = 0.01. The new adjacency matrix is Â = A + δ 1_1 1_3^T. According to Theorem 5.1.3, node v_1 gets the highest ADR increase, which is in agreement with the values reported in Table 1. By Corollary 5.1.6, no node should get a larger increase in its AUR value than node


Figure 5.2: Network for the example in Section 5.1.3.

v_3, which is also shown in Table 1.

Dissipating nodes (ADR)                        Absorbing nodes (AUR)
node v_q   [exp(Â)1]_q − [exp(A)1]_q           node v_q   [exp(Â^T)1]_q − [exp(A^T)1]_q
v1         2.395                               v3         2.168
v7         0.299                               v2         0.280
v10        0.100                               v5         0.280
v8         0.008                               v6         0.074

Table 1: Ranking top 4 nodes of the graph in Figure 5.2 showing ADR and AUR values before and after perturbation of the edge pointing from node v1 to node v3. The graph G is scaled to satisfy the conditions of Lemma 5.1.1.

We found experimentally that the conclusion of Theorem 5.1.1 holds for a larger class of networks than allowed by the theorem, but it does not hold for all networks. For instance, scaling the adjacency matrix A so that (5.1.1) holds is not required for all networks in order for the conclusion of Theorem 5.1.1 to hold. In the examples in the following sections, we will not scale the adjacency matrices to enforce (5.1.1).
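The perturbation experiment of this section can be reproduced numerically. The following is a minimal NumPy/SciPy sketch on a hypothetical 4-node directed graph (not the network of Figure 5.2; the dissertation's own computations use MATLAB's expm):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 4-node directed weighted graph (an illustrative stand-in
# for the network of Figure 5.2).
A = np.array([[0., 3., 1., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 4.],
              [1., 0., 0., 0.]])

# Scale the adjacency matrix by the maximum of the largest column sum
# and the largest row sum, as done for the example of this section.
scale = max(A.sum(axis=0).max(), A.sum(axis=1).max())
A = A / scale

# Perturb the edge v1 -> v3:  A_hat = A + delta * 1_1 1_3^T.
delta = 0.01
A_hat = A.copy()
A_hat[0, 2] += delta

one = np.ones(4)
adr_gain = expm(A_hat) @ one - expm(A) @ one      # change in ADR values
aur_gain = expm(A_hat.T) @ one - expm(A.T) @ one  # change in AUR values

# The source node v1 gets the largest ADR increase and the target node
# v3 the largest AUR increase, as Theorem 5.1.3 and Corollary 5.1.6 predict.
print(adr_gain.argmax(), aur_gain.argmax())
```

The perturbed source and target nodes dominate both difference vectors, in agreement with Table 1.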

5.2 Node-Weighted to Edge-Weighted

In this section we consider ways of incorporating node weights into an adjacency matrix. Unlike in the case of edge weights, there is no single natural approach to encode node weights into a single matrix. But, as will be seen below, different approaches can prove useful in some circumstances.

As shown in Section 2.5.0.1, the weighted adjacency matrix Ã of an edge-weighted graph can be defined in a natural way as Ã = B^e Z B^{iT}, where B^i and B^e are the unweighted incidence and exsurgence matrices, respectively, and Z is a diagonal matrix holding the edge weights z_1, ..., z_m. This can be rewritten as Ã = B̃^e B̃^{iT} using weighted incidence and exsurgence matrices defined by B̃^i = B^i Z_1 and B̃^e = B^e Z_2, where Z_2 Z_1 = Z (as in [24]). However, the choice of Z_1 and Z_2 is not unique.

Assume we are given node weights w_1, ..., w_n, and let W be a diagonal matrix containing them. The goal of encoding these weights into an adjacency matrix Ã implies that we must find edge weights Z = H(W) depending only on W. In principle, each z_k may depend on all of the node weights, but we will consider only “local” dependencies, that is, each edge weight z_k will be a function h of the node weights of its endpoints. Below we discuss various ways of defining Z = H(W) and, for each approach, we describe a scenario where it would be appealing to choose that method. We will also use the “toy” node-weighted graphs in Figure 5.3 to compare the discussed modeling approaches. Weights are given in parentheses.

Figure 5.3: Node-weighted sample graphs. (a) A directed chain. (b) A cyclic network. (c) Random undirected. (d) Random directed.

5.2.1 Edge Weights from Endpoint Node Weights

As mentioned above, if the nodes v_i, v_j are the endpoints of edge e_k, then z_k = h(w_i, w_j) for some function h. If the network is undirected, h must be symmetric (i.e., h(x, y) = h(y, x) for all x, y).

5.2.1.1 Sum of Endpoint Node Weights

Consider a network consisting of buildings as nodes, and the street linking two buildings as an edge. Each one of these edges should be built large enough to accommodate all occupants from both buildings in case they have to escape a fire where they live. If this network is given as node-weighted, with a node's weight corresponding to the building capacity, then we can convert it into an edge-weighted network with each edge weight equal to the sum of its endpoint node weights. This weighting is most meaningful for undirected graphs. We determine Z from W by calculating the sum of node weights (snw),

(5.2.1)  Z = snw(W), or z_k = w_i + w_j.

This defines the function H.

Figure 5.4 shows the edge-weighted graphs obtained by assigning each edge the sum of the weights from Figure 5.3 of the nodes it connects. Zou et al. [66] assigned to each edge half the sum of its endpoint node weights. This is simply a scaling by 1/2 of the adjacency matrix that we obtain.

5.2.1.2 Product of Endpoint Node Weights

Another approach is to assume that the computed edge weight is proportional to each of the two endpoint node weights. Symmetry considerations indicate that the constant of proportionality should be the same for all edges. Then, up to a scaling

Figure 5.4: Edge-weighted graphs obtained from the graphs in Figure 5.3 using the snw method. (a) A directed chain. (b) A cyclic network. (c) Random undirected. (d) Random directed.

factor, we have that z_k = h(w_i, w_j) = w_i w_j. This gives the product node weights (pnw),

(5.2.2)  Z = pnw(W), or z_k = w_i × w_j.

A situation in which this approach to defining the function H is reasonable is when the nodes represent cities, the edges represent roads connecting the cities, and the traffic between the cities is assumed to be proportional to their sizes (populations).

Another example is when the node weights correspond to probabilities of the node becoming “activated” at a given time. If different nodes get activated independently of each other, and we define an edge becoming activated when both endpoints are activated, then the product node weight scheme provides the probability of activation for each edge. Variations on this scheme (e.g., products of specified powers of node

weights) are described in Section 5.2.2.

5.2.1.3 Inheriting the Weight of an Endpoint Node

When the network is directed, it may be meaningful for h to be non-symmetric.

Obvious examples are h(w_i, w_j) = w_i and h(w_i, w_j) = w_j; these correspond, respectively, to inheriting the weight of the source node, which we call pull node weight

(pll), and inheriting the weight of the target node, which we call push node weight

(psh). The pull node weight was considered in [58] and the push node weight in [40].

These approaches correspond to the assumption that the edge weight is proportional to the weight of the source node (or target node). In the random activation example described above, it would correspond to assuming that the edge is activated whenever its source or target is activated.

However, if an edge inherits its weight from its source node, then the weights of terminal nodes (those with no edges exsurging from them) will be missing from the line graph representation. If an edge inherits its weight from its target node, then the weights of the vertices not pointed to by any edge will not appear in the line graph.

One solution is to add a dummy node together with an edge inheriting the otherwise lost node weight, as in [58].

5.2.2 The Case when h is Factorizable

Whenever the function h can be factored as

(5.2.3)  h(w_i, w_j) = h_1(w_i) h_2(w_j),

the relationship between W and Z = H(W) can be expressed neatly in matrix-algebraic terms.

Assume (5.2.3), and let H_1(W) be the diagonal matrix with the kth diagonal entry equal to h_1(w_i), whenever v_i is the source node of the edge e_k, for k = 1, ..., m; similarly for H_2(W) (using target nodes). Then H(W) = H_1(W) H_2(W), and we have

(5.2.4)  Ã = B^e Z B^{iT} = B^e H(W) B^{iT} = B^e H_1(W) H_2(W) B^{iT}.

If we define h_1(W) as the entry-wise application of h_1 to the diagonal elements of W, and similarly for h_2(W), then h_1(W) B^e = B^e H_1(W) and h_2(W) B^i = B^i H_2(W). Therefore,

(5.2.5)  Ã = h_1(W) B^e B^{iT} h_2(W) = h_1(W) A h_2(W).

Thus, h_1(W) B^e and B^e H_1(W) are equivalent definitions for B̃^e (and similarly for B̃^i).

An important case in which h is factorizable is when h(x, y) = x^α y^β, for fixed α, β ∈ R. We call the approach of obtaining Z from W in this way product node weight alpha beta (pnwαβ). In this case, Z = pnwαβ(W) satisfies

B^e Z B^{iT} = W^α A W^β.

This weighting scheme includes most of the schemes described above as special cases: α = β = 1 corresponds to product node weight; α = 1, β = 0 corresponds to pull node weight; α = 0, β = 1 corresponds to push node weight. Notice that h(x, y) = x^α y^β is symmetric (and therefore applicable to undirected networks) only when α = β; this includes the case α = β = 1/2, when each edge weight is the geometric mean of the endpoint node weights.

Negative values of α or β may make sense in some modeling situations. For example, α = 1 and β = −1 corresponds to the case when each edge weight is proportional to the weight of the source node and inversely proportional to the weight of the target node.
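The pnwαβ family is a one-liner in matrix form, Ã = W^α A W^β. A minimal NumPy sketch, again on the chain of Figure 5.3(a), recovering pnw, pull, and push as special cases:

```python
import numpy as np

# pnw_alpha_beta: encode node weights W into the adjacency matrix via
# A~ = W^alpha A W^beta.  Chain of Figure 5.3(a): v1(4)->v2(2)->v3(7)->v4(1).
w = np.array([4., 2., 7., 1.])
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = 1.0

def pnw_ab(A, w, alpha, beta):
    """Edge-weighted adjacency W^alpha A W^beta for node weights w."""
    return np.diag(w ** alpha) @ A @ np.diag(w ** beta)

A_pnw = pnw_ab(A, w, 1.0, 1.0)   # product node weight: z_k = w_i * w_j
A_pll = pnw_ab(A, w, 1.0, 0.0)   # pull: edge inherits its source weight
A_psh = pnw_ab(A, w, 0.0, 1.0)   # push: edge inherits its target weight
print(A_pnw[0, 1], A_pll[0, 1], A_psh[0, 1])   # 8.0 4.0 2.0
```

For the edge v1 → v2, the three choices give w1·w2 = 8, w1 = 4, and w2 = 2, respectively.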

5.2.3 Node Weights to Line Graph Edge Weights

Since, roughly speaking, the roles of nodes and edges of a graph G are interchanged

in the associated line graph G∗ (see Section 4.2), we investigate how a given set of

node weights for G can be incorporated into an adjacency matrix for G∗.

5.2.3.1 Simple Weighting

By using the expressions that relate the adjacency matrix of G∗ to the incidence

matrix (or incidence matrices) for G (see Section 4.2), we obtain expressions for

incorporating node weights W for G into the adjacency matrix for G∗.

Consider first the directed case. Here we have that E→ = B^{iT} B^e. Similarly to the expression for the weighted adjacency matrix Ã in (2.5.1), we define the simply weighted adjacency matrix of the line graph as

(5.2.6)  Ẽ→_SW = B^{iT} W B^e.

For undirected networks, the unweighted adjacency matrix of the undirected line graph is E = B^T B − 2 I_m, where B is the incidence matrix described in Subsection 2.3. We define the simply weighted adjacency matrix of the line graph as

(5.2.7)  Ẽ_SW = B^T W B − C,

where C is the diagonal matrix with c_kk = w_i + w_j, whenever v_i and v_j are the endpoints of the edge e_k, k = 1, ..., m. Figure 5.5 shows edge-weighted line graphs corresponding to the graphs in Figure 5.3. In (c), all connections in the left cluster have a weight of 1, and those to the right have a weight of 4.

Figure 5.5: Edge-weighted line graphs of the node-weighted sample graphs in Figure 5.3. (a) A directed chain. (b) A cyclic network. (c) Random undirected. (d) Random directed.
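The directed simply weighted line-graph adjacency (5.2.6) can be sketched directly from the incidence and exsurgence matrices; a minimal NumPy version for the chain of Figure 5.3(a):

```python
import numpy as np

# Simply weighted line-graph adjacency (5.2.6) for a directed graph:
# E~_SW = B^{iT} W B^e, where B^i marks each edge's target node and
# B^e its source node.  Chain of Figure 5.3(a).
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
w = np.array([4., 2., 7., 1.])
m = len(edges)

Bi = np.zeros((n, m))   # incidence: node i is the target of edge k
Be = np.zeros((n, m))   # exsurgence: node i is the source of edge k
for k, (i, j) in enumerate(edges):
    Be[i, k] = 1.0
    Bi[j, k] = 1.0

E_sw = Bi.T @ np.diag(w) @ Be
# [E_sw]_{kl} is the weight of the shared node when edge e_k feeds
# edge e_l: e1 -> e2 carries w2 = 2, e2 -> e3 carries w3 = 7,
# matching the line graph in Figure 5.5(a).
print(E_sw)
```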

We notice that this method does not capture the weights of nodes that are only in direct contact with one edge in the original graph. In order to accommodate them without changing the existing edges, we add a self-loop at each node of the graphs in Figure 5.3, and then create the associated line graph. (We only show this approach in Figure 5.6 for the network in Figure 5.3(a), but we perform it on all networks from this point onward.) Note that while these self-loops add nodes and edges in the line graph, we are not concerned with their ranking.

Figure 5.6: Self-loops added to the graph in Figure 5.3(a) and the corresponding line graph. (a) Adding self-loops. (b) Edge-weighted line graph. The effect of these additions is drawn in gray.

5.2.3.2 Scaling by Node Degree

As is well known, a node v in G does not necessarily become a single edge in G∗; in fact, in undirected graphs, each node v_i produces a complete subgraph (a clique) in G∗, containing (d_{v_i} choose 2) edges in G∗, where d_{v_i} is the degree of v_i in G. The simple weighting approach described above assigns the weight w_i to all those (d_{v_i} choose 2) edges in G∗.

For example, in Figure 5.5(c), we notice that all edges of the cluster to the left have weight 1, which is the weight of node v4 connecting these edges in Figure 5.3(c). From a modeling point of view, one can argue that in some applications v4 should distribute its weight to the surrounding edges, that is, each edge can communicate with a weight of 1/4. This suggests scaling the weights of those edges by the degree of v4.

For an undirected network where D is the diagonal matrix holding the degrees of its nodes, we define the degree scaled weighted adjacency matrix of the line graph as

(5.2.8)  Ẽ_DS = B^T W D^{−1} B − C_DS,

where C_DS = diag(c_kk) is the diagonal matrix with c_kk = w_i/d_{v_i} + w_j/d_{v_j}, and v_i and v_j are the endpoints of the edge e_k in G, of degrees d_{v_i} and d_{v_j}, respectively.

For directed networks, each node v_i in G results in indegree(v_i) × outdegree(v_i) edges in G∗ (connecting each of the G-edges inciding on v_i to each of the G-edges exsurging from v_i). The number of edges in the line graph depends on the number of edges exsurging from the nodes in the original graph. Let D_out be the diagonal matrix holding the out-degrees of the nodes of G and define the out-degree scaled weighted adjacency matrix of the line graph as

(5.2.9)  Ẽ→_ODS = B^{iT} W D_out^{−1} B^e.

The in-degree scaled version Ẽ→_IDS is defined similarly.

5.2.3.3 Strong Degree Scaling

Rather than scaling node v_i by its degree d_{v_i}, it makes sense in some situations to divide by the number of corresponding edges in G∗. For undirected graphs, this means dividing by (d_{v_i} choose 2). The algebra is similar to the scaling above, using a diagonal matrix D_s that contains the values (d_{v_i} choose 2), i = 1, ..., n, instead of the matrix D. This gives the strongly degree scaled adjacency matrix Ẽ_SDS.

Likewise, for directed networks, we define the strongly degree scaled weighted adjacency matrix of the line graph as

(5.2.10)  Ẽ→_SDS = B^{iT} W D_io^{−1} B^e,

where D_io = D_in D_out. Here D_in is a diagonal matrix holding the indegrees of the nodes, and D_out is the one holding their outdegrees. To avoid dividing by zero whenever a node is a source (and therefore has zero indegree) or a sink (and thus has zero outdegree), we add self-loops to each node of G before deriving G∗, as described above.
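A minimal NumPy sketch of the strong degree scaling (5.2.10), with self-loops added first so that no node has zero in- or out-degree; the graph is again the chain of Figure 5.3(a):

```python
import numpy as np

# Strongly degree scaled line-graph adjacency (5.2.10):
# E~_SDS = B^{iT} W D_io^{-1} B^e, with D_io = D_in D_out.
# Self-loops are added before forming the incidence matrices.
n = 4
w = np.array([4., 2., 7., 1.])                 # node weights, Figure 5.3(a)
edges = [(0, 1), (1, 2), (2, 3)] + [(i, i) for i in range(n)]
m = len(edges)

Bi, Be = np.zeros((n, m)), np.zeros((n, m))
for k, (i, j) in enumerate(edges):
    Be[i, k] = 1.0                             # exsurgence (source)
    Bi[j, k] = 1.0                             # incidence (target)

d_out = Be.sum(axis=1)                         # out-degrees (with loops)
d_in = Bi.sum(axis=1)                          # in-degrees (with loops)
D_io_inv = np.diag(1.0 / (d_in * d_out))       # no zero degrees remain

E_sds = Bi.T @ np.diag(w) @ D_io_inv @ Be
print(E_sds.shape)                             # one row/column per edge
```

For instance, the entry coupling e1 to e2 is w2/(d_in(v2)·d_out(v2)) = 2/(2·2) = 0.5.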

5.3 Computing Node Importance in Node-Weighted Networks

Based on the several ways to incorporate node weights into adjacency matrices described above, we can now use aggregate downstream and upstream reachability measures (described in Chapter 3) to find the most important nodes and edges in the network. To rank the edges, we apply (4.4.3) (if the network is undirected) and

(4.4.11) and (4.4.12) (if the network is directed). Which one of the line graphs from

Section 5.2.3 (Ẽ_SW, Ẽ_DS, Ẽ_SDS, or their counterparts for directed networks) is most appropriate depends on the application. Recall that we will add self-loops before deriving the line graph.

We apply (3.2.4) and (3.2.5) to rank the nodes and have to decide which edge-weighted adjacency matrix to use. The edge weights may be defined by snw, pnw, psh, pll, or pnwαβ. In addition, we may determine edge weights as follows: after ranking the edges of the graph via the line graph, we can plug these values as edge weights (without the self-loops) into the original graph, i.e.,

(5.3.1)  z_k = {eLC}_k, from (4.4.3), if the network is undirected,

(5.3.2)  z_k = {eLC_out}_k + {eLC_in}_k, from (4.4.11) and (4.4.12), if it is directed,

and get the edge-weighted adjacency matrix via (2.5.1), while removing the diagonal corresponding to the added self-loops.

The advantage of this method is that it determines the weight of each edge taking into consideration not only the weights of its source and target nodes, but also all other nodes in the network, and the network architecture. Therefore the edge weights are already well informed by the whole graph.
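Once edge weights z_k are available (from any of the schemes above), the node-ranking step is a direct matrix-exponential computation. A small NumPy/SciPy sketch with hypothetical edge weights on the chain of Figure 5.3(a):

```python
import numpy as np
from scipy.linalg import expm

# Node ranking from an edge-weighted adjacency matrix: ADR = exp(A~)1
# ranks transmitters, AUR = exp(A~^T)1 ranks receivers.
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
z = np.array([6., 9., 8.])            # hypothetical edge weights

A_tilde = np.zeros((n, n))
for k, (i, j) in enumerate(edges):
    A_tilde[i, j] = z[k]

one = np.ones(n)
adr = expm(A_tilde) @ one             # aggregate downstream reachability
aur = expm(A_tilde.T) @ one           # aggregate upstream reachability
print(adr.argmax(), aur.argmax())     # 0 3: v1 best transmitter, v4 best receiver
```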

To rank the nodes of the network, we calculate the ADR and AUR of (3.2.4) and (3.2.5)

and compare the values between nodes.

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW      ODS     SDS     SW      ODS     SDS
e1         64.05   12.06   3.55      5.00    3.00   3.00
e2         51.17   17.92   6.72     13.67    4.83   2.71
e3          2.00    2.00   1.50    135.87   25.49   7.99

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW        ODS       SDS     SW        ODS       SDS
v1         1.0517E5  1.7569E3  136.16  0.0000E5  0.0010E3    1.00
v2         0.0454E5  0.3364E3   55.19  0.0007E5  0.0161E3    7.55
v3         0.0014E5  0.0285E3   10.49  0.0230E5  0.1951E3   41.32
v4         0.0000E5  0.0010E3    1.00  1.0747E5  1.9107E3  152.96

Table 2: Top part: Ranking of the edges of example (a) in Figure 5.3 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the adjacency matrix options of the line graph described in Section 5.2.3. The options are simply weighted (SW): Ẽ→_SW = B^{iT} W B^e, out-degree scaled (ODS): Ẽ→_ODS = B^{iT} W D_out^{−1} B^e, and strongly degree scaled (SDS): Ẽ→_SDS = B^{iT} W D_io^{−1} B^e. Bottom part: for the options above, the edges in Figure 5.3 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

We turn to the graphs in Figure 5.3 and rank their edges and nodes. Tables 2, 3, and 5 show the calculated measures for the components of the graphs in Figure 5.3(a), (b), and (d). The top parts display the values of {eLC_out}_k and {eLC_in}_k for each edge e_k, using the SW, ODS, and SDS adjacency matrix options of the line graph as described in Section 5.2.3.

To turn the original network into an edge-weighted one, we assign each edge a weight equal to the sum of its in and out values as in (5.3.2). The corresponding weighted adjacency matrix becomes Ã = B^e Z B^{iT}. The bottom parts of Tables 2, 3, and 5 give the ADR and AUR values for each node v_i using the newly calculated Ã. We notice that the SW method becomes computationally expensive very fast. The ODS and SDS options give the same rankings in Tables 2 and 3, since the largest out-degree in those networks is 2, which is equal to its factorial value. Most rankings are the same for all methods.

The example in Figure 5.3(c) is undirected. Since adding self-loops to an undirected network adds many edges to its line graph, the computation of SW may result in numerical overflow on many computers. A way to avoid overflow is discussed in Section 5.5. Table 4 shows the calculated measures for the components of the graph in Figure 5.3(c).

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW      ODS    SDS      SW      ODS    SDS
e1         131.7   13.2   3.6      130.8   12.7   4.1
e2         205.2   22.7   7.0       81.1    8.6   2.9
e3          53.6    5.1   1.9      316.9   31.0   8.1
e4         211.0   18.1   4.8       83.0    7.6   2.3

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW       ODS     SDS    SW       ODS     SDS
v1         3.9E130  6.1E12  5.5E3  4.6E130  6.3E12  5.2E3
v2         4.5E130  6.9E12  6.1E3  4.0E130  5.5E12  4.6E3
v3         4.7E130  6.5E12  5.3E3  3.8E130  5.9E12  5.4E3
v4         3.8E130  5.3E12  4.5E3  4.7E130  7.2E12  6.3E3

Table 3: Top part: Ranking of the edges of example (b) in Figure 5.3 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3. Bottom part: for the options above, the edges in Figure 5.3 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

5.4 Real-Life Examples

5.4.1 Example of Genotype Mutation

This section discusses a biological example to illustrate some of the methods described in this dissertation. To study the resistance of bacteria to an antibiotic, Nichol et

Edge importance                    Node importance (ADR = AUR)
edge e_k   SW     DS     SDS       node v_i   SW     DS       SDS
e1         3.8E5  38.1    2.5      v1         1.42   3.5E34   0.4E14
e2         4.2E5  53.8   13.8      v2         0.05   0.3E34   0.4E14
e3         3.7E5  35.7    3.0      v3         0.07   1.3E34   4.7E14
e4         0.4E5   8.5    2.8      v4         0.81   2.5E34   4.8E14
e5         0.6E5  42.4   33.3      v5         0.05   0.3E34   0.4E14
e6         0.4E5   8.5    2.8      v6         0.88   2.4E34   0.2E14
                                   v7         0.78   1.6E34   0.4E13

Table 4: Left part: Ranking of the edges of example (c) in Figure 5.3 using (4.4.11) and (4.4.12), for the adjacency matrix options of the line graph described in Section 5.2.3. The options are simply weighted (SW): Ẽ_SW = B^T W B − C, degree scaled (DS): Ẽ_DS = B^T W D^{−1} B − C_DS, and strongly degree scaled (SDS): Ẽ_SDS = B^T W D_s^{−1} B − C_SDS. Right part: for the given options, the edges in Figure 5.3 are given as weight the importance value from the left part of this table, then ADR is calculated to rank the nodes. Highest values are bold. Note that the SW values for node importance are calculated after subtracting µI from Ã, where µ is the spectral radius of Ã, to avoid overflow.

al. [51, 52] use an example that involves genotypes of 3 bits; see Figure 5.7(a). Each genotype is assigned a fitness level according to the fitness landscape in Figure 5.7(c), and a genotype can only mutate to other genotypes if they have a higher fitness level.

The authors present possible scenarios for the probability of these transitions, such as the probability being proportional to the fitness level increase, or the probability being that of a random walk as shown in Figure 5.7(b). In this dissertation we suggest transition probabilities based on the edge weights calculated using (5.3.2).

We display the network's node-weighted graph in Figure 5.8, with node weights equal to the corresponding fitness levels. Note that we do not add self-loops only to nodes v1 and v8 as in [51]; since we allow each genotype to remain the same and not mutate, we add self-loops to all nodes of the graph as described in Section 5.2.3.

We construct Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS from Section 5.2.3 to get the edge-weighted adjacency matrix of the line graph. We then calculate {eLC_out}_k and {eLC_in}_k from

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW      ODS    SDS      SW      ODS    SDS
e1          18.7    8.2   3.1      339.5   17.4   3.9
e2         128.0    5.4   1.9      339.5   17.4   3.9
e3           8.0    8.0   4.5      200.0    9.9   2.8
e4           3.0    3.0   2.0       86.8    3.6   1.5
e5         526.0   27.8   4.51      86.8    3.6   1.5
e6         241.5   14.1   4.4      204.4   19.2   4.7
e7         526.0   27.8   4.51       2.0    1.5   1.5

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW       ODS     SDS    SW       ODS     SDS
v1         7.1E218  4.6E12  1.6E3  6.6E218  4.9E12  1.4E3
v2         1.2E203  1.9E01  8.3E0  4.7E218  4.4E12  1.5E3
v3         1.0E000  1.0E00  1.0E0  1.9E218  2.7E12  1.6E3
v4         7.7E218  5.8E12  1.9E3  6.1E218  3.9E12  1.2E3
v5         1.0E000  1.0E00  1.0E0  1.1E218  9.0E11  6.2E2
v6         6.3E218  5.3E12  2.1E3  7.4E218  4.3E12  1.1E3
v7         6.6E218  5.4E12  1.8E3  1.0E000  1.0E00  1.0E0

Table 5: Top part: Ranking of the edges of example (d) in Figure 5.3 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3. Bottom part: for the options above, the edges in Figure 5.3 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

(4.4.11) and (4.4.12), respectively, and report them at the top of Table 6. The ODS method resulted in edges e3, e4, and e8 having the highest eLC_out value. Both the SW and SDS methods favor edges e5 and e10. All methods ranked edge e4, representing the transition from “010” to “000”, as having the highest eLC_in value.

Now we use the sum of {eLC_in}_k and {eLC_out}_k as the weight of edge e_k, for k = 1, ..., 12, in the edge-weighted version of the graph in Figure 5.8. We identify the importance of the nodes as transmitters by calculating ADR, and their importance as receivers by AUR, and report them at the bottom of Table 6.

We conclude from the ADR values of all methods that the genotypes “110”, i.e.

Figure 5.7: The genotype network for bit strings of length 3 and the corresponding stochastic transitions according to the fitness levels and equations presented in [51]. (a) Graph connecting genotypes. (b) The corresponding stochastic transitions based on mutation to a better-fitted neighbor. (c) Fitness landscape.

Figure 5.8: The genotype directed graph of Figure 5.7. The node weights are from Figure 5.7(c).

node v6, and “001”, i.e. node v1, are the least stable genotype states, that is, the most likely to transition into another state. This holds regardless of the choice of SW, ODS, or SDS. On the other hand, Table 6 also shows that genotype “111”, i.e. node v7, is most likely to eventually be the last genotype reached by mutation, followed by “000”, i.e. node v8. Note that genotypes “111” and “000” have an ADR value equal to 1 because, according to the fitness landscape in this example, they have a higher fitness score than the states that differ from them by one digit. Therefore “111” and “000” do not mutate. Similarly, “110” and “001” have an AUR of 1 because they have a lower fitness score than the genotypes that differ from them by one digit, so the

Dissipating edges (eLC_out)        Absorbing edges (eLC_in)
edge e_k   SW    ODS   SDS         SW    ODS   SDS
e1         1.57  1.17  1.07        1.32  1.08  1.08
e2         1.86  1.41  1.12        1.32  1.08  1.08
e3         1.73  1.73  1.18        1.32  1.08  1.08
e4         1.73  1.73  1.18        3.04  1.90  1.28
e5         2.46  1.68  1.19        1.31  1.10  1.05
e6         1.49  1.49  1.12        1.31  1.10  1.05
e7         1.86  1.41  1.12        1.52  1.16  1.08
e8         1.73  1.73  1.18        1.52  1.16  1.08
e9         1.49  1.49  1.12        2.33  1.58  1.18
e10        2.46  1.68  1.19        1.10  1.03  1.03
e11        1.98  1.29  1.13        1.10  1.03  1.03
e12        1.49  1.49  1.12        1.10  1.03  1.03

Dissipating nodes (ADR)            Absorbing nodes (AUR)
node v_i   SW    ODS   SDS         SW    ODS   SDS
v1         34.4  22.2  16.89        1.0   1.0   1.0
v2          5.8   4.6   3.5        13.8   9.6   7.9
v3         16.6  11.4   8.2         3.9   3.2   3.2
v4         14.1  10.4   8.0         4.1   3.3   3.2
v5          4.8   4.1   3.3        12.8   9.0   7.8
v6         35.5  22.8  16.88        1.0   1.0   1.0
v7          1.0   1.0   1.0        33.4  22.9  16.9
v8          1.0   1.0   1.0        43.2  27.4  17.9

Table 6: Top part: Ranking of the edges of the genotype mutation graph in Figure 5.8 in dissipating and receiving through the network using (4.4.11) and (4.4.12), for the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3. Bottom part: for the options above, the edges in Figure 5.8 are given as weight the sum of their dissipating and receiving values, then ADR and AUR are calculated to rank the nodes. Highest values are bold.

latter ones do not mutate to “110” nor to “001”. The specific value of 1 comes from the identity matrix in the Taylor expansion of the exponential function.

5.4.2 Example of Social Networks: Medium and Twitter

In this example we predict which are the most influential users on the Medium platform, based on their connectivity on Medium and the influence of their

Twitter accounts. We use a data set collected in 2016 that describes 1,075,983

users, who are identified by numerical IDs to protect their privacy [47]. The data set

contains information about the users whom they follow and those they are followed

by on Medium, along with information about whether their account is linked to their

Twitter account, and some information about the Twitter account, if available. The

data were collected to argue that linking Medium with Twitter is helpful to attract

a large number of new users. We use the data provided publicly, and construct our

network adjacency matrix from the data showing how accounts follow each other

on Medium. We only take into account the users who have a linked Twitter account,

and use the number of followers they have on Twitter as the node weights in the

graph.

For computational purposes, we narrow down our dataset and pick users who have

more than 5,000 Twitter followers. Our network consists of the subset of 10,077 users

represented by nodes and 992,539 directed connections expressed by edges. Because

of the large network size, the MATLAB function expm fails to carry out the computation. In the

following computations in this section, we follow the numerical methods described in

Section 5.5 to avoid computational overflow.

For each of the Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS options from Section 5.2.3, we calculate eLC_out = exp(Ẽ→)1 and eLC_in = 1^T exp(Ẽ→) by the approximations in Section 5.5. We use the sum of eLC_out and eLC_in as the weight of the edges in the edge-weighted version of the graph representing the Medium network. Before applying ADR and AUR to the constructed edge-weighted adjacency matrices, we again use the numerical methods described in Section 5.5. The top 15 ranked users for Medium are reported in

Table 7.

No further comments can be made on the top-ranked accounts, since they are anonymous, but we observe considerable overlap in IDs among our methods, and between influencing accounts and influenced ones. A likely explanation is that high-impact social media users are also highly impacted by others.

Dissipating nodes (ADR)            Absorbing nodes (AUR)
SW      ODS     SDS                SW      ODS     SDS
822     9790    3801               822     9790    9790
2265    3801    5277               2265    5277    5277
2326    11359   403527             2326    29395   3801
540     5277    38396              540     22969   23125
2806    20198   16842              2806    3801    20150
14745   23099   14745              14745   20198   149845
7539    22811   9790               8813    22811   6572
8813    20150   22839              7539    23413   9911
2681    45      34607              2681    11359   12346
722     403527  75389              722     20150   15354
988     9911    20137              6385    9917    35687
6385    34607   22798              45      12346   96277
9790    14745   250195             988     9911    26342
2631    38396   22032              2631    17238   22969
8058    16842   65898              8058    10629   146948

Table 7: Top 15 ranked accounts in the Medium social network, based on the associated accounts' influence on Twitter, using the simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrix options of the line graph as described in Section 5.2.3.

5.5 Computational Aspects

This section discusses some computational aspects of how to evaluate exp(E→)1 and related matrix functions for large networks. For small to medium-sized (square) matrices M, we can first evaluate exp(M) with the MATLAB function expm (provided that overflow does not occur), and then multiply the matrix exp(M) by the vector

1. Methods for evaluating exp(M) for small to medium-sized matrices are described by Higham [42]. However, when the matrix M is large, the evaluation of exp(M) is

more difficult for several reasons:

1. Adjacency matrices M that represent networks typically are sparse; the matrix exp(M) generally is not. The memory requirement for computing and subsequently storing exp(M) may be substantial.

2. The computational effort required for evaluating exp(M) for a large matrix M

may be prohibitive.

3. Overflow is more likely to take place when M is a large adjacency matrix than when M is small.

Our models require that we first evaluate exp(E→)1 and 1^T exp(E→) to form the edge-weighted graph. This defines the adjacency matrix Ã. Subsequently, we compute exp(Ã)1 and 1^T exp(Ã) to rank the nodes. To simplify the discussion, we will let M denote either one of the matrices E→ and Ã.

To avoid overflow, we can evaluate (an approximation of) the spectral radius µ of M and compute exp(M − µI) instead of exp(M). The replacement of M by M − µI does not affect the relative importance of edges and nodes in the graph. This rescaling has also been used in [34]. We applied it in the computations for the genotype example of Section 5.4.1.

The large memory requirement makes it impossible to evaluate the exponential of the matrices from the Medium-Twitter example in Section 5.4.2 on a standard laptop computer. This difficulty can be circumvented by approximating exp(M) by a low-rank matrix that is determined by applying a few steps of the Arnoldi or nonsymmetric Lanczos processes. We will compare these methods.

108 5.5.1 The Arnoldi Process

Let M ∈ Rn×n and 1) = [1, ... , 1]T ∈ Rn. Application of `  n steps of the

Arnoldi process to the matrix A with initial vector 1 gives the Arnoldi decomposition

T (5.5.1) MW` = W`H` + g`1` ,

n×` where the columns of the matrix W` = [w1, w2, ... , w`] ∈ R form an orthonormal

`−1 basis for the Krylov subspace K`(M, 1) = span{w1, Mw1, ... , M w1} and w1 =

`×` 1/k1k. Here k · k denotes the Euclidean vector norm. The matrix H` ∈ R is of

n T T upper Hessenberg form, g` ∈ R satisfies W` g` = 0, and 1` = [0, ... , 0, 1, 0, ... , 0] denotes the `th axis vector of appropriate order; details on the Arnoldi process can be found, e.g., in Meurant [48] and Saad [60, Chapter 6]. We assume that ` is small enough so that the decomposition (5.5.1) with the stated properties exists. This is the generic situation. The computation of this decomposition requires the evaluation of ` matrix-vector products with the matrix M. We approximate exp(M)1 by the right-hand side of

(5.5.2)  exp(M)1 ≈ W_ℓ exp(H_ℓ) e_1 ‖1‖;

see, e.g., [8, 45] for properties of this approximation method. For many adjacency matrices M, it suffices to let ℓ in the decomposition (5.5.1) be fairly small to obtain a good enough approximation of exp(M)1. This is illustrated below. When the matrix M is large and ℓ is fairly small, the dominating computational effort required to compute the decomposition (5.5.1) is the evaluation of ℓ matrix-vector products with M. In applications of interest to us the matrix M generally is nonsymmetric. The computations then have to be repeated with M replaced by M^T when an approximation of exp(M^T)1 also is desired.
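A minimal sketch may clarify the procedure of (5.5.1)-(5.5.2). The following Python code (NumPy, with SciPy's dense expm as the "exact" reference; the random test matrix and the function name are purely illustrative) carries out ℓ Arnoldi steps with modified Gram-Schmidt and forms W_ℓ exp(H_ℓ) e_1 ‖1‖:

```python
import numpy as np
from scipy.linalg import expm

def arnoldi_expm_ones(M, ell):
    """Approximate exp(M) @ 1 from ell Arnoldi steps, cf. (5.5.1)-(5.5.2)."""
    n = M.shape[0]
    b = np.ones(n)
    beta = np.linalg.norm(b)
    W = np.zeros((n, ell + 1))
    H = np.zeros((ell + 1, ell))
    W[:, 0] = b / beta
    for j in range(ell):
        v = M @ W[:, j]
        for i in range(j + 1):          # modified Gram-Schmidt orthogonalization
            H[i, j] = W[:, i] @ v
            v -= H[i, j] * W[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] < 1e-12:         # lucky breakdown: invariant subspace found
            ell = j + 1
            break
        W[:, j + 1] = v / H[j + 1, j]
    e1 = np.zeros(ell)
    e1[0] = 1.0
    # Right-hand side of (5.5.2): W_ell exp(H_ell) e_1 ||1||
    return W[:, :ell] @ (expm(H[:ell, :ell]) @ e1) * beta

# Illustrative sparse nonsymmetric "adjacency matrix"
rng = np.random.default_rng(1)
M = (rng.random((50, 50)) < 0.1).astype(float)
np.fill_diagonal(M, 0.0)
exact = expm(M) @ np.ones(50)
approx = arnoldi_expm_ones(M, 35)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

For this small example a modest ℓ already yields a small relative error, consistent with the behavior reported below for the Medium-Twitter example; only ℓ matrix-vector products with M are needed.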

5.5.2 The Nonsymmetric Lanczos Process

Application of ℓ steps of the nonsymmetric Lanczos process to the matrix M ∈ R^{n×n} with initial vector 1 = [1, …, 1]^T ∈ R^n gives the Lanczos decompositions

(5.5.3)  M V_ℓ = V_ℓ T_ℓ + δ_{ℓ+1} v_{ℓ+1} e_ℓ^T,
         M^T W_ℓ = W_ℓ T_ℓ^T + β_{ℓ+1} w_{ℓ+1} e_ℓ^T,

where the columns of the matrix V_ℓ = [v_1, v_2, …, v_ℓ] ∈ R^{n×ℓ} span the Krylov subspace K_ℓ(M, v_1) = span{v_1, M v_1, …, M^{ℓ−1} v_1} with v_1 = 1/‖1‖, and the columns of the matrix W_ℓ = [w_1, w_2, …, w_ℓ] ∈ R^{n×ℓ} span the Krylov subspace K_ℓ(M^T, w_1) = span{w_1, M^T w_1, …, (M^T)^{ℓ−1} w_1} with w_1 = 1/‖1‖. The columns of the matrices V_ℓ and W_ℓ are biorthogonal, i.e., W_ℓ^T V_ℓ = I_ℓ. It follows from (5.5.3) that

W_ℓ^T M V_ℓ = T_ℓ.

The matrix T_ℓ ∈ R^{ℓ×ℓ} is tridiagonal. For details on the nonsymmetric Lanczos method, see, e.g., Saad [60, Chapter 7]. We assume that ℓ is small enough so that the decomposition (5.5.3) with the stated properties exists. How to proceed when this is not the case is discussed in [6]. The computation of the decomposition (5.5.3) requires ℓ matrix-vector product evaluations with M and with M^T. Analogously to (5.5.2), we use the approximation

(5.5.4)  exp(M)1 ≈ V_ℓ exp(T_ℓ) e_1 ‖1‖.

In our experience with large-scale real-world networks, we found that a small number of steps ℓ with the nonsymmetric Lanczos algorithm typically was sufficient to render a quite accurate approximation of exp(M)1.
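A compact Python sketch of (5.5.3)-(5.5.4) may be helpful (NumPy/SciPy; the recurrences below use one common convention for the coefficients δ and β, and the test matrix is illustrative). Serious breakdowns are simply reported here rather than cured by the look-ahead techniques discussed in [6]:

```python
import numpy as np
from scipy.linalg import expm

def lanczos_expm_ones(M, ell):
    """Two-sided (nonsymmetric) Lanczos sketch for exp(M) @ 1, cf. (5.5.3)-(5.5.4)."""
    n = M.shape[0]
    b = np.ones(n)
    nb = np.linalg.norm(b)
    V = np.zeros((n, ell))
    W = np.zeros((n, ell))
    alpha = np.zeros(ell)
    beta = np.zeros(ell)    # superdiagonal of T
    delta = np.zeros(ell)   # subdiagonal of T
    V[:, 0] = W[:, 0] = b / nb            # w_1^T v_1 = 1
    for j in range(ell):
        alpha[j] = W[:, j] @ (M @ V[:, j])
        r = M @ V[:, j] - alpha[j] * V[:, j]
        s = M.T @ W[:, j] - alpha[j] * W[:, j]
        if j > 0:
            r -= beta[j] * V[:, j - 1]
            s -= delta[j] * W[:, j - 1]
        if j == ell - 1:
            break
        sr = s @ r
        if abs(sr) < 1e-14:               # serious breakdown; see [6] for remedies
            raise RuntimeError("Lanczos breakdown")
        delta[j + 1] = np.sqrt(abs(sr))
        beta[j + 1] = sr / delta[j + 1]
        V[:, j + 1] = r / delta[j + 1]
        W[:, j + 1] = s / beta[j + 1]
    T = np.diag(alpha) + np.diag(delta[1:], -1) + np.diag(beta[1:], 1)
    e1 = np.zeros(ell)
    e1[0] = 1.0
    # Right-hand side of (5.5.4): V_ell exp(T_ell) e_1 ||1||
    return V @ (expm(T) @ e1) * nb

# Illustrative test on a small random digraph
rng = np.random.default_rng(2)
M = (rng.random((60, 60)) < 0.1).astype(float)
np.fill_diagonal(M, 0.0)
exact = expm(M) @ np.ones(60)
approx = lanczos_expm_ones(M, 35)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
```

Note that each step requires one matrix-vector product with M and one with M^T; the same run therefore also produces the basis W_ℓ needed for the upstream quantity exp(M^T)1.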

5.5.3 Approximations for the Medium-Twitter Example

This section discusses in detail applications of the Arnoldi and nonsymmetric Lanczos processes to the ranking of the nodes of the Medium-Twitter example of Section 5.4.2. We first apply ℓ steps of the Arnoldi or nonsymmetric Lanczos processes to one of the three matrices Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS, with initial vector 1 = [1, …, 1]^T. The purpose of these computations is to determine edge weights and form the weighted adjacency matrix Ã following (5.3.2). Subsequently, we apply the Arnoldi and nonsymmetric Lanczos processes to approximate exp(Ã)1 and 1^T exp(Ã) to rank the nodes of the original graph.

               Arnoldi                                Nonsymmetric Lanczos
       number of  time    ADR rel.  AUR rel.   number of  time    ADR rel.  AUR rel.
       iterations (sec.)  error     error      iterations (sec.)  error     error
SW        16      34.5    2.2E-2    9.9E-3         8      33.5    5.8E+1    1.4E+3
          17      36.2    4.1E-3    2.0E-3        11      41.6    4.5E-3    8.6E-2
          20      40.6    2.1E-5    3.9E-5        14      52.1    9.1E-6    1.1E-4
ODS       10      20.9    1.8E-1    2.0E-1         6      27.9    1.5E-2    1.5E-2
          11      24.0    3.2E-2    1.3E-3         7      30.4    2.3E-3    2.3E-3
          20      39.6    8.7E-9    3.4E-7        14      32.7    6.1E-7    6.2E-7
SDS        8       3.7    1.1E-1    3.1E-2         5       3.0    8.1E-1    8.1E-1
           9       4.1    1.7E-3    9.0E-3         6       3.5    6.6E-3    6.6E-3
          20      10.8    2.1E-10   3.1E-9        14       7.5    2.7E-7    1.6E-7

Table 8: Comparison of the performance of Arnoldi and nonsymmetric Lanczos approximations when applied to the Medium social network using simply weighted (SW), out-degree scaled (ODS), and strongly degree scaled (SDS) adjacency matrices for the line graph.

Since we are interested in the node ranking, we show the smallest number of iterations with the Arnoldi and nonsymmetric Lanczos processes, when applied to one of the matrices Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS and to Ã, required so that the ranking of the top 20 nodes does not change when carrying out more steps. While this "stopping criterion" is not practical to use for the Arnoldi and nonsymmetric Lanczos processes, it illustrates that only a fairly small number of steps is required to gain insight into the node ordering. We found this to be true for other real-world large-scale networks as well. Hence, the computations required for many real-world large-scale network problems are not very expensive. Table 8 reports results for the matrices Ẽ→_SW, Ẽ→_ODS, and Ẽ→_SDS in the top row of each "window". Of course, identical ranking does not imply identical ADR and AUR values. The table therefore also displays the error in these values, as well as the errors achieved when the number of iterations is increased. The "exact values" are determined by carrying out 100 iterations with the Arnoldi and nonsymmetric Lanczos processes.
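The ranking-based "stopping criterion" described above can be made concrete with a short Python helper (illustrative; the function names are ours): after each batch of additional Arnoldi or Lanczos steps, compare the current top-20 node ranking with the previous one and stop once it no longer changes.

```python
import numpy as np

def top_k_ranking(scores, k=20):
    """Indices of the k highest-scoring nodes, best first."""
    return np.argsort(-scores, kind="stable")[:k]

def ranking_stable(scores_prev, scores_curr, k=20):
    """True when two successive approximations rank the top-k nodes identically."""
    return np.array_equal(top_k_ranking(scores_prev, k),
                          top_k_ranking(scores_curr, k))

# Example: a monotone rescaling of the scores leaves the ranking unchanged,
# while swapping two high scores changes it.
s1 = np.arange(30.0)
s2 = 2.0 * s1 + 1.0
s3 = s1.copy()
s3[28], s3[29] = s3[29], s3[28]
```

In a loop over increasing ℓ one would retain the previous score vector and stop as soon as `ranking_stable` returns True; as noted above, this serves to illustrate the convergence behavior rather than as a practical stopping rule.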

The computations were carried out on a Lenovo Ideapad 510 laptop computer with a 2.5 GHz Intel Core i7 processor and 6 GB 2133 MHz DDR4 memory. Each time reported in Table 8 is the average of 10 runs. The adjacency matrix Ẽ→_SW has the most nonvanishing entries, and the adjacency matrix Ẽ→_SDS the fewest. We observe that the former matrix requires the most iterations and the longest computing time to satisfy our "stopping criterion", and the latter matrix the fewest iterations and the shortest computing time. In this example, the Arnoldi process requires more iterations to satisfy the "stopping criterion" than the nonsymmetric Lanczos process. However, each step of the latter requires two matrix-vector product evaluations, while the former demands only one matrix-vector product evaluation per step; as a result, the Lanczos process does not always require less time than the Arnoldi process.

The line graph adjacency matrices are about 10^6 × 10^6, whereas the matrix Ã is only about 10^4 × 10^4. The line graph adjacency matrices are very sparse. On average, the Arnoldi and nonsymmetric Lanczos processes applied to the approximation of exp(Ã)1 required 5 iterations each, computed in less than 0.1 seconds.

CHAPTER 6

Conclusions

Until now the use of matrix functions based on the exponential has not received much attention for ranking the nodes of a directed network. Differently from the situation for undirected networks, it is generally not so useful to tabulate the diagonal entries of the exponential of the adjacency matrix. This already has been observed in the literature; see, e.g., Benzi et al. [11].

An important difference between directed and undirected networks is that in the former the notions of centrality and importance are quite distinct. Nodes in the "periphery" may influence a large number of other nodes, directly or indirectly, while highly central nodes may be interpreted as important intermediaries which collect influence or information from many nodes and broadcast it to many others. This suggests that one measure of importance is rarely sufficient for analyzing complex directed networks, and that a combination of measures will often provide a more complete picture. The gene network example (Section 3.4.2.2) illustrates this by combining upstream and downstream aggregate reachabilities to identify genes that play different roles and are important in different ways.

We also introduced a family of reachability measures that consider walks that are allowed to change direction a bounded number of times. This allows us to take into account "lateral" relationships between nodes (they influence the same nodes, or are influenced by the same nodes). Such relationships might escape the aggregate reachability measures AUR and ADR. At the same time, we limit the loss of information about directionality by limiting the number of turns that a walk may take. These measures are helpful for identifying important branch points.

This dissertation discusses the determination of the most important edges of an undirected or directed graph by using an associated line graph. For directed graphs several line graphs are described and their usefulness for ranking edges is discussed.

We also consider the task of removing unimportant edges. Computed examples illustrate the feasibility of the methods described.

We also presented modeling approaches to find the most important nodes in a network for which node weights are provided.

The methods we describe are built upon the notion of aggregate reachability, which uses matrix functions like the matrix exponential, applied to the adjacency matrix of the network. Since the adjacency matrix is such a key concept for this approach, we investigated different ways in which node weights can be incorporated into some version of an adjacency matrix, both for the original network and its line graph.

The large number of possible approaches is the result of several modeling design decisions that can be made. This flexibility is useful, as it makes it possible to model many different situations. However, it is difficult to obtain clear rules about which approach is best in a particular situation. We broke down the design process into a small set of design decisions, and provided guidelines to help with these choices.

As an argument in favor of using aggregate reachability as a measure of importance when node weights are provided, we proved in Section 5.1 that if the weight of a node is increased, then its ranking does not decrease and, for special cases, receives the highest increase in importance.

One should always keep in mind that the definition of importance is subjective and depends on what the network modeler considers a quality versus a flaw; the interpretation of node and edge importance therefore always lies in the eye of the beholder.

Bibliography

[1] S. Achard, R. Salvador, B. Whitcher, J. Suckling, and E. D. Bullmore. A resilient, low-frequency, small-world human brain functional network with highly connected association cortical hubs. Journal of Neuroscience, 26(1):63–72, 2006.

[2] L. A. Amaral, A. Scala, M. Barthelemy, and H. E. Stanley. Classes of small-world networks. Proceedings of the National Academy of Sciences, 97(21):11149–11152, 2000.

[3] F. Arrigo and M. Benzi. Edge modification criteria for enhancing the communicability of digraphs. SIAM Journal on Matrix Analysis and Applications, 37(1):443–468, 2016.

[4] F. Arrigo and M. Benzi. Updating and downdating techniques for optimizing network communicability. SIAM Journal on Scientific Computing, 38(1):B25–B49, 2016.

[5] J. Baglama, C. Fenu, L. Reichel, and G. Rodriguez. Analysis of directed networks via partial singular value decomposition and Gauss quadrature. Linear Algebra and its Applications, 456:93–121, 2014.

[6] Z. Bai, D. Day, and Q. Ye. ABLE: An adaptive block Lanczos method for non-Hermitian eigenvalue problems. SIAM Journal on Matrix Analysis and Applications, 20, 1999.

[7] A. Barrat, M. Barthelemy, R. Pastor-Satorras, and A. Vespignani. The architecture of complex weighted networks. Proceedings of the National Academy of Sciences, 101(11):3747–3752, 2004.

[8] B. Beckermann and L. Reichel. Error estimation and evaluation of matrix functions via the Faber transform. SIAM Journal on Numerical Analysis, 47(5):3849–3883, 2009.

[9] M. Bellalij, L. Reichel, G. Rodriguez, and H. Sadok. Bounding matrix functionals via partial global block Lanczos decomposition. Applied Numerical Mathematics, 94:127–139, 2015.

[10] M. Benzi and P. Boito. Quadrature rule-based bounds for functions of adjacency matrices. Linear Algebra and its Applications, 433(3):637–652, 2010.

[11] M. Benzi, E. Estrada, and C. Klymko. Ranking hubs and authorities using matrix functions. Linear Algebra and its Applications, 438(5):2447–2474, 2013.

[12] M. Benzi and C. Klymko. Total communicability as a centrality measure. Journal of Complex Networks, 1(2):124–149, 2013.

[13] M. Benzi and C. Klymko. On the limiting behavior of parameter-dependent network centrality measures. SIAM Journal on Matrix Analysis and Applications, 36(2):686–706, 2015.

[14] D. A. Bini, G. M. Del Corso, and F. Romani. Evaluating scientific products by means of citation-based models: a first analysis and validation. Electronic Transactions on Numerical Analysis, 33:1–16, 2008.

[15] P. Bonacich. Factoring and weighting approaches to status scores and clique identification. Journal of Mathematical Sociology, 2(1):113–120, 1972.

[16] P. Bonacich. Power and centrality: a family of measures. American Journal of Sociology, 92(5):1170–1182, 1987.

[17] U. Brandes and T. Erlebach. Network Analysis: Methodological Foundations, volume 3418. Springer, New York, 2005.

[18] C. Chen, Z. Jia, and P. Varaiya. Causes and cures of highway congestion. IEEE Control Systems Magazine, 21(6):26–32, 2001.

[19] W.-K. Chen. Graph Theory and its Engineering Applications, volume 5. World Scientific, Singapore, 1997.

[20] X. Chu, Z. Zhang, J. Guan, and S. Zhou. Epidemic spreading with nonlinear infectivity in weighted scale-free networks. Physica A: Statistical Mechanics and its Applications, 390(3):471–481, 2011.

[21] J. J. Crofts, E. Estrada, D. J. Higham, and A. Taylor. Mapping directed networks. Electronic Transactions on Numerical Analysis, 37:337–350, 2010.

[22] J. J. Crofts and D. J. Higham. Googling the brain: Discovering hierarchical and asymmetric network structures, with applications in neuroscience. Internet Mathematics, 7(4):233–254, 2011.

[23] O. De la Cruz Cabrera, M. Matar, and L. Reichel. Analysis of directed networks via the matrix exponential. Journal of Computational and Applied Mathematics, 355:182–192, 2019.

[24] O. De la Cruz Cabrera, M. Matar, and L. Reichel. Edge importance in a network via line graphs and the matrix exponential. Numerical Algorithms, 2019. In press.

[25] R. Diestel. Graph Theory. Springer, Berlin, 2000.

[26] J. Duch and A. Arenas. Community detection in complex networks using extremal optimization. Physical Review E, 72(2):027104, 2005.

[27] E. Estrada. The Structure of Complex Networks: Theory and Applications. Oxford University Press, Oxford, 2012.

[28] E. Estrada and N. Hatano. Statistical-mechanical approach to subgraph centrality in complex networks. Chemical Physics Letters, 439(1-3):247–251, 2007.

[29] E. Estrada, N. Hatano, and M. Benzi. The physics of communicability in complex networks. Physics Reports, 514(3):89–119, 2012.

[30] E. Estrada and D. J. Higham. Network properties revealed through matrix functions. SIAM Review, 52(4):696–714, 2010.

[31] E. Estrada and J. A. Rodriguez-Velazquez. Subgraph centrality in complex networks. Physical Review E, 71(5):056103, 2005.

[32] Federal Aviation Administration (FAA). Passenger boarding (enplanement) and all-cargo data for US airports, 2016.

[33] A. Farahat, T. LoFaro, J. C. Miller, G. Rae, and L. A. Ward. Authority rankings from HITS, PageRank, and SALSA: Existence, uniqueness, and effect of initialization. SIAM Journal on Scientific Computing, 27(4):1181–1201, 2006.

[34] C. Fenu, D. Martin, L. Reichel, and G. Rodriguez. Network analysis via partial spectral factorization and Gauss quadrature. SIAM Journal on Scientific Computing, 35(4):A2046–A2068, 2013.

[35] D. F. Gleich. PageRank beyond the web. SIAM Review, 57(3):321–363, 2015.

[36] C. Godsil and G. F. Royle. Algebraic Graph Theory, volume 207. Springer, New York, 2013.

[37] G. H. Golub and G. Meurant. Matrices, Moments and Quadrature with Applications, volume 30. Princeton University Press, 2009.

[38] J. L. Gross and J. Yellen. Graph Theory and Its Applications, Second Edition. Taylor & Francis, Boca Raton, 2006.

[39] B. Hall. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, volume 222. Springer, New York, 2003.

[40] J. Heitzig, J. F. Donges, Y. Zou, N. Marwan, and J. Kurths. Node-weighted measures for complex networks with spatially embedded, sampled, or differently sized nodes. The European Physical Journal B, 85(1):38, 2012.

[41] V. E. Henson and G. Sanders. Locally supported eigenvectors of matrices associated with connected and unweighted power-law graphs. Electronic Transactions on Numerical Analysis, 39:353–379, 2012.

[42] N. J. Higham. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008.

[43] L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39–43, 1953.

[44] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604–632, 1999.

[45] L. A. Knizhnerman. Calculation of functions of unsymmetric matrices using Arnoldi's method. USSR Computational Mathematics and Mathematical Physics, 31(1):1–9, 1991.

[46] I. X. Y. Leung, S. Y. Chan, P. Hui, and P. Lio. Intra-city urban network and traffic flow analysis from GPS mobility trace. arXiv preprint arXiv:1105.5839, 2011.

[47] F. Li, Y. Chen, R. Xie, F. Ben Abdesslem, and A. Lindgren. Understanding service integration of online social networks: A data-driven study. In 2018 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), pages 848–853. IEEE, 2018.

[48] G. Meurant. Computer Solution of Large Linear Systems. Elsevier, Amsterdam, 1999.

[49] M. E. J. Newman. Analysis of weighted networks. Physical Review E, 70(5):056131, 2004.

[50] M. E. J. Newman. Networks: An Introduction. Oxford University Press, Oxford, 2010.

[51] D. Nichol, P. Jeavons, A. G. Fletcher, R. A. Bonomo, P. K. Maini, J. L. Paul, R. A. Gatenby, A. R. A. Anderson, and J. G. Scott. Steering evolution with sequential therapy to prevent the emergence of bacterial antibiotic resistance. PLoS Computational Biology, 11(9):e1004493, 2015.

[52] D. Nichol, P. Jeavons, A. G. Fletcher, R. A. Bonomo, P. K. Maini, J. L. Paul, R. A. Gatenby, A. R. A. Anderson, and J. G. Scott. Exploiting evolutionary non-commutativity to prevent the emergence of bacterial antibiotic resistance. BioRxiv, page 007542, 2015.

[53] Bureau of Transportation Statistics. Research and Innovative Technology Administration / TranStats.

[54] T. Opsahl, F. Agneessens, and J. Skvoretz. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks, 32(3):245–251, 2010.

[55] J. Orlin. Contentment in graph theory: covering graphs with cliques. Indagationes Mathematicae (Proceedings), 80(5):406–424, 1977.

[56] PARTA. Kent State Campus Bus Service. http://www.partaonline.org/ride-parta/campus-bus-service/.

[57] G. A. Pavlopoulos, M. Secrier, C. N. Moschopoulos, T. G. Soldatos, S. Kossida, J. Aerts, R. Schneider, and P. G. Bagos. Using graph theory to analyze biological networks. BioData Mining, 4(1):10, 2011.

[58] M. Pelillo, K. Siddiqi, and S. W. Zucker. Many-to-many matching of attributed trees using association graphs and game dynamics. In International Workshop on Visual Form, pages 583–593. Springer, 2001.

[59] S. Pozza and F. Tudisco. On the stability of network indices defined by means of matrix functions. SIAM Journal on Matrix Analysis and Applications, 39(4):1521–1546, 2018.

[60] Y. Saad. Iterative Methods for Sparse Linear Systems, volume 82. SIAM, Philadelphia, 2003.

[61] S. Scarsoglio, F. Laio, and L. Ridolfi. Climate dynamics: a network-based approach for the analysis of global precipitation. PLoS One, 8(8):e71129, 2013.

[62] C. Stark, B. J. Breitkreutz, T. Reguly, L. Boucher, A. Breitkreutz, and M. Tyers. BioGRID: a general repository for interaction datasets. Nucleic Acids Research, 34(suppl. 1):D535–D539, 2006.

[63] G. Stelzer, N. Rosen, I. Plaschkes, S. Zimmerman, M. Twik, S. Fishilevich, T. I. Stein, R. Nudel, I. Lieder, Y. Mazor, S. Kaplan, D. Dahary, D. Warshawsky, Y. Guan-Golan, A. Kohn, N. Rappaport, M. Safran, and D. Lancet. The GeneCards suite: from gene data mining to disease genome sequence analyses. Current Protocols in Bioinformatics, 54(1):1–30, 2016.

[64] K. Thulasiraman and M. N. S. Swamy. Graphs: Theory and Algorithms. Wiley, New York, 1992.

[65] D. Wei, X. Deng, X. Zhang, Y. Deng, and S. Mahadevan. Identifying influential nodes in weighted networks based on evidence theory. Physica A: Statistical Mechanics and its Applications, 392(10):2564–2575, 2013.

[66] F. Zou, X. Li, S. Gao, and W. Wu. Node-weighted Steiner tree approximation in unit disk graphs. Journal of Combinatorial Optimization, 18(4):342, 2009.