CHARACTERIZING THE SPECTRAL RADIUS OF A SEQUENCE OF ADJACENCY MATRICES
BY
WILLIAM D. FRIES
A Thesis Submitted to the Graduate Faculty of
WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES
in Partial Fulfillment of the Requirements
for the Degree of
MASTER OF ARTS
Mathematics and Statistics
May 2018
Winston-Salem, North Carolina
Approved By:
Miaohua Jiang, Ph.D., Advisor
Kenneth Berenhaut, Ph.D., Chair
Grey Ballard, Ph.D.
Acknowledgments
I would like to thank Dr. Miaohua Jiang for his guidance and expertise throughout the research experience. This project would not have been possible without him. I would also like to thank Dr. Kenneth Berenhaut, Dr. Grey Ballard, Dr. John Gemmer, Dr. Sarah Raynor and the rest of the Mathematics and Statistics department at Wake Forest University for helping me realize the endless opportunities that studying math can bring.
This work is dedicated to my parents, Karen and Andy, my sister, Margaret, and my brother, Jack: thank you for teaching me that some of the best things in life come from its unpredictable nature.
Table of Contents
Acknowledgments
Abstract
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Foundations of Network Epidemics
    1.1.1 Continuous Compartmental Epidemic Models
    1.1.2 Epidemiology on Networks
    1.1.3 Existing Bounds on the Largest Eigenvalue
  1.2 Constructing our Problem
    1.2.1 Terms and Definitions
    1.2.2 Defining the Transformation
Chapter 2 Main Results
  2.1 Motivation for the Problem
  2.2 Characterization of Eigenvalues
    2.2.1 The Characteristic Polynomial
    2.2.2 Properties of x(m)
    2.2.3 Special Cases
Chapter 3 Further Results and Applications
  3.1 Corollaries
  3.2 Applications
    3.2.1 Relative Size of Eigenvalue
Chapter 4 Conclusions
  4.1 Future Work
Bibliography
Curriculum Vitae
Abstract
In this paper we explore the introductory theory of modeling epidemics on networks and the significance of the spectral radius in their analysis. We look to establish properties of the spectral radius that would better inform how an epidemic might spread over such a network. We construct a specific transformation of networks that describes a transition from a star network to a path network. For the sequence of adjacency matrices that describe this transition, we show that the spectral radius of these graphs can be given by a simple algebraic equation. Using this equation we show that the spectral radius increases as the star unfolds and establish bounds on the spectral radius for each network.
List of Tables
1.1 Common Compartmental Epidemic Models
3.1 Numerical Approximations for $d_{50,k}$, 27 < k < 47
3.2 Numerical Approximations for $d_{75,k}$, 52 < k < 72
3.3 Numerical Approximations for $d_{100,k}$, 77 < k < 97
List of Figures
1.1 The differential equation $\dot{x} = .1x - .2x^2$ for 0 ≤ x ≤ 1
1.2 The solution curves to $\dot{x} = .1x - .2x^2$ with varying initial conditions
1.3 The differential equation $\dot{x} = -.05x - .2x^2$ for 0 ≤ x ≤ 1
1.4 The solution curves to $\dot{x} = .25x - .2x^2$ with varying initial conditions
1.5 The network and associated adjacency matrix for $A_8(4)$
1.6 The network and associated adjacency matrix for $B_6$
1.7 n-degree star and associated adjacency matrix
1.8 The network and associated adjacency matrix for $A_n(2)$
1.9 The network and associated adjacency matrix for $A_n(3)$
1.10 The network and associated matrix after the i − 1 unfolding
1.11 The network and associated matrix after n − 3 unfolding actions
1.12 The network and associated matrix after n − 2 unfolding actions
2.1 Plots of the $\rho(A_n(k))$ as k varies for n = 50, 75, and 100
3.1 Numerical Approximations for $\rho(A_{50}(k))$
3.2 Numerical Approximations for $\rho(A_{75}(k))$
3.3 Numerical Approximations for $\rho(A_{100}(k))$
3.4 A graph of n vs. $\max_k(p_n(k))$
Chapter 1: Introduction
Recent research into the impact of a network's spectral radius on the spread of an epidemic across that network [1, 2, 3] motivates our research into the spectral radius of trees. Consider a population of agents who are susceptible to an epidemic and whose connections are represented by the adjacency matrix $A = [a_{ij}]$ where
\[
a_{ij} =
\begin{cases}
\beta_{ij} & \text{if node } i \text{ is adjacent to node } j \\
\delta_i & \text{if } i = j \\
0 & \text{else}
\end{cases}
\tag{1.1}
\]
with $\beta_{ij}$ being the probability that, if agent i is infected, agent j becomes infected in one time-step (∆t), and $\delta_i$ being the probability that agent i recovers in ∆t. Then there is a strong relationship between the largest eigenvalue of A and the reproduction number, the initial rate at which the epidemic spreads through the network [1, 2, 3, 4, 5, 6].
This prompts the question: how does the network structure affect the largest eigenvalue and, ultimately, how the disease will spread through the population? Research in graph theory has identified the maximal and minimal configurations of trees, along with inequalities relating the spectral radii of related networks [7]. We restrict our attention to a specific set of trees, all with n vertices, that can be described as stars each having one long arm [8]. We ask: how does the largest eigenvalue change as we transform our graph from a star to a path, and can we find bounds for the largest eigenvalue?
1.1 Foundations of Network Epidemics
1.1.1 Continuous Compartmental Epidemic Models
Before considering epidemics spreading across networks, it is useful to consider the case when a disease can be transmitted from anyone to anyone. These models are commonly referred to as fully-mixed, and the simplest epidemic models are the SI, SIR, SIS, and SIRS models (Table 1.1). ‘S’ refers to susceptible, ‘I’ refers to infected, and ‘R’ refers to recovered or removed. The order of the letters describes how a member of the population might move through different stages of a disease. Thus the SIRS model would model a disease in which someone might catch the disease, recover with a brief stage of immunity and then return to the susceptible population. In our models, we refer to s as the percent of the population that is susceptible, x as the percent of the population that is infected, and r as the percent of the population that is recovered or removed from the system.
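Model | Equations | Reduced system
SI | $\dot{s} = -\beta s x$, $\dot{x} = \beta s x$ | $\dot{x} = \beta(1 - x)x$
SIR | $\dot{s} = -\beta s x$, $\dot{x} = \beta s x - \delta x$, $\dot{r} = \delta x$ | $\dot{s} = -\beta s(1 - s - r)$, $\dot{r} = \delta(1 - s - r)$
SIS | $\dot{s} = \delta x - \beta s x$, $\dot{x} = \beta s x - \delta x$ | $\dot{x} = (\beta - \delta)x - \beta x^2$
SIRS | $\dot{s} = \eta r - \beta s x$, $\dot{x} = \beta s x - \delta x$, $\dot{r} = \delta x - \eta r$ | $\dot{x} = (\beta - \delta - \beta r - \beta x)x$, $\dot{r} = \delta x - \eta r$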
Table 1.1: The dynamical systems for simple epidemic models.
These are commonly referred to as compartmental epidemic models because they separate the population into compartments. This is to be contrasted with the agent-based epidemic model, discussed later, in which each agent’s transmission and recovery rates are independently determined. The solution to the SI model is well known. Using separation of variables and initial condition $x(0) = x_0$, the solution can be given as
\[
x(t) = \frac{x_0 e^{\beta t}}{1 - x_0 + x_0 e^{\beta t}}
\]
The SI model has two clear fixed points, one unstable at x = 0 and one stable at x = 1 for any β > 0. This implies that, if the disease spreads, then eventually almost everyone will become infected.
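As a quick numerical check of the closed form above, the following minimal sketch compares it with a forward-Euler integration of the reduced SI equation ẋ = β(1 − x)x. The function names, the parameter values β = 0.2 and x0 = 0.01, and the choice of Euler's method are assumptions made for illustration and do not come from the thesis.

```python
import math

def si_closed_form(x0, beta, t):
    """Closed-form SI solution x(t) = x0 e^{beta t} / (1 - x0 + x0 e^{beta t})."""
    growth = x0 * math.exp(beta * t)
    return growth / (1.0 - x0 + growth)

def si_euler(x0, beta, t_end, steps=20_000):
    """Forward-Euler integration of dx/dt = beta (1 - x) x, used only for comparison."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += dt * beta * (1.0 - x) * x
    return x

# Illustrative values (assumptions, not taken from the text).
beta, x0 = 0.2, 0.01
for t in (0.0, 25.0, 50.0, 100.0):
    print(t, si_closed_form(x0, beta, t), si_euler(x0, beta, t))
```

Both columns should agree closely and approach the stable fixed point x = 1 as t grows.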
If we consider the SIS model, solutions can be given by:
\[
x(t) = \left(1 - \frac{\delta}{\beta}\right) \frac{C e^{(\beta-\delta)t}}{1 + C e^{(\beta-\delta)t}} \tag{1.2}
\]
with $C = \frac{\beta x_0}{\beta - \delta - \beta x_0}$ [1]. The bifurcation parameter of this equation, commonly referred to as the Reproduction Number, is given by $R_0 = \frac{\beta}{\delta}$, and the bifurcation occurs at $R_0 = 1$ [1]. To apply this theory we consider the example below.
Example 1. A small number of people, $x_0$ essentially 0, are discovered to have a disease with a transmission rate of .2 and a recovery rate of .1 that can be modeled by the SIS model. Since $R_0 = .2/.1 = 2 > 1$, the disease will spread and will follow the equation:
\[
x(t) = \frac{1}{2} \, \frac{C e^{.1t}}{1 + C e^{.1t}}
\]
The differential equation can be seen in Figure 1.1 and a sample of its solution curves can be seen in Figure 1.2.
When analyzing this graph, we notice that we will have a fixed point at x∗ = .5, which is a stable fixed point. That is, over time, about half of the population will be infected.
Figure 1.1: The differential equation $\dot{x} = .1x - .2x^2$ for 0 ≤ x ≤ 1
Notice that if we change δ to .25, our differential equation becomes that of Figure 1.3 and our solution curves can be seen in Figure 1.4. Clearly, we have passed the bifurcation value and our epidemic will be eradicated. We also can see that as we approach the bifurcation value ($R_0 = 1$), the stable fixed point at x∗ = (β − δ)/β approaches 0. When comparing this to the previous example, we notice we have only one fixed point in 0 ≤ x ≤ 1, at x∗ = 0. This implies the percent of the population that is infected tends to 0 as time progresses. This is exactly what we would expect because $\frac{\beta}{\delta} < 1$, which implies no epidemic occurs.
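The two regimes discussed above can be reproduced directly from Equation (1.2). The sketch below evaluates the SIS solution for β = .2 with δ = .1 and δ = .25; the function name and the initial condition x0 = 0.01 are hypothetical choices for illustration.

```python
import math

def sis_solution(x0, beta, delta, t):
    """Equation (1.2): x(t) = (1 - delta/beta) * C e^{(beta-delta)t} / (1 + C e^{(beta-delta)t}).

    Assumes x0 is not exactly the fixed point (beta - delta)/beta, where C is undefined.
    """
    c = beta * x0 / (beta - delta - beta * x0)
    e = c * math.exp((beta - delta) * t)
    return (1.0 - delta / beta) * e / (1.0 + e)

beta, x0 = 0.2, 0.01  # x0 is an illustrative initial infected fraction
for delta in (0.1, 0.25):
    r0 = beta / delta
    late = sis_solution(x0, beta, delta, 200.0)
    print(f"delta={delta}: R0={r0:.2f}, x(200)={late:.6f}")
```

With δ = .1 the value at late times sits near the fixed point x∗ = .5, while with δ = .25 it decays toward 0, matching the bifurcation at R0 = 1.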
These systems can become significantly more complicated in higher dimensions when we allow for multiple disease stages, such as stages with different levels of contagiousness and dormant periods in a disease. Despite these extensions, there are some drawbacks to any fully-mixed model.
One significant assumption that this fully-mixed compartmental model makes is that the probability of anyone spreading the disease to anyone else in the population is equal. However, this assumption does not allow us to accurately model many epidemic situations. For instance, there is a much higher likelihood that diseases are transmitted among family members or a group of friends instead of randomly chosen strangers within the community. We would then want to design a system that appropriately weighs transmission rates. A logical representation of this is a weighted adjacency matrix for a network that represents the community. To address this assumption, we consider the dynamical system acting on a network. This will allow the eventual extension into the agent-based epidemic model.
Figure 1.2: The solution curves to $\dot{x} = .1x - .2x^2$ with varying initial conditions
1.1.2 Epidemiology on Networks
Figure 1.3: The differential equation $\dot{x} = -.05x - .2x^2$ for 0 ≤ x ≤ 1
The network extension of the simplified model takes into account the interactions between people and their ability to transfer the epidemic across the network. For this, we want to consider the dynamics on the network as time passes. To do this, we define the system with $s_i$ denoting agent i’s probability of being susceptible and $x_i$ denoting agent i’s probability of being infected. We will start by discussing the extensions of these models to networks before discussing agent-based models. In the most basic (SI) model, we track the change in probability using the system of equations:
\[
\dot{s}_i = -\beta s_i \sum_{j \in J} A_{ij} x_j \tag{1.3}
\]
\[
\dot{x}_i = \beta s_i \sum_{j \in J} A_{ij} x_j \tag{1.4}
\]
where β is defined as before and $A_{ij}$ is an element of the unweighted, undirected adjacency matrix [1]. Note that, with epidemics on networks, we only allow agent i to become infected by one of its neighbors, which is the reason we include the adjacency matrix. Using the adjacency matrix forces the probability that agent i will be infected by a non-adjacent member of the network to be 0. Using this construction, and ignoring the quadratic term for small initial outbreaks, we can say
\[
\dot{x} = \beta A x \tag{1.5}
\]
Figure 1.4: The solution curves to $\dot{x} = .25x - .2x^2$ with varying initial conditions
Here x is a vector with elements $x_i$. Writing x as a linear combination of the eigenvectors of the adjacency matrix, we can say that
\[
x(t) = \sum_{r=1}^{n} a_r(0) e^{\beta\lambda_r t} v_r \tag{1.6}
\]
where $a_r(0) e^{\beta\lambda_r t} v_r$ is a solution for a particular element of the linear combination. We notice that these terms are dominated exponentially by the largest eigenvalue, allowing us to say that
\[
x(t) \sim e^{\beta\rho(A)t} v_1 \quad \text{[1]} \tag{1.7}
\]
where $\lambda_1 = \rho(A)$ is the largest eigenvalue of A. Newman’s argument above illustrates the significance of the largest eigenvalue. No longer does the percent of the population who are infected depend only on the transmission rate (and, in extension to other models, recovery rate), but it also is intrinsically related to the structure of the network and how that affects the largest eigenvalue and associated eigenvector.
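To make the roles of ρ(A) and v1 in Equation (1.7) concrete, here is a minimal sketch that estimates both for a small network. The example network (a star with four leaves), the value of β, and the use of power iteration on A + I are assumptions of this sketch rather than anything prescribed by the text. The shift by I is there because trees are bipartite, so −ρ(A) is also an eigenvalue and plain power iteration on A would not settle.

```python
import math

def dominant_eigenpair(adj, iters=500):
    """Estimate rho(A) and a unit eigenvector v1 by power iteration on A + I."""
    n = len(adj)
    v = [1.0] * n
    for _ in range(iters):
        # Multiply by (A + I): the identity contributes the v[i] term.
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    av = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
    rho = sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient v^T A v
    return rho, v

# Hypothetical example network: a star with centre 0 and four leaves.
star = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
beta = 0.2  # illustrative transmission rate
rho, v1 = dominant_eigenpair(star)
print("rho(A) =", rho)
print("early growth rate beta * rho(A) =", beta * rho)
print("relative infection profile v1 =", v1)
```

In this example the centre's entry of v1 is larger than each leaf's entry, which is one way to read the statement that early infection probabilities follow the leading eigenvector.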
Notice that, because those infected cannot recover, any transmission of the disease will spread to everyone, just as in the fully-mixed model. We now consider the case when recovering is possible.
When extending the SIR model to networks, the system of equations becomes
\[
\dot{s}_i = -\beta s_i \sum_{j} A_{ij} x_j \tag{1.8}
\]
\[
\dot{x}_i = \beta s_i \sum_{j} A_{ij} x_j - \delta x_i \tag{1.9}
\]
\[
\dot{r}_i = \delta x_i. \tag{1.10}
\]
We can show that this system has a solution
\[
x(t) = \sum_{r=1}^{n} a_r(0) v_r e^{(\beta\lambda_r - \delta)t} \quad \text{[1]}. \tag{1.11}
\]
Again this illustrates the significance of the leading eigenvalue and eigenvector and its ability to approximate the solution:
\[
x(t) \sim e^{(\beta\rho(A) - \delta)t} v_1 \tag{1.12}
\]
Notice that if δ > βρ(A), then x will decay exponentially. That is, if the recovery rate is large enough, the disease will not spread throughout the network. This observation gives us the bifurcation value of $\frac{\beta}{\delta} = \frac{1}{\rho(A)}$. As before, if $\frac{\beta}{\delta} > \frac{1}{\rho(A)}$, then the epidemic will grow. This also leads to the fact that for small values of ρ(A), it is difficult for a disease to spread throughout the network, and for large values of ρ(A) the opposite is true.
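The threshold δ = βρ(A) can be illustrated by integrating the early-time linearization ẋ = βAx − δx, which is Equation (1.9) with each $s_i$ taken to be approximately 1. The sketch below is a rough illustration under stated assumptions: the network is a star with four leaves (whose spectral radius is √4 = 2), the parameter values are hypothetical, and forward Euler is used only for simplicity.

```python
def simulate_linearized_sir(adj, beta, delta, x0, t_end=50.0, steps=5000):
    """Forward-Euler integration of dx/dt = beta*A x - delta*x (s_i taken as 1)."""
    n = len(adj)
    dt = t_end / steps
    x = list(x0)
    for _ in range(steps):
        ax = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * (beta * ax[i] - delta * x[i]) for i in range(n)]
    return x

# Star with centre 0 and four leaves: an edge joins i and j when exactly one of them is 0.
n = 5
star = [[1 if (i == 0) != (j == 0) else 0 for j in range(n)] for i in range(n)]

beta = 0.1          # illustrative; with rho(A) = 2 the threshold is delta = 0.2
x0 = [0.01] * n     # small initial infection probabilities
below = sum(simulate_linearized_sir(star, beta, 0.15, x0))  # delta < beta * rho(A)
above = sum(simulate_linearized_sir(star, beta, 0.25, x0))  # delta > beta * rho(A)
print("total infection, delta = 0.15:", below)
print("total infection, delta = 0.25:", above)
```

The total grows when δ is below βρ(A) and shrinks when δ is above it, in line with Equation (1.12).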
However, this modeling still assumes that transmission and recovery rates are constant and independent of the agent. More accurate models can be created using agent-based epidemic modeling [2]. As described in the introduction, each directed edge on the network is given an independent probability which models the likelihood that the infection will spread through said contact. However, the assumption of independent probabilities can be an over-simplification and unrealistic [3]. It would make sense that, when computing the probability of adjacent agents i and j being infected at time t, we would need to consider a covariance term that accounts for this dependence. That is, if agent i is infected, it is more likely that agent j is infected as well. In either the independent or dependent case, we use a weighted adjacency matrix $[w_{ij}]$ similar to that in Equation 1.1, where $w_{ij}$ is the contact level between agents i and j, and a system of Markov chains to model the disease’s effect on the community [2].
Under the assumption of independence, eigenvalue analysis has shown that, given a matrix A that is the Jacobian of a discrete dynamical system evaluated at 0, if ρ(A) < 1 then 0 is a stable fixed point and is referred to as locally asymptotically stable. Similarly, if ρ(A) > 1 then the fixed point is unstable [2]. This implies that if ρ(A) < 1 then the epidemic will die off and if ρ(A) > 1 then it will not. We can then use ρ(A) to model the rate at which the epidemic will be eradicated from the network. If we drop the mutual-independence assumption, then ρ(A) no longer perfectly describes the rate at which the epidemic is eradicated. Rather, it becomes an upper bound for this rate [3]. Thus it could be the case that ρ(A) overestimates the rate at which the disease will initially spread.
While this modeling can be extremely effective in disease tracking [2, 4], it harbors too many complexities for our initial problem. We will therefore first consider adjacency matrices, as these are among the simplest methods of displaying network structure. Through the eigenvalue analysis of the simplified problem, we lay the foundations for development of an agent-based model with non-constant transmission and recovery rates.
1.1.3 Existing Bounds on the Largest Eigenvalue
Construction of the dynamics above illustrates that an understanding of spectral analysis of graphs will be beneficial to understanding epidemic dynamics on a network. There are many known results in graph theory which indicate that the largest eigenvalue depends on graph structure [2, 7, 9]. We will look primarily at results that pertain to bounds on the largest eigenvalue.
For positive symmetric semi-definite matrices A and B, if A < B then ρ(A) < ρ(B), where ρ(A) denotes the largest eigenvalue of A [2]. We know that, given a graph G with n vertices and m edges,
\[
\rho(G) \le \left( \frac{2m(n-1)}{n} \right)^{1/2} \quad \text{[7]}. \tag{1.13}
\]
Similarly we know that for any graph G, δ(G) ≤ ρ(G) ≤ ∆(G) where δ(G) and ∆(G) are the minimum and maximum node degrees respectively [7].
Another potential bound for ρ(A) can be established through the Gershgorin Circle Theorem, which allows us to bound the largest eigenvalue of a matrix using its row sums. The Theorem states that for a complex n × n matrix A, if $r_i$ denotes the sum of the absolute values of the off-diagonal entries of row i, then every eigenvalue $\lambda_j$, j = 1, . . . , n, lies in the disc $D(a_{ii}, r_i)$ for some i = 1, . . . , n [10].
If we now apply this Theorem to graph theory, if no node is self-adjacent then every diagonal entry of our adjacency matrix will be 0. Thus we now can say that $\lambda_j \in D(0, R)$ where $R = \max_i r_i$. We will see in Section 1.2 how this pertains to our problem in bounding the largest eigenvalue of our constructed matrices.
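The bounds in this subsection can be compared on a small tree. The sketch below uses a path 1-2-3-4 with leaves 5, 6, 7, 8 attached to node 4, a tree of the kind studied in Section 1.2. The variable names are hypothetical, and the value used for ρ is the largest root of λ⁴ − 7λ² + 9, the nonzero factor of the characteristic polynomial reported for this tree in Section 1.2.

```python
import math

# Illustrative tree: path 1-2-3-4 with leaves 5, 6, 7, 8 attached to node 4.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (4, 7), (4, 8)]
n = 8
adj = [[0] * n for _ in range(n)]
for i, j in edges:
    adj[i - 1][j - 1] = adj[j - 1][i - 1] = 1

m = len(edges)
degrees = [sum(row) for row in adj]

edge_bound = math.sqrt(2 * m * (n - 1) / n)                        # Equation (1.13)
gershgorin_radius = max(sum(abs(a) for a in row) for row in adj)   # R = max_i r_i (zero diagonal)
rho = math.sqrt((7 + math.sqrt(13)) / 2)                           # largest root of l^4 - 7 l^2 + 9

print("min degree      :", min(degrees))
print("rho             :", rho)
print("bound (1.13)    :", edge_bound)
print("max degree      :", max(degrees))
print("Gershgorin R    :", gershgorin_radius)
```

For an unweighted adjacency matrix the Gershgorin radius coincides with the maximum degree, so on this tree Equation (1.13) gives the tighter upper bound of the two.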
We can also consider subgraphs and their associated adjacency matrices to illustrate some of the advantages of considering simplified graphs. If we let A and B be two adjacency matrices and A + B denote the adjacency matrix of the combination of these two graphs, then we know that
ρ(A + B) ≤ ρ(A) + ρ(B) [7]. (1.14)
This allows us to potentially combine graphs of known spectral radius to find other eigenvalue bounds.
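As a worked illustration of Equation (1.14), consider splitting a tree on eight nodes into a path and a star, both viewed on the same eight nodes. The particular split is an assumption chosen for illustration: A holds the path edges 1-2, 2-3, 3-4 and B holds the star edges joining node 4 to nodes 5, 6, 7, 8. The path value follows from the known eigenvalues of a path on four nodes, the star value is the standard √4, and the value for A + B is the largest root of λ⁴ − 7λ² + 9.

```latex
\begin{align*}
\rho(A) &= 2\cos\!\left(\tfrac{\pi}{5}\right) = \tfrac{1+\sqrt{5}}{2} \approx 1.618, \\
\rho(B) &= \sqrt{4} = 2, \\
\rho(A+B) &= \tfrac{1+\sqrt{13}}{2} \approx 2.303
  \;\le\; \rho(A) + \rho(B) \approx 3.618 .
\end{align*}
```

The inequality holds with room to spare here, which suggests the bound can be loose when the two pieces meet at only a single node.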
1.2 Constructing our Problem
1.2.1 Terms and Definitions
Let $G_n(k)$ be a tree with n nodes where node k is the center of a star with n − k + 1 pendants and one of those pendants has length k. Let $E_n(k)$ be the edge set for $G_n(k)$. Then
\[
E_n(k) = \{e(1,2), e(2,3), \ldots, e(k-1,k), e(k,k+1), e(k,k+2), \ldots, e(k,n)\} \tag{1.15}
\]
We will call $A_n(k)$ the adjacency matrix of $G_n(k)$ and we will denote the characteristic polynomial of $A_n(k)$ as $P_n(k)$. It should be noted that in all matrix representations, blank entries represent a value of 0.
Example 2. Consider n = 8 and k = 4; then Figure 1.5 shows the graph and adjacency matrix of $G_8(4)$.
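In the network, nodes (1), (2), (3), (4) form a path and nodes (5), (6), (7), (8) are each attached to node (4). The associated adjacency matrix is
\[
A_8(4) =
\begin{pmatrix}
0 & 1 &   &   &   &   &   &   \\
1 & 0 & 1 &   &   &   &   &   \\
  & 1 & 0 & 1 &   &   &   &   \\
  &   & 1 & 0 & 1 & 1 & 1 & 1 \\
  &   &   & 1 & 0 &   &   &   \\
  &   &   & 1 &   & 0 &   &   \\
  &   &   & 1 &   &   & 0 &   \\
  &   &   & 1 &   &   &   & 0
\end{pmatrix}
\]
Figure 1.5: The network and associated adjacency matrix for $A_8(4)$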
Then we can calculate the characteristic polynomial finding that
\begin{align}
P_8(4) &= \lambda^8 - 7\lambda^6 + 9\lambda^4 \tag{1.16} \\
&= \lambda^4(\lambda^4 - 7\lambda^2 + 9) \tag{1.17}
\end{align}
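The construction in Equation (1.15) and the polynomial in Equation (1.16) can be checked mechanically. The following sketch builds the adjacency matrix from the edge set and evaluates det(λI − A) at a few integer values of λ. The helper names and the use of exact rational Gaussian elimination are choices made for this illustration, not the method used in the thesis.

```python
from fractions import Fraction

def adjacency(n, k):
    """Adjacency matrix A_n(k) built from the edge set in Equation (1.15) (nodes 1..n)."""
    edges = [(i, i + 1) for i in range(1, k)] + [(k, j) for j in range(k + 1, n + 1)]
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = a[j - 1][i - 1] = 1
    return a

def char_poly_at(a, lam):
    """Evaluate det(lam*I - A) exactly by Gaussian elimination over the rationals."""
    n = len(a)
    m = [[Fraction(lam if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)]
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # singular matrix
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det                  # a row swap flips the sign
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return det

a84 = adjacency(8, 4)
for lam in range(-3, 4):
    expected = lam**8 - 7 * lam**6 + 9 * lam**4   # Equation (1.16)
    print(lam, char_poly_at(a84, lam), expected)
```

Agreement at a handful of points is only a spot check, not a proof that the two polynomials are identical.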
We now define $B_n := A_n(n-1)$. This is a path with n nodes. The associated adjacency matrix is commonly referred to as a tridiagonal matrix and has known eigenvalues. We will say $Q_n$ is the characteristic polynomial of $B_n$. The solutions to $Q_n = 0$ are of the form
\[
\lambda_k = 2\cos\left(\frac{k\pi}{n+1}\right), \quad 1 \le k \le n \quad \text{[11]}. \tag{1.18}
\]
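Equation (1.18) can be sanity-checked numerically. The sketch below tests each $\lambda_k$ against the trial eigenvector with entries $v_j = \sin(jk\pi/(n+1))$, a standard fact about path graphs that is assumed here rather than taken from the thesis. The function names and the choice n = 6 are illustrative.

```python
import math

def path_eigenvalues(n):
    """Eigenvalues of the path B_n from Equation (1.18)."""
    return [2 * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)]

def path_residual(n, k):
    """Max-norm of B_n v - lambda_k v for the trial vector v_j = sin(j k pi / (n + 1))."""
    lam = 2 * math.cos(k * math.pi / (n + 1))
    # Indices 0 and n + 1 act as boundary entries, where the sine is (numerically) zero.
    v = [math.sin(j * k * math.pi / (n + 1)) for j in range(n + 2)]
    # Row j of B_n v is v[j - 1] + v[j + 1], since node j is adjacent to j - 1 and j + 1.
    return max(abs(v[j - 1] + v[j + 1] - lam * v[j]) for j in range(1, n + 1))

n = 6
print("eigenvalues:", path_eigenvalues(n))
print("largest residual:", max(path_residual(n, k) for k in range(1, n + 1)))
```

The largest value, at k = 1, is the spectral radius of the path, and it stays below 2 for every n.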